The UPDATE operator is used to update data rows in a data table that meet specified conditions.
OceanBase Database supports the UPDATE and DISTRIBUTED UPDATE operators.
UPDATE
The UPDATE statement is used to update data in a partition of a table.
In the following example, Q1 updates all rows in the t1 table that satisfy c2 = '100' and sets the value of the c2 column to 200.
--Oracle mode:
obclient> CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 VARCHAR2(10));
Query OK, 0 rows affected
obclient> CREATE TABLE t2 (c1 INT PRIMARY KEY, c2 VARCHAR2(10)) PARTITION BY
HASH(c1) PARTITIONS 10;
Query OK, 0 rows affected
obclient> CREATE TABLE t3 (c1 INT PRIMARY KEY, c2 VARCHAR2(10));
Query OK, 0 rows affected
obclient> CREATE INDEX IDX_t3_c2 ON t3 (c2) PARTITION BY HASH(c2) PARTITIONS 3;
Query OK, 0 rows affected
--MySQL mode:
obclient> CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT);
Query OK, 0 rows affected
obclient> CREATE TABLE t2 (c1 INT PRIMARY KEY, c2 INT) PARTITION BY
HASH(c1) PARTITIONS 10;
Query OK, 0 rows affected
obclient> CREATE TABLE t3 (c1 INT PRIMARY KEY, c2 INT);
Query OK, 0 rows affected
obclient> CREATE INDEX IDX_t3_c2 ON t3 (c2) PARTITION BY HASH(c2) PARTITIONS 3;
Query OK, 0 rows affected
Q1:
obclient> EXPLAIN UPDATE t1 SET c2 = '200' WHERE c2 = '100';
Query Plan:
======================================
|ID|OPERATOR |NAME|EST. ROWS|COST |
-----------------------------------------------
|0 |EXCHANGE IN REMOTE | |990 |109687|
|1 | EXCHANGE OUT REMOTE| |990 |109687|
|2 | UPDATE | |990 |109687|
|3 | TABLE SCAN |T1 |990 |108697|
===============================================
Outputs & filters:
-------------------------------------
0 - output(nil), filter(nil)
1 - output(nil), filter(nil)
2 - output(nil), filter(nil), table_columns([{T1: ({T1: (T1.C1, T1.C2)})}]),
update([T1.C2=?])
3 - output([T1.C1], [T1.C2], [?]), filter([T1.C2 = '100']),
access([T1.C2], [T1.C1]), partitions(p0)
In the execution plan of the preceding example, the outputs & filters section lists the following information about the UPDATE operator:
| Information | Meaning |
|---|---|
| output | The expressions output by the operator. |
| filter | The filter condition on the operator. Since the UPDATE operator has no filter, this is nil. In an update statement, the predicates in WHERE are pushed down to the base table, such as c2 = '100' in Q1. |
| table_columns | The columns of the table involved in the update operation. |
| update | The assignment expressions in the update operation. |
The following examples show more information about the UPDATE operator:
Q2 updates all data rows in the
t1table and sets the value of thec2column to 200.Q3 updates data rows in the partitioned
t2table that satisfyc1='100'and sets the value of thec2column to 150.Q4 updates data rows in the partitioned
t2table that satisfyc2 ='100'and sets the value of thec2column torpad(t2.c2, 10, '9'). As shown in the execution plan, theUPDATEoperator is placed under theEXCHANGEoperator, indicating that the 2nd and 3rd operators will be scheduled as a task at the partition level. During execution, the 4th operator scans data in thet2table that meets thec2 = '100'condition in one partition, and the 3rdUPDATEoperator updates only the data scanned in the corresponding partition.
Q2:
obclient> EXPLAIN UPDATE t1 SET c2 = '200';
Query Plan:
======================================
|ID|OPERATOR |NAME|EST. ROWS|COST |
-----------------------------------------------
|0 |EXCHANGE IN REMOTE | |100000 |161860|
|1 | EXCHANGE OUT REMOTE| |100000 |161860|
|2 | UPDATE | |100000 |161860|
|3 | TABLE SCAN |T1 |100000 |61860 |
===============================================
Outputs & filters:
-------------------------------------
0 - output(nil), filter(nil)
1 - output(nil), filter(nil)
2 - output(nil), filter(nil), table_columns([{T1: ({T1: (T1.C1, T1.C2)})}]),
update([T1.C2=?])
3 - output([T1.C1], [T1.C2], [?]), filter(nil),
access([T1.C2], [T1.C1]), partitions(p0)
Q3:
obclient> EXPLAIN UPDATE t2 SET t2.c2 = '150' WHERE t2.c1 = '100';
Query Plan:
===================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
-----------------------------------
|0 |UPDATE | |1 |53 |
|1 | TABLE GET|T2 |1 |52 |
===================================
Outputs & filters:
-------------------------------------
0 - output(nil), filter(nil), table_columns([{T2: ({T2: (T2.C1, T2.C2)})}]),
update([T2.C2=?])
1 - output([T2.C1], [T2.C2], [?]), filter(nil),
access([T2.C1], [T2.C2]), partitions(p5)
Q4:
obclient> EXPLAIN UPDATE t2 SET t2.c2 = RPAD(t2.c2, 10, '9') WHERE t2.c2 = '100';
Query Plan:
===============================================
|ID|OPERATOR |NAME |EST. ROWS|COST |
-------------------------------------------------------
|0 |PX COORDINATOR | |9900 |1096793|
|1 | EXCHANGE OUT DISTR |:EX10000|9900 |1096793|
|2 | PX PARTITION ITERATOR| |9900 |1096793|
|3 | UPDATE | |9900 |1096793|
|4 | TABLE SCAN |T2 |9900 |1086893|
=======================================================
Outputs & filters:
-------------------------------------
0 - output(nil), filter(nil)
1 - output(nil), filter(nil), dop=1
2 - output(nil), filter(nil)
3 - output(nil), filter(nil), table_columns([{T2: ({T2: (T2.C1, T2.C2)})}]),
update([T2.C2=column_conv(VARCHAR,utf8mb4_bin,length:10,NULL,RPAD(T2.C2, 10, ?))])
4 - output([T2.C1], [T2.C2], [column_conv(VARCHAR,utf8mb4_bin,length:10,NULL,RPAD(T2.C2, 10, ?))]), filter([T2.C2 = '100']),
access([T2.C1], [T2.C2]), partitions(p[0-9])
Distributed update
The DISTRIBUTED UPDATE operator is used to update data in multiple partitions of a table. For example, Q5 updates all data rows in the t3 table that meet the condition c2 < '100' and sets the value of the c2 column to 200. Although t3 is a non-partitioned table, it has a global index named idx_t3_c2. Therefore, each data row exists in multiple partitions.
Q5:
obclient> EXPLAIN UPDATE t3 SET c2 = '200' WHERE c2 < '100';
+--------------------------------------------------------------------------------------------------------+
| Query Plan |
+--------------------------------------------------------------------------------------------------------+
| ======================================================================= |
| |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------------------------------- |
| |0 |DISTRIBUTED UPDATE | |1 |61 | |
| |1 |└─DISTRIBUTED TABLE RANGE SCAN|T3(IDX_T3_C2)|1 |6 | |
| ======================================================================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output(nil), filter(nil) |
| table_columns([{T3: ({T3: (T3.C1, T3.C2)}, {IDX_T3_C2: (T3.C2, T3.C1)})}]), |
| update([T3.C2=column_conv(VARCHAR2,utf8mb4_bin,length:10,NULL,cast('200', VARCHAR2(1048576 )))]) |
| 1 - output([T3.C1], [T3.C2]), filter(nil), rowset=256 |
| access([T3.C1], [T3.C2]), partitions(p[0-2]) |
| is_index_back=false, is_global_index=true, |
| range_key([T3.C2], [T3.C1]), range(MIN,MIN ; 100,MIN), |
| range_cond([T3.C2 < cast('100', VARCHAR2(1048576 ))]) |
+--------------------------------------------------------------------------------------------------------+
Here are more examples of the DISTRIBUTED UPDATE operator:
- Q6 updates the data rows in the
t2partitioned table that meet the conditionc1 = 100and sets the value of thec1column to 101. Since the updated column is a primary key column, the updated data rows may be in different partitions from the original data rows. Therefore, theDISTRIBUTED UPDATEoperator is used for the update.
Q6:
obclient> EXPLAIN UPDATE t2 SET t2.c1 = 101 WHERE t2.c1 = 100;
+------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------+
| ================================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| -------------------------------------------------- |
| |0 |DISTRIBUTED UPDATE| |1 |34 | |
| |1 |└─TABLE GET |T2 |1 |3 | |
| ================================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output(nil), filter(nil) |
| table_columns([{T2: ({T2: (T2.C1, T2.C2)})}]), |
| update([T2.C1=column_conv(NUMBER,PS:(-1,0),NOT NULL,101)]) |
| 1 - output([T2.C1], [T2.C2]), filter(nil), rowset=256 |
| access([T2.C1], [T2.C2]), partitions(p5) |
| is_index_back=false, is_global_index=false, |
| range_key([T2.C1]), range[100 ; 100], |
| range_cond([T2.C1 = 100]) |
+------------------------------------------------------------------+