The UPDATE operator is used to update data rows in a data table that meet specified conditions.
OceanBase Database supports the UPDATE and DISTRIBUTED UPDATE operators.
UPDATE
The UPDATE statement is used to update data in a partition of a table.
In the following example, Q1 updates all rows in the t1 table that meet the condition c2 = '100' and sets the value of the c2 column to 200.
--Oracle mode:
obclient> CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 VARCHAR2(10));
Query OK, 0 rows affected
obclient> CREATE TABLE t2 (c1 INT PRIMARY KEY, c2 VARCHAR2(10)) PARTITION BY
HASH(c1) PARTITIONS 10;
Query OK, 0 rows affected
obclient> CREATE TABLE t3 (c1 INT PRIMARY KEY, c2 VARCHAR2(10));
Query OK, 0 rows affected
obclient> CREATE INDEX IDX_t3_c2 ON t3 (c2) PARTITION BY HASH(c2) PARTITIONS 3;
Query OK, 0 rows affected
--MySQL mode:
obclient> CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT);
Query OK, 0 rows affected
obclient> CREATE TABLE t2 (c1 INT PRIMARY KEY, c2 INT) PARTITION BY
HASH(c1) PARTITIONS 10;
Query OK, 0 rows affected
obclient> CREATE TABLE t3 (c1 INT PRIMARY KEY, c2 INT);
Query OK, 0 rows affected
obclient> CREATE INDEX IDX_t3_c2 ON t3 (c2) PARTITION BY HASH(c2) PARTITIONS 3;
Query OK, 0 rows affected
Q1:
obclient> EXPLAIN UPDATE t1 SET c2 = '200' WHERE c2 = '100';
Query Plan:
======================================
|ID|OPERATOR |NAME|EST. ROWS|COST |
-----------------------------------------------
|0 |EXCHANGE IN REMOTE | |990 |109687|
|1 | EXCHANGE OUT REMOTE| |990 |109687|
|2 | UPDATE | |990 |109687|
|3 | TABLE SCAN |T1 |990 |108697|
===============================================
Outputs & filters:
-------------------------------------
0 - output(nil), filter(nil)
1 - output(nil), filter(nil)
2 - output(nil), filter(nil), table_columns([{T1: ({T1: (T1.C1, T1.C2)})}]),
update([T1.C2=?])
3 - output([T1.C1], [T1.C2], [?]), filter([T1.C2 = '100']),
access([T1.C2], [T1.C1]), partitions(p0)
In the execution plan of the preceding example, the outputs & filters section lists the output information of the UPDATE operator:
| Information | Meaning |
|---|---|
| output | The expressions that the operator outputs. |
| filter | The filter conditions on the operator. Since the UPDATE operator has no filter in the example, it is nil. For update statements, the predicates in WHERE are pushed down to the base tables. For example, c2 = '100' in Q1 is pushed down to the 1st operator. |
| table_columns | The columns of the data tables involved in the update operation. |
| update | All assignment expressions in the update operation. |
The following are more examples of the UPDATE operator:
Q2 updates all data rows in the
t1table and sets the value of thec2column to 200.Q3 updates the data rows in the
t2partition table that meet the conditionc1='100'and sets the value of thec2column to 150.Q4 updates the data rows in the
t2partition table that meet the conditionc2 ='100'and sets the value of thec2column torpad(t2.c2, 10, '9'). As shown in the execution plan, theUPDATEoperator is placed under theEXCHANGEoperator, so operators 2 and 3 will be scheduled as a task at the partition level. When executing, operator 4 scans the data in thet2partition that meets the conditionc2 = '100', and operator 3, theUPDATEoperator, will only update the data it scanned in the corresponding partition.
Q2:
obclient> EXPLAIN UPDATE t1 SET c2 = '200';
Query Plan:
======================================
|ID|OPERATOR |NAME|EST. ROWS|COST |
-----------------------------------------------
|0 |EXCHANGE IN REMOTE | |100000 |161860|
|1 | EXCHANGE OUT REMOTE| |100000 |161860|
|2 | UPDATE | |100000 |161860|
|3 | TABLE SCAN |T1 |100000 |61860 |
===============================================
Outputs & filters:
-------------------------------------
0 - output(nil), filter(nil)
1 - output(nil), filter(nil)
2 - output(nil), filter(nil), table_columns([{T1: ({T1: (T1.C1, T1.C2)})}]),
update([T1.C2=?])
3 - output([T1.C1], [T1.C2], [?]), filter(nil),
access([T1.C2], [T1.C1]), partitions(p0)
Q3:
obclient> EXPLAIN UPDATE t2 SET t2.c2 = '150' WHERE t2.c1 = '100';
Query Plan:
===================================
|ID|OPERATOR |NAME|EST. ROWS|COST|
-----------------------------------
|0 |UPDATE | |1 |53 |
|1 | TABLE GET|T2 |1 |52 |
===================================
Outputs & filters:
-------------------------------------
0 - output(nil), filter(nil), table_columns([{T2: ({T2: (T2.C1, T2.C2)})}]),
update([T2.C2=?])
1 - output([T2.C1], [T2.C2], [?]), filter(nil),
access([T2.C1], [T2.C2]), partitions(p5)
Q4:
obclient> EXPLAIN UPDATE t2 SET t2.c2 = RPAD(t2.c2, 10, '9') WHERE t2.c2 = '100';
Query Plan:
===============================================
|ID|OPERATOR |NAME |EST. ROWS|COST |
-------------------------------------------------------
|0 |PX COORDINATOR | |9900 |1096793|
|1 | EXCHANGE OUT DISTR |:EX10000|9900 |1096793|
|2 | PX PARTITION ITERATOR| |9900 |1096793|
|3 | UPDATE | |9900 |1096793|
|4 | TABLE SCAN |T2 |9900 |1086893|
=======================================================
Outputs & filters:
-------------------------------------
0 - output(nil), filter(nil)
1 - output(nil), filter(nil), dop=1
2 - output(nil), filter(nil)
3 - output(nil), filter(nil), table_columns([{T2: ({T2: (T2.C1, T2.C2)})}]),
update([T2.C2=column_conv(VARCHAR,utf8mb4_bin,length:10,NULL,RPAD(T2.C2, 10, ?))])
4 - output([T2.C1], [T2.C2], [column_conv(VARCHAR,utf8mb4_bin,length:10,NULL,RPAD(T2.C2, 10, ?))]), filter([T2.C2 = '100']),
access([T2.C1], [T2.C2]), partitions(p[0-9])
Distributed update
The DISTRIBUTED UPDATE operator is used to update data in multiple partitions of a table. For example, Q5 updates all data rows in the t3 table that meet the condition c2 < '100' and sets the value of the c2 column to 200. Even though t3 is a non-partitioned table, it has a global index idx_t3_c2, so each data row exists in multiple partitions.
Q5:
obclient> EXPLAIN UPDATE t3 SET c2 = '200' WHERE c2 < '100';
+--------------------------------------------------------------------------------------------------------+
| Query Plan |
+--------------------------------------------------------------------------------------------------------+
| ======================================================================= |
| |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------------------------------- |
| |0 |DISTRIBUTED UPDATE | |1 |61 | |
| |1 |└─DISTRIBUTED TABLE RANGE SCAN|T3(IDX_T3_C2)|1 |6 | |
| ======================================================================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output(nil), filter(nil) |
| table_columns([{T3: ({T3: (T3.C1, T3.C2)}, {IDX_T3_C2: (T3.C2, T3.C1)})}]), |
| update([T3.C2=column_conv(VARCHAR2,utf8mb4_bin,length:10,NULL,cast('200', VARCHAR2(1048576 )))]) |
| 1 - output([T3.C1], [T3.C2]), filter(nil), rowset=256 |
| access([T3.C1], [T3.C2]), partitions(p[0-2]) |
| is_index_back=false, is_global_index=true, |
| range_key([T3.C2], [T3.C1]), range(MIN,MIN ; 100,MIN), |
| range_cond([T3.C2 < cast('100', VARCHAR2(1048576 ))]) |
+--------------------------------------------------------------------------------------------------------+
Here are more examples of the DISTRIBUTED UPDATE operator:
- Q6 updates the data rows in the
t2partitioned table that meet the conditionc1 = 100and sets the value of thec1column to 101. Since the updated column is a primary key column, the updated data rows may be located in different partitions from the original data rows. Therefore, theDISTRIBUTED UPDATEoperator is used for the update.
Q6:
obclient> EXPLAIN UPDATE t2 SET t2.c1 = 101 WHERE t2.c1 = 100;
+------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------+
| ================================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| -------------------------------------------------- |
| |0 |DISTRIBUTED UPDATE| |1 |34 | |
| |1 |└─TABLE GET |T2 |1 |3 | |
| ================================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output(nil), filter(nil) |
| table_columns([{T2: ({T2: (T2.C1, T2.C2)})}]), |
| update([T2.C1=column_conv(NUMBER,PS:(-1,0),NOT NULL,101)]) |
| 1 - output([T2.C1], [T2.C2]), filter(nil), rowset=256 |
| access([T2.C1], [T2.C2]), partitions(p5) |
| is_index_back=false, is_global_index=false, |
| range_key([T2.C1]), range[100 ; 100], |
| range_cond([T2.C1 = 100]) |
+------------------------------------------------------------------+