User-defined variables|V4.2.1| docs|Distributed Database

User-defined variables

Last Updated：2023-12-25 08:49:41 Updated

The value stored in a user-defined variable can be referenced by another statement. This way, you can pass values from one statement to another.

User-defined variables are written in the @var_name format. var_name indicates the variable name and can contain letters, digits, periods (.), underscores (_), and dollar signs ($). If you quote a user-defined variable name as a string or identifier, such as @'my-obvar', @"my-obvar", or @`my-obvar`, the name can contain other characters. User-defined variable names are case-insensitive and can contain a maximum of 64 characters in length.

User-defined variables are session-specific. A user-defined variable defined by one client is invisible or inaccessible to other clients. When a client exits, the client automatically releases all variables for sessions of the client.

You can execute the SET statement to set user-defined variables:

SET @var_name = expr [, @var_name = expr] ...

For more information, see SET.

You can use = or := as an assignment operator in a SET statement.

The following data types are supported for assigning values to user-defined variables: integer, decimal, floating point, binary or non-binary string, or NULL. The precision or scale of the value is not retained in the assignment of decimal and real values. Values of other data types are converted to the supported data types. For example, values of the temporal or spatial data type are converted to binary strings.

To read a user-defined variable, execute the SELECT statement.

If a user-defined variable is assigned a non-binary (character) string value, the variable has the same character set and collation as the string. A hexadecimal or bit value assigned to a user-defined variable is processed as a binary string. To assign a hexadecimal or bit value to a user-defined variable as a number, add a zero or use the CAST(... AS UNSIGNED) function:

obclient> SET @v1 = X'43';
obclient> SET @v2 = X'43'+0;
obclient> SET @v3 = CAST(X'43' AS UNSIGNED);
obclient> SELECT @v1, @v2, @v3;
+------+------+------+
| @v1  | @v2  | @v3  |
+------+------+------+
| C    |   67 |   67 |
+------+------+------+
1 row in set

obclient> SET @v1 = b'1000011';
obclient> SET @v2 = b'1000011'+0;
obclient> SET @v3 = CAST(b'1000011' AS UNSIGNED);
obclient> SELECT @v1, @v2, @v3;
+------+------+------+
| @v1  | @v2  | @v3  |
+------+------+------+
| C    |   67 |   67 |
+------+------+------+
1 row in set

If the value of a user-defined variable is selected in a result set, the value is returned to the client as a string. If an uninitialized variable is referenced, its value is NULL and the data type is string.

Limitations

Note the following limitations when you use user-defined variables in OceanBase Database:

OceanBase Database allows you to assign values to user-defined variables in statements other than SET. However, this feature may be deprecated in future versions.
The evaluation order of expressions involving user-defined variables is undefined. For example, for SELECT @a, @a:=@a+123, it is not guaranteed that @a is evaluated before value assignment. To avoid such uncertainty, do not assign a value to a variable and read the value of the variable in the same statement.
User-defined variables may be incorrectly calculated during parallel execution. You need to use the /*+USE_DAS(...)*/ hint to specify serial execution in OceanBase Database V4.1.x and V4.2.0. In OceanBase Database V4.2.1 and later, adaptive serial execution is implemented and the hint is not required.

Compatibility with MySQL

In addition to a standalone execution engine similar to that of MySQL, OceanBase Database also provides a set of parallel scheduling engines for processing distributed queries. If you modify the value of a user-defined variable during the execution process, parallel queries will obtain different values of the variable. To ensure the correct behavior of the user-defined variable in parallel execution, the same value must be synchronized to all nodes involved during the execution process. However, such synchronization overhead in distributed scenarios is huge. Therefore, behavior of user-defined variables in OceanBase Database cannot be fully compatible with that in MySQL in distributed scenarios.
Compatibility with MySQL is supported when the @var:=val syntax is not used in an SQL statement and the statement only reads user-defined variables, for example, SELECT c1, c2 FROM t1 WHERE c1=@var;.
Prior to OceanBase Database V4.2.0, full compatibility with MySQL is not guaranteed when the := operator is used to change the value of a user-defined variable in real time during SQL execution. This is because, in this case, the behavior of the user-defined variable is undefined and depends on the evaluation order of expressions that involve this variable.
In OceanBase Database V4.2.0, compatibility with MySQL 5.6, 5.7, and 8.0 is guaranteed in the scenario listed below when the := operator is used to change the value of a user-defined variable in real time during SQL execution. In other scenarios, behavior of user-defined variables depends on the execution plan type and may be incompatible with that in MySQL.
- The := operator is used in an INSERT VALUES clause, for example, INSERT INTO t1 VALUES(@a:=1, @a);.

Initialize user-defined variables

If a user-defined variable is not initialized, OceanBase Database takes NULL as its value by default. Therefore, when you use user-defined variables to emulate window functions, you generally need to initialize the user-defined variables by using SELECT @var:=val FROM DUAL in the FROM subquery to form a join query with the main query. Here is an example:

SELECT A.c1, (@rownum:= @rownum+1) AS rownum FROM t1 A,(SELECT @rownum:=0) B;
SELECT @rownum;

If the query uses a nested loop join and takes Table B as the inner table, the execution engine of OceanBase Database uses the short-circuit execution strategy when the outer table is empty and does not drive Table B for calculation. Therefore, initialization of user-defined variables on Table B is not performed. However, MySQL executes initialization statements, such as SELECT @var:=val FROM DUAL, regardless of whether the outer table is empty.

-- MySQL executes the rownum:=0 initialization regardless of whether t1 is empty.
MySQL> SELECT A.c1, (@rownum:= @rownum+1) AS rownum FROM t1 A,(SELECT @rownum:=0) B;
Empty set, 2 warnings

MySQL> SELECT @rownum;
+---------+
| @rownum |
+---------+
|       0 |
+---------+
1 row in set

-- Prior to OceanBase Database V4.2.0, if a nested loop join is used, rownum:=0 is not executed when t1 is empty.
obclient> SELECT /*+use_nl(A, B)*/ A.c1, (@rownum:= @rownum+1) AS rownum FROM t1 A,(SELECT @rownum:=0) B;
Empty set

obclient> SELECT @rownum;
+---------+
| @rownum |
+---------+
| NULL    |
+---------+
1 row in set

For compatibility with MySQL, OceanBase Database takes a user-defined variable assignment that does not involve entity tables in the FROM clause of a query as the initialization operation of the variable. When a query starts to be executed, assignments in the query are preferentially calculated in the specified order. Here are some examples:

SELECT /*+use_nl(t1, A)*/ * FROM t1, (SELECT @rownum:=0 FROM DUAL) A;
-- SELECT @rownum:=0 is a variable initialization operation and is preferentially executed.

SELECT * FROM t1 WHERE (SELECT @rownum:=0 FROM DUAL)=0;
-- SELECT @rownum:=0 is not in the FROM clause, and therefore is not an initialization operation. Whether it is executed depends on whether t1 is empty.

SELECT /*+use_nl(t1, A)*/ * FROM t1, (SELECT @rownum:=(SELECT c1 FROM t2)) A;
-- SELECT @rownum:=(SELECT c1 FROM t1) is in the FROM clause but accesses the entity table t2, and therefore is not an initialization operation.

SELECT * FROM t1 WHERE (SELECT rownum FROM (SELECT @rownum:=0) t)=0;
-- SELECT @rownum:=0 is in the FROM clause and does involve any entity tables, and therefore is an initialization operation.

@var:=val in a FROM subquery

When a user-defined variable assignment (@var:=val) is included in a FROM subquery, MySQL prohibits the merging of FROM subqueries, and the data reference relationship between queries at two layers is no longer iterated in streaming mode. Instead, the result of the FROM subquery that involves @var:=val is materialized into a temporary table for upper-layer queries to access. Here is an example:

MySQL> CREATE TABLE t1(c1 INT);
Query OK, 0 rows affected

MySQL> INSERT INTO t1 VALUES(1), (2), (3);
Query OK, 3 rows affected
Records: 3  Duplicates: 0  Warnings: 0

MySQL> SELECT @a, c1 FROM (SELECT @a:=c1, c1 FROM t1) t;
+------+------+
| @a   | c1   |
+------+------+
|    3 |    1 |
|    3 |    2 |
|    3 |    3 |
+------+------+
3 rows in set, 1 warning

The final output shows that the result of SELECT @a:=c1, c1 FROM t1 is materialized and then the outer query SELECT @a, c1 FROM t is executed.

This feature of MySQL is stable and often used by users to emulate various grouping calculations in window functions. OceanBase Database is compatible with this feature.

When @var:=val is included in a FROM subquery, the optimizer of OceanBase Database disables view merge, a kind of rewrite, between FROM subqueries at different layers and adds a MATERIAL operator between the subqueries to block streaming data computing. This mechanism ensures compatibility with MySQL.

obclient>EXPLAIN SELECT @a, c1 FROM (SELECT @a:=c1, c1 FROM t1) t;
+---------------------------------------------------------------------+
| Query Plan                                                          |
+---------------------------------------------------------------------+
| ===============================================================     |
| |ID|OPERATOR                       |NAME|EST.ROWS|EST.TIME(us)|     |
| ---------------------------------------------------------------     |
| |0 |MATERIAL                       |    |1       |4           |     |
| |1 |└─SUBPLAN SCAN                 |t   |1       |4           |     |
| |2 |  └─DISTRIBUTED TABLE FULL SCAN|t1  |1       |4           |     |
| ===============================================================     |
| Outputs & filters:                                                  |
| -------------------------------------                               |
|   0 - output([('a')], [t.c1]), filter(nil)                          |
|   1 - output([t.c1]), filter(nil)                                   |
|       access([t.c1])                                                |
|   2 - output([(T_OP_ASSIGN, 'a', t1.c1)], [t1.c1]), filter(nil)     |
|       access([t1.c1]), partitions(p0)                               |
|       is_index_back=false, is_global_index=false,                   |
|       range_key([t1.__pk_increment]), range(MIN ; MAX)always true   |
+---------------------------------------------------------------------+
16 rows in set

Note that if you use @var:=val in a FROM subquery, the FROM subquery may be calculated and its result may be materialized in advance. This may affect optimizations such as pushing the join conditions of the original statement down to the base table. As a result, the nested loop join cannot use the index condition on the inner table, which affects the execution efficiency. Here is an example:

CREATE TABLE t1(a INT, b INT);
CREATE TABLE t2(a INT PRIMARY KEY, b INT);
obclient> EXPLAIN SELECT /*+use_nl(t1 t)*/ * FROM t1 JOIN (SELECT @b:=b AS b, a FROM t2) t ON t1.a=t.a WHERE t1.b=1;
+------------------------------------------------------------------------------------+
| Query Plan                                                                         |
+------------------------------------------------------------------------------------+
| =================================================================                  |
| |ID|OPERATOR                         |NAME|EST.ROWS|EST.TIME(us)|                  |
| -----------------------------------------------------------------                  |
| |0 |NESTED-LOOP JOIN                 |    |1       |8           |                  |
| |1 |├─DISTRIBUTED TABLE FULL SCAN    |t1  |1       |4           |                  |
| |2 |└─MATERIAL                       |    |1       |4           |                  |
| |3 |  └─SUBPLAN SCAN                 |t   |1       |4           |                  |
| |4 |    └─DISTRIBUTED TABLE FULL SCAN|t2  |1       |4           |                  |
| =================================================================                  |
| Outputs & filters:                                                                 |
| -------------------------------------                                              |
|   0 - output([t1.a], [t1.b], [t.b], [t.a]), filter(nil)                            |
|       conds([t1.a = t.a]), nl_params_(nil), use_batch=false                        |
|   1 - output([t1.a], [t1.b]), filter([t1.b = 1])                                   |
|       access([t1.a], [t1.b]), partitions(p0)                                       |
|       is_index_back=false, is_global_index=false, filter_before_indexback[false],  |
|       range_key([t1.__pk_increment]), range(MIN ; MAX)always true                  |
|   2 - output([t.b], [t.a]), filter(nil)                                            |
|   3 - output([t.a], [t.b]), filter(nil)                                            |
|       access([t.a], [t.b])                                                         |
|   4 - output([(T_OP_ASSIGN, 'b', t2.b)], [t2.a]), filter(nil)                      |
|       access([t2.a], [t2.b]), partitions(p0)                                       |
|       is_index_back=false, is_global_index=false,                                  |
|       range_key([t2.a]), range(MIN ; MAX)always true                               |
+------------------------------------------------------------------------------------+
24 rows in set

@var:=val in a DML statement

OceanBase Database allows you to use @var:=val in MySQL mode to extract old and new values during DML execution. This reduces the number of interactions between the client and the database and emulates the DML RETURNING INTO feature in Oracle mode. However, you need to take note of the following limitations:

Since DML statements are internally executed, it cannot be guaranteed in DML execution that expressions are evaluated in the order specified by users. Therefore, do not assign a value to a user-defined variable and access the same variable in a DML statement. For example, in INSERT INTO t1(c1, c2) VALUES(@a:=1, @a+1);UPDATE t1 SET c1=@a:=1, c2=@a+1 WHERE c1=1;, it cannot be guaranteed that @a:=1 is evaluated before @a+1.
Using @var:=val in DML statements is compatible with MySQL only when the number of affected rows is 1. When the number is greater than 1, the compatibility cannot be guaranteed because the SQL execution order may be different from that in MySQL, which leads to a different variable result.

@var:=val in a SELECT clause

In some business scenarios, an expression may need to be repeatedly evaluated in the same clause. To avoid such redundant operations, some users choose to assign a value to a user-defined variable and then reference the variable in other expressions, for example, SELECT @a:=(c1+1), @a+c2, @a*c3 FROM t1. However, according to official documentation of MySQL, the evaluation order of expressions that involve user variables is undefined. Therefore, the result may fail user expectations.

Researches on variable behavior in MySQL show that when the aforesaid SELECT clause is in the outermost query or a FROM subquery, the evaluation order of expressions is consistent with the order defined by users. In other scenarios, the consistency cannot be guaranteed. Here are some examples:

SELECT @a:=(c1+1), @a+c2, @a*c3 FROM t1;
-- The clause is in the outermost query, so the expressions are evaluated in the user-defined order.

SELECT * FROM (SELECT @a:=(c1+1), @a+c2, @a*c3 FROM t1) t;
-- The clause is in a FROM subquery, so the expressions are evaluated in the user-defined order.

SELECT * FROM t1 WHERE (c1, c2) in (SELECT @a:=c1, @a+1 FROM t2);
-- The clause is in a WHERE subquery, so the expressions that involve user-defined variables may not be evaluated in the user-defined order and the query result is uncertain.

OceanBase Database is compatible only with defined behavior in MySQL and may be incompatible with undefined behavior.

@var:=val in a WHERE clause

The execution result of @var:=val in a WHERE clause depends on the execution order of predicates or MySQL implementation. OceanBase Database does not support compatibility with MySQL in these scenarios. Here are some examples:

SELECT * FROM t1 WHERE (@a:=c1)>10;
-- The query result and the final result of @a are related to the execution. Compatibility with MySQL is not supported in this scenario.

SELECT * FROM t1 WHERE c1 in (SELECT @a:=1 FROM t2);
-- The result is related to the implementation of MySQL. Compatibility with MySQL is not supported in this scenario.

@var:=val in an ORDER BY clause

The ORDER BY clause rearranges data and thereby disrupts the iteration order of data. @var:=val in the ORDER BY clause results in unexpected behavior. OceanBase Database does not support full compatibility with MySQL when user-defined variables are used in the ORDER BY clause, because the behavior of the two databases is defined by their own implementation. Here is an example:

CREATE TABLE t1(a INT PRIMARY KEY, b INT);
INSERT INTO t1 VALUES(1, 1), (2, 1), (3, 1), (4, 1), (5, 5), (6, 6);

-- MySQL behavior
MySQL> SELECT @rownum:=@rownum+1 AS rn, MAX(a), sum(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn;
+------+--------+--------+
| rn   | max(a) | sum(a) |
+------+--------+--------+
|    2 |      4 |     10 |
|    3 |      5 |      5 |
|    4 |      6 |      6 |
+------+--------+--------+
3 rows in set, 2 warnings

-- When ORDER BY sorts data in ascending order, the final result of rownum points to the value in the last row, which is 4.
MySQL> SELECT @rownum;
+---------+
| @rownum |
+---------+
|       4 |
+---------+
1 row in set

MySQL> SELECT @rownum:=@rownum+1 AS rn, MAX(a), SUM(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn desc;
+------+--------+--------+
| rn   | max(a) | sum(a) |
+------+--------+--------+
|    4 |      6 |      6 |
|    3 |      5 |      5 |
|    2 |      4 |     10 |
+------+--------+--------+
3 rows in set, 2 warnings

-- When ORDER BY sorts data in descending order, the final result of rownum points to the value in the last row, which is 2.
MySQL> SELECT @rownum;
+---------+
| @rownum |
+---------+
|       2 |
+---------+
1 row in set

--OceanBase Database behavior
obclient> SELECT @rownum:=@rownum+1 AS rn, MAX(a), SUM(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn;
+------+--------+--------+
| rn   | max(a) | sum(a) |
+------+--------+--------+
|    2 |      4 |     10 |
|    3 |      5 |      5 |
|    4 |      6 |      6 |
+------+--------+--------+
3 rows in set

-- When ORDER BY sorts data in ascending order, the final result of rownum points to the value in the last row, which is 4.
obclient> SELECT @rownum;
+---------+
| @rownum |
+---------+
|       4 |
+---------+
1 row in set

obclient> SELECT @rownum:=@rownum+1 AS rn, MAX(a), SUM(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn desc;
+------+--------+--------+
| rn   | max(a) | sum(a) |
+------+--------+--------+
|    4 |      6 |      6 |
|    3 |      5 |      5 |
|    2 |      4 |     10 |
+------+--------+--------+
3 rows in set

-- When ORDER BY sorts data in descending order, the final result of rownum points to the value in the first row, which is still 4.
obclient> SELECT @rownum;
+---------+
| @rownum |
+---------+
|       4 |
+---------+
1 row in set

Therefore, if @var:=val is included in an ORDER BY clause or a column generated based on @var:=val calculation is referenced in an ORDER BY clause, OceanBase Database cannot guarantee that the user-defined variable result obtained after sorting is still compatible with MySQL. Therefore, do not use the user-defined variable in other operations after the ORDER BY operation.

@var:=val that disallows parallel execution of a query

If @var:=val is included in a query, OceanBase Database cannot support compatibility with MySQL in all scenarios. However, OceanBase Database must ensure that the result of such a query is explainable. Therefore, a query that contains @var:=val is executed in serial by force in OceanBase Database, and parallel execution explicitly specified for the query does not take effect. Here is an example:

obclient> EXPLAIN EXTENDED_NOADDR SELECT /*+parallel(2)*/ @a:=c1 FROM t1;
+-------------------------------------------------------------------+
| Query Plan                                                        |
+-------------------------------------------------------------------+
| ===========================================================       |
| |ID|OPERATOR                   |NAME|EST.ROWS|EST.TIME(us)|       |
| -----------------------------------------------------------       |
| |0 |DISTRIBUTED TABLE FULL SCAN|t1  |1       |4           |       |
| ===========================================================       |
| Outputs & filters:                                                |
| -------------------------------------                             |
|   0 - output([(T_OP_ASSIGN, 'a', t1.c1)]), filter(nil)            |
|       access([t1.c1]), partitions(p0)                             |
|       is_index_back=false, is_global_index=false,                 |
|       range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
| Used Hint:                                                        |
| -------------------------------------                             |
|   /*+                                                             |
|                                                                   |
|       PARALLEL(2)                                                 |
|   */                                                              |
| Qb name trace:                                                    |
| -------------------------------------                             |
|   stmt_id:0, stmt_type:T_EXPLAIN                                  |
|   stmt_id:1, SEL$1                                                |
| Outline Data:                                                     |
| -------------------------------------                             |
|   /*+                                                             |
|       BEGIN_OUTLINE_DATA                                          |
|       FULL(@"SEL$1" "test"."t1"@"SEL$1")                          |
|       USE_DAS(@"SEL$1" "test"."t1"@"SEL$1")                       |
|       PARALLEL(2)                                                 |
|       OPTIMIZER_FEATURES_ENABLE('4.0.0.0')                        |
|       END_OUTLINE_DATA                                            |
|   */                                                              |
| Optimization Info:                                                |
| -------------------------------------                             |
|   t1:                                                             |
|       table_rows:1                                                |
|       physical_range_rows:1                                       |
|       logical_range_rows:1                                        |
|       index_back_rows:0                                           |
|       output_rows:1                                               |
|       table_dop:1                                                 |
|       dop_method:DAS DOP                                          |
|       avaiable_index_name:[t1]                                    |
|       stats version:0                                             |
|       dynamic sampling level:0                                    |
|   Plan Type:                                                      |
|       LOCAL                                                       |
|   Note:                                                           |
|       Degree of Parallelism is 1 because of hint                  |
+-------------------------------------------------------------------+
48 rows in set