Values stored in user-defined variables can be referenced by other statements to pass values from one statement to another.
User variables are written in the format @var_name, where var_name consists of letters, digits, . , _, and $. If you reference a user variable as a string or identifier (for example, @'my-obvar', @"my-obvar", or @`my-obvar`), the name can contain other characters. User variable names are case-insensitive and can be up to 64 characters long.
User-defined variables are session-specific. A user variable defined by one client is not visible or usable by other clients. When the client exits, all variables defined for the session are automatically released.
One way to set a user-defined variable is to use the SET statement:
SET @var_name = expr [, @var_name = expr] ...
For more information, see SET.
For the SET statement, both = and := can be used as assignment operators.
User variables can be assigned values of the following data types: integer, decimal, floating-point, binary, or non-binary string, or NULL. Assigning decimal or floating-point values does not preserve the precision or scale of the value. Values of other data types are converted to the allowed data type. For example, time or spatial data type values are converted to binary strings.
To read a user-defined variable, you can use the SELECT statement.
If a user variable is assigned a non-binary (character) string value, it has the same character set and collation as the string. Hexadecimal or bit values assigned to a user variable are treated as binary strings. To assign a hexadecimal or bit value as a number to a user variable, you can add 0 or use the CAST function CAST(... AS UNSIGNED):
obclient> SET @v1 = X'43';
obclient> SET @v2 = X'43'+0;
obclient> SET @v3 = CAST(X'43' AS UNSIGNED);
obclient> SELECT @v1, @v2, @v3;
+------+------+------+
| @v1 | @v2 | @v3 |
+------+------+------+
| C | 67 | 67 |
+------+------+------+
1 row in set
obclient> SET @v1 = b'1000011';
obclient> SET @v2 = b'1000011'+0;
obclient> SET @v3 = CAST(b'1000011' AS UNSIGNED);
obclient> SELECT @v1, @v2, @v3;
+------+------+------+
| @v1 | @v2 | @v3 |
+------+------+------+
| C | 67 | 67 |
+------+------+------+
1 row in set
If the value of a user variable is selected in the result set, it is returned to the client as a string. If an uninitialized variable is referenced, its value is NULL and its type is string.
Limitations
User-defined variables in OceanBase Database have the following limitations:
- In early versions of OceanBase Database, user variables can be assigned values in statements other than
SET, but this feature may be removed in future versions. - The evaluation order of expressions involving user variables is undefined. For example, in the statement
SELECT @a, @a:=@a+123, it is not guaranteed that@awill be evaluated before the assignment expression. To avoid this issue, do not assign a value to a variable and read the variable's value in the same statement. - When executing in parallel, the correctness of user-defined variable calculations cannot be guaranteed. You need to use the
/*+USE_DAS(...)*/hint to specify sequential execution. Future versions plan to optimize this so that sequential execution will be used automatically without the need to specify the hint. In versions 4.1.x and 4.2.0, you need to use the/*+USE_DAS(...)*/hint to control sequential execution. In version 4.2.1 and later, sequential execution will be used automatically without the need to specify the hint.
MySQL compatibility
- OceanBase Database has a single-node execution engine similar to that of MySQL and a parallel scheduling engine for distributed queries. Changing the value of a user variable during execution will result in inconsistent values of the user variable read by parallel tasks. To ensure correct behavior of user variables in parallel execution, the values of variables must be consistent across nodes. This synchronization overhead is significant in distributed scenarios, so OceanBase Database cannot be fully compatible with MySQL in distributed scenarios.
- In SQL statements, if you only read user variables without using the
@var:=valsyntax, the behavior is compatible with MySQL. For example,SELECT c1, c2 FROM t1 WHERE c1=@var;is compatible with MySQL. - If you use the
:=operator to modify the value of a user variable during SQL execution, the behavior of the user variable is undefined. The specific behavior depends on the execution order. OceanBase Database V4.2.0 and earlier versions are not guaranteed to be fully compatible with MySQL. - If you use the
:=operator to modify the value of a user variable during SQL execution, OceanBase Database V4.2.0 is compatible with MySQL 5.6, 5.7, and 8.0 in the following scenarios. In other scenarios, the results may vary depending on the execution plan and may not be consistent with MySQL.- When the
:=operator is used in theINSERT VALUESclause, for example,INSERT INTO t1 VALUES(@a:=1, @a);. If a user variable is assigned a value in an SQL statement and the execution result depends on the previous value of the variable, the assignment result may differ from MySQL when the execution fails and is retried, due to changes in the initial value of the user variable.
- When the
Initialization of user variables
If a user-defined variable is not initialized, its default value is NULL. Therefore, to simulate the behavior of window functions, you usually initialize the user variable in the FROM subquery using SELECT @var:=val from dual, and then join it with the main query. For example:
SELECT A.c1, (@rownum:= @rownum+1) AS rownum FROM t1 A,(SELECT @rownum:=0) B;
SELECT @rownum;
If the query uses the NESTED LOOP JOIN algorithm and B is the inner table, and the outer table has no data, the OceanBase execution engine will use a short-circuit strategy and will not calculate the B table. Therefore, the initialization of user variables on the B table will not be executed, while MySQL will execute SELECT @var:=val from dual regardless of whether the outer table has data.
--MySQL will execute rownum:=0 regardless of whether t1 contains data
MySQL> SELECT A.c1, (@rownum:= @rownum+1) AS rownum FROM t1 A,(SELECT @rownum:=0) B;
Empty set, 2 warnings
MySQL> SELECT @rownum;
+---------+
| @rownum |
+---------+
| 0 |
+---------+
1 row in set
--In OceanBase Database V4.2 and earlier, if you use Nested Loop Join and t1 contains no data, rownum:=0 is not executed
obclient> SELECT /*+use_nl(A, B)*/ A.c1, (@rownum:= @rownum+1) AS rownum FROM t1 A,(SELECT @rownum:=0) B;
Empty set
obclient> SELECT @rownum;
+---------+
| @rownum |
+---------+
| NULL |
+---------+
1 row in set
To be compatible with MySQL, OceanBase treats user variable assignment statements in the FROM clause that do not include physical tables as initialization operations. These initialization operations are prioritized and executed at the beginning of the statement, in the order specified in the statement. For example:
SELECT /*+use_nl(t1, A)*/ * FROM t1, (SELECT @rownum:=0 FROM DUAL) A;
--SELECT @rownum:=0 is a variable initialization operation and is executed first
SELECT * FROM t1 WHERE (SELECT @rownum:=0 FROM DUAL)=0;
--SELECT @rownum:=0 is not in the FROM clause and is not an initialization statement. Its execution depends on whether t1 contains data
SELECT /*+use_nl(t1, A)*/ * FROM t1, (SELECT @rownum:=(SELECT c1 FROM t2)) A;
--SELECT @rownum:=(SELECT c1 FROM t1) is in the FROM clause but accesses the physical table t2, so it is not an initialization statement
SELECT * FROM t1 WHERE (SELECT rownum FROM (SELECT @rownum:=0) t)=0;
--SELECT @rownum:=0 is in the FROM clause and is an initialization statement
@var:=val in a FROM subquery
When a user variable assignment operation appears in a FROM subquery, MySQL prohibits the merging of FROM subqueries and no longer iterates data references between the two layers of queries in a streaming manner. Instead, the results of the FROM subquery containing the := operator are materialized into a temporary table for the upper-level query to access. For example:
MySQL> CREATE TABLE t1(c1 INT);
Query OK, 0 rows affected
MySQL> INSERT INTO t1 VALUES(1), (2), (3);
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
MySQL> SELECT @a, c1 FROM (SELECT @a:=c1, c1 FROM t1) t;
+------+------+
| @a | c1 |
+------+------+
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
+------+------+
3 rows in set, 1 warning
From the final result, you can see that SELECT @a:=c1, c1 FROM t1 is materialized before the outer query SELECT @a, c1 FROM t is executed.
MySQL's behavior is stable and is used by users to simulate various grouping calculations of window functions. OceanBase Database is compatible with this behavior.
When a user variable := operation appears in a FROM subquery, the OB optimizer prohibits VIEW MERGE rewriting between layers of FROM subqueries and adds a MATERIAL operator between each layer of subqueries to block streaming data calculation between queries, ensuring compatibility with MySQL.
obclient>EXPLAIN SELECT @a, c1 FROM (SELECT @a:=c1, c1 FROM t1) t;
+---------------------------------------------------------------------+
| Query Plan |
+---------------------------------------------------------------------+
| =============================================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| --------------------------------------------------------------- |
| |0 |MATERIAL | |1 |4 | |
| |1 |└─SUBPLAN SCAN |t |1 |4 | |
| |2 | └─DISTRIBUTED TABLE FULL SCAN|t1 |1 |4 | |
| =============================================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([('a')], [t.c1]), filter(nil) |
| 1 - output([t.c1]), filter(nil) |
| access([t.c1]) |
| 2 - output([(T_OP_ASSIGN, 'a', t1.c1)], [t1.c1]), filter(nil) |
| access([t1.c1]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
+---------------------------------------------------------------------+
16 rows in set
Note that using := in a FROM subquery may cause the subquery to be evaluated and its results materialized early, which may affect the optimization of the original statement, such as pushing down the JOIN condition to the base table. This can prevent the use of index conditions on the inner table in a NESTED LOOP JOIN, thereby affecting execution efficiency. For example:
CREATE TABLE t1(a INT, b INT);
CREATE TABLE t2(a INT PRIMARY KEY, b INT);
obclient> EXPLAIN SELECT /*+use_nl(t1 t)*/ * FROM t1 JOIN (SELECT @b:=b AS b, a FROM t2) t ON t1.a=t.a WHERE t1.b=1;
+------------------------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------------------------+
| ================================================================= |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------------------------- |
| |0 |NESTED-LOOP JOIN | |1 |8 | |
| |1 |├─DISTRIBUTED TABLE FULL SCAN |t1 |1 |4 | |
| |2 |└─MATERIAL | |1 |4 | |
| |3 | └─SUBPLAN SCAN |t |1 |4 | |
| |4 | └─DISTRIBUTED TABLE FULL SCAN|t2 |1 |4 | |
| ================================================================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([t1.a], [t1.b], [t.b], [t.a]), filter(nil) |
| conds([t1.a = t.a]), nl_params_(nil), use_batch=false |
| 1 - output([t1.a], [t1.b]), filter([t1.b = 1]) |
| access([t1.a], [t1.b]), partitions(p0) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false], |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
| 2 - output([t.b], [t.a]), filter(nil) |
| 3 - output([t.a], [t.b]), filter(nil) |
| access([t.a], [t.b]) |
| 4 - output([(T_OP_ASSIGN, 'b', t2.b)], [t2.a]), filter(nil) |
| access([t2.a], [t2.b]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([t2.a]), range(MIN ; MAX)always true |
+------------------------------------------------------------------------------------+
24 rows in set
@var:=val in a DML statement
OceanBase Database allows you to use the := operator in MySQL mode to extract the old and new values during DML execution, reducing the number of interactions between the client and the database, and simulating the DML RETURNING INTO feature in Oracle mode. However, the following limitations apply:
- Due to internal execution constraints, the order of expression evaluation in DML statements may not strictly follow the user-defined order. Therefore, avoid assigning and accessing the same user variable in a DML statement. For example:
INSERT INTO t1(c1, c2) VALUES(@a:=1, @a+1); UPDATE t1 SET c1=@a:=1, c2=@a+1 WHERE c1=1;In both examples, it is not guaranteed that@a:=1will be executed before@a+1. - In DML statements, the
@var:=valsyntax is only compatible with MySQL when affected_row=1. When affected_row>1, the execution order of SQL data may differ from MySQL, leading to potential differences in the final value of the user variable.
@var:=val in the SELECT clause
In some business scenarios, expressions may be redundantly calculated within the same clause, and the results can be reused by assigning values to user variables and then referencing them. For example, in the query SELECT @a:=(c1+1), @a+c2, @a*c3 FROM t1, the calculation result of c1+1 is reused in subsequent expressions. However, the official MySQL documentation explicitly states that the execution order of expressions containing user variables cannot be guaranteed, and therefore, the results may not meet user expectations.
After investigating the specific behavior of MySQL, it was found that when the query appears in the SELECT clause of the top-level query or in the SELECT clause of a FROM subquery, the execution order of the expressions follows the user-defined order. However, in other clauses, this order is not guaranteed. For example:
SELECT @a:=(c1+1), @a+c2, @a*c3 FROM t1;
-- In the top-level query, the execution order of the expressions follows the user-defined order.
SELECT * FROM (SELECT @a:=(c1+1), @a+c2, @a*c3 FROM t1) t;
-- In the FROM subquery, the execution order of the expressions follows the user-defined order.
SELECT * FROM t1 WHERE (c1, c2) in (SELECT @a:=c1, @a+1 FROM t2);
-- In the WHERE subquery, the execution order of expressions containing user variables is not guaranteed, and the query results may be uncertain.
OceanBase Database only supports the deterministic behavior of MySQL. For non-deterministic behaviors, it may not be compatible with MySQL.
@var:=val in the WHERE clause
The result of the := operation in the WHERE clause depends on the execution order of the predicate conditions. MySQL's behavior is implementation-dependent, and OceanBase Database does not guarantee compatibility in such scenarios. For example:
SELECT * FROM t1 WHERE (@a:=c1)>10;
-- The query results and the final value of @a depend on the execution order, and are not compatible with MySQL.
SELECT * FROM t1 WHERE c1 in (SELECT @a:=1 FROM t2);
-- In the WHERE clause, the results depend on the implementation, and are not compatible with MySQL.
@var:=val in the ORDER BY clause
The ORDER BY clause reorders data, which disrupts the iteration order of the data. When user variables are used with the := operator in the ORDER BY clause, the results may be unpredictable. OceanBase Database cannot guarantee full compatibility with MySQL in such scenarios. When user variables are used in the ORDER BY clause, the behavior is implementation-defined. For example:
CREATE TABLE t1(a INT PRIMARY KEY, b INT);
INSERT INTO t1 VALUES(1, 1), (2, 1), (3, 1), (4, 1), (5, 5), (6, 6);
-- MySQL behavior
MySQL> SELECT @rownum:=@rownum+1 AS rn, MAX(a), sum(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn;
+------+--------+--------+
| rn | max(a) | sum(a) |
+------+--------+--------+
| 2 | 4 | 10 |
| 3 | 5 | 5 |
| 4 | 6 | 6 |
+------+--------+--------+
3 rows in set, 2 warnings
-- When the ORDER BY clause is in ascending order, the final value of @rownum points to the last row value, which is 4.
MySQL> SELECT @rownum;
+---------+
| @rownum |
+---------+
| 4 |
+---------+
1 row in set
MySQL> SELECT @rownum:=@rownum+1 AS rn, MAX(a), SUM(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn desc;
+------+--------+--------+
| rn | max(a) | sum(a) |
+------+--------+--------+
| 4 | 6 | 6 |
| 3 | 5 | 5 |
| 2 | 4 | 10 |
+------+--------+--------+
3 rows in set, 2 warnings
-- When the ORDER BY clause is in descending order, the final value of @rownum points to the last row value, which is 2.
MySQL> SELECT @rownum;
+---------+
| @rownum |
+---------+
| 2 |
+---------+
1 row in set
-- OceanBase Database behavior
obclient> SELECT @rownum:=@rownum+1 AS rn, MAX(a), SUM(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn;
+------+--------+--------+
| rn | max(a) | sum(a) |
+------+--------+--------+
| 2 | 4 | 10 |
| 3 | 5 | 5 |
| 4 | 6 | 6 |
+------+--------+--------+
3 rows in set
-- When the ORDER BY clause is in ascending order, the final value of @rownum points to the last row value, which is 4.
obclient> SELECT @rownum;
+---------+
| @rownum |
+---------+
| 4 |
+---------+
1 row in set
obclient> SELECT @rownum:=@rownum+1 AS rn, MAX(a), SUM(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn desc;
+------+--------+--------+
| rn | max(a) | sum(a) |
+------+--------+--------+
| 4 | 6 | 6 |
| 3 | 5 | 5 |
| 2 | 4 | 10 |
+------+--------+--------+
3 rows in set
-- When the ORDER BY clause is in descending order, the final value of @rownum points to the first row value, which is still 4.
obclient> SELECT @rownum;
+---------+
| @rownum |
+---------+
| 4 |
+---------+
1 row in set
Therefore, if user-defined variables are used with the := operator in the ORDER BY clause or if columns resulting from := operations are indirectly referenced for sorting, OceanBase Database cannot guarantee that the results will remain compatible with MySQL. Therefore, avoid using the user-defined variable after the ORDER BY clause for further operations.
@var:=val will prevent the query from being parallelized
When a query contains the @var:=val operation, although OceanBase Database cannot fully align with MySQL behavior in all scenarios, it must ensure that queries involving such operations can be self-interpreted. Therefore, queries with @var:=val assignment operations will be forced to execute sequentially. As a result, the parallel execution feature will not take effect for these queries. For example:
obclient> EXPLAIN EXTENDED_NOADDR SELECT /*+parallel(2)*/ @a:=c1 FROM t1;
+-------------------------------------------------------------------+
| Query Plan |
+-------------------------------------------------------------------+
| =========================================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------------------- |
| |0 |DISTRIBUTED TABLE FULL SCAN|t1 |1 |4 | |
| =========================================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([(T_OP_ASSIGN, 'a', t1.c1)]), filter(nil) |
| access([t1.c1]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
| Used Hint: |
| ------------------------------------- |
| /*+ |
| |
| PARALLEL(2) |
| */ |
| Qb name trace: |
| ------------------------------------- |
| stmt_id:0, stmt_type:T_EXPLAIN |
| stmt_id:1, SEL$1 |
| Outline Data: |
| ------------------------------------- |
| /*+ |
| BEGIN_OUTLINE_DATA |
| FULL(@"SEL$1" "test"."t1"@"SEL$1") |
| USE_DAS(@"SEL$1" "test"."t1"@"SEL$1") |
| PARALLEL(2) |
| OPTIMIZER_FEATURES_ENABLE('4.0.0.0') |
| END_OUTLINE_DATA |
| */ |
| Optimization Info: |
| ------------------------------------- |
| t1: |
| table_rows:1 |
| physical_range_rows:1 |
| logical_range_rows:1 |
| index_back_rows:0 |
| output_rows:1 |
| table_dop:1 |
| dop_method:DAS DOP |
| avaiable_index_name:[t1] |
| stats version:0 |
| dynamic sampling level:0 |
| Plan Type: |
| LOCAL |
| Note: |
| Degree of Parallelism is 1 because of hint |
+-------------------------------------------------------------------+
48 rows in set
