The values stored in user-defined variables can be referenced by other statements to pass values from one statement to another.
User variables are written in the format @var_name, where var_name consists of letters, numbers, periods (.), underscores (_), and dollar signs ($). If the user variable name is quoted as a string or identifier (for example, @'my-obvar', @"my-obvar", or @`my-obvar`), it can contain other characters. User variable names are case-insensitive and can be up to 64 characters long.
User-defined variables are session-specific. A user-defined variable defined by one client is not visible or usable by other clients. When the client exits, all variables defined for the session are automatically released.
One way to set a user-defined variable is to use the SET statement:
SET @var_name = expr [, @var_name = expr] ...
For more information, see SET.
For the SET statement, both = and := can be used as assignment operators.
You can assign the following data types to a user variable: integers, decimals, floating-point numbers, binary or non-binary strings, or NULL. Assigning decimal or floating-point values does not preserve the precision or scale of the value. Other data types are converted to the allowed data types. For example, time or spatial data types are converted to binary strings.
To read a user-defined variable, you can use the SELECT statement.
If you assign a non-binary (character) string value to a user variable, it has the same character set and collation as the string. Hexadecimal or bit values assigned to a user variable are treated as binary strings. To assign a hexadecimal or bit value as a number to a user variable, you can add 0 or use the CAST function CAST(... AS UNSIGNED):
obclient> SET @v1 = X'43';
obclient> SET @v2 = X'43'+0;
obclient> SET @v3 = CAST(X'43' AS UNSIGNED);
obclient> SELECT @v1, @v2, @v3;
+------+------+------+
| @v1 | @v2 | @v3 |
+------+------+------+
| C | 67 | 67 |
+------+------+------+
1 row in set
obclient> SET @v1 = b'1000011';
obclient> SET @v2 = b'1000011'+0;
obclient> SET @v3 = CAST(b'1000011' AS UNSIGNED);
obclient> SELECT @v1, @v2, @v3;
+------+------+------+
| @v1 | @v2 | @v3 |
+------+------+------+
| C | 67 | 67 |
+------+------+------+
1 row in set
If you select the value of a user variable from the result set, it is returned to the client as a string. If you reference an uninitialized variable, its value is NULL and its type is string.
Limitations
User-defined variables in OceanBase Database are subject to the following limitations:
- The maximum length of a user-defined variable name is 64 characters.
- In earlier versions of OceanBase Database, you can assign values to user variables in statements other than
SET. This feature may be removed in future versions. - The evaluation order of expressions involving user variables is undefined. For example, in
SELECT @a, @a:=@a+123, it is not guaranteed that@awill be calculated before the assignment expression. To avoid this issue, do not assign a value to a variable and read the variable's value in the same statement. - When executing in parallel, the correctness of user-defined variable calculations cannot be guaranteed. You must use the
/*+USE_DAS(...)*/hint to specify serial execution. Future versions plan to optimize this so that serial execution is automatically used without needing to specify the hint. In versions 4.1.x and 4.2.0, you must use the/*+USE_DAS(...)*/hint to control this. In version 4.2.1 and later, serial execution is automatically used without needing to specify the hint.
MySQL compatibility
- OceanBase Database has a single-node execution engine similar to MySQL and a parallel scheduling engine for distributed queries. Changing user variables during execution can lead to inconsistent user variable values read by parallel tasks. To ensure correct user variable behavior in parallel execution, variable values must be consistent across nodes. Synchronizing variables in distributed scenarios is costly, so full compatibility with MySQL is not guaranteed.
- Reading user variables without using
@var:=valsyntax, such as inSELECT c1, c2 FROM t1 WHERE c1=@var;, is compatible with MySQL. - Real-time modification of user variables using
:=during SQL execution results in undefined behavior, which depends on the execution order. OceanBase Database versions earlier than V4.2.0 are not fully compatible with MySQL. - In OceanBase Database V4.2.0, using
:=to modify user variables during SQL execution is compatible with MySQL 5.6, 5.7, and 8.0 in the following scenarios, but may differ in other cases depending on the execution plan.- In the
INSERT VALUESclause, such as inINSERT INTO t1 VALUES(@a:=1, @a);.
- In the
Initializing user variables
If a user-defined variable is not initialized, its default value is NULL. To simulate window functions, user variables are typically initialized in the FROM subquery using SELECT @var:=val FROM DUAL and joined with the main query. For example:
SELECT A.c1, (@rownum:= @rownum+1) AS rownum FROM t1 A,(SELECT @rownum:=0) B;
SELECT @rownum;
If the query uses the NESTED LOOP JOIN algorithm and B is the inner table, and the outer table is empty, OceanBase Database will not execute the B table's initialization. In contrast, MySQL will always execute SELECT @var:=val FROM DUAL regardless of the outer table's data.
-- MySQL will execute rownum:=0 regardless of whether t1 has data
MySQL> SELECT A.c1, (@rownum:= @rownum+1) AS rownum FROM t1 A,(SELECT @rownum:=0) B;
Empty set, 2 warnings
MySQL> SELECT @rownum;
+---------+
| @rownum |
+---------+
| 0 |
+---------+
1 row in set
-- In OceanBase Database versions earlier than V4.2, if Nested Loop Join is used and t1 is empty, rownum:=0 will not be executed
obclient> SELECT /*+use_nl(A, B)*/ A.c1, (@rownum:= @rownum+1) AS rownum FROM t1 A,(SELECT @rownum:=0) B;
Empty set
obclient> SELECT @rownum;
+---------+
| @rownum |
+---------+
| NULL |
+---------+
1 row in set
To maintain compatibility with MySQL, OceanBase treats user variable assignments in the FROM clause without an actual table as initialization operations. These are prioritized and executed at the beginning of the statement, in the order specified. For example:
SELECT /*+use_nl(t1, A)*/ * FROM t1, (SELECT @rownum:=0 FROM DUAL) A;
-- SELECT @rownum:=0 is an initialization operation and is executed first
SELECT * FROM t1 WHERE (SELECT @rownum:=0 FROM DUAL)=0;
-- SELECT @rownum:=0 is not in the FROM clause, so it is not an initialization statement. Its execution depends on whether t1 has data
SELECT /*+use_nl(t1, A)*/ * FROM t1, (SELECT @rownum:=(SELECT c1 FROM t2)) A;
-- SELECT @rownum:=(SELECT c1 FROM t1) is in the FROM clause but accesses the actual table t2, so it is not an initialization statement
SELECT * FROM t1 WHERE (SELECT rownum FROM (SELECT @rownum:=0) t)=0;
-- SELECT @rownum:=0 is in the FROM clause and is an initialization statement
@var:=val in FROM subqueries
When user variable assignments are in the FROM subquery, MySQL prevents merging of FROM subqueries and iterates data references between subqueries in a non-streaming manner. Instead, it materializes the results of FROM subqueries containing := into temporary tables for upper-level queries. For example:
MySQL> CREATE TABLE t1(c1 INT);
Query OK, 0 rows affected
MySQL> INSERT INTO t1 VALUES(1), (2), (3);
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
MySQL> SELECT @a, c1 FROM (SELECT @a:=c1, c1 FROM t1) t;
+------+------+
| @a | c1 |
+------+------+
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
+------+------+
3 rows in set, 1 warning
The final result shows that SELECT @a:=c1, c1 FROM t1 is materialized before the outer query SELECT @a, c1 FROM t is executed.
MySQL's behavior is stable and is used to simulate various grouping calculations in window functions. OceanBase Database is compatible with this behavior.
When user variable := operations are in the FROM subquery, the OB optimizer prevents VIEW MERGE rewriting between subqueries and adds MATERIAL operators to block streaming data calculations between queries, ensuring compatibility with MySQL.
obclient>EXPLAIN SELECT @a, c1 FROM (SELECT @a:=c1, c1 FROM t1) t;
+---------------------------------------------------------------------+
| Query Plan |
+---------------------------------------------------------------------+
| =============================================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| --------------------------------------------------------------- |
| |0 |MATERIAL | |1 |4 | |
| |1 |└─SUBPLAN SCAN |t |1 |4 | |
| |2 | └─DISTRIBUTED TABLE FULL SCAN|t1 |1 |4 | |
| =============================================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([('a')], [t.c1]), filter(nil) |
| 1 - output([t.c1]), filter(nil) |
| access([t.c1]) |
| 2 - output([(T_OP_ASSIGN, 'a', t1.c1)], [t1.c1]), filter(nil) |
| access([t1.c1]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
+---------------------------------------------------------------------+
16 rows in set
Note that using := in the FROM subquery may cause the subquery to be calculated and materialized early, which can affect optimizations like JOIN condition pushing to base tables. This can prevent the use of indexes on the inner table in NESTED LOOP JOINs, impacting execution efficiency. For example:
CREATE TABLE t1(a INT, b INT);
CREATE TABLE t2(a INT PRIMARY KEY, b INT);
obclient> EXPLAIN SELECT /*+use_nl(t1 t)*/ * FROM t1 JOIN (SELECT @b:=b AS b, a FROM t2) t ON t1.a=t.a WHERE t1.b=1;
+------------------------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------------------------+
| ================================================================= |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------------------------- |
| |0 |NESTED-LOOP JOIN | |1 |8 | |
| |1 |├─DISTRIBUTED TABLE FULL SCAN |t1 |1 |4 | |
| |2 |└─MATERIAL | |1 |4 | |
| |3 | └─SUBPLAN SCAN |t |1 |4 | |
| |4 | └─DISTRIBUTED TABLE FULL SCAN|t2 |1 |4 | |
| ================================================================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([t1.a], [t1.b], [t.b], [t.a]), filter(nil) |
| conds([t1.a = t.a]), nl_params_(nil), use_batch=false |
| 1 - output([t1.a], [t1.b]), filter([t1.b = 1]) |
| access([t1.a], [t1.b]), partitions(p0) |
| is_index_back=false, is_global_index=false, filter_before_indexback[false], |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
| 2 - output([t.b], [t.a]), filter(nil) |
| 3 - output([t.a], [t.b]), filter(nil) |
| access([t.a], [t.b]) |
| 4 - output([(T_OP_ASSIGN, 'b', t2.b)], [t2.a]), filter(nil) |
| access([t2.a], [t2.b]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([t2.a]), range(MIN ; MAX)always true |
+------------------------------------------------------------------------------------+
24 rows in set
@var:=val in DML statements
OceanBase Database allows using user variables := in MySQL mode to extract new and old values during DML execution, reducing client-database interaction and simulating Oracle's DML RETURNING INTO functionality. However, there are limitations:
- Due to internal execution constraints, DML cannot guarantee the calculation order of expressions to match the user-defined order. Avoid assigning and accessing the same user variable in DML clauses. For example:
INSERT INTO t1(c1, c2) VALUES(@a:=1, @a+1); UPDATE t1 SET c1=@a:=1, c2=@a+1 WHERE c1=1;In both examples,@a:=1may not execute before@a+1. - Using
@var:=valin DML is only compatible with MySQL when affected_row=1. When affected_row>1, the execution order of SQL data may differ from MySQL, leading to different user variable results.
@var:=val in the SELECT clause
In some business scenarios, the same clause may involve redundant calculations for some expressions, allowing the results of these calculations to be reused by assigning values to user variables and then referencing them. For example, in the query SELECT @a:=(c1+1), @a+c2, @a*c3 FROM t1, the result of the expression c1+1 is stored in the user variable @a and then used in subsequent calculations. However, the MySQL official documentation explicitly states that the execution order of expressions containing user variables cannot be guaranteed, so the results may not meet user expectations.
After investigating the specific behavior of MySQL, it was found that when the query appears in the SELECT clause of the top-level query or in the SELECT clause of a FROM subquery, the execution order of the expressions follows the user-defined order. However, in other clauses, this order is not guaranteed. For example:
SELECT @a:=(c1+1), @a+c2, @a*c3 FROM t1;
-- In the top-level query, the expressions are executed in the user-defined order.
SELECT * FROM (SELECT @a:=(c1+1), @a+c2, @a*c3 FROM t1) t;
-- In the FROM subquery, the expressions are executed in the user-defined order.
SELECT * FROM t1 WHERE (c1, c2) in (SELECT @a:=c1, @a+1 FROM t2);
-- In the WHERE subquery, the execution order of expressions containing user variables is not guaranteed to follow the user-defined order, and the query results may be uncertain.
OceanBase Database only supports the deterministic behavior of MySQL. For non-deterministic behaviors, it may not be compatible with MySQL.
@var:=val in the WHERE clause
The result of the := operation in the WHERE clause depends on the execution order of the predicate conditions. MySQL's behavior in this scenario is implementation-defined, and OceanBase Database does not guarantee compatibility with MySQL in such cases. For example:
SELECT * FROM t1 WHERE (@a:=c1)>10;
-- The query results and the final value of @a depend on the execution order, and are not guaranteed to be compatible with MySQL.
SELECT * FROM t1 WHERE c1 in (SELECT @a:=1 FROM t2);
-- In the WHERE clause, the results depend on the implementation and are not guaranteed to be compatible with MySQL.
@var:=val in the ORDER BY clause
The ORDER BY clause rearranges data, which can disrupt the iteration order of the data. When user variables are used with the := operator in the ORDER BY clause, the results may be unpredictable. OceanBase Database does not guarantee full compatibility with MySQL in such cases. When user variables are used in the ORDER BY clause, their behavior is implementation-defined. For example:
CREATE TABLE t1(a INT PRIMARY KEY, b INT);
INSERT INTO t1 VALUES(1, 1), (2, 1), (3, 1), (4, 1), (5, 5), (6, 6);
-- MySQL behavior
MySQL> SELECT @rownum:=@rownum+1 AS rn, MAX(a), sum(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn;
+------+--------+--------+
| rn | max(a) | sum(a) |
+------+--------+--------+
| 2 | 4 | 10 |
| 3 | 5 | 5 |
| 4 | 6 | 6 |
+------+--------+--------+
3 rows in set, 2 warnings
-- When the ORDER BY clause is in ascending order, the final value of @rownum points to the last row, which is 4.
MySQL> SELECT @rownum;
+---------+
| @rownum |
+---------+
| 4 |
+---------+
1 row in set
MySQL> SELECT @rownum:=@rownum+1 AS rn, MAX(a), SUM(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn desc;
+------+--------+--------+
| rn | max(a) | sum(a) |
+------+--------+--------+
| 4 | 6 | 6 |
| 3 | 5 | 5 |
| 2 | 4 | 10 |
+------+--------+--------+
3 rows in set, 2 warnings
-- When the ORDER BY clause is in descending order, the final value of @rownum points to the last row, which is 2.
MySQL> SELECT @rownum;
+---------+
| @rownum |
+---------+
| 2 |
+---------+
1 row in set
-- OceanBase Database behavior
obclient> SELECT @rownum:=@rownum+1 AS rn, MAX(a), SUM(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn;
+------+--------+--------+
| rn | max(a) | sum(a) |
+------+--------+--------+
| 2 | 4 | 10 |
| 3 | 5 | 5 |
| 4 | 6 | 6 |
+------+--------+--------+
3 rows in set
-- When the ORDER BY clause is in ascending order, the final value of @rownum points to the last row, which is 4.
obclient> SELECT @rownum;
+---------+
| @rownum |
+---------+
| 4 |
+---------+
1 row in set
obclient> SELECT @rownum:=@rownum+1 AS rn, MAX(a), SUM(a) FROM t1, (SELECT @rownum:=1) t GROUP BY b ORDER BY rn desc;
+------+--------+--------+
| rn | max(a) | sum(a) |
+------+--------+--------+
| 4 | 6 | 6 |
| 3 | 5 | 5 |
| 2 | 4 | 10 |
+------+--------+--------+
3 rows in set
-- When the ORDER BY clause is in descending order, the final value of @rownum points to the first row, which is still 4.
obclient> SELECT @rownum;
+---------+
| @rownum |
+---------+
| 4 |
+---------+
1 row in set
Therefore, if user-defined variables are used with the := operator in the ORDER BY clause or if columns resulting from the := operation are indirectly referenced for sorting, OceanBase Database cannot guarantee that the results will remain compatible with MySQL. Therefore, avoid using the user-defined variable after the ORDER BY clause for further calculations.
@var:=val will prevent the query from being parallelized
When a query contains the @var:=val operation, although OceanBase Database cannot be fully compatible with MySQL in all scenarios, it must ensure that queries containing such operations can be self-explained. Therefore, queries containing @var:=val assignment operations will be forced to execute in a serial manner. As a result, the display of parallel execution for such queries will not take effect. For example:
obclient> EXPLAIN EXTENDED_NOADDR SELECT /*+parallel(2)*/ @a:=c1 FROM t1;
+-------------------------------------------------------------------+
| Query Plan |
+-------------------------------------------------------------------+
| =========================================================== |
| |ID|OPERATOR |NAME|EST.ROWS|EST.TIME(us)| |
| ----------------------------------------------------------- |
| |0 |DISTRIBUTED TABLE FULL SCAN|t1 |1 |4 | |
| =========================================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([(T_OP_ASSIGN, 'a', t1.c1)]), filter(nil) |
| access([t1.c1]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([t1.__pk_increment]), range(MIN ; MAX)always true |
| Used Hint: |
| ------------------------------------- |
| /*+ |
| |
| PARALLEL(2) |
| */ |
| Qb name trace: |
| ------------------------------------- |
| stmt_id:0, stmt_type:T_EXPLAIN |
| stmt_id:1, SEL$1 |
| Outline Data: |
| ------------------------------------- |
| /*+ |
| BEGIN_OUTLINE_DATA |
| FULL(@"SEL$1" "test"."t1"@"SEL$1") |
| USE_DAS(@"SEL$1" "test"."t1"@"SEL$1") |
| PARALLEL(2) |
| OPTIMIZER_FEATURES_ENABLE('4.0.0.0') |
| END_OUTLINE_DATA |
| */ |
| Optimization Info: |
| ------------------------------------- |
| t1: |
| table_rows:1 |
| physical_range_rows:1 |
| logical_range_rows:1 |
| index_back_rows:0 |
| output_rows:1 |
| table_dop:1 |
| dop_method:DAS DOP |
| avaiable_index_name:[t1] |
| stats version:0 |
| dynamic sampling level:0 |
| Plan Type: |
| LOCAL |
| Note: |
| Degree of Parallelism is 1 because of hint |
+-------------------------------------------------------------------+
48 rows in set