Parameterization refers to a process of converting constants in an SQL query into variables.
Retrieve an execution plan based on fast parameterization
An SQL statement may use different parameters in each execution. These parameters are parameterized to generate a parameter-irrelevant SQL string. This SQL string is used as a plan caching key to retrieve an execution plan from the plan cache. This design ensures that SQL statements with different parameters can share the same plan.
Typically, the syntax tree in a conventional database is parameterized to generate keys for retrieving plans from the plan cache. Different from these databases, OceanBase Database uses a lexical parameterization process to directly parameterize text strings as the keys of the plan cache. That is why this process is called fast parameterization.
OceanBase Database also supports Adaptive Cursor Sharing to enable the selection of plans with different parameters.
Based on fast parameterization, the plan caching of OceanBase Database has the following benefits:
Simplified parsing process.
When you search a hash map, you can perform the hash and
MEMCMPoperations on the text string instead of the hash and comparison operations on the parameterized syntax tree. This approach provides higher execution efficiency.
Constant parameterization and constraints
In OceanBase Database, constants cannot be parameterized in some scenarios. The following constants cannot be parameterized:
All constants after
ORDER BY, such asORDER BY 1,2;.All constants after
GROUP BY, such asGROUP BY 1,2;.Constants after
LIMIT, such asLIMIT 5;.A string constant in a format string, such as
%dinSELECT DATE_FORMAT('2006-06-00', '%d');.Constants, among the input parameters of a function, that affect the result of the function and eventually the execution plan, such as
NUMBER(2,1)inCAST(999.88 as NUMBER(2,1)), and1and2inSUBSTR('abcd', 1, 2).Constants, among the input parameters of a function, that carry implicit information and eventually affect the execution plan, such as
2015-11-13 10:20:19.012inSELECT UNIX_TIMESTAMP('2015-11-13 10:20:19.012');. It not only specifies input timestamp but also implies that the precision of the function is set to milliseconds.
Examples of general constant parameterization
The following example shows an SQL query statement:
obclient> SELECT * FROM t1 WHERE c1 = 5 AND c2 ='oceanbase';
The following syntax shows the parameterization result of the SQL query in the preceding example in OceanBase Database. Constants 5 and oceanbase are parameterized into variables @1 and @2, which are checked against SQL statements that have been executed in the database and are mapped to the corresponding execution plan.
obclient> SELECT * FROM T1 WHERE c1 = @1 AND c2 = @2;
Examples of constants that cannot be parameterized
However, not all constants can be parameterized. For example, constants after ORDER BY cannot be parameterized because they indicate sorting by a specific column in the projected columns in the SELECT statement.
Create a sample table t1 and insert proper data into the table. Table t1 contains the c1 and c2 columns, and c1 is the primary key column.
obclient> CREATE TABLE t1(c1 INT PRIMARY KEY,c2 INT);
Query OK, 0 rows affected
obclient> INSERT INTO t1 VALUES (1,2);
Query OK, 1 row affected
obclient> INSERT INTO t1 VALUES (2,1);
Query OK, 1 row affected
obclient> INSERT INTO t1 VALUES (3,1);
Query OK, 1 row affected
Execute the following SQL query and sort the query result by column
c1, which is the primary key column. Sorting is not necessary for access by primary key because columnc1is sorted.obclient> SELECT c1, c2 FROM t1 ORDER BY 1; +----+------+ | C1 | C2 | +----+------+ | 1 | 2 | | 2 | 1 | | 3 | 1 | +----+------+ 3 rows in set obclient> EXPLAIN SELECT c1, c2 FROM t1 ORDER BY 1; Query Plan: | =================================== |ID|OPERATOR |NAME|EST. ROWS|COST| ----------------------------------- |0 |TABLE SCAN|t1 |1000 |1381| =================================== Outputs & filters: ------------------------------------- 0 - output([T1.C1], [T1.C2]), filter(nil), access([T1.C1], [T1.C2]), partitions(p0)Execute the following SQL query and sort the query result by column
c2. A sorting operation is required. The execution plan is shown in the following example:obclient> SELECT c1, c2 FROM t1 ORDER BY 2; +----+------+ | C1 | C2 | +----+------+ | 2 | 1 | | 3 | 1 | | 1 | 2 | +----+------+ 3 rows in set obclient> EXPLAIN SELECT c1, c2 FROM t1 ORDER BY 2; Query Plan: | ==================================== |ID|OPERATOR |NAME|EST. ROWS|COST| ------------------------------------ |0 |SORT | |1000 |1886| |1 | TABLE SCAN|t1 |1000 |1381| ==================================== Outputs & filters: ------------------------------------- 0 - output([T1.C1], [T1.C2]), filter(nil), sort_keys([T1.C2, ASC]) 1 - output([T1.C1], [T1.C2]), filter(nil), access([T1.C1], [T1.C2]), partitions(p0)
Therefore, if constants after ORDER BY are parameterized, different values of ORDER BY are assimilated to have the same parameterized SQL statements, resulting in the retrieval of wrong plans.
Solve mismatches in constant parameterization
To solve the possible mismatch problems in constant parameterization, the system parameterizes SQL queries by analyzing their syntax trees in the process of generating the execution plan during hard parsing and obtains the corresponding distinctive information. Assume that a statement conveys this information: “The third item of the parameter array for fast parameterization must be 3”. This information is considered a “constraint”.
For query Q1 SELECT c1, c2, c3 FROM t1 WHERE c1 = 1 AND c2 LIKE 'senior%' ORDER BY 3;, the following parameterized SQL statement can be obtained through lexical analysis. The corresponding array for parameterization is {1, ‘senior%’, 3}.
obclient> SELECT c1, c2, c3 FROM t1 WHERE c1 = @1 AND c2 LIKE @2 ORDER BY @3;
If constants in the SQL statements after ORDER BY are different, the SQL statements cannot share the same execution plan. Therefore, another result is generated by parameterizing the SQL statements through syntax tree analysis. In the following example, the array for parameterization corresponding to query Q1 is {1,‘senior’}, and the constraint is that “The third item of the parameter array for fast parameterization must be 3”. OceanBase Database stores the new parameterized string, constraint, and execution plan generated for query Q1 in the plan cache.
obclient> SELECT c1, c2, c3 FROM t1 WHERE c1 = @1 AND c2 LIKE @2 ORDER BY 3;
When the user sends query Q2 SELECT c1, c2, c3 FROM t1 WHERE c1 = 1 AND c2 LIKE 'senior%' ORDER BY 2;, the following result can be obtained upon fast parameterization. The corresponding array for parameterization is {1, ‘senior%’, 2}.
obclient> SELECT c1, c2, c3 FROM t1 WHERE c1 = @1 and c2 like @2 ORDER BY @3;
The parameterized SQL statement is the same as that of query Q1. However, the SQL statement of query Q2 does not meet the constraint that “the third item of the parameter array for fast parameterization must be 3”. Therefore, query Q2 cannot match the plan for Q1. In this case, the optimizer generates a new execution plan and constraint for Q2 during hard parsing and adds the new plan and constraint to the cache. The constraint is that the third item of the parameter array for fast parameterization must be 2. The next time Q1 and Q2 are executed, the correct execution plans are retrieved from the plan cache.
Strategies for differentiating parameterized execution plans
OceanBase Database also allows you to use hints to change the literals in SQL statements. For more information, see CURSOR_SHARING_EXACT hint.
Use the following strategies to determine whether an SQL query uses a parameterized plan or a non-parameterized plan:
Check the
cursor_sharingvariableCheck the
cursor_sharingsystem variable. SQL statements with differentcursor_sharingvalues hit different plans. For more information, see cursor_sharing.Check the
query_sqlvariableIf you cannot determine, based on the
cursor_sharingvariable, whether the SQL query uses a parameterized plan, thecursor_sharingvariable is in theEXACTmode and is controlled by theCURSOR_SHARING_EXACThint. In this case, the hint will affect the generation ofquery_sql. You can check thequery_sqlvariable to determine whether the SQL query uses a parameterized plan or a non-parameterized plan. You can checkquery_sqlin theGV$OB_SQL_AUDITorGV$OB_PLAN_CACHE_PLAN_STATview.