The optimizer can use hints to generate specific plans.
Generally, a hint is not required because the optimizer selects the best execution plan for a query. In some scenarios, however, the execution plan generated by the optimizer may not meet user requirements. In this case, you need to add a hint to your query to generate a specific execution plan.
Hint syntax
A hint is a special SQL comment in terms of syntax, except that a plus sign (+) is added to the opening tag (/*) of the comment. As HINT is a comment, the optimizer ignores it and uses the default plan, if the server does not recognize hints in the SQL statement. In addition, HINT only affects the logical structure of the plan generated by the optimizer. The semantics of the SQL statement remains unaffected.
{DELETE|INSERT|SELECT|UPDATE|REPLACE} /*+ [hint_text] [hin_text]... */
*<span data-type="background" style="background-color: rgb(191, 191, 191);"></span>*
Notice
To execute an SQL statement that has a hint in a MySQL C client, you must log on by using the -c option. Otherwise, the MySQL client removes the hint as a comment in the SQL statement.
Hints
The following table provides the names, syntax, and description of the hints.
| Hint | Syntax | Description |
|---|---|---|
| NO_REWRITE | NO_REWRITE | Specifies to prohibit SQL rewrite. |
| READ_CONSISTENCY | READ_CONSISTENCY (WEAK[STRONGFROZEN]) | Sets the read consistency (weak/strong). |
| INDEX_HINT | /*+ INDEX(table_name index_name) */ | Sets the table index. |
| QUERY_TIMEOUT | QUERY_TIMEOUT(INTNUM) | Sets the connection timeout value. |
| LOG_LEVEL | LOG_LEVEL([']log_level[']) | Sets the log level. A module-level statement starts and ends with an apostrophe ('). For example, 'DEBUG'. |
| LEADING | LEADING([qb_name] TBL_NAME_LIST) | Sets the join order. |
| ORDERED | ORDERED | Sets the join by the order in the SQL statement. |
| FULL | FULL([qb_name] TBL_NAME) | Specifies that the primary access path is equivalent to INDEX(TBL_NAME PRIMARY). |
| USE_PLAN_CACHE | USE_PLAN_CACHE(NONE[DEFAULT]) | Specifies whether to use plan cache. Valid values: NONE and DEFAULT. * NONE indicates not to use the plan cache. * DEFAULT indicates configuration based on other variables. |
| ACTIVATE_BURIED_POINT | ACTIVATE_BURIED_POINT(INTNUM, [FIX_MOD | BEFORE_MODE], INTNUM, [INTNUM | -INTNUM]) | For debugging only. Activates a preset internal error point. |
| USE_MERGE | USE_MERGE([qb_name] TBL_NAME_LIST) | Specifies the use of MERGE JOINwhen the specified table is the table on the right. |
| USE_HASH | USE_HASH([qb_name] TBL_NAME_LIST) | Specifies the use of HASH JOIN when the specified table is the table on the right. |
| NO_USE_HASH | NO_USE_HASH([qb_name] TBL_NAME_LIST) | Specifies not to use HASH JOIN when the specified table is the table on the right. |
| USE_NL | USE_NL([qb_name] TBL_NAME_LIST) | Specifies the use of NESTED LOOP JOIN when the specified table is the table on the right. |
| USE_BNL | USE_BNL([qb_name] TBL_NAME_LIST) | Specifies the use of BLOCKED NESTED LOOP JOIN when the specified table is the table on the right. |
| USE_HASH_AGGREGATION | USE_HASH_AGGREGATION([qb_name]) | Sets HASH AGGREGATE as the method to aggregate data, such as HASH GROUP BY or HASH DISTINCT. |
| NO_USE_HASH_AGGREGATION | NO_USE_HASH_AGGREGATION([qb_name]) | Sets MERGE GROUP BY or MERGE DISTINCT, rather than HASH AGGREGATE, as the method to aggregate data. |
| USE_LATE_MATERIALIZATION | USE_LATE_MATERIALIZATION | Specifies the use of LATE MATERIALIZATION. |
| NO_USE_LATE_MATERIALIZATION | NO_USE_LATE_MATERIALIZATION | Specifies not to use LATE MATERIALIZATION. |
| TRACE_LOG | TRACE_LOG | Specifies the collection of the trace log for SHOW TRACE. |
| QB_NAME | QB_NAME( NAME ) | The name of the query block. |
| PARALLEL | PARALLEL(INTNUM) | Sets the degree of parallelism for distributed execution. |
| TOPK | TOPK(PRECISION MINIMUM_ROWS) | Specifies the precision and the minimum number of rows of a fuzzy query. PRECISION is an integer type with a value range of [0, 100], which means the percentage of rows queried in a fuzzy query. MINIMUM_ROWS specifies the minimum number of returned rows. |
Note
Syntax of
QB_NAME:@NAMESyntax of
TBL_NAME:[db_name.]relation_name [qb_name]
QB_NAME introduction
In data manipulation language (DML) statements, each query_block has a QB_NAME (query block name), which can be specified by the user or automatically generated by the system. If you do not use a hint to specify QB_NAME, the system generates the names of SEL$1, SEL$2, UPD$1, DEL$1 from left to right, which is the operation order of Resolver.
You can use QB_NAME to accurately locate every table and specify the behavior of any query block at one position. QB_NAME in TBL_NAME is used to locate the table, and the first QB_NAME in the hint is used to locate the query_block to which the hint applies.
In the following example, the default access path is t_c1 for table t in SEL$1, and PRIMARY for table t in SEL$2. However, if a hint is used in the SQL statement to specify primary table access for table t in SEL$1, table t in SEL$2 is then accessed by index.
obclient>CREATE TABLE t(c1 INT, c2 INT, KEY t_c1(c1));
Query OK, 0 rows affected (0.31 sec)
obclient>EXPLAIN SELECT * FROM t , (SELECT * FROM t WHERE c2 = 1) ta
WHERE t.c1 = 1\G;
*************************** 1. row ***************************
Query Plan:
============================================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
------------------------------------------------------------
|0 |NESTED-LOOP INNER JOIN CARTESIAN| |1 |1895|
|1 | TABLE SCAN |t(t_c1)|1 |472 |
|2 | TABLE SCAN |t |1 |1397|
============================================================
Outputs & filters:
-------------------------------------
0 - output([t.c1], [t.c2], [t.c1], [t.c2]), filter(nil),
conds(nil), nl_params_(nil)
1 - output([t.c1], [t.c2]), filter(nil),
access([t.c1], [t.c2]), partitions(p0)
2 - output([t.c2], [t.c1]), filter([t.c2 = 1]),
access([t.c2], [t.c1]), partitions(p0)
Notice
After the rewrite, the SEL$2 query block is promoted to the SEL$1 query block. So, it is not necessary to specify the query block to which the hint applies.
obclient>EXPLAIN SELECT/*+INDEX(t@SEL$1 PRIMARY) INDEX(t@SEL$2 t_c1)*/ *
FROM t , (SELECT * FROM t WHERE c2 = 1) ta WHERE t.c1 = 1\G;
*************************** 1. row ***************************
Query Plan:
=============================================================
|ID|OPERATOR |NAME |EST. ROWS|COST |
-------------------------------------------------------------
|0 |NESTED-LOOP INNER JOIN CARTESIAN| |1 |16166|
|1 | TABLE SCAN |t |1 |1397 |
|2 | TABLE SCAN |t(t_c1)|1 |14743|
=============================================================
Outputs & filters:
-------------------------------------
0 - output([t.c1], [t.c2], [t.c1], [t.c2]), filter(nil),
conds(nil), nl_params_(nil)
1 - output([t.c1], [t.c2]), filter([t.c1 = 1]),
access([t.c1], [t.c2]), partitions(p0)
2 - output([t.c2], [t.c1]), filter([t.c2 = 1]),
access([t.c2], [t.c1]), partitions(p0)
In this example, the SQL statement can also be written as:
obclient>SELECT/*+INDEX(t@SEL$1 PRIMARY) INDEX(@SEL$2 t@SEL$2 t_c1)*/ * FROM t ,
(SELECT * FROM t WHERE c2 = 1) ta WHERE t.c1 = 1\G;
<==>
obclient>SELECT/*+INDEX(t@SEL$1 PRIMARY)*/ * from t , (SELECT/*+INDEX(t@SEL$2 t_c1)*/ * from t
WHERE c2 = 1) ta WHERE t.c1 = 1\G;
<==>
obclient>SELECT/*+INDEX(@SEL$1 t@SEL$1 PRIMARY) INDEX(@SEL$2 t@SEL$2 t_c1)*/ * from t ,
(SELECT * FROM t WHERE c2 = 1) ta WHERE t.c1 = 1\G;
You can run the EXPLAIN EXTENDED command and view the Outline Data to learn about the hint.
obclient>EXPLAIN EXTENDED SELECT *
FROM t , (SELECT *
FROM t WHERE c2 = 1) ta
WHERE t.c1 = 1\G;
*************************** 1. row ***************************
Query Plan:
============================================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
------------------------------------------------------------
|0 |NESTED-LOOP INNER JOIN CARTESIAN| |1 |1895|
|1 | TABLE SCAN |t(t_c1)|1 |472 |
|2 | TABLE SCAN |t |1 |1397|
============================================================
Used Hint:
-------------------------------------
/*+
*/
Outline Data:
-------------------------------------
/*+
BEGIN_OUTLINE_DATA
USE_NL(@"SEL$1" "test.t"@"SEL$2")
LEADING(@"SEL$1" "test.t"@"SEL$1" "test.t"@"SEL$2")
INDEX(@"SEL$1" "test.t"@"SEL$1" "t_c1")
FULL(@"SEL$2" "test.t"@"SEL$2")
END_OUTLINE_DATA
*/
General rules of hints
A hint applies to the query block where it resides, if no query block is specified. In the following example, table
t1resides in theSEL$2query block and cannot be relocated to theSEL$1query block through rewriting. So, the hint does not take effect.obclient>CREATE TABLE t1(c1 INT, c2 INT, INDEX t1_c1(c1), INDEX t1_c2(c2)); Query OK, 0 rows affected (0.31 sec) obclient>EXPLAIN SELECT/*+INDEX(t1 t1_c2)*/ * FROM t, (SELECT * FROM t1 GROUP BY c1) ta WHERE t.c1 = 1\G; *************************** 1. row *************************** Query Plan: ============================================================ |ID|OPERATOR |NAME |EST. ROWS|COST| ------------------------------------------------------------ |0 |NESTED-LOOP INNER JOIN CARTESIAN| |666 |5906| |1 | TABLE SCAN |t(t_c1)|1 |472 | |2 | SUBPLAN SCAN |ta |666 |5120| |3 | HASH GROUP BY | |666 |4454| |4 | TABLE SCAN |t1 |1000 |1397| ============================================================ Outputs & filters: ------------------------------------- 0 - output([t.c1], [t.c2], [ta.c1], [ta.c2]), filter(nil), conds(nil), nl_params_(nil) 1 - output([t.c1], [t.c2]), filter(nil), access([t.c1], [t.c2]), partitions(p0) 2 - output([ta.c1], [ta.c2]), filter(nil), access([ta.c1], [ta.c2]) 4 - output([t1.c1], [t1.c2]), filter(nil), group([t1.c1]), agg_func(nil) 5 - output([t1.c1], [t1.c2]), filter(nil), access([t1.c1], [t1.c2]), partitions(p0)In the following example, the hint takes effect because the SQL statement is rewritten and table
t1is relocated to theSEL$1query block.obclient>EXPLAIN SELECT/*+INDEX(t1 t1_c2)*/ * FROM t, (SELECT * FROM t1) ta WHERE t.c1 = 1\G; *************************** 1. row *************************** Query Plan: =============================================================== |ID|OPERATOR |NAME |EST. ROWS|COST | --------------------------------------------------------------- |0 |NESTED-LOOP INNER JOIN CARTESIAN| |1000 |15674| |1 | TABLE SCAN |t(t_c1) |1 |472 | |2 | TABLE SCAN |t1(t1_c2)|1000 |14743| =============================================================== Outputs & filters: ------------------------------------- 0 - output([t.c1], [t.c2], [t1.c1], [t1.c2]), filter(nil), conds(nil), nl_params_(nil) 1 - output([t1.c1], [t1.c2]), filter(nil), access([t1.c1], [t1.c2]), partitions(p0) 2 - output([t.c1], [t.c2]), filter(nil), access([t.c1], [t.c2]), partitions(p0)
If a table is specified but is not found in the query block where the hint resides, or a conflict occurs, the hint is invalid.
If the table is not found, refer to the first example in Rule 1. The following is an example where two conflicts occur at the same time:
obclient>EXPLAIN EXTENDED SELECT/*+INDEX(t PRIMARY)*/ * FROM t , (SELECT * FROM t WHERE c1 = 1) ta WHERE t.c1 = 1\G; *************************** 1. row *************************** Query Plan: ============================================================ |ID|OPERATOR |NAME |EST. ROWS|COST| ------------------------------------------------------------ |0 |NESTED-LOOP INNER JOIN CARTESIAN| |1 |970 | |1 | TABLE SCAN |t(t_c1)|1 |472 | |2 | TABLE SCAN |t(t_c1)|1 |472 | ============================================================ Outputs & filters: ------------------------------------- 0 - output([t.c1(0x7f7b7cdd3e60)], [t.c2(0x7f7b7cdd40f0)], [t.c1(0x7f7b7cdd2bd0)], [t.c2(0x7f7b7cdd2e60)]), filter(nil), conds(nil), nl_params_(nil), inner_get=false, self_join=false, batch_join=false 1 - output([t.c1(0x7f7b7cdd3e60)], [t.c2(0x7f7b7cdd40f0)]), filter(nil), access([t.c1(0x7f7b7cdd3e60)], [t.c2(0x7f7b7cdd40f0)]), partitions(p0), is_index_back=true, range_key([t.c1(0x7f7b7cdd3e60)], [t.__pk_increment(0x7f7b7cde86e0)]), range(1,MIN ; 1,MAX), range_cond([t.c1(0x7f7b7cdd3e60) = 1(0x7f7b7cdd3800)]) 2 - output([t.c1(0x7f7b7cdd2bd0)], [t.c2(0x7f7b7cdd2e60)]), filter(nil), access([t.c1(0x7f7b7cdd2bd0)], [t.c2(0x7f7b7cdd2e60)]), partitions(p0), is_index_back=true, range_key([t.c1(0x7f7b7cdd2bd0)], [t.__pk_increment(0x7f7b7cdf41b0)]), range(1,MIN ; 1,MAX), range_cond([t.c1(0x7f7b7cdd2bd0) = 1(0x7f7b7cdd2570)]) Used Hint: ------------------------------------- /*+ */
If the table specified in a hint for join cannot be found, the table is ignored, but other specifications effective. If the optimizer cannot generate the specified join method, it selects another method, and the hint is invalid.
If a table in the join order cannot be found, the hint is invalid.
Frequently used hint syntaxes
The optimizer of OceanBase Database dynamically schedules tasks and takes all possible optimal paths into account. Users can use hints to specify behaviors of the optimizer, so that the optimizer can execute queries based on the hints.
INDEX hint
The INDEX hint supports syntaxes in both MySQL and Oracle formats.
INDEXhint syntax in Oracle format:
obclient> SELECT/*+INDEX(table_name index_name) */ * FROM table_name;
INDEXhint syntax in MySQL format:
tbl_name [[AS] alias] [index_hint_list]
index_hint_list:
index_hint [, index_hint] ...
index_hint:
USE {INDEX|KEY}
[FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
| IGNORE {INDEX|KEY}
[FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
| FORCE {INDEX|KEY}
[FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
index_list:
index_name [, index_name] ...
You can specify only one INDEX for a table in Oracle syntax, but you can specify multiple indexes for a table in MySQL syntax. Although MySQL syntax in OceanBase Database supports specifying multiple indexes for a table, only the first index is used to generate the path when the USE INDEX and FORCE INDEX hints are used, even if the SQL statement does not contain a filter for the INDEX and results in a full scan and TABLE ACCESS BY INDEX PRIMARY KEY operations. This is because OceanBase Database is designed with an idea that a user who write a hint knows better than a program about which path is the best. The IGNORE INDEX hint tells the optimizer to ignore all specified indexes. Essentially, the USE INDEX and FORCE INDEX hints work in the same way as the Oracle INDEX hint does. The INDEX hint does not take effect if the INDEX does not exist or is invalid. The IGNORE INDEX hint is invalid if all indexes, including the primary table, are ignored.
FULL hint
The following syntax of the FULL hint specifies to scan the primary table:
/*+ FULL(table_name)*/
The FULL hint specifies to perform a primary table scan, which is equivalent to the INDEX hint /*+ INDEX(table_name PRIMARY)*/.
ORDERED hint
The ORDERED hint specifies that the join is executed in the order of tables after the FROM clause. Syntax:
/*+ ORDERED*/
If the hint is rewritten, the join is executed in the order of the FROM items in the rewritten stmt. This is because sub_query adds new table items at corresponding positions in the FROM items during rewriting.
LEADING hint
The LEADING hint specifies the order in which tables are joined. The Syntax:
/*+ LEADING(table_name_list)*/
table_name in table_name_list is quite special. Syntax of other table_name:
db_name . relation_name
relation_name
.relation_name
Syntax of table_name in table_name_list:
db_name . relation_name
relation_name
Syntax of table_name_list:
table_name
table_name_list table_name
table_name_list, table_name
The LEADING hint is strictly examined to ensure that tables are joined in the order specified by the user. The LEADING hint becomes invalid if the table_name specified in the hint does not exist, or duplicate tables are found in the hint. If the optimizer does not find a table in FROM items by table_id during a join operation, the query may have been rewritten. In this case, the join order for this table and tables after this table is invalid. The join order before the table is still valid.
Use_merge
Specifies the use of the merge-join algorithm to join tables. Syntax: /*+ USE_MERGE(table_name_list) */
The merge-join algorithm sets the table specified by the use_merge hint to the table on the right. Notice
In OceanBase Database, a merge-join must have a join-condition with the equivalent value. When you join two tables that do not have an equivalent condition, the use_merge hint is invalid.
At present, no conclusion is made about whether A merge-join B is equivalent to B merge-join A. Based on the cost model, the table on the left and the table on the right are considered separately during the calculation of the cost of merge-join. To use a hint with greater flexibility, the table on the left and the table on the right are discriminated in a merge-join operation. This means that the use_merge hint is valid only for the table on the right.
Use_nl
Specifies to use the Nested Loop Join algorithm for a join operation when the specified table is on the right. Syntax:
/*+ USE_NL(table_name_list) */
Use_hash
Specifies to use the Hash Join algorithm for a join operation when the specified table is on the right. Syntax:
/*+ USE_HASH(table_name_list) */
Parallel
Specifies the statement-level degree of parallelism. After you set this hint, the value you specify overwrites that of system variable ob_stmt_parallel_degree. Syntax:
/*+ PARALLEL(4) */