Optimizer hints|V2.2.77|OceanBase Database| docs|Distributed Database

Optimizer hints

Last Updated：2023-08-18 09:26:34 Updated

The optimizer can use hints to generate specific plans.

Generally, a hint is not required because the optimizer selects the best execution plan for a query. In some scenarios, however, the execution plan generated by the optimizer may not meet user requirements. In this case, you need to add a hint to your query to generate a specific execution plan.

Hint syntax

A hint is a special SQL comment in terms of syntax, except that a plus sign (+) is added to the opening tag (/*) of the comment. As HINT is a comment, the optimizer ignores it and uses the default plan, if the server does not recognize hints in the SQL statement. In addition, HINT only affects the logical structure of the plan generated by the optimizer. The semantics of the SQL statement remains unaffected.

{DELETE|INSERT|SELECT|UPDATE|REPLACE} /*+ [hint_text] [hin_text]... */
*<span data-type="background" style="background-color: rgb(191, 191, 191);"></span>*

Notice

To execute an SQL statement that has a hint in a MySQL C client, you must log on by using the -c option. Otherwise, the MySQL client removes the hint as a comment in the SQL statement.

Hints

The following table provides the names, syntax, and description of the hints.

Hint	Syntax	Description
NO_REWRITE	NO_REWRITE	Specifies to prohibit SQL rewrite.
READ_CONSISTENCY	READ_CONSISTENCY (WEAK[STRONGFROZEN])	Sets the read consistency (weak/strong).
INDEX_HINT	/+ INDEX(table_name index_name) /	Sets the table index.
QUERY_TIMEOUT	QUERY_TIMEOUT(INTNUM)	Sets the connection timeout value.
LOG_LEVEL	LOG_LEVEL([']log_level['])	Sets the log level. A module-level statement starts and ends with an apostrophe ('). For example, 'DEBUG'.
LEADING	LEADING([qb_name] TBL_NAME_LIST)	Sets the join order.
ORDERED	ORDERED	Sets the join by the order in the SQL statement.
FULL	FULL([qb_name] TBL_NAME)	Specifies that the primary access path is equivalent to INDEX(TBL_NAME PRIMARY).
USE_PLAN_CACHE	USE_PLAN_CACHE(NONE[DEFAULT])	Specifies whether to use plan cache. Valid values: `NONE` and `DEFAULT`. * `NONE` indicates not to use the plan cache. * `DEFAULT` indicates configuration based on other variables.
ACTIVATE_BURIED_POINT	ACTIVATE_BURIED_POINT(INTNUM, [FIX_MOD \| BEFORE_MODE], INTNUM, [INTNUM \| -INTNUM])	For debugging only. Activates a preset internal error point.
USE_MERGE	USE_MERGE([qb_name] TBL_NAME_LIST)	Specifies the use of `MERGE JOIN`when the specified table is the table on the right.
USE_HASH	USE_HASH([qb_name] TBL_NAME_LIST)	Specifies the use of `HASH JOIN` when the specified table is the table on the right.
NO_USE_HASH	NO_USE_HASH([qb_name] TBL_NAME_LIST)	Specifies not to use `HASH JOIN` when the specified table is the table on the right.
USE_NL	USE_NL([qb_name] TBL_NAME_LIST)	Specifies the use of `NESTED LOOP JOIN` when the specified table is the table on the right.
USE_BNL	USE_BNL([qb_name] TBL_NAME_LIST)	Specifies the use of `BLOCKED NESTED LOOP JOIN` when the specified table is the table on the right.
USE_HASH_AGGREGATION	USE_HASH_AGGREGATION([qb_name])	Sets `HASH AGGREGATE` as the method to aggregate data, such as `HASH GROUP BY` or `HASH DISTINCT`.
NO_USE_HASH_AGGREGATION	NO_USE_HASH_AGGREGATION([qb_name])	Sets `MERGE GROUP BY` or `MERGE DISTINCT`, rather than `HASH AGGREGATE`, as the method to aggregate data.
USE_LATE_MATERIALIZATION	USE_LATE_MATERIALIZATION	Specifies the use of `LATE MATERIALIZATION`.
NO_USE_LATE_MATERIALIZATION	NO_USE_LATE_MATERIALIZATION	Specifies not to use `LATE MATERIALIZATION`.
TRACE_LOG	TRACE_LOG	Specifies the collection of the trace log for `SHOW TRACE`.
QB_NAME	QB_NAME( NAME )	The name of the query block.
PARALLEL	PARALLEL(INTNUM)	Sets the degree of parallelism for distributed execution.
TOPK	TOPK(PRECISION MINIMUM_ROWS)	Specifies the precision and the minimum number of rows of a fuzzy query. `PRECISION` is an integer type with a value range of [0, 100], which means the percentage of rows queried in a fuzzy query. `MINIMUM_ROWS` specifies the minimum number of returned rows.

Note

Syntax of QB_NAME: @NAME
Syntax of TBL_NAME: [db_name.]relation_name [qb_name]

QB_NAME introduction

In data manipulation language (DML) statements, each query_block has a QB_NAME (query block name), which can be specified by the user or automatically generated by the system. If you do not use a hint to specify QB_NAME, the system generates the names of SEL$1, SEL$2, UPD$1, DEL$1 from left to right, which is the operation order of Resolver.

You can use QB_NAME to accurately locate every table and specify the behavior of any query block at one position. QB_NAME in TBL_NAME is used to locate the table, and the first QB_NAME in the hint is used to locate the query_block to which the hint applies.

In the following example, the default access path is t_c1 for table t in SEL$1, and PRIMARY for table t in SEL$2. However, if a hint is used in the SQL statement to specify primary table access for table t in SEL$1, table t in SEL$2 is then accessed by index.

obclient>CREATE TABLE t(c1 INT, c2 INT, KEY t_c1(c1));
Query OK, 0 rows affected (0.31 sec)

obclient>EXPLAIN SELECT * FROM t , (SELECT * FROM t WHERE c2 = 1) ta 
        WHERE t.c1 = 1\G;
*************************** 1. row ***************************
Query Plan: 
============================================================
|ID|OPERATOR                        |NAME   |EST. ROWS|COST|
------------------------------------------------------------
|0 |NESTED-LOOP INNER JOIN CARTESIAN|       |1        |1895|
|1 | TABLE SCAN                     |t(t_c1)|1        |472 |
|2 | TABLE SCAN                     |t      |1        |1397|
============================================================
Outputs & filters:
-------------------------------------
  0 - output([t.c1], [t.c2], [t.c1], [t.c2]), filter(nil),
      conds(nil), nl_params_(nil)
  1 - output([t.c1], [t.c2]), filter(nil),
      access([t.c1], [t.c2]), partitions(p0)
  2 - output([t.c2], [t.c1]), filter([t.c2 = 1]),
      access([t.c2], [t.c1]), partitions(p0)

Notice

After the rewrite, the SEL$2 query block is promoted to the SEL$1 query block. So, it is not necessary to specify the query block to which the hint applies.

obclient>EXPLAIN SELECT/*+INDEX(t@SEL$1 PRIMARY) INDEX(t@SEL$2 t_c1)*/ * 
        FROM t , (SELECT * FROM t WHERE c2 = 1) ta WHERE t.c1 = 1\G;
*************************** 1. row ***************************
Query Plan: 
=============================================================
|ID|OPERATOR                        |NAME   |EST. ROWS|COST |
-------------------------------------------------------------
|0 |NESTED-LOOP INNER JOIN CARTESIAN|       |1        |16166|
|1 | TABLE SCAN                     |t      |1        |1397 |
|2 | TABLE SCAN                     |t(t_c1)|1        |14743|
=============================================================

Outputs & filters:
-------------------------------------
  0 - output([t.c1], [t.c2], [t.c1], [t.c2]), filter(nil),
      conds(nil), nl_params_(nil)
  1 - output([t.c1], [t.c2]), filter([t.c1 = 1]),
      access([t.c1], [t.c2]), partitions(p0)
  2 - output([t.c2], [t.c1]), filter([t.c2 = 1]),
      access([t.c2], [t.c1]), partitions(p0)

In this example, the SQL statement can also be written as:

obclient>SELECT/*+INDEX(t@SEL$1 PRIMARY) INDEX(@SEL$2 t@SEL$2 t_c1)*/ * FROM t ,
     (SELECT * FROM t WHERE c2 = 1) ta WHERE t.c1 = 1\G;
<==>
obclient>SELECT/*+INDEX(t@SEL$1 PRIMARY)*/ * from t , (SELECT/*+INDEX(t@SEL$2 t_c1)*/ * from t 
    WHERE c2 = 1) ta WHERE t.c1 = 1\G;
<==>
obclient>SELECT/*+INDEX(@SEL$1 t@SEL$1 PRIMARY) INDEX(@SEL$2 t@SEL$2 t_c1)*/ * from t , 
     (SELECT * FROM t WHERE c2 = 1) ta WHERE t.c1 = 1\G;

You can run the EXPLAIN EXTENDED command and view the Outline Data to learn about the hint.

obclient>EXPLAIN EXTENDED SELECT * 
FROM t , (SELECT * 
FROM t WHERE c2 = 1) ta 
                 WHERE t.c1 = 1\G;
*************************** 1. row ***************************
Query Plan: 
============================================================
|ID|OPERATOR                        |NAME   |EST. ROWS|COST|
------------------------------------------------------------
|0 |NESTED-LOOP INNER JOIN CARTESIAN|       |1        |1895|
|1 | TABLE SCAN                     |t(t_c1)|1        |472 |
|2 | TABLE SCAN                     |t      |1        |1397|
============================================================
Used Hint:
-------------------------------------
  /*+
 */

Outline Data:
-------------------------------------
  /*+
      BEGIN_OUTLINE_DATA
      USE_NL(@"SEL$1" "test.t"@"SEL$2")
      LEADING(@"SEL$1" "test.t"@"SEL$1" "test.t"@"SEL$2")
      INDEX(@"SEL$1" "test.t"@"SEL$1" "t_c1")
      FULL(@"SEL$2" "test.t"@"SEL$2")
      END_OUTLINE_DATA
  */

General rules of hints

A hint applies to the query block where it resides, if no query block is specified. In the following example, table t1 resides in the SEL$2 query block and cannot be relocated to the SEL$1 query block through rewriting. So, the hint does not take effect.

obclient>CREATE TABLE t1(c1 INT, c2 INT, INDEX t1_c1(c1), INDEX
 t1_c2(c2));
Query OK, 0 rows affected (0.31 sec)

obclient>EXPLAIN SELECT/*+INDEX(t1 t1_c2)*/ * FROM t, 
       (SELECT * FROM t1 GROUP BY c1) ta WHERE t.c1 = 1\G;
*************************** 1. row ***************************
Query Plan: 
============================================================
|ID|OPERATOR                        |NAME   |EST. ROWS|COST|
------------------------------------------------------------
|0 |NESTED-LOOP INNER JOIN CARTESIAN|       |666      |5906|
|1 | TABLE SCAN                     |t(t_c1)|1        |472 |
|2 | SUBPLAN SCAN                   |ta     |666      |5120|
|3 |  HASH GROUP BY                 |       |666      |4454|
|4 |   TABLE SCAN                   |t1     |1000     |1397|
============================================================
Outputs & filters:
-------------------------------------
  0 - output([t.c1], [t.c2], [ta.c1], [ta.c2]), filter(nil),
      conds(nil), nl_params_(nil)
  1 - output([t.c1], [t.c2]), filter(nil),
      access([t.c1], [t.c2]), partitions(p0)
  2 - 
output([ta.c1], [ta.c2]), filter(nil),
      access([ta.c1], [ta.c2])
  4 - output([t1.c1], [t1.c2]), filter(nil),
      group([t1.c1]), agg_func(nil)
  5 - output([t1.c1], [t1.c2]), filter(nil),
      access([t1.c1], [t1.c2]), partitions(p0)

In the following example, the hint takes effect because the SQL statement is rewritten and table t1 is relocated to the SEL$1 query block.

obclient>EXPLAIN SELECT/*+INDEX(t1 t1_c2)*/ * FROM t, 
       (SELECT * FROM t1) ta WHERE t.c1 = 1\G;
*************************** 1. row ***************************
Query Plan: 
===============================================================
|ID|OPERATOR                        |NAME     |EST. ROWS|COST |
---------------------------------------------------------------
|0 |NESTED-LOOP INNER JOIN CARTESIAN|         |1000     |15674|
|1 | TABLE SCAN                     |t(t_c1)  |1        |472  |
|2 | TABLE SCAN                     |t1(t1_c2)|1000     |14743|
===============================================================
Outputs & filters:
-------------------------------------
  0 - output([t.c1], [t.c2], [t1.c1], [t1.c2]), filter(nil),
      conds(nil), nl_params_(nil)
  1 - output([t1.c1], [t1.c2]), filter(nil),
      access([t1.c1], [t1.c2]), partitions(p0)
  2 - 
output([t.c1], [t.c2]), filter(nil),
      access([t.c1], [t.c2]), partitions(p0)

If a table is specified but is not found in the query block where the hint resides, or a conflict occurs, the hint is invalid.

If the table is not found, refer to the first example in Rule 1. The following is an example where two conflicts occur at the same time:

obclient>EXPLAIN EXTENDED SELECT/*+INDEX(t PRIMARY)*/ * 
                  FROM t , (SELECT * FROM t WHERE c1 = 1) ta 
                 WHERE t.c1 = 1\G;
*************************** 1. row ***************************
Query Plan: 
============================================================
|ID|OPERATOR                        |NAME   |EST. ROWS|COST|
------------------------------------------------------------
|0 |NESTED-LOOP INNER JOIN CARTESIAN|       |1        |970 |
|1 | TABLE SCAN                     |t(t_c1)|1        |472 |
|2 | TABLE SCAN                     |t(t_c1)|1        |472 |
============================================================
Outputs & filters:
-------------------------------------
  0 - output([t.c1(0x7f7b7cdd3e60)], [t.c2(0x7f7b7cdd40f0)], [t.c1(0x7f7b7cdd2bd0)], [t.c2(0x7f7b7cdd2e60)]), filter(nil),
      conds(nil), nl_params_(nil), inner_get=false, self_join=false, batch_join=false
  1 - output([t.c1(0x7f7b7cdd3e60)], [t.c2(0x7f7b7cdd40f0)]), filter(nil),
      access([t.c1(0x7f7b7cdd3e60)], [t.c2(0x7f7b7cdd40f0)]), partitions(p0),
      is_index_back=true,
      range_key([t.c1(0x7f7b7cdd3e60)], [t.__pk_increment(0x7f7b7cde86e0)]), range(1,MIN ; 1,MAX),
      range_cond([t.c1(0x7f7b7cdd3e60) = 1(0x7f7b7cdd3800)])
  2 - 
output([t.c1(0x7f7b7cdd2bd0)], [t.c2(0x7f7b7cdd2e60)]), filter(nil),
      access([t.c1(0x7f7b7cdd2bd0)], [t.c2(0x7f7b7cdd2e60)]), partitions(p0),
      is_index_back=true,
      range_key([t.c1(0x7f7b7cdd2bd0)], [t.__pk_increment(0x7f7b7cdf41b0)]), range(1,MIN ; 1,MAX),
      range_cond([t.c1(0x7f7b7cdd2bd0) = 1(0x7f7b7cdd2570)])

Used Hint:
-------------------------------------
  /*+
  */

If the table specified in a hint for join cannot be found, the table is ignored, but other specifications effective. If the optimizer cannot generate the specified join method, it selects another method, and the hint is invalid.
If a table in the join order cannot be found, the hint is invalid.

Frequently used hint syntaxes

The optimizer of OceanBase Database dynamically schedules tasks and takes all possible optimal paths into account. Users can use hints to specify behaviors of the optimizer, so that the optimizer can execute queries based on the hints.

INDEX hint

The INDEX hint supports syntaxes in both MySQL and Oracle formats.

INDEX hint syntax in Oracle format:

obclient> SELECT/*+INDEX(table_name index_name) */ * FROM table_name;

INDEX hint syntax in MySQL format:

tbl_name [[AS] alias] [index_hint_list]

index_hint_list:
index_hint [, index_hint] ...

index_hint:
USE {INDEX|KEY}
  [FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
  | IGNORE {INDEX|KEY}
  [FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
  | FORCE {INDEX|KEY}
  [FOR {JOIN|ORDER BY|GROUP BY}] (index_list)

index_list:
index_name [, index_name] ...

You can specify only one INDEX for a table in Oracle syntax, but you can specify multiple indexes for a table in MySQL syntax. Although MySQL syntax in OceanBase Database supports specifying multiple indexes for a table, only the first index is used to generate the path when the USE INDEX and FORCE INDEX hints are used, even if the SQL statement does not contain a filter for the INDEX and results in a full scan and TABLE ACCESS BY INDEX PRIMARY KEY operations. This is because OceanBase Database is designed with an idea that a user who write a hint knows better than a program about which path is the best. The IGNORE INDEX hint tells the optimizer to ignore all specified indexes. Essentially, the USE INDEX and FORCE INDEX hints work in the same way as the Oracle INDEX hint does. The INDEX hint does not take effect if the INDEX does not exist or is invalid. The IGNORE INDEX hint is invalid if all indexes, including the primary table, are ignored.

FULL hint

The following syntax of the FULL hint specifies to scan the primary table:

/*+ FULL(table_name)*/

The FULL hint specifies to perform a primary table scan, which is equivalent to the INDEX hint /*+ INDEX(table_name PRIMARY)*/.

ORDERED hint

The ORDERED hint specifies that the join is executed in the order of tables after the FROM clause. Syntax:

/*+ ORDERED*/

If the hint is rewritten, the join is executed in the order of the FROM items in the rewritten stmt. This is because sub_query adds new table items at corresponding positions in the FROM items during rewriting.

LEADING hint

The LEADING hint specifies the order in which tables are joined. The Syntax:

/*+ LEADING(table_name_list)*/

table_name in table_name_list is quite special. Syntax of other table_name:

db_name . relation_name

relation_name

.relation_name

Syntax of table_name in table_name_list:

db_name . relation_name

relation_name

Syntax of table_name_list:

table_name
table_name_list table_name
table_name_list, table_name

The LEADING hint is strictly examined to ensure that tables are joined in the order specified by the user. The LEADING hint becomes invalid if the table_name specified in the hint does not exist, or duplicate tables are found in the hint. If the optimizer does not find a table in FROM items by table_id during a join operation, the query may have been rewritten. In this case, the join order for this table and tables after this table is invalid. The join order before the table is still valid.

Use_merge

Specifies the use of the merge-join algorithm to join tables. Syntax: /*+ USE_MERGE(table_name_list) */

The merge-join algorithm sets the table specified by the use_merge hint to the table on the right. Notice

In OceanBase Database, a merge-join must have a join-condition with the equivalent value. When you join two tables that do not have an equivalent condition, the use_merge hint is invalid.

At present, no conclusion is made about whether A merge-join B is equivalent to B merge-join A. Based on the cost model, the table on the left and the table on the right are considered separately during the calculation of the cost of merge-join. To use a hint with greater flexibility, the table on the left and the table on the right are discriminated in a merge-join operation. This means that the use_merge hint is valid only for the table on the right.

Use_nl

Specifies to use the Nested Loop Join algorithm for a join operation when the specified table is on the right. Syntax:

/*+ USE_NL(table_name_list) */

Use_hash

Specifies to use the Hash Join algorithm for a join operation when the specified table is on the right. Syntax:

/*+ USE_HASH(table_name_list) */

Parallel

Specifies the statement-level degree of parallelism. After you set this hint, the value you specify overwrites that of system variable ob_stmt_parallel_degree. Syntax:

/*+ PARALLEL(4) */

OceanBase

Customer Stories

Documentation