The following table describes join operation hints that are related to enabling or disabling specific join algorithms in SQL queries.
| Hint | Description |
|---|---|
USE_MERGE |
Specifies to use the sorted merge join algorithm when the table specified in this hint is used as the right table for joining. The reverse operation is NO_USE_MERGE. |
NO_USE_MERGE |
Specifies not to use the sorted merge join algorithm when the table specified in this hint is used as the right table for joining. The reverse operation is USE_MERGE. |
USE_HASH |
Specifies to use the HASH JOIN algorithm when the table specified in this hint is used as the right table for joining. The reverse operation is NO_USE_HASH. |
NO_USE_HASH |
Specifies not to use the HASH JOIN algorithm when the table specified in this hint is used as the right table for joining. The reverse operation is USE_HASH. |
USE_NL |
Specifies to use the nested loop join algorithm when the table specified in this hint is used as the left table for joining. The reverse operation is NO_USE_NL. |
NO_USE_NL |
Specifies not to use the nested loop join algorithm when the table specified in this hint is used as the left table for joining. The reverse operation is USE_NL. |
PQ_DISTRIBUTE |
Controls data distribution for join operations. |
PQ_MAP |
Specifies to use the mapping strategy for join operations on the followers. |
USE_NL_MATERIALIZATION |
Forcibly materializes the left table in nested loop joins. The reverse operation is NO_USE_NL_MATERIALIZATION. |
NO_USE_NL_MATERIALIZATION |
Prevents the left table from being materialized in nested loop joins. The reverse operation is USE_NL_MATERIALIZATION. |
PX_JOIN_FILTER |
Instructs the optimizer to control the use of JOIN FILTERS for hash joins. The reverse operation is NO_PX_JOIN_FILTER. |
NO_PX_JOIN_FILTER |
Instructs the optimizer to control the hash join to disable JOIN FILTERS. The reverse operation is PX_JOIN_FILTER. |
PX_PART_JOIN_FILTER |
Instructs the optimizer to manually enable PART JOIN FILTERS. The reverse operation is NO_PX_PART_JOIN_FILTER. |
NO_PX_PART_JOIN_FILTER |
Instructs the optimizer to manually disable PART JOIN FILTERS. The reverse operation is NO_PX_PART_JOIN_FILTER. |
USE_MERGE hint
The USE_MERGE hint instructs the optimizer to use the sorted merge join algorithm when the table specified in this hint is used as the right table in a join. The NO_USE_MERGE hint is the reverse operation of the USE_MERGE hint.
Syntax
/*+ USE_MERGE ( [ @queryblock ] tablespec [ tablespec ]... ) */
Considerations and limitations
We recommend that you use the
USE_NLandUSE_MERGhints together with theLEADINGorORDEREDhint.The optimizer will use these hints when the referenced table is used as the right table in a join.
If the referenced table is used as the left table in a join, the hints are ignored.
The
USE_MERGEhint specifies to use theMERGE JOINalgorithm when the specified table is used as the right table.When the
USE_MERGEhint is used, the optimizer of OceanBase Database requires anjoin-conditionwith an equal value condition to use theMERGE JOINalgorithm. Therefore, if no equal value condition is specified between the two tables to be joined, theUSE_MERGEhint becomes invalid.
Examples
-- Use the USE_MERGE hint to instruct the optimizer to use the sorted merge join algorithm to execute the query.
-- In the join operation between the employees and departments tables, the employees table is used as the right table, and the departments table is used as the left table.
SELECT /*+ USE_MERGE(employees departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
NO_USE_MERGE hint
The NO_USE_MERGE hint instructs the optimizer to exclude the USE_MERGE hint- specified join when the specified table is used as the left table to perform a parallel join with another row source. The USE_MERGE hint is the inverse operation of the NO_USE_MERGE hint.
Syntax
/*+ NO_USE_MERGE ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_MERGE hint to instruct the optimizer not to use the sort-merge join algorithm to execute the query.
-- When the employees and departments tables are joined, sort-merge joins will be excluded.
SELECT /*+ NO_USE_MERGE(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
USE_HASH hint
The USE_HASH hint specifies to use the HASH-JOIN algorithm when the specified table is the right table in a join. The NO_USE_HASH hint is the reverse operation of the USE_HASH hint.
Syntax
/*+ USE_HASH ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the USE_HASH hint to instruct the optimizer to use the hash join algorithm (HASH-JOIN) to execute the query.
-- For the join operation between the orders and order_items tables, the orders table is the right table, and the order_items table is the left table.
SELECT /*+ USE_HASH(l h) */ *
FROM orders h, order_items l
WHERE l.order_id = h.order_id
AND l.order_id > 2400;
NO_USE_HASH hint
The NO_USE_HASH hint specifies that the HASH-JOIN algorithm should not be used when the specified table is the right table for joining. The USE_HASH hint is the opposite of the NO_USE_HASH hint.
Syntax
/*+ NO_USE_HASH ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_HASH hint to indicate the optimizer not to use the hash join algorithm for the query.
-- HASH-JOIN algorithm will be excluded for joining the employees and departments tables.
SELECT /*+ NO_USE_HASH(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
USE_NL hint
The USE_NL hint specifies that the nested loop join (NL-JOIN) algorithm be used when the specified table is the left table in a join. The NO_USE_NL hint is the inverse operation of the USE_NL hint.
We recommend that you use the
USE_NLhint and theUSE_MERGhint together with theLEADINGhint or theORDEREDhint.When the referenced table is the left table in a join, the optimizer takes these hints into consideration.
If the referenced table is the right table, the hints are ignored.
Syntax
/*+ USE_NL ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
In the following example, the hint directs the optimizer to use nested loops to access the orders table by full table scan and apply the filter condition l.order_id = h.order_id to each row. For each row that meets the filter condition, the optimizer uses the order_id index to access the order_items table.
-- Use the USE_NL hint to instruct the optimizer to use nested loops to execute the query.
-- For the join between the orders and order_items tables, the orders table serves as the right table, and the order_items table serves as the left table.
SELECT /*+ USE_NL(l h) */ h.customer_id, l.unit_price * l.quantity
FROM orders h, order_items l
WHERE l.order_id = h.order_id;
NO_USE_NL hint
The NO_USE_NL hint specifies that the nested loop (NL) join algorithm should not be used when the specified table is the left table in a join operation. The USE_NL hint is the opposite of the NO_USE_NL hint.
Syntax
/*+ NO_USE_NL ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_NL hint to indicate the optimizer not to use the NL join algorithm for the query execution.
-- For the join operation between the employees and departments tables, the NL-JOIN algorithm will be excluded.
SELECT /*+ NO_USE_NL(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
PQ_DISTRIBUTE hint
The PQ_DISTRIBUTE hint specifies how the optimizer distributes data between the producers (servers that generate row data) and consumers (servers that receive and process the row data) when you execute a query in parallel. You can use the PQ_DISTRIBUTE hint to control the distribution of row data during join or load operations.
In parallel query scenarios, especially when a large amount of data is involved, the PQ_DISTRIBUTE hint can optimize resource usage and improve query performance.
Syntax
/*+ PQ_DISTRIBUTE
( [ @queryblock ] tablespec
{ distribution | outer_distribution inner_distribution }
) */
Control the distribution of joins
You can specify two distribution methods to control the distribution of joins.
In the following example, which shows the second branch of the syntax:
outer_distributionspecifies the distribution method for the left table.inner_distributionspecifies the distribution method for the right table.
The following table describes valid combinations of distribution methods.
| Distribution method | Description |
|---|---|
| HASH, HASH | Hashes the rows of each table to a query server based on the join key. Then, each query server performs the join operation on the result partitions. When the sizes of the tables are comparable and the join operation is performed by using hash join or sort-merge join, we recommend that you use this distribution method. |
| BROADCAST, NONE | Broadcasts all rows of the right table to each query server. The rows of the left table are randomly partitioned. When the right table is much smaller than the left table, we recommend that you use this distribution method. Generally, when the product of the size of the left table and the number of query servers is greater than the size of the right table, you can use this distribution method. |
| NONE, BROADCAST | Broadcasts all rows of the left table to each query server. The rows of the right table are randomly partitioned. When the left table is much smaller than the right table, we recommend that you use this distribution method. Generally, when the product of the size of the left table and the number of query servers is less than the size of the right table, you can use this distribution method. |
| PARTITION, NONE | Maps the rows of the left table to the partitions of the right table. The left table must be partitioned by using the join key. When the number of partitions of the right table is a multiple of or nearly a multiple of the number of query servers, we recommend that you use this distribution method. For example, the right table has 14 partitions and the system has 15 query servers. Notice If the left table is not partitioned or the join is not evenly distributed by using the partitioning key, the optimizer ignores this hint. |
| NONE, PARTITION | Maps the rows of the right table to the partitions of the left table. The right table must be partitioned by using the join key. When the number of partitions of the right table is a multiple of or nearly a multiple of the number of query servers, we recommend that you use this distribution method. For example, the right table has 14 partitions and the system has 15 query servers. Notice If the right table is not partitioned by using the partitioning key or the join is not evenly distributed, the optimizer ignores this hint. |
| NONE, NONE | Each query server performs the join operation on a pair of matching partitions, with one partition in each table. The two tables must be evenly distributed by using the join key. |
Examples
The following example shows how to join two tables r and s by using hash join. The query contains a hint that specifies to use the hash distribution method:
SELECT /*+ORDERED PQ_DISTRIBUTE(s HASH, HASH) USE_HASH (s) */ column_list
FROM r, s
WHERE r.c = s.c;
To broadcast the right table r, the query statement must contain the following hint:
SELECT /*+ORDERED PQ_DISTRIBUTE(s BROADCAST, NONE) USE_HASH (s) */ column_list
FROM r, s
WHERE r.c = s.c;
USE_NL_MATERIALIZATION hint
The USE_NL_MATERIALIZATION hint forces the optimizer to generate a materialization operator for a table specified as the left table (subtree) to cache data. The NO_USE_NL_MATERIALIZATION hint negates the effect of the USE_NL_MATERIALIZATION hint.
Syntax
/*+ USE_NL_MATERIALIZATION ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the USE_NL_MATERIALIZATION hint to indicate the optimizer to materialize the departments table in a nested loop join.
SELECT /*+ USE_NL_MATERIALIZATION(departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
NO_USE_NL_MATERIALIZATION hint
The NO_USE_NL_MATERIALIZATION hint forces the optimizer to avoid generating a materialization operator to cache data when the specified table is the left table (subtree). The USE_NL_MATERIALIZATION hint is its inverse operation.
Syntax
/*+ NO_USE_NL_MATERIALIZATION ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_NL_MATERIALIZATION hint to prevent the optimizer from materializing the departments table in nested loop joins.
-- This means that the data in the departments table is reaccessed each time a join is performed, rather than using the cached materialized results.
SELECT /*+ NO_USE_NL_MATERIALIZATION(departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
Join Filter hint
There are four Join Filter hints, the first two of which are used to control regular join filters, and the last two of which are used to control partial join filters:
PX_JOIN_FILTERhintNO_PX_JOIN_FILTERhintPX_PART_JOIN_FILTERhintNO_PX_PART_JOIN_FILTERhint
Note that these four hints take effect only in parallel execution environments and have no significant effect in non-parallel environments.
The syntax and parameters of these hints are described as follows:
PX_JOIN_FILTER hint
In a parallel execution environment, the PX_JOIN_FILTER hint instructs the optimizer to control hash joins to use join filters. You can use this hint to specify that join filters be used for execution-time filtering when a specific table is used as the right table of a hash join. The reverse operation of this hint is NO_PX_JOIN_FILTER.
Syntax
/*+ PX_JOIN_FILTER ( [ @qb_name ] filter_table [ left_tables ] [real_filter_table]) */
Parameters
qb_name: the name of the query block on which the hint takes effect. This parameter is optional.filter_table: the single table on which join filters are pushed down for a subquery. For a subquery, this parameter indicates the name of the view.left_tables: the single table on which join filters are assigned for the left table of a hash join. This parameter is optional.real_filter_table: the single table on which join filters are actually pushed down for a subquery.
NO_PX_JOIN_FILTER Hint
The NO_PX_JOIN_FILTER hint instructs the optimizer to control hash joins to disable join filters. The reverse operation of this hint is PX_JOIN_FILTER.
Syntax
/*+ NO_PX_JOIN_FILTER( table ) */
PX_PART_JOIN_FILTER hint
The PX_PART_JOIN_FILTER hint instructs the optimizer to manually enable part filters. The reverse operation of this hint is NO_PX_PART_JOIN_FILTER.
Syntax
/*+ PX_PART_JOIN_FILTER ( [ @qb_name ] filter_table [ left_tables ] [real_filter_table]) */
NO_PX_PART_JOIN_FILTER hint
The NO_PX_PART_JOIN_FILTER hint instructs the optimizer to manually disable part filters. The reverse operation of this hint is PX_PART_JOIN_FILTER.
Syntax
/*+ NO_PX_PART_JOIN_FILTER (table) */
Application scenarios
The four join filter hints (PX_JOIN_FILTER, NO_PX_JOIN_FILTER, PX_PART_JOIN_FILTER, and NO_PX_PART_JOIN_FILTER) are typically used in conjunction with the leading and use_hash hints. If used separately, these hints may become ineffective due to the generation of different join orders or algorithms.
General scenarios
In most cases, join filter hints should be used together with the LEADING and USE_HASH hints. Otherwise, they may fail due to the generation of different join orders or algorithms.
First, create a partitioned table:
CREATE TABLE t1 (
c1 INT,
c2 INT,
c3 INT,
c4 INT
) PARTITION BY HASH(c1) PARTITIONS 10;
Enforce the use of join filters
You can use the following SQL statements to enforce the use of join filters:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a b) USE_HASH(b) PQ_DISTRIBUTE(b BC2HOST NONE)
PX_JOIN_FILTER(b)
PX_PART_JOIN_FILTER(b)
*/ *
FROM t1 a, t1 b WHERE a.c1 = b.c1;
or:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a b) USE_HASH(b) PQ_DISTRIBUTE(b BC2HOST NONE)
PX_JOIN_FILTER(b a)
PX_PART_JOIN_FILTER(b a)
*/ *
FROM t1 a, t1 b WHERE a.c1 = b.c1;
An example of the generated execution plan is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 456 |
| 1 | EXCHANGE OUT DISTR | :EX10001| 1 | 456 |
| 2 | SHARED HASH JOIN | | 1 | 455 |
| 3 | JOIN FILTER CREATE | :BF0001 | 1 | 228 |
| 4 | PART JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 5 | EXCHANGE IN DISTR | | 1 | 228 |
| 6 | EXCHANGE OUT DISTR (BC2HOST)| :EX10000| 1 | 228 |
| 7 | PX BLOCK ITERATOR | | 1 | 228 |
| 8 | TABLE SCAN | a | 1 | 228 |
| 9 | JOIN FILTER USE | :BF0001 | 1 | 228 |
| 10 | PX BLOCK HASH JOIN-FILTER | :BF0000 | 1 | 228 |
| 11 | TABLE SCAN | b | 1 | 228 |
===============================================================
Multi-table scenarios
For a join involving three tables, if the left table is specified as a, you can generate a join filter for the right table c:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a (b c)) USE_HASH(c (b c)) PQ_DISTRIBUTE((b c) BC2HOST NONE) PQ_DISTRIBUTE(c BC2HOST NONE)
NO_PX_JOIN_FILTER(c)
NO_PX_JOIN_FILTER(b)
NO_PX_PART_JOIN_FILTER(c)
NO_PX_PART_JOIN_FILTER(b)
PX_JOIN_FILTER(c a)
*/ *
FROM t1 a, t1 b, t1 c WHERE a.c1 = c.c1 AND b.c1 = c.c1;
An example of the generated execution plan is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 684 |
| 1 | EXCHANGE OUT DISTR | :EX10002| 1 | 683 |
| 2 | SHARED HASH JOIN | | 1 | 683 |
| 3 | JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 4 | EXCHANGE IN DISTR | | 1 | 228 |
| 5 | EXCHANGE OUT DISTR (BC2HOST) | :EX10000| 1 | 228 |
| 6 | PX BLOCK ITERATOR | | 1 | 228 |
| 7 | TABLE SCAN | a | 1 | 228 |
| 8 | SHARED HASH JOIN | | 1 | 455 |
| 9 | EXCHANGE IN DISTR | | 1 | 228 |
| 10 | EXCHANGE OUT DISTR (BC2HOST) | :EX10001| 1 | 228 |
| 11 | PX BLOCK ITERATOR | | 1 | 228 |
| 12 | TABLE SCAN | b | 1 | 228 |
| 13 | JOIN FILTER USE | :BF0000 | 1 | 228 |
| 14 | PX BLOCK ITERATOR | | 1 | 228 |
| 15 | TABLE SCAN | c | 1 | 228 |
===============================================================
Similarly, for a join involving three tables where the left table is specified as b, you can generate a join filter for the right table c:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a (b c)) USE_HASH(c (b c)) PQ_DISTRIBUTE((b c) BC2HOST NONE) PQ_DISTRIBUTE(c BC2HOST NONE)
NO_PX_JOIN_FILTER(c)
NO_PX_JOIN_FILTER(b)
NO_PX_PART_JOIN_FILTER(c)
NO_PX_PART_JOIN_FILTER(b)
PX_JOIN_FILTER(c b)
*/ *
FROM t1 a, t1 b, t1 c WHERE a.c1 = c.c1 AND b.c1 = c.c1;
An example of the generated execution plan is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 684 |
| 1 | EXCHANGE OUT DISTR | :EX10002| 1 | 683 |
| 2 | SHARED HASH JOIN | | 1 | 683 |
| 3 | EXCHANGE IN DISTR | | 1 | 228 |
| 4 | EXCHANGE OUT DISTR (BC2HOST) | :EX10000| 1 | 228 |
| 5 | PX BLOCK ITERATOR | | 1 | 228 |
| 6 | TABLE SCAN | a | 1 | 228 |
| 7 | SHARED HASH JOIN | | 1 | 455 |
| 8 | JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 9 | EXCHANGE IN DISTR | | 1 | 228 |
| 10 | EXCHANGE OUT DISTR (BC2HOST) | :EX10001| 1 | 228 |
| 11 | PX BLOCK ITERATOR | | 1 | 228 |
| 12 | TABLE SCAN | b | 1 | 228 |
| 13 | JOIN FILTER USE | :BF0000 | 1 | 228 |
| 14 | PX BLOCK ITERATOR | | 1 | 228 |
| 15 | TABLE SCAN | c | 1 | 228 |
=======================================================================
Handle hint conflicts
PX_JOIN_FILTER and NO_PX_JOIN_FILTER can take four valid forms based on whether the left table left_tables is specified. Prioritize and match them as follows:
| Hint | Description |
|---|---|
| NO_PX_JOIN_FILTER( a (b c) ) | When the left table is (b c), do not apply join filter to a in the right table. |
| PX_JOIN_FILTER( a (b c) ) | When the left table is (b c), apply join filter to a in the right table. |
| NO_PX_JOIN_FILTER( a ) | Do not apply join filter to a in the right table for any left table. |
| PX_JOIN_FILTER( a ) | Apply join filter to a in the right table for any left table. |
The conflict resolution for PX_PART_JOIN_FILTER and NO_PX_PART_JOIN_FILTER is the same as that for PX_JOIN_FILTER.