The following table describes hints related to join operations in SQL queries, including hints for enabling or disabling specific join algorithms.
| Hint type | Description |
|---|---|
| USE_MERGE | Uses the merge join algorithm when the specified table is the right table in the join. The reverse operation is NO_USE_MERGE. |
| NO_USE_MERGE | Does not use the merge join algorithm when the specified table is the right table in the join. The reverse operation is USE_MERGE. |
| USE_HASH | Uses the hash join algorithm when the specified table is the right table in the join. The reverse operation is NO_USE_HASH. |
| NO_USE_HASH | Does not use the hash join algorithm when the specified table is the right table in the join. The reverse operation is USE_HASH. |
| USE_NL | Uses the nested loop join algorithm when the specified table is the right table in the join. The reverse operation is NO_USE_NL. |
| NO_USE_NL | Does not use the nested loop join algorithm when the specified table is the right table in the join. The reverse operation is USE_NL. |
| PQ_DISTRIBUTE | Controls the data distribution method for the join operation. |
| PQ_MAP | Specifies to use the mapping strategy for the join operation. |
| USE_NL_MATERIALIZATION | Forces the optimizer to materialize the right table in the nested loop join. The reverse operation is NO_USE_NL_MATERIALIZATION. |
| NO_USE_NL_MATERIALIZATION | Prevents the optimizer from materializing the right table in the nested loop join. The reverse operation is USE_NL_MATERIALIZATION. |
| PX_JOIN_FILTER | Instructs the optimizer to enable JOIN FILTER in the HASH JOIN. The reverse operation is NO_PX_JOIN_FILTER. |
| NO_PX_JOIN_FILTER | Instructs the optimizer to disable JOIN FILTER in the HASH JOIN. The reverse operation is PX_JOIN_FILTER. |
| PX_PART_JOIN_FILTER | Instructs the optimizer to manually enable the PART FILTER. The reverse operation is NO_PX_PART_JOIN_FILTER. |
| NO_PX_PART_JOIN_FILTER | Instructs the optimizer to manually disable the PART FILTER. The reverse operation is PX_PART_JOIN_FILTER. |
USE_MERGE Hint
USE_MERGE Hint specifies that the join algorithm is a merge join when the specified table is the right table in a join. Its reverse operation is NO_USE_MERGE.
Syntax
/*+ USE_MERGE ( [ @queryblock ] tablespec [ tablespec ]... ) */
Considerations
- We recommend that you use the USE_NL and USE_MERGE hints with the LEADING or ORDERED hint.
  - If the referenced table is the right table in a join, the optimizer uses these hints.
  - If the referenced table is the left table in a join, the optimizer ignores these hints.
- OceanBase Database uses the merge join algorithm only when the join condition contains an equality condition. Therefore, the USE_MERGE hint has no effect when you join two tables without an equality condition.
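The equality requirement follows from how a merge join works. The following Python sketch is a textbook sort-merge join, not OceanBase's implementation; the function name and the sample rows are made up for illustration. It shows that the merge step only emits rows when the two current keys are equal, and otherwise advances the side with the smaller key:

```python
def merge_join(left, right, key):
    """Textbook sort-merge join of two lists of dicts on left[key] == right[key]."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1          # left key too small: advance the left input
        elif lk > rk:
            j += 1          # right key too small: advance the right input
        else:
            # Keys are equal: find the run of right rows sharing this key ...
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # ... and pair it with every left row sharing the same key.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    result.append({**left[i], **r})
                i += 1
            j = j_end
    return result


# Hypothetical sample data, loosely modeled on the tables used in this topic.
departments = [{"department_id": 10, "dept": "Sales"},
               {"department_id": 20, "dept": "IT"}]
employees = [{"department_id": 20, "name": "Ann"},
             {"department_id": 10, "name": "Bob"},
             {"department_id": 20, "name": "Cy"},
             {"department_id": 30, "name": "Dee"}]
joined = merge_join(departments, employees, "department_id")
```

Without an equality comparison there is no key on which the two sorted inputs can be aligned, which is why the algorithm does not apply to joins that lack an equality condition.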
Examples
-- Use the USE_MERGE hint to specify that the optimizer use the sort-merge join algorithm to execute the query.
-- In the join between the employees and departments tables, the employees table is the right table and the departments table is the left table.
SELECT /*+ USE_MERGE(employees departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
NO_USE_MERGE Hint
NO_USE_MERGE Hint specifies that the optimizer excludes the use of the merge join algorithm when the specified table is the right table in a join. Its reverse operation is USE_MERGE.
Syntax
/*+ NO_USE_MERGE ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_MERGE hint to specify that the optimizer does not use the sort-merge join algorithm to execute the query.
-- In the join between the employees and departments tables, the sort-merge join algorithm is excluded.
SELECT /*+ NO_USE_MERGE(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
USE_HASH Hint
USE_HASH Hint specifies that the join algorithm is a hash join when the specified table is the right table in a join. Its reverse operation is NO_USE_HASH.
Syntax
/*+ USE_HASH ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the USE_HASH hint to specify that the optimizer use the hash join algorithm to execute the query.
-- In the join between the orders and order_items tables, the orders table is the right table and the order_items table is the left table.
SELECT /*+ USE_HASH(l h) */ *
FROM orders h, order_items l
WHERE l.order_id = h.order_id
AND l.order_id > 2400;
NO_USE_HASH Hint
NO_USE_HASH Hint specifies that the optimizer does not use the hash join algorithm when the specified table is the right table in a join. Its reverse operation is USE_HASH.
Syntax
/*+ NO_USE_HASH ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_HASH hint to specify that the optimizer does not use the hash join algorithm to execute the query.
-- In the join between the employees and departments tables, the hash join algorithm is excluded.
SELECT /*+ NO_USE_HASH(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
USE_NL Hint
USE_NL Hint specifies that the join algorithm is a nested loop join (NL-JOIN) when the specified table is the right table in a join. Its reverse operation is NO_USE_NL.
We recommend that you use the USE_NL and USE_MERGE hints with the LEADING or ORDERED hint.
- If the referenced table is the right table in a join, the optimizer uses these hints.
- If the referenced table is the left table in a join, the optimizer ignores these hints.
Syntax
/*+ USE_NL ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
The following query example shows that the nested loop join algorithm is used. The orders table is accessed by a full table scan, and the filter condition l.order_id = h.order_id is applied to each row. For each row that satisfies the filter condition, the order_items table is accessed by the order_id index.
-- Use the USE_NL hint to specify that the optimizer use the nested loop join algorithm to execute the query.
-- In the join between the orders and order_items tables, the orders table is the left table and the order_items table is the right table.
SELECT /*+ USE_NL(l h) */ h.customer_id, l.unit_price * l.quantity
FROM orders h, order_items l
WHERE l.order_id = h.order_id;
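The access pattern described for this query can be sketched in Python. This is only an illustrative model under stated assumptions: the dictionary stands in for the index on order_id, and the function name and sample rows are hypothetical rather than part of OceanBase.

```python
from collections import defaultdict


def nested_loop_join(orders, order_items):
    """Model of a nested loop join: scan orders, probe order_items by order_id."""
    # Stand-in for the order_id index on order_items (an assumption of this sketch).
    index = defaultdict(list)
    for item in order_items:
        index[item["order_id"]].append(item)

    result = []
    for h in orders:                                # full scan of the driving table
        for l in index.get(h["order_id"], []):      # index lookup per driving row
            result.append((h["customer_id"], l["unit_price"] * l["quantity"]))
    return result


# Hypothetical sample rows.
orders = [{"order_id": 1, "customer_id": "c1"},
          {"order_id": 2, "customer_id": "c2"}]
order_items = [{"order_id": 1, "unit_price": 2, "quantity": 3},
               {"order_id": 1, "unit_price": 5, "quantity": 1},
               {"order_id": 3, "unit_price": 7, "quantity": 2}]
rows = nested_loop_join(orders, order_items)
```

The cost of this pattern grows with the number of driving rows multiplied by the cost of one probe, which is why it tends to suit a small driving table and an indexed probed table.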
NO_USE_NL Hint
NO_USE_NL Hint specifies that the optimizer does not use the nested loop join (NL-JOIN) algorithm when the specified table is the right table in a join. Its reverse operation is USE_NL.
Syntax
/*+ NO_USE_NL ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_NL hint to specify that the optimizer does not use the nested loop join algorithm to execute the query.
-- In the join between the employees and departments tables, the nested loop join algorithm is excluded.
SELECT /*+ NO_USE_NL(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
PQ_DISTRIBUTE Hint
The PQ_DISTRIBUTE hint specifies how the optimizer distributes data between the producer and consumer servers during parallel query execution. You can use this hint to control the distribution of row data in join or load operations.
In parallel query scenarios, especially when handling large volumes of data, PQ_DISTRIBUTE can optimize resource usage and improve query performance.
Syntax
/*+ PQ_DISTRIBUTE
( [ @queryblock ] tablespec
{ distribution | outer_distribution inner_distribution }
) */
Control the distribution of joins
You can control the distribution of joins by specifying two distribution methods in the syntax:
- outer_distribution specifies the data distribution method for the left table.
- inner_distribution specifies the data distribution method for the right table.
The distribution methods include HASH, BROADCAST, PARTITION, and NONE. Only the following six combinations of distribution methods are valid:
| Distribution Method | Description |
|---|---|
| HASH, HASH | Uses a hash function on the join key to map rows from each table to the query servers. After mapping, each query server performs a join between a pair of result partitions. This distribution method is recommended when the tables are of comparable size and the join operation is implemented using a hash join or a sort-merge join. |
| BROADCAST, NONE | Broadcasts all rows from the left table to each query server. Rows from the right table are randomly partitioned. This distribution method is recommended when the left table is significantly smaller than the right table. As a rule of thumb, it is also recommended when the size of the right table multiplied by the number of query servers is greater than the size of the left table. |
| NONE, BROADCAST | Broadcasts all rows from the right table to each query server. Rows from the left table are randomly partitioned. This distribution method is recommended when the right table is significantly smaller than the left table. As a rule of thumb, it is also recommended when the size of the right table multiplied by the number of query servers is less than the size of the left table. |
| PARTITION, NONE | Maps rows from the left table based on the partitions of the right table. The right table must be partitioned on the join key. This distribution method is recommended when the number of partitions in the right table is equal to, or nearly equal to, a multiple of the number of query servers, for example, 14 partitions and 15 query servers. Note: If the right table is not partitioned on the join key, or if the join is not an equijoin on the partitioning key, the optimizer ignores this hint. |
| NONE, PARTITION | Maps rows from the right table based on the partitions of the left table. The left table must be partitioned on the join key. This distribution method is recommended when the number of partitions in the left table is equal to, or nearly equal to, a multiple of the number of query servers, for example, 14 partitions and 15 query servers. Note: If the left table is not partitioned on the join key, or if the join is not an equijoin on the partitioning key, the optimizer ignores this hint. |
| NONE, NONE | Each query server performs a join operation between a pair of matching partitions in each table. Both tables must be partitioned in the same way on the join key. |
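Because only six combinations are valid, it can be useful to check a pair before embedding it in a hint. The following Python sketch is a hypothetical helper, not an OceanBase tool. It covers only the four methods listed in the table above, and it assumes the space-separated form shown in the syntax section when it builds the hint text.

```python
# The six (outer_distribution, inner_distribution) pairs listed in the table above.
VALID_COMBINATIONS = {
    ("HASH", "HASH"),
    ("BROADCAST", "NONE"),
    ("NONE", "BROADCAST"),
    ("PARTITION", "NONE"),
    ("NONE", "PARTITION"),
    ("NONE", "NONE"),
}


def is_valid_distribution(outer, inner):
    """Return True if the (outer, inner) pair is one of the six listed combinations."""
    return (outer.upper(), inner.upper()) in VALID_COMBINATIONS


def pq_distribute_hint(table, outer, inner):
    """Build the hint text for a table, rejecting pairs that are not listed as valid."""
    if not is_valid_distribution(outer, inner):
        raise ValueError(f"invalid distribution pair: {outer}, {inner}")
    return f"PQ_DISTRIBUTE({table} {outer.upper()} {inner.upper()})"


hint = pq_distribute_hint("s", "hash", "hash")
```

A pair such as BROADCAST with BROADCAST is rejected with a ValueError instead of producing a hint that would not match any listed combination.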
Examples
The following query example specifies using a hash join to join the r and s tables, and includes a hint for hash distribution:
SELECT /*+ORDERED PQ_DISTRIBUTE(s HASH, HASH) USE_HASH (s) */ column_list
FROM r, s
WHERE r.c = s.c;
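To illustrate what the HASH, HASH distribution achieves, the following Python sketch routes rows of both tables to simulated query servers by hashing the join key, then joins each pair of partitions locally. It is a simplified model with hypothetical function names and sample data, not a description of OceanBase internals.

```python
def hash_distribute(rows, key, n_servers):
    """Route each row to a server chosen by hashing its join key."""
    buckets = [[] for _ in range(n_servers)]
    for row in rows:
        buckets[hash(row[key]) % n_servers].append(row)
    return buckets


def local_hash_join(r_rows, s_rows, key):
    """Hash join on one server: build a table on r_rows, probe it with s_rows."""
    table = {}
    for r in r_rows:
        table.setdefault(r[key], []).append(r)
    return [(r, s) for s in s_rows for r in table.get(s[key], [])]


def parallel_join(r, s, key, n_servers):
    """Distribute both inputs with the same hash function, then join pairwise."""
    r_parts = hash_distribute(r, key, n_servers)
    s_parts = hash_distribute(s, key, n_servers)
    out = []
    for r_part, s_part in zip(r_parts, s_parts):
        out.extend(local_hash_join(r_part, s_part, key))
    return out


# Hypothetical sample data: r has keys 0..5, s has keys 0..3 twice each.
r = [{"c": i, "r_id": i} for i in range(6)]
s = [{"c": i % 4, "s_id": i} for i in range(8)]
pairs = parallel_join(r, s, "c", n_servers=3)
```

Because both tables use the same hash function on the join key, rows that can match always land on the same server, so the union of the local joins equals the result of a single-server join.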
If you want to broadcast the left table r, the query statement with the hint is as follows:
SELECT /*+ORDERED PQ_DISTRIBUTE(s BROADCAST, NONE) USE_HASH (s) */ column_list
FROM r, s
WHERE r.c = s.c;
USE_NL_MATERIALIZATION Hint
The USE_NL_MATERIALIZATION hint forces the optimizer to generate a materialization operator to cache data when a table is specified as the right table (subtree). Its opposite is NO_USE_NL_MATERIALIZATION.
Syntax
/*+ USE_NL_MATERIALIZATION ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the USE_NL_MATERIALIZATION hint to instruct the optimizer to materialize the departments table in a nested loop join.
SELECT /*+ USE_NL_MATERIALIZATION(departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
NO_USE_NL_MATERIALIZATION Hint
The NO_USE_NL_MATERIALIZATION hint forces the optimizer to avoid generating a materialization operator to cache data when a table is specified as the right table (subtree). Its opposite is USE_NL_MATERIALIZATION.
Syntax
/*+ NO_USE_NL_MATERIALIZATION ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_NL_MATERIALIZATION hint to prevent the optimizer from materializing the departments table in a nested loop join.
-- This means that the data from the departments table will be accessed again during each nested loop join, rather than using the cached materialized results.
SELECT /*+ NO_USE_NL_MATERIALIZATION(departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
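The difference between the two materialization hints can be pictured with a small Python model. It is a sketch under stated assumptions: the CountingScan class and the sample rows are hypothetical, and the counter simply records how often the hinted table (departments in the examples above) is read.

```python
class CountingScan:
    """Wraps a list of rows and counts how many times it is scanned."""

    def __init__(self, rows):
        self.rows = rows
        self.scans = 0

    def scan(self):
        self.scans += 1
        return list(self.rows)


def nl_join(driving_rows, probed, key, materialize):
    """Nested loop join that optionally caches the probed table once up front."""
    cached = probed.scan() if materialize else None
    result = []
    for d in driving_rows:
        # Without materialization the probed table is read again for every driving row.
        probed_rows = cached if materialize else probed.scan()
        for p in probed_rows:
            if d[key] == p[key]:
                result.append((d, p))
    return result


# Hypothetical sample rows.
employees = [{"department_id": 10, "name": "Ann"},
             {"department_id": 20, "name": "Bob"},
             {"department_id": 10, "name": "Cy"}]
dept_rows = [{"department_id": 10, "dept": "Sales"},
             {"department_id": 20, "dept": "IT"}]

with_cache = CountingScan(dept_rows)
without_cache = CountingScan(dept_rows)
res_cached = nl_join(employees, with_cache, "department_id", materialize=True)
res_rescanned = nl_join(employees, without_cache, "department_id", materialize=False)
```

Both calls return the same rows. The cached variant reads the probed table once, while the other variant reads it once per driving row, which is the trade-off between memory for the cache and repeated access.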
Join Filter hint
There are four hints related to Join Filter. The first two control the Join Filter, and the last two control the Partial Join Filter:
- PX_JOIN_FILTER hint
- NO_PX_JOIN_FILTER hint
- PX_PART_JOIN_FILTER hint
- NO_PX_PART_JOIN_FILTER hint
Note that these four hints are effective only in parallel execution environments and have no significant effect in non-parallel environments.
Their syntax and parameter explanations are as follows:
PX_JOIN_FILTER hint
In a parallel execution environment, the PX_JOIN_FILTER hint instructs the optimizer to use JOIN FILTER for HASH JOIN. With this hint, you can designate a table that serves as the right table of a hash join and have the join filter applied to it during execution. The opposite operation is performed by the NO_PX_JOIN_FILTER hint.
Syntax
/*+ PX_JOIN_FILTER ( [ @qb_name ] filter_table [ left_tables ] [real_filter_table]) */
Parameter explanation
- qb_name: Specifies the query block to which the hint applies. This is an optional parameter.
- filter_table: Describes the single table to which the JOIN FILTER is pushed down. If it is a subquery, this should be the name of the view.
- left_tables: Specifies the left table for HASH-JOIN when allocating the JOIN FILTER. This is an optional parameter.
- real_filter_table: The single table in the subquery to which the JOIN FILTER is pushed down.
NO_PX_JOIN_FILTER hint
The NO_PX_JOIN_FILTER hint instructs the optimizer to disable JOIN FILTER for HASH JOIN. The opposite operation is performed by the PX_JOIN_FILTER hint.
Syntax
/*+ NO_PX_JOIN_FILTER( table ) */
PX_PART_JOIN_FILTER hint
The PX_PART_JOIN_FILTER hint instructs the optimizer to manually enable PART FILTER. The opposite operation is performed by the NO_PX_PART_JOIN_FILTER hint.
Syntax
/*+ PX_PART_JOIN_FILTER ( [ @qb_name ] filter_table [ left_tables ] [real_filter_table]) */
NO_PX_PART_JOIN_FILTER hint
The NO_PX_PART_JOIN_FILTER hint instructs the optimizer to manually disable PART FILTER. The opposite operation is performed by the PX_PART_JOIN_FILTER hint.
Syntax
/*+ NO_PX_PART_JOIN_FILTER (table) */
Application scenarios
The four Join Filter hints (PX_JOIN_FILTER, NO_PX_JOIN_FILTER, PX_PART_JOIN_FILTER, and NO_PX_PART_JOIN_FILTER) are typically used together with the LEADING and USE_HASH hints. Without LEADING and USE_HASH, the optimizer may choose a different join order or join algorithm, in which case the Join Filter hints do not take effect.
General scenarios
The examples in this section combine the Join Filter hints with the LEADING, USE_HASH, and PQ_DISTRIBUTE hints so that the join order and the join algorithm are fixed.
First, create a partitioned table:
CREATE TABLE t1 (
c1 INT,
c2 INT,
c3 INT,
c4 INT
) PARTITION BY HASH(c1) PARTITIONS 10;
Force the use of Join Filter
You can use the following SQL statement to force the application of Join Filter:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a b) USE_HASH(b) PQ_DISTRIBUTE(b BC2HOST NONE)
PX_JOIN_FILTER(b)
PX_PART_JOIN_FILTER(b)
*/ *
FROM t1 a, t1 b WHERE a.c1 = b.c1;
Or:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a b) USE_HASH(b) PQ_DISTRIBUTE(b BC2HOST NONE)
PX_JOIN_FILTER(b a)
PX_PART_JOIN_FILTER(b a)
*/ *
FROM t1 a, t1 b WHERE a.c1 = b.c1;
The execution plan output is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 456 |
| 1 | EXCHANGE OUT DISTR | :EX10001| 1 | 456 |
| 2 | SHARED HASH JOIN | | 1 | 455 |
| 3 | JOIN FILTER CREATE | :BF0001 | 1 | 228 |
| 4 | PART JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 5 | EXCHANGE IN DISTR | | 1 | 228 |
| 6 | EXCHANGE OUT DISTR (BC2HOST)| :EX10000| 1 | 228 |
| 7 | PX BLOCK ITERATOR | | 1 | 228 |
| 8 | TABLE SCAN | a | 1 | 228 |
| 9 | JOIN FILTER USE | :BF0001 | 1 | 228 |
| 10 | PX BLOCK HASH JOIN-FILTER | :BF0000 | 1 | 228 |
| 11 | TABLE SCAN | b | 1 | 228 |
===============================================================
Multi-table scenario
For a three-table join, when the left table is specified as a, you can generate a Join Filter for the right table c:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a (b c)) USE_HASH(c (b c)) PQ_DISTRIBUTE((b c) BC2HOST NONE) PQ_DISTRIBUTE(c BC2HOST NONE)
NO_PX_JOIN_FILTER(c)
NO_PX_JOIN_FILTER(b)
NO_PX_PART_JOIN_FILTER(c)
NO_PX_PART_JOIN_FILTER(b)
PX_JOIN_FILTER(c a)
*/ *
FROM t1 a, t1 b, t1 c WHERE a.c1 = c.c1 AND b.c1 = c.c1;
The execution plan output is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 684 |
| 1 | EXCHANGE OUT DISTR | :EX10002| 1 | 683 |
| 2 | SHARED HASH JOIN | | 1 | 683 |
| 3 | JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 4 | EXCHANGE IN DISTR | | 1 | 228 |
| 5 | EXCHANGE OUT DISTR (BC2HOST) | :EX10000| 1 | 228 |
| 6 | PX BLOCK ITERATOR | | 1 | 228 |
| 7 | TABLE SCAN | a | 1 | 228 |
| 8 | SHARED HASH JOIN | | 1 | 455 |
| 9 | EXCHANGE IN DISTR | | 1 | 228 |
| 10 | EXCHANGE OUT DISTR (BC2HOST) | :EX10001| 1 | 228 |
| 11 | PX BLOCK ITERATOR | | 1 | 228 |
| 12 | TABLE SCAN | b | 1 | 228 |
| 13 | JOIN FILTER USE | :BF0000 | 1 | 228 |
| 14 | PX BLOCK ITERATOR | | 1 | 228 |
| 15 | TABLE SCAN | c | 1 | 228 |
===============================================================
Similarly, for a three-table join where the left table is specified as b, you can generate a Join Filter for the right table c:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a (b c)) USE_HASH(c (b c)) PQ_DISTRIBUTE((b c) BC2HOST NONE) PQ_DISTRIBUTE(c BC2HOST NONE)
NO_PX_JOIN_FILTER(c)
NO_PX_JOIN_FILTER(b)
NO_PX_PART_JOIN_FILTER(c)
NO_PX_PART_JOIN_FILTER(b)
PX_JOIN_FILTER(c b)
*/ *
FROM t1 a, t1 b, t1 c WHERE a.c1 = c.c1 AND b.c1 = c.c1;
The execution plan output is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 684 |
| 1 | EXCHANGE OUT DISTR | :EX10002| 1 | 683 |
| 2 | SHARED HASH JOIN | | 1 | 683 |
| 3 | EXCHANGE IN DISTR | | 1 | 228 |
| 4 | EXCHANGE OUT DISTR (BC2HOST) | :EX10000| 1 | 228 |
| 5 | PX BLOCK ITERATOR | | 1 | 228 |
| 6 | TABLE SCAN | a | 1 | 228 |
| 7 | SHARED HASH JOIN | | 1 | 455 |
| 8 | JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 9 | EXCHANGE IN DISTR | | 1 | 228 |
| 10 | EXCHANGE OUT DISTR (BC2HOST) | :EX10001| 1 | 228 |
| 11 | PX BLOCK ITERATOR | | 1 | 228 |
| 12 | TABLE SCAN | b | 1 | 228 |
| 13 | JOIN FILTER USE | :BF0000 | 1 | 228 |
| 14 | PX BLOCK ITERATOR | | 1 | 228 |
| 15 | TABLE SCAN | c | 1 | 228 |
===============================================================
Conflict resolution for hints
Depending on whether the left table (left_tables) is specified, the PX_JOIN_FILTER and NO_PX_JOIN_FILTER hints take four valid forms. The optimizer matches them in the following order of priority:
| Hint | Function |
|---|---|
| NO_PX_JOIN_FILTER( a (b c) ) | When the left table is (b c), disables the join filter for the right table a. |
| PX_JOIN_FILTER( a (b c) ) | When the left table is (b c), uses the join filter for the right table a. |
| NO_PX_JOIN_FILTER( a ) | Disables the join filter for the right table a, regardless of the left table. |
| PX_JOIN_FILTER( a ) | Uses the join filter for the right table a, regardless of the left table. |
The conflict resolution for PX_PART_JOIN_FILTER and NO_PX_PART_JOIN_FILTER is the same as for PX_JOIN_FILTER.
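The matching order can be modeled with a short Python sketch. It assumes that the rows of the table above are listed from highest to lowest priority, which is an interpretation of this topic rather than a documented guarantee, and the function and data layout are hypothetical.

```python
def join_filter_enabled(hints, filter_table, left_tables):
    """Resolve conflicting hints for one join.

    hints: set of (enabled, filter_table, left_tables) tuples, where left_tables
    is a tuple of table names, or None when the hint names no left table.
    Returns True or False if a hint matches, or None if the optimizer is free to decide.
    """
    left_tables = tuple(left_tables)
    # Assumed priority: forms that name the left table win over generic forms,
    # and within each group the NO_ form wins.
    candidates = [
        (False, left_tables),   # NO_PX_JOIN_FILTER(filter_table left_tables)
        (True, left_tables),    # PX_JOIN_FILTER(filter_table left_tables)
        (False, None),          # NO_PX_JOIN_FILTER(filter_table)
        (True, None),           # PX_JOIN_FILTER(filter_table)
    ]
    for enabled, left in candidates:
        if (enabled, filter_table, left) in hints:
            return enabled
    return None


# Hints from the first multi-table example in this topic:
# NO_PX_JOIN_FILTER(c), NO_PX_JOIN_FILTER(b), PX_JOIN_FILTER(c a)
hints = {(False, "c", None), (False, "b", None), (True, "c", ("a",))}
decision_left_a = join_filter_enabled(hints, "c", ["a"])
decision_left_b = join_filter_enabled(hints, "c", ["b"])
```

Under this model, the specific PX_JOIN_FILTER(c a) hint overrides the generic NO_PX_JOIN_FILTER(c) hint only for the join whose left table is a, which agrees with the plan shown for that example, where a single JOIN FILTER CREATE operator appears above the scan of a.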
