The following table describes hints related to join operations in SQL queries. These hints are used to enable or disable specific join algorithms.
| Hint type | Description |
|---|---|
USE_MERGE |
When this hint is specified for the right table of a join, the merge join algorithm is used. The reverse operation is NO_USE_MERGE. |
NO_USE_MERGE |
When this hint is specified for the right table of a join, the merge join algorithm is not used. The reverse operation is USE_MERGE. |
USE_HASH |
When this hint is specified for the right table of a join, the hash join algorithm is used. The reverse operation is NO_USE_HASH. |
NO_USE_HASH |
When this hint is specified for the right table of a join, the hash join algorithm is not used. The reverse operation is USE_HASH. |
USE_NL |
When this hint is specified for the left table of a join, the nested loop join algorithm is used. The reverse operation is NO_USE_NL. |
NO_USE_NL |
When this hint is specified for the left table of a join, the nested loop join algorithm is not used. The reverse operation is USE_NL. |
PQ_DISTRIBUTE |
Controls the data distribution method for join operations. |
PQ_MAP |
Specifies the mapping strategy for join operations. |
USE_NL_MATERIALIZATION |
Forces the materialization of the left table in nested loop joins. The reverse operation is NO_USE_NL_MATERIALIZATION. |
NO_USE_NL_MATERIALIZATION |
Prevents the materialization of the left table in nested loop joins. The reverse operation is USE_NL_MATERIALIZATION. |
PX_JOIN_FILTER |
Indicates that the optimizer controls the use of JOIN FILTER in HASH JOIN. The reverse operation is NO_PX_JOIN_FILTER. |
NO_PX_JOIN_FILTER |
Indicates that the optimizer disables JOIN FILTER in HASH JOIN. The reverse operation is PX_JOIN_FILTER. |
PX_PART_JOIN_FILTER |
Indicates that the optimizer manually opens PART FILTER. The reverse operation is NO_PX_PART_JOIN_FILTER. |
NO_PX_PART_JOIN_FILTER |
Indicates that the optimizer manually closes PART FILTER. The reverse operation is NO_PX_PART_JOIN_FILTER. |
USE_MERGE Hint
USE_MERGE Hint specifies that the optimizer use the merge join algorithm when the specified table is the right table in a join. The reverse operation is NO_USE_MERGE.
Syntax
/*+ USE_MERGE ( [ @queryblock ] tablespec [ tablespec ]... ) */
Considerations
We recommend that you use the
USE_NLandUSE_MERGEhints together with theLEADINGorORDEREDhint.If the referenced table is the right table in a join, the optimizer uses the hint.
If the referenced table is the left table in a join, the optimizer ignores the hint.
The
USE_MERGEhint specifies that the optimizer use theMERGE JOINalgorithm when the specified table is the right table in a join.OceanBase Database must have an equijoin condition when it uses the
MERGE JOINalgorithm. Therefore, if you join two tables without an equijoin condition, theUSE_MERGEhint takes effect.
Examples
-- Use the USE_MERGE hint to indicate that the optimizer use the sort-merge join algorithm to execute the query.
-- For the join between the employees and departments tables, the employees table is the right table and the departments table is the left table.
SELECT /*+ USE_MERGE(employees departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
NO_USE_MERGE Hint
NO_USE_MERGE Hint specifies that the optimizer exclude the USE_MERGE hint from the join when the specified table is the left table in a join. The reverse operation is USE_MERGE.
Syntax
/*+ NO_USE_MERGE ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_MERGE hint to indicate that the optimizer do not use the sort-merge join algorithm to execute the query.
-- For the join between the employees and departments tables, the optimizer excludes the use of the sort-merge join algorithm.
SELECT /*+ NO_USE_MERGE(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
USE_HASH Hint
USE_HASH Hint specifies that the optimizer use the hash join algorithm when the specified table is the right table in a join. The reverse operation is NO_USE_HASH.
Syntax
/*+ USE_HASH ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the USE_HASH hint to indicate that the optimizer use the hash join algorithm (HASH-JOIN) to execute the query.
-- For the join between the orders and order_items tables, the orders table is the right table and the order_items table is the left table.
SELECT /*+ USE_HASH(l h) */ *
FROM orders h, order_items l
WHERE l.order_id = h.order_id
AND l.order_id > 2400;
NO_USE_HASH Hint
NO_USE_HASH Hint specifies that the optimizer do not use the hash join algorithm when the specified table is the right table in a join. The reverse operation is USE_HASH.
Syntax
/*+ NO_USE_HASH ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_HASH hint to indicate that the optimizer do not use the hash join algorithm to execute the query.
-- For the join between the employees and departments tables, the optimizer excludes the use of the hash join algorithm.
SELECT /*+ NO_USE_HASH(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
USE_NL Hint
USE_NL Hint specifies that the optimizer use the nested loop join (NL-JOIN) algorithm when the specified table is the left table in a join. The reverse operation is NO_USE_NL.
We recommend that you use the
USE_NLandUSE_MERGEhints together with theLEADINGorORDEREDhint.If the referenced table is the left table in a join, the optimizer uses the hint.
If the referenced table is the right table in a join, the optimizer ignores the hint.
Syntax
/*+ USE_NL ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
The following query example shows how to force the optimizer to use the nested loop join algorithm. The orders table is accessed by full table scan, and the filter condition l.order_id = h.order_id is applied to each row. For each row that satisfies the filter condition, the order_items table is accessed by using the order_id index.
-- Use the USE_NL hint to indicate that the optimizer use the nested loop join algorithm (NL-JOIN) to execute the query.
-- For the join between the orders and order_items tables, the orders table is the right table and the order_items table is the left table.
SELECT /*+ USE_NL(l h) */ h.customer_id, l.unit_price * l.quantity
FROM orders h, order_items l
WHERE l.order_id = h.order_id;
NO_USE_NL Hint
NO_USE_NL Hint specifies that the optimizer do not use the nested loop join (NL-JOIN) algorithm when the specified table is the left table in a join. The reverse operation is USE_NL.
Syntax
/*+ NO_USE_NL ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_NL hint to indicate that the optimizer do not use the nested loop join algorithm to execute the query.
-- For the join between the employees and departments tables, the optimizer excludes the use of the NL-JOIN algorithm.
SELECT /*+ NO_USE_NL(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
PQ_DISTRIBUTE Hint
PQ_DISTRIBUTE Hint specifies how to distribute data between the producer servers (which generate query result rows) and consumer servers (which receive and process these rows) during parallel query execution. You can use PQ_DISTRIBUTE Hint to control the distribution of row data in join or load operations.
In parallel query scenarios, especially when handling large volumes of data, PQ_DISTRIBUTE can optimize resource usage and enhance query performance.
Syntax
/*+ PQ_DISTRIBUTE
( [ @queryblock ] tablespec
{ distribution | outer_distribution inner_distribution }
) */
Control the distribution of joins
You can specify two distribution methods to control the distribution of joins.
As shown in the lower part of the syntax:
outer_distributionspecifies the data distribution method for the left table.inner_distributionspecifies the data distribution method for the right table.
The distribution methods include HASH, BROADCAST, PARTITION, and NONE. Only the following six combinations of distribution methods are valid:
| Distribution Method | Description |
|---|---|
| HASH, HASH | Uses a hash function on the join key to map rows from each table to query servers. After mapping, each query server performs a join between a pair of result partitions. This distribution method is recommended when the tables are of comparable size and the join operation is implemented using a hash join or a sort-merge join. |
| BROADCAST, NONE | All rows from the right table are broadcasted to each query server. Rows from the left table are randomly partitioned. This distribution method is recommended when the right table is significantly smaller than the left table. Typically, it is also recommended when the size of the left table multiplied by the number of query servers exceeds the size of the right table. |
| NONE, BROADCAST | All rows from the left table are broadcasted to each query server. Rows from the right table are randomly partitioned. This distribution method is recommended when the left table is significantly smaller than the right table. Typically, it is also recommended when the size of the left table multiplied by the number of query servers is less than the size of the right table. |
| PARTITION, NONE | Rows from the left table are mapped based on the partitions of the right table. The left table must be partitioned on the join key. This distribution method is recommended when the number of partitions in the right table is equal to or nearly a multiple of the number of query servers. For example, if there are 14 partitions and 15 query servers. Note If the left table is not partitioned or the join is not evenly distributed across the partitions, the optimizer will ignore this hint. |
| NONE, PARTITION | Rows from the right table are mapped based on the partitions of the left table. The right table must be partitioned on the join key. This distribution method is recommended when the number of partitions in the right table is equal to or nearly a multiple of the number of query servers. For example, if there are 14 partitions and 15 query servers. Note If the right table is not partitioned on the join key or the join is not evenly distributed across the partitions, the optimizer will ignore this hint. |
| NONE, NONE | Each query server performs a join operation between a pair of matching partitions, with one partition in each table. Both tables must be evenly distributed across the join key. |
Examples
The following query example specifies a hash join to join the r and s tables, and includes a hint for hash distribution:
SELECT /*+ORDERED PQ_DISTRIBUTE(s HASH, HASH) USE_HASH (s) */ column_list
FROM r, s
WHERE r.c = s.c;
If you want to broadcast the right table r, the query statement with the hint is as follows:
SELECT /*+ORDERED PQ_DISTRIBUTE(s BROADCAST, NONE) USE_HASH (s) */ column_list
FROM r, s
WHERE r.c = s.c;
USE_NL_MATERIALIZATION Hint
USE_NL_MATERIALIZATION Hint forces the optimizer to generate a materialization operator to cache data when the specified table is the left table (subtree). Its opposite operation is NO_USE_NL_MATERIALIZATION.
Syntax
/*+ USE_NL_MATERIALIZATION ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the USE_NL_MATERIALIZATION Hint to instruct the optimizer to materialize the departments table in a nested loop join.
SELECT /*+ USE_NL_MATERIALIZATION(departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
NO_USE_NL_MATERIALIZATION Hint
NO_USE_NL_MATERIALIZATION Hint forces the optimizer to avoid generating a materialization operator to cache data when the specified table is the left table (subtree). Its opposite operation is USE_NL_MATERIALIZATION.
Syntax
/*+ NO_USE_NL_MATERIALIZATION ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_NL_MATERIALIZATION Hint to prevent the optimizer from materializing the departments table in a nested loop join.
-- This means that the data from the departments table will be accessed each time a nested loop join is performed, instead of using the cached materialized results.
SELECT /*+ NO_USE_NL_MATERIALIZATION(departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
Join Filter Hint
There are four Join Filter hints. The first two are used to control the normal Join Filter, and the last two are used to control the partial Join Filter:
PX_JOIN_FILTERHintNO_PX_JOIN_FILTERHintPX_PART_JOIN_FILTERHintNO_PX_PART_JOIN_FILTERHint
Note that these four hints take effect only in parallel execution environments. They have no significant effect in non-parallel environments.
The syntax and parameter explanations are as follows:
PX_JOIN_FILTER Hint
In a parallel execution environment, the PX_JOIN_FILTER hint indicates that the optimizer should control the use of JOIN FILTER in HASH JOIN. This hint allows you to specify a particular table as the right table in a hash join and apply the join filter for filtering during execution. Its reverse operation is NO_PX_JOIN_FILTER.
Syntax
/*+ PX_JOIN_FILTER ( [ @qb_name ] filter_table [ left_tables ] [real_filter_table]) */
Parameters
qb_name: Specifies the query block where the hint takes effect. This is an optional parameter.filter_table: Describes the single table to which the JOIN FILTER is pushed down. If it's a subquery, this should be the name of the view.left_tables: Specifies the left table in the HASH-JOIN when allocating the JOIN FILTER. This is an optional parameter.real_filter_table: The single table in the subquery where the JOIN FILTER is actually pushed down.
NO_PX_JOIN_FILTER Hint
The NO_PX_JOIN_FILTER hint indicates that the optimizer should control the disablement of JOIN FILTER in HASH JOIN. Its reverse operation is PX_JOIN_FILTER.
Syntax
/*+ NO_PX_JOIN_FILTER( table ) */
PX_PART_JOIN_FILTER Hint
The PX_PART_JOIN_FILTER hint indicates that the optimizer should manually enable the PART FILTER. Its reverse operation is NO_PX_PART_JOIN_FILTER.
Syntax
/*+ PX_PART_JOIN_FILTER ( [ @qb_name ] filter_table [ left_tables ] [real_filter_table]) */
NO_PX_PART_JOIN_FILTER Hint
The NO_PX_PART_JOIN_FILTER hint indicates that the optimizer should manually disable the PART FILTER. Its reverse operation is PX_PART_JOIN_FILTER.
Syntax
/*+ NO_PX_PART_JOIN_FILTER (table) */
Scenarios
The four Join Filter hints (PX_JOIN_FILTER, NO_PX_JOIN_FILTER, PX_PART_JOIN_FILTER, and NO_PX_PART_JOIN_FILTER) are typically used together with the leading and use_hash hints. If they are not used with these hints, they may become ineffective due to the generation of other join orders and algorithms.
General Scenario
Join Filter hints are generally used together with the LEADING and USE_HASH hints. Otherwise, they may become ineffective due to the generation of different join orders or algorithms.
First, create a partitioned table:
CREATE TABLE t1 (
c1 INT,
c2 INT,
c3 INT,
c4 INT
) PARTITION BY HASH(c1) PARTITIONS 10;
Force Join Filter Usage
You can use the following SQL to force the application of a Join Filter:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a b) USE_HASH(b) PQ_DISTRIBUTE(b BC2HOST NONE)
PX_JOIN_FILTER(b)
PX_PART_JOIN_FILTER(b)
*/ *
FROM t1 a, t1 b WHERE a.c1 = b.c1;
Or:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a b) USE_HASH(b) PQ_DISTRIBUTE(b BC2HOST NONE)
PX_JOIN_FILTER(b a)
PX_PART_JOIN_FILTER(b a)
*/ *
FROM t1 a, t1 b WHERE a.c1 = b.c1;
The execution plan output is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 456 |
| 1 | EXCHANGE OUT DISTR | :EX10001| 1 | 456 |
| 2 | SHARED HASH JOIN | | 1 | 455 |
| 3 | JOIN FILTER CREATE | :BF0001 | 1 | 228 |
| 4 | PART JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 5 | EXCHANGE IN DISTR | | 1 | 228 |
| 6 | EXCHANGE OUT DISTR (BC2HOST)| :EX10000| 1 | 228 |
| 7 | PX BLOCK ITERATOR | | 1 | 228 |
| 8 | TABLE SCAN | a | 1 | 228 |
| 9 | JOIN FILTER USE | :BF0001 | 1 | 228 |
| 10 | PX BLOCK HASH JOIN-FILTER | :BF0000 | 1 | 228 |
| 11 | TABLE SCAN | b | 1 | 228 |
===============================================================
Multi-Table Scenario
For a three-table join, when the left table is specified as a, you can generate a Join Filter for the right table c:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a (b c)) USE_HASH(c (b c)) PQ_DISTRIBUTE((b c) BC2HOST NONE) PQ_DISTRIBUTE(c BC2HOST NONE)
NO_PX_JOIN_FILTER(c)
NO_PX_JOIN_FILTER(b)
NO_PX_PART_JOIN_FILTER(c)
NO_PX_PART_JOIN_FILTER(b)
PX_JOIN_FILTER(c a)
*/ *
FROM t1 a, t1 b, t1 c WHERE a.c1 = c.c1 AND b.c1 = c.c1;
The execution plan output is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 684 |
| 1 | EXCHANGE OUT DISTR | :EX10002| 1 | 683 |
| 2 | SHARED HASH JOIN | | 1 | 683 |
| 3 | JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 4 | EXCHANGE IN DISTR | | 1 | 228 |
| 5 | EXCHANGE OUT DISTR (BC2HOST) | :EX10000| 1 | 228 |
| 6 | PX BLOCK ITERATOR | | 1 | 228 |
| 7 | TABLE SCAN | a | 1 | 228 |
| 8 | SHARED HASH JOIN | | 1 | 455 |
| 9 | EXCHANGE IN DISTR | | 1 | 228 |
| 10 | EXCHANGE OUT DISTR (BC2HOST) | :EX10001| 1 | 228 |
| 11 | PX BLOCK ITERATOR | | 1 | 228 |
| 12 | TABLE SCAN | b | 1 | 228 |
| 13 | JOIN FILTER USE | :BF0000 | 1 | 228 |
| 14 | PX BLOCK ITERATOR | | 1 | 228 |
| 15 | TABLE SCAN | c | 1 | 228 |
===============================================================
Similarly, for a three-table join with the left table specified as b, you can generate a Join Filter for the right table c:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a (b c)) USE_HASH(c (b c)) PQ_DISTRIBUTE((b c) BC2HOST NONE) PQ_DISTRIBUTE(c BC2HOST NONE)
NO_PX_JOIN_FILTER(c)
NO_PX_JOIN_FILTER(b)
NO_PX_PART_JOIN_FILTER(c)
NO_PX_PART_JOIN_FILTER(b)
PX_JOIN_FILTER(c b)
*/ *
FROM t1 a, t1 b, t1 c WHERE a.c1 = c.c1 AND b.c1 = c.c1;
The execution plan output is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 684 |
| 1 | EXCHANGE OUT DISTR | :EX10002| 1 | 683 |
| 2 | SHARED HASH JOIN | | 1 | 683 |
| 3 | EXCHANGE IN DISTR | | 1 | 228 |
| 4 | EXCHANGE OUT DISTR (BC2HOST) | :EX10000| 1 | 228 |
| 5 | PX BLOCK ITERATOR | | 1 | 228 |
| 6 | TABLE SCAN | a | 1 | 228 |
| 7 | SHARED HASH JOIN | | 1 | 455 |
| 8 | JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 9 | EXCHANGE IN DISTR | | 1 | 228 |
| 10 | EXCHANGE OUT DISTR (BC2HOST) | :EX10001| 1 | 228 |
| 11 | PX BLOCK ITERATOR | | 1 | 228 |
| 12 | TABLE SCAN | b | 1 | 228 |
| 13 | JOIN FILTER USE | :BF0000 | 1 | 228 |
| 14 | PX BLOCK ITERATOR | | 1 | 228 |
| 15 | TABLE SCAN | c | 1 | 228 |
=======================================================================
Conflict Resolution
The PX_JOIN_FILTER and NO_PX_JOIN_FILTER hints can have four valid forms based on whether the left table left_tables is specified. According to the priority, they are matched and used as follows:
| Hint | Function |
|---|---|
| NO_PX_JOIN_FILTER( a (b c) ) | When the left table is (b c), it prohibits the use of join filter for the right table's a. |
| PX_JOIN_FILTER( a (b c) ) | When the left table is (b c), it uses join filter for the right table's a. |
| NO_PX_JOIN_FILTER( a ) | It prohibits the use of join filter for the right table's a for any left table. |
| PX_JOIN_FILTER( a ) | It uses join filter for the right table's a for any left table. |
The conflict resolution for PX_PART_JOIN_FILTER and NO_PX_PART_JOIN_FILTER is the same as for PX_JOIN_FILTER.