The following table describes hints that are related to join operations. These hints enable or disable specific join algorithms.
Hint type |
Description |
|---|---|
USE_MERGE |
When the table specified in this hint is the right table of a join, the optimizer uses the merge join algorithm. The reverse operation is NO_USE_MERGE. |
NO_USE_MERGE |
When the table specified in this hint is the right table of a join, the optimizer does not use the merge join algorithm. The reverse operation is USE_MERGE. |
USE_HASH |
When the table specified in this hint is the right table of a join, the optimizer uses the hash join algorithm. The reverse operation is NO_USE_HASH. |
NO_USE_HASH |
When the table specified in this hint is the right table of a join, the optimizer does not use the hash join algorithm. The reverse operation is USE_HASH. |
USE_NL |
When the table specified in this hint is the left table of a join, the optimizer uses the nested loops join algorithm. The reverse operation is NO_USE_NL. |
NO_USE_NL |
When the table specified in this hint is the left table of a join, the optimizer does not use the nested loops join algorithm. The reverse operation is USE_NL. |
PQ_DISTRIBUTE |
Controls the data distribution method for join operations. |
PQ_MAP |
Specifies the mapping strategy for join operations. |
USE_NL_MATERIALIZATION |
Forces the materialization of the left table in a nested loops join. The reverse operation is NO_USE_NL_MATERIALIZATION. |
NO_USE_NL_MATERIALIZATION |
Prevents the materialization of the left table in a nested loops join. The reverse operation is USE_NL_MATERIALIZATION. |
PX_JOIN_FILTER |
Indicates that the optimizer controls the use of JOIN FILTER in hash joins. The reverse operation is NO_PX_JOIN_FILTER. |
NO_PX_JOIN_FILTER |
Indicates that the optimizer disables JOIN FILTER in hash joins. The reverse operation is PX_JOIN_FILTER. |
PX_PART_JOIN_FILTER |
Indicates that the optimizer manually opens PART FILTER. The reverse operation is NO_PX_PART_JOIN_FILTER. |
NO_PX_PART_JOIN_FILTER |
Indicates that the optimizer manually closes PART FILTER. The reverse operation is NO_PX_PART_JOIN_FILTER. |
USE_MERGE Hint
USE_MERGE Hint specifies that the optimizer use a merge join algorithm when the specified table is the right table in a join. The reverse operation is NO_USE_MERGE.
Syntax
/*+ USE_MERGE ( [ @queryblock ] tablespec [ tablespec ]... ) */
Considerations
We recommend that you use
USE_NLandUSE_MERGEhints together withLEADINGorORDEREDhints.If the referenced table is the right table in a join, the optimizer uses these hints.
If the referenced table is the left table in a join, the optimizer ignores these hints.
USE_MERGEspecifies that the optimizer use a merge join algorithm when the specified table is the right table in a join.OceanBase Database must have an equijoin condition for a merge join. Therefore, if you join two tables without an equijoin condition, the
USE_MERGEhint is ineffective.
Examples
-- Use the USE_MERGE hint to indicate that the optimizer use a sort-merge join algorithm to execute the query.
-- In the join operation between the employees and departments tables, the employees table is the right table and the departments table is the left table.
SELECT /*+ USE_MERGE(employees departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
NO_USE_MERGE Hint
NO_USE_MERGE Hint specifies that the optimizer do not use a merge join algorithm when the specified table is the left table in a join. The reverse operation is USE_MERGE.
Syntax
/*+ NO_USE_MERGE ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_MERGE hint to indicate that the optimizer do not use a sort-merge join algorithm to execute the query.
-- In the join operation between the employees and departments tables, the optimizer does not use a sort-merge join algorithm.
SELECT /*+ NO_USE_MERGE(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
USE_HASH Hint
USE_HASH Hint specifies that the optimizer use a hash join algorithm when the specified table is the right table in a join. The reverse operation is NO_USE_HASH.
Syntax
/*+ USE_HASH ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the USE_HASH hint to indicate that the optimizer use a hash join algorithm to execute the query.
-- In the join operation between the orders and order_items tables, the orders table is the right table and the order_items table is the left table.
SELECT /*+ USE_HASH(l h) */ *
FROM orders h, order_items l
WHERE l.order_id = h.order_id
AND l.order_id > 2400;
NO_USE_HASH Hint
NO_USE_HASH Hint specifies that the optimizer do not use a hash join algorithm when the specified table is the right table in a join. The reverse operation is USE_HASH.
Syntax
/*+ NO_USE_HASH ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_HASH hint to indicate that the optimizer do not use a hash join algorithm to execute the query.
-- In the join operation between the employees and departments tables, the optimizer does not use a hash join algorithm.
SELECT /*+ NO_USE_HASH(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
USE_NL Hint
USE_NL Hint specifies that the optimizer use a nested loop join (NL-JOIN) algorithm when the specified table is the left table in a join. The reverse operation is NO_USE_NL.
We recommend that you use
USE_NLandUSE_MERGEhints together withLEADINGorORDEREDhints.If the referenced table is the left table in a join, the optimizer uses these hints.
If the referenced table is the right table in a join, the optimizer ignores these hints.
Syntax
/*+ USE_NL ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
The following query example shows that the hint forces the optimizer to use a nested loop join. The optimizer scans the entire orders table and applies the filter condition l.order_id = h.order_id to each row. For each row that satisfies the filter condition, the optimizer accesses the order_items table by using the order_id index.
-- Use the USE_NL hint to indicate that the optimizer use a nested loop join algorithm to execute the query.
-- In the join operation between the orders and order_items tables, the orders table is the right table and the order_items table is the left table.
SELECT /*+ USE_NL(l h) */ h.customer_id, l.unit_price * l.quantity
FROM orders h, order_items l
WHERE l.order_id = h.order_id;
NO_USE_NL Hint
NO_USE_NL Hint specifies that the optimizer do not use a nested loop join (NL-JOIN) algorithm when the specified table is the left table in a join. The reverse operation is USE_NL.
Syntax
/*+ NO_USE_NL ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_NL hint to indicate that the optimizer do not use a nested loop join algorithm to execute the query.
-- In the join operation between the employees and departments tables, the optimizer does not use a nested loop join algorithm.
SELECT /*+ NO_USE_NL(e d) */ *
FROM employees e, departments d
WHERE e.department_id = d.department_id;
PQ_DISTRIBUTE Hint
The PQ_DISTRIBUTE hint is used to indicate how the optimizer should distribute data between the producer (server responsible for generating query result rows) and consumer (server responsible for receiving and processing these rows) servers during parallel query execution. You can use the PQ_DISTRIBUTE hint to control the distribution of row data in join or load operations.
In parallel query scenarios, especially when dealing with large amounts of data, the PQ_DISTRIBUTE hint can help optimize resource usage and improve query performance.
Syntax
/*+ PQ_DISTRIBUTE
( [ @queryblock ] tablespec
{ distribution | outer_distribution inner_distribution }
) */
Control the distribution of joins
You can control the distribution of joins by specifying two distribution methods.
As shown in the lower part of the syntax:
outer_distributionspecifies the data distribution method for the left table.inner_distributionspecifies the data distribution method for the right table.
The distribution methods include HASH, BROADCAST, PARTITION, and NONE. Only the following six combinations of distribution methods are valid:
Distribution Method |
Description |
|---|---|
| HASH, HASH | A hash function is used on the join key to map rows from each table to the query servers. After mapping, each query server performs a join between a pair of result partitions. This distribution method is recommended when the sizes of the tables are comparable and the join operation is performed using a hash join or a sort-merge join. |
| BROADCAST, NONE | All rows from the right table are broadcasted to each query server. Rows from the left table are randomly partitioned. This distribution method is recommended when the right table is significantly smaller than the left table. Typically, this method is also recommended when the size of the left table multiplied by the number of query servers exceeds the size of the right table. |
| NONE, BROADCAST | All rows from the left table are broadcasted to each query server. Rows from the right table are randomly partitioned. This distribution method is recommended when the left table is significantly smaller than the right table. Typically, this method is also recommended when the size of the left table multiplied by the number of query servers is less than the size of the right table. |
| PARTITION, NONE | Rows from the left table are mapped based on the partitions of the right table. The left table must be partitioned on the join key. This distribution method is recommended when the number of partitions in the right table is equal to or nearly a multiple of the number of query servers. For example, if there are 14 partitions and 15 query servers. Note If the left table is not partitioned or the join is not evenly distributed on the partition key, the optimizer will ignore this hint. |
| NONE, PARTITION | Rows from the right table are mapped based on the partitions of the left table. The right table must be partitioned on the join key. This distribution method is recommended when the number of partitions in the right table is equal to or nearly a multiple of the number of query servers. For example, if there are 14 partitions and 15 query servers. Note If the right table is not partitioned on the partition key or the join is not evenly distributed, the optimizer will ignore this hint. |
| NONE, NONE | Each query server performs a join operation between a pair of matching partitions, one from each table. Both tables must be evenly distributed on the join key. |
Examples
The following query example shows how to specify a hash join to join the two tables r and s, and includes a hint for hash distribution:
SELECT /*+ORDERED PQ_DISTRIBUTE(s HASH, HASH) USE_HASH (s) */ column_list
FROM r, s
WHERE r.c = s.c;
If you want to broadcast the right table r, the query statement with the hint is as follows:
SELECT /*+ORDERED PQ_DISTRIBUTE(s BROADCAST, NONE) USE_HASH (s) */ column_list
FROM r, s
WHERE r.c = s.c;
USE_NL_MATERIALIZATION Hint
The USE_NL_MATERIALIZATION hint forces the optimizer to generate a materialization operator to cache data when a table is specified as the left table (subtree). Its opposite is NO_USE_NL_MATERIALIZATION.
Syntax
/*+ USE_NL_MATERIALIZATION ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the USE_NL_MATERIALIZATION hint to instruct the optimizer to materialize the departments table in a nested loop join
SELECT /*+ USE_NL_MATERIALIZATION(departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
NO_USE_NL_MATERIALIZATION Hint
The NO_USE_NL_MATERIALIZATION hint forces the optimizer to avoid generating a materialization operator to cache data when a table is specified as the left table (subtree). Its opposite is USE_NL_MATERIALIZATION.
Syntax
/*+ NO_USE_NL_MATERIALIZATION ( [ @queryblock ] tablespec [ tablespec ]... ) */
Examples
-- Use the NO_USE_NL_MATERIALIZATION hint to prevent the optimizer from materializing the departments table in a nested loop join
-- This means that the data from the departments table will be accessed again each time a nested loop join is performed, rather than using the cached materialized results
SELECT /*+ NO_USE_NL_MATERIALIZATION(departments) */ *
FROM employees, departments
WHERE employees.department_id = departments.department_id;
Join Filter Hint
There are four Join Filter hints. The first two are used to control the general Join Filter, and the last two are used to control the partial Join Filter:
PX_JOIN_FILTERHintNO_PX_JOIN_FILTERHintPX_PART_JOIN_FILTERHintNO_PX_PART_JOIN_FILTERHint
Note that these four hints only take effect in parallel execution environments and have no significant effect in non-parallel environments.
Their syntax and parameter explanations are as follows:
PX_JOIN_FILTER Hint
In parallel execution environments, the PX_JOIN_FILTER Hint instructs the optimizer to control the use of JOIN FILTER in HASH JOIN. Using this HINT allows you to specify a particular table as the right table in a hash join and apply join filter for filtering during execution. Its reverse operation is NO_PX_JOIN_FILTER.
Syntax
/*+ PX_JOIN_FILTER ( [ @qb_name ] filter_table [ left_tables ] [real_filter_table]) */
Parameter Explanation
qb_name: Specifies the query block where the hint takes effect. This is an optional parameter.filter_table: Describes the single table to which the JOIN FILTER is pushed down. If it's a subquery, this should be the name of the view.left_tables: Specifies the left table in the HASH-JOIN when allocating the JOIN FILTER. This is an optional parameter.real_filter_table: The single table in the subquery to which the JOIN FILTER is actually pushed down.
NO_PX_JOIN_FILTER Hint
The NO_PX_JOIN_FILTER Hint is used to instruct the optimizer to disable the JOIN FILTER in HASH JOIN. Its reverse operation is PX_JOIN_FILTER.
Syntax
/*+ NO_PX_JOIN_FILTER( table ) */
PX_PART_JOIN_FILTER Hint
The PX_PART_JOIN_FILTER Hint is used to instruct the optimizer to manually enable the PART FILTER. Its reverse operation is NO_PX_PART_JOIN_FILTER.
Syntax
/*+ PX_PART_JOIN_FILTER ( [ @qb_name ] filter_table [ left_tables ] [real_filter_table]) */
NO_PX_PART_JOIN_FILTER Hint
The NO_PX_PART_JOIN_FILTER Hint is used to instruct the optimizer to manually disable the PART FILTER. Its reverse operation is PX_PART_JOIN_FILTER.
Syntax
/*+ NO_PX_PART_JOIN_FILTER (table) */
Application Scenarios
The four Join Filter hints (PX_JOIN_FILTER, NO_PX_JOIN_FILTER, PX_PART_JOIN_FILTER, and NO_PX_PART_JOIN_FILTER) are typically used together with the leading and use_hash hints. If they are not used in conjunction with these hints, they may become ineffective due to the generation of different join orders or algorithms.
General Scenario
Join Filter hints are generally used together with LEADING and USE_HASH. Otherwise, they may become ineffective due to the generation of different join orders or algorithms.
First, create a partitioned table:
CREATE TABLE t1 (
c1 INT,
c2 INT,
c3 INT,
c4 INT
) PARTITION BY HASH(c1) PARTITIONS 10;
Force Join Filter
You can use the following SQL to force the application of a Join Filter:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a b) USE_HASH(b) PQ_DISTRIBUTE(b BC2HOST NONE)
PX_JOIN_FILTER(b)
PX_PART_JOIN_FILTER(b)
*/ *
FROM t1 a, t1 b WHERE a.c1 = b.c1;
Or:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a b) USE_HASH(b) PQ_DISTRIBUTE(b BC2HOST NONE)
PX_JOIN_FILTER(b a)
PX_PART_JOIN_FILTER(b a)
*/ *
FROM t1 a, t1 b WHERE a.c1 = b.c1;
The execution plan output is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 456 |
| 1 | EXCHANGE OUT DISTR | :EX10001| 1 | 456 |
| 2 | SHARED HASH JOIN | | 1 | 455 |
| 3 | JOIN FILTER CREATE | :BF0001 | 1 | 228 |
| 4 | PART JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 5 | EXCHANGE IN DISTR | | 1 | 228 |
| 6 | EXCHANGE OUT DISTR (BC2HOST)| :EX10000| 1 | 228 |
| 7 | PX BLOCK ITERATOR | | 1 | 228 |
| 8 | TABLE SCAN | a | 1 | 228 |
| 9 | JOIN FILTER USE | :BF0001 | 1 | 228 |
| 10 | PX BLOCK HASH JOIN-FILTER | :BF0000 | 1 | 228 |
| 11 | TABLE SCAN | b | 1 | 228 |
===============================================================
Multi-table Scenario
For a three-table join, when the left table is specified as a, you can generate a Join Filter for the right table c:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a (b c)) USE_HASH(c (b c)) PQ_DISTRIBUTE((b c) BC2HOST NONE) PQ_DISTRIBUTE(c BC2HOST NONE)
NO_PX_JOIN_FILTER(c)
NO_PX_JOIN_FILTER(b)
NO_PX_PART_JOIN_FILTER(c)
NO_PX_PART_JOIN_FILTER(b)
PX_JOIN_FILTER(c a)
*/ *
FROM t1 a, t1 b, t1 c WHERE a.c1 = c.c1 AND b.c1 = c.c1;
The execution plan output is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 684 |
| 1 | EXCHANGE OUT DISTR | :EX10002| 1 | 683 |
| 2 | SHARED HASH JOIN | | 1 | 683 |
| 3 | JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 4 | EXCHANGE IN DISTR | | 1 | 228 |
| 5 | EXCHANGE OUT DISTR (BC2HOST) | :EX10000| 1 | 228 |
| 6 | PX BLOCK ITERATOR | | 1 | 228 |
| 7 | TABLE SCAN | a | 1 | 228 |
| 8 | SHARED HASH JOIN | | 1 | 455 |
| 9 | EXCHANGE IN DISTR | | 1 | 228 |
| 10 | EXCHANGE OUT DISTR (BC2HOST) | :EX10001| 1 | 228 |
| 11 | PX BLOCK ITERATOR | | 1 | 228 |
| 12 | TABLE SCAN | b | 1 | 228 |
| 13 | JOIN FILTER USE | :BF0000 | 1 | 228 |
| 14 | PX BLOCK ITERATOR | | 1 | 228 |
| 15 | TABLE SCAN | c | 1 | 228 |
===============================================================
Similarly, for a three-table join with the left table specified as b, you can generate a Join Filter for the right table c:
EXPLAIN SELECT
/*+ PARALLEL(2) LEADING(a (b c)) USE_HASH(c (b c)) PQ_DISTRIBUTE((b c) BC2HOST NONE) PQ_DISTRIBUTE(c BC2HOST NONE)
NO_PX_JOIN_FILTER(c)
NO_PX_JOIN_FILTER(b)
NO_PX_PART_JOIN_FILTER(c)
NO_PX_PART_JOIN_FILTER(b)
PX_JOIN_FILTER(c b)
*/ *
FROM t1 a, t1 b, t1 c WHERE a.c1 = c.c1 AND b.c1 = c.c1;
The execution plan output is as follows:
===============================================================
| ID | OPERATOR | NAME | EST. ROWS | COST |
---------------------------------------------------------------
| 0 | PX COORDINATOR | | 1 | 684 |
| 1 | EXCHANGE OUT DISTR | :EX10002| 1 | 683 |
| 2 | SHARED HASH JOIN | | 1 | 683 |
| 3 | EXCHANGE IN DISTR | | 1 | 228 |
| 4 | EXCHANGE OUT DISTR (BC2HOST) | :EX10000| 1 | 228 |
| 5 | PX BLOCK ITERATOR | | 1 | 228 |
| 6 | TABLE SCAN | a | 1 | 228 |
| 7 | SHARED HASH JOIN | | 1 | 455 |
| 8 | JOIN FILTER CREATE | :BF0000 | 1 | 228 |
| 9 | EXCHANGE IN DISTR | | 1 | 228 |
| 10 | EXCHANGE OUT DISTR (BC2HOST) | :EX10001| 1 | 228 |
| 11 | PX BLOCK ITERATOR | | 1 | 228 |
| 12 | TABLE SCAN | b | 1 | 228 |
| 13 | JOIN FILTER USE | :BF0000 | 1 | 228 |
| 14 | PX BLOCK ITERATOR | | 1 | 228 |
| 15 | TABLE SCAN | c | 1 | 228 |
=======================================================================
Conflict Resolution
The PX_JOIN_FILTER and NO_PX_JOIN_FILTER hints can have four valid forms based on whether the left table left_tables is specified. According to the priority, they are matched and used as follows:
Hint |
Function |
|---|---|
| NO_PX_JOIN_FILTER( a (b c) ) | When the left table is (b c), prohibits the use of join filter for the right table a. |
| PX_JOIN_FILTER( a (b c) ) | When the left table is (b c), uses join filter for the right table a. |
| NO_PX_JOIN_FILTER( a ) | Prohibits the use of join filter for the right table a regardless of the left table. |
| PX_JOIN_FILTER( a ) | Uses join filter for the right table a regardless of the left table. |
The conflict resolution for PX_PART_JOIN_FILTER and NO_PX_PART_JOIN_FILTER is the same as for PX_JOIN_FILTER.
