The DISTINCT hint is used to guide the optimizer in handling queries with the DISTINCT keyword to take specific actions.
| Hint Type | Description |
|---|---|
USE_HASH_DISTINCT |
Enables the optimizer to use the HASH algorithm for executing the DISTINCT operation. Its reverse operation is the NO_USE_HASH_DISTINCT hint. |
NO_USE_HASH_DISTINCT |
Disables the optimizer from not using the HASH algorithm when executing the DISTINCT operation. Its reverse operation is the USE_HASH_DISTINCT hint. |
DISTINCT_PUSHDOWN |
Instructs the optimizer to allow DISTINCT pushdown in a distributed environment. Its reverse operation is the NO_DISTINCT_PUSHDOWN hint. |
NO_DISTINCT_PUSHDOWN |
Instructs the optimizer to disallow DISTINCT pushdown in a distributed environment. Its reverse operation is the DISTINCT_PUSHDOWN hint. |
USE_HASH_DISTINCT Hint
The USE_HASH_DISTINCT hint enables the optimizer to use the HASH algorithm for the DISTINCT operation. Its reverse operation is the NO_USE_HASH_DISTINCT hint.
Syntax
/*+ USE_HASH_DISTINCT [ ( [ @ qb_name ] ) ] */
Examples
The following query example shows how to use the USE_HASH_DISTINCT hint to enable the HASH deduplication algorithm.
-- Use the `USE_HASH_DISTINCT` hint to make the query use the `HASH` algorithm for the `DISTINCT` operation.
EXPLAIN BASIC SELECT /*+ USE_HASH_DISTINCT */ DISTINCT c1, c2, c3 FROM t1;
| Query Plan |
+-------------------------------------------------------------------+
| =========================== |
| |ID|OPERATOR |NAME| |
| --------------------------- |
| |0 |HASH DISTINCT | | |
| |1 |└─TABLE FULL SCAN|T1 | |
| =========================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([T1.C1], [T1.C2], [T1.C3]), filter(nil), rowset=16 |
| distinct([T1.C1], [T1.C2], [T1.C3]) |
| 1 - output([T1.C1], [T1.C2], [T1.C3]), filter(nil), rowset=16 |
| access([T1.C1], [T1.C2], [T1.C3]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([T1.__pk_increment]), range(MIN ; MAX)always true |
+-------------------------------------------------------------------+
14 rows in set
NO_USE_HASH_DISTINCT Hint
The NO_USE_HASH_DISTINCT Hint disables the specific algorithm used for executing the distinct operation. Its reverse operation is the USE_HASH_DISTINCT Hint.
Syntax
/*+ NO_USE_HASH_DISTINCT [ ( [ @ qb_name ] ) ] */
Examples
The following query example uses the NO_USE_HASH_DISTINCT Hint to ensure that the optimizer does not use the HASH algorithm for the DISTINCT operation but instead uses the MERGE algorithm. Since OceanBase Database supports both the HASH and MERGE algorithms for deduplication, using the NO_USE_HASH_DISTINCT Hint results in the use of the MERGE algorithm.
-- Use the `NO_USE_HASH_DISTINCT` Hint to ensure that the `MERGE` algorithm is used for the `DISTINCT` operation instead of the `HASH` algorithm.
EXPLAIN BASIC SELECT /*+ NO_USE_HASH_DISTINCT */ DISTINCT c1, c2, c3 FROM t1;
+-------------------------------------------------------------------+
| Query Plan |
+-------------------------------------------------------------------+
| ============================= |
| |ID|OPERATOR |NAME| |
| ----------------------------- |
| |0 |MERGE DISTINCT | | |
| |1 |└─SORT | | |
| |2 | └─TABLE FULL SCAN|T1 | |
| ============================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([T1.C1], [T1.C2], [T1.C3]), filter(nil), rowset=16 |
| distinct([T1.C1], [T1.C2], [T1.C3]) |
| 1 - output([T1.C1], [T1.C2], [T1.C3]), filter(nil), rowset=16 |
| sort_keys([T1.C1, ASC], [T1.C2, ASC], [T1.C3, ASC]) |
| 2 - output([T1.C1], [T1.C2], [T1.C3]), filter(nil), rowset=16 |
| access([T1.C1], [T1.C2], [T1.C3]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([T1.__pk_increment]), range(MIN ; MAX)always true |
+-------------------------------------------------------------------+
DISTINCT_PUSHDOWN Hint
The DISTINCT_PUSHDOWN hint instructs the optimizer to push down the DISTINCT operation during distributed execution, allowing for local deduplication at each partition node. Its reverse operation is the NO_DISTINCT_PUSHDOWN hint.
Syntax
/*+ DISTINCT_PUSHDOWN [ ( [ @ qb_name ] ) ] */
Examples
The following query example demonstrates how to use the DISTINCT_PUSHDOWN hint to enable the pushdown of the DISTINCT operation during distributed execution. If no hint is specified, the default behavior is to push down the DISTINCT operation.
-- Use the `DISTINCT_PUSHDOWN` hint to push down the `DISTINCT` operation to each partition node during distributed query execution.
CREATE TABLE tp1(c1 INT, c2 INT, c3 INT) PARTITION BY HASH(c1) PARTITIONS 10;
EXPLAIN BASIC SELECT /*+ DISTINCT_PUSHDOWN */ DISTINCT c2, c3 FROM tp1;
+---------------------------------------------------------------------------+
| Query Plan |
+---------------------------------------------------------------------------+
| =============================================== |
| |ID|OPERATOR |NAME | |
| ----------------------------------------------- |
| |0 |PX COORDINATOR | | |
| |1 |└─EXCHANGE OUT DISTR |:EX10001| |
| |2 | └─HASH DISTINCT | | |
| |3 | └─EXCHANGE IN DISTR | | |
| |4 | └─EXCHANGE OUT DISTR (HASH)|:EX10000| |
| |5 | └─HASH DISTINCT | | |
| |6 | └─PX PARTITION ITERATOR| | |
| |7 | └─TABLE FULL SCAN |TP1 | |
| =============================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([INTERNAL_FUNCTION(TP1.C2, TP1.C3)]), filter(nil), rowset=16 |
| 1 - output([INTERNAL_FUNCTION(TP1.C2, TP1.C3)]), filter(nil), rowset=16 |
| dop=1 |
| 2 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| distinct([TP1.C2], [TP1.C3]) |
| 3 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| 4 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| (#keys=2, [TP1.C2], [TP1.C3]), dop=1 |
| 5 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| distinct([TP1.C2], [TP1.C3]) |
| 6 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| force partition granule |
| 7 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| access([TP1.C2], [TP1.C3]), partitions(p[0-9]) |
| is_index_back=false, is_global_index=false, |
| range_key([TP1.__pk_increment]), range(MIN ; MAX)always true |
+---------------------------------------------------------------------------+
NO_DISTINCT_PUSHDOWN Hint
The NO_DISTINCT_PUSHDOWN hint prevents the optimizer from pushing down the DISTINCT operation during distributed execution. Its reverse operation is the DISTINCT_PUSHDOWN hint.
Syntax
/*+ NO_DISTINCT_PUSHDOWN [ ( [ @ qb_name ] ) ] */
Example
The following query example shows how to use the NO_DISTINCT_PUSHDOWN hint to prevent the optimizer from pushing down the DISTINCT operation during distributed execution.
-- Use the NO_DISTINCT_PUSHDOWN hint to prevent the optimizer from pushing down the DISTINCT operation during distributed execution.
CREATE TABLE tp1(c1 INT, c2 INT, c3 INT) PARTITION BY HASH(c1) PARTITIONS 10;
EXPLAIN BASIC SELECT /*+ NO_DISTINCT_PUSHDOWN */ DISTINCT c2, c3 FROM tp1;
+---------------------------------------------------------------------------+
| Query Plan |
+---------------------------------------------------------------------------+
| =============================================== |
| |ID|OPERATOR |NAME | |
| ----------------------------------------------- |
| |0 |PX COORDINATOR | | |
| |1 |└─EXCHANGE OUT DISTR |:EX10001| |
| |2 | └─HASH DISTINCT | | |
| |3 | └─EXCHANGE IN DISTR | | |
| |4 | └─EXCHANGE OUT DISTR (HASH)|:EX10000| |
| |5 | └─PX PARTITION ITERATOR | | |
| |6 | └─TABLE FULL SCAN |TP1 | |
| =============================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([INTERNAL_FUNCTION(TP1.C2, TP1.C3)]), filter(nil), rowset=16 |
| 1 - output([INTERNAL_FUNCTION(TP1.C2, TP1.C3)]), filter(nil), rowset=16 |
| dop=1 |
| 2 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| distinct([TP1.C2], [TP1.C3]) |
| 3 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| 4 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| (#keys=2, [TP1.C2], [TP1.C3]), dop=1 |
| 5 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| force partition granule |
| 6 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| access([TP1.C2], [TP1.C3]), partitions(p[0-9]) |
| is_index_back=false, is_global_index=false, |
| range_key([TP1.__pk_increment]), range(MIN ; MAX)always true |
+---------------------------------------------------------------------------+
27 rows in set