The DISTINCT hint specifies how the optimizer should handle queries that contain the DISTINCT keyword.
Hint type |
Description |
|---|---|
USE_HASH_DISTINCT |
Enables the optimizer to use the HASH algorithm to execute the DISTINCT operation. Its reverse operation is the NO_USE_HASH_DISTINCT hint. |
NO_USE_HASH_DISTINCT |
Disables the optimizer from using the HASH algorithm when executing the DISTINCT operation. Its reverse operation is the USE_HASH_DISTINCT hint. |
DISTINCT_PUSHDOWN |
Indicates that the optimizer can perform DISTINCT pushdown in a distributed environment. Its reverse operation is the NO_DISTINCT_PUSHDOWN hint. |
NO_DISTINCT_PUSHDOWN |
Indicates that the optimizer cannot perform DISTINCT pushdown in a distributed environment. Its reverse operation is the DISTINCT_PUSHDOWN hint. |
USE_HASH_DISTINCT Hint
The USE_HASH_DISTINCT hint enables the optimizer to use the HASH algorithm for the DISTINCT operation. Its reverse operation is the NO_USE_HASH_DISTINCT hint.
Syntax
/*+ USE_HASH_DISTINCT [ ( [ @ qb_name ] ) ] */
Examples
The following query example shows how to use the USE_HASH_DISTINCT hint to enable the HASH deduplication algorithm.
-- Use the `USE_HASH_DISTINCT` hint to make the query use the `HASH` algorithm for the `DISTINCT` operation.
EXPLAIN BASIC SELECT /*+ USE_HASH_DISTINCT */ DISTINCT c1, c2, c3 FROM t1;
| Query Plan |
+-------------------------------------------------------------------+
| =========================== |
| |ID|OPERATOR |NAME| |
| --------------------------- |
| |0 |HASH DISTINCT | | |
| |1 |└─TABLE FULL SCAN|T1 | |
| =========================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([T1.C1], [T1.C2], [T1.C3]), filter(nil), rowset=16 |
| distinct([T1.C1], [T1.C2], [T1.C3]) |
| 1 - output([T1.C1], [T1.C2], [T1.C3]), filter(nil), rowset=16 |
| access([T1.C1], [T1.C2], [T1.C3]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([T1.__pk_increment]), range(MIN ; MAX)always true |
+-------------------------------------------------------------------+
14 rows in set
NO_USE_HASH_DISTINCT Hint
The NO_USE_HASH_DISTINCT Hint disables the specific algorithm used for executing distinct operations. Its reverse operation is the USE_HASH_DISTINCT Hint.
Syntax
/*+ NO_USE_HASH_DISTINCT [ ( [ @ qb_name ] ) ] */
Examples
The following query example uses the NO_USE_HASH_DISTINCT Hint to disable the HASH algorithm for executing DISTINCT operations. Since OceanBase Database supports both the HASH and MERGE algorithms for deduplication, using the NO_USE_HASH_DISTINCT Hint results in the use of the MERGE algorithm.
-- Use the `NO_USE_HASH_DISTINCT` Hint to ensure that the `MERGE` algorithm is used for deduplication instead of the `HASH` algorithm when the `DISTINCT` operation is executed.
EXPLAIN BASIC SELECT /*+ NO_USE_HASH_DISTINCT */ DISTINCT c1, c2, c3 FROM t1;
+-------------------------------------------------------------------+
| Query Plan |
+-------------------------------------------------------------------+
| ============================= |
| |ID|OPERATOR |NAME| |
| ----------------------------- |
| |0 |MERGE DISTINCT | | |
| |1 |└─SORT | | |
| |2 | └─TABLE FULL SCAN|T1 | |
| ============================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([T1.C1], [T1.C2], [T1.C3]), filter(nil), rowset=16 |
| distinct([T1.C1], [T1.C2], [T1.C3]) |
| 1 - output([T1.C1], [T1.C2], [T1.C3]), filter(nil), rowset=16 |
| sort_keys([T1.C1, ASC], [T1.C2, ASC], [T1.C3, ASC]) |
| 2 - output([T1.C1], [T1.C2], [T1.C3]), filter(nil), rowset=16 |
| access([T1.C1], [T1.C2], [T1.C3]), partitions(p0) |
| is_index_back=false, is_global_index=false, |
| range_key([T1.__pk_increment]), range(MIN ; MAX)always true |
+-------------------------------------------------------------------+
DISTINCT_PUSHDOWN Hint
The DISTINCT_PUSHDOWN hint tells the optimizer to push down the DISTINCT operation during distributed execution, allowing for local deduplication at each partition node. Its reverse operation is the NO_DISTINCT_PUSHDOWN hint.
Syntax
/*+ DISTINCT_PUSHDOWN [ ( [ @ qb_name ] ) ] */
Examples
The following example shows how to use the DISTINCT_PUSHDOWN hint to enable the pushdown of the DISTINCT operation during distributed execution. If no hint is specified, the DISTINCT operation is pushed down by default.
-- Use the `DISTINCT_PUSHDOWN` hint to push down the `DISTINCT` operation to each partition node during distributed query execution.
CREATE TABLE tp1(c1 INT, c2 INT, c3 INT) PARTITION BY HASH(c1) PARTITIONS 10;
EXPLAIN BASIC SELECT /*+ DISTINCT_PUSHDOWN */ DISTINCT c2, c3 FROM tp1;
+---------------------------------------------------------------------------+
| Query Plan |
+---------------------------------------------------------------------------+
| =============================================== |
| |ID|OPERATOR |NAME | |
| ----------------------------------------------- |
| |0 |PX COORDINATOR | | |
| |1 |└─EXCHANGE OUT DISTR |:EX10001| |
| |2 | └─HASH DISTINCT | | |
| |3 | └─EXCHANGE IN DISTR | | |
| |4 | └─EXCHANGE OUT DISTR (HASH)|:EX10000| |
| |5 | └─HASH DISTINCT | | |
| |6 | └─PX PARTITION ITERATOR| | |
| |7 | └─TABLE FULL SCAN |TP1 | |
| =============================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([INTERNAL_FUNCTION(TP1.C2, TP1.C3)]), filter(nil), rowset=16 |
| 1 - output([INTERNAL_FUNCTION(TP1.C2, TP1.C3)]), filter(nil), rowset=16 |
| dop=1 |
| 2 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| distinct([TP1.C2], [TP1.C3]) |
| 3 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| 4 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| (#keys=2, [TP1.C2], [TP1.C3]), dop=1 |
| 5 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| distinct([TP1.C2], [TP1.C3]) |
| 6 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| force partition granule |
| 7 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| access([TP1.C2], [TP1.C3]), partitions(p[0-9]) |
| is_index_back=false, is_global_index=false, |
| range_key([TP1.__pk_increment]), range(MIN ; MAX)always true |
+---------------------------------------------------------------------------+
NO_DISTINCT_PUSHDOWN Hint
The NO_DISTINCT_PUSHDOWN hint prevents the optimizer from pushing down the DISTINCT operation during distributed execution. Its reverse operation is the DISTINCT_PUSHDOWN hint.
Syntax
/*+ NO_DISTINCT_PUSHDOWN [ ( [ @ qb_name ] ) ] */
Examples
The following query example uses the NO_DISTINCT_PUSHDOWN hint to prevent the optimizer from pushing down the DISTINCT operation during distributed execution.
-- Use the NO_DISTINCT_PUSHDOWN hint to prevent the optimizer from pushing down the DISTINCT operation during distributed execution.
CREATE TABLE tp1(c1 INT, c2 INT, c3 INT) PARTITION BY HASH(c1) PARTITIONS 10;
EXPLAIN BASIC SELECT /*+ NO_DISTINCT_PUSHDOWN */ DISTINCT c2, c3 FROM tp1;
+---------------------------------------------------------------------------+
| Query Plan |
+---------------------------------------------------------------------------+
| =============================================== |
| |ID|OPERATOR |NAME | |
| ----------------------------------------------- |
| |0 |PX COORDINATOR | | |
| |1 |└─EXCHANGE OUT DISTR |:EX10001| |
| |2 | └─HASH DISTINCT | | |
| |3 | └─EXCHANGE IN DISTR | | |
| |4 | └─EXCHANGE OUT DISTR (HASH)|:EX10000| |
| |5 | └─PX PARTITION ITERATOR | | |
| |6 | └─TABLE FULL SCAN |TP1 | |
| =============================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([INTERNAL_FUNCTION(TP1.C2, TP1.C3)]), filter(nil), rowset=16 |
| 1 - output([INTERNAL_FUNCTION(TP1.C2, TP1.C3)]), filter(nil), rowset=16 |
| dop=1 |
| 2 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| distinct([TP1.C2], [TP1.C3]) |
| 3 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| 4 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| (#keys=2, [TP1.C2], [TP1.C3]), dop=1 |
| 5 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| force partition granule |
| 6 - output([TP1.C2], [TP1.C3]), filter(nil), rowset=16 |
| access([TP1.C2], [TP1.C3]), partitions(p[0-9]) |
| is_index_back=false, is_global_index=false, |
| range_key([TP1.__pk_increment]), range(MIN ; MAX)always true |
+---------------------------------------------------------------------------+
27 rows in set
