Hints related to parallel execution|V4.2.1| docs|Distributed Database

Hints related to parallel execution

Last Updated：2023-12-25 08:49:42 Updated

Hints related to parallel execution are PARALLEL, NO_PARALLEL, USE_PX, and PQ_DISTRIBUTE.

PARALLEL hint

The PARALLEL hint is a statement-level hint that instructs the optimizer to specify the number of parallel threads that can be used for a parallel operation. The hint overrides the value of PARALLEL_DEGREE_ POLICY. The hint can be used as a part of the SELECT, INSERT, UPDATE, and DELETE statements, or as a part of a table scan. The PARALLEL hint is ignored if any parallel limit is violated.

The syntax of the PARALLEL hint is as follows:

/*+ PARALLEL(n) */

The value of the n parameter in the PARALLE hint specifies the degree of parallelism (DOP). If the statement includes the SORT BY or GROUP BY clause, the number of available threads is twice the value specified by the PARALLEL hint.

In the following example, the specified DOP is 5:

SELECT /*+ PARALLEL(5) */ last_name
FROM emp;

OceanBase Database also supports table-level PARALLEL hints. The syntax is as follows:

/*+ PARALLEL(table_name n) */

If the global DOP and table-level DOP are both specified, the table-level DOP does not take effect.

NO_PARALLEL hint

The NO_PARALLEL hint instructs the optimizer to execute statements serially. It overrides the PARALLEL parameter for creating or changing the table.

The syntax of the NO_PARALLEL hint is as follows:

/*+ NO_PARALLEL ( [ @queryblock ] tablespec ) */

In the following example, the NO_PARALLEL hint overwrites the PARALLEL parameter value.

ALTER TABLE emp PARALLEL 6;
SELECT /*+ NO_PARALLEL(hr_emp) */ last_name
  FROM emp hr_emp;

USE_PX hint

The USE_PX hint forces the server to execute SQL statements in parallel execution (PX) mode, which allows multithreading in statement execution. USE_PX and PARALLEL hints are often used in pairs.

The syntax of the USE_PX hint is as follows:

/*+ USE_PX */

Here is an example of the USE_PX hint:

SELECT /*+ USE_PX PARALLEL(4)*/ e.dept_id, sum(e.salary)
FROM emp e
WHERE e.dept_id = 1001 GROUP BY e.dept_id;

PQ_DISTRIBUTE hint

The PQ_DISTRIBUTE hint instructs the optimizer on how to distribute rows between the producer server and the consumer server. The producer server is also known as the query server, and the consumer server is also known as the load server. You can use this hint to specify the row distribution method for the join or load. The syntax of the PQ_DISTRIBUTE hint is as follows:

/*+ PQ_DISTRIBUTE
  ( [ @queryblock ] tablespec
    { distribution | outer_distribution inner_distribution }
  ) */

Specify row distribution for joins

You can use two assignment methods to specify the row distribution for joins.

In the preceding syntax:

outer_distribution specifies the data distribution mode for the table on the left.
inner_distribution specifies the data distribution mode for the table on the right.

The distribution methods are HASH, BROADCAST, PARTITION, and NONE. Only the six combinations of distribution methods in the following table are valid.

Distribution method	Description
HASH, HASH	A hash function on the join key is used to map rows of each table to query servers. After the mapping is complete, each query server performs the join operation between a pair of resulting partitions. We recommend that you use this distribution method when the tables are comparable in size and the join operation is hash join or sort-merge join.
BROADCAST, NONE	All rows of the left-side table are broadcast to each query server. Rows of the right-side table are randomly partitioned. We recommend that you use this distribution method when the size of the left-side table is far smaller than that of the right-side table, or when the size of the right-side table multiplied by the number of query servers is greater than the size of the left-side table.
NONE, BROADCAST	All rows of the right-side table are broadcast to each query server. Rows of the left-side table are randomly partitioned. We recommend that you use this distribution method when the size of the right-side table is far smaller than that of the left-side table, or when the size of the right-side table multiplied by the number of query servers is smaller than that of the left-side table.
PARTITION, NONE	The rows of the left-side table are mapped by using the partitioning of the right-side table. The right-side table must be partitioned on the join keys. We recommend that you use this distribution method when the number of partitions of the left-side table equals or nearly equals a multiple of the number of query servers. For example, you can use this method when you have 14 partitions and 15 query servers. Notice: If the right-side table is not partitioned or not equally joined on the partitioning key, the optimizer ignores this hint.
NONE, PARTITION	The rows of the right-side table are mapped by using the partitioning of the left-side table. The left-side table must be partitioned on the join keys. We recommend that you use this distribution method when the number of partitions of the left-side table equals or nearly equals a multiple of the number of query servers. For example, you can use this method when you have 14 partitions and 15 query servers. Notice: If the left-side table is not partitioned or not equally joined on the partitioning key, the optimizer ignores this hint.
NONE, NONE	Each query server performs the join operation between a pair of matching partitions, one from each table. Both tables must be equally partitioned on the join keys.

In the following example, tables r and s are joined by using a hash join, and the query contains a hint to use hash distribution:

SELECT /*+ORDERED PQ_DISTRIBUTE(s HASH, HASH) USE_HASH (s)*/ column_list
FROM r,s
WHERE r.c=s.c;

To broadcast the left-side table r, use the following query statement that includes a hint:

SELECT /*+ORDERED PQ_DISTRIBUTE(s BROADCAST, NONE) USE_HASH (s) */ column_list
FROM r,s
WHERE r.c=s.c;