In OceanBase Database, the performance of a parallel query is determined by the values of the parallel execution parameters, including the EXCHANGE parameter and those related to the degree of parallelism (DOP).
DOP parameters
DOP parameters determine the number of parallel worker threads for each query. The following table describes the parameters.
| Parameter | Description | Value range | Default value | Recommended configuration |
|---|---|---|---|---|
| parallel_servers_target | Before a query is put into a queue, the system checks whether the required DOP and the total number of required workers exceeds the value that you specified for this parameter. If yes, the system queues the query. Otherwise, the system proceeds to the query execution. | [0, 1800] | 10. The value is determined based on the actual number of CPUs. |
This parameter controls whether to proceed to the parallel execution of a query or queue it if the number of available worker threads is insufficient. |
| _force_parallel_query_dop | Specifies the default DOP for an SQL query in a session. If no PARALLEL hint is specified, this parameter specifies the actual DOP for the SQL query. |
[1, +∞] | 1 | You can set this parameter as needed. For example, if you want to execute some SQL queries in parallel but do not want to add a PARALLEL hint to each of them, we recommend that you set this parameter. |
| _force_parallel_dml_dop | Specifies the default DOP of DML statements in a session. If no PARALLEL hint is specified, this parameter specifies the actual DOP for the DML statements. |
[1,+∞] | 1 | You can set this parameter as needed. For example, if you want to execute some DML statements in parallel but do not want to add a PARALLEL hint to each of them, we recommend that you set this parameter. |
You can run the SHOW VARIABLES command to view the values of these parameters.
EXCHANGE (shuffle) parameters
EXCHANGE (shuffle) parameters specify the memory for data shuffle between data flow objects (DFOs). OceanBase Database encapsulates its data transfer service into a module called Data Transfer Layer (DTL).
| Parameter | Description | Value range | Default value | Recommended configuration |
|---|---|---|---|---|
| dtl_buffer_size | Specifies the size of the buffer for each data transmission operation between EXCHANGE operators, namely the Transmit and Receive operators. Data is sent only when the value of this parameter reaches the upper limit. This reduces the cost of transmitting each row. | [0, 1800] | 10. The value is determined based on the actual number of CPUs. |
This parameter controls the data transmission between EXCHANGE operators. We recommend that you do not modify this parameter unless you want to reduce the frequency of data transmission. |
You can use SHOW PARAMETERS to view the value of these parameters, as shown in the following example:
obclient> SHOW PARAMETERS LIKE '%dtl%';
+-------+----------+----------------+----------+-----------------+-----------+-------+---------------+----------+---------+---------+-------------------+---------------+-----------+
| zone | svr_type | svr_ip | svr_port | name | data_type | value | info | section | scope | source | edit_level | default_value | isdefault |
+-------+----------+----------------+----------+-----------------+-----------+-------+---------------+----------+---------+---------+-------------------+---------------+-----------+
| zone1 | observer | 172.xx.xxx.xxx | 2882 | dtl_buffer_size | CAPACITY | 64K | to be removed | OBSERVER | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE | 64K | 1 |
+-------+----------+----------------+----------+-----------------+-----------+-------+---------------+----------+---------+---------+-------------------+---------------+-----------+
1 row in set
Other parameters related to parallel execution
| Parameter | Description | Value range | Default value | Recommended configuration |
|---|---|---|---|---|
| _enable_px_batch_rescan | Specifies whether to enable BATCH RESCAN for better performance when you use the Nested-Loop Join (NLJ) algorithm to generate a distributed PX RESCAN plan. |
True and False | True | By default, BATCH RESCAN is enabled. It provides better performance but consumes more memory resources. |
| _bloom_filter_enabled | Specifies whether to enable bloom filters when you use the HASH JOIN algorithm. |
True or False | True | Bloom filters are automatically enabled when the DOP exceeds 1. If the join condition of the HASH JOIN algorithm has poor filtering performance, bloom filters may bring additional overhead. In this case, you can disable bloom filters. |