The performance of parallel query (PX) in OceanBase Database is determined by parameters (variables or configuration items), such as parallelism and EXCHANGE.
Parallelism variables
The parallelism variables mainly determine the number of workers for each query. The following table describes these variables.
Variable name |
Description |
Value range |
Default value |
Configuration recommendation |
|---|---|---|---|---|
| parallel_servers_target | Controls whether to queue a query before it is executed based on the parallelism required by the query and the total number of workers. If the query requires more workers than the value of this variable, the query is queued. Otherwise, it is executed. | [0, 1800] | 10 (calculated based on the number of CPU cores) | This variable determines whether to continue or queue a query when the number of workers is insufficient for the query in PX scenarios. |
| _force_parallel_query_dop | Specifies the default parallelism for SQL queries in a session. If no PARALLEL hint is specified, the parallelism of the SQL query is controlled by this variable. | [1, +∞] | 1 | You can set this variable to specify the default parallelism for a batch of SQL queries in a session without adding a hint to each SQL query. |
| _force_parallel_dml_dop | Specifies the default parallelism for DML SQL statements in a session. If no PARALLEL hint is specified, the parallelism of the DML SQL statement is controlled by this variable. | [1,+∞] | 1 | You can set this variable to specify the default parallelism for a batch of DML SQL statements in a session without adding a hint to each SQL statement. |
View the values of parallel variables
You can run the SHOW VARIABLES command to view the values of parallel variables in OceanBase Database. However, note that some variables are hidden and should not be modified.
Session variables
The session variables _force_parallel_query_dop and _force_parallel_dml_dop are hidden. You can query them as follows:
-- Query from the DBA_OB_SYS_VARIABLES table.
SELECT * FROM oceanbase.DBA_OB_SYS_VARIABLES WHERE name LIKE '%_force_parallel%';
-- Query from the system variables.
SELECT @@_force_parallel_query_dop;
Although these queries can return the values of hidden variables, we recommend that you do not modify these variables.
SYS tenant
If you are the sys tenant, you can also query these parallel variables by using the following command:
SELECT tenant_id, name, value
FROM oceanbase.CDB_OB_SYS_VARIABLES
WHERE name LIKE '%_force_parallel%';
EXCHANGE (Shuffle) configuration items
The EXCHANGE (Shuffle) configuration items are used to control memory management and data redistribution strategies during data transmission between distributed function operators (DFOs). OceanBase Database encapsulates this data transmission process in the data transfer layer (DTL) module.
Configuration item name |
Description |
Value range |
Default value |
Configuration recommendation |
|---|---|---|---|---|
| dtl_buffer_size | Specifies the size of the buffer for data transmission between EXCHANGE operators (Transmit and Receive). Data is sent only when it reaches the upper limit of this buffer size, reducing the cost of data transmission per row. |
[4k, 2M] | 64K | In PX scenarios, the data transmission between EXCHANGE operators depends on the value of this configuration item. Generally, you do not need to modify this configuration item. If you want to reduce the number of data transmissions, you can try modifying this value. However, we recommend that you do not modify it. |
You can run the SHOW PARAMETERS command to view the values of configuration items. Here is an example:
obclient> SHOW PARAMETERS LIKE '%dtl%';
The expected output is as follows:
+-------+----------+----------------+----------+-----------------+-----------+-------+---------------+----------+---------+---------+-------------------+---------------+-----------+
| zone | svr_type | svr_ip | svr_port | name | data_type | value | info | section | scope | source | edit_level | default_value | isdefault |
+-------+----------+----------------+----------+-----------------+-----------+-------+---------------+----------+---------+---------+-------------------+---------------+-----------+
| zone1 | observer | 172.xx.xxx.xxx | 2882 | dtl_buffer_size | CAPACITY | 64K | to be removed | OBSERVER | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE | 64K | 1 |
+-------+----------+----------------+----------+-----------------+-----------+-------+---------------+----------+---------+---------+-------------------+---------------+-----------+
1 row in set
Other parallelism-related parameters
Parameter name |
Description |
Value range |
Default value |
Configuration recommendation |
|---|---|---|---|---|
| _enable_px_batch_rescan | Specifies whether to use BATCH RESCAN when generating a distributed PX RESCAN plan for nested loop join (NLJ). This can improve performance. | True or False | True | By default, this parameter is set to True, which can improve performance but consumes more memory. |
| _bloom_filter_enabled | Specifies whether to enable the Bloom filter in hash join scenarios. | True or False | True | By default, this parameter is set to True when the parallelism is greater than 1. If the join conditions in hash join scenarios do not provide sufficient filtering, enabling the Bloom filter will incur additional overhead. In such cases, you can consider disabling this feature. |
