The performance of parallel query in OceanBase Database is determined by parameters (variables or configuration items), which mainly include parallelism and EXCHANGE.
Parallelism variables
The number of workers for each query in parallelism is mainly determined by parallelism variables. The following table describes these variables.
| Variable | Description | Value range | Default value | Configuration suggestion |
|---|---|---|---|---|
| parallel_servers_target | Controls whether to queue a query based on the parallelism required by the query and the total number of workers. If the required parallelism exceeds the value of this variable, the query is queued. Otherwise, the query continues to execute. | [0, 1800] | 10 (The value is calculated based on the number of CPU cores.) | This variable mainly controls whether to queue a query when the number of workers is insufficient to process the query in parallel query scenarios. |
| _force_parallel_query_dop | Specifies the default parallelism for SQL queries in the current session. If no PARALLEL hint is specified, the parallelism of the SQL query is controlled by this variable. | [1, +∞] | 1 | If you want to run a batch of parallel SQL queries in the same session without adding a hint to each SQL statement, we recommend that you use this variable. |
| _force_parallel_dml_dop | Specifies the default parallelism for DML SQL statements in the current session. If no PARALLEL hint is specified, the parallelism of the DML SQL statement is controlled by this variable. | [1,+∞] | 1 | If you want to run a batch of parallel DML SQL statements in the same session without adding a hint to each SQL statement, we recommend that you use this variable. |
View the values of parallel variables
You can run the SHOW VARIABLES statement to view the values of parallel variables in OceanBase Database. However, note that some variables are hidden and should not be modified.
Session variables
The session variables _force_parallel_query_dop and _force_parallel_dml_dop are hidden. You can query them by using the following statements:
-- Query from the DBA_OB_SYS_VARIABLES table.
SELECT * FROM oceanbase.DBA_OB_SYS_VARIABLES WHERE name LIKE '%_force_parallel%';
-- Query from the system variables.
SELECT @@_force_parallel_query_dop;
Although these statements can return the values of hidden variables, we recommend that you do not modify these hidden variables.
SYS tenant
If you are the sys tenant, you can query the values of these parallel variables by using the following statement:
SELECT tenant_id, name, value
FROM oceanbase.CDB_OB_SYS_VARIABLES
WHERE name LIKE '%_force_parallel%';
EXCHANGE (Shuffle) configuration items
The EXCHANGE (Shuffle) configuration item is used to control memory management and data redistribution strategies during data transmission between DFOs (Distributed Function Operators). OceanBase Database encapsulates this data transmission process in the DTL (Data Transfer Layer) module.
| Configuration Item | Description | Value range | Default value | Configuration suggestion |
|---|---|---|---|---|
| dtl_buffer_size | Controls the size of the buffer used to send data between EXCHANGE operators (Transmit and Receive). Data is sent only when it reaches the buffer size limit, reducing the cost of data transmission per row. |
[4k, 2M] | 64K | In PX scenarios, the data transmission between EXCHANGE operators depends on the value of this configuration item. Generally, this configuration item does not need to be modified. If you want to reduce the number of data transmissions, you can try modifying this value. However, we do not recommend modifying it. |
You can run the SHOW PARAMETERS statement to view the values of configuration items. Here is an example:
obclient> SHOW PARAMETERS LIKE '%dtl%';
Expected output:
+-------+----------+----------------+----------+-----------------+-----------+-------+---------------+----------+---------+---------+-------------------+---------------+-----------+
| zone | svr_type | svr_ip | svr_port | name | data_type | value | info | section | scope | source | edit_level | default_value | isdefault |
+-------+----------+----------------+----------+-----------------+-----------+-------+---------------+----------+---------+---------+-------------------+---------------+-----------+
| zone1 | observer | 172.xx.xxx.xxx | 2882 | dtl_buffer_size | CAPACITY | 64K | to be removed | OBSERVER | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE | 64K | 1 |
+-------+----------+----------------+----------+-----------------+-----------+-------+---------------+----------+---------+---------+-------------------+---------------+-----------+
1 row in set
Other parallelism-related parameters
| Parameter | Description | Value range | Default value | Configuration suggestion |
|---|---|---|---|---|
| _enable_px_batch_rescan | Specifies whether to use BATCH RESCAN when generating a distributed PX RESCAN plan in NLJ. Enabling this option can improve performance. | True or False | True | By default, this option is enabled to achieve better performance, but it consumes more memory. |
| _bloom_filter_enabled | Specifies whether to enable BLOOM FILTER in HASH JOIN scenarios. | True or False | True | By default, this option is enabled when the parallelism is greater than 1. If the filtering conditions of the HASH JOIN are not effective, enabling BLOOM FILTER will add additional overhead. In this scenario, we recommend that you disable this option. |
