OceanBase Database provides a set of parameters to control the initialization and tuning of parallel execution.
When OceanBase Database starts, it calculates the default values of the parallel execution control parameters based on the number of CPU cores of the tenant and the tenant-level parameter px_workers_per_cpu_quota. You can also specify the parameter values manually at startup and adjust them later based on your actual scenario.
By default, parallel execution is enabled. This topic describes the parameters that control parallel execution in two parts:
- Default parameters for parallel execution
- Optimize parameters for parallel execution
Default parameters for parallel execution
You can configure parallel execution by setting the parameters that control the number of parallel threads and the queuing behavior of parallel execution. The following table summarizes these parameters:
| Parameter | Default value | Level | Description |
|---|---|---|---|
| px_workers_per_cpu_quota | 10 | Tenant-level configuration item | The maximum number of parallel threads that can be allocated on each CPU. The value range is [1,20]. |
| parallel_servers_target | MIN CPU * px_workers_per_cpu_quota | Tenant-level variable | The maximum number of parallel threads that a tenant can apply for on each node. |
| parallel_degree_policy | MANUAL | Tenant-level or session-level variable | The switch for enabling automatic DOP. When set to AUTO, automatic DOP is enabled and the optimizer automatically calculates the DOP of a query based on statistics. When set to MANUAL, the DOP is manually controlled based on hints, the TABLE PARALLEL attribute, and session-level DOP. |
To lower the barrier to entry, OceanBase Database keeps the set of parallel execution parameters to a minimum. You can get started with the default values and modify them for application tuning in specific scenarios.
px_workers_per_cpu_quota
This parameter specifies the number of parallel execution threads that can be allocated on each CPU. If the minimum number of CPU cores allocated to a tenant is N, when the parallel workload is evenly distributed, the number of threads that can be allocated on each node is N × px_workers_per_cpu_quota. If the data accessed by a parallel workload is unevenly distributed, the actual number of threads that can be allocated on some nodes may exceed N × px_workers_per_cpu_quota for a short period of time. After the workload is completed, the extra threads will be automatically recycled.
The value of px_workers_per_cpu_quota affects the default value of parallel_servers_target only when the tenant is created. After the tenant is created, modifying the value of px_workers_per_cpu_quota does not change the value of parallel_servers_target.
Generally, you do not need to modify the default value of px_workers_per_cpu_quota. If the resource isolation feature is disabled and a parallel workload occupies all CPU resources, you can try reducing the value of px_workers_per_cpu_quota to resolve this issue.
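For example, a minimal sketch of checking the current value and lowering it, assuming you run the statements as the tenant administrator and that 8 is only an illustrative value:
SHOW PARAMETERS LIKE 'px_workers_per_cpu_quota';
ALTER SYSTEM SET px_workers_per_cpu_quota = 8;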
parallel_servers_target
This parameter specifies the maximum number of parallel execution threads that a tenant can apply for on each node. When these threads are exhausted, subsequent parallel execution requests must queue.
If the CPU utilization of parallel execution is very low, the cause may be that the DOP actually granted is small, so fewer threads are allocated than expected. In this case, you can increase the value of parallel_servers_target.
In versions earlier than OceanBase Database V3.2.3, the default value of parallel_servers_target is a small number. In these versions, you can set parallel_servers_target to a larger value, such as MIN CPU × 10.
In OceanBase Database V3.2.3 and later, the default value of parallel_servers_target is MIN CPU × 10, so you generally will not encounter this issue.
Note
MIN CPU indicates the min_cpu value specified when the tenant is created.
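For example, assuming a tenant created with min_cpu = 4, a minimal sketch of setting parallel_servers_target to MIN CPU × 10 for the whole tenant might look like this:
SET GLOBAL parallel_servers_target = 40;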
After you set the value of parallel_servers_target, reconnect and run the following command to query the latest value:
SHOW VARIABLES LIKE 'parallel_servers_target';
From an O&M perspective, you may want to set parallel_servers_target to a large value once to avoid frequent adjustments. In theory, you can set parallel_servers_target to an arbitrarily large value, in which case all queries start executing immediately without queuing and compete for CPU time slices, disk I/O, and network I/O.
From a throughput perspective, this does not cause significant issues. However, from the perspective of the latency of a single SQL query, the competition for resources significantly increases latency. To improve CPU and I/O utilization, we recommend that you use MIN CPU × 10 as the baseline value of parallel_servers_target, or MIN CPU × 20 in I/O-intensive scenarios.
parallel_degree_policy
This parameter is the switch for auto DOP. Set it to AUTO to enable auto DOP, or to MANUAL to disable it. When auto DOP is enabled, the optimizer automatically calculates the DOP of a query based on statistics. When auto DOP is disabled, the DOP is controlled by hints, the TABLE PARALLEL attribute, and the session-level DOP.
In OceanBase Database V4.2 and later, if you are not familiar with DOP settings, you can set parallel_degree_policy to AUTO to enable auto DOP.
In versions earlier than OceanBase Database V4.2, the auto DOP feature is not supported, and you need to set the DOP manually.
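In versions that support auto DOP, a minimal sketch of enabling it might look like this, where the first statement applies to the current session and the second to the whole tenant:
SET parallel_degree_policy = 'AUTO';
SET GLOBAL parallel_degree_policy = 'AUTO';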
Optimize parameters for parallel execution
ob_sql_work_area_percentage
This tenant-level variable limits the maximum memory available to the SQL module, expressed as a percentage of the total tenant memory. The default value is 5, which means 5% of the tenant memory. When the memory used by SQL exceeds this limit, the system writes data to disk.
You can search for WORK_AREA in the observer.log file to check the actual memory usage of SQL work areas. For example:
[MEMORY] tenant_id=1001 ctx_id=WORK_AREA hold=2,097,152 used=0 limit=157,286,400
In read-intensive scenarios, if data is written to disk due to memory limitations, you can increase the value of ob_sql_work_area_percentage.
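For example, a sketch of checking the current value and raising the limit to 10% of the tenant memory, where 10 is only an illustrative value:
SHOW VARIABLES LIKE 'ob_sql_work_area_percentage';
SET GLOBAL ob_sql_work_area_percentage = 10;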
workarea_size_policy
OceanBase Database implements global adaptive memory management. When workarea_size_policy is set to AUTO, the execution framework uses the optimal strategy to allocate memory to operators (such as HashJoin, GroupBy, and Sort).
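A minimal sketch of enabling the adaptive strategy for the whole tenant, assuming it is not already set to AUTO:
SET GLOBAL workarea_size_policy = 'AUTO';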
Parameter tuning for parallel DML
In OceanBase Database V4.1 and later, if no transaction is required, we recommend that you use an INSERT INTO SELECT statement with the direct load feature to insert data into a new table in one pass. This shortens the data import time and avoids memory shortage caused by writing data too fast.
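For example, a sketch of such a direct load, where target_table and source_table are hypothetical names and the DOP of 8 is only an illustrative choice:
INSERT /*+ APPEND ENABLE_PARALLEL_DML PARALLEL(8) */ INTO target_table SELECT * FROM source_table;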
In versions earlier than OceanBase Database V4.1, if a single parallel DML statement takes more than 30 minutes to execute, you need to set the undo_retention parameter.
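In that case, a sketch of extending the undo retention time to one hour, assuming 3600 seconds is long enough to cover the statement's execution:
ALTER SYSTEM SET undo_retention = 3600;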