OceanBase Database provides a group of parameters for you to control the initialization and tuning of parallel execution.
When OceanBase Database starts, the default values for parallel execution parameters can be calculated based on the number of CPU cores of the tenant and the tenant-level parameter px_workers_per_cpu_quota. You can also choose not to use the default values but to manually specify parameter values upon startup of OceanBase Database or manually adjust the parameter values later as needed.
By default, parallel execution is enabled. This topic covers the following sections:
- Default values of parallel execution parameters
- Tuning of parallel execution parameters
Default values of parallel execution parameters
You can set parallel execution parameters to control the number of parallel execution (PX) threads and queuing in parallel execution. The following table describes the parameters.
| Parameter | Default value | Level | Description |
|---|---|---|---|
| px_workers_per_cpu_quota | 10 | Tenant-level parameter | The number of PX threads that can be allocated on each CPU core. Value range: [1,20]. |
| parallel_servers_target | MIN_CPU × px_workers_per_cpu_quota | Tenant-level variable | The number of PX threads that can be requested from each node of the tenant. |
| parallel_degree_policy | MANUAL | Tenant- and session-level variable | The auto degree of parallelism (DOP) strategy. You can set the value to AUTO to enable auto DOP. After auto DOP is enabled, the optimizer automatically calculates the DOP for queries based on statistics. If you set the value to MANUAL, you can specify a DOP by using hints, a table-level PARALLEL attribute, or a session-level PARALLEL attribute. |
To lower the barrier to using parallel execution, OceanBase Database keeps the number of parallel execution parameters to a minimum. The default values are sufficient for using parallel execution directly. In special scenarios, you can change the parameter values for optimization.
px_workers_per_cpu_quota
This parameter specifies the number of PX threads that can be allocated on each CPU core. Assume that the value of MIN_CPU of the tenant is N. If the data to be processed in parallel is evenly distributed, the number of threads that can be allocated on each node is calculated by using the following formula: N × Value of px_workers_per_cpu_quota. If the data is unevenly distributed, the actual number of threads allocated on some nodes may exceed the value calculated by using the foregoing formula for a short time. After the parallel execution is completed, the excess threads are automatically reclaimed.
px_workers_per_cpu_quota affects the default value of parallel_servers_target only during tenant creation. If you change the value of px_workers_per_cpu_quota after you create a tenant, the value of parallel_servers_target is not affected.
Generally, you do not need to change the default value of px_workers_per_cpu_quota. If all CPU resources are occupied by parallel execution when resource isolation is disabled, you can try to decrease the value of px_workers_per_cpu_quota to lower the CPU utilization.
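As a minimal sketch, the following statements show how the parameter might be inspected and lowered from within the tenant. The value 8 is purely illustrative, and the statements assume a user with the privilege to modify tenant-level parameters.

```sql
-- View the current value of the tenant-level parameter.
SHOW PARAMETERS LIKE 'px_workers_per_cpu_quota';

-- Lower the number of PX threads that can be allocated on each CPU core.
-- 8 is an illustrative value within the documented range [1,20].
ALTER SYSTEM SET px_workers_per_cpu_quota = 8;
```

Because px_workers_per_cpu_quota affects parallel_servers_target only during tenant creation, this change leaves the current value of parallel_servers_target untouched.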
parallel_servers_target
This parameter specifies the number of PX threads that can be requested from each node of the tenant. When thread resources are used up, subsequent PX requests need to wait in a queue. For the concept of queuing, see Concurrency control and queuing.
In parallel execution, CPU utilization can be very low for several reasons. One of them is an excessively small value of parallel_servers_target: the DOP of the SQL statement is downgraded, and fewer threads are allocated than expected.
In OceanBase Database versions earlier than V3.2.3, the default value of parallel_servers_target is very small. You can increase the value of parallel_servers_target to resolve this issue. We recommend that you set parallel_servers_target to MIN_CPU × 10.
In OceanBase Database V3.2.3 and later, the default value of parallel_servers_target is the value of MIN_CPU × 10. Therefore, this issue does not occur.
Note
MIN_CPU specifies the minimum number of CPU cores for the tenant and is specified during tenant creation.
After you set an appropriate value for parallel_servers_target, reconnect to your database and execute the following statement to view the latest value:
SHOW VARIABLES LIKE 'parallel_servers_target';
For ease of O&M, you can set parallel_servers_target to its maximum value to avoid frequent adjustment. In theory, parallel_servers_target can be arbitrarily large. However, a very large value lowers efficiency: no query waits in the queue, so all queries start executing as soon as they are issued and contend for CPU time slices, disk I/O, and network I/O.
This contention has little impact on throughput, but it significantly increases the latency of individual SQL statements. To balance CPU and I/O utilization, you can set parallel_servers_target to MIN_CPU × 10. In a few I/O-intensive scenarios, CPU resources may not be fully used. In this case, you can set parallel_servers_target to MIN_CPU × 20.
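The recommended settings can be sketched as follows, assuming a MySQL-mode tenant whose MIN_CPU is 8. The value of MIN_CPU is an assumption made for illustration only. Substitute the value specified for your tenant.

```sql
-- Assumption: MIN_CPU of the tenant is 8, so MIN_CPU × 10 = 80.
SET GLOBAL parallel_servers_target = 80;

-- Alternative for an I/O-intensive workload that leaves CPU resources idle:
-- MIN_CPU × 20 = 160.
SET GLOBAL parallel_servers_target = 160;
```

Run only one of the two SET statements, depending on the workload.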
parallel_degree_policy
This parameter specifies the DOP strategy. Valid values are AUTO and MANUAL. You can set the value to AUTO to enable auto DOP. In this case, the optimizer automatically calculates the DOP for queries based on statistics. If you set the value to MANUAL, you can specify a DOP by using hints, a table-level PARALLEL attribute, or a session-level PARALLEL attribute.
In OceanBase Database V4.2 and later, if you are not familiar with the DOP setting rules, you can set parallel_degree_policy to AUTO to allow the optimizer to automatically select a DOP. For more information about the rules for automatically calculating a DOP, see Set a DOP for parallel execution.
OceanBase Database of a version earlier than V4.2 does not support the parallel_degree_policy parameter, and therefore does not support the auto DOP feature. In this case, you must manually specify a DOP.
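The two strategies might be used as sketched below. The table name t1 and the DOP of 8 are hypothetical and serve only as placeholders.

```sql
-- V4.2 and later: enable auto DOP for the whole tenant.
SET GLOBAL parallel_degree_policy = AUTO;

-- Or enable auto DOP for the current session only.
SET SESSION parallel_degree_policy = AUTO;

-- With MANUAL, specify the DOP explicitly, for example by using a hint.
-- t1 is a hypothetical table.
SELECT /*+ PARALLEL(8) */ COUNT(*) FROM t1;

-- Or set a table-level PARALLEL attribute on the hypothetical table.
ALTER TABLE t1 PARALLEL 8;
```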
Tuning of parallel execution parameters
ob_sql_work_area_percentage
This tenant-level variable specifies the maximum memory available for the SQL workarea, expressed as a percentage of the total memory of the tenant. The default value is 5, which indicates 5%. When the memory occupied by the SQL workarea exceeds this limit, data in the memory is flushed to the disk.
To view the actual memory usage of the SQL workarea, you can search for WORK_AREA in the observer.log file. Here is an example.
[MEMORY] tenant_id=1001 ctx_id=WORK_AREA hold=2,097,152 used=0 limit=157,286,400
In a scenario with more reads than writes, if data in the memory is flushed to the disk due to insufficient memory for the SQL workarea, you can increase the value of ob_sql_work_area_percentage.
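A minimal sketch of such an adjustment, assuming a MySQL-mode tenant, is shown below. The value 10 is illustrative and is not a recommendation from this topic. Choose a value based on the memory usage that you observe.

```sql
-- View the current percentage of tenant memory available to the SQL workarea.
SHOW VARIABLES LIKE 'ob_sql_work_area_percentage';

-- Raise the limit from the default 5% to an illustrative 10%.
SET GLOBAL ob_sql_work_area_percentage = 10;
```

As with other global variables, you may need to reconnect to the database before the new value is visible in your session.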
workarea_size_policy
OceanBase Database implements global adaptive memory management. When workarea_size_policy is set to AUTO, the execution framework allocates memory to operators, such as HashJoin, GroupBy, and Sort, based on the optimal strategy, and enables the adaptive data flush strategy.
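The current policy might be checked and set as sketched below. The sketch assumes that workarea_size_policy is exposed as a tenant-level parameter in your version, which this topic does not state. If it is not, the statements do not apply as written.

```sql
-- Assumption: workarea_size_policy is a tenant-level parameter.
SHOW PARAMETERS LIKE 'workarea_size_policy';

-- Let the execution framework manage operator memory adaptively.
ALTER SYSTEM SET workarea_size_policy = 'AUTO';
```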
Tuning of PDML parameters
The transaction mechanism is no longer a must in OceanBase Database since V4.1. Therefore, when you import data into a table, we recommend that you use the INSERT INTO SELECT statement in combination with the direct load feature to insert the data into the table at a time. This can shorten the import time and avoid memory shortage caused by a high write speed.
In OceanBase Database of a version earlier than V4.1, if the execution time of a single PDML statement exceeds 30 minutes, you must set the undo_retention parameter. For more information, see the “Transaction processing” section in the Classification and optimization topic.
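An import that combines INSERT INTO SELECT with direct load might look like the following sketch. The table names t_target and t_source and the DOP of 16 are hypothetical. The sketch also assumes that your version enables direct load through the APPEND hint, so check the hint syntax for your version before use.

```sql
-- Hypothetical tables: t_target (destination) and t_source (origin).
-- APPEND requests direct load (an assumption about your version),
-- ENABLE_PARALLEL_DML enables PDML, and PARALLEL(16) sets an illustrative DOP.
INSERT /*+ APPEND ENABLE_PARALLEL_DML PARALLEL(16) */ INTO t_target
SELECT * FROM t_source;
```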