What is auto DOP?
The OceanBase Database optimizer supports two strategies for determining the degree of parallelism (DOP): AUTO and MANUAL. Auto DOP is enabled when the AUTO strategy is selected.
Why auto DOP?
You can use the MANUAL strategy to manually specify the DOP. One way to do this is by using system variables to enable parallel execution for all queries in the current session and specify the DOP. However, this may cause additional parallel execution overheads for queries that do not require parallel execution or a high DOP in the session, resulting in performance degradation. Another way is to use hints to specify the DOP for specific queries, but you need to look into each business query separately in this case, which is not feasible for scenarios with a large number of business queries.
When you select the AUTO strategy, the optimizer automatically enables parallel execution and determines the DOP based on the actual situation of a query.
With its auto DOP feature enabled, the OceanBase Database optimizer evaluates the execution time required for a query when generating an execution plan for the query. It then automatically determines whether to enable parallel execution and specifies an appropriate DOP for the query, eliminating the necessity of manual DOP configuration and the consequent performance degradation.
How to enable auto DOP?
The OceanBase Database optimizer provides the parallel_degree_policy system variable for you to enable and disable auto DOP at the global and session levels. You can also use hints with higher priorities to enable and disable auto DOP at the query level. The syntaxes are as follows:
/* Enable auto DOP at the global level. */
SET global parallel_degree_policy = AUTO;
/* Enable auto DOP at the session level. */
SET session parallel_degree_policy = AUTO;
SET parallel_degree_policy = AUTO;
/* Disable auto DOP at the global level. */
SET global parallel_degree_policy = MANUAL;
/* Disable auto DOP at the session level. */
SET session parallel_degree_policy = MANUAL;
SET parallel_degree_policy = MANUAL;
/* Use a hint to enable auto DOP at the query level. */
SELECT /*+PARALLEL(AUTO)*/ * FROM ...;
/* Use a hint to disable auto DOP at the query level. */
SELECT /*+PARALLEL(MANUAL)*/ * FROM ...;
SELECT /*+PARALLEL(8)*/ * FROM ...;
Relationship between auto DOP and other parallel execution enabling methods
For more information, see Parallel execution enabling methods and their priority.
After you enable the auto DOP feature, the optimizer can enable parallel execution for multiple operations such as SELECT, DELETE, UPDATE, and INSERT. For data manipulation operations, such as DELETE, UPDATE, and INSERT, once parallel execution is enabled, the parallel DML (PDML) feature is also automatically enabled without needing to specify the _enable_parallel_dml variable.
How to determine whether to enable auto DOP?
Before enabling the auto DOP feature, you need to evaluate the type of the system business load and properly configure related variables. Otherwise, enabling this feature may bring serious risks to your system.
By using auto DOP, the optimizer aims to reduce the response time (RT) of a query based on its estimated execution costs. If your business load contains insignificant slow queries and auto DOP is enabled, these queries will be executed in parallel, achieving high concurrency. However, the parallel execution of these queries will occupy system resources and interfere with other queries that require concurrent processing and RT optimization. Excessive usage of system resources by these queries will sharply increase CPU load, severely affecting normal business queries. To avoid this situation, you need to carefully evaluate the business load of your system and isolate resources between queries before enabling auto DOP. For more information, see Overview of resource isolation.
In addition, the auto DOP strategy only optimizes the RT for the current query without considering scenarios where a large number of queries need to be concurrently processed in your system. Therefore, you need to adjust related variables based on the actual optimization objective when using the auto DOP feature. For more information, see Scenario 1: ultimate query performance optimization and Scenario 2: overall system performance optimization in the Application scenarios of auto DOP section.
How to intervene in DOP selection in the auto DOP strategy?
In the auto DOP strategy, each DOP obtained is affected by two variables: parallel_degree_limit and parallel_min_scan_time_threshold. You need to adjust the two variables based on deployment resources and business load.
Set the parallel_min_scan_time_threshold variable
The variable parallel_min_scan_time_threshold is used to calculate a DOP in the auto DOP strategy. It specifies the reference execution time for a base table scan when parallel execution is enabled. When the estimated execution time for a base table scan exceeds the value of this variable, parallel execution is enabled for the scan, and an appropriate DOP is selected based on calculations using the value of this variable. The default value is 1000, and the minimum value is 10, in milliseconds.
By reducing the value of parallel_min_scan_time_threshold, you can lower the threshold for triggering parallel scan. In this way, the optimizer can enable parallel execution for base table scans with less estimated execution time and efficiently scan tables with a fixed data volume in parallel.
After auto DOP is enabled, a large number of queries may be executed at a high DOP, leading to excessive usage of system resources. Therefore, exercise caution when setting parallel_min_scan_time_threshold. You need to adjust the value of this variable based on the actual system load to ensure query efficiency and system stability. For example, set parallel_min_scan_time_threshold to 20 or 10 to increase the DOP of single-table scans to 4 or 6 for better query performance.
Here are some configuration examples:
/* Set the maximum DOP for the auto DOP strategy at the global level. */
SET GLOBAL parallel_degree_limit = 64;
/* Set the maximum DOP for the auto DOP strategy at the session level. */
SET SESSION parallel_degree_limit = 64;
SET parallel_degree_limit = 64;
/* Set the threshold for enabling parallel execution for the auto DOP strategy at the global level. */
SET GLOBAL parallel_min_scan_time_threshold = 100;
/* Set the threshold for enabling parallel execution for the auto DOP strategy at the session level. */
SET SESSION parallel_min_scan_time_threshold = 100;
SET parallel_min_scan_time_threshold = 100;
Here are some application examples:
explain select /*+parallel(auto)*/ * from t1;
-- Set the variable to 20. In this case, the calculated DOP is 4, and the estimated cost of a base table scan with parallel execution enabled is 12 ms.
set parallel_min_scan_time_threshold = 20;
=========================================================
|ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)|
---------------------------------------------------------
|0 |PX COORDINATOR | |512000 |619089 |
|1 |└─EXCHANGE OUT DISTR |:EX10000|512000 |230501 |
|2 | └─PX BLOCK ITERATOR| |512000 |12189 |
|3 | └─TABLE FULL SCAN|t1 |512000 |12189 |
=========================================================
Outputs & filters:
-------------------------------------
0 - output([INTERNAL_FUNCTION(t1.c1, t1.c2, t1.c3, t1.c4)]), filter(nil), rowset=256
1 - output([INTERNAL_FUNCTION(t1.c1, t1.c2, t1.c3, t1.c4)]), filter(nil), rowset=256
dop=4
-- Set the variable to 10. In this case, the calculated DOP is 6, and the estimated cost of a base table scan with parallel execution enabled is 8 ms.
set parallel_min_scan_time_threshold = 10;
=========================================================
|ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)|
---------------------------------------------------------
|0 |PX COORDINATOR | |512000 |542256 |
|1 |└─EXCHANGE OUT DISTR |:EX10000|512000 |153667 |
|2 | └─PX BLOCK ITERATOR| |512000 |8126 |
|3 | └─TABLE FULL SCAN|t1 |512000 |8126 |
=========================================================
Outputs & filters:
-------------------------------------
0 - output([INTERNAL_FUNCTION(t1.c1, t1.c2, t1.c3, t1.c4)]), filter(nil), rowset=256
1 - output([INTERNAL_FUNCTION(t1.c1, t1.c2, t1.c3, t1.c4)]), filter(nil), rowset=256
dop=6
Set the parallel_degree_limit variable
parallel_degree_limit is a system variable that specifies the maximum DOP allowed when auto DOP is enabled. The default value is 0, which specifies not to limit the automatically obtained DOP. When this value is used, the optimizer limits the maximum DOP based on the number of CPU cores of the tenant and the value of the parallel_servers_target variable.
After you enable auto DOP for your system, the system load can be high if a large number of queries require high DOPs. In this case, you can set the parallel_degree_limit variable to a value less than the number of CPU cores to limit the maximum DOP allowed.
For example, when the parallel_min_scan_time_threshold variable is set to 10 and the parallel_degree_limit variable is set to 4, the maximum DOP allowed for a query is still 4 even though the DOP calculated based on the value 10 of parallel_min_scan_time_threshold is 6.
Application scenarios of auto DOP
The auto DOP feature allows the optimizer to proactively enable parallel execution for query operations, thus increasing the execution efficiency. However, the actual DOP is constrained by specific variables, which must be fine-tuned in different scenarios for performance optimization.
Scenario 1: ultimate query performance optimization
When the database load is low, meaning there is no significant query pressure, and you want to leverage all available parallel execution resources in a tenant to maximize the query performance with auto DOP enabled, we recommend that you set parallel_min_time_threshold to its minimum value and parallel_degree_limit to 0. The settings are as follows:
ALTER SYSTEM SET parallel_degree_policy = AUTO;
ALTER SYSTEM SET parallel_degree_limit = 0; -- No limitation on the maximum DOP
ALTER SYSTEM SET parallel_min_scan_time_threshold = 10; -- 10 milliseconds
With the preceding settings, the optimizer can enable parallel execution under the lowest execution time threshold and at the highest DOP, thus accelerating queries.
Scenario 2: overall system performance optimization
When the database query load is high, meaning a large number of queries are executed at high DOPs, excessive consumption of parallel execution resources can occur, leading to issues such as query queuing. To avoid this situation, you can adjust related variables as follows:
- Decrease the value of the
parallel_degree_limitvariable to a value less than the number of CPU cores to limit the maximum DOP allowed. - Increase the value of the
parallel_min_scan_time_thresholdvariable to make it harder to trigger parallel execution and reduce the utilization of parallel execution resources.
For example, if you want to limit the maximum DOP to 32 and enable parallel execution only for queries whose estimated execution time exceeds 20 milliseconds, set related variables as follows:
ALTER SYSTEM SET parallel_degree_policy = AUTO;
ALTER SYSTEM SET parallel_degree_limit = 32;
ALTER SYSTEM SET parallel_min_scan_time_threshold = 20; -- 20 milliseconds
The preceding settings can balance the query performance and resource allocation for the system and optimize the database RT and stability.