Parallel execution is a deep and complex subject. You need to have a proper understanding of the technical mechanism of parallel execution to make full use of its capabilities. In OceanBase Database V3.1 and later, a quick start guide is provided for beginners to quickly understand the essentials of parallel execution. The parameter values provided in this topic are not necessarily the best choices but can effectively avoid most unfavorable conditions, helping beginners easily use the parallel execution feature.
Initialize the environment
Complete the following settings in an online analytical processing (OLAP) tenant:
/* MySQL */
SET GLOBAL parallel_servers_target = MIN_CPU * 20;
/* Oracle */
ALTER SYSTEM SET parallel_servers_target = MIN_CPU * 20;
Collect statistics
In OceanBase Database V3.x, statistics collection is bound with major compactions. Therefore, after you import data, you must initiate a major compaction before you collect statistics.
In OceanBase Database V4.x, after you import data, you can directly call the database management system (DBMS) package for statistics collection to collect statistics.
For more information about statistics, see Overview of statistics.
Set a hint
Make sure that the maximum degree of parallelism (DOP) of an SQL statement does not exceed 1.5 times the number of physical CPU cores. Generally, if multiple SQL statements will not be executed in parallel, you can set the DOP of a single SQL statement to the number of CPU cores. For example, if the system has 32 physical CPU cores, you can set the hint as /*+ PARALLEL(32) */.
Perform performance tuning
Run the
top -Hcommand to view the CPU utilization of the current tenant.If the performance of a single SQL statement is not as expected, query the
sql_plan_monitorview for the performance report.-- `open_dt` indicates the interval from when the operator is opened to when it is closed. -- `row_dt` indicates the interval from when the operator generates the first row to when it generates OB_ITER_END. -- If the value of `row_dt` is NULL, the operator has not generated the first row or OB_ITER_END. -- If the SQL statement is considered a slow statement, add the hint /*+ monitor */ to the statement. -- Execute the slow SQL statement added with the hint. Replace `Yxxxxxxxxx` in the following statement with the obtained trace ID and then execute the following statement: -- MySQL tenant: -- Summary: SELECT op_id, op, rows, rescan, threads, (close_time - open_time) open_dt, (last_row_eof_time-first_row_time) row_dt, open_time, close_time, first_row_time, last_row_eof_time FROM ( select plan_line_id op_id, concat(lpad('', plan_depth, ' '), plan_operation) op, sum(output_rows) rows, sum(STARTS) rescan, min(first_refresh_time) open_time, max(last_refresh_time) close_time, min(first_change_time) first_row_time, max(last_change_time) last_row_eof_time, count(1) threads from oceanbase.gv$sql_plan_monitor where trace_id = 'Yxxxxxxxxx' group by plan_line_id, plan_operation order by plan_line_id ) a; -- Details SELECT op_id, thread, op, rows, rescan, (close_time - open_time) open_dt, (last_row_eof_time-first_row_time) row_dt, open_time, close_time, first_row_time, last_row_eof_time FROM ( select plan_line_id op_id, PROCESS_NAME thread, concat(lpad('', plan_depth, ' '), plan_operation) op, output_rows rows, STARTS rescan, first_refresh_time open_time, last_refresh_time close_time, first_change_time first_row_time, last_change_time last_row_eof_time from oceanbase.gv$sql_plan_monitor where trace_id = 'Yxxxxxxxxx' order by plan_line_id, PROCESS_NAME ) a; -- Oracle tenant: -- Note: `open_dt` indicates the interval from when the operator is opened to when the operator is closed. -- Note: `row_dt` indicates the interval from when the operator generates the first row to when the operator generates OB_ITER_END. -- Note: If the value of `row_dt` is NULL, the operator has not generated the first row or OB_ITER_END. -- Summary SELECT op_id, op, output_rows, rescan,threads ,(close_time - open_time) open_dt, (last_row_eof_time-first_row_time) row_dt, open_time, close_time, first_row_time, last_row_eof_time FROM ( select plan_line_id op_id, concat(lpad(' ', max(plan_depth), ' '), plan_operation) op, sum(output_rows) output_rows, sum(STARTS) rescan, min(first_refresh_time) open_time, max(last_refresh_time) close_time, min(first_change_time) first_row_time, max(last_change_time) last_row_eof_time, count(1) threads from sys.gv$sql_plan_monitor where trace_id = 'Yxxxxxxxxx' group by plan_line_id, plan_operation,plan_depth order by plan_line_id ) a; -- Details SELECT op_id, thread, op, output_rows, rescan, (close_time - open_time) open_dt, (last_row_eof_time-first_row_time) row_dt, open_time, close_time, first_row_time, last_row_eof_time FROM ( select plan_line_id op_id, PROCESS_NAME thread, concat(lpad(' ', plan_depth, ' '), plan_operation) op, output_rows, STARTS rescan, first_refresh_time open_time, last_refresh_time close_time, first_change_time first_row_time, last_change_time last_row_eof_time from sys.gv$sql_plan_monitor where trace_id = 'Yxxxxxxxxx' order by plan_line_id, process_name ) a;
FAQ
What do I do if the query performance is not as expected while the CPU resources are not fully used?
Execute the
show variables like 'parallel_servers_targetstatement and check that the value ofparallel_servers_targetis not less thanMIN_CPU× 20.What do I do if the PDML performance is not as expected?
Execute the
explain extendedstatement to verify whether PDML is used. If PDML is not used, theNotefield at the bottom of the plan describes the reason. Generally, if the target table contains triggers, foreign keys, or local unique indexes, PDML will not be used.Keywords such as
DISTRIBUTED INSERT,DISTRIBUTED UPDATE, andDISTRIBUTED DELETEindicate that PDML is not used.What do I do when the error
-4138 OB_SNAPSHOT_DISCARDEDis returned upon a PDML timeout?Set the
undo_retentionparameter to a value that is not less than the maximum execution time of a PDML statement. The default value ofundo_retentionis 30 minutes. If the execution time of a PDML statement exceeds 30 minutes, this error may be returned and the statement will be aborted and retried until it times out.This issue never occurs in OceanBase Database V4.1 and later. Therefore, you do not need to set the
undo_retentionparameter in OceanBase Database V4.1 and later.How do I enable parallel execution for business SQL statements without making any modifications to the business?
OceanBase Database Proxy (ODP) provides a web UI for you to modify connection configurations to enable parallel execution. For example, you can set the DOP of all SQL statements in a read/write splitting connection to 2.