This topic introduces basic parallel execution tuning methods in OceanBase, covering statistics collection, partitioning strategy optimization, parallelism configuration, index optimization, and the use of replica tables.
Manually collect statistics
If the optimizer statistics are outdated, it may lead to a decrease in the quality of the execution plan. Note the syntax differences across versions:
- Optimizer statistics in V3.2.x in the documentation of OceanBase Database V3.2.x
- Overview of optimizer statistics in the current version
V3.2.x
This section describes the syntax for manually collecting statistics on a primary table in OceanBase Database V3.2. Statistics collection is not supported for index tables in this version.
Syntax in MySQL-compatible mode
-- Enable support for the ANALYZE TABLE statement.
ALTER SYSTEM SET enable_sql_extension = true;
-- Collect the statistics on the T1 table of the TEST user, and enable the AUTO strategy for determining the number of buckets for all columns.
ANALYZE TABLE T1 COMPUTE STATISTICS FOR ALL COLUMNS SIZE AUTO;
Syntax in Oracle-compatible mode
-- Collect the global statistics on the T1 table of the TEST user, and enable the AUTO strategy for determining the number of buckets for all columns.
CALL dbms_stats.gather_table_stats('TEST', 'T1', granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE AUTO');
-- Alternatively, use the analyze syntax.
-- Collect the statistics on the T1 table of the TEST user, and enable the AUTO strategy for determining the number of buckets for all columns.
ANALYZE TABLE T1 COMPUTE STATISTICS FOR ALL COLUMNS SIZE AUTO;
V4.x
Statistics can be collected for a primary table or an index table:
Syntax in MySQL-compatible mode
-- Collect the global statistics on the T1 table of the TEST user, and enable the AUTO strategy for determining the number of buckets for all columns.
CALL dbms_stats.gather_table_stats('TEST', 'T1', granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE AUTO');
-- Collect the statistics on the IDX index in the T1 table of the TEST user, set the degree of parallelism (DOP) to 4, and specify the table name. The table name must be specified because the index name is not unique.
CALL dbms_stats.gather_index_stats('TEST', 'IDX', degree=>4, tabname=>'T1');
Syntax in Oracle-compatible mode
-- Collect the global statistics on the T1 table of the TEST user, and enable the AUTO strategy for determining the number of buckets for all columns.
CALL dbms_stats.gather_table_stats('TEST', 'T1', granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE AUTO');
-- Collect the statistics on the IDX index of the TEST user, and set the DOP to 4. You do not need to specify the table name because the index name is unique.
CALL dbms_stats.gather_index_stats('TEST', 'IDX', degree=>4);
Modify the partitioning method for a partition-wise join
For a large-table join in a proof of concept (POC) scenario, if allowed by the business system, you can use the same partitioning method for the large tables and bind the tables to the same table group to achieve optimal performance for partition-wise joins.
When you perform a partition-wise join, you must adjust the DOP to a value that matches the partition quantity to achieve optimal performance.
Adapt the DOP and partition quantity
In general, you can achieve better performance by setting the degree of parallelism to be proportional to the number of partitions. For example, if there are 8 partitions, setting the parallelism to 4, 8, or 16 typically yields good results.
For more information, see the Performance optimization through load balancing section in the Overview of parallel execution topic.
Create appropriate indexes
You can create appropriate indexes to reduce the amount of data to be scanned, thereby improving the parallel execution performance. You need to determine the tables and columns on which indexes are to be created based on specific SQL statements.
Create replicated tables (V4.2 and later)
In OceanBase Database V4.2 and later, you can create replicated tables to reduce data redistribution, thereby improving the parallel execution performance.
For more information about replicated tables, see Replicated tables (MySQL-compatible mode) and Replicated tables (Oracle-compatible mode).