In OceanBase Database, cleaning up data from a large number of small tables at once is a common maintenance task. The ability to execute the TRUNCATE TABLE statement in parallel varies by OceanBase version, directly affecting the efficiency of batch table cleanup and the choice of best practices.
- OceanBase V3.x and earlier: The TRUNCATE TABLE statement executes serially within a tenant, resulting in low efficiency when cleaning up many small tables. In this case, use concurrent DELETE operations with batch commits to improve cleanup performance.
- OceanBase V4.x and later: The TRUNCATE TABLE statement supports parallel execution, significantly increasing the efficiency of batch table cleanup. Use parallel TRUNCATE TABLE for these versions.
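The version split above can be captured in a small helper. This is a minimal sketch: the function name is illustrative, and real version detection should parse the full version string reported by your deployment (for example, via `SELECT version();`).

```python
def cleanup_strategy(ob_version: str) -> str:
    """Pick a batch-cleanup strategy from an OceanBase version string
    such as '3.2.4' or '4.3.5'. V4.x and later supports parallel
    TRUNCATE TABLE; earlier versions should fall back to concurrent
    DELETE with batch commits."""
    major = int(ob_version.split(".")[0])
    if major >= 4:
        return "parallel TRUNCATE TABLE"
    return "concurrent DELETE with batch commits"
```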
This topic explains how to use different data cleanup methods in various scenarios, analyzes the performance of each approach in specific environments based on test results, and provides best practice recommendations for different versions. Test results are for reference only. Evaluate actual performance based on your business environment and system configuration.
Note
Starting from OceanBase Database V4.x, the TRUNCATE TABLE statement supports parallel execution, significantly improving the efficiency of batch table cleanup. For versions earlier than V4.x, use concurrent DELETE operations with batch commits.
Characteristics of business scenarios
- Large number of small tables: A typical system contains hundreds or even thousands of tables.
- Small data volume: Each table usually contains a few hundred to several thousand rows.
- Frequent table cleanup: Data often needs to be batch cleaned at specific times, such as at the end of the day or month.
- Strict time window requirements: Business processing windows are tight, and the efficiency of data cleanup directly affects subsequent business operations.
Pain points
Limitations of the traditional TRUNCATE TABLE method
- Serial execution limitation: In OceanBase Database V3.x, TRUNCATE TABLE operations cannot be executed in parallel within a tenant.
- Resource contention: Running a large number of TRUNCATE TABLE operations can lead to competition for DDL resources.
- Low execution efficiency: When cleaning up hundreds of tables, serial execution causes total cleanup time to increase linearly.
Note
Only DDL operations that support parallel execution can run in parallel. If a serial DDL operation is included, the entire DDL execution switches to serial mode. For best results, include only DDL operations that support parallel execution during TRUNCATE TABLE operations. If other DDL operations are present, check whether they support parallel execution.
Business impact
- Truncate operations become a bottleneck for system performance.
- The batch processing window is extended, affecting subsequent business processing.
- System resource utilization becomes uneven.
Solutions
Batch processing design
- Create a temporary table to store the names of tables to be processed.
- Dynamically generate and execute DELETE statements.

```sql
-- Create a table to store the names of tables to be processed.
CREATE TABLE tables_to_clean (table_name VARCHAR(100));
INSERT INTO tables_to_clean VALUES ('fund_daily_position'), ('fund_transaction'), ...;

-- Dynamically generate and execute DELETE statements.
BEGIN
  FOR r IN (SELECT table_name FROM tables_to_clean) LOOP
    EXECUTE IMMEDIATE 'DELETE FROM ' || r.table_name;
  END LOOP;
END;
```

Parallelism adjustment
- Increase the workspace memory.
- Set an appropriate session-level degree of parallelism.

```sql
SET ob_sql_work_area_percentage = 30;  -- Increase the workspace memory.
SET parallel_servers_target = 16;      -- Set the degree of parallelism.
```

Session parallelism
- Use multiple sessions to concurrently execute DELETE operations.
- Dynamically allocate session groups based on the number of tables.
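The multi-session pattern above can be sketched in Python. This is a minimal illustration, not OceanBase-specific code: `run_in_session` is a stand-in for executing SQL on a per-worker database connection (which you would open with a MySQL-compatible driver in practice), and the table names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(tables, num_sessions):
    """Split the table list into roughly equal groups, one per session."""
    return [tables[i::num_sessions] for i in range(num_sessions)]

def clean_group(group, run_in_session):
    # In a real deployment each worker opens its own connection;
    # run_in_session stands in for executing SQL on that connection.
    for table in group:
        run_in_session(f"DELETE FROM {table}")

def parallel_delete(tables, num_sessions, run_in_session):
    groups = partition(tables, num_sessions)
    with ThreadPoolExecutor(max_workers=num_sessions) as pool:
        for group in groups:
            pool.submit(clean_group, group, run_in_session)

# Example with a stub that just records the statements issued.
issued = []
parallel_delete([f"t_{i}" for i in range(10)], 4, issued.append)
```

Sizing `num_sessions` close to the tenant's available CPU cores is a reasonable starting point; more sessions than cores mostly adds contention.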
Transaction management
- Submit transactions in batches to avoid a single large transaction.
- Process a subset of tables in each batch.
- Perform staged commits.
```sql
-- Submit transactions in batches to avoid a single large transaction.
BEGIN
  FOR i IN 1..10 LOOP
    -- Process a portion of the tables in each batch.
    DELETE FROM table_group_1;
    DELETE FROM table_group_2;
    COMMIT;  -- Perform a staged commit.
  END LOOP;
END;
```
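The batching logic can also be expressed generically. In this sketch, `execute` and `commit` are hypothetical stand-ins for calls on a real database connection; the point is the staged-commit structure, which keeps each transaction bounded to `batch_size` tables.

```python
def delete_in_batches(tables, batch_size, execute, commit):
    """Issue DELETEs in fixed-size batches, committing after each
    batch to avoid one oversized transaction."""
    for start in range(0, len(tables), batch_size):
        for table in tables[start:start + batch_size]:
            execute(f"DELETE FROM {table}")
        commit()  # staged commit after each batch

# Stub connection that records what would be sent to the server.
log = []
delete_in_batches([f"t_{i}" for i in range(25)], 10,
                  execute=log.append, commit=lambda: log.append("COMMIT"))
```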
Performance testing
Test environment
- OceanBase Database V4.3.5
- Cluster configuration: single-node, single-replica deployment, with 16 CPU cores and 64 GB memory per node
- Test dataset: 100 tables, each with 1,000 rows
- Tenant specifications: 8 CPU cores, 32 GB of memory, with cpu_quota_concurrency set to 4
Test results
| Method | Tables | Total rows | Time consumed (s) | TPS |
|---|---|---|---|---|
| TRUNCATE TABLE (serial execution) | 100 | 100,000 | 8.00 | 12,500 |
| TRUNCATE TABLE (parallel execution, with a concurrency level of 10) | 100 | 100,000 | 1.1 | 90,909 |
| DELETE (with batch commit) | 100 | 100,000 | 2.00 | 50,000 |
The test results show that:
- The TRUNCATE TABLE (serial execution) method takes 8 seconds.
- The TRUNCATE TABLE (parallel execution) method takes 1.1 seconds.
- The DELETE method with batch commit takes 2 seconds.
- The TRUNCATE TABLE (parallel execution) method delivers the highest TPS (90,909) and outperforms the other two methods.
The performance in your actual environment may vary due to the following factors:
- Actual hardware configuration and system load
- OceanBase Database version and parameter settings
- Table structure, number of indexes, and constraint types
- Actual data volume and data distribution
You can monitor performance metrics by using the following statements:
```sql
-- Monitor the execution status.
SELECT * FROM gv$session_longops WHERE opname LIKE '%DELETE%';

-- Monitor resource usage.
SELECT * FROM gv$sysstat WHERE name LIKE '%parallel%';
```
We recommend that you test any optimization solution in a test environment similar to your production environment to obtain accurate performance data before implementation.
Notice
- DELETE operations generate undo logs. Ensure sufficient undo space is available.
- Consider referential integrity constraints when cleaning up related tables.
- Run ANALYZE TABLE regularly to maintain statistics.
Note
The test results mentioned above are for reference only. Actual performance will vary depending on your business environment and system configuration.
Conclusion
When cleaning up a large number of small tables in OceanBase Database, the parallel TRUNCATE TABLE strategy can significantly improve system performance and resource utilization in V4.x. In V3.x, we recommend using the DELETE command with batch commits. By properly designing batch processing logic and parallelism, you can shorten the time window for bulk table cleanup and provide more time for business operations. Specific recommendations:
- OceanBase V4.x: We recommend that you use the parallel TRUNCATE TABLE command.
- OceanBase V3.x: We recommend that you use the DELETE command with batch commits.
- Choose the optimal table cleanup strategy based on your OceanBase version to improve overall system performance.
This best practice is particularly suitable for business scenarios with a large number of tables but small data volumes per table, and it effectively addresses the limitation that TRUNCATE TABLE operations cannot be executed in parallel in OceanBase Database V3.x and earlier.