This topic describes the classification, status, and compression algorithms of major compactions. A major compaction compacts all dynamic and static data and is therefore a time-consuming operation. When the incremental data generated by minor compactions reaches a specified threshold, OceanBase Database performs a major compaction on the data of the same major version. The main difference from a minor compaction is that a major compaction compacts the data in all partitions of the cluster with the global static data at a unified snapshot point. A major compaction is therefore a global operation and generates a global snapshot.
Classification of major compactions
Major compactions can be classified into the following types based on the volume of data processed:
Full compaction: All the static data is read and compacted with the dynamic data to generate the final static data. A full compaction takes a long time and consumes a large amount of I/O and CPU resources.
Incremental compaction: Only the macroblocks modified since the last major compaction are compacted; unchanged macroblocks are reused. This greatly reduces the compaction workload, and it is therefore the default compaction mode in OceanBase Database.
Progressive compaction: A portion of the full data is compacted each time, and the entire data set is rewritten after several rounds of progressive compactions.
Parallel compaction: Data is distributed to different threads for parallel compaction.
For more information about major compactions, see Major compactions.
For more information about major compaction parameters, see Modify major compaction settings.
Major compaction status
You can query the major compaction status from the STATUS column of the DBA_OB_ZONE_MAJOR_COMPACTION view.
The major compaction status can be:
IDLE: No major compaction is in progress.
COMPACTING: A major compaction is in progress.
VERIFYING: The checksum is being verified.
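As a minimal sketch, you can check the status as follows, assuming you have access to the view from the sys tenant; the WHERE clause is illustrative:

```sql
-- List zones whose major compaction is not idle
SELECT *
FROM oceanbase.DBA_OB_ZONE_MAJOR_COMPACTION
WHERE STATUS != 'IDLE';
```

An empty result set indicates that every zone is in the IDLE state.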
Compression algorithms for major compactions
OceanBase Database does not flush data to the disk in small increments in real time. Instead, data is flushed to the disk in batches during major compactions. Data can therefore be compressed before it is written to the disk, which improves disk space utilization. The compression ratio and the CPU consumption vary with the compression algorithm and method, so you can choose them based on your business needs.
You can set the compression algorithm through the default_compress_func parameter. The default value is zstd_1.3.8. Other supported values are none, lz4_1.0, snappy_1.0, zlib_1.0, and zstd_1.0.
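For example, the following statement is a sketch of switching the default compression algorithm to LZ4, assuming you run it with sufficient privileges (typically in the sys tenant):

```sql
-- Change the default compression algorithm used for data written by major compactions
ALTER SYSTEM SET default_compress_func = 'lz4_1.0';
```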
Note
A higher compression ratio saves more disk space but reduces performance. For example, ZSTD consumes less disk space than LZ4, but compression and decompression take longer, which increases the response time of queries that involve disk I/O.
OceanBase Database allows you to specify a compression algorithm when you create a data table.
For more information about the syntax for creating a table and specifying a compression algorithm, see CREATE TABLE (MySQL mode) and CREATE TABLE (Oracle mode).
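As an illustrative sketch in MySQL mode, a per-table compression algorithm can be specified with a table option at creation time; the table name t1 and its columns are hypothetical:

```sql
-- MySQL mode: create a table whose data is compressed with zstd during major compactions
CREATE TABLE t1 (
  id  INT PRIMARY KEY,
  val VARCHAR(64)
) COMPRESSION = 'zstd_1.3.8';
```

See the CREATE TABLE references above for the exact option syntax in each mode.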
Trigger a major compaction
OceanBase Database supports automatic, scheduled, and manual major compactions.
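For example, a manual major compaction can be triggered with the MAJOR FREEZE statement; this sketch assumes you run it with sufficient privileges in the target tenant:

```sql
-- Manually trigger a major compaction for the current tenant
ALTER SYSTEM MAJOR FREEZE;
```

After issuing the statement, you can observe the progress through the DBA_OB_ZONE_MAJOR_COMPACTION view described above.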