This topic describes the classification, status, and compression algorithms of major compactions.
A major compaction merges all dynamic and static data and is therefore a time-consuming operation. When the incremental data generated by minor compactions reaches the specified threshold, OceanBase Database performs a major compaction on data of the same major version. The main difference between a minor compaction and a major compaction is that a major compaction merges the dynamic data in a tenant with its static data at a unified snapshot point and generates a tenant-level snapshot.
Classification of major compactions
Major compactions can be classified into the following types based on the data volume:
Full compaction: All the static data is read and compacted with the dynamic data to generate the final static data. A full compaction takes a long time and consumes a large amount of I/O and CPU resources.
Incremental compaction: Only macroblocks modified after the last compaction are compacted. Macroblocks with no changes are reused.
This mode greatly reduces the compaction workload. Therefore, it is the default compaction mode in OceanBase Database.
Progressive compaction: Only a part of the full data is rewritten in each compaction, so that all data is rewritten after several rounds of progressive compaction.
Window compaction: An incremental and on-demand compaction executed within a specified time window.
Depending on the operation granularity, major compactions can be categorized as follows:
Tenant-level major compaction: A specific frozen SCN is specified to freeze all tables within a tenant, forcing all tables to be compacted to the same version.
A tenant-level major compaction can be triggered in the following ways:
Automatically: When the number of minor freezes reaches a certain threshold, a major compaction is automatically triggered.
At a scheduled time: The trigger time is specified by the tenant-level parameter major_freeze_duty_time, and a major compaction is triggered daily.
Manually: A major compaction is triggered by executing the ALTER SYSTEM MAJOR FREEZE statement.
Table-level major compaction: A specific table within a tenant is specified, and a partition-level major compaction is initiated for all partitions of the table. Not all partitions are guaranteed to be compacted to the same version.
A table-level major compaction can be manually triggered by executing the ALTER SYSTEM MAJOR FREEZE TABLE_ID = table_id statement.
Partition-level major compaction: A major compaction is executed for all replicas of a specified partition based on a unified snapshot version, generating a new major SSTable.
A partition-level major compaction can be triggered in the following ways:
Automatically: Hot partitions are automatically scheduled for medium compactions based on adaptive compaction strategies and buffer table strategies.
Manually: A partition-level major compaction is triggered by executing the ALTER SYSTEM MAJOR FREEZE TABLET_ID = tablet_id statement.
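The scheduled and manual triggers described above can be sketched as follows. This is an illustration only. It assumes you are connected to the tenant whose data is to be compacted and have the privileges to run ALTER SYSTEM. The time '02:00' and the IDs 500001 and 200001 are hypothetical placeholders, not values taken from this topic.

```sql
-- Scheduled trigger: set the daily major compaction time for the tenant.
-- '02:00' is an illustrative value (assumption).
ALTER SYSTEM SET major_freeze_duty_time = '02:00';

-- Manual tenant-level major compaction for the current tenant.
ALTER SYSTEM MAJOR FREEZE;

-- Manual table-level major compaction; 500001 is a hypothetical table_id.
ALTER SYSTEM MAJOR FREEZE TABLE_ID = 500001;

-- Manual partition-level major compaction; 200001 is a hypothetical tablet_id.
ALTER SYSTEM MAJOR FREEZE TABLET_ID = 200001;
```

Replace the placeholder IDs with the actual table_id and tablet_id of the objects you want to compact.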
Major compaction status
You can obtain the major compaction status from the status column of the DBA_OB_ZONE_MAJOR_COMPACTION view.
The major compaction status can be one of the following:
IDLE: No major compaction is in progress.
COMPACTING: A major compaction is in progress.
VERIFYING: The checksum is being verified.
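A minimal query for checking the status might look like the following. It assumes a MySQL-mode tenant, where the view is expected to be available in the oceanbase schema. In other modes, the schema qualifier may differ.

```sql
-- Read the major compaction state of each zone.
-- The status column holds IDLE, COMPACTING, or VERIFYING.
-- Assumption: MySQL mode, view located in the oceanbase schema.
SELECT * FROM oceanbase.DBA_OB_ZONE_MAJOR_COMPACTION;
```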
Compression algorithms for major compactions
OceanBase Database does not flush small data changes to the disk in real time. Instead, data is written to the disk in batches during major compactions. Therefore, data can be compressed before it is written to the disk, which improves disk space utilization. The compression ratio and CPU consumption vary with the compression algorithm and method. You can choose the algorithm and method based on your business needs.
You can set the compression algorithm by specifying the default_compress_func parameter. The default value is zstd_1.3.8. Other supported values are none, lz4_1.0, snappy_1.0, and zstd_1.0.
Note
A higher compression ratio saves more disk space but reduces performance. For example, ZSTD consumes less disk space than LZ4, but compression takes longer and queries that read data from the disk have a longer response time.
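As a sketch of changing the default algorithm, the following statement sets default_compress_func to one of the values listed above. It assumes the connected account is allowed to modify this parameter. The choice of lz4_1.0 is only an example that favors speed over compression ratio.

```sql
-- Change the default compression algorithm to LZ4.
-- Assumption: the current account has the privilege to modify this parameter.
ALTER SYSTEM SET default_compress_func = 'lz4_1.0';
```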
OceanBase Database allows you to specify a compression algorithm when you create a data table.
For more information about the syntaxes for creating a table and specifying a compression algorithm, see CREATE TABLE (MySQL mode) and CREATE TABLE (Oracle mode).
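For illustration, a table-level compression setting in MySQL mode might look like the following. The table name and columns are hypothetical, and the use of the COMPRESSION table option here is a sketch based on the MySQL-mode CREATE TABLE syntax. Check the referenced CREATE TABLE topics for the exact options supported in your version and mode.

```sql
-- Hypothetical table that uses LZ4 instead of the default algorithm.
-- Assumption: MySQL mode with the COMPRESSION table option.
CREATE TABLE t_compress_demo (
  id INT PRIMARY KEY,
  payload VARCHAR(255)
) COMPRESSION 'lz4_1.0';
```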
