Unlike minor compaction, a major compaction is a more significant operation that takes a relatively long time. Therefore, it is generally recommended to perform a major compaction once a day during off-peak hours. This is why a major compaction is sometimes called a daily major compaction.
A major compaction is time-consuming because it merges static and dynamic data. When minor compactions have generated a sufficient amount of incremental data, a major compaction is triggered, starting with a major freeze. The major freeze is similar to a minor freeze in that both are initiated based on the size of the accumulated data. However, a major compaction differs from a minor compaction in that it is a tenant-level operation: all MemTables of the tenant are frozen and merged with the full static data of the previous major version. This process generates a new set of full data for the tenant.
| Mini compaction (minor freeze) | Minor compaction | Major compaction |
|---|---|---|
| Partition- or tenant-level operation that only involves MemTables | Partition-level operation that only involves SSTables | Tenant-level operation that generates a tenant-level snapshot |
| Each OBServer node's tenant decides independently whether to freeze its MemTables. The freeze status of primary partitions may differ from that of standby partitions. | A partition performs a minor compaction based on the current number of SSTables (including mini SSTables and minor SSTables). | All MemTables of all partitions of a tenant are frozen. The freeze status of primary partitions must be consistent with that of standby partitions. Data consistency is verified during the major compaction. |
| May contain data of multiple versions | May contain data of multiple versions | Contains only the data of the latest snapshot point |
| Persists one or more MemTables as a mini SSTable | Merges multiple mini SSTables into one mini SSTable, or merges multiple mini SSTables with one minor SSTable to generate a new minor SSTable. The merged SSTable contains only incremental data; rows to be deleted are specially marked. | Merges the current major version's SSTables and MemTables with the full static data of the previous major version to generate new full data. |
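The relationships in the table above can be sketched as a toy model. All names and structures here are hypothetical simplifications for illustration; real OceanBase SSTables are far more involved.

```python
# Toy model of the three compaction levels. Each "SSTable" is a dict
# mapping row key -> (version, value); value None marks a delete.

def mini_compaction(memtable):
    """Minor freeze: persist a MemTable as a mini SSTable (multi-version kept)."""
    return dict(memtable)

def minor_compaction(sstables):
    """Merge several mini SSTables into one minor SSTable. Delete markers
    are kept, because only incremental data is involved."""
    merged = {}
    for sst in sstables:           # older tables first
        merged.update(sst)         # newer versions overwrite older ones
    return merged

def major_compaction(baseline, incremental):
    """Merge incremental data with the previous baseline into new full data.
    Delete markers are applied, so only the latest snapshot survives."""
    full = dict(baseline)
    full.update(incremental)
    return {k: v for k, v in full.items() if v[1] is not None}

baseline = {"a": (1, "old"), "b": (1, "keep")}
inc1 = mini_compaction({"a": (2, "new"), "c": (2, "added")})
inc2 = mini_compaction({"b": (3, None)})          # delete marker for "b"
incremental = minor_compaction([inc1, inc2])
print(major_compaction(baseline, incremental))
# {'a': (2, 'new'), 'c': (2, 'added')}
```

Note how the delete marker for `b` survives the minor compaction but is resolved by the major compaction, matching the table's last row.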
Although a major compaction takes a long time, it provides an opportunity for the database to perform multiple CPU-intensive tasks, thereby improving the overall resource utilization efficiency.
Data compression
During a major compaction, OceanBase Database performs two rounds of data compression. The first round is semantic compression (encoding) within the database, and the second round is general compression using a user-specified algorithm such as lz4. This reduces the size of encoded data, which not only saves storage space but also significantly improves query performance. OceanBase Database supports the snappy, lz4, lzo, and zstd compression algorithms; you can choose an appropriate one by balancing the compression ratio against decompression time. MySQL and Oracle databases also support data compression to some extent. However, due to the fixed-length page design of traditional databases, compression there inevitably causes storage fragmentation, which reduces compression efficiency. More importantly, in a log-structured merge-tree (LSM-tree) architecture like that of OceanBase Database, compression has no impact on data write efficiency, because data is compressed in bulk during compactions rather than on the write path.
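A minimal sketch of the two-round idea, using delta encoding as a stand-in for the database's semantic compression and zlib (from the Python standard library) as a stand-in for lz4/zstd:

```python
import zlib

def encode_deltas(values):
    """Round 1 (semantic): delta-encode an integer column. Sorted or
    monotonically increasing columns become long runs of small numbers."""
    prev, out = 0, []
    for v in values:
        out.append(v - prev)
        prev = v
    return out

column = list(range(1000, 2000))   # e.g. monotonically increasing row ids
raw = ",".join(map(str, column)).encode()
encoded = ",".join(map(str, encode_deltas(column))).encode()

# Round 2 (general): compress the encoded block with a general algorithm.
# The semantically encoded block compresses far better than the raw one.
print(len(raw), len(zlib.compress(raw)), len(zlib.compress(encoded)))
```

The point of the two rounds is that semantic encoding exposes redundancy that a byte-oriented general compressor cannot see on its own.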
Data verification
A major compaction based on tenant-level consistent snapshots helps OceanBase Database easily verify the data consistency among multiple replicas. After the major compaction, you can directly compare the baseline data of multiple replicas to verify the business data consistency among the replicas. You can also verify the data consistency between the primary and standby tables based on the snapshot baseline data.
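As an illustration of why consistent snapshots make verification easy: once every replica holds baseline data for the same snapshot point, comparing canonical checksums is enough to detect divergence. The `baseline_checksum` helper below is hypothetical, not OceanBase's actual verification protocol.

```python
import hashlib

def baseline_checksum(rows):
    """Checksum a replica's baseline data in a canonical (sorted) order,
    so physically different layouts of the same data compare equal."""
    h = hashlib.sha256()
    for key in sorted(rows):
        h.update(f"{key}={rows[key]}".encode())
    return h.hexdigest()

replica1 = {"a": 1, "b": 2}
replica2 = {"b": 2, "a": 1}          # same snapshot data, different order
assert baseline_checksum(replica1) == baseline_checksum(replica2)
```

Without a consistent snapshot point, replicas could hold the same logical data at different versions and a direct comparison would produce false alarms.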
Schema change
For schema changes such as adding or dropping columns, OceanBase Database can apply the corresponding data changes during the major compaction, making DDL operations smoother for the business.
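A toy sketch of the idea: when a column is added, its default value can be filled in as rows are rewritten during the merge, rather than in a separate blocking rewrite. The function name and row representation are hypothetical.

```python
def rewrite_with_new_column(rows, col, default):
    """During the merge, emit each row with the newly added column filled
    with its default value (rows that already have it are left unchanged)."""
    return [{**row, col: row.get(col, default)} for row in rows]

old_rows = [{"id": 1}, {"id": 2}]
print(rewrite_with_new_column(old_rows, "status", "active"))
# [{'id': 1, 'status': 'active'}, {'id': 2, 'status': 'active'}]
```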
Merging methods
OceanBase Database supports several methods of performing a major compaction, described below.
Full major compaction
Full major compaction is the original major compaction algorithm of OceanBase Database, similar to the compaction process in HBase and RocksDB. During a full major compaction, the current static data is read, merged with the dynamic data in memory, and written to disk as new static data; all data is rewritten in the process. Full major compactions consume a great deal of disk I/O and space, so OceanBase Database does not initiate them unless they are explicitly triggered by the database administrator (DBA).
Incremental major compaction
In the storage engine of OceanBase Database, a macroblock is the basic unit of write I/O. In many scenarios, not all macroblocks are modified. When a macroblock contains no incremental modifications, it can be reused directly in the major compaction. OceanBase Database calls this method of reusing macroblocks incremental major compaction. It significantly reduces the workload of a major compaction and has become the default compaction method in OceanBase Database. Furthermore, OceanBase Database divides each macroblock into smaller microblocks. In many scenarios, not all microblocks are modified, and unmodified microblocks can likewise be reused instead of rewritten, which further reduces the time of the major compaction.
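A toy model of macroblock reuse, assuming each macroblock covers a contiguous key range. The structures are hypothetical, not the real storage format.

```python
def incremental_compaction(macroblocks, updates):
    """macroblocks: list of ((lo, hi), rows) covering disjoint key ranges;
    updates: {key: value} of incremental data. Only macroblocks whose key
    range is touched by an update are rewritten; the rest are reused."""
    result, rewritten = [], 0
    for (lo, hi), rows in macroblocks:
        touched = {k: v for k, v in updates.items() if lo <= k <= hi}
        if touched:
            result.append(((lo, hi), {**rows, **touched}))  # rewrite block
            rewritten += 1
        else:
            result.append(((lo, hi), rows))                 # reuse block as is
    return result, rewritten

blocks = [((0, 9), {1: "a", 5: "b"}), ((10, 19), {12: "c"})]
_, rewritten = incremental_compaction(blocks, {5: "B"})
print(rewritten)   # 1, because only the first macroblock was touched
```

When updates cluster into a small fraction of key ranges, most of the baseline is carried over by reference, which is why incremental major compaction is so much cheaper than a full rewrite.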
Progressive major compaction
To support rapid business growth, you must inevitably perform DDL operations such as adding and dropping columns and creating indexes. These operations are usually expensive for a database. For a long time, MySQL did not support online DDL operations (support was first added in MySQL 5.6). Even today, performing online DDL operations in MySQL 5.7 is still risky for a DBA, because a large DDL operation can cause replication lag between the primary and standby MySQL databases.
OceanBase Database was designed to support online DDL operations from the very beginning. In OceanBase Database, DDL operations such as adding and dropping columns and creating indexes do not block reads or writes, nor do they affect Paxos synchronization between replicas. A change such as adding or dropping a column takes effect immediately at the schema level, while the data rewrite it requires is deferred to subsequent major compactions. However, rewriting all data in a single major compaction would put great strain on storage space and greatly lengthen the compaction. To address this, OceanBase Database introduced progressive major compaction, which distributes the data rewritten for a DDL change across multiple major compactions. For example, if 60 progressive rounds are configured, a single major compaction rewrites only 1/60 of the data, and after 60 rounds all the data has been rewritten. Progressive major compaction reduces the operational burden of DDL changes for DBAs and makes schema changes smoother.
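The 1/N-per-round schedule can be sketched as follows, assuming blocks are assigned to rounds by index. This assignment policy is hypothetical and purely for illustration.

```python
def progressive_round(num_blocks, rounds, round_no):
    """Return the indexes of the blocks rewritten in this round: with
    `rounds` configured rounds, each round rewrites ~1/rounds of the data."""
    return [i for i in range(num_blocks) if i % rounds == round_no]

# With 60 configured rounds over 600 blocks, each major compaction
# rewrites 10 blocks, and after 60 rounds every block has been rewritten.
all_rewritten = set()
for r in range(60):
    all_rewritten.update(progressive_round(600, 60, r))
print(len(all_rewritten))   # 600
```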
Parallel major compaction
OceanBase Database has supported partitioned tables since version 1.0. In OceanBase Database, major compactions of different data partitions are performed in parallel. However, data skew can cause large differences in partition sizes; in some cases the major compaction of a single partition must process a large amount of data, making the whole process time-consuming. To address this, OceanBase Database introduced parallel major compaction within a data partition. During a major compaction, the data of a partition is divided into multiple ranges that are compacted by multiple threads in parallel, which significantly speeds up the major compaction.
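A sketch of intra-partition parallelism using a thread pool, splitting rows into disjoint slices by key so each slice can be compacted independently. This is a hypothetical simplification of the real range-splitting logic.

```python
from concurrent.futures import ThreadPoolExecutor

def compact_range(rows):
    """Compact one slice: keep only the latest version of each key."""
    latest = {}
    for key, version, value in rows:
        if key not in latest or version > latest[key][0]:
            latest[key] = (version, value)
    return latest

def parallel_compaction(rows, num_threads=4):
    # Split rows into disjoint slices by key, so slices merge independently.
    slices = [[] for _ in range(num_threads)]
    for key, version, value in rows:
        slices[hash(key) % num_threads].append((key, version, value))
    merged = {}
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        for part in pool.map(compact_range, slices):
            merged.update(part)
    return merged

rows = [("a", 1, "x"), ("a", 2, "y"), ("b", 1, "z")]
assert parallel_compaction(rows) == {"a": (2, "y"), "b": (1, "z")}
```

The key point is that the split must be disjoint by key; each thread can then resolve versions within its slice without coordinating with the others.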
Triggers
Major compactions can be triggered automatically, on a scheduled basis, or manually.
A major compaction is automatically triggered when the number of minor freezes in a tenant exceeds the threshold.
You can set parameters to schedule a major compaction during off-peak hours every day.
obclient> ALTER SYSTEM SET major_freeze_duty_time = '02:00' TENANT = t1;

You can manually trigger a major compaction by using the following O&M commands.
Initiate a major compaction in another tenant from the sys tenant
Initiate a major compaction in the sys tenant

obclient> ALTER SYSTEM MAJOR FREEZE TENANT = sys;

Initiate a major compaction in all user tenants

obclient> ALTER SYSTEM MAJOR FREEZE TENANT = all_user;

Initiate a major compaction in all META tenants

obclient> ALTER SYSTEM MAJOR FREEZE TENANT = all_meta;

Initiate a major compaction in tenants t1 and t2
obclient> ALTER SYSTEM MAJOR FREEZE TENANT = t1,t2;
Initiate a major compaction in your own tenant from a user tenant
obclient> ALTER SYSTEM MAJOR FREEZE;
References
For more information about merge operations, see Major compactions.