A major compaction is a process of merging static and dynamic data. Compared with minor compactions, major compactions are more significant and time-consuming, and they consume a large amount of system resources while in progress. To minimize the impact on business, it is therefore recommended to perform a major compaction once a day during off-peak hours.
A minor compaction is a process of merging multiple mini SSTables, or a mini SSTable and a minor SSTable, into one minor SSTable. Unlike a major compaction, a minor compaction is triggered at the partition level based on the number of SSTables (including mini SSTables and minor SSTables) in each partition. Major and minor compactions differ in the following aspects:
| Minor compaction | Major compaction |
|---|---|
| Materializes MemTables at the tenant or partition level. The data in a MemTable may be of different versions. | Merges the data of a tenant at one snapshot point with the corresponding static data, generating a tenant-level snapshot. |
| Each OBServer node independently determines when to freeze its MemTables, and the freeze points of primary and standby partitions do not necessarily align. A minor compaction is performed within a partition based on the number of SSTables (including mini SSTables and minor SSTables) in the partition. | All MemTables of all partitions are frozen at the same time. The primary and standby partitions must maintain consistency, and data consistency is verified during the major compaction. |
| The data of multiple versions may be included. | Only the data of the snapshot point is included. |
| Persists one or more MemTables into a mini SSTable, or merges multiple mini SSTables (or a mini SSTable and a minor SSTable) into a new minor SSTable that contains only incremental data. A special mark is placed on data that will be deleted. | Merges the current SSTables and MemTables with the full static data of the previous major version to generate the full data of the new major version. |
Although a major compaction is time-consuming, it provides an opportunity for the database to perform multiple CPU-intensive tasks, thereby improving the overall resource utilization.
Data compression
During a major compaction, OceanBase Database performs two rounds of compression on data: the first round is semantic compression within the database, and the second round is general-purpose compression using algorithms such as lz4. This not only reduces storage space consumption but also significantly improves query performance. OceanBase Database supports multiple compression algorithms (snappy, lz4, lzo, and zstd), allowing users to choose an algorithm that balances the compression ratio and decompression time. MySQL and Oracle also support data compression to some extent. However, due to the fixed-length page design of traditional databases, compression inevitably causes storage fragmentation, which reduces compression efficiency. More importantly, compression has a negligible impact on the write performance of a storage system with an LSM-tree architecture, such as OceanBase Database.
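As an illustration, the compression algorithm can be specified per table. The following sketch assumes the MySQL mode of OceanBase Database; the exact option syntax and the algorithm version strings (for example, `zstd_1.3.8`) may differ across OceanBase versions:

```sql
-- Create a table whose baseline data is compressed with zstd
-- (the algorithm version string is version-dependent).
CREATE TABLE orders (
  order_id BIGINT PRIMARY KEY,
  payload  VARCHAR(1024)
) COMPRESSION = 'zstd_1.3.8';

-- Switch to lz4; the new algorithm applies to data rewritten
-- during subsequent major compactions.
ALTER TABLE orders SET COMPRESSION = 'lz4_1.0';
```

Because baseline data is rewritten only at major compaction time, a change of compression algorithm takes full effect gradually as subsequent major compactions rewrite the data.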
Data verification
A major compaction based on a tenant-level consistent snapshot allows OceanBase Database to easily verify data consistency among multiple replicas. After the major compaction, the business data in each replica can be directly compared with the baseline data to ensure consistency. In addition, data consistency between the primary table and its index tables can be verified based on the baseline data.
Schema changes
For schema changes such as adding or dropping columns, OceanBase Database can apply the corresponding data changes during a major compaction, making DDL operations smoother for the business.
Merging methods
OceanBase Database supports several major compaction methods, which are described below.
Full major compaction
The full major compaction is the earliest major compaction algorithm of OceanBase Database and is similar to the major compaction process in HBase and RocksDB. During a full major compaction, the current static data is read, merged with the dynamic data in memory, and written to the disk as new static data. All the data is rewritten during the process, so full major compactions consume a large amount of disk I/O and space. OceanBase Database generally does not initiate a full major compaction unless one is explicitly triggered by the database administrator (DBA).
Incremental major compaction
In the storage engine of OceanBase Database, a macroblock is the basic unit of I/O write. In many scenarios, not all macroblocks are modified. When no incremental modifications are made to a macroblock, the macroblock can be directly reused during a major compaction. This merging method is called an incremental major compaction in OceanBase Database. Incremental major compactions significantly reduce the workload of major compactions and have become the default major compaction algorithm in OceanBase Database. In addition, OceanBase Database splits data into smaller microblocks within macroblocks for I/O read and write. In many scenarios, not all microblocks are modified. Therefore, microblocks can be reused instead of being rewritten during a major compaction. This further reduces the time of major compactions.
Progressive major compaction
To support rapid business growth, you must inevitably perform DDL operations such as adding and dropping columns and creating indexes. These DDL operations are usually expensive for the database. For a long time, MySQL did not support online DDL operations (MySQL 5.6 was the first to support online DDL operations). Even today, performing online DDL operations in MySQL 5.7 is still risky for database administrators, because a large DDL operation can cause replication lag between the primary and standby MySQL databases.
OceanBase Database was designed to support online DDL operations from the very beginning. In OceanBase Database, DDL operations such as adding and dropping columns and creating indexes do not block reads and writes, nor do they affect Paxos synchronization between replicas. Changes such as adding or dropping columns take effect in real time, and the rewriting of stored data is deferred to subsequent daily major compactions. However, such DDL operations eventually require rewriting all data. If all data were rewritten during a single major compaction, it would place heavy pressure on storage space and prolong the major compaction. To address this, OceanBase Database introduces the progressive major compaction, which distributes the data rewriting caused by DDL changes across multiple major compactions. For example, if 60 progressive major compactions are configured, a single major compaction rewrites only 1/60 of the data, and all data is rewritten after 60 rounds of major compactions. The progressive major compaction reduces the workload of the DBA in performing DDL operations and makes DDL changes smoother.
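The number of rounds can be configured per table. A minimal sketch, assuming the table-level `progressive_merge_num` option of OceanBase Database (MySQL mode shown; the exact syntax and the semantics of special values may vary by version):

```sql
-- Spread the data rewrite triggered by DDL changes (such as adding
-- or dropping columns) over 60 successive major compactions.
ALTER TABLE orders SET progressive_merge_num = 60;
```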
Parallel major compaction
OceanBase Database has supported partitioned tables since version 1.0, and major compactions of different partitions are performed in parallel. However, data skew can cause large differences in data volume among partitions, so a major compaction may still need to process a massive amount of data in some partitions. Even with incremental major compaction, a large volume of data may be processed in scenarios with frequent business updates. To address this, OceanBase Database introduced intra-partition parallel major compaction: during a major compaction, the macroblocks of a partition are divided among multiple threads and merged in parallel, which significantly speeds up the major compaction process.
Merging triggers
Major compactions can be triggered automatically, on a scheduled basis, or manually.
A major compaction is automatically triggered when the number of minor freezes in a tenant exceeds the threshold.
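The threshold is controlled by a tenant-level parameter. The following sketch assumes the `major_compact_trigger` parameter of OceanBase Database (the parameter name and default value may vary by version):

```sql
-- Automatically trigger a major compaction after 5 minor freezes in the tenant.
obclient> ALTER SYSTEM SET major_compact_trigger = 5;
```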
You can set parameters to schedule a major compaction during off-peak hours every day.
obclient> ALTER SYSTEM SET major_freeze_duty_time = '02:00' TENANT = t1;
You can manually trigger a major compaction by using the following O&M commands.
Initiate a major compaction in another tenant from the sys tenant
Initiate a major compaction in the sys tenant
obclient> ALTER SYSTEM MAJOR FREEZE TENANT = sys;
Initiate a major compaction in all user tenants
obclient> ALTER SYSTEM MAJOR FREEZE TENANT = all_user;
Initiate a major compaction in all META tenants
obclient> ALTER SYSTEM MAJOR FREEZE TENANT = all_meta;
Initiate a major compaction in tenants t1 and t2
obclient> ALTER SYSTEM MAJOR FREEZE TENANT = t1,t2;
Initiate a major compaction in your own tenant from a user tenant
obclient> ALTER SYSTEM MAJOR FREEZE;
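After a major compaction is initiated, you can observe its progress from the sys tenant. The following sketch assumes OceanBase Database V4.x, where the `DBA_OB_ZONE_MAJOR_COMPACTION` view reports the per-zone compaction status; the view name and columns differ in earlier versions:

```sql
-- Check the major compaction status of each zone from the sys tenant.
obclient> SELECT * FROM oceanbase.DBA_OB_ZONE_MAJOR_COMPACTION;
```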
References
For more information about merge operations, see Major compactions.