Unlike minor compaction, a major compaction is a more significant operation that usually takes a relatively long time. Therefore, it is generally recommended to perform a major compaction once a day during off-peak hours. This process is also known as the daily major compaction.
When enough incremental data has accumulated, a major freeze initiates a major compaction. In contrast to a minor compaction, a major compaction merges the MemTables of all partitions of a tenant with the static data at a unified snapshot point, which generates a tenant-level snapshot.
| Mini compaction | Minor compaction | Major compaction |
|---|---|---|
| Performed at the partition or tenant level to materialize MemTables. | Performed at the partition or tenant level to materialize MemTables. | Performed at the tenant level to generate a tenant-level snapshot. |
| Each OBServer node independently decides when to freeze MemTables in its tenants. The freeze points of primary and standby partitions do not necessarily align. | Triggered in a partition based on the current number of SSTables (including mini SSTables and minor SSTables). | Requires that the freeze points of primary and standby partitions align. During the process, data consistency between primary and standby partitions is verified. |
| May involve data of multiple versions. | May involve data of multiple versions. | Includes only the latest version of data at the snapshot point. |
| Persists one or more MemTables into a mini SSTable. | Combines multiple mini SSTables into one mini SSTable, or merges multiple mini SSTables with one minor SSTable to generate a new minor SSTable. The new SSTable contains only incremental data. Rows to be deleted are flagged with a special mark. | Merges the SSTables and MemTables of the current major version with the full static data of the previous major version to generate the full data of the new major version. |
Although a major compaction is time-consuming, it also opens a window for other work. During this period, OceanBase Database can run several CPU-intensive tasks as part of the major compaction, which improves overall resource utilization.
Data compression
During a major compaction, OceanBase Database compresses data in two steps. First, it encodes the data internally by applying semantic compression. Then, it compresses the encoded data with a general-purpose compression algorithm such as lz4. This two-step compression not only saves storage space but also significantly improves query performance. OceanBase Database supports the snappy, lz4, lzo, and zstd compression algorithms. You can choose an algorithm by balancing compression ratio against decompression time. MySQL and Oracle databases also support data compression to some extent. However, because traditional databases use fixed-length pages, compression inevitably causes storage fragmentation, which reduces compression efficiency. More importantly, in a database with an LSM-tree architecture such as OceanBase Database, compression is performed during compaction rather than on the write path, so it has no impact on data write efficiency.
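The two-step idea can be illustrated with a minimal Python sketch. It is not OceanBase's implementation: dictionary encoding is only an assumed example of an encoding step, `zlib` stands in for a general-purpose algorithm such as lz4 because it ships with the standard library, and the function names are hypothetical.

```python
import json
import zlib

def dictionary_encode(values):
    """Step 1 (encoding): replace each string with an index into a sorted dictionary."""
    dictionary = sorted(set(values))
    index_of = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [index_of[value] for value in values]

def compress_column(values):
    """Step 2 (general compression): compress the already encoded column."""
    dictionary, codes = dictionary_encode(values)
    encoded = json.dumps({"dict": dictionary, "codes": codes}).encode("utf-8")
    return zlib.compress(encoded)  # zlib is a stand-in for lz4, zstd, and so on

def decompress_column(blob):
    """Reverse both steps: decompress, then map the codes back to values."""
    encoded = json.loads(zlib.decompress(blob).decode("utf-8"))
    return [encoded["dict"][code] for code in encoded["codes"]]

# Usage: a low-cardinality column round-trips through both steps.
column = ["Hangzhou", "Beijing", "Hangzhou", "Shanghai"] * 250
blob = compress_column(column)
assert decompress_column(blob) == column
```

Because the work happens when static data is rewritten, a sketch like this would run inside the compaction step, not when a row is first written.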
Data verification
A major compaction based on tenant-level consistent snapshots helps OceanBase Database easily verify the data consistency among multiple replicas. After the major compaction, you can directly compare the baseline data of multiple replicas to verify the business data consistency across replicas. You can also verify the data consistency between the primary and standby tables based on the snapshot baseline data.
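As a rough illustration of comparing baseline data across replicas, the sketch below computes a checksum over each replica's rows and checks that all checksums match. The replica layout (a dict of rows per replica), the use of CRC32, and the function names are assumptions made for this example; the source does not describe how OceanBase Database computes or compares checksums.

```python
import zlib

def baseline_checksum(rows):
    """CRC32 over the rows of one replica's baseline, visited in key order."""
    crc = 0
    for key, value in sorted(rows.items()):
        crc = zlib.crc32(f"{key}={value};".encode("utf-8"), crc)
    return crc

def replicas_consistent(replicas):
    """Return True when every replica's baseline yields the same checksum."""
    return len({baseline_checksum(rows) for rows in replicas.values()}) == 1

# Usage: three replicas produced from the same snapshot should agree.
replicas = {
    "replica_1": {1: "a", 2: "b"},
    "replica_2": {2: "b", 1: "a"},
    "replica_3": {1: "a", 2: "b"},
}
assert replicas_consistent(replicas)
```

The comparison is only meaningful because all replicas are compacted at the same snapshot point, so their baselines are expected to be identical.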
Schema change
For schema changes such as adding or removing columns, OceanBase Database can apply the corresponding data changes during the major compaction, which makes DDL operations smoother for the business.
Merging methods
OceanBase Database supports several major compaction methods, which are described below.
Full major compaction
Full major compaction is the original major compaction algorithm of OceanBase Database and is similar to the major compaction process in HBase and RocksDB. During a full major compaction, the current static data is read, merged with the dynamic data in memory, and written to disk as new static data. Because all data is rewritten in the process, a full major compaction consumes a large amount of disk I/O and space. OceanBase Database generally does not initiate a full major compaction unless a database administrator (DBA) explicitly triggers one.
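The read, merge, and rewrite cycle can be sketched in a few lines of Python. This is a simplified model, not the engine's code: the static data and the in-memory data are plain dicts, and `TOMBSTONE` is a hypothetical marker for deleted rows.

```python
TOMBSTONE = object()  # hypothetical marker for a row deleted in memory

def full_major_compaction(baseline, memtable):
    """Rewrite every row: merge the static data with the in-memory changes."""
    merged = dict(baseline)            # read all current static data
    for key, value in memtable.items():
        if value is TOMBSTONE:
            merged.pop(key, None)      # drop rows deleted since the last version
        else:
            merged[key] = value        # apply inserts and updates
    return dict(sorted(merged.items()))  # write out new static data in key order

# Usage: one update, one delete, one insert.
baseline = {1: "a", 2: "b", 3: "c"}
memtable = {2: "B", 3: TOMBSTONE, 4: "d"}
assert full_major_compaction(baseline, memtable) == {1: "a", 2: "B", 4: "d"}
```

Every row of the baseline is copied even when it did not change, which is where the I/O and space cost of this method comes from.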
Incremental major compaction
In the storage engine of OceanBase Database, a macroblock is the basic unit of write I/O. In many scenarios, not all macroblocks are modified. A macroblock that has not been modified can be reused in the major compaction instead of being rewritten. OceanBase Database calls this reuse an incremental major compaction. Incremental major compaction significantly reduces the workload of a major compaction and has become the default major compaction method in OceanBase Database.
OceanBase Database further divides the data within a macroblock into smaller microblocks. In many scenarios, not all microblocks in a macroblock are modified, so unmodified microblocks can also be reused instead of being rewritten. Incremental major compaction at the microblock level further reduces the duration of a major compaction.
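The reuse decision at the macroblock level can be sketched as follows. This is a simplified model under stated assumptions: each macroblock is a dict of rows, changes are updates to existing keys only, and the function name is hypothetical.

```python
def incremental_major_compaction(macroblocks, changes):
    """Rewrite only the macroblocks touched by changes and reuse the rest.

    macroblocks: list of dicts, each holding the rows of one block.
    changes: dict mapping an existing key to its new value (updates only).
    Returns the new list of blocks and the number of blocks reused.
    """
    new_blocks, reused = [], 0
    for block in macroblocks:
        touched = {k: v for k, v in changes.items() if k in block}
        if not touched:
            new_blocks.append(block)               # unmodified: reuse as-is
            reused += 1
        else:
            new_blocks.append({**block, **touched})  # modified: write a new block
    return new_blocks, reused

# Usage: a single update rewrites one block and reuses the other two.
blocks = [{1: "a", 2: "b"}, {3: "c", 4: "d"}, {5: "e"}]
new_blocks, reused = incremental_major_compaction(blocks, {3: "C"})
assert reused == 2 and new_blocks[0] is blocks[0]
```

The same check could be applied one level down, to the microblocks inside a modified macroblock.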
Progressive major compaction
To support rapid business growth, you must inevitably perform DDL operations such as adding and dropping columns and creating indexes. These DDL operations are usually expensive in the database. For a long time, MySQL did not support online DDL operations (MySQL 5.6 was the first to support online DDL operations). Even today, performing online DDL operations in MySQL 5.7 is still risky for a DBA, because a large DDL operation can cause replication lag between the primary and standby MySQL databases.
OceanBase Database is designed with online DDL operations in mind. Operations that add or drop columns or create indexes do not block reads and writes and do not interrupt Paxos synchronization among replicas. For DDL operations that take effect in real time, the changes to stored data are completed during major compactions. However, some DDL operations, such as adding and dropping columns, require rewriting all data. If all data is rewritten during a single major compaction, it puts a heavy strain on storage space and on the duration of the major compaction. To address this, OceanBase Database introduces the progressive major compaction, which distributes the data rewriting caused by DDL operations across multiple major compactions. For example, if 60 progressive rounds are configured, only 1/60 of the data is rewritten during each major compaction. After 60 rounds of major compactions, the data is rewritten in its entirety. The progressive major compaction reduces the burden on DBAs who perform DDL operations and makes the DDL operations smoother.
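The arithmetic behind progressive rounds can be made concrete with a small sketch. It assumes, purely for illustration, that the rewrite is counted in macroblocks and spread as evenly as possible; the function name and the unit are hypothetical, not OceanBase parameters.

```python
import math

def progressive_plan(total_macroblocks, rounds):
    """Return how many macroblocks are rewritten in each compaction round.

    The rewrite is spread over at most `rounds` rounds, with the per-round
    share rounded up so that everything is rewritten by the last round.
    """
    per_round = math.ceil(total_macroblocks / rounds)
    plan, remaining = [], total_macroblocks
    while remaining > 0:
        step = min(per_round, remaining)
        plan.append(step)
        remaining -= step
    return plan

# Usage: 600 macroblocks over 60 rounds means 10 macroblocks (1/60) per round.
plan = progressive_plan(600, 60)
assert len(plan) == 60 and sum(plan) == 600
```

With this split, each compaction pays only a fraction of the rewrite cost, at the price of the old and new data layouts coexisting until the final round.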
Parallel major compaction
OceanBase Database supports partitioned tables in version 1.0 and later. Major compactions for different partitions are performed in parallel. However, data skew can leave some partitions with large data volumes. Even with incremental major compaction, a major compaction can still consume a large amount of disk I/O and time in business scenarios with frequent updates. To address this, OceanBase Database introduced parallel major compaction within a partition. The data in a partition is divided into multiple parts, and different threads compact the parts in parallel. This significantly speeds up the major compaction.
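The idea of dividing data into parts that separate threads compact can be sketched with Python's `concurrent.futures`. This shows the structure only: the row layout, the updates-only memtable, and the function names are assumptions, and Python threads do not give real CPU parallelism for this kind of work.

```python
from concurrent.futures import ThreadPoolExecutor

def compact_range(rows, memtable):
    """Merge the in-memory updates that fall inside one key range."""
    return {key: memtable.get(key, value) for key, value in rows}

def parallel_compaction(baseline, memtable, workers=4):
    """Split the sorted rows into contiguous ranges and compact them concurrently."""
    rows = sorted(baseline.items())
    size = max(1, -(-len(rows) // workers))  # ceiling division, at least 1
    slices = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(compact_range, slices, [memtable] * len(slices)))
    merged = {}
    for part in parts:  # parts come back in key-range order
        merged.update(part)
    return merged

# Usage: two updates land in different ranges and are applied independently.
baseline = {i: f"v{i}" for i in range(10)}
result = parallel_compaction(baseline, {3: "new3", 7: "new7"}, workers=3)
assert result[3] == "new3" and result[7] == "new7" and len(result) == 10
```

Because the ranges do not overlap, each worker can produce its output without coordinating with the others, and the results are simply concatenated in key order.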
Merging triggers
A major compaction can be triggered automatically, on a schedule, or manually.
A major compaction is automatically triggered when the number of minor freezes in a tenant exceeds the threshold.
You can set a parameter to schedule a major compaction during off-peak hours every day.
obclient> ALTER SYSTEM SET major_freeze_duty_time = '02:00' TENANT = t1;
You can trigger a major compaction manually by using the following O&M commands.
Initiate a major compaction in another tenant from the sys tenant
Initiate a major compaction in the sys tenant
obclient> ALTER SYSTEM MAJOR FREEZE TENANT = sys;
Initiate a major compaction in all user tenants
obclient> ALTER SYSTEM MAJOR FREEZE TENANT = all_user;
Initiate a major compaction in all META tenants
obclient> ALTER SYSTEM MAJOR FREEZE TENANT = all_meta;
Initiate a major compaction in tenants t1 and t2
obclient> ALTER SYSTEM MAJOR FREEZE TENANT = t1,t2;
Initiate a major compaction in your own tenant
obclient> ALTER SYSTEM MAJOR FREEZE;
References
For more information about merge operations, see Major compaction.