OceanBase Database supports the merge of incremental data and baseline data. The merge is completed in two stages: minor compaction and major compaction. The former corresponds to minor freeze or minor merge, and the latter corresponds to major freeze or major merge. A major compaction is triggered when multiple minor versions of data reach the specified threshold or at the specified point in time. This topic describes the common causes of major compaction blockage and provides solutions.
Check whether a major compaction is blocked
Query the start time of the current major compaction and compare the elapsed time with that of previous major compactions. If the elapsed time is far longer than the average and the difference cannot be explained by changes in the system or in traffic, the major compaction may be blocked. Sample SQL statement:
obclient> SELECT * FROM __all_zone WHERE name in ('last_merged_time','merge_status','merge_start_time');
Check whether a merge error is reported. A merge error can stop the major compaction.
obclient> SELECT * FROM __all_zone WHERE name = 'is_merge_error';
View the version of the partition that is stuck during the major compaction. You can query the __all_zone table, and the value of the frozen_version field indicates the latest version of the major compaction.
obclient> SELECT * FROM __all_virtual_meta_table WHERE data_version != 'latest_version' LIMIT 1;
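The elapsed-time comparison in the first check can be sketched as a small shell helper. This is an illustration under stated assumptions, not part of OceanBase: the function name compaction_looks_blocked, its arguments, and the default factor of 3 are hypothetical, and all times are assumed to have been converted to seconds beforehand.

```shell
#!/bin/sh
# Hypothetical helper: decide whether the running major compaction has taken
# much longer than usual. All arguments are assumptions made for illustration:
#   $1  start time of the current major compaction (epoch seconds)
#   $2  current time (epoch seconds)
#   $3  average duration of previous major compactions (seconds)
#   $4  optional factor; "far longer" is read as more than factor * average
#       (defaults to 3, an arbitrary choice)
compaction_looks_blocked() {
    start=$1
    now=$2
    avg=$3
    factor=${4:-3}
    elapsed=$((now - start))
    if [ "$elapsed" -gt $((avg * factor)) ]; then
        echo "suspect: elapsed ${elapsed}s exceeds ${factor}x average ${avg}s"
        return 0
    fi
    echo "ok: elapsed ${elapsed}s within ${factor}x average ${avg}s"
    return 1
}
```

For example, compaction_looks_blocked 1000 20000 3600 prints "suspect: elapsed 19000s exceeds 3x average 3600s" and returns 0. A "suspect" result only suggests a closer look; changes in the system or in traffic still have to be ruled out manually.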
Emergency procedure
The major compaction of a cluster can be stuck due to the following causes:
Manual major compaction is enabled. Execute the following statement. If the return value is true, the major compaction is not automatically triggered, and you need to set the value to false to disable manual major compaction.
obclient> show parameters like 'enable_manual_merge';
A hardware failure of an OBServer node.
Perform the following steps to continue the major compaction:
Make sure that the faulty OBServer node is in the inactive state.
Set the permanent offline time (server_permanent_offline_time) to a smaller value. After the faulty OBServer node is automatically removed, the major compaction can proceed.
obclient> alter system set server_permanent_offline_time ='xxx';
The disk is full. You can clear or migrate data to release some disk space. For more information, see Full usage of the OBServer data disk.
The remote procedure call (RPC) requests fail due to exhausted I/O resources or overloaded network interface controller (NIC) at an OBServer node. For more information, see High disk I/O on an OBServer node and Overloaded NIC on an OBServer node.
The major compaction fails because RootService is leaderless. For more information, see sys tenant or RootService exceptions.
To check whether RootService is leaderless, log on to the OBServer node that hosts RootService and run the following commands to search election.log and rootservice.log:
grep 'election_error_rs' election.log
grep 'ob_major_freeze_launcher.cpp' rootservice.log | grep 'failed'
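The two log searches can be wrapped in a small helper so that an empty result is easy to tell apart from a match. This is a sketch only: the function name check_rs_leaderless and the log directory argument are hypothetical, and only the grep patterns come from the commands above. The log location depends on the deployment, so it is passed in rather than assumed.

```shell
#!/bin/sh
# Hypothetical wrapper around the two grep searches.
#   $1  directory that contains election.log and rootservice.log
#       (assumption: both files live in the same directory)
# Prints the number of matching lines for each search and returns 0 if
# either search matched at least one line.
check_rs_leaderless() {
    log_dir=$1
    # grep -c prints the number of matching lines; "|| true" keeps the
    # function going when there are no matches or the file is missing.
    election_hits=$(grep -c 'election_error_rs' "$log_dir/election.log" 2>/dev/null || true)
    freeze_hits=$(grep 'ob_major_freeze_launcher.cpp' "$log_dir/rootservice.log" 2>/dev/null | grep -c 'failed' || true)
    echo "election_error_rs lines: ${election_hits:-0}"
    echo "failed major freeze launcher lines: ${freeze_hits:-0}"
    [ "${election_hits:-0}" -gt 0 ] || [ "${freeze_hits:-0}" -gt 0 ]
}
```

Call it with the directory that holds the two logs, for example check_rs_leaderless /path/to/log, where the path is a placeholder. A return value of 0 only means that matching lines exist; the lines themselves still need to be read to confirm that RootService is leaderless.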