If the clog disk is full, the cluster may experience elections that produce no leader, write failures, and missing replicas.
Scenario
The data write fails because the clog disk is full.
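To confirm that the clog disk is the cause, you can compare the disk usage with the threshold. The following Python code is a minimal sketch, not an OceanBase tool. The function names are hypothetical, and the 95% default is taken from the threshold described in this topic. The value computed here may differ slightly from the value that the OBServer calculates internally.

```python
import shutil


def clog_disk_usage_percent(path: str) -> float:
    """Return the used share, in percent, of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def exceeds_limit(used_percent: float, limit_percent: float = 95.0) -> bool:
    """Return True when the usage has reached the configured limit.

    Assumption: 95.0 mirrors the default threshold mentioned in this topic.
    """
    return used_percent >= limit_percent
```

For example, call `exceeds_limit(clog_disk_usage_percent("/data/log1/"))` on the affected OBServer, assuming that the clogs are stored in the default directory.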
Emergency procedure
To release the space of the clog disk without impairing data consistency in the cluster, perform the following steps:
Stop business writes to the cluster. Otherwise, the space temporarily released from the clog disk can quickly be used up again.
Raise the clog disk usage threshold from 95% to 98%:
ALTER SYSTEM SET clog_disk_usage_limit_percentage = 98 SERVER = '[IP address]:2882';
Observe the performance. In most cases, you can resume business writes to the cluster after the clog is synchronized. To check whether the clog is synchronized and whether partitions are being rebuilt, execute the following SQL statements. If both queries return no rows (that is, a count of 0 on every server), all replicas are synchronized:
SELECT svr_ip, count(*) FROM __all_virtual_clog_stat WHERE is_offline = 0 AND is_in_sync = 0 GROUP BY 1;
SELECT svr_ip, count(*) FROM __all_virtual_partition_migration_status WHERE action != 'END' GROUP BY 1;
If the clog disk is fully occupied again, you can manually migrate some of the earliest clog data on the most lagged node to a temporary space. By default, clogs are stored in the /data/log1/ directory. We recommend that you migrate 100 clog files and observe the performance. If the issue persists, contact OceanBase Technical Support.
Notice
Perform the preceding operations within a single OBServer only. Do not move clogs from one OBServer to another.
Do not delete the clogs. Put them in a temporary directory.
After writes to the cluster are resumed, you can delete the clogs in the temporary directory. If the issue persists, contact OceanBase Technical Support.
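The manual migration of the earliest clog files can be sketched in Python as follows. This is a minimal illustration, not an OceanBase tool. The function name `move_oldest_clog_files` is hypothetical, and the sketch assumes that clog files are named with plain integers in which a lower number means an older file. Verify this against the actual directory contents before you use anything like it. In line with the notices above, the files are moved and never deleted.

```python
import re
import shutil
from pathlib import Path
from typing import List


def move_oldest_clog_files(clog_dir: str, temp_dir: str, count: int = 100) -> List[str]:
    """Move the `count` oldest clog files from clog_dir into temp_dir.

    Assumption: clog files have purely numeric names, and a lower number
    means an older file. Files with other names are left untouched.
    Returns the names of the files that were moved, oldest first.
    """
    src = Path(clog_dir)
    dst = Path(temp_dir)
    dst.mkdir(parents=True, exist_ok=True)

    # Keep only regular files whose names are ASCII digits, ordered numerically
    # so that "9" sorts before "10".
    numbered = sorted(
        (p for p in src.iterdir() if p.is_file() and re.fullmatch(r"[0-9]+", p.name)),
        key=lambda p: int(p.name),
    )

    moved = []
    for p in numbered[:count]:
        # Move, do not delete: the files stay available in temp_dir.
        shutil.move(str(p), str(dst / p.name))
        moved.append(p.name)
    return moved
```

For example, `move_oldest_clog_files("/data/log1/", "/data/clog_tmp/")` moves up to 100 files. The path `/data/clog_tmp/` is a placeholder for a temporary directory on the same OBServer.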