Description
Query the major compaction start time of a cluster and compare it with the current time. Both values are returned in microseconds.
Statement
OceanBase Database of a version earlier than V4.0.0:
select zone, name, value, time_to_usec(now()) from __all_zone where name = 'merge_start_time';
OceanBase Database V4.0.0 and later:
select tenant_id, time_to_usec(start_time) as start_time, time_to_usec(now()) as now from CDB_OB_MAJOR_COMPACTION;
Troubleshooting method
OceanBase Cloud Platform (OCP) generates an alert if no major compaction task is triggered within the specified interval, which is 108,000s (30 hours) by default. RootService of an OceanBase cluster freezes data regularly. If RootService does not freeze or compact the data for a long time, disk usage keeps increasing. When the disk is used up, business data can no longer be written to the cluster.
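The comparison described above can be sketched in Python as follows. This is a minimal illustration, not OCP's actual alert logic: the function names are hypothetical, both inputs are assumed to be the microsecond values returned by the statements in this topic, and the strict "greater than" comparison against the 108,000s default is an assumption.

```python
USEC_PER_SEC = 1_000_000
# Default alert interval quoted in this topic (108,000 s = 30 hours).
DEFAULT_THRESHOLD_S = 108_000


def seconds_since_compaction(start_time_usec: int, now_usec: int) -> float:
    """Return the seconds elapsed between the compaction start time and now.

    Both arguments are microsecond timestamps, as produced by time_to_usec().
    """
    return (now_usec - start_time_usec) / USEC_PER_SEC


def is_compaction_overdue(
    start_time_usec: int,
    now_usec: int,
    threshold_s: int = DEFAULT_THRESHOLD_S,
) -> bool:
    """Return True if the elapsed time exceeds the threshold.

    Hypothetical helper: whether OCP uses > or >= is an assumption here.
    """
    return seconds_since_compaction(start_time_usec, now_usec) > threshold_s
```

For example, passing the `start_time` and `now` columns of one row to `is_compaction_overdue` indicates whether that tenant has gone longer than the interval without a major compaction.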
Check whether the daily major compaction feature is enabled for a cluster.
To do so, log on to the OCP console. On the Overview page of the cluster, choose Compaction Management > Configuration for Major Compaction and check whether Daily Major Compaction Time is specified in Major Compaction Strategy.
- If the daily major compaction time is not specified, major compaction is not triggered automatically. You need to manually initiate a major compaction.
- If the daily major compaction time is specified, go to the next step.
Check whether the RootService issue is caused by a process exception.
An observer process exception also triggers the ob_cluster_exists_inactive_server alert at the same time. You can handle the exception based on that alert. For more information about the alert, see ob_cluster_exists_inactive_server. If the alert is triggered again within 5 minutes after you handle the exception, go to the next step.
Check whether RootService is leaderless.
Execute the following statement in the sys tenant to query the __all_virtual_core_meta_table table:
SELECT svr_ip, zone, role, member_list FROM __all_virtual_core_meta_table;
If the connection is normal but the query fails, RootService is leaderless. You can restart the OBServer node where RootService is deployed or restart all OBServer nodes specified by the rootservice_list parameter.
Run the following commands to obtain the IP addresses of the corresponding OBServer nodes:
-- Connect to the sys tenant to run the following queries.
-- Query the OBServer node where RootService is deployed.
SELECT zone, svr_ip, svr_port FROM __all_server WHERE with_rootserver=1;
-- Query the rootservice_list parameter.
-- OBServer nodes specified by the rootservice_list parameter are separated with semicolons (;). Each part represents an OBServer node.
SELECT DISTINCT `value` AS rootservice_list FROM __all_virtual_sys_parameter_stat WHERE `name` = 'rootservice_list';
On the Overview page of the cluster, find the target OBServer node in the OBServer node list and then click Restart in the Actions column.
Note
If you select **Force Restart**, the OBServer node is restarted by ending the observer process. Select this option when no response is received from the majority of nodes. If you restart the OBServer node in that situation, some replicas may be leaderless, and you need to wait 15 minutes before you can connect to the sys tenant. The same applies to the business tenants.
If the OBServer node is not restored 15 minutes after the restart or force restart, repeat Step 3.
If the leader is available, go to Step 4.
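When several nodes are listed in rootservice_list, it can help to split the value into individual nodes before restarting them. The following Python sketch does this. The semicolon separator comes from this topic. The `ip:rpc_port:sql_port` layout of each entry is an assumption, and `RootServiceNode` and `parse_rootservice_list` are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple


class RootServiceNode(NamedTuple):
    ip: str
    rpc_port: int
    sql_port: int


def parse_rootservice_list(value: str) -> List[RootServiceNode]:
    """Split a rootservice_list value into one record per OBServer node.

    Entries are separated with semicolons (;). Each entry is assumed to
    look like "ip:rpc_port:sql_port"; an entry that does not match this
    layout raises ValueError.
    """
    nodes = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # Skip empty parts, such as after a trailing semicolon.
        # Split from the right so that only the last two colons are used.
        ip, rpc_port, sql_port = entry.rsplit(":", 2)
        nodes.append(RootServiceNode(ip, int(rpc_port), int(sql_port)))
    return nodes
```

The resulting IP addresses can then be matched against the OBServer node list on the Overview page of the cluster.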
Collect the logs of the OBServer node and contact OCP Technical Support for help.