ob_cluster_sync_failed

2023-08-15 02:30:56  Updated

Description

OceanBase Cloud Platform (OCP) tries to connect to the OceanBase cluster once every 60s to synchronize the metadata of the OceanBase cluster to OCP. When synchronization fails, this alert is triggered.

Principle

The synchronization task is executed once every minute as per the following steps:

  1. OCP-Server checks whether the OceanBase cluster can be connected.

    • If yes, it proceeds to Step 2.

    • Otherwise, it ends the process but does not trigger the alert.

  2. If the status of the OceanBase cluster recorded in the OCP console is Unavailable, OCP-Server tries to update the status to Running.

    • If the update fails, it triggers the alert.

    • If the update is successful, it proceeds to Step 3.

  3. Other information about the OceanBase cluster is also synchronized, such as zone, cluster, and hosts. If the synchronization fails, the alert is triggered.

Alert rule

Metric Default threshold Duration Detection cycle Time before clearance
None None 0 seconds 60 seconds 5 minutes

Alert information

Trigger method Alert level Scope
Timed task of OCP Critical Cluster

Alert templates

  • Overview: ${alarm_target} ${alarm_name}

  • Details: ${alarm_target} ${alarm_name}. Cause of failure: ${failed_reason}

  • Overview example: ob_cluster=C1-1000. OceanBase cluster synchronization failed.

  • Details example: ob_cluster=C1-1000. OceanBase cluster synchronization failed. Cause of failure: some reason

Impact on the system

The impact varies with the cause of the failure.

  • If the OCP metadata is polluted, subsequent OCP O&M operations may be affected.

  • If the synchronization failure was caused by an OceanBase cluster error, user services may be affected.

Possible causes

  • If the OCP metadata is polluted, the metadata is inconsistent with that of the OceanBase cluster. For example, the IDC and region information maintained by the OceanBase cluster is inconsistent with that maintained on the OCP-Server.

  • If the synchronization failure is caused by an OceanBase cluster error, the ob_cluster_status_check_failed alert is also generated.

Suggested solutions

Solve the problem based on the cause of failure indicated in the alert. After the problem is solved, restore cluster synchronization in the OCP console.

  1. Check whether the synchronization failure is caused by an OceanBase cluster error.

    If yes, the ob_cluster_status_check_failed alert is also triggered.

    You can first solve the problem and check whether the alert of this topic is cleared after five minutes. For more information, see ob_cluster_status_check_failed.

    • If the alert persists, go to the next step.

    • Otherwise, the problem is solved.

  2. Check whether the metadata of the OCP is polluted.

    Check whether the IDC and region of the zone in the OceanBase cluster exists in the OCP metadata, and whether the association of the IDC and region is consistent with the OCP metadata by referring to the following information:

    • The OCP metadata includes the following information:

      • compute_host

      • compute_idc

      • compute_region

    • Run the following SQL command to query all IDC and region information maintained by the OCP MetaDB:

      -- Query the IDC and region information maintained by the OCP MetaDB.
      SELECT r.`name` AS region, i.`name` AS idc 
       FROM compute_region r JOIN compute_idc i 
       ON r.id=i.region_id;
      
      
    • Run the following SQL command to query the IDC and region information of a zone maintained by the OceanBase cluster:

      -- Query the IDC and region information of a zone maintained by the OceanBase cluster.
      SELECT `zone`,
        MAX(CASE `name` WHEN 'region' THEN `info` ELSE '' END ) `region`, 
        MAX(CASE `name` WHEN 'idc' THEN `info` ELSE '' END ) `idc`       
       FROM oceanbase.__all_zone WHERE `zone` <> '' GROUP BY `zone` ;
      

Note

If the IDC and region maintained by the OCP MetaDB are inconsistent with those of the zone maintained by the OceanBase cluster, the cluster synchronization will fail. If the IDC of the zone in the OceanBase cluster is set to empty, the cluster synchronization will not be affected.

 If the metadata is polluted, contact OCP Technical Support for help.

Contact Us