Alert description
This alert is triggered if the synchronization status of the standby cluster is abnormal.
Alert principle
| Parameter | Value |
|---|---|
| Monitoring Metrics | standby_cluster_restore_status_code: The status code of the standby cluster. Valid values: 0 indicates that the standby cluster is synchronizing normally; 1 indicates that the synchronization status of the standby cluster is abnormal, which triggers an alert. |
| Monitoring Expression | sum(standby_cluster_restore_status_code{@LABELS}) by (@GBLABELS) |
| Metric Collection | standby_cluster_restore_status_code |
| Metric Source | Checks the synchronization_status field of the standby cluster in V$OB_STANDBY_STATUS. |
| Collection Cycle | 5 Seconds |
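The relationship between the synchronization_status field and the metric value can be sketched as follows. This is a minimal illustration, not OCP's collector code: the function name `restore_status_code` is hypothetical, and the sketch assumes, in line with the troubleshooting guidance in this topic, that `OK` is the only value treated as normal.

```python
def restore_status_code(synchronization_status: str) -> int:
    """Return 0 for a normal standby cluster and 1 for an abnormal one.

    Assumption: only the value "OK" counts as normal. Any other value,
    for example "NOT AVAILABLE", is treated as abnormal.
    """
    return 0 if synchronization_status.strip() == "OK" else 1


if __name__ == "__main__":
    for status in ("OK", "NOT AVAILABLE"):
        print(status, "->", restore_status_code(status))
```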
Rule information
| Monitoring Expression | Default Threshold | Duration | Detection Cycle | Elimination Cycle |
|---|---|---|---|---|
| standby_cluster_restore_status_code | 1 | 0 Seconds | 60 Seconds | 5 Minutes |
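How the monitoring expression and the default threshold fit together can be sketched as follows. This is a hedged illustration only: the helper names `sum_by` and `firing_groups` are hypothetical, the label names are borrowed from the alert template variables, and the comparison `sum >= threshold` is an assumption, because this topic does not state the operator OCP uses.

```python
from collections import defaultdict

THRESHOLD = 1  # default threshold from the rule table


def sum_by(samples, group_labels):
    """Sum sample values per label group, like `sum(metric) by (labels)`."""
    totals = defaultdict(int)
    for labels, value in samples:
        key = tuple(labels.get(name, "") for name in group_labels)
        totals[key] += value
    return dict(totals)


def firing_groups(samples, group_labels, threshold=THRESHOLD):
    """Return the groups whose summed value reaches the threshold.

    Assumption: the rule fires when the sum is >= threshold.
    """
    totals = sum_by(samples, group_labels)
    return sorted(key for key, total in totals.items() if total >= threshold)


# Hypothetical samples: one abnormal standby cluster and one normal one.
samples = [
    ({"ob_cluster_name": "cluster01", "ob_cluster_id": "5"}, 1),
    ({"ob_cluster_name": "cluster02", "ob_cluster_id": "6"}, 0),
]
print(firing_groups(samples, ["ob_cluster_name", "ob_cluster_id"]))
```

The sketch covers only the threshold comparison. The duration, detection cycle, and elimination cycle are scheduling settings applied by OCP and are not modeled here.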
Alert information
| Alert Trigger Method | Alert Level | Scope |
|---|---|---|
| Based on monitoring metric expressions | Warning | Cluster |
Alert template
Alert overview
- Template: ${alarm_target} ${alarm_name}
- Example: alarm_template_id=0:ob_cluster=cluster01-5 Synchronization status of the standby cluster is abnormal.
Alert details
- Template: OceanBase cluster ${ob_cluster_name}:${ob_cluster_id} has an abnormal synchronization status. The current synchronization status is ${cluster_synchronization_status}.
- Example: OceanBase cluster cluster01:5 has an abnormal synchronization status. The current synchronization status is NOT AVAILABLE.
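The way the `${...}` variables in the details template are filled in can be illustrated with Python's `string.Template`, which happens to use the same placeholder syntax. This is only a sketch of the substitution, not OCP's rendering code, and the function name `render_details` is hypothetical.

```python
from string import Template

# Template text copied from the alert details template in this topic.
DETAILS_TEMPLATE = Template(
    "OceanBase cluster ${ob_cluster_name}:${ob_cluster_id} has an abnormal "
    "synchronization status. The current synchronization status is "
    "${cluster_synchronization_status}."
)


def render_details(ob_cluster_name, ob_cluster_id, cluster_synchronization_status):
    """Fill in the three template variables and return the alert text."""
    return DETAILS_TEMPLATE.substitute(
        ob_cluster_name=ob_cluster_name,
        ob_cluster_id=ob_cluster_id,
        cluster_synchronization_status=cluster_synchronization_status,
    )


print(render_details("cluster01", 5, "NOT AVAILABLE"))
```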
Alert recovery
- Template: Alert: ${alarm_name}, Is the standby cluster synchronization status normal: ${value_down}
- Example: Alert: Synchronization status exception in the standby cluster. Is the synchronization status of the standby cluster normal? 0
Impact on the system
The impact varies depending on the cause:
- Inconsistency between OCP MetaDB and OceanBase cluster information may affect subsequent OCP maintenance operations.
- If synchronization fails due to an OceanBase cluster failure, user services may be affected.
Possible causes
- The IDC/Region information maintained by the OceanBase cluster is inconsistent with that maintained by the OCP MetaDB.
- A failure in the OceanBase cluster prevents synchronization. In this case, the ob_cluster_status_check_failed alert is also triggered.
Solution
Troubleshoot the issue based on the failure cause specified in the alert. Once the fault is resolved, cluster synchronization in OCP should be restored.
Run the following command to check the synchronization status of the standby cluster. If the returned synchronization_status value is OK, synchronization is working; otherwise, synchronization has failed and an alert is triggered.

    SELECT cluster_id, cluster_name, cluster_role, cluster_status, current_scn, rootservice_list, synchronization_status FROM V$OB_STANDBY_STATUS;

Note
The synchronization_status field is supported only in OceanBase Database V2.2.77 and later.

Confirm whether the synchronization failure is caused by an OceanBase cluster fault.
In this case, OCP also reports the ob_cluster_status_check_failed alert. First follow the troubleshooting method for the ob_cluster_status_check_failed (OceanBase cluster status check failed) alert. Then wait 5 minutes and check whether this alert is still reported.
- If the alert continues to be reported, proceed to the next step.
- If the alert is cleared, the issue is resolved.
Check whether the IDC/Region information of zones in the OceanBase cluster exists in the OCP MetaDB, and whether the association between IDCs and regions in the cluster is consistent with that in the OCP MetaDB. You can refer to the following information for the check.
OCP maintains host, region, and IDC information in the following tables:
- compute_host: hosts
- compute_idc: IDCs
- compute_region: regions
Run the following command to query all region/IDC information in the OCP MetaDB:

    -- Query IDC and region information maintained by OCP MetaDB
    SELECT r.`name` AS region, i.`name` AS idc
    FROM compute_region r
    JOIN compute_idc i ON r.id = i.region_id;

Run the following command to query the IDC/region information of zones in the OceanBase cluster:

    -- Query IDC and region information of zones in an OceanBase cluster
    SELECT `zone`,
           MAX(CASE `name` WHEN 'region' THEN `info` ELSE '' END) `region`,
           MAX(CASE `name` WHEN 'idc' THEN `info` ELSE '' END) `idc`
    FROM oceanbase.__all_zone
    WHERE `zone` <> ''
    GROUP BY `zone`;

If the region/IDC association maintained in the OCP MetaDB is inconsistent with that of the zones in the OceanBase cluster, proceed to the next step.
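The comparison between the two query results can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `find_mismatched_zones` and the sample zone names are hypothetical, the rows are assumed to be already fetched as tuples, and zones with an empty IDC are skipped because an empty IDC does not affect cluster synchronization.

```python
def find_mismatched_zones(ocp_pairs, zone_rows):
    """Return zones whose region/IDC pair is not known to the OCP MetaDB.

    ocp_pairs: iterable of (region, idc) rows from the OCP MetaDB query.
    zone_rows: iterable of (zone, region, idc) rows from the cluster query.
    Zones with an empty IDC are skipped (assumed not to need a match).
    """
    known = set(ocp_pairs)
    mismatched = []
    for zone, region, idc in zone_rows:
        if idc == "":
            continue
        if (region, idc) not in known:
            mismatched.append((zone, region, idc))
    return mismatched


# Hypothetical data: zone1 was moved to a region/IDC that OCP does not know.
ocp_pairs = [("old region", "old idc")]
zone_rows = [
    ("zone1", "SHANGHAI", "zue"),
    ("zone2", "old region", "old idc"),
    ("zone3", "SHANGHAI", ""),
]
print(find_mismatched_zones(ocp_pairs, zone_rows))
```

Any zone returned by the function has a region/IDC association that needs to be synchronized to the OCP MetaDB.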
Note
An empty IDC for a zone in the OceanBase cluster does not affect cluster synchronization.

The IDCs and regions of zones in an OceanBase cluster are logical divisions. When the IDC/region information in an OceanBase cluster changes, you must synchronize the change to the OCP MetaDB manually. The following values are used as an example:
- The IDC name before modification is old idc, and the name after modification is zue.
- The region name before modification is old region, and the name after modification is SHANGHAI.
The procedure is as follows:
Log in to the ocp_meta database as the root user of the ocp_meta tenant of the OCP MetaDB cluster (the cluster name is obcluster by default), and check whether records for the modified names already exist.
Query whether a region named SHANGHAI exists:

    select * from compute_region where name='SHANGHAI';

Query whether an IDC named zue exists:

    select * from compute_idc where name='zue';

If they do not exist, run the following commands to add the region and IDC; if they exist, proceed to the next step.

    -- Add Region
    insert into compute_region (`name`, `description`) values ('SHANGHAI', 'SHANGHAI region');
    -- Add IDC
    insert into compute_idc (`name`, `description`, `region_id`) select 'zue', 'zue idc', id from compute_region where name='SHANGHAI';

Modify the region and IDC information of the OCP zone. Replace the placeholders with your zone name and the ID of your cluster in OCP.

    update ob_zone set idc_id = (select id from compute_idc where name='zue') where name='<your zone name>' and cluster_id=<your cluster id in ocp>;

Modify the region and IDC information of the OCP compute_host. Replace the placeholder with the IP addresses of the affected hosts.

    update compute_host set idc_id = (select id from compute_idc where name='zue') where inner_ip_address in (<ips of the hosts>);
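The whole procedure can be rehearsed against a throwaway database before touching the OCP MetaDB. The sketch below is an illustration only: it uses an in-memory SQLite database rather than OceanBase, the schema contains only the columns named in this topic (the real OCP tables have more), and the function name `sync_region_idc`, the zone name, and the IP address are hypothetical.

```python
import sqlite3


def sync_region_idc(conn, region, idc, zone_name, cluster_id, host_ips):
    """Add the region and IDC if missing, then repoint the zone and hosts."""
    cur = conn.cursor()
    # Add the region only if it does not exist yet.
    if cur.execute("SELECT 1 FROM compute_region WHERE name = ?", (region,)).fetchone() is None:
        cur.execute(
            "INSERT INTO compute_region (name, description) VALUES (?, ?)",
            (region, f"{region} region"),
        )
    # Add the IDC only if it does not exist yet, linked to the region.
    if cur.execute("SELECT 1 FROM compute_idc WHERE name = ?", (idc,)).fetchone() is None:
        cur.execute(
            "INSERT INTO compute_idc (name, description, region_id) "
            "SELECT ?, ?, id FROM compute_region WHERE name = ?",
            (idc, f"{idc} idc", region),
        )
    # Point the zone at the new IDC.
    cur.execute(
        "UPDATE ob_zone SET idc_id = (SELECT id FROM compute_idc WHERE name = ?) "
        "WHERE name = ? AND cluster_id = ?",
        (idc, zone_name, cluster_id),
    )
    # Point the affected hosts at the new IDC (skipped when no IPs are given).
    if host_ips:
        placeholders = ",".join("?" for _ in host_ips)
        cur.execute(
            "UPDATE compute_host SET idc_id = (SELECT id FROM compute_idc WHERE name = ?) "
            f"WHERE inner_ip_address IN ({placeholders})",
            (idc, *host_ips),
        )
    conn.commit()


# Stand-in schema and data (assumptions, not the real OCP MetaDB schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE compute_region (id INTEGER PRIMARY KEY, name TEXT, description TEXT);
    CREATE TABLE compute_idc (id INTEGER PRIMARY KEY, name TEXT, description TEXT, region_id INTEGER);
    CREATE TABLE ob_zone (name TEXT, cluster_id INTEGER, idc_id INTEGER);
    CREATE TABLE compute_host (inner_ip_address TEXT, idc_id INTEGER);
    INSERT INTO compute_region (name, description) VALUES ('old region', '');
    INSERT INTO compute_idc (name, description, region_id) VALUES ('old idc', '', 1);
    INSERT INTO ob_zone VALUES ('zone1', 5, 1);
    INSERT INTO compute_host VALUES ('10.0.0.1', 1);
    """
)
sync_region_idc(conn, "SHANGHAI", "zue", "zone1", 5, ["10.0.0.1"])
print(conn.execute("SELECT name, idc_id FROM ob_zone").fetchall())
```

Because the inserts are guarded by existence checks, running the function a second time with the same arguments does not create duplicate region or IDC rows.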
