Description
The OceanBase Cloud Platform (OCP) server monitors the synchronization status of a standby tenant during the synchronization with its primary tenant. If the synchronization status of the standby tenant is abnormal, this alert is generated. The alert level varies with the abnormal synchronization status. If the synchronization status of the standby tenant is Log Missing or Log Conflict, a critical alert is generated. If the synchronization status of the standby tenant is ABNORMAL, a warning alert is generated.
Principle
| Parameter | Value |
|---|---|
| Metric | standby_tenant_restore_status_code |
| Source | The OCP server queries the internal table V$OB_LS_LOG_RESTORE_STATUS in OceanBase Database to obtain the synchronization status of each log stream of a tenant and summarizes the synchronization status of the current tenant based on the synchronization status of all log streams. |
| Collected metric | standby_tenant_restore_status_code |
| Metric expression | sum(standby_tenant_restore_status_code{@LABELS}) by (@GBLABELS) |
| Collection cycle | 30 seconds |
Alert rule
| Alert rule expression | Metric description | Default threshold | Detection cycle | Time before clearance |
|---|---|---|---|---|
| standby_tenant_restore_status_code == 1 | The synchronization status of the standby tenant is ABNORMAL. | 0 | 60 seconds | 5 minutes |
| standby_tenant_restore_status_code == 2 | The synchronization status of the tenant is Log Missing or Log Conflict. | 0 | 60 seconds | 5 minutes |
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Based on the expression of the metric | Warning or critical | Tenant |
Alert templates
Overview
- Template: ${alarm_target} ${alarm_name}
- Sample: alarm_template_id=0:ob_cluster=clusterA-1:tenant_name=tenant_standby The synchronization status of the standby tenant is abnormal.
Details
- Template: The synchronization status of the standby tenant ${tenant_name} of the OceanBase cluster ${ob_cluster_name} is abnormal. The current synchronization status is ${tenant_restore_status}.
- Sample: The synchronization status of the standby tenant tenant_standby of the OceanBase cluster clusterA is abnormal. The current synchronization status is ABNORMAL.
Impact on the system
If the synchronization status of a standby tenant is abnormal, data synchronization with its primary tenant is interrupted, resulting in data inconsistency between the primary and standby tenants.
Possible causes
- Data archiving stops in the primary tenant.
- The OceanBase Database version of the primary tenant is later than that of the standby tenant.
- The primary tenant has been deleted.
- The network connection between the primary and standby tenants or between the standby tenant and the archive address is abnormal.
Solutions
- If the synchronization status of the standby tenant is Log Missing or Log Conflict, data synchronization fails to continue in the standby tenant. In this case, drop and rebuild the standby tenant.
- If the synchronization status of the standby tenant is ABNORMAL, check the primary-standby relationship topology of tenants to view the synchronization status of each log stream of the tenant, and fix abnormal log streams as prompted.