Description
The OceanBase Cloud Platform (OCP) server monitors the synchronization latency between a standby tenant and its primary tenant. By default, this alert is generated when the synchronization latency of the standby tenant reaches or exceeds 600 seconds.
Principle
| Parameter | Value |
|---|---|
| Metric | standby_tenant_delay_seconds |
| Source | The OCP server queries the GV$OB_LOG_STAT and DBA_OB_TENANTS views in OceanBase Database for the current data savepoints of a standby tenant and its primary tenant, and calculates the synchronization latency of the standby tenant as the difference between the two savepoints. |
| Collected metric | standby_tenant_delay_seconds |
| Metric expression | sum(standby_tenant_delay_seconds{@LABELS}) by (@GBLABELS) |
| Collection cycle | 30 seconds |
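The subtraction described in the Source row can be sketched as follows. This is a minimal illustration, not the OCP implementation: it assumes that each savepoint is available as a nanosecond-resolution timestamp, and the function name and sample values are hypothetical.

```python
NANOS_PER_SECOND = 1_000_000_000


def standby_delay_seconds(primary_savepoint_ns: int, standby_savepoint_ns: int) -> float:
    """Return how far the standby savepoint trails the primary savepoint, in seconds.

    Assumption: both savepoints are nanosecond timestamps. A standby that is
    momentarily reported ahead of the primary is clamped to zero.
    """
    lag_ns = max(primary_savepoint_ns - standby_savepoint_ns, 0)
    return lag_ns / NANOS_PER_SECOND


# Hypothetical savepoints: the standby trails the primary by 750 seconds.
primary_ns = 1_700_000_750_000_000_000
standby_ns = 1_700_000_000_000_000_000
delay = standby_delay_seconds(primary_ns, standby_ns)
print(delay)
```

With these sample values the computed latency is above the 600-second default threshold, so the alert rule would match.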
Alert rule
| Alert rule expression | Metric description | Default threshold | Detection cycle | Time before clearance |
|---|---|---|---|---|
| standby_tenant_delay_seconds >= 600 | Synchronization latency between a standby tenant and its primary tenant | 600 seconds | 60 seconds | 5 minutes |
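The interaction between the threshold, the detection cycle, and the time before clearance can be sketched as a small state machine. This is an illustration only and makes one explicit assumption about semantics that this topic does not spell out: an active alert is cleared once the rule expression has not matched for the full clearance period. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

THRESHOLD_SECONDS = 600        # default threshold from the alert rule
DETECTION_CYCLE_SECONDS = 60   # the rule is evaluated once per cycle
CLEARANCE_SECONDS = 5 * 60     # time before clearance


@dataclass
class AlertState:
    active: bool = False
    last_match_at: Optional[int] = None  # time of the latest matching evaluation


def evaluate(state: AlertState, now: int, delay_seconds: float) -> AlertState:
    """Evaluate one detection cycle and return the updated alert state."""
    if delay_seconds >= THRESHOLD_SECONDS:
        # The rule matches: raise the alert or keep it active.
        return AlertState(active=True, last_match_at=now)
    if state.active and now - state.last_match_at >= CLEARANCE_SECONDS:
        # Assumed semantics: clear after a full clearance period without a match.
        return AlertState(active=False, last_match_at=state.last_match_at)
    return state


# Hypothetical latency samples, one per detection cycle.
samples = [120.0, 650.0, 700.0, 30.0, 30.0, 30.0, 30.0, 30.0]
state = AlertState()
history = []
for i, delay in enumerate(samples):
    state = evaluate(state, i * DETECTION_CYCLE_SECONDS, delay)
    history.append(state.active)
print(history)
```

In this sketch the alert stays active for several cycles after the latency drops, and clears only at the evaluation that falls five minutes after the last match.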
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Based on the expression of the metric | Warning | Tenant |
Alert templates
Overview
- Template: ${alarm_target} ${alarm_name}
- Sample: alarm_template_id=0:ob_cluster=clusterB-1:tenant_name=tenant_standby The standby tenant has a high synchronization latency.
Details
- Template: The synchronization latency of the standby tenant ${tenant_name} of the OceanBase cluster ${ob_cluster_name} is high. The current synchronization latency is ${value} seconds.
- Sample: The synchronization latency of the standby tenant tenant_standby of the OceanBase cluster clusterB is high. The current synchronization latency is 2854.016 seconds.
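To show how the placeholders in the Details template map to the sample, the following sketch renders the template with Python's `string.Template`, which happens to use the same `${name}` placeholder syntax. How OCP itself renders templates is not described here, so treat this purely as an illustration; the values are taken from the sample above.

```python
from string import Template

# Details template copied from this topic.
DETAILS_TEMPLATE = Template(
    "The synchronization latency of the standby tenant ${tenant_name} "
    "of the OceanBase cluster ${ob_cluster_name} is high. "
    "The current synchronization latency is ${value} seconds."
)

# Substitute the placeholder values that appear in the documented sample.
details = DETAILS_TEMPLATE.substitute(
    tenant_name="tenant_standby",
    ob_cluster_name="clusterB",
    value="2854.016",
)
print(details)
```

`substitute` raises `KeyError` if a placeholder has no value, which makes a missing variable visible instead of leaving a raw `${...}` in the alert text.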
Impact on the system
The standby tenant lags behind its primary tenant, so the data synchronized to the standby tenant is inconsistent with the current data in the primary tenant.
Possible causes
- The resource specifications of the standby tenant are too low.
- The synchronization status of the standby tenant is abnormal.
- The `archive_lag_target` parameter is set to an excessively large value for the primary tenant.
Solutions
- View the primary-standby relationship topology of the tenants and check whether the synchronization status of the standby tenant is abnormal. If yes, handle the synchronization status exception.
- View the unit specifications of the primary and standby tenants, and continuously observe the synchronization latency of the standby tenant. If the synchronization latency of the standby tenant remains high and its unit specifications are lower than those of the primary tenant, we recommend that you upgrade the unit specifications of the standby tenant or use the same unit specifications as those of the primary tenant.
- If the synchronization between the primary and standby tenants is based on archive files, check the value of the `archive_lag_target` parameter of the primary tenant. The parameter affects the archiving latency of the primary tenant and thereby the synchronization latency of its standby tenants. We recommend that you specify a reasonable value for the `archive_lag_target` parameter.