standby_tenant_sync_delay_too_long

2024-08-23 10:14:19  Updated

Description

The OceanBase Cloud Platform (OCP) server monitors the synchronization latency between a standby tenant and its primary tenant. If the synchronization latency of the standby tenant is greater than 600 seconds, this alert is generated.

Principle

Parameter Value
Metric standby_tenant_delay_seconds
Source The OCP server queries the GV$OB_LOG_STAT and DBA_OB_TENANTS tables in OceanBase Database for the current data savepoints of a standby tenant and its primary tenant and obtains the synchronization latency of the standby tenant based on the subtraction result of the data savepoints.
Collected metric standby_tenant_delay_seconds
Metric expression sum(standby_tenant_delay_seconds{@LABELS}) by (@GBLABELS)
Collection cycle 30 seconds

Alert rule

Alert rule expression Metric description Default threshold Detection cycle Time before clearance
standby_tenant_delay_seconds >= 600 Synchronization latency between a standby tenant and its primary tenant 600 seconds 60 seconds 5 minutes

Alert information

Trigger method Alert level Scope
Based on the expression of the metric Warning Tenant

Alert templates

  • Overview

    • Template: ${alarm_target} ${alarm_name}
    • Sample: alarm_template_id=0:ob_cluster=clusterB-1:tenant_name=tenant_standby The standby tenant has a high synchronization latency.
  • Details

    • Template: The synchronization latency of the standby tenant ${tenant_name} of the OceanBase cluster ${ob_cluster_name} is high. The current synchronization latency is ${value} seconds.
    • Sample: The synchronization latency of the standby tenant tenant_standby of the OceanBase cluster clusterB is high. The current synchronization latency is 2854.016 seconds.

Impact on the system

The data synchronized to the standby tenant is inconsistent with the data in the primary tenant.

Possible causes

  • The resource specifications of the standby tenant are too low.
  • The synchronization status of the standby tenant is abnormal.
  • The archive_lag_target parameter is set to an excessively large value for the primary tenant.

Solutions

  1. View the primary-standby relationship topology of the tenants and check whether the synchronization status of the standby tenant is abnormal. If yes, handle the synchronization status exception.
  2. View the unit specifications of the primary and standby tenants, and continuously observe the synchronization latency of the standby tenant. If the synchronization latency of the standby tenant remains high and its unit specifications are lower than those of the primary tenant, we recommend that you upgrade the unit specifications of the standby tenant or use the same unit specifications as those of the primary tenant.
  3. If the synchronization between the primary and standby tenants is based on archive files, check the value of the archive_lag_target parameter of the primary tenant. The parameter affects the archiving latency of the primary tenant and thereby the synchronization latency of its standby tenants. We recommend that you specify a reasonable value for the archive_lag_target parameter.

Contact Us