ob_cluster_merge_error

2023-08-22 02:46:05  Updated

Description

This alert is triggered when an error occurs while compacting OceanBase clusters.

Principle

The following table describes the key parameters that are involved in the monitoring and alerting logic.

Parameter Value
Metric ob_cluster_merge_error_flag
Source SQL: select zone, name, value, time_to_usec(now()) from __all_zone;
Collected metric zone_value
Metric expression max(zone_value{metric_group="all_zone",name="is_merge_error",@LABELS}) by (@GBLABELS)
Collection cycle 1 second

If a major compaction request, initiated automatically or manually, fails, the corresponding flag bit is set and the major compaction status is updated in the __all_zone table. The value of the value field in the__all_zone table is used as the value of the collected metric, and the values of other fields are used as labels.

The value of the metric indicates the cluster compaction status. When the value is 0, the cluster compaction is normal. When the value is 1, an error occurred while compacting the cluster, and the alert is triggered.

Alert rule

Metric Default threshold Duration Detection cycle Time before clearance
ob_cluster_merge_error_flag 1 0 seconds 60 seconds 5 minutes

Alert information

Trigger method Alert level Scope
Metric expression Critical Cluster

Alert templates

  • Overview: ${alarm_target} ${alarm_name}

  • Details: Cluster: ${ob_cluster_name}, Alert: ${alarm_name}

  • Overview example: ob_cluster=C1-1000. OceanBase cluster compaction failed.

  • Details example: Cluster: ob_cluster=C1-1000, Alert: OceanBase cluster compaction failed.

Impact on the system

  • Replicas of the OceanBase cluster are inconsistent. For example, data between replicas is inconsistent, or the index table is different from the primary table.

  • Switchover cannot be performed between the primary and standby OceanBase clusters to avoid spreading the error.

Possible causes

Technically, only some extreme cases may cause this error.

Suggested solution

For more information about how to troubleshoot this error, see Exception handling for OceanBase cluster compaction.

Improper handling may cause data inconsistency. If you do not have professional knowledge, contact OCP Technical Support for help.

Contact Us