ob_server_cannot_connect_arbitration

2025-03-26 07:47:21  Updated

Description

OBServer nodes in the OceanBase cluster fail to communicate with the arbitration service. This alert is valid only in a cluster of OceanBase Database V4.2.0 or later that is associated with the arbitration service.

Principle

Parameter Value
Metric arbitration_status_inactive_server_count
Source Connect to OceanBase Database and perform this query: select /*+ MONITOR_AGENT */ count(1) as inactive_server_count from GV$OB_ARBITRATION_SERVICE_STATUS where status = 'INACTIVE'
Collected metric arbitration_status_inactive_server_count
Metric expression max(arbitration_status_inactive_server_count{@LABELS}) by (@GBLABELS)
Collection cycle 60 seconds

Alert rule

Metric expression Metric description Default threshold Detection cycle Time before clearance
arbitration_status_inactive_server_count > 0 The OceanBase cluster contains OBServer nodes that cannot communicate with the arbitration service. 0 60 seconds 5 minutes

Alert information

Trigger method Alert level Scope
The network between an OBServer node and the arbitration service fails. Critical Cluster

Alert templates

  • Overview

    • Template: ${alarm_target} ${alarm_name}
    • Example: ob_cluster=TEST. The OceanBase cluster contains OBServer nodes that cannot communicate with the arbitration service.
  • Details

    • Template: Cluster: ${ob_cluster_name}. Alert: ${alarm_name}. The ${ob_cluster_name} cluster contains ${value_shown} OBServer nodes that cannot communicate with the arbitration service.
    • Example: Cluster: TEST. Alert: The OceanBase cluster contains OBServer nodes that cannot communicate with the arbitration service. The TEST cluster contains 2 OBServer nodes that cannot communicate with the arbitration service.

Impact on the system

If an OBServer node fails to communicate with the arbitration service, the arbitration feature may be abnormal and cannot provide the high availability service for tenants with two or four full-featured replicas. When half of the replicas are unavailable in a tenant with the arbitration service enabled, service exceptions occur in the tenant.

Possible causes

  1. The network between an OBServer node and the arbitration service fails.
  2. The arbitration service process is abnormal.

Suggested solutions

  1. Connect to the cluster as the root@sys user. Execute the following SQL statement to check for OBServer nodes in the INACTIVE state.

    SELECT svr_ip, svr_port, arbitration_service_address, status FROM GV$OB_ARBITRATION_SERVICE_STATUS;
    
  2. Check the network between these OBServer nodes and the arbitration service.

  3. Check whether the arbitration service process runs properly.

Contact Us