Description
OceanBase cluster has nodes that cannot communicate with the arbitration service. This alert is only applicable to clusters after OceanBase V4.2.0, and it can only be triggered if the cluster is associated with an arbitration service.
Principle
| Parameter | Value |
|---|---|
| Monitoring metric | arbitration_status_inactive_server_count |
| Metric source | Query after connecting to OceanBase: select /*+ MONITOR_AGENT */ count(1) as inactive_server_count from GV$OB_ARBITRATION_SERVICE_STATUS where status = 'INACTIVE' |
| Metric collection | arbitration_status_inactive_server_count |
| Monitoring expression | max(arbitration_status_inactive_server_count{@LABELS}) by (@GBLABELS) |
| Metric collection cycle | 60 seconds |
Rule information
| Monitoring expression | Description of the monitoring metric | Default threshold | Detection cycle | Elimination cycle |
|---|---|---|---|---|
| arbitration_status_inactive_server_count > 0 | The cluster has nodes that cannot communicate with the arbitration service. | 0 | 60 seconds | 5 minutes |
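The monitoring expression and rule above can be read as two steps: take the largest sample in each label group, then compare it with the threshold. The following Python sketch illustrates that reading only. The sample layout, the `ob_cluster` label name, and the function names are assumptions made for this example and are not part of OCP.

```python
THRESHOLD = 0  # default threshold from the rule table


def max_by_group(samples, group_label="ob_cluster"):
    """Mimic max(metric{...}) by (label): keep the largest value per group."""
    grouped = {}
    for labels, value in samples:
        key = labels[group_label]
        grouped[key] = max(grouped.get(key, value), value)
    return grouped


def firing_groups(samples, threshold=THRESHOLD):
    """Return the groups whose aggregated value exceeds the threshold."""
    return {k: v for k, v in max_by_group(samples).items() if v > threshold}


# Hypothetical samples: (labels, arbitration_status_inactive_server_count)
samples = [
    ({"ob_cluster": "TEST"}, 0),
    ({"ob_cluster": "TEST"}, 2),
    ({"ob_cluster": "PROD"}, 0),
]
print(firing_groups(samples))
```

In this sketch only the `TEST` group exceeds the threshold, so only that cluster would raise the alert.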
Alert information
| Alert trigger method | Alert level | Scope |
|---|---|---|
| Cut off the network between the OBServer and the arbitration service | Critical | Cluster |
Alert template
Alert summary
- Template: ${alarm_target} ${alarm_name}
- Example: ob_cluster=TEST OceanBase cluster has nodes that cannot communicate with the arbitration service.
Alert details
- Template: Cluster: ${ob_cluster_name}, alert: ${alarm_name}. The cluster ${ob_cluster_name} has nodes that cannot communicate with the arbitration service, with a total of ${value_shown} nodes.
- Example: Cluster: TEST, alert: OceanBase cluster has nodes that cannot communicate with the arbitration service. The cluster TEST has nodes that cannot communicate with the arbitration service, with a total of 2 nodes.
Alert recovery
- Template: Alert: ${alarm_name}, the number of nodes in the OceanBase cluster that cannot communicate with the arbitration service: ${value_shown}
- Example: Alert: OceanBase cluster has nodes that cannot communicate with the arbitration service, the number of nodes in the OceanBase cluster that cannot communicate with the arbitration service: 0
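The templates above use `${name}` placeholders, which have the same syntax as Python's `string.Template`. The following sketch uses that coincidence to show how the alert details template expands into its example. It is only an illustration of the substitution and does not describe how OCP renders alerts internally; the function and constant names are hypothetical.

```python
from string import Template

# Alert details template, copied from the list above.
DETAILS_TEMPLATE = Template(
    "Cluster: ${ob_cluster_name}, alert: ${alarm_name}. "
    "The cluster ${ob_cluster_name} has nodes that cannot communicate "
    "with the arbitration service, with a total of ${value_shown} nodes."
)

ALARM_NAME = (
    "OceanBase cluster has nodes that cannot communicate "
    "with the arbitration service"
)


def render_details(ob_cluster_name, value_shown):
    """Fill the placeholders with the values of one alert."""
    return DETAILS_TEMPLATE.substitute(
        ob_cluster_name=ob_cluster_name,
        alarm_name=ALARM_NAME,
        value_shown=value_shown,
    )


details = render_details("TEST", 2)
print(details)
```

With `ob_cluster_name` set to `TEST` and `value_shown` set to 2, the rendered text is identical to the alert details example.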
Impact on the system
If the arbitration service cannot communicate with an OBServer node, the arbitration feature may not work properly, and high availability cannot be provided for 2F/4F tenants. If half of the replicas of a tenant that has the arbitration service enabled become unavailable, the services of that tenant will be abnormal.
Possible causes
- The network between the OBServer and the arbitration service is down.
- The arbitration service process is abnormal.
Resolution
Connect to the cluster as root@sys. Execute the following SQL statement to check the status of each server.
SELECT svr_ip, svr_port, arbitration_service_address, status FROM GV$OB_ARBITRATION_SERVICE_STATUS;
Check whether the network between these servers and the arbitration service is working properly.
Check whether the arbitration service process is normal.
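The checks above can be partly scripted. The following Python sketch filters the query result for servers in the INACTIVE state and prepares a plain TCP reachability test against the arbitration service. The rows are hypothetical sample data, the `ACTIVE` value and the `host:port` format of `arbitration_service_address` are assumptions, and the helper names are invented for this example.

```python
import socket


def inactive_rows(rows):
    """Keep only the rows whose status column is INACTIVE."""
    return [r for r in rows if r["status"] == "INACTIVE"]


def split_address(address):
    """Split 'host:port' into (host, port); assumes that address format."""
    host, _, port = address.rpartition(":")
    return host, int(port)


def can_connect(host, port, timeout=3.0):
    """Try a plain TCP connection; True if the port accepts it."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical rows shaped like the result of the SQL statement above.
rows = [
    {"svr_ip": "10.0.0.1", "svr_port": 2882,
     "arbitration_service_address": "10.0.0.9:2882", "status": "ACTIVE"},
    {"svr_ip": "10.0.0.2", "svr_port": 2882,
     "arbitration_service_address": "10.0.0.9:2882", "status": "INACTIVE"},
]

for row in inactive_rows(rows):
    host, port = split_address(row["arbitration_service_address"])
    print(row["svr_ip"], "->", host, port)
    # Call can_connect(host, port) on the affected OBServer host itself,
    # because reachability must be tested from that node's network.
```

A successful TCP connection only shows that the port is reachable. It does not prove that the arbitration service process is healthy, so the process check is still required.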