Description
If OCP fails to connect to OCP-MonitorDB within 3 minutes, a disconnection alert is triggered.
Principle
| Parameter | Value |
|---|---|
| Metric | ocp_meta_db_health |
| Source | OCP server tracking data |
| Collected metric | ocp_meta_db_health |
| Metric expression | ocp_meta_db_health{@LABELS} |
| Collection cycle | 60 seconds |
Note
A value of 0 indicates that the connection failed, and a value of 1 indicates that the connection succeeded.
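To illustrate how the 0/1 values relate to the 3-minute condition, here is a minimal sketch. It is not OCP's actual evaluation logic. It assumes that one sample arrives per 60-second collection cycle and that the condition is met when every sample in the most recent 3 minutes is 0. The function name `is_disconnected` and its parameters are hypothetical.

```python
def is_disconnected(samples, window_seconds=180, cycle_seconds=60):
    """Return True if every sample in the most recent window is 0.

    samples: ocp_meta_db_health values, oldest first (0 = failure, 1 = success).
    Assumption: one sample per collection cycle; this is not OCP's own logic.
    """
    needed = window_seconds // cycle_seconds  # 3 samples with the defaults
    if len(samples) < needed:
        return False  # not enough data to cover the whole window
    return all(value == 0 for value in samples[-needed:])


print(is_disconnected([1, 1, 0, 0, 0]))  # three consecutive failures -> True
print(is_disconnected([0, 0, 1, 0, 0]))  # a success inside the window -> False
```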
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Based on the expression of the metric | Warning | OCP node |
Alert rule
| Metric | Default threshold | Source | Detection cycle | Elimination cycle |
|---|---|---|---|---|
| ocp_meta_db_health | 0 | OCP server tracking data | 60 seconds | 5 minutes |
Alert template
- Alert overview
  - Template: ${alarm_target} ${alarm_name}
  - Sample: alarm_template_id=0:svr_ip=xxx.xxx.xxx.xxx:db=monitor ocp_meta_db_disconnected
- Alert details
  - Template: [${alarm_name}] OCP server: ${svr_ip} ${db} MonitorDB is disconnected.
  - Sample: [OCP-MonitorDB health status] OCP server: xxx.xxx.xxx.xxx MonitorDB is disconnected.
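The `${...}` placeholders in the templates above can be tried out with Python's `string.Template`, which happens to use the same placeholder syntax. This is only an illustration of the substitution, not how OCP renders alerts internally. The values are taken from the samples, and the `db` value `monitor` is an assumption based on `db=monitor` in the overview sample.

```python
from string import Template

# Templates copied from the alert template list.
OVERVIEW = Template("${alarm_target} ${alarm_name}")
DETAILS = Template(
    "[${alarm_name}] OCP server: ${svr_ip} ${db} MonitorDB is disconnected."
)

# Values from the overview sample; xxx.xxx.xxx.xxx is the document's placeholder.
overview = OVERVIEW.substitute(
    alarm_target="alarm_template_id=0:svr_ip=xxx.xxx.xxx.xxx:db=monitor",
    alarm_name="ocp_meta_db_disconnected",
)

# Assumption: db is "monitor", inferred from the overview sample's alarm target.
details = DETAILS.substitute(
    alarm_name="OCP-MonitorDB health status",
    svr_ip="xxx.xxx.xxx.xxx",
    db="monitor",
)

print(overview)
print(details)
```

`substitute` raises `KeyError` if a placeholder has no value, which makes a missing field easy to spot when experimenting with a template.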
Impact on the system
If OCP-MonitorDB remains disconnected for a long time, OCP malfunctions or fails, and monitoring data is lost.
Possible causes
- The network is unstable.
- The MonitorDB service is abnormal.
Solutions
Manually connect to OCP-MonitorDB and check whether the disconnection recurs.
If OCP-MonitorDB disconnects frequently, check its status and its network environment.
Restart OCP-MonitorDB. If the problem persists, check the logs and contact the administrator for troubleshooting.
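Before logging in to the database manually, it can help to confirm that the MonitorDB address is reachable over the network at all. The following is a minimal sketch using only the Python standard library. The helper name `can_reach`, the host `127.0.0.1`, and the port `2883` are hypothetical; substitute the address and SQL port of your own MonitorDB. A successful TCP connection shows only that the network path is open, not that the database service is healthy.

```python
import socket


def can_reach(host, port, timeout_seconds=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # Hypothetical address: replace with your MonitorDB host and SQL port.
    print(can_reach("127.0.0.1", 2883))
```

If this check fails intermittently, the cause is more likely the network. If it succeeds while logins still fail, the MonitorDB service itself is the more likely suspect.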