Description
This alert indicates whether the backup or recovery service is inactive. It is triggered when the backup and recovery module stops working.
Principle
OCP-Server runs the following commands by using the OCP-Agent to check the status of the backup or recovery service.
cd /home/admin/oceanbase && bash ./bin/agent.sh exist_restore
cd /home/admin/oceanbase && bash ./bin/agent.sh exist_backup
The return value 1 indicates that the service is active. Otherwise, the alert is triggered.
Alert rule
| Metric | Default threshold | Duration | Detection cycle | Time before clearance |
|---|---|---|---|---|
| None | None | 0 seconds | 60 seconds | 5 minutes |
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Reminder of OCP | Stopped | Services |
Alert templates
Overview: ${alarm_target} ${alarm_name}
Details: ${alarm_target} ${alarm_name}
Overview example: service=Backup:svr_ip=xxx.xxx.xxx.xxx. The backup or recovery service is inactive.
Details example: service=Backup:svr_ip=xxx.xxx.xxx.xxx. The backup or recovery service is inactive.
${alarm_target} follows the ob_cluster=xxxxxxx:svr_ip=xxxxxx format.
ob_cluster indicates the name of the cluster that generated the alert.
svr_ip indicates the IP address of the OBServer of the cluster that generated the alert.
Impact on the system
The corresponding cluster cannot be backed up.
Possible causes
The backup and recovery module malfunctions or a configuration error occurs.
Suggested solution
Check the logs of the backup and recovery module to locate the cause of the failure.
The logs include agentserver.log and agentrestore.log, and are located under: /home/admin/oceanbase/log of the server where the backup and recovery services are run.