backup_process_dead

2025-03-26 07:47:21  Updated

Description

This alert indicates whether the backup or recovery service is inactive. It is triggered when the backup and recovery module stops working.

Principle

OCP-Server runs the following commands by using the OCP-Agent to check the status of the backup or recovery service.

cd /home/admin/oceanbase && bash ./bin/agent.sh exist_restore
cd /home/admin/oceanbase && bash ./bin/agent.sh exist_backup

The return value 1 indicates that the service is active. Otherwise, the alert is triggered.

Alert rule

Metric Default threshold Duration Detection cycle Time before clearance
None None 0 seconds 60 seconds 5 minutes

Alert information

Trigger method Alert level Scope
Reminder of OCP Stopped Services

Alert templates

  • Overview: ${alarm_target} ${alarm_name}

  • Details: Host: ${host}, Alert: The backup or recovery service ${agent_type} is inactive.

  • Overview example: svr_ip=xxx.xxx.xxx.xxx. The backup or recovery service is inactive.

  • Details example: Host: xxx.xxx.xxx.xxx. Alert: The backup or recovery service Backup-1 is inactive.

${alarm_target} follows the svr_ip=xxxxxx format.

  • ob_cluster indicates the name of the cluster that generated the alert.

  • svr_ip indicates the IP address of the OBServer of the cluster that generated the alert.

Impact on the system

The corresponding cluster cannot be backed up.

Possible causes

The backup and recovery module malfunctions or a configuration error occurs.

Suggested solution

Check the logs of the backup and recovery module to locate the cause of the failure.

The logs include agentserver.log and agentrestore.log, and are located under: /home/admin/oceanbase/log of the server where the backup and recovery services are run.

Contact Us