base_backup_timeout

2025-01-10 06:15:54  Updated

Description

This alert is triggered when a data backup task initiated in the OceanBase Cloud Platform (OCP) timed out.

Principle

OCP-Server runs a scheduled task to check the log backup tasks every minute. When a log backup task timed out, this alert is triggered.

Alert rule

Metric Default threshold Duration Detection cycle Elimination cycle
N/A N/A 0 seconds 60 seconds 5 minutes

Alert information

Trigger method Alert level Scope
Reminder of OCP Caution Services

Alert template

  • Alert overview

    • Template: ${alarm_target} ${alarm_name}
    • Sample: ob_cluster=obcluster-1 base_backup_timeout
  • Alert details

    • Template: Cluster: ${ob_cluster_name}[${ob_cluster_id}], Tenant: ${tenant_name}, Alert: The running duration of the data backup task exceeds the alert threshold of ${timeout_minutes_threshold} minutes.
    • Sample: Cluster: obcluster-1[10006], Tenant: obtenant, Alert: The running duration of the data backup task exceeds the alert threshold of 7 minutes.

Impact on the system

If data backup timed out due to a backup task exception, the tenant data may fail to be recovered.

Possible causes

This problem is commonly found in the following scenarios:

  • A small timeout alert threshold is specified for data backup.
  • An exception occurred when the data backup task is executed.

Solutions

  • If the data backup task finally succeeded, log on to the OCP console and choose Backup and Recovery > Backup Strategy. On the page that appears, increase the timeout alert threshold for data backup.

  • For other reasons, collect logs to locate the problem.

    • If the alert is triggered during a logical log backup task, check the agentserver.log of the backup service for troubleshooting.``

    • If the alert is triggered during a physical log backup, check the OBServer logs to find the specific cause. Perform the following steps to find the logs:

      1. Log on to the OCP console. On the Clusters page, find the target cluster in the Clusters list and click its name. Then, the Overview page of the cluster appears.

      2. Click the More icon in the upper-right corner and select Download Logs.

        • Specify the time range, which must contain the alert time.

        • For the log type, select OBServer Logs.

      3. Click Download Logs to download the OBServer logs.

      4. Open the observer.log file and search for WARN- and ERROR-level logs to view the details.``

Contact Us