Description
This alert is triggered when OceanBase Cloud Platform (OCP) fails to detect the status of an OceanBase cluster that is managed by OCP.
Principle
The OCP-Server runs a timed task to detect the status of an OceanBase cluster every 60 seconds. Each time this task is executed, the OCP-Server uses the OceanBase SDK to connect to the OceanBase cluster. If the connection fails, this alert is triggered.
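The polling behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the OCP-Server implementation: `run_status_checks`, `probe`, and `on_failure` are hypothetical names, and the real task connects through the OceanBase SDK rather than through an injected function.

```python
import time
from typing import Callable


def run_status_checks(
    probe: Callable[[], None],
    on_failure: Callable[[str], None],
    interval_seconds: float = 60.0,
    max_runs: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `probe` once per interval and report every failed run.

    `probe` is a hypothetical stand-in for "connect to the cluster" and is
    expected to raise an exception when the connection fails.
    Returns the number of failed runs.
    """
    failures = 0
    for run in range(max_runs):
        try:
            probe()
        except Exception as exc:  # any connection error counts as a failed check
            failures += 1
            on_failure(str(exc))
        if run < max_runs - 1:
            # Wait one detection cycle before the next check.
            sleep(interval_seconds)
    return failures
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without waiting for real 60-second intervals.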
Alert rule
| Metric | Default threshold | Duration | Detection cycle | Time before clearance |
|---|---|---|---|---|
| None | None | 0 seconds | 1 minute | 5 minutes |
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Timed task of OCP | Critical | Cluster |
Alert templates
Overview: ${alarm_target} ${alarm_name}
Details: ${alarm_target} ${alarm_name}. Check item: ${check_item}. Cause of failure: ${failed_reason}
Overview example: ob_cluster=C1-1000. OceanBase cluster status checking failed.
Details example: ob_cluster=C1-1000. OceanBase cluster status checking failed. Check item: cluster connect check. Cause of failure: some reason.
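To see how the placeholders in the templates expand, the following sketch renders both templates with Python's `string.Template`, which accepts the same `${name}` syntax. The substituted values are illustrative assumptions; the exact strings and punctuation that OCP fills in may differ from what is shown here.

```python
from string import Template

# Template strings copied from this topic.
OVERVIEW = Template("${alarm_target} ${alarm_name}")
DETAILS = Template(
    "${alarm_target} ${alarm_name}. Check item: ${check_item}. "
    "Cause of failure: ${failed_reason}"
)

# Illustrative values only (assumed, not taken from a real alert).
values = {
    "alarm_target": "ob_cluster=C1-1000",
    "alarm_name": "OceanBase cluster status checking failed",
    "check_item": "cluster connect check",
    "failed_reason": "some reason",
}

overview_text = OVERVIEW.substitute(values)
details_text = DETAILS.substitute(values)
print(overview_text)
print(details_text)
```

`substitute` raises `KeyError` if a placeholder has no value, which makes a missing field such as `failed_reason` easy to spot.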
Impact on the system
This issue may have the following impacts:
- OCP cannot connect to OBProxy.
- The status of an OceanBase cluster is abnormal.
Possible causes
This problem is commonly found in the following scenarios:
- OCP failed to connect to the OceanBase cluster.
- The network connection between OCP and the corresponding OBServer is disconnected.
- The OceanBase cluster is unavailable because the Root Service is leaderless or the sys tenant has a leaderless table.
- The sys tenant of OceanBase Database is abnormal, which causes the logon to fail. The problem may be caused by the following issues:
  - The sys tenant is abnormal. For example, the ocp_monitor@sys account does not exist.
  - The password of the sys tenant is incorrect.
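A quick way to separate the network-related causes from the logon-related ones is to test whether a TCP connection to the server can be opened at all. The sketch below is one possible check, not an OCP tool: `tcp_reachable` is a hypothetical helper, and the host and port you pass in are whatever OCP uses to reach the cluster or its OBProxy.

```python
import socket


def tcp_reachable(host: str, port: int, timeout_seconds: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If the function returns `False`, the network path or the server process is the more likely culprit. If it returns `True`, the port is accepting connections, so an account or password problem becomes more plausible, although this check alone cannot confirm it.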
Suggested solutions
If the cause of the failure in Alert Details is Access Denied, it is likely that the ocp_monitor@sys account has been deleted or that its password has been changed. Use the root@sys account to log on to the OceanBase cluster by using the CLI tool, and then perform the following steps to check and repair the ocp_monitor@sys account:
  a. Run the following command to view the ocp_monitor account and the permissions of this account:
         select user_name,priv_select from __all_user;
     Check whether ocp_monitor is included in the value of the user_name field in the returned information and whether the value of priv_select is 1.
     - If the ocp_monitor account does not exist, run the following command to create the ocp_monitor account:
           CREATE USER IF NOT EXISTS ocp_monitor IDENTIFIED BY 'your password';
     - If the ocp_monitor account exists but the value of priv_select is 0, run the following command to grant the global SELECT permission to the ocp_monitor account:
           GRANT SELECT ON *.* TO ocp_monitor;
     - If the ocp_monitor account exists and the value of priv_select is 1, the password of the ocp_monitor account may be incorrect. Run the following command to change the password:
           ALTER USER ocp_monitor IDENTIFIED BY 'your password';
  b. Save the ocp_monitor@sys account to the password box. For more information about how to manage the account, see Create OceanBase cluster connection credentials.
     Note: If the ocp_monitor@sys account of the cluster already exists in the password box, delete this account first.
If the cause of the failure in Alert Details is connection failure, perform the following steps:
  a. Check for other alerts of this OceanBase cluster in the OCP console.
     - If any, resolve these alerts and check whether the alert of this topic is cleared.
     - If not, go to Step b.
  b. Check the network connection between OCP and OBServer. Run the following command on the OCP-Server to check whether the OceanBase cluster can be connected:
         obclient -hxxx.xxx.xxx.xxx -P2888 -u<username>@<tenantname#clustername> -p -Doceanbase
     Note: If OCP connects to and manages this OceanBase cluster by using an OBProxy server, replace the IP address and port number of the OceanBase cluster with those of the OBProxy server.
     - If the OceanBase cluster cannot be connected, troubleshoot the issue based on the returned information of the connection failure.
       - If the connection failure is caused by a network failure, contact the network administrator to resolve the issue.
       - If the obproxy or observer process is faulty, restart the corresponding process.
       Note: To check for an OBProxy or OBServer-related failure, view the following logs:
       - Log files of the OBProxy deployed from the OCP console are located in /home/admin/logs/obproxy/log.
       - Log files of the manually deployed OBProxy are located in /opt/taobao/install/obproxy-?/log. The value of ? is subject to the OBProxy version you have installed.
       - The OBServer log files are saved in /home/admin/oceanbase/log.
     - If the OceanBase cluster can be connected, go to the next step.
  c. Collect the preceding logs and contact OCP Technical Support to resolve the issue.
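The account checks in the Access Denied procedure above amount to a three-way decision based on the result of the `__all_user` query. The sketch below captures that decision. `repair_statement` is a hypothetical helper, and the list-of-dicts row format is an assumption about how you load the query result; it only returns the SQL text from this topic and does not execute anything.

```python
from typing import Dict, List

ACCOUNT = "ocp_monitor"


def repair_statement(
    rows: List[Dict[str, object]],
    password_placeholder: str = "your password",
) -> str:
    """Pick the SQL statement from this topic that matches the query result.

    `rows` holds the result of
    `select user_name,priv_select from __all_user;` as a list of dicts.
    """
    match = next((r for r in rows if r["user_name"] == ACCOUNT), None)
    if match is None:
        # The account does not exist: create it.
        return f"CREATE USER IF NOT EXISTS {ACCOUNT} IDENTIFIED BY '{password_placeholder}';"
    if int(match["priv_select"]) == 0:
        # The account exists but lacks the global SELECT permission.
        return f"GRANT SELECT ON *.* TO {ACCOUNT};"
    # The account exists with SELECT permission: the password is the suspect.
    return f"ALTER USER {ACCOUNT} IDENTIFIED BY '{password_placeholder}';"
```

For example, `repair_statement([{"user_name": "ocp_monitor", "priv_select": 0}])` returns the `GRANT` statement. Replace the password placeholder with the real value before you run any returned statement.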