ocp_remote_server_time_diff_too_large |V3.3.1|OceanBase Cloud Platform| docs|Distributed Database

ocp_remote_server_time_diff_too_large

Last Updated：2023-08-15 11:20:59 Updated

Description

This alert is triggered when the time difference between the clock of the OCP-Server and that of the remote host managed by the OCP-Server is too large.

Principle

OCP-Server checks the clock offset with a remote host by running the clockdiff -o {ip} command once every minute. When the offset exceeds the threshold, this alert is triggered. The default threshold is 500 ms and is specified by the system parameter ocp.host.check.clock-diff.max-diff.

Alert rule

Metric	Default threshold (unit: ms)	Duration	Detection cycle	Time before clearance
None	500	0 seconds	1 minute	5 minutes

Alert information

Trigger method	Alert level	Scope
Metric expression	Critical	Server

Alert templates

Overview: ${alarm_target} ${alarm_name}
Details: ${alarm_target} ${alarm_name}. The time difference between the OCP-Server and the remote host is too large. The remote host time is ${remote_server_time} and the OCP-Server time is ${ocp_time}. The threshold is ${max_interval_millis} ms.
Overview example: The time difference between the OCP-Server and the remote host is too large. The current difference is 1400 ms, exceeding the threshold of 500 ms.
Details example: The time difference between the OCP-Server and the remote host is too large. The current time difference is 1400 ms, exceeding the threshold of 500 ms.

Impact on the system

The system cannot send RPC requests.
Authorization may fail.
Errors may occur while querying some metrics.

Possible causes

This problem is commonly found in the following scenarios:

The clock of the OCP-Server is malfunctioning.
The clock of the remote host is malfunctioning.

Suggested solutions

Check the clock of the OCP-Server.

Check whether the alert is reported multiple times in the OCP console.
- If no, go to Step 2.
- Otherwise, the problem is likely to have been caused by the malfunctioning clock of the OCP-Server.
  
  We recommend that you first ensure the clock of the OCP-Server is properly functioning.
  
  The problem may also be caused by clock synchronization exceptions between the OCP-Server and the clock source. For more information, see host_ntp_offset_too_large.
  
  Check whether the alert recurs 5 minutes later.
Check the lock of the host managed by the OCP-Server.
- When the host managed by the OCP-Server is unavailable or cannot be connected, the following alerts are also triggered at the same time. We recommend that you solve the problems that caused those alerts before manually triggering the synchronization between the host clock and the clock source, and check whether the alert recurs 5 minutes later.
  - ob_cannot_connected
  - ob_cluster_exists_inactive_server
- The problem may be caused by exceptions of synchronization between the OCP host clock and the clock source. For more information, see host_ntp_offset_too_large.
- If the clocks of the OCP-Server and the host are both normal except that the clocks are not synchronized, you can manually synchronize the clocks.
  
  Run one of the following commands based on the clock synchronization service installed on the OCP-Server.
```
# When the NTP service is in use, run the following command for synchronization (192.168.0.1 indicates the IP address of the clock source that must be specified): 
ntpdate 192.168.0.1


# When the Chrony service is in use, make sure that the clock source has been configured in the /etc/chrony.conf file and run the following command for synchronization: 
chronyc -a makestep
```

Community Edition

Enterprise Edition