Alert description
This alert monitors whether the Binlog Server process exists. If it does not, an alert is triggered.
Alert principle
The following table lists the key parameters involved in the monitoring logic of this alert.
| Parameter | Value |
|---|---|
| Monitoring Metrics | binlog_process_exists: Indicates whether the Binlog Server process exists. A value of 1 indicates that the process exists; a value of 0 indicates that the process does not exist and triggers an alert. |
| Monitoring Expression | process_exists{name="logproxy",@LABELS} |
| Metric Collection | binlog_process_exists |
| Metric Source | This alert has a single metric source. OCP-Agent runs the following Linux command to check whether the Binlog Server process exists: ps -ef \| grep -w logproxy \| grep -v grep \| wc -l |
| Collection Cycle | 5 Seconds |
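
The check in the Metric Source row can be reproduced by hand on the host. The following is a minimal sketch, not OCP-Agent's actual implementation: the helper names count_matching and to_metric are hypothetical, and mapping any nonzero process count to a metric value of 1 is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: count lines on stdin that contain the whole word $1,
# dropping the grep processes themselves (same filters as the documented pipeline).
count_matching() {
    grep -w -- "$1" | grep -v grep | wc -l | tr -d ' '
}

# Assumption: any nonzero process count is reported as 1, and zero as 0.
to_metric() {
    if [ "$1" -gt 0 ]; then
        echo 1
    else
        echo 0
    fi
}

# Manual check on the host that runs Binlog Server.
count=$(ps -ef | count_matching logproxy)
echo "binlog_process_exists=$(to_metric "$count")"
```

A result of binlog_process_exists=0 corresponds to the condition that triggers this alert.
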
Rule information
| Monitoring Metrics | Default Threshold | Duration | Detection Cycle | Elimination Cycle |
|---|---|---|---|---|
| binlog_process_exists | 0 | 30 Seconds | 10 Seconds | 5 Minutes |
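
How the threshold, duration, and detection cycle interact can be sketched as follows. This is an illustration under assumptions, not OCP's alert engine: it assumes the rule is evaluated once per detection cycle and fires when the metric has equaled the threshold for enough consecutive evaluations to cover the duration (30 / 10 = 3 here). The function name should_alert is hypothetical, and the elimination cycle is not modeled.

```shell
#!/bin/sh
# Values taken from the rule table above.
THRESHOLD=0          # metric value that counts as "process missing"
DURATION=30          # seconds the condition must persist
DETECTION_CYCLE=10   # seconds between evaluations

# Hypothetical evaluator. Arguments are metric samples, oldest first,
# one per detection cycle. Returns success (0) when the alert should fire.
should_alert() {
    needed=$((DURATION / DETECTION_CYCLE))
    streak=0
    for v in "$@"; do
        if [ "$v" -eq "$THRESHOLD" ]; then
            streak=$((streak + 1))   # condition still holds
        else
            streak=0                 # process was seen, so start over
        fi
    done
    [ "$streak" -ge "$needed" ]
}

# Process present once, then missing for three consecutive evaluations.
if should_alert 1 0 0 0; then
    echo "alert"
else
    echo "ok"
fi
```

Under these assumptions, a single missed sample does not raise the alert; the process must be absent across three consecutive evaluations.
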
Alert information
| Alert Trigger Method | Alert Level | Scope |
|---|---|---|
| Based on monitoring metric expressions | Downtime | Host |
Alert template
Alert Overview
- Template: ${alarm_target} ${alarm_name}
- Example: alarm_template_id=0:binlog_cluster=binlog02-2000005:svr_ip=xxx.xxx.xxx.xxx The Binlog Server process does not exist.
Alert Details
- Template: Binlog Cluster: ${binlog_cluster}, Host: ${host}, Alert: ${alarm_name}.
- Example: Binlog cluster: binlog02, host: xxx.xxx.xxx.xxx, alert: Binlog Server process does not exist.
Alert recovery
- Template: Alert: ${alarm_name}, Binlog Server process liveness status: ${recover_value}
- Example: Alert: Binlog Server process does not exist, Binlog Server process status: 1
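
The placeholder substitution in these templates can be illustrated with a small sketch. It is not how OCP renders alerts; the function name render_alert_details is hypothetical, and the values are the ones from the Alert Details example above.

```shell
#!/bin/sh
# Hypothetical renderer for the Alert Details template:
#   Binlog Cluster: ${binlog_cluster}, Host: ${host}, Alert: ${alarm_name}.
render_alert_details() {
    binlog_cluster="$1"
    host="$2"
    alarm_name="$3"
    printf 'Binlog Cluster: %s, Host: %s, Alert: %s.\n' \
        "$binlog_cluster" "$host" "$alarm_name"
}

render_alert_details "binlog02" "xxx.xxx.xxx.xxx" "Binlog Server process does not exist"
```
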
Impact on the system
The Binlog Server process manages Binlog instances. In a single-node deployment, Binlog instances can no longer be managed while the process is down.
Possible causes
The Binlog MetaDB is unavailable.
Solution
Check whether OCP has restarted the Binlog Server process.
By default, this alert triggers an OCP plan that automatically starts the Binlog Server process. Check whether the plan ran successfully and whether the Binlog Server process has resumed normal operation.
If OCP fails to start the Binlog Server process, troubleshoot as follows:
- Check whether the MetaDB of the Binlog service is available. If the MetaDB remains unavailable, the Binlog Server process may exit abnormally.
- Collect the error logs and coredump files of the Binlog Server process, along with its runtime log log/logproxy.log, and contact OCP Technical Support for assistance with troubleshooting.
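
Gathering the runtime log for a support request can be sketched as below. This is only an illustration: the function name collect_runtime_log and both directory arguments are hypothetical, and it assumes log/logproxy.log is relative to the Binlog Server installation directory. Error logs and coredump files are not covered, because their locations are not stated here.

```shell
#!/bin/sh
# Hypothetical helper: copy the Binlog Server runtime log into a
# directory that can be handed to support.
#   $1: Binlog Server installation directory (assumed to contain log/logproxy.log)
#   $2: destination directory for collected files
collect_runtime_log() {
    home="$1"
    dest="$2"
    src="$home/log/logproxy.log"
    if [ ! -f "$src" ]; then
        echo "runtime log not found: $src" >&2
        return 1
    fi
    mkdir -p "$dest" && cp "$src" "$dest/"
}

# Example call (replace both paths with real ones):
#   collect_runtime_log <binlog-server-install-dir> <output-dir>
```
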
