Description
This alert is triggered when the obproxy process stops.
The ocp_agent process monitors the uptime of the processes of components in OceanBase Database. When the uptime of a process is 0, the process is stopped. This alert indicates that the process has stopped at the moment rather than for a long time. In the latter case, other status alerts are triggered.
Relevant alerts:
- observer process: observer_process_stop
- obproxy process: obproxy_process_stop
- obproxyd.sh process: obproxyd_process_stop
- ocp_agentd process: agentd_process_stop
- ocp_mgragent process: mgragent_process_stop
- ocp_monagent process: monagent_process_stop
Principle
| Parameter | Value |
|---|---|
| Metric | obproxy_uptime_delta_seconds |
| Source | The difference between the current time and the process creation time displayed in the 14th column of the stat file in the /proc/[pid] directory. |
| Collected metric | process_uptime_seconds |
| Metric expression | 0 - min(delta(process_uptime_seconds{name="obproxy",@LABELS}[@INTERVAL])) by (@GBLABELS) |
| Collection cycle | 5 seconds |
Alert rule
| Metric expression | Metric description | Default threshold | Detection cycle | Time before clearance |
|---|---|---|---|---|
| obproxy_uptime_delta_seconds > 0 | The negative offset value of uptime is used. A metric value greater than 0 indicates that the process has stopped. | 0 seconds | 30 seconds | 5 minutes |
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Based on the expression of the metric | Stopped | OBProxy |
Alert templates
Overview
- Template:
${alarm_target} ${alarm_name} - Example:
alarm_template_id=0:obproxy_cluster=ODPT2-2000005:host=xxx.xxx.xxx.xxx. The obproxy process stopped.
- Template:
Details
- Template:
OBProxy cluster: ${obproxy_cluster}. Host: ${host}. Alert: ${alarm_name}. The process has been running for ${value_shown}. - Example:
OBProxy cluster: ODPT2. Host: xxx.xxx.xxx.xxx. Alert: The obproxy process stopped. The process has been running for 1 day 15 hours 12 minutes 51 seconds.
- Template:
Impact on the system
After the obproxy process stops, it can be restarted by the obproxyd.sh process. Before the obproxy process is restarted, business applications are disconnected. This results in business interruption, such as a QPS drop.
Possible causes
- The memory usage of the obproxy process exceeds the threshold.
- Other exceptions.
Solutions
After the obproxy process stops, the obproxyd.sh process tries to restart it. If the obproxyd.sh process stops unexpectedly, you need to manually restart the obproxyd.sh process.
cd /opt/taobao/install/obproxy-3.4.0 && bin/obproxyd.sh -c start -r /home/admin/logs/obproxy -n ODPT2 -p 2883If the obproxyd.sh process is running but not the obproxy process, you cannot manually restart the obproxy process. In this case, perform the following troubleshooting operations:
If you have started OBProxy by specifying the
obproxy_config_server_urlparameter, check whether the config server is running. The config server service is provided by OceanBase Cloud Platform (OCP).Get the error logs and core dump file of the obproxy process, and contact OceanBase Technical Support.