Description
OCP-Agent runs with the following three processes: ocp_agentd, ocp_mgragent, and ocp_monagent. This alert is triggered when the number of any of the processes is greater than 1 for more than 60s. The specific process name is displayed in the alert details.
Principle
| Parameter | Value | ||
|---|---|---|---|
| Metric | agent_process_count | ||
| Source | OCP-Agent | ||
| Collected metric | process_count | ||
| Metric expression | max(process_count{name=~"ocp_mgragent | ocp_monagent | ocp_agentd",@LABELS}) by (@GBLABELS) |
| Collection cycle | 5 seconds |
Alert rule
| Metric expression | Metric description | Default threshold | Detection cycle | Time before clearance |
|---|---|---|---|---|
| agent_process_count | The number of OCP-Agent processes is abnormal. | 1 | 20 seconds | 5 minutes |
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Metric expression | Warning | OceanBase Cloud Platform (OCP) nodes |
Alert templates
Overview
- Template: ${alarm_target} ${alarm_name}
- Example: alarm_template_id=0:host=xxx.xxx.xxx.xxx:name=ocp_mgragent. The number of processes is abnormal.
Details
- Template: Cluster: ${ob_cluster_name}. Server: ${host}. Process: ${name}. Number of processes: ${value}.
- Example: Cluster: TEST. Server: xxx.xxx.xxx.xxx. Process: ocp_mgragent. Number of processes: 2.
Impact on the system
This alert indicates that the cluster has a residual OCP-Agent process. OCP-Agent processes are configured with SO_REUSEADDR, allowing the port to be reused. When multiple OCP-Agent processes provide the same API, the API behavior may vary with the version, resulting in unexpected results.
Possible causes
- OCP-Agent processes are not fully cleaned up during uninstallation.
- OCP-Agent processes are not fully cleaned up during re-installation.
- You have operated OCP-Agent by using a CLI tool.
Suggested solutions
View the IDs of OCP-Agent processes on the GUI of the OCP console, and terminate the OCP-Agent processes that will no longer be used in a CLI tool. Run the kill {pid} command.