Description
This alert is triggered when the CPU load average of the OBServer over the last minute exceeds the threshold.
Principle
The following table describes the key parameters that are involved in the monitoring and alerting logic.
| Parameter | Value |
|---|---|
| Metric | ob_host_load1_per_cpu |
| Source | |
| Collected metrics | node_load1 and cpu_count |
| Metric expression | sum(node_load1{@LABELS}) by (@GBLABELS) / sum(cpu_count{@LABELS}) by (@GBLABELS) |
| Collection cycle | 1 second |
Note
The metric source of this alert is special. For more information, see the description in the Source row of the preceding table.
The value of the metric ob_host_load1_per_cpu indicates the CPU load average of the OBServer over the last minute. When this value exceeds the threshold, this alert is triggered. The default threshold is 2.
Alert rule
| Metric | Default threshold | Duration | Detection cycle | Time before clearance |
|---|---|---|---|---|
| ob_host_load1_per_cpu | 2 | 0 seconds | 60 seconds | 5 minutes |
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Metric expression | Critical | Server |
Alert templates
Overview: ${alarm_target} ${alarm_name}
Details: ${alarm_target} ${alarm_name}. The CPU load average of the OBServer over the last minute is ${value}, exceeding the threshold of ${alarm_threshold}.
Overview example: ob_cluster=C1-1000:svr_ip=xxx.xxx.xxx.xxx. The CPU load average of the OBServer over the last minute exceeds the threshold.
Details example: ob_cluster=C1-1000:svr_ip=xxx.xxx.xxx.xxx. The CPU load average of the OBServer over the last minute is 3.0, exceeding the threshold of 2.0.
Impact on the system
High CPU load of the server slows down the response of the OBServer to requests.
Possible causes
The OBServer is processing highly concurrent requests or hotspot rows.
The observer process is busy.
Suggested solutions
Check for processes that have high CPU load on the host. Then, run the top command in the command line provided by the operating system of the host to check whether the observer process caused the CPU to be overloaded.
If the observer process caused the CPU to be overloaded, resolve the issue. For more information, see ob_cpu_percent_over_threshold.
If the high CPU load was caused by a non-observer process, contact the database administrator or O&M engineer.