ob_cpu_percent_over_threshold Average tenant thread usage rate exceeds the threshold on OceanBase server (V4.3.6)

Last Updated: 2025-09-08 08:15:43

Alert description

The average tenant thread usage on OceanBase Database exceeds the threshold.

Alerting principle

The following table describes the key parameters involved in the alert monitoring logic.

Parameter	Value
Monitoring metric	ob_cpu_percent
Metric source	`select /*+read_consistency(weak)*/ tenant_name, tenant_id, stat_id, value from v$sysstat, __all_tenant where stat_id IN (140005, 140006) and (con_id > 1000 or con_id = 1) and __all_tenant.tenant_id = v$sysstat.con_id;` The value field is assigned to the collected metric, and other fields serve as labels.
Collected metric	sysstat_value
Monitoring expression	100 * sum(sysstat_value{metric_group="sysstat",stat_id="140006",@LABELS}) by (@GBLABELS) / sum(sysstat_value{metric_group="sysstat",stat_id="140005",@LABELS}) by (@GBLABELS)
Collection interval	1 second

The monitoring metric ob_cpu_percent indicates the average tenant thread usage on the OceanBase server. An alert is triggered when this value exceeds the threshold (default 90%).

Note

Statistical event ID 140005: the maximum number of tenant threads available on the OceanBase server.
Statistical event ID 140006: the number of tenant threads used on the OceanBase server.

The monitoring expression in the table divides the sum of the values for stat_id=140006 by the sum of the values for stat_id=140005 and multiplies the result by 100. The resulting percentage is the monitoring metric value, which indicates the average tenant thread usage on the OceanBase server.
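
The calculation can be illustrated with a minimal Python sketch. It is not how the monitoring system evaluates the expression. It only reproduces the arithmetic on hypothetical samples. The `svr_ip` grouping key, the tuple layout, and all sample values are assumptions made for illustration.

```python
from collections import defaultdict

STAT_MAX_THREADS = 140005   # maximum number of tenant threads available
STAT_USED_THREADS = 140006  # number of tenant threads used


def ob_cpu_percent(samples):
    """Return {svr_ip: usage percentage} from (svr_ip, tenant_id, stat_id, value) tuples.

    The tuple layout and the svr_ip grouping key are assumptions for this sketch.
    """
    used = defaultdict(float)
    total = defaultdict(float)
    for svr_ip, _tenant_id, stat_id, value in samples:
        if stat_id == STAT_USED_THREADS:
            used[svr_ip] += value
        elif stat_id == STAT_MAX_THREADS:
            total[svr_ip] += value
    # Skip servers with no capacity reported to avoid division by zero.
    return {ip: 100.0 * used[ip] / total[ip] for ip in total if total[ip] > 0}


# Hypothetical samples: two tenants on one server.
samples = [
    ("10.0.0.1", 1, STAT_MAX_THREADS, 400),
    ("10.0.0.1", 1, STAT_USED_THREADS, 380),
    ("10.0.0.1", 1002, STAT_MAX_THREADS, 600),
    ("10.0.0.1", 1002, STAT_USED_THREADS, 530),
]
result = ob_cpu_percent(samples)
print(result)
```

With these sample values, 910 of 1000 available threads are in use, so the server's value is 91%, which is above the default threshold of 90%.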

Rule information

Monitoring metric	Default threshold (unit: %)	Duration	Detection cycle	Elimination cycle
ob_cpu_percent	90	60 seconds	60 seconds	5 minutes
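
One way to read the rule is sketched below in Python. The sketch assumes that the alert condition is met only when every sample in the most recent Duration window exceeds the threshold. This interpretation, the function name `sustained_breach`, and the sample layout are assumptions for illustration, not the documented behavior of the alerting engine. The elimination cycle is not modeled.

```python
def sustained_breach(samples, threshold=90.0, duration_s=60):
    """Return True if all samples in the latest duration_s window exceed threshold.

    samples: list of (timestamp_seconds, value) tuples sorted by timestamp.
    Returns False when the history does not yet cover the full window.
    """
    if not samples:
        return False
    latest = samples[-1][0]
    window_start = latest - duration_s
    if samples[0][0] > window_start:
        return False  # not enough history to cover the whole window
    window = [value for ts, value in samples if ts >= window_start]
    return all(value > threshold for value in window)


# Hypothetical one-sample-per-second series, matching the 1-second collection interval.
steady_high = [(t, 91.0) for t in range(0, 61)]
with_dip = [(t, 85.0 if t == 30 else 91.0) for t in range(0, 61)]
print(sustained_breach(steady_high), sustained_breach(with_dip))
```

Under this reading, a brief dip below the threshold within the window prevents the alert from firing, which fits the statement that temporary spikes generally do not cause significant issues.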

Alert information

Alert trigger method	Alert level	Scope
Based on the expression of the monitoring metric	Critical	Server

Alert template

Alert summary
- Template: ${alarm_target} ${alarm_name}
- Example: ob_cluster=obcluster-1:svr_ip=xxx.xxx.xxx.xxx OceanBase server tenant thread average usage exceeds the limit
Alert details
- Template: Cluster: ${ob_cluster_name}, Host: ${host}, Alert: OceanBase tenant thread average usage is ${value_shown}%, which exceeds ${alarm_threshold} %.
- Example: Cluster: obcluster-1, Host: xxx.xxx.xxx.xxx, Alert: OceanBase tenant thread average usage is 91.0 %, which exceeds 90.0 %.
Alert recovery
- Template: Alert: ${alarm_name}, OceanBase server tenant thread average usage: ${value_shown}
- Example: Alert: OceanBase server tenant thread average usage exceeds the limit, OceanBase server tenant thread average usage: 89 %
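
The `${...}` placeholders in the templates above use the same syntax as Python's `string.Template`, so the substitution can be sketched as follows. This is only a stand-in for illustration. The product's own rendering engine is not described here, and the variable values are hypothetical.

```python
from string import Template

# Alert details template, copied from the list above.
details = Template(
    "Cluster: ${ob_cluster_name}, Host: ${host}, Alert: OceanBase tenant thread "
    "average usage is ${value_shown}%, which exceeds ${alarm_threshold} %."
)

# Hypothetical values for the template variables.
rendered = details.substitute(
    ob_cluster_name="obcluster-1",
    host="xxx.xxx.xxx.xxx",
    value_shown="91.0",
    alarm_threshold="90.0",
)
print(rendered)
```

`substitute` raises `KeyError` if a placeholder has no value, which makes a missing variable visible instead of leaving the raw placeholder in the message.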

Impact on the system

A high average tenant thread usage rate reduces system throughput and increases request latency.

A temporary spike in the average tenant thread usage rate generally does not cause significant issues. However, if the usage rate remains high, you must address the problem.

Possible causes

This alert can be triggered while complex SQL queries are running.

Solution

First, check whether the alert "tenant_cpu_percent_over_threshold OceanBase tenant thread usage exceeds the threshold" has been triggered.
- If it has, follow the solution described in "tenant_cpu_percent_over_threshold OceanBase tenant thread usage exceeds the threshold".
- If it has not, proceed to the next step.
It is possible that multiple tenants are experiencing increased load simultaneously, leading to load accumulation on the OBServer node and triggering the alert.

To reduce the thread usage rate caused by a surge in traffic affecting multiple tenants, you can take the following actions:
- Perform emergency scaling for the cluster.
  
  Add an OBServer node to the OceanBase cluster. For more information, see Add an OBServer node.
- Limit the traffic of the OceanBase cluster. For more information, see Limit the traffic of an OceanBase cluster.