ob_tenant_cpu_usage_over_threshold|V4.3.6| docs|Distributed Database

ob_tenant_cpu_usage_over_threshold

Last Updated: 2025-09-08 08:15:43

Description

This alert is triggered when the CPU utilization of an OceanBase Database tenant exceeds the threshold. The CPU utilization measured by this alert is the number of CPU cores occupied by the tenant as a percentage of the number of CPU cores allocated to the tenant. This makes the alert more accurate than the tenant_cpu_percent_over_threshold and ob_cpu_percent_over_threshold alerts, which derive CPU utilization from the thread usage of a tenant.

OBServer and OceanBase Cloud Platform (OCP) must meet the following version requirements for this alert to work:

OBServer: V3.2.3 BP10, V3.2.4 BP5, V4.1 BP3, V4.2.0, and later.
OCP: V4.2.2 and later.

Principle

Parameter	Value
Metric	ob_tenant_cpu_percent
Source	Query virtual tables. Query statement for OceanBase Database tenants of versions earlier than V4.0: `select /*+ MONITOR_AGENT READ_CONSISTENCY(WEAK) */ con_id tenant_id, stat_id, value from v$sysstat where stat_id IN (140005, 140013) and (con_id > 1000 or con_id = 1) and class < 1000`. Query statement for OceanBase Database tenants of V4.0 and later: `select /* MONITOR_AGENT */ con_id tenant_id, stat_id, value from v$sysstat where stat_id IN (140005, 140013) and (con_id > 1000 or con_id = 1) and class < 1000`. For an OceanBase Database tenant of V4.x and later with the cgroup feature enabled, the CPU time is obtained from cpuacct.usage of the cgroup. In all other cases, the CPU time is obtained from the `/proc/<pid>/task/<tid>/stat` files. In OceanBase Database V4.x versions earlier than V4.1 BP3, the CPU time can be obtained only when the cgroup feature is enabled.
Collected metric	ob_sysstat: 140005: max cpus, which is the number of CPU cores allocated to each tenant. 140013: cpu time, which is the cumulative CPU time of all threads of each tenant, in microseconds.
Metric expression	sum(rate(ob_sysstat{stat_id="140013"}[$__rate_interval])) by (ob_cluster_name,ob_cluster_id,obzone,tenant_name,ob_tenant_id,svr_ip)/sum(ob_sysstat{stat_id="140005"}) by (ob_cluster_name,ob_cluster_id,obzone,tenant_name,ob_tenant_id,svr_ip)/100
Collection cycle	60 seconds
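
The arithmetic of the metric expression can be checked by hand with a short Python sketch. It mirrors the expression literally: the per-second growth of the cumulative CPU time (stat 140013, in microseconds), divided by the value of stat 140005, divided by 100. The function name and the sample numbers are hypothetical. The sketch also relies on an assumption that is inferred from the expression rather than stated above: for the result to land on a 0 to 100 percentage scale, the stat 140005 value would have to be stored in hundredths of a core (for example, 400 for a 4-core tenant).

```python
def tenant_cpu_percent(cpu_time_start_us: float,
                       cpu_time_end_us: float,
                       interval_seconds: float,
                       max_cpus_stat: float) -> float:
    """Mirror rate(stat 140013) / stat 140005 / 100 for one tenant on one server.

    cpu_time_*_us   : two samples of the cumulative CPU time, in microseconds.
    interval_seconds: time between the two samples.
    max_cpus_stat   : value of stat 140005. ASSUMPTION: expressed in
                      hundredths of a core, so that the result is a percentage.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if max_cpus_stat <= 0:
        raise ValueError("max_cpus_stat must be positive")
    # Per-second increase of the counter, like PromQL rate() over one interval.
    rate_us_per_second = (cpu_time_end_us - cpu_time_start_us) / interval_seconds
    return rate_us_per_second / max_cpus_stat / 100


# Hypothetical sample: a 4-core tenant that kept 3.8 cores busy for 60 seconds.
percent = tenant_cpu_percent(
    cpu_time_start_us=1_000_000_000,
    cpu_time_end_us=1_228_000_000,
    interval_seconds=60,
    max_cpus_stat=400,
)
print(f"{percent:.1f}")
```

Unlike this two-sample calculation, PromQL `rate()` also handles counter resets and extrapolates to the window boundaries, so values computed by hand may differ slightly from what OCP reports.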

Alert rule

Metric expression	Metric description	Default threshold	Detection cycle	Time before clearance
ob_tenant_cpu_percent > 95	The CPU utilization of the tenant.	95	10 seconds	5 minutes
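
The rule above can be pictured as a small state machine. The following Python sketch is illustrative only: the class and method names are hypothetical, and it assumes that "Time before clearance" means the value must stay at or below the threshold continuously for that long before the alert clears. OCP's actual evaluation logic may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdAlert:
    """Hypothetical model of the rule: fire above 95, clear after 5 quiet minutes."""
    threshold: float = 95.0           # default threshold from the rule table
    clearance_seconds: float = 300.0  # "Time before clearance": 5 minutes
    active: bool = False
    _below_since: Optional[float] = None  # when the value last dropped to/below threshold

    def observe(self, timestamp: float, value: float) -> bool:
        """Feed one detection-cycle sample; return whether the alert is active."""
        if value > self.threshold:
            # Strictly greater than the threshold triggers (or keeps) the alert.
            self.active = True
            self._below_since = None
        elif self.active:
            if self._below_since is None:
                self._below_since = timestamp
            # ASSUMPTION: clear only after a continuous quiet period.
            if timestamp - self._below_since >= self.clearance_seconds:
                self.active = False
                self._below_since = None
        return self.active


alert = ThresholdAlert()
for ts, value in [(0, 96.0), (10, 90.0), (300, 90.0), (310, 90.0)]:
    print(ts, value, alert.observe(ts, value))
```

In this model a sample of exactly 95 does not trigger the alert, because the rule uses a strict `>` comparison.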

Alert information

Trigger method	Alert level	Scope
Metric expression	Critical	Tenant

Alert templates

Alert overview
- Template: ${alarm_target} ${alarm_name}
- Example: alarm_template_id=0:ob_cluster=TEST-1:tenant_name=t1:host=xxx.xxx.xxx.xxx. The CPU utilization of the OceanBase Database tenant exceeds the threshold.
Alert details
- Template: Cluster: ${ob_cluster_name}. Tenant: ${tenant_name}. Alert: ${alarm_name}. The CPU utilization of the tenant is ${value_shown}, which exceeds ${alarm_threshold} %.
- Example: Cluster: TEST. Tenant: t1. Alert: The CPU utilization of the OceanBase Database tenant exceeds the threshold. The CPU utilization of the tenant is 96%, which exceeds 95%.
Alert recovery
- Template: Alert: ${alarm_name}. OceanBase tenant CPU utilization: ${value_shown}
- Example: Alert: OceanBase tenant CPU utilization exceeds the threshold. OceanBase tenant CPU utilization: 90%
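
To preview how the `${...}` placeholders in the templates expand, the substitution can be reproduced with Python's standard `string.Template`, which happens to use the same placeholder syntax. This is only a sketch: it does not describe how OCP renders alerts internally, and the helper name and field values are hypothetical.

```python
from string import Template

# The alert details template from above, copied verbatim.
DETAILS_TEMPLATE = Template(
    "Cluster: ${ob_cluster_name}. Tenant: ${tenant_name}. Alert: ${alarm_name}. "
    "The CPU utilization of the tenant is ${value_shown}, "
    "which exceeds ${alarm_threshold} %."
)


def render_details(fields: dict) -> str:
    """Fill every placeholder; raises KeyError if a field is missing."""
    return DETAILS_TEMPLATE.substitute(fields)


# Hypothetical field values, modeled on the example above.
message = render_details({
    "ob_cluster_name": "TEST",
    "tenant_name": "t1",
    "alarm_name": "The CPU utilization of the OceanBase Database tenant exceeds the threshold",
    "value_shown": "96%",
    "alarm_threshold": "95",
})
print(message)
```

Using `substitute` rather than `safe_substitute` makes a missing field fail loudly instead of leaving a raw placeholder in the message.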

Impact on the system

High CPU utilization of a tenant may have the following impacts:

SQL response times increase, which slows business responses.
SQL requests may time out, which causes business operations to fail.

Possible causes

The tenant may have slow SQL statements. You can analyze the top SQL statements that consume a high percentage of CPU resources to locate the specific statements that cause high tenant CPU utilization.
The SQL performance degrades or the execution plan deteriorates. In this case, OCP generates the following alerts:
- oas_anomaly_sql_from_sql_inspection_plan_changed
- oas_anomaly_sql_from_sql_inspection_perf_degradation
- oas_anomaly_sql_from_anomaly_event_analysis_plan_changed
- oas_anomaly_sql_from_anomaly_event_analysis_perf_degradation
- oas_anomaly_sql_from_anomaly_event_analysis_cpu_percent_high
The tenant has large transactions or large queries. You can view the performance monitoring information of the tenant to locate large transactions and large queries.
The business concurrency is high.