ob_cpu_percent_over_threshold Average tenant thread usage rate exceeds the threshold on OceanBase server

2025-09-08 08:15:43  Updated

Alert description

The average tenant thread usage on an OceanBase server exceeds the threshold.

Alerting principle

The following table describes the key parameters used in the monitoring logic for this alert.

Parameter | Value
Monitoring metric | ob_cpu_percent
Metric source | select /*+read_consistency(weak)*/ tenant_name, tenant_id, stat_id, value from v$sysstat, __all_tenant where stat_id IN (140005, 140006) and (con_id > 1000 or con_id = 1) and __all_tenant.tenant_id = v$sysstat.con_id; The value field is assigned to the collected metric, and other fields serve as labels.
Collected metric | sysstat_value
Monitoring expression | 100 * sum(sysstat_value{metric_group="sysstat",stat_id="140006",@LABELS}) by (@GBLABELS) / sum(sysstat_value{metric_group="sysstat",stat_id="140005",@LABELS}) by (@GBLABELS)
Collection interval | 1 second

The monitoring metric ob_cpu_percent indicates the average tenant thread usage on the OceanBase server. An alert is triggered when this value exceeds the threshold (default 90%).

Note

  • Statistical event ID 140005: the maximum number of tenant threads available on the OceanBase server.
  • Statistical event ID 140006: the number of tenant threads used on the OceanBase server.

The monitoring expression in the table divides the sum of the values for stat_id=140006 by the sum of the values for stat_id=140005 and multiplies the result by 100. The resulting percentage is the monitoring metric value, which indicates the average tenant thread usage on the OceanBase server.
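As a worked illustration of this calculation, the following minimal Python sketch sums both statistics across the tenants on each server and derives the percentage. The sample rows, the IP addresses, the tenant name t1, the function name ob_cpu_percent, and the grouping by server IP are assumptions made for illustration. Only the two stat IDs and the formula come from the table above.

```python
from collections import defaultdict

# Hypothetical sysstat samples: (svr_ip, tenant_name, stat_id, value).
# stat_id 140005 = maximum tenant threads, 140006 = tenant threads in use.
SAMPLES = [
    ("10.0.0.1", "sys", 140005, 10.0),
    ("10.0.0.1", "sys", 140006, 8.0),
    ("10.0.0.1", "t1", 140005, 30.0),
    ("10.0.0.1", "t1", 140006, 29.0),
    ("10.0.0.2", "sys", 140005, 10.0),
    ("10.0.0.2", "sys", 140006, 2.0),
    ("10.0.0.2", "t1", 140005, 30.0),
    ("10.0.0.2", "t1", 140006, 10.0),
]


def ob_cpu_percent(samples):
    """Return {svr_ip: 100 * sum(used) / sum(max)} for each server."""
    used = defaultdict(float)
    limit = defaultdict(float)
    for svr_ip, _tenant, stat_id, value in samples:
        if stat_id == 140006:
            used[svr_ip] += value
        elif stat_id == 140005:
            limit[svr_ip] += value
    # Skip servers whose maximum is zero to avoid dividing by zero.
    return {ip: 100.0 * used[ip] / limit[ip] for ip in limit if limit[ip] > 0}


result = ob_cpu_percent(SAMPLES)
for ip, pct in sorted(result.items()):
    status = "over threshold" if pct > 90 else "ok"
    print(f"{ip}: {pct:.1f}% ({status})")
```

With these sample values, the first server is at 37 / 40 = 92.5%, above the default threshold of 90%, while the second is at 12 / 40 = 30%.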

Rule information

Monitoring metric | Default threshold (unit: %) | Duration | Detection cycle | Elimination cycle
ob_cpu_percent | 90 | 60 seconds | 60 seconds | 5 minutes
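The sketch below shows one way the threshold, duration, and detection cycle could interact. It rests on assumed semantics that this page does not spell out: that the metric is evaluated once per detection cycle, and that the alert fires only after the value has stayed above the threshold for the full duration. The elimination cycle is not modeled, and the function name first_alert_time is hypothetical.

```python
THRESHOLD = 90.0   # default threshold, in percent
DURATION_S = 60    # how long the breach must persist (assumed meaning)
DETECTION_S = 60   # spacing between evaluations (assumed meaning)


def first_alert_time(readings, threshold=THRESHOLD,
                     duration_s=DURATION_S, step_s=DETECTION_S):
    """Return the offset in seconds at which the alert would fire, or None.

    readings[i] is the metric value seen at evaluation i; evaluations are
    step_s seconds apart, starting at offset 0.
    """
    breach_start = None
    for i, value in enumerate(readings):
        now = i * step_s
        if value > threshold:
            if breach_start is None:
                breach_start = now  # first evaluation of this breach
            if now - breach_start >= duration_s:
                return now
        else:
            breach_start = None  # breach interrupted, start over
    return None


print(first_alert_time([85.0, 91.0, 95.0]))
print(first_alert_time([91.0, 85.0, 92.0]))
```

Under these assumptions, a single reading above 90% does not raise the alert. The value must still be above the threshold at the next evaluation 60 seconds later.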

Alert information

Alert trigger method | Alert level | Scope
Based on the expression of the monitoring metric | Critical | Server

Alert template

  • Alert summary

    • Template: ${alarm_target} ${alarm_name}
    • Example: ob_cluster=obcluster-1:svr_ip=xxx.xxx.xxx.xxx OceanBase server tenant thread average usage exceeds the limit
  • Alert details

    • Template: Cluster: ${ob_cluster_name}, Host: ${host}, Alert: OceanBase tenant thread average usage is ${value_shown}%, which exceeds ${alarm_threshold} %.
    • Example: Cluster: obcluster-1, Host: xxx.xxx.xxx.xxx, Alert: OceanBase tenant thread average usage is 91.0 %, which exceeds 90.0 %.
  • Alert recovery

    • Template: Alert: ${alarm_name}, OceanBase server tenant thread average usage: ${value_shown}
    • Example: Alert: OceanBase server tenant thread average usage exceeds the limit, OceanBase server tenant thread average usage: 89 %
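The templates above use ${name} placeholders. As a minimal sketch of how such a template expands, the following Python code fills in the alert details template with string.Template from the standard library, which happens to use the same placeholder syntax. This is only an illustration and not how OCP renders alerts internally. The function name render_details and the one-decimal number formatting are assumptions.

```python
from string import Template

# Alert details template, copied from the list above.
DETAILS_TEMPLATE = Template(
    "Cluster: ${ob_cluster_name}, Host: ${host}, Alert: OceanBase tenant thread "
    "average usage is ${value_shown}%, which exceeds ${alarm_threshold} %."
)


def render_details(ob_cluster_name, host, value, threshold):
    """Fill the placeholders; numbers are shown with one decimal (assumed)."""
    return DETAILS_TEMPLATE.substitute(
        ob_cluster_name=ob_cluster_name,
        host=host,
        value_shown=f"{value:.1f}",
        alarm_threshold=f"{threshold:.1f}",
    )


message = render_details("obcluster-1", "xxx.xxx.xxx.xxx", 91.0, 90.0)
print(message)
```

The spacing before the percent signs follows the template text literally, so the rendered string can differ slightly from the example shown above.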

Impact on the system

A high average tenant thread usage rate leads to a decrease in system throughput and an increase in request latency.

A brief spike in the average tenant thread usage rate generally does not cause significant issues. If the rate stays high, investigate and resolve the cause.

Possible causes

The alert can be triggered while complex SQL queries are being executed.

Solution

  1. Check whether the alert tenant_cpu_percent_over_threshold (OceanBase tenant thread usage exceeds the threshold) has also been triggered.

  2. Multiple tenants may be experiencing increased load at the same time, so that the load accumulates on the OBServer node and triggers the alert.

    To reduce the thread usage rate caused by a surge in traffic affecting multiple tenants, you can take the following actions:
