tenant_cpu_percent_over_threshold

2025-03-28 08:05:55  Updated

Description

This alert is triggered when the thread usage of an OceanBase Database tenant exceeds the threshold.

Thread usage of an OceanBase Database tenant = Actual number of active threads of the tenant/Sum of the maximum numbers of tenant threads for all units in the primary zone of the tenant.

Principle

The following table describes the key parameters that are involved in the monitoring and alerting logic.

Parameter Value
Metric ob_tenant_thread_percent

Note

This metric indicates the thread usage of a tenant in an OceanBase cluster. An alert is triggered when the metric value is greater than the threshold. The default threshold is 95%.

Source SQL: select /*+read_consistency(weak)*/ tenant_name, tenant_id, stat_id, value from v$sysstat, __all_tenant where stat_id IN (140006, 140005) and (con_id > 1000 or con_id = 1) and __all_tenant.tenant_id = v$sysstat.con_id;
The value of the value field is used as that of the sysstat_value metric, and other fields are used as labels.
Collected metric sysstat_value
Metric expression 100 * sum(sysstat_value{metric_group="sysstat",stat_id="140006",@LABELS}) by (@GBLABELS) / sum(sysstat_value{metric_group="sysstat",stat_id="140005",@LABELS}) by (@GBLABELS)
  • If the value of stat_id is 140005, the result is the maximum number of threads available for the tenant.
  • If the value of stat_id is 140006, the result is the number of threads being used by the tenant.
The metric expression means that the value of ob_tenant_thread_percent is the percentage of the sum of values of the value field when stat_id is 140006 to that when stat_id is 140005.
Collection cycle 1 second

Alert information

Trigger method Alert level Scope
Based on the expression of the metric Warning Tenant

Alert rule

Metric Default threshold (unit: %) Duration Detection cycle Time before clearance
ob_tenant_thread_percent{app="OB"} 95 0 seconds 60 seconds 5 minutes

Alert templates

  • Overview template: ${alarm_target} ${alarm_name}

  • Details template: Cluster: ${ob_cluster_name}. Tenant: ${tenant_name}. Alert: ${alarm_name}. The tenant thread usage is ${value_shown}, exceeding the threshold ${alarm_threshold} %.

  • Overview example: ob_cluster=obcluster-1:tenant_name=tenant-1:svr_ip=xxx.xxx.xxx.xxx. The thread usage of the OceanBase Database tenant exceeds the threshold.

  • Details example: Cluster: obcluster-1. Tenant: m****. Alert: The thread usage of the OceanBase Database tenant exceeds the threshold. The tenant thread usage is 96.0%, exceeding the threshold 95.0%.

Impact on the system

No impact

Possible causes

  • The application queries a large amount of data or generates hotspot data.

  • The resource plan of the tenant cannot cope with business requirements or hotspot data is generated.

Solutions

  1. Check whether the load is normal for the application.

    Go to the Performance Monitoring page of the tenant that generated the alert in the OceanBase Cloud Platform (OCP) console. View the Tenant thread usage chart on the Performance and SQL tab. Check whether the tenant thread usage at the alert time surged in comparison with the thread usage in the last one to seven days.

    • If yes, the tenant thread usage is abnormal.

    • Otherwise, the tenant thread usage is normal.

      If the tenant thread usage surge is caused by normal traffic, we recommend that you increase the unit specifications of the tenant and allocate more thread resources to the tenant. For more information, see Manage unit specifications.

  2. If the high tenant thread usage is caused by querying a large amount of data or hotspot traffic, perform the following steps based on the actual scenarios.

    • A large amount of data is queried in SQL execution.

      On the TopSQL tab of the SQL Diagnostics page, check whether SQL queries with high CPU utilization exist.

      • If yes, optimize the SQL query that causes high tenant thread usage.

      • Otherwise, perform the next step.

    • The high tenant thread usage is caused by slow SQL queries.

      On the SlowSQL tab of the SQL Diagnostics page, check the diagnosis result.

      Optimize the slow SQL queries you find.

    • If the high tenant thread usage is caused by hotspot rows of a node in the tenant. For more information, see Apply throttling to an OceanBase cluster.

Contact Us