Description
This alert is triggered when the thread usage of an OceanBase Database tenant exceeds the threshold.
Thread usage of an OceanBase Database tenant = Actual number of active threads of the tenant / Sum of the maximum numbers of tenant threads for all units in the primary zone of the tenant.
Principle
The following table describes the key parameters that are involved in the monitoring and alerting logic.
| Parameter | Value |
|---|---|
| Metric | ob_tenant_thread_percent. Note: This metric indicates the thread usage of a tenant in an OceanBase cluster. An alert is triggered when the metric value is greater than the threshold. The default threshold is 95%. |
| Source | SQL: select /*+read_consistency(weak)*/ tenant_name, tenant_id, stat_id, value from v$sysstat, __all_tenant where stat_id IN (140006, 140005) and (con_id > 1000 or con_id = 1) and __all_tenant.tenant_id = v$sysstat.con_id; The value of the value field is used as the value of the sysstat_value metric, and the other fields are used as labels. |
| Collected metric | sysstat_value |
| Metric expression | 100 * sum(sysstat_value{metric_group="sysstat",stat_id="140006",@LABELS}) by (@GBLABELS) / sum(sysstat_value{metric_group="sysstat",stat_id="140005",@LABELS}) by (@GBLABELS). ob_tenant_thread_percent is the sum of the value field for stat_id 140006 expressed as a percentage of the sum of the value field for stat_id 140005. |
| Collection cycle | 1 second |
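
The metric expression above can be illustrated with a minimal Python sketch. It assumes that the collected rows are available as dictionaries with the tenant_name, stat_id, and value fields returned by the source SQL statement, and that results are grouped by tenant_name. The grouping key is an assumption, because @GBLABELS is a placeholder in the expression. The function name and constants are hypothetical and are not part of OCP.

```python
from collections import defaultdict

# stat_id values taken from the metric expression; the constant names are hypothetical.
NUMERATOR_STAT_ID = "140006"
DENOMINATOR_STAT_ID = "140005"


def tenant_thread_percent(samples):
    """Return {tenant_name: percent} computed as
    100 * sum(value where stat_id = 140006) / sum(value where stat_id = 140005).

    Grouping by tenant_name is an assumption made for this sketch.
    """
    numerators = defaultdict(float)
    denominators = defaultdict(float)
    for sample in samples:
        stat_id = str(sample["stat_id"])
        if stat_id == NUMERATOR_STAT_ID:
            numerators[sample["tenant_name"]] += sample["value"]
        elif stat_id == DENOMINATOR_STAT_ID:
            denominators[sample["tenant_name"]] += sample["value"]

    result = {}
    for tenant, total in denominators.items():
        if total > 0:  # skip tenants without a usable denominator
            result[tenant] = 100.0 * numerators.get(tenant, 0.0) / total
    return result


# Made-up sample rows for two tenants, two servers for tenant-1.
rows = [
    {"tenant_name": "tenant-1", "stat_id": 140006, "value": 48},
    {"tenant_name": "tenant-1", "stat_id": 140006, "value": 48},
    {"tenant_name": "tenant-1", "stat_id": 140005, "value": 50},
    {"tenant_name": "tenant-1", "stat_id": 140005, "value": 50},
    {"tenant_name": "tenant-2", "stat_id": 140006, "value": 10},
    {"tenant_name": "tenant-2", "stat_id": 140005, "value": 40},
]
percent_by_tenant = tenant_thread_percent(rows)
```

With these made-up rows, tenant-1 has a usage of 96%, which is greater than the default threshold of 95%, and tenant-2 has a usage of 25%.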
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Based on the expression of the metric | Warning | Tenant |
Alert rule
| Metric | Default threshold (unit: %) | Duration | Detection cycle | Time before clearance |
|---|---|---|---|---|
| ob_tenant_thread_percent{app="OB"} | 95 | 0 seconds | 60 seconds | 5 minutes |
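
The following Python sketch shows one way to read the alert rule above. It is an interpretation and not the OCP implementation. It assumes that the rule is evaluated once per detection cycle, that the alert fires after the value has been greater than the threshold for at least the duration, and that the alert is cleared after the value has stayed at or below the threshold for the time before clearance. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    threshold: float = 95.0       # default threshold, in percent
    duration_s: int = 0           # "Duration" column
    detection_cycle_s: int = 60   # "Detection cycle" column (how often evaluate() is called)
    clearance_s: int = 300        # "Time before clearance" column (5 minutes)


class AlertState:
    """Tracks whether the alert is firing across successive evaluations."""

    def __init__(self, rule: AlertRule) -> None:
        self.rule = rule
        self.firing = False
        self.breach_since: Optional[int] = None
        self.ok_since: Optional[int] = None

    def evaluate(self, now_s: int, value: float) -> bool:
        if value > self.rule.threshold:
            # The value is above the threshold: reset the clearance timer.
            self.ok_since = None
            if self.breach_since is None:
                self.breach_since = now_s
            if now_s - self.breach_since >= self.rule.duration_s:
                self.firing = True
        else:
            # The value is at or below the threshold: reset the breach timer.
            self.breach_since = None
            if self.firing:
                if self.ok_since is None:
                    self.ok_since = now_s
                if now_s - self.ok_since >= self.rule.clearance_s:
                    self.firing = False
                    self.ok_since = None
        return self.firing
```

Under these assumptions, a duration of 0 seconds means that the first evaluation that sees a value greater than 95 raises the alert.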
Alert templates
Overview template: ${alarm_target} ${alarm_name}
Details template: Cluster: ${ob_cluster_name}. Tenant: ${tenant_name}. Alert: ${alarm_name}. The tenant thread usage is ${value_shown}, exceeding the threshold ${alarm_threshold} %.
Overview example: ob_cluster=obcluster-1:tenant_name=tenant-1:svr_ip=xxx.xxx.xxx.xxx. The thread usage of the OceanBase Database tenant exceeds the threshold.
Details example: Cluster: obcluster-1. Tenant: m****. Alert: The thread usage of the OceanBase Database tenant exceeds the threshold. The tenant thread usage is 96.0%, exceeding the threshold 95.0%.
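
To illustrate how the placeholders in the details template are filled in, the following sketch uses Python's string.Template class, which supports the same ${name} placeholder syntax. This is only an illustration and does not describe how OCP renders alerts. The field values are made up and mirror the details example above.

```python
from string import Template

# The details template, copied from this topic.
DETAILS_TEMPLATE = Template(
    "Cluster: ${ob_cluster_name}. Tenant: ${tenant_name}. Alert: ${alarm_name}. "
    "The tenant thread usage is ${value_shown}, exceeding the threshold ${alarm_threshold} %."
)

# Hypothetical field values for illustration.
fields = {
    "ob_cluster_name": "obcluster-1",
    "tenant_name": "tenant-1",
    "alarm_name": "The thread usage of the OceanBase Database tenant exceeds the threshold",
    "value_shown": "96.0%",
    "alarm_threshold": "95.0",
}

# substitute() raises KeyError if a placeholder has no value,
# which makes a missing field easy to spot.
details = DETAILS_TEMPLATE.substitute(fields)
print(details)
```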
Impact on the system
No impact
Possible causes
The application queries a large amount of data or generates hotspot data.
The resource plan of the tenant cannot meet business requirements, or hotspot data is generated.
Solutions
Check whether the load is normal for the application.
Go to the Performance Monitoring page of the tenant that generated the alert in the OceanBase Cloud Platform (OCP) console. View the Tenant thread usage chart on the Performance and SQL tab. Check whether the tenant thread usage at the alert time surged in comparison with the thread usage in the last one to seven days.
If yes, the tenant thread usage is abnormal.
Otherwise, the tenant thread usage is normal.
If the tenant thread usage surge is caused by normal traffic, we recommend that you increase the unit specifications of the tenant and allocate more thread resources to the tenant. For more information, see Manage unit specifications.
If the high tenant thread usage is caused by queries on a large amount of data or by hotspot traffic, perform the following steps based on the actual scenario.
A large amount of data is queried in SQL execution.
On the TopSQL tab of the SQL Diagnostics page, check whether SQL queries with high CPU utilization exist.
If yes, optimize the SQL query that causes high tenant thread usage.
Otherwise, perform the next step.
The high tenant thread usage is caused by slow SQL queries.
On the SlowSQL tab of the SQL Diagnostics page, check the diagnosis result.
Optimize the slow SQL queries you find.
If the high tenant thread usage is caused by hotspot rows on a node in the tenant, apply throttling to the cluster. For more information, see Apply throttling to an OceanBase cluster.
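
The comparison in the first step, which checks whether the thread usage at the alert time surged relative to the last one to seven days, can be sketched as follows. This topic does not define a numeric criterion for a surge, so the ratio of 1.5 is an assumption chosen for illustration, and the function name is hypothetical. The sketch also assumes that you have exported the historical values from the Tenant thread usage chart yourself.

```python
from statistics import mean


def is_surge(current_percent, history_percent, ratio=1.5):
    """Return True if current_percent is more than `ratio` times the mean of history_percent.

    history_percent holds thread usage values (in percent) observed at comparable
    times over the previous one to seven days. The default ratio is an assumption.
    """
    if not history_percent:
        raise ValueError("history_percent must contain at least one value")
    baseline = mean(history_percent)
    return current_percent > baseline * ratio


# Made-up values: usage at the alert time compared with the previous seven days.
abnormal = is_surge(96.0, [40.0, 45.0, 50.0, 42.0, 48.0, 44.0, 46.0])
normal = is_surge(96.0, [90.0, 92.0, 94.0])
```

In the first call the baseline is 45%, so a usage of 96% counts as a surge. In the second call the tenant has been running close to its limit for days, which points to insufficient unit specifications rather than a sudden anomaly.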