Alert description
This alert is triggered when the ratio of the total number of CPU cores allocated to tenants to the total number of CPU cores across all hosts in the OceanBase cluster exceeds the threshold.
Each tenant can be allocated multiple resource units (RU) in a zone. Each RU has specified upper and lower limits for the number of available CPU cores. This alert focuses on whether the allocation is saturated, not the actual usage rate.
Alert principle
The following table lists the key parameters involved in the monitoring logic of this alert.
| Parameter | Value |
|---|---|
| Monitoring metric | cpu_assigned_percent |
| Metric source | SQL: (1) select /*+ MONITOR_AGENT READ_CONSISTENCY(WEAK) */ cpu_total, cpu_assigned, mem_total, mem_assigned, disk_total, cpu_assigned_percent, mem_assigned_percent from __all_virtual_server_stat where svr_ip = ? and svr_port = ? (2) select /*+ MONITOR_AGENT READ_CONSISTENCY(WEAK) */ cpu_total, cpu_max_assigned as cpu_assigned, mem_total, mem_max_assigned as mem_assigned, disk_total, cpu_assigned_percent, mem_assigned_percent from __all_virtual_server_stat where svr_ip = ? and svr_port = ? (3) select /* MONITOR_AGENT */ cpu_capacity_max as cpu_total, cpu_assigned_max as cpu_assigned, mem_capacity as mem_total, mem_assigned as mem_assigned, data_disk_capacity as disk_total, (cpu_assigned_max / cpu_capacity_max) as cpu_assigned_percent, (mem_assigned / mem_capacity) as mem_assigned_percent from V$OB_SERVERS where svr_ip = ? and svr_port |
| Metric collection | cpu_assigned, cpu_total |
| Monitoring expression | 100 * avg(ob_server_resource_cpu_assigned{@LABELS}) by (@GBLABELS) / avg(ob_server_resource_cpu{@LABELS}) by (@GBLABELS) |
| Collection cycle | 60 seconds |
The value of the monitoring metric is the ratio of the total number of CPU cores allocated to tenants to the total number of CPU cores across all OBServer nodes in the OceanBase cluster, expressed as a percentage. The alert is triggered when this ratio exceeds the threshold, which is 98% by default.
Note
- cpu_assigned indicates the total number of CPU cores allocated to tenants.
- cpu_total indicates the total number of CPU cores across all OBServer nodes in the cluster.
- The monitoring expression in the table calculates the ratio of these two values as the monitoring metric.
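The ratio described in the notes above can be sketched in Python as follows. This is a minimal illustration, not the actual OCP implementation: the function names, the `THRESHOLD_PERCENT` constant, and the sample values are assumptions made for this example, and the `avg ... by` aggregation of the monitoring expression is left out. Only the ratio itself and the 98% default threshold come from this topic.

```python
# Default threshold (in percent) taken from the rule information of this alert.
THRESHOLD_PERCENT = 98.0


def cpu_assigned_percent(cpu_assigned: float, cpu_total: float) -> float:
    """Return 100 * cpu_assigned / cpu_total, mirroring the monitoring expression."""
    if cpu_total <= 0:
        raise ValueError("cpu_total must be positive")
    return 100.0 * cpu_assigned / cpu_total


def is_alert_triggered(cpu_assigned: float, cpu_total: float,
                       threshold: float = THRESHOLD_PERCENT) -> bool:
    """The alert fires only when the ratio strictly exceeds the threshold."""
    return cpu_assigned_percent(cpu_assigned, cpu_total) > threshold


if __name__ == "__main__":
    # Hypothetical sample: 63.5 of 64 cores are allocated to tenants.
    print(cpu_assigned_percent(63.5, 64))
    print(is_alert_triggered(63.5, 64))
```

Because the check looks at allocated cores only, a mostly idle server can still trigger the alert if its cores are almost fully assigned to tenants.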
Alert information
| Alert trigger method | Alert level | Scope |
|---|---|---|
| Based on the monitoring metric expression | Notice | Server |
Rule information
| Monitoring metric | Default threshold (unit: %) | Duration | Detection cycle | Elimination cycle |
|---|---|---|---|---|
| cpu_assigned_percent | 98 | 0 seconds | 60 seconds | 5 minutes |
Alert template
Alert overview
- Template: ${alarm_target} ${alarm_name}
- Example: ob_cluster=obcluster-1:svr_ip=xxx.xxx.xxx.xxx OceanBase has exceeded the percentage of CPU cores allocated to tenants.
Alert details
- Template: Cluster: ${ob_cluster_name}, Host: ${host}, Alert: CPU utilization of ${value_shown} exceeds ${alarm_threshold} %.
- Example: Cluster: obcluster-1, Host: xxx.xxx.xxx.xxx, Alert: CPU utilization of 99.0 % exceeds 98.0 %.
Alert recovery
- Template: Alert: ${alarm_name}, OceanBase has exceeded the percentage of CPU cores allocated to tenants: ${value_shown}
- Example: Alert: OceanBase has exceeded the percentage of CPU cores allocated to tenants, OceanBase has exceeded the percentage of CPU cores allocated to tenants: 97%
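To illustrate how the `${...}` placeholders in the templates above map to the examples, the following sketch renders the alert details template with Python's `string.Template`, which happens to use the same placeholder syntax. This is an assumption for illustration only: this topic does not state how the templates are rendered internally, and the helper name `render_details` is hypothetical. The sketch also assumes that `${value_shown}` already carries its own unit.

```python
from string import Template

# Alert details template copied from this topic.
DETAILS_TEMPLATE = Template(
    "Cluster: ${ob_cluster_name}, Host: ${host}, "
    "Alert: CPU utilization of ${value_shown} exceeds ${alarm_threshold} %."
)


def render_details(ob_cluster_name: str, host: str,
                   value_shown: str, alarm_threshold: str) -> str:
    """Fill in the placeholders; substitute() raises KeyError if one is missing."""
    return DETAILS_TEMPLATE.substitute(
        ob_cluster_name=ob_cluster_name,
        host=host,
        value_shown=value_shown,
        alarm_threshold=alarm_threshold,
    )


if __name__ == "__main__":
    # Assumption: value_shown includes the " %" unit, as in the documented example.
    print(render_details("obcluster-1", "xxx.xxx.xxx.xxx", "99.0 %", "98.0"))
```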
Impact on the system
If an existing tenant needs more resources or a new tenant needs to be created, the operation may fail because of insufficient resources.
Possible causes
This usually occurs when the CPU cores in the OceanBase cluster are almost fully allocated to tenants. Further allocation may lead to over-subscription or over-provisioning.
Solution
You can increase the available resources by using the following methods:
- Delete unnecessary tenants. For more information, see Delete a tenant.
- Scale down tenants that use fewer resources than they are allocated. For more information, see Manage unit specifications.
- Expand the cluster by adding OBServer nodes to the OceanBase cluster. For more information, see Add an OBServer node.
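As a rough aid for sizing these methods, the arithmetic below estimates how many cores would have to be released from tenants, or added through new OBServer nodes, for the ratio to drop back to the threshold. It is a sketch derived only from the ratio defined in this topic. The function names and sample values are hypothetical, and the sketch ignores the per-unit minimum and maximum CPU specifications.

```python
import math


def cores_to_free(cpu_assigned: float, cpu_total: float,
                  threshold_percent: float = 98.0) -> float:
    """Cores to release from tenants so that the ratio is at most the threshold."""
    allowed = cpu_total * threshold_percent / 100.0
    return max(0.0, cpu_assigned - allowed)


def cores_to_add(cpu_assigned: float, cpu_total: float,
                 threshold_percent: float = 98.0) -> int:
    """Whole cores to add to the total so that the ratio is at most the threshold."""
    needed_total = cpu_assigned * 100.0 / threshold_percent
    return max(0, math.ceil(needed_total - cpu_total))


if __name__ == "__main__":
    # Hypothetical sample: all 64 cores are allocated to tenants.
    print(cores_to_free(64, 64))
    print(cores_to_add(64, 64))
```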