Description
This alert is triggered when the proportion of the total number of CPU cores allocated to tenants in the total number of CPU cores of all OBServers in the OceanBase cluster exceeds the threshold.
You can allocate multiple resource units to each zone of a tenant and specify the maximum and the minimum numbers of CPU cores for each unit. This alert focuses on CPU allocation instead of CPU utilization.
Principle
The following table describes the key parameters that are involved in the monitoring and alerting logic.
| Parameter | Value |
|---|---|
| Metric | cpu_assigned_percent |
| Source | SQL: select /*+ MONITOR_AGENT READ_CONSISTENCY(WEAK) */ cpu_total, cpu_assigned, mem_total, mem_assigned,disk_total, cpu_assigned_percent, mem_assigned_percent from __all_virtual_server_stat where svr_ip = ? and svr_port = ?select /*+ MONITOR_AGENT READ_CONSISTENCY(WEAK) */ cpu_total,cpu_max_assigned as cpu_assigned,mem_total,mem_max_assigned as mem_assigned,disk_total, cpu_assigned_percent, mem_assigned_percent from __all_virtual_server_stat where svr_ip = ? and svr_port = ?select /* MONITOR_AGENT */ cpu_capacity_max as cpu_total,cpu_assigned_max as cpu_assigned,mem_capacity as mem_total,mem_assigned as mem_assigned,data_disk_capacity as disk_total, (cpu_assigned_max / cpu_capacity_max) as cpu_assigned_percent, (mem_assigned / mem_capacity) as mem_assigned_percent from V$OB_SERVERS where svr_ip = ? and svr_port |
| Collected metrics | cpu_assigned and cpu_total |
| Metric expression | 100 * avg(ob_server_resource_cpu_assigned{@LABELS}) by (@GBLABELS) / avg(ob_server_resource_cpu{@LABELS}) by (@GBLABELS) |
| Collection cycle | 60 seconds |
The value of the metric indicates the proportion of the total number of CPU cores allocated to tenants in the total number of CPU cores of all OBServers in the OceanBase cluster. This alert is triggered when this proportion exceeds the threshold. The default threshold is 98%.
Note
- cpu_assigned indicates the total number of CPU cores allocated to tenants.
- cpu_total indicates the total number of CPU cores of all OBServers in the cluster.
The metric expression in the preceding table uses the proportion of cpu_assigned in cpu_total as the monitoring value.
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Metric expression | Caution | Server |
Alert rule
| Metric | Default threshold (unit: %) | Duration | Detection cycle | Time before clearance |
|---|---|---|---|---|
| cpu_assigned_percent | 98 | 0 seconds | 60 seconds | 5 minutes |
Alert templates
Overview: ${alarm_target} ${alarm_name}
Details: ${alarm_target} ${alarm_name}. ${value}% of CPU resources are allocated, exceeding the threshold of ${alarm_threshold}%.
Overview example: ob_cluster=C1-1000:svr_ip=xxx.xxx.xxx.xxx. The proportion of CPU resources allocated to tenants exceeds the threshold.
Details example: ob_cluster=C1-1000:svr_ip=xxx.xxx.xxx.xxx. The proportion of CPU resources allocated to tenants reaches 99.0%, exceeding the threshold of 98.0%.
Impact on the system
The customer needs to apply for more resources, or new tenants may not be able to be created because of insufficient resources.
Possible cause
This problem is commonly found in cases where the CPU cores of an OceanBase cluster are fully allocated. Further allocation of the CPU cores may cause the CPU cores to be over-allocated.
Suggested solutions
Increase the available resources in the following ways:
Delete unwanted tenants.
For more information, see Delete a tenant.
Reduce unit specifications of tenants that use less resources than those allocated to them.
For more information, see Manage unit specifications.
Scale out the cluster.
Add OBServers to the OceanBase cluster. For more information, see Add an OBServer.