Description
This alert is triggered when the proportion of data disk space used for storing data files on an OBServer exceeds the threshold.
Note
By default, the data disk is mounted to the /data/1 directory.
When you create an OceanBase cluster, the data disk is initialized and the size of disk space for storing data files is specified.
Principle
The following table describes the key parameters that are involved in the monitoring and alerting logic.
| Parameter | Value |
|---|---|
| Metric | ob_server_disk_percent |
| Source | SQL: select /*+ READ_CONSISTENCY(WEAK)*/ total_size, free_size from __all_virtual_disk_stat where svr_ip = @svr_ip and svr_port = rpc_port(); |
| Collected metric | total_size and free_size |
| Metric expression | 100 * (sum(total_size{metric_group="all_virtual_disk_stat",@LABELS}) by (@GBLABELS) - sum(free_size{metric_group="all_virtual_disk_stat",@LABELS}) by (@GBLABELS)) / sum(total_size{metric_group="all_virtual_disk_stat",@LABELS}) by (@GBLABELS) |
| Collection cycle | 60 seconds |
The value of the metric ob_server_disk_percent indicates the data disk usage of the OBServer. When this value exceeds the threshold, this alert is triggered. The default threshold is 85%.
Alert rule
| Metric | Default threshold (unit: %) | Duration | Detection cycle | Elimination cycle |
|---|---|---|---|---|
| ob_disk_percent{app="OB"} | 85 | 0 seconds | 60 seconds | 5 minutes |
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Based on the expression of the metric | Critical | Server |
Alert templates
Overview: ${alarm_target} ${alarm_name}
Details: Cluster: ${ob_cluster_name}, Host: ${host}, Alert: ${alarm_name}. The data disk usage is ${value}%, exceeding the threshold of ${alarm_threshold}%.
Overview example: ob_cluster=C1-1000:svr_ip=xxx.xxx.xxx.xxx Excessive data disk usage of the OBServer
Details example: Cluster: obcluster-1, Host: xxx.xxx.xxx.xxx, Alert: Excessive data disk usage of the OBServer. The data disk usage is 86.0%, exceeding the threshold of 85.0%.
Impact on the system
If the data disk space is insufficient, new data cannot be written to the data disk.
Possible causes
The application data size increases continuously.
Suggested solutions
Perform the following steps to solve the problem:
Delete unwanted tenants, databases, and tables.
Empty the recycle bin, and perform two cluster compaction tasks to release resources.
Migrate units to another OBServer. If no other OBServer is available, add OBServers to scale out the cluster.
Run the following SQL commands to view the table storage space of a tenant:
-- Specify the tenant ID.
set @tenant_id=1001
-- Applies to OceanBase 1.x
select b.table_name,sum(a.required_size)/1024/1024/1024 required_GB,sum(row_count) as rows from __all_meta_table a, __all_table b where a.table_id = b.table_id and a.zone = 'zone_name' and a.tenant_id = @tenant_id group by a.table_id order by required_GB desc;
-- Applies to OceanBase 2.x and 3.x
select c.database_name, b.table_name,sum(a.required_size)/1024/1024/1024 required_GB,sum(row_count) as rows from __all_virtual_meta_table a inner join __all_virtual_table b on a.table_id=b.table_id inner join __all_virtual_database c on b.database_id=c.database_id where b.table_type<>5 and a.zone = 'zone_name' and a.tenant_id = @tenant_id group by a.table_id order by required_GB desc;