Description
This alert is triggered when the disk usage of the log directory mount point of OBServer exceeds the threshold.
Note
The log directory of OBServer is /data/log1 by default.
Principle
The following table describes the key parameters that are involved in the monitoring and alerting logic.
| Parameter | Value |
|---|---|
| Metric | ob_host_log_path_disk_percent. Note: This metric indicates the disk usage of the OBServer log directory. The alert is triggered when the usage exceeds the threshold, which is 85% by default. |
| Source | df -B1. Note: The metric source of this alert is special. OCP-Agent runs this command to check the disk usage of the OBServer host and labels each disk according to its role: mount_lable="install_path" for the disk of the installation directory, mount_lable="data_path" for the disk of the data directory, and mount_lable="log_path" for the disk of the log directory. |
| Collected metrics | host_partition_volume_free and host_partition_volume_total |
| Metric expression | 100 * (1 - avg(host_partition_volume_free{@LABELS}) by (@GBLABELS) / avg(host_partition_volume_total{@LABELS}) by (@GBLABELS)) |
| Collection cycle | 1 second |
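The metric expression reduces to 100 * (1 - free / total) for a single host. The following is a minimal sketch of that arithmetic, not OCP-Agent's actual implementation. The function names usage_percent and log_path_usage are hypothetical, and the sketch assumes bash and GNU df (for the --output option). It also assumes that the "avail" column of df corresponds to host_partition_volume_free, which the source does not confirm.

```shell
#!/usr/bin/env bash

# usage_percent FREE_BYTES TOTAL_BYTES
# Hypothetical helper: applies 100 * (1 - free / total) and prints the
# result with one decimal place. Prints "nan" and fails if total is not positive.
usage_percent() {
  awk -v free="$1" -v total="$2" 'BEGIN {
    if (total <= 0) { print "nan"; exit 1 }
    printf "%.1f\n", 100 * (1 - free / total)
  }'
}

# log_path_usage [PATH]
# Hypothetical helper: reads available and total bytes for the file system
# that holds PATH (default: /data/log1) and prints the usage percentage.
# Assumption: the "avail" column of df stands in for host_partition_volume_free.
log_path_usage() {
  local path="${1:-/data/log1}" free total
  read -r free total < <(df -B1 --output=avail,size "$path" | tail -n 1) || return 1
  usage_percent "$free" "$total"
}
```

For example, `usage_percent 15 100` prints 85.0, which is exactly the default threshold. After sourcing the file, running `log_path_usage /data/log1` on the OBServer host gives a value that can be compared with the one reported in the alert.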
Alert rule
| Metric | Default threshold (unit: %) | Duration | Detection cycle | Time before clearance |
|---|---|---|---|---|
| ob_host_log_path_disk_percent | 85 | 0 seconds | 60 seconds | 5 minutes |
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Metric expression | Critical | Server |
Alert templates
Overview: ${alarm_target} ${alarm_name}
Details: ${alarm_target} ${alarm_name}. The disk usage in the log directory ${log_disk_path} of the ${mount_point} mount point is ${value}%, exceeding the threshold of ${alarm_threshold}%.
Overview example: ob_cluster=C1-1000:svr_ip=192.168.1.1. The disk usage in the log directory of the OceanBase host exceeds the threshold.
Details example: ob_cluster=C1-1000:svr_ip=192.168.1.1. The disk usage in the log directory /data/log1 of the mount point /data/log1 is 98.0%, exceeding the threshold of 85.0%.
Impact on the system
When the free space of the OBServer log disk is insufficient, OBServer cannot work properly.
Possible cause
Too many files are generated by other applications.
Suggested solution
1. Run the df command to check whether the disk usage of /data/log1 exceeds the threshold.
2. If the disk usage exceeds the threshold, run the following commands to find the directories and files that occupy the most space:

       # Find the five directories or files that occupy the most space.
       du -a /data/log1 | sort -n -r | head -n 5
       # Find the five largest files.
       cd /data/log1 && find . -type f -exec du -Sh {} + | sort -rh | head -n 5

3. Delete unwanted files.
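Before deleting anything, it can help to review candidate files by age as well as by size. The following is a minimal sketch that only lists files and never deletes them. The function name list_old_files and the 7-day cutoff in the usage example are hypothetical, and the sketch assumes bash with the standard find, du, and sort utilities. Whether a listed file is safe to remove still has to be decided manually.

```shell
#!/usr/bin/env bash

# list_old_files DIR DAYS
# Hypothetical helper: prints regular files under DIR whose age in whole days
# is greater than DAYS, as "<size in KB><TAB><path>", largest first.
# It only lists files; nothing is deleted.
list_old_files() {
  local dir="$1" days="$2"
  find "$dir" -type f -mtime +"$days" -exec du -k {} + | sort -rn
}
```

For example, `list_old_files /data/log1 7 | head -n 20` shows the 20 largest files that have not been modified for more than a week.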