ob_host_disk_readonly

2023-08-22 02:51:01  Updated

Description

This alert is triggered when the OBServer disk is read-only.

Principle

The following table describes the key parameters that are involved in the monitoring and alerting logic.

Parameter           Value
Metric              ob_host_disk_readonly_flag
Source              node_exporter
Collected metric    node_filesystem_readonly
Metric expression   max(node_filesystem_readonly{@LABELS}) by (@GBLABELS)
Collection cycle    1 second

In each collection cycle, OCP-Agent runs a scheduled task that checks whether the disk is read-only. If the disk is read-only, the collected metric node_filesystem_readonly is set to 1. Otherwise, it is set to 0. When the disk is read-only, OCP-Agent also attaches the flag is_ob_disk="1" to the disk where the OBServer installation directory, data directory, or log directory is located.

The value of the metric ob_host_disk_readonly_flag indicates whether the disk is read-only.

The alert is triggered when the value is 1, indicating that the disk is read-only.
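
The flag logic described above can be approximated locally. The following is a minimal sketch, not the OCP-Agent or node_exporter implementation: readonly_flag and max_readonly_flag are hypothetical helper names, and the sketch assumes that a file system counts as read-only when the ro option appears in its /proc/mounts entry.

```shell
# Hypothetical helper: print 1 if the file system mounted at mount point $1
# has the "ro" option, else 0. $2 is the mounts table (default: /proc/mounts).
# A mount point that is not listed is reported as 0.
readonly_flag() {
    awk -v mp="$1" '
        $2 == mp {
            # Field 4 holds the comma-separated mount options.
            n = split($4, opts, ",")
            flag = 0
            for (i = 1; i <= n; i++) if (opts[i] == "ro") flag = 1
            found = 1
        }
        END { print (found ? flag : 0) }
    ' "${2:-/proc/mounts}"
}

# Hypothetical helper that mimics max(...) in the metric expression:
# print 1 if any of the given mount points is read-only, else 0.
# $1 is the mounts table, the remaining arguments are mount points.
max_readonly_flag() {
    table=$1
    shift
    max=0
    for mp in "$@"; do
        if [ "$(readonly_flag "$mp" "$table")" = "1" ]; then
            max=1
        fi
    done
    echo "$max"
}
```

For example, max_readonly_flag /proc/mounts /home/admin /data prints 1 if either mount point is read-only. The mount points are placeholders for the OBServer installation, data, and log directories on your host.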

Alert rule

Metric                      Default threshold  Duration   Detection cycle  Time before clearance
ob_host_disk_readonly_flag  1                  0 seconds  60 seconds       5 minutes

Alert information

Trigger method     Alert level  Scope
Metric expression  Critical     Server

Alert templates

  • Overview: ${alarm_target} ${alarm_name}

  • Details: ${alarm_target} ${alarm_name}

  • Overview example: ob_cluster=C1-1000:svr_ip=xxx.xxx.xxx.xxx. The OBServer disk is read-only.

  • Details example: ob_cluster=C1-1000:svr_ip=xxx.xxx.xxx.xxx. The OBServer disk is read-only.

${alarm_target} follows the ob_cluster=xxxxxxx:svr_ip=xxxxxx format.

  • ob_cluster indicates the name of the cluster that generated the alert.

  • svr_ip indicates the IP address of the OBServer of the cluster that generated the alert.
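
If you process alert messages in a script, the ${alarm_target} format above can be split into its fields. The following is a minimal sketch: alarm_target_field is a hypothetical helper name, and the sketch assumes that fields are separated by colons and that values contain no colons (for example, IPv4 addresses).

```shell
# Hypothetical helper: print the value of field $2 (for example, ob_cluster
# or svr_ip) from an alarm target string $1 such as
# "ob_cluster=C1-1000:svr_ip=192.0.2.10".
alarm_target_field() {
    printf '%s\n' "$1" |
        tr ':' '\n' |
        awk -F= -v key="$2" '$1 == key { print substr($0, length(key) + 2) }'
}
```

For example, alarm_target_field "ob_cluster=C1-1000:svr_ip=192.0.2.10" svr_ip prints the IP address part of the target. The address 192.0.2.10 is a documentation placeholder.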

Impact on the system

  • If the log disk and data disk are read-only, the OBServer cannot run properly.

  • If the log file disk is read-only, the application cannot write logs normally.

Possible causes

  • The disk does not have sufficient space.

  • A permission exception occurred in the operating system.

  • A physical medium error occurred. For example, the disk malfunctions.

Suggested solutions

  • Check the disk space.

    Insufficient disk space also triggers dedicated disk space alerts. Refer to the corresponding topics, take action as needed, and then check whether the ob_host_disk_readonly alert is cleared.

  • Check for other issues.

    Many other issues can cause a disk to become read-only. We recommend that you contact your Linux O&M engineer for troubleshooting.

    If that is not an option, run the following commands to list the read-only disks and find the related error messages in /var/log/messages, and then search online for the error message to find a solution.

    # List all file systems that are mounted read-only.
    grep "[[:space:]]ro[[:space:],]" /proc/mounts
    # Sample output. The ro mount option shows that /dev/loop0 is mounted read-only.
    #   tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
    #   /dev/loop0 /home/admin iso9660 ro,relatime,nojoliet,check=s,map=n,blocksize=2048 0 0
    
    # Find out why the disk is read-only. You can replace '/dev/loop0' with a keyword such as 'readonly'.
    grep -Rn '/dev/loop0' /var/log/messages
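
For the disk space check in the first suggestion, the following is a minimal sketch that reports how full a file system is. parse_df_usage and disk_usage_percent are hypothetical helper names, and the sketch assumes the POSIX output format of df -P, where the fifth column is the capacity percentage.

```shell
# Hypothetical helper: read "df -P" output from standard input and print
# the usage percentage of the first listed file system without the % sign.
parse_df_usage() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Hypothetical helper: print the usage percentage of the file system
# that holds the path $1.
disk_usage_percent() {
    df -P "$1" | parse_df_usage
}
```

For example, disk_usage_percent /home/admin prints the usage of the file system that holds /home/admin. Replace the path with the OBServer installation, data, or log directory on your host.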
    
