monitor_exporter_unavailable

2025-01-10 06:15:54  Updated

Description

This alert is triggered when an exporter becomes unavailable.

Currently, the following eight exporters are involved:

  • Three exporters for collecting OBServer monitoring data: /metrics/node/ob, /metrics/ob/basic, and /metrics/ob/extra
  • Two exporters for collecting OBProxy monitoring data: /metrics/node/obproxy and /metrics/obproxy
  • One exporter for collecting host monitoring data: /metrics/node/host
  • Two exporters for collecting OCP-Agent monitoring data: :62888/metrics/stat and :62889/metrics/stat

If both the observer and obproxy processes are deployed on the same host, eight exporters are expected to exist on the host. When it is detected that less than eight exporters are active, this alert is triggered.

Note

An exporter is an API for collecting monitoring data on a host.

Principle

The ocp_exporter_address table of the ocp_meta tenant records the information about each exporter. If the status of an exporter is inactive, the exporter may fail to collect monitoring data.

The following table describes the key parameters that are involved in the monitoring and alerting logic.

Parameter Value
Metric monitor_exporter_available
Source The exporter status that is collected from the memory. Note that the status in the ocp_exporter_address table may lag behind the actual status.
Collected metric monitor_exporter_available
Metric expression avg(monitor_exporter_available{@LABELS}) by (@GBLABELS) max(ntp_offset{app="HOST",@LABELS}) by (@GBLABELS)
Collection cycle 60 seconds

Note

Unlike other expression-triggered alerts, this alert is triggered based on the status information of an exporter. The status information is monitored and collected by the OCP-Server by executing the preceding statement in MetaDB.

The value of the monitor_exporter_available metric indicates the status of an exporter. The value 1 indicates that the exporter is available and the value 0 indicates that the exporter is unavailable.

This alert is triggered when the value of the metric is 0.

Alert rule

Metric Default threshold Duration Detection cycle Time before clearance
monitor_exporter_avaliable 0 300 seconds 15 seconds 5 minutes

Alert information

Trigger method Alert level Scope
Based on the expression of the metric Notice Server

Alert templates

${alarm_target} indicates the object that generated the alert, in the exporter_addr=xxx:exporter_type=xxx:scrape_interval=xxx format. exporter_addr indicates the URL of the exporter. exporter_type indicates the type of the exporter. scrape_interval indicates the collection cycle.

Impact on the system

No monitoring data is available in the OCP console. The system running status cannot be viewed in real time, and monitoring-related alerts cannot be triggered.

Possible causes

  • OCP-Agent is abnormal and does not return monitoring data.

  • The network connection is disconnected. As a result, OCP cannot access the monitored URL.

Solutions

The ocp_exporter_address table records the exporter status, which may lag behind the actual status. If the status changes to inactive, OCP will decrease the frequency of collecting exporter status data. Therefore, the occasional inactive status of an exporter is not immediately updated to the ocp_exporter_address table.

  1. You can call the unix-socket operation on the server where the exporter in the inactive state resides to check whether the exporter can be accessed.

     sudo curl -s --unix-socket /home/admin/ocp_agent/run/ocp_monagent.$(cat /home/admin/ocp_agent/run/ocp_monagent.pid).sock http://unix-socket-server/metrics/ob/basic
    
  2. Check the network connectivity between OCP server and the server where the exporter in the inactive state resides: Run the following command on the OCP server to check whether the URL of the exporter is accessible: Since OCP V3.2.0, authentication is enabled for the exporters on each server by default. You can configure the system parameter ocp.agent.auth.metric-auth-enabled to temporarily disable authentication.

     curl http://xxx.xxx.xxx.xxx:62889/metrics/ob/basic
    

    Note

    http://xxx.xxx.xxx.xxx:62889/metrics/ob/basic indicates the value of exporter_addr in the alert information.

    • If this URL is inaccessible, a network connection issue may have occurred.

      Check whether the network connection between OCP and this server is available. For more information, see Network troubleshooting.

    • If the URL is accessible, OCP-Agent is faulty.

      Troubleshoot and resolve the problem. For more information, see OCP-Agent O&M script.

Contact Us