Alert description
This alert indicates that the JVM heap memory usage of an OCP node has exceeded the limit.
Alert principle
The following table lists the key parameters involved in the monitoring logic of this alert.
| Parameter | Value |
|---|---|
| Monitoring Metrics | ocp_jvm_heap_memory_used_percent: The memory usage of the JVM heap on the OCP node. An alert is triggered when the memory usage exceeds the threshold. |
| Monitoring Expression | 100 * sum(jvm_memory_used_bytes{area="heap", @LABELS}) by (@GBLABELS) / sum(jvm_memory_max_bytes{area="heap", @LABELS}) by (@GBLABELS) |
| Metric Collection | |
| Metric Source | The OCP process uses the spring-boot-starter-actuator component to collect JVM-related runtime data. Internally, the MemoryPoolMetrics class accesses Java Management Extensions (JMX) interfaces, such as MemoryPoolMXBean, to read the memory usage in real time. |
| Collection Cycle | 5 Seconds |
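
The monitoring expression sums used and maximum heap bytes per label group and divides one by the other. The following is a minimal sketch of that arithmetic, not OCP's implementation. `@LABELS` and `@GBLABELS` are placeholders that OCP fills in; the `svr_ip` grouping label, the sample IP address, and the byte values below are assumptions made for illustration.

```python
from collections import defaultdict


def heap_used_percent(used_samples, max_samples, group_by=("svr_ip",)):
    """Mimic 100 * sum(used) by (labels) / sum(max) by (labels) for heap pools.

    Each sample is a (labels, value) pair. The group_by default is a
    hypothetical stand-in for @GBLABELS.
    """
    def sum_by(samples):
        totals = defaultdict(float)
        for labels, value in samples:
            if labels.get("area") != "heap":
                continue  # same filter as {area="heap"} in the expression
            key = tuple(labels.get(name) for name in group_by)
            totals[key] += value
        return totals

    used, limit = sum_by(used_samples), sum_by(max_samples)
    # Skip groups without a positive maximum to avoid dividing by zero.
    return {key: 100 * used[key] / limit[key]
            for key in used if limit.get(key, 0) > 0}


# Hypothetical samples: two heap pools and one non-heap pool on one node.
used = [
    ({"area": "heap", "id": "Eden", "svr_ip": "10.0.0.1"}, 300.0),
    ({"area": "heap", "id": "Old", "svr_ip": "10.0.0.1"}, 500.0),
    ({"area": "nonheap", "id": "Metaspace", "svr_ip": "10.0.0.1"}, 90.0),
]
maxes = [
    ({"area": "heap", "id": "Eden", "svr_ip": "10.0.0.1"}, 400.0),
    ({"area": "heap", "id": "Old", "svr_ip": "10.0.0.1"}, 600.0),
    ({"area": "nonheap", "id": "Metaspace", "svr_ip": "10.0.0.1"}, 200.0),
]
print(heap_used_percent(used, maxes))
```

Because the sums are taken before the division, a node with one nearly full pool and one mostly empty pool reports the overall heap usage rather than the usage of the fullest pool.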
Rule information
| Monitoring Metrics | Default Threshold (Unit: %) | Duration | Detection Cycle | Elimination Cycle |
|---|---|---|---|---|
| ocp_jvm_heap_memory_used_percent | This metric has two default thresholds: | 60 Seconds | 10 Seconds | 5 Minutes |
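
One way to read the duration and detection cycle together is sketched below. It assumes that the rule is evaluated once per detection cycle (10 seconds) and fires only when every evaluation within the duration (60 seconds) exceeds the threshold. That interpretation, the `Rule` class, the `should_fire` function, and the 80% threshold are illustrative assumptions, not OCP's documented behavior. The elimination cycle is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    threshold_percent: float       # hypothetical threshold value
    duration_s: int = 60           # "Duration" column
    detection_cycle_s: int = 10    # "Detection Cycle" column


def should_fire(rule, readings):
    """Return True if the newest readings all exceed the threshold.

    readings holds one heap usage percentage per detection cycle,
    oldest first. Assumed semantics: the threshold must be exceeded
    for the whole duration, i.e. duration / cycle consecutive readings.
    """
    needed = rule.duration_s // rule.detection_cycle_s
    if len(readings) < needed:
        return False  # not enough history to cover the duration yet
    return all(value > rule.threshold_percent for value in readings[-needed:])


rule = Rule(threshold_percent=80.0)
print(should_fire(rule, [85, 86, 90, 88, 91, 87]))  # sustained breach
print(should_fire(rule, [85, 86, 79, 88, 91, 87]))  # one dip below threshold
```

Under this reading, a short spike caused by a single large allocation does not raise the alert, while a heap that stays above the threshold for a full minute does.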
Alert information
| Alert Trigger Method | Alert Level | Scope |
|---|---|---|
| Based on monitoring metric expressions | | service |
Alert template
Alert overview
- Template: ${alarm_target} ${alarm_name}
- Example: alarm_template_id=0:svr_ip=xx.xx.xx.xx:svr_port=8080 OCP Node JVM Heap Memory Usage Exceeds Threshold
Alert details
- Template: Alert: ${alarm_name}, heap memory usage ${value_shown} exceeds ${alarm_threshold} %
- Example: Alert: OCP Node JVM Heap Memory Usage Exceeds Threshold, with heap memory usage of 90% exceeding the 80% threshold.
Alert recovery
- Template: Alert: ${alarm_name}, OCP Node JVM Heap Memory Usage Exceeds Threshold: ${value_shown}
- Example: Alert: OCP Node JVM Heap Memory Usage Exceeds Threshold, OCP Node JVM Heap Memory Usage Exceeds Threshold: 10 %
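
The templates above use `${name}` placeholders. The sketch below shows how such placeholders expand, using Python's standard `string.Template`, which happens to share the same syntax. OCP's own renderer may behave differently. The variable values are taken from the examples in this section, and the exact formats of `value_shown` and `alarm_threshold` are assumptions.

```python
from string import Template

# Templates copied from this section.
OVERVIEW = Template("${alarm_target} ${alarm_name}")
DETAILS = Template(
    "Alert: ${alarm_name}, heap memory usage ${value_shown} "
    "exceeds ${alarm_threshold} %"
)

# Illustrative values; the formats of value_shown and alarm_threshold
# are assumed, not confirmed by OCP.
variables = {
    "alarm_target": "alarm_template_id=0:svr_ip=xx.xx.xx.xx:svr_port=8080",
    "alarm_name": "OCP Node JVM Heap Memory Usage Exceeds Threshold",
    "value_shown": "90%",
    "alarm_threshold": "80",
}

overview = OVERVIEW.substitute(variables)
details = DETAILS.substitute(variables)
print(overview)
print(details)
```

`substitute` raises `KeyError` when a placeholder has no value, which makes a misspelled variable name in a custom template easy to spot.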
Impact on the system
When the JVM heap memory usage of an OCP node exceeds the upper limit, the system may experience lag when processing business requests.
Possible causes
- OCP reclaims heap memory more slowly than it allocates it, so heap memory usage climbs rapidly.
- You have modified parameters in Parameter Management, causing OCP monitoring collection, alert detection, backup and recovery, or cluster O&M operations to consume excessive heap memory.
Solution
- Log in to OCP, and choose System Management > Platform Monitoring from the left navigation pane to view the performance monitoring and HTTP request monitoring of the OCP platform. Observe whether related performance metrics such as memory, disk, and system load are normal.
- View the JVM memory distribution chart, analyze heap memory anomalies, and examine memory allocation issues by analyzing the memory dump snapshot.
- Increase the memory allocated to OCP, and then restart OCP.
