Alert description
The difference between the version for which a major compaction is about to be initiated (the frozen version) and the version of the last completed major compaction (the baseline version) is greater than the specified threshold.
Alerting principle
The following table describes the key parameters involved in the alert monitoring logic.
| Parameter | Value |
|---|---|
| Metric | ob_cluster_frozen_version_delta |
| Metric source | SQL: select zone, name, value, time_to_usec(now()) from __all_zone; |
| Sampled metric | current_timestamp, zone_value |
| Monitoring expression | max(zone_value{metric_group="all_zone",name="frozen_version",@LABELS}) by (@GBLABELS) - min(zone_value{metric_group="all_zone",name="last_merged_version",@LABELS}) by (@GBLABELS) |
| Sampling interval | 1 second |
The value of the metric ob_cluster_frozen_version_delta indicates the difference between the frozen version and the baseline version of the OceanBase cluster. An alert is triggered when the difference exceeds the threshold (which is 1 by default).
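The monitoring expression above can be sketched in Python as follows. This is a minimal illustration, not OCP's implementation: it assumes rows shaped like the first three columns returned by the metric source SQL (zone, name, value), the function names are hypothetical, and the sample values are invented.

```python
from typing import Iterable, List, Tuple

# (zone, name, value), mirroring the first three columns of the metric source SQL.
Row = Tuple[str, str, int]


def frozen_version_delta(rows: Iterable[Row]) -> int:
    """Return max(frozen_version) - min(last_merged_version) over all rows."""
    materialized: List[Row] = list(rows)
    frozen = [value for _, name, value in materialized if name == "frozen_version"]
    merged = [value for _, name, value in materialized if name == "last_merged_version"]
    if not frozen or not merged:
        raise ValueError("rows must contain frozen_version and last_merged_version")
    return max(frozen) - min(merged)


def exceeds_threshold(delta: int, threshold: int = 1) -> bool:
    """The alert condition: the delta is strictly greater than the threshold."""
    return delta > threshold


# Invented sample data: one zone has not yet merged past version 10.
sample_rows = [
    ("zone1", "frozen_version", 12),
    ("zone1", "last_merged_version", 12),
    ("zone2", "last_merged_version", 10),
]
delta = frozen_version_delta(sample_rows)
print(delta, exceeds_threshold(delta))  # prints "2 True"
```

Taking the minimum of last_merged_version means that a single lagging zone is enough to raise the delta for the whole cluster.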
Rule information
| Monitoring metric | Default threshold | Duration | Detection cycle | Elimination cycle |
|---|---|---|---|---|
| ob_cluster_frozen_version_delta | 1 | 0 seconds | 60 seconds | 5 minutes |
Alert information
| Alert trigger method | Alert level | Scope |
|---|---|---|
| Expression based on monitoring metrics | Severe | Cluster |
Alert template
Alert summary
- Template: ${alarm_target} ${alarm_name}
- Example: ob_cluster=obcluster-1 OceanBase cluster frozen version and baseline version difference exceeds the limit
Alert details
- Template: Cluster: ${ob_cluster_name}, Host: ${host}, Alert: ${alarm_name}. The value ${value} exceeds ${alarm_threshold}.
- Example: Cluster: obcluster-1, Host: host-1, Alert: OceanBase cluster frozen version and baseline version difference exceeds the limit. The value 2.0 exceeds 1.0.
Alert recovery
- Template: Alert: ${alarm_name}, OceanBase cluster frozen version and baseline version difference: ${value}
- Example: Alert: OceanBase cluster frozen version and baseline version difference exceeds the limit, OceanBase cluster frozen version and baseline version difference: 0.5
In these templates, ${host} displays the host IP address by default. If it is configured to display the host name, the IP address is still displayed for any host that is not managed by OCP.
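To see how the placeholders in the alert details template map to the example message, the substitution can be reproduced with Python's string.Template, which happens to use the same ${name} syntax. This is only a sketch: how OCP renders templates internally is not described here, and the render_details helper is hypothetical.

```python
from string import Template

# Alert details template, copied from the section above.
DETAILS_TEMPLATE = Template(
    "Cluster: ${ob_cluster_name}, Host: ${host}, Alert: ${alarm_name}. "
    "The value ${value} exceeds ${alarm_threshold}."
)


def render_details(fields: dict) -> str:
    """Fill in every placeholder; raises KeyError if a field is missing."""
    return DETAILS_TEMPLATE.substitute(fields)


# Field values taken from the documented example.
example_fields = {
    "ob_cluster_name": "obcluster-1",
    "host": "host-1",
    "alarm_name": (
        "OceanBase cluster frozen version and baseline version "
        "difference exceeds the limit"
    ),
    "value": "2.0",
    "alarm_threshold": "1.0",
}
message = render_details(example_fields)
print(message)
```

Using substitute rather than safe_substitute makes a missing field fail loudly instead of leaving a raw ${...} placeholder in the message.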
Impact on the system
Statement response time is affected. In extreme cases, excessive OBServer memory usage, suspended business writes, and a full clog disk may occur.
Possible causes
This alert commonly occurs in the following scenarios:
- A manual major compaction is initiated while the OceanBase cluster is already performing an automatic major compaction.
- Manual major compactions are initiated repeatedly.
- The previous major compaction failed and was not completed, but a new major compaction is triggered.
Solution
Determine which scenario triggered the alert based on the following two pieces of information:
- The most recent major compaction information in Major Compaction > Major Compaction Details of the specific cluster.
- The major compaction task in Task Center.
Take appropriate action based on what you find:
- If this alert was triggered by a manual major compaction, you can ignore it and wait for the cluster to complete the major compaction. Avoid initiating unnecessary major compactions.
- If this alert was triggered by slow or failed major compactions, other alerts, such as ob_cluster_merge_error (OB cluster major compaction error) and ob_cluster_merge_timeout (OB cluster major compaction timeout), are also triggered. Resolve these alerts as described in their documentation, and then perform a major compaction to check whether any alerts are still reported.
- For other abnormal scenarios, proceed to the next step for troubleshooting.
Perform troubleshooting in the order described in OceanBase cluster major compaction troubleshooting.
If the issue cannot be located or resolved, contact OCP technical support.
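The decision flow above can be summarized as a small triage function. This is a hedged sketch rather than an OCP feature: the suggest_action function and its inputs are hypothetical, and checking the merge alerts before the manual compaction case is an assumption, because the steps above do not say which takes priority when both apply.

```python
from typing import Set

# Related alerts named in the solution steps.
MERGE_ALERTS = {"ob_cluster_merge_error", "ob_cluster_merge_timeout"}


def suggest_action(manual_compaction_running: bool, active_alerts: Set[str]) -> str:
    """Map the observed situation to the action recommended in the steps above."""
    # Assumption: a reported merge error or timeout is handled first.
    if active_alerts & MERGE_ALERTS:
        return "resolve the merge alerts, then run a major compaction to verify"
    if manual_compaction_running:
        return "ignore the alert and wait for the major compaction to complete"
    return "follow the OceanBase cluster major compaction troubleshooting guide"


print(suggest_action(True, set()))
print(suggest_action(False, {"ob_cluster_merge_timeout"}))
print(suggest_action(False, set()))
```

If the troubleshooting guide does not resolve the issue, the remaining step is still to contact OCP technical support.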