Alert description
This alert detects the number of partitions (or SSTables) in an OceanBase tenant that have not undergone minor compaction for an extended period. The presence of frozen memtables that remain unreleased for a long time indicates that the memtables cannot be compacted or released normally.
Alert principle
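When a memtable is frozen, an END_SCN is set for the frozen memtable. This value is close to the system time at that moment. After the minor compaction of this frozen memtable completes and its memory is released, the memtable can no longer be observed in the GV$OB_SSTABLES view. A frozen memtable that is still visible in the view long after its END_SCN has therefore not been compacted or released.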
The following table lists the key parameters involved in the monitoring logic of this alert.
| Parameter | Value |
|---|---|
| Monitoring Metrics | ob_tenant_compact_timeout_count: The number of partitions (or SSTables) in the tenant that have not undergone compaction for an extended period. An alert is triggered when this count exceeds the threshold. |
| Monitoring Expression | sum(ob_tenant_sstable_compact_timeout_count{@LABELS}) by (@GBLABELS) |
| Metric Collection | ob_tenant_sstable_compact_timeout_count |
| Data Source | SQL collection: select /*+ MONITOR_AGENT NO_REWRITE READ_CONSISTENCY(WEAK) QUERY_TIMEOUT(20000000) */ a.tenant_id,b.svr_ip,b.svr_port,count(*) as cnt from DBA_OB_TENANTS a join ( select * from GV$OB_SSTABLES where table_type = 'MEMTABLE' and is_active='NO' ) as b on a.tenant_id = b.tenant_id and (TIMESTAMP_TO_SCN(now()) - end_log_scn) > 1800 * 1000 * 1000 * 1000 group by b.tenant_id,b.svr_ip,b.svr_port. This SQL statement runs in the sys tenant of each cluster. A frozen memtable is counted when its END_SCN is more than 1,800 seconds behind the current time, that is, when its minor compaction has been pending for more than 1,800 seconds. |
| Collection Cycle | 5 Seconds |
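The counting logic in the collection SQL and the monitoring expression can be sketched in Python as follows. This is only an illustration under stated assumptions: the row dictionaries, the sample values, and the function names are hypothetical, SCNs are treated as nanosecond timestamps (as the `1800 * 1000 * 1000 * 1000` threshold in the SQL implies), and grouping by tenant stands in for the unspecified `@GBLABELS`.

```python
from collections import Counter

NS_PER_SECOND = 1_000_000_000
TIMEOUT_NS = 1800 * NS_PER_SECOND  # same threshold as the collection SQL


def count_stale_frozen_memtables(rows, now_scn):
    """Count frozen memtables older than the timeout, per (tenant, server)."""
    counts = Counter()
    for row in rows:
        # Mirrors: table_type = 'MEMTABLE' and is_active = 'NO'
        frozen = row["table_type"] == "MEMTABLE" and row["is_active"] == "NO"
        if frozen and now_scn - row["end_log_scn"] > TIMEOUT_NS:
            counts[(row["tenant_id"], row["svr_ip"], row["svr_port"])] += 1
    return dict(counts)


def sum_by_tenant(per_server):
    """Aggregate per-server counts by tenant (assumed grouping label)."""
    totals = Counter()
    for (tenant_id, _ip, _port), cnt in per_server.items():
        totals[tenant_id] += cnt
    return dict(totals)


# Hypothetical sample rows shaped like a subset of GV$OB_SSTABLES.
now = 10_000 * NS_PER_SECOND
rows = [
    # Frozen 2,000 s ago: counted.
    {"tenant_id": 1002, "svr_ip": "10.0.0.1", "svr_port": 2882,
     "table_type": "MEMTABLE", "is_active": "NO",
     "end_log_scn": now - 2000 * NS_PER_SECOND},
    # Frozen 100 s ago: below the threshold, not counted.
    {"tenant_id": 1002, "svr_ip": "10.0.0.1", "svr_port": 2882,
     "table_type": "MEMTABLE", "is_active": "NO",
     "end_log_scn": now - 100 * NS_PER_SECOND},
    # Active memtable: filtered out regardless of its SCN.
    {"tenant_id": 1002, "svr_ip": "10.0.0.1", "svr_port": 2882,
     "table_type": "MEMTABLE", "is_active": "YES",
     "end_log_scn": now - 5000 * NS_PER_SECOND},
    # Frozen 3,600 s ago on another server: counted.
    {"tenant_id": 1002, "svr_ip": "10.0.0.2", "svr_port": 2882,
     "table_type": "MEMTABLE", "is_active": "NO",
     "end_log_scn": now - 3600 * NS_PER_SECOND},
]

per_server = count_stale_frozen_memtables(rows, now)
per_tenant = sum_by_tenant(per_server)
```

With the default threshold of 0, any tenant whose aggregated count is greater than zero would trigger the alert.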
Rule information
| Monitoring Metrics | Default Threshold (Unit: Count) | Duration | Detection Cycle | Elimination Cycle |
|---|---|---|---|---|
| ob_tenant_compact_timeout_count | 0 | 0 Seconds | 60 Seconds | 5 Minutes |
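One plausible reading of the rule parameters above is sketched below. The source does not define how the duration and elimination cycle are evaluated, so the semantics here are assumptions: the alert fires once the value has exceeded the threshold for at least the duration, and it clears once the value has stayed at or below the threshold for the elimination cycle. The class names and the sample series are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    threshold: float = 0     # Default Threshold (count)
    duration_s: int = 0      # Duration
    elimination_s: int = 300  # Elimination Cycle (5 minutes)


class AlertState:
    """Tracks whether an alert is firing for one tenant (assumed semantics)."""

    def __init__(self, rule):
        self.rule = rule
        self.firing = False
        self.breach_since = None  # start of the current run of breaches
        self.last_breach = None   # timestamp of the most recent breach

    def observe(self, ts, value):
        if value > self.rule.threshold:
            if self.breach_since is None:
                self.breach_since = ts
            self.last_breach = ts
            if ts - self.breach_since >= self.rule.duration_s:
                self.firing = True
        else:
            self.breach_since = None
            # Clear only after a full elimination cycle without a breach.
            if self.firing and ts - self.last_breach >= self.rule.elimination_s:
                self.firing = False
        return self.firing


# One sample per detection cycle (60 s); values are made up.
state = AlertState(AlertRule())
samples = [(0, 0), (60, 10), (120, 0), (180, 0), (240, 0), (300, 0), (360, 0)]
history = [state.observe(ts, value) for ts, value in samples]
```

Because the default duration is 0 seconds, a single sample above the threshold is enough to raise the alert in this sketch.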
Alert information
| Alert Trigger Method | Alert Level | Scope |
|---|---|---|
| Based on monitoring metric expressions | Critical | Tenant |
Alert template
Alert overview
- Template: ${alarm_target} ${alarm_name}
- Example: alarm_template_id=0:ob_cluster_name=obcluster:ob_cluster_id=4:tenant_name=mysql host=xx.xx.xx.xx Number of OceanBase tenant partitions (sstables) that have not undergone minor compaction for a long time
Alert details
- Template: Cluster: ${ob_cluster_name}, Tenant: ${tenant_name}, Alert: ${alarm_name}. The number of partitions (sstables) that the tenant has not compacted for a long time, ${value_shown}, exceeds the threshold ${alarm_threshold}.
- Example: Cluster: obcluster, Tenant: mysql, Alert: Number of OceanBase tenant partitions (sstables) that have not undergone minor compaction for a long time. The number of partitions (sstables) that the tenant has not compacted for a long time, 10, exceeds the threshold 0.
Alert recovery
- Template: Alert: ${alarm_name}, Number of OceanBase tenant partitions (sstables) without minor compaction for a long time: ${value_shown}
- Example: Alert: Number of OceanBase tenant partitions (sstables) not compacted for a long time, Number of OceanBase tenant partitions (sstables) not compacted for a long time: 10
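The `${name}` placeholders in the templates above happen to match the syntax of Python's `string.Template`, so the substitution can be illustrated with the standard library. This is only a sketch of how the fields map into the message; it is not how the alerting system itself renders templates, and the field values are taken from the examples above.

```python
from string import Template

# Alert details template, copied from the section above.
DETAILS_TEMPLATE = Template(
    "Cluster: ${ob_cluster_name}, Tenant: ${tenant_name}, Alert: ${alarm_name}. "
    "The number of partitions (sstables) that the tenant has not compacted "
    "for a long time, ${value_shown}, exceeds the threshold ${alarm_threshold}."
)


def render_details(fields):
    """Fill in the template; raises KeyError if a placeholder has no value."""
    return DETAILS_TEMPLATE.substitute(fields)


message = render_details({
    "ob_cluster_name": "obcluster",
    "tenant_name": "mysql",
    "alarm_name": "Number of OceanBase tenant partitions (sstables) that have "
                  "not undergone minor compaction for a long time",
    "value_shown": 10,
    "alarm_threshold": 0,
})
print(message)
```

Using `substitute` rather than `safe_substitute` makes a missing field fail loudly instead of leaving a raw placeholder in the message.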
Impact on the system
Frozen memtables that are not released may cause the tenant to exceed its memory limit, or prevent clogs from being recycled properly, which can lead to a full clog disk.
Possible causes
- A backlog of minor compactions, where minor compaction tasks are added faster than they are processed.
- A minor compaction task failed.
- The minor compaction succeeded, but the memtable could not be released because of internal logic.
- The partition does not meet the conditions for minor compaction. For example, the number of SSTables to be compacted has exceeded the limit, so no new SSTables can be added.
Solution
If minor compaction tasks are accumulating, you can add minor compaction queue threads by using the following command:
alter system set compaction_high_thread_score = xxx tenant xxx;
You can check the V$OB_COMPACTION_DIAGNOSE_INFO or GV$OB_COMPACTION_DIAGNOSE_INFO view for records of failed minor compactions in a tenant. The reference command is as follows:
select * from oceanbase.GV$OB_COMPACTION_DIAGNOSE_INFO where tenant_id = xxx;
