Description
This alert is triggered when the active memory usage of a tenant exceeds the threshold. Active memory usage = active memory used (active MemStore used) / freeze trigger threshold (major freeze trigger), expressed as a percentage.
Principle
The following table describes the key parameters that are involved in the monitoring and alerting logic.
| Parameter | Value |
|---|---|
| Metric | ob_tenant_host_active_memstore_percent. Note: The value of this metric indicates the percentage of active memory in the tenant. The alert is triggered when the value is greater than the threshold. The default threshold is 110%. |
| Source | SQL: `select /*+read_consistency(weak)*/ tenant_name, tenant_id, stat_id, value from v$sysstat, __all_tenant where stat_id IN (130000, 130002) and (con_id > 1000 or con_id = 1) and __all_tenant.tenant_id = v$sysstat.con_id;` |
| Collected metric | sysstat_value |
| Metric expression | `100 * sum(sysstat_value{metric_group="sysstat",stat_id="130000",@LABELS}) by (@GBLABELS) / sum(sysstat_value{metric_group="sysstat",stat_id="130002",@LABELS}) by (@GBLABELS)` |
| Collection cycle | 1 second |
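
The metric expression above can be illustrated with a minimal Python sketch. It assumes, based on the formula in the Description, that stat_id 130000 is the active MemStore used and stat_id 130002 is the major freeze trigger. The function name `active_memstore_percent` and the simplified `(tenant_name, stat_id, value)` row shape are hypothetical, and grouping by tenant name alone stands in for the `@GBLABELS` placeholder, whose real labels are not given here.

```python
from collections import defaultdict

# Assumed meaning of the two stat_ids, inferred from the Description formula.
ACTIVE_MEMSTORE_USED = 130000
MAJOR_FREEZE_TRIGGER = 130002


def active_memstore_percent(rows):
    """Return {tenant_name: percent} from (tenant_name, stat_id, value) rows.

    Mirrors 100 * sum(stat 130000) / sum(stat 130002), grouped by tenant.
    """
    used = defaultdict(float)
    trigger = defaultdict(float)
    for tenant_name, stat_id, value in rows:
        if stat_id == ACTIVE_MEMSTORE_USED:
            used[tenant_name] += value
        elif stat_id == MAJOR_FREEZE_TRIGGER:
            trigger[tenant_name] += value
    # Skip tenants without a positive freeze trigger to avoid dividing by zero.
    return {
        tenant: 100.0 * used[tenant] / trigger[tenant]
        for tenant in used
        if trigger.get(tenant, 0.0) > 0
    }


# Made-up sample values for illustration only.
sample_rows = [
    ("tenant-1", 130000, 550),
    ("tenant-1", 130002, 500),
    ("tenant-2", 130000, 200),
    ("tenant-2", 130002, 400),
]
percent_by_tenant = active_memstore_percent(sample_rows)
```

With these sample values, `tenant-1` is at 110% and `tenant-2` at 50%, so neither exceeds the default threshold of 110%.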
Alert rule
| Metric | Default threshold (unit: %) | Duration | Detection cycle | Time before clearance |
|---|---|---|---|---|
| ob_tenant_host_active_memstore_percent | 110 | 0 seconds | 60 seconds | 5 minutes |
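
As a rough illustration of how the rule values in this table might interact, the following sketch models one possible reading: the alert fires as soon as a detection sample exceeds the threshold (duration 0 seconds) and clears once 5 minutes have passed since the last sample above the threshold. This interpretation of "Time before clearance" is an assumption, and the `AlertState` class is hypothetical, not OCP's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertState:
    threshold: float = 110.0      # default threshold, in percent
    clear_after_s: float = 300.0  # assumed meaning of "Time before clearance"
    firing: bool = False
    last_breach_ts: Optional[float] = None

    def evaluate(self, ts: float, value: float) -> bool:
        """Process one detection sample taken at time ts (seconds)."""
        if value > self.threshold:
            # Duration is 0 seconds, so a single breach is enough to fire.
            self.firing = True
            self.last_breach_ts = ts
        elif self.firing and ts - self.last_breach_ts >= self.clear_after_s:
            # No breach for the whole clearance window: clear the alert.
            self.firing = False
        return self.firing


# One sample per 60-second detection cycle; values are made up.
state = AlertState()
history = [state.evaluate(ts, value) for ts, value in
           [(0, 120.0), (60, 100.0), (240, 100.0), (300, 100.0), (360, 105.0)]]
```

In this run the alert fires at the first sample, stays active while fewer than 300 seconds have passed since the breach, and clears at the 300-second sample.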
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Metric expression | Critical | Tenant |
Alert templates
Overview: ${alarm_target} ${alarm_name}
Details: Cluster: ${ob_cluster_name}, Alert: ${alarm_name}, The active memory usage is ${value}%, exceeding the threshold of ${alarm_threshold}%.
Overview example: ob_cluster=C1-1000:tenant_name=tenant-1:svr_ip=xxx.xxx.xxx.xxx. The active memory usage of an OceanBase tenant exceeds the threshold.
Details example: Cluster: obcluster-1, Alert: The active memory usage of an OceanBase tenant exceeds the threshold. The active memory usage is 201.0%, exceeding the threshold of 200.0%.
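
To show how the placeholders in the details template are filled in, here is a minimal sketch that uses Python's `string.Template`, which happens to share the `${name}` placeholder syntax. How OCP actually renders its templates is not described here. The `render_details` helper, the one-decimal number formatting, and the sample alert name are assumptions for illustration.

```python
from string import Template

# The details template from this section, split across lines for readability.
DETAILS = Template(
    "Cluster: ${ob_cluster_name}, Alert: ${alarm_name}, "
    "The active memory usage is ${value}%, "
    "exceeding the threshold of ${alarm_threshold}%."
)


def render_details(ob_cluster_name, alarm_name, value, alarm_threshold):
    """Fill in the template; numbers are formatted with one decimal place."""
    return DETAILS.substitute(
        ob_cluster_name=ob_cluster_name,
        alarm_name=alarm_name,
        value=f"{value:.1f}",
        alarm_threshold=f"{alarm_threshold:.1f}",
    )


# Hypothetical alert name; the real value comes from the alert definition.
message = render_details("obcluster-1", "Active memory usage too high", 201.0, 200.0)
print(message)
```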
Impact on the system
When this alert is triggered, the memory of the OBServer may become full. In extreme scenarios, the application can no longer write data.
Possible causes
When the active memory in use reaches the freeze threshold, freeze and compaction operations are triggered. These operations freeze the active MemTable, generate a new active MemTable, compact the data in the frozen MemTable with the dumped data of the previous version (if any), and persist the compacted data to the disk to release the memory. This alert is triggered when data is written faster than it can be dumped (frozen, compacted, and persisted to the disk), so that the active memory usage exceeds the threshold before the memory is released.
Suggested solutions
Check whether the amount of business traffic is too large. For example, the amount of traffic can be large during data import.
On the tenant page of the cluster, choose SQL Diagnosis > TopSQL. On the TopSQL page, check whether the number of SQL statements and the number of executions are too large.
If yes, the amount of business traffic is large.
Apply throttling to the OceanBase cluster. For more information, see Apply throttling to an OceanBase cluster.
Check in the OCP console whether the alert is cleared 10 minutes later. If the alert is not cleared, the problem may have other causes.
If no, the alert is caused by other issues.
Check for minor compaction or major compaction exceptions.
Typically, minor compaction and major compaction exceptions can be fixed by restarting the OBServer. For more information, see Restart an OBServer.
Check for disk write errors.
If the business traffic is normal but the problem persists after you restart the OBServer, check for disk write errors. For example, the disk may be damaged or disk writes may be slow. If you find a disk error, see Exception handling for OceanBase cluster compaction.