ob_tenant_memtable_release_timeout

2023-08-15 11:20:54  Updated

Description

OceanBase Database separates reads from writes in its storage layer. Based on the storage method, its internal data is divided into baseline data, which is stored in SSTables, and incremental data, which is stored in MemTables.

The incremental data is all the new data generated since the most recent major compaction. It is usually stored in the MemTable and is also persisted to a commit log file on disk. During a minor compaction in OceanBase Database, the MemTable is first frozen and then dumped to the disk to release the memory space that it occupies. If a MemTable of an OceanBase Database tenant has been frozen for longer than 10 minutes (the default threshold of 600 seconds), this alert is triggered.

Principle

The following table describes the key parameters that are involved in the monitoring and alerting logic.

Parameter | Value
Metric | ob_memtable_snapshot_max_duration_seconds
Source | SQL: SELECT /*+ PARALLEL(2), ENABLE_PARALLEL_DML, MONITOR_AGENT READ_CONSISTENCY(WEAK) */ __all_tenant.tenant_id, __all_tenant.tenant_name, snapshot.max_snapshot_duration_seconds FROM __all_tenant INNER JOIN (SELECT tenant_id, max(UNIX_TIMESTAMP(NOW()) - snapshot_version/1000000) max_snapshot_duration_seconds FROM __all_virtual_table_mgr WHERE table_type=0 and is_active=0 and svr_ip=? and svr_port=? GROUP BY tenant_id ) snapshot ON __all_tenant.tenant_id = snapshot.tenant_id
Collected metric | ob_memtable_max_snapshot_duration_seconds
Metric expression | max(ob_memtable_max_snapshot_duration_seconds{@LABELS}) by (@GBLABELS)
Collection cycle | 60 seconds
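
The aggregation in the collection SQL can be illustrated outside the database. The following is a minimal Python sketch, not the collector's implementation. It assumes that snapshot_version is a timestamp in microseconds (as the division by 1000000 suggests) and that the filter table_type=0 and is_active=0 selects frozen MemTables. The function name and the sample rows are hypothetical.

```python
def max_frozen_seconds(rows, now_seconds):
    """Return {tenant_id: longest frozen duration in seconds}.

    rows: iterable of dicts whose keys mirror the columns used in the
    collection SQL (tenant_id, table_type, is_active, snapshot_version).
    snapshot_version is assumed to be microseconds since the Unix epoch.
    """
    result = {}
    for row in rows:
        # Same filter as the WHERE clause of the collection SQL.
        if row["table_type"] != 0 or row["is_active"] != 0:
            continue
        # UNIX_TIMESTAMP(NOW()) - snapshot_version/1000000
        duration = now_seconds - row["snapshot_version"] / 1_000_000
        tenant = row["tenant_id"]
        # max(...) GROUP BY tenant_id
        if tenant not in result or duration > result[tenant]:
            result[tenant] = duration
    return result


# Hypothetical sample rows for illustration only.
rows = [
    {"tenant_id": 1001, "table_type": 0, "is_active": 0,
     "snapshot_version": 1_700_000_000_000_000},
    {"tenant_id": 1001, "table_type": 0, "is_active": 0,
     "snapshot_version": 1_700_000_300_000_000},
    {"tenant_id": 1001, "table_type": 0, "is_active": 1,
     "snapshot_version": 1_699_990_000_000_000},
    {"tenant_id": 1002, "table_type": 1, "is_active": 0,
     "snapshot_version": 1_699_990_000_000_000},
]
durations = max_frozen_seconds(rows, now_seconds=1_700_000_900)
print(durations)
```

In this sample, tenant 1001 has two matching rows and the older one determines the reported value. Tenant 1002 has no matching row, so it does not appear in the result.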

Alert rule

Metric | Default threshold (unit: s) | Source | Detection cycle | Time before clearance
ob_memtable_snapshot_max_duration_seconds | 600 | SQL collection | 60 seconds | 5 minutes
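
How the threshold, detection cycle, and time before clearance might interact can be sketched as follows. This is an illustration under stated assumptions, not the alerting engine's documented behavior. It assumes that the alert fires when a sample exceeds the threshold and clears once 5 minutes have passed since the last sample that exceeded it. The function and constant names are hypothetical.

```python
THRESHOLD_SECONDS = 600   # default threshold from the alert rule
CLEARANCE_SECONDS = 300   # "time before clearance" of 5 minutes


def evaluate(samples, threshold=THRESHOLD_SECONDS, clearance=CLEARANCE_SECONDS):
    """Return [(timestamp, firing)] for samples of (timestamp, metric_value).

    Assumed semantics: the alert fires when a sample exceeds the threshold
    and clears once `clearance` seconds have passed since the last breach.
    """
    firing = False
    last_breach = None
    states = []
    for ts, value in samples:
        if value > threshold:
            firing = True
            last_breach = ts
        elif firing and ts - last_breach >= clearance:
            firing = False
        states.append((ts, firing))
    return states


# One sample per 60-second detection cycle (hypothetical values).
samples = [(0, 500), (60, 650), (120, 100), (180, 100),
           (240, 100), (300, 100), (360, 100)]
states = evaluate(samples)
print(states)
```

With these samples, the alert starts firing at t=60 and remains active until t=360, which is 300 seconds after the last sample above the threshold.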

Alert information

Trigger method | Alert level | Scope
Based on the expression of the metric | Critical | Tenant

Alert templates

  • Alert overview

    • Template: ${alarm_target} ${alarm_name}

    • Example: ob_cluster=obcluster:1001,tenant_name=tenant1,svr_ip=192.168.1.1 MemTables that have been frozen for a long time exist in the OceanBase Database tenant.

  • Alert details

    • Template: Cluster: ${ob_cluster_name}; Tenant: ${tenant_name}; Host: ${svr_ip} (zone: ${obzone}); Alert: MemTables that have been frozen for a long time exist in the OceanBase Database tenant. The longest freeze time is ${value_shown}, which has exceeded the threshold of ${alarm_threshold} seconds.

    • Example: Cluster: obcluster; Tenant: tenant1; Host: 192.168.1.1 (zone: zone1); Alert: MemTables that have been frozen for a long time exist in the OceanBase Database tenant. The longest freeze time is 1 hour, which has exceeded the threshold of 600 seconds.
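
The placeholder substitution in the alert details template can be reproduced with a short Python sketch. The rendering engine used by the alerting platform is not stated here. Python's string.Template is used only because it accepts the same ${name} placeholder syntax. The values are taken from the example above.

```python
from string import Template

# Alert details template, copied from the documentation above.
DETAILS = Template(
    "Cluster: ${ob_cluster_name}; Tenant: ${tenant_name}; "
    "Host: ${svr_ip} (zone: ${obzone}); "
    "Alert: MemTables that have been frozen for a long time exist in the "
    "OceanBase Database tenant. The longest freeze time is ${value_shown}, "
    "which has exceeded the threshold of ${alarm_threshold} seconds."
)

# substitute() raises KeyError if any placeholder is left without a value.
message = DETAILS.substitute(
    ob_cluster_name="obcluster",
    tenant_name="tenant1",
    svr_ip="192.168.1.1",
    obzone="zone1",
    value_shown="1 hour",
    alarm_threshold=600,
)
print(message)
```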

Impact on the system

During a minor compaction in OceanBase Database, the MemTable is first frozen and then dumped to the disk to release the MemStore space that it occupies, which keeps data write performance consistently high. If MemTables remain frozen for a long time, their memory is not released and the MemStore may be exhausted. This leaves the tenant with insufficient resources and affects OBServer stability.

Possible causes

  1. A large amount of data is concurrently imported.

  2. Data is written to the MemStore faster than minor compactions can release it. In this case, error 4030 is reported with the message "Over tenant memory limits".

Solutions

To resolve this problem, identify why minor compactions are slow and speed them up, or reduce the write speed.

  1. Contact the administrator of OceanBase Database to confirm whether the minor compaction speed is as expected and whether it is necessary to increase the minor compaction speed. The following parameters can be configured.

    • freeze_trigger_percentage: The MemTable is frozen when the MemStore usage reaches this threshold.

    • minor_merge_concurrency: The number of universal minor compaction threads. If this parameter is set to 0, 10 threads are used, as defined in the OceanBase Database kernel.

  2. Reduce the MemStore write speed.

    Notice

    Reducing the MemStore write speed may affect the SQL response time on the user side. Set the query timeout duration to a proper value.
