ob_tenant_large_trans_exist

2024-03-22 09:34:34  Updated

Description

In OceanBase Cloud Platform (OCP), a transaction that satisfies either of the following two conditions is considered a large transaction:

  1. The log volume of a single participant exceeds 0.5 MB. (This condition is supported in OceanBase Database V3.2 and later.)

  2. The transaction execution time exceeds 500 ms.

Note

This alert applies only when the log volume exceeds 0.5 MB. The ob_tenant_slow_sql_exists alert applies when the transaction execution time exceeds 500 ms.

Principle

The following table describes the key parameters that are involved in the monitoring and alerting logic.

Parameter            Value
Metric               trans_max_log_size_mb
Source
  • When the OceanBase Database version is V3.2 or later:
      select /*+ MONITOR_AGENT READ_CONSISTENCY(WEAK) QUERY_TIMEOUT(%d)*/
        a.tenant_id, b.tenant_id as tenant_id_xa, a.trans_id, `partition`,
        floor(ctx_create_time) as ctx_create_time, session_id, participants,
        trans_type, part_trans_action, sql_no, log_size_byte
      from (
        select tenant_id, svr_ip, trans_id, `partition`,
          unix_timestamp(ctx_create_time) * 1000000 as ctx_create_time,
          session_id, participants,
          (pending_log_size + flushed_log_size) as log_size_byte,
          trans_type, part_trans_action, sql_no
        from __all_virtual_trans_stat
        where svr_ip = ? and svr_port = ? and is_exiting != 1
      ) a
      left join __all_virtual_global_transaction b
        on a.tenant_id = b.tenant_id and a.trans_id = b.trans_id
  • Transaction log volume: select max(log_size_byte)/1024/1024 as trans_max_log_size_mb from ob_hist_trans_stat
Collected metric     log_size_byte
Metric expression    max(log_size_byte)/1024/1024
Collection cycle     60 seconds

    Note

    The system parameter ocp.alarm.datasource.trans-min-log-size-mb specifies the log volume threshold, in MB, that triggers an alert: an alert is generated only when the value of trans_max_log_size_mb is equal to or greater than the value of this system parameter. The default value is 0.5 MB.
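
The metric expression and the threshold comparison described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not OCP's implementation: the function names and the sample values are hypothetical, and the default of 0.5 mirrors the documented default of ocp.alarm.datasource.trans-min-log-size-mb.

```python
BYTES_PER_MB = 1024 * 1024


def trans_max_log_size_mb(log_sizes_bytes):
    """Apply max(log_size_byte)/1024/1024 to the collected samples.

    Returns 0.0 when no transaction was sampled in this cycle.
    """
    if not log_sizes_bytes:
        return 0.0
    return max(log_sizes_bytes) / BYTES_PER_MB


def should_alert(log_sizes_bytes, threshold_mb=0.5):
    """True when the metric is equal to or greater than the threshold (in MB)."""
    return trans_max_log_size_mb(log_sizes_bytes) >= threshold_mb


# Hypothetical log_size_byte samples from one 60-second collection cycle.
samples = [120_000, 300_000, 734_003]
print(round(trans_max_log_size_mb(samples), 2), should_alert(samples))
```

Because the comparison is "equal to or greater than", a participant with exactly 524,288 bytes of log (0.5 MB) already meets the default threshold.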

    Alert rule

    Metric                  Default threshold (unit: MB)   Source   Detection cycle   Time before clearance
    trans_max_log_size_mb   0.5                            Tenant   60 seconds        5 minutes
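
The sketch below shows one way the detection cycle and the time before clearance might interact. It rests on an assumption that is not stated in this topic: that an active alert is cleared once the rule has not matched for the full clearance period. OCP's actual clearance logic may differ, and the names AlertState and evaluate are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

DETECTION_CYCLE_S = 60   # the rule is evaluated once per detection cycle
CLEARANCE_S = 5 * 60     # time before clearance


@dataclass
class AlertState:
    active: bool = False
    last_trigger_ts: Optional[float] = None


def evaluate(state, metric_mb, now_ts, threshold_mb=0.5):
    """Run one detection cycle and return whether the alert is active."""
    if metric_mb >= threshold_mb:
        # The rule matched: raise or refresh the alert.
        state.active = True
        state.last_trigger_ts = now_ts
    elif state.active and now_ts - state.last_trigger_ts >= CLEARANCE_S:
        # Assumed behaviour: clear after 5 minutes without a match.
        state.active = False
    return state.active


state = AlertState()
for ts, metric in [(0, 0.7), (60, 0.1), (240, 0.1), (300, 0.1)]:
    print(ts, evaluate(state, metric, ts))
```

Under this assumption, a single 0.7 MB sample keeps the alert active for the following five detection cycles even if every later sample is below the threshold.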

    Alert information

    Trigger method                          Alert level   Scope
    Based on the expression of the metric   Critical      Tenant

    Alert templates

    • Alert overview

      • Template: ${alarm_target} ${alarm_name}

      • Example: ob_cluster=obcluster-2:tenant_name=orac2:trans_hash={hash:10801753558860391353, inc:59202486, addr:"xxx.xxx.xxx.xxx:2882", t:1646993121179509} A large transaction exists in the OceanBase Database tenant.

    • Alert details

      • Template: Cluster: ${ob_cluster_name}; Tenant: A large transaction exists in the ${tenant_name} tenant; Session ID: ${session_id}; Transaction ID: ${trans_hash}; Maximum Log Volume: ${trans_max_log_size_mb} MB; Transaction Type: ${trans_type}; Transaction Created At: ${trans_create_time}

      • Example: Cluster: obcluster; Tenant: A large transaction exists in the orac2 tenant; Session ID: 3221635048; Transaction ID: {hash:10801753558860391353, inc:59202486, addr:"xxx.xxx.xxx.xxx:2882", t:1646993121179509}; Maximum Log Volume: 0.7 MB; Transaction Type: distribute; Transaction Created At: 2022-03-11T18:05:21.184+08:00
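
To see how the placeholders in the alert details template expand, the template can be rendered with Python's string.Template, which happens to use the same ${name} syntax. This is only an illustration with the field values taken from the example above. It does not show how OCP renders alerts internally.

```python
from string import Template

# The alert details template from this topic, split across lines for readability.
DETAILS_TEMPLATE = Template(
    "Cluster: ${ob_cluster_name}; "
    "Tenant: A large transaction exists in the ${tenant_name} tenant; "
    "Session ID: ${session_id}; "
    "Transaction ID: ${trans_hash}; "
    "Maximum Log Volume: ${trans_max_log_size_mb} MB; "
    "Transaction Type: ${trans_type}; "
    "Transaction Created At: ${trans_create_time}"
)

# Field values copied from the documented example.
fields = {
    "ob_cluster_name": "obcluster",
    "tenant_name": "orac2",
    "session_id": 3221635048,
    "trans_hash": '{hash:10801753558860391353, inc:59202486, '
                  'addr:"xxx.xxx.xxx.xxx:2882", t:1646993121179509}',
    "trans_max_log_size_mb": 0.7,
    "trans_type": "distribute",
    "trans_create_time": "2022-03-11T18:05:21.184+08:00",
}

# substitute() raises KeyError if any placeholder has no value.
message = DETAILS_TEMPLATE.substitute(fields)
print(message)
```

Using substitute() rather than safe_substitute() makes a missing field fail loudly instead of leaving a raw ${...} placeholder in the message.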

    Impact on the system

    Large transactions can cause memory problems. For example, log replay on a follower is out of order. If insufficient memory forces a minor compaction, transactions that were replayed but not yet committed must be migrated. Both the transaction migration and the replay up to the minor compaction point consume memory, so memory can run out before either task completes. Similar issues can occur while a transaction executes on the leader.

    Possible causes

    Too much data is committed in a single transaction.

    Solutions

    1. Use the transaction diagnostics feature in the tenant management module in OCP to identify the cause and perform troubleshooting accordingly.

    2. If the large transaction is expected by the business, check whether the workload can be split into smaller transactions, or increase the alert threshold.
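
As a minimal sketch of splitting a workload into smaller transactions, the example below commits a bulk insert in fixed-size batches instead of one large transaction. It uses Python's built-in sqlite3 module purely as a stand-in so that the example runs anywhere. The table t, the batch size, and the helper names are hypothetical, and the same pattern would need to be adapted to the driver used with OceanBase Database.

```python
import sqlite3
from itertools import islice


def batched(rows, batch_size):
    """Yield consecutive lists of at most batch_size rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def insert_in_batches(conn, rows, batch_size=1000):
    """Insert rows, committing after each batch. Returns the number of commits."""
    commits = 0
    for batch in batched(rows, batch_size):
        conn.executemany("INSERT INTO t (id, payload) VALUES (?, ?)", batch)
        conn.commit()  # each batch is its own, smaller transaction
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
rows = [(i, "x" * 16) for i in range(2500)]
commits = insert_in_batches(conn, rows, batch_size=1000)
print(commits)
```

Splitting gives up atomicity across batches: if a later batch fails, earlier batches stay committed, so this approach suits workloads that can be retried or resumed safely.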
