ob_tenant_expired_trans_exist OceanBase tenant has expired transactions|V4.3.6| docs|Distributed Database

ob_tenant_expired_trans_exist OceanBase tenant has expired transactions

Last updated: 2025-09-08 08:15:43

Alert description

This alert is triggered when transactions remain in the commit stage for 1200 seconds or more within the tenant.

Alerting principle

The following table describes the key parameters of the alert monitoring logic.

Parameter	Value
Monitoring metric	pending_trans_max_duration_seconds
Data source	SQL for data collection (the version-specific statements are listed after this table)
Collected metrics	collect_time, ctx_create_time
Monitoring expression	max((collect_time - ctx_create_time)/1000000)
Collection interval	60 seconds

SQL for data collection

When OceanBase Database is of a version earlier than V2.2.7:

select /*+ MONITOR_AGENT READ_CONSISTENCY(WEAK) QUERY_TIMEOUT(%d) */
  tenant_id, trans_id, `partition`,
  floor(unix_timestamp(ctx_create_time) * 1000000) as ctx_create_time,
  session_id, participants, trans_type, part_trans_action, sql_no
from __all_virtual_trans_stat
where svr_ip = ? and svr_port = ? and is_exiting != 1

When OceanBase Database is of a version from V2.2.7 to V3.2:

select /*+ MONITOR_AGENT READ_CONSISTENCY(WEAK) QUERY_TIMEOUT(%d) */
  a.tenant_id, b.tenant_id as tenant_id_xa, a.trans_id, `partition`,
  floor(ctx_create_time) as ctx_create_time,
  session_id, participants, trans_type, part_trans_action, sql_no
from (
  select tenant_id, svr_ip, trans_id, `partition`,
    unix_timestamp(ctx_create_time) * 1000000 as ctx_create_time,
    session_id, participants, trans_type, part_trans_action, sql_no
  from __all_virtual_trans_stat
  where svr_ip = ? and svr_port = ? and is_exiting != 1
) a
left join __all_virtual_global_transaction b
  on a.tenant_id = b.tenant_id and a.trans_id = b.trans_id

When OceanBase Database is of a version V3.2 or later:

select /*+ MONITOR_AGENT READ_CONSISTENCY(WEAK) QUERY_TIMEOUT(%d) */
  a.tenant_id, b.tenant_id as tenant_id_xa, a.trans_id, `partition`,
  floor(ctx_create_time) as ctx_create_time,
  session_id, participants, trans_type, part_trans_action, sql_no, log_size_byte
from (
  select tenant_id, svr_ip, trans_id, `partition`,
    unix_timestamp(ctx_create_time) * 1000000 as ctx_create_time,
    session_id, participants,
    (pending_log_size + flushed_log_size) as log_size_byte,
    trans_type, part_trans_action, sql_no
  from __all_virtual_trans_stat
  where svr_ip = ? and svr_port = ? and is_exiting != 1
) a
left join __all_virtual_global_transaction b
  on a.tenant_id = b.tenant_id and a.trans_id = b.trans_id

XA transaction latency:

select max((collect_time - ctx_create_time)/1000000) as trans_max_duration_seconds
from ob_hist_trans_stat
where trans_type != 3 and ctx_trans_state = 3
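
The monitoring expression max((collect_time - ctx_create_time)/1000000) converts microsecond timestamps into a duration in seconds and keeps the largest value. The following Python sketch illustrates that arithmetic together with the filter trans_type != 3 and ctx_trans_state = 3 from the collection SQL. The TransSample class and the function name are hypothetical and exist only for illustration; they are not part of any OceanBase or OCP API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

US_PER_SECOND = 1_000_000  # both timestamps are in microseconds


@dataclass
class TransSample:
    """Hypothetical in-memory stand-in for one collected transaction row."""
    collect_time: int      # collection time, microseconds since the epoch
    ctx_create_time: int   # transaction context creation time, microseconds
    trans_type: int
    ctx_trans_state: int


def pending_trans_max_duration_seconds(samples: Iterable[TransSample]) -> Optional[float]:
    """Mirror max((collect_time - ctx_create_time)/1000000) over the filtered rows."""
    durations = [
        (s.collect_time - s.ctx_create_time) / US_PER_SECOND
        for s in samples
        # Same filter as the SQL: trans_type != 3 and ctx_trans_state = 3
        if s.trans_type != 3 and s.ctx_trans_state == 3
    ]
    # Like SQL max() over zero rows, return no value when nothing matches.
    return max(durations) if durations else None
```

A returned value of 1200 or more corresponds to the condition that triggers this alert.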

Rule information

Monitoring metric	Default threshold	Monitoring metric source	Detection cycle	Elimination cycle
pending_trans_max_duration_seconds	1200	Tenant metrics	60 seconds	5 minutes
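
As a rough illustration of how the rule values interact, the following Python sketch evaluates one metric sample per detection cycle. It rests on two assumptions that this document does not confirm: the alert fires as soon as a sample reaches the threshold, and the alert clears only after the metric has stayed below the threshold for a full elimination cycle. The function name and the sample format are hypothetical.

```python
THRESHOLD_SECONDS = 1200          # default threshold from the rule table
DETECTION_CYCLE_SECONDS = 60      # one sample is evaluated per detection cycle
ELIMINATION_CYCLE_SECONDS = 300   # 5 minutes


def evaluate_alert(samples):
    """Return [(timestamp, firing)] for time-ordered (timestamp_seconds, value) samples.

    Assumed semantics (not confirmed by the source): fire when value >= threshold,
    clear after the value has been below the threshold for a full elimination cycle.
    """
    firing = False
    below_since = None
    states = []
    for ts, value in samples:
        if value is not None and value >= THRESHOLD_SECONDS:
            firing = True
            below_since = None
        elif firing:
            if below_since is None:
                below_since = ts  # first sample below the threshold
            if ts - below_since >= ELIMINATION_CYCLE_SECONDS:
                firing = False
                below_since = None
        states.append((ts, firing))
    return states
```

Under these assumptions, a transaction that stops being reported as pending still keeps the alert active for about 5 minutes, which matches the advice elsewhere in this topic to recheck the alert 5 minutes after a fix.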

Alert information

Alert trigger method	Alert level	Scope
Based on the expression of the monitoring metric	Critical	Tenant

Alert template

Alert summary
- Template: ${alarm_target} ${alarm_name}
- Example: ob_cluster=obcluster-1:tenant_name=orac2:trans_hash={hash:10801753558860391353, inc:59202486, addr:"xxx.xxx.xxx.xxx:2882", t:1646993121179509} OceanBase tenant has a pending transaction
Alert details
- Template: Cluster: ${ob_cluster_name}, Tenant: ${tenant_name} has a pending transaction. Session ID: ${session_id}, Transaction ID: ${trans_hash}, Transaction type: ${trans_type}, Transaction creation time: ${trans_create_time}, Maximum duration of the transaction: ${value_shown}.
- Example: Cluster: obcluster-1, Tenant: orac2 has a pending transaction. Session ID: 3221635048, Transaction ID: {hash:10801753558860391353, inc:59202486, addr:"xxx.xxx.xxx.xxx:2882", t:1646993121179509}, Transaction type: distribute, Transaction creation time: 2022-03-11T18:05:21.184+08:00, Maximum duration of the transaction: 25 days 19 hours 57 minutes 24.66 seconds.
Alert recovery
- Template: Alert: ${alarm_name}
- Example: Alert: OceanBase tenant has a pending transaction
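
The ${...} placeholders in the templates use the same syntax as Python's string.Template, so the substitution can be sketched as follows. This is only an illustration of how the alert details example could be produced from its template; the helper names are hypothetical, and the duration formatting is an assumption modeled on the example text, not a description of how OCP renders alerts.

```python
from string import Template

DETAILS_TEMPLATE = Template(
    "Cluster: ${ob_cluster_name}, Tenant: ${tenant_name} has a pending transaction. "
    "Session ID: ${session_id}, Transaction ID: ${trans_hash}, "
    "Transaction type: ${trans_type}, "
    "Transaction creation time: ${trans_create_time}, "
    "Maximum duration of the transaction: ${value_shown}."
)


def format_duration(seconds):
    """Format seconds as 'D days H hours M minutes S.ss seconds' (assumed format)."""
    total_cs = round(seconds * 100)            # work in whole hundredths of a second
    total_s, cs = divmod(total_cs, 100)
    days, rem = divmod(total_s, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{days} days {hours} hours {minutes} minutes {secs}.{cs:02d} seconds"


def render_details(**fields):
    """Fill the alert details template; raises KeyError if a placeholder is missing."""
    return DETAILS_TEMPLATE.substitute(**fields)


message = render_details(
    ob_cluster_name="obcluster-1",
    tenant_name="orac2",
    session_id=3221635048,
    trans_hash='{hash:10801753558860391353, inc:59202486, '
               'addr:"xxx.xxx.xxx.xxx:2882", t:1646993121179509}',
    trans_type="distribute",
    trans_create_time="2022-03-11T18:05:21.184+08:00",
    value_shown=format_duration(2231844.66),
)
```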

Impact on the system

A pending transaction blocks the MemStore flush, which can interrupt business services.

Possible causes

This alert is usually caused by a minority situation, a full disk, or memory exhaustion.
It can also occur when machine clocks are out of synchronization by more than 100 ms.
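
The 100 ms clock limit can be checked with a small script once clock offsets have been measured. The following Python sketch assumes that the offsets, in milliseconds relative to a common reference clock, have already been collected by some other means such as an NTP client; the host names and the function name are hypothetical.

```python
MAX_CLOCK_OFFSET_MS = 100  # limit stated in this topic


def hosts_out_of_sync(offsets_ms):
    """Return the sorted hosts whose absolute clock offset exceeds 100 ms.

    offsets_ms maps a host name to its measured offset in milliseconds
    (assumed to be measured against one common reference clock).
    """
    return sorted(
        host for host, offset in offsets_ms.items()
        if abs(offset) > MAX_CLOCK_OFFSET_MS
    )
```

For example, hosts_out_of_sync({"observer-a": 5, "observer-b": -150}) returns ["observer-b"].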

Solution

Check whether a minority occurs.

A minority generally occurs because of OBServer node exceptions or network failures, in which case the ob_cannot_connected (OB server cannot be connected) alert is also reported.

If that alert is reported, handle it as described in its alert documentation, and then check 5 minutes later whether the alert in this section is cleared.

Check whether the disk space is insufficient.

If the disk space is insufficient, disk space alerts are reported at the same time. First resolve them as described in the corresponding alert documentation, and then check 5 minutes later whether the alert in this section is cleared.

Check whether the memory is insufficient.

If the memory is insufficient, the ob_host_mem_percent_over_threshold (OB server memory usage exceeds the threshold) alert may be reported. For more information, see the solution for that alert.

If the issue persists, run the following commands to collect diagnostic information, and then contact Technical Support:

-- View information about all servers to check whether an OBServer node is abnormal.
select * from __all_server;

-- View the current hanging transactions.
SELECT *
FROM __all_virtual_trans_stat
WHERE is_exiting !=1 AND part_trans_action > 2 AND ctx_create_time < DATE_SUB(NOW(), INTERVAL 500 SECOND)
LIMIT 100;