If a transaction is not committed after its execution duration exceeds the threshold specified for applications, the transaction is defined as a long-running transaction. In general, this threshold is shorter than the transaction timeout period, which is specified by the tenant-level system variable ob_trx_timeout. When the transaction execution duration exceeds the transaction timeout period, the transaction automatically rolls back and releases the relevant resources. This is a monitoring mechanism that prevents applications from occupying system resources for long-running transactions.
Hazards of long-running transactions and suspended transactions are as follows:
- An active transaction occupies the reference count of the MemTable. If the transaction cannot be finished within a long period, the MemTable cannot be released after it is frozen, and the memory usage may exceed the upper limit.
- If an active transaction holds a row lock, the row lock cannot be released. If another transaction is waiting for the row, the transaction may wait for a long time or roll back due to a timeout.
Once you notice an alert of a long-running transaction or suspended transaction, you need to immediately confirm the causes and eliminate the risk.
You can perform the following troubleshooting steps:
Identify long-running transactions.
You can query the
GV$OB_TRANSACTION_PARTICIPANTSview from thesystenant to check whether a long-running transaction exists. If a long-running transaction exists, you can view the transaction ID in theTX_IDcolumn, the node where the scheduler resides in theSCHEDULER_ADDRcolumn, and the session ID in theSESSION_IDcolumn.SELECT count(1) FROM GV$OB_TRANSACTION_PARTICIPANTS WHERE CTX_CREATE_TIME < date_sub(now(), INTERVAL 600 SECOND) AND STATE = 'INIT';INTERVAL 600 SECONDin the preceding sample code indicates the default transaction timeout period. You can modify the value for applications as needed.Notice
Before you perform this transaction inspection operation, make sure that the system has a leader.
Check whether the transaction is an eXtended Architecture (XA) transaction. To do so, use the transaction ID as the filter condition to query the
V$OB_GLOBAL_TRANSACTIONview. If a record exists, the transaction is an XA transaction.If the transaction is an XA transaction, execute the
XA ROLLBACKstatement to roll back the transaction in the corresponding user tenant. If the transaction is an ordinary transaction, use OBClient to directly connect to thesystenant on the node where the scheduler of the transaction resides, and then execute theKILL SESSIONstatement to roll back the transaction.
O&M suggestions:
We recommend that you do not set a prolonged statement timeout period by using the tenant-level system variable ob_query_timeout, whose default value is 10s, or a prolonged transaction timeout period by using the tenant-level system variable ob_trx_timeout, whose default value is 100s. Prolonged execution of transactions affects system stability.