This topic describes how to troubleshoot row lock contention and provides examples of troubleshooting.
Applicable versions
The solution provided in this topic is applicable to all versions of OceanBase Database.
Logic to trouble row lock contention
In a business environment, the two objects that are closely related to a row lock are defined as the lock holder and lock waiter, which are a part of a transaction. Therefore, the monitoring of the lock holder and waiter during the monitoring of active transactions helps troubleshoot row lock contention.
OceanBase Database provides the following virtual tables for monitoring active transactions, row lock holders, and row lock waiters.
__all_virtual_trans_stat: used to monitor active transactions.The following table describes the main columns in the virtual table.
Column name Description svr_ip The IP address of the OBServer node that creates the transaction context. session_id The unique ID of the session to which the transaction context belongs. proxy_id The IP address and port number of the ODP or Java Client. trans_id The unique ID of the transaction. is_existing Indicates whether the current transaction context is exiting. partition The partition on which the current transaction context is created. participants The list of participants in the current transaction. ctx_create_time The point in time when the transaction context was created. ref The reference count of the transaction context. sql_no The sql_noof the last SQL statement that is executed on the current transaction context.state The current status of the transaction context. Valid values: INIT,PREPARE,COMMIT,ABORT, andCLEAR.part_trans_action The last action executed on the transaction context. Valid values: START_TASK,END_TASK, andCOMMIT.lock_for_read_entry_count The number of retries when the transaction encounters row lock contention during table scanning. __all_virtual_trans_lock_stat: used to record the information about the row locks held by all active transactions in the current cluster.The following table describes the main columns in the virtual table.
Column name Description rowkey The rowkey of the row that holds the lock. session_id The unique ID of the session to which the lock-holding transaction belongs. proxy_id The IP address and port number of the ODP or Java Client to which the lock-holding transaction belongs. trans_id The unique ID of the transaction that holds the lock. __all_virtual_lock_wait_stat: used to record the information about SQL queries or statements that are waiting for the row lock in the current cluster.The following table describes the main columns in the virtual table.
Column name Description session_id The unique ID of the session to which the lock-waiting transaction belongs. proxy_id The IP address and port number of the ODP or Java Client to which the lock-waiting transaction belongs. trans_id The unique ID of the transaction that is waiting for the lock. lock_ts The point in time when the request starts waiting for the lock. abs_timeout The absolute timeout period of the SQL statement that is waiting for the lock. try_lock_times The number of retries performed by a lock-waiting statement in acquiring the lock. block_session_id The ID of the session to which the first lock-waiting transaction in the row belongs.``
Examples of troubleshooting row lock contention
Scenario 1
If business times out for a long period and an unknown long transaction in a session holds the lock. This blocks the execution of other transactions. In this case, you must find and terminate the long transaction.
Solution 1: Query the lock-holding transaction by using the lock-waiting transaction.
Use the
session_idof the lock-waiting transaction to obtain therowkeyof the row that is waiting for the lock.Then, you can find that the lock-waiting transaction is waiting for the row whose primary key is
pk.obclient> select * from __all_virtual_lock_wait_stat where session_id = 3221580756\G *************************** 1. row *************************** svr_ip: xxx.xxx.xx.xxx svr_port: xxxx table_id: 1101710651081554 rowkey: table_id=1101710651081554 hash=779dd9b202397d7 rowkey_object=[{"VARCHAR":"pk", collation:"utf8mb4_general_ci"}] addr: 140433355180784 need_wait: 1 recv_ts: 1600440077959302 lock_ts: 1600440077960167 abs_timeout: 1600450077859302 try_lock_times: 1 time_after_recv: 1307610861 session_id: 3221580756 block_session_id: 3221580756 type: 0 lock_mode: 0 1 row in set (0.01 sec)Use the primary key to find the
session_idof the transaction that holds the row lock.The
session_idof the lock-holding transaction is3221577520.obclient> select * from __all_virtual_trans_lock_stat where rowkey like '%pk%'\G *************************** 1. row *************************** tenant_id: 1002 trans_id: {hash:6605492148156030705, inc:3284929, addr:"xxx.xxx.xx.xxx:xxxx", t:1600440036535233} svr_ip: xxx.xxx.xx.xxx svr_port: xxxx partition: {tid:1101710651081554, partition_id:0, part_cnt:0} table_id: 1101710651081554 rowkey: table_id=1101710651081554 hash=779dd9b202397d7 rowkey_object=[{"VARCHAR":"pk", collation:"utf8mb4_general_ci"}] session_id: 3221577520 proxy_id: NULL ctx_create_time: 2020-09-18 22:41:03.583285 expired_time: 2020-09-19 01:27:16.534919 1 row in set (0.05 sec)Terminate the session of the lock-holding transaction based on the
session_id. Log on to the OBServer node where the transaction resides and run thekillcommand to terminate the session.KILL 3221577520
Solution 2: Query the long transaction.
Obtain the
trans_idof the transaction that consumes the longest execution time and is not terminated.obclient> select * from __all_virtual_trans_lock_stat order by ctx_create_time limit 5\G *************************** 1. row *************************** tenant_id: 1002 trans_id: {hash:6605492148156030705, inc:3284929, addr:"xxx.xxx.xx.xxx:xxxx", t:1600440036535233} svr_ip: xxx.xxx.xx.xxx svr_port: xxxx partition: {tid:1101710651081554, partition_id:0, part_cnt:0} table_id: 1101710651081554 rowkey: table_id=1101710651081554 hash=779dd9b202397d7 rowkey_object=[{"VARCHAR":"pk", collation:"utf8mb4_general_ci"}] session_id: 3221577520 proxy_id: NULL ctx_create_time: 2020-09-18 22:41:03.583285 expired_time: 2020-09-19 01:27:16.534919 1 row in set (0.05 sec)Use the
trans_idof the transaction to find all the locks that the transaction holds, and identify which row needs to be unlocked based on therowkey.In the following example, the
rowkeyof the first row is identical to the queriedrowkeyin the previous step. This indicates that the queried transaction holds the lock.obclient> select * from __all_virtual_trans_lock_stat where trans_id like '%hash:6605492148156030705, inc:3284929%'\G *************************** 1. row *************************** tenant_id: 1002 trans_id: {hash:6605492148156030705, inc:3284929, addr:"xxx.xxx.xx.xxx:xxxx", t:1600440036535233} svr_ip: xxx.xxx.xx.xxx svr_port: xxxx partition: {tid:1101710651081554, partition_id:0, part_cnt:0} table_id: 1101710651081554 rowkey: table_id=1101710651081554 hash=779dd9b202397d7 rowkey_object=[{"VARCHAR":"pk", collation:"utf8mb4_general_ci"}] session_id: 3221577520 proxy_id: NULL ctx_create_time: 2020-09-18 22:41:03.583285 expired_time: 2020-09-19 01:27:16.534919 *************************** 2. row *************************** tenant_id: 1002 trans_id: {hash:6605492148156030705, inc:3284929, addr:"xxx.xxx.xx.xxx:xxxx", t:1600440036535233} svr_ip: xxx.xxx.xx.xxx svr_port: xxxx partition: {tid:1101710651081554, partition_id:0, part_cnt:0} table_id: 1101710651081554 rowkey: table_id=1101710651081554 hash=89413aecf767cd7 rowkey_object=[{"VARCHAR":"ob", collation:"utf8mb4_general_ci"}] session_id: 3221577520 proxy_id: NULL ctx_create_time: 2020-09-18 22:41:03.583285 expired_time: 2020-09-19 01:27:16.534919 2 rows in set (0.05 sec)Confirm the
session_idof the session that corresponds to the lock-holding transaction, and terminate the session. Log on to the OBServer node where the transaction resides and run thekillcommand to terminate the session.KILL 3221577520
Scenario 2
Transactions executed on a row frequently times out, and the rowkey of the row is known. In this scenario, the rowkey contains test.
Use the
rowkeyto query the__all_virtual_trans_lock_stattable, and find the lock-holding transaction.obclient> select * from __all_virtual_trans_lock_stat where memtable_key like '%test%'\G; *************************** 1. row *************************** tenant_id: 1 trans_id: {hash:6124095709354809361, inc:249476, addr:"xxx.xxx.xx.xxx:xxxx", t:1529075759177984} svr_ip: xxx.xxx.xx.xxx svr_port: xxxx partition: {tid:1099511677778, partition_id:0, part_cnt:0} memtable_key: {table_id:1099511677778, hash_val:7893135555906369137, buf:"table_id=1099511677778 hash=3fb183d083d6d9f1 rowkey_object=[{"VARCHAR":"test", collation:"utf8mb4_general_ci"}] "} session_id: 2147549190 proxy_id: NULL ctx_create_time: 2018-06-15 23:16:35.695821 expired_time: 2018-06-15 23:32:39.177533 1 row in set (0.03 sec)Query the
__all_virtual_trans_stattable based on the preceding results to obtain thesession_idof the session that corresponds to the lock-holding transaction.You can find that the
session_idof the session is2147549190.OceanBase (root@oceanbase)> select * from __all_virtual_trans_stat where trans_id like '%6124095709354809361%'\G; *************************** 1. row *************************** tenant_id: 1 svr_ip: xxx.xxx.xx.xxx svr_port: xxxx inc_num: 249476 session_id: 2147549190 proxy_id: NULL trans_type: 0 trans_id: {hash:6124095709354809361, inc:249476, addr:"xxx.xxx.xx.xxx:xxxx", t:1529075759177984} is_exiting: 0 is_readonly: 0 is_decided: 0 active_memstore_version: 0-0-0 partition: {tid:1099511677778, partition_id:0, part_cnt:0} participants: [{tid:1099511677778, partition_id:0, part_cnt:0}] autocommit: 0 trans_consistency: 0 ctx_create_time: 2018-06-15 23:16:35.695821 expired_time: 2018-06-15 23:32:39.177533 refer: 1073741826 sql_no: 1 state: 0 part_trans_action: 2 lock_for_read_retry_count: 1 1 row in set (0.01 sec)Confirm the
session_idof the session that corresponds to the lock-holding transaction, and terminate the session. Log on to the OBServer node where the transaction resides and run thekillcommand to terminate the session.KILL 2147549190