Concurrency control model
Each transaction involves multiple read/write operations on different data in a database. The simplest concurrency control method is serial execution, in which a process does not trigger an operation until the previous process has completed its operation (received a response). However, serial execution cannot meet the requirement of high concurrency. Researchers therefore proposed serializability: the operations of multiple transactions can be executed in parallel (not as a series) as long as the results are the same as those of some serial execution.
You can derive dependencies between transactions from the read/write operations that they perform. The dependencies determine the sequence of the transactions in the equivalent serial execution. For example, if Transaction B depends on Transaction A, Transaction A must be executed before Transaction B.
Write dependency: If Transaction B attempts to modify Data X after Transaction A modifies Data X, Transaction B depends on Transaction A.
Read dependency: If Transaction A attempts to read Data X after Transaction B modifies Data X, Transaction A depends on Transaction B.
Anti-dependency: If Transaction B attempts to modify Data X after Transaction A reads Data X, Transaction B depends on Transaction A.
Serializability that is defined by these conflicts is known as conflict serializability, and you can analyze it by using the preceding dependencies: a set of transactions is conflict serializable if the dependencies between them do not form a loop. Conflict serializability can be implemented by using two mechanisms: two-phase locking and optimistic locking. Two-phase locking uses locks to restrict conflicting modifications by transactions, and relies on deadlock detection to roll back transactions that would form a loop. Optimistic locking checks for conflicts in a detection phase at commit time and rolls back every transaction that might form a loop.
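The loop check described above can be sketched in Python as follows. This is a minimal illustration, not how any particular database implements it: the schedule format (a list of transaction, operation, data tuples in execution order) and the function names build_dependencies and is_conflict_serializable are assumptions made for this example.

```python
from collections import defaultdict


def build_dependencies(schedule):
    """Return edges {earlier: {later, ...}}, meaning `later` depends on `earlier`.

    `schedule` is a list of (txn, op, item) tuples in execution order,
    where op is "r" (read) or "w" (write). Hypothetical format.
    """
    edges = defaultdict(set)
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            # Operations conflict when they belong to different transactions,
            # touch the same data, and at least one of them is a write.
            # This covers write, read, and anti-dependencies.
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges[t1].add(t2)
    return edges


def is_conflict_serializable(schedule):
    """A schedule is conflict serializable if its dependencies have no loop."""
    edges = build_dependencies(schedule)
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = defaultdict(int)

    def has_cycle(node):
        color[node] = GREY
        for nxt in edges.get(node, ()):
            if color[nxt] == GREY:
                return True  # reached a node already on the current path
            if color[nxt] == WHITE and has_cycle(nxt):
                return True
        color[node] = BLACK
        return False

    txns = {t for t, _, _ in schedule}
    return not any(color[t] == WHITE and has_cycle(t) for t in txns)


# B runs entirely after A: the only dependency is A -> B.
serial_like = [("A", "r", "x"), ("A", "w", "x"), ("B", "r", "x"), ("B", "w", "x")]
# A reads x before B writes it (B depends on A), then A reads y after
# B wrote it (A depends on B): the dependencies form a loop.
looped = [("A", "r", "x"), ("B", "w", "x"), ("B", "w", "y"), ("A", "r", "y")]
```

With these two schedules, is_conflict_serializable(serial_like) holds and is_conflict_serializable(looped) does not.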
However, few commercial databases support the serializable isolation level, because the preceding implementation mechanisms significantly degrade database performance. Instead, databases usually tolerate specific loops, and the exceptions (anomalies) that these loops expose, in exchange for better transaction performance and scalability. The snapshot read and read committed isolation levels are common concurrency control methods that allow such exceptions.
The snapshot read isolation level depends on multiple versions of data: a transaction reads every piece of data at a fixed read version. As a result, anti-dependencies between transactions that read and write different data can form a loop. For example, Transaction A reads version 1 of Data X and writes version 2 of Data Y, while Transaction B reads version 1 of Data Y and writes version 2 of Data X. Neither transaction sees the write of the other, so Transaction A and Transaction B form a loop. This exception is usually referred to as write skew and is exposed by the snapshot read isolation level. The read committed isolation level exposes the non-repeatable read exception, in which two reads of the same data in one transaction return different results. Designing transaction isolation levels is therefore a matter of balancing performance against semantic usability.
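The write skew exception can be reproduced with a toy multi-version store. The sketch below is a generic illustration of snapshot reads and is not OceanBase Database code: the SnapshotStore class, its methods, and the rule that a commit is rejected only when another transaction has written the same data since the snapshot was taken are all assumptions made for this example.

```python
class SnapshotStore:
    """Toy multi-version store (hypothetical, for illustration only)."""

    def __init__(self, initial):
        # versions[key] holds (commit_version, value) pairs, oldest first.
        self.versions = {key: [(0, value)] for key, value in initial.items()}
        self.clock = 0

    def begin(self):
        # A transaction reads everything at this fixed read version.
        return self.clock

    def read(self, key, read_version):
        # Newest version whose commit version is not after the read version.
        visible = [v for v in self.versions[key] if v[0] <= read_version]
        return visible[-1][1]

    def commit(self, read_version, writes):
        # Reject only write-write conflicts: a key written by this transaction
        # must not have a version newer than the transaction's snapshot.
        for key in writes:
            if self.versions[key][-1][0] > read_version:
                return False
        self.clock += 1
        for key, value in writes.items():
            self.versions[key].append((self.clock, value))
        return True


def run_write_skew():
    store = SnapshotStore({"x": 1, "y": 1})
    snap_a = store.begin()
    snap_b = store.begin()
    # Transaction A reads x and writes y; Transaction B reads y and writes x.
    writes_a = {"y": 0} if store.read("x", snap_a) == 1 else {}
    writes_b = {"x": 0} if store.read("y", snap_b) == 1 else {}
    ok_a = store.commit(snap_a, writes_a)
    ok_b = store.commit(snap_b, writes_b)
    final = store.begin()
    return ok_a, ok_b, store.read("x", final), store.read("y", final)
```

Both transactions commit because their write sets do not overlap, and the store ends with x = 0 and y = 0. No serial execution produces that state: whichever transaction ran second would read 0 and write nothing.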
Concurrency control model of OceanBase Database
OceanBase Database supports two isolation levels: snapshot read and read committed. The isolation levels ensure external consistency based on distributed semantics.
Process commit requests
Distributed transactions in OceanBase Database have three possible states: RUNNING, PREPARE, and COMMIT. The status of a transaction cannot be confirmed atomically across partitions in distributed scenarios, which is why the two-phase commit includes the PREPARE phase. OceanBase Database maintains a local commit version (also known as a prepare version) for a transaction in each partition. The global commit version (also known as a commit version) of the transaction is the maximum of the local commit versions across all participant partitions. In each partition, the global commit version of the transaction is therefore guaranteed to be greater than or equal to the local commit version. This guarantee is essential to concurrency control of read/write requests.
When you commit a transaction, OceanBase Database starts a two-phase commit process and derives the local commit timestamp of each participant partition from the maximum read timestamp of that partition. As a result, the commit timestamp is greater than all previous read timestamps, so the commit is ordered after those reads in the equivalent serial execution. This preserves the anti-dependencies on each piece of data.
Before the two-phase commit ends, OceanBase Database ensures that the global commit timestamp is greater than or equal to the local commit timestamp. Each participant obtains the global commit timestamp when it receives the two-phase commit message, so it does not need to query the transaction table later. In addition, OceanBase Database updates the maximum transaction commit timestamp, which supports subsequent read request optimization, and wakes up the corresponding transaction in the lock queue.
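The relationship between local and global commit timestamps can be sketched as follows. This is a simplified model under stated assumptions, not the OceanBase Database implementation: the Partition class and the two_phase_commit function are hypothetical, timestamps are plain integers, and the "+ 1" used to make the local commit timestamp strictly greater than earlier read timestamps is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    max_read_ts: int        # largest read timestamp served by this partition
    max_commit_ts: int = 0  # largest commit timestamp applied to this partition

    def prepare(self):
        # Local commit (prepare) timestamp, derived from the maximum read
        # timestamp. The "+ 1" is an assumption of this sketch.
        return self.max_read_ts + 1

    def commit(self, global_commit_ts):
        # Record the global commit timestamp for later read optimization.
        self.max_commit_ts = max(self.max_commit_ts, global_commit_ts)


def two_phase_commit(partitions):
    # Phase 1: every participant reports its local commit timestamp.
    local = {p.name: p.prepare() for p in partitions}
    # The global commit timestamp is the maximum local commit timestamp,
    # so it is greater than or equal to the local one on every partition.
    global_ts = max(local.values())
    # Phase 2: every participant learns the global commit timestamp.
    for p in partitions:
        p.commit(global_ts)
    return local, global_ts


p1 = Partition("P1", max_read_ts=5)
p2 = Partition("P2", max_read_ts=9)
local_ts, global_ts = two_phase_commit([p1, p2])
```

In this run the local commit timestamps are 6 on P1 and 10 on P2, and the global commit timestamp is 10, which is later than every read that either partition served before the commit.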
Process write requests
When you write data to OceanBase Database, the database modifies data based on the two-phase locking protocol to enforce write dependencies. When you initiate a write request to a row and a version of the data in that row belongs to an active transaction, OceanBase Database places the request in the lock manager to wait for processing. OceanBase Database maintains the waiting queue in the lock manager and wakes up the write request when the lock is released or when the wait times out.
The snapshot read isolation level must also prevent loops that are formed by anti-dependencies and read dependencies, because such loops cause lost updates. Therefore, even after a write request acquires the lock, whether immediately or after being woken up, OceanBase Database compares the read timestamp of the transaction with the maximum transaction commit timestamp maintained for the row, and rolls back the transaction if the read timestamp is less than the maximum transaction commit timestamp.
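The write path described above can be sketched as a lock check followed by a timestamp check. This is a minimal model and not OceanBase Database code: the Row class, the try_write function, and the string results are hypothetical names chosen for this example, and the waiting queue is reduced to a "WAIT" result.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    max_commit_ts: int = 0             # maximum commit timestamp of the row
    lock_holder: Optional[str] = None  # active transaction holding the row lock


def try_write(row, txn_id, read_ts):
    """Return "WAIT", "ROLLBACK", or "OK" for a write request (hypothetical)."""
    # Step 1: another active transaction holds the row, so the request
    # must wait in the lock manager until it is woken up or times out.
    if row.lock_holder not in (None, txn_id):
        return "WAIT"
    row.lock_holder = txn_id
    # Step 2: locking succeeded, but a transaction that committed after this
    # transaction's snapshot has already changed the row. Writing now would
    # overwrite that change (a lost update), so roll back instead.
    if read_ts < row.max_commit_ts:
        row.lock_holder = None
        return "ROLLBACK"
    return "OK"


row = Row(max_commit_ts=10)
stale = try_write(row, "T1", read_ts=8)     # snapshot older than last commit
fresh = try_write(row, "T2", read_ts=12)    # acquires the lock
blocked = try_write(row, "T3", read_ts=15)  # T2 still holds the lock
```

Here T1 is rolled back because its read timestamp 8 is less than the maximum commit timestamp 10 of the row, T2 acquires the lock, and T3 has to wait for T2.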
Process read requests
OceanBase Database allows you to read data based on a read version. Each read operation updates the maximum local read timestamp with its read version. Together with the preceding guarantee (the global commit timestamp is greater than or equal to the local commit timestamp), this allows OceanBase Database to process read requests correctly in distributed scenarios.
When a read request encounters data written by a transaction in the COMMIT or ABORT state, OceanBase Database determines whether to read the data based on the global commit timestamp and the transaction status.
When a read request encounters data written by a transaction in the RUNNING state, the read increases the maximum local read timestamp. The transaction in the RUNNING state will therefore enter the two-phase commit with a local timestamp that is larger than the read timestamp. Based on the preceding guarantee and the concept of snapshot read, OceanBase Database can safely skip the data.
When a read request encounters data written by a transaction in the PREPARE state, the preceding guarantee allows OceanBase Database to skip the transaction if its local timestamp is greater than the read timestamp, because its global commit timestamp is then also greater than the read timestamp. However, if the local timestamp of the transaction is less than the read timestamp, OceanBase Database cannot determine whether the global commit timestamp of the transaction will be less than or greater than the read timestamp. Therefore, OceanBase Database waits for the transaction to finish, which is called lock for read. The two-phase commit process is expected to end within a short time in OceanBase Database, so the wait is brief.
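The read rules for each transaction state can be summarized in one decision function. This is a sketch under stated assumptions rather than the actual OceanBase Database logic: the function name read_decision and the "READ", "SKIP", and "WAIT" results are hypothetical, committed data is assumed to be visible when its global commit timestamp is less than or equal to the read timestamp, and a PREPARE transaction whose local timestamp equals the read timestamp is assumed to require a wait.

```python
def read_decision(state, read_ts, local_ts=None, global_ts=None):
    """Decide how a read at `read_ts` treats data written by a transaction.

    state:     "RUNNING", "PREPARE", "COMMIT", or "ABORT"
    local_ts:  local commit (prepare) timestamp, known in the PREPARE state
    global_ts: global commit timestamp, known in the COMMIT state
    """
    if state == "COMMIT":
        # Assumption: data is visible if it was committed at or before read_ts.
        return "READ" if global_ts <= read_ts else "SKIP"
    if state == "ABORT":
        return "SKIP"  # rolled-back data is never visible
    if state == "RUNNING":
        # The transaction will commit with a timestamp after this read,
        # so the snapshot must not include its data.
        return "SKIP"
    if state == "PREPARE":
        if local_ts > read_ts:
            # global_ts >= local_ts > read_ts: the data is outside the snapshot.
            return "SKIP"
        # global_ts may end up on either side of read_ts: lock for read.
        return "WAIT"
    raise ValueError(f"unknown transaction state: {state}")
```

For example, read_decision("PREPARE", read_ts=10, local_ts=12) returns "SKIP", while read_decision("PREPARE", read_ts=10, local_ts=7) returns "WAIT" because the global commit timestamp could still be either side of 10.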