Concept
Replicas are a fundamental concept within the storage engine of OceanBase Database. They refer to copies of the same data on different OBServer nodes.
From a user perspective, replicas are synonymous with data partitions within OceanBase Database. Each data partition has multiple replicas based on tenant locality, providing high horizontal scalability and disaster recovery capabilities.
Data partitioning is the process of dividing a table or an index into smaller, more manageable parts based on specific rules for creating tables. Each partition is an independent object with its own name and optional storage characteristics.
Note
OceanBase Database is known for its multi-replica architecture that utilizes Paxos-based technology to ensure high availability. Replicas in this architecture consist of copies of the same data on different OBServer nodes. OceanBase Database stores data in containers of various dimensions, such as data partitions, log streams, units, and tenants. While replicas typically refer to data partition replicas, they may correspond to different database entities in various contexts.
Benefits
Replicas improve the availability and fault tolerance of OceanBase Database. They can be distributed across different geographical locations to cope with network or data center failures.
OceanBase Database replicates data to multiple replicas through partition replication or log synchronization to prevent data loss. This way, OceanBase Database can provide lossless database services even if a minority of replicas fail.
Supported replica type
OceanBase Database adopts a hierarchical LSM-tree structure. Data is divided into baseline data and incremental data.
Baseline data is persisted as SSTables on disks and remains unchanged once generated.
Incremental data is written by users into MemTables and is stored in memory. Redo logs, also called commit logs (clogs), are used to ensure transactional performance.
Multiple replicas of the data are distributed across OBServer nodes. For example, in the deployment mode of three IDCs in the same city, three replicas are available, while five replicas are available in the deployment mode of five IDCs across three regions. During transaction commitment, redo logs are synchronized across OBServer nodes based on the Paxos protocol to ensure a majority of replicas are committed and data consistency is maintained between replicas.
OceanBase Database v4.0 supports full-featured replicas, also known as standard replicas. The name of these replicas is FULL or F for short. A full-featured replica has a full set of data types and features, including transaction logs, a MemTable, and an SSTable.
Full-featured replicas are classified into leaders and followers based on data partitions. Leaders offer external write services and strong-consistency read services, as well as weak-consistency read services. Followers provide external weak-consistency read services. If a leader fails, a follower can be quickly elected as a new leader to continue providing external services.
Log streams
Concept
Log streams are a collection of data automatically created and managed in OceanBase Database. A log stream includes data partitions, transaction operation logs, and transaction management structures. Redo logs, which run on the Paxos protocol, synchronize logs across replicas for data consistency and high availability. TxCtxMgr is a transaction management structure that allows for atomic commitment of modifications across all data partitions within a log stream. A transaction that spans multiple log streams is atomically committed based on the optimized two-phase commit protocol of OceanBase Database. Log streams are participants in distributed transactions.
With the introduction of OceanBase Database v4.0, the basic unit for transaction commitment has changed from partitions to log streams.
In OceanBase Database v3.x, transactions are committed by partition and the write-ahead logging (WAL) mechanism ensured atomicity within partitions.
In OceanBase Database v4.x, transactions are committed by log stream, optimizing resources, performance, and features. Each log stream is a participant of two-phase commit, ensuring data consistency and high availability.
View the basic information of log streams
You can query for the basic information of all log streams in a tenant from the oceanbase.DBA_OB_LS view, including the status and logging progress. Example:
obclient> SELECT * FROM oceanbase.DBA_OB_LS limit 10;
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+
| LS_ID | STATUS | PRIMARY_ZONE | UNIT_GROUP_ID | LS_GROUP_ID | CREATE_SCN | DROP_SCN | SYNC_SCN | READABLE_SCN |
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+
| 1 | NORMAL | sa128_obv4_2;sa128_obv4_1,sa128_obv4_3 | 0 | 0 | NULL | NULL | NULL | NULL |
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+
1 row in set
You can execute the preceding statement in a system tenant or user tenant to view the basic information of log streams corresponding to the tenant. In the preceding example, the statement is executed in a system tenant, and the system tenant has only one log stream, whose ID is 1.
View the location information and role information of a log stream
The location information of a log stream records the OBServer nodes on which the log stream is distributed. You can query for the location information of log streams from the MEMBER_LIST field in the oceanbase.DBA_OB_LS_LOCATIONS view. Data partitions do not have independent location information. Instead, their locations are determined by the log streams they belong to. You can migrate and replicate log streams across OBServer nodes for performance balancing and disaster recovery.
The role information of a log stream defines whether the log stream is a leader or a follower. You can query the role information of log streams from the ROLE field in the oceanbase.DBA_OB_LS_LOCATIONS view. Data partitions do not have independent role information. Instead, their roles are determined by the log streams they belong to. The roles of log streams are determined based on the election protocol.
For more information about the oceanbase.DBA_OB_LS_LOCATIONS view, see oceanbase.DBA_OB_LS_LOCATIONS.
View the mapping between data partitions and log streams
You can query for the mapping between data partitions and log streams of a tenant from the DBA_OB_TABLE_LOCATIONS view. Each replica of a data partition records the basic information of the data partition and the information about the log stream to which the data partition belongs.
For more information about the oceanbase.DBA_OB_TABLE_LOCATIONS view, see oceanbase.DBA_OB_TABLE_LOCATIONS.