Concept
Replicas are a fundamental concept within the storage engine of OceanBase Database. From a user perspective, replicas are copies of the same data on different OBServer nodes.
From a database perspective, replicas are data partitions within OceanBase Database. Each data partition has multiple replicas based on tenant locality, providing high horizontal scalability and disaster recovery capabilities.
Data partitioning is the process of dividing a table or an index into smaller, more manageable parts according to rules specified when the table is created. Each partition is an independent object with its own name and optional storage characteristics.
Note
OceanBase Database is known for its multi-replica architecture that utilizes Paxos-based technology to ensure high availability. Replicas in this architecture consist of copies of the same data on different OBServer nodes. OceanBase Database stores data in containers of various dimensions, such as data partitions, log streams, units, and tenants. While replicas typically refer to data partition replicas, they may correspond to different database entities in various contexts.
Benefits
Replicas improve the availability and fault tolerance of OceanBase Database. They can be distributed across different geographical locations to cope with network or data center failures.
OceanBase Database replicates data to multiple replicas through partition replication or log synchronization to prevent data loss. This way, OceanBase Database can provide lossless database services even if a minority of replicas fail.
Supported replica types
OceanBase Database adopts a hierarchical LSM-tree structure. Data is divided into baseline data and incremental data.
Baseline data is persisted as SSTables on disks and remains unchanged once generated.
Incremental data is written by users into MemTables and is stored in memory. Redo logs, also called commit logs (clogs), are used to guarantee the transactional properties of these writes.
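The split between baseline and incremental data can be illustrated with a toy model. This is a minimal sketch, not the OceanBase storage engine: the class `ToyLsmStore` and its methods are hypothetical names chosen for illustration, and plain dictionaries stand in for SSTables and MemTables.

```python
class ToyLsmStore:
    """Toy LSM-style store: a mutable MemTable layered over an immutable baseline."""

    def __init__(self, baseline):
        # Baseline data: treated as unchanged once generated (the SSTable role).
        self._sstable = dict(baseline)
        # Incremental data: recent writes held in memory (the MemTable role).
        self._memtable = {}
        # Append-only redo log; a real engine persists this before acknowledging a write.
        self.redo_log = []

    def put(self, key, value):
        self.redo_log.append((key, value))  # record the change in the log first
        self._memtable[key] = value         # then apply it in memory

    def get(self, key):
        # Newer incremental data shadows the baseline.
        if key in self._memtable:
            return self._memtable[key]
        return self._sstable.get(key)

    def compact(self):
        # Build a new baseline from old baseline + increments; the old one is not edited in place.
        self._sstable = {**self._sstable, **self._memtable}
        self._memtable = {}
```

A read consults the MemTable first and falls back to the baseline, which is why the baseline can stay immutable between compactions.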
Multiple replicas of the data are distributed across OBServer nodes. For example, the deployment mode of three IDCs in the same region provides three replicas, and the deployment mode of five IDCs across three regions provides five replicas. During transaction commitment, redo logs are synchronized across OBServer nodes based on the Paxos protocol. A transaction is committed only after its logs are persisted on a majority of replicas, which keeps data consistent between replicas.
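The majority rule can be made concrete with a little arithmetic. The helper names below are hypothetical and only illustrate the standard majority-quorum calculation; they count voting replicas only.

```python
def majority(replica_count: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replica_count // 2 + 1


def tolerated_failures(replica_count: int) -> int:
    """Number of replicas that can fail while a majority remains."""
    return replica_count - majority(replica_count)


def is_committed(ack_count: int, replica_count: int) -> bool:
    """A log entry counts as committed once a majority has persisted it."""
    return ack_count >= majority(replica_count)


for n in (3, 5):
    print(n, "replicas: majority =", majority(n), ", tolerated failures =", tolerated_failures(n))
```

With three replicas, two acknowledgements are enough and one replica may fail. With five replicas, three acknowledgements are enough and two replicas may fail.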
The current version of OceanBase Database supports two types of replicas: full-featured replicas and read-only replicas.
Full-featured replicas, also called standard replicas, are named FULL, or F for short. A full-featured replica holds the complete data and provides all features: it has transaction logs, a MemTable, and SSTables.
Read-only replicas are named READONLY, or R for short. They provide read capabilities but not write capabilities. A read-only replica can serve only as a follower and cannot participate in election or voting. In other words, a read-only replica cannot be elected as the leader of a log stream.
Full-featured replicas are classified into leaders and followers based on data partitions. Leaders offer external write services and strong-consistency read services, as well as weak-consistency read services. Followers provide external weak-consistency read services. If a leader fails, a follower can be quickly elected as a new leader to continue providing external services.
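The rules above can be summarized in a small capability table. This is an illustrative sketch with hypothetical names (`ReplicaType`, `Role`, `services`), not an OceanBase API. It assumes that a read-only follower serves weak-consistency reads like any other follower.

```python
from enum import Enum


class ReplicaType(Enum):
    FULL = "F"
    READONLY = "R"


class Role(Enum):
    LEADER = "LEADER"
    FOLLOWER = "FOLLOWER"


def can_be_leader(replica_type):
    # Only full-featured replicas take part in election and voting.
    return replica_type is ReplicaType.FULL


def services(replica_type, role):
    """Return the set of services a replica offers in the given role."""
    if role is Role.LEADER:
        if not can_be_leader(replica_type):
            raise ValueError("a read-only replica cannot be a leader")
        return {"write", "strong_read", "weak_read"}
    # Followers of either type: weak-consistency reads only (assumption for READONLY).
    return {"weak_read"}
```

For example, `services(ReplicaType.FULL, Role.LEADER)` includes writes, while `services(ReplicaType.READONLY, Role.LEADER)` raises an error because that combination cannot occur.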
Log streams
Concept
A log stream is a collection of data that OceanBase Database creates and manages automatically. It comprises data partitions, transaction operation logs (redo logs), and transaction management structures. Redo logs are synchronized across replicas by using the Paxos protocol, which ensures data consistency and high availability. TxCtxMgr is a transaction management structure that allows for atomic commitment of modifications across all data partitions within a log stream. A transaction that spans multiple log streams is committed atomically by using the optimized two-phase commit protocol of OceanBase Database, in which each log stream acts as a participant.

Log streams are a new concept introduced in OceanBase Database V4.0. A significant difference between OceanBase Database V4.0 and OceanBase Database V3.x is that the basic unit for transaction commitment has changed from partitions to log streams.
In OceanBase Database V3.x, transactions are committed by partition and the write-ahead logging (WAL) mechanism ensures the atomicity of modifications within a partition. Each partition is a participant in two-phase commit, and the basic unit for transaction commit is a partition.
In OceanBase Database V4.x, transactions are committed by log stream. The WAL mechanism ensures the atomicity of modifications within a log stream. Each log stream is a participant in two-phase commit, and the basic unit for transaction commit is a log stream. This way, OceanBase Database is optimized in terms of resources, performance, and features.
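The effect of this change on the number of two-phase commit participants can be sketched as follows. The function names and the partition and log stream IDs are hypothetical. The sketch only counts participants and does not model the commit protocol itself.

```python
def participants_v3(touched_partitions):
    """V3.x model: every modified partition is a two-phase commit participant."""
    return set(touched_partitions)


def participants_v4(touched_partitions, partition_to_ls):
    """V4.x model: every log stream that holds a modified partition is a participant."""
    return {partition_to_ls[p] for p in touched_partitions}


# Hypothetical layout: three partitions spread over two log streams.
partition_to_ls = {"p1": 1001, "p2": 1001, "p3": 1002}
touched = ["p1", "p2", "p3"]

print(len(participants_v3(touched)))                   # one participant per partition
print(len(participants_v4(touched, partition_to_ls)))  # one participant per log stream
```

When all modified partitions belong to the same log stream, the V4.x model has a single participant, so the modifications commit atomically within that log stream.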
Broadcast log streams
OceanBase Database introduces the concept of broadcast log streams in V4.2.0. When the first replicated table is created for a tenant, the system automatically creates a special log stream, which is called a broadcast log stream. Then, subsequent replicated tables of this tenant are all created in this broadcast log stream. A broadcast log stream differs from a general log stream in that the broadcast log stream automatically deploys a replica on each OBServer node of the tenant, to ensure that the replicated table can provide strong-consistency reads on any OBServer node in ideal conditions.
Generally, the more replicas that participate in Paxos voting, the longer it takes for a majority of replicas to reach consensus. For a tenant with many OBServer nodes, having the replicas on all OBServer nodes participate in voting is impractical. Therefore, a broadcast log stream deploys a full-featured replica on each OBServer node whose replica must participate in voting, and a read-only replica on each OBServer node whose replica does not need to participate in voting.
Differences between a broadcast log stream and a general log stream in terms of replicas are as follows:
A general log stream deploys only one replica in each zone, and the replica type must match the one specified in the locality.
In each zone, in addition to a replica of the type specified in the locality, a broadcast log stream also deploys a read-only replica on each OBServer node on which resources of the resource unit for the tenant are distributed. A broadcast log stream does not deploy any replica in a zone for which no replica type is specified in the locality.
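The two placement rules can be compared with a toy function. This is a sketch under stated assumptions, not the actual scheduling logic: `place_replicas` is a hypothetical name, the zone and node names are made up, and the choice of the first node in each zone to host the locality-specified replica is arbitrary.

```python
def place_replicas(locality, unit_nodes, broadcast=False):
    """Return a mapping of node -> replica type ("F" or "R").

    locality:   zone -> replica type specified in the tenant locality
    unit_nodes: zone -> nodes that host resource units of the tenant
    """
    placement = {}
    # Zones without a replica type in the locality are never visited.
    for zone, replica_type in locality.items():
        nodes = unit_nodes.get(zone, [])
        if not nodes:
            continue
        # Both kinds of log stream: one replica of the locality-specified type per zone.
        placement[nodes[0]] = replica_type  # assumption: first node is picked
        if broadcast:
            # Broadcast log stream only: a read-only replica on every other unit node.
            for node in nodes[1:]:
                placement[node] = "R"
    return placement


locality = {"z1": "F", "z2": "F", "z3": "F"}
unit_nodes = {"z1": ["a", "b"], "z2": ["c", "d"], "z3": ["e", "f"], "z4": ["g"]}
print(place_replicas(locality, unit_nodes))
print(place_replicas(locality, unit_nodes, broadcast=True))
```

In this layout the general log stream has three replicas and the broadcast log stream has six, while node `g` in zone `z4` receives no replica because the locality specifies no replica type for that zone.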
Limitations of broadcast log streams are described as follows:
The sys tenant and meta tenants do not have a broadcast log stream and do not support replicated tables.
A user tenant can have only one broadcast log stream.
Attribute conversion between a broadcast log stream and a general log stream is not supported.
A broadcast log stream cannot be separately deleted. At present, a broadcast log stream can only be deleted together with the corresponding tenant.
View the basic information of log streams
You can query the DBA_OB_LS view for the basic information of all log streams in the current tenant, including the log synchronization status and progress. Here are some examples:
View the information of general log streams
You can view the basic information of log streams in the sys tenant or a user tenant. Execute the following statement in the sys tenant. The result shows that the sys tenant has only one log stream with the ID 1.
SELECT * FROM oceanbase.DBA_OB_LS LIMIT 10;
The result is as follows:
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+
| LS_ID | STATUS | PRIMARY_ZONE                           | UNIT_GROUP_ID | LS_GROUP_ID | CREATE_SCN | DROP_SCN | SYNC_SCN | READABLE_SCN |
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+
|     1 | NORMAL | sa128_obv4_2;sa128_obv4_1,sa128_obv4_3 |             0 |           0 |       NULL |     NULL |     NULL |         NULL |
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+
1 row in set
View the information of a broadcast log stream
You can view the information of a broadcast log stream only in a user tenant, because the sys tenant does not have a broadcast log stream. Execute the following statement in a user tenant. The result shows the broadcast log stream of the user tenant, in which the replicated tables of the tenant are created.
SELECT * FROM oceanbase.DBA_OB_LS WHERE flag LIKE "%DUPLICATE%";
The result is as follows:
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
| LS_ID | STATUS | PRIMARY_ZONE | UNIT_GROUP_ID | LS_GROUP_ID | CREATE_SCN          | DROP_SCN | SYNC_SCN            | READABLE_SCN        | FLAG      |
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
|  1003 | NORMAL | z1;z2        |             0 |           0 | 1683267390195713284 |     NULL | 1683337744205408139 | 1683337744205408139 | DUPLICATE |
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
View the location information and role information of a log stream
The location information of a log stream records the OBServer nodes on which the log stream is distributed. You can query for the location information of log streams from the MEMBER_LIST and LEARNER_LIST fields in the oceanbase.DBA_OB_LS_LOCATIONS view. Data partitions do not have independent location information. Instead, their locations are determined by the log streams they belong to. You can migrate and replicate log streams across OBServer nodes for performance balancing and disaster recovery.
The role information of a log stream indicates whether a given replica of the log stream is the leader or a follower. You can query the role information of log streams from the ROLE field in the oceanbase.DBA_OB_LS_LOCATIONS view. Data partitions do not have independent role information. Instead, their roles are determined by the log streams they belong to. The roles of log stream replicas are determined by the election protocol.
For more information about the oceanbase.DBA_OB_LS_LOCATIONS view, see oceanbase.DBA_OB_LS_LOCATIONS.
View the mapping between data partitions and log streams
You can query for the mapping between data partitions and log streams of a tenant from the oceanbase.DBA_OB_TABLE_LOCATIONS view. Each replica of a data partition records the basic information of the data partition and the information about the log stream to which the data partition belongs.
For more information about the oceanbase.DBA_OB_TABLE_LOCATIONS view, see oceanbase.DBA_OB_TABLE_LOCATIONS.
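Because data partitions take both their location and their role from the log stream they belong to, resolving where a partition lives is a two-step lookup. The sketch below uses in-memory stand-ins for the two views. The `LsReplica` class, the function names, and all IDs and node names are hypothetical, and no database is queried.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LsReplica:
    ls_id: int   # log stream ID
    node: str    # OBServer node that hosts this replica
    role: str    # "LEADER" or "FOLLOWER"


def partition_replicas(partition_id, partition_to_ls, ls_replicas):
    """Step 1: partition -> log stream. Step 2: log stream -> its replicas."""
    ls_id = partition_to_ls[partition_id]
    return [r for r in ls_replicas if r.ls_id == ls_id]


def partition_leader(partition_id, partition_to_ls, ls_replicas):
    """Return the node whose log stream replica is the leader, or None."""
    for replica in partition_replicas(partition_id, partition_to_ls, ls_replicas):
        if replica.role == "LEADER":
            return replica.node
    return None


# Hypothetical data: two partitions share log stream 1001.
partition_to_ls = {200001: 1001, 200002: 1001, 200003: 1002}
ls_replicas = [
    LsReplica(1001, "node-a", "LEADER"),
    LsReplica(1001, "node-b", "FOLLOWER"),
    LsReplica(1002, "node-b", "LEADER"),
    LsReplica(1002, "node-a", "FOLLOWER"),
]
print(partition_leader(200001, partition_to_ls, ls_replicas))
```

Partitions that share a log stream always resolve to the same set of nodes and the same leader, so migrating the log stream moves all of its partitions together.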