Concept
Replicas are a fundamental concept within the storage engine of OceanBase Database. From a user perspective, replicas are copies of the same data on different OBServer nodes.
From a database perspective, replicas are data partitions within OceanBase Database. Each data partition has multiple replicas based on tenant locality, providing high horizontal scalability and disaster recovery capabilities.
Data partitioning is the process of dividing a table or an index into smaller, more manageable parts based on rules specified when the table is created. Each partition is an independent object with its own name and optional storage characteristics.
Note
OceanBase Database is known for its multi-replica architecture that utilizes Paxos-based technology to ensure high availability. Replicas in this architecture consist of copies of the same data on different OBServer nodes. OceanBase Database stores data in containers of various dimensions, such as data partitions, log streams, units, and tenants. While replicas typically refer to data partition replicas, they may correspond to different database entities in various contexts.
Benefits
Replicas improve the availability and fault tolerance of OceanBase Database. Replicas can be distributed in different geographical locations to cope with network failures and data center failures.
In OceanBase Database, data is replicated to multiple replicas through partition replication or log synchronization to prevent data loss. In this way, OceanBase Database can still provide lossless database services when a minority of replicas fail.
Types
OceanBase Database adopts a hierarchical log-structured merge-tree (LSM Tree) structure. Data is divided into baseline data and incremental data.
Baseline data is persisted as SSTables in disks and is not modified once generated.
Incremental data is written by users into MemTables, which reside in memory. Redo logs, also called commit logs (clogs), are used to ensure transaction durability.
Multiple replicas of the data are distributed across OBServer nodes. For example, three replicas are available in the deployment mode of three IDCs in the same city, and five replicas are available in the deployment mode of five IDCs across three regions. During a transaction commit, redo logs are synchronized across OBServer nodes based on the Paxos protocol, to complete the commit for a majority of replicas and maintain data consistency between replicas.
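The majority rule behind the Paxos-based commit described above comes down to simple arithmetic. The following is a minimal sketch, not OceanBase code. The function names `majority` and `tolerated_failures` are hypothetical, and the sketch assumes that every replica counted is a voting replica.

```python
def majority(voting_replicas: int) -> int:
    """Return the smallest number of replicas that forms a majority."""
    if voting_replicas < 1:
        raise ValueError("at least one voting replica is required")
    return voting_replicas // 2 + 1


def tolerated_failures(voting_replicas: int) -> int:
    """Return how many replicas can fail while a majority is still reachable."""
    return voting_replicas - majority(voting_replicas)


# The two deployment modes mentioned in the text: three and five replicas.
for n in (3, 5):
    print(f"{n} replicas: majority={majority(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under these assumptions, a three-replica deployment commits once two replicas have persisted the redo log and survives the loss of one replica, while a five-replica deployment needs three acknowledgments and survives the loss of two.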
OceanBase Database currently supports two types of replicas: full-featured replicas and read-only replicas. Full-featured replicas, also called standard replicas, are named FULL, or F for short. A full-featured replica holds the complete set of data and features, including redo logs, a MemTable, and SSTables. Read-only replicas are named READONLY, or R for short. They provide read capabilities but not write capabilities. A read-only replica can serve only as a follower and cannot participate in election or voting. In other words, a read-only replica cannot be elected as the leader of a log stream.
Full-featured replicas are classified into leaders and followers based on data partitions. Leaders mainly provide external write services and strong-consistency read services, as well as weak-consistency read services. Followers provide external weak-consistency read services. If the current leader fails, a follower can be quickly elected as the new leader to provide external services.
Log streams
Concept
Log streams are entities that are automatically created and managed in OceanBase Database. A log stream is a collection of data that includes several data partitions, the logs of transaction operations performed on those data partitions, and transaction management structures. The redo log module runs on the Paxos protocol and synchronizes logs across replicas to ensure data consistency between replicas and high availability of data. TxCtxMgr is the transaction management structure. Modifications to all data partitions in a log stream can be atomically committed within the log stream. A transaction that spans multiple log streams is atomically committed by using the optimized two-phase commit protocol of OceanBase Database, in which the log streams are the participants of the distributed transaction.

Log streams are a new concept introduced in OceanBase Database V4.0. A significant difference between OceanBase Database V4.0 and OceanBase Database V3.x lies in the basic unit for transaction commits.
In OceanBase Database V3.x, transactions are committed by partition. The write-ahead logging (WAL) mechanism ensures the atomicity of modifications within a partition. Each partition is a participant in two-phase commit, and the basic unit for transaction commit is a partition.
In OceanBase Database V4.x, transactions are committed by log stream. The WAL mechanism ensures the atomicity of modifications within a log stream. Each log stream is a participant in two-phase commit, and the basic unit for transaction commit is a log stream. This way, OceanBase Database is optimized in terms of resources, performance, and features.
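The effect of changing the commit unit can be sketched by counting two-phase commit participants for the same transaction under both models. This is an illustrative model only. The partition names, the log stream IDs, and both helper functions are hypothetical and do not correspond to OceanBase internals.

```python
from typing import Dict, Iterable, Set


def participants_by_partition(touched: Iterable[str]) -> Set[str]:
    """V3.x model: every modified partition is a commit participant."""
    return set(touched)


def participants_by_log_stream(touched: Iterable[str],
                               partition_to_ls: Dict[str, int]) -> Set[int]:
    """V4.x model: every log stream that holds a modified partition
    is a commit participant."""
    return {partition_to_ls[p] for p in touched}


# Hypothetical mapping of data partitions to log streams.
partition_to_ls = {"t1_p0": 1001, "t1_p1": 1001, "t2_p0": 1001, "t2_p1": 1002}

# A transaction that modifies three partitions of the same log stream.
touched = ["t1_p0", "t1_p1", "t2_p0"]

print(len(participants_by_partition(touched)))                     # 3
print(len(participants_by_log_stream(touched, partition_to_ls)))   # 1
```

In this example, the partition-based model coordinates three participants, whereas the log-stream-based model has a single participant, so the modifications can be committed atomically within that log stream.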
Broadcast log streams
OceanBase Database introduces the concept of broadcast log streams in V4.2.0. When the first replicated table is created for a tenant, the system automatically creates a special log stream, which is called a broadcast log stream. Then, subsequent replicated tables of this tenant are all created in this broadcast log stream. A broadcast log stream differs from a general log stream in that the broadcast log stream automatically deploys a replica on each OBServer node of the tenant, to ensure that the replicated table can provide strong-consistency reads on any OBServer node in ideal conditions.
Generally, the more replicas participating in Paxos voting, the longer the time required for the majority of replicas to reach a consensus. For a tenant with many OBServer nodes, it is infeasible for replicas on all OBServer nodes to participate in voting. Therefore, the broadcast log stream deploys a read-only replica on an OBServer node whose replica does not need to participate in voting, and deploys a full-featured replica on an OBServer node whose replica needs to participate in voting.
The following list describes the differences between a broadcast log stream and a general log stream in terms of replicas:
A general log stream deploys only one replica in each zone, and the replica type must match the one specified in the locality.
In each zone, in addition to a replica of the type specified in the locality, a broadcast log stream also deploys a read-only replica on each OBServer node on which resources of the resource unit for the tenant are distributed. A broadcast log stream does not deploy any replica in a zone for which no replica type is specified in the locality.
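The two placement rules above can be compared with a small model. This is a sketch under stated assumptions rather than the actual scheduling logic: the function `place_replicas`, the zone and server names, and the choice of the first listed server as the host of the locality-specified replica are all hypothetical.

```python
from typing import Dict, List, Tuple

Replica = Tuple[str, str, str]  # (zone, server, replica type)


def place_replicas(locality: Dict[str, str],
                   unit_servers: Dict[str, List[str]],
                   broadcast: bool) -> List[Replica]:
    """Model the replica placement of a general or broadcast log stream.

    locality     -- zone -> replica type specified in the locality
    unit_servers -- zone -> servers that hold resource units of the tenant
    broadcast    -- True for a broadcast log stream
    """
    replicas: List[Replica] = []
    # Zones without a replica type in the locality are never visited,
    # so they receive no replica in either case.
    for zone, replica_type in locality.items():
        servers = unit_servers.get(zone, [])
        if not servers:
            continue
        # Assumption: the first server hosts the locality-specified replica.
        replicas.append((zone, servers[0], replica_type))
        if broadcast:
            # Remaining servers with tenant units get a read-only replica.
            for server in servers[1:]:
                replicas.append((zone, server, "READONLY"))
    return replicas


locality = {"z1": "FULL", "z2": "FULL"}
unit_servers = {"z1": ["s1", "s2"], "z2": ["s3", "s4"], "z3": ["s5"]}

print(place_replicas(locality, unit_servers, broadcast=False))
print(place_replicas(locality, unit_servers, broadcast=True))
```

With this input, the general log stream gets one FULL replica in each of z1 and z2, the broadcast log stream additionally gets READONLY replicas on s2 and s4, and neither deploys anything in z3 because the locality specifies no replica type for it.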
The following list describes the limitations of broadcast log streams:
The sys tenant and all meta tenants do not have a broadcast log stream and do not support replicated tables.
A user tenant can have only one broadcast log stream.
Attribute conversion between a broadcast log stream and a general log stream is not supported.
A broadcast log stream cannot be separately deleted. At present, a broadcast log stream can only be deleted together with the corresponding tenant.
View the basic information of log streams
You can query the DBA_OB_LS view for the basic information of all log streams in the current tenant, including the log synchronization status and progress. Here are some examples:
View the information of a general log stream
You can view the basic information of log streams in the sys tenant or a user tenant. Execute the following statement in the sys tenant. The result shows that the sys tenant has only one log stream with the ID 1.
SELECT * FROM oceanbase.DBA_OB_LS LIMIT 10;
The following result is returned:
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+
| LS_ID | STATUS | PRIMARY_ZONE                           | UNIT_GROUP_ID | LS_GROUP_ID | CREATE_SCN | DROP_SCN | SYNC_SCN | READABLE_SCN |
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+
|     1 | NORMAL | sa128_obv4_2;sa128_obv4_1,sa128_obv4_3 |             0 |           0 |       NULL |     NULL |     NULL |         NULL |
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+
1 row in set
View the information of a broadcast log stream
You can view the information of a broadcast log stream only in a user tenant. The sys tenant does not have a broadcast log stream. Execute the following statement in a user tenant. The result shows the information of the broadcast log stream of the user tenant. The replicated table is created in the broadcast log stream.
SELECT * FROM oceanbase.DBA_OB_LS WHERE flag LIKE "%DUPLICATE%";
The following result is returned:
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
| LS_ID | STATUS | PRIMARY_ZONE | UNIT_GROUP_ID | LS_GROUP_ID | CREATE_SCN          | DROP_SCN | SYNC_SCN            | READABLE_SCN        | FLAG      |
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
|  1003 | NORMAL | z1;z2        |             0 |           0 | 1683267390195713284 |     NULL | 1683337744205408139 | 1683337744205408139 | DUPLICATE |
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
View the location information and role information of a log stream
The location information of a log stream records the OBServer nodes on which the log stream is distributed. You can query the distribution of full-featured replicas and read-only replicas in the log stream respectively from the MEMBER_LIST and LEARNER_LIST fields in the oceanbase.DBA_OB_LS_LOCATIONS view. No independent location information is provided for data partitions. Instead, the locations of data partitions are determined by the locations of log streams to which the data partitions belong. You can migrate and replicate log streams across OBServer nodes for performance balancing and disaster recovery.
The role information of a log stream defines whether the log stream is a leader or a follower. You can query the role information of log streams from the ROLE field in the oceanbase.DBA_OB_LS_LOCATIONS view. No independent role information is provided for data partitions. Instead, the roles of data partitions are determined by the roles of log streams to which the data partitions belong. The roles of log streams are determined based on the election protocol.
For more information about the oceanbase.DBA_OB_LS_LOCATIONS view, see DBA_OB_LS_LOCATIONS.
View the mapping between data partitions and log streams
You can query the mapping between data partitions and log streams of a tenant from the DBA_OB_TABLE_LOCATIONS view. Each replica of a data partition records the basic information of the data partition and the information about the log stream to which the data partition belongs.
For more information about the oceanbase.DBA_OB_TABLE_LOCATIONS view, see DBA_OB_TABLE_LOCATIONS.
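Because data partitions carry no location or role information of their own, a client-side lookup has to go through the log stream. The following sketch illustrates that indirection with in-memory data. The `LsReplica` class, the helper functions, and all sample values are hypothetical. In practice the inputs would come from the DBA_OB_TABLE_LOCATIONS and DBA_OB_LS_LOCATIONS views.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LsReplica:
    """One replica of a log stream (illustrative, not the view schema)."""
    ls_id: int
    server: str
    role: str  # "LEADER" or "FOLLOWER"


def partition_replicas(partition: str,
                       partition_to_ls: Dict[str, int],
                       ls_locations: List[LsReplica]) -> List[LsReplica]:
    """A partition is located wherever its log stream is located."""
    ls_id = partition_to_ls[partition]
    return [r for r in ls_locations if r.ls_id == ls_id]


def partition_leader(partition: str,
                     partition_to_ls: Dict[str, int],
                     ls_locations: List[LsReplica]) -> Optional[str]:
    """A partition's leader is the server that leads its log stream."""
    for replica in partition_replicas(partition, partition_to_ls, ls_locations):
        if replica.role == "LEADER":
            return replica.server
    return None


# Hypothetical sample data.
partition_to_ls = {"t1_p0": 1001, "t1_p1": 1002}
ls_locations = [
    LsReplica(1001, "10.0.0.1:2882", "LEADER"),
    LsReplica(1001, "10.0.0.2:2882", "FOLLOWER"),
    LsReplica(1002, "10.0.0.1:2882", "FOLLOWER"),
    LsReplica(1002, "10.0.0.2:2882", "LEADER"),
]

print(partition_leader("t1_p0", partition_to_ls, ls_locations))
print(partition_leader("t1_p1", partition_to_ls, ls_locations))
```

The design point is that moving or re-electing a log stream changes the location or role of every data partition in it at once, without any per-partition bookkeeping.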