Replica concept
A replica is a copy of data in OceanBase Database's storage engine. The concept of replicas applies to data at the user level.
In OceanBase Database, data partitions provide high scalability and disaster recovery capabilities. A data partition is a logical container that holds data and logs. Multiple replicas of a data partition are distributed across nodes based on the locality attribute of the tenant to which the data partition belongs.
A table or index can be divided into multiple, smaller, and more manageable parts based on a specific rule. Each of these parts is a data partition and an independent object with its own name and optional storage characteristics.
Note
OceanBase Database is known for its multi-replica architecture, which is built on the Paxos protocol and provides high availability. In a multi-replica setup, a replica is a copy of data on a different node. Data in OceanBase Database is stored in data partitions, log streams, units, and tenants. Generally, when we refer to a replica, we mean a replica of a data partition. However, the term "replica" can refer to different database entities depending on the context.
Replicas
Replicas enhance the availability and fault tolerance of OceanBase Database. Replicas can be deployed across different regions to guard against network failures or data center outages.
OceanBase Database replicates data across multiple replicas through partition replication and log synchronization. This prevents data loss and ensures that database services remain available without losing data even if some replicas fail.
Types of replicas
The storage engine of OceanBase Database uses a layered LSM-Tree structure. Data in this structure is divided into two parts: baseline data and incremental data.
Baseline data is data that is written to the disk and persisted. Once baseline data is generated, it will not be modified. This data is stored in SSTables.
Incremental data is data that is stored in memory. When you write data, it is written to memory first. This data is stored in MemTables. To ensure transactionality, the modifications are also recorded in redo logs (also known as CommitLogs or clogs).
This data is stored redundantly in multiple replicas (for example, three replicas deployed in one region, or five replicas deployed across three regions). When a transaction is committed, the Paxos protocol is used to synchronize the redo logs across multiple nodes to achieve majority commit. This maintains consistency among replicas.
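The majority rule behind a Paxos-style commit can be illustrated with a small sketch. This is not OceanBase code: the function names are hypothetical, and the sketch covers only the counting rule (a log entry is committed once more than half of the voting replicas have persisted it), not the protocol itself.

```python
def majority(replica_count: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replica_count // 2 + 1


def is_committed(acks: int, replica_count: int) -> bool:
    """A redo log entry counts as committed once a majority has persisted it."""
    return acks >= majority(replica_count)


def tolerated_failures(replica_count: int) -> int:
    """Number of replicas that can fail while a majority is still reachable."""
    return replica_count - majority(replica_count)


# The two deployments mentioned in the text: three and five replicas.
for n in (3, 5):
    print(n, majority(n), tolerated_failures(n))
```

With three replicas, two acknowledgements are enough to commit and one replica may fail. With five replicas, three acknowledgements are needed and two replicas may fail.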
OceanBase Database currently supports full-featured replicas and read-only replicas. Full-featured replicas, also known as primary replicas, are named FULL and abbreviated as F. They store complete redo logs, MemTables, and SSTables. Read-only replicas are named READONLY and abbreviated as R. They provide only read capabilities and do not accept write requests. They follow log streams but do not participate in elections or log voting, so they cannot be elected as leaders of log streams.
Full-featured replicas have the concept of a role, which applies to data partitions. The roles are leader and follower. Leaders primarily provide write services and strong-consistency read services, and can also provide weak-consistency read services. Followers provide only weak-consistency read services. If a leader fails, a follower can quickly take over as the leader to continue providing services.
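The rules above can be summarized as a few predicates. The sketch below is a simplified model with hypothetical names, not an OceanBase API. It encodes only what the text states: F replicas vote and can lead, and only the leader accepts writes and strong-consistency reads.

```python
from enum import Enum


class ReplicaType(Enum):
    FULL = "F"
    READONLY = "R"


class Role(Enum):
    LEADER = "leader"
    FOLLOWER = "follower"


def can_vote(replica_type: ReplicaType) -> bool:
    """Only full-featured replicas take part in elections and log voting."""
    return replica_type is ReplicaType.FULL


def can_be_leader(replica_type: ReplicaType) -> bool:
    """Read-only replicas can only follow a log stream."""
    return replica_type is ReplicaType.FULL


def accepts_write(replica_type: ReplicaType, role: Role) -> bool:
    """Writes are served by the leader, which must be a full-featured replica."""
    return can_be_leader(replica_type) and role is Role.LEADER


def serves_strong_read(replica_type: ReplicaType, role: Role) -> bool:
    """Strong-consistency reads are served by the leader only."""
    return accepts_write(replica_type, role)
```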
Log stream introduction
Log stream concepts
A log stream is an entity automatically created and managed by OceanBase Database. It represents a collection of data, including multiple data partitions and the transaction logs and transaction management structures for operating on the data. The redo log module, which is implemented based on the Paxos protocol, synchronizes logs across replicas to ensure data consistency and achieve high availability. The TxCtxMgr transaction management structure enables atomic commits of modifications on all data partitions within a log stream. For modifications across log streams, it uses the two-phase commit protocol to ensure atomicity. As a participant in a distributed transaction, a log stream is also referred to as a transaction stream.

The log stream is a new concept introduced in OceanBase Database V4.0. It replaces the partition as the transaction commit unit used in OceanBase Database V3.x. This change brings significant benefits in terms of resources, performance, and features.
In OceanBase Database V3.x, the transaction commit unit is a partition. Modifications within a partition are atomically committed based on write-ahead logging (WAL). Each partition participates in a two-phase commit, making it the basic unit of transaction commit.
In OceanBase Database V4.x, the transaction commit unit is a log stream. Modifications within a log stream are atomically committed based on WAL. Each log stream participates in a two-phase commit, making it the basic unit of transaction commit.
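To make the V4.x commit unit concrete, the sketch below decides how a transaction would be committed based on the log streams its partitions map to. It is only an illustration: the partition names, log stream IDs, and function names are hypothetical, and the two-phase commit is reduced to its decision rule (commit only if every participant votes yes in the prepare phase).

```python
from typing import Dict, Iterable, Set


def involved_log_streams(partitions: Iterable[str],
                         partition_to_ls: Dict[str, int]) -> Set[int]:
    """Return the IDs of the log streams touched by a transaction."""
    return {partition_to_ls[p] for p in partitions}


def commit_protocol(partitions: Iterable[str],
                    partition_to_ls: Dict[str, int]) -> str:
    """One log stream commits atomically on its own; several need 2PC."""
    streams = involved_log_streams(partitions, partition_to_ls)
    if not streams:
        raise ValueError("transaction modifies no partitions")
    return "single-log-stream commit" if len(streams) == 1 else "two-phase commit"


def two_phase_decision(prepare_votes: Dict[int, bool]) -> str:
    """Each participating log stream votes in the prepare phase."""
    return "COMMIT" if all(prepare_votes.values()) else "ABORT"


# Hypothetical mapping: p1 and p2 share a log stream, p3 lives in another.
mapping = {"p1": 1001, "p2": 1001, "p3": 1002}
print(commit_protocol(["p1", "p2"], mapping))
print(commit_protocol(["p1", "p3"], mapping))
```

In this model, a transaction that modifies p1 and p2 involves a single participant even though it touches two partitions, which is the resource saving the log stream design aims for.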
Broadcast log stream
Starting from OceanBase Database V4.2.0, the concept of a broadcast log stream is introduced. When the first replicated table is created for a tenant, a special log stream, known as a broadcast log stream, is automatically created for the tenant. Then, any replicated tables created for the tenant are created in the broadcast log stream. Unlike a normal log stream, a broadcast log stream is automatically deployed with a replica on each OBServer node in the tenant to ensure strong consistency for reads from any OBServer node in ideal conditions.
Generally, the more replicas participate in a vote, the longer it takes to reach a majority. If a tenant has many OBServer nodes, it is not practical to have all of them participate in a vote. Therefore, the broadcast log stream deploys read-only (R) replicas on OBServer nodes that do not participate in a vote and full-featured (F) replicas on OBServer nodes that do.
The differences between a broadcast log stream and a normal log stream in terms of replica deployment are as follows:
In a normal log stream, each zone can have only one replica, and the type of this replica must match the replica type specified in the locality.
In a broadcast log stream, in addition to the replica of the type specified in the locality for each zone, a readonly replica is deployed on each server within the zone that has unit resources of the tenant. Zones not described in the locality can have no replicas.
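The two placement rules can be compared with a small sketch. It is a simplified model, not the actual placement algorithm. The zone and server names are hypothetical, and the sketch assumes that the replica required by the locality goes to the first server listed for a zone, while the remaining servers holding the tenant's units receive the extra read-only replicas.

```python
from typing import Dict, List, Tuple

Replica = Tuple[str, str, str]  # (zone, server, replica type)


def place_replicas(locality: Dict[str, str],
                   unit_servers: Dict[str, List[str]],
                   broadcast: bool) -> List[Replica]:
    """Place the replicas of one log stream.

    locality maps a zone to the replica type it requires ("F" or "R").
    unit_servers maps a zone to the servers that hold the tenant's units.
    """
    replicas: List[Replica] = []
    for zone, replica_type in locality.items():
        servers = unit_servers.get(zone, [])
        if not servers:
            continue
        # Assumption for illustration: the locality replica goes to the
        # first server of the zone.
        replicas.append((zone, servers[0], replica_type))
        if broadcast:
            # A broadcast log stream adds an R replica on every other
            # server in the zone that has unit resources of the tenant.
            replicas.extend((zone, server, "R") for server in servers[1:])
    # Zones that are absent from the locality never receive a replica.
    return replicas


locality = {"z1": "F", "z2": "F", "z3": "F"}
unit_servers = {"z1": ["s1", "s2"], "z2": ["s3", "s4"],
                "z3": ["s5", "s6"], "z4": ["s7"]}
normal = place_replicas(locality, unit_servers, broadcast=False)
broadcast = place_replicas(locality, unit_servers, broadcast=True)
print(len(normal), len(broadcast))
```

In this example the normal log stream ends up with three replicas, one per zone, while the broadcast log stream has six, one on each server in z1, z2, and z3. Server s7 in z4 gets none because z4 is not part of the locality.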
The limitations on a broadcast log stream are as follows:
The sys tenant and all meta tenants do not have broadcast log streams and do not support creating replicated tables.
A user tenant can have at most one broadcast log stream.
You cannot convert the attributes between a broadcast log stream and a normal log stream.
You cannot manually delete a broadcast log stream. It will be deleted when the corresponding tenant is deleted.
Query basic information about log streams
You can query the DBA_OB_LS view for basic information about all log streams in the current tenant, such as the status and log progress. For example:
Query information about normal log streams
Both the sys tenant and user tenants can query the basic information about the log streams in the current tenant. The following example shows a query performed in the sys tenant, which is the only tenant in this example.
SELECT * FROM oceanbase.DBA_OB_LS LIMIT 10;
The result is as follows.
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+-----------+
| LS_ID | STATUS | PRIMARY_ZONE                           | UNIT_GROUP_ID | LS_GROUP_ID | CREATE_SCN | DROP_SCN | SYNC_SCN | READABLE_SCN | FLAG      |
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+-----------+
|     1 | NORMAL | sa128_obv4_2;sa128_obv4_1,sa128_obv4_3 |             0 |           0 |       NULL |     NULL |     NULL |         NULL |           |
+-------+--------+----------------------------------------+---------------+-------------+------------+----------+----------+--------------+-----------+
1 row in set
Query information about broadcast log streams
Only user tenants can query broadcast log streams. The sys tenant does not have broadcast log streams. The following example shows a query performed in a user tenant. The result is the information about the broadcast log stream in the tenant, where replicated tables are created.
SELECT * FROM oceanbase.DBA_OB_LS WHERE flag LIKE "%DUPLICATE%";
The result is as follows.
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
| LS_ID | STATUS | PRIMARY_ZONE | UNIT_GROUP_ID | LS_GROUP_ID | CREATE_SCN          | DROP_SCN | SYNC_SCN            | READABLE_SCN        | FLAG      |
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
|  1003 | NORMAL | z1;z2        |             0 |           0 | 1683267390195713284 |     NULL | 1683337744205408139 | 1683337744205408139 | DUPLICATE |
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+
View the location and role information of log streams
Log streams contain location information that indicates the nodes on which they are distributed. You can query the MEMBER_LIST and LEARNER_LIST columns of the oceanbase.DBA_OB_LS_LOCATIONS view for the distribution of full-featured replicas and read-only replicas, respectively. Data partitions no longer have independent location information. Instead, they inherit the location information of the log stream to which they belong. Log streams can be migrated and replicated to different nodes for performance balancing and disaster recovery.
Log streams contain role information that indicates whether they are leaders or followers. You can query the ROLE column of the oceanbase.DBA_OB_LS_LOCATIONS view for the role of each log stream. Data partitions no longer have independent role information. Instead, they inherit the role of the log stream to which they belong. Log stream roles are elected based on the election protocol.
For more information about the oceanbase.DBA_OB_LS_LOCATIONS view, see DBA_OB_LS_LOCATIONS.
View the mappings between data partitions and log streams
You can query the DBA_OB_TABLE_LOCATIONS view for the mappings between data partitions and log streams in the current tenant. Each replica of each data partition is represented as a row in the view, which records the basic information of the data partition and the log stream to which it belongs.
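The "one row per replica of each data partition" shape follows from partitions inheriting location and role from their log stream. The sketch below reproduces that idea with in-memory data. It does not query OceanBase, and the type names, field names, partition names, and server addresses are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class LsReplica(NamedTuple):
    ls_id: int
    server: str
    role: str


class PartitionRow(NamedTuple):
    partition: str
    ls_id: int
    server: str
    role: str


def partition_locations(partition_to_ls: Dict[str, int],
                        ls_replicas: List[LsReplica]) -> List[PartitionRow]:
    """Emit one row per partition replica, inherited from its log stream."""
    rows: List[PartitionRow] = []
    for partition, ls_id in partition_to_ls.items():
        for replica in ls_replicas:
            if replica.ls_id == ls_id:
                rows.append(PartitionRow(partition, ls_id,
                                         replica.server, replica.role))
    return rows


# Hypothetical data: one log stream with three replicas, two partitions.
ls_replicas = [
    LsReplica(1001, "10.0.0.1:2882", "LEADER"),
    LsReplica(1001, "10.0.0.2:2882", "FOLLOWER"),
    LsReplica(1001, "10.0.0.3:2882", "FOLLOWER"),
]
partition_to_ls = {"t1.p0": 1001, "t1.p1": 1001}
rows = partition_locations(partition_to_ls, ls_replicas)
for row in rows:
    print(row)
```

Because both partitions belong to the same log stream, their leader replicas are always on the same server in this model. A leader switch on the log stream moves the leader of every partition it contains at once.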