Based on the concept of table partitioning in conventional database services, OceanBase Database can divide the data of a table into different partitions. A partition is a logical object created by a user and a mechanism used to divide and manage table data.
Each partition has a corresponding data storage object, which is referred to as a tablet. A tablet stores data and supports transfer between servers. It is the smallest unit for data balancing.
A log stream (LS) is an entity that is automatically created and managed by OceanBase Database. It is a collection of data and contains several tablets and ordered redo logs. It uses the Paxos protocol to synchronize logs between replicas to ensure data consistency between the replicas and thereby implement high availability of data.
In addition, as a distributed database service, OceanBase Database copies the data in the same log stream to multiple OBServer nodes to ensure high availability of data reads and writes. These data copies on different OBServer nodes are called replicas. A log stream group is an independent Paxos group that consists of a log stream and its replicas. The Paxos consensus protocol is used to ensure strong consistency among the members. In each log stream group, one log stream serves as the leader and the rest as followers. The leader supports strong-consistency reads and writes, and the followers support weak-consistency reads.
Location cache
OceanBase Database organizes user data by log stream. Each log stream has multiple replicas for disaster recovery. To execute an SQL statement, OceanBase Database must obtain the locations of partition data. Then, OceanBase Database can locate the specific server to read data from or write data to the corresponding replica. Each observer process provides a service for refreshing and caching the partition locations required by the local server. The service is called a location cache service.
OceanBase Database introduces the concepts of log stream and tablet in V4.0.0. To obtain the locations of partition data, you must know the tablet corresponding to the partition, the mappings between tablets and log streams, and the locations of log stream replicas. You can obtain the information from the following data dictionaries of the sys tenant:
oceanbase.CDB_OBJECTS: You can setOBJECT_TYPEtoTABLE,TABLE PARTITION, orTABLE SUBPARTITIONto obtain theDATA_OBJECT_IDvalue, namely theTABLET_IDvalue, of the corresponding partition, which uniquely identifies a tablet in a tenant.oceanbase.CDB_OB_TABLET_TO_LS: allows you to obtain the log stream corresponding to a tablet. A log stream is uniquely identified byLS_ID.oceanbase.CDB_OB_LS_LOCATIONS: allows you to obtain the replica types and locations of a log stream.
Replica types
Several replica types are available based on the types of data stored. This is to support the different business preferences in terms of data security, performance scalability, availability, and cost.
OceanBase Database supports the following four types of replicas:
Full-featured replica (FULL/F)
Read-only replica (READONLY/R)
A full-featured replica is also called a Paxos replica. Paxos replicas can constitute a Paxos group. A read-only replica is also called a non-Paxos replica, which cannot constitute a Paxos group.
Distributed consensus protocol
OceanBase Database synchronizes transaction logs among replicas of the same log stream based on the Paxos protocol. It commits transaction logs only when the logs are synchronized in the majority of replicas. By default, the leader ensures strong consistency reads and writes. Followers support weak consistency reads, which allows you to read data of an earlier version.
Data balancing
OceanBase Database uses the RootService to manage load balancing among the resource units of a tenant. Different types of replicas require different amounts of resources. RootService considers the CPU utilization, disk usage, memory usage, and IOPS of each resource unit when it balances data. To make full use of resources available on each OBServer node, RootService balances the usage of various resources among all OBServer nodes after load balancing.
RootService balances data in the following two ways:
Replica balancing
RootService adjusts the resource usage of a tenant on different servers by migrating resource units or replicating or migrating log stream replicas.
Leader balancing
After replica balancing is achieved, RootService further creates and balances log stream leaders among servers based on the primary zone of the tenant. Specifically, log stream leaders are aggregated on the same server to reduce the possibility of executing distributed transactions and the response time (RT) of business requests.