Based on the concept of table partitioning in conventional database services, OceanBase Database can divide the data of a table into different partitions. A partition is a logical object created by a user and a mechanism used to divide and manage table data.
Each partition has a corresponding data storage object, which is referred to as a tablet. A tablet stores data and supports transfer between servers. It is the smallest unit for data balancing.
A log stream (LS) is an entity that is automatically created and managed by OceanBase Database. It is a collection of data and contains several tablets and ordered redo logs. It uses the Paxos protocol to synchronize logs between replicas to ensure data consistency between the replicas and thereby implement high availability of data.
In addition, as a distributed database service, OceanBase Database copies the data in the same log stream to multiple OBServer nodes to ensure high availability of data reads and writes. These data copies on different OBServer nodes are called replicas. A log stream group (LS group) is an independent Paxos group that consists of a log stream and its replicas. The Paxos consensus protocol is used to ensure strong consistency among the members. In each LS group, one log stream serves as the leader and the rest as followers. The leader supports strong-consistency reads and writes, and the followers support weak-consistency reads.
Location cache
OceanBase Database organizes user data by log stream. Each log stream has multiple replicas for disaster recovery. To execute an SQL statement, OceanBase Database must obtain the locations of partition data. Then, OceanBase Database can locate the specific server to read data from or write data to the corresponding replica. Each observer process provides a service for refreshing and caching the partition locations required by the local server. The service is called a location cache service.
OceanBase Database introduces the concepts of log stream and tablet in V4.0.0. To obtain the locations of partition data, you must know the tablet corresponding to the partition, the mappings between tablets and log streams, and the locations of log stream replicas. You can obtain the information from the following data dictionaries of the sys tenant:
oceanbase.CDB_OBJECTS: You can setOBJECT_TYPEtoTABLE,TABLE PARTITION, orTABLE SUBPARTITIONto obtain theDATA_OBJECT_IDvalue, namely theTABLET_IDvalue, of the corresponding partition, which uniquely identifies a tablet in a tenant.oceanbase.CDB_OB_TABLET_TO_LS: allows you to obtain the log stream corresponding to a tablet. A log stream is uniquely identified byLS_ID.oceanbase.CDB_OB_LS_LOCATIONS: allows you to obtain the replica types and locations of a log stream.
Replica types
Several replica types are available based on the types of data stored. This is to support the different business preferences in terms of data security, performance scalability, availability, and cost.
OceanBase Database supports the following types of replicas:
Full-featured replica (FULL/F)
Read-only replica (READONLY/R)
A full-featured replica is also called a Paxos replica. Paxos replicas can constitute a Paxos group. A read-only replica is also called a non-Paxos replica, which cannot constitute a Paxos group.
Distributed consensus protocol
OceanBase Database synchronizes transaction logs among replicas of the same log stream based on the Paxos protocol. It commits transaction logs only when the logs are synchronized in the majority of replicas. By default, the leader ensures strong consistency reads and writes. Followers support weak consistency reads, which allows you to read data of an earlier version.
Data balancing
OceanBase Database uses the RootService to manage load balancing among the resource units of a tenant. The resources required vary according to the replica type. RootService performs load balancing based on the CPU utilization, disk usage, memory usage, and IOPS of each resource unit. To make full use of resources available on each OBServer node, RootService balances the usage of various resources among all OBServer nodes after load balancing.
RootService balances data in the following two ways:
Replica balancing
RootService adjusts the resource usage of a tenant on different servers by migrating resource units or log stream replicas or creating log stream replicas.
Leader balancing
Based on replica balancing, RootService further creates and balances log stream leaders among servers automatically based on the settings of the tenant such as the primary zone setting. Specifically, log stream leaders are aggregated on the same server to reduce the possibility of executing distributed transactions and the response time (RT) of business requests.