OceanBase Database supports two deployment modes: Shared-Nothing (SN) and Shared-Storage (SS).
The SN mode is the most commonly used deployment mode for OceanBase Database. In this mode, all nodes are peers: each node has its own SQL engine, storage engine, and transaction engine. The nodes are deployed on a cluster of general-purpose PC servers. This mode offers high scalability, high availability, high performance, low cost, and strong compatibility with mainstream databases.
An OceanBase cluster consists of multiple nodes. These nodes are distributed across multiple zones. Each node belongs to one zone. A zone is a logical concept that represents a group of nodes with similar hardware availability within the cluster. The meaning of a zone varies depending on the deployment mode. For example, when the entire cluster is deployed in the same data center (IDC), the nodes in a zone can be on the same rack or connected to the same switch. When the cluster is distributed across multiple data centers, each zone can correspond to one data center. Each zone has two attributes: IDC and region. The IDC attribute describes the data center where the zone is located, and the region attribute describes the region to which the data center belongs. Generally, a region refers to the city where the IDC is located. The IDC and region attributes of a zone must reflect the actual deployment situation to ensure that the automatic disaster recovery and optimization strategies within the cluster work effectively. Based on different high-availability requirements of business applications, OceanBase Database provides multiple deployment modes. For more information, see OceanBase cluster high-availability deployment solutions.
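For example, the IDC and region attributes of each zone can be set with `ALTER SYSTEM` statements. The sketch below assumes a three-zone cluster spanning two data centers in one region; the zone, IDC, and region names are illustrative, and the exact syntax may vary by version:

```sql
-- Hypothetical three-zone cluster: zone1/zone2 in idc1, zone3 in idc2,
-- all in the same region. Names are examples, not defaults.
ALTER SYSTEM MODIFY ZONE 'zone1' SET IDC = 'idc1', REGION = 'hangzhou';
ALTER SYSTEM MODIFY ZONE 'zone2' SET IDC = 'idc1', REGION = 'hangzhou';
ALTER SYSTEM MODIFY ZONE 'zone3' SET IDC = 'idc2', REGION = 'hangzhou';
```

Keeping these attributes in sync with the physical deployment is what allows the cluster's automatic disaster recovery and placement strategies to make correct decisions.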
In OceanBase Database, the data of a table can be horizontally partitioned into multiple shards based on a user-specified partitioning rule. Each shard is called a table partition, or simply a partition. Each row of data belongs to, and is stored in, exactly one partition. The partitioning rule is specified when the table is created and can be hash, range, or list partitioning; subpartitioning (composite partitioning) is also supported. For example, in a transaction database, the orders table can be hash-partitioned by user ID into multiple primary partitions, and each primary partition can then be subpartitioned by month. For a subpartitioned table, each subpartition is a physical partition, while a primary partition is a logical concept. The partitions of a table can be distributed across multiple nodes in the same zone. Each physical partition has a storage-layer object called a tablet, which stores ordered data records.
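The orders example above can be sketched in MySQL-mode DDL. The table and column names are illustrative, and the exact composite-partitioning syntax (notably the `SUBPARTITION TEMPLATE` clause) may differ across versions:

```sql
-- Sketch: orders hash-partitioned by user ID, with each primary
-- partition range-subpartitioned by month. Names are hypothetical.
CREATE TABLE orders (
  user_id    BIGINT        NOT NULL,
  order_id   BIGINT        NOT NULL,
  order_time DATETIME      NOT NULL,
  amount     DECIMAL(10,2),
  PRIMARY KEY (user_id, order_id, order_time)
)
PARTITION BY HASH(user_id)
SUBPARTITION BY RANGE COLUMNS(order_time)
SUBPARTITION TEMPLATE (
  SUBPARTITION p202401 VALUES LESS THAN ('2024-02-01'),
  SUBPARTITION p202402 VALUES LESS THAN ('2024-03-01'),
  SUBPARTITION pmax    VALUES LESS THAN (MAXVALUE)
)
PARTITIONS 4;
```

With 4 primary partitions and 3 subpartitions each, this table has 12 physical partitions, each backed by its own tablet.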
When a record in a tablet is modified, a redo log record must be written to the log stream (LS) that the tablet belongs to, so that the change is durable. Each log stream serves multiple tablets on the same node. To protect data and keep the service available when a node fails, each log stream and its tablets have multiple replicas, generally distributed across different zones. Only one replica, called the leader, accepts modification operations; the other replicas are called followers. The leader and followers keep their data consistent through a distributed consensus protocol based on Multi-Paxos. If the node hosting the leader fails, one of the followers is elected as the new leader and continues to provide service.
Each node in the cluster runs an observer service process. The observer process contains multiple operating system threads. All nodes have the same functionality. Each observer process is responsible for storing and retrieving partitioned data on its own node and for parsing and executing SQL statements routed to this node. The observer processes on different nodes communicate with each other using the TCP/IP protocol. Each observer process also listens for connection requests from external applications, establishes connections and database sessions, and provides database services. For more information about the observer service process, see Thread concepts.
To simplify the management of multiple business databases and reduce resource costs, OceanBase Database provides a unique multi-tenant feature. In an OceanBase cluster, you can create multiple isolated databases called tenants. From the perspective of applications, each tenant is equivalent to an independent database instance. Moreover, each tenant can choose between MySQL and Oracle compatibility modes. After connecting to a MySQL tenant, an application can create users and databases within the tenant, enjoying the same experience as using an independent MySQL database. Similarly, after connecting to an Oracle tenant, an application can create schemas and manage roles within the tenant, enjoying the same experience as using an independent Oracle database. After a new cluster is initialized, a special tenant named sys, which is the system tenant, is created. The system tenant stores the metadata of the cluster and operates in MySQL compatibility mode.
Applicability
Only MySQL mode is provided in OceanBase Database Community Edition.
To isolate resources among tenants, each observer process can host multiple virtual containers, called resource units, for different tenants. A resource unit includes CPU and memory resources. The resource units of a tenant across multiple nodes form a resource pool.
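A minimal sketch of this chain in MySQL-mode SQL, issued from the sys tenant: define a resource unit specification, build a resource pool from it, then create a tenant on that pool. All names, sizes, and zone names below are illustrative, and options may vary by version:

```sql
-- Resource unit spec: per-unit CPU and memory (illustrative sizes).
CREATE RESOURCE UNIT small_unit MAX_CPU 2, MEMORY_SIZE '4G';

-- Resource pool: one unit in each of three hypothetical zones.
CREATE RESOURCE POOL small_pool
  UNIT = 'small_unit', UNIT_NUM = 1,
  ZONE_LIST = ('zone1', 'zone2', 'zone3');

-- MySQL-compatible tenant backed by that pool.
CREATE TENANT app_tenant
  RESOURCE_POOL_LIST = ('small_pool')
  SET ob_compatibility_mode = 'mysql';
```

Once created, applications connect to `app_tenant` as if it were an independent MySQL instance, confined to the CPU and memory of its resource units.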
To shield applications from the internal details of partitioning and replica distribution, so that accessing OceanBase Database is as simple as accessing a standalone database, OceanBase provides OceanBase Database Proxy (ODP), also known as OBProxy. Applications do not connect to OceanBase Database nodes directly. Instead, they connect to ODP, which forwards their SQL requests to the appropriate OceanBase Database nodes. ODP itself is stateless, so multiple ODP instances can expose a single network address to applications through a network load balancer (such as SLB).
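For example, connecting through ODP with a standard MySQL client typically uses a username of the form `user@tenant#cluster` and ODP's default listen port 2883. The host, user, tenant, and cluster names below are placeholders:

```shell
# Connect to tenant 'app_tenant' in cluster 'obcluster' via ODP.
# Host and names are illustrative; 2883 is ODP's default port.
mysql -h odp.example.com -P 2883 -u root@app_tenant#obcluster -p
```

From the application's point of view this looks like a single MySQL endpoint; ODP resolves which node holds the leader of each partition and routes the request there.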
The Shared-Storage (SS) mode is designed to provide more cost-effective database services in multi-cloud environments. OceanBase Database implements the SS mode on top of general-purpose object storage, offering cloud-native database services that reduce usage costs while improving performance and ease of use. For more information, see Architecture of a storage-compute separation system.
