OceanBase Cloud supports four deployment modes: single-IDC deployment (2 nodes), single-IDC deployment (3 nodes), dual-IDC deployment, and multi-IDC deployment.
Concepts
- Full-featured replica: A full-featured replica, also known as a regular replica, is named FULL and abbreviated as F. It holds complete data and supports all features, including redo logs (RedoLog), MemTables, and SSTables. For data partitions, full-featured replicas take one of two roles: leader or follower. The leader provides write services and strong-consistency read services, and can also provide weak-consistency read services. A follower provides weak-consistency read services and can quickly take over as the leader when the current leader fails.
- Arbitration service: OceanBase Database supports the arbitration service, abbreviated as A. The arbitration service maintains arbitration members corresponding to tenant log streams. The arbitration members have the following characteristics. For more information, see Arbitration service overview.
- They participate only in elections, Paxos prepare, and member group change voting, and do not participate in Paxos accept (log majority voting).
- They do not store logs, have no MemTable or SSTable, and consume minimal resources (bandwidth, memory, disk, and CPU).
- They cannot be elected as the leader to provide services.
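The characteristics above amount to simple quorum arithmetic. The following is an illustrative sketch, not OceanBase source code (the `majority` helper and the replica counts are assumptions for a 2F1A member group); it shows why an arbitration member can help elect a leader but never counts toward the log majority:

```python
# Illustrative sketch (not OceanBase source code): quorum arithmetic for a
# 2F1A member group, showing why the arbitration member helps elect a leader
# but never counts toward the log (Paxos accept) majority.

def majority(members: int) -> int:
    """Smallest majority of a member group of the given size."""
    return members // 2 + 1

full_replicas = 2   # F: store logs, MemTables, and SSTables
arbiters = 1        # A: votes only in elections, Paxos prepare, member changes

# Elections and member group changes count all members, including the arbiter,
# so a leader can still be elected when one full-featured replica is down:
election_quorum = majority(full_replicas + arbiters)   # 2 of 3

# Log persistence (Paxos accept) counts only full-featured replicas,
# because the arbiter stores no logs:
log_quorum = majority(full_replicas)                   # 2 of 2
# (When an F replica fails, the member list can be degraded so that the
# surviving F replica keeps committing: majority(1) == 1.)

print(election_quorum, log_quorum)  # 2 2
```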
The deployment modes of OceanBase Cloud vary based on how full-featured replicas and the arbitration service are distributed across one or more zones.
Transactional instance
The Shared Storage Architecture for transactional instances is suitable for scenarios such as order archiving, cold backup of transaction data, auditing in the financial industry, and log archiving. These workloads are write-heavy and read-light, with massive amounts of data in a single table of a single instance, often reaching hundreds of terabytes or even petabytes. Writes are not sensitive to response time (RT), and write traffic is stable without sharp peaks. Data consistency is important, and read RT must be comparable to that of core transaction processing (TP) applications. The instance should support efficient transactional queries, which are mostly simple. When selecting the Shared Storage Architecture for a transactional instance, also consider the following characteristics:
Storage scalability: The storage system must be capable of handling increasing amounts of data without requiring changes to the storage architecture, and must provide both scalability for storing massive amounts of data and the ability to efficiently and continuously import data into online databases.
Storage costs: As the data volume grows, storage costs increase significantly. To reduce costs, more cost-effective storage media are needed to store the growing data.
Query performance: Although the access frequency of historical data is low, the query latency of historical data must be similar to that of online data in some scenarios.
Architecture description
Deployment modes: Dual-replica deployment (recommended) and single-replica deployment are supported.
Separated storage and compute resources: Compute and storage resources can be scaled independently for maximum flexibility, and the local cache can be adjusted independently to ensure performance. A full copy of the data is stored on storage resources that are redundantly deployed within the same city, ensuring data security and reliability. In this mode, two copies of the data are stored so that RPO = 0 and RTO < 8s.
Cost-effective: Data is stored on lower-cost storage media, only one full copy of the data needs to be stored, and you pay for actual usage instead of pre-allocating storage space. This architecture also provides the same high compression ratio as the Shared Nothing Architecture.
Independent logs: A Paxos-based log storage service is decoupled from the compute nodes.

Analytical instance
The Shared Storage Architecture for analytical instances is suitable for ad-hoc queries, BI reports, multi-dimensional analysis, real-time analysis, user profiling, metric computation, and real-time risk control. A single platform unifies online queries and offline computing and can process massive amounts of data, up to the petabyte level, with real-time computation and efficient query performance at the millisecond to second level. Through federated analysis, you can also analyze external data lakes (such as ODPS or Hive) and other data sources. You can scale out compute and storage resources as needed for more efficient resource utilization and lower costs. When you select the Shared Storage Architecture for an analytical instance, pay attention to the following considerations:
Complex architecture: Traditional offline data warehouses and online real-time analytics typically require maintaining multiple technology stacks, leading to high technical complexity and operational costs.
Storage costs: The emergence of data lakes and their integration with data warehouses requires scalable storage for massive amounts of data, which significantly increases storage costs and calls for more cost-effective storage media.
Query performance: Large data sets, especially multi-dimensional aggregations and complex calculations, place stringent requirements on query performance and response latency.
Architecture description
Deployment modes: Analytical instances are primarily deployed with a single replica. By leveraging the independent log service, cross-AZ disaster recovery can be achieved with RPO = 0 and sub-minute RTO, minimizing costs for your business.
High performance: The columnar storage engine supports vectorized execution for operators and expressions, significantly boosting the performance of the vectorized execution engine. Materialized views enhance query performance with flexible refresh strategies and real-time materialization. A cost-based optimizer allows for flexible query rewriting and dynamic adjustment of parallelism during plan generation.
Low cost: The system uses low-cost storage media and stores only a single full data copy. This allows flexible local cache management based on your specific requirements.
Extensibility: The system provides flexible APIs and the ability to integrate with external data sources, supporting external files (CSV/Parquet/ORC), external data sources (OSS/HDFS/S3…), and catalogs (ODPS/Hive).

Key-Value instance
The Shared Storage Architecture for Key-Value instances is suitable for storing structured, semi-structured, and unstructured data in scenarios such as IoT, connected vehicles, and time-series data. A single technology stack supports multiple data models and various open-source standards, and enables SQL queries, time-series processing, and retrieval and analysis, satisfying the storage and analysis requirements for massive structured and semi-structured data. For this architecture, you may have scalability and cost concerns related to storing massive data. You should also take architectural complexity into account: diverse business requirements keep producing new data types, but handling each type with a dedicated storage and analysis technology multiplies the technologies you must operate and their costs. As the business grows and data types continue to increase, this differentiated processing can lead to serious data storage fragmentation.
Architecture description
Deployment modes: Supports single-replica deployment and dual-replica deployment.
Multimodel integration: A single platform manages multiple data models. It supports HBase-compatible wide-table services and provides table-based access and storage for massive data. In addition to JSON, GIS, vector, and array data, it also provides native operation interfaces for object storage.
High availability: All OBKV nodes are functionally equal and provide high availability, eliminating the single point of failure posed by critical components such as ZooKeeper (ZK) in the traditional HBase architecture.
Multi-tenancy: The multi-tenant architecture of OceanBase Database supports both resource sharing and resource isolation.
Low cost: A high compression ratio, a virtually unlimited storage layer, and low costs enable you to store large volumes of structured, semi-structured, and unstructured data.

Transactional instance
Single-IDC deployment (2 nodes)
In the single-IDC deployment (2 nodes), two full-featured nodes are deployed in the same zone to eliminate latency caused by cross-zone or cross-IDC deployment. This mode is suitable for scenarios that require low latency and significant cost savings. It offers host-level disaster recovery capabilities and has the following advantages:
Multiple full-featured replicas provide read and write capabilities simultaneously, offering higher performance in load balancing.
Write requests in this mode do not require cross-IDC synchronization, only intra-IDC synchronization and access, resulting in lower latency.

Single-IDC deployment (3 nodes)
In the single-IDC deployment (3 nodes), three full-featured nodes are deployed in the same zone to eliminate latency caused by cross-zone or cross-IDC deployment. This mode is suitable for scenarios that require low latency and a high total compute power of a single cluster.
This mode offers host-level disaster recovery capabilities, and has the following advantages:
Multiple full-featured replicas provide read and write capabilities simultaneously, offering higher performance in load balancing.
Write requests in this mode do not require cross-IDC synchronization, only intra-IDC synchronization and access, resulting in lower latency.
Dual-IDC deployment
The dual-IDC deployment deploys two full-featured nodes in two zones and one arbitration node in a third zone. The arbitration node does not synchronize or replay logs, does not store redo logs or baseline data, and does not provide read or write services. This mode is supported in OceanBase Database V4.1.0.0 and later.

Multi-IDC deployment
The multi-IDC deployment deploys three nodes in three different zones to achieve cross-zone disaster recovery. Each node is a full-featured replica. One node serves as the leader to provide read and write services, and the other two serve as followers. When the leader fails, one of the followers is elected as the new leader and continues to provide read and write services.
We recommend the multi-IDC deployment for customers with higher performance and cross-IDC availability requirements.

Differences between deployment modes
| Deployment mode | Single-IDC (2 nodes) | Single-IDC (3 nodes) | Dual-IDC | Multi-IDC |
|---|---|---|---|---|
| Number of nodes | 3 | 3 | 3 | 3 |
| Number of full-featured replicas | 2 | 3 | 2 | 3 |
| Arbitration node | 1 | 0 | 1 | 0 |
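As a back-of-the-envelope reading of the table above, each mode's tolerance for single-node failures follows from simple majority arithmetic. This sketch is illustrative only: it reduces availability to "a majority of voting members survives" and takes the member counts from the table:

```python
# Illustrative sketch: failures tolerated by each transactional deployment
# mode, modeling availability as "a majority of voting members survives".
# F = full-featured replicas, A = arbitration members (election votes only).

def tolerated_failures(f: int, a: int) -> int:
    members = f + a                      # all voting members
    return members - (members // 2 + 1)  # members beyond the smallest majority

modes = {
    "Single-IDC (2 nodes)": (2, 1),  # 2F + 1A
    "Single-IDC (3 nodes)": (3, 0),  # 3F
    "Dual-IDC":             (2, 1),  # 2F + 1A (arbiter in a third zone)
    "Multi-IDC":            (3, 0),  # 3F across three zones
}

for mode, (f, a) in modes.items():
    print(f"{mode}: tolerates {tolerated_failures(f, a)} node failure")
```

Under this simplified model, every mode in the table tolerates the loss of one node while keeping a voting majority.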
Analytical instance
Single-IDC deployment (1 node)
The single-IDC deployment (1 node) of OceanBase Cloud is a lightweight deployment mode with only one full-featured replica. It provides cross-zone disaster recovery capability (you can specify the disaster recovery zone in the instance console). This mode is suitable for testing and learning, cost optimization for non-core businesses, and scenarios with low requirements for high availability and business continuity.
This mode has the following characteristics:
Only one data replica is maintained. Data is stored in a single replica without data synchronization or replication mechanisms. Compared with the multi-replica strategy (which typically stores three copies of data to ensure high availability), the single-replica strategy saves storage space and reduces replication overhead.
Low resource consumption. With a single replica, there is no need to maintain data consistency across multiple replicas, which reduces the consumption of compute resources and network traffic. This mode is more suitable for scenarios with a limited budget or low disaster recovery requirements.
Higher performance. In a single-replica deployment, there is no data replication or consistency verification overhead. All requests are processed on a single replica, resulting in lower latency and higher throughput.

Single-IDC deployment (2 nodes)
In the single-IDC deployment (2 nodes), two full-featured nodes are deployed in the same zone. This eliminates latency caused by cross-zone or cross-IDC deployment. This mode is suitable for scenarios that require low latency and significant cost savings. It provides host-level disaster recovery capability and has the following advantages:
Multiple full-featured replicas provide read and write capabilities, offering higher performance in load balancing.
Write requests in this mode do not require cross-IDC synchronization. They are synchronized and accessed within the same IDC, resulting in lower latency.

Dual-IDC deployment
In the dual-IDC deployment of OceanBase Cloud, two full-featured nodes are deployed in two zones, and one arbitration node is deployed in a third zone. The arbitration node does not synchronize or replay logs, does not store redo logs or baseline data, and does not provide read or write services to external applications. Dual-IDC deployment is supported in OceanBase Cloud V4.1.0.0 and later.

Differences between deployment modes
| Deployment mode | Single-IDC (1 node) | Single-IDC (2 nodes) | Dual-IDC |
|---|---|---|---|
| Number of nodes | 1 | 3 | 3 |
| Number of full-featured replicas | 1 | 2 | 2 |
| Arbitration node | 0 | 1 | 1 |
Note
In the 2F1A (2 full-featured replicas and 1 arbitration service) strategy, the arbitration service node is invisible to users. Therefore, you purchase three nodes, but only two are visible in the system.
Key-Value instance
Single-IDC deployment (2 nodes)
The single-IDC deployment of OceanBase Cloud deploys two full-featured nodes in the same zone to eliminate latency caused by cross-zone or cross-IDC deployment. This mode is suitable for scenarios that require low latency and significant cost savings. The single-IDC deployment offers the following advantages:
Multiple full-featured replicas provide read and write capabilities, delivering higher performance in load balancing.
Write requests in the single-IDC deployment do not require cross-IDC synchronization, only intra-IDC data synchronization and access, resulting in lower latency.

Dual-IDC deployment
The dual-IDC deployment of OceanBase Cloud deploys two full-featured nodes in two zones and deploys an arbitration node in a third zone. The arbitration node does not store redo logs or baseline data, does not replay logs, and does not provide read or write services to external applications. This deployment mode is supported in OceanBase Cloud V4.1.0.0 and later.

Differences between deployment modes
| Deployment mode | Single-IDC (2 nodes) | Dual-IDC |
|---|---|---|
| Number of nodes | 3 | 3 |
| Number of full-featured replicas | 2 | 2 |
| Arbitration node | 1 | 1 |
Note
In the 2F1A (2 full-featured replicas and 1 arbitration service) strategy, the arbitration service node is invisible to users. Therefore, you purchase three nodes, but only two are visible in the system.