OceanBase Database in shared storage mode (OceanBase SS) relies on Log Service for data persistence and on multiple compute nodes for high availability. This topic describes the disaster recovery deployment options supported by OceanBase SS.
OceanBase SS ensures data integrity through object storage and log integrity through Log Service, providing the following high availability solutions:
- High availability solution based on the Paxos consensus protocol for LogService: This solution provides multi-replica high availability and disaster recovery capabilities based on the Paxos consensus protocol. Read-write nodes (RW) write data directly to multiple replicas of LogService through the Paxos protocol to achieve persistence. Read-only nodes (RO) read logs from the most recent LogService replica and replay hot data locally. When a minority of replicas are unavailable (at most one replica in a three-replica cluster), read and write operations on the compute nodes are unaffected, and services continue to operate normally.
- High availability solution for read-write nodes (RW) with multiple compute nodes: This solution leverages multiple read-only nodes to provide disaster recovery capabilities. When a read-write node becomes unavailable, the database automatically performs a disaster recovery switch and restores services, ensuring no data loss (RPO = 0) and a fault recovery time of less than 8 seconds (RTO < 8s).
- High availability solution based on single-replica startup: This solution provides disaster recovery capabilities even without compute nodes. In cases where no read-only nodes are available for disaster recovery, since OceanBase SS stores data completely in object storage and ensures the integrity of incremental data through Log Service, a new compute node can be started up to restore full service capabilities. Specifically, data and metadata are restored from object storage to the local node, and incremental data is restored from Log Service. This ensures no data loss (RPO = 0) and a fault recovery time of minutes (RTO at the minute level).

OceanBase SS recommends the following deployment modes, which you can choose based on your IDC configuration and performance and availability requirements.

| Deployment mode | Disaster recovery capability | RTO | RPO |
|---------|---------|-----|-----|
| Single IDC (multi-replica compute nodes) | Machine-level zero-loss disaster recovery | Within 8 seconds | 0 |
| Single IDC (single-replica compute node) | Single-replica startup disaster recovery | At the minute level | 0 |
| Multi-IDC (multi-replica compute nodes) | IDC-level zero-loss disaster recovery | Within 8 seconds | 0 |
| Three IDCs (three-replica LogService) | LogService-level zero-loss disaster recovery | Within 8 seconds | 0 |
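
The deployment modes listed above can also be written down as data, which makes it easy to filter them by recovery target. The following is a minimal sketch only: the `DeploymentMode` class and the `modes_with_fast_failover` helper are hypothetical names introduced for illustration and are not part of any OceanBase tool or API. The RTO and RPO values are copied from the table as stated, not measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentMode:
    """One row of the deployment mode table (illustrative structure only)."""
    name: str
    capability: str
    rto: str  # RTO as worded in the table, not a measured value
    rpo: int


# Rows copied from the table of recommended deployment modes.
MODES = [
    DeploymentMode("Single IDC (multi-replica compute nodes)",
                   "Machine-level zero-loss disaster recovery",
                   "Within 8 seconds", 0),
    DeploymentMode("Single IDC (single-replica compute node)",
                   "Single-replica startup disaster recovery",
                   "At the minute level", 0),
    DeploymentMode("Multi-IDC (multi-replica compute nodes)",
                   "IDC-level zero-loss disaster recovery",
                   "Within 8 seconds", 0),
    DeploymentMode("Three IDCs (three-replica LogService)",
                   "LogService-level zero-loss disaster recovery",
                   "Within 8 seconds", 0),
]


def modes_with_fast_failover(modes=MODES):
    """Return the modes whose stated RTO is 'Within 8 seconds'."""
    return [m for m in modes if m.rto == "Within 8 seconds"]


for mode in modes_with_fast_failover():
    print(mode.name)
```

Every mode in the table states RPO = 0, so the practical choice is between second-level and minute-level recovery time, and between machine-level and IDC-level fault tolerance.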
Single IDC deployment (multi-replica compute nodes)
When only one IDC is available, you can deploy read-write nodes (RW) and multiple read-only nodes (RO) within the same IDC to achieve machine-level zero-loss disaster recovery. When the server hosting the RW node fails, the system automatically performs a disaster recovery switch to promote an RO node to an RW node to continue providing services, ensuring no data loss (RPO = 0) and a fault recovery time of less than 8 seconds (RTO < 8s).
Single IDC deployment (single-replica compute node)
When only one IDC is available and only a single compute node is deployed, you can use the single-replica startup mechanism to provide disaster recovery capabilities. When the compute node becomes unavailable, a new compute node can be started up to restore full service capabilities by restoring data and metadata from object storage and incremental data from Log Service. This ensures no data loss (RPO = 0) and a fault recovery time of minutes (RTO at the minute level).
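
The recovery order described above (baseline from object storage first, then incremental data from Log Service) can be illustrated with a toy model. This is a sketch under stated assumptions, not OceanBase's actual recovery code: the `recover` function, the tuple layout of the snapshot and log entries, and the idea of a checkpoint log sequence number (LSN) that marks where the baseline ends are all hypothetical and chosen only to show why no committed write is lost.

```python
def recover(snapshot, log):
    """Rebuild state on a fresh compute node (toy model).

    snapshot: (checkpoint_lsn, data) standing in for the baseline kept in
              object storage; checkpoint_lsn is an assumed marker.
    log:      list of (lsn, key, value) entries, ordered by lsn, standing in
              for the incremental data kept by Log Service.
    """
    checkpoint_lsn, data = snapshot
    state = dict(data)  # step 1: restore the baseline data
    for lsn, key, value in log:
        if lsn > checkpoint_lsn:  # step 2: replay only what the baseline lacks
            state[key] = value
    return state


# Baseline covers log entries up to LSN 2; entries 3 and 4 exist only in the log.
snapshot = (2, {"a": 1, "b": 2})
log = [(1, "a", 1), (2, "b", 2), (3, "a", 10), (4, "c", 3)]

restored = recover(snapshot, log)
print(restored)  # {'a': 10, 'b': 2, 'c': 3}
```

Because every write is in either the baseline or the log, the restored state matches the state before the failure, which is what RPO = 0 means here. The minute-level RTO reflects the time needed to load the baseline and replay the log on a new node.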
Multi-IDC deployment (multi-replica compute nodes)
When multiple IDCs are available in the same city, you can deploy compute nodes (RW and RO) across IDCs to achieve IDC-level zero-loss disaster recovery. When the IDC hosting the RW node becomes unavailable, the system automatically performs a disaster recovery switch to an RO node in another IDC to continue providing services, ensuring no data loss (RPO = 0) and a fault recovery time of less than 8 seconds (RTO < 8s).
Three IDCs deployment (three-replica LogService)
Log Service supports three-replica deployment across three IDCs to ensure IDC-level disaster recovery capabilities. The three replicas of Log Service are deployed in three different IDCs. When any one of the IDCs becomes unavailable, Log Service continues to provide services, ensuring the persistence and integrity of logs.
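
The reason a three-replica Log Service survives the loss of one IDC follows from majority-quorum arithmetic, which Paxos-style protocols rely on. The sketch below is illustrative only: the function names are hypothetical, and it assumes one Log Service replica per IDC, as described above.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Number of replicas that can fail while a majority remains."""
    return replicas - majority(replicas)


def log_service_available(replicas: int, failed: int) -> bool:
    """True if the surviving replicas can still form a majority."""
    return replicas - failed >= majority(replicas)


print(tolerated_failures(3))        # 1
print(log_service_available(3, 1))  # True: two of three replicas remain
print(log_service_available(3, 2))  # False: one replica is not a majority
```

With three replicas in three IDCs, losing any single IDC leaves two replicas, which is still a majority, so log writes can continue. Losing two IDCs at once leaves a minority, and writes cannot be confirmed until a majority is restored.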
Note
- Compute nodes (RW and RO) and Log Service are currently only supported for deployment within the same IDC.
- Compute nodes support deployment within a single IDC or across multiple IDCs, while Log Service is only supported for three-replica deployment across three IDCs.
