From High Availability to Business Continuity: What Global Systems Actually Need

Tim Zhang
Published on March 26, 2026
10 minute read
Key Takeaways
  • Most teams have high availability (Level 1–2: node and AZ failures) but haven't tested business continuity (Level 3–4: region and cloud provider failures). The gap between "can fail over" and "will fail over cleanly" is where outages become incidents.
  • Four databases compared across four failure levels: Aurora and Spanner are locked to their respective clouds; CockroachDB supports cross-cloud for self-hosted but not managed clusters; OceanBase runs on seven clouds with an independent control plane for true Level 4 coverage.
  • Start with cross-region DR and real drills before jumping to multi-cloud. The most common failure pattern is skipping the fundamentals.