Why Critical Workloads Need a New Architecture

Ray Yu
Ray Yu
Published on May 21, 2026
8 minute read
Key Takeaways
  • Centralized databases were designed for a world where traffic was predictable and hardware scaled vertically. Neither assumption holds for modern critical workloads.
  • Manual sharding and middleware-based "distributed" approaches relieve the symptoms but fragment ACID guarantees, restrict query patterns, and push complexity into the application layer.
  • Native distributed SQL solves the underlying problem: horizontal scalability, strong consistency, and operational resilience — without forcing applications to be rewritten.

Databases That Outgrew Their Architecture

Every relational database in widespread production use today — Oracle, DB2, SQL Server, PostgreSQL, MySQL — was conceived between the 1970s and the 1990s. The design assumptions were reasonable for that era: traffic was predictable, hardware scaled vertically, and a single well-provisioned server could be sized for the worst case the business plausibly faced. Concurrency was bounded by the number of physical operators behind a counter, an ATM, or a back-office terminal.

The internet broke every one of those assumptions. When every customer with a phone is a potential operator, traffic can rise by two or three orders of magnitude in seconds and collapse just as quickly. E-commerce flash sales, payment peaks, multi-tenant SaaS bursts, real-time risk control — these workloads share a common profile: high concurrency, strong consistency, continuous availability, and unpredictable scale. They define what "mission-critical" means today.

A workload like that places three non-negotiable demands on the database underneath it:

  • Data must not be wrong. Transaction records are the source of truth for the business. Anything weaker than ACID is unacceptable for the system of record.
  • Service must not stop. A maintenance window is not just an inconvenience — for global, always-on services it is a direct revenue and trust impact.
  • Concurrency must scale. The number of simultaneous operations is no longer a planning input; it is an emergent property of user behavior.

The architectural question is whether the database underneath was ever designed for this.


When Vertical Scaling Hits the Wall

For decades, the answer to "we need more capacity" was "buy a bigger box." That answer has structural problems that have only become more visible with time.

First, vertical scaling is bounded by physics. Moore's Law no longer delivers reliable single-thread performance gains, and high-end servers and shared storage arrays sit on a steeply non-linear price curve — doubling capacity often costs five to ten times more, not twice as much. Above a certain point, no amount of money buys a faster single machine.

Second, vertical scaling is operationally fragile. The reliability of the entire database depends on the reliability of one server, one storage array, one power domain. High availability is bolted on through primary-standby replication, where asynchronous modes risk data loss and synchronous modes typically cut throughput by 30% or more. Failover is rarely automatic, and recovery times are measured in minutes — sometimes longer.

Third, the cost curve diverges from the data curve. Workloads at 10 TB are still comfortable on a centralized stack. At 100 TB, the high-end storage and licensing costs grow faster than the data. At 1 PB and beyond, the centralized option simply runs out — and teams are forced into a re-architecture under deadline pressure, which is the worst possible time to do it.

The pattern is clear enough: centralized architecture is not failing because it is poorly engineered. It is failing because the workloads it now has to serve are categorically different from the ones it was designed for.


Three Paths, Three Trade-offs

The industry has produced three broad responses to this problem. They are worth comparing directly, because architects evaluating database platforms today are usually choosing between them, even when the choice isn't framed that way.

ApproachHow it worksWhere it breaks
Classic single-node (MySQL, PostgreSQL, Oracle)One database process, vertical scaling, sharding handled outside the databaseNo elastic scale-out; HA bolted on; one server is one blast radius
Middleware-based sharding (MyCAT, ShardingSphere, application-level shards)Multiple independent databases behind a routing layerNo global ACID; cross-shard queries and transactions degrade or fail; application carries the complexity
Native distributed SQL (OceanBase, Spanner, CockroachDB)Distribution is part of the database engine itselfHigher engineering complexity inside the database; ecosystem still maturing relative to monoliths

Why sharding isn't a distributed database

Sharding solves capacity. It does not automatically preserve global correctness invariants. Once you split a logical database into multiple physical databases, several invariants quietly disappear: there is no longer a single transaction manager, so cross-shard ACID in middleware-based sharding systems is typically implemented via XA or an external transaction coordinator. While technically feasible, the added latency, prolonged lock holding, and operational complexity around failure recovery often make it unsuitable for high-throughput OLTP at scale. As a result, many large-scale deployments either constrain transactions to a single shard or adopt eventual consistency patterns in practice; there is no global secondary index, so any query that doesn't include the shard key becomes a fan-out; uniqueness across the dataset has to be reinvented at the application layer.

The deeper issue is that sharding picks a single dimension — typically user ID or account ID — and optimizes for it. Any query along a different dimension (a seller's view of orders, a regional aggregate, a fraud pattern across customers) becomes structurally hard. The database is no longer one database; it is a collection of databases held together by application code.

Why middleware is a transitional architecture

Middleware-based "distributed" systems push the routing logic into a layer between the application and the underlying single-node databases. This is a meaningful improvement over pure application-level sharding, and it has carried many internet workloads through their growth phase. But the limits are the same in kind: queries that don't include the shard key are restricted, secondary indexes that aren't aligned with the shard key are difficult to maintain, and cross-shard transactions are typically delegated to compensating logic or eventual consistency. Industry analyses, including white papers from telecom and financial standards bodies, have converged on the same conclusion: middleware-based distribution is a transitional architecture, not a destination.

Why some "distributed SQL" still pays a tax

Even systems that are distributed at the database layer don't all behave the same way. A common architectural pattern separates the SQL layer from the storage layer entirely: the SQL layer parses, plans, and coordinates; the storage layer handles distribution and replication. This is conceptually clean, but it imposes a cost on every operation. Even a single-row read against data that happens to live on the same physical machine has to travel through the storage abstraction, which means a network hop. Public sysbench results for the better-known examples in this category show single-node throughput at roughly one-fifth to one-tenth of MySQL.

For OLAP, that overhead is irrelevant — analytical queries are long enough that one extra hop disappears in the noise. For OLTP, where workloads are dominated by short, latency-sensitive operations, it shows up directly in p99.


Native Distributed: Solving the Root Cause

A native distributed SQL database is one where distribution is implemented inside the database engine — not above it as middleware, not below it as a remote storage tier. Done well, this approach earns three properties at once: horizontal scalability, full SQL with global ACID, and high availability that doesn't depend on external orchestration. Done poorly, it inherits the "distributed tax" described above.

The architectural question is therefore not "should the database be distributed?" but "how is distribution implemented, and what does it cost on the operations that matter most?"

The 80/20 design assumption

Production OLTP traffic, examined carefully across diverse industries, shows a consistent shape: more than 80% of operations touch a single partition, and fewer than 20% genuinely cross node boundaries. Most online transactions partition naturally by user, account, or tenant. A balance check, an order lookup, a profile update — these are local. Cross-partition operations exist (transfers, reconciliations, aggregates) but they are the minority.

This observation reframes the optimization target. Instead of paying coordination overhead on every operation, a native distributed engine can be designed so that the dominant 80% runs with no distributed overhead at all — competing directly with a single-node database on the same hardware — and only the 20% pays the cost of cross-node coordination. This is the assumption underlying OceanBase's unified standalone-distributed architecture.

What that looks like in practice

The assumption only pays off if four implementation choices line up. The architecture below is how OceanBase realizes them:

  • A dynamic log stream, not one log per partition. Many distributed databases run an independent consensus log per data partition. With thousands of partitions on a node, that becomes thousands of independent log streams, each with its own leader election, heartbeat, and replication overhead. OceanBase uses a single dynamic log stream per tenant per node, with partitions bound to it at runtime. Per-partition coordination overhead disappears, and adding a node becomes a background partition rebind rather than an application-visible reshard.
  • Paxos with an asynchronous commit boundary. RPO=0 requires synchronous replication through a consensus protocol. The naïve implementation blocks worker threads on consensus, which is where the often-quoted 30% performance penalty comes from. OceanBase's Paxos implementation moves log submission to an asynchronous boundary inside the engine — workers don't block on the network round-trip, but consensus still completes before commit acknowledgement. The cost drops to single-digit percentages while preserving zero data loss, and combined with automated election protocols delivers an RTO under 8 seconds in production failover scenarios.
  • An LSM-tree storage engine with B+ tree-style block discipline. LSM-tree engines compress well but suffer from write amplification; B+ tree engines have predictable point-read latency but compress poorly. OceanBase's storage engine applies block-level encoding and a column-aware compression pipeline inside an LSM-tree structure, delivering compression ratios in the 70–90% range against legacy InnoDB sizes while keeping hot-path reads in memory. The same engine supports row, column, and hybrid layouts at the table level — making HTAP a storage-layer property, not an architectural detour.
  • A unified deployment model from one node to thousands. When the same engine runs on one node and on hundreds, the migration from "small" to "large" stops being an architectural event. OceanBase's single-node deployment runs without coordination overhead and shares the same SQL surface, the same drivers, and the same operational tooling as a multi-node cluster. Scaling up means adding a node and letting partitions rebalance in the background. Applications don't change. SDKs don't change. The path from prototype to production no longer crosses a database boundary.

The combined result is an engine where single-node operations approach native single-node performance, cross-node operations invoke two-phase commit only when a transaction actually spans nodes, and horizontal scaling has been demonstrated end-to-end: OceanBase's published TPC-C result reached 707 million tpmC across more than 1,500 nodes — with throughput scaling near-linearly with cluster size and a workload that includes 10–15% distributed transactions, close to real production ratios.


What Changes for Production Stacks

Three things change for teams running mission-critical workloads on this architecture.

The "grow or re-architect" dilemma goes away. Teams that start with a single-node deployment can scale to dozens or hundreds of nodes without rewriting applications, switching SDKs, or re-modeling data. Sharding is no longer a project; it is a configuration change.

Operational complexity moves out of the application. Application code stops carrying shard keys, routing tables, fan-out logic, and hand-written compensating transactions. Cross-shard joins become regular SQL. Global secondary indexes work the way developers expect. The database absorbs the complexity that middleware previously externalized.

Transactional and analytical workloads can converge. Once the engine handles distribution natively, HTAP becomes a configuration choice rather than a separate system. OceanBase, for example, exposes row, column, and hybrid storage at the table level, isolates tenants by resource group inside a shared cluster, and serves analytical queries against fresh transactional data — replacing what used to be ETL pipelines into a separate warehouse linked by best-effort sync.

For DBAs, the day-to-day implications are concrete: fewer instances to manage, automated failover with sub-8-second RTO, online scaling without maintenance windows, multi-tenant isolation that prevents noisy-neighbor incidents, and a single set of backup, monitoring, and security tooling across the whole platform.


What Native Distributed SQL Doesn't Solve

A balanced post requires honesty about the boundaries of this approach.

A distributed system has more moving parts than a single-node system. Even when the database engine absorbs most of that complexity, operating a multi-node cluster still demands network awareness, capacity planning across replicas, and a more sophisticated failure model than a primary-standby pair.

If a workload has no natural partition key — every transaction crosses every node — the scale-out advantage is reduced. The single-node deployment of a unified architecture still removes distributed overhead entirely, and the multi-node deployment is no worse than any other shared-nothing system, but the linear scaling story doesn't apply at the same level.

Pure analytical workloads with extreme scan or join requirements can still benefit from a dedicated columnar engine. A native distributed SQL database with HTAP capabilities covers a wide range of mixed workloads, but it is not a replacement for a specialized data warehouse at the largest scales.

And the surrounding ecosystem — drivers, ORMs, observability tools, talent — has decades less accumulated maturity than the monoliths it is starting to replace. That gap is closing quickly, but it is real today.


What Comes Next

The rest of this series moves from architectural context to engineering substance. We'll look at how high availability is actually delivered in production failure scenarios, how the transaction engine maintains correctness under high concurrency, how concurrency control trades pessimistic and optimistic execution, how global consistency is enforced across nodes, how partitioning and replica placement are decided, why Paxos is the consensus choice for critical workloads, how multi-tenancy holds up at SaaS scale, and how operational resilience — scaling, upgrades, recovery — is maintained without business disruption.

Each post stands on its own. Together they describe what it takes to run a mission-critical workload on an architecture that was designed for one — rather than retrofitted from one that wasn't — and how OceanBase's engine has approached each of those problems in production.


Further Reading

Share
X
linkedin
mail