Your AI Is Only as Good as Your OLTP Layer


A recommendation engine pushes a product the user just bought. A fraud model clears a flagged account because it's still reading yesterday's snapshot. A customer service bot answers a policy question with outdated information.

These aren't model failures. They're data failures. Most AI applications today still consume historical data — retrieval and analytics over cold datasets. But production AI needs hot data: real-time state, fresh transactions, live inventory. The system that supplies hot data reliably is a strong-consistency OLTP database.

Most conversations about databases and AI jump straight to vector search and embeddings. Those matter — but they're the retrieval layer, not the foundation. The foundation is transactional data: orders, balances, inventory, permissions, session state. Get that layer wrong, and no amount of semantic search will save the output.

The Data Contract for Real-Time AI

In production AI systems — fraud detection, intelligent routing, real-time recommendations, enterprise agents — the data requirements boil down to three properties:

Freshness. A five-minute-old account balance is five minutes too old for a fraud decision. AI applications consuming transactional data need current-read guarantees, not eventual consistency with an asterisk.

Consistency. AI pipelines are long. A single user action might trigger a retrieval, an inference, and a write-back — potentially with retries across network boundaries. If the database can't guarantee that a retried transaction won't execute twice, the entire pipeline produces garbage.
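The retry hazard above can be made concrete with an idempotency key: give each write-back a unique request ID, and let the database's uniqueness constraint turn a duplicate retry into a no-op. A minimal sketch, using sqlite3 as a stand-in for any ACID store (the `decisions` table and column names are illustrative, not from OceanBase):

```python
import sqlite3

# Stand-in ACID store; the idempotency-key pattern applies to any
# transactional database. Table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE decisions (
        request_id TEXT PRIMARY KEY,  -- idempotency key
        account    TEXT NOT NULL,
        score      REAL NOT NULL
    )
""")

def write_back(request_id: str, account: str, score: float) -> bool:
    """Write an AI decision exactly once; a retried call with the same
    request_id becomes a no-op instead of a duplicate row."""
    try:
        with conn:  # transaction: commit on success, roll back on error
            conn.execute(
                "INSERT INTO decisions (request_id, account, score)"
                " VALUES (?, ?, ?)",
                (request_id, account, score),
            )
        return True   # first attempt landed
    except sqlite3.IntegrityError:
        return False  # retry detected: already applied

write_back("req-42", "acct-7", 0.93)   # original attempt
write_back("req-42", "acct-7", 0.93)   # network retry: safely ignored
rows = conn.execute("SELECT COUNT(*) FROM decisions").fetchone()[0]
print(rows)  # 1 -- the pipeline saw exactly one write
```

Without the constraint, the retry would double-count the decision; with it, the pipeline can retry freely across network boundaries.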

Predictability. If latency spikes during peak traffic, the AI feature built on top of it becomes unreliable. Stable p99 latency isn't a nice-to-have; it's a prerequisite for any AI feature that touches production traffic.

These three properties — freshness, consistency, predictability — are precisely what a well-built OLTP database has spent decades getting right. Transactions, write-ahead logging, replication, failover. The fundamentals.

What's New: Strengthening the Core, Adding Retrieval

OceanBase 4.4.2 LTS is the latest Long-Term Support release. It strengthens the transactional core and adds vector search and full-text search — so that the "retrieve → reason → write-back" loop can happen inside a single database.

At a high level, it delivers:

  • Hardened HA: tenant cloning, optimized standby reads (RPO=0, RTO <1 min), and heterogeneous Zone architecture for true fault isolation
  • Column-level encryption and audit: transparent data protection without application-layer changes
  • Expanded MySQL/Oracle compatibility: session-level temporary tables, INTERVAL partitioning, PL execution path optimizations
  • Vector search and full-text search: native support for RAG-style retrieval alongside transactional data
  • Performance gains: 14x improvement in replica table follower reads, 60% faster DDL for batch table creation

Build Real-Time AI on Battle-Tested OLTP

The enhancements listed above — HA, security, compatibility, performance — aren't just incremental improvements. They directly underpin what AI applications demand from the data layer. Freshness comes from Paxos-based multi-replica synchronization and optimized standby reads: writes are durable and readable immediately, no async replication lag. Consistency comes from native distributed transactions with strict ACID semantics, plus column-level encryption for sensitive data in AI pipelines. Predictability comes from the LSM-Tree engine's compaction scheduling and tenant-level resource isolation, with heterogeneous Zones preventing single-node failures from cascading into latency spikes.

These aren't properties bolted on for AI. They're the result of 15 years of engineering under production pressure. OceanBase started in 2010 as Alipay's transaction database and has since expanded into banking cores, insurance policy systems, telecom billing, and government service platforms — systems where a single failed transaction is a compliance incident. That track record is what gives the three properties teeth: they've been tested at scale, under real load, in environments where downtime makes headlines.

Here's what the latest release adds to that foundation.

High Availability

Three key improvements in this release:

Tenant cloning. Clone a running tenant with matching unit distribution — useful for spinning up test environments or running disaster recovery drills without touching production.

Optimized standby reads. The leader now collects and caches participant state centrally, reducing message round-trips. Automatic retry logic forwards requests when a replica lags. If the primary fails, the standby takes over within one minute — RPO=0, RTO under 60 seconds.
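The lag-handling behavior can be sketched generically: a read that must observe some minimum log timestamp is forwarded to a fresher replica instead of being served stale. This is an illustrative model, not OceanBase internals; the replica objects and hop budget are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    name: str
    applied_ts: int   # last log timestamp this replica has applied

def read_with_forward(route, min_ts, max_hops=3):
    """Serve a read that must observe state at least as fresh as min_ts.
    If the current replica lags, forward to the next one rather than
    returning stale data -- the retry/forward behavior described above."""
    for hops, replica in enumerate(route, start=1):
        if replica.applied_ts >= min_ts:   # fresh enough: serve here
            return replica.name, hops
        if hops >= max_hops:               # stay within a hop budget
            break
    raise TimeoutError("no replica caught up within the hop budget")

route = [Replica("local", 90), Replica("zone2", 100), Replica("zone3", 120)]
served_by, hops = read_with_forward(route, min_ts=110)
print(served_by, hops)  # zone3 3
```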

Heterogeneous Zones. Tenants can now configure two different unit counts across Zones. A single-node failure can be handled by removing the node directly, with no rebalancing impact on other Zones. Fault isolation that actually isolates.

Security

This release introduces column-level data protection rules. SELECT queries from unauthorized users return ciphertext; DML operations are blocked entirely. Combined with tenant-level transparent encryption, access control, and audit logging, this meets financial-grade security requirements without requiring application changes.
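The access semantics described here (ciphertext on unauthorized SELECT, DML blocked outright) can be sketched as a toy rule engine. The column names, user roles, and the hash standing in for real encryption are all illustrative assumptions, not OceanBase's implementation:

```python
import hashlib

PROTECTED = {"ssn"}           # columns under a protection rule
AUTHORIZED = {"risk_admin"}   # users allowed plaintext access

def mask(value: str) -> str:
    # Illustrative stand-in for real encryption: return ciphertext.
    return hashlib.sha256(value.encode()).hexdigest()[:16]

def select(user, row):
    """Unauthorized users receive ciphertext for protected columns."""
    return {
        col: (val if user in AUTHORIZED or col not in PROTECTED
              else mask(val))
        for col, val in row.items()
    }

def update(user, col):
    """DML on a protected column is blocked for unauthorized users."""
    if col in PROTECTED and user not in AUTHORIZED:
        raise PermissionError(f"DML on protected column '{col}' denied")
    return "ok"

row = {"name": "Ada", "ssn": "123-45-6789"}
print(select("analyst", row)["ssn"] == row["ssn"])      # False: ciphertext
print(select("risk_admin", row)["ssn"] == row["ssn"])   # True: plaintext
```

The point of the pattern is that the rule lives in the database, so every client gets the same enforcement with no application changes.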

Compatibility

Migration is where good intentions meet painful reality. Years of accumulated stored procedures, complex PL/SQL logic, and thousands of lines of business code — changing one line can break ten others.

OceanBase supports both MySQL and Oracle modes. The latest release adds several features that matter for real-world migrations: MySQL-mode session-level temporary tables (with distributed consistency guarantees), INTERVAL partitioning (auto-creates new partitions as data arrives — no DBA intervention), cursor reads within uncommitted transactions, and substantial PL execution optimizations including expanded FORALL batch support.
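What INTERVAL partitioning automates can be shown with a conceptual sketch: the partition for a new row is derived from its range key and created on first use, so only months that actually receive data exist. The routing logic below is illustrative Python, not OceanBase code.

```python
from datetime import date

partitions = {}   # partition name -> rows (toy storage)

def month_partition(d: date) -> str:
    # One-month interval: derive the partition name from the range key.
    return f"p{d.year}{d.month:02d}"

def insert(row_date: date, row) -> str:
    """Route a row to its monthly partition, creating the partition on
    first use -- the behavior INTERVAL partitioning gives declaratively,
    with no DBA intervention."""
    name = month_partition(row_date)
    partitions.setdefault(name, []).append(row)   # auto-create on demand
    return name

insert(date(2025, 1, 15), "order-1")
insert(date(2025, 1, 20), "order-2")
insert(date(2025, 3, 2), "order-3")   # February untouched: no empty partition
print(sorted(partitions))  # ['p202501', 'p202503']
```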

Performance and Diagnostics

Key numbers: serial table creation is 60% faster (verified with 1,000 sysbench tables). Follower read throughput on replica tables jumped from 20K QPS to 270K+ — a 14x improvement, approaching leader-level performance. UDF, trigger, and procedure execution paths have been optimized across the board.

On the diagnostics side: new real-time statistics for storage-layer query access patterns, pushdown path latency, and filter efficiency. SQLSTAT has been decoupled from library cache to eliminate resource contention with plan cache. ASH data now uses a weighting mechanism to preserve diagnostic integrity during queue backlog scenarios.

MySQL/Oracle dual-mode compatibility means existing production systems can migrate and gain all three properties without rewriting application code.

With the OLTP foundation in place, this release adds the retrieval capabilities that AI applications need to complete the data loop:

Transactional data as the source of truth. Orders, accounts, inventory, permissions — served with ACID guarantees and low-latency access through existing indexing, caching, read-write separation, and partition management.

Vector and full-text search (new). Native vector retrieval and full-text search, interoperable with JSON and multi-modal indexes. For RAG workloads, the retrieve → reason → write-back cycle completes within a single system — no external search infrastructure required.

Change data capture. OBCDC streams incremental changes to Kafka, Flink, data lakes, or feature stores in real time, keeping downstream models and feature pipelines current.

Write-back and audit. AI-generated scores, decisions, and actions write back to the same database with unified access control, encryption, and audit trails. Every decision is traceable.

One database. Transactions, retrieval, inference support, and audit — in a single operational footprint.
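The single-system loop described above can be sketched end to end: retrieve by vector similarity, reason over the retrieved context, and write the decision back with provenance, all against one store. Everything here is a toy stand-in under stated assumptions — the 2-d "embeddings", document corpus, and the reasoning stub are illustrative, not OceanBase APIs:

```python
import math

# Toy single-system loop: retrieval, reasoning, and write-back share
# one store, standing in for a database with native vector search.
docs = {
    "d1": ([1.0, 0.0], "refund policy: 30 days"),
    "d2": ([0.0, 1.0], "shipping policy: 5 days"),
}
audit_log = []   # write-back target: a trace of every decision

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def rag_step(query_vec, question):
    # 1. retrieve: nearest document by cosine similarity
    doc_id, (_, text) = max(
        docs.items(), key=lambda kv: cosine(query_vec, kv[1][0])
    )
    # 2. reason: stand-in for model inference over retrieved context
    answer = f"Based on {doc_id}: {text}"
    # 3. write back: the decision plus its provenance, same store
    audit_log.append(
        {"question": question, "source": doc_id, "answer": answer}
    )
    return answer

answer = rag_step([0.9, 0.1], "How long do I have to return an item?")
print(audit_log[0]["source"])  # d1 -- the retrieval is traceable
```

Because the write-back lands next to the source documents, every answer carries an audit trail back to the data it was derived from.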

In the Field

Large Life Insurance Company: Oracle Migration + Intelligent Risk Control

A major life insurance company was hitting performance ceilings on its Oracle-based systems. Database throughput couldn't keep up with business growth, and the legacy stack had no path to AI-driven decision-making.

Using OceanBase's Oracle compatibility mode, the migration was smooth — most stored procedures ran without modification, and application code changes were minimal. The distributed architecture provided elastic scalability, while PL execution optimizations and vector search capabilities enabled the team to build an intelligent risk control system on the same platform.

Results: batch processing time dropped from hours to minutes. The fraud detection model now identifies risks at transaction time, with both false positive and false negative rates significantly reduced. Full audit traceability satisfies financial regulatory requirements.

Government Services Platform: RAG-Powered Q&A

Government service platforms share a classic pain point: citizens ask policy questions, and keyword-based search returns irrelevant results. "I want to register a business" and "What do I need to start a company" are the same question to a human but completely different queries to a search engine.

The platform built a RAG-based Q&A system on OceanBase, combining vector search for semantic understanding, full-text search for keyword matching, and structured filtering for region and time-sensitivity constraints. The system retrieves from policy databases, service guides, and historical tickets in a single query path.
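The three-signal query path can be sketched as a simple score fusion: a structured filter as a hard constraint, then a weighted blend of vector similarity and keyword overlap. The corpus, weights, and field names are illustrative assumptions, not the platform's actual schema:

```python
def hybrid_search(query_vec, keywords, region, corpus, top_k=1):
    """Combine vector similarity, keyword overlap, and a structured
    region filter in one query path -- a toy version of the hybrid
    retrieval described above."""
    def vec_score(v):
        return sum(a * b for a, b in zip(query_vec, v))  # dot product
    def kw_score(text):
        return sum(1 for k in keywords if k in text) / max(len(keywords), 1)
    results = []
    for doc in corpus:
        if doc["region"] != region:    # structured filter: hard constraint
            continue
        score = 0.6 * vec_score(doc["vec"]) + 0.4 * kw_score(doc["text"])
        results.append((score, doc["id"]))
    return [d for _, d in sorted(results, reverse=True)[:top_k]]

corpus = [
    {"id": "g1", "region": "north", "vec": [0.9, 0.1],
     "text": "how to register a business"},
    {"id": "g2", "region": "north", "vec": [0.2, 0.8],
     "text": "renew a driver licence"},
    {"id": "g3", "region": "south", "vec": [0.9, 0.1],
     "text": "how to register a business"},
]
hits = hybrid_search([1.0, 0.0], ["register", "business"], "north", corpus)
print(hits)  # ['g1']: semantically close, keyword match, right region
```

The semantic signal is what lets "start a company" and "register a business" land on the same document even with zero keyword overlap.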

Results: answer accuracy improved significantly, response time dropped from minutes to seconds, and pressure on human agents decreased substantially. All Q&A records are fully traceable — in government services, accountability for every answer matters.

Looking Ahead

Upcoming patch releases will continue strengthening several areas:

  • Migration: Oracle global temporary table performance improvements, private temporary tables, distributed OBCDC cluster for better sync throughput, native UPDATE output in CDC (replacing the current DELETE+INSERT decomposition), and full MySQL Binlog compatibility
  • HA: Primary-standby strong sync with maximum protection and maximum availability modes
  • Performance: Parallel DDL for schema changes and index creation, deeper PL/SQL engine optimizations, and continued query optimizer improvements

Fresh. Consistent. Predictable. These aren't buzzwords. They're the properties that determine whether your AI application works in production or just in demos.

Full release notes
