
Many of the database primitives we rely on — transactions, locks, audit logs — were designed around a human operator who thinks before querying, understands what they are changing, and can usually reconstruct what happened when something breaks.
Agents operate very differently. In a typical vibe-coding or agentic workflow, an agent may execute dozens of INSERT, UPDATE, and DELETE statements with minimal human review — high-frequency writes generated from probabilistic decisions rather than deterministic business logic. And when the resulting data is wrong, it is often difficult to reconstruct which step introduced the error from conventional logs alone.
This is the default operating mode for RAG pipelines, multi-agent systems, and AI-driven data workflows. seekdb's Fork Table addresses this directly — giving every agent its own isolated branch where it can write freely, with zero risk to production data.
Recently, one question kept appearing in our user community:
"An AI modified my data. How do I get back to where I was?"
Here are three real-life situations:
The common thread is simple: agents need sandboxes. Not backups. Not read-only snapshots. They need complete, read-write, isolated workspaces where an agent can modify data freely, and where the entire branch can be discarded if the run fails or promoted if it succeeds.
In 2005, Git shipped with one insight: branches should be cheap. When branch creation costs nothing, developer behavior shifts from "be careful on main" to "branch everything, experiment freely, merge the best result."
Data needs the same shift — and for the same reason. AI development today looks like this:
All share one structure: the same dataset, multiple parallel evolution paths, and you need to pick the best outcome.
What are people actually doing? Running CREATE TABLE features_v1 AS SELECT ..., then doing it again for features_v2, features_v2_final, and features_v2_final_REAL. That is not version control; it is manual table sprawl. At terabyte scale, each full-table copy can mean long wait times, significant storage overhead, and operational friction that discourages experimentation.
Fork Table creates a new branch from a transactionally consistent snapshot of an existing table. The branch initially shares underlying data structures through copy-on-write, so creation is fast and storage-efficient. Once the branch diverges, only changed data and affected index state need to be materialized.
Branch creation is O(1) with respect to table size because the operation creates new branch metadata rather than copying table contents. In practice, a 1 TB table can be forked in roughly the same time as a 1 MB table — typically in under a second in our implementation. Copy-on-write means data is not duplicated until the branch actually diverges.
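To make the copy-on-write idea concrete, here is a toy in-memory model in Python. It is a sketch of the general technique only, not seekdb's storage engine: the `Layer` and `Table` classes and the tombstone marker are illustrative names invented for this example. Forking allocates two empty layers and copies no rows, so its cost does not depend on how many rows the table holds.

```python
_DELETED = object()  # tombstone marking a row deleted in some layer


class Layer:
    """One set of row changes stacked on top of an optional parent layer."""

    def __init__(self, parent=None):
        self.parent = parent
        self.rows = {}


class Table:
    """Toy table whose visible rows are the merge of a chain of layers."""

    def __init__(self, top=None):
        self._top = top if top is not None else Layer()

    def put(self, key, row):
        self._top.rows[key] = row  # writes only ever touch the private top layer

    def delete(self, key):
        self._top.rows[key] = _DELETED

    def get(self, key):
        layer = self._top
        while layer is not None:  # the newest layer that mentions the key wins
            if key in layer.rows:
                row = layer.rows[key]
                return None if row is _DELETED else row
            layer = layer.parent
        return None

    def fork(self):
        # The current top layer becomes shared and is never written again.
        shared = self._top
        self._top = Layer(shared)    # the source keeps writing into a fresh layer
        return Table(Layer(shared))  # the branch gets its own fresh layer


main = Table()
main.put(1, "alice")
main.put(2, "bob")

branch = main.fork()    # constant work: two empty layers, no rows copied
branch.put(1, "ALICE")  # visible only on the branch
branch.delete(2)        # main still sees row 2
main.put(3, "carol")    # written after the fork, invisible to the branch
```

After this runs, `main` still reads `alice`, `bob`, and `carol`, while `branch` reads `ALICE` for row 1 and nothing for rows 2 and 3. The trade-off in this toy is that reads walk the layer chain; a real engine would bound that cost with its own compaction or merge strategy.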
This changes what becomes operationally practical. When branching is this cheap, you can create an isolated writable branch before an agent run or before a risky transformation step. When branching is expensive, teams avoid doing it and accept shared-state risk instead.
Each branch is a complete sandbox — not a read-only snapshot. Agents can INSERT, UPDATE, and DELETE freely inside the branch. Because writes are isolated at the branch level, two agents working on separate branches do not mutate each other’s visible state.
-- Create a branch
FORK TABLE main_table TO experiment_branch;
-- Work on the branch (any SQL operations)
UPDATE experiment_branch SET ...;
INSERT INTO experiment_branch ...;
-- Discard if failed
DROP TABLE experiment_branch;
-- Or promote if successful
RENAME TABLE main_table TO main_backup, experiment_branch TO main_table;
Everything apart from FORK TABLE is ordinary MySQL-compatible SQL, so there is no other new syntax to learn.
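The fork, work, then discard-or-promote sequence can be wrapped in a small helper so that every agent run follows it. The Python sketch below only builds the statements shown above and passes them to an `execute` callable that you supply, for example the `execute` method of a cursor from whichever MySQL client library you use. The function names, the identifier check, and the convention that `work` returns True on success are assumptions made for this illustration, not part of seekdb.

```python
import re

# Conservative identifier check, since table names are interpolated into SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name):
    if not _IDENT.match(name):
        raise ValueError(f"unsafe table name: {name!r}")
    return name


def fork_sql(source, branch):
    return f"FORK TABLE {_check(source)} TO {_check(branch)};"


def discard_sql(branch):
    return f"DROP TABLE {_check(branch)};"


def promote_sql(source, branch, backup):
    return (
        f"RENAME TABLE {_check(source)} TO {_check(backup)}, "
        f"{_check(branch)} TO {_check(source)};"
    )


def run_on_branch(execute, source, branch, backup, work):
    """Fork, run `work` against the branch, then promote or discard it.

    `execute` takes one SQL string; `work` takes the branch name and
    returns True if the result should replace the source table.
    """
    execute(fork_sql(source, branch))
    try:
        ok = work(branch)
    except Exception:
        execute(discard_sql(branch))  # a crashed run never touches the source
        raise
    execute(promote_sql(source, branch, backup) if ok else discard_sql(branch))
    return ok


# Dry run: collect the statements instead of sending them to a database.
statements = []
run_on_branch(
    statements.append,
    "main_table",
    "experiment_branch",
    "main_backup",
    work=lambda branch: True,
)
```

In the dry run, `statements` ends up holding the FORK TABLE statement followed by the RENAME TABLE statement. If `work` returns False or raises, the second statement is the DROP TABLE instead.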
When you fork a table with HNSW vector indexes, the index structures are initially shared across branches using copy-on-write. That means forking a RAG knowledge base with millions of embeddings does not require an immediate full index rebuild. The branch reuses existing index state until data or index paths diverge, which keeps branching fast and avoids making experimentation prohibitively expensive.
This is where AI-oriented storage semantics start to matter. In systems where branching and vector indexing are introduced as separate layers, efficient index reuse can become difficult, and teams may face tradeoffs between rebuild cost, freshness, and branch isolation.
Fork Table is not just about making experiments faster. It changes the safety model of agent-driven data systems. Instead of letting multiple agents mutate shared state and hoping logs are enough to recover, you give each run its own isolated branch and decide later what deserves to be promoted.
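One way to picture "decide later what deserves to be promoted" is several runs forking the same table, each scored afterwards. The Python sketch below plans one branch per run and, once scores are in, emits the statements that drop the losers and promote the winner. The branch naming scheme, the backup table name, and the scores are all hypothetical; how a run is scored is up to the application, and table names are assumed to be trusted identifiers.

```python
def plan_parallel_runs(source, run_ids):
    """Map each run id to (branch_name, fork_statement) for the same source."""
    plan = {}
    for run_id in run_ids:
        branch = f"{source}_run_{run_id}"  # hypothetical naming convention
        plan[run_id] = (branch, f"FORK TABLE {source} TO {branch};")
    return plan


def settle(source, plan, scores, backup):
    """Return the winning run id and the SQL that keeps only its branch."""
    winner = max(scores, key=scores.get)
    # Drop every branch that did not win, in planning order.
    statements = [f"DROP TABLE {plan[r][0]};" for r in plan if r != winner]
    # Swap the winner into place, keeping the old table under `backup`.
    statements.append(
        f"RENAME TABLE {source} TO {backup}, {plan[winner][0]} TO {source};"
    )
    return winner, statements


plan = plan_parallel_runs("features", ["a", "b", "c"])
# Scores would come from evaluating each branch; these are made-up values.
winner, statements = settle(
    "features", plan, {"a": 0.71, "b": 0.84, "c": 0.80}, "features_backup"
)
```

With these made-up scores, run `b` wins, the branches for `a` and `c` are dropped, and `features_run_b` is renamed to `features`.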
For teams building AI applications, native data branching removes a category of operational risk.
The pattern is the same one Git brought to code: do not rely on people being more careful. Reduce the cost of experimentation and the cost of being wrong.
Fork Table handles single-table branching. But real AI applications rarely touch just one table: an agent's knowledge base might span structured data, vector indexes, and metadata tables that need to be forked together at a consistent snapshot point. For that case, Fork Database (available in seekdb V1.2.0) lets developers create a complete, isolated copy of the entire database at a globally consistent point in time.
You can try seekdb's Fork Table locally on macOS or in Docker.
# macOS
brew tap oceanbase/seekdb && brew install seekdb && seekdb-start
# Docker
docker run -d --name seekdb oceanbase/seekdb:latest
Connect with any MySQL client (mysql -h127.0.0.1 -uroot -P2881 -A -Dtest) and start branching.
For documentation and online trial environments, see oceanbase.ai.