
A single curl returns a MySQL-compatible instance with vector search, full-text search, JSON, GIS, and data branching baked in — no signup, no credit card, no config. In this post:

- FULLTEXT search with multi-dimensional filters on a disposable D0 instance
- An Agent sandbox via FORK TABLE
- Agent-driven database provisioning through the SKILL.md contract

You just finished a blog post on hybrid search and an idea hits: take the few hundred internal docs scattered across the team, ingest them, build a natural-language Q&A bot. The retrieval pipeline is already mapped out in your head — chunk, embed, vector recall, rerank, summarize with an LLM. Twenty minutes of Python, max.
Then you stall. Not on the algorithm — on storage. Full-text search needs MySQL FULLTEXT or Elasticsearch. Vectors need Milvus or Pinecone. Metadata filters want a relational store. Three Docker containers. Three schemas to keep in sync. Or worse: open a cloud console, register, verify a phone number, link a credit card, pick a region, choose an instance size, configure VPC rules, wait for provisioning. By the time you've got a connection string, it's past midnight and the spark is gone.
Saturday afternoon, the idea is still sitting in ideas.md. It will probably never get opened again.
seekdb D0 exists to delete that segment of the journey.
seekdb is the AI-native hybrid search database from the OceanBase team, open-sourced under Apache-2.0. D0 is its free trial entry point, designed around three "zeros": zero signup, zero payment, zero config. One anonymous curl provisions an instance; point mysql straight in.

On top of that, every D0 instance ships the full seekdb capability surface in one engine: SQL, vectors, full-text search, JSON, GIS, and data branching. One instance, one connection string, one SQL dialect — no glue code between three stores.
A few constraints worth knowing up front:

- Instances are disposable: no SLA, consumption limits, and self-deletion after 7 days.
- TLS is mandatory; plain-text connections are rejected at the gateway.
- The in-database AI functions (AI_EMBED, AI_COMPLETE, AI_RERANK) are disabled on D0. Self-hosted seekdb gives you the full set.

D0 is the fastest path from idea to working prototype. It is not a production substitute. The full operating contract lives at d0.seekdb.ai/SKILL.md — also worth a look if you're curious how Agents consume the API.
`seekdb_d0_search_demo` is a deliberately small FastAPI + PyMySQL service. No ORM, no async workers, no layered architecture — three Python files, under 250 lines total. The goal is not to be a reference application; it's to be readable in one sitting and trivial to fork.
What it does:
- Full-text search over `title` + `body` via MySQL FULLTEXT indexes
- Filtering by type (`blog` / `note` / `ticket`), status, and ticket priority
- `MATCH ... AGAINST` in natural-language mode, with pagination

On startup the app auto-creates the table and seeds sample data, so the first run produces a searchable catalog with no manual setup.
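The pagination piece is plain LIMIT/OFFSET arithmetic. A helper along these lines captures it (the function name and defaults are mine, not necessarily the demo's):

```python
def paginate(page: int = 1, page_size: int = 10, max_page_size: int = 50) -> tuple[int, int]:
    """Translate a 1-based page number into SQL LIMIT/OFFSET values."""
    page = max(page, 1)                                # clamp bad input instead of erroring
    page_size = min(max(page_size, 1), max_page_size)  # cap the page size
    return page_size, (page - 1) * page_size

limit, offset = paginate(page=3, page_size=20)  # → (20, 40)
```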
```shell
git clone https://github.com/liuhao6741/seekdb_d0_search_demo.git
cd seekdb_d0_search_demo
```

The simplest path on bash or zsh is to eval the shell-formatted response. Each call provisions a fresh instance:
```shell
eval "$(curl -s -X POST 'https://d0.seekdb.ai/api/v1/instances?format=shell')"
echo "$D0_INSTANCE_ID"
```

If `echo` returns a value, you're in. On fish, Windows CMD, or any shell that doesn't grok `export`, drop `?format=shell` and parse the JSON manually:
```shell
curl -s -X POST 'https://d0.seekdb.ai/api/v1/instances'
```

The response includes `host`, `port` (always 2881), `username`, `password`, `database`, plus a ready-made connection URL. The credentials are returned once — there's no recovery flow.
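If you'd rather script provisioning, for instance to bake it into a launch script, the JSON can be written straight into .env. Here is a sketch that assumes the response is a flat object carrying the fields listed above; adjust the keys if the real payload nests them:

```python
import json
import urllib.request

D0_API = "https://d0.seekdb.ai/api/v1/instances"
FIELDS = ("host", "port", "username", "password", "database")

def to_env(payload: dict) -> str:
    """Render the provisioning response as .env lines (D0_HOST=..., D0_PORT=..., etc.)."""
    return "\n".join(f"D0_{k.upper()}={payload[k]}" for k in FIELDS)

def provision() -> dict:
    """POST to the D0 API and parse the JSON body. Each call consumes a fresh instance."""
    req = urllib.request.Request(D0_API, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (performs a real network call and provisions a new D0 instance):
#   creds = provision()
#   open(".env", "w").write(to_env(creds) + "\n")
```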
Fill in `.env` and run:

```shell
cp .env.example .env
# Paste D0_HOST, D0_PORT, D0_USERNAME, D0_PASSWORD, D0_DATABASE into .env
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload --host 127.0.0.1 --port 8000
```

Open http://127.0.0.1:8000 and search for TLS or FULL-TEXT. If you see ranked results, the loop is closed: code → connection → schema → seeded data → search, all on a database that didn't exist five minutes ago.
**TLS note:** D0 requires TLS. PyMySQL needs `ssl={"ssl": True}`; Node `mysql2` wants `ssl: { rejectUnauthorized: false }`; JDBC takes `useSSL=true&verifyServerCertificate=false`. Plain-text connections are rejected at the gateway.
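In Python that means passing PyMySQL's ssl argument. Below is a minimal sketch of what the demo's db.py does; the two-function split is mine, and the env-var names match the .env keys used earlier:

```python
import os

def connection_kwargs(env=os.environ) -> dict:
    """Build PyMySQL connection arguments from the D0_* environment variables."""
    return {
        "host": env["D0_HOST"],
        "port": int(env.get("D0_PORT", "2881")),  # D0 always listens on 2881
        "user": env["D0_USERNAME"],
        "password": env["D0_PASSWORD"],
        "database": env["D0_DATABASE"],
        "ssl": {"ssl": True},                     # required: D0 rejects plain-text connections
    }

def connect():
    """Open a TLS connection to a D0 instance."""
    import pymysql  # imported here so the pure helper above stays dependency-free
    return pymysql.connect(cursorclass=pymysql.cursors.DictCursor, **connection_kwargs())
```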
The demo is structured to be skimmed in three files:
| File | Role |
| --- | --- |
| `db.py` | TLS connection, env-var configuration |
| `schema.py` | DDL with `FULLTEXT(title, body)` index, sample data |
| `main.py` | FastAPI routes, `MATCH ... AGAINST` ranking, filters |
The most useful thing to read is how a single `MATCH(title, body) AGAINST(?)` expression in `main.py` doubles as the relevance score and the filter predicate. That, plus a few WHERE clauses on `type` / `status` / `priority`, is the entire search engine.
The demo only uses full-text search. The same D0 instance can support far more — here are three extensions, each a small diff away.
Vectors and full-text indexes live on the same table in seekdb, and a single SQL statement can score both:
```sql
ALTER TABLE kb_items ADD COLUMN embedding VECTOR(1536);

-- D0 ships the IVF family; HNSW is available on self-hosted seekdb.
CREATE VECTOR INDEX idx_emb ON kb_items(embedding) USING IVF
  WITH (DISTANCE=cosine, TYPE=ivf_flat);
```
```sql
SELECT id, title,
       -- cosine_distance returns a distance (lower = closer), so flip it
       -- into a similarity before mixing it with the full-text score
       1 - cosine_distance(embedding, '[0.12, 0.34, ...]') AS vec_score,
       MATCH(title, body) AGAINST('keyword') AS text_score
FROM kb_items
WHERE MATCH(title, body) AGAINST('keyword')
ORDER BY (vec_score * 0.7 + text_score * 0.3) DESC
LIMIT 10;
```

IVF_FLAT gives the highest recall in the IVF family and works well up to a few million rows. For larger sets, IVF_SQ8 trades some accuracy for storage, and IVF_PQ goes further on storage compression. HNSW and HGRAPH are not available on D0 — they're enabled in self-hosted seekdb. Embeddings need to be generated client-side (OpenAI, BGE, Qwen) since the in-database AI_EMBED is disabled on D0.
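Since AI_EMBED is disabled on D0, the query vector has to be produced client-side and shipped as a string literal. A tiny helper for that, following the bracketed format shown above (the commented usage with a model name is illustrative, not prescriptive):

```python
def vector_literal(vec: list[float], ndigits: int = 6) -> str:
    """Serialize an embedding into a '[x, y, ...]' vector literal for SQL."""
    return "[" + ", ".join(f"{x:.{ndigits}g}" for x in vec) + "]"

# With a real embedding client (OpenAI, BGE, Qwen) the flow would resemble:
#   emb = client.embeddings.create(model="...", input=q).data[0].embedding
#   cur.execute("SELECT ... cosine_distance(embedding, %s) ...", (vector_literal(emb),))
```

Rounding to a handful of significant digits keeps the literal short without meaningfully moving cosine distances.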
Letting an Agent write directly against your production database is a knife held by the wrong end. seekdb's FORK capability gives you a copy-on-write branch instead:
```sql
-- Millisecond branch — no physical copy
FORK TABLE kb_items TO kb_items_agent;

-- Agent operates on the branch
UPDATE kb_items_agent SET status = 'archived' WHERE created_at < '2025-01-01';

-- Review the diff before committing
DIFF TABLE kb_items AGAINST kb_items_agent;

-- Merge if you like the result, DROP if you don't
```

The trick is the LSM-tree storage engine: FORK TABLE records a logical branch point and shares historical SST files; new writes generate fresh files only on the branch. Branch creation is independent of table size — it stays in millisecond range whether the table is 10 MB or 100 GB. Instance-level forks (POST /api/v1/instances/{id}/fork) exist too, but D0's resource ceiling can occasionally reject them; for serious branching workloads, run on self-hosted seekdb.
The most interesting design choice in D0 is that the operating manual is Agent-readable. Inside Cursor or Claude Code, you can paste:
> Read https://d0.seekdb.ai/SKILL.md and create a database to store conversation history.
The Agent fetches SKILL.md, follows the API contract, provisions the instance, creates tables, and writes rows. This is Anthropic's Skill pattern at work — Markdown describes the API contract, and Agents consume it as first-class clients. For developers, that means you no longer write a single line of code to "prepare a database" for your Agent toolchain.
**`eval` runs but `D0_*` is empty?** Your shell can't parse `export` (fish, CMD), or the API didn't return shell format. Drop `?format=shell` and copy the JSON fields into `.env` manually.

**SSL handshake errors?** D0 enforces TLS. Make sure your driver explicitly enables it — see the TLS note above.

**Instance expired?** Run the create call again, paste the new credentials into `.env`, and restart. `schema.py` rebuilds the table and seed data on boot, so you're back where you started in under a minute. For demos and workshops, bake the create call into your launch script.

**Can I run production on this?** No. D0 has no SLA, has consumption limits, and self-deletes after 7 days. For production, the three real options are: self-hosted seekdb (open source, Apache-2.0), the persistent seekdb M0 service at m0.seekdb.ai, or OceanBase Cloud.
What's valuable here isn't a particular SQL dialect or vector index. It's a path that has been compressed to its limit: idea → one curl → cloned 250-line demo → vector + full-text + branching wired up → real data, real prototype, no setup tax. In a domain where the iteration loop is everything, time-to-first-query is the bottleneck — and D0 makes it close to zero.
If 7 days is enough to convince you, the next step is open source: github.com/oceanbase/seekdb, Apache-2.0, no usage cap. If you need a managed path or a larger footprint, talk to the OceanBase team.
And if you're just curious — 7 days is plenty of room for an Agent to run a lot of experiments.