5 Minutes to a Working AI Prototype: A Developer's Guide to seekdb D0

Mike Liu
Published on May 14, 2026
6 minute read
Key Takeaways
  • seekdb D0 is the free trial entry point for seekdb, an open-source AI-native hybrid search database. One curl returns a MySQL-compatible instance with vector search, full-text search, JSON, GIS, and data branching baked in — no signup, no credit card, no config.
  • We walk through a FastAPI knowledge-base search demo (~250 lines, no ORM) that combines FULLTEXT search with multi-dimensional filters on a disposable D0 instance.
  • Three extension paths show how the same instance can power hybrid-search RAG, an Agent sandbox via FORK TABLE, and even Agent-driven database provisioning through the SKILL.md contract.

You just finished a blog post on hybrid search and an idea hits: take the few hundred internal docs scattered across the team, ingest them, build a natural-language Q&A bot. The retrieval pipeline is already mapped out in your head — chunk, embed, vector recall, rerank, summarize with an LLM. Twenty minutes of Python, max.

Then you stall. Not on the algorithm — on storage. Full-text search needs MySQL FULLTEXT or Elasticsearch. Vectors need Milvus or Pinecone. Metadata filters want a relational store. Three Docker containers. Three schemas to keep in sync. Or worse: open a cloud console, register, verify a phone number, link a credit card, pick a region, choose an instance size, configure VPC rules, wait for provisioning. By the time you've got a connection string, it's past midnight and the spark is gone.

Saturday afternoon, the idea is still sitting in ideas.md. It will probably never get opened again.

seekdb D0 exists to delete that segment of the journey.

What seekdb D0 actually is

seekdb is the AI-native hybrid search database from the OceanBase team, open-sourced under Apache-2.0. D0 is its free trial entry point, designed around three "zeros":

  • Zero signup — no account, email, or credit card. The API requires no auth at all.
  • Zero config — no instance sizing, no VPC, no IP allowlist. The endpoint comes back ready to connect.
  • Zero wait — instances are created in seconds. You get credentials back and can connect with mysql right away.

On top of that, every D0 instance ships the full seekdb capability surface in one engine: SQL, vectors, full-text search, JSON, GIS, and data branching. One instance, one connection string, one SQL dialect — no glue code between three stores.

A few constraints worth knowing up front:

  • Each instance lives for 7 days, then data is securely deleted.
  • For security, in-database AI functions (AI_EMBED, AI_COMPLETE, AI_RERANK) are disabled on D0. Self-hosted seekdb gives you the full set.
  • D0 has no SLA. Don't put production traffic, customer PII, or anything GDPR-regulated on it.

D0 is the fastest path from idea to working prototype. It is not a production substitute. The full operating contract lives at d0.seekdb.ai/SKILL.md — also worth a look if you're curious how Agents consume the API.

The demo: seekdb_d0_search_demo

seekdb_d0_search_demo is a deliberately small FastAPI + PyMySQL service. No ORM, no async workers, no layered architecture — three Python files, under 250 lines total. The goal is not to be a reference application; it's to be readable in one sitting and trivial to fork.

What it does:

  • Full-text search on title + body via MySQL FULLTEXT indexes
  • Multi-dimensional filters on content type (blog / note / ticket), status, and ticket priority
  • Relevance ranking via MATCH ... AGAINST natural-language mode, with pagination
  • Two interfaces — an HTML UI for humans and a JSON API for Agents

On startup the app auto-creates the table and seeds sample data, so the first run produces a searchable catalog with no manual setup.
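To make the auto-created schema concrete, here is a plausible sketch of the DDL a schema.py like this one might run on boot. The column set is inferred from the filters described above (type, status, priority) and the FULLTEXT(title, body) index mentioned later; the real file may differ in details.

```python
# Hypothetical reconstruction of the demo's table DDL; column names and
# types are inferred from the feature list, not copied from the repo.
KB_ITEMS_DDL = """
CREATE TABLE IF NOT EXISTS kb_items (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255) NOT NULL,
    body TEXT NOT NULL,
    type ENUM('blog', 'note', 'ticket') NOT NULL,
    status VARCHAR(32) NOT NULL DEFAULT 'open',
    priority TINYINT NULL,                        -- tickets only
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    FULLTEXT KEY ft_title_body (title, body)      -- powers MATCH ... AGAINST
)
"""
```

Running this once per startup with CREATE TABLE IF NOT EXISTS is what makes a fresh D0 instance searchable with no manual setup.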

The 5-minute walkthrough

1. Clone the repo

git clone https://github.com/liuhao6741/seekdb_d0_search_demo.git
cd seekdb_d0_search_demo

2. Spin up a D0 instance

The simplest path on bash or zsh is to eval the shell-formatted response. Each call provisions a fresh instance:

eval "$(curl -s -X POST 'https://d0.seekdb.ai/api/v1/instances?format=shell')"
echo "$D0_INSTANCE_ID"

If echo returns a value, you're in. On fish, Windows CMD, or any shell that doesn't grok export, drop ?format=shell and parse the JSON manually:

curl -s -X POST 'https://d0.seekdb.ai/api/v1/instances'

The response includes host, port (always 2881), username, password, database, plus a ready-made connection URL. The credentials are returned once — there's no recovery flow.
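If you parse the JSON yourself, a few lines of Python turn the response into the .env lines the demo expects. The field names below come from the response description above; the sample values are placeholders, not real output.

```python
# Placeholder stand-in for the JSON body returned by
# POST https://d0.seekdb.ai/api/v1/instances (values are fake).
sample = {
    "host": "<host>",
    "port": 2881,
    "username": "<user>",
    "password": "<pass>",
    "database": "<db>",
}

def to_env(resp: dict) -> str:
    """Render D0 instance credentials as .env lines (D0_HOST=..., etc.)."""
    keys = ["host", "port", "username", "password", "database"]
    return "\n".join(f"D0_{k.upper()}={resp[k]}" for k in keys)

print(to_env(sample))
```

Since the credentials are returned exactly once, write them to .env immediately; there is nothing to go back and fetch later.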

3. Wire up .env and run

cp .env.example .env
# Paste D0_HOST, D0_PORT, D0_USERNAME, D0_PASSWORD, D0_DATABASE into .env

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload --host 127.0.0.1 --port 8000

Open http://127.0.0.1:8000 and search for TLS or FULL-TEXT. If you see ranked results, the loop is closed: code → connection → schema → seeded data → search, all on a database that didn't exist five minutes ago.

TLS note: D0 requires TLS. PyMySQL needs ssl={"ssl": True}; Node mysql2 wants ssl: { rejectUnauthorized: false }; JDBC takes useSSL=true&verifyServerCertificate=false. Plain-text connections are rejected at the gateway.
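For PyMySQL specifically, a small helper keeps the TLS requirement from being forgotten. This is a sketch of how the demo's db.py could assemble connection arguments from the environment; the function name is illustrative.

```python
def d0_connect_kwargs(env: dict) -> dict:
    """Build PyMySQL connection kwargs for a D0 instance.

    D0's gateway rejects plain-text connections, so ssl is always set.
    """
    return {
        "host": env["D0_HOST"],
        "port": int(env.get("D0_PORT", "2881")),   # D0 always uses 2881
        "user": env["D0_USERNAME"],
        "password": env["D0_PASSWORD"],
        "database": env["D0_DATABASE"],
        "ssl": {"ssl": True},                      # required by the gateway
        "charset": "utf8mb4",
    }

# usage (assumes pymysql is installed):
#   import os, pymysql
#   conn = pymysql.connect(**d0_connect_kwargs(dict(os.environ)))
```

Centralizing this in one function means every code path that opens a connection gets TLS for free.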

Reading the code in one pass

The demo is structured to be skimmed in three files:

  • db.py — TLS connection, env-var configuration
  • schema.py — DDL with FULLTEXT(title, body) index, sample data
  • main.py — FastAPI routes, MATCH ... AGAINST ranking, filters

The most useful thing to read is how a single MATCH(title, body) AGAINST(?) expression in main.py doubles as the relevance score and the filter predicate. That, plus a few WHERE clauses on type / status / priority, is the entire search engine.
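The shape of that query is worth sketching. The following is a hypothetical reconstruction of the kind of SQL builder main.py could use, not the repo's actual code: one MATCH ... AGAINST expression appears twice, once as the score and once as the predicate, with facet filters appended as plain WHERE clauses.

```python
def build_search_sql(filters: dict) -> tuple[str, list]:
    """Assemble a full-text search query with optional facet filters.

    The same MATCH ... AGAINST expression is the relevance score (SELECT)
    and the filter predicate (WHERE).
    """
    sql = """
        SELECT id, title, type, status,
               MATCH(title, body) AGAINST(%s IN NATURAL LANGUAGE MODE) AS score
        FROM kb_items
        WHERE MATCH(title, body) AGAINST(%s IN NATURAL LANGUAGE MODE)
    """
    params = [filters["q"], filters["q"]]
    for col in ("type", "status", "priority"):   # facet filters
        if filters.get(col):
            sql += f" AND {col} = %s"
            params.append(filters[col])
    sql += " ORDER BY score DESC LIMIT %s OFFSET %s"
    params += [filters.get("limit", 10), filters.get("offset", 0)]
    return sql, params

sql, params = build_search_sql({"q": "TLS", "type": "blog"})
```

Everything is parameterized, so user input never lands in the SQL string itself; the f-string only ever interpolates the fixed column names from the allow-list tuple.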

Three places to take it next

The demo only uses full-text search. The same D0 instance can support far more — here are three extensions, each a small diff away.

Hybrid search on one table

Vectors and full-text indexes live on the same table in seekdb, and a single SQL statement can score both:

ALTER TABLE kb_items ADD COLUMN embedding VECTOR(1536);

-- D0 ships the IVF family; HNSW is available on self-hosted seekdb.
CREATE VECTOR INDEX idx_emb ON kb_items(embedding) USING IVF
    WITH (DISTANCE=cosine, TYPE=ivf_flat);

SELECT id, title,
       -- cosine_distance is smaller-is-better; invert it so that a higher
       -- vec_score means a closer match before mixing with text_score
       1 - cosine_distance(embedding, '[0.12, 0.34, ...]') AS vec_score,
       MATCH(title, body) AGAINST('keyword')               AS text_score
FROM kb_items
WHERE MATCH(title, body) AGAINST('keyword')
ORDER BY (vec_score * 0.7 + text_score * 0.3) DESC
LIMIT 10;

IVF_FLAT gives the highest recall in the IVF family and works well up to a few million rows. For larger sets, IVF_SQ8 trades some accuracy for storage, and IVF_PQ goes further on storage compression. HNSW and HGRAPH are not available on D0 — they're enabled in self-hosted seekdb. Embeddings need to be generated client-side (OpenAI, BGE, Qwen) since the in-database AI_EMBED is disabled on D0.
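Since embedding happens client-side, the only glue you need is a formatter that turns a Python list of floats into the bracketed literal the SQL above uses. A minimal sketch, with embed() standing in for whatever model client you pick (OpenAI, BGE, Qwen):

```python
def vector_literal(vec: list[float]) -> str:
    """Format a client-side embedding as a '[x, y, ...]' literal for
    VECTOR columns and cosine_distance() calls."""
    return "[" + ", ".join(f"{x:.6f}" for x in vec) + "]"

# usage sketch (embed() is a hypothetical client-side embedding call):
#   qvec = embed("how do I rotate TLS certs?")      # -> list[float], len 1536
#   cur.execute(
#       "SELECT id, title, 1 - cosine_distance(embedding, %s) AS vec_score "
#       "FROM kb_items ORDER BY vec_score DESC LIMIT 10",
#       (vector_literal(qvec),),
#   )
```

Passing the literal as a bound parameter keeps the query itself static regardless of the embedding's dimensionality.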

A safe sandbox for Agents

Letting an Agent write directly against your production database is a knife held by the wrong end. seekdb's FORK capability gives you a copy-on-write branch instead:

-- Millisecond branch — no physical copy
FORK TABLE kb_items TO kb_items_agent;

-- Agent operates on the branch
UPDATE kb_items_agent SET status = 'archived' WHERE created_at < '2025-01-01';

-- Review the diff before committing
DIFF TABLE kb_items AGAINST kb_items_agent;

-- Merge if you like the result, DROP if you don't

The trick is the LSM-tree storage engine: FORK TABLE records a logical branch point and shares historical SST files; new writes generate fresh files only on the branch. Branch creation is independent of table size — it stays in the millisecond range whether the table is 10 MB or 100 GB. Instance-level forks (POST /api/v1/instances/{id}/fork) exist too, but D0's resource ceiling can occasionally reject them; for serious branching workloads, run on self-hosted seekdb.
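Driven from Python, the fork-review loop above is just a short sequence of statements executed over the same connection. The statement texts are taken from the example; the merge step is omitted here because its exact syntax isn't shown above, and run_sandbox is an illustrative name, not demo code.

```python
# The agent's write is confined to the kb_items_agent branch; the base
# table is never touched until a human reviews the diff.
SANDBOX_STEPS = [
    "FORK TABLE kb_items TO kb_items_agent",
    "UPDATE kb_items_agent SET status = 'archived' "
    "WHERE created_at < '2025-01-01'",
    "DIFF TABLE kb_items AGAINST kb_items_agent",
]

def run_sandbox(conn):
    """Fork, let the agent write, and return the diff rows for review."""
    with conn.cursor() as cur:
        for stmt in SANDBOX_STEPS:
            cur.execute(stmt)
        return cur.fetchall()   # rows from the final DIFF statement
```

If the diff looks wrong, a single DROP TABLE kb_items_agent discards the whole experiment.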

Let the Agent provision its own database

The most interesting design choice in D0 is that the operating manual is Agent-readable. Inside Cursor or Claude Code, you can paste:

Read https://d0.seekdb.ai/SKILL.md and create a database to store conversation history.

The Agent fetches SKILL.md, follows the API contract, provisions the instance, creates tables, and writes rows. This is Anthropic's Skill pattern at work — Markdown describes the API contract, and Agents consume it as first-class clients. For developers, that means you no longer write a single line of code to "prepare a database" for your Agent toolchain.

A few things people ask

eval runs but D0_* is empty? Your shell can't parse export (fish, CMD), or the API didn't return shell format. Drop ?format=shell and copy the JSON fields into .env manually.

SSL handshake errors? D0 enforces TLS. Make sure your driver explicitly enables it — see the TLS note above.

Instance expired? Run the create call again, paste the new credentials into .env, and restart. schema.py rebuilds the table and seed data on boot, so you're back where you started in under a minute. For demos and workshops, bake the create call into your launch script.

Can I run production on this? No. D0 has no SLA, has consumption limits, and self-deletes after 7 days. For production, the three real options are: self-hosted seekdb (open source, Apache-2.0), the persistent seekdb M0 service at m0.seekdb.ai, or OceanBase Cloud.

The path is the point

What's valuable here isn't a particular SQL dialect or vector index. It's a path that has been compressed to its limit: idea → one curl → cloned 250-line demo → vector + full-text + branching wired up → real data, real prototype, no setup tax. In a domain where the iteration loop is everything, time-to-first-query is the bottleneck — and D0 makes it close to zero.

If 7 days is enough to convince you, the next step is open source: github.com/oceanbase/seekdb, Apache-2.0, no usage cap. If you need a managed path or a larger footprint, talk to the OceanBase team.

And if you're just curious — 7 days is plenty of room for an Agent to run a lot of experiments.
