
Andrej Karpathy's LLM Wiki dropped a simple idea: store knowledge as plain text, let an LLM understand and update it. Garry Tan's GBrain ran with the same concept. Both projects prove that LLM + local storage is a surprisingly powerful combination for personal knowledge management.
But after using them, I kept hitting the same wall: notes pile up, nothing gets updated, and finding connections between pieces of knowledge requires me to do all the work. So I built ex-brain — a CLI tool that compiles, links, and evolves a personal knowledge base using LLMs.
At a high level, ex-brain provides four mechanisms that standard note-taking tools don't:

- **Compilation**: new information is merged into existing pages, updating current state instead of piling up notes.
- **Timelines**: events are extracted from pages and ordered chronologically.
- **Entity linking**: relationships between people, companies, and roles are detected, and stub pages are created automatically.
- **Hybrid search**: keyword and semantic retrieval combined in a single query.

The result: a knowledge base that behaves less like a filing cabinet and more like a memory that keeps itself current.
Tools like Notion and Obsidian are great at storing information. They're terrible at keeping it current. You write a note about a company's Series A in March, their new CEO in June, and their Series B in August — and six months later, you have to read all three notes and mentally reconstruct the current state.
AI-powered alternatives like Mem or Granola add summarization, but the intelligence is a black box. You can't control how it categorizes, what it prioritizes, or when it decides something is outdated.
The human brain doesn't work this way. When you learn that a company raised a Series B, you don't file it next to the Series A note — you update your mental model. The Series A becomes history. The Series B becomes current state.
ex-brain applies the same principle to a knowledge base.
Run a single command to feed new information into an existing knowledge page:
```bash
ebrain compile companies/river-ai \
  "River AI closed Series A, $50M" \
  --source meeting_notes \
  --date 2024-05-20
```
The LLM analyzes the information type — is this a status change (funding stage moved from Seed to Series A), a new fact (founded in 2020), or an event (product launched)? — then applies the right update strategy:
| Type | Strategy | Example |
|---|---|---|
| Status | Update current, archive previous | Funding stage, CEO, headcount |
| Fact | Append, keep existing | Founded year, industry, HQ |
| Event | Add to timeline | Product launch, funding close |
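The table above can be read as a dispatch function. Here's a minimal sketch of that logic — the `Page` shape, `UpdateType` names, and `applyUpdate` signature are illustrative assumptions, not ex-brain's actual internals:

```typescript
// Hypothetical sketch of the compile dispatch — shapes and names are
// assumptions, not ex-brain's real internals.
type UpdateType = "status" | "fact" | "event";

interface Page {
  status: Record<string, string>;
  history: string[];
  facts: string[];
  timeline: { date: string; summary: string }[];
}

function applyUpdate(page: Page, type: UpdateType, key: string, value: string, date: string): Page {
  switch (type) {
    case "status": {
      // Status: overwrite the current value and archive the previous one.
      const prev = page.status[key];
      if (prev !== undefined) page.history.push(`Previously ${prev} (until ${date})`);
      page.status[key] = value;
      return page;
    }
    case "fact":
      // Fact: append without touching existing entries.
      page.facts.push(value);
      return page;
    case "event":
      // Event: add to the chronological timeline.
      page.timeline.push({ date, summary: value });
      return page;
  }
}
```

The LLM's job is only the classification step; once the type is known, the update itself is deterministic.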
The compiled page always reflects current truth:
```markdown
## Status
- **Funding Stage**: Series A (Source: meeting_notes, 2024-05-20)
- **Valuation**: ~$50M

## History
- Previously Seed (until 2024-05-20)

## Facts
- Series A led by Sequoia
- Founded 2020
```
No manual reorganization. No stale information buried in a page you'll never re-read.
Time is the axis that makes knowledge useful. ex-brain extracts events from compiled pages and structures them chronologically:
```bash
ebrain timeline extract companies/river-ai
```

```json
[
  {
    "date": "2024-05-20",
    "summary": "Series A closed, $50M",
    "detail": "Led by Sequoia"
  },
  {
    "date": "2024-06-15",
    "summary": "Sarah Chen appointed CEO"
  }
]
```
Date parsing handles ISO, natural language ("last week", "yesterday"), and localized formats. Timeline extraction runs automatically during compilation — every compile that contains an event adds it to the timeline.
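A minimal normalizer along these lines can handle the ISO and relative cases; ex-brain's real parser presumably covers far more formats and locales, so treat this as an illustrative sketch:

```typescript
// Illustrative date normalizer — not ex-brain's actual parser.
function normalizeDate(input: string, today: Date = new Date()): string {
  if (/^\d{4}-\d{2}-\d{2}$/.test(input)) return input; // already ISO

  // A couple of relative phrases, resolved against `today` in UTC.
  const offsets: Record<string, number> = { yesterday: 1, "last week": 7 };
  const offset = offsets[input.toLowerCase()];
  if (offset !== undefined) {
    const d = new Date(today);
    d.setUTCDate(d.getUTCDate() - offset);
    return d.toISOString().slice(0, 10);
  }

  // Fall back to the runtime's own parser for everything else.
  const parsed = new Date(input);
  if (!Number.isNaN(parsed.getTime())) return parsed.toISOString().slice(0, 10);
  throw new Error(`Unrecognized date: ${input}`);
}
```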
A piece of knowledge is rarely about one thing. "Ali Partovi is the founder of Neo" connects a person, an organization, and a role. ex-brain uses LLMs to detect these relationships:
```bash
ebrain put people/ali-partovi --file notes.md
# Detected:
# - Ali Partovi founder_of Neo
# - Ali Partovi invested_in [other companies]
```
When a new entity is detected, the system creates a stub page for it automatically:
```markdown
# people/sarah-chen

## Facts
- **CEO_of** [River AI](companies/river-ai): appointed June 2024
```
The knowledge graph grows organically as you add information. No manual tagging, no predefined ontologies.
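Rendering a stub like the one above is plain string assembly once the LLM has emitted the relations. A sketch, where the `Relation` shape is an assumption about what the detection step returns:

```typescript
// Sketch of stub-page rendering — the Relation shape is an assumption.
interface Relation {
  predicate: string;   // e.g. "CEO_of"
  targetName: string;  // display text for the link
  targetPath: string;  // page path, e.g. "companies/river-ai"
  note?: string;       // optional context, e.g. "appointed June 2024"
}

function renderStub(title: string, relations: Relation[]): string {
  const facts = relations.map(
    (r) => `- **${r.predicate}** [${r.targetName}](${r.targetPath})` + (r.note ? `: ${r.note}` : "")
  );
  return [`# ${title}`, "", "## Facts", ...facts].join("\n");
}
```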
Single-mode search breaks down fast in a knowledge base. Full-text search is precise but misses semantics — search "funding" and you won't find "financing round." Vector search understands meaning but can be noisy — search "Sequoia" and you might get results about trees.
ex-brain uses seekdb as its search and storage layer. seekdb is an AI-native database that unifies vector search, full-text search, and scalar filtering in a single engine. One query combines BM25 keyword matching with vector similarity — no need to stitch two retrieval systems together.
```bash
# Keyword search
ebrain search "River AI Series A"

# Semantic query
ebrain query "Which companies raised funding recently?"
```
Under the hood, seekdb supports multi-stage retrieval: vector and full-text indexes recall candidates independently, then results are fused via weighted combination or Reciprocal Rank Fusion (RRF), with optional LLM-based reranking for precision.
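RRF itself is simple: each candidate scores 1 / (k + rank) for every ranked list it appears in, with k conventionally set to 60. A generic sketch of the fusion step (not seekdb's internal code):

```typescript
// Generic Reciprocal Rank Fusion — not seekdb's internal implementation.
// Each candidate scores 1 / (k + rank) per list; k = 60 is the usual constant.
function rrfFuse(rankings: string[][], k = 60): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      // rank is 0-based here, so the first result contributes 1 / (k + 1).
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Because only ranks matter, RRF sidesteps the problem that BM25 scores and cosine similarities live on incompatible scales.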
ex-brain adds its own scoring layer on top of these fused results.
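One plausible shape for such a layer — purely illustrative, not ex-brain's actual formula; the weights and half-life here are made up — is a weighted combination of the two retrieval signals with an exponential recency decay:

```typescript
// Purely illustrative scoring — not ex-brain's actual formula.
interface Hit {
  id: string;
  bm25: number;    // normalized keyword score, 0..1
  vector: number;  // normalized similarity score, 0..1
  ageDays: number; // days since the page was last compiled
}

function score(hit: Hit, wKeyword = 0.4, wVector = 0.6, halfLifeDays = 180): number {
  // Exponential decay: a page loses half its weight every `halfLifeDays`.
  const recency = Math.pow(0.5, hit.ageDays / halfLifeDays);
  return (wKeyword * hit.bm25 + wVector * hit.vector) * recency;
}
```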
Several properties made seekdb the right fit for this project:
- **Embedded mode, zero ops.** seekdb runs as a single database file — no server process, no Docker container. For a local-first personal tool, this is the lightest possible deployment. It runs comfortably on 1 CPU core and 2 GB of memory.
- **Native hybrid search.** Vector search (HNSW, IVF, and quantized variants), full-text search (BM25 with phrase and boolean matching), and scalar filtering — all in one engine with multi-stage ranking pipelines.
- **Built-in AI functions.** AI_EMBED generates vector embeddings in SQL. AI_COMPLETE runs text generation. AI_RERANK applies reranking models. These work with OpenAI, DashScope, or custom model endpoints. Embedding, retrieval, and inference happen inside the database — no external pipeline needed.
- **SQL-compatible.** seekdb is built on the OceanBase engine and speaks MySQL-compatible SQL. Standard CREATE TABLE, CREATE INDEX, and query syntax. Full ACID transactions with real-time write visibility.
- **Multi-model data.** Vectors, text, scalars, JSON, and GIS data coexist in the same engine. ex-brain stores structured metadata (page properties, entity links) and unstructured content (text, embeddings) in one database.
Here's the core integration code:
```typescript
// Connect — it's just a file path
const db = await BrainDb.connect("~/.ebrain/data/ebrain.db");

// Create a vector collection
const pages = await db.getOrCreateCollection({
  name: "ebrain_pages",
  embeddingFunction: createBrainEmbeddingFunction(settings.embed),
});

// Hybrid search
const hits = await pages.hybridSearch({
  query: { whereDocument: { $contains: "funding" } },
  nResults: 10,
});
```
ex-brain ships with a built-in MCP server. If you use Claude, connect it in one step:
```json
{
  "mcpServers": {
    "ebrain": {
      "command": "ebrain",
      "args": ["serve"]
    }
  }
}
```
Claude can then read pages (brain_get), write pages (brain_put), search (brain_search), compile new information (brain_compile), and create links (brain_link) — directly against your local knowledge base.
```bash
# Install
bun install -g ex-brain

# Initialize
ebrain init

# Create your first page
ebrain put companies/river-ai --type company --content "
River AI is an AI analytics platform.
Founded 2020.
"

# Compile new information
ebrain compile companies/river-ai \
  "River AI closed Series A, Sequoia led" \
  --source news \
  --date 2024-05-20

# Search
ebrain search "River AI funding"

# Start MCP server
ebrain serve
```
ex-brain is early-stage. The compilation logic isn't perfect, timeline extraction occasionally misses events, and entity detection produces false positives. But the core idea works: knowledge should update itself when new information arrives, not just accumulate.
A few directions worth exploring: conflict detection when new information contradicts existing records, confidence decay for stale data, bidirectional propagation when linked entities change, and batch compilation for high-volume ingestion.
If you're interested in building knowledge tools — or if you just want a second brain that actually keeps up — check out ex-brain.
About seekdb
ex-brain's storage and retrieval layer is powered by seekdb — an open-source, AI-native database that unifies vector search, full-text search, structured data, and built-in AI functions in a single engine. Whether you're building RAG pipelines, semantic search, or AI agent applications, seekdb handles storage and retrieval without the need to stitch together multiple systems.
If you're building an application that needs storage + semantic search + AI inference, give seekdb a try:
```bash
pip install -U pyseekdb
```