PowerMem organizes memories into three tiers (working, short_term, long_term) backed by Ebbinghaus-style exponential decay.
Retrieval ranks results by a combined score (final_score = relevance × decay), turning forgetting into a quality regulator rather than a delete switch.
If every memory carries equal weight at retrieval time, two problems compound:
PowerMem's forgetting mechanism decides two things: when a memory dies, and how much weight it carries during retrieval. Before walking through the code in a follow-up post, it is worth tracing the cognitive-science principles the system is modeled on.
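The biological substrate of memory is the synaptic connection between neurons. Those connections are anything but static: two opposing mechanisms continuously modulate them. Long-term potentiation (LTP) strengthens synapses that are activated repeatedly, and long-term depression (LTD) weakens those that are not.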
LTP and LTD are partners, not adversaries. If every synapse were strengthened equally, the brain would lose its ability to distinguish signal from noise. LTD selectively weakens inactive connections so that limited synaptic resources concentrate on the active pathways. Forgetting is the price memory pays for discrimination.
A new memory is first held in the hippocampus, which is high-throughput and low-capacity, much like RAM. During sleep, the brain replays these traces and gradually transfers selected ones to the neocortex for long-term storage.
The transfer is selective. Only memories that are repeatedly activated, richly associated with prior knowledge, or marked by strong emotion are prioritized. Isolated, single-occurrence, emotionally neutral information falls off during the move. Nature performs filtering automatically during consolidation, and this is the direct biological blueprint for PowerMem's three-tier model: working → short_term → long_term.
Cognitive psychology adds another lens: interference theory. Forgetting is often not about information being erased, but about it being un-retrievable. Proactive interference — old memories disrupt the recall of new ones (you keep typing your old phone number). Retroactive interference — new memories disrupt the recall of old ones (learning Spanish makes Italian vocabulary slip).
The hard problem is not writing — it is reading under interference. As the store grows, cross-memory interference rises super-linearly. Decaying low-value entries reduces interference density and restores retrieval precision.
Claude Shannon's 1948 definition of information quantifies surprise:
I(x) = -log₂(p(x))
The information content of an event is inversely related to its probability: common events carry little information; rare events carry a lot.
Mapped onto a memory system this gives a natural rule. "What I had for breakfast yesterday" (happens daily, p ≈ 1, I ≈ 0) is not worth long-term storage. "The master password for our production database" (almost never asked, tiny p, huge I) must be persisted.
A well-designed forgetting mechanism is therefore an information filter: high-information content (rare but critical) is retained, low-information content (frequent but trivial) is decayed and evicted, and everything in between is interpolated smoothly. PowerMem's tiered architecture implements this filter; the forgetting curve gives it a time-varying weight, so classification keeps evolving instead of being decided once at write time.
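The surprise measure above is easy to compute directly. The following is a minimal sketch: the function name and both probabilities are illustrative assumptions, chosen only to show the contrast between a routine event and a rare one, and are not taken from PowerMem.

```python
import math


def self_information(p: float) -> float:
    """Shannon self-information in bits: I(x) = -log2(p(x))."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


# Hypothetical probabilities, used only for illustration.
breakfast_bits = self_information(0.99)     # near-certain, routine event
password_bits = self_information(1 / 4096)  # rare event

print(f"breakfast: {breakfast_bits:.3f} bits")
print(f"password:  {password_bits:.1f} bits")
```

The routine event carries a small fraction of a bit, while the rare one carries many bits, which is the gap a forgetting policy can exploit.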
In 1885, Hermann Ebbinghaus turned memory research from philosophy into laboratory science. Using roughly 2,300 nonsense syllables to avoid prior-knowledge bias, he ran a strict protocol on himself:
The retention data:
| Interval | Retention |
| Immediately after | 100% |
| 20 minutes | ~58% |
| 1 hour | ~44% |
| 9 hours | ~36% |
| 1 day | ~33% |
| 2 days | ~28% |
| 6 days | ~25% |
| 31 days | ~21% |
Two conclusions, still standing more than a century later:
Ebbinghaus's original fit was logarithmic:
b = 100k / ((log t)^c + k)
with b the savings percentage, t the time in minutes, and constants k ≈ 1.84, c ≈ 1.25.
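To get a feel for the logarithmic fit, it can be evaluated at a few of the intervals from the retention table. This is a sketch under one assumption the text does not state: that log is the base-10 logarithm. The function name is illustrative.

```python
import math

K = 1.84  # constant k from the fit above
C = 1.25  # constant c from the fit above


def savings_percent(t_minutes: float) -> float:
    """Savings b = 100k / ((log t)^c + k), with t in minutes.

    Assumes log is base 10 and t > 1, so that log t is positive.
    """
    if t_minutes <= 1:
        raise ValueError("t_minutes must be greater than 1")
    return 100 * K / (math.log10(t_minutes) ** C + K)


intervals = [("20 minutes", 20), ("1 day", 24 * 60), ("31 days", 31 * 24 * 60)]
for label, minutes in intervals:
    print(f"{label:>10}: {savings_percent(minutes):.1f}%")
```

Under that assumption the computed values fall within a few percentage points of the measured retention at the same intervals.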
Later work showed that a simpler exponential model approximates the data just as well, and it is now the standard form:
R(t) = e^(-λt)
where R(t) is the retention at time t, the fraction of the original information still recallable, in [0, 1], and λ is the decay rate.
The graph is a fast-then-slow curve. Most of the loss happens early; whatever survives the early window is far more stable, simply because there's not much left to forget. These equations are the mathematical foundation of PowerMem's forgetting mechanism.
The defining feature of forgetting is that the rate of forgetting is proportional to what remains. The differential statement is dR/dt = -λR (rate of change proportional to current state), and with the initial condition R(0) = 1 its unique solution is exactly R(t) = e^(-λt).
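The link between the differential statement and the exponential can be checked numerically. The sketch below integrates dR/dt = -λR with small Euler steps, starting from R(0) = 1, and compares the result with the closed form. The function names, the value of λ, and the step count are illustrative choices.

```python
import math


def retention(t: float, lam: float) -> float:
    """Closed-form retention R(t) = e^(-lambda * t)."""
    return math.exp(-lam * t)


def euler_retention(t: float, lam: float, steps: int = 100_000) -> float:
    """Integrate dR/dt = -lambda * R from R(0) = 1 using Euler steps."""
    dt = t / steps
    r = 1.0
    for _ in range(steps):
        r += -lam * r * dt  # the loss in each step is proportional to what remains
    return r


lam = 0.3  # illustrative decay rate
print(retention(5.0, lam), euler_retention(5.0, lam))

# The half-life follows directly from the closed form: t_half = ln(2) / lambda.
half_life = math.log(2) / lam
print(half_life, retention(half_life, lam))
```

The two retention values agree to several decimal places, and retention at the half-life comes out at one half.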
Newton's law of cooling, radioactive decay, capacitor discharge — apparently unrelated phenomena that share the same equation because they share the same self-consistent relationship between rate and state. Memory decay is no exception. Modern spaced-repetition systems (SuperMemo, Anki, PowerMem) converge on exponential decay because it offers the best balance between simplicity, computability, and empirical fit.
Ebbinghaus also discovered that spaced repetition resets the curve, and each reset slows the next decay. Neuroscience explains why through memory reconsolidation: when a consolidated memory is actively retrieved, it briefly returns to a plastic state, and the brain re-stabilizes it through a fresh round of protein synthesis and synaptic reinforcement.
Reconsolidation needs time. Cramming ten repetitions into five minutes does not allow protein synthesis and synaptic remodeling to complete — the biological reason rote cramming is inefficient. Wait too long, however, and the trace has already decayed below retrieval threshold, leaving nothing to reconsolidate. Robert Bjork (UCLA, 1994) crystallized this into the concept of desirable difficulty: the most efficient learning happens when retrieval is just hard enough to trigger adaptation. This principle drives PowerMem's review-scheduling logic.
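One way to picture "each reset slows the next decay" is a toy model in which every review restarts the curve and multiplies the decay rate by a factor below 1. The function, the initial rate, and the slowdown factor are all illustrative assumptions; this is not PowerMem's review-scheduling logic.

```python
import math


def retention_with_reviews(t, review_times, lam=0.5, slowdown=0.6):
    """Retention at time t when each past review resets the curve.

    Toy model: every review at or before t restarts retention at 1.0
    and multiplies the decay rate by `slowdown` (hypothetical values).
    """
    last_review = 0.0
    rate = lam
    for review in sorted(review_times):
        if review > t:
            break
        last_review = review
        rate *= slowdown
    return math.exp(-rate * (t - last_review))


no_reviews = retention_with_reviews(10, [])
two_reviews = retention_with_reviews(10, [2, 5])
print(f"no reviews:  {no_reviews:.4f}")
print(f"two reviews: {two_reviews:.4f}")
```

With the same elapsed time, the reviewed memory retains far more than the unreviewed one, because the clock restarted at the last review and the rate has been reduced twice.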
This is where the biology, the information theory, and the math all land in code. PowerMem is not the first system to talk about "memory tiers" — but the way it makes forgetting a tunable parameter at every layer is what makes the design worth examining in detail.
The cognitive-science principles above translate into three engineering tiers:
| Tier | Biological analogue | Decay-rate multiplier | Typical lifetime | Promotion condition |
| working | Prefrontal cortex | ×2.0 | hours – 1 day | access ≥ 3 or importance ≥ 0.6 |
| short_term | Hippocampus | ×1.5 | days – weeks | access ≥ 3 or importance ≥ 0.6 |
| long_term | Neocortex | ×1.0 | weeks – months | — (already at the top) |
Classification is driven by an importance score:
importance ≥ 0.8 → long_term
importance ≥ 0.6 → short_term
importance < 0.6 → working
The decay-rate multiplier is the key differentiating parameter. Over the same 24-hour window, a working memory decays at twice the rate of a long_term one. Importance directly controls expected lifespan: unimportant content disappears quickly, freeing retrieval space for the things that actually matter.
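The thresholds, multipliers, and promotion condition above can be combined into a few lines. This is a minimal sketch, not PowerMem's actual code: the function names and the per-hour `base_rate` are hypothetical, and only the thresholds and multipliers come from the tier description.

```python
import math

# Decay-rate multipliers and promotion path from the tier table.
DECAY_MULTIPLIER = {"working": 2.0, "short_term": 1.5, "long_term": 1.0}
NEXT_TIER = {"working": "short_term", "short_term": "long_term"}


def classify_tier(importance: float) -> str:
    """Assign the initial tier from the importance score."""
    if importance >= 0.8:
        return "long_term"
    if importance >= 0.6:
        return "short_term"
    return "working"


def should_promote(tier: str, access_count: int, importance: float) -> bool:
    """Promotion condition from the table: access >= 3 or importance >= 0.6."""
    if tier not in NEXT_TIER:  # long_term is already at the top
        return False
    return access_count >= 3 or importance >= 0.6


def decay_factor(hours: float, tier: str, base_rate: float = 0.02) -> float:
    """Exponential decay scaled by the tier multiplier (base_rate is hypothetical)."""
    return math.exp(-base_rate * DECAY_MULTIPLIER[tier] * hours)


for tier in DECAY_MULTIPLIER:
    print(f"{tier:>10}: {decay_factor(24, tier):.3f} after 24 hours")
```

Because the multiplier scales the exponent, a working memory after 24 hours sits where a long_term memory would be after 48.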
PowerMem's forgetting subsystem has four cooperating components, arranged along the lifecycle of a memory entry:
New input → ImportanceEvaluator
→ EbbinghausAlgorithm
→ EbbinghausIntelligencePlugin
├─ on_add(): inject decay parameters at creation
├─ on_get(): check decay / promotion / archival on access
└─ on_search(): batch-process lifecycle during search
→ MemoryOptimizer
├─ exact dedup (MD5 hash)
├─ semantic dedup (cosine similarity)
└─ memory compression (LLM summarization)
In retrieval, the forgetting mechanism plays an equally critical role as a ranking signal. Search results are ordered by:
final_score = relevance_score × decay_factor
relevance_score: semantic match (vector similarity).
decay_factor: temporal freshness (the exponential decay value).
These two parameters jointly determine the final ranking, which makes non-trivial cross-rankings possible. The numbers below are illustrative; the actual decay factor depends on the configured decay_rate:
| Memory | Relevance | Decay factor | final_score | Rank |
| Meeting notes 3 hours ago, highly relevant | 0.92 | 0.62 | 0.57 | 1 |
| Meeting notes 10 days ago, perfect semantic match | 0.98 | 0.02 | 0.02 | 3 |
| Idle chat 1 minute ago, moderate match | 0.45 | 0.99 | 0.45 | 2 |
Forgetting is not a simple delete switch — it is a quality regulator for retrieval. It guarantees that the result respects both the content match dimension and the time freshness dimension simultaneously.
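The ranking rule can be reproduced with the illustrative relevance and decay values from the table. This is a sketch of the formula only: the `Hit` class and `rank` function are hypothetical names, not PowerMem's API.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    label: str
    relevance: float  # semantic match (vector similarity)
    decay: float      # temporal freshness (exponential decay value)

    @property
    def final_score(self) -> float:
        return self.relevance * self.decay


def rank(hits):
    """Order hits by final_score, highest first."""
    return sorted(hits, key=lambda h: h.final_score, reverse=True)


hits = [
    Hit("meeting notes, 3 hours old", 0.92, 0.62),
    Hit("meeting notes, 10 days old", 0.98, 0.02),
    Hit("idle chat, 1 minute old", 0.45, 0.99),
]

for position, hit in enumerate(rank(hits), start=1):
    print(f"{position}. {hit.label}: {hit.final_score:.2f}")
```

Sorted strictly by final_score, the fresh but only moderately relevant chat lands above the stale perfect match, which is the kind of cross-ranking the product of the two factors produces.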
Pulling the threads together:
Nature designed it this way. PowerMem translates that design into code you can configure, tune, and reason about.
The next post follows a single piece of information through the full PowerMem pipeline — importance evaluation → tier assignment → decay → access trigger → promotion or forgetting → global optimization — to see exactly how the theory becomes runtime behavior.
PowerMem on GitHub: https://github.com/oceanbase/powermem
If you find PowerMem helpful, please give it a ⭐ on GitHub. It would be a great help to the project!
Based on PowerMem v1.1.1. All code references come from the actual project files.
