
From Neurons to Code: The Forgetting Design Behind PowerMem

Qiu Fan

Published on June 10, 2026. Updated on July 27, 2026.

Key Takeaways

PowerMem treats forgetting as a first-class capability — not a bug — using a three-tier memory model (working, short_term, long_term) backed by Ebbinghaus-style exponential decay.
Decay-rate multipliers differ by tier (×2.0 / ×1.5 / ×1.0), so unimportant memories fade quickly while frequently accessed ones are promoted and stabilized — directly mirroring synaptic plasticity and memory consolidation.
Retrieval ranking combines semantic similarity with a decay factor (final_score = relevance × decay), turning forgetting into a quality regulator rather than a delete switch.

If every memory carries equal weight at retrieval time, two problems compound:

Retrieval quality decays. New and old memories interfere with each other in the embedding space. As the corpus grows, the signal-to-noise ratio of any query drops.
Storage costs spiral. Most low-value content is never retrieved, yet it keeps consuming space, index time, and embedding budget forever.

PowerMem's forgetting mechanism decides two things: when a memory dies, and how much weight it carries during retrieval. Before walking through the code in a follow-up post, it is worth tracing the cognitive-science principles the system is modeled on.

How Nature Forgets

Synaptic Plasticity

The biological substrate of memory is the synaptic connection between neurons. Those connections are anything but static — two opposing mechanisms continuously modulate them:

Long-Term Potentiation (LTP) — frequently used pathways are strengthened. This is the basis of remembering.
Long-Term Depression (LTD) — rarely used pathways are weakened. This is the basis of forgetting.

LTP and LTD are partners, not adversaries. If every synapse were strengthened equally, the brain would lose its ability to distinguish signal from noise. LTD selectively weakens inactive connections so that limited synaptic resources concentrate on the active pathways. Forgetting is the price memory pays for discrimination.

From Hippocampus to Neocortex

A new memory is first held in the hippocampus, which is high-throughput and low-capacity, much like RAM. During sleep, the brain replays these traces and gradually transfers selected ones to the neocortex for long-term storage.

The transfer is selective. Only memories that are repeatedly activated, richly associated with prior knowledge, or marked by strong emotion are prioritized. Isolated, single-occurrence, emotionally neutral information falls off during the move. Nature performs filtering automatically during consolidation, and this is the direct biological blueprint for PowerMem's three-tier model: working → short_term → long_term.

Forgetting Is a Retrieval Problem

Cognitive psychology adds another lens: interference theory. Forgetting is often not about information being erased, but about it becoming unretrievable. Two forms are distinguished:

Proactive interference: old memories disrupt the recall of new ones (you keep typing your old phone number).
Retroactive interference: new memories disrupt the recall of old ones (learning Spanish makes Italian vocabulary slip).

The hard problem is not writing — it is reading under interference. As the store grows, cross-memory interference rises super-linearly. Decaying low-value entries reduces interference density and restores retrieval precision.

A Shannon-Information View

Claude Shannon's 1948 definition of information quantifies surprise:

I(x) = -log₂(p(x))

The information content of an event is inversely related to its probability — common events carry little information; rare events carry a lot.

Mapped onto a memory system this gives a natural rule. "What I had for breakfast yesterday" (happens daily, p ≈ 1, I ≈ 0) is not worth long-term storage. "The master password for our production database" (almost never asked, tiny p, huge I) must be persisted.
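The rule above can be made concrete with a few lines of Python. This is a minimal sketch of Shannon's formula only; the probabilities are hypothetical values picked to illustrate the two extremes, not numbers PowerMem computes.

```python
import math


def self_information(p: float) -> float:
    """Shannon self-information I(x) = -log2(p(x)), measured in bits."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -math.log2(p)


# Hypothetical probabilities, chosen only for illustration.
breakfast_bits = self_information(0.99)     # near-certain event: close to 0 bits
password_bits = self_information(1 / 1024)  # rare event: 10 bits

print(f"routine event: {breakfast_bits:.3f} bits")
print(f"rare event:    {password_bits:.1f} bits")
```

The gap between the two values is the point: the rarer the event, the more bits it carries, and the stronger the case for keeping it.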

A well-designed forgetting mechanism is therefore an information filter: high-information content (rare but critical) is retained, low-information content (frequent but trivial) is decayed and evicted, and everything in between is interpolated smoothly. PowerMem's tiered architecture implements this filter; the forgetting curve gives it a time-varying weight, so classification keeps evolving instead of being decided once at write time.

The Ebbinghaus Forgetting Curve

Memory Becomes Measurable

In 1885, Hermann Ebbinghaus turned memory research from philosophy into laboratory science. Using roughly 2,300 nonsense syllables to avoid prior-knowledge bias, he ran a strict protocol on himself:

Learn a 13-syllable list until two consecutive error-free recitations.
Wait one of seven intervals: 20 minutes, 1 hour, 9 hours, 1 day, 2 days, 6 days, or 31 days.
Re-learn using the savings method — measure how much faster than the first time.

The retention data:

Interval	Retention
Immediately after	100%
20 minutes	~58%
1 hour	~44%
9 hours	~36%
1 day	~33%
2 days	~28%
6 days	~25%
31 days	~21%

Two conclusions, still standing more than a century later:

Forgetting is exponential, not linear — about 40% lost in the first 20 minutes, more than half within an hour, then a long slow tail.
Spaced review rewrites the curve — repeated reviews at the right interval slow subsequent decay.

From the Original Fit to Modern Exponential Decay

Ebbinghaus's original fit was logarithmic:

b = 100k / ((log t)^c + k)

with b the savings percentage, t the time in minutes, and constants k ≈ 1.84, c ≈ 1.25.
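As a quick sanity check, the fit can be evaluated at the seven intervals from the retention table. The sketch below assumes the logarithm is base 10 (the text does not state the base) and reuses the rounded retention figures quoted earlier; the function name is illustrative.

```python
import math

K, C = 1.84, 1.25  # constants quoted in the text


def savings_percent(t_minutes: float) -> float:
    """Ebbinghaus's fit b = 100k / ((log t)^c + k), assuming log base 10."""
    if t_minutes < 1:
        raise ValueError("the formula is only meaningful for t >= 1 minute")
    return 100 * K / (math.log10(t_minutes) ** C + K)


# Interval in minutes -> rounded retention (%) from the table above.
OBSERVED = {20: 58, 60: 44, 540: 36, 1440: 33, 2880: 28, 8640: 25, 44640: 21}

for minutes, observed in OBSERVED.items():
    fitted = savings_percent(minutes)
    print(f"{minutes:>6} min  fit = {fitted:5.1f}%  observed ~ {observed}%")
```

Under these assumptions the fitted values stay within a few percentage points of the tabulated ones across the whole range, from 20 minutes to 31 days.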

Later work often replaces this with a simpler exponential model, which captures the same fast-then-slow shape and is now the standard form:

R(t) = e^(-λt)

R(t) — retention at time t, the fraction of the original information still recallable, in [0, 1].
e — the natural constant (≈ 2.71828), the mathematical base for any continuous, smooth exponential process.
λ (lambda) — the decay rate. Larger λ → faster forgetting (steeper curve). Smaller λ → more durable memory (flatter curve).
t — elapsed time since the memory was formed, typically in hours.

The graph is a fast-then-slow curve. Most of the loss happens early; whatever survives the early window is far more stable, simply because there is not much left to forget. This equation is the mathematical foundation of PowerMem's forgetting mechanism.
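A minimal sketch of the exponential model, with t in hours as in the definitions above. The decay rate of 0.1 per hour is a hypothetical value for illustration, not a PowerMem default.

```python
import math


def retention(t_hours: float, decay_rate: float) -> float:
    """R(t) = e^(-lambda * t): fraction still recallable after t hours."""
    if t_hours < 0 or decay_rate <= 0:
        raise ValueError("t_hours must be >= 0 and decay_rate must be > 0")
    return math.exp(-decay_rate * t_hours)


def half_life(decay_rate: float) -> float:
    """Time at which retention reaches 0.5, i.e. ln(2) / lambda."""
    return math.log(2) / decay_rate


LAMBDA = 0.1  # hypothetical decay rate per hour

for hours in (0, 1, 6, 24):
    print(f"t = {hours:>2} h  R = {retention(hours, LAMBDA):.3f}")
print(f"half-life = {half_life(LAMBDA):.2f} h")
```

The half-life ln(2)/λ is a convenient way to reason about a decay rate: doubling λ halves the time it takes for a memory to drop to 50% retention.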

Why Exponential Is the Right Functional Form

The defining feature of forgetting is that the rate of forgetting is proportional to what remains. The differential statement is dR/dt = -λR (rate of change proportional to current state), and with the initial condition R(0) = 1 its unique solution is exactly R(t) = e^(-λt).
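For readers who want the intermediate steps, the solution follows by separating variables. The only assumption added here is the initial condition R(0) = 1, meaning full retention at the moment the memory is formed.

```latex
\begin{aligned}
\frac{dR}{dt} &= -\lambda R \\
\frac{dR}{R} &= -\lambda \, dt \\
\ln R(t) &= -\lambda t + C \\
R(t) &= R(0)\, e^{-\lambda t} = e^{-\lambda t} \quad \text{with } R(0) = 1
\end{aligned}
```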

Newton's law of cooling, radioactive decay, capacitor discharge — apparently unrelated phenomena that share the same equation because they share the same self-consistent relationship between rate and state. Memory decay is no exception. Modern spaced-repetition systems (SuperMemo, Anki, PowerMem) converge on exponential decay because it offers the best balance between simplicity, computability, and empirical fit.

Spaced Repetition and Desirable Difficulty

Ebbinghaus also discovered that spaced repetition resets the curve, and each reset slows the next decay. Neuroscience explains why through memory reconsolidation: when a consolidated memory is actively retrieved, it briefly returns to a plastic state, and the brain re-stabilizes it through a fresh round of protein synthesis and synaptic reinforcement.

Reconsolidation needs time. Cramming ten repetitions into five minutes does not allow protein synthesis and synaptic remodeling to complete — the biological reason rote cramming is inefficient. Wait too long, however, and the trace has already decayed below retrieval threshold, leaving nothing to reconsolidate. Robert Bjork (UCLA, 1994) crystallized this into the concept of desirable difficulty: the most efficient learning happens when retrieval is just hard enough to trigger adaptation. This principle drives PowerMem's review-scheduling logic.
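One way to picture the scheduling idea is to review a memory when its retention has fallen to a target level, then let each successful review slow the next decay. The sketch below is illustrative only and is not PowerMem's scheduler: the target retention of 0.8, the slowdown factor of 0.5, and both function names are assumptions.

```python
import math


def next_review_delay(decay_rate: float, target_retention: float = 0.8) -> float:
    """Hours until R(t) = e^(-lambda * t) falls to target_retention."""
    if decay_rate <= 0 or not 0.0 < target_retention < 1.0:
        raise ValueError("need decay_rate > 0 and 0 < target_retention < 1")
    return -math.log(target_retention) / decay_rate


def review_schedule(decay_rate: float, reviews: int,
                    slowdown: float = 0.5,
                    target_retention: float = 0.8) -> list[float]:
    """Delays between successive reviews, in hours.

    Hypothetical rule: each review multiplies the decay rate by `slowdown`,
    so the memory fades more slowly and the next interval grows.
    """
    delays = []
    for _ in range(reviews):
        delays.append(next_review_delay(decay_rate, target_retention))
        decay_rate *= slowdown
    return delays


schedule = review_schedule(decay_rate=0.1, reviews=4)
print([round(hours, 2) for hours in schedule])
```

With a slowdown of 0.5 each interval is twice the previous one, which reproduces the widening gaps typical of spaced repetition: early enough that the trace is still retrievable, late enough that retrieval takes effort.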

PowerMem's Three-Tier Memory Architecture

This is where the biology, the information theory, and the math all land in code. PowerMem is not the first system to talk about "memory tiers" — but the way it makes forgetting a tunable parameter at every layer is what makes the design worth examining in detail.

From Biology to Code

The cognitive-science principles above translate into three engineering tiers:

Tier	Biological analogue	Decay-rate multiplier	Typical lifetime	Promotion condition
working	Prefrontal cortex	×2.0	hours – 1 day	access ≥ 3 or importance ≥ 0.6
short_term	Hippocampus	×1.5	days – weeks	access ≥ 3 or importance ≥ 0.6
long_term	Neocortex	×1.0	weeks – months	— (already at the top)

Classification is driven by an importance score:

importance ≥ 0.8  →  long_term
importance ≥ 0.6  →  short_term
importance < 0.6  →  working

The decay-rate multiplier is the key differentiating parameter. Over the same 24-hour window, a working memory decays at twice the rate of a long_term one. Importance directly controls expected lifespan: unimportant content disappears quickly, freeing retrieval space for the things that actually matter.
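The classification thresholds and multipliers can be sketched as follows. This is not PowerMem's implementation: it assumes the multiplier scales λ in R(t) = e^(-λt), and the base rate of 0.05 per hour and the function names are hypothetical.

```python
import math

# Multipliers from the tier table above.
TIER_MULTIPLIER = {"working": 2.0, "short_term": 1.5, "long_term": 1.0}


def classify(importance: float) -> str:
    """Map an importance score in [0, 1] to a tier, using the stated thresholds."""
    if importance >= 0.8:
        return "long_term"
    if importance >= 0.6:
        return "short_term"
    return "working"


def decay_factor(tier: str, hours: float, base_rate: float = 0.05) -> float:
    """Retention after `hours`, assuming the tier multiplier scales the decay rate.

    base_rate is a hypothetical per-hour value chosen for illustration.
    """
    return math.exp(-base_rate * TIER_MULTIPLIER[tier] * hours)


for importance in (0.3, 0.7, 0.9):
    tier = classify(importance)
    print(f"importance {importance} -> {tier:<10} "
          f"retention after 24 h = {decay_factor(tier, 24):.3f}")
```

Because the multiplier sits in the exponent, a working memory's retention after any interval is the square of a long_term memory's retention over the same interval, so the gap between tiers widens quickly with time.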

Global Architecture of the Forgetting Subsystem

PowerMem's forgetting subsystem has four cooperating components, arranged along the lifecycle of a memory entry:

New input → ImportanceEvaluator
         → EbbinghausAlgorithm
         → EbbinghausIntelligencePlugin
            ├─ on_add():    inject decay parameters at creation
            ├─ on_get():    check decay / promotion / archival on access
            └─ on_search(): batch-process lifecycle during search
         → MemoryOptimizer
            ├─ exact dedup (MD5 hash)
            ├─ semantic dedup (cosine similarity)
            └─ memory compression (LLM summarization)

ImportanceEvaluator — judges how important a piece of information is and outputs a 0.0–1.0 score.
EbbinghausAlgorithm — pure-math layer providing decay computation, review scheduling, and the forget / promote / archive decisions.
EbbinghausIntelligencePlugin — injects management logic at the key lifecycle hooks: creation, access, and search.
MemoryOptimizer — periodic global pass that performs deduplication and compression.
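The two deduplication passes listed for MemoryOptimizer can be illustrated with the standard library alone. The snippet is a sketch of the general technique, not PowerMem's code: the function names, the whitespace and case normalization before hashing, and the 0.95 similarity threshold are all assumptions, and the two-dimensional vectors stand in for real embeddings.

```python
import hashlib
import math


def content_hash(text: str) -> str:
    """MD5 of the normalized text; normalization here is an assumption."""
    return hashlib.md5(text.strip().lower().encode("utf-8")).hexdigest()


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(memories: list[tuple[str, list[float]]],
                threshold: float = 0.95) -> list[tuple[str, list[float]]]:
    """Keep the first occurrence; drop exact and near-duplicate entries."""
    kept: list[tuple[str, list[float]]] = []
    seen_hashes: set[str] = set()
    for text, vector in memories:
        digest = content_hash(text)
        if digest in seen_hashes:
            continue  # exact duplicate
        if any(cosine(vector, other) >= threshold for _, other in kept):
            continue  # semantic duplicate
        seen_hashes.add(digest)
        kept.append((text, vector))
    return kept


memories = [
    ("Alice likes tea", [1.0, 0.0]),
    ("alice likes tea ", [0.0, 1.0]),    # same text after normalization
    ("Alice enjoys tea", [0.99, 0.05]),  # nearly the same direction
    ("Bob owns a dog", [0.0, 1.0]),
]
print([text for text, _ in deduplicate(memories)])
```

Running the hash check first is the cheap path: it removes literal repeats in constant time per entry, so the pairwise similarity comparison only runs on entries that are textually distinct.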

Forgetting Is More Than Deletion

In retrieval, the forgetting mechanism plays an equally critical role as a ranking signal. Search results are ordered by:

final_score = relevance_score × decay_factor

relevance_score — semantic match (vector similarity).
decay_factor — temporal freshness (the exponential decay value).

These two parameters jointly determine the final ranking, which makes non-trivial cross-rankings possible. The numbers below are illustrative; the actual decay factor depends on the configured decay_rate:

Memory	Relevance	Decay factor	final_score	Rank
Meeting notes 3 hours ago, highly relevant	0.92	0.62	0.57	1
Meeting notes 10 days ago, perfect semantic match	0.98	0.02	0.02	3
Idle chat 1 minute ago, moderate match	0.45	0.99	0.45	2

Forgetting is not a simple delete switch — it is a quality regulator for retrieval. It guarantees that the result respects both the content match dimension and the time freshness dimension simultaneously.
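The scoring rule is simple enough to sketch directly. The example below applies final_score = relevance_score × decay_factor to the three illustrative rows from the table; the Candidate class and rank function are hypothetical names, not part of PowerMem's API.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    label: str
    relevance: float  # semantic match (vector similarity)
    decay: float      # temporal freshness (exponential decay value)

    @property
    def final_score(self) -> float:
        return self.relevance * self.decay


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates by final_score, highest first."""
    return sorted(candidates, key=lambda c: c.final_score, reverse=True)


# Illustrative values taken from the table above.
CANDIDATES = [
    Candidate("meeting notes, 3 hours ago", 0.92, 0.62),
    Candidate("meeting notes, 10 days ago", 0.98, 0.02),
    Candidate("idle chat, 1 minute ago", 0.45, 0.99),
]

for position, c in enumerate(rank(CANDIDATES), start=1):
    print(f"{position}. {c.label:<28} final_score = {c.final_score:.2f}")
```

Multiplying the two signals means neither can carry a result alone: the ten-day-old entry has the best semantic match but ends up with the lowest score, because its decay factor has collapsed.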

Why Forgetting Matters

Pulling the threads together:

Forgetting is the foundation of ranking. Decay manufactures a second axis beyond semantic similarity, so otherwise-equivalent matches can be separated cleanly.
Forgetting lets memory evolve. Frequently accessed entries are promoted and assigned lower decay rates; repeated use stabilizes what is genuinely useful, exactly as reconsolidation does in the brain.
Forgetting is continuous, not binary. A smooth 1.0 → 0.0 spectrum mimics how human memory actually fades, and leaves room for future features — soft deletes, memory revival, tiered archival — without breaking the model.

Nature designed it this way. PowerMem translates that design into code you can configure, tune, and reason about.

The next post follows a single piece of information through the full PowerMem pipeline — importance evaluation → tier assignment → decay → access trigger → promotion or forgetting → global optimization — to see exactly how the theory becomes runtime behavior.

PowerMem on GitHub: https://github.com/oceanbase/powermem

If you find PowerMem helpful, please give it a ⭐ on GitHub. It would be a great help to the project!

Based on PowerMem v1.1.1. All code references come from the actual project files.
