[Paper] Continual Knowledge Updating in LLM Systems: Learning Through Multi-Timescale Memory Dynamics
Source: arXiv - 2605.05097v1
Overview
The paper “Continual Knowledge Updating in LLM Systems: Learning Through Multi‑Timescale Memory Dynamics” proposes a new way for large language models (LLMs) to keep their knowledge fresh without costly retraining. By borrowing ideas from biological memory, the authors introduce Memini, an external associative memory that updates itself on both fast and slow timescales, allowing LLMs to instantly use new information while still preserving long‑term knowledge.
Key Contributions
- Multi‑timescale memory architecture: Extends the Benna‑Fusi synaptic consolidation model to an external graph‑based memory for LLMs.
- Unified mechanism for episodic sensitivity, consolidation, and forgetting: All three phenomena emerge from the same coupled fast/slow edge dynamics.
- Associative graph representation: Knowledge is stored as a directed graph where each edge holds a pair of fast and slow variables, enabling efficient retrieval and incremental updates.
- Demonstrated continual learning: Experiments show that Memini can incorporate new facts after a single exposure and retain them without catastrophic forgetting.
- Open‑source reference implementation: The authors release code and a set of benchmarks for evaluating continual knowledge updating in LLM pipelines.
Methodology
- Memory Graph Construction – Textual concepts extracted from the LLM’s context become nodes; semantic relations become directed edges.
- Coupled Edge Dynamics – Each edge stores two scalar values: a fast variable (quickly reflects recent exposures) and a slow variable (integrates information over many repetitions). The update rule follows the Benna‑Fusi differential equations:
\[
\begin{aligned}
\dot{f} &= -\frac{f}{\tau_f} + \eta \, \text{signal} \\
\dot{s} &= -\frac{s}{\tau_s} + \frac{f}{\tau_f}
\end{aligned}
\]
where \(f\) and \(s\) are the fast and slow components, \(\tau_f \ll \tau_s\) are their respective time constants, and \(\eta\) is a learning rate.
- Integration with LLM – During inference, the LLM queries the graph for relevant edges; the combined fast+slow weight determines the strength of the retrieved association.
- Training‑free Updates – When the model encounters a new fact (e.g., “The CEO of X is Y”), the corresponding edge’s fast variable is boosted instantly, and repeated mentions gradually shift weight into the slow component.
- Evaluation – The authors test on three continual‑learning scenarios: (a) single‑shot fact insertion, (b) repeated fact reinforcement, and (c) controlled forgetting through lack of exposure. Accuracy on downstream QA and open‑ended generation tasks is measured over time.
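The coupled edge dynamics described in the list above can be sketched numerically. The snippet below is a minimal illustration, not the authors' implementation: it integrates the two equations with a forward-Euler step, and the names `Edge`, `step`, `simulate`, the time constants, the learning rate, and the step size are all assumptions chosen for readability.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    fast: float = 0.0  # f: reacts quickly to recent exposures
    slow: float = 0.0  # s: integrates over many repetitions

    @property
    def weight(self) -> float:
        # Combined fast+slow strength, as used at retrieval time.
        return self.fast + self.slow


def step(edge: Edge, signal: float, *, tau_f: float = 10.0,
         tau_s: float = 1000.0, eta: float = 1.0, dt: float = 1.0) -> None:
    """Advance one edge by a single forward-Euler step (assumed constants)."""
    df = -edge.fast / tau_f + eta * signal
    ds = -edge.slow / tau_s + edge.fast / tau_f
    edge.fast += dt * df
    edge.slow += dt * ds


def simulate(exposure_steps: set[int], n_steps: int) -> Edge:
    """Run n_steps updates; signal is 1.0 on exposure steps, 0.0 otherwise."""
    edge = Edge()
    for t in range(n_steps):
        step(edge, 1.0 if t in exposure_steps else 0.0)
    return edge


# (a) Single-shot insertion: the fast variable responds immediately.
once = simulate({0}, 1)
# (b) Reinforcement: ten mentions leave more in the slow variable than one.
single = simulate({0}, 200)
repeated = simulate(set(range(0, 200, 20)), 200)
# (c) Forgetting: with no further exposure the combined weight fades.
stale = simulate({0}, 5000)

print(once, single.slow, repeated.slow, stale.weight)
```

With these assumed constants, one exposure sets the fast variable at once, repeated exposures accumulate in the slow variable, and an edge left alone for many steps ends up with a combined weight close to zero, mirroring scenarios (a) to (c).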
Results & Findings
| Method | Immediate Recall (after 1 exposure) | Long‑Term Retention (after 10k steps) | Forgetting (no exposure) |
|---|---|---|---|
| Memini | 92 % correct | 84 % correct | Gradual decay → 30 % after 10k steps |
| Baseline static external KV store | 45 % | 45 % | 45 % (no decay) |
| Fine‑tuned LLM (full retrain) | 96 % | 90 % | N/A (retraining required) |
- Fast learning: Memini reaches 92 % recall after a single exposure, close to the 96 % of full model fine‑tuning.
- Consolidation: Repeated mentions push knowledge into the slow component, making it robust to noise and distribution shift.
- Selective forgetting: Unused edges naturally decay, freeing capacity and preventing stale information from contaminating responses.
- Efficiency: Updating the graph costs < 5 ms per fact on a single GPU, orders of magnitude cheaper than a full LLM fine‑tune.
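The gradual decay of unused edges can be read off the edge dynamics given under Methodology. As a worked check, assume the signal is zero from time 0 onward and \(\tau_f \neq \tau_s\); the initial values \(f_0\) and \(s_0\) are labels introduced here for illustration. Solving the two linear equations then gives:

```latex
\begin{aligned}
f(t) &= f_0 \, e^{-t/\tau_f} \\
s(t) &= s_0 \, e^{-t/\tau_s}
      + \frac{f_0 \, \tau_s}{\tau_s - \tau_f}
        \left( e^{-t/\tau_s} - e^{-t/\tau_f} \right)
\end{aligned}
```

For \(t \gg \tau_f\) the \(e^{-t/\tau_f}\) terms vanish, so the combined weight behaves like \(\left(s_0 + \frac{f_0 \tau_s}{\tau_s - \tau_f}\right) e^{-t/\tau_s}\): an edge that is never revisited fades on the slow timescale rather than disappearing abruptly. How this weight maps onto the recall percentages in the table is not specified in this summary.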
Practical Implications
- Dynamic knowledge bases for chatbots – Customer‑support bots can ingest policy updates or product releases on the fly, instantly reflecting them in conversations.
- Edge‑deployed LLMs – Devices with limited compute (e.g., smartphones, IoT) can keep a lightweight Memini graph locally, allowing personalized, up‑to‑date information without sending data to the cloud.
- Regulatory compliance – Companies can purge or modify specific facts (e.g., GDPR “right to be forgotten”) by simply letting the corresponding edges decay, avoiding costly model retraining.
- Continual learning pipelines – Memini can serve as a plug‑and‑play module for any transformer‑based model, turning static inference APIs into adaptive systems.
- Reduced carbon footprint – By sidestepping frequent full‑model retraining, organizations can lower GPU usage and associated emissions.
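To make the plug‑and‑play idea more concrete, the sketch below shows one way retrieved edges could be turned into prompt context for an unmodified LLM. It is an assumption-laden illustration: the `Edge` tuple, the exact-match lookup in `retrieve`, and the prompt layout in `build_prompt` are hypothetical and are not taken from the paper.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    relation: str
    target: str
    fast: float
    slow: float


def retrieve(edges: list[Edge], query_terms: set[str], k: int = 3) -> list[Edge]:
    """Return up to k edges touching a query term, strongest combined weight first."""
    hits = [e for e in edges if e.source in query_terms or e.target in query_terms]
    return sorted(hits, key=lambda e: e.fast + e.slow, reverse=True)[:k]


def build_prompt(question: str, facts: list[Edge]) -> str:
    """Prepend retrieved associations to the question as plain-text context."""
    lines = [f"- {e.source} {e.relation} {e.target}" for e in facts]
    return "Known facts:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


# Hypothetical memory contents: a fresh fact and an older, fading one.
memory = [
    Edge("X", "has CEO", "Y", fast=0.9, slow=0.1),
    Edge("X", "has CEO", "Z", fast=0.0, slow=0.4),
    Edge("Q", "is located in", "R", fast=0.2, slow=0.2),
]

facts = retrieve(memory, {"X"}, k=1)
prompt = build_prompt("Who is the CEO of X?", facts)
print(prompt)
```

Ranking by the combined weight means the recently updated edge wins over the older one without deleting either; the resulting string can be passed to any text-completion API.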
Limitations & Future Work
- Scalability of the graph – While the authors demonstrate up to a few million edges, real‑world enterprise knowledge graphs can be orders of magnitude larger; efficient indexing and pruning strategies are needed.
- Semantic drift – The current edge update rule does not explicitly handle contradictory information; future work could incorporate confidence weighting or conflict resolution.
- Evaluation breadth – Benchmarks focus on factual QA; extending to procedural knowledge, reasoning chains, or multimodal data remains open.
- Integration with retrieval‑augmented generation (RAG) – The paper treats Memini as a stand‑alone memory; exploring hybrid pipelines with dense vector retrievers could yield further gains.
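As a rough illustration of the pruning idea raised in the scalability point, the sketch below drops edges whose combined weight has decayed below a cutoff. The dictionary layout, the `prune` helper, and the threshold value are assumptions made here, and a linear scan like this does not address the indexing problem for very large graphs.

```python
# Hypothetical layout: (source, target) -> (fast, slow)
Graph = dict[tuple[str, str], tuple[float, float]]


def prune(graph: Graph, threshold: float) -> Graph:
    """Keep only edges whose combined fast+slow weight is at least threshold."""
    return {edge: (f, s) for edge, (f, s) in graph.items() if f + s >= threshold}


graph: Graph = {
    ("X", "Y"): (0.6, 0.3),    # recently reinforced
    ("X", "Z"): (0.0, 0.004),  # long unused, almost fully decayed
}

pruned = prune(graph, threshold=0.01)
print(sorted(pruned))
```

Returning a new dictionary leaves the original graph untouched, which keeps the sketch easy to reason about; an in-place variant would save memory at larger scales.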
Overall, Memini offers a biologically inspired, low‑overhead route to keep LLMs relevant in a constantly evolving world, opening a practical path toward truly continual AI systems.
Authors
- Andreas Pattichis
- Constantine Dovrolis
Paper Information
- arXiv ID: 2605.05097v1
- Categories: cs.LG, cs.AI, cs.CL
- Published: May 6, 2026