Why do CoALA and Letta disagree on what counts as working memory?
This explores why agent frameworks disagree on what 'working memory' means. A candid note first: the collection doesn't contain notes naming CoALA or Letta specifically, so I can't tell you their exact framings. What it *does* have is the reason any two such systems will draw that line differently: 'working memory' isn't one thing, so where you put the boundary depends on which design axis you privilege. The most direct map here shows that agent working memory decomposes into four components along two axes: dialogue-level (running conversation history, scratchpad) versus turn-level (in-context examples, the current task trajectory) (see 'How should agent memory split across time scales?'). Two frameworks can both say 'working memory' and mean opposite ends of that grid: one counts the persistent dialogue-level buffer, the other counts only the volatile per-turn state. The disagreement is definitional, not empirical.
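To make the boundary problem concrete, here is a minimal Python sketch of the four-component split described above. Everything in it is hypothetical and for illustration only: the `AgentMemory` class, the field names, and the two "framework" selections are my assumptions, not the API or the definitions of CoALA, Letta, or any other framework.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class AgentMemory:
    """Hypothetical container for the four components named in the note."""
    # Dialogue-level components: persist across turns within a session.
    conversation_history: List[str] = field(default_factory=list)
    scratchpad: List[str] = field(default_factory=list)
    # Turn-level components: assembled fresh for each turn.
    examples: List[str] = field(default_factory=list)
    task_trajectory: List[str] = field(default_factory=list)


# Two illustrative (assumed) definitions of "working memory".
DIALOGUE_LEVEL: Tuple[str, ...] = ("conversation_history", "scratchpad")
TURN_LEVEL: Tuple[str, ...] = ("examples", "task_trajectory")


def working_memory(mem: AgentMemory, fields: Tuple[str, ...]) -> Dict[str, List[str]]:
    """Return only the components a given definition counts as working memory."""
    return {name: getattr(mem, name) for name in fields}


mem = AgentMemory(
    conversation_history=["user: hi"],
    scratchpad=["user prefers short answers"],
    examples=["Q: 2+2 A: 4"],
    task_trajectory=["step 1: parse request"],
)

framework_a = working_memory(mem, DIALOGUE_LEVEL)  # counts the persistent buffer
framework_b = working_memory(mem, TURN_LEVEL)      # counts only per-turn state
```

Both calls read the same underlying state, yet the two results share no keys. That is the sense in which the disagreement is about labels and not about what the agent actually stores.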
A second source of divergence is whether you anchor the definition in *architecture* or in *cognitive analogy*. One line in the corpus maps agent memory onto the brain's complementary learning systems: transformer weights as a consolidated neocortex, retrieval (RAG) as hippocampal rapid encoding, and agentic state as prefrontal executive control (see 'Can brain memory systems explain how LLMs should store knowledge?'). If a framework defines working memory by that prefrontal, executive-control analogy, it will scope it narrowly to active task state. If instead it defines it operationally, as 'whatever currently sits in the context window', the scope balloons to include retrieved documents and history. Same term, two reference frames, guaranteed disagreement.
There's also a functional axis: is working memory just the buffer, or is it a *workspace that does work*? Stateful narrative reasoning research shows the payoff of a persistent memory workspace that doesn't merely store but actively detects and resolves contradictions across retrieval cycles, beating stateless multi-step retrieval by up to 11% on complex queries (see 'Can reasoning systems maintain memory across retrieval cycles?'). A framework that treats working memory as this active reasoning workspace will include machinery (reflection, contradiction-checking) that a framework treating it as passive storage would file under something else entirely.
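The passive-versus-active distinction can be sketched in a few lines of Python. This is a toy under stated assumptions, not the method from the cited research: the class names are hypothetical, a "contradiction" is reduced to two different values arriving for the same key, and the "newest evidence wins" resolution policy is my own placeholder.

```python
from typing import Dict, List, Tuple


class PassiveBuffer:
    """Storage only: appends whatever arrives and never inspects it."""

    def __init__(self) -> None:
        self.items: List[Tuple[str, str]] = []

    def add(self, key: str, value: str) -> None:
        self.items.append((key, value))


class ActiveWorkspace:
    """Storage plus work: flags a conflict when new evidence disagrees with old."""

    def __init__(self) -> None:
        self.facts: Dict[str, str] = {}
        self.conflicts: List[Tuple[str, str, str]] = []

    def add(self, key: str, value: str) -> None:
        previous = self.facts.get(key)
        if previous is not None and previous != value:
            # Record (key, old value, new value) so a later step can resolve it.
            self.conflicts.append((key, previous, value))
        # Assumed policy for this sketch: the newest evidence overwrites the old.
        self.facts[key] = value


buffer = PassiveBuffer()
workspace = ActiveWorkspace()
for store in (buffer, workspace):
    store.add("suspect", "Alice")  # evidence from a first retrieval cycle
    store.add("suspect", "Bob")    # conflicting evidence from a later cycle
```

After both cycles the buffer simply holds two entries, while the workspace has logged one conflict. A framework that calls the second object 'working memory' has folded contradiction-checking into the definition; one that calls the first object 'working memory' has to put that logic somewhere else.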
The thing worth carrying away: definitional fights about 'working memory' in agents are downstream of an unresolved design question. Should memory tiers be carved by *time scale* (turn vs. session), by *cognitive analogy* (executive vs. index vs. consolidated store), or by *function* (passive buffer vs. active workspace)? Each carving is defensible, each predicts different failure modes and update policies, and none has won. So two frameworks 'disagreeing' is less a contradiction than two reasonable answers to a question the field hasn't settled. If you want to go deeper on why the tiers don't cleanly integrate, the complementary-learning-systems note flags exactly the missing consolidation mechanisms that keep these definitions from converging (see 'Can brain memory systems explain how LLMs should store knowledge?').
Sources (3 notes)
RAISE shows that agent memory consists of four components organized by two design axes: dialogue-level (conversation history, scratchpad) versus turn-level (examples, task trajectory). This granularity distinction predicts different failure modes and update policies for each component.
Research shows transformer weights function as a distributed neocortex for consolidated knowledge, RAG stores as hippocampal indexing for rapid encoding, and agentic state as prefrontal executive control. The CLS framework predicts why hybrid systems outperform single-tier approaches and identifies missing consolidation mechanisms that prevent memory integration.
ComoRAG demonstrates that iterative evidence acquisition with a persistent memory workspace outperforms stateless multi-step retrieval by detecting and resolving contradictions through deeper exploration, achieving up to 11% gains on complex queries.