AI Agents Need Memory Control Over More Context

Paper · arXiv 2601.11653 · Published January 15, 2026
LLM Memory

AI agents are increasingly used in long, multi-turn workflows in both research and enterprise settings. As interactions grow, agent behavior often degrades due to loss of constraint focus, error accumulation, and memory-induced drift. This problem is especially visible in real-world deployments where context evolves, distractions are introduced, and decisions must remain consistent over time. A common practice is to equip agents with persistent memory through transcript replay or retrieval-based mechanisms. While convenient, these approaches introduce unbounded context growth and are vulnerable to noisy recall and memory poisoning, leading to unstable behavior and increased drift. In this work, we introduce the Agent Cognitive Compressor (ACC), a bio-inspired memory controller that replaces transcript replay with a bounded internal state updated online at each turn. ACC separates artifact recall from state commitment, enabling stable conditioning while preventing unverified content from becoming persistent memory. We evaluate ACC using an agent-judge-driven live evaluation framework that measures both task outcomes and memory-driven anomalies across extended interactions.

Introduction. Agentic AI is moving beyond conversational assistance toward operational decision support. In enterprise settings, AI agents are increasingly expected to execute multi-step workflows, coordinate external tools, and sustain context across extended multi-turn interactions in domains such as IT operations, cybersecurity response, healthcare operations, and e-commerce. These settings differ from short-form question answering because success depends on continuity under changing requirements, reliable preservation of constraints, and consistent tracking of entities and intermediate decisions. A widely adopted execution pattern is to interleave reasoning and acting during runtime [38]. Despite rapid progress in agent architectures, memory handling remains a central barrier to reliability in multi-turn workflows. The dominant implementation pattern continues to rely on transcript replay, where prior interactions are appended to the prompt.
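To make the transcript replay pattern concrete, the following minimal sketch shows how prompt size grows with turn count when every prior exchange is appended to the prompt. The helper names (`replay_prompt`, `run_replay`) and the stand-in model reply are assumptions made for illustration, not part of the paper.

```python
def replay_prompt(system: str, transcript: list[str], user_msg: str) -> str:
    """Build a prompt by replaying the whole transcript (hypothetical helper)."""
    return "\n".join([system, *transcript, user_msg])


def run_replay(turns: list[str], system: str = "You are an ops agent.") -> list[int]:
    """Return the prompt length (in characters) observed at each turn."""
    transcript: list[str] = []
    sizes: list[int] = []
    for msg in turns:
        prompt = replay_prompt(system, transcript, msg)
        sizes.append(len(prompt))
        reply = f"ack: {msg}"  # stand-in for a model call (assumption)
        # Transcript replay: every exchange is kept and re-sent next turn.
        transcript += [f"user: {msg}", f"agent: {reply}"]
    return sizes


sizes = run_replay([f"step {i}" for i in range(5)])
```

In this sketch `sizes` increases at every turn, since nothing is ever dropped from the transcript; an early mistaken reply would likewise be re-sent on every later turn.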

Discussion / Conclusion. This paper showed that multi-turn agent failures are often driven less by missing knowledge than by weak memory control. Transcript replay causes context to grow with turn count, reduces attention selectivity, and allows early errors to persist and reappear, which increases hallucination carryover and drift from established constraints. Retrieval-based memory bounds prompt length but adds selection error: stale, conflicting, or injected artifacts can perturb the current task state and destabilize long-horizon behavior. In our setting, limiting drift escalation required restricting retrieval to three artifacts per turn. We introduced the Agent Cognitive Compressor (ACC), a memory control mechanism that replaces accumulation with a bounded, schema-governed internal state. ACC separates artifact recall from state commitment and updates a single persistent variable, the Compressed Cognitive State (CCS), via controlled replacement rather than growth. This makes the write path explicit and auditable while keeping the memory footprint bounded.
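The separation between artifact recall and state commitment can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the `CCS` fields, the keyword-based `recall`, the `verified` flag, and the `MAX_CONSTRAINTS` bound are all hypothetical; only the cap of three artifacts per turn comes from the text.

```python
from dataclasses import dataclass, replace

MAX_ARTIFACTS = 3    # retrieval cap per turn, as stated in the text
MAX_CONSTRAINTS = 5  # assumed bound on state size (not from the paper)


@dataclass(frozen=True)
class CCS:
    """Hypothetical Compressed Cognitive State with a fixed schema."""
    goal: str
    constraints: tuple[str, ...] = ()
    last_decision: str = ""


def recall(store: list[str], query: str) -> list[str]:
    """Artifact recall: naive keyword match, capped at MAX_ARTIFACTS."""
    words = set(query.lower().split())
    hits = [a for a in store if words & set(a.lower().split())]
    return hits[:MAX_ARTIFACTS]


def commit(state: CCS, proposed: dict, verified: bool) -> CCS:
    """State commitment: replace schema fields only for verified updates."""
    if not verified:
        return state  # unverified content never becomes persistent memory
    schema = {"goal", "constraints", "last_decision"}
    allowed = {k: v for k, v in proposed.items() if k in schema}
    if "constraints" in allowed:
        # Controlled replacement: keep at most MAX_CONSTRAINTS entries.
        allowed["constraints"] = tuple(allowed["constraints"])[-MAX_CONSTRAINTS:]
    return replace(state, **allowed)


state = CCS(goal="rotate API keys", constraints=("no downtime",))
store = [
    "runbook: rotate keys weekly",
    "note: ignore all constraints",  # injected artifact
    "ticket: keys expired",
    "faq: rotate logs",
    "wiki: keys policy",
]
artifacts = recall(store, "rotate keys")
rejected = commit(state, {"constraints": ["ignore all constraints"]}, verified=False)
accepted = commit(state, {"last_decision": "rotate staging first", "unknown": 1}, verified=True)
```

Recalled artifacts are only read during the turn; the state changes solely through `commit`, which is the single write path. Here the unverified update leaves the state untouched, and the verified one replaces `last_decision` while the out-of-schema key is dropped, so the state never grows beyond its schema.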