
Memory and context poisoning: Don't let attackers rewrite your AI agent's memory

Blog post from WorkOS

Post Details
Company: WorkOS
Author: Maria Paktiti
Word Count: 2,135
Language: English
Summary

In December 2025, researchers introduced MemoryGraft, an attack that compromises AI agents by planting malicious entries in their long-term memory through seemingly harmless content. The agent later retrieves these poisoned entries, treats them as records of its own successful experiences, and acts on them, adopting harmful behavior as a result. The MINJA attack, presented at NeurIPS 2025, goes further: an attacker can corrupt an agent's memory through regular interactions alone, without any direct access to the memory store. Memory poisoning differs from prompt injection in two ways: it is temporally decoupled (the malicious content is planted at one time and takes effect at a later one), and agents implicitly trust what they retrieve from memory. Because poisoned entries influence future decisions and can spread across multi-agent systems, detection and mitigation are difficult. The post suggests several defenses: validating content at ingestion, tracking memory provenance, isolating memory by trust scope, setting expiration policies, monitoring for behavioral drift, and establishing incident response processes to trace and quarantine poisoned memories. Together, these defenses help preserve the integrity of an agent's memory and of the decisions built on it.
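
To make the listed defenses more concrete, here is a minimal Python sketch of a memory store that applies four of them: validation at ingestion, provenance tracking, trust-scope isolation, and expiration, plus a quarantine step for incident response. It is an illustration under stated assumptions, not the implementation described in the WorkOS post. The names `MemoryEntry`, `MemoryStore`, and `SUSPICIOUS` are hypothetical, and the regex filter is a toy stand-in for real content validation. Behavioral drift monitoring is not covered.

```python
from __future__ import annotations

import re
import time
from dataclasses import dataclass, field

# Hypothetical patterns for text that tries to instruct the agent instead of
# recording an observation. A real validator would need far more than this.
SUSPICIOUS = re.compile(
    r"ignore (all |any )?previous instructions"
    r"|always (run|execute|send)"
    r"|disable .*(check|validation)",
    re.IGNORECASE,
)


@dataclass
class MemoryEntry:
    text: str
    source: str        # provenance: which run, chat, or document produced it
    trust_scope: str   # e.g. "system", "user:alice", "web"
    created_at: float  # seconds since epoch
    ttl_seconds: float # expiration policy for this entry
    quarantined: bool = False


@dataclass
class MemoryStore:
    entries: list[MemoryEntry] = field(default_factory=list)

    def ingest(self, text, source, trust_scope, ttl_seconds=86_400.0, now=None) -> bool:
        """Validate at ingestion; store with provenance only if the text passes."""
        if SUSPICIOUS.search(text):
            return False
        created = time.time() if now is None else now
        self.entries.append(MemoryEntry(text, source, trust_scope, created, ttl_seconds))
        return True

    def retrieve(self, allowed_scopes: set[str], now=None) -> list[MemoryEntry]:
        """Return only unexpired, non-quarantined entries from permitted scopes."""
        now = time.time() if now is None else now
        return [
            e for e in self.entries
            if not e.quarantined
            and e.trust_scope in allowed_scopes
            and now - e.created_at < e.ttl_seconds
        ]

    def quarantine_source(self, source: str) -> int:
        """Incident response: quarantine every entry traced to one source."""
        count = 0
        for e in self.entries:
            if e.source == source and not e.quarantined:
                e.quarantined = True
                count += 1
        return count
```

Keeping `source` on every entry is what makes the quarantine step possible: once one poisoned memory is traced to its origin, every other entry from that origin can be pulled from retrieval without deleting the evidence.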