Prompt injection attacks: What are they and how to defend against them
Blog post from WorkOS
Prompt injection is identified as the primary security vulnerability in applications utilizing large language models (LLMs), as noted by its top ranking in the OWASP Top 10 for LLM Applications since 2025. This type of attack exploits the way LLMs interpret instructions, targeting the lack of a mechanism to differentiate between trusted and untrusted data. As LLMs process input as a single text stream, they are vulnerable to adversarial instructions that can lead to unintended actions, making it difficult to eliminate this threat entirely. The problem is exacerbated by the absence of a parameterized query equivalent for LLMs, unlike in SQL injection, and the stochastic nature of LLMs further complicates deterministic security guarantees. Prompt injection attacks can be direct, indirect, multi-modal, or agentic, each presenting unique challenges. Effective defense strategies involve layered security measures, including system prompt hardening, input scanning, output validation, privilege minimization, and adversarial testing. Moreover, compliance with regulations such as the EU AI Act mandates defenses against such vulnerabilities. The field requires continuous adaptation, with a focus on building security into LLM integrations from the onset to manage risks associated with prompt injection effectively.