Who Is Gandalf? The AI Challenge That Tests Your Prompting Skills

Post Details

Company

Lakera

Date Published

Nov. 14, 2025

Author

Max Mathys

Word Count

2,759

Language

-

Hacker News Points

-

Source URL

www.lakera.ai/blog/who-is-gandalf

Summary

Gandalf is a challenge created by Lakera to highlight the vulnerabilities of large language models (LLMs) and improve their defenses, particularly in contexts like healthcare and finance where data security is crucial. The game, stemming from an internal hackathon, involves trying to coax a language model into revealing a secret password, with each of the seven levels presenting increased difficulty as more sophisticated defenses are applied. As users progress, they encounter various strategies to prevent password leaks, such as checking both input and output for mentions of the password and employing additional language model checks. Despite these measures, users have found creative ways to bypass the defenses, demonstrating real-world implications for LLM security. Gandalf has gained significant popularity, registering millions of interactions and illustrating the ongoing challenge of securing AI applications against prompt attacks and other vulnerabilities.