Author
Pablo Mainar
Word count
1223

Summary

Multimodal large language models (LLMs) have expanded beyond text to process audio, images, and video. This enriches user experiences and enables new products, but it also introduces significant security challenges. Traditional text-based LLMs had a single attack vector: user-supplied text. Multimodal models face a far wider range of threats because they interpret nuanced audio inputs, leaving them susceptible to acoustic attacks such as clean audio jailbreaks, transcriber bypass via reverberation, dual-audio obfuscation, and transcriber muting, all of which exploit the limitations of transcription-based defenses. To counter these vulnerabilities, Lakera Guard analyzes raw audio streams directly for adversarial patterns and malicious intent, operating independently of transcription quality and offering real-time protection against these evolving threats. As multimodal systems become more prevalent, security measures like those in Lakera Guard are crucial for mitigating their expanded attack surface.
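The summary does not show how a reverberation-based transcriber bypass might work; the sketch below is a hypothetical illustration, not the attack from the post. It convolves an audio signal with a synthetic, exponentially decaying impulse response, the standard way reverberation is simulated. The idea being illustrated is that a heavily reverberant signal can remain intelligible to a multimodal model while degrading the transcript a transcription-based filter relies on. The function name `apply_reverb`, the decay constant, and the sine-tone stand-in for speech are all illustrative assumptions.

```python
import numpy as np

SR = 16_000  # assumed sample rate in Hz (typical for speech pipelines)

def apply_reverb(audio: np.ndarray, decay_s: float = 0.3, sr: int = SR) -> np.ndarray:
    """Simulate room reverberation by convolving the signal with a
    synthetic impulse response: white noise under an exponential decay."""
    rng = np.random.default_rng(0)
    n = int(decay_s * sr)
    ir = rng.standard_normal(n) * np.exp(-6.0 * np.arange(n) / n)
    ir[0] = 1.0  # keep the direct (dry) path dominant
    wet = np.convolve(audio, ir)          # full convolution: len(audio) + n - 1
    return wet / np.max(np.abs(wet))      # normalize to avoid clipping

# Example: a 1-second 440 Hz tone standing in for a spoken prompt.
t = np.arange(SR) / SR
dry = np.sin(2 * np.pi * 440 * t)
wet = apply_reverb(dry)
```

A defense that inspects the raw waveform, as the post says Lakera Guard does, sees `wet` directly and need not depend on how well a transcriber recovers the underlying words.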