A Watermark for Large Language Models

Post Details

Company

Arize

Date Published

July 30, 2025

Author

Dylan Couzon

Word Count

802

Language

English

Hacker News Points

-

Source URL

arize.com/blog/a-watermark-for-large-language-models

Summary

John Kirchenbauer of the University of Maryland presents a watermarking method for large language models that subtly biases text generation toward a "green" set of tokens, making the output detectable through statistical analysis without noticeably degrading text quality. The watermark is embedded by slightly raising the likelihood of green tokens during sampling, and it is detected by checking whether green tokens appear more often in a text than chance would predict. The watermark is fairly robust, but paraphrasing or editing weakens it, and spoofing is hard to prevent because downstream models can learn the watermark pattern. A similar technique is applied to diffusion models for images, where a perturbation of the noise in Fourier space creates a detectable pattern. The approach aims to maintain a measurable distinction between human-written and model-generated content, so that the watermark remains detectable as language models evolve.
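
To make the embed-and-detect idea in the summary concrete, here is a minimal Python sketch on a toy vocabulary. It is an illustration under stated assumptions, not the authors' implementation: the vocabulary size, the green fraction `GAMMA`, the logit bias `DELTA`, seeding the green list from a hash of the previous token, the random "model" logits, and every function name below are hypothetical choices made for this example.

```python
import hashlib
import math
import random

VOCAB_SIZE = 1000  # toy vocabulary size (assumption)
GAMMA = 0.5        # fraction of the vocabulary marked green (assumption)
DELTA = 2.0        # bias added to green-token logits (assumption)


def green_list(prev_token):
    """Pseudo-randomly pick the green tokens, seeded by the previous token."""
    digest = hashlib.sha256(str(prev_token).encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(range(VOCAB_SIZE), int(GAMMA * VOCAB_SIZE)))


def sample_token(logits, rng):
    """Sample one token id from a softmax over the logits."""
    top = max(logits)
    weights = [math.exp(x - top) for x in logits]
    return rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]


def generate(n_tokens, watermark, seed=0):
    """Generate token ids from a stand-in 'model' that emits random logits."""
    rng = random.Random(seed)
    tokens = [0]  # arbitrary start token
    for _ in range(n_tokens):
        logits = [rng.gauss(0.0, 1.0) for _ in range(VOCAB_SIZE)]
        if watermark:
            # Nudge green tokens upward before sampling.
            green = green_list(tokens[-1])
            logits = [x + DELTA if i in green else x
                      for i, x in enumerate(logits)]
        tokens.append(sample_token(logits, rng))
    return tokens


def z_score(tokens):
    """Compare the observed green count with the GAMMA * T expected by chance."""
    scored = len(tokens) - 1  # the first token has no predecessor
    hits = sum(tokens[i] in green_list(tokens[i - 1])
               for i in range(1, len(tokens)))
    return (hits - GAMMA * scored) / math.sqrt(scored * GAMMA * (1 - GAMMA))


if __name__ == "__main__":
    print("watermarked z-score:", z_score(generate(200, watermark=True)))
    print("plain z-score:      ", z_score(generate(200, watermark=False)))
```

In this sketch the detector needs only the token ids and the hashing rule, not the model itself. A large positive z-score suggests watermarked text, while unwatermarked text should score near zero. Any decision threshold would be a tunable choice that trades false positives against missed detections.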