Shipping a Trillion Parameters With a Hub Bucket: Delta Weight Sync in TRL

Post Details

Company

Hugging Face

Date Published

May 27, 2026

Author

Amine Dirhoussi, Quentin Gallouédec, Kashif Rasul, Lewis Tunstall, Edward Beeching, Albert Villanova del Moral, and Leandro von Werra

Word Count

4,227

Company Posts That Month

55

Language

-

Hacker News Points

-

Post removed?

No

Source URL

huggingface.co/blog/delta-weight-sync

Summary

The post describes how TRL makes asynchronous reinforcement learning (Async RL) more efficient by minimizing the data transferred between the trainer and the inference engine. Traditionally, every training step sends the entire model, which can be up to a terabyte for frontier models. The authors observe that over 98% of the weights remain unchanged between consecutive RL optimizer steps, so only the changed weights need to be sent, packed as a sparse safetensors file. This reduces the payload from gigabytes to megabytes. The implementation encodes the changes, uploads them to a Hugging Face Bucket, and fetches them with vLLM, which can run on a different server or in a different region from the trainer. The method removes the need for a shared cluster or a complex networking setup, making Async RL more accessible and cost-effective, especially for large-scale models.
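
The delta idea in the summary can be illustrated with a toy sketch. It assumes weights are plain Python lists keyed by parameter name; the helper names `encode_delta` and `apply_delta` are hypothetical and are not taken from TRL, vLLM, or safetensors. A real implementation would work on tensors and serialize the result, which this sketch does not attempt.

```python
from typing import Dict, List, Tuple

# Toy stand-ins for a checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]
# Sparse delta: parameter name -> (changed indices, new values at those indices).
Delta = Dict[str, Tuple[List[int], List[float]]]


def encode_delta(old: Weights, new: Weights) -> Delta:
    """Keep only the (index, value) pairs that differ between two checkpoints."""
    delta: Delta = {}
    for name, new_vals in new.items():
        old_vals = old[name]
        idx = [i for i, (a, b) in enumerate(zip(old_vals, new_vals)) if a != b]
        if idx:  # parameters with no changes are omitted entirely
            delta[name] = (idx, [new_vals[i] for i in idx])
    return delta


def apply_delta(weights: Weights, delta: Delta) -> Weights:
    """Return a copy of the weights with the changed entries overwritten."""
    out = {name: list(vals) for name, vals in weights.items()}
    for name, (idx, vals) in delta.items():
        for i, v in zip(idx, vals):
            out[name][i] = v
    return out


# Hypothetical example: 104 values in total, only 2 change in one optimizer step.
old = {"layer.weight": [0.0] * 100, "layer.bias": [0.0] * 4}
new = {name: list(vals) for name, vals in old.items()}
new["layer.weight"][7] = 0.5
new["layer.weight"][42] = -0.25

delta = encode_delta(old, new)        # what the trainer side would send
restored = apply_delta(old, delta)    # what the inference side would rebuild
changed = sum(len(idx) for idx, _ in delta.values())
total = sum(len(vals) for vals in new.values())
```

Here the receiver rebuilds the new checkpoint from its existing copy plus the delta, so the amount sent scales with the number of changed entries rather than with the model size.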

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
LLM	4	9,074	1,640	224	+53%
Secrets Management	1	2,152	360	101	+18%
