Introducing Storage Buckets on the Hugging Face Hub
Blog post from Hugging Face
Storage Buckets on the Hugging Face Hub provide flexible, S3-like object storage for the mutable, high-throughput artifacts that machine learning workflows produce, such as checkpoints, optimizer states, and processed data. Buckets are built on Xet, Hugging Face's chunk-based storage backend, which deduplicates content so that transfers are faster and storage costs are lower, which is particularly beneficial for enterprise users. They work with existing tools: the hf CLI and the Python API can create, sync, and inspect buckets, and the fsspec interface makes them compatible with popular libraries. Buckets combine the familiarity of S3 storage with enhancements tailored to AI workflows, and they complement the Hub's existing versioned model and dataset repositories. Buckets act as the working layer and repositories as the publishing layer, so the two stay clearly distinct while the whole workflow remains on the Hub.
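To make the idea of chunk-based deduplication concrete, here is a minimal, self-contained sketch in Python. It is not Xet's implementation: the rolling hash, the chunk-size parameters, and the names `chunk` and `ChunkStore` are all assumptions chosen for illustration. It only shows the general technique, in which a file is split at content-defined boundaries and each distinct chunk is stored once, so re-uploading a slightly modified checkpoint adds only the chunks that changed.

```python
import hashlib
import random

# Toy lookup table for a gear-style rolling hash (illustrative, not Xet's).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunk(data: bytes, mask: int = 0x3FF, min_size: int = 64, max_size: int = 8192):
    """Split data at content-defined boundaries.

    A boundary is placed where the low bits of the rolling hash are zero,
    so boundaries follow the content rather than fixed byte offsets.
    """
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= min_size and (h & mask) == 0) or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


class ChunkStore:
    """Hypothetical in-memory store that keeps each distinct chunk once."""

    def __init__(self):
        self.chunks = {}

    def put(self, data: bytes):
        """Store data; return (manifest of chunk hashes, newly stored bytes)."""
        manifest = []
        new_bytes = 0
        for piece in chunk(data):
            digest = hashlib.sha256(piece).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = piece
                new_bytes += len(piece)
            manifest.append(digest)
        return manifest, new_bytes

    def get(self, manifest):
        """Reassemble the original bytes from a manifest."""
        return b"".join(self.chunks[d] for d in manifest)


# Two "checkpoints": the second inserts a few bytes into the middle of the first.
_data_rng = random.Random(42)
v1 = bytes(_data_rng.getrandbits(8) for _ in range(200_000))
v2 = v1[:100_000] + b"changed optimizer state" + v1[100_000:]

store = ChunkStore()
manifest_v1, new_v1 = store.put(v1)
manifest_v2, new_v2 = store.put(v2)
print(f"v1: {len(v1)} bytes, {new_v1} newly stored")
print(f"v2: {len(v2)} bytes, {new_v2} newly stored")
```

Because the boundaries depend on the bytes themselves, the insertion in `v2` only disturbs the chunks around the edit; the chunks before and after it hash to the same values as in `v1` and are not stored again. A fixed-offset splitter would shift every later chunk and lose that reuse.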