Run 30,000+ LoRAs on Hugging Face with Replicate
Blog post from Replicate
LoRAs, or Low-Rank Adaptations, have become a popular way to teach image models a specific style or concept, such as Studio Ghibli stills or 80s cyberpunk aesthetics. Hugging Face, a prominent platform for sharing and experimenting with LoRAs, now lets you run these models directly on the Hub with Replicate handling inference, thanks to an update to Hugging Face's inference client.

The integration routes your request to a shared Flux base model on Replicate, which applies the requested LoRA dynamically via a parameter called lora_weights. Because the adapter is loaded at request time, a single backend model can serve every public Flux LoRA on the Hub, with no need to deploy a separate model per LoRA. The result is fast, cost-effective inference that makes these 30,000+ LoRAs more accessible to artists, researchers, and hobbyists alike.
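To make the routing concrete, here is a minimal, dependency-free sketch of how a request for a Hub-hosted LoRA maps onto a single shared base model. The field names and the `black-forest-labs/flux-dev-lora` model id mirror Replicate's public Flux LoRA model, but the exact request schema and the LoRA repo name below are illustrative assumptions, not the production API:

```python
def build_replicate_request(prompt: str, lora_repo: str) -> dict:
    """Sketch of a prediction payload that applies a Hub-hosted LoRA.

    `lora_repo` is a Hugging Face repo id. The key idea: the base model
    is fixed, and only `lora_weights` changes from request to request.
    """
    return {
        # One shared base model serves every Flux LoRA...
        "model": "black-forest-labs/flux-dev-lora",
        "input": {
            "prompt": prompt,
            # ...because the adapter to apply is chosen per request:
            "lora_weights": lora_repo,
        },
    }


request = build_replicate_request(
    "a cat in the style of Studio Ghibli",
    "username/ghibli-style-lora",  # hypothetical LoRA repo id
)
print(request["input"]["lora_weights"])  # username/ghibli-style-lora
```

This is why no per-LoRA deployment is needed: swapping styles is just a different `lora_weights` value against the same warm base model.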