FLUX fine-tunes are now fast
Blog post from Replicate
Replicate has shipped optimizations that make fine-tuned FLUX models run as fast as the base models, and the improvements are open source. The speedups come from Alex Redden’s flux-fp8-api, torch.compile, and CuDNN attention kernels, and the stack supports loading LoRAs from sources like Hugging Face and Civitai.

Under the hood, fine-tunes are quantized to fp8 and merged into the base model, and when go_fast=true is enabled, lora_scale is automatically increased to keep output quality optimal. Quantization alters outputs slightly, but the effect on quality is minimal, and all models, existing and future, benefit from the enhancements.

Acknowledging that comparing model outputs is inherently difficult, Replicate emphasizes transparency and contributing optimizations back to the open-source community, and commits to making both running and training fine-tunes faster through ongoing development and community collaboration.
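To picture the torch.compile and CuDNN attention piece, here is a minimal sketch, not Replicate's actual code: the `attention` function and the tensor shapes are placeholders standing in for a FLUX attention layer, and the CuDNN backend selection requires a recent PyTorch on a supported GPU.

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import sdpa_kernel, SDPBackend

# Toy shapes standing in for a FLUX attention layer:
# (batch, heads, sequence, head_dim). CuDNN attention wants fp16/bf16.
q = torch.randn(1, 24, 4096, 128, device="cuda", dtype=torch.bfloat16)
k = torch.randn_like(q)
v = torch.randn_like(q)

@torch.compile  # fuses surrounding ops; the first call pays the compile cost
def attention(q, k, v):
    return F.scaled_dot_product_attention(q, k, v)

# Restrict scaled-dot-product-attention dispatch to the CuDNN kernel
# inside this context.
with sdpa_kernel(SDPBackend.CUDNN_ATTENTION):
    out = attention(q, k, v)
```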
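The merge-then-quantize step can be sketched in a few lines. This is a simplified illustration under assumed shapes, with a hypothetical helper name and per-tensor scaling; the real pipeline (via flux-fp8-api) covers many layers and more careful scaling.

```python
import torch

def merge_lora_and_quantize(base_weight, lora_A, lora_B, lora_scale):
    """Fold a LoRA delta into a base weight, then quantize to fp8.

    W' = W + lora_scale * (B @ A), stored as float8_e4m3fn plus a
    per-tensor scale so it can be dequantized at matmul time.
    """
    merged = base_weight + lora_scale * (lora_B @ lora_A)

    # A per-tensor scale maps the weight range onto fp8's representable range.
    fp8_max = torch.finfo(torch.float8_e4m3fn).max
    scale = merged.abs().max() / fp8_max
    quantized = (merged / scale).to(torch.float8_e4m3fn)
    return quantized, scale

# Toy shapes: a 512x512 layer with a rank-16 LoRA.
W = torch.randn(512, 512)
A = torch.randn(16, 512) * 0.01   # LoRA down-projection
B = torch.randn(512, 16) * 0.01   # LoRA up-projection

W_fp8, scale = merge_lora_and_quantize(W, A, B, lora_scale=1.1)
W_restored = W_fp8.to(torch.float32) * scale  # dequantize for use
```

The slightly raised lora_scale in the sketch mirrors the post's point: fp8 quantization mildly dampens the fine-tune's effect, so the scale is bumped to compensate.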
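On the consumer side, the fast path is a toggle on the model's inputs. A usage sketch with the Replicate Python client; the model name is a placeholder, and go_fast / lora_scale are the parameter names the post describes:

```python
import replicate

# "your-username/your-flux-fine-tune" is a placeholder model name.
output = replicate.run(
    "your-username/your-flux-fine-tune",
    input={
        "prompt": "a photo of TOK riding a bicycle",
        "go_fast": True,   # use the fp8 fast path described in the post
        # lora_scale is increased automatically when go_fast is enabled,
        # but it can still be set explicitly:
        # "lora_scale": 1.0,
    },
)
print(output)
```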