
Welcome FLUX.2 - BFL’s new open image generation model 🤗

Blog post from HuggingFace

Post Details
Author: YiYi Xu, Daniel Gu, Sayak Paul, Alvaro Somoza, Dhruv Nair, Aritra Roy Gosthipaty, Linoy Tsaban, and Apolinário from multimodal AI art
Word Count: 3,460
Summary

FLUX.2, developed by Black Forest Labs, is an image generation model that succeeds FLUX.1 with a new architecture designed from the ground up. Unlike FLUX.1, FLUX.2 uses a single text encoder, Mistral Small 3.1, and builds the prompt embedding by stacking outputs from its intermediate layers. The model retains the multimodal diffusion transformer (MM-DiT) plus parallel DiT architecture but shares modulation parameters across transformer blocks and removes bias parameters, which gives a more streamlined structure. FLUX.2 supports both text-guided and image-guided generation, accepts multiple reference images, and supports advanced prompting techniques. It requires significant memory, so several strategies are available for running inference and fine-tuning on consumer-grade hardware, including LoRA fine-tuning, remote text encoding, and quantization.
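
To make the "stacking outputs from intermediate layers" idea concrete, here is a minimal pure-Python sketch. It assumes that stacking means concatenating, per token, the hidden vectors taken from a few chosen encoder layers. The function name `stack_intermediate_layers`, the toy data, and the layer indices are all hypothetical and are not taken from the FLUX.2 code.

```python
from typing import List, Sequence

# One layer's output: a list of token vectors, shape [seq_len][hidden_dim].
Matrix = List[List[float]]


def stack_intermediate_layers(
    hidden_states: Sequence[Matrix],
    layer_indices: Sequence[int],
) -> Matrix:
    """Concatenate per-token vectors from the selected layers.

    Hypothetical helper: returns a matrix of shape
    [seq_len][len(layer_indices) * hidden_dim].
    """
    if not layer_indices:
        raise ValueError("layer_indices must not be empty")
    selected = [hidden_states[i] for i in layer_indices]
    seq_len = len(selected[0])
    if any(len(layer) != seq_len for layer in selected):
        raise ValueError("all layers must have the same sequence length")
    # For each token position, join that token's vector from every selected layer.
    return [
        [value for layer in selected for value in layer[tok]]
        for tok in range(seq_len)
    ]


# Toy "encoder output": 4 layers, 2 tokens, hidden size 3.
# Each value encodes its origin as layer * 100 + token * 10 + dim.
toy_hidden_states = [
    [[float(layer * 100 + tok * 10 + dim) for dim in range(3)] for tok in range(2)]
    for layer in range(4)
]

# Layer indices 1 and 3 are arbitrary picks for illustration only.
prompt_embeds = stack_intermediate_layers(toy_hidden_states, [1, 3])
print(len(prompt_embeds), len(prompt_embeds[0]))  # sequence length, stacked width
```

In a real pipeline the same operation would run on tensors rather than nested lists. The point of the sketch is only that the embedding width grows with the number of layers selected while the sequence length stays fixed.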