Understanding InstaFlow/Rectified Flow
Blog post from Hugging Face
Isamu Isozaki's blog post explores InstaFlow, a technique that applies rectified flows to Stable Diffusion so that images can be generated in a single step rather than through the usual multi-step diffusion sampling process. The method builds on the paper "Flow Straight and Fast" and aims to learn a direct transport map between two image distributions, such as horses and zebras; framed as image-to-image translation, the same transport mapping turns noise into images almost instantaneously. Rectified flows address the training instability and image-quality issues of GANs by establishing a straight-line path between the initial noise and the final image, traversed at a constant velocity.

The InstaFlow training pipeline, which consists of a reflow stage followed by distillation, took 4,776 A100 GPU hours, far less than the cost of training earlier models such as Stable Diffusion 2.1 from scratch. Training was text-conditioned and used a subset of prompts from laion2B-en, and the authors stress that reflow is essential for high-quality results. The post concludes with a to-do list for implementing InstaFlow as a pull request in the diffusers library, highlighting the need for scripts to generate the training datasets and for further refinement of the methodology.
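For readers who want the core idea in code, here is a minimal sketch of the rectified flow training objective, assuming a generic PyTorch `velocity_model` that takes a noisy sample and a timestep; the function name and signature are illustrative, not taken from the blog post or the diffusers library, and text conditioning is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def rectified_flow_loss(velocity_model, x0, x1):
    """Rectified flow objective (sketch): x0 ~ source (e.g. Gaussian noise),
    x1 ~ target (e.g. images), both of shape (B, C, H, W). The straight-line
    path x_t = (1 - t) * x0 + t * x1 has the constant velocity x1 - x0,
    which the model learns to predict at a random time t."""
    t = torch.rand(x0.shape[0], device=x0.device).view(-1, 1, 1, 1)
    x_t = (1 - t) * x0 + t * x1
    target = x1 - x0                          # constant velocity along the path
    pred = velocity_model(x_t, t.flatten())   # assumed signature: (sample, t)
    return F.mse_loss(pred, target)

# Because the learned paths are (nearly) straight, one-step generation is just
# a single Euler step of the ODE:  image ≈ noise + velocity_model(noise, t=0)
```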
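The reflow and distillation stages mentioned above can be sketched in the same spirit: reflow uses the current model to pair noise samples with the images it transports them to, then retrains on those couplings so the paths become straighter; distillation then compresses the multi-step solver into a single step. The helpers below are hypothetical illustrations under the same `velocity_model` assumption, not the authors' training scripts, and plain MSE stands in for whatever similarity loss is actually used.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate_reflow_pairs(velocity_model, noise, num_steps=25):
    """Integrate dx/dt = v(x, t) with Euler steps so each noise sample is
    paired with the image it maps to; these (x0, x1) couplings become the
    training data for the next round of rectified flow training (reflow)."""
    x = noise.clone()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = torch.full((x.shape[0],), i * dt, device=x.device)
        x = x + dt * velocity_model(x, t)
    return noise, x

def one_step_distillation_loss(student_model, noise, teacher_image):
    """Distillation: the student's single Euler step from noise should
    reproduce the image produced by the multi-step teacher."""
    t = torch.zeros(noise.shape[0], device=noise.device)
    one_step = noise + student_model(noise, t)  # x0 + v(x0, 0) * (1 - 0)
    return F.mse_loss(one_step, teacher_image)
```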