Constraining CLIPDraw
Blog post from Replicate
Dom, a web engineer at Replicate, describes his work on CLIPDraw, a model that generates images from text prompts by gradually turning random squiggles into recognizable forms. Although he has little background in machine learning, Dom works with experts to constrain CLIPDraw so that its outputs are more centered and detailed. Along the way he runs into problems such as interrupted gradient flow and learns to vectorize his code, shifting from a procedural to a differentiable style of programming. Through trial and error, including adjusting loss weights and experimenting with ReLU functions, he gets the model to produce more cohesive images without sacrificing resemblance to the prompt. He acknowledges that more work is needed before the results match the sophistication of established tools like pixray/text2image.
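To make the idea of a weighted, ReLU-based constraint concrete, here is a minimal sketch. It is not Dom's code: the function names, the box-shaped "allowed region", and the default values for `center`, `radius`, and `penalty_weight` are all assumptions chosen for illustration. It uses plain Python floats to show only the shape of the penalty. In a real model the same expression would be written with tensor operations so that automatic differentiation can pass gradients through it.

```python
# Hypothetical sketch: none of these names or values come from the CLIPDraw code.

def relu(x):
    """max(0, x): zero inside the allowed region, linear outside it."""
    return x if x > 0.0 else 0.0


def centering_penalty(points, center=0.5, radius=0.3):
    """Mean ReLU penalty for control points that stray from the canvas center.

    points: list of (x, y) pairs in normalized [0, 1] canvas coordinates.
    Each coordinate is penalized only by how far it lies beyond `radius`
    from `center`; coordinates inside that box contribute nothing.
    """
    total = 0.0
    for x, y in points:
        total += relu(abs(x - center) - radius)
        total += relu(abs(y - center) - radius)
    return total / len(points)


def total_loss(prompt_loss, points, penalty_weight=0.1):
    """Weighted sum of a prompt-similarity loss and the centering penalty."""
    return prompt_loss + penalty_weight * centering_penalty(points)


# Two example strokes: one fully inside the box, one with a point at the edge
# of the canvas.
inside = [(0.5, 0.5), (0.4, 0.7)]
outside = [(0.5, 0.5), (1.0, 0.5)]

print(centering_penalty(inside), centering_penalty(outside))
print(total_loss(1.0, inside), total_loss(1.0, outside))
```

The weight is the knob being traded off in this sketch: set too high, the penalty would dominate the prompt loss and pull everything toward the center, and set too low, it would have no visible effect. Because the penalty is a smooth-almost-everywhere function of the coordinates rather than an `if` that clips or discards points, it is the kind of expression gradients can flow through.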