DALL·E 3 and Midjourney Fail Astral Codex Ten's Image Generation Bet
Blog post from Surge AI
In 2022, Astral Codex Ten (ACX) made a bet on the capabilities of generative AI models. The challenge was to generate images from five specific prompts, and the bet counted as won if, for at least three of the five prompts, at least one of ten generated images accurately depicted the scene. DALL·E 2 initially failed, and Google’s Imagen model was later credited with partial success. However, ACX's claimed victory was contested because the generated images contained inaccuracies, such as a raven missing the key in its mouth or a fox missing its lipstick. A recent evaluation using DALL·E 3 and Midjourney showed limited progress: DALL·E 3 succeeded on two prompts and partially succeeded on one, while Midjourney failed all five. The evaluation highlighted ongoing challenges in image compositionality. AI image generation has advanced, but it still struggles with the finer details of a prompt, and the bet remains unwon.
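The win condition is easy to misread, so here is a minimal sketch of it in Python. It assumes each generated image has already been judged by a human as accurate or not. The function names, the placeholder prompt labels, and the per-image judgments are all hypothetical and for illustration only; they are not the evaluation's actual data. The sketch also assumes a strict reading in which a partial success counts as a failure.

```python
from typing import Dict, List

IMAGES_PER_PROMPT = 10  # up to ten images are generated per prompt
PROMPTS_REQUIRED = 3    # at least three of the five prompts must succeed


def prompt_passes(image_judgments: List[bool]) -> bool:
    """A prompt passes if at least one of its (up to ten) images is accurate."""
    return any(image_judgments[:IMAGES_PER_PROMPT])


def bet_won(judgments: Dict[str, List[bool]]) -> bool:
    """The bet is won if enough prompts pass."""
    passed = sum(prompt_passes(j) for j in judgments.values())
    return passed >= PROMPTS_REQUIRED


# Hypothetical judgments shaped like the DALL·E 3 outcome described above:
# two prompts with an accurate image, one partial success (counted here as
# a failure, which is an assumption), and two outright failures.
illustrative = {
    "prompt_1": [False] * 9 + [True],
    "prompt_2": [True] + [False] * 9,
    "prompt_3": [False] * 10,  # partial success, treated as a failure
    "prompt_4": [False] * 10,
    "prompt_5": [False] * 10,
}

print(bet_won(illustrative))  # False
```

Under this reading, two passing prompts fall one short of the threshold. That is why a single disputed detail, such as a missing key or lipstick, can decide the whole bet.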