DALL·E 3 and Midjourney Fail Astral Codex Ten's Image Generation Bet

Post Details

Company

Surge AI

Date Published

Aug. 1, 2024

Author

Edwin Chen

Word Count

2,016

Language

English

Hacker News Points

-

Source URL

surgehq.ai/blog/dalle-3-and-midjourney-fail-astral-codex-tens-image-generation-bet

Summary

In 2022, Astral Codex Ten (ACT) made a bet on the capabilities of generative AI image models. The challenge covered five specific prompts, and winning required that, for at least three of the five, at least one of ten generated images accurately depict the scene. DALL-E 2 initially failed, and Google's Imagen model was later claimed to have partially succeeded. However, ACT's declared victory was contested because the generated images contained inaccuracies, such as a missing key in a raven's mouth or missing lipstick on a fox. A more recent evaluation using DALL-E 3 and Midjourney showed limited progress: DALL-E 3 fully succeeded on two prompts and partially succeeded on one, while Midjourney failed all five. The evaluation highlights ongoing challenges in image compositionality: AI image generation has advanced, but it still struggles with fine-grained prompt details, and the bet remains unwon.
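The win condition described above is a simple threshold rule, and a short sketch may make it easier to follow. The Python below is a minimal illustration, not code from the original post: the function names (`prompt_passes`, `bet_won`, `make_prompt`) are hypothetical, and the per-image flags are placeholder data chosen only to match the summary-level outcome "two prompts succeeded". It also assumes that a partial success counts as a failure.

```python
from typing import List, Sequence


def prompt_passes(image_ok: Sequence[bool]) -> bool:
    """A prompt passes if at least one of its generated images is accurate."""
    return any(image_ok)


def bet_won(per_prompt: Sequence[Sequence[bool]], prompts_needed: int = 3) -> bool:
    """The bet is won if at least `prompts_needed` prompts pass."""
    passed = sum(1 for images in per_prompt if prompt_passes(images))
    return passed >= prompts_needed


def make_prompt(n_accurate: int, n_images: int = 10) -> List[bool]:
    """Hypothetical helper: n_images flags, the first n_accurate set to True."""
    return [i < n_accurate for i in range(n_images)]


# Placeholder data: two of five prompts have one accurate image each.
# Which prompts passed, and how many images were accurate, is assumed.
two_of_five = [make_prompt(1), make_prompt(1),
               make_prompt(0), make_prompt(0), make_prompt(0)]

# Placeholder data: a third prompt passing is enough to win.
three_of_five = [make_prompt(1), make_prompt(1), make_prompt(1),
                 make_prompt(0), make_prompt(0)]

print(bet_won(two_of_five))    # False
print(bet_won(three_of_five))  # True
```

Under this rule, a model that fully succeeds on two prompts falls one short of the three-prompt threshold, no matter how close it comes on a third.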