Cohere For AI has introduced Aya Vision, a state-of-the-art vision model built to improve multilingual and multimodal communication, covering 23 languages spoken by more than half the world's population. Aya Vision aims to close the performance gap that AI models show on tasks combining text and images, such as image captioning and visual question answering. The models outperform significantly larger counterparts, achieving high win rates on benchmarks such as AyaVisionBench and m-WildVision. The release includes open-weight models on Kaggle and Hugging Face, promoting accessibility and collaboration in AI research (a loading sketch follows below). This initiative follows the success of Aya Expanse and reflects Cohere's commitment to advancing multilingual AI through research grants and an open science community, inviting researchers worldwide to help drive innovation and bridge cultural and language divides.
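
Since the announcement highlights open weights on Hugging Face, here is a minimal sketch of how one might load and prompt such a checkpoint with the `transformers` library. The repository name, class choices, and chat-message format are assumptions for illustration, not part of the announcement; the model card on Hugging Face is the authoritative reference.

```python
# Hypothetical loading sketch; the model ID and image URL are placeholders.
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "CohereForAI/aya-vision-8b"  # assumed repository name

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

# Multimodal chat message: one image plus a multilingual instruction.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},
            {"type": "text", "text": "Describe this image in French."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(processor.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

The multilingual instruction in the example reflects the model's stated focus on cross-language image understanding; swapping the prompt language is the simplest way to probe that capability.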