The Aya project is a significant advance in multilingual AI research, aiming to close the gap left by AI models that support only a narrow set of languages. Built on contributions from over 3,000 researchers worldwide, Aya focuses on underserved languages and outperforms existing open-source multilingual models such as mT0 and Bloomz on tasks including natural language understanding, summarization, and translation. The model covers more than 50 languages that earlier models left unserved. The accompanying data collection is described as the most extensive multilingual dataset to date, with 513 million prompts and completions in 114 languages, including a rare set of 204,000 human-annotated entries. Both the model and the data collection are open source under the permissive Apache 2.0 license, giving developers and researchers a resource that promotes linguistic diversity and a foundation for further open science projects. The initiative invites academics, civil institutions, and small companies to participate and help make AI more culturally and linguistically relevant.