C4AI Launches Aya, an LLM Covering More Than 100 Languages

Blog post from Cohere

Post Details
Author: Cohere Labs Team
Word Count: 631
Language: English
Summary

The Aya project is a multilingual AI research effort that aims to close the gap left by the narrow language coverage of earlier AI models. Drawing on contributions from more than 3,000 researchers worldwide, Aya focuses on support for underserved languages and outperforms existing open-source multilingual models such as mT0 and Bloomz on natural language understanding, summarization, and translation. Aya covers more than 50 previously unserved languages and offers what is described as the most extensive multilingual dataset to date: 513 million prompts and completions in 114 languages, including 204,000 rare human-annotated entries. Both the model and the data collection are open source under the fully permissive Apache 2.0 license, giving developers and researchers a resource that promotes linguistic diversity and a foundation for further open science projects. The initiative invites academics, civil institutions, and small companies to participate and help make AI more culturally and linguistically relevant.