rerank-2.5 and rerank-2.5-lite: instruction-following rerankers
Blog post from Voyage AI
The rerank-2.5 series delivers a significant step forward in retrieval accuracy and introduces instruction following, letting users steer the model's relevance scores with natural-language instructions. rerank-2.5 and rerank-2.5-lite outperform Cohere Rerank v3.5 by 7.94% and 7.16% respectively on standard datasets, with even larger margins on the Massive Instructed Retrieval Benchmark (MAIR) and on in-house evaluations.

Both models support a 32K-token context length, improving retrieval accuracy on longer documents at no additional cost. These gains stem from improved training data mixtures and advanced distillation techniques, which allow the models to surpass existing large language models when used as rerankers.

Instruction following is especially valuable for nuanced search tasks: across diverse domain-specific datasets, supplying an instruction yields an average accuracy gain of over 8%. The models consistently lead across domains and languages, setting a new cost-to-performance benchmark for retrieval systems.
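To illustrate the idea behind instruction following, here is a minimal, self-contained toy sketch. The `rerank` function, its word-overlap scoring, and the sample documents are all hypothetical stand-ins for the actual model: they only mimic, in spirit, how folding a natural-language instruction into the query can shift which documents score as most relevant.

```python
def rerank(query, documents, instruction=None):
    """Toy reranker: scores documents by word overlap with the query.

    An optional natural-language instruction is folded into the query text,
    mimicking (in spirit only) how an instruction-following reranker lets
    the caller bias relevance scores. This is NOT the Voyage AI API.
    """
    full_query = f"{instruction} {query}" if instruction else query
    query_tokens = set(full_query.lower().split())
    # Score each document by how many of its words appear in the query tokens.
    scored = [
        (sum(word in query_tokens for word in doc.lower().split()), doc)
        for doc in documents
    ]
    # Highest-scoring documents first; Python's sort is stable for ties.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]

docs = [
    "Pricing details for the enterprise plan",
    "Research paper on retrieval benchmarks",
    "A beginner search tutorial and setup guide",
]

# Without an instruction, the query alone drives the ordering.
print(rerank("retrieval benchmarks", docs)[0])
# → Research paper on retrieval benchmarks

# Folding in an instruction shifts which document ranks first.
print(rerank("retrieval benchmarks", docs,
             instruction="prefer a tutorial or search guide")[0])
# → A beginner search tutorial and setup guide
```

In the real models the instruction conditions a learned relevance score rather than a lexical overlap count, but the calling pattern is the same: query, candidate documents, and an optional instruction that reshapes the ranking.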