A framework and leaderboard for Retrieval Pipelines evaluation on ViDoRe v3

Blog post from HuggingFace

Post Details
Company: HuggingFace
Author: Quentin Macé, Gabriel de Souza Pereira Moreira, Antoine EDY, Radek Osmulski, and Bo Liu
Word Count: 1,886
Summary

The blog post presents the ViDoRe v3 framework and leaderboard, which are designed to evaluate retrieval pipelines, particularly in the context of Retrieval Augmented Generation (RAG). RAG enhances Large Language Models (LLMs) by adding a retrieval component that injects relevant context into prompts, and ViDoRe v3 serves as a benchmark for assessing embedding models on visual retrieval tasks. The post covers the key components of a retrieval pipeline, including Optical Character Recognition (OCR) and Vision-Language Models (VLMs), along with retrieval approaches such as sparse search, dense embedding models, and late interaction models. It stresses that building a state-of-the-art retrieval system depends on choosing components that match specific business and system requirements.

The ViDoRe v3 Pipeline Leaderboard, available on Hugging Face, makes it possible to compare pipeline implementations by their average accuracy and search latency. The post also describes the shift from static pipelines to dynamic Retrieval Agents, which can improve search accuracy adaptively by rewriting queries or calling various tools. Because the framework is standardized, it allows direct comparison of dense, sparse, and hybrid retrieval approaches, as well as of text-based and image-based retrieval methods.
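
As a rough illustration of how a comparison on accuracy and search latency could work, the Python sketch below scores two toy pipelines on a tiny corpus. Everything in it is hypothetical and not taken from ViDoRe v3 or the blog post: the corpus, the queries, the function names, and the hit-rate metric are assumptions chosen only to make the idea concrete.

```python
import time
from typing import Callable, Dict, List, Set

# Hypothetical toy corpus and relevance labels (not ViDoRe v3 data).
CORPUS: Dict[str, str] = {
    "d1": "invoice total amount due in euros",
    "d2": "quarterly revenue chart for the energy sector",
    "d3": "safety instructions for the industrial pump",
}
QUERIES: List[str] = ["total amount due", "revenue chart", "pump safety"]
RELEVANT: Dict[str, Set[str]] = {
    "total amount due": {"d1"},
    "revenue chart": {"d2"},
    "pump safety": {"d3"},
}

Pipeline = Callable[[str, int], List[str]]


def sparse_pipeline(query: str, k: int) -> List[str]:
    """Toy sparse search: rank documents by the number of shared query terms."""
    terms = set(query.lower().split())
    scores = {doc_id: len(terms & set(text.split())) for doc_id, text in CORPUS.items()}
    # Sort by descending score, breaking ties by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


def first_k_pipeline(query: str, k: int) -> List[str]:
    """Naive baseline: ignore the query and return the first k document ids."""
    return sorted(CORPUS)[:k]


def evaluate(pipeline: Pipeline, queries: List[str],
             relevant: Dict[str, Set[str]], k: int = 1) -> Dict[str, float]:
    """Return the hit rate at k and the mean wall-clock search time per query."""
    hits = 0
    start = time.perf_counter()
    for query in queries:
        results = pipeline(query, k)
        hits += any(doc_id in relevant[query] for doc_id in results)
    elapsed = time.perf_counter() - start
    return {
        "hit_rate": hits / len(queries),
        "latency_s_per_query": elapsed / len(queries),
    }


if __name__ == "__main__":
    for name, pipeline in [("sparse", sparse_pipeline), ("first_k", first_k_pipeline)]:
        print(name, evaluate(pipeline, QUERIES, RELEVANT, k=1))
```

Because every pipeline exposes the same query-in, ranked-ids-out interface, a dense, hybrid, or agent-based retriever could in principle be dropped into `evaluate` unchanged, which is the kind of like-for-like comparison a standardized framework makes possible.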