Chunk size is query-dependent: a simple multi-scale approach to RAG retrieval

Blog post from AI21 Labs

Post Details
Company: AI21 Labs
Author: Niv Granot, Algorithms Group Lead @ AI21
Word Count: 1,700
Language: English
Summary

The post argues that the best chunk size for retrieval-augmented generation (RAG) depends on the query, and that adapting it can significantly improve retrieval performance. Most pipelines fix a single chunk size, which forces a trade-off: smaller chunks preserve fine-grained details, while larger chunks capture broader context. Oracle experiments show that different queries benefit from different chunk sizes, with a potential recall improvement of 20-40% when the optimal size is selected for each query. Building on this, the post proposes a practical multi-scale retrieval method: index the same corpus at multiple chunk sizes and aggregate the retrieval results at inference time with Reciprocal Rank Fusion (RRF). In the reported experiments this improves performance by 1-37% without retraining any models, a gain comparable to that from switching embedding models. The takeaway is that chunk size interacts with the query, so a system that draws on several representations of the same corpus retrieves more robustly than one tied to a single size.
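To make the aggregation step concrete, here is a minimal sketch of Reciprocal Rank Fusion over rankings produced at several chunk sizes. It is an illustration under stated assumptions, not the post's implementation: it assumes each retrieved chunk can be mapped back to a parent document ID so that lists from different chunk sizes are comparable, it uses the commonly cited constant k = 60, and the function name `rrf_fuse` and the sample IDs are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists with Reciprocal Rank Fusion.

    rankings: dict mapping a chunk size to a list of parent document IDs,
              ordered best first (assumption: chunks map to document IDs).
    k:        smoothing constant; 60 is a common default, assumed here.
    Returns document IDs sorted by fused score, best first.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        seen = set()
        rank = 0
        for doc_id in ranked_ids:
            # Several chunks of one document may appear in the same list;
            # only the first (best) occurrence counts toward the rank.
            if doc_id in seen:
                continue
            seen.add(doc_id)
            rank += 1
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical retrieval results for one query at three chunk sizes.
rankings = {
    128: ["doc_a", "doc_b", "doc_c"],
    512: ["doc_b", "doc_a", "doc_d"],
    2048: ["doc_b", "doc_d", "doc_a"],
}
fused = rrf_fuse(rankings)
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF uses only rank positions, the similarity scores from different indexes never need to be calibrated against each other, which is consistent with the summary's point that no retraining is required.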