Chunk size is query-dependent: a simple multi-scale approach to RAG retrieval

Post Details

Company

AI21 Labs

Date Published

Jan. 29, 2026

Author

Niv Granot, Algorithms Group Lead @ AI21

Word Count

1,700

Company Posts That Month

9

Language

English

Hacker News Points

-

Source URL

www.ai21.com/blog/query-dependent-chunking

Summary

Adapting chunk size to the query can significantly improve retrieval performance in retrieval-augmented generation (RAG) systems. Traditional pipelines rely on a single fixed chunk size, which forces a trade-off: smaller chunks preserve fine-grained details, while larger chunks capture broader context. Experiments show that indexing the same corpus at multiple chunk sizes and aggregating the retrieval results with Reciprocal Rank Fusion (RRF) improves performance by 1-37% without retraining any models. Oracle experiments show that different queries benefit from different chunk sizes, with a potential recall improvement of 20-40% when the optimal size is selected for each query. The post proposes a practical multi-scale retrieval method that indexes at several chunk sizes and aggregates results at inference time using RRF, achieving gains comparable to those from switching embedding models. Because the best chunk size depends on the query, combining multiple representations of the same corpus makes retrieval more robust.
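The aggregation step described in the summary can be illustrated with a minimal sketch of Reciprocal Rank Fusion, where each item's fused score is the sum of 1 / (k + rank) over the ranked lists it appears in. Several details here are assumptions rather than anything stated in the post: the function name `rrf_fuse`, the constant k = 60 (a commonly used default), the sample chunk sizes and document IDs, and the choice to map every retrieved chunk back to its parent document so that results from different chunk sizes can be compared.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each list is the retrieval result for one chunk size, already mapped
    from chunk IDs to parent document IDs (an assumption of this sketch).
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        seen = set()
        rank = 0
        for doc_id in ranking:
            # Several chunks of one document may be retrieved; count only
            # the first (best-ranked) occurrence within each list.
            if doc_id in seen:
                continue
            seen.add(doc_id)
            rank += 1
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical retrieval results for one query, keyed by chunk size in tokens.
results_by_chunk_size = {
    128: ["doc_a", "doc_b", "doc_a", "doc_c"],
    512: ["doc_b", "doc_a", "doc_d"],
    2048: ["doc_b", "doc_c", "doc_a"],
}

fused = rrf_fuse(list(results_by_chunk_size.values()))
print(fused)
```

Because RRF uses only rank positions, similarity scores from the different indexes never need to be calibrated against one another, which is consistent with the summary's point that the method requires no retraining.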

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Vector Search	8	1,668	286	111	+15%
RAG	3	849	194	70	-7%
LLM	1	3,836	662	193	+2%