Company
Date Published
Author
-
Word count
924
Language
-
Hacker News points
None

Summary

By default, Elasticsearch uses the "Query Then Fetch" search type to retrieve relevant documents from the shards that make up an index. Because each shard scores its matches using only its own local term and document frequency statistics, otherwise similar documents on different shards can receive different scores. These discrepancies are most visible in small indexes with few documents per shard, where local statistics are a poor sample of the whole index. To address this, Elasticsearch offers an alternative search type, "DFS Query Then Fetch," which first runs a pre-query to gather term and document frequencies from all shards and then scores every document against those global statistics, producing more accurate and consistent results. The cost is an additional round trip to the shards, which can hurt performance, so this search type is most worthwhile when accurate relevance scoring is critical and the data is sparse or unevenly distributed.
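
The effect of local versus global statistics can be illustrated with a small simulation. The sketch below is not Elasticsearch code: the two shards, their documents, and the helper names (`idf`, `local_idf`, `global_idf`) are hypothetical, and it assumes a BM25-style inverse document frequency, which is only one part of the real score.

```python
import math

def idf(doc_count: int, doc_freq: int) -> float:
    """BM25-style inverse document frequency (assumed formula for this sketch)."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))

# Hypothetical index split across two shards; each document is a set of terms.
shards = [
    [{"elasticsearch", "search"}, {"search", "shard"}, {"search"}],
    [{"elasticsearch"}, {"elasticsearch", "shard"}, {"elasticsearch", "search"}],
]

def local_idf(shard, term):
    """IDF computed from one shard's statistics only (Query Then Fetch behavior)."""
    doc_freq = sum(term in doc for doc in shard)
    return idf(len(shard), doc_freq)

def global_idf(all_shards, term):
    """IDF computed from statistics summed over all shards (DFS pre-query behavior)."""
    doc_count = sum(len(shard) for shard in all_shards)
    doc_freq = sum(term in doc for shard in all_shards for doc in shard)
    return idf(doc_count, doc_freq)

term = "elasticsearch"
local_scores = [local_idf(shard, term) for shard in shards]
global_score = global_idf(shards, term)

for number, score in enumerate(local_scores):
    print(f"shard {number} local IDF: {score:.3f}")
print(f"global IDF: {global_score:.3f}")
```

The term is rare on the first shard and common on the second, so the two local IDF values disagree, while the global value falls between them and is identical for every shard. In a real cluster the global behavior is requested through the search API's `search_type` parameter, for example `GET /my-index/_search?search_type=dfs_query_then_fetch`, where `my-index` is a placeholder index name.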