Company
Date Published
Author
-
Word count
1093
Language
English
Hacker News points
None

Summary

Choosing the best index for a database query is a hard problem, especially when the specific filter values are not known in advance. The blog post discusses how to select the best index from a set of feasible candidates, focusing primarily on graph databases, although the ideas apply to relational databases as well. It presents three approaches in order of sophistication: a simple heuristic that counts nodes, the average group size, and probabilistic measures such as the chi-squared statistic, which assess how closely the groups conform to a uniform distribution. The average group size is highlighted because it is easy to calculate and interpret, while the probabilistic method describes how nodes are distributed across groups and so supports more accurate index selection. The article concludes by suggesting that databases like Memgraph could learn user query patterns over time to optimize index selection further.
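
The two statistics named in the summary can be illustrated with a minimal sketch. It is not taken from the article or from Memgraph: it assumes that a "group" is the set of nodes sharing one value of the indexed property, that average group size is the node count divided by the number of distinct values, and that the chi-squared statistic compares the observed group sizes against equal-sized groups. The function names, index names, and sample data are all hypothetical.

```python
from collections import Counter


def group_sizes(values):
    """Return the number of nodes sharing each distinct property value."""
    if not values:
        raise ValueError("an index candidate needs at least one value")
    return list(Counter(values).values())


def average_group_size(values):
    """Assumed definition: node count divided by distinct value count."""
    sizes = group_sizes(values)
    return sum(sizes) / len(sizes)


def chi_squared_uniform(values):
    """Pearson chi-squared statistic against equal-sized groups.

    A value of 0 means every group has the same size; larger values
    mean the group sizes are more skewed.
    """
    sizes = group_sizes(values)
    expected = sum(sizes) / len(sizes)
    return sum((observed - expected) ** 2 / expected for observed in sizes)


# Hypothetical candidates: property values of the nodes each index covers.
candidates = {
    "Person(country)": ["HR", "HR", "HR", "HR", "DE", "DE", "UK", "UK"],
    "Person(city)": ["Zagreb", "Split", "Berlin", "Munich",
                     "London", "Leeds", "Zagreb", "Berlin"],
}

# Prefer the candidate whose lookups are expected to return the fewest nodes.
best = min(candidates, key=lambda name: average_group_size(candidates[name]))

for name, values in candidates.items():
    print(name, average_group_size(values), chi_squared_uniform(values))
print("chosen index:", best)
```

Under these assumptions, a smaller average group size means a lookup on an unknown value is expected to touch fewer nodes, and the chi-squared value indicates how far that average can be trusted: the more skewed the groups, the less representative the average is of any single lookup.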