Company
Date Published
Author
Rebecca Taft
Word count
3491
Language
English
Hacker News points
20

Summary

Rebecca Taft's blog post on CockroachDB 19.1 explains why automatic table statistics collection matters for choosing good query plans. CockroachDB's cost-based optimizer, rebuilt for the 2.1 release, now includes automatic statistics collection, which helps it select efficient plans by estimating each plan's cost in computing resources such as CPU and I/O. The post explains how statistics drive the optimizer's decisions, in particular its estimate of the number of rows processed at each stage of a query plan, and how assumptions of uniformity and independence keep those calculations simple. To keep statistics up to date without burdening users, CockroachDB automatically triggers statistics collection when a significant portion of a table's data has changed, with minimal impact on performance. This removes the difficulties users face when updating statistics manually, such as deciding how often to refresh and handling large data sets, and ensures that queries are optimized for the data as it currently stands.
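
The row-count estimation and refresh trigger summarized above can be illustrated with a minimal Python sketch. It is not CockroachDB's implementation: `ColumnStats`, `estimate_rows`, `needs_refresh`, and the 20% `stale_fraction` default are all hypothetical names and values chosen for illustration, and the simple threshold rule only stands in for whatever triggering policy the post describes.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ColumnStats:
    """Hypothetical per-column statistic: the number of distinct values."""
    distinct_count: int


def equality_selectivity(col: ColumnStats) -> float:
    # Uniformity assumption: every distinct value is equally frequent,
    # so a filter like `col = <constant>` keeps 1/distinct_count of the rows.
    return 1.0 / max(col.distinct_count, 1)


def estimate_rows(table_rows: int, filtered_cols: Sequence[ColumnStats]) -> float:
    # Independence assumption: filters on different columns are uncorrelated,
    # so their selectivities can simply be multiplied together.
    selectivity = 1.0
    for col in filtered_cols:
        selectivity *= equality_selectivity(col)
    return table_rows * selectivity


def needs_refresh(rows_changed: int, table_rows: int,
                  stale_fraction: float = 0.2) -> bool:
    # Illustrative threshold rule (assumed, not taken from the post):
    # refresh once the changed rows reach `stale_fraction` of the table.
    if table_rows == 0:
        return rows_changed > 0
    return rows_changed / table_rows >= stale_fraction


if __name__ == "__main__":
    # Two equality filters on a 1,000,000-row table.
    print(estimate_rows(1_000_000, [ColumnStats(100), ColumnStats(50)]))
    print(needs_refresh(rows_changed=250_000, table_rows=1_000_000))
```

With 100 and 50 distinct values, the two filters have selectivities of 1/100 and 1/50, so the estimate is roughly 200 rows out of a million. If the two columns are in fact correlated, the independence assumption underestimates the true row count, which is the kind of trade-off these simplifying assumptions accept.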