Why Adding More Indexes Eventually Makes Things Worse
Blog post from Tiger Data
In this article, Matty Stratton examines the cost of adding indexes in PostgreSQL, particularly under high-frequency data ingestion. Indexes can speed up queries significantly by avoiding sequential scans, but they also impose a "write tax": every row insertion must write an entry to each index in addition to the row itself. At high insertion rates, this tax shows up as write amplification and increased latency. Timestamp indexes, common on time-series tables, exacerbate the problem because new entries concentrate on a "hot right edge" of the index, causing frequent page splits and index bloat. The result is a feedback loop: slow queries prompt more indexes, and each new index adds write work that competes with autovacuum for I/O resources, so write performance paradoxically gets worse. Stratton suggests reconsidering the storage model, for example moving from row-based to columnar storage, which can mitigate these issues by reducing write amplification and changing the cost structure of data operations.
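To make the "write tax" concrete, here is a minimal back-of-the-envelope sketch in Python. It is not taken from the article: the `IngestProfile` type, the function names, and the 50,000 rows/s figure are all hypothetical, and the model deliberately counts only one table write plus one write per index for each inserted row. It ignores WAL, page splits, partial indexes, and autovacuum, so real costs may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestProfile:
    """Hypothetical workload description; every field is illustrative."""
    rows_per_second: int
    index_count: int


def writes_per_row(index_count: int) -> int:
    # Toy assumption: one write for the table row, plus one entry
    # written into each index on the table.
    return 1 + index_count


def writes_per_second(profile: IngestProfile) -> int:
    # Total logical writes the storage layer must absorb each second.
    return profile.rows_per_second * writes_per_row(profile.index_count)


def amplification(profile: IngestProfile) -> float:
    # Ratio of total writes to the writes needed for the rows alone.
    return writes_per_second(profile) / profile.rows_per_second


# Assumed ingest rate of 50,000 rows/s with 0, 2, and 5 indexes.
for n in (0, 2, 5):
    p = IngestProfile(rows_per_second=50_000, index_count=n)
    print(n, writes_per_second(p), amplification(p))
# prints:
# 0 50000 1.0
# 2 150000 3.0
# 5 300000 6.0
```

Even in this simplified model, the write load grows linearly with the number of indexes while the ingest rate stays fixed, which is the shape of the feedback loop described above: each index added to fix a slow query raises the per-row cost of every future insert.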