
Futureproofing Tines: Partitioning a 17TB table in PostgreSQL

Blog post from Tines

Post Details
Company: Tines
Date Published: -
Author: Julianne Walker
Word Count: 4,361
Language: English
Hacker News Points: -
Summary

Tines faced a critical engineering challenge as their PostgreSQL output_payloads table, which held essential event data in JSON format, approached PostgreSQL's 32TB table size limit, posing risks to workflow continuity. To address this, they explored partitioning strategies for the rapidly growing 17TB table. Partitioning was preferred over sharding to avoid operational complexity: it divides a large table into smaller, more manageable sub-tables and improves query performance by reducing the amount of data scanned per query.

After experimenting with several strategies, they settled on a two-level partitioning approach based on root_story_id and id, which balanced data distribution across partitions and enabled efficient querying. Getting the most out of this layout required reverse-engineering PostgreSQL's hash-based partition routing so that queries could target the relevant partition directly, significantly reducing CPU load and execution times.

The migration was rolled out in phases: dual writes to the old and new tables, verification of data consistency using GitHub's scientist library, and a gradual shift to reading primarily from the newly partitioned table. Despite the challenges, this approach eliminated the inefficiencies of range queries over the monolithic table and ensured consistent performance, showcasing the importance of systematic experimentation and testing in database management.
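A two-level layout of the kind described, hash partitioning on root_story_id with range sub-partitioning on id, can be sketched in PostgreSQL DDL. The table name, column types, modulus, and range bounds below are assumptions for illustration, not the actual schema from the post:

```sql
-- Hypothetical two-level layout: hash on root_story_id, then range on id.
CREATE TABLE output_payloads_partitioned (
    id            bigint NOT NULL,
    root_story_id bigint NOT NULL,
    payload       jsonb,
    PRIMARY KEY (root_story_id, id)
) PARTITION BY HASH (root_story_id);

-- One of N first-level hash partitions (modulus/remainder are assumed).
CREATE TABLE output_payloads_h0
    PARTITION OF output_payloads_partitioned
    FOR VALUES WITH (MODULUS 16, REMAINDER 0)
    PARTITION BY RANGE (id);

-- A second-level range partition over id within that hash bucket.
CREATE TABLE output_payloads_h0_r0
    PARTITION OF output_payloads_h0
    FOR VALUES FROM (0) TO (100000000);
```

Note that the primary key includes both partitioning columns, which PostgreSQL requires for unique constraints on partitioned tables.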
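The idea behind querying partitions directly can be illustrated schematically: a row lands in the sub-table whose index is a hash of the key modulo the partition count, so a client that can compute that index can name the sub-table in its query. The sketch below is a simplification with illustrative names only; PostgreSQL's actual routing uses its internal extended hash functions with a fixed seed, which is what the post describes reverse-engineering:

```python
# Schematic sketch of hash-based partition routing.
# NOTE: this is NOT a reimplementation of PostgreSQL's hash; it only
# illustrates the modulus/remainder idea. All names are assumptions.

MODULUS = 16  # assumed number of first-level hash partitions

def partition_for(root_story_id: int, modulus: int = MODULUS) -> str:
    """Map a root_story_id to the name of its first-level hash partition."""
    remainder = hash(root_story_id) % modulus  # stand-in for PostgreSQL's hash
    return f"output_payloads_h{remainder}"

# Knowing the target partition lets a query bypass the parent table's
# routing and hit the sub-table directly, e.g.:
#   SELECT payload FROM output_payloads_h5 WHERE root_story_id = 21;
```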
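The verification step follows the pattern of GitHub's scientist library (a Ruby gem): always return the result of the existing code path, run the new path on a sample of requests, and record any mismatch without ever letting the new path affect callers. A minimal Python sketch of that pattern, with all function names assumed:

```python
import logging
import random

def run_experiment(name, control, candidate, sample_rate=0.01):
    """Return the control result; compare against candidate on a sample."""
    result = control()
    if random.random() < sample_rate:
        try:
            if candidate() != result:
                logging.warning("experiment %s: results mismatched", name)
        except Exception:
            # A failing candidate must never affect the caller.
            logging.exception("experiment %s: candidate raised", name)
    return result

# Example: reading a payload from the old table vs. the new partitioned one.
value = run_experiment(
    "read-output-payload",
    control=lambda: "payload-from-old-table",
    candidate=lambda: "payload-from-new-table",
    sample_rate=1.0,  # sample everything in this demo
)
```

Because the control result is always returned, sampling can be ramped up gradually before reads are cut over to the new table.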