Using ClickHouse to count unique users at scale
Blog post from Twilio
Twilio struggled to count the unique users interacting with its Journeys product accurately: data volume was high, and exact counts were needed over arbitrary date ranges, which standard pre-aggregation techniques could not provide. The team initially queried the data through distributed tables on a self-managed ClickHouse setup, but for high-volume journeys some queries exceeded memory limits or timed out. To address this, Twilio implemented semantic sharding, which routes all of a given user's events to the same node. Because no user then appears on more than one node, each node can count its own distinct users and the per-node counts can simply be summed, which reduced the memory footprint and improved query efficiency. Twilio also hashed UUIDs to integers, which sped up comparisons significantly and cut query times by 80%. These optimizations enabled real-time analysis of journey data and improved scalability, although Twilio acknowledges that further improvements will be needed as product adoption and journey data grow.
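The two optimizations can be illustrated with a small sketch in plain Python rather than ClickHouse SQL. It is not Twilio's implementation: the shard count, the function names, and the choice of BLAKE2b as the hash are all assumptions made for illustration, since the summary names neither the hash function nor the cluster layout. The sketch hashes each UUID to a 64-bit integer, uses that integer to pick a shard so that one user always lands on the same shard, and then sums the per-shard distinct counts.

```python
import hashlib
import uuid
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical cluster size, chosen only for this example


def uuid_to_int64(user_id: str) -> int:
    """Map a UUID string to an unsigned 64-bit integer.

    BLAKE2b is a stand-in here; the hash Twilio actually used is not
    stated in the summary.
    """
    digest = hashlib.blake2b(uuid.UUID(user_id).bytes, digest_size=8).digest()
    return int.from_bytes(digest, "big")


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Semantic sharding: the same user always maps to the same shard."""
    return uuid_to_int64(user_id) % num_shards


def count_unique_users(events, num_shards: int = NUM_SHARDS) -> int:
    """Count distinct users by summing per-shard distinct counts.

    `events` is an iterable of user-ID strings, one per event. Each shard
    keeps only a set of integer keys for its own users, so no single
    shard has to hold the full set of users.
    """
    shards = defaultdict(set)
    for user_id in events:
        key = uuid_to_int64(user_id)
        shards[key % num_shards].add(key)
    # Shards hold disjoint users, so the global count is a plain sum.
    return sum(len(keys) for keys in shards.values())
```

One caveat of this sketch: truncating to 64 bits means two different UUIDs could in principle collide, so the count is exact only barring hash collisions, which are extremely unlikely at realistic user counts.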