
Using ClickHouse to count unique users at scale

Blog post from Twilio

Post Details
Company: Twilio
Author: Rahul Ramakrishna, Lew Gordon, Clayton McClure
Word Count: 1,559
Language: English
Summary

Twilio struggled to count accurately the unique users interacting with its Journeys product: the data volume was high, and the product needed exact counts over arbitrary date ranges, which standard pre-aggregation techniques could not provide. The team first queried the data through distributed tables on a self-managed ClickHouse setup, but for high-volume journeys some queries exceeded memory limits or timed out. To address this, Twilio implemented semantic sharding so that all of a given user's events are consistently routed to the same node, which reduced the memory footprint and improved query efficiency. The team also hashed UUIDs to integers, which significantly sped up comparisons and reduced query times by 80%. Together, these optimizations enabled real-time analysis of journey data and improved scalability, although Twilio acknowledges that further improvements will be needed as product adoption and journey data grow.
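
To make the two optimizations concrete, here is a minimal Python sketch of the idea, not Twilio's implementation. The names `NUM_SHARDS`, `user_key`, `shard_for`, and `count_unique_users` are hypothetical, and the choice of BLAKE2b as the hash is an assumption; the post does not say which hash function was used. The sketch shows why routing every event for a user to the same shard lets each shard count its own distinct users independently, so the global count is a plain sum of per-shard counts rather than a merge of large user sets.

```python
import hashlib
import uuid

NUM_SHARDS = 4  # hypothetical cluster size, chosen only for illustration


def user_key(user_id: str) -> int:
    """Map a UUID string to a 64-bit integer.

    Assumption: BLAKE2b truncated to 8 bytes stands in for whatever
    hash the real system uses. Integer keys are cheaper to compare
    and store than 36-character UUID strings.
    """
    digest = hashlib.blake2b(uuid.UUID(user_id).bytes, digest_size=8).digest()
    return int.from_bytes(digest, "big")


def shard_for(user_id: str) -> int:
    """Semantic sharding: the same user always lands on the same shard."""
    return user_key(user_id) % NUM_SHARDS


def count_unique_users(events):
    """Count distinct users across shards.

    `events` is an iterable of user-ID strings, one per event.
    Each shard keeps only its own users, so no user appears on two
    shards and the per-shard counts can simply be added together.
    """
    shards = [set() for _ in range(NUM_SHARDS)]
    for user_id in events:
        key = user_key(user_id)
        shards[key % NUM_SHARDS].add(key)
    return sum(len(shard) for shard in shards)


if __name__ == "__main__":
    users = [str(uuid.uuid4()) for _ in range(5)]
    events = users + users[:2]  # two users appear twice
    print(count_unique_users(events))
```

One caveat of this sketch: hashing to 64 bits means two different UUIDs could in principle collide and be counted once, so the count is exact only up to that very small collision probability.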