Content Deep Dive

Delivering billions of messages exactly once

Blog post from Twilio

Post Details
Company: Twilio
Date Published: -
Author: Amir Abu Shareb
Word Count: 3,693
Language: English
Hacker News Points: -
Summary

Twilio Segment built a de-duplication system to deliver messages exactly once through its data pipeline, despite the realities of distributed systems, where data can be delayed or reordered but must not be lost. The previous system, which relied on Memcached for key storage and atomic operations, was costly and memory-intensive. The new system pairs Kafka with RocksDB: each message is assigned a unique ID, stored durably in Kafka, and processed by a Go-based deduplication worker. RocksDB's log-structured merge-tree architecture enables fast writes and efficient key management, while Kafka acts as the source of truth to ensure reliability and consistency in message delivery. By moving key storage from memory to disk and optimizing throughput with partitioning and batched reads and writes, Twilio Segment now handles billions of messages at lower cost and with greater reliability. The architecture has run successfully in production for several months, significantly improving on the limitations of the earlier Memcached-based approach.
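The core check described above, dropping any message whose unique ID has already been seen, can be sketched in Go. This is a minimal illustration, not Segment's implementation: a real worker would back the `seen` set with RocksDB keyed by message ID and commit Kafka offsets atomically with the key writes; a plain in-memory map stands in here so the sketch is self-contained, and the type and message IDs are hypothetical.

```go
package main

import "fmt"

// Deduplicator holds the set of message IDs already processed.
// In the system described in the post, this state lives in RocksDB
// on disk; a map is used here purely for illustration.
type Deduplicator struct {
	seen map[string]struct{}
}

func NewDeduplicator() *Deduplicator {
	return &Deduplicator{seen: make(map[string]struct{})}
}

// Process reports whether the message should be forwarded downstream:
// true on first delivery of an ID, false for any duplicate.
func (d *Deduplicator) Process(messageID string) bool {
	if _, dup := d.seen[messageID]; dup {
		return false // duplicate: drop
	}
	d.seen[messageID] = struct{}{} // record the ID as seen
	return true // first delivery: forward
}

func main() {
	d := NewDeduplicator()
	// Hypothetical message IDs; the third is a redelivery of the first.
	for _, id := range []string{"msg-1", "msg-2", "msg-1"} {
		fmt.Printf("%s forwarded=%v\n", id, d.Process(id))
	}
}
```

Because duplicates are detected by key lookup rather than by content comparison, the approach scales with the number of retained IDs, which is why the post emphasizes RocksDB's write-optimized storage and windowed key retention.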