
Taking out the Trash: Garbage Collection of Object Storage at Massive Scale

Blog post from WarpStream

Post Details
Company: WarpStream
Author: Richard Artoul
Word Count: 2,588
Language: English
Summary

Over the past decade, the author has repeatedly tackled the problem of efficiently removing logically deleted files from object storage in distributed systems. In WarpStream the problem is further complicated by the need to stay compatible with Apache Kafka. Common methods such as bucket policies or synchronous deletion have proven inadequate because they cope poorly with variable retention policies and with queries that are still in flight.

Two approaches work better. The first is a delayed queue: a file is deleted from the metadata store and enqueued for physical deletion after a delay long enough for live queries to finish. The second is asynchronous reconciliation: a background process scans the object store to identify and remove orphaned files, meaning files that exist in the object store but are no longer tracked in the metadata store.

WarpStream initially relied on reconciliation alone because it scales well and also catches orphaned files. As customer demands grew, a hybrid method emerged that adds an "optimistic deletion queue", which reduces costs by deleting files soon after compactions rather than waiting for the next reconciliation scan. This hybrid balances efficiency, cost, and system integrity, and it is the author's preferred way to manage object storage cleanup in distributed systems.
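The delayed queue and the reconciliation scan described above can be sketched together in a few lines. This is a minimal in-memory model written for illustration, not WarpStream's implementation: the `Cleaner` class, its method names, and the `delete_delay` and `min_orphan_age` parameters are all hypothetical. The minimum orphan age is an added assumption, there so that the scan does not remove a file that was just uploaded but not yet committed to metadata.

```python
import heapq


class Cleaner:
    """Toy model of a metadata store, an object store, and two cleanup paths."""

    def __init__(self, delete_delay, min_orphan_age):
        self.live = set()      # metadata store: files that are logically alive
        self.objects = {}      # object store: file name -> creation time
        self.queue = []        # min-heap of (due_time, file name)
        self.pending = set()   # files waiting in the delayed queue
        self.delete_delay = delete_delay
        self.min_orphan_age = min_orphan_age

    def write(self, name, now, commit=True):
        # Upload the file; commit=False models a writer that crashed
        # before recording the file in metadata (an orphan).
        self.objects[name] = now
        if commit:
            self.live.add(name)

    def logical_delete(self, name, now):
        # Remove from metadata immediately, but defer the physical delete
        # so that queries already reading the file can finish.
        self.live.discard(name)
        self.pending.add(name)
        heapq.heappush(self.queue, (now + self.delete_delay, name))

    def drain_queue(self, now):
        # Physically delete every queued file whose delay has elapsed.
        deleted = []
        while self.queue and self.queue[0][0] <= now:
            _, name = heapq.heappop(self.queue)
            self.pending.discard(name)
            if self.objects.pop(name, None) is not None:
                deleted.append(name)
        return deleted

    def reconcile(self, now):
        # Scan the object store for files that metadata does not know about.
        # Skip files still in the delayed queue and files younger than
        # min_orphan_age (assumption: they may not be committed yet).
        orphans = [
            name
            for name, created in self.objects.items()
            if name not in self.live
            and name not in self.pending
            and now - created >= self.min_orphan_age
        ]
        for name in orphans:
            del self.objects[name]
        return sorted(orphans)
```

In this model the queue plays the role of the fast path (a file can be removed shortly after it is logically deleted), while `reconcile` acts as the backstop that catches anything the queue never saw, such as a file uploaded by a writer that failed before committing it.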