WarpStream is an Apache Kafka protocol-compatible data streaming system built on top of object storage, with zero local disks and no inter-zone bandwidth costs. It separates data from metadata, which allows for a massively parallel write engine without synchronization or serialization issues. A metadata store tracks batch sequence IDs, providing idempotent producer functionality that ensures duplicate batches are dropped before they are written to immutable segment files in object storage. The same separation of data and metadata also enables "retroactive tombstoning", which identifies and drops duplicate batches after they have already been written. Implementing idempotency introduced a performance bottleneck, because compaction is needed to merge smaller batches into larger ones; WarpStream addressed this by modifying the file cache interface to support reading batches in a single RPC.
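To make the sequence-ID tracking and retroactive tombstoning described above more concrete, here is a minimal sketch in Python. It is not WarpStream's implementation: the names `Batch`, `MetadataStore`, `commit`, and `visible` are hypothetical, an in-memory dictionary stands in for the metadata store, and out-of-order batches are simply rejected. The only idea it illustrates is that a duplicate batch sitting in an already-written, immutable file can be hidden by recording a tombstone in metadata.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Batch:
    """A produced batch (hypothetical shape, for illustration only)."""
    producer_id: int
    partition: int
    base_sequence: int  # sequence ID of the first record in the batch
    record_count: int


@dataclass
class MetadataStore:
    """In-memory stand-in for a metadata store (an assumption of this sketch)."""
    # Next expected sequence ID per (producer_id, partition).
    next_sequence: dict = field(default_factory=dict)
    # (file_id, batch_index) pairs marked as duplicates after the file was written.
    tombstones: set = field(default_factory=set)

    def commit(self, file_id: str, batches: list) -> list:
        """Commit the batches of an already-written file; return the accepted ones."""
        accepted = []
        for index, batch in enumerate(batches):
            key = (batch.producer_id, batch.partition)
            expected = self.next_sequence.get(key, 0)
            if batch.base_sequence < expected:
                # Duplicate: the file is immutable, so tombstone the batch
                # in metadata instead of rewriting the file.
                self.tombstones.add((file_id, index))
                continue
            if batch.base_sequence > expected:
                # Simplification: a gap in sequence IDs is treated as an error.
                raise ValueError("out-of-order sequence ID")
            self.next_sequence[key] = expected + batch.record_count
            accepted.append(batch)
        return accepted

    def visible(self, file_id: str, batches: list) -> list:
        """Return the batches of a file that readers should see."""
        return [
            batch
            for index, batch in enumerate(batches)
            if (file_id, index) not in self.tombstones
        ]
```

In this sketch, committing the same batch twice under two different file IDs accepts the first copy and tombstones the second, so `visible` returns an empty list for the second file even though its bytes remain in storage. A later compaction pass could then physically drop the tombstoned batches.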