
Apache Kafka's log compaction corrupts data. Here's how we fixed it

Blog post from Redpanda

Post Details
Company: Redpanda
Date Published:
Author: Alexey Bashtanov
Word Count: 2,385
Company Posts That Month: 8
Language: English
Hacker News Points: -
Summary

The blog post describes a critical bug in Apache Kafka's log compaction that can leave broker replicas with inconsistent data. Compaction retains only the latest value for each key, represents deletions with tombstones, and removes tombstones and transaction control batches once they expire. Problems arise when a broker falls behind: deleted or aborted data can reappear as committed, committed data can be hidden, or partitions can freeze. The post identifies the root cause as a race between compaction and replication, in which a lagging broker that never received a tombstone or transaction marker ends up with data that disagrees with the other replicas. Redpanda Streaming introduces a coordinated compaction protocol that uses metrics such as the Maximum Cleanly Compacted Offset (MCCO) and the Maximum Tombstone Removal Offset (MTRO) to ensure all replicas are synchronized before tombstones or transaction markers are removed. The approach prioritizes data safety and still allows optimal cleanup decisions during prolonged node outages, so compaction does not compromise data integrity.
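The race described in the summary can be illustrated with a small Python sketch. It is a toy model, not Kafka's or Redpanda's implementation: the log layout, the helper names (`compact`, `fetch`, `materialize`), and in particular the rule that MTRO is the minimum MCCO reported across replicas are assumptions made for illustration, since the summary names the two metrics without defining how they are computed.

```python
from typing import Dict, List, Optional, Tuple

# Toy record: (offset, key, value). A value of None is a tombstone (delete marker).
Record = Tuple[int, str, Optional[str]]

FULL_LOG: List[Record] = [(0, "a", "1"), (1, "b", "2"), (2, "a", None)]


def compact(log: List[Record], tombstone_removal_offset: int) -> List[Record]:
    """Keep the latest record per key; drop tombstones at or below the given offset."""
    latest: Dict[str, Record] = {}
    for record in log:
        latest[record[1]] = record  # later offsets overwrite earlier ones
    kept = sorted(latest.values(), key=lambda r: r[0])
    return [r for r in kept
            if not (r[2] is None and r[0] <= tombstone_removal_offset)]


def fetch(leader_log: List[Record], follower_log: List[Record]) -> List[Record]:
    """Follower appends every leader record past its own last offset."""
    next_offset = follower_log[-1][0] + 1 if follower_log else 0
    return follower_log + [r for r in leader_log if r[0] >= next_offset]


def materialize(log: List[Record]) -> Dict[str, str]:
    """Replay a log into the key/value state a consumer would see."""
    state: Dict[str, str] = {}
    for _, key, value in log:
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


def run_uncoordinated() -> Tuple[Dict[str, str], Dict[str, str]]:
    follower = FULL_LOG[:2]  # lagging replica: has not seen the tombstone yet
    # Leader removes the tombstone on its own schedule, ignoring the follower.
    leader = compact(FULL_LOG, tombstone_removal_offset=2)
    follower = fetch(leader, follower)  # the tombstone is gone, nothing to fetch
    return materialize(leader), materialize(follower)


def run_coordinated() -> Tuple[Dict[str, str], Dict[str, str]]:
    follower = FULL_LOG[:2]
    # Hypothetical reports: highest offset each replica has cleanly compacted.
    mcco = {"leader": 2, "follower": 1}
    mtro = min(mcco.values())  # assumed rule: never remove past the slowest replica
    leader = compact(FULL_LOG, tombstone_removal_offset=mtro)
    follower = fetch(leader, follower)  # the tombstone is still there to replicate
    return materialize(leader), materialize(follower)


if __name__ == "__main__":
    print("uncoordinated:", run_uncoordinated())
    print("coordinated:  ", run_coordinated())
```

In the uncoordinated run the follower keeps key `a` even though the leader has deleted it, which is the "deleted data reappears" scenario. In the coordinated run the tombstone survives until the lagging replica has caught up, so both replicas converge on the same state.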

Trends Found in this Post
Trend      Post Mentions  Total Month Mentions  Posts  Companies  MoM
Real-time  4              5,457                 1,338  238        -5%