Apache Kafka's log compaction corrupts data. Here's how we fixed it

Post Details

Company

Redpanda

Date Published

June 25, 2026

Author

Alexey Bashtanov

Word Count

2,385

Company Posts That Month

8

Language

English

Hacker News Points

-

Source URL

www.redpanda.com/blog/kafka-log-compaction-bug-fix-streaming

Summary

The blog post describes a critical bug in Apache Kafka's log compaction process that can leave broker replicas with inconsistent data. It explains how compaction retains only the latest value for each key, uses tombstones to mark deletions, and applies expiration-based rules to transaction control batches. Problems arise when a broker falls behind: deleted or aborted data can reappear as committed, committed data can be hidden, or partitions can become frozen. The root cause is a race condition between compaction and replication, in which a broker that misses critical markers may end up with inconsistent data. Redpanda Streaming introduces a coordinated compaction protocol that tracks offsets such as the Maximum Cleanly Compacted Offset (MCCO) and the Maximum Tombstone Removal Offset (MTRO) to ensure all replicas are synchronized before tombstones or transaction markers are removed. This approach prioritizes data safety and still allows optimal cleanup decisions during prolonged node outages, so compaction does not compromise data integrity.
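
The coordination idea in the summary can be illustrated with a small Python sketch. It is a toy model, not Redpanda's or Kafka's implementation: the `Record` type, the function names, and in particular the rule that MTRO equals the lowest MCCO reported by any replica are assumptions made for illustration, since the summary does not say how the two offsets are derived.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Record:
    offset: int
    key: str
    value: Optional[str]  # None marks a tombstone (a delete for this key)


def tombstone_removal_bound(mcco_by_replica: Dict[str, int]) -> int:
    """ASSUMPTION: MTRO is taken as the lowest MCCO across replicas.

    Returns -1 when no replica has reported, so that nothing is removable.
    """
    return min(mcco_by_replica.values(), default=-1)


def compact(log: List[Record], mtro: int) -> List[Record]:
    """Keep the latest record per key; drop a tombstone only at or below mtro."""
    latest_offset: Dict[str, int] = {}
    for rec in log:
        latest_offset[rec.key] = rec.offset

    compacted: List[Record] = []
    for rec in log:
        if latest_offset[rec.key] != rec.offset:
            continue  # superseded by a newer record for the same key
        if rec.value is None and rec.offset <= mtro:
            continue  # every replica has compacted past this tombstone
        compacted.append(rec)
    return compacted


# Hypothetical three-replica partition in which replica "r3" has fallen behind.
log = [
    Record(0, "a", "1"),
    Record(1, "a", None),  # tombstone for key "a"
    Record(2, "b", "x"),
]
lagging = tombstone_removal_bound({"r1": 2, "r2": 2, "r3": 0})
caught_up = tombstone_removal_bound({"r1": 2, "r2": 2, "r3": 2})

kept_while_lagging = [r.offset for r in compact(log, lagging)]
kept_when_caught_up = [r.offset for r in compact(log, caught_up)]
```

In this model the tombstone at offset 1 survives compaction while `r3` lags, so `r3` can still replicate the delete, and it is dropped only once every replica reports a compacted offset at or beyond it.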

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Real-time	4	5,457	1,338	238	-5%