Company
Date Published
Author
-
Word count
1708
Language
-
Hacker News points
None

Summary

The Persistent Queue (PQ) was introduced as a beta feature in Logstash 5.1 and became production-ready in version 5.4. It adds disk-based resiliency to reduce data loss when Logstash fails. Unlike the default in-memory queue, PQ stores incoming data on disk and replays any unacknowledged data on restart, which provides At-Least-Once delivery guarantees. It helps prevent data loss during application crashes and absorbs ingestion spikes by buffering them on disk rather than immediately applying backpressure, but it does not protect against storage hardware failures unless external replication technologies are used. The feature has a performance cost: tests showed roughly a 10% reduction in throughput compared to the in-memory queue, depending on the specific configuration and hardware. Disk usage is bounded by the queue.page_capacity, queue.max_events, and queue.max_bytes settings, and the checkpoint settings trade reliability against performance. PQ is most useful for preserving data across abnormal shutdowns, although At-Least-Once delivery means that some events may be processed more than once after a restart. Users are encouraged to test PQ with their own configurations to measure the performance impact and to provide feedback for further improvements.
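
To make the At-Least-Once behaviour concrete, here is a minimal toy model in Python. It is not Logstash's implementation: the class `ToyPersistentQueue`, its methods, and the function `run_with_crash` are hypothetical names invented for this sketch, and a plain in-memory list stands in for the on-disk pages. It only illustrates why events that were delivered but not acknowledged before a crash are delivered again after a restart.

```python
class ToyPersistentQueue:
    """Toy stand-in for a disk-backed queue (hypothetical, not Logstash code)."""

    def __init__(self, max_events=0):
        # In this toy model 0 means "no limit"; the name is borrowed from
        # the queue.max_events setting purely for illustration.
        self.max_events = max_events
        self._durable = []   # (sequence number, event) pairs not yet acknowledged
        self._next_seq = 0

    def write(self, event):
        """Persist an event; refuse it when the queue is full (backpressure)."""
        if self.max_events and len(self._durable) >= self.max_events:
            return False
        self._durable.append((self._next_seq, event))
        self._next_seq += 1
        return True

    def read_batch(self, size):
        """Return the oldest unacknowledged events without removing them."""
        return list(self._durable[:size])

    def ack(self, seqs):
        """Drop events only once downstream confirms they were processed."""
        done = set(seqs)
        self._durable = [(s, e) for s, e in self._durable if s not in done]


def run_with_crash():
    """Simulate a crash between delivery and acknowledgement."""
    queue = ToyPersistentQueue()
    for name in ["a", "b", "c", "d"]:
        queue.write(name)

    delivered = []

    # First batch: delivered and acknowledged normally.
    batch = queue.read_batch(2)
    delivered.extend(e for _, e in batch)
    queue.ack(s for s, _ in batch)

    # Second batch: delivered, but the process "crashes" before the ack.
    batch = queue.read_batch(2)
    delivered.extend(e for _, e in batch)

    # After restart the unacknowledged events are still queued and are replayed.
    batch = queue.read_batch(2)
    delivered.extend(e for _, e in batch)
    queue.ack(s for s, _ in batch)
    return delivered, queue
```

Calling `run_with_crash()` yields a delivery log in which the two unacknowledged events appear twice, which is the duplicate processing the summary warns about. Nothing is lost, so downstream consumers that need exactly-once semantics would have to deduplicate on their side.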