How we measure data completeness at scale

Post Details

Company

Datadog

Date Published

July 1, 2026

Author

Valentin Touffet, Alexandre Olivier

Word Count

3,664

Company Posts That Month

4

Language

English

Source URL

www.datadoghq.com/blog/engineering/data-pipeline-completeness

Summary

Datadog's Data Completeness team has built a system that measures the integrity and completeness of data across the company's distributed ingestion pipelines, which handle billions of payloads per second. Incomplete data can lead to flawed automated decisions and misleading customer-facing dashboards, so this measurement underpins the reliability of both. The team divides pipelines into segments and tracks payloads as they traverse each one, comparing create events with acknowledgment events to gauge completeness. A time-bucket model makes this accounting idempotent and minimizes external dependencies, so the system keeps working even when other systems are degraded. A load-shedding mechanism dynamically adjusts sampling to preserve accuracy without incurring prohibitive costs. For resilience, the completeness system is deployed independently across multiple availability zones and uses custom in-memory storage to handle the data volume efficiently. It also integrates metadata to give real-time insight into pipeline topology, which supports incident response: pipeline issues can be detected and mitigated quickly, and the same data feeds ongoing automation and scalability efforts.
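The time-bucket and sampling ideas in the summary can be illustrated with a small sketch. This is not Datadog's implementation: the class `SegmentTracker`, the function `is_sampled`, the 10-second bucket width, and the choice to key both events by the payload's creation timestamp are all assumptions made for illustration. Storing payload IDs in per-bucket sets means a replayed event changes nothing, which is one simple way to get idempotency; hashing the payload ID gives a sampling decision that any component can recompute without coordination.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass, field

BUCKET_SECONDS = 10  # hypothetical bucket width, not taken from the post


@dataclass
class Bucket:
    created: set = field(default_factory=set)  # payload IDs seen entering
    acked: set = field(default_factory=set)    # payload IDs acknowledged


class SegmentTracker:
    """Tracks completeness for one pipeline segment, in memory (illustrative)."""

    def __init__(self, bucket_seconds=BUCKET_SECONDS):
        self.bucket_seconds = bucket_seconds
        self.buckets = defaultdict(Bucket)

    def _bucket_start(self, ts):
        # Round a timestamp down to the start of its bucket.
        return int(ts // self.bucket_seconds) * self.bucket_seconds

    def record_create(self, payload_id, created_ts):
        self.buckets[self._bucket_start(created_ts)].created.add(payload_id)

    def record_ack(self, payload_id, created_ts):
        # Assumption: the ack carries the payload's creation timestamp, so it
        # lands in the same bucket as the matching create event.
        self.buckets[self._bucket_start(created_ts)].acked.add(payload_id)

    def completeness(self, bucket_start):
        bucket = self.buckets.get(bucket_start)
        if bucket is None or not bucket.created:
            return None  # nothing was created in this bucket
        return len(bucket.created & bucket.acked) / len(bucket.created)


def is_sampled(payload_id, sample_rate):
    """Deterministic sampling: the same ID always gets the same decision."""
    digest = hashlib.sha256(payload_id.encode()).digest()
    value = int.from_bytes(digest[:8], "big")  # uniform in [0, 2**64)
    return value < int(sample_rate * 2**64)


tracker = SegmentTracker()
for pid in ("a", "b", "c", "d"):
    tracker.record_create(pid, 103.0)
tracker.record_create("a", 103.0)   # duplicate delivery, no effect
for pid in ("a", "b", "c"):
    tracker.record_ack(pid, 103.0)
tracker.record_ack("a", 103.0)      # duplicate ack, no effect
print(tracker.completeness(100))    # 0.75
```

In this sketch, load shedding would amount to lowering `sample_rate` under pressure. Because the decision depends only on the payload ID, the create and acknowledgment sides select the same payloads, so the ratio stays meaningful on the sampled subset.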
