
Testing Before Loading: WAP and AWAP

Blog post from dltHub

Post Details
Company: dltHub
Author: Roshni Melwani, Working Student
Word Count: 1,659
Language: English
Summary

Data pipelines tend to fail in one of two ways: strict pipelines halt at minor upstream changes, while permissive ones accept everything and accrue technical debt. The post presents the Audit-Write-Audit-Publish (AWAP) framework, a variant of Write-Audit-Publish (WAP), as a way to avoid both extremes by using two validation gates. The first gate is syntactic validation at the row level, which stops malformed rows from triggering schema mutations. The second gate is semantic validation at the batch level, which catches anomalies that would otherwise corrupt data integrity. Together the gates separate recoverable drift, such as necessary schema evolution, from destructive anomalies, so the schema can still evolve without sacrificing reliability. Using the Street Survey System as a practical example, the post shows AWAP filtering out untrustworthy data before it reaches production, which avoids extensive post-hoc corrections. The result is a structured approach that keeps the production environment stable and reduces the risk of state corruption.
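The two-gate flow described in the summary can be illustrated with a minimal sketch in plain Python. It is not the post's implementation and does not use dlt's API: the column names, the `EXPECTED_TYPES` schema, the thresholds, and the functions `audit_row`, `audit_batch`, and `run_awap` are all hypothetical, chosen only to show where each gate sits.

```python
from statistics import mean

# Hypothetical schema for a street survey row: column name -> expected type.
EXPECTED_TYPES = {"street_id": int, "surface": str, "width_m": float}


def audit_row(row):
    """Gate 1 (syntactic, row level): required columns exist with the expected types.

    Extra columns are tolerated here, standing in for recoverable schema drift.
    """
    return all(
        name in row and isinstance(row[name], expected)
        for name, expected in EXPECTED_TYPES.items()
    )


def audit_batch(staged, total_received, max_reject_ratio=0.2, width_range=(1.0, 50.0)):
    """Gate 2 (semantic, batch level): the staged batch as a whole looks plausible.

    The two checks (share of rejected rows, mean street width) are assumed examples.
    """
    if not staged:
        return False
    reject_ratio = 1 - len(staged) / total_received
    avg_width = mean(row["width_m"] for row in staged)
    low, high = width_range
    return reject_ratio <= max_reject_ratio and low <= avg_width <= high


def run_awap(incoming, production):
    """Audit rows, write survivors to staging, audit the batch, then publish."""
    staged = [row for row in incoming if audit_row(row)]  # audit + write (staging)
    if audit_batch(staged, len(incoming)):                # audit (batch)
        production.extend(staged)                         # publish
        return True
    return False  # production is left untouched


if __name__ == "__main__":
    production = []
    batch = [{"street_id": i, "surface": "asphalt", "width_m": 6.5} for i in range(5)]
    batch.append({"street_id": 99, "surface": "gravel", "width_m": "wide"})  # malformed
    published = run_awap(batch, production)
    print(published, len(production))
```

In this sketch a malformed row is dropped at the first gate without blocking the rest of the batch, while a batch that fails the second gate never reaches `production`, which mirrors the separation between recoverable drift and destructive anomalies.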