
Row vs. Batch Contracts: Using AWAP to Prevent Schema Scars and State Corruption with dlt

Blog post from dltHub

Post Details
Company: dltHub
Date Published:
Author: Roshni Melwani, Working Student
Word Count: 1,755
Language: English
Hacker News Points: -
Summary

The post examines a core trade-off in data ingestion: how strict a pipeline should be. Strict pipelines break whenever an upstream source makes even a minor change, while permissive pipelines let harmful data through and risk schema scars and state corruption. As a middle ground, the article introduces AWAP (Audit-Write-Audit-Publish), a two-gate validation approach that separates syntactic checks from semantic ones, allowing safe schema evolution while blocking destructive anomalies. With AWAP, data engineers can avoid manual interventions and keep production tables a reliable source of truth while still accommodating unavoidable upstream drift. The post illustrates the idea with a street survey system, showing how AWAP filters out both row-level and batch-level data issues.
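
To make the two-gate flow in the summary concrete, here is a minimal sketch in plain Python. It is not taken from the post and does not use dlt's API. The field names, the `Tables` container, the `awap` function, and the 50% volume-drop threshold are all hypothetical. Treating the first gate as a row-level (syntactic) check and the second as a batch-level (semantic) check is an assumption based on the summary's wording.

```python
from dataclasses import dataclass, field

# Hypothetical row shape for a street survey; these field names are assumptions.
REQUIRED_FIELDS = {"street_id": str, "surveyed_at": str, "condition_score": int}


def audit_row(row: dict) -> bool:
    """Gate 1 (syntactic, per row): required fields exist with the expected types.

    Extra, unknown fields are tolerated so the schema can still evolve.
    """
    return all(
        name in row and isinstance(row[name], expected)
        for name, expected in REQUIRED_FIELDS.items()
    )


def audit_batch(rows: list, previous_count: int, max_drop: float = 0.5) -> bool:
    """Gate 2 (semantic, per batch): reject a batch whose volume collapses.

    The volume check and its threshold are illustrative assumptions.
    """
    if previous_count == 0:
        return len(rows) > 0
    return len(rows) >= previous_count * (1 - max_drop)


@dataclass
class Tables:
    # In-memory stand-ins for staging, production, and quarantine tables.
    staging: list = field(default_factory=list)
    production: list = field(default_factory=list)
    quarantine: list = field(default_factory=list)


def awap(batch: list, tables: Tables, previous_count: int) -> bool:
    """Run Audit -> Write -> Audit -> Publish; return True if the batch was published."""
    # Audit (gate 1): split rows into those that pass and those that fail.
    passed, failed = [], []
    for row in batch:
        (passed if audit_row(row) else failed).append(row)
    tables.quarantine.extend(failed)

    # Write: only rows that passed gate 1 reach staging.
    tables.staging = passed

    # Audit (gate 2): inspect the staged batch as a whole.
    if not audit_batch(tables.staging, previous_count):
        # Production is untouched; the staged rows stay behind for inspection.
        return False

    # Publish: promote the staged rows to production and clear staging.
    tables.production.extend(tables.staging)
    tables.staging = []
    return True
```

In this sketch a malformed row is quarantined without stopping the load, while a batch that fails the second gate never reaches production. That mirrors the middle ground the summary describes between strict and permissive pipelines.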