Stop Performing SQL Surgery: Implement Row and Batch Contracts with AWAP
Blog post from dltHub
Data engineering teams have to choose between strict schema enforcement and permissive auto-evolution. Strict enforcement can halt pipelines over minor upstream changes, while permissive evolution can accumulate technical debt and corrupt data. The AWAP (Audit-Write-Audit-Publish) framework offers a middle ground: a multi-gate validation layer that distinguishes recoverable drift from destructive anomalies. Its two-gate architecture separates syntactic validation (row-level failures) from semantic validation (batch-level failures), which prevents issues such as Schema Scars and State Corruption. Using a street survey system as an example, the post shows how AWAP ingests verified data while checking for both syntactic and semantic anomalies before anything is published to production, maintaining system integrity without compromising uptime.
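The two-gate idea can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the post's implementation and not dlt's API: the row contract (`ROW_CONTRACT`), the thresholds (`max_reject_ratio`, `score_range`), and the helper names (`row_is_valid`, `run_awap`, `AwapResult`) are all hypothetical, and an in-memory list stands in for a staging table.

```python
from dataclasses import dataclass, field

# Hypothetical row contract for a street survey record: field name -> expected type.
ROW_CONTRACT = {"survey_id": int, "street": str, "score": int}


@dataclass
class AwapResult:
    published: list = field(default_factory=list)    # rows released to "production"
    quarantined: list = field(default_factory=list)  # rows that failed the row contract
    batch_ok: bool = False
    reason: str = ""


def row_is_valid(row: dict) -> bool:
    """Gate 1 (syntactic): every contract field is present with the expected type."""
    return all(isinstance(row.get(name), kind) for name, kind in ROW_CONTRACT.items())


def run_awap(rows, max_reject_ratio=0.5, score_range=(1, 5)) -> AwapResult:
    result = AwapResult()

    # Audit 1: split incoming rows into staged and quarantined.
    staged = []
    for row in rows:
        if row_is_valid(row):
            staged.append(row)
        else:
            result.quarantined.append(row)

    # Write: `staged` stands in for a staging table; nothing is public yet.

    # Audit 2 (semantic): judge the batch as a whole before publishing.
    if not staged:
        result.reason = "no valid rows"
        return result
    if len(result.quarantined) / len(rows) > max_reject_ratio:
        result.reason = "too many rejected rows"
        return result
    mean_score = sum(r["score"] for r in staged) / len(staged)
    if not score_range[0] <= mean_score <= score_range[1]:
        result.reason = "mean score out of range"
        return result

    # Publish: only a batch that passed both gates becomes visible.
    result.published = staged
    result.batch_ok = True
    return result


if __name__ == "__main__":
    batch = [
        {"survey_id": 1, "street": "Main St", "score": 4},
        {"survey_id": 2, "street": "Oak Ave", "score": "high"},  # fails the row contract
        {"survey_id": 3, "street": "Elm Rd", "score": 5},
    ]
    outcome = run_awap(batch)
    print(outcome.batch_ok, len(outcome.published), len(outcome.quarantined))
```

In this sketch a malformed row is quarantined without stopping the load, while a batch whose aggregate looks wrong (for example, a mean score far outside the expected scale) is held back entirely, so production never sees a partially trusted batch.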