Company
Date Published
Author
Adrian Brudaru
Word count
1102
Language
English
Hacker News points
None

Summary

Schema evolution combines technical automation with human curation to turn unstructured data into structured database tables. The post argues that data should be structured explicitly upfront, rather than structured implicitly on read or left for analysts to sort out. The common approach tends to produce untested parsing code and silent bugs that reach production. A better approach is to automate the technical steps (structuring, typing, and normalization) and to decouple curation from them. Data contracts make this possible: they define a schema, test incoming data for conformance, and notify producers and curators of violations. The implementation uses tools like dlt to infer and version schemas automatically, then defines notification channels, captures load job info, and sends it to the notification hook.
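The contract loop described above (define a schema, test for conformance, notify on violations) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the post's implementation: it does not call dlt, the names `infer_schema`, `find_violations`, and `check_and_notify` are hypothetical, the contract is assumed to be a flat mapping of column name to type name, and the `notify` callable stands in for whatever hook (chat webhook, email, ticket) a team actually uses.

```python
from typing import Any, Callable

# Hypothetical contract format: column name -> expected Python type name.
Schema = dict[str, str]


def infer_schema(rows: list[dict[str, Any]]) -> Schema:
    """Infer a flat column -> type-name mapping from sample rows."""
    schema: Schema = {}
    for row in rows:
        for column, value in row.items():
            if value is None:
                continue  # a null carries no type information
            # Keep the first type seen for each column.
            schema.setdefault(column, type(value).__name__)
    return schema


def find_violations(contract: Schema, observed: Schema) -> list[str]:
    """List the differences between the agreed contract and the observed data."""
    violations = []
    for column, type_name in sorted(observed.items()):
        if column not in contract:
            violations.append(f"new column '{column}' ({type_name})")
        elif contract[column] != type_name:
            violations.append(
                f"column '{column}' changed type: {contract[column]} -> {type_name}"
            )
    for column in sorted(set(contract) - set(observed)):
        violations.append(f"missing column '{column}'")
    return violations


def check_and_notify(
    contract: Schema,
    rows: list[dict[str, Any]],
    notify: Callable[[str], None],
) -> bool:
    """Return True if the rows conform; otherwise send one message and return False."""
    violations = find_violations(contract, infer_schema(rows))
    if violations:
        lines = "\n".join(f"- {v}" for v in violations)
        notify("Schema contract violated:\n" + lines)
    return not violations


# Usage: collect messages in a list instead of posting to a real hook.
contract = {"id": "int", "email": "str"}
rows = [{"id": 1, "email": "a@example.com", "plan": "pro"}]
messages: list[str] = []
ok = check_and_notify(contract, rows, messages.append)
```

Passing `notify` as a callable keeps the conformance check separate from the delivery channel, which mirrors the post's point about decoupling curation from the technical process: the check can run on every load, while the people who receive the message decide whether the change is accepted into the contract.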