Introducing self-describing JSONs
Blog post from Snowplow
The concept of self-describing JSONs, proposed by Snowplow Analytics, aims to address gaps in data schema knowledge by embedding schema information directly within JSON data, thus simplifying schema evolution and historical data management. This approach involves enhancing JSON Schema with a "self" property that details the vendor, object type, format, and version, and modifying individual JSON objects to include a "schema" field that uniquely identifies the associated JSON Schema. This methodology facilitates a two-pass validation process, ensuring that JSON instances are verified against their claimed schemas. The initiative is inspired by the need for clearer documentation and validation of unstructured events and custom contexts, with potential applications in programmatically generating database and schema objects. Snowplow intends to utilize this system internally and invites community feedback to refine the draft proposal before full implementation in future releases.