Soda Data Quality
Blog post from Soda
Soda, a data observability platform, targets silent data quality problems: issues that go undetected and put data-driven products at risk. Its founders moved from software engineering to data engineering and set out to bring the testing and monitoring practices of software development to data. In February 2021 Soda launched Soda SQL, an open-source tool that lets data engineers define what good data quality means and then test and monitor for it inside their existing data workflows. Soda SQL uses YAML config files and SQL to test data; it detects invalid, missing, or unexpected data and can stop a pipeline so that the problem does not spread downstream. Soda Cloud, a web application, extends Soda SQL with a collaborative platform that visualizes metrics and test results over time, so that non-technical users can also take part in monitoring data quality. Together the two tools let data teams address data issues before they cause harm, and integrations with communication tools such as email and Slack help align stakeholders on quality expectations.
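To make the testing idea concrete, here is a minimal Python sketch of the general pattern described above: checks expressed as SQL metrics with pass conditions, run against a table, with a failure stopping the pipeline. It is not Soda SQL's actual YAML syntax or API. The `orders` table, the `CHECKS` structure, and the `run_checks`, `gate`, and `demo` functions are all hypothetical names chosen for illustration, and SQLite stands in for a real warehouse.

```python
import sqlite3

# Hypothetical check definitions (not Soda SQL syntax): each entry pairs
# a SQL metric query with a condition the metric value must satisfy.
CHECKS = [
    {"name": "row_count > 0",
     "sql": "SELECT COUNT(*) FROM orders",
     "passes": lambda value: value > 0},
    {"name": "missing_count(customer_id) = 0",
     "sql": "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL",
     "passes": lambda value: value == 0},
    {"name": "invalid_count(amount) = 0",
     "sql": "SELECT COUNT(*) FROM orders WHERE amount < 0",
     "passes": lambda value: value == 0},
]


class DataQualityError(Exception):
    """Raised to stop a pipeline when at least one check fails."""


def run_checks(conn, checks):
    """Run each metric query and return (name, value, passed) tuples."""
    results = []
    for check in checks:
        value = conn.execute(check["sql"]).fetchone()[0]
        results.append((check["name"], value, check["passes"](value)))
    return results


def gate(results):
    """Raise if any check failed, so later pipeline steps do not run."""
    failed = [name for name, _, passed in results if not passed]
    if failed:
        raise DataQualityError("failed checks: " + ", ".join(failed))


def demo():
    """Build a small in-memory table with bad rows and run the checks."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 10, 25.0), (2, None, 40.0), (3, 12, -5.0)],
    )
    return run_checks(conn, CHECKS)
```

In a pipeline, `gate(run_checks(conn, CHECKS))` would sit between the load step and any downstream step, so a missing `customer_id` or a negative `amount` raises `DataQualityError` before the data is used. Recording the `(name, value, passed)` tuples over time is the kind of history that a dashboard could then plot.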