Company
Date Published
Author
David Wallace
Word count
1361
Language
English
Hacker News points
None

Summary

This case study highlights the implementation of Dagster, a data platform orchestration tool, to manage the complexity and relevancy of Good Eggs' data platform. The company found itself dealing with accumulated tech debt in disparate tools with implicit dependencies, leading to issues with unused dbt models and forgotten Mode reports. To address this, they built a "metapipeline" using Dagster that automates the process of cleaning up these artifacts. The pipeline uses Mode Analytics API to build a graph of all reports and queries defined in Mode, identifies stale reports, archives them, and marks them for future deletion. Additionally, it composes a graph expressing relationships between dbt nodes and downstream consumer nodes, allowing for automated management of exposures. Finally, the metapipeline performs analyses to help identify important dbt models that require optimization work, providing a tangible impact on operations.