
Case Study: Catalyst Cooperative - Liberating Public Utility Data with Dagster

Blog post from Dagster

Post Details
Company: Dagster
Date Published:
Author: Fraser Marlow
Word Count: 1,498
Language: English
Hacker News Points: -
Summary

The Public Utility Data Liberation (PUDL) Project, developed by Catalyst Cooperative, aims to make valuable public energy data accessible and easy to use for people working to decarbonize the energy system. The team originally built its ETL pipeline with Python and pandas and ran into three problems: adding a new data source was burdensome, the pipeline lacked parallelism, and interim outputs were hard to access. To address these problems, they adopted Dagster, an open-source data orchestrator, which let them define the pipeline declaratively, simplify their workflow, and shorten development iteration cycles. With Dagster, Catalyst Cooperative can now add new data sources with relative ease, publish cleaned versions of tables, and make interim assets available to users, which improves their ability to scale the project and integrate more diverse datasets.
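
To make the declarative idea concrete, here is a toy sketch in plain Python. It is not Dagster's API and not Catalyst Cooperative's code: the `asset` decorator, the `ASSETS` registry, the `materialize_all` helper, and the sample rows are all hypothetical names and made-up data chosen for illustration. Each function declares the tables it depends on through its parameter names, a small runner works out the execution order from those declarations, and every interim result is kept so it can be inspected afterwards.

```python
from graphlib import TopologicalSorter
import inspect

ASSETS = {}  # hypothetical registry: asset name -> function that builds it


def asset(fn):
    """Register a function as an asset; its parameter names are its upstream assets."""
    ASSETS[fn.__name__] = fn
    return fn


@asset
def raw_plants():
    # Stand-in for an extracted raw table (made-up rows, not real PUDL data).
    return [
        {"plant_id": 1, "capacity_mw": "50"},
        {"plant_id": 2, "capacity_mw": None},
        {"plant_id": 3, "capacity_mw": "120.5"},
    ]


@asset
def clean_plants(raw_plants):
    # Interim asset: drop rows with missing capacity and cast the rest to float.
    return [
        {**row, "capacity_mw": float(row["capacity_mw"])}
        for row in raw_plants
        if row["capacity_mw"] is not None
    ]


@asset
def total_capacity_mw(clean_plants):
    # Final asset: a single summary number built from the cleaned table.
    return sum(row["capacity_mw"] for row in clean_plants)


def materialize_all():
    """Build every asset in dependency order and keep all results, interim ones included."""
    deps = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in ASSETS.items()
    }
    results = {}
    # static_order() yields each asset only after all of its upstream assets.
    for name in TopologicalSorter(deps).static_order():
        upstream = {dep: results[dep] for dep in deps[name]}
        results[name] = ASSETS[name](**upstream)
    return results


results = materialize_all()
print(results["clean_plants"])       # the interim table stays available
print(results["total_capacity_mw"])
```

Adding a new source in this style means writing one more decorated function rather than editing a central script, and because dependencies are explicit, a runner is free to build independent assets in parallel. Dagster's own `@asset` decorator likewise infers upstream assets from parameter names, though its execution, storage, and tooling go well beyond this sketch.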