Company:
Date Published:
Author: Elliot Gunn
Word count: 2094
Language: English
Hacker News points: None

Summary

This guide is for beginners with a basic understanding of Python who want to start their first data engineering project. It uses Dagster, an open-source data orchestrator, and walks step by step through building a working data pipeline: setting up the project's root directory, launching a virtual environment, installing Dagster and scaffolding an initial project, declaring assets in Dagster, and understanding serialization in Dagster. It also introduces software-defined assets, which support a declarative approach to data management and make pipeline code easier to organize and maintain. As examples, the guide creates two assets, hackernews_top_story_ids and hackernews_top_stories, and shows how to run the pipeline and create (materialize) the assets from Dagster's user interface.