
How to use Starburst and Airflow to create resilient data pipelines

Blog post from Starburst

Post Details
Company: Starburst
Author: Yusuf Cattaneo
Word Count: 985
Language: English
Summary

The post describes integrating Starburst Galaxy with Apache Airflow to build resilient, efficient data pipelines, using fault-tolerant execution to improve both reliability and speed. It walks through a Directed Acyclic Graph (DAG) with three tasks: querying data with the Starburst Galaxy engine, printing the number of records via Python, and running a data quality check with the SQLColumnCheckOperator, demonstrating how Airflow manages and monitors the workflow end to end. The post then covers the practical steps for setting up and running this environment, highlighting how resource-aware scheduling and granular, task-level retries improve the handling of ETL workloads, and closes by encouraging readers to try the combination of Starburst Galaxy and Airflow for scalable, reliable data solutions.
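The three-task DAG described in the summary could be sketched roughly as follows. This is a hedged illustration, not the blog's actual code: the connection id `starburst_galaxy`, the DAG id, the table name, and the SQL query are all assumptions, and the example assumes the `apache-airflow-providers-common-sql` package plus a Trino-compatible Airflow connection pointing at Starburst Galaxy.

```python
# Hypothetical sketch of the DAG described above; ids, table names,
# and the connection id "starburst_galaxy" are assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.common.sql.operators.sql import (
    SQLColumnCheckOperator,
    SQLExecuteQueryOperator,
)


def print_record_count(ti):
    # Pull the rows the query task pushed to XCom and report the count.
    rows = ti.xcom_pull(task_ids="query_orders")
    print(f"Fetched {len(rows)} records")


with DAG(
    dag_id="starburst_galaxy_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    # Task 1: run a query against Starburst Galaxy.
    query_orders = SQLExecuteQueryOperator(
        task_id="query_orders",
        conn_id="starburst_galaxy",  # Trino-compatible connection (assumed)
        sql="SELECT * FROM tpch.sf1.orders LIMIT 100",
        do_xcom_push=True,
    )

    # Task 2: print the number of records via Python.
    count_records = PythonOperator(
        task_id="count_records",
        python_callable=print_record_count,
    )

    # Task 3: data quality check with SQLColumnCheckOperator.
    quality_check = SQLColumnCheckOperator(
        task_id="quality_check",
        conn_id="starburst_galaxy",
        table="tpch.sf1.orders",
        column_mapping={"orderkey": {"null_check": {"equal_to": 0}}},
    )

    query_orders >> count_records >> quality_check
```

With this structure, a transient failure in any single task triggers only that task's retry rather than a rerun of the whole pipeline, which is the granular-retry behavior the post emphasizes.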