When Sync Isn't Enough
Blog post from Dagster
dagster-async-executor is a custom async executor for Dagster. It is designed to support high-concurrency workloads, async-native libraries, and incremental adoption in data workflows, without changing how runs are initiated or monitored. The executor lets asynchronous operations run naturally inside Dagster jobs: a single job can mix synchronous and asynchronous tasks, and dynamic and fan-out graphs are supported. This targets a common problem in modern data engineering, where much of the work is I/O-bound (real-time enrichment, high-fan-out inference) and can be handled efficiently within a single run worker. By adding an async orchestration layer, the executor increases concurrency and reduces latency for I/O-heavy workloads while remaining compatible with existing Dagster tooling and interfaces. It is most effective for latency-sensitive tasks made up of many small I/O operations. It is not intended to replace traditional distributed compute frameworks such as Dask or Celery for CPU-intensive jobs.
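To make the concurrency argument concrete, here is a minimal sketch of high-fan-out, I/O-bound work running inside one process. It uses only Python's standard `asyncio` and does not show the dagster-async-executor API; the names `enrich`, `fan_out`, `run_demo`, and `max_in_flight` are hypothetical, and `asyncio.sleep` stands in for a real network call.

```python
import asyncio
import time


async def enrich(record_id: int, limiter: asyncio.Semaphore) -> dict:
    """Hypothetical enrichment step; the sleep simulates a small network call."""
    async with limiter:  # cap the number of calls in flight at once
        await asyncio.sleep(0.05)
        return {"id": record_id, "enriched": True}


async def fan_out(record_ids, max_in_flight: int = 50) -> list:
    """Run one enrichment per record concurrently on a single event loop."""
    limiter = asyncio.Semaphore(max_in_flight)
    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(enrich(r, limiter) for r in record_ids))


def run_demo(n: int = 100):
    """Return the results and the wall-clock time for n simulated calls."""
    start = time.perf_counter()
    results = asyncio.run(fan_out(list(range(n))))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_demo()
    print(f"{len(results)} records enriched in {elapsed:.2f}s")
```

Run one after another, 100 calls of 50 ms each would take about 5 seconds. Because the waits overlap here, the total stays close to the duration of a few calls. The same reasoning explains why this approach helps little with CPU-intensive work, where there is no waiting to overlap.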