3.7x Faster EL Pipelines: Arrow + ADBC vs. SQLAlchemy

Post Details

Company

dltHub

Date Published

Feb. 3, 2026

Author

Aman Gupta, Data Engineer

Word Count

920

Language

English

Hacker News Points

-

Source URL

dlthub.com/blog/arrow-adbc-vs-sqlalchemy

Summary

Aman Gupta, a Data Engineer, compares Apache Arrow and ADBC against SQLAlchemy for EL pipelines that transfer data from DuckDB to MySQL. In the experiment, switching to Arrow's columnar data format and ADBC bulk loading cut the load time from 344 seconds to 92 seconds, a speedup of roughly 3.7x. The gain comes from minimizing Python object handling and serialization costs, which shifts the bottleneck away from the CPU. Because Arrow keeps data in an in-memory columnar format, it avoids the overhead of row-based data structures, which streamlines data movement, reduces compute costs, and raises throughput. Using dlt with Arrow also simplifies the pipeline architecture, leaving fewer moving parts and easier maintenance while keeping performance high.
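
The headline figure follows directly from the two timings quoted in the summary. As a quick check of the arithmetic, here is a minimal sketch; the variable names are illustrative and not taken from the post, and only the 344-second and 92-second values come from the source.

```python
# Timings quoted in the summary, in seconds.
sqlalchemy_seconds = 344   # SQLAlchemy-based load
arrow_adbc_seconds = 92    # Arrow + ADBC load

# Speedup is the ratio of the old run time to the new one.
speedup = sqlalchemy_seconds / arrow_adbc_seconds

# Fraction of the original run time that was eliminated.
time_saved_fraction = 1 - arrow_adbc_seconds / sqlalchemy_seconds

print(f"speedup: {speedup:.1f}x")
print(f"run time reduced by {time_saved_fraction:.0%}")
```

The ratio rounds to 3.7x, which corresponds to a run time about 73% shorter than the SQLAlchemy baseline.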