Introducing Python DataFrames in Starburst Galaxy – Now in Public Preview
Blog post from Starburst
Starburst Galaxy now supports Python DataFrames through two libraries, PyStarburst and Ibis, so data engineers can write complex transformations in Python while the Trino SQL query engine executes them. The goal is to run analytical and transformation workloads on a single platform, which removes the need for a separate transformation engine and the cost and complexity that come with it. PyStarburst is designed to ease migration of existing PySpark and Snowpark workloads to Starburst. Ibis provides a uniform Python API that decouples the DataFrame API from execution, so the same code can scale across different data sources. The release is part of Starburst's broader effort to simplify data engineering and support open-source initiatives, in collaboration with partners such as Voltron Data.
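
To make "decoupling the DataFrame API from execution" concrete, here is a minimal toy sketch of the idea. It is not the Ibis or PyStarburst API: `LazyTable`, `run`, and the `orders` table are hypothetical names invented for illustration, and Python's built-in `sqlite3` stands in for a remote SQL engine such as Trino. The expression object only records what was asked for; nothing runs until the compiled SQL is handed to an engine.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyTable:
    """Hypothetical deferred table expression: records steps, executes nothing."""
    name: str
    filters: tuple = ()
    columns: tuple = ("*",)

    def filter(self, condition: str) -> "LazyTable":
        # Return a new expression with one more WHERE condition recorded.
        return LazyTable(self.name, self.filters + (condition,), self.columns)

    def select(self, *columns: str) -> "LazyTable":
        # Return a new expression with the projected columns replaced.
        return LazyTable(self.name, self.filters, columns)

    def to_sql(self) -> str:
        # Compile the recorded steps into a single SQL statement.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.name}"
        if self.filters:
            sql += " WHERE " + " AND ".join(f"({c})" for c in self.filters)
        return sql


def run(expr: LazyTable, connection: sqlite3.Connection) -> list:
    """Execute the compiled SQL on whichever engine the connection points to."""
    return connection.execute(expr.to_sql()).fetchall()


# Stand-in engine (assumption): an in-memory SQLite database with sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 40.0), (3, "EU", 15.0)],
)

# Build the transformation in Python; no query has been sent yet.
expr = (
    LazyTable("orders")
    .filter("region = 'EU'")
    .filter("amount > 50")
    .select("id", "amount")
)

sql = expr.to_sql()    # the SQL the engine will receive
rows = run(expr, conn)  # execution happens only here
print(sql)
print(rows)
```

Because the expression and the engine meet only in `run`, the same Python code could in principle be pointed at a different SQL backend by swapping the connection, which is the property the announcement attributes to Ibis.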