
Setting up Trino for dbt

Blog post from Starburst

Post Details
Company: Starburst
Author: Przemek Denkiewicz
Word Count: 1,166
Language: English
Summary

Trino is a distributed SQL query engine designed for querying large datasets across heterogeneous data sources, supporting Online Analytical Processing (OLAP) workloads such as data warehousing and analytics rather than functioning as a general-purpose relational database. This guide, authored by Przemek Denkiewicz and Michiel De Smet from Starburst, details the setup of Trino with dbt for lakehouse ETL processes, emphasizing the use of Docker and Docker Compose for managing multiple containers needed for this configuration. Key components include Trino for executing distributed queries, along with PostgreSQL for a webshop database, MongoDB for clickstream data, and the Iceberg table format for the lakehouse, all orchestrated through a YAML file to streamline service configuration and management. The document highlights the introduction of the MERGE statement in Trino version 393, enhancing ETL/ELT operations, and notes that Starburst products like Starburst Galaxy and Starburst Enterprise support dbt-trino, enabling incremental models and snapshot features.
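To make the role of MERGE more concrete, here is a minimal Python sketch that assembles the kind of upsert statement a merge-based incremental load might send to Trino. The helper `build_merge_sql`, the table names, and the column names are all hypothetical and chosen for illustration; the SQL that dbt-trino actually generates may differ.

```python
def build_merge_sql(target, source, key, columns):
    """Assemble a Trino MERGE statement that upserts `source` rows into `target`.

    All arguments are illustrative: `key` is the column used to match rows,
    and `columns` lists every column to write, including the key.
    """
    # Every column except the key is overwritten when a matching row exists.
    non_key = [c for c in columns if c != key]
    if not non_key:
        raise ValueError("need at least one non-key column to update")

    updates = ", ".join(f"{c} = s.{c}" for c in non_key)
    column_list = ", ".join(columns)
    values = ", ".join(f"s.{c}" for c in columns)

    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {updates}\n"
        f"WHEN NOT MATCHED THEN INSERT ({column_list}) VALUES ({values})"
    )


# Hypothetical tables: an Iceberg target and a staging table holding new rows.
sql = build_merge_sql(
    target="iceberg.lakehouse.orders",
    source="iceberg.staging.orders_delta",
    key="order_id",
    columns=["order_id", "status", "total"],
)
print(sql)
```

Rows that already exist in the target are updated in place and new rows are inserted, which is why a single MERGE can stand in for a separate delete-and-insert pass during an incremental run.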