How to Avoid Data Access Bottlenecks When Using Trino

Post Details

Company

Starburst

Date Published

July 14, 2025

Author

Daniel Abadi

Word Count

2,311

Language

English

Hacker News Points

-

Source URL

www.starburst.io/blog/trino-data-access-starburst

Summary

Trino is a data processing and query execution engine whose connector-based architecture lets it access and integrate data across many storage systems. Unlike traditional database management systems, which handle both storage and processing, Trino handles only processing and querying, so every query must first pull its data from an external system. That transfer can become a performance bottleneck, especially with generic connectors such as JDBC. The main cause is the slow, single-threaded nature of JDBC data transfer, which can severely limit throughput no matter how much processing capacity Trino has. Trino tries to mitigate this by pushing query operators down to the source system to reduce the amount of data transferred, but this is not always sufficient. System-specific connectors can sometimes relieve the bottleneck through partition-aware extraction and more extensive processing pushdown, offering significant performance improvements over generic connectors. Where performance problems persist, best practices include exploring other Trino distributions with better connectors, or storing periodic snapshots of the data in open formats in a data lake, where Trino performs more efficiently.
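
To make the bottleneck concrete, here is a rough back-of-envelope sketch in Python. It is not taken from the original post: the function name `transfer_seconds` and every number in it (data size, per-connection throughput, stream count, share of data left after pushdown) are hypothetical assumptions chosen only for illustration. It assumes transfer time scales linearly with data volume and inversely with the number of parallel streams, which real systems only approximate.

```python
def transfer_seconds(data_gb: float, per_stream_mb_s: float, streams: int = 1) -> float:
    """Estimate seconds needed to move `data_gb` gigabytes into the engine.

    Simplified model: each stream sustains `per_stream_mb_s` MB/s and
    streams scale perfectly, which is an idealization.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    if per_stream_mb_s <= 0:
        raise ValueError("per_stream_mb_s must be positive")
    total_mb = data_gb * 1024
    return total_mb / (per_stream_mb_s * streams)


# Hypothetical figures, for illustration only.
DATA_GB = 100          # size of the table being scanned
PER_STREAM_MB_S = 50   # assumed throughput of one connection
PUSHED_DOWN_GB = 5     # assumed data left after a filter is pushed down

# Generic connector: a single connection carries everything.
single = transfer_seconds(DATA_GB, PER_STREAM_MB_S)

# Partition-aware extraction: 16 streams read in parallel.
parallel = transfer_seconds(DATA_GB, PER_STREAM_MB_S, streams=16)

# Pushdown: the source filters first, so less data crosses the wire.
pushed = transfer_seconds(PUSHED_DOWN_GB, PER_STREAM_MB_S)

for label, secs in [("single stream", single),
                    ("16 parallel streams", parallel),
                    ("single stream after pushdown", pushed)]:
    print(f"{label}: {secs:.1f} s")
```

Under these assumed numbers, both mitigations named in the summary shrink transfer time in different ways: parallel extraction divides it by the stream count, while pushdown reduces the volume that has to move at all.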