How to Avoid Data Access Bottlenecks When Using Trino

Blog post from Starburst

Post Details
Company: Starburst
Author: Daniel Abadi
Word Count: 2,311
Language: English
Summary

Trino is a data processing and query execution engine whose connector-based architecture lets it access and integrate data across many storage systems. Unlike a traditional database management system, which handles both storage and processing, Trino handles only query processing, so every query must first pull its data from an external system through a connector. That transfer can become a performance bottleneck, especially with generic connectors such as JDBC: JDBC data transfer is slow and single-threaded, which can severely limit throughput regardless of how much processing capacity Trino has. Trino tries to mitigate this by pushing query operators down to the source system to reduce the amount of data transferred, but this is not always sufficient. System-specific connectors can sometimes relieve the bottleneck through partition-aware extraction and more extensive processing pushdown, offering significant performance improvements over generic connectors. Where performance problems persist, best practices include exploring other Trino distributions with better connectors, or storing periodic snapshots of the data in open formats in a data lake, where Trino performs more efficiently.
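To make the bottleneck concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch under stated assumptions and does not come from the original post: the function name `transfer_seconds`, the table size, the row width, and the per-connection throughput are all hypothetical, and the model assumes that partition-aware extraction scales linearly with the number of parallel connections, which real systems will only approximate.

```python
def transfer_seconds(rows, bytes_per_row, mb_per_sec_per_conn,
                     connections=1, selectivity=1.0):
    """Estimate the seconds needed to move a table into the query engine.

    connections: parallel readers (1 models a single-threaded JDBC transfer).
    selectivity: fraction of rows still transferred after pushdown
                 (1.0 means nothing was filtered at the source).
    """
    if connections < 1:
        raise ValueError("connections must be at least 1")
    if not 0.0 <= selectivity <= 1.0:
        raise ValueError("selectivity must be between 0 and 1")
    total_bytes = rows * bytes_per_row * selectivity
    # Assumption: aggregate throughput grows linearly with connections.
    bytes_per_sec = mb_per_sec_per_conn * 1_000_000 * connections
    return total_bytes / bytes_per_sec


# Hypothetical workload: 1 billion rows of 100 bytes each,
# with 50 MB/s available per connection.
ROWS = 1_000_000_000
ROW_BYTES = 100
RATE_MB_S = 50

# Generic connector: one connection, no pushdown.
baseline = transfer_seconds(ROWS, ROW_BYTES, RATE_MB_S)

# Pushdown: the source filters out 99% of rows before transfer.
pushdown = transfer_seconds(ROWS, ROW_BYTES, RATE_MB_S, selectivity=0.01)

# Partition-aware extraction: 16 partitions read in parallel.
parallel = transfer_seconds(ROWS, ROW_BYTES, RATE_MB_S, connections=16)

for label, seconds in [("single connection", baseline),
                       ("with pushdown", pushdown),
                       ("16 parallel readers", parallel)]:
    print(f"{label:>20}: {seconds:8.1f} s")
```

In this model, transfer time falls in proportion to either the pushdown selectivity or the number of parallel readers. That matches the summary's point: pushdown helps only when the query actually discards data at the source, while partition-aware extraction helps even when the full table must be read.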