Company
Date Published
Author
Tom Schreiber
Word count
5067
Language
English
Hacker News points
5

Summary

ClickHouse is a fast query engine that can query Parquet files directly, without ingestion, and in doing so outperforms many databases querying their own native formats. It has been optimized for Parquet for years, and its current reader applies parallelism at every layer of query execution while using metadata such as min/max statistics and Bloom filters to skip unnecessary work. A new native Parquet reader is on the way, bringing support for dictionary-based filtering, page-level min/max statistics, and ClickHouse-specific optimizations such as PREWHERE and lazy materialization. In a benchmark against other file formats and a purpose-built table engine, Parquet was the file format whose query performance came closest to that of the table engine. ClickHouse is therefore not just fast on Parquet; it is already a solid foundation for Lakehouse architectures.
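
To make the idea of skipping work with min/max statistics concrete, here is a minimal Python sketch. It is not ClickHouse's reader or the Parquet API: `RowGroup`, `may_contain`, `scan_range`, and `make_group` are hypothetical names invented for illustration, and the "file" is just an in-memory list of row groups that each carry a stored minimum and maximum for one column.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Hypothetical stand-in for a Parquet row group of a single integer column."""
    min_value: int  # statistic stored in metadata, readable without decoding values
    max_value: int
    values: list


def make_group(values):
    # Compute the statistics once, at "write time".
    return RowGroup(min(values), max(values), list(values))


def may_contain(group, low, high):
    # The group can hold a match only if [min, max] overlaps [low, high].
    return group.max_value >= low and group.min_value <= high


def scan_range(groups, low, high):
    """Return values in [low, high] and the number of row groups actually read."""
    matches = []
    groups_read = 0
    for group in groups:
        if not may_contain(group, low, high):
            continue  # skipped using metadata alone; no values are decoded
        groups_read += 1
        matches.extend(v for v in group.values if low <= v <= high)
    return matches, groups_read


groups = [
    make_group([1, 5, 9]),
    make_group([10, 14, 19]),
    make_group([20, 25, 29]),
]
matches, groups_read = scan_range(groups, 12, 21)
print(matches, groups_read)
```

For the range 12 to 21, the first row group is ruled out by its maximum of 9, so only two of the three groups are decoded. The statistics can only prove that a group has no matches, never that it has one, so the surviving groups still need the row-level filter.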