Company
Date Published
Author
Christian Petrasch and Robin Gontermann
Word count
1919
Language
English
Hacker News points
None

Summary

DENIC (Deutsches Network Information Center) uses ClickHouse as the database for its data science platform, which analyzes data from relational databases, server logs, and other sources. The team first used a relational DBMS, but that approach produced too many target tables and containers, which made the setup overcomplicated and hard to administer. After testing Hadoop and Spark, DENIC chose ClickHouse for its efficiency, low administrative effort, and cost-effectiveness. DENIC built a custom data structure for its registry database on ClickHouse's column-oriented storage, which enables fast queries over large amounts of data. The initial filling of the cluster ran into performance issues, but the team optimized query runtime by creating materialized views and using ARRAY JOIN, cutting it from 5 minutes to about 30 seconds. ClickHouse's performance and expandability have supported DENIC extensively in developing the platform.
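
The two optimizations named in the summary, materialized views and ARRAY JOIN, can be illustrated conceptually. The following is a minimal pure-Python sketch, not DENIC's actual queries or schema: the rows, the `nameservers` column, and the helper names `array_join` and `precompute_counts` are all hypothetical. It shows how ARRAY JOIN turns one row with an array column into one row per element, and how an aggregate computed once ahead of time (the idea behind a materialized view) lets later lookups skip the full scan.

```python
from collections import Counter

# Hypothetical sample data: one row per domain, with an array-valued column.
rows = [
    {"domain": "example.de", "nameservers": ["ns1.example.net", "ns2.example.net"]},
    {"domain": "beispiel.de", "nameservers": ["ns1.example.net"]},
    {"domain": "leer.de", "nameservers": []},
]


def array_join(rows, column):
    """Yield one output row per array element, mimicking ARRAY JOIN.

    Rows whose array is empty produce no output, which matches the
    behavior of a plain ARRAY JOIN (as opposed to LEFT ARRAY JOIN).
    """
    for row in rows:
        for element in row[column]:
            flat = dict(row)        # copy the other columns unchanged
            flat[column] = element  # replace the array with a single element
            yield flat


def precompute_counts(rows, column):
    """Aggregate once up front, analogous to reading from a materialized view."""
    return Counter(flat[column] for flat in array_join(rows, column))


flattened = list(array_join(rows, "nameservers"))
nameserver_counts = precompute_counts(rows, "nameservers")

for flat in flattened:
    print(flat["domain"], flat["nameservers"])
print(dict(nameserver_counts))
```

In ClickHouse itself the pre-aggregation would be maintained by the database as new data arrives rather than recomputed by hand. The sketch only mirrors the shape of the result.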