Company
Date Published
Author
The ClickHouse Team
Word count
1382
Language
English
Hacker News points
None

Summary

Modal, an AI infrastructure platform that runs large-scale GPU workloads for training, inference, and batch processing, uses ClickHouse Cloud for real-time observability across numerous GPUs and containers. Developers deploy workloads through Modal's Python SDK without dealing with the complexity behind the scenes. ClickHouse Cloud resolved Modal's earlier scaling issues with data reads and writes: Modal now ingests 1-2 million events per minute and manages around 500 billion logs while maintaining sub-second query latency. On top of ClickHouse, Modal has built several real-time dashboards that give users detailed insight into function performance, such as execution time and latency trends. These dashboards are powered by a single ClickHouse table, which keeps querying efficient and provides full lifecycle visibility into function calls. As Modal continues to grow, with event ingestion doubling to 2 million per minute, the team is exploring new features such as a billing API and a visual function call graph to further improve user experience and performance.