How to Create Roll-ups in Apache Druid
Blog post from Rill
Apache Druid rollups aggregate data at ingestion time, which reduces storage and speeds up queries; they are especially useful for high-volume event streams such as point-of-sale transactions. This post looks at how rollups interact with other Druid features, including real-time Kafka ingestion, data sketches, and custom time granularity. It works through several examples: how partitioning in Kafka can improve rollup performance, how data sketches provide approximate uniqueness counts without undermining rollup, and how timestamp values can be transformed at ingestion to apply a custom time granularity. The examples also show how dimensions such as USER_REWARD and STORE_ZIPCODE affect rollup effectiveness (the more distinct combinations of dimension values, the fewer rows can be combined), and how Theta and HLL sketches approximate unique counts over rolled-up data. The post closes with a hands-on tutorial for trying these examples yourself.
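To make these pieces concrete, here is a minimal sketch of a Kafka supervisor spec that enables rollup, stores Theta and HLL sketches for approximate unique users, and truncates timestamps with an ingestion-time transform. The topic name, datasource name, column names (`ts`, `user_id`, `quantity`), and the Druid URL are assumptions for illustration, not values taken from the post.

```python
import requests

# Hypothetical Kafka supervisor spec illustrating rollup with data sketches.
# Topic, datasource, column names, and the Druid URL are assumptions, not
# values from the post. Requires the druid-datasketches extension.
SUPERVISOR_SPEC = {
    "type": "kafka",
    "spec": {
        "ioConfig": {
            "type": "kafka",
            "topic": "pos-transactions",
            "inputFormat": {"type": "json"},
            "consumerProperties": {"bootstrap.servers": "localhost:9092"},
        },
        "dataSchema": {
            "dataSource": "pos_rollup_demo",
            "timestampSpec": {"column": "ts", "format": "iso"},
            # One way to apply a custom time granularity: overwrite __time
            # with a floored value (here, two-hour buckets) so that rollup
            # combines rows within each custom bucket.
            "transformSpec": {
                "transforms": [
                    {
                        "type": "expression",
                        "name": "__time",
                        "expression": "timestamp_floor(__time, 'PT2H')",
                    }
                ]
            },
            # Fewer / lower-cardinality dimensions mean more rows combine per
            # rollup bucket. STORE_ZIPCODE and USER_REWARD mirror the post.
            "dimensionsSpec": {"dimensions": ["STORE_ZIPCODE", "USER_REWARD"]},
            # Sketch aggregators preserve approximate uniqueness while still
            # letting Druid collapse raw rows at ingestion time.
            "metricsSpec": [
                {"type": "count", "name": "row_count"},
                {"type": "longSum", "name": "total_quantity", "fieldName": "quantity"},
                {"type": "thetaSketch", "name": "theta_user_id", "fieldName": "user_id"},
                {"type": "HLLSketchBuild", "name": "hll_user_id", "fieldName": "user_id"},
            ],
            "granularitySpec": {
                "type": "uniform",
                "segmentGranularity": "DAY",
                # NONE here because the transform above already truncates
                # timestamps; a standard granularity such as HOUR could be
                # used instead of the transform.
                "queryGranularity": "NONE",
                "rollup": True,
            },
        },
        "tuningConfig": {"type": "kafka"},
    },
}

if __name__ == "__main__":
    # Submit the supervisor via the Router (URL and port are assumptions).
    resp = requests.post(
        "http://localhost:8888/druid/indexer/v1/supervisor",
        json=SUPERVISOR_SPEC,
    )
    resp.raise_for_status()
    print(resp.json())
```

With a spec along these lines, rows that land in the same time bucket with identical STORE_ZIPCODE and USER_REWARD values are stored as a single rolled-up row, while the Theta and HLL sketch columns can still answer approximate distinct-count questions at query time (for example via APPROX_COUNT_DISTINCT_DS_THETA in Druid SQL).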