
Setting up Apache Druid on Kubernetes in under 30 minutes

Blog post from Rill

Post Details
Company: Rill
Author: Adheip Singh
Word Count: 1,481
Language: English
Summary

Apache Druid, a real-time analytics database, runs on Kubernetes through the Druid Operator, which simplifies the operational management of Druid clusters by automating tasks such as autoscaling, rolling upgrades, and resource cleanup. Helm charts were used initially to deploy Druid clusters, but their limitations led to the development of custom Golang-based operators and Kubernetes Custom Resource Definitions (CRDs) to manage such a complex system. The Druid Operator, introduced in late 2019, understands Druid's internal architecture, which allows it to improve uptime and availability and to perform rolling upgrades without downtime. It supports both StatefulSets and Deployments for different node types, and it uses Kubernetes finalizers to clean up PersistentVolumeClaims (PVCs) automatically. Selective node upgrades and self-healing StatefulSets further reduce operational work, and the operator itself runs in a high-availability mode in which a single active controller handles events. In addition, the kubectl Druid plugin extends the Kubernetes command-line interface to make Druid cluster management simpler still.
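
The finalizer-based PVC cleanup mentioned in the summary can be illustrated with a small model. This is a minimal sketch in plain Python, not the operator's actual Go code: the `DruidCluster` class, the `reconcile` function, the finalizer name `example.com/pvc-cleanup`, and the dictionary of PVCs are all hypothetical stand-ins chosen for illustration. It only shows the general Kubernetes pattern, in which a controller registers a finalizer on a live object and, once deletion is requested, cleans up dependent resources before removing the finalizer so the object can be deleted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical finalizer name, chosen for illustration only.
PVC_FINALIZER = "example.com/pvc-cleanup"


@dataclass
class DruidCluster:
    """Toy stand-in for a Druid custom resource; not the operator's real type."""
    name: str
    finalizers: List[str] = field(default_factory=list)
    # In Kubernetes the API server sets this when a delete is requested.
    deletion_timestamp: Optional[str] = None


def reconcile(cluster: DruidCluster, pvcs: Dict[str, str]) -> bool:
    """Run one reconcile pass over the cluster.

    `pvcs` maps a PVC name to the name of the cluster that owns it.
    Returns True once no finalizers remain on an object marked for
    deletion, meaning it could now be removed.
    """
    if cluster.deletion_timestamp is None:
        # Live object: register the finalizer so deletion waits for cleanup.
        if PVC_FINALIZER not in cluster.finalizers:
            cluster.finalizers.append(PVC_FINALIZER)
        return False

    if PVC_FINALIZER in cluster.finalizers:
        # Deletion requested: delete the PVCs this cluster owns,
        # then drop the finalizer to unblock the delete.
        owned = [name for name, owner in pvcs.items() if owner == cluster.name]
        for name in owned:
            del pvcs[name]
        cluster.finalizers.remove(PVC_FINALIZER)

    return not cluster.finalizers
```

In this model, a first `reconcile` call on a new cluster only adds the finalizer. After `deletion_timestamp` is set, the next call removes the cluster's PVCs, leaves other clusters' PVCs untouched, and clears the finalizer.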