Implementing mTLS and Securing Apache Kafka at Zendesk
Blog post from Confluent
Zendesk uses Apache Kafka as a foundational service for distributing events across its global pods, which are isolated cloud environments. Each pod has its own Kafka cluster, running on Kubernetes and virtual machines, and communication with those clusters was initially in plaintext. To improve security and make a global event bus possible, Zendesk implemented a self-hosted mutual TLS (mTLS) authentication system that uses HashiCorp's Vault for certificate management and Consul for key-value storage. The solution automates certificate generation, rotation, and monitoring so that Kafka communications are encrypted and authenticated. Because JVM limitations prevent revoking individual certificates, Zendesk instead developed a mechanism for rotating the CA root certificate to maintain system security. mTLS carries a performance cost: CPU usage rises, primarily from encryption and decryption, and Kafka loses its zero-copy optimization. Kernel TLS is explored as a potential mitigation. Zendesk has also simplified client onboarding with tools and guides that ease the adoption of mTLS across its systems, so that developers can focus on customer value rather than the intricacies of security and compliance.
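To make the client side of this more concrete, here is a minimal sketch of two pieces such a setup typically needs: the Kafka client properties that enable mTLS, and a check for when a certificate is due for rotation. The property keys are standard Kafka client configuration names. Everything else (the function names, the file paths, the passwords, and the rule of renewing after a fixed fraction of the certificate's lifetime) is a hypothetical assumption for illustration and is not taken from Zendesk's implementation.

```python
from datetime import datetime, timedelta, timezone


def mtls_client_properties(keystore_path, truststore_path,
                           keystore_password, truststore_password):
    """Build Kafka client settings for mutual TLS.

    The keystore holds the client's own certificate and private key;
    the truststore holds the CA certificate(s) used to verify brokers.
    """
    return {
        "security.protocol": "SSL",
        "ssl.keystore.location": keystore_path,
        "ssl.keystore.password": keystore_password,
        "ssl.key.password": keystore_password,
        "ssl.truststore.location": truststore_path,
        "ssl.truststore.password": truststore_password,
    }


def render_properties(props):
    """Render settings in the key=value form used by .properties files."""
    return "\n".join(f"{key}={value}" for key, value in sorted(props.items())) + "\n"


def needs_rotation(not_before, not_after, now, renew_fraction=0.5):
    """Return True once `renew_fraction` of the certificate lifetime has elapsed.

    The 0.5 default is an illustrative assumption, not a documented value.
    """
    lifetime = not_after - not_before
    return now >= not_before + lifetime * renew_fraction


if __name__ == "__main__":
    # Hypothetical paths and passwords, for illustration only.
    props = mtls_client_properties(
        "/etc/kafka/certs/client.keystore.jks",
        "/etc/kafka/certs/ca.truststore.jks",
        "keystore-secret",
        "truststore-secret",
    )
    print(render_properties(props))

    issued = datetime.now(timezone.utc) - timedelta(days=20)
    expires = issued + timedelta(days=30)
    print("rotate now:", needs_rotation(issued, expires, datetime.now(timezone.utc)))
```

On the broker side, the authentication becomes mutual only when the listener also requires client certificates, which in Kafka is done with `ssl.client.auth=required`. Renewing well before expiry matters more than usual in this design: since individual certificates cannot be revoked, short lifetimes and reliable rotation are the main way to limit the exposure of a leaked key.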