Monitoring Kubernetes (part 3): Kubernetes troubleshooting service discovery
Blog post from Sysdig
This blog post, part of a series on operating Kubernetes in production, walks through troubleshooting service discovery in Kubernetes using a real-world example: a Kubernetes service backed by Nginx pods and a curl client. It shows why troubleshooting in containerized environments is hard: traditional tools fall short because containers are isolated and short-lived. The post uses Sysdig, a Linux visibility tool, to capture and analyze system calls together with container metadata. The capture reveals that curl did not treat the service name as a fully qualified domain name and that search domains, including "localdomain," were appended redundantly, which delayed DNS resolution. The investigation traces these issues to the way Kubernetes interacts with Docker and ultimately attributes the presence of "localdomain" to Kubernetes' configuration. The post concludes that Sysdig captures are valuable for reproducing and troubleshooting problems in ephemeral container environments because they preserve complete system-level detail after the containers themselves are gone.
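The search-domain behavior summarized above can be illustrated with a small Python sketch. It models, in simplified form, how a glibc-style stub resolver is commonly documented to build its list of lookups from the `search` and `ndots` settings in resolv.conf. The function name `candidate_queries`, the search list, and the `ndots` value are assumptions chosen for illustration; they are not taken from the original post.

```python
def candidate_queries(name, search_domains, ndots=1):
    """Return the lookups a simplified stub resolver would attempt, in order."""
    # A trailing dot marks the name as fully qualified: the search list is skipped.
    if name.endswith("."):
        return [name]
    # Each search domain is appended to the name in the order it is listed.
    expanded = [f"{name}.{domain}." for domain in search_domains]
    as_is = name + "."
    # Names with at least `ndots` dots are tried as-is first, otherwise last.
    if name.count(".") >= ndots:
        return [as_is] + expanded
    return expanded + [as_is]


# Hypothetical resolver settings, for illustration only.
search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local", "localdomain"]
for query in candidate_queries("nginx.default.svc.cluster.local", search, ndots=5):
    print(query)
```

With these assumed settings the name has fewer dots than `ndots`, so every search domain, including "localdomain," is tried before the name itself. Adding a trailing dot to the name skips the search list entirely.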