
Debugging war story: the mystery of NXDOMAIN

Blog post from Cloudflare

Post Details
Company: Cloudflare
Date Published:
Author: Ivan Babrou
Word Count: 1,621
Language: English
Hacker News Points: -
Summary

The blog post describes a debugging investigation on Cloudflare's Mesos-based cluster, which is used mainly to process log data and detect attacks. Engineers found that internal DNS queries were returning "no such host" errors for domains that did exist. Extensive testing and analysis traced the problem to packet loss during DNS resolution attempts. The fix was to raise the number of retries the resolver makes, configured in the resolv.conf file (in glibc-style resolvers this is the `attempts` option), so that lookups tolerate transient network issues and DNS resolution becomes more reliable.
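
As a rough illustration of why more retries help against packet loss, the sketch below computes the chance that every attempt of a single lookup is lost. It assumes each attempt is lost independently with the same probability, which real networks do not guarantee, and the 1% loss rate is a hypothetical value, not a figure from the post. The function name `all_attempts_fail` is likewise made up for this example.

```python
def all_attempts_fail(loss_rate: float, attempts: int) -> float:
    """Probability that every one of `attempts` tries is lost.

    Assumes independent, identically distributed loss per attempt
    (a simplification; real packet loss is often bursty).
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return loss_rate ** attempts


if __name__ == "__main__":
    hypothetical_loss = 0.01  # assumed value for illustration only
    for attempts in (1, 2, 3, 5):
        failure = all_attempts_fail(hypothetical_loss, attempts)
        print(f"attempts={attempts}: P(all lost)={failure:.2e}")
```

Under these assumptions each extra attempt multiplies the failure probability by the loss rate, so even a small increase in the retry count sharply reduces the odds of a lookup failing outright.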