Amazon Builders' Library in Focus #6: Implementing Health Checks

Blog post from Lumigo

Post Details
Company: Lumigo
Author: Yan Cui
Word Count: 783
Language: English
Summary

The Amazon Builders' Library article discusses how to implement health checks for scalable, resilient systems. It highlights the trade-off between thorough health checks, which quickly mitigate single-server failures, and the harm done by false positives that affect the entire fleet. It recommends measuring system health with a combination of liveness, local, and dependency health checks. It also cautions against over-reliance on health checks, particularly dependency checks, because this can lead to cascading failures. The article shares real-world examples of health check failures at Amazon and gives guidance on how to react safely when health checks fail.
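
To make the three check types concrete, here is a minimal Python sketch. Every name in it (`HealthReport`, `run_checks`, `servers_in_service`) is hypothetical and not taken from the article. The fail-open rule, which keeps servers in service when none of them passes the dependency check, is an assumption chosen to illustrate one way of reacting safely to a fleet-wide false positive. It is not necessarily the exact policy the article prescribes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A check is any zero-argument callable that returns True when healthy.
Check = Callable[[], bool]


@dataclass
class HealthReport:
    liveness: bool    # the process responds at all
    local: bool       # local resources (disk, local processes) are usable
    dependency: bool  # downstream systems are reachable


def run_checks(liveness: Check, local: Check, dependency: Check) -> HealthReport:
    """Run the three kinds of check for one server and collect the results."""
    return HealthReport(liveness(), local(), dependency())


def servers_in_service(reports: Dict[str, HealthReport]) -> List[str]:
    """Decide which servers should keep receiving traffic.

    Assumed policy for this sketch: liveness and local failures point at a
    single broken server, so that server is always removed. Dependency
    failures are only acted on while at least one server still passes.
    """
    candidates = [name for name, r in reports.items() if r.liveness and r.local]
    passing = [name for name in candidates if reports[name].dependency]
    # If no candidate passes the dependency check, the dependency (or the
    # check itself) is the likely culprit, so fail open instead of taking
    # the whole fleet out of service.
    return sorted(passing) if passing else sorted(candidates)
```

For example, if only one server fails its dependency check, it is removed. If every otherwise-healthy server fails it, all of them stay in service, which avoids turning a shared dependency problem into a full outage.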