Company
Date Published
Author
Andreas Grabner
Word count
804
Language
American English
Hacker News points
None

Summary

Dynatrace hosts its SaaS clusters on Amazon Web Services (AWS), where they operate 51 clusters on 1100 EC2 instances across six AWS regions, achieving an availability of 99.95% since moving to AWS in 2014. This success is attributed to Dynatrace's use of its own platform for monitoring and its reliance on the Dynatrace DavisĀ® AI for alerting on issues like the disk write latency problem encountered in AWS's Sydney data center. When Davis detected disk latency issues across multiple instances, Dynatrace's Autonomous Cloud Enablement (ACE) team collaborated with AWS to confirm and resolve a known hardware issue, demonstrating the importance of proactive monitoring in hybrid-multicloud environments. This incident underscores the benefits of using advanced monitoring tools that detect anomalies and provide timely alerts, allowing for quick remediation and maintaining service level agreements.