Company
Date Published
Author
Shy Peleg, Director of Software Engineering, Applied Intelligence
Word count
698
Language
English
Hacker News points
None

Summary

A company's Incident Intelligence product was taken offline when an engineer mistakenly deleted a subscriber resource in Google Cloud Platform that was in use in production. The team took measures to prevent similar issues, including setting up a smoke test using synthetics and New Relic Alerts. The system, which processes incident data from various sources, is now tested continuously, with a heartbeat test regularly checking that data flows between the input and output components. This gives the company's customers an extra layer of reliability.
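
The heartbeat idea described in the summary can be sketched roughly as follows. This is a minimal illustration, not the team's actual implementation: the `heartbeat_check` function, the `InMemoryPipeline` stand-in, and the event fields are all hypothetical names chosen for the example. It assumes the test can push a uniquely tagged event into the input component and look it up at the output component.

```python
import time
import uuid
from typing import Callable, Optional


def heartbeat_check(
    send: Callable[[dict], None],
    fetch: Callable[[str], Optional[dict]],
    timeout_s: float = 5.0,
    poll_interval_s: float = 0.1,
) -> bool:
    """Send one marker event into the input and wait for it at the output.

    Returns True if the marker arrives before the timeout, False otherwise.
    """
    # A unique id lets the check recognize its own event at the output.
    marker_id = str(uuid.uuid4())
    send({"id": marker_id, "type": "heartbeat"})

    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if fetch(marker_id) is not None:
            return True
        time.sleep(poll_interval_s)
    return False


class InMemoryPipeline:
    """Hypothetical stand-in for the real input and output components."""

    def __init__(self, connected: bool = True) -> None:
        # connected=False simulates a missing link, such as a deleted subscriber.
        self.connected = connected
        self.output: dict = {}

    def send(self, event: dict) -> None:
        # Events reach the output only while the link is intact.
        if self.connected:
            self.output[event["id"]] = event

    def fetch(self, event_id: str) -> Optional[dict]:
        return self.output.get(event_id)
```

In a real setup, a scheduled monitor would run a check like this at a fixed interval, and an alert condition would fire when it returns False. Breaking the link in the stand-in (`InMemoryPipeline(connected=False)`) makes the check fail, which mirrors the kind of outage the summary describes.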