Company
Date Published
Author
Patrick Rodjito, Senior Solutions Architect
Word count
1600
Language
English
Hacker News points
None

Summary

Auto-remediation is highlighted as a significant advancement in observability, minimizing human intervention and reducing the mean time to resolve issues. This guide details the process of implementing auto-remediation using New Relic and AWS EventBridge, specifically addressing high memory utilization in Amazon EC2 instances by automating their restart. The EC2 instance's metrics are monitored by New Relic, and when a threshold is exceeded, an alert triggers a notification to EventBridge, which executes a rule to restart the instance. The guide emphasizes configuring the integration between New Relic and EventBridge, including dynamically supplying the EC2 instance ID through alert notifications and setting up AWS roles and policies to automate the remediation process. The document also provides instructions on creating rules in EventBridge and configuring automation parameters to ensure the correct EC2 instance is targeted. Testing and validation procedures are included to confirm the successful execution of the AWS-RestartEC2Instance runbook, with additional options for creating dead letter queues or logging events in CloudWatch for troubleshooting purposes.