Monitoring Apache Spark applications running on Amazon EMR

Blog post from Datadog

Post Details
Company: Datadog
Author: Priya Matpadi
Word Count: 1,372
Language: English
Summary

The post describes how the author's team set up monitoring for a Spark streaming application running on Amazon EMR. They chose Datadog, which provides built-in integrations with both EMR and Spark. To enable the EMR integration, they linked their AWS account to Datadog and confirmed that the relevant permissions were in place. They then used a bootstrap action script to install the Datadog Agent on each node of the EMR cluster and configured the Agent to run the Spark check at regular intervals, so that Spark metrics are published to Datadog. To collect custom application metrics, they instrumented the application code to publish metrics as it processes events. The team expects that combining EMR, Spark, and custom application metrics will give them visibility into the application's health and performance.
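
The custom-metrics step can be illustrated with a minimal sketch. The summary does not say which language or client library the application uses, so this Python version is an assumption for illustration only, not the post's code. It builds datagrams in the DogStatsD text format (`name:value|type|#tags`) and sends them over UDP to an Agent assumed to be listening on its default local port, 8125. The metric names, tags, and the `metrics_for_event` helper are hypothetical.

```python
import socket


def format_dogstatsd(name, value, metric_type="c", tags=None):
    """Build one DogStatsD datagram: <name>:<value>|<type>[|#tag1,tag2]."""
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload


def send_metric(payload, host="127.0.0.1", port=8125):
    """Send one datagram over UDP to the Agent (assumed local, default port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("utf-8"), (host, port))


def metrics_for_event(event, tags=None):
    """Return the datagrams to publish for one processed event.

    Hypothetical metrics: a counter of processed events and a gauge of
    the event's payload size in bytes.
    """
    size = len(event.encode("utf-8"))
    return [
        format_dogstatsd("myapp.events.processed", 1, "c", tags),
        format_dogstatsd("myapp.events.size_bytes", size, "g", tags),
    ]
```

Inside the event-processing loop, the application would call `send_metric(p)` for each `p` in `metrics_for_event(event, ["env:dev"])`. UDP is fire-and-forget, so a missing Agent does not block event processing, but metrics sent while it is down are lost.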