Machine learning in cybersecurity: Detecting DGA activity in network data with Elastic

Company

Elastic

Date Published

July 8, 2020

Author

Word count

1938

Language

Hacker News points

None

URL

www.elastic.co/blog/machine-learning-in-cybersecurity-detecting-dga-activity-in-network-data

Summary

This second post in a blog series explores how the Elastic Stack's machine learning capabilities can detect Domain Generation Algorithm (DGA) activity in network data. A supervised classification model enriches network data with classifications at ingest time, flagging potentially malicious domains by analyzing DNS queries. The setup is an ingest pipeline that combines Painless script processors, which extract features such as unigrams, bigrams, and trigrams from each domain, with an inference processor, which passes those features to a pre-trained model to predict whether the domain is malicious. Because DNS traffic volumes are high, false positives can be substantial, so the blog suggests anomaly detection as a secondary analysis step to separate actual DGA activity from noise. Beyond improving the accuracy of threat detection, the approach shows how the Elastic Stack can be configured to automate data enrichment and analysis for security use cases. For practical application, the blog gives guidance on setting up and testing these systems with Elastic's services, which offer a trial for users who want to experiment with the tools in their own network environments.
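
The n-gram feature extraction mentioned in the summary can be illustrated with a minimal Python sketch. This is not the blog's Painless script: the function names `char_ngrams` and `extract_features` and the output field names are hypothetical, and the sketch assumes the input is a single domain string that is lowercased and split into overlapping character sequences.

```python
from typing import Dict, List


def char_ngrams(text: str, n: int) -> List[str]:
    """Return every overlapping run of n consecutive characters in text."""
    # When text is shorter than n, the range is empty and the result is [].
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def extract_features(domain: str) -> Dict[str, List[str]]:
    """Build unigram, bigram, and trigram features for one domain string.

    The field names below are illustrative assumptions, not the names
    used by the pipeline described in the blog.
    """
    # Normalize case and drop any leading or trailing dots.
    label = domain.lower().strip(".")
    return {
        "unigrams": char_ngrams(label, 1),
        "bigrams": char_ngrams(label, 2),
        "trigrams": char_ngrams(label, 3),
    }


if __name__ == "__main__":
    features = extract_features("Elastic")
    for name, grams in features.items():
        print(name, grams)
```

In a pipeline of the kind the summary describes, features like these would be attached to each DNS document before the inference step, so the classifier sees the character-level structure of the domain rather than the raw string.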