Using deep learning to detect DGAs

Post Details

Company

Elastic

Date Published

Nov. 18, 2016

Author

Hyrum Anderson • Jonathan Woodbridge

Word Count

1,841

Language

-

Hacker News Points

-

Source URL

www.elastic.co/blog/using-deep-learning-detect-dgas

Summary

Researchers at Endgame have developed a deep learning method for detecting domain names produced by domain generation algorithms (DGAs). The method uses Long Short-Term Memory networks (LSTMs) and outperforms existing state-of-the-art techniques. Adversaries use DGAs to create pseudorandom domain names that connect malware to command and control servers, which makes blacklisting or sinkholing ineffective. Because the model learns feature representations automatically, it needs no manual feature engineering and can be retrained quickly when DGAs change. Unlike traditional methods, it does not rely on contextual information such as NXDomains or domain reputation, yet it identifies DGA-generated domains more accurately. The model was trained on Alexa's top 1 million sites as benign domains and on output from custom DGAs as malicious data. On a diverse dataset it reached an AUC of 0.9977 and a 90% detection rate at a false positive rate of 1 in 10,000. The research highlights the potential for LSTMs to strengthen defenses against evolving malware threats.
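To make the headline metric concrete, the sketch below shows one way a "detection rate at a fixed false positive rate" can be computed from classifier scores. It is an illustration only, not the authors' evaluation code: the function name `detection_rate_at_fpr` and the toy scores are assumptions, and the scores stand in for the probability a model such as an LSTM might assign to each domain.

```python
import math


def detection_rate_at_fpr(benign_scores, dga_scores, max_fpr):
    """Return (threshold, detection_rate) for a score-based detector.

    A domain is flagged when its score is strictly greater than the
    threshold. The threshold is chosen so that the fraction of benign
    domains flagged does not exceed max_fpr.
    Hypothetical helper for illustration; not from the original post.
    """
    benign_sorted = sorted(benign_scores, reverse=True)
    n_benign = len(benign_sorted)

    # Number of benign domains we may flag; the small epsilon guards
    # against floating-point products such as 0.9999999 instead of 1.
    allowed_fp = math.floor(max_fpr * n_benign + 1e-9)

    if allowed_fp >= n_benign:
        threshold = float("-inf")  # every domain may be flagged
    else:
        # At most `allowed_fp` benign scores lie strictly above this one.
        threshold = benign_sorted[allowed_fp]

    detected = sum(1 for score in dga_scores if score > threshold)
    return threshold, detected / len(dga_scores)


# Toy scores (made up): higher means "more likely DGA-generated".
benign = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30, 0.95]
dga = [0.99, 0.97, 0.90, 0.40, 0.25]

threshold, rate = detection_rate_at_fpr(benign, dga, max_fpr=0.1)
print(f"threshold={threshold:.2f} detection_rate={rate:.0%}")
```

With real data, the figure quoted in the summary would correspond to calling a function like this with `max_fpr=1e-4` on a benign set large enough for that rate to be measurable (at least tens of thousands of domains).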