Company
Date Published
Author
-
Word count
2628
Language
English
Hacker News points
None

Summary

EMBER2024 is the latest update to the EMBER dataset, originally released in 2018, which plays a critical role in advancing machine learning-based malware detection in the cybersecurity industry. Developed by a team that includes CrowdStrike data scientists, the dataset provides metadata and calculated features for more than 3.2 million files across six file formats, giving researchers a comprehensive resource for training and evaluating machine learning models. The update adds a challenge set of files that antivirus products initially failed to detect as malicious; it illustrates how difficult malware classification remains and offers a benchmark for measuring future improvement. The release also includes the infrastructure code for dataset construction, so researchers can replicate or expand the dataset in future work. EMBER2024 reflects CrowdStrike's dedication to research and collaboration in support of innovation and stronger defenses against cyber threats.