Company
Date Published
Author
Hubert Dulay, Victoria Xia, Wade Waldron
Word count
1277
Language
English
Hacker News points
None

Summary

Building on the first part of the series, this section covers processing osquery logs with the Confluent Platform and ksqlDB to detect anomalous behavior using machine learning. A Latent Dirichlet Allocation (LDA) model trained on osquery logs scores incoming logs to identify deviations in behavior, and each log is categorized as GOOD, BAD, or UGLY based on its score. GOOD logs, which indicate normal behavior, are fed back into the model for retraining. BAD logs, which suggest suspicious activity, and UGLY logs, which require further analysis, can be routed to security systems for deeper investigation. The architecture follows a Lambda Architecture approach, combining batch and streaming pipelines to manage model training and real-time log scoring. Together, the Confluent Platform, Kafka Connect, and ksqlDB form a streamlined SIEM pipeline capable of real-time alerting and investigation, laying a foundation for scalable security solutions.
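The GOOD, BAD, and UGLY categorization described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation: the threshold values, the assumption that a higher score means more normal behavior, and the function names `categorize` and `route` are all hypothetical, since the summary does not state how scores map to categories.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical cutoffs; the summary does not give the real values.
# Assumption: a higher score means the log looks more like normal behavior.
GOOD_MIN = 0.7
BAD_MAX = 0.3


def categorize(score: float) -> str:
    """Map a model score to GOOD, BAD, or UGLY using the assumed cutoffs."""
    if score >= GOOD_MIN:
        return "GOOD"   # normal behavior, candidate for retraining data
    if score <= BAD_MAX:
        return "BAD"    # suspicious, forward to security systems
    return "UGLY"       # ambiguous, needs further analysis


def route(scored_logs: Iterable[Tuple[str, float]]) -> Dict[str, List[str]]:
    """Group (log, score) pairs by category, preserving input order."""
    routes: Dict[str, List[str]] = {"GOOD": [], "BAD": [], "UGLY": []}
    for log, score in scored_logs:
        routes[categorize(score)].append(log)
    return routes
```

In a streaming setup, each of the three groups would correspond to a separate destination, for example one feeding retraining and the others feeding alerting or investigation. Here they are plain in-memory lists so the sketch stays self-contained.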