Company
Date Published
Author
-
Word count
3000
Language
English
Hacker News points
None

Summary

CrowdStrike's blog post explains how the company uses Log-Structured Merge (LSM) tree-based databases to manage the scale of data in its Threat Graph, which now handles trillions of events per day and stores over 40 petabytes of data. LSM trees optimize for high write throughput, which matters for CrowdStrike because six out of every seven operations are writes. Writes are append-only, and instead of issuing costly delete operations, the system relies on time-to-live (TTL) settings so that expired data is dropped during background compaction. Because LSM tree databases keep data sorted, related records are stored close together, and reads can exploit that data locality. CrowdStrike's adoption of LSM trees, including contributions to the Apache Cassandra project, has allowed the company to scale its systems while keeping data processing efficient and customers protected.
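
The mechanics described in the summary can be illustrated with a toy in-memory model. This is a minimal sketch, not CrowdStrike's or Apache Cassandra's implementation: the class `TinyLSM`, its parameters `memtable_limit` and `ttl_seconds`, and the `host:event` key layout are all hypothetical names chosen for illustration. It shows append-only writes into a memtable, flushes into immutable sorted runs, TTL-based expiry applied during compaction rather than through explicit deletes, and a prefix scan that benefits from sorted keys sitting next to each other.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a memtable plus immutable sorted runs (hypothetical sketch)."""

    def __init__(self, memtable_limit=4, ttl_seconds=60.0):
        self.memtable = {}   # key -> (value, written_at)
        self.runs = []       # newest run first; each run is a sorted list of (key, value, written_at)
        self.memtable_limit = memtable_limit
        self.ttl_seconds = ttl_seconds

    def _expired(self, written_at, now):
        return now - written_at >= self.ttl_seconds

    def put(self, key, value, now):
        # Writes never modify existing runs; they only land in the memtable.
        self.memtable[key] = (value, now)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into a new sorted, immutable run.
        if not self.memtable:
            return
        run = sorted((k, v, t) for k, (v, t) in self.memtable.items())
        self.runs.insert(0, run)
        self.memtable = {}

    def get(self, key, now):
        # Check the newest data first: memtable, then runs from newest to oldest.
        if key in self.memtable:
            value, t = self.memtable[key]
            return None if self._expired(t, now) else value
        for run in self.runs:
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                _, value, t = run[i]
                return None if self._expired(t, now) else value
        return None

    def scan_prefix(self, prefix, now):
        # Sorted runs keep keys with a shared prefix adjacent, so each run
        # needs one binary search followed by a short sequential read.
        found = {}
        memtable_run = sorted((k, v, t) for k, (v, t) in self.memtable.items())
        sources = [memtable_run] + self.runs
        for run in reversed(sources):  # oldest first, so newer versions overwrite
            i = bisect.bisect_left(run, (prefix,))
            while i < len(run) and run[i][0].startswith(prefix):
                key, value, t = run[i]
                found[key] = (value, t)
                i += 1
        return [(k, v) for k, (v, t) in sorted(found.items())
                if not self._expired(t, now)]

    def compact(self, now):
        # Merge all runs, keep only the newest version of each key, and drop
        # entries whose TTL has passed. No explicit delete is ever issued.
        merged = {}
        for run in reversed(self.runs):  # oldest first
            for key, value, t in run:
                merged[key] = (value, t)
        live = sorted((k, v, t) for k, (v, t) in merged.items()
                      if not self._expired(t, now))
        self.runs = [live] if live else []


if __name__ == "__main__":
    db = TinyLSM(memtable_limit=2, ttl_seconds=10.0)
    db.put("host1:evt1", "process start", now=0.0)
    db.put("host1:evt2", "dns lookup", now=1.0)
    print(db.scan_prefix("host1:", now=2.0))
```

Timestamps are passed in explicitly (`now`) so the behavior is deterministic. A production system would also handle durability, tombstones for explicit deletes, and incremental compaction strategies, none of which are modeled here.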