Company
Date Published
Author
Peter Corless
Word count
2469
Language
English
Hacker News points
None

Summary

Zeotap, a customer intelligence platform based in Germany, uses ScyllaDB and JanusGraph to manage a database of 20 billion user IDs, ingesting data from over 80 partners and distributing it to more than 40 destinations. With an emphasis on consumer privacy and data security, Zeotap uses deterministic identity resolution to correlate offline CRM data with online identifiers, which improves its ability to understand customer behavior. The company initially used Apache Spark for batch processing, then moved to a mixed streaming and batch ingestion system with Kafka to meet its service level agreements. The move became necessary once the database scaled beyond three billion IDs, which demanded more efficient data ingestion and better query performance. Running JanusGraph on ScyllaDB gives Zeotap low-latency neighborhood traversal and efficient management of linkages, both crucial to a business model that depends on transitive links and metadata filtering. Zeotap continues to evaluate and optimize its system. In comparing potential solutions it weighed graph database properties, integration with Apache analytics projects, and operational costs, and it ultimately chose JanusGraph backed by ScyllaDB for its cost-efficiency and its ability to meet demanding workloads.
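
To make the idea of transitive links with metadata filtering more concrete, here is a minimal in-memory sketch in Python. It is not Zeotap's implementation and does not use JanusGraph or ScyllaDB. The ID values, the `partner` and `quality` metadata fields, and the function names are all hypothetical, chosen only for illustration. It shows a hop-limited neighborhood traversal that follows a link only when its metadata passes a filter.

```python
from collections import deque

# Hypothetical identity graph: each edge links two IDs and carries metadata.
# All IDs, field names, and values here are invented for illustration.
EDGES = [
    ("email:a1", "cookie:c1", {"partner": "p1", "quality": 0.9}),
    ("cookie:c1", "maid:m1", {"partner": "p2", "quality": 0.8}),
    ("maid:m1", "cookie:c2", {"partner": "p3", "quality": 0.3}),
    ("email:b1", "cookie:c9", {"partner": "p1", "quality": 0.95}),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: id -> list of (neighbor, metadata)."""
    adj = {}
    for src, dst, meta in edges:
        adj.setdefault(src, []).append((dst, meta))
        adj.setdefault(dst, []).append((src, meta))
    return adj


def linked_ids(adj, start, max_hops, edge_ok):
    """Return IDs reachable from `start` within `max_hops` hops,
    following only edges whose metadata satisfies `edge_ok`."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # hop budget exhausted on this path
        for neighbor, meta in adj.get(node, []):
            if neighbor not in seen and edge_ok(meta):
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    seen.discard(start)
    return seen


adj = build_adjacency(EDGES)
# Follow only links with an (assumed) quality score of at least 0.5.
result = linked_ids(adj, "email:a1", max_hops=3,
                    edge_ok=lambda meta: meta["quality"] >= 0.5)
print(sorted(result))  # ['cookie:c1', 'maid:m1']
```

In this toy data, `cookie:c2` is excluded because the only link that reaches it fails the filter, and `cookie:c9` is excluded because it belongs to a separate component. A production graph database would express the same traversal as a query against persisted, indexed edges rather than a Python dictionary.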