Company
Date Published
Author
Lucia Cerchie, Martin Kleppmann, Josep Prat
Word count
7267
Language
English
Hacker News points
112

Summary

This talk introduces Apache Samza, a distributed stream processing framework developed at LinkedIn. It compares traditional databases and caches to global variables: shared mutable state that becomes messy at scale. The author argues that writing data to a log produces better-quality data than updating a database directly. He also highlights the problems with read-through caches, such as cold starts and race conditions. Materialized views, which are derived from the data in the log, can help fix these issues. The author proposes an architecture in which materialized views are updated from a stream of changes, and clients subscribe to those streams so that they are notified of new events as they happen. This approach requires a big rethink of how we write applications, shifting from request-response models to stream-friendly programming models based on actors and channels or on reactive frameworks.
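
The architecture described above can be illustrated with a minimal in-memory sketch. It is not Samza's API and not code from the talk: the names `ChangeLog`, `MaterializedView`, `subscribe`, and `on_change`, as well as the `(key, value)` event shape, are assumptions made up for this example. It shows a view that is rebuilt by replaying the log (so it never starts cold) and that pushes each change to its subscribers.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical event shape: (key, value). A value of None means "delete key".
Event = Tuple[str, Optional[str]]


class ChangeLog:
    """An append-only list of events; a stand-in for a real log."""

    def __init__(self) -> None:
        self._events: List[Event] = []
        self._consumers: List[Callable[[Event], None]] = []

    def append(self, event: Event) -> None:
        # Writes only ever append; existing entries are never modified.
        self._events.append(event)
        for consumer in self._consumers:
            consumer(event)

    def subscribe(self, consumer: Callable[[Event], None]) -> None:
        # Replay the full history first, then deliver new events as they arrive.
        for event in self._events:
            consumer(event)
        self._consumers.append(consumer)


class MaterializedView:
    """A key-value view derived entirely from the events in the log."""

    def __init__(self) -> None:
        self.state: Dict[str, str] = {}
        self._listeners: List[Callable[[str, Optional[str]], None]] = []

    def on_change(self, listener: Callable[[str, Optional[str]], None]) -> None:
        self._listeners.append(listener)

    def apply(self, event: Event) -> None:
        key, value = event
        if value is None:
            self.state.pop(key, None)
        else:
            self.state[key] = value
        # Push the change to subscribers instead of waiting for them to poll.
        for listener in self._listeners:
            listener(key, value)


log = ChangeLog()
log.append(("user:1", "Alice"))
log.append(("user:2", "Bob"))

view = MaterializedView()
log.subscribe(view.apply)  # replaying the log fills the view before any reads

seen: List[Tuple[str, Optional[str]]] = []
view.on_change(lambda key, value: seen.append((key, value)))

log.append(("user:1", "Alicia"))  # update
log.append(("user:2", None))      # delete

print(view.state)  # {'user:1': 'Alicia'}
print(seen)        # [('user:1', 'Alicia'), ('user:2', None)]
```

Because the view is a pure function of the log, it can be discarded and rebuilt at any time, which is what separates it from a read-through cache that is filled on demand. A real deployment would use a durable, partitioned log rather than a Python list.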