
How to Process GitHub Data with Kafka Streams

Blog post from Confluent

Post Details

Company: Confluent
Date Published: -
Author: Lucia Cerchie, Bill Bejeck
Word Count: 1,528
Language: English
Hacker News Points: -
Summary

The post discusses using Apache Kafka to track events in a large codebase, drawing on GitHub's data sources (its REST and GraphQL APIs). It explains how to use the Confluent GitHub source connector to land GitHub events in a Kafka topic and then process those events with a Kafka Streams topology. The authors also give an overview of data pipelines, sources, and sinks, along with details on implementing a state store in Kafka Streams. Finally, the post touches on extending the project by adding a sink and points to further resources on Kafka demos, Flink SQL tutorials, and resolving "unknown magic byte" errors.
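As a rough illustration of the first step the summary describes, a Kafka Connect source connector is typically registered with a JSON configuration like the sketch below. The property names (`github.service.url`, `github.access.token`, `github.repositories`, `github.resources`, `topic.name.pattern`) follow the general style of Confluent's GitHub source connector but are assumptions here, not values taken from the post; check the connector's own documentation before using them.

```json
{
  "name": "github-source",
  "config": {
    "connector.class": "io.confluent.connect.github.GithubSourceConnector",
    "github.service.url": "https://api.github.com",
    "github.access.token": "<your-personal-access-token>",
    "github.repositories": "apache/kafka",
    "github.resources": "issues,pull_requests",
    "topic.name.pattern": "github-${resourceName}",
    "tasks.max": "1"
  }
}
```

Once a configuration like this is POSTed to the Kafka Connect REST API, events from the listed repositories start flowing into the matching Kafka topics, where a Kafka Streams application can consume and process them.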