Hello World, Kafka Connect + Kafka Streams

Company

Confluent

Date Published

April 1, 2016

Author

Andrew Sellers, Michal Haris, Paul Mac Farland

Word count

1992

Language

English

Hacker News points

None

URL

www.confluent.io/blog/hello-world-kafka-connect-kafka-streams

Summary

In this article, the authors explore how Kafka Connect and Kafka Streams, both part of Apache Kafka, can be combined to build a real-time stream processing application. They demonstrate how to use Kafka Connect to ingest data from Wikipedia IRC channels into a partitioned topic, which a Kafka Streams application then consumes, parsing the raw IRC messages into WikipediaMessage objects. The application also uses a KTable to continuously print updated usage analytics. The authors highlight the benefits of using Kafka Connect and Kafka Streams together, including simplicity, scalability, and fault tolerance, as well as the ability to run multiple instances in parallel without sacrificing performance. They also discuss potential improvements to the integration between the two, such as a first-class integration that would allow connectors to map directly to KStreams.
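
The parse-then-aggregate flow described in the summary can be sketched without any Kafka dependency. The following is a minimal illustration in plain Python, not the article's code and not the Kafka Streams API: the raw line format, the fields on `WikipediaMessage`, and the helper names `parse` and `running_counts` are all assumptions made for this example. A `Counter` stands in for the continuously updated table of per-user counts that a KTable would maintain.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple

# Assumed, simplified line format (hypothetical, for illustration only):
#   [[Page title]] FLAGS URL * username * (+/-bytes) edit summary
# FLAGS is optional and made of uppercase letters or "!".
_LINE = re.compile(
    r"^\[\[(?P<title>.+?)\]\]\s+(?:(?P<flags>[A-Z!]+)\s+)?(?P<url>\S+)\s+"
    r"\*\s+(?P<user>.+?)\s+\*\s+\((?P<delta>[+-]?\d+)\)\s*(?P<summary>.*)$"
)


@dataclass(frozen=True)
class WikipediaMessage:
    """Hypothetical parsed form of one raw edit line."""
    title: str
    user: str
    byte_delta: int
    summary: str


def parse(raw: str) -> Optional[WikipediaMessage]:
    """Return a WikipediaMessage, or None if the line is not an edit."""
    match = _LINE.match(raw.strip())
    if match is None:
        return None
    return WikipediaMessage(
        title=match.group("title"),
        user=match.group("user"),
        byte_delta=int(match.group("delta")),
        summary=match.group("summary"),
    )


def running_counts(raw_lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (user, edits_so_far) after each parsed edit.

    Each yielded pair plays the role of one update to a changelog-style
    table keyed by user; unparseable lines are skipped.
    """
    counts: Counter = Counter()
    for raw in raw_lines:
        message = parse(raw)
        if message is None:
            continue
        counts[message.user] += 1
        yield message.user, counts[message.user]


if __name__ == "__main__":
    sample = [
        "[[Apache Kafka]] M https://example.org/diff * alice * (+42) fix typo",
        "not an edit line",
        "[[Stream processing]] https://example.org/diff2 * bob * (-7) trim",
    ]
    for user, total in running_counts(sample):
        print(f"{user}: {total}")
```

In a real deployment the input would be a Kafka topic populated by a connector and the counting would be done by a Kafka Streams aggregation; the sketch only mirrors the shape of that logic on an in-memory list.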