
🚂 On Track with Apache Kafka – Building a Streaming ETL Solution with Rail Data

Blog post from Confluent

Post Details
Company:
Date Published:
Author: Robin Moffatt
Word Count: 3,127
Company Posts That Month: 11
Language: English
Hacker News Points: -
Post removed?: No
Summary

The post builds a data system on Apache Kafka that ingests events from an external system, enriches them with other data, transforms them, and drives both analytics and real-time notification applications. KSQL handles the data transformations, Kafka Connect streams the enriched data to target databases and other target systems, and Elasticsearch backs interactive dashboards. The pipeline reserializes JSON data to Avro, flattens nested columns, and resolves foreign keys such as location codes. Aggregations and timestamp filters on the event stream use event time rather than system time. The benefits cited for Apache Kafka include its robust log-based architecture, scalability, and versatility, which let users handle both streaming and queue processing.
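
The transformation steps named in the summary (flattening nested columns, resolving a location-code foreign key, and aggregating on event time) can be illustrated with a minimal plain-Python sketch. This is not the post's KSQL code and it does not talk to Kafka. The field names (`header`, `body`, `loc_code`, `event_ts_ms`), the `LOCATIONS` lookup table, and the sample events are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical reference data: location code -> human-readable name.
LOCATIONS = {"LDS": "Leeds", "YRK": "York"}


def flatten(event: dict) -> dict:
    """Lift the nested 'header' and 'body' fields to prefixed top-level keys."""
    flat = {}
    for section in ("header", "body"):
        for key, value in event.get(section, {}).items():
            flat[f"{section}_{key}"] = value
    return flat


def enrich(flat: dict) -> dict:
    """Resolve the location code (a foreign key) against the lookup table."""
    code = flat.get("body_loc_code")
    return {**flat, "location_name": LOCATIONS.get(code, "UNKNOWN")}


def event_hour(flat: dict) -> str:
    """Bucket by the timestamp carried in the event, not by processing time."""
    ts = datetime.fromtimestamp(flat["body_event_ts_ms"] / 1000, tz=timezone.utc)
    return ts.strftime("%Y-%m-%dT%H:00Z")


def movements_per_hour(events) -> dict:
    """Count events per (event-time hour, resolved location name)."""
    counts = Counter()
    for event in events:
        row = enrich(flatten(event))
        counts[(event_hour(row), row["location_name"])] += 1
    return dict(counts)


# Hypothetical sample events with epoch-millisecond event timestamps.
SAMPLE_EVENTS = [
    {"header": {"msg_type": "0003"}, "body": {"loc_code": "LDS", "event_ts_ms": 1546336800000}},
    {"header": {"msg_type": "0003"}, "body": {"loc_code": "LDS", "event_ts_ms": 1546338600000}},
    {"header": {"msg_type": "0003"}, "body": {"loc_code": "YRK", "event_ts_ms": 1546340400000}},
    {"header": {"msg_type": "0003"}, "body": {"loc_code": "XXX", "event_ts_ms": 1546341300000}},
]

if __name__ == "__main__":
    for key, count in sorted(movements_per_hour(SAMPLE_EVENTS).items()):
        print(key, count)
```

Because the hourly bucket is derived from the timestamp inside each event, late-arriving or replayed events still land in the hour in which they actually occurred, which is the reason for preferring event time over system time.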

Trends Found in this Post
Trend          Post Mentions   Total Month Mentions   Posts   Companies   MoM
Real-time      12              354                    133     58          -28%
Data Pipeline  4               50                     20      16          +9%