Building Streaming Data Pipelines, Part 1: Data Exploration With Tableflow

Company

Confluent

Date Published

April 25, 2025

Author

Robin Moffatt

Word count

1937

Language

English

Hacker News points

None

URL

www.confluent.io/blog/building-streaming-data-pipelines-part-1

Summary

This blog post shows how to explore and validate data in Apache Kafka using Tableflow, a Confluent Cloud feature that automatically synchronizes the contents of a Kafka topic to an Apache Iceberg table. The author builds a pipeline that ingests data from the U.K. Environment Agency's network of sensors, uses Tableflow to expose the Kafka topics as Iceberg tables, and queries those tables with standard SQL tools such as Trino and PopSQL. The post walks through exploring and visualizing the data with these tools, including unnesting arrays and joining data from multiple sources. It also highlights the main benefit of Tableflow: because topics can be queried as tables with standard SQL, it is easier to get answers out of the data while building a streaming data pipeline.