Processing Time-Series Data with QuestDB and Apache Kafka
Blog post from QuestDB
QuestDB is an open-source database optimized for time-series and market data, known for its high ingestion throughput, rich SQL analytics, and hardware efficiency, which makes it well suited to tick data. Apache Kafka is a robust distributed stream-processing platform widely used to move real-time market data, which underpins trading, risk management, and fraud detection at financial institutions.

A typical pipeline streams data from market feeds into Kafka and then lands it in a database via Kafka Connect. This article demonstrates such a pipeline: a Golang script polls real-time stock and ETF quotes from FinnHub every 30 seconds, publishes them to Kafka topics, and Kafka Connect streams the data into QuestDB for analysis.

The setup uses Docker to run Kafka, Kafka Connect, and QuestDB, with Kafka Connect sinks configured for specific symbols such as Tesla (TSLA) and the SPY ETF. Prerequisites are Git, Docker, Golang, and a FinnHub API token. Once the pipeline is running, the data is queryable in QuestDB, illustrating how the stack handles real-time financial data end to end.
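On the Kafka Connect side, a QuestDB sink is configured per topic. A sink configuration along these lines could be registered via Kafka Connect's REST API; the connector name, topic, table, and host values here are assumptions for illustration, and the exact property set depends on the version of QuestDB's Kafka connector in use:

```json
{
  "name": "questdb-tsla-sink",
  "config": {
    "connector.class": "io.questdb.kafka.QuestDBSinkConnector",
    "topics": "topic_TSLA",
    "table": "topic_TSLA",
    "host": "questdb:9009",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false"
  }
}
```

A second configuration with the topic and table swapped to the SPY topic would cover the ETF sink, so each symbol's quotes land in their own QuestDB table.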