The Scrapinghub team uses Apache Kafka, Flink, and MongoDB to build AutoExtract, a RAG-enabled GenAI data extraction API that extracts structured data from web pages without requiring custom code. The system receives a URL as input, fetches and renders the page, and then processes the content with an AI-powered data extraction engine. Confluent Cloud is used to scale and distribute requests, providing on-demand instances and eliminating management overhead. The team chose Confluent Cloud over alternatives for its vendor-independent pricing model, ease of use, and reliability. After migrating to Confluent Cloud, the team saw no latency or throughput problems during load testing and only minor tradeoffs, such as limited access to ZooKeeper.
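
The request flow described above (URL in, fetch and render, extract, return structured data) can be illustrated with a minimal sketch. This is not AutoExtract's code: the in-memory `queue.Queue` stands in for a Kafka topic, and `ExtractionRequest`, `fetch_and_render`, `extract`, and `run_worker` are hypothetical names with stubbed behavior, chosen only to show how the stages connect.

```python
import queue
from dataclasses import dataclass


@dataclass
class ExtractionRequest:
    """Hypothetical request message: one URL to process."""
    url: str


def fetch_and_render(url: str) -> str:
    # Stub: a real system would download and render the page here.
    return f"<html><title>Page at {url}</title></html>"


def extract(html: str) -> dict:
    # Stub for the AI-powered extraction engine: it only pulls the <title>.
    start = html.find("<title>")
    end = html.find("</title>")
    if start == -1 or end == -1:
        return {"title": None}
    return {"title": html[start + len("<title>"):end]}


def run_worker(requests: queue.Queue) -> list:
    """Drain the queue (a stand-in for a Kafka topic) and process each request."""
    results = []
    while True:
        try:
            request = requests.get_nowait()
        except queue.Empty:
            break
        html = fetch_and_render(request.url)
        record = extract(html)
        record["url"] = request.url
        results.append(record)
    return results


if __name__ == "__main__":
    pending = queue.Queue()
    pending.put(ExtractionRequest("https://example.com/product/1"))
    for row in run_worker(pending):
        print(row)
```

In a deployment like the one described, several such workers would presumably consume from a shared topic, so that adding workers spreads the request load across them.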