Migrating Data from Batch Ingestion to Streamkap: A Technical Deep Dive

Post Details

Company

Streamkap

Date Published

July 18, 2025

Author

Daniel Corley

Word Count

1,277

Language

English

Hacker News Points

-

Source URL

streamkap.com/blog/migrating-data-from-batch-ingestion-to-streamkap-a-technical-deep-dive

Summary

SpotOn moved from batch ingestion to Streamkap for synchronizing data from MongoDB to Snowflake, aiming to lower data latency and reduce maintenance costs. The migration required refactoring more than 2,000 dbt models for the new data source: updating source references, validating data types, and handling Change Data Capture (CDC) data efficiently. To preserve historical integrity, the team combined historical data from existing dbt snapshots with new data from Streamkap, using Common Table Expressions (CTEs) and UNION ALL operations. With Streamkap, data became available in near real time, and both ingestion and compute costs fell. Overall, the migration simplified data pipelines, cut infrastructure costs threefold, and made record changes easier to track, improving both internal analytics and customer-facing reporting.
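
The CTE and UNION ALL approach mentioned in the summary can be illustrated with a minimal sketch. It is not SpotOn's actual dbt model: the table names (`orders_snapshot`, `orders_cdc`), the column names, and the cutover date are hypothetical, and Python's built-in `sqlite3` stands in for Snowflake so the example runs anywhere. The idea is that one CTE selects history recorded before the cutover, a second selects changes captured after it, and UNION ALL stitches them into a single timeline.

```python
import sqlite3

# Hypothetical query: history before the cutover comes from the old snapshot
# table, changes on or after the cutover come from the CDC-fed table.
HISTORY_QUERY = """
WITH snapshot_history AS (
    SELECT order_id, status, valid_from, 'dbt_snapshot' AS origin
    FROM orders_snapshot
    WHERE valid_from < :cutover
),
cdc_changes AS (
    SELECT order_id, status, changed_at AS valid_from, 'streamkap' AS origin
    FROM orders_cdc
    WHERE changed_at >= :cutover
)
SELECT order_id, status, valid_from, origin FROM snapshot_history
UNION ALL
SELECT order_id, status, valid_from, origin FROM cdc_changes
ORDER BY order_id, valid_from
"""


def build_history(conn, cutover):
    """Return one combined change timeline across both sources."""
    return conn.execute(HISTORY_QUERY, {"cutover": cutover}).fetchall()


def demo():
    """Load a few made-up rows into an in-memory database and combine them."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders_snapshot (order_id INTEGER, status TEXT, valid_from TEXT)"
    )
    conn.execute(
        "CREATE TABLE orders_cdc (order_id INTEGER, status TEXT, changed_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders_snapshot VALUES (?, ?, ?)",
        [
            (1, "created", "2025-01-01"),
            (1, "paid", "2025-02-01"),
            (2, "created", "2025-03-01"),
        ],
    )
    conn.executemany(
        "INSERT INTO orders_cdc VALUES (?, ?, ?)",
        [
            (1, "refunded", "2025-06-10"),
            (2, "paid", "2025-06-12"),
        ],
    )
    rows = build_history(conn, cutover="2025-06-01")
    conn.close()
    return rows


if __name__ == "__main__":
    for row in demo():
        print(row)
```

UNION ALL is used rather than UNION because the two sources cover disjoint time ranges, so no deduplication pass is needed and every recorded change is preserved. The `origin` column is an illustrative addition that makes it easy to check which source each row came from during validation.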