Company
Date Published
Author
William Laroche
Word count
1255
Language
English
Hacker News points
None

Summary

This solution uses GCP's Compute Engine to create an autoscaled cluster of workers that pull messages from a Pub/Sub subscription, batch them together, and insert them into a Cloud SQL PostgreSQL database in micro-batches. Instance templates and instance groups let the cluster scale with load while keeping costs low. The code is simple and relies on the data load tool (dlt) both to infer the schema and to bulk insert records into the database. In benchmarks, each worker sustained a throughput of at least 700 messages/s, and the solution cost $16.06 per month with a minimum cluster size of 2.
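
The micro-batching step described above can be illustrated with a minimal sketch. This is not the author's code: the function name `micro_batch`, its parameters, and the default batch size and wait time are all hypothetical. The `messages` iterable stands in for messages pulled from a Pub/Sub subscription, and the `flush` callback stands in for whatever performs the bulk insert (in the original solution, a dlt load into Cloud SQL).

```python
import time
from typing import Callable, Iterable, List


def micro_batch(
    messages: Iterable[dict],
    flush: Callable[[List[dict]], None],
    max_batch_size: int = 500,       # hypothetical default, not from the source
    max_wait_seconds: float = 5.0,   # hypothetical default, not from the source
) -> int:
    """Group messages into batches and hand each batch to `flush`.

    A batch is flushed when it reaches `max_batch_size`, or when a message
    arrives at least `max_wait_seconds` after the batch's first message.
    The age check runs only when a new message arrives.
    Returns the total number of messages flushed.
    """
    batch: List[dict] = []
    batch_started = 0.0
    total = 0
    for message in messages:
        if not batch:
            # Start the clock when the first message of a batch arrives.
            batch_started = time.monotonic()
        batch.append(message)
        full = len(batch) >= max_batch_size
        expired = time.monotonic() - batch_started >= max_wait_seconds
        if full or expired:
            flush(batch)
            total += len(batch)
            batch = []  # new list, so flushed batches are never mutated
    if batch:
        # Flush whatever is left once the message source is exhausted.
        flush(batch)
        total += len(batch)
    return total


if __name__ == "__main__":
    loaded: List[List[dict]] = []
    fake_messages = ({"id": i} for i in range(1200))  # stand-in for Pub/Sub
    count = micro_batch(fake_messages, loaded.append, max_batch_size=500)
    print(count, [len(b) for b in loaded])
```

Batching this way trades a small amount of latency for far fewer database round trips, which is what makes a single bulk insert per batch cheaper than one insert per message.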