The Road to 100PiBs and Hundreds of Thousands of Partitions: Goldsky Case Study
Blog post from WarpStream
Goldsky is a Web3 developer platform that provides real-time and historical access to blockchain data. It indexes on-chain data and exposes it through API endpoints, subgraphs, and streaming pipelines, which simplifies dApp development. Goldsky originally ran on Apache Kafka, but scaling, cost, and reliability problems prompted a migration to WarpStream. The migration significantly reduced costs and improved reliability, which the post attributes to WarpStream's diskless architecture, auto-scaling, and efficient tiered storage implementation. Because WarpStream decouples hardware from partition counts, a cluster can scale with workload demand rather than with the number of partitions, and keeping client traffic within its own availability zone eliminates inter-zone networking costs. As Goldsky continued to grow, WarpStream adapted to support very large clusters, including a redesigned storage engine that optimizes metadata tracking and compaction so the platform can handle its extensive storage needs efficiently.
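To make the zonal-alignment idea concrete, here is a minimal sketch of zone-aware routing: a client prefers stateless agents in its own availability zone and only falls back to other zones when none are local. The `Agent` type, the `pick_agents` helper, and the zone names are hypothetical illustrations, not WarpStream's actual API or implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Hypothetical stateless agent, identified by an ID and the zone it runs in."""
    agent_id: str
    zone: str


def pick_agents(client_zone: str, agents: list[Agent]) -> list[Agent]:
    """Return agents in the client's zone, or every agent if none are local.

    Assumption: because agents hold no partition-specific state on local
    disks, any agent can serve any partition, so routing can be decided
    by zone alone.
    """
    same_zone = [a for a in agents if a.zone == client_zone]
    # Fall back to cross-zone agents only when the local zone has none.
    return same_zone or list(agents)


agents = [
    Agent("a1", "us-east-1a"),
    Agent("a2", "us-east-1b"),
    Agent("a3", "us-east-1a"),
]

local = pick_agents("us-east-1a", agents)
print([a.agent_id for a in local])
```

The design point this illustrates is that zone-local routing is only possible when any agent can serve any partition; in a system where each partition has a leader on a specific broker, a client may have no choice but to cross zones.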