Doubling the throughput of data redistribution

Blog post from Vespa

Post Details
Company: Vespa
Author: Geir Storli
Word Count: 983
Language: English
Summary

Vespa, a data platform, has doubled the throughput of its data redistribution process, halving the time required to replace a failing content node. The improvements are part of Vespa version 7.528.3 and combine several optimizations: enhanced scheduling semantics, asynchronous operations, and optimized handling of delete bucket operations, which together reduce latency spikes and bottlenecks. Data redistribution now averages 44 MB/sec, cutting the duration of the process from 3 hours and 50 minutes to about 2 hours. These changes let redistribution run with minimal disruption to query or write traffic, which is crucial for maintaining data redundancy and system reliability in Vespa Cloud.
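As a rough sanity check on the figures above, the following sketch computes the speedup implied by the two durations and the data volume implied by the average throughput. It assumes that "about 2 hours" means exactly 120 minutes, that the 44 MB/sec average holds for the whole run, and that 1 GB = 1000 MB. The function names are illustrative, and the derived volume is an estimate, not a number stated in the post.

```python
def speedup(before_minutes: float, after_minutes: float) -> float:
    """Ratio of the old duration to the new one."""
    return before_minutes / after_minutes


def implied_volume_gb(throughput_mb_per_sec: float, duration_minutes: float) -> float:
    """Data moved if the average throughput held for the full duration (1 GB = 1000 MB)."""
    return throughput_mb_per_sec * duration_minutes * 60 / 1000


before = 3 * 60 + 50   # 3 hours 50 minutes, from the summary
after = 2 * 60         # "about 2 hours", assumed to be exactly 120 minutes

ratio = speedup(before, after)
volume = implied_volume_gb(44, after)

print(f"speedup: {ratio:.2f}x")
print(f"implied volume: {volume:.1f} GB")
```

Under these assumptions the speedup comes out slightly below 2x, which is consistent with the "doubling" described in the summary given that the new duration is only approximate.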