Retaining Database Goodput Under Stress with Per-Partition Query Rate Limiting
Blog post from ScyllaDB
ScyllaDB's per-partition rate limiting feature addresses the "hot partition" problem in distributed database clusters by implementing a mechanism to manage the rate of accepted requests on a per-partition basis. This feature is crucial in maintaining stable "goodput," which refers to the amount of useful data transferred between clients and servers, by rejecting excess requests when a partition exceeds its configured limit. The implementation involves estimating request rates for each partition and making decisions based on these estimates to prevent system overload and ensure efficient resource allocation. Benchmarks demonstrate that this rate limiting effectively restores goodput under stress, as it rejects problematic requests that could otherwise overwhelm the database's resources, thereby enabling stable performance even during high traffic incidents.