Company
Date Published
Author
Justin Gage
Word count
2141
Language
English
Hacker News points
None

Summary

Sharding is a strategy for scaling out relational databases by storing partitions of data across multiple servers instead of putting everything on a single giant one. It involves deciding on a sharding scheme, organizing the target infrastructure, creating a routing layer, and planning and executing the migration to a sharded solution. There are various sharding schemes and algorithms, including hash-based, range-based, and directory-based sharding. Sharding maintenance is crucial as it requires managing hotspots and redistributing data and load. Routing queries to the right databases is also essential, which can be achieved through application logic or using tools like ProxySQL. The migration process involves double-writing, backfilling, verifying, and switching over to the new database. Several sharding frameworks and tools are available, including Vitess for MySQL and Citus for Postgres, which provide a layer on top of the relational databases to give them sharding capabilities. Additionally, serverless databases like CockroachDB are emerging as an alternative solution that can handle scaling out natively.