Company
Date Published
Author
Sid Choudhury
Word count
1822
Language
English
Hacker News points
None

Summary

Data sharding lets business applications with large data sets and high scale requirements distribute their data across multiple servers, improving scalability and performance. Sharding can reduce the impact of unplanned outages, increase total cluster storage capacity, speed up processing, and offer higher availability at a lower cost than vertical scaling. However, manual sharding can complicate operations, add significant development complexity, and lead to uneven shard allocation, hotspots, and data stored on too few shards. Common sharding architectures include hash sharding, range sharding, and geo-partitioning, each with its own strengths and weaknesses. YugabyteDB is an auto-sharded distributed SQL database that supports both hash and range sharding, with geo-partitioning support listed as a work in progress.
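
To make the difference between hash and range sharding concrete, here is a minimal Python sketch. It is an illustration only, not how YugabyteDB or any other database assigns rows to shards. The shard count, the split points, and the function names `hash_shard` and `range_shard` are all assumptions made for this example.

```python
import bisect
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size, chosen for illustration


def hash_shard(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Hash sharding: map a key to a shard by hashing it.

    Keys that sort next to each other usually land on different shards,
    which tends to spread writes evenly but makes range scans touch many shards.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical split points: shard 0 holds keys below "g", shard 1 holds
# keys from "g" up to (not including) "n", shard 2 from "n" up to "t",
# and shard 3 holds everything from "t" onward.
RANGE_SPLITS = ["g", "n", "t"]


def range_shard(key: str, splits=RANGE_SPLITS) -> int:
    """Range sharding: map a key to the shard whose key range contains it.

    Adjacent keys stay together, which helps range scans but can create
    a hotspot when most traffic targets one part of the key space.
    """
    return bisect.bisect_right(splits, key)


if __name__ == "__main__":
    keys = [f"user{n}" for n in range(1000, 1010)]
    print("hash :", [hash_shard(k) for k in keys])
    print("range:", [range_shard(k) for k in keys])
```

With these assumed split points, every `user…` key falls into the last range shard, which is the kind of hotspot the summary warns about, while the hash placement of the same keys depends only on their hash values.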