Sharding is a critical part of modern databases that allows for horizontal scaling, spreading data and traffic across multiple machines. It enables scalability beyond physical limitations by distributing data and workloads proportionally to the number of servers added. Sharding works by specifying a shard key, which partitions data into various shards, and using balancers to ensure each shard contains roughly the same amount of data. While sharding is necessary for handling increased demand, it also poses challenges such as unevenly distributed workloads and latency issues. To optimize shards for faster queries, users must select critical shard keys with high cardinality, plan and prepare their data structures, and use compound shard keys to reduce query bottlenecks.