Sharding distributes data across multiple servers.
Architecture¶
- Shard — replica set with data
- Config server — metadata
- mongos — router
Setup¶
sh.enableSharding('mydb')
sh.shardCollection('mydb.orders',{userId:'hashed'})
sh.status()
Shard Key¶
- Hashed — even distribution
- Ranged — range queries efficient
- Compound — balanced distribution
Choosing a Shard Key¶
Choosing the right shard key is the most important decision when sharding MongoDB. A bad shard key leads to hotspots — where one shard receives most writes while others sit idle. An ideal shard key has high cardinality, distributes writes evenly, and supports your most common queries.
A hashed shard key ensures even distribution but makes efficient range queries impossible. A compound shard key (for example, {tenant_id: 1, created_at: 1}) is often the best compromise — it distributes data by tenant and enables efficient time-based queries within a tenant. Once you choose a shard key, you cannot change it without data migration. The balancer automatically moves chunks between shards for even distribution, but this process consumes I/O and network bandwidth.
Shard Key is Critical¶
Bad shard key = hotspots.