Distributed Design Pattern: Consistent Hashing for Load Distribution

[A Music Streaming Service Shard Management Case Study]

Imagine you’re building the next Spotify or Apple Music. Your service needs to store and serve millions of music files to users worldwide. As your user base grows, a single server cannot handle the load, so you need to distribute the data across multiple servers. This raises several critical challenges:

  1. Initial Challenge: How do you determine which server should store and serve each music file?
  2. Scaling Challenge: What happens when you need to add or remove servers?
  3. Load Distribution: How do you ensure an even distribution of data and traffic across servers?

Let’s see how these challenges manifest in a real scenario:

Consider a music streaming service with:

  • 10 million songs
  • 4 servers (initially)
  • Need to scale to 5 servers due to increased load

Traditional Approach Using Simple Hash Distribution

The simplest approach would be to use a hash function with modulo operation:

server_number = hash(song_id) % number_of_servers

Problems with this approach:

  1. When scaling from 4 to 5 servers, approximately 80% of all songs need to be redistributed
  2. During redistribution:
  • High network bandwidth consumption
  • Temporary service degradation
  • Risk of data inconsistency
  • Increased operational complexity

For example:

  • Song “A” with hash 123 → Server 3 (123 % 4 = 3)
  • After adding 5th server → still Server 3 (123 % 5 = 3)
  • Song “B” with hash 14 → Server 2 (14 % 4 = 2)
  • After adding 5th server → Server 4 (14 % 5 = 4)

Song “A” happens to stay put, but Song “B” moves; with modulo distribution, roughly four out of five songs behave like Song “B”, as the sketch below shows.
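
A minimal sketch of this effect, using made-up song IDs and SHA-256 as the hash function (the exact IDs are illustrative, but the proportion holds for any uniform hash):

import hashlib

def modulo_server(song_id: str, num_servers: int) -> int:
    # The naive hash-and-modulo placement described above
    digest = hashlib.sha256(song_id.encode()).hexdigest()
    return int(digest, 16) % num_servers

# Count how many of 100,000 hypothetical songs change servers
# when the cluster grows from 4 to 5 servers.
song_ids = [f"song-{i}" for i in range(100_000)]
moved = sum(1 for s in song_ids if modulo_server(s, 4) != modulo_server(s, 5))
print(f"{moved / len(song_ids):.0%} of songs change servers")   # ~80%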

Solution: Consistent Hashing

Consistent Hashing elegantly solves these problems by creating a virtual ring (hash space) where both servers and data are mapped using the same hash function.

How It Works

1. Hash Space Creation:

  • Create a circular hash space (typically 0 to 2²⁵⁶ − 1)
  • Map both servers and songs onto this space using a uniform hash function

2. Data Assignment:

  • Each song is assigned to the next server clockwise from its position (see the toy sketch after this list)
  • When a server is added/removed, only the songs between the affected server and its predecessor need to move

3. Virtual Nodes:

  • Each physical server is represented by multiple virtual nodes
  • Improves load distribution
  • Handles heterogeneous server capacities
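
To make the clockwise rule concrete before the full implementation, here is a toy ring over the space 0 to 99 (the server positions and names below are invented; a real system hashes into the far larger space described above):

from bisect import bisect_left

ring_positions = sorted([12, 37, 58, 83])   # where 4 servers happened to land
servers = {12: "server-1", 37: "server-2", 58: "server-3", 83: "server-4"}

def clockwise_owner(song_position: int) -> str:
    # First server position at or after the song, wrapping back to the start
    idx = bisect_left(ring_positions, song_position)
    if idx == len(ring_positions):
        idx = 0
    return servers[ring_positions[idx]]

print(clockwise_owner(40))   # -> server-3 (next position clockwise is 58)
print(clockwise_owner(90))   # -> server-1 (wraps around the ring)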

Implementation Example

Let’s implement this for our music streaming service:

import hashlib

class ConsistentHash:
    def __init__(self, replicas=3):
        self.replicas = replicas       # virtual nodes per physical server
        self.ring = {}                 # hash value -> server mapping
        self.sorted_keys = []          # sorted hash values (positions on the ring)

    def add_server(self, server):
        # Add `replicas` virtual nodes for the server
        for i in range(self.replicas):
            key = self._hash(f"{server}:{i}")
            self.ring[key] = server
            self.sorted_keys.append(key)
        self.sorted_keys.sort()

    def remove_server(self, server):
        # Remove all virtual nodes belonging to the server
        for i in range(self.replicas):
            key = self._hash(f"{server}:{i}")
            del self.ring[key]
            self.sorted_keys.remove(key)

    def get_server(self, song_id):
        # Find the first virtual node clockwise from the song's position
        if not self.ring:
            return None

        key = self._hash(str(song_id))
        for hash_key in self.sorted_keys:
            if key <= hash_key:
                return self.ring[hash_key]
        return self.ring[self.sorted_keys[0]]   # wrap around the ring

    def _hash(self, key):
        # SHA-256 gives a stable, uniformly distributed position on the ring
        return int(hashlib.sha256(key.encode()).hexdigest(), 16)

The Consistent Hashing Ring ensures efficient load distribution by mapping both servers and songs onto a circular space using SHA-256 hashing. Each server is assigned multiple virtual nodes, helping balance the load evenly. When a new server is added, it gets three virtual nodes to distribute traffic more uniformly. To determine where a song should be stored, the system hashes the song_id and assigns it to the next available server in a clockwise direction. This mechanism significantly improves scalability, as only a fraction of songs need to be reassigned when adding or removing servers, reducing data movement and minimizing disruptions.
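
A quick usage sketch of the class above (the server names and song ID are invented for illustration):

ring = ConsistentHash(replicas=3)
for server in ["server-1", "server-2", "server-3", "server-4"]:
    ring.add_server(server)

# Repeated lookups for the same song_id always return the same server
print(ring.get_server("song-42"))   # e.g. server-3

# Adding a fifth server only re-homes songs whose positions fall in the
# new server's segments of the ring; most lookups are unchanged.
ring.add_server("server-5")
print(ring.get_server("song-42"))   # usually the same server as before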

How This Solves Our Previous Problems

  1. Minimal Data Movement:
  • When adding a new server, only about K/N songs need to move (where K is the total number of songs and N is the number of servers after the addition)
  • For our 10 million songs example, scaling from 4 to 5 servers:
  • Traditional: ~8 million songs move
  • Consistent Hashing: ~2 million songs move (see the sketch after this list)

2. Better Load Distribution:

  • Virtual nodes ensure even distribution
  • Each server handles approximately equal number of songs
  • Can adjust number of virtual nodes based on server capacity

3. Improved Scalability:

  • Adding/removing servers only affects neighboring segments
  • No system-wide recalculation needed
  • Operations can be performed without downtime
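
As a rough check on these claims, we can reuse the ConsistentHash class from above with made-up song IDs (the exact figures depend on where the virtual nodes happen to land on the ring):

from collections import Counter

song_ids = [f"song-{i}" for i in range(10_000)]

ring = ConsistentHash(replicas=100)   # more virtual nodes -> tighter balance
for server in ["server-1", "server-2", "server-3", "server-4"]:
    ring.add_server(server)

before = {s: ring.get_server(s) for s in song_ids}
print(Counter(before.values()))       # roughly 2,500 songs per server

ring.add_server("server-5")
after = {s: ring.get_server(s) for s in song_ids}

moved = sum(1 for s in song_ids if before[s] != after[s])
print(f"{moved / len(song_ids):.0%} of songs moved")   # about 20%, i.e. K/N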

The diagram illustrates Consistent Hashing for Load Distribution in a Music Streaming Service. Songs (e.g., Song A and Song B) are assigned to servers using a hash function, which maps them onto a circular hash space. Servers are also mapped onto the same space, and each song is assigned to the next available server in the clockwise direction. This ensures even distribution of data across multiple servers while minimizing movement when scaling. When a new server is added or removed, only the affected segment of the ring is reassigned, reducing disruption and improving scalability.

Real-World Benefits

Efficient Scaling: Servers can be added or removed without downtime.
Better User Experience: Reduced query latency and improved load balancing.
Cost Savings: Optimized network bandwidth usage and lower infrastructure costs.

Consistent Hashing is a foundational pattern used in large-scale distributed systems like DynamoDB, Cassandra, and Akamai CDN. It ensures high availability, efficient load balancing, and seamless scalability — all crucial for real-time applications like music streaming services.

💡 Key Takeaways:
Cuts data movement during scaling from roughly 80% of songs (modulo hashing) to about K/N (around 20% in the 4-to-5 server example).
Enables near-linear scalability with minimal operational cost.
Prevents service disruptions while handling dynamic workloads.

This elegant approach turns a brittle, inefficient system into a robust, scalable infrastructure — making it the preferred choice for modern distributed architectures.
