A Music Streaming Service Shard Management Case Study

Imagine you’re building the next Spotify or Apple Music. Your service needs to store and serve millions of music files to users worldwide. As your user base grows, a single server cannot handle the load, so you need to distribute the data across multiple servers. This raises several critical challenges:

- Initial Challenge: How do you determine which server should store and serve each music file?
- Scaling Challenge: What happens when you need to add or remove servers?
- Load Distribution: How do you ensure an even distribution of data and traffic across servers?
Let’s see how these challenges manifest in a real scenario:
Consider a music streaming service with:
- 10 million songs
- 4 servers (initially)
- Need to scale to 5 servers due to increased load
Traditional Approach Using Simple Hash Distribution
The simplest approach would be to use a hash function with modulo operation:
server_number = hash(song_id) % number_of_servers

Problems with this approach:
- When scaling from 4 to 5 servers, approximately 80% of all songs need to be redistributed
- During redistribution:
  - High network bandwidth consumption
  - Temporary service degradation
  - Risk of data inconsistency
  - Increased operational complexity
For example:
- Song “A” with hash 123 → Server 3 (123 % 4 = 3); after adding the 5th server it still maps to Server 3 (123 % 5 = 3), so it stays put
- Song “B” with hash 14 → Server 2 (14 % 4 = 2); after adding the 5th server it maps to Server 4 (14 % 5 = 4), so it must be moved
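The rough arithmetic behind the ~80% figure can be checked with a short sketch. Here the song IDs are plain integers standing in for hashed IDs, which is enough to show the effect of changing the modulo:

# Count how many song IDs land on a different server when we go from 4 to 5 servers.
songs = range(1_000_000)    # stand-in sample for the 10 million song IDs

moved = sum(1 for song_id in songs if song_id % 4 != song_id % 5)
print(f"{moved / len(songs):.0%} of songs change servers")   # prints 80%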

Solution: Consistent Hashing

Consistent Hashing elegantly solves these problems by creating a virtual ring (hash space) where both servers and data are mapped using the same hash function.
How It Works
1. Hash Space Creation:
   - Create a circular hash space (typically 0 to 2²⁵⁶ - 1)
   - Map both servers and songs onto this space using the same uniform hash function
2. Data Assignment:
   - Each song is assigned to the next server clockwise from its position on the ring
   - When a server is added or removed, only the songs between the affected server and its predecessor need to move (see the toy sketch after this list)
3. Virtual Nodes:
   - Each physical server is represented by multiple virtual nodes on the ring
   - This improves load distribution
   - It also accommodates heterogeneous server capacities
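To make point 2 concrete before the full implementation, here is a toy sketch on a ring of only 100 positions with hand-picked, hypothetical server placements (a real ring would use SHA-256 positions, as described above). It shows that adding a server reassigns just the arc between the new node and its predecessor:

# Toy ring: positions 0-99, servers placed by hand so the moving arc is easy to see.
servers = {25: "server-1", 50: "server-2", 75: "server-3", 90: "server-4"}

def owner(song_pos, ring):
    # A song belongs to the first server at or after its position, wrapping around
    for server_pos in sorted(ring):
        if song_pos <= server_pos:
            return ring[server_pos]
    return ring[min(ring)]

before = {pos: owner(pos, servers) for pos in range(100)}
after = {pos: owner(pos, {**servers, 60: "server-5"}) for pos in range(100)}

moved = [pos for pos in range(100) if before[pos] != after[pos]]
print(moved)   # only positions 51-60 move (from server-3 to the new server-5)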

Implementation Example
Let’s implement this for our music streaming service:
import bisect
import hashlib


class ConsistentHash:
    def __init__(self, replicas=3):
        self.replicas = replicas   # virtual nodes per physical server
        self.ring = {}             # hash position -> server mapping
        self.sorted_keys = []      # sorted hash positions on the ring

    def add_server(self, server):
        # Place `replicas` virtual nodes for this server on the ring
        for i in range(self.replicas):
            key = self._hash(f"{server}:{i}")
            self.ring[key] = server
            bisect.insort(self.sorted_keys, key)

    def remove_server(self, server):
        # Remove all of this server's virtual nodes from the ring
        for i in range(self.replicas):
            key = self._hash(f"{server}:{i}")
            del self.ring[key]
            self.sorted_keys.remove(key)

    def get_server(self, song_id):
        # Find the first virtual node clockwise from the song's position
        if not self.ring:
            return None
        key = self._hash(str(song_id))
        idx = bisect.bisect_left(self.sorted_keys, key)
        if idx == len(self.sorted_keys):   # wrapped past the last node
            idx = 0
        return self.ring[self.sorted_keys[idx]]

    def _hash(self, key):
        # SHA-256 gives a stable, uniformly distributed ring position
        # (Python's built-in hash() is salted per process, so it is unsuitable here)
        return int(hashlib.sha256(key.encode()).hexdigest(), 16)
The Consistent Hashing Ring ensures efficient load distribution by mapping both servers and songs onto a circular space using SHA-256 hashing. Each server is assigned multiple virtual nodes, helping balance the load evenly. When a new server is added, it gets three virtual nodes to distribute traffic more uniformly. To determine where a song should be stored, the system hashes the song_id and assigns it to the next available server in a clockwise direction. This mechanism significantly improves scalability, as only a fraction of songs need to be reassigned when adding or removing servers, reducing data movement and minimizing disruptions.
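As a quick sanity check of the minimal-movement claim, the sketch below exercises the class above with arbitrary server names and a 100,000-song sample: it records every song's owner with four servers, adds a fifth, and counts how many owners changed.

# Exercise the ring: 4 servers, then a 5th, and count how many songs are reassigned.
ring = ConsistentHash(replicas=3)
for server in ["server-1", "server-2", "server-3", "server-4"]:
    ring.add_server(server)

song_ids = range(100_000)   # sample standing in for the 10 million songs
before = {song_id: ring.get_server(song_id) for song_id in song_ids}

ring.add_server("server-5")
after = {song_id: ring.get_server(song_id) for song_id in song_ids}

moved = sum(1 for song_id in song_ids if before[song_id] != after[song_id])
print(f"{moved / len(song_ids):.1%} of songs moved")
# Expect roughly 1/5 on average; with only 3 virtual nodes per server the exact
# share varies quite a bit, and more replicas tighten it toward 20%.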
How This Solves Our Previous Problems
1. Minimal Data Movement:
   - When adding a new server, only about K/N songs need to move (where K is the total number of songs and N is the number of servers after the change)
   - For our 10-million-song example, scaling from 4 to 5 servers:
     - Traditional hashing: ~8 million songs move
     - Consistent Hashing: ~2 million songs move
2. Better Load Distribution:
   - Virtual nodes spread each server's share evenly around the ring
   - Each server handles approximately the same number of songs
   - The number of virtual nodes can be adjusted to match each server's capacity (see the sketch after this list)
3. Improved Scalability:
   - Adding or removing servers only affects the neighboring segments of the ring
   - No system-wide recalculation is needed
   - These operations can be performed without downtime
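The capacity point in item 2 can be sketched by giving each server a virtual-node count proportional to its capacity. WeightedConsistentHash and its weight parameter below are illustrative names layered on the class above, not part of the original example:

import bisect

class WeightedConsistentHash(ConsistentHash):
    # Sketch: a server's share of the ring (and of the songs) grows with its weight,
    # because it places proportionally more virtual nodes.
    def add_server(self, server, weight=1):
        for i in range(self.replicas * weight):
            key = self._hash(f"{server}:{i}")
            self.ring[key] = server
            bisect.insort(self.sorted_keys, key)
    # (remove_server would need the same weight so it can delete every virtual node)

ring = WeightedConsistentHash(replicas=8)
ring.add_server("small-server", weight=1)   # 8 virtual nodes
ring.add_server("big-server", weight=4)     # 32 virtual nodes → roughly 4x the songs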


Real-World Benefits
✔ Efficient Scaling: Servers can be added or removed without downtime.
✔ Better User Experience: Reduced query latency and improved load balancing.
✔ Cost Savings: Optimized network bandwidth usage and lower infrastructure costs.
Consistent Hashing is a foundational pattern used in large-scale distributed systems like DynamoDB, Cassandra, and Akamai CDN. It ensures high availability, efficient load balancing, and seamless scalability — all crucial for real-time applications like music streaming services.
💡 Key Takeaways:
✅ Cuts the share of songs that move during scaling from roughly 80% to roughly 20% in our 4-to-5-server example.
✅ Enables near-linear scalability with minimal operational cost.
✅ Prevents service disruptions while handling dynamic workloads.
This elegant approach turns a brittle, inefficient system into a robust, scalable infrastructure — making it the preferred choice for modern distributed architectures.
