
Here’s a behind-the-scenes look at creating a URL-shortening service using Amazon Web Services (AWS).
Users and System Interaction:
- User Requests: Users submit a long web address to get a shorter version, follow a short link to reach the original website, or remove a short link.
- API Gateway: This is AWS’s reception. It directs user requests to the right service inside AWS.
- Lambda Functions: These are the workers. They perform tasks like making a link shorter, retrieving the original from a short link, or deleting a short link.
- DynamoDB: This is the storage room. All the long and short web addresses are stored here.
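- ElastiCache: When users access a short link, the system checks this in-memory cache first and only falls back to DynamoDB on a miss, because cache reads are faster.
- VPC & Subnets: This is the network layout. The public-facing part (API Gateway) is reachable from the internet, while the components that touch stored data sit in private subnets, so access to the data in DynamoDB stays private and secure.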
Making Links Shorter for Users:
- Sequential Counting: Every web link gets a unique number. To keep it short, that number is converted into a combination of letters and numbers.
- Hashing: The system also shortens the long web address into a fixed-length string. This method may occasionally produce the same result for two different links (a collision), so the system has to detect these cases and tell the links apart.
Sequential Counting: This takes a long URL as input and uses a unique counter value from the database to generate a short URL.
For instance, the URL https://example.com/very-long-url might be shortened to https://short.url/1234AB by taking a unique number from the database and converting it into a mix of letters and numbers.
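The number-to-letters step can be sketched as a base-62 conversion. This is a minimal illustration, not the service's actual code: the function names and the ordering of the alphabet (digits, then lowercase, then uppercase) are assumptions.

```python
import string

# Assumed alphabet: 10 digits + 26 lowercase + 26 uppercase = 62 symbols.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def base62_encode(number: int) -> str:
    """Convert a non-negative counter value into a base-62 string."""
    if number < 0:
        raise ValueError("counter values must be non-negative")
    if number == 0:
        return ALPHABET[0]
    symbols = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        symbols.append(ALPHABET[remainder])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(symbols))


def base62_decode(code: str) -> int:
    """Convert a base-62 string back into the original counter value."""
    number = 0
    for symbol in code:
        number = number * BASE + ALPHABET.index(symbol)
    return number
```

Because the conversion is reversible, the counter value can always be recovered from the short code, which is also why purely sequential codes are easy to enumerate.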
Hashing: This involves taking a long URL and converting it to a fixed-size string of characters using a hashing algorithm. So, https://example.com/very-long-url could become https://short.url/h5Gk9.
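A rough sketch of the hashing step follows. The article does not say which hashing algorithm is used, so SHA-256 and the truncation to six hexadecimal characters are assumptions chosen for illustration, and `hash_code` is a hypothetical name.

```python
import hashlib


def hash_code(long_url: str, length: int = 6) -> str:
    """Return a fixed-length code derived from the URL's SHA-256 digest."""
    digest = hashlib.sha256(long_url.encode("utf-8")).hexdigest()
    # Truncating the digest keeps the code short but makes collisions possible.
    return digest[:length]
```

The same URL always yields the same code, and the shorter the truncated code, the more likely two different URLs end up with the same one.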
Rationale for Combining:
- Enhanced Uniqueness & Collision Handling: Sequential counting ensures uniqueness, and in the unlikely event of a hashing collision, the sequential identifier can be used as a fallback or combined with the hash.
- Balancing Predictability & Compactness: Sequential codes are compact but easy to guess, since the next code is simply the next number. Adding a hash component makes codes harder to predict while keeping them short.
- Scalability & Performance: As the number of stored links grows, hash-only codes collide more often, and every collision costs extra lookups and retries. Adding the sequential ID makes each code unique, so a lookup always resolves to a single record.
Lambda Function for Shortening (PUT Request)
- Input: Long URL e.g. “https://www.example.com/very-long-url”
- URL Exists: If the long URL has already been shortened, return the stored short URL, e.g. “abcd12”, and skip the remaining steps.
- Hash URL: Output e.g. “a1b2c3”
- Assign Number: Unique Sequential Number e.g. “456”
- Combine Hash & Number: e.g. “a1b2c3456”
- Store in DynamoDB: {“https://www.example.com/very-long-url”: “a1b2c3456”}
- Update ElastiCache: {“a1b2c3456”: “https://www.example.com/very-long-url”}
- Return to API Gateway: Shortened URL e.g. “a1b2c3456”
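The steps above can be sketched as a single function. To keep the example runnable on its own, plain dictionaries stand in for DynamoDB and ElastiCache and an in-process counter stands in for the database counter; these stand-ins, the SHA-256 hash, and all names are assumptions, not the actual Lambda code.

```python
import hashlib
import itertools

# In-memory stand-ins (assumptions): a real deployment would call DynamoDB
# and ElastiCache through their clients instead of using dictionaries.
url_table = {}                  # long URL -> short code ("DynamoDB")
cache = {}                      # short code -> long URL ("ElastiCache")
counter = itertools.count(456)  # stand-in for a database-backed counter


def shorten(long_url: str) -> str:
    """Return the short code for a long URL, creating it if needed."""
    # URL Exists: reuse the stored code instead of creating a new one.
    existing = url_table.get(long_url)
    if existing is not None:
        return existing

    # Hash URL: fixed-length prefix of the digest (SHA-256 is an assumption).
    hash_part = hashlib.sha256(long_url.encode("utf-8")).hexdigest()[:6]
    # Assign Number: the next unique sequential value.
    number = next(counter)
    # Combine Hash & Number: simple concatenation, as in the example above.
    code = f"{hash_part}{number}"

    url_table[long_url] = code  # Store in DynamoDB
    cache[code] = long_url      # Update ElastiCache
    return code                 # Return to API Gateway
```

Calling `shorten` twice with the same URL returns the same code, and the counter advances only when a new mapping is created.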
Lambda Function for Redirecting (GET Request)
- Input: The user provides a short URL like “a1b2c3456”.
- Check ElastiCache: The system looks up the short URL in ElastiCache.
- Cache Hit: If the Long URL is found in the cache, the system retrieves it directly.
- Cache Miss: If not in the cache, the system searches in DynamoDB.
- Check DynamoDB: The system searches DynamoDB for the corresponding Long URL.
- URL Found: The Long URL matching the given short URL is found, e.g. “https://www.example.com/very-long-url”.
- Update ElastiCache: System updates the cache with {“a1b2c3456”: “https://www.example.com/very-long-url”}.
- Return to API Gateway: The system redirects users to the original Long URL.
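This lookup order is commonly called the cache-aside pattern. A minimal sketch follows, again with dictionaries standing in for ElastiCache and DynamoDB. It assumes the table can be queried by short code, which the article does not spell out, and `resolve` and `links_by_code` are hypothetical names.

```python
from typing import Optional

# In-memory stand-ins (assumptions) for ElastiCache and DynamoDB.
cache = {}
links_by_code = {"a1b2c3456": "https://www.example.com/very-long-url"}


def resolve(code: str) -> Optional[str]:
    """Return the long URL for a short code, or None if it is unknown."""
    # Cache hit: answer directly from the cache.
    long_url = cache.get(code)
    if long_url is not None:
        return long_url

    # Cache miss: fall back to the table.
    long_url = links_by_code.get(code)
    if long_url is None:
        return None  # unknown short code, nothing to redirect to

    # Update the cache so the next request for this code is a hit.
    cache[code] = long_url
    return long_url
```

The caller would turn a non-empty result into a redirect response and a `None` result into a "not found" response.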
Lambda Function for Deleting (DELETE Request)
- Input: The user provides a short URL they want to delete.
- Check DynamoDB: The system looks up the short URL in DynamoDB.
- URL Found: If the URL mapping for the short URL is found, it proceeds to deletion.
- Delete from DynamoDB: The system deletes the URL mapping from DynamoDB.
- Clear from ElastiCache: The system also clears the URL mapping from the cache to ensure that the short URL no longer redirects users.
- Return Confirmation to API Gateway: After the deletion succeeds, a confirmation is sent to the API Gateway, which passes it on to the user.
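The deletion flow can be sketched the same way. The dictionaries are stand-ins for DynamoDB and ElastiCache, and `delete_link` is a hypothetical name. Returning a boolean to signal whether anything was deleted is an assumption about how the confirmation might be produced.

```python
# In-memory stand-ins (assumptions) for DynamoDB and ElastiCache.
links_by_code = {"a1b2c3456": "https://www.example.com/very-long-url"}
cache = {"a1b2c3456": "https://www.example.com/very-long-url"}


def delete_link(code: str) -> bool:
    """Delete a short code everywhere; return True if it existed."""
    # Check the table first: nothing to do if the mapping is unknown.
    if code not in links_by_code:
        return False

    # Delete from the table, then clear the cache so a stale entry
    # cannot keep redirecting users.
    del links_by_code[code]
    cache.pop(code, None)  # no error if the code was never cached
    return True
```

Clearing the cache matters because the redirect path checks the cache first: a leftover entry would keep the link alive even though the table no longer has it.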
Simple Math Behind Our URL Shortening (Back-of-the-Envelope Estimation):
When we use a 6-character mix of letters (both lowercase and uppercase) and digits for our short URLs, each character has 62 possible values, so we have 62^6, or about 56.8 billion, different combinations. If users create 100 million short links every day, we can keep making unique links for about 568 days before running out.
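The estimate is easy to check with a few lines of arithmetic. The variable names are illustrative only.

```python
ALPHABET_SIZE = 26 + 26 + 10   # lowercase + uppercase + digits
CODE_LENGTH = 6
LINKS_PER_DAY = 100_000_000

combinations = ALPHABET_SIZE ** CODE_LENGTH
days_until_exhausted = combinations // LINKS_PER_DAY

print(combinations)          # 56800235584 (about 56.8 billion)
print(days_until_exhausted)  # 568
```

Each extra character multiplies the number of combinations by 62, so a seventh character would stretch the same creation rate to decades.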
