- When you are building projects as a B.Tech student, spinning up a local Node.js or Next.js server feels pretty straightforward. But how do massive platforms like Amazon keep their apps running during a sale like the Great Indian Festival without crashing?
- System Design is all about architecting your infrastructure so it can handle explosive traffic, maintain zero downtime, and run efficiently. Let's break down the core components of a scalable backend.
1. The Bottleneck: Vertical vs. Horizontal Scaling
At the end of the day, a server is just a machine (physical or virtual) running somewhere with a CPU and RAM.
When your app starts getting more traffic, your server's 2 vCPUs and 4 GB of RAM can max out quickly. Once it runs out of memory, requests slow down and the server process may crash.
You have two choices to fix this:
Vertical Scaling (Scaling Up): Just buy a bigger machine. Upgrade from 4GB of RAM to 128GB.
The Problem: It has a limit (you can't buy infinite RAM), the upgrade usually means stopping or restarting the server, which causes downtime right when you need it most, and you still have a single machine that takes the whole app down if it fails.
Horizontal Scaling (Scaling Out): Instead of making one machine bigger, just add more small machines. Keep spinning up identical servers to share the load.
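To get a feel for how many "small machines" scaling out might need, here is a rough back-of-the-envelope sketch. The function name `serversNeeded`, the traffic numbers, and the 30% headroom are all assumptions made up for illustration; measure your own server's real capacity before trusting any number like this.

```javascript
// Rough capacity estimate for horizontal scaling.
// All numbers used below are illustrative assumptions, not benchmarks.
function serversNeeded(peakRequestsPerSecond, requestsPerSecondPerServer, headroomPercent = 30) {
  if (requestsPerSecondPerServer <= 0) {
    throw new Error("per-server capacity must be positive");
  }
  // Plan to use only part of each server, leaving headroom for spikes
  const usableCapacity = (requestsPerSecondPerServer * (100 - headroomPercent)) / 100;
  return Math.ceil(peakRequestsPerSecond / usableCapacity);
}

// Assume one small server handles about 200 requests per second
const normalDay = serversNeeded(500, 200);
const saleDay = serversNeeded(5000, 200);
console.log({ normalDay, saleDay });
```

The useful part is the shape of the calculation: when traffic grows 10x, you add machines instead of hunting for a machine that is 10x bigger.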
2. Managing the Traffic: The Load Balancer
If you scaled horizontally and now have 4 identical servers, how does the internet know which one to send the user to?
You place a Load Balancer (like AWS Elastic Load Balancer or NGINX) in front of them. The Load Balancer has one public IP address. It takes each incoming request, picks a server according to a policy such as round robin (servers take turns) or least connections (the least busy server wins), and routes the traffic there.
    # Example of NGINX Load Balancing (Round Robin)
    upstream my_backend {
        server backend1.example.com;  # Server 1
        server backend2.example.com;  # Server 2
        server backend3.example.com;  # Server 3
    }

    server {
        listen 80;

        location / {
            proxy_pass http://my_backend;
        }
    }
If Server 1 crashes, the Load Balancer notices through failed requests or health checks, stops sending traffic to it, and routes new requests to the others. Apart from requests that were already in flight, users don't notice the crash.
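To make the idea concrete, here is a minimal sketch of round-robin selection that skips servers marked as down. This is not how NGINX or AWS implement it internally; the class `RoundRobinBalancer` and its methods are hypothetical names, and a real load balancer would flip the health flag based on health checks rather than a manual call.

```javascript
// Toy round-robin balancer that skips unhealthy servers.
// Hypothetical sketch for illustration, not a production load balancer.
class RoundRobinBalancer {
  constructor(serverNames) {
    this.servers = serverNames.map((name) => ({ name, healthy: true }));
    this.nextIndex = 0;
  }

  // In a real setup, health checks would call this automatically
  setHealth(name, healthy) {
    const server = this.servers.find((s) => s.name === name);
    if (server) server.healthy = healthy;
  }

  // Return the next healthy server name, or null if every server is down
  pick() {
    for (let attempts = 0; attempts < this.servers.length; attempts++) {
      const server = this.servers[this.nextIndex];
      this.nextIndex = (this.nextIndex + 1) % this.servers.length;
      if (server.healthy) return server.name;
    }
    return null;
  }
}

const balancer = new RoundRobinBalancer(["backend1", "backend2", "backend3"]);
const beforeCrash = [balancer.pick(), balancer.pick(), balancer.pick()];

balancer.setHealth("backend1", false); // simulate Server 1 crashing
const afterCrash = [balancer.pick(), balancer.pick(), balancer.pick()];

console.log(beforeCrash); // [ 'backend1', 'backend2', 'backend3' ]
console.log(afterCrash);  // [ 'backend2', 'backend3', 'backend2' ]
```

After the simulated crash, the rotation simply continues over the two remaining servers, which is the behaviour described above.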
3. Microservices and API Gateways
Instead of having one massive codebase (a monolith), you break your app into smaller, independent services: Auth Service, Orders Service, and Payments Service.
To manage where a user's request goes based on the URL path, we use an API Gateway.
- If the user hits amazon.com/auth, the API Gateway sends them to the Auth Load Balancer.
- If they hit amazon.com/orders, the API Gateway sends them to the Orders Load Balancer.
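At its core, the path-based routing described above is a lookup from URL prefix to internal target. Here is a minimal sketch, assuming the gateway has already extracted the path (without the query string). The `routes` table, the `.internal` addresses, and `resolveTarget` are made-up names for illustration; real gateways also handle things like authentication and rate limiting.

```javascript
// Hypothetical routing table: path prefix -> internal load balancer address
const routes = [
  { prefix: "/auth", target: "http://auth-lb.internal" },
  { prefix: "/orders", target: "http://orders-lb.internal" },
  { prefix: "/payments", target: "http://payments-lb.internal" },
];

// Return the target for a request path, or null if no service owns it.
// Matching on "prefix" or "prefix/" stops "/authors" from matching "/auth".
function resolveTarget(path) {
  const match = routes.find(
    (route) => path === route.prefix || path.startsWith(route.prefix + "/")
  );
  return match ? match.target : null;
}

console.log(resolveTarget("/auth/login")); // http://auth-lb.internal
console.log(resolveTarget("/orders"));     // http://orders-lb.internal
console.log(resolveTarget("/authors"));    // null
```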
4. Background Workers & The Queue System
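Some work should not happen while the user is waiting for a response. Take the confirmation email after a payment: if the Payments Service calls the email API directly, a slow or rate-limited email API makes the payment response slow too, or fails it outright.
To fix this, we use Asynchronous Queues (like AWS SQS or RabbitMQ). When a payment succeeds, the Payments Service simply drops a message into a "Queue" saying: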
"Payment successful for User 123."
Meanwhile, a separate Background Worker pulls messages out of the queue one by one and sends the emails at a steady rate. If the Email API crashes or enforces rate limits, the queue simply holds the messages safely until the API is back online.
    // A conceptual look at pushing an event to an async Queue.
    // processPayment and messageQueue are placeholders for your own
    // payment logic and queue client.
    async function handlePayment(userId, amount) {
      const success = await processPayment(amount);
      if (!success) {
        return "Payment Failed"; // Nothing is queued if the payment did not go through
      }
      // Instead of sending the email here, just push it to the queue!
      await messageQueue.push({ type: 'SEND_CONFIRMATION_EMAIL', userId: userId });
      return "Payment Successful!"; // User gets an instant response
    }
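The other half of the picture is the Background Worker. Below is a small in-memory simulation of the "queue holds the messages until the API is back" behaviour. Everything here is an assumption for illustration: the array stands in for a real queue like SQS or RabbitMQ, `sendEmail` is a fake sender with an on/off switch, and a real worker would be asynchronous and use the queue's own client library.

```javascript
// In-memory stand-in for a real message queue (assumption for this sketch)
const queue = [];

function enqueue(message) {
  queue.push(message);
}

// Process up to `batchSize` messages. A message is removed only after
// sendEmail succeeds, so a failure leaves it in the queue for the next run.
function runWorkerOnce(sendEmail, batchSize = 10) {
  let sent = 0;
  while (sent < batchSize && queue.length > 0) {
    const message = queue[0]; // peek, do not remove yet
    try {
      sendEmail(message);
    } catch (err) {
      break; // email API is down: stop and retry on the next run
    }
    queue.shift(); // "acknowledge" the message only after success
    sent++;
  }
  return sent;
}

// Hypothetical email sender with a switch to simulate an outage
let emailApiUp = false;
const delivered = [];
function sendEmail(message) {
  if (!emailApiUp) throw new Error("Email API unavailable");
  delivered.push(message.userId);
}

enqueue({ type: "SEND_CONFIRMATION_EMAIL", userId: 123 });
enqueue({ type: "SEND_CONFIRMATION_EMAIL", userId: 456 });

const sentDuringOutage = runWorkerOnce(sendEmail); // API is down, nothing is lost
emailApiUp = true;
const sentAfterRecovery = runWorkerOnce(sendEmail); // backlog is drained

console.log(sentDuringOutage, sentAfterRecovery, delivered); // 0 2 [ 123, 456 ]
```

The key design choice is removing a message only after it has been handled. Real queues offer the same idea through acknowledgements, so a crashed worker does not silently drop emails.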
5. Speeding it up globally: CDN
If your primary server is in the US, but a user in India requests a product image, the data has to travel halfway across the world, creating latency.
A CDN (Content Delivery Network like AWS CloudFront) solves this by placing "Edge Servers" in data centers all over the world. When the first user in India requests an image, the CDN fetches it from the US server and caches it in India.
Now, when the next 100,000 Indian users ask for that same image, the nearby edge server in India serves it straight from its cache with far lower latency, and the main US server is not bothered again until the cached copy expires!
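Here is a tiny sketch of that "fetch once, serve many times" behaviour. The `EdgeCache` class and the fake origin function are hypothetical, and the sketch deliberately leaves out things a real CDN handles, such as cache expiry (TTL) and invalidation.

```javascript
// Toy edge cache: the first request for a path goes to the origin,
// later requests for the same path are served from the local cache.
// Hypothetical sketch; real CDNs also expire and invalidate entries.
class EdgeCache {
  constructor(fetchFromOrigin) {
    this.fetchFromOrigin = fetchFromOrigin;
    this.cache = new Map();
    this.originRequests = 0;
  }

  get(path) {
    if (this.cache.has(path)) {
      return { body: this.cache.get(path), hit: true };
    }
    this.originRequests++; // the slow trip to the origin server
    const body = this.fetchFromOrigin(path);
    this.cache.set(path, body);
    return { body, hit: false };
  }
}

// Stand-in for the far-away origin server (assumption for this sketch)
const indiaEdge = new EdgeCache((path) => `image bytes for ${path}`);

const first = indiaEdge.get("/images/phone.jpg");  // miss: goes to origin
const second = indiaEdge.get("/images/phone.jpg"); // hit: served locally
const third = indiaEdge.get("/images/phone.jpg");  // hit: served locally

console.log(first.hit, second.hit, third.hit, indiaEdge.originRequests);
// false true true 1
```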
By combining Load Balancers, API Gateways, Async Queues, and CDNs, you create an architecture where traffic spikes are smoothly distributed, long tasks don't block the user, and static content is served at lightning speed globally!
~ Anurag Singh