1. Traffic Patterns: Netflix vs. YouTube vs. Hotstar
All three are video streaming platforms, yet their architectures differ drastically because their users behave differently.
- Netflix (Predictable Traffic): Traffic spikes on Netflix occur when a highly anticipated movie or series drops. Because Netflix knows the exact date and time of the release, it can pre-warm (scale up) its servers in advance and aggressively cache content on global CDNs.
- YouTube (Unpredictable Spikes): A massive creator (like MrBeast) or a global news channel can go live at any moment. Because these spikes cannot be scheduled, YouTube’s system relies on aggressive, real-time auto-scaling to survive sudden surges.
- Hotstar (Live Sports & Back Button Problem): Live cricket matches cause erratic traffic spikes (e.g., when Dhoni comes to bat or a wicket falls). Auto-scaling is too risky here because traffic fluctuates within seconds, faster than new servers can boot and start serving. Instead, they over-provision their servers beforehand.
Interview Edge Case: Imagine 24 million people are watching a live match. If the match gets boring, a huge chunk of those users will hit the BACK button to browse other content. Suddenly, a large share of those 24 million concurrent users shifts from the highly scalable Live Stream API to the Movies API behind the homepage. If the system isn’t prepared for this internal migration, the homepage will crash.
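To make the edge case concrete, the load shift can be estimated with back-of-the-envelope arithmetic. The sketch below is a rough model, not Hotstar's actual capacity planning: the function name `estimateHomepageSurge`, the 20% departure fraction, the 10-second window, and the three API calls per homepage visit are all illustrative assumptions. Only the 24 million figure comes from the scenario above.

```javascript
// Rough estimate of the extra homepage load when live viewers hit BACK.
// Every parameter except the viewer count is an illustrative assumption.
function estimateHomepageSurge({ liveViewers, fractionLeaving, windowSeconds, callsPerVisit }) {
  const usersLeaving = Math.round(liveViewers * fractionLeaving);
  // Requests per second hitting the homepage APIs during the window.
  const requestsPerSecond = Math.ceil((usersLeaving * callsPerVisit) / windowSeconds);
  return { usersLeaving, requestsPerSecond };
}

const surge = estimateHomepageSurge({
  liveViewers: 24000000, // from the scenario above
  fractionLeaving: 0.2,  // assumption: 20% leave during a dull passage
  windowSeconds: 10,     // assumption: they all leave within 10 seconds
  callsPerVisit: 3,      // assumption: one homepage visit fans out to 3 API calls
});

console.log(surge);
```

Under these assumptions the homepage tier would absorb roughly 1.4 million requests per second from the migration alone, which is why it has to be provisioned (or served from cache) alongside the live-stream tier.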
2. Serverless Architecture (AWS Lambda)
Managing physical servers (allocating CPU, RAM, and handling OS updates) is a massive operational overhead. To eliminate this, the industry introduced Serverless Architecture.
"Serverless" does not mean there are no servers. It simply means you don't manage them (the cloud provider does). You just upload your code. When a request comes in, the cloud provider spins up a lightweight execution environment (or reuses one that is still warm), runs the function, and reclaims the environment after it has sat idle for a while.

    // A simple AWS Lambda handler (stateless)
    exports.handler = async (event) => {
      // 1. A request arrives; Lambda routes it to an execution environment,
      //    creating one first if none is warm
      // 2. The handler processes the event
      // 3. It returns a response; the environment may be reused, then reclaimed when idle
      return { statusCode: 200, body: 'Hello from Serverless!' };
    };

Pros:
- Incredibly cost-effective (AWS gives you the first 1 million requests per month for free).
- If you have zero traffic, you pay zero dollars (Scales to zero).
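The free-tier arithmetic can be sketched as follows. The function names and the per-million price passed in below are hypothetical placeholders, and the sketch covers request charges only, ignoring compute-duration charges. Check the current AWS pricing page for real rates.

```javascript
const FREE_REQUESTS_PER_MONTH = 1000000; // free-tier figure quoted above

// Returns how many requests are billed after the free allowance.
function billableRequests(monthlyRequests) {
  return Math.max(0, monthlyRequests - FREE_REQUESTS_PER_MONTH);
}

// Request cost only; pricePerMillion is a caller-supplied, hypothetical rate.
function requestCost(monthlyRequests, pricePerMillion) {
  return (billableRequests(monthlyRequests) / 1000000) * pricePerMillion;
}

console.log(requestCost(0, 0.2));       // zero traffic -> 0
console.log(requestCost(800000, 0.2));  // inside the free tier -> 0
console.log(billableRequests(5000000)); // -> 4000000 billable requests
```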
Cons (What the interviewer actually wants to hear):
- Cold Starts: If a function hasn't been invoked in a while, the cloud takes a moment to boot up the environment for the first incoming request, causing noticeable latency.
- Vendor Lock-in: You get deeply entangled in the AWS ecosystem (SQS, API Gateway, S3). Migrating a serverless app to Azure or Google Cloud later can require re-architecting large parts of it.
- Database Connection Exhaustion: Each concurrent request runs in its own Lambda execution environment, and each environment opens its own database connection. A sudden spike of 100,000 concurrent requests can therefore try to open up to 100,000 connections to your database (like MongoDB or Postgres), far more than it is typically configured to accept. The database will reject connections or fall over unless you put a connection pooler or proxy in front of it.
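One common mitigation, complementary to a pooler or proxy, is to create the database client once per execution environment rather than once per request, so that warm invocations reuse it. The sketch below simulates this with a stand-in `connectToDatabase` function and a counter. Both are hypothetical, and no real database driver is involved.

```javascript
// Stand-in for a real driver's connect call (hypothetical, for illustration).
let connectionsOpened = 0;
async function connectToDatabase() {
  connectionsOpened += 1;
  return { query: async (sql) => `result of: ${sql}` };
}

// Declared outside the handler, so it survives between invocations that
// land on the same warm execution environment.
let connectionPromise = null;

async function handler(event) {
  if (connectionPromise === null) {
    connectionPromise = connectToDatabase(); // paid once, on a cold start
  }
  const connection = await connectionPromise;
  const body = await connection.query(event.sql);
  return { statusCode: 200, body };
}

// Simulate three sequential invocations on one warm environment.
async function simulate() {
  for (let i = 0; i < 3; i++) {
    await handler({ sql: 'SELECT 1' });
  }
  return connectionsOpened;
}

simulate().then((count) => console.log(`connections opened: ${count}`));
```

This only reduces connections per environment. With thousands of environments running concurrently, each still holds its own connection, so a pooler or proxy remains necessary.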
3. The Evolution: From VMs to Containers (Docker)
In traditional deployment, developers constantly face the "It works on my machine" problem. Code runs perfectly on a local Windows/Mac environment but crashes on the Linux production server due to mismatched package versions.
- Virtual Machines (VMs): The initial solution was VMs: running a complete guest OS (like Ubuntu) on top of your physical server. But VMs are heavy. Every VM has to boot a full operating system, often 4-5 GB, just to run a small app.
- Containers (Docker): Containers drop the per-app guest OS. They share the host's kernel and package only your code and its exact dependencies, which is why they are often described as "lightweight VMs." A container image might be only 200 MB. You can easily run 50 isolated containers on a single server, and they can typically start or stop in under a second.
4. Kubernetes (K8s): The Brain of Containers
If you are running 50 containers on one server, and you have thousands of servers, who manages them? If a container crashes, who restarts it? How do you push a new version of your code without shutting down the site?
This is where Container Orchestration comes in.
Google originally built an internal system called Borg to manage its massive data centers. Drawing on those lessons, it built Kubernetes (K8s) from scratch, open-sourced it in 2014, and later donated it to the Cloud Native Computing Foundation (CNCF).
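
    # Example: a basic Kubernetes Deployment
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: payment-service
    spec:
      replicas: 3  # K8s keeps 3 pod replicas running
      # apps/v1 requires a selector matching the pod labels; the label value is illustrative
      selector:
        matchLabels:
          app: payment-service
      template:
        metadata:
          labels:
            app: payment-service
        spec:
          containers:
            - name: payment-app
              image: payment-app:v2  # changing this tag triggers a rolling update
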
Kubernetes acts as the "brain." It monitors container health and restarts failed containers, can automatically scale the number of replicas up or down based on CPU/memory usage, and avoids downtime during updates through rolling updates (built into Deployments) or blue-green deployments (implemented on top, for example by switching traffic between two Deployments).
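The rolling-update idea can be illustrated with a small simulation. This is a deliberately simplified model, not Kubernetes' actual controller logic: it starts one new pod at a time and retires an old pod only after the new one exists. The function name `rollingUpdate` and the version strings are illustrative.

```javascript
// Simplified model of a rolling update: add one new-version pod, then
// retire one old-version pod, so capacity never drops below `replicas`.
function rollingUpdate(replicas, oldVersion, newVersion) {
  const pods = Array(replicas).fill(oldVersion);
  const steps = [];
  let minRunning = pods.length;

  // Nothing to roll if the version is unchanged.
  if (oldVersion === newVersion) {
    return { pods, minRunning, steps };
  }

  while (pods.includes(oldVersion)) {
    pods.push(newVersion);                    // surge: start a new pod first
    pods.splice(pods.indexOf(oldVersion), 1); // then retire one old pod
    minRunning = Math.min(minRunning, pods.length);
    steps.push(pods.join(','));
  }
  return { pods, minRunning, steps };
}

const result = rollingUpdate(3, 'v1', 'v2');
console.log(result.steps);
console.log(`lowest number of running pods: ${result.minRunning}`);
```

In this model the pod count never falls below the requested three replicas, which is the property that lets traffic keep flowing while versions change.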
~ Anurag Singh