📐

System Design

⚡

High Level Design

Scalability

Handle more load without breaking

Simple Explanation

Scalability means your system can handle more users or data without crashing or slowing down — like a highway that can add more lanes when traffic grows.

Key Concepts

Vertical Scaling (Scale Up)

Horizontal Scaling (Scale Out)

Load Balancer

Stateless vs Stateful

🌍

Real-World Case Studies

Netflix · Case Study

Handling 200M+ subscribers globally

⚠ The Problem

When everyone watches Stranger Things at 9 PM on a Friday, traffic spikes 10x. A single server would melt.

✓ The Solution

Netflix runs thousands of stateless microservices on AWS that scale horizontally and automatically. AWS Elastic Load Balancing (ELB) distributes traffic across servers in multiple Availability Zones. When average CPU usage reaches a configured threshold (60% in this example), new instances spin up automatically. When the spike ends, the extra instances shut down to save cost.
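A minimal sketch of the two moving parts described here: a load balancer that rotates requests across instances, and a threshold rule that adds or removes instances. It is an illustration only, not Netflix's or AWS's actual implementation. The names `RoundRobinBalancer` and `autoscale`, the 30% scale-in threshold, and the minimum of two instances are all assumptions made for the example.

```python
from itertools import count


class RoundRobinBalancer:
    """Hands out instances in rotation; any instance may serve any request."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._counter = count()

    def pick(self):
        if not self.instances:
            raise RuntimeError("no instances available")
        return self.instances[next(self._counter) % len(self.instances)]


def autoscale(balancer, avg_cpu_percent, scale_out_at=60.0, scale_in_at=30.0, min_size=2):
    """Add one instance at or above scale_out_at; remove one at or below scale_in_at.

    scale_in_at and min_size are hypothetical values chosen for this sketch.
    """
    if avg_cpu_percent >= scale_out_at:
        balancer.instances.append(f"instance-{len(balancer.instances) + 1}")
    elif avg_cpu_percent <= scale_in_at and len(balancer.instances) > min_size:
        balancer.instances.pop()
    return len(balancer.instances)


lb = RoundRobinBalancer(["instance-1", "instance-2"])
print([lb.pick() for _ in range(4)])        # rotates: 1, 2, 1, 2
print(autoscale(lb, avg_cpu_percent=72.0))  # 3 (scaled out)
print(autoscale(lb, avg_cpu_percent=20.0))  # 2 (scaled back in)
```

Keeping the scale-in threshold well below the scale-out threshold is a common way to avoid flapping, where instances are added and removed repeatedly around a single cutoff.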

⚖ The Trade-off

New instances take roughly 2-5 minutes to boot and warm up, so auto-scaling alone reacts too slowly to a sudden spike. To cover that gap, Netflix keeps a buffer of pre-warmed instances, which costs extra but is worth it during launches.
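The size of such a buffer can be estimated with simple arithmetic. The sketch below assumes the spike arrives faster than new instances can warm up, so every instance needed at peak must already be running. The function name `prewarmed_buffer` and all the numbers except the 10x multiplier are hypothetical.

```python
import math


def prewarmed_buffer(baseline_rps, spike_multiplier, rps_per_instance, running_instances):
    """Extra warm instances needed to serve the peak before auto-scaling catches up."""
    peak_rps = baseline_rps * spike_multiplier
    needed = math.ceil(peak_rps / rps_per_instance)
    return max(0, needed - running_instances)


# Hypothetical numbers: 1,000 req/s baseline, a 10x spike,
# 500 req/s per instance, 4 instances already running.
print(prewarmed_buffer(1_000, 10, 500, 4))  # 16
```

The buffer is idle capacity most of the time, which is exactly the cost side of the trade-off.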

★ Interview Takeaway

Stateless services + auto-scaling + load balancers = a system that can absorb large traffic spikes. The key insight: any server can handle any request because no session data is stored locally; session state lives in a shared store instead.
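To make the "any server can handle any request" point concrete, here is a small sketch in which two servers share one session store. `SessionStore` is a hypothetical in-memory stand-in for an external store (a cache or database in a real system), and `StatelessServer` is likewise invented for the example.

```python
class SessionStore:
    """Stand-in for an external session store; here just a dict in memory."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def put(self, session_id, session):
        self._data[session_id] = session


class StatelessServer:
    """Keeps no per-user data on the instance; everything goes through the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # shared dependency, not local session state

    def add_to_watchlist(self, session_id, title):
        session = self.store.get(session_id)
        watchlist = session.get("watchlist", []) + [title]
        self.store.put(session_id, {**session, "watchlist": watchlist})
        return {"served_by": self.name, "watchlist": watchlist}


store = SessionStore()
server_a = StatelessServer("a", store)
server_b = StatelessServer("b", store)

server_a.add_to_watchlist("user-42", "Stranger Things")
print(server_b.add_to_watchlist("user-42", "Dark"))
# {'served_by': 'b', 'watchlist': ['Stranger Things', 'Dark']}
```

Because the second request lands on a different server and still sees the first title, the load balancer is free to route each request anywhere, and instances can be added or removed without losing user data.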

💡

Interview Cheat Sheet for Scalability

Always clarify: read-heavy or write-heavy system?

Start with vertical scaling; bring in horizontal scaling once a single machine's limits are hit

Draw the load balancer explicitly in your diagram

Mention auto-scaling triggers (CPU %, request rate, queue depth)
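The triggers in the last item can be combined in one scaling decision. The sketch below uses a proportional rule (scale the instance count by how far each metric is from its target) and takes the largest proposal, similar in shape to the rule used by the Kubernetes Horizontal Pod Autoscaler. The `Trigger` type, the `desired_capacity` function, and every target value are assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Trigger:
    name: str
    current: float  # observed per-instance value
    target: float   # value we want each instance to stay at or below


def desired_capacity(current_instances, triggers, min_size=1, max_size=100):
    """Return the instance count that satisfies the most demanding trigger."""
    if not triggers:
        return current_instances
    proposals = [
        math.ceil(current_instances * t.current / t.target) for t in triggers
    ]
    return max(min_size, min(max_size, max(proposals)))


triggers = [
    Trigger("cpu_percent", current=75.0, target=60.0),              # proposes 13
    Trigger("requests_per_second", current=900.0, target=1000.0),   # proposes 9
    Trigger("queue_depth", current=40.0, target=20.0),              # proposes 20
]
print(desired_capacity(10, triggers))  # 20
```

Here queue depth wins even though CPU is only moderately above target, which is why it helps in an interview to name more than one trigger.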