Scalability means your system can handle more users or data without crashing or slowing down โ like a highway that can add more lanes when traffic grows.
When everyone watches Stranger Things at 9 PM on a Friday, traffic spikes 10x. A single server would melt.
Netflix runs thousands of stateless microservices on AWS that auto-scale horizontally. AWS ELB (Elastic Load Balancer) distributes traffic across servers in multiple availability zones. When CPU usage hits 60%, new instances spin up automatically. When the spike ends, they shut down to save cost.
Auto-scaling has a 2-5 minute warm-up time. To handle sudden spikes, Netflix keeps a buffer of pre-warmed instances โ costing extra but worth it during launches.
Stateless services + auto-scaling + load balancers = handle any spike. The key insight: any server can handle any request because no session data is stored locally.