A repo of links to articles, papers, conference talks, and tooling related to load management in software services: load shedding, circuit breaking, quota management, and throttling. PRs welcome.
- Loadshedding
- Fairness and Isolation
- Circuit Breaking and Adaptive Concurrency Control
- Quota Management and Ratelimiting
- Reliability and traffic management generally
- Handling Overload, from the Google SRE Book, by Alejandro Forero Cuervo and edited by Sarah Chavis
- Addressing Cascading Failures, from the Google SRE Book, by Mike Ulrich
- Managing Load, from the Google SRE Workbook, by Cooper Bethea et al.
- Using load shedding to avoid overload, from the AWS Builders Library, by David Yanacek
- Timeouts, Retries, Backoff, and Jitter, from the AWS Builders Library, by Marc Brooker
- FIFO Considered Harmful, by Jos Visser
- Using load shedding to survive a success disaster—CRE life lessons, from Google Cloud Blog, by Dave Rensin and Adrian Hilton
- How to avoid a self-inflicted DDoS Attack—CRE life lessons, from Google Cloud Blog, by Dave Rensin and Adrian Hilton
- Keeping Netflix Reliable Using Prioritized Load Shedding, from the Netflix Tech Blog, by Manuel Correa, Arthur Gonigberg, and Daniel West
- Why Disaster Happens at the Edges: An Introduction to Queue Theory, by Avishai Ish-Shalom
- Applying Back Pressure When Overloaded, by Martin Thompson
- Why your reliability problems are really traffic problems
- Load Shedding—Approaches, Principles, Experiences, and Impact in Service Management, by Acacio Cruz, at SREcon EMEA 2016
- Envoy, Take the Wheel: Real-time Adaptive Circuit Breaking, by Tony Allen, at KubeCon + CloudNativeCon Europe 2020
- Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service, by Mostafa Elhemali et al., Amazon Web Services
- Fairness in multi-tenant systems, from the AWS Builders Library, by David Yanacek
- Method overloading the circuit, by Christopher Meiklejohn et al.
- Service mesh circuit breaker: From panic button to performance management tool, by Mohammad Reza Saleh Sedghpour and Johan Tordsson
- Will circuit breakers solve my problems?, by Marc Brooker
- Performance Under Load: Adaptive Concurrency Limits @ Netflix, by Eran Landau, William Thurston, and Tim Bozarth