Error Budget¶
The error budget concept in SRE: balancing reliability against delivery speed.
Principle¶
An SLO of 99.9% leaves an error budget of 0.1%, which is about 43 minutes per month. Budget left? Deploy. Budget spent? Slow down.
- Budget > 50% - deploy freely
- Budget 20-50% - canary releases
- Budget < 20% - critical fixes only
- Budget = 0 - code freeze
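The thresholds above can be sketched as a small lookup function. This is a minimal illustration, not a prescribed implementation: the function name `deployment_policy` is hypothetical, and the handling of the exact boundary values (20% and 50% both fall into the canary band) is an assumption, since the list does not say which side they belong to.

```python
def deployment_policy(remaining_pct: float) -> str:
    """Map the remaining error budget (percent of the total budget) to a release policy.

    Assumption: exactly 20% and exactly 50% are treated as part of the
    20-50% canary band; the source list does not specify the boundaries.
    """
    if remaining_pct <= 0:
        return "code freeze"
    if remaining_pct < 20:
        return "critical fixes only"
    if remaining_pct <= 50:
        return "canary releases"
    return "deploy freely"


if __name__ == "__main__":
    for pct in (80, 35, 10, 0):
        print(f"{pct:>3}% remaining -> {deployment_policy(pct)}")
```

Keeping the policy in one pure function makes it easy to call from a CI gate or a reporting job, and to adjust the thresholds in a single place.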
Implementing Error Budgets¶
The error budget is derived from the defined SLO. With an SLO of 99.9% availability over a 30-day month, the error budget is 0.1% of total time: 0.001 × 43,200 minutes = 43.2 minutes, or roughly 43 minutes of downtime. The budget is tracked continuously and serves as an objective measure for deciding how fast to deploy changes.
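The arithmetic can be checked with a short sketch. The function names `error_budget_minutes` and `remaining_budget_pct` are illustrative, and the sketch assumes a time-based availability SLO where consumed budget is measured as minutes of downtime in the window.

```python
def error_budget_minutes(slo_pct: float, window_days: int = 30) -> float:
    """Return the allowed downtime in minutes for an availability SLO over a window."""
    if not 0 < slo_pct < 100:
        raise ValueError("slo_pct must be strictly between 0 and 100")
    total_minutes = window_days * 24 * 60  # 30 days -> 43,200 minutes
    return total_minutes * (100.0 - slo_pct) / 100.0


def remaining_budget_pct(slo_pct: float, downtime_minutes: float,
                         window_days: int = 30) -> float:
    """Return the share of the error budget still unspent, as a percentage (floored at 0)."""
    budget = error_budget_minutes(slo_pct, window_days)
    return max(0.0, (budget - downtime_minutes) / budget * 100.0)


if __name__ == "__main__":
    budget = error_budget_minutes(99.9)
    print(f"Budget for 99.9% over 30 days: {budget:.1f} min")
    print(f"Remaining after 30 min of downtime: {remaining_budget_pct(99.9, 30):.1f}%")
```

The remaining percentage is the figure the deployment thresholds are compared against, so it is the natural value to expose on a dashboard.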
The key is linking the error budget to specific actions: above 50% remaining budget the team deploys freely, between 20% and 50% it switches to canary releases, below 20% only critical fixes are deployed, and when the budget is exhausted a code freeze begins. This framework replaces subjective debates between development (which wants to ship fast) and operations (which wants stability) with data. Error budget reporting should be automated and visible to the entire team; a Grafana dashboard showing the current budget status is the minimum.
Summary¶
An error budget quantifies the organization's risk appetite.