SLO/SLI Definitions
Service Level Indicators and Objectives: measuring reliability.
SLI and SLO
Service: API
SLO: 99.9% availability (monthly)
SLI: successful_requests / total_requests
Error Budget: 0.1% = ~43 min downtime/month (30-day month)
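The error budget figure follows directly from the SLO and the length of the window. A minimal sketch of that arithmetic, assuming a 30-day month; the function name `error_budget_minutes` is illustrative and not part of any library:

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability SLO allows over the window.

    Assumption: the window is a fixed number of 24-hour days (30 by default).
    """
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in (0, 100]")
    window_minutes = window_days * 24 * 60
    # The budget is the share of the window that is allowed to fail.
    return window_minutes * (100 - slo_percent) / 100


if __name__ == "__main__":
    for slo in (99.5, 99.9, 99.99):
        print(f"SLO {slo}% -> {error_budget_minutes(slo):.1f} min/month")
```

For a 99.9% SLO this gives roughly 43.2 minutes per 30-day month, which matches the ~43 min above.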
How to Define Proper SLIs and SLOs
Define SLIs from the user's experience rather than from internal metrics. For an API, a good SLI is the ratio of requests that succeed (status < 500) and complete in under 300 ms to all requests. Set the SLO strict enough to protect quality, but not so strict that it blocks development.
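The SLI described here can be sketched as a small function over request records. This is an illustration only: the `Request` type, the `compute_sli` name, and the choice to treat an empty window as fully compliant are assumptions, not something the article prescribes.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Request:
    """Hypothetical request record: HTTP status code and latency in milliseconds."""
    status: int
    latency_ms: float


def is_good(req: Request, latency_threshold_ms: float = 300.0) -> bool:
    # A request is "good" when it is not a server error and is fast enough.
    return req.status < 500 and req.latency_ms < latency_threshold_ms


def compute_sli(requests: Iterable[Request], latency_threshold_ms: float = 300.0) -> float:
    """Return good_requests / total_requests for the given requests."""
    total = 0
    good = 0
    for req in requests:
        total += 1
        if is_good(req, latency_threshold_ms):
            good += 1
    if total == 0:
        # Assumption: a window with no traffic counts as meeting the objective.
        return 1.0
    return good / total


if __name__ == "__main__":
    sample = [
        Request(200, 120),  # good
        Request(200, 450),  # too slow
        Request(503, 80),   # server error
        Request(404, 90),   # client error, still counts as good here
    ]
    print(compute_sli(sample))  # 0.5
```

Note that 4xx responses count as good under the `status < 500` rule, since they usually reflect a client mistake rather than a service failure.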
Typical SLOs by service type: web API 99.9% (about 43 min of downtime per month), internal batch processing 99.5% (about 3.6 h per month), critical financial services 99.99% (about 4.3 min per month). The SLA (Service Level Agreement) is a contractual commitment to the customer and should always be less strict than the internal SLO: with an SLO of 99.9%, the SLA might be 99.5%, for example. Monitor SLIs in real time with Prometheus and Grafana, and alert on burn rate, that is, how quickly you are consuming your error budget.
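Burn rate is commonly defined as the observed error rate divided by the error rate the SLO allows, so a value of 1 means the budget would last exactly one SLO window. A minimal sketch under that definition; the function names and the 30-day window are assumptions, and in practice the request counts would come from your monitoring system rather than being passed in by hand:

```python
def burn_rate(bad_requests: int, total_requests: int, slo: float) -> float:
    """Return how fast the error budget is being consumed.

    slo is a fraction (for example 0.999). A result of 1.0 means the budget
    is consumed exactly over the SLO window; 2.0 means twice as fast.
    """
    if not 0 < slo < 1:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    if total_requests == 0:
        return 0.0  # Assumption: no traffic consumes no budget.
    error_rate = bad_requests / total_requests
    allowed_error_rate = 1 - slo
    return error_rate / allowed_error_rate


def hours_until_budget_exhausted(rate: float, window_days: int = 30) -> float:
    """Hours a full budget would last if the given burn rate were sustained."""
    if rate <= 0:
        return float("inf")
    return window_days * 24 / rate


if __name__ == "__main__":
    rate = burn_rate(bad_requests=20, total_requests=10_000, slo=0.999)
    print(f"burn rate: {rate:.2f}")
    print(f"budget lasts: {hours_until_budget_exhausted(rate):.0f} h")
```

An alert would then fire when the burn rate over a recent window exceeds a threshold you choose; the threshold itself is a policy decision and is not derived here.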
Summary
SLO = target reliability. SLI = the measurement. Error budget = room for innovation.