
SRE in Practice — How We Started Measuring Reliability

12. 09. 2018 · 1 min read · CORE SYSTEMS

We read the Google SRE book and said: we want this. Not all at once, because we’re not Google. But the principles behind SLOs, error budgets and blameless postmortems apply even to a team like ours.

SLI, SLO, SLA

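An SLI (service level indicator) is a measurable reliability metric. An SLO (service level objective) is the target for an SLI: 99.9% availability means at most about 43 minutes of downtime in a 30-day month. An SLA (service level agreement) is a contractual commitment, and it should always be looser than the SLO, so that the internal target is breached before the contractual one.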
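The 43-minute figure is simple arithmetic on the SLO target. A minimal sketch, assuming a 30-day month and an availability-style SLO; the function name `allowed_downtime_minutes` is hypothetical and only for illustration:

```python
def allowed_downtime_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability SLO tolerates per window.

    Assumption: the SLO is a plain availability percentage measured over a
    fixed window of `window_days` days.
    """
    if not 0 <= slo_percent <= 100:
        raise ValueError("slo_percent must be between 0 and 100")
    window_minutes = window_days * 24 * 60
    # The tolerated unavailability is the complement of the SLO target.
    return window_minutes * (100 - slo_percent) / 100


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        print(f"{target}% -> {allowed_downtime_minutes(target):.1f} min per 30 days")
```

Each additional nine cuts the allowance by a factor of ten, which is why the target is worth choosing deliberately rather than defaulting to the highest number.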

Error budgets — license to take risks

The error budget is the complement of the SLO: a 99.9% SLO leaves 0.1% of unreliability to spend. While you have budget, you can take risks: deploy, experiment. When you exhaust it, you stop deployments and fix things. It is an objective metric instead of “we don’t want to deploy”.
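The deploy-or-freeze decision can be sketched as a small calculation. This is an illustration under stated assumptions, not a description of our actual tooling: it assumes a request-based SLI (failed requests out of total requests in the SLO window), and the `ErrorBudget` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    slo: float            # target as a fraction, e.g. 0.999 for 99.9%
    total_requests: int   # requests served in the SLO window
    failed_requests: int  # requests that violated the SLI in that window

    @property
    def allowed_failures(self) -> float:
        # The budget is the complement of the SLO applied to the traffic.
        return (1 - self.slo) * self.total_requests

    @property
    def remaining(self) -> float:
        # Negative when the budget is overspent.
        return self.allowed_failures - self.failed_requests

    @property
    def can_deploy(self) -> bool:
        # While budget remains, risk is acceptable; otherwise freeze and fix.
        return self.remaining > 0


if __name__ == "__main__":
    budget = ErrorBudget(slo=0.999, total_requests=1_000_000, failed_requests=400)
    print(f"remaining budget: {budget.remaining:.0f} requests, deploy: {budget.can_deploy}")
```

A real policy would likely add thresholds (for example, slowing down before the budget reaches zero), but the core idea is that the decision comes from the numbers, not from opinion.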

Blameless postmortems

Every incident with SLO impact gets a postmortem. We don’t look for blame, we look for systemic causes: timeline, impact, root cause, what went well and what went wrong, action items. We share the postmortems across the company.

On-call rotation

We run a formal on-call rotation: one engineer per week, PagerDuty for alerting, and runbooks for known issues. On-call time is compensated, because burnout is not SRE.

SRE is cultural change, not just tooling

SRE is about how we think about reliability, how we balance speed and stability, how we learn from mistakes. Even a team of ten people can handle this.

