
On-Call Best Practices

15. 07. 2025 Updated: 27. 03. 2026 1 min read intermediate

DevOps Intermediate

On-Call, SRE, Alerting | 3 min read

Effective on-call: alerting, runbooks, sustainability.

Principles

  • Clear rotation
  • Documented runbooks
  • Actionable alerts
  • Compensation

Runbook

# Alert: HighErrorRate
## Steps
1. kubectl get pods -n production
2. kubectl logs -l app=api --tail=100 -n production
3. If a recent deploy is the cause: kubectl rollout undo deploy/api -n production
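
Steps like these are mechanical enough to script. What follows is a minimal Python sketch, not the team's actual tooling: it wraps the three kubectl commands from the runbook and only prints them unless `dry_run=False` is passed. The function names and the dry-run default are assumptions made for illustration, and the sketch passes the namespace on every command on the assumption that all three steps target `production`.

```python
import shlex
import subprocess

# Values taken from the runbook above.
NAMESPACE = "production"
APP_LABEL = "app=api"
DEPLOYMENT = "deploy/api"


def diagnostic_commands(namespace=NAMESPACE):
    """Return the read-only runbook steps (1 and 2) as argument lists."""
    return [
        ["kubectl", "get", "pods", "-n", namespace],
        ["kubectl", "logs", "-l", APP_LABEL, "--tail=100", "-n", namespace],
    ]


def rollback_command(namespace=NAMESPACE):
    """Return runbook step 3: roll the deployment back one revision."""
    return ["kubectl", "rollout", "undo", DEPLOYMENT, "-n", namespace]


def run(commands, dry_run=True):
    """Print each command and execute it only when dry_run is False.

    Returns the commands rendered as shell-style strings.
    """
    rendered = []
    for cmd in commands:
        line = shlex.join(cmd)
        rendered.append(line)
        print(("DRY RUN: " if dry_run else "RUN: ") + line)
        if not dry_run:
            # check=True raises if kubectl exits with a non-zero status.
            subprocess.run(cmd, check=True)
    return rendered


if __name__ == "__main__":
    run(diagnostic_commands())
    # The rollback is only printed here; executing it stays a human decision.
    run([rollback_command()])
```

Keeping the rollback separate from the diagnostics means the script can safely run the first two steps automatically while the decision to undo a deploy remains with the person on call.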

How to Set Up Sustainable On-Call

A healthy rotation keeps each engineer on call for at most 1 week out of 4 (25% of the time). With weekly shifts and one person on call at a time, that means at least four people in the rotation; with a smaller team, on-call becomes unsustainable and leads to burnout. Every alert must be actionable: if an alert does not require immediate action, lower its severity or remove it. Aim for no more than 2 alerts per on-call shift.
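
The two targets above (at most 25% of time on call, at most 2 alerts per shift) can be checked with simple arithmetic. The sketch below assumes equal-length shifts shared evenly by the team with one person on call at a time; the function names and the `alerts_per_shift` input (for example, counts exported from a paging tool) are hypothetical.

```python
import math

MAX_ONCALL_FRACTION = 0.25   # 1 week out of 4
MAX_ALERTS_PER_SHIFT = 2     # target from the text above


def min_team_size(max_fraction=MAX_ONCALL_FRACTION):
    """Smallest team for which an even rotation stays within max_fraction."""
    return math.ceil(1 / max_fraction)


def oncall_fraction(team_size):
    """Share of time each person spends on call in an even rotation."""
    return 1 / team_size


def rotation_report(team_size, alerts_per_shift):
    """Compare a rotation against both targets.

    alerts_per_shift is a list of alert counts, one entry per past shift.
    """
    fraction = oncall_fraction(team_size)
    over_budget = [n for n in alerts_per_shift if n > MAX_ALERTS_PER_SHIFT]
    return {
        "oncall_fraction": fraction,
        "rotation_ok": fraction <= MAX_ONCALL_FRACTION,
        "shifts_over_alert_budget": len(over_budget),
    }


if __name__ == "__main__":
    print("Minimum team size:", min_team_size())
    print(rotation_report(team_size=3, alerts_per_shift=[1, 5, 2]))
```

A team of three fails the rotation check because each person is on call a third of the time, and any shift with more than two alerts is counted as over budget and is a candidate for alert cleanup.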

Runbooks are living documents that describe, step by step, how to diagnose and resolve a specific alert. Each one should state what the alert means, what steps to take, when to escalate, and which experts to contact. After every incident, update the runbook with new findings. Automate as much as possible: if a runbook contains repetitive steps, turn them into a script or auto-remediation. Finally, compensation for on-call (a bonus or time off) is essential for a fair system.
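
One way to keep runbooks complete is to check each one for the four parts listed above: meaning, steps, escalation criteria, and contacts. The following is a rough sketch under that assumption; the `Runbook` class and its field names are hypothetical and not part of any existing tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Runbook:
    """A runbook for one alert, with the four parts the text recommends."""
    alert: str
    meaning: str = ""                                   # what the alert means
    steps: List[str] = field(default_factory=list)      # what steps to take
    escalation: str = ""                                # when to escalate
    contacts: List[str] = field(default_factory=list)   # expert contacts

    def missing_sections(self):
        """Return the names of recommended sections that are still empty."""
        missing = []
        if not self.meaning.strip():
            missing.append("meaning")
        if not self.steps:
            missing.append("steps")
        if not self.escalation.strip():
            missing.append("escalation")
        if not self.contacts:
            missing.append("contacts")
        return missing


if __name__ == "__main__":
    draft = Runbook(
        alert="HighErrorRate",
        steps=["kubectl get pods -n production"],
    )
    print(draft.missing_sections())
```

A check like this could run in CI or during a post-incident review, so a runbook with no escalation rule or contact list is caught before the next page.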

Summary

Actionable alerts + runbooks + fair rotation = sustainable on-call.


CORE SYSTEMS team
