
Prometheus Alerting Rules

28. 09. 2021 1 min read intermediate


Configuring alerts in Prometheus: PrometheusRule, Alertmanager routing, and best practices.

Alert rules

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: app-alerts
spec:
  groups:
    - name: app.rules
      rules:
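        # Sum both sides so the ratio compares totals; without aggregation each
        # 5xx series is divided by itself and the result is always 1.
        - alert: HighErrorRate
          expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05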
          for: 5m
          labels: {severity: critical}
          annotations:
            summary: "High error rate ({{ $value | humanizePercentage }})"
        - alert: PodCrashLooping
          expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
          for: 5m
          labels: {severity: warning}
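
The error-rate ratio can also be precomputed as a recording rule, so the alert and any dashboards share one definition. The following is a minimal sketch in plain Prometheus rule-file form (the same groups can sit under `spec.groups` of a PrometheusRule). The recording rule name `job:http_requests:error_ratio_rate5m` and the group name are hypothetical, and the sketch assumes the series carry the usual `job` label.

```yaml
groups:
  - name: app.recording.rules   # hypothetical group name
    rules:
      # Hypothetical rule name, following the level:metric:operations convention.
      - record: job:http_requests:error_ratio_rate5m
        expr: |
          sum by (job) (rate(http_requests_total{status=~"5.."}[5m]))
            /
          sum by (job) (rate(http_requests_total[5m]))
      # Rules in a group are evaluated in order, so the alert reads the
      # value recorded in the same evaluation cycle.
      - alert: HighErrorRate
        expr: job:http_requests:error_ratio_rate5m > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate for {{ $labels.job }} ({{ $value | humanizePercentage }})"
```

Aggregating by `job` yields one alert per service instead of a single global one, which may or may not suit a given setup.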

Alertmanager routing

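route:
  receiver: default              # fallback when no child route matches
  routes:
    - matchers:                  # replaces the deprecated `match` field
        - severity="critical"
      receiver: pagerduty
    - matchers:
        - severity="warning"
      receiver: slack
receivers:                       # every receiver named in a route must be defined here
  - name: default                # no notifier configured; alerts landing here are not sent anywhere
  - name: pagerduty
    pagerduty_configs:
      - routing_key: '<pagerduty-integration-key>'   # placeholder
  - name: slack
    slack_configs:
      - channel: '#alerts'
        api_url: '<slack-webhook-url>'               # placeholder; or set slack_api_url under global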
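
Routing is usually combined with grouping and timing settings so that one incident produces one notification rather than a flood. The fragment below is a sketch of the `route` section only (receivers omitted). The `namespace` label in `group_by` is an assumption about the labels your alerts carry, the three top-level timing values are the Alertmanager defaults written out explicitly, and the 1h override is a hypothetical choice.

```yaml
route:
  receiver: default
  # Batch alerts that share these labels into a single notification.
  group_by: ['alertname', 'namespace']
  group_wait: 30s        # wait before sending the first notification for a new group
  group_interval: 5m     # wait before notifying about alerts added to an existing group
  repeat_interval: 4h    # resend interval while the group keeps firing
  routes:
    - matchers:
        - severity="critical"
      receiver: pagerduty
      repeat_interval: 1h   # hypothetical override: re-page sooner for critical alerts
```

Child routes inherit the parent's grouping and timing settings unless they override them, as the critical route does here.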

Summary

Alert on symptoms that users notice (error rate, latency) rather than on causes (CPU usage). Give each alert an appropriate severity and route it to a matching receiver.
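
A latency alert in the same symptom-based spirit might look like the sketch below. It assumes the application exports a Prometheus histogram named `http_request_duration_seconds` (not shown in this article); the 0.5 s threshold, the 10m hold time, and the group name are hypothetical values to tune for your service.

```yaml
groups:
  - name: app.latency.rules   # hypothetical group name
    rules:
      - alert: HighRequestLatency
        # p99 latency over the last 5 minutes, aggregated across all instances.
        expr: |
          histogram_quantile(
            0.99,
            sum by (le) (rate(http_request_duration_seconds_bucket[5m]))
          ) > 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "p99 latency above 500 ms ({{ $value | humanizeDuration }})"
```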


CORE SYSTEMS team
