Prometheus Alerting Rules
How to configure alerting in Prometheus: defining rules with the PrometheusRule resource, routing notifications in Alertmanager, and a few best practices.
Alert rules
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: app-alerts
spec:
  groups:
    - name: app.rules
      rules:
        - alert: HighErrorRate
          # Sum both sides so the ratio covers all requests; without aggregation,
          # the division only pairs series that have identical labels.
          expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
          for: 5m
          labels: {severity: critical}
          annotations:
            summary: "High error rate ({{ $value | humanizePercentage }})"
        - alert: PodCrashLooping
          expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
          for: 5m
          labels: {severity: warning}
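Alert rules can be checked before deployment with `promtool test rules`. The following is a minimal sketch of a unit test for the PodCrashLooping rule, not a definitive setup. It assumes the `groups` list from `spec` has been copied into a plain rules file, since promtool reads rule files rather than PrometheusRule manifests. The file name `app-rules.yml` and the `namespace`, `pod`, and `container` label values are hypothetical.

```yaml
# podcrashlooping_test.yml (hypothetical file name)
rule_files:
  - app-rules.yml          # assumed: plain rules file with a top-level "groups:" key

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # One restart per minute for 30 minutes (illustrative data).
      - series: 'kube_pod_container_status_restarts_total{namespace="default", pod="app-0", container="app"}'
        values: '0+1x30'
    alert_rule_test:
      # The expression is already true, but the 5m "for" window has not elapsed,
      # so the alert is still pending and nothing is firing.
      - eval_time: 3m
        alertname: PodCrashLooping
        exp_alerts: []
      # By now the condition has held for longer than 5m, so the alert fires.
      - eval_time: 10m
        alertname: PodCrashLooping
        exp_alerts:
          - exp_labels:
              severity: warning
              namespace: default
              pod: app-0
              container: app
```

Under these assumptions the test runs with `promtool test rules podcrashlooping_test.yml`. The expected labels include the labels of the matching series as well as the `severity` label set by the rule.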
Alertmanager routing
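route:
  receiver: default
  routes:
    # The matchers syntax requires Alertmanager 0.22 or later.
    - matchers:
        - severity="critical"
      receiver: pagerduty
    - matchers:
        - severity="warning"
      receiver: slack
receivers:
  # Every receiver named in a route must be defined here.
  - name: default
  - name: pagerduty
    pagerduty_configs:
      - routing_key: '<pagerduty-integration-key>'  # placeholder
  - name: slack
    slack_configs:
      # Needs api_url here or slack_api_url in the global section.
      - channel: '#alerts'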
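Routing decides where a notification goes. Grouping settings on the same route decide how many notifications are sent. The fragment below is a sketch of the top-level route with grouping fields added. The choice of `group_by` labels is an assumption for illustration, and the timing values shown are Alertmanager's documented defaults, not tuned recommendations.

```yaml
route:
  receiver: default
  # Alerts that share these label values are batched into one notification.
  group_by: ['alertname', 'severity']   # assumed grouping labels
  group_wait: 30s        # wait before sending the first notification for a new group
  group_interval: 5m     # wait before notifying about new alerts added to a group
  repeat_interval: 4h    # wait before re-sending a notification that is still firing
```

Child routes inherit these settings unless they override them, so a critical route could, for example, set a shorter `repeat_interval` than the default route.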
Summary
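Alert on symptoms that users notice (error rate, latency) rather than on potential causes (CPU usage). Give every alert a severity label and route on it, so that critical alerts page the on-call engineer and warnings go to a chat channel.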
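As an illustration of a latency symptom alert, the rule below is a sketch that could be added to a rule group. It assumes the application exports a histogram named `http_request_duration_seconds`, which is a hypothetical metric name, and the 0.5 second threshold and 10 minute window are placeholders to adjust per service.

```yaml
- alert: HighLatency
  # Assumed metric: a histogram called http_request_duration_seconds (hypothetical).
  # Computes the 99th percentile request duration over the last 5 minutes.
  expr: |
    histogram_quantile(
      0.99,
      sum by (le) (rate(http_request_duration_seconds_bucket[5m]))
    ) > 0.5
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "p99 latency above 500ms ({{ $value | humanizeDuration }})"
```

The rule fires on what users experience (slow responses) regardless of whether the underlying cause is CPU saturation, a slow dependency, or something else.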