Prometheus — monitoring for the container world

11. 05. 2017 | 2 min read | CORE SYSTEMS | devops
Nagios served us faithfully for ten years. But in the dynamic world of containers, where pods are born and die every minute, static monitoring configuration is unsustainable. Prometheus with its service discovery and pull model is exactly what we need.

Why not Nagios/Zabbix

Traditional monitoring works on the principle: configure a list of hosts, define checks, monitor. But in Kubernetes you don’t have “hosts” — you have pods that dynamically move between nodes, scale up and down, die and are reborn.

Prometheus architecture

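Prometheus works on a pull model: it actively scrapes metrics over HTTP from the endpoints it knows about. It has native Kubernetes service discovery, so it learns about pods from the Kubernetes API as they come and go. With the commonly used relabeling configuration, it keeps only the pods carrying the prometheus.io/scrape: "true" annotation and starts collecting metrics from them.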
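The annotation-based selection can be pictured as a filter over the list of pods. The following is a minimal Python sketch of that idea, not Prometheus code: the Pod type, the select_scrape_targets function and the sample pods are hypothetical, and the prometheus.io/path annotation with its /metrics default is an assumed convention that does not come from this article.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical, simplified view of a pod as returned by the Kubernetes API."""
    name: str
    ip: str
    port: int
    annotations: dict = field(default_factory=dict)


def select_scrape_targets(pods):
    """Return scrape URLs for pods that opted in via the scrape annotation."""
    targets = []
    for pod in pods:
        # Only pods annotated with prometheus.io/scrape: "true" are kept.
        if pod.annotations.get("prometheus.io/scrape") != "true":
            continue
        # Assumed convention: an optional path annotation, defaulting to /metrics.
        path = pod.annotations.get("prometheus.io/path", "/metrics")
        targets.append(f"http://{pod.ip}:{pod.port}{path}")
    return targets


pods = [
    Pod("api-1", "10.0.0.5", 8080, {"prometheus.io/scrape": "true"}),
    Pod("batch-1", "10.0.0.6", 9000),  # no annotation, so it is skipped
    Pod("web-1", "10.0.0.7", 8081,
        {"prometheus.io/scrape": "true",
         "prometheus.io/path": "/actuator/prometheus"}),
]

targets = select_scrape_targets(pods)
print(targets)
```

When a pod disappears, it simply drops out of the list on the next discovery refresh, which is why no static host list has to be maintained.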

PromQL — a language you either love or hate

# Request rate per second over the last 5 minutes
rate(http_requests_total[5m])

# 99th percentile latency
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))

# CPU usage per container in the production namespace (in CPU cores)
rate(container_cpu_usage_seconds_total{namespace="production"}[5m])

You learn PromQL gradually, but once you master it, you can answer questions you would never ask with Nagios.
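To build intuition for what rate() does with a counter, here is a simplified Python sketch. It is an approximation under stated assumptions: the simple_rate function and the sample values are hypothetical, and the real PromQL rate() additionally extrapolates to the edges of the time window, which this sketch skips.

```python
def simple_rate(samples):
    """Per-second increase of a counter over a window of (timestamp, value) samples.

    Simplified: the real PromQL rate() also extrapolates to the window boundaries.
    """
    if len(samples) < 2:
        return None  # a rate needs at least two samples in the window
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur < prev:
            # Counter reset (for example a process restart): count from zero again.
            increase += cur
        else:
            increase += cur - prev
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Hypothetical samples of http_requests_total, scraped every 60 s for 5 minutes.
# The drop from 220 to 10 simulates a restarted process.
window = [(0, 100), (60, 160), (120, 220), (180, 10), (240, 70), (300, 130)]
requests_per_second = simple_rate(window)
print(requests_per_second)
```

Handling resets this way is what lets counters survive restarts, and it is the reason rate() should be applied to counters rather than gauges.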

Grafana dashboards

Prometheus itself has only a minimalist web UI. For visualization we use Grafana, which has a native Prometheus data source. The community shares thousands of ready-made dashboards on grafana.com.

Alerting with Alertmanager

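# Alerting rule evaluated by Prometheus; firing alerts are sent to Alertmanager for routing
groups:
- name: application
  rules:
  - alert: HighErrorRate
    # More than 0.1 responses per second with a 5xx status, per time series
    expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.1
    # The condition must hold for 5 minutes before the alert fires
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "High error rate on {{ $labels.service }}"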
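The for: 5m clause in the rule above means the alert does not fire the moment the expression becomes true. It first becomes pending and only fires once the condition has held continuously for the whole duration. A minimal Python sketch of that state machine follows. The alert_states function and the evaluation timeline are hypothetical illustrations, not Prometheus internals.

```python
def alert_states(condition_by_time, for_seconds):
    """Return (timestamp, state) per evaluation; state is inactive, pending or firing."""
    active_since = None
    states = []
    for ts, is_true in condition_by_time:
        if not is_true:
            # Any evaluation where the condition is false resets the timer.
            active_since = None
            states.append((ts, "inactive"))
            continue
        if active_since is None:
            active_since = ts
        state = "firing" if ts - active_since >= for_seconds else "pending"
        states.append((ts, state))
    return states


# Hypothetical rule evaluations every 60 s; the condition briefly recovers at t=120.
evaluations = [(0, True), (60, True), (120, False), (180, True), (240, True),
               (300, True), (360, True), (420, True), (480, True)]
timeline = alert_states(evaluations, for_seconds=300)
for ts, state in timeline:
    print(ts, state)
```

The short recovery at t=120 restarts the clock, so a brief spike never reaches Alertmanager. This is the main lever for keeping flapping alerts out of the on-call channel.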

Application instrumentation

Prometheus client libraries exist for Java, Go, Python, Node.js and others. In Spring Boot, just add Micrometer with the Prometheus registry and you have metrics in minutes. Counter, Gauge, Histogram, Summary: these four metric types cover most needs.
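To show what a client library does under the hood, here is a hand-rolled Python sketch of a counter rendered in the Prometheus text exposition format. It uses only the standard library and is deliberately not the official client. The Counter class here is hypothetical, and a real application would use the official library for its language instead.

```python
class Counter:
    """Hypothetical minimal counter: a value per label set that only goes up."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self.values = {}  # sorted tuple of (label, value) pairs -> float

    def inc(self, amount=1.0, **labels):
        if amount < 0:
            raise ValueError("counters can only increase")
        key = tuple(sorted(labels.items()))
        self.values[key] = self.values.get(key, 0.0) + amount

    def render(self):
        """Render the counter in the Prometheus text exposition format."""
        lines = [f"# HELP {self.name} {self.help_text}",
                 f"# TYPE {self.name} counter"]
        for key, value in sorted(self.values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            lines.append(f"{self.name}{{{label_str}}} {value}")
        return "\n".join(lines) + "\n"


requests = Counter("http_requests_total", "Total HTTP requests handled.")
requests.inc(method="GET", status="200")
requests.inc(method="GET", status="200")
requests.inc(method="POST", status="500")

output = requests.render()
print(output)
```

The rendered text is what Prometheus would read from the application's metrics endpoint on each scrape. A Gauge differs only in that its value may also go down, while Histogram and Summary expose several series per metric.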

Prometheus is the standard for cloud-native monitoring

The transition from Nagios wasn’t trivial — we had to rethink what and how we monitor. But the result is incomparably better. Prometheus with Grafana and Alertmanager is now our standard monitoring trio.

Tags: prometheus, monitoring, kubernetes, grafana