Prometheus: Monitoring for the Cloud-Native World

03. 12. 2015 | 2 min read | CORE SYSTEMS | devops

Prometheus, the monitoring system developed at SoundCloud, introduces a pull-based model, a flexible query language (PromQL), and native support for dynamic environments.

Monitoring for the Container Era

Traditional monitoring tools (Nagios, Zabbix) assume static infrastructure — manually configured hosts with permanent IP addresses. In a containerized environment where instances are created and destroyed dynamically, this model breaks down.

Prometheus was developed at SoundCloud specifically for dynamic, cloud-native environments. Inspired by Google’s internal Borgmon system, it brings large-scale monitoring principles within reach of every engineering team.

Pull Model and Service Discovery

Prometheus actively scrapes metrics from HTTP endpoints exposed by services — the opposite of a push model (StatsD, Graphite).

Advantages of the pull model:

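  • Simpler: a service only needs to expose a /metrics endpoint over HTTP
  • Failure detection: every failed scrape is recorded (the target's up metric drops to 0), so an unreachable or unhealthy service shows up without a separate health check
  • Service discovery integration: targets are found through Consul, Kubernetes, or DNS instead of static host lists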
# Prometheus configuration (prometheus.yml): discover all pods, keep only those labeled app=web
scrape_configs:
  - job_name: 'web-app'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: web
        action: keep
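
On the service side, the "expose a /metrics endpoint" point can be sketched with nothing but the Python standard library. This is a minimal illustration, not a recommended implementation: the counter values, the `render_metrics` and `serve` helpers, and port 8000 are all assumptions made for this example, and a production service would normally rely on an official Prometheus client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters; a real service would update these per request.
REQUESTS = {("GET", "200"): 1027, ("POST", "500"): 3}


def render_metrics(counters):
    """Render counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for (method, status), value in sorted(counters.items()):
        lines.append(
            f'http_requests_total{{method="{method}",status="{status}"}} {value}'
        )
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the scrape path is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUESTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve /metrics until interrupted (port 8000 is an arbitrary choice)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and requesting `http://localhost:8000/metrics` returns the plain-text samples that a Prometheus scrape would collect.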

PromQL — Query Language

PromQL is one of Prometheus’s greatest strengths — a flexible query language for metrics:

# Request rate per second over the last 5 minutes
rate(http_requests_total[5m])

# 99th percentile latency, aggregated across all instances
histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))

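# Error ratio: share of all requests answered with a 5xx status
# (sum() on both sides so the division is not matched series by series)
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))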

PromQL enables ad-hoc analysis, dashboard creation, and the definition of alerting rules.
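
To make the semantics of the queries above more concrete, the following sketch re-implements simplified versions of `rate()` and `histogram_quantile()` in plain Python. It is an approximation under stated assumptions, not Prometheus's actual code: the real `rate()` also extrapolates to the edges of the time window, and the function names and input shapes here (`simple_rate`, lists of tuples) are invented for illustration.

```python
import math


def simple_rate(samples):
    """Per-second increase of a counter between the first and last sample.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    A drop in value is treated as a counter reset (e.g. a process restart).
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # After a reset the counter restarts from zero, so the whole
        # current value counts as new increase.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by bound and
    ending with (math.inf, total_count), mirroring the `le` label.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Quantile lies in the +Inf bucket: report the highest finite bound.
                return lower_bound
            span = count - lower_count
            fraction = (rank - lower_count) / span if span else 0.0
            # Linear interpolation inside the bucket that contains the rank.
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound
```

The interpolation step is why bucket boundaries matter: the estimate can never be more precise than the width of the bucket in which the requested quantile falls.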

Alerting and Grafana Integration

Prometheus Alertmanager handles alerts — deduplication, grouping, silencing, and routing to notification channels (email, Slack, PagerDuty).
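
The ideas of deduplication and grouping can be sketched in a few lines of Python. This is a conceptual illustration only and does not reflect Alertmanager's internals: the `group_alerts` function and the representation of an alert as a plain dictionary of labels are assumptions for this example, while the `group_by` parameter is named after the corresponding Alertmanager route setting.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Drop duplicate alerts, then bucket the rest into notification groups.

    alerts: iterable of dicts mapping label name -> label value.
    group_by: label names whose values define one notification group.
    """
    seen = set()
    groups = defaultdict(list)
    for labels in alerts:
        # Two alerts with exactly the same label set are the same alert.
        fingerprint = tuple(sorted(labels.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)
```

Grouping by, say, `alertname` and `cluster` means that an outage affecting many instances produces one notification per cluster rather than one per instance.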

For visualization, Prometheus pairs well with Grafana, a widely used open-source dashboarding tool. Together, Prometheus, Grafana, and Alertmanager form a complete monitoring stack.

Recommended metrics to monitor: RED (Rate, Errors, Duration) for services, USE (Utilization, Saturation, Errors) for infrastructure.
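
As a rough illustration of what the three RED signals measure, the sketch below derives them from a list of observed requests. In practice these values would come from PromQL queries over counters and histograms; the `red_summary` function, its input shape, and the choice of the median as the duration statistic are assumptions made for this example.

```python
def red_summary(requests, window_seconds):
    """Compute Rate, Errors, and Duration for one observation window.

    requests: list of (status_code, duration_seconds) seen in the window.
    window_seconds: length of the window in seconds.
    """
    total = len(requests)
    errors = sum(1 for status, _ in requests if status >= 500)
    durations = sorted(duration for _, duration in requests)
    return {
        # Rate: requests handled per second.
        "rate_per_second": total / window_seconds,
        # Errors: share of requests that failed with a 5xx status.
        "error_ratio": errors / total if total else 0.0,
        # Duration: upper median of the request durations.
        "median_duration_seconds": durations[total // 2] if total else None,
    }
```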

Conclusion: The Standard for Cloud-Native Monitoring

Prometheus is rapidly becoming the standard for monitoring in cloud-native environments. It was the second project accepted into the Cloud Native Computing Foundation (CNCF) after Kubernetes, and that is no coincidence. For every new project involving containers, we recommend Prometheus as the primary monitoring solution.
