// THE REBEL BLOG

Thoughts on free software, privacy, and taking back control

2026-03-14 · 10 min read · DevOps

Prometheus: The Backbone of Modern Monitoring

Your application is in production. Users are clicking, data is flowing, and everything seems fine—until it isn't. A service starts responding slowly. A server runs out of memory. A database connection pool saturates. How do you know? More importantly, how do you know before your users start complaining?

This is where Prometheus comes in. Originally built at SoundCloud and now a top-level Cloud Native Computing Foundation project, Prometheus has become the standard for cloud-native monitoring. It's the tool that answers the question: "What's happening in my system right now?"

What Is Prometheus?

Prometheus is an open-source systems monitoring and alerting toolkit. It collects metrics from configured targets, stores them locally in a time series database, and makes them available for querying via PromQL, its powerful query language.

Unlike traditional monitoring systems that push data to a central server, Prometheus follows a pull model. It reaches out to your applications and scrapes their metrics endpoints at regular intervals. This approach has several advantages:

  • Simple — Applications just need to expose a metrics endpoint
  • Observable — A failed scrape tells you immediately that a target is down
  • Flexible — You can point a throwaway instance at production targets while debugging
  • Scalable — With service discovery, targets come and go without manual reconfiguration
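In practice, becoming a scrape target takes only a few lines. A minimal sketch using the official Python client, prometheus_client (the metric name and port 8000 are illustrative choices, not requirements):

```python
from prometheus_client import Counter, start_http_server

# Illustrative metric; name it whatever fits your app
demo_requests_total = Counter("demo_requests_total", "Total demo requests")

# Serve /metrics on localhost:8000 in a background thread;
# add 'localhost:8000' to scrape_configs and Prometheus pulls from it
start_http_server(8000)

demo_requests_total.inc()  # increment as your app does work
# (a real app keeps running; Prometheus scrapes every scrape_interval)
```

Visit http://localhost:8000/metrics while this runs and you'll see the plain-text exposition format Prometheus scrapes.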

Core Concepts

Metrics

Metrics are the heart of Prometheus. A metric is a numeric measurement that changes over time. Prometheus supports four types:

  • Counter — A monotonically increasing value (e.g., total requests, total errors)
  • Gauge — A value that can go up or down (e.g., memory usage, CPU temperature)
  • Histogram — A distribution of values (e.g., request latencies)
  • Summary — Similar to histogram, but with pre-computed quantiles
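The four types map directly onto client-library calls. A sketch with prometheus_client (metric names here are made up for illustration):

```python
from prometheus_client import Counter, Gauge, Histogram, Summary

errors_total = Counter("app_errors_total", "Total errors; only ever increases")
cpu_temp = Gauge("cpu_temperature_celsius", "Current temperature; can go up or down")
latency = Histogram("request_latency_seconds", "Latency distribution, stored as buckets")
payload = Summary("payload_size_bytes", "Payload sizes, tracked as count and sum")

errors_total.inc()       # counters can only increase
cpu_temp.set(54.2)       # gauges take arbitrary values
latency.observe(0.23)    # each observation lands in a bucket
payload.observe(1024)    # summaries accumulate count and sum
```

Choosing between Histogram and Summary matters later: histogram buckets can be aggregated across instances in PromQL, while pre-computed quantiles cannot.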

Labels

Labels add dimensions to your metrics. Instead of having one "http_requests_total" metric, you can have multiple with different labels:

http_requests_total{method="GET", status="200", endpoint="/api/users"}
http_requests_total{method="POST", status="201", endpoint="/api/users"}
http_requests_total{method="GET", status="404", endpoint="/api/nonexistent"}

This lets you slice and dice your data however you need.
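In client code, the label dimensions are declared once and filled in per observation. A sketch with prometheus_client:

```python
from prometheus_client import Counter, generate_latest

http_requests_total = Counter(
    "http_requests_total", "Total HTTP requests",
    ["method", "status", "endpoint"],  # label names declared up front
)

# Each distinct label combination becomes its own time series
http_requests_total.labels(method="GET", status="200", endpoint="/api/users").inc()
http_requests_total.labels(method="POST", status="201", endpoint="/api/users").inc()

print(generate_latest().decode())  # the exposition text Prometheus scrapes
```

One caution: keep label values low-cardinality. Putting user IDs or full URLs in a label creates a new time series per value and will eventually overwhelm the server.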

Exporters

An exporter is a tool that exposes existing metrics in Prometheus format. There's an exporter for almost everything:

  • node_exporter — System-level metrics (CPU, memory, disk, network)
  • blackbox_exporter — HTTP, HTTPS, DNS, TCP, ICMP probes
  • postgres_exporter — PostgreSQL database metrics
  • redis_exporter — Redis metrics
  • cadvisor — Docker container metrics

PromQL

PromQL is Prometheus's query language. It's incredibly powerful for time series analysis:

# CPU usage across all nodes
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Requests per second
rate(http_requests_total[5m])

# 95th percentile latency
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
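What rate() computes is the per-second average increase of a counter over the window. A simplified plain-Python illustration (the real implementation also corrects for counter resets and extrapolates to the window boundaries):

```python
# (timestamp_seconds, counter_value) samples across a 5-minute window
samples = [(0, 100), (60, 160), (120, 220), (180, 280), (240, 340), (300, 400)]

increase = samples[-1][1] - samples[0][1]   # 300 requests
window = samples[-1][0] - samples[0][0]     # 300 seconds
per_second = increase / window
print(per_second)  # 1.0 requests/second
```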

Setting Up Prometheus

# Download Prometheus (check the releases page for the current version;
# wget cannot expand wildcards in URLs, so the version must be explicit)
VERSION=2.53.0
wget https://github.com/prometheus/prometheus/releases/download/v${VERSION}/prometheus-${VERSION}.linux-amd64.tar.gz
tar xvfz prometheus-${VERSION}.linux-amd64.tar.gz
cd prometheus-${VERSION}.linux-amd64

# Create config
cat > prometheus.yml << 'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
EOF

# Start Prometheus
./prometheus --config.file=prometheus.yml

Now visit http://localhost:9090 to access the Prometheus UI.
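The same PromQL you type into the UI can also be run programmatically through the HTTP API at /api/v1/query. A stdlib-only sketch (assumes a Prometheus instance on localhost:9090):

```python
import json
import urllib.parse
import urllib.request

def query_url(base, expr):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return base + "/api/v1/query?" + urllib.parse.urlencode({"query": expr})

def instant_query(expr, base="http://localhost:9090"):
    """Run a PromQL instant query and return the list of result series."""
    with urllib.request.urlopen(query_url(base, expr)) as resp:
        body = json.load(resp)
    if body["status"] != "success":
        raise RuntimeError(body.get("error", "query failed"))
    return body["data"]["result"]

if __name__ == "__main__":
    # 'up' is 1 for every target Prometheus can currently scrape
    for series in instant_query("up"):
        print(series["metric"], series["value"])
```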

Alerting with Alertmanager

Prometheus isn't just for viewing metrics; it can also alert you when things go wrong. Alert rules live in a separate file that prometheus.yml references under rule_files, and the Alertmanager component handles alert routing, grouping, and notification:

# alerts.yml
groups:
  - name: example
    rules:
      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage detected"
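Alertmanager itself has its own configuration, a routing tree that decides who gets notified and how. A minimal sketch (the webhook URL is a placeholder; email, Slack, and PagerDuty receivers work the same way):

```yaml
# alertmanager.yml
route:
  receiver: default
  group_by: ['alertname']

receivers:
  - name: default
    webhook_configs:
      - url: 'http://localhost:5001/alerts'   # placeholder endpoint
```

Prometheus is then pointed at the running Alertmanager via the alerting block in prometheus.yml.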

Visualization with Grafana

While Prometheus has a built-in UI, Grafana is the standard for visualization. They integrate seamlessly:

  1. Install Grafana
  2. Add Prometheus as a data source
  3. Build dashboards with PromQL queries

Grafana also has pre-built dashboards for most exporters, so you can get started in minutes.

The Ecosystem

Prometheus doesn't exist in isolation. It fits into a larger observability ecosystem:

  • Thanos — Adds long-term storage, global querying, and high availability
  • Cortex — Multi-tenant Prometheus as a service
  • VictoriaMetrics — Cost-effective alternative with better compression
  • OpenTelemetry — Standard for collecting traces and metrics

Instrumenting Your Applications

Most languages have client libraries for Prometheus. Here's a simple Python example:

from flask import Flask, Response
from prometheus_client import Counter, Gauge, Histogram, generate_latest

app = Flask(__name__)

# Count requests
requests_total = Counter('app_requests_total', 'Total requests')

# Track in-progress requests
in_progress = Gauge('app_in_progress', 'Requests in progress')

# Measure request latency
request_duration = Histogram('app_request_duration_seconds', 'Request duration')

@app.route("/api/users")
@request_duration.time()  # records each call's latency into the histogram
def get_users():
    in_progress.inc()
    try:
        return fetch_users()
    finally:
        in_progress.dec()
        requests_total.inc()

# Expose everything for Prometheus to scrape
@app.route("/metrics")
def metrics():
    return Response(generate_latest(), mimetype="text/plain")

Why Prometheus Matters

Prometheus succeeded because it embraced simplicity while delivering power. It doesn't try to do everything—it does one thing extremely well: collect and query time series metrics.

In a world where systems are increasingly distributed, ephemeral, and complex, having solid observability isn't optional. Prometheus gives you the visibility you need to debug issues, optimize performance, and sleep soundly at night.

If you can't measure it, you can't improve it. — Peter Drucker

Getting Started

  1. Install Prometheus — Start with a single instance
  2. Add node_exporter — Monitor your first server
  3. Learn PromQL — Practice querying in the UI
  4. Set up alerting — Add Alertmanager
  5. Add Grafana — Build beautiful dashboards
  6. Instrument your code — Expose application metrics

The learning curve is gentle, and the payoff is immediate. Once you have Prometheus running, you'll wonder how you ever managed without it.
