DATA
VISUALIZATION

// See everything. Know instantly.

GRAFANA GIVES YOU SUPERPOWERS.

In a world drowning in data, Grafana transforms raw metrics into actionable insights. It's not just a dashboard tool: it's the lens through which you view your entire infrastructure. When something breaks at 3 AM, Grafana tells you what, when, and why.

VISUALIZE ANYTHING.

From server CPU usage to business KPIs, from network traffic to application latency. Grafana connects to Prometheus, InfluxDB, Elasticsearch, PostgreSQL, and dozens of other data sources. You decide what to measure. Grafana makes it beautiful.

ALERT BEFORE DISASTER STRIKES.

Proactive monitoring means fixing problems before users notice. Grafana's alerting system notifies you via email, Slack, PagerDuty, or webhook when metrics cross thresholds. Sleep better at night knowing Grafana is watching your systems.

BEGIN YOUR JOURNEY →

// The Path to Observability Mastery

12 lessons. Complete Grafana control.

LESSON 01

Introduction to Grafana

What is observability? Installing Grafana and understanding the interface.

Beginner
LESSON 02

Data Sources

Connecting to Prometheus, InfluxDB, Elasticsearch, and more.

Beginner
LESSON 03

Your First Dashboard

Creating panels, queries, and organizing dashboards.

Beginner
LESSON 04

Query Language

PromQL, InfluxQL, and other query languages for time series data.

Intermediate
LESSON 05

Visualizations

Graphs, heatmaps, tables, gauges, and advanced visualizations.

Intermediate
LESSON 06

Variables & Templating

Dynamic dashboards with variables and template queries.

Intermediate
LESSON 07

Alerting System

Creating alerts, notification policies, and alert rules.

Intermediate
LESSON 08

Alert Notifications

Email, Slack, PagerDuty, webhook integrations.

Intermediate
LESSON 09

User Management

Teams, permissions, and organization management.

Advanced
LESSON 10

Plugins & Extensions

Installing plugins, building custom panels, Grafana Loki.

Advanced
LESSON 11

API & Automation

Grafana API, provisioning, and infrastructure as code.

Advanced
LESSON 12

Production Best Practices

High availability, security, and scaling Grafana.

Advanced

LESSON 01: Introduction to Grafana

What is Grafana?

Grafana is an open-source platform for data visualization and monitoring. It connects to various data sources and transforms data into beautiful, interactive dashboards. Originally created in 2014, it has become the de facto standard for observability dashboards.

Grafana is used by companies of all sizes, from small startups to massive enterprises like Google, Netflix, and PayPal. It's the visualization layer for monitoring systems, providing the "what happened" and "why" behind your metrics.

⚑ POWER MOVE: Grafana is free and open source. You can run it on your own infrastructure, customize it, and contribute back to the community. No vendor lock-in, no SaaS pricing.

What is Observability?

Observability is the ability to measure the internal states of a system by examining its outputs. In IT operations, this means:

  • Metrics: Quantitative measurements (CPU usage, request count, latency)
  • Logs: Timestamped records of events
  • Traces: Request paths through distributed systems

Grafana excels at metrics and can integrate with logging (Loki) and tracing (Tempo) systems for complete observability.

Installing Grafana

Ubuntu/Debian

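# Add Grafana repository (apt-key is deprecated, so the signing key goes into a keyring)
sudo apt-get install -y apt-transport-https software-properties-common wget
sudo mkdir -p /etc/apt/keyrings/
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list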

# Install
sudo apt update
sudo apt install grafana

# Start
sudo systemctl start grafana-server
sudo systemctl enable grafana-server

Docker

# Run Grafana
docker run -d \
  --name=grafana \
  -p 3000:3000 \
  -v grafana-data:/var/lib/grafana \
  grafana/grafana

Access Grafana at http://localhost:3000. Default credentials: admin/admin
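Once the server is running, a quick way to confirm it is to call the health endpoint. The sketch below assumes Grafana listens on http://localhost:3000 and that /api/health returns a JSON body with a "database" field; the helper names are illustrative, not part of Grafana.

```python
import json
import urllib.request


def parse_health(body: str) -> bool:
    """Return True when the health payload reports a working database."""
    data = json.loads(body)
    return data.get("database") == "ok"


def check_grafana(base_url: str = "http://localhost:3000", timeout: float = 5.0) -> bool:
    """Fetch /api/health and report whether Grafana looks healthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=timeout) as resp:
            return parse_health(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body
        # all count as unhealthy in this sketch.
        return False


if __name__ == "__main__":
    print("Grafana healthy:", check_grafana())
```

A check like this can be handy in a deployment script that waits for Grafana before provisioning anything.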

Grafana Interface

The Grafana interface consists of several key areas:

  • Left Sidebar: Navigation - Dashboards, Explore, Alerts, Configuration, Server Admin
  • Top Bar: Search, User menu, Notifications
  • Dashboard View: Panels arranged in rows
  • Panel Editor: Query builder, visualization options, metadata

Key Concepts

  • Data Source: A storage backend that provides metrics (Prometheus, InfluxDB, Elasticsearch, etc.)
  • Dashboard: A collection of panels organized in rows
  • Panel: A single visualization (graph, stat, table, etc.)
  • Query: A request for data from a data source
  • Alert: A rule that triggers notifications when conditions are met
  • Folder: Organization unit for dashboards
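To see how these concepts nest, here is a heavily trimmed sketch of a dashboard model: a dashboard holds panels, each panel points at a data source and carries one or more queries. The title and the data source UID are hypothetical, and real dashboards contain many more fields.

```python
import json

# A dashboard holds panels; each panel has a visualization type, a data
# source reference, and one or more queries ("targets").
dashboard = {
    "title": "Node overview",  # hypothetical title
    "panels": [
        {
            "type": "timeseries",
            "title": "CPU usage",
            "datasource": {"type": "prometheus", "uid": "my-prometheus"},  # hypothetical UID
            "targets": [
                {"refId": "A", "expr": "rate(node_cpu_seconds_total[5m])"}
            ],
        }
    ],
}


def list_queries(dash: dict) -> list[tuple[str, str]]:
    """Return (panel title, query expression) pairs for every query in a dashboard."""
    return [
        (panel["title"], target["expr"])
        for panel in dash.get("panels", [])
        for target in panel.get("targets", [])
    ]


print(json.dumps(list_queries(dashboard), indent=2))
```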

LESSON 02: Data Sources

Data Sources Overview

Grafana supports 50+ data sources. Let's cover the most popular ones.

Prometheus

Prometheus, a powerful time-series database designed for metrics collection, is the most common pairing with Grafana.

  1. Go to Configuration > Data Sources
  2. Click "Add data source"
  3. Select "Prometheus"
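  4. Configure URL: http://localhost:9090 (if Grafana itself runs in Docker, use an address that is reachable from inside the container)
  5. Click "Save & Test"

# If you don't have Prometheus, install it
# (run this from the directory that contains prometheus.yml;
# Docker bind mounts need an absolute host path):
docker run -d \
  --name=prometheus \
  -p 9090:9090 \
  -v "$(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml" \
  prom/prometheus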

# prometheus.yml:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

InfluxDB

InfluxDB is popular for custom application metrics.

# Run InfluxDB
docker run -d \
  --name=influxdb \
  -p 8086:8086 \
  influxdb:1.8

# Configure in Grafana (InfluxQL, InfluxDB 1.x):
# URL: http://localhost:8086
# Database: mydb
# Username: admin
# Password: yourpassword
# Note: InfluxDB 2.x uses an organization, bucket, and token instead.

PostgreSQL / MySQL

Use SQL databases for metrics that live in your application database.

-- Query example:
SELECT
  $__timeGroup(created_at, '5m') AS time,
  count(*) AS request_count,
  avg(response_time) AS avg_response_time
FROM http_requests
WHERE $__timeFilter(created_at)
GROUP BY 1
ORDER BY 1
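Conceptually, $__timeGroup floors each timestamp to the start of its interval and $__timeFilter keeps only rows inside the dashboard's time range. The following sketch mimics that in plain Python over made-up rows; it illustrates the idea and is not how Grafana expands the macros.

```python
from collections import defaultdict


def time_group(epoch_seconds: int, interval_seconds: int = 300) -> int:
    """Floor a Unix timestamp to the start of its bucket."""
    return epoch_seconds - (epoch_seconds % interval_seconds)


def aggregate(rows, start, end, interval_seconds=300):
    """Group (timestamp, response_time) rows into buckets within [start, end].

    Returns a sorted list of (bucket_start, request_count, avg_response_time).
    """
    buckets = defaultdict(list)
    for ts, response_time in rows:
        if start <= ts <= end:  # rough analogue of $__timeFilter
            buckets[time_group(ts, interval_seconds)].append(response_time)
    return [
        (bucket, len(values), sum(values) / len(values))
        for bucket, values in sorted(buckets.items())
    ]


# Hypothetical rows: (created_at as epoch seconds, response_time in seconds).
rows = [(1000, 0.25), (1100, 0.75), (1300, 1.0), (9999, 5.0)]
print(aggregate(rows, start=900, end=1500))
```

The last row falls outside the time range and is dropped, just as a panel ignores data outside the selected window.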

Elasticsearch

For log and document storage visualization.

# Configure:
# URL: http://localhost:9200
# Index name: logs-*
# Time field: @timestamp

Multiple Data Sources

You can add multiple data sources and reference them in different panels:

# In a single dashboard:
# - Panel 1: Queries Prometheus (infrastructure metrics)
# - Panel 2: Queries InfluxDB (application metrics)
# - Panel 3: Queries PostgreSQL (business metrics)

LESSON 03: Your First Dashboard

Creating a Dashboard

  1. Click "+" icon in sidebar or "Create" > "Dashboard"
  2. Click "Add panel"
  3. Select your data source
  4. Write a query
  5. Choose visualization
  6. Save dashboard

Your First Query

Let's query Prometheus for CPU usage:

# Prometheus query:
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

This calculates the percentage of CPU that's NOT idle, giving you total CPU usage.
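The arithmetic behind that query can be worked through by hand. The sketch below uses two hypothetical CPU cores and a simplified rate (last sample minus first, divided by the window), which ignores the counter-reset handling and extrapolation Prometheus performs.

```python
def idle_rate(first_sample: float, last_sample: float, window_seconds: float) -> float:
    """Per-second increase of the idle counter over the window (simplified rate())."""
    return (last_sample - first_sample) / window_seconds


def cpu_usage_percent(idle_rates: list[float]) -> float:
    """Average the per-core idle fractions and convert to a busy percentage."""
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100 - avg_idle * 100


# Two hypothetical cores observed over a 5-minute (300 s) window.
core0 = idle_rate(10_000.0, 10_240.0, 300)  # idle for 240 of 300 seconds
core1 = idle_rate(20_000.0, 20_180.0, 300)  # idle for 180 of 300 seconds
print(cpu_usage_percent([core0, core1]))
```

With one core idle 80% of the time and the other 60%, the average idle share is 70%, so usage comes out at roughly 30%.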

Panel Editor

The panel editor has several tabs:

  • Query: Write and test queries
  • Visualization: Choose chart type
  • Panel: Panel settings (title, description, links)
  • Field: Field options, units, thresholds
  • Alert: Create alert rules

Organizing Dashboards

  • Folders: Group related dashboards
  • Tags: Add tags for filtering
  • Annotations: Add event markers to graphs
  • Dashboard links: Link between related dashboards

Sharing Dashboards

Grafana offers multiple sharing options:

  • Share: Generate a link or embed
  • Export: Download JSON for backup/sharing
  • Snapshot: Save current data as snapshot
  • PDF: Export dashboard as PDF (a Grafana Enterprise and Grafana Cloud feature)

LESSON 04: Query Language

PromQL Fundamentals

Prometheus Query Language (PromQL) is powerful for time-series data.

Basic Queries

# Direct metric
node_cpu_seconds_total

# With label filter
node_cpu_seconds_total{mode="idle"}

# Rate - per-second rate of change
rate(node_cpu_seconds_total[5m])

# Irate - instant rate, computed from the last two samples in the range
irate(node_cpu_seconds_total[5m])
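The difference between rate and irate is easiest to see on a counter that spikes at the end of the window. This is a simplified sketch with invented samples; it skips the counter-reset handling and extrapolation that Prometheus applies.

```python
Sample = tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: list[Sample]) -> float:
    """Average per-second increase between the first and last sample."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def simple_irate(samples: list[Sample]) -> float:
    """Per-second increase between the last two samples only."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    return (v1 - v0) / (t1 - t0)


# A counter that grows steadily, then spikes in the final scrape interval.
samples = [(0, 100.0), (15, 115.0), (30, 130.0), (45, 145.0), (60, 220.0)]
print(simple_rate(samples), simple_irate(samples))
```

The averaged rate smooths the spike, which suits alerting and slow-moving graphs; the instant rate reacts to it immediately, which suits zoomed-in troubleshooting.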

Aggregations

# Sum all values
sum(node_cpu_seconds_total)

# Average
avg(rate(http_requests_total[5m]))

# Max/Min
max(node_memory_MemAvailable_bytes)

# Count
count(node_cpu_seconds_total{mode="user"})

# By label
sum by (instance) (rate(node_cpu_seconds_total[5m]))

Functions

# Calculate percentage
100 - (avg by (mode) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Moving average
avg_over_time(node_memory_MemAvailable_bytes[5m])

# Predict the value one hour (3600 s) ahead, based on the last hour of samples
predict_linear(node_memory_MemAvailable_bytes[1h], 3600)
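predict_linear fits a straight line through the samples in the range and extends it forward. A rough Python equivalent using ordinary least squares is sketched below; the function name and the sample data are illustrative, and the real implementation may differ in detail.

```python
def predict_linear_sketch(
    samples: list[tuple[float, float]], now: float, seconds_ahead: float
) -> float:
    """Fit a least-squares line through (timestamp, value) samples and extrapolate."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = covariance / variance
    intercept = mean_v - slope * mean_t
    return intercept + slope * (now + seconds_ahead)


# Hypothetical free memory (in MB) dropping by 1 MB per second for an hour.
history = [(float(t), 8000.0 - t) for t in range(0, 3601, 600)]
print(predict_linear_sketch(history, now=3600, seconds_ahead=3600))
```

A common use is alerting when the predicted value crosses zero, for example a disk that will fill up within a few hours.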

Grafana Variables in Queries

# Use $variable in queries
node_exporter_build_info{job="$job"}

# Multi-value variable
node_exporter_build_info{job=~"$job"}

# Time ranges
$__range_s
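The reason multi-value variables need =~ is that Grafana expands them into a regex alternation. The sketch below imitates that substitution in a simplified way; the function is hypothetical, and Grafana's own escaping and formatting rules are more involved.

```python
import re


def interpolate(query: str, variables: dict[str, list[str]]) -> str:
    """Replace $name placeholders with a single value or a regex alternation."""

    def render(match: re.Match) -> str:
        name = match.group(1)
        values = variables.get(name)
        if values is None:
            return match.group(0)  # leave unknown variables untouched
        if len(values) == 1:
            return values[0]
        # Several selected values become (a|b|c), which only works with =~.
        return "(" + "|".join(re.escape(v) for v in values) + ")"

    return re.sub(r"\$(\w+)", render, query)


print(interpolate('node_exporter_build_info{job="$job"}', {"job": ["node"]}))
print(interpolate('node_exporter_build_info{job=~"$job"}', {"job": ["node", "prometheus"]}))
```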

Query Performance

  • Prefer short range selectors such as [5m] over [1h] where the analysis allows - each evaluation reads fewer samples
  • Reduce cardinality - avoid high-cardinality labels
  • Use recording rules - pre-aggregate expensive queries
  • Limit time range - don't query more data than needed

LESSON 05: Visualizations

Visualization Types

Grafana offers dozens of visualizations:

  • Time series: Line, area, bar charts
  • Stat: Single big number with sparkline
  • Bar gauge: Horizontal/vertical gauges
  • Table: Tabular data
  • Heatmap: Density visualization
  • Pie chart: Pie/Donut charts
  • Logs: Log entries display

Graph Panel

The classic time-series visualization:

# Visualization options:
# - Mode: Lines, Bars, Points
# - Line interpolation: Smooth, Step, Linear
# - Line width: 1-5px
# - Fill opacity: 0-100%
# - Gradient mode: Opacity, Hue, Saturation

Stat Panel

Display a single value prominently:

  • Sparkline: Mini chart in background
  • Color by: Background or value
  • Orientation: Horizontal/Vertical
  • Text mode: Value only, name only, value and name

Thresholds

Color-code values based on thresholds:

# Thresholds:
# Green: 0-70
# Yellow: 70-90
# Red: 90-100

# Base: green
# Threshold 1: 70 (yellow)
# Threshold 2: 90 (red)
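Threshold steps work like a staircase: a value takes the color of the highest step it has reached, and the base color applies below the first step. A minimal sketch of that lookup, with an illustrative function name:

```python
def threshold_color(value: float, steps: list[tuple[float, str]], base: str = "green") -> str:
    """Return the color of the highest threshold step that the value reaches."""
    color = base
    for limit, step_color in sorted(steps):
        if value >= limit:
            color = step_color
    return color


# The steps from the example above: yellow from 70, red from 90.
steps = [(70, "yellow"), (90, "red")]
for reading in (42, 70, 89.9, 95):
    print(reading, threshold_color(reading, steps))
```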

Value Mappings

Transform values for display:

# Map numeric codes to text:
# 0 -> OK
# 1 -> Warning
# 2 -> Critical

# Map boolean:
# true -> Active
# false -> Inactive

LESSON 06: Variables & Templating

Dashboard Variables

Variables make dashboards dynamic and reusable.

Creating Variables

  1. Go to Dashboard Settings (gear icon)
  2. Click "Variables" > "Add variable"
  3. Configure name, type, query
  4. Use in queries: $variableName

# Query type: Query
# Data source: Prometheus
# Query: label_values(node_exporter_build_info, job)
# Name: job
# Multi-value: ✓
# Include All option: ✓

Variable Types

  • Query: Dynamic values from data source
  • Constant: Fixed value
  • Textbox: User input
  • Interval: Time intervals
  • Datasource: Select data source
  • Custom: Comma-separated values

Chained Variables

Make selections cascade:

# Variable 1: $environment
# Query: label_values(node_exporter_build_info, environment)

# Variable 2: $host
# Query: label_values(node_exporter_build_info{environment="$environment"}, instance)

# Result: Select environment, then available hosts in that environment
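What the cascade does can be simulated over a handful of made-up series: the second lookup only sees series that match the value picked for the first variable. The label_values function below is a local stand-in written for this illustration, not the Grafana implementation.

```python
def label_values(series: list[dict[str, str]], label: str, **matchers: str) -> list[str]:
    """Return the sorted unique values of `label` among series matching all matchers."""
    return sorted({
        s[label]
        for s in series
        if label in s and all(s.get(k) == v for k, v in matchers.items())
    })


# Hypothetical label sets, as a data source might return them.
series = [
    {"environment": "prod", "instance": "web-1:9100"},
    {"environment": "prod", "instance": "web-2:9100"},
    {"environment": "staging", "instance": "web-9:9100"},
]

environments = label_values(series, "environment")            # options for $environment
hosts = label_values(series, "instance", environment="prod")  # options for $host
print(environments, hosts)
```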

Advanced Queries

# Prometheus ad hoc filters variable:
# Metric: node_network_receive_bytes_total
# Filters: {{label}}="{{value}}"

# Using regex:
# Query: label_values(up{job=~"$job.*"}, instance)

LESSON 07: Alerting System

Alerting Overview

Grafana alerting monitors your metrics and notifies you when thresholds are crossed.

⚑ NOTE: Grafana 8 introduced unified alerting, which is the recommended system. Legacy alerting remained available for several releases and was removed in Grafana 11.

Creating Alert Rules

  1. Open a panel > Click "Alert" tab
  2. Click "Create Alert Rule"
  3. Define query and conditions
  4. Set evaluation behavior
  5. Configure notifications

# Condition:
# WHEN: avg() OF query(A, 5m, now) IS ABOVE 80

# This triggers when the 5-minute average 
# of query A exceeds 80

Alert Conditions

# Basic: IS ABOVE / IS BELOW / IS OUTSIDE RANGE / IS WITHIN RANGE

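# Example queries:
# CPU usage above 80% (the idle rate is a fraction between 0 and 1, so convert to a percentage first)
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80

# Memory above 90%
(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes > 0.9

# Request errors above 5% (sum both sides so the label sets match in the division)
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05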

Evaluation

# How often to check:
# Evaluation interval: 5m (check every 5 minutes)

# How long condition must be true:
# For: 5m (trigger after 5 minutes of violation)

# This prevents flapping alerts

Alert States

  • Normal: Below threshold
  • Pending: Above threshold but not yet firing
  • Firing: Above threshold for the "for" duration
  • No Data: No data returned
  • Error: Query error
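The interplay between the evaluation interval, the "for" duration, and these states can be sketched as a small state machine. This simplified version only covers Normal, Pending, and Firing (it ignores No Data and Error), and the function name is illustrative.

```python
def evaluate(breaches: list[bool], interval: int, for_duration: int) -> list[str]:
    """Walk through evaluation results and return the alert state after each one.

    breaches[i] says whether the condition was violated at evaluation i;
    interval and for_duration are in seconds.
    """
    states = []
    pending_since = None  # time at which the condition first became true
    for i, breached in enumerate(breaches):
        now = i * interval
        if not breached:
            pending_since = None
            states.append("Normal")
            continue
        if pending_since is None:
            pending_since = now
        firing = now - pending_since >= for_duration
        states.append("Firing" if firing else "Pending")
    return states


# Evaluate every 60 s with "for: 120 s": the alert fires on the third breach in a row.
print(evaluate([False, True, True, True, False], interval=60, for_duration=120))
```

A short blip that clears before the "for" duration elapses never leaves Pending, which is how the setting suppresses flapping.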

LESSON 08: Alert Notifications

Notification Channels

Grafana supports many notification destinations:

  • Email
  • Slack
  • PagerDuty
  • OpsGenie
  • Webhook
  • Telegram
  • Microsoft Teams
  • Discord

Configuring Email

  1. Go to Configuration > Notification channels (in unified alerting: Alerting > Contact points)
  2. Add "Email" channel
  3. Configure SMTP in grafana.ini:

[smtp]
enabled = true
host = smtp.example.com:587
user = grafana@example.com
password = yourpassword
from_address = grafana@example.com
from_name = Grafana Alert

Slack Integration

# In Slack:
# 1. Create Incoming Webhook
# 2. Copy webhook URL

# In Grafana:
# 1. Add Slack notification channel
# 2. Paste webhook URL
# 3. Set recipient (#alerts or @username)
# 4. Test notification

Webhook for Custom Integrations

# Webhook sends JSON:
{
  "title": "[FIRING:1] CPU High (grafana)",
  "message": "CPU usage above 80%",
  "state": "alerting",
  "evalMatches": [...],
  "ruleUrl": "http://localhost:3000/alerting/..."
}

# Handle in your application to:
# - Create tickets
# - Page on-call
# - Run automation
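On the receiving side, a handler only needs to parse the JSON and branch on the state. The sketch below relies on the "title", "message", and "state" fields from the payload above and assumes that a resolved alert arrives with state "ok"; the returned action names are placeholders for real integrations.

```python
import json


def handle_alert(raw_body: str) -> str:
    """Decide what to do with a webhook payload; returns the chosen action name."""
    payload = json.loads(raw_body)
    state = payload.get("state", "")
    title = payload.get("title", "(untitled alert)")
    if state == "alerting":
        # Placeholder: page on-call or open a ticket here.
        print(f"PAGE on-call: {title}: {payload.get('message', '')}")
        return "page"
    if state == "ok":  # assumed value for a resolved alert
        print(f"Resolved: {title}")
        return "resolve"
    return "ignore"


example = json.dumps({
    "title": "[FIRING:1] CPU High (grafana)",
    "message": "CPU usage above 80%",
    "state": "alerting",
})
print(handle_alert(example))
```

In practice this function would sit behind a small HTTP endpoint whose URL is configured in the webhook settings.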

Notification Policies

Route alerts based on labels:

# Default policy: All alerts go to email

# Custom policies:
# - IF severity=critical -> Slack #critical + PagerDuty
# - IF team=backend -> Slack #backend-alerts
# - IF service=api -> Email on-call@company.com
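Label-based routing boils down to checking each policy's matchers against the alert's labels. The sketch below mirrors the example policies and assumes first-match-wins semantics with a fallback to the default policy; the actual routing tree supports nesting and continued matching, which are left out here.

```python
Policy = tuple[dict[str, str], list[str]]  # (label matchers, contact points)

POLICIES: list[Policy] = [
    ({"severity": "critical"}, ["Slack #critical", "PagerDuty"]),
    ({"team": "backend"}, ["Slack #backend-alerts"]),
    ({"service": "api"}, ["Email on-call@company.com"]),
]
DEFAULT = ["Email"]


def route(labels: dict[str, str]) -> list[str]:
    """Return the contact points of the first policy whose matchers all match."""
    for matchers, contact_points in POLICIES:
        if all(labels.get(key) == value for key, value in matchers.items()):
            return contact_points
    return DEFAULT


print(route({"severity": "critical", "team": "backend"}))
print(route({"team": "frontend"}))
```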

LESSON 09: User Management

Organization Model

Grafana uses organizations for multi-tenancy:

  • Org: Isolated workspace with own data sources and dashboards
  • Users: Belong to one or more orgs with different roles
  • Teams: Groups within an org for easier permission management

Roles

  • Viewer: View dashboards only
  • Editor: View and edit dashboards
  • Admin: Full org management
  • Server Admin: User management, global settings (Grafana admin)

Team Permissions

  1. Create team in Configuration > Teams
  2. Add members
  3. Set team permissions on folders/dashboards

# Permission levels:
# - View
# - Edit
# - Admin

LDAP Integration

# grafana.ini
[auth.ldap]
enabled = true
config_file = /etc/grafana/ldap.toml

# ldap.toml
[[servers]]
host = "ldap.example.com"
port = 636
use_ssl = true

[[servers.group_mappings]]
group_dn = "cn=admins,ou=groups,dc=example,dc=com"
org_role = "Admin"

LESSON 10: Plugins & Extensions

Grafana Plugin Ecosystem

Extend Grafana with plugins:

  • Data Source Plugins: Connect to new databases
  • Panel Plugins: New visualization types
  • App Plugins: Bundled dashboards and plugins

Installing Plugins

# Via grafana-cli
grafana-cli plugins install grafana-worldmap-panel
grafana-cli plugins install grafana-piechart-panel

# Via Docker (mount a local plugins directory)
docker run -d -p 3000:3000 \
  -v "$(pwd)/plugins:/var/lib/grafana/plugins" \
  -e GF_PATHS_PLUGINS=/var/lib/grafana/plugins \
  grafana/grafana

# Restart Grafana after installation
sudo systemctl restart grafana-server

Grafana Loki

Loki is Grafana's log aggregation system:

# Run Loki
docker run -d --name=loki -p 3100:3100 grafana/loki

# Configure Loki as data source:
# URL: http://localhost:3100

# LogQL queries:
{job="nginx"} |= "error"
{job="nginx"} | json | status_code >= 500
rate({job="app"}[5m])

Popular Plugins

  • Worldmap Panel: Geographic visualization
  • Pie Chart: Donut/pie charts
  • Geomap: Advanced mapping
  • Apache ECharts: Rich visualizations
  • JSON API: Query any REST API
  • CloudWatch: AWS CloudWatch metrics

LESSON 11: API & Automation

Grafana API

Everything in Grafana can be automated via API.

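# Get an API key: Configuration > API Keys
# (newer Grafana versions replace API keys with service account tokens)

# Get a dashboard by UID
curl -H "Authorization: Bearer $API_KEY" \
  http://localhost:3000/api/dashboards/uid/my-dashboard

# List dashboards (quote the URL so the shell does not interpret "?")
curl -H "Authorization: Bearer $API_KEY" \
  "http://localhost:3000/api/search?type=dash-db"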

# Create dashboard
curl -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -X POST http://localhost:3000/api/dashboards/db \
  -d '{"dashboard": {...}}'
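The same "create dashboard" call can be scripted without curl. This sketch builds the request with Python's standard library and leaves the sending step commented out; the base URL, the placeholder key, the minimal dashboard body, and the "overwrite" flag value are assumptions for illustration.

```python
import json
import urllib.request


def build_create_dashboard_request(
    base_url: str, api_key: str, dashboard: dict
) -> urllib.request.Request:
    """Prepare a POST to /api/dashboards/db carrying the dashboard JSON."""
    body = json.dumps({"dashboard": dashboard, "overwrite": False}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/dashboards/db",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_create_dashboard_request(
    "http://localhost:3000", "YOUR_API_KEY", {"title": "Demo", "panels": []}
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```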

Provisioning

Declaratively define dashboards, data sources, and more:

# provisioning/datasources/datasources.yml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true

  - name: Loki
    type: loki
    url: http://loki:3100

# provisioning/dashboards/dashboards.yml
apiVersion: 1

providers:
  - name: 'Dashboards'
    orgId: 1
    folder: 'Monitoring'
    type: file
    options:
      path: /var/lib/grafana/dashboards

Dashboard as Code

# Export dashboard JSON and keep it under version control
curl -H "Authorization: Bearer $API_KEY" \
  http://localhost:3000/api/dashboards/uid/my-dashboard \
  > dashboards/my-dashboard.json

# Or use grafonnet library to generate from templates
# https://github.com/grafana/grafonnet
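Exported JSON diffs more cleanly if it is normalized before committing. The sketch below assumes the API response wraps the model in a "dashboard" key next to "meta", and it drops the "id" and "version" fields because they vary between instances and saves; treat the choice of fields as a suggestion.

```python
import json


def to_versionable_json(api_response: str) -> str:
    """Extract the dashboard model from an API response and format it stably."""
    dashboard = json.loads(api_response)["dashboard"]
    dashboard.pop("id", None)       # numeric id differs between Grafana instances
    dashboard.pop("version", None)  # changes on every save, adding noise to diffs
    return json.dumps(dashboard, indent=2, sort_keys=True) + "\n"


# Hypothetical, heavily trimmed API response.
response = json.dumps({
    "meta": {"slug": "my-dashboard"},
    "dashboard": {"id": 7, "version": 12, "uid": "my-dashboard", "title": "My Dashboard"},
})
print(to_versionable_json(response))
```

Sorting keys and using a fixed indent keeps the file stable across exports, so a commit only shows real changes.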

LESSON 12: Production Best Practices

Security

  • Enable HTTPS - Use reverse proxy with TLS
  • API Keys - Rotate regularly
  • Disable anonymous access
  • Data source proxy - Don't expose data sources to browsers
  • RBAC - Use teams and permissions

High Availability

For HA, use multiple Grafana instances:

# Use external database:
# PostgreSQL or MySQL

# grafana.ini:
[database]
type = postgres
host = dbserver:5432
name = grafana
user = grafana
password = yourpassword

# Share state across instances:
# Point every instance at the same database and the same data source configuration
# Put a load balancer in front of the instances

Performance

  • Query caching - Enable data source query caching (available in Grafana Enterprise and Grafana Cloud)
  • Limit time ranges - Don't query more than needed
  • Recording rules - Pre-aggregate expensive queries
  • Resource limits - Set query timeouts
  • Dashboard optimization - Reduce panel count

Backup & Recovery

# Backup:
# - Database (dashboards, users, settings)
# - Provisioning files
# - Plugins
# - Dashboards JSON

# Grafana stores in:
# - Database (SQLite, PostgreSQL, MySQL)
# - /var/lib/grafana (dashboards, plugins, etc.)
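For installations on the default SQLite database, the file can be copied safely while Grafana is running by using SQLite's online backup API. The sketch below demonstrates this on a throwaway database; in a real run the source would typically be /var/lib/grafana/grafana.db, which is assumed here to be the default location.

```python
import os
import sqlite3
import tempfile


def backup_sqlite(source_path: str, target_path: str) -> None:
    """Copy a SQLite database with the online backup API (safe while in use)."""
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(target_path)
    try:
        source.backup(target)
    finally:
        target.close()
        source.close()


# Demonstration against a temporary database standing in for grafana.db.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "grafana.db")
    dst = os.path.join(tmp, "grafana-backup.db")

    conn = sqlite3.connect(src)
    conn.execute("CREATE TABLE dashboard (title TEXT)")
    conn.execute("INSERT INTO dashboard VALUES ('Node overview')")
    conn.commit()
    conn.close()

    backup_sqlite(src, dst)

    copy = sqlite3.connect(dst)
    restored = copy.execute("SELECT title FROM dashboard").fetchall()
    copy.close()

print(restored)
```

For PostgreSQL or MySQL backends, the database's own dump tooling is the better fit.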

Conclusion

You've completed the Grafana mastery guide. You now know how to:

  • Install and configure Grafana
  • Connect to various data sources
  • Build beautiful dashboards
  • Write complex queries
  • Create dynamic dashboards with variables
  • Set up alerting
  • Configure notifications
  • Manage users and permissions
  • Extend with plugins
  • Automate with API
  • Deploy production-grade Grafana

Next steps:

  • Build dashboards for your infrastructure
  • Set up alerting for critical metrics
  • Explore Grafana Loki for logs
  • Integrate with Prometheus