SEE EVERYTHING

// If it happens, log it.

LOGGING IS OBSERVABILITY.

In production, you can't debug with print statements. You need structured logs, centralized aggregation, and powerful search. When things break at 3 AM, good logs are your best friend.

WHY CENTRALIZED LOGGING?

With dozens of servers and services, checking logs on each machine is impossible. Centralized logging aggregates everything in one place. Search across all logs, create dashboards, and get alerts when errors spike.

BECOME OBSERVABLE.

Master the ELK stack (Elasticsearch, Logstash, Kibana) or Grafana Loki. Learn to structure logs, create alerts, and build dashboards. The key to running reliable systems is seeing what's happening.

BEGIN YOUR JOURNEY →

// The Path to Mastery

12 lessons. Complete logging control.

LESSON 01

Logging Fundamentals

Understand log levels, structured logging, and best practices.

Beginner
LESSON 02

System Logging

Configure syslog, journald, and system log rotation.

Beginner
LESSON 03

Application Logging

Implement logging in Python, Go, and Node.js applications.

Beginner
LESSON 04

Log Format & Structure

Create JSON logs with timestamps, levels, and context.

Beginner
LESSON 05

Log Shippers

Use Filebeat, Fluentd, and Journalbeat to forward logs.

Intermediate
LESSON 06

Elasticsearch Basics

Install Elasticsearch and understand the data model.

Intermediate
LESSON 07

Logstash & Ingest

Parse, transform, and enrich logs with Logstash pipelines.

Intermediate
LESSON 08

Kibana Dashboards

Build visualizations and dashboards in Kibana.

Intermediate
LESSON 09

Grafana Loki

Set up Loki for cost-effective log aggregation.

Advanced
LESSON 10

Log Alerts

Create alerts for errors, spikes, and anomalies.

Advanced
LESSON 11

Performance & Scaling

Optimize Elasticsearch and handle billions of logs.

Advanced
LESSON 12

Security & Compliance

Secure logs, manage access, and meet compliance requirements.

Advanced

// Why Centralized Logging

The old approach of SSHing to each server and grepping through log files doesn't scale. When you have microservices, containers, and auto-scaling, you need centralized logging.

The ELK stack (Elasticsearch, Logstash, Kibana) is the most popular solution. Elasticsearch stores and searches logs, Logstash processes them, and Kibana provides the interface.

Grafana Loki offers a cheaper alternative, storing logs in object storage and only indexing labels. Either way, centralized logging is essential for debugging production issues.

You can't fix what you can't see. Own your logs.

// Tools & References

📖 Elasticsearch Docs

Official Documentation

elastic.co

📦 Logstash

Log Processing

logstash

📊 Kibana

Visualization

kibana

๐Ÿณ Loki

Grafana Logging

grafana.com/loki

// Logging Fundamentals


What is Logging?

Logging is the practice of recording events, errors, and information from your applications. Good logs help you understand what your system is doing and debug when things go wrong.

Log Levels

  • DEBUG: Detailed diagnostic info
  • INFO: General information about operations
  • WARNING: Something unexpected, but not an error
  • ERROR: A problem that prevented something from working
  • FATAL: Critical error causing application crash
PRO TIP: Use INFO for normal operations, WARNING for recoverable issues, ERROR for failures. Reserve DEBUG for development only.

Basic Logging

$ logger "Application started"
# The message appears in the system log:
Feb 15 10:30:00 server logger: Application started

Logging in Applications

# Python
import logging
logging.basicConfig(level=logging.INFO)
logging.info("User logged in")
logging.error("Connection failed")

// Go (standard library "log" has Println/Printf, not Info/Error)
log.Println("Request processed")
log.Println("Database timeout")

// JavaScript/Node
console.log("Info message");
console.error("Error message");

Quiz

1. What level for normal operations?

2. What level for application crashes?

// System Logging


syslog

Standard logging daemon on Unix systems:

$ logger -p user.info "My application message"   # Write to /var/log/syslog
$ tail -f /var/log/syslog                        # Watch logs in real-time

systemd Journal

$ journalctl -xe                    # View journal
$ journalctl -u nginx.service       # Logs for specific service
$ journalctl --since "1 hour ago"   # Recent logs

Log Locations

/var/log/syslog     # General system messages
/var/log/auth.log   # Authentication
/var/log/kern.log   # Kernel messages
/var/log/dmesg      # Boot messages
/var/log/nginx/     # Application logs

Log Rotation

$ sudo logrotate -f /etc/logrotate.conf # Force rotation

Quiz

1. What command views systemd logs?

// Application Logging


Python Logging

import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

logger = logging.getLogger(__name__)
logger.info("Application started")
logger.error("Database connection failed")

Go Logging

import (
    "log"
    "os"
)

// Set output (the standard library log package has Println/Printf, not Info/Error)
log.SetOutput(os.Stdout)
log.Println("Server starting on port 8080")
log.Println("Failed to connect to database")

Node.js Logging

// npm install pino
const pino = require('pino')
const logger = pino()

logger.info("Server started")
logger.error("Connection refused")

Quiz

1. What module for logging in Python?

// Log Format & Structure


Structured Logging

JSON logs are machine-parseable and work great with log aggregation:

{
  "timestamp": "2024-02-15T10:30:00Z",
  "level": "INFO",
  "message": "User logged in",
  "user_id": 12345,
  "ip_address": "192.168.1.1"
}

Key Fields

# Essential fields for every log
- timestamp   # ISO 8601 format
- level       # DEBUG, INFO, WARN, ERROR, FATAL
- message     # Human-readable description
- service     # Which service generated it
- trace_id    # For request tracing

Context is King

# Good log - has context
{
  "message": "Payment failed",
  "user_id": 123,
  "amount": 99.99,
  "currency": "USD",
  "error_code": "CARD_DECLINED"
}

# Bad log - no context
"Payment failed"
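A log like the "good" one above can be produced directly from a language's logging framework. A minimal Python sketch using the standard library (the `JSONFormatter` class, the `payments` service name, and the context keys are illustrative choices, not a standard API):

```python
import json
import logging
from datetime import datetime, timezone

class JSONFormatter(logging.Formatter):
    """Render each record as one JSON line carrying the key fields above."""

    # Context keys promoted from `extra=` into the JSON body (hypothetical set)
    CONTEXT_KEYS = ("user_id", "amount", "currency", "error_code")

    def format(self, record):
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            "service": "payments",  # placeholder service name
        }
        for key in self.CONTEXT_KEYS:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)

logger = logging.getLogger("payments")
handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Emits the "good log" shape above as a single machine-parseable line
logger.error("Payment failed",
             extra={"user_id": 123, "amount": 99.99,
                    "currency": "USD", "error_code": "CARD_DECLINED"})
```

Attaching context via `extra=` keeps the message itself stable, which makes the logs easy to group and search later.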

Quiz

1. What format is machine-parseable?

// Log Shippers


What are Log Shippers?

Agents that forward logs from servers to centralized logging systems:

Filebeat

$ sudo apt install filebeat # Install Filebeat
# /etc/filebeat/filebeat.yml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/nginx/*.log
    fields:
      service: nginx

output.elasticsearch:
  hosts: ["localhost:9200"]

Fluentd

$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-fluentd-{{site.version}}-2.sh | sh

syslog-ng

$ sudo apt install syslog-ng

Quiz

1. What forwards logs to central systems?

// Elasticsearch Basics


What is Elasticsearch?

Distributed search and analytics engine, optimized for log storage and searching:

Installing

$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
$ echo "deb https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
$ sudo apt update && sudo apt install elasticsearch

Starting Elasticsearch

$ sudo systemctl enable elasticsearch
$ sudo systemctl start elasticsearch
$ curl http://localhost:9200

Elasticsearch Concepts

Index     # Like a database
Type      # Like a table (deprecated in ES 7+)
Document  # Like a row (JSON)
Shard     # Fragment of an index
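Elasticsearch's fast full-text search rests on an inverted index: each term maps to the set of documents containing it. A toy Python sketch of the idea (an illustration of the data structure, not Elasticsearch's actual implementation):

```python
from collections import defaultdict

# Three "documents" (log messages) keyed by document ID
documents = {
    1: "database connection failed",
    2: "user logged in",
    3: "database timeout on query",
}

# Inverted index: term -> set of document IDs containing that term
index = defaultdict(set)
for doc_id, text in documents.items():
    for term in text.split():
        index[term].add(doc_id)

def search(term):
    """Return sorted IDs of documents containing `term`."""
    return sorted(index.get(term, set()))

print(search("database"))  # → [1, 3]
```

Looking up a term is a dictionary access rather than a scan over every document, which is why searching billions of log lines stays fast.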

Quiz

1. What stores documents in Elasticsearch?

// Logstash & Ingest


Logstash Pipeline

Input → Filter → Output

input {
  beats { port => 5044 }
}

filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
  date {
    match => [ "timestamp", "ISO8601" ]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logs-%{+YYYY.MM.dd}"
  }
}

Grok Patterns

# Common patterns
%{TIMESTAMP_ISO8601:timestamp}
%{LOGLEVEL:level}
%{IP:client_ip}
%{NUMBER:response_time}
%{URI:request_path}
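Grok patterns are essentially named regular expressions. A rough Python equivalent of a `%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}` match (a simplified sketch, not Logstash's actual pattern definitions):

```python
import re

# Simplified stand-ins for the grok patterns named above
LOG_LINE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\S*)\s+"
    r"(?P<level>DEBUG|INFO|WARN|WARNING|ERROR|FATAL)\s+"
    r"(?P<msg>.*)"
)

match = LOG_LINE.match("2024-02-15T10:30:00Z ERROR Database connection failed")
fields = match.groupdict()
print(fields["level"])  # → ERROR
print(fields["msg"])    # → Database connection failed
```

Each named group becomes a structured field, which is exactly what the grok filter does before the document is handed to Elasticsearch.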

Quiz

1. What parses unstructured logs?

// Kibana Dashboards


What is Kibana?

Visualization and dashboarding for Elasticsearch:

Installing

$ sudo apt install kibana
$ sudo systemctl enable kibana
$ sudo systemctl start kibana
$ curl -X GET "localhost:5601/api/status"

Creating Visualizations

# Steps to create a dashboard
1. Go to Visualize → Create new
2. Choose chart type (line, bar, pie, etc.)
3. Select index pattern
4. Choose metric (count, average, etc.)
5. Choose bucket (terms, date histogram)
6. Save and add to dashboard

Common Visualizations

  • Line Chart: Errors over time
  • Bar Chart: Requests by endpoint
  • Pie Chart: Status code distribution
  • Metric: Total requests, error rate

Quiz

1. What visualizes Elasticsearch data?

// Grafana Loki


What is Loki?

Cost-effective log aggregation from Grafana. Unlike ELK, it only indexes labels, not full text:

Installing

$ docker run -d --name loki -p 3100:3100 grafana/loki:latest
# Or use docker-compose

Promtail Configuration

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://localhost:3100/loki/api/v1/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: syslog
          __path__: /var/log/syslog

Loki vs ELK

Feature    Loki             ELK
Storage    Cheap (S3/GCS)   Expensive
Indexing   Labels only      Full text
Setup      Simple           Complex
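Because Loki indexes only labels, queries start from a label selector and then filter the raw lines. A few LogQL sketches against the `job: syslog` labels from the Promtail config above (assuming a standard Loki setup):

```
{job="syslog"}                         # all lines carrying the syslog label
{job="syslog"} |= "error"              # only lines containing "error"
rate({job="syslog"} |= "error" [5m])   # per-second rate of error lines over 5m
```

The label selector narrows the search cheaply; the line filter and rate function do the full-text work only on the matching streams.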

Quiz

1. What indexes only labels?

// Log Alerts


Kibana Alerts

# Create alert in Kibana
1. Stack Management → Alerts → Create
2. Define condition (threshold)
3. Set action (email, Slack, webhook)
4. Enable and test

Grafana Alerts

# Alert rule
- Query: rate(http_requests_total{status="500"}[5m]) > 10
- For: 5m
- Labels: severity=critical
- Annotations:
    description: High error rate detected

Alert Actions

  • Email: Send notification
  • Slack: Post to channel
  • PagerDuty: Incident management
  • Webhook: Trigger automation

Quiz

1. What triggers on conditions?

// Performance & Scaling


Index Management

# Close old indices
POST /logs-2023.01/_close

# Delete very old indices
DELETE /logs-2022.01

# Use index templates
PUT /_index_template/logs-template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": { "number_of_shards": 3 }
  }
}

ILM (Index Lifecycle Management)

# Hot → Warm → Cold → Delete
PUT /_ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": { "min_age": "0ms" },
      "warm": { "min_age": "30d" },
      "cold": { "min_age": "90d" },
      "delete": { "min_age": "1y" }
    }
  }
}

Scaling Strategies

  • Add more data nodes
  • Use proper shard counts
  • Implement ILM policies
  • Use ultra-warm nodes for historical data

Quiz

1. What manages index lifecycle?

// Security & Compliance


Encrypting Logs

# Enable TLS for Elasticsearch (elasticsearch.yml)
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.certificate: /path/to/cert.crt
xpack.security.http.ssl.key: /path/to/cert.key

Access Control

# Kibana spaces for team isolation
- Admin space: Full access
- Dev space: Development logs only
- Support space: Production read-only

Audit Logging

# Enable audit logging
xpack.security.audit.enabled: true
xpack.security.audit.outputs: [index, logfile]

Data Retention

# Compliance requirements
- PCI-DSS: 1 year
- HIPAA: 6 years
- GDPR: Based on purpose
- SOC 2: 1 year

Quiz

1. What tracks access to data?