// Every application is built on data.
SQL IS THE BACKBONE OF MODERN APPLICATIONS.
Every website, every mobile app, every enterprise system— underneath it all is a database and SQL. Whether you're building a simple blog or a complex analytics platform, understanding SQL gives you the power to harness your data.
WHY SQL MATTERS
SQL (Structured Query Language) is the universal language for relational databases. It works with PostgreSQL, MySQL, Oracle, SQL Server, and countless others. Master SQL once, and you can work with any database in the world.
BECOME DATA FLUENT
Learn to write queries that extract exactly what you need. Understand joins, subqueries, window functions, and transactions. Design schemas that scale. Optimize performance. SQL is not just about getting data—it's about getting it right.
12 lessons. Complete SQL control.
What is SQL? Setting up a database and understanding relational concepts.
BeginnerSELECT, WHERE, ORDER BY, and filtering data.
BeginnerINSERT, UPDATE, DELETE, and managing data.
BeginnerCREATE TABLE, data types, constraints, and schema design.
BeginnerINNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN, and multiple tables.
IntermediateGROUP BY, HAVING, COUNT, SUM, AVG, MAX, MIN.
IntermediateNested queries, correlated subqueries, and common table expressions.
IntermediateCreating indexes, understanding execution plans, and query optimization.
AdvancedACID properties, BEGIN, COMMIT, ROLLBACK, and isolation levels.
AdvancedROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, and analytical queries.
AdvancedCreating views, stored procedures, functions, and triggers.
AdvancedDatabase design patterns, security, backup, and scaling strategies.
AdvancedSQL (Structured Query Language) is a domain-specific language used to manage and manipulate relational databases. It's the standard language for database systems, allowing you to create, read, update, and delete data, as well as manage database structure.
SQL is declarative—you describe what you want, not how to get it. The database engine figures out the most efficient way to execute your query.
A table is a collection of related data entries (rows) with a consistent structure (columns):
+-------------+-------------+-------------+-------------+
| first_name | last_name | email | created_at |
+-------------+-------------+-------------+-------------+
| John | Doe | j@example.com| 2024-01-15 |
| Jane | Smith | jane@example| 2024-01-16 |
+-------------+-------------+-------------+-------------+
Each column has a data type that defines what kind of data it can hold:
A primary key uniquely identifies each row in a table:
CREATE TABLE users (
id INTEGER PRIMARY KEY,
email VARCHAR(255) NOT NULL UNIQUE,
name VARCHAR(100)
);
A foreign key creates a relationship between two tables:
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
user_id INTEGER REFERENCES users(id),
total DECIMAL(10,2),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
# Install PostgreSQL (Ubuntu)
sudo apt update
sudo apt install postgresql postgresql-contrib
# Start the service
sudo systemctl start postgresql
sudo systemctl enable postgresql
# Connect to PostgreSQL
sudo -u postgres psql
# Create a database
CREATE DATABASE myapp;
# Connect to your database
\c myapp
-- Select all rows from a table
SELECT * FROM users;
-- Select specific columns
SELECT first_name, email FROM users;
-- Add a WHERE clause to filter
SELECT * FROM users WHERE id = 1;
The SELECT statement is the foundation of SQL. It retrieves data from one or more tables based on specified criteria.
-- Select all columns
SELECT * FROM employees;
-- Select specific columns
SELECT first_name, last_name, salary FROM employees;
-- Use aliases for readability
SELECT
first_name AS "First Name",
last_name AS "Last Name",
salary * 12 AS "Annual Salary"
FROM employees;
-- Basic equality
SELECT * FROM products WHERE category = 'Electronics';
-- Numeric comparisons
SELECT * FROM orders WHERE total > 100;
SELECT * FROM employees WHERE years_of_service >= 5;
-- Multiple conditions with AND
SELECT * FROM users
WHERE status = 'active'
AND created_at > '2024-01-01';
-- OR condition
SELECT * FROM products
WHERE category = 'Electronics'
OR category = 'Books';
-- IN operator
SELECT * FROM users
WHERE country IN ('USA', 'Canada', 'Mexico');
-- BETWEEN operator (inclusive)
SELECT * FROM orders
WHERE total BETWEEN 50 AND 100;
-- LIKE for pattern matching
SELECT * FROM users
WHERE email LIKE '%@gmail.com';
-- Case-insensitive search (PostgreSQL)
SELECT * FROM users
WHERE name ILIKE 'john%';
-- Ascending order (default)
SELECT * FROM employees ORDER BY salary;
-- Explicit ascending
SELECT * FROM employees ORDER BY salary ASC;
-- Descending order
SELECT * FROM employees ORDER BY salary DESC;
-- Multiple columns
SELECT * FROM employees
ORDER BY department ASC, salary DESC;
-- Order by expression
SELECT * FROM products
ORDER BY price * quantity DESC;
-- Limit number of results
SELECT * FROM users LIMIT 10;
-- Skip first 5, get next 10 (OFFSET)
SELECT * FROM users LIMIT 10 OFFSET 5;
-- PostgreSQL syntax for pagination
SELECT * FROM users
ORDER BY created_at DESC
LIMIT 20 OFFSET 40; -- Page 3, 20 per page
-- FETCH (SQL standard)
SELECT * FROM users
ORDER BY name
LIMIT 10;
-- Get unique values
SELECT DISTINCT country FROM users;
-- Distinct combinations
SELECT DISTINCT category, subcategory
FROM products;
-- Arithmetic operations
SELECT
product_name,
price,
quantity,
price * quantity AS total
FROM order_items;
-- String concatenation (PostgreSQL)
SELECT
first_name || ' ' || last_name AS full_name
FROM users;
-- String functions
SELECT
UPPER(name) AS name_upper,
LOWER(email) AS email_lower,
LENGTH(name) AS name_length
FROM users;
-- Date functions (PostgreSQL)
SELECT
created_at,
EXTRACT(YEAR FROM created_at) AS year,
EXTRACT(MONTH FROM created_at) AS month,
DATE_TRUNC('day', created_at) AS day
FROM users;
SQL isn't just for reading data—you can also insert, update, and delete data. These operations are collectively known as DML (Data Manipulation Language).
-- Insert a single row
INSERT INTO users (first_name, last_name, email)
VALUES ('John', 'Doe', 'john@example.com');
-- Insert multiple rows
INSERT INTO users (first_name, last_name, email) VALUES
('Alice', 'Smith', 'alice@example.com'),
('Bob', 'Johnson', 'bob@example.com'),
('Carol', 'Williams', 'carol@example.com');
-- Insert with default values
INSERT INTO products (name, price)
VALUES ('Widget', 9.99);
-- created_at will use DEFAULT CURRENT_TIMESTAMP
-- Insert and return the inserted row
INSERT INTO users (first_name, last_name, email)
VALUES ('Dave', 'Brown', 'dave@example.com')
RETURNING id, email, created_at;
-- Insert from another table
INSERT INTO archived_users (id, name, email, archived_at)
SELECT id, name, email, CURRENT_TIMESTAMP
FROM users
WHERE status = 'inactive';
-- Update all rows
UPDATE users SET status = 'active';
-- Update with WHERE clause
UPDATE users
SET email = 'newemail@example.com'
WHERE id = 1;
-- Update multiple columns
UPDATE employees
SET
salary = salary * 1.1,
updated_at = CURRENT_TIMESTAMP
WHERE department = 'Engineering';
-- Update based on another table
UPDATE orders o
SET customer_name = u.name
FROM users u
WHERE o.user_id = u.id;
-- Update with subquery
UPDATE products p
SET price = (
SELECT AVG(price) FROM products WHERE category = p.category
)
WHERE price > (
SELECT AVG(price) FROM products WHERE category = p.category
);
-- Delete with WHERE clause
DELETE FROM users WHERE id = 1;
-- Delete multiple rows
DELETE FROM sessions
WHERE last_activity < NOW() - INTERVAL '30 days';
-- Delete all rows (careful!)
DELETE FROM log_entries;
-- Delete and return deleted rows
DELETE FROM orders
WHERE status = 'cancelled'
RETURNING id, total, deleted_at;
-- PostgreSQL UPSERT (INSERT ... ON CONFLICT)
INSERT INTO users (email, name, updated_at)
VALUES ('john@example.com', 'John Doe', CURRENT_TIMESTAMP)
ON CONFLICT (email)
DO UPDATE SET
name = EXCLUDED.name,
updated_at = EXCLUDED.updated_at;
-- MySQL UPSERT (ON DUPLICATE KEY UPDATE)
INSERT INTO users (email, name)
VALUES ('john@example.com', 'John Doe')
ON DUPLICATE KEY UPDATE
name = VALUES(name);
-- Truncate a table (faster than DELETE)
TRUNCATE TABLE logs;
-- Truncate multiple tables
TRUNCATE TABLE logs, access_logs, error_logs;
-- Restart identity (reset auto-increment)
TRUNCATE TABLE users RESTART IDENTITY;
-- CASCADE to truncate dependent tables
TRUNCATE TABLE orders CASCADE;
Good database design is crucial for application performance and data integrity. Let's learn how to create tables that scale.
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) NOT NULL UNIQUE,
email VARCHAR(255) NOT NULL UNIQUE,
password_hash VARCHAR(255) NOT NULL,
first_name VARCHAR(100),
last_name VARCHAR(100),
bio TEXT,
avatar_url VARCHAR(500),
is_active BOOLEAN DEFAULT TRUE,
role VARCHAR(20) DEFAULT 'user',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Small integer (32,768 to 32,767)
smallint
-- Standard integer (-2.1B to 2.1B)
integer
-- Large integer (-9.2Q to 9.2Q)
bigint
-- Exact decimal (precision, scale)
DECIMAL(10,2) -- 12345678.99
NUMERIC(10,2)
-- Floating point
REAL -- 6 decimal digits
DOUBLE PRECISION -- 15 decimal digits
-- Fixed length
CHAR(10) -- Always 10 chars, padded
-- Variable length with limit
VARCHAR(255) -- Up to 255 chars
-- Unlimited length
TEXT -- PostgreSQL, MySQL
-- JSON types (PostgreSQL)
JSON -- Validated JSON
JSONB -- Binary, indexed, faster
DATE -- 2024-01-15
TIME -- 14:30:00
TIMESTAMP -- 2024-01-15 14:30:00
TIMESTAMPTZ -- 2024-01-15 14:30:00-05:00 (with timezone)
INTERVAL -- 3 days, 2 hours
CREATE TABLE products (
id SERIAL PRIMARY KEY,
-- NOT NULL - Required field
name VARCHAR(200) NOT NULL,
-- UNIQUE - No duplicates
sku VARCHAR(50) UNIQUE,
-- CHECK - Custom validation
price DECIMAL(10,2) CHECK (price >= 0),
quantity INTEGER CHECK (quantity >= 0),
-- DEFAULT - Automatic value
status VARCHAR(20) DEFAULT 'active',
-- REFERENCES - Foreign key
category_id INTEGER REFERENCES categories(id),
-- Composite primary key (order_items table)
order_id INTEGER,
product_id INTEGER,
PRIMARY KEY (order_id, product_id)
);
-- Add a column
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Add a constraint
ALTER TABLE users
ADD CONSTRAINT unique_phone UNIQUE (phone);
-- Drop a column
ALTER TABLE users DROP COLUMN old_field;
-- Rename a table
ALTER TABLE users RENAME TO customers;
-- Rename a column
ALTER TABLE users RENAME COLUMN username TO handle;
-- Single column index
CREATE INDEX idx_users_email ON users(email);
-- Composite index (for multi-column queries)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
-- Unique index
CREATE UNIQUE INDEX idx_products_sku ON products(sku);
-- Partial index (only index active users)
CREATE INDEX idx_users_active_email ON users(email)
WHERE status = 'active';
-- Expression index
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
Joins are how you combine data from multiple tables. They're the heart of relational database design.
-- Sample tables
-- users: id, name
-- orders: id, user_id, total
-- INNER JOIN: Only matching rows
SELECT u.name, o.total
FROM users u
INNER JOIN orders o ON u.id = o.user_id;
Table A Table B
+----+----+ +----+----+
| id | A | | id | B |
+----+----+ +----+----+
| 1 | a1 | | 1 | b1 |
| 2 | a2 | | 3 | b3 |
| 3 | a3 | | 4 | b4 |
+----+----+ +----+----+
INNER JOIN (A.id = B.id): {1, a1, b1}
LEFT JOIN (A.id = B.id): {1, a1, b1}, {2, a2, NULL}, {3, a3, NULL}
RIGHT JOIN (A.id = B.id): {1, a1, b1}, {NULL, NULL, b3}, {NULL, NULL, b4}
FULL JOIN (A.id = B.id): {1, a1, b1}, {2, a2, NULL}, {3, a3, NULL}, {NULL, NULL, b3}, {NULL, NULL, b4}
-- Basic inner join
SELECT
o.id AS order_id,
u.name AS customer_name,
u.email,
o.total,
o.status
FROM users u
INNER JOIN orders o ON u.id = o.user_id
WHERE o.status = 'completed';
-- Multiple joins
SELECT
o.id,
u.name,
p.name AS product,
oi.quantity,
oi.price
FROM orders o
JOIN users u ON o.user_id = u.id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id;
-- Users with their orders (including users with no orders)
SELECT
u.name,
COUNT(o.id) AS order_count,
COALESCE(SUM(o.total), 0) AS total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name;
-- All categories, even if no products (rarely needed, usually flip tables)
SELECT
c.name AS category,
p.name AS product
FROM products p
RIGHT JOIN categories c ON p.category_id = c.id;
-- All users and all products they've ordered (including unmatched)
SELECT
u.name AS user,
p.name AS product
FROM users u
FULL OUTER JOIN orders o ON u.id = o.user_id
FULL OUTER JOIN order_items oi ON o.id = oi.order_id
FULL OUTER JOIN products p ON oi.product_id = p.id;
-- Employees and their managers
SELECT
e.name AS employee,
m.name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id;
-- Find employees in same department
SELECT
e1.name AS employee1,
e2.name AS employee2,
e1.department
FROM employees e1
JOIN employees e2 ON e1.department = e2.department AND e1.id < e2.id;
-- ANSI SQL syntax (preferred)
SELECT * FROM a JOIN b ON a.id = b.a_id
-- Implicit join (older syntax, avoid)
SELECT * FROM a, b WHERE a.id = b.a_id
-- NATURAL JOIN (use with caution - joins on same-named columns)
SELECT * FROM orders NATURAL JOIN order_items
Aggregation functions transform sets of rows into single values. They're essential for analytics and reporting.
-- COUNT - Number of rows
SELECT COUNT(*) FROM orders; -- All rows
SELECT COUNT(DISTINCT user_id) FROM orders; -- Unique users
-- SUM - Total of values
SELECT SUM(total) FROM orders;
-- AVG - Average
SELECT AVG(price) FROM products;
-- MIN/MAX - Smallest/Largest
SELECT MIN(created_at), MAX(created_at) FROM users;
-- Group by single column
SELECT
category,
COUNT(*) AS product_count,
AVG(price) AS avg_price
FROM products
GROUP BY category;
-- Group by multiple columns
SELECT
EXTRACT(YEAR FROM created_at) AS year,
EXTRACT(MONTH FROM created_at) AS month,
COUNT(*) AS order_count,
SUM(total) AS revenue
FROM orders
GROUP BY
EXTRACT(YEAR FROM created_at),
EXTRACT(MONTH FROM created_at)
ORDER BY year, month;
-- WHERE filters rows, HAVING filters groups
SELECT
user_id,
COUNT(*) AS order_count,
SUM(total) AS total_spent
FROM orders
GROUP BY user_id
HAVING COUNT(*) >= 5 -- Users with 5+ orders
AND SUM(total) > 1000; -- And $1000+ spent
-- Common pattern: categories with more than 10 products
SELECT
category_id,
COUNT(*) AS product_count
FROM products
GROUP BY category_id
HAVING COUNT(*) > 10;
-- Conditional aggregation
SELECT
COUNT(*) AS total_users,
COUNT(CASE WHEN status = 'active' THEN 1 END) AS active_users,
COUNT(CASE WHEN is_premium THEN 1 END) AS premium_users,
AVG(CASE WHEN is_premium THEN 1.0 ELSE 0.0 END) AS premium_percentage
FROM users;
-- Pivot-like aggregation
SELECT
COUNT(CASE WHEN status = 'pending' THEN 1 END) AS pending,
COUNT(CASE WHEN status = 'processing' THEN 1 END) AS processing,
COUNT(CASE WHEN status = 'shipped' THEN 1 END) AS shipped,
COUNT(CASE WHEN status = 'delivered' THEN 1 END) AS delivered
FROM orders;
-- PostgreSQL STRING_AGG
SELECT
department,
STRING_AGG(name, ', ' ORDER BY salary DESC) AS employees
FROM employees
GROUP BY department;
-- MySQL GROUP_CONCAT
SELECT
department,
GROUP_CONCAT(name ORDER BY salary DESC SEPARATOR ', ') AS employees
FROM employees
GROUP BY department;
-- Running total
SELECT
created_at,
amount,
SUM(amount) OVER (ORDER BY created_at) AS running_total
FROM transactions;
-- Percentage of total
SELECT
category,
sales,
sales * 100.0 / SUM(sales) OVER () AS percentage_of_total
FROM category_sales;
-- Rank within group
SELECT
name,
department,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank
FROM employees;
Subqueries are queries nested inside other queries. They let you use the results of one query in another.
-- Subquery in SELECT
SELECT
name,
price,
(SELECT AVG(price) FROM products) AS avg_price
FROM products;
-- Subquery in WHERE
SELECT * FROM products
WHERE price > (SELECT AVG(price) FROM products);
-- Subquery in FROM (derived table)
SELECT category, avg_price
FROM (
SELECT
category,
AVG(price) AS avg_price
FROM products
GROUP BY category
) AS category_avg
WHERE avg_price > 50;
-- Find users who have placed orders
SELECT * FROM users
WHERE id IN (SELECT user_id FROM orders);
-- Find users who have NOT placed orders
SELECT * FROM users
WHERE id NOT IN (SELECT user_id FROM orders);
-- Find products ordered by premium users
SELECT * FROM products
WHERE id IN (
SELECT product_id FROM order_items
WHERE order_id IN (
SELECT id FROM orders
WHERE user_id IN (
SELECT id FROM users WHERE is_premium = true
)
)
);
-- Users who have placed at least one order
SELECT * FROM users u
WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.user_id = u.id
);
-- Categories with no products
SELECT * FROM categories c
WHERE NOT EXISTS (
SELECT 1 FROM products p WHERE p.category_id = c.id
);
-- Products with price above their category average
SELECT * FROM products p
WHERE price > (
SELECT AVG(price)
FROM products
WHERE category_id = p.category_id
);
-- Each user's latest order
SELECT * FROM orders o
WHERE created_at = (
SELECT MAX(created_at)
FROM orders
WHERE user_id = o.user_id
);
-- Employees earning more than their department average
SELECT * FROM employees e
WHERE salary > (
SELECT AVG(salary)
FROM employees
WHERE department = e.department
);
-- CTE (PostgreSQL, SQL Server, MySQL 8+)
WITH
active_orders AS (
SELECT * FROM orders WHERE status = 'active'
),
premium_users AS (
SELECT id FROM users WHERE is_premium = true
)
SELECT
u.name,
COUNT(o.id) AS order_count
FROM premium_users u
LEFT JOIN active_orders o ON u.id = o.user_id
GROUP BY u.id, u.name;
-- Recursive CTE (for hierarchical data)
WITH RECURSIVE org_chart AS (
-- Base case: top-level employees
SELECT id, name, manager_id, 1 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: employees who report to someone in the org
SELECT e.id, e.name, e.manager_id, oc.level + 1
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT * FROM org_chart;
-- PostgreSQL LATERAL - subquery can reference previous tables
SELECT
p.name,
p.price,
o.order_count
FROM products p
CROSS JOIN LATERAL (
SELECT COUNT(*) AS order_count
FROM order_items
WHERE product_id = p.id
) AS o
ORDER BY o.order_count DESC;
Indexes are the key to database performance. They allow the database to find data quickly without scanning every row.
Think of an index like a book index. Instead of reading every page to find a topic, you check the index and go directly to the right page. However, indexes have a cost—they take space and slow down writes (INSERT, UPDATE, DELETE).
-- Single column index
CREATE INDEX idx_users_email ON users(email);
-- Composite index (left-to-right order matters!)
CREATE INDEX idx_orders_user_status
ON orders(user_id, status);
-- Unique index
CREATE UNIQUE INDEX idx_products_sku
ON products(sku);
-- Partial index
CREATE INDEX idx_orders_active
ON orders(user_id)
WHERE status = 'active';
-- Expression index
CREATE INDEX idx_users_lower_email
ON users(LOWER(email));
-- Most common, good for equality and range queries
CREATE INDEX idx_orders_date ON orders(created_at);
CREATE INDEX idx_products_price ON products(price);
-- Only for equality comparisons
CREATE INDEX idx_users_email_hash ON users USING HASH(email);
-- For JSON, arrays, full-text search
CREATE INDEX idx_products_tags ON products USING GIN(tags);
CREATE INDEX idx_products_attributes ON products USING GIN(attributes);
-- For very large, naturally ordered data
CREATE INDEX idx_logs_time ON logs USING BRIN(created_at);
-- EXPLAIN shows the query plan
EXPLAIN SELECT * FROM users WHERE email = 'john@example.com';
-- ANALYZE actually runs the query and shows timing
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'john@example.com';
-- Example output:
-- Index Scan using idx_users_email on users
-- (cost=0.43..8.45 rows=1 width=104)
-- (actual time=0.021..0.022 rows=1 loops=1)
-- Index Cond: (email = 'john@example.com'::text)
-- Good: Composite for specific query pattern
CREATE INDEX idx_orders_user_status
ON orders(user_id, status);
-- For queries like:
-- WHERE user_id = 5 AND status = 'active'
-- WHERE user_id = 5
-- (can't use second column alone!)
-- Reindex (rebuild fragmented index)
REINDEX INDEX idx_users_email;
-- Check index usage (PostgreSQL)
SELECT
indexrelname,
idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'users';
-- Remove unused indexes
DROP INDEX idx_users_unused;
Transactions ensure data integrity when multiple operations must succeed or fail together. They're essential for financial transactions, inventory management, and any operation where consistency matters.
-- Start transaction
BEGIN;
-- Multiple operations
UPDATE accounts SET balance = balance - 100 WHERE user_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE user_id = 2;
-- Commit (make permanent)
COMMIT;
-- Or rollback (undo everything)
ROLLBACK;
BEGIN;
INSERT INTO orders (user_id, total) VALUES (1, 100);
SAVEPOINT order_inserted;
-- Something fails, but we want to keep the order
ROLLBACK TO SAVEPOINT order_inserted;
-- Continue with other operations
INSERT INTO log (message) VALUES ('Order processing');
COMMIT;
-- Set isolation level
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Or at start of transaction
BEGIN ISOLATION LEVEL REPEATABLE READ;
| Level | Dirty Read | Non-repeatable | Phantom |
|---|---|---|---|
| READ UNCOMMITTED | Possible | Possible | Possible |
| READ COMMITTED | Not | Possible | Possible |
| REPEATABLE READ | Not | Not | Possible |
| SERIALIZABLE | Not | Not | Not |
-- Two transactions waiting for each other
-- Transaction 1: UPDATE accounts WHERE id = 1
-- Transaction 2: UPDATE accounts WHERE id = 2
-- Transaction 1: UPDATE accounts WHERE id = 2 (waits)
-- Transaction 2: UPDATE accounts WHERE id = 1 (DEADLOCK)
-- Solution: Always access tables in same order
BEGIN;
UPDATE accounts WHERE id = 1;
UPDATE accounts WHERE id = 2;
COMMIT;
-- Optimistic: Let it fail if conflict
UPDATE products
SET quantity = quantity - 1, version = version + 1
WHERE id = 1 AND version = 5;
-- If version changed, another transaction modified it - retry or error
-- Pessimistic: Lock before modifying
BEGIN;
SELECT * FROM products WHERE id = 1 FOR UPDATE;
-- ... validate ...
UPDATE products SET quantity = quantity - 1 WHERE id = 1;
COMMIT;
Window functions perform calculations across sets of rows related to the current row. Unlike aggregate functions, they don't collapse rows—you get results for each individual row.
-- Without window function (aggregate collapses rows)
SELECT department, AVG(salary) FROM employees GROUP BY department;
-- With window function (each row keeps its value)
SELECT
name,
department,
salary,
AVG(salary) OVER (PARTITION BY department) AS dept_avg
FROM employees;
-- ROW_NUMBER: Unique sequential number
SELECT
name,
salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS rank
FROM employees;
-- RANK: Same value = same rank, with gaps
SELECT
name,
salary,
RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees;
-- DENSE_RANK: Same value = same rank, no gaps
SELECT
name,
salary,
DENSE_RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees;
-- Rank within department
SELECT
name,
department,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
RANK() OVER (ORDER BY salary DESC) AS overall_rank
FROM employees;
-- Previous row (LAG)
SELECT
created_at,
price,
LAG(price) OVER (ORDER BY created_at) AS prev_price,
price - LAG(price) OVER (ORDER BY created_at) AS price_change
FROM stock_prices;
-- Next row (LEAD)
SELECT
name,
salary,
LEAD(salary) OVER (ORDER BY salary DESC) AS next_higher_salary
FROM employees;
-- Multiple rows back/forward
SELECT
created_at,
LAG(created_at, 7) OVER (ORDER BY created_at) AS week_ago
FROM events;
-- Running total
SELECT
date,
amount,
SUM(amount) OVER (ORDER BY date) AS running_total
FROM transactions;
-- Moving average (last 7 days)
SELECT
date,
price,
AVG(price) OVER (
ORDER BY date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_avg_7
FROM stock_prices;
-- Frame specification
-- ROWS: Physical rows
-- RANGE: Logical rows with same value
-- GROUPS: Groups of same values
-- First value in partition
SELECT
name,
department,
salary,
FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary) AS lowest_in_dept,
LAST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary) AS highest_in_dept
FROM employees;
-- Percent rank within group
SELECT
salary,
PERCENT_RANK() OVER (ORDER BY salary) AS pct_rank
FROM employees;
-- Median (50th percentile)
SELECT
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median
FROM employees;
Views and functions let you encapsulate complex logic, improve security, and create reusable database components.
-- Simple view
CREATE VIEW active_users AS
SELECT * FROM users WHERE is_active = true;
-- Use view like a table
SELECT * FROM active_users;
-- Complex view with joins
CREATE VIEW order_summary AS
SELECT
o.id AS order_id,
u.name AS customer_name,
u.email,
o.total,
o.status,
o.created_at,
COUNT(oi.id) AS item_count
FROM orders o
JOIN users u ON o.user_id = u.id
JOIN order_items oi ON o.id = oi.order_id
GROUP BY o.id, u.name, u.email, o.total, o.status, o.created_at;
-- Updatable view (simple conditions)
CREATE VIEW simple_users AS
SELECT id, name, email FROM users;
INSERT INTO simple_users (name, email) VALUES ('John', 'john@example.com');
-- Materialized view (stores results physically)
CREATE MATERIALIZED VIEW monthly_sales AS
SELECT
EXTRACT(YEAR FROM created_at) AS year,
EXTRACT(MONTH FROM created_at) AS month,
SUM(total) AS revenue,
COUNT(*) AS order_count
FROM orders
GROUP BY
EXTRACT(YEAR FROM created_at),
EXTRACT(MONTH FROM created_at)
ORDER BY year, month;
-- Refresh materialized view
REFRESH MATERIALIZED VIEW monthly_sales;
-- Auto-refresh on schedule (PostgreSQL)
CREATE MATERIALIZED VIEW monthly_sales AS
SELECT ...
WITH DATA;
-- PostgreSQL function
CREATE OR REPLACE FUNCTION get_user_orders(user_id INT)
RETURNS TABLE (
order_id INT,
total DECIMAL(10,2),
created_at TIMESTAMP
) AS $$
BEGIN
RETURN QUERY
SELECT o.id, o.total, o.created_at
FROM orders o
WHERE o.user_id = get_user_orders.user_id
ORDER BY o.created_at DESC;
END;
$$ LANGUAGE plpgsql;
-- Use function
SELECT * FROM get_user_orders(1);
-- MySQL stored function
DELIMITER //
CREATE FUNCTION get_user_email(user_id INT)
RETURNS VARCHAR(255)
DETERMINISTIC
BEGIN
DECLARE email VARCHAR(255);
SELECT e.email INTO email
FROM users u
WHERE u.id = user_id;
RETURN email;
END //
DELIMITER ;
-- PostgreSQL procedure
CREATE OR REPLACE PROCEDURE transfer_funds(
from_user INT,
to_user INT,
amount DECIMAL(10,2)
) AS $$
BEGIN
UPDATE accounts
SET balance = balance - amount
WHERE user_id = from_user;
UPDATE accounts
SET balance = balance + amount
WHERE user_id = to_user;
INSERT INTO transactions (from_user, to_user, amount)
VALUES (from_user, to_user, amount);
END;
$$ LANGUAGE plpgsql;
-- Call procedure
CALL transfer_funds(1, 2, 100.00);
-- Trigger to update timestamp
CREATE OR REPLACE FUNCTION update_updated_at()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = CURRENT_TIMESTAMP;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER users_updated_at
BEFORE UPDATE ON users
FOR EACH ROW
EXECUTE FUNCTION update_updated_at();
-- Audit trail trigger
CREATE OR REPLACE FUNCTION audit_user_changes()
RETURNS TRIGGER AS $$
BEGIN
INSERT INTO user_audit (action, old_data, new_data, changed_at)
VALUES (
CASE
WHEN TG_OP = 'INSERT' THEN 'INSERT'
WHEN TG_OP = 'UPDATE' THEN 'UPDATE'
WHEN TG_OP = 'DELETE' THEN 'DELETE'
END,
CASE WHEN TG_OP != 'INSERT' THEN OLD ELSE NULL END,
CASE WHEN TG_OP != 'DELETE' THEN NEW ELSE NULL END,
CURRENT_TIMESTAMP
);
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
Running SQL in production requires careful attention to security, backup, performance, and scaling.
-- Polymorphic association (for different entity types)
CREATE TABLE comments (
id SERIAL PRIMARY KEY,
commentable_type VARCHAR(50), -- 'post', 'product', etc.
commentable_id INTEGER,
body TEXT
);
-- JSONB for semi-structured data (PostgreSQL)
CREATE TABLE events (
id SERIAL PRIMARY KEY,
type VARCHAR(50),
payload JSONB,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Index on JSON field
CREATE INDEX idx_events_type ON events((payload->>'type'));
-- Use parameterized queries (prevents SQL injection)
-- In application code:
-- cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
-- Principle of least privilege
-- Create read-only user
CREATE USER reader WITH PASSWORD 'secure_password';
GRANT CONNECT ON DATABASE myapp TO reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO reader;
-- Row-level security (PostgreSQL)
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_orders ON orders
FOR SELECT
USING (user_id = current_user_id());
-- PostgreSQL backup
pg_dump -U postgres myapp > backup.sql
-- Restore
psql -U postgres myapp < backup.sql
-- Point-in-time recovery requires WAL archiving
-- In postgresql.conf:
-- wal_level = replica
-- archive_mode = on
-- archive_command = 'cp %p /backup/wal/%f'
-- PgBouncer configuration
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
-- On primary (postgresql.conf)
wal_level = replica
max_wal_senders = 3
-- On replica (recovery.conf)
primary_conninfo = 'host=primary port=5432 user=repl'
restore_command = 'cp /wal/%f %p'
Congratulations on completing this guide! You've learned:
Continue your journey with: