PostgreSQL Performance Guide — Indexes, EXPLAIN, and Query Optimization

Understanding Query Performance

PostgreSQL's query planner generates an execution plan for every query, and reading those plans is the key to optimization. EXPLAIN ANALYZE is your most important diagnostic tool: it actually runs the query and reports the chosen plan alongside real timings and row counts.

EXPLAIN ANALYZE

EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2025-01-01'
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 10;

Reading the Output

Seq Scan — full table scan, no index used (often slow for large tables)
Index Scan — uses an index to find rows
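Index Only Scan — satisfied from the index without visiting the table (usually fastest; relies on an up-to-date visibility map, so vacuum regularly)
Nested Loop, Hash Join, Merge Join — join strategies
actual time — measured execution time in milliseconds, shown as startup time (first row) and total time (last row)
rows — actual vs. estimated row counts (a large discrepancy often points to stale statistics)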
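
Because EXPLAIN ANALYZE executes the statement, a data-modifying query is usually wrapped in a transaction that is rolled back afterwards. The sketch below assumes the orders table used elsewhere in this guide; the 'cancelled' status value and the 90-day cutoff are hypothetical. The BUFFERS option adds buffer hit and read counts to each plan node, which helps distinguish cached reads from disk reads.

```sql
-- Inspect the plan of an UPDATE without keeping its changes.
BEGIN;

EXPLAIN (ANALYZE, BUFFERS)
UPDATE orders
SET status = 'cancelled'              -- hypothetical status value
WHERE status = 'pending'
  AND created_at < NOW() - INTERVAL '90 days';

ROLLBACK;  -- discard the modifications made while measuring
```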

Indexes

When to Add an Index

-- Index for WHERE clause columns
CREATE INDEX idx_users_created_at ON users(created_at);

-- Composite index for multi-column WHERE
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- Partial index (index only a subset of rows)
CREATE INDEX idx_orders_pending ON orders(created_at)
WHERE status = 'pending';

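-- Trigram index for LIKE / ILIKE queries, including leading-wildcard patterns such as '%example%'
-- The pg_trgm extension must exist before the index is created
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_users_email_text ON users USING gin(email gin_trgm_ops);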
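
To make the composite and partial indexes above more concrete, here is a rough sketch of which queries they can serve. It assumes the orders table has an id column (as in the EXPLAIN ANALYZE example); the literal values 42 and 'shipped' are hypothetical. Whether the planner actually picks an index still depends on table size and statistics.

```sql
-- Can use idx_orders_user_status: both queries filter on the leading column (user_id).
SELECT id FROM orders WHERE user_id = 42 AND status = 'pending';
SELECT id FROM orders WHERE user_id = 42;

-- Usually cannot use idx_orders_user_status efficiently: the leading column is missing.
SELECT id FROM orders WHERE status = 'shipped';

-- Can use the partial index idx_orders_pending: the WHERE clause
-- implies the index predicate (status = 'pending').
SELECT id FROM orders
WHERE status = 'pending'
  AND created_at > NOW() - INTERVAL '7 days';
```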

Index Types

B-tree (default) — equality and range queries, ORDER BY
GIN — full-text search, JSONB, arrays
GiST — geometric data, range types
BRIN — very large tables with naturally ordered data (timestamps)
Hash — equality only (rarely needed)
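
As an illustration of two of the non-default types, the sketch below uses a hypothetical append-only events table (not part of the schema used elsewhere in this guide) with a timestamp column and a JSONB payload. The table, index names, and the "signup" value are assumptions made for the example.

```sql
-- Hypothetical table: rows are appended in roughly chronological order.
CREATE TABLE events (
    id         bigserial PRIMARY KEY,
    created_at timestamptz NOT NULL DEFAULT now(),
    payload    jsonb NOT NULL
);

-- BRIN: very small index, effective when created_at correlates with physical row order.
CREATE INDEX idx_events_created_brin ON events USING brin (created_at);

-- GIN: supports JSONB containment queries with the @> operator.
CREATE INDEX idx_events_payload_gin ON events USING gin (payload);

-- Query that can use the BRIN index (range filter on the ordered column).
SELECT count(*) FROM events
WHERE created_at >= now() - INTERVAL '1 day';

-- Query that can use the GIN index (containment test on the payload).
SELECT id FROM events
WHERE payload @> '{"type": "signup"}';
```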

Query Optimization Patterns

Avoid SELECT *

-- Bad: fetches every column, which rules out an Index Only Scan unless the index covers them all
SELECT * FROM users WHERE email = 'user@example.com';

-- Good: fetch only needed columns
SELECT id, name, email FROM users WHERE email = 'user@example.com';
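
One way to make the narrower SELECT eligible for an Index Only Scan is a covering index. This is a minimal sketch assuming PostgreSQL 11 or later (for the INCLUDE clause); the index name and the email literal are hypothetical. The planner only chooses an Index Only Scan when the visibility map is reasonably current, so the plan may differ on a table that has not been vacuumed recently.

```sql
-- email is the search key; id and name are stored in the index as payload columns.
CREATE INDEX idx_users_email_covering ON users (email) INCLUDE (id, name);

-- Check whether the plan now shows "Index Only Scan".
EXPLAIN
SELECT id, name, email FROM users WHERE email = 'user@example.com';
```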

Use CTEs for Readability (Not Performance)

-- CTE (easier to read)
WITH recent_orders AS (
  SELECT user_id, SUM(amount) as total
  FROM orders
  WHERE created_at > NOW() - INTERVAL '30 days'
  GROUP BY user_id
)
SELECT u.name, ro.total
FROM users u
JOIN recent_orders ro ON ro.user_id = u.id
WHERE ro.total > 1000;
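
Since PostgreSQL 12, a non-recursive, side-effect-free CTE that is referenced only once is normally inlined into the outer query, so it tends to perform like a subquery. The sketch below reuses the query above and shows how the MATERIALIZED keyword forces the older behaviour of computing the CTE once as a separate step; NOT MATERIALIZED requests inlining instead. Comparing the two under EXPLAIN ANALYZE is a reasonable way to see whether it matters for a given query.

```sql
-- Force the CTE to be computed once and stored (PostgreSQL 12+ syntax).
WITH recent_orders AS MATERIALIZED (
  SELECT user_id, SUM(amount) AS total
  FROM orders
  WHERE created_at > NOW() - INTERVAL '30 days'
  GROUP BY user_id
)
SELECT u.name, ro.total
FROM users u
JOIN recent_orders ro ON ro.user_id = u.id
WHERE ro.total > 1000;
```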

Update Statistics

ANALYZE users;           -- update stats for one table
ANALYZE;                  -- update stats for every table in the current database
VACUUM ANALYZE users;     -- reclaim space and update stats
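
To judge whether statistics are likely to be stale before running ANALYZE by hand, the pg_stat_user_tables view records when each table was last analyzed and how many rows have changed since. A small sketch; the LIMIT of 20 is an arbitrary choice.

```sql
-- Tables with the most row changes since their last ANALYZE come first.
SELECT
    relname,
    last_analyze,          -- last manual ANALYZE
    last_autoanalyze,      -- last ANALYZE run by autovacuum
    n_mod_since_analyze    -- rows inserted/updated/deleted since then
FROM pg_stat_user_tables
ORDER BY n_mod_since_analyze DESC
LIMIT 20;
```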

Finding Slow Queries

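-- Enable pg_stat_statements
-- Requires shared_preload_libraries = 'pg_stat_statements' in postgresql.conf and a server restart
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Find slowest queries by mean execution time
-- (mean_exec_time / total_exec_time exist in PostgreSQL 13+; older versions use mean_time / total_time)
-- round() is used instead of a numeric(10,2) cast, which overflows on large totals
SELECT
    query,
    calls,
    round(mean_exec_time::numeric, 2) AS mean_ms,
    round(total_exec_time::numeric, 2) AS total_ms
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;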
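
Mean time highlights individually slow statements, but a fast query called millions of times can cost more overall. As a complementary sketch (assuming PostgreSQL 13+ column names), the query below ranks statements by their share of total execution time. The pct_of_total column name is made up for this example.

```sql
-- Rank statements by cumulative time and show each one's share of the total.
SELECT
    query,
    calls,
    round(total_exec_time::numeric, 2) AS total_ms,
    round(
        (100 * total_exec_time
             / NULLIF(sum(total_exec_time) OVER (), 0))::numeric,
        1
    ) AS pct_of_total
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Optionally clear the counters to start a fresh measurement window.
SELECT pg_stat_statements_reset();
```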

Connection Pooling

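PostgreSQL creates a new backend process for each connection, so every open connection costs memory and scheduling overhead. For applications with many short-lived connections, put a pooler such as PgBouncer or pgpool-II in front of the database. As a rough starting point in production, target 50-100 actual connections to PostgreSQL and let the pooler multiplex thousands of application connections onto them; the right number depends on your hardware and workload.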
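
Before sizing a pooler, it can help to see how many connections the server allows and how many are actually doing work. A minimal sketch, assuming PostgreSQL 10 or later for the backend_type column:

```sql
-- Configured connection limit.
SHOW max_connections;

-- Current client connections grouped by state (active, idle, idle in transaction, ...).
SELECT state, count(*) AS connections
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY state
ORDER BY connections DESC;
```

A large number of idle connections relative to active ones is a common sign that a pooler would help.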

Frequently Asked Questions

When is a sequential scan better than an index?

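For small tables, or for queries that return more than roughly 5-10% of a table's rows, a sequential scan is often faster than an index scan: it reads pages in order instead of jumping between the index and scattered table pages. PostgreSQL's planner typically makes this decision correctly, provided its statistics are up to date.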
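
To check the planner's choice for a specific query, one option is to discourage sequential scans for a single transaction and compare the timings with the normal plan. The sketch below reuses the users table and the created_at filter from earlier in this guide. Note that enable_seqscan = off does not forbid sequential scans; it only makes the planner treat them as very expensive. It is meant for diagnosis, not as a production setting.

```sql
-- Baseline: let the planner choose freely.
EXPLAIN ANALYZE
SELECT id, name FROM users WHERE created_at > '2025-01-01';

-- Comparison: penalize sequential scans for this transaction only.
BEGIN;
SET LOCAL enable_seqscan = off;

EXPLAIN ANALYZE
SELECT id, name FROM users WHERE created_at > '2025-01-01';

ROLLBACK;  -- SET LOCAL is undone when the transaction ends
```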

How do I find missing indexes?

Look for Seq Scan nodes on large tables in EXPLAIN ANALYZE output. Also query pg_stat_user_tables for tables with high seq_scan counts relative to idx_scan.
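
The pg_stat_user_tables check can be sketched as follows. The 10,000-row threshold and the LIMIT are arbitrary assumptions for filtering out small tables, where sequential scans are expected and harmless.

```sql
-- Larger tables that are read sequentially more often than via an index.
SELECT
    relname,
    seq_scan,
    COALESCE(idx_scan, 0) AS idx_scan,   -- NULL when the table has no indexes
    seq_tup_read,
    n_live_tup
FROM pg_stat_user_tables
WHERE seq_scan > COALESCE(idx_scan, 0)
  AND n_live_tup > 10000                 -- assumed cutoff for "large"
ORDER BY seq_tup_read DESC
LIMIT 20;
```

Tables near the top of this list are candidates for a closer look with EXPLAIN ANALYZE on the queries that touch them.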
