Operations¶
Scope
Production deployment, replication, backup/recovery, performance tuning, and monitoring for PostgreSQL.
Deployment Patterns¶
Replication Topologies¶
| Pattern | Consistency | Failover | Use Case |
|---|---|---|---|
| Streaming (async) | Eventual | Manual/automatic | Standard HA |
| Streaming (sync) | Strong | Automatic | Zero data loss |
| Logical replication | Eventual | Manual | Cross-version, selective |
| Patroni + etcd | Strong | Automatic | Enterprise K8s deployments |
| PgBouncer + HAProxy | N/A (proxy) | Automatic | Connection pooling |
Connection Pooling¶
# PgBouncer recommended settings
[pgbouncer]
pool_mode = transaction # Best for most workloads
max_client_conn = 1000
default_pool_size = 25
reserve_pool_size = 5
reserve_pool_timeout = 3
server_lifetime = 3600
server_idle_timeout = 600
Connection Limits
PostgreSQL defaults to max_connections = 100. Each connection consumes ~10MB of memory. Use PgBouncer to multiplex thousands of application connections.
Performance Tuning¶
Memory Configuration¶
| Parameter | Formula | Example (32Gi RAM) |
|---|---|---|
shared_buffers |
25% of RAM | 8GB |
effective_cache_size |
75% of RAM | 24GB |
work_mem |
RAM / max_connections / 4 | 32MB |
maintenance_work_mem |
RAM / 16 | 2GB |
wal_buffers |
3% of shared_buffers | 256MB (cap at 256MB) |
WAL & Checkpoint Tuning¶
-- Production WAL settings
ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET max_wal_size = '4GB';
ALTER SYSTEM SET min_wal_size = '1GB';
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
ALTER SYSTEM SET checkpoint_timeout = '15min';
ALTER SYSTEM SET wal_compression = 'zstd'; -- PG 15+
Query Performance¶
-- Enable query statistics
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Find slow queries
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
-- Identify missing indexes
SELECT schemaname, tablename, seq_scan, seq_tup_read,
idx_scan, idx_tup_fetch
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan AND seq_tup_read > 10000
ORDER BY seq_tup_read DESC;
Backup & Recovery¶
pg_basebackup (Physical)¶
# Full base backup
pg_basebackup -h primary -U replication -D /backup/base \
--checkpoint=fast --wal-method=stream -z
# Point-in-time recovery
restore_command = 'cp /archive/%f %p'
recovery_target_time = '2026-04-12 10:00:00'
pgBackRest (Recommended)¶
# Full backup
pgbackrest --stanza=main --type=full backup
# Incremental backup
pgbackrest --stanza=main --type=incr backup
# Restore to point-in-time
pgbackrest --stanza=main --type=time \
--target="2026-04-12 10:00:00+00" restore
Monitoring¶
Key Metrics¶
-- Active connections vs limit
SELECT count(*), (SELECT setting FROM pg_settings WHERE name='max_connections')
FROM pg_stat_activity;
-- Replication lag (on replica)
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
-- Cache hit ratio (should be > 99%)
SELECT sum(blks_hit) / (sum(blks_hit) + sum(blks_read)) AS cache_hit_ratio
FROM pg_stat_database;
-- Transaction rate
SELECT xact_commit + xact_rollback AS total_txn FROM pg_stat_database WHERE datname = current_database();
Common Issues¶
| Issue | Diagnosis | Fix |
|---|---|---|
| Connection exhausted | SELECT count(*) FROM pg_stat_activity |
Use PgBouncer, increase max_connections |
| Bloated tables | SELECT pg_size_pretty(pg_table_size('table')) |
VACUUM FULL or pg_repack |
| Slow queries | pg_stat_statements |
Add indexes, rewrite query |
| WAL disk full | SELECT pg_current_wal_lsn() |
Archive WAL, increase max_wal_size |
| Replication lag | pg_stat_replication |
Check network, increase wal_sender_timeout |
Commands & Recipes¶
Connection & Basics¶
# Connect
psql -h localhost -U postgres -d mydb
# Create database
createdb mydb
# Import SQL
psql -d mydb -f schema.sql
# Dump
pg_dump mydb > backup.sql
pg_dump -Fc mydb > backup.custom # compressed
Essential Queries¶
-- Database size
SELECT pg_size_pretty(pg_database_size('mydb'));
-- Table sizes
SELECT relname, pg_size_pretty(pg_total_relation_size(oid))
FROM pg_class WHERE relkind = 'r'
ORDER BY pg_total_relation_size(oid) DESC LIMIT 10;
-- Active connections
SELECT pid, usename, application_name, state, query
FROM pg_stat_activity WHERE state = 'active';
-- Kill long query
SELECT pg_terminate_backend(pid);
-- Index usage
SELECT relname, idx_scan, seq_scan,
ROUND(100.0 * idx_scan / NULLIF(idx_scan + seq_scan, 0), 1) AS idx_pct
FROM pg_stat_user_tables ORDER BY seq_scan DESC;
PostgreSQL 18 Features¶
-- UUIDv7 (time-ordered, no index bloat)
CREATE TABLE events (
id uuid DEFAULT uuidv7() PRIMARY KEY,
data jsonb NOT NULL,
created_at timestamptz DEFAULT now()
);
-- Virtual generated columns (computed, not stored)
ALTER TABLE products ADD COLUMN price_with_tax numeric
GENERATED ALWAYS AS (price * 1.1) VIRTUAL;
Replication¶
# Set up streaming replication (primary)
# postgresql.conf
wal_level = replica
max_wal_senders = 5
max_replication_slots = 5
# On replica
pg_basebackup -h primary -D /var/lib/postgresql/18/main -R -P
Performance Tuning¶
# Essential postgresql.conf settings
shared_buffers = '4GB' # 25% of RAM
effective_cache_size = '12GB' # 75% of RAM
work_mem = '64MB'
maintenance_work_mem = '1GB'
max_wal_size = '4GB'
random_page_cost = 1.1 # SSD
effective_io_concurrency = 200 # SSD/NVMe