Operations
Scope
Cluster deployment, topology patterns, range management, backup/recovery, and monitoring.
Deployment Patterns
Cluster Topology
| Pattern |
Min Nodes |
Survival Goal |
Use Case |
| Single-region, multi-zone |
3 |
Zone failure |
Standard HA |
| Multi-region |
9 (3 per region) |
Region failure |
Global distribution |
| Global tables |
9+ |
Region failure + low reads |
Reference data everywhere |
# Start a 3-node cluster
cockroach start --insecure --store=node1 --listen-addr=localhost:26257 --http-addr=localhost:8080 --join=localhost:26257,localhost:26258,localhost:26259
cockroach init --insecure --host=localhost:26257
| Parameter |
Default |
Tuned |
Impact |
kv.range.max_bytes |
512MiB |
128MiB-512MiB |
Smaller = more parallelism |
sql.defaults.distsql |
auto |
auto |
Distributed SQL execution |
kv.snapshot_rebalance.max_rate |
32MiB/s |
64MiB/s |
Faster rebalancing |
server.time_until_store_dead |
5m |
5m |
Dead node detection |
-- Check range distribution
SELECT range_id, replicas, lease_holder FROM crdb_internal.ranges WHERE table_name = 'orders';
-- Hotspot detection
SELECT * FROM crdb_internal.node_statement_statistics ORDER BY count DESC LIMIT 10;
Backup & Recovery
-- Full cluster backup to cloud storage
BACKUP INTO 's3://bucket/backup?AUTH=implicit' AS OF SYSTEM TIME '-10s';
-- Incremental backup
BACKUP INTO LATEST IN 's3://bucket/backup?AUTH=implicit';
-- Restore
RESTORE FROM LATEST IN 's3://bucket/backup?AUTH=implicit';
Common Issues
| Issue |
Diagnosis |
Fix |
| Range under-replicated |
DB Console > Replication |
Add nodes, check disk space |
| High query latency |
EXPLAIN ANALYZE |
Add indexes, check network |
| Clock skew |
cockroach node status |
Configure NTP, max-offset |
| Node decommission stuck |
cockroach node status --decommission |
Check range rebalancing progress |
Commands & Recipes
Cluster Setup
# Start single node (dev)
cockroach start-single-node --insecure --store=node1
# Start multi-node cluster
cockroach start --insecure --store=node1 --listen-addr=:26257 --join=node1:26257,node2:26257,node3:26257
cockroach start --insecure --store=node2 --listen-addr=:26258 --join=node1:26257,node2:26257,node3:26257
cockroach start --insecure --store=node3 --listen-addr=:26259 --join=node1:26257,node2:26257,node3:26257
# Initialize cluster
cockroach init --insecure --host=node1:26257
# SQL shell (uses PG wire protocol)
cockroach sql --insecure --host=node1:26257
# or: psql "postgresql://root@node1:26257/defaultdb?sslmode=disable"
Geo-Partitioning
-- Create multi-region database
ALTER DATABASE mydb SET PRIMARY REGION = 'us-east1';
ALTER DATABASE mydb ADD REGION 'eu-west1';
ALTER DATABASE mydb ADD REGION 'ap-southeast1';
-- Pin table data to region (GDPR compliance)
ALTER TABLE users SET LOCALITY REGIONAL BY ROW;
-- Each row's crdb_region column determines where it lives
Operations
# Cluster status
cockroach node status --insecure
# Decommission node (safe removal)
cockroach node decommission <node-id> --insecure
# Backup
cockroach sql -e "BACKUP DATABASE mydb INTO 's3://bucket/backup?AUTH=implicit'"
# Restore
cockroach sql -e "RESTORE DATABASE mydb FROM LATEST IN 's3://bucket/backup?AUTH=implicit'"
# CDC to Kafka
cockroach sql -e "CREATE CHANGEFEED FOR TABLE orders INTO 'kafka://broker:9092' WITH format=json;"
Sources