Operations
Deployment & Typical Setup
Single-Node (Simplest Production Path)
# VictoriaMetrics — single binary, metrics
./victoria-metrics -storageDataPath=/data/vm -retentionPeriod=12
# VictoriaLogs — single binary, logs
./victoria-logs -storageDataPath=/data/vl -retentionPeriod=30d
# VictoriaTraces — single binary, traces
./victoria-traces -storageDataPath=/data/vt
Each binary starts an HTTP server and is immediately ready to receive data. No configuration files needed for basic usage.
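A quick way to confirm that a freshly started binary is actually accepting requests is to poll its health endpoint. The helper below is a minimal sketch: the function name `wait_for_health` is hypothetical, and the `/health` path and the default ports (8428, 9428, 10428) are assumptions to verify against your version.

```shell
#!/bin/sh
# Poll a health endpoint until it answers with a 2xx status or the
# timeout (in seconds) expires. Returns 0 when healthy, 1 otherwise.
wait_for_health() {
    url=$1
    timeout=${2:-30}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # -f makes curl fail on HTTP errors; --max-time bounds each attempt
        if curl -fsS --max-time 2 -o /dev/null "$url" 2>/dev/null; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "not healthy after ${timeout}s: $url" >&2
    return 1
}
```

For example, `wait_for_health http://localhost:8428/health 30 && echo "metrics ready"` blocks until the metrics binary responds, which is handy in startup scripts that send data immediately afterwards.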
Kubernetes (vmoperator)
The recommended production path uses the vmoperator with CRDs:
# Install the operator
helm repo add vm https://victoriametrics.github.io/helm-charts/
helm repo update
helm install vmoperator vm/victoria-metrics-operator -n monitoring --create-namespace
# Deploy cluster via CRD
kubectl apply -f - <<EOF
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMCluster
metadata:
  name: vm-cluster
spec:
  retentionPeriod: "12"
  replicationFactor: 2
  vminsert:
    replicaCount: 2
    resources:
      requests: { cpu: "500m", memory: "512Mi" }
  vmselect:
    replicaCount: 2
    resources:
      requests: { cpu: "500m", memory: "1Gi" }
  vmstorage:
    replicaCount: 3
    storageDataPath: /vm-data
    resources:
      requests: { cpu: "1", memory: "4Gi" }
    storage:
      volumeClaimTemplate:
        spec:
          resources:
            requests: { storage: 100Gi }
          storageClassName: fast-ssd
EOF
Production Readiness Checklist
Configuration & Optimal Tuning
vmauth Routing Configuration
The single most important config file — routes traffic across all three databases:
# vmauth-config.yml
unauthorized_user:
  url_map:
    # === METRICS ===
    - src_paths:
        - "/api/v1/write"
        - "/api/v1/import.*"
      url_prefix: "http://vminsert:8480/insert/0/prometheus"
    - src_paths:
        - "/api/v1/query.*"
        - "/api/v1/series.*"
        - "/api/v1/labels.*"
      url_prefix: "http://vmselect:8481/select/0/prometheus"
    # === LOGS ===
    - src_paths:
        - "/insert/jsonline.*"
        - "/insert/elasticsearch.*"
        - "/insert/loki/api/v1/push"
      url_prefix: "http://victorialogs:9428"
    - src_paths:
        - "/select/logsql/.*"
      url_prefix: "http://victorialogs:9428"
    # === TRACES ===
    - src_paths:
        - "/insert/opentelemetry/.*"
      url_prefix: "http://victoriatraces:10428"
    # Jaeger query API is served under /select/jaeger
    - src_paths:
        - "/api/traces.*"
        - "/api/services.*"
      url_prefix: "http://victoriatraces:10428/select/jaeger"
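To reason about which backend a given request lands on, it can help to mirror the main routes in a few lines of shell. This is a teaching aid under stated assumptions, not vmauth's matching code: vmauth treats `src_paths` as regular expressions, while the hypothetical `route_for` function below approximates a subset of them with glob patterns.

```shell
#!/bin/sh
# Print the backend that a request path would be forwarded to,
# approximating the url_map with shell glob patterns.
route_for() {
    case "$1" in
        /api/v1/write|/api/v1/import*)
            echo "vminsert" ;;
        /api/v1/query*|/api/v1/series*|/api/v1/labels*)
            echo "vmselect" ;;
        /insert/jsonline*|/insert/elasticsearch*|/select/logsql/*)
            echo "victorialogs" ;;
        /insert/opentelemetry/*)
            echo "victoriatraces" ;;
        *)
            # vmauth rejects requests that match no route
            echo "no-route" ;;
    esac
}
```

Running `route_for /api/v1/query_range` prints `vmselect`. Checking the paths your clients actually send against the table before rollout catches routes that were never mapped.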
Critical Tuning Flags
| Component | Flag | Purpose | Default |
|---|---|---|---|
| VictoriaMetrics (single-node, vmstorage) | -retentionPeriod | Data retention duration | 1 month |
| vmstorage | -search.maxUniqueTimeseries | Prevent OOM on high-cardinality queries | 300,000 |
| vmstorage | -memory.allowedPercent | Share of system RAM that caches may occupy | 60% |
| vmselect | -search.maxQueryDuration | Max single query execution time | 30s |
| vminsert | -replicationFactor=N | Replicate data to N storage nodes | 1 |
| vmselect | -dedup.minScrapeInterval | Deduplicate replicated samples when RF > 1 | 0s (disabled) |
| vmagent | -remoteWrite.label | Add global labels to all metrics sent via remote write | (none) |
| VictoriaLogs | -retentionPeriod | Log retention | 7d |
Reliability & Scaling
Scaling Decision Matrix
| Symptom | Component to Scale | How |
|---|---|---|
| Slow metric queries | vmselect | Add replicas |
| Write backpressure | vminsert | Add replicas |
| Disk full on metrics | vmstorage | Add nodes or increase disk |
| High RAM on storage | vmstorage | Lower -memory.allowedPercent, reduce cardinality, or add nodes |
| Slow log search | VictoriaLogs | Add CPU/RAM (single-node) or move to cluster mode |
| Log ingestion lag | VictoriaLogs | Increase resources or switch to cluster mode |
High Availability
| Mechanism | Implementation |
|---|---|
| Metrics replication | -replicationFactor=2 on vminsert + -dedup.minScrapeInterval on vmselect |
| Metrics availability | If 1 vmstorage node fails with RF=2, vmselect still returns complete results from the surviving replicas |
| Logs/Traces HA | Deploy cluster mode with vlinsert/vlstorage/vlselect |
| Proxy HA | Multiple vmauth replicas behind a load balancer |
| Backup | vmbackup uploads instant, consistent snapshots without locking the DB |
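The replication row implies some simple arithmetic that is worth making explicit when sizing a cluster. The sketch below assumes the commonly cited rule of thumb of at least 2*N-1 vmstorage nodes for replication factor N; both function names are hypothetical.

```shell
#!/bin/sh
# Number of vmstorage nodes that can be unavailable while every sample
# still has at least one live copy, for replication factor N.
tolerated_failures() {
    echo $(( $1 - 1 ))
}

# Suggested minimum vmstorage node count for replication factor N.
# Assumption: the 2*N-1 rule of thumb; confirm it for your version.
min_storage_nodes() {
    echo $(( 2 * $1 - 1 ))
}
```

With `-replicationFactor=2` this yields one tolerated failure and a minimum of three storage nodes, which matches the three vmstorage replicas used in the VMCluster example.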
Cost
Cost Drivers
| Factor | Driver | Optimization |
|---|---|---|
| Compute | Insert + select pods | Right-size, use spot nodes for vmselect |
| Storage | Data volume × retention | ZSTD compression typically shrinks data 2–7x; tune retention |
| Network | Internal cluster traffic | Co-locate in same AZ |
| No object storage | Local SSD only | Eliminates S3/GCS egress costs for queries |
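The "data volume × retention" driver can be turned into a rough disk estimate for metrics. The hypothetical `estimate_disk_gib` helper below is a back-of-the-envelope sketch: the bytes-per-sample figure is an assumption you should replace with a value measured on your own data, and index overhead, free-space headroom for merges, logs, and traces are all ignored.

```shell
#!/bin/sh
# Rough metrics disk estimate in GiB (integer, rounded down).
#   $1 active series
#   $2 scrape interval in seconds
#   $3 retention in days
#   $4 bytes per sample after compression (assumed; whole number)
#   $5 replication factor (optional, default 1)
estimate_disk_gib() {
    series=$1
    interval=$2
    days=$3
    bytes_per_sample=$4
    rf=${5:-1}
    samples_per_series_per_day=$(( 86400 / interval ))
    total_bytes=$(( series * samples_per_series_per_day * days * bytes_per_sample * rf ))
    echo $(( total_bytes / 1073741824 ))
}
```

For instance, `estimate_disk_gib 1000000 15 30 1` covers one million series scraped every 15 seconds and kept for 30 days at an assumed one byte per sample; passing `2` as the fifth argument doubles the raw volume to account for replication.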
Cost at Scale (Self-Hosted)
| Scale | Active Series | Logs per Day | Estimated Monthly Cost |
|---|---|---|---|
| Small | 100k | 10 GB | $100–300 |
| Medium | 1M | 100 GB | $500–1,500 |
| Large | 10M | 1 TB | $2,000–8,000 |
| Enterprise | 100M+ | 10 TB+ | $10,000–50,000 |
VictoriaMetrics Cloud Pricing
| Tier | Starting Cost | Includes |
|---|---|---|
| Single-node | ~$225/mo | Up to 500k active series, 1-month retention |
| Cluster | ~$1,300/mo | Multi-tenancy, HA, advanced networking |
Security
Authentication & Authorization
- The databases themselves do not implement RBAC natively.
- Security relies strictly on vmauth, which acts as the gatekeeper:
  - Bearer token authentication
  - Basic auth
  - URL-based access control
  - Header manipulation
  - Enterprise: SSO integration in vmauth
Network Security Best Practices
- Never expose ingestion nodes to the internet — always put vmauth or NGINX in front
- Use Kubernetes NetworkPolicies to restrict pod-to-pod communication
- Only vmauth should be externally accessible
- Use mTLS between components in sensitive environments
- Cluster multi-tenancy: Data isolation via account IDs in URL paths (/insert/TENANT_ID/)
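Because the tenant lives in the URL path, per-tenant endpoints are easy to derive. The sketch below builds write and query URLs for a tenant given as `accountID` or `accountID:projectID`; the function names are hypothetical, and the host names and ports are the ones used elsewhere in this document.

```shell
#!/bin/sh
# Remote-write URL for one tenant on the cluster's vminsert.
tenant_write_url() {
    echo "http://vminsert:8480/insert/$1/prometheus/api/v1/write"
}

# Instant-query URL for the same tenant on vmselect.
tenant_query_url() {
    echo "http://vmselect:8481/select/$1/prometheus/api/v1/query"
}
```

A vmagent serving tenant 42 would then use `-remoteWrite.url="$(tenant_write_url 42)"`. Isolation only holds if clients cannot choose the tenant segment themselves, which is why the path is normally pinned per user in vmauth.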
Best Practices
Metrics
- Global Relabeling: Append datacenter/environment labels at the vmagent layer before data hits storage
- Drop high-cardinality labels: Use vmagent relabeling to drop labels like pod_ip, request_id before ingestion
- Recording rules: Precompute expensive MetricsQL expressions via vmalert
- Deduplication: With replication, always set -dedup.minScrapeInterval on vmselect
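Dropping high-cardinality labels at the vmagent layer comes down to a small relabeling file. The sketch below writes one with a `labeldrop` rule; the function name and file name are hypothetical, and the label names are the examples from the list above.

```shell
#!/bin/sh
# Write a relabeling file that drops the named labels before samples
# are sent to storage. $1 is the output path.
write_relabel_config() {
    cat > "$1" <<'EOF'
# Drop per-pod and per-request labels that inflate cardinality
- action: labeldrop
  regex: "pod_ip|request_id"
EOF
}
```

After `write_relabel_config relabel.yml`, pass the file to vmagent with `-remoteWrite.relabelConfig=relabel.yml`. Check that the remaining labels still identify each series uniquely, since dropping a distinguishing label can collapse several series into one.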
Logs
- Avoid Translation: Use native APIs whenever possible — point Fluent Bit directly to /insert/jsonline rather than going through an intermediary
- Structured logging: Use JSON logs to enable field extraction at query time
- Stream fields: Set _stream_fields on ingestion to logically group related log entries
- Retention per signal: Set different retention periods for logs (30d) vs metrics (12mo) vs traces (14d)
Operations
- Monitor with itself: Scrape VictoriaMetrics' own /metrics endpoint
- Use vmbackup regularly: Schedule daily incremental backups to S3
- Test upgrades on LTS: Use the LTS release line for production stability
Common Issues & Playbook
| Symptom | Likely Cause | Fix |
|---|---|---|
| High CPU on vmstorage during queries | Large time-window queries | Limit -search.maxQueryDuration, scale vmselect |
| OOM on vmstorage | High cardinality churn | Tune -memory.allowedPercent, drop unused labels at vmagent |
| "too many unique timeseries" | Query returns too many series | Increase -search.maxUniqueTimeseries or refine query |
| Slow VictoriaLogs queries | Large time range without filters | Add time restrictions (_time:1h), use specific filters |
| vmagent not discovering targets | ServiceMonitor/PodScrape CRDs not picked up | Verify vmoperator is running, check CRD labels |
| VictoriaTraces not receiving spans | OTLP gRPC not enabled | Explicitly enable gRPC port in config |
| Data gap after vmstorage restart | VictoriaMetrics has no WAL; recently ingested samples are buffered in memory and can be lost on an unclean stop | Stop vmstorage gracefully so buffers are flushed; with replicationFactor > 1 the other replicas cover the gap |
Monitoring & Troubleshooting
Key Self-Monitoring Metrics
| Metric | What It Tells You |
|---|---|
| vm_rows_inserted_total | Ingestion throughput |
| vm_active_timeseries | Current cardinality |
| vm_slow_queries_total | Queries exceeding duration threshold |
| vm_cache_entries | Cache utilization |
| vm_data_size_bytes | On-disk data size |
| process_resident_memory_bytes | Actual RAM usage |
| vm_merge_duration_seconds | Background compaction health |
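When no dashboard is at hand, these values can be read straight from a component's `/metrics` output. The hypothetical `metric_sum` helper below is a sketch that adds up every sample of one metric from Prometheus text exposition on stdin; it assumes the value is the last field on each line (no trailing timestamps).

```shell
#!/bin/sh
# Sum all samples of the metric named in $1, ignoring labels, so series
# that differ only by label are added together.
metric_sum() {
    awk -v name="$1" '
        /^#/ { next }                  # skip HELP/TYPE comment lines
        {
            metric = $1
            sub(/\{.*/, "", metric)    # strip the label set, if any
            if (metric == name) total += $NF
        }
        END { printf "%.0f\n", total }
    '
}
```

For example, `curl -s http://vm:8428/metrics | metric_sum vm_rows_inserted_total` prints the total number of ingested rows; sampling it twice and dividing the difference by the elapsed seconds gives an ingestion rate.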
Commands & Recipes
Installation
Docker (Quick Start — All Components)
# VictoriaMetrics (metrics)
docker run -d --name vm \
-p 8428:8428 \
-v vm-data:/storage \
victoriametrics/victoria-metrics \
-storageDataPath=/storage -retentionPeriod=12
# VictoriaLogs (logs)
docker run -d --name vl \
-p 9428:9428 \
-v vl-data:/vlogs \
victoriametrics/victoria-logs \
-storageDataPath=/vlogs -retentionPeriod=30d
# VictoriaTraces (traces)
docker run -d --name vt \
-p 10428:10428 \
-p 4317:4317 \
-v vt-data:/vtraces \
victoriametrics/victoria-traces \
-storageDataPath=/vtraces
Docker Compose (Full Stack)
# docker-compose.yaml — Full Victoria stack for development
version: '3.8'
services:
  victoriametrics:
    image: victoriametrics/victoria-metrics:latest
    ports: ["8428:8428"]
    volumes: ["vm-data:/storage"]
    command:
      - "-storageDataPath=/storage"
      - "-retentionPeriod=12"
  victorialogs:
    image: victoriametrics/victoria-logs:latest
    ports: ["9428:9428"]
    volumes: ["vl-data:/vlogs"]
    command:
      - "-storageDataPath=/vlogs"
      - "-retentionPeriod=30d"
  victoriatraces:
    image: victoriametrics/victoria-traces:latest
    ports:
      - "10428:10428" # HTTP
      - "4317:4317" # OTLP gRPC
    volumes: ["vt-data:/vtraces"]
    command:
      - "-storageDataPath=/vtraces"
  vmagent:
    image: victoriametrics/vmagent:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - "-promscrape.config=/etc/prometheus/prometheus.yml"
      - "-remoteWrite.url=http://victoriametrics:8428/api/v1/write"
  vmauth:
    image: victoriametrics/vmauth:latest
    ports: ["8427:8427"]
    volumes:
      - ./vmauth-config.yml:/etc/vmauth/config.yml
    command:
      - "-auth.config=/etc/vmauth/config.yml"
  vmalert:
    image: victoriametrics/vmalert:latest
    volumes:
      - ./alert-rules.yml:/etc/rules/rules.yml
    command:
      - "-rule=/etc/rules/*.yml"
      - "-datasource.url=http://victoriametrics:8428"
      - "-remoteWrite.url=http://victoriametrics:8428"
  grafana:
    image: grafana/grafana-oss:latest
    ports: ["3000:3000"]
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
volumes:
  vm-data:
  vl-data:
  vt-data:
Helm (Kubernetes)
helm repo add vm https://victoriametrics.github.io/helm-charts/
helm repo update
# Single-node VictoriaMetrics
helm install vm vm/victoria-metrics-single -n monitoring --create-namespace
# Cluster VictoriaMetrics
helm install vm-cluster vm/victoria-metrics-cluster -n monitoring -f vm-values.yaml
# vmoperator (manages all components via CRDs)
helm install vmoperator vm/victoria-metrics-operator -n monitoring
# vmagent
helm install vmagent vm/victoria-metrics-agent -n monitoring
# vmalert
helm install vmalert vm/victoria-metrics-alert -n monitoring
# VictoriaLogs (single-node)
helm install vl vm/victoria-logs-single -n monitoring
vmagent Recipes
# Start vmagent as drop-in Prometheus replacement
./vmagent \
-promscrape.config=/path/to/prometheus.yml \
-remoteWrite.url=http://victoriametrics:8428/api/v1/write
# Add global labels to all scraped metrics
./vmagent \
-remoteWrite.label=datacenter=us-east-1 \
-remoteWrite.label=env=production \
-promscrape.config=prometheus.yml \
-remoteWrite.url=http://vminsert:8480/insert/0/prometheus/api/v1/write
# Multi-destination remote write (fan-out)
./vmagent \
-remoteWrite.url=http://vm-primary:8428/api/v1/write \
-remoteWrite.url=http://vm-secondary:8428/api/v1/write
Data Ingestion Recipes
Fluent Bit → VictoriaLogs
# fluent-bit.conf — Push logs directly to VictoriaLogs
[OUTPUT]
    Name     http
    Match    *
    Host     victorialogs
    Port     9428
    URI      /insert/jsonline?_stream_fields=stream&_msg_field=log&_time_field=date
    Format   json_lines
    Compress gzip
OpenTelemetry Collector → VictoriaTraces
# otel-collector-config.yaml
# Excerpt: the otlp and prometheus receivers and the batch processor
# referenced below must be defined in their own top-level sections.
exporters:
  otlp/victoriatraces:
    endpoint: "victoriatraces:4317"
    tls:
      insecure: true
  prometheusremotewrite/vm:
    endpoint: "http://victoriametrics:8428/api/v1/write"
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/victoriatraces]
    metrics:
      receivers: [otlp, prometheus]
      processors: [batch]
      exporters: [prometheusremotewrite/vm]
Promtail / Loki Push → VictoriaLogs
# promtail-config.yaml — VictoriaLogs accepts Loki push API
clients:
- url: http://victorialogs:9428/insert/loki/api/v1/push
Direct OTLP → VictoriaTraces
- HTTP: http://victoriatraces:10428/insert/opentelemetry/v1/traces
- gRPC: grpc://victoriatraces:4317
vmauth Routing Config
# vmauth-config.yml — Route all signals through one proxy
unauthorized_user:
  url_map:
    # Metrics write
    - src_paths: ["/api/v1/write", "/api/v1/import.*"]
      url_prefix: "http://vminsert:8480/insert/0/prometheus"
    # Metrics read
    - src_paths: ["/api/v1/query.*", "/api/v1/series.*", "/api/v1/labels.*"]
      url_prefix: "http://vmselect:8481/select/0/prometheus"
    # Logs write
    - src_paths: ["/insert/jsonline.*", "/insert/elasticsearch.*", "/insert/loki/api/v1/push"]
      url_prefix: "http://victorialogs:9428"
    # Logs read
    - src_paths: ["/select/logsql/.*"]
      url_prefix: "http://victorialogs:9428"
    # Traces write
    - src_paths: ["/insert/opentelemetry/.*"]
      url_prefix: "http://victoriatraces:10428"
    # Traces read (Jaeger API, served under /select/jaeger)
    - src_paths: ["/api/traces.*", "/api/services.*"]
      url_prefix: "http://victoriatraces:10428/select/jaeger"
Backup & Restore
# Create instant snapshot (single-node)
curl http://victoriametrics:8428/snapshot/create
# Returns: {"status":"ok","snapshot":"20260410120000-..."}
# Backup snapshot to S3
./vmbackup \
-storageDataPath=/data/vm \
-snapshot.createURL=http://localhost:8428/snapshot/create \
-dst=s3://my-bucket/vm-backups/
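# Incremental backup: re-run vmbackup with the same -dst.
# Only data that is new since the previous run is uploaded.
./vmbackup \
-storageDataPath=/data/vm \
-snapshot.createURL=http://localhost:8428/snapshot/create \
-dst=s3://my-bucket/vm-backups/latest
# Dated copy: -origin points at the previous backup so shared data is
# copied server-side instead of being uploaded again
./vmbackup \
-storageDataPath=/data/vm \
-snapshot.createURL=http://localhost:8428/snapshot/create \
-dst=s3://my-bucket/vm-backups/$(date +%Y%m%d) \
-origin=s3://my-bucket/vm-backups/latest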
# Restore from backup
./vmrestore \
-src=s3://my-bucket/vm-backups/latest \
-storageDataPath=/data/vm-restored
Note: For clustered setup, vmbackup must be executed on EVERY vmstorage node.
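Since every vmstorage node holds a different shard of the data, each one needs its own backup destination. The hypothetical `print_backup_commands` helper below only prints the commands so they can be reviewed first. It assumes vmstorage serves snapshots on port 8482, that data lives in /vm-data as in the VMCluster example, and that each command is then run on the node it names, because vmbackup reads -storageDataPath locally.

```shell
#!/bin/sh
# Print one vmbackup command per vmstorage node.
#   $1   bucket prefix, e.g. s3://my-bucket/vm-backups (hypothetical layout)
#   $2.. vmstorage host names
print_backup_commands() {
    bucket=$1
    shift
    for node in "$@"; do
        # A per-node -dst keeps the shards from overwriting each other
        echo "vmbackup -storageDataPath=/vm-data -snapshot.createURL=http://${node}:8482/snapshot/create -dst=${bucket}/${node}"
    done
}
```

Calling `print_backup_commands s3://my-bucket/vm-backups vmstorage-0 vmstorage-1 vmstorage-2` emits three commands, one per shard. Restoring follows the same layout in reverse, with one vmrestore per node from its own prefix.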
API Recipes
# Query VictoriaMetrics (PromQL/MetricsQL)
curl -s "http://vm:8428/api/v1/query?query=up" | jq .
# Range query
curl -s "http://vm:8428/api/v1/query_range?query=rate(http_requests_total[5m])&start=-1h&step=60s" | jq .
# Import data via JSON
curl -d '{"metric":{"__name__":"test","job":"api"},"values":[1,2,3],"timestamps":[1617000000000,1617000001000,1617000002000]}' \
http://vm:8428/api/v1/import
# Query VictoriaLogs (LogsQL)
curl -s "http://vl:9428/select/logsql/query?query=_time:5m+AND+error" | jq .
# Push a test log
curl -X POST "http://vl:9428/insert/jsonline?_stream_fields=app&_msg_field=msg" \
-d '{"app":"test","msg":"hello from curl","level":"info"}'
# Look up a trace by ID (Jaeger API)
curl -s "http://vt:10428/select/jaeger/api/traces/abc123" | jq .
# Check health
curl -s "http://vm:8428/-/healthy" && echo "OK"
Grafana Data Source Config
# Grafana provisioning for Victoria Stack
apiVersion: 1
datasources:
  - name: VictoriaMetrics
    type: prometheus
    url: http://vmauth:8427
    isDefault: true
    jsonData:
      httpMethod: POST
  - name: VictoriaLogs
    type: victoriametrics-logs-datasource
    url: http://vmauth:8427
  - name: VictoriaTraces
    type: jaeger
    url: http://vmauth:8427