Victoria Stack

Summary

The Victoria Stack is an interconnected suite of open-source observability databases built by VictoriaMetrics, Inc. Instead of a monolithic "all-in-one" design (as in Datadog), it ships specialized, loosely coupled binaries, one per signal type. Each component is designed for resource efficiency: the headline claim is 5–10x less RAM and roughly 50% less disk than competing solutions such as Prometheus, with the exact savings depending on workload.

Core Databases

| Component | Signal | Query Language | API Compatibility | License |
|---|---|---|---|---|
| VictoriaMetrics | Metrics | MetricsQL (PromQL superset) | Prometheus, InfluxDB, Graphite, Datadog, NewRelic, OpenTelemetry | Apache 2.0 |
| VictoriaLogs | Logs | LogsQL | Elasticsearch Bulk, Loki Push, Syslog, OTLP, Fluentbit JSON | Apache 2.0 |
| VictoriaTraces | Traces | Jaeger Query API + experimental Tempo API (v0.8+) | OTLP (gRPC + HTTP), Jaeger, Zipkin, Tempo DS (experimental) | Apache 2.0 |

Edge Daemons

| Tool | Purpose | Key Feature |
|---|---|---|
| vmagent | Drop-in Prometheus scraper + metric router | Supports 50+ service discovery mechanisms |
| vmalert | Alerting & recording rule evaluator | Evaluates rules against any backend (metrics + logs) |
| vmauth | Smart HTTP proxy, auth, routing, load balancing | Routes traffic across all 3 databases by URL path |
| vmbackup / vmrestore | Incremental snapshots to S3/GCS/Azure | Point-in-time consistent backups without locking |
| vmoperator | Kubernetes operator (CRDs) | GitOps-native management of the full stack |
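
The "routes traffic by URL path" idea behind vmauth can be sketched in a few lines. This is only an illustration of path-prefix routing, not vmauth's implementation or configuration format (vmauth is configured through its own YAML file). The hostnames, ports, and path prefixes below are assumptions chosen for the example.

```python
from typing import Optional

# Hypothetical routing table: path prefix -> backend base URL.
# Hostnames, ports, and prefixes are illustrative assumptions;
# check them against your own deployment.
ROUTES = {
    "/api/v1/": "http://victoriametrics:8428",
    "/select/logsql/": "http://victorialogs:9428",
    "/select/jaeger/": "http://victoriatraces:10428",
}


def pick_backend(path: str) -> Optional[str]:
    """Return the backend for the longest matching prefix, or None."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    if not matches:
        return None
    return ROUTES[max(matches, key=len)]


if __name__ == "__main__":
    for p in ("/api/v1/query", "/select/logsql/query", "/unknown"):
        print(p, "->", pick_backend(p))
```

A single proxy in front of all three databases means clients only need one hostname, and authentication can be enforced in one place.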

Repository & Community

| Attribute | Detail |
|---|---|
| Repository | github.com/VictoriaMetrics/VictoriaMetrics |
| Stars | 16.7k+ ⭐ |
| Latest Version | v1.139.0 (stable), v1.122.x (LTS) |
| Language | Go |
| License | Apache 2.0 (core); Enterprise features require paid license |
| Company | VictoriaMetrics, Inc. |
| Founded | ~2018 by Aliaksandr Valialkin |

Evaluation

  • Why it's better: All components share a design philosophy of resource efficiency and low-tuning operation. VictoriaMetrics is reported to use 5–10x less RAM and about 50% less disk space than Prometheus, helped by ZSTD-based compression. Native OTLP and Loki ingestion APIs remove the need for translation layers, and decoupled read and write paths let a cluster scale out by adding nodes.

  • When it fits (Applicability):

  • High-ephemerality Kubernetes clusters with massive label churn
  • IoT telemetry streams with millions of active series
  • Environments processing terabytes of logs/traces daily
  • When cloud observability costs (Datadog/NewRelic) have become prohibitive
  • When local Prometheus/Loki instances are frequently OOM-killing due to high cardinality
  • Teams that want a single-binary, zero-dependency deployment

  • Pros and Cons:

| Pros | Cons |
|---|---|
| 5–10x less RAM than Prometheus, 50% less disk | Physically fragmented backends (3 separate storage engines) |
| Apache 2.0 license (more permissive than AGPL) | Lacks built-in correlation UI; relies on Grafana |
| Single-binary deployment, near-zero config | VictoriaTraces is newer and less battle-tested than Tempo |
| Drop-in API compatibility (Prometheus, Loki, OTLP) | Enterprise features (downsampling, SSO, retention filters) require paid license |
| MetricsQL fixes PromQL quirks (no extrapolation) | Smaller community than Grafana ecosystem |
| Cluster mode with shared-nothing architecture | No native multi-tenancy in single-node mode |
| vmoperator for GitOps Kubernetes management | Cross-signal correlation requires manual Grafana config |

  • Common Use Cases:

  • Roblox: Billions of active time series, 100% uptime across multiple quarters
  • Spotify R&D: Replaced internal "Heroic" system for better scale
  • CERN: Real-time monitoring of CMS detector system
  • Grammarly: 10x cost reduction vs previous solution
  • DreamHost: 80% memory reduction, 76M active time series
  • Other: Adidas, Wix, Brandwatch, DSV, Dig Security, Sensedia

  • Licensing & Commercial Use:

  • Core databases: Apache 2.0 (no copyleft restrictions)
  • Enterprise features: separate paid license (downsampling, retention filters, SSO in vmauth, advanced alerting)
  • VictoriaMetrics Cloud: Managed SaaS — Single-node from $225/mo, Cluster from $1,300/mo
  • No per-host, per-user, or per-GB licensing

  • Ecosystem & Data Connections:

  • Ingestion: Prometheus remote_write, InfluxDB line protocol, Graphite, Datadog, NewRelic, OpenTelemetry (OTLP), Elasticsearch Bulk, Loki Push, Syslog, Fluentbit JSON
  • Querying: MetricsQL/PromQL, LogsQL, Jaeger Query API
  • Visualization: Grafana (primary), built-in VMUI dashboard
  • Collection: vmagent, OpenTelemetry Collector, Prometheus, Fluentbit, Logstash, Promtail
  • IaC: vmoperator (K8s CRDs), Helm charts, Terraform provider
  • Backup: vmbackup/vmrestore to S3/GCS/Azure
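
As a rough illustration of the querying options listed above, the sketch below builds request URLs for the Prometheus-compatible `/api/v1/query` endpoint of VictoriaMetrics and the `/select/logsql/query` endpoint of VictoriaLogs. The base URLs (localhost with ports 8428 and 9428) and the helper names are assumptions for the example; the code only constructs URLs and sends nothing.

```python
from urllib.parse import urlencode


def metrics_query_url(base: str, expr: str) -> str:
    """URL for an instant MetricsQL/PromQL query (Prometheus-style API)."""
    return f"{base.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def logs_query_url(base: str, expr: str, limit: int = 10) -> str:
    """URL for a LogsQL query, capped at `limit` returned entries."""
    params = urlencode({"query": expr, "limit": limit})
    return f"{base.rstrip('/')}/select/logsql/query?{params}"


if __name__ == "__main__":
    # Assumed local single-node instances; adjust hosts and ports as needed.
    print(metrics_query_url("http://localhost:8428", "up"))
    print(logs_query_url("http://localhost:9428", "error", limit=5))
```

Because the metrics endpoint follows the Prometheus HTTP API, existing Prometheus client tooling and Grafana data sources can usually be pointed at it without changes.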

  • Compatibility & Requirements:

  • Runs on Linux, macOS, Docker, Kubernetes
  • Zero external dependencies — no PostgreSQL, no Redis, no object storage required
  • Single-node: starts on as little as 1 CPU and 256 MB RAM; handling millions of series requires proportionally more resources
  • Cluster: scales linearly with added vmstorage nodes
  • Storage: local SSDs recommended (not object storage)

  • Alternatives:

  • Grafana Mimir — Horizontally scalable Prometheus, microservices-based, AGPL
  • Thanos — Sidecar pattern for existing Prometheus, object storage
  • Prometheus — The standard, single-node only
  • InfluxDB — Dedicated TSDB, different query language (Flux)
  • Grafana Loki — Log aggregation, label-only indexing
  • SigNoz — OpenTelemetry-native, ClickHouse-backed
  • Elasticsearch/OpenSearch — Full-text log search, heavier resource footprint

  • Migration & Lock-in Risks:

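  • Very low lock-in on metrics: MetricsQL is backwards-compatible with PromQL apart from a few intentional behavior differences (for example, no extrapolation in rate and increase), and its extensions are optional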
  • Low lock-in on logs — accepts Loki API, Elasticsearch Bulk API; LogsQL is proprietary but data is portable
  • Low lock-in on traces — accepts OTLP natively; data can be re-ingested elsewhere
  • Migration from Prometheus: Add remote_write URL — zero downtime
  • Migration from Loki: Switch Promtail/Fluentbit destination URL
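
To make the Loki migration path more concrete, here is a minimal sketch of the JSON body that a Loki-compatible push endpoint expects. The helper name and sample labels are hypothetical, and the target URL in the comment (a local VictoriaLogs instance on port 9428) is an assumption. The code only builds the payload and does not send it.

```python
import json
import time
from typing import Dict, List, Optional

# Assumed target for a local single-node VictoriaLogs instance:
#   http://localhost:9428/insert/loki/api/v1/push


def loki_push_body(labels: Dict[str, str], lines: List[str],
                   ts_ns: Optional[int] = None) -> str:
    """Build a Loki push-API JSON body with one stream.

    Every line gets the same timestamp (nanoseconds since epoch, as a string).
    """
    ts = str(ts_ns if ts_ns is not None else time.time_ns())
    payload = {
        "streams": [
            {"stream": labels, "values": [[ts, line] for line in lines]}
        ]
    }
    return json.dumps(payload)


if __name__ == "__main__":
    print(loki_push_body({"app": "demo"}, ["hello", "world"]))
```

Since the payload shape is the same one Promtail and Fluentbit already emit for Loki, switching backends is mostly a matter of changing the destination URL in the shipper's configuration.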

  • Community Health & Support:

  • 16.7k+ GitHub stars, active development, responsive maintainers
  • Used by major companies (Roblox, Spotify, CERN, Grammarly, Adidas)
  • Enterprise SLAs available
  • Active Slack community, regular blog posts and conference talks

Notes In This Folder

  • Grafana — visualization layer used with the Victoria Stack
  • LGTM Stack — Grafana's competing full-stack observability solution
  • LGTM vs Victoria Stack — canonical comparison note
  • Observability Stacks Comparison — 6-way comparison including Coroot, SigNoz, SkyWalking, OpenObserve
  • Prometheus — the metrics standard that VictoriaMetrics is wire-compatible with
  • OpenTelemetry — the standard telemetry framework accepted natively by all Victoria components

Assets

Store local images, diagrams, and PDFs in the _assets/ subfolder. Prefer Mermaid for inline diagrams.

Next Actions

  • ~~Create comparison note: LGTM vs Victoria~~ → completed: LGTM vs Victoria Stack
  • Create comparison note: VictoriaLogs vs Loki (standalone deep dive)
  • Benchmark VictoriaLogs vs Loki at 100 GB/day log volume
  • Research VictoriaMetrics anomaly detection features (vmanomaly)

Sources

Primary Sources

| URL | Source Kind | Authority | Retrieved Via | Date |
|---|---|---|---|---|
| https://github.com/VictoriaMetrics/VictoriaMetrics | repository | primary | web search | 2026-04-10 |
| https://docs.victoriametrics.com/ | docs | primary | web search | 2026-04-10 |
| https://docs.victoriametrics.com/victorialogs/ | docs | primary | web search | 2026-04-10 |
| https://docs.victoriametrics.com/victoriatraces/ | docs | primary | web search | 2026-04-10 |
| https://docs.victoriametrics.com/metricsql/ | docs | primary | web search | 2026-04-10 |
| https://docs.victoriametrics.com/operator/ | docs | primary | web search | 2026-04-10 |
| https://docs.victoriametrics.com/vmagent/ | docs | primary | web search | 2026-04-10 |
| https://docs.victoriametrics.com/vmalert/ | docs | primary | web search | 2026-04-10 |
| https://docs.victoriametrics.com/vmauth/ | docs | primary | web search | 2026-04-10 |
| https://docs.victoriametrics.com/vmbackup/ | docs | primary | web search | 2026-04-10 |
| https://docs.victoriametrics.com/cluster-victoriametrics/ | docs | primary | web search | 2026-04-10 |
| https://victoriametrics.com/case-studies/ | case study | primary | web search | 2026-04-10 |
| https://github.com/VictoriaMetrics/prometheus-benchmark | tool | primary | web search | 2026-04-10 |

Secondary Sources

| URL | Source Kind | Authority | Retrieved Via | Date |
|---|---|---|---|---|
| https://github.com/VictoriaMetrics/helm-charts | repository | secondary | web search | 2026-04-10 |

Community Sources

| URL | Source Kind | Authority | Retrieved Via | Date |
|---|---|---|---|---|
| https://victoriametrics.com/blog/ | blog | primary | web search | 2026-04-10 |
| Slack community (victoriametrics.slack.com) | chat | community | manual | 2026-04-10 |

Questions

Open

Answered

  • Do they share a unified backend? No. They are separate databases, although VictoriaTraces is built on the VictoriaLogs storage engine, resolved in observability/victoriametrics/architecture
  • What license is used? — Apache 2.0 for all core components; Enterprise features require paid license, resolved in observability/victoriametrics/index
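  • Is VictoriaMetrics a drop-in Prometheus replacement? In practice, yes. MetricsQL is a backwards-compatible superset of PromQL with a few intentional behavior differences (for example, no extrapolation in rate and increase), and VictoriaMetrics accepts remote_write natively, resolved in observability/victoriametrics/architecture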
  • Why is VictoriaMetrics more memory-efficient than Prometheus? — Custom LSM-Tree storage, ZSTD compression, and -memory.allowedPercent enforcement preventing OOM, resolved in observability/victoriametrics/architecture
  • How do companies use it at scale? — Roblox (billions of series), Spotify (replaced Heroic), CERN (CMS detector), Grammarly (10x cost reduction), DreamHost (76M series), resolved in observability/victoriametrics/index
  • Does VictoriaTraces require a completely separate VictoriaLogs cluster, or can it share the same storage nodes as the main VictoriaLogs cluster? — VictoriaTraces uses its own vtstorage nodes with -storageNode flag for cluster mode; it does not share vlstorage nodes with VictoriaLogs. Each product has its own distinct storage components (vtstorage/vlstorage), insert nodes, and select nodes, resolved in observability/victoriametrics/architecture
  • How does LogsQL compare to LogQL (Loki) in terms of performance on large cardinality? — LogsQL handles high-cardinality data natively by storing fields separately rather than requiring runtime JSON parsing (| unpack/| json in LogQL). VictoriaLogs stores high-cardinality fields like trace_id as native filter fields (trace_id:=abcdef) instead of extracting them at query time, providing significantly better performance and compression for high-cardinality queries, resolved in observability/victoriametrics/architecture
  • How does VictoriaMetrics' downsampling (Enterprise) compare to Mimir's compactor? What data fidelity is lost? — VM Enterprise downsampling (-downsampling.period) aggregates older data into lower-resolution samples (e.g., 30d:5m,180d:1h,1y:6h,2y:1d), keeping last-value per interval. Periods must be multiples of each other. Fidelity is reduced from raw scrape interval to the configured aggregation level. Mimir's compactor focuses on block merging and deduplication rather than time-based aggregation, resolved in observability/victoriametrics/architecture
  • What are the production gotchas of running VictoriaLogs in cluster mode vs single-node for >1 TB/day logs? — Cluster mode splits into vlinsert, vlselect, and vlstorage components; each must be scaled independently. At >1 TB/day, vlstorage nodes need sufficient local SSD for per-day partitions before compaction, and vlinsert nodes must handle the ingestion fan-out. Single-node is simpler but lacks horizontal scaling; cluster mode adds operational complexity but enables independent scaling of read vs write paths, resolved in observability/victoriametrics/operations
  • How does vmagent's performance compare to Grafana Alloy for OTel-native telemetry pipelines? — vmagent has a very low memory/CPU footprint for metrics collection via Prometheus scraping and remote_write. Grafana Alloy (OTel Collector-based) is heavier but supports full OTel pipelines (traces, metrics, logs) with batching and processing stages. vmagent is optimal for metrics-only workloads; Alloy is better for unified OTel telemetry across all signal types, resolved in observability/victoriametrics/architecture
  • What is the actual trace search performance of VictoriaTraces vs Tempo at 100M+ spans/day? — No public head-to-head benchmark at 100M+ spans/day exists. VictoriaTraces uses the same storage engine lineage as VictoriaLogs with per-day partitions optimized for trace lookups. Tempo relies on object storage + optional Bloom filters. Real-world comparison TBD -- performance depends heavily on query patterns (trace ID lookup vs attribute search), resolved in observability/victoriametrics/architecture
  • How does VictoriaMetrics anomaly detection (vmanomaly) work, and is it production-ready? — vmanomaly is a service that continuously scans time series with ML models (Z-score online, Prophet, Holt-Winters, etc.) to compute an interpretable anomaly score (de-trended, de-seasonalized). It writes anomaly scores back to VM and integrates with vmalert for alerting. Supports configurable schedulers, detection direction, and domain-knowledge constraints. It is production-ready and commercially supported as part of the VictoriaMetrics ecosystem, resolved in observability/victoriametrics/architecture
  • What is the migration path from Elasticsearch to VictoriaLogs for existing log pipelines? — VictoriaLogs exposes an Elasticsearch-compatible ingestion endpoint at /insert/elasticsearch, enabling migration via Logstash Elasticsearch output plugin, Telegraf Elasticsearch output, Vector Elasticsearch sink, and OTel Collector Elasticsearch exporter. Key parameters: VL-Msg-Field, VL-Time-Field, VL-Stream-Fields headers control field mapping. No full re-index of historical data -- migrate pipelines forward and query old data from ES during transition, resolved in observability/victoriametrics/operations
  • How does VictoriaMetrics Cloud pricing compare to Grafana Cloud at equivalent scale? VM Cloud uses fixed-tier pricing (compute by capacity tier + storage at $0.511/GB/month + egress at $0.09/GB) with no per-series charges, which protects against cardinality explosions. Grafana Cloud bills per active series (roughly $8 per 1,000 series per month) plus per-GB for logs/traces. At scale (1M+ series), VM Cloud is typically estimated to be 3-10x cheaper due to VM's compression advantage and fixed-tier model, resolved in observability/victoriametrics/architecture
  • Can vmauth replace NGINX Ingress as the sole entry point for the Victoria Stack on Kubernetes? Not on its own. vmauth is an authentication proxy that can be exposed through its own Ingress resource (including TLS via cert-manager). It handles auth (basic + bearer token), RBAC via VMUser CRDs, and routing to all VM components, but it sits between your ingress controller and the VM backends rather than replacing the ingress controller itself. For production, use vmauth together with an ingress controller (NGINX/Traefik), resolved in observability/victoriametrics/architecture
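
The downsampling answer above (a `-downsampling.period` value such as `30d:5m,180d:1h,1y:6h,2y:1d`, keeping the last value per interval, with periods that are multiples of each other) can be illustrated with a small sketch. It is a simplified model for reasoning about fidelity loss, not VictoriaMetrics' implementation. The function names are hypothetical, a year is assumed to be 365 days, and bucket alignment in the real system may differ.

```python
import re

# Seconds per unit; "y" as 365 days is an assumption of this sketch.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400,
                "w": 604800, "y": 365 * 86400}


def to_seconds(text: str) -> int:
    """Convert a duration like '5m' or '30d' to seconds."""
    m = re.fullmatch(r"(\d+)([smhdwy])", text)
    if not m:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(m.group(1)) * UNIT_SECONDS[m.group(2)]


def parse_periods(spec: str):
    """Parse 'offset:interval,...' into sorted (offset_s, interval_s) pairs."""
    pairs = [tuple(to_seconds(p) for p in item.split(":"))
             for item in spec.split(",")]
    return sorted(pairs)


def intervals_are_multiples(pairs) -> bool:
    """Check that each interval is a multiple of the previous, finer one."""
    intervals = [interval for _, interval in pairs]
    return all(b % a == 0 for a, b in zip(intervals, intervals[1:]))


def keep_last(samples, interval):
    """Keep only the last (timestamp, value) sample in each interval bucket."""
    buckets = {}
    for ts, value in sorted(samples):
        buckets[ts // interval] = (ts, value)
    return [buckets[k] for k in sorted(buckets)]


if __name__ == "__main__":
    pairs = parse_periods("30d:5m,180d:1h,1y:6h,2y:1d")
    print(pairs, intervals_are_multiples(pairs))
    print(keep_last([(0, 1.0), (100, 2.0), (310, 3.0)], 300))
```

The `keep_last` step shows what is lost: every sample in a bucket except the final one is discarded, so short spikes inside an interval disappear from older data.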