# OpenObserve — Benchmarks

Storage efficiency, query performance, and cost analysis for OpenObserve.

## Storage Efficiency vs Elasticsearch

### Architectural Comparison
| Factor | OpenObserve | Elasticsearch |
|---|---|---|
| Storage format | Parquet (columnar) + Zstd | Lucene segments (row-oriented) + inverted index |
| Compression ratio | ~10:1 | ~1.5:1 |
| Storage tier | Object storage (S3: $0.023/GB/mo) | SSD ($0.10+/GB/mo) |
| Indexing | Optional per-column bloom filters | Full inverted index on every field |
| Net storage cost | ~$0.002/GB/mo | ~$0.28/GB/mo |
| Claimed advantage | ~140x cheaper storage | — |
### Why 140x Cheaper

The 140x claim combines three factors:

- **Compression:** Parquet columnar + Zstd achieves ~10:1 vs Lucene's ~1.5:1 → ~7x fewer raw bytes
- **Storage tier:** S3 ($0.023/GB/mo) vs SSD ($0.10+/GB/mo) → ~4–5x cheaper per GB
- **No replica overhead:** S3 provides 11-nines durability natively vs manually replicated Elasticsearch shards → ~2–3x savings

Combined: ~7x × ~4x × ~2.5x ≈ ~70–140x, depending on configuration.

Caveat: this is a vendor-provided comparison. Actual ratios depend on data patterns, compression achieved, and S3 pricing tiers.
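
The multiplication above can be reproduced as a back-of-envelope check. All inputs are the vendor-style estimates quoted in this section, not measurements:

```python
# Back-of-envelope reproduction of the ~140x storage-cost claim.
# Every input below is an estimate quoted in the section above.
compression_gain = 10 / 1.5      # Parquet+Zstd ~10:1 vs Lucene ~1.5:1 → ~6.7x fewer bytes
tier_gain = 0.10 / 0.023         # SSD $/GB/mo vs S3 $/GB/mo → ~4.3x cheaper tier
replication_gain = 2.5           # midpoint of the ~2-3x replica-overhead savings

combined = compression_gain * tier_gain * replication_gain
print(f"combined savings factor ≈ {combined:.0f}x")  # ≈ 72x at these midpoints
```

Taking the high end of each range (10:1 vs 1.5:1, $0.023 vs well above $0.10, 3x replication) pushes the product toward the claimed ~140x; the midpoints land near ~70x.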
## Analytical Queries (Aggregations)

| Aspect | OpenObserve | Elasticsearch |
|---|---|---|
| Column pruning | Yes (reads only needed columns) | No (reads full documents) |
| Predicate pushdown | Yes (DataFusion → Parquet row-group stats) | Partial (inverted index) |
| Vectorized execution | Yes (Apache Arrow batches) | No |
| Aggregation speed | Often faster for analytical patterns | Faster for full-text search |
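
The first two rows of the table can be illustrated with a toy model. This is a sketch of the general Parquet technique, not OpenObserve or DataFusion code; the row-group layout and field names are invented for illustration:

```python
# Toy columnar layout: two "row groups", each holding per-column value lists
# plus precomputed min/max statistics, as a Parquet file would.
row_groups = [
    {"stats": {"status": (200, 299)},
     "cols": {"status": [200, 204, 201], "latency_ms": [12, 8, 30]}},
    {"stats": {"status": (500, 503)},
     "cols": {"status": [500, 503, 500], "latency_ms": [90, 120, 85]}},
]

def avg_latency_where_status_ge(threshold):
    total, count = 0, 0
    for rg in row_groups:
        lo, hi = rg["stats"]["status"]
        if hi < threshold:
            continue  # predicate pushdown: stats prove no row matches; skip the group
        # column pruning: only the two referenced columns are ever touched
        for s, latency in zip(rg["cols"]["status"], rg["cols"]["latency_ms"]):
            if s >= threshold:
                total += latency
                count += 1
    return total / count

print(avg_latency_where_status_ge(500))  # only the second row group is scanned
```

A row-oriented store would deserialize every full document to answer the same query; here the first row group is skipped on statistics alone, and no column outside `status` and `latency_ms` is read.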
## Full-Text Search

| Aspect | OpenObserve | Elasticsearch |
|---|---|---|
| Approach | Parquet scan + bloom filters | Inverted index |
| Wildcard search | Full scan (slower) | Fast (inverted index) |
| Best for | Known-field searches, aggregations | Complex full-text search |
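
The bloom-filter tradeoff in the table comes down to this: a bloom filter answers "might this exact term be present?", so known-term searches can rule out whole files without scanning them, while a wildcard like `time*` cannot be hashed and forces a full scan. A toy filter (illustrative only, not OpenObserve's implementation) makes the mechanic concrete:

```python
import hashlib

class Bloom:
    """Toy bloom filter: k hash positions set in a fixed-size bit array."""
    def __init__(self, bits=1024, hashes=3):
        self.bits, self.hashes = bits, hashes
        self.array = 0  # bit array packed into one int

    def _positions(self, term):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{term}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, term):
        for p in self._positions(term):
            self.array |= 1 << p

    def might_contain(self, term):
        # False means "definitely absent" → the file can be skipped entirely.
        return all(self.array >> p & 1 for p in self._positions(term))

bf = Bloom()
for token in ["error", "timeout", "payment-svc"]:
    bf.add(token)

print(bf.might_contain("timeout"))    # True: the file may contain this term
print(bf.might_contain("disk-full"))  # almost certainly False: skip the file
```

Exact-term queries get this skip-ahead for free; an inverted index additionally supports prefix and fuzzy matching, which is why Elasticsearch keeps the edge for complex full-text search.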
## Single-Node (Dev/POC)

| Metric | Value |
|---|---|
| Binary size | ~50 MB |
| Startup time | < 5 seconds |
| Idle RAM | ~50–100 MB |
| Minimum resources | 1 CPU, 512 MB RAM |
## Production HA

| Component | CPU | RAM | Storage |
|---|---|---|---|
| Ingester (×3) | 2 vCPU | 4 GB | 100 GB WAL disk |
| Querier (×2) | 4 vCPU | 8 GB | 50 GB cache disk |
| Compactor (×1) | 2 vCPU | 4 GB | 50 GB temp |
| Router (×2) | 1 vCPU | 1 GB | — |
| AlertManager (×1) | 1 vCPU | 1 GB | — |
| PostgreSQL | 1 vCPU | 2 GB | 20 GB |
| S3 | — | — | Unlimited |
## Cost Comparison (100 GB/day logs, 30-day retention)

| Cost Item | OpenObserve (self-hosted) | Elasticsearch (self-hosted) |
|---|---|---|
| Storage | ~$7/mo (S3, 300 GB after compression) | ~$300/mo (3 TB SSD, 3× replicated) |
| Compute | ~$500/mo (small stateless nodes) | ~$1,500/mo (3× data nodes, 64 GB each) |
| **Total** | **~$507/mo** | **~$1,800/mo** |
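
The storage line follows directly from the table's stated assumptions (these are the estimates above, not measured prices):

```python
# Deriving the storage row of the cost table from its own assumptions.
daily_gb, retention_days = 100, 30
raw_gb = daily_gb * retention_days     # 3,000 GB raw over the retention window

o2_stored_gb = raw_gb / 10             # ~10:1 compression → 300 GB on S3
o2_storage = o2_stored_gb * 0.023      # S3 at $0.023/GB/mo → ~$6.90/mo

es_ssd_gb = 3000                       # table's "3 TB SSD, 3x replicated" footprint
es_storage = es_ssd_gb * 0.10          # SSD at $0.10/GB/mo → ~$300/mo

print(f"OpenObserve ≈ ${o2_storage:.2f}/mo, Elasticsearch ≈ ${es_storage:.0f}/mo")
```

Compute dominates both totals at this volume; the storage gap only becomes the headline cost at longer retention windows or higher ingest rates.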
## Scale Limits

| Dimension | Practical Limit | Notes |
|---|---|---|
| Daily ingestion | PB-scale | S3 write throughput becomes the bottleneck |
| Query concurrency | 50–100 | Add querier replicas to scale out |
| Retention | Unlimited | Managed via S3 lifecycle policies |
| Streams (indices) | 10,000+ | Metadata store may need PostgreSQL |
| Single query scan | TB-range | DataFusion parallelizes across partitions |
## Caveats

- The 140x cost claim comes from vendor benchmarks and combines compression, storage-tier, and replication savings.
- Full-text search performance lags Elasticsearch's inverted index for wildcard/fuzzy queries.
- DataFusion is less battle-tested than ClickHouse or Elasticsearch at extreme scale.
- Performance varies significantly with data patterns and query types.
## Sources