# OpenObserve — Architecture

Component breakdown, deployment topologies, and storage architecture for OpenObserve.
## System Architecture

```mermaid
flowchart TB
    subgraph Sources["Data Sources"]
        OTEL_S["OTel Collector\n(OTLP gRPC/HTTP)"]
        PROM_S["Prometheus\n(remote_write)"]
        ES_S["ES Bulk API\nclients"]
        FB_S["FluentBit /\nVector"]
        KF_S["Kinesis Firehose\n/ GCP Pub/Sub"]
        RUM_S["RUM SDK\n(browser)"]
    end
    subgraph O2Cluster["OpenObserve Cluster"]
        direction TB
        subgraph Stateless["Stateless Compute"]
            Router["Router\n(request dispatch)"]
            Ingester["Ingester\n(WAL → Parquet)"]
            Querier["Querier\n(DataFusion engine)"]
            Compactor["Compactor\n(file merging)"]
            AlertMgr["AlertManager\n(alerts + reports)"]
        end
        subgraph Infra["Infrastructure"]
            WAL["WAL\n(local disk\nmemtable)"]
            Cache["Disk Cache\n(querier-side)"]
        end
    end
    subgraph Storage["Storage Layer"]
        S3["Object Storage\n(S3/GCS/Azure/MinIO)"]
        PQ["Apache Parquet\n(Zstd compressed)"]
        Meta["Metadata Store\n(PostgreSQL / SQLite)"]
    end
    Sources --> Router
    Router --> Ingester
    Ingester --> WAL
    WAL -->|"flush every\n5min or size"| PQ
    PQ --> S3
    Querier -->|"scan"| S3
    Querier --> Cache
    Compactor -->|"merge"| S3
    AlertMgr --> Querier
    style Stateless fill:#e65100,color:#fff
    style Storage fill:#1565c0,color:#fff
```
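All of the ingestion paths above converge on the router, which dispatches writes to an ingester. As a minimal sketch of what a client on the JSON path sends, the helper below builds a request for OpenObserve's JSON ingest endpoint; the `/api/{org}/{stream}/_json` path and basic-auth header reflect that API as commonly documented, but verify both against your deployment before relying on them.

```python
import base64
import json


def build_ingest_request(base_url, org, stream, records, user, password):
    """Build (url, headers, body) for a JSON ingest call.

    The ingester accepts an array of flat JSON records, infers the
    schema, and appends them to the stream's WAL (see diagram above).
    """
    url = f"{base_url}/api/{org}/{stream}/_json"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Basic {token}",
    }
    body = json.dumps(records).encode()
    return url, headers, body


# Hypothetical org/stream/credentials for illustration.
url, headers, body = build_ingest_request(
    "http://localhost:5080", "default", "app_logs",
    [{"level": "info", "message": "service started"}],
    "root@example.com", "secret",
)
```

Any HTTP client can then POST `body` to `url` with those headers; the router forwards the write to an ingester.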
## Node Role Architecture

```mermaid
flowchart LR
    subgraph Roles["ZO_NODE_ROLE"]
        ALL["all\n(single node)"]
        R["router"]
        I["ingester"]
        Q["querier"]
        C["compactor"]
        A["alertmanager"]
    end
    subgraph Groups["ZO_NODE_ROLE_GROUP"]
        Default["default\n(user queries)"]
        Background["background\n(alerts, reports)"]
    end
    R --> I
    R --> Q
    R --> A
    Q -.- Default
    A -.- Background
```
### Role Responsibilities

| Role | State | Scales | CPU Profile | Memory Profile |
|---|---|---|---|---|
| Router | Stateless | Horizontal | Low | Low |
| Ingester | WAL on disk | Horizontal | Medium | Medium (memtable) |
| Querier | Cache on disk | Horizontal | High (DataFusion) | High (scan buffers) |
| Compactor | Stateless | 1–2 nodes | Medium | Low |
| AlertManager | Stateless | 1–2 nodes | Low | Low |
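In a dedicated-role deployment, each process is pinned to one of the roles above via `ZO_NODE_ROLE`, and queriers can be split into pools with `ZO_NODE_ROLE_GROUP` so background alert and report evaluation does not compete with user queries. A hedged sketch, where the binary path and backgrounding are illustrative and in practice each line runs on its own host or pod:

```shell
# One role per process; scale each pool independently.
ZO_NODE_ROLE=router    ./openobserve &
ZO_NODE_ROLE=ingester  ./openobserve &
ZO_NODE_ROLE=compactor ./openobserve &

# Split querier pools: user-facing vs. alert/report evaluation.
ZO_NODE_ROLE=querier ZO_NODE_ROLE_GROUP=default    ./openobserve &
ZO_NODE_ROLE=querier ZO_NODE_ROLE_GROUP=background ./openobserve &
```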
## Storage Architecture

### Data Path

```mermaid
sequenceDiagram
    participant Client as Client
    participant Ingester as Ingester
    participant WAL as Local WAL
    participant S3 as Object Storage
    participant Compactor as Compactor
    Client->>Ingester: JSON / OTLP / ES Bulk
    Ingester->>Ingester: Schema inference
    Ingester->>WAL: Write to memtable (Arrow batches)
    Note over WAL: Flush triggers:<br/>5 min elapsed OR<br/>file size threshold
    WAL->>S3: Write small Parquet file
    Note over S3: Small files (1-10 MB)
    loop Background compaction
        Compactor->>S3: Read small files
        Compactor->>Compactor: Sort, merge, re-partition
        Compactor->>S3: Write large Parquet file
        Compactor->>S3: Delete old small files
    end
    Note over S3: Large files (100+ MB)<br/>Sorted by time, partitioned by stream
```
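The flush rule in the sequence above ("5 min elapsed OR file size threshold") can be sketched as a small policy object. The 5-minute interval comes from the diagram; the byte threshold and names here are illustrative, not OpenObserve's actual defaults.

```python
import time


class FlushPolicy:
    """Decide when a WAL memtable should be flushed to Parquet.

    Flush fires when EITHER the elapsed time since the last flush
    reaches max_age_secs OR the buffered size reaches max_bytes.
    """

    def __init__(self, max_age_secs=300, max_bytes=32 * 1024 * 1024,
                 clock=time.monotonic):
        self.max_age_secs = max_age_secs
        self.max_bytes = max_bytes
        self.clock = clock            # injectable for testing
        self.buffered = 0
        self.last_flush = clock()

    def record(self, nbytes):
        """Account for a batch appended to the memtable."""
        self.buffered += nbytes

    def should_flush(self):
        age = self.clock() - self.last_flush
        return age >= self.max_age_secs or self.buffered >= self.max_bytes

    def flushed(self):
        """Reset counters after the small Parquet file is written."""
        self.buffered = 0
        self.last_flush = self.clock()
```

Because both triggers are checked with OR, a quiet stream still flushes on the timer, while a busy stream flushes early on size, which is exactly why the compactor then has many small files to merge.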
### Parquet File Structure

| Layer | Detail |
|---|---|
| Partitioning | By organization → stream → date → time window |
| Compression | Zstd (default), high compression ratio |
| Bloom filters | Per-column, configurable for high-cardinality fields |
| Row groups | Optimized for DataFusion predicate pushdown |
| Metadata | Column statistics for partition pruning |
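The partitioning row above (organization → stream → date → time window) implies object-store keys that encode that hierarchy. A hypothetical key builder is sketched below; the `files/` prefix and exact layout are assumptions for illustration, and OpenObserve's real key format may differ.

```python
from datetime import datetime, timezone


def parquet_key(org, stream_type, stream, ts, file_id):
    """Build an object-store key following the org → stream →
    date → hour partitioning described above (layout illustrative)."""
    d = datetime.fromtimestamp(ts, tz=timezone.utc)
    return (
        f"files/{org}/{stream_type}/{stream}/"
        f"{d:%Y/%m/%d/%H}/{file_id}.parquet"
    )
```

Keys like this let a query over a time range enumerate only the date/hour prefixes that can contain matching data, before any Parquet metadata is even read.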
## Query Engine: DataFusion

```mermaid
flowchart LR
    SQL["SQL Query"] --> Parser["SQL Parser"]
    Parser --> LP["Logical Plan"]
    LP --> Opt["Optimizer\n(predicate pushdown,\nprojection pruning,\npartition pruning)"]
    Opt --> PP["Physical Plan"]
    PP --> Scan["Parquet Scanner\n(parallel, columnar)"]
    Scan --> S3_Q["Read from S3\n(only needed cols)"]
    S3_Q --> Exec["Vectorized Execution\n(Arrow batches)"]
    Exec --> Result["Query Result"]
    style Opt fill:#2e7d32,color:#fff
```
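Predicate pushdown pays off because Parquet stores min/max statistics per row group: the scanner can skip any row group whose value range cannot match the filter. The sketch below shows that pruning step in simplified form; the real work happens inside DataFusion's Parquet reader, and the `_timestamp` column name is just an example.

```python
def prune_row_groups(row_groups, column, lo, hi):
    """Keep only row groups whose [min, max] stats for `column`
    overlap the query range [lo, hi]; the rest are never read."""
    kept = []
    for rg in row_groups:
        stats = rg["stats"].get(column)
        if stats is None:
            kept.append(rg)  # no stats: must read it to be safe
            continue
        if stats["max"] < lo or stats["min"] > hi:
            continue         # range disjoint from predicate: skip
        kept.append(rg)
    return kept


# Three row groups: two with timestamp stats, one without.
groups = [
    {"id": 0, "stats": {"_timestamp": {"min": 100, "max": 199}}},
    {"id": 1, "stats": {"_timestamp": {"min": 200, "max": 299}}},
    {"id": 2, "stats": {}},
]
# Query: WHERE _timestamp BETWEEN 250 AND 400
survivors = prune_row_groups(groups, "_timestamp", 250, 400)
```

Because the ingested data is sorted by time (see the compaction step above), timestamp ranges across row groups barely overlap, so time-bounded queries skip most of the file.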
## HA Deployment Topology

```mermaid
flowchart TB
    LB["Load Balancer"]
    subgraph Routers["Router Pool"]
        R1["Router 1"]
        R2["Router 2"]
    end
    subgraph Ingesters["Ingester Pool"]
        I1["Ingester 1\n(WAL /data1)"]
        I2["Ingester 2\n(WAL /data2)"]
        I3["Ingester 3\n(WAL /data3)"]
    end
    subgraph Queriers["Querier Pool"]
        Q1["Querier 1\n(cache /cache1)"]
        Q2["Querier 2\n(cache /cache2)"]
    end
    C1["Compactor"]
    A1["AlertManager"]
    S3_HA["S3 / MinIO\n(shared storage)"]
    PG["PostgreSQL\n(metadata)"]
    LB --> Routers
    R1 --> Ingesters
    R2 --> Ingesters
    R1 --> Queriers
    R2 --> Queriers
    Ingesters --> S3_HA
    Queriers --> S3_HA
    C1 --> S3_HA
    A1 --> Queriers
    Routers --> PG
    Ingesters --> PG
    Queriers --> PG
    style S3_HA fill:#1565c0,color:#fff
```
## Sources