Architecture¶

Component topology, data model, deployment patterns, and technology choices for Monoscope.

Component Topology¶

flowchart TB
    subgraph Clients["Client Applications"]
        Apps["Your Apps\n(OTel SDKs)"]
        Browser["Browser\n(Session Replay SDK)"]
    end

    subgraph Ingestion["Ingestion Layer"]
        OTel["OTel Collector\n(gRPC :4317)"]
        API["Monoscope API\n(Haskell)"]
        Kafka["Kafka Buffer"]
    end

    subgraph Processing["Processing Layer"]
        Worker["Extraction Worker"]
        Agent["AI Agent\nScheduler"]
        LLM["LLM API"]
    end

    subgraph Storage["Storage Layer"]
        TF["TimeFusion\n(Rust + DataFusion)"]
        PG["PostgreSQL\n+ TimescaleDB (pg18)"]
        S3["S3 Bucket\n(Delta Lake / Parquet)"]
        Cache["Foyer Cache\n(512MB mem + 100GB disk)"]
    end

    subgraph UI["Presentation Layer"]
        Web["Web Dashboard\n(HTMX + Tailwind)"]
        Alerts["Alert Channels\n(Slack, Discord, PagerDuty)"]
    end

    Apps -->|"OTLP/gRPC"| OTel
    Browser -->|"Session events"| API
    OTel -->|"Bearer token"| API
    API --> Kafka --> Worker
    Worker --> TF
    Worker --> PG
    TF --> S3
    TF --> Cache
    Agent -->|"Query"| TF
    Agent --> LLM
    Agent --> Alerts
    Web -->|"SQL via pgwire"| TF
    Web --> PG

    style Storage fill:#2e7d32,color:#fff
    style Ingestion fill:#1565c0,color:#fff
    style Processing fill:#e65100,color:#fff

Technology Breakdown¶

Component	Language	Framework/Library	Purpose
Monoscope Backend	Haskell (80.5%)	Hasql, Lucid, HTMX, Eff	API, ingestion, processing, web UI
TimeFusion	Rust	DataFusion, pgwire, Delta Lake, Foyer	Time-series query engine with S3 storage
Metadata DB	PLpgSQL (2.4%)	PostgreSQL + TimescaleDB (pg18)	Project config, alerts, user management
Frontend	TypeScript (11.7%)	HTMX, Tailwind v4, DaisyUI v5, ECharts	Server-rendered UI with dynamic updates
SDKs	Multi-language	OTel SDK wrappers	Application instrumentation
Migrations	PLpgSQL	87KB of SQL migrations	Schema evolution

Haskell Backend Internals¶

hasql-interpolate for type-safe PostgreSQL queries (migrated from postgresql-simple in v0.5.0)
Lucid for HTML templating (server-rendered)
HTMX for dynamic page updates with morphing
Eff effect system for IO abstraction
Effectful.Time for time operations
Fourmolu for code formatting
GHC 9.12 compatible

Data Model¶

Telemetry Storage (TimeFusion / S3)¶

erDiagram
    PROJECT ||--o{ OTEL_EVENTS : "contains"
    OTEL_EVENTS {
        uuid id PK
        uuid project_id FK
        timestamptz timestamp
        date date_partition
        text name
        bigint duration_ns
        text kind
        text[] hashes
        text attributes
    }
    TRACE {
        uuid trace_id
        uuid span_id
        uuid parent_span_id
        text service_name
        text operation_name
    }
    LOG_ENTRY {
        uuid id PK
        timestamptz timestamp
        text severity
        text body
        text attributes
    }
    METRIC {
        text metric_name
        text metric_type
        float value
        text labels
    }

Metadata Storage (PostgreSQL + TimescaleDB)¶

Projects — tenant isolation, API keys, retention settings
Monitors — alerting rules, health checks, renotify intervals
Alerting state — active incidents, notification history
Users/Teams — authentication, authorization, audit logs
AI Agent configs — schedules, LLM prompts, report recipients

Deployment Topologies¶

Docker Compose (Development / Small Production)¶

flowchart LR
    subgraph Host["Docker Host"]
        M["monoscope\n:8080"]
        TF["timefusion\n:5432"]
        PG["postgres+timescaledb\n:5433"]
        K["kafka\n:9092"]
        S3["localstack/minio\nS3-compatible"]
    end

    M --> TF --> S3
    M --> PG
    M --> K

Self-Hosted Production¶

flowchart TB
    subgraph LB["Load Balancer"]
        Nginx["NGINX\n(TLS Termination)"]
    end

    subgraph K8s["Kubernetes Cluster"]
        subgraph Monoscope["Monoscope Pods"]
            M1["monoscope-api-1"]
            M2["monoscope-api-2"]
        end

        subgraph Workers["Background Workers"]
            W1["extraction-worker"]
            W2["ai-agent-scheduler"]
        end

        OTelCol["OTel Collector"]
    end

    subgraph Data["External Data"]
        S3Prod["AWS S3 / MinIO"]
        PGHA["PostgreSQL HA\n(Patroni / RDS)"]
        TFProd["TimeFusion\n(Deployed separately)"]
        KProd["Kafka Cluster"]
    end

    Nginx --> M1
    Nginx --> M2
    M1 --> S3Prod
    M1 --> PGHA
    M1 --> TFProd
    Workers --> S3Prod
    OTelCol --> M1

    style K8s fill:#1565c0,color:#fff
    style Data fill:#2e7d32,color:#fff

Monoscope Cloud (SaaS)¶

flowchart LR
    subgraph Cloud["Monoscope Cloud"]
        MC["Managed Monoscope\n+ TimeFusion"]
        MCS3["Monoscope S3"]
    end

    subgraph BYOS["Your S3 Bucket\n(optional)"]
        US3["Your S3\n(unlimited retention)"]
    end

    Apps["Your Apps"] -->|"OTLP"| Cloud
    MC --> MCS3
    MC -->|"BYOS mode"| US3

How It Works¶

How Monoscope ingests telemetry via OTLP, stores it in S3 through TimeFusion, and provides LLM-powered querying with AI agent scheduling.

Ingestion Pipeline¶

Monoscope uses the OpenTelemetry Protocol (OTLP) as its sole ingestion path for logs, traces, and metrics:

flowchart LR
    subgraph Apps["Your Applications"]
        SDK1["Go SDK"]
        SDK2["Python SDK"]
        SDK3["Node.js SDK"]
        SDK4["Java Agent"]
    end

    subgraph Collector["OTel Collector"]
        OTLP["OTLP Receiver\n(gRPC :4317)"]
    end

    subgraph Monoscope["Monoscope Backend"]
        API["Ingestion API\n(Haskell)"]
        Kafka["Kafka\n(Buffer)"]
        Worker["Extraction Worker"]
    end

    subgraph Storage["Data Layer"]
        TF["TimeFusion\n(Rust + DataFusion)"]
        PG["PostgreSQL\n+ TimescaleDB"]
        S3["S3 Bucket\n(Delta Lake)"]
    end

    Apps -->|"OTLP"| Collector
    Collector -->|"OTLP/gRPC\nBearer API_KEY"| API
    API --> Kafka --> Worker
    Worker --> TF --> S3
    Worker --> PG

OTLP Ingestion¶

All telemetry arrives via OTLP over gRPC on port 4317 with Bearer token authentication:

Logs — structured and unstructured log events
Traces — spans with parent-child relationships, duration, attributes
Metrics — Sum, Histogram, ExponentialHistogram, Summary types

The ingestion API normalizes all data into the unified otel_logs_and_spans table schema before passing it to TimeFusion.
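
As a rough illustration of the connection settings described above, the sketch below builds the standard OpenTelemetry exporter environment variables for an application SDK. The hostname and the plain `http://` scheme are assumptions, and whether the Bearer token is attached by the SDK or by the collector depends on the deployment; this version assumes the SDK sends it.

```python
from urllib.parse import quote


def otlp_exporter_env(collector_host: str, api_key: str) -> dict[str, str]:
    """Build standard OpenTelemetry exporter environment variables.

    collector_host is a hypothetical hostname; port 4317 and Bearer
    authentication come from the description above.
    """
    return {
        # gRPC endpoint on port 4317 (scheme is an assumption; use https with TLS).
        "OTEL_EXPORTER_OTLP_ENDPOINT": f"http://{collector_host}:4317",
        "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
        # Header values are percent-encoded, so the space becomes %20.
        "OTEL_EXPORTER_OTLP_HEADERS": f"Authorization={quote('Bearer ' + api_key)}",
    }
```

The returned mapping can be merged into a process environment before the instrumented application starts, for example `os.environ.update(otlp_exporter_env(host, key))`.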

TimeFusion Storage Engine¶

TimeFusion is Monoscope's purpose-built time-series database (a separate open-source project at monoscope-tech/timefusion):

flowchart TB
    subgraph TF["TimeFusion Engine (Rust)"]
        PGWire["PostgreSQL Wire Protocol\n(pgwire)"]
        DF["Apache DataFusion\n(Query Engine)"]
        Cache["Two-Tier Cache\n(Foyer)"]
        Mem["Memory Cache\n512MB default"]
        Disk["Disk Cache\n100GB default"]
        DL["Delta Lake\n(ACID Transactions)"]
    end

    subgraph S3["S3-Compatible Storage"]
        PQ["Parquet Files\n(Zstd compressed)"]
    end

    PGWire --> DF
    DF --> Cache
    Cache --> Mem
    Cache --> Disk
    DF --> DL --> PQ

    style TF fill:#1565c0,color:#fff
    style S3 fill:#2e7d32,color:#fff

Key Properties¶

Property	Detail
Wire protocol	PostgreSQL-compatible via pgwire — any Postgres client can query
Query engine	Apache DataFusion with vectorized execution
Storage format	Delta Lake with Parquet files on S3
Compression	Zstandard (10-20x reduction)
Throughput	500K+ events/sec per instance
ACID	Delta Lake transactions for consistency
Caching	Foyer adaptive cache: 512MB memory + 100GB disk, 7-day TTL, 95%+ hit rate
Distributed	DynamoDB-based locking for multi-instance deployments
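
To make the two-tier caching idea concrete, here is a minimal sketch of a memory tier in front of a larger disk tier with a shared TTL. It is not Foyer's API or eviction policy: the class name, the LRU eviction, the entry-count capacity, and the dict standing in for disk are all assumptions made for illustration.

```python
import time
from collections import OrderedDict


class TwoTierCache:
    """Memory tier in front of a larger 'disk' tier, both with a TTL.

    The disk tier is a plain dict here; a real one would persist to files.
    """

    def __init__(self, fetch, mem_capacity=2, ttl_seconds=7 * 24 * 3600,
                 clock=time.monotonic):
        self.fetch = fetch                # loads from object storage on a miss
        self.mem_capacity = mem_capacity  # entry count stands in for a byte budget
        self.ttl = ttl_seconds            # 7 days, matching the table above
        self.clock = clock
        self.mem = OrderedDict()          # key -> (stored_at, value), LRU order
        self.disk = {}                    # key -> (stored_at, value)

    def _fresh(self, entry):
        return entry is not None and self.clock() - entry[0] < self.ttl

    def _put_mem(self, key, entry):
        self.mem[key] = entry
        self.mem.move_to_end(key)
        if len(self.mem) > self.mem_capacity:
            self.mem.popitem(last=False)  # evict the least recently used entry

    def get(self, key):
        entry = self.mem.get(key)
        if self._fresh(entry):
            self.mem.move_to_end(key)
            return entry[1]
        entry = self.disk.get(key)
        if self._fresh(entry):
            self._put_mem(key, entry)     # promote to the memory tier
            return entry[1]
        # Miss in both tiers (or expired): reload and store in both.
        entry = (self.clock(), self.fetch(key))
        self.disk[key] = entry
        self._put_mem(key, entry)
        return entry[1]
```

A lookup only reaches the `fetch` callback (standing in for an S3 read) when both tiers miss or the cached entry has outlived the TTL.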

Main Table Schema¶

The otel_logs_and_spans table stores all telemetry in a unified schema:

Column	Type	Purpose
`name`	text	Span/log name (e.g., HTTP endpoint path)
`id`	uuid	Unique identifier
`project_id`	uuid	Tenant/project isolation
`timestamp`	timestamptz	Event timestamp
`date`	date	Partition key
`hashes`	text[]	Trace lookup hashes
`duration`	bigint	Span duration in nanoseconds
`attributes___http___response___status_code`	text	Flattened OTel attributes (triple underscore separator)
`attributes___user___id`	text	User identity propagation
`attributes___error___type`	text	Error classification
`kind`	text	Span kind (SERVER, CLIENT, INTERNAL, etc.)
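
The triple-underscore convention in the table above can be illustrated with a small flattening helper. This is a sketch of the naming scheme only, not the actual normalization code; the function name and the choice to stringify every value (because the columns are typed as text) are assumptions.

```python
def flatten_attributes(attrs: dict, prefix: str = "attributes") -> dict[str, str]:
    """Flatten nested or dotted OTel attributes into triple-underscore columns."""
    columns: dict[str, str] = {}
    for key, value in attrs.items():
        # Dotted keys such as "http.response.status_code" become path segments.
        path = "___".join([prefix, *key.split(".")])
        if isinstance(value, dict):
            columns.update(flatten_attributes(value, path))
        else:
            # The schema above types these columns as text.
            columns[path] = str(value)
    return columns
```

For example, `{"http.response.status_code": 500}` maps to the column `attributes___http___response___status_code` with the text value `"500"`.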

Natural Language Query Engine¶

Monoscope integrates LLMs to translate plain-English queries into SQL executed against TimeFusion:

User input — "Show me all 500 errors from the payments service yesterday"
LLM translation — converts to a parameterized SQL query targeting otel_logs_and_spans
Query execution — TimeFusion executes the query with its vectorized DataFusion engine
Result visualization — charts, log tables, and trace waterfalls rendered in the UI
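
As an illustration of the translation step, the sketch below shows the kind of parameterized query the example input might produce. The SQL is an assumption about what an LLM could emit, not captured output; in particular, `resource___service___name` is a hypothetical column name that does not appear in the schema table in this document.

```python
from datetime import date, timedelta


def errors_yesterday_query(project_id: str, service: str, today: date):
    """Return (sql, params) for "all 500 errors from <service> yesterday"."""
    sql = (
        "SELECT timestamp, name, duration "
        "FROM otel_logs_and_spans "
        "WHERE project_id = $1 "
        "AND date = $2 "  # partition key narrows the scan to one day
        "AND attributes___http___response___status_code = $3 "
        "AND resource___service___name = $4 "  # hypothetical column name
        "ORDER BY timestamp DESC"
    )
    # The status code is passed as text because the column is typed as text.
    params = (project_id, today - timedelta(days=1), "500", service)
    return sql, params
```

Keeping user-derived values in `params` rather than in the SQL string means the generated text never needs to embed the service name or dates directly.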

AI Agent Scheduler¶

Scheduled agents run LLM-powered analysis on telemetry data:

flowchart LR
    Scheduler["Agent Scheduler\n(Haskell)"]
    LLM["LLM API"]
    Data["TimeFusion\nQuery"]
    Detect["Anomaly Detection"]
    Report["Email Report"]
    Alert["Alert Channels"]

    Scheduler -->|"Query + Analyze"| Data
    Data --> LLM
    LLM --> Detect
    Detect -->|"Anomaly found"| Report
    Detect -->|"Critical"| Alert

Configurable intervals: hourly, daily, weekly
Anomaly detection: volume spikes, error rate changes, latency degradation
Email reports: summary of findings delivered to configured recipients
Alerting: critical findings routed to Slack, Discord, PagerDuty, or webhooks
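
The routing at the end of the flow above can be sketched as follows. The `Finding` type and the channel keys are hypothetical, and the sketch assumes that critical findings appear in the email report as well as in the alert channels, which the diagram does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    kind: str          # e.g. "volume_spike", "error_rate", "latency"
    summary: str
    critical: bool = False


def route_findings(findings: list[Finding]) -> dict[str, list[Finding]]:
    """Split findings into the email report and the alert channels."""
    routed: dict[str, list[Finding]] = {"email_report": [], "alert_channels": []}
    for finding in findings:
        # Every detected anomaly is summarized in the email report.
        routed["email_report"].append(finding)
        # Critical findings additionally go to Slack, Discord, PagerDuty, or webhooks.
        if finding.critical:
            routed["alert_channels"].append(finding)
    return routed
```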

Error Fingerprinting¶

Monoscope uses a two-tier fingerprinting system, followed by a rollup of known framework errors:

Jaccard similarity — groups errors with similar stack traces using set-based comparison
Embedding-based merging — semantically similar errors are merged even with different text
Framework-error rollup — known framework errors (e.g., Django Http404, Express ECONNREFUSED) are automatically categorized
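
The first tier can be illustrated with a short Jaccard-similarity sketch over stack-frame sets. The greedy grouping strategy, the function names, and the 0.7 default threshold are assumptions for illustration; the source does not specify how groups are formed or what cutoff is used.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B| (1.0 for two empty sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_errors(traces: list[list[str]], threshold: float = 0.7) -> list[list[int]]:
    """Greedily assign each stack trace to the first group whose
    representative (its first member) is similar enough."""
    groups: list[list[int]] = []
    for i, frames in enumerate(traces):
        for group in groups:
            if jaccard(set(frames), set(traces[group[0]])) >= threshold:
                group.append(i)
                break
        else:
            # No existing group matched, so this trace starts a new one.
            groups.append([i])
    return groups
```

Because the comparison is set-based, frame order and repeated frames do not affect the score; two traces that share most of their frames land in the same group even if the failing line differs.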

Session Replay¶

Browser session recordings are synced with backend telemetry:

Browser SDK captures DOM mutations, user interactions, and network requests
Events are batched and sent to Monoscope's ingestion API
Session merging worker combines replay events with backend spans using correlation IDs
Merged sessions are stored in S3 and viewable in the UI alongside traces and logs
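
The merging step can be sketched as a join on a shared correlation ID. The `session_id` field name and the dictionary shapes are hypothetical; the source only states that correlation IDs link replay events to backend spans.

```python
from collections import defaultdict


def merge_sessions(replay_events: list[dict], spans: list[dict]) -> dict[str, dict]:
    """Group replay events and backend spans under a shared correlation ID."""
    sessions: dict[str, dict] = defaultdict(lambda: {"replay": [], "spans": []})
    for event in replay_events:
        sessions[event["session_id"]]["replay"].append(event)
    for span in spans:
        # Spans without a matching recording are left out of the merged view.
        session_id = span.get("session_id")
        if session_id in sessions:
            sessions[session_id]["spans"].append(span)
    return dict(sessions)
```

Each merged entry then holds everything needed to show a recording next to the traces it triggered.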