Architecture

Component topology, data model, deployment patterns, and technology choices for Monoscope.

Component Topology

flowchart TB
    subgraph Clients["Client Applications"]
        Apps["Your Apps\n(OTel SDKs)"]
        Browser["Browser\n(Session Replay SDK)"]
    end

    subgraph Ingestion["Ingestion Layer"]
        OTel["OTel Collector\n(gRPC :4317)"]
        API["Monoscope API\n(Haskell)"]
        Kafka["Kafka Buffer"]
    end

    subgraph Processing["Processing Layer"]
        Worker["Extraction Worker"]
        Agent["AI Agent\nScheduler"]
        LLM["LLM API"]
    end

    subgraph Storage["Storage Layer"]
        TF["TimeFusion\n(Rust + DataFusion)"]
        PG["PostgreSQL\n+ TimescaleDB (pg18)"]
        S3["S3 Bucket\n(Delta Lake / Parquet)"]
        Cache["Foyer Cache\n(512MB mem + 100GB disk)"]
    end

    subgraph UI["Presentation Layer"]
        Web["Web Dashboard\n(HTMX + Tailwind)"]
        Alerts["Alert Channels\n(Slack, Discord, PagerDuty)"]
    end

    Apps -->|"OTLP/gRPC"| OTel
    Browser -->|"Session events"| API
    OTel -->|"Bearer token"| API
    API --> Kafka --> Worker
    Worker --> TF
    Worker --> PG
    TF --> S3
    TF --> Cache
    Agent -->|"Query"| TF
    Agent --> LLM
    Agent --> Alerts
    Web -->|"SQL via pgwire"| TF
    Web --> PG

    style Storage fill:#2e7d32,color:#fff
    style Ingestion fill:#1565c0,color:#fff
    style Processing fill:#e65100,color:#fff

Technology Breakdown

| Component | Language | Framework/Library | Purpose |
|---|---|---|---|
| Monoscope Backend | Haskell (80.5%) | Hasql, Lucid, HTMX, Eff | API, ingestion, processing, web UI |
| TimeFusion | Rust | DataFusion, pgwire, Delta Lake, Foyer | Time-series query engine with S3 storage |
| Metadata DB | PLpgSQL (2.4%) | PostgreSQL + TimescaleDB (pg18) | Project config, alerts, user management |
| Frontend | TypeScript (11.7%) | HTMX, Tailwind v4, DaisyUI v5, ECharts | Server-rendered UI with dynamic updates |
| SDKs | Multi-language | OTel SDK wrappers | Application instrumentation |
| Migrations | PLpgSQL | 87KB of SQL migrations | Schema evolution |

Haskell Backend Internals

  • hasql-interpolate for type-safe PostgreSQL queries (migrated from postgresql-simple in v0.5.0)
  • Lucid for HTML templating (server-rendered)
  • HTMX for dynamic page updates with morphing
  • Eff effect system for IO abstraction
  • Effectful.Time for time operations
  • Fourmolu for code formatting
  • GHC 9.12 compatible

Data Model

Telemetry Storage (TimeFusion / S3)

erDiagram
    PROJECT ||--o{ OTEL_EVENTS : "contains"
    OTEL_EVENTS {
        uuid id PK
        uuid project_id FK
        timestamptz timestamp
        date date_partition
        text name
        bigint duration_ns
        text kind
        text[] hashes
        text attributes
    }
    TRACE {
        uuid trace_id
        uuid span_id
        uuid parent_span_id
        text service_name
        text operation_name
    }
    LOG_ENTRY {
        uuid id PK
        timestamptz timestamp
        text severity
        text body
        text attributes
    }
    METRIC {
        text metric_name
        text metric_type
        float value
        text labels
    }

Metadata Storage (PostgreSQL + TimescaleDB)

  • Projects — tenant isolation, API keys, retention settings
  • Monitors — alerting rules, health checks, renotify intervals
  • Alerting state — active incidents, notification history
  • Users/Teams — authentication, authorization, audit logs
  • AI Agent configs — schedules, LLM prompts, report recipients

Deployment Topologies

Docker Compose (Development / Small Production)

flowchart LR
    subgraph Host["Docker Host"]
        M["monoscope\n:8080"]
        TF["timefusion\n:5432"]
        PG["postgres+timescaledb\n:5433"]
        K["kafka\n:9092"]
        S3["localstack/minio\nS3-compatible"]
    end

    M --> TF --> S3
    M --> PG
    M --> K

Self-Hosted Production

flowchart TB
    subgraph LB["Load Balancer"]
        Nginx["NGINX\n(TLS Termination)"]
    end

    subgraph K8s["Kubernetes Cluster"]
        subgraph Monoscope["Monoscope Pods"]
            M1["monoscope-api-1"]
            M2["monoscope-api-2"]
        end

        subgraph Workers["Background Workers"]
            W1["extraction-worker"]
            W2["ai-agent-scheduler"]
        end

        OTelCol["OTel Collector"]
    end

    subgraph Data["External Data"]
        S3Prod["AWS S3 / MinIO"]
        PGHA["PostgreSQL HA\n(Patroni / RDS)"]
        TFProd["TimeFusion\n(Deployed separately)"]
        KProd["Kafka Cluster"]
    end

    Nginx --> M1
    Nginx --> M2
    M1 --> S3Prod
    M1 --> PGHA
    M1 --> TFProd
    Workers --> S3Prod
    OTelCol --> M1

    style K8s fill:#1565c0,color:#fff
    style Data fill:#2e7d32,color:#fff

Monoscope Cloud (SaaS)

flowchart LR
    subgraph Cloud["Monoscope Cloud"]
        MC["Managed Monoscope\n+ TimeFusion"]
        MCS3["Monoscope S3"]
    end

    subgraph BYOS["Your S3 Bucket\n(optional)"]
        US3["Your S3\n(unlimited retention)"]
    end

    Apps["Your Apps"] -->|"OTLP"| Cloud
    MC --> MCS3
    MC -->|"BYOS mode"| US3


How It Works

How Monoscope ingests telemetry via OTLP, stores it in S3 through TimeFusion, and provides LLM-powered querying with AI agent scheduling.

Ingestion Pipeline

Monoscope uses OpenTelemetry Protocol (OTLP) as its sole ingestion path:

flowchart LR
    subgraph Apps["Your Applications"]
        SDK1["Go SDK"]
        SDK2["Python SDK"]
        SDK3["Node.js SDK"]
        SDK4["Java Agent"]
    end

    subgraph Collector["OTel Collector"]
        OTLP["OTLP Receiver\n(gRPC :4317)"]
    end

    subgraph Monoscope["Monoscope Backend"]
        API["Ingestion API\n(Haskell)"]
        Kafka["Kafka\n(Buffer)"]
        Worker["Extraction Worker"]
    end

    subgraph Storage["Data Layer"]
        TF["TimeFusion\n(Rust + DataFusion)"]
        PG["PostgreSQL\n+ TimescaleDB"]
        S3["S3 Bucket\n(Delta Lake)"]
    end

    Apps -->|"OTLP"| Collector
    Collector -->|"OTLP/gRPC\nBearer API_KEY"| API
    API --> Kafka --> Worker
    Worker --> TF --> S3
    Worker --> PG

OTLP Ingestion

All telemetry arrives via OTLP over gRPC on port 4317, authenticated with a Bearer token. Three signal types are accepted:

  • Logs — structured and unstructured log events
  • Traces — spans with parent-child relationships, duration, attributes
  • Metrics — Sum, Histogram, ExponentialHistogram, Summary types

The ingestion API normalizes all data into the unified otel_logs_and_spans table schema before it is written to TimeFusion.

TimeFusion Storage Engine

TimeFusion is Monoscope's purpose-built time-series database (separate open-source project at monoscope-tech/timefusion):

flowchart TB
    subgraph TF["TimeFusion Engine (Rust)"]
        PGWire["PostgreSQL Wire Protocol\n(pgwire)"]
        DF["Apache DataFusion\n(Query Engine)"]
        Cache["Two-Tier Cache\n(Foyer)"]
        Mem["Memory Cache\n512MB default"]
        Disk["Disk Cache\n100GB default"]
        DL["Delta Lake\n(ACID Transactions)"]
    end

    subgraph S3["S3-Compatible Storage"]
        PQ["Parquet Files\n(Zstd compressed)"]
    end

    PGWire --> DF
    DF --> Cache
    Cache --> Mem
    Cache --> Disk
    DF --> DL --> PQ

    style TF fill:#1565c0,color:#fff
    style S3 fill:#2e7d32,color:#fff

Key Properties

| Property | Detail |
|---|---|
| Wire protocol | PostgreSQL-compatible via pgwire; any Postgres client can query |
| Query engine | Apache DataFusion with vectorized execution |
| Storage format | Delta Lake with Parquet files on S3 |
| Compression | Zstandard (10-20x reduction) |
| Throughput | 500K+ events/sec per instance |
| ACID | Delta Lake transactions for consistency |
| Caching | Foyer adaptive: 512MB memory + 100GB disk, 7-day TTL, 95%+ hit rate |
| Distributed | DynamoDB-based locking for multi-instance deployments |
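
The two-tier caching row above can be pictured as a read-through lookup: check memory, then disk, and only then fetch from S3. The following is a minimal sketch of that idea, not Foyer's API. The class name `TwoTierCache`, the `fetch_from_s3` callback, and the use of plain dictionaries for both tiers are assumptions for illustration, and eviction, size limits, and the 7-day TTL are left out.

```python
from typing import Callable, Dict


class TwoTierCache:
    """Illustrative read-through cache: memory tier, then disk tier, then S3."""

    def __init__(self, fetch_from_s3: Callable[[str], bytes]) -> None:
        self.memory: Dict[str, bytes] = {}  # stand-in for the in-memory tier
        self.disk: Dict[str, bytes] = {}    # stand-in for the on-disk tier
        self.fetch_from_s3 = fetch_from_s3  # hypothetical loader for a Parquet object

    def get(self, key: str) -> bytes:
        # Fast path: the object is already in memory.
        if key in self.memory:
            return self.memory[key]
        # Second tier: found on disk, so promote it to memory.
        if key in self.disk:
            value = self.disk[key]
            self.memory[key] = value
            return value
        # Miss in both tiers: fetch from object storage and fill both.
        value = self.fetch_from_s3(key)
        self.disk[key] = value
        self.memory[key] = value
        return value
```

With this shape, repeated reads of the same Parquet object hit S3 only once, which is the effect the hit-rate figure in the table describes.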

Main Table Schema

The otel_logs_and_spans table stores all telemetry in a unified schema:

| Column | Type | Purpose |
|---|---|---|
| name | text | Span/log name (e.g., HTTP endpoint path) |
| id | uuid | Unique identifier |
| project_id | uuid | Tenant/project isolation |
| timestamp | timestamptz | Event timestamp |
| date | date | Partition key |
| hashes | text[] | Trace lookup hashes |
| duration | bigint | Span duration in nanoseconds |
| attributes___http___response___status_code | text | Flattened OTel attributes (triple underscore separator) |
| attributes___user___id | text | User identity propagation |
| attributes___error___type | text | Error classification |
| kind | text | Span kind (SERVER, CLIENT, INTERNAL, etc.) |
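
To make the triple-underscore convention concrete, here is a minimal sketch of how dotted OTel attribute keys could be flattened into column names like those in the table. The function name `flatten_attributes` and the handling of nested dictionaries are assumptions for illustration; the actual normalization happens in the Haskell backend and may differ.

```python
from typing import Any, Dict

SEPARATOR = "___"  # triple underscore, as used in the column names above


def flatten_attributes(attrs: Dict[str, Any], prefix: str = "attributes") -> Dict[str, str]:
    """Map OTel attributes to flattened column names with text values."""
    columns: Dict[str, str] = {}
    for key, value in attrs.items():
        # OTel keys are dotted, e.g. "http.response.status_code".
        column = prefix + SEPARATOR + key.replace(".", SEPARATOR)
        if isinstance(value, dict):
            # Nested maps are flattened recursively under the same prefix.
            columns.update(flatten_attributes(value, column))
        else:
            # The schema stores attribute columns as text.
            columns[column] = str(value)
    return columns
```

For example, `flatten_attributes({"http.response.status_code": 500})` yields a single `attributes___http___response___status_code` entry with the value `"500"`.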

Natural Language Query Engine

Monoscope integrates LLMs to translate plain-English queries into SQL executed against TimeFusion:

  1. User input — "Show me all 500 errors from the payments service yesterday"
  2. LLM translation — converts to a parameterized SQL query targeting otel_logs_and_spans
  3. Query execution — TimeFusion executes with vectorized DataFusion engine
  4. Result visualization — charts, log tables, and trace waterfalls rendered in the UI
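
As an illustration of step 2, the sketch below builds the kind of parameterized query the LLM might produce for the example prompt. It is an assumption-laden sketch, not actual Monoscope output: the function name is hypothetical, the `resource___service___name` column is not listed in the schema above and is assumed here, and "500 errors" is read as HTTP status 500.

```python
from datetime import date, timedelta
from typing import List, Tuple


def errors_for_service_yesterday(
    project_id: str, service: str, today: date
) -> Tuple[str, List[str]]:
    """Return (sql, params) for 'all 500 errors from <service> yesterday'."""
    yesterday = today - timedelta(days=1)
    sql = (
        'SELECT "timestamp", name, duration '
        "FROM otel_logs_and_spans "
        "WHERE project_id = $1 "
        'AND "date" = $2 '                        # partition key narrows the scan
        "AND resource___service___name = $3 "     # hypothetical column name
        "AND attributes___http___response___status_code = $4 "
        'ORDER BY "timestamp" DESC '
        "LIMIT 100"
    )
    # Values travel separately from the SQL text, so user input is never
    # spliced into the query string.
    params = [project_id, yesterday.isoformat(), service, "500"]
    return sql, params
```

Filtering on both `project_id` and the `date` partition key keeps the query scoped to one tenant and one day of Parquet files.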

AI Agent Scheduler

Scheduled agents run LLM-powered analysis on telemetry data:

flowchart LR
    Scheduler["Agent Scheduler\n(Haskell)"]
    LLM["LLM API"]
    Data["TimeFusion\nQuery"]
    Detect["Anomaly Detection"]
    Report["Email Report"]
    Alert["Alert Channels"]

    Scheduler -->|"Query + Analyze"| Data
    Data --> LLM
    LLM --> Detect
    Detect -->|"Anomaly found"| Report
    Detect -->|"Critical"| Alert

  • Configurable intervals: hourly, daily, weekly
  • Anomaly detection: volume spikes, error rate changes, latency degradation
  • Email reports: summary of findings delivered to configured recipients
  • Alerting: critical findings routed to Slack, Discord, PagerDuty, or webhooks
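
The scheduling and routing behaviour described above can be sketched roughly as follows. The names `INTERVALS`, `Finding`, `next_run`, and `route` are hypothetical, and the sketch assumes that critical findings go to alert channels in addition to the email report; the real scheduler is written in Haskell and may behave differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

# The three configurable intervals named in the list above.
INTERVALS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


@dataclass
class Finding:
    kind: str       # e.g. "volume_spike", "error_rate_change", "latency_degradation"
    critical: bool  # whether the finding should also reach alert channels


def next_run(last_run: datetime, interval: str) -> datetime:
    """Compute when an agent with the given interval should run next."""
    return last_run + INTERVALS[interval]


def route(finding: Finding) -> List[str]:
    """Every anomaly lands in the email report; critical ones also alert."""
    destinations = ["email_report"]
    if finding.critical:
        destinations.append("alert_channels")
    return destinations
```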

Error Fingerprinting

Monoscope groups errors using two similarity tiers plus a framework-error rollup:

  1. Jaccard similarity — groups errors with similar stack traces using set-based comparison
  2. Embedding-based merging — semantically similar errors are merged even with different text
  3. Framework-error rollup — known framework errors (e.g., Django Http404, Express ECONNREFUSED) are automatically categorized
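
The first tier can be illustrated with a small sketch of Jaccard similarity over stack-trace frames. The greedy grouping strategy, the 0.7 threshold, and the function names are assumptions chosen for illustration; the source does not specify how Monoscope sets its threshold or forms groups.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0  # two empty traces are treated as identical
    return len(a & b) / len(a | b)


def group_by_similarity(traces: List[List[str]], threshold: float = 0.7) -> List[List[int]]:
    """Greedily assign each trace to the first group whose representative
    (the group's first trace) is at least `threshold` similar."""
    groups: List[List[int]] = []
    representatives: List[Set[str]] = []
    for index, frames in enumerate(traces):
        frame_set = set(frames)
        for group, representative in zip(groups, representatives):
            if jaccard(frame_set, representative) >= threshold:
                group.append(index)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([index])
            representatives.append(frame_set)
    return groups
```

Two traces that share three of four distinct frames score 0.75 and fall into the same group at this threshold, while traces with no frames in common score 0 and stay apart.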

Session Replay

Browser session recordings synced with backend telemetry:

  1. Browser SDK captures DOM mutations, user interactions, and network requests
  2. Events are batched and sent to Monoscope's ingestion API
  3. Session merging worker combines replay events with backend spans using correlation IDs
  4. Merged sessions are stored in S3 and viewable in the UI alongside traces and logs
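
Step 3 can be sketched as a join on a shared correlation ID. This is a minimal sketch under stated assumptions: the field names `correlation_id` and `timestamp`, the dictionary-based event shape, and the function name `merge_sessions` are all hypothetical, and the real worker may correlate on different keys.

```python
from collections import defaultdict
from typing import Dict, List


def merge_sessions(replay_events: List[dict], spans: List[dict]) -> Dict[str, dict]:
    """Group replay events and backend spans that share a correlation ID."""
    merged: Dict[str, dict] = defaultdict(lambda: {"replay_events": [], "spans": []})
    for event in replay_events:
        merged[event["correlation_id"]]["replay_events"].append(event)
    for span in spans:
        correlation_id = span.get("correlation_id")
        # Only attach spans that belong to a recorded browser session.
        if correlation_id in merged:
            merged[correlation_id]["spans"].append(span)
    # Order both streams by time so the UI can show them side by side.
    for session in merged.values():
        session["replay_events"].sort(key=lambda e: e["timestamp"])
        session["spans"].sort(key=lambda s: s["timestamp"])
    return dict(merged)
```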