
Security

Security model for Coroot covering node agent eBPF permissions, ClickHouse access control, API authentication, and data privacy. See also: observability/coroot/index, observability/coroot/architecture, observability/coroot/operations.

Security Architecture Overview

Coroot consists of three components with distinct security requirements:

  1. Coroot Server — Central API server and UI, stores configuration in PostgreSQL/SQLite and telemetry in ClickHouse.
  2. Coroot Node Agent — DaemonSet deployed to every Kubernetes node. Uses eBPF to collect traces, profiles, and network metrics.
  3. ClickHouse — External time-series store for telemetry data (traces, logs, profiles).

flowchart TD
    subgraph "Kubernetes Cluster"
        subgraph "Each Node"
            Agent[Coroot Node Agent<br/>DaemonSet]
            Workloads[Application Pods]
        end
        Coroot[Coroot Server<br/>Deployment]
        Operator[Coroot Operator<br/>Optional]
    end

    subgraph "Data Stores"
        PG[(PostgreSQL / SQLite<br/>Configuration)]
        CH[(ClickHouse<br/>Traces / Logs / Profiles)]
        Prom[(Prometheus<br/>Metrics)]
    end

    subgraph "Users"
        Browser[Browser / API Client]
    end

    Agent -->|eBPF traces / profiles| CH
    Agent -->|Prometheus remote_write| Prom
    Coroot -->|Read queries| CH
    Coroot -->|Read queries| Prom
    Coroot -->|Config state| PG
    Operator -->|Manages| Coroot
    Operator -->|Manages| Agent
    Browser -->|HTTP / API key| Coroot

Node Agent Security

eBPF Kernel Capabilities

The Coroot node agent uses eBPF for zero-instrumentation distributed tracing and continuous profiling. eBPF requires elevated kernel privileges.

Required capabilities and permissions:

| Requirement | Details |
| --- | --- |
| Privileged mode | Agent runs with the --privileged flag (Docker) or the equivalent Kubernetes securityContext (privileged: true) |
| Host PID namespace | --pid host (Docker) or hostPID: true (Kubernetes), required to observe processes in all PID namespaces on the node |
| Filesystem mounts | /sys/kernel/debug (rw) for eBPF program attachment; /sys/fs/cgroup (ro) for cgroup discovery |
| Linux kernel | 4.16+ minimum for eBPF system call tracing; 5.2+ recommended for full feature support |

Privileged Container Requirement

The node agent currently requires a privileged container because eBPF system calls (bpf(), perf_event_open()) need CAP_SYS_ADMIN on kernels older than 5.8. On kernels 5.8+, CAP_BPF and CAP_PERFMON can replace CAP_SYS_ADMIN for BPF operations, but Coroot's agent has not fully transitioned to fine-grained capabilities.

Agent Resource Isolation

The agent supports resource limits to prevent it from consuming excessive node resources:

# Kubernetes Operator configuration
nodeAgent:
  resources:
    requests:
      cpu: 100m
      memory: 200Mi
    limits:
      cpu: 500m
      memory: 1Gi

Agent Configuration Options

# Coroot Operator: node agent configuration
nodeAgent:
  ebpfTracer:
    enabled: true
    sampling: "1.0"        # 1.0 = 100% sampling
  ebpfProfiler:
    enabled: true           # CPU profiling via eBPF
  logCollector:
    collectLogBasedMetrics: true
    collectLogEntries: true  # Store raw logs in ClickHouse
  trackPublicNetworks: ["0.0.0.0/0"]  # Network tracking scope

Disabling eBPF Profiling Per Process

Individual processes can opt out of eBPF profiling by setting an environment variable:

COROOT_EBPF_PROFILING=disabled

This is useful for security-sensitive workloads where even observability instrumentation is undesirable.
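
As a minimal sketch, the opt-out can be set in the workload's own pod template. Everything except the COROOT_EBPF_PROFILING entry is a hypothetical placeholder (the Deployment name, labels, and image are illustrative only):

```yaml
# Hypothetical Deployment; only the env entry comes from the text above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments-api            # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: payments-api
  template:
    metadata:
      labels:
        app: payments-api
    spec:
      containers:
        - name: app
          image: registry.example.com/payments-api:1.0.0   # illustrative image
          env:
            # Opts this container's processes out of eBPF profiling
            - name: COROOT_EBPF_PROFILING
              value: "disabled"
```

The variable is set on the application container, not on the node agent, because each process opts out individually.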

ClickHouse Access Control

Connection Security

Coroot connects to ClickHouse using configured credentials. TLS is recommended for the connection:

# Coroot configuration
global_clickhouse:
  address: "clickhouse.coroot.svc:9000"
  user: "coroot"
  password: "${CLICKHOUSE_PASSWORD}"
  database: "coroot"
  tls_enable: true
  tls_skip_verify: false
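
The ${CLICKHOUSE_PASSWORD} placeholder suggests the password is taken from the container environment. A minimal sketch of supplying it from a Kubernetes Secret, assuming that is how the placeholder is resolved; the Secret name, namespace, and key are hypothetical:

```yaml
# Hypothetical names: coroot-clickhouse, the coroot namespace, and the
# "password" key are illustrative, not taken from Coroot's documentation.
apiVersion: v1
kind: Secret
metadata:
  name: coroot-clickhouse
  namespace: coroot
type: Opaque
stringData:
  password: "replace-me"        # placeholder; never commit real credentials
---
# Fragment of the Coroot server container spec (not a complete manifest)
env:
  - name: CLICKHOUSE_PASSWORD
    valueFrom:
      secretKeyRef:
        name: coroot-clickhouse
        key: password
```

This keeps the credential out of the configuration file, in line with the hardening checklist's recommendation to store passwords in Kubernetes secrets or Vault.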

ClickHouse Hardening

Create a dedicated ClickHouse user for Coroot with the minimum required permissions:

  • SELECT and INSERT on the Coroot database.
  • CREATE TABLE for automatic schema provisioning.
  • No access to other databases.

Data Retention

TTL policies limit data retention at the ClickHouse table level. These are applied during table creation:

# Coroot configuration
traces:
  ttl: 7d
logs:
  ttl: 7d
profiles:
  ttl: 7d

TTL Immutability

TTL values are applied only when Coroot creates the tables; changing them later does not currently affect tables that already exist. To change retention for existing tables, alter the table TTL manually in ClickHouse (ALTER TABLE ... MODIFY TTL) or recreate the tables.

ClickHouse Schema

Coroot uses dedicated ClickHouse tables for each telemetry signal:

  • Trace spans are stored with DoubleDelta and ZSTD compression for efficient time-series storage.
  • Log entries are stored with an OTel-compatible schema (timestamp, body, severity_text, trace_id).
  • Profiling data is stored with stack trace aggregation.

Coroot Server Authentication

Authentication Modes

Coroot supports three authentication modes:

# Coroot configuration
auth:
  anonymous_role: ""                    # Empty = authentication required
  bootstrap_admin_password: "changeme"  # Initial admin password; replace this placeholder before deploying

| Mode | Configuration | Behavior |
| --- | --- | --- |
| Anonymous | anonymous_role: Admin (or Editor/Viewer) | No login required; all users get the specified role |
| Password-based | anonymous_role: "" + bootstrap_admin_password | Users must log in; admin can create additional users |
| SSO (Enterprise) | OIDC/SAML integration | Delegated to external identity provider |

Project-Level API Keys

Projects are the primary isolation boundary in Coroot. Each project has dedicated API keys used by node agents:

projects:
  - name: production
    api_keys:
      - key: "uuid-or-random-string"
        description: "Production cluster agent"
  - name: staging
    api_keys:
      - key: "another-uuid"
        description: "Staging cluster agent"

API keys are scoped to a project. An agent authenticating with a production key cannot write data to the staging project.

Role-Based Access

Coroot defines three roles:

| Role | Capabilities |
| --- | --- |
| Admin | Full access: manage projects, users, API keys, configuration |
| Editor | View and modify dashboards, investigate incidents |
| Viewer | Read-only access to dashboards and investigation results |

Prometheus Integration Security

Coroot reads metrics from a Prometheus server. The connection supports authentication and TLS:

global_prometheus:
  url: "https://prometheus.monitoring.svc:9090"
  refresh_interval: 15s
  tls_skip_verify: false
  user: "coroot-reader"
  password: "${PROM_PASSWORD}"
  custom_headers:
    X-Custom-Header: value

For write access (remote_write), configure the remote_write_url:

global_prometheus:
  remote_write_url: "https://prometheus.monitoring.svc:9090/api/v1/write"

Data Privacy Considerations

Sensitive Data Exposure

eBPF-based tracing captures request payloads and response data at the network level. This can inadvertently capture sensitive information:

  • PII in log entries: Application logs collected by the log collector may contain user data.
  • Database queries: Traced SQL queries may include parameter values with sensitive data.
  • HTTP headers: Network traces capture request/response headers including cookies and auth tokens.

Mitigation Strategies

| Risk | Mitigation |
| --- | --- |
| PII in logs | Set collectLogEntries: false; keep only log-based metrics |
| Sensitive traces and profiles | Reduce the trace sampling rate; set COROOT_EBPF_PROFILING=disabled on sensitive pods to opt them out of profiling (this variable does not affect tracing) |
| Data retention | Set short TTL values (default 7d) to limit data accumulation |
| Cross-project leakage | Use separate projects with distinct API keys per environment |
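
The first two mitigations can be sketched with the operator keys shown in the Agent Configuration Options section. The values are illustrative, and reading "0.1" as 10% sampling is an assumption made by analogy with the "1.0 = 100%" comment in that example:

```yaml
# Privacy-oriented sketch; keys are reused from the operator example above,
# values are illustrative rather than recommended defaults.
nodeAgent:
  ebpfTracer:
    enabled: true
    sampling: "0.1"              # assumed to mean 10% of traces are kept
  logCollector:
    collectLogBasedMetrics: true # keep aggregate metrics derived from logs
    collectLogEntries: false     # do not store raw log lines in ClickHouse
```

Lower sampling reduces how much payload data is stored but does not remove sensitive values from the traces that are kept.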

Kubernetes Security Context

The node agent requires privileged access, but the Coroot server should run with a restricted security context:

# Coroot server (restricted)
securityContext:
  runAsNonRoot: true
  runAsUser: 65532
  readOnlyRootFilesystem: true
  capabilities:
    drop: ["ALL"]

# Node agent (privileged - required for eBPF)
# hostPID is a Pod spec field, not part of securityContext
hostPID: true
# Container-level securityContext
securityContext:
  privileged: true

Network Policies

Restrict network flows to least-privilege:

| Direction | Source | Destination | Port | Purpose |
| --- | --- | --- | --- | --- |
| Ingress | Coroot server | Node agent | 8080 | Agent API |
| Egress | Node agent | ClickHouse | 9000 | Trace/log/profile delivery |
| Egress | Node agent | Prometheus | 9090 | Remote write metrics |
| Ingress | Browser | Coroot server | 8080 | UI access |
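
The two agent egress rows could be expressed as a Kubernetes NetworkPolicy along these lines. This is a sketch under stated assumptions: the namespaces (coroot, monitoring) and all pod labels are hypothetical and must be matched to the actual deployment, and the DNS rule is an addition not listed in the table:

```yaml
# Hypothetical NetworkPolicy; namespaces and labels are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: coroot-node-agent-egress
  namespace: coroot
spec:
  podSelector:
    matchLabels:
      app: coroot-node-agent          # assumed agent pod label
  policyTypes:
    - Egress
  egress:
    # Trace/log/profile delivery to ClickHouse (same namespace assumed)
    - to:
        - podSelector:
            matchLabels:
              app: clickhouse         # assumed ClickHouse pod label
      ports:
        - protocol: TCP
          port: 9000
    # Remote write to Prometheus in the monitoring namespace
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
          podSelector:
            matchLabels:
              app.kubernetes.io/name: prometheus   # assumed label
      ports:
        - protocol: TCP
          port: 9090
    # DNS lookups, needed to resolve the service names above
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

NetworkPolicy is enforced only when the cluster's network plugin supports it, and pods that use host networking are generally not covered, so verify how the agent DaemonSet is deployed before relying on this.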

Hardening Checklist

| Area | Recommendation |
| --- | --- |
| ClickHouse TLS | Set tls_enable: true with valid certificates |
| ClickHouse auth | Dedicated user with minimum permissions |
| Agent resources | Set CPU/memory limits to prevent node starvation |
| Data retention | Set TTLs per signal type (traces, logs, profiles) |
| Authentication | Disable anonymous access in production |
| API key isolation | Separate keys per project/environment |
| Network policies | Restrict egress from agents to ClickHouse and Prometheus only |
| Privacy | Disable log collection for workloads processing PII |
| Privileged pods | Accept as required for eBPF; audit agent DaemonSet |
| Secrets | Store ClickHouse and Prometheus passwords in Kubernetes secrets or Vault |

Sources