Coroot — Operations

Deployment, configuration, scaling, and day-2 operations for Coroot.

Deployment

The Coroot Operator manages the lifecycle of all Coroot components via Custom Resources:

# Add Helm repo
helm repo add coroot https://coroot.github.io/helm-charts
helm repo update coroot

# Install the Operator
helm install -n coroot --create-namespace \
  coroot-operator coroot/coroot-operator

# Deploy Community Edition
helm install -n coroot coroot coroot/coroot-ce \
  --set "clickhouse.shards=2,clickhouse.replicas=2"

Docker Compose (Development)

git clone https://github.com/coroot/coroot.git
cd coroot
docker compose up -d

Multi-Cluster Setup

For multi-cluster observability, deploy the main Coroot instance in a central "observability" cluster. In each remote cluster, install the operator and deploy Coroot with agentsOnly enabled, pointing the agents at the central instance:

# Remote cluster — agents only
helm install -n coroot --create-namespace \
  coroot-operator coroot/coroot-operator

helm install -n coroot coroot coroot/coroot-ce \
  --set "agentsOnly=true" \
  --set "agents.server.url=https://central-coroot.example.com"

Configuration

Custom Resource (CR) Configuration

Parameter               Default     Description
metricsRefreshInterval  15s         Prometheus query resolution
cacheTTL                720h        Metric cache retention
clickhouse.shards       1           ClickHouse shard count
clickhouse.replicas     1           ClickHouse replica count
registry.url            Docker Hub  Custom container registry
registry.pullSecret     -           Image pull secret name

Private Registry Configuration

helm install -n coroot coroot coroot/coroot-ce \
  --set registry.url=https://registry.example.com/coroot \
  --set registry.pullSecret=coroot-registry-auth

Node Agent Configuration

Flag / variable   Description
--cgroupfs-root   Override cgroup filesystem path
--listen          Agent listen address (default: :80)
--wal-dir         Write-ahead log directory
ENABLE_JAVA_TLS   Environment variable; enables Java TLS profiling support

Scaling

Vertical Scaling

  • Coroot Server: Increase CPU/memory for larger service maps and more concurrent inspections
  • ClickHouse: Scale storage and memory based on log/trace retention

Horizontal Scaling

  • ClickHouse sharding: Distribute data across multiple shards for write throughput
  • ClickHouse replication: Add replicas for read performance and HA
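
Shards and replicas multiply. Assuming clickhouse.replicas is the replica count per shard (the usual ClickHouse cluster layout), the number of ClickHouse server instances is shards × replicas. The sketch below only illustrates that arithmetic; the function name is hypothetical, and coordination pods such as ClickHouse Keeper are not counted.

```shell
#!/bin/sh
# Hypothetical helper: number of ClickHouse server instances for a topology,
# assuming clickhouse.replicas means "replicas per shard".
clickhouse_instances() {
  shards=$1
  replicas=$2
  echo $((shards * replicas))
}

# The install example in the Deployment section sets 2 shards and 2 replicas.
echo "ClickHouse instances: $(clickhouse_instances 2 2)"
```

Each instance needs its own CPU, memory, and storage, so the product is a useful first estimate when sizing the cluster.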

Storage Tuning

Coroot v1.18.6+ supports S3 storage for ClickHouse, eliminating local disk requirements:

# ClickHouse S3 storage configuration
clickhouse:
  storage:
    type: s3
    s3:
      endpoint: https://s3.amazonaws.com
      bucket: coroot-clickhouse
      # Placeholder credentials: substitute your own and keep real keys out of version control
      accessKeyId: AKIAIOSFODNN7EXAMPLE
      secretAccessKey: wJalrXUtnFEMI/K7MDENG

Monitoring

Health Checks

# Check Coroot CR status
kubectl get coroot -n coroot

# Check all pods
kubectl get pods -n coroot

# Port-forward to access the UI at http://localhost:8080
kubectl port-forward -n coroot service/coroot-coroot 8080:8080

Key Metrics to Watch

Metric                  Alert threshold  Description
ClickHouse disk usage   > 80%            Risk of storage exhaustion
ClickHouse memory       > 70% RAM        Query performance degradation
Node agent restarts     > 3/hour         eBPF loading issues
Inspection queue depth  > 1000           Server overloaded
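
The thresholds in the table can be encoded in a small script for ad hoc checks. This is a minimal sketch and not part of Coroot: the function name check_threshold and the sample readings are hypothetical, and collecting the real values (from ClickHouse, Kubernetes, or Coroot itself) is left out.

```shell
#!/bin/sh
# Hypothetical helper: compare one integer reading against its alert threshold.
# Usage: check_threshold <metric name> <current value> <threshold>
# Prints a status line and returns 1 when the value exceeds the threshold.
check_threshold() {
  name=$1
  value=$2
  limit=$3
  if [ "$value" -gt "$limit" ]; then
    echo "ALERT: $name = $value (threshold > $limit)"
    return 1
  fi
  echo "ok: $name = $value (threshold > $limit)"
}

# Sample readings are made up for illustration; thresholds come from the table.
check_threshold "ClickHouse disk usage (%)" 85 80 || true
check_threshold "ClickHouse memory (% RAM)" 40 70 || true
check_threshold "Node agent restarts (per hour)" 1 3 || true
check_threshold "Inspection queue depth" 1500 1000 || true
```

The non-zero return code on a breach lets the same function drive a cron job or CI step, although a production setup would normally use alerting rules in the monitoring stack instead.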

Upgrades

# Upgrade operator
helm repo update coroot
helm upgrade -n coroot coroot-operator coroot/coroot-operator

# Upgrade Coroot CE
helm upgrade -n coroot coroot coroot/coroot-ce

Backup & Restore

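Back up Coroot data with the tools of each storage backend:

  • Metrics (Prometheus or VictoriaMetrics): Use vmbackup for VictoriaMetrics, or the backup facilities of Thanos or Mimir if metrics are stored there
  • ClickHouse: Use the clickhouse-backup tool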
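
For the ClickHouse side, one backup cycle with clickhouse-backup typically consists of a local create followed by an upload to remote storage. The sketch below only prints the commands instead of running them. The backup_plan function and the date-based backup name are hypothetical choices for this example, and upload assumes that remote storage is already configured for clickhouse-backup.

```shell
#!/bin/sh
# Sketch: print (not run) the clickhouse-backup commands for one backup cycle.
# backup_plan is a hypothetical helper; the name format is an arbitrary choice.
backup_plan() {
  name=$1
  echo "clickhouse-backup create $name"   # take a local backup
  echo "clickhouse-backup upload $name"   # copy it to the configured remote storage
}

backup_plan "coroot-$(date +%Y-%m-%d)"
```

Run the printed commands wherever clickhouse-backup has access to the ClickHouse data directory; how that is arranged in a Coroot deployment depends on your setup.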