Operations
Deployment, configuration, scaling, and day-2 operations for Coroot.
Deployment
Kubernetes via Coroot Operator (Recommended)
The Coroot Operator manages the lifecycle of all Coroot components via Custom Resources:
# Add Helm repo
helm repo add coroot https://coroot.github.io/helm-charts
helm repo update coroot
# Install the Operator
helm install -n coroot --create-namespace \
coroot-operator coroot/coroot-operator
# Deploy Community Edition
helm install -n coroot coroot coroot/coroot-ce \
--set "clickhouse.shards=2,clickhouse.replicas=2"
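A Helm install returns before the pods are actually running. As a minimal sketch, assuming the coroot namespace used above, the helper below wraps kubectl wait so a script can block until the deployment is up. The function name and the 300s timeout are illustrative choices, not Coroot defaults.

```shell
# Sketch: block until every pod in the namespace reports Ready.
# Arguments (both optional): namespace, timeout.
wait_for_coroot() {
  local ns="${1:-coroot}" timeout="${2:-300s}"
  kubectl wait --namespace "$ns" --for=condition=Ready pods --all --timeout="$timeout"
}
```

Call it as wait_for_coroot, or wait_for_coroot coroot 10m for a longer timeout. kubectl wait fails immediately if no pods exist yet, and pods that have run to completion never become Ready, so it may need a label selector in namespaces that also run jobs.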
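Docker Compose (Development)
For local development, Coroot can be run with Docker Compose; the clone and docker compose up commands are listed in the Docker Compose section under Commands & Recipes.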
Multi-Cluster Setup
For multi-cluster observability, deploy the main Coroot instance in a centralized "observability" cluster. In remote clusters, deploy the operator in agentsOnly mode:
# Remote cluster — agents only
helm install -n coroot --create-namespace \
coroot-operator coroot/coroot-operator
helm install -n coroot coroot coroot/coroot-ce \
--set "agentsOnly=true" \
--set "agents.server.url=https://central-coroot.example.com"
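With several remote clusters, the same two installs can be repeated per kubeconfig context. The following is a sketch only: the function name and the context names in the usage line are hypothetical, while --kube-context is Helm's standard flag for selecting a context.

```shell
# Sketch: install the operator and an agents-only Coroot in each remote cluster.
# Usage: install_remote_agents <central-url> <context>...
install_remote_agents() {
  local server_url="$1"; shift
  local ctx
  for ctx in "$@"; do
    helm --kube-context "$ctx" install -n coroot --create-namespace \
      coroot-operator coroot/coroot-operator || return 1
    helm --kube-context "$ctx" install -n coroot coroot coroot/coroot-ce \
      --set "agentsOnly=true" \
      --set "agents.server.url=$server_url" || return 1
  done
}
```

For example, install_remote_agents https://central-coroot.example.com staging prod would target two contexts named staging and prod. The loop stops at the first failed install so a broken cluster is not silently skipped.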
Configuration
Custom Resource (CR) Configuration
| Parameter | Default | Description |
|---|---|---|
| metricsRefreshInterval | 15s | Prometheus query resolution |
| cacheTTL | 720h | Metric cache retention |
| clickhouse.shards | 1 | ClickHouse shard count |
| clickhouse.replicas | 1 | ClickHouse replica count |
| registry.url | Docker Hub | Custom container registry |
| registry.pullSecret | (none) | Image pull secret name |
Private Registry Configuration
helm install -n coroot coroot coroot/coroot-ce \
--set registry.url=https://registry.YOUR_DOMAIN/coroot \
--set registry.pullSecret=coroot-registry-auth
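The pull secret named in registry.pullSecret has to exist before the pods can pull images. A minimal sketch, assuming the chart expects an ordinary Docker registry secret in the coroot namespace (the source does not state the secret type), and with a hypothetical helper name:

```shell
# Sketch: create the image pull secret referenced by registry.pullSecret.
# Arguments: registry host, username, password (all supplied by you).
create_registry_secret() {
  local server="$1" user="$2" password="$3"
  kubectl create secret docker-registry coroot-registry-auth \
    --namespace coroot \
    --docker-server="$server" \
    --docker-username="$user" \
    --docker-password="$password"
}
```

Passing the password as an argument leaves it in shell history; reading it from a file or prompt is safer outside of quick tests.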
Node Agent Configuration
| Flag / environment variable | Description |
|---|---|
| --cgroupfs-root | Override cgroup filesystem path |
| --listen | Agent listen address (default: :80) |
| --wal-dir | Write-ahead log directory |
| ENABLE_JAVA_TLS | Environment variable; enables Java TLS profiling support |
Scaling
Vertical Scaling
- Coroot Server: Increase CPU/memory for larger service maps and more concurrent inspections
- ClickHouse: Scale storage and memory based on log/trace retention
Horizontal Scaling
- ClickHouse sharding: Distribute data across multiple shards for write throughput
- ClickHouse replication: Add replicas for read performance and HA
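Sharding and replication multiply: assuming clickhouse.replicas counts replicas per shard (the usual ClickHouse meaning, not confirmed by the source), the number of ClickHouse server pods is shards times replicas. A small worked example with a hypothetical helper:

```shell
# Sketch: ClickHouse server pods for a given layout (replicas taken per shard).
clickhouse_pod_count() {
  local shards="$1" replicas="$2"
  echo $(( shards * replicas ))
}

clickhouse_pod_count 2 2   # the shards=2, replicas=2 layout used in the install example
```

Each of those pods needs its own CPU, memory, and storage, so the count is a useful first input when sizing the cluster. Coordination components such as ClickHouse Keeper are not included in this figure.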
Storage Tuning
Coroot v1.18.6+ supports S3 storage for ClickHouse, eliminating local disk requirements:
# ClickHouse S3 storage configuration
clickhouse:
  storage:
    type: s3
    s3:
      endpoint: https://s3.amazonaws.com
      bucket: coroot-clickhouse
      accessKeyId: AKIAIOSFODNN7EXAMPLE
      secretAccessKey: wJalrXUtnFEMI/K7MDENG
Monitoring
Health Checks
# Check Coroot CR status
kubectl get coroot -n coroot
# Check all pods
kubectl get pods -n coroot
# Port-forward to access UI
kubectl port-forward -n coroot service/coroot-coroot 8080:8080
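With the port-forward running in another terminal, a quick HTTP probe confirms that the UI answers. This is a sketch with a hypothetical helper name; it only uses standard curl options and assumes the UI is served at the root path of the forwarded port.

```shell
# Sketch: print the HTTP status code returned by the Coroot UI.
# Argument (optional): URL to probe.
ui_status() {
  local url="${1:-http://localhost:8080/}"
  curl -sS -o /dev/null -w '%{http_code}\n' --max-time 5 "$url"
}
```

A 2xx or 3xx code suggests the server is reachable; a connection error usually means the port-forward is not running or the service name differs in your install.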
Key Metrics to Watch
| Metric | Alert Threshold | Description |
|---|---|---|
| ClickHouse disk usage | > 80% | Risk of storage exhaustion |
| ClickHouse memory | > 70% RAM | Query performance degradation |
| Node agent restarts | > 3/hour | eBPF loading issues |
| Inspection queue depth | > 1000 | Server overloaded |
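The thresholds in the table can be applied in a script once the values are pulled from your monitoring system. The helper below is a sketch with a hypothetical name; it handles integer values only and leaves the collection of the metric itself to you.

```shell
# Sketch: compare an observed integer value with an alert threshold.
# Prints ALERT (and returns 1) when the value exceeds the limit, OK otherwise.
check_threshold() {
  local name="$1" value="$2" limit="$3"
  if [ "$value" -gt "$limit" ]; then
    echo "ALERT: $name is $value (threshold $limit)"
    return 1
  fi
  echo "OK: $name is $value (threshold $limit)"
}
```

For example, check_threshold "ClickHouse disk usage %" 85 80 reports an alert, matching the 80% row above. The non-zero return code makes the helper usable in cron jobs or CI checks.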
Upgrades
# Upgrade operator
helm repo update coroot
helm upgrade -n coroot coroot-operator coroot/coroot-operator
# Upgrade Coroot CE
helm upgrade -n coroot coroot coroot/coroot-ce
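If an upgrade misbehaves, Helm keeps a revision history for each release. The sketch below, with a hypothetical helper name, lists the history of the coroot release and rolls it back to a revision you choose. It uses only the standard helm history and helm rollback commands and says nothing about whether stored data remains compatible with the older version.

```shell
# Sketch: show release history, then roll back to the given revision.
# Argument: revision number taken from the helm history output.
rollback_coroot() {
  local revision="$1"
  helm history -n coroot coroot || return 1
  helm rollback -n coroot coroot "$revision"
}
```

The same pattern applies to the coroot-operator release by changing the release name.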
Backup & Restore
Backups are taken at the storage backends:
- Prometheus/VictoriaMetrics metrics: use vmbackup, or the backup mechanisms of Thanos or Mimir
- ClickHouse: use the clickhouse-backup tool
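For the ClickHouse side, a minimal sketch of taking a named backup with clickhouse-backup follows. It assumes the tool is installed and configured wherever the function runs (for example inside the ClickHouse pod), which the source does not describe. The helper name and the CLICKHOUSE_BACKUP override variable are hypothetical.

```shell
# Sketch: create a named backup, then list the backups that exist.
# CLICKHOUSE_BACKUP can point at a different binary path if needed.
backup_clickhouse() {
  local name="$1"
  local tool="${CLICKHOUSE_BACKUP:-clickhouse-backup}"
  "$tool" create "$name" || return 1
  "$tool" list
}
```

Restoring uses the tool's restore subcommand with the same backup name; test a restore on a non-production cluster before relying on it.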
Commands & Recipes
Runnable commands, configuration snippets, and troubleshooting recipes for Coroot.
Installation
Helm Install (Kubernetes)
# Add repo and install operator
helm repo add coroot https://coroot.github.io/helm-charts
helm repo update coroot
helm install -n coroot --create-namespace coroot-operator coroot/coroot-operator
# Install Community Edition
helm install -n coroot coroot coroot/coroot-ce
# Install with custom ClickHouse sizing
helm install -n coroot coroot coroot/coroot-ce \
--set "clickhouse.shards=2,clickhouse.replicas=2" \
--set "clickhouse.resources.requests.memory=4Gi"
Docker Compose
git clone https://github.com/coroot/coroot.git && cd coroot
docker compose up -d
# UI: http://localhost:8080
Access & Verification
# Port-forward to UI
kubectl port-forward -n coroot service/coroot-coroot 8080:8080
# Check operator status
kubectl get coroot -n coroot
# List all Coroot pods
kubectl get pods -n coroot -o wide
# Check node-agent logs
kubectl logs -n coroot -l app=coroot-node-agent --tail=50
# Check cluster-agent logs
kubectl logs -n coroot -l app=coroot-cluster-agent --tail=50
Configuration Recipes
Enable TLS for Coroot Server (v1.19.0+)
helm upgrade -n coroot coroot coroot/coroot-ce \
--set "server.tls.enabled=true" \
--set "server.tls.certFile=/etc/coroot/tls/tls.crt" \
--set "server.tls.keyFile=/etc/coroot/tls/tls.key"
Private Registry
helm install -n coroot coroot coroot/coroot-ce \
--set registry.url=https://registry.YOUR_DOMAIN/coroot \
--set registry.pullSecret=coroot-registry-auth
Multi-Cluster (Remote Agents Only)
# On remote cluster
helm install -n coroot --create-namespace coroot-operator coroot/coroot-operator
helm install -n coroot coroot coroot/coroot-ce \
--set "agentsOnly=true" \
--set "agents.server.url=https://central-coroot.example.com"
Troubleshooting
eBPF Agent Not Collecting Data
# Check kernel version (needs 4.16+)
uname -r
# Check if node-agent has privileged access
kubectl get pod -n coroot -l app=coroot-node-agent -o jsonpath='{.items[0].spec.containers[0].securityContext}'
# Check node-agent eBPF loading errors
kubectl logs -n coroot -l app=coroot-node-agent | grep -i "error\|failed\|ebpf"
# Verify cgroup mount
kubectl exec -n coroot $(kubectl get pod -n coroot -l app=coroot-node-agent -o name | head -1) -- ls /sys/fs/cgroup/
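The kernel check can be automated instead of read by eye. The sketch below compares a kernel release string against the 4.16 minimum stated above; the helper name is hypothetical, and it relies on sort -V, which is available in GNU coreutils. Run it on the node itself, since uname -r reports the kernel of the machine it runs on.

```shell
# Sketch: succeed when the running (or given) kernel is at least 4.16.
# Argument (optional): a kernel release string such as "5.15.0-91-generic".
kernel_ok() {
  local current="${1:-$(uname -r)}" minimum="4.16"
  # Keep only the leading major.minor, e.g. "5.15.0-91-generic" -> "5.15".
  current="$(printf '%s\n' "$current" | sed -E 's/^([0-9]+\.[0-9]+).*/\1/')"
  # sort -V orders version strings; the minimum must sort first (or be equal).
  [ "$(printf '%s\n%s\n' "$minimum" "$current" | sort -V | head -n 1)" = "$minimum" ]
}
```

Typical use is kernel_ok && echo "kernel supported" || echo "kernel too old". A plain string or numeric comparison would get 4.9 versus 4.16 wrong, which is why version-aware sorting is used.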
ClickHouse Storage Full
# Check ClickHouse disk usage
kubectl exec -n coroot $(kubectl get pod -n coroot -l app=clickhouse -o name | head -1) -- \
clickhouse-client --query "SELECT formatReadableSize(sum(bytes_on_disk)) FROM system.parts"
# Check retention settings
kubectl exec -n coroot $(kubectl get pod -n coroot -l app=clickhouse -o name | head -1) -- \
clickhouse-client --query "SELECT database, table, engine_full FROM system.tables WHERE database LIKE 'coroot%'"
# Force a merge of old parts (I/O-heavy; the merge itself needs temporary free disk space)
kubectl exec -n coroot $(kubectl get pod -n coroot -l app=clickhouse -o name | head -1) -- \
clickhouse-client --query "OPTIMIZE TABLE coroot_traces.spans FINAL"
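Before merging or changing retention, it helps to see which tables actually hold the data. The sketch below breaks disk usage down per table using the system.parts table; the helper name is hypothetical, and the app=clickhouse label is taken from the commands above and may differ in your install.

```shell
# Sketch: list the ten largest ClickHouse tables by on-disk size.
clickhouse_top_tables() {
  local pod
  pod="$(kubectl get pod -n coroot -l app=clickhouse -o name | head -n 1)"
  kubectl exec -n coroot "$pod" -- clickhouse-client --query "
    SELECT database, table, formatReadableSize(sum(bytes_on_disk)) AS size
    FROM system.parts
    WHERE active
    GROUP BY database, table
    ORDER BY sum(bytes_on_disk) DESC
    LIMIT 10"
}
```

Filtering on active skips parts that have already been merged away and are only waiting for cleanup, so the totals reflect live data.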
Upgrade
helm repo update coroot
helm upgrade -n coroot coroot-operator coroot/coroot-operator
helm upgrade -n coroot coroot coroot/coroot-ce
# Verify
kubectl get pods -n coroot
kubectl get coroot -n coroot