Coroot — Operations¶
Deployment, configuration, scaling, and day-2 operations for Coroot.
Deployment¶
Kubernetes via Coroot Operator (Recommended)¶
The Coroot Operator manages the lifecycle of all Coroot components via Custom Resources:
# Add Helm repo
helm repo add coroot https://coroot.github.io/helm-charts
helm repo update coroot
# Install the Operator
helm install -n coroot --create-namespace \
  coroot-operator coroot/coroot-operator
# Deploy Community Edition
helm install -n coroot coroot coroot/coroot-ce \
  --set "clickhouse.shards=2,clickhouse.replicas=2"
Docker Compose (Development)¶
Multi-Cluster Setup¶
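For multi-cluster observability, deploy the main Coroot instance in a centralized "observability" cluster. In each remote cluster, install the operator and then deploy Coroot in agentsOnly mode, so that only the agents run there and send their telemetry to the central instance: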
# Remote cluster: agents only
helm install -n coroot --create-namespace \
  coroot-operator coroot/coroot-operator
helm install -n coroot coroot coroot/coroot-ce \
  --set "agentsOnly=true" \
  --set "agents.server.url=https://central-coroot.example.com"
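With several remote clusters, the two commands above can be repeated once per kubeconfig context. The following is a minimal sketch rather than an official procedure: the context names in the usage line are hypothetical, the `--set` values simply mirror the example above and should be checked against your chart version, and by default the helper only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: repeat the agents-only install for several remote clusters.
# CENTRAL_URL and the context names are placeholders (assumptions).
CENTRAL_URL="${CENTRAL_URL:-https://central-coroot.example.com}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Install the operator and the agents-only release in each given context.
install_agents() {
  local ctx
  for ctx in "$@"; do
    run helm install --kube-context "$ctx" -n coroot --create-namespace \
      coroot-operator coroot/coroot-operator
    run helm install --kube-context "$ctx" -n coroot coroot coroot/coroot-ce \
      --set "agentsOnly=true" \
      --set "agents.server.url=${CENTRAL_URL}"
  done
}

# Usage (hypothetical context names):
#   install_agents prod-eu prod-us            # review the printed commands
#   DRY_RUN=0 install_agents prod-eu prod-us  # actually run them
```

Printing first makes it easy to review exactly what will be applied to each cluster before anything changes.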
Configuration¶
Custom Resource (CR) Configuration¶
| Parameter | Default | Description |
|---|---|---|
| `metricsRefreshInterval` | `15s` | Prometheus query resolution |
| `cacheTTL` | `720h` | Metric cache retention |
| `clickhouse.shards` | `1` | ClickHouse shard count |
| `clickhouse.replicas` | `1` | ClickHouse replica count |
| `registry.url` | Docker Hub | Custom container registry |
| `registry.pullSecret` | none | Image pull secret name |
Private Registry Configuration¶
helm install -n coroot coroot coroot/coroot-ce \
  --set registry.url=https://registry.example.com/coroot \
  --set registry.pullSecret=coroot-registry-auth
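The `registry.pullSecret` value is only a name, so a Secret with that name presumably has to exist in the `coroot` namespace before the pods can pull images. A minimal sketch of creating it with `kubectl create secret docker-registry` follows. The `REGISTRY_USER` and `REGISTRY_PASSWORD` environment variables and both function names are assumptions made for illustration, not part of Coroot.

```shell
#!/usr/bin/env bash
# Extract the registry host from a registry URL, e.g.
# https://registry.example.com/coroot -> registry.example.com
registry_host() {
  url="${1#*://}"              # drop the scheme, if present
  printf '%s\n' "${url%%/*}"   # drop everything after the first slash
}

# Create the image pull secret referenced by registry.pullSecret.
# Assumes REGISTRY_USER and REGISTRY_PASSWORD are exported (hypothetical names).
create_pull_secret() {
  kubectl create secret docker-registry coroot-registry-auth \
    -n coroot \
    --docker-server="$(registry_host "$1")" \
    --docker-username="$REGISTRY_USER" \
    --docker-password="$REGISTRY_PASSWORD"
}

# Usage:
#   create_pull_secret https://registry.example.com/coroot
```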
Node Agent Configuration¶
| Flag / env var | Description |
|---|---|
| `--cgroupfs-root` | Override cgroup filesystem path |
| `--listen` | Agent listen address (default: `:80`) |
| `--wal-dir` | Write-ahead log directory |
| `ENABLE_JAVA_TLS` | Enable Java TLS profiling support (environment variable) |
Scaling¶
Vertical Scaling¶
- Coroot Server: Increase CPU/memory for larger service maps and more concurrent inspections
- ClickHouse: Scale storage and memory based on log/trace retention
Horizontal Scaling¶
- ClickHouse sharding: Distribute data across multiple shards for write throughput
- ClickHouse replication: Add replicas for read performance and HA
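As a rough sizing aid, and assuming one pod per shard replica, the number of ClickHouse data pods is the product of the two settings. With `clickhouse.shards=2,clickhouse.replicas=2` from the install example that comes to four. The sketch below only does that arithmetic; the function name is illustrative, and any coordination pods (for example a keeper ensemble) are not counted.

```shell
#!/usr/bin/env bash
# Number of ClickHouse data pods for a given topology:
# every shard is stored on `replicas` pods.
clickhouse_pods() {
  shards="$1"
  replicas="$2"
  echo $(( shards * replicas ))
}

clickhouse_pods 2 2   # prints 4
clickhouse_pods 3 2   # prints 6
```

Adding a shard spreads writes across more pods, while adding a replica duplicates each shard, so storage needs grow with the replica count.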
Storage Tuning¶
Coroot v1.18.6+ supports S3 storage for ClickHouse, eliminating local disk requirements:
# ClickHouse S3 storage configuration
# The credentials below are placeholders; do not commit real keys.
clickhouse:
  storage:
    type: s3
    s3:
      endpoint: https://s3.amazonaws.com
      bucket: coroot-clickhouse
      accessKeyId: AKIAIOSFODNN7EXAMPLE
      secretAccessKey: wJalrXUtnFEMI/K7MDENG
Monitoring¶
Health Checks¶
# Check Coroot CR status
kubectl get coroot -n coroot
# Check all pods
kubectl get pods -n coroot
# Port-forward to access UI
kubectl port-forward -n coroot service/coroot-coroot 8080:8080
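For scripted checks, the pod listing can be filtered down to the pods that need attention. The helper below is a small sketch, not a Coroot tool: it reads `kubectl get pods --no-headers` output on standard input and prints the names of pods whose STATUS column is neither `Running` nor `Completed`.

```shell
#!/usr/bin/env bash
# Print pods that are not Running or Completed.
# Input columns: NAME READY STATUS RESTARTS AGE
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage against a live cluster:
#   kubectl get pods -n coroot --no-headers | unhealthy_pods
```

An empty result means every pod in the namespace reported a healthy status at the time of the query.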
Key Metrics to Watch¶
| Metric | Alert Threshold | Description |
|---|---|---|
| ClickHouse disk usage | > 80% | Risk of storage exhaustion |
| ClickHouse memory | > 70% RAM | Query performance degradation |
| Node agent restarts | > 3/hour | eBPF loading issues |
| Inspection queue depth | > 1000 | Server overloaded |
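The thresholds in the table can be evaluated with a small comparison helper, for instance from a cron job. This is a sketch under stated assumptions: the function names are illustrative, values are whole numbers, and `/var/lib/clickhouse` in the usage line is assumed to be the ClickHouse data path on the host being checked.

```shell
#!/usr/bin/env bash
# Print an alert and return 0 when the value exceeds the threshold;
# return 1 otherwise. Values must be integers.
check_threshold() {
  name="$1"
  value="$2"
  limit="$3"
  if [ "$value" -gt "$limit" ]; then
    echo "ALERT: $name is $value (threshold $limit)"
    return 0
  fi
  return 1
}

# Disk usage percentage of the filesystem holding a path (POSIX df -P output).
disk_usage_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage (path is an assumption):
#   check_threshold "ClickHouse disk usage" "$(disk_usage_pct /var/lib/clickhouse)" 80
```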
Upgrades¶
# Upgrade operator
helm repo update coroot
helm upgrade -n coroot coroot-operator coroot/coroot-operator
# Upgrade Coroot CE
helm upgrade -n coroot coroot coroot/coroot-ce
Backup & Restore¶
Backup is handled through the storage backends:
- Prometheus/VictoriaMetrics metrics: Use vmbackup (VictoriaMetrics) or the backup mechanism of Thanos or Mimir
- ClickHouse: Use the `clickhouse-backup` tool
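A minimal sketch of a ClickHouse backup step, assuming the `clickhouse-backup` CLI is installed and configured on the ClickHouse host. The `coroot` name prefix and both helper names are illustrative; `create`, `list`, and `restore` are subcommands of that CLI.

```shell
#!/usr/bin/env bash
# Build a timestamped backup name, e.g. coroot-20250131-0930.
# The second argument overrides the timestamp (useful for testing).
backup_name() {
  prefix="${1:-coroot}"
  stamp="${2:-$(date -u +%Y%m%d-%H%M)}"
  printf '%s-%s\n' "$prefix" "$stamp"
}

# Create a named backup, then list the backups known to the tool.
backup_clickhouse() {
  name="$(backup_name coroot)"
  clickhouse-backup create "$name" && clickhouse-backup list
}

# Usage:
#   backup_clickhouse
# Restore later with:
#   clickhouse-backup restore <backup-name>
```

Timestamped names keep successive backups distinct and make it obvious which one to restore.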