# Kubernetes — Benchmarks
**Scope:** Kubernetes scalability limits, API server performance, etcd throughput, and scheduling benchmarks.
## Official Scalability Targets (SIG Scalability)
Kubernetes officially tests and targets these limits per cluster:
| Dimension | Target | Notes |
|---|---|---|
| Nodes | 5,000 | Tested by SIG Scalability |
| Pods | 150,000 | 30 pods/node avg |
| Pods per node | 110 | Kubelet default maxPods |
| Services | 10,000 | |
| Endpoints per Service | 5,000 | Beyond this, use EndpointSlices |
| Namespaces | 10,000 | |
| ConfigMaps | 30,000 | |
| Secrets | 30,000 | |
| Total API objects | ~300,000 | Bounded by the etcd storage quota (8 GiB recommended max) |
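The targets in the table can be encoded as a simple pre-flight check for a planned cluster (a minimal sketch; the `TARGETS` dict mirrors the table, and the sample `plan` numbers are hypothetical):

```python
# Published per-cluster scalability targets from the table above.
TARGETS = {
    "nodes": 5_000,
    "pods": 150_000,
    "pods_per_node": 110,
    "services": 10_000,
    "endpoints_per_service": 5_000,
}

def check_cluster(plan: dict) -> list[str]:
    """Return a warning for every dimension that exceeds its tested target."""
    return [
        f"{dim}: planned {plan[dim]:,} exceeds tested target {limit:,}"
        for dim, limit in TARGETS.items()
        if plan.get(dim, 0) > limit
    ]

# Hypothetical cluster plan: only the node count is over the tested limit.
plan = {"nodes": 6_000, "pods": 120_000, "pods_per_node": 20,
        "services": 4_000, "endpoints_per_service": 800}
warnings = check_cluster(plan)
```

Staying inside these tested envelopes matters because the SLOs in the following sections are only validated up to them.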
## API Server Performance
| Metric | Target SLO | Notes |
|---|---|---|
| API request latency (mutating, P99) | < 1s | At 5,000-node scale |
| API request latency (non-mutating, P99) | < 5s | Namespace-scoped list calls; cluster-scoped lists target < 30s |
| API request latency (P50) | < 100ms | Typical single-object operations |
| Pod startup latency (P99) | < 5s | Stateless pod running from API call, excluding image pull |
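The API server exposes these latencies as Prometheus-style cumulative histograms (`apiserver_request_duration_seconds`). Turning the raw buckets into a P99 figure means interpolating within the bucket that contains the quantile, the same way PromQL's `histogram_quantile()` does — a sketch, with made-up bucket counts:

```python
# Estimate a latency quantile from cumulative (upper_bound, count) buckets,
# the format Prometheus histograms (and the API server's metrics) use.

def histogram_quantile(q: float, buckets: list[tuple[float, int]]) -> float:
    """Linear interpolation within the bucket holding the q-quantile,
    mirroring PromQL's histogram_quantile()."""
    total = buckets[-1][1]          # the +Inf bucket count == total samples
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if bound == float("inf"):
                return prev_bound   # PromQL caps at the highest finite bound
            if count == prev_count:
                return bound        # degenerate (empty) bucket guard
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound

# Hypothetical cumulative buckets: 1,000 requests total.
buckets = [(0.05, 600), (0.1, 900), (0.5, 980), (1.0, 995), (float("inf"), 1000)]
p99 = histogram_quantile(0.99, buckets)   # falls in the (0.5, 1.0] bucket
```

In practice you would run `histogram_quantile(0.99, rate(apiserver_request_duration_seconds_bucket[5m]))` in PromQL and compare the result against the SLO table above.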
## etcd Performance
| Cluster Size | WAL fsync P99 | Read latency P99 | Write QPS | Storage |
|---|---|---|---|---|
| < 100 nodes | < 5ms | < 10ms | 1,000 | 2Gi |
| 100-500 nodes | < 10ms | < 25ms | 5,000 | 4Gi |
| 500-5000 nodes | < 10ms | < 50ms | 10,000 | 8Gi |
**Disk latency is critical.** etcd appends to its WAL sequentially and fsyncs every commit; a disk with fsync latency above ~10ms causes missed heartbeats, spurious leader elections, and cascading cluster instability.
## Scheduling Performance
| Scheduler Metric | Value | Conditions |
|---|---|---|
| Scheduling throughput | ~100 pods/sec | Default scheduler, 5000-node cluster |
| Scheduling latency (P99) | < 100ms | Without complex affinity rules |
| Scheduling with affinity | 20-50 pods/sec | Pod anti-affinity across nodes |
| Preemption overhead | +50-100ms | When preemption kicks in |
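These throughput figures translate directly into rollout time, which is why affinity-heavy workloads feel an order of magnitude slower at scale. A back-of-the-envelope sketch, using a hypothetical 10,000-pod deployment and the table's own numbers:

```python
def rollout_seconds(pods: int, pods_per_sec: float) -> float:
    """Idealized time to schedule `pods` at a steady scheduling rate,
    ignoring image pulls, kubelet startup, and API-server backpressure."""
    return pods / pods_per_sec

plain = rollout_seconds(10_000, 100)     # default scheduler: ~100 s
affinity = rollout_seconds(10_000, 30)   # anti-affinity-heavy: ~333 s
```

The gap widens further on large clusters, since anti-affinity checks scale with the number of candidate nodes.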
## Network Performance (CNI Comparison)
| CNI | Pod-to-Pod Latency | Throughput (TCP) | Throughput (eBPF) | Encryption Overhead |
|---|---|---|---|---|
| Cilium | ~50 µs | 9.5 Gbps | 9.8 Gbps (native) | 15-20% (WireGuard) |
| Calico | ~60 µs | 9.2 Gbps | 9.5 Gbps (eBPF mode) | 20-25% (WireGuard) |
| Flannel (VXLAN) | ~80 µs | 8.5 Gbps | N/A | None built in |
| Host networking | ~30 µs | 10 Gbps (line rate) | N/A | N/A |
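Reading the overhead column against the TCP column gives the effective throughput once encryption is on — a quick sketch using the table's own numbers:

```python
def effective_gbps(baseline_gbps: float, overhead_pct: float) -> float:
    """Throughput remaining after a percentage encryption overhead."""
    return baseline_gbps * (1 - overhead_pct / 100)

cilium_wg = effective_gbps(9.5, 20)   # worst case from the table: ~7.6 Gbps
calico_wg = effective_gbps(9.2, 25)   # worst case from the table: ~6.9 Gbps
```

On a 10 Gbps fabric, WireGuard can therefore cost two to three Gbps of pod-to-pod bandwidth — worth measuring before enabling it cluster-wide.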
## Real-World Scale References
- Google GKE: Supports 15,000 nodes per cluster (managed)
- AWS EKS: Up to 5,000 nodes with managed control plane
- OpenAI: Runs 7,500-node clusters for ML training
- Alibaba Cloud: Reported testing at 10,000+ nodes