Kubernetes¶

The industry-standard container orchestration platform for automating deployment, scaling, and management of containerized workloads across clusters of machines.

Overview¶

Kubernetes (K8s) is a production-grade container orchestration system originally developed by Google and now maintained by the Cloud Native Computing Foundation (CNCF). It implements a desired-state model where controllers continuously reconcile actual state with declared intent. Kubernetes manages the full lifecycle of containerized applications: scheduling, scaling, networking, storage, and self-healing.
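
The desired-state model described above can be illustrated with a toy reconciliation pass. This is a minimal sketch, not how any real Kubernetes controller is implemented: `ClusterState` and `reconcile` are hypothetical names, and the "cluster" is just a counter of running replicas.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Toy stand-in for the observed state of one workload (hypothetical)."""
    running_replicas: int


def reconcile(desired_replicas: int, state: ClusterState) -> list[str]:
    """Run one reconciliation pass and return the actions taken."""
    actions = []
    diff = desired_replicas - state.running_replicas
    if diff > 0:
        # Fewer replicas than declared: start the missing ones.
        for _ in range(diff):
            state.running_replicas += 1
            actions.append("start pod")
    elif diff < 0:
        # More replicas than declared: stop the surplus.
        for _ in range(-diff):
            state.running_replicas -= 1
            actions.append("stop pod")
    return actions


if __name__ == "__main__":
    state = ClusterState(running_replicas=1)
    print(reconcile(3, state))   # two pods are started
    state.running_replicas -= 1  # simulate a pod crash
    print(reconcile(3, state))   # the next pass replaces the lost pod
    print(reconcile(3, state))   # already converged: no actions
```

The point of the sketch is that the caller only ever declares the target (`3`); self-healing falls out of running the same comparison again after the observed state drifts.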

Repository & Community¶

Attribute	Detail
Repository	github.com/kubernetes/kubernetes
Stars	~115k ⭐
Latest Stable	v1.36.2 (2026)
Language	Go
License	Apache 2.0
Governance	CNCF (Linux Foundation)
Contributors	9,000+

Evaluation¶

Why it's better: Cloud-agnostic, massive ecosystem (CNCF landscape), declarative config, self-healing, horizontal pod autoscaling, service discovery, rolling updates, and the dominant industry standard for container orchestration.
When it fits (Applicability):
Microservices at scale across multiple nodes
CI/CD with automated rollouts and rollbacks
Multi-cloud / hybrid cloud portability
Stateful workloads with persistent volumes
AI/ML training and inference pipelines
Edge deployments (K3s, MicroK8s)
Pros and Cons:

Pros	Cons
Cloud-agnostic, runs anywhere	Steep learning curve
Self-healing, auto-scaling	Complex networking (CNI plugins)
Massive CNCF ecosystem	Control plane overhead for small workloads
Declarative desired-state model	YAML verbosity
Service mesh, Ingress, Gateway API	Security hardening requires expertise
GPU/DRA scheduling for AI/ML	etcd operational complexity
Every major cloud offers managed K8s	Not ideal for traditional VM workloads

Architecture¶

flowchart TB
    subgraph ControlPlane["Control Plane"]
        API["kube-apiserver\n(REST API)"]
        ETCD["etcd\n(distributed KV store)"]
        Sched["kube-scheduler\n(pod placement)"]
        CM["kube-controller-manager\n(reconciliation loops)"]
        CCM["cloud-controller-manager\n(cloud API integration)"]
    end

    subgraph WorkerNode["Worker Node"]
        Kubelet["kubelet\n(pod lifecycle)"]
        KProxy["kube-proxy\n(Service networking)"]
        CRI["Container Runtime\n(containerd / CRI-O)"]
        Pods["Pods\n(application containers)"]
    end

    API <-->|"watch/list"| ETCD
    Sched -->|"bind pod"| API
    CM -->|"reconcile"| API
    CCM -->|"cloud ops"| API
    Kubelet -->|"status"| API
    API -->|"spec"| Kubelet
    Kubelet -->|"CRI"| CRI
    CRI --> Pods
    KProxy -->|"iptables/IPVS"| Pods

    style ControlPlane fill:#326ce5,color:#fff
    style WorkerNode fill:#1565c0,color:#fff

Key Features¶

Feature	Detail
Pod Scheduling	Affinity, anti-affinity, taints, tolerations, topology spread
Auto-Scaling	HPA (horizontal), VPA (vertical), Cluster Autoscaler
Service Discovery	ClusterIP, NodePort, LoadBalancer, ExternalName
Ingress / Gateway API	L7 traffic routing, TLS termination
Storage	PV, PVC, CSI drivers, StorageClasses
ConfigMaps / Secrets	Externalized configuration and credentials
RBAC	Fine-grained role-based access control
Namespaces	Logical cluster partitioning
DRA (v1.36)	Dynamic Resource Allocation for GPUs, FPGAs
Custom Resources	Extend API with CRDs + Operators
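
As an illustration of the HPA entry in the table above, the core replica calculation documented for the Horizontal Pod Autoscaler is `ceil(currentReplicas * currentMetric / targetMetric)`, skipped when the ratio is within a tolerance of 1.0. The sketch below assumes a single metric and ignores pod readiness, missing metrics, and stabilization windows; the function name and the `min_replicas`/`max_replicas` defaults are hypothetical.

```python
import math


def hpa_desired_replicas(current_replicas, current_metric, target_metric,
                         min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified single-metric version of the HPA scaling rule."""
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds declared on the autoscaler (assumed values here).
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas averaging 90% CPU against a 60% target scale out to 6.
    print(hpa_desired_replicas(4, 90, 60))
```

VPA and Cluster Autoscaler work on different inputs (per-pod resource requests and unschedulable pods, respectively) and are not covered by this formula.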

Key Ecosystem¶

Category	Tools
Managed K8s	EKS, GKE, AKS, DOKS, OKE, Linode LKE
Lightweight	K3s, MicroK8s, Kind, Minikube
Networking	Calico, Cilium, Flannel, Antrea
Service Mesh	Istio, Linkerd, Consul Connect
GitOps	ArgoCD, Flux
Observability	Prometheus, Grafana, OpenTelemetry
Security	Falco, OPA/Gatekeeper, Trivy, Kyverno

Pricing¶

Offering	Cost	Notes
Self-hosted	Free (Apache 2.0)	You manage everything
AWS EKS	$0.10/hr/cluster + node costs	Managed control plane
GKE Autopilot	$0.10/hr/cluster + pod costs	Fully managed
Azure AKS	Free tier: no control plane fee + node costs	Managed; Standard tier adds $0.10/hr/cluster
Enterprise	Various (Rancher, OpenShift, Tanzu)	Support + add-ons
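
To put the hourly figures above in monthly terms, the following sketch converts a control plane fee plus node costs into a monthly estimate. It assumes a 730-hour average month, and the node price used in the example is purely hypothetical; real bills also include storage, load balancers, and data transfer.

```python
HOURS_PER_MONTH = 730  # assumption: average month (8,760 h / 12)


def monthly_cost(control_plane_hourly, node_hourly, node_count,
                 hours=HOURS_PER_MONTH):
    """Estimate the monthly cost in dollars, rounded to cents."""
    hourly_total = control_plane_hourly + node_hourly * node_count
    return round(hourly_total * hours, 2)


if __name__ == "__main__":
    # Control plane only, at the $0.10/hr/cluster rate from the table.
    print(monthly_cost(0.10, 0.0, 0))
    # Same cluster with three nodes at a hypothetical $0.05/hr each.
    print(monthly_cost(0.10, 0.05, 3))
```

At $0.10/hr the control plane alone comes to roughly $73 per month, which is why the fee matters mostly for small or numerous clusters.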

Compatibility¶

Dimension	Support
Container runtimes	containerd (default), CRI-O
Node OS	Linux (primary), Windows (worker nodes)
CPU architecture	amd64, arm64, arm/v7, s390x, ppc64le
Storage	CSI (Ceph, EBS, GCE PD, Azure Disk, NFS, etc.)
Networking	CNI plugins (Calico, Cilium, Flannel, etc.)
Infrastructure	Bare metal, VMs, any cloud, edge

Scale Limits (Upstream)¶

Dimension	Limit
Nodes per cluster	5,000
Pods per node	110 (default)
Pods per cluster	150,000
Services per cluster	10,000
Namespaces per cluster	10,000
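
The limits above are not independent: 150,000 pods spread over 5,000 nodes averages 30 pods per node, well under the 110-per-node default. A small checker makes that interaction visible. This is a sketch using only the figures from the table; `check_cluster_plan` is a hypothetical helper, not part of any Kubernetes tooling.

```python
UPSTREAM_LIMITS = {
    "nodes": 5_000,
    "pods_per_node": 110,
    "pods": 150_000,
    "services": 10_000,
    "namespaces": 10_000,
}


def check_cluster_plan(nodes, pods_per_node, services=0, namespaces=0):
    """Return a list of violations; an empty list means the plan fits."""
    planned = {
        "nodes": nodes,
        "pods_per_node": pods_per_node,
        "pods": nodes * pods_per_node,  # worst case: every node is full
        "services": services,
        "namespaces": namespaces,
    }
    return [
        f"{name}: {value} exceeds {UPSTREAM_LIMITS[name]}"
        for name, value in planned.items()
        if value > UPSTREAM_LIMITS[name]
    ]


if __name__ == "__main__":
    print(check_cluster_plan(1_000, 110))  # fits: 110,000 pods in total
    print(check_cluster_plan(5_000, 110))  # too many pods in total
```

A plan that maxes out both node count and pods per node fails on the cluster-wide pod limit even though each individual figure is within bounds.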

Sources¶

Source	URL	Retrieved Via
Official Website	https://kubernetes.io	Direct
Documentation	https://kubernetes.io/docs/	Direct
GitHub Repository	https://github.com/kubernetes/kubernetes	Direct
Releases	https://kubernetes.io/releases/	Web Search
Release Cycle	https://kubernetes.dev	Web Search
API Reference	https://kubernetes.io/docs/reference/kubernetes-api/	Direct
CNCF	https://cncf.io	Direct
CNCF Landscape	https://landscape.cncf.io	Direct
Scalability Targets	https://github.com/kubernetes/community/blob/master/sig-scalability/configs-and-limits/thresholds.md	Direct
Ingress NGINX Retirement	https://kubernetes.io/blog/	Web Search

Questions¶

Open Questions¶

Answered Questions¶

How does v1.36 SELinuxMount GA affect pod startup latency in enforcing environments? → SELinuxMount (KEP-1710) mounts volumes with the correct SELinux label via a mount option (context=...) instead of recursively relabeling every file. The first phase of the KEP, SELinuxMountReadWriteOncePod, reached GA in v1.29; the broader SELinuxMount feature gate was introduced as alpha in v1.30. Because the kernel applies the label once at mount time, the per-file setxattr syscall overhead disappears, so in enforcing environments with large volumes (10K+ files) pod startup latency can drop from minutes to seconds. Reported improvements are in the 10-100x range for volumes with many small files. No benchmark data specific to v1.36 is available yet. Resolved via KEP-1710 documentation.
What is the production readiness of DRA (Dynamic Resource Allocation) for GPU partitioning in v1.36? → DRA with structured parameters (KEP-4381) was introduced as alpha in v1.30, moved to beta in v1.32, and graduated to GA in v1.34, so the core API is stable in v1.36. GPU partitioning is a separate question: it depends on the partitionable-devices extension and on driver support, both of which matured later than the core API. The NVIDIA DRA driver (nvidia/k8s-dra-driver) supports MIG partitioning and time-slicing, but its maturity should be checked against its own release notes before production use. Where that validation is not possible, the traditional NVIDIA device plugin with MIG/time-slicing remains the conservative path. Resolved via KEP-4381 documentation.
How does the Gateway API adoption compare to the retired ingress-nginx in production environments? → Gateway API is the official successor to Ingress, and ingress-nginx was retired in March 2026. Multiple production-grade controllers exist: Envoy Gateway (CNCF), Istio Gateway, Cilium Gateway, Kong Gateway. Gateway API provides richer routing (header matching, weighted traffic splitting across backends) and a role-oriented design (Infrastructure Provider, Cluster Operator, Application Developer). Migration from Ingress is straightforward for basic use cases, but advanced patterns that relied on controller-specific annotations need to be redesigned. Resolved via Kubernetes Gateway API documentation.
What are the recommended etcd backup strategies for clusters with >10,000 objects? → (1) Take periodic snapshots with etcdctl snapshot save, for example every 6-12 hours; (2) store snapshots in external object storage (S3/GCS) and retain at least the last 2-3; (3) test restores regularly on a separate cluster; (4) keep the database small with compaction (etcdctl compact) and, during maintenance windows, defragmentation (etcdctl defrag); (5) note that --snapshot-count (e.g. 10000) only tunes how often etcd writes internal Raft snapshots and truncates the WAL, so it is not a backup mechanism. Resolved via etcd documentation.
What is the max cluster size? → 5,000 nodes, 150,000 pods, 10,000 services (upstream thresholds). See infrastructure/kubernetes/index#Scale Limits (Upstream).
Is Docker still supported? → Dockershim was removed in v1.24. containerd and CRI-O are the supported runtimes.
What happened to ingress-nginx? → Retired March 24, 2026. Migrate to Gateway API controllers.
How does scheduling work? → Filter → Score → Bind. See infrastructure/kubernetes/architecture#How It Works.
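
The Filter → Score → Bind sequence from the last answer can be sketched as follows. This is a toy model, not the kube-scheduler's implementation: `Node`, `Pod`, and `schedule` are hypothetical, only CPU and taints are considered, and the scoring rule (most CPU left after placement) is one assumed strategy among several the real scheduler supports.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_millis: int
    taints: set = field(default_factory=set)


@dataclass
class Pod:
    name: str
    cpu_request_millis: int
    tolerations: set = field(default_factory=set)


def schedule(pod, nodes):
    """Return the name of the chosen node, or None if no node is feasible."""
    # Filter: drop nodes that cannot run the pod at all.
    feasible = [
        n for n in nodes
        if n.free_cpu_millis >= pod.cpu_request_millis
        and n.taints <= pod.tolerations  # every taint must be tolerated
    ]
    if not feasible:
        return None  # in a real cluster the pod would stay Pending
    # Score: prefer the node with the most CPU left after placement.
    best = max(feasible,
               key=lambda n: n.free_cpu_millis - pod.cpu_request_millis)
    # Bind: record the decision by reserving the resources on that node.
    best.free_cpu_millis -= pod.cpu_request_millis
    return best.name


if __name__ == "__main__":
    nodes = [Node("a", 500), Node("b", 2000, {"gpu"}), Node("c", 1500)]
    print(schedule(Pod("web", 1000), nodes))          # untainted node with room
    print(schedule(Pod("ml", 1000, {"gpu"}), nodes))  # tolerates the gpu taint
```

Keeping filtering separate from scoring means hard constraints (resources, taints) can never be traded off against preferences, which mirrors the intent of the three-phase design.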