Architecture¶
Overview¶
Project Calico is a CNI plugin providing Layer 3 networking and network policy for Kubernetes clusters. It supports several dataplanes, most commonly the standard Linux dataplane (iptables) and an eBPF dataplane, and can distribute routes between nodes using BGP. Calico components run as DaemonSets and Deployments inside the calico-system namespace.
See also: security for policy and encryption details.
Component Diagram¶
graph TB
subgraph "Kubernetes Control Plane"
API["Kubernetes API Server"]
CALICO_API["Calico API Server"]
KC["kube-controllers"]
end
subgraph "Typha (Deployment)"
TYPHA["Typha proxy<br/>caches & deduplicates events"]
end
subgraph "Node 1 — calico/node DaemonSet"
FELIX1["Felix<br/>policy + route programming"]
BIRD1["BIRD<br/>BGP routing daemon"]
CONFD1["confd<br/>template renderer"]
CNI1["CNI plugin + IPAM"]
end
subgraph "Node 2 — calico/node DaemonSet"
FELIX2["Felix"]
BIRD2["BIRD"]
CONFD2["confd"]
CNI2["CNI plugin + IPAM"]
end
API -->|watches resources| TYPHA
CALICO_API -->|validates CRDs| API
TYPHA -->|cached updates| FELIX1
TYPHA -->|cached updates| FELIX2
TYPHA -->|config changes| CONFD1
TYPHA -->|config changes| CONFD2
CONFD1 -->|generates config| BIRD1
CONFD2 -->|generates config| BIRD2
BIRD1 <-->|BGP peering| BIRD2
FELIX1 -->|programs iptables / eBPF / routes| NET1["Linux Kernel FIB"]
FELIX2 -->|programs iptables / eBPF / routes| NET2["Linux Kernel FIB"]
KC -->|syncs labels + IPs| API
Core Components¶
Felix¶
Felix is the primary per-node agent. It runs inside the calico/node container on every node and is responsible for:
- Endpoint management -- programs routes and interface configuration for local workloads (veth pairs for pods, host endpoints for host interfaces).
- Policy enforcement -- translates Calico NetworkPolicy and GlobalNetworkPolicy into iptables rules (standard dataplane) or eBPF bytecode (eBPF dataplane).
- Route programming -- writes per-endpoint routes into the Linux kernel FIB so the kernel can forward traffic to the correct workload.
- Status reporting -- exposes health and endpoint status to the Calico datastore.
Felix watches the Calico datastore (via Typha in production) for changes to endpoints, policies, and IP pools, and reacts in real time.
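The policy-enforcement step depends on turning label selectors into concrete sets of IP addresses. A minimal sketch of that idea follows, assuming exact key/value matching only. The `Endpoint` type and `resolve_ip_set` function are hypothetical names for illustration and are not Felix's actual code or selector syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical stand-in for a workload endpoint (not Felix's real type)."""
    name: str
    ip: str
    labels: tuple  # tuple of (key, value) pairs


def resolve_ip_set(selector: dict, endpoints: list) -> set:
    """Return the IPs of endpoints whose labels contain every selector pair.

    Only exact key == value matching is modelled; real selectors are richer.
    """
    matched = set()
    for ep in endpoints:
        labels = dict(ep.labels)
        if all(labels.get(k) == v for k, v in selector.items()):
            matched.add(ep.ip)
    return matched


# Made-up endpoints for illustration.
endpoints = [
    Endpoint("web-1", "10.0.1.5", (("app", "web"), ("env", "prod"))),
    Endpoint("web-2", "10.0.1.6", (("app", "web"), ("env", "dev"))),
    Endpoint("db-1", "10.0.2.9", (("app", "db"), ("env", "prod"))),
]

prod_web = resolve_ip_set({"app": "web", "env": "prod"}, endpoints)
print(sorted(prod_web))
```

When an endpoint's labels change, only the membership of the affected sets needs to be recomputed, which is why set-based matching suits a dataplane that reacts to updates in real time.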
BIRD¶
BIRD is an open-source internet routing daemon that runs on each node inside calico/node. Its responsibilities:
- Route distribution -- reads routes that Felix programs into the kernel FIB and advertises them to BGP peers on other nodes.
- BGP peering -- maintains BGP sessions with other Calico nodes (full mesh by default) or with centralized BGP route reflectors in larger deployments.
- Topology flexibility -- supports full-mesh, route-reflector, and top-of-rack (ToR) switch peering models.
BIRD is not used in VXLAN-only mode
When Calico operates in VXLAN overlay mode (no BGP), BIRD does not run. VXLAN mode uses kernel VXLAN interfaces instead of BGP for inter-node traffic.
confd¶
confd is a lightweight template renderer that watches the Calico datastore for BGP-related configuration (IP pools, node-to-node mesh settings, AS numbers, BGP peers). When configuration changes, confd regenerates BIRD config files and triggers a BIRD reload.
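The render-and-reload loop can be illustrated with a small sketch. It assumes a deliberately simplified BIRD-style peer stanza; Calico's real templates are more involved, and `PEER_TEMPLATE`, `render_bird_peers`, and `needs_reload` are hypothetical names, as are the peer address and AS number.

```python
from string import Template

# Illustrative template only; not the template Calico actually ships.
PEER_TEMPLATE = Template(
    "protocol bgp $name {\n"
    "  local as $local_as;\n"
    "  neighbor $peer_ip as $peer_as;\n"
    "}\n"
)


def render_bird_peers(local_as: int, peers: list) -> str:
    """Render one BGP protocol stanza per peer dict (keys: name, ip, as_number)."""
    stanzas = [
        PEER_TEMPLATE.substitute(
            name=p["name"],
            local_as=local_as,
            peer_ip=p["ip"],
            peer_as=p["as_number"],
        )
        for p in peers
    ]
    return "\n".join(stanzas)


def needs_reload(old_config: str, new_config: str) -> bool:
    """Trigger a reload only when the rendered output actually changed."""
    return old_config != new_config


peers = [{"name": "node2", "ip": "192.0.2.12", "as_number": 64512}]
config = render_bird_peers(64512, peers)
print(config)
```

Comparing rendered output before reloading avoids restarting BGP sessions for datastore changes that do not affect the generated file.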
Typha¶
Typha sits between the datastore (typically the Kubernetes API server) and the per-node Felix/confd processes. Its purpose is scalability:
- Connection multiplexing -- maintains a single datastore watch instead of one per node, reducing API server load from O(nodes) to O(typha replicas).
- Event deduplication -- caches the full datastore state and filters irrelevant updates before pushing to Felix, reducing per-node CPU usage.
- Deployment -- runs as a Deployment (typically 2--3 replicas for HA). Automatically installed when using the Calico Operator.
When to use Typha
Typha is recommended for all clusters using the Kubernetes API datastore, especially clusters with 50+ nodes. It is redundant (but harmless) when using an external etcd v3 datastore.
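The O(nodes) to O(typha replicas) reduction can be made concrete with a little arithmetic. This is a rough model that counts only Felix watchers and assumes connections spread evenly across replicas; the function names are illustrative.

```python
import math


def datastore_watchers(nodes: int, typha_replicas: int = 0) -> int:
    """Number of clients watching the datastore directly.

    Without Typha each node's Felix watches on its own; with Typha only
    the replicas do, and Felix instances connect to Typha instead.
    """
    return typha_replicas if typha_replicas > 0 else nodes


def felix_per_typha(nodes: int, typha_replicas: int) -> int:
    """Felix connections per replica, assuming an even spread (rounded up)."""
    return math.ceil(nodes / typha_replicas)


print(datastore_watchers(500))                    # 500 direct watchers
print(datastore_watchers(500, typha_replicas=3))  # 3 direct watchers
print(felix_per_typha(500, 3))                    # 167 Felix clients per replica
```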
kube-controllers¶
calico-kube-controllers is a Deployment (single replica by default) that syncs Kubernetes metadata into the Calico datastore:
- Namespace controller -- syncs namespace labels so policies can select by namespace.
- ServiceAccount controller -- syncs ServiceAccount labels for policy matching.
- Pod controller -- maps pod IP addresses and labels into Calico workload endpoints.
- Node controller -- syncs node labels and IP addresses for host endpoint policy.
Calico API Server¶
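The Calico API Server is a Kubernetes aggregated API server that exposes Calico resources (NetworkPolicy, GlobalNetworkPolicy, IPPool, BGPConfiguration, etc.) through the projectcalico.org/v3 API group, so they behave like native Kubernetes resources. This enables kubectl get and kubectl apply -f workflows without requiring calicoctl. Because the name networkpolicy also matches the built-in Kubernetes resource, use the fully qualified name (networkpolicies.projectcalico.org) to address the Calico resource.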
CNI Plugin and IPAM¶
- CNI plugin -- invoked by the kubelet when a pod is created. Configures the veth pair, moves one end into the pod's network namespace, and calls the IPAM plugin.
- Calico IPAM -- manages IP address allocation from configured IPPools. Allocates CIDR blocks (/26 by default) to nodes for local pod IP assignment. Supports block affinity (a node "owns" a block) to minimize route churn.
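Block-based allocation can be sketched with Python's standard `ipaddress` module. The pool CIDR, node names, and the `BlockAllocator` class below are assumptions for illustration; real Calico IPAM also handles block release, borrowing, and multiple pools.

```python
import ipaddress


def carve_blocks(pool_cidr: str, block_prefix: int = 26) -> list:
    """Split an IP pool into fixed-size allocation blocks."""
    pool = ipaddress.ip_network(pool_cidr)
    return list(pool.subnets(new_prefix=block_prefix))


class BlockAllocator:
    """Toy allocator: hands each requesting node the next unclaimed block."""

    def __init__(self, pool_cidr: str, block_prefix: int = 26):
        self._free = carve_blocks(pool_cidr, block_prefix)
        self.affinity = {}  # node name -> list of blocks that node owns

    def claim_block(self, node: str):
        block = self._free.pop(0)
        self.affinity.setdefault(node, []).append(block)
        return block


blocks = carve_blocks("192.168.0.0/16")
print(len(blocks), blocks[0].num_addresses)  # 1024 blocks of 64 addresses

allocator = BlockAllocator("192.168.0.0/16")
print(allocator.claim_block("node-a"))  # 192.168.0.0/26
print(allocator.claim_block("node-b"))  # 192.168.0.64/26
```

Because a node owns a whole block, other nodes need one route per block rather than one per pod, which is how block affinity keeps route churn low.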
eBPF Dataplane¶
Calico offers an alternative eBPF dataplane that replaces iptables with eBPF programs attached to TC hooks on network interfaces.
graph LR
subgraph "Pod Traffic Path"
POD["Pod"] -->|"veth pair"| TC_IN["TC ingress<br/>eBPF program"]
TC_IN -->|fast-path| TC_OUT["TC egress<br/>eBPF program"]
TC_OUT -->|"direct to dest pod<br/>or tunnel"| DEST["Destination Pod"]
end
subgraph "BPF Maps"
NAT_MAP["NAT frontend/backend<br/>map (services)"]
POL_MAP["Policy map<br/>(IP sets from selectors)"]
SVC_MAP["Service metadata map<br/>(ExternalTrafficPolicy,<br/>session affinity)"]
end
TC_IN -->|"lookup"| NAT_MAP
TC_IN -->|"policy check"| POL_MAP
TC_OUT -->|"service resolve"| SVC_MAP
subgraph "Connect-time LB"
SOCK["Socket BPF hook"] -->|"intercepts connect()"| DIRECT["Direct backend IP<br/>no NAT overhead"]
end
Key characteristics of the eBPF dataplane:
- Bypasses iptables entirely -- packets are processed by eBPF programs at TC hooks, skipping the iptables rule chains and the netfilter connection-tracking path.
- Connect-time load balancing -- hooks into socket BPF to intercept connect() syscalls. When a pod connects to a ClusterIP, Calico resolves the destination directly to a backend pod IP, eliminating per-packet NAT overhead for that connection.
- kube-proxy replacement -- in eBPF mode, Calico can fully replace kube-proxy, handling ClusterIP, NodePort, LoadBalancer, and ExternalIP services.
- Policy as eBPF bytecode -- network policies are compiled into optimized eBPF programs using BPF maps to store IP sets matched by label selectors.
eBPF dataplane requirements
Requires Linux kernel 5.3 or later (5.10+ recommended). Not all Calico features are available in eBPF mode -- consult the Calico documentation for feature parity details.
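Connect-time load balancing can be modelled in user space to show the idea: the service address is swapped for a backend once, when the connection is opened, rather than on every packet. The sketch below is a conceptual model in Python, not eBPF code; `ConnectTimeLB` is a hypothetical name, the addresses are made up, and round-robin selection is an assumption rather than Calico's documented backend-selection policy.

```python
import itertools


class ConnectTimeLB:
    """Conceptual model of resolving a service address at connect() time."""

    def __init__(self, services: dict):
        # (cluster_ip, port) -> endless round-robin iterator over backends
        self._backends = {
            key: itertools.cycle(backends) for key, backends in services.items()
        }

    def resolve(self, dest_ip: str, dest_port: int) -> tuple:
        """Return the address the socket should really connect to."""
        key = (dest_ip, dest_port)
        if key not in self._backends:
            return dest_ip, dest_port  # not a service: leave untouched
        return next(self._backends[key])


lb = ConnectTimeLB({
    ("10.96.0.10", 80): [("10.0.1.5", 8080), ("10.0.1.6", 8080)],
})

print(lb.resolve("10.96.0.10", 80))  # ('10.0.1.5', 8080)
print(lb.resolve("10.96.0.10", 80))  # ('10.0.1.6', 8080)
print(lb.resolve("10.0.2.9", 5432))  # ('10.0.2.9', 5432), passed through
```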
Datapath: Standard (iptables) Mode¶
sequenceDiagram
participant PodA as Pod A (Node 1)
participant Felix1 as Felix (Node 1)
participant FIB1 as Kernel FIB (Node 1)
participant BIRD1 as BIRD (Node 1)
participant BIRD2 as BIRD (Node 2)
participant FIB2 as Kernel FIB (Node 2)
participant Felix2 as Felix (Node 2)
participant PodB as Pod B (Node 2)
Note over PodA,PodB: Pod A sends packet to Pod B (different node)
PodA->>FIB1: Packet leaves pod veth
FIB1->>FIB1: Route lookup -> via Node 2
Note over FIB1: Felix programmed route for Pod B's IP via Node 2
FIB1->>FIB2: Packet forwarded over physical network
FIB2->>PodB: Route lookup -> local veth
PodB->>PodB: Packet received
Note over BIRD1,BIRD2: Control plane (parallel)
Felix1->>FIB1: Program route: Pod-A-CIDR -> local
BIRD1->>BIRD2: BGP UPDATE: advertise Pod-A-CIDR
BIRD2->>FIB2: Install route: Pod-A-CIDR -> Node 1
Felix2->>FIB2: Program route: Pod-B-CIDR -> local
BIRD2->>BIRD1: BGP UPDATE: advertise Pod-B-CIDR
BIRD1->>FIB1: Install route: Pod-B-CIDR -> Node 2
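The route lookups in the sequence above amount to a longest-prefix match against the node's FIB. A minimal sketch follows, assuming a two-entry table on Node 1; the addresses, interface name, and `lookup` helper are made up for illustration.

```python
import ipaddress


def lookup(fib: list, dest_ip: str):
    """Return the next hop of the most specific route covering dest_ip."""
    dest = ipaddress.ip_address(dest_ip)
    matches = [(net, hop) for net, hop in fib if dest in net]
    if not matches:
        return None  # no matching route in this toy table
    # Longest prefix wins, as in a kernel FIB lookup.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical FIB on Node 1.
node1_fib = [
    # Local pod route, programmed by Felix.
    (ipaddress.ip_network("10.244.0.5/32"), "dev cali-pod-a"),
    # Node 2's pod block, learned over BGP and installed by BIRD.
    (ipaddress.ip_network("10.244.1.0/26"), "via 192.0.2.12"),
]

print(lookup(node1_fib, "10.244.1.7"))  # via 192.0.2.12
print(lookup(node1_fib, "10.244.0.5"))  # dev cali-pod-a
print(lookup(node1_fib, "10.250.0.1"))  # None
```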
Networking Modes¶
| Mode | Inter-node transport | Requires BGP | Encapsulation | Use case |
|---|---|---|---|---|
| BGP (no overlay) | Direct routing via BGP | Yes | None | Bare-metal, on-prem, AWS |
| VXLAN overlay | Kernel VXLAN tunnels | No | VXLAN (UDP 4789) | Cloud environments without BGP support |
| IPIP overlay | IP-in-IP tunnels | Yes | IPIP (proto 4) | Legacy environments |
| eBPF | Direct or VXLAN | Optional | Optional | High-performance, kube-proxy-free |
Datastore Options¶
| Datastore | Description | When to use |
|---|---|---|
| Kubernetes API | Stores Calico state as CRDs in etcd (via kube-apiserver) | Default for most Kubernetes installs; Typha recommended |
| etcd v3 | Direct etcd connection, bypasses kube-apiserver | Large-scale deployments needing datastore isolation |
Sizing and Scalability¶
- Typha replicas -- 1 replica per 100--200 nodes. Deploy 3+ replicas for HA.
- BGP route reflectors -- recommended at roughly 100+ nodes, because a full mesh needs O(n^2) BGP sessions. Route reflectors carry control-plane traffic only (no data-plane forwarding).
- IPAM block size -- defaults to /26 (64 addresses per block). A node can hold more than one block. Tunable per IPPool.
- Felix CPU -- scales with endpoint churn rate. eBPF mode typically uses less CPU than iptables mode.
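The O(n^2) growth of a full mesh is easy to check numerically. The sketch below assumes every non-reflector node peers with every route reflector and that the reflectors are fully meshed among themselves; other topologies give different counts.

```python
def full_mesh_sessions(nodes: int) -> int:
    """BGP sessions when every node peers with every other node."""
    return nodes * (nodes - 1) // 2


def route_reflector_sessions(nodes: int, reflectors: int) -> int:
    """Sessions when clients peer only with reflectors (assumed topology)."""
    clients = nodes - reflectors
    client_sessions = clients * reflectors
    reflector_mesh = reflectors * (reflectors - 1) // 2
    return client_sessions + reflector_mesh


print(full_mesh_sessions(100))           # 4950
print(route_reflector_sessions(100, 3))  # 294
```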
How It Works¶
This section covers the Felix agent, the BIRD BGP daemon, the Typha fan-out proxy, and the pluggable data plane internals.
Component Architecture¶
sequenceDiagram
participant K8sAPI as Kubernetes API
participant Typha as Typha (fan-out proxy)
participant Felix as Felix (per-node agent)
participant DP as Data Plane (eBPF/iptables)
participant BIRD as BIRD (BGP daemon)
participant Network as Network Fabric
K8sAPI->>Typha: Watch NetworkPolicy, Pod, Node changes
Typha->>Felix: Fan-out updates (reduces API load)
Felix->>Felix: Calculate policy rules
Felix->>DP: Program eBPF maps / iptables rules
Felix->>BIRD: Update route table
BIRD->>Network: Advertise routes via BGP
Data Plane Options¶
| Data Plane | How It Works | Best For |
|---|---|---|
| iptables | Felix writes iptables chains with ipsets | Legacy, widest compatibility |
| eBPF | Felix loads eBPF programs (TC hooks) | Performance, kube-proxy replacement |
| nftables | Felix writes nftables rules | Newer kernels, atomic rule updates |
| VPP | Vector Packet Processing (userspace) | Telecom / NFV, extreme throughput |
| Windows HNS | Host Networking Service rules | Windows worker nodes |
Typha — Scale Proxy¶
Without Typha, every Felix agent watches the Kubernetes API directly. In large clusters (hundreds of nodes), the combined watch load can overwhelm the API server.
flowchart TB
subgraph Without["Without Typha (≤200 nodes)"]
API1["K8s API"] --> F1["Felix 1"]
API1 --> F2["Felix 2"]
API1 --> FN["Felix N"]
end
subgraph With["With Typha (200+ nodes)"]
API2["K8s API"] --> T1["Typha 1"]
API2 --> T2["Typha 2"]
T1 --> FA["Felix A"]
T1 --> FB["Felix B"]
T2 --> FC["Felix C"]
T2 --> FD["Felix D"]
end
style Without fill:#c62828,color:#fff
style With fill:#2e7d32,color:#fff
Network Policy Tiers (Calico Extended)¶
flowchart TB
Packet["Incoming Packet"] --> Security["Security Tier\n(highest priority)"]
Security --> Platform["Platform Tier"]
Platform --> App["Application Tier"]
App --> Default["Default Tier\n(K8s NetworkPolicy)"]
Default --> Allow["Allow / Deny"]
style Security fill:#c62828,color:#fff
style Platform fill:#e65100,color:#fff
style App fill:#1565c0,color:#fff
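Tiered evaluation can be sketched as an ordered walk over tiers in which a "pass" action hands the packet to the next tier. This is a simplified model, not Calico's full semantics: it assumes a first-match rule order within each tier, lets packets that match nothing in a tier fall through to the next one, and applies a caller-supplied default at the end. The tier contents and the `evaluate` function are hypothetical.

```python
def evaluate(tiers: list, packet: dict, default: str = "deny") -> tuple:
    """Walk tiers in priority order and return (action, deciding tier)."""
    for tier_name, rules in tiers:
        for matches, action in rules:
            if matches(packet):
                if action == "pass":
                    break  # defer the decision to the next tier
                return action, tier_name
    return default, None


# Made-up rules: each entry is (match predicate, action).
tiers = [
    ("security", [
        (lambda p: p["src"] == "203.0.113.7", "deny"),
        (lambda p: True, "pass"),
    ]),
    ("platform", [(lambda p: p["dport"] == 53, "allow")]),
    ("application", [(lambda p: p["dport"] == 443, "allow")]),
    ("default", []),
]

print(evaluate(tiers, {"src": "203.0.113.7", "dport": 443}))  # ('deny', 'security')
print(evaluate(tiers, {"src": "10.0.1.5", "dport": 443}))     # ('allow', 'application')
print(evaluate(tiers, {"src": "10.0.1.5", "dport": 22}))      # ('deny', None)
```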
Benchmarks¶
Scope
Performance characteristics, scaling limits, and resource consumption for Calico.
Dataplane Performance¶
| Dataplane | Throughput | Latency | CPU Overhead |
|---|---|---|---|
| iptables | 8-9 Gbps | 100 µs | High (10k+ rules) |
| eBPF | 9.5+ Gbps | 50 µs | Low |
| Windows HNS | 5-7 Gbps | 200 µs | Medium |
Scaling Limits¶
| Dimension | iptables | eBPF |
|---|---|---|
| Network policies | 1,000 | 10,000+ |
| Endpoints per node | 200 | 500+ |
| Total policies cluster-wide | 5,000 | 50,000+ |
| Policy evaluation time | 1-10ms | < 1ms |
Policy Sync Performance¶
| Policies | Felix Sync Time | Memory Usage |
|---|---|---|
| 100 | < 1s | 100MB |
| 1,000 | 2-5s | 300MB |
| 10,000 | 10-30s | 1GB+ |
Sourcing Status¶
Unsourced Performance Data
The performance numbers in this document are estimated from vendor documentation, community benchmarks, and engineering judgment. They do not represent controlled benchmarks with documented test conditions. Specific hardware configurations, software versions, and test methodologies were not recorded.
Use these figures as rough guidance only. For production capacity planning, run your own benchmarks against your specific workload and infrastructure.