Architecture¶
Overview¶
Istio is a service mesh that provides traffic management, security, and observability for microservices. It supports two data plane modes: the traditional sidecar model and the newer ambient mesh model. The control plane has been consolidated into a single binary called istiod, which merges the formerly separate Pilot, Citadel, and Galley components.
See also: index, security, operations
1. Control Plane -- Istiod¶
graph TB
subgraph "Istiod (Single Binary)"
PILOT["Pilot<br/>Traffic Management & xDS"]
CITADEL["Citadel<br/>Certificate Management"]
GALLEY["Galley<br/>Config Validation"]
end
subgraph "Configuration Input"
K8s["Kubernetes API Server"]
K8sCRDs["Istio CRDs<br/>(VirtualService, DestinationRule,<br/>AuthorizationPolicy, PeerAuthentication)"]
end
K8s -->|watches| PILOT
K8sCRDs -->|watches| PILOT
PILOT -->|xDS gRPC| SIDECARS
PILOT -->|xDS gRPC| ZTUNNEL
CITADEL -->|SPIFFE certs| SIDECARS
CITADEL -->|SPIFFE certs| ZTUNNEL
subgraph "Data Plane (Sidecar Mode)"
SIDECARS["Envoy Sidecar Proxies"]
end
subgraph "Data Plane (Ambient Mode)"
ZTUNNEL["ztunnel<br/>(Node-level DaemonSet)"]
WP["Waypoint Proxies<br/>(Per-namespace Envoy)"]
ZTUNNEL -->|HBONE tunnel| WP
end
Pilot (Traffic Management)¶
Pilot is the core traffic management component responsible for:
- Watching Kubernetes Service/Endpoint resources and Istio CRDs (VirtualService, DestinationRule, Gateway, ServiceEntry)
- Translating routing rules, load balancing policies, and failover configuration into Envoy xDS APIs (LDS, RDS, CDS, EDS, SDS)
- Streaming xDS configuration to all Envoy sidecar proxies and ztunnel node proxies via gRPC
Citadel (Certificate Authority)¶
Citadel handles identity and certificate management:
- Issues SPIFFE-format X.509 certificates to every workload in the mesh
- Identity format: spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
- Manages automatic certificate rotation (default 24-hour lifetime)
- Supports both self-signed CA and integration with external CAs (via CSR flow)
Galley (Configuration Validation)¶
Galley was formerly a separate component responsible for validating and transforming Istio configuration. In current Istio releases, Galley's functionality has been merged into istiod. It validates Istio CRDs before Pilot processes them and provides configuration introspection via istioctl analyze.
2. Data Plane -- Sidecar Mode¶
sequenceDiagram
participant K8s as Kubernetes API
participant Istiod as Istiod
participant MutWebhook as Mutating Webhook
participant Pod as Application Pod
participant Sidecar as Envoy Sidecar
participant App as Application Container
K8s->>MutWebhook: Pod creation event
MutWebhook->>MutWebhook: Inject istio-proxy container + istio-init container
MutWebhook->>Pod: Modified Pod spec with sidecar
Pod->>Sidecar: istio-proxy starts (pilot-agent)
Sidecar->>Istiod: Connect xDS stream
Istiod-->>Sidecar: Push xDS config (listeners, routes, clusters, secrets)
App->>Sidecar: Outbound traffic via iptables redirect
Sidecar->>Sidecar: Apply routing, load balancing, mTLS
Sidecar->>Sidecar: Forward to destination sidecar
Sidecar Injection¶
Istio uses a Kubernetes Mutating Admission Webhook to automatically inject the istio-proxy container (Envoy + pilot-agent) into application pods. Key details:
- Automatic injection -- enabled by labeling a namespace with istio-injection=enabled
- Revision-based injection -- supports canary upgrades by using istio.io/rev=<revision> labels
- istio-init container -- sets up iptables rules to intercept all inbound and outbound traffic to/from the application container
- pilot-agent -- manages the Envoy proxy lifecycle, bootstrap config generation, and health checking
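The label-based selection described above can be sketched in Python. This is a simplified model, not Istio's webhook logic: the function name `injection_target` is hypothetical, pod-level overrides and webhook selectors are ignored, and checking istio-injection before istio.io/rev is an assumption of this sketch.

```python
from typing import Mapping, Optional


def injection_target(namespace_labels: Mapping[str, str]) -> Optional[str]:
    """Return "default", a revision name, or None for an unlabeled namespace."""
    # Assumption: when both labels are present, istio-injection is checked first.
    if namespace_labels.get("istio-injection") == "enabled":
        return "default"
    # Revision-based injection: the label value names the control plane revision.
    revision = namespace_labels.get("istio.io/rev")
    if revision:
        return revision
    # No injection label: pods in this namespace are left unmodified.
    return None


print(injection_target({"istio-injection": "enabled"}))  # default
print(injection_target({"istio.io/rev": "canary"}))      # canary
print(injection_target({"team": "payments"}))            # None
```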
Traffic Flow (Sidecar Mode)¶
- Application container sends traffic to a Kubernetes Service
- iptables rules in the pod redirect outbound traffic to the Envoy sidecar (port 15001)
- Envoy applies routing rules from VirtualService, load balancing from DestinationRule
- Envoy establishes mTLS connection to the destination pod's sidecar
- Destination sidecar (port 15006) receives traffic and forwards to the application container
3. Data Plane -- Ambient Mesh¶
graph TB
subgraph "Node A"
PodA["Pod A<br/>(istio.io/dataplane-mode=ambient)"]
ZT_A["ztunnel<br/>(Node DaemonSet)"]
end
subgraph "Node B"
PodB["Pod B<br/>(istio.io/dataplane-mode=ambient)"]
ZT_B["ztunnel<br/>(Node DaemonSet)"]
end
subgraph "Namespace waypoint"
WP["Waypoint Proxy<br/>(Envoy, per-namespace)"]
end
PodA -->|traffic redirected via iptables/tc| ZT_A
ZT_A -->|HBONE tunnel on port 15008| ZT_B
ZT_B -->|plaintext to pod| PodB
ZT_A -.->|optional L7 path| WP
WP -.->|L7 policies applied| ZT_B
style ZT_A fill:#2d5a8a,color:#fff
style ZT_B fill:#2d5a8a,color:#fff
style WP fill:#8a2d5a,color:#fff
Workload Categories in Ambient Mode¶
| Category | Label | Behavior |
|---|---|---|
| Out of Mesh | (none) | Standard pod, no mesh features |
| In Mesh (L4) | istio.io/dataplane-mode=ambient | Traffic intercepted at L4 by ztunnel, mTLS enforced |
| In Mesh (L7) | istio.io/dataplane-mode=ambient + istio.io/use-waypoint | L4 by ztunnel plus L7 policies via waypoint proxy |
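As a rough illustration, the table can be read as a small classification function. The name `ambient_category` is hypothetical, and the sketch assumes the caller has already merged namespace and pod labels into one mapping; it treats any non-empty istio.io/use-waypoint value as a waypoint reference.

```python
from typing import Mapping


def ambient_category(labels: Mapping[str, str]) -> str:
    """Map effective labels to one of the three categories in the table."""
    if labels.get("istio.io/dataplane-mode") != "ambient":
        return "Out of Mesh"
    # Assumption: any non-empty value names a waypoint to route through.
    if labels.get("istio.io/use-waypoint"):
        return "In Mesh (L7)"
    return "In Mesh (L4)"


print(ambient_category({}))                                      # Out of Mesh
print(ambient_category({"istio.io/dataplane-mode": "ambient"}))  # In Mesh (L4)
```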
ztunnel (Node Proxy)¶
ztunnel is a purpose-built, high-performance node proxy written in Rust:
- Runs as a DaemonSet on every node in the cluster
- Implements L4 (TCP) traffic management: mTLS, traffic encryption, L4 authorization policies
- Uses xDS APIs to communicate with istiod for configuration and certificate distribution
- Multi-tenant: a single ztunnel serves all pods on its node, efficiently managing certificates for all local Service Accounts
- Uses the HBONE (HTTP-Based Overlay Network Environment) protocol for inter-node tunneling on port 15008
Waypoint Proxies (L7)¶
Waypoint proxies provide L7 traffic management capabilities in ambient mode:
- Typically deployed per-namespace (not per-pod)
- Built on Envoy to handle HTTP routing, retries, timeouts, circuit breaking, L7 authorization policies
- Traffic is routed through the waypoint only when L7 features are needed
- Enabled by setting the istio.io/use-waypoint label on the namespace or pod
Ambient vs Sidecar
Ambient mode removes the per-pod sidecar overhead. L4 security (mTLS) is provided by ztunnel at the node level. L7 features are opt-in via waypoint proxies. This reduces resource consumption and operational complexity for workloads that only need mTLS encryption.
4. xDS API Usage¶
Istio uses Envoy's xDS protocol to distribute configuration to data plane proxies:
| xDS API | Purpose | Consumers |
|---|---|---|
| LDS (Listener Discovery) | Inbound/outbound listeners | Sidecars, ztunnel, waypoint |
| RDS (Route Discovery) | HTTP routing rules | Sidecars, waypoint |
| CDS (Cluster Discovery) | Upstream clusters (services) | Sidecars, ztunnel, waypoint |
| EDS (Endpoint Discovery) | Individual endpoint addresses | Sidecars, ztunnel, waypoint |
| SDS (Secret Discovery) | TLS certificates and keys | Sidecars, ztunnel, waypoint |
Istiod pushes configuration incrementally. When a VirtualService or DestinationRule changes, only the affected proxies receive updated xDS resources.
5. SPIFFE Identity¶
Every workload in the mesh receives a SPIFFE identity in the format spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>.
This identity is encoded in the X.509 certificate SAN (Subject Alternative Name) and used for:
- mTLS peer authentication between sidecars and ztunnel proxies
- AuthorizationPolicy rules that restrict access based on source identity
- Audit logging and telemetry with source/destination identity fields
Example ztunnel log showing identity:
src.identity="spiffe://cluster.local/ns/default/sa/curl"
dst.identity="spiffe://cluster.local/ns/default/sa/bookinfo-details"
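Identities like the ones in this log can be split into their parts with a small parser. The following is a minimal sketch; `WorkloadIdentity` and `parse_spiffe_id` are hypothetical names, and the pattern accepts only the trust-domain/ns/sa layout used in this document.

```python
import re
from typing import NamedTuple


class WorkloadIdentity(NamedTuple):
    trust_domain: str
    namespace: str
    service_account: str


# spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
_SPIFFE_ID = re.compile(r"spiffe://([^/]+)/ns/([^/]+)/sa/([^/]+)")


def parse_spiffe_id(uri: str) -> WorkloadIdentity:
    """Split an Istio-style SPIFFE ID into its three components."""
    match = _SPIFFE_ID.fullmatch(uri)
    if match is None:
        raise ValueError(f"not an Istio-style SPIFFE ID: {uri!r}")
    return WorkloadIdentity(*match.groups())


src = parse_spiffe_id("spiffe://cluster.local/ns/default/sa/curl")
print(src.namespace, src.service_account)  # default curl
```

A parsed identity makes it straightforward to express checks such as "the source must come from namespace default", which is the kind of rule an AuthorizationPolicy encodes.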
6. Component Comparison¶
| Aspect | Sidecar Mode | Ambient Mode |
|---|---|---|
| Proxy location | Per-pod (istio-proxy) | Per-node (ztunnel) + per-namespace (waypoint) |
| Resource overhead | ~50 MB RAM per pod | ~100 MB per node + optional waypoint |
| L4 security | Full mTLS | Full mTLS via ztunnel |
| L7 features | Full (routing, retries, etc.) | Requires waypoint proxy |
| Network model | iptables redirect in pod | iptables/tc redirect at node |
| Suitable for | Per-pod L7 customization | Large-scale mTLS-first deployments |
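To see how the resource overhead row plays out, the rough figures from the table can be plugged into a short calculation. This is arithmetic on the table's approximate numbers only; the function names and the `waypoint_mb` parameter are hypothetical, and the table gives no waypoint figure, so it defaults to zero.

```python
SIDECAR_MB_PER_POD = 50    # "~50 MB RAM per pod" from the table
ZTUNNEL_MB_PER_NODE = 100  # "~100 MB per node" from the table


def sidecar_overhead_mb(pods: int) -> int:
    """Total proxy memory when every pod carries its own sidecar."""
    return pods * SIDECAR_MB_PER_POD


def ambient_overhead_mb(nodes: int, waypoint_mb: int = 0) -> int:
    """Total proxy memory for one ztunnel per node plus optional waypoints."""
    return nodes * ZTUNNEL_MB_PER_NODE + waypoint_mb


# Example cluster: 200 pods spread over 10 nodes, no waypoints.
print(sidecar_overhead_mb(200))  # 10000
print(ambient_overhead_mb(10))   # 1000
```

With these figures, ambient mode uses less proxy memory whenever a node hosts more than two meshed pods on average, before counting waypoints.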
Key Insight
Istio's architecture has consolidated from three separate components (Pilot, Citadel, Galley) into a single istiod binary, simplifying operations. The ambient mesh mode introduces a two-tier data plane (ztunnel for L4, waypoint for L7) that eliminates per-pod sidecar overhead for workloads that primarily need mTLS encryption.
How It Works¶
This section covers the ambient mode data path, ztunnel L4 processing, the HBONE tunnel, waypoint L7 routing, xDS configuration distribution, and the mTLS flow.
Ambient Mode Data Path¶
sequenceDiagram
participant PodA as Pod A
participant ZT_A as ztunnel (Node A)
participant ZT_B as ztunnel (Node B)
participant WP as Waypoint Proxy (optional L7)
participant PodB as Pod B
PodA->>ZT_A: TCP connect (intercepted via iptables)
ZT_A->>ZT_A: mTLS handshake (SPIFFE identity)
ZT_A->>ZT_A: L4 AuthorizationPolicy check
alt L7 policy needed
ZT_A->>WP: Forward via HBONE tunnel
WP->>WP: HTTP routing, retries, L7 policy
WP->>ZT_B: Forward to destination node
else L4 only
ZT_A->>ZT_B: Direct HBONE tunnel
end
ZT_B->>PodB: Deliver to destination pod
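The branch in the diagram above can be written as a function that returns the hops a connection takes. This is only a restatement of the diagram; `ambient_hops` is a hypothetical name, and the hop labels are copied from the diagram participants.

```python
from typing import List


def ambient_hops(l7_policy_needed: bool) -> List[str]:
    """List the hops from Pod A to Pod B for the L4-only and L7 cases."""
    hops = ["Pod A", "ztunnel (Node A)"]
    if l7_policy_needed:
        # The waypoint is inserted only when L7 features are required.
        hops.append("Waypoint Proxy")
    hops += ["ztunnel (Node B)", "Pod B"]
    return hops


print(" -> ".join(ambient_hops(l7_policy_needed=False)))
print(" -> ".join(ambient_hops(l7_policy_needed=True)))
```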
HBONE Tunnel¶
ztunnel uses the HTTP-Based Overlay Network Environment (HBONE) for inter-node traffic:
- Connection intercept: Node-level iptables rules redirect pod traffic to ztunnel (port 15006 inbound, 15001 outbound)
- HBONE CONNECT: ztunnel wraps the original TCP connection inside an HTTP/2 CONNECT request to the peer ztunnel
- mTLS establishment: The HTTP/2 connection uses mutual TLS with SPIFFE SVIDs as client/server certificates
- Data relay: Original bytes are tunneled through the HTTP/2 stream with minimal overhead
This is conceptually similar to the tunnel that kubectl proxy opens to the Kubernetes API server, but applied to every pod connection.
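The HBONE CONNECT step can be illustrated by building the pseudo-headers of an HTTP/2 CONNECT request, which by the HTTP/2 specification carries only :method and :authority. This is a sketch, not ztunnel's code: `connect_headers` is a hypothetical helper, placing the original destination address in :authority is an assumption, and any additional headers ztunnel may send are not covered here.

```python
import ipaddress
from typing import List, Tuple


def connect_headers(dest_ip: str, dest_port: int) -> List[Tuple[str, str]]:
    """Build the pseudo-headers for an HTTP/2 CONNECT to dest_ip:dest_port."""
    if not 0 < dest_port < 65536:
        raise ValueError(f"invalid port: {dest_port}")
    address = ipaddress.ip_address(dest_ip)  # raises ValueError on bad input
    # IPv6 literals must be bracketed inside an authority.
    host = f"[{address}]" if address.version == 6 else str(address)
    return [(":method", "CONNECT"), (":authority", f"{host}:{dest_port}")]


print(connect_headers("10.0.0.5", 8080))
```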
istiod Configuration Distribution¶
istiod, which absorbed the former Pilot component, translates Kubernetes resources (Service, Deployment, Istio CRDs) into Envoy xDS configuration:
flowchart TB
subgraph istiod_I["istiod"]
K8S_Watcher["K8s API Watcher"]
Config_Analysis["Config Analysis & Validation"]
XDS_Generator["xDS Generator"]
CA["Citadel CA\n(certificate signing)"]
end
subgraph Data_Plane["Data Plane"]
ZT["ztunnel / Envoy"]
end
K8S_Watcher --> Config_Analysis
Config_Analysis --> XDS_Generator
XDS_Generator -->|"xDS gRPC\n(delta or SOTW)"| ZT
CA -->|"Sign CSR"| ZT
style istiod_I fill:#5f6caf,color:#fff
xDS Resource Types¶
| xDS API | Purpose |
|---|---|
| LDS (Listener) | Inbound/outbound listener configuration |
| RDS (Route) | HTTP routing rules (VirtualService) |
| CDS (Cluster) | Upstream cluster definition (Service) |
| EDS (Endpoint) | Dynamic endpoint discovery (Endpoints/EndpointSlice) |
| SDS (Secret) | TLS certificate distribution |
istiod pushes configuration using delta xDS by default, which sends only changed resources rather than the full configuration snapshot. This reduces CPU and bandwidth at scale.
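The idea behind a delta push can be sketched by diffing two snapshots of resource versions. This illustrates the concept only and is not istiod's implementation; `compute_delta` and the resource names are hypothetical.

```python
from typing import Dict, List, Tuple


def compute_delta(
    previous: Dict[str, str], current: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Return (added or changed resources, removed resource names)."""
    changed = {
        name: version
        for name, version in current.items()
        if previous.get(name) != version
    }
    removed = sorted(name for name in previous if name not in current)
    return changed, removed


# Hypothetical cluster resources keyed by name, with a version string each.
before = {"reviews": "v1", "ratings": "v1", "details": "v1"}
after = {"reviews": "v2", "ratings": "v1", "productpage": "v1"}

changed, removed = compute_delta(before, after)
print(changed)  # {'reviews': 'v2', 'productpage': 'v1'}
print(removed)  # ['details']
```

A state-of-the-world push would resend all three entries of `after`; the delta carries two updates and one removal.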
mTLS Identity (SPIFFE)¶
flowchart LR
Istiod_C["istiod\n(Citadel CA)"] -->|"sign cert"| ZT["ztunnel /\nEnvoy sidecar"]
ZT -->|"present SPIFFE\nSVID"| Peer["Peer ztunnel"]
Peer -->|"verify cert\nchain"| Trust["Trust Bundle\n(root CA)"]
style Istiod_C fill:#5f6caf,color:#fff
Certificate Lifecycle¶
| Certificate | Lifetime | Rotation Trigger |
|---|---|---|
| Root CA | 10 years (self-signed) | Manual rotation |
| Intermediate CA | 1 year | Automatic via istiod |
| Workload certificate | 24 hours (default) | CSR sent when 50% of lifetime consumed |
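The workload certificate row implies a simple rotation schedule. A minimal sketch, assuming the 24-hour default lifetime and the 50% trigger from the table; `rotation_time` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def rotation_time(
    issued_at: datetime,
    lifetime: timedelta = timedelta(hours=24),
    fraction: float = 0.5,
) -> datetime:
    """Return when a new CSR is due: after `fraction` of the lifetime."""
    return issued_at + lifetime * fraction


issued = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
print(rotation_time(issued))  # 2024-01-01 12:00:00+00:00
```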
SPIFFE ID Format¶
spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
Sidecar Mode (Legacy)¶
In sidecar mode, an Envoy proxy is injected into each pod as a sidecar container. All pod traffic is redirected through iptables to the Envoy proxy, which applies L4/L7 policies. This mode has higher resource overhead (one Envoy per pod) but is more mature and supports all Istio features.
Benchmarks¶
Scope
Performance characteristics, scaling limits, and resource consumption for Istio.
Sidecar vs Ambient Performance¶
| Metric | Sidecar (Envoy) | Ambient (ztunnel) | Native |
|---|---|---|---|
| Latency (P50) | +1-2ms | +0.5ms | Baseline |
| Latency (P99) | +3-10ms | +1-3ms | Baseline |
| Throughput | 90-95% native | 95-98% native | 100% |
| Memory per pod | +50-100Mi | 0 (shared) | 0 |
| CPU per pod | +50-100m | 0 (shared) | 0 |
Control Plane Scaling¶
| Pods in Mesh | istiod CPU | istiod Memory | Config Push Time |
|---|---|---|---|
| 100 | 200m | 512Mi | < 1s |
| 1,000 | 1-2 cores | 2-4Gi | 1-5s |
| 5,000 | 4-8 cores | 8-16Gi | 5-15s |
| 10,000 | 8-16 cores | 16-32Gi | 15-30s |
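For quick estimates, the table can be turned into a lookup that rounds a pod count up to the next listed tier. The values are copied from the table and carry the same caveats about sourcing; `sizing_for` and `TIERS` are hypothetical names, and rounding up rather than interpolating is a choice made for this sketch.

```python
import bisect
from typing import Tuple

# (max pods, istiod CPU, istiod memory, config push time), from the table.
TIERS = [
    (100, "200m", "512Mi", "< 1s"),
    (1000, "1-2 cores", "2-4Gi", "1-5s"),
    (5000, "4-8 cores", "8-16Gi", "5-15s"),
    (10000, "8-16 cores", "16-32Gi", "15-30s"),
]


def sizing_for(pods: int) -> Tuple[int, str, str, str]:
    """Return the smallest table tier that covers the given pod count."""
    limits = [tier[0] for tier in TIERS]
    index = bisect.bisect_left(limits, pods)
    if index == len(TIERS):
        raise ValueError(f"{pods} pods exceeds the largest tier in the table")
    return TIERS[index]


print(sizing_for(3000))  # (5000, '4-8 cores', '8-16Gi', '5-15s')
```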
Scaling Limits¶
| Dimension | Limit | Notes |
|---|---|---|
| Pods per mesh | 10,000+ | Single istiod |
| Services | 5,000+ | xDS push complexity |
| Namespaces | 1,000+ | |
| VirtualServices | 5,000+ | Envoy route table size |
| Gateways | 100+ | Resource consumption |
Sourcing Status¶
Unsourced Performance Data
The performance numbers in this document are estimated from vendor documentation, community benchmarks, and engineering judgment. They do not represent controlled benchmarks with documented test conditions. Specific hardware configurations, software versions, and test methodologies were not recorded.
Use these figures as rough guidance only. For production capacity planning, run your own benchmarks against your specific workload and infrastructure.