Architecture

Overview

Istio is a service mesh that provides traffic management, security, and observability for microservices. It supports two data plane modes: the traditional sidecar model and the newer ambient mesh model. The control plane has been consolidated into a single binary called istiod, which merges the formerly separate Pilot, Citadel, and Galley components.

See also: index, security, operations

1. Control Plane -- Istiod

graph TB
    subgraph "Istiod (Single Binary)"
        PILOT["Pilot<br/>Traffic Management & xDS"]
        CITADEL["Citadel<br/>Certificate Management"]
        GALLEY["Galley<br/>Config Validation"]
    end

    subgraph "Configuration Input"
        K8s["Kubernetes API Server"]
        K8sCRDs["Istio CRDs<br/>(VirtualService, DestinationRule,<br/>AuthorizationPolicy, PeerAuthentication)"]
    end

    K8s -->|watches| PILOT
    K8sCRDs -->|watches| PILOT
    PILOT -->|xDS gRPC| SIDECARS
    PILOT -->|xDS gRPC| ZTUNNEL
    CITADEL -->|SPIFFE certs| SIDECARS
    CITADEL -->|SPIFFE certs| ZTUNNEL

    subgraph "Data Plane (Sidecar Mode)"
        SIDECARS["Envoy Sidecar Proxies"]
    end

    subgraph "Data Plane (Ambient Mode)"
        ZTUNNEL["ztunnel<br/>(Node-level DaemonSet)"]
        WP["Waypoint Proxies<br/>(Per-namespace Envoy)"]
        ZTUNNEL -->|HBONE tunnel| WP
    end

Pilot (Traffic Management)

Pilot is the core traffic management component responsible for:

  • Watching Kubernetes Service/Endpoint resources and Istio CRDs (VirtualService, DestinationRule, Gateway, ServiceEntry)
  • Translating routing rules, load balancing policies, and failover configuration into Envoy xDS APIs (LDS, RDS, CDS, EDS, SDS)
  • Streaming xDS configuration to all Envoy sidecar proxies and ztunnel node proxies via gRPC

Citadel (Certificate Authority)

Citadel handles identity and certificate management:

  • Issues SPIFFE-format X.509 certificates to every workload in the mesh
  • Identity format: spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
  • Manages automatic certificate rotation (default 24-hour lifetime)
  • Supports both self-signed CA and integration with external CAs (via CSR flow)

Galley (Configuration Validation)

Galley was formerly a separate component responsible for validating and transforming Istio configuration. In current Istio releases, Galley's functionality has been merged into istiod. It validates Istio CRDs before Pilot processes them and provides configuration introspection via istioctl analyze.

2. Data Plane -- Sidecar Mode

sequenceDiagram
    participant K8s as Kubernetes API
    participant Istiod as Istiod
    participant MutWebhook as Mutating Webhook
    participant Pod as Application Pod
    participant Sidecar as Envoy Sidecar
    participant App as Application Container

    K8s->>MutWebhook: Pod creation event
    MutWebhook->>MutWebhook: Inject istio-proxy container + istio-init container
    MutWebhook->>Pod: Modified Pod spec with sidecar
    Pod->>Sidecar: istio-proxy starts (pilot-agent)
    Sidecar->>Istiod: Connect xDS stream
    Istiod-->>Sidecar: Push xDS config (listeners, routes, clusters, secrets)
    App->>Sidecar: Outbound traffic via iptables redirect
    Sidecar->>Sidecar: Apply routing, load balancing, mTLS
    Sidecar->>Sidecar: Forward to destination sidecar

Sidecar Injection

Istio uses a Kubernetes Mutating Admission Webhook to automatically inject the istio-proxy container (Envoy + pilot-agent) into application pods. Key details:

  • Automatic injection -- enabled by labeling a namespace with istio-injection=enabled
  • Revision-based injection -- supports canary upgrades by using istio.io/rev=<revision> labels
  • istio-init container -- sets up iptables rules to intercept all inbound and outbound traffic to/from the application container
  • pilot-agent -- manages the Envoy proxy lifecycle, bootstrap config generation, and health checking
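
As a rough illustration of the label-driven decision described above, the following sketch checks only the two namespace labels named in this section. The function name and the "default" revision string are assumptions made for this example, and the real webhook considers more inputs (for instance pod-level settings) than this simplified model does.

```python
from typing import Mapping, Optional


def injection_revision(namespace_labels: Mapping[str, str]) -> Optional[str]:
    """Return the revision expected to inject a sidecar, or None.

    Simplified model: only the namespace labels from the list above
    are consulted. "default" is a placeholder name for the
    non-revisioned control plane (an assumption for this sketch).
    """
    if namespace_labels.get("istio-injection") == "enabled":
        return "default"
    revision = namespace_labels.get("istio.io/rev")
    if revision:
        return revision
    return None


if __name__ == "__main__":
    print(injection_revision({"istio-injection": "enabled"}))
    print(injection_revision({"istio.io/rev": "canary"}))
    print(injection_revision({"team": "payments"}))
```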

Traffic Flow (Sidecar Mode)

  1. Application container sends traffic to a Kubernetes Service
  2. iptables rules in the pod redirect outbound traffic to the Envoy sidecar (port 15001)
  3. Envoy applies routing rules from VirtualService, load balancing from DestinationRule
  4. Envoy establishes mTLS connection to the destination pod's sidecar
  5. Destination sidecar (port 15006) receives traffic and forwards to the application container

3. Data Plane -- Ambient Mesh

graph TB
    subgraph "Node A"
        PodA["Pod A<br/>(istio.io/dataplane-mode=ambient)"]
        ZT_A["ztunnel<br/>(Node DaemonSet)"]
    end

    subgraph "Node B"
        PodB["Pod B<br/>(istio.io/dataplane-mode=ambient)"]
        ZT_B["ztunnel<br/>(Node DaemonSet)"]
    end

    subgraph "Namespace waypoint"
        WP["Waypoint Proxy<br/>(Envoy, per-namespace)"]
    end

    PodA -->|traffic redirected via iptables/tc| ZT_A
    ZT_A -->|HBONE tunnel on port 15008| ZT_B
    ZT_B -->|plaintext to pod| PodB

    ZT_A -.->|optional L7 path| WP
    WP -.->|L7 policies applied| ZT_B

    style ZT_A fill:#2d5a8a,color:#fff
    style ZT_B fill:#2d5a8a,color:#fff
    style WP fill:#8a2d5a,color:#fff

Workload Categories in Ambient Mode

| Category | Label | Behavior |
|---|---|---|
| Out of Mesh | (none) | Standard pod, no mesh features |
| In Mesh (L4) | istio.io/dataplane-mode=ambient | Traffic intercepted at L4 by ztunnel, mTLS enforced |
| In Mesh (L7) | istio.io/dataplane-mode=ambient + istio.io/use-waypoint | L4 by ztunnel plus L7 policies via waypoint proxy |
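
The three categories can be expressed as a small classification function. This is a minimal sketch that looks only at the two labels listed in the table; the function and constant names are made up for illustration, and it treats any non-empty istio.io/use-waypoint value as opting in to L7.

```python
from typing import Mapping

AMBIENT_LABEL = "istio.io/dataplane-mode"
WAYPOINT_LABEL = "istio.io/use-waypoint"


def workload_category(labels: Mapping[str, str]) -> str:
    """Map the effective labels of a workload to a table category."""
    if labels.get(AMBIENT_LABEL) != "ambient":
        return "Out of Mesh"
    if labels.get(WAYPOINT_LABEL):
        # A waypoint is named, so L7 policy is applied on top of L4.
        return "In Mesh (L7)"
    return "In Mesh (L4)"


if __name__ == "__main__":
    print(workload_category({}))
    print(workload_category({AMBIENT_LABEL: "ambient"}))
    print(workload_category({AMBIENT_LABEL: "ambient", WAYPOINT_LABEL: "waypoint"}))
```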

ztunnel (Node Proxy)

ztunnel is a purpose-built, high-performance node proxy written in Rust:

  • Runs as a DaemonSet on every node in the cluster
  • Implements L4 (TCP) traffic management: mTLS, traffic encryption, L4 authorization policies
  • Uses xDS APIs to communicate with istiod for configuration and certificate distribution
  • Multi-tenant: a single ztunnel serves all pods on its node, efficiently managing certificates for all local Service Accounts
  • Uses the HBONE (HTTP-Based Overlay Network Environment) protocol for inter-node tunneling on port 15008

Waypoint Proxies (L7)

Waypoint proxies provide L7 traffic management capabilities in ambient mode:

  • Typically deployed per-namespace (not per-pod)
  • Built on Envoy to handle HTTP routing, retries, timeouts, circuit breaking, L7 authorization policies
  • Traffic is routed through the waypoint only when L7 features are needed
  • Enabled by setting the istio.io/use-waypoint label on the namespace or pod
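
To make the "waypoint only when needed" rule concrete, the sketch below lists the proxy hops a connection would traverse. The hop naming scheme and the function name are hypothetical; it models only the ordering described in this section, not actual ztunnel routing logic.

```python
from typing import List, Optional


def ambient_path(src_node: str, dst_node: str, waypoint: Optional[str]) -> List[str]:
    """Return the ordered proxy hops between two ambient workloads.

    `waypoint` is the name of the waypoint the destination uses,
    or None when only L4 handling is required.
    """
    hops = [f"ztunnel@{src_node}"]
    if waypoint:
        # L7 features requested: detour through the waypoint proxy.
        hops.append(f"waypoint:{waypoint}")
    hops.append(f"ztunnel@{dst_node}")
    return hops


if __name__ == "__main__":
    print(ambient_path("node-a", "node-b", None))
    print(ambient_path("node-a", "node-b", "default-waypoint"))
```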

Ambient vs Sidecar

Ambient mode removes the per-pod sidecar overhead. L4 security (mTLS) is provided by ztunnel at the node level. L7 features are opt-in via waypoint proxies. This reduces resource consumption and operational complexity for workloads that only need mTLS encryption.

4. xDS API Usage

Istio uses Envoy's xDS protocol to distribute configuration to data plane proxies:

| xDS API | Purpose | Consumers |
|---|---|---|
| LDS (Listener Discovery) | Inbound/outbound listeners | Sidecars, ztunnel, waypoint |
| RDS (Route Discovery) | HTTP routing rules | Sidecars, waypoint |
| CDS (Cluster Discovery) | Upstream clusters (services) | Sidecars, ztunnel, waypoint |
| EDS (Endpoint Discovery) | Individual endpoint addresses | Sidecars, ztunnel, waypoint |
| SDS (Secret Discovery) | TLS certificates and keys | Sidecars, ztunnel, waypoint |

Istiod pushes configuration incrementally. When a VirtualService or DestinationRule changes, only the affected proxies receive updated xDS resources.
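
One way to picture "only the affected proxies" is a dependency lookup: each proxy tracks the hosts it routes to, and a change to one host selects just the proxies that reference it. The data model below is a hypothetical simplification for illustration and is not istiod's internal representation.

```python
from typing import Dict, List, Set


def proxies_to_push(dependencies: Dict[str, Set[str]], changed_host: str) -> List[str]:
    """Return the proxies whose configuration references `changed_host`.

    `dependencies` maps a proxy ID to the set of service hosts it
    depends on (an assumed, simplified model).
    """
    return sorted(
        proxy for proxy, hosts in dependencies.items() if changed_host in hosts
    )


if __name__ == "__main__":
    deps = {
        "frontend-1": {"reviews.default", "ratings.default"},
        "reviews-1": {"ratings.default"},
        "billing-1": {"ledger.finance"},
    }
    # A DestinationRule for ratings.default changes:
    print(proxies_to_push(deps, "ratings.default"))
```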

5. SPIFFE Identity

Every workload in the mesh receives a SPIFFE identity in the format:

spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>

This identity is encoded in the X.509 certificate SAN (Subject Alternative Name) and used for:

  • mTLS peer authentication between sidecars and ztunnel proxies
  • AuthorizationPolicy rules that restrict access based on source identity
  • Audit logging and telemetry with source/destination identity fields

Example ztunnel log showing identity:

src.identity="spiffe://cluster.local/ns/default/sa/curl"
dst.identity="spiffe://cluster.local/ns/default/sa/bookinfo-details"
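
Identity strings like the ones in this log follow the fixed layout given earlier, so they can be split into their parts with a few lines of code. The sketch below is a minimal parser under that assumption; the type and function names are illustrative and it does not validate every rule of the SPIFFE specification.

```python
from typing import NamedTuple


class WorkloadIdentity(NamedTuple):
    trust_domain: str
    namespace: str
    service_account: str


def parse_spiffe_id(uri: str) -> WorkloadIdentity:
    """Parse spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>."""
    prefix = "spiffe://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a SPIFFE URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    # Expected layout: [trust-domain, "ns", namespace, "sa", service-account]
    if len(parts) != 5 or parts[1] != "ns" or parts[3] != "sa":
        raise ValueError(f"unexpected SPIFFE path layout: {uri!r}")
    if not (parts[0] and parts[2] and parts[4]):
        raise ValueError(f"empty component in SPIFFE URI: {uri!r}")
    return WorkloadIdentity(parts[0], parts[2], parts[4])


if __name__ == "__main__":
    print(parse_spiffe_id("spiffe://cluster.local/ns/default/sa/curl"))
```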

6. Component Comparison

| Aspect | Sidecar Mode | Ambient Mode |
|---|---|---|
| Proxy location | Per-pod (istio-proxy) | Per-node (ztunnel) + per-namespace (waypoint) |
| Resource overhead | ~50 MB RAM per pod | ~100 MB per node + optional waypoint |
| L4 security | Full mTLS | Full mTLS via ztunnel |
| L7 features | Full (routing, retries, etc.) | Requires waypoint proxy |
| Network model | iptables redirect in pod | iptables/tc redirect at node |
| Suitable for | Per-pod L7 customization | Large-scale mTLS-first deployments |
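
The resource overhead row lends itself to a back-of-the-envelope comparison. The sketch below uses the approximate figures from the table (about 50 MB per pod, about 100 MB per node) as defaults; the table gives no waypoint figure, so that cost is left as a caller-supplied parameter, and the cluster size in the example is hypothetical.

```python
def sidecar_memory_mb(pods: int, per_pod_mb: int = 50) -> int:
    """Proxy memory in sidecar mode: one Envoy per pod."""
    return pods * per_pod_mb


def ambient_memory_mb(
    nodes: int,
    per_node_mb: int = 100,
    waypoints: int = 0,
    per_waypoint_mb: int = 0,  # not given in the table; supply your own estimate
) -> int:
    """Proxy memory in ambient mode: one ztunnel per node plus waypoints."""
    return nodes * per_node_mb + waypoints * per_waypoint_mb


if __name__ == "__main__":
    # Hypothetical cluster: 1,000 pods spread over 20 nodes, no waypoints.
    print(sidecar_memory_mb(1000))  # 50000
    print(ambient_memory_mb(20))    # 2000
```

These are rough multipliers rather than measurements; actual proxy memory depends on mesh size and configuration.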

Key Insight

Istio's architecture has consolidated from three separate components (Pilot, Citadel, Galley) into a single istiod binary, simplifying operations. The ambient mesh mode introduces a two-tier data plane (ztunnel for L4, waypoint for L7) that eliminates per-pod sidecar overhead for workloads that primarily need mTLS encryption.


How It Works

This section covers the ambient mode data path, ztunnel L4 processing, the HBONE tunnel, waypoint L7 routing, xDS configuration distribution, and the mTLS flow.

Ambient Mode Data Path

sequenceDiagram
    participant PodA as Pod A
    participant ZT_A as ztunnel (Node A)
    participant ZT_B as ztunnel (Node B)
    participant WP as Waypoint Proxy (optional L7)
    participant PodB as Pod B

    PodA->>ZT_A: TCP connect (intercepted via iptables)
    ZT_A->>ZT_A: mTLS handshake (SPIFFE identity)
    ZT_A->>ZT_A: L4 AuthorizationPolicy check
    alt L7 policy needed
        ZT_A->>WP: Forward via HBONE tunnel
        WP->>WP: HTTP routing, retries, L7 policy
        WP->>ZT_B: Forward to destination node
    else L4 only
        ZT_A->>ZT_B: Direct HBONE tunnel
    end
    ZT_B->>PodB: Deliver to destination pod

HBONE Tunnel

ztunnel uses the HTTP-Based Overlay Network Environment (HBONE) for inter-node traffic:

  1. Connection intercept: Node-level iptables rules redirect pod traffic to ztunnel (port 15006 inbound, 15001 outbound)
  2. HBONE CONNECT: ztunnel wraps the original TCP connection inside an HTTP/2 CONNECT request to the peer ztunnel
  3. mTLS establishment: The HTTP/2 connection uses mutual TLS with SPIFFE SVIDs as client/server certificates
  4. Data relay: Original bytes are tunneled through the HTTP/2 stream with minimal overhead

This is loosely similar to the tunnel that kubectl proxy opens to the Kubernetes API server, but applied to every pod connection.
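
The CONNECT step can be pictured as a pair of values: the tunnel endpoint to dial and the HTTP/2 request headers sent over it. The sketch below only builds those values as plain data. It assumes the :authority pseudo-header carries the original destination address; the function name and parameters are illustrative and nothing here opens a real connection.

```python
from typing import List, Tuple

HBONE_PORT = 15008  # tunnel port named in this document


def hbone_connect(
    peer_ip: str, dst_ip: str, dst_port: int
) -> Tuple[Tuple[str, int], List[Tuple[str, str]]]:
    """Return (endpoint to dial, HTTP/2 CONNECT request headers).

    Assumption for this sketch: the original destination travels in
    the :authority pseudo-header. An HTTP/2 CONNECT request carries
    no :scheme or :path pseudo-headers.
    """
    endpoint = (peer_ip, HBONE_PORT)
    headers = [
        (":method", "CONNECT"),
        (":authority", f"{dst_ip}:{dst_port}"),
    ]
    return endpoint, headers


if __name__ == "__main__":
    endpoint, headers = hbone_connect("10.0.2.7", "10.0.2.7", 8080)
    print(endpoint)
    for name, value in headers:
        print(f"{name}: {value}")
```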

istiod Configuration Distribution

The xDS server inside istiod (the part formerly shipped as Pilot) translates Kubernetes resources (Service, Deployment, Istio CRDs) into Envoy xDS configuration:

flowchart TB
    subgraph istiod_I["istiod"]
        K8S_Watcher["K8s API Watcher"]
        Config_Analysis["Config Analysis & Validation"]
        XDS_Generator["xDS Generator"]
        CA["Citadel CA\n(certificate signing)"]
    end

    subgraph Data_Plane["Data Plane"]
        ZT["ztunnel / Envoy"]
    end

    K8S_Watcher --> Config_Analysis
    Config_Analysis --> XDS_Generator
    XDS_Generator -->|"xDS gRPC\n(delta or SOTW)"| ZT
    CA -->|"Sign CSR"| ZT

    style istiod_I fill:#5f6caf,color:#fff

xDS Resource Types

| xDS API | Purpose |
|---|---|
| LDS (Listener) | Inbound/outbound listener configuration |
| RDS (Route) | HTTP routing rules (VirtualService) |
| CDS (Cluster) | Upstream cluster definition (Service) |
| EDS (Endpoint) | Dynamic endpoint discovery (Endpoints/EndpointSlice) |
| SDS (Secret) | TLS certificate distribution |

istiod pushes configuration using delta xDS by default, which sends only changed resources rather than the full configuration snapshot. This reduces CPU and bandwidth at scale.
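
The difference between a delta push and a full state-of-the-world (SOTW) snapshot comes down to a diff between two resource sets. The sketch below computes that diff over a hypothetical name-to-version map; it illustrates the idea only and is not the xDS wire protocol.

```python
from typing import Dict, List, Tuple


def delta_update(
    previous: Dict[str, str], current: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Return (added or changed resources, removed resource names).

    Both arguments map a resource name to a version string. A SOTW
    push would send all of `current`; a delta push sends only the
    two values returned here.
    """
    changed = {
        name: version
        for name, version in current.items()
        if previous.get(name) != version
    }
    removed = sorted(name for name in previous if name not in current)
    return changed, removed


if __name__ == "__main__":
    before = {"cluster-a": "v1", "cluster-b": "v1", "cluster-c": "v1"}
    after = {"cluster-a": "v1", "cluster-b": "v2", "cluster-d": "v1"}
    print(delta_update(before, after))
```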

mTLS Identity (SPIFFE)

flowchart LR
    Istiod_C["istiod\n(Citadel CA)"] -->|"sign cert"| ZT["ztunnel /\nEnvoy sidecar"]
    ZT -->|"present SPIFFE\nSVID"| Peer["Peer ztunnel"]
    Peer -->|"verify cert\nchain"| Trust["Trust Bundle\n(root CA)"]

    style Istiod_C fill:#5f6caf,color:#fff

Certificate Lifecycle

| Certificate | Lifetime | Rotation Trigger |
|---|---|---|
| Root CA | 10 years (self-signed) | Manual rotation |
| Intermediate CA | 1 year | Automatic via istiod |
| Workload certificate | 24 hours (default) | CSR sent when 50% of lifetime consumed |
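
The workload certificate row implies a simple calculation: the renewal CSR goes out once half of the lifetime has elapsed. The sketch below works that out from an issue time, using the 24-hour lifetime and 50% trigger from the table as defaults. The function name is illustrative, and a real agent may add jitter or other adjustments that are not modeled here.

```python
from datetime import datetime, timedelta, timezone


def rotation_time(
    issued_at: datetime,
    lifetime: timedelta = timedelta(hours=24),
    fraction: float = 0.5,
) -> datetime:
    """Return when the renewal CSR is due for a workload certificate."""
    return issued_at + lifetime * fraction


if __name__ == "__main__":
    issued = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
    # With the defaults, renewal is due 12 hours after issuance.
    print(rotation_time(issued).isoformat())
```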

SPIFFE ID Format

spiffe://cluster.local/ns/<namespace>/sa/<service-account>

Sidecar Mode (Legacy)

In sidecar mode, an Envoy proxy is injected into each pod as a sidecar container. All pod traffic is redirected through iptables to the Envoy proxy, which applies L4/L7 policies. This mode has higher resource overhead (one Envoy per pod) but is more mature and supports all Istio features.


Benchmarks

Scope

Performance characteristics, scaling limits, and resource consumption for Istio.

Sidecar vs Ambient Performance

| Metric | Sidecar (Envoy) | Ambient (ztunnel) | Native |
|---|---|---|---|
| Latency (P50) | +1-2ms | +0.5ms | Baseline |
| Latency (P99) | +3-10ms | +1-3ms | Baseline |
| Throughput | 90-95% native | 95-98% native | 100% |
| Memory per pod | +50-100Mi | 0 (shared) | 0 |
| CPU per pod | +50-100m | 0 (shared) | 0 |

Control Plane Scaling

| Pods in Mesh | istiod CPU | istiod Memory | Config Push Time |
|---|---|---|---|
| 100 | 200m | 512Mi | < 1s |
| 1,000 | 1-2 cores | 2-4Gi | 1-5s |
| 5,000 | 4-8 cores | 8-16Gi | 5-15s |
| 10,000 | 8-16 cores | 16-32Gi | 15-30s |

Scaling Limits

| Dimension | Limit | Notes |
|---|---|---|
| Pods per mesh | 10,000+ | Single istiod |
| Services | 5,000+ | xDS push complexity |
| Namespaces | 1,000+ | |
| VirtualServices | 5,000+ | Envoy route table size |
| Gateways | 100+ | Resource consumption |

Sourcing Status

Unsourced Performance Data

The performance numbers in this document are estimated from vendor documentation, community benchmarks, and engineering judgment. They do not represent controlled benchmarks with documented test conditions. Specific hardware configurations, software versions, and test methodologies were not recorded.

Use these figures as rough guidance only. For production capacity planning, run your own benchmarks against your specific workload and infrastructure.
