
Architecture

Overview

Linkerd is an ultralight, security-first service mesh for Kubernetes. Its data plane uses linkerd2-proxy, a Rust-based micro-proxy designed specifically for the service mesh use case. The control plane runs as a small set of Kubernetes deployments. Linkerd prioritizes simplicity, minimal resource footprint, and automatic mTLS by default.

See also: index, security, operations

1. Control Plane Components

graph TB
    subgraph "Control Plane (linkerd namespace)"
        DEST["destination<br/>Service discovery & routing"]
        IDENTITY["identity<br/>Certificate authority (mTLS)"]
        INJECTOR["proxy-injector<br/>Mutating admission webhook"]
        HEARTBEAT["heartbeat<br/>Health & usage reporting"]
        WEB["web<br/>Dashboard UI"]
    end

    subgraph "Configuration"
        K8s["Kubernetes API"]
        SP["Service Profiles<br/>(linkerd.io/v1alpha2)"]
    end

    subgraph "Data Plane"
        PROXY_A["linkerd2-proxy<br/>(Rust, per-pod sidecar)"]
        PROXY_B["linkerd2-proxy<br/>(Rust, per-pod sidecar)"]
    end

    K8s -->|watches Services, Endpoints| DEST
    SP -->|retries, timeouts, routing| DEST
    DEST -->|service discovery info| PROXY_A
    DEST -->|service discovery info| PROXY_B
    IDENTITY -->|issues TLS certs| PROXY_A
    IDENTITY -->|issues TLS certs| PROXY_B
    K8s -->|pod creation events| INJECTOR
    INJECTOR -->|injects proxy + init containers| PROXY_A
    INJECTOR -->|injects proxy + init containers| PROXY_B
    PROXY_A <-->|automatic mTLS| PROXY_B

destination Service

The destination component is Linkerd's equivalent of a service discovery and routing engine:

  • Watches Kubernetes Services, Endpoints, and ServiceProfile resources
  • Provides dynamic service discovery information to linkerd2-proxy instances via gRPC
  • Supplies retry budgets, timeouts, and load balancing configuration from Service Profiles
  • Handles traffic splitting for canary and blue/green deployments
  • Returns endpoint metadata including zone information for topology-aware routing

identity Controller

The identity component is Linkerd's built-in certificate authority:

  • Issues TLS certificates to every linkerd2-proxy instance using a trust anchor chain
  • Certificate identity is derived from the pod's Kubernetes Service Account
  • Identity format: <service-account>.<namespace>.serviceaccount.identity.linkerd.cluster.local
  • Manages automatic certificate rotation with a configurable lifetime (default 24 hours)
  • Root of trust is the trust anchor certificate (ECDSA, provisioned during install)
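The identity format above can be sketched as a small helper. This is an illustration only: the function name is hypothetical, and it assumes the `linkerd` and `cluster.local` segments correspond to the control plane namespace and the cluster domain.

```python
def proxy_identity(service_account: str, namespace: str,
                   control_plane_ns: str = "linkerd",
                   cluster_domain: str = "cluster.local") -> str:
    """Build the DNS-like identity name for a pod's Service Account.

    Assumption: the trailing segments are the control plane namespace
    and the cluster domain; both defaults are illustrative.
    """
    return (f"{service_account}.{namespace}.serviceaccount."
            f"identity.{control_plane_ns}.{cluster_domain}")


print(proxy_identity("web", "emojivoto"))
# web.emojivoto.serviceaccount.identity.linkerd.cluster.local
```

Because the name is derived from the Service Account rather than the pod, all replicas that share a Service Account share one identity.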

proxy-injector

The proxy-injector is a Kubernetes Mutating Admission Webhook that automatically injects the linkerd2-proxy sidecar into pods:

  • Triggered by the linkerd.io/inject: enabled annotation on namespaces or pods
  • Adds the linkerd2-proxy container and a linkerd-init container (iptables setup) to the pod spec
  • Configures proxy startup parameters: destination service address, identity service address, proxy metrics port
  • Supports annotation-based configuration for proxy resources (CPU, memory limits) and log levels
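The annotation check can be sketched as follows. This is a simplified model, not the webhook's code: `should_inject` is a hypothetical name, and the sketch assumes a pod-level annotation takes precedence over the namespace-level one.

```python
INJECT_KEY = "linkerd.io/inject"


def should_inject(namespace_annotations: dict, pod_annotations: dict) -> bool:
    """Decide whether a pod gets the sidecar.

    Assumption: the pod annotation, when present, overrides the
    namespace annotation; anything other than "enabled" means no injection.
    """
    value = pod_annotations.get(INJECT_KEY, namespace_annotations.get(INJECT_KEY))
    return value == "enabled"


# Namespace opts in, one pod opts out.
print(should_inject({INJECT_KEY: "enabled"}, {}))                        # True
print(should_inject({INJECT_KEY: "enabled"}, {INJECT_KEY: "disabled"}))  # False
```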

heartbeat

A CronJob that periodically reports cluster health and anonymized usage data back to Buoyant (can be disabled). It also validates that the control plane is functioning correctly.

web (Dashboard)

Provides the Linkerd dashboard UI for visualizing service mesh health, traffic metrics, and topology. Optional -- not required for production data plane operation.

2. Data Plane -- linkerd2-proxy

sequenceDiagram
    participant K8s as Kubernetes API
    participant Injector as proxy-injector
    participant Pod as Application Pod
    participant Proxy as linkerd2-proxy
    participant App as Application Container
    participant Dest as destination service
    participant Identity as identity controller

    K8s->>Injector: Pod creation event (inject=enabled)
    Injector->>Pod: Modified Pod spec with linkerd2-proxy + linkerd-init
    Pod->>Proxy: linkerd2-proxy starts
    Proxy->>Identity: Request TLS certificate (CSR)
    Identity-->>Proxy: Issue certificate (SPIFFE-style identity)
    Proxy->>Dest: Connect for service discovery
    Dest-->>Proxy: Endpoints, routing, retry configuration
    App->>Proxy: Outbound traffic (iptables redirect)
    Proxy->>Proxy: Apply routing, load balancing, mTLS
    Proxy->>Proxy: Forward to destination proxy via mTLS

linkerd2-proxy Design

The linkerd2-proxy is purpose-built for Linkerd and written in Rust:

  • Protocol detection -- automatically detects HTTP/1, HTTP/2, gRPC, and TCP traffic
  • L7 routing -- HTTP routing, retries, timeouts, and traffic splitting when Service Profiles are defined
  • L4 proxying -- transparent TCP proxying for non-HTTP protocols
  • Load balancing -- EWMA (Exponentially Weighted Moving Average) load balancing using per-endpoint latency that the proxy measures itself; the endpoint list comes from the destination service
  • mTLS -- automatically encrypts all traffic between meshed pods using certificates from the identity controller
  • Telemetry -- records request-level metrics (latency, success rate, throughput) and exposes them in Prometheus format for scraping
  • Resource footprint -- approximately 10 MB RAM per proxy instance
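Protocol detection can be pictured as peeking at the first bytes of a connection. The sketch below is a rough illustration under simplifying assumptions, not the proxy's actual logic: it assumes enough bytes have already arrived, and the function name and return labels are hypothetical. The HTTP/2 connection preface is the one fixed by the HTTP/2 specification; gRPC runs over HTTP/2, so it falls into that branch.

```python
# Client connection preface defined by the HTTP/2 specification.
HTTP2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"
# Common HTTP/1 request methods, each followed by a space.
HTTP1_METHODS = (b"GET ", b"HEAD ", b"POST ", b"PUT ", b"DELETE ",
                 b"OPTIONS ", b"PATCH ", b"CONNECT ", b"TRACE ")


def detect_protocol(first_bytes: bytes) -> str:
    """Classify a connection from its initial bytes (labels are illustrative)."""
    if first_bytes.startswith(HTTP2_PREFACE):
        return "http2"
    if first_bytes.startswith(HTTP1_METHODS):
        return "http1"
    # Anything else is proxied as an opaque TCP stream (L4).
    return "opaque-tcp"


print(detect_protocol(b"GET /api/users HTTP/1.1\r\n"))  # http1
print(detect_protocol(HTTP2_PREFACE))                   # http2
print(detect_protocol(b"\x16\x03\x01\x02\x00"))         # opaque-tcp
```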

Init Container (linkerd-init)

The linkerd-init container runs before the application starts and configures iptables rules to redirect all inbound and outbound TCP traffic through the linkerd2-proxy. Alternatively, Linkerd provides a CNI plugin that performs this redirection at the network level, avoiding privileged init containers.

3. Service Profiles

Service Profiles are Linkerd CRDs that define per-service routing policies:

apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  name: my-service.my-namespace.svc.cluster.local
  namespace: my-namespace
spec:
  routes:
  - name: GET /api/users
    condition:
      method: GET
      pathRegex: /api/users
    responseClasses:
    - condition:
        status:
          min: 500
          max: 599
      isFailure: true
  retryBudget:
    retryRatio: 0.2
    minRetriesPerSecond: 10
    ttl: 10s
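To make the `responseClasses` rule in the profile above concrete, here is a minimal sketch of how a status code could be classified and folded into a success rate. The function names are hypothetical, and the fallback for unmatched responses (treat 5xx as failures) is an assumption of this sketch.

```python
def is_failure(status: int, response_classes: list) -> bool:
    """Return True if the first matching response class marks the status a failure."""
    for rc in response_classes:
        bounds = rc["condition"]["status"]
        if bounds["min"] <= status <= bounds["max"]:
            return rc.get("isFailure", False)
    # Assumption: with no matching class, 5xx responses count as failures.
    return status >= 500


def success_rate(statuses: list, response_classes: list) -> float:
    """Fraction of responses not classified as failures."""
    failures = sum(is_failure(s, response_classes) for s in statuses)
    return 1 - failures / len(statuses)


# Mirrors the responseClasses entry in the profile above.
classes = [{"condition": {"status": {"min": 500, "max": 599}}, "isFailure": True}]
print(success_rate([200, 201, 404, 503], classes))  # 0.75
```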

Service Profiles enable:

  • Retries with configurable budgets (retry ratio, min retries per second, TTL)
  • Timeouts per route
  • Traffic splitting for progressive delivery (canary, blue/green)
  • Response classification for success/failure rate calculation
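One way to read the three retry budget fields is as a cap on retries within a sliding window. The sketch below is a simplified interpretation, not the proxy's implementation: it assumes the allowance over one `ttl` window is the retry ratio applied to the original requests seen in that window, plus the per-second minimum multiplied by the window length. The function name is hypothetical; the defaults mirror the example profile.

```python
import math


def retry_allowance(requests_in_window: int,
                    retry_ratio: float = 0.2,
                    min_retries_per_second: int = 10,
                    ttl_seconds: int = 10) -> int:
    """Upper bound on retries within one ttl window (simplified model)."""
    proportional = math.floor(requests_in_window * retry_ratio)
    floor_allowance = min_retries_per_second * ttl_seconds
    return proportional + floor_allowance


print(retry_allowance(1000))  # 300
print(retry_allowance(0))     # 100
```

The point of a budget, compared with a fixed per-request retry count, is that the extra load from retries stays proportional to real traffic, so a failing backend is not hit with a retry storm.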

4. Multi-Cluster Architecture

Linkerd supports multi-cluster communication via gateway proxies:

  • A linkerd-gateway deployment runs in each cluster
  • Remote service discovery is handled by a service mirror component that watches services and endpoints in the remote cluster
  • Traffic between clusters is automatically encrypted with mTLS
  • No special configuration required in application code

5. Supported Features

| Feature | Status |
| --- | --- |
| HTTP, HTTP/2, gRPC proxying | Stable |
| TCP proxying with protocol detection | Stable |
| Automatic mTLS | Stable (on by default) |
| Retries and timeouts | Stable (via Service Profiles) |
| Traffic splitting (canary) | Stable |
| Load balancing (EWMA) | Stable |
| Distributed tracing | Stable |
| Fault injection | Stable |
| Gateway API support | Stable |
| Multi-cluster communication | Stable |
| CNI plugin | Stable |
| Rate limiting | Available |
| Topology-aware routing | Available |
| Non-Kubernetes workloads | Available (mesh expansion) |

6. Resource Comparison

| Metric | Linkerd | Istio (sidecar) |
| --- | --- | --- |
| Proxy language | Rust | C++ (Envoy) |
| Proxy memory per pod | ~10 MB | ~50 MB |
| Control plane memory | ~300 MB total | ~500 MB (istiod) |
| Proxy startup time | ~1 second | ~3-5 seconds |
| Configuration CRDs | ServiceProfile | VirtualService, DestinationRule |

Key Insight

Linkerd's architecture is deliberately minimal: a single Rust micro-proxy per pod, a small control plane with three core services (destination, identity, proxy-injector), and automatic mTLS on by default. This design trades Istio's rich L7 feature set for simplicity, lower resource consumption, and a faster time-to-value for teams that need secure service-to-service communication without complex traffic management.


How It Works

Rust micro-proxy internals, transparent iptables interception, mTLS trust chain, EWMA load balancing, and sidecar injection.

Proxy Data Path

sequenceDiagram
    participant App as App Container
    participant Proxy as linkerd2-proxy (Rust)
    participant Dest as Destination Service
    participant Peer as Remote linkerd2-proxy
    participant Remote as Remote App

    App->>Proxy: TCP connect (transparent iptables intercept)
    Proxy->>Dest: Lookup service endpoints + policy
    Proxy->>Proxy: Establish mTLS (ML-KEM-768 + X25519)
    Proxy->>Peer: mTLS tunnel + HTTP/2 multiplex
    Peer->>Remote: Forward request
    Remote-->>Peer: Response
    Peer-->>Proxy: Response
    Proxy-->>App: Response

    Note over Proxy: Emits metrics:<br/>latency, success rate, RPS

Transparent Interception

Linkerd uses iptables rules injected into each pod's network namespace via an init container (linkerd-init). The rules redirect all TCP traffic through the proxy:

  1. Outbound redirect: All outbound traffic from the app container is redirected to port 4140 (the proxy's outbound port)
  2. Inbound redirect: All incoming traffic to the pod is redirected to port 4143 (the proxy's inbound port)
  3. Exclusions: Traffic to 169.254.169.254 (metadata service) and to the proxy's own control port is not redirected

The iptables rules use the REDIRECT target, which rewrites the destination to the proxy's port while the kernel's connection tracking retains the original destination address. The proxy reads that address with the SO_ORIGINAL_DST socket option to discover where the application intended to connect.
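The lookup can be sketched in a few lines. This is an illustration for IPv4 on Linux, not the proxy's Rust code: the constant value 80 comes from `<linux/netfilter_ipv4.h>` (Python's `socket` module does not export it), and the helper names are hypothetical.

```python
import socket
import struct

# Value from <linux/netfilter_ipv4.h>; not exported by Python's socket module.
SO_ORIGINAL_DST = 80


def parse_sockaddr_in(raw: bytes) -> tuple:
    """Decode a 16-byte IPv4 sockaddr_in into (ip, port)."""
    port = struct.unpack_from("!H", raw, 2)[0]  # bytes 2-3: port, network order
    ip = socket.inet_ntoa(raw[4:8])             # bytes 4-7: IPv4 address
    return ip, port


def original_destination(conn: socket.socket) -> tuple:
    """Linux only: ask netfilter where a REDIRECTed connection was headed."""
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)


# Decode a hand-built sockaddr_in, since a live lookup needs a redirected socket.
sample = (struct.pack("=H", socket.AF_INET) + struct.pack("!H", 8080)
          + socket.inet_aton("10.0.0.7") + bytes(8))
print(parse_sockaddr_in(sample))  # ('10.0.0.7', 8080)
```

`original_destination` only returns a meaningful result on a socket accepted from a connection that iptables actually redirected.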

mTLS Identity and Trust Chain

Linkerd uses a rotating certificate system with a trust anchor (root CA):

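| Component | Certificate | Lifetime | Rotation |
| --- | --- | --- | --- |
| Trust anchor | Root CA (self-signed) | 365 days | Manual (on upgrade) |
| Identity issuer | Intermediate CA | 365 days (when generated by the CLI at install) | Manual by default; can be automated with an external tool such as cert-manager |
| Proxy certificate | Leaf (SPIFFE ID) | 24 hours | Automatic: requested at proxy startup and renewed before expiry |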

SPIFFE ID format: spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>

When proxy A connects to proxy B:

  1. A presents its leaf certificate (signed by the identity issuer)
  2. B validates the certificate chain back to the trust anchor
  3. Each proxy checks the peer's identity against the configured authorization policy

EWMA Load Balancing

Linkerd uses Exponentially Weighted Moving Average -- it tracks backend latency and avoids slow endpoints:

flowchart LR
    Req["Request"] --> LB["EWMA LB"]
    LB -->|"pick lowest\nlatency score"| B1["Backend A\n(EWMA: 5ms) ✅"]
    LB -.->|"avoid"| B2["Backend B\n(EWMA: 500ms) ❌"]
    LB -.->|"avoid"| B3["Backend C\n(EWMA: 200ms)"]

    style B1 fill:#2e7d32,color:#fff
    style B2 fill:#c62828,color:#fff

The EWMA update rule is new_ewma = (1 - α) × old_ewma + α × latest_latency, where α (between 0 and 1) is the smoothing factor: a larger α weights recent requests more heavily. After each request completes, the proxy updates the EWMA for that endpoint. Endpoints with the lowest EWMA scores receive more traffic, which routes around degraded backends without explicit health checks.
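The update rule can be sketched as a small balancer. This is a minimal illustration of the formula, not the proxy's implementation: the class name, the choice of α, and seeding the score with the first observation are assumptions, and it always picks the single lowest score where a production balancer would typically add randomization.

```python
class EwmaBalancer:
    """Track a per-endpoint EWMA of latency and prefer the lowest score."""

    def __init__(self, alpha: float = 0.3):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.scores = {}

    def observe(self, endpoint: str, latency_ms: float) -> float:
        """Apply new_ewma = (1 - alpha) * old_ewma + alpha * latest_latency."""
        old = self.scores.get(endpoint)
        if old is None:
            new = latency_ms  # assumption: seed with the first observation
        else:
            new = (1 - self.alpha) * old + self.alpha * latency_ms
        self.scores[endpoint] = new
        return new

    def pick(self) -> str:
        """Return the endpoint with the lowest EWMA score."""
        return min(self.scores, key=self.scores.get)


lb = EwmaBalancer(alpha=0.5)
for endpoint, latency in [("a", 4.0), ("a", 6.0), ("b", 500.0), ("c", 200.0)]:
    lb.observe(endpoint, latency)
print(lb.scores)  # {'a': 5.0, 'b': 500.0, 'c': 200.0}
print(lb.pick())  # a
```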

Control Plane Components

| Component | Purpose |
| --- | --- |
| linkerd-destination | Service discovery: resolves Kubernetes services to endpoint lists + policy |
| linkerd-identity | Certificate authority: issues leaf certificates to proxies on startup |
| linkerd-proxy-injector | Mutating webhook: injects the proxy sidecar into pods based on annotations |
| linkerd-sp-validator | Validating webhook: checks ServiceProfile resources before admission |



Benchmarks

Scope

Performance characteristics, scaling limits, and resource consumption for Linkerd.

Proxy Performance

| Metric | Linkerd2-proxy | Envoy (Istio) | Notes |
| --- | --- | --- | --- |
| Latency added (P50) | < 1 ms | 1-2 ms | Rust vs C++ |
| Latency added (P99) | 1-3 ms | 3-10 ms | Under load |
| Memory per proxy | 15-25 MB | 50-100 MB | Significantly lighter |
| CPU per proxy | 10-50m | 50-100m | Lighter workload |
| Throughput | 95-98% native | 90-95% native | Minimal overhead |

Control Plane Resources

| Mesh Size (pods) | Destination CPU | Destination Memory | Identity Memory |
| --- | --- | --- | --- |
| 100 | 100m | 128Mi | 64Mi |
| 500 | 200m | 256Mi | 128Mi |
| 2,000 | 500m | 512Mi | 256Mi |

Scaling Limits

| Dimension | Limit | Notes |
| --- | --- | --- |
| Pods in mesh | 10,000+ | Per cluster |
| Services | 5,000+ | |
| Endpoints per service | 5,000 | EndpointSlice support |
| Certificate rotation | Auto, 24h default | Zero-downtime |

Sourcing Status

Unsourced Performance Data

The performance numbers in this document are estimated from vendor documentation, community benchmarks, and engineering judgment. They do not represent controlled benchmarks with documented test conditions. Specific hardware configurations, software versions, and test methodologies were not recorded.

Use these figures as rough guidance only. For production capacity planning, run your own benchmarks against your specific workload and infrastructure.
