Web Services & APIs — Operations¶

Practical guide to deploying, documenting, securing, versioning, testing, and monitoring web service APIs.

API Specification Formats¶

Specifications are machine-readable contracts for APIs — enabling codegen, mock servers, linting, and documentation.

OpenAPI 3.1 (REST)¶

The industry standard for describing RESTful HTTP APIs. Version 3.1 aligns with JSON Schema draft 2020-12.

openapi: 3.1.0
info:
  title: Orders API
  version: 2.4.0
  contact:
    email: [email protected]
  license:
    name: Apache 2.0
servers:
  - url: https://api.example.com/v2
    description: Production
  - url: https://sandbox.api.example.com/v2
    description: Sandbox

paths:
  /orders/{orderId}:
    get:
      operationId: getOrder
      summary: Retrieve a single order
      tags: [Orders]
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
            format: uuid
      responses:
        "200":
          description: Order found
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Order"
        "404":
          $ref: "#/components/responses/NotFound"
      security:
        - bearerAuth: []

components:
  schemas:
    Order:
      type: object
      required: [id, status, createdAt]
      properties:
        id:
          type: string
          format: uuid
        status:
          type: string
          enum: [pending, confirmed, shipped, delivered, cancelled]
        createdAt:
          type: string
          format: date-time
  responses:
    NotFound:
      description: Resource not found
      content:
        application/json:
          schema:
            $ref: "#/components/schemas/ProblemDetail"
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT

Key OpenAPI 3.1 improvements over 3.0:

- Full JSON Schema 2020-12 alignment (replaces OpenAPI 3.0's extended subset)
- webhooks top-level field for describing out-of-band requests that the API sends to consumers
- discriminator improvements, const, $schema per-schema
- exclusiveMinimum/exclusiveMaximum are now numeric (not boolean)

AsyncAPI 3.0 (Event-Driven APIs)¶

OpenAPI equivalent for WebSocket, MQTT, Kafka, AMQP, SNS/SQS APIs.

asyncapi: 3.0.0
info:
  title: Order Events API
  version: 1.0.0
channels:
  orderCreated:
    address: orders.created
    messages:
      OrderCreated:
        payload:
          type: object
          properties:
            orderId:
              type: string
            customerId:
              type: string
operations:
  onOrderCreated:
    action: receive
    channel:
      $ref: "#/channels/orderCreated"

Protocol Buffers IDL (gRPC)¶

See architecture#protocol-buffers for the full .proto format. The .proto file IS the API spec for gRPC services.

Tooling comparison:

Format	Ecosystem	Codegen	Mock Server	Linting
OpenAPI 3.1	REST	Any language	Prism, WireMock	Spectral, Vacuum
AsyncAPI 3.0	Event-driven	Node.js, Java	Microcks	AsyncAPI Studio
Protobuf	gRPC	Any language	grpc-go test server	`buf lint`
WSDL	SOAP	Java, .NET, Python	SoapUI	SoapUI

API Gateways¶

An API gateway is the single entry point for all client traffic — handling routing, auth enforcement, rate limiting, observability, and protocol translation.

flowchart LR
    C1[Mobile Client] --> GW[API Gateway]
    C2[Browser] --> GW
    C3[Partner API] --> GW
    GW -->|/orders| OS[Orders Service]
    GW -->|/users| US[User Service]
    GW -->|/products| PS[Product Service]
    GW --> Auth[Auth Service]
    GW --> RL[Rate Limiter\nRedis]
    GW --> Log[Observability\nDatadog / Grafana]

Kong Gateway¶

Open-source gateway built on NGINX + OpenResty (Lua). Enterprise tier adds RBAC, Dev Portal, and Vitals analytics.

# Kong declarative config (deck format)
services:
  - name: orders-service
    url: http://orders-service:8080
    plugins:
      - name: rate-limiting
        config:
          minute: 1000
          policy: redis
          redis_host: redis
      - name: jwt
        config:
          claims_to_verify: [exp]
    routes:
      - name: orders-route
        paths: [/v2/orders]
        strip_path: false
        methods: [GET, POST, PUT, PATCH, DELETE]

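# Kong Admin API: attach a per-request ID header to the route.
# A static "$(uuidgen)" header value would be expanded once by the shell
# and then reused for every request, so use the correlation-id plugin instead.
curl -X POST http://kong:8001/routes/orders-route/plugins \
  --data name=correlation-id \
  --data config.header_name=X-Request-ID \
  --data config.generator=uuid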

Envoy Proxy¶

High-performance C++ proxy developed at Lyft. Operates as data plane in Istio service mesh. Configured via xDS APIs (dynamic) or static YAML.

# Envoy static config — HTTP rate limit filter
http_filters:
  - name: envoy.filters.http.ratelimit
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
      domain: orders_api
      rate_limit_service:
        grpc_service:
          envoy_grpc:
            cluster_name: rate_limit_service
        transport_api_version: V3

AWS API Gateway¶

Managed gateway for REST, HTTP, and WebSocket APIs. Integrates natively with Lambda, ALB, and VPC Link.

# Create HTTP API (simpler, lower cost than REST API)
aws apigatewayv2 create-api \
  --name orders-api \
  --protocol-type HTTP \
  --target arn:aws:lambda:us-east-1:123456789012:function:orders-handler

# Add JWT authorizer
aws apigatewayv2 create-authorizer \
  --api-id abc123 \
  --authorizer-type JWT \
  --identity-source '$request.header.Authorization' \
  --jwt-configuration Audience=orders-api,Issuer=https://auth.example.com \
  --name JwtAuthorizer

Gateway comparison:

Gateway	Deployment	Config Model	Best For
Kong	Self-hosted / Cloud	Declarative YAML / Admin API	Large teams, plugin ecosystem
Envoy	Self-hosted (sidecar)	xDS (dynamic) / YAML	Service mesh, Kubernetes
AWS API Gateway	Managed	Console / CDK / SAM	AWS-native serverless
Nginx	Self-hosted	Static config files (nginx.conf)	Simple reverse proxy
Traefik	Self-hosted	Auto-discover (Kubernetes)	Kubernetes ingress
Azure API Management	Managed	Portal / ARM / Bicep	Azure-native

Authentication and Authorization¶

API Keys¶

Simplest scheme. Suitable for server-to-server or developer access where OAuth overhead is unneeded.

GET /v2/orders HTTP/1.1
X-API-Key: sk_live_a1b2c3d4e5f6

Best practices:

- Prefix keys by environment: sk_live_, sk_test_
- Store only a hash of the key (for example SHA-256) in the database, never the plaintext
- Rotate immediately on compromise; for planned rotations, keep the old key valid for a 30-day grace period
- Associate keys with scopes: orders:read, orders:write
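The hashing and prefixing practices above can be sketched with the Python standard library. This is a minimal illustration, not a complete key-management system: the function names `generate_api_key` and `verify_api_key` are hypothetical, and persistence and scope handling are left out.

```python
import hashlib
import hmac
import secrets


def generate_api_key(environment: str = "live"):
    """Return (plaintext_key, sha256_hex).

    Show the plaintext to the developer once; persist only the hash.
    """
    plaintext = f"sk_{environment}_{secrets.token_urlsafe(32)}"
    digest = hashlib.sha256(plaintext.encode()).hexdigest()
    return plaintext, digest


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored hash in constant time."""
    presented_hash = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(presented_hash, stored_hash)
```

A plain SHA-256 is reasonable here because the key itself is long and random; a low-entropy secret such as a password would need a slow, salted hash instead.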

JWT (JSON Web Tokens)¶

Stateless bearer tokens. Three base64url-encoded parts: header, payload, signature.

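# Decode the JWT payload without verification (debugging only)
# JWTs use base64url without padding, so map the alphabet back and restore padding first
payload=$(printf '%s' "eyJhbGci..." | cut -d. -f2 | tr '_-' '/+')
case $(( ${#payload} % 4 )) in 2) payload="$payload==" ;; 3) payload="$payload=" ;; esac
printf '%s' "$payload" | base64 -d | jq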

// Payload claims
{
  "sub": "user_01HXYZ",
  "iss": "https://auth.example.com",
  "aud": "orders-api",
  "exp": 1745600000,
  "iat": 1745596400,
  "scope": "orders:read orders:write",
  "jti": "01HXYZ-unique-token-id"
}

JWT security checklist:

- Prefer RS256 (asymmetric) over HS256 (shared secret) when several services verify tokens: verifiers then need only the public key
- Keep access tokens short-lived (about 15 minutes); deliver refresh tokens via httpOnly cookies
- Validate iss, aud, exp, and nbf on every request
- Include jti (JWT ID) so revoked tokens can be looked up in a Redis blocklist
- Never store sensitive data in the payload: JWTs are encoded, not encrypted (use JWE for confidentiality)
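The claim checks from the checklist can be sketched as a small function. It assumes the signature has already been verified by a JWT library and that `claims` is the decoded payload; the function name `validate_claims` and the 30-second `leeway` for clock skew are illustrative choices, not part of any standard API.

```python
import time


def validate_claims(claims, issuer, audience, now=None, leeway=30):
    """Return a list of problems found in already-verified JWT claims (empty if valid)."""
    now = time.time() if now is None else now
    problems = []

    if claims.get("iss") != issuer:
        problems.append("iss mismatch")

    # "aud" may be a single string or a list of strings
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        problems.append("aud mismatch")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp + leeway:
        problems.append("token expired or exp missing")

    # "nbf" is optional; enforce it only when present
    nbf = claims.get("nbf")
    if isinstance(nbf, (int, float)) and now < nbf - leeway:
        problems.append("token not yet valid")

    return problems
```

Returning a list of problems rather than a boolean makes it easier to log why a token was rejected.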

OAuth 2.0 / OAuth 2.1¶

Authorization Code + PKCE (browser and mobile clients):

sequenceDiagram
    participant U as User
    participant C as Client App
    participant AS as Auth Server
    participant RS as Resource Server

    C->>C: Generate code_verifier, code_challenge = BASE64URL(SHA256(code_verifier))
    C->>AS: GET /authorize?response_type=code&client_id=...&code_challenge=...&code_challenge_method=S256
    AS->>U: Login + Consent screen
    U->>AS: Approve
    AS->>C: Redirect with ?code=AUTH_CODE
    C->>AS: POST /token {code, code_verifier, client_id}
    AS->>C: {access_token, refresh_token, expires_in}
    C->>RS: GET /orders Authorization: Bearer ACCESS_TOKEN
    RS->>C: 200 {orders: [...]}
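For the first step of the flow, the S256 challenge is the base64url encoding (without padding) of the SHA-256 digest of the verifier, as defined in RFC 7636. A minimal sketch using only the standard library follows; the function names are hypothetical.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    # 32 random bytes become a 43-character URL-safe string
    # (RFC 7636 allows verifiers of 43 to 128 characters)
    return secrets.token_urlsafe(32)


def make_code_challenge(verifier: str) -> str:
    # S256 method: BASE64URL(SHA256(verifier)) with the "=" padding removed
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client keeps the verifier private, sends only the challenge with the authorization request, and reveals the verifier when it exchanges the code at the token endpoint.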

Client Credentials (machine-to-machine):

curl -X POST https://auth.example.com/oauth/token \
  -d grant_type=client_credentials \
  -d client_id=service-account \
  -d client_secret=secret \
  -d scope="orders:read inventory:write"

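OAuth 2.1 key changes (draft consolidation of OAuth 2.0 and its security best practices):

- PKCE required for all clients using the authorization code flow
- Implicit flow removed
- Resource Owner Password Credentials (ROPC) flow removed
- Refresh tokens issued to public clients must be sender-constrained or rotated on each use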

mTLS (Mutual TLS)¶

Both client and server present certificates — eliminates shared secrets for service-to-service auth.

# Generate a client key and a cert signed by your CA
openssl genrsa -out client.key 2048
openssl req -new -key client.key -out client.csr \
  -subj "/CN=orders-service/O=internal"
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out client.crt -days 365

# Call API with client cert
curl --cert client.crt --key client.key \
  --cacert ca.crt \
  https://internal-api.example.com/v2/orders

In Kubernetes: use SPIFFE/SPIRE for automatic workload identity, or let Istio inject mTLS transparently via sidecar.

API Versioning¶

Versioning Strategies¶

Strategy	Example	Pros	Cons
URI path	`/v2/orders`	Most visible, easy routing	Breaks resource identity
Query param	`/orders?version=2`	Non-breaking URL	Easily forgotten, cache unfriendly
Header	`API-Version: 2024-01-01`	Clean URLs	Less discoverable
Content negotiation	`Accept: application/vnd.api+json;version=2`	RFC-compliant	Complex client setup

URI versioning is the most common choice for public APIs (used by Stripe and Twilio). Stripe also uses calendar-based header versioning (Stripe-Version: 2023-10-16) alongside its /v1 path for fine-grained migrations, and GitHub's REST API selects versions with the X-GitHub-Api-Version header.

Calendar-Based Versioning (Stripe Pattern)¶

Instead of major version bumps, every breaking change gets a calendar date:

GET /v1/charges HTTP/1.1
Stripe-Version: 2023-10-16

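Each account is pinned to the API version that was current when it made its first request. A single request can override the pinned version with the Stripe-Version header, and customers opt into a new default explicitly.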

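Deprecation and Sunset Headers (RFC 9745, RFC 8594)¶

HTTP/1.1 200 OK
Deprecation: @1767225600
Sunset: Fri, 01 Jan 2027 00:00:00 GMT
Link: <https://docs.example.com/migration/v3>; rel="successor-version"

Deprecation: when the endpoint was (or will be) deprecated, as a Structured Fields date, i.e. @ followed by a Unix timestamp (RFC 9745); @1767225600 is 2026-01-01T00:00:00Z
Sunset: when it will stop working, as an HTTP-date (RFC 8594)
Link: migration guide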

Non-Breaking vs Breaking Changes¶

Non-breaking (safe to ship):

- Adding optional request fields
- Adding new response fields
- Adding new endpoints
- New enum values (unless clients use exhaustive matching)

Breaking (require new version):

- Removing or renaming fields
- Changing field types
- Changing HTTP method for an operation
- Altering authentication requirements
- Removing enum values

Rate Limiting¶

Rate limiting protects services from abuse, ensures fair usage, and enables monetization tiers.

Algorithms¶

Token Bucket (allow bursting):

capacity = 100 tokens
refill_rate = 10 tokens/second

on request:
  tokens = min(capacity, tokens + refill_rate × seconds_since_last_refill)
  if tokens >= cost:
    tokens -= cost
    return ALLOW
  else:
    return 429 Too Many Requests

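AWS API Gateway throttling uses a token bucket. Kong's open-source rate-limiting plugin uses fixed-window counters; its Enterprise rate-limiting-advanced plugin adds sliding windows.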
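The token bucket pseudocode above can be sketched in Python as follows. This is a single-process, in-memory illustration (no locking, no shared store such as Redis); the class name and the injectable `clock` parameter are assumptions made so the behaviour can be tested deterministically.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled continuously at `refill_rate` per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock            # injectable so tests can control time
        self.tokens = capacity        # start full, which permits an initial burst
        self.last_refill = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Lazy refill: credit tokens earned since the last call, capped at capacity
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                  # caller should respond with 429
```

Refilling lazily on each request avoids a background timer: the bucket only needs the timestamp of its last update.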

Sliding Window Log (most precise):

Stores timestamp of each request. Counts requests within [now - window, now]. High memory cost at scale.

Sliding Window Counter (approximation, low memory):

rate = (prev_count × (1 - elapsed/window)) + curr_count
(elapsed = time since the current window started)

Redis-based implementation: two counters (current window, previous window) per key.
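A minimal in-memory sketch of the two-counter approach is shown below, assuming a single process; a production version would keep the same two counters per key in Redis. The class name and the injectable `clock` are illustrative assumptions.

```python
import time
from collections import defaultdict


class SlidingWindowCounter:
    """Approximates a sliding window using the current and previous fixed-window counts."""

    def __init__(self, limit, window, clock=time.time):
        self.limit = limit
        self.window = window
        self.clock = clock
        # key -> [window_index, current_count, previous_count]
        self.state = defaultdict(lambda: [0, 0, 0])

    def allow(self, key):
        now = self.clock()
        index = int(now // self.window)
        elapsed = now - index * self.window   # time since the current window started
        slot = self.state[key]
        if index != slot[0]:
            # Rolled over: the old count is "previous" only if it is the adjacent window
            slot[2] = slot[1] if index == slot[0] + 1 else 0
            slot[1] = 0
            slot[0] = index
        # Weight the previous window by the share of it still inside the sliding window
        estimated = slot[2] * (1 - elapsed / self.window) + slot[1]
        if estimated >= self.limit:
            return False
        slot[1] += 1
        return True
```

The estimate assumes requests in the previous window were evenly spread, which is the source of the approximation error.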

Fixed Window (simplest, boundary spike risk):

Resets the counter at fixed intervals. A client that spends its full quota at 11:59:59 and again at 12:00:01 gets up to 2× the limit through within about two seconds.

Response Headers¶

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1745600000

On 429:

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1745600000
Content-Type: application/problem+json

{
  "type": "https://api.example.com/errors/rate-limit-exceeded",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "You have exceeded 1000 requests per minute."
}

Rate Limit Keys¶

Choose the right granularity:

Key	Use Case
IP address	Unauthenticated public APIs, DDoS protection
API key	Developer tier enforcement
User ID	Per-account limits after auth
Endpoint	Expensive operations (e.g., `/search`)
Tenant ID	SaaS multi-tenant isolation

CORS (Cross-Origin Resource Sharing)¶

CORS restricts which browser origins can call your API. It does NOT protect server-to-server calls.

# Preflight request (browser auto-sends for non-simple requests)
OPTIONS /v2/orders HTTP/1.1
Origin: https://app.example.com
Access-Control-Request-Method: POST
Access-Control-Request-Headers: Authorization, Content-Type

# Server response
HTTP/1.1 204 No Content
Access-Control-Allow-Origin: https://app.example.com
Access-Control-Allow-Methods: GET, POST, PUT, PATCH, DELETE, OPTIONS
Access-Control-Allow-Headers: Authorization, Content-Type, X-Request-ID
Access-Control-Max-Age: 86400
Access-Control-Allow-Credentials: true

Critical rules:

- Never set Access-Control-Allow-Origin: * with Access-Control-Allow-Credentials: true; browsers reject the response
- Maintain an allowlist of trusted origins; validate the request's Origin against it and send Vary: Origin when echoing it back
- Cache preflight with Access-Control-Max-Age to reduce OPTIONS overhead
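The allowlist rule can be sketched as a framework-independent helper that decides which CORS headers to attach. The origins in `ALLOWED_ORIGINS` and the function name `cors_headers` are hypothetical.

```python
# Hypothetical allowlist; in practice this would come from configuration
ALLOWED_ORIGINS = frozenset({
    "https://app.example.com",
    "https://admin.example.com",
})


def cors_headers(origin, allowed=ALLOWED_ORIGINS):
    """Return the CORS response headers for a request's Origin header value.

    An unknown or missing origin gets no CORS headers, so the browser
    blocks the cross-origin read.
    """
    if origin is None or origin not in allowed:
        return {}
    return {
        # Echo the exact origin; "*" is not permitted together with credentials
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        # Tell caches that the response depends on the Origin request header
        "Vary": "Origin",
    }
```

Exact string matching against the allowlist avoids the common mistake of substring or suffix checks, which an attacker-controlled domain can satisfy.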

API Design Best Practices¶

Resource Naming¶

# Good — noun-based, plural, lowercase
GET    /v2/orders
POST   /v2/orders
GET    /v2/orders/{orderId}
PUT    /v2/orders/{orderId}
PATCH  /v2/orders/{orderId}
DELETE /v2/orders/{orderId}

# Nested resources — use sparingly; max 2 levels deep
GET /v2/orders/{orderId}/items
POST /v2/orders/{orderId}/items

# Actions (verbs) — use only for operations that don't map to CRUD
POST /v2/orders/{orderId}/cancel
POST /v2/orders/{orderId}/refund
POST /v2/payments/{paymentId}/capture

Idempotency Keys¶

Prevent duplicate processing when clients retry on network failure.

POST /v2/orders HTTP/1.1
Idempotency-Key: 01HXYZ-unique-request-id
Content-Type: application/json

{"productId": "prod_123", "quantity": 2}

Server logic:
1. Hash Idempotency-Key → look up in idempotency store (Redis/DB)
2. If found and result cached → return cached response immediately
3. If found and in-flight → return 409 Conflict or wait
4. If not found → process, store result keyed to hash, return result

TTL: 24–48 hours (per Stripe: 24h)
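The server logic above can be sketched with an in-memory store. This is a single-process illustration under stated assumptions: the class name `IdempotencyStore` is hypothetical, `handler` is any callable returning a `(status, body)` tuple, and a real deployment would need an atomic set-if-absent in Redis or a database instead of a plain dict.

```python
import hashlib
import time

_IN_FLIGHT = object()  # sentinel marking a request that is still being processed


class IdempotencyStore:
    def __init__(self, ttl_seconds=24 * 3600, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key hash -> (expires_at, result or _IN_FLIGHT)

    def execute(self, idempotency_key, handler):
        digest = hashlib.sha256(idempotency_key.encode()).hexdigest()
        now = self.clock()
        entry = self.entries.get(digest)
        if entry is not None and entry[0] > now:
            if entry[1] is _IN_FLIGHT:
                # Step 3: same key is being processed right now
                return 409, {"detail": "A request with this key is still in progress."}
            return entry[1]  # Step 2: replay the cached response
        # Step 4: first time we see this key (or the old entry expired)
        self.entries[digest] = (now + self.ttl, _IN_FLIGHT)
        try:
            result = handler()
        except Exception:
            del self.entries[digest]  # allow the client to retry after a failure
            raise
        self.entries[digest] = (now + self.ttl, result)
        return result
```

Removing the entry when the handler raises is a design choice: it treats failures as retryable rather than caching the error.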

Pagination¶

Cursor-based (recommended for large/real-time datasets):

// Request: GET /v2/orders?limit=20&after=01HXYZ
{
  "data": [...],
  "pagination": {
    "limit": 20,
    "hasNextPage": true,
    "nextCursor": "01HABC",
    "hasPrevPage": true,
    "prevCursor": "01HWXY"
  }
}
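A minimal sketch of cursor-based paging over an in-memory list is shown below. Encoding the cursor as base64 JSON is an assumption made here so that clients treat it as opaque; the example response above uses plain IDs, which works the same way. The function names are hypothetical, and a real implementation would translate the cursor into a `WHERE id > ?` query rather than filtering a list.

```python
import base64
import json


def encode_cursor(last_id):
    """Wrap the last seen ID in an opaque, URL-safe token."""
    raw = json.dumps({"id": last_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["id"]


def paginate(rows, limit, after=None):
    """Return one page from `rows`, a list of dicts sorted ascending by "id"."""
    start_id = decode_cursor(after) if after else None
    remaining = [r for r in rows if start_id is None or r["id"] > start_id]
    page = remaining[:limit]
    has_next = len(remaining) > limit
    return {
        "data": page,
        "pagination": {
            "limit": limit,
            "hasNextPage": has_next,
            "nextCursor": encode_cursor(page[-1]["id"]) if has_next else None,
        },
    }
```

Because the cursor names a position in the ordering rather than a row count, rows inserted before it do not shift later pages.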

Offset-based (simpler, avoid for real-time data — page drift on inserts):

// Request: GET /v2/orders?limit=20&offset=40
{
  "data": [...],
  "pagination": {
    "total": 1847,
    "limit": 20,
    "offset": 40,
    "pages": 93
  }
}

Standardized Error Responses (RFC 9457 / Problem Details)¶

{
  "type": "https://api.example.com/errors/validation-error",
  "title": "Validation Error",
  "status": 422,
  "detail": "Request body contains invalid fields.",
  "instance": "/v2/orders/01HXYZ",
  "errors": [
    {
      "field": "quantity",
      "message": "Must be a positive integer",
      "code": "INVALID_VALUE"
    },
    {
      "field": "productId",
      "message": "Product not found",
      "code": "RESOURCE_NOT_FOUND"
    }
  ],
  "traceId": "4bf92f3577b34da6a3ce929d0e0e4736"
}

Always include traceId or requestId for support/debugging correlation.

Long-Running Operations (202 Async Pattern)¶

# 1. Client submits job
POST /v2/reports HTTP/1.1
{"type": "monthly-revenue", "month": "2026-03"}

# 2. Server accepts immediately
HTTP/1.1 202 Accepted
Location: /v2/reports/jobs/job_01HXYZ
Retry-After: 30

# 3. Client polls
GET /v2/reports/jobs/job_01HXYZ

# 4a. Still processing
HTTP/1.1 200 OK
{"status": "processing", "progress": 42, "estimatedCompletion": "2026-04-25T10:15:00Z"}

# 4b. Complete
HTTP/1.1 200 OK
{"status": "complete", "resultUrl": "/v2/reports/rpt_01HABC", "expiresAt": "2026-04-26T10:00:00Z"}

# 5. Retrieve result
GET /v2/reports/rpt_01HABC

Alternative: use webhook callback instead of polling — POST /v2/reports body includes callbackUrl.
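The polling side of this pattern might look like the sketch below. It is transport-agnostic: `fetch_status` is a hypothetical callable that performs the GET on the job URL and returns the decoded JSON body. The `"failed"` status is an assumption, since the example above shows only `processing` and `complete`.

```python
import time


def poll_job(fetch_status, interval=30.0, timeout=600.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll until the job completes and return its resultUrl.

    Raises RuntimeError if the job reports failure and TimeoutError if
    it does not finish within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        if job["status"] == "complete":
            return job["resultUrl"]
        if job["status"] == "failed":      # assumed status value
            raise RuntimeError("job failed")
        if clock() >= deadline:
            raise TimeoutError("job did not complete in time")
        sleep(interval)                    # ideally the server's Retry-After value
```

Passing the `Retry-After` value from the 202 response as `interval` lets the server control polling pressure.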

Filtering, Sorting, Searching¶

# Filtering — use query params
GET /v2/orders?status=pending&customerId=cust_123&createdAfter=2026-01-01

# Sorting — field and direction
GET /v2/orders?sort=-createdAt,status   # minus prefix = descending, no prefix = ascending (an unencoded + in a query string decodes as a space)

# Sparse fieldsets — reduce payload size
GET /v2/orders?fields=id,status,total

# Full-text search
GET /v2/products?q=wireless+headphones&category=electronics

API First Design¶

Design the API contract before writing implementation code.

Workflow:

1. Write the OpenAPI spec in YAML (use Spectral to lint against rules)
2. Generate a mock server with Prism: prism mock openapi.yaml
3. Share the mock URL with the frontend team so both sides develop in parallel
4. Generate server stubs with oapi-codegen (Go) or openapi-generator (Java/Python/etc.)
5. Write the implementation against the generated interfaces
6. Run contract tests against the live server to verify spec compliance

# Prism mock server (read OpenAPI spec, serve mock responses)
npx @stoplight/prism-cli mock openapi.yaml --port 4010

# Call mock
curl http://localhost:4010/v2/orders/01HXYZ \
  -H "Authorization: Bearer test-token"

# Prism validation proxy (forward to real server, validate request/response against spec)
npx @stoplight/prism-cli proxy openapi.yaml http://localhost:8080

Testing¶

REST API Testing (curl)¶

# GET with auth header and pretty JSON
curl -s -X GET https://api.example.com/v2/orders/01HXYZ \
  -H "Authorization: Bearer $TOKEN" \
  -H "Accept: application/json" | jq

# POST with JSON body
curl -s -X POST https://api.example.com/v2/orders \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{"productId": "prod_123", "quantity": 2}' | jq

# Test rate limiting — fire 10 requests rapidly
for i in {1..10}; do
  curl -s -o /dev/null -w "%{http_code}\n" \
    -H "Authorization: Bearer $TOKEN" \
    https://api.example.com/v2/orders
done

# Inspect headers only
curl -sI https://api.example.com/v2/orders

# Follow redirects, show timing
curl -v -w "@curl-format.txt" -L https://api.example.com/v2/orders

gRPC Testing (grpcurl)¶

# Install
brew install grpcurl

# List services (server reflection must be enabled)
grpcurl -plaintext localhost:50051 list

# Describe a service
grpcurl -plaintext localhost:50051 describe orders.OrderService

# Unary call
grpcurl -plaintext \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"order_id": "01HXYZ"}' \
  localhost:50051 orders.OrderService/GetOrder

# Server streaming call
grpcurl -plaintext \
  -d '{"customer_id": "cust_123"}' \
  localhost:50051 orders.OrderService/WatchOrders

# Call with TLS (all flags, including -d, must come before the address)
grpcurl \
  -cert client.crt -key client.key -cacert ca.crt \
  -d '{"order_id": "01HXYZ"}' \
  api.example.com:443 orders.OrderService/GetOrder

WebSocket Testing (wscat)¶

# Install
npm install -g wscat

# Connect to WebSocket server
wscat -c wss://api.example.com/ws \
  --header "Authorization: Bearer $TOKEN"

# Send a message (after connecting)
> {"type": "subscribe", "channel": "orders", "customerId": "cust_123"}
< {"type": "subscribed", "channel": "orders"}
< {"type": "order.updated", "orderId": "01HXYZ", "status": "shipped"}

# Connect with subprotocol
wscat -c wss://api.example.com/ws --subprotocol "v2.orders"

GraphQL Testing (curl + jq)¶

# Introspection query
curl -s -X POST https://api.example.com/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ __schema { types { name } } }"}' | jq '.data.__schema.types[].name'

# Query with variables
curl -s -X POST https://api.example.com/graphql \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "query GetOrder($id: ID!) { order(id: $id) { status total } }",
    "variables": {"id": "01HXYZ"}
  }' | jq

# Mutation
curl -s -X POST https://api.example.com/graphql \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "mutation CancelOrder($id: ID!) { cancelOrder(id: $id) { success } }",
    "variables": {"id": "01HXYZ"}
  }' | jq

Load Testing (k6)¶

// k6 load test script — orders API
import http from "k6/http";
import { check, sleep } from "k6";
import { Rate } from "k6/metrics";

const errorRate = new Rate("errors");

export const options = {
  stages: [
    { duration: "30s", target: 50 },   // ramp up to 50 VUs
    { duration: "2m", target: 50 },    // hold
    { duration: "30s", target: 200 },  // spike to 200 VUs
    { duration: "1m", target: 200 },   // hold spike
    { duration: "30s", target: 0 },    // ramp down
  ],
  thresholds: {
    http_req_duration: ["p(95)<500"],  // 95th percentile < 500ms
    errors: ["rate<0.01"],             // error rate < 1%
  },
};

export default function () {
  const res = http.get("https://api.example.com/v2/orders", {
    headers: { Authorization: `Bearer ${__ENV.API_TOKEN}` },
  });
  const ok = check(res, {
    "status is 200": (r) => r.status === 200,
    "response time < 500ms": (r) => r.timings.duration < 500,
  });
  errorRate.add(!ok);
  sleep(1);
}

k6 run --env API_TOKEN=$TOKEN load-test.js

Contract Testing (Pact)¶

Consumer-driven contract tests verify that API providers honour contracts expected by consumers.

# Consumer writes expectations → generates pact file
# Provider verifies pact file against running service

# Publish to Pact Broker
npx pact-broker publish ./pacts \
  --broker-base-url https://your-pact-broker.example.com \
  --consumer-app-version $(git rev-parse HEAD)

# Provider verifies
npx pact-provider-verifier \
  --provider-base-url http://localhost:8080 \
  --pact-broker-base-url https://your-pact-broker.example.com \
  --provider orders-service

Monitoring and Observability¶

Key Metrics (RED Method)¶

Metric	Description	Alert Threshold (example)
Rate	Requests per second	Traffic drop > 50% vs baseline
Errors	5xx error rate	> 1% over 5 minutes
Duration	p50, p95, p99 latency	p99 > 1000ms

Additional API-specific metrics:

- 4xx rate (client errors): a spike may indicate a breaking change or a client bug
- Auth failure rate: a spike indicates a credential attack or misconfiguration
- Rate limit hit rate (429 responses): indicates capacity planning needs
- Payload size distribution: detects runaway requests

Distributed Tracing (OpenTelemetry)¶

# Node.js — auto-instrumentation with OTLP export
npm install @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node

# Inject trace context headers
GET /v2/orders HTTP/1.1
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
tracestate: rend=congo

Propagate traceparent across all service boundaries. Every response should include X-Request-ID or X-Trace-ID tied to the trace.
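OpenTelemetry SDKs handle propagation automatically, but the header format is simple enough to sketch by hand. The following parses a W3C `traceparent` value and derives the header to send downstream, keeping the trace ID and minting a new span ID. The function names are hypothetical, and only version `00` of the format is handled.

```python
import re
import secrets

# version - trace-id (16 bytes hex) - parent-id (8 bytes hex) - flags
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header):
    """Return the parsed fields, or None if the header is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # All-zero IDs and version "ff" are invalid per the W3C Trace Context spec
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def child_traceparent(incoming=None):
    """Build the header for an outgoing call: same trace, new span ID."""
    ctx = parse_traceparent(incoming) if incoming else None
    trace_id = ctx["trace_id"] if ctx else secrets.token_hex(16)
    sampled = ctx["sampled"] if ctx else True
    return "00-%s-%s-%s" % (trace_id, secrets.token_hex(8), "01" if sampled else "00")
```

If the incoming header is missing or invalid, the sketch starts a new trace instead of failing the request.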

Structured Logging¶

{
  "level": "info",
  "timestamp": "2026-04-25T10:00:00.123Z",
  "service": "orders-api",
  "traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
  "spanId": "00f067aa0ba902b7",
  "method": "GET",
  "path": "/v2/orders/01HXYZ",
  "statusCode": 200,
  "durationMs": 47,
  "customerId": "cust_123",
  "region": "us-east-1"
}

Health Endpoints¶

# Liveness — is the process alive?
GET /health/live
HTTP/1.1 200 OK
{"status": "ok"}

# Readiness — is the service ready to receive traffic?
GET /health/ready
HTTP/1.1 200 OK
{
  "status": "ok",
  "checks": {
    "database": "ok",
    "cache": "ok",
    "dependencyServiceA": "ok"
  }
}

# Degraded state
HTTP/1.1 503 Service Unavailable
{
  "status": "degraded",
  "checks": {
    "database": "ok",
    "cache": "error",
    "dependencyServiceA": "ok"
  }
}

Circuit Breaker Pattern¶

Prevents cascading failures when a downstream dependency is degraded.

States:
  CLOSED → normal operation, requests pass through
  OPEN   → dependency is failing; requests fail fast with 503
  HALF_OPEN → test probe requests sent; if success → CLOSED, if fail → OPEN

Transition triggers:
  CLOSED → OPEN:     failure rate > 50% over last 10 requests (or time window)
  OPEN → HALF_OPEN:  after cooldown period (e.g. 30 seconds)
  HALF_OPEN → CLOSED: 3 consecutive successes
  HALF_OPEN → OPEN:   1 failure

Libraries: Resilience4j (Java), Polly (.NET), opossum (Node.js), gobreaker (Go).
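The state machine above can be sketched in a few dozen lines of Python. This is a simplified, single-threaded illustration using the example thresholds from the transition list (10-request window, 50% failure rate, 30-second cooldown, 3 successes to close); the class and exception names are hypothetical, and the libraries listed above are the better choice in production.

```python
import time
from collections import deque


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being attempted."""


class CircuitBreaker:
    def __init__(self, window=10, failure_threshold=0.5, cooldown=30.0,
                 successes_to_close=3, clock=time.monotonic):
        self.window = window
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.successes_to_close = successes_to_close
        self.clock = clock
        self.state = "CLOSED"
        self.results = deque(maxlen=window)  # True = failure, recent calls while CLOSED
        self.opened_at = 0.0
        self.half_open_successes = 0

    def call(self, func):
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.cooldown:
                raise CircuitOpenError("failing fast; dependency marked unhealthy")
            self.state = "HALF_OPEN"         # cooldown elapsed: let a probe through
            self.half_open_successes = 0
        try:
            result = func()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _trip(self):
        self.state = "OPEN"
        self.opened_at = self.clock()
        self.results.clear()

    def _on_failure(self):
        if self.state == "HALF_OPEN":
            self._trip()                     # a single failed probe reopens the circuit
            return
        self.results.append(True)
        full = len(self.results) == self.window
        if full and sum(self.results) / self.window > self.failure_threshold:
            self._trip()

    def _on_success(self):
        if self.state == "HALF_OPEN":
            self.half_open_successes += 1
            if self.half_open_successes >= self.successes_to_close:
                self.state = "CLOSED"
                self.results.clear()
        else:
            self.results.append(False)
```

Waiting for the window to fill before evaluating the failure rate avoids tripping on the first one or two errors after startup.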

Webhooks as a Product¶

For APIs that offer webhooks, treat the delivery system as a first-class product.

Delivery Architecture¶

sequenceDiagram
    participant ES as Event Source
    participant Q as Message Queue
    participant WD as Webhook Dispatcher
    participant C as Customer Server

    ES->>Q: Publish event
    Q->>WD: Consume event
    WD->>C: POST /webhook (signed payload)
    alt Success (2xx)
        C->>WD: 200 OK (within 5s)
        WD->>Q: Ack message
    else Failure / Timeout
        WD->>Q: Nack / retry
        WD->>WD: Exponential backoff\n(5s, 25s, 125s, ...)
        WD->>WD: After 72h: mark dead, alert
    end

Payload Signing (HMAC-SHA256)¶

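import hashlib
import hmac
import time


def sign_payload(secret: str, payload: bytes) -> str:
    """Return a signature header value of the form t=<unix time>,v1=<hex HMAC>."""
    timestamp = str(int(time.time()))
    # Sign "<timestamp>.<raw body>" so the timestamp cannot be changed independently
    message = timestamp.encode() + b"." + payload
    signature = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    return f"t={timestamp},v1={signature}"


def verify_signature(secret: str, payload: bytes, header: str, tolerance: int = 300) -> bool:
    try:
        parts = dict(part.split("=", 1) for part in header.split(","))
        raw_timestamp = parts["t"]
        timestamp = int(raw_timestamp)
        received = parts["v1"]
    except (KeyError, ValueError):
        return False  # malformed header
    if abs(time.time() - timestamp) > tolerance:
        return False  # outside the tolerance window: possible replay
    message = raw_timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking the signature through timing
    return hmac.compare_digest(expected, received)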

Reliability Patterns¶

Pattern	Implementation
Idempotency keys	Include `webhookId` in payload; consumer deduplicates
Immediate 200	Return 200 before processing; use queue for async work
Retry with backoff	`5s → 25s → 125s → 625s`; max 72h delivery window
Dead letter queue	After max retries, route to DLQ; alert operator
Event ordering	Include `sequence` counter; consumer handles out-of-order
CloudEvents format	Standardize payload envelope (`specversion`, `type`, `source`, `id`)

Webhook Management Portal (product features)¶

- Endpoint registration with per-event-type subscription
- Delivery attempt log with request/response bodies (last 30 days)
- Manual replay of failed deliveries
- HMAC secret rotation (grace period supporting both old + new key)
- Test delivery that sends a sample event to a registered endpoint and checks for a 200 OK

API Caching Strategies¶

Caching is often the most impactful API performance optimization. Multiple layers can cache independently.

Caching Layers¶

flowchart LR
    C[Client] -->|1| BC[Browser Cache\nCache-Control]
    BC -->|2| CDN[CDN Edge\nCloudflare / CloudFront]
    CDN -->|3| GW[API Gateway\nVarnish / nginx]
    GW -->|4| APP[Application\nRedis / Memcached]
    APP -->|5| DB[(Database\nQuery Cache)]

Cache-Control Patterns¶

# Immutable asset (hashed filename — never changes)
Cache-Control: public, max-age=31536000, immutable

# Frequently changing API resource
Cache-Control: private, max-age=0, must-revalidate
ETag: "a1b2c3"

# Shared resource (CDN-cacheable; varying on Authorization would give every caller a separate cache entry)
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=600
Vary: Accept-Encoding

# No caching (sensitive data)
Cache-Control: no-store

stale-while-revalidate — the CDN/proxy serves the stale cached response immediately while fetching a fresh copy in the background. The client gets a fast response; the cache updates asynchronously. Critical for APIs where slight staleness is acceptable (product catalogs, search results).

stale-if-error — serve stale content if the origin returns a 5xx error. Provides graceful degradation when the backend is down.

Cache Invalidation Patterns¶

Pattern	How It Works	Best For
TTL expiry	Cache expires after `max-age` seconds	Simple, predictable; acceptable staleness
Event-driven purge	Backend publishes event → CDN/cache purge API called	Real-time consistency; more infrastructure
Surrogate keys (tags)	Tag cached responses; purge all responses with a tag	Purge all `/products/*` when inventory changes
Conditional revalidation	`If-None-Match` / `If-Modified-Since` → 304 or fresh	Bandwidth savings; origin still hit
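The conditional revalidation row can be illustrated with a small origin-side helper that derives an ETag from the response body and answers `If-None-Match`. This is a sketch: the function names are hypothetical, and hashing the body is only one of several ways to produce a validator (a version column or update timestamp works too).

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Strong validator derived from the content; ETag values are quoted strings
    return '"%s"' % hashlib.sha256(body).hexdigest()[:16]


def conditional_response(body: bytes, if_none_match=None):
    """Return (status, headers, body), honouring an If-None-Match request header."""
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so ignore any W/ prefix
        normalized = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in candidates or etag in normalized:
            return 304, {"ETag": etag}, b""   # client copy is current: send no body
    return 200, {"ETag": etag}, body
```

The origin still does the work of producing the body to compute the hash, which is why the table notes that this pattern saves bandwidth rather than origin load.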

# Fastly — purge by surrogate key
curl -X POST https://api.fastly.com/service/SERVICE_ID/purge/product-42 \
  -H "Fastly-Key: $FASTLY_TOKEN"

# CloudFront — invalidation
aws cloudfront create-invalidation \
  --distribution-id E1234 \
  --paths "/v2/products/42" "/v2/products?category=electronics"

GraphQL Caching¶

GraphQL's POST /graphql endpoint breaks traditional HTTP caching — see architecture#caching for normalized client cache, APQ, and @cacheControl directive approaches.

Retry Patterns¶

Retries are essential for resilient API consumers, but naive retries cause retry storms that amplify failures.

Exponential Backoff with Jitter¶

Attempt 1: wait 0ms (immediate)
Attempt 2: wait random(0, 1000ms)          → e.g., 487ms
Attempt 3: wait random(0, 2000ms)          → e.g., 1,342ms
Attempt 4: wait random(0, 4000ms)          → e.g., 2,891ms
Attempt 5: wait random(0, 8000ms)          → e.g., 5,203ms
Give up after attempt 5

Full jitter (recommended by AWS) prevents thundering herd — all retrying clients spread randomly across the backoff window instead of hitting the server at the same instant.

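import random
import time


class RetryableError(Exception):
    """Raised by func for failures worth retrying (timeouts, 429, 5xx)."""


def retry_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=30.0):
    for attempt in range(max_retries):
        try:
            return func()
        except RetryableError:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            # Exponential ceiling: 1s, 2s, 4s, ... bounded by max_delay
            delay = min(base_delay * (2 ** attempt), max_delay)
            jittered = random.uniform(0, delay)  # full jitter
            time.sleep(jittered)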

Retry Budgets¶

Instead of per-request retry limits, set a budget: "retry at most 10% of total requests." This prevents cascading retry storms where every client retries simultaneously during an outage.

If 1000 req/s normally, allow at most 100 retries/s total
When budget exhausted → fail fast instead of retrying

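Envoy supports retry budgets through the retry_budget setting in a cluster's circuit-breaker thresholds, and Linkerd through retryBudget in a ServiceProfile. Check your Istio version's documentation for equivalent support.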
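On the client side, a retry budget can be sketched as a pair of sliding counters: one for first attempts and one for retries. The class below is a single-process illustration; the name `RetryBudget`, the 10-second accounting window, and the injectable `clock` are assumptions for the example.

```python
import time
from collections import deque


class RetryBudget:
    """Permits retries only while they stay under `ratio` of recent first attempts."""

    def __init__(self, ratio=0.1, window=10.0, clock=time.monotonic):
        self.ratio = ratio
        self.window = window
        self.clock = clock
        self.requests = deque()  # timestamps of first attempts
        self.retries = deque()   # timestamps of granted retries

    def _trim(self, now):
        # Forget events that have aged out of the accounting window
        for events in (self.requests, self.retries):
            while events and events[0] <= now - self.window:
                events.popleft()

    def record_request(self):
        now = self.clock()
        self._trim(now)
        self.requests.append(now)

    def try_acquire_retry(self):
        now = self.clock()
        self._trim(now)
        if len(self.retries) + 1 > self.ratio * len(self.requests):
            return False         # budget exhausted: fail fast instead of retrying
        self.retries.append(now)
        return True
```

A caller would invoke `record_request()` before each first attempt and check `try_acquire_retry()` before every retry, combining the budget with the backoff logic shown earlier.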

Which Errors to Retry¶

Status Code	Retry?	Reason
`408` Request Timeout	✅	Transient timeout
`429` Too Many Requests	✅ (respect `Retry-After`)	Rate limited; wait and retry
`500` Internal Server Error	✅ (cautiously)	May be transient; limit retries
`502` Bad Gateway	✅	Upstream briefly unavailable
`503` Service Unavailable	✅ (respect `Retry-After`)	Server overloaded; back off
`504` Gateway Timeout	✅	Upstream timeout
`400` Bad Request	❌	Client error; retry won't help
`401`/`403`	❌	Auth issue; retry with same creds won't help
`404`	❌	Resource doesn't exist
`409` Conflict	⚠️	Re-read state, then maybe retry with updated data
`422`	❌	Validation error; fix input first

Idempotency Requirement

Only retry non-idempotent operations (POST) if the API supports idempotency keys. Otherwise, retrying a POST may create duplicate resources.

SDK and Client Code Generation¶

Generating typed client SDKs from API specifications eliminates hand-written HTTP calls and catches breaking changes at compile time.

REST — openapi-generator¶

# Install
npm install -g @openapitools/openapi-generator-cli

# Generate TypeScript client from OpenAPI spec
openapi-generator-cli generate \
  -i https://api.example.com/v2/openapi.yaml \
  -g typescript-fetch \
  -o ./generated/api-client \
  --additional-properties=supportsES6=true,npmName=@example/api-client

# Generate Go server stubs
openapi-generator-cli generate \
  -i openapi.yaml \
  -g go-server \
  -o ./internal/api

oapi-codegen (Go-specific, lighter weight):

# Generate Go types + Echo server from spec
oapi-codegen -package api -generate types,server openapi.yaml > api/api.gen.go

Generated client usage (TypeScript):

import { OrdersApi, Configuration } from '@example/api-client';

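// `token` is assumed to hold a bearer JWT obtained elsewhere
const api = new OrdersApi(new Configuration({
  basePath: 'https://api.example.com/v2',
  accessToken: token,
}));

// Fully typed: input and output types come from the OpenAPI spec.
// The spec declares orderId as format: uuid, so pass a UUID.
const order = await api.getOrder({ orderId: '3f2b8c1e-9d4a-4c6b-8a2e-5f7d1c0b9e3a' });
// order is typed as Order, not `any`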

GraphQL — graphql-codegen¶

npm install -D @graphql-codegen/cli @graphql-codegen/typescript \
  @graphql-codegen/typescript-operations @graphql-codegen/typed-document-node

// codegen.ts
import type { CodegenConfig } from '@graphql-codegen/cli';

const config: CodegenConfig = {
  schema: 'https://api.example.com/graphql',
  documents: 'src/**/*.graphql',
  generates: {
    './src/generated/graphql.ts': {
      plugins: [
        'typescript',
        'typescript-operations',
        'typed-document-node',
      ],
    },
  },
};
export default config;

npx graphql-codegen

Result: every .graphql query/mutation file produces a fully typed TypedDocumentNode — input variables and response shape are both type-checked at compile time.
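To show why this matters, here is a self-contained sketch of the mechanism. It does not use the real generated output: `TypedDocument` is a simplified stand-in for `TypedDocumentNode`, and `GetOrderQuery`, `GetOrderQueryVariables`, `execute`, and `fakeTransport` are hypothetical names chosen for illustration.

```typescript
// Simplified stand-in for the generated TypedDocumentNode type: the two type
// parameters exist only at compile time and tie a query to its shapes.
interface TypedDocument<TResult, TVariables> {
  query: string;
  // Phantom field: never set at runtime, it only carries the types.
  __types?: (variables: TVariables) => TResult;
}

// Shapes that codegen would derive from the schema and the .graphql file.
interface GetOrderQuery {
  order: { id: string; status: string } | null;
}
interface GetOrderQueryVariables {
  orderId: string;
}

const GetOrderDocument: TypedDocument<GetOrderQuery, GetOrderQueryVariables> = {
  query: "query GetOrder($orderId: ID!) { order(id: $orderId) { id status } }",
};

// Hypothetical transport; a real client would send an HTTP request here.
type Transport = (query: string, variables: unknown) => unknown;

// The result and variable types are inferred from the document argument.
function execute<TResult, TVariables>(
  doc: TypedDocument<TResult, TVariables>,
  variables: TVariables,
  transport: Transport
): TResult {
  return transport(doc.query, variables) as TResult;
}

// Fake transport so the sketch runs without a server.
const fakeTransport: Transport = () => ({ order: { id: "o-1", status: "shipped" } });

const result = execute(GetOrderDocument, { orderId: "o-1" }, fakeTransport);
const status: string | undefined = result.order?.status;
console.log(status);

// This call would not type-check, because the variable name is wrong:
// execute(GetOrderDocument, { id: "o-1" }, fakeTransport);
```

Note that the cast inside `execute` is where the runtime trust boundary sits: the types describe what the server is expected to return, not what it is verified to return.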

gRPC — buf generate¶

# Install buf CLI
brew install bufbuild/buf/buf

# buf.gen.yaml — code generation config
version: v2
plugins:
  - remote: buf.build/protocolbuffers/go
    out: gen/go
    opt: paths=source_relative
  - remote: buf.build/grpc/go
    out: gen/go
    opt: paths=source_relative
  - remote: buf.build/connectrpc/go
    out: gen/go
    opt: paths=source_relative

# Generate
buf generate proto/

buf advantages over raw protoc:

- Dependency management via the BSR (Buf Schema Registry)
- buf lint: enforces a proto style guide
- buf breaking: detects breaking changes between proto versions in CI
- buf generate: replaces complex protoc plugin chains

API Documentation Generation¶

Swagger UI¶

Interactive documentation from an OpenAPI spec — users can try API calls directly in the browser.

# Docker — serve Swagger UI for your spec
docker run -p 8080:8080 \
  -e SWAGGER_JSON=/spec/openapi.yaml \
  -v $(pwd):/spec \
  swaggerapi/swagger-ui

# Or embed in Express:
npm install swagger-ui-express

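const express = require('express');
const swaggerUi = require('swagger-ui-express');
const spec = require('./openapi.json'); // JSON copy of the spec; require() cannot load YAML
const app = express();
app.use('/docs', swaggerUi.serve, swaggerUi.setup(spec));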

Redoc¶

Static, clean, three-panel documentation. Often a better fit for public-facing API docs than Swagger UI, which is geared toward interactive exploration.

# CLI rendering
npx @redocly/cli build-docs openapi.yaml -o docs/index.html

# Or CDN-hosted single HTML
# <script src="https://cdn.redoc.ly/redoc/latest/bundles/redoc.standalone.js"></script>
# <redoc spec-url="openapi.yaml"></redoc>

Scalar¶

Modern, customizable API reference with a built-in API client. Growing alternative to Swagger UI.

npm install @scalar/express-api-reference

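// Express integration (assumes an existing Express `app`)
import { apiReference } from '@scalar/express-api-reference';

app.use('/reference', apiReference({
  // Option names vary by version; check whether your release expects a top-level `url` instead
  spec: { url: '/openapi.yaml' },
  theme: 'kepler',
}));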

GraphQL Documentation¶

GraphiQL — official in-browser IDE with docs explorer, query autocomplete, variable pane
Apollo Studio / Apollo Sandbox — schema explorer, operation history, field usage analytics
Postman — supports GraphQL schema import and introspection

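# Serve GraphiQL standalone
# Note: graphiql-explorer is published as a React plugin for GraphiQL; confirm that your version exposes a CLI before relying on this command
npx graphiql-explorer --endpoint https://api.example.com/graphql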

API Governance and Linting¶

Spectral (OpenAPI / AsyncAPI Linting)¶

Spectral enforces API design standards via configurable rules. Run in CI to prevent non-compliant changes.

# Install
npm install -g @stoplight/spectral-cli

# Lint against built-in OpenAPI rules
spectral lint openapi.yaml

# Lint against custom ruleset
spectral lint openapi.yaml --ruleset .spectral.yaml

# .spectral.yaml — custom API governance rules
extends:
  - spectral:oas

rules:
  # Require operationId on every endpoint
  operation-operationId:
    severity: error
    given: "$.paths[*][*]"
    then:
      field: operationId
      function: truthy

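  # Enforce kebab-case paths (path parameters such as {orderId} are allowed)
  paths-kebab-case:
    severity: error
    given: "$.paths[*]~"   # "~" selects the path keys rather than the path item objects
    then:
      function: pattern
      functionOptions:
        match: '^(/([a-z0-9-]+|\{[A-Za-z0-9_]+\}))+$'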

  # Require description on all parameters
  parameter-description:
    severity: warn
    given: "$.paths[*][*].parameters[*]"
    then:
      field: description
      function: truthy

  # Ban query string versioning
  no-query-version:
    severity: error
    given: "$.paths[*][*].parameters[?(@.name == 'version' && @.in == 'query')]"
    then:
      function: falsy

  # Require error response schemas
  require-error-responses:
    severity: warn
    given: "$.paths[*][*].responses"
    then:
      - field: "400"
        function: truthy
      - field: "500"
        function: truthy
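To see what a kebab-case path rule accepts and rejects, the pattern can be tried outside Spectral. This sketch uses a pattern in the spirit of the rule that additionally allows templated segments such as `{orderId}` (an assumption of this example); the `KEBAB_PATH` constant, the `nonKebabPaths` helper, and the sample paths are hypothetical.

```typescript
// Each segment is either lowercase kebab-case or a {templated} parameter.
const KEBAB_PATH = /^(\/([a-z0-9-]+|\{[A-Za-z0-9_]+\}))+$/;

// Returns the paths that a kebab-case rule with this pattern would flag.
function nonKebabPaths(paths: string[]): string[] {
  return paths.filter(function (p) {
    return !KEBAB_PATH.test(p);
  });
}

const flagged = nonKebabPaths([
  "/orders/{orderId}", // ok: templated parameter
  "/order_items",      // flagged: underscore
  "/Users",            // flagged: uppercase letter
  "/line-items",       // ok
]);
console.log(flagged.join(", ")); // /order_items, /Users
```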

Breaking Change Detection¶

# openapi-diff — detect breaking changes between spec versions
npx openapi-diff old-spec.yaml new-spec.yaml

# buf breaking — detect protobuf breaking changes
buf breaking proto/ --against '.git#branch=main'

# optic — track API changes in CI
npx @useoptic/optic diff openapi.yaml --base main --check

CI integration example (GitHub Actions):

- name: Lint API spec
  run: spectral lint openapi.yaml --fail-severity warn

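- name: Check for breaking changes
  run: |
    # Needs the base branch locally (e.g. actions/checkout with fetch-depth: 0)
    git show origin/main:openapi.yaml > /tmp/old-spec.yaml
    # The npm openapi-diff CLI exits non-zero on breaking changes;
    # --fail-on-incompatible belongs to the separate Java OpenAPITools openapi-diff
    npx openapi-diff /tmp/old-spec.yaml openapi.yaml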

Proto Linting with buf¶

# buf.yaml — proto lint configuration
version: v2
lint:
  use:
    - STANDARD         # Buf's recommended style and lint rules
    - COMMENTS         # Require comments on all public types
  except:
    - PACKAGE_VERSION_SUFFIX

# Run lint
buf lint proto/

# Check for breaking changes against main branch
buf breaking proto/ --against '.git#branch=main'

buf breaking catches: field number reuse, type changes, field removal, service method signature changes — all before merge.

API Tooling Ecosystem¶

Category	Tools
API spec editors	Stoplight Studio, Swagger Editor, Redocly
Linting	Spectral (OpenAPI/AsyncAPI), `buf lint` (Protobuf)
Mock servers	Prism, WireMock, Microcks
Client testing	Postman, Insomnia, Bruno, HTTPie
CLI testing	curl, HTTPie, grpcurl, wscat, mqtt-cli
Load testing	k6, Gatling, Locust, Apache JMeter
Contract testing	Pact, Dredd, Schemathesis
Documentation	Redoc, Swagger UI, Scalar, Mintlify
API gateways	Kong, Envoy, AWS API Gateway, Traefik
Service mesh	Istio, Linkerd, Consul Connect
Code generation	openapi-generator, oapi-codegen, `buf generate`
Monitoring	Datadog APM, Grafana + Prometheus, New Relic