
Web Services Architecture

Deep dive into every major API paradigm — how each protocol works under the hood, when to use each, and how they compare.


Protocol Comparison Overview

graph TD
    A[Client needs data] --> B{Use case?}
    B -->|Public API, CRUD, browser-native| C[REST]
    B -->|Flexible queries, complex frontends| D[GraphQL]
    B -->|Internal service-to-service, streaming| E[gRPC]
    B -->|Real-time bidirectional| F[WebSocket]
    B -->|Server pushes only, notifications| G[SSE]
    B -->|TypeScript full-stack only| H[tRPC]
    B -->|Event notification to external systems| I[Webhooks]
    B -->|Legacy enterprise integration| J[SOAP]
| Protocol | Transport | Format | Direction | Browser Native | Best For |
|---|---|---|---|---|---|
| REST | HTTP/1.1, HTTP/2 | JSON (typically) | Req/Res | ✅ | Public APIs, CRUD, resource modeling |
| GraphQL | HTTP/1.1, HTTP/2 | JSON | Req/Res + Subscription | ✅ | Complex frontends, data aggregation |
| gRPC | HTTP/2 only | Protocol Buffers (binary) | Req/Res + Streaming | ⚠️ (needs proxy) | Internal microservices, high-throughput |
| SOAP | HTTP, SMTP, TCP | XML | Req/Res | ✅ | Legacy enterprise, financial services |
| WebSocket | WS (TCP upgrade) | Any (text/binary) | Full-duplex | ✅ | Real-time chat, gaming, collaboration |
| SSE | HTTP/1.1, HTTP/2 | Text (UTF-8) | Server → Client only | ✅ | Feeds, notifications, AI streaming |
| Webhooks | HTTP POST | JSON (typically) | Server → Client push | ❌ (server-side) | Event-driven integrations, automation |
| tRPC | HTTP/WebSocket | JSON | Req/Res + Subscription | ✅ (Node/TS only) | TypeScript full-stack monorepos |

REST (Representational State Transfer)

Roy Fielding defined REST in his 2000 doctoral dissertation as an architectural style — not a protocol — built on six constraints that, when applied together, produce a scalable, stateless, and cacheable web service.

The Six Architectural Constraints

1. Client–Server Separation

The client and server evolve independently. The server manages data storage and business logic; the client manages the user interface and user state. Neither depends on the other's implementation details — only the shared API contract.

This decoupling allows frontend teams to swap frameworks (React → Vue) or mobile clients to evolve, without requiring backend changes, and vice versa.

2. Stateless

Every request from client to server must contain all information necessary to understand and process the request. The server stores no session state between requests.

❌ Stateful (server stores session):
POST /login       → server creates session, returns cookie
GET /dashboard    → server reads session to identify user

✅ Stateless (client carries state):
GET /dashboard
Authorization: Bearer eyJhbGciOiJSUzI1NiJ9...

Consequences:

  • Scalability: any server instance can handle any request — no sticky sessions
  • Reliability: no session state to lose if a server crashes
  • Overhead: every request must carry auth credentials and context (larger payloads)
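
Because the token itself carries the caller's context, the server needs no session store to recover it. As a quick illustration (plain stdlib, not tied to any framework), the header segment of the bearer token shown above decodes to plain JSON:

```python
import base64
import json

def decode_jwt_segment(segment: str) -> dict:
    """Decode one base64url-encoded JWT segment into JSON."""
    # Restore any padding stripped by the JWT encoding
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

# Header segment of the example token in the request above
print(decode_jwt_segment("eyJhbGciOiJSUzI1NiJ9"))  # {'alg': 'RS256'}
```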

3. Cacheable

Responses must declare whether they are cacheable or not. When responses are cacheable, clients and intermediaries (CDNs, proxies) can serve them without contacting the server.

Key HTTP cache headers:

| Header | Purpose | Example |
|---|---|---|
| Cache-Control | Directives for caching behavior | Cache-Control: max-age=3600, public |
| ETag | Fingerprint of resource version | ETag: "d8e8fca2dc0f896fd7cb4cb0031ba249" |
| Last-Modified | When resource last changed | Last-Modified: Tue, 22 Apr 2026 12:00:00 GMT |
| Vary | Which headers affect the cache key | Vary: Accept-Encoding, Authorization |

Conditional requests let clients validate their cache:

GET /users/42
If-None-Match: "d8e8fca2dc0f896fd7cb4cb0031ba249"

→ 304 Not Modified (body omitted — client uses cached copy)
→ 200 OK + new ETag + new body (cache miss — resource changed)

4. Uniform Interface

The single most important constraint. It defines four sub-principles:

4a. Resource Identification in Requests — every resource has a stable URI:

/users                        → collection of users
/users/42                     → specific user
/users/42/orders              → orders belonging to user 42
/users/42/orders/7/items      → items in that order

4b. Manipulation via Representations — clients hold representations (JSON, XML, HTML), not live objects. The client modifies the representation and sends it back.

4c. Self-Descriptive Messages — each request/response carries enough metadata to describe how to process it: Content-Type, method, status code, cache directives.

4d. HATEOAS — see section below.

5. Layered System

Clients cannot tell whether they're connected directly to the server or an intermediary (load balancer, CDN, API gateway, caching proxy). Each layer only knows about the adjacent layer.

This enables transparent insertion of:

  • CDNs for caching at the edge
  • API gateways for auth, rate limiting, routing
  • Load balancers for distributing traffic
  • Service meshes for observability and mTLS

6. Code on Demand (optional)

The only optional constraint. Servers can temporarily extend client functionality by transferring executable code (e.g., JavaScript). Rarely relevant in modern API design.

HTTP Methods and Idempotency

| Method | Semantics | Idempotent | Safe | Common Use |
|---|---|---|---|---|
| GET | Retrieve resource(s) | ✅ | ✅ | Read data |
| HEAD | GET without body (check existence/metadata) | ✅ | ✅ | Cache validation |
| POST | Create a new resource; non-idempotent actions | ❌ | ❌ | Create, submit form, trigger action |
| PUT | Replace entire resource (upsert) | ✅ | ❌ | Full update |
| PATCH | Partial update | ❌* | ❌ | Partial update |
| DELETE | Remove resource | ✅ | ❌ | Delete |
| OPTIONS | Discover allowed methods (used for CORS preflight) | ✅ | ✅ | CORS |

* PATCH can be designed idempotently but is not required to be.

Safe = no side effects (read-only). Idempotent = making the same request N times has the same effect as making it once.
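
The difference is easy to see with a toy in-memory store (the `put`/`post` helpers below are hypothetical, for illustration only): replaying a PUT leaves the store unchanged, while replaying a POST keeps creating resources.

```python
store: dict[int, dict] = {}
next_id = 0

def put(resource_id: int, doc: dict) -> None:
    """PUT: replace the resource at a known URI. Idempotent."""
    store[resource_id] = doc

def post(doc: dict) -> int:
    """POST: create a new resource with a server-assigned ID. NOT idempotent."""
    global next_id
    next_id += 1
    store[next_id] = doc
    return next_id

put(42, {"name": "Alice"})
put(42, {"name": "Alice"})   # replayed: state unchanged
post({"name": "Bob"})
post({"name": "Bob"})        # replayed: a second resource appears
print(len(store))            # 3 (one PUT target + two POSTed resources)
```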

HTTP Status Codes

| Range | Category | Key Codes |
|---|---|---|
| 2xx | Success | 200 OK, 201 Created, 202 Accepted, 204 No Content |
| 3xx | Redirection | 301 Moved Permanently, 304 Not Modified |
| 4xx | Client Error | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 409 Conflict, 422 Unprocessable Entity, 429 Too Many Requests |
| 5xx | Server Error | 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout |

Common Status Code Mistakes

  • Never return 200 OK with an error in the body — clients must parse every body to detect errors
  • Use 401 for unauthenticated, 403 for authenticated but unauthorized
  • Use 422 (not 400) when the request is syntactically valid but semantically wrong (e.g. invalid field value)
  • 404 means "resource not found", not "I don't know" — don't use it as a catch-all

HATEOAS

Hypermedia as the Engine of Application State — the final sub-principle of the uniform interface, and the least commonly implemented constraint of REST. Responses include hyperlinks that describe what actions are available next. Clients need no prior knowledge of URL structure; they navigate by following links.

{
  "id": 42,
  "name": "Alice",
  "email": "[email protected]",
  "_links": {
    "self":   { "href": "/users/42", "method": "GET" },
    "orders": { "href": "/users/42/orders", "method": "GET" },
    "update": { "href": "/users/42", "method": "PUT" },
    "delete": { "href": "/users/42", "method": "DELETE" }
  }
}

Benefits: API is self-documenting; server can change URL structure without breaking clients; workflow steps are discoverable.

In practice: very few production APIs implement full HATEOAS. Most APIs reach Level 2 of the Richardson Maturity Model (proper HTTP verbs) and stop there.

Richardson Maturity Model

A framework for measuring how RESTful an API actually is:

| Level | Name | What It Adds | Example |
|---|---|---|---|
| 0 | Swamp of POX | Single endpoint, single method | POST /api with XML body specifying action |
| 1 | Resources | Multiple URIs, but still a single HTTP verb | POST /users, POST /users/42 |
| 2 | HTTP Verbs | Uses GET/POST/PUT/DELETE meaningfully | GET /users/42, DELETE /users/42 |
| 3 | Hypermedia | Responses contain links for navigation (HATEOAS) | JSON with _links section |

Roy Fielding has argued that hypermedia (Level 3) is a precondition for calling an API RESTful. Most production APIs sit at Level 2, which is fine for practical purposes, even if technically not "truly RESTful."


GraphQL

Facebook created GraphQL in 2012 and open-sourced it in 2015. It is a query language for your API and a runtime for executing those queries — giving clients the power to ask for exactly what they need and nothing more.

Core Concept: Single Endpoint

Unlike REST's resource-per-endpoint model, GraphQL exposes a single endpoint (typically POST /graphql) that accepts queries describing the exact shape of data needed.

# REST requires 3 round trips:
# GET /users/42
# GET /users/42/posts
# GET /posts/7/comments

# GraphQL fetches all in one request:
query {
  user(id: 42) {
    name
    email
    posts(limit: 5) {
      title
      publishedAt
      comments(limit: 3) {
        body
        author { name }
      }
    }
  }
}

Type System and Schema

Everything in GraphQL is strongly typed. The schema is the single source of truth — it describes every piece of data the API can return and every operation clients can perform.

Scalar Types

Built-in primitives: Int, Float, String, Boolean, ID. Custom scalars can be defined (e.g., DateTime, URL, JSON).

Object Types

type User {
  id: ID!                  # ! = non-nullable
  name: String!
  email: String!
  createdAt: DateTime!
  posts: [Post!]!          # non-null list of non-null Posts
}

type Post {
  id: ID!
  title: String!
  body: String
  author: User!
  tags: [String!]!
}

Special Root Types

type Query {
  user(id: ID!): User
  users(limit: Int = 20, offset: Int = 0): [User!]!
}

type Mutation {
  createUser(input: CreateUserInput!): User!
  updateUser(id: ID!, input: UpdateUserInput!): User!
  deleteUser(id: ID!): Boolean!
}

type Subscription {
  userCreated: User!
  messageReceived(roomId: ID!): Message!
}

Other Type Categories

| Type | Purpose | Example |
|---|---|---|
| Input | Arguments to mutations | input CreateUserInput { name: String!, email: String! } |
| Enum | Fixed set of values | enum Status { ACTIVE INACTIVE SUSPENDED } |
| Interface | Shared fields across types | interface Node { id: ID! } |
| Union | Type can be one of several | union SearchResult = User \| Post \| Comment |
| Fragment | Reusable field selection | fragment UserFields on User { id name email } |

Queries, Mutations, Subscriptions

Query — read data. Resolvers can be called in parallel:

query GetDashboard {
  currentUser {
    name
    notifications(unread: true) { id title }
  }
  trending { title views }
}

Mutation — write data. Resolvers execute sequentially:

mutation CreatePost($input: CreatePostInput!) {
  createPost(input: $input) {
    id
    title
    author { name }
  }
}

Subscription — real-time data via WebSocket (typically). Server pushes updates when events occur:

subscription OnMessageReceived($roomId: ID!) {
  messageReceived(roomId: $roomId) {
    id body sender { name } sentAt
  }
}

Resolvers

Resolvers are functions that produce data for each field in the schema. GraphQL execution is a depth-first traversal of the query tree — each field resolver receives:

  1. parent — resolved value of the parent field
  2. args — arguments passed to this field
  3. context — shared object (DB connection, auth user, DataLoaders)
  4. info — query metadata (field name, selection set, schema)

const resolvers = {
  Query: {
    user: async (_, { id }, { db }) => db.users.findById(id),
    users: async (_, { limit, offset }, { db }) =>
      db.users.findAll({ limit, offset }),
  },
  User: {
    // Parent resolver returned a user object; now resolve its posts field
    posts: async (user, { limit }, { db }) =>
      db.posts.findByUserId(user.id, limit),
  },
  Mutation: {
    createUser: async (_, { input }, { db }) => db.users.create(input),
  },
};

The N+1 Problem

The most common GraphQL performance trap. Without optimization, resolving a list of N users and their posts triggers 1 + N queries:

Query: users(limit: 20)    → SELECT * FROM users LIMIT 20          (1 query)
  User[0].posts            → SELECT * FROM posts WHERE user_id = 1  (1 query)
  User[1].posts            → SELECT * FROM posts WHERE user_id = 2  (1 query)
  ...
  User[19].posts           → SELECT * FROM posts WHERE user_id = 20 (1 query)
                                                                TOTAL: 21 queries

Real-world impact compounds with nesting — posts fetching authors fetching their posts can generate hundreds of queries for a single GraphQL request.

DataLoader — The Solution

Facebook's DataLoader batches and caches loads within a single request using Node.js's event loop tick:

import DataLoader from 'dataloader';

// Created once per request (NOT per application startup)
const postsByUserLoader = new DataLoader(async (userIds: readonly string[]) => {
  // Single batch query: SELECT * FROM posts WHERE user_id IN (1, 2, ..., 20)
  const posts = await db.posts.findByUserIds(userIds);
  // Return results in same order as input keys
  return userIds.map(id => posts.filter(p => p.userId === id));
});

// In resolver — these 20 calls become ONE SQL query
const resolvers = {
  User: {
    posts: (user, _, { loaders }) =>
      loaders.postsByUser.load(user.id),  // batched automatically
  },
};

Result: 21 queries → 2 queries (one for users, one batch for all posts).

DataLoader Instance Per Request

Create a new DataLoader instance for each request. DataLoader caches results for the duration of a request — sharing across requests will serve stale data.

Directives

Directives annotate schema elements or control query execution:

type User {
  email: String! @deprecated(reason: "Use contactEmail instead")
  contactEmail: String!
  password: String! @auth(requires: ADMIN)  # custom directive
}

# Built-in execution directives:
query GetUser($showEmail: Boolean!, $skipPhone: Boolean!) {
  user(id: 42) {
    name
    email @include(if: $showEmail)   # conditionally include field
    phone @skip(if: $skipPhone)      # conditionally skip field
  }
}

Introspection

GraphQL APIs are self-documenting — clients can query the schema itself:

{
  __schema {
    types { name kind }
  }
  __type(name: "User") {
    fields { name type { name kind } }
  }
}

Introspection powers tools like GraphiQL, Apollo Studio, and GraphQL Playground. Disable introspection in production for security-sensitive APIs.

Query Complexity and Depth Limiting

Without limits, a malicious client can craft exponentially expensive queries:

# Denial-of-service via deeply nested query:
{ user { friends { friends { friends { friends { ... } } } } } }

Protect with:

  • Depth limiting: reject queries deeper than N levels (graphql-depth-limit)
  • Complexity analysis: assign costs to fields; reject queries over a budget (graphql-validation-complexity)
  • Query whitelisting (persisted queries): only allow pre-approved queries in production
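
Depth limiting can be approximated with a brace counter. The sketch below ignores strings and comments; a real implementation such as graphql-depth-limit walks the parsed AST instead:

```python
def query_depth(query: str) -> int:
    """Max nesting depth of selection sets, by naive brace counting."""
    depth = max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth

q = "{ user { friends { friends { friends } } } }"
print(query_depth(q))   # 4
if query_depth(q) > 3:
    print("rejected: query too deep")
```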

Federation

GraphQL Federation lets multiple teams own separate subgraphs that compose into a unified supergraph — one schema, one endpoint, distributed implementation.

┌─────────────────────────────────────────────┐
│           Apollo Router (Supergraph)         │
│     Single endpoint: POST /graphql           │
└────────┬──────────────┬──────────────────────┘
         │              │
   ┌─────▼─────┐  ┌─────▼──────┐  ┌──────────┐
   │  Users     │  │  Products  │  │  Orders  │
   │  Subgraph  │  │  Subgraph  │  │ Subgraph │
   │  (Team A)  │  │  (Team B)  │  │ (Team C) │
   └───────────┘  └────────────┘  └──────────┘

Key concepts:

  • Entities: types that can be extended across subgraphs, identified by a @key directive
  • __resolveReference: resolver that hydrates an entity from a key passed by the router
  • @external: field defined in another subgraph, referenced here
  • Each subgraph is independently deployable; the router composes them at query time


gRPC

gRPC is a high-performance, open-source RPC framework, originally developed at Google, that uses Protocol Buffers as its interface definition language and serialization format, and HTTP/2 as the transport protocol. Open-sourced in 2015, it has been a CNCF project since 2017.

Protocol Buffers (Protobuf)

Protobuf is a language-neutral, platform-neutral binary serialization format. Compared to JSON:

| Property | JSON | Protobuf |
|---|---|---|
| Format | Text (UTF-8) | Binary |
| Size | ~1x baseline | 3–10x smaller |
| Parse speed | ~1x baseline | 5–10x faster |
| Schema | Optional (JSON Schema) | Required (.proto file) |
| Human-readable | ✅ | ❌ (needs tools) |
| Schema evolution | Manual / fragile | Built-in field numbering |
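
The size advantage comes from the wire format: each field starts with a tag byte, (field_number << 3) | wire_type, followed by the payload. A hand-rolled sketch of the proto3 wire rules (not generated code) for encoding `string id = 1` with value "42":

```python
def encode_string_field(field_number: int, value: str) -> bytes:
    """Encode one proto3 string field: tag byte, length, UTF-8 bytes.
    Assumes field number < 16 and length < 128 (single-byte varints)."""
    wire_type = 2                         # 2 = length-delimited
    tag = (field_number << 3) | wire_type
    data = value.encode("utf-8")
    return bytes([tag, len(data)]) + data

encoded = encode_string_field(1, "42")
print(encoded.hex())   # 0a023432 — 4 bytes, vs 11 bytes for {"id":"42"}
```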

A .proto service definition:

syntax = "proto3";
package com.example.users;

// Message types
message User {
  string id        = 1;
  string name      = 2;
  string email     = 3;
  int64  created_at = 4;
}

message GetUserRequest   { string user_id = 1; }
message ListUsersRequest { int32 limit    = 1; }
message CreateUserRequest {
  string name  = 1;
  string email = 2;
}
message UserList    { repeated User users = 1; }
message ChatMessage { string body = 1; }

// Service definition
service UserService {
  // Unary
  rpc GetUser(GetUserRequest) returns (User);

  // Server streaming
  rpc ListUsers(ListUsersRequest) returns (stream User);

  // Client streaming
  rpc CreateUsersBulk(stream CreateUserRequest) returns (UserList);

  // Bidirectional streaming
  rpc Chat(stream ChatMessage) returns (stream ChatMessage);
}

The protoc compiler generates strongly-typed client stubs and server interfaces in Go, Java, Python, C++, Node.js, Rust, Kotlin, Swift, and more.

HTTP/2 Features Exploited by gRPC

| HTTP/2 Feature | What It Enables |
|---|---|
| Multiplexing | Multiple RPC calls on one TCP connection; no head-of-line blocking between requests |
| Binary framing | Headers and data sent as binary frames — more efficient than HTTP/1.1 text headers |
| Header compression (HPACK) | Repeated headers (auth token, content-type) sent as index references after first use; 85–90% header reduction |
| Full-duplex streams | Client and server can send frames simultaneously on the same stream |
| Flow control | Prevents fast producers from overwhelming slow consumers, per-stream |
| Server push | Server can pre-emptively send resources (rarely used in gRPC) |

The Four Streaming Types

Unary RPC

rpc GetUser(GetUserRequest) returns (User);
Classic request-response. Client sends one message, server sends one message. Equivalent to a REST GET.

Server Streaming RPC

rpc WatchLogs(WatchRequest) returns (stream LogEntry);
Client sends one request; server streams multiple responses. Useful for: live logs, large dataset export, real-time feeds.

Client Streaming RPC

rpc UploadMetrics(stream MetricPoint) returns (UploadSummary);
Client streams multiple messages; server collects them and returns one response. Useful for: telemetry ingestion, file uploads chunked by the client, batch writes.

Bidirectional Streaming RPC

rpc BidirectionalChat(stream ChatMessage) returns (stream ChatMessage);
Both sides can send and receive messages in any order over a long-lived connection. Both streams operate independently. Useful for: chat, collaborative editing, real-time games, audio/video signaling.

Deadlines and Cancellation

Every gRPC call should set a deadline — the absolute time by which the client requires a response. The server checks whether the deadline has been exceeded before starting expensive work.

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
resp, err := client.GetUser(ctx, &pb.GetUserRequest{UserId: "42"})

Deadlines propagate through the entire call chain — if service A calls service B calls service C, all three respect the same deadline window, preventing one slow downstream call from causing timeouts at every layer.

Interceptors

Interceptors wrap gRPC method invocations — the gRPC equivalent of middleware:

// Unary server interceptor for logging
func loggingInterceptor(ctx context.Context, req interface{},
  info *grpc.UnaryServerInfo, handler grpc.UnaryHandler,
) (interface{}, error) {
  start := time.Now()
  resp, err := handler(ctx, req)
  log.Printf("Method: %s | Duration: %v | Error: %v",
    info.FullMethod, time.Since(start), err)
  return resp, err
}

// Register:
s := grpc.NewServer(
  grpc.UnaryInterceptor(loggingInterceptor),
  grpc.StreamInterceptor(streamLoggingInterceptor),
)

Common interceptors: authentication, tracing (OpenTelemetry), logging, metrics, panic recovery, rate limiting, deadline enforcement.

Load Balancing

Because gRPC multiplexes many RPCs over a single TCP connection, L4 (TCP) load balancing distributes connections, not RPCs. A single long-lived connection from service A to a single pod of service B bypasses all other pods.

Solutions:

  • L7 (application-layer) load balancing — a proxy that understands HTTP/2 streams distributes individual RPCs: Envoy, nginx, gRPC-aware load balancers
  • Client-side load balancing — the gRPC client resolves all backend IPs (via DNS), maintains connections to each, and distributes RPCs itself
  • Headless services in Kubernetes — DNS returns all pod IPs; combined with gRPC client-side round-robin


SOAP / XML-RPC

SOAP (Simple Object Access Protocol) is an XML-based messaging protocol that predates REST's dominance. It remains deeply embedded in enterprise systems, financial services, healthcare (HL7), and government integrations.

Protocol Structure

A SOAP message is an XML document with a mandatory Envelope, optional Header, and mandatory Body:

<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope
  xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:usr="http://example.com/users">
  <soap:Header>
    <usr:AuthToken>abc123</usr:AuthToken>
  </soap:Header>
  <soap:Body>
    <usr:GetUser>
      <usr:UserId>42</usr:UserId>
    </usr:GetUser>
  </soap:Body>
</soap:Envelope>

WSDL (Web Services Description Language)

WSDL is SOAP's IDL — an XML document that describes the service completely: operations, input/output message types, bindings (how operations map to protocols), and endpoints. It serves the same role as OpenAPI for REST or .proto files for gRPC.

<wsdl:definitions name="UserService" ...>
  <wsdl:types>
    <xs:schema>
      <xs:element name="GetUserRequest">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="UserId" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>
  </wsdl:types>
  <wsdl:message name="GetUserInput">
    <wsdl:part name="parameters" element="tns:GetUserRequest"/>
  </wsdl:message>
  <wsdl:portType name="UserServicePortType">
    <wsdl:operation name="GetUser">
      <wsdl:input message="tns:GetUserInput"/>
      <wsdl:output message="tns:GetUserOutput"/>
    </wsdl:operation>
  </wsdl:portType>
</wsdl:definitions>

SOAP vs REST

| Dimension | SOAP | REST |
|---|---|---|
| Payload | XML (verbose) | JSON (compact) |
| Contract | WSDL (machine-readable) | OpenAPI (optional) |
| Transport | HTTP, SMTP, TCP | HTTP only |
| State | Stateful or stateless | Stateless |
| Security | WS-Security (powerful but complex) | OAuth 2.0, JWT, mTLS |
| Error handling | soap:Fault (standardized) | HTTP status codes (convention-based) |
| Tooling | Mature but heavy | Light and universal |
| Still used for | Banking, insurance, health (HL7), government | Virtually everything new |

XML-RPC predates SOAP — a simpler, less extensible ancestor using XML payloads over HTTP POST. Effectively obsolete.


WebSocket

WebSocket provides a persistent, full-duplex TCP connection between client and server, established via an HTTP upgrade handshake. Once established, either side can send messages at any time with minimal overhead.

Handshake

# Client initiates upgrade:
GET /ws HTTP/1.1
Host: api.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

# Server confirms upgrade:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
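
The accept token is not arbitrary: RFC 6455 derives it from the client key, so the client can verify the server actually speaks WebSocket. A sketch of the computation:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def accept_key(client_key: str) -> str:
    """Compute Sec-WebSocket-Accept from Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
# s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```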

After the handshake, the connection is no longer HTTP. Data flows as frames — the minimal overhead unit:

| Frame Type | Description |
|---|---|
| Text frame | UTF-8 text message |
| Binary frame | Raw bytes (audio, video, protobuf) |
| Ping frame | Heartbeat probe (either side may send) |
| Pong frame | Heartbeat response |
| Close frame | Graceful connection termination |

Connection Management

The primary operational challenge of WebSocket is connection state management:

  • Heartbeats (ping/pong): detect dead connections that appear open at the TCP layer. Servers should send pings every 30–60s; if no pong arrives, close and clean up.
  • Reconnection: clients should implement exponential backoff when the connection drops. Libraries like reconnecting-websocket handle this automatically.
  • Backpressure: if a slow client can't consume fast enough, the server's send buffer fills. Monitor ws.bufferedAmount on the client, or implement application-level flow control.
  • Horizontal scaling: WebSocket connections are stateful and sticky. A message sent by user A (connected to server 1) destined for user B (connected to server 2) must be routed between servers via a pub/sub layer (Redis Pub/Sub, Kafka).
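
The reconnection backoff mentioned above is typically capped exponential with jitter. A minimal sketch (the base and cap values here are illustrative, not prescribed):

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Capped exponential backoff with full jitter (delay in seconds)."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

# Upper bound of the delay per attempt: 1s, 2s, 4s, 8s, 16s, 30s, 30s, ...
print([min(30.0, 1.0 * 2 ** n) for n in range(7)])
```

Full jitter (drawing uniformly from zero up to the cap) spreads reconnect attempts out, so a server restart does not trigger a thundering herd of simultaneous reconnects.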

When to Use WebSocket

  • Interactive real-time features: chat, collaborative document editing, multiplayer gaming
  • Financial data: live order books, tick-by-tick price feeds
  • IoT: bidirectional device control with low latency
  • When the client sends frequent data to the server (>1 msg/second)

Server-Sent Events (SSE)

SSE is a web standard (originally W3C, now part of the WHATWG HTML Living Standard) for server-to-client streaming over plain HTTP. Unlike WebSocket, there is no protocol upgrade — it's just a long-lived HTTP response with Content-Type: text/event-stream.

Protocol

Server response:

HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive

id: 1
event: message
data: {"type": "notification", "text": "Hello!"}

id: 2
event: update
data: {"user": "alice", "status": "online"}

: heartbeat comment (ignored by client)

SSE message fields:

| Field | Purpose |
|---|---|
| data: | The message payload (can span multiple lines) |
| event: | Custom event type (client listens via addEventListener) |
| id: | Message ID; sent as Last-Event-ID header on reconnect |
| : (comment) | Ignored by client; used for keepalive pings |

Auto-Reconnection

SSE's killer feature: if the connection drops, the browser automatically reconnects and sends the Last-Event-ID header — the server can resume from where it left off. No client code required.

const source = new EventSource('/events');

source.addEventListener('message', e => console.log(e.data));
source.addEventListener('update', e => handleUpdate(JSON.parse(e.data)));
source.onerror = e => console.error('SSE error', e);
// Reconnection happens automatically — no manual retry logic needed

HTTP/2 SSE

Under HTTP/1.1, browsers limit each origin to about six concurrent connections, so a handful of open tabs each holding an SSE stream leaves little room for XHR/fetch requests. Under HTTP/2, all SSE streams multiplex over a single TCP connection and the limit effectively disappears.

AI Streaming

SSE is the standard for LLM token streaming. OpenAI, Anthropic, and virtually all LLM APIs stream completions via SSE because data flows in one direction (server → client), SSE is simpler than WebSocket, and auto-reconnect handles transient failures gracefully.
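
Outside the browser, where EventSource is unavailable, consuming the stream means splitting on blank lines. A simplified parser handling only the fields described above:

```python
def parse_sse(stream: str) -> list[dict]:
    """Split a text/event-stream payload into events (ignores comments)."""
    events = []
    for block in stream.split("\n\n"):
        event: dict = {}
        data_lines = []
        for line in block.splitlines():
            if line.startswith(":"):
                continue                     # comment / keepalive ping
            field, _, value = line.partition(": ")
            if field == "data":
                data_lines.append(value)
            elif field in ("event", "id"):
                event[field] = value
        if data_lines:
            event["data"] = "\n".join(data_lines)
            events.append(event)
    return events

raw = "id: 1\nevent: message\ndata: hello\n\n: heartbeat\n\n"
print(parse_sse(raw))   # [{'id': '1', 'event': 'message', 'data': 'hello'}]
```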


Webhooks

Webhooks are HTTP POST callbacks — the server pushes events to client-registered URLs instead of the client polling for changes. "Don't call us, we'll call you."

Flow

sequenceDiagram
    participant Client
    participant YourServer
    participant WebhookConsumer

    Client->>YourServer: Register webhook URL
    Note over YourServer: Event occurs (payment, commit, signup)
    YourServer->>WebhookConsumer: POST /webhook {"event": "payment.succeeded", ...}
    WebhookConsumer-->>YourServer: 200 OK (within 5s)
    Note over WebhookConsumer: Queue event for async processing

Production Webhook Pattern

Respond immediately, process asynchronously:

@app.post("/webhook")
async def webhook_handler(request: Request, background_tasks: BackgroundTasks):
    # 1. Validate the signature FIRST, against the raw body (not parsed JSON)
    raw_body = await request.body()
    # WEBHOOK_SECRET: shared secret from the provider's configuration
    if not verify_signature(request.headers, raw_body, WEBHOOK_SECRET):
        raise HTTPException(status_code=401, detail="invalid signature")
    # 2. Return 200 immediately — before any processing
    background_tasks.add_task(process_event, raw_body)
    return {"status": "accepted"}

Never do slow work (DB queries, API calls) in the webhook handler. Return 200 within 5 seconds or the sender will retry.

Security: Signature Verification

Every webhook provider should sign payloads. Verify before processing:

import hmac, hashlib

def verify_signature(headers: dict, body: bytes, secret: str) -> bool:
    expected = hmac.new(
        secret.encode(), body, hashlib.sha256
    ).hexdigest()
    received = headers.get("X-Signature-256", "").removeprefix("sha256=")
    return hmac.compare_digest(expected, received)

Reliability Patterns

| Pattern | Purpose |
|---|---|
| Idempotency key | Deduplicate retried deliveries — store processed event IDs |
| Exponential backoff retries | Sender retries on non-2xx: immediately, 5s, 30s, 5m, 30m, 2h |
| Dead letter queue | After N retries, move to a DLQ for manual inspection |
| Event replay | Allow consumers to re-request past events by ID |
| CloudEvents format | Standard envelope: id, source, type, time, data |
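
The idempotency-key pattern from the table fits in a few lines; here an in-memory set stands in for the persistent store (DB or Redis) you would use in production:

```python
processed_ids: set[str] = set()

def handle_delivery(event: dict) -> bool:
    """Process a webhook event at most once; returns True if processed."""
    event_id = event["id"]
    if event_id in processed_ids:
        return False                 # duplicate delivery from a sender retry
    processed_ids.add(event_id)
    # ... real event processing happens here ...
    return True

evt = {"id": "evt_123", "type": "payment.succeeded"}
print(handle_delivery(evt))   # True  — first delivery
print(handle_delivery(evt))   # False — retried delivery, deduplicated
```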

tRPC

tRPC lets TypeScript full-stack teams build APIs where type safety flows automatically from server to client — no code generation, no schema files, no out-of-sync types.

How It Works

  1. Define procedures on the server (TypeScript functions)
  2. Export the router's type
  3. Import and use that type on the client
  4. TypeScript infers input/output types automatically

The client never imports server implementation code — only the type. At runtime, tRPC serializes calls over HTTP (queries → GET/POST, mutations → POST, subscriptions → WebSocket).

Routers and Procedures

// server/routers/users.ts
import { z } from 'zod';
import { router, publicProcedure, protectedProcedure } from '../trpc';

export const userRouter = router({
  // Query — GET /trpc/users.getById
  getById: publicProcedure
    .input(z.object({ id: z.string() }))
    .query(async ({ input, ctx }) => {
      return ctx.db.user.findUnique({ where: { id: input.id } });
    }),

  // Mutation — POST /trpc/users.create
  create: protectedProcedure
    .input(z.object({ name: z.string(), email: z.string().email() }))
    .mutation(async ({ input, ctx }) => {
      return ctx.db.user.create({ data: input });
    }),
});

// server/routers/_app.ts
export const appRouter = router({
  users: userRouter,
  posts: postRouter,
  comments: commentRouter,
});

export type AppRouter = typeof appRouter;  // ← this is all the client needs

Client Usage

// client/trpc.ts
import { createTRPCReact } from '@trpc/react-query';
import type { AppRouter } from '../server/routers/_app';

export const trpc = createTRPCReact<AppRouter>();

// In a React component:
function UserProfile({ userId }: { userId: string }) {
  // Fully typed: input, output, error — all inferred from server code
  const { data, isLoading } = trpc.users.getById.useQuery({ id: userId });
  // data is typed as: User | null | undefined
  // Change server return type → TypeScript error here immediately
}

Context and Middleware

// Context: per-request shared state (auth user, DB, etc.)
export const createContext = async ({ req, res }: CreateNextContextOptions) => ({
  db: prisma,
  session: await getSession({ req }),
});

// Middleware: wraps procedures with reusable logic
const isAuthenticated = middleware(({ ctx, next }) => {
  if (!ctx.session?.user) throw new TRPCError({ code: 'UNAUTHORIZED' });
  return next({ ctx: { ...ctx, user: ctx.session.user } });
});

// Protected procedure: any procedure using this is automatically auth-gated
const protectedProcedure = publicProcedure.use(isAuthenticated);

tRPC vs Alternatives

| Dimension | tRPC | REST + OpenAPI | GraphQL |
|---|---|---|---|
| Type safety | ✅ Automatic, zero-gen | ⚠️ Code generation required | ⚠️ Code generation required |
| Language support | TypeScript/JS only | Universal | Universal |
| Schema file | ❌ None (types are the schema) | OpenAPI YAML/JSON | .graphql SDL |
| Learning curve | Low (just TypeScript) | Low | High |
| Client flexibility | ❌ Must use tRPC client | ✅ Any HTTP client | ✅ Any GraphQL client |
| Over/under-fetching | Field selection not built-in | Full response always | ✅ Client specifies fields |
| Best for | TypeScript monorepos (T3 stack, Next.js) | Public APIs, polyglot | Complex multi-client frontends |

Choosing the Right API Paradigm

Is this a public API consumed by external developers or third parties?
→ REST (universal, familiar, broad tooling)

Is the frontend complex with multiple clients fetching different data shapes?
→ GraphQL (eliminates over/under-fetching, empowers frontend teams)

Is this internal service-to-service communication with high throughput?
→ gRPC (fastest, binary, streaming support, code-gen clients)

Does the data need to flow in real time in both directions?
→ WebSocket (full-duplex, persistent)

Does the server push updates to passive clients (feeds, notifications)?
→ SSE (simpler than WebSocket, HTTP-native, auto-reconnect)

Is the entire stack TypeScript and owned by one team?
→ tRPC (zero boilerplate, type-safe end-to-end)

Does an external system need to notify you when events occur?
→ Webhooks (event-driven push, polling eliminated)

Is this a legacy enterprise or regulated domain (banking, healthcare)?
→ SOAP (accept the complexity; interoperability with existing systems)

It Is Not Either-Or

Real systems commonly use multiple paradigms together: a public REST API for external consumers, gRPC internally between microservices, GraphQL for the customer-facing frontend, WebSocket for real-time features, and webhooks for third-party integrations.