Web Services Architecture¶
Deep dive into every major API paradigm — how each protocol works under the hood, when to use each, and how they compare.
Protocol Comparison Overview¶
graph TD
A[Client needs data] --> B{Use case?}
B -->|Public API, CRUD, browser-native| C[REST]
B -->|Flexible queries, complex frontends| D[GraphQL]
B -->|Internal service-to-service, streaming| E[gRPC]
B -->|Real-time bidirectional| F[WebSocket]
B -->|Server pushes only, notifications| G[SSE]
B -->|TypeScript full-stack only| H[tRPC]
B -->|Event notification to external systems| I[Webhooks]
B -->|Legacy enterprise integration| J[SOAP]
| Protocol | Transport | Format | Direction | Browser Native | Best For |
|---|---|---|---|---|---|
| REST | HTTP/1.1, HTTP/2 | JSON (typically) | Req/Res | ✅ | Public APIs, CRUD, resource modeling |
| GraphQL | HTTP/1.1, HTTP/2 | JSON | Req/Res + Subscription | ✅ | Complex frontends, data aggregation |
| gRPC | HTTP/2 only | Protocol Buffers (binary) | Req/Res + Streaming | ⚠️ (needs proxy) | Internal microservices, high-throughput |
| SOAP | HTTP, SMTP, TCP | XML | Req/Res | ✅ | Legacy enterprise, financial services |
| WebSocket | WS (TCP upgrade) | Any (text/binary) | Full-duplex | ✅ | Real-time chat, gaming, collaboration |
| SSE | HTTP/1.1, HTTP/2 | Text (UTF-8) | Server → Client only | ✅ | Feeds, notifications, AI streaming |
| Webhooks | HTTP POST | JSON (typically) | Server → Client push | ✅ | Event-driven integrations, automation |
| tRPC | HTTP/WebSocket | JSON | Req/Res + Subscription | ✅ (Node/TS only) | TypeScript full-stack monorepos |
REST (Representational State Transfer)¶
Roy Fielding defined REST in his 2000 doctoral dissertation as an architectural style — not a protocol — built on six constraints that, when applied together, produce a scalable, stateless, and cacheable web service.
The Six Architectural Constraints¶
1. Client–Server Separation¶
The client and server evolve independently. The server manages data storage and business logic; the client manages the user interface and user state. Neither depends on the other's implementation details — only the shared API contract.
This decoupling allows frontend teams to swap frameworks (React → Vue) or mobile clients to evolve, without requiring backend changes, and vice versa.
2. Stateless¶
Every request from client to server must contain all information necessary to understand and process the request. The server stores no session state between requests.
❌ Stateful (server stores session):
POST /login → server creates session, returns cookie
GET /dashboard → server reads session to identify user
✅ Stateless (client carries state):
GET /dashboard
Authorization: Bearer eyJhbGciOiJSUzI1NiJ9...
Consequences:
- Scalability: any server instance can handle any request — no sticky sessions
- Reliability: no session state to lose if a server crashes
- Overhead: every request must carry auth credentials and context (larger payloads)
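The stateless model can be sketched as a handler that authenticates every request from the request itself, never from a server-side session. A minimal illustration — the token table and function names below are stand-ins, not any real framework's API:

```python
# Stateless handler sketch: every request carries its own credentials,
# so any server instance can serve it. VALID_TOKENS stands in for real
# JWT signature verification.
VALID_TOKENS = {"demo-token": "alice"}

def handle_dashboard(headers: dict) -> tuple[int, str]:
    token = headers.get("Authorization", "").removeprefix("Bearer ")
    user = VALID_TOKENS.get(token)   # no session store is consulted
    if user is None:
        return 401, "Unauthorized"
    return 200, f"dashboard for {user}"
```

Because the handler touches no shared session state, it behaves identically on every server behind a load balancer.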
3. Cacheable¶
Responses must declare whether they are cacheable or not. When responses are cacheable, clients and intermediaries (CDNs, proxies) can serve them without contacting the server.
Key HTTP cache headers:
| Header | Purpose | Example |
|---|---|---|
| Cache-Control | Directives for caching behavior | Cache-Control: max-age=3600, public |
| ETag | Fingerprint of resource version | ETag: "d8e8fca2dc0f896fd7cb4cb0031ba249" |
| Last-Modified | When resource last changed | Last-Modified: Tue, 22 Apr 2026 12:00:00 GMT |
| Vary | Which headers affect the cache key | Vary: Accept-Encoding, Authorization |
Conditional requests let clients validate their cache:
GET /users/42
If-None-Match: "d8e8fca2dc0f896fd7cb4cb0031ba249"
→ 304 Not Modified (body omitted — client uses cached copy)
→ 200 OK + new ETag + new body (cache miss — resource changed)
4. Uniform Interface¶
The single most important constraint. It defines four sub-principles:
4a. Resource Identification in Requests — every resource has a stable URI:
/users → collection of users
/users/42 → specific user
/users/42/orders → orders belonging to user 42
/users/42/orders/7/items → items in that order
4b. Manipulation via Representations — clients hold representations (JSON, XML, HTML), not live objects. The client modifies the representation and sends it back.
4c. Self-Descriptive Messages — each request/response carries enough metadata to describe how to process it: Content-Type, method, status code, cache directives.
4d. HATEOAS — see section below.
5. Layered System¶
Clients cannot tell whether they're connected directly to the server or an intermediary (load balancer, CDN, API gateway, caching proxy). Each layer only knows about the adjacent layer.
This enables transparent insertion of:
- CDNs for caching at the edge
- API gateways for auth, rate limiting, routing
- Load balancers for distributing traffic
- Service meshes for observability and mTLS
6. Code on Demand (optional)¶
The only optional constraint. Servers can temporarily extend client functionality by transferring executable code (e.g., JavaScript). Rarely relevant in modern API design.
HTTP Methods and Idempotency¶
| Method | Semantics | Idempotent | Safe | Common Use |
|---|---|---|---|---|
| GET | Retrieve resource(s) | ✅ | ✅ | Read data |
| HEAD | GET without body (check existence/metadata) | ✅ | ✅ | Cache validation |
| POST | Create a new resource; non-idempotent actions | ❌ | ❌ | Create, submit form, trigger action |
| PUT | Replace entire resource (upsert) | ✅ | ❌ | Full update |
| PATCH | Partial update | ❌* | ❌ | Partial update |
| DELETE | Remove resource | ✅ | ❌ | Delete |
| OPTIONS | Discover allowed methods (used for CORS preflight) | ✅ | ✅ | CORS |
* PATCH can be designed idempotently but is not required to be.
Safe = no side effects (read-only). Idempotent = making the same request N times has the same effect as making it once.
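The difference is easy to see against an in-memory store: repeating a PUT leaves the same state, while repeating a POST does not. A toy sketch (the store and function names are illustrative):

```python
users: dict[int, dict] = {}

def put_user(user_id: int, body: dict) -> None:
    users[user_id] = body            # full replace: N repeats == 1 apply

def post_user(body: dict) -> int:
    new_id = max(users, default=0) + 1
    users[new_id] = body             # each repeat creates a new resource
    return new_id

put_user(42, {"name": "Alice"})
put_user(42, {"name": "Alice"})      # idempotent: still exactly one user 42
post_user({"name": "Bob"})
post_user({"name": "Bob"})           # not idempotent: two Bobs now exist
```

This is why clients may safely retry a timed-out PUT or DELETE, but must use an idempotency key before retrying a POST.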
HTTP Status Codes¶
| Range | Category | Key Codes |
|---|---|---|
| 2xx | Success | 200 OK, 201 Created, 202 Accepted, 204 No Content |
| 3xx | Redirection | 301 Moved Permanently, 304 Not Modified |
| 4xx | Client Error | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 409 Conflict, 422 Unprocessable Entity, 429 Too Many Requests |
| 5xx | Server Error | 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout |
Common Status Code Mistakes
- Never return 200 OK with an error in the body — clients must parse every body to detect errors
- Use 401 for unauthenticated, 403 for authenticated but unauthorized
- Use 422 (not 400) when the request is syntactically valid but semantically wrong (e.g. an invalid field value)
- 404 means "resource not found", not "I don't know" — don't use it as a catch-all
HATEOAS¶
Hypermedia as the Engine of Application State — the fourth sub-principle of the uniform interface constraint. Responses include hyperlinks that describe what actions are available next. Clients need no prior knowledge of URL structure; they navigate by following links.
{
"id": 42,
"name": "Alice",
"email": "[email protected]",
"_links": {
"self": { "href": "/users/42", "method": "GET" },
"orders": { "href": "/users/42/orders", "method": "GET" },
"update": { "href": "/users/42", "method": "PUT" },
"delete": { "href": "/users/42", "method": "DELETE" }
}
}
Benefits: API is self-documenting; server can change URL structure without breaking clients; workflow steps are discoverable.
In practice: very few production APIs implement full HATEOAS. Most APIs reach Level 2 of the Richardson Maturity Model (proper HTTP verbs) and stop there.
Richardson Maturity Model¶
A framework for measuring how RESTful an API actually is:
| Level | Name | What It Adds | Example |
|---|---|---|---|
| 0 | Swamp of POX | Single endpoint, single method | POST /api with XML body specifying action |
| 1 | Resources | Multiple URIs, but still single HTTP verb | POST /users, POST /users/42 |
| 2 | HTTP Verbs | Uses GET/POST/PUT/DELETE meaningfully | GET /users/42, DELETE /users/42 |
| 3 | Hypermedia | Responses contain links for navigation (HATEOAS) | JSON with _links section |
Roy Fielding has stated that hypermedia (Level 3) is a precondition for calling an API RESTful. Most production APIs sit at Level 2 — which is fine for practical purposes, even if technically not "truly RESTful."
GraphQL¶
Facebook created GraphQL in 2012 and open-sourced it in 2015. It is a query language for your API and a runtime for executing those queries — giving clients the power to ask for exactly what they need and nothing more.
Core Concept: Single Endpoint¶
Unlike REST's resource-per-endpoint model, GraphQL exposes a single endpoint (typically POST /graphql) that accepts queries describing the exact shape of data needed.
# REST requires 3 round trips:
# GET /users/42
# GET /users/42/posts
# GET /posts/7/comments
# GraphQL fetches all in one request:
query {
user(id: 42) {
name
email
posts(limit: 5) {
title
publishedAt
comments(limit: 3) {
body
author { name }
}
}
}
}
Type System and Schema¶
Everything in GraphQL is strongly typed. The schema is the single source of truth — it describes every piece of data the API can return and every operation clients can perform.
Scalar Types¶
Built-in primitives: Int, Float, String, Boolean, ID. Custom scalars can be defined (e.g., DateTime, URL, JSON).
Object Types¶
type User {
id: ID! # ! = non-nullable
name: String!
email: String!
createdAt: DateTime!
posts: [Post!]! # non-null list of non-null Posts
}
type Post {
id: ID!
title: String!
body: String
author: User!
tags: [String!]!
}
Special Root Types¶
type Query {
user(id: ID!): User
users(limit: Int = 20, offset: Int = 0): [User!]!
}
type Mutation {
createUser(input: CreateUserInput!): User!
updateUser(id: ID!, input: UpdateUserInput!): User!
deleteUser(id: ID!): Boolean!
}
type Subscription {
userCreated: User!
messageReceived(roomId: ID!): Message!
}
Other Type Categories¶
| Type | Purpose | Example |
|---|---|---|
| Input | Arguments to mutations | input CreateUserInput { name: String!, email: String! } |
| Enum | Fixed set of values | enum Status { ACTIVE INACTIVE SUSPENDED } |
| Interface | Shared fields across types | interface Node { id: ID! } |
| Union | Type can be one of many | union SearchResult = User \| Post \| Comment |
| Fragment | Reusable field selection | fragment UserFields on User { id name email } |
Queries, Mutations, Subscriptions¶
Query — read data. Resolvers can be called in parallel:
query GetDashboard {
currentUser {
name
notifications(unread: true) { id title }
}
trending { title views }
}
Mutation — write data. Resolvers execute sequentially:
mutation CreatePost($input: CreatePostInput!) {
createPost(input: $input) {
id
title
author { name }
}
}
Subscription — real-time data via WebSocket (typically). Server pushes updates when events occur:
subscription OnMessageReceived($roomId: ID!) {
messageReceived(roomId: $roomId) {
id body sender { name } sentAt
}
}
Resolvers¶
Resolvers are functions that produce data for each field in the schema. GraphQL execution is a depth-first traversal of the query tree — each field resolver receives:
- parent — resolved value of the parent field
- args — arguments passed to this field
- context — shared object (DB connection, auth user, DataLoaders)
- info — query metadata (field name, selection set, schema)
const resolvers = {
Query: {
user: async (_, { id }, { db }) => db.users.findById(id),
users: async (_, { limit, offset }, { db }) =>
db.users.findAll({ limit, offset }),
},
User: {
// Parent resolver returned a user object; now resolve its posts field
posts: async (user, { limit }, { db }) =>
db.posts.findByUserId(user.id, limit),
},
Mutation: {
createUser: async (_, { input }, { db }) => db.users.create(input),
},
};
The N+1 Problem¶
The most common GraphQL performance trap. Without optimization, resolving a list of N users and their posts triggers 1 + N queries:
Query: users(limit: 20) → SELECT * FROM users LIMIT 20 (1 query)
User[0].posts → SELECT * FROM posts WHERE user_id = 1 (1 query)
User[1].posts → SELECT * FROM posts WHERE user_id = 2 (1 query)
...
User[19].posts → SELECT * FROM posts WHERE user_id = 20 (1 query)
TOTAL: 21 queries
Real-world impact compounds with nesting — posts fetching authors fetching their posts can generate hundreds of queries for a single GraphQL request.
DataLoader — The Solution¶
Facebook's DataLoader batches and caches loads within a single request using Node.js's event loop tick:
import DataLoader from 'dataloader';
// Created once per request (NOT per application startup)
const postsByUserLoader = new DataLoader(async (userIds: readonly string[]) => {
// Single batch query: SELECT * FROM posts WHERE user_id IN (1, 2, ..., 20)
const posts = await db.posts.findByUserIds(userIds);
// Return results in same order as input keys
return userIds.map(id => posts.filter(p => p.userId === id));
});
// In resolver — these 20 calls become ONE SQL query
const resolvers = {
User: {
posts: (user, _, { loaders }) =>
loaders.postsByUser.load(user.id), // batched automatically
},
};
Result: 21 queries → 2 queries (one for users, one batch for all posts).
DataLoader Instance Per Request
Create a new DataLoader instance for each request. DataLoader caches results for the duration of a request — sharing across requests will serve stale data.
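The batch-and-cache idea is not Node-specific. A deliberately simplified, synchronous sketch (this is not the real DataLoader API — class and function names here are invented for illustration) shows the mechanics: dedupe keys, issue one batch call, cache for the rest of the request:

```python
class BatchLoader:
    """Toy per-request loader: dedupes keys, issues one batch call, caches."""
    def __init__(self, batch_fn):
        self.batch_fn = batch_fn
        self.cache = {}

    def load_many(self, keys):
        # dict.fromkeys dedupes while preserving order
        missing = [k for k in dict.fromkeys(keys) if k not in self.cache]
        if missing:
            for key, value in zip(missing, self.batch_fn(missing)):
                self.cache[key] = value   # one query covers all missing keys
        return [self.cache[k] for k in keys]

calls = []
def fetch_posts_batch(user_ids):
    calls.append(list(user_ids))          # stands in for one SQL `IN (...)` query
    return [f"posts-of-{uid}" for uid in user_ids]

loader = BatchLoader(fetch_posts_batch)
loader.load_many([1, 2, 3, 1])            # four loads, a single batch query
```

The real DataLoader does the same thing asynchronously, collecting `.load()` calls made within one event-loop tick into a single batch.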
Directives¶
Directives annotate schema elements or control query execution:
type User {
email: String! @deprecated(reason: "Use contactEmail instead")
contactEmail: String!
password: String! @auth(requires: ADMIN) # custom directive
}
# Built-in execution directives:
query GetUser($showEmail: Boolean!) {
user(id: 42) {
name
email @include(if: $showEmail) # conditionally include field
phone @skip(if: $skipPhone) # conditionally skip field
}
}
Introspection¶
GraphQL APIs are self-documenting — clients can query the schema itself through the built-in __schema and __type introspection fields.
Introspection powers tools like GraphiQL, Apollo Studio, and GraphQL Playground. Disable introspection in production for security-sensitive APIs.
Query Complexity and Depth Limiting¶
Without limits, a malicious client can craft exponentially expensive queries:
# Denial-of-service via deeply nested query:
{ user { friends { friends { friends { friends { ... } } } } } }
Protect with:
- Depth limiting: reject queries deeper than N levels (graphql-depth-limit)
- Complexity analysis: assign costs to fields; reject queries over a budget (graphql-validation-complexity)
- Query whitelisting (persisted queries): only allow pre-approved queries in production
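A depth limiter can be approximated by tracking brace nesting in the query text. Real implementations such as graphql-depth-limit walk the parsed AST instead, but a sketch conveys the check (function names here are illustrative):

```python
def selection_depth(query: str) -> int:
    """Crude depth measure: maximum brace nesting in the query text."""
    depth = max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth

def enforce_depth_limit(query: str, limit: int = 5) -> None:
    # Reject before execution — the expensive part is resolving, not parsing
    if selection_depth(query) > limit:
        raise ValueError("query too deep — rejected")

enforce_depth_limit("{ user { name } }")   # depth 2: accepted
```

Complexity analysis works the same way but sums per-field costs instead of counting levels.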
Federation¶
GraphQL Federation lets multiple teams own separate subgraphs that compose into a unified supergraph — one schema, one endpoint, distributed implementation.
┌─────────────────────────────────────────────┐
│ Apollo Router (Supergraph) │
│ Single endpoint: POST /graphql │
└────────┬──────────────┬──────────────────────┘
│ │
┌─────▼─────┐ ┌─────▼──────┐ ┌──────────┐
│ Users │ │ Products │ │ Orders │
│ Subgraph │ │ Subgraph │ │ Subgraph │
│ (Team A) │ │ (Team B) │ │ (Team C) │
└───────────┘ └────────────┘ └──────────┘
Key concepts:
- Entities: types that can be extended across subgraphs, identified by a @key directive
- __resolveReference: resolver that hydrates an entity from a key passed by the router
- @external: field defined in another subgraph, referenced here
- Each subgraph is independently deployable; the router composes them at query time
gRPC¶
gRPC (officially a recursive acronym: gRPC Remote Procedure Calls) is a high-performance, open-source RPC framework developed at Google that uses Protocol Buffers as its interface definition language and serialization format, and HTTP/2 as the transport protocol. A CNCF project since 2016.
Protocol Buffers (Protobuf)¶
Protobuf is a language-neutral, platform-neutral binary serialization format. Compared to JSON:
| Property | JSON | Protobuf |
|---|---|---|
| Format | Text (UTF-8) | Binary |
| Size | ~1x baseline | 3–10x smaller |
| Parse speed | ~1x baseline | 5–10x faster |
| Schema | Optional (JSON Schema) | Required (.proto file) |
| Human-readable | ✅ | ❌ (need tools) |
| Schema evolution | Manual / fragile | Built-in field numbering |
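The size gap is easy to feel with a toy fixed layout. The struct packing below merely stands in for protobuf's compact wire format — protobuf actually uses tagged varints, not fixed offsets — but the text-versus-binary contrast holds:

```python
import json
import struct

user = {"id": 42, "created_at": 1_714_000_000}

# JSON repeats field names in every message
as_json = json.dumps(user).encode()

# A hand-rolled fixed binary layout: the "schema" lives in code, not in the bytes
as_binary = struct.pack("<iq", user["id"], user["created_at"])   # int32 + int64

assert len(as_binary) < len(as_json)   # 12 bytes vs 36 bytes
```

Multiply that ratio by millions of messages per second and the serialization choice dominates bandwidth and CPU budgets.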
A .proto service definition:
syntax = "proto3";
package com.example.users;
// Message types
message User {
string id = 1;
string name = 2;
string email = 3;
int64 created_at = 4;
}
message GetUserRequest { string user_id = 1; }
message CreateUserRequest {
string name = 1;
string email = 2;
}
message UserList { repeated User users = 1; }
// Minimal definitions so the service below compiles (fields illustrative):
message ListUsersRequest { int32 page_size = 1; }
message ChatMessage { string sender = 1; string body = 2; }
// Service definition
service UserService {
// Unary
rpc GetUser(GetUserRequest) returns (User);
// Server streaming
rpc ListUsers(ListUsersRequest) returns (stream User);
// Client streaming
rpc CreateUsersBulk(stream CreateUserRequest) returns (UserList);
// Bidirectional streaming
rpc Chat(stream ChatMessage) returns (stream ChatMessage);
}
The protoc compiler generates strongly-typed client stubs and server interfaces in Go, Java, Python, C++, Node.js, Rust, Kotlin, Swift, and more.
HTTP/2 Features Exploited by gRPC¶
| HTTP/2 Feature | What It Enables |
|---|---|
| Multiplexing | Multiple RPC calls on one TCP connection; no head-of-line blocking between requests |
| Binary framing | Headers and data sent as binary frames — more efficient than HTTP/1.1 text headers |
| Header compression (HPACK) | Repeated headers (auth token, content-type) sent as index references after first use; 85–90% header reduction |
| Full-duplex streams | Client and server can send frames simultaneously on the same stream |
| Flow control | Prevents fast producers from overwhelming slow consumers per-stream |
| Server push | Server can pre-emptively send resources (rarely used in gRPC) |
The Four Streaming Types¶
Unary RPC¶
Classic request-response. Client sends one message, server sends one message. Equivalent to a REST GET.
Server Streaming RPC¶
Client sends one request; server streams multiple responses. Useful for: live logs, large dataset export, real-time feeds.
Client Streaming RPC¶
Client streams multiple messages; server collects them and returns one response. Useful for: telemetry ingestion, file uploads chunked by the client, batch writes.
Bidirectional Streaming RPC¶
Both sides can send and receive messages in any order over a long-lived connection. Both streams operate independently. Useful for: chat, collaborative editing, real-time games, audio/video signaling.
Deadlines and Cancellation¶
Every gRPC call should set a deadline — the absolute time by which the client requires a response. The server checks whether the deadline has been exceeded before starting expensive work.
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
resp, err := client.GetUser(ctx, &pb.GetUserRequest{UserId: "42"})
Deadlines propagate through the entire call chain — if service A calls service B calls service C, all three respect the same deadline window, preventing one slow downstream call from causing timeouts at every layer.
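The propagation idea can be expressed language-agnostically: every hop computes the remaining budget from one shared absolute deadline and refuses work once it is spent. A sketch in Python — the gRPC libraries do this for you via context/deadline metadata, so the function names here are purely illustrative:

```python
import time

def call_chain(deadline: float, hops: list) -> str:
    """Each 'service' checks the shared absolute deadline before doing work."""
    for hop in hops:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            # Don't even start — downstream work would be wasted
            raise TimeoutError("deadline exceeded")
        hop(min(remaining, 0.05))   # downstream call bounded by what's left

    return "ok"

deadline = time.monotonic() + 0.2          # client sets ONE absolute deadline
result = call_chain(deadline, [time.sleep, time.sleep])   # A → B → C share it
```

Because the deadline is absolute rather than per-hop, a slow hop automatically shrinks the budget of everything after it.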
Interceptors¶
Interceptors wrap gRPC method invocations — the gRPC equivalent of middleware:
// Unary server interceptor for logging
func loggingInterceptor(ctx context.Context, req interface{},
info *grpc.UnaryServerInfo, handler grpc.UnaryHandler,
) (interface{}, error) {
start := time.Now()
resp, err := handler(ctx, req)
log.Printf("Method: %s | Duration: %v | Error: %v",
info.FullMethod, time.Since(start), err)
return resp, err
}
// Register:
s := grpc.NewServer(
grpc.UnaryInterceptor(loggingInterceptor),
grpc.StreamInterceptor(streamLoggingInterceptor),
)
Common interceptors: authentication, tracing (OpenTelemetry), logging, metrics, panic recovery, rate limiting, deadline enforcement.
Load Balancing¶
Because gRPC multiplexes many RPCs over a single TCP connection, L4 (TCP) load balancing distributes connections, not RPCs. A single long-lived connection from service A to a single pod of service B bypasses all other pods.
Solutions: - L7 (application-layer) load balancing — proxy understands HTTP/2 streams and distributes individual RPCs: Envoy, nginx, gRPC-aware load balancers - Client-side load balancing — the gRPC client resolves all backend IPs (via DNS), maintains connections to each, and distributes RPCs itself - Headless services in Kubernetes — returns all pod IPs; combined with gRPC client-side round-robin
SOAP / XML-RPC¶
SOAP (Simple Object Access Protocol) is the predecessor to REST. Still deeply embedded in enterprise systems, financial services, healthcare (HL7), and government integrations.
Protocol Structure¶
A SOAP message is an XML document with a mandatory Envelope, optional Header, and mandatory Body:
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:usr="http://example.com/users">
<soap:Header>
<usr:AuthToken>abc123</usr:AuthToken>
</soap:Header>
<soap:Body>
<usr:GetUser>
<usr:UserId>42</usr:UserId>
</usr:GetUser>
</soap:Body>
</soap:Envelope>
WSDL (Web Services Description Language)¶
WSDL is SOAP's IDL — an XML document that describes the service completely: operations, input/output message types, bindings (how operations map to protocols), and endpoints. It serves the same role as OpenAPI for REST or .proto files for gRPC.
<wsdl:definitions name="UserService" ...>
<wsdl:types>
<xs:schema>
<xs:element name="GetUserRequest">
<xs:complexType>
<xs:sequence>
<xs:element name="UserId" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
</wsdl:types>
<wsdl:message name="GetUserInput">
<wsdl:part name="parameters" element="tns:GetUserRequest"/>
</wsdl:message>
<wsdl:portType name="UserServicePortType">
<wsdl:operation name="GetUser">
<wsdl:input message="tns:GetUserInput"/>
<wsdl:output message="tns:GetUserOutput"/>
</wsdl:operation>
</wsdl:portType>
</wsdl:definitions>
SOAP vs REST¶
| Dimension | SOAP | REST |
|---|---|---|
| Payload | XML (verbose) | JSON (compact) |
| Contract | WSDL (machine-readable) | OpenAPI (optional) |
| Transport | HTTP, SMTP, TCP | HTTP only |
| State | Stateful or stateless | Stateless |
| Security | WS-Security (powerful but complex) | OAuth 2.0, JWT, mTLS |
| Error handling | soap:Fault (standardized) | HTTP status codes (convention-based) |
| Tooling | Mature but heavy | Light and universal |
| Still used for | Banking, insurance, health (HL7), government | Virtually everything new |
XML-RPC predates SOAP — a simpler, less extensible ancestor using XML payloads over HTTP POST. Effectively obsolete.
WebSocket¶
WebSocket provides a persistent, full-duplex TCP connection between client and server, established via an HTTP upgrade handshake. Once established, either side can send messages at any time with minimal overhead.
Handshake¶
# Client initiates upgrade:
GET /ws HTTP/1.1
Host: api.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
# Server confirms upgrade:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
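The Sec-WebSocket-Accept value is not arbitrary — RFC 6455 defines it as the base64 of the SHA-1 of the client's key concatenated with a fixed GUID, which is exactly how the server derived the value in the handshake above:

```python
import base64
import hashlib

# Fixed GUID from RFC 6455 — identical for every WebSocket handshake
WS_MAGIC_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def websocket_accept(sec_websocket_key: str) -> str:
    digest = hashlib.sha1((sec_websocket_key + WS_MAGIC_GUID).encode()).digest()
    return base64.b64encode(digest).decode()

# The key/accept pair from the handshake above round-trips:
assert websocket_accept("dGhlIHNhbXBsZSBub25jZQ==") == "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

The check proves to the client that the server actually speaks WebSocket rather than being a confused HTTP intermediary echoing headers.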
After the handshake, the connection is no longer HTTP. Data flows as frames — the minimal overhead unit:
| Frame Type | Description |
|---|---|
| Text frame | UTF-8 text message |
| Binary frame | Raw bytes (audio, video, protobuf) |
| Ping frame | Heartbeat probe (either peer may send) |
| Pong frame | Heartbeat response |
| Close frame | Graceful connection termination |
Connection Management¶
The primary operational challenge of WebSocket is connection state management:
- Heartbeats (ping/pong): detect dead connections that appear open at the TCP layer. Servers should send pings every 30–60s; if no pong arrives, close and clean up.
- Reconnection: clients should implement exponential backoff when the connection drops. Libraries like reconnecting-websocket handle this automatically.
- Backpressure: if a slow client can't consume fast enough, the sender's buffer fills. Monitor bufferedAmount on the sending socket, or implement application-level flow control.
- Horizontal scaling: WebSocket connections are stateful and sticky. A message sent by user A (connected to server 1) destined for user B (connected to server 2) must be routed between servers via a pub/sub layer (Redis Pub/Sub, Kafka).
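Exponential backoff for reconnection is only a few lines. A sketch of the delay schedule — the base, cap, and jitter strategy ("full jitter") are illustrative choices, not a standard:

```python
import random

def backoff_delays(base: float = 1.0, cap: float = 30.0, attempts: int = 6):
    """Yield reconnect delays: base * 2^n, capped, with full jitter."""
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield random.uniform(0, ceiling)   # jitter avoids thundering herds

delays = list(backoff_delays())
# Ceilings grow 1, 2, 4, 8, 16, 30 — each actual delay is jittered below its ceiling
```

The jitter matters: after a server restart, thousands of clients reconnecting on identical schedules would otherwise arrive in synchronized waves.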
When to Use WebSocket¶
- Interactive real-time features: chat, collaborative document editing, multiplayer gaming
- Financial data: live order books, tick-by-tick price feeds
- IoT: bidirectional device control with low latency
- When the client sends frequent data to the server (>1 msg/second)
Server-Sent Events (SSE)¶
SSE is a W3C standard for server-to-client streaming over plain HTTP. Unlike WebSocket, there is no protocol upgrade — it's just a long-lived HTTP response with Content-Type: text/event-stream.
Protocol¶
Server response:
HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
id: 1
event: message
data: {"type": "notification", "text": "Hello!"}
id: 2
event: update
data: {"user": "alice", "status": "online"}
: heartbeat comment (ignored by client)
SSE message fields:
| Field | Purpose |
|---|---|
| data: | The message payload (can span multiple lines) |
| event: | Custom event type (client listens via addEventListener) |
| id: | Message ID; sent as Last-Event-ID header on reconnect |
| : (comment) | Ignored by client; used for keepalive pings |
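The wire format above is simple enough to parse by hand: events are separated by blank lines, each line is field: value, and comment lines start with a colon. A minimal sketch — real clients also handle multi-line data joining and other edge cases this skips:

```python
def parse_sse(stream: str) -> list[dict]:
    """Minimal parser for a text/event-stream payload."""
    events, current = [], {}
    for line in stream.splitlines():
        if not line:                   # blank line terminates the current event
            if current:
                events.append(current)
                current = {}
        elif line.startswith(":"):     # comment — keepalive ping, ignored
            continue
        else:
            field, _, value = line.partition(":")
            current[field] = value.lstrip()
    if current:
        events.append(current)
    return events

raw = 'id: 1\nevent: message\ndata: {"text": "Hello!"}\n\n: heartbeat\n\nid: 2\ndata: bye\n\n'
events = parse_sse(raw)               # two events; the comment is dropped
```

In the browser you never write this — EventSource does the parsing — but it demystifies what's on the wire.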
Auto-Reconnection¶
SSE's killer feature: if the connection drops, the browser automatically reconnects and sends the Last-Event-ID header — the server can resume from where it left off. No client code required.
const source = new EventSource('/events');
source.addEventListener('message', e => console.log(e.data));
source.addEventListener('update', e => handleUpdate(JSON.parse(e.data)));
source.onerror = e => console.error('SSE error', e);
// Reconnection happens automatically — no manual retry logic needed
HTTP/2 SSE¶
Under HTTP/1.1, browsers limit each domain to about six concurrent connections, and every open SSE stream holds one — a few tabs each holding an SSE connection can starve XHR/fetch requests to the same origin. Under HTTP/2, all SSE streams multiplex over a single TCP connection, so this limit disappears entirely.
AI Streaming
SSE is the standard for LLM token streaming. OpenAI, Anthropic, and virtually all LLM APIs stream completions via SSE because data flows in one direction (server → client), SSE is simpler than WebSocket, and auto-reconnect handles transient failures gracefully.
Webhooks¶
Webhooks are HTTP POST callbacks — the server pushes events to client-registered URLs instead of the client polling for changes. "Don't call us, we'll call you."
Flow¶
sequenceDiagram
participant Client
participant YourServer
participant WebhookConsumer
Client->>YourServer: Register webhook URL
Note over YourServer: Event occurs (payment, commit, signup)
YourServer->>WebhookConsumer: POST /webhook {"event": "payment.succeeded", ...}
WebhookConsumer-->>YourServer: 200 OK (within 5s)
Note over WebhookConsumer: Queue event for async processing
Production Webhook Pattern¶
Respond immediately, process asynchronously:
from fastapi import FastAPI, Request, BackgroundTasks

app = FastAPI()

@app.post("/webhook")
async def webhook_handler(request: Request, background_tasks: BackgroundTasks):
    body = await request.body()  # raw bytes — the signature covers exactly these
    # 1. Validate signature FIRST
    verify_signature(request.headers, body, WEBHOOK_SECRET)
    # 2. Queue the work and return 200 immediately — before any processing
    # (WEBHOOK_SECRET and process_event are defined elsewhere)
    background_tasks.add_task(process_event, body)
    return {"status": "accepted"}
Never do slow work (DB queries, API calls) in the webhook handler. Return 200 within 5 seconds or the sender will retry.
Security: Signature Verification¶
Every webhook provider should sign payloads. Verify before processing:
import hmac, hashlib
def verify_signature(headers: dict, body: bytes, secret: str) -> bool:
expected = hmac.new(
secret.encode(), body, hashlib.sha256
).hexdigest()
received = headers.get("X-Signature-256", "").removeprefix("sha256=")
return hmac.compare_digest(expected, received)
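The sending side computes the same MAC, so a local round trip shows the two halves agree. The header name and sha256= prefix below mirror the verification code above; real providers each pick their own header (Stripe-Signature, X-Hub-Signature-256, etc.):

```python
import hmac
import hashlib

def sign_payload(body: bytes, secret: str) -> str:
    # Sender side: MAC over the raw body bytes
    mac = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={mac}"

def verify(headers: dict, body: bytes, secret: str) -> bool:
    # Receiver side: recompute and compare in constant time
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    received = headers.get("X-Signature-256", "").removeprefix("sha256=")
    return hmac.compare_digest(expected, received)

body = b'{"event": "payment.succeeded"}'
headers = {"X-Signature-256": sign_payload(body, "whsec_demo")}
assert verify(headers, body, "whsec_demo")              # untampered: passes
assert not verify(headers, body + b" ", "whsec_demo")   # any byte change: fails
```

Note that verification must run over the raw request bytes — re-serializing parsed JSON can reorder keys or change whitespace and break the MAC.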
Reliability Patterns¶
| Pattern | Purpose |
|---|---|
| Idempotency key | Deduplicate retried deliveries — store processed event IDs |
| Exponential backoff retries | Sender retries on non-2xx: immediately, 5s, 30s, 5m, 30m, 2h |
| Dead letter queue | After N retries, move to DLQ for manual inspection |
| Event replay | Allow consumers to re-request past events by ID |
| CloudEvents format | Standard envelope: id, source, type, time, data |
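Idempotency-key deduplication is, at its core, a set lookup keyed by the provider's event ID. A sketch of a consumer that processes each delivery at most once — the event shape and in-memory store are illustrative; production uses a durable store (Redis/DB) with a TTL:

```python
processed_ids: set[str] = set()   # illustrative; production: durable store with TTL
handled = []

def handle_delivery(event: dict) -> bool:
    """Process an event at most once, keyed by the provider's event ID."""
    if event["id"] in processed_ids:
        return False              # retried delivery — already handled, ack and drop
    processed_ids.add(event["id"])
    handled.append(event["type"])
    return True

handle_delivery({"id": "evt_1", "type": "payment.succeeded"})
handle_delivery({"id": "evt_1", "type": "payment.succeeded"})   # sender retry — deduped
```

Combined with the sender's exponential-backoff retries, this makes delivery effectively exactly-once from the consumer's point of view.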
tRPC¶
tRPC lets TypeScript full-stack teams build APIs where type safety flows automatically from server to client — no code generation, no schema files, no out-of-sync types.
How It Works¶
- Define procedures on the server (TypeScript functions)
- Export the router's type
- Import and use that type on the client
- TypeScript infers input/output types automatically
The client never imports server implementation code — only the type. At runtime, tRPC serializes calls over HTTP (queries → GET/POST, mutations → POST, subscriptions → WebSocket).
Routers and Procedures¶
// server/routers/users.ts
import { z } from 'zod';
import { router, publicProcedure, protectedProcedure } from '../trpc';
export const userRouter = router({
// Query — GET /trpc/users.getById
getById: publicProcedure
.input(z.object({ id: z.string() }))
.query(async ({ input, ctx }) => {
return ctx.db.user.findUnique({ where: { id: input.id } });
}),
// Mutation — POST /trpc/users.create
create: protectedProcedure
.input(z.object({ name: z.string(), email: z.string().email() }))
.mutation(async ({ input, ctx }) => {
return ctx.db.user.create({ data: input });
}),
});
// server/routers/_app.ts
export const appRouter = router({
users: userRouter,
posts: postRouter,
comments: commentRouter,
});
export type AppRouter = typeof appRouter; // ← this is all the client needs
Client Usage¶
// client/trpc.ts
import { createTRPCReact } from '@trpc/react-query';
import type { AppRouter } from '../server/routers/_app';
export const trpc = createTRPCReact<AppRouter>();
// In a React component:
function UserProfile({ userId }: { userId: string }) {
// Fully typed: input, output, error — all inferred from server code
const { data, isLoading } = trpc.users.getById.useQuery({ id: userId });
// data is typed as: User | null | undefined
// Change server return type → TypeScript error here immediately
}
Context and Middleware¶
// Context: per-request shared state (auth user, DB, etc.)
export const createContext = async ({ req, res }: CreateNextContextOptions) => ({
db: prisma,
session: await getSession({ req }),
});
// Middleware: wraps procedures with reusable logic
const isAuthenticated = middleware(({ ctx, next }) => {
if (!ctx.session?.user) throw new TRPCError({ code: 'UNAUTHORIZED' });
return next({ ctx: { ...ctx, user: ctx.session.user } });
});
// Protected procedure: any procedure using this is automatically auth-gated
const protectedProcedure = publicProcedure.use(isAuthenticated);
tRPC vs Alternatives¶
| Dimension | tRPC | REST + OpenAPI | GraphQL |
|---|---|---|---|
| Type safety | ✅ Automatic, zero-gen | ⚠️ Code generation required | ⚠️ Code generation required |
| Language support | TypeScript/JS only | Universal | Universal |
| Schema file | ❌ None (types are the schema) | OpenAPI YAML/JSON | .graphql SDL |
| Learning curve | Low (just TypeScript) | Low | High |
| Client flexibility | ❌ Must use tRPC client | ✅ Any HTTP client | ✅ Any GraphQL client |
| Over/under-fetching | Field selection not built-in | Full response always | ✅ Client specifies fields |
| Best for | TypeScript monorepos (T3 stack, Next.js) | Public APIs, polyglot | Complex multi-client frontends |
Choosing the Right API Paradigm¶
Is this a public API consumed by external developers or third parties?
→ REST (universal, familiar, broad tooling)
Is the frontend complex with multiple clients fetching different data shapes?
→ GraphQL (eliminates over/under-fetching, empowers frontend teams)
Is this internal service-to-service communication with high throughput?
→ gRPC (fastest, binary, streaming support, code-gen clients)
Does the data need to flow in real time in both directions?
→ WebSocket (full-duplex, persistent)
Does the server push updates to passive clients (feeds, notifications)?
→ SSE (simpler than WebSocket, HTTP-native, auto-reconnect)
Is the entire stack TypeScript and owned by one team?
→ tRPC (zero boilerplate, type-safe end-to-end)
Does an external system need to notify you when events occur?
→ Webhooks (event-driven push, polling eliminated)
Is this a legacy enterprise or regulated domain (banking, healthcare)?
→ SOAP (accept the complexity; interoperability with existing systems)
It Is Not Either-Or
Real systems commonly use multiple paradigms together: a public REST API for external consumers, gRPC internally between microservices, GraphQL for the customer-facing frontend, WebSocket for real-time features, and webhooks for third-party integrations.