
OpenClaw Architecture

OpenClaw follows a hub-and-spoke architecture centered on a Gateway control plane that mediates between user-facing messaging channels and an AI agent runtime.

High-Level Architecture

```mermaid
graph TB
    subgraph Channels["Messaging Channels"]
        WA[WhatsApp]
        TG[Telegram]
        SL[Slack]
        DC[Discord]
        IM[iMessage]
        WEB[WebChat]
        MORE[24+ more...]
    end

    subgraph Core["OpenClaw Core"]
        GW["Gateway<br>(WebSocket Control Plane)<br>ws://127.0.0.1:18789"]
        PI["Pi Runtime<br>(Minimal Agent Core)"]
        LB["Lobster<br>(Workflow Engine)"]
        MEM["Memory System<br>(MEMORY.md + Daily Notes)"]
        SK["Skills Engine<br>(53 bundled + ClawHub)"]
    end

    subgraph Tools["Agent Capabilities"]
        BASH[Bash/Shell]
        FS[File System]
        BROWSER[Browser Automation]
        CRON[Cron/Scheduling]
        MCP_T[MCP Servers]
        CANVAS[Canvas Rendering]
    end

    subgraph LLMs["LLM Providers"]
        CLAUDE[Claude]
        GPT[GPT]
        DEEP[DeepSeek]
        GEM[Gemini]
        NEM[Nemotron/Local]
    end

    Channels --> GW
    GW <-->|RPC| PI
    PI --> Tools
    PI <--> LLMs
    GW --> LB
    GW --> MEM
    GW --> SK
```

Gateway — The Control Plane

The Gateway is a WebSocket server that handles:

  • Session management — conversation state, branching, and history
  • Presence — online/offline status across channels
  • Configuration — bot settings, channel bindings, model selection
  • Cron jobs — scheduled task execution
  • Webhooks — external trigger integration
  • Channel routing — dispatching messages to/from 24+ platforms

The Gateway runs locally, bound to ws://127.0.0.1:18789 by default. All messaging platform connectors route through this single control plane.
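The routing role of the control plane can be sketched as a single dispatch point that every connector delivers into. This is a toy model in Python: the `Envelope` fields and handler API are illustrative assumptions, not OpenClaw's actual wire format.

```python
from dataclasses import dataclass

# Hypothetical message envelope; field names are illustrative,
# not OpenClaw's real protocol.
@dataclass
class Envelope:
    channel: str   # e.g. "whatsapp", "telegram"
    sender: str
    text: str

class Gateway:
    """Toy control plane: all connectors route through one dispatch point."""
    def __init__(self):
        self.handlers = {}

    def bind(self, channel, handler):
        # A real connector would hold a live platform connection;
        # here a handler is just a callable.
        self.handlers[channel] = handler

    def dispatch(self, env: Envelope) -> str:
        handler = self.handlers.get(env.channel)
        if handler is None:
            raise KeyError(f"no connector bound for {env.channel!r}")
        return handler(env)

gw = Gateway()
gw.bind("telegram", lambda env: f"[agent reply to {env.sender}]")
print(gw.dispatch(Envelope("telegram", "alice", "hi")))  # [agent reply to alice]
```

However many channels are bound, there is exactly one dispatch path, which is what lets a single agent serve every platform.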

One Agent, Many Channels

The key architectural insight is that OpenClaw separates the interface layer (where messages arrive) from the assistant runtime (where intelligence lives). One persistent agent is accessible through every messaging app, with state managed centrally on your hardware.

Pi Runtime — The Agent Core

Underneath the Gateway, the actual agent work is done by Pi, a minimal coding agent runtime.

Design Philosophy

Pi is deliberately minimal:

  • Shortest system prompt of any major coding agent
  • Ships with only four core tools: Read, Write, Edit, Bash
  • Communicates with the Gateway over RPC with tool streaming and block streaming
  • The philosophy: if you want the agent to do something new, you don't download a plugin — you ask the agent to write code that extends itself

Execution Model

```mermaid
sequenceDiagram
    participant U as User (Channel)
    participant GW as Gateway
    participant Pi as Pi Runtime
    participant LLM as LLM Provider
    participant T as Tools

    U->>GW: Message via WhatsApp/Telegram/etc
    GW->>Pi: RPC dispatch (session context)
    Pi->>Pi: Assemble context (history + memory)
    Pi->>LLM: Inference request
    LLM-->>Pi: Response + tool calls
    Pi->>T: Execute tool calls
    T-->>Pi: Tool results
    Pi->>LLM: Continue with results
    LLM-->>Pi: Final response
    Pi-->>GW: Response + state update
    GW-->>U: Reply via channel
```
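The inner tool loop of this sequence can be sketched in a few lines of Python. The LLM is stubbed and the names (`run_turn`, `TOOLS`) are illustrative, not Pi's real API; the point is the shape of the loop: call the model, execute any requested tools, feed results back, repeat until the model returns plain content.

```python
# Stubbed LLM: first call requests a tool, second call gives the final answer.
def stub_llm(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"name": "bash", "args": "echo hi"}]}
    return {"content": "done: hi"}

TOOLS = {"bash": lambda args: f"ran `{args}`"}

def run_turn(user_text, llm=stub_llm):
    messages = [{"role": "user", "content": user_text}]
    while True:
        reply = llm(messages)
        calls = reply.get("tool_calls")
        if not calls:                    # no tools requested -> final response
            return reply["content"]
        for call in calls:               # execute tools, feed results back
            result = TOOLS[call["name"]](call["args"])
            messages.append({"role": "tool", "content": result})

print(run_turn("run echo"))  # done: hi
```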

Session Architecture — Trees, Not Logs

One of OpenClaw's most distinctive architectural features: sessions are trees, not flat logs.

You can branch and navigate within a session, which enables workflows impossible in most agent frameworks:

  • Branch for debugging — instead of polluting the main context, the agent branches off to diagnose a broken tool
  • Rewind after fix — once the issue is resolved, the agent rewinds to the main branch and summarizes what happened
  • Clean main context — the main session stays focused while side-quests happen in branches

This is critical for failure recovery — the agent recovers in-band, inside the same session, without operator intervention.
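The branch-and-rewind workflow above can be modeled as a parent-pointer tree. This is a toy structure for illustration; OpenClaw's real session model differs in detail.

```python
# Toy session tree: branch for a side-quest, then rewind to the main branch.
class Session:
    def __init__(self):
        self.nodes = {0: {"parent": None, "text": "root"}}
        self.head = 0          # current position in the tree
        self._next = 1

    def append(self, text):
        self.nodes[self._next] = {"parent": self.head, "text": text}
        self.head = self._next
        self._next += 1
        return self.head

    def branch_from(self, node_id):
        self.head = node_id    # new messages now fork off this node

    def path(self):
        """The linear context visible from the current head."""
        out, n = [], self.head
        while n is not None:
            out.append(self.nodes[n]["text"])
            n = self.nodes[n]["parent"]
        return list(reversed(out))

s = Session()
main = s.append("fix the build")
s.append("debug: tool X is broken")     # side-quest lives on its own branch...
s.branch_from(main)                     # ...rewind to main once resolved
s.append("summary: tool X needed a config fix")
print(s.path())  # ['root', 'fix the build', 'summary: tool X needed a config fix']
```

The debugging detour still exists in the tree, but the main branch's context only ever sees the summary.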

Lobster — The Workflow Engine

Lobster is a typed, local-first "macro engine" that turns skills and tools into composable pipelines.

Why Lobster Exists

Complex workflows require many back-and-forth tool calls, each costing tokens. Lobster moves orchestration into a typed runtime: one call instead of many.

LLMs do what LLMs are good at: writing code, analyzing code, running tests. Lobster does what code is good at: sequencing, counting, routing, retrying.

Key Properties

  • Determinism — pipelines are data: easy to log, diff, replay, and review
  • Resumable state — halted workflows return a resumeToken; approve and continue without re-running completed steps
  • Safety — timeouts, output caps, sandbox checks, and allowlists enforced by the runtime
  • Execution — runs inside the Gateway process; no external subprocess is spawned
  • Grammar — intentionally tiny: a predictable, AI-friendly pipeline spec
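The resumable-state property can be sketched as follows: a step gated on approval halts the pipeline and hands back a token, and approving the token continues from that exact step. The token handling and pipeline shape here are illustrative assumptions, not Lobster's actual API.

```python
import uuid

PENDING = {}  # resumeToken -> (pipeline, step_index, value)

def run(pipeline, start=0, value=None):
    for i in range(start, len(pipeline)):
        step = pipeline[i]
        if step.get("needs_approval"):
            token = str(uuid.uuid4())
            PENDING[token] = (pipeline, i, value)
            return {"halted": True, "resumeToken": token}
        value = step["fn"](value)
    return {"halted": False, "result": value}

def approve(token):
    pipeline, i, value = PENDING.pop(token)
    value = pipeline[i]["fn"](value)          # run the gated step
    return run(pipeline, start=i + 1, value=value)  # earlier steps NOT re-run

pipe = [
    {"fn": lambda _: "draft email"},
    {"fn": lambda v: f"send({v})", "needs_approval": True},
]
halted = run(pipe)
print(approve(halted["resumeToken"]))
# {'halted': False, 'result': 'send(draft email)'}
```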

Multi-Agent Pipeline Example

```mermaid
graph LR
    subgraph Lobster["Lobster Pipeline"]
        CODE[Programmer Agent] --> REVIEW[Reviewer Agent]
        REVIEW -->|Pass| TEST[Tester Agent]
        REVIEW -->|Fail<br>max 3 iterations| CODE
        TEST -->|Pass| DONE[Done]
        TEST -->|Fail| CODE
    end
```

A real-world dev pipeline: code → review (max 3 iterations) → test → done, with no human in the loop unless something breaks.

Resilience & State

OpenClaw uses a write-ahead queue to handle task interruption:

  • If an agent fails mid-execution, it resumes from the last saved checkpoint
  • State is preserved for long-running autonomous tasks
  • An email triage workflow that crashes halfway doesn't restart from scratch
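The checkpoint-and-resume behavior can be sketched with a journal file: each item is checkpointed after it is handled, so a rerun after a crash skips completed work. The on-disk format here is an illustrative assumption, not OpenClaw's actual persistence layer.

```python
import json, os, tempfile

def process_batch(items, log_path, handle, crash_after=None):
    done = set()
    if os.path.exists(log_path):                 # recover prior checkpoints
        with open(log_path) as f:
            done = {json.loads(line)["id"] for line in f}
    with open(log_path, "a") as f:
        for n, item in enumerate(items):
            if item["id"] in done:
                continue                         # already handled pre-crash
            handle(item)
            f.write(json.dumps({"id": item["id"]}) + "\n")
            f.flush()                            # checkpoint before moving on
            if crash_after is not None and n + 1 >= crash_after:
                raise RuntimeError("simulated crash")

emails = [{"id": i} for i in range(5)]
handled = []
path = os.path.join(tempfile.mkdtemp(), "wal.jsonl")
try:
    process_batch(emails, path, handled.append, crash_after=2)
except RuntimeError:
    pass
process_batch(emails, path, handled.append)      # resume: ids 0-1 are skipped
print([e["id"] for e in handled])  # [0, 1, 2, 3, 4]
```

The "crashed" triage run picks up at item 2 instead of restarting from scratch; no item is processed twice.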

Multi-Agent Patterns

OpenClaw supports three distinct multi-agent patterns:

1. Sub-Agents

Background workers spawned from the main agent. They run in parallel, finish their task, and report back.

Best for: long-running research, batch processing, code review.

2. Multi-Agent Routing

Separate, independent agents on one Gateway, each bound to different channels or users.

Best for: home/work separation, security isolation, different personas.

3. Agent Teams

Community-built orchestration systems (SWAT, OpenMOSS, Mission Control, ClawTeam) that add structured coordination on top of OpenClaw.

Best for: complex pipelines with review loops, 24/7 operations.

NemoClaw Integration (NVIDIA)

```mermaid
graph TB
    subgraph NemoClaw["NemoClaw Stack"]
        OC[OpenClaw Agent]
        OS["OpenShell Runtime<br>(K3s in Docker)"]
        PR["Privacy Router"]
        PE["Policy Engine<br>(YAML declarative)"]
    end

    subgraph Inference["Inference Routing"]
        LOCAL["Local Nemotron<br>(120B on DGX Spark)"]
        CLOUD["Cloud Provider<br>(Claude/GPT)"]
    end

    OC -->|Sandboxed| OS
    OS --> PR
    PR -->|Sensitive| LOCAL
    PR -->|Non-sensitive| CLOUD
    PE -->|Enforces| OS
```

NemoClaw adds:

  • Out-of-process policy enforcement — policies enforced outside the agent, so even a compromised agent cannot bypass them (similar to browser tab isolation)
  • Privacy Router — routes inference to local or cloud based on configured policy
  • Credential injection — credentials never touch the sandbox filesystem
  • Isolated sandboxes — each agent runs in its own sandbox with declarative YAML policy controlling filesystem, network, process, and inference routing
  • K3s under the hood — all components run as a K3s Kubernetes cluster inside a single Docker container
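The Privacy Router's decision can be sketched as a policy lookup: classify the request, then pick a local or cloud inference target. The policy schema, pattern matching, and target names below are illustrative assumptions, not NemoClaw's actual configuration format.

```python
# Hypothetical declarative policy; a real deployment would load this
# from YAML and use a stronger classifier than substring matching.
POLICY = {
    "sensitive_patterns": ["password", "ssn", "medical", "salary"],
    "sensitive_target": "local-nemotron",
    "default_target": "cloud-claude",
}

def route(prompt: str, policy=POLICY) -> str:
    text = prompt.lower()
    if any(p in text for p in policy["sensitive_patterns"]):
        return policy["sensitive_target"]   # never leaves the box
    return policy["default_target"]         # cheap/fast cloud inference

print(route("summarize my medical records"))  # local-nemotron
print(route("write a haiku about spring"))    # cloud-claude
```

Because the router sits outside the agent process, a compromised agent cannot rewrite the policy to exfiltrate sensitive prompts to the cloud.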

When to Use NemoClaw

  • Personal, non-confidential → plain OpenClaw is sufficient
  • Confidential data or business use → NemoClaw recommended
  • Corporate adoption → NemoClaw virtually essential
  • Best practice → two instances: one everyday, one confidential via NemoClaw

Sources