
OpenClaw Architecture

OpenClaw follows a hub-and-spoke architecture centered on a Gateway control plane that mediates between user-facing messaging channels and an AI agent runtime.

High-Level Architecture

```mermaid
graph TB
    subgraph Channels["Messaging Channels"]
        WA[WhatsApp]
        TG[Telegram]
        SL[Slack]
        DC[Discord]
        IM[iMessage]
        WEB[WebChat]
        MORE[24+ more...]
    end

    subgraph Core["OpenClaw Core"]
        GW["Gateway<br>(WebSocket Control Plane)<br>ws://127.0.0.1:18789"]
        PI["Pi Runtime<br>(Minimal Agent Core)"]
        LB["Lobster<br>(Workflow Engine)"]
        MEM["Memory System<br>(MEMORY.md + Daily Notes)"]
        SK["Skills Engine<br>(53 bundled + ClawHub)"]
    end

    subgraph Tools["Agent Capabilities"]
        BASH[Bash/Shell]
        FS[File System]
        BROWSER[Browser Automation]
        CRON[Cron/Scheduling]
        MCP_T[MCP Servers]
        CANVAS[Canvas Rendering]
    end

    subgraph LLMs["LLM Providers"]
        CLAUDE[Claude]
        GPT[GPT]
        DEEP[DeepSeek]
        GEM[Gemini]
        NEM[Nemotron/Local]
    end

    Channels --> GW
    GW <-->|RPC| PI
    PI --> Tools
    PI <--> LLMs
    GW --> LB
    GW --> MEM
    GW --> SK
```

Gateway — The Control Plane

The Gateway is a WebSocket server that handles:

  • Session management — conversation state, branching, and history
  • Presence — online/offline status across channels
  • Configuration — bot settings, channel bindings, model selection
  • Cron jobs — scheduled task execution
  • Webhooks — external trigger integration
  • Channel routing — dispatching messages to/from 24+ platforms

The Gateway runs locally, bound to ws://127.0.0.1:18789 by default. All messaging platform connectors route through this single control plane.
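The routing role of the control plane can be sketched as a single dispatch point that every connector delivers into. This is a toy model in Python: the `Envelope` fields and handler API are illustrative assumptions, not OpenClaw's actual wire format.

```python
from dataclasses import dataclass

# Hypothetical message envelope; field names are illustrative,
# not OpenClaw's real protocol.
@dataclass
class Envelope:
    channel: str   # e.g. "whatsapp", "telegram"
    sender: str
    text: str

class Gateway:
    """Toy control plane: all connectors route through one dispatch point."""
    def __init__(self):
        self.handlers = {}

    def bind(self, channel, handler):
        # A real connector would hold a live platform connection;
        # here a handler is just a callable.
        self.handlers[channel] = handler

    def dispatch(self, env: Envelope) -> str:
        handler = self.handlers.get(env.channel)
        if handler is None:
            raise KeyError(f"no connector bound for {env.channel!r}")
        return handler(env)

gw = Gateway()
gw.bind("telegram", lambda env: f"[agent reply to {env.sender}]")
print(gw.dispatch(Envelope("telegram", "alice", "hi")))  # [agent reply to alice]
```

However many channels are bound, there is exactly one dispatch path, which is what lets a single agent serve every platform.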

One Agent, Many Channels

The key architectural insight is that OpenClaw separates the interface layer (where messages arrive) from the assistant runtime (where intelligence lives). One persistent agent is accessible through every messaging app, with state managed centrally on your hardware.

Pi Runtime — The Agent Core

Underneath the Gateway, the actual agent work is done by Pi, a minimal coding agent runtime.

Design Philosophy

Pi is deliberately minimal:

  • Shortest system prompt of any major coding agent
  • Ships with only four core tools: Read, Write, Edit, Bash
  • Communicates with the Gateway over RPC with tool streaming and block streaming
  • The philosophy: if you want the agent to do something new, you don't download a plugin — you ask the agent to write code that extends itself

Execution Model

```mermaid
sequenceDiagram
    participant U as User (Channel)
    participant GW as Gateway
    participant Pi as Pi Runtime
    participant LLM as LLM Provider
    participant T as Tools

    U->>GW: Message via WhatsApp/Telegram/etc
    GW->>Pi: RPC dispatch (session context)
    Pi->>Pi: Assemble context (history + memory)
    Pi->>LLM: Inference request
    LLM-->>Pi: Response + tool calls
    Pi->>T: Execute tool calls
    T-->>Pi: Tool results
    Pi->>LLM: Continue with results
    LLM-->>Pi: Final response
    Pi-->>GW: Response + state update
    GW-->>U: Reply via channel
```
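The inner tool loop of this sequence can be sketched in a few lines of Python. The LLM is stubbed and the names (`run_turn`, `TOOLS`) are illustrative, not Pi's real API; the point is the shape of the loop: call the model, execute any requested tools, feed results back, repeat until the model returns plain content.

```python
# Stubbed LLM: first call requests a tool, second call gives the final answer.
def stub_llm(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"name": "bash", "args": "echo hi"}]}
    return {"content": "done: hi"}

TOOLS = {"bash": lambda args: f"ran `{args}`"}

def run_turn(user_text, llm=stub_llm):
    messages = [{"role": "user", "content": user_text}]
    while True:
        reply = llm(messages)
        calls = reply.get("tool_calls")
        if not calls:                    # no tools requested -> final response
            return reply["content"]
        for call in calls:               # execute tools, feed results back
            result = TOOLS[call["name"]](call["args"])
            messages.append({"role": "tool", "content": result})

print(run_turn("run echo"))  # done: hi
```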

Session Architecture — Trees, Not Logs

One of OpenClaw's most distinctive architectural features: sessions are trees, not flat logs.

You can branch and navigate within a session, which enables workflows impossible in most agent frameworks:

  • Branch for debugging — instead of polluting the main context, the agent branches off to diagnose a broken tool
  • Rewind after fix — once the issue is resolved, the agent rewinds to the main branch and summarizes what happened
  • Clean main context — the main session stays focused while side-quests happen in branches

This is critical for failure recovery — the agent recovers in-band, inside the same session, without operator intervention.
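The branch-and-rewind workflow above can be modeled as a parent-pointer tree. This is a toy structure for illustration; OpenClaw's real session model differs in detail.

```python
# Toy session tree: branch for a side-quest, then rewind to the main branch.
class Session:
    def __init__(self):
        self.nodes = {0: {"parent": None, "text": "root"}}
        self.head = 0          # current position in the tree
        self._next = 1

    def append(self, text):
        self.nodes[self._next] = {"parent": self.head, "text": text}
        self.head = self._next
        self._next += 1
        return self.head

    def branch_from(self, node_id):
        self.head = node_id    # new messages now fork off this node

    def path(self):
        """The linear context visible from the current head."""
        out, n = [], self.head
        while n is not None:
            out.append(self.nodes[n]["text"])
            n = self.nodes[n]["parent"]
        return list(reversed(out))

s = Session()
main = s.append("fix the build")
s.append("debug: tool X is broken")     # side-quest lives on its own branch...
s.branch_from(main)                     # ...rewind to main once resolved
s.append("summary: tool X needed a config fix")
print(s.path())  # ['root', 'fix the build', 'summary: tool X needed a config fix']
```

The debugging detour still exists in the tree, but the main branch's context only ever sees the summary.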

Lobster — The Workflow Engine

Lobster is a typed, local-first "macro engine" that turns skills and tools into composable pipelines.

Why Lobster Exists

Complex workflows require many back-and-forth tool calls, each costing tokens. Lobster moves orchestration into a typed runtime: one call instead of many.

LLMs do what LLMs are good at: writing code, analyzing code, running tests. Lobster does what code is good at: sequencing, counting, routing, retrying.

Key Properties

  • Determinism — pipelines are data: easy to log, diff, replay, and review
  • Resumable state — halted workflows return a resumeToken; approve and continue without re-running completed steps
  • Safety — timeouts, output caps, sandbox checks, and allowlists enforced by the runtime
  • Execution — runs inside the Gateway process; no external subprocess is spawned
  • Grammar — intentionally tiny: a predictable, AI-friendly pipeline spec
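The resumable-state property can be sketched as follows: a step gated on approval halts the pipeline and hands back a token, and approving the token continues from that exact step. The token handling and pipeline shape here are illustrative assumptions, not Lobster's actual API.

```python
import uuid

PENDING = {}  # resumeToken -> (pipeline, step_index, value)

def run(pipeline, start=0, value=None):
    for i in range(start, len(pipeline)):
        step = pipeline[i]
        if step.get("needs_approval"):
            token = str(uuid.uuid4())
            PENDING[token] = (pipeline, i, value)
            return {"halted": True, "resumeToken": token}
        value = step["fn"](value)
    return {"halted": False, "result": value}

def approve(token):
    pipeline, i, value = PENDING.pop(token)
    value = pipeline[i]["fn"](value)          # run the gated step
    return run(pipeline, start=i + 1, value=value)  # earlier steps NOT re-run

pipe = [
    {"fn": lambda _: "draft email"},
    {"fn": lambda v: f"send({v})", "needs_approval": True},
]
halted = run(pipe)
print(approve(halted["resumeToken"]))
# {'halted': False, 'result': 'send(draft email)'}
```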

Multi-Agent Pipeline Example

```mermaid
graph LR
    subgraph Lobster["Lobster Pipeline"]
        CODE[Programmer Agent] --> REVIEW[Reviewer Agent]
        REVIEW -->|Pass| TEST[Tester Agent]
        REVIEW -->|Fail<br>max 3 iterations| CODE
        TEST -->|Pass| DONE[Done]
        TEST -->|Fail| CODE
    end
```

A real-world dev pipeline: code → review (max 3 iterations) → test → done, with no human in the loop unless something breaks.

Resilience & State

OpenClaw uses a write-ahead queue to handle task interruption:

  • If an agent fails mid-execution, it resumes from the last saved checkpoint
  • State is preserved for long-running autonomous tasks
  • An email triage workflow that crashes halfway doesn't restart from scratch
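The checkpoint-and-resume behavior can be sketched with a journal file: each item is checkpointed after it is handled, so a rerun after a crash skips completed work. The on-disk format here is an illustrative assumption, not OpenClaw's actual persistence layer.

```python
import json, os, tempfile

def process_batch(items, log_path, handle, crash_after=None):
    done = set()
    if os.path.exists(log_path):                 # recover prior checkpoints
        with open(log_path) as f:
            done = {json.loads(line)["id"] for line in f}
    with open(log_path, "a") as f:
        for n, item in enumerate(items):
            if item["id"] in done:
                continue                         # already handled pre-crash
            handle(item)
            f.write(json.dumps({"id": item["id"]}) + "\n")
            f.flush()                            # checkpoint before moving on
            if crash_after is not None and n + 1 >= crash_after:
                raise RuntimeError("simulated crash")

emails = [{"id": i} for i in range(5)]
handled = []
path = os.path.join(tempfile.mkdtemp(), "wal.jsonl")
try:
    process_batch(emails, path, handled.append, crash_after=2)
except RuntimeError:
    pass
process_batch(emails, path, handled.append)      # resume: ids 0-1 are skipped
print([e["id"] for e in handled])  # [0, 1, 2, 3, 4]
```

The "crashed" triage run picks up at item 2 instead of restarting from scratch; no item is processed twice.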

Multi-Agent Patterns

OpenClaw supports three distinct multi-agent patterns:

1. Sub-Agents

Background workers spawned from the main agent. They run in parallel, finish their task, and report back.

Best for: long-running research, batch processing, code review.

2. Multi-Agent Routing

Separate, independent agents on one Gateway, each bound to different channels or users.

Best for: home/work separation, security isolation, different personas.

3. Agent Teams

Community-built orchestration systems (SWAT, OpenMOSS, Mission Control, ClawTeam) that add structured coordination on top of OpenClaw.

Best for: complex pipelines with review loops, 24/7 operations.

NemoClaw Integration (NVIDIA)

```mermaid
graph TB
    subgraph NemoClaw["NemoClaw Stack"]
        OC[OpenClaw Agent]
        OS["OpenShell Runtime<br>(K3s in Docker)"]
        PR["Privacy Router"]
        PE["Policy Engine<br>(YAML declarative)"]
    end

    subgraph Inference["Inference Routing"]
        LOCAL["Local Nemotron<br>(120B on DGX Spark)"]
        CLOUD["Cloud Provider<br>(Claude/GPT)"]
    end

    OC -->|Sandboxed| OS
    OS --> PR
    PR -->|Sensitive| LOCAL
    PR -->|Non-sensitive| CLOUD
    PE -->|Enforces| OS
```

NemoClaw adds:

  • Out-of-process policy enforcement — policies enforced outside the agent, so even a compromised agent cannot bypass them (similar to browser tab isolation)
  • Privacy Router — routes inference to local or cloud based on configured policy
  • Credential injection — credentials never touch the sandbox filesystem
  • Isolated sandboxes — each agent runs in its own sandbox with declarative YAML policy controlling filesystem, network, process, and inference routing
  • K3s under the hood — all components run as a K3s Kubernetes cluster inside a single Docker container
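The Privacy Router's decision can be sketched as a policy lookup: classify the request, then pick a local or cloud inference target. The policy schema, pattern matching, and target names below are illustrative assumptions, not NemoClaw's actual configuration format.

```python
# Hypothetical declarative policy; a real deployment would load this
# from YAML and use a stronger classifier than substring matching.
POLICY = {
    "sensitive_patterns": ["password", "ssn", "medical", "salary"],
    "sensitive_target": "local-nemotron",
    "default_target": "cloud-claude",
}

def route(prompt: str, policy=POLICY) -> str:
    text = prompt.lower()
    if any(p in text for p in policy["sensitive_patterns"]):
        return policy["sensitive_target"]   # never leaves the box
    return policy["default_target"]         # cheap/fast cloud inference

print(route("summarize my medical records"))  # local-nemotron
print(route("write a haiku about spring"))    # cloud-claude
```

Because the router sits outside the agent process, a compromised agent cannot rewrite the policy to exfiltrate sensitive prompts to the cloud.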

When to Use NemoClaw

  • Personal, non-confidential → plain OpenClaw is sufficient
  • Confidential data or business use → NemoClaw recommended
  • Corporate adoption → NemoClaw virtually essential
  • Best practice → two instances: one everyday, one confidential via NemoClaw

Sources