How IronCurtain Works

Architecture

Agent: any LLM · Claude Code in Docker · your own agent
V8 Sandbox: TypeScript in isolated runtime
Trusted Process: policy engine · allow / deny / escalate
MCP Servers: sandboxed via Anthropic SRT

Every agent, whether a direct LLM session or Claude Code running in a Docker container, goes through the same pipeline.

The V8 Sandbox

The agent never interacts with your system directly. Instead, it writes TypeScript that runs inside a V8 isolated VM. That code can only do one thing: issue typed function calls that map to MCP tool calls. It has no access to the filesystem, network, or environment variables.¹

Because the agent expresses intent through typed function calls like gmail.sendEmail({to: "bob@example.com"}) rather than raw HTTP requests, we can write meaningful policy against it. We can ask “is this recipient in the user’s contacts?” in a way that would be impossible if the agent were making opaque web_fetch calls.
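
To make the contrast with opaque requests concrete, here is a minimal sketch of a recipient check over a typed call. The SendEmailArgs shape, the checkRecipient function, and the contact list are hypothetical names chosen for illustration, not IronCurtain's actual API.

```typescript
// Illustrative sketch only: these types and names are assumptions.
type RecipientDecision = "allow" | "escalate";

interface SendEmailArgs {
  to: string;
  subject?: string;
  body?: string;
}

// The recipient is a named, typed field, so a rule can inspect it directly
// instead of parsing an opaque HTTP request body.
function checkRecipient(
  args: SendEmailArgs,
  contacts: ReadonlySet<string>,
): RecipientDecision {
  const recipient = args.to.trim().toLowerCase();
  return contacts.has(recipient) ? "allow" : "escalate";
}

const userContacts = new Set(["bob@example.com"]);
const knownRecipient = checkRecipient({ to: "bob@example.com" }, userContacts);
const unknownRecipient = checkRecipient({ to: "mallory@example.net" }, userContacts);
console.log(knownRecipient, unknownRecipient);
```

In this sketch an unknown recipient is escalated rather than denied; which outcome a real policy assigns depends on the constitution.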

Docker Mode

Docker Mode wraps the pipeline with additional containment. A full autonomous agent like Claude Code runs inside a container with --network=none and all capabilities dropped. The only way out is through host-side proxies: one for the policy engine, one for LLM API requests via a TLS-terminating MITM proxy, and one for package installations (npm/PyPI) via a validating registry proxy.
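
As a rough sketch of the container flags described above, the following builds a docker run argument list. The --network=none and --cap-drop=ALL flags are standard Docker options; the function and the image name are hypothetical, and the sketch does not show how IronCurtain wires its host-side proxies into the container.

```typescript
// Illustrative sketch only: the function and image name are assumptions.
function buildDockerArgs(image: string): string[] {
  return [
    "run",
    "--rm",
    "--network=none", // container gets no network interface besides loopback
    "--cap-drop=ALL", // drop every Linux capability
    image,
  ];
}

const dockerArgs = buildDockerArgs("agent-image:latest");
console.log(["docker", ...dockerArgs].join(" "));
```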

The container receives a fake API key that passes format validation but is useless on its own. The MITM proxy intercepts each request, swaps in the real credential, and forwards to the upstream API. For OAuth-based authentication, the proxy auto-detects tokens and handles refresh before expiry. The real credentials never enter the container.
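
The credential swap can be pictured as a small header rewrite on the host side. This is a sketch under stated assumptions: the x-api-key header name, the key values, and the swapCredential function are illustrative, header names are assumed to be lower-cased already, and OAuth refresh is not modeled.

```typescript
// Illustrative sketch only: header name and key values are assumptions.
type RequestHeaders = Record<string, string>;

function swapCredential(
  headers: RequestHeaders,
  placeholderKey: string,
  realKey: string,
  headerName = "x-api-key",
): RequestHeaders {
  // Copy first so the request object that came from the container is untouched.
  const outgoing: RequestHeaders = { ...headers };
  if (outgoing[headerName] === placeholderKey) {
    outgoing[headerName] = realKey; // the real key exists only on the host
  }
  return outgoing;
}

const fromContainer: RequestHeaders = {
  "x-api-key": "sk-placeholder",
  "content-type": "application/json",
};
const toUpstream = swapCredential(fromContainer, "sk-placeholder", "sk-real-host-only");
console.log(toUpstream["x-api-key"]);
```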

The Terminal Multiplexer

The terminal multiplexer (ironcurtain mux) is the recommended way to run Docker Agent Mode. It gives you the full Claude Code TUI with spinners, diffs, file previews, and slash commands, all in a single terminal, while IronCurtain mediates every tool call through its policy engine.

When a tool call needs approval, an escalation panel overlays the viewport. Press Ctrl-A to enter command mode, type /approve or /deny, and return. Multiple concurrent sessions run in tabs.

Text typed in command mode is captured on the host side before entering the container. This creates a trusted input signal that the agent cannot forge or influence. The auto-approver uses this signal to make higher-confidence approval decisions.

The Trusted Process

The trusted process is an MCP proxy that sits between the sandbox and the real MCP servers. Every tool call passes through it. Nothing reaches the servers without a policy decision.

Evaluation happens in two phases. First, structural invariants: protected paths (like the constitution itself and audit logs) are always denied, sandbox-contained paths are auto-allowed, and unknown tools are rejected. Second, compiled policy rules: each argument is checked against the rules generated from your constitution. Each argument role is evaluated independently and the most restrictive result wins.

The outcome is one of three decisions:

allow: The call matches a policy rule. Forward it to the MCP server.

deny: The call violates a rule or structural invariant. Block it and return an error.

escalate: The call is outside known policy. Pause and ask the human to approve or deny.

The default is deny. The policy compiler produces only allow and escalate rules, so anything not explicitly permitted is blocked. This makes it straightforward to sub-scope an agent to a reduced policy by removing allow rules. Every decision is logged with the rule that triggered it, making the system auditable.
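
The two-phase evaluation and the default-deny behavior can be sketched as follows. Everything here is an assumption made for illustration: the type names, the tool names, the paths, and the simplification that every argument value is a normalized absolute path. It is not the actual policy engine.

```typescript
// Illustrative sketch only: every name and path below is hypothetical.
type PolicyDecision = "allow" | "escalate" | "deny";

// Higher number means more restrictive.
const severity: Record<PolicyDecision, number> = { allow: 0, escalate: 1, deny: 2 };

interface ToolCall {
  tool: string;
  args: Record<string, string>; // argument role -> value
}

interface CompiledRule {
  tool: string;
  role: string;
  matches: (value: string) => boolean;
  then: "allow" | "escalate"; // the compiler never emits "deny"
}

interface PolicyConfig {
  knownTools: ReadonlySet<string>;
  protectedPaths: readonly string[];
  sandboxDir: string;
  rules: readonly CompiledRule[];
}

// Assumes normalized absolute paths (no ".." segments).
const isUnder = (path: string, root: string): boolean =>
  path === root || path.startsWith(root + "/");

function evaluate(call: ToolCall, cfg: PolicyConfig): PolicyDecision {
  const values = Object.values(call.args);

  // Phase 1: structural invariants.
  if (!cfg.knownTools.has(call.tool)) return "deny";
  if (values.some((v) => cfg.protectedPaths.some((p) => isUnder(v, p)))) return "deny";
  if (values.length > 0 && values.every((v) => isUnder(v, cfg.sandboxDir))) return "allow";

  // Phase 2: judge each argument role independently; no matching rule means deny.
  const decisions = Object.entries(call.args).map(([role, value]): PolicyDecision => {
    const rule = cfg.rules.find(
      (r) => r.tool === call.tool && r.role === role && r.matches(value),
    );
    return rule ? rule.then : "deny";
  });
  if (decisions.length === 0) return "deny";

  // The most restrictive per-role result wins.
  return decisions.reduce((worst, d) => (severity[d] > severity[worst] ? d : worst));
}

const examplePolicy: PolicyConfig = {
  knownTools: new Set(["read_file", "move_file"]),
  protectedPaths: ["/home/user/.ironcurtain"],
  sandboxDir: "/sandbox",
  rules: [
    { tool: "read_file", role: "path", matches: (v) => isUnder(v, "/home/user/docs"), then: "allow" },
    { tool: "move_file", role: "source", matches: (v) => isUnder(v, "/home/user/docs"), then: "allow" },
    { tool: "move_file", role: "destination", matches: (v) => isUnder(v, "/home/user"), then: "escalate" },
  ],
};

console.log(evaluate({ tool: "read_file", args: { path: "/sandbox/notes.txt" } }, examplePolicy));
console.log(evaluate({ tool: "read_file", args: { path: "/etc/passwd" } }, examplePolicy));
console.log(
  evaluate(
    { tool: "move_file", args: { source: "/home/user/docs/a.md", destination: "/home/user/b.md" } },
    examplePolicy,
  ),
);
```

The three calls print allow, deny, and escalate. In the last one the source role alone would be allowed, but the destination role escalates, so the call as a whole escalates. Removing an allow rule from the list narrows what the agent can do without adding any deny rule.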

The Auto-Approver

Escalation keeps the human in the loop, but constant approval prompts slow things down. The optional auto-approver sits between the policy engine’s escalation decision and the human. It uses a small, cheap LLM to check whether the user’s most recent message explicitly authorized the escalated action. In the terminal multiplexer, the auto-approver also uses trusted input captured on the host side, because the agent controls the PTY inside the container and its conversation history cannot be trusted.

The rules are conservative. The user must have requested the specific operation. “Push my changes to origin” approves a git_push to GitHub. “Go ahead” or “continue” always escalates to the human. If the tool arguments don’t match what the user asked for (e.g., the user said “push to origin” but the remote is an unknown server), it escalates.

The auto-approver can only approve or escalate. It can never deny. Any error, timeout, or uncertain match falls through to human approval. Every auto-approved action is recorded in the audit log as such.
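
The approve-or-escalate contract can be sketched with a result type that has no deny member. The outcome shape, field names, and audit entry below are hypothetical; the small LLM's answer is modeled as an input value rather than an actual model call.

```typescript
// Illustrative sketch only: the outcome and audit shapes are assumptions.
type ClassifierOutcome =
  | { kind: "verdict"; explicitlyAuthorized: boolean; argumentsMatch: boolean }
  | { kind: "error"; message: string }
  | { kind: "timeout" };

// The result type has no "deny" member, so this function cannot block a call.
type AutoApproverResult = "approve" | "escalate";

interface AuditEntry {
  tool: string;
  decision: AutoApproverResult;
  autoApproved: boolean;
}

function autoApprove(
  tool: string,
  outcome: ClassifierOutcome,
  audit: AuditEntry[],
): AutoApproverResult {
  // Approve only on a clear verdict; errors, timeouts, and mismatches escalate.
  const approved =
    outcome.kind === "verdict" && outcome.explicitlyAuthorized && outcome.argumentsMatch;
  const decision: AutoApproverResult = approved ? "approve" : "escalate";
  audit.push({ tool, decision, autoApproved: approved });
  return decision;
}

const auditLog: AuditEntry[] = [];
// "Push my changes to origin" and the remote is origin:
console.log(autoApprove("git_push", { kind: "verdict", explicitlyAuthorized: true, argumentsMatch: true }, auditLog));
// "Go ahead" authorizes nothing specific:
console.log(autoApprove("git_push", { kind: "verdict", explicitlyAuthorized: false, argumentsMatch: true }, auditLog));
// The model call timed out:
console.log(autoApprove("git_push", { kind: "timeout" }, auditLog));
```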

MCP Servers and Sandboxing

The MCP servers are standard, unmodified servers that handle filesystem access, git operations, and web fetching. IronCurtain does not require custom server implementations.

Each server runs in its own OS-level sandbox via Anthropic’s sandbox runtime (SRT), with permissions tailored to its purpose. The git server can reach GitHub and GitLab, can read most of the filesystem, but can only write inside its sandbox directory. A different server gets different constraints. Credentials (OAuth tokens, API keys) live exclusively in the servers and trusted process. The agent never sees them.
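
The per-server constraints can be pictured as a small profile per server. The shape below is a hypothetical illustration, not the sandbox runtime's real configuration schema, and the paths are assumed to be normalized and absolute.

```typescript
// Illustrative sketch only: this is not the sandbox runtime's configuration schema.
interface ServerSandboxProfile {
  allowedHosts: readonly string[];
  writeRoot: string; // the only directory the server may write to
}

const gitServerProfile: ServerSandboxProfile = {
  allowedHosts: ["github.com", "gitlab.com"],
  writeRoot: "/sandbox/git",
};

function canConnect(profile: ServerSandboxProfile, host: string): boolean {
  return profile.allowedHosts.includes(host.toLowerCase());
}

function canWrite(profile: ServerSandboxProfile, path: string): boolean {
  // Assumes a normalized absolute path (no ".." segments).
  return path === profile.writeRoot || path.startsWith(profile.writeRoot + "/");
}

console.log(canConnect(gitServerProfile, "github.com"));
console.log(canConnect(gitServerProfile, "example.org"));
console.log(canWrite(gitServerProfile, "/sandbox/git/repo/.git/index"));
console.log(canWrite(gitServerProfile, "/home/user/.bashrc"));
```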

The Flow

Example flow:

1. Agent writes code. The LLM generates TypeScript that calls typed functions: read a file, commit code, fetch a URL.

2. Sandbox intercepts. Function calls are captured in the V8 isolate and forwarded to the trusted process as MCP tool-call requests.

3. Policy evaluates. Structural invariants and compiled rules decide: allow, deny, or escalate to the human.

4. Server executes. Approved calls run on the sandboxed MCP server. Results flow back through the trusted process to the agent.
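
The interception step can be sketched as a stub that turns a typed function call into an MCP tool-call request. The tools/call method with name and arguments parameters follows the MCP JSON-RPC format; the makeToolStub factory, the read_file tool name, and the forward callback standing in for the trusted process are hypothetical.

```typescript
// Illustrative sketch only: the stub factory and tool name are assumptions.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Stands in for the channel from the sandbox to the trusted process.
type Forward = (request: ToolCallRequest) => unknown;

let nextRequestId = 1;

// Returns a function the agent's code can call; it does nothing except
// package the arguments and hand them to `forward`.
function makeToolStub(name: string, forward: Forward) {
  return (args: Record<string, unknown>): unknown =>
    forward({
      jsonrpc: "2.0",
      id: nextRequestId++,
      method: "tools/call",
      params: { name, arguments: args },
    });
}

const captured: ToolCallRequest[] = [];
const readFile = makeToolStub("read_file", (request) => {
  captured.push(request);
  return { content: "(result returned by the trusted process)" };
});

readFile({ path: "/sandbox/notes.txt" });
console.log(JSON.stringify(captured[0]));
```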

Features

Write security policy in plain English: a constitution, not a DSL
Every MCP tool call evaluated against compiled policy rules
V8 sandbox isolation: agents never touch your filesystem directly
Human-in-the-loop escalation for sensitive operations
Agent-framework agnostic: works with any agent that speaks MCP
Full audit log of every tool call and policy decision


¹ This architecture, in which the agent writes sandboxed code to orchestrate tools instead of emitting raw JSON tool calls, aligns with the “Code Mode” pattern explored independently by Cloudflare and Anthropic. It is not only more secure but also highly token-efficient: Anthropic reports a 98.7% reduction in token usage.
