Two-wall security architecture
Status: Accepted
Decision
All worker sandboxes are secured by two hard walls. Advisory controls (hooks, settings, CLAUDE.md) are defense-in-depth, not enforcement.
Wall 1: Network gateway (Bifrost). All MCP tool calls and LLM API calls route through Bifrost. The gateway holds the real credentials, enforces tool-level RBAC and rate limits, and provides audit logging. The agent receives only a scoped virtual key and never sees real API keys.
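The per-call checks described for the gateway can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Bifrost's actual interface: `VirtualKey`, `Gateway`, `authorize`, the decision strings, and the 60-second rate-limit window are all hypothetical choices made for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class VirtualKey:
    """Hypothetical scoped key given to the agent; it carries no real credential."""
    key_id: str
    uid: str
    euid: str
    allowed_tools: frozenset
    max_calls_per_minute: int


@dataclass
class Gateway:
    """Toy stand-in for the gateway (assumed shape, not Bifrost's API)."""
    real_credentials: dict  # tool name -> upstream secret; never returned to the agent
    audit_log: list = field(default_factory=list)
    recent_calls: dict = field(default_factory=dict)  # key_id -> timestamps of allowed calls

    def authorize(self, key: VirtualKey, tool: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Keep only the allowed calls made within the last 60 seconds.
        recent = [t for t in self.recent_calls.get(key.key_id, []) if now - t < 60.0]
        if tool not in key.allowed_tools:
            decision = "denied:rbac"
        elif len(recent) >= key.max_calls_per_minute:
            decision = "denied:rate_limit"
        else:
            decision = "allowed"
            recent.append(now)
        self.recent_calls[key.key_id] = recent
        # Every decision is logged, including denials.
        self.audit_log.append({
            "uid": key.uid,
            "euid": key.euid,
            "tool": tool,
            "timestamp": now,
            "decision": decision,
        })
        return decision == "allowed"


gateway = Gateway(real_credentials={"search": "real-upstream-key"})
key = VirtualKey("vk-1", "alice", "agent-7", frozenset({"search"}), max_calls_per_minute=2)
print(gateway.authorize(key, "search", now=0.0))
print(gateway.authorize(key, "shell", now=1.0))
```

The point of the sketch is that the agent only ever holds a `VirtualKey`; the upstream secret stays in `real_credentials` on the gateway side, and the decision is made where the agent cannot alter it.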
Wall 2: Kernel sandbox (OpenSandbox). Agent processes run in isolated containers (gVisor, Kata, or Firecracker backends via OpenSandbox). Syscall interception, filesystem restrictions, and egress filtering prevent exfiltration. The credential proxy makes secrets available inside the sandbox without writing them to the container filesystem or placing them directly in the process env.
Advisory layer. Hooks, settings.json, permission modes, and CLAUDE.md instructions are the outermost layer. They catch mistakes and improve UX but are explicitly not trusted as enforcement. A sufficiently capable agent can bypass all advisory controls.
Hard (can't bypass)              Soft (can bypass)
---------------------            ---------------------
Wall 1: Gateway (Bifrost)        Agent permission modes
Wall 2: Sandbox (OpenSandbox)    PreToolUse hooks
                                 settings.json rules
                                 CLAUDE.md instructions

Advisory controls alone are not enforceable. The 2026 incident where Claude Code extracted secrets via docker compose config after being blocked from .env demonstrated that hooks and permission modes can be circumvented. Only network interception and kernel isolation provide real security boundaries.
Credential proxy
Secrets never enter the sandbox directly. The credential proxy (part of our OpenSandbox contributions) sits between the IAM provider and the sandbox:
- Agent requests a secret via the proxy API inside the sandbox.
- Proxy authenticates the request against the agent’s euid and the definition’s allowed secrets.
- Proxy fetches from the IAM provider and injects into the sandbox’s isolated env namespace.
- The secret is available to the agent process but cannot be exfiltrated — egress is filtered to only the gateway.
This is stronger than env var injection because the proxy can revoke access mid-session and audit every access.
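The request flow above, including mid-session revocation and per-access auditing, might look roughly like this. It is a sketch under assumptions, not the OpenSandbox credential proxy's real API: `CredentialProxy`, `request_secret`, `revoke`, and the `fetch_from_iam` callable are hypothetical names, and the step that injects the value into the sandbox's isolated env namespace is not modelled.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class CredentialProxy:
    """Toy model of the proxy flow (hypothetical names, not the OpenSandbox API)."""
    fetch_from_iam: Callable[[str], str]  # assumed IAM lookup: secret name -> value
    allowed_secrets: dict                 # euid -> set of secret names the definition allows
    revoked: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)

    def request_secret(self, euid: str, name: str, now: Optional[float] = None) -> Optional[str]:
        now = time.time() if now is None else now
        granted = euid not in self.revoked and name in self.allowed_secrets.get(euid, set())
        # Audit every access attempt, whether or not it is granted.
        self.audit_log.append(
            {"euid": euid, "secret": name, "timestamp": now, "granted": granted}
        )
        # Only contact the IAM provider once the request is authorized.
        return self.fetch_from_iam(name) if granted else None

    def revoke(self, euid: str) -> None:
        """Cut off an agent mid-session; later requests are denied."""
        self.revoked.add(euid)


iam_store = {"DB_PASSWORD": "example-value"}  # stand-in for the IAM provider
proxy = CredentialProxy(
    fetch_from_iam=iam_store.__getitem__,
    allowed_secrets={"agent-7": {"DB_PASSWORD"}},
)
print(proxy.request_secret("agent-7", "DB_PASSWORD") is not None)
proxy.revoke("agent-7")
print(proxy.request_secret("agent-7", "DB_PASSWORD") is None)
```

Because each access goes through `request_secret`, revocation takes effect on the next request, which a value copied into an environment variable at startup cannot offer.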
Implementation
- Bifrost intercepts all outbound network calls from agent processes. Agent gets a virtual key to the gateway, never real credentials.
- OpenSandbox (our fork with Apple Container backend, credential proxy, policy engine) restricts filesystem to the mounted project directory, blocks egress to everything except the gateway endpoint, and intercepts syscalls.
arpi spawn in sandbox mode configures both walls automatically. In bare mode, neither wall is active; the developer’s own machine is the trust boundary.
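To make the two sandbox restrictions concrete (filesystem limited to the project mount, egress limited to the gateway endpoint), here is a small policy model. It is only an illustration: `SandboxPolicy`, its field names, and the example mount path and gateway address are hypothetical, and the check is purely lexical. Real enforcement happens in the kernel sandbox, which also has to account for symlinks and similar tricks that this sketch ignores.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical policy object; not the OpenSandbox policy engine's format."""
    project_mount: str
    gateway_host: str
    gateway_port: int

    def allows_path(self, path: str) -> bool:
        mount = posixpath.normpath(self.project_mount)
        # Relative paths are resolved against the mount; ".." segments are
        # collapsed lexically (symlinks are NOT followed in this sketch).
        resolved = posixpath.normpath(posixpath.join(mount, path))
        return resolved == mount or resolved.startswith(mount + "/")

    def allows_egress(self, host: str, port: int) -> bool:
        # Only the gateway endpoint is reachable; everything else is dropped.
        return (host, port) == (self.gateway_host, self.gateway_port)


policy = SandboxPolicy("/workspace/project", "gateway.internal", 8443)
print(policy.allows_path("/workspace/project/src/main.py"))
print(policy.allows_path("/workspace/project/../../etc/passwd"))
print(policy.allows_egress("example.com", 443))
```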
Verification
- Agent inside sandbox cannot read host filesystem outside the project mount
- Agent inside sandbox cannot curl external services directly (only via gateway)
- Secrets are injected via credential proxy, never written to disk
- Gateway audit log captures: uid, euid, tool called, timestamp
- Revoking a machine identity immediately blocks all active sessions
- Hooks remain configured as first-line defense but are not relied upon for security
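Some of the checks in this list lend themselves to small probes run from inside a sandbox. The helpers below are a sketch of how such probes could be written, not part of any existing test suite: `can_read`, `can_connect`, `missing_audit_fields`, and `REQUIRED_AUDIT_FIELDS` are hypothetical names, and the audit entry is assumed to be a plain dictionary.

```python
import socket

# Fields the gateway audit log is expected to capture, per the list above.
REQUIRED_AUDIT_FIELDS = ("uid", "euid", "tool", "timestamp")


def can_read(path: str) -> bool:
    """Return True if the current process can open and read from `path`."""
    try:
        with open(path, "rb") as handle:
            handle.read(1)
        return True
    except OSError:
        return False


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_audit_fields(entry: dict) -> list:
    """List the required audit fields that are absent from a log entry."""
    return [name for name in REQUIRED_AUDIT_FIELDS if name not in entry]
```

Run inside a sandbox, `can_read` on a host path outside the project mount and `can_connect` to any external host should both return `False`, while `can_connect` to the gateway endpoint should return `True`.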