
ADR-001: Sandbox Architecture — OpenSandbox + CUA Layer + eBPF Audit

Status: Accepted
Date: 2026-03-28
Deciders: Alexandre Philippi

We’re building a compute platform that runs agent sandboxes with three workload types:

  1. Coding sandboxes — agents execute shell commands, write files, run code
  2. Linux CUA sandboxes — agents see a screen, click, type (computer-use agents)
  3. Windows CUA sandboxes — same, but on a Windows desktop

We need VM-level isolation (agents are untrusted code from external users), structured audit trails (every action an agent takes must be recorded for compliance and debugging), and the capacity to run 100+ concurrent sandboxes.

An existing internal project (Lunar Sandbox) already implements CUA for both Linux and Windows, with Anthropic computer-use protocol support, but uses a model where the agent runs outside the sandbox (weaker isolation, stronger audit).

1. Use OpenSandbox as the sandbox runtime (stock, no fork)

OpenSandbox provides:

  • Kata Containers + Cloud-Hypervisor runtime (VM-level isolation)
  • Kubernetes-native scheduling (k3s)
  • execd agent inside sandboxes (exec, files, code interpretation)
  • Ingress proxy for routing to sandbox ports
  • Pre-built example images: desktop (XFCE + VNC), chrome (Chromium + CDP), playwright

We run OpenSandbox’s server as infrastructure. We do not fork it. We use its SDK as a client library.

2. Build a thin CUA library on top of OpenSandbox SDK

For Linux CUA, the CUA layer wraps OpenSandbox’s exec + files APIs:

screenshot() → sandbox.commands.run("scrot /tmp/s.png") + sandbox.files.read_bytes(...)
action(click) → sandbox.commands.run("xdotool mousemove X Y click 1")
action(type) → sandbox.commands.run("xdotool type 'text'")
vnc_url() → sandbox.get_endpoint(6080) # noVNC

This is a library (~500 lines), not a service. It wraps the existing OpenSandbox SDK to provide CUA-specific methods. Sandbox images extend OpenSandbox’s desktop example with scrot, xdotool, and ffmpeg.
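A minimal sketch of what this wrapper could look like, using the SDK calls from the mappings above (`sandbox.commands.run`, `sandbox.files.read_bytes`, `sandbox.get_endpoint` appear in this ADR; the `LinuxCUA` class name and exact method signatures are illustrative, not the final API):

```python
import shlex

class LinuxCUA:
    """Thin CUA layer over an OpenSandbox SDK sandbox handle (sketch)."""

    def __init__(self, sandbox):
        # `sandbox` is an OpenSandbox SDK sandbox exposing
        # .commands.run(), .files.read_bytes(), .get_endpoint()
        self.sandbox = sandbox

    def screenshot(self) -> bytes:
        # Capture the X11 display with scrot, then pull the PNG out of the VM.
        self.sandbox.commands.run("scrot /tmp/s.png")
        return self.sandbox.files.read_bytes("/tmp/s.png")

    def click(self, x: int, y: int, button: int = 1) -> None:
        self.sandbox.commands.run(f"xdotool mousemove {x} {y} click {button}")

    def type_text(self, text: str) -> None:
        # shlex.quote so arbitrary agent-supplied text can't escape the shell
        self.sandbox.commands.run(f"xdotool type {shlex.quote(text)}")

    def vnc_url(self) -> str:
        return self.sandbox.get_endpoint(6080)  # noVNC
```

The same interface would be implemented by a QEMU-backed Windows class, keeping callers backend-agnostic.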

For Windows CUA, the same interface wraps Lunar’s QEMU + cua_helper.ps1 approach. Windows VMs are managed outside k8s (QEMU with KVM on bare metal nodes). The CUA library abstracts the difference — callers don’t know if the sandbox is Linux/Kata or Windows/QEMU.
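For illustration, the QEMU/KVM invocation for a Windows CUA VM might be assembled like this (image path, memory size, CPU count, and VNC display are placeholder values, not what Lunar actually uses):

```python
# Sketch: build a QEMU/KVM command line for a Windows CUA sandbox.
# All concrete values (paths, sizes, ports) are illustrative.

def windows_vm_cmd(image: str, vnc_display: int = 1, mem_gb: int = 8) -> list[str]:
    return [
        "qemu-system-x86_64",
        "-enable-kvm",                        # hardware virtualization via /dev/kvm
        "-m", f"{mem_gb}G",
        "-smp", "4",
        "-drive", f"file={image},if=virtio",  # QCOW2 with virtio drivers installed
        "-netdev", "user,id=net0",
        "-device", "virtio-net-pci,netdev=net0",
        "-vnc", f":{vnc_display}",            # screen access for the CUA layer
    ]

cmd = windows_vm_cmd("/var/lib/vms/win11-cua.qcow2")
```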

3. Run the agent inside the sandbox VM

Section titled “3. Run the agent inside the sandbox VM”

The agent process lives inside the sandbox VM, not on the host. This gives:

  • Isolation — a compromised agent can’t escape the Kata VM
  • Low latency — screen capture is local (no network hop for screenshots)
  • Simplicity — no external orchestration loop for each action

The agent calls vLLM over the cluster network (allowed by egress policy). It sees its own screen via the local X11 display. It cannot interact with other sandboxes or the host.

4. eBPF for audit (replaces mitmproxy and agent-side structured logging)

Instead of requiring the agent to self-report its actions (unreliable) or routing all traffic through mitmproxy (detectable, TLS issues, high overhead), we use eBPF to observe the agent at the kernel level:

| Probe | What it captures |
| --- | --- |
| tracepoint/sys_enter_execve | Every command the agent runs |
| tracepoint/sys_enter_openat + sys_enter_write | Every file the agent reads/writes |
| kprobe/tcp_sendmsg + tcp_recvmsg | Network traffic, including LLM API calls (plaintext for internal HTTP; ciphertext/metadata only for TLS) |

A trusted eBPF agent binary runs inside the Kata VM on a read-only mount, so the agent process can't modify or disable it. It starts before the agent and streams structured events to the OTel Collector, which forwards them to VictoriaLogs.
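The event shape is not fixed by this ADR; a plausible schema for what the eBPF agent would emit as JSON lines toward the collector (field names are assumptions) is:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class AuditEvent:
    # Hypothetical structured-event schema for the eBPF audit agent.
    type: str   # "exec" | "file_open" | "net_send"
    ts_ns: int  # kernel timestamp, nanoseconds
    pid: int
    comm: str   # process name from the task struct
    args: list  # probe-specific payload

def to_json_line(ev: AuditEvent) -> str:
    # Serialize one event as a compact JSON line for the OTel pipeline.
    return json.dumps(asdict(ev), separators=(",", ":"))

line = to_json_line(AuditEvent("exec", 123456789, 4242, "bash",
                               ["xdotool", "type", "hi"]))
```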

Properties:

  • Hard to detect — the unprivileged agent process can't enumerate or observe the kernel probes
  • Tamper-resistant — read-only mount; the agent process has no privileges to unload eBPF programs
  • Near-zero overhead — kernel-level hooks, out of the data path
  • Structured — events carry type, timestamp, PID, and args (not raw video)

In addition to eBPF (which captures actions structurally), we record the VNC stream as video using ffmpeg for human review:

Xvfb → x11vnc → ffmpeg (records to mp4) → object storage

This runs inside the sandbox alongside the eBPF agent, also on a read-only mount. The combination gives both structured audit (eBPF) and visual playback (video).
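The recorder can be sketched as a command builder (display, resolution, frame rate, and output path here are example values, not mandated ones):

```python
def recorder_cmd(display: str = ":0", size: str = "1280x800",
                 fps: int = 10, out: str = "/audit/screen.mp4") -> list[str]:
    # ffmpeg grabs the Xvfb display via x11grab; a low frame rate keeps
    # file sizes reasonable for long-running sandboxes.
    return [
        "ffmpeg",
        "-f", "x11grab",
        "-video_size", size,
        "-framerate", str(fps),
        "-i", display,
        "-codec:v", "libx264",
        "-preset", "veryfast",
        out,
    ]

cmd = recorder_cmd()
```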

┌─ Sandbox (Kata VM) ────────────────────────────────────┐
│                                                        │
│  Trusted layer (read-only mounts, starts first)        │
│  ├── eBPF agent → structured events → OTel Collector   │
│  └── ffmpeg (VNC recorder) → screen.mp4 → storage      │
│                                                        │
│  Desktop environment                                   │
│  ├── Xvfb (virtual display)                            │
│  ├── x11vnc (VNC server, live viewing)                 │
│  ├── noVNC (browser-based VNC access)                  │
│  └── Chromium / apps                                   │
│                                                        │
│  Agent process (untrusted)                             │
│  ├── Sees screen via DISPLAY=:0                        │
│  ├── Injects input via xdotool                         │
│  ├── Calls vLLM via HTTP (egress-allowed)              │
│  └── Cannot see/modify eBPF agent or ffmpeg            │
└────────────────────────────────────────────────────────┘
           │                           │
           ▼                           ▼
    OTel Collector              Object storage
    → VictoriaLogs              → screen recordings
    (structured audit)          (video playback)

Why this design:

  • One sandbox concept — coding and CUA sandboxes are the same domain object with different capabilities, not separate systems
  • No fork of OpenSandbox — we consume it stock, get upstream updates for free
  • Agent-inside isolation — strongest security model, agents can’t escape Kata VMs
  • Complete audit — eBPF sees everything without agent cooperation, video for human review
  • SDK compatibility — OpenSandbox SDKs (Python, Java, JS, C#) work against our platform
  • Windows support — same CUA interface, different backend (QEMU instead of Kata)

Trade-offs and risks:

  • eBPF in Kata VMs — needs eBPF support in the guest kernel. Kata’s guest kernel is minimal and needs a custom build with eBPF enabled (validated; see below).
  • Windows is separate — Windows VMs live outside k8s (QEMU on bare metal). Different lifecycle management path, though same client interface.
  • CUA library to build — thin but still custom code (~500 lines wrapping OpenSandbox SDK + Lunar’s Windows code)
  • No existing eBPF audit agent — must build or adapt (Tetragon, Inspektor Gadget, or custom with cilium/ebpf).
  • Kata guest kernel eBPF support — VALIDATED: the default guest kernel strips eBPF, but a custom build with KERNEL_DEBUG_ENABLED=yes is officially supported. Ant Group runs this in production.
  • HTTPS traffic visibility — eBPF tcp_sendmsg sees ciphertext for TLS connections. For internal vLLM (plain HTTP) this is fine. For external HTTPS (api.anthropic.com, api.openai.com), eBPF captures metadata (destination IP, port, timing, bytes transferred) but not request/response bodies. See “Audit Tiers” below.

Audit Tiers

Two audit levels, configurable per sandbox:

Tier 1: eBPF + video recording (default)

Lightweight, near-zero overhead, and not observable by the agent process. Captures:

| Probe | What it captures | HTTPS visibility |
| --- | --- | --- |
| tracepoint/sys_enter_execve | Every command the agent runs | N/A |
| tracepoint/sys_enter_openat | Every file read/write | N/A |
| kprobe/tcp_sendmsg | Network connections — destination, bytes, timing | Metadata only (ciphertext) |
| VNC → ffmpeg | Screen recording | N/A |

Sufficient for: “what did the agent do, what did it touch, who did it talk to, what did the screen look like.”

Not sufficient for: “what exact prompt did the agent send to Claude API.”

Tier 2: mitmproxy (opt-in, for deep debugging)

Enabled per sandbox via environment config (AUDIT_LEVEL=deep). Adds transparent mitmproxy inside the sandbox:

# iptables redirects all outbound HTTPS through mitmproxy
iptables -t nat -A OUTPUT -p tcp --dport 443 -j REDIRECT --to-port 8080
mitmdump --mode transparent -w /audit/flows.mitm &

The sandbox image includes a pre-baked mitmproxy CA cert (system-wide trusted). All HTTPS is decrypted, logged as structured HAR/JSON with full request/response bodies, then re-encrypted.
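The logged flows can then be post-processed for review. A sketch of summarizing a HAR export (the HAR 1.2 entry layout is standard; the `/audit` export step itself and the sample URL are assumptions):

```python
def summarize_har(har: dict) -> list[tuple[str, str, int]]:
    # Reduce full request/response records to (method, url, status) triples
    # for quick review; bodies stay in the raw capture for deep inspection.
    out = []
    for entry in har.get("log", {}).get("entries", []):
        req, resp = entry["request"], entry["response"]
        out.append((req["method"], req["url"], resp["status"]))
    return out

sample = {"log": {"entries": [
    {"request": {"method": "POST", "url": "https://api.anthropic.com/v1/messages"},
     "response": {"status": 200}},
]}}
```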

Additional cost:

  • ~50-100ms latency per HTTPS request (proxy overhead)
  • CA cert baked into image (agent could theoretically detect it, but can’t remove it — read-only mount)
  • Larger audit data volume (full HTTP bodies vs metadata)
  • Some TLS clients may reject the intercepted cert (certificate pinning)

When to use:

  • Debugging agent behavior (“why did it send that prompt?”)
  • Compliance audit requiring full LLM conversation transcripts
  • Security investigation after an incident

Not needed for:

  • Normal operation — eBPF tier gives enough visibility
  • Internal vLLM calls — already plain HTTP, eBPF sees everything

Alternatives Considered

A. Fork OpenSandbox, add CUA + Windows natively

Rejected. High maintenance burden (fork divergence), CUA endpoints don’t fit OpenSandbox’s exec/logs API model, Windows would require upstream OSEP with uncertain acceptance.

B. Run Lunar and OpenSandbox as separate services

Rejected. Two sandbox concepts, two APIs, duplicated auth/monitoring. Violates DDD — “sandbox” is one bounded context.

C. Agent outside sandbox with structured audit (Lunar’s model)

Rejected for production. Weaker isolation (agent has host access). Acceptable for internal evaluation/testing but not for external untrusted agents.

D. mitmproxy as the only network audit (replacing eBPF)

Rejected as default. Too heavy for always-on use (latency, cert pinning issues, data volume). Kept as opt-in Tier 2 for deep debugging alongside the eBPF default.

E. nginx/envoy proxy with BASE_URL env vars for LLM audit

Rejected. Requires per-provider proxy config, only works if agents respect *_BASE_URL env vars (Codex/Rust may not), doesn’t capture non-LLM HTTPS traffic.

F. uprobe on SSL_write for TLS interception

Rejected as primary approach. Requires knowing the exact TLS library path per language runtime (OpenSSL, BoringSSL, rustls — all different). Breaks on binary updates (symbol offsets change). mitmproxy is simpler for the same result when deep audit is needed.

Validation results:

  • Kata guest kernel eBPF — CONFIRMED: custom build with KERNEL_DEBUG_ENABLED=yes enables full eBPF. Ant Group runs this in production.
  • Host kernel eBPF — CONFIRMED: Azure kernel 6.17 has CONFIG_BPF=y, CONFIG_BPF_SYSCALL=y, CONFIG_BPF_JIT=y, all tracepoints available.
  • KVM for Windows VMs — CONFIRMED: /dev/kvm present, kvm_intel loaded.
  • tcp_sendmsg for HTTP — CONFIRMED: works for plain HTTP (vLLM internal). HTTPS requires Tier 2 (mitmproxy).

Remaining tasks:

  • Benchmark eBPF audit overhead inside Kata VM
  • Test xdotool + scrot latency inside Kata VM (CUA action loop speed)
  • Build Windows QCOW2 image with virtio drivers + cua_helper.ps1 for QEMU provider
  • Validate mitmproxy transparent mode inside Kata VM (iptables redirect)