Template Reference

Templates are TOML files that define everything about a worker: its identity, model, capabilities, compute mode, and evaluation criteria. The template is the source of truth for a worker's configuration.

template_version = "1.0"
[worker]
name = "my-worker"
[workstation]
model = "claude-sonnet-4-20250514"
task = "Analyze the codebase and generate a dependency report."
instructions = "Focus on circular dependencies and unused imports."
[workstation.settings]
max_tokens = 8192
temperature = 0.7
[workstation.capabilities]
mcps = ["filesystem", "github"]
skills = ["code-analysis"]
[compute]
mode = "sandbox"
timeout = "2h"
[eval]
suite = "analysis-quality"

template_version is required. The current value is "1.0".

template_version = "1.0"

The [worker] table sets the worker's identity.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| name | string | Yes | Unique name for this worker type |

[worker]
name = "code-reviewer"

The [workstation] table holds the model configuration and task definition.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| model | string | Yes | Model identifier (e.g., claude-sonnet-4-20250514) |
| task | string | No | One-shot task description. Worker completes and exits |
| instructions | string | No | Ongoing instructions for daemon workers |

Use task for one-shot workers that complete a single job. Use instructions for long-running daemon workers that process multiple episodes.

[workstation]
model = "claude-sonnet-4-20250514"
task = "Review the latest PR for security issues."
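
For contrast with the one-shot example above, a daemon worker would set instructions instead of task. The fragment below is a minimal sketch: the instruction text is a hypothetical placeholder, and it assumes a daemon worker omits task entirely, which the field descriptions suggest but do not state outright.

```toml
# Sketch of a daemon worker: instructions takes the place of task.
[workstation]
model = "claude-sonnet-4-20250514"
# Hypothetical instruction text, for illustration only.
instructions = "Review each incoming PR for security issues and summarize the findings."
```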

The [workstation.settings] table sets model parameters.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| max_tokens | integer | Model default | Maximum tokens per response |
| temperature | float | Model default | Sampling temperature (0.0 - 1.0) |

[workstation.settings]
max_tokens = 4096
temperature = 0.3

The [workstation.capabilities] table lists the tools and skills available to the worker.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| mcps | string[] | [] | MCP server names (e.g., filesystem, github, slack) |
| skills | string[] | [] | Skill names (e.g., code-review, testing) |

Capabilities drive credential inference. When you list mcps = ["github"], arpi’s registry resolves which credentials are needed (here, a GitHub token), provisions them via Infisical, and injects them into the worker’s sandbox.

[workstation.capabilities]
mcps = ["filesystem", "github"]
skills = ["code-review"]
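
To illustrate how capabilities and compute mode interact, the sketch below pairs a credentialed MCP with sandbox mode. The [compute] section states that credential injection requires sandbox mode, and mode defaults to "bare", so it seems reasonable to set mode explicitly whenever a listed MCP needs credentials. What arpi does when a credentialed MCP is listed under bare mode is not specified here, so treat this as a cautious pattern and not a documented requirement.

```toml
[workstation.capabilities]
# github needs a GitHub token, which arpi resolves and injects.
mcps = ["github"]

[compute]
# Set explicitly: the default is "bare", and injection needs the sandbox.
mode = "sandbox"
```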

The [compute] table configures the compute environment.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| mode | string | "bare" | "sandbox" (isolated container) or "bare" (host process) |
| timeout | string | "1h" | Maximum worker lifetime (e.g., "30m", "2h", "24h") |

Sandbox mode provides kernel-level isolation and is required for credential injection. Bare mode runs without isolation — suitable for trusted, low-risk tasks.

[compute]
mode = "sandbox"
timeout = "2h"
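
For comparison, a bare-mode configuration for a short, trusted task might look like the following sketch. Since mode defaults to "bare", the mode line is redundant and is written out only for readability. The "30m" value is taken from the example formats listed in the table.

```toml
[compute]
mode = "bare"      # same as the default; written out for clarity
timeout = "30m"    # shorter than the "1h" default
```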

The [eval] table configures evaluation.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| suite | string | No | Name of the eval suite to run against the worker’s output |

Eval suites score worker output using three grader types:

  • Code graders: deterministic checks (regex, assertions, test execution)
  • Model graders: LLM-as-judge scoring (rubric-based evaluation)
  • Human graders: manual review queue for subjective quality assessment

[eval]
suite = "code-review-quality"
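
Putting the reference together, a minimal template might set only the fields marked required plus a task, and rely on defaults elsewhere. This is a sketch under assumptions: the worker name is hypothetical, and the comments restate the defaults from the tables above. Whether arpi accepts a template with every optional table omitted in exactly this form is not confirmed by this page.

```toml
template_version = "1.0"

[worker]
name = "dependency-reporter"   # hypothetical worker name

[workstation]
model = "claude-sonnet-4-20250514"
task = "Analyze the codebase and generate a dependency report."

# Omitted tables fall back to the documented defaults:
#   [workstation.settings]      max_tokens, temperature -> model defaults
#   [workstation.capabilities]  mcps = [], skills = []
#   [compute]                   mode = "bare", timeout = "1h"
#   [eval]                      suite is optional
```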