Build an orchestration mode (Claude API doc) — source summary
A Claude API build guide showing how to implement agent-orchestration — a
session-level “orchestration mode” with multi-agent fan-out — built only from documented
primitives. Delivered via Telegram, ingested 2026-05-29. Text in
raw/orchestration-mode.md.
Key points
- Orchestration mode = a session toggle granting standing consent for the model to break substantive requests into parallel subtasks automatically (vs. per-request opt-in). On for high-stakes work, off for casual turns.
- Built on mid-conversation system messages, placed after a user turn so the static top-level system stays cached. Three notices: MODE_ENTER / MODE_REFRESH / MODE_EXIT. This is exactly the claude-opus-4-8 capability noted earlier (steer late in a long loop without restating the prompt; cache-preserving).
- Orchestrator-worker pattern: main agent → Workflow tool (fan-out) → up to ~10 parallel subagents (each a nested loop with bash + report_findings). Quality patterns: adversarial verification, a completeness critic, multi-phase sequencing.
- “No hidden API” — composed from documented effort levels, system messages, and tools. Cost caveat: fan-out multiplies tokens; reserve for work that justifies it.
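The toggle and fan-out described above can be sketched in plain Python. This is a minimal illustration rather than the guide's code, and it makes no API calls. Session, set_mode, run_subagent, fan_out and MAX_PARALLEL are hypothetical names. The reading of MODE_REFRESH as "re-asserted while already on" and the single-pass fallback when the mode is off are assumptions. The subagent is a stub standing in for a nested loop.

```python
import asyncio
from dataclasses import dataclass, field

# The summary says "up to ~10" parallel subagents; the exact cap is an assumption.
MAX_PARALLEL = 10


@dataclass
class Session:
    """Hypothetical session state: the mode flag plus the notices emitted so far."""
    orchestration_on: bool = False
    notices: list = field(default_factory=list)

    def set_mode(self, on: bool) -> None:
        # Pick the notice name for this transition. Treating MODE_REFRESH as
        # "turned on while already on" is an assumption, not from the source.
        if on and self.orchestration_on:
            notice = "MODE_REFRESH"
        elif on:
            notice = "MODE_ENTER"
        elif self.orchestration_on:
            notice = "MODE_EXIT"
        else:
            return  # already off: nothing to announce
        self.orchestration_on = on
        self.notices.append(notice)


async def run_subagent(subtask: str) -> dict:
    # Stub for a nested agent loop (bash + report_findings in the source).
    await asyncio.sleep(0)
    return {"subtask": subtask, "findings": f"stub findings for {subtask}"}


async def fan_out(session: Session, subtasks: list) -> list:
    if not session.orchestration_on:
        # Mode off: no standing consent, so handle everything in one pass
        # (this fallback is an assumption).
        return [await run_subagent(" + ".join(subtasks))]

    sem = asyncio.Semaphore(MAX_PARALLEL)

    async def bounded(task: str) -> dict:
        # The semaphore keeps at most MAX_PARALLEL subagents in flight.
        async with sem:
            return await run_subagent(task)

    # gather returns results in the same order as the subtasks.
    return await asyncio.gather(*(bounded(t) for t in subtasks))


if __name__ == "__main__":
    session = Session()
    session.set_mode(True)
    results = asyncio.run(fan_out(session, ["audit logs", "check configs"]))
    print(session.notices, [r["subtask"] for r in results])
```

The semaphore is where the cost caveat shows up: every subtask is a separate model loop, so the cap bounds concurrency but not total token spend.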
Why it’s here
The clearest API-level worked example of agent-orchestration: the orchestrator-worker fan-out that claude-managed-agents deploys (leaf-worker subagents) and that gbrain runs as its durable “Minions” queue. Reflexively, it’s the same pattern the agent maintaining this wiki uses when it spawns subagents.
Related
agent-orchestration · claude-opus-4-8 · claude-managed-agents · gbrain