Orchestration is delegation with intent: you decompose a goal into roles, hand each role the minimum context and tools it needs, and keep a tight feedback loop with the humans you build for. The cards below are the working patterns — wiring, model/effort budgeting, the collab wrapper, and the security guardrails that keep an autonomous fleet trustworthy.
Mental model
Roles vs. Personas vs. Sub-agents
- Persona = the human you serve (scientist, engineer, reviewer). Shapes tone, defaults, and what "done" means.
- Agent role = a job description you assign Claude: planner, implementer, tester, security reviewer, doc writer.
- Sub-agent = the runtime instance of a role — its own context window, tools, and system prompt, spawned by the orchestrator.
- You are the orchestrator: you don't do the work, you route it and integrate the results.
Why delegate
What sub-agents buy you
- Context isolation: a search-heavy sub-agent burns tokens in its own window, returns only a summary, and keeps your main thread clean.
- Parallelism: fan out independent tasks (explore arch / map files / list risks) at once.
- Specialization: a security-reviewer role with a focused prompt beats a generalist doing everything.
- Cheaper main loop: push grunt work to Haiku/Sonnet sub-agents; reserve Opus for synthesis.
Copyable
Orchestration prompt skeletons
Spawn a scoped sub-agent
Spawn a sub-agent with the "data-loader" role.
Scope: ONLY read src/io/ and write a summary of every file
format we parse. Do not edit code. Return: a table of
{format, entrypoint, known gaps}. Budget: keep it tight,
use Sonnet-level effort.
Fan-out, then synthesize (orchestrator pattern)
Run three sub-agents in parallel, then YOU synthesize:
1. architecture-mapper → how modules depend on each other
2. risk-scanner → top 5 correctness/security risks
3. test-gap-finder → untested critical paths
After all three return, reconcile conflicts and give me
ONE ranked action list. Flag any disagreement between agents.
Role hand-off with explicit contract
Workflow: planner → implementer → tester → reviewer.
Each role hands off a written artifact the next consumes:
planner → PLAN.md (numbered steps, acceptance criteria)
implementer → code + CHANGES.md
tester → test results, pass/fail per criterion
reviewer → review notes, must explicitly APPROVE or BLOCK
Do not let a role start until the prior artifact exists.
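One way to enforce this hand-off contract outside the model, as an added example (not part of the original page). PLAN.md and CHANGES.md are the page's artifact names; TEST_RESULTS.md, REVIEW.md, the function names, and the rule that a verdict sits alone on its own line are assumptions made for the example.

```python
from __future__ import annotations
import re

# Pipeline order and the artifact each role must produce.
# TEST_RESULTS.md and REVIEW.md are assumed file names.
PIPELINE = [
    ("planner", "PLAN.md"),
    ("implementer", "CHANGES.md"),
    ("tester", "TEST_RESULTS.md"),
    ("reviewer", "REVIEW.md"),
]


def next_role(artifacts: dict[str, str]) -> str | None:
    """Return the earliest role whose artifact is missing or empty.

    A later artifact that exists out of order does not unblock anything:
    the first gap in the chain always wins. None means every artifact exists.
    """
    for role, filename in PIPELINE:
        if not artifacts.get(filename, "").strip():
            return role
    return None


def reviewer_verdict(review_text: str) -> str:
    """Require exactly one explicit verdict on a line of its own.

    No verdict, or both APPROVE and BLOCK present, fails closed to BLOCK.
    """
    found = set(re.findall(r"^\s*(APPROVE|BLOCK)\s*$", review_text, flags=re.MULTILINE))
    return "APPROVE" if found == {"APPROVE"} else "BLOCK"


def may_deploy(artifacts: dict[str, str], human_approved: bool) -> bool:
    """Allow deploy only with a full artifact chain, an APPROVE, and human sign-off."""
    if next_role(artifacts) is not None:
        return False
    return reviewer_verdict(artifacts["REVIEW.md"]) == "APPROVE" and human_approved
```

Run it from the harness (for example in a hook) rather than asking a sub-agent to self-report, so the gate does not depend on model output.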
Decision
Model & effort selection
| Job | Model | Effort |
|---|---|---|
| Plan / architect / reconcile sub-agents | Opus | high–xhigh |
| Implement well-specified code | Sonnet | medium–high |
| Search, summarize, mechanical edits | Haiku/Sonnet | low–medium |
| Security / correctness review | Opus | high |
- Match effort to ambiguity, not size. A 3-line change in a subtle race is high-effort.
- Run the orchestrator on a strong model; let it down-shift sub-agents.
Signature technique
The Collab Wrapper loop
- Wrap a reveal.js deck or a UI so the user can select any element and attach a comment or edit its text in place.
- An Export-to-JSON button serializes every requested change with a stable selector/ID per element.
- Paste that JSON into a single Claude Code prompt → the agent applies all edits in one batch pass.
- Kills the "which part of page 7 do you mean?" round-trips that dominate one-by-one prompting.
Batch-apply prompt
Here is exported collab JSON. Apply EVERY change in one pass.
Each item = {selector, type: "edit"|"comment", value}.
For "edit", replace that element's text. For "comment",
treat as an instruction to revise that element. Make all
edits, then list what you changed by selector.
Extending the agent
Skills & plugins
- Skills = reusable, model-invoked capability packets (instructions + scripts) Claude loads on demand. Great for encoding a workflow once.
- Plugins bundle skills, slash-commands, sub-agent definitions, hooks, and MCP servers into one installable unit.
- As orchestrator, package recurring role+workflow combos as a plugin so the whole fleet ships with the repo.
- Hooks let you enforce policy (lint, block dangerous commands) at tool boundaries — automation the harness runs, not the model.
Security priority
MCP servers, used securely
- MCP exposes external tools/data to the agent. Treat every server as code to which you are granting tool access.
- Prefer read-only / least-privilege scopes; separate a "reader" MCP from any "writer" MCP.
- Pin and review the server source; don't auto-connect unvetted community servers in an agent with write/exec tools.
- Keep secrets in env/secret stores, never in prompts or committed config.
- Prompt injection is real: data an MCP server returns can contain instructions. Don't let a sub-agent both ingest untrusted web/content AND hold destructive tools.
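The rules above can be checked mechanically before a fleet is spawned. This is an added example, not part of the original page and not Claude Code's own schema: the `Role` shape, the tool names, and the ingest/destructive classification are all hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass

# Assumed tool classification (hypothetical names); map your harness's real tools here.
UNTRUSTED_INGEST = {"web_fetch", "web_search", "mcp:issue-reader"}
DESTRUCTIVE = {"write_file", "shell_exec", "git_push", "mcp:db-writer"}


@dataclass(frozen=True)
class Role:
    name: str
    tools: frozenset
    mcp_pinned: bool = True  # every MCP server this role uses is pinned to reviewed source


def lint_role(role: Role) -> list[str]:
    """Return findings for one role; an empty list means it passes."""
    findings = []
    ingest = sorted(role.tools & UNTRUSTED_INGEST)
    destroy = sorted(role.tools & DESTRUCTIVE)
    if ingest and destroy:
        findings.append(f"{role.name}: reads untrusted data {ingest} and holds {destroy}")
    if destroy and not role.mcp_pinned and any(t.startswith("mcp:") for t in role.tools):
        findings.append(f"{role.name}: unpinned MCP server alongside write/exec tools")
    return findings


def split_role(role: Role) -> tuple[Role, Role]:
    """Split a mixed role: the reader keeps ingest tools, the writer keeps the rest."""
    reader = Role(f"{role.name}-reader", role.tools & UNTRUSTED_INGEST, role.mcp_pinned)
    writer = Role(f"{role.name}-writer", role.tools - UNTRUSTED_INGEST, role.mcp_pinned)
    return reader, writer


def lint_fleet(roles: list[Role]) -> list[str]:
    return [finding for role in roles for finding in lint_role(role)]


# Constructed sample fleet, not taken from any real repository.
FLEET = [
    Role("researcher", frozenset({"web_fetch", "web_search", "read_file"})),
    Role("implementer", frozenset({"read_file", "write_file", "shell_exec"})),
    Role("triager", frozenset({"mcp:issue-reader", "git_push"}), mcp_pinned=False),
]

if __name__ == "__main__":
    for finding in lint_fleet(FLEET):
        print("BLOCK:", finding)
```

After a split, the writer still consumes the reader's summary, so the orchestrator should pass that summary along as data to act on, not as instructions to follow.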
Tradeoffs
Modalities & token cost
| Modality | Best for | Token cost |
|---|---|---|
| Copy/paste chat | one-off reasoning, no repo | low |
| Inline autocomplete | local, in-flow edits | low |
| CLI single-agent | scoped repo tasks | medium |
| CLI + sub-agent workflow | multi-step PoCs | high |
- Orchestration is the most capable and most expensive tier — use it when isolation/parallelism earns its keep.
- Watch per-session and weekly limits; sub-agent fan-out spends both fast.
- Opus carries a model premium vs Sonnet/Haiku — budget the fleet accordingly.
Practice
Do / Avoid
Do
- Give each role a written hand-off artifact (PLAN.md, CHANGES.md).
- Scope sub-agents to specific dirs/tools.
- Reconcile conflicting sub-agent outputs yourself.
- Batch user edits via the collab JSON loop.
- Gate irreversible/exec actions behind approval.
Avoid
- Spawning sub-agents that each re-scan the whole repo.
- Opus-everywhere when Sonnet/Haiku suffice.
- Letting one agent ingest untrusted data and hold write tools.
- One-by-one "change this bit" prompts on a UI.
- Vague roles with no acceptance criteria.
Output format
Artifact-driven orchestration
- For dense results, ask for an HTML artifact or SVG diagram instead of a wall of text — easier to review and to wrap for collab.
- A workflow diagram (roles → hand-offs) is the fastest way to verify the orchestration you intended.
- reveal.js v6+ with vertical slides = progressive depth: the deeper the vertical stack, the more technical the detail.
Diagram request
Draw an SVG flow of this workflow: planner → implementer →
tester → reviewer, with the artifact each hands off labeled
on the arrows. Mark the human-approval gate before deploy.
claude.ai vs Claude Code
Where to orchestrate
- claude.ai (web): great for design, artifacts, the collab wrapper, and reasoning — but no repo, no sub-agent fleet, no MCP file/exec tools.
- Claude Code (CLI): where true orchestration lives — sub-agents, workflows, MCP, skills, plugins, hooks.
- Common pattern: design + collab in web, then export JSON / a spec into Claude Code for the batch implementation pass.
- The response "styles" feature appears to be migrating into Skills — verify in your own account; either way, encode tone in the role prompt.
Honest framing
How this could be better
- These are my working patterns, not gospel — I want sharper ones.
- Open questions: best sub-agent budgeting heuristics; cleaner artifact contracts between roles; tighter MCP sandboxing defaults.
- Cross-tool: the role/workflow ideas should port to Codex and Antigravity — compare limits, premiums, and ToU before committing a fleet.