The Orchestrator — AI-Augmented Development Cheat Sheet

Orchestration is delegation with intent: you decompose a goal into roles, hand each role the minimum context and tools it needs, and keep a tight feedback loop with the humans you build for. The cards below are the working patterns — wiring, model/effort budgeting, the collab wrapper, and the security guardrails that keep an autonomous fleet trustworthy.

Mental model

Roles vs. Personas vs. Sub-agents

Persona = the human you serve (scientist, engineer, reviewer). Shapes tone, defaults, and what "done" means.
Agent role = a job description you assign Claude: planner, implementer, tester, security reviewer, doc writer.
Sub-agent = the runtime instance of a role — its own context window, tools, and system prompt, spawned by the orchestrator.
You are the orchestrator: you don't do the work, you route it and integrate the results.

Tip: Name roles explicitly in the prompt. "Act as the test-author sub-agent" gives the model a frame and keeps its output scoped.

Why delegate

What sub-agents buy you

Context isolation: a search-heavy sub-agent burns tokens in its own window, returns only a summary, and keeps your main thread clean.
Parallelism: fan out independent tasks (explore architecture / map files / list risks) at once.
Specialization: a security-reviewer role with a focused prompt beats a generalist doing everything.
Cheaper main loop: push grunt work to Haiku/Sonnet sub-agents; reserve Opus for synthesis.

Caution: Each sub-agent re-reads context from scratch. Five sub-agents that each re-scan the repo can cost more than one well-scoped pass. Delegate for isolation/parallelism, not reflexively.
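
The caution above can be checked with back-of-the-envelope arithmetic. A minimal sketch, assuming made-up token counts (every constant below is illustrative, not a measurement):

```python
# Illustrative numbers only: every token count here is an assumption.
REPO_SCAN_TOKENS = 40_000  # tokens to read the relevant repo context once
TASK_TOKENS = 5_000        # tokens of actual work per task
SUMMARY_TOKENS = 500       # tokens each sub-agent returns to the orchestrator


def single_pass_cost(num_tasks: int) -> int:
    """One agent scans the repo once, then does every task in the same window."""
    return REPO_SCAN_TOKENS + num_tasks * TASK_TOKENS


def fan_out_cost(num_tasks: int, scan_fraction: float = 1.0) -> int:
    """Each sub-agent re-reads its share of the repo, works, and returns a summary.

    scan_fraction is how much of the repo each sub-agent reads (1.0 = all of it).
    """
    per_agent = round(REPO_SCAN_TOKENS * scan_fraction) + TASK_TOKENS + SUMMARY_TOKENS
    return num_tasks * per_agent


print(single_pass_cost(5))   # one well-scoped pass
print(fan_out_cost(5))       # five sub-agents that each re-scan everything
print(fan_out_cost(5, 0.1))  # five sub-agents scoped to a tenth of the repo each
```

Under these assumed numbers the unscoped fan-out costs several times the single pass, while the tightly scoped fan-out comes in below it. Scoping is what makes delegation pay.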

Copyable

Orchestration prompt skeletons

Spawn a scoped sub-agent

Spawn a sub-agent with the "data-loader" role.
Scope: ONLY read src/io/ and write a summary of every file
format we parse. Do not edit code. Return: a table of
{format, entrypoint, known gaps}. Budget: keep it tight;
run on Sonnet at low-to-medium effort.

Fan-out, then synthesize (orchestrator pattern)

Run three sub-agents in parallel, then YOU synthesize:
  1. architecture-mapper → how modules depend on each other
  2. risk-scanner → top 5 correctness/security risks
  3. test-gap-finder → untested critical paths
After all three return, reconcile conflicts and give me
ONE ranked action list. Flag any disagreement between agents.
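
For readers scripting the fan-out themselves, here is a rough sketch of its shape in Python. `run_subagent` is a hypothetical stand-in that returns a canned report; it is not a real SDK call, and a real version would invoke whatever agent runtime you use.

```python
import asyncio


async def run_subagent(role: str, task: str) -> dict:
    """Hypothetical stand-in for a real sub-agent call; returns a canned report."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"role": role, "task": task, "findings": [f"{role}: finding for {task}"]}


async def fan_out_and_synthesize(tasks: dict) -> dict:
    """Run every role concurrently, then merge the reports in one place."""
    reports = await asyncio.gather(
        *(run_subagent(role, task) for role, task in tasks.items())
    )
    # Synthesis stays with the orchestrator: collect findings per role.
    return {report["role"]: report["findings"] for report in reports}


TASKS = {
    "architecture-mapper": "how modules depend on each other",
    "risk-scanner": "top 5 correctness/security risks",
    "test-gap-finder": "untested critical paths",
}

merged = asyncio.run(fan_out_and_synthesize(TASKS))
```

The point of the shape: the three roles never see each other's output, and reconciliation happens in exactly one place.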

Role hand-off with explicit contract

Workflow: planner → implementer → tester → reviewer.
Each role hands off a written artifact the next consumes:
  planner   → PLAN.md (numbered steps, acceptance criteria)
  implementer → code + CHANGES.md
  tester    → test results, pass/fail per criterion
  reviewer  → review notes, must explicitly APPROVE or BLOCK
Do not let a role start until the prior artifact exists.
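
The "prior artifact must exist" rule can also be enforced mechanically instead of by prompt alone. A minimal sketch, assuming artifacts live as files in one working directory; PLAN.md and CHANGES.md come from the contract above, while TEST_RESULTS.md is an assumed file name for the tester's output.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Role -> artifact that must exist before the role may start.
# TEST_RESULTS.md is an assumed name; the contract only says "test results".
REQUIRES: dict = {
    "planner": None,
    "implementer": "PLAN.md",
    "tester": "CHANGES.md",
    "reviewer": "TEST_RESULTS.md",
}


def can_start(role: str, workdir: Path) -> bool:
    """A role may start only once the artifact it consumes is on disk."""
    needed: Optional[str] = REQUIRES[role]
    return needed is None or (workdir / needed).is_file()


# Demo in a throwaway directory: the implementer is blocked until PLAN.md exists.
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    before = can_start("implementer", workdir)
    (workdir / "PLAN.md").write_text("1. Add parser\n")
    after = can_start("implementer", workdir)
```

A check like this fits naturally in the orchestrator's own loop or in a pre-step script, so a skipped hand-off fails loudly rather than silently.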

Decision

Model & effort selection

Job	Model	Effort
Plan / architect / reconcile sub-agents	Opus	high–xhigh
Implement well-specified code	Sonnet	medium–high
Search, summarize, mechanical edits	Haiku/Sonnet	low–medium
Security / correctness review	Opus	high

Match effort to ambiguity, not size. A 3-line change in a subtle race is high-effort.
Run the orchestrator on a strong model; let it down-shift sub-agents.
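
If you script the dispatch, the table reduces to a lookup. A sketch only: the job keys are hypothetical labels, and the effort names are copied from the table rather than from any tool's settings.

```python
# (model, default effort, effort to use when the task is ambiguous).
# Values mirror the table above; the job keys are hypothetical labels.
BUDGETS = {
    "plan": ("Opus", "high", "xhigh"),
    "implement": ("Sonnet", "medium", "high"),
    "search": ("Haiku", "low", "medium"),  # the table also allows Sonnet here
    "review": ("Opus", "high", "high"),
}


def pick_budget(job: str, ambiguous: bool = False) -> tuple:
    """Return (model, effort), moving to the top of the range for ambiguous work."""
    model, default_effort, ambiguous_effort = BUDGETS[job]
    return model, (ambiguous_effort if ambiguous else default_effort)
```

Note that the function takes an `ambiguous` flag and no size parameter: effort follows ambiguity, so a 3-line fix in a subtle race would be called with `ambiguous=True`.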

Tip: Most of these patterns are model-agnostic — the role/workflow structure transfers to other AI coding tools even if names differ.

Signature technique

The Collab Wrapper loop

Wrap a reveal.js deck or a UI so the user can select any element and attach a comment or edit its text in place.
An Export-to-JSON button serializes every requested change with a stable selector/ID per element.
Paste that JSON into a single Claude Code prompt → the agent applies all edits in one batch pass.
Kills the "which part of page 7 do you mean?" round-trips that dominate one-by-one prompting.

Batch-apply prompt

Here is exported collab JSON. Apply EVERY change in one pass.
Each item = {selector, type: "edit"|"comment", value}.
For "edit", replace that element's text. For "comment",
treat as an instruction to revise that element. Make all
edits, then list what you changed by selector.

Tip: Stable element IDs are the whole trick. Generate them when you build the artifact so the JSON round-trips cleanly.
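
To make the round-trip concrete, here is a small sketch of the apply step. It assumes the export is a JSON array of {selector, type, value} items, as in the prompt above, and models the artifact as a plain dict of selector to text; both are simplifications for illustration.

```python
import json


def apply_collab_changes(elements: dict, exported_json: str) -> tuple:
    """Apply 'edit' items directly; collect 'comment' items as revision instructions."""
    updated = dict(elements)
    pending_comments = []
    for item in json.loads(exported_json):
        selector, kind, value = item["selector"], item["type"], item["value"]
        if selector not in updated:
            # An unstable or stale ID is exactly the failure the tip warns about.
            raise KeyError(f"unknown selector: {selector}")
        if kind == "edit":
            updated[selector] = value
        elif kind == "comment":
            pending_comments.append((selector, value))
        else:
            raise ValueError(f"unknown change type: {kind}")
    return updated, pending_comments


# Hypothetical deck content and export, for illustration only.
deck = {"#slide-7-title": "Old title", "#slide-7-body": "Body text"}
export = json.dumps([
    {"selector": "#slide-7-title", "type": "edit", "value": "New title"},
    {"selector": "#slide-7-body", "type": "comment", "value": "Shorten this"},
])
new_deck, comments = apply_collab_changes(deck, export)
```

Edits are deterministic and need no model at all; only the comments have to go to the agent, which keeps the batch prompt short.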

Extending the agent

Skills & plugins

Skills = reusable, model-invoked capability packets (instructions + scripts) Claude loads on demand. Great for encoding a workflow once.
Plugins bundle skills, slash-commands, sub-agent definitions, hooks, and MCP servers into one installable unit.
As orchestrator, package recurring role+workflow combos as a plugin so the whole fleet ships with the repo.
Hooks let you enforce policy (lint, block dangerous commands) at tool boundaries — automation the harness runs, not the model.
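
As an illustration of the kind of policy a hook can enforce, here is a sketch of a command check. The patterns and the `check_command` name are hypothetical, and how the result is wired into a hook depends on your harness, so check its documentation for the actual contract.

```python
import re

# Hypothetical deny-list; extend it to match your own policy.
BLOCKED_PATTERNS = [
    (re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r"),
     "recursive force delete"),
    (re.compile(r"\bgit\s+push\b.*--force"), "force push"),
    (re.compile(r"\bcurl\b.*\|\s*(sh|bash)\b"), "piping a download into a shell"),
]


def check_command(command: str) -> tuple:
    """Return (allowed, reason) for a shell command the agent wants to run."""
    for pattern, reason in BLOCKED_PATTERNS:
        if pattern.search(command):
            return False, reason
    return True, "ok"
```

A deny-list like this is a backstop, not a sandbox: it catches obvious mistakes, and determined or obfuscated commands will get past it.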

Tip: If you explain the same multi-step routine twice, make it a skill. The third time you'll wish you had.

Security priority

MCP servers, used securely

MCP exposes external tools/data to the agent. Treat every server as third-party code to which you are granting tool access.
Prefer read-only / least-privilege scopes; separate a "reader" MCP from any "writer" MCP.
Pin and review the server source; don't auto-connect unvetted community servers in an agent with write/exec tools.
Keep secrets in env/secret stores, never in prompts or committed config.
Prompt-injection is real: data an MCP returns can contain instructions. Don't let a sub-agent both ingest untrusted web/content AND hold destructive tools.

Caution: An autonomous workflow + a write-capable MCP + untrusted input = the classic confused-deputy problem. Gate irreversible actions behind a human approval step.
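
Both rules (never pair untrusted input with destructive tools; gate irreversible actions) can be linted before a fleet runs. A minimal sketch; the tool names and the `AgentSpec` shape are hypothetical, not any MCP or Claude Code schema.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical tool names; substitute the ones your servers actually expose.
DESTRUCTIVE_TOOLS = {"write_file", "shell_exec", "deploy"}


@dataclass
class AgentSpec:
    role: str
    reads_untrusted: bool = False
    tools: set = field(default_factory=set)


def policy_violations(spec: AgentSpec) -> list:
    """Flag a role that both ingests untrusted content and holds destructive tools."""
    risky = sorted(spec.tools & DESTRUCTIVE_TOOLS)
    if spec.reads_untrusted and risky:
        return [f"{spec.role}: ingests untrusted input and holds {', '.join(risky)}"]
    return []


def require_approval(action: str, approved_by: Optional[str]) -> None:
    """Refuse an irreversible action unless a named human has approved it."""
    if not approved_by:
        raise PermissionError(f"'{action}' needs human approval before it runs")
```

Running `policy_violations` over every sub-agent definition at start-up turns the confused-deputy warning into a check that fails before any tool is called.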

Tradeoffs

Modalities & token cost

Modality	Best for	Token cost
Copy/paste chat	one-off reasoning, no repo	low
Inline autocomplete	local, in-flow edits	low
CLI single-agent	scoped repo tasks	medium
CLI + sub-agent workflow	multi-step PoCs	high

Orchestration is the most capable and most expensive tier — use it when isolation/parallelism earns its keep.
Watch per-session and weekly limits; sub-agent fan-out spends both fast.
Opus is priced at a premium over Sonnet and Haiku; budget the fleet accordingly.

Practice

Do / Avoid

Do

Give each role a written hand-off artifact (PLAN.md, CHANGES.md).
Scope sub-agents to specific dirs/tools.
Reconcile conflicting sub-agent outputs yourself.
Batch user edits via the collab JSON loop.
Gate irreversible/exec actions behind approval.

Avoid

Spawning sub-agents that each re-scan the whole repo.
Opus-everywhere when Sonnet/Haiku suffice.
Letting one agent ingest untrusted data and hold write tools.
One-by-one "change this bit" prompts on a UI.
Vague roles with no acceptance criteria.

Output format

Artifact-driven orchestration

For dense results, ask for an HTML artifact or SVG diagram instead of a wall of text — easier to review and to wrap for collab.
A workflow diagram (roles → hand-offs) is the fastest way to verify the orchestration you intended.
reveal.js v6+ with vertical slides = progressive depth: the deeper the vertical stack, the more technical the detail.

Diagram request

Draw an SVG flow of this workflow: planner → implementer →
tester → reviewer, with the artifact each hands off labeled
on the arrows. Mark the human-approval gate before deploy.

claude.ai vs Claude Code

Where to orchestrate

claude.ai (web): great for design, artifacts, the collab wrapper, and reasoning — but no repo, no sub-agent fleet, no MCP file/exec tools.
Claude Code (CLI): where true orchestration lives — sub-agents, workflows, MCP, skills, plugins, hooks.
Common pattern: design + collab in web, then export JSON / a spec into Claude Code for the batch implementation pass.
The response "styles" feature appears to be migrating into Skills — verify in your own account; either way, encode tone in the role prompt.

Honest framing

How this could be better

These are my working patterns, not gospel — I want sharper ones.
Open questions: best sub-agent budgeting heuristics; cleaner artifact contracts between roles; tighter MCP sandboxing defaults.
Cross-tool: the role/workflow ideas should port to Codex and Antigravity. Compare usage limits, pricing premiums, and terms of use before committing a fleet.

Tip: Treat every orchestration as an experiment: log what you delegated, what it cost, and whether the isolation paid off.
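
One lightweight way to keep that log is a JSON Lines record per delegation. A sketch under assumptions: the field names are hypothetical, and the token figure is whatever your tool reports.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DelegationRecord:
    role: str                 # what you delegated to
    task: str                 # what you asked for
    tokens_spent: int         # what it cost
    isolation_paid_off: bool  # your own verdict afterwards


def to_jsonl(records: list) -> str:
    """Serialize records one JSON object per line, with stable key order."""
    return "\n".join(json.dumps(asdict(record), sort_keys=True) for record in records)
```

Appending the output to a file in the repo gives you a history to check your budgeting heuristics against.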