AI-Augmented Development & Research Tutorial

The Orchestrator

Sub-agents · workflows · agent roles · personas · the collab iteration loop

Who this is for: the builder who has outgrown single-prompt chats — you direct a CLI agent that spawns sub-agents, assigns roles, runs multi-step workflows, and iterates with users on real PoCs. You care about results, cost, and security in equal measure.

Orchestration is delegation with intent: you decompose a goal into roles, hand each role the minimum context and tools it needs, and keep a tight feedback loop with the humans you build for. The cards below are the working patterns — wiring, model/effort budgeting, the collab wrapper, and the security guardrails that keep an autonomous fleet trustworthy.

Mental model

Roles vs. Personas vs. Sub-agents

  • Persona = the human you serve (scientist, engineer, reviewer). Shapes tone, defaults, and what "done" means.
  • Agent role = a job description you assign Claude: planner, implementer, tester, security reviewer, doc writer.
  • Sub-agent = the runtime instance of a role — its own context window, tools, and system prompt, spawned by the orchestrator.
  • You are the orchestrator: you don't do the work, you route it and integrate the results.
Tip: Name roles explicitly in the prompt. "Act as the test-author sub-agent" gives the model a frame and keeps its output scoped.
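
A minimal sketch of the three terms as data, in case a concrete shape helps. The class names, fields, and the `spawn` helper are all hypothetical; they are not part of any real agent SDK.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Persona:
    """The human being served; shapes tone and what "done" means."""
    name: str
    done_means: str


@dataclass(frozen=True)
class Role:
    """A job description: what the agent does and which tools it may use."""
    name: str
    system_prompt: str
    allowed_tools: Tuple[str, ...] = ()


@dataclass
class SubAgent:
    """A runtime instance of a role with its own, initially empty, context."""
    role: Role
    context: List[str] = field(default_factory=list)

    def framing(self) -> str:
        # Naming the role explicitly gives the model a frame and scopes its output.
        return f"Act as the {self.role.name} sub-agent. {self.role.system_prompt}"


def spawn(role: Role) -> SubAgent:
    """The orchestrator routes work to a fresh instance; it does not do the work."""
    return SubAgent(role=role)
```

Two sub-agents spawned from the same role share a job description but never a context list, which is the isolation property the next card relies on.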

Why delegate

What sub-agents buy you

  • Context isolation: a search-heavy sub-agent burns tokens in its own window, returns only a summary, and keeps your main thread clean.
  • Parallelism: fan out independent tasks (explore arch / map files / list risks) at once.
  • Specialization: a security-reviewer role with a focused prompt beats a generalist doing everything.
  • Cheaper main loop: push grunt work to Haiku/Sonnet sub-agents; reserve Opus for synthesis.
Caution: Each sub-agent re-reads context from scratch. Five sub-agents that each re-scan the repo can cost more than one well-scoped pass. Delegate for isolation/parallelism, not reflexively.
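
The caution above is easy to check with back-of-envelope arithmetic. The sketch below uses made-up token counts (a 200k-token repo, 5k tokens per task, 20k-token scoped slices); only the shape of the comparison matters, not the numbers.

```python
def input_tokens(n_agents: int, read_tokens_each: int, task_tokens_each: int) -> int:
    """Rough input-token total: every agent pays for what it reads plus its task."""
    return n_agents * (read_tokens_each + task_tokens_each)


# All figures are illustrative assumptions, not measurements.
REPO_TOKENS = 200_000   # cost of scanning the whole repo once
TASK_TOKENS = 5_000     # prompt + working notes per task
SLICE_TOKENS = 20_000   # cost of reading one scoped directory

# Five sub-agents, each re-scanning the whole repo.
full_rescan = input_tokens(5, REPO_TOKENS, TASK_TOKENS)      # 1_025_000

# One well-scoped pass that reads the repo once and handles all five tasks.
single_pass = input_tokens(1, REPO_TOKENS, 5 * TASK_TOKENS)  # 225_000

# Five sub-agents, each limited to its own slice.
scoped_fanout = input_tokens(5, SLICE_TOKENS, TASK_TOKENS)   # 125_000
```

Under these assumptions the reflexive fan-out costs over four times the single pass, while the scoped fan-out is the cheapest of the three and still buys isolation and parallelism.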

Copyable

Orchestration prompt skeletons

Spawn a scoped sub-agent

Spawn a sub-agent with the "data-loader" role.
Scope: ONLY read src/io/ and write a summary of every file
format we parse. Do not edit code. Return: a table of
{format, entrypoint, known gaps}. Budget: keep it tight;
run on Sonnet at medium effort.

Fan-out, then synthesize (orchestrator pattern)

Run three sub-agents in parallel, then YOU synthesize:
  1. architecture-mapper → how modules depend on each other
  2. risk-scanner → top 5 correctness/security risks
  3. test-gap-finder → untested critical paths
After all three return, reconcile conflicts and give me
ONE ranked action list. Flag any disagreement between agents.
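
If you drive the fan-out from a script rather than a prompt, the same pattern can be sketched with `asyncio`. `run_subagent` below is a hypothetical stand-in that only simulates a sub-agent call; swap in whatever your harness actually exposes.

```python
import asyncio
from typing import Dict, List


async def run_subagent(role: str, task: str) -> dict:
    """Stand-in for a real sub-agent call; returns only a short summary."""
    await asyncio.sleep(0)  # placeholder for the actual model/tool work
    return {"role": role, "summary": f"{role} finished: {task}"}


async def fan_out(tasks: Dict[str, str]) -> List[dict]:
    # gather runs the independent roles concurrently and preserves input order.
    return await asyncio.gather(*(run_subagent(r, t) for r, t in tasks.items()))


def synthesize(results: List[dict]) -> str:
    # The orchestrator integrates summaries; real reconciliation logic goes here.
    return "\n".join(
        f"{i}. [{r['role']}] {r['summary']}" for i, r in enumerate(results, 1)
    )


TASKS = {
    "architecture-mapper": "how modules depend on each other",
    "risk-scanner": "top 5 correctness/security risks",
    "test-gap-finder": "untested critical paths",
}

if __name__ == "__main__":
    print(synthesize(asyncio.run(fan_out(TASKS))))
```

The main thread only ever sees the three summaries, never the sub-agents' working context.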

Role hand-off with explicit contract

Workflow: planner → implementer → tester → reviewer.
Each role hands off a written artifact the next consumes:
  planner   → PLAN.md (numbered steps, acceptance criteria)
  implementer → code + CHANGES.md
  tester    → test results, pass/fail per criterion
  reviewer  → review notes, must explicitly APPROVE or BLOCK
Do not let a role start until the prior artifact exists.
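
The "prior artifact must exist" rule can also be enforced outside the prompt. A minimal sketch follows; `PLAN.md` and `CHANGES.md` come from the contract above, while `TEST_RESULTS.md` and `REVIEW.md` are file names assumed here for illustration.

```python
from pathlib import Path
from typing import Optional, Set

# (role, artifact it must hand off), in workflow order.
WORKFLOW = [
    ("planner", "PLAN.md"),
    ("implementer", "CHANGES.md"),
    ("tester", "TEST_RESULTS.md"),  # assumed file name
    ("reviewer", "REVIEW.md"),      # assumed file name
]


def existing_artifacts(workdir: Path) -> Set[str]:
    """Collect which hand-off artifacts are already on disk."""
    return {name for _, name in WORKFLOW if (workdir / name).exists()}


def next_role(existing: Set[str]) -> Optional[str]:
    """Return the first role whose artifact is missing, or None when all exist."""
    for role, artifact in WORKFLOW:
        if artifact not in existing:
            return role
    return None


def verdict(review_text: str) -> str:
    """The reviewer must explicitly APPROVE or BLOCK; anything else is an error."""
    has_approve = "APPROVE" in review_text
    has_block = "BLOCK" in review_text
    if has_approve == has_block:
        raise ValueError("review must contain exactly one of APPROVE or BLOCK")
    return "APPROVE" if has_approve else "BLOCK"
```

Calling `next_role(existing_artifacts(Path(".")))` before each spawn keeps a role from starting on an artifact that was never written.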

Decision

Model & effort selection

Job                                     | Model        | Effort
Plan / architect / reconcile sub-agents | Opus         | high–xhigh
Implement well-specified code           | Sonnet       | medium–high
Search, summarize, mechanical edits     | Haiku/Sonnet | low–medium
Security / correctness review           | Opus         | high
  • Match effort to ambiguity, not size. A 3-line change in a subtle race is high-effort.
  • Run the orchestrator on a strong model; let it down-shift sub-agents.
Tip: Most of these patterns are model-agnostic — the role/workflow structure transfers to other AI coding tools even if names differ.
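
One way to make the selection rule mechanical is a small lookup keyed by job type. The job labels and lowercase model labels below are hypothetical shorthand, not API identifiers, and choosing Haiku as the default for search work is an assumption.

```python
from typing import NamedTuple, Tuple


class Budget(NamedTuple):
    model: str
    effort_floor: str    # default effort for a clear-cut task
    effort_ceiling: str  # effort when the task is ambiguous or subtle


# Mirrors the Job / Model / Effort rows; keys are illustrative labels.
BUDGETS = {
    "plan": Budget("opus", "high", "xhigh"),
    "implement": Budget("sonnet", "medium", "high"),
    "search": Budget("haiku", "low", "medium"),
    "review": Budget("opus", "high", "high"),
}


def pick(job: str, ambiguous: bool = False) -> Tuple[str, str]:
    """Match effort to ambiguity, not size: ambiguity moves effort to the ceiling."""
    budget = BUDGETS[job]
    effort = budget.effort_ceiling if ambiguous else budget.effort_floor
    return budget.model, effort
```

A 3-line fix in a subtle race would be `pick("implement", ambiguous=True)`, which lands on Sonnet at high effort rather than medium.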

Signature technique

The Collab Wrapper loop

  • Wrap a reveal.js deck or a UI so the user can select any element and attach a comment or edit its text in place.
  • An Export-to-JSON button serializes every requested change with a stable selector/ID per element.
  • Paste that JSON into a single Claude Code prompt → the agent applies all edits in one batch pass.
  • Kills the "which part of page 7 do you mean?" round-trips that dominate one-by-one prompting.

Batch-apply prompt

Here is exported collab JSON. Apply EVERY change in one pass.
Each item = {selector, type: "edit"|"comment", value}.
For "edit", replace that element's text. For "comment",
treat as an instruction to revise that element. Make all
edits, then list what you changed by selector.
Tip: Stable element IDs are the whole trick. Generate them when you build the artifact so the JSON round-trips cleanly.
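
To show why stable IDs matter, here is a rough sketch of the batch pass over exported JSON. It assumes the artifact can be viewed as a selector-to-text mapping, which is a simplification; the item shape `{selector, type, value}` is the one from the prompt above.

```python
import json
from typing import Dict, List, Tuple


def apply_collab(
    elements: Dict[str, str], exported_json: str
) -> Tuple[Dict[str, str], List[Tuple[str, str]], List[str]]:
    """Apply every edit in one pass.

    Returns (updated elements, comments still needing a revision, unknown selectors).
    """
    updated = dict(elements)  # never mutate the caller's copy
    comments: List[Tuple[str, str]] = []
    unknown: List[str] = []
    for item in json.loads(exported_json):
        selector = item["selector"]
        if selector not in updated:
            unknown.append(selector)  # the ID did not round-trip
        elif item["type"] == "edit":
            updated[selector] = item["value"]
        elif item["type"] == "comment":
            comments.append((selector, item["value"]))
        else:
            raise ValueError(f"unknown change type: {item['type']!r}")
    return updated, comments, unknown
```

Edits are applied directly, comments are queued as instructions for the agent, and any selector that no longer resolves is reported instead of being silently dropped.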

Extending the agent

Skills & plugins

  • Skills = reusable, model-invoked capability packets (instructions + scripts) Claude loads on demand. Great for encoding a workflow once.
  • Plugins bundle skills, slash-commands, sub-agent definitions, hooks, and MCP servers into one installable unit.
  • As orchestrator, package recurring role+workflow combos as a plugin so the whole fleet ships with the repo.
  • Hooks let you enforce policy (lint, block dangerous commands) at tool boundaries — automation the harness runs, not the model.
Tip: If you explain the same multi-step routine twice, make it a skill. The third time you'll wish you had.
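
The policy half of a hook can be a plain function the harness calls before running a shell command. The sketch below shows only that check; the deny patterns are examples, and how the function gets wired into a specific harness's hook mechanism is deliberately left out because it depends on the tool.

```python
import re
from typing import Tuple

# Illustrative deny list; extend to match your own policy.
DENY_PATTERNS = [
    (re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+"), "recursive force delete"),
    (re.compile(r"\bgit\s+push\b.*\s(--force|-f)(\s|$)"), "force push"),
    (re.compile(r"\b(curl|wget)\b[^|]*\|\s*(sh|bash)\b"), "piping a download into a shell"),
]


def check_command(command: str) -> Tuple[bool, str]:
    """Return (allowed, reason). Runs in the harness, so the model cannot skip it."""
    for pattern, reason in DENY_PATTERNS:
        if pattern.search(command):
            return False, f"blocked: {reason}"
    return True, "ok"
```

Because the check runs at the tool boundary, it holds even when a prompt or an injected instruction asks the model to ignore it.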

Security priority

MCP servers, used securely

  • MCP exposes external tools/data to the agent. Treat every server as code you are granting tool access.
  • Prefer read-only / least-privilege scopes; separate a "reader" MCP from any "writer" MCP.
  • Pin and review the server source; don't auto-connect unvetted community servers in an agent with write/exec tools.
  • Keep secrets in env/secret stores, never in prompts or committed config.
  • Prompt-injection is real: data an MCP returns can contain instructions. Don't let a sub-agent both ingest untrusted web/content AND hold destructive tools.
Caution: An autonomous workflow + a write-capable MCP + untrusted input = the classic confused-deputy problem. Gate irreversible actions behind a human approval step.
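
Both rules, "no agent holds untrusted input and destructive tools at once" and "irreversible actions need approval", can be checked before a fleet starts. A minimal sketch, with tool names and the grant structure invented for illustration:

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List

# Hypothetical tool names; substitute whatever your MCP servers expose.
DESTRUCTIVE_TOOLS = frozenset({"write_file", "exec", "deploy"})


@dataclass(frozen=True)
class AgentGrant:
    name: str
    reads_untrusted: bool  # ingests web pages, tickets, or other outside content
    tools: FrozenSet[str]


def violations(grants: Iterable[AgentGrant]) -> List[str]:
    """Names of agents that both ingest untrusted data and hold destructive tools."""
    return [g.name for g in grants if g.reads_untrusted and g.tools & DESTRUCTIVE_TOOLS]


def run_irreversible(action: str, approved_by: str = "") -> str:
    """Refuse any irreversible action that lacks a named human approver."""
    if not approved_by:
        raise PermissionError(f"{action!r} requires human approval")
    return f"{action} approved by {approved_by}"
```

Splitting a "reader" grant from a "writer" grant makes `violations` return an empty list, which is the configuration the bullets above recommend.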

Tradeoffs

Modalities & token cost

Modality                 | Best for                   | Token cost
Copy/paste chat          | one-off reasoning, no repo | low
Inline autocomplete      | local, in-flow edits       | low
CLI single-agent         | scoped repo tasks          | medium
CLI + sub-agent workflow | multi-step PoCs            | high
  • Orchestration is the most capable and most expensive tier — use it when isolation/parallelism earns its keep.
  • Watch per-session and weekly limits; sub-agent fan-out spends both fast.
  • Opus is priced at a premium over Sonnet/Haiku; budget the fleet accordingly.

Practice

Do / Avoid

Do

  • Give each role a written hand-off artifact (PLAN.md, CHANGES.md).
  • Scope sub-agents to specific dirs/tools.
  • Reconcile conflicting sub-agent outputs yourself.
  • Batch user edits via the collab JSON loop.
  • Gate irreversible/exec actions behind approval.

Avoid

  • Spawning sub-agents that each re-scan the whole repo.
  • Opus-everywhere when Sonnet/Haiku suffice.
  • Letting one agent ingest untrusted data and hold write tools.
  • One-by-one "change this bit" prompts on a UI.
  • Vague roles with no acceptance criteria.

Output format

Artifact-driven orchestration

  • For dense results, ask for an HTML artifact or SVG diagram instead of a wall of text — easier to review and to wrap for collab.
  • A workflow diagram (roles → hand-offs) is the fastest way to verify the orchestration you intended.
  • reveal.js v6+ with vertical slides = progressive depth: the deeper the vertical stack, the more technical the detail.

Diagram request

Draw an SVG flow of this workflow: planner → implementer →
tester → reviewer, with the artifact each hands off labeled
on the arrows. Mark the human-approval gate before deploy.

claude.ai vs Claude Code

Where to orchestrate

  • claude.ai (web): great for design, artifacts, the collab wrapper, and reasoning — but no repo, no sub-agent fleet, no MCP file/exec tools.
  • Claude Code (CLI): where true orchestration lives — sub-agents, workflows, MCP, skills, plugins, hooks.
  • Common pattern: design + collab in web, then export JSON / a spec into Claude Code for the batch implementation pass.
  • The response "styles" feature appears to be migrating into Skills — verify in your own account; either way, encode tone in the role prompt.

Honest framing

How this could be better

  • These are my working patterns, not gospel — I want sharper ones.
  • Open questions: best sub-agent budgeting heuristics; cleaner artifact contracts between roles; tighter MCP sandboxing defaults.
  • Cross-tool: the role/workflow ideas should port to Codex and Antigravity; compare limits, pricing premiums, and terms of use before committing a fleet.
Tip: Treat every orchestration as an experiment: log what you delegated, what it cost, and whether the isolation paid off.
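
A minimal sketch of such a log entry, assuming one JSON line per delegation. The field names and the ratio used as a "did isolation pay off" signal are my own choices, not an established metric.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DelegationRecord:
    role: str
    task: str
    tokens_spent: int     # burned inside the sub-agent's own window
    tokens_returned: int  # size of the summary handed back to the main thread


def to_log_line(record: DelegationRecord) -> str:
    """Serialize one delegation as a JSON line, ready to append to a log file."""
    return json.dumps(asdict(record), sort_keys=True)


def isolation_ratio(record: DelegationRecord) -> float:
    """Tokens kept out of the main thread per token that came back."""
    return record.tokens_spent / max(record.tokens_returned, 1)
```

A high ratio suggests the sub-agent absorbed work the main thread never had to see; a ratio near 1 suggests the delegation bought little isolation.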