AI-Augmented Development & Research Tutorial
AI-Augmented Development & Research Tutorial

Claude Code Power Use — Developer / RSE

Who this is for: Software engineers and Research Software Engineers driving Claude Code from the CLI — orchestrating sub-agents and workflows, extending it with skills & plugins, wiring MCP securely, and tuning model & effort for cost-aware, viable PoCs.

1 Co-development modalities & tokens

Pick the lightest tool that fits the job. Token cost roughly climbs as you move down this list.

ModalityBest forToken cost
Chat copy/pasteOne-off snippets, explanation, design talkLow
Inline autocompleteLocal, line-level edits in flowLow
CLI single-agentMulti-file edits with repo contextMed
CLI + sub-agent workflowParallel research, big refactors, reviewsHigh
Tip: Sub-agents run in isolated context windows — they keep the main thread lean, but each one spends its own tokens. Fan out only when parallelism or context isolation actually pays.

2 Model & effort selection

ModelReach for it when…
OpusArchitecture, gnarly debugging, multi-file reasoning, orchestration brain
SonnetDaily driver: feature work, refactors, solid cost/quality balance
HaikuCheap, fast: lint fixups, formatting, classification, bulk sub-agent grunt work

Effort / reasoning

  • low — mechanical edits, known patterns
  • medium — most feature work (default)
  • high — design, tricky bugs, security review
  • xhigh — rare deep reasoning; budget the tokens
Caution: High effort on a trivial task burns tokens without buying quality. Match effort to difficulty, not to anxiety.

3 Sub-agents & workflows

  • Sub-agent = a delegated task in its own context window with its own tools/model.
  • Give each a role: explorer, implementer, reviewer, test-writer.
  • Orchestrate: main agent plans → fans out → synthesizes results.
  • Use cheaper models (Haiku/Sonnet) for narrow sub-tasks; keep Opus as the planner.
Copyable prompt
Plan a refactor of the auth module. Spawn three
sub-agents in parallel:
1) map every call site of legacy login()
2) draft the new token flow + tests
3) review both for security regressions
Then synthesize a single plan. Do NOT edit files
until I approve the plan.
Tip: Ask for a plan first, approve, then implement. Cheaper than letting a workflow run wide and redoing it.

4 Skills & plugins

  • Skills — packaged instructions + scripts Claude loads on demand (progressive disclosure). Great for repeatable team conventions.
  • Plugins — bundle commands, agents, skills, hooks & MCP config into one installable unit.
  • Hooks — deterministic automation on events (PreToolUse, PostToolUse, Stop): block dangerous commands, run formatters, gate commits.
  • Scope to the repo so the whole team inherits the same behavior.
Copyable prompt
Create a skill that captures our PR conventions:
conventional-commit messages, run `make test`
before claiming done, and require a CHANGELOG
entry. Make it trigger on commit/PR requests.
Tip: A good skill description is what makes it auto-trigger. Write it for the situation, not the mechanism.

5 MCP servers — use them securely

  • MCP exposes external tools/data (DBs, APIs, browsers) to the agent.
  • Least privilege: read-only tokens & scoped creds first; grant writes only when needed.
  • Vet the server: prefer first-party/audited; pin versions; review what tools it exposes.
  • Secrets: env vars or a secret store — never paste keys into prompts or commit them.
  • Treat tool output as untrusted input — guard against prompt injection from fetched web/DB content.
  • Keep an allowlist; require confirmation for destructive tools.
Caution: An MCP tool that can read your filesystem and reach the network is an exfiltration path. Isolate or sandbox those.

6 Artifact-driven interaction

  • For dense/long answers, ask for a self-contained HTML artifact or SVG diagram instead of a wall of text.
  • Great for architecture maps, sequence diagrams, data-flow, runbooks.
  • Artifacts are reviewable, printable, and easy to drop into docs.
Copyable prompt
Summarize the request lifecycle as a single
self-contained HTML file with an inline SVG
sequence diagram (no external assets). Label
each hop and mark where auth happens.
Tip: "Self-contained, no CDN" keeps artifacts offline-usable and print-clean — exactly like this sheet.

7 The "Collab Wrapper" — batch edits instead of one-by-one prompts

Wrap a reveal.js deck or UI so reviewers click any element to add a comment or edit its text. An Export-to-JSON button emits every requested change with stable selectors. Paste that JSON into one Claude Code prompt and apply all changes in a single batch pass — no more "the heading near the middle of slide 4."

Why it wins

  • Kills ambiguity/deixis — selectors say exactly which element, so you never describe a page region in prose.
  • Far fewer human round-trips — one batch pass replaces dozens of back-and-forth turns.
  • Avoids the per-turn context re-reads that each fresh prompt would trigger.
  • Non-coders can drive the review; you apply.

Watch for

  • The win is round-trips and clarity, not raw output tokens — emitting N edits costs about the same either way.
  • Batch size is bounded by the context window and edit coherence — chunk very large change-sets.
  • Keep selectors stable (IDs/data-attrs) so edits map cleanly.
  • Review the diff — batch apply still needs a human gate.
Copyable prompt
Here is exported JSON of review edits from the deck
wrapper (array of {selector, action, value}). Apply
ALL of them in one pass to slides.html. For "edit"
replace text; for "comment" add a TODO note beside
the element. Show me a unified diff before saving.

8 claude.ai vs Claude Code

  • claude.ai (web): chat, artifacts, research, fast iteration; no direct repo/file/tool execution.
  • Claude Code (CLI): reads/edits your files, runs commands, orchestrates sub-agents, MCP, hooks.
  • The response "styles" feature appears to be migrating into Skills — verify in your own account rather than relying on it.
  • Most prompting patterns are model-agnostic — they transfer to other AI coding tools too.
Tip: Prototype the conversation in web chat, then hand the agreed plan to Claude Code to execute against the repo.

9 Usage model & cost awareness

  • Plans carry per-session and weekly limits; premium models (Opus) draw down faster.
  • Wide sub-agent fan-out and high effort are the biggest token multipliers.
  • Check Terms of Use before piping proprietary/sensitive code or data through any provider.
  • Comparatively, OpenAI Codex and Google Antigravity have their own quota/premium models — the same frugality habits transfer.
Caution: Don't run xhigh-effort Opus workflows on a Friday if you need quota for the rest of the week. Budget deliberately.

10 Do / Avoid

Do

  • Plan → approve → implement, especially for multi-file work.
  • Start with Sonnet/medium; escalate to Opus/high only when stuck.
  • Give sub-agents narrow roles and the cheapest capable model.
  • Encode conventions as skills/hooks so the team inherits them.
  • Scope MCP creds to least privilege; review diffs before commit.
  • Request HTML/SVG artifacts for dense answers.
  • Batch UI edits via the collab-wrapper JSON.

Avoid

  • Defaulting every task to Opus + xhigh effort.
  • Letting workflows edit files before a plan is approved.
  • Pasting API keys/secrets into prompts or commits.
  • Trusting MCP/web tool output as safe instructions.
  • Describing UI regions in prose when a selector exists.
  • Re-prompting one tiny change at a time across many turns.

11 PoC workflow recipe

  1. Define the persona & success criteria for the PoC.
  2. Opus/high: produce a plan + risk list. Approve it.
  3. Fan out implementer + test-writer sub-agents.
  4. Reviewer sub-agent (security + correctness) before merge.
  5. Generate an HTML artifact summarizing the architecture.
  6. Iterate via collab-wrapper batch edits.
Tip: Vertical reveal.js stacks = depth on demand. Keep the top slide high-level; push technical detail down the vertical stack.

12 How this could be better honest

  • Token accounting is still coarse — better per-sub-agent cost telemetry would sharpen fan-out decisions.
  • MCP trust boundaries deserve formal sandboxing patterns, not just discipline.
  • Collab-wrapper selectors break on heavy DOM churn — a more durable anchoring scheme is wanted.
  • These are current practices, not gospel — share better ones.
Tip: Treat every recipe here as a starting hypothesis. Measure, then adjust effort and orchestration to what your repo actually rewards.