1 Co-development modalities & tokens
Pick the lightest tool that fits the job. Token cost roughly climbs as you move down the table.
| Modality | Best for | Token cost |
|---|---|---|
| Chat copy/paste | One-off snippets, explanation, design talk | Low |
| Inline autocomplete | Local, line-level edits in flow | Low |
| CLI single-agent | Multi-file edits with repo context | Med |
| CLI + sub-agent workflow | Parallel research, big refactors, reviews | High |
Tip: Sub-agents run in isolated context windows — they keep the main thread lean, but each one spends its own tokens. Fan out only when parallelism or context isolation actually pays.
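To make that trade-off concrete, here is a rough sketch of the arithmetic. The function name `fan_out_cost` and every token figure are hypothetical; real accounting depends on the tool and model.

```python
from dataclasses import dataclass


@dataclass
class FanOutEstimate:
    main_context_tokens: int  # what the main thread has to hold
    total_tokens_spent: int   # what is spent overall, across all windows


def fan_out_cost(main_tokens: int, sub_agent_work: list[int], summary_tokens: int) -> FanOutEstimate:
    """Estimate cost when each sub-agent hands back only a short summary.

    main_tokens:    tokens already in the main thread (hypothetical figure)
    sub_agent_work: tokens each sub-agent consumes in its own window
    summary_tokens: tokens each sub-agent returns to the main thread
    """
    returned = summary_tokens * len(sub_agent_work)
    return FanOutEstimate(
        main_context_tokens=main_tokens + returned,
        total_tokens_spent=main_tokens + sum(sub_agent_work) + returned,
    )


estimate = fan_out_cost(main_tokens=20_000, sub_agent_work=[40_000, 35_000, 50_000], summary_tokens=1_500)
print(estimate.main_context_tokens)  # 24500
print(estimate.total_tokens_spent)   # 149500
```

With these made-up numbers the main thread stays small while the total spend is roughly six times larger, which is the pattern the tip describes.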
2 Model & effort selection
| Model | Reach for it when… |
|---|---|
| Opus | Architecture, gnarly debugging, multi-file reasoning, orchestration brain |
| Sonnet | Daily driver: feature work, refactors, solid cost/quality balance |
| Haiku | Cheap, fast: lint fixups, formatting, classification, bulk sub-agent grunt work |
Effort / reasoning
- low — mechanical edits, known patterns
- medium — most feature work (default)
- high — design, tricky bugs, security review
- xhigh — rare deep reasoning; budget the tokens
Caution: High effort on a trivial task burns tokens without buying quality. Match effort to difficulty, not to anxiety.
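One way to keep that matching honest is to write it down as a lookup. The sketch below is an assumption-laden illustration: the `Difficulty` categories and the routing table are hypothetical, and the strings are labels from this sheet, not API model identifiers.

```python
from enum import Enum


class Difficulty(Enum):
    MECHANICAL = 1  # lint fixups, formatting, known patterns
    FEATURE = 2     # everyday feature work and refactors
    DESIGN = 3      # architecture, tricky bugs, security review
    DEEP = 4        # rare, genuinely hard reasoning


# Hypothetical routing table based on the two tables in this section; tune to taste.
ROUTES = {
    Difficulty.MECHANICAL: ("haiku", "low"),
    Difficulty.FEATURE: ("sonnet", "medium"),
    Difficulty.DESIGN: ("opus", "high"),
    Difficulty.DEEP: ("opus", "xhigh"),
}


def pick_model_and_effort(difficulty: Difficulty) -> tuple[str, str]:
    """Return a (model label, effort label) pair for the given difficulty."""
    return ROUTES[difficulty]


print(pick_model_and_effort(Difficulty.FEATURE))  # ('sonnet', 'medium')
```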
3 Sub-agents & workflows
- Sub-agent = a delegated task in its own context window with its own tools/model.
- Give each a role: explorer, implementer, reviewer, test-writer.
- Orchestrate: main agent plans → fans out → synthesizes results.
- Use cheaper models (Haiku/Sonnet) for narrow sub-tasks; keep Opus as the planner.
Copyable prompt
Plan a refactor of the auth module. Spawn three
sub-agents in parallel:
1) map every call site of legacy login()
2) draft the new token flow + tests
3) review both for security regressions
Then synthesize a single plan. Do NOT edit files
until I approve the plan.
Tip: Ask for a plan first, approve, then implement. Cheaper than letting a workflow run wide and redoing it.
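The plan, fan out, synthesize loop can be sketched in a few lines. This is a toy illustration only: `run_sub_agent` is a hypothetical stand-in that echoes its assignment, and a real orchestrator would call an actual agent and hand the results to a planner model.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sub_agent(role: str, task: str) -> str:
    """Placeholder for a real sub-agent call; it only echoes its assignment."""
    return f"[{role}] finished: {task}"


def orchestrate(tasks: dict[str, str]) -> str:
    """Fan out one task per role in parallel, then merge the results into one plan."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {role: pool.submit(run_sub_agent, role, task) for role, task in tasks.items()}
        results = {role: future.result() for role, future in futures.items()}
    # Synthesis step: here a plain join, in practice a planner-model call.
    plan = "\n".join(results[role] for role in tasks)
    return plan + "\nSTATUS: awaiting approval, no files edited"


plan = orchestrate({
    "explorer": "map every call site of legacy login()",
    "implementer": "draft the new token flow + tests",
    "reviewer": "review both for security regressions",
})
print(plan)
```

The final status line mirrors the approval gate in the prompt: nothing is written until a human signs off on the synthesized plan.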
4 Skills & plugins
- Skills — packaged instructions + scripts Claude loads on demand (progressive disclosure). Great for repeatable team conventions.
- Plugins — bundle commands, agents, skills, hooks & MCP config into one installable unit.
- Hooks — deterministic automation on events (PreToolUse, PostToolUse, Stop): block dangerous commands, run formatters, gate commits.
- Scope to the repo so the whole team inherits the same behavior.
Copyable prompt
Create a skill that captures our PR conventions:
conventional-commit messages, run `make test`
before claiming done, and require a CHANGELOG
entry. Make it trigger on commit/PR requests.
Tip: A good skill description is what makes it auto-trigger. Write it for the situation, not the mechanism.
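As an illustration of a PreToolUse hook that blocks dangerous commands, here is a minimal sketch of the decision logic. It assumes the hook receives a JSON payload with `tool_name` and `tool_input.command`, and that exit code 2 signals "block"; verify both against the current hooks documentation before relying on them. The deny-list patterns are illustrative, not a complete policy.

```python
import re

# Illustrative deny-list; a real team would maintain its own patterns.
DANGEROUS_PATTERNS = [
    r"\brm\s+-rf\b",
    r"\bgit\s+push\s+.*--force\b",
    r"\bgit\s+reset\s+--hard\b",
]


def is_dangerous(command: str) -> bool:
    """True if the shell command matches any deny-list pattern."""
    return any(re.search(pattern, command) for pattern in DANGEROUS_PATTERNS)


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one hook event (payload shape is assumed)."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    if is_dangerous(command):
        return 2, f"Blocked by team policy: {command}"  # 2 = assumed "block" code
    return 0, ""


# In a real hook script the payload would come from stdin, e.g. json.load(sys.stdin).
sample = {"tool_name": "Bash", "tool_input": {"command": "rm -rf build/"}}
print(decide(sample))  # (2, 'Blocked by team policy: rm -rf build/')
```

Keeping the decision in a pure function makes the policy easy to unit-test separately from the hook wiring.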
5 MCP servers — use them securely
- MCP exposes external tools/data (DBs, APIs, browsers) to the agent.
- Least privilege: read-only tokens & scoped creds first; grant writes only when needed.
- Vet the server: prefer first-party/audited; pin versions; review what tools it exposes.
- Secrets: env vars or a secret store — never paste keys into prompts or commit them.
- Treat tool output as untrusted input — guard against prompt injection from fetched web/DB content.
- Keep an allowlist; require confirmation for destructive tools.
Caution: An MCP tool that can read your filesystem and reach the network is an exfiltration path. Isolate or sandbox those.
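The allowlist-plus-confirmation rule above can be sketched as a small gate in front of every tool call. The tool names and the `authorize` helper are hypothetical; substitute whatever tools your servers actually expose.

```python
from typing import Callable

# Hypothetical tool names, for illustration only.
ALLOWED_TOOLS = {"db.query_readonly", "docs.search", "db.delete_rows"}
DESTRUCTIVE_TOOLS = {"db.delete_rows"}


def authorize(tool: str, confirm: Callable[[str], bool]) -> bool:
    """Allow a call only if the tool is allowlisted and, when destructive, confirmed."""
    if tool not in ALLOWED_TOOLS:
        return False
    if tool in DESTRUCTIVE_TOOLS:
        return confirm(tool)  # e.g. prompt a human; here injected for testing
    return True


def deny_all(tool: str) -> bool:
    """Stand-in confirmation callback that always refuses."""
    return False


print(authorize("docs.search", deny_all))     # True
print(authorize("db.delete_rows", deny_all))  # False
print(authorize("shell.exec", deny_all))      # False
```

A gate like this covers the allowlist and confirmation bullets only; it does nothing about prompt injection in tool output or about sandboxing, which need their own controls.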
6 Artifact-driven interaction
- For dense/long answers, ask for a self-contained HTML artifact or SVG diagram instead of a wall of text.
- Great for architecture maps, sequence diagrams, data-flow, runbooks.
- Artifacts are reviewable, printable, and easy to drop into docs.
Copyable prompt
Summarize the request lifecycle as a single
self-contained HTML file with an inline SVG
sequence diagram (no external assets). Label
each hop and mark where auth happens.
Tip: "Self-contained, no CDN" keeps artifacts offline-usable and print-clean — exactly like this sheet.
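A quick way to check the "self-contained" claim on a generated artifact is to scan it for remote `src`/`href` values. The sketch below uses only the standard library; the class and function names are made up for this example, and the sample URL is a placeholder.

```python
from html.parser import HTMLParser


class ExternalAssetFinder(HTMLParser):
    """Collect src/href values on non-anchor tags that point at a remote host."""

    def __init__(self) -> None:
        super().__init__()
        self.external: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            return  # plain links are navigation, not assets needed to render
        for name, value in attrs:
            if name in ("src", "href") and value and value.startswith(("http://", "https://", "//")):
                self.external.append(value)


def external_assets(html: str) -> list[str]:
    """Return remote asset URLs referenced by the given HTML string."""
    finder = ExternalAssetFinder()
    finder.feed(html)
    return finder.external


page = (
    '<html><head><link rel="stylesheet" href="https://cdn.example.com/x.css"></head>'
    '<body><svg></svg><a href="https://example.com">docs</a></body></html>'
)
print(external_assets(page))  # ['https://cdn.example.com/x.css']
```

This only inspects tag attributes; it would miss remote references inside CSS (`url(...)`, `@import`) or scripts, so treat an empty result as a hint rather than proof.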
7 The "Collab Wrapper" — batch edits instead of one-by-one prompts
Wrap a reveal.js deck or UI so reviewers click any element to add a comment or edit its text. An Export-to-JSON button emits every requested change with stable selectors. Paste that JSON into one Claude Code prompt and apply all changes in a single batch pass — no more "the heading near the middle of slide 4."
Why it wins
- Kills ambiguity/deixis — selectors say exactly which element, so you never describe a page region in prose.
- Far fewer human round-trips — one batch pass replaces dozens of back-and-forth turns.
- Avoids the per-turn context re-reads that each fresh prompt would trigger.
- Non-coders can drive the review; you apply.
Watch for
- The win is round-trips and clarity, not raw output tokens — emitting N edits costs about the same either way.
- Batch size is bounded by the context window and edit coherence — chunk very large change-sets.
- Keep selectors stable (IDs/data-attrs) so edits map cleanly.
- Review the diff — batch apply still needs a human gate.
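The round-trip point can be shown with simplified arithmetic. The model below is an assumption: it counts input tokens only, treats the shared context as a fixed size re-sent every turn, and ignores prompt caching and conversation growth, both of which change the real numbers. All figures are invented.

```python
def input_tokens_one_by_one(context: int, n_edits: int, per_edit_prompt: int) -> int:
    """Each turn re-sends the shared context plus one small request."""
    return n_edits * (context + per_edit_prompt)


def input_tokens_batched(context: int, n_edits: int, per_edit_json: int) -> int:
    """One turn sends the context once plus the whole exported change list."""
    return context + n_edits * per_edit_json


print(input_tokens_one_by_one(context=30_000, n_edits=25, per_edit_prompt=40))  # 751000
print(input_tokens_batched(context=30_000, n_edits=25, per_edit_json=60))       # 31500
```

Output tokens for the edits themselves are left out on purpose, since they are about the same in both cases.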
Copyable prompt
Here is exported JSON of review edits from the deck
wrapper (array of {selector, action, value}). Apply
ALL of them in one pass to slides.html. For "edit"
replace text; for "comment" add a TODO note beside
the element. Show me a unified diff before saving.
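Before pasting an export into a prompt, it can help to validate it and split very large change-sets. The sketch below assumes the `{selector, action, value}` shape and the two actions named in the prompt above; the helper names, sample selectors, and batch size are hypothetical.

```python
import json

VALID_ACTIONS = {"edit", "comment"}
REQUIRED_FIELDS = {"selector", "action", "value"}


def load_edits(raw: str) -> list[dict]:
    """Parse the export and reject entries that lack the expected fields."""
    edits = json.loads(raw)
    for i, edit in enumerate(edits):
        missing = REQUIRED_FIELDS - edit.keys()
        if missing:
            raise ValueError(f"edit {i} is missing {sorted(missing)}")
        if edit["action"] not in VALID_ACTIONS:
            raise ValueError(f"edit {i} has unknown action {edit['action']!r}")
    return edits


def chunk(edits: list[dict], max_per_batch: int) -> list[list[dict]]:
    """Split the change-set into batches of at most max_per_batch edits."""
    return [edits[i:i + max_per_batch] for i in range(0, len(edits), max_per_batch)]


raw = json.dumps([
    {"selector": "#slide-4 h2", "action": "edit", "value": "Rollout plan"},
    {"selector": "[data-id='risk-table']", "action": "comment", "value": "Add owner column"},
    {"selector": "#slide-7 .footer", "action": "edit", "value": "Q3 review"},
])
batches = chunk(load_edits(raw), max_per_batch=2)
print([len(batch) for batch in batches])  # [2, 1]
```

Chunking by count is the simplest option; grouping by slide or by file would likely keep each batch more coherent.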
8 claude.ai vs Claude Code
- claude.ai (web): chat, artifacts, research, fast iteration; no direct repo/file/tool execution.
- Claude Code (CLI): reads/edits your files, runs commands, orchestrates sub-agents, MCP, hooks.
- The response "styles" feature appears to be migrating into Skills — verify in your own account rather than relying on it.
- Most prompting patterns are model-agnostic — they transfer to other AI coding tools too.
Tip: Prototype the conversation in web chat, then hand the agreed plan to Claude Code to execute against the repo.
9 Usage model & cost awareness
- Plans carry per-session and weekly limits; premium models (Opus) draw down faster.
- Wide sub-agent fan-out and high effort are the biggest token multipliers.
- Check Terms of Use before piping proprietary/sensitive code or data through any provider.
- For comparison, OpenAI Codex and Google Antigravity have their own quotas and premium-model tiers; the same frugality habits transfer.
Caution: Don't run xhigh-effort Opus workflows on a Friday if you need quota for the rest of the week. Budget deliberately.
10 Do / Avoid
Do
- Plan → approve → implement, especially for multi-file work.
- Start with Sonnet/medium; escalate to Opus/high only when stuck.
- Give sub-agents narrow roles and the cheapest capable model.
- Encode conventions as skills/hooks so the team inherits them.
- Scope MCP creds to least privilege; review diffs before commit.
- Request HTML/SVG artifacts for dense answers.
- Batch UI edits via the collab-wrapper JSON.
Avoid
- Defaulting every task to Opus + xhigh effort.
- Letting workflows edit files before a plan is approved.
- Pasting API keys/secrets into prompts or commits.
- Trusting MCP/web tool output as safe instructions.
- Describing UI regions in prose when a selector exists.
- Re-prompting one tiny change at a time across many turns.
11 PoC workflow recipe
- Define the persona & success criteria for the PoC.
- Opus/high: produce a plan + risk list. Approve it.
- Fan out implementer + test-writer sub-agents.
- Reviewer sub-agent (security + correctness) before merge.
- Generate an HTML artifact summarizing the architecture.
- Iterate via collab-wrapper batch edits.
Tip: Vertical reveal.js stacks = depth on demand. Keep the top slide high-level; push technical detail down the vertical stack.
12 How this could be better (honest)
- Token accounting is still coarse — better per-sub-agent cost telemetry would sharpen fan-out decisions.
- MCP trust boundaries deserve formal sandboxing patterns, not just discipline.
- Collab-wrapper selectors break on heavy DOM churn — a more durable anchoring scheme is wanted.
- These are current practices, not gospel — share better ones.
Tip: Treat every recipe here as a starting hypothesis. Measure, then adjust effort and orchestration to what your repo actually rewards.