How to Use Claude Code Subagents: Parallel AI Work Without Chaos

Claude Code subagents run parallel AI work without context chaos. Config, dispatch patterns, model choice, and the mistakes that waste tokens, with examples.

12 minutes
Intermediate
2026-04-17

Running more than one AI helper on the same project sounds great until they start tripping over each other. One rewrites the file another one was about to edit. Another one forgets the convention the first one just established. You end up spending more time untangling the mess than you saved by parallelizing.

Claude Code subagents are the answer to that problem, and they work well when you respect their boundaries. This guide covers what a subagent actually is, how to configure one, when to dispatch in parallel, which model to pick, and the common mistakes that turn a promising speedup into wasted tokens.

What Is a Claude Code Subagent?

A Claude Code subagent is a second Claude instance that runs inside your main session, with its own context window, its own tool access, and, optionally, its own model. The parent session talks to it through a single prompt string. When the subagent finishes, it hands back one final message. Everything in between stays inside the subagent's context and never pollutes yours.

Three things make this useful:

  • Fresh context. A subagent starts with no conversation history. You hand it a job, it does the job, it reports back. Your main session stays clean.
  • Scoped tools. You can restrict a subagent to read-only tools, or give it shell access for a specific task. The main session keeps its broader permissions.
  • Optional filesystem isolation. With isolation: worktree, the subagent gets its own git worktree and branch. It can experiment without touching your working tree.

If you've read our Claude Code power user guide, you've seen subagents mentioned alongside skills and hooks. This post goes deeper on just the subagent primitive.

Why Subagents for Parallel Work?

Subagents solve four specific problems that a single Claude session handles badly.

  1. Context window pressure. A long task packs reasoning, tool calls, and file contents into the same window. Quality drops once the window is heavily loaded, and long sessions feel it. Subagents fan that pressure out across independent windows, so the main session stays sharp.

  2. Model cost control. Opus is roughly five times more expensive than Sonnet. Haiku is cheaper still and keeps about 90 percent of Sonnet's capability on bounded tasks. Running Opus in the main session and delegating mechanical work to Sonnet or Haiku subagents cuts costs without hurting output quality on scoped work.

  3. Concurrent throughput. Claude Code can run up to 10 subagents in parallel. For genuinely independent work, this is real wall-clock speedup, not just vibes. Writing tests for five separate modules takes roughly a fifth of the wall-clock time when you fan out properly.

  4. Blast radius containment. Risky work (destructive commands, schema rewrites, bulk refactors) belongs in an isolated worktree. If the subagent goes sideways, you throw away the branch and nothing on your main tree moves.

The cost model matters here. If you're spending $20 a day on Claude Code without subagents, a well-tuned setup can cut that to $12 with no productivity loss. For a team of ten, that adds up.

Building Your First Subagent

Subagents live in markdown files with YAML frontmatter. Two scopes:

  • ~/.claude/agents/ for personal subagents, available everywhere
  • .claude/agents/ in your project root for team subagents, committed to the repo

The filename becomes the agent name. The frontmatter configures it. The body becomes its system prompt.

Here's a review-focused subagent, saved as .claude/agents/code-reviewer.md:

---
name: code-reviewer
description: Reviews diffs for logic bugs, missing tests, and convention breaks. Invoke after finishing a feature, before creating a PR.
model: sonnet
tools: [Read, Grep, Glob, Bash]
---

You are a senior engineer reviewing a diff before it goes to a human reviewer.

Your job is to find real problems, not to suggest stylistic rewrites. Focus on:

- Logic bugs the author may have missed
- Missing test coverage on non-trivial branches
- Conventions violated compared to neighboring code
- Security issues: unvalidated input, exposed secrets, SQL injection

Report findings as a short list grouped by severity. Skip anything that's purely taste.

You can invoke it in two ways. Explicitly, by name: "use the code-reviewer subagent on the diff". Or automatically, where Claude reads the description field and dispatches when it matches the task at hand. Auto-selection is flaky, so for anything important, name the subagent in your prompt.

A few fields are worth knowing:

  • model: haiku, sonnet, or opus. Omit to inherit the main session's model.
  • tools: an allowlist. Omit to inherit all tools. This is the single biggest lever on safety.
  • isolation: worktree: the subagent works in its own git worktree. Useful for anything that edits files.
  • background: true: the subagent runs asynchronously. You keep working while it runs and read the result later.

Start with one or two. Don't build a library of fifteen subagents on day one. You won't use most of them, and the description field gets crowded enough that auto-selection breaks.
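
As a concrete second example, here's a sketch of a read-only research subagent, saved as .claude/agents/docs-researcher.md. The name, description, and prompt body are hypothetical; the frontmatter fields are the same ones described above:

```markdown
---
name: docs-researcher
description: Reads external documentation and reports back a summary. Invoke when a task depends on third-party API details.
model: haiku
tools: [Read, WebFetch, WebSearch]
---

You are a research assistant. Given a question and one or more documentation
URLs, read only what you need and return a concise summary:

- Answer the question directly, noting which page or section it came from
- List any version caveats or deprecation notices you encounter
- If the docs don't answer the question, say so; don't guess
```

Because the tool list has no Write or Bash, the worst this subagent can do is waste tokens, which makes it a low-risk second addition.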

Dispatch Patterns That Scale

Once you have a subagent or two, the real question is when to use them. There are three patterns that cover most real work.

Pattern 1: Parallel fan-out for independent domains

This is the case people imagine when they hear "parallel AI", and it's also the case most people get wrong. Parallel dispatch only works when the subagents touch different files. Overlapping writes corrupt each other. Shared state is a race condition waiting to happen.

Good fan-out example:

"Dispatch three subagents in parallel: one writes unit tests for src/billing/, one writes tests for src/auth/, one writes tests for src/reporting/. Each subagent gets its own directory and never reads from the others."

Bad fan-out example:

"Dispatch three subagents to refactor our data model in parallel."

The second one is a disaster because the data model is shared. All three will edit overlapping files. You'll end up with conflicting diffs and zero speedup.

Rule of thumb: if you can't draw a line on a whiteboard separating what each subagent will touch, don't fan out.

Pattern 2: Sequential handoff with disk as shared state

Sometimes you want multiple subagents, but they depend on each other. Use sequential dispatch where each subagent writes its output to disk and the next one reads it.

A migration workflow:

  1. Research subagent reads the old schema, writes a plan to plan.md
  2. Main session reviews plan.md with you, edits if needed
  3. Implementation subagent reads plan.md and executes

This pattern trades wall-clock speed for quality. Each subagent starts fresh with only the file it needs. No context pollution, no assumption drift.
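
One way to wire up step one is a planning subagent whose only output is plan.md. The config below is a hypothetical sketch of that pattern, not a prescribed recipe:

```markdown
---
name: migration-planner
description: Reads the current schema and writes a step-by-step migration plan to plan.md. Invoke at the start of a migration, before any code changes.
model: opus
tools: [Read, Grep, Glob, Write]
---

You are planning a migration. Do not edit any source files.

1. Read the current schema and every file that depends on it
2. Write a numbered migration plan to plan.md: one step per change, the
   files each step touches, and the order the steps must run in
3. Flag any step that is destructive or hard to roll back

Your only output file is plan.md. Implementation happens later, in a
separate session that reads your plan.
```

Note the Write tool is included so the subagent can create plan.md, but the prompt forbids touching source files; the tool allowlist and the prompt reinforce each other.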

Pattern 3: Background subagents for long-running work

If a task takes 10 minutes and doesn't need your input, run it in the background. You keep working on the main session. When the subagent finishes, its result shows up in the transcript and you can act on it.

Typical use: a test suite that takes a while to regenerate, a doc-spelunking task where the subagent reads through external docs and reports back, a codebase audit that touches hundreds of files.

For longer pieces on agent orchestration patterns beyond Claude Code, our LangGraph state machines guide covers explicit state management, which is useful when subagent handoff gets complicated.

Best Practices

Six rules that hold up after the first week of real use.

  1. Dispatch in parallel only when file boundaries are obvious. If the subagents could possibly touch the same file, run them sequentially. The cost of a merge conflict is higher than the cost of waiting.

  2. Pass file paths explicitly in the prompt. The subagent starts with no memory of your conversation. If you say "review the file we just edited", it has no idea which file that is. Say "review src/billing/invoice.ts".

  3. Restrict tools per subagent. A reviewer needs Read and Grep, not Bash or Write. A documentation subagent needs WebFetch but not shell access. Tool scoping is your biggest safety lever and costs nothing.

  4. Use isolation: worktree for anything that writes code. The cost is near zero: the worktree cleans itself up if no changes are made. The upside is that a bad subagent run can't touch your main working tree.

  5. Match model to task complexity. Opus for planning and architecture. Sonnet for implementation. Haiku for lookups, grep, and bounded mechanical work. If you're running Opus on every subagent, you're overpaying.

  6. Name the subagent explicitly when it matters. Auto-selection works sometimes. For anything load-bearing, write the subagent name into your prompt so it's not left to chance.

Deployment Considerations

Rolling subagents out to a team is a different exercise from using them solo. Four concerns that come up.

Scalability. The 10-concurrent limit matters more than you'd think. If you dispatch 15 subagents, five wait in a queue. Most teams rarely need more than three at once, but when you're running bulk work, queue contention shows up.

Cost. Track token spend per subagent. Some teams set up a hook on SubagentStop that logs token counts to a file. Over a month, this tells you which subagents are worth the overhead and which ones are ceremony. Our production AI agents guide covers similar instrumentation for LangChain-style agents.
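
A minimal version of that hook might look like the following, assuming Claude Code's hooks configuration in .claude/settings.json and a SubagentStop event that passes a JSON payload on stdin. The exact payload fields vary by Claude Code version, so treat the jq expression as a placeholder that simply timestamps whatever arrives and appends it to a JSONL file:

```json
{
  "hooks": {
    "SubagentStop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "jq -c '. + {ts: now}' >> ~/.claude/subagent-log.jsonl"
          }
        ]
      }
    ]
  }
}
```

One line per subagent run, greppable, and cheap enough to leave on permanently.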

Security. If a subagent can run git push, aws commands, or anything else that affects shared state, tool-scope it aggressively. Better yet, use isolation: worktree and a personal branch. Never give a subagent credentials it doesn't strictly need.

Monitoring. Subagent output is easy to miss because it lands in the transcript inside a collapsible block. Set up a hook that writes subagent results to a log file, especially for background subagents. You'll catch failures faster, and you'll have an audit trail when something ships that shouldn't have.
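
If your hook appends one JSON object per run to a log file, a few lines of Python turn that into a per-subagent report. The agent, tokens, and seconds field names below are hypothetical; adjust them to whatever your hook actually records:

```python
import json
from collections import defaultdict


def summarize_subagent_log(lines):
    """Aggregate per-subagent stats from JSONL log lines.

    Each line is assumed (hypothetically) to look like:
      {"agent": "code-reviewer", "tokens": 1200, "seconds": 35.5}
    """
    totals = defaultdict(lambda: {"runs": 0, "tokens": 0, "seconds": 0.0})
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        stats = totals[event["agent"]]
        stats["runs"] += 1
        stats["tokens"] += event.get("tokens", 0)
        stats["seconds"] += event.get("seconds", 0.0)
    return dict(totals)


if __name__ == "__main__":
    with open("subagent-log.jsonl") as f:
        for name, stats in summarize_subagent_log(f).items():
            print(name, stats)
```

Run it weekly and you have the evidence the conclusion below asks for: which subagents earn their keep and which are ceremony.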

Real-World Applications

Five patterns that teams use today.

  • Code review before PR. A reviewer subagent runs on every diff. Sonnet or Haiku is enough here. The main session keeps working on the next feature while the review runs.

  • Parallel test writing. A feature that spans three modules gets three test-writing subagents, one per module. Wall-clock speedup is real when modules don't share test helpers.

  • Research subagent for external docs. A subagent with WebFetch and WebSearch reads docs, returns a summary. Main session never loads the docs directly, which keeps the window clean.

  • Migration assistants. Schema migrations, config renames, dependency upgrades. Each subagent handles one file or one directory, isolation: worktree on, results reviewed before merging.

  • Security scans during development. While the main session ships features, a background subagent scans for hardcoded secrets, unvalidated inputs, or common vulnerabilities. Findings come back in the transcript without blocking forward progress.

For teams coming from LangChain or custom agent frameworks, our write-up on building agentic AI systems compares how these patterns look outside Claude Code.

Conclusion

Subagents save time when the work is genuinely independent. They waste tokens when you pretend sequential work is parallel. The difference is boring: can you point at the files each subagent will touch, and are those files actually separate?

If you take one thing from this post, take this: start with one subagent, one that matches a real recurring task. Measure whether it actually helps. Add a second only when you have evidence the first one paid off. A library of fifteen unused subagents is worse than a single one you trust.

For deeper context on how subagents fit alongside Claude Code skills and hooks, see the Claude Code power user guide. For how Claude Code compares to Cursor and Copilot, our AI coding assistants comparison has the tradeoffs.

Next Steps

  1. Create .claude/agents/code-reviewer.md with the config above and try it on your next diff.
  2. Pick one recurring task in your workflow that could benefit from a fresh context. Write a subagent for just that task.
  3. Measure: log token spend and wall-clock time for a week. Keep the subagent if it earned its keep, retire it if it didn't.
  4. Once one subagent is working, experiment with parallel fan-out on genuinely independent work, not before.

Refactix Team

Practical guides on software architecture, AI engineering, and cloud infrastructure.

Topics Covered

Claude Code Subagents, Claude Code Parallel, Subagent Orchestration, Claude Code Agent Tool, Subagent Configuration, Claude Code Agents
