Sub-Agents and Parallel Pipelines
Spawn multiple agents working in parallel — and build the research pipeline that scales beyond a single session.
The Architecture Ceiling — and the Way Through It
Every capability you have built so far runs in a single session: one context window, processing work sequentially. That is fine for most tasks. But some problems are fundamentally parallel in nature — a collection of documents that need individual analysis, a queue of bugs that each need a fix, a set of market segments that each need a research brief.
For these problems, sequential execution is not just slow — it is architecturally wrong. The items do not depend on each other. There is no reason item 3 should wait for item 2 to finish. Sub-agents change the equation.
Claude Code Remote is the current sub-agent mechanism; both Jenny (Anthropic) and Swyx have described this pattern in detail. The sub-agent capability itself is verified in 01-cowork-overview.md, but the Claude Code Remote mechanism does not yet appear in Co-Work's official scheduling documentation: it is attributed to these practitioners and may evolve as the platform matures.
How Sub-Agents Work
Co-Work can spawn sub-agents to run tasks in parallel — multiple instances working simultaneously. This is documented in official Co-Work documentation as "sub-agent coordination" and "parallel workstreams."
The current mechanism, described by Jenny (Anthropic) and Swyx in technical discussions, uses Claude Code Remote as the sub-agent engine. The architecture works like this:
- The parent Co-Work session identifies the collection of parallel work items
- It writes a template prompt describing what to do with each item
- It spawns a separate Claude Code Remote session for each item
- Each session runs independently — accessing files, using tools, producing output
- The parent Co-Work session monitors all running sessions, collects their outputs, and synthesizes a final result
The key constraint: each session is independent. Sub-agents cannot communicate with each other mid-run. They receive their prompt from the parent, do their work, and return results to the parent. This means sub-agents are suited for tasks where the items do not depend on each other's intermediate outputs.
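This fan-out/fan-in shape can be sketched in a few lines of Python, using threads as a stand-in for independent sessions. Everything here is hypothetical: `spawn_remote_session`, `run_parallel_pipeline`, and the template string are illustrative names, not a real Co-Work or Claude Code Remote API.

```python
from concurrent.futures import ThreadPoolExecutor

def spawn_remote_session(prompt: str) -> str:
    # Hypothetical stand-in for launching one independent
    # Claude Code Remote session; returns a fake per-item result.
    return f"analysis of: {prompt}"

def run_parallel_pipeline(items, template):
    # Fan out: one independent session per item, same template prompt.
    prompts = [template.format(item=item) for item in items]
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        results = list(pool.map(spawn_remote_session, prompts))
    # Fan in: the parent session synthesizes the collected outputs.
    return "\n".join(results)

report = run_parallel_pipeline(
    ["doc-a", "doc-b", "doc-c"],
    "Summarize {item} in three bullet points.",
)
```

Note that the sub-agents never touch each other's state; all coordination happens at the fan-in step, which mirrors the independence constraint above.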
Parallel vs. Sequential: The Design Decision
The choice between parallel and sequential execution is an architectural decision, not a performance preference. Use parallel execution when:
- Tasks are independent — item B does not need item A's output before it can start
- Tasks follow a common template — the same prompt works for every item in the collection
- The collection is uniform — all items are of the same type (all documents, all bugs, all interviews)
- Speed matters and the cost is acceptable — total token spend scales with the number of items, and a parallel run incurs it all at once rather than spread over time
Use sequential execution when:
- Task B uses task A's output — the work is a chain, not a collection
- Items are heterogeneous and require different prompts
- You need each result reviewed before proceeding to the next
- Context from earlier items should inform later ones
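The two execution shapes reduce to a map versus a chain. In this sketch, `analyze` and the step functions are hypothetical stand-ins for individual agent runs:

```python
def analyze(item: str) -> str:
    # Hypothetical per-item analysis; stands in for one sub-agent run.
    return item.upper()

items = ["q1 report", "q2 report", "q3 report"]
# Parallel shape: a map over independent items; order is irrelevant.
parallel_results = [analyze(i) for i in items]

# Sequential shape: a chain where each step consumes the prior output.
def outline(text: str) -> str: return text + " -> outline"
def draft(text: str) -> str: return text + " -> draft"
state = "brief"
for step in (outline, draft):
    state = step(state)
```

If you can write the work as the map, it is a candidate for sub-agents; if only the chain fits, keep it sequential.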
The Bug Triage Pipeline
The most technically sophisticated pipeline documented by the community comes from Swyx: a bug triage workflow that demonstrates what sub-agents enable at scale:
- Co-Work monitors a crash reporting tool (such as Sentry) via a connector
- For each new bug report: Co-Work writes a markdown prompt file describing the issue, the relevant code context, and the expected behavior
- Co-Work spawns a separate Claude Code Remote session for each bug
- Each session reads its prompt, pulls the relevant code, attempts a fix, and creates a pull request
- The parent Co-Work session monitors all sessions, collects the PR links, and generates a summary report of all attempted fixes
The prompt that triggers this: "Go to [crash reporting tool]. Find all bugs filed today. For each bug, start a Claude Code Remote session, have it read the issue description, find the relevant code, attempt a fix, and create a pull request."
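The fan-out step can be pictured as writing one markdown prompt file per bug, each of which would seed one remote session. This is a sketch under stated assumptions: the bug list, IDs, titles, and file layout below are invented for illustration, not pulled from a real crash tool.

```python
import pathlib
import tempfile

# Stand-in for bug reports pulled from a crash tool such as Sentry;
# the IDs and titles here are invented for illustration.
bugs = [
    {"id": "BUG-101", "title": "Crash on empty cart checkout"},
    {"id": "BUG-102", "title": "NPE in session refresh"},
]

workdir = pathlib.Path(tempfile.mkdtemp())
prompt_files = []
for bug in bugs:
    prompt = (
        f"# Fix {bug['id']}: {bug['title']}\n\n"
        "Read the issue description, find the relevant code,\n"
        "attempt a fix, and create a pull request.\n"
    )
    path = workdir / f"{bug['id']}.md"
    path.write_text(prompt)    # one markdown prompt file per bug
    prompt_files.append(path)  # each file would seed one remote session
```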
The bug triage pipeline is a pattern Swyx described in a community discussion (Future of Software); it is an advanced community practice, not official Anthropic documentation. The capability it relies on, sub-agent coordination with parallel workstreams, is verified; the specific implementation pattern is community-sourced.
The Jenny Monday Insights Pattern
Jenny, a researcher at Anthropic, describes a parallel research pattern she uses in her own work. Each Monday morning, she asks Co-Work to analyze a folder of UXR (user experience research) interview transcripts alongside recent Reddit discussions about the product. Co-Work processes multiple sources simultaneously, each as a separate analysis stream, and synthesizes them into a unified insights brief.
The structure: "Look in this folder of UXR interviews and on Reddit — tell me the main insights." What sounds like a simple request triggers parallel processing across multiple sources. Without sub-agents, this would mean hours of reading each source serially. With parallel execution, the consolidated brief arrives while the first coffee is still hot.
Token Economics in Parallel Workloads
Parallel execution is faster, not cheaper. Five parallel sub-agents consume roughly the same total tokens as processing the same five items one after another: cost scales with the number of items, not with how you schedule them. The wall-clock time drops by a factor of five; the token cost does not.
Jack Roberts uses a framing that clarifies the decision: think of token spend as employee cost. Running five sub-agents in parallel is like hiring five specialists to each work on one problem for an hour — versus hiring one specialist to work on all five problems for five hours. Same total labor cost, very different delivery speed.
Parallel agents multiply token consumption proportionally. Before running a 20-item parallel pipeline, test with 3 items first and review the cost. Set clear expectations about spend — especially for scheduled pipelines that run automatically. The OpenTelemetry monitoring covered in Module 17 enables cost tracking per pipeline at Team/Enterprise scale.
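A back-of-the-envelope budget helper makes the "test with 3 items first" advice concrete. The per-item token count and the price per million tokens below are placeholder numbers, not real rates:

```python
def pipeline_cost(item_count: int, tokens_per_item: int,
                  usd_per_million_tokens: float) -> float:
    # Rough token budget for a parallel pipeline: cost scales with
    # the item count, while wall-clock time stays roughly constant.
    total_tokens = item_count * tokens_per_item
    return total_tokens * usd_per_million_tokens / 1_000_000

# Hypothetical numbers: ~150k tokens per item at $15/M tokens.
trial = pipeline_cost(3, 150_000, 15.0)      # small test run first
estimate = pipeline_cost(20, 150_000, 15.0)  # full 20-item pipeline
```

Reviewing the trial-run figure before committing to the full estimate is exactly the scope-limit discipline this module recommends.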
Designing Your First Parallel Pipeline
The design pattern for any parallel pipeline follows four steps:
- Identify the parallelizable unit — what is the "one item" that the template prompt operates on? One document, one bug, one customer interview, one market segment.
- Write the template prompt — one prompt that works for any item in the collection. Test it on a single item first. If it does not work on one, it will not work on twenty.
- Define the synthesis — how should the parent Co-Work session combine all the individual outputs? A consolidated summary? A comparison table? A ranked list? Define this in the prompt.
- Set the scope limit — start with 3–5 items in the first run. Verify the outputs are correct before scaling. A bad template prompt running on 20 items in parallel produces 20 bad outputs.
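The four steps above map onto a small scaffold. Everything in it (`TEMPLATE`, `synthesize`, the stub `analyze` lambda, the scope limit) is a hypothetical sketch of the pattern, not Co-Work syntax:

```python
# Step 2: one template prompt that works for any item in the collection.
TEMPLATE = (
    "Analyze {item}. Return: key claims, evidence quality, "
    "and one actionable takeaway."
)

def synthesize(results) -> str:
    # Step 3: how the parent combines the individual outputs.
    return "Consolidated brief:\n" + "\n".join(f"- {r}" for r in results)

def run_pipeline(items, analyze, scope_limit=3) -> str:
    # Step 4: cap the first run; verify outputs before scaling up.
    batch = items[:scope_limit]
    # Step 1: the parallelizable unit is one item fed to the template.
    results = [analyze(TEMPLATE.format(item=i)) for i in batch]
    return synthesize(results)

brief = run_pipeline(
    ["post-a", "post-b", "post-c", "post-d"],
    analyze=lambda prompt: f"stub result for: {prompt[:20]}",
)
```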
Build a Parallel Research Pipeline
Choose 3–5 articles, documents, or URLs that share a common theme. This could be competitor blog posts, industry news pieces, research papers, or user feedback files. The items should be similar enough that one analysis prompt applies to all of them.
Success criteria: At least 3 items analyzed in parallel with individual results and a consolidated synthesis produced. Time comparison documented. Template prompt ready to reuse for the next collection of the same type.
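For the "time comparison documented" criterion, a toy benchmark illustrates the trade-off. Here `fake_analysis` with a short sleep stands in for a real sub-agent run; the item names are invented:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_analysis(item: str) -> str:
    time.sleep(0.1)  # stands in for one sub-agent's working time
    return f"notes on {item}"

items = ["article-a", "article-b", "article-c"]

t0 = time.perf_counter()
sequential = [fake_analysis(i) for i in items]  # one after another
seq_time = time.perf_counter() - t0

t0 = time.perf_counter()
with ThreadPoolExecutor() as pool:              # all at once
    parallel = list(pool.map(fake_analysis, items))
par_time = time.perf_counter() - t0
```

The outputs are identical; only the wall-clock time differs, which is the whole argument of this module in two measurements.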