Anthropic Finally Reveals How It Built Its Multi-Agent Claude Research System - Know All About It Here
Anthropic has launched the Research feature for Claude, a multi-agent system that puts several AI agents to work in parallel to dissect complex questions faster and more thoroughly.
At its core, Claude Research splits a task between a lead agent and several subagents. The lead agent parses the request, designs a research plan, and delegates parallel tasks. Subagents then query the web and other tools simultaneously, each focusing on a specific aspect. Once they return their findings, the lead agent combines them into a cohesive, cited answer.
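To make the orchestrator-worker pattern concrete, here is a minimal sketch in Python. It is an illustration rather than Anthropic's actual implementation: the `plan_subtasks` and `run_subagent` helpers stand in for LLM calls and tool use, and the lead agent simply fans subtasks out to a thread pool and stitches the returned findings together.

```python
# Illustrative orchestrator-worker sketch (not Anthropic's actual code).
# A lead agent plans subtasks, subagents run them in parallel, and the
# lead agent synthesizes the returned findings into one cited answer.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Finding:
    subtask: str
    summary: str
    sources: list[str]


def plan_subtasks(question: str) -> list[str]:
    # Stand-in for the lead agent's planning step: in practice an LLM call
    # that decomposes the question into independent research angles.
    return [f"{question} -- angle {i}" for i in range(1, 4)]


def run_subagent(subtask: str) -> Finding:
    # Stand-in for a subagent: each one searches the web and other tools
    # within its own context window and returns a condensed finding.
    return Finding(subtask, f"summary of {subtask}", ["https://example.com"])


def research(question: str) -> str:
    subtasks = plan_subtasks(question)
    # Subagents run in parallel; the lead agent waits for all of them
    # before synthesizing the answer.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        findings = list(pool.map(run_subagent, subtasks))
    # Stand-in for the synthesis step: combine findings into a cited answer.
    return "\n".join(f"- {f.summary} ({', '.join(f.sources)})" for f in findings)


if __name__ == "__main__":
    print(research("Who sits on the boards of major IT companies?"))
```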
This design tackles the limitations of single-threaded retrieval systems. Fixed-context models struggle with deep, branching inquiries and token limits. By distributing the workload across agents, each with its own context window, Claude Research expands capacity while avoiding path dependency. In internal benchmarks, this setup outperformed a single Claude Opus 4 agent by 90.2% on broad information-gathering queries, such as identifying all the board members of companies in the S&P 500 information technology sector, by breaking the problem into parallel subtasks.
That speed comes at a cost. Agents use roughly four times more tokens than ordinary chat interactions, and a full multi-agent run can burn around fifteen times as many. Anthropic emphasizes that this architecture makes economic sense only for high-value tasks that can be parallelized.
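For a rough sense of what those multipliers mean in practice, the short calculation below plugs them into an assumed baseline; the token count and price per token are illustrative assumptions, not Anthropic figures.

```python
# Back-of-envelope cost comparison using the multipliers reported above.
# The baseline chat token count and the price per token are illustrative
# assumptions, not Anthropic figures.
CHAT_TOKENS = 2_000           # assumed tokens for a typical chat exchange
PRICE_PER_1K_TOKENS = 0.015   # assumed blended price in dollars per 1K tokens

single_agent_tokens = CHAT_TOKENS * 4    # ~4x chat, per the article
multi_agent_tokens = CHAT_TOKENS * 15    # ~15x chat, per the article

for label, tokens in [
    ("chat", CHAT_TOKENS),
    ("single agent", single_agent_tokens),
    ("multi-agent research", multi_agent_tokens),
]:
    cost = tokens / 1000 * PRICE_PER_1K_TOKENS
    print(f"{label:>22}: {tokens:>7,} tokens ~= ${cost:.3f}")
```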
Building it posed several hurdles. Prompt design needed fine-tuning because early versions spawned unnecessary subagents or duplicated work. Reliability required checkpointing, tracing, and gradual “rainbow deployments” to avoid disrupting agents mid-run. The current orchestration is synchronous: each subagent must finish before the lead agent moves on, which creates potential bottlenecks. Anthropic notes asynchronous execution may follow once it solves the complexity around coordination, state tracking, and error handling.
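The checkpointing idea mentioned above can be pictured with a small sketch: persist each completed subagent result so that an interrupted run resumes without redoing finished work. The JSON file store and helper names here are assumptions for illustration, not Anthropic's design.

```python
# Illustrative checkpointing sketch (not Anthropic's implementation):
# persist each completed subagent result so an interrupted run can resume
# without repeating finished work.
import json
from pathlib import Path

CHECKPOINT = Path("research_checkpoint.json")  # assumed local store


def load_checkpoint() -> dict[str, str]:
    return json.loads(CHECKPOINT.read_text()) if CHECKPOINT.exists() else {}


def save_checkpoint(done: dict[str, str]) -> None:
    CHECKPOINT.write_text(json.dumps(done, indent=2))


def run_with_checkpoints(subtasks: list[str]) -> dict[str, str]:
    done = load_checkpoint()
    for task in subtasks:
        if task in done:                   # finished in an earlier run
            continue
        done[task] = f"result of {task}"   # stand-in for the subagent call
        save_checkpoint(done)              # record progress after each subtask
    return done


if __name__ == "__main__":
    print(run_with_checkpoints(["angle 1", "angle 2", "angle 3"]))
```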
Evaluation also became more complex. Because agent plans evolve as a run unfolds, typical step-by-step measures failed, so Anthropic used an LLM judge that scores outputs against a rubric covering accuracy, citation quality, completeness, and tool-use efficiency, supplemented by human review to catch qualitative issues such as a tendency to cite low-quality sites.
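A rubric-based judge of this kind might look something like the sketch below. The criteria mirror those named in the article, but the prompt wording, scoring scale, and helper names are assumptions; in practice the filled-in prompt would be sent to a model and the returned scores aggregated across a benchmark of queries.

```python
# Illustrative rubric-based judge sketch (assumed names and wording).
RUBRIC = ["accuracy", "citation quality", "completeness", "tool use efficiency"]

JUDGE_PROMPT = """You are grading a research report.
For each criterion, give a score from 0 to 10 and one sentence of justification.
Criteria: {criteria}

Question: {question}

Report:
{report}
"""


def build_judge_prompt(question: str, report: str) -> str:
    # The filled-in prompt is what an LLM judge would receive; the scores it
    # returns are then aggregated across many test queries, with human review
    # catching issues the rubric misses (e.g. reliance on low-quality sources).
    return JUDGE_PROMPT.format(
        criteria=", ".join(RUBRIC), question=question, report=report
    )


if __name__ == "__main__":
    print(build_judge_prompt("How do tropical cyclones form?", "<report text>"))
```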
Despite these challenges, Anthropic reports that users already leverage the tool to uncover technical opportunities, weigh healthcare options, and surface academic insights, sometimes saving days of work. The Research feature marks a shift from single-agent pipelines toward coordinated, specialized AI teams.