Anthropic Finally Reveals How It Built Its Multi-Agent Claude Research System - Know All About It Here
Anthropic has launched the Research feature for Claude, a multi-agent system that puts several AI agents to work in parallel to dissect complex questions faster and more thoroughly.
At its core, Claude Research splits a task between a lead agent and several subagents. The lead agent parses the request, designs a research plan, and delegates parallel tasks. Subagents then query the web and other tools simultaneously, each focusing on a specific aspect. Once they return their findings, the lead agent combines them into a cohesive, cited answer.
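To make the orchestrator-worker pattern concrete, here is a minimal sketch in Python. It is an illustration rather than Anthropic's actual implementation: the `plan_subtasks` and `run_subagent` helpers stand in for LLM calls and tool use, and the lead agent simply fans subtasks out to a thread pool and stitches the returned findings together.

```python
# Illustrative orchestrator-worker sketch (not Anthropic's actual code).
# A lead agent plans subtasks, subagents run them in parallel, and the
# lead agent synthesizes the returned findings into one cited answer.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Finding:
    subtask: str
    summary: str
    sources: list[str]


def plan_subtasks(question: str) -> list[str]:
    # Stand-in for the lead agent's planning step: in practice an LLM call
    # that decomposes the question into independent research angles.
    return [f"{question} -- angle {i}" for i in range(1, 4)]


def run_subagent(subtask: str) -> Finding:
    # Stand-in for a subagent: each one searches the web and other tools
    # within its own context window and returns a condensed finding.
    return Finding(subtask, f"summary of {subtask}", ["https://example.com"])


def research(question: str) -> str:
    subtasks = plan_subtasks(question)
    # Subagents run in parallel; the lead agent waits for all of them
    # before synthesizing the answer.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        findings = list(pool.map(run_subagent, subtasks))
    # Stand-in for the synthesis step: combine findings into a cited answer.
    return "\n".join(f"- {f.summary} ({', '.join(f.sources)})" for f in findings)


if __name__ == "__main__":
    print(research("Who sits on the boards of major IT companies?"))
```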
This design tackles the limitations of single-threaded retrieval systems. Fixed-context models struggle with deep, branching inquiries and token limits. By distributing the workload across agents, each with its own context window, Claude Research expands capacity while avoiding path dependency. In internal benchmarks, this setup outperformed a single Claude Opus 4 agent by 90.2% on broad information-gathering queries, such as identifying all the board members of companies in the S&P 500 information technology sector, by breaking the problem into parallel subtasks.
That speed comes at a cost. Agents use roughly four times more tokens than ordinary chat interactions, and a full multi-agent run can burn around fifteen times as many. Anthropic emphasizes that this architecture makes economic sense only for high-value tasks that can be parallelized.
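For a rough sense of what those multipliers mean in practice, the short calculation below plugs them into an assumed baseline; the token count and price per token are illustrative assumptions, not Anthropic figures.

```python
# Back-of-envelope cost comparison using the multipliers reported above.
# The baseline chat token count and the price per token are illustrative
# assumptions, not Anthropic figures.
CHAT_TOKENS = 2_000           # assumed tokens for a typical chat exchange
PRICE_PER_1K_TOKENS = 0.015   # assumed blended price in dollars per 1K tokens

single_agent_tokens = CHAT_TOKENS * 4    # ~4x chat, per the article
multi_agent_tokens = CHAT_TOKENS * 15    # ~15x chat, per the article

for label, tokens in [
    ("chat", CHAT_TOKENS),
    ("single agent", single_agent_tokens),
    ("multi-agent research", multi_agent_tokens),
]:
    cost = tokens / 1000 * PRICE_PER_1K_TOKENS
    print(f"{label:>22}: {tokens:>7,} tokens ~= ${cost:.3f}")
```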
Building it posed several hurdles. Prompt design needed fine-tuning because early versions spawned unnecessary subagents or duplicated work. Reliability required checkpointing, tracing, and gradual “rainbow deployments” to avoid disrupting agents mid-run. The current orchestration is synchronous: each subagent must finish before the lead agent moves on, which creates potential bottlenecks. Anthropic notes asynchronous execution may follow once it solves the complexity around coordination, state tracking, and error handling.
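The checkpointing idea mentioned above can be pictured with a small sketch: persist each completed subagent result so that an interrupted run resumes without redoing finished work. The JSON file store and helper names here are assumptions for illustration, not Anthropic's design.

```python
# Illustrative checkpointing sketch (not Anthropic's implementation):
# persist each completed subagent result so an interrupted run can resume
# without repeating finished work.
import json
from pathlib import Path

CHECKPOINT = Path("research_checkpoint.json")  # assumed local store


def load_checkpoint() -> dict[str, str]:
    return json.loads(CHECKPOINT.read_text()) if CHECKPOINT.exists() else {}


def save_checkpoint(done: dict[str, str]) -> None:
    CHECKPOINT.write_text(json.dumps(done, indent=2))


def run_with_checkpoints(subtasks: list[str]) -> dict[str, str]:
    done = load_checkpoint()
    for task in subtasks:
        if task in done:                   # finished in an earlier run
            continue
        done[task] = f"result of {task}"   # stand-in for the subagent call
        save_checkpoint(done)              # record progress after each subtask
    return done


if __name__ == "__main__":
    print(run_with_checkpoints(["angle 1", "angle 2", "angle 3"]))
```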
Evaluation also became more complex. Because agent plans evolve as a run unfolds, typical step-by-step measures failed, so Anthropic used an LLM judge that scores outputs against a rubric covering accuracy, citation quality, completeness, and tool-use efficiency, supplemented by human review to catch qualitative issues such as a tendency to cite low-quality sites.
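A rubric-based judge of this kind might look something like the sketch below. The criteria mirror those named in the article, but the prompt wording, scoring scale, and helper names are assumptions; in practice the filled-in prompt would be sent to a model and the returned scores aggregated across a benchmark of queries.

```python
# Illustrative rubric-based judge sketch (assumed names and wording).
RUBRIC = ["accuracy", "citation quality", "completeness", "tool use efficiency"]

JUDGE_PROMPT = """You are grading a research report.
For each criterion, give a score from 0 to 10 and one sentence of justification.
Criteria: {criteria}

Question: {question}

Report:
{report}
"""


def build_judge_prompt(question: str, report: str) -> str:
    # The filled-in prompt is what an LLM judge would receive; the scores it
    # returns are then aggregated across many test queries, with human review
    # catching issues the rubric misses (e.g. reliance on low-quality sources).
    return JUDGE_PROMPT.format(
        criteria=", ".join(RUBRIC), question=question, report=report
    )


if __name__ == "__main__":
    print(build_judge_prompt("How do tropical cyclones form?", "<report text>"))
```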
Despite these challenges, Anthropic reports that users already leverage the tool to uncover technical opportunities, weigh healthcare options, and surface academic insights, sometimes saving days of work. The Research feature marks a shift from single-agent pipelines toward coordinated, specialized AI teams.