Complete Bibliographic Citation
Zhang, Xing, Yanwei Cui, Guanghui Wang, Wei Qiu, Ziyuan Li, Fangwei Han, Yajing Huang, Hengzhi Qiu, Bing Zhu, and Peiyang He (2026). “Verified Multi-Agent Orchestration: A Plan-Execute-Verify-Replan Framework for Complex Query Resolution.” ICLR 2026 Workshop on MALGAI (Multi-Agent Learning: Generalization and Adaptation in Intelligence). arXiv:2603.11445. Available: https://arxiv.org/html/2603.11445
Summary
This paper introduces VMAO (Verified Multi-Agent Orchestration), a system for coordinating specialized LLM-based agents through a verification-driven iterative loop. It provides academic grounding for Multi-Agent-Orchestration patterns, demonstrating how complex queries can be decomposed, executed in parallel, and iteratively improved through verification.
The Core Framework
VMAO implements the Plan-Execute-Verify-Replan cycle and closes it with a synthesis step, giving five phases in total:
- Plan: A QueryPlanner decomposes complex queries into directed acyclic graphs (DAGs) of sub-questions, assigning each to domain-specific agents
- Execute: A DAGExecutor respects dependencies while maximizing parallel execution (default batch size of 3)
- Verify: A ResultVerifier evaluates completeness using LLM-based evaluation, producing scores (0–1 scale), identifying missing aspects, and issuing recommendations
- Replan: An AdaptiveReplanner addresses gaps through retries or new queries, preserving previous results
- Synthesize: Hierarchical synthesis groups results by agent type, then integrates them into a final answer with source attribution
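The Execute phase can be illustrated with a minimal sketch that groups sub-questions into dependency-respecting batches. It is not the paper's DAGExecutor: the function name `execution_batches`, the example plan, and the level-by-level scheduling are assumptions made for illustration, and only the default batch size of 3 comes from the summary above.

```python
from graphlib import TopologicalSorter


def execution_batches(dependencies, batch_size=3):
    """Group sub-questions into batches that respect DAG dependencies.

    `dependencies` maps each sub-question id to the set of ids it depends on.
    Hypothetical helper: a simplified, level-by-level scheduler.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the plan is not acyclic
    batches = []
    while sorter.is_active():
        # Every node returned here has all of its dependencies finished.
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        for start in range(0, len(ready), batch_size):
            batches.append(ready[start:start + batch_size])
        sorter.done(*ready)  # mark the whole level as finished
    return batches


# Hypothetical plan: q5 needs q1 and q2; q6 needs q3 and q5.
plan = {
    "q1": set(),
    "q2": set(),
    "q3": set(),
    "q4": set(),
    "q5": {"q1", "q2"},
    "q6": {"q3", "q5"},
}

print(execution_batches(plan))
# [['q1', 'q2', 'q3'], ['q4'], ['q5'], ['q6']]
```

Each inner list could be dispatched concurrently. A production executor would likely start `q5` as soon as `q1` and `q2` finish instead of waiting for the whole level, so this sketch understates the achievable parallelism.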
Agent Architecture and Isolation
The system organizes agents into three functional tiers, illustrating Sub-Agents-Context-Isolation principles:
- Tier 1 (Data Gathering): RAG, web search, financial, and competitor agents
- Tier 2 (Analysis): Analysis, reasoning, and raw data processing agents
- Tier 3 (Output): Document generation and visualization agents
The system deploys 42 unique tools across 8 microservices via the Model Context Protocol (MCP). Separating verification from execution is an instance of Dual-Agent-Design: an independent evaluator judges the results, rather than the executing agents assessing their own output.
Stopping and Safety Mechanisms
The configurable stop conditions demonstrate Back-Pressure-Mechanisms in practice:
- Completeness threshold: 80%
- Confidence with partial coverage: 75% confidence combined with 50% completeness
- Diminishing returns: <5% improvement per iteration
- Token budget: 1M tokens maximum
- Iteration cap: 3 replanning cycles
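The stop conditions above can be combined into a single check, sketched below. The threshold values come from the list; everything else is an assumption for illustration, including the names `StopConfig` and `stop_reason`, the use of `>=` comparisons, the order in which conditions are tested, and the reading of "<5% improvement" as an absolute change in the completeness score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StopConfig:
    # Values taken from the listed stop conditions.
    completeness_threshold: float = 0.80
    confidence_threshold: float = 0.75
    partial_completeness: float = 0.50
    min_improvement: float = 0.05
    max_tokens: int = 1_000_000
    max_replans: int = 3


def stop_reason(completeness, confidence, previous_completeness,
                tokens_used, replans_done, cfg=StopConfig()):
    """Return the name of the first stop condition that fires, or None.

    Hypothetical helper: check order and comparison operators are assumptions.
    """
    if completeness >= cfg.completeness_threshold:
        return "complete"
    if (confidence >= cfg.confidence_threshold
            and completeness >= cfg.partial_completeness):
        return "confident_partial"
    if tokens_used >= cfg.max_tokens:
        return "token_budget"
    if replans_done >= cfg.max_replans:
        return "iteration_cap"
    # Only meaningful once at least one earlier verification score exists.
    if (previous_completeness is not None
            and completeness - previous_completeness < cfg.min_improvement):
        return "diminishing_returns"
    return None


# Second iteration improved completeness by only 0.02, so the loop would stop.
print(stop_reason(0.40, 0.60, 0.38, 200_000, 1))  # diminishing_returns
```

Returning the reason, not just a boolean, makes it easy to log why a run ended, which matters when budget-driven stops and quality-driven stops need to be told apart.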
Experimental Results
VMAO was evaluated on 25 expert-curated market research queries across four categories (Performance Analysis, Competitive Intelligence, Financial Investigation, Strategic Assessment):
| Method | Completeness | Source Quality | Avg Tokens |
|---|---|---|---|
| Single-Agent | 3.1 | 2.6 | 100K |
| Static Pipeline | 3.5 | 3.2 | 350K |
| VMAO | 4.2 | 4.1 | 850K |
Key findings:
- +35% completeness (3.1 to 4.2) and +58% source quality (2.6 to 4.1) over the single-agent baseline
- Largest gains on open-ended Strategic Assessment queries (+53% completeness)
- The 8.5× token cost relative to the single-agent baseline (850K vs. 100K average tokens) reflects the thoroughness required for complex synthesis tasks
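The headline percentages follow directly from the table. The short check below recomputes them from the Single-Agent and VMAO rows; the dictionary layout and the helper name `relative_gain` are illustrative choices, not anything defined by the paper.

```python
# Scores and average token counts copied from the results table.
results = {
    "Single-Agent": {"completeness": 3.1, "source_quality": 2.6, "avg_tokens": 100_000},
    "VMAO": {"completeness": 4.2, "source_quality": 4.1, "avg_tokens": 850_000},
}


def relative_gain(new, baseline):
    """Percentage improvement of `new` over `baseline`."""
    return (new - baseline) / baseline * 100


base, vmao = results["Single-Agent"], results["VMAO"]

completeness_gain = relative_gain(vmao["completeness"], base["completeness"])
source_gain = relative_gain(vmao["source_quality"], base["source_quality"])
token_ratio = vmao["avg_tokens"] / base["avg_tokens"]

print(round(completeness_gain))  # 35
print(round(source_gain))        # 58
print(token_ratio)               # 8.5
```

The +53% figure for Strategic Assessment queries cannot be reproduced this way, because the table reports only aggregate scores and not per-category ones.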
Implementation
- Orchestration: LangGraph with Strands Agent framework on AWS Bedrock
- Models: Claude Sonnet 4.5 for execution; Claude Opus 4.5 for independent verification
- Safety: Tool call limiters (max 10 consecutive same-tool, 50 total), per-execution timeouts (600s), token tracking
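The tool call limits in the Safety item can be sketched as a small counter that rejects a call before it runs. This is an assumption-laden illustration, not the paper's code or a Strands or LangGraph API: the class and exception names are hypothetical, and only the two limits (10 consecutive calls to the same tool, 50 calls in total) come from the summary.

```python
class ToolCallLimitExceeded(RuntimeError):
    """Raised when an agent exceeds its tool call budget (hypothetical name)."""


class ToolCallLimiter:
    """Tracks tool calls for one agent execution (illustrative sketch)."""

    def __init__(self, max_consecutive_same_tool=10, max_total=50):
        self.max_consecutive_same_tool = max_consecutive_same_tool
        self.max_total = max_total
        self.total = 0
        self.last_tool = None
        self.streak = 0

    def record(self, tool_name):
        """Register an attempted call; raise if it would break a limit."""
        if self.total + 1 > self.max_total:
            raise ToolCallLimitExceeded(f"more than {self.max_total} tool calls")
        # A streak counts back-to-back calls to the same tool.
        streak = self.streak + 1 if tool_name == self.last_tool else 1
        if streak > self.max_consecutive_same_tool:
            raise ToolCallLimitExceeded(
                f"more than {self.max_consecutive_same_tool} consecutive calls to {tool_name}"
            )
        # Only commit the counters once the call is allowed.
        self.total += 1
        self.last_tool = tool_name
        self.streak = streak


limiter = ToolCallLimiter()
for _ in range(10):
    limiter.record("web_search")  # allowed: exactly 10 in a row

try:
    limiter.record("web_search")  # the 11th consecutive call is rejected
except ToolCallLimitExceeded as exc:
    print(f"blocked: {exc}")
```

Calling `record` before invoking the tool means a runaway loop is cut off without spending tokens on the rejected call. The 600s per-execution timeout would be enforced separately, for example by the surrounding task runner.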
Limitations
- Modest evaluation set (25 queries only)
- LLM-based verification cannot independently establish factual accuracy
- Framework tested only with Claude model family
Key Concepts Extracted
- Multi-Agent-Orchestration
- Plan-Execute-Verify-Replan
- Sub-Agents-Context-Isolation
- Dual-Agent-Design
- Back-Pressure-Mechanisms
Sources
- Zhang, Xing, Yanwei Cui, Guanghui Wang, Wei Qiu, Ziyuan Li, Fangwei Han, Yajing Huang, Hengzhi Qiu, Bing Zhu, and Peiyang He (2026). “Verified Multi-Agent Orchestration: A Plan-Execute-Verify-Replan Framework for Complex Query Resolution.” ICLR 2026 Workshop on MALGAI (Multi-Agent Learning: Generalization and Adaptation in Intelligence). arXiv:2603.11445. Available: https://arxiv.org/html/2603.11445
- Primary source: full system description, experimental setup, and results