Plan-Execute-Verify-Replan (PEVR) is an iterative agent control loop that treats task completion as a convergence problem rather than a single-pass operation. Instead of generating a plan and executing it blindly, PEVR introduces a verification gate after execution that determines whether results are sufficient — and replans adaptively if they are not.
The Four Stages
- Plan: A QueryPlanner decomposes the incoming task into a directed acyclic graph (DAG) of sub-tasks. Each node is an independent unit of work assigned to a specialized agent; edges encode dependencies between sub-tasks. Independent nodes can execute in parallel; dependent nodes wait for prerequisites.
- Execute: A DAGExecutor runs the sub-task graph, respecting topological order while maximizing parallelism. Context from upstream results propagates automatically to downstream nodes. Execution is bounded by resource limits (token budget, iteration cap, per-task timeouts).
- Verify: A ResultVerifier — a separate LLM acting as an independent evaluator, not the same model that produced the results — assesses completeness. It produces a score, identifies missing aspects, and issues recommendations. This separation of verification from production is an instance of Dual-Agent-Design: self-assessment by the producing agent is structurally weaker than independent evaluation.
- Replan: If verification finds gaps and stop conditions have not been met, an AdaptiveReplanner issues corrective sub-tasks — retries, new queries, or alternative strategies — while preserving already-accepted results. The loop then executes these new tasks and re-verifies.
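The four stages can be sketched as a single control loop. This is a minimal illustration, not the paper's implementation: the component classes (QueryPlanner, DAGExecutor, ResultVerifier, AdaptiveReplanner) are named in the text above, but their method names and the Verification record are assumptions made for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Verification:
    completeness: float          # 0.0-1.0 coverage estimate from the verifier
    missing_aspects: list[str]   # gaps the verifier identified
    recommendations: list[str]   # suggested corrective actions

def pevr_loop(task, planner, executor, verifier, replanner,
              threshold=0.80, max_iterations=3):
    """Iterate Plan -> Execute -> Verify -> Replan until converged or bounded.

    Interfaces are illustrative: planner.decompose, executor.run,
    verifier.assess, and replanner.corrective_plan are assumed method names.
    """
    plan = planner.decompose(task)               # Plan: task -> DAG of sub-tasks
    results = {}
    for iteration in range(max_iterations):      # iteration cap bounds the loop
        results.update(executor.run(plan))       # Execute: run the sub-task graph
        report = verifier.assess(task, results)  # Verify: independent evaluator
        if report.completeness >= threshold:     # stop condition: threshold reached
            break
        # Replan: corrective sub-tasks only; already-accepted results are kept
        plan = replanner.corrective_plan(task, results, report)
    return results, report
```

Note that the replanner receives the verifier's report, so corrective sub-tasks can target the identified gaps directly rather than re-running the whole graph.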
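The Execute stage's combination of topological order and parallelism can be illustrated with a small thread-pool scheduler. This is a sketch under assumed interfaces (the `dag` dict-of-prerequisites shape and the `run_subtask` callable are not from the paper):

```python
import concurrent.futures

def execute_dag(dag, run_subtask, max_workers=4):
    """Run a sub-task DAG: dag maps node -> set of prerequisite nodes.

    Nodes with no unmet prerequisites run in parallel; each node receives
    its upstream results as context. Returns {node: result}.
    """
    results = {}
    remaining = {node: set(deps) for node, deps in dag.items()}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining:
            # Frontier: nodes whose prerequisites have all produced results
            ready = [n for n, deps in remaining.items() if deps <= results.keys()]
            if not ready:
                raise ValueError("unsatisfiable dependencies: DAG expected")
            # Upstream context propagates to each downstream node
            futures = {n: pool.submit(run_subtask, n,
                                      {d: results[d] for d in remaining[n]})
                       for n in ready}
            for n, fut in futures.items():
                results[n] = fut.result()
                del remaining[n]
    return results
```

A production executor would add the resource bounds described above (per-task timeouts, token accounting), which are omitted here for brevity.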
Why Single-Pass Planning Fails
Linear planning (generate plan → execute → done) treats task completion as a solved problem at plan time. For complex, open-ended queries this assumption fails:
- Sub-tasks may surface information that invalidates earlier assumptions
- No mechanism exists to detect partial coverage or low-quality outputs
- Errors in early nodes propagate silently through the rest of the graph
PEVR converts task completion into a feedback loop in which each iteration narrows the gap between the current state and target completeness. It is an AI-native implementation of the same convergence logic behind PDCA (Plan-Do-Check-Act) in quality management.
Stop Conditions
PEVR loops are bounded by configurable Back-Pressure-Mechanisms to prevent infinite iteration:
- Completeness threshold reached (e.g., ≥80%)
- High confidence with partial coverage (e.g., 75% confidence + 50% complete)
- Diminishing returns (e.g., <5% improvement per iteration)
- Token budget exhausted
- Iteration cap reached (e.g., 3 cycles)
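These conditions compose into a single gate checked after each Verify step. A minimal sketch, using the example thresholds above (all numbers are illustrative defaults, not prescribed values):

```python
def should_stop(completeness, confidence, improvement, tokens_used, iteration, *,
                completeness_threshold=0.80,   # e.g., >=80% complete
                confidence_floor=0.75,         # e.g., 75% confidence ...
                partial_floor=0.50,            # ... with 50% coverage
                min_improvement=0.05,          # e.g., <5% gain per iteration
                token_budget=100_000,          # illustrative budget
                max_iterations=3):             # e.g., 3 cycles
    """Return (stop, reason) for the current PEVR iteration."""
    if completeness >= completeness_threshold:
        return True, "completeness threshold reached"
    if confidence >= confidence_floor and completeness >= partial_floor:
        return True, "high confidence with partial coverage"
    if improvement < min_improvement:
        return True, "diminishing returns"
    if tokens_used >= token_budget:
        return True, "token budget exhausted"
    if iteration >= max_iterations:
        return True, "iteration cap reached"
    return False, "continue"
```

Returning the triggering reason alongside the decision makes loop terminations auditable, which matters when tuning these back-pressure thresholds.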
Empirical Results
Zhang et al. (2026) implemented PEVR in VMAO (Verified Multi-Agent Orchestration) and evaluated it on 25 expert-curated market research queries. Results on a 1–5 scale:
| Method | Completeness | Source Quality |
|---|---|---|
| Single-Agent | 3.1 | 2.6 |
| Static Pipeline | 3.5 | 3.2 |
| VMAO (PEVR) | 4.2 | 4.1 |
- +35% completeness improvement over single-agent baseline
- +58% source quality improvement over single-agent baseline
- Largest gains on open-ended Strategic Assessment queries (+53% completeness)
See Zhang-et-al-2026-Verified-Multi-Agent-Orchestration for the full experimental setup.
Related Concepts
- Multi-Agent-Orchestration
- Back-Pressure-Mechanisms
- Zhang-et-al-2026-Verified-Multi-Agent-Orchestration
- Agentic-Flywheel
Sources
- Zhang, Xing, Yanwei Cui, Guanghui Wang, Wei Qiu, Ziyuan Li, Fangwei Han, Yajing Huang, Hengzhi Qiu, Bing Zhu, and Peiyang He (2026). “Verified Multi-Agent Orchestration: A Plan-Execute-Verify-Replan Framework for Complex Query Resolution.” ICLR 2026 Workshop on MALGAI. arXiv:2603.11445. Available: https://arxiv.org/abs/2603.11445
  - Primary source: full PEVR system design, DAG execution model, LLM-based verification, stop conditions, and empirical results
- Yao, Shunyu, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao (2022). “ReAct: Synergizing Reasoning and Acting in Language Models.” ICLR 2023. arXiv:2210.03629. Available: https://arxiv.org/abs/2210.03629
  - Establishes the foundational reasoning-action interleave pattern; PEVR extends this by externalizing the verification step to an independent agent rather than embedding it in the acting model’s trace
- Yao, Shunyu, Dian Yu, Jeffrey Zhao, Izhak Shafran, Tom Griffiths, Yuan Cao, and Karthik Narasimhan (2023). “Tree of Thoughts: Deliberate Problem Solving with Large Language Models.” NeurIPS 2023. arXiv:2305.10601. Available: https://arxiv.org/abs/2305.10601
  - Demonstrates LLM self-evaluation of intermediate states (“sure/maybe/impossible”) as a verification signal during planning; ToT’s state evaluator is a single-model precursor to PEVR’s independent verifier
- American Society for Quality (ASQ). “PDCA Cycle — What is the Plan-Do-Check-Act Cycle?” ASQ Quality Resources. Available: https://asq.org/quality-resources/pdca-cycle
  - Canonical reference for the PDCA iterative improvement cycle; PEVR is an LLM-native instantiation of the same Check→Act feedback loop formalized by Shewhart and Deming
- Shaikh, Salman, et al. (2025). “Plan Verification for LLM-Based Embodied Task Completion Agents.” arXiv:2509.02761. Available: https://arxiv.org/abs/2509.02761
  - Extends plan-verify-replan to embodied agents; a Judge LLM critiques action sequences and a Planner LLM applies revisions, yielding progressively cleaner trajectories — convergent evidence for the value of independent verification across agent domains
Note
This content was drafted with assistance from AI tools for research, organization, and initial content generation. All final content has been reviewed, fact-checked, and edited by the author to ensure accuracy and alignment with the author’s intentions and perspective.