Plan-Execute-Verify-Replan (PEVR) is an iterative agent control loop that treats task completion as a convergence problem rather than a single-pass operation. Instead of generating a plan and executing it blindly, PEVR introduces a verification gate after execution that determines whether results are sufficient — and replans adaptively if they are not.
The Four Stages
- Plan: A QueryPlanner decomposes the incoming task into a directed acyclic graph (DAG) of sub-tasks. Each node is an independent unit of work assigned to a specialized agent; edges encode dependencies between sub-tasks. Independent nodes can execute in parallel; dependent nodes wait for prerequisites.
- Execute: A DAGExecutor runs the sub-task graph, respecting topological order while maximizing parallelism. Context from upstream results propagates automatically to downstream nodes. Execution is bounded by resource limits (token budget, iteration cap, per-task timeouts).
- Verify: A ResultVerifier (a separate LLM acting as an independent evaluator, not the same model that produced the results) assesses completeness. It produces a score, identifies missing aspects, and issues recommendations. This separation of verification from production is an instance of Dual-Agent-Design: self-assessment by the producing agent is structurally weaker than independent evaluation.
- Replan: If verification finds gaps and stop conditions have not been met, an AdaptiveReplanner issues corrective sub-tasks (retries, new queries, or alternative strategies) while preserving already-accepted results. The loop then executes these new tasks and re-verifies.
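The Plan and Execute stages can be illustrated with a small scheduler. The following is a minimal sketch, not the DAGExecutor from the paper: it assumes the plan is a plain dictionary mapping each sub-task to its prerequisites, and `run_dag`, `run_task`, and `fake_agent` are hypothetical names introduced here. It uses the standard-library `graphlib.TopologicalSorter` to release sub-tasks as soon as their prerequisites finish and a thread pool to run independent sub-tasks concurrently. Resource limits such as token budgets and per-task timeouts are left out.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_dag(dependencies, run_task, max_workers=4):
    """Run each sub-task once all of its prerequisites have finished.

    dependencies: maps a sub-task name to the set of names it depends on.
    run_task(name, upstream): hypothetical agent call; `upstream` holds the
    results of the sub-task's direct prerequisites.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the plan is not a DAG
    results = {}
    in_flight = {}  # future -> sub-task name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every sub-task whose prerequisites are all done.
            for name in sorter.get_ready():
                upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
                in_flight[pool.submit(run_task, name, upstream)] = name
            # Wait for at least one running sub-task, then unlock its dependents.
            finished, _ = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in finished:
                name = in_flight.pop(future)
                results[name] = future.result()
                sorter.done(name)
    return results


# Toy plan: two independent sub-tasks feed one dependent summary.
plan = {
    "market_size": set(),
    "competitors": set(),
    "summary": {"market_size", "competitors"},
}


def fake_agent(name, upstream):
    # Stand-in for a specialized agent; reports which upstream results it saw.
    return f"{name} <- {sorted(upstream)}"


results = run_dag(plan, fake_agent)
```

In this toy plan, `market_size` and `competitors` can run in parallel, while `summary` is submitted only after both have completed and receives both results as context.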
Why Single-Pass Planning Fails
Linear planning (generate plan → execute → done) treats task completion as a solved problem at plan time. For complex, open-ended queries this assumption fails:
- Sub-tasks may surface information that invalidates earlier assumptions
- No mechanism exists to detect partial coverage or low-quality outputs
- Errors in early nodes propagate silently through the rest of the graph
PEVR converts task completion into a feedback loop in which each iteration narrows the gap between the current state and target completeness. This is an AI-native implementation of the same convergence logic behind PDCA (Plan-Do-Check-Act) in quality management.
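The feedback loop can be sketched as a short control function. This is an illustrative skeleton under stated assumptions, not the VMAO implementation: the `plan`, `execute`, `verify`, and `replan` callables, the `Verdict` type, and the toy stubs are all hypothetical, and only two stop conditions (a completeness threshold and an iteration cap) are modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Verdict:
    completeness: float                          # 0.0 to 1.0, judged by the verifier
    missing: list = field(default_factory=list)  # aspects still uncovered


def pevr_loop(task, plan, execute, verify, replan, threshold=0.8, max_iterations=3):
    """Repeat execute -> verify -> replan until the verifier is satisfied or the cap is hit."""
    accepted = {}         # results preserved across iterations
    pending = plan(task)  # initial sub-tasks
    history = []          # completeness score after each iteration
    for _ in range(max_iterations):
        accepted.update(execute(pending, accepted))
        verdict = verify(task, accepted)
        history.append(verdict.completeness)
        if verdict.completeness >= threshold or not verdict.missing:
            break
        pending = replan(task, verdict.missing)  # corrective sub-tasks only
    return accepted, history


# Toy stubs (hypothetical): the first plan is deliberately incomplete.
REQUIRED = {"pricing", "competitors", "regulation"}


def plan(task):
    return ["pricing"]


def execute(subtasks, accepted):
    return {name: f"notes on {name}" for name in subtasks}


def verify(task, accepted):
    covered = REQUIRED & accepted.keys()
    return Verdict(len(covered) / len(REQUIRED), sorted(REQUIRED - covered))


def replan(task, missing):
    return missing[:1]  # address one gap per iteration


accepted, history = pevr_loop("assess market entry", plan, execute, verify, replan, threshold=0.6)
```

With these stubs the first pass covers one of three required aspects, the replanner adds a second, and the loop stops once the score crosses the 0.6 threshold. Results accepted in the first pass are kept rather than recomputed.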
Stop Conditions
PEVR loops are bounded by configurable Back-Pressure-Mechanisms to prevent infinite iteration:
- Completeness threshold reached (e.g., ≥80%)
- High confidence with partial coverage (e.g., 75% confidence with 50% completeness)
- Diminishing returns (e.g., <5% improvement per iteration)
- Token budget exhausted
- Iteration cap reached (e.g., 3 cycles)
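The conditions above could be combined into a single check that runs after each verification step. The sketch below is one possible reading, not the paper's code: `LoopState` and `stop_reason` are hypothetical names, the thresholds are the example values from the list, the order of the checks is an assumption, and "improvement" is interpreted here as the absolute change in completeness between consecutive iterations.

```python
from dataclasses import dataclass


@dataclass
class LoopState:
    completeness: float           # latest verifier score, 0.0 to 1.0
    confidence: float             # verifier's confidence, 0.0 to 1.0
    previous_completeness: float  # score from the prior iteration
    tokens_used: int
    iteration: int                # number of completed cycles


def stop_reason(state, *, token_budget, max_iterations=3):
    """Return the first stop condition that applies, or None to keep looping."""
    if state.completeness >= 0.80:
        return "completeness threshold reached"
    if state.confidence >= 0.75 and state.completeness >= 0.50:
        return "high confidence with partial coverage"
    # Assumption: gains are only comparable from the second iteration onward.
    if state.iteration > 1 and state.completeness - state.previous_completeness < 0.05:
        return "diminishing returns"
    if state.tokens_used >= token_budget:
        return "token budget exhausted"
    if state.iteration >= max_iterations:
        return "iteration cap reached"
    return None


# Hypothetical states; the token budget of 10_000 is an arbitrary example value.
done = stop_reason(
    LoopState(completeness=0.85, confidence=0.9, previous_completeness=0.6,
              tokens_used=1_000, iteration=1),
    token_budget=10_000,
)
stalled = stop_reason(
    LoopState(completeness=0.62, confidence=0.6, previous_completeness=0.60,
              tokens_used=1_000, iteration=2),
    token_budget=10_000,
)
keep_going = stop_reason(
    LoopState(completeness=0.40, confidence=0.5, previous_completeness=0.0,
              tokens_used=1_000, iteration=1),
    token_budget=10_000,
)
```

Returning a reason string rather than a boolean makes it easy to log why a loop ended, which helps when tuning the thresholds.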
Empirical Results
Zhang et al. (2026) implemented PEVR in VMAO (Verified Multi-Agent Orchestration) and tested it on 25 expert-curated market research queries. Scores are on a 1–5 scale:
| Method | Completeness | Source Quality |
|---|---|---|
| Single-Agent | 3.1 | 2.6 |
| Static Pipeline | 3.5 | 3.2 |
| VMAO (PEVR) | 4.2 | 4.1 |
- +35% completeness improvement over single-agent baseline
- +58% source quality improvement over single-agent baseline
- Largest gains on open-ended Strategic Assessment queries (+53% completeness)
See Zhang-et-al-2026-Verified-Multi-Agent-Orchestration for the full experimental setup.
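The two headline percentages are relative gains over the single-agent baseline and can be reproduced from the table. The short check below assumes nothing beyond the table values; `relative_gain` is a helper name introduced here. The +53% figure for Strategic Assessment queries is a per-category result and cannot be derived from the aggregate table.

```python
def relative_gain(baseline, improved):
    """Percentage improvement of `improved` over `baseline`."""
    return (improved - baseline) / baseline * 100


# Scores from the table above (1-5 scale): Single-Agent vs. VMAO (PEVR).
completeness_gain = relative_gain(3.1, 4.2)
source_quality_gain = relative_gain(2.6, 4.1)

print(f"completeness: +{completeness_gain:.1f}%")
print(f"source quality: +{source_quality_gain:.1f}%")
```

Rounded to whole percentages, these match the +35% and +58% figures reported in the list.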
Related Concepts
- Multi-Agent-Orchestration
- Back-Pressure-Mechanisms
- Zhang-et-al-2026-Verified-Multi-Agent-Orchestration
- Agentic-Flywheel
Sources
- Zhang, Xing, Yanwei Cui, Guanghui Wang, Wei Qiu, Ziyuan Li, Fangwei Han, Yajing Huang, Hengzhi Qiu, Bing Zhu, and Peiyang He (2026). “Verified Multi-Agent Orchestration: A Plan-Execute-Verify-Replan Framework for Complex Query Resolution.” ICLR 2026 Workshop on MALGAI. arXiv:2603.11445. Available: https://arxiv.org/abs/2603.11445
  - Primary source: full PEVR system design, DAG execution model, LLM-based verification, stop conditions, and empirical results
- Yao, Shunyu, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao (2022). “ReAct: Synergizing Reasoning and Acting in Language Models.” ICLR 2023. arXiv:2210.03629. Available: https://arxiv.org/abs/2210.03629
  - Establishes the foundational reasoning-action interleave pattern; PEVR extends this by externalizing the verification step to an independent agent rather than embedding it in the acting model’s trace
- Yao, Shunyu, Dian Yu, Jeffrey Zhao, Izhak Shafran, Tom Griffiths, Yuan Cao, and Karthik Narasimhan (2023). “Tree of Thoughts: Deliberate Problem Solving with Large Language Models.” NeurIPS 2023. arXiv:2305.10601. Available: https://arxiv.org/abs/2305.10601
  - Demonstrates LLM self-evaluation of intermediate states (“sure/maybe/impossible”) as a verification signal during planning; ToT’s state evaluator is a single-model precursor to PEVR’s independent verifier
- American Society for Quality (ASQ). “PDCA Cycle — What is the Plan-Do-Check-Act Cycle?” ASQ Quality Resources. Available: https://asq.org/quality-resources/pdca-cycle
  - Canonical reference for the PDCA iterative improvement cycle; PEVR is an LLM-native instantiation of the same Check→Act feedback loop formalized by Shewhart and Deming
- Shaikh, Salman, et al. (2025). “Plan Verification for LLM-Based Embodied Task Completion Agents.” arXiv:2509.02761. Available: https://arxiv.org/abs/2509.02761
  - Extends plan-verify-replan to embodied agents; a Judge LLM critiques action sequences and a Planner LLM applies revisions, yielding progressively cleaner trajectories; convergent evidence for the value of independent verification across agent domains
Note
This note was researched and drafted with AI.