Plan-Execute-Verify-Replan

Plan-Execute-Verify-Replan (PEVR) is an iterative agent control loop that treats task completion as a convergence problem rather than a single-pass operation. Instead of generating a plan and executing it blindly, PEVR introduces a verification gate after execution that determines whether results are sufficient — and replans adaptively if they are not.

The Four Stages

Plan: A QueryPlanner decomposes the incoming task into a directed acyclic graph (DAG) of sub-tasks. Each node is an independent unit of work assigned to a specialized agent; edges encode dependencies between sub-tasks. Independent nodes can execute in parallel; dependent nodes wait for prerequisites.
Execute: A DAGExecutor runs the sub-task graph, respecting topological order while maximizing parallelism. Context from upstream results propagates automatically to downstream nodes. Execution is bounded by resource limits (token budget, iteration cap, per-task timeouts).
Verify: A ResultVerifier — a separate LLM acting as an independent evaluator, not the same model that produced the results — assesses completeness. It produces a score, identifies missing aspects, and issues recommendations. This separation of verification from production is an instance of Dual-Agent-Design: self-assessment by the producing agent is structurally weaker than independent evaluation.
Replan: If verification finds gaps and stop conditions have not been met, an AdaptiveReplanner issues corrective sub-tasks — retries, new queries, or alternative strategies — while preserving already-accepted results. The loop then executes these new tasks and re-verifies.
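
The four stages can be sketched as a small control loop. This is a minimal illustration, not VMAO's implementation: the names `execute_dag`, `pevr`, `Verdict`, and the toy planner, verifier, and replanner are all hypothetical, completeness is assumed to be a score in [0, 1], and sub-tasks run sequentially in topological order, whereas a real DAGExecutor would also run independent nodes in parallel.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable

# Hypothetical types for illustration; not taken from VMAO.
Plan = dict[str, set[str]]      # sub-task -> sub-tasks it depends on
Results = dict[str, str]        # sub-task -> accepted output


@dataclass
class Verdict:
    score: float                              # completeness in [0, 1]
    missing: list[str] = field(default_factory=list)


def execute_dag(plan: Plan, run_task: Callable[[str, Results], str],
                accepted: Results) -> Results:
    """Run sub-tasks in topological order, feeding upstream results downstream."""
    results = dict(accepted)                  # preserve already-accepted results
    for task in TopologicalSorter(plan).static_order():
        if task in results:
            continue                          # finished in an earlier iteration
        upstream = {dep: results[dep] for dep in plan.get(task, set())}
        results[task] = run_task(task, upstream)
    return results


def pevr(task, plan_fn, run_task, verify_fn, replan_fn,
         threshold: float = 0.8, max_iters: int = 3):
    """Plan once, then loop Execute -> Verify -> Replan until a stop condition."""
    plan = plan_fn(task)
    results: Results = {}
    verdict = Verdict(score=0.0)
    for _ in range(max_iters):
        results = execute_dag(plan, run_task, results)
        verdict = verify_fn(task, results)    # independent evaluator
        if verdict.score >= threshold or not verdict.missing:
            break
        plan = replan_fn(task, verdict.missing)   # corrective sub-tasks only
    return results, verdict


# Toy components standing in for the LLM-backed planner, agents, and verifier.
def plan_fn(task):
    return {"summary": {"market_size", "competitors"}}

def run_task(name, upstream):
    return f"{name}({', '.join(sorted(upstream))})"

def verify_fn(task, results):
    missing = [t for t in ("pricing",) if t not in results]
    return Verdict(score=1.0 if not missing else 0.6, missing=missing)

def replan_fn(task, missing):
    return {m: {"summary"} for m in missing}

results, verdict = pevr("market research query", plan_fn, run_task,
                        verify_fn, replan_fn)
print(sorted(results), verdict.score)
```

In the toy run, the first pass leaves "pricing" uncovered, so the replanner adds one corrective sub-task that depends on the already-accepted "summary"; the second pass executes only that new node and the verifier then accepts the result.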

Why Single-Pass Planning Fails

Linear planning (generate plan → execute → done) treats task completion as a solved problem at plan time. For complex, open-ended queries this assumption fails:

- Sub-tasks may surface information that invalidates earlier assumptions
- No mechanism exists to detect partial coverage or low-quality outputs
- Errors in early nodes propagate silently through the rest of the graph

PEVR converts task completion into a feedback loop in which each iteration narrows the gap between the current state and the target completeness. It is an AI-native implementation of the same convergence logic behind PDCA (Plan-Do-Check-Act) in quality management.

Stop Conditions

PEVR loops are bounded by configurable Back-Pressure-Mechanisms to prevent infinite iteration:

- Completeness threshold reached (e.g., ≥80%)
- High confidence with partial coverage (e.g., 75% confidence + 50% complete)
- Diminishing returns (e.g., <5% improvement per iteration)
- Token budget exhausted
- Iteration cap reached (e.g., 3 cycles)
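
The stop conditions above could be combined into a single check that runs after each verification step. The sketch below is an assumption-laden illustration: `StopConfig`, `stop_reason`, the token budget value, and the order in which conditions are tested are hypothetical, the example thresholds are treated as inclusive lower bounds, and "improvement" is read as the absolute change in the completeness score between iterations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StopConfig:
    # Defaults mirror the example values in the list above.
    completeness_threshold: float = 0.80
    partial_confidence: float = 0.75      # "high confidence" bound (assumed inclusive)
    partial_completeness: float = 0.50    # "partial coverage" bound (assumed inclusive)
    min_improvement: float = 0.05         # assumed absolute change per iteration
    token_budget: int = 100_000           # hypothetical value
    max_iterations: int = 3


def stop_reason(cfg: StopConfig, completeness: float, confidence: float,
                prev_completeness: Optional[float], tokens_used: int,
                iteration: int) -> Optional[str]:
    """Return the name of the first stop condition that fires, else None."""
    if completeness >= cfg.completeness_threshold:
        return "completeness_threshold"
    if (confidence >= cfg.partial_confidence
            and completeness >= cfg.partial_completeness):
        return "confident_partial"
    if tokens_used >= cfg.token_budget:
        return "token_budget"
    if iteration >= cfg.max_iterations:
        return "iteration_cap"
    if (prev_completeness is not None
            and completeness - prev_completeness < cfg.min_improvement):
        return "diminishing_returns"
    return None  # keep looping: replan and execute again


cfg = StopConfig()
print(stop_reason(cfg, 0.85, 0.50, None, 1_000, 1))
print(stop_reason(cfg, 0.40, 0.50, 0.38, 1_000, 2))
print(stop_reason(cfg, 0.40, 0.50, 0.20, 1_000, 1))
```

Returning the reason rather than a bare boolean makes it easy to log why a loop ended, which helps when tuning thresholds.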

Empirical Results

Zhang et al. (2026) implemented PEVR in VMAO (Verified Multi-Agent Orchestration) and evaluated it on 25 expert-curated market research queries. Scores are on a 1–5 scale:

Method	Completeness	Source Quality
Single-Agent	3.1	2.6
Static Pipeline	3.5	3.2
VMAO (PEVR)	4.2	4.1

- Completeness: +35% over the single-agent baseline (3.1 → 4.2)
- Source quality: +58% over the single-agent baseline (2.6 → 4.1)
- Largest gains on open-ended Strategic Assessment queries (+53% completeness)
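
The first two percentages are relative gains over the single-agent scores in the table, which can be checked with a few lines of arithmetic. The helper name `relative_gain` is only for illustration; the input scores are the table values.

```python
def relative_gain(new: float, baseline: float) -> float:
    """Percentage improvement of `new` over `baseline`."""
    return (new - baseline) / baseline * 100


# (completeness, source quality) on the 1-5 scale, from the table above.
single_agent = (3.1, 2.6)
vmao = (4.2, 4.1)

completeness_gain = relative_gain(vmao[0], single_agent[0])
source_quality_gain = relative_gain(vmao[1], single_agent[1])

print(f"completeness: +{completeness_gain:.0f}%")      # completeness: +35%
print(f"source quality: +{source_quality_gain:.0f}%")  # source quality: +58%
```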

See Zhang-et-al-2026-Verified-Multi-Agent-Orchestration for the full experimental setup.

Sources

Zhang, Xing, Yanwei Cui, Guanghui Wang, Wei Qiu, Ziyuan Li, Fangwei Han, Yajing Huang, Hengzhi Qiu, Bing Zhu, and Peiyang He (2026). “Verified Multi-Agent Orchestration: A Plan-Execute-Verify-Replan Framework for Complex Query Resolution.” ICLR 2026 Workshop on MALGAI. arXiv:2603.11445. Available: https://arxiv.org/abs/2603.11445
- Primary source: full PEVR system design, DAG execution model, LLM-based verification, stop conditions, and empirical results
Yao, Shunyu, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao (2022). “ReAct: Synergizing Reasoning and Acting in Language Models.” ICLR 2023. arXiv:2210.03629. Available: https://arxiv.org/abs/2210.03629
- Establishes the foundational reasoning-action interleave pattern; PEVR extends this by externalizing the verification step to an independent agent rather than embedding it in the acting model’s trace
Yao, Shunyu, Dian Yu, Jeffrey Zhao, Izhak Shafran, Tom Griffiths, Yuan Cao, and Karthik Narasimhan (2023). “Tree of Thoughts: Deliberate Problem Solving with Large Language Models.” NeurIPS 2023. arXiv:2305.10601. Available: https://arxiv.org/abs/2305.10601
- Demonstrates LLM self-evaluation of intermediate states (“sure/maybe/impossible”) as a verification signal during planning; ToT’s state evaluator is a single-model precursor to PEVR’s independent verifier
American Society for Quality (ASQ). “PDCA Cycle — What is the Plan-Do-Check-Act Cycle?” ASQ Quality Resources. Available: https://asq.org/quality-resources/pdca-cycle
- Canonical reference for the PDCA iterative improvement cycle; PEVR is an LLM-native instantiation of the same Check→Act feedback loop formalized by Shewhart and Deming
Shaikh, Salman, et al. (2025). “Plan Verification for LLM-Based Embodied Task Completion Agents.” arXiv:2509.02761. Available: https://arxiv.org/abs/2509.02761
- Extends plan-verify-replan to embodied agents; a Judge LLM critiques action sequences and a Planner LLM applies revisions, yielding progressively cleaner trajectories — convergent evidence for the value of independent verification across agent domains

Note

This note was researched and drafted with AI.
