Complete Bibliographic Citation

Bui, Nghi D. Q. (2026). “Building Effective AI Coding Agents for the Terminal: Scaffolding, Harness, Context Engineering, and Lessons Learned.” arXiv preprint, arXiv:2603.05344 [cs.AI]. Available: https://arxiv.org/abs/2603.05344


Summary

This paper presents OPENDEV, an open-source Rust-based CLI coding agent, and provides formal academic definitions for core concepts in AI agent engineering. It is among the most technically rigorous sources on Harness Engineering, grounding concepts like context compaction, dual-agent design, and safety-layered orchestration in a working implementation.

The paper’s central thesis: effective autonomous AI coding assistance is not primarily a model problem — it is an engineering problem. The harness, scaffolding, and context management infrastructure determine whether a capable LLM becomes a reliable agent.

Formal Definitions

The paper establishes precise vocabulary:

  • Scaffolding: The construction phase that runs before the first user prompt. Covers system prompt compilation, tool schema building, and subagent registration. Every agent in OPENDEV is fully constructed before the conversation lifecycle begins.
  • Harness: The runtime orchestration infrastructure that transforms a stateless LLM into a persistent, tool-using agent. Responsible for tool execution, context management, safety enforcement, and session persistence.
  • Context Engineering: Treating context management as a first-class engineering concern, not an afterthought. OPENDEV implements four subsystems: System Reminders, Prompt Composer, Memory, and Compaction.
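The scaffolding idea (an agent fully constructed before the first user message) can be illustrated with a toy sketch. This is schematic Python, not OPENDEV's actual Rust code; the `ToolSpec`, `Agent`, and `scaffold` names are invented for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str
    read_only: bool

@dataclass(frozen=True)
class Agent:
    system_prompt: str
    tools: tuple      # compiled tool schemas
    subagents: tuple  # registered subagents

def scaffold(base_prompt, reminders, tools, subagents=()):
    """Construction phase: compile the prompt, build tool schemas, and
    register subagents before the first user message arrives."""
    prompt = "\n\n".join([base_prompt, *reminders])
    return Agent(system_prompt=prompt, tools=tuple(tools), subagents=tuple(subagents))
```

The point mirrored here is that the agent object is immutable and complete before the conversation lifecycle begins; the harness then only executes against it.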

Dual-Agent Design

OPENDEV separates planning from execution through two agent modes:

  • Plan Mode: A read-only Planner subagent explores the codebase, analyzes patterns, and produces structured plans requiring user approval before any writes occur
  • Normal Mode: Full read-write tool access for implementation

The key insight: write operations are excluded from the Planner’s tool schema entirely — the LLM never sees tool definitions it cannot use. This architectural constraint eliminates write attempts during planning by design, not by instruction.
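A minimal sketch of this schema-level exclusion, with an invented tool table (the real tool set and schema format are OPENDEV's own):

```python
# Invented tool table: each entry records whether the tool can mutate state.
TOOLS = {
    "read_file":  {"writes": False},
    "grep":       {"writes": False},
    "write_file": {"writes": True},
    "run_shell":  {"writes": True},
}

def tool_schema(mode):
    """In plan mode, write-capable tools are omitted from the schema
    entirely, so the model never sees a definition it cannot use."""
    if mode == "plan":
        return {name: spec for name, spec in TOOLS.items() if not spec["writes"]}
    return dict(TOOLS)
```

Because the planner's schema simply has no write entries, no instruction-following is required to keep planning read-only.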

Adaptive Context Compaction

As token budgets approach limits, OPENDEV applies a five-stage compaction pipeline integrated directly into the Extended ReAct reasoning loop. Each stage applies progressively more aggressive reduction to older observations. This addresses context rot — the degradation of reasoning quality as older, lower-relevance content accumulates in the window.
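The pipeline shape can be sketched as a budget-driven pass over reduction stages. The three stages below are illustrative stand-ins (the paper describes five stages, whose exact transformations are not reproduced here), and token counts use a rough 4-characters-per-token estimate:

```python
def compact(history, budget, stages):
    """Apply progressively more aggressive stages until the history fits the budget."""
    def size(msgs):
        return sum(len(m) // 4 for m in msgs)  # rough token estimate
    for stage in stages:
        if size(history) <= budget:
            break
        history = stage(history)
    return history

def truncate_old(history):
    """Stage 1: truncate the body of all but the 4 most recent messages."""
    return [m[:200] for m in history[:-4]] + history[-4:]

def summarize_old(history):
    """Stage 2: replace all but the 4 most recent messages with a stub summary."""
    if len(history) <= 4:
        return history
    return ["<summary of %d older messages>" % (len(history) - 4)] + history[-4:]

def drop_old(history):
    """Stage 3: keep only the 4 most recent messages."""
    return history[-4:]
```

Recent observations survive every stage; only older, lower-relevance content is degraded, which is the mechanism aimed at context rot.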

Lazy Tool Discovery

External tools are discovered on-demand via MCP (Model Context Protocol) rather than loaded upfront. A search_tools mechanism provides keyword-scored tool discovery, keeping the initial system prompt lean while preserving access to a large tool ecosystem.
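A keyword-scored lookup in this spirit might look like the following; the scoring rule and catalog format are assumptions, not the actual search_tools implementation:

```python
def search_tools(query, catalog, top_k=3):
    """Rank catalog entries by keyword overlap with the query; the scoring
    rule here is a simple stand-in for the real mechanism."""
    terms = set(query.lower().split())
    def score(entry):
        words = (entry["name"] + " " + entry["description"]).lower().split()
        return sum(1 for w in words if w in terms)
    ranked = sorted(catalog, key=score, reverse=True)
    return [entry["name"] for entry in ranked[:top_k] if score(entry) > 0]
```

Only the names of matching tools need to enter the context; full schemas can be fetched on demand, which is what keeps the initial system prompt lean.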

Memory Accumulation

Cross-session continuity is achieved through two systems:

  • Episodic Memory: Summaries of historical conversations
  • Working Memory: Current session context

Together, these enable the agent to accumulate project-specific knowledge across sessions — patterns, strategies, and codebase-specific conventions that would otherwise be re-derived from scratch.
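A minimal model of the two stores, assuming a caller-supplied summarizer (the actual summarization and persistence mechanisms are OPENDEV's own):

```python
class Memory:
    """Two stores: episodic (cross-session summaries) and working (current session)."""

    def __init__(self):
        self.episodic = []  # summaries of historical conversations
        self.working = []   # notes from the current session

    def end_session(self, summarize):
        """Fold the working store into episodic memory via a summarizer."""
        if self.working:
            self.episodic.append(summarize(self.working))
            self.working = []

    def recall(self):
        """What the next session starts from."""
        return list(self.episodic)
```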

Safety: Defense-in-Depth

Five independent layers prevent harmful operations:

  1. Prompt-Level Guardrails: Security policies in the system prompt
  2. Schema-Level Restrictions: Plan-mode whitelist, per-subagent tool filtering
  3. Runtime Approval System: Manual, Semi-Auto, and Auto levels with persistent permissions
  4. Tool-Level Validation: DANGEROUS_PATTERNS blocklist, timeouts, stale-read detection
  5. Lifecycle Hooks: User-defined pre-execution blocking and argument mutation

Design principle: a failure in any single layer does not compromise the system, because each layer operates independently.
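The independence property can be sketched as a conjunction of guards, where any one guard can veto an operation. The patterns and approval levels below are illustrative, not the real DANGEROUS_PATTERNS list:

```python
import re

# Illustrative blocklist; the real DANGEROUS_PATTERNS set is OPENDEV's own.
DANGEROUS_PATTERNS = [r"\brm\s+-rf\s+/", r"\bmkfs\b", r">\s*/dev/sd"]

def pattern_guard(command):
    """Tool-level validation: reject commands matching a dangerous pattern."""
    return not any(re.search(p, command) for p in DANGEROUS_PATTERNS)

def approval_guard(command, level, approved):
    """Runtime approval: the auto level passes everything; otherwise the
    command must carry a persistent approval."""
    return level == "auto" or command in approved

def run_checked(command, guards):
    """Defense in depth: every independent guard must pass; any one can veto."""
    return all(guard(command) for guard in guards)
```

Each guard knows nothing about the others, so a bug or bypass in one layer leaves the rest of the checks intact.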

Compound AI Architecture

OPENDEV routes different workloads to different LLMs — five distinct model roles (normal execution, deliberation, self-critique, vision, fallback). This reflects a broader trend: state-of-the-art AI results increasingly come from systems that compose multiple models and tools rather than from a single model call.
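Routing can be reduced to a small dispatch table; the role names follow the paper, but the model identifiers and routing predicates here are invented:

```python
# Role names from the paper; model identifiers are placeholders.
MODEL_ROLES = {
    "normal": "model-a",
    "deliberation": "model-b",
    "self_critique": "model-c",
    "vision": "model-d",
    "fallback": "model-e",
}

def route(task):
    """Dispatch a workload to the model role suited to it; unknown roles
    fall through to the fallback model."""
    if task.get("has_image"):
        return MODEL_ROLES["vision"]
    if task.get("needs_plan"):
        return MODEL_ROLES["deliberation"]
    return MODEL_ROLES.get(task.get("role", "normal"), MODEL_ROLES["fallback"])
```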

Key Lessons

  • Context pressure is the central constraint: The context window is the limiting resource in long-running coding sessions, not model capability
  • Steer behavior over long horizons with explicit decision trees, not prompt instructions alone
  • Safety through architectural constraints is more reliable than safety through instructions
  • Design for approximate outputs: terminal environments require handling imperfect LLM outputs gracefully

Sources

  • Bui, Nghi D. Q. (2026). “Building Effective AI Coding Agents for the Terminal: Scaffolding, Harness, Context Engineering, and Lessons Learned.” arXiv preprint, arXiv:2603.05344 [cs.AI]. Available: https://arxiv.org/abs/2603.05344
    • Primary source: full system description (OPENDEV), formal definitions, experimental implementation, and lessons learned

Fair Use Notice

This note contains summaries and analysis of copyrighted material for educational and commentary purposes. This constitutes fair use/fair dealing under copyright law. The original work remains the property of its copyright holders. Full citation provided above.

Note

This content was drafted with assistance from AI tools for research, organization, and initial content generation. All final content has been reviewed, fact-checked, and edited by the author to ensure accuracy and alignment with the author’s intentions and perspective.