The Fundamental Shift

AI coding agents change the economics of software development — but not the need for engineering discipline. The shift is architectural: the unit of engineering changes from the codebase alone to the codebase plus the system that generates and maintains it.

The foundational equation captures this precisely:

coding agent = AI model(s) + harness

The model provides language understanding and code generation. The harness provides everything that makes the model reliable in practice: configuration, constraints, tools, and feedback loops. Practitioners who treat this as a model problem — expecting better AI to solve reliability — consistently underperform those who treat it as a harness problem. The HumanLayer team frames this plainly: most agent quality failures are configuration failures, not model failures (Horthy 2026).

Harness-Engineering is the discipline of building, maintaining, and improving this harness systematically. It is not prompt engineering, a practice of one-off adjustments, but a first-class engineering concern with its own lifecycle.

What the Harness Is

Agent-Harness-Components defines six configurable levers, each addressing a distinct failure mode in AI-Coding-Agent-Architecture:

  • AGENTS.md / CLAUDE.md files — Markdown documents injected deterministically into the system prompt before every task. Most effective when containing only non-inferable details: custom build commands, known failure patterns, project conventions that differ from defaults
  • MCP servers — Extend the agent beyond file I/O to external tools and capabilities. Effective use: prefer CLI wrappers for well-known tools, which carry less context overhead and benefit from richer model training
  • Skills — Progressive disclosure mechanism. Instructions load on demand, not upfront, preserving context budget for the task at hand
  • Sub-agents — Isolated instances with independent context windows and filtered tool access. The value is context isolation, not role specialisation
  • Hooks — Scripts executing at agent lifecycle events (pre-tool, post-tool, on-stop). Pattern: silent on pass, structured error output on failure, forcing resolution before completion
  • Back-Pressure-Mechanisms — verification systems enabling the agent to self-check its own work; the ability to self-verify is the strongest predictor of agent success
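
As a concrete illustration of the hook pattern (silent on pass, structured error output on failure), a post-tool hook might look like the sketch below. The check command, JSON field names, and exit codes are illustrative assumptions, not any specific agent's API:

```python
#!/usr/bin/env python3
"""Illustrative post-tool hook: silent on pass, structured error on failure."""
import json
import subprocess
import sys


def run_check(cmd):
    """Run a verification command and capture its output."""
    return subprocess.run(cmd, capture_output=True, text=True)


def main() -> int:
    # Stand-in check; a real hook would run the project's own linter or type-checker.
    result = run_check([sys.executable, "-c", "print('ok')"])
    if result.returncode == 0:
        return 0  # silent on pass: no output, the agent simply continues
    # Structured error on failure, forcing resolution before the task completes.
    error = {"status": "fail", "check": "verify", "detail": result.stderr.strip()}
    print(json.dumps(error), file=sys.stderr)
    return 2


exit_code = main()  # a real hook script would end with sys.exit(exit_code)
```

The design choice worth noting: the hook emits nothing on success, so passing runs consume no context budget, while the structured failure payload gives the agent something parseable to act on.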

The design principle is additive: start with nothing, add a component only when an observed failure demands it. Each component consumes context budget and adds maintenance surface. Components are not features to install — they are responses to failures.

How the Harness Improves

The Iterative-Signal-Loop is the operational philosophy that turns failures into infrastructure:

  1. Observe — the agent produces incorrect output or makes a mistake
  2. Diagnose — identify what is missing: documentation, a tool, a guardrail
  3. Engineer — add or modify the harness element
  4. Prevent — the harness encodes the lesson; the mistake cannot recur

Every loop iteration produces one of two outputs:

  • Documentation update — add instructions to AGENTS.md so the agent has correct context next time (fast path)
  • Programmatic tool — build a linter, type-check, or structured verification script that enforces correctness deterministically (durable path)
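
The durable path can be as small as a script that encodes one learned rule deterministically. The rule below (flagging unresolved TODO markers under a source directory) is an invented example for illustration, not taken from the sources:

```python
#!/usr/bin/env python3
"""Sketch of a durable-path check: one learned rule encoded deterministically."""
import pathlib
import sys


def find_violations(root: pathlib.Path) -> list:
    """Return 'path:line: message' entries for every unresolved TODO marker."""
    violations = []
    for path in sorted(root.rglob("*.py")):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if "TODO" in line:
                violations.append(f"{path}:{lineno}: unresolved TODO")
    return violations


def check(root: str = "src") -> int:
    """Print violations and return the exit status a hook would use."""
    problems = find_violations(pathlib.Path(root))
    for problem in problems:
        print(problem, file=sys.stderr)
    return 1 if problems else 0  # the non-zero exit is the back-pressure signal
```

Wired into a pre-commit hook or agent lifecycle event, the non-zero exit status forces the agent to resolve the violation before the task can complete.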

The critical reframing that makes this work: failures are attributed to the harness, not the model. Mistele (2026) names this directly: “The model is probably fine. It’s just a skill issue.” Attribution to the model produces dead ends — waiting for a better model. Attribution to the harness prompts engineering action. This is structurally identical to Deming’s quality insight: defects are process signals, not worker failures. The loop only functions when failures are treated as diagnostic data.

The Flywheel

When enough iterations of the signal loop have accumulated, momentum builds into the Agentic-Flywheel. Boeckeler and Morris (2026) define five progressive stages:

  1. Information foundation — agents receive performance data from existing tests and evaluations
  2. Signal enrichment — richer feedback is added: pipeline metrics, failure validations, production data
  3. Recommendation generation — agents propose improvements to upstream workflow components; the harness becomes self-evaluating
  4. Interactive approval — Humans-On-The-Loop review agent recommendations and direct implementation
  5. Automated decision-making — high-confidence improvements receive automatic approval; the flywheel accelerates
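
The routing logic behind the last two stages can be sketched in a few lines. All names and the 0.9 threshold are illustrative assumptions; the source defines the stages, not an implementation:

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    """An agent-proposed harness improvement with a confidence score."""
    description: str
    confidence: float  # 0.0-1.0, derived from accumulated signal data


# Set by the human governor; raised only after automated decisions have been
# validated against human review outcomes.
AUTO_APPROVE_THRESHOLD = 0.9


def route(rec: Recommendation) -> str:
    """Auto-approve high-confidence items; queue the rest for human review."""
    if rec.confidence >= AUTO_APPROVE_THRESHOLD:
        return "auto-approved"
    return "human-review"
```

For example, `route(Recommendation("add lint hook", 0.95))` would be auto-approved, while a 0.4-confidence proposal would be queued for a Humans-On-The-Loop review.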

The flywheel changes the human role fundamentally: from executor (manually identifying and fixing each failure) to governor (setting confidence thresholds and reviewing agent recommendations). This is not humans being removed — it is humans being elevated.

Collins (2001) introduced the flywheel metaphor to describe compounding momentum through consistent reinforcing actions. The same physics applies: each harness improvement reduces failure rate → cleaner signal data → more confident recommendations → higher automation threshold → more human attention available for higher-order governance.

The key safety constraint: automation follows demonstrated human-validated confidence, it does not precede it.

Implications for Engineering Practice

Three non-obvious implications for teams adopting this framework:

  • Runtime constraints enable autonomy — counter-intuitively, adding verification steps and guardrails to the harness increases reliable agent autonomy. Unconstrained agents produce outputs that require constant human correction; well-constrained agents complete tasks with minimal intervention. Constraint is the enabler of trust, not its enemy
  • Existing tooling is a foundation — pre-commit hooks, linters, type-checkers, and build steps already present in most codebases are the earliest harness components. Audit what exists before building new infrastructure
  • Architectures converge toward harness-ability — codebases with clear module boundaries, property-based tests, and deterministic build processes are dramatically easier to work on with AI agents. Harness engineering creates pressure toward cleaner software architecture as a side effect
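
The audit of existing tooling can start as a simple inventory script. The file names below are common conventions, not an exhaustive or authoritative list:

```python
import pathlib

# Tooling files that already act as harness components (illustrative,
# non-exhaustive list of conventional names).
CANDIDATES = {
    ".pre-commit-config.yaml": "pre-commit hooks",
    "Makefile": "build steps",
    "pyproject.toml": "linter / type-checker configuration",
    "tsconfig.json": "TypeScript type-checking",
}


def audit(root: pathlib.Path) -> dict:
    """Map each tooling file present under root to the harness role it plays."""
    return {name: role for name, role in CANDIDATES.items()
            if (root / name).exists()}
```

Running this against a repository root yields the earliest harness components already in place, before any new infrastructure is built.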

The AI-Coding-Agent-Architecture decomposition clarifies the leverage point: the model is largely fixed; the harness is what practitioners engineer. This shifts the relevant question from “which model should we use?” to “how good is our harness?”

Sources

  • Boeckeler, Birgitta (2026). “Harness Engineering.” Exploring Generative AI, MartinFowler.com. Published 17 February 2026. Available: https://martinfowler.com/articles/exploring-gen-ai/harness-engineering.html

    • Three-component harness model; iterative signal loop; garbage-collection pattern for entropy management
  • Boeckeler, Birgitta and Pete Morris (2026). “Humans and Agents.” Exploring Generative AI, MartinFowler.com. Retrieved 2026-03-31. Available: https://martinfowler.com/articles/exploring-gen-ai/humans-and-agents.html

    • Primary source for the five-stage agentic flywheel model; human role evolution from executor to governor
  • Hashimoto, Mitchell (2026). “My AI Adoption Journey — Step 5: Engineer the Harness.” mitchellh.com. Published 5 February 2026. Available: https://mitchellh.com/writing/my-ai-adoption-journey#step-5-engineer-the-harness

    • “Never make that mistake again” commitment; two-mechanism taxonomy (documentation vs. programmatic tools); Ghostty project case study
  • Horthy, Dex (2026). “Skill Issue: Harness Engineering for Coding Agents.” HumanLayer Blog. Published 12 March 2026. Available: https://www.humanlayer.dev/blog/skill-issue-harness-engineering-for-coding-agents

    • coding agent = AI model(s) + harness equation; six-component harness taxonomy; attribution shift framing; “configuration not model” argument
  • Anthropic Engineering (2026). “Effective Harnesses for Long-Running Agents.” Anthropic. Available: https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents

    • Initializer + coding agent session pattern; JSON feature lists; back-pressure design principles; multi-session state management
  • Mundler, Niels et al. (2026). “AGENTS.md: How to Build Context Files for AI Coding Agents.” ETH Zurich / AGENTbench. arXiv:2503.09056. Available: https://arxiv.org/abs/2503.09056

    • Empirical study on 138 tasks: LLM-generated agent files reduce success rate by 3% and increase inference cost by 20%; human-authored files perform better
  • Yang, John, Carlos E. Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, and Ofir Press (2024). “SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering.” arXiv:2405.15793. Available: https://arxiv.org/abs/2405.15793

    • Empirical evidence that harness/interface design determines agent performance; ACI custom interface improved SWE-bench pass@1 from near-zero to 12.5%
  • Collins, Jim (2001). Good to Great: Why Some Companies Make the Leap… and Others Don’t. HarperBusiness. ISBN: 978-0-06-662099-2. Available: https://www.jimcollins.com/concepts/the-flywheel.html

    • Conceptual origin of the flywheel metaphor: compounding momentum through sustained reinforcing actions
  • Deming, W. Edwards (1986). Out of the Crisis. MIT Press. ISBN: 978-0262541152. Available: https://mitpress.mit.edu/9780262541152/out-of-the-crisis/

    • PDSA cycle as theoretical ancestor of the iterative signal loop; defects as process signals, not worker failures
  • Yuksel, Kamer Ali, Thiago Castro Ferreira, Mohamed Al-Badrashiny, and Hassan Sawaf (2025). “A Multi-AI Agent System for Autonomous Optimization of Agentic AI Solutions via Iterative Refinement and LLM-Driven Feedback Loops.” Proceedings of the 1st Workshop for Research on Agent Language Models (REALM 2025), pp. 52–62. Vienna, Austria. DOI: 10.18653/v1/2025.realm-1.4. Available: https://aclanthology.org/2025.realm-1.4/

    • Peer-reviewed validation: iterative LLM-driven feedback loops outperform one-shot prompt engineering for complex agentic tasks

Note

This content was drafted with assistance from AI tools for research, organization, and initial content generation. All final content has been reviewed, fact-checked, and edited by the author to ensure accuracy and alignment with the author’s intentions and perspective.