A coding agent harness is composed of six primary configuration levers. Each addresses a distinct failure mode in AI coding-agent architecture. The design philosophy is additive: start with nothing, add a component only when an observed failure demands it.

The Six Components

1. AGENTS.md / CLAUDE.md Files: Markdown files injected deterministically into the agent’s system prompt before any task begins. They support a layering model: global (~/.codex) → repo-root → per-directory. ETH Zurich’s AGENTbench study (2026) found that LLM-generated files reduced task success by 3% while increasing inference cost by 20%; human-written files performed marginally better (+4%) at similar cost. Effective files contain only non-inferable details: custom build commands, project-specific tooling, known failure patterns.
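The layering model can be sketched as a simple precedence merge. A minimal sketch, assuming the paths and `compose_instructions` helper are illustrative rather than any particular CLI's actual resolution logic:

```python
# Hypothetical layer order: global -> repo root -> per-directory.
# The agent's prompt receives the concatenation of whichever files exist.
LAYER_PATHS = ["~/.codex/AGENTS.md", "AGENTS.md", "src/AGENTS.md"]

def compose_instructions(layers, read_file):
    """Concatenate existing layer files in precedence order.
    `read_file` returns the file's text, or None if it is absent."""
    parts = []
    for layer in layers:
        text = read_file(layer)
        if text:
            parts.append(f"# From {layer}\n{text.strip()}")
    return "\n\n".join(parts)
```

A repo with only a root-level file injects just that one layer, so prompt overhead stays proportional to what was actually written.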

2. MCP Servers: Extend the agent beyond basic file I/O by exposing external tool capabilities. Tool descriptions are injected into the context window and count against the instruction budget. Effective practice: prefer CLI wrappers over MCP servers for well-known tools; they carry less context overhead and are better represented in model training data. Too many connected servers degrade reasoning by pushing the model into context pressure.
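The CLI-wrapper preference can be made concrete: instead of registering a server and paying for its tool descriptions in every prompt, let the agent shell out to a binary it already knows from training data. A minimal sketch using only the Python standard library:

```python
import subprocess

def run_cli_tool(argv, timeout=30):
    """Run a well-known CLI (git, grep, jq, ...) directly.
    No tool-description tokens are spent: the model already knows
    these interfaces, so only the invocation itself costs context."""
    result = subprocess.run(argv, capture_output=True, text=True,
                            timeout=timeout)
    return result.returncode, result.stdout, result.stderr
```

The trade-off is that a wrapper offers no schema validation; it works precisely because the tool's interface is already well represented in the model's weights.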

3. Skills: A progressive disclosure mechanism. Rather than loading all instructions upfront, skills are injected on demand when the agent needs domain-specific knowledge or bundled tool workflows. A root skill index signals availability; individual skill files load only when invoked. This prevents instruction budget depletion from instructions that are irrelevant to the current task.

4. Sub-Agents: Isolated agent instances with their own context windows, system prompts, and filtered tool access. The parent agent receives condensed results without seeing intermediate steps. The core value is context isolation, not role specialisation. Sub-agents specialised by task type (locate-definitions, analyse-patterns, research-docs) outperform those specialised by role (frontend-engineer, backend-engineer).
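The isolation boundary can be sketched as a function whose local state never escapes; `model_call` stands in for any LLM API, and the truncation length is an arbitrary illustrative policy:

```python
def run_subagent(task, model_call, tools):
    """Run one task in a fresh context window.
    `history` (and any intermediate tool calls appended to it) stays
    inside this function; the parent only sees the condensed return."""
    history = [
        {"role": "system", "content": f"You handle exactly one task: {task}"},
        {"role": "user", "content": task},
    ]
    reply = model_call(history, tools)
    return {"task": task, "summary": reply[:200]}  # condensed result only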

5. Hooks: User-defined scripts that execute at agent lifecycle events: pre-tool, post-tool, and on-stop. Effective hooks follow a silent-success / verbose-failure pattern: zero output on pass, structured error output on failure, so the agent is forced to resolve issues before completing. Examples: TypeScript type-checks on stop, Slack notifications on completion, auto-deny rules for dangerous commands.
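The silent-success / verbose-failure pattern in sketch form; the check command is whatever the project runs, and the error framing here is an assumption, not a prescribed format:

```python
import subprocess
import sys

def on_stop_hook(check_cmd):
    """Run a verification command when the agent tries to finish.
    Zero output on pass; structured errors plus a non-zero exit on
    failure, forcing the agent to fix them before completing."""
    result = subprocess.run(check_cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return 0  # silent success: nothing enters the context window
    sys.stderr.write(f"CHECK FAILED: {' '.join(check_cmd)}\n")
    sys.stderr.write(result.stdout + result.stderr)
    return result.returncode
```

An on-stop type-check would call this with something like `["npx", "tsc", "--noEmit"]`.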

6. Back-Pressure Mechanisms: Verification systems that let the agent check its own work. The strongest predictor of agent success is the agent’s ability to self-verify. Mechanisms include type-checks, build steps, unit tests, and structured feature lists (JSON rather than Markdown; models are less likely to modify structured data inappropriately). Output discipline is critical: swallow passing output, surface only failures.
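A sketch of a structured feature list with output discipline applied; the JSON schema here is an assumption for illustration, not a standard:

```python
import json

# JSON rather than Markdown: models are less likely to rewrite it.
FEATURES_JSON = json.dumps([
    {"id": "auth-login", "test": "tests/test_login.py", "passing": True},
    {"id": "auth-reset", "test": "tests/test_reset.py", "passing": False},
])

def report_failures(features_json):
    """Swallow passing entries; surface only what still needs work."""
    features = json.loads(features_json)
    return [f["id"] for f in features if not f["passing"]]
```

Feeding the agent only the failing IDs keeps back-pressure high while spending no context on work that already verifies.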

Design Principle

Components are not features to install; they are responses to failures. Each component added consumes context budget and introduces maintenance surface. Iterate from observed failures, not from an idealised harness designed upfront.

Note

This content was drafted with assistance from AI tools for research, organisation, and initial content generation. All final content has been reviewed, fact-checked, and edited by the author to ensure accuracy and alignment with the author’s intentions and perspective.