跪拜 Guibai

The Agent Harness: Why Runtime Control, Not Prompt Engineering, Defines Production Agents

By lizhongxuan · Original article on juejin.cn

Production agents that touch infrastructure, data, or money cannot rely on model self-restraint. A harness is the difference between a demo that works in a notebook and a system that can run unattended without exceeding its authority, leaking context, or producing unauditable conclusions.

Summary

Most agent discussions stop at prompt engineering and tool calling. The harness layer sits underneath, managing context assembly, dynamic tool surfaces, permission gates, observation pipelines, and loop controllers. It treats every model output as an untrusted event to be validated, not a command to be executed.
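As a minimal sketch of what "untrusted event" could mean in code, the gate below rules on a requested call before anything runs. The names `RequestedCall`, `Decision`, `gate`, and both policy tables are hypothetical and chosen for illustration; they do not come from the original article.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_APPROVAL = "needs_approval"


@dataclass(frozen=True)
class RequestedCall:
    """What the model asked for. Untrusted until the gate has ruled on it."""
    tool: str
    args: dict


# Hypothetical policy tables; a real harness would load these from configuration.
VISIBLE_TOOLS = {"read_logs", "restart_service"}
DANGEROUS_TOOLS = {"restart_service"}


def gate(call: RequestedCall, approved: set) -> Decision:
    """Decide in runtime code whether a requested call may run."""
    if call.tool not in VISIBLE_TOOLS:
        return Decision.DENY            # the tool was never offered to the model
    if call.tool in DANGEROUS_TOOLS and call.tool not in approved:
        return Decision.NEEDS_APPROVAL  # pause for a human instead of executing
    return Decision.ALLOW
```

The model's text never reaches an executor directly in this sketch: only a call that the gate marks `ALLOW` would be dispatched.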

A complete harness separates tool schemas from tool call events and tool result events, sanitizes external data to prevent injection, and enforces a stop policy based on output contracts and budgets rather than the model's self-assessment. Every final claim must map back to specific evidence in the trace.
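One way to keep the three records apart is to give each its own type. The sketch below is an assumption about shape, not the article's data model: the field names and the `missing_args` helper are hypothetical.

```python
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolSchema:
    """The contract: what the tool is called and which arguments it requires."""
    name: str
    required_args: tuple


@dataclass(frozen=True)
class ToolCallEvent:
    """What the model requested. Recorded even if the call is later refused."""
    call_id: str
    tool: str
    args: dict


@dataclass(frozen=True)
class ToolResultEvent:
    """What the runtime actually did for a given call."""
    call_id: str
    executed: bool
    output: str
    finished_at: float = field(default_factory=time.time)


def missing_args(schema: ToolSchema, call: ToolCallEvent) -> list:
    """Return the required arguments that the model's request left out."""
    return [name for name in schema.required_args if name not in call.args]
```

Because the call event and the result event share a `call_id` but are stored separately, a trace can show a request that was made and never executed.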

Interviewing for harness knowledge means asking about data flow across a single turn, distinguishing prompt guidance from runtime enforcement, and designing recovery for interrupted long-running tasks. The difference between someone who understands prompts and someone who understands harnesses is whether they believe a stricter system prompt can solve permission, safety, and audit problems.

Takeaways
An agent harness is a runtime control system that wraps the model, not a collection of prompts and tools.
A single agent turn flows through context assembly, prompt compilation, tool surface resolution, model call, permission check, tool execution, observation projection, and loop control.
Tool schemas define the contract, tool call events record what the model requested, and tool result events record what the runtime actually executed.
Permissions, tool visibility, host bindings, budgets, and stop conditions must be enforced by runtime code, not by prompt instructions.
Raw tool results must be validated, truncated, sanitized, and tagged with provenance and freshness metadata before becoming an observation for the model.
Prompt injection in tool results or model output must not be able to alter runtime approval state, which should live in a dedicated approval store rather than in anything the model can write to.
Long-running task recovery requires durable run state, checkpoints, and explicit handling of pending dangerous actions so they are not auto-replayed.
Claim-to-evidence mapping ties every conclusion in a final answer back to a specific observation, tool result, and permission decision in the trace.
A loop controller decides to continue, pause, or finalize based on output contracts and budget exhaustion, not on the model declaring itself done.
Interviewing for harness understanding means asking candidates to diagram a full turn's data flow and explain why a stricter prompt cannot replace runtime enforcement.
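The recovery takeaway above can be sketched as a checkpoint that records the pending action and a resume step that refuses to replay it when it is dangerous. `RunState`, `checkpoint`, `resume`, and the `"dangerous"` flag are hypothetical names, and the sketch assumes that non-dangerous actions are safe to retry.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class RunState:
    """Durable state for one run (hypothetical shape)."""
    run_id: str
    completed_steps: list = field(default_factory=list)
    pending_action: Optional[dict] = None  # e.g. {"tool": ..., "dangerous": True}


def checkpoint(state: RunState) -> str:
    """Serialize the state; a real harness would write this to durable storage."""
    return json.dumps(asdict(state))


def resume(blob: str):
    """Reload a checkpoint and decide what to do with the interrupted action."""
    state = RunState(**json.loads(blob))
    action = state.pending_action
    if action is None:
        return state, "continue"
    if action.get("dangerous"):
        # Never replay: the runtime cannot tell whether it ran before the crash.
        return state, "await_reapproval"
    return state, "retry_pending"  # assumption: non-dangerous actions are idempotent
```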

Conclusions

Most agent frameworks conflate tool exposure with tool execution, which is why a model can hallucinate a call to a tool it should never have seen.
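A minimal sketch of keeping exposure and execution as two separate checks follows. The registry, the role names, and the functions `resolve_surface` and `execute` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    roles: frozenset  # roles allowed to see this tool


# Hypothetical registry for illustration.
REGISTRY = [
    Tool("read_logs", frozenset({"viewer", "operator"})),
    Tool("restart_service", frozenset({"operator"})),
]


def resolve_surface(role: str) -> dict:
    """Exposure: the tools whose schemas are sent to the model this turn."""
    return {tool.name: tool for tool in REGISTRY if role in tool.roles}


def execute(tool_name: str, surface: dict) -> str:
    """Execution: checked separately, against the same per-turn surface."""
    if tool_name not in surface:
        # The model named a tool it was never shown; refuse instead of dispatching.
        return f"rejected: {tool_name} is not in this turn's tool surface"
    return f"dispatched: {tool_name}"
```

With this split, a hallucinated call to `restart_service` from a `viewer` session is refused even though the tool exists in the registry.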

The industry's over-investment in prompt engineering has created a blind spot: a perfectly tuned prompt still cannot stop a model from emitting a dangerous tool call. Only runtime code can block that call before it executes.

Treating the model as an untrusted event producer rather than a co-equal decision-maker is the foundational mindset shift from application-layer agent development to harness engineering.

Output contracts that programmatically check whether required evidence categories are satisfied are a more reliable termination condition than asking the model if it is finished.
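A sketch of such a check, under the assumption that evidence is tracked as a set of category names. `OutputContract` and `next_action` are hypothetical, and the `model_says_done` parameter is accepted only to show that it does not affect the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputContract:
    required_evidence: frozenset  # evidence categories the answer must cover


def next_action(contract, collected, steps_used, max_steps, model_says_done):
    """Return "finalize", "continue", or "stop_budget".

    model_says_done is deliberately ignored: termination depends on the
    contract and the step budget, not on the model's self-assessment.
    """
    missing = contract.required_evidence - set(collected)
    if not missing:
        return "finalize"
    if steps_used >= max_steps:
        return "stop_budget"  # budget exhausted with evidence still missing
    return "continue"
```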

Silent truncation of large tool results is a common production bug that causes models to reason on incomplete data without knowing it, producing confident but wrong conclusions.
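One way to avoid the silent case is to make truncation visible both in metadata and in the text the model reads. The `Observation` fields, the `to_observation` helper, and the 2000-character default are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    text: str
    truncated: bool
    original_chars: int
    source: str


def to_observation(raw: str, source: str, limit: int = 2000) -> Observation:
    """Cut oversized results, but record that a cut happened."""
    if len(raw) <= limit:
        return Observation(raw, False, len(raw), source)
    # Tell the model explicitly that it is seeing only part of the result.
    notice = f"\n[truncated: showing {limit} of {len(raw)} characters]"
    return Observation(raw[:limit] + notice, True, len(raw), source)
```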

The distinction between a tool schema, a tool call event, and a tool result event is a litmus test: conflating them signals a developer has never debugged a real agent failure.

Sub-agent context isolation is not a feature request; without mediated handoffs, a compromised or buggy child agent can read parent transcripts and leak sensitive context.
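A mediated handoff could look like the sketch below, where the child receives a contract built from an explicit allowlist of context keys. `TaskContract`, `build_handoff`, and the example keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskContract:
    """Everything the child agent receives; nothing else is passed along."""
    goal: str
    allowed_tools: tuple
    context: dict


def build_handoff(parent_context, goal, share_keys, allowed_tools) -> TaskContract:
    """Copy only the explicitly shared keys out of the parent's context."""
    scoped = {key: parent_context[key] for key in share_keys if key in parent_context}
    return TaskContract(goal, tuple(allowed_tools), scoped)
```

The child never holds a reference to the parent's context dictionary, so the transcript and any secrets in it stay out of reach unless a key is shared on purpose.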

Claim-to-evidence mapping turns an agent's final answer from a black-box natural language output into a verifiable, auditable artifact suitable for regulated environments.
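As a sketch of how that verification might be automated, the check below flags claims that cite no evidence or cite identifiers that are absent from the trace. The `Claim` type, the `unsupported_claims` function, and the `obs-N` identifier format are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str
    observation_ids: tuple  # trace entries the claim relies on


def unsupported_claims(claims, trace_ids):
    """Return claims that cite no evidence or cite ids missing from the trace."""
    return [
        claim for claim in claims
        if not claim.observation_ids
        or any(obs_id not in trace_ids for obs_id in claim.observation_ids)
    ]
```

A harness could refuse to finalize, or could strip the flagged sentences, whenever this list is not empty.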

Concepts & terms
Agent Harness
The runtime control system that wraps a model, managing context assembly, tool exposure, permission checks, loop control, state persistence, observation processing, and output constraints. It enforces boundaries that prompts cannot.
Tool Surface
The set of tools currently visible and callable by the model, dynamically resolved per turn based on task, role, permissions, and environment. Tools the model should not use must not appear in this surface.
Observation
The safe, structured feedback returned to the agent loop after a raw tool result has been validated, sanitized, truncated, and tagged with provenance and confidence metadata. It is the model's reasoning material, not the raw external output.
Output Contract
A programmatic specification of what evidence categories a final answer must contain. The loop controller checks this contract against collected evidence to decide termination, rather than trusting the model's self-assessment.
Claim-to-Evidence Mapping
The requirement that every key conclusion in a final answer can be traced back to a specific observation, tool result, and permission decision in the agent's run trace, enabling debugging, auditing, and automated evaluation.
Loop Controller / Stop Policy
The runtime component that decides whether to continue, pause, or finalize an agent run based on hard limits (max steps, budget), pending approvals, output contract satisfaction, and evidence sufficiency.
Projection
The layer that converts internal runtime events into different representations for different consumers: the model sees reasoning material, the UI sees user-friendly status, the audit log sees structured facts.
Mediated Handoff
A pattern for sub-agent delegation where the parent passes a scoped task contract and permitted resources, and the child cannot directly access the parent's full transcript or context.
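The Projection entry above can be illustrated with one internal event rendered three ways. This is a sketch under assumed names: `RuntimeEvent`, its fields, and the three `project_for_*` functions are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RuntimeEvent:
    call_id: str
    tool: str
    status: str      # e.g. "ok" or "denied"
    detail: str      # sanitized text that is safe to show the model
    decided_by: str  # which policy or approver made the decision


def project_for_model(event: RuntimeEvent) -> str:
    """Reasoning material: short, and free of internal policy names."""
    return f"[{event.tool}] {event.status}: {event.detail}"


def project_for_ui(event: RuntimeEvent) -> str:
    """User-friendly status line."""
    verb = "Ran" if event.status == "ok" else "Blocked"
    return f"{verb}: {event.tool}"


def project_for_audit(event: RuntimeEvent) -> dict:
    """Structured facts: every field, unabridged."""
    return asdict(event)
```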
Source: juejin.cn