Dewu's AI Harness Wraps the Full PDCA Loop Around Recommendation Agents
Most AI coding efforts stop at generation and leave verification, rollback, and knowledge reuse as manual afterthoughts. Dewu's Harness shows how to close that loop in a high-stakes recommendation system, with measurable gains in accuracy and cost that any team running AI agents in production can benchmark against.
AI coding tools solve the Do phase, but complex recommendation systems fail across the whole Plan-Do-Check-Act (PDCA) cycle. Dewu's AI Harness embeds constraints, verification, and feedback into the environment itself, so agents operate freely but stay within engineering contexts that can be verified and rolled back. The system spans seven guardrail stages, from structured requirement contracts (T-PRD) through automated 24/7 AI evaluation to Bad Case capture that feeds directly into the next iteration.
A hybrid agent architecture called TuiChaCha splits work into a deterministic Highway for the 80% of problems that are high-frequency and reproducible, and an ATV exploration mode for the 20% that are long-tail. Successful ATV explorations get pruned, generalized, and promoted into new Highway capabilities, creating a compounding memory loop. A three-layer knowledge governance model (architecture docs, module design docs, and code comments) lifted simple-task accuracy from 52% to 91% while cutting token consumption by 48%.
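The Highway/ATV split and the promotion loop described above can be sketched in a few lines. This is an illustrative sketch only, not Dewu's implementation: the class name HybridRouter, the promote_after threshold, and the "ctr_drop" problem kind are all hypothetical, and the pruning and generalization step is reduced to simply registering the explorer as a Highway handler.

```python
from typing import Callable, Dict, Optional, Tuple

# A handler takes a problem description and returns a result, or None on failure.
Handler = Callable[[str], Optional[str]]


class HybridRouter:
    """Hypothetical router: known problem kinds take the Highway, the rest go to ATV."""

    def __init__(self, explore: Handler, promote_after: int = 2) -> None:
        self.highway: Dict[str, Handler] = {}   # deterministic handlers by problem kind
        self.explore = explore                  # open-ended ATV exploration
        self.promote_after = promote_after      # assumed threshold, not from the source
        self.successes: Dict[str, int] = {}     # successful explorations per kind

    def handle(self, kind: str, problem: str) -> Tuple[str, Optional[str]]:
        handler = self.highway.get(kind)
        if handler is not None:
            return "highway", handler(problem)

        result = self.explore(problem)
        if result is not None:
            self.successes[kind] = self.successes.get(kind, 0) + 1
            # Promotion: a real system would prune and generalize the
            # exploration trace first; here the explorer is reused as-is.
            if self.successes[kind] >= self.promote_after:
                self.highway[kind] = self.explore
        return "atv", result


router = HybridRouter(explore=lambda p: f"diagnosis for {p}")
routes = [router.handle("ctr_drop", f"case-{i}")[0] for i in range(3)]
print(routes)
```

In this sketch the first two requests of a new kind go through exploration, and the third is served by the promoted Highway handler, which is the compounding effect the architecture relies on.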
Framing the Harness as an environment rather than a set of hard rules is a useful mental model: constraints that feel natural to the agent produce fewer workarounds than explicit guardrails bolted on afterward.
The 80/20 split between Highway and ATV mirrors how human SRE teams actually work, with runbooks for known incidents and ad-hoc investigation for novel ones, and formalizing that split lets each mode optimize for its own cost profile.
That L3 code comments (the third knowledge layer) delivered a 48% token reduction while improving accuracy challenges the assumption that more context always helps; proximity and specificity matter more than volume.
Turning Bad Cases into reusable Stories rather than one-off postmortems creates a compounding knowledge asset that directly feeds the Highway, making the system self-improving without retraining.
The observation that humans are being 'interfaced' (constrained by SOPs, input/output contracts, and health metrics) while AI is treated as creative and emergent is a genuine inversion worth watching as agent orchestration matures.