A Logistics Platform Cut Financial Losses 99.96% by Letting AI Agents Police Their Own Rules
Rule decay is the silent killer of financial controls in any high-velocity engineering org. Huolala's architecture shows that a distilled small model plus a multi-agent rule-auditing system can outperform both manual processes and monolithic large-model approaches, at a fraction of the cost and latency.
Manual review and static rule engines collapse under the pace of modern software delivery. Huolala's internal controls team found that over a third of their financial reconciliation rules were dead within six months, leaving the door open to six-figure losses. Their response replaces the entire human-reliant lifecycle with an AI-native architecture that catches risks at the requirements stage and keeps rules fresh after deployment. The core loop uses large models to auto-label a million code and text samples, then distills that knowledge into a small ModernBERT model that runs cheaply in production at 95% recall. A separate multi-agent system—four specialized agents working as surveyor, inspector, communicator, and scout—continuously maps code facts against rule logic to flag what's missing, what's stale, and what needs to change. Writing new reconciliation rules became 90% faster, and the platform now blocks risky code in the CI/CD pipeline before it ships. The next phase aims for fully autonomous internal controls with dedicated agents for deduction, adversarial simulation, and automatic remediation.
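The rule-auditing idea in the summary above (mapping code facts against rule logic to flag what is missing, stale, or in need of change) can be illustrated with a small sketch. This is not Huolala's implementation. It assumes, purely for illustration, that "code facts" reduce to a set of money-related field names and that each rule lists the fields it checks; `Rule`, `audit_rules`, and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical reconciliation rule: a name plus the fields it checks."""
    name: str
    fields: frozenset


def audit_rules(code_fields, rules):
    """Compare fields seen in code against fields covered by rules."""
    code_fields = set(code_fields)
    covered = set()
    stale, needs_update = [], []
    for rule in rules:
        live = rule.fields & code_fields  # fields the rule checks that still exist
        covered |= live
        if not live:
            stale.append(rule.name)         # nothing the rule checks exists any more
        elif live != rule.fields:
            needs_update.append(rule.name)  # some referenced fields have disappeared
    missing = sorted(code_fields - covered)  # fields in code that no rule covers
    return {
        "missing": missing,
        "stale": sorted(stale),
        "needs_update": sorted(needs_update),
    }


rules = [
    Rule("r1", frozenset({"order_amount", "coupon"})),
    Rule("r2", frozenset({"legacy_fee"})),
    Rule("r3", frozenset({"refund_amount"})),
]
report = audit_rules({"order_amount", "refund_amount", "driver_bonus"}, rules)
print(report)
```

In this toy run, `driver_bonus` has no rule, `r2` checks a field that no longer exists, and `r1` still references a removed `coupon` field. A real system would derive the facts from code analysis rather than a hand-written set.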
Rule decay is a harder problem than initial risk detection because it requires continuous alignment between evolving code and static rule definitions—a task that single-model approaches handle poorly.
The 99.96% loss reduction figure is impressive, but the more replicable insight is the architecture pattern: large models for offline labeling, small distilled models for online inference, and specialized agents for ongoing maintenance.
BERT-class models, not frontier LLMs, proved to be the cost-performance sweet spot for production code-risk classification once distillation was applied.
Embedding internal control gates directly into CI/CD treats financial safety as a build-time property rather than an audit afterthought, which is still rare in most Western engineering organizations.
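A build-time gate of this kind might look roughly like the sketch below. It assumes a classifier has already produced a risk score between 0 and 1 for each changed file; the `gate` and `run_gate` names, the 0.5 threshold, and the file paths are illustrative assumptions, not details from the source.

```python
def gate(scores, threshold=0.5):
    """Return changed files whose risk score meets or exceeds the threshold."""
    return sorted(path for path, score in scores.items() if score >= threshold)


def run_gate(scores, threshold=0.5):
    """Print flagged files and return a process-style exit code (0 = pass)."""
    flagged = gate(scores, threshold)
    for path in flagged:
        print(f"BLOCKED: {path} (risk {scores[path]:.2f})")
    return 1 if flagged else 0


# Hypothetical scores for two changed files in a merge request.
exit_code = run_gate({"billing/settle.py": 0.91, "docs/readme.md": 0.03})
```

A CI job would pass the returned code to `sys.exit`, so that any flagged file fails the pipeline step and stops the change from shipping until it is reviewed.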
The multi-agent design mirrors military reconnaissance doctrine—specialization and adversarial checking produce more reliable outputs than a single generalist model trying to do everything.