How to Build an Agent Tool System That Won't Collapse After 5 Tools
Agent tooling breaks down predictably: parameter validation drifts, error codes become inconsistent, and path safety depends on every tool author remembering to check. This design shows how to centralize those concerns so that adding a 30th tool costs no more boilerplate than the 5th, and a single forgotten try/catch won't crash the agent.
Connecting a handful of tools to an LLM agent is easy; keeping them stable at scale is not. The dskcode CLI project tackles this with a four-part architecture: a filterable tool registry, strong typing with deliberate type erasure, a dual-track Zod/JSONSchema validator that produces isomorphic error structures, and a sandbox that centralizes path confinement, timeout signals, and EOL normalization.
The registry uses a three-layer AND filter — user config, feature flags, and provider compatibility — to decide tool availability. Type erasure via `eraseTool` lets the registry store any tool while each tool retains its own strict `AgentTool<I, O>` signature. Validation failures return structured `ValidationIssue[]` objects with JSON Pointer paths and Chinese-language messages that the LLM can use to self-correct in the next conversation turn.
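The registry and the erasure step can be sketched together. This is a minimal illustration, not the dskcode source: the article names `AgentTool<I, O>` and `eraseTool`, but every field on the interface, the `FilterContext` shape, and the `ToolRegistry` class are assumptions made for the example.

```typescript
// Hypothetical tool shape; only the name AgentTool<I, O> comes from the article.
interface AgentTool<I, O> {
  name: string;
  featureFlag?: string; // assumed: tool is gated behind this flag when set
  providers?: string[]; // assumed: compatible providers; undefined means all
  execute(input: I): Promise<O>;
}

// What the registry stores: input and output types are opaque.
type ErasedTool = AgentTool<unknown, unknown>;

// Erase the type parameters at the registry boundary only.
// The cast is sound only if the input was validated against the tool's schema first.
function eraseTool<I, O>(tool: AgentTool<I, O>): ErasedTool {
  return { ...tool, execute: (input: unknown) => tool.execute(input as I) };
}

// Hypothetical inputs to the three filter layers.
interface FilterContext {
  disabledByUser: Set<string>; // layer 1: user config
  enabledFlags: Set<string>;   // layer 2: feature flags
  provider: string;            // layer 3: provider compatibility
}

class ToolRegistry {
  private tools = new Map<string, ErasedTool>();

  register<I, O>(tool: AgentTool<I, O>): void {
    this.tools.set(tool.name, eraseTool(tool));
  }

  // A tool is available only when all three layers agree (logical AND).
  available(ctx: FilterContext): ErasedTool[] {
    return Array.from(this.tools.values()).filter(
      (t) =>
        !ctx.disabledByUser.has(t.name) &&
        (t.featureFlag === undefined || ctx.enabledFlags.has(t.featureFlag)) &&
        (t.providers === undefined || t.providers.includes(ctx.provider)),
    );
  }
}
```

Each tool author still writes against a strict `AgentTool<I, O>`; the unchecked cast lives in one place, which is why validation has to run before `execute` is called.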
Every tool result, success or failure, shares the same `ToolResult` shape with a machine-readable `error` code. The sandbox enforces write-path confinement through `realpath` resolution, preventing symlink bypasses, and uses a two-stage SIGTERM/SIGKILL timeout for bash child processes. The entire core is roughly 1,200 lines of TypeScript, designed to be copied directly into any agent project.
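As a rough sketch of the write-path check, assuming a Node.js runtime: the `ToolResult` fields, the error codes, and the helper names below are hypothetical, since the article only states that results share one shape with a machine-readable `error` code and that confinement goes through `realpath`.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical result shape shared by success and failure.
type ToolResult =
  | { ok: true; data: string }
  | { ok: false; error: string; data: string };

// lstat does not follow symlinks, so a dangling symlink still counts as existing.
function lexists(p: string): boolean {
  try {
    fs.lstatSync(p);
    return true;
  } catch {
    return false;
  }
}

// A write target may not exist yet, so resolve symlinks on the deepest
// existing ancestor and re-append the not-yet-created tail.
function resolveForWrite(target: string): string {
  let current = path.resolve(target);
  const tail: string[] = [];
  while (!lexists(current)) {
    const parent = path.dirname(current);
    if (parent === current) break; // reached the filesystem root
    tail.unshift(path.basename(current));
    current = parent;
  }
  return path.join(fs.realpathSync(current), ...tail);
}

// On success, `data` carries the resolved path; callers should write to that
// path rather than to the raw input.
function checkWritePath(root: string, target: string): ToolResult {
  try {
    const realRoot = fs.realpathSync(root);
    const resolved = resolveForWrite(path.resolve(root, target));
    const rel = path.relative(realRoot, resolved);
    const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    return escapes
      ? { ok: false, error: "PATH_OUTSIDE_ROOT", data: `Refusing to write outside the workspace: ${target}` }
      : { ok: true, data: resolved };
  } catch (err) {
    return { ok: false, error: "PATH_UNRESOLVABLE", data: `Could not resolve ${target}: ${String(err)}` };
  }
}

// Usage: a throwaway workspace under the OS temp directory.
const workspace = fs.mkdtempSync(path.join(os.tmpdir(), "workspace-"));
console.log(checkWritePath(workspace, "notes/todo.txt").ok); // true
console.log(checkWritePath(workspace, "../escape.txt").ok);  // false
```

Comparing resolved paths rather than raw strings is what closes the symlink bypass: a link inside the workspace that points elsewhere resolves to its real location before the containment check runs.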
Most agent frameworks treat tool validation as each tool's own problem. Centralizing it into a schema validator that speaks the same error language regardless of whether the schema came from Zod or raw JSONSchema is a genuine architectural upgrade.
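One way to picture the "same error language" idea is two small validators that emit an identical issue shape. This sketch is illustrative only: the `ValidationIssue` field names are assumed, `MiniSchema` is a deliberately tiny stand-in for JSONSchema, `ZodLikeSchema` is a structural type used in place of importing Zod, and the messages are in English here for readability.

```typescript
// Shared issue shape (field names are assumptions).
interface ValidationIssue {
  path: string; // JSON Pointer (RFC 6901), e.g. "/edits/0/text"; "" is the root
  message: string;
}

// RFC 6901 escaping: "~" becomes "~0" and "/" becomes "~1".
function toPointer(segments: ReadonlyArray<PropertyKey>): string {
  return segments
    .map((s) => "/" + String(s).replace(/~/g, "~0").replace(/\//g, "~1"))
    .join("");
}

// Structural view of a schema with a Zod-style safeParse result.
interface ZodLikeSchema {
  safeParse(input: unknown):
    | { success: true; data: unknown }
    | { success: false; error: { issues: Array<{ path: PropertyKey[]; message: string }> } };
}

// Track 1: normalize Zod-style issues into the shared shape.
function validateZodLike(schema: ZodLikeSchema, input: unknown): ValidationIssue[] {
  const result = schema.safeParse(input);
  if (result.success) return [];
  return result.error.issues.map((i) => ({ path: toPointer(i.path), message: i.message }));
}

// Track 2: a tiny JSONSchema subset (type, properties, required only).
interface MiniSchema {
  type: "object" | "string" | "number";
  properties?: Record<string, MiniSchema>;
  required?: string[];
}

function validateMiniSchema(
  schema: MiniSchema,
  input: unknown,
  at: Array<string | number> = [],
): ValidationIssue[] {
  if (schema.type !== "object") {
    return typeof input === schema.type
      ? []
      : [{ path: toPointer(at), message: `expected ${schema.type}` }];
  }
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return [{ path: toPointer(at), message: "expected object" }];
  }
  const obj = input as Record<string, unknown>;
  const issues: ValidationIssue[] = [];
  for (const key of schema.required ?? []) {
    if (!(key in obj)) issues.push({ path: toPointer([...at, key]), message: "required field is missing" });
  }
  for (const [key, sub] of Object.entries(schema.properties ?? {})) {
    if (key in obj) issues.push(...validateMiniSchema(sub, obj[key], [...at, key]));
  }
  return issues;
}
```

Because both tracks return `ValidationIssue[]`, whatever formats the failure for the model never needs to know which kind of schema the tool author wrote.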
The decision to use duck-typing (`_def` and `safeParse`) instead of `instanceof z.ZodType` avoids pulling Zod into every project that only uses JSONSchema tools — a small detail with real bundle-size consequences.
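A duck-typed check along these lines might look like the following. The function name `isZodLike` is hypothetical; only the two probed members, `_def` and `safeParse`, come from the article.

```typescript
// Structural probe: no `import { z } from "zod"` is needed, so projects that
// only use JSONSchema tools never load Zod.
function isZodLike(
  schema: unknown,
): schema is { _def: unknown; safeParse: (input: unknown) => unknown } {
  return (
    typeof schema === "object" &&
    schema !== null &&
    "_def" in schema &&
    typeof (schema as { safeParse?: unknown }).safeParse === "function"
  );
}
```

The trade-off is that any object exposing those two members passes the check, which is acceptable when schemas come from trusted tool authors rather than from user input.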
Writing validation error messages in Chinese is a pragmatic hack: the reasoning is that LLMs trained on predominantly Chinese prompt corpora self-correct faster when errors match that language, while JSON Pointer paths add machine precision on top.
`additionalProperties: false` on tool parameter schemas is an underrated defense against LLM hallucination: extra fields the model invents are rejected before they reach tool logic.
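To make the effect concrete, here is a simplified sketch of the check. `ObjectSchema` and `findExtraFields` are hypothetical names, and the logic ignores `patternProperties` and schema-valued `additionalProperties`, which a full JSONSchema validator would also handle.

```typescript
// Simplified object schema: only the parts relevant to extra-field rejection.
interface ObjectSchema {
  type: "object";
  properties: Record<string, unknown>;
  additionalProperties?: boolean;
}

// Returns the keys the model supplied that the schema never declared.
// When additionalProperties is not exactly false, extra keys are allowed.
function findExtraFields(schema: ObjectSchema, input: Record<string, unknown>): string[] {
  if (schema.additionalProperties !== false) return [];
  return Object.keys(input).filter(
    (key) => !Object.prototype.hasOwnProperty.call(schema.properties, key),
  );
}

// Example: the model invents an `encoding` argument the tool never declared.
const readFileSchema: ObjectSchema = {
  type: "object",
  properties: { path: { type: "string" } },
  additionalProperties: false,
};
console.log(findExtraFields(readFileSchema, { path: "a.txt", encoding: "utf8" })); // ["encoding"]
```

Reporting the offending keys back to the model, rather than silently dropping them, gives it a chance to notice that the parameter does not exist.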
The `data` field being a plain string rather than serialized JSON is a design choice that respects how LLMs actually consume information: natural language with numbered lists is generally easier for a model to act on than a raw object dump.
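For instance, a search tool could render its hits as a numbered list before placing them in `data`. The `SearchHit` shape and `formatHits` helper are invented for this illustration and are not taken from the project.

```typescript
// Hypothetical structured result that a tool holds internally.
interface SearchHit {
  file: string;
  line: number;
  text: string;
}

// Render hits as prose plus a numbered list instead of JSON.stringify(hits).
function formatHits(query: string, hits: SearchHit[]): string {
  if (hits.length === 0) return `No matches for "${query}".`;
  const lines = hits.map((h, i) => `${i + 1}. ${h.file}:${h.line} ${h.text.trim()}`);
  return [`Found ${hits.length} match(es) for "${query}":`, ...lines].join("\n");
}
```

The structured array stays internal to the tool; only the rendered string crosses the boundary to the model.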
Tool descriptions function as the LLM's system prompt for tool selection; writing them with explicit 'do not use for X' clauses is effectively prompt engineering at the schema level.