跪拜 Guibai
Frontend · Claude · Artificial Intelligence

How to Build an Agent Tool System That Won't Collapse After 5 Tools

By 小白菜的编程日记
Read original on juejin.cn

Agent tooling breaks down predictably: parameter validation drifts, error codes become inconsistent, and path safety depends on every tool author remembering to check. This design shows how to centralize those concerns so that adding a 30th tool costs no more boilerplate than the 5th, and a single forgotten try/catch won't crash the agent.

Summary

Connecting a handful of tools to an LLM agent is easy; keeping them stable at scale is not. The dskcode CLI project tackles this with a four-part architecture: a filterable tool registry, strong typing with deliberate type erasure, a dual-track Zod/JSONSchema validator that produces isomorphic error structures, and a sandbox that centralizes path confinement, timeout signals, and EOL normalization.

The registry uses a three-layer AND filter — user config, feature flags, and provider compatibility — to decide tool availability. Type erasure via `eraseTool` lets the registry store any tool while each tool retains its own strict `AgentTool<I, O>` signature. Validation failures return structured `ValidationIssue[]` objects with JSON Pointer paths and Chinese-language messages that the LLM can use to self-correct in the next conversation turn.
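The three-layer AND filter can be sketched as follows. Only the three layers themselves come from the article; the `ToolMeta` and `FilterContext` shapes, the field names, and the tool names used with them are assumptions made for illustration.

```typescript
// Hypothetical metadata shape: only the three filter layers are from the article.
interface ToolMeta {
  name: string;
  featureFlag?: string; // if set, the tool is gated behind this flag
  providers?: string[]; // if set, the tool only works with these providers
}

interface FilterContext {
  disabledTools: Set<string>; // layer 1: user config
  enabledFlags: Set<string>;  // layer 2: feature flags
  provider: string;           // layer 3: active LLM provider
}

// All three layers must pass for the tool to be offered to the model.
function isToolAvailable(tool: ToolMeta, ctx: FilterContext): boolean {
  const allowedByUser = !ctx.disabledTools.has(tool.name);
  const allowedByFlag =
    tool.featureFlag === undefined || ctx.enabledFlags.has(tool.featureFlag);
  const allowedByProvider =
    tool.providers === undefined || tool.providers.includes(ctx.provider);
  return allowedByUser && allowedByFlag && allowedByProvider;
}

class ToolRegistry {
  private tools = new Map<string, ToolMeta>();

  register(tool: ToolMeta): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`duplicate tool: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // Returns the tools that pass every filter layer, in registration order.
  available(ctx: FilterContext): ToolMeta[] {
    return Array.from(this.tools.values()).filter((t) => isToolAvailable(t, ctx));
  }
}
```

Keeping the decision in one predicate means a new tool only declares its flag and provider constraints; it never re-implements the availability check.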

Every tool result, success or failure, shares the same `ToolResult` shape with a machine-readable `error` code. The sandbox enforces write-path confinement through `realpath` resolution, preventing symlink bypasses, and uses a two-stage SIGTERM/SIGKILL timeout for bash child processes. The entire core is roughly 1,200 lines of TypeScript, designed to be copied directly into any agent project.
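The EOL normalization mentioned in the summary can be illustrated with a small in-memory edit helper: flatten line endings to LF for matching, then restore the file's original EOL in the result. This is a minimal sketch, not the project's `edit_file` code; the function names are hypothetical, and treating any file that contains a CRLF as a CRLF file is an assumption.

```typescript
type Eol = "\r\n" | "\n";

// Assumption: a file that contains at least one CRLF is treated as a CRLF file.
function detectEol(text: string): Eol {
  return text.includes("\r\n") ? "\r\n" : "\n";
}

function toLf(text: string): string {
  return text.replace(/\r\n/g, "\n");
}

// Replaces the first occurrence of oldText, matching on LF-normalized text.
// Returns null when oldText is not found, so the caller can report an error.
function editText(original: string, oldText: string, newText: string): string | null {
  const eol = detectEol(original);
  const flat = toLf(original);
  const needle = toLf(oldText);
  const index = flat.indexOf(needle);
  if (index === -1) return null;
  const edited = flat.slice(0, index) + toLf(newText) + flat.slice(index + needle.length);
  // Restore the original line ending before the content is written back.
  return eol === "\n" ? edited : edited.replace(/\n/g, "\r\n");
}
```

The model almost always emits LF in its `oldText`, so matching against the flattened text is what lets an edit succeed on a CRLF file.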

Takeaways
Tool availability is decided by a three-layer AND filter: user-disabled list, feature flags, and provider compatibility — all three must pass for a tool to be enabled.
Type erasure via `eraseTool` decouples the registry (which stores `AnyAgentTool`) from each tool's own strict `AgentTool<I, O>` signature; callers that need the concrete type re-assert it by tool name.
Zod-first tool definitions produce TS types, JSONSchema, and a runtime validator from a single schema source, eliminating sync drift between the three.
Validation failures return `ValidationIssue[]` with JSON Pointer paths and Chinese-language messages structured to feed directly back to the LLM for self-correction.
Every tool result uses the same `ToolResult` shape — `success`, `data`, `error` — so the LLM never sees structurally different success and failure responses.
Write-path confinement uses `realpath` to resolve symlinks before checking against allowed roots, closing the `/var/www -> /home/user/www` bypass.
Bash child processes get a two-stage kill: SIGTERM, a 5-second grace window, then SIGKILL, so commands like `npm install` cannot leave orphaned processes behind after a timeout.
Read tools are fanned out in parallel; write tools run serially, enforced by `ToolKind` classification rather than per-tool logic.
CRLF/LF normalization in `edit_file` flattens line endings for matching, then restores the original EOL on write, so Windows users don't break LLM edits.
The `@` path prefix (from system prompt syntax) is stripped centrally in `resolvePath`, so no individual tool can forget to handle it.
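The read/write scheduling rule from the list above can be sketched as a single dispatcher keyed on `ToolKind`. The `ToolCall` shape and `runBatch` name are hypothetical, and running all reads before the writes is an assumption; the article only states that reads fan out in parallel and writes run serially.

```typescript
type ToolKind = "read" | "write";

// Hypothetical call shape: each call carries its kind and a thunk to run it.
interface ToolCall {
  name: string;
  kind: ToolKind;
  run: () => Promise<string>;
}

// Runs read calls concurrently, then write calls one at a time.
// Results are returned in the same order as the input calls.
async function runBatch(calls: ToolCall[]): Promise<string[]> {
  const results: string[] = new Array<string>(calls.length);
  const indexed = calls.map((call, index) => ({ call, index }));

  // Reads have no side effects, so they can all start at once.
  await Promise.all(
    indexed
      .filter((item) => item.call.kind === "read")
      .map(async ({ call, index }) => {
        results[index] = await call.run();
      }),
  );

  // Writes may touch the same files, so each one waits for the previous one.
  for (const { call, index } of indexed.filter((item) => item.call.kind === "write")) {
    results[index] = await call.run();
  }
  return results;
}
```

Because the dispatcher decides concurrency from `kind` alone, an individual tool never needs its own locking logic.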
Conclusions

Most agent frameworks treat tool validation as each tool's own problem. Centralizing it into a schema validator that speaks the same error language regardless of whether the schema came from Zod or raw JSONSchema is a genuine architectural upgrade.

The decision to use duck-typing (`_def` and `safeParse`) instead of `instanceof z.ZodType` avoids pulling Zod into every project that only uses JSONSchema tools — a small detail with real bundle-size consequences.
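A duck-type check of this kind needs no import from Zod at all. The sketch below tests only for the two members named above, `_def` and `safeParse`; the `ZodLike` interface and the `isZodSchema` name are illustrative, not the project's.

```typescript
// Structural stand-in for a Zod schema: only the two members the check relies on.
interface ZodLike {
  _def: unknown;
  safeParse: (input: unknown) => unknown;
}

// True when the value looks like a Zod schema, without importing the zod package.
function isZodSchema(schema: unknown): schema is ZodLike {
  if (typeof schema !== "object" || schema === null) return false;
  const candidate = schema as Record<string, unknown>;
  return "_def" in candidate && typeof candidate.safeParse === "function";
}
```

A plain JSONSchema object such as `{ type: "object" }` fails the check and falls through to the JSONSchema path.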

Writing validation error messages in Chinese is a pragmatic hack: LLMs trained on predominantly Chinese prompt corpora self-correct faster when errors match that language, and JSON Pointer paths add machine precision on top.

`additionalProperties: false` on tool parameter schemas is underrated defense against LLM hallucination — extra fields the model invents get rejected before they reach tool logic.

The `data` field being a plain string rather than serialized JSON is a design choice that respects how LLMs actually consume information: natural language with numbered lists is usually easier for a model to act on than a raw object dump.

Tool descriptions function as the LLM's system prompt for tool selection; writing them with explicit 'do not use for X' clauses is effectively prompt engineering at the schema level.

Concepts & terms
Type Erasure
A pattern where a generic type `AgentTool<I, O>` is wrapped into a non-generic `AnyAgentTool` with `args: unknown`, so a registry can store heterogeneous tools without losing each tool's internal strong typing. Callers re-assert the concrete type by tool name before calling `execute`.
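A minimal sketch of the pattern, assuming arguments have already been validated before `execute` is called. The interface bodies, the `count_lines` tool, the `ToolTypes` map, and `getTool` are hypothetical; only the names `AgentTool`, `AnyAgentTool`, and `eraseTool` come from the article.

```typescript
interface AgentTool<I, O> {
  name: string;
  execute: (args: I) => Promise<O>;
}

// Erased form: what the registry stores.
interface AnyAgentTool {
  name: string;
  execute: (args: unknown) => Promise<unknown>;
}

// Wraps a typed tool so it fits the registry.
// Assumption: args were validated upstream, so the cast is safe.
function eraseTool<I, O>(tool: AgentTool<I, O>): AnyAgentTool {
  return { name: tool.name, execute: (args: unknown) => tool.execute(args as I) };
}

// Hypothetical tool with a strict signature.
const countLines: AgentTool<{ text: string }, number> = {
  name: "count_lines",
  execute: async ({ text }) => text.split("\n").length,
};

// Hypothetical name-to-type map used to re-assert the concrete type.
interface ToolTypes {
  count_lines: AgentTool<{ text: string }, number>;
}

const registry = new Map<string, AnyAgentTool>();
registry.set(countLines.name, eraseTool(countLines));

// Looks a tool up by name and hands it back with its concrete type.
function getTool<K extends keyof ToolTypes>(name: K): ToolTypes[K] {
  const tool = registry.get(name);
  if (tool === undefined) throw new Error(`unknown tool: ${name}`);
  return tool as unknown as ToolTypes[K];
}
```

With this in place, `getTool("count_lines").execute({ text })` type-checks against the tool's own input type, while the registry itself stays non-generic.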
Zod ↔ JSONSchema Dual Track
A validation architecture where tools can declare their parameter schema in either raw JSONSchema or Zod. A duck-type check detects which one is in use, and both paths produce the same `ValidationIssue[]` output, so downstream consumers (LLM, Reflector, UI) never branch on schema format.
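The dual track can be sketched as one `validateArgs` entry point with two back ends that both emit `ValidationIssue[]`. This is an illustration under several assumptions: the `path` and `message` field names are guesses, the JSONSchema track supports only a tiny subset (flat objects with primitive properties), messages are in English where the article's are Chinese, and `ZodLike` is a structural stand-in that mirrors the result shape of Zod's `safeParse` so the block needs no dependency.

```typescript
interface ValidationIssue {
  path: string; // JSON Pointer, e.g. "/files/0"
  message: string;
}

// Structural stand-in for a Zod schema and its safeParse result.
interface ZodLikeIssue { path: (string | number)[]; message: string }
type ZodLikeResult =
  | { success: true }
  | { success: false; error: { issues: ZodLikeIssue[] } };
interface ZodLike { _def: unknown; safeParse: (input: unknown) => ZodLikeResult }

// Deliberately tiny JSONSchema subset for the sketch.
interface ObjectSchema {
  type: "object";
  properties: Record<string, { type: "string" | "number" | "boolean" }>;
  required?: string[];
  additionalProperties?: boolean;
}

// Encodes path segments as a JSON Pointer (RFC 6901 escaping).
function toPointer(segments: (string | number)[]): string {
  return segments
    .map((s) => "/" + String(s).replace(/~/g, "~0").replace(/\//g, "~1"))
    .join("");
}

function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function validateJsonSchema(schema: ObjectSchema, args: unknown): ValidationIssue[] {
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    return [{ path: "", message: "expected an object" }];
  }
  const input = args as Record<string, unknown>;
  const issues: ValidationIssue[] = [];
  for (const key of schema.required ?? []) {
    if (!hasOwn(input, key)) {
      issues.push({ path: toPointer([key]), message: "missing required field" });
    }
  }
  for (const key of Object.keys(input)) {
    if (!hasOwn(schema.properties, key)) {
      // additionalProperties: false rejects fields the model invented.
      if (schema.additionalProperties === false) {
        issues.push({ path: toPointer([key]), message: "unknown field" });
      }
      continue;
    }
    const expected = schema.properties[key].type;
    if (typeof input[key] !== expected) {
      issues.push({ path: toPointer([key]), message: `expected ${expected}` });
    }
  }
  return issues;
}

// Single entry point: both tracks produce the same issue structure.
function validateArgs(schema: ZodLike | ObjectSchema, args: unknown): ValidationIssue[] {
  if ("_def" in schema && typeof (schema as ZodLike).safeParse === "function") {
    const result = (schema as ZodLike).safeParse(args);
    return result.success
      ? []
      : result.error.issues.map((i) => ({ path: toPointer(i.path), message: i.message }));
  }
  return validateJsonSchema(schema as ObjectSchema, args);
}
```

An empty array means the arguments are valid; anything else can be serialized straight into the next turn for the model to correct.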
ToolResult Protocol
A contract requiring every tool to return `{ success: boolean, data: string, error?: string }` for both success and failure. The `error` field carries a stable machine-readable code (e.g., `TEXT_NOT_FOUND`), while `data` carries a human-readable description the LLM can consume directly.
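A small sketch of the contract in use. The `ToolResult` shape and the `TEXT_NOT_FOUND` code are from the article; the `ok`, `fail`, `searchLines`, and `safeExecute` helpers and the `INTERNAL_ERROR` code are hypothetical. The wrapper shows one way a forgotten try/catch inside a tool could still surface as a normal failure result.

```typescript
interface ToolResult {
  success: boolean;
  data: string;   // human-readable text the LLM consumes directly
  error?: string; // stable machine-readable code, present on failure
}

const ok = (data: string): ToolResult => ({ success: true, data });
const fail = (error: string, data: string): ToolResult => ({ success: false, data, error });

// Hypothetical tool body: returns matches as a numbered list, not a JSON dump.
function searchLines(content: string, query: string): ToolResult {
  const hits = content
    .split("\n")
    .map((line, i) => ({ line, lineNumber: i + 1 }))
    .filter((hit) => hit.line.includes(query));
  if (hits.length === 0) {
    return fail("TEXT_NOT_FOUND", `No line contains "${query}".`);
  }
  const list = hits.map((hit, i) => `${i + 1}. line ${hit.lineNumber}: ${hit.line}`).join("\n");
  return ok(`Found ${hits.length} matching line(s):\n${list}`);
}

// Central wrapper: an exception inside a tool becomes a failure result
// with the same shape. INTERNAL_ERROR is an assumed code name.
async function safeExecute(run: () => Promise<ToolResult>): Promise<ToolResult> {
  try {
    return await run();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return fail("INTERNAL_ERROR", `Tool crashed: ${message}`);
  }
}
```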
Sandbox Path Confinement
A centralized function (`confine`) that resolves a target path through `realpath` to eliminate symlinks, then checks it falls within a whitelist of allowed roots. Any `..` escape, absolute path outside roots, or symlink redirect is rejected before a write tool executes.
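One way such a `confine` function might look on Node.js is sketched below. The signature, the `PATH_OUTSIDE_SANDBOX` message, and the helper names are assumptions. Because a write target often does not exist yet, the sketch resolves the deepest existing ancestor with `realpathSync` and re-appends the missing tail, and it rejects dangling symlinks outright.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

function isSymlink(p: string): boolean {
  try {
    return fs.lstatSync(p).isSymbolicLink();
  } catch {
    return false;
  }
}

// Resolves symlinks in the deepest existing ancestor, then re-appends the
// not-yet-created tail, so targets of new files can still be checked.
function resolveReal(target: string): string {
  let current = path.resolve(target);
  const pending: string[] = [];
  for (;;) {
    try {
      return path.join(fs.realpathSync(current), ...pending);
    } catch (err) {
      const missing = (err as { code?: string }).code === "ENOENT";
      const parent = path.dirname(current);
      // A symlink whose target is missing could be written through: reject it.
      if (!missing || isSymlink(current) || parent === current) throw err;
      pending.unshift(path.basename(current));
      current = parent;
    }
  }
}

// Returns the resolved path if it lies inside an allowed root, else throws.
function confine(target: string, allowedRoots: string[]): string {
  const real = resolveReal(target);
  for (const root of allowedRoots) {
    const rel = path.relative(fs.realpathSync(root), real);
    const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    if (!escapes) return real;
  }
  throw new Error(`PATH_OUTSIDE_SANDBOX: ${target}`);
}
```

Comparing with `path.relative` rather than a string prefix avoids accepting a sibling such as `/work/project-evil` when the root is `/work/project`.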
Two-Stage Process Kill
A timeout strategy for child processes: send SIGTERM first, wait 5 seconds for graceful shutdown, then send SIGKILL. Prevents orphaned grandchild processes from commands that spawn sub-processes (npm install, cargo build).
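A sketch of the two-stage kill using Node's `child_process`. The article does not say how grandchildren are reached; signalling the whole process group via a detached child and a negative PID is an assumption here, and it is POSIX-only. `runBash`, `BashOutcome`, and the parameter names are hypothetical.

```typescript
import { spawn } from "node:child_process";

interface BashOutcome {
  exitCode: number | null;
  signal: string | null;
  timedOut: boolean;
}

// Signals the child's whole process group (negative pid), so grandchildren
// spawned by the command receive it too. Errors mean the group is already gone.
function signalGroup(pid: number | undefined, signal: "SIGTERM" | "SIGKILL"): void {
  if (pid === undefined) return;
  try {
    process.kill(-pid, signal);
  } catch {
    // Nothing left to signal.
  }
}

function runBash(command: string, timeoutMs: number, graceMs = 5000): Promise<BashOutcome> {
  return new Promise((resolve, reject) => {
    // detached: true makes the child the leader of its own process group.
    const child = spawn("bash", ["-c", command], { stdio: "ignore", detached: true });
    let timedOut = false;

    const termTimer = setTimeout(() => {
      timedOut = true;
      signalGroup(child.pid, "SIGTERM"); // stage 1: ask politely
      // Stage 2: force-kill whatever is still alive after the grace window.
      // This timer is left armed even if bash itself exits, to sweep stragglers.
      setTimeout(() => signalGroup(child.pid, "SIGKILL"), graceMs);
    }, timeoutMs);

    child.once("error", (err) => {
      clearTimeout(termTimer);
      reject(err);
    });
    child.once("exit", (exitCode, signal) => {
      clearTimeout(termTimer);
      resolve({ exitCode, signal, timedOut });
    });
  });
}
```

A command that ignores SIGTERM, such as `trap '' TERM; sleep 30`, ends with `signal` set to `SIGKILL` once the grace window has passed.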