
Tool Use Isn't Magic: The Three-Stage Pipeline That Makes LLMs Actually Do Things

By Oo920 · Jun 26, 2026

Read original on juejin.cn

For Western developers building AI agents, this is the common architecture behind tool-calling systems from OpenAI, Anthropic, and open-source models, although field names differ by provider (the ones used here follow the OpenAI-style chat format). Understanding the three-stage pipeline, not just the API surface, is what separates production-grade agent implementations from toy demos.

Summary

An LLM is a brain trapped in a server — it can't see a screen, touch a keyboard, or call an API. Yet users watch it search the web, analyze spreadsheets, and control computers. The mechanism behind this illusion is Tool Use, and it works through a precise three-stage pipeline.

First, cognitive implantation: every tool is described by a name, a plain-language description, and a JSON Schema for its parameters. Together these form a text-based instruction manual the LLM can read. The LLM has no built-in notion of your API; it only sees these descriptions. Second, intent recognition: when the LLM encounters a question it can't answer from training data (like a real-time stock price), it outputs a structured `tool_calls` array specifying which function to call and with what arguments. The LLM never executes anything; it only produces instructions.
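
As a rough illustration of these first two stages, the sketch below shows a tool definition in the OpenAI-style chat format and a hand-written example of the assistant message a model might return. The tool name `get_stock_price`, its `ticker` parameter, and the id `call_abc123` are hypothetical, chosen only for illustration; other providers use different field names.

```javascript
// Stage 1: the tool "manual" the model reads. Only `parameters` is JSON Schema;
// `name` and `description` are plain strings.
// `get_stock_price` and `ticker` are hypothetical names for this sketch.
const tools = [
  {
    type: "function",
    function: {
      name: "get_stock_price",
      description: "Get the latest trading price for a stock ticker symbol.",
      parameters: {
        type: "object",
        properties: {
          ticker: { type: "string", description: "Ticker symbol, e.g. AAPL" },
        },
        required: ["ticker"],
      },
    },
  },
];

// Stage 2: a hand-written example of what the model's reply could look like.
// Nothing has been executed yet; this is only an instruction.
const assistantMessage = {
  role: "assistant",
  content: null,
  tool_calls: [
    {
      id: "call_abc123", // made-up id
      type: "function",
      function: { name: "get_stock_price", arguments: '{"ticker":"AAPL"}' },
    },
  ],
};

const call = assistantMessage.tool_calls[0];
const args = JSON.parse(call.function.arguments); // arguments arrive as a string
console.log(call.function.name, args.ticker); // prints: get_stock_price AAPL
```

The schema is ordinary data sent along with the request, which is why a clear `description` matters: it is the only thing the model has to go on when deciding whether the tool fits the question.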

Third, code intervention: application-layer code parses the `tool_calls`, executes the actual function, and pushes the result back into the messages array with a `tool` role. The LLM then reads that result and generates a natural-language response. A single tool round therefore takes two LLM calls, one to decide and one to summarize, and the messages array acts as the nervous system connecting the brain to its tools.
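
The whole round trip can be sketched without a network connection. In the example below, `fakeModel` is a hypothetical stand-in for a real chat API request (a real application would send `messages` and the tool definitions to its provider instead), and `get_stock_price` with its fixed price is dummy data. Only the message handling is meant to mirror the flow described above.

```javascript
// Hypothetical stand-in for a real chat API request. It asks for a tool
// until a tool result is present, then answers in plain text.
function fakeModel(messages) {
  const last = messages[messages.length - 1];
  if (last.role === "tool") {
    const { price } = JSON.parse(last.content);
    return { role: "assistant", content: `AAPL is trading at $${price}.` };
  }
  return {
    role: "assistant",
    content: null,
    tool_calls: [
      {
        id: "call_1",
        type: "function",
        function: { name: "get_stock_price", arguments: '{"ticker":"AAPL"}' },
      },
    ],
  };
}

// Local tool implementations, keyed by the name the model will emit.
// The fixed price is dummy data, not a real quote.
const registry = {
  get_stock_price: ({ ticker }) => ({ ticker, price: 123.45 }),
};

function runOnce(question) {
  const messages = [{ role: "user", content: question }];

  // Call 1: the model decides.
  const decision = fakeModel(messages);
  messages.push(decision); // push the assistant message exactly once
  if (!decision.tool_calls) return messages; // answered without a tool

  // Stage 3: application code executes each requested tool.
  for (const call of decision.tool_calls) {
    const args = JSON.parse(call.function.arguments);
    const result = registry[call.function.name](args);
    messages.push({
      role: "tool",
      tool_call_id: call.id, // ties the result to the request
      content: JSON.stringify(result),
    });
  }

  // Call 2: the model summarizes using the tool result.
  messages.push(fakeModel(messages));
  return messages;
}

const transcript = runOnce("What is AAPL trading at right now?");
console.log(transcript.map((m) => m.role).join(" -> "));
// prints: user -> assistant -> tool -> assistant
```

Note that the tool result goes back into `messages`, not to the user; the user only ever sees the final assistant entry.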

Takeaways

— Tool Use works through three stages: cognitive implantation (describing tools as JSON Schema), intent recognition (LLM outputs tool_calls), and code intervention (your code executes the function).

— The LLM never executes tools — it only outputs structured instructions. Your application code does the actual execution.

— Every tool call in an assistant message must be answered by a tool message carrying the matching tool_call_id before the next request; otherwise the API rejects the request with an error.

— Tool results are returned to the LLM, not directly to the user. The LLM then generates the final response based on the complete context.

— The messages array typically grows by four entries over two API calls: user question, assistant with tool_calls, tool result, and final assistant response.

— tool_choice: 'auto' lets the LLM decide whether to use a tool; you can also force a specific tool or disable tool use.

— JSON.parse is required on function.arguments because it's a string, not a JavaScript object.

— Multiple tool_calls can be returned in one response for parallel tool execution.
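
Several of these points (arguments arriving as a string, multiple calls per response, and one tool message per call) can be combined in a small helper. This is a minimal sketch under stated assumptions: `runToolCalls`, `get_weather`, and `get_time` are hypothetical names, and the sample calls are hand-written rather than real model output.

```javascript
// Hypothetical local tools; names and return values are dummy data.
const registry = {
  get_weather: async ({ city }) => ({ city, tempC: 21 }),
  get_time: async ({ city }) => ({ city, time: "09:00" }),
};

// Turn every tool call into exactly one tool message, running them concurrently.
// Failures are reported back as content so that no tool_call_id is left
// unanswered and the model can react to the error.
async function runToolCalls(toolCalls) {
  return Promise.all(
    toolCalls.map(async (call) => {
      let content;
      try {
        const args = JSON.parse(call.function.arguments); // throws on bad JSON
        const tool = registry[call.function.name];
        if (!tool) throw new Error(`unknown tool: ${call.function.name}`);
        content = JSON.stringify(await tool(args));
      } catch (err) {
        content = JSON.stringify({ error: String(err.message) });
      }
      return { role: "tool", tool_call_id: call.id, content };
    })
  );
}

// Hand-written sample of a response carrying two calls, one with broken JSON.
const sampleCalls = [
  { id: "call_a", type: "function", function: { name: "get_weather", arguments: '{"city":"Oslo"}' } },
  { id: "call_b", type: "function", function: { name: "get_time", arguments: '{"city":' } },
];

const toolMessagesPromise = runToolCalls(sampleCalls);
toolMessagesPromise.then((toolMessages) => console.log(toolMessages));
```

Wrapping `JSON.parse` in `try`/`catch` is a defensive choice: the arguments string is generated text, so it is safer not to assume it is always valid JSON.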

Conclusions

The most common production bug — pushing assistant messages twice — reveals how fragile the message protocol is and why developers need to understand the state machine, not just copy-paste SDK examples.
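
One way to catch this class of bug during development is to check the messages array before sending it. The helper below is a hypothetical sketch, not something an SDK provides: `findProtocolErrors` flags tool call ids that appear more than once (a typical symptom of pushing the assistant message twice) and tool calls that are not directly followed by a matching tool message.

```javascript
// Returns a list of human-readable problems found in an OpenAI-style messages array.
function findProtocolErrors(messages) {
  const errors = [];
  const seenCallIds = new Set();

  messages.forEach((msg, i) => {
    if (msg.role !== "assistant" || !msg.tool_calls) return;

    for (const call of msg.tool_calls) {
      // A repeated call id usually means the assistant message was pushed twice.
      if (seenCallIds.has(call.id)) {
        errors.push(`message ${i}: duplicate tool call id ${call.id}`);
      }
      seenCallIds.add(call.id);
    }

    // Collect the tool messages that directly follow this assistant message.
    const answered = new Set();
    for (let j = i + 1; j < messages.length && messages[j].role === "tool"; j++) {
      answered.add(messages[j].tool_call_id);
    }
    for (const call of msg.tool_calls) {
      if (!answered.has(call.id)) {
        errors.push(`message ${i}: tool call ${call.id} has no tool message`);
      }
    }
  });

  return errors;
}

// Hand-written sample messages; the tool name and values are made up.
const user = { role: "user", content: "Price of AAPL?" };
const assistant = {
  role: "assistant",
  content: null,
  tool_calls: [
    { id: "call_1", type: "function", function: { name: "get_stock_price", arguments: "{}" } },
  ],
};
const toolResult = { role: "tool", tool_call_id: "call_1", content: '{"price":123.45}' };

console.log(findProtocolErrors([user, assistant, toolResult])); // prints: []
console.log(findProtocolErrors([user, assistant, assistant, toolResult])); // two problems
```

In the second call, the first copy of the assistant message is reported as unanswered and the second copy as a duplicate, which points directly at the double push.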

The fact that LLMs can't execute code but can read function signatures described in JSON Schema and emit matching calls is a profound architectural constraint: the model's power is in pattern matching, not action.

Calling Tool Use 'just API calls' misses the point — the real innovation is the message protocol that lets a probabilistic text generator orchestrate deterministic function execution.

The two-call pattern (decision then summary) is an elegant solution to the fundamental problem: LLMs are good at reasoning but bad at doing, and code is good at doing but bad at reasoning.

Concepts & terms

Tool Use

A mechanism that allows an LLM to call external functions (APIs, databases, file operations) by outputting structured instructions, which application code then executes. The LLM decides which tool to call and with what parameters, but never executes the tool itself.

Cognitive Implantation

The process of describing tools to an LLM using JSON Schema — translating function signatures, parameters, and purposes into natural language descriptions that the model can understand and reason about.

tool_calls

A structured output from an LLM that specifies which functions to call and with what arguments. It is an array, and each entry includes a unique id, a function name, and a JSON string of arguments. The LLM produces this instead of (or alongside) a text response.

messages array

The conversation history passed to an LLM API, containing entries with roles (system, user, assistant, tool). It holds the state of the entire tool-calling flow across multiple API calls.
