Artificial Intelligence · Architecture · Frontend

Build an Agent That Thinks Out Loud with LangChain and TypeScript

By 没落英雄
Read original on juejin.cn

Most LLM demos stop at a single prompt-response pair. A tool-calling agent with streaming, memory, and visible reasoning closes the gap between a chatbot and a system that can actually do work. The content-type edge cases in the message stream are exactly the kind of detail that breaks real integrations.

Summary

Starting from an empty folder, this walkthrough constructs a functional AI agent using LangChain, LangGraph, and TypeScript. Two tools—a calculator and a time fetcher—are defined with Zod schemas that tell the LLM exactly how to call them. The agent is wired together with `createDeepAgent`, which loops the LLM through tool execution until it can answer without further calls.
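To make the loop concrete, here is a minimal sketch in plain TypeScript with no LangChain imports. Everything in it is a hypothetical stand-in: `fakeModel` is a scripted function rather than a real LLM, and the `add` and `current_time` tools only mimic the calculator and time fetcher described above. It is not how `createDeepAgent` is implemented; it only shows the control flow of calling tools until the model stops asking for them.

```typescript
// Hypothetical stand-ins: none of these types come from LangChain or deepagents.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelReply = { text: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type Tool = (args: Record<string, unknown>) => string;

const tools: Record<string, Tool> = {
  // Adds two numbers; a real calculator tool would accept richer input.
  add: (args) => String(Number(args.a) + Number(args.b)),
  // Returns the current time as an ISO 8601 string.
  current_time: () => new Date().toISOString(),
};

// Scripted fake model: asks for one tool call, then answers from its result.
function fakeModel(history: Message[]): ModelReply {
  const last = history[history.length - 1];
  if (last && last.role === "tool") {
    return { text: `The result is ${last.content}.`, toolCalls: [] };
  }
  return { text: "", toolCalls: [{ name: "add", args: { a: 2, b: 3 } }] };
}

function runAgent(question: string, maxSteps = 5): string {
  const history: Message[] = [{ role: "user", content: question }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = fakeModel(history);
    // No tool calls means the model has produced its final answer.
    if (reply.toolCalls.length === 0) return reply.text;
    history.push({ role: "assistant", content: reply.text });
    // Execute each requested tool and feed the result back as a new message.
    for (const call of reply.toolCalls) {
      const tool = tools[call.name];
      const result = tool ? tool(call.args) : `Unknown tool: ${call.name}`;
      history.push({ role: "tool", content: result });
    }
  }
  return "Stopped: step limit reached.";
}

console.log(runAgent("What is 2 + 3?"));
```

The `maxSteps` guard is an addition for safety in this sketch; without some limit, a model that keeps requesting tools would loop forever.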

Streaming output is handled by iterating over message tuples, with careful attention to the dual nature of `content`: a plain string when no tool is invoked, and an array of blocks when tool calls are in play. Multi-turn memory is achieved through `thread_id` and a `MemorySaver` checkpointer, letting the agent recall prior calculations when a user challenges a result.
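The dual nature of `content` described above can be handled with one small normalizing helper. The sketch below uses hypothetical types (`ContentBlock`, `MessageContent`) that only imitate the shapes the article describes; they are not LangChain's own type definitions, and the helper name `contentToText` is made up for illustration.

```typescript
// Hypothetical shapes modelled on the description above, not LangChain's types.
type ContentBlock = { type: string; text?: string; [key: string]: unknown };
type MessageContent = string | ContentBlock[];

// Returns the printable text of a message regardless of which form it takes.
function contentToText(content: MessageContent): string {
  // Plain replies: content is already a string.
  if (typeof content === "string") return content;
  // Tool-calling replies: keep only the text blocks and join them in order.
  return content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("");
}

console.log(contentToText("Hello"));
console.log(
  contentToText([
    { type: "text", text: "Let me " },
    { type: "tool_use", name: "calculator", input: {} },
    { type: "text", text: "check." },
  ]),
);
```

Calling a helper like this on every streamed chunk keeps the printing code free of `typeof` checks and makes the non-text blocks an explicit, ignorable case.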

Extended Thinking surfaces the model's internal monologue as gray ANSI text before the final answer, making the reasoning chain visible. The entire stack runs directly via `tsx` with no compilation step, and the Anthropic-compatible API endpoint allows swapping in models like Qwen without changing the client code.
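A rough sketch of the gray rendering follows. It assumes each block carries a `type` field and that thinking blocks hold their text in a `thinking` property, which mirrors the Anthropic block format as far as this summary describes it; the `Block` type and `renderBlocks` function are illustrative names, not part of any library. The escape code `\x1b[90m` selects the "bright black" foreground that most terminals display as gray.

```typescript
// ANSI escape codes: 90 = bright black (shown as gray), 0 = reset all styles.
const GRAY = "\x1b[90m";
const RESET = "\x1b[0m";

// Hypothetical block shape for illustration only.
type Block = { type: string; text?: string; thinking?: string };

// Renders thinking blocks in gray on their own line, then the answer text as-is.
function renderBlocks(blocks: Block[]): string {
  let out = "";
  for (const block of blocks) {
    if (block.type === "thinking" && block.thinking) {
      out += `${GRAY}${block.thinking}${RESET}\n`;
    } else if (block.type === "text" && block.text) {
      out += block.text;
    }
    // Any other block type (for example a tool call) is skipped here.
  }
  return out;
}

console.log(
  renderBlocks([
    { type: "thinking", thinking: "2 + 3 is 5." },
    { type: "text", text: "The answer is 5." },
  ]),
);
```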

Takeaways
An agent is just an LLM wrapped in a loop that calls tools until no more tool calls are needed.
Zod schemas are converted to JSON Schema and sent to the LLM; without `.describe()` annotations, the model guesses parameter meanings.
`createDeepAgent` from the `deepagents` package bundles model, tools, system prompt, and a checkpointer into a single LangGraph-backed agent.
AI message `content` is a string for plain text replies but becomes an array of blocks when tool calls are involved—both cases must be handled in streaming code.
`MemorySaver` stores conversation history by `thread_id`; swapping it for `PostgresSaver` makes memory survive restarts.
Extended Thinking injects a `thinking` block into the message content array, which can be rendered separately to show the model's reasoning.
Pointing `ANTHROPIC_BASE_URL` at a proxy lets any Anthropic-compatible model (like Qwen) work with `ChatAnthropic` unchanged.
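The `thread_id` takeaway can be pictured with a toy store. This is only a conceptual sketch: `InMemoryThreadStore` is a hypothetical class, not the `MemorySaver` API, and a real checkpointer stores full graph state rather than a flat list of turns. What it does show is why history is isolated per thread and why it vanishes when the process exits.

```typescript
type Turn = { role: "user" | "assistant"; content: string };

// Hypothetical stand-in for an in-memory checkpointer keyed by thread id.
class InMemoryThreadStore {
  // Lives only in process memory, so a restart loses every thread.
  private threads = new Map<string, Turn[]>();

  // Returns a copy of the history so callers cannot mutate stored state.
  load(threadId: string): Turn[] {
    return [...(this.threads.get(threadId) ?? [])];
  }

  // Appends one turn to the given thread, creating the thread if needed.
  append(threadId: string, turn: Turn): void {
    const turns = this.threads.get(threadId) ?? [];
    turns.push(turn);
    this.threads.set(threadId, turns);
  }
}

const store = new InMemoryThreadStore();
store.append("session-1", { role: "user", content: "What is 12 * 12?" });
store.append("session-1", { role: "assistant", content: "144" });

// A follow-up in the same thread sees the earlier exchange.
console.log(store.load("session-1").length);
// A different thread id starts with an empty history.
console.log(store.load("session-2").length);
```

Swapping the `Map` for a database table keyed by the same id is, in spirit, what moving to a persistent backend buys.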
Conclusions

The jump from a single LLM call to an agent loop is conceptually small but operationally large—streaming, tool result re-injection, and multi-turn state all compound quickly.

Zod's dual role as TypeScript validator and LLM instruction manual is an underappreciated design pattern; the schema is simultaneously code contract and prompt engineering.
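To illustrate the "instruction manual" half of that role, the sketch below hand-writes the kind of JSON Schema a tool definition might be converted to. It does not call Zod; the schema object, the `expression` and `precision` parameter names, and the `undescribedParams` helper are all hypothetical. The point is only that a parameter without a `description` gives the model nothing but a name and a type to go on.

```typescript
// Minimal, hypothetical subset of JSON Schema used for tool parameters.
type JsonSchemaProperty = { type: string; description?: string };
type ToolSchema = {
  type: "object";
  properties: Record<string, JsonSchemaProperty>;
  required: string[];
};

// Roughly what the model would receive for a calculator tool.
const calculatorSchema: ToolSchema = {
  type: "object",
  properties: {
    expression: {
      type: "string",
      description: 'Arithmetic expression to evaluate, e.g. "2 + 3 * 4"',
    },
    // No description: the model has to guess what this number means.
    precision: { type: "number" },
  },
  required: ["expression"],
};

// Lists parameters the model would have to interpret from the name alone.
function undescribedParams(schema: ToolSchema): string[] {
  return Object.entries(schema.properties)
    .filter(([, prop]) => !prop.description)
    .map(([name]) => name);
}

console.log(undescribedParams(calculatorSchema));
```

A check like this could run in a test suite to catch tool parameters that were added without a description.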

Handling both string and array `content` types is a rite of passage for LangChain streaming; code that assumes a string breaks quietly the first time a tool call turns `content` into an array of blocks.

Extended Thinking turns the model from a black box into a debuggable system—seeing the reasoning chain makes failures diagnosable rather than mysterious.

Using an Anthropic-compatible proxy to run non-Anthropic models is a practical bridge that avoids vendor lock-in while keeping the developer experience consistent.

Concepts & terms
Agent Loop
The cycle where an LLM decides to call a tool, the framework executes it, and the result is fed back to the LLM, repeating until the LLM produces a final answer without further tool calls.
Tool Calling
A mechanism where an LLM outputs a structured request (tool name + parameters) instead of free text, which the host framework intercepts, executes, and returns the result as a new message.
Zod Schema
A schema written with Zod, a runtime type-validation library for TypeScript. In agent frameworks, Zod schemas define tool parameters and are automatically converted to JSON Schema so the LLM knows what arguments to pass.
MemorySaver
LangGraph's in-memory checkpointer that stores conversation state keyed by `thread_id`. State is lost on process restart; production deployments typically use `PostgresSaver` or similar persistent backends.
Extended Thinking
An Anthropic API feature that lets a model emit an internal reasoning trace (a `thinking` block) before its final response, making the chain of thought visible and debuggable.
ChatAnthropic
LangChain's ChatModel implementation for the Anthropic API. It can also target any Anthropic-compatible endpoint by overriding `anthropicApiUrl`, enabling use with proxy servers or alternative models.
Source: juejin.cn