AI Coding · Claude · Frontend

Claude Code Gets a Browser UI with Real-Time Tool Approval

By 我不是外星人
Original article on juejin.cn

Moving Claude Code into a browser with controllable approval hooks turns a developer-only CLI agent into a tool that product managers, operations staff, or any non-engineer can use for workflow automation. The frame-based protocol also makes it straightforward to build custom agent orchestration on top of Claude Code without fighting terminal output parsing.

Summary

Claude Code normally lives in a terminal, which locks out non-developers and makes it hard to intercept or repackage its output. A new open-source project bridges the CLI agent to a browser UI by running a Node.js backend that spawns Claude Code via the official claude-agent-sdk and pipes normalized frames over WebSocket to a thin frontend.

The architecture splits responsibilities cleanly: the Node server drives the agent, handles session persistence, and turns each tool-approval request into a pending promise that it resolves once the browser answers. The browser renders streaming text deltas, tool-use cards, and permission bars, then sends decisions back. A custom protocol of eight frame types decouples the frontend from the SDK's internal event types, so SDK upgrades or MCP changes should not need to touch the UI layer.
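As a rough illustration of the server-side translation step, the sketch below maps a simplified stand-in for SDK messages onto two of the frame types named in this summary (`text_delta` and `tool_use`). The `SdkLikeMessage` and `ContentBlock` types are assumptions made for this example and are not the SDK's real typings; the other frame types are not listed in the source, so they are left out rather than guessed.

```typescript
// Frames sent to the browser. Only frame names that appear in the summary are modelled.
type Frame =
  | { type: "text_delta"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

// Simplified stand-in for SDK messages (assumption for illustration only).
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

type SdkLikeMessage =
  | { type: "stream_event"; event: { type: string; delta?: { type: string; text?: string } } }
  | { type: "assistant"; message: { content: ContentBlock[] } }
  | { type: "system" | "user" | "result" };

// Collapse many internal message shapes into a small, stable set of frames.
function normalize(msg: SdkLikeMessage): Frame[] {
  switch (msg.type) {
    case "stream_event": {
      const delta = msg.event.delta;
      if (
        msg.event.type === "content_block_delta" &&
        delta?.type === "text_delta" &&
        typeof delta.text === "string"
      ) {
        return [{ type: "text_delta", text: delta.text }];
      }
      return []; // other partial events are ignored in this sketch
    }
    case "assistant": {
      const frames: Frame[] = [];
      for (const block of msg.message.content) {
        if (block.type === "tool_use") {
          frames.push({ type: "tool_use", id: block.id, name: block.name, input: block.input });
        }
      }
      return frames;
    }
    default:
      return []; // message kinds the UI does not need produce no frames
  }
}
```

Because the function returns an array, one SDK message can yield zero, one, or several frames, and the WebSocket layer only ever serializes `Frame` values.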

A critical implementation detail is that the SDK's Zod runtime schema is stricter than its TypeScript declarations. The `canUseTool` callback must return `updatedInput` on allow, not just a behavior string, or the agent crashes. The project also defaults to the MiniMax-M3 model and records per-turn token usage and cost.
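The return-shape requirement can be sketched as follows. The `PermissionResult` type here is a local approximation written for this example, and `passesStrictCheck` is a hand-written stand-in for a strict runtime schema (the SDK itself is described as using Zod); the `message` field on a deny result is an assumption.

```typescript
type ToolInput = Record<string, unknown>;

// Local approximation of the callback's return value (assumption).
type PermissionResult =
  | { behavior: "allow"; updatedInput: ToolInput }
  | { behavior: "deny"; message: string };

// Convert a yes/no decision from the UI into the shape the callback should return.
function toPermissionResult(decision: "allow" | "deny", input: ToolInput): PermissionResult {
  if (decision === "allow") {
    // Echo the (possibly edited) input back. Returning only { behavior: "allow" }
    // is the failure case the summary warns about.
    return { behavior: "allow", updatedInput: input };
  }
  return { behavior: "deny", message: "Denied by user" };
}

// Stand-in for a runtime schema that is stricter than the declared types.
function passesStrictCheck(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (v["behavior"] === "allow") {
    return typeof v["updatedInput"] === "object" && v["updatedInput"] !== null;
  }
  if (v["behavior"] === "deny") return typeof v["message"] === "string";
  return false;
}
```

Routing every decision through one builder function keeps the required field in a single place, so a later schema change touches one function instead of every call site.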

Takeaways
Claude Code's official claude-agent-sdk lets JavaScript programs drive the CLI agent, including session resume, tool approval callbacks, and streaming events.
Omitting `settingSources` or passing an empty array prevents the SDK from loading global skills and MCP configurations, which is the key to a zero-config web deployment.
The `includePartialMessages: true` option is required for character-by-character streaming; without it, only complete message blocks arrive.
Inside the `canUseTool` callback, an `allow` decision must return `updatedInput` alongside `behavior`, because the SDK's Zod runtime schema enforces stricter validation than the TypeScript type definitions.
A five-minute timeout on pending permission promises prevents the SDK from hanging indefinitely if a browser tab is closed mid-approval.
The `normalize()` function compresses over 20 SDK internal event types into eight frontend frame types, insulating the UI from SDK version changes.
The frontend contains almost no state; it is a single `switch` statement that dispatches frames to rendering functions, while all agent state lives on the Node server.
Streaming text rendering requires buffering to avoid feeding half-received Markdown syntax (like unclosed backticks) to the parser.
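The buffering point in the last takeaway could be handled in several ways; the sketch below is one possible approach and is not taken from the project. It holds back any text that sits inside an unclosed backtick span or fence and releases it once the matching run of backticks has fully arrived. The names `safePrefixLength` and `MarkdownStreamBuffer` are hypothetical, and fence matching is approximated by requiring the closing run to have the same length as the opening run.

```typescript
// Length of the longest prefix that ends outside any backtick span or fence.
function safePrefixLength(text: string): number {
  let i = 0;
  let safe = 0;
  let open: number | null = null; // length of the backtick run that opened the current span
  while (i < text.length) {
    if (text[i] !== "`") {
      i++;
      if (open === null) safe = i; // plain text outside code is always safe
      continue;
    }
    let j = i;
    while (j < text.length && text[j] === "`") j++;
    if (j === text.length) break; // run touches the end: more backticks may still arrive
    const run = j - i;
    if (open === null) {
      open = run; // opening run: hold everything from here
    } else if (run === open) {
      open = null; // matching closing run: the span is complete
      safe = j;
    }
    i = j;
  }
  return safe;
}

class MarkdownStreamBuffer {
  private text = "";
  private rendered = 0;

  // Add a streamed delta and return only the newly renderable text.
  push(delta: string): string {
    this.text += delta;
    const safe = safePrefixLength(this.text);
    const out = this.text.slice(this.rendered, safe);
    this.rendered = safe;
    return out;
  }

  // At end of turn, release whatever is still held back.
  flush(): string {
    const out = this.text.slice(this.rendered);
    this.rendered = this.text.length;
    return out;
  }
}
```

The held-back text costs a little latency inside code spans, in exchange for never handing the Markdown parser an unbalanced delimiter.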
Conclusions

Anthropic's agent SDK is designed to be embedded, not just invoked. The async `canUseTool` callback turns every dangerous operation into a promise that an external system can resolve, which is the architectural enabler for any human-in-the-loop agent workflow.
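A minimal sketch of that promise-based handoff, including the five-minute timeout mentioned in the takeaways, might look like the following. `PermissionBroker`, its method names, and the choice to treat a timeout as a denial are assumptions for illustration and are not taken from the project or the SDK.

```typescript
type BrowserDecision = "allow" | "deny";

interface PendingEntry {
  resolve: (decision: BrowserDecision) => void;
  timer: ReturnType<typeof setTimeout>;
}

class PermissionBroker {
  private pending = new Map<string, PendingEntry>();
  private timeoutMs: number;

  constructor(timeoutMs: number = 5 * 60 * 1000) {
    this.timeoutMs = timeoutMs;
  }

  // Called from the tool-approval callback: returns a promise the browser settles later.
  request(requestId: string): Promise<BrowserDecision> {
    return new Promise((resolve) => {
      const timer = setTimeout(() => {
        // No answer in time (for example, the tab was closed): deny and clean up.
        this.pending.delete(requestId);
        resolve("deny");
      }, this.timeoutMs);
      this.pending.set(requestId, { resolve, timer });
    });
  }

  // Called when a decision frame arrives over the WebSocket.
  // Returns false for unknown or already settled requests.
  settle(requestId: string, decision: BrowserDecision): boolean {
    const entry = this.pending.get(requestId);
    if (entry === undefined) return false;
    clearTimeout(entry.timer);
    this.pending.delete(requestId);
    entry.resolve(decision);
    return true;
  }

  get pendingCount(): number {
    return this.pending.size;
  }
}
```

In this design the approval callback would `await broker.request(id)` after pushing a `permission_request` frame to the browser, and the WebSocket message handler would call `broker.settle(id, decision)`.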

The strictness mismatch between the SDK's Zod runtime schema and its TypeScript declarations is a recurring source of failures that type checking cannot catch. Developers relying solely on IDE autocompletion can ship code that passes type checks but throws at runtime when a tool approval returns the wrong shape.

Separating the agent runtime (Node) from the UI (browser) with a minimal frame protocol is a pattern that generalizes beyond Claude Code. Any CLI agent with a streaming API can be given a web frontend this way, and the protocol's simplicity means adding new frame types costs only a new `case` branch.
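On the browser side, the dispatch described here can be sketched as a single `switch` over the frame type. To keep the example runnable outside a browser it updates a plain `ViewState` object instead of the DOM; `ViewState`, `applyFrame`, and the fields of `permission_request` are hypothetical, and only the three frame names given in this summary are modelled.

```typescript
type Frame =
  | { type: "text_delta"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "permission_request"; requestId: string; toolName: string };

// Stand-in for what the page shows; a real UI would touch DOM nodes instead.
interface ViewState {
  text: string;
  toolCards: string[];
  pendingApprovals: string[];
}

function applyFrame(view: ViewState, frame: Frame): void {
  switch (frame.type) {
    case "text_delta":
      view.text += frame.text; // append streamed text
      break;
    case "tool_use":
      view.toolCards.push(`${frame.name}: ${JSON.stringify(frame.input)}`);
      break;
    case "permission_request":
      view.pendingApprovals.push(frame.requestId); // show an approve/deny bar
      break;
    // Supporting a new frame type means adding one more `case` here.
  }
}
```

In a page, a function like this would typically be called from the WebSocket `onmessage` handler with the parsed JSON payload.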

Non-developer access to coding agents is less about dumbing down the interface and more about relocating the complexity. Skills, MCP servers, and model selection stay on the backend; the browser only sees text, tool cards, and yes/no decisions.

The five-minute permission timeout is not a nice-to-have. Without it, a single unresponsive browser tab leaves a permission promise pending forever; the session it references can never be garbage-collected, and under load such sessions accumulate until the Node process runs out of memory.

Concepts & terms
claude-agent-sdk
Anthropic's official Node.js library that wraps the Claude Code CLI as a programmable object, exposing its full toolset, session persistence, permission callbacks, and streaming events through a `query()` API.
canUseTool callback
An async function passed to the SDK that intercepts every tool invocation. It returns a Promise, allowing an external system (like a browser UI) to approve, deny, or modify the tool input before execution proceeds.
settingSources
An SDK option that takes an array of configuration layers (`'user'`, `'project'`, `'local'`) and controls which Claude Code settings are loaded. Omitting it, or passing an empty array, prevents loading global skills and MCP servers, enabling a zero-config deployment.
normalize()
A server-side function that compresses the SDK's 20+ internal event types into a small set of frontend-friendly frame types (text_delta, tool_use, permission_request, etc.), decoupling the UI from SDK internals.
Zod runtime validation
The SDK uses Zod schemas at runtime to validate callback return values. These schemas can be stricter than the accompanying TypeScript type declarations, causing runtime errors for shapes that pass compile-time checks.
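The options named in these definitions might be assembled roughly as below. This is a sketch under stated assumptions: `SessionOptions` is a local interface written for this example and is not the SDK's own type, `buildSessionOptions` is a hypothetical helper, and the `model` and `resume` field names are assumed. Only `settingSources` and `includePartialMessages` are named in the text above.

```typescript
type SettingSource = "user" | "project" | "local";

// Local approximation of the options discussed in this summary (assumption).
interface SessionOptions {
  model: string;
  includePartialMessages: boolean;
  settingSources: SettingSource[];
  resume?: string;
}

function buildSessionOptions(resumeSessionId?: string): SessionOptions {
  const options: SessionOptions = {
    model: "MiniMax-M3", // default model mentioned in the summary
    includePartialMessages: true, // needed for incremental text streaming
    settingSources: [], // empty: do not load global skills or MCP configuration
  };
  if (resumeSessionId !== undefined) {
    options.resume = resumeSessionId; // continue an earlier session (field name assumed)
  }
  return options;
}
```

Keeping `settingSources` empty by default means the deployment does not depend on whatever Claude Code configuration happens to exist on the host machine.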