跪拜 Guibai
Frontend · AI Programming · AIGC

The Agent Protocol Quartet: MCP, A2A, AG-UI, and A2UI Are Not Competitors — They're Layers

By threerocks
Read original on juejin.cn

For Western developers building production AI agents, this four-protocol stack signals the end of ad-hoc, monolithic agent architectures. The industry is converging on standardized, swappable layers for tools, inter-agent communication, frontend synchronization, and UI generation. Understanding these boundaries — not memorizing every spec — is the difference between building maintainable systems and piling up technical debt.

Summary

The AI agent ecosystem is settling on a four-protocol stack whose parts separate concerns rather than compete. MCP (Model Context Protocol) standardizes how agents connect to tools and data. A2A (Agent-to-Agent) defines how agents discover and collaborate with each other. AG-UI, from CopilotKit (the company behind it, which recently raised a $27M Series A), turns an agent's internal runtime (tool calls, state changes, streaming output) into events the frontend can display and react to. A2UI, from Google, lets agents describe UI declaratively using a pre-approved component catalog, avoiding the security and maintainability risks of letting models emit raw HTML or JSX.

The key insight is that these are not alternatives. They are complementary layers in a growing engineering discipline: tool access, inter-agent workflows, real-time user interaction, and safe UI generation. For developers, the practical advice is to learn the conceptual boundaries first, then adopt only what your project stage requires. A demo needs MCP and simple event streaming. A user-facing product needs AG-UI's event model. A system that generates dynamic interfaces needs A2UI. A multi-agent platform needs A2A. The real cost isn't learning all four — it's writing glue code that mixes them up.
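
To make the AG-UI layer described above a little more concrete, here is a minimal sketch of an agent run exposed as a stream of events that a frontend folds into its own view state. Only the names ToolCallStart and ToolCallEnd come from the article; the `TextDelta` event, the field names, and the `FrontendView` class are illustrative assumptions and are not taken from the AG-UI specification.

```python
from dataclasses import dataclass, field


@dataclass
class FrontendView:
    """Hypothetical frontend state, built only from incoming events."""
    text: str = ""
    running_tools: dict = field(default_factory=dict)  # tool_call_id -> tool name
    finished_tools: list = field(default_factory=list)

    def apply(self, event: dict) -> None:
        kind = event["type"]
        if kind == "TextDelta":        # illustrative name for streamed text
            self.text += event["delta"]
        elif kind == "ToolCallStart":  # event name mentioned in the article
            self.running_tools[event["tool_call_id"]] = event["tool_name"]
        elif kind == "ToolCallEnd":    # event name mentioned in the article
            name = self.running_tools.pop(event["tool_call_id"])
            self.finished_tools.append(name)
        else:
            raise ValueError(f"unknown event type: {kind}")


def agent_run():
    """Stand-in for an agent runtime: yields events instead of one final string."""
    yield {"type": "TextDelta", "delta": "Checking the weather"}
    yield {"type": "ToolCallStart", "tool_call_id": "t1", "tool_name": "get_weather"}
    yield {"type": "ToolCallEnd", "tool_call_id": "t1"}
    yield {"type": "TextDelta", "delta": "... it is sunny."}


view = FrontendView()
for event in agent_run():
    view.apply(event)
```

The point of the sketch is the shape of the contract: the frontend never inspects the agent's internals, it only reacts to typed events, so a spinner for a running tool or a partial text bubble falls out of the event stream.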

Takeaways
Four protocols form the 2026 agent stack: MCP (tools/data), A2A (agent-to-agent), AG-UI (frontend events), and A2UI (declarative UI).
AG-UI turns agent runtime events (tool calls, state changes, streaming) into frontend-displayable events like ToolCallStart and ToolCallEnd.
A2UI uses a flat, ID-referenced component list instead of deep nesting, optimized for LLM incremental generation and safe rendering.
CopilotKit, the company behind AG-UI, raised a $27M Series A in May 2026 with public support from Google, Microsoft, AWS, and Oracle.
A2A, originally from Google, is now under the Linux Foundation and uses Agent Cards, Tasks, Messages, Parts, and Artifacts for structured collaboration.
MCP separates tools (actions with side effects), resources (read-only context), and prompts (reusable templates) — a distinction critical for security.
AG-UI supports frontend-defined tools, allowing agents to request user confirmation or UI actions that execute on the client side.
A2UI payloads can be transported via AG-UI, A2A, WebSocket, REST, or MCP — the format is independent of the transport.
The recommended adoption order is: MCP first, then AG-UI for event streaming, then A2UI for dynamic UI, and finally A2A for multi-agent systems.
Enterprises care because protocolization reduces glue code, enables debugging per step, and allows control over tool permissions and UI components.
AG-UI and A2UI are not the same: AG-UI handles the 'how' of communication and interaction; A2UI handles the 'what' of UI structure and validation.
A2UI's design acknowledges LLM limitations: it avoids letting models emit executable code and instead uses declarative data against a client-side component catalog.

Conclusions

The real anxiety around these protocols comes from a lack of a mental map, not from technical complexity. Once you see them as layers (tools, collaboration, interaction, rendering), the panic subsides.

The most valuable skill for a mid-level developer is not writing an A2A server, but knowing when to use a tool vs. a resource vs. a prompt in MCP — a judgment call that directly impacts system security and maintainability.
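
One way to picture why this judgment call matters is a small policy check keyed on the three MCP categories. The sketch below does not use any MCP SDK; `Kind`, `Capability`, `needs_approval`, and the capability names are all hypothetical, chosen only to show how the tool/resource/prompt split can drive a permission decision.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    TOOL = "tool"          # callable action, may have side effects
    RESOURCE = "resource"  # read-only context such as files or schemas
    PROMPT = "prompt"      # reusable template


@dataclass(frozen=True)
class Capability:
    name: str
    kind: Kind


def needs_approval(cap: Capability) -> bool:
    """Hypothetical policy: only side-effecting tools require explicit approval."""
    return cap.kind is Kind.TOOL


capabilities = [
    Capability("send_refund", Kind.TOOL),
    Capability("orders_schema", Kind.RESOURCE),
    Capability("summarize_ticket", Kind.PROMPT),
]

# Names of capabilities that the host should gate behind approval.
gated = [c.name for c in capabilities if needs_approval(c)]
```

If `orders_schema` were mistakenly exposed as a tool, it would be gated unnecessarily; if `send_refund` were mistakenly exposed as a resource, it would slip past the gate. That asymmetry is why the classification is a security decision and not just a naming choice.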

AG-UI's frontend-defined tools are an underappreciated feature for enterprise products. They shift trust from the agent's promise ('I'll be careful') to the interface's explicit controls ('you must confirm this action').
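
The shift of trust described above can be sketched as follows. This is not the AG-UI or CopilotKit API; the `Frontend` class, the `confirm_action` tool name, and the `ask_user` callback are assumptions made for illustration. The idea is only that the tool is registered and executed on the client, so the agent cannot proceed without the interface's answer.

```python
def make_confirm_tool(ask_user):
    """Build a client-side tool; ask_user is a hypothetical UI prompt callback."""
    def confirm_action(args: dict) -> dict:
        approved = ask_user(f"Allow: {args['description']}?")
        return {"approved": bool(approved)}
    return confirm_action


class Frontend:
    """Hypothetical client that owns and executes the tools it registers."""

    def __init__(self, ask_user):
        self.tools = {"confirm_action": make_confirm_tool(ask_user)}

    def handle_tool_request(self, name: str, args: dict) -> dict:
        # The agent may only call tools the frontend chose to expose.
        if name not in self.tools:
            return {"error": f"tool not registered: {name}"}
        return self.tools[name](args)


def agent_delete_account(frontend: Frontend) -> str:
    """Stand-in agent step: a sensitive action gated by the frontend tool."""
    result = frontend.handle_tool_request(
        "confirm_action", {"description": "delete account 42"}
    )
    return "deleted" if result.get("approved") else "cancelled"
```

In a test or a demo, `ask_user` can be a simple lambda; in a real interface it would be whatever dialog the product already uses.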

A2UI's flat component list is a pragmatic design choice that acknowledges LLMs are bad at generating correct deeply nested structures. This is a rare example of a protocol designed around model limitations rather than ideal developer ergonomics.
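
A minimal sketch of the flat, ID-referenced idea, assuming a made-up payload shape (`id`, `type`, `props`, `children`) and a made-up three-entry catalog; the real A2UI wire format is not reproduced here. The client resolves child IDs into a tree and rejects any component type that is not in its catalog.

```python
CATALOG = {"Column", "Text", "Button"}  # hypothetical pre-approved component types


def build_tree(components: list, root_id: str) -> dict:
    """Resolve a flat, ID-referenced component list into a nested tree."""
    by_id = {c["id"]: c for c in components}

    def resolve(cid: str, seen: frozenset) -> dict:
        if cid in seen:
            raise ValueError(f"cycle detected at component: {cid}")
        if cid not in by_id:
            raise ValueError(f"unknown component id: {cid}")
        node = by_id[cid]
        if node["type"] not in CATALOG:
            raise ValueError(f"component type not in catalog: {node['type']}")
        children = [resolve(ch, seen | {cid}) for ch in node.get("children", [])]
        return {"type": node["type"], "props": node.get("props", {}), "children": children}

    return resolve(root_id, frozenset())


# Each entry is independent, so a model can emit them one at a time.
payload = [
    {"id": "root", "type": "Column", "children": ["title", "ok"]},
    {"id": "title", "type": "Text", "props": {"text": "Confirm order"}},
    {"id": "ok", "type": "Button", "props": {"label": "OK"}},
]

tree = build_tree(payload, "root")
```

Because every entry is a small, self-contained record, a malformed or unapproved component fails on its own instead of corrupting a deeply nested structure, which is the property the flat design is meant to buy.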

The four-protocol stack signals that agent engineering is maturing from a 'model smarts' problem to a 'software architecture' problem. The hardest part is no longer getting the LLM to answer correctly, but making every edge of the system controllable, debuggable, and replaceable.

CopilotKit's $27M raise and the backing from major cloud providers are a strong market signal that AG-UI is becoming the de facto standard for agent-frontend communication, not just a niche library.

Many developers will be tempted to adopt all four protocols at once. The smarter, more conservative path is to adopt only what your current project stage demands, starting with MCP and simple event streaming.

Concepts & terms
MCP (Model Context Protocol)
An open protocol that standardizes how AI applications (hosts) connect to external tools and data sources (servers) through a client. It separates tools (callable actions with side effects), resources (read-only context like files or schemas), and prompts (reusable templates).
A2A (Agent-to-Agent)
A protocol, originally from Google and now under the Linux Foundation, for structured communication between remote agents. It uses an Agent Card (identity and capabilities), Task (trackable work order), Message, Part (text, file, or structured data), and Artifact (final output).
AG-UI
An open, event-based protocol for connecting an agent's backend runtime to a user frontend. It standardizes events for streaming text, tool calls (e.g., ToolCallStart, ToolCallEnd), state changes, and user interactions like confirmation or interruption. It also supports frontend-defined tools.
A2UI
A declarative protocol for agents to generate UI safely. Instead of emitting raw HTML or JSX, the agent outputs a structured description (e.g., a flat list of components with IDs) that a client-side renderer interprets against a pre-approved component catalog. Designed for incremental, streamable generation by LLMs.
Agent Card
A structured 'business card' for an agent in the A2A protocol. It describes the agent's identity, capabilities, endpoint URL, and authentication requirements, enabling other agents to discover and interact with it.
Frontend-defined tools
A feature of AG-UI where the frontend registers tools (e.g., 'show confirmation dialog', 'open map at location') that the agent can call. The execution happens on the client side, giving the application control over sensitive UI actions.
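
Returning to the A2A vocabulary defined above (Agent Card, Task, Message, Part, Artifact), the relationships between the terms can be sketched with plain data classes. The field names, the status strings, the `find_agent` helper, and the example URLs are simplified assumptions for illustration and do not mirror the A2A schema field for field.

```python
from dataclasses import dataclass, field


@dataclass
class AgentCard:
    name: str
    url: str      # endpoint where the agent can be reached
    skills: list  # capabilities the agent advertises
    auth: str     # authentication requirement, simplified to a label


@dataclass
class Part:
    kind: str     # "text", "file", or "data"
    content: object


@dataclass
class Message:
    role: str
    parts: list


@dataclass
class Artifact:
    name: str
    parts: list


@dataclass
class Task:
    id: str
    status: str = "submitted"
    history: list = field(default_factory=list)    # Messages exchanged so far
    artifacts: list = field(default_factory=list)  # final outputs


def find_agent(cards: list, skill: str):
    """Hypothetical discovery step: pick the first card advertising the skill."""
    for card in cards:
        if skill in card.skills:
            return card
    return None


cards = [
    AgentCard("translator", "https://agents.example/translator", ["translate"], "bearer"),
    AgentCard("billing", "https://agents.example/billing", ["invoice"], "bearer"),
]

card = find_agent(cards, "translate")
task = Task(id="task-1")
task.history.append(Message("user", [Part("text", "Translate 'hello' to French")]))
task.artifacts.append(Artifact("translation", [Part("text", "bonjour")]))
task.status = "completed"
```

The division of labor is the useful part: the card answers "who can do this", the task tracks "how far along is it", and the artifact holds "what came back".
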
Source: juejin.cn