AI Coding · Frontend · Agent

CLI vs. MCP: Execution vs. the Capability Manual

By JacksonChen
Read original on juejin.cn

Choosing between exposing a CLI or an MCP server to coding agents is a direct trade-off between context-window bloat and planning flexibility. Getting it wrong means either burning tokens on unused tool descriptions or leaving an agent blind to capabilities it needs for complex workflows.

Summary

AI coding agents spend most of their time running shell commands, making a platform's CLI a natural, low-friction execution interface. An agent doesn't need to know API schemas or auth flows; it just runs `platform deploy` and reads the output, exactly like a developer would. This works well when the task is already clear.
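The execution loop described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular agent is implemented: `run_cli` and `CommandResult` are hypothetical names, and because `platform deploy` is only an example command from the text, a small Python one-liner stands in for it so the sketch runs anywhere.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CommandResult:
    """What a developer would see after running a command in a terminal."""
    exit_code: int
    stdout: str
    stderr: str

    @property
    def ok(self) -> bool:
        return self.exit_code == 0


def run_cli(argv: list[str], timeout_s: float = 60.0) -> CommandResult:
    """Run one command and capture its exit code, stdout, and stderr."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    return CommandResult(proc.returncode, proc.stdout, proc.stderr)


# Stand-in for the hypothetical `platform deploy` (assumption: not a real CLI here).
result = run_cli([sys.executable, "-c", "print('deployed: build 42')"])

# Stand-in for a failing command: writes a message to stderr and exits non-zero.
failed = run_cli([sys.executable, "-c", "import sys; sys.exit('auth failed')"])

# The agent decides its next step from the same signals a developer reads.
for r in (result, failed):
    if r.ok:
        print("success:", r.stdout.strip())
    else:
        print("failure:", r.exit_code, r.stderr.strip())
```

The agent never sees an API schema in this loop; the exit code and the captured text are its only inputs for choosing the next step.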

MCP, by contrast, acts as a capability manual pushed into the model's context upfront. It declares every available tool, its parameters, and its purpose, which helps an agent plan complex, multi-step workflows across hundreds of interfaces. The trade-off is token cost: a large MCP server description inflates every request's context window.
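To make the token cost concrete, here is a rough back-of-the-envelope sketch. The tool entries are invented for illustration (shaped as a name, a description, and a JSON Schema for inputs), and the four-characters-per-token ratio is a crude assumption rather than a real tokenizer, so treat the numbers as orders of magnitude only.

```python
import json

# Hypothetical tool catalog; names and schemas are illustrative assumptions.
TOOLS = [
    {
        "name": "deploy",
        "description": "Deploy the current project to an environment.",
        "inputSchema": {
            "type": "object",
            "properties": {"environment": {"type": "string"}},
            "required": ["environment"],
        },
    },
    {
        "name": "get_logs",
        "description": "Fetch recent log lines for a deployment.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "deployment_id": {"type": "string"},
                "limit": {"type": "integer"},
            },
            "required": ["deployment_id"],
        },
    },
]

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Ceiling division of character count by the assumed ratio."""
    return -(-len(text) // CHARS_PER_TOKEN)


def catalog_tokens(tools: list[dict]) -> int:
    """Estimated tokens needed to describe every tool once."""
    return sum(estimate_tokens(json.dumps(t, separators=(",", ":"))) for t in tools)


def conversation_overhead(tools: list[dict], turns: int) -> int:
    """The catalog rides along with every request, so cost scales with turns."""
    return catalog_tokens(tools) * turns


per_request = catalog_tokens(TOOLS)
print("two tools, one request:", per_request)
print("two tools, 20 turns:", conversation_overhead(TOOLS, 20))
print("200 similar tools, 20 turns:", conversation_overhead(TOOLS * 100, 20))
```

The point of the arithmetic is the multiplication: the per-tool cost is small, but it is paid for every tool on every turn whether or not the tool is used.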

The two approaches are complementary layers, not competitors. CLI handles the "how to execute" for known tasks with minimal context overhead, while MCP handles the "what can be executed" for open-ended planning. Platforms are increasingly building both, letting agents reach for a CLI command when the goal is specific and lean on MCP descriptions when they need to discover and compose capabilities.

Takeaways
Agents interact with terminals far more than with APIs; CLI is their most natural execution interface.
CLI requires no upfront knowledge of API structure, auth, or parameters; the agent just runs a command.
MCP pre-declares every tool, parameter, and description, functioning as a capability catalog for the agent.
Every MCP tool description consumes context tokens, so large tool sets inflate cost across every turn of a conversation.
CLI avoids context bloat by fetching help text on demand rather than loading all capabilities upfront.
Simple, goal-clear tasks like deploying or checking logs are well-served by CLI alone.
Complex platforms with hundreds of interfaces and multi-step workflows benefit from MCP's planning support.
Future platforms will likely offer both: CLI for execution and MCP for capability discovery.

Conclusions

Framing CLI and MCP as competitors misses that they operate at different phases of an agent's loop: doing versus deciding what can be done.

The token-cost argument flips a common assumption: a "richer" MCP interface can actually hurt agent performance by eating context that could hold conversation history or code.

CLI's simplicity is also a constraint; an agent exploring an unfamiliar CLI via `--help` is essentially doing trial-and-error that a well-structured MCP server could short-circuit.
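The on-demand discovery pattern can be sketched as follows. Since the platform CLI in this article is hypothetical, the example uses Python's built-in `json.tool` module as a stand-in command that is known to support `--help`; `fetch_help` is an illustrative name, and the cache is one possible design choice rather than something the source prescribes.

```python
import subprocess
import sys
from functools import lru_cache


@lru_cache(maxsize=None)
def fetch_help(command: tuple[str, ...]) -> str:
    """Ask a command for its own help text, only when the agent needs it.

    The result is cached, so repeated lookups for the same command
    do not spawn another process.
    """
    proc = subprocess.run(
        [*command, "--help"], capture_output=True, text=True, timeout=30
    )
    # Most CLIs print help to stdout; fall back to stderr if stdout is empty.
    return proc.stdout or proc.stderr


# Stand-in for `platform --help` (assumption: the real CLI is not available here).
help_text = fetch_help((sys.executable, "-m", "json.tool"))
print(help_text.splitlines()[0])
```

Only the help text for the command actually being explored enters the context, which is the saving. The cost is the extra round trip, and possibly several of them when the agent has to walk through subcommands one `--help` at a time.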

Platforms that ship a CLI first are betting that most agent tasks are well-defined, while MCP-first platforms are betting on open-ended composition.

The observation that agents behave like developers in a terminal suggests that tool design should optimize for the command-line interaction pattern, not just API completeness.

Concepts & terms
MCP (Model Context Protocol)
A protocol that lets AI agents discover a server's available tools, their parameters, and descriptions. It acts as a pre-loaded capability manual pushed into the model's context window.
CLI as an agent interface
Using a command-line tool not just for humans but as the primary way an AI agent executes tasks, reading stdout/stderr to decide next steps without needing to know underlying API details.
Context window / token cost
The finite amount of text an LLM can process in one request. Every tool description, parameter schema, and help text consumes tokens, directly increasing latency and cost per interaction.