Agent · AI Programming · MCP

From Chat to Execution: Why Enterprise Agents Need a CLI, Not Just Another Chat Box

By 花椒技术 (Huajiao Technology)
Original article on juejin.cn

As enterprises push Agents beyond chat into real business workflows, the biggest failure point isn't model capability — it's the lack of a disciplined, scalable integration layer. Huajiao's architecture offers a concrete, battle-tested pattern that any team building production Agents can adopt to avoid the fragmentation and maintenance nightmare of ad-hoc tool connections.

Summary

When enterprise Agents move beyond simple Q&A to executing business operations, the critical bottleneck shifts from model intelligence to stable, safe capability invocation. Huajiao Technology's engineering team found that relying on MCP or ad-hoc tool wrappers leads to scattered capability descriptions, inconsistent execution boundaries, and fragile invocation chains that break as the number of capabilities grows.

Their solution is a three-layer architecture: a Skill layer that acts as a capability protocol for the Agent (describing when and how to call commands), a CLI layer that provides a stable, contract-driven execution surface with clear parameters and structured output, and a Gateway layer that handles cross-cutting concerns like authentication, rate limiting, routing, and audit logging. This design decouples the Agent from backend complexity and makes capability integration sustainable at scale.
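
The division of labor among the three layers can be sketched in Python. This is a minimal illustration under stated assumptions, not Huajiao's implementation: the `live search` command, the `live.search` route key, the token check, the in-memory rate limiter, and the stand-in `search_service` are all hypothetical names invented for the example.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Callable

# Skill layer (hypothetical entry): tells the Agent when and how to call a command.
SKILL = {
    "command": "live search <keyword>",
    "when_to_use": "The user asks to find live streams by keyword.",
    "params": {"keyword": "search term, required"},
}


@dataclass
class Gateway:
    """Gateway layer: authentication, rate limiting, routing, audit logging."""
    routes: dict[str, Callable[[dict], dict]]
    allowed_tokens: set[str]
    max_calls_per_minute: int = 5
    audit_log: list[dict] = field(default_factory=list)
    _calls: list[float] = field(default_factory=list)

    def invoke(self, token: str, action: str, params: dict) -> dict:
        now = time.monotonic()
        # Keep only the calls made within the last 60 seconds.
        self._calls = [t for t in self._calls if now - t < 60]
        if token not in self.allowed_tokens:
            result = {"ok": False, "error": "unauthorized"}
        elif len(self._calls) >= self.max_calls_per_minute:
            result = {"ok": False, "error": "rate_limited"}
        elif action not in self.routes:
            result = {"ok": False, "error": "unknown_action"}
        else:
            self._calls.append(now)
            result = {"ok": True, "data": self.routes[action](params)}
        # Every attempt is recorded, whether it succeeded or not.
        self.audit_log.append({"action": action, "ok": result["ok"]})
        return result


def search_service(params: dict) -> dict:
    """Stand-in for an unchanged downstream service."""
    return {"rooms": [f"room-for-{params['keyword']}"]}


def cli(argv: list[str], gateway: Gateway, token: str) -> str:
    """CLI layer: fixed command shape in, structured JSON out."""
    if len(argv) != 3 or argv[:2] != ["live", "search"]:
        return json.dumps({"ok": False, "error": "usage: live search <keyword>"})
    return json.dumps(gateway.invoke(token, "live.search", {"keyword": argv[2]}))
```

In this sketch the Agent only ever sees `SKILL` and the JSON returned by `cli`; identity, throttling, and logging live entirely in `Gateway`, so the downstream function needs no Agent-specific changes.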

The team emphasizes that this is not a replacement for MCP, which remains useful for rapid prototyping and exploration. Instead, the Skill + CLI + Gateway pattern is for the engineering phase, where capabilities need to be continuously integrated, uniformly constrained, and traceable. They also provide a practical checklist for deciding which capabilities are suitable for CLI-ization and how to evaluate whether an integration is production-ready.
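
One way to make the suitability check mechanical is to record each criterion as an explicit flag. The sketch below uses the five criteria listed in the takeaways (stable semantics, clear parameters, interpretable results, recoverable failures, reuse across assistants); the `Capability` class, the `missing_criteria` helper, and the two example capabilities are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Capability:
    """A candidate capability and its answers to the five suitability questions."""
    name: str
    stable_semantics: bool
    clear_parameters: bool
    interpretable_results: bool
    recoverable_failures: bool
    reusable_across_assistants: bool


def missing_criteria(cap: Capability) -> list[str]:
    """Return the criteria the capability does not yet meet (empty means suitable)."""
    return [
        f.name
        for f in fields(cap)
        if f.name != "name" and not getattr(cap, f.name)
    ]


# Hypothetical examples: one stable capability, one that is still changing.
search = Capability("live search", True, True, True, True, True)
draft = Capability("campaign setup", False, True, True, False, True)
```

Here `missing_criteria(search)` returns an empty list, while `missing_criteria(draft)` names the two unmet criteria, which gives a concrete reason to keep that capability out of the CLI until it stabilizes.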

Takeaways
Huajiao's architecture uses three layers: Skill (capability protocol for the Agent), CLI (stable command execution), and Gateway (controlled invocation with auth, rate limiting, routing, and logging).
MCP and temporary tools are fine for exploration and demos, but stable, high-frequency capabilities should be migrated to the Skill + CLI + Gateway pattern for long-term maintainability.
CLI commands must follow a strict contract: clear action boundaries, explainable parameters, stable structured output, interpretable error messages, and explicit confirmation steps where needed.
A capability is suitable for CLI-ization only if it has stable semantics, clear parameters, interpretable results, recoverable failures, and is reusable across multiple AI assistants.
The Gateway layer is essential for internal Agents because it handles identity verification, rate limiting, route mapping, audit logging, and manual confirmation nodes — concerns that should not be in the Agent or CLI.
Not every capability belongs in the CLI: frequently changing actions, those requiring heavy page context, or those needing extensive manual judgment should stay out until they stabilize.
The team provides a reusable integration checklist covering five areas: capability suitability, Skill completeness, CLI stability, Gateway coverage, and downstream service boundary preservation.
Huajiao's CLI was built iteratively: first validating the path with simple public capabilities (live stream search, chat send), then converging the command layer, and finally adding the Gateway for internal scenarios.
Agents should not directly bear business boundary judgments — that responsibility belongs in the controlled execution chain (CLI + Gateway).
The overall relationship is: Agent reads Skill → CLI executes standard commands → Gateway handles control → downstream services stay unchanged.
Huajiao Technology is the engineering team behind the Huajiao live streaming platform, and this CLI is their production solution for Agent capability integration.
The team explicitly warns against the temptation to cram all capabilities into the CLI — it's a standardized entry point, not a universal one.
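
The command contract described in the takeaways above (explainable parameters, stable structured output, interpretable errors, explicit confirmation) might look like the following sketch for a chat-send command. The flag names, the result envelope fields (`ok`, `code`, `message`, `data`), and the error codes are assumptions for illustration; the source does not specify Huajiao's actual command syntax.

```python
import argparse


class ContractParser(argparse.ArgumentParser):
    """Raise instead of exiting, so bad input becomes a structured error."""

    def error(self, message):
        raise ValueError(message)


def run(argv: list[str]) -> dict:
    """Hypothetical `chat-send` command: always returns the same envelope shape."""
    parser = ContractParser(prog="chat-send")
    parser.add_argument("--room", required=True, help="target live room ID")
    parser.add_argument("--text", required=True, help="message text to send")
    parser.add_argument(
        "--confirm",
        action="store_true",
        help="must be set explicitly, because sending is a side effect",
    )
    try:
        args = parser.parse_args(argv)
    except ValueError as exc:
        return {"ok": False, "code": "INVALID_ARGS", "message": str(exc), "data": None}

    payload = {"room": args.room, "text": args.text}
    if not args.confirm:
        # Nothing is sent; the Agent is told exactly what to do next.
        return {
            "ok": False,
            "code": "CONFIRMATION_REQUIRED",
            "message": "Re-run with --confirm after the user approves.",
            "data": payload,
        }
    # A real command would hand off to the Gateway here; this sketch only echoes.
    return {"ok": True, "code": "OK", "message": "sent", "data": payload}
```

Because every path returns the same four keys, an Agent can branch on `code` without parsing free-form text, and a missing confirmation is reported as a recoverable state rather than a crash.
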
Conclusions

The real bottleneck in enterprise Agent adoption is not model intelligence but the engineering discipline of capability integration — a problem that looks like AI but is actually systems architecture.

Huajiao's choice of CLI as the execution surface is clever because it leverages existing developer tooling (Claude Code, Cursor, Codex) that already understands command execution, reducing the integration burden on both humans and Agents.

The distinction between 'can connect' and 'can integrate long-term' is a crucial maturity model for any team building production Agents — most teams get stuck at the first stage.

The Skill layer is arguably the most underappreciated part of this architecture: it's not documentation for humans but a machine-readable protocol that encodes operational knowledge, which is a fundamentally different design constraint.

The Gateway layer reveals a key insight: Agents are good at intent and orchestration but should not be trusted with security, rate limiting, or audit decisions — those belong in a separate control plane.

The checklist approach is a sign of engineering maturity — it shows the team has internalized the patterns enough to codify them, making the architecture repeatable across different capabilities and teams.

The explicit boundary of 'not every capability belongs in CLI' is a healthy antidote to the common over-engineering trap where teams try to build a universal platform too early.

This architecture implicitly argues that the future of enterprise AI is not about smarter models but about better engineering infrastructure around those models — a view that aligns with the industry shift toward Agent frameworks and tool-use patterns.

That this comes from a live streaming company (Huajiao) rather than a pure AI startup suggests that practical Agent engineering is also happening outside AI-native companies.

The progression from MCP exploration to Skill+CLI+Gateway engineering mirrors the natural evolution of any platform: start with flexibility, then add structure as patterns emerge.

Concepts & terms
Skill (in Agent context)
A machine-readable protocol that tells an Agent what capabilities are available, when to use them, how to organize parameters, what to do when information is missing, how to interpret results, and how to handle failures. It's not a tutorial for humans but a structured capability description for AI.
Agent CLI
A command-line interface designed specifically for AI Agents to invoke business capabilities. Unlike human-oriented CLIs, Agent CLIs must have extremely stable command semantics, structured output, and interpretable error messages so the Agent can reliably parse and act on the results.
Gateway (in Agent architecture)
A control layer that sits between the CLI and downstream business services, handling cross-cutting concerns like identity verification, rate limiting, route mapping, audit logging, and manual confirmation nodes. It ensures that Agent invocations are controlled, traceable, and safe.
Action Contract
The agreement between an Agent CLI and its consumers (both human developers and AI Agents) that defines the command's name, parameter semantics, output structure, error behavior, and confirmation requirements. A stable action contract is essential for long-term Agent reliability.
Capability Entry Point
A standardized, reusable interface through which an Agent can invoke a business capability. Instead of connecting directly to backend services or using ad-hoc tools, capabilities are organized into a consistent layer (Skill + CLI + Gateway) that decouples the Agent from backend complexity.
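
To make the Skill definition above concrete, here is a minimal sketch of a machine-readable Skill entry and how an Agent runtime might use it. The field names, the `agentcli` program name, the failure codes, and the `plan_call` helper are all hypothetical; the source describes what a Skill must cover but not its file format.

```python
import shlex

# Hypothetical Skill entry covering the points in the definition:
# when to use it, how to build parameters, what to do when information
# is missing, how to read results, and how to handle failures.
SKILL = {
    "name": "live-search",
    "when_to_use": "The user wants to find live streams matching a keyword.",
    "command": ["agentcli", "live", "search"],
    "required_params": ["keyword"],
    "on_missing": "Ask the user for the missing value; do not guess.",
    "on_success": "Summarize the returned rooms for the user.",
    "on_failure": {
        "RATE_LIMITED": "Wait, then retry once.",
        "UNAUTHORIZED": "Stop and ask the user to log in.",
    },
}


def plan_call(skill: dict, provided: dict) -> dict:
    """Decide the next step: ask for missing parameters or build the command line."""
    missing = [p for p in skill["required_params"] if p not in provided]
    if missing:
        return {"action": "ask_user", "missing": missing, "hint": skill["on_missing"]}
    argv = list(skill["command"])
    for param in skill["required_params"]:
        argv += [f"--{param}", str(provided[param])]
    # shlex.join quotes arguments so the command is safe to show or log.
    return {"action": "run", "command": shlex.join(argv)}
```

With an empty `provided` dict the helper returns an `ask_user` step instead of a guessed command, which is the "what to do when information is missing" behavior encoded as data rather than left to the model.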