From Chat to Execution: Why Enterprise Agents Need a CLI, Not Just Another Chat Box

Agents Aren't Just for Chat: How We Used a CLI to Organize Business Capability Entry Points

This article comes from real engineering practice at Huajiao Technology. If you are interested in AI engineering, enterprise Agents, MCP, Skills, or R&D toolchains, the end of the article has an entry for the "Huajiao Technology Exchange Group." You are welcome to join the discussion.

When building enterprise Agents, many teams initially focus on model capabilities:

Can it understand user intent? Can it break down tasks? Can it organize responses like a reliable assistant?

These are all important, but once an Agent enters a real business process, the problem we run into first is a different one:

After the Agent understands, how can it stably and safely invoke business capabilities?

If it's just answering questions, a chat box is enough. But once you want the Agent to complete queries, interactions, analyses, and confirmed operations, it must know:

What capabilities can be invoked;
What parameters each capability requires;
Which actions require user confirmation before execution;
How to explain and retry after failure;
How to troubleshoot when something goes wrong in the invocation chain.

Huajiao CLI was built to address this problem.

On the surface, it's a command-line tool, but more accurately, it's a layer of business capability entry points we prepared for the Agent:

Skill is responsible for describing capabilities
CLI is responsible for stable execution of actions
Gateway is responsible for converging and controlling invocations
Downstream business services maintain their original boundaries

This article is not about "we released a CLI," but a review of a more specific engineering problem:

When an enterprise Agent moves from chat to execution, why can't it rely solely on temporary tool integration? Why does it ultimately require a capability entry point layer of Skill + CLI + Gateway?

1. The Problem Isn't "Can It Be Connected," but "Can It Be Integrated Long-Term"

In the early stages of Agent capability integration, the most direct approach was usually MCP or temporary tool encapsulation.

First, connect a query capability, then connect an execution capability, and finally let the Agent return the result to the user. This approach is very suitable for validating ideas and quickly running demos.

The initial goal of the first version of Huajiao CLI was also simple: first, prove that an Agent can invoke Huajiao's public capabilities through a stable execution entry point.

For example, finding live streams:

The user describes what they want to watch in the AI assistant, the Agent queries public live stream content via CLI, and then organizes the results into natural language feedback.

Another example, sending interactive content:

The user states the target and the text, the Agent organizes the parameters, and after the user confirms, it makes the call via the CLI.

These scenarios themselves are not complex, but they cover several key issues:

Can the Agent read the capability description;
Are the command semantics clear enough;
Are the parameters easy to organize;
Is the returned result convenient for secondary processing;
Can clear feedback be given to the user before and after execution.

The value of this step is not in the complexity of the commands, but in validating a path:

Business capabilities don't have to be handed directly to the Agent, nor does the Agent need to understand the complexity behind the business. A stable execution entry point can be placed in between.

The real problem appears in the next phase.

As capabilities increase, temporary integration leads to several typical issues:

Problem	Manifestation
Scattered capability descriptions	Each tool has its own description method, increasing Agent learning costs
Scattered execution boundaries	Which actions need confirmation and which are read-only ends up implemented inconsistently from tool to tool
Entry points hard to reuse	Different AI assistants and business scenarios may require repeated encapsulation
Fragmented invocation chain	When problems occur, it's hard to tell whether the cause is an Agent parameter, the tool execution, or a business service anomaly
Heavier subsequent integration	Each new capability requires handling descriptions, execution, feedback, and boundaries all over again

So, the problem isn't "Can it be connected."

More critically: after capabilities multiply, can they still be integrated continuously, constrained uniformly, reused through common entry points, and traced when problems occur?

2. Why Choose CLI as the Execution Entry Point

We later chose the Agent CLI form for a simple reason: a CLI is friendly to developers and to Agents alike.

Tools like Claude Code, Cursor, and Codex can naturally understand commands, execute commands, and read output. For an Agent, a stable command is a clear action.

It doesn't need to understand the complex structure behind the business service; it only needs to know:

When to call
What parameters to pass
What result to get
How to handle failure

For the engineering team, CLI also has several practical advantages.

First, easy local validation.

Developers can run commands directly in the terminal to see if parameters, results, and error messages are stable, without having to go through the complete Agent chain for debugging each time.

Second, convenient for cross-AI assistant reuse.

As long as the commands and output are stable, the same CLI can be invoked by different entry points like Claude Code, Cursor, and Codex. Different Agents don't need to encapsulate a set of business capabilities separately.
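As a rough illustration of why stable commands and output make reuse cheap, the sketch below shows how any Agent harness might call a CLI as a subprocess and read its result. It assumes the CLI prints JSON to stdout, which the article does not state. `FAKE_CLI` is a hypothetical stand-in (a Python one-liner that echoes its arguments), not the real tool, and `run_command` is an illustrative helper name.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for a real CLI: a Python one-liner that echoes its
# arguments back as JSON. The real command name and flags are not given here.
FAKE_CLI = [
    sys.executable,
    "-c",
    "import json, sys; print(json.dumps({'ok': True, 'data': {'args': sys.argv[1:]}}))",
]


def run_command(argv, timeout=10.0):
    """Run one CLI command and return (exit_code, parsed_json_or_None)."""
    proc = subprocess.run(
        FAKE_CLI + list(argv),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        # Unstructured output: the caller should treat this as a failure.
        payload = None
    return proc.returncode, payload


code, result = run_command(["live", "list", "--keyword", "singing", "--limit", "5"])
print(code, result)
```

Because the caller depends only on the exit code and the parsed output, the same wrapper could sit behind different AI assistants without a separate encapsulation per assistant.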

Third, complex capabilities can be converged into a small number of clear actions.

Agents shouldn't directly face complex business details. They are better suited to face a set of actions with clear commands and interpretable results.

A sanitized pseudo-command can illustrate this idea:

# Example is a sanitized pseudo-command, not representing actual command names
hj live list --keyword "singing" --limit 5

hj chat send \
  --room "<room_id>" \
  --text "This song sounds great"

The most critical thing here is not the command name, but the command contract:

Clear action;
Restrained parameters;
Stable output;
Explainable errors;
Confirmable necessary actions.
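One way to make such a contract concrete is to write it down as data and check arguments against it before anything executes. The sketch below is a minimal illustration under assumptions: the action names `live.list` and `chat.send`, the `Param` and `CommandContract` types, and the parameter types are all hypothetical and only mirror the sanitized pseudo-commands above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    name: str
    type: type
    required: bool = True


@dataclass(frozen=True)
class CommandContract:
    action: str                      # hypothetical action name, e.g. "live.list"
    params: tuple = ()               # restrained, explicit parameter list
    read_only: bool = True
    needs_confirmation: bool = False

    def validate(self, args: dict) -> list:
        """Return human-readable problems; an empty list means args are acceptable."""
        problems = []
        known = {p.name for p in self.params}
        for name in args:
            if name not in known:
                problems.append(f"unknown parameter: {name}")
        for p in self.params:
            if p.name not in args:
                if p.required:
                    problems.append(f"missing parameter: {p.name}")
            elif not isinstance(args[p.name], p.type):
                problems.append(f"parameter {p.name} must be {p.type.__name__}")
        return problems


LIVE_LIST = CommandContract(
    action="live.list",
    params=(Param("keyword", str), Param("limit", int, required=False)),
)
CHAT_SEND = CommandContract(
    action="chat.send",
    params=(Param("room", str), Param("text", str)),
    read_only=False,
    needs_confirmation=True,
)

print(LIVE_LIST.validate({"keyword": "singing", "limit": 5}))
print(CHAT_SEND.validate({"room": "<room_id>"}))
```

Returning explainable problems instead of raising lets the Agent turn each one into a follow-up question or a retry.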

The value of CLI is not to make users learn another command-line tool, but to give the Agent a more stable execution surface.

3. Skill Is Not a Tutorial, but a Capability Protocol for the Agent

CLI alone is not enough.

A human who encounters a new command can read the documentation, look at examples, and try it out step by step. An Agent should not be left to guess; it needs a more direct capability description:

In what scenarios should this command be used;
How should parameters be organized;
What information, if missing, requires asking the user;
Which actions require confirmation before execution;
How should the returned result be explained to the user;
How should failures be reported to the user and degraded gracefully.

This is the role of Skill.

In this design, Skill is not an ordinary usage tutorial, but a capability protocol for the Agent to read. It translates human operational experience into steps the Agent can understand.
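To show what "a capability protocol" could look like when written down, here is a minimal sketch that models a Skill as structured data. The article does not describe the real Skill format, so every field name (`when_to_use`, `required_slots`, `ask_when_missing`, and so on) and the `chat send` command string are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    when_to_use: str            # scenario in which the command applies
    command: str                # hypothetical command the Skill maps to
    required_slots: tuple       # information needed before calling
    ask_when_missing: dict      # slot -> follow-up question for the user
    confirm_before_run: bool    # whether the user must confirm first
    on_failure: str             # how to explain a failure to the user

    def missing_questions(self, slots: dict) -> list:
        """Questions the Agent should ask before it can call the command."""
        return [
            self.ask_when_missing.get(slot, f"Please provide {slot}.")
            for slot in self.required_slots
            if not slots.get(slot)
        ]


SEND_TEXT = Skill(
    name="send-interactive-content",
    when_to_use="The user wants to send text to a specific live room.",
    command="chat send",
    required_slots=("room", "text"),
    ask_when_missing={
        "room": "Which live room should this go to?",
        "text": "What text should be sent?",
    },
    confirm_before_run=True,
    on_failure="Tell the user the message was not sent and why, then offer to retry.",
)

print(SEND_TEXT.missing_questions({"room": "<room_id>"}))
```

The point of the structure is that the Agent never has to infer the missing-information or confirmation rules from free text.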

For example, for "finding live streams," humans can open a page and filter themselves, but the Agent needs to know:

User expresses viewing intent
-> Extract keywords or preferences
-> Call live stream query command
-> Read structured results
-> Organize recommendations based on user intent
-> Follow up with further questions if necessary

Similarly, for "sending interactive content," the Agent needs to know:

User expresses sending intent
-> Confirm target live room and text
-> Determine if secondary confirmation is needed
-> Call CLI to execute action
-> Explain execution result to user

From this perspective, the core of Huajiao CLI is not that CLI stands alone, but that CLI and Skill work together.

Skill is responsible for describing capabilities, and CLI is responsible for stable execution. Together, they give the Agent the opportunity to move from "being able to answer" to "being able to get things done."
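The "sending interactive content" flow described earlier in this section can be sketched as a small function in which confirmation gates execution. This is an illustration only: `send_interactive`, the `confirm` and `execute` callbacks, and the result shape with `ok` and `error` keys are hypothetical, and `fake_execute` stands in for a real CLI call.

```python
from typing import Callable, Dict, List


def send_interactive(
    room: str,
    text: str,
    confirm: Callable[[str], bool],
    execute: Callable[[List[str]], Dict],
) -> str:
    """Confirm target and text, ask for confirmation, execute, then explain."""
    if not room or not text:
        return "I need both the target live room and the text before sending."
    if not confirm(f"Send '{text}' to room {room}?"):
        return "Cancelled: nothing was sent."
    result = execute(["chat", "send", "--room", room, "--text", text])
    if result.get("ok"):
        return "Sent."
    error = result.get("error") or {}
    return f"Sending failed: {error.get('message', 'unknown error')}"


calls = []


def fake_execute(argv):
    # Stand-in for the real CLI call; records what would have been executed.
    calls.append(argv)
    return {"ok": True}


# User declines: the executor is never reached.
print(send_interactive("<room_id>", "This song sounds great", lambda s: False, fake_execute))
# User confirms: exactly one command is executed.
print(send_interactive("<room_id>", "This song sounds great", lambda s: True, fake_execute))
```

Keeping the confirmation step outside the executor makes it easy to verify that a declined action never reaches the CLI.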

4. The Command Layer Must Converge into an "Action Contract"

The focus of the second version of Huajiao CLI was to advance from "can connect" to "can use."

"Can connect" means engineers can run it.

"Can use" means external developers or AI assistants can find the entry point, complete installation, know how to invoke it, and get basic feedback when it fails.

Therefore, the command layer should be as converged as possible.

For example, publicly exposed capabilities can be organized into these types:

Capability Type	Command Layer Focus
Login and environment check	Can it confirm the current execution environment is available
Version and help info	Can the Agent determine the current CLI capability scope
Query capabilities	Is the output structured, stable, and convenient for secondary processing
Interactive capabilities	Are parameters clear, and is confirmation needed before execution
Data capabilities	Is the result scope public, and does it need further desensitization

The command layer fears two things most.

First, unstable command semantics.

If a parameter means a filter condition today and is reused with a different meaning tomorrow, an Agent cannot call the command reliably over the long term.

Second, unstable output.

Humans can read semi-structured text, but Agents need stable fields. Otherwise, subsequent explanation, summarization, and re-invocation become unreliable.
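One common way to get stable fields is a fixed result envelope that every command returns, whether it succeeds or fails. The sketch below assumes a three-key envelope (`ok`, `data`, `error`) and a `BAD_OUTPUT` error code; both are hypothetical choices for illustration, not the format Huajiao CLI actually uses.

```python
import json

ENVELOPE_KEYS = {"ok", "data", "error"}


def success(data):
    return {"ok": True, "data": data, "error": None}


def failure(code, message, retryable=False):
    return {
        "ok": False,
        "data": None,
        "error": {"code": code, "message": message, "retryable": retryable},
    }


def render(envelope) -> str:
    """Serialize with sorted keys so the output is byte-for-byte stable."""
    return json.dumps(envelope, ensure_ascii=False, sort_keys=True)


def parse(stdout: str) -> dict:
    """Parse CLI stdout; anything off-contract becomes a failure envelope."""
    try:
        envelope = json.loads(stdout)
    except json.JSONDecodeError:
        return failure("BAD_OUTPUT", "output was not valid JSON")
    if not isinstance(envelope, dict) or set(envelope) != ENVELOPE_KEYS:
        return failure("BAD_OUTPUT", "output did not match the envelope")
    return envelope


print(parse(render(success({"items": [{"room": "<room_id>", "title": "singing"}]}))))
print(parse("Found 3 rooms matching 'singing'"))
```

With this shape the Agent can always branch on `ok` first, and semi-structured text meant for humans is rejected instead of being guessed at.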

So the command layer is not an internal implementation detail, but the set of actions through which the Agent recognizes business capabilities.

Whether a command can enter the CLI depends not only on whether it can execute, but also on whether it meets several conditions:

Clear action boundaries;
Explainable input parameters;
Stable output results;
Expressible failure reasons;
A clearly described user-confirmation requirement.

This is also the biggest difference between CLI and ordinary scripts.

Scripts can serve only the person who wrote them.

Agent CLI must serve an executor that automatically organizes steps, continuously invokes capabilities, and may also misunderstand context.

5. Why Internal Capabilities Also Need a Gateway

After the public capabilities were running smoothly, we started looking at another problem: Can the same approach be used to support internal Agents?

Internal scenarios are much more complex than public capabilities.

There are more capabilities, the boundaries are finer, and the consequences of each invocation need to be more traceable. If we still rely on "wherever needed, add a tool," it will be hard to maintain later.

At this point, Skill + CLI alone is not enough.

We need a Gateway layer to converge invocations and keep them controlled.

The Gateway solves not "how to make commands execute," but "how to make commands execute in a controlled manner."

It mainly handles several types of general capabilities:

Capability	Role
Identity verification	Confirm the caller and execution environment
Rate limiting	Prevent abnormal continuous invocations from affecting services
Route mapping	CLI does not directly perceive complex backend structures
Logging	Facilitate troubleshooting between Agent, CLI, and business services
Necessary confirmation	Retain manual confirmation nodes for critical actions

A boundary to note here:

Agents are good at understanding intent and organizing steps, but they should not be the ones making business boundary judgments. Capabilities that truly need long-term stability should enter the controlled execution chain.

With the Gateway, the overall relationship becomes clearer:

Agent reads Skill, knows when to invoke capabilities
CLI executes standard commands, outputs stable results
Gateway handles identity, rate limiting, routing, logging, and confirmation
Downstream business services maintain their original boundaries

This way, the Agent faces clear commands; the CLI faces a unified entry point; business services face controlled invocations.
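The Gateway responsibilities listed above can be sketched in a few dozen lines. This is a toy, in-memory illustration under assumptions: the token table, the sliding-window rate limiter, the error codes, and the action names are all hypothetical, and a real gateway would sit behind a network boundary with proper authentication.

```python
import logging
import time
from collections import defaultdict, deque

logger = logging.getLogger("gateway")


class Gateway:
    def __init__(self, tokens, routes, confirm_required,
                 limit=3, window=60.0, clock=time.monotonic):
        self.tokens = tokens                      # token -> caller id
        self.routes = routes                      # action -> handler callable
        self.confirm_required = confirm_required  # actions needing confirmation
        self.limit = limit                        # max calls per window per caller
        self.window = window                      # window length in seconds
        self.clock = clock
        self.calls = defaultdict(deque)           # caller -> recent call times

    def handle(self, token, action, args, confirmed=False):
        # 1. Identity verification
        caller = self.tokens.get(token)
        if caller is None:
            return self._deny("anonymous", action, "UNAUTHENTICATED")
        # 2. Rate limiting (sliding window)
        now = self.clock()
        recent = self.calls[caller]
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return self._deny(caller, action, "RATE_LIMITED")
        recent.append(now)
        # 3. Route mapping: the CLI only knows the action name
        handler = self.routes.get(action)
        if handler is None:
            return self._deny(caller, action, "UNKNOWN_ACTION")
        # 4. Necessary confirmation for critical actions
        if action in self.confirm_required and not confirmed:
            return self._deny(caller, action, "CONFIRMATION_REQUIRED")
        # 5. Logging for later troubleshooting
        logger.info("caller=%s action=%s allowed", caller, action)
        return {"ok": True, "data": handler(args), "error": None}

    def _deny(self, caller, action, code):
        logger.warning("caller=%s action=%s denied=%s", caller, action, code)
        return {"ok": False, "data": None, "error": {"code": code}}


demo = Gateway(
    tokens={"t1": "agent-a"},
    routes={"live.list": lambda args: [], "chat.send": lambda args: "sent"},
    confirm_required={"chat.send"},
)
print(demo.handle("t1", "chat.send", {"room": "<room_id>", "text": "hi"}))
print(demo.handle("t1", "chat.send", {"room": "<room_id>", "text": "hi"}, confirmed=True))
```

None of the checks contain business logic, which matches the idea that the Gateway performs general control while downstream services keep their own boundaries.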

6. From MCP to Skill + CLI + Gateway: Not a Replacement Relationship

We did not pit MCP against CLI.

MCP is suitable for exploration and rapid integration. Especially in the early validation phase, it can quickly hand a tool to the Agent, allowing the team to judge whether the scenario is viable.

But for long-term maintenance, stable capabilities are better consolidated into reusable entry points.

This is also why we moved from MCP exploration to Skill + CLI + Gateway.

It can be simply understood as two phases:

Exploration phase:
Agent -> MCP / Temporary tool -> Single business capability

Engineering phase:
Agent -> Skill -> CLI -> Gateway -> Downstream business service

The exploration phase focuses on "Can this work."

The engineering phase focuses on "Can this type of capability be integrated continuously, constrained uniformly, reused through common entry points, and traced when problems occur."

Both phases are needed.

If you build a complete platform from the start, it's easy to over-engineer.

If you always stay with temporary integration, it will later become a fragmented collection of tools.

A more realistic path is:

First, use lightweight methods to validate scenarios, then consolidate stable, clear, and high-frequency capabilities into the Skill + CLI + Gateway chain.

7. Not Every Capability Is Suitable for CLI

At this point, there's another temptation: cramming all capabilities into the CLI.

We didn't do this.

CLI is suitable for capabilities with clear boundaries, clear actions, and interpretable results. It is not suitable for fully porting complex backends into the command line, nor for replacing existing business systems.

To judge whether a capability is suitable for CLI, you can first look at a few questions:

Question	Judgment
Is this action stable enough?	Frequently changing capabilities are not suitable for early consolidation
Can parameters be clearly expressed?	Capabilities that rely heavily on page context should be approached with caution
Is the result interpretable?	Output must be understandable by both Agent and user
Does it require manual confirmation?	Actions needing confirmation must be explicitly expressed in the chain
Is failure recoverable?	Failure reasons should be explainable, preferably with next steps
Is it suitable for cross-entry point reuse?	Capabilities serving only a single page don't necessarily need CLI-ization

This is also the boundary of Huajiao CLI:

It is not a universal entry point, but a standardized entry point for Agents to invoke business capabilities.

Capabilities that can be abstracted into stable actions are suitable for CLI.

Capabilities that heavily rely on manual judgment, have heavy page context, or have frequently changing rules should not be rushed in.
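The questions in this section can be turned into a simple pre-integration check. The sketch below is only an illustration of that idea: the `Capability` fields, the `cli_blockers` function, and the two example capabilities are hypothetical, and a real review would involve judgment that booleans cannot capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    name: str
    stable_action: bool               # does the action rarely change?
    params_expressible: bool          # can parameters be stated without page context?
    result_interpretable: bool        # can both Agent and user understand the output?
    confirmation_explicit: bool       # is any required confirmation expressed in the chain?
    failure_explainable: bool         # can failures be explained, ideally with next steps?
    reused_across_entry_points: bool  # is it needed beyond a single page?


CHECKS = (
    ("stable_action", "action changes too often for early consolidation"),
    ("params_expressible", "parameters depend heavily on page context"),
    ("result_interpretable", "output is not interpretable by Agent and user"),
    ("confirmation_explicit", "required confirmation is not explicit in the chain"),
    ("failure_explainable", "failure reasons cannot be explained"),
    ("reused_across_entry_points", "serves only a single page"),
)


def cli_blockers(cap: Capability) -> list:
    """Return the reasons a capability is not yet a good fit for the CLI."""
    return [reason for attr, reason in CHECKS if not getattr(cap, attr)]


live_query = Capability("live query", True, True, True, True, True, True)
page_wizard = Capability("page-bound wizard", False, False, True, True, True, False)

print(cli_blockers(live_query))
print(cli_blockers(page_wizard))
```

An empty list does not prove a capability belongs in the CLI; it only means none of the obvious warning signs are present.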

8. A Reusable Integration Checklist

If you are also working on enterprise Agents or internal business capability integration, you can use this checklist to review your approach.

8.1 Is the Capability Suitable for the Agent

Can user intent be stably identified;
Are action boundaries clear;
Can input parameters be reliably extracted from natural language;
Can the output be re-interpreted by the Agent;
Are there steps requiring manual confirmation.

8.2 Is the Skill Clearly Written

When to invoke;
How to organize parameters;
How to ask follow-up questions when information is missing;
Whether confirmation is needed before invocation;
How to explain after failure;
How to convert returned results into user-understandable language.

8.3 Is the CLI Stable Enough

Is the command name stable;
Are parameter meanings stable;
Is the output structured;
Are error codes or messages interpretable;
Can it be independently verified locally.

8.4 Does the Gateway Converge General Boundaries

Is identity verification unified;
Is rate limiting available;
Are routes decoupled from CLI;
Can logs locate problems;
Does necessary confirmation exist explicitly.

8.5 Do Downstream Businesses Maintain Boundaries

Is the Agent kept from directly facing complex business services;
Is the CLI free of excessive business decisions;
Does the Gateway only perform general control without intruding on specific business logic;
Do downstream services maintain their original stable boundaries.

The point of this checklist is not to "do everything perfectly," but to keep Agent capability integration from turning into a collection of temporary tools.

9. Summary

Looking back at Huajiao CLI, it is not a command-line tool that suddenly appeared.

It is more like an execution entry point layer in enterprise Agent engineering.

From this practice, the conclusions we have drawn are:

When an Agent moves from chat to execution, the first step is not to continue building a bigger chat box;
Business capabilities must first be organized into describable, invocable, and governable entry points;
Skill solves "how to describe capabilities";
CLI solves "how to stably execute actions";
Gateway solves "how to control invocations and trace problems";
Downstream business services maintain their original boundaries.

This approach is not only applicable to Huajiao CLI but also to many enterprise internal Agent projects.

Don't rush to connect everything to the Agent.

Huajiao Technology Exchange Group

Still working through AI engineering, AI programming, and Agent implementation on your own, with no peers to exchange ideas with or to analyze real-world practice?

Here, front-line technical practitioners gather, focusing on code review and real-world implementation of enterprise internal AI assistants.

If you want to keep up with AI cutting-edge trends, exchange engineering implementation experience, and avoid common pitfalls, feel free to join the "Huajiao Technology Exchange Group."

Exclusive group benefits: Daily curated R&D-oriented AI industry newsletters, exclusive extended materials for articles, and technical details not expanded upon in the text, all shared together.

If the group QR code expires, follow the official WeChat account Huajiao Technology, and reply with "Exchange Group" to get the latest group QR code.