The 170,000-Line Go CLI That Treats AI Agents as First-Class Users
What I Learned After Reading Through 170,000 Lines of Go Code in lark-cli
TL;DR: lark-cli is an open-source CLI tool by Feishu with 170,000 lines of Go, 200+ commands, covering 18 business domains. It has a special design premise: both humans and AI Agents are its users. Once you commit to this premise, every layer needs to be rethought: how commands are layered, how errors are designed, how identity is resolved, how Skill files are written, and how security is safeguarded. This article dissects its complete architecture: from the command system, execution pipeline, and factory pattern, to the error system, identity system, Skill files, security guardrails, and output system. My biggest takeaway after reading through it all: "Agent-Native" is not about adding a --json parameter; it's about treating the Agent as a first-class user, designing for it from the very first line of code, at every layer.
Recently, I read through the source code of Feishu's open-source lark-cli. The reason was simple—the first paragraph of its AGENTS.md states:
This CLI's primary consumers include AI agents (Claude Code, Cursor, Gemini CLI). Your code is read by machines — error messages, output format, and flag design all directly affect agent success rates.
Honestly, my expectations were low before reading it. Most tools claiming to be "AI-Native" just add a --json parameter and call it a day. After reading through it, this article ended up nearly 5,000 words—not because I wanted to write a long piece, but because this project truly deserves this level of detailed breakdown.
1. Command System: Three Layers, Three Trust Boundaries
lark-cli's commands are divided into three layers. The layers are not ranked by skill level ("advanced/intermediate/low-level"); each one marks a different trust boundary.
| Layer | Prefix | For Whom | What the Framework Does for You |
|---|---|---|---|
| Shortcuts | Starts with + | Humans + Agents (stable interface) | Identity resolution, Scope validation, parameter validation, error classification, pagination merging, content safety scanning |
| API Commands | No prefix | Developers familiar with the API | Auto-generated from OAPI metadata, 1:1 mapping |
| Raw API | api subcommand | Escape hatch | Does nothing; you are responsible for everything |
The + prefix is a deliberate design choice. It visually separates Shortcuts from ordinary subcommands—an Agent can distinguish at a glance between "this is a stable command encapsulated by the framework" and "this is a command that directly maps to the API."
# Shortcut: The framework handles identity, Scope, error classification, and pagination for you
lark-cli calendar +agenda
# API Command: You handle parameters and responses yourself
lark-cli calendar events instance_view --params '{"calendar_id":"primary",...}'
# Raw API: You handle everything yourself
lark-cli api GET /open-apis/calendar/v4/calendars
Agents should use Shortcuts as much as possible and only fall back to Raw API when Shortcuts don't cover the needed functionality. This is not a "capability progression" but a trust progression—the lower you go, the less the framework does for you, and the more responsibility you bear.
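Because the layer is encoded in the command line itself, an Agent (or a wrapper script) can tell the layers apart without any extra metadata. A minimal sketch of that idea follows; the `Layer` type and `classify` helper are hypothetical names of mine, not part of lark-cli, and the sketch assumes the Shortcut name is always the second token after the binary name.

```go
package main

import (
	"fmt"
	"strings"
)

// Layer names mirror the three-layer table; the type itself is hypothetical.
type Layer string

const (
	LayerShortcut Layer = "shortcut"
	LayerAPI      Layer = "api-command"
	LayerRaw      Layer = "raw-api"
)

// classify inspects the arguments that follow the binary name.
// Assumption: a Shortcut is always the second token and starts with "+".
func classify(args []string) Layer {
	if len(args) > 0 && args[0] == "api" {
		return LayerRaw
	}
	if len(args) > 1 && strings.HasPrefix(args[1], "+") {
		return LayerShortcut
	}
	return LayerAPI
}

func main() {
	fmt.Println(classify([]string{"calendar", "+agenda"}))
	fmt.Println(classify([]string{"calendar", "events", "instance_view"}))
	fmt.Println(classify([]string{"api", "GET", "/open-apis/calendar/v4/calendars"}))
}
```

The point of the sketch is only that the trust boundary is visible syntactically; nothing has to be looked up in documentation to know how much the framework will do.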
2. Execution Pipeline: What a Shortcut Goes Through from Call to Return
A Shortcut is not as simple as "parse parameters → call API → return result." Its execution pipeline has 6 stages:
lark-cli calendar +agenda --start 2025-03-21
│
▼
① Identity Resolution
├── Read --as parameter (user / bot / auto)
├── Read config file defaultAs
├── Read strict mode (force user-only or bot-only)
└── Validate: Does this Shortcut support the current identity?
│
▼
② Config Loading
├── Load config from Credential chain (multi-profile support)
└── Validate: Are app_id and secret configured?
│
▼
③ Scope Pre-check
├── Read scopes declared by the Shortcut
├── Read scopes of the current token
└── Missing scope → return typed error, telling the Agent which command to run
│
▼
④ RuntimeContext Creation
├── Inject APIClient (lazy loading, sync.OnceValues)
├── Inject Lark SDK client
├── Parse --format and --jq
└── Set bot-only flag
│
▼
⑤ Validate
├── Enum value validation (--priority can only be high/medium/low)
├── @file and stdin input parsing
├── --jq expression legality check
└── Business logic validation (e.g., "bot cannot query user calendar")
│
▼
⑥ Execute
├── --dry-run? → Print request preview, do not execute
├── --print-schema? → Print JSON Schema, do not execute
├── High-risk operation? → Check --yes
└── Call API → Classify error → Format output → Return
Each step is an independent phase with clear inputs and outputs. The design philosophy of this pipeline is: Let the framework do everything Agents are bad at, and let Execute only handle business logic.
3. Factory Pattern: How to Share Dependencies Across 200+ Commands While Keeping Them Independently Testable
lark-cli has 200+ commands, and each command needs access to configuration, HTTP client, Lark SDK, credential chain, file system, and Keychain. Using global variables would be a testing disaster.
Its solution is the Factory pattern—a single struct holding all shared dependencies, with all function fields being lazily loaded:
type Factory struct {
    Config     func() (*core.CliConfig, error) // Lazy-loaded config
    HttpClient func() (*http.Client, error)    // Lazy-loaded HTTP client
    LarkClient func() (*lark.Client, error)    // Lazy-loaded SDK client
    IOStreams  *IOStreams                      // stdin/stdout/stderr
    Keychain   keychain.KeychainAccess         // System keychain
    Credential *credential.CredentialProvider  // Credential chain
    // ...
}
Production uses NewDefault() to create it, and tests directly replace fields:
// Mock all external dependencies in tests
f := &cmdutil.Factory{
    Config: func() (*core.CliConfig, error) {
        return &core.CliConfig{AppID: "test"}, nil
    },
    IOStreams: cmdutil.NewTestIO(),
    // ...
}
No DI framework needed, no wire or dig. Go's struct + function field is sufficient. The simplest solution is often the most testable solution.
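To make the "lazily loaded function field" idea concrete, here is a minimal sketch using `sync.OnceValues` from the standard library (Go 1.21+). The trimmed `Factory`, `Config`, `loadConfig`, and `appID` below are hypothetical stand-ins of mine, not lark-cli's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

type Config struct{ AppID string }

// Factory is a trimmed-down, hypothetical version of the struct above.
type Factory struct {
	Config func() (*Config, error)
}

// loads counts how many times the expensive loader actually runs.
var loads int

func loadConfig() (*Config, error) {
	loads++
	return &Config{AppID: "cli_demo"}, nil
}

// NewDefault wraps the loader in sync.OnceValues, so the config is
// loaded on first use and the cached result is returned afterwards.
func NewDefault() *Factory {
	return &Factory{Config: sync.OnceValues(loadConfig)}
}

// appID stands in for a command that depends only on the Factory.
func appID(f *Factory) (string, error) {
	cfg, err := f.Config()
	if err != nil {
		return "", err
	}
	return cfg.AppID, nil
}

func main() {
	f := NewDefault()
	appID(f)
	id, _ := appID(f)
	fmt.Println(id, loads) // the loader ran once despite two calls

	// In a test, replace the field with a stub; no DI framework involved.
	stub := &Factory{Config: func() (*Config, error) { return &Config{AppID: "test"}, nil }}
	id, _ = appID(stub)
	fmt.Println(id)
}
```

Commands that never touch the config never pay for loading it, and a test can swap any single dependency without constructing the rest.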
4. Error System: This Is the Most Worth-Stealing Part of the Entire Project
Most CLI tools handle errors by printing a human-readable error message and exiting non-zero. The Agent gets this message and can only do string matching.
lark-cli's errors all go through a JSON envelope on stderr:
{
  "ok": false,
  "identity": "user",
  "error": {
    "type": "authorization",
    "subtype": "missing_scope",
    "code": 99991679,
    "message": "missing scope `calendar:event:create` for app cli_xxx",
    "hint": "run lark-cli auth login --scope calendar:event:create",
    "log_id": "20260520-0a1b2c3d",
    "missing_scopes": ["calendar:event:create"],
    "console_url": "https://open.feishu.cn/app/cli_xxx/auth?q=..."
  }
}
The Agent doesn't need to read message. It just reads the type and subtype fields to know what to do.
9 Categories, exhaustive and closed:
| Category | When to Use | Exit | What the Agent Should Do |
|---|---|---|---|
| validation | Parameter is wrong | 2 | Read params, fix parameters, retry |
| authentication | Not logged in / no token | 3 | Run auth login |
| authorization | Token lacks scope | 3 | Run auth login --scope |
| config | Local config is missing | 3 | Run config init |
| network | DNS / timeout / connection refused | 4 | Wait a moment and retry |
| api | Feishu API returned an error | 1 | Read code and log_id, check documentation |
| policy | Content safety interception | 6 | Read challenge_url, let user handle it |
| internal | Bug in the tool itself | 5 | Stop, do not retry, report bug |
| confirmation | High-risk operation not confirmed | 10 | Add --yes, run again |
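From the Agent's side, the table above turns into a plain switch on the `type` field. The following is a sketch of what such a router might look like; the `envelope` struct only mirrors the fields needed for routing, and the action names returned by `nextAction` are hypothetical labels I made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope mirrors only the fields an Agent needs for routing.
type envelope struct {
	OK    bool `json:"ok"`
	Error struct {
		Type    string `json:"type"`
		Subtype string `json:"subtype"`
		Hint    string `json:"hint"`
	} `json:"error"`
}

// nextAction maps an error category to a coarse recovery step.
// The categories come from the table; the action names are hypothetical.
func nextAction(stderr []byte) (string, error) {
	var env envelope
	if err := json.Unmarshal(stderr, &env); err != nil {
		return "", fmt.Errorf("stderr is not a JSON envelope: %w", err)
	}
	if env.OK {
		return "none", nil
	}
	switch env.Error.Type {
	case "validation":
		return "fix-params-and-retry", nil
	case "authentication", "authorization", "config":
		return "run-hint", nil
	case "network":
		return "wait-and-retry", nil
	case "api":
		return "inspect-code-and-log-id", nil
	case "policy":
		return "hand-off-to-user", nil
	case "confirmation":
		return "rerun-with-yes", nil
	default: // "internal" and anything unknown
		return "stop-and-report", nil
	}
}

func main() {
	stderr := []byte(`{"ok":false,"error":{"type":"authorization","subtype":"missing_scope","hint":"run lark-cli auth login --scope calendar:event:create"}}`)
	action, err := nextAction(stderr)
	fmt.Println(action, err)
}
```

Note that the `message` field is never read; the routing depends only on fields that are part of the contract.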
Each Category in Go is an independent struct with a builder API. Category is locked to the function name, Subtype must be a declared constant, and Message is for humans (the Agent does not depend on it):
return errs.NewPermissionError(errs.SubtypeMissingScope,
    "missing required scope(s): %s", strings.Join(missing, ", ")).
    WithMissingScopes(missing...).
    WithHint("run: lark-cli auth login --scope %s", strings.Join(missing, " "))
This contract is locked down by lint, not by documentation. The project runs two golangci-lint rules plus a custom AST check module:
| Lint Rule | What It Blocks |
|---|---|
| forbidigo | fmt.Errorf / errors.New returned at command boundaries fails the lint check immediately |
| CheckDeclaredSubtype | Subtype must be a declared constant; hand-written strings fail CI |
| CheckProblemEmbed | Every typed error struct must embed errs.Problem |
| Error Reclassification Ban | *PermissionError cannot be wrapped into *InternalError and thrown upwards |
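To give a feel for how small such a check can be, here is a sketch of an AST pass that flags `fmt.Errorf` and `errors.New` calls using only the standard `go/parser` and `go/ast` packages. It is my own simplified illustration, not lark-cli's lint module: `findBareErrors` is a hypothetical name, and the match is purely syntactic (it looks at identifier names and does not resolve imports or restrict itself to command boundaries).

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// findBareErrors returns the line numbers of fmt.Errorf and errors.New calls.
func findBareErrors(src string) ([]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "cmd.go", src, 0)
	if err != nil {
		return nil, err
	}
	var lines []int
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		pkg, ok := sel.X.(*ast.Ident)
		if !ok {
			return true
		}
		if (pkg.Name == "fmt" && sel.Sel.Name == "Errorf") ||
			(pkg.Name == "errors" && sel.Sel.Name == "New") {
			lines = append(lines, fset.Position(call.Pos()).Line)
		}
		return true
	})
	return lines, nil
}

func main() {
	src := `package cmd

import "fmt"

func run() error {
	return fmt.Errorf("permission denied")
}
`
	lines, err := findBareErrors(src)
	fmt.Println(lines, err)
}
```

Wired into CI, a check like this turns "please use typed errors" from a review comment into a failing build.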
Why is this important? Because once an Agent starts relying on type: "authorization" → re-login, you cannot accidentally change it to type: "api" one day. The Agent will walk into the wrong branch, repeatedly log in, and then report to the user that it "can't do it."
This is not a code style issue. It is an interface contract. Change the API return format for humans, and they will curse you. Change the error type for an Agent, and it will repeatedly do the wrong thing in front of the user.
5. Identity System: user, bot, auto, and strict mode
lark-cli supports three identities, each corresponding to different token types and API permissions:
| Identity | Meaning | Token Type | Use Case |
|---|---|---|---|
| user | Call as a user | User Access Token | Check own calendar, send messages |
| bot | Call as an application | Tenant Access Token | Group bots, batch operations |
| auto | Auto-detect | Automatically selected based on config and credential | Default behavior |
Identity resolution order: --as parameter > config file defaultAs > auto-detection.
There is also strict mode—an administrator can force --strict-mode bot, so all commands can only run as a bot. An Agent using --as user in strict mode will get an error directly, without silently degrading.
Each Shortcut declares its supported AuthTypes:
var CalendarAgenda = common.Shortcut{
    AuthTypes: []string{"user", "bot"}, // Supports both identities
    // ...
}
If a Shortcut declares AuthTypes: ["bot"], the framework rejects --as user calls during identity resolution (stage ① of the pipeline), before any API call is made. The Agent doesn't need trial and error; it sees the AuthTypes metadata and knows which identity to use.
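The resolution order and the two rejection cases can be sketched in a few lines. `resolveIdentity` and its parameters are hypothetical names of mine; in particular, how strict mode interacts with a missing --as flag is my assumption (the strict identity simply wins), since the source only describes the explicit-conflict case.

```go
package main

import (
	"fmt"
	"slices"
)

// resolveIdentity applies the precedence: --as flag > config defaultAs > auto-detection.
// strict is "" when strict mode is off, otherwise "user" or "bot".
// detected is what auto-detection would pick. All names are hypothetical.
func resolveIdentity(flagAs, defaultAs, detected, strict string, authTypes []string) (string, error) {
	id := flagAs
	if id == "" || id == "auto" {
		id = defaultAs
	}
	if id == "" || id == "auto" {
		id = detected
	}
	if strict != "" {
		explicit := flagAs != "" && flagAs != "auto"
		if explicit && flagAs != strict {
			// Fail loudly instead of silently degrading to the strict identity.
			return "", fmt.Errorf("strict mode allows only %q, got --as %q", strict, flagAs)
		}
		id = strict // assumption: without an explicit flag, strict mode decides
	}
	if !slices.Contains(authTypes, id) {
		return "", fmt.Errorf("shortcut does not support identity %q", id)
	}
	return id, nil
}

func main() {
	both := []string{"user", "bot"}
	fmt.Println(resolveIdentity("", "bot", "user", "", both))
	fmt.Println(resolveIdentity("user", "", "user", "bot", both))
	fmt.Println(resolveIdentity("user", "", "user", "", []string{"bot"}))
}
```

Both failure paths return before any network call, which is what lets the Agent correct itself without burning an API request.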
6. Skill Files: The Agent's Operation Manual, Embedded in the Binary
--help can only list parameters. An Agent needs to know "When the user says 'help me schedule a meeting,' which command should I call? What should I do first?"
lark-cli's approach is to write a SKILL.md for each business domain, embedded into the binary via //go:embed:
## Intent Routing
| User Intent | Route To |
|-------------|----------|
| "Help me schedule a meeting" | +create (read schedule-meeting.md first) |
| "Check today's calendar" | +agenda (Note: what the user calls "calendar" is "events") |
| "Yesterday's meeting minutes" | Not calendar, it's lark-vc |
## Prerequisites
| Scenario | Prerequisite |
|----------|--------------|
| Edit existing event | Locate event_id first (recurring events need to locate the instance) |
| Verify after delete/modify | Wait 2 seconds before querying (API eventual consistency) |
26 business domains, each with such a Skill file. The benefit of embedding in the binary is version consistency—upgrade the CLI, and the Skill content upgrades along with it, preventing the Agent from calling new commands based on an old Skill file.
Skill files are not documentation for humans; they are operation manuals for Agents. They contain intent routing tables, prerequisite checks, and terminology mapping ("what the user calls 'calendar' is 'events,' not 'calendar container'")—this is the information Agents truly need when making decisions.
7. Security Guardrails: The Agent Is an Untrusted Caller
When a human opens a terminal, you assume by default that they are allowed to operate on the current directory. When an Agent runs commands in the background, you assume the opposite.
lark-cli's security design revolves around one premise: the flag values filled by the Agent are untrusted input.
| Mechanism | What It Prevents |
|---|---|
| vfs abstraction layer | All file I/O does not go through os.Open, but through internal/vfs. Path validation rejects absolute paths, ../ traversal, symlink escapes, and control characters |
| Output scanning | output.ScanForSafety scans content before outputting to stdout. Agent A's output may enter Agent B's pipeline—prevent malicious content from passing through |
| dry-run | All Shortcuts automatically support --dry-run. When unsure, the Agent previews the request without executing it |
| OS Keychain | Tokens do not go into config files or environment variables; they go into the system keychain. The Agent cannot read them, and prompt injection cannot get them |
| High-risk confirmation | Shortcuts with risk: "high-risk-write" require --yes, otherwise return type: "confirmation" (exit 10) |
These mechanisms are not for security audits. Agents make mistakes, can be prompt-injected, and can call the same command repeatedly in a loop. Guardrails are not to prevent bad actors, but to prevent bad outcomes.
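The path-validation part is the easiest to reproduce. Below is a minimal sketch using `filepath.IsLocal` (Go 1.20+) plus a control-character scan. `validatePath` is a hypothetical helper of mine, not lark-cli's internal/vfs code, and it is a lexical check only: catching symlink escapes requires resolving the path against the real file system, which this sketch leaves out.

```go
package main

import (
	"fmt"
	"path/filepath"
	"unicode"
)

// validatePath rejects control characters, absolute paths, empty paths,
// and paths that escape the working directory through "..".
// It does not detect symlink escapes.
func validatePath(p string) error {
	for _, r := range p {
		if unicode.IsControl(r) {
			return fmt.Errorf("path contains control character %U", r)
		}
	}
	if !filepath.IsLocal(p) {
		return fmt.Errorf("path %q is absolute, empty, or escapes the working directory", p)
	}
	return nil
}

func main() {
	for _, p := range []string{"notes/today.md", "../secrets", "/etc/passwd", "a\x00b"} {
		fmt.Printf("%q -> %v\n", p, validatePath(p))
	}
}
```

Running every file-related flag value through one function like this is what makes "flag values are untrusted input" an enforced rule rather than a guideline.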
8. Output System: Five Formats, One Envelope, One Notification System
A command's output has three destinations: a human viewing it in the terminal, an Agent parsing it, and the next command in the pipeline.
lark-cli's --format supports five formats: json / pretty / table / ndjson / csv. With the --jq expression, the Agent can perform JSON filtering inside the command without needing to pipe to jq.
All output goes through the same JSON envelope:
{
  "ok": true,
  "identity": "user",
  "data": { ... },
  "meta": { "count": 42 },
  "_notice": {
    "update": {
      "current": "1.2.0",
      "latest": "1.3.0",
      "command": "lark-cli update"
    }
  }
}
The _notice field is a push system—the Agent can check it to determine if a tool upgrade is needed. If the Agent doesn't check, it still works fine—_notice does not affect the semantics of ok: true. This design turns "push notifications" into "a field in the output envelope," without interfering with the normal flow.
9. Common Pitfalls: Three Mistakes Most CLI Tools Make Regarding "Agent as a User"
After reading through lark-cli and looking back at other CLI tools, three particularly common pitfalls emerge:
| Anti-pattern | Symptom | lark-cli's Solution |
|---|---|---|
| Errors as strings | fmt.Errorf("permission denied"), Agent guesses using regex matching | Structured JSON, stable routing via type + subtype |
| Flat commands | All commands are sibling subcommands; Agent can't distinguish "stable interface" from "raw API" | Three-layer system, isolated by + prefix |
| Agent as trusted caller | No path validation, no output scanning, no dry-run | vfs + ScanForSafety + dry-run + keychain |
The first pitfall is the most common; almost all CLI tools fall into it. The second pitfall is "having functionality but no design." The third pitfall is the most insidious: it is completely invisible until something goes wrong, and once an Agent is prompt-injected, the entire system is left exposed.
10. The Design Philosophy Throughout
After reading through 170,000 lines of code, a few principles emerge that run through all modules:
1. The Agent is a first-class user, not a "simplified version of a human user." The error system needs type and subtype because Agents rely on them for routing. Skill files need intent routing tables because Agents don't know that "what the user calls 'calendar' is 'events'." Security guardrails need to be stricter because Agents can be injected.
2. Contracts are locked down by code, not by documentation. Error classification is locked down by lint, Subtype is locked down by constant declarations, and identity validation is locked down in the pipeline. Documentation becomes outdated; CI does not.
3. The framework does what Agents are bad at; Shortcuts only do business logic. Identity resolution, Scope pre-check, parameter validation, error classification, pagination merging, content safety scanning—all of these are completed in the pipeline. Execute only receives a clean RuntimeContext and directly calls the API.
4. The simplest solution is often the most testable solution. The factory pattern doesn't use a DI framework, the vfs abstraction doesn't use heavyweight mock libraries, and tests use t.Setenv and t.TempDir to isolate state. There is no over-engineering; every layer of abstraction has clear testing benefits.
11. If You Want to Build Your Own Agent CLI: 8 Things You Can Take from lark-cli
Reading through lark-cli is not just for writing a book report. If you are building a CLI tool that will be called by Agents, here are 8 design decisions you can directly take, sorted by priority.
1. Command Layering: + Prefix Isolates Stable Interfaces
Problem: The Agent cannot distinguish which command is a "framework-encapsulated stable interface" and which is a "raw command directly mapping the API."
Takeaway: Add a prefix convention to your CLI—+ or stable: both work. Prefixed commands = stable interfaces, where the framework handles validation, error classification, and pagination for the Agent. Non-prefixed commands = raw interfaces, where the Agent is responsible for everything.
lark-cli's code:
// Shortcuts use + prefix, framework automatically injects dry-run, format, jq, identity resolution
var CalendarAgenda = common.Shortcut{
    Command: "+agenda",
    Scopes:  []string{"calendar:calendar.event:read"},
    Execute: func(ctx context.Context, runtime *common.RuntimeContext) error {
        // Only write business logic, leave everything else to the framework
    },
}
2. Error System: Exhaustive Categories, Stability Locked by Lint
Problem: The Agent relies on string matching to guess error types; changing the wording breaks it.
Takeaway: Define 5-10 exhaustive error Categories, each corresponding to a stable type field. Each Category gets an exit code. Use lint to force all command boundaries to return typed errors, banning bare fmt.Errorf.
lark-cli's code:
// Don't write this
return fmt.Errorf("permission denied")

// Write this: the Agent reads type and subtype, and doesn't need to read message
return errs.NewPermissionError(errs.SubtypeMissingScope,
    "missing required scope(s): %s", strings.Join(missing, ", ")).
    WithMissingScopes(missing...).
    WithHint("run: mycli auth login --scope %s", strings.Join(missing, " "))
Minimum viable version: You don't need 9 Categories from the start. 3 can cover 80% of scenarios: validation (parameter wrong), permission (no permission), internal (tool's own bug). Add more later.
3. Execution Pipeline: Framework Does What Agents Are Bad At
Problem: Every time the Agent calls a command, it has to handle identity, Scope, parameter validation, error classification, and pagination itself—these are not business logic, but Agents often make mistakes here.
Takeaway: Design a Shortcut execution pipeline that places identity resolution, Scope pre-check, parameter validation, error classification, and pagination merging all before Execute. Execute only receives a clean RuntimeContext and directly calls the API.
Identity → Config → Scopes → RuntimeContext → Validate → Execute
lark-cli's approach: Shortcuts only declare Scopes, AuthTypes, Flags, and Execute; the rest is automatically completed by the runShortcut pipeline. You don't need to repeat identity validation and Scope checks in every Shortcut.
4. Identity System: user/bot Dual Identity + Enforcement Mode
Problem: Sometimes the Agent needs to call the API as a user, sometimes as a bot. If the Agent judges entirely on its own, it will get confused.
Takeaway: Define 2-3 identities (user, bot, auto), and each command declares which identities it supports. The framework automatically resolves and validates in the pipeline. Add a strict mode so administrators can force bot-only identity.
lark-cli's approach: --as user / --as bot parameter + config file defaultAs + AuthTypes declaration. If the Agent calls a bot-only command with --as user, the framework rejects it in the pipeline's identity stage rather than waiting for the API call to surface the error.
5. Security Guardrails: Agent Is an Untrusted Caller
Problem: Agents can be prompt-injected, can call the same command repeatedly in a loop, and can fill in malicious flag values.
Takeaway: Three things: path validation (all file I/O goes through an abstraction layer, rejecting ../ traversal), dry-run (all write operations support preview), and credentials kept out of config files and environment variables (store them in the system keychain, where the Agent cannot read them).
Minimum viable version: dry-run offers the best cost-benefit ratio. Adding a --dry-run flag costs almost nothing, but the Agent can preview when unsure.
6. Output Envelope: Unified Format + _notice Push
Problem: The Agent needs to simultaneously get data, metadata, and system notifications from the output. If these three things are scattered across stdout and stderr, the Agent struggles to piece them together.
Takeaway: All output goes through the same JSON envelope, containing four fields: ok, data, meta, _notice. _notice is used to push system notifications like "new version available" or "Skill file outdated," without affecting the semantics of ok: true.
{
  "ok": true,
  "data": { "items": [...] },
  "_notice": {
    "update": { "current": "1.2.0", "latest": "1.3.0", "command": "mycli update" }
  }
}
7. Factory Pattern: struct + function field Dependency Injection
Problem: As CLI tools accumulate more commands, each command needs access to configuration, HTTP client, and credential chain. Using global variables makes testing a disaster.
Takeaway: Use a Factory struct to hold all shared dependencies, with all function fields lazily loaded. Replace fields directly in tests; no DI framework needed.
type Factory struct {
    Config func() (*Config, error)
    Client func() (*http.Client, error)
    // ...
}

// In tests
f := &Factory{
    Config: func() (*Config, error) { return &Config{...}, nil },
}
lark-cli's approach: NewDefault() creates the production Factory, tests use cmdutil.TestFactory(t, config) to create mocks. No need for wire, dig, or any third-party DI library.
8. Skill Files: Agent Operation Manual Embedded in Binary
Problem: The Agent gets --help and only sees a parameter list; it doesn't see "which command to call when the user says X" or "what needs to be done before calling this command."
Takeaway: Write a SKILL.md for each business domain, containing intent routing tables, prerequisites, and terminology mapping. Embed it in the binary using //go:embed to ensure version consistency.
Minimum viable version: Start with one, not 26. Choose the most commonly used business domain, and clearly write three things—what the user says → which command to call, what to prepare before calling, and how to recover from failure. The Agent's call success rate will noticeably improve with this file.
Priority Suggestions
If you need to build an Agent CLI right now, follow this order:
| Priority | What to Do | Why Do It First | Effort |
|---|---|---|---|
| P0 | Error system (3 Categories + JSON envelope) | Agents rely on errors for decisions; this is the highest-frequency interaction interface | 1-2 days |
| P0 | dry-run | Lowest cost, highest benefit. Agent can preview before executing | Half a day |
| P1 | Command layering (+ prefix) | Lets Agent distinguish "stable interface" from "raw interface" | Half a day |
| P1 | Output envelope (unified JSON format) | Agent doesn't need to parse multiple output formats | 1 day |
| P2 | Execution pipeline (identity + Scope auto-validation) | Reduces Agent errors in non-business logic | 2-3 days |
| P2 | Factory pattern | Makes code testable; otherwise, you won't dare change it after the 10th command | 1 day |
| P3 | Identity system (user/bot + strict mode) | Most tools only have one identity initially | 1-2 days |
| P3 | Skill files | Write one first, validate effectiveness, then expand | Half a day |
Final Thoughts: The Three Stages of Building a CLI
| Stage | Focus | Typical Approach |
|---|---|---|
| Add a --json | Make output parseable | Add a format parameter, JSON output |
| Structured errors | Let Agent make decisions from errors | Error classification, type/subtype, executable hints |
| Agent is a first-class user | Design the CLI's interface contract from the ground up | Command layering, identity system, Skill files, lint-locked contracts, security guardrails |
Most tools stop at the first stage. lark-cli is at the third stage.
Treating the Agent as a first-class user: the gap between this design principle and the code is not technical ability, but the willingness to admit that your CLI is not just for humans.
After reading through 170,000 lines of code, the above is what I think is most worth taking away. There are certainly many details I haven't seen. If you are also building an Agent CLI, or have discovered interesting CLI designs, feel free to let me know.
Project repository: https://github.com/larksuite/cli