Inside Claude Code: How CLAUDE.md, Hooks, Skills, and Subagents Actually Work


Hello everyone, I'm Brother Er.

To fully master Claude Code, I deliberately built a PaiCLI from scratch.

During this process, I gained a deep understanding of MCP, Skills, context management, summary compression, ReAct, Multi-Agent, Plan-and-Execute, web search, RAG, Memory, and IM terminals.

To help everyone better master Claude Code and Codex, I also built an AI Advanced Path website from scratch yesterday.

It is hosted on a subdomain, so no new site filing was needed: [AI Advanced Path Website]

I hope to make a small contribution to the development of AI in China.

Let me bask in the glory a bit 😄

Apart from Codex, Claude Code is arguably the most powerful Agent Harness available right now.

It also supports Chinese domestic models better than Codex does. So if you are using domestic models, Claude Code is the best choice.

It has its own context management strategy, Rules, Hooks, Skills, and Subagent delegation mechanism.

This article starts from the official Claude Code documentation and dissects the internal mechanisms of Claude Code's context, CLAUDE.md, Hooks, Skills, and Subagents.

https://code.claude.com/docs/zh-CN/how-claude-code-works

01. What exactly is in the context window

Some might think Claude Code's context is just "what I said, what it replied." In reality, conversation history is just one of seven types of content in the context window.

Conversation history: records of interactions between the user and Claude
File content: source files Claude has read or edited
Command output: results of terminal command executions
CLAUDE.md: project-level persistent rules
Auto-memory: learning records Claude saves across sessions
Loaded Skills: full instruction text injected after being triggered
System instructions: Claude Code framework's own operating rules and tool definitions

What happens when the context is full?

Claude Code's handling strategy is in two steps.

First, clear earlier tool call outputs. File reads and command results take up a lot of space but go stale quickly, so they are removed first.

Second, perform summary compression on the entire conversation history, condensing specific wording, values, and code snippets into a summary.

The consequence of compression is information loss.

Rules emphasized in the third round of conversation, architectural decisions discussed in the fifth round — after compression, they are likely to become a vague summary. This is the real reason Claude "forgets" instructions in long conversations.
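The two steps above can be sketched as a toy model. This is only an illustration under my own assumptions, not Claude Code's actual implementation: the `Message` class, the character-based `size` function, the `keep_recent` parameter, and the placeholder summary text are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Message:
    role: str  # "user", "assistant", or "tool"
    text: str

def size(messages):
    """Stand-in for a token count: total characters (an assumption)."""
    return sum(len(m.text) for m in messages)

def compact(messages, budget, keep_recent=2):
    """Toy two-step compaction: clear old tool output, then summarize."""
    # Step 1: clear older tool outputs, keeping only the most recent ones.
    tool_positions = [i for i, m in enumerate(messages) if m.role == "tool"]
    stale = set(tool_positions[:-keep_recent]) if keep_recent else set(tool_positions)
    trimmed = [
        Message(m.role, "[tool output cleared]") if i in stale else m
        for i, m in enumerate(messages)
    ]
    if size(trimmed) <= budget:
        return trimmed
    # Step 2: still too large, so the whole history collapses into a summary.
    # A real system would ask the model to write it; this is a placeholder.
    return [Message("assistant", f"[summary of {len(trimmed)} earlier messages]")]
```

Notice that once step 2 runs, the exact wording of every earlier message is gone, which is the information loss described above.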

Solutions include:

①️ Rules that need to persist across rounds should not be written in the conversation; write them into CLAUDE.md.

Codex uses AGENTS.md.

The content of CLAUDE.md is re-injected into the context after each compression, unaffected by summary compression. You can also add a "compression instructions" section in CLAUDE.md, telling Claude what information must be preserved during compression.

②️ Another effective method is the /compact command, which actively triggers compression and specifies the retention focus, e.g., /compact Focus on API changes.

③️ Auto-memory is also a part of cross-session persistence.

It is stored in the ~/.claude/projects/ directory. Each new session loads the first 200 lines or 25KB of MEMORY.md.
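To make the loading limit concrete, here is a minimal sketch. I am assuming that both limits apply and whichever is hit first wins, and that 25KB means 25 * 1024 bytes of UTF-8; the function name `load_memory_head` is hypothetical.

```python
def load_memory_head(text: str, max_lines: int = 200,
                     max_bytes: int = 25 * 1024) -> str:
    """Return the leading part of `text` that fits both limits."""
    kept = []
    used = 0
    for line in text.splitlines(keepends=True)[:max_lines]:
        encoded = len(line.encode("utf-8"))
        if used + encoded > max_bytes:
            break  # the byte limit was reached before the line limit
        kept.append(line)
        used += encoded
    return "".join(kept)
```

Whatever the exact rule is, the practical takeaway is the same: anything far down in MEMORY.md may never be loaded.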

Unlike CLAUDE.md, auto-memory is written by Claude itself, recording patterns and preferences it learns during work. However, the loaded amount of auto-memory is much smaller than CLAUDE.md, making it more suitable for storing granular experiential information, not for bearing core project rules.

Each new Claude Code session starts with a fresh context window; historical conversations are not automatically carried over.

There are only two channels for cross-session information transfer: CLAUDE.md and auto-memory.

Remember: Rules that need long-term effect should be written into CLAUDE.md, not in the conversation.

CLAUDE.md is the most important persistent content in the context, and its loading mechanism determines whether Claude Code ever sees your rules.

02. CLAUDE.md Loading Mechanism

When Claude Code starts, it searches for and loads CLAUDE.md from four locations in order.

Level	Location	Sharing Scope
Managed Policy	`/Library/Application Support/ClaudeCode/CLAUDE.md` (macOS path)	All users in the organization
User-level	`~/.claude/CLAUDE.md`	Common to all projects
Project-level	`./CLAUDE.md` or `./.claude/CLAUDE.md`	Shared with the team via Git
Local-level	`./CLAUDE.local.md` (gitignored)	Local machine only

There is an important design decision here: the four layers of files are concatenated, not overridden.

Content from all levels is concatenated in order from the file system root to the current working directory, all injected into the context window. CLAUDE.local.md is appended after CLAUDE.md at each level.
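The directory walk described above can be sketched roughly like this. It is a simplification under my own assumptions: it ignores the managed-policy and user-level files, and the names `collect_instruction_files` and `build_context` are hypothetical.

```python
from pathlib import Path

def collect_instruction_files(cwd: Path, root: Path) -> list[Path]:
    """List CLAUDE.md / CLAUDE.local.md files from `root` down to `cwd`."""
    cwd, root = cwd.resolve(), root.resolve()
    # Keep only directories at or below `root`, ordered outermost first.
    chain = [d for d in (cwd, *cwd.parents) if d == root or root in d.parents]
    chain.reverse()
    found = []
    for directory in chain:
        # Within one directory, the local file follows the shared one.
        for name in ("CLAUDE.md", "CLAUDE.local.md"):
            candidate = directory / name
            if candidate.is_file():
                found.append(candidate)
    return found

def build_context(files: list[Path]) -> str:
    """Concatenate every file; nothing overrides anything else."""
    return "\n\n".join(f.read_text(encoding="utf-8") for f in files)
```

Because `build_context` only joins text, two contradictory rules from different levels both end up in the result.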

What does concatenation mean?

Suppose the user-level CLAUDE.md says "use four spaces for indentation," and the project-level CLAUDE.md says "use two spaces for indentation." Both rules will exist in the context simultaneously.

Claude needs to decide which one takes precedence. If it guesses wrong, it manifests as "the rule didn't take effect." When troubleshooting why a rule isn't working, first check for conflicts between CLAUDE.md files at different levels — this is often more effective than tweaking the prompt.

The official recommendation is to keep each CLAUDE.md file under 200 lines, with instructions specific enough to be verifiable.

03. Hooks' Mandatory Interception Mechanism

CLAUDE.md is a suggestion.

Claude reads the rules and considers them when making decisions, but it doesn't guarantee compliance every time. It has its own judgment, and sometimes it thinks a certain rule is not applicable in the current scenario and skips it.

If certain rules must be enforced — for example, "must run lint before commit," "force push to main branch is prohibited," "all write operations must be approved" — Hooks are needed for mandatory interception.

Hooks are user-defined handlers bound to specific points in the Claude Code lifecycle and executed automatically.

Unlike CLAUDE.md's "please comply," Hooks' logic is "block the operation if conditions are not met." Claude has no room to skip.

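A Hook configuration is nested in three layers: the hook event, a matcher group that filters when it applies, and one or more handlers that actually run.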

There are five types of handlers, covering different validation scenarios.

command: Executes Shell commands, most commonly used. For example, automatically running formatting checks before commit, or validating file content against security standards before writing.
http: Sends a POST request to a specified URL, used to notify external systems or trigger CI pipelines.
mcp_tool: Calls a tool on a connected MCP server, bringing MCP ecosystem capabilities into the lifecycle hooks.
prompt: Sends a question to a Claude model for a yes/no judgment, used for dynamic review scenarios requiring semantic understanding.
agent: Deploys a Subagent for complex multi-step validation, currently in the experimental stage.

Among the five types, command is the most commonly used. It controls behavior through exit codes.

Returning 0 means validation passed, and the operation continues; returning 2 means blocked, the operation is prohibited, and stderr is shown to Claude as an error message; other return codes indicate non-blocking errors, logged but not preventing the operation.
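Here is a minimal sketch of a command handler that uses these exit codes to stop a force push to a protected branch. I am assuming the event arrives as JSON on stdin with the Bash command under `tool_input.command`; the `PROTECTED` list and the function names are hypothetical, and the command matching is deliberately naive.

```python
import json
import sys

PROTECTED = ("main", "master")  # hypothetical list of protected branches

def evaluate(raw_event: str) -> tuple[int, str]:
    """Map a hook event (JSON text) to an exit code and a stderr message."""
    try:
        event = json.loads(raw_event)
    except json.JSONDecodeError as exc:
        return 1, f"could not parse hook input: {exc}"  # non-blocking error
    # Assumed field names: tool_input.command holds the shell command.
    words = event.get("tool_input", {}).get("command", "").split()
    is_force_push = words[:2] == ["git", "push"] and any(
        w in ("--force", "-f") for w in words
    )
    if is_force_push and any(branch in words for branch in PROTECTED):
        return 2, "Blocked: force push to a protected branch is not allowed."
    return 0, ""

def main() -> None:
    """Entry point when this file is saved as a hook script."""
    code, message = evaluate(sys.stdin.read())
    if message:
        print(message, file=sys.stderr)  # with exit code 2, Claude sees this text
    sys.exit(code)
```

To use it as a real hook, end the file with a call to `main()`. Keeping the decision in `evaluate` makes the rule easy to test without piping anything through stdin.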

Hook events cover the complete lifecycle of Claude Code.

Session-level events include SessionStart and SessionEnd.
Per-turn dialogue events include UserPromptSubmit (when user input is submitted) and Stop (when Claude finishes replying).
Tool call events include PreToolUse (before calling a tool) and PostToolUse (after calling a tool).
Additionally, there are finer-grained events like SubagentStart, TaskCreated, FileChanged, PreCompact, totaling over 20.

The most noteworthy is the PreToolUse event. It fires before Claude calls any tool. The Hook can return a permissionDecision field to control permissions.

allow: Directly permits, no confirmation prompt.
deny: Directly rejects, the operation is blocked.
ask: Shows a confirmation prompt, letting the user decide.
defer: Defers to the default permission logic.

Beyond permission control, PreToolUse Hooks can also modify the tool's input parameters via the updatedInput field.

This means you can automatically rewrite command content before Claude executes it — for example, forcibly adding a signature parameter to all git commit commands, or automatically appending a copyright notice before file writes.
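A rough sketch of a PreToolUse handler that returns a decision and rewrites a commit command. The `permissionDecision` and `updatedInput` fields come from the text above; the surrounding `hookSpecificOutput` envelope, `hookEventName`, `permissionDecisionReason`, and the `tool_input.command` input shape are my assumptions and should be checked against the official Hooks reference.

```python
import json
import re

GIT_COMMIT = re.compile(r"^git\s+commit\b")

def decide(event: dict) -> dict:
    """Build a PreToolUse response for one tool call."""
    tool_input = event.get("tool_input", {})   # assumed field name
    command = tool_input.get("command", "")    # assumed field name
    words = command.split()
    # Default: fall back to the normal permission logic.
    output = {"hookEventName": "PreToolUse", "permissionDecision": "defer"}
    if "rm" in words and "-rf" in words:
        output["permissionDecision"] = "deny"
        output["permissionDecisionReason"] = "Recursive delete is not allowed."
    elif GIT_COMMIT.match(command) and not {"--signoff", "-s"} & set(words):
        # Rewrite the command so every commit carries a sign-off line.
        rewritten = GIT_COMMIT.sub("git commit --signoff", command, count=1)
        output["permissionDecision"] = "allow"
        output["updatedInput"] = {**tool_input, "command": rewritten}
    return {"hookSpecificOutput": output}

def respond(raw_event: str) -> str:
    """JSON text that a hook script would print to stdout."""
    return json.dumps(decide(json.loads(raw_event)))
```

The rewrite only touches commands that begin with `git commit`, so chained or unusual commands pass through unchanged and fall back to the default permission flow.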

The configuration location for Hooks is similar to CLAUDE.md's hierarchy: user-level in ~/.claude/settings.json, project-level in .claude/settings.json, and local-level in .claude/settings.local.json.
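As a sketch of what a project-level entry might look like, the snippet below builds the settings as a Python dict and prints it as JSON. The key names (`hooks`, `matcher`, `type`, `command`, `timeout`) are my recollection of the settings format, and the script path is hypothetical, so treat this as an assumption to verify rather than a reference.

```python
import json

settings = {
    "hooks": {
        # Layer 1: the lifecycle event.
        "PreToolUse": [
            {
                # Layer 2: a matcher group, here limited to Bash tool calls.
                "matcher": "Bash",
                # Layer 3: the handlers that run for matching calls.
                "hooks": [
                    {
                        "type": "command",
                        # Hypothetical script path inside the project.
                        "command": "python3 .claude/hooks/check_push.py",
                        "timeout": 30,
                    }
                ],
            }
        ]
    }
}

# Paste the printed JSON into .claude/settings.json.
print(json.dumps(settings, indent=2))
```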

The default timeout is 600 seconds (for command/http/mcp_tool types), 30 seconds for prompt type, and 60 seconds for agent type.

There is also a practical feature: asynchronous Hooks. By adding "async": true to the Hook configuration, the Hook runs in the background without blocking Claude's operation flow.

If "asyncRewake": true is also set, when a background Hook returns exit code 2, it sends a system alert to Claude, notifying it that an async validation has failed. This mechanism is suitable for time-consuming checks, such as remote code scanning services.

CLAUDE.md sets the direction; Hooks enforce the discipline.

04. Skills Trigger Mechanism

CLAUDE.md is loaded every session; Skills are loaded on demand.

The full instructions of a Skill are only injected into the context window after being triggered. Normally, only a one-line description hangs in the index list.

This is the most fundamental difference between Skills and CLAUDE.md, and it is also the reason Skills exist — to remove infrequently used but critical capabilities from the resident context, freeing up space for everyday conversations.

Skill triggering is a three-step process.

Step 1: When Claude Code starts, it collects the name and description text of all Skills, forming an index list that is injected into the context.
Step 2: In each round of conversation, Claude scans this list to determine if the current task matches any Skill.
Step 3: If a match is found, a call is initiated, and only then is the full text of SKILL.md loaded into the context.

Note the implication of Step 1.

When Claude makes the matching decision, it only sees the description in the index list, not the body of SKILL.md. If the description text is poorly written, Claude will never know the Skill exists. No matter how exquisite the body text is, it won't matter because the model, at the moment of deciding "whether to call," has no chance to see the body.

The index list has strict budget constraints.

The entire list is only allowed to occupy 1% of the context window, controlled by the skillListingBudgetFraction parameter.

The description text of a single Skill (combined description and when_to_use fields) is limited to 1536 characters, controlled by the maxSkillDescriptionChars parameter. Parts exceeding 1536 characters are truncated. Trigger conditions and usage scenarios written after the 1537th character are invisible to the model.
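The two budgets can be illustrated with a small sketch. Several things here are assumptions of mine: the budget is counted in characters rather than tokens, entries that no longer fit are simply dropped, and the function name `build_skill_index` is hypothetical.

```python
MAX_DESCRIPTION_CHARS = 1536  # per-Skill cap mentioned in the text

def build_skill_index(skills: dict[str, str], context_chars: int,
                      budget_fraction: float = 0.01) -> list[str]:
    """Return the index lines that fit the listing budget.

    `skills` maps a Skill name to its description text.
    """
    budget = int(context_chars * budget_fraction)
    lines, used = [], 0
    for name, description in skills.items():
        # Anything past the cap is cut off and never reaches the model.
        entry = f"{name}: {description[:MAX_DESCRIPTION_CHARS]}"
        if used + len(entry) > budget:
            break  # assumption: entries beyond the budget are dropped
        lines.append(entry)
        used += len(entry)
    return lines
```

The practical lesson is to put trigger conditions at the start of the description, where truncation cannot reach them.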

Once a Skill is triggered and loaded, its content remains in the current session and does not disappear when a new round of dialogue starts. However, Skills are also processed during context compression.

Claude Code's compression mechanism provides special protection for Skills. After compression, the first 5000 tokens of each Skill are re-injected. The total budget for re-injecting all Skills is 25000 tokens, sorted by most recently used. If too many Skills are triggered in a single session, the earliest used Skills will no longer be retained after compression.
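The re-injection rule can be sketched as follows. Tokens are modeled as plain list items, and I am assuming that a Skill which no longer fits the remaining budget is dropped entirely; the function name is hypothetical.

```python
PER_SKILL_TOKENS = 5000   # head of each Skill that is re-injected
TOTAL_TOKENS = 25000      # shared budget across all Skills

def reinject_skills(loaded: list[tuple[str, list[str]]]) -> dict[str, list[str]]:
    """Pick the Skill content to restore after compression.

    `loaded` is ordered from least to most recently used;
    each item is (skill_name, tokens).
    """
    kept, remaining = {}, TOTAL_TOKENS
    for name, tokens in reversed(loaded):  # most recently used first
        head = tokens[:PER_SKILL_TOKENS]
        if len(head) > remaining:
            break  # budget exhausted: older Skills are not restored
        kept[name] = head
        remaining -= len(head)
    return kept
```

With six large Skills loaded, only the five most recent survive under these numbers, which matches the behavior described above.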

A standard Skill directory looks like this.

my-skill/
  SKILL.md           # Main file, containing frontmatter and instruction body
  references/        # Reference materials, automatically imported when loaded
  scripts/           # Executable scripts, for Claude to call
  examples/          # Example outputs

There are several noteworthy fields in the frontmatter of SKILL.md.

Setting disable-model-invocation to true prevents Claude from automatically loading this Skill; it can only be triggered by the user manually entering a slash command. Suitable for heavyweight Skills used only in specific scenarios.
allowed-tools can specify a list of tools that Claude can use without permission when the Skill is active, reducing frequent permission confirmation pop-ups.
model can override the current session's model, allowing a Skill to execute with a more powerful or more economical model.
context: fork allows the Skill to run in an independent Subagent context, preventing the Skill's large output from polluting the main context.

SKILL.md also supports dynamic content injection.

Using the !`<command>` syntax, you can execute a Shell command at load time and substitute its output into the Skill body.

For example, !`date "+%Y年%m月%d日"` will automatically insert the current date each time it is loaded. This capability allows the Skill to be aware of the runtime environment state without needing manual content updates.
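To show the idea, here is a toy re-implementation of that substitution. It is not Claude Code's parser: the regular expression and the function name `render_skill_body` are my own, and it assumes a POSIX-style shell is available.

```python
import re
import subprocess

INLINE_COMMAND = re.compile(r"!`([^`]+)`")

def render_skill_body(body: str) -> str:
    """Replace each !`command` with the stdout of that command."""
    def run(match: re.Match) -> str:
        result = subprocess.run(
            match.group(1), shell=True, capture_output=True, text=True, check=True
        )
        return result.stdout.strip()
    return INLINE_COMMAND.sub(run, body)
```

Since the commands run in a shell at load time, a Skill obtained from someone else deserves a read-through before it is installed.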

Skill priority also has its nuances.

When a Skill with the same name is defined in different locations, the priority from high to low is: enterprise managed policy > personal level (~/.claude/skills/) > project level (.claude/skills/) > plugin level. A higher-priority Skill with the same name overrides lower-priority ones.
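The override rule amounts to a lookup by rank. A minimal sketch, where the level labels and example paths are hypothetical:

```python
PRIORITY = ["managed", "personal", "project", "plugin"]  # high to low

def resolve_skills(found: list[tuple[str, str, str]]) -> dict[str, str]:
    """`found` holds (level, skill_name, path); return the winning path per name."""
    winners: dict[str, tuple[int, str]] = {}
    for level, name, path in found:
        rank = PRIORITY.index(level)  # a smaller rank means higher priority
        if name not in winners or rank < winners[name][0]:
            winners[name] = (rank, path)
    return {name: path for name, (_, path) in winners.items()}
```

So a personal `deploy` Skill hides a project-level `deploy` Skill, which is worth remembering when a teammate's Skill behaves differently on your machine.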

05. Subagent Context Isolation

A Subagent is an assistant with its own independent context window.

The main Agent delegates a task to it. The Subagent completes the work within its own window — it might read dozens of files, run multiple commands, search the entire codebase — and ultimately returns only the conclusion to the main Agent.

All intermediate process data stays within the Subagent's own window; the main context doesn't consume a single extra byte.
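A toy model makes the isolation easier to picture. Everything here is hypothetical: the `Agent` class, the substring "search", and the wording of the conclusion are stand-ins for real model behavior.

```python
class Agent:
    def __init__(self, name: str):
        self.name = name
        self.context: list[str] = []  # this agent's own context window

    def delegate(self, task: str, files: dict[str, str]):
        """Run `task` in a fresh sub-context and keep only the conclusion."""
        sub = Agent(f"{self.name}/sub")
        sub.context.append(f"task: {task}")
        for path, content in files.items():
            # Intermediate reads accumulate in the Subagent's window only.
            sub.context.append(f"read {path}: {content}")
        hits = [path for path, content in files.items() if task in content]
        conclusion = f"{task!r} found in: {', '.join(hits) or 'no files'}"
        self.context.append(conclusion)  # the only thing the main agent receives
        return sub, conclusion
```

However many files the Subagent reads, the main agent's context grows by exactly one entry.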

Claude Code has three built-in Subagent types, each with a different role.

Type	Model	Available Tools	Purpose
Explore	Haiku (fast model)	Read-only, no write or edit permissions	File discovery, code search, codebase exploration
Plan	Inherits main model	Read-only, no write or edit permissions	Solution research, architecture planning
General-purpose	Inherits main model	All tools	Complex research, multi-step operations, code modification

An important design detail.

Explore and Plan Subagents skip loading CLAUDE.md and Git status when they start.

They begin work with a clean context, aiming for faster response and higher context utilization. Only the General-purpose type loads the full CLAUDE.md and Git status information.

This choice is a trade-off.

Explore and Plan mainly do information gathering and analysis; they don't need to follow the project's code conventions and commit rules. Skipping CLAUDE.md doesn't cause quality issues. General-purpose needs to modify code and execute commands, so it must understand project rules to produce qualified results.

How to choose in practice?

For exploratory work that only requires reading — searching code, finding file definitions, browsing directory structures — delegate to Explore. It uses the Haiku model, runs fast, and has low cost.

For tasks that require comprehensive codebase analysis to formulate a plan, delegate to Plan. It inherits the main model's reasoning ability but won't accidentally modify files.

For complex subtasks that actually require modifying code or running commands, delegate to General-purpose. It has full tool permissions.

The conversation records of each Subagent are stored independently in the ~/.claude/projects/{project}/{sessionId}/subagents/ directory, with a default retention of 30 days before automatic cleanup.

The way to create a custom Subagent is to create a Markdown file in the .claude/agents/ directory. The YAML header declares attributes like name, description, available tools, and the model to use. The body is the system prompt.

---
name: code-reviewer
description: Reviews code for quality and best practices
tools: Read, Glob, Grep
model: sonnet
---

You are a code reviewer...

This definition allows the main Agent to delegate code review tasks to code-reviewer. Because it only has read-only tools, it won't accidentally modify code, and pinning it to the Sonnet model can make it faster and cheaper than running the review on a larger model.

You can also preload specific Skills via the skills field and configure independent MCP servers for the Subagent via the mcpServers field. These servers connect when the Subagent starts and disconnect when it ends, not occupying the main context's tool definition space.

There is also a special Fork Subagent, enabled via the /fork command or context: fork configuration.

A Fork Subagent inherits the main Agent's complete conversation history, eliminating the need to re-explain background information. It is suitable for scenarios requiring deep branching exploration based on existing discussions.

For example, if you're in the middle of a discussion and want another model to evaluate whether the current plan is feasible, fork a Subagent. It continues from the current discussion context without needing the background described again. Fork Subagents also share the prompt cache with the main Agent, so the shared history does not have to be cached a second time.

Ending

Context is Claude Code's working memory, with limited capacity.

CLAUDE.md sets the direction, Hooks enforce discipline, Skills handle expertise, and Subagents manage division of labor.

These five mechanisms each guard their own boundaries, yet they collaborate in every round of dialogue.

Many people spend a lot of time polishing the wording of their prompts, but what determines whether Claude Code can produce consistently and stably is never how cleverly a single sentence is written, but rather the configuration quality and depth of understanding of these five mechanisms.

[Spending time understanding how the tool works yields far greater returns than spending time guessing its temperament.]

See you next time.