Claude · AI Programming · Agent

Inside Claude Code: How CLAUDE.md, Hooks, Skills, and Subagents Actually Work

By 沉默王二 · Jun 26, 2026

Read original on juejin.cn ↗ Google Translate ↗ Alt translation

For developers relying on Claude Code or Codex for daily coding, understanding these five mechanisms is the difference between a tool that works reliably and one that seems to "forget" instructions or behave unpredictably. The same architectural patterns — context budgeting, layered rules, mandatory hooks, on-demand skills, and isolated subagents — are becoming standard across AI coding agents, making this knowledge transferable to any agent harness.

Summary

Claude Code's context window holds seven types of content, not just conversation history — file contents, command outputs, CLAUDE.md, auto-memory, loaded Skills, and system instructions all compete for space. When the window fills, Claude Code first clears older tool outputs, then compresses the conversation history into a summary. This compression is the real reason Claude "forgets" instructions in long sessions: rules mentioned in round three become a vague sentence after compression.

The solution is to put persistent rules into CLAUDE.md, which gets re-injected after every compression. CLAUDE.md loads from four concatenated levels — managed policy, user, project, and local — and all levels merge rather than override, which can cause conflicting rules. For rules that must be enforced, Hooks provide mandatory interception at over 20 lifecycle events, from session start to individual tool calls. Hooks can block, allow, ask, or even modify tool inputs before Claude executes them.

Skills load on demand from a 1% context budget index, so a poorly written description means Claude never discovers the Skill exists. Subagents run in isolated context windows, with three built-in types — Explore (fast, read-only Haiku), Plan (read-only, inherits main model), and General-purpose (full tools, loads CLAUDE.md). Custom Subagents can be defined in Markdown files with their own tools, models, and even dedicated MCP servers.

Takeaways

— Claude Code's context window contains seven content types: conversation history, file contents, command outputs, CLAUDE.md, auto-memory, loaded Skills, and system instructions.

— When context fills, Claude Code first clears old tool outputs, then compresses conversation history into a summary — this compression causes instruction forgetting in long sessions.

— Rules written in CLAUDE.md survive compression because they are re-injected after every compression cycle.

— CLAUDE.md loads from four concatenated levels (managed policy, user, project, local) — all levels merge rather than override, which can create conflicting rules.

— Hooks provide mandatory enforcement at over 20 lifecycle events, using exit codes to block (code 2), allow (code 0), or log (other codes) operations.

— PreToolUse Hooks can modify tool input parameters before Claude executes them, enabling automatic command rewriting.

— Skills are loaded on demand from a 1% context budget index — a poorly written description means Claude never discovers the Skill.

— Subagents run in isolated context windows: Explore (Haiku, read-only), Plan (read-only, inherits main model), and General-purpose (full tools, loads CLAUDE.md).

— Explore and Plan Subagents skip CLAUDE.md and Git status loading for faster, cleaner context.

— Custom Subagents can be defined in `.claude/agents/` with their own tools, models, and dedicated MCP servers.

— Fork Subagents inherit the main Agent's full conversation history and share prompt cache, ideal for branching explorations.

— Auto-memory stores Claude's learned patterns across sessions but loads only 200 lines or 25KB — far less than CLAUDE.md.

Conclusions

The most common reason Claude Code "forgets" instructions is not a model limitation but a predictable consequence of context compression — the real fix is structural, not prompt-based.

CLAUDE.md's concatenation design is a double-edged sword: it enables layered rules but creates silent conflicts when different levels specify contradictory instructions.

Hooks are the most underutilized feature — the ability to modify tool inputs before execution opens up security and compliance use cases far beyond simple validation.

The 1% context budget for Skill index means developers must treat Skill descriptions as critical UX copy, not afterthoughts.

Explore and Plan Subagents skipping CLAUDE.md is a deliberate trade-off that prioritizes speed over rule adherence — a design lesson for any multi-agent system.

Custom Subagents with dedicated MCP servers represent a new architectural pattern: disposable, task-specific micro-environments that don't pollute the main context.

The five-mechanism architecture (CLAUDE.md, Hooks, Skills, Subagents, context management) is becoming a de facto standard for AI coding agents — learning it on Claude Code transfers directly to Codex and future tools.

Concepts & terms

Context Compression

When Claude Code's context window fills, it compresses conversation history into a summary, discarding specific wording and details. This is the primary cause of instruction forgetting in long sessions.

CLAUDE.md

A project-level Markdown file containing persistent rules that survive context compression. Loaded from four concatenated levels (managed, user, project, local) and re-injected after every compression cycle.

Hooks

User-defined programs that bind to Claude Code lifecycle events and enforce mandatory rules. Unlike CLAUDE.md suggestions, Hooks can block operations by returning specific exit codes.

Skills

On-demand instruction sets loaded only when triggered. A 1% context budget index lists all Skills by description; the full SKILL.md is injected only after Claude matches the description to a task.

Subagent

An assistant with its own independent context window. The main Agent delegates tasks, and the Subagent returns only the conclusion, keeping all intermediate work isolated from the main context.

Fork Subagent

A special Subagent that inherits the main Agent's full conversation history and shares prompt cache, enabling deep branching explorations without re-explaining background context.

Auto-memory

Cross-session learning records written by Claude itself, stored in `~/.claude/projects/`. Each session loads only the first 200 lines or 25KB, making it suitable for granular patterns but not core rules.

Source: juejin.cn ↗ Google Translate ↗ Backup ↗