The Six Abstractions That Make Claude Code an Agent Runtime
Understanding these six abstractions gives developers a blueprint for building their own agentic systems. The async-generator Query Loop, speculative tool execution, and bubble-permission model for sub-agents are concrete design patterns that solve real control, latency, and safety problems any multi-agent runtime will hit.
An analysis of a leaked source map reveals Claude Code's architecture as a TypeScript monolith organized around six abstractions: the Query Loop, the Tool System, Tasks, State, Memory, and Hooks. The Query Loop, an async generator, is the heartbeat: it calls the model, streams responses, executes tools, and feeds results back into context until a stop condition is met. This generator design gives the outer UI pull-based control over the event stream, enabling clean cancellation with explicit stop reasons.
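To make the pull-based design concrete, here is a minimal sketch of a query loop written as an async generator. It is not Claude Code's actual source: the event shapes, the stop-reason names, and the `callModel` and `runTool` parameters are all assumptions chosen for illustration.

```typescript
// Hypothetical event and stop-reason types; the names are illustrative only.
type StopReason = "end_turn" | "max_turns" | "aborted";

type LoopEvent =
  | { kind: "text"; text: string }
  | { kind: "tool_result"; tool: string; output: string }
  | { kind: "stop"; reason: StopReason };

interface ModelTurn {
  text: string;
  toolCall?: { tool: string; input: string };
}

// Stand-ins for a real model client and tool executor.
type CallModel = (context: string[]) => Promise<ModelTurn>;
type RunTool = (tool: string, input: string) => Promise<string>;

async function* queryLoop(
  prompt: string,
  callModel: CallModel,
  runTool: RunTool,
  signal: { aborted: boolean }, // an AbortSignal also fits this shape
  maxTurns = 8,
): AsyncGenerator<LoopEvent, void, void> {
  const context: string[] = [prompt];
  for (let turn = 0; turn < maxTurns; turn++) {
    // Cancellation is checked each time the consumer pulls a new turn.
    if (signal.aborted) {
      yield { kind: "stop", reason: "aborted" };
      return;
    }
    const reply = await callModel(context);
    context.push(reply.text);
    yield { kind: "text", text: reply.text };

    const call = reply.toolCall;
    if (!call) {
      yield { kind: "stop", reason: "end_turn" };
      return;
    }
    // Feed the tool result back into context for the next model call.
    const output = await runTool(call.tool, call.input);
    context.push(output);
    yield { kind: "tool_result", tool: call.tool, output };
  }
  yield { kind: "stop", reason: "max_turns" };
}

// Usage: the consumer drives the loop and decides when to stop pulling.
async function queryLoopDemo(aborted = false): Promise<LoopEvent[]> {
  const fakeModel: CallModel = async (context) =>
    context.length === 1
      ? { text: "Reading the file.", toolCall: { tool: "Read", input: "README.md" } }
      : { text: "Done." };
  const fakeTool: RunTool = async (tool, input) => `${tool}(${input}) ok`;
  const events: LoopEvent[] = [];
  for await (const event of queryLoop("Summarize README.md", fakeModel, fakeTool, { aborted })) {
    events.push(event);
  }
  return events;
}
```

Because nothing runs until the consumer asks for the next event, a UI can stop the loop simply by no longer iterating, and every exit path ends in an explicit stop event.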
The Tool System interleaves execution with model streaming through a speculative scheduler that launches concurrency-safe tools such as Read before the model has finished its response, trading occasional wasted work for lower latency. Tasks fork recursive sub-agents that run their own Query Loops with isolated context, while a bubble permission mode prevents sub-agents from self-approving dangerous operations. State is split into a mutable session singleton and a reactive, Zustand-driven UI layer. Memory persists cross-session context as scannable Markdown files, and Hooks intercept 27 lifecycle events, with handlers that can be shell commands, LLM prompts, agent conversations, or webhooks.
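The speculative scheduler described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the real scheduler: the `StreamChunk` shape, the `EARLY_START_TOOLS` set, and the `ExecuteTool` signature are hypothetical, and discarding results the model later abandons is left out for brevity.

```typescript
interface ToolCall {
  id: string;
  tool: string;
  input: string;
}

type StreamChunk =
  | { kind: "text"; text: string }
  | { kind: "tool_call"; call: ToolCall };

// Assumption: only read-only tools are treated as safe to start early.
const EARLY_START_TOOLS = new Set(["Read", "Grep"]);

type ExecuteTool = (call: ToolCall) => Promise<string>;

async function runWithSpeculation(
  stream: AsyncIterable<StreamChunk>,
  executeTool: ExecuteTool,
): Promise<Map<string, string>> {
  const inFlight: { id: string; pending: Promise<string> }[] = [];
  const deferred: ToolCall[] = [];

  for await (const chunk of stream) {
    if (chunk.kind !== "tool_call") continue;
    if (EARLY_START_TOOLS.has(chunk.call.tool)) {
      // Launch immediately, while the model is still streaming. Failures are
      // captured as text so an early rejection never goes unhandled.
      const pending = executeTool(chunk.call).catch(
        (err: unknown) => `error: ${String(err)}`,
      );
      inFlight.push({ id: chunk.call.id, pending });
    } else {
      deferred.push(chunk.call);
    }
  }

  const results = new Map<string, string>();
  for (const { id, pending } of inFlight) results.set(id, await pending);
  // Side-effecting tools run one at a time, only after the stream has ended.
  for (const call of deferred) results.set(call.id, await executeTool(call));
  return results;
}

// Usage: the log records when each tool starts relative to the end of the stream.
async function speculationDemo(): Promise<{ log: string[]; results: Map<string, string> }> {
  const log: string[] = [];
  async function* fakeStream(): AsyncGenerator<StreamChunk, void, void> {
    yield { kind: "text", text: "Let me check the config." };
    yield { kind: "tool_call", call: { id: "t1", tool: "Read", input: "config.json" } };
    yield { kind: "tool_call", call: { id: "t2", tool: "Bash", input: "npm test" } };
    yield { kind: "text", text: "Then I will run the tests." };
    log.push("stream finished");
  }
  const executeTool: ExecuteTool = async (call) => {
    log.push(`start ${call.tool}`);
    return `${call.tool} done`;
  };
  const results = await runWithSpeculation(fakeStream(), executeTool);
  return { log, results };
}
```

In the demo, `Read` starts before the stream finishes while `Bash` waits until afterwards, which is the latency win the text describes. A fuller version would also need a way to drop speculative results the model ends up not using.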
Seven permission modes range from bypass to plan (read-only), with auto mode calling a lightweight LLM to judge whether an action matches user intent. The multi-provider architecture abstracts Direct API, Bedrock, and Vertex AI behind a factory-built client, and compile-time feature flags strip internal-only code—though an early source-map leak with sourcesContent exposed the full TypeScript source.
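A rough sketch of how a permission check might dispatch on mode is shown below. Only the four modes named in this summary (bypass, plan, auto, bubble) are modeled. The verdict names, the `ProposedAction` shape, the `IntentJudge` stand-in for the lightweight LLM call, and the fallback to asking the user when the judge says no are all assumptions.

```typescript
// Only the modes named in the text are modeled; the source counts seven.
type PermissionMode = "bypass" | "plan" | "auto" | "bubble";
type Verdict = "allow" | "deny" | "ask_user" | "ask_parent";

interface ProposedAction {
  tool: string;
  readOnly: boolean;
  summary: string;
}

// Stand-in for the lightweight LLM call: true means "matches the user's intent".
type IntentJudge = (userIntent: string, action: ProposedAction) => Promise<boolean>;

async function checkPermission(
  mode: PermissionMode,
  action: ProposedAction,
  userIntent: string,
  judge: IntentJudge,
): Promise<Verdict> {
  switch (mode) {
    case "bypass":
      // No checks at all.
      return "allow";
    case "plan":
      // Read-only mode: anything that mutates state is refused.
      return action.readOnly ? "allow" : "deny";
    case "auto":
      // Assumption: when the judge is not convinced, fall back to the user.
      return (await judge(userIntent, action)) ? "allow" : "ask_user";
    case "bubble":
      // A sub-agent never approves its own risky action; the parent decides.
      return action.readOnly ? "allow" : "ask_parent";
  }
}

// Usage with a toy judge that approves when the intent mentions the tool name.
const toyJudge: IntentJudge = async (intent, action) =>
  intent.toLowerCase().includes(action.tool.toLowerCase());

const editAction: ProposedAction = {
  tool: "Edit",
  readOnly: false,
  summary: "rewrite src/app.ts",
};
```

Calling `checkPermission("bubble", editAction, "edit the entry point", toyJudge)` resolves to `"ask_parent"`, which is the property that stops a forked sub-agent from approving its own write.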
An async generator is a deliberate inversion of control: callbacks push events from the inside, but a generator lets the outer loop pull, which makes cancellation and backpressure straightforward.
Speculative tool execution is a bet that the model won't contradict itself mid-stream—a latency optimization that accepts wasted compute as a cost of doing business.
Forking sub-agents through the same Query Loop rather than a separate runtime keeps the architecture simple but makes the bubble permission mode non-negotiable for safety.
Splitting state into a dumb mutable singleton and a reactive UI store avoids the overhead of making everything observable; only what the user sees needs real-time updates.
Memory as filesystem Markdown is underrated: it's version-controllable, editable by humans, and trivially scannable by an LLM at session start without a vector database.
Hooks that can block the entire Query Loop turn lifecycle events into policy enforcement points—permission checks, input validation, and context injection all run through the same mechanism.
The auto permission mode offloads intent-matching to a second LLM. That is a pragmatic middle ground between full manual approval and bypass, but it introduces a second point of failure and an extra token cost.
Feature flags that strip code at build time are effective until a source map leaks; the security boundary was the bundler, not the runtime.
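Returning to the takeaway about hooks as policy enforcement points, the sketch below shows one way a single dispatch mechanism could serve both blocking and context injection. The event names, the `HookResult` shape, and the `HookRegistry` class are hypothetical; the source states that there are 27 lifecycle events but does not list them.

```typescript
// Hypothetical event names, for illustration only.
type LifecycleEvent = "before_tool" | "after_tool" | "prompt_submitted";

interface HookPayload {
  event: LifecycleEvent;
  input: string;
}

type HookResult =
  | { action: "continue"; addContext?: string }
  | { action: "block"; reason: string };

type Hook = (payload: HookPayload) => Promise<HookResult>;

type DispatchOutcome =
  | { blocked: false; context: string[] }
  | { blocked: true; reason: string };

class HookRegistry {
  private readonly hooks = new Map<LifecycleEvent, Hook[]>();

  on(event: LifecycleEvent, hook: Hook): void {
    const list = this.hooks.get(event) ?? [];
    list.push(hook);
    this.hooks.set(event, list);
  }

  // Runs hooks in registration order; the first "block" short-circuits,
  // which is what lets a hook halt the surrounding loop.
  async dispatch(payload: HookPayload): Promise<DispatchOutcome> {
    const context: string[] = [];
    for (const hook of this.hooks.get(payload.event) ?? []) {
      const result = await hook(payload);
      if (result.action === "block") {
        return { blocked: true, reason: result.reason };
      }
      if (result.addContext !== undefined) context.push(result.addContext);
    }
    return { blocked: false, context };
  }
}

// Usage: one hook enforces a policy, another injects context.
function buildDemoRegistry(): HookRegistry {
  const registry = new HookRegistry();
  registry.on("before_tool", async ({ input }): Promise<HookResult> =>
    input.includes("rm -rf")
      ? { action: "block", reason: "destructive command" }
      : { action: "continue" });
  registry.on("before_tool", async (): Promise<HookResult> => ({
    action: "continue",
    addContext: "cwd is /repo",
  }));
  return registry;
}
```

A query loop would await `dispatch` before running each tool and stop with the returned reason when `blocked` is true. Otherwise it would append the collected context to the conversation before continuing.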