跪拜 Guibai
← All articles
Agent · Claude · LLM

Run Claude Code on Ollama or DeepSeek — No Anthropic Bill Required

By 前端君 ·
Read original on juejin.cn ↗ Google Translate ↗ Alt translation

Claude Code's CLI is one of the most capable AI coding agents available, but its hard dependency on Anthropic's API makes it prohibitively expensive for many developers and teams. CCR breaks that lock-in, letting developers use the same agent interface with local models (free, private, offline) or cheaper cloud alternatives like DeepSeek — a pattern that could reshape how teams budget for AI-assisted development.

Summary

Claude Code is powerful but expensive, and it only speaks Anthropic's Messages API. Claude Code Router (CCR) solves this by running as a local proxy on port 3456 that translates Anthropic's request format into the OpenAI Chat Completions format that Ollama and DeepSeek natively support. Developers point Claude Code at CCR via the ANTHROPIC_BASE_URL environment variable, and CCR handles the protocol conversion transparently.

The setup supports multiple providers simultaneously. For local, offline-capable coding, Ollama models like llama3.1 or qwen2.5 run on the developer's machine. For heavier lifting, DeepSeek's cloud models (deepseek-chat and deepseek-reasoner) provide stronger reasoning at a fraction of Anthropic's pricing. CCR also supports advanced routing rules — sending background tasks to cheap local models and complex reasoning to cloud models — plus API key rotation and fallback chains.

CCR is a desktop application with a GUI for configuring providers, routing rules, and virtual model aliases. It auto-applies profiles to Claude Code's settings.json, and includes a Network Logs panel for debugging protocol conversion issues. The project is open-source and third-party, not affiliated with Anthropic.

Takeaways
Claude Code Router (CCR) is a local proxy that translates Anthropic's /v1/messages API into OpenAI's /v1/chat/completions format.
CCR runs as a desktop app on macOS, Windows, and Linux, exposing a gateway on localhost:3456.
Developers configure Claude Code to use CCR by setting ANTHROPIC_BASE_URL=http://127.0.0.1:3456 and ANTHROPIC_API_KEY to any non-empty string.
Ollama models (llama3.1, qwen2.5, deepseek-r1:7b) run locally and require no API key or internet.
DeepSeek models (deepseek-chat, deepseek-reasoner) are cloud-based and require an API key from platform.deepseek.com.
CCR supports multi-model routing: different tasks (background, thinking, subagent) can be sent to different models automatically.
Virtual model names let developers create aliases like 'default-coding' that map to specific provider models.
API key rotation is supported by comma-separating multiple keys in the provider configuration.
Setting CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1 reduces protocol conversion errors by stripping experimental fields.
The correct startup order is: start Ollama (if local), start CCR gateway, then launch Claude Code.
Conclusions

CCR's protocol translation approach is elegant because it requires zero changes to Claude Code itself — the agent doesn't know it's talking to a different backend.

The ability to route background tasks to cheap local models while reserving cloud models for complex reasoning is a practical cost optimization that enterprise teams will find immediately useful.

Using local models with Claude Code trades capability for privacy and cost — developers working on proprietary codebases may prefer this even when cloud models are technically superior.

The fact that a third-party tool is needed to use Claude Code with non-Anthropic models highlights a deliberate platform lock-in strategy by Anthropic, similar to how early AWS services only worked with AWS backends.

DeepSeek's models, especially deepseek-reasoner, are emerging as a credible alternative for coding tasks at a fraction of the cost — this is a signal that the AI coding market is becoming multi-provider by necessity.

CCR's desktop GUI approach is unusual for a developer tool — most similar proxies are CLI-only — which suggests the project is targeting a broader audience beyond hardcore terminal users.

Concepts & terms
Claude Code Router (CCR)
A third-party open-source desktop application that acts as a protocol translation gateway. It receives Anthropic-format API requests from Claude Code and translates them into OpenAI-format requests for models like Ollama and DeepSeek, then translates the responses back.
Protocol translation gateway
A middleware service that sits between a client and a backend, converting requests and responses between different API formats. In this case, it translates between Anthropic's Messages API and OpenAI's Chat Completions API so that tools designed for one can work with the other.
Virtual model
A user-defined alias that maps a simple name (like 'default-coding') to a specific provider and model (like 'deepseek-chat'). This simplifies configuration and allows routing rules to reference models by purpose rather than by exact model ID.
Fallback routing
A configuration where CCR automatically switches to a backup model if the primary model fails or is unavailable. For example, if DeepSeek's cloud API is down, CCR can fall back to a local Ollama model without interrupting the developer's workflow.
Source: juejin.cn ↗ Google Translate ↗ Backup ↗