跪拜 Guibai
← All articles
AIGC · AI Programming · OpenAI

Even Microsoft Can't Afford AI: The Token Trap That's Breaking Enterprise Budgets

By 牛奶 ·
Read original on juejin.cn ↗ Google Translate ↗ Alt translation

This signals a structural shift in AI economics that every Western developer and engineering leader needs to understand: the traditional software subscription model is dead for AI tools. Usage-based token pricing means that productivity gains come with directly proportional cost increases, forcing companies to choose between capability and budget. The Microsoft case shows that even the deepest pockets have limits, and the Uber case proves that AI adoption can outrun financial planning by orders of magnitude.

Summary

Microsoft is pulling the plug on internal Claude Code licenses for nearly 100,000 engineers across Windows, Microsoft 365, Teams, Outlook, and Surface, forcing a migration to its own GitHub Copilot CLI. The reason: the token-based billing model made Claude Code too expensive, even for a $3.5 trillion company that has invested billions in both OpenAI and Anthropic.

Uber's experience is even more dramatic. Its CTO revealed that 95% of engineers use AI tools monthly, 84% are in agentic coding mode, and 70% of committed code is AI-generated — but the entire 2026 AI budget of $3.4 billion was burned through in just four months. Per-engineer API costs for Claude Code range from $500 to $2,000 monthly.

The core issue is a fundamental pricing shift: software has moved from all-you-can-eat subscriptions to pay-per-token, where every line of AI-generated code consumes cash in real time. NVIDIA's VP of Applied Deep Learning confirmed that computing costs now exceed employee salary costs for his team. As models grow larger and agent chains lengthen, the paradox is clear — the more indispensable AI becomes, the more expensive it gets.

Takeaways
Microsoft is canceling internal Claude Code licenses for nearly 100,000 engineers, effective June 30, forcing migration to GitHub Copilot CLI.
Uber burned through its entire $3.4 billion 2026 AI budget in just four months, with 70% of committed code now AI-generated.
Per-engineer monthly API costs for Claude Code at Uber range from $500 to $2,000.
NVIDIA's VP of Applied Deep Learning stated that computing costs now exceed employee salary costs for his team.
GitHub switched to usage-based billing on June 1; Anthropic is moving enterprise renewals from seat-based to usage-based pricing.
Claude Code's daily cost has doubled from $6 to $13, and US AI software prices have risen 20-37% in the past year.
Microsoft's internal Claude Code usage meant paying Anthropic per token, effectively funding a direct competitor while using Azure infrastructure.
AI tool pricing is shifting from subscription plans to pay-as-you-go, reversing the traditional software cost curve.
Conclusions

The traditional software pricing model — pay a flat fee, use unlimitedly — is fundamentally incompatible with AI's per-token cost structure, creating a new class of financial risk for enterprises.

Microsoft's move may be less about cost and more about strategic learning: using Claude Code as a benchmark to improve Copilot CLI before cutting off the competitor.

The Uber case reveals a dangerous feedback loop: AI adoption drives productivity metrics that look great to leadership, while silently consuming budgets at an alarming rate.

Token-based pricing creates a perverse incentive where the most productive AI users become the most expensive, potentially discouraging the very behavior companies want to encourage.

The fact that even NVIDIA — the company selling the picks and shovels — admits computing costs exceed salary costs suggests the entire industry is facing a sustainability crisis.

Microsoft's position is uniquely conflicted: it builds AI infrastructure for Anthropic while competing with Anthropic's tools internally, highlighting the blurred lines between partner, customer, and competitor in the AI ecosystem.

Concepts & terms
Token-based billing
A pricing model where users pay for each unit of text (token) processed by an AI model, rather than a flat subscription fee. This means costs scale linearly with usage, unlike traditional software where marginal costs approach zero.
Agentic coding mode
A development workflow where AI agents autonomously write, test, and debug code, often chaining multiple model calls together. This dramatically increases token consumption compared to simple chat-based AI assistance.
GitHub Copilot CLI
Microsoft's command-line AI coding assistant that integrates with GitHub. Unlike Claude Code, its costs are internal to Microsoft's Azure ecosystem, making it financially preferable for the company even if functionally inferior.
Source: juejin.cn ↗ Google Translate ↗ Backup ↗