The Silent Price Hike: How AI Coding Tools Got More Expensive Without Changing Their Pricing Pages
Headline prices on AI coding tools stayed flat in early 2026, but tokenizer changes, quota throttling, and credit rebases pushed effective unit cost up across Anthropic, GitHub Copilot, OpenAI and Google. Open-weight coding models like Kimi K2.6 and DeepSeek V4-Pro now match the previous frontier on SWE-bench Pro, with DeepSeek matching Opus 4.7 on context window - which means the exit ramp finally exists.

Headline prices stayed flat. Effective unit cost did not. Across Anthropic, GitHub, OpenAI and Google, every major AI coding vendor made a structural change in the first four months of 2026 that pushed real cost-per-task up without touching the sticker on their pricing page. You feel it as a quota wall on a Tuesday afternoon, not as a line item on your subscription. Enterprises on token-billed contracts feel the same shift as a budget overrun. The rest of us feel it as friction.
I want to walk through the four vendors with dates and primary sources, then show why open-weight coding models have stopped being a research curiosity. The frame here is not outrage. Vendors have every right to reprice as compute tightens. The point is what to do about it.
What changed at Anthropic
Anthropic's per-token list prices for Opus, Sonnet and Haiku have not moved publicly in 2026. Three things compounded into a real price increase anyway.
First, the tokenizer. Claude Opus 4.7, released 16 April 2026, ships with a new tokenizer that, in Anthropic's own words, maps the same input to "roughly 1.0 to 1.35x" the token count. OpenRouter measured this on roughly one million switcher requests on 27 April 2026 and found a real cost impact between +11.9 and +27.2 percent across prompt-size buckets. ClaudeCodeCamp's measurement on real Claude Code workloads landed at a weighted 1.325x. An 80-turn debugging session that cost USD 6.65 on 4.6 now lands at USD 7.86 to 8.76 on 4.7. Same pricing page. Twenty to thirty percent more money out the door.
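The mechanism is worth making concrete. A straight-multiplication sketch, assuming the whole session scales by the measured multipliers (real sessions mix prompt sizes, so exact per-session figures differ):

```python
# Sketch: effective cost under a tokenizer change, list price unchanged.
# Multipliers are the measurements cited above (OpenRouter's low/high
# bounds and ClaudeCodeCamp's weighted figure); the session cost is the
# article's 80-turn example.

def effective_cost(old_cost_usd: float, token_multiplier: float) -> float:
    """Same prompts, same per-token list price, more tokens per prompt."""
    return old_cost_usd * token_multiplier

old_session = 6.65  # USD for the 80-turn session on Opus 4.6
for mult in (1.119, 1.272, 1.325):
    print(f"x{mult}: {effective_cost(old_session, mult):.2f} USD "
          f"(+{(mult - 1) * 100:.0f}%)")
```

The list price never appears in the function: the increase lives entirely in the multiplier, which is exactly why it never shows up on a pricing page.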
Second, the Pro and Max quotas. The Register reported on 26 March 2026 that Anthropic introduced peak-hour throttling of the 5-hour session window during early-afternoon-to-evening hours, Swiss time. By 31 March, Anthropic was admitting on the record that quotas were running out faster than the policy change alone explained. Fortune's 24 April write-up named three engineering missteps that compounded the friction. Today, for the first time since I started on Max, I checked my account and found my weekly limit nearly exhausted. That is the sharp end of "policy unchanged, behaviour different".
Third, the Pro tier itself. On 21 April 2026 Anthropic quietly updated docs and pricing to remove Claude Code from the USD 20 Pro plan, framed by Head of Growth Amol Avasare as "a small test on about 2 percent of new prosumer signups". Public backlash led to a restoration in less than 24 hours.
Gergely Orosz at The Pragmatic Engineer captured the aggregate on 30 April 2026: across the engineers he interviewed, token spend was up roughly 10x over six months. Headline prices, again, unchanged.
What changed at GitHub Copilot
On 27 April 2026 GitHub announced that, effective 1 June 2026, every Copilot plan moves from "premium request units" to "GitHub AI Credits" priced by actual token consumption (one credit equals USD 0.01). Headline subscriptions stayed exactly where they were: Pro USD 10, Pro+ USD 39, Business USD 19 per user, Enterprise USD 39 per user. Each plan now includes monthly credits equal to its price (USD 10, 39, 19 and 39 of credits respectively).
What that buys you depends on the model. Per-million-token list prices range from Grok Code Fast 1 at USD 0.20 input to GPT-5.5 at USD 5.00 input and USD 30.00 output. Claude Opus variants sit at USD 5.00 input and USD 25.00 output, identical to Anthropic's direct pricing. The community math on GitHub Discussion #192948 is brutal: a Pro+ user calculated that Opus 4.6 usage under the new rates would cost about 27 times more than under the prior PRU model, shrinking roughly 1,500 monthly premium requests to about 140 for the same USD 39 a month. The thread sits at over 750 downvotes against fewer than 20 upvotes. Visual Studio Magazine's 27 April headline summarised it as "you will get less, but pay the same price".
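The credit arithmetic is simple enough to sketch yourself. A minimal example using the list prices quoted above; the per-task token counts are illustrative assumptions, not GitHub's numbers:

```python
# Sketch: GitHub AI Credits consumed by one model call (1 credit = USD 0.01).
# Prices per million tokens are the list prices quoted above; the token
# counts below are an illustrative agentic task, not a GitHub figure.

def credits_for_call(in_tokens: int, out_tokens: int,
                     in_price_per_m: float, out_price_per_m: float) -> float:
    usd = in_tokens / 1e6 * in_price_per_m + out_tokens / 1e6 * out_price_per_m
    return usd / 0.01  # convert USD to credits

# Hypothetical task: 60K tokens in, 8K out, on Claude Opus (5.00 / 25.00).
task = credits_for_call(60_000, 8_000, 5.00, 25.00)
monthly_credits = 39 * 100  # Pro+: USD 39 of included credits
print(f"{task:.0f} credits per task, ~{monthly_credits / task:.0f} tasks/month")
```

Run the same call against a cheap model and the task cost drops by an order of magnitude, which is the honest side of usage-based billing: model choice finally shows up in your own arithmetic.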
What changed at OpenAI
OpenAI's 2026 message is a little more transparent and a little less subtle. ChatGPT now spans six tiers: Free, Go (USD 8), Plus (USD 20), Pro (USD 100 or USD 200), Business (USD 25 to 30 per user), Enterprise (custom). Pro USD 100 maps to 5x Plus quotas; Pro USD 200 maps to 20x. On 23 April 2026, GPT-5.5 launched across Plus, Pro, Business and Enterprise. ChatGPT head Nick Turley said publicly that pricing will "significantly evolve" and floated phasing out unlimited plans, comparing them to "unlimited electricity".
Codex CLI tightened in parallel. The OpenAI Developer Community thread "Pro plan: hit 5-hour limit twice in 2h and 1.5h" documents Pro USD 100 users hitting their session ceiling twice within two hours after the 9 April update. On 2 April 2026 Codex shifted to API-style token rates for Business and new Enterprise plans, while Plus and Pro stayed on the message-based rate card. Per OpenAI's own Codex pricing page, a Plus user now averages roughly 14 credits per GPT-5.5 message, 7 per GPT-5.4, 2 per GPT-5.4-mini. Promotions through 31 May 2026 keep Pro USD 200 at 25x Plus (vs the standard 20x) and double Codex usage on the USD 100 tier (10x vs standard 5x), but the cliff at the end of May is real.
The communication pattern is also worth flagging. On 28 April 2026 Codex rate limits were reset for all paid plans, but the announcement came as a tweet from OpenAI's Tibo Sottiaux, not a formal channel.
What changed at Google
Google's Gemini 3.1 Pro launched 19 February 2026 at USD 2.00 per million input and USD 12.00 per million output up to 200K context, USD 4.00 / USD 18.00 above. Gemini 3 Flash sits at USD 0.50 / USD 3.00. The 1 million-token context window is genuinely useful and pricing is competitive on long-context work.
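The two-tier rate card rewards a worked example. A sketch under one reading of it: a request at or below the 200K-token boundary bills at the lower tier, a larger request bills entirely at the higher tier (how Google applies the boundary is an assumption here; verify against the pricing page):

```python
# Sketch: Gemini 3.1 Pro cost per request under the tiered rates above.
# Assumption: a request over the 200K input-token boundary bills entirely
# at the higher tier. Prices in USD per million tokens.

def gemini_pro_cost(in_tokens: int, out_tokens: int) -> float:
    if in_tokens <= 200_000:
        in_price, out_price = 2.00, 12.00
    else:
        in_price, out_price = 4.00, 18.00
    return in_tokens / 1e6 * in_price + out_tokens / 1e6 * out_price

print(f"{gemini_pro_cost(150_000, 5_000):.2f} USD")  # below the boundary
print(f"{gemini_pro_cost(800_000, 5_000):.2f} USD")  # long-context request
```

The step function matters for agents: a retrieval strategy that keeps prompts under 200K tokens halves the input rate, so "competitive on long-context work" still depends on how often you actually cross the boundary.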
The quieter change happened on 1 April 2026: Gemini Pro models were removed from the free tier. Flash models stay free with reduced daily quotas. For prosumers and small dev shops who used Pro on the free tier as their first stop, the effective price went from zero to non-zero overnight. Gemini Code Assist Enterprise sits at USD 75 per developer per month, and Jules (Google's coding agent) caps the free tier at 15 daily tasks with paid plans only available on personal Gmail accounts, which is its own procurement complication.
There is also a Workspace bundling story (Gemini folded into Workspace plans with a list-price uplift), but the percentage figure circulating is sourced to a single procurement guide and I cannot corroborate it from Google's primary channels, so I will skip the number. The direction is clear: a previously optional add-on is now mandatory. But no percentage without a primary source.
The structural pattern
Strip the four vendors back to their bones and the same shape appears everywhere.
If you pay a flat subscription, the change is invisible on the line item but visible at the quota wall. The throttle, the tokenizer, the model removal, the credit rebase: they all happen one or two layers below the pricing page. You feel them as "Claude Code felt slower this week" or "I hit the weekly cap on Friday again". Companies on token-billed contracts feel the same shift as a budget overrun: 10x token spend in six months at an unchanged headline price, per Orosz's reporting.
Both face the same underlying driver. Anthropic is, on the record, compute-constrained. OpenAI's "unlimited electricity" framing is a leading indicator. GitHub's pivot to AI Credits is structurally honest about where this is heading: pay for what you use, no more flat-rate hiding place. The question is not whether the per-token economics will keep tightening. It is whether you have an exit ramp when they do.
The open-source counterweight
What the community has noticed in recent weeks is that open-weight coding models have stopped being a research curiosity. Two of them now match the previous generation of closed frontier coding models on the hardest publicly graded coding benchmark, with Anthropic's just-released Opus 4.7 still in the lead.
SWE-bench Pro grades a model on real GitHub issues that are too long and too messy to fit in a smaller test suite. It is the closest public proxy to "would this model survive in your Claude Code session". The current scoreboard:
| Model | Open weights | Context window | SWE-bench Pro |
|---|---|---|---|
| Claude Opus 4.7 (Anthropic) | No | 1M | 64.3% |
| Kimi K2.6 (Moonshot) | Yes (Apr 2026) | 256K | 58.6% |
| GPT-5.4 (OpenAI) | No | 400K | 57.7% |
| DeepSeek V4-Pro | Yes (Apr 2026) | 1M | 55.4% |
| Claude Opus 4.6 (Anthropic) | No | 1M | 53.4% |
Qwen3-Coder-Next (Alibaba, Apache 2.0, 256K context) and GLM-4.6 (Z.ai, MIT, 200K context) sit one tier below but are catching up fast. Several of these expose Anthropic-compatible endpoints, which means they drop into the Claude Code session you already use. No re-architecture. DeepSeek V4-Pro even matches Opus 4.7 on context window. The point is simple: when your subscription stops being enough, the exit ramp now exists.
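Mechanically, the drop-in looks like an environment override rather than a migration. A sketch using Claude Code's documented base-URL and auth-token environment variables; the URL and key below are illustrative placeholders, not a real provider endpoint (check your provider's docs for the actual values):

```shell
# Sketch: pointing an existing Claude Code setup at an Anthropic-compatible
# open-weight endpoint. Variable names are Claude Code's documented
# overrides; the URL and token here are placeholders, not real values.
export ANTHROPIC_BASE_URL="https://api.example-provider.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="sk-placeholder"
# Then launch the same session as always:
#   claude
```

Two exported variables and the harness, the hooks, and the muscle memory all stay put; only the backend moves. That is what "exit ramp" means in practice.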
Vendor optionality as a design principle
I am building Shipwright on Claude Code. I use Claude every day. None of what is above is a hatchet job on Anthropic. The point is industry-wide pattern recognition.
Shipwright runs Gemini and OpenAI review loops alongside Claude as part of its 5-axis review, exactly because vendor optionality belongs in the architecture, not in a panic migration when prices move. If Anthropic ships a tokenizer change, the harness keeps working. If GitHub flips Copilot to credits, the harness keeps working. The failure mode is signing a single-vendor flat-rate deal and discovering, six months later, that there is no graceful way out.
Closing
Vendors have every right to reprice as compute tightens. Anthropic's compute shortage is real. OpenAI's unit economics are real. GitHub's "agentic workflows consume far more resources than the original plan structure was built to support" is honest. None of this is bad faith.
What ended is the assumption that a flat-rate subscription means flat cost. The published price is no longer the price.
So here is what I am doing about it personally. The day my Max plan stops being enough - and based on this week's quota wall, that day might be sooner than I thought - I will be running an open-weight model that holds the top of the SWE-bench leaderboard by then, through Claude Code. Vendor optionality as a personal capability, not just an architectural one.
Headline prices stayed flat. Effective unit cost did not. Plan accordingly.
Sources:
- "Introducing Claude Opus 4.7" - Anthropic - 16 Apr 2026 - https://www.anthropic.com/news/claude-opus-4-7
- "Opus 4.7's New Tokenizer: What It Actually Costs" - OpenRouter - 27 Apr 2026 - https://openrouter.ai/announcements/opus-47-tokenizer-analysis
- "Claude Token Counter, now with model comparisons" - Simon Willison - 20 Apr 2026 - https://simonwillison.net/2026/apr/20/claude-token-counts/
- "I Measured Claude 4.7's New Tokenizer. Here's What It Costs You." - ClaudeCodeCamp - 16 Apr 2026 - https://www.claudecodecamp.com/p/i-measured-claude-4-7-s-new-tokenizer-here-s-what-it-costs-you
- "Anthropic admits Claude Code quotas running out too fast" - The Register (Tim Anderson) - 31 Mar 2026 - https://www.theregister.com/2026/03/31/anthropic_claude_code_limits/
- "Anthropic explains Claude Code's recent performance decline after weeks of user backlash" - Fortune - 24 Apr 2026 - https://fortune.com/2026/04/24/anthropic-engineering-missteps-claude-code-performance-decline-user-backlash/
- "Max 20 plan: rate limit 100% exhausted within ~70 minutes after reset" - GitHub Issue #41788 - 1 Apr 2026 - https://github.com/anthropics/claude-code/issues/41788
- "The Pulse: token spend breaks budgets - what next?" - The Pragmatic Engineer (Gergely Orosz) - 30 Apr 2026 - https://blog.pragmaticengineer.com/the-pulse-token-spend-breaks-budgets-what-next/
- "GitHub Copilot is moving to usage-based billing" - GitHub Blog - 27 Apr 2026 - https://github.blog/news-insights/company-news/github-copilot-is-moving-to-usage-based-billing/
- "Changes to GitHub Copilot Individual plans" - GitHub Blog - 20 Apr 2026 - https://github.blog/news-insights/company-news/changes-to-github-copilot-individual-plans/
- "GitHub Copilot is moving to usage-based billing - Discussion #192948" - GitHub Community - 27 Apr 2026 - https://github.com/orgs/community/discussions/192948
- "Codex Pricing" - OpenAI Developers - current - https://developers.openai.com/codex/pricing
- "Codex Rate limits reset for all paid plans April 28, 2026" - OpenAI Developer Community - 28 Apr 2026 - https://community.openai.com/t/codex-rate-limits-reset-for-all-paid-plans-april-28-2026/1379921
- "Gemini Developer API pricing" - Google AI for Developers - current - https://ai.google.dev/gemini-api/docs/pricing
- "moonshotai/Kimi-K2.6" - Hugging Face - Apr 2026 - https://huggingface.co/moonshotai/Kimi-K2.6
- "deepseek-ai/DeepSeek-V4-Pro" - Hugging Face - Apr 2026 - https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
- "DeepSeek V4 Preview Release" - DeepSeek API Docs - 24 Apr 2026 - https://api-docs.deepseek.com/news/news260424
- "DeepSeek-V4: a million-token context that agents can actually use" - Hugging Face Blog - Apr 2026 - https://huggingface.co/blog/deepseekv4
- "SWE-Bench Pro Public Leaderboard" - Scale - current - https://labs.scale.com/leaderboard/swe_bench_pro_public
- "Claude Opus 4.7 Benchmarks Explained" - Vellum - Apr 2026 - https://www.vellum.ai/blog/claude-opus-4-7-benchmarks-explained
