Fuli Luo: The token cost black hole is emerging, calling for more token-efficient Agent frameworks.
Anthropic’s move to block third-party tools from abusing subscription access is uncovering a long-ignored cost crisis in the AI Agent era.
Two days ago, Anthropic announced it was cutting off the channel that allowed third-party calling frameworks to ride on Claude subscription access. Fuli Luo, a large-model executive at Xiaomi's MiMo team, immediately posted an in-depth interpretation of the event in light of the Token plan MiMo had rolled out just three days earlier.
She believes Anthropic’s action is not simply a defensive move in business, but a necessary milestone in the ecosystem’s maturation, given that the global supply of compute can’t keep up with the accelerating demand of Agents.
Directly impacted by this change are users of third-party calling frameworks such as OpenClaw and OpenCode that run on Claude subscription access. These users now face a sharp cost spike that, in the short term, could reach dozens of times their previous spend.
But Luo argues this pressure is precisely a catalyst for better engineering: only by making the true cost of inefficiency visible can developers be pushed to take context management and cache optimization seriously.
A cost black hole behind the subscription system
Luo points out that the subscription system for Claude Code is quite sophisticated in terms of compute allocation design, but she admits that this system likely won’t be profitable, and may even be in the red.
The root cause lies in how third-party frameworks make calls. Take OpenClaw as an example: its context management has obvious flaws—in handling a single user request, the system breaks it into multiple rounds of low-value tool calls, which are then issued one by one as separate API requests. Each request typically carries a context window of more than 100k tokens.
Even if there are cache hits, this pattern is extremely wasteful; in extreme cases, it can also drive up the cache miss rates of other requests.
Luo estimates that the number of actual requests produced per query by these frameworks is often several times that of the Claude Code native framework. When converted to API billing, the real cost could be dozens of times the subscription price. She describes this gap as “not a shortfall, but a bottomless pit.”
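A back-of-envelope model makes the scale of this gap concrete. The numbers below (per-token prices, round counts, cache hit ratios) are illustrative assumptions for the sketch, not Anthropic's actual pricing:

```python
# Back-of-envelope model of the cost gap described above.
# All prices and ratios here are assumed for illustration.

INPUT_PRICE_PER_MTOK = 3.00    # assumed $/1M fresh input tokens
CACHED_PRICE_PER_MTOK = 0.30   # assumed $/1M cache-read tokens

def request_cost(context_tokens, cache_hit_ratio):
    """Cost of one API request given its context size and cache hit rate."""
    cached = context_tokens * cache_hit_ratio
    fresh = context_tokens - cached
    return (fresh * INPUT_PRICE_PER_MTOK + cached * CACHED_PRICE_PER_MTOK) / 1e6

# Native-style client: one request per query, high cache reuse.
native = request_cost(context_tokens=100_000, cache_hit_ratio=0.9)

# Third-party-style client: many tool-call rounds per query,
# each resending a 100k+ context with poor cache reuse.
third_party = 8 * request_cost(context_tokens=120_000, cache_hit_ratio=0.3)

print(f"native ≈ ${native:.3f} per query")
print(f"third-party ≈ ${third_party:.3f} per query "
      f"({third_party / native:.0f}x)")
```

Under these assumed numbers the multi-round pattern comes out roughly 37x more expensive per query, which is the "dozens of times" order of magnitude Luo describes.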
AI Workshop organizer @newlinedotco commented: The subscription “all you can eat” bundle has been a ticking time bomb from the start—third-party harnesses (such as OpenClaw) loop calls 24/7, with API costs that may reach as high as $5,000, while the subscription is only $200. Official tools (such as Claude Code) can remain sustainable only thanks to prompt cache optimization.
After the block: short-term pain, long-term discipline
Anthropic’s adjustment did not completely close the door to third-party access. Tools such as OpenClaw and OpenCode can still call Claude via APIs, but they have lost the channel to ride on the subscription plan.
This distinction is crucial. For users accustomed to running these tools at subscription prices, the cost shock is immediate and significant.
But Luo believes this pain has a corrective effect: it will force framework developers to seriously improve their context management, maximize prompt cache hit rates so that already-processed context is reused, and cut wasted token consumption. She describes this process as "pain eventually crystallizing into engineering discipline."
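The cache optimization Luo points to has a simple structural core: prompt caches generally match on an exact prefix, so a framework earns cache hits by keeping earlier context byte-stable and only appending to it. A minimal sketch of this pattern (illustrative, not any specific framework's API):

```python
# Sketch of cache-friendly context management. Prompt caches typically
# match on an exact prefix, so rewriting or reordering earlier turns
# invalidates the cache; an append-only history preserves it.

class Conversation:
    def __init__(self, system_prompt, tool_schemas):
        # Stable prefix: written once, never mutated, so every later
        # request can reuse the cached portion.
        self.prefix = [
            {"role": "system", "content": system_prompt},
            {"role": "system", "content": tool_schemas},
        ]
        self.turns = []  # append-only suffix

    def add_turn(self, role, content):
        # Append instead of rewriting history: consecutive requests
        # then share a byte-identical prefix.
        self.turns.append({"role": role, "content": content})

    def messages(self):
        return self.prefix + self.turns

convo = Conversation("You are a coding agent.", "tools: read_file, run_tests")
convo.add_turn("user", "Fix the failing test.")
first = convo.messages()
convo.add_turn("assistant", "Running the tests...")
second = convo.messages()

# Cache-friendly invariant: each request strictly extends the previous one.
assert second[: len(first)] == first
```

The contrast with the pattern criticized above is that a wasteful framework rebuilds or reshuffles this message list per tool call, turning every round into a cache miss.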
She also cautions major language model companies against blindly jumping into pricing wars before they understand the cost structure of their coding plans. Selling tokens cheaply opens the door to third-party frameworks and looks user-friendly on the surface, but in reality it is a trap, one that Anthropic has just stepped out of.
She further points out that if users spend a lot of effort on low-quality frameworks, unstable inference services, and degraded models and still get nothing in return, it will cause real harm to user experience and retention.
AI engineer @karpathy also weighed in on the discussion.
Different paths of the MiMo Token plan
While interpreting Anthropic’s move, Luo also explains the design logic behind the MiMo Token plan.
The plan supports third-party calling frameworks and bills against an allocated token quota. In design, it lines up with the overage usage packs Claude recently introduced.
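Token-allocated quota billing can be sketched as a prepaid balance that every call draws down. The class name and the output-token weighting below are assumptions for illustration; the MiMo plan's actual accounting is not described in the article:

```python
# Minimal sketch of token-quota billing (illustrative only; the real
# plan's accounting rules and weights are assumptions here).

class TokenQuota:
    def __init__(self, allotted_tokens):
        self.remaining = allotted_tokens

    def charge(self, input_tokens, output_tokens, output_weight=4):
        # Assumed convention: output tokens count more heavily than
        # input tokens, a common shape for LLM billing.
        used = input_tokens + output_weight * output_tokens
        if used > self.remaining:
            raise RuntimeError("quota exhausted; buy an overage pack")
        self.remaining -= used
        return self.remaining

quota = TokenQuota(1_000_000)
remaining = quota.charge(input_tokens=100_000, output_tokens=5_000)
```

The point of this shape is the one Luo makes: every wasteful request visibly drains the user's own balance, so inefficient frameworks are constrained by real usage costs rather than by closed-off access.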
Luo emphasizes that MiMo’s goal is “to deliver high-quality models and services steadily over the long term, not to let users pay impulsively and then churn.”
This statement reflects a compute allocation philosophy that differs from the subscription model: constrain user and framework behavior through real usage costs, rather than managing abuse risk in a closed way.
Competition in efficiency, not compute consumption
At the end of her article, Luo offers a more macro judgment: the global supply of compute can no longer keep up with the rate of token demand generated by Agents.
In her view, the way forward is not to drive token prices ever lower, but the co-evolution of "Agent frameworks with higher token efficiency" and "more powerful, more efficient models."
Whether Anthropic’s move is driven by intentional design or not, it is pushing the entire ecosystem—whether open source or closed source—toward that direction.
"The Agent era does not belong to those who burn the most compute, but to those who use it the smartest," Luo wrote.