The Cost Dilemma of AI: How Infrastructure Economics Will Reshape the Next Phase of the Market
Source: International Business Times UK
Author: Anastasia Matveeva
Compiled and Edited by: Gonka.ai
AI is expanding at an astonishing pace, but its underlying economic logic is far more fragile than it appears on the surface. When three cloud giants control two-thirds of the world's computing power, training costs soar toward $1 billion, and inference bills catch startups off guard, the true cost of this computing arms race is quietly reshaping the distribution of value across the entire AI industry.
This article is not about who will build the most advanced models. It explores a more fundamental question: Is the current economic model of AI infrastructure sustainable at scale? And how will shifts in the way computing resources are allocated reshape the distribution of value across the market?
Training a cutting-edge large model often costs tens of millions to hundreds of millions of dollars. Anthropic has publicly stated that training Claude 3.5 Sonnet cost “tens of millions of dollars,” and its CEO Dario Amodei previously estimated that the next-generation model’s training cost could approach $1 billion. Industry reports suggest GPT-4’s training cost may have already exceeded $100 million.
However, training costs are just the tip of the iceberg. The real structural pressure comes from inference costs: the expenses incurred each time the model is called. According to OpenAI's public API pricing, inference is billed per million tokens. For high-usage applications, this means that even before scaling, daily inference costs can reach thousands of dollars.
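As a rough illustration of how per-token billing scales, consider the sketch below. The request volume and per-million-token price are hypothetical placeholders, not any provider's actual rates:

```python
# Back-of-the-envelope estimate of daily inference spend under per-token billing.
# All figures here are hypothetical placeholders, not any provider's actual rates.

def daily_inference_cost(requests_per_day: int,
                         tokens_per_request: int,
                         price_per_million_tokens: float) -> float:
    """Return the daily cost of a token-metered inference API, in dollars."""
    total_tokens = requests_per_day * tokens_per_request
    return total_tokens / 1_000_000 * price_per_million_tokens

# 100,000 requests/day at ~2,000 tokens each, priced at $5 per million tokens:
print(f"${daily_inference_cost(100_000, 2_000, 5.0):,.2f} per day")  # $1,000.00 per day
```

Even at this modest scale the bill is already four figures a day, which is why inference, not training, is the recurring line item that catches teams off guard.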
AI is often described as software. But its economic essence is increasingly like capital-intensive infrastructure—requiring high upfront investments and ongoing operational expenses.
This shift in economic structure is quietly changing the competitive landscape of the AI industry. Those who can afford the compute are the giants with large-scale infrastructure; startups trying to survive in the cracks are being gradually squeezed by inference bills.
According to Holori’s 2026 cloud market analysis, AWS currently holds about 33% of the global cloud market, Microsoft Azure about 22%, and Google Cloud about 11%. These three control roughly two-thirds of global cloud infrastructure, and most AI workloads are run on their platforms.
The significance of this concentration is clear: when OpenAI's API experiences downtime, thousands of products are affected at once; when a major cloud provider suffers an outage, services across industries and regions go down with it.
This concentration is not narrowing; infrastructure spending continues to grow. Nvidia, for example, reports annual revenue from its data center business surpassing $80 billion, reflecting sustained demand for high-performance GPUs.
More subtly, there is an implicit structural inequality. According to SEC filings and market reports, leading labs like OpenAI and Anthropic lock in GPU resources through multi-billion-dollar "equity-for-compute" agreements at near-cost prices of $1.30–$1.90 per hour. Smaller companies lacking strategic partnerships with Nvidia, Microsoft, or Amazon are forced to buy at retail prices exceeding $14 per hour, a premium of more than 600%.
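A quick check of the markup implied by the hourly rates quoted above (the figures come from this article; the calculation is only illustrative):

```python
# Premium implied by the GPU-hour prices quoted above.
near_cost_rate = 1.90   # $/GPU-hour, upper end of the reported "equity-for-compute" range
retail_rate = 14.00     # $/GPU-hour, reported retail price floor
premium = (retail_rate - near_cost_rate) / near_cost_rate
print(f"{premium:.0%}")  # 637%, i.e. a markup of more than 600%
```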
This pricing gap is driven by Nvidia's recent strategic investments in leading labs, totaling $40 billion. Access to AI infrastructure is increasingly dictated by capital-intensive procurement agreements rather than open market competition.
In early adoption phases, this concentration may seem "efficient." But at scale it introduces pricing risk, supply bottlenecks, and infrastructure dependency: three compounding vulnerabilities.
Another often neglected aspect of AI infrastructure costs is energy.
According to the International Energy Agency (IEA), data centers currently account for about 1–1.5% of global electricity consumption, and AI-driven demand growth could significantly increase this share in the coming years.
This means that the economics of compute are not just a financial issue but also an infrastructure and energy challenge. As AI workloads continue to expand, the geopolitical significance of electricity supply will become increasingly prominent: whichever country can provide the most stable compute at the lowest energy cost will hold a structural advantage in the AI industry.
When Jensen Huang announced at GTC26 that Nvidia's order visibility exceeded $1 trillion, he was describing more than a company's commercial success; he was illustrating a grand process in which civilization transforms electricity, land, and scarce minerals into intelligent compute.
While centralized data centers continue to expand, a different exploration is quietly emerging—aiming to fundamentally redefine how compute resources are coordinated.
Decentralized Inference: A Structural Alternative
The Gonka protocol exemplifies this direction. It is a decentralized network designed specifically for AI inference, with the core goal of minimizing network synchronization and consensus overhead so that as much computing power as possible goes toward real AI workloads.
In governance, Gonka adopts the principle of "one compute unit, one vote": governance weight is determined by verifiable compute contribution, not by capital share. Technologically, the protocol uses short-cycle performance measurement intervals (called Sprints), requiring participants to demonstrate real GPU compute power in real time via a Transformer-based proof-of-work (PoW) mechanism.
The significance of this design is that nearly 100% of the network’s compute power is directed toward AI inference workloads, rather than spent on maintaining consensus, communication, and other infrastructure overhead.
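The following is a minimal sketch of what sprint-based, compute-weighted accounting could look like. The names, data structures, and stubbed verification step are illustrative assumptions for this article, not Gonka's actual implementation:

```python
# Illustrative model of "one compute unit, one vote" over a single sprint.
# Hypothetical sketch only: the Transformer-based PoW verification is stubbed out.
from dataclasses import dataclass, field

@dataclass
class Sprint:
    # node_id -> compute verified during this sprint (arbitrary work units)
    verified_work: dict[str, float] = field(default_factory=dict)

    def submit_proof(self, node_id: str, claimed_work: float, proof: bytes) -> None:
        """Credit a node with work only if its proof-of-work artifact verifies."""
        if self._verify(proof, claimed_work):
            self.verified_work[node_id] = (
                self.verified_work.get(node_id, 0.0) + claimed_work
            )

    def governance_weights(self) -> dict[str, float]:
        """Voting weight equals each node's share of verified compute."""
        total = sum(self.verified_work.values())
        if total == 0:
            return {}
        return {nid: w / total for nid, w in self.verified_work.items()}

    @staticmethod
    def _verify(proof: bytes, claimed_work: float) -> bool:
        # Placeholder: a real network would check a Transformer-based PoW here.
        return True
```

The property the real protocol targets, per the description above, is that verification stays cheap relative to the work itself, so nearly all GPU time serves inference rather than consensus.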
Economic Logic of Distributed Compute
From an economic perspective, the value proposition of decentralized compute networks has three levels.
First is cost. The pricing structure of centralized cloud providers inherently includes massive fixed-asset depreciation, data center operating costs, and shareholder profit expectations. Decentralized networks monetize idle GPU resources, significantly reducing these costs. For example, Gonka's current inference service via its USD-based GonkaGate gateway charges about $0.0009 per million tokens, while centralized providers such as Together AI charge around $1.50 for comparable models (e.g., DeepSeek-R1): a difference of more than a thousandfold.
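The implied ratio, using the two prices quoted above (figures from this article; the arithmetic is only a sanity check):

```python
# Ratio implied by the per-million-token prices quoted above.
gonka_price = 0.0009       # $/1M tokens via the GonkaGate gateway, as quoted
centralized_price = 1.50   # $/1M tokens for a comparable hosted model, as quoted
print(f"{centralized_price / gonka_price:,.0f}x")  # ~1,667x
```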
Second is supply elasticity. Centralized providers' compute supply is rigid, with expansion taking months or quarters. Decentralized participants can join or leave flexibly based on demand, theoretically responding more quickly to peak loads. Just as Amazon Web Services grew out of spare capacity built for holiday traffic surges, the peaks and valleys of AI inference also call for elastic infrastructure.
Third is sovereignty, which is especially relevant at the national level. When a country's public services depend heavily on an external cloud provider, compute becomes a strategic vulnerability. Decentralized networks offer a potential solution: local data centers can serve as nodes in a global distributed network, preserving data sovereignty while earning sustainable commercial returns through access to the global market.
Returning to the core question at the start: Is the current economic model of AI infrastructure sustainable at scale?
The answer: sustainable for leading players; increasingly unsustainable for everyone else.
AWS, Azure, and Google Cloud have built formidable moats through decades of capital accumulation, and their scale advantages are nearly unassailable in the short term. But this structural advantage also means that pricing power, data access, and infrastructure dependence are highly concentrated in a few private entities.
Historically, every major monopoly in technological infrastructure has eventually spawned alternative distributed architectures: the internet itself was a rebellion against telecom monopolies, BitTorrent disrupted centralized content distribution, and Bitcoin challenged centralized currency issuance.
Decentralizing AI infrastructure may not be an ideological choice but an economic inevitability—when the costs of centralization become high enough to drive large-scale user migration, demand for alternatives will explode. Jensen Huang’s analogy that “every financial crisis pushes more people toward Bitcoin” applies equally to compute markets.
The emergence of DeepSeek proves one thing: in a world where open-source models approach the capabilities of closed-source frontiers, inference costs will be the key variable determining the speed of AI application scaling. Whoever can provide the lowest-cost, highest-availability inference compute will hold the ticket to this race.
Conclusion: The Infrastructure War Has Just Begun
The next phase of AI competition will not be decided solely by model capabilities but by the economic game of infrastructure.
Centralized compute giants hold capital and scale advantages but also bear fixed cost structures and pricing pressures. Decentralized networks are entering the market at extremely low marginal costs but must prove they can reach real business thresholds in stability, usability, and ecosystem scale.
Both paths will coexist long-term, exerting mutual pressure. The tension between centralization and decentralization will be one of the most important structural themes to watch in the AI industry over the next five years.
This infrastructure war has only just begun.
About the Author
Anastasia Matveeva is a senior product manager and researcher at Product Science, and one of the co-founders of the Gonka protocol. Her research focuses on foundational machine learning infrastructure, large language model inference, and distributed computing systems.
She holds a PhD in Mathematics from UPC Barcelona, where she has also served as a researcher and lecturer. Since joining Product Science in 2021, she has led the development of AI engineering tools now adopted by over a hundred engineers and used at multiple Fortune 500 companies.
About Gonka.ai
Gonka is a decentralized network designed to provide efficient AI compute, aiming to maximize the utilization of global GPU resources for meaningful AI workloads. By removing centralized gatekeepers, Gonka offers developers and researchers permissionless access to compute resources, rewarding all participants with its native token GNK.
Gonka was incubated by US-based AI developer Product Science Inc. Founded by the Liberman siblings, industry veterans and former core product directors at Snap Inc., the company raised $18 million in 2023 and an additional $51 million in 2025. Investors include OpenAI investor Coatue Management, Solana investor Slow Ventures, Bitfury, K5, Insight, and Benchmark partners. Early contributors include 6 blocks, Hard Yaka, Gcore, and other leading Web2/Web3 companies.
Website | Github | X | Discord | Telegram | White Paper | Economic Model | User Manual