Anthropic outages. OpenAI killed Sora. GPU rental prices up nearly 40 percent, memory up 50. The $690 billion in AI infrastructure capex is not enough. The models powering agentic commerce are hitting capacity limits.
Anthropic went down five times in March. Not minor blips. Full outages that locked paying customers out of Claude during business hours. The company tightened usage limits during peak weekday windows, throttling the very users who depend on the platform for production workloads. Then it happened again on April 7. Today.
OpenAI looked at its GPU allocation, looked at Sora burning $1 million a day to serve fewer than 500,000 users, and killed the product. Sam Altman told Variety he felt "terrible" breaking the news to Disney Experiences chairman Josh D'Amaro. He did it anyway because the GPUs were worth more running inference for ChatGPT than rendering videos nobody was watching.
GPU rental prices are up nearly 40 percent in six months. Memory costs are spiking 50 percent. Lead times for data center GPUs have stretched to 52 weeks.
The five largest AI companies are spending $690 billion on infrastructure this year. And they cannot get enough compute to keep the lights on. For agentic commerce, the question we have been asking is who pays. Now add: what happens when you cannot buy capacity even if you can afford it?
The Numbers That Do Not Add Up
Start with the capex. Futurum Group pegged the combined 2026 capital expenditure of Microsoft, Alphabet, Amazon, Meta, and Oracle at $660 billion to $690 billion. That is a near-doubling from $380 billion in 2025. About 75 percent of it, roughly $450 billion, goes directly to AI infrastructure: servers, GPUs, data centers, cooling systems.
Those numbers are staggering. They are also insufficient.
Amazon alone is projecting $200 billion. Alphabet is in the $175 billion to $185 billion range. Microsoft is tracking toward $120 billion. As we covered in our analysis of OpenAI's $852 billion valuation, the infrastructure thesis behind these bets depends on demand scaling fast enough to justify the spend. Demand scaled. The problem is that supply did not keep up.
Hyperscalers now spend 45 to 57 percent of revenue on capital expenditure. Oracle hit 57 percent capital intensity. After buybacks and dividends, aggregate capex for the big five now exceeds projected cash flows. They are borrowing to build infrastructure that is already oversubscribed before it comes online.
Where the Bottleneck Actually Is
The GPU gets the headlines. The memory gets the blame.
High-bandwidth memory, HBM3 and HBM4, is produced by three manufacturers globally. Micron has already sold out its entire 2026 HBM4 capacity. Hyperscalers have locked up an estimated 40 percent of global DRAM supply through multi-year contracts, leaving enterprise buyers fighting over what remains.
The downstream effects are measurable. SemiAnalysis launched an H100 rental price index and found that one-year contract pricing jumped from $1.70 per GPU-hour in October 2025 to $2.35 by March 2026. On-demand capacity is effectively sold out. Half the providers surveyed had zero H100 or H200 availability. Not limited availability. Zero.
A training run budgeted at $40,000 on reserved capacity now costs $80,000 to $120,000 on-demand, if you can find capacity at all. NVIDIA's newer Blackwell chips face lead times stretching into mid-2026. The company shifted wafer allocation toward Blackwell, which means H100 supply is actually tightening even as demand for it stays elevated.
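The index numbers make the math easy to check. A quick back-of-the-envelope calculation (the 23,500 GPU-hour workload size is an illustrative assumption, not a figure from SemiAnalysis):

```python
# Illustrative back-of-the-envelope on the H100 price shock.
# The GPU-hour count is an assumed workload size, not sourced data.
reserved_oct = 1.70   # $/GPU-hour, one-year contract, October 2025
reserved_mar = 2.35   # $/GPU-hour, one-year contract, March 2026

jump = (reserved_mar - reserved_oct) / reserved_oct
print(f"Contract price jump: {jump:.0%}")   # the "nearly 40 percent"

gpu_hours = 23_500                       # assumed training-run size
budgeted = gpu_hours * reserved_oct      # roughly $40,000 on reserved capacity
on_demand_multiple = (2.0, 3.0)          # the 2-3x premium cited above
low, high = (budgeted * m for m in on_demand_multiple)
print(f"Budgeted: ${budgeted:,.0f}; on-demand: ${low:,.0f} to ${high:,.0f}")
```

The 38 percent contract jump and the 2-3x on-demand premium are different numbers measuring different things: the first is what disciplined buyers with reservations pay, the second is what everyone else pays when capacity exists at all.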
Here is the thing. Efficiency gains in AI models are not reducing compute demand. They are increasing it. Better models attract more users who run more inference, which consumes more GPUs. Jevons paradox, playing out in real time across every data center on the planet.
Why Agentic Commerce Should Be Worried
We have spent months covering the protocol layer for agentic commerce: trust frameworks, payment rails, discovery standards. All of it assumes the models are available. That assumption is cracking.
An AI agent that initiates a purchase needs to call an LLM. It needs to evaluate options, compare prices, verify trust signals, and execute payment. Each of those steps is an inference call. Each inference call requires GPU capacity. When the model goes down, the agent goes down. When the model is rate-limited, the agent is rate-limited. When capacity is sold out, the agent sits idle.
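That dependency chain is worth making explicit. A minimal sketch, where `call_llm` and `ProviderDown` are hypothetical stand-ins for whatever provider SDK and error type an agent actually uses; the point is that every step is a separate inference call and a separate failure point:

```python
# Hypothetical sketch of an agent purchase as a chain of inference calls.
# call_llm stands in for a real provider SDK; ProviderDown for its outage error.
class ProviderDown(Exception):
    """Raised when the model provider is down, throttled, or sold out."""

def call_llm(prompt: str, available: bool = True) -> str:
    if not available:
        raise ProviderDown("model capacity exhausted")
    return f"ok: {prompt}"

def run_purchase(available: bool = True) -> str:
    # Each step below is one inference call against the provider.
    steps = ["evaluate options", "compare prices",
             "verify trust signals", "execute payment"]
    for step in steps:
        call_llm(step, available)
    return "transaction complete"

print(run_purchase(available=True))    # all four calls succeed
try:
    run_purchase(available=False)      # first call fails; the agent sits idle
except ProviderDown as e:
    print(f"agent idle: {e}")
```

Nothing in the workflow is recoverable from inside the workflow: if the provider cannot serve the first call, the other three never happen.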
Anthropic's March outages offer a concrete case study. The company doubled off-peak usage limits and throttled peak-hour access. If you were running an agentic commerce workflow on Claude during a weekday morning, your agent's ability to transact depended on whether Anthropic had enough GPUs to serve your request. That is not a hypothetical constraint. It happened. Repeatedly.
As we explored in our AI demand shock analysis, the chip shortage was already biting into AI deployment timelines. What changed is that the shortage has now reached the application layer. It is no longer a supply chain story for hardware procurement teams. It is a reliability story for anyone building on top of these models.
Agentic commerce just hit a wall that no protocol can solve. You can build the best trust framework in the world. If the underlying model cannot stay online, agent-initiated transactions do not happen.
The Capacity Allocation Problem
OpenAI's decision to kill Sora reveals how this plays out in practice. The company had a product with nearly half a million users, a $150 million Disney partnership, and genuine market interest. It killed it anyway because the GPUs generated more revenue running ChatGPT.
That is a resource allocation decision that would have been unthinkable two years ago. You do not shut down a product with a Disney deal because you are feeling strategic. You shut it down because you physically cannot run it and the thing that makes you money at the same time.
OpenAI CEO Sam Altman was explicit about the reasoning: the company needed to "concentrate compute and product capacity into automated researchers and companies." Translation: every GPU allocated to Sora was a GPU not allocated to the business lines where OpenAI competes with Anthropic and Google.
This is the new reality. Compute is not just expensive. It is contested. Companies are making zero-sum choices about which products get to exist based on GPU availability.
For agentic commerce, this creates a dependency chain that the industry has not fully reckoned with. An agent commerce platform built on Claude is at the mercy of Anthropic's capacity planning. A workflow built on GPT-4o depends on OpenAI not deciding those GPUs are better used elsewhere. Gartner projects that over 40 percent of agentic AI projects will fail by 2027 because legacy systems cannot support modern AI execution demands. Add compute scarcity to that list.
What Comes Next
The $690 billion is real money. Data centers are being built. Chips are being fabricated. The capacity gap will eventually narrow. But "eventually" does not help the company that needs to run a production agentic commerce workflow today.
Short-term, expect three things.
First, more aggressive rate limiting across all major LLM providers. Anthropic already moved to peak-hour throttling. OpenAI is consolidating products. Google has not faced the same public pressure yet, but its Gemini infrastructure is absorbing similar demand curves. Agents that depend on any single provider will face intermittent failures.
Second, compute cost becomes a line item in agentic commerce economics. We flagged the economics question in our State of the Stack coverage: who pays when an agent transacts? Now add: what margin remains after the inference cost? If GPU rental prices are up 40 percent and climbing, every agent-initiated transaction just got more expensive to execute.
Third, multi-model fallback architectures become a requirement, not a nice-to-have. If Claude goes down, the agent needs to fall back to GPT-4o or Gemini without breaking the workflow. That is engineering complexity that most agentic commerce platforms have not built yet.
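The margin question in the second point can be made concrete. Every figure below is an assumption chosen for the sketch, not sourced data; the structure of the calculation is the point:

```python
# Illustrative unit economics for one agent-initiated transaction.
# All figures are assumptions for the sketch, not sourced data.
calls_per_txn = 4            # evaluate, compare, verify, pay: one call each
tokens_per_call = 2_000      # assumed prompt + completion tokens per call
cost_per_mtok = 10.00        # assumed blended $ per million tokens

inference_cost = calls_per_txn * tokens_per_call * cost_per_mtok / 1_000_000
print(f"Inference cost per transaction: ${inference_cost:.3f}")

fee_per_txn = 0.25           # assumed platform fee captured per transaction
margin = fee_per_txn - inference_cost
print(f"Margin at old compute prices: ${margin:.3f}")

# Apply the roughly 40 percent compute price climb from the rental index.
margin_after = fee_per_txn - inference_cost * 1.40
print(f"Margin after the climb: ${margin_after:.3f}")
```

At these assumed numbers the margin survives, but it shrinks by a fifth from a price move the platform had no part in. A workflow with more inference calls per transaction, or thinner fees, goes underwater.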
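The fallback requirement in the third point is simple in outline, if not in production. A minimal sketch, where the provider names and `complete` callables are simulated stand-ins; a real implementation adds timeouts, backoff, and per-provider prompt adaptation:

```python
# Hypothetical multi-model fallback: try providers in order, return the
# first success. Real systems add timeouts, retries, and prompt adaptation.
from typing import Callable

class AllProvidersDown(Exception):
    pass

def with_fallback(prompt: str,
                  providers: list[tuple[str, Callable[[str], str]]]) -> str:
    errors = []
    for name, complete in providers:
        try:
            return complete(prompt)
        except Exception as e:   # a real system catches provider-specific errors
            errors.append(f"{name}: {e}")
    raise AllProvidersDown("; ".join(errors))

# Simulated providers: the primary is throttled, the fallback answers.
def claude(prompt):  raise RuntimeError("rate limited")   # peak-hour throttle
def gpt4o(prompt):   return f"gpt-4o handled: {prompt}"
def gemini(prompt):  return f"gemini handled: {prompt}"

result = with_fallback("compare prices for order 123",
                       [("claude", claude), ("gpt-4o", gpt4o),
                        ("gemini", gemini)])
print(result)   # falls through to the second provider
```

The ordering of the provider list is itself a business decision: it encodes which model you trust most, which is cheapest, and which you expect to be available during peak hours.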
The infrastructure is being built. The money is being spent. But right now, the AI industry is in the strange position of having committed $690 billion to solve a problem that will take years to resolve, while the models everyone depends on are running out of room today.
If $690 billion is not enough to keep the models running, what does that mean for every agent commerce platform that assumed compute would always be available?