Project Deal, GPT-5.5, $40B Google: One Story, Three Names

Anthropic ran a 69-employee marketplace where AI agents traded real goods. Stronger models won and the losers never noticed. OpenAI shipped GPT-5.5 at double the API price. Google committed up to $40 billion to Anthropic. These dropped in 72 hours. They are the same story.

In the last three days of last week, three of the most consequential AI announcements of the quarter landed inside 72 hours.

On April 23, OpenAI shipped GPT-5.5, an agentic model priced at $5 per million input tokens and $30 per million output tokens. That is exactly double GPT-5.4. On April 25, Anthropic published the results of Project Deal, a one-week internal experiment where 69 employees let AI agents trade real goods on their behalf, with $100 budgets and 186 closed deals worth over $4,000. The same week, Google committed up to $40 billion to Anthropic, $10 billion immediately at the current $380 billion valuation, $30 billion contingent on performance targets.

Three different headlines. One thesis. Agents transacting on behalf of humans is no longer hypothetical, the cost of agent-grade intelligence is doubling, and the capital is flowing toward the lab that just published the most concrete agent-on-agent commerce data anyone has produced.

❝

If agents are going to transact for us, the model behind your agent is the new credit score. It decides what you pay, what you get, and whether you ever notice the difference.

Project Deal: The Result the Industry Has Been Waiting For

Anthropic's experiment is the data point we have been missing.

Sixty-nine self-selected San Francisco employees received $100 each via gift card. AI agents handled all negotiation in Slack. Humans only stepped in for the final exchange. Anthropic ran four marketplaces in parallel: two staffed entirely by Claude Opus 4.5 agents, and two mixed marketplaces where employees had a 50 percent chance of being represented by Claude Haiku 4.5, the smaller and cheaper model.

The numbers are sharp.

In the all-Opus marketplace, agents closed 186 deals totaling more than $4,000 in volume. In the mixed marketplace, the gap between strong-model and weak-model employees was concrete. Opus sellers averaged $2.68 more per item than Haiku sellers. Opus buyers paid $2.45 less than Haiku buyers on identical items. A lab-grown ruby went for $65 with an Opus seller, $35 with a Haiku one. A broken bike fetched $65 versus $38. Across the experiment, Opus users closed roughly two more deals on average.

Then the part Anthropic flagged.

Haiku users rated the fairness of their experience identically to Opus users. 4.06 versus 4.05 on a seven-point scale. They had received objectively worse prices. They did not notice. Of the 28 participants who used both models, 17 preferred Opus, 11 preferred Haiku.

Anthropic's framing for the result is "invisible inequality." Their own statement: "Policy and legal frameworks around AI models that transact on our behalf simply don't exist yet."

That is the cleanest articulation of the agentic commerce problem we have seen from any frontier lab. We covered the trust gap in agentic commerce and the agent tax earlier this year as theoretical risks. Project Deal made them measurable.

GPT-5.5 at Double the Price Is the Cost Curve Everyone Knew Was Coming

OpenAI did not bury the lead. GPT-5.5 is more capable. It is also twice as expensive.

The performance gains are real. Terminal-Bench 2.0 jumps from 75.1 percent to 82.7 percent. FrontierMath Tier 4 climbs from 27.1 percent to 35.4 percent. The model is built to switch between tools autonomously and complete extended workflows, which is the agentic shape the entire industry has converged on. OpenAI's quote: "Agents built with GPT-5.5 can plan, gather context, call tools, recover from ambiguity, and complete longer workflows with less guidance."

The pricing is the part that matters for anyone running production agents.

GPT-5.5 Pro lands at $30 per million input tokens, $180 per million output tokens. That is six times the GPT-5.4 input rate and twelve times the output rate. The standard GPT-5.5 sits at $5 input and $30 output, exactly 2x GPT-5.4. The two-tier pricing is not accidental. It is OpenAI signaling that agentic intelligence is a different product category from chat intelligence, and it is going to be priced as such.

Three implications for builders.

The agent tax just got more expensive. A workflow that calls GPT-5.5 across multi-step tool use costs materially more than the same workflow on GPT-5.4. If your application's unit economics were marginal at GPT-5.4 prices, GPT-5.5 economics may not work. Either the application gets repriced or it stays on the older model.

Tier shopping becomes a real engineering discipline. A serious agentic deployment now needs to route between GPT-5.4, GPT-5.5, GPT-5.5 Pro, plus Claude Opus, Claude Haiku, Gemini 3.1, and increasingly some open-source local model for low-stakes calls. The orchestration layer is the new strategic surface, which is exactly the bet that Google's Gemini Enterprise platform is making.

The benchmark wins are not uniform. GPT-5.5 leads on programming and math benchmarks but loses to Claude Opus 4.7 on SWE-Bench Pro (64.3 vs 58.6) and trails both Claude and Gemini on tool-use benchmarks. For agentic workflows specifically, OpenAI's lead is narrower than the price increase suggests.

$40 Billion Is the Capital Backing the Bet

Google committing up to $40 billion to Anthropic, with $10 billion landing immediately at the $380 billion valuation, is the largest single funding signal in the agentic AI market. Anthropic now has commitments from Google ($40B), Amazon (up to $25B announced days earlier), and a 5 gigawatt compute capacity agreement with Google and Broadcom. The company reports over $30 billion in annual recurring revenue and an IPO expected in the coming months.

Read the deal alongside the Eric Boyd hire we covered last week in Microsoft losing talent and absorbing GitHub. Boyd ran Microsoft's AI platform business through the Azure OpenAI rise. He took the infrastructure chair at Anthropic. The capital and the operator are now in the same place at the same time. That has not been true for any frontier lab outside OpenAI before.

The Google framing is the part to read carefully. Google has Gemini 3.1 Pro. It does not need Anthropic's models for capability. What it needs is the enterprise distribution that Anthropic is building, and the option value of being inside the lab that just published the most rigorous agent-on-agent commerce data anyone has produced. $40 billion buys both.

For Microsoft and OpenAI, the read is sharper. The Microsoft-OpenAI deal was the original of this category. It used to be the obvious pole of capital concentration in AI. As of last week, it is one of two. Anthropic is no longer the scrappy second lab. It is the lab that just took $65 billion of cumulative commitment from the two largest cloud providers and is publishing the most concrete agentic commerce evidence at the same time.

The Single Story Underneath

Read together, the three events describe a single transition.

The Project Deal experiment is the demonstration that agent-on-agent commerce works. Not as a research paper. As a one-week internal pilot with real money and real outcomes. The result also demonstrates that the model class behind the agent matters in ways the agent's own user does not perceive. That is a structural insight, not a curiosity.

GPT-5.5 is the supply-side response. Frontier labs are pricing agentic intelligence as a premium tier because the demonstration is now in. Agents are economically valuable. The capability behind the agent is differentially valuable. The lab that supplies the better capability captures a price premium. OpenAI moved first on the explicit two-tier price structure. Anthropic and Google will follow within the quarter.

The $40 billion is the capital validating the supply-side bet. Google committed at the $380 billion valuation because the Project Deal-style evidence makes the upside case credible to a level it was not 90 days ago. Capital is not flowing on narrative anymore. It is flowing on early commerce evidence.

❝

The agentic economy stopped being an analyst slide last week. It became a benchmark, a price tier, and a $40 billion capital commitment, in three days.

What This Means for Commerce and Payments

Two specific exposures for our readers.

Merchant pricing in agentic commerce will not be uniform. If buyer-side agents close systematically different deals based on the model behind them, every merchant catalog needs to think about price discrimination in a new shape. Today, dynamic pricing responds to demand signals, geography, time of day. Tomorrow it will respond to which agent is asking and what evidence the merchant has about that agent's negotiation power. Visa's Trusted Agent Protocol becomes more important if the merchant needs cryptographic evidence of which agent class they are quoting.

The tier-shopping problem extends to payments orchestration. A bank or processor running fraud detection, real-time scoring, or customer-facing agents on GPT-5.5 will pay more than one running on GPT-5.4. The question is whether the additional capability justifies the additional inference cost on a per-transaction basis. For high-value transactions, yes. For everything else, the math gets close.

What To Watch

Four signals over the next eight weeks.

First, whether Anthropic ships an externally usable version of Project Deal. The internal experiment is a paper. An external sandbox where merchants and platforms can test agent-on-agent commerce against Claude Opus 4.7 would be the productization. That is the move that locks Anthropic in as the default agentic commerce experimentation environment.

Second, whether OpenAI introduces a similar experimental framework, or whether it focuses purely on capability and price. The competitive question is whether OpenAI sees agent-on-agent transactional research as something it needs to lead, or whether it cedes that to Anthropic and competes purely on the model.

Third, the Google integration cadence. Google has $40 billion committed and a 5 gigawatt compute deal in flight. The first product evidence of that integration, which probably arrives in a Cloud Next side launch this summer, tells us whether Anthropic stays a separate lab or starts to look more like a Google business unit.

Fourth, regulatory response. Anthropic's framing of "invisible inequality" is going to get picked up by the FTC and the European Commission. Both bodies have been building cases on algorithmic price discrimination for years. Project Deal hands them a measurable example. Watch for opening statements within 90 days.

Sources

If the model behind your agent is the new credit score, who decides what class of agent you get to use, and what happens to the people whose agents are quietly losing them money they never see?

Charlie Major is a Product Development Manager at Mastercard. The views and opinions expressed in Major Matters are his own and do not represent those of Mastercard.