Two weeks ago, your AI agent bought the wrong hotel. Not the wrong city. The right city, the right dates, a reasonable price. But it was a smoking room on the ground floor next to the car park, and you'd told the agent you wanted something quiet with a view. The agent optimised for price. You meant preference.

Who do you call? Not the hotel. The agent booked it. Not your bank. The transaction was authorised. Not the AI provider. They don't handle commerce disputes. Not the card network. Their reason codes don't have a category for "my AI misunderstood what I wanted."

You are stuck. And so is everyone else.

When we published The Agentic Commerce Dispute Crisis Nobody Is Preparing For on March 11, we identified a 12 to 18 month window before agent-initiated disputes overwhelm existing chargeback infrastructure. Since then, three more protocol layers have shipped. Stripe launched the Machine Payments Protocol. MoonPay open-sourced the Open Wallet Standard for AI agent wallets. Gap became the first major fashion retailer to enable checkout inside Google Gemini.

The stack is filling in fast. The dispute layer is still empty.

The Stack as It Stands

Map every layer of the agentic commerce stack in March 2026 and a pattern emerges. Every function required to get an AI agent from product discovery to completed payment has at least one production protocol. Most have two or three.

Discovery and Intent. OpenAI's Agentic Commerce Protocol (ACP), built with Stripe, is live in ChatGPT with over one million Shopify merchants accessible. Google's Universal Commerce Protocol (UCP) has 20+ launch partners including Visa, Adyen, and Target. Both are open source. Both are in production. As we covered in our Q1 2026 stack analysis, the discovery layer is the most contested and the most mature.

Trust and Identity. Visa's Trusted Agent Protocol (TAP) uses cryptographic signatures to verify agent legitimacy at the network edge, built with Cloudflare and Akamai. Mastercard's Agent Pay framework provides governance for AI agent transactions through tokenisation, strong authentication, and fraud prevention. Vouched launched Agent Checkpoint, a Know Your Agent (KYA) platform with real-time agent detection and permissioning. The trust layer is covered from multiple angles: is this agent real, is this agent allowed, is this agent who it claims to be.

Payment Protocols. Coinbase's x402 embeds stablecoin settlement into HTTP itself. Stripe's Machine Payments Protocol settles on stablecoins, cards, BNPL, and bitcoin. Google's Agent Payments Protocol (AP2) bridges both approaches. The settlement layer war is live, and both sides have real coalitions.

Wallets. MoonPay's Open Wallet Standard gives AI agents the ability to hold assets, sign transactions, and pay across eight chain families from a single seed phrase. Keys are encrypted at rest and wiped after signing. PayPal, Ripple, Circle, and the Solana Foundation contributed.

Pre-transaction Authorisation. Mastercard's Verifiable Intent creates cryptographic records binding consumer instructions to transaction outcomes. Independent researchers are exploring pre-transaction verification layers that would give agents a structured way to confirm authorisation before executing. The concept is early but the architectural thinking exists.

Merchant Integration. Fiserv adopted both Visa TAP and Mastercard Agent Pay simultaneously. Shopify is integrated with ACP. Adyen supports UCP. Merchants don't need to choose sides. The processors are doing that work.

Now look at what comes after the transaction.

Post-transaction Dispute Resolution. Nothing.

The agentic commerce stack has a protocol for every phase of a transaction except the one that determines what happens when a transaction goes wrong.

Not a single protocol. Not a working group. Not a published specification. Not even a sandbox demo. The most critical function in payment infrastructure, the ability to resolve disputes fairly and at speed, has zero coverage in the agentic commerce stack.

Seven Things a Dispute Resolution Layer Would Need to Do

We have spent two months mapping the gap between agent-initiated transactions and existing chargeback infrastructure. We've talked to dispute management companies, studied the card network frameworks, and reviewed the emerging protocol landscape. The requirements for an agentic dispute resolution layer are becoming clear, even if nobody has started building one.

Handle evidence that doesn't exist in today's chargeback system. Current representment relies on IP geolocation, device fingerprints, browser data, session logs. None of that exists when the buyer is a software agent running on a cloud server. An agent dispute layer would need to ingest delegation mandates, reasoning traces, intent declarations, and agent identifiers. These are evidence types that no card network adjudication system is built to process.

Operate at agent speed. Card network chargebacks take 30 to 120 days. Agent transactions will happen in milliseconds. A dispute layer built for agent commerce can't run on human timelines. If an agent makes a cascading set of purchases based on a flawed assumption, every subsequent transaction compounds the error. Resolution needs to happen before the cascade, not months after.

Work across every payment rail. Agents will pay on cards, ACH, RTP, stablecoins, wire, and whatever comes next. As we identified in our analysis of the trust gap, the fragmentation of payment rails in agentic commerce is already a problem. A dispute layer that only works on card rails is a dispute layer that covers a shrinking share of agent transactions.

Define new dispute categories. Existing card network reason codes were designed for human commerce. There is no reason code for scope drift, where an agent acts within its technical permissions but outside the consumer's intent. No code for intent mismatch, where the agent's interpretation of an instruction differs from what the human meant. No code for cascading errors, stale mandates, or cross-agent conflicts where two agents acting for the same household place duplicate orders. These are new failure modes that need new categories.

Enable machine-to-machine resolution before human escalation. Most agent-initiated disputes will be resolvable without human involvement, if the infrastructure exists to let the agents negotiate. An agent that booked the wrong hotel and the merchant's agent could, in theory, resolve the issue in seconds. Cancel, rebook, adjust. But that requires a structured protocol for agents to raise, negotiate, and settle disputes between themselves. Today, the only path is a human filing a chargeback.

Create immutable records for regulators and auditors. As we covered in our analysis of the NVIDIA security gap, regulators are watching. The GAO has confirmed that no federal banking regulator has issued standalone guidance on agent-initiated transactions. When that guidance arrives, it will require audit trails. A dispute resolution layer needs to produce records that regulators can inspect, auditors can verify, and courts can admit.

Assign liability when no existing framework says who pays. This is the hardest requirement. When an agent makes a purchase that goes wrong, liability could fall on the consumer, the AI provider, the merchant, the platform, or the payment network. No jurisdiction has answered this question. A dispute layer needs to operate in that vacuum, providing a framework for liability assignment even before regulation catches up. Credit unions are already facing this with fewer resources than banks.

Why Nobody Has Built It

The requirements are clear. The gap is documented. So why isn't anyone building the dispute resolution layer?

Follow the incentives.

Card networks profit from the existing chargeback system. Interchange fees generate revenue on every transaction. Chargeback fees generate revenue on every dispute. The current system, slow and adversarial as it is, works financially for Visa and Mastercard. Rebuilding it for agent commerce is expensive, and the agent transaction volume that would justify the investment doesn't exist yet.

Payment processors are focused on enablement. Stripe, Adyen, Fiserv, and every other processor in the stack are competing to be the rails agents pay through. That's where the revenue is right now. Getting agents to transact. Not adjudicating what happens when those transactions go wrong. Dispute management is a cost centre. Enablement is a revenue driver. The roadmap reflects that.

Agent platforms consider disputes a payment infrastructure problem. OpenAI, Anthropic, and Google are building agent capabilities, not payment adjudication. When ChatGPT's agent buys the wrong product, OpenAI's position is that the payment system should handle the dispute, the same way Visa handles disputes when you buy something through a browser. Reasonable, except the payment system was built for browsers and humans, not agents and mandates.

Merchants are focused on conversion. Getting agents to complete purchases is the priority. Handling disputes from those purchases is tomorrow's problem.

Nobody's incentive is aligned with building the layer that sits between all of them. The dispute resolution layer is an infrastructure commons, and commons don't get built when every participant assumes someone else will do it.

Every Payment System in History Built This

The absence is genuinely unusual. Look at the precedent.

The card networks built dispute resolution into the system from the start. The Fair Credit Billing Act of 1974 mandated the creation of the chargeback process. Before a single modern credit card transaction was processed at scale, the legal infrastructure for resolving disputes was already in place. The reason code system, the representment process, the liability frameworks. All of it was designed into the architecture, not bolted on after the fact.

ACH has return codes. R10 for unauthorised transactions. R05 for consumer account errors. R29 for corporate entry refusals. Nearly 70 distinct return codes, each mapping a specific failure mode to a specific resolution process. NACHA didn't launch the ACH network and then figure out what to do when payments went wrong. The return infrastructure was part of the specification.

Wire transfers have recall processes. SWIFT gpi includes stop and recall services, case management, and investigation procedures. When a wire goes wrong, there is a documented, structured process for every participant in the chain.

Even cryptocurrency, the most anarchic corner of financial infrastructure, is building arbitration. Kleros operates a decentralised dispute resolution protocol with crowdsourced jurors, escrow mechanisms, and structured evidence submission. Kleros 2.0 launched in 2026 with enhanced arbitration for DeFi, NFTs, and e-commerce.

Agentic commerce is the first payment infrastructure being built without a dispute resolution layer. Not because the architects decided it wasn't needed. Because nobody's job description includes building it.

What Happens If Nobody Builds It

The scenario is straightforward and it's getting closer.

Agent transaction volumes scale. Bain projects the US agentic commerce market at $300 billion to $500 billion by 2030. As those volumes grow, disputes hit card networks. An agent books the wrong flight. An agent reorders a product the consumer no longer needs. An agent optimises for price when the consumer meant quality. The consumer calls their bank.

The issuer opens a case. The reason codes don't fit. "Merchandise not as described" is the closest option, but the merchandise was exactly as described. The agent just chose wrong. The evidence is incompatible. No device fingerprint. No IP address. No browser session. Just a delegation mandate and a reasoning trace that the chargeback analyst has never seen before and has no framework to evaluate.

The issuer can't adjudicate. Two outcomes follow, and both are bad.

Outcome one: blanket liability assignment to merchants. The networks default to the existing pattern where merchants bear the burden of proof. Merchants who can't produce traditional evidence lose every agent-initiated dispute. Chargeback ratios spike. Merchants start declining agent transactions entirely. The VAMP framework penalises merchants who exceed dispute thresholds. Agentic commerce adoption stalls because merchants won't accept the risk.

Outcome two: blanket rejection of agent disputes. Issuers decide that agent-initiated transactions are pre-authorised and therefore not disputable. Consumers who genuinely got the wrong product through no fault of their own have no recourse. Consumer advocacy groups notice. Regulators intervene. The Consumer Financial Protection Bureau or its equivalent in the EU writes rules that may not understand the technology well enough to regulate it effectively.

Either outcome kills agentic commerce adoption. The 12 to 18 month window we identified on March 11 is already narrowing. Three more protocol layers have shipped. The dispute layer hasn't moved.

We're watching an industry build a car at remarkable speed. Engine, transmission, steering, brakes. Everything except the seatbelts. And the people building the engine assume the seatbelt team is working on it somewhere. They're not. There is no seatbelt team.

Sources

The dispute resolution layer will get built. The question is whether it gets built by the card networks who control the existing system, the agent platforms who control the transaction flow, the payment processors who sit in the middle, or someone entirely new who sees the gap before the incumbents do. Who builds the courtroom for agentic commerce?

Reply

Avatar

or to participate

Keep Reading