"The card wasn't stolen. The merchant didn't make a mistake. The agent did exactly what it was told to do. But the customer still says, 'I didn't want that.' That is a very different situation."
That is Monica Eaton, founder and CEO of Chargebacks911, describing what she believes is an entirely new category of payment dispute. Not fraud. Not merchant error. Not buyer's remorse. Something the industry has never had to adjudicate before: a transaction that was technically authorised, executed correctly by the agent, but that the consumer either does not remember approving, does not recognise on their statement, or simply did not expect.
The agentic commerce ecosystem has spent the last 18 months in a protocol land grab. Mastercard shipped Verifiable Intent. Visa launched Trusted Agent Protocol. Google announced Universal Commerce Protocol. OpenAI and Stripe co-developed the Agentic Commerce Protocol. Every one of these initiatives solves for the transaction. Authentication, tokenisation, discovery, checkout. The architecture for how AI agents buy things is taking shape at remarkable speed.
The architecture for what happens when those purchases go wrong barely exists.
The agentic commerce ecosystem is building the rails. It has not rebuilt the courtroom.
The Numbers That Should Worry You
Before agentic commerce adds a single transaction to the system, the dispute infrastructure is already under severe strain.
Ethoca, a Mastercard company, projects global chargeback volume will reach 337 million by 2026, a 42 percent increase from 238 million in 2023. A separate Datos Insights analysis commissioned by Ethoca puts volume at 324 million by 2028, growing at a compound annual rate of seven percent. In dollar terms, annual chargeback losses are expected to rise from roughly $33.8 billion in 2025 to $41.7 billion by 2028.
The United States is bearing the worst of it. US chargebacks are expected to hit $15.3 billion by 2026, more than doubling from $7.2 billion in 2019, according to Ethoca's trend data. Global card-not-present fraud losses are projected to reach $28.1 billion by 2026, a 40 percent increase from 2023.
Here is the number that matters most for what comes next: friendly fraud, where a consumer disputes a legitimate transaction rather than seeking a refund, already accounts for an estimated 75 percent of all chargeback disputes. First-party fraud is now the leading fraud type globally, representing 36 percent of all reported fraud in 2024, up from 15 percent a year earlier, according to Chargeflow's analysis of industry data.
This is the system agentic commerce is about to stress-test. And the stress test has not even started yet.
Every dollar lost to fraud now costs US merchants $4.61, a 37 percent increase over five years, according to LexisNexis Risk Solutions. Merchants report that all categories of chargebacks have increased over the past 12 months. And 84 percent of consumers say they prefer filing a chargeback to requesting a refund directly, often because they do not know the difference between the two, according to Ethoca research.
That last statistic is the one that should keep payments leaders awake. If 84 percent of consumers already default to disputes over direct refunds when a human made the purchase, imagine the reflex when an AI agent they barely remember authorising buys something they did not expect.
Why Agentic Transactions Break the Dispute Model
The chargeback system was designed in a world where three things were true: a human initiated the transaction, there was a clear moment of intent, and the authorisation was a direct action. A PIN, a signature, a biometric, a tap.
Agentic commerce breaks all three.
The entity initiating the transaction is not the human. The intent may have been expressed hours or days before the purchase, in a context the consumer may not specifically remember. And the authorisation is not a direct action but a delegation, a set of instructions that the agent interprets and acts upon.
Chargebacks911 has identified what amounts to a fourth category of dispute that sits outside the traditional framework of fraud, merchant error, and buyer's remorse. The purchase is technically authorised, but the result does not match the customer's expectations. An agent might renew a subscription automatically, choose a cheaper alternative brand, book travel that fits the calendar but not the customer's preferences, or reorder products that are no longer needed. From the system's point of view, everything looks correct.
The scenarios multiply quickly. Chargeflow's analysis of emerging dispute patterns identifies several distinct failure modes. Purchases the customer cannot recall, where the agent reorders based on past behaviour but the human never consciously approved the specific transaction. Delegated mistakes, where the agent selects a different product, merchant, or quantity than expected because it optimised based on data rather than context. Overlapping orders, where multiple agents responding to the same household request place duplicate purchases. And shared account ambiguity, where agents connected to multi-user accounts act on unclear signals about who authorised what.
At a ChargebackX panel, Jamie George, VP of Account Management and Partnerships at Ravelin, put the liability question bluntly: the card schemes are not going to absorb it, the customers and their issuers will not accept it, and the AI model has not technically earned any revenue from the transaction. That does not leave many options. "It's going to be the merchant," George said.
The core problem is evidentiary. As Checkout.com noted in its analysis of agentic dispute readiness, AI-initiated transactions undermine the traditional evidence sources that merchants rely on to fight chargebacks. IP geolocation, device fingerprints, browser data, session logs. None of these exist in a meaningful form when the buyer is a software agent running on a cloud server. The evidence trail that has underpinned chargeback representment for decades simply does not apply.
The Liability Black Hole
Nobody has answered the fundamental legal question: when an AI agent makes a purchase that goes wrong, who is responsible?
The liability could fall on the consumer who delegated authority, the AI provider that built the agent, the merchant that accepted the transaction, the platform that facilitated it, or the payment network that processed it. As of early 2026, no jurisdiction has enacted regulation specifically addressing this question.
The Consumer Bankers Association convened a Symposium on Agentic AI Payments in late 2025 that examined this gap in detail. The Electronic Fund Transfer Act, the foundational US law governing electronic payments, was written in the late 1970s. Its framework assumes a human initiator. Whether existing exemptions for pre-authorised transfers apply to AI agent delegation is, at best, unclear. The CBA's white paper warned that if agentic payments scale substantially, the volume of consumer disputes could overwhelm existing systems, particularly if agents make systematic errors that generate widespread disputes simultaneously.
McKinsey has framed this as the "third actor problem," a term that captures the structural challenge neatly. Consumer protection law was built for a two-party model: buyer and seller. When a non-human entity sits between them, initiating transactions that existing law was never designed to accommodate, the liability framework has no clear answer, according to European Business Magazine's analysis.
In Europe, the situation is no clearer despite more regulatory infrastructure. The EU AI Act, which becomes fully applicable for high-risk systems by August 2026, classifies many agentic systems as high-risk if they influence financial decisions or handle sensitive data. But as Edgar, Dunn & Company noted, the Act contains no specific provisions for autonomous purchasing agents. It mandates human oversight but does not specify whether that oversight must be real-time at the moment of purchase. It requires transparency and audit trails but does not address who bears liability when a fully compliant agent still makes a purchase the consumer disputes.
The AI Liability Directive, which was intended to clarify non-contractual liability for AI-caused damage, may not survive. The European Parliament's Internal Market and Consumer Protection Committee called its adoption "premature and unnecessary" and recommended rejection. PSD3, expected to address delegated payment initiation and harmonise liability for AI-initiated transactions, is still being negotiated with key details unresolved.
The technology is processing live transactions on production rails. The legal framework to adjudicate disputes from those transactions is years behind.
The Protocol Land Grab and the Missing Layer
To understand why the dispute problem remains unsolved, look at what the major agentic commerce protocols actually do, and what they do not do.
As we explored in our analysis of Mastercard's Verifiable Intent, the protocol landscape is forming around three complementary but distinct layers. Google's Universal Commerce Protocol covers discovery, consideration, purchase, and post-purchase management. OpenAI's Agentic Commerce Protocol, built with Stripe, handles checkout and payment delegation. Mastercard's Verifiable Intent creates cryptographic records linking consumer instructions to transaction outcomes.
Of the three, Verifiable Intent comes closest to solving the dispute problem. It captures consumer intent at the point of delegation, creates a cryptographic binding between what was authorised and what was executed, and uses Selective Disclosure to share only the minimum information each party needs for adjudication. If a consumer tells their agent to book a flight under $500 and the agent books a $750 flight, the mismatch is visible and verifiable.
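To make the mechanics concrete, here is a minimal sketch of how an intent-binding check might work. The record format, field names, and HMAC signing scheme are illustrative assumptions for this article, not Verifiable Intent's actual specification, which uses its own cryptographic formats.

```python
import hashlib
import hmac
import json


def sign_intent(key: bytes, intent: dict) -> str:
    """Bind the consumer's delegation instructions to a tamper-evident signature."""
    payload = json.dumps(intent, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def check_execution(key: bytes, intent: dict, signature: str, executed: dict) -> list[str]:
    """Return a list of verifiable mismatches between the mandate and the outcome."""
    issues = []
    expected = sign_intent(key, intent)
    if not hmac.compare_digest(expected, signature):
        issues.append("intent record altered after delegation")
    if executed["amount"] > intent["max_amount"]:
        issues.append(f"amount {executed['amount']} exceeds mandate cap {intent['max_amount']}")
    if executed["category"] not in intent["allowed_categories"]:
        issues.append(f"category {executed['category']} outside mandate")
    return issues


# The flight example from the text: the consumer caps spend at $500,
# the agent books a $750 flight. (Key distribution is hypothetical.)
key = b"issuer-shared-secret"
intent = {
    "instruction": "book a flight under $500",
    "max_amount": 500,
    "allowed_categories": ["air_travel"],
}
sig = sign_intent(key, intent)
executed = {"amount": 750, "category": "air_travel"}
print(check_execution(key, intent, sig, executed))
# reports that the executed amount exceeds the mandate cap
```

The point of the sketch is the property the protocol promises: the mismatch between what was authorised and what was executed is mechanically checkable by any party holding the intent record, rather than a matter of he-said-she-said testimony.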
Visa's Trusted Agent Protocol solves a different but equally essential problem: verifying that the agent initiating a transaction is legitimate. It answers "is this a real agent?" but not "did this real agent follow the human's instructions?"
Google's Agent Payments Protocol (AP2) provides mandate frameworks and smart contracts for sharing purchase instructions, giving merchants a way to demonstrate that customers authorised specific transactions.
But here is the gap that nobody is closing fast enough: none of these protocols have been tested in a live chargeback adjudication. The specifications are elegant. Whether they hold up when a consumer files a dispute with their issuer, a chargeback analyst pulls up the case, and the merchant needs to submit representment evidence is an entirely different question.
The card networks' existing reason code structures were not designed for agent-initiated transactions. There is no reason code for "the agent operated within its mandate but the consumer's expectations differed from their instructions." There is no evidence standard for "cryptographic intent record." There is no established workflow for an acquirer to pull a Verifiable Intent object and submit it as compelling evidence in a chargeback response.
The protocols are competing on transaction enablement. The dispute layer is the gap in the stack.
What the Merchant Acquiring World Needs to Build
For anyone who has worked in merchant acquiring, and we should disclose that our perspective is informed by time spent at Fiserv, the gap between protocol specification and operational readiness is familiar territory. Standards bodies produce specifications. The acquiring ecosystem turns those specifications into workflows that merchants and their processors actually use. That translation layer is where agentic dispute readiness will succeed or fail.
Today, when a merchant receives a chargeback, the representment process relies on a well-established evidence chain: proof of delivery, authentication records, device data, IP logs, customer communication history, and transaction metadata. Every piece of that evidence assumes a human buyer. The merchant's chargeback team knows what to collect, how to format it, and what the issuer's adjudication team expects to see.
Agent-initiated transactions require a fundamentally different evidence framework. Merchants will need to collect and present intent records, showing what the consumer delegated to the agent. They will need mandate logs that demonstrate the constraints the consumer set: spending limits, merchant categories, time windows, product preferences. They will need agent identifiers that prove which specific agent instance executed the transaction. And they will need execution traces that map the agent's decision path from instruction to purchase.
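The evidence bundle described above can be sketched as a data structure. The field names and schema are hypothetical; no card network has published a representment format for agent-initiated disputes yet.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class IntentRecord:
    """What the consumer delegated to the agent."""
    instruction: str
    delegated_at: str  # ISO 8601 timestamp


@dataclass
class MandateLog:
    """Constraints the consumer set on the delegation."""
    spend_limit: float
    merchant_categories: list[str]
    valid_until: str


@dataclass
class ExecutionStep:
    """One step in the agent's decision path from instruction to purchase."""
    action: str
    detail: str


@dataclass
class AgentDisputeEvidence:
    """A representment package for an agent-initiated transaction (hypothetical schema)."""
    transaction_id: str
    agent_id: str  # which specific agent instance executed the purchase
    intent: IntentRecord
    mandate: MandateLog
    trace: list[ExecutionStep] = field(default_factory=list)

    def to_representment_json(self) -> str:
        """Serialise the package for submission through an acquirer's dispute workflow."""
        return json.dumps(asdict(self), indent=2)


evidence = AgentDisputeEvidence(
    transaction_id="txn_001",
    agent_id="agent-instance-7f3a",
    intent=IntentRecord("reorder household staples monthly", "2026-02-01T09:00:00Z"),
    mandate=MandateLog(
        spend_limit=150.0,
        merchant_categories=["grocery"],
        valid_until="2026-12-31T00:00:00Z",
    ),
    trace=[
        ExecutionStep("search", "matched prior order history"),
        ExecutionStep("checkout", "total 92.40, within 150.00 spend limit"),
    ],
)
print(evidence.to_representment_json())
```

Note what is absent: no IP address, no device fingerprint, no session log. Every field answers a question about delegation rather than about a human's browsing session, which is exactly the shift merchants' chargeback teams will have to absorb.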
As Checkout.com argued, early adoption of protocols like Trusted Agent Protocol and AP2 gives merchants the traceability they need to handle agent-led disputes with confidence. These protocols replace signals like IP geolocation and browser fingerprints with verifiable records of intent, identity, and execution.
But the infrastructure to process these new evidence types does not yet exist in most acquiring stacks. Fiserv's integration with Mastercard Agent Pay means agentic transactions will flow through the same authorisation and settlement systems millions of merchants already use, including those embedded in Clover. That is good news for adoption. It means merchants will not need to deploy entirely new payment infrastructure to accept agent-initiated transactions.
The bad news is that the dispute management layer sitting behind those same systems was built for human commerce. Chargeback management platforms, whether in-house or outsourced to specialists like Ethoca, Chargebacks911, or Justt, will need to ingest, parse, and present entirely new data types. The card networks will need to create or adapt reason codes for agent-initiated disputes. And merchant monitoring programmes, including Visa's new VAMP framework, will need to account for the fact that agentic transactions may generate dispute patterns that look very different from traditional commerce.
The window for building this is narrow. As Shahar Tal, CTO of Justt, observed at ChargebackX: the volume of agentic transactions today is a fraction of a percent. But the trajectory suggests rapid acceleration. If the dispute infrastructure is not ready when volumes scale, merchants will absorb the losses.
The Window Is Closing
Bain & Company estimates the US agentic commerce market could reach $300 billion to $500 billion by 2030, representing 15 to 25 percent of total online retail sales. Morgan Stanley projects a range of $190 billion to $385 billion over the same period, with roughly 23 percent of Americans already having made a purchase via AI in the past month.
Those projections assume the trust problem gets solved. So far, it has not been. Bain's consumer research found that only 24 percent of US consumers feel comfortable using AI to complete purchases. Just 10 percent have actually bought something through an AI agent, and those purchases were overwhelmingly small-ticket items: groceries, household goods, categories where the downside of a mistake is negligible. Around half of consumers say they are cautious about letting AI handle an end-to-end transaction without their involvement.
Security and privacy top the list of concerns. Those are not irrational fears. They are a rational response to the absence of clear accountability when something goes wrong.
The dispute crisis is not a future problem. It is a present gap with a known escalation path. Chargeback volumes are already growing at double-digit rates annually. The agentic commerce protocols are moving from specification to live production. The first wave of AI-initiated disputes will arrive well before the regulatory frameworks, reason codes, evidence standards, and adjudication workflows are ready to handle them.
The industry has a 12 to 18 month window. The protocols exist to solve this. Mastercard's Verifiable Intent, Visa's Trusted Agent Protocol, Google's AP2, and the emerging standards from the FIDO Alliance, EMVCo, and W3C collectively provide the raw materials for a dispute-ready trust layer. But specifications do not resolve chargebacks. Operational workflows do. And those workflows need to be built, tested, and deployed into the acquiring infrastructure that processes the vast majority of global commerce.
Datos Insights predicts chargeback volume will climb 24 percent from 2025 to 2028 on existing trends alone. Add a new category of disputes from agent-initiated transactions, where the evidence framework is undefined and the liability is unresolved, and the trajectory steepens considerably.
As Sift CMO Armen Najarian told American Banker: "It's going to be messy for the next five years."
He may be optimistic.
When the first wave of agentic commerce disputes hits chargeback departments at scale, will the industry have an answer, or will merchants be left holding a receipt for a transaction nobody can prove was authorised?