AI Agents Shopping: The Security Gap Nobody Fixed

The gap between consumer trust and infrastructure readiness has never been wider.

70 percent of consumers are ready to let AI agents handle their shopping. A CVSS 10.0 vulnerability shows why the security model is not ready to let them.

New research from PYMNTS Intelligence paints a striking picture of where consumer behaviour is heading. In a survey of 2,299 U.S. adults, 70 percent said they are interested in using AI agents for shopping. Not browsing. Not getting recommendations. Shopping: finding products, comparing prices, and completing purchases on their behalf.

The same week that research dropped, security firm LayerX disclosed a vulnerability in Anthropic's Claude Desktop Extensions that scored a perfect 10.0 on the CVSS severity scale. The exploit requires nothing more than a manipulated Google Calendar entry. One event. Zero clicks. Full system compromise.

These two stories are being covered in completely different corners of the internet. The consumer data is making the rounds in fintech and commerce circles. The security disclosure is circulating among cybersecurity researchers. Nobody is connecting them. They should be, because together they define the central tension of agentic commerce: the agents that are useful enough to shop for you are also powerful enough to be weaponised against you.

❝

The demand for autonomous commerce is real. The infrastructure is being built fast. But the security model is an afterthought, and consumers will pay the price if that does not change.

The Demand Signal

The PYMNTS Intelligence report, "From Assistive to Agentic AI: Consumers Wade Into Autonomous Commerce," is the most comprehensive look at consumer appetite for AI-powered purchasing we have seen. The study assessed adoption across 54 personal-use cases spanning nine areas of daily life, from shopping and finance to health and travel.

The headline number, 70 percent consumer interest, tells only part of the story. Dig into the segments and the picture sharpens. 71 percent are interested in agentic AI for health and wellness management. 70 percent for travel planning. 69 percent for grocery shopping, subscription management, and meal planning. 66 percent for bill management.

These are not hypothetical use cases. They are recurring, transaction-heavy activities where consumers are already spending money regularly.

The most revealing finding is this: 49 percent of interested consumers would allow an AI agent to complete both routine purchases and larger, research-driven purchases. That is not a cautious "let me try it with a coffee order" adoption curve. That is nearly half of interested consumers saying they would trust an agent with considered purchases: appliances, electronics, travel bookings.

But consumers are not naive. The same research shows they are drawing clear boundaries. They want autonomy that is bounded and interruptible, with the ability to approve actions before they occur, undo decisions, and invoke human override at any point.

Among non-regular AI tool users, just 3 percent trust generative AI platforms as agentic assistant providers. Banks, digital wallets, and card networks ranked highest. The trust sits with financial institutions, not AI companies.

The supply side is moving just as fast. Perplexity launched "Buy with Pro" for agentic shopping. Opera built Neon, a browser where the AI can shop, book, and transact in a live session. Google and Shopify announced the Universal Commerce Protocol, an open standard for agent-to-merchant transactions.

Target built a ChatGPT shopping app and became the first retailer to run ads inside AI-generated conversations. According to a WP Engine website traffic report, nearly one in three website visits now comes from bots and AI agents rather than human browsers.

The infrastructure for agentic commerce is being built in real time. The question is whether it is being built safely.

The Infrastructure Race

The checkout systems that power online commerce were designed for humans. Form fields, CAPTCHA challenges, multi-step confirmations, card number entry: every step assumes a person is sitting at a screen making deliberate choices. AI agents do not work that way.

According to Total Retail, agents "can sometimes get all the way to 'buy,' then stall, forced to hand control back to a human." The intelligence is there. The identity and payment layer is not. Sean Neville, cofounder of Circle, points to a 96-to-1 ratio of non-human to human identities in financial services. The bots are already in the system. The payment infrastructure has not caught up.

"The bottleneck for the agent economy is shifting from intelligence to identity," Jim Nguyen, co-founder and CEO of InFlow, told Total Retail. What agents need is fundamentally different from what humans use: API-driven purchasing pathways, embedded payments within agent workflows, programmatic authorisation with audit trails, and support for flexible monetisation models from micropayments to recurring subscriptions.

❝

Agents can browse, compare, and decide. But at the moment of transaction, the most critical moment in commerce, they hit a wall built for humans.

The Calendar Exploit

On February 9, LayerX published research that should concern anyone building or using agentic commerce infrastructure. The security firm found a zero-click remote code execution vulnerability in Claude Desktop Extensions (DXT) that affects over 10,000 active users across more than 50 extensions.

The attack is disarmingly simple. An attacker creates a Google Calendar entry with a benign title like "Task Management." The event description contains plain-text instructions directing the system to download and execute code from a remote URL. No sophisticated prompt engineering required. No direct interaction with the victim.

The exploit triggers when the user issues a vague but common prompt: "Please check my latest events in Google Calendar and then take care of it." Claude interprets "take care of it" as authorisation to act on the calendar instructions. The model reads the event, invokes a local extension with execution privileges, downloads the attacker's code, and runs it. No confirmation prompt. No warning. No visible indication to the user.

The root cause is architectural. Unlike modern browser extensions, which operate within sandboxed environments, Claude's MCP servers run with full system privileges on the host machine. These extensions are, as LayerX describes them, "privileged execution bridges" capable of reading files, accessing stored credentials, and modifying OS settings. There is no isolation between low-risk data connectors like Google Calendar and high-privilege execution tools.

Anthropic declined to fix the issue, stating it "falls outside our current threat model" and that the behaviour represents the intended design of a "local development tool." Udo Schneider of Trend Micro offered a blunter assessment: "Security and usefulness are in direct competition."

The Trust Gap

This is the catch-22 at the heart of agentic AI. Roy Paz of LayerX summarised it precisely: "To unlock the productivity benefits of AI, you need to give these tools deep access to sensitive data. But if any data is compromised as a result, the AI and model providers don't see themselves responsible for the security of users using their products."

The PYMNTS data shows that consumers instinctively understand this tension. They want bounded autonomy: agents that act on their behalf but within limits they control. Approve before executing. Undo after the fact. Human override at any point. The architecture being built does not reflect those preferences. Claude Desktop Extensions run with full system privileges and no guardrails. The agent that checks your calendar has the same access as the agent that executes code on your machine.

The performance data compounds the concern. The APEX-Agents benchmark, based on real tasks designed by investment banking analysts, management consultants, and corporate lawyers, found that AI agents succeed at complex cross-application tasks just 24 percent of the time. Gartner predicts that 40 percent of agentic AI projects will fail by 2027 due to poor risk management and unclear ROI.

Now apply this to commerce. Target has built a shopping experience inside ChatGPT. Perplexity can execute purchases through its interface. Google's Universal Commerce Protocol is designed to let agents transact directly with merchants. If a hijacked agent has access to stored payment credentials, prompt injection stops being a cybersecurity curiosity and becomes financial fraud. A manipulated calendar entry does not just compromise a computer. It compromises a wallet.

❝

A 24 percent success rate. A CVSS 10.0 vulnerability that the vendor will not fix. And 70 percent of consumers ready to hand over purchasing authority. The gap between consumer trust and infrastructure readiness has never been wider.

Why This Matters

We have been tracking the convergence of AI, payments, and commerce since Major Matters launched. In our analysis of the search market splitting, we covered how Perplexity, Opera Neon, and Google are building agentic commerce into the browser layer. In our coverage of ChatGPT advertising, we explored how Target and Roundel are turning AI conversations into transaction surfaces.

In our analysis of the AI content marketplace race, we examined how the infrastructure layer is being contested by cloud providers. The pattern across all of these stories is the same: the demand side is accelerating, the infrastructure is being built at speed, and the security and identity layers are being treated as problems to solve later. That ordering is dangerous.

There are specific questions we are watching closely. First, who builds the authentication and identity layer for agent commerce? The current model, where agents inherit user privileges without granular permission controls, is not sustainable. Someone needs to build the equivalent of OAuth for agentic transactions - a framework that lets agents act on behalf of users with scoped, revocable, auditable permissions.

Second, how do payment processors respond? Stripe and PayPal are already building agentic commerce capabilities, but neither has publicly addressed the security implications of agent-initiated transactions at scale. When an agent processes a payment, who bears liability for a fraudulent transaction triggered by prompt injection?

Third, regulation. PSD2 and the forthcoming PSD3 mandate Strong Customer Authentication for online transactions. Those requirements were designed for human users clicking "confirm" on a device they own. Nobody has defined what SCA looks like when an AI agent acts on your behalf. The regulatory frameworks have not caught up with the technology, and the gap is widening every week.

This is not an abstract concern. The LayerX disclosure demonstrates that a trivial attack can compromise an agent with full system access. The PYMNTS data demonstrates that consumers are ready to give agents purchasing authority. The infrastructure is being built to connect those two realities. The question is whether the security model will be ready before the first major breach of an agent-initiated payment occurs.

Sources

If 70 percent of consumers are ready to hand their wallets to AI agents, who is responsible when the agent gets hijacked?