Agentic AI Security Reckoning: Who Guards the Bot?

In January 2026, developer Lucas Valbuena ran OpenClaw, a popular AI agent platform formerly known as Clawdbot, through a security analysis tool called ZeroLeaks. The results were catastrophic. Using Gemini 3 Pro, the platform scored 2 out of 100 points. The system prompt was fully exposed on the first attempt. The extraction rate hit 84 percent. Injection attacks succeeded 91 percent of the time.

Around the same time, security researcher Jamieson O'Reilly discovered that Moltbook, a platform where AI agents interact with each other, had left its entire database sitting on the public network. No protection. API keys exposed. Attackers could impersonate any agent on the platform, including one belonging to AI researcher Andrej Karpathy and his 1.9 million followers.

Meanwhile, 7 percent of enterprise CFOs have already deployed agentic AI in live finance workflows. Another 5 percent are running pilots. These systems do not just forecast cash needs. They execute. They sweep idle balances into short-term yield instruments. They reverse those moves when liquidity demands it. They operate within guardrails set by finance leadership.

The guardrails assume the agent is secure. It is not.

❝

We are handing autonomous financial authority to systems with no reliable defense against their primary attack vector. This is not a future risk. It is happening now.

The Agentic Promise

The pitch to enterprise finance is compelling. Traditional treasury platforms are passive. They show balances, highlight trends, flag exceptions. Even when they use machine learning, the output is advisory: a forecast, a recommendation, a scenario.

Agentic systems change this posture entirely. Instead of asking "what should we do with this cash?" the system asks "what am I allowed to do right now?" Then it executes.

The economic pressure driving adoption is real. Higher interest rates mean idle cash represents genuine opportunity cost. Complexity has outpaced manual control as global entities manage dozens or hundreds of accounts across currencies, banks, and jurisdictions. AI forecasting has reached a level of reliability that makes automation feel defensible rather than reckless.

The numbers reflect this momentum. According to G2's Enterprise AI Report, 57 percent of companies already have AI agents in production, with another 22 percent piloting. Adoption jumped from 11 percent to 42 percent in just six months. Gartner predicts 40 percent of enterprise applications will feature embedded task-specific agents by the end of 2026, up from less than 5 percent in early 2025.

PYMNTS research found that firms using agentic AI have automated up to 95 percent of their accounts receivable processes, compared to 38 percent among firms without AI integration. The market is projected to hit $11.79 billion this year.

❝

"Folks are just starting to understand that AI isn't just automation with kind of sexier marketing," Ernest Rolfson, CEO of Finexio, told PYMNTS in December.

He is right. But the market is moving faster than its security posture can support.

The Front Door Is Wide Open

The OpenClaw vulnerability is not an edge case. It is a window into how fragile the agentic AI infrastructure actually is.

Valbuena's ZeroLeaks analysis tested multiple models. Gemini 3 Pro scored 2 out of 100. Codex 5.1 Max managed 4 out of 100. Opus 4.5 fared better at 39 out of 100, but that is still a failing grade by any reasonable standard. The analysis found that anyone interacting with an OpenClaw-based agent can access its complete system prompt, internal tool configurations, and memory files. This includes files like SOUL.md and AGENTS.md, along with all skills and embedded information.

For agents handling sensitive workflows or private data, this is not a theoretical concern. It is an open invitation.

The Moltbook breach was worse. Security researcher O'Reilly found the entire database exposed, including secret API keys that would let attackers post on behalf of any agent. The Karpathy example illustrates the reputational risk: with exposed keys, bad actors could spread fake statements about AI safety, crypto scam promotions, or inflammatory political content under his name.

The scale of exposure extends beyond these two platforms. X user fmdz warned of an impending "clawd disaster" after a simple scan turned up 954 Clawdbot instances with open gateway ports, many without any authentication. The instances were spread across servers in the US, China, Germany, Russia, and Finland.

Beyond research environments, real-world incidents are already accumulating. According to Adversa AI's 2025 Security Report, a supply chain attack on the OpenAI plugin ecosystem compromised 47 enterprise deployments. Attackers harvested credentials and accessed customer data, financial records, and proprietary code for six months before discovery. Threat actor UNC6395 used stolen OAuth tokens from Drift's Salesforce integration to access customer environments across more than 700 organisations. No exploit. No phishing. The activity looked legitimate because it came from a trusted SaaS connection.

❝

The average cost of an AI-powered breach is $5.72 million. Shadow AI breaches cost $670,000 more than traditional incidents and take 247 days to detect.

Why Governance Does Not Solve This

CFOs deploying agentic treasury systems believe they have safeguards in place. Bounded autonomy with strict policies. Audit trails showing what the agent did. Human escalation thresholds for larger moves. Low-risk instruments only. Parallel testing before production deployment.

These controls assume the agent itself is trustworthy. They do not account for what happens when the agent is compromised at the instruction layer.

Prompt injection collapses the boundary between data and instructions. Professor George Chalhoub framed the core problem clearly: it "potentially turns an AI agent from a helpful tool to a potential attack vector against the user."

If an attacker can extract system prompts, they see the rules. They see the boundaries. They see exactly how much autonomy the agent has and where the escalation thresholds sit. The audit trail shows what the agent did, not what it was tricked into doing.

OWASP's 2025 Top 10 for LLM Applications ranks prompt injection as the number one vulnerability, present in 73 percent of production AI deployments assessed during security audits. eSecurity Planet reports that Q4 2025 saw over 91,000 attack sessions targeting AI infrastructure. The most common objective was system prompt extraction, which reveals role definitions, tool descriptions, policy boundaries, and workflow logic that attackers can use for more effective follow-on attacks.

OpenAI has been direct about the challenge. In December 2025, the company wrote: "Prompt injection, much like scams and social engineering on the web, is unlikely to ever be fully solved."

This is not reassuring language for CFOs who have just handed an AI agent the authority to move corporate cash.

We Have Been Here Before

The pattern is familiar. A powerful new technology reaches mass adoption before its security model matures. The damage comes first. The standards follow.

In 2016, the Mirai botnet infected over 300,000 IoT devices using just 62 default passwords. The resulting attack reached 1.2 terabits per second, the largest DDoS attack ever recorded at the time. When Dyn went down, it took Twitter, Netflix, GitHub, and Airbnb with it. The lesson was clear: connect everything, secure nothing, pay later.

Cloud adoption followed the same arc. Today, 80 percent of companies have experienced cloud breaches in the past year. Cloud security incidents surged 154 percent year over year, from 24 percent of organisations affected in 2023 to 61 percent in 2024. The root cause in 68 percent of cases is misconfiguration, and 82 percent of those misconfigurations stem from human error. On average, it takes 186 days to identify a misconfiguration-driven breach and another 65 days to contain it. Each incident costs approximately $3.86 million.

Agentic AI is on the same curve, but with higher stakes. When the compromised system can autonomously execute financial transactions, the blast radius expands dramatically.

❝

GenAI was involved in 70 percent of AI security incidents in 2025, but agentic AI caused the most dangerous failures: crypto thefts, API abuses, legal disasters, and supply chain attacks.

What Doing It Right Looks Like

The picture is not entirely bleak. Some organisations are building security into the foundation rather than bolting it on after deployment.

Lakera, recently acquired by Check Point, has emerged as a leader in AI-native security. Their Lakera Guard platform analyses every input and output in real time, flagging or blocking prompt injections before they reach the model. The system scans fetched content, attachments, and URLs for embedded instructions, including those hidden in HTML, PDFs, or less common languages.

The approach is grounded in continuous learning. Lakera processes over 100,000 new attacks daily through Gandalf, their AI security research platform. Enterprise customers include Dropbox and regulated banks deploying AI across money transfers, financial services, and customer support.

Microsoft has developed FIDES, an information-flow control approach for agentic systems. Their broader strategy emphasises architecture-level solutions: trust boundaries, context isolation, output verification, strict tool-call validation, least-privilege design, and continuous red teaming. OpenAI is using an "LLM-based automated attacker" trained via reinforcement learning to probe its own systems for vulnerabilities.

The McKinsey playbook for deploying agentic AI emphasises securing agent-to-agent collaborations through protocols like Anthropic's Model Context Protocol, Cisco's Agent Connect Protocol, Google's Agent2Agent, and IBM's Agent Communication Protocol. These are still maturing, but the recommendation is clear: implement safeguards now and plan for upgrades as more secure protocols emerge.

For enterprises evaluating agent deployments, the questions to ask are straightforward. What is your prompt injection defence? If the answer is vague, walk away. Has the platform undergone independent security testing like ZeroLeaks? What happens when an agent is compromised? Is there containment? Detection? Rollback?

❝

According to the Cisco State of AI Security Report, only 34 percent of enterprises have AI-specific security controls in place. Less than 40 percent conduct regular security testing on AI models or agent workflows. These numbers need to change.

The Clock Is Ticking

Michael Freeman, head of threat intelligence at Armis, offered a prediction: "By mid-2026, at least one major global enterprise will fall to a breach caused or significantly advanced by a fully autonomous agentic AI system."

This is not fear-mongering. It is pattern recognition. The attack surface is expanding. The defences are immature. The adoption curve is steep. Something will give.

The path forward is not to abandon agentic AI. The productivity gains are real. The efficiency improvements are measurable. The competitive pressure is genuine.

The path forward is to treat agentic AI the way we eventually learned to treat cloud infrastructure: as a shared responsibility model where security is a first-class concern, not an afterthought. Where procurement includes security audits. Where deployment includes red teaming. Where governance assumes compromise rather than trust.

CFOs letting AI agents touch their cash should be asking one question: who else can give that agent instructions?

Right now, the answer is almost anyone willing to try.

Related reading from Major Matters:

Sources

Your organisation is probably somewhere on the agentic AI adoption curve. Where does security fit in that timeline, and are you comfortable with the gap?