Agentjacking: a fake bug report hijacks your AI coding assistant

Researchers fooled Claude Code, Cursor and Codex with a single forged error report and made them run malicious code, with an 85% success rate. The unsettling part isn't the percentage. It's why it works.

Written by Nelson Brilhante · 12 Jun 2026 · updated 20 Jun 2026 · 4 min Ask BRI about this ↗

Agentjacking: a fake bug report hijacks your AI coding assistant — FIG. NB-L032 · Agentes de IA · Cibersegurança

A single fake error report was enough. Researchers at Tenet Security, a firm focused on the security of artificial-intelligence agents, showed they can fool the most widely used AI coding assistants in the world, including Claude Code, Cursor and Codex, into running an attacker's code on the machine of whoever is using them. They tested the technique against more than a hundred real targets and succeeded 85% of the time. They call it Agentjacking: the hijacking of an AI agent.

What should worry anyone pushing these agents into production isn't the percentage. It's the reason it works. An AI agent cannot tell the data it reads apart from an instruction to act. If the information reaching it carries disguised commands, it obeys, with the privileges of the person using it. Thirty years of cybersecurity were built to catch what is not authorized: the malware, the stolen password, the intrusion. None of that happens here. Every step in the chain is a legitimate, authorized action, carried out by a trusted tool.

How it works, step by step

The way in is Sentry, a popular error-monitoring platform that many software teams use to log their applications' failures. Sending it an error only takes a public key, the so-called DSN, which Sentry itself documents as safe to leave visible in a website's code. Anyone who finds it can inject an "error" of their choosing.

The attacker submits a forged error with a fake "Resolution" section, written to look like legitimate technical advice, hiding a malicious command. Later, the developer asks their AI assistant to fix the open issues. The agent fetches them through MCP (the Model Context Protocol, which connects AI agents to external tools), reads the planted error and treats it as trusted output from the system. It runs the command. From there, a package controlled by the researchers rummaged through environment variables, Amazon keys and code credentials, and quietly sent everything out. In an enterprise setting, the same technique can steal software pipeline credentials, reach private code repositories, compromise cloud infrastructure and install permanent access.

Through passive reconnaissance, Tenet found 2,388 exposed organizations with valid injectable keys, from companies worth 250 billion dollars down to solo developers. In more than a hundred confirmed cases, the agent actually ran the code.

Why it bypasses everything else

This attack walks past advanced antivirus, the firewall, the VPN, Cloudflare and identity management. And it does so for one simple reason: there is nothing malicious to detect. "The innovation is not a novel exploit: it is how trivially and at what scale agents can be hijacked in the wild," the researchers wrote. Worse still, defenses at the instruction level failed. Even when the agent was explicitly told to ignore untrusted data, it ran the command anyway.

One detail seals the argument. When Tenet warned Sentry, the answer came the same day: they acknowledged the problem but declined to fix it at the root, calling it "technically not defensible." They settled for blocking one specific signature of the attack. In other words, the platform's owner says this can't be solved on their side. That leaves the only place where it can still be stopped: the agent itself, the moment it decides to act.

This isn't one product's problem. It's the bill coming due for moving AI agents from completing text to running terminals, opening repositories and managing infrastructure, all rushed into production. Every new tool we connect to an agent is a new door. The agent has become the attack surface.

How to protect yourself

Treat the output of any tool connected to your agent as untrusted data, never as instructions to follow.
Give the agent the least privilege and the fewest secrets possible: short-lived, scoped keys, never the master key to the environment.
Require human confirmation before the agent runs system commands or installs packages.
Isolate the development environment from everything else, in containers or dedicated machines, so a compromised agent never reaches production.
Inventory your integrations: which tools your agent queries, and which of them return data that comes from outside.
Rotate now any keys and secrets that may have passed through an exposed agent.

The rush to hand machines autonomy is moving faster than our ability to give them brakes. Until that changes, assume everything your agent reads may be giving it orders.

Source: Tenet Security.

#StaySafe
🙏🖖

Agentjacking: a fake bug report hijacks your AI coding assistant

How it works, step by step

Why it bypasses everything else

How to protect yourself

More stories

Anthropic releases Claude Fable 5, the public version of the model it held back as too dangerous

Claude Mythos: Separating Fact from Hype Hours Before the Rumored Launch

When AI Attacks: Container Escape and the Future of Cybersecurity

The Boletim