Datawiza
April 18, 2026 | Blog | Industry

Prompt Injection Is Stealing API Keys From Claude Code, Gemini CLI, and GitHub Copilot


Last week, security researcher Aonan Guan published research showing that three of the most widely deployed AI coding agents on GitHub — Anthropic’s Claude Code Security Review, Google’s Gemini CLI Action, and GitHub’s Copilot Agent — could be hijacked into leaking their host repository’s API keys and tokens.

The attack surface was nothing exotic. A pull request title. An issue comment. An HTML comment hidden inside a Markdown issue body. The credentials exfiltrated: ANTHROPIC_API_KEY, GEMINI_API_KEY, GITHUB_TOKEN, GITHUB_COPILOT_API_TOKEN, and more. All were sitting in the agent runner’s environment, and all were extractable through ps auxeww or a faked “Trusted Content Section” injected into the prompt.

If you’ve read our earlier post on AI agent credential security, you already know why this keeps happening. But Guan’s research is the clearest real-world proof we’ve seen that the industry’s default credential model for AI agents is broken.

What actually happened

Guan and his collaborators demonstrated the same attack pattern against all three vendors:

  1. Attacker-controlled text enters the agent’s context — a PR title, an issue body, a comment — the kind of data the agent is designed to read.
  2. The injected text redirects the agent to execute commands: whoami, ps auxeww, cat /proc/*/environ.
  3. The agent dutifully runs them and writes the output back to GitHub — as a PR comment, as a commit, as an Actions log line.
  4. The attacker reads the result. No external infrastructure required. The entire command-and-control loop runs inside GitHub.

GitHub had even stacked three runtime defenses on Copilot Agent: environment filtering, secret scanning, and a network firewall. All three were bypassed. Environment filtering didn’t cover the parent Node process or the MCP server sidecar. Secret scanning didn’t match base64-encoded tokens. The network firewall allowed github.com, so the attacker used git push to send the encoded credentials into a PR.
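To see why pattern-based secret scanning misses an encoded token, here is a minimal Python sketch. The regular expression, the fake token, and the `contains_secret` helper are all assumptions made up for illustration; real scanners use many more rules, and this is not GitHub's implementation.

```python
import base64
import re

# Hypothetical scanner rule: a "ghp_" prefix followed by 36 alphanumerics.
# Real secret scanners use many more patterns; this one is illustrative only.
TOKEN_PATTERN = re.compile(r"ghp_[A-Za-z0-9]{36}")


def contains_secret(text: str) -> bool:
    """Return True if the text contains something shaped like a token."""
    return TOKEN_PATTERN.search(text) is not None


# A fake token invented for this example (not a real credential).
fake_token = "ghp_" + "a1B2c3D4e5F6" * 3

plain_output = f"GITHUB_TOKEN={fake_token}"
encoded_output = base64.b64encode(plain_output.encode()).decode()

print(contains_secret(plain_output))    # True: the raw token matches the rule
print(contains_secret(encoded_output))  # False: base64 output has no "ghp_" prefix
print(base64.b64decode(encoded_output).decode() == plain_output)  # True: trivially reversible
```

The encoded string never matches because the standard base64 alphabet does not include the underscore, yet anyone who reads it can decode it in one call.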

The bounties (Anthropic $100, Google $1,337, GitHub $500) are almost beside the point. What matters is what was stolen and why it was useful the moment it was stolen.

The real lesson: these are bearer tokens on the public internet

Every credential that leaked in these three attacks shares one property: it is a bearer token accepted by a public API endpoint.

  • ANTHROPIC_API_KEY works from any machine that can reach api.anthropic.com.
  • GEMINI_API_KEY works from any machine that can reach Google’s API endpoints.
  • GITHUB_TOKEN works from any machine that can reach api.github.com.

That’s the internet. Every machine.

The moment a key like that appears in a PR comment, a commit, or an Actions log, the attacker can copy it to their laptop and use it immediately. No lateral movement. No foothold. No exploitation chain. The leak is the compromise.

This is not a bug in any of the three products. It’s the architecture of how AI agents are wired to backend services today. Every approach in common use — environment variables, mounted secrets, short-lived OAuth tokens, workload identity — ends the same way: a usable credential lives inside the agent’s process. Prompt injection is the lockpick. The credential is the prize.
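A minimal Python sketch of the "credential lives inside the process" problem: a child process inherits its parent's environment by default, and filtering the child's copy does nothing for the parent. The variable name `DEMO_PROVIDER_KEY`, its value, and the `run_child` helper are hypothetical, chosen only for this illustration.

```python
import os
import subprocess
import sys

# Hypothetical variable name and value, made up for this example.
os.environ["DEMO_PROVIDER_KEY"] = "not-a-real-key"

CHILD_CODE = "import os; print(os.environ.get('DEMO_PROVIDER_KEY', 'absent'))"


def run_child(env=None) -> str:
    """Run a child Python process and return what it sees for the demo variable."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Default behavior: the child inherits the parent's full environment.
inherited = run_child()

# Filtered: the child receives a copy of the environment without the secret.
filtered_env = {k: v for k, v in os.environ.items() if k != "DEMO_PROVIDER_KEY"}
filtered = run_child(env=filtered_env)

# The parent process itself still holds the value.
still_in_parent = "DEMO_PROVIDER_KEY" in os.environ

print(inherited)        # not-a-real-key
print(filtered)         # absent
print(still_in_parent)  # True
```

Filtering protects only the process it is applied to. Any other process on the runner that still carries the variable remains a target.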

When an AI agent runs behind Datawiza Agent Gateway, it does not hold ANTHROPIC_API_KEY. It does not hold GITHUB_TOKEN. It does not hold any backend credential at all. The agent holds a gateway-issued key: a drop-in replacement for a provider API key that only works against the gateway inside your network. The real backend credentials live in the gateway’s encrypted store, and the gateway injects them on the outbound side.

Why gateway-issued keys change the math

Datawiza Agent Gateway issues each agent its own key — a drop-in replacement for a provider API key that the agent uses exactly the same way. No SDK changes. No OAuth flow. No refresh logic. The value of the key changes; the agent code does not.

But the gateway key is a fundamentally different kind of secret than a provider key. It is issued by the gateway, only accepted by the gateway, and the gateway typically runs inside the customer’s own network: a VPC, a private subnet, an on-prem cluster. The real provider credentials (ANTHROPIC_API_KEY, GITHUB_TOKEN, and the rest) live in the gateway’s encrypted store and never leave it. When the agent makes a call, the gateway validates the agent’s key, strips it, looks up the correct backend credential, injects it, and forwards the request.
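The validate, strip, look up, inject sequence can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Datawiza's implementation: the `Gateway` class, its in-memory dictionaries (standing in for the encrypted store), the `x-api-key` header name, and all key values are hypothetical.

```python
import hmac
from dataclasses import dataclass, field


@dataclass
class Gateway:
    # Hypothetical registry: gateway-issued agent key -> agent name.
    agent_keys: dict = field(default_factory=dict)
    # Stand-in for the encrypted store: upstream host -> real provider credential.
    backend_credentials: dict = field(default_factory=dict)

    def _lookup_agent(self, presented_key: str):
        """Return the agent name for a presented key, or None if unknown."""
        for key, agent in self.agent_keys.items():
            # Constant-time comparison of the presented key against each known key.
            if hmac.compare_digest(key.encode(), presented_key.encode()):
                return agent
        return None

    def rewrite(self, upstream_host: str, headers: dict) -> dict:
        """Validate the agent key, strip it, and inject the backend credential."""
        presented = headers.get("x-api-key", "")
        if self._lookup_agent(presented) is None:
            raise PermissionError("unknown or revoked agent key")
        if upstream_host not in self.backend_credentials:
            raise PermissionError("no backend credential configured for host")
        # Strip the gateway key, then inject the real credential outbound.
        outbound = {k: v for k, v in headers.items() if k != "x-api-key"}
        outbound["x-api-key"] = self.backend_credentials[upstream_host]
        return outbound

    def revoke(self, agent_key: str) -> None:
        """Remove an agent key; the provider credential is untouched."""
        self.agent_keys.pop(agent_key, None)


gw = Gateway(
    agent_keys={"gw-agent-123": "review-bot"},
    backend_credentials={"api.anthropic.com": "real-provider-key"},
)
out = gw.rewrite(
    "api.anthropic.com",
    {"x-api-key": "gw-agent-123", "content-type": "application/json"},
)
print(out["x-api-key"])  # real-provider-key
```

The agent only ever sees `gw-agent-123`. After `gw.revoke("gw-agent-123")`, the same call raises `PermissionError`, and the provider key never needs rotating.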

The asymmetry this creates is the whole point:

 | Leaked provider API key | Leaked gateway-issued key
Who can use it? | Anyone, from anywhere on the internet | Only someone who can reach the gateway endpoint
Where does the endpoint live? | Public SaaS API | Inside the customer’s network / VPC
What does the attacker need? | Just the key | The key and network access to the customer
Blast radius | Full provider account until manually rotated | Bounded by gateway policy, often nothing usable at all
Revocation | Rotate the key at the provider | Revoke the agent’s key at the gateway (instant)

Think of it this way. A provider API key is a house key. Drop it on the sidewalk and whoever picks it up walks into your house. A gateway-issued key is a badge for a building you already have to be inside. Drop it on the sidewalk and it’s a piece of plastic.

If the attacks Guan documented had targeted an agent running behind Datawiza Agent Gateway, the exfiltrated payload would have contained a gateway key that is only accepted inside the customer’s VPC, has no direct access to Anthropic, Google, or GitHub’s APIs, is governed by gateway policy on every request, and can be revoked instantly without touching the provider account.

A catastrophic, internet-wide credential theft becomes a non-event.

What to take from this

This will not be the last prompt injection disclosure this year. It will not even be the last one this quarter. Prompt injection is the new phishing — a decades-long cat-and-mouse game in which the defender will sometimes win a round and never win the war.

What changes is how much damage a successful round does to you.

Organizations deploying AI agents today have one strategic decision to make: are the credentials in your agent runtime bearer tokens accepted anywhere on the public internet, or keys bound to your own network? That single architectural choice is the difference between a leak that becomes a headline and a leak that becomes a log line.

We wrote the deeper architectural case for never giving agents real credentials. Guan’s research is the field evidence.

The agent cannot exfiltrate what it cannot access.
