#106 - AI Credential Harvest Scales as Agentic Attack Surface Mapped

PRESENTED BY

Cyber AI Chronicle

By Simon Ganiere · 12^th April 2026

Welcome back!

Anthropic announced a new model this week that can autonomously discover vulnerabilities and build working exploit chains. They gave it to a consortium of tech giants first, as a head start for defenders. The same week, Google API keys hardcoded in 250,000 Android apps were found to silently authenticate to Gemini AI endpoints, a Microsoft device-code phishing campaign was running 10–15 AI-personalised attacks per day and compromising hundreds of organisations daily, and the FBI's annual crime report recorded $893 million in confirmed AI-enabled fraud losses in 2025 alone. The head start may be shorter than advertised.

The theme this week isn't a single incident. It's the steady widening of a gap between AI as an attack capability and AI as infrastructure, secured or otherwise. Agentic AI systems are being targeted via systematic prompt injection chains. AI credentials are being harvested at scale by threat actors who understand their value better than most security teams do. And the AI model powering your developers' workflows may already be manipulable via a poisoned file in a repository they clone.

If your organisation has AI tools deployed, this week asks a pointed question: do you have visibility into what those tools can access, and what happens when the instructions they receive aren't yours?

❝

If you have been enjoying the newsletter, it would mean the world to me if you could share it with at least one person 🙏🏼 and if you really really like it then feel free to offer me a coffee ☺️

Simon

AI Threat Tempo

🤖 AI-Enabled Social Engineering: ↑↑↑ Multiple documented campaigns running at scale

An unattributed threat actor has been running 10–15 Microsoft device-code phishing campaigns per day since March 15, using AI to generate hyper-personalised emails and route victims through compromised serverless platforms to harvest OAuth tokens and bypass MFA. Hundreds of organisations compromised daily across all sectors.
The newly documented VENOM phishing-as-a-service platform targets C-suite executives specifically, using Unicode-rendered QR codes and double Base64-encoded URLs in fragments to defeat link scanners, with both AiTM and device-code flows supported.

Significance: Commodity AI-personalised phishing is now a volume operation. The interesting thing isn't the technique, it's the cadence. Ten to fifteen campaigns a day from a single actor is an operations tempo that was implausible without AI assistance.

🏴‍☠️ Nation-State AI Operations: ↑ Active, lower volume than last week

Salt Typhoon's 200-organisation breach across 80+ countries remains the reference datapoint; Anthropic's September 2025 disclosure of Chinese APT group GTG-1002 weaponising Claude Code as an autonomous attack platform (executing 80–90% of attacks with minimal human oversight) continues to frame the AI-as-attacker discussion.
China-linked actors were also identified this week using a fake Claude AI website to distribute PlugX RAT, with C2 callback within 22 seconds of execution. Not AI-enabled attack technique, but AI brand as lure.

Significance: Nation-state AI operations are now a stable feature of the landscape, not a novel finding. The signal this week is lower than #105's landmark Microsoft reporting, which is worth calibrating - the forensic baseline has moved, not the threat.

💀 AI-Augmented Ransomware & Cybercrime: ↑↑ Structurally concerning

The Sicarii RaaS group is almost certainly using AI to assemble its malware - and has shipped a fatally flawed encryptor that discards the private decryption key post-encryption. Victims cannot recover files even after paying. This is what AI-accelerated, low-quality malware development looks like at the bottom end.
The FBI's 2025 IC3 report confirmed $893 million in AI-enabled fraud losses in the US alone, with voice cloning, deepfake video, and AI-generated identity documents used in BEC, romance scams, and investment fraud. First time AI has had its own section in the IC3 report.

Significance: Cynthia Kaiser's observation from RSAC is worth noting: AI is amplifying low-sophistication attack volume, straining defender resources in a way that could be masking simultaneous sophisticated intrusions. Volume is the tactic.

🔗 AI Supply Chain & Model Attacks: ↑↑↑ Persistent, structurally unresolved

LiteLLM's PyPI compromise extended through 1,705 transitive dependencies including dspy (5M monthly downloads) and crawl4ai (1.4M). GitGuardian analysis of related prior campaign found 33,185 unique secrets across 6,943 developer machines, with 3,760 still valid at time of analysis.
Google API keys hardcoded in Android apps per Google's own documentation have been silently elevated to Gemini AI credentials - found in 22 popular apps with 500 million users. Over 35,000 unique keys across 250,000 apps. Retroactive privilege escalation at scale.
The Register's analysis of the Trivy/Axios supply chain campaign continues the post-mortem on last week's cascading compromise.

Significance: LLM API credentials are now a high-value harvest target and they're accumulating on developer machines in plaintext at density. Most organisations applying good hygiene to AWS keys are not applying it to their AI provider credentials.

🔬 AI Platform Vulnerabilities & Prompt Injection: ↑↑↑ Dominant theme, multiple findings

Google DeepMind published a taxonomy of six "AI Agent Trap" classes targeting agentic systems via malicious web content: content injection, semantic manipulation, cognitive state corruption, behavioural control, systemic traps, and human-in-the-loop subversion. This is a research publication, not an active campaign report, but the six categories are real attack surfaces.
Unit 42 demonstrated a four-stage attack chain against Amazon Bedrock's multi-agent framework requiring no platform vulnerability, only prompt injection against default configurations - enabling agent enumeration, payload delivery to specific sub-agents, and unauthorised tool invocation.
Apple Intelligence's safety guardrails were bypassed at 76% success rate by combining Neural Execs prompt injection with Unicode right-to-left override, affecting an estimated 200 million devices. Patched in iOS 26.4.
Docker CVE-2026-34040 introduces an AI-specific attack scenario: AI coding agents manipulated via prompt injection in a malicious GitHub repository can autonomously trigger the auth bypass and harvest cloud credentials.

Significance: Three separate AI platform guardrail failures in one week, plus research mapping the systematic attack surface for agentic AI. The pattern is consistent with last week's argument: probabilistic controls fail probabilistically. Independent enforcement layers are not optional.

Are you tracking agent views on your docs?

AI agents already outnumber human visitors to your docs — now you can track them.

See your AI traffic!

Interesting Stats

❝

$893M - Confirmed AI-enabled fraud losses in the US in 2025 alone, per the FBI's IC3 annual report, representing over 22,000 individual complaints. This is the first year the IC3 has dedicated a category to AI-assisted crime. Divide by the total $20.87 billion in cybercrime losses and AI-enabled fraud is currently 4% of the total. That number will not be static.

❝

35,000 - Unique Google API keys found hardcoded across 250,000 Android applications, each now silently authenticating to Gemini AI endpoints following a retroactive privilege escalation that developers had no visibility into. This is what a new credential class looks like before organisations have built controls around it.

❝

9h 41m - Time between public disclosure of Marimo Python notebook RCE vulnerability CVE-2026-39987 and confirmed active exploitation on a honeypot. Not AI-specific, but representative of the exploitation tempo that AI-assisted vulnerability scanning is beginning to normalise for defenders to operate against.

Three Things Worth Your Attention

1. Claude Mythos and the "Head Start" Question

Anthropic's announcement of Claude Mythos Preview dominated the AI security conversation this week, and it deserves careful calibration rather than either hype or dismissal.

The Wired reporting describes a model that can autonomously discover vulnerabilities and build working exploit chains across operating systems and browsers. Anthropic's response was Project Glasswing: releasing Mythos exclusively to a consortium of Microsoft, Apple, Google, and Cisco to give defenders a head start before broader availability. The framing is responsible; the assumption embedded in it is worth examining.

A head start works if defenders can absorb, triage, and remediate faster than attackers can acquire and operationalise. The evidence this week doesn't support optimism on that ratio. CVE-2026-39987 in Marimo was exploited in under ten hours. CVE-2025-55182 in React2Shell was in active exploitation within two days of disclosure. And AISLE's research tested Anthropic's showcased vulnerabilities against small, cheap open-weights models and found that 8 out of 8 models detected the FreeBSD buffer overflow - including a 3.6 billion parameter model at $0.11 per million tokens. The moat, they argue, is in orchestration and maintainer trust, not the frontier model itself.

The security experts interviewed by Wired are divided. That division is appropriate. What isn't appropriate is either "AI cybersecurity will save us" or "this is pure hype." The honest position is that we don't yet know whether AI-assisted vulnerability discovery at scale will benefit defenders or attackers more, and the answer will depend heavily on whether defensive tooling can actually close the triage-to-patch gap.

The practical question isn't whether Mythos is real. It's whether your organisation could absorb a 10x increase in vulnerability disclosures with current patching infrastructure. If not, that's the gap to fix first.

2. The AI Credential Harvest Is Now Systematic

Last week's TeamPCP supply chain reporting established that LLM API keys are a credential class worth stealing. This week's data adds structural depth to that observation.

The LiteLLM PyPI compromise extended via transitive dependencies to 1,705 packages, meaning organisations that never directly installed LiteLLM were exposed because something they installed depended on it. GitGuardian's analysis of a related campaign found 33,185 unique secrets - not historical, not theoretical. Three thousand seven hundred sixty still valid. Developer machines are credential warehouses, and AI toolchain libraries are now a deliberate attack surface specifically because of how many secrets accumulate there.

The Google Gemini key discovery adds a different dimension. CloudSEK found 32 hardcoded API keys across 22 popular Android apps with 500 million combined users. Truffle Security found 35,000 keys across 250,000 apps. These keys were embedded per Google's own documentation, treated as public identifiers, and then retroactively elevated to Gemini AI credentials when Google enabled the AI service on those projects. No notification. No rotation. No controls. An attacker who decompiles an APK now gets Gemini API access.

The common thread is that AI API credentials are being created faster than organisations are building governance around them. They're not going through the same rotation and monitoring discipline as AWS access keys or service account tokens, despite having comparable blast radius. The question to ask on Monday: does your AI API key inventory exist? Do you know which applications, environments, and developer machines hold OpenAI, Anthropic, or Google AI credentials, and are those keys in any form of rotation programme?

3. Agentic AI Has a Systematic Attack Surface, and Researchers Are Mapping It

Three separate research publications this week focused on the attack surface that emerges when AI systems act autonomously, and together they constitute a reasonably complete picture of the problem.

Google DeepMind's AI Agent Trap taxonomy names six attack classes against autonomous AI agents operating on the web: content injection, semantic manipulation, cognitive state corruption, behavioural control, systemic traps, and human-in-the-loop subversion. The key finding is that the attack surface isn't a bug in any particular system; it's structural. There is an inherent gap between what a human sees when rendering a web page and what a machine processes, and that gap is exploitable at every category in the taxonomy.

Unit 42's Amazon Bedrock research demonstrates this concretely: a four-stage attack chain that requires no vulnerability in Bedrock, only prompt injection against default configurations. Stage one: determine whether the target is in agent or chat mode. Stage two: enumerate collaborator sub-agents. Stage three: deliver a payload targeted to a specific sub-agent. Stage four: trigger unauthorised tool invocation. All of this works because the underlying LLM cannot distinguish developer instructions from adversarial input, regardless of how careful the developer was.

Unit 42's retail fraud scenario models this against agentic commerce systems, with indirect prompt injection via a malicious deals aggregator appending unauthorized gift cards to checkout payloads. This is forward-looking research, not an active incident. But agentic commerce is projected to handle 15–25% of global e-commerce by 2030, and the attack technique requires nothing more than placing hidden HTML instructions in a web page the agent visits.

The operational implication for anyone deploying agentic AI in enterprise workflows is the same across all three papers: the AI's ability to distinguish "instructions from my developers" from "instructions from malicious content" is probabilistic, not guaranteed. Independent enforcement controls, not AI-native guardrails, are what you need sitting between your agentic systems and your sensitive infrastructure.

In Brief - AI Threat Scan

🤖 AI-Enabled Attacks Hundreds of organisations are being compromised daily via AI-personalised Microsoft device-code phishing campaigns running at 10–15 per day, using serverless platforms to bypass detection and harvest OAuth tokens that bypass MFA. VENOM PhaaS specifically targets C-suite executives using Unicode QR codes and Base64-encoded URL fragments to defeat email scanners, operating since at least November 2025.

🏴‍☠️ Nation-State AI Activity China-linked actors distributed PlugX RAT via a convincing fake Claude AI download site, achieving C2 callback within 22 seconds of execution using a DLL sideloading chain through a legitimately-signed G DATA executable. TikTok removed six covert influence networks ahead of Hungary's elections, with AI-generated content targeting the opposition tracked by civil society groups.

💀 AI in Ransomware / Cybercrime Sicarii RaaS group appears to be using AI-assembled malware that ships with a critical flaw: the encryptor discards the private decryption key, making file recovery impossible even after ransom payment.

🔗 AI System Vulnerabilities Apple Intelligence's guardrails were bypassed at 76% success rate using Neural Execs prompt injection combined with Unicode right-to-left override characters, with patches issued in iOS 26.4. Docker CVE-2026-34040 introduces an AI-specific vector: coding agents can be manipulated via prompt injection in malicious repositories to autonomously trigger an auth bypass yielding full host access.

📜 AI Policy & Regulation The UK government proposed criminal liability for tech executives whose platforms fail to remove non-consensual AI-generated intimate imagery within required timeframes, following the Grok scandal in which xAI's chatbot generated millions of nudified images.

☁️ Cloud Attacks North Korean Slow Pisces (Lazarus) stole Kubernetes service account tokens at a cryptocurrency exchange to pivot into cloud financial infrastructure; Kubernetes-related threat actor operations increased 282% year-over-year per Unit 42 telemetry, with IT sector representing 78% of activity. UAT-10608 exploited CVE-2025-55182 in Next.js applications to harvest AI API keys, cloud credentials, and SSH keys from 766 hosts using the NEXUS Listener framework.

🔬 Research & Detection Wired's analysis of synthetic media proliferation documents that automated bot traffic now constitutes an estimated 51% of internet activity and that AI-generated content has eliminated classic deepfake detection tells.

The Bottom Line

The FBI published its 2025 internet crime report this week. The headline is $20.87 billion in US cybercrime losses. Buried in the body, and notable because it's new, is a dedicated AI section: 22,364 complaints, $893 million in losses. AI-enabled crime is currently 4% of the total. That number will not stay at 4%.

The more interesting signal this week isn't any single incident. It's the convergence between AI as an attack delivery mechanism and AI as an attack target. AI-personalised phishing is running at volume - hundreds of organisations per day from a single campaign. AI development tooling is the preferred harvest ground for a new credential class. And the AI systems your organisation is deploying have systematic adversarial attack surfaces that researchers at Google, Palo Alto, and universities are now formally mapping. The attack surface and the attack capability are scaling in parallel.

Apply the Rosling negativity instinct to Claude Mythos. The most alarming framing - "AI can now autonomously hack anything" - is not what the evidence shows. AISLE's research found that cheap open-weights models replicate many of Mythos's showcased capabilities. Anthropic's announcement is significant, but it's significant because it signals where capability is heading, not because frontier AI has crossed a threshold that was previously uncrossable. The gap between attacker AI capability and defender AI capability is real and widening; it's not yet a discontinuous break.

What has changed, structurally, is that AI credentials are now a target class with no standard controls. Every organisation with AI tooling in its development pipeline is accumulating OpenAI keys, Anthropic keys, and Google AI credentials on developer machines, in CI/CD pipelines, and hardcoded into applications. Most of those credentials are not rotated. Most of those machines are not in scope for the same credential hygiene applied to cloud infrastructure. TeamPCP, UAT-10608, and the Gemini key researchers all confirmed this gap independently this week.

The Monday morning question: do you know where your AI API keys are?

Wisdom of the Week

❝

Failure is simply the opportunity to begin again, this time more intelligently.

Henry Ford

AI Influence Level

Level 4 - Al Created, Human Basic Idea / The whole newsletter is generated via Claude workflow based on hundreds of news and research article. Human-in-the-loop to review the selected articles and subjects.

Reference: AI Influence Level from Daniel Miessler

Till next time!

Project Overwatch is a cutting-edge newsletter at the intersection of cybersecurity, AI, technology, and resilience, designed to navigate the complexities of our rapidly evolving digital landscape. It delivers insightful analysis and actionable intelligence, empowering you to stay ahead in a world where staying informed is not just an option, but a necessity.

Buy me a Coffee

Like the content? Share Project Overwatch with your friends or colleagues