#109 - AI Developer Toolchain Under Attack: DPRK LLM Malware, Gemini CLI RCE

Cyber AI Chronicle

By Simon Ganiere · 3^rd May 2026

Welcome back!

North Korea used Claude Opus to write a malicious commit this week. That sentence deserves a moment before you move on to the rest of the edition.

The PromptMink campaign, attributed to Famous Chollima by researchers, involved a DPRK operator using Anthropic's Claude Opus as a co-author - not for persona generation, not for phishing content, but for the functional malicious dependency inserted into a cryptocurrency trading agent's GitHub repository. This is a category shift from what we documented in Edition #108. Last week's story was AI configuration files becoming the loot and OAuth delegation creating invisible trust bridges. This week's story is that the AI toolchain itself - the coding assistants, the agent skill marketplaces, the ML framework libraries - has become the attack surface, and at least one nation-state has figured out how to use the tools' own brand credibility as a trust weapon.

In a single seven-day window, Google Gemini CLI received a CVSS 10 patch, Cursor IDE disclosed a sandbox escape and an unpatched credential extraction flaw, ClawHub and Hugging Face confirmed as malware distribution infrastructure, and PyTorch Lightning was poisoned on PyPI with commits falsely attributed to Claude Code. Meanwhile, KnowBe4's data confirmed that 86% of phishing campaigns now involve AI - not a spike, a baseline.

If you have been enjoying the newsletter, it would mean the world to me if you could share it with at least one person 🙏🏼

❝

If you have been enjoying the newsletter, it would mean the world to me if you could share it with at least one person 🙏🏼 and if you really really like it then feel free to offer me a coffee ☺️

Simon

AI Threat Tempo

🛡️ AI System Vulnerabilities (attacks ON AI): ↑ +44% (13 vs 9 high-scoring articles week-on-week)

Gemini CLI CVSS 10 RCE via workspace config auto-trust without sandboxing; Cursor sandbox escape via malicious git hooks; CursorJacking credential extraction from local SQLite database (unpatched)
ClawSwarm infection mechanism using legitimate SDK calls undetectable by static scanners; 600+ malicious ClawHub skills using indirect prompt injection to execute payloads without user interaction
Significance: The AI tooling layer is now a primary attack surface, and it does not have the security maturity of the software ecosystems it is replacing.

🦠 AI-Assisted Malware Development: ↑↑ +100% (4 vs 2)

Famous Chollima operationalises Claude Opus as a direct code weapon in a supply chain commit; first confirmed nation-state use of a frontier LLM for malware authorship, not just content generation
Darknet AI tools (WormGPT, HexStrike AI, BruteForceAI) cited by FortiGuard as primary driver of time-to-exploit collapsing to 24–48 hours
Significance: AI is now inside the commit history. Code review as a security control is degrading faster than teams realise.

🤖 AI-Enabled Social Engineering: ↑ +50% (3 vs 2)

KnowBe4 seventh Phishing Threat Trends report: 86% of campaigns AI-enabled, 4.5x effectiveness over human-written lures, calendar invite abuse up 49%, Teams IT-support impersonation up 41%
Bluekit PhaaS bundles GPT-4.1, Claude, Gemini, Llama, and DeepSeek into a single dashboard with 40 templates at approximately $250/month
Significance: The skill floor for AI-personalized phishing is now near zero. Detection calibrated on historical human-quality lures is structurally obsolete.

🔗 AI Supply Chain & Developer Tool Abuse: ↓ -27% (8 vs 11)

PyTorch Lightning compromised by TeamPCP (LAPSUS$-linked) as an extension of the Shai-Hulud campaign from Edition #108, with worm propagation into up to 50 repository branches per victim; commits falsely attributed to Anthropic Claude Code
AWS Bedrock AgentCore "Agent God Mode" - auto-generated IAM execution roles with wildcard permissions allow a compromised agent to enumerate ECR images, extract MemoryIDs, poison the conversation memory of any other agent, and invoke arbitrary agents across the account
Significance: Volume is lower than last week, but the campaign footprint expanded. The Mini Shai-Hulud extension confirms supply chain activity is iterative, not episodic.

🤖🏃 AI Autonomous & Agentic Attacks: ↓ -25% (6 vs 8)

Unit 42 Bedrock multi-agent attack chain: four-stage systematic attack progressing from operating mode detection through agent discovery, payload delivery, and unauthorised tool invocation - exploiting LLM inability to separate developer instructions from adversarial input
The Bedrock AgentCore God Mode finding compounds this: misconfigured default IAM policies give compromised agents cross-account lateral movement capability without any additional exploitation
Significance:Volume is down, but the research quality is up. AWS-deployed multi-agent systems now have two documented attack paths, both requiring only default configurations.

🔍 AI-Accelerated Vulnerability Exploitation: ↓ -62.5% (3 vs 8)

UK NCSC warning of imminent "patch tsunami": CTO Ollie Whitehouse states AI can exploit technical debt at scale and pace across the technology ecosystem; recommends minimising internet-facing attack surfaces and preparing to patch at volume
FortiGuard documents 656 CVEs with active darknet discussion in 2025, many packaged with working exploit code, time-to-exploit now 24–48 hours
Significance: The NCSC warning is the policy acknowledgement of what Mythos demonstrated operationally last week. The defender consequence is not optional: patch windows that assume a seven-day response cycle are already miscalibrated.

Works inside Cursor, Warp, VS Code, and every IDE.

Wispr Flow sits at the system level — dictate into any editor, terminal, or app with full syntax accuracy. No plugins needed. No setup per tool. 89% of messages sent with zero edits.

Start flowing free

Interesting Stats

❝

86% - The share of phishing campaigns now involving AI, per KnowBe4's seventh annual report, up from 80% in 2024 and 84% in 2025. The trajectory is a steady, compounding adoption curve, not a spike. Detection infrastructure calibrated on human-quality lures is already behind.

❝

9,800 - Downloads of the thirty malicious ClawHub skills that silently enrolled AI agents into the ClawSwarm cryptocurrency botnet, using only legitimate SDK calls with zero detectable malicious code patterns. This number represents confirmed downloads of a confirmed operational attack with no static analysis mitigation available.

❝

24–48 hours - Current time-to-exploit window documented by FortiGuard Labs for critical vulnerabilities, compressed from approximately one week by commoditised AI criminal tooling. Any patch SLA longer than 24 hours for internet-facing critical CVEs is now, arithmetically, too slow.

Three Things Worth Your Attention

1. The AI Developer Stack Had Its Worst Week Yet

There is a useful test for whether a week in the AI threat landscape represents genuine escalation or a volume artefact. Ask whether the techniques are new or the targets are new. This week, both qualify simultaneously, and across every layer of the AI developer stack.

Gemini CLI received a CVSS 10.0 patch for a remote code execution flaw tracked as GHSA-wpqr-6v78-jr5g. The vulnerability is not subtle: Google's AI coding tool automatically trusted whatever configuration files it found in the current workspace folder and loaded them without sandboxing before initialisation. An attacker with write access to any workspace directory - via a malicious pull request, a poisoned submodule, or a compromised dependency - could achieve arbitrary code execution on the CI/CD host before any security control had the chance to activate. In pipeline contexts, that means secrets, tokens, and source code at minimum. The Hacker News coverage additionally disclosed CVE-2026-26268 in Cursor IDE (CVSS 8.1), where an AI agent performing a routine git operation in response to a benign user prompt can autonomously trigger a malicious git hook embedded in a bare repository, achieving sandbox escape without the user ever touching the malicious artifact directly. A third flaw in Cursor, CursorJacking (CVSS 8.2), remains unpatched: any installed Cursor extension can extract API keys and session tokens from a local SQLite database.

Separately, researchers at Acronis documented nearly 600 malicious skills across 13 ClawHub developer accounts, delivering the Atomic macOS Stealer and other payloads via indirect prompt injection. Hugging Face simultaneously hosted multi-stage infection chains targeting Windows, Linux, and Android. The mechanism on ClawHub is technically instructive: attackers used indirect prompt injection to instruct AI agents to download and execute malicious code. The user never ran anything. The agent did. Traditional endpoint controls and user-awareness training have no purchase on a threat delivered entirely through the agent execution plane.

The combined picture this week is that four distinct AI developer tooling vectors - IDE coding assistants, CI/CD integrations, agent skill marketplaces, and model repositories - were all confirmed attack surfaces in a single seven-day window. This is not coincidence. The structural weakness is shared: AI tooling inherits user trust without earning it, and the security controls that exist in traditional software ecosystems simply have not been built into the AI layer yet. The Rosling calibration here matters. None of these vulnerabilities required novel research. CVSS 10 workspace trust abuse is conceptually identical to the threat model for malicious .npmrc files - the AI toolchain just has not developed the defensive norms that npm took a decade to accumulate. Monday morning question: have you audited which AI coding assistants and agent frameworks are running in your CI/CD pipelines, and what permissions they hold?

2. DPRK Used Claude Opus to Write a Malicious Commit

The PromptMink campaign, documented this week by The Hacker News, provides the clearest evidence yet that nation-state actors have moved AI from content generation support into active code weaponisation. Famous Chollima (Shifty Corsair) operators used Anthropic's Claude Opus to co-author a GitHub commit that introduced a malicious dependency into an autonomous cryptocurrency trading agent, enabling wallet theft from downstream users. This is not AI assisting with a phishing email. This is a large language model as a participant in the artifact creation step of a supply chain attack.

The concurrent graphalgo and Contagious Trader campaigns running in parallel - deploying remote access trojans via fake Florida LLCs, job interviews as social engineering vectors, and OtterCookie stealer via trojanised npm and PyPI packages - confirm that DPRK is operating multiple simultaneous offensive programmes across the open-source ecosystem. The SSH backdoors and Rust-compiled payloads documented in these campaigns represent significant operational maturation since the relatively straightforward JavaScript-based campaigns of 2024.

Two calibrations matter here. The first is the Rosling check: what specifically is new? Famous Chollima using AI for social engineering content and fake LinkedIn profiles was documented in Edition #108. The new element is direct LLM participation in code modification - producing syntactically clean, contextually plausible malicious changes of the kind that pass human code review because reviewers are increasingly conditioned to treat AI-generated commits as lower-risk than unknown human contributors. The second is the trust laundering dynamic that TeamPCP's PyTorch Lightning compromise also exploited: commits in that campaign were falsely attributed to Anthropic's Claude Code. Attackers are now actively weaponising AI brand credibility to defeat review processes. The countermeasure - cryptographic commit signing plus explicit scrutiny of AI-attributed dependency changes - is not yet standard practice at most organisations, including most that consider themselves security-conscious.

3. Phishing Has Crossed a Threshold. Most Detection Infrastructure Has Not.

KnowBe4's seventh Phishing Threat Trends report published this week contains a number that deserves to sit in every security team's operational picture: 86% of phishing campaigns observed in the past six months involved AI. That is up from 80% in 2024 and 84% in 2025. The trajectory is not a spike driven by a single campaign. It is a steady, compounding adoption curve, which is the harder threat to defend against. Spikes revert. Adoption curves do not.

The operational implication of the 4.5x effectiveness multiplier for AI-crafted lures over human-written ones is specific. Any phishing detection capability that relies on grammatical error detection, template matching, or behavioural heuristics calibrated against historical human-quality campaigns is now systematically undercalibrated for the majority of attacks it will encounter. The multi-vector escalation data compounds this: calendar invite abuse is up 49%, and Teams IT-support impersonation is up 41%. Attackers are treating email as the initial access vector and pivoting to collaboration channels for credential harvest, exploiting the trust differential between a slightly suspicious email and an apparently legitimate IT helpdesk message in Teams.

The infrastructure enabling this at commodity scale arrived in the Bluekit phishing-as-a-service platform, which integrates GPT-4.1, Claude, Gemini, Llama, and DeepSeek natively alongside 40 phishing templates, domain registration, anti-analysis controls (VPN/proxy blocking, fingerprint filtering), and real-time credential exfiltration to Telegram, all in a single dashboard at approximately $250 per month. The AI assistant is described as experimental. The surrounding infrastructure is not. Unit 42's Kali365 v2 PhaaS documentation adds OAuth device code flow token theft and Cloudflare worker hosting to the picture. The skill barrier to launching AI-personalised, multi-vector phishing campaigns at scale has not just lowered - it has effectively collapsed. The FBI's figure of $893 million in AI-related fraud losses, part of a record $20.87 billion in US cybercrime, is the monetisation evidence that confirms this is not theoretical.

In Brief - AI Threat Scan

🔗 AI Supply Chain & Developer Tool Abuse. PyTorch Lightning v2.6.2 and v2.6.3 were poisoned by TeamPCP (LAPSUS$-linked) in an extension of the Shai-Hulud campaign from Edition #108, injecting credential-harvesting payloads that propagate worm-like into up to 50 repository branches per victim; users should downgrade to v2.6.1 immediately and rotate all credentials. Unit 42's Agent God Mode research documents how Amazon Bedrock AgentCore's default IAM roles with wildcard resource permissions allow a compromised agent to read and poison the conversation memory of any other agent in the account - AWS updated documentation to warn against production use, but no scoping fix has been deployed to the toolkit itself.

🦠 AI-Assisted Malware Development. FortiGuard's 2025 Global Threat Landscape Report documents the commoditisation of WormGPT, FraudGPT, HexStrike AI, APEX AI, and BruteForceAI as the primary drivers of an industrialised exploit development ecosystem with 656 CVEs carrying active darknet discussion and working exploit code. A Huntress SOC report documents a confirmed incident where a developer's use of OpenAI Codex to troubleshoot a compromise generated bash commands nearly indistinguishable from attacker tradecraft, substantially degrading SOC investigation efficiency - a novel operational problem for detection teams.

🤖 AI-Enabled Social Engineering. Kali365 v2 PhaaS has emerged with dedicated OAuth device code flow exploitation and AI-based lure generation distributed via Telegram at approximately $250/month with a domain marketplace, Cloudflare worker hosting, and keyword searching for target personalisation. A large-scale campaign compromised 30,000 Facebook accounts (AccountDumpling) and separately exploited a Robinhood account creation vulnerability for phishing legitimacy - consistent with AI-automated targeting at volume.

🤖🏃 AI Autonomous & Agentic Attacks. Unit 42's systematic attack chain research against Amazon Bedrock multi-agent applications demonstrated four-stage exploitation progressing from mode detection and agent discovery through payload delivery to unauthorised tool invocation, exploiting the LLM's inability to differentiate developer instructions from adversarial input; enabling Bedrock's built-in prompt attack Guardrail blocked all demonstrated attacks. Google documents a 32% increase in indirect prompt injection attempts targeting Gemini, Copilot, and ChatGPT between November 2025 and February 2026, with current payload focus on credential exfiltration and file deletion - sophistication remains low, but volume infrastructure is established.

🛡️ AI System Vulnerabilities. The Gemini CLI CVSS 10 RCE (GHSA-wpqr-6v78-jr5g) has been patched; Cursor CVE-2026-26268 (CVSS 8.1 sandbox escape via git hooks) is patched in v2.5; CursorJacking (CVSS 8.2 SQLite credential extraction) remains unpatched. Google's prompt injection volume data confirms the sophistication gap between research-demonstrated capabilities and operational attacks is closing - the 32% volume growth is the warning signal, not the current sophistication level.

📜 AI Governance & Defensive Innovation. The UK NCSC is warning organisations to prepare for a "patch tsunami"as AI-powered vulnerability discovery tools accelerate exposure of long-standing code debt; CTO Ollie Whitehouse recommends minimising internet-facing attack surfaces, prioritising perimeter technologies, and building the operational capacity to patch quickly at scale - the practical acknowledgement that the Mythos-class capability from Edition #108 changes the defender's patching mathematics permanently.

The Bottom Line

The theme that runs through every major story this week is a single structural failure: organisations are deploying AI tooling into production at a pace that has outrun their ability to understand what trust they are granting. The Gemini CLI flaw - automatic workspace trust without sandboxing - is a design decision that made sense in a world where the tool was a local assistant. It became indefensible the moment the same tool was integrated into CI/CD pipelines with access to production secrets. The same logic applies to ClawHub skills loaded without semantic vetting, to Hugging Face models pulled without integrity verification, to AI agent OAuth scopes (the Vercel story from Edition #108) wider than any human would have consciously approved.

What North Korea figured out with PromptMink is that this trust gap extends to code review processes. A syntactically clean, contextually plausible commit attributed to Claude Opus looks safer to a reviewer than an unsigned commit from an unknown contributor. That perception is now a weapon. The countermeasure is not suspicion of AI - it is cryptographic signing and explicit review criteria that apply regardless of the stated author.

Apply the Rosling size instinct before the framing hardens. "The AI toolchain is completely compromised" is the alarming version. The calibrated version is that four specific trust boundaries were exploited this week because organisations deployed AI tooling without asking the same questions they ask of any other software dependency: what does this tool access, what can it execute, and what happens if it is compromised? Those are not novel questions. They are the supply chain questions that took the npm ecosystem a decade to institutionalise. The AI ecosystem is about three years into that same journey.

The Monday question: does your security team have an inventory of every AI coding assistant, agent framework, and model repository dependency in use across your development environments - with the same permissions audit, least-privilege enforcement, and incident response coverage you apply to production cloud credentials?

❝

Simon

Wisdom of the Week

❝

Be more concerned with your character than your reputation, because your character is what you really are, while your reputation is merely what others think you are.

John Wooden

AI Influence Level

Level 4 - Al Created, Human Basic Idea / The whole newsletter is generated via Claude workflow based on hundreds of news and research articles. Human-in-the-loop to review the selected articles and subjects.

Reference: AI Influence Level from Daniel Miessler

Till next time!

Project Overwatch is a cutting-edge newsletter at the intersection of cybersecurity, AI, technology, and resilience, designed to navigate the complexities of our rapidly evolving digital landscape. It delivers insightful analysis and actionable intelligence, empowering you to stay ahead in a world where staying informed is not just an option, but a necessity.

Buy me a Coffee

Like the content? Share Project Overwatch with your friends or colleagues