#105 - North Korea's AI Malware, Claude Code Exploit, and the AI Supply Chain Breach

Cyber AI Chronicle

By Simon Ganiere · 5^th April 2026

Welcome back!

Microsoft Threat Intelligence published - in three separate dispatches - forensic-level documentation of North Korean APT groups using AI across every phase of the kill chain. Not "AI-assisted phishing." AI-generated malware with emoji markers left in the code as artefacts of how it was built. Voice-cloning software used live on job interview calls. Agentic workflows for end-to-end infrastructure provisioning. This is operational, attributed, at scale. The OtterCookie malware samples are sitting in labs with the fingerprints of a chatbot still on them.

Meanwhile, a critical prompt injection vulnerability in Anthropic's Claude Code quietly lets a malicious CLAUDE.md file bypass the entire permission system - no warning to the user, no indication anything went wrong. And TeamPCP's multi-stage supply chain campaign burned through Trivy, KICS, LiteLLM, and Axios, landing in the European Commission's AWS environment and exposing proprietary training data from Meta, OpenAI, and Anthropic along the way.

The question this week isn't whether AI is being weaponised. It's whether the AI tools your teams are deploying have security architectures commensurate with the access they're being given.

❝

If you have been enjoying the newsletter, it would mean the world to me if you could share it with at least one person 🙏🏼 and if you really really like it then feel free to offer me a coffee ☺️

Simon

AI Threat Tempo

🤖 AI-Enabled Social Engineering: ↑↑ Significant escalation

AI-enhanced phishing campaigns documented at 54% click-through rate vs ~12% baseline - a 450% effectiveness increase attributed directly to AI-enabled role targeting and content precision
North Korean Jasper Sleet operationalising voice-changing software in live job interviews; Faceswap and GAN-generated deepfake identity documents deployed at scale

Significance: The economics of spear-phishing have collapsed. If AI can generate a convincing targeted lure at the cost of a commodity prompt, your click-rate assumptions from three years ago are wrong.

🏴‍☠️ Nation-State AI Operations: ↑↑↑ Landmark week

Four North Korean APT clusters (Jasper Sleet, Coral Sleet, Sapphire Sleet, Emerald Sleet) confirmed operating AI across the full attack lifecycle - recon, persona fabrication, malware development, post-compromise data triage
Iranian MOIS-linked Boggy Serpens (MuddyWater) now deploying LampoRAT with strong indicators of AI-generated code; China's TA416 resurfaces targeting EU/NATO diplomatic missions with evolved PlugX delivery chains

Significance: North Korea has crossed from AI-assisted into AI-integrated. The distinction matters: integration means the techniques don't disappear when the tooling is disrupted.

💀 AI-Augmented Ransomware & Cybercrime: ↑ Active

Tycoon2FA (Storm-1747) phishing-as-a-service platform - responsible for approximately 62% of phishing volume Microsoft blocked at peak - seized via 330-domain operation by Microsoft and Europol
TeamPCP supply chain exfiltration linked to Vect and CipherForce ransomware groups for follow-on extortion; $285M Drift Protocol DeFi heist by DPRK used AI-enhanced social engineering as initial access

Significance:The Tycoon2FA disruption matters but the model remains intact - AI-enabled PhaaS platforms are a commodity infrastructure class now, not a single group's advantage.

🔗 AI Supply Chain & Model Attacks: ↑↑↑ Critical - dominant theme of the week

TeamPCP executed a cascading supply chain campaign: Trivy → KICS → LiteLLM → Axios, with each compromise used to harvest credentials for the next target; European Commission's AWS environment breached via stolen Trivy API key
LiteLLM attack exposed OpenAI and Anthropic credentials plus proprietary AI training datasets from Meta and OpenAI including Meta's 'Chordus' initiative data
North Korean Sapphire Sleet attributed to Axios npm supply chain attack: 100M+ weekly downloads, WAVESHAPER.V2 implant, 15-second-from-install compromise

Significance: The AI development ecosystem is now the supply chain. LLM API keys are a credential class that didn't exist two years ago - and most organisations don't have controls built around their exposure.

🔬 AI Platform Vulnerabilities & Prompt Injection: ↑↑↑ Multiple critical findings

Claude Code v2.1.88: critical prompt injection via malicious CLAUDE.md bypasses entire deny-rule permission system silently; Anthropic source code leak (512,000 lines) compounds risk by exposing enforcement logic
OpenAI ChatGPT: DNS-based covert channel silently exfiltrated user data bypassing all AI guardrails; Codex: command injection via unsanitised GitHub branch names enabled GitHub token theft
Palo Alto AdvJudge-Zero: 99% bypass rate against LLM-as-a-judge security gatekeepers using benign-looking stealth control tokens invisible to WAFs

Significance: LLM-native security controls - guardrails, deny rules, safety layers - are not substitutes for independent enforcement. Three major AI platforms demonstrated this in the same week.

Interesting Stats

❝

450% - The increase in phishing click-through rates when AI is used for role-based targeting and content precision. 54% vs ~12% baseline, documented by Microsoft. This is not a marginal improvement; it's a structural change in what "average phishing" means.

❝

99% - AdvJudge-Zero's bypass rate against LLM-as-a-judge security gatekeepers. The tool requires no white-box access, uses human-readable inputs, and evades WAF detection. If you are relying on an LLM to enforce your AI safety policy, this number should concern you.

❝

18 - The number of DPRK cryptocurrency thefts tracked by Elliptic in 2026 alone, with over $300M stolen year-to-date as of April. The $285M Drift Protocol heist last week included an attack that drained five vaults in under ten seconds after three weeks of pre-positioning. North Korea's crypto theft operation is industrialised.

Are you tracking agent views on your docs?

AI agents already outnumber human visitors to your docs — now you can track them.

See your AI traffic!

Three Things Worth Your Attention

1. North Korea's AI Integration Is Forensically Confirmed

Microsoft published what amounts to a landmark intelligence report - actually three dispatches from the same underlying research - documenting four North Korean APT clusters (Jasper Sleet, Coral Sleet, Sapphire Sleet, Emerald Sleet) operating AI not as a supplementary tool but as a structural component of their attack infrastructure.

The forensic detail is what separates this from previous AI threat reporting. Recovered OtterCookie malware samples contain emoji markers and conversational inline comments - artefacts of how the code was written, left in by AI coding tools that threat actors failed to sanitise before deploying to production. This is not a researcher's hypothesis about what might be happening. It is forensic evidence of what did happen, sitting in a sandbox.

Coral Sleet is running agentic AI workflows for end-to-end lure development: fake company websites, remote infrastructure provisioning, and payload deployment, with LLM jailbreaking used to generate the malicious code. Jasper Sleet is using Faceswap and GAN-generated photos for fraudulent identity documents at scale, with voice-changing software deployed during live job interviews - not to fool a machine, but to fool the humans conducting them. Microsoft disrupted thousands of fraudulent IT worker accounts linked to these operations, which tells you something about the scale the AI tooling is enabling.

The practical implication isn't "AI makes attackers scary." It's more specific. If your organisation hires remote IT contractors and relies on a video call for identity verification, the assumptions that process is built on no longer hold. If your threat model for insider-style access treats it as a rare, high-effort attack, that cost assumption has changed. Liveness detection and multi-session unscheduled verification should be in scope for any high-trust remote hiring process.

2. The AI Supply Chain Has a Credential Problem Nobody Built Controls For

TeamPCP's campaign this week was instructive not for its sophistication - supply chain attacks aren't new - but for what it exposed about how the AI development ecosystem is structured.

The sequence: compromise Trivy (widely used security scanner in CI/CD pipelines) to harvest credentials → use those credentials to poison LiteLLM (an AI API routing library in approximately 36% of cloud environments) → exfiltrate LLM API keys, cloud tokens, Kubernetes secrets, and SSH keys from 500,000 machines → compromise Axios npm (100M+ weekly downloads, attributed by Microsoft to Sapphire Sleet / DPRK) via the same credential chain → breach the European Commission's AWS environment using a stolen Trivy API key, sitting there with management rights.

What's notable is the specific credential class being targeted: LLM API keys. OpenAI credentials and Anthropic credentials were among the exfiltrated assets, and they are already appearing on BreachForums. This is a new credential category - one that didn't exist in your threat model three years ago - and most organisations have not applied the same rotation and monitoring discipline to LLM API keys that they apply to AWS access keys or service account tokens.

The Wired reporting on Mercor adds a second dimension: proprietary AI training data is now a target. Meta paused all work with Mercor after training datasets for the 'Chordus' project were exposed. AI training datasets - including the bespoke, secret ones that define competitive AI capability - are sitting inside contractor environments that may not be secured to the same standard as the model weights themselves. That's a gap worth examining if your organisation is involved in any form of AI development or procurement.

3. Three AI Platforms Demonstrated the Same Lesson in One Week

The Claude Code vulnerability, the OpenAI ChatGPT exfiltration flaw, and the AdvJudge-Zero research all landed in the same week. Individually, each is significant. Together, they make a single argument with considerable force.

Claude Code v2.1.88 has a vulnerability where a malicious CLAUDE.md file crafts a 50+ subcommand pipeline, exceeds a performance cap, and triggers a fallback that silently skips the entire deny-rule permission system. No user warning. No indication the security controls were bypassed. SSH keys, AWS credentials, GitHub tokens - available for exfiltration. And Anthropic's concurrent source code leak (512,000 lines of TypeScript via a debugging sourcemap on npm) hands adversaries the blueprint for the enforcement logic.

ChatGPT's Linux runtime contained a DNS-based covert channel that silently exfiltrated user messages and uploaded files by exploiting the AI platform's own assumption that its execution environment was isolated. The system bypassed its own guardrails because it didn't recognise the behaviour as a data transfer. This is patched, but it was not theoretical - and the mechanism is the kind of thing that emerges when AI tools are given broad system access without independent monitoring.

AdvJudge-Zero is research, not an active incident, but it demonstrates that LLM-as-a-judge systems - increasingly used as automated gatekeepers for enterprise AI policy enforcement - achieve 99% bypass rates against inputs that appear benign to humans and WAFs alike. The attack requires only user-level access.

The common thread is that AI-native security controls - deny rules, guardrails, safety layers, LLM judges - are not the same class of control as a network ACL or an IAM policy. They are probabilistic, they can be fooled by carefully crafted inputs, and they fail silently. Independent enforcement layers, non-LLM controls sitting between your organisation and your AI tools, and dedicated AI interaction telemetry are not optional additions. At this point, they're table stakes.

In Brief - AI Threat Scan

🤖 AI-Enabled Attacks Microsoft at RSAC 2026 framed AI as evolving from attack tool to attack surface - with Tycoon2FA's MFA-bypassing PhaaS achieving 54% phishing click rates before 330 domains were seized by Microsoft and Europol in March.

Microsoft's HashJack research documents a prompt abuse technique embedding malicious instructions in URL fragments that AI summarisation tools ingest as prompt context - invisible to servers, WAFs, and network monitoring.

🏴‍☠️ Nation-State AI Activity China-linked TA416 re-emerged targeting EU and NATO diplomatic missions after a two-year regional hiatus, deploying PlugX via Microsoft OAuth redirect abuse and Azure Blob-hosted payloads - geopolitically timed.

Chinese hackers are actively exploiting CVE-2026-3502 in TrueConf video conferencing via the "TrueChaos" campaign, replacing legitimate update packages with ShadowPad-carrying malware on on-premises servers serving government entities across Southeast Asia.

💀 AI in Ransomware / Cybercrime North Korea's Drift Protocol heist - $285M stolen in under 10 seconds after 20 days of AI-enhanced social engineering to pre-sign multisig transactions and manipulate oracle pricing - is the 18th DPRK crypto theft tracked by Elliptic in 2026.

Qilin ransomware hit Die Linke, a German parliamentary party, with double extortion - the party characterised the Russian-speaking group as having both financial and political motivations, framing it as potential hybrid warfare.

🔗 AI System Vulnerabilities Boggy Serpens (MuddyWater) has deployed LampoRAT with strong indicators of AI-generated code, using hijacked government and corporate email accounts as a "trusted relationship compromise" model to bypass spam filters in multi-wave campaigns across the Middle East.

📜 AI Policy & Regulation OWASP confirmed prompt injection as the #1 LLM vulnerability in its 2025 Top 10; Microsoft's operational detection playbook using Defender for Cloud Apps, Purview DLP, and Sentinel is the most concrete enterprise response framework published this week against AI prompt abuse.

🏢 Enterprise AI Risk CVE-2025-55182 - a CVSS 10.0 RCE in Next.js React Server Components - is being actively exploited by UAT-10608, who deploy the NEXUS Listener framework to extract OpenAI and Anthropic API keys, AWS/GCP/Azure credentials, SSH keys, and database strings from compromised hosts. Over 766 instances confirmed compromised.

Three China-linked clusters (Mustang Panda, Earth Estries, Unfading Sea Haze) simultaneously targeted a single Southeast Asian government across a six-month window, deploying nine distinct malware families with Dropbox-based C2 - indicating high-priority state-directed tasking.

☁️ Cloud Attacks TeamPCP's pivot to AWS used TruffleHog to validate stolen cloud credentials within 24 hours of exfiltration, then leveraged ECS Exec to run commands directly on container workloads - a cloud-native lateral movement technique that many organisations don't have detection coverage for.

The Bottom Line

The week's data points to something more specific than "AI threats are increasing." What changed this week is that the distance between attacker capability and AI tool vulnerability closed on both sides simultaneously.

North Korea's AI integration is now forensically documented - not inferred, not modelled, but evidenced in the malware artefacts they left behind. The emoji in the OtterCookie code is not incidental. It tells you they're using the same LLM coding tools your developers use, they're not sanitising the output, and they're moving fast enough that they don't care. Speed is the point. AI lowers the cost per operation, compresses the timeline, and lets smaller teams run larger campaigns. That's the economics of the threat shift.

On the other side, three AI platforms this week demonstrated that AI-native security controls fail in qualitatively different ways from traditional controls. A firewall rule either blocks or it doesn't. An LLM guardrail can be flipped by a markdown symbol at 99% reliability, or silently bypassed by a too-long command pipeline, or circumvented via a DNS channel the platform itself didn't recognise as a data transfer. The failure mode is silent, contextual, and probabilistic. That means the governance assumption - "we have guardrails in place" - does not carry the same weight as "we have an access control in place."

One thing that looks scarier than it is: the claim that AI is enabling "fully autonomous" attacks. Microsoft explicitly noted at RSAC that fully autonomous AI campaigns are not the dominant model. Human operators still retain strategic control. AI is accelerating, not replacing. That's still a serious problem, but it's a different problem than self-directed malware making targeting decisions. Don't let the agentic AI narrative distort your threat model toward science fiction when the operational reality is faster, cheaper, human-directed attacks.

The Monday morning question: does your AI tool deployment inventory exist? Not your approved AI tools list - the actual inventory of what developer machines have installed, what API keys are present in those environments, and what those tools can access. If that inventory doesn't exist, this week's CVE-2025-55182 exploitation and TeamPCP's credential harvest suggest it should.

Wisdom of the Week

❝

Every storm has two purposes:
to destroy what isn’t solid and to reveal what is.

AI Influence Level

Level 4 - Al Created, Human Basic Idea / The whole newsletter is generated via a n8n workflow based on publicly available RSS feeds. Human-in-the-loop to review the selected articles and subjects.

Reference: AI Influence Level from Daniel Miessler

Till next time!

Project Overwatch is a cutting-edge newsletter at the intersection of cybersecurity, AI, technology, and resilience, designed to navigate the complexities of our rapidly evolving digital landscape. It delivers insightful analysis and actionable intelligence, empowering you to stay ahead in a world where staying informed is not just an option, but a necessity.

Buy me a Coffee

Like the content? Share Project Overwatch with your friends or colleagues