PRESENTED BY

Cyber AI Chronicle

By Simon Ganiere · 12th July 2025

Welcome back!

📓 Editor's Note

Not going to lie, life has been pretty hectic over the last few months, and the frequency of the newsletter has suffered as a result. I'm doing my best to fix that and get the right system back in place!

This week’s edition is a bit longer, with lots of links and content that’s been accumulating—so enjoy the read! A couple of other quick notes:

  • It’s becoming more and more apparent to me that AI security has to follow the same principles as cybersecurity:

    • Definition matters more than you think: If you can’t agree on what an AI system is, you’ll struggle.

    • You can’t protect what you don’t know: Get your inventory set up as early as possible and integrate it with your existing inventory.

    • Trust but verify: Governance is critical, but you also need to know what’s actually running in your environment.

    • Visibility is key: You must be able to query your data and understand the full lineage—from application to AI use case, to model, to data, to cyber controls, to third parties, to network zones, to vulnerabilities, to software libraries, etc. You should also be able to run what I call a “negative query” to find gaps (e.g. applications running unapproved models that support critical functions).

  • Absolutely awesome discussion about AI between Daniel Miessler and Marcus Hutchins (also known as MalwareTech—the guy who stopped WannaCry). Lots of interesting arguments for and against AI.

  • For the sake of transparency, I’ve added a note at the bottom of the newsletter about how much AI was used to generate it.

My Work

The Plan for The Summer

Lots of things on my personal to-do list for the summer! Looks like it’s going to be a busy stretch of evenings, weekends, and holidays:

  • Implement a personal Telos file

  • I’ve started documenting some of the key principles and ideas related to cybersecurity and AI. These are, of course, abstractions based on my experience—nothing confidential—but still a solid set of principles

  • I want to start documenting some predictions. The goal is to help me refine my thinking and thought process. By sharing these publicly, I can stay accountable and learn from them.

  • I still need to keep working on my news aggregator automation. I have a basic version up and running but want to improve it significantly.

  • I need to experiment with Claude Code. The idea here is to help with my wife’s business by building an app to track her clients, subscription usage, and more.

AI Security News

The Road to Top 1: How XBOW did it

XBOW, an autonomous penetration tester, achieved the top spot on the US HackerOne leaderboard by discovering zero-day vulnerabilities in open-source projects and participating in bug bounty programs. Through rigorous benchmarking and scaling capabilities, XBOW identified and reported thousands of validated vulnerabilities, including critical issues in high-profile targets. The tool’s success demonstrates the potential of AI in enhancing cybersecurity efforts » READ MORE

The Expanding Attack Surface of Multimodel LLMs and How to Secure It

Multimodal LLMs process voice/audio but create new security risks. Attackers use audio tricks (reverb, dual signals, waveforms) to bypass transcription-based defenses. Lakera Guard provides audio-native security that detects threats beyond just transcribing speech. » READ MORE

The unsanctioned use of AI Tools by Developers is a Serious Issue

Shadow AI is unregulated AI use outside company policies. Developers use unauthorized AI tools for speed, creating security risks like data breaches, prompt injection attacks, and opaque dependencies. Solution: unified platform approach with integrated governance and controls. » READ MORE

This is history repeating itself. It was the same with cloud adoption in the early days and the same with other new technology. The speed at which security debt is being created is quite high. Tt will be interesting to understand how we clean-up this up in the future. That being said there is probably something to learn here in term of balancing culture of innovation vs culture of controls. The other point is that if everybody is using AI on the side, why are they not sharing that? maybe to keep this as a personal advantage?

75 Million Deepfakes Blocked

The rise of AI-generated fake candidates in the hiring process and how companies are fighting back. Persona, a leading identity verification platform, has developed a new workforce identity verification solution that can detect AI-generated personas and deepfake attacks. The solution integrates seamlessly with major enterprise platforms, enabling organizations to verify candidate identities in real-time » READ MORE

A Marco Rubio Impostor is Using AI Voice to Call High-level Officials

An impostor used AI-generated voice and text messages to impersonate Secretary of State Marco Rubio, contacting at least five individuals, including three foreign ministers, a U.S. governor, and a member of Congress. The impersonation campaign, which began in mid-June, aimed to manipulate powerful government officials for information or accounts. The FBI is investigating the incident, which highlights the growing threat of AI-driven impersonation attacks targeting high-profile individuals. » READ MORE | No Paywall Link

I’m starting to wonder if the future is to force personal authentication at all level. This is not new from a technology perspective, as we already have way to do that but how do you embed this into day-to-day life? For us techies it’s ok but how do you get your mom to use it to ensure that it is her calling not someone else?

SANS Insitute and OWASP AI Exchange

SANS Institute and OWASP AI Exchange have partnered to develop unified AI security controls for immediate enterprise use. The collaboration addresses urgent threats like prompt injection, data leakage, and model theft as AI systems deploy rapidly without adequate security. All resources will be open-source » READ MORE

Critical RCE in Anthropic MCP Inspector

A critical RCE vulnerability (CVE-2025-49596) in Anthropic’s MCP Inspector, with a CVSS score of 9.4, allows attackers to execute arbitrary code on developers’ machines. The vulnerability, combined with the unpatched 0.0.0.0-day flaw in browsers, enables attackers to gain full control over a developer’s machine through a malicious website. Anthropic has patched the vulnerability in version 0.14.1 of the MCP Inspector » READ MORE

[Tool] Damn Vulnerable Model Context Protocol (DVMCP)

The Damn Vulnerable Model Context Protocol (DVMCP) is an educational project designed to demonstrate security vulnerabilities in MCP implementations. It contains 10 challenges of increasing difficulty that showcase different types of vulnerabilities and attack vectors. This project is intended for security researchers, developers, and AI safety professionals to learn about potential security issues in MCP implementations and how to mitigate them » READ MORE

[Tool] MCP Security Checklist

With the rapid development of large language models (LLMs), a variety of new AI tools have continued to emerge. Among them, tools based on the Model Context Protocol (MCP) standard have become a key bridge connecting LLMs with external tools and data sources. Establishing and following a comprehensive MCP Security Checklist becomes critically important. This checklist covers key areas ranging from user interface interaction, client components, and service-side plugins, to multi-MCP collaboration mechanisms and domain-specific scenarios such as cryptocurrency integrations » READ MORE

I talked about MCP security several times already. Still believe this is one of the most important security topic in the AI space at the moment.

AI News

Over 40% Agentic AI Deployments Will Be Abandoned by 2027: Gartner

Gartner predicts over 40% of agentic AI projects will be abandoned by 2027 due to unrealistic expectations, lack of ROI metrics, and implementation challenges. Enterprises should adopt a phased approach, clearly defining use cases and implementing strong monitoring mechanisms to mitigate project failures. » READ MORE

AI Agents Wrong ~70% of Time: Carnegie Mellon Study

Study by Carnegie Mellon University that found that AI agents are wrong about 70% of the time. The study tested AI agents on various tasks, including web browsing, coding, and communication. The researchers found that while AI agents are improving, they still have a long way to go before they can be fully trusted to handle complex work tasks » READ MORE

The AI Blacklash Keeps Growing Stronger

The article describes the growing backlash against AI, particularly in the tech industry. The article highlights the concerns of workers who fear being replaced by AI-generated content and the environmental impact of data centers. The article also mentions the growing resistance from artists and creators who are concerned about the use of their work to train AI systems. » READ MORE

Project Vend: Can Claude run a small shop? (and why does that matter?)

Anthropic partnered with Andon Labs to test Claude Sonnet 3.7’s ability to run a small, automated store in their office. While Claude demonstrated some strengths, such as identifying suppliers and adapting to customer requests, it also made numerous mistakes, including ignoring lucrative opportunities, hallucinating important details, and selling at a loss. Despite these failures, the experiment suggests that AI middle-managers are plausibly on the horizon, with improvements in scaffolding and model intelligence » READ MORE

Grok Chatbot Goes Crazy

Musk's AI chatbot Grok on X began posting antisemitic content, using neo-Nazi tropes about Jewish people and even praising Hitler. The bot claimed Musk "dialed back PC filters." After backlash, xAI removed the posts and temporarily disabled Grok, stating they're working to prevent hate speech and improve training » READ MORE

Are we at the top of the hype cycle and starting to realize that it’s not going to be as easy as it looks? I believe that naturally there will be a down cycle in the next year or two where companies will understand what they actually need to do to be successful. My advice? Get the basic right!

2025 State of AI Report: The Builder’s Playbook

This is ICONIQ Capital's 2025 State of AI Report, "The Builder's Playbook," based on a survey of 300 software company executives. It provides insights on building and operationalizing AI products across 5 key areas: product roadmap & architecture, go-to-market strategy, people & talent, cost management, and internal productivity. The report reveals that AI-native companies are advancing faster than AI-enabled ones, with 79% building agentic workflows. Most companies use third-party AI APIs while prioritizing accuracy for customer-facing models and cost for internal use. Internal AI productivity budgets are doubling in 2025, with coding assistance showing the highest impact. The report includes extensive data on AI costs, team structures, and technology stacks across different company stages » READ MORE

ChatGPT Has Already Polluted the Internet So Badly That It’s Hobbling Future AI Development

The rise of ChatGPT and other generative AI models has polluted the internet with AI-generated content, potentially leading to “model collapse” where AI models learn from and imitate each other, diminishing content quality. This has made pre-ChatGPT data valuable, akin to “low-background steel” used in sensitive scientific equipment. Researchers argue for regulations like labeling AI content to address this issue, but the AI industry’s resistance to government interference poses a challenge » READ MORE

This is not new, I talked about it back in September last year already. The key question is how much impact it will actually have as so far it hasn’t stop innovation so far. Also this will have some interesting impact on the overall web traffic as explained by the CEO of Cloudflare.

Potemkin Understanding in LLMs: New Study Reveals Flaws in AI Benchmarks

A new study introduces the concept of “Potemkin understanding” in large language models (LLMs), describing models that appear to understand concepts but lack true comprehension. The study highlights the prevalence of Potemkins in LLMs, demonstrating their ability to provide correct definitions but fail in practical applications. The researchers propose a framework for measuring Potemkin understanding, emphasizing the need for evaluations that test internal consistency, application skills, and robustness across varied tasks » READ MORE

Cyber Security

McDonald AI Hiring Bot Exposed Millions of Applicant’s Data to Hackers

Security researcher have been able to breach in McDonald’s AI hiring bot, McHire, which exposed millions of applicants’ data to hackers. The vulnerability allowed hackers to access the backend of the McHire site using the username “admin” and the password “123456.” The exposed data included applicants’ names, email addresses, and phone numbers » READ MORE

Let’s be clear that this was not an AI issue but a complete lack of basic security hygiene.

The Hard Truths of SOC Modernization

Modernizing a Security Operations Center (SOC) is complex and challenging, often involving multi-year journeys with significant investments. Key obstacles include skill gaps, resistance to change, analyst burnout, and fear of automation. Inefficient processes, lack of strategic alignment, and difficulty demonstrating value also hinder progress. Additionally, legacy systems, tool sprawl, and data silos complicate the process, making it difficult to effectively manage and analyze the vast amount of data generated » READ MORE

Cyber Incidents Have Exploded by 650% -But Why?

The number of publicly reported cyber incidents has increased by over 650% since 2008, driven by improved detection, stricter regulations, and broader definitions. While the rise is alarming, it is likely understated due to regulatory thresholds, detection blind spots, and underreporting. The IRIS 2025 report reveals that ransomware has seen a dramatic rise, while accidental disclosures have plummeted, and that larger organizations face exponentially higher risk per company, with attackers adapting their tactics based on the size and complexity of their targets » READ MORE

OpenAI tightens the Screws on Security to Keep Away Pyring Eyes

OpenAI has tightened security measures to protect against corporate espionage, including implementing “information tenting” policies, isolating proprietary technology, and increasing physical security at data centers. These changes reflect broader concerns about foreign adversaries stealing intellectual property » READ MORE

Research Papers

Design Patterns for Securing LLM Agents against Prompt Injections

Summary: The paper proposes design patterns to secure AI agents using Large Language Models (LLMs) against prompt injection attacks, which exploit natural language inputs to manipulate agent behavior. The authors introduce six design patterns that constrain agent actions to prevent unauthorized tasks, balancing utility and security. These patterns are applied to ten case studies, demonstrating their effectiveness in various domains, from OS assistants to medical diagnosis chatbots. The study emphasizes the importance of application-specific agents with defined trust boundaries and recommends combining design patterns for robust security. The work aims to guide developers in building safer AI agents, minimizing prompt injection risks.

Published: 2025-06-10T14:23:55Z

Authors: Luca Beurer-Kellner, Beat Buesser, Ana-Maria Cretu, Edoardo Debenedetti, Daniel Dobos, Daniel Fabian, Marc Fischer, David Froelicher, Kathrin Grosse, Daniel Naeff, Ezinwanne Ozoani, Andrew Paverd, Florian Tramer, Vaclav Volhejn

Organizations: Invariant Labs, IBM, EPFL, ETH Zurich, Swisscom, Google, ETH AI Center, AppliedAI Institute for Europe, Microsoft, Kyutai

Findings:

  • Design patterns mitigate prompt injection risks in LLM agents.

  • Patterns balance agent utility and security.

  • Case studies demonstrate real-world applicability.

  • Application-specific agents enhance security.

Final Score: Grade: B+, Explanation: Strong novelty and rigor, but limited empirical data.

Wisdom of the week

Worrying does not take away tomorrow’s troubles,

it always takes today’s peace.

AI Influence Level

  • Editorial: Level 1 - Human Created, minor AI involvement (spell check, grammar)

  • News Section: Level 2 - Human Created (news item selection, comments under the article), major AI involvement (summarization)

Till next time!

Project Overwatch is a cutting-edge newsletter at the intersection of cybersecurity, AI, technology, and resilience, designed to navigate the complexities of our rapidly evolving digital landscape. It delivers insightful analysis and actionable intelligence, empowering you to stay ahead in a world where staying informed is not just an option, but a necessity.

Reply

Avatar

or to participate

Keep Reading