PRESENTED BY

Cyber AI Chronicle

By Simon Ganiere · 5th May 2024

Welcome back!

Project Overwatch is a cutting-edge newsletter at the intersection of cybersecurity, AI, technology, and resilience, designed to navigate the complexities of our rapidly evolving digital landscape. It delivers insightful analysis and actionable intelligence, empowering you to stay ahead in a world where staying informed is not just an option, but a necessity.

Table of Contents

What I learned this week

TL;DR

  • In the news this week, Change Healthcare's CEO testified on the ransomware attack the company suffered earlier this year. It brought your weekly reminder of security 101: lack of MFA on Citrix devices. Interesting twist: those Citrix devices apparently came in via a recent acquisition.

  • On the back of the US Cyber Safety Review Board report and multiple security issues over the last couple of years, Microsoft has decided to “make security our top priority at Microsoft, above all else”. Microsoft outlined three security principles and six security pillars, all of this being tied to some of Microsoft’s leadership compensation.

  • Interesting counter-arguments on the autonomous hacking capabilities of LLMs. I mentioned the initial research paper in a previous edition. We are certainly not (yet?) in a world where autonomous hacks can be conjured out of thin air; none of the current LLMs are up to that. The initial research paper was also a bit obscure about the prompts used and the overall setup.

  • Talking about agentic LLMs, this is what I have been looking at this week. Not that you can fully deep dive into this in one week 😉 but you have to start somewhere. Currently LLMs are good at generating text, and this is how most of us use them: ask a question, get some content back. Agentic workflows open up a whole new world, as you can assign them tasks, tell them to use tools and, the interesting part from my perspective, get multiple agents to collaborate. There has been a lot of talk about agentic LLMs, so let's deep dive into some of the design patterns and the most important concepts.

Is Agentic THE Future?

I spent the week watching some very informative videos and reading some really good articles on agentic LLMs. It's not the first time I'm mentioning agents. I have been playing with them for a while and decided to take a step back and start with the basic architecture of an agent-based application.

So what are agents and what do they bring to an LLM?

An LLM-based agent (agentic LLM) is an advanced unit that uses an LLM as its brain to think, make decisions, and take actions to complete certain tasks. These agents can also have memory, both short-term (the agent's train of thought) and long-term (conversation histories). They can also access tools for carrying out tasks such as searching the web, using calculators, etc.

The typical architecture of an agent looks like the following:

Typical Agent Architecture - Source (after ~ 1min)

The agent is the core, or central coordination, module. It manages the core logic of what you need the agent to do: the definition of that logic, of its role and goals, a backstory (context or examples), the list of tools it needs or can use, etc.

The memory is basically a store of the agent's internal logs and interactions. With it, the agent can significantly enhance its capabilities, as it can go back to previous interactions and answers. There are two types of memory modules:

  • Short-term memory: temporarily stores recent interactions and outcomes, enabling agents to recall and utilise information relevant to their current context.

  • Long-term memory: preserves valuable insights and learning from past executions, allowing agents to build and refine their knowledge over time. Long-term memory can span conversations and history across weeks or months.
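In code, the two memory modules can be sketched as simple stores. This is a minimal illustration; the class names and the list/deque backing are my own assumptions, not any specific framework's API:

```python
from collections import deque


class ShortTermMemory:
    """Keeps only the most recent interactions: the agent's working context."""

    def __init__(self, capacity=5):
        # A bounded deque: old entries fall off automatically.
        self.buffer = deque(maxlen=capacity)

    def add(self, entry):
        self.buffer.append(entry)

    def recall(self):
        return list(self.buffer)


class LongTermMemory:
    """Preserves all past interactions so the agent can refine its knowledge."""

    def __init__(self):
        self.store = []

    def add(self, entry):
        self.store.append(entry)

    def search(self, keyword):
        # Naive keyword lookup; a real agent would typically use
        # embeddings and a vector store here.
        return [e for e in self.store if keyword in e]
```

The key design difference is visible in the two `add` methods: short-term memory forgets by construction, while long-term memory only ever grows and is queried selectively.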

Tools are well-defined executable workflows that agents can use to execute tasks. Most of the time a tool leverages a third-party API, for example to check the weather in a particular location or to fetch the latest information from a news website. Another typical use case would be a RAG pipeline to get context from internal documents.
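A tool is ultimately just a callable the agent can pick from a registry. A minimal sketch, assuming a hypothetical weather API (the endpoint, field names and `make_weather_tool` helper are illustrative, not a real service; the `fetch` function that does the actual HTTP call is injected so the sketch runs without network access):

```python
def make_weather_tool(fetch):
    """Wrap a third-party weather API as an agent tool.

    `fetch(url)` performs the HTTP request and returns parsed JSON; it is
    injected so the tool stays testable offline. The endpoint and the
    response fields below are placeholders, not a real API.
    """
    def get_weather(location: str) -> str:
        data = fetch(f"https://api.example-weather.com/v1/current?city={location}")
        return f"{location}: {data['temp_c']}°C, {data['condition']}"
    return get_weather


# The agent sees its tools as a name -> callable registry it can choose from.
# Here the fetch is stubbed with a fixed response for illustration.
TOOLS = {
    "get_weather": make_weather_tool(
        fetch=lambda url: {"temp_c": 18, "condition": "cloudy"}
    ),
}
```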

Planning requires a little more explanation. Solving complex problems usually requires breaking the problem down into smaller pieces and taking a step-by-step approach. The best analogy I can think of is that of a business analyst. After spending time understanding the question or process, the business analyst can break the problem down into smaller chunks. In the world of agents, those chunks are called tasks. They basically describe what an agent needs to do. For example, a task could be to get the latest news based on keywords. That task is assigned to an agent and can use one or more tools (like an API call to Google News).
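Those chunks can be sketched as plain data, one task per chunk. The `Task` fields and the news-digest breakdown below are illustrative assumptions, not any framework's schema:

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    description: str                            # what the agent needs to do
    agent: str                                  # which agent it is assigned to
    tools: list = field(default_factory=list)   # tools the task may use


# Breaking "produce a news digest" into smaller, ordered tasks:
tasks = [
    Task("Get the latest news for the given keywords",
         agent="researcher", tools=["google_news_api"]),
    Task("Summarise each article in two sentences", agent="writer"),
    Task("Review the summaries for accuracy", agent="editor"),
]
```

Note that the breakdown, not the model, carries the structure: each task is small enough for one agent and declares up front which tools it is allowed to use.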

Tasks are one of the key elements; however, another key aspect of an agent is its ability to “reason”, leveraging techniques such as reflection, chain of thought, subgoal decomposition or self-critique. Most of these are prompting techniques that can be embedded directly in the agent. They also provide a way to create a loop that ensures the output of the task is actually achieved.
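That loop can be sketched generically: generate, critique, and retry until the critic is satisfied or a retry budget runs out. Here `generate` and `critique` stand in for LLM calls; the function is a minimal illustration of the self-critique pattern, not any framework's API:

```python
def run_with_reflection(generate, critique, max_rounds=3):
    """Self-critique loop: produce an answer, have it critiqued, and
    retry with the feedback until the critic returns None (satisfied)
    or we hit max_rounds. Both callbacks stand in for LLM calls."""
    feedback = None
    for _ in range(max_rounds):
        answer = generate(feedback)     # feedback from the previous round, if any
        feedback = critique(answer)
        if feedback is None:            # critic satisfied -> task achieved
            return answer
    return answer                       # best effort after max_rounds


# Demo with stubs: the "model" needs one round of critique before passing.
attempts = iter(["rough draft", "final answer"])
result = run_with_reflection(
    generate=lambda feedback: next(attempts),
    critique=lambda answer: None if answer == "final answer" else "too rough, retry",
)
```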

Langchain has published quite a few blog posts that go into more detail, for example on Reflection Agents or Plan-and-execute Agents (and more on the Langchain website).

All of this enables an LLM to be used in a very similar way to the processes we interact with every day. You can take more or less any process and break it down into an agent model. Assuming you have the right tools, of course, but the key point here is the same one I made last week: not everything has to be LLM-powered. You don't need an LLM to give you information about a CVE. You need an agent with a specific task that has a tool that can get you accurate and complete information about that CVE.
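The CVE point can be sketched as exactly such a tool, here assumed to sit on top of the NVD REST API. The `make_cve_tool` helper and the injected `fetch` are illustrative; the response shape matches the NVD CVE API 2.0, but verify against the official documentation before relying on it:

```python
def make_cve_tool(fetch):
    """Tool returning accurate CVE details from an authoritative source
    (the NVD REST API here) instead of asking the LLM to recall them.

    `fetch(url)` performs the HTTP call and returns parsed JSON; it is
    injected so the sketch runs without network access.
    """
    def cve_lookup(cve_id: str) -> dict:
        data = fetch(
            f"https://services.nvd.nist.gov/rest/json/cves/2.0?cveId={cve_id}"
        )
        cve = data["vulnerabilities"][0]["cve"]
        return {
            "id": cve["id"],
            "description": cve["descriptions"][0]["value"],
        }
    return cve_lookup


# Stubbed fetch mimicking the NVD response shape, for illustration only.
lookup = make_cve_tool(
    fetch=lambda url: {
        "vulnerabilities": [
            {"cve": {"id": "CVE-2024-0001",
                     "descriptions": [{"lang": "en", "value": "Example flaw."}]}}
        ]
    }
)
```

The LLM's job shrinks to deciding *when* to call the tool and how to present the result; the factual payload comes from the authoritative source.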

Conclusion

In the end, agents can solve more complex problems and questions than normal LLM interactions. In particular, they enable the following:

  • Perform specialised and complex tasks: leveraging tools, agents can perform tasks that a typical LLM cannot, for example by connecting to APIs or performing complex mathematical calculations.

  • Real-time and dynamic interaction: big LLM models are usually trained on data up to a specific date. Agents enable real-time and dynamic answers from an LLM by connecting to different services, pulling data from the internet, etc.

  • Better precision and reliability: LLMs have significant difficulties reasoning and providing precise answers (not to mention hallucinations). Agents can improve accuracy as they use techniques such as chain of thought.

The real value of LLMs will only become apparent when complex tasks can be solved. Having an LLM generate or summarize text is great; having an agentic LLM that can take real action based on real-time information is probably THE key next step.

By the way, if you don't believe me on how an agent-based approach can improve the results of an LLM, check the screenshot below. It clearly shows that an agent-based approach can significantly improve the outcome (here for a coding benchmark).

Worth a full read

No, LLM Agents can not Autonomously Exploit One-day Vulnerabilities

Key Takeaway

  • Researchers used a small dataset of 15 public vulnerabilities, mostly web-based, for their study.

  • GPT-4 reportedly exploited 87% of vulnerabilities with CVE descriptions, but only 7% without, suggesting reliance on existing information.

  • The author argues the agent's success is due to web search abilities, not autonomous analysis or exploitation skills.

  • Public exploits for the studied vulnerabilities are simple and easily found online, questioning the research's novelty.

  • LLM agents' ability to exploit vulnerabilities largely depends on accessing existing online information, not autonomous capability.

  • Misinterpretations of AI capabilities in cybersecurity can reinforce false narratives about AI risks and controls.

Goldman Sachs - Marco Argenti - CIO Interview

Key Takeaway

  • Marco Argenti, Goldman Sachs' CIO, prioritizes transforming the bank's engineering culture to align closely with business strategies.

  • He introduced nine engineering tenets at Goldman Sachs to foster a culture of purposeful building and client-focused development.

  • Argenti emphasizes the importance of asking "why" before "how" in engineering projects, reversing traditional approaches.

  • The CIO writes bi-weekly emails to engineers, focusing on concepts like the difference between speed and velocity in project execution.

  • Tackling technical debt is a priority, with a culture that celebrates its retirement as much as new project releases.

  • The bank explores generative AI for business growth, productivity, and complex financial modeling, emphasizing the importance of time in alpha generation.

  • Changing the culture of engineering within finance is key to integrating technology with business strategy effectively.

  • The future of banking technology lies in real-time data processing and leveraging AI for decision-making and efficiency.

Some more reading

An attacker could run you up a huge AWS bill just by sending rejected requests to an S3 bucket and there’s little you can do about it » READ

Change Healthcare hackers broke in using stolen credentials — and no MFA, says UHG CEO » READ

Microsoft rolls out passkey auth for personal Microsoft accounts » READ

Growing number of threats leveraging Microsoft API » READ

Microsoft overhaul treats security as “top priority” after a series of failures » READ

(nothing against Microsoft…but well they are in the news 😁 ) Microsoft signed the largest ever power deal in order to feed the huge energy requirement for AI » READ

Financial Times (FT) signs its content over to OpenAI to train its GPT models…in return, OpenAI will work with FT to develop new AI features for its readers » READ

Apple quietly released OpenELM models designed to run efficiently on devices like iPhones and Macs » READ

Verizon Data Breach Investigations Report for 2024 has been released » READ

Deepfake and AI-Driven Disinformation Threaten Polls » READ

Wisdom of the week

A contrarian isn’t one who always objects — that’s a conformist of a different sort. A contrarian reasons independently, from the ground up, and resists pressure to conform.

Naval Ravikant

Contact

Let me know if you have any feedback or any topics you want me to cover. You can ping me on LinkedIn or on Twitter/X. I’ll do my best to reply promptly!

Thanks! see you next week! Simon
