DEF CON Hackers to Attack AI Models in Largest-Ever Public Exercise to Find Vulnerabilities

Over the next few days, over 3,000 hackers and security researchers will descend upon the DEF CON 31 hacker conference in Las Vegas to compete in a new event: Hacking AI models. The event is being organized by top tech companies as part of an initiative to address the security risks posed by rapidly advancing AI and Large Language Model (LLM) technology.

Leading AI companies in the U.S. have committed to opening their models to red-teaming at the event, meaning that security researchers will be able to attack them in an attempt to find vulnerabilities. The models that will be targeted include those from Anthropic, Google, Hugging Face, Microsoft, NVIDIA, OpenAI, Meta, and Stability AI.

Models to be red-teamed will primarily include large language models, with the goal to stress-test and discover if any model poses embedded risks to hallucinations, jailbreaks, and bias.

Many of these same tech companies recently signed a pledge to voluntarily contribute to cybersecurity and external ethical AI testing in accordance with a White House initiative.

Chris Rohlf, a security engineer at Meta, said that recruiting a larger group of workers with diverse perspectives to red-team AI systems is “something that’s hard to recreate internally” or by “hiring third-party experts.” By opening Meta’s AI models to attack at DEF CON, Rohlf said he hopes it will help “us find more issues, which is going to lead us to more robust and resilient models in the future.”

The event will be hosted at the AI Village at DEF CON and is expected to draw thousands of security researchers. Participants will be given laptops to use to attack the models. Any bugs discovered will be disclosed using industry-standard responsible disclosure practices.

Merging AI with Cybersecurity

The DEF CON event is a significant step in the effort to improve the security of AI models. By allowing hackers to poke and prod at these models in an open setting, the event will help to identify and mitigate vulnerabilities that could be exploited by malicious actors.

“There is this sort of space of AI and data science and security — and they’re not strictly the same,” Daniel Rohrer, NVIDIA’s vice president of product security architecture and research, told CyberScoop in an interview. “Merging those disciplines, I think is really important.” 

The event is also significant because it demonstrates the growing interest in AI security among the hacking community. In recent years, there has been a rise in the number of attacks against AI systems. These attacks have ranged from relatively simple ones, such as injecting malicious code into training data, to more sophisticated ones, such as using adversarial machine learning techniques to fool AI systems into making wrong decisions.

Researchers and AI tech companies alike are hoping to gain substantial data points for future AI model refinement. Opening the models to red-teaming attackers will deliver inexpensive, independent auditing from expert hackers and researchers from around the world.

DEF CON Red-Teaming Results to be Published

According to the AI Village blog, the reports of the AI red-teaming will be published. “The more people who know how to best work with these models, and their limitations, the better. This is also an opportunity for new communities to learn skills in AI by exploring its quirks and limitations,” the announcement reads.

Generative AI is sweeping the globe and will continue to change both defensive and offensive cybersecurity. If you’d like to learn more about generative AI, there are 10 free courses available for anyone to take on Google Cloud.


Discover more from Cybersecurity Careers Blog

Subscribe to get the latest posts sent to your email.