For more than 65 years, the federal government has funded MITRE's security research in many fields—cancer research, radar technology, GPS, and, of course, cyber security. Nowadays, the big topic is generative AI backed by large language models (LLMs). At this year's Black Hat conference, MITRE sent a team of presenters to showcase tests they're conducting to determine whether LLMs will enhance cyber operations or open new security holes.
Are LLMs Dangerous or Helpful?
About a year ago, MITRE started fielding questions about the potential security risks of LLMs, said Michael Kourematis, a principal adversary emulation engineer. Without a way to test LLMs, however, it's difficult to know if they can generate or identify malicious code.
"We're trying to make progress on answering that question," he said. That effort includes a series of tests the MITRE team outlined here at Black Hat.
Marisa Dotter, a senior machine learning engineer, introduced the first test, which runs an LLM through a set of multiple-choice questions about a simulated cyber-ops scenario. She emphasized that they test the basic, unaugmented LLM with no special tuning.
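To make the idea of a multiple-choice test concrete, here is a minimal sketch of how such a harness could be scored. It is not MITRE's code: the `Question` fields, the prompt wording, the sample question, and the `ask_model` callable (a stand-in for whatever sends a prompt to an untuned LLM and returns its reply) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Question:
    """One multiple-choice item (hypothetical format, not MITRE's)."""
    scenario: str       # description of the simulated situation
    prompt: str         # the question asked about the scenario
    options: List[str]  # answer choices, labeled A, B, C, ... in order
    answer: str         # letter of the correct choice, e.g. "A"


def format_prompt(question: Question) -> str:
    """Render the scenario, question, and lettered options as plain text."""
    lines = [question.scenario, question.prompt]
    for letter, option in zip("ABCDEFGH", question.options):
        lines.append(f"{letter}. {option}")
    lines.append("Answer with a single letter.")
    return "\n".join(lines)


def score(questions: List[Question], ask_model: Callable[[str], str]) -> float:
    """Return the fraction of questions the model answers correctly."""
    if not questions:
        return 0.0
    correct = 0
    for question in questions:
        reply = ask_model(format_prompt(question)).strip().upper()
        # Treat the first character of the reply as the chosen letter.
        if reply[:1] == question.answer:
            correct += 1
    return correct / len(questions)


if __name__ == "__main__":
    # Made-up sample item, for illustration only.
    sample = [
        Question(
            scenario="An analyst sees many failed logins from one address, then a success.",
            prompt="Which activity does this most likely indicate?",
            options=["Password guessing", "Routine patching", "Disk failure"],
            answer="A",
        )
    ]
    # Stub model that always answers "A"; a real test would call an LLM here.
    print(score(sample, lambda prompt: "A"))
```

Passing the model in as a plain function keeps the scoring logic independent of any particular LLM, so the same question set could be run against different unaugmented models and the results compared.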