
Adversarial program Gremlin tests neural networks with an evolutionary algorithm

Gremlin, software developed by researchers at the US Department of Energy’s (DOE’s) Oak Ridge National Laboratory (ORNL) to identify weaknesses in neural networks, has been recognized with a 2022 R&D 100 Award in the Software/Services category.

Neural networks are algorithms used to recognize patterns in datasets of text, images, or sounds. As the underlying technology of AI, neural networks are already being used for applications in medicine, business, and transportation, with more on the way.

“There’s been a lot of effort since 2012 to develop AI for a wide range of applications, but how do you really test and evaluate it when it comes to operational conditions? That was the motivation behind Gremlin. We wanted to develop a new system for the sole purpose of testing and evaluating AI,” said Robert Patton, head of ORNL’s Learning Systems Group and leader of the Gremlin development team.

Gremlin acts as an adversarial program: it dynamically generates data or scenarios that the neural network may not recognize from its training, revealing how the network may behave in unfamiliar situations. This can expose potential areas of failure, allowing programmers to improve their networks before they’re put into production. Such advance testing could be especially useful in safety-critical applications, such as autonomous vehicles that rely on AI to avoid obstacles.

“Neural networks are only as good as the training data you provide. When you train it, you might get very good results, but then when you go to be operational, you might encounter data you’ve never seen before. The neural network won’t know what to do, and it breaks,” Patton said. “That can happen a lot in autonomous vehicles, where you train it on a particular dataset, but then operationally you might encounter a lighting condition or weather condition that you didn’t have in your training data, and the neural network goes, ‘I don’t know what to do.’”


ORNL research scientists (from left) Robert Patton, Mark Coletti, and Quentin Haas have won a 2022 R&D 100 Award in recognition of their development of Gremlin, software that tests and evaluates neural networks in AI. Photo by Carlos Jones/ORNL.

Patton’s team tested Gremlin on their own autonomous driving neural network, which was developed with their award-winning AI software system, Multinode Evolutionary Neural Networks for Deep Learning (MENNDL). MENNDL uses an evolutionary algorithm to design optimized neural networks by evaluating thousands of networks in a matter of hours, depending on the power of the computer. Using CARLA, an open-source simulator for autonomous driving research, the team inserted Gremlin to generate different driving scenarios on the fly to test the MENNDL-designed neural network—and it uncovered a bias in their network’s training data.

“With our autonomous vehicle application, most of the time people are driving straight. If you think about it, we spend very little time turning. Our training data had that bias in it, which is a natural bias. But Gremlin helped us identify it: ‘You’re not doing very well in the turns.’ So, we decided to change that bias in our training data to include more turns, and we were able to improve the driving behavior of the neural network,” Patton said.
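The article does not say how the team added more turns to its training data. As one illustration only, the sketch below rebalances a toy dataset by duplicating randomly chosen minority-class samples until turns reach a chosen share. The function `oversample`, the labels, and the 30 percent target are all hypothetical and are not taken from Gremlin or MENNDL.

```python
import math
import random
from collections import Counter


def oversample(samples, label_of, minority, target_fraction, seed=0):
    """Return a new list in which `minority` samples make up at least
    `target_fraction` of the data, by duplicating minority samples.

    Illustrative assumption: random duplication is one simple way to
    rebalance; the article does not describe the team's actual method.
    """
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be between 0 and 1")
    minority_items = [s for s in samples if label_of(s) == minority]
    if not minority_items:
        raise ValueError("no minority samples to duplicate")
    majority_count = len(samples) - len(minority_items)
    # Smallest minority count m with m / (m + majority_count) >= target_fraction.
    needed = math.ceil(target_fraction * majority_count / (1.0 - target_fraction))
    extra = max(0, needed - len(minority_items))
    rng = random.Random(seed)
    return samples + [rng.choice(minority_items) for _ in range(extra)]


# Toy dataset with the "natural bias" described above: mostly straight driving.
data = [("straight", i) for i in range(90)] + [("turn", i) for i in range(10)]
balanced = oversample(data, label_of=lambda s: s[0], minority="turn",
                      target_fraction=0.3)
counts = Counter(label for label, _ in balanced)
print(counts)
```

Duplicating samples is the simplest option; collecting or simulating genuinely new turning scenarios would add more variety, at a higher cost.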

Gremlin is based on the PhD dissertation that Patton wrote nearly 20 years ago about an adversarial program for software testing. In 2019, Patton brought in evolutionary computation expert Mark Coletti to apply that concept to AI. Over the next 2 years, Coletti wrote most of Gremlin’s software, which is entirely new except for its use of the Library for Evolutionary Algorithms in Python, which Coletti codevelops.

“I’m also a musician, and Gremlin works similarly to how musicians learn new music. When I encounter a difficult phrase of music, I will go over that phrase many times to—hopefully—master it. Similarly, Gremlin can find difficult situations for a neural network so that we can add more examples of those situations for it to repeat during training,” said Coletti, an ORNL research scientist in AI and machine learning.
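The idea of searching for difficult situations with an evolutionary algorithm can be pictured with a small toy loop. This is a minimal sketch under stated assumptions, not Gremlin’s code and not the API of the Library for Evolutionary Algorithms in Python: the scenario encoding (curvature and brightness), the stand-in `model_score` function, and all parameter values are hypothetical.

```python
import random


def model_score(scenario):
    """Hypothetical stand-in for a model under test.

    Returns a performance score (higher is better) for a scenario encoded
    as (curvature, brightness), each in [0, 1]. This toy model does worse
    on sharp curves and in dim light.
    """
    curvature, brightness = scenario
    return max(0.0, 1.0 - 0.7 * curvature - 0.3 * (1.0 - brightness))


def mutate(scenario, rng, sigma=0.1):
    """Perturb each gene with Gaussian noise and clip back to [0, 1]."""
    return tuple(min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in scenario)


def find_hard_scenarios(score_fn, pop_size=20, generations=30, seed=0):
    """Evolve scenarios toward those on which the model scores lowest."""
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [mutate(parent, rng) for parent in population]
        # The adversary "wins" when the model does poorly, so survivors are
        # the scenarios with the LOWEST model score among parents and offspring.
        population = sorted(population + offspring, key=score_fn)[:pop_size]
    return population


hardest = find_hard_scenarios(model_score)
print("hardest scenario:", hardest[0], "score:", model_score(hardest[0]))
```

In a real setting, each score would come from running the neural network in a simulator, which is the expensive step and the reason many scenarios would be evaluated in parallel.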

Other adversarial programs that use evolutionary algorithms may exist, but Gremlin is unique in the AI field. Although developed for autonomous driving, Gremlin can be used in any application that incorporates a neural network to control a device.

“Most existing systems are one-shot implementations to demonstrate a specific research concept—they aren’t readily generalizable to other problems. Gremlin was designed to be flexible enough to quickly apply it to new problems,” Coletti said.

Moreover, Gremlin can scale to computing resources ranging from a laptop to a supercomputer, such as the Oak Ridge Leadership Computing Facility’s (OLCF’s) Summit or Frontier. Indeed, Gremlin was designed with Summit in mind because the software must evaluate up to thousands of different autonomous driving scenarios simultaneously, and this analysis would be extremely difficult and time-consuming on a smaller computing cluster. Quentin Haas, an ORNL software engineer in AI and machine learning, developed the software infrastructure and container support that provided the foundations for Gremlin and enabled it to scale on Summit.

Gremlin is open-source software available for download from Coletti’s GitHub page.

Overseen by science and technology media group R&D World, the annual R&D 100 Awards are decided by an outside panel of nearly 50 research and development experts from around the world. The awards recognize “new commercial products, technologies, and materials for their technological significance that are available for sale or license.” The R&D 100 Awards will be presented on November 17 in Coronado, California.

The OLCF is a DOE Office of Science User Facility at ORNL.

UT-Battelle LLC manages ORNL for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, visit

Coury Turczyn

Coury Turczyn writes communications content for the Oak Ridge Leadership Computing Facility (OLCF). He has worked in different communications fields over the years, though much of his career has been devoted to local journalism. He helped create Knoxville’s alternative news weekly, Metro Pulse, in 1991 and served as its managing editor for nine years; he returned to the paper in 2006 as its editor-in-chief, until its closure in 2014. Several months later, he and two other former Metro Pulse editors dared to start a new weekly paper, the Knoxville Mercury, based on a hybrid nonprofit ownership model. In between his journalism stints, Coury worked as a web content editor for, the G4 cable TV network, and