Vandana Janeja and Christine Mallinson have received a two-year, $300,000 grant from the National Science Foundation (NSF) to study deepfakes, focusing on audio clips. Deepfakes are images, videos, and sounds that are developed using artificial intelligence (AI) technology but are designed to appear as authentic, real-life recordings. They can be highly deceptive, shaping public opinion and behavior.
Through their NSF Early-Concept Grant for Exploratory Research (EAGER) award, Janeja and Mallinson will study and evaluate listener perceptions of audio deepfakes that have been created with varying degrees of linguistic complexity. The study will include training sessions to help listeners discern audio deepfakes. Informed by that training and by linguistic labels, the project will also develop data science algorithms to support deepfake detection. More broadly, the project will establish a new pathway for collaborative public-impact research across the social sciences and computing.
UMBC was already engaged in multidisciplinary work bridging computing and the social sciences when NSF launched this initiative. “We can’t solve big societal issues with an AI algorithm alone,” explains Janeja, professor and chair of information systems. She notes that collaboration between researchers in computing and sociolinguistics is essential to address complex, real-world problems that involve both technology and communication.
Evaluating listener perceptions
Deepfakes can contribute to the rapid spread of misinformation. The threat of deepfakes on social media has received some visibility, but they can appear in other contexts as well.
Janeja highlights an example, recently covered in The New York Times, in which an employee at a well-known investment banking company flagged that the voice of a person on the other end of a call sounded digitally altered. After the call, the company determined that the caller was a leader from a media company posing as a leader at another firm.
With this type of scenario in mind, the research team will develop training sessions to help listeners improve their ability to recognize audio deepfakes with varying degrees of linguistic complexity, says Janeja, principal investigator (PI) on the grant. They will then evaluate the efficacy of those training sessions to help the listeners protect themselves against deception by audio deepfakes. Using linguistic features, the research team will also create data science algorithms to augment the information that a listener is presented with.
The resulting tools will empower listeners to evaluate the accuracy and authenticity of information they encounter online, explains Mallinson, professor of language, literacy, and culture (LLC) and director of UMBC’s Center for Social Science Scholarship, who is also co-PI on the award. Participants will receive sociolinguistic training to help them develop a more finely tuned ear for distinguishing linguistic details, and they will draw upon that information as they evaluate deepfakes.
Open-access tools
Mallinson’s work focuses on language as a socially and culturally embedded phenomenon. She explains that the linguistic complexity of audio deepfakes makes it challenging for listeners to distinguish them from natural speech and identify them as inauthentic misinformation. At the same time, linguistic training and tools can help address these challenges. By working together, experts in computing and linguistics can disentangle this complexity.
The EAGER grant is “high risk, high reward,” she says. It involves approaching a challenging phenomenon in an entirely new way and building bridges across disciplines. Students in both data science and the social sciences will develop the uncommon skill of identifying audio deepfakes, Mallinson explains. Success would mean helping people protect themselves against deception by deepfakes and increasing the equitability of AI technology.
Janeja and Mallinson’s project team will include UMBC data science scholars as well as Sara Khanjani, Ph.D. ‘24, information systems, and Lavon Davis, incoming LLC Ph.D. student. Khanjani also completed initial research informing the grant, along with Gabrielle Watson ‘21, information systems. That work explored college students’ audio deepfake perceptions.
Khanjani looks forward to creating tutorials that can better prepare people to spot deepfakes. The team’s series of online educational modules will be openly accessible, helping members of the public improve their critical listening and discernment skills.
Ultimately, Mallinson says, this interdisciplinary research in sociolinguistics and data science will better prepare people to navigate emerging communication issues in today’s technologically complex world.
By establishing an innovative pathway for collaborative research that fully integrates sociolinguistics, human-centered analytics, and data science, Mallinson and Janeja hope the study will also lay the groundwork for future analyses of deepfakes that are broadly relevant to all of these fields.