Experts say “emotion recognition” lacks scientific foundation
Emotion recognition is a hot new area, with numerous companies peddling products that claim to be able to read people’s internal emotional states, and AI researchers looking to improve computers’ ability to do so. This is done through voice analysis, body language analysis, gait analysis, eye tracking, and remote measurement of physiological signs like pulse and breathing rates. Most of all, though, it’s done through analysis of facial expressions.
A new study, however, strongly suggests that these products are built on a bed of intellectual quicksand.
The key question is whether human emotions can be reliably determined from facial expressions. “The topic of facial expressions of emotion, whether they’re universal, whether you can look at someone’s face and read emotion in their face, is a topic of great contention that scientists have been debating for at least 100 years,” Lisa Feldman Barrett, Professor of Psychology at Northeastern University and an expert on emotion, told me. Despite that long history, she said, no one had ever conducted a comprehensive assessment of the emotion research done over the past century. So, several years ago, the Association for Psychological Science brought together five distinguished scientists from various sides of the debate to conduct “a systematic review of the evidence testing the common view” that emotion can be reliably determined by external facial movements.
The five scientists “represented very different theoretical views,” according to Barrett, who was one of them. “We came to the project with very different expectations of what the data would show, and our job was to see if we could find consensus in what the data shows and how to best interpret it. We were not convinced we could, just because it’s such a contentious topic.” The process, expected to take a few months, ended up taking two years.
Nevertheless, in the end, after reviewing over 1,000 scientific papers in the psychological literature, these experts came to a unanimous conclusion: there is no scientific support for the common assumption “that a person’s emotional state can be readily inferred from his or her facial movements.”
The scientists conclude that there are three specific misunderstandings “about how emotions are expressed and perceived in facial movements.” The link between facial expressions and emotions is not reliable (i.e., the same emotions are not always expressed in the same way), specific (the same facial expressions do not reliably indicate the same emotions), or generalizable (the effects of different cultures and contexts have not been sufficiently documented).
As Barrett put it to me, “A scowling face may or may not be an expression of anger. Sometimes people scowl in anger, sometimes you might smile, or cry, or just seethe with a neutral expression. Also, people scowl at other times — when they’re confused, when they’re concentrating, when they have gas.”
The scientists conclude:
These research findings do not imply that people move their faces randomly or that [facial expressions] have no psychological meaning. Instead, they reveal that the facial configurations in question are not “fingerprints” or diagnostic displays that reliably and specifically signal particular emotional states regardless of context, person, and culture. It is not possible to confidently infer happiness from a smile, anger from a scowl, or sadness from a frown, as much of current technology tries to do when applying what are mistakenly believed to be the scientific facts.
This paper is significant because an entire industry of technologies that purport to read emotions automatically is quickly emerging. As we wrote in our recent paper on “Robot Surveillance,” the market for emotion recognition software is forecast to reach at least $3.8 billion by 2025. Emotion recognition (aka “affect recognition” or “affective computing”) is already being incorporated into products for purposes such as marketing, robotics, driver safety, and (as we recently wrote about) audio “aggression detectors.”
Emotion recognition is based on the same underlying premise as polygraphs, aka “lie detectors”: that physical body movements and conditions can be reliably correlated with a person’s internal mental state. They cannot, and that very much includes facial muscles. What is true of facial muscles, it stands to reason, would also be true of the other methods of detecting emotion, such as body language and gait.
The belief that such mind reading is possible, however, can do real harm. A jury’s cultural misunderstanding about what a foreign defendant’s facial expressions mean can lead them to sentence him to death, for example, rather than prison. Translated into automated systems, that belief could lead to other harms; a “smart” body camera falsely telling a police officer that someone is hostile and full of anger could contribute to an unnecessary shooting.
As Barrett put it to me, “there is no automated emotion recognition. The best algorithms can encounter a face — full frontal, no occlusions, ideal lighting — and those algorithms are very good at detecting facial movements. But they’re not equipped to infer what those facial movements mean.”
Blog by Jay Stanley, Senior Policy Analyst, ACLU Speech, Privacy, and Technology Project.