Speech recognition systems misunderstand Black users 35 percent of the time, study finds

We've all had the frustrating experience of voice assistants like Siri or Alexa failing to understand what we're saying, but according to a recent study, it may be happening to Black users far more often than to others.

Published on Monday, the study from Stanford researchers examined speech recognition systems from Amazon, Apple, Google, IBM, and Microsoft. The systems were tasked with transcribing interviews conducted with 42 white speakers and 73 Black speakers.

Researchers wrote that each system "exhibited substantial racial disparities." On average, the systems misidentified white speakers' words about 19 percent of the time. For Black speakers, the error rate leapt to 35 percent. And if that weren't enough, 20 percent of audio from Black speakers was marked unreadable, compared to just 2 percent of audio from white speakers.
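For context on what those percentages measure: speech recognition accuracy is usually reported as word error rate (WER), the minimum number of word substitutions, insertions, and deletions needed to turn the system's transcript into the human reference transcript, divided by the reference length. Here's a minimal illustrative sketch of the metric in Python (our own example, not the researchers' code):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming (Levenshtein) edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return d[len(ref)][len(hyp)] / len(ref)

# One dropped word out of six: WER of about 0.17. A WER of 0.35, as measured
# for Black speakers in the study, means roughly one word in three is wrong.
print(word_error_rate("she had your dark suit on", "she had dark suit on"))
```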

This isn't the first time bias in speech recognition systems has come up. In 2018, Slate reported on how these systems fail people with speech disabilities. A tool that could make day-to-day life easier becomes inaccessible because the system was never trained to understand you.

When the New York Times approached the five companies, only one responded. Google told the outlet, "We've been working on the challenge of accurately recognizing variations of speech for several years, and will continue to do so."

One likely culprit is sampling: you can't train systems on largely homogeneous datasets and expect them to work well for everyone. The same pattern appeared in a study that found self-driving cars are more likely to hit darker-skinned pedestrians.
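To see how a disparity like this gets detected, an audit simply computes the error rate separately for each demographic group and compares the averages, which is the general shape of the Stanford analysis. A rough sketch, reusing the word_error_rate function above (the records below are invented for illustration, not drawn from the study's data):

```python
from statistics import mean

# Hypothetical audit records: (speaker group, human reference, system transcript).
# These strings are made up for illustration only.
results = [
    ("white", "turn on the kitchen lights", "turn on the kitchen lights"),
    ("white", "set a timer for ten minutes", "set a time for ten minutes"),
    ("Black", "turn on the kitchen lights", "turn the kitchen light"),
    ("Black", "set a timer for ten minutes", "set time for ten minute"),
]

# Group the per-utterance error rates by speaker group.
by_group = {}
for group, ref, hyp in results:
    by_group.setdefault(group, []).append(word_error_rate(ref, hyp))

# A large gap between group averages is the disparity the study measured.
for group, rates in by_group.items():
    print(f"{group}: mean WER = {mean(rates):.2f}")
```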

"This is yet another example of sampling bias that demonstrates the discriminatory impact on certain communities," AI expert Sandra Wachter told Business Insider. "Compared to 'traditional' forms of discrimination ... automated discrimination is more abstract and unintuitive, subtle, intangible, and difficult to detect."

And while it may be tempting to suggest the solution is simply to collect more audio of Black speakers, tech companies have gone about expanding datasets the wrong way before. In May 2019, Google launched Project Euphonia to improve speech recognition for people with speech disabilities. However, the company received pushback for failing to pay the people who provided that data.

Then in October 2019, reports found that Google had used questionable methods to target darker-skinned people in an effort to improve its facial recognition. Contractors were reportedly instructed to conceal that people's faces were being recorded and, when subjects were told, to offer them only $5.

Artificial intelligence continues to have problems with encoded biases. But it's worth asking: if these companies have struggled time and time again, can they be trusted to do better?