Whose voice matters? Word embeddings reveal identity bias in news quotes

Nnaemeka Ohamadike, Kevin Durrheim, Mpho Primus

Research output: Contribution to journal, Article, peer-review

Abstract

This paper investigates identity bias (gender and race) in how South African news selects and represents COVID-19 vaccination quotes. Given South Africa’s apartheid history, social bias studies have qualitatively examined race and gender bias in South African news; yet studies that quantify these biases at the speaker level, using news quotes from a representative South African news corpus, remain limited. To address this gap, we examined race and gender bias in the news selection and framing of quotes. We used word embeddings trained on 22,627 vaccination quotes from 76 South African news sources between 2020 and 2023. These embeddings, built for large-scale text processing, are unbiased by design but can learn and uncover biases hidden in language. Our findings reveal gender and race bias in the news selection and framing of quotes: journalists privilege White voices as more authoritative and connected to global and technical vaccination discourse but confine Black voices to primarily localised contexts. They also quote male speakers more frequently than female speakers. In an era where human biases are becoming increasingly implicit, we argue that embeddings offer a robust tool to unearth, monitor, and evaluate these biases at the micro or speaker level in the news.
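The abstract does not spell out how bias is read off the embeddings. As a rough illustration only, one common approach compares how close a word sits to terms associated with two speaker groups, using cosine similarity. The sketch below assumes that approach; the toy two-dimensional vectors and the names `group_a_term`, `group_b_term`, and `relative_association` are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(word_vec, group_vecs):
    """Average cosine similarity between a word and a set of group terms."""
    return sum(cosine(word_vec, g) for g in group_vecs) / len(group_vecs)


def relative_association(word_vec, group_a, group_b):
    """Positive: the word sits closer to group A; negative: closer to group B."""
    return mean_similarity(word_vec, group_a) - mean_similarity(word_vec, group_b)


# Hypothetical toy vectors; real embeddings would be trained on the quote corpus.
toy = {
    "group_a_term": [1.0, 0.0],
    "group_b_term": [0.0, 1.0],
    "global": [0.9, 0.1],
    "local": [0.1, 0.9],
}

score_global = relative_association(
    toy["global"], [toy["group_a_term"]], [toy["group_b_term"]]
)
score_local = relative_association(
    toy["local"], [toy["group_a_term"]], [toy["group_b_term"]]
)
print(f"global: {score_global:+.3f}, local: {score_local:+.3f}")
```

In this toy setup, a positive score means the word lies nearer the first group's terms and a negative score means it lies nearer the second group's. With trained embeddings, the same comparison would be run over many words and group terms rather than a single pair.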

Original language: English
Article number: 30
Journal: EPJ Data Science
Volume: 14
Issue number: 1
Publication status: Published - Dec 2025

Keywords

  • COVID-19 vaccination
  • Gender bias
  • News media
  • Race bias
  • South Africa
  • Word embedding

ASJC Scopus subject areas

  • Modeling and Simulation
  • Computer Science Applications
  • Computational Mathematics
