Abstract
Does race bias manifest in South African news, and how can computational methods like word embeddings reveal it? After apartheid’s end in 1994, South Africa implemented policies to address racial and economic divides and transform institutions and structures, including the news media. This study introduces a computational approach to quantify race bias in South African news using neural embeddings. We trained word2vec word embeddings on COVID-19 vaccination news articles from 76 South African news sources. These large-scale embeddings are unbiased by design but can detect and reveal hidden biases in language. We found consistent race bias in the coverage of socioeconomic phenomena, while results for health were weaker, mixed, and likely corpus-dependent. COVID-19 may also have amplified associations between “Black” and unhealthy terms in news coverage. Our methodology complements traditional qualitative techniques and allows for a more objective and representative way of investigating racism in South African news. Findings are validated through multiple methods, including human ratings, and have implications for South African news and this research field.
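
As a rough illustration of how embedding-based association measures of this general kind can work, the sketch below compares how close a group term sits to two sets of attribute terms using cosine similarity. It is not the study's actual procedure, which the abstract does not detail: the function names, the two-dimensional toy vectors, and the placeholder labels (`group_a`, `term_x`, and so on) are all assumptions made for illustration. In practice the vectors would come from a trained word2vec model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_similarity(target_vec, attribute_vecs):
    """Average cosine similarity of one target vector to a set of attribute vectors."""
    return sum(cosine(target_vec, a) for a in attribute_vecs) / len(attribute_vecs)

def association_gap(target_vec, attrs_first, attrs_second):
    """Positive when the target is closer to attrs_first than to attrs_second."""
    return mean_similarity(target_vec, attrs_first) - mean_similarity(target_vec, attrs_second)

# Hypothetical toy vectors standing in for trained embeddings (assumption).
toy = {
    "group_a": [1.0, 0.0],
    "group_b": [0.0, 1.0],
    "term_x": [0.9, 0.1],
    "term_y": [0.1, 0.9],
}

gap_a = association_gap(toy["group_a"], [toy["term_x"]], [toy["term_y"]])
gap_b = association_gap(toy["group_b"], [toy["term_x"]], [toy["term_y"]])
print(f"group_a gap: {gap_a:+.3f}")
print(f"group_b gap: {gap_b:+.3f}")
```

A gap that differs systematically between group terms would suggest that the corpus associates the groups with the attribute sets unequally; any real analysis would also need validation of the kind the abstract mentions.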
| Original language | English |
|---|---|
| Article number | 83 |
| Journal | EPJ Data Science |
| Volume | 14 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - Dec 2025 |
Keywords
- COVID-19 vaccination
- Computational social science
- Natural language processing
- News media
- Race bias
- South Africa
- Speaker names
- Word embedding
ASJC Scopus subject areas
- Modeling and Simulation
- Computer Science Applications
- Computational Mathematics