Improving Cause-of-Death Classification from Verbal Autopsy Reports

Thokozile Manaka, Terence van Zyl, Deepak Kar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

In many low- and middle-income countries, including South Africa, access to data in health facilities is restricted by patient privacy and confidentiality policies. Further, because clinical data is unique to individual institutions and laboratories, data annotation standards and conventions are insufficient. As a result of this scarcity of textual data, natural language processing (NLP) techniques have fared poorly in the health sector. In places without reliable death registration systems, a cause of death (COD) is often determined from a verbal autopsy (VA) report. A non-clinician field worker compiles the VA report, using a set of standardized questions as a guide to uncover the symptoms of a COD. This analysis uses the textual part of the VA report as a case study to address the challenge of adapting NLP techniques to the health domain. We present a system that relies on two transfer learning paradigms, monolingual learning and multi-source domain adaptation, to improve VA narratives for the target task of COD classification. We use Bidirectional Encoder Representations from Transformers (BERT) and Embeddings from Language Models (ELMo) models pre-trained on the general English and health domains to extract features from the VA narratives. Our findings suggest that this transfer learning system improves COD classification and that the narrative text contains valuable information for determining a COD. Our results further show that combining binary VA features with narrative text features learned via this framework improves COD classification.
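
The feature-combination idea in the abstract can be illustrated with a minimal sketch. It assumes the binary VA indicators and the narrative embeddings are already available as plain numeric vectors. The toy numbers stand in for BERT/ELMo features, the function names and the labels `cod_a`/`cod_b` are hypothetical, and the nearest-centroid classifier is a stand-in chosen for brevity; the abstract does not say which classifier the authors used.

```python
from collections import defaultdict
from math import dist


def combine_features(binary, *embeddings):
    """Concatenate binary VA indicators with one or more narrative embeddings."""
    combined = [float(b) for b in binary]
    for emb in embeddings:
        combined.extend(emb)
    return combined


def fit_centroids(rows, labels):
    """Average the feature vectors of each class (nearest-centroid training)."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], row))


# Toy records: (binary indicators, general-domain embedding,
# health-domain embedding, label). All values are made up for illustration.
train = [
    ([1, 0], [0.9, 0.1], [0.8, 0.2], "cod_a"),
    ([1, 0], [0.8, 0.2], [0.7, 0.1], "cod_a"),
    ([0, 1], [0.1, 0.9], [0.2, 0.8], "cod_b"),
    ([0, 1], [0.2, 0.7], [0.1, 0.9], "cod_b"),
]
rows = [combine_features(b, g, h) for b, g, h, _ in train]
labels = [label for *_, label in train]
centroids = fit_centroids(rows, labels)

query = combine_features([1, 0], [0.85, 0.15], [0.75, 0.2])
prediction = predict(centroids, query)
print(prediction)
```

In a fuller pipeline the two embedding vectors would come from pre-trained language models applied to the narrative text, and the combined vector would feed whichever classifier suits the data.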

Original language: English
Title of host publication: Artificial Intelligence Research - Third Southern African Conference, SACAIR 2022, Proceedings
Editors: Anban Pillay, Edgar Jembere, Aurona Gerber
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 46-59
Number of pages: 14
ISBN (Print): 9783031223204
Publication status: Published - 2022
Event: 3rd Southern African Conference on Artificial Intelligence Research, SACAIR 2022 - Stellenbosch, South Africa
Duration: 5 Dec 2022 to 9 Dec 2022

Publication series

Name: Communications in Computer and Information Science
Volume: 1734 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 3rd Southern African Conference on Artificial Intelligence Research, SACAIR 2022
Country/Territory: South Africa
City: Stellenbosch
Period: 5/12/22 to 9/12/22

Keywords

  • Cause of death
  • Monolingual learning
  • Multi-domain adaptation
  • Natural language processing
  • Transfer learning

ASJC Scopus subject areas

  • General Computer Science
  • General Mathematics
