Developing A Dynamic WordNet for Under-Resourced Languages

Stephen Obare, Abejide Ade-Ibijola, Kennedy Ogada

Research output: Contribution to journal › Article › peer-review

Abstract

The development of WordNets has contributed to a number of tasks in Natural Language Processing (NLP). While there is growing interest in building WordNets for popular languages, there have been no major efforts for African languages, which are evolving and are commonly used by the younger generation on social media platforms. Even where such efforts are claimed, no publicly accessible work exists that comprehensively addresses the challenge of creating and updating WordNets as new words are coined and the meanings of words change. We present a novel technique, implemented in a software tool called “Sense-Mapper”, that maps Princeton WordNet 3.0 (PWN) synsets to concepts extracted from a lexical resource, detects unknown words from social media platforms, assigns senses to the unknown words, and identifies the optimal location in the WordNet at which to insert the new words, to cater for the evolving vocabulary. We assess the performance and effectiveness of Sense-Mapper using lexical resources and data generated from social media platforms in Kenya and show that the proposed tool achieved an accuracy of 87.34% in mapping senses between lexical resources and 88.75% in updating our WordNet. Sense-Mapper is expected to find application in a number of NLP tasks that require assigning senses to previously unseen or rare words and updating lexical resources.
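
As a rough illustration of two of the steps named in the abstract (detecting unknown words in social media text and inserting a new word into the WordNet), a minimal Python sketch follows. It is not the authors' Sense-Mapper implementation, whose details the abstract does not give. The function names, the dictionary-based WordNet, the frequency threshold, and the placeholder token "zzword" are all assumptions made for illustration, and the sense-assignment step is not sketched: the parent synset is simply passed in by hand.

```python
import re
from collections import Counter


def find_unknown_words(posts, known_words, min_count=2):
    """Return tokens seen at least min_count times that are absent from known_words.

    Hypothetical helper: the frequency threshold is an assumed heuristic
    for filtering out one-off typos, not a detail taken from the paper.
    """
    counts = Counter(
        token
        for post in posts
        for token in re.findall(r"[a-z']+", post.lower())
    )
    return sorted(
        word
        for word, count in counts.items()
        if count >= min_count and word not in known_words
    )


def insert_word(wordnet, word, parent_synset):
    """Attach word as a new noun synset under parent_synset.

    The WordNet here is a toy dict keyed by synset id, and the choice of
    parent is assumed to come from a separate sense-assignment step.
    """
    if parent_synset not in wordnet:
        raise KeyError(f"unknown parent synset: {parent_synset}")
    synset_id = f"{word}.n.01"
    wordnet[synset_id] = {"lemmas": [word], "hypernym": parent_synset}
    return synset_id


# Toy data: "zzword" is a made-up placeholder standing in for a newly coined word.
posts = ["maji ni uhai", "zzword iko poa", "zzword tena"]
known_words = {"maji", "ni", "uhai", "iko", "poa", "tena"}
wordnet = {"entity.n.01": {"lemmas": ["entity"], "hypernym": None}}

for word in find_unknown_words(posts, known_words):
    insert_word(wordnet, word, "entity.n.01")
```

In a real pipeline the parent synset would be chosen by the sense-assignment step rather than fixed to a root node, and tokenisation would need to handle the orthography of the target language.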

Original language: English
Pages (from-to): 1263-1288
Number of pages: 26
Journal: Journal of Information Science and Engineering
Volume: 41
Issue number: 5
Publication status: Published - 2025
Externally published: Yes

Keywords

  • natural language processing
  • social media platforms
  • under-resourced languages
  • unseen words
  • WordNet

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Hardware and Architecture
  • Library and Information Sciences
  • Computational Theory and Mathematics
