Investigating Sentiment-Bearing Words- and Emoji-based Distant Supervision Approaches for Sentiment Analysis

Koena Ronny Mabokela, Mpho Raborife, Turgay Celik

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Sentiment analysis focuses on the automatic detection and classification of opinions expressed in texts. Emojis can be used to determine the sentiment polarities of the texts (i.e. positive, negative, or neutral). Several studies demonstrated how sentiment analysis is accurate when emojis are used (Kaity and Balakrishnan, 2020). While they have used emojis as features to improve the performance of sentiment analysis systems, in this paper we analyse the use of emojis to reduce the manual effort in labelling text for training those systems. Furthermore, we investigate the manual effort reduction in the sentiment labelling process with the help of sentiment-bearing words as well as the combination of sentiment-bearing words and emojis. In addition to English, we evaluated the approaches with the low-resource African languages Sepedi, Setswana, and Sesotho. The combined emoji and word sentiment lexicon shows better performance than the emoji-only and word-based lexicons. Our results show that our emoji sentiment lexicon approach is effective, with an accuracy of 75%, higher than other sentiment lexicon approaches, which have an average accuracy of 69.1%. Furthermore, our distant supervision method obtained an accuracy of 77.0%. We anticipate that only 23% of the tweets will need to be changed as a result of our annotation strategies.
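
As a rough illustration of the labelling idea summarised in the abstract, the sketch below assigns a polarity to a tweet by summing scores from a word lexicon and an emoji lexicon. The lexicon entries, their scores, the summing rule, and the function name `distant_label` are all assumptions made for illustration; they are not the authors' lexicons or their exact method.

```python
# Toy lexicons: every entry and score here is hypothetical, for illustration only.
WORD_LEXICON = {"love": 1, "great": 1, "happy": 1, "hate": -1, "bad": -1, "sad": -1}
EMOJI_LEXICON = {"😊": 1, "😍": 1, "👍": 1, "😡": -1, "😢": -1, "👎": -1}


def distant_label(tweet: str) -> str:
    """Assign a polarity label from summed word and emoji lexicon scores."""
    score = 0
    for token in tweet.lower().split():
        # Strip surrounding punctuation so that "great!" matches "great".
        word = token.strip(".,!?;:\"'()")
        score += WORD_LEXICON.get(word, 0)
    # Emojis are often attached to words without a space, so scan characters.
    for char in tweet:
        score += EMOJI_LEXICON.get(char, 0)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ["I love this 😍", "bad service 😡", "the bus arrives at noon"]:
        print(text, "->", distant_label(text))
```

Labels produced this way would serve as a first pass that human annotators then correct, which is the sense in which the abstract expects only a fraction of tweets to need changing.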

Original language: English
Title of host publication: 4th Workshop on Resources for African Indigenous Languages, RAIL 2023 - Proceedings of the Workshop
Editors: Rooweither Mabuya, Don Mthobela, Mmasibidi Setaka, Menno Van Zaanen
Publisher: Association for Computational Linguistics
Pages: 115-124
Number of pages: 10
ISBN (Electronic): 9781959429586
Publication status: Published - 2023
Event: 4th Workshop on Resources for African Indigenous Languages, RAIL 2023, co-located with the 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 - Hybrid, Dubrovnik, Croatia
Duration: 6 May 2023 → …

Publication series

Name: 4th Workshop on Resources for African Indigenous Languages, RAIL 2023 - Proceedings of the Workshop

Conference

Conference: 4th Workshop on Resources for African Indigenous Languages, RAIL 2023, co-located with the 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023
Country/Territory: Croatia
City: Hybrid, Dubrovnik
Period: 6/05/23 → …

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
