TY - JOUR
T1 - Classification and Comparative Evaluation of Text and Emoji-Based Tweets With Deep Neural Network Models
AU - Chandra Sekhar, J. N.
AU - Kiran Mayee, M.
AU - Nadagoudar, Ranjana
AU - Chinna Alluraiah, N.
AU - Dhanamjayulu, C.
AU - Chinthaginjala, Ravikumar
AU - Ravi, K.
AU - Praveenkumar, M.
AU - Mohanty, Satyajit
AU - Khan, Baseem
N1 - Publisher Copyright:
Copyright © 2024 J. N. Chandra Sekhar et al.
PY - 2024
Y1 - 2024
AB - Emojis have become increasingly prevalent in today’s digital world, allowing individuals to convey a wide range of emotions, from simple to intricate, more readily than before. Consequently, emojis are being used in sentiment analysis and tailored marketing strategies. This research conducts emotion detection on both tweets and a symbolic expression dataset sourced from Kaggle. Given that tweets are largely commentaries, we utilized end-to-end sentence embedding models, namely DistilBERT, USE-Large3, and RoBERTa, to generate embeddings. These embeddings are then used to train dense neural networks (NNs) and LSTM models. The text classification accuracy for both models was consistently high, at around 98%. However, when the validation set was constructed from symbolic expressions or emojis not included in the training dataset, accuracy for both models dropped significantly, to 75%. Additionally, a distributed training methodology was used in place of the conventional single-threaded model to enhance scalability. This approach reduced the runtime by roughly 17% while maintaining accuracy. Lastly, in pursuit of explainable AI, the SHAP and LIME algorithms were employed to elucidate the model’s behavior and assess any potential biases in the dataset. The use of advanced deep NN techniques tailored to the subtle complexities of hybrid-data sentiment analysis marks a significant step forward. Our proposed work addresses the critical gap in existing sentiment analysis methods, which have primarily targeted either text or emojis in isolation, thereby enabling a more holistic understanding of sentiment in digital communications.
Moreover, the application of the explainable AI techniques SHAP and LIME to demystify model decisions underscores a commitment to advancing transparent and trustworthy deep learning technologies in sentiment analysis.
KW - artificial intelligence
KW - emojis
KW - sentiment analysis
KW - social networks
UR - https://www.scopus.com/pages/publications/105003536682
U2 - 10.1155/2024/9652424
DO - 10.1155/2024/9652424
M3 - Article
AN - SCOPUS:105003536682
SN - 2090-0147
VL - 2024
JO - Journal of Electrical and Computer Engineering
JF - Journal of Electrical and Computer Engineering
IS - 1
M1 - 9652424
ER -