Abstract
Sarcasm detection is a crucial task in natural language processing (NLP), particularly in sentiment analysis and opinion mining, where sarcasm can distort sentiment interpretation. Accurately identifying sarcasm remains challenging because it is context-dependent and linguistically complex, especially in informal text sources such as social media and conversational dialogues. This study uses three benchmark datasets, namely News Headlines, MUStARD, and Reddit (SARC), which contain diverse sarcastic expressions from headlines, scripted dialogues, and online conversations. The proposed methodology leverages transformer-based models (RoBERTa and DistilBERT), integrating context summarization, metadata extraction, and conversational structure preservation to enhance sarcasm detection. The novelty of this research lies in combining contextual summarization with metadata-enhanced embeddings to improve model interpretability and efficiency. Performance is evaluated with accuracy, F1 score, and the Jaccard coefficient, ensuring a comprehensive assessment. Experimental results show that RoBERTa achieves 98.5% accuracy with metadata, while DistilBERT offers a 1.74x speedup, highlighting the trade-off between accuracy and computational efficiency in real-world sarcasm detection applications.
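The abstract evaluates models with accuracy, F1 score, and the Jaccard coefficient. As a minimal sketch of how these metrics are computed for binary sarcastic/non-sarcastic labels (the `evaluate` function name and the sample label lists below are illustrative, not taken from the paper):

```python
def evaluate(y_true, y_pred):
    """Compute accuracy, F1, and Jaccard for binary labels (1 = sarcastic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    # Jaccard coefficient: overlap of predicted-positive and true-positive sets.
    jaccard = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"accuracy": accuracy, "f1": f1, "jaccard": jaccard}


# Illustrative labels: 1 = sarcastic, 0 = not sarcastic.
metrics = evaluate([1, 0, 1, 1, 0, 1], [1, 0, 1, 0, 0, 1])
```

For binary classification the two set-overlap metrics are directly related (F1 = 2J / (1 + J)), so they rank systems identically; reporting both mainly aids comparison with prior work.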
| Original language | English |
|---|---|
| Article number | 95 |
| Journal | Computers |
| Volume | 14 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - Mar 2025 |
Keywords
- contextual summarization
- natural language processing (NLP)
- sarcasm detection
- sentiment analysis
- transformer models
ASJC Scopus subject areas
- Human-Computer Interaction
- Computer Networks and Communications