Automated detection of human users in Twitter

M. A. Fernandes, P. Patel, T. Marwala

Research output: Contribution to journal › Conference article › peer-review

14 Citations (Scopus)

Abstract

This paper compares Support Vector Machine (SVM) classification and a number of clustering approaches to separate human from not human users in Twitter in order to identify normal human activity. These approaches have similar F1 accuracy scores of 90%, with both experiencing difficulties in classifying human users behaving abnormally. A second-stage classification step was then used to further separate not human users into brands, celebrities and promoters/information, achieving an average F1 accuracy of 74%. These accuracies were achieved by reducing the size of the feature space using stepwise feature selection and by category balancing based on manual inspection of classification results.
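For readers unfamiliar with the F1 measure quoted above, the following is a minimal sketch of how it is computed for one class as the harmonic mean of precision and recall. The function name `f1_score`, the label strings and the toy data are illustrative assumptions, not taken from the paper, and the abstract does not state how the authors averaged F1 across classes.

```python
def f1_score(y_true, y_pred, positive="human"):
    """Return the F1 score for the `positive` class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    # True positives: labelled positive and predicted positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but labelled otherwise.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: labelled positive but predicted otherwise.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented purely for illustration.
y_true = ["human", "human", "human", "not_human", "not_human"]
y_pred = ["human", "human", "not_human", "human", "not_human"]
score = f1_score(y_true, y_pred)
```

In this toy example, two of the three accounts predicted as human are correct and two of the three true human accounts are recovered, so precision and recall are both 2/3 and the F1 score is also 2/3.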

Original language: English
Pages (from-to): 224-231
Number of pages: 8
Journal: Procedia Computer Science
Volume: 53
Issue number: 1
Publication status: Published - 2015
Event: INNS Conference on Big Data 2015 - San Francisco, United States
Duration: 8 Aug 2015 to 10 Aug 2015

Keywords

  • Clustering
  • Human
  • SVM
  • Twitter

ASJC Scopus subject areas

  • General Computer Science
