TY - GEN
T1 - Enhancing digital forensic investigations into emails through sentiment analysis
AU - McGuire, James Christopher
AU - Leung, Wai Sze
N1 - Publisher Copyright: © 2018 Curran Associates Inc. All rights reserved.
PY - 2018
Y1 - 2018
AB - An estimated 269 billion emails are sent per day. Frequently used as both a business and personal form of communication, emails can therefore serve as a rich source of information if the appropriate and relevant messages are identified and extracted from the rest of the unrelated mails. With current methods of analysing these emails best described as being very manual, time-consuming, and mundane, the task is not unlike finding a needle in a haystack and represents a considerable challenge for digital forensic investigators who may not have yet identified their persons or points of interest when having to trawl through an organization’s many emails. The need to apply manual searching and/or read emails one by one to identify potential points of interest can be attributed to a lack of search context. Existing work has examined how the search space of emails can be dramatically reduced by looking at specific criteria. This includes examining relationships between users based on their account activity to identify cliques and potential associates (or accomplices). This paper proposes another approach by means of leveraging sentiment analysis, which will allow an investigator to identify points of interest in emails according to the attitude of authors with respect to the content of their emails. As a proof of concept, a prototype system was designed and developed to investigate how sentiment analysis could be appropriately applied to the problem domain. The prototype makes use of natural language processing to derive sentiment, emotion, and keyword data from individual emails, with the resulting data parsed and processed to identify and highlight messages that warrant an investigator’s attention. The prototype was further enhanced by adding time as an additional dimension, allowing investigators to consider the sentiment of authors over the full period, or over a shorter span of time (for example, before and after an incident). The prototype demonstrated how the application of sentiment analysis made it possible to draw the attention of investigators to specific individuals and messages, opening leads for further investigation.
KW - Computer forensics
KW - Email analysis
KW - Natural language processing
KW - Sentiment analysis
UR - http://www.scopus.com/inward/record.url?scp=85050798567&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85050798567
T3 - European Conference on Information Warfare and Security, ECCWS
SP - 288
EP - 295
BT - Proceedings of the 17th European Conference on Cyber Warfare and Security, ECCWS 2018
A2 - Josang, Audun
PB - Curran Associates Inc.
T2 - 17th European Conference on Cyber Warfare and Security, ECCWS 2018
Y2 - 28 June 2018 through 29 June 2018
ER -