Synthesis of social media profiles using a probabilistic context-free grammar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Citations (Scopus)

Abstract

One helpful resource when presenting delicate or sensitive information is hypothetical data, or placeholders that conceal the identity of the parties concerned. This is crucial in fields such as medicine and criminology, as volunteers in medical research, patients with dreaded diseases, and people convicted of certain crimes often prefer to remain anonymous, even when they agree to their records being shared. Recently, research based on social media has raised similar ethical concerns about privacy and the use of real users' profiles. In this paper, we present a new application of a type of formal grammar, the probabilistic (stochastic) context-free grammar, to the automatic generation of social media profiles, using Facebook as a test case. First, we present a grammar-based formalism for describing the rules governing the formulation of reasonable user attributes (e.g. full names, dates of birth, addresses, phone numbers, etc.). These grammar rules are specified with associated probabilistic weights that decide when (if at all) a rule is chosen. Secondly, we describe the implementation of these grammar rules. Our implementation produced one million unique Facebook profiles within three hours of execution time, with a vanishingly small probability that any profile will reoccur. 100,000 of these synthesised profiles can be viewed at: tinyurl.com/synthesisedprofiles2017. These profiles may find applications in role-playing games in health and in social media research, and the described technique may find wider application in generating hypothetical profiles for data anonymisation in different domains.
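The idea of weighted grammar rules driving attribute generation can be illustrated with a minimal sketch. This is not the authors' grammar or implementation: every nonterminal, name, weight, and output format below is a hypothetical placeholder chosen for illustration, and the sampler is a generic recursive expansion that assumes any symbol without a rule is a terminal.

```python
import random

# Hypothetical toy grammar (not taken from the paper). Each nonterminal maps
# to a list of (relative weight, expansion) pairs; any symbol that is not a
# key of GRAMMAR is treated as a terminal string.
GRAMMAR = {
    "PROFILE": [(1.0, ["FULL_NAME", "; born ", "DOB", "; phone ", "PHONE"])],
    "FULL_NAME": [
        (0.7, ["FIRST", " ", "LAST"]),
        (0.3, ["FIRST", " ", "FIRST", " ", "LAST"]),  # optional middle name
    ],
    "FIRST": [(0.5, ["Thabo"]), (0.3, ["Naledi"]), (0.2, ["Pieter"])],
    "LAST": [(0.6, ["Mokoena"]), (0.4, ["Botha"])],
    "DOB": [(1.0, ["DAY", "/", "MONTH", "/", "YEAR"])],
    "DAY": [(1.0, [f"{d:02d}"]) for d in range(1, 29)],    # 01..28, equal weights
    "MONTH": [(1.0, [f"{m:02d}"]) for m in range(1, 13)],  # 01..12, equal weights
    "YEAR": [(1.0, ["19", "DECADE", "DIGIT"])],
    "DECADE": [(1.0, [str(d)]) for d in range(6, 10)],     # 1960s..1990s
    "PHONE": [(1.0, ["0"] + ["DIGIT"] * 9)],               # ten digits in total
    "DIGIT": [(1.0, [str(d)]) for d in range(10)],
}


def expand(symbol, rng):
    """Recursively expand a symbol, picking one rule per nonterminal by weight."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    rules = GRAMMAR[symbol]
    weights = [weight for weight, _ in rules]
    _, expansion = rng.choices(rules, weights=weights, k=1)[0]
    return "".join(expand(part, rng) for part in expansion)


if __name__ == "__main__":
    rng = random.Random(2017)  # fixed seed so runs are repeatable
    for _ in range(3):
        print(expand("PROFILE", rng))
```

In this sketch a rule's weight controls how often it is chosen, so the two-part name form appears more often than the three-part form. A realistic grammar would need far larger terminal sets to make repeated profiles unlikely.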

Original language: English
Title of host publication: 2017 Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference, PRASA-RobMech 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 104-109
Number of pages: 6
ISBN (Electronic): 9781538623138
DOIs
Publication status: Published - 1 Jul 2017
Event: 28th Annual Symposium of the Pattern Recognition Association of South Africa and the 10th Robotics and Mechatronics International Conference, PRASA-RobMech 2017 - Bloemfontein, South Africa
Duration: 29 Nov 2017 → 1 Dec 2017

Publication series

Name: 2017 Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference, PRASA-RobMech 2017
Volume: 2018-January

Conference

Conference: 28th Annual Symposium of the Pattern Recognition Association of South Africa and the 10th Robotics and Mechatronics International Conference, PRASA-RobMech 2017
Country/Territory: South Africa
City: Bloemfontein
Period: 29/11/17 → 1/12/17

Keywords

  • Data Anonymisation
  • Facebook Profile Synthesis
  • Privacy Protection
  • Probabilistic Context-Free Grammars

ASJC Scopus subject areas

  • Control and Optimization
  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
