Missing data estimation in high-dimensional datasets: A swarm intelligence-deep neural network approach

Collins Leke, Tshilidzi Marwala

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

24 Citations (Scopus)

Abstract

In this paper, we examine the problem of missing data in high-dimensional datasets by taking into consideration the Missing Completely at Random and Missing at Random mechanisms, as well as the Arbitrary missing pattern. Additionally, this paper employs a methodology based on Deep Learning and Swarm Intelligence algorithms in order to provide reliable estimates for missing data. The deep learning technique is used to extract features from the input data via an unsupervised learning approach by modeling the data distribution based on the input. After a supervised fine-tuning phase, this deep learning technique is used as part of the objective function for the swarm intelligence technique, which estimates the missing data by minimizing an error function based on the interrelationship and correlation between features in the dataset. The investigated methodology has longer running times; however, the promising potential outcomes justify the trade-off. Basic knowledge of statistics is presumed.
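The estimation loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a rank-k linear autoencoder (PCA) stands in for the fine-tuned deep network, a plain global-best particle swarm stands in for the swarm intelligence algorithm (the abstract does not name one), and every function name, parameter, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def fit_linear_autoencoder(X, k):
    """Assumed stand-in for the trained deep network: rank-k PCA reconstruction."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:k]  # shape (k, n_features)

    def reconstruct(Z):
        # Encode to k features, then decode back to the input space.
        return (Z - mean) @ components.T @ components + mean

    return reconstruct

def pso_impute(row, missing, reconstruct, lo, hi,
               n_particles=30, n_iters=100, seed=0):
    """Estimate row[missing] by minimizing the squared reconstruction error."""
    rng = np.random.default_rng(seed)
    m = int(missing.sum())

    def error(candidates):
        # Fill the missing entries with each particle's candidate values.
        full = np.tile(row, (len(candidates), 1))
        full[:, missing] = candidates
        return ((full - reconstruct(full)) ** 2).sum(axis=1)

    pos = rng.uniform(lo, hi, size=(n_particles, m))
    vel = np.zeros_like(pos)
    best_pos, best_err = pos.copy(), error(pos)
    g = best_pos[best_err.argmin()].copy()  # global best position

    for _ in range(n_iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Assumed inertia (0.7) and acceleration (1.5) coefficients.
        vel = 0.7 * vel + 1.5 * r1 * (best_pos - pos) + 1.5 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)
        err = error(pos)
        improved = err < best_err
        best_pos[improved] = pos[improved]
        best_err[improved] = err[improved]
        g = best_pos[best_err.argmin()].copy()
    return g

# Synthetic, correlated data (hypothetical): 6 features driven by 2 latent factors.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 6))
X += 0.01 * rng.normal(size=X.shape)

reconstruct = fit_linear_autoencoder(X, k=2)

missing = np.array([False, True, False, False, True, False])
truth = X[0, missing].copy()
row = X[0].copy()
row[missing] = np.nan  # simulate the missing entries

# Search bounds: observed range of each missing feature.
lo, hi = X[:, missing].min(axis=0), X[:, missing].max(axis=0)
estimate = pso_impute(row, missing, reconstruct, lo, hi)
print("true values:", truth)
print("estimates:  ", estimate)
```

The objective here depends only on how well the candidate values agree with the correlations the network has learned between features, which is why the approach relies on interrelated features. Each particle evaluation requires a forward pass through the network, which is consistent with the longer running times noted in the abstract.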

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 259-270
Number of pages: 12
Publication status: Published - 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9712 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Keywords

  • Deep learning
  • High-dimensional data
  • Missing data
  • Supervised learning
  • Swarm intelligence
  • Unsupervised learning

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
