Using principal component analysis and autoassociative Neural Networks to estimate missing data in a database

Jaisheel Mistry, Fulufhelo V. Nelwamondo, Tshilidzi Marwala

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Citations (Scopus)

Abstract

This paper presents three new methods for estimating missing data in a database using Neural Networks, Principal Component Analysis and Genetic Algorithms. The proposed methods are tested on data obtained from the South African Antenatal Survey, a collection of demographic properties of patients. The methods use Principal Component Analysis to remove redundancies and reduce the dimensionality of the data. Variations of autoassociative Neural Networks then reduce the dimensionality further. A Genetic Algorithm finds the missing data by optimizing the error function of each of the three Autoencoder Neural Network variants. The proposed system was tested on records with 1 to 6 missing fields, and the accuracy of the estimated values was calculated and recorded. All three methods are as accurate as a conventional feedforward neural network structure; however, the newly proposed methods employ neural network architectures with fewer hidden nodes.
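The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the network architectures or the Genetic Algorithm settings, so a linear PCA reconstruction stands in for the trained autoassociative networks, the data are synthetic rather than the Antenatal Survey records, and the function names (`fit_pca`, `reconstruction_error`, `estimate_missing`) and the selection, crossover and mutation choices are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X, k):
    """Return the column means and the top-k principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_error(records, mean, components):
    """Squared error between each record and its reconstruction.

    Assumption: projecting onto the principal subspace and back plays the
    role of the autoassociative network (input -> bottleneck -> output).
    """
    centered = records - mean
    recon = centered @ components.T @ components
    return np.sum((centered - recon) ** 2, axis=1)

def estimate_missing(record, missing_idx, mean, components, bounds, rng,
                     pop_size=40, generations=60):
    """Genetic search for missing-field values that minimise the error."""
    lo, hi = bounds[0][missing_idx], bounds[1][missing_idx]
    pop = rng.uniform(lo, hi, size=(pop_size, len(missing_idx)))

    def fitness(population):
        # Fill the missing fields of the record with each candidate.
        candidates = np.tile(record, (len(population), 1))
        candidates[:, missing_idx] = population
        return reconstruction_error(candidates, mean, components)

    for gen in range(generations):
        # Truncation selection: the fitter half survives unchanged.
        parents = pop[np.argsort(fitness(pop))[: pop_size // 2]]
        n_children = pop_size - len(parents)
        a = parents[rng.integers(len(parents), size=n_children)]
        b = parents[rng.integers(len(parents), size=n_children)]
        # Blend crossover followed by Gaussian mutation that shrinks over time.
        w = rng.uniform(size=a.shape)
        children = w * a + (1 - w) * b
        sigma = 0.1 * (hi - lo) * (1 - gen / generations)
        children += rng.normal(size=children.shape) * sigma
        pop = np.vstack([parents, np.clip(children, lo, hi)])
    return pop[np.argmin(fitness(pop))]

# Synthetic stand-in data: 6 correlated fields driven by 2 latent factors.
rng = np.random.default_rng(0)
mixing = np.array([[1, 0, 1, 1, 2, 0], [0, 1, 1, -1, 0, 0.5]])
X = rng.normal(size=(200, 2)) @ mixing + 0.01 * rng.normal(size=(200, 6))

mean, components = fit_pca(X, k=2)
bounds = (X.min(axis=0), X.max(axis=0))

# A complete record with two fields hidden to simulate missing data.
true_record = np.array([0.5, -1.0]) @ mixing
missing_idx = [0, 2]
record = true_record.copy()
record[missing_idx] = np.nan

estimate = estimate_missing(record, missing_idx, mean, components, bounds, rng)
print("true:", true_record[missing_idx], "estimated:", estimate)
```

The key idea carried over from the abstract is that the model is trained to reproduce complete records, so a candidate value for a missing field is judged by how well the completed record is reconstructed. A nonlinear autoencoder trained on the PCA-reduced inputs would replace `reconstruction_error` in a closer reproduction of the paper's setup.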

Original language: English
Title of host publication: WMSCI 2008 - The 12th World Multi-Conference on Systemics, Cybernetics and Informatics, Jointly with the 14th International Conference on Information Systems Analysis and Synthesis, ISAS 2008 - Proc.
Pages: 24-29
Number of pages: 6
Publication status: Published - 2008
Externally published: Yes
Event: 12th World Multi-Conference on Systemics, Cybernetics and Informatics, WMSCI 2008, Jointly with the 14th International Conference on Information Systems Analysis and Synthesis, ISAS 2008 - Orlando, FL, United States
Duration: 29 Jun 2008 to 2 Jul 2008

Publication series

Name: WMSCI 2008 - The 12th World Multi-Conference on Systemics, Cybernetics and Informatics, Jointly with the 14th International Conference on Information Systems Analysis and Synthesis, ISAS 2008 - Proc.
Volume: 5

Conference

Conference: 12th World Multi-Conference on Systemics, Cybernetics and Informatics, WMSCI 2008, Jointly with the 14th International Conference on Information Systems Analysis and Synthesis, ISAS 2008
Country/Territory: United States
City: Orlando, FL
Period: 29/06/08 to 2/07/08

Keywords

  • Auto associative Neural Network
  • Autoencoder Neural Networks
  • Missing data
  • Principal Component Analysis and Genetic Algorithm

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
