Estimating missing data and determining the confidence of the estimate data

Jaisheel Mistry, Fulufhelo Nelwamondo, Tshilidzi Marwala

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

11 Citations (Scopus)

Abstract

A Computational Intelligence approach to estimating missing data makes use of Autoassociative Neural Networks (ANNs) and a stochastic optimization technique. The ANN captures interrelationships within the data, and the optimization technique proposes probable values that are used as inputs to the ANN. The optimum estimate is the one that has the minimum influence on the output of the ANN. This paper presents a method to determine the confidence of that estimate. An ensemble of ANNs with a Multi-Layer Perceptron architecture is collected using Bayesian training methods, and the percentage of ensemble estimates that agree with the most dominant estimate value is used as the confidence measure. The method is applied to the South African antenatal seroprevalence survey data, where the HIV status of the patients is estimated. The missing data could be estimated with an overall accuracy of 68%, with confidence ranging between 50% and 97%. Estimates with a confidence exceeding 70% have an estimation accuracy of 88%.
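
The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: "minimum influence on the output" is read here as the smallest squared difference between the filled-in record and the network's reconstruction of it, the stochastic optimizer is replaced by an exhaustive search over a small candidate set, and the trained networks are replaced by toy callables. The names `estimate_missing`, `ensemble_confidence`, and `make_toy_model` are hypothetical.

```python
from collections import Counter


def estimate_missing(model, record, missing_index, candidates):
    """Pick the candidate value that gives the smallest reconstruction error.

    Assumption: the autoassociative model maps a record to a reconstruction
    of the same length. Exhaustive search stands in for the stochastic
    optimizer mentioned in the abstract.
    """
    def error(value):
        filled = list(record)
        filled[missing_index] = value
        output = model(filled)
        return sum((x - y) ** 2 for x, y in zip(filled, output))

    return min(candidates, key=error)


def ensemble_confidence(estimates):
    """Return the most common estimate and the fraction of members that chose it."""
    value, count = Counter(estimates).most_common(1)[0]
    return value, count / len(estimates)


def make_toy_model(target):
    """Hypothetical stand-in for a trained network.

    It copies the known fields and always reconstructs the last field as `target`.
    """
    return lambda x: x[:-1] + [target]


# Five toy ensemble members; four of them favor the value 1 for the last field.
models = [make_toy_model(t) for t in (1, 1, 1, 0, 1)]

# A record whose last field (for example, a binary status) is missing.
record = [0.3, 0.7, None]

estimates = [estimate_missing(m, record, 2, [0, 1]) for m in models]
dominant, confidence = ensemble_confidence(estimates)
print(dominant, confidence)
```

In this toy run, four of the five members agree, so the dominant estimate is reported with a confidence of 0.8. A real ensemble would consist of separately trained networks, and a continuous missing field would need binning or a tolerance before votes could be counted.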

Original language: English
Title of host publication: Proceedings - 7th International Conference on Machine Learning and Applications, ICMLA 2008
Pages: 752-755
Number of pages: 4
Publication status: Published - 2008
Externally published: Yes
Event: 7th International Conference on Machine Learning and Applications, ICMLA 2008 - San Diego, CA, United States
Duration: 11 Dec 2008 to 13 Dec 2008

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Software
