TY - GEN
T1 - Estimating missing data and determining the confidence of the estimate data
AU - Mistry, Jaisheel
AU - Nelwamondo, Fulufhelo
AU - Marwala, Tshilidzi
PY - 2008
Y1 - 2008
N2 - A Computational Intelligence approach to estimate missing data makes use of Autoassociative Neural Networks (ANN) and a stochastic optimization technique. The ANN captures interrelationships within data and the optimization technique estimates probable values that are used as inputs to the ANN. The optimum estimate is one that has a minimum influence on the output of the ANN. A method to determine the confidence of this estimate is presented in this paper. An ensemble of ANNs with a Multi Layer Perceptron architecture is collected using Bayesian training methods. The percentage of the most dominant estimate values is used as a confidence measure. The South African antenatal seroprevalence survey data is used and the HIV status of the patients is estimated. It was found that the missing data could be estimated with an overall accuracy of 68% and the confidence ranges between 50% and 97%. Estimates that have a confidence exceeding 70% have 88% estimation accuracy.
UR - http://www.scopus.com/inward/record.url?scp=60649116946&partnerID=8YFLogxK
U2 - 10.1109/ICMLA.2008.71
DO - 10.1109/ICMLA.2008.71
M3 - Conference contribution
AN - SCOPUS:60649116946
SN - 9780769534954
T3 - Proceedings - 7th International Conference on Machine Learning and Applications, ICMLA 2008
SP - 752
EP - 755
BT - Proceedings - 7th International Conference on Machine Learning and Applications, ICMLA 2008
T2 - 7th International Conference on Machine Learning and Applications, ICMLA 2008
Y2 - 11 December 2008 through 13 December 2008
ER -