Abstract
Two techniques have emerged from the recent literature as candidate solutions to the problem of missing data imputation: the expectation maximization (EM) algorithm, and the combination of an auto-associative neural network with a genetic algorithm (GA). Each technique has been discussed individually, and its merits examined at length, in the available literature, but the two have not been compared with each other. This article compares the two techniques using datasets from an industrial power plant, an industrial winding process and an HIV seroprevalence survey. Results show that the EM algorithm is more suitable and performs better when there is little or no interdependency between the input variables, whereas the auto-associative neural network and GA combination is more suitable when there are inherent nonlinear relationships between some of the given variables.
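To make the first of the two techniques concrete, the following is a minimal sketch of EM-based imputation. It is not the implementation used in the article: it assumes the data follow a single multivariate Gaussian with at least two variables, marks missing entries with NaN, and uses a hypothetical function name (`em_impute`) and a small ridge term chosen here only for numerical stability.

```python
import numpy as np

def em_impute(X, n_iter=50, tol=1e-6):
    """Fill NaN entries of X by EM, assuming rows are multivariate Gaussian."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    missing = np.isnan(X)
    ridge = 1e-6 * np.eye(d)  # assumption: small ridge keeps sigma invertible

    # Initialise with column means of the observed values.
    mu = np.nanmean(X, axis=0)
    filled = np.where(missing, mu, X)
    sigma = np.cov(filled, rowvar=False, bias=True) + ridge

    for _ in range(n_iter):
        new_filled = filled.copy()
        cov_correction = np.zeros((d, d))
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed in this row: fall back to the current mean.
                new_filled[i] = mu
                cov_correction += sigma
                continue
            s_oo = sigma[np.ix_(o, o)]
            s_mo = sigma[np.ix_(m, o)]
            # coef = s_mo @ inv(s_oo), computed without an explicit inverse.
            coef = np.linalg.solve(s_oo, s_mo.T).T
            # E-step: conditional mean of the missing part given the observed part.
            new_filled[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
            # Conditional covariance of the missing part, needed in the M-step.
            cov_correction[np.ix_(m, m)] += sigma[np.ix_(m, m)] - coef @ s_mo.T

        # M-step: re-estimate the mean and covariance from the completed data.
        mu_new = new_filled.mean(axis=0)
        centered = new_filled - mu_new
        sigma_new = (centered.T @ centered + cov_correction) / n + ridge

        change = np.max(np.abs(new_filled - filled))
        filled, mu, sigma = new_filled, mu_new, sigma_new
        if change < tol:
            break
    return filled
```

Calling `em_impute(data)` on a two-dimensional array returns a copy in which observed entries are unchanged and each NaN is replaced by its conditional expectation under the fitted Gaussian. Because that expectation is linear in the observed variables, a sketch of this kind cannot capture nonlinear relationships between variables, which is the setting where the abstract reports the neural network and GA combination to be more suitable.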
Original language | English |
---|---|
Pages (from-to) | 1514-1521 |
Number of pages | 8 |
Journal | Current Science |
Volume | 93 |
Issue number | 11 |
Publication status | Published - 10 Dec 2007 |
Externally published | Yes |
Keywords
- Expectation maximization algorithm
- Genetic algorithm
- Missing data
- Neural network
ASJC Scopus subject areas
- Multidisciplinary