TY - CHAP
T1 - Missing Data Estimation Using Firefly Algorithm
AU - Leke, Collins Achepsah
AU - Marwala, Tshilidzi
N1 - Publisher Copyright: © 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
AB - In this chapter, we examine the problem of missing data in high-dimensional datasets by taking into consideration the missing completely at random and missing at random mechanisms, as well as the arbitrary missing pattern. Additionally, this chapter employs a methodology based on deep learning and swarm intelligence algorithms in order to provide reliable estimates for missing data. The deep learning technique is used to extract features from the input data via an unsupervised learning approach by modeling the data distribution based on the input. This deep learning technique is then used as part of the objective function for the swarm intelligence technique in order to estimate the missing data after a supervised fine-tuning phase by minimizing an error function based on the interrelationship and correlation between features in the dataset. The proposed methodology in this chapter therefore has longer running times; however, the promising potential outcomes justify the trade-off. Basic knowledge of statistics is presumed.
UR - http://www.scopus.com/inward/record.url?scp=85132854445&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-01180-2_5
DO - 10.1007/978-3-030-01180-2_5
M3 - Chapter
AN - SCOPUS:85132854445
T3 - Studies in Big Data
SP - 73
EP - 89
BT - Studies in Big Data
PB - Springer Science and Business Media Deutschland GmbH
ER -