Partial imputation of unseen records to improve classification using a hybrid multi-layered artificial immune system and genetic algorithm

Mlungisi Duma, Tshilidzi Marwala, Bhekisipho Twala, Fulufhelo Nelwamondo

Research output: Contribution to journal › Article › peer-review

30 Citations (Scopus)


Missing data in large insurance datasets affects the learning and classification accuracies in predictive modelling. Insurance datasets will continue to increase in size as more variables are added to aid in managing client risk and will therefore be even more vulnerable to missing data. This paper proposes a hybrid multi-layered artificial immune system and genetic algorithm for partial imputation of missing data in datasets with numerous variables. The multi-layered artificial immune system creates and stores antibodies that bind to and annihilate an antigen. The genetic algorithm optimises the learning process of a stimulated antibody. The imputation is evaluated using the RIPPER, k-nearest neighbour, naïve Bayes and logistic discriminant classifiers. The effect of the imputation on the classifiers is compared with that of the mean/mode and hot deck imputation methods. The results demonstrate that when missing data imputation is performed using the proposed hybrid method, the classification improves and the robustness to the amount of missing data is increased relative to the mean/mode method for data missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR). The imputation performance is similar to or marginally better than that of the hot deck imputation.
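For readers unfamiliar with the mean/mode baseline named in the abstract, the following is a minimal Python sketch of that general technique only. It is not the proposed hybrid method and is not taken from the paper; the function name `mean_mode_impute`, the use of `None` to mark a missing value, and the sample records are all assumptions made for illustration.

```python
from collections import Counter
from statistics import mean


def mean_mode_impute(records):
    """Return a copy of records with None filled in per column:
    the column mean for numeric columns, the column mode otherwise."""
    columns = list(records[0].keys())
    fill = {}
    for col in columns:
        observed = [r[col] for r in records if r[col] is not None]
        if not observed:
            fill[col] = None  # nothing observed in this column, leave the gap
        elif all(isinstance(v, (int, float)) and not isinstance(v, bool)
                 for v in observed):
            fill[col] = mean(observed)  # numeric column: use the mean
        else:
            # categorical column: use the most frequent observed value
            fill[col] = Counter(observed).most_common(1)[0][0]
    return [
        {col: (fill[col] if r[col] is None else r[col]) for col in columns}
        for r in records
    ]


# Hypothetical records; field names and values are illustrative only.
records = [
    {"age": 30, "vehicle": "sedan"},
    {"age": None, "vehicle": "sedan"},
    {"age": 50, "vehicle": None},
    {"age": 40, "vehicle": "truck"},
]
imputed = mean_mode_impute(records)
```

Because every gap in a column receives the same value, this baseline ignores relationships between variables, which is one reason it is commonly used as a point of comparison for more elaborate imputation schemes.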

Original language: English
Pages (from-to): 4461-4480
Number of pages: 20
Journal: Applied Soft Computing Journal
Issue number: 12
Publication status: Published - 2013


Keywords

  • Correlation-based feature extraction
  • Genetic algorithms
  • Missing data
  • Multi-layered artificial immune system

ASJC Scopus subject areas

  • Software

