TY - GEN
T1 - Microarray data feature selection using hybrid genetic algorithm simulated annealing
AU - Perez, Meir
AU - Marwala, Tshilidzi
PY - 2012
Y1 - 2012
N2 - Microarray data feature selection is crucial for the development of a viable cancer diagnostic system based on microarray data. This paper assesses the effectiveness of the Hybrid Genetic Algorithm Simulated Annealing (HGASA) algorithm in selecting features for various classification architectures. HGASA combines the parallel search capability of the Genetic Algorithm (GA) with the flexibility of Simulated Annealing (SA). The algorithm is guided by the Separability Index, which quantifies the extent of class separability demonstrated by a combination of features. Four classifiers are used in the assessment: Artificial Neural Network (ANN), Support Vector Machine (SVM), Naïve Bayesian Classifier (NBC) and K-Nearest Neighbour (KNN) classifier. Results from HGASA are compared to those from the standard GA as well as to those from the Population-Based Incremental Learning (PBIL) algorithm. Two data sets are used to facilitate this analysis: a prostate cancer data set and a lymphoma data set. For the prostate cancer data set, features selected by HGASA attained the highest classification accuracy on the SVM classifier, with an accuracy of 88%. For the lymphoma data set, the highest classification accuracy was attained using the ANN classifier, with an accuracy of 95%. The performance of HGASA is ascribed to its ability to explore the feature space more thoroughly than GA and PBIL.
AB - Microarray data feature selection is crucial for the development of a viable cancer diagnostic system based on microarray data. This paper assesses the effectiveness of the Hybrid Genetic Algorithm Simulated Annealing (HGASA) algorithm in selecting features for various classification architectures. HGASA combines the parallel search capability of the Genetic Algorithm (GA) with the flexibility of Simulated Annealing (SA). The algorithm is guided by the Separability Index, which quantifies the extent of class separability demonstrated by a combination of features. Four classifiers are used in the assessment: Artificial Neural Network (ANN), Support Vector Machine (SVM), Naïve Bayesian Classifier (NBC) and K-Nearest Neighbour (KNN) classifier. Results from HGASA are compared to those from the standard GA as well as to those from the Population-Based Incremental Learning (PBIL) algorithm. Two data sets are used to facilitate this analysis: a prostate cancer data set and a lymphoma data set. For the prostate cancer data set, features selected by HGASA attained the highest classification accuracy on the SVM classifier, with an accuracy of 88%. For the lymphoma data set, the highest classification accuracy was attained using the ANN classifier, with an accuracy of 95%. The performance of HGASA is ascribed to its ability to explore the feature space more thoroughly than GA and PBIL.
UR - http://www.scopus.com/inward/record.url?scp=84872002208&partnerID=8YFLogxK
U2 - 10.1109/EEEI.2012.6377146
DO - 10.1109/EEEI.2012.6377146
M3 - Conference contribution
AN - SCOPUS:84872002208
SN - 9781467346801
T3 - 2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel, IEEEI 2012
BT - 2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel, IEEEI 2012
T2 - 2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel, IEEEI 2012
Y2 - 14 November 2012 through 17 November 2012
ER -