TY - GEN
T1 - The fuzzy gene filter
T2 - 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2010
AU - Perez, Meir
AU - Rubin, David M.
AU - Marwala, Tshilidzi
AU - Scott, Lesley E.
AU - Featherston, Jonathan
AU - Stevens, Wendy
PY - 2010
Y1 - 2010
N2 - The identification of class-differentiating genes is central to microarray data classification. Genes are ranked in order of differential expression and the optimal top-ranking genes are selected as features for classification. In this paper, a new approach to gene ranking, based on a fuzzy inference system - the Fuzzy Gene Filter - is presented and compared to classical ranking approaches (the t-test, Wilcoxon test and ROC analysis). Two performance metrics are used: maximum Separability Index and highest cross-validation accuracy. The techniques were implemented on two publicly available data-sets. The Fuzzy Gene Filter outperformed the other techniques with regard to both maximum Separability Index and highest cross-validation accuracy. For the prostate data-set it attained a leave-one-out cross-validation accuracy of 96.1% and for the lymphoma data-set, 100%. The Fuzzy Gene Filter cross-validation accuracies were also higher than those recorded in previous publications which used the same data-sets. The Fuzzy Gene Filter's success is ascribed to its incorporation of both parametric and non-parametric data features and its ability to be optimised to suit the specific data-set under analysis.
AB - The identification of class-differentiating genes is central to microarray data classification. Genes are ranked in order of differential expression and the optimal top-ranking genes are selected as features for classification. In this paper, a new approach to gene ranking, based on a fuzzy inference system - the Fuzzy Gene Filter - is presented and compared to classical ranking approaches (the t-test, Wilcoxon test and ROC analysis). Two performance metrics are used: maximum Separability Index and highest cross-validation accuracy. The techniques were implemented on two publicly available data-sets. The Fuzzy Gene Filter outperformed the other techniques with regard to both maximum Separability Index and highest cross-validation accuracy. For the prostate data-set it attained a leave-one-out cross-validation accuracy of 96.1% and for the lymphoma data-set, 100%. The Fuzzy Gene Filter cross-validation accuracies were also higher than those recorded in previous publications which used the same data-sets. The Fuzzy Gene Filter's success is ascribed to its incorporation of both parametric and non-parametric data features and its ability to be optimised to suit the specific data-set under analysis.
UR - http://www.scopus.com/inward/record.url?scp=79551552453&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-13033-5_7
DO - 10.1007/978-3-642-13033-5_7
M3 - Conference contribution
AN - SCOPUS:79551552453
SN - 3642130321
SN - 9783642130328
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 62
EP - 71
BT - Trends in Applied Intelligent Systems - 23rd International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2010, Proceedings
Y2 - 1 June 2010 through 4 June 2010
ER -