Abstract
Robustness of prediction models is an essential requirement for cancer-related diagnostic and prognostic studies. A reliable prognosis of breast cancer depends heavily on accurate identification of the diagnosed cases. Predictive analytics and learning-based methods have been shown to provide an effective framework for prognostic studies by accurately classifying data instances into the relevant classes based on the severity of the tumor. However, performance validation is needed to benchmark the best-performing variants of a predictive model. This study assesses the relative performance of different variants of a supervised learning algorithm that is commonly used to implement pattern-recognition-based models for prognostic assessment of breast cancer data. Principal component analysis serves as the pre-processing stage, extracting the most relevant set of features for training different types of decision trees, which learn the patterns in the data in order to classify new instances. The data of diagnostic cases from the original Wisconsin breast cancer database has been used in the study. The major algorithms of the decision tree family, namely CART and C4.5, have been implemented on different platforms (WEKA, Python and MATLAB) to compare their performance. A major finding is the low sensitivity of classification accuracy to feature reduction for this data, which has been investigated and reported.
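The pipeline described in the abstract (principal component analysis followed by a decision tree, evaluated with and without feature reduction) might be sketched roughly as follows. This is an illustrative sketch under several assumptions, not the authors' implementation: it uses scikit-learn, whose bundled `load_breast_cancer` data is the Wisconsin Diagnostic variant and may differ from the data used in the study; `DecisionTreeClassifier` is scikit-learn's CART-style tree, and C4.5 is not available there (passing `criterion="entropy"` only approximates its splitting rule). The helper name `cv_accuracy`, the 10-fold cross-validation, and the component counts are hypothetical choices made for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def cv_accuracy(n_components=None, criterion="gini", folds=10, seed=0):
    """Mean cross-validated accuracy of a decision tree, optionally after PCA.

    Hypothetical helper: n_components=None skips PCA and trains on all features.
    """
    # Assumption: scikit-learn's bundled Wisconsin Diagnostic data stands in
    # for the data set used in the study.
    X, y = load_breast_cancer(return_X_y=True)

    # Standardise first so PCA is not dominated by large-scale features.
    steps = [StandardScaler()]
    if n_components is not None:
        steps.append(PCA(n_components=n_components, random_state=seed))
    # CART-style tree; criterion="entropy" is only a rough stand-in for C4.5.
    steps.append(DecisionTreeClassifier(criterion=criterion, random_state=seed))

    # Scaling and PCA are refitted inside each fold, avoiding information leakage.
    scores = cross_val_score(make_pipeline(*steps), X, y, cv=folds)
    return scores.mean()


if __name__ == "__main__":
    for k in (None, 10, 5, 2):
        label = "all features" if k is None else f"{k} components"
        print(f"{label:>14}: {cv_accuracy(k):.3f}")
```

Comparing the printed accuracies across component counts gives a rough indication of how sensitive the tree is to feature reduction on this particular data; the numbers depend on the data variant, the fold count, and the seed, so they should not be read as a reproduction of the reported results.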
Original language | English |
---|---|
Publication status | Published - 2016 |
Externally published | Yes |
Event | 2016 International Conference on Inventive Computation Technologies, ICICT 2016, Coimbatore, India; Duration: 26 Aug 2016 → 27 Aug 2016 |
Conference
Conference | 2016 International Conference on Inventive Computation Technologies, ICICT 2016 |
---|---|
Country/Territory | India |
City | Coimbatore |
Period | 26/08/16 → 27/08/16 |
Keywords
- Cancer
- Cancer detection
- Decision trees
- Predictive models
- Principal component analysis
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Artificial Intelligence
- Computer Graphics and Computer-Aided Design
- Computer Networks and Communications
- Computer Science Applications
- Health Informatics