Improving the performance of the ripper in insurance risk classification: A comparative study using feature selection

Mlungisi Duma, Bhekisipho Twala, Tshilidzi Marwala, Fulufhelo V. Nelwamondo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Citations (Scopus)

Abstract

The Ripper algorithm is designed to generate rule sets for large datasets with many features. However, it has been shown that its classification performance deteriorates in the presence of missing data: the algorithm struggles to classify instances as the quality of the data degrades with an increasing proportion of missing values. In this paper, feature selection techniques are used to improve the classification performance of the Ripper algorithm. Principal component analysis and evidence automatic relevance determination are the techniques chosen, and they are compared to see which improves the algorithm the most. Training datasets with completely observed data were used to train the algorithm, and testing datasets with missing values were used to measure accuracy. The results show that principal component analysis is the better feature selection technique for the Ripper: with it, classification performance improves significantly and the algorithm becomes more resilient to escalating missing data.
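
The evaluation protocol described in the abstract (fit on complete data, reduce features with principal component analysis, then measure accuracy on test data with increasing amounts of missing values) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the data is synthetic, a nearest-centroid classifier stands in for the Ripper rule learner, missing test values are filled with training feature means, and all function names are hypothetical.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the training mean and the top principal axes of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Project X onto the principal axes learned from the training data."""
    return (X - mean) @ components.T

def mask_at_random(X, fraction, rng):
    """Return a copy of X with roughly `fraction` of its entries set to NaN."""
    X_missing = X.copy()
    X_missing[rng.random(X.shape) < fraction] = np.nan
    return X_missing

def impute_with_mean(X, feature_means):
    """Assumed strategy: replace NaN entries with training feature means."""
    return np.where(np.isnan(X), feature_means, X)

def fit_centroids(Z, y):
    """Stand-in classifier (not Ripper): one centroid per class."""
    labels = np.unique(y)
    return labels, np.stack([Z[y == c].mean(axis=0) for c in labels])

def predict(Z, labels, centroids):
    """Assign each row of Z to the class of its nearest centroid."""
    dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return labels[dists.argmin(axis=1)]

def accuracy_under_missingness(fractions, seed=0):
    """Train on complete data, test on data with missing values."""
    rng = np.random.default_rng(seed)
    # Synthetic two-class data: 10 features, only the first two informative.
    n, d = 400, 10
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, d))
    X[:, :2] += 3.0 * y[:, None]
    X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

    mean, comps = fit_pca(X_train, n_components=2)
    labels, centroids = fit_centroids(project(X_train, mean, comps), y_train)

    results = {}
    for f in fractions:
        X_miss = impute_with_mean(mask_at_random(X_test, f, rng), mean)
        pred = predict(project(X_miss, mean, comps), labels, centroids)
        results[f] = float((pred == y_test).mean())
    return results
```

Calling `accuracy_under_missingness([0.0, 0.2, 0.5])` returns a dictionary mapping each missing-data fraction to test accuracy, which is the kind of curve the study compares across feature selection techniques.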

Original language: English
Title of host publication: ICINCO 2011 - Proceedings of the 8th International Conference on Informatics in Control, Automation and Robotics
Pages: 203-210
Number of pages: 8
Publication status: Published - 2011
Event: 8th International Conference on Informatics in Control, Automation and Robotics, ICINCO 2011 - Noordwijkerhout, Netherlands
Duration: 28 Jul 2011 to 31 Jul 2011

Publication series

Name: ICINCO 2011 - Proceedings of the 8th International Conference on Informatics in Control, Automation and Robotics
Volume: 1

Conference

Conference: 8th International Conference on Informatics in Control, Automation and Robotics, ICINCO 2011
Country/Territory: Netherlands
City: Noordwijkerhout
Period: 28/07/11 to 31/07/11

Keywords

  • Artificial neural network
  • Automatic relevance determination
  • Missing data
  • Principal component analysis
  • Ripper

ASJC Scopus subject areas

  • Information Systems
  • Control and Systems Engineering
