Abstract
Software defect prediction is a critical task in software engineering, enabling organizations to proactively identify and address potential issues in software systems, thereby improving quality and reducing costs. In this study, we evaluated and compared several machine learning models for software defect prediction on a combination of diverse datasets: logistic regression (LR), random forest (RF), support vector machines (SVMs), convolutional neural networks (CNNs), and eXtreme Gradient Boosting (XGBoost). The models were trained and tested on preprocessed, feature-selected data and then optimized through hyperparameter tuning. The results were analyzed using classification reports, confusion matrices, receiver operating characteristic (ROC) curves with area under the curve (AUC), precision–recall curves, and cumulative gain charts. XGBoost consistently outperformed the other models, achieving the highest accuracy, precision, recall, and AUC scores. This indicates its robustness and suitability for predicting software defects in real-world applications.
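To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how a confusion matrix, precision, recall, accuracy, ROC AUC, and a cumulative gain value can be computed for a binary defect classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration, and only the Python standard library is used.

```python
from typing import List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 means 'defective'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_accuracy(y_true: List[int], y_pred: List[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, accuracy); undefined ratios fall back to 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random defective module is scored
    above a random clean one (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cumulative_gain(y_true: List[int], scores: List[float], fraction: float) -> float:
    """Share of all defective modules found in the top `fraction` of
    modules when ranked by predicted score (highest first)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    k = int(fraction * len(ranked))
    found = sum(label for _, label in ranked[:k])
    return found / sum(y_true)


# Toy data (made up for illustration): 1 = defective module, 0 = clean module.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("confusion (tp, fp, fn, tn):", confusion_counts(y_true, y_pred))
print("precision, recall, accuracy:", precision_recall_accuracy(y_true, y_pred))
print("ROC AUC:", roc_auc(y_true, scores))
print("gain at top 25%:", cumulative_gain(y_true, scores, 0.25))
```

The pairwise AUC above is quadratic in the number of modules, which is fine for a small illustration; on large datasets a rank-based computation or an established library routine would be the more practical choice.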
| Original language | English |
|---|---|
| Article number | 8832164 |
| Journal | IET Software |
| Volume | 2025 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - 2025 |
Keywords
- NLP techniques
- ensemble learning
- hyperparameter optimization
- machine learning models
- software defect prediction
- text mining
ASJC Scopus subject areas
- Computer Graphics and Computer-Aided Design
Article title
Predicting Software Perfection Through Advanced Models to Uncover and Prevent Defects