Abstract
Antigenic peptides (APs), also known as T-cell epitopes (TCEs), are the immunogenic segments of pathogens capable of inducing an immune response, making them potential candidates for epitope-based vaccine (EBV) design. Traditional wet-lab methods for identifying TCEs are expensive, challenging, and time-consuming. Computational approaches employing machine learning (ML) techniques offer a faster and more cost-effective alternative. In this study, we present a robust XGBoost ML model for predicting TCEs of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) as potential vaccine candidates. Peptide sequences comprising TCEs and non-TCEs were retrieved from the Immune Epitope Database Repository (IEDB) and subjected to a feature extraction process to obtain their physicochemical properties for model training. When evaluated on a test dataset, the model achieved an accuracy of 97.6%, outperforming other ML classifiers. Five-fold cross-validation yielded a mean accuracy of 97.58%, indicating consistent performance across all folds. While the predicted epitopes show promise as vaccine candidates for SARS-CoV-2, further scientific examination through in vivo and in vitro studies is essential to validate their suitability.
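As a rough illustration of the two steps the abstract names, feature extraction from peptide sequences and five-fold cross-validation, the following is a minimal sketch using only the Python standard library. The chosen features (amino acid composition, mean Kyte-Doolittle hydropathy, and peptide length) and the helper names `peptide_features` and `k_fold_indices` are assumptions made for illustration; the abstract does not specify which physicochemical properties or tooling the authors used.

```python
from statistics import mean

# The 20 standard amino acids, in a fixed order for the composition vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (one common physicochemical descriptor).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_features(seq):
    """Return an illustrative feature vector for one peptide.

    Layout (an assumption, not the paper's feature set):
    20 amino acid fractions, mean hydropathy, sequence length.
    """
    seq = seq.upper()
    if not seq or any(aa not in KYTE_DOOLITTLE for aa in seq):
        raise ValueError(f"not a standard peptide sequence: {seq!r}")
    n = len(seq)
    composition = [seq.count(aa) / n for aa in AMINO_ACIDS]
    hydropathy = mean(KYTE_DOOLITTLE[aa] for aa in seq)
    return composition + [hydropathy, float(n)]


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Shuffle the data beforehand if the samples are ordered by class.
    """
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        yield indices[:start] + indices[stop:], indices[start:stop]
        start = stop


if __name__ == "__main__":
    # Placeholder sequence, not an epitope taken from the study.
    print(len(peptide_features("ACDEFGHIK")))
    for train_idx, test_idx in k_fold_indices(10, k=5):
        print(test_idx)
```

In a full pipeline, the feature vectors for each training fold would be passed to a gradient-boosted classifier (for example `xgboost.XGBClassifier`), and the accuracies on the five held-out folds would be averaged to obtain a mean cross-validation accuracy.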
| Original language | English |
|---|---|
| Article number | e2319 |
| Journal | PeerJ Computer Science |
| Volume | 10 |
| Publication status | Published - 2024 |
Keywords
- Antigenic peptide
- COVID-19
- Epitope-based vaccine
- Machine learning
- SARS-CoV-2
- T-cell epitope
- XGBoost
ASJC Scopus subject areas
- General Computer Science