TY - JOUR
T1 - Optimal feature selection for a weighted k-nearest neighbors for compound fault classification in wind turbine gearbox
AU - Gbashi, Samuel M.
AU - Adedeji, Paul A.
AU - Olatunji, Obafemi O.
AU - Madushele, Nkosinathi
N1 - Publisher Copyright: © 2024
PY - 2025/3
Y1 - 2025/3
AB - The k-nearest neighbors (k-NN) algorithm is renowned for its adaptability and ease of use, making it a favored choice for data-driven turbine component fault diagnostics. However, the basic k-NN model is constrained by the curse of dimensionality, rendering it ineffective at capturing the dynamics in high-dimensional turbine gearbox vibration signals. To address this problem, this study proposed a framework for choosing the best features for a k-NN fault diagnostic model while also leveraging the benefits of “weighting” to further improve its performance. The framework was validated using vibration signals from a wind turbine gearbox under four different fault conditions. The study first extracted statistical frequency-domain and time-domain features from the vibration dataset as inputs to the model. The most discriminative features (including the mean, mean frequency, standard deviation frequency, maximum frequency, and Shannon entropy) were selected from the feature space using the proposed strategy. The results indicate that incorporating weights into the k-NN model improved classification accuracy by 0.005 %. The Manhattan distance metric outperformed all other metrics in classifying the various gearbox health states. The optimal k-value was determined to be 20. Overall, the optimal k-NN model achieved an average score of 95.95 % across all performance metrics, with accuracy at 95.97 %, recall at 95.97 %, precision at 95.93 %, and F1 score at 95.93 %. The fault diagnostic model is recommended for deployment in wind turbine gearbox condition monitoring.
KW - Fault detection
KW - Feature selection
KW - Vibration signals
KW - Weighted k-nearest neighbors
KW - Wind turbine gearbox
UR - http://www.scopus.com/inward/record.url?scp=85212968006&partnerID=8YFLogxK
U2 - 10.1016/j.rineng.2024.103791
DO - 10.1016/j.rineng.2024.103791
M3 - Article
AN - SCOPUS:85212968006
SN - 2590-1230
VL - 25
JO - Results in Engineering
JF - Results in Engineering
M1 - 103791
ER -