TY - JOUR
T1 - HARNet in deep learning approach—a systematic survey
AU - Kumar, Neelam Sanjeev
AU - Deepika, G.
AU - Goutham, V.
AU - Buvaneswari, B.
AU - Reddy, R. Vijaya Kumar
AU - Angadi, Sanjeevkumar
AU - Dhanamjayulu, C.
AU - Chinthaginjala, Ravikumar
AU - Mohammad, Faruq
AU - Khan, Baseem
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
AB - A comprehensive examination of human action recognition (HAR) methodologies situated at the convergence of deep learning and computer vision is the subject of this article. We examine the progression from handcrafted feature-based approaches to end-to-end learning, with a particular focus on the significance of large-scale datasets. By classifying research paradigms, such as temporal modelling and spatial features, our proposed taxonomy illuminates the merits and drawbacks of each. We specifically present HARNet, an architecture for Multi-Model Deep Learning that integrates recurrent and convolutional neural networks while utilizing attention mechanisms to improve accuracy and robustness. The VideoMAE v2 method (https://github.com/OpenGVLab/VideoMAEv2) has been utilized as a case study to illustrate practical implementations and obstacles. For researchers and practitioners interested in gaining a comprehensive understanding of the most recent advancements in HAR as they relate to computer vision and deep learning, this survey is an invaluable resource.
KW - Accuracy
KW - CNN
KW - Deep learning
KW - Feature-based approaches
KW - Human action recognition (HAR)
UR - https://www.scopus.com/pages/publications/85189901430
U2 - 10.1038/s41598-024-58074-y
DO - 10.1038/s41598-024-58074-y
M3 - Article
C2 - 38600138
AN - SCOPUS:85189901430
SN - 2045-2322
VL - 14
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 8363
ER -