TY - GEN
T1 - Real-Time South African Sign Language Interpretation Using Computer Vision Methods
AU - Magodi, Precious
AU - Moodley, Tevin
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
N2 - Vision-based sign language recognition significantly advances communication within the deaf community, enhancing accessibility and inclusion for those who are deaf or hard of hearing. This paper presents a system developed for real-time recognition of South African Sign Language (SASL) using Google’s MediaPipe framework for spatial feature extraction and a long short-term memory (LSTM) network for temporal modelling. We utilise a subset of the ASL Citizen dataset, focusing on five classes: “SCHOOL,” “TIMEOUT,” “MORNING,” “THANK YOU,” and “I LOVE YOU,” which serve as proxies for SASL vocabulary. Keypoint sequences from both hands and body pose are extracted via MediaPipe and fed into a two-layer LSTM for classification. Trained with a TensorFlow TFRecord pipeline, our model achieves a test accuracy of 35.5% and highlights the challenges posed by limited data and variability among signers. This work demonstrates the potential of combining MediaPipe and LSTM for real-time sign recognition and emphasises the need for larger, language-specific datasets to improve accuracy.
AB - Vision-based sign language recognition significantly advances communication within the deaf community, enhancing accessibility and inclusion for those who are deaf or hard of hearing. This paper presents a system developed for real-time recognition of South African Sign Language (SASL) using Google’s MediaPipe framework for spatial feature extraction and a long short-term memory (LSTM) network for temporal modelling. We utilise a subset of the ASL Citizen dataset, focusing on five classes: “SCHOOL,” “TIMEOUT,” “MORNING,” “THANK YOU,” and “I LOVE YOU,” which serve as proxies for SASL vocabulary. Keypoint sequences from both hands and body pose are extracted via MediaPipe and fed into a two-layer LSTM for classification. Trained with a TensorFlow TFRecord pipeline, our model achieves a test accuracy of 35.5% and highlights the challenges posed by limited data and variability among signers. This work demonstrates the potential of combining MediaPipe and LSTM for real-time sign recognition and emphasises the need for larger, language-specific datasets to improve accuracy.
KW - Deaf Accessibility
KW - Long Short-Term Memory (LSTM)
KW - Real-time Gesture Recognition
KW - Sign Language Recognition (SLR)
KW - South African Sign Language (SASL)
UR - https://www.scopus.com/pages/publications/105030923569
U2 - 10.1007/978-3-032-13196-6_35
DO - 10.1007/978-3-032-13196-6_35
M3 - Conference contribution
AN - SCOPUS:105030923569
SN - 9783032131959
T3 - Lecture Notes in Networks and Systems
SP - 377
EP - 386
BT - Information Systems for Intelligent Systems - Proceedings of ISBM 2025
A2 - Iglesias, Andres
A2 - Shin, Jungpil
A2 - Bhatt, Nityesh
A2 - Joshi, Amit
PB - Springer Science and Business Media Deutschland GmbH
T2 - 4th World Conference on Information Systems for Business Management, ISBM 2025
Y2 - 24 September 2025 through 26 September 2025
ER -