Hybrid Dual-Stream Deep Learning Approach for Real-Time Kannada Sign Language Recognition in Assistive Healthcare
Background: Recent advances in sign language recognition (SLR) focus on high-resource languages (e.g., ASL), leaving low-resource languages like Kannada Sign Language (KSL) underserved. Edge-compatible, real-time SLR systems for healthcare remain scarce, with most existing methods (CNN-LSTM, 3D ResNet) failing to balance accuracy and latency for dynamic gestures.
Objective: This work aims to develop a real-time, edge-deployable KSL recognition system for assistive healthcare, addressing gaps in low-resource language processing and spatio-temporal modeling of regional gestures.
Methods: We propose a hybrid dual-stream deep learning architecture that combines EfficientNetB0 for spatial feature extraction from RGB frames with a lightweight Transformer whose pose-aware attention models 3D hand keypoints (MediaPipe-derived roll/pitch/yaw angles). We curated a new KSL medical dataset (1,080 videos covering 10 critical healthcare gestures) and trained the model using transfer learning. Performance was evaluated quantitatively (accuracy, latency) against CNN-LSTM and 3D ResNet baselines and in real-world tests.
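For illustration, the sketch below shows one way such a dual-stream model could be assembled in Keras: a frozen EfficientNetB0 applied per frame for the spatial stream, and a small Transformer encoder over per-frame keypoint features for the pose stream. The clip length, feature dimensions, layer widths, and concatenation-based fusion are assumptions for the example, not the authors' exact configuration.

# Illustrative dual-stream sketch: EfficientNetB0 on RGB frames plus a
# lightweight Transformer encoder on hand-keypoint features. Sizes and the
# fusion scheme are assumptions, not the paper's exact configuration.
import tensorflow as tf
from tensorflow.keras import layers

NUM_FRAMES = 16      # assumed clip length
KEYPOINT_DIM = 66    # assumed: 21 landmarks x 3 coords + roll/pitch/yaw
NUM_CLASSES = 10     # 10 healthcare gestures (from the dataset)

# --- Spatial stream: frozen EfficientNetB0 applied to each frame ---
backbone = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", pooling="avg")
backbone.trainable = False  # transfer learning: freeze ImageNet weights

frames_in = layers.Input((NUM_FRAMES, 224, 224, 3), name="rgb_frames")
spatial = layers.TimeDistributed(backbone)(frames_in)  # (T, 1280) per clip
spatial = layers.GlobalAveragePooling1D()(spatial)     # average over time

# --- Keypoint stream: one Transformer encoder block (positional
# --- encoding omitted for brevity) ---
kp_in = layers.Input((NUM_FRAMES, KEYPOINT_DIM), name="hand_keypoints")
x = layers.Dense(128)(kp_in)  # project keypoints to model width
attn = layers.MultiHeadAttention(num_heads=4, key_dim=32)(x, x)
x = layers.LayerNormalization()(x + attn)              # residual + norm
ff = layers.Dense(256, activation="relu")(x)
ff = layers.Dense(128)(ff)
x = layers.LayerNormalization()(x + ff)
temporal = layers.GlobalAveragePooling1D()(x)

# --- Fusion and classification ---
fused = layers.Concatenate()([spatial, temporal])
out = layers.Dense(NUM_CLASSES, activation="softmax")(fused)

model = tf.keras.Model([frames_in, kp_in], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

A model of this shape can be exported to TensorFlow.js for browser-side edge inference, consistent with the deployment target described in the Results.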
Results: The system achieved 97.6% training accuracy, 96.7% validation accuracy, and 81% accuracy in real-world tests with unseen users and lighting conditions, with 53 ms inference latency on edge devices (TensorFlow.js, 1.2 GB RAM), outperforming the baselines by ≥12% accuracy at similar latency. The two-stage output pipeline (Kannada text plus synthetic speech) achieved 98.2% speech synthesis accuracy (Google TTS API).
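As a minimal sketch of the two-stage output pipeline, the snippet below maps a recognized gesture class to Kannada text and synthesizes speech. It uses the gTTS package, a common Python wrapper around Google's TTS service; the paper only specifies "Google TTS API", so this library choice and the gesture-to-text mapping are illustrative assumptions.

# Sketch of the two-stage output: recognized gesture -> Kannada text ->
# synthesized speech. gTTS and this mapping are illustrative assumptions;
# the paper only specifies "Google TTS API".
from gtts import gTTS

# Hypothetical mapping from class index to Kannada text for two gestures.
GESTURE_TO_KANNADA = {
    0: "ಸಹಾಯ ಬೇಕು",   # "need help"
    1: "ನೋವು ಇದೆ",    # "in pain"
}

def speak_gesture(class_index: int, out_path: str = "gesture.mp3") -> str:
    """Convert a recognized gesture class to Kannada speech audio."""
    text = GESTURE_TO_KANNADA[class_index]
    tts = gTTS(text=text, lang="kn")  # 'kn' = Kannada
    tts.save(out_path)
    return text

# Example: synthesize speech for the gesture predicted by the model.
print(speak_gesture(0))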
Conclusion: Our architecture successfully bridges low-resource SLR and edge AI, proving feasible for healthcare deployment. Limitations include sensitivity to rapid hand rotations and dialect variations.
Keywords: Assistive Healthcare, Edge AI, Kannada Sign Language, Low-resource Language, Real-time Recognition, Transformer.
Copyright (c) 2025 The Authors. Published by Universitas Airlangga.

This work is licensed under a Creative Commons Attribution 4.0 International License.