A Multi-Modal AI Framework for Real-Time American Sign Language Translation
Rohan P. Nandanwar, Diksha Nishane, Himanshu Chamatkar, Om Nishane, P. P. Zode, Lakshmi Madireddy
Published 2025 in the 2025 3rd DMIHER International Conference on Artificial Intelligence in Healthcare, Education and Industry (IDICAIHEI)
ABSTRACT
Effective communication is a fundamental human right, yet significant barriers persist for deaf and non-verbal individuals. This paper introduces a novel multi-modal AI framework for real-time American Sign Language (ASL) translation, aiming to bridge this gap. Our system pioneers the fusion of two distinct data streams: computer vision for recognising hand shapes and movements, and surface electromyography (sEMG) for capturing the underlying muscle-level gesture intent. This dual-modal approach improves robustness, overcoming common limitations of vision-only systems (e.g., occlusion, poor lighting) and sEMG-only systems (e.g., sensor drift). We propose an attention-based fusion network that integrates these data streams. Evaluated on a comprehensive dataset, our multi-modal framework achieves 94.5% accuracy, significantly outperforming single-modality baselines. This work details the system architecture, data fusion strategy, and real-world deployment challenges, offering a viable pathway toward creating more inclusive and effective assistive communication technologies.
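The record does not describe the paper's fusion architecture beyond "attention-based". As a rough illustration only, the sketch below shows one standard way such a fusion could be built: per-frame vision features attend to windowed sEMG features via cross-attention before sign classification. All module names, dimensions, and input shapes are assumptions for illustration, not details taken from the paper.

```python
# Illustrative sketch only: the paper's exact fusion network is not specified
# in this record. This shows a generic cross-attention fusion of a vision
# feature stream and an sEMG feature stream for sign classification.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, vision_dim=512, semg_dim=128, fused_dim=256, num_signs=100):
        super().__init__()
        # Project both modalities into a shared embedding space.
        self.vision_proj = nn.Linear(vision_dim, fused_dim)
        self.semg_proj = nn.Linear(semg_dim, fused_dim)
        # Vision frames (queries) attend over sEMG time steps (keys/values).
        self.cross_attn = nn.MultiheadAttention(fused_dim, num_heads=4, batch_first=True)
        self.classifier = nn.Sequential(
            nn.LayerNorm(fused_dim),
            nn.Linear(fused_dim, num_signs),
        )

    def forward(self, vision_feats, semg_feats):
        # vision_feats: (batch, T_v, vision_dim) per-frame visual features
        # semg_feats:   (batch, T_e, semg_dim)  windowed sEMG features
        q = self.vision_proj(vision_feats)
        kv = self.semg_proj(semg_feats)
        fused, _ = self.cross_attn(q, kv, kv)   # attention-weighted fusion
        pooled = fused.mean(dim=1)              # temporal average pooling
        return self.classifier(pooled)          # sign-class logits

# Example usage with random tensors standing in for real features.
model = CrossModalFusion()
logits = model(torch.randn(2, 30, 512), torch.randn(2, 50, 128))
print(logits.shape)  # torch.Size([2, 100])
```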
PUBLICATION RECORD
- Publication year
2025
- Venue
2025 3rd DMIHER International Conference on Artificial Intelligence in Healthcare, Education and Industry (IDICAIHEI)
- Publication date
2025-11-28