This paper
introduces a novel hybrid framework for Indian Sign Language (ISL) translation
that performs real-time recognition of both static and dynamic gestures and
generates multilingual outputs in both text and speech. Unlike existing systems
that are limited to either static classification or single-language outputs,
our approach integrates a fine-tuned ResNet50V2 model for static gesture
classification (98.7% accuracy) and a YOLOv8m detector for dynamic word
recognition (88.7% mAP@50). The system employs MediaPipe for efficient hand
landmark extraction and incorporates frame-skipping and cooldown strategies to
optimize real-time performance on CPU-based devices, achieving an average of
3.4 FPS without GPU acceleration. Recognized gestures are mapped to sequences,
translated into eight Indian languages using Google Translate, and converted
into synthesized speech using gTTS. Experimental results validate the system's
robustness across gesture types and linguistic outputs. The proposed work is
the first to offer a complete ISL-to-text-and-speech pipeline with integrated
multi-language support via a desktop user interface. This makes it a scalable,
low-cost assistive tool designed to enhance accessibility and communication for
the hearing-impaired community in multilingual contexts.
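The frame-skipping and cooldown strategies mentioned above can be illustrated with a small, self-contained sketch. The class below is a hypothetical reconstruction, not the authors' code: `skip` and `cooldown_s` are assumed parameter names, and the policy shown (run inference on every N-th frame; suppress re-emitting the same gesture label within a cooldown window) is one plausible reading of the abstract's description.

```python
import time


class GestureScheduler:
    """Illustrative frame-skipping plus per-gesture cooldown (assumed design)."""

    def __init__(self, skip=5, cooldown_s=2.0):
        self.skip = skip              # run inference on every `skip`-th frame
        self.cooldown_s = cooldown_s  # seconds before the same label may repeat
        self.frame_idx = 0
        self.last_emit = {}           # gesture label -> timestamp of last emission

    def should_infer(self):
        """Return True only for frames selected by the skipping policy."""
        run = (self.frame_idx % self.skip == 0)
        self.frame_idx += 1
        return run

    def emit(self, label, now=None):
        """Return True if `label` may be appended to the recognized sequence."""
        now = time.monotonic() if now is None else now
        last = self.last_emit.get(label)
        if last is not None and now - last < self.cooldown_s:
            return False              # still cooling down; drop the duplicate
        self.last_emit[label] = now
        return True
```

Skipping frames trades recognition latency for throughput on CPU-only hardware, while the cooldown prevents a held gesture from flooding the output sequence with duplicate words before translation and speech synthesis.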
IRJIET, Volume 9, Special Issue of ICCIS-2025 May 2025 pp. 155-161