Advancing Dysarthric Speech-to-Text Recognition with LATTE: A Low-Latency Acoustic Modeling Approach for Real-Time Communication.

Qurat Ul Ain, Hammad Afzal, Fazli Subhan, Mazliham Mohd Suud, Younhyun Jung

Published 2026 in Big Data

ABSTRACT

Dysarthria, a motor speech disorder characterized by slurred and often unintelligible speech, presents substantial challenges for effective communication. Conventional automatic speech recognition systems frequently underperform on dysarthric speech, particularly in severe cases. To address this gap, we introduce low-latency acoustic transcription and textual encoding (LATTE), a framework designed for real-time dysarthric speech recognition. LATTE integrates preprocessing, acoustic processing, and transcription mapping into a unified pipeline. Its core is a hybrid architecture that combines convolutional layers for acoustic feature extraction with bidirectional temporal layers for modeling temporal dependencies. Evaluated on the UA-Speech dataset, LATTE achieves a word error rate of 12.5%, a phoneme error rate of 8.3%, and a character error rate of 1%. By enabling accurate, low-latency transcription of impaired speech, LATTE provides a robust foundation for enhancing communication and accessibility in both digital applications and real-time interactive environments.
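
The error rates quoted in the abstract are conventionally defined as an edit distance between the reference and the recognized sequence, divided by the reference length. The sketch below illustrates that standard definition only; it is not the authors' evaluation code, and the function names (edit_distance, error_rate, wer, cer) are hypothetical. It also assumes whitespace tokenization for words and no text normalization, which the abstract does not specify.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate, assuming whitespace tokenization."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate over raw characters, spaces included."""
    return error_rate(list(reference), list(hypothesis))
```

A phoneme error rate follows the same pattern: pass phoneme sequences to error_rate instead of words or characters. Because the numerator counts insertions as well, these rates can exceed 100% when the hypothesis is much longer than the reference.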

PUBLICATION RECORD

Publication year
2026
Venue
Big Data
Publication date
2026-02-09
Fields of study
Medicine, Computer Science, Engineering
Identifiers
DOI 10.1177/2167647X251411174 PMID 41657331
Source metadata
Semantic Scholar

REFERENCES

A novel Swin transformer based framework for speech recognition for dysarthria (2025)
Dysarthric speech recognition: an investigation on using depthwise separable convolutions and residual connections (2024)
Robust language independent voice data driven Parkinson's disease detection (2024)
A Comprehensive Survey on the Data-Driven Approaches used for Tackling the COVID-19 Pandemic (2024)
Improving cross-lingual low-resource speech recognition by Task-based Meta PolyLoss (2024)
Data-driven approaches to bridging the gap in health communication disparities: A systematic review (2024)
Exploration of Whisper fine-tuning strategies for low-resource ASR (2024)
Intelligent speech technologies for transcription, disease diagnosis, and medical equipment interactive control in smart hospitals: A review (2023)
A Survey of Automatic Speech Recognition for Dysarthric Speech (2023)
Deep learning-based speech analysis for Alzheimer’s disease detection: a literature review (2022)
Biosignal Sensors and Deep Learning-Based Speech Recognition: A Review (2021)
Improving dysarthric speech recognition using empirical mode decomposition and convolutional neural network (2020)
Dysarthric Speech Recognition using Convolutional Recurrent Neural Networks (2020)
Investigation of Data Augmentation Techniques for Disordered Speech Recognition (2020)
An attention Long Short-Term Memory based system for automatic classification of speech intelligibility (2020)
A Review on Different Approaches for Speech Recognition System (2015)
The TORGO database of acoustic and articulatory speech from speakers with dysarthria (2011)
