Dysarthria, a motor speech disorder characterized by slurred and often unintelligible speech, presents substantial challenges for effective communication. Conventional automatic speech recognition systems frequently underperform on dysarthric speech, particularly in severe cases. To address this gap, we introduce low-latency acoustic transcription and textual encoding (LATTE), a framework designed for real-time dysarthric speech recognition. LATTE integrates preprocessing, acoustic processing, and transcription mapping into a unified pipeline. Its core is a hybrid architecture that combines convolutional layers for acoustic feature extraction with bidirectional temporal layers for modeling temporal dependencies. Evaluated on the UA-Speech dataset, LATTE achieves a word error rate of 12.5%, a phoneme error rate of 8.3%, and a character error rate of 1%. By enabling accurate, low-latency transcription of impaired speech, LATTE provides a foundation for improving communication and accessibility in both digital applications and real-time interactive environments.
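The three error rates reported above share one standard definition: the edit distance between the reference and the recognized sequence, divided by the reference length. The sketch below illustrates that definition only. The abstract does not say which scoring tool or text normalization the authors used, and the function names `edit_distance`, `error_rate`, `wer`, and `cer` are hypothetical helpers introduced here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalized by reference length (assumed convention)."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def wer(reference, hypothesis):
    """Word error rate: tokens are whitespace-separated words."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    """Character error rate: tokens are single characters."""
    return error_rate(list(reference), list(hypothesis))
```

A phoneme error rate follows the same pattern: pass two phoneme sequences to `error_rate`. Because the denominator is the reference length, a rate can exceed 100% when the hypothesis contains many insertions.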
Advancing Dysarthric Speech-to-Text Recognition with LATTE: A Low-Latency Acoustic Modeling Approach for Real-Time Communication
Qurat Ul Ain, Hammad Afzal, Fazli Subhan, Mazliham Mohd Suud, Younhyun Jung
Published 2026 in Big Data
PUBLICATION RECORD
- Publication year: 2026
- Venue: Big Data
- Publication date: 2026-02-09
- Fields of study: Medicine, Computer Science, Engineering
- Source metadata: Semantic Scholar