The 2010 CMU GALE speech-to-text system

Florian Metze, Roger Hsiao, Qin Jin, Udhyakumar Nallasamy, Tanja Schultz

Published 2010 in Interspeech

ABSTRACT

This paper describes the latest Speech-to-Text system developed for the Global Autonomous Language Exploitation (“GALE”) domain by Carnegie Mellon University (CMU). This system uses discriminative training, bottle-neck features, and other techniques that were not used in previous versions of our system, and is trained on 1150 hours of data from a variety of Arabic speech sources. In this paper, we show how different lexica, pre-processing, and system combination techniques can be used to improve the final output, and provide analysis of the improvements achieved by the individual techniques.

Index Terms: speech recognition, discriminative training, bottle-neck features

PUBLICATION RECORD

Publication year
2010
Venue
Interspeech
Publication date
Unknown publication date
Fields of study
Linguistics, Computer Science
Identifiers
DOI 10.21437/Interspeech.2010-439
Source metadata
Semantic Scholar

REFERENCES

The IBM 2011 GALE Arabic speech transcription system
2011, cited by this paper
A comparative study on system combination schemes for LVCSR
2010, cited by this paper
Recent improvements to the Cambridge Arabic Speech-to-Text systems
2010, cited by this paper
Training and adapting MLP features for Arabic speech recognition
2009, cited by this paper
Lexical and phonetic modeling for Arabic automatic speech recognition
2009, cited by this paper
Generalized discriminative feature transformation for speech recognition
2009, cited by this paper
Correlated Bigram LSA for Unsupervised Language Model Adaptation
2008, cited by this paper
Optimizing bottle-neck features for LVCSR
2008, cited by this paper
Boosted MMI for model and feature-space discriminative training
2008, cited by this paper
Development of the SRI/Nightingale Arabic ASR system
2008, cited by this paper
Transcribing broadcast data using MLP features
2008, cited by this paper
Advances in the CMU/Interact Arabic GALE Transcription System
2007, cited by this paper
Issues in Arabic Orthography and Morphology Analysis
2004, influential reference
Speaker segmentation and clustering in meetings
2004, cited by this paper
SRILM - an extensible language modeling toolkit
2002, cited by this paper
A one-pass decoder based on polymorphic linguistic context assignment
2001, cited by this paper
Tandem connectionist feature extraction for conventional HMM systems
2000, cited by this paper
Semi-tied covariance matrices for hidden Markov models
1999, cited by this paper
Speaker normalization based on frequency warping
1997, cited by this paper

CITED BY

Cross-Lingual Bridges with Models of Lexical Borrowing
2016, cites this paper
Proposal: Cross-Lingual Transfer of Linguistic and Metalinguistic Knowledge via Lexical Borrowing
2015, cites this paper
Constraint-Based Models of Lexical Borrowing
2015, cites this paper
Automatic Speech Recognition for Low-resource Languages and Accents Using Multilingual and Crosslingual Information
2014, cites this paper
Multilingual multilayer perceptron for rapid language adaptation between and across language families
2013, cites this paper
Lattice-based training of bottleneck feature extraction neural networks
2013, cites this paper
Towards single pass discriminative training for speech recognition
2012, cites this paper
Active learning for accent adaptation in Automatic Speech Recognition
2012, cites this paper
An Investigation on Initialization Schemes for Multilayer Perceptron Training Using Multilingual Dat
2012, cites this paper
Enhanced Polyphone Decision Tree Adaptation for Accented Speech Recognition
2012, cites this paper
Generalized discriminative training for speech recognition
2012, cites this paper
Multilingual bottle-neck features and its application for under-resourced languages
2012, cites this paper
Semi-supervised learning for speech recognition in the context of accent adaptation
2012, cites this paper
Acoustic and Lexical Modeling Techniques for Accented Speech Recognition
2012, cites this paper
Temporal Patterns (TRAPs) in Janus Recognition Toolkit Student Project of Tatiana Glushkova
2012, cites this paper
Temporal Patterns (TRAPs) in Janus Recognition Toolkit
2012, cites this paper
A Study on Speaker Normalized MLP Features in LVCSR
2011, cites this paper
The 2011 KIT QUAERO speech-to-text system for Spanish
2011, cites this paper
Analysis of Dialectal Influence in Pan-Arabic ASR
2011, cites this paper
Generalized Baum-Welch Algorithm and its Implication to a New Extended Baum-Welch Algorithm
2011, cites this paper