End-to-end automatic speech recognition systems often fail to transcribe domain-specific named entities, causing catastrophic failures in downstream tasks. Numerous fast and lightweight named entity correction (NEC) models have been proposed in recent years. These models, mainly leveraging phonetic-level edit distance algorithms, have shown impressive performances. However, when the forms of the wrongly-transcribed words(s) and the ground-truth entity are significantly different, these methods often fail to locate the wrongly transcribed words in hypothesis, thus limiting their usage. We propose a novel NEC method that utilizes speech sound features to retrieve candidate entities. With speech sound features and candidate entities, we inovatively design a generative method to annotate entity errors in ASR transcripts and replace the text with correct entities. This method is effective in scenarios of word form difference. We test our method using open-source and self-constructed test sets. The results demonstrate that our NEC method can bring significant improvement to entity accuracy. The self-constructed training data and test set is publicly available at github.com/L6-NLP/Generative-Annotation-NEC.
Generative Annotation for ASR Named Entity Correction
Yuanchang Luo,Daimeng Wei,Shaojun Li,Hengchao Shang,Jiaxin Guo,Zongyao Li,Zhanglin Wu,Xiaoyu Chen,Zhiqiang Rao,Jinlong Yang,Hao Yang
Published 2025 in Conference on Empirical Methods in Natural Language Processing
ABSTRACT
PUBLICATION RECORD
- Publication year
2025
- Venue
Conference on Empirical Methods in Natural Language Processing
- Publication date
2025-08-28
- Fields of study
Computer Science
- Identifiers
- External record
- Source metadata
Semantic Scholar
CITATION MAP
EXTRACTION MAP
CLAIMS
- No claims are published for this paper.
CONCEPTS
- No concepts are published for this paper.
REFERENCES
Showing 1-23 of 23 references · Page 1 of 1
CITED BY
Showing 1-1 of 1 citing papers · Page 1 of 1