Improving Speech Recognition with Prompt-based Contextualized ASR and LLM-based Re-predictor

Nguyen Manh Tien Anh, Thach Ho Sy

Published 2024 in Interspeech

ABSTRACT

In recent years, advances in automatic speech recognition (ASR) systems have led to their widespread use in applications such as call center bots and virtual assistants. However, these systems still struggle with adverse speech conditions, missing contextual information, and rare words. In this paper, we propose a novel architecture that tackles these limitations by integrating Large Language Models (LLMs) and prompt mechanisms to improve ASR accuracy. By combining a pre-trained text encoder, a text adapter for task-specific adaptation, and an efficient LLM-based re-prediction mechanism, our method performs well across a range of real-world scenarios. Compared to a baseline ASR system on multiple public datasets, our proposed system achieves an average relative word error rate improvement of 27% for conventional tasks, 30% for utterance-level contextual tasks, and 33% for word-level biasing tasks.
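
To make the "relative word error rate improvement" figures above concrete, the following is a minimal sketch of how such a number is commonly computed. It assumes the standard definitions (WER as word-level edit distance divided by reference length, and relative improvement as the WER reduction divided by the baseline WER); the abstract does not spell out the formula. The function names and the example WER values are hypothetical and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, system_wer: float) -> float:
    """Relative WER reduction, expressed as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - system_wer) / baseline_wer


# Hypothetical values for illustration only (not results from the paper):
# a baseline at 10.0% WER and a system at 7.3% WER.
print(f"{relative_improvement(0.100, 0.073):.0%}")
```

Under this definition, a relative improvement of 27% means the system removes roughly a quarter of the baseline's word errors; it does not mean the absolute WER drops by 27 percentage points.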
