Open-set speaker identification in broadcast news

Chao Gao, G. Saikumar, Amit Srivastava, P. Natarajan

Published 2011 in IEEE International Conference on Acoustics, Speech, and Signal Processing

ABSTRACT

In this paper, we examine the problem of text-independent open-set speaker identification (OS-SI) in broadcast news. In particular, we investigate how the size of the registered-speaker population affects OS-SI performance, which is a central issue in designing practical OS-SI systems. We amend the maximum mutual information (MMI)-based discriminative training scheme to facilitate its incorporation into OS-SI systems. We also improve the implementation so that the MMI-based approach can be applied to 2048-component Gaussian mixture models (GMMs). All systems are evaluated on the NIST RT-03, RT-04, and FBIS corpora, with a maximum of 82 registered speakers. Our study shows that MMI-based discriminative training yields a notable performance improvement: a 15.9% relative reduction in equal error rate (EER) compared with the GMM-MAP scheme.
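The abstract gives no implementation details, so the following is only a minimal sketch of the general ideas it names: an open-set decision (pick the best-scoring registered speaker, then accept or reject against a threshold), an approximate EER, and a relative reduction. All function names, the threshold sweep, and the simplification of ignoring confusions between registered speakers are assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Optional, Tuple


def open_set_identify(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the best-scoring registered speaker, or None if the score
    falls below the threshold (the segment is treated as out-of-set)."""
    best_speaker = max(scores, key=scores.get)
    if scores[best_speaker] >= threshold:
        return best_speaker
    return None


def equal_error_rate(
    target_scores: List[float], impostor_scores: List[float]
) -> Tuple[float, float]:
    """Approximate the EER by sweeping the threshold over observed scores.

    Returns (eer, threshold) at the point where the false-rejection rate
    and the false-acceptance rate are closest. Assumption: speaker
    confusion errors among registered speakers are ignored here.
    """
    best: Optional[Tuple[float, float, float]] = None
    for t in sorted(set(target_scores) | set(impostor_scores)):
        # In-set trials scored below the threshold are falsely rejected.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # Out-of-set trials scored at or above the threshold are falsely accepted.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    assert best is not None, "at least one score is required"
    return best[1], best[2]


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of an error rate, e.g. 0.20 -> 0.15 gives 0.25."""
    return (baseline - improved) / baseline
```

A "relative" reduction is measured against the baseline error rate, not in absolute percentage points; for example, with hypothetical values, `relative_reduction(0.20, 0.15)` evaluates to 0.25, i.e. a 25% relative reduction.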

PUBLICATION RECORD

  • Publication year

    2011

  • Venue

    IEEE International Conference on Acoustics, Speech, and Signal Processing

  • Publication date

    2011-05-01

  • Fields of study

    Computer Science

  • Source metadata

    Semantic Scholar
