Structural Gaussian mixture models for efficient text-independent speaker verification

Bing Xiang, T. Berger

Published 2002 in Interspeech

ABSTRACT

Structural Gaussian mixture models (SGMMs) are proposed for efficient text-independent speaker verification. A structural background model (SBM) is first constructed by hierarchically clustering all Gaussian mixture components in a universal background model (UBM). In this way, the acoustic space is partitioned into multiple regions at different levels of resolution. For each target speaker, an SGMM is generated through multi-level maximum a posteriori (MAP) adaptation from the SBM. During testing, only a small subset of Gaussian mixture components is scored for each feature vector, which reduces the computational cost significantly. Furthermore, the scores obtained in different layers of the tree-structured models are combined via a neural network for the final decision. Different configurations are compared in experiments conducted on the telephony speech data used in the NIST speaker verification evaluation. The experimental results show that a computational reduction by a factor of 17 can be achieved, with the equal error rate (EER) reduced by […] compared with the baseline. The SGMM-SBM also shows some advantages over the recently proposed hash GMM.
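
The idea of scoring only a small subset of components per feature vector can be illustrated with a minimal two-level sketch. This is not the authors' implementation: the clustering here is plain k-means on the component means (an assumed stand-in for the paper's hierarchical clustering), each parent node is a moment-matched summary of its children, only the single best parent is expanded, and MAP adaptation and neural-network score fusion are left out. All function names are hypothetical.

```python
import numpy as np

def log_gauss_diag(x, means, variances):
    """Log density of vector x under each diagonal-covariance Gaussian (one per row)."""
    d = x.shape[0]
    diff = x - means
    return -0.5 * (d * np.log(2.0 * np.pi)
                   + np.log(variances).sum(axis=1)
                   + (diff * diff / variances).sum(axis=1))

def logsumexp(a):
    """Numerically stable log(sum(exp(a)))."""
    m = a.max()
    return m + np.log(np.exp(a - m).sum())

def build_two_level_tree(weights, means, variances, n_clusters, n_iter=10):
    """Group leaf components with k-means on their means (assumption, not the
    paper's clustering) and summarise each group by one parent Gaussian."""
    # Deterministic initialisation: evenly spaced components as starting centres.
    init = np.linspace(0, len(means) - 1, n_clusters).astype(int)
    centers = means[init].astype(float)
    for _ in range(n_iter):
        dist = ((means[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(assign == k):
                centers[k] = means[assign == k].mean(axis=0)
    # Final assignment against the converged centres; drop empty clusters.
    dist = ((means[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    assign = dist.argmin(axis=1)
    children = [np.flatnonzero(assign == k) for k in range(n_clusters)]
    children = [idx for idx in children if idx.size > 0]
    # Moment-matched parent: pooled weight, mean, and diagonal variance.
    p_w = np.array([weights[idx].sum() for idx in children])
    p_mu = np.array([(weights[idx][:, None] * means[idx]).sum(axis=0) / w
                     for idx, w in zip(children, p_w)])
    p_m2 = np.array([(weights[idx][:, None]
                      * (variances[idx] + means[idx] ** 2)).sum(axis=0) / w
                     for idx, w in zip(children, p_w)])
    return {"w": p_w, "mu": p_mu, "var": p_m2 - p_mu ** 2, "children": children}

def full_score(x, weights, means, variances):
    """Baseline: log-likelihood of x using every leaf component."""
    return logsumexp(np.log(weights) + log_gauss_diag(x, means, variances))

def structural_score(x, tree, weights, means, variances):
    """Score all parents, then only the leaves under the best parent.
    Returns the approximate log-likelihood and the number of Gaussians evaluated."""
    parent_ll = np.log(tree["w"]) + log_gauss_diag(x, tree["mu"], tree["var"])
    idx = tree["children"][int(parent_ll.argmax())]
    leaf_ll = np.log(weights[idx]) + log_gauss_diag(x, means[idx], variances[idx])
    return logsumexp(leaf_ll), len(parent_ll) + len(idx)
```

Because the structural score sums over a subset of the leaf terms, it never exceeds the full score; the approximation is close when the components outside the selected cluster contribute little to the likelihood of the frame. The saving comes from the count of evaluated Gaussians, which is the number of parents plus the size of one cluster rather than the full mixture size.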
