TinyCAM: A Lightweight and Discriminative Network for Tibetan Speaker Verification

Zhenye Gan, Le Wei, Li Zhang

Published 2025 in 2025 International Conference on Algorithms, Data Mining, and Information Technology (ADMIT)

ABSTRACT

Speaker verification for minority languages faces considerable challenges, primarily due to the scarcity of training data and the structural complexities inherent to these languages. In particular, Tibetan speech is characterized by its rich plosive inventory and strong rhythmic patterns, which pose unique difficulties for robust speaker modeling. To address these issues, this paper proposes TinyCAM, an efficient and lightweight speaker verification framework tailored for such acoustic properties. Built upon the CAM++ architecture, TinyCAM introduces three structural enhancements aimed at improving feature representation and computational efficiency. First, the framework incorporates WDS-ResBlock, which fuses wavelet convolution with depthwise separable convolution to enable more effective multi-scale processing of local time-frequency details. Second, it employs a streamlined GS-TDNN module, which utilizes grouped and pointwise convolutions to capture diverse feature types and temporal dynamics with reduced complexity. Third, TinyCAM integrates the SLRT Layer, which combines low-rank compression with sparse constraints to minimize channel redundancy while enhancing information extraction. Extensive experiments and ablation studies conducted on a Tibetan speech dataset demonstrate that these architectural innovations not only improve recognition accuracy but also significantly reduce model size and computational overhead. Specifically, TinyCAM achieves an Equal Error Rate (EER) of 5.9905% and a minimum Detection Cost Function (minDCF) of 0.8287, with only 6.67 million parameters and 1.38 GFLOPs, marking reductions of 26.6% and 11.7%, respectively, compared to the original model. These results highlight TinyCAM's strong potential for practical deployment in resource-constrained scenarios, particularly in applications involving minority languages.
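
The parameter savings that the abstract attributes to depthwise separable convolution, grouped convolution, and low-rank compression can be illustrated with simple weight counting. The sketch below is not TinyCAM's implementation: the function names are hypothetical, and the channel width (512), kernel size (3), group count (8), and rank (64) are assumed values chosen only for illustration, since the abstract does not give the actual layer configuration.

```python
# Illustrative weight counts only. All layer sizes below are assumptions,
# not values taken from the TinyCAM paper.

def conv1d_params(c_in, c_out, kernel, groups=1):
    """Number of weights in a 1-D convolution (bias terms ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * kernel


def depthwise_separable_params(c_in, c_out, kernel):
    """Depthwise convolution (one filter per channel) plus 1x1 pointwise."""
    depthwise = conv1d_params(c_in, c_in, kernel, groups=c_in)
    pointwise = conv1d_params(c_in, c_out, 1)
    return depthwise + pointwise


def low_rank_params(c_in, c_out, rank):
    """A c_in x c_out projection factored into two matrices of given rank."""
    return c_in * rank + rank * c_out


if __name__ == "__main__":
    c, k = 512, 3  # assumed channel width and kernel size
    print("standard conv:      ", conv1d_params(c, c, k))
    print("grouped conv (g=8): ", conv1d_params(c, c, k, groups=8))
    print("depthwise separable:", depthwise_separable_params(c, c, k))
    print("dense projection:   ", conv1d_params(c, c, 1))
    print("low-rank (r=64):    ", low_rank_params(c, c, 64))
```

Under these assumed sizes, each substitution cuts the weight count of the layer it replaces by a large factor. The 26.6% parameter reduction reported for the whole model is much smaller, which is consistent with only some layers being replaced, though the abstract does not say which.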
