Phoneme dependent inter-session variability reduction for speaker verification

Publisher: Inderscience Publishers
Copyright: © 2015 Inderscience Enterprises Ltd.
ISSN: 1755-8301
eISSN: 1755-831X
DOI: 10.1504/IJBM.2015.070922

Abstract

GMM-UBM supervectors can model speakers poorly in speaker verification because of inter-session variability, especially when only a small number of training utterances are available. In this study, we propose a phoneme-dependent method to suppress inter-session variability. A speaker's model is represented by several phoneme-specific Gaussian mixture models. Each model covers an individual phoneme, whose inter-session variability can be constrained in an inter-session-independent subspace. This subspace is constructed by principal component analysis (PCA) from a corpus uttered by a single speaker and recorded over a long period. SVM-based experiments on a large corpus, constructed by the National Research Institute of Police Science (NRIPS) to evaluate Japanese speaker recognition, demonstrate the improvements gained from the proposed method.

Keywords: inter-session variability; phoneme; speaker verification; principal component analysis.

Reference to this paper should be made as follows: Lu, H., Zhang, W., Horiuchi, Y. and Kuroiwa, S. (2015) 'Phoneme dependent inter-session variability reduction for speaker verification', Int. J. Biometrics, Vol. 7, No. 2, pp. 83-96.

Biographical notes: Haoze Lu received his BE in Electronic Commerce from Donghua University, Shanghai, China. Currently, he is a doctoral student of the Graduate School of Science and Technology, …
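The abstract gives no implementation details, so the following is only a rough sketch of the general idea it describes, not the authors' method. It assumes that, for a single phoneme, each recording session of one speaker yields one fixed-length supervector, that PCA over those supervectors picks out the dominant directions of session-to-session variation, and that those directions are then projected out of any new supervector. The function names, the dimensions, and the synthetic data are all hypothetical.

```python
import numpy as np


def session_subspace(supervectors, n_components):
    """Estimate the main directions of session variation for one phoneme.

    supervectors: array of shape (n_sessions, dim), all assumed to come
    from the same speaker, so the spread around the mean is session noise.
    Returns an orthonormal basis of shape (dim, n_components).
    """
    centered = supervectors - supervectors.mean(axis=0, keepdims=True)
    # The right singular vectors of the centered data are the PCA directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components].T


def remove_session_variability(x, basis):
    """Project x onto the orthogonal complement of the session subspace."""
    return x - (x @ basis) @ basis.T


# Synthetic stand-in for one phoneme: a fixed speaker vector plus a large
# session offset along one axis and a little isotropic noise.
rng = np.random.default_rng(0)
dim, n_sessions = 8, 20
direction = np.zeros(dim)
direction[0] = 1.0
speaker_mean = rng.normal(size=dim)
sessions = (
    speaker_mean
    + rng.normal(scale=5.0, size=(n_sessions, 1)) * direction
    + rng.normal(scale=0.01, size=(n_sessions, dim))
)

basis = session_subspace(sessions, n_components=1)
cleaned = remove_session_variability(sessions, basis)
```

In a phoneme-dependent setup one such basis would presumably be estimated per phoneme, and the cleaned per-phoneme vectors would then be passed to the SVM. How the paper combines the phoneme models is not stated in the abstract.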

Journal: International Journal of Biometrics (Inderscience Publishers)

Published: Jan 1, 2015
