MAP speaker adaptation of state duration distributions for speech recognition

Yoma, Néstor Becerra; Sánchez, Jorge Silva

dc.creator	Yoma, Néstor Becerra
dc.creator	Sánchez, Jorge Silva
dc.date.accessioned	2019-01-29T17:51:50Z
dc.date.available	2019-01-29T17:51:50Z
dc.date.created	2019-01-29T17:51:50Z
dc.date.issued	2002
dc.identifier	IEEE Transactions on Speech and Audio Processing, Volume 10, Issue 7, 2002, Pages 443-450
dc.identifier	10636676
dc.identifier	10.1109/TSA.2002.803441
dc.identifier	http://repositorio.uchile.cl/handle/2250/163580
dc.description.abstract	This paper presents a framework for maximum a posteriori (MAP) speaker adaptation of state duration distributions in hidden Markov models (HMMs). Four key issues of MAP estimation are tackled: the analysis and modeling of state duration distributions, the choice of prior distribution, the specification of the parameters of the prior density, and the evaluation of the MAP estimates. Moreover, a comparison with an adaptation procedure based on maximum likelihood (ML) estimation is presented, and the problem of truncation of the state duration distribution is addressed from a statistical point of view. The results shown in this paper suggest that speaker adaptation of temporal restrictions substantially improves the accuracy of speaker-independent (SI) HMMs on both clean and noisy speech. The method has a low computational load, requires only a small number of adaptation utterances, and can be useful for tracking the dynamics of the speaking rate in speech recognition.
dc.language	en
dc.rights	http://creativecommons.org/licenses/by-nc-nd/3.0/cl/
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Chile
dc.source	IEEE Transactions on Speech and Audio Processing
dc.subject	Speaker adaptation
dc.subject	Speech recognition
dc.subject	State duration modeling
dc.title	MAP speaker adaptation of state duration distributions for speech recognition
dc.type	Journal articles

This item belongs to the following institution

Universidad de Chile