Speech recognition in a dialog system: from conventional to deep processing. A case study applied to Spanish

Becerra, Aldonso; De la Rosa Vargas, José Ismael; González Ramírez, Efrén

info:eu-repo/semantics/article

Date

2018-08

Registered in:

1380-7501

1573-7721

http://ricaxcan.uaz.edu.mx/jspui/handle/20.500.11845/1713

https://doi.org/10.48779/2d22-9s79

Author

Becerra, Aldonso

De la Rosa Vargas, José Ismael

González Ramírez, Efrén

Institution

Universidad Autónoma de Zacatecas (México)

Abstract

The aim of this paper is to present an overview of the automatic speech recognition (ASR) module in a spoken dialog system and of how it has evolved from the conventional GMM-HMM (Gaussian mixture model - hidden Markov model) architecture toward the recent nonlinear DNN-HMM (deep neural network - hidden Markov model) scheme. GMMs long dominated as the baseline for speech recognition, but in recent years, with the resurgence of artificial neural networks (ANNs), they have been surpassed in most recognition tasks. A notable feature of ANN-based acoustic models is that their weights can be adjusted in two training steps: i) initialization of the weights (with or without pre-training) and ii) fine-tuning.
