Robustness over time-varying channels in DNN-HMM ASR based human-robot interaction

dc.date.accessioned	2021-08-23T22:56:12Z
dc.date.accessioned	2022-10-19T00:26:09Z
dc.date.available	2021-08-23T22:56:12Z
dc.date.available	2022-10-19T00:26:09Z
dc.date.created	2021-08-23T22:56:12Z
dc.date.issued	2017
dc.identifier	http://hdl.handle.net/10533/251795
dc.identifier	1151306
dc.identifier	WOS:000457505000178
dc.identifier.uri	https://repositorioslatinoamericanos.uchile.cl/handle/2250/4483058
dc.description.abstract	This paper addresses the problem of time-varying channels in speech-recognition-based human-robot interaction using Locally-Normalized Filter-Bank (LNFB) features and training strategies that compensate for microphone response and room acoustics. Testing utterances were generated by re-recording the Aurora-4 testing database with a PR2 mobile robot equipped with a Kinect audio interface while the robot performed head rotations and moved toward and away from a fixed source. Three training conditions were evaluated: Clean, 1-IR and 33-IR. With Clean training, the DNN-HMM system was trained on the Aurora-4 clean training database. With 1-IR training, the same training data were convolved with an impulse response estimated at one meter from the source with no rotation of the robot head. With 33-IR training, the Aurora-4 training data were convolved with impulse responses estimated at one, two and three meters from the source and at 11 angular positions of the robot head. For both LNFB and conventional Mel filterbank (MelFB) features, 33-IR training reduced WER by more than 50% relative to Clean training. Even so, with 33-IR training, LNFB features yielded a WER 23% lower than MelFB features. Combining 33-IR training with LNFB features reduced WER by 64% compared with Clean training and MelFB features.
dc.language	eng
dc.relation	https://doi.org/10.21437/Interspeech.2017-1308
dc.relation	handle/10533/111557
dc.relation	10.21437/Interspeech.2017-1308
dc.relation	handle/10533/111541
dc.relation	handle/10533/108045
dc.rights	info:eu-repo/semantics/article
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Chile
dc.rights	http://creativecommons.org/licenses/by-nc-nd/3.0/cl/
dc.title	Robustness over time-varying channels in DNN-HMM ASR based human-robot interaction
dc.type	Article
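
The 1-IR and 33-IR training conditions described in the abstract both amount to convolving clean training utterances with room impulse responses. The following is a minimal sketch of that idea, not the authors' pipeline. The function names, the synthetic impulse responses (used only as stand-ins for measured ones), the angle values, and the random assignment of one impulse response per utterance are all assumptions made for illustration; the record only states that responses were estimated at one, two and three meters and at 11 angular positions of the robot head, which gives the 33 responses in the condition's name.

```python
import numpy as np


def convolve_with_ir(speech, ir):
    """Convolve an utterance with an impulse response, trimmed to the input length."""
    return np.convolve(speech, ir, mode="full")[: len(speech)]


def build_ir_grid(distances_m, angles_deg, ir_len=256, seed=0):
    """Hypothetical stand-in for measured impulse responses.

    Each (distance, angle) pair gets a synthetic exponentially decaying
    noise tail. Real use would load the estimated responses instead.
    """
    rng = np.random.default_rng(seed)
    grid = {}
    for d in distances_m:
        for a in angles_deg:
            decay = np.exp(-np.arange(ir_len) / (40.0 * d))
            ir = rng.standard_normal(ir_len) * decay
            ir[0] = 1.0  # direct path
            grid[(float(d), float(a))] = ir
    return grid


def augment(utterances, ir_grid, seed=0):
    """Assign one impulse response per utterance at random (an assumption)."""
    rng = np.random.default_rng(seed)
    keys = sorted(ir_grid)
    augmented = []
    for speech in utterances:
        key = keys[rng.integers(len(keys))]
        augmented.append((key, convolve_with_ir(speech, ir_grid[key])))
    return augmented


# 3 distances x 11 head angles = 33 impulse responses.
# The angle values below are illustrative, not taken from the paper.
ir_grid = build_ir_grid([1.0, 2.0, 3.0], np.linspace(-150.0, 150.0, 11))
```

Under these assumptions, the 1-IR condition corresponds to a grid with a single entry (one meter, no head rotation), and Clean training corresponds to skipping the convolution entirely.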

This item belongs to the following institution

ANID (Chile)