dc.date.accessioned: 2021-08-23T22:56:12Z
dc.date.accessioned: 2022-10-19T00:26:09Z
dc.date.available: 2021-08-23T22:56:12Z
dc.date.available: 2022-10-19T00:26:09Z
dc.date.created: 2021-08-23T22:56:12Z
dc.date.issued: 2017
dc.identifier: http://hdl.handle.net/10533/251795
dc.identifier: 1151306
dc.identifier: WOS:000457505000178
dc.identifier.uri: https://repositorioslatinoamericanos.uchile.cl/handle/2250/4483058
dc.description.abstract: This paper addresses the problem of time-varying channels in speech-recognition-based human-robot interaction using Locally-Normalized Filter-Bank (LNFB) features and training strategies that compensate for microphone response and room acoustics. Testing utterances were generated by re-recording the Aurora-4 testing database using a PR2 mobile robot equipped with a Kinect audio interface while the robot performed head rotations and movements toward and away from a fixed source. Three training conditions, called Clean, 1-IR and 33-IR, were evaluated. With Clean training, the DNN-HMM system was trained using the Aurora-4 clean training database. With 1-IR training, the same training data were convolved with an impulse response estimated at one meter from the source with no rotation of the robot head. With 33-IR training, the Aurora-4 training data were convolved with impulse responses estimated at one, two and three meters from the source and at 11 angular positions of the robot head. The 33-IR training method produced reductions in WER greater than 50% when compared with Clean training, using both LNFB and conventional Mel filterbank (MelFB) features. Nevertheless, LNFB features provided a WER 23% lower than MelFB using 33-IR training. The use of 33-IR training and LNFB features reduced WER by 64% compared to Clean training and MelFB features.
dc.language: eng
dc.relation: https://doi.org/10.21437/Interspeech.2017-1308
dc.relation: handle/10533/111557
dc.relation: 10.21437/Interspeech.2017-1308
dc.relation: handle/10533/111541
dc.relation: handle/10533/108045
dc.rights: info:eu-repo/semantics/article
dc.rights: info:eu-repo/semantics/openAccess
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Chile
dc.rights: http://creativecommons.org/licenses/by-nc-nd/3.0/cl/
dc.title: Robustness over time-varying channels in DNN-HMM ASR based human-robot interaction
dc.type: Article
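
The 1-IR and 33-IR conditions described in the abstract, in which clean training data are convolved with estimated impulse responses, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `simulate_channel` and `build_multi_ir_set`, the peak renormalization, and the random assignment of one impulse response per utterance are assumptions made for illustration only. The impulse responses are assumed to be available as NumPy arrays at the same sampling rate as the speech.

```python
import numpy as np


def simulate_channel(speech, impulse_response):
    """Convolve a clean utterance with one impulse response (hypothetical helper)."""
    speech = np.asarray(speech, dtype=float)
    impulse_response = np.asarray(impulse_response, dtype=float)

    # Linear convolution models the combined microphone and room response.
    reverberant = np.convolve(speech, impulse_response, mode="full")

    # Trim the tail so the output has the same length as the input utterance.
    reverberant = reverberant[: len(speech)]

    # Assumption: rescale to the original peak level to avoid clipping.
    peak = np.max(np.abs(reverberant))
    if peak > 0:
        reverberant = reverberant * (np.max(np.abs(speech)) / peak)
    return reverberant


def build_multi_ir_set(utterances, impulse_responses, seed=0):
    """Pair each utterance with one randomly chosen impulse response.

    Assumption: the abstract does not say how the impulse responses were
    distributed over the training data; random assignment is used here.
    Returns a list of (impulse_response_index, processed_utterance) tuples.
    """
    rng = np.random.default_rng(seed)
    processed = []
    for utterance in utterances:
        index = int(rng.integers(len(impulse_responses)))
        processed.append((index, simulate_channel(utterance, impulse_responses[index])))
    return processed
```

Under these assumptions, a 1-IR style set corresponds to calling `build_multi_ir_set` with a list holding a single impulse response, and a 33-IR style set to a list of 33 responses (three distances times 11 head angles).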