dc.date.accessioned | 2021-08-23T22:56:12Z | |
dc.date.accessioned | 2022-10-19T00:26:09Z | |
dc.date.available | 2021-08-23T22:56:12Z | |
dc.date.available | 2022-10-19T00:26:09Z | |
dc.date.created | 2021-08-23T22:56:12Z | |
dc.date.issued | 2017 | |
dc.identifier | http://hdl.handle.net/10533/251795 | |
dc.identifier | 1151306 | |
dc.identifier | WOS:000457505000178 | |
dc.identifier.uri | https://repositorioslatinoamericanos.uchile.cl/handle/2250/4483058 | |
dc.description.abstract | This paper addresses the problem of time-varying channels in speech-recognition-based human-robot interaction using Locally-Normalized Filter-Bank (LNFB) features and training strategies that compensate for microphone response and room acoustics. Testing utterances were generated by re-recording the Aurora-4 testing database with a PR2 mobile robot equipped with a Kinect audio interface, while the robot performed head rotations and movements toward and away from a fixed source. Three training conditions were evaluated: Clean, 1-IR and 33-IR. With Clean training, the DNN-HMM system was trained using the Aurora-4 clean training database. With 1-IR training, the same training data were convolved with an impulse response estimated at one meter from the source with no rotation of the robot head. With 33-IR training, the Aurora-4 training data were convolved with impulse responses estimated at one, two and three meters from the source and at 11 angular positions of the robot head. Compared with Clean training, the 33-IR training method reduced the word error rate (WER) by more than 50% with both LNFB and conventional Mel filterbank (MelFB) features. Under 33-IR training, LNFB features still yielded a WER 23% lower than MelFB features. The combination of 33-IR training and LNFB features reduced WER by 64% compared to Clean training with MelFB features. | |
dc.language | eng | |
dc.relation | https://doi.org/10.21437/Interspeech.2017-1308 | |
dc.relation | handle/10533/111557 | |
dc.relation | 10.21437/Interspeech.2017-1308 | |
dc.relation | handle/10533/111541 | |
dc.relation | handle/10533/108045 | |
dc.rights | info:eu-repo/semantics/article | |
dc.rights | info:eu-repo/semantics/openAccess | |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Chile | |
dc.rights | http://creativecommons.org/licenses/by-nc-nd/3.0/cl/ | |
dc.title | Robustness over time-varying channels in DNN-HMM ASR based human-robot interaction | |
dc.type | Article | |
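
The 33-IR condition described in the abstract pairs three source distances with 11 head positions and convolves the training data with one impulse response per pair. The sketch below illustrates only that bookkeeping; it is not the authors' pipeline. The names `convolve`, `toy_impulse_response` and `multi_condition_copies` are hypothetical, the impulse responses are synthetic placeholders rather than measured ones, and the head positions are represented by indices 0 to 10 because the record does not give the angle values.

```python
from itertools import product


def convolve(signal, impulse_response):
    """Full linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Condition grid taken from the abstract: 3 distances x 11 head positions.
# The angle values are not stated in the record, so indices stand in for them.
DISTANCES_M = (1, 2, 3)
ANGLE_INDICES = tuple(range(11))


def toy_impulse_response(distance_m, angle_index):
    """Placeholder impulse response (an assumption, NOT a measured one).

    A direct path attenuated with distance plus one small echo whose
    strength depends on the head-position index.
    """
    direct = 1.0 / distance_m
    echo = 0.1 * (1 + angle_index) / 11
    return [direct, 0.0, echo]


def multi_condition_copies(utterance):
    """Return one convolved copy of the utterance per (distance, angle) pair."""
    return {
        (d, a): convolve(utterance, toy_impulse_response(d, a))
        for d, a in product(DISTANCES_M, ANGLE_INDICES)
    }


utterance = [1.0, 0.5, -0.25]  # stand-in for clean training samples
copies = multi_condition_copies(utterance)
print(len(copies))  # 33
```

In a real setup the placeholder function would be replaced by impulse responses estimated at each robot position, and an FFT-based convolution would likely be preferred for full-length utterances.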