info:eu-repo/semantics/article
A cross-linguistic analysis of the temporal dynamics of turn-taking cues using machine learning as a descriptive tool
Date
2020-12
Record in:
Brusco, Pablo; Vidal, Jazmín; Beňuš, Štefan; Gravano, Agustin; A cross-linguistic analysis of the temporal dynamics of turn-taking cues using machine learning as a descriptive tool; Elsevier Science; Speech Communication; 125; 12-2020; 24-40
0167-6393
CONICET Digital
CONICET
Authors
Brusco, Pablo
Vidal, Jazmín
Beňuš, Štefan
Gravano, Agustin
Abstract
In dialogue, speakers produce and perceive acoustic/prosodic turn-taking cues, which are fundamental for negotiating turn exchanges with their interlocutors. However, little is known about the temporal dynamics and cross-linguistic validity of these cues. In this work, we explore a set of acoustic/prosodic cues preceding three turn-transition types (hold, switch and backchannel) in three different languages (Slovak, American English and Argentine Spanish). To this end, we use and refine a set of machine learning techniques that enable a finer-grained temporal analysis of such cues, as well as a comparison of their relative explanatory power. Our results suggest that the three languages, despite belonging to distinct linguistic families, share the general usage of a handful of acoustic/prosodic features to signal turn transitions. We conclude that exploiting features such as speech rate, final-word lengthening, the pitch track over the final 200 ms, the intensity track over the final 1000 ms, and noise-to-harmonics ratio (a voice-quality feature) might prove useful for further improving the accuracy of the turn-taking modules found in modern spoken dialogue systems.