Conference object
Using LSTM-based Language Models and human Eye Movements metrics to understand next-word predictions
Authors
Umfurer, Alfredo
Kamienkowski, Juan E.
Bianchi, Bruno
Institution
Abstract
Modern Natural Language Processing (NLP) models can achieve great results in resolving different types of linguistic tasks. This is possible thanks to a high volume of internal parameters that are optimized during the training phase, allowing the models to capture high-level linguistic properties. For example, LSTM-based language models have the ability to find long-term dependencies between words in a text, and to use them to make predictions about upcoming words. Nevertheless, their complexity makes it hard to understand which features they use to generate predictions.
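As a minimal sketch of the next-word prediction mechanism described above (not the paper's code): an LSTM language model's output layer produces one logit per vocabulary word; a softmax turns these into a probability distribution over the next word, and the negative log probability of the word actually observed is its surprisal. The vocabulary and logit values below are hypothetical.

```python
import numpy as np

vocab = ["the", "cat", "sat", "mat", "dog"]      # toy vocabulary (assumed)
logits = np.array([2.0, 0.5, -1.0, 0.1, 0.3])    # hypothetical LSTM output logits

# Softmax: convert logits into a probability distribution over the next word.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Most probable next word under the model.
predicted = vocab[int(np.argmax(probs))]

# Surprisal (in bits) of an observed next word, e.g. "cat".
surprisal = -np.log2(probs[vocab.index("cat")])
print(predicted, float(surprisal))
```

Surprisal computed this way is a standard bridge between language-model probabilities and reading-time measures: less predictable words carry higher surprisal.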
The neurolinguistic field faces a similar issue when studying how our brain processes language. For example, every adult reader has the ability to understand long texts and to make predictions about upcoming words.
Nevertheless, our understanding of how these predictions are driven is limited. During the last decades, the study of eye movements during reading has shed some light on this topic, finding a relation between the time spent on a word (gaze duration) and its processing cost.
Here, we aim to understand how LSTM-based models predict future words and how these predictions relate to human predictions, by fitting statistical models commonly used in the neurolinguistic field with gaze duration as the dependent variable. We found that an AWD-LSTM language model can partially model eye movements, with high overlap with both human predictability and lexical frequency. Interestingly, this last overlap depends on the training corpus, being lower when the model is fine-tuned on a corpus similar to the one used for testing.
Sociedad Argentina de Informática e Investigación Operativa
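The analysis style described above can be sketched as a linear regression with gaze duration as the dependent variable and model-derived predictors as covariates. The sketch below uses synthetic data and ordinary least squares as a simplified stand-in for the statistical models used in the neurolinguistic literature; predictor names, effect sizes, and noise levels are all illustrative assumptions, not the paper's results.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical per-word predictors: LSTM surprisal (bits) and log lexical frequency.
surprisal = rng.uniform(1, 10, n)
log_freq = rng.uniform(-6, -2, n)

# Synthetic gaze durations (ms): longer on surprising, rare words
# (coefficients chosen for illustration only).
gaze = 200.0 + 12.0 * surprisal - 15.0 * log_freq + rng.normal(0, 10, n)

# Ordinary least squares fit: design matrix with intercept column.
X = np.column_stack([np.ones(n), surprisal, log_freq])
beta, *_ = np.linalg.lstsq(X, gaze, rcond=None)
print(beta)  # [intercept, surprisal slope, frequency slope]
```

A fitted positive surprisal slope and negative frequency slope would mirror the qualitative pattern the abstract describes: gaze duration increases with processing cost.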