An adversarial model for paraphrase generation

info:eu-repo/semantics/masterThesis

Date

2020

Record at:

1073514

http://hdl.handle.net/20.500.12590/16901

https://repositorioslatinoamericanos.uchile.cl/handle/2250/6478688

Author

Ochoa Luna, Jose Eduardo

Institution

Universidad Católica San Pablo (Perú)

Abstract

Paraphrasing is the act of expressing the idea of a sentence in different words. Paraphrase generation is an interesting and challenging task for three main reasons: (1) text is discrete by nature, (2) it is difficult to modify a sentence slightly without changing its meaning, and (3) there are no accurate automatic metrics for evaluating the quality of a paraphrase. The problem has been addressed with a variety of methods; more recently, neural network-based approaches have been applied to it. This thesis presents a novel framework for paraphrase generation in English. To that end, it focuses on and evaluates three aspects of a model, as the teaser figure shows: (a) static input representations extracted from pre-trained language models, (b) convolutional sequence-to-sequence models as the main architecture, and (c) a hybrid loss function that combines maximum likelihood with adversarial REINFORCE, avoiding the computationally expensive Monte-Carlo search. We compare our best models with several baselines on the Quora question pairs dataset. The results show that our framework is competitive with previous benchmarks.
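
As a rough illustration of aspect (c), the sketch below shows one common way to blend a maximum-likelihood term with a REINFORCE term. It is not taken from the thesis: the function names, the `mix` weight, the `baseline`, and the use of a single sentence-level discriminator score as the reward for every sampled token are all assumptions made for this example. Sharing one sentence-level reward across tokens is one way to avoid per-token Monte-Carlo rollouts; the thesis may do this differently.

```python
import math


def mle_loss(ref_log_probs):
    """Average negative log-likelihood of the reference paraphrase tokens."""
    return -sum(ref_log_probs) / len(ref_log_probs)


def reinforce_loss(sampled_log_probs, reward, baseline=0.0):
    """REINFORCE surrogate loss for one sampled paraphrase.

    `reward` is assumed to be a sentence-level score (for example, a
    discriminator's probability that the sample is human-written). The same
    reward is applied to every token, so no per-token rollouts are needed.
    """
    advantage = reward - baseline
    return -advantage * sum(sampled_log_probs) / len(sampled_log_probs)


def hybrid_loss(ref_log_probs, sampled_log_probs, reward, mix=0.5, baseline=0.0):
    """Convex combination of the two terms; `mix` is a hypothetical weight."""
    return (mix * mle_loss(ref_log_probs)
            + (1.0 - mix) * reinforce_loss(sampled_log_probs, reward, baseline))


# Toy values: log-probabilities the generator assigns to two reference tokens
# and to two tokens of its own sampled paraphrase.
ref = [math.log(0.5), math.log(0.5)]
sampled = [math.log(0.25), math.log(0.25)]
loss = hybrid_loss(ref, sampled, reward=0.8, mix=0.5, baseline=0.5)
```

Setting `mix=1.0` reduces the loss to plain maximum likelihood, and `mix=0.0` trains on the adversarial reward alone. In a real model the log-probabilities would come from the generator and gradients would flow through them.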

Subjects

Paraphrase generation

Input representations

Convolutional sequence to sequence

Adversarial training
