Journal articles
Portuguese text generation using factored language models
Date
2011
Record in:
Journal of the Brazilian Computer Society, Guildford, v. 19, n. 2, p. 135–146, jun. 2013
0104-6500
10.1007/s13173-012-0095-1
Authors
Novais, Eder Miranda de
Paraboni, Ivandre
Institution
Abstract
As in many other natural language processing (NLP) fields, statistical methods are now part of mainstream natural language generation (NLG). Developing systems of this kind, however, raises the issue of data sparseness, a problem that is particularly evident in morphologically rich languages such as Portuguese. This work presents a shallow surface realisation system that uses factored language models (FLMs) of Portuguese to overcome some of these difficulties. The system combines FLMs trained on a large corpus with a number of NLP resources that the Brazilian NLP research community has made publicly available in recent years, such as corpora, dictionaries, thesauri and others. Our FLM-based approach to surface realisation has been successfully applied to the generation of Brazilian newspaper headlines, and is shown to outperform a number of statistical and non-statistical baseline systems.