Brazil
| Journal articles
Extractive summarization using complex networks and syntactic dependency
Date
2013-08-02
Record in:
PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, AMSTERDAM, v. 391, n. 4, supl. 1, Part 3, pp. 1855-1864, FEB 15, 2012
0378-4371
10.1016/j.physa.2011.10.015
Author
Amancio, Diego R.
Nunes, Maria G. V.
Oliveira Junior, Osvaldo Novais de
Costa, Luciano da Fontoura
Institution
Abstract
The realization that statistical physics methods can be applied to analyze written texts represented as complex networks has led to several developments in natural language processing, including automatic summarization and evaluation of machine translation. Most importantly, so far only a few metrics of complex networks have been used and therefore there is ample opportunity to enhance the statistics-based methods as new measures of network topology and dynamics are created. In this paper, we employ for the first time the metrics betweenness, vulnerability and diversity to analyze written texts in Brazilian Portuguese. Using strategies based on diversity metrics, a better performance in automatic summarization is achieved in comparison to previous work employing complex networks. With an optimized method the Rouge score (an automatic evaluation method used in summarization) was 0.5089, which is the best value ever achieved for an extractive summarizer with statistical methods based on complex networks for Brazilian Portuguese. Furthermore, the diversity metric can detect keywords with high precision, which is why we believe it is suitable to produce good summaries. It is also shown that incorporating linguistic knowledge through a syntactic parser does enhance the performance of the automatic summarizers, as expected, but the increase in the Rouge score is only minor. These results reinforce the suitability of complex network methods for improving automatic summarizers in particular, and treating text in general. (C) 2011 Elsevier B.V. All rights reserved.
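The abstract reports results as a Rouge score, which measures word overlap between an automatic summary and a human reference summary. The following is a minimal sketch of one common variant, unigram recall (often called ROUGE-1 recall). This record does not state which Rouge variant or preprocessing the authors used, so the function name `rouge1_recall`, the whitespace tokenization, and the lowercasing are assumptions made here for illustration only.

```python
from collections import Counter


def rouge1_recall(candidate: str, reference: str) -> float:
    """Unigram recall: share of reference words also found in the candidate.

    Assumption: tokens are lowercased and split on whitespace. Real
    evaluations usually add stemming, stopword handling, or both.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    total_ref = sum(ref_counts.values())
    if total_ref == 0:
        return 0.0
    # Clip each word's match count by how often it occurs in the candidate,
    # so a repeated reference word is only credited as often as it was produced.
    overlap = sum(min(n, cand_counts[word]) for word, n in ref_counts.items())
    return overlap / total_ref


if __name__ == "__main__":
    summary = "the cat sat"
    reference = "the cat sat on the mat"
    print(rouge1_recall(summary, reference))
```

In this toy example the reference has six tokens and the candidate matches three of them ("the" once, "cat", "sat"), so the recall is 0.5. A value such as the 0.5089 quoted in the abstract would be an average over a whole evaluation corpus, not a single pair of texts.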