The use of semantic information from PALAVRAS: seeking to improve the selection of coreferent textual units in Automatic Summarization
TOMAZELA, Élen Cátia. O uso de informações semânticas do PALAVRAS : em busca do aprimoramento da seleção de unidades textuais correferentes na Sumarização Automática. 2010. 149 f. Dissertação (Mestrado em Ciências Humanas) - Universidade Federal de São Carlos, São Carlos, 2010.
Tomazela, Élen Cátia
This dissertation presents a theoretical heuristic model that draws not only on Veins Theory but also on semantic information obtained from the PALAVRAS parser to improve the selection of coreferential textual units for inclusion in automatic summaries. An analysis of the problems presented by VeinSum, an automatic summarizer, raised two main issues: the need to improve the salience of its summaries and the need to reduce their size so that they better match the compression rate. Better results can be achieved by eliminating irrelevant textual units, provided that the referential clarity of the summaries is not damaged. Heuristics based on the semantic information are proposed to this end. To offset inconsistencies in the semantic annotation, all the noun phrases in the Summ-it corpus were manually post-edited, which increases the credibility of the heuristics. Eleven texts from the corpus were analysed and the results obtained are satisfactory, although a wider study would be required to evaluate the proposal more thoroughly.