Um estudo comparativo de modelos baseados em estatísticas textuais, grafos e aprendizado de máquina para sumarização automática de textos em português

Leite, Daniel Saraiva

dc.contributor	Rino, Lúcia Helena Machado
dc.contributor	http://lattes.cnpq.br/0315640846525832
dc.contributor	http://lattes.cnpq.br/4602931087864561
dc.creator	Leite, Daniel Saraiva
dc.date.accessioned	2011-04-07
dc.date.accessioned	2016-06-02T19:05:48Z
dc.date.available	2011-04-07
dc.date.available	2016-06-02T19:05:48Z
dc.date.created	2011-04-07
dc.date.created	2016-06-02T19:05:48Z
dc.date.issued	2010-12-21
dc.identifier	LEITE, Daniel Saraiva. Um estudo comparativo de modelos baseados em estatísticas textuais, grafos e aprendizado de máquina para sumarização automática de textos em português. 2010. 231 f. Dissertação (Mestrado em Ciências Exatas e da Terra) - Universidade Federal de São Carlos, São Carlos, 2010.
dc.identifier	https://repositorio.ufscar.br/handle/ufscar/459
dc.description.abstract	Automatic text summarization has been of great interest in Natural Language Processing because of the need to process, in a short time, huge amounts of information that are usually delivered through distinct media. Large-scale methods are therefore of utmost importance for synthesizing information and making access to it simpler. They aim at preserving the relevant content of the sources with little or no human intervention. Building upon the extractive summarizer SuPor and focusing on texts in Portuguese, this MSc work explored varied features for automatic summarization. Computational methods driven especially by textual statistics, graphs and machine learning were explored. Applying such methods resulted in a meaningful extension of the SuPor system, and new summarization models were thus delineated. These are either based on each of the three methodologies in isolation or are hybrid. In this dissertation, they are generically named SuPor-2, after the original SuPor. All of them have been assessed by comparing them with each other or with other well-known automatic summarizers for texts in Portuguese. The intrinsic evaluation tasks were carried out entirely automatically and targeted the informativeness of the outputs, i.e., the automatic extracts. The results show a meaningful improvement for some SuPor-2 variations. The most promising models may thus be made available in the future for generic use. They may also be embedded as tools for varied Natural Language Processing purposes, and may even be useful for other related tasks, such as linguistic studies. Portability to other languages is possible by replacing the language-dependent resources, namely lexicons, part-of-speech taggers and stopword lists. The supervised models have so far been trained on news corpora. In spite of that, interested users may train them for other genres using the very same interfaces supplied by the systems.
dc.publisher	Universidade Federal de São Carlos
dc.publisher	BR
dc.publisher	UFSCar
dc.publisher	Programa de Pós-Graduação em Ciência da Computação - PPGCC
dc.rights	Open Access
dc.subject	Processamento da linguagem natural (Computação)
dc.subject	Sumarização automática
dc.subject	Inteligência artificial
dc.subject	Extractive automatic summarization
dc.subject	Graph-based automatic summarization
dc.subject	Automatic summarization based upon statistics
dc.subject	Machine learning approach for automatic summarization
dc.subject	Hybrid methods for automatic summarization
dc.subject	Automatic summarization
dc.subject	Natural language processing
dc.subject	Artificial intelligence
dc.title	Um estudo comparativo de modelos baseados em estatísticas textuais, grafos e aprendizado de máquina para sumarização automática de textos em português
dc.type	Thesis

This item belongs to the following institution

Universidade Federal de São Carlos (Brazil)

Related items

Sumarização automática abstrativa de textos utilizando Deep Learning

Joint semantic discourse models for automatic multi-document summarization

SAB(IO): A BIOLOGICALLY PLAUSIBLE CONNECTIONIST APPROACH TO AUTOMATIC TEXT SUMMARIZATION