Descrição linguística da complementaridade para a sumarização automática multidocumento

Souza, Jackson Wilke da Cruz

dc.contributor	Di Felippo, Ariani
dc.contributor	http://lattes.cnpq.br/8648412103197455
dc.contributor	http://lattes.cnpq.br/0019187301069627
dc.creator	Souza, Jackson Wilke da Cruz
dc.date.accessioned	2016-11-08T19:05:06Z
dc.date.available	2016-11-08T19:05:06Z
dc.date.created	2016-11-08T19:05:06Z
dc.date.issued	2015-11-11
dc.identifier	SOUZA, Jackson Wilke da Cruz. Descrição linguística da complementaridade para a sumarização automática multidocumento. 2015. Dissertação (Mestrado em Linguística) – Universidade Federal de São Carlos, São Carlos, 2015. Disponível em: https://repositorio.ufscar.br/handle/ufscar/8311.
dc.identifier	https://repositorio.ufscar.br/handle/ufscar/8311
dc.description.abstract	Automatic Multidocument Summarization (AMS) is a computational alternative for processing the large quantity of information available online. In AMS, the aim is to automatically generate a single coherent and cohesive summary from a set of documents that share the same subject and originate from different sources. Some AMS methods select the most important information in the collection to compose the summary. Selecting the main content sometimes requires identifying redundancy, complementarity and contradiction, which are characterized as the multidocument phenomena. The identification of complementarity, in particular, is relevant because a piece of information may be selected for the summary as a complement to information that has already been selected, making the summary more coherent and more informative. Some AMS methods condense the content of the documents by identifying relations from the Cross-document Structure Theory (CST), which are established between sentences of different documents. These relations (for example, Historical background) capture the phenomenon of complementarity. Automatic detection of these relations is often based on lexical similarity between a pair of sentences, since research on AMS cannot count on studies that have characterized the phenomenon and shown other relevant linguistic strategies for detecting complementarity automatically. In this work, we present a corpus-based linguistic description of complementarity. In addition, we formalize the characteristics of this phenomenon as attributes that support its automatic identification. As a result, we obtained sets of rules that show the most relevant attributes for the complementary CST relations (Historical background, Follow-up and Elaboration) and for the types of complementarity (temporal and timeless).
Thus, we hope to contribute both to Descriptive Linguistics, with a corpus-based survey of the linguistic characteristics of this phenomenon, and to Natural Language Processing, by means of rules that can support the automatic identification of CST relations and types of complementarity.
dc.language	por
dc.publisher	Universidade Federal de São Carlos
dc.publisher	UFSCar
dc.publisher	Programa de Pós-Graduação em Linguística - PPGL
dc.publisher	Câmpus São Carlos
dc.rights	Open access
dc.subject	Complementarity
dc.subject	CST relations
dc.subject	Text linguistics
dc.subject	Linguistic description
dc.subject	Automatic multidocument summarization
dc.title	Descrição linguística da complementaridade para a sumarização automática multidocumento
dc.type	Thesis

This item belongs to the following institution

Universidade Federal de São Carlos (Brazil)