dc.contributorDi Felippo, Ariani
dc.contributorhttp://lattes.cnpq.br/8648412103197455
dc.contributorhttp://lattes.cnpq.br/0019187301069627
dc.creatorSouza, Jackson Wilke da Cruz
dc.date.accessioned2020-04-28T11:44:13Z
dc.date.accessioned2022-10-10T21:27:08Z
dc.date.available2020-04-28T11:44:13Z
dc.date.available2022-10-10T21:27:08Z
dc.date.created2020-04-28T11:44:13Z
dc.date.issued2019-02-27
dc.identifierSOUZA, Jackson Wilke da Cruz. Aprofundamento da caracterização linguístico-computacional da complementaridade em um corpus jornalístico multidocumento. 2019. Thesis (Doctorate in Linguistics) – Universidade Federal de São Carlos, São Carlos, 2019. Available at: https://repositorio.ufscar.br/handle/ufscar/12643.
dc.identifierhttps://repositorio.ufscar.br/handle/ufscar/12643
dc.identifier.urihttp://repositorioslatinoamericanos.uchile.cl/handle/2250/4041922
dc.description.abstractIn the context of the dissemination of digital information, CISCO, a web security company, projects that 3.3 zettabytes of information will circulate on the Web in 2021. In this context, sub-areas of Natural Language Processing (NLP) develop linguistic-computational solutions to make better use of the limited time users have to cope with the volume of information circulating on the web. One of these sub-areas is Automatic Multi-document Summarization (AMS), which aims to create automatic summaries from collections of source texts that deal with the same subject. To support content selection for automatic summaries and to improve this technique, some studies rely on linguistic descriptions of multi-document phenomena. One of these phenomena is complementarity, which occurs when, in a sentence pair (S1, S2), S2 elaborates on some information present in S1. The theoretical model Cross-Document Structure Theory (CST) expresses complementarity as three semantic relations: Historical Background and Follow-up (temporal) and Elaboration (timeless). Some studies in this area indicate that (superficial) temporal linguistic attributes are relevant for automatically identifying these CST relations, yielding automatic classifiers with 75% accuracy. Thus, under the hypothesis that deep linguistic information could generate more efficient classifiers, we propose a refined set of attributes that characterize complementarity. After manually analyzing the sentence pairs annotated with the CST complementarity relations in the CSTNews corpus, we built a typology of 32 signs, organized in seven categories, namely: anaphora, textual structure, morphology, syntax, semantics, pragmatics. Using symbolic Machine Learning algorithms, it was possible to construct and train new classifiers, whose accuracy surpassed the state of the art.
Thus, we contribute to (i) Descriptive Linguistics, with a typology of signs that systematically presents the evidence and characteristics of complementarity in sentence pairs from journalistic texts, and to (ii) NLP, with a more refined and specific description for the automatic identification of complementarity and, consequently, for the selection of content for automatic multi-document summaries.
dc.languagepor
dc.publisherUniversidade Federal de São Carlos
dc.publisherUFSCar
dc.publisherPrograma de Pós-Graduação em Linguística - PPGL
dc.publisherCâmpus São Carlos
dc.rightshttp://creativecommons.org/licenses/by-nc-nd/3.0/br/
dc.rightsAttribution-NonCommercial-NoDerivs 3.0 Brazil
dc.subjectDescrição linguística
dc.subjectComplementaridade
dc.subjectFenômenos multidocumento
dc.subjectSumarização automática multidocumento
dc.subjectProcessamento automático de línguas naturais
dc.subjectLanguage description
dc.subjectComplementarity
dc.subjectMulti-document phenomena
dc.subjectAutomatic multi-document summarization
dc.subjectAutomatic natural language processing
dc.subjectText analysis
dc.titleAprofundamento da caracterização linguístico-computacional da complementaridade em um corpus jornalístico multidocumento
dc.typeThesis

