dc.contributor | Caseli, Helena de Medeiros | |
dc.contributor | http://lattes.cnpq.br/6608582057810385 | |
dc.contributor | http://lattes.cnpq.br/1341941141535178 | |
dc.creator | Polastri, Paulo César | |
dc.date.accessioned | 2016-10-14T14:13:28Z | |
dc.date.available | 2016-10-14T14:13:28Z | |
dc.date.created | 2016-10-14T14:13:28Z | |
dc.date.issued | 2016-03-04 | |
dc.identifier | POLASTRI, Paulo César. Aprendizado sem-fim de paráfrases. 2016. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, São Carlos, 2016. Disponível em: https://repositorio.ufscar.br/handle/ufscar/7868. | |
dc.identifier | https://repositorio.ufscar.br/handle/ufscar/7868 | |
dc.description.abstract | Using different words to express the same message is a necessity in any natural language and, as such, should be investigated in Natural Language Processing (NLP) research. When a single word is involved, the interchangeable words are called synonyms; the term paraphrase expresses a more general idea that may also involve more than one word. For example, the sentences "the light is red" and "the light is closed" are paraphrases, as "sign" and "traffic light" are synonymous in this context. Proper treatment of paraphrases is important in several NLP applications: in Machine Translation, where paraphrases can be used to increase the coverage of Statistical Machine Translation systems; in Multidocument Summarization, where paraphrase identification allows repeated information to be recognized; and in Natural Language Generation, where paraphrase generation allows more varied and fluent texts to be created. The project described in this document aims to verify whether it is possible to learn word-level paraphrases, incrementally and automatically, from a bilingual parallel corpus, using the Never-Ending Machine Learning (NEML) strategy and the Internet as a source of knowledge. NEML is a machine learning strategy based on how humans learn: what was learned previously can be used to learn new, and perhaps more complex, information in the future. Here, NEML is applied together with the paraphrase extraction strategy proposed by Bannard and Callison-Burch (2005), in which paraphrases are extracted from a bilingual parallel corpus using a pivot language.
In this context, NEPaL (Never-Ending Paraphrase Learner) was developed: an NEML (AMSF, in the Portuguese acronym) system responsible for (1) extracting texts from the Internet, (2) aligning the texts using a pivot language, (3) ranking the candidates according to a classification model, and (4) using the acquired knowledge to produce a new classifier model and thereby gain more knowledge, restarting the never-ending learning cycle. | |
dc.language | por | |
dc.publisher | Universidade Federal de São Carlos | |
dc.publisher | UFSCar | |
dc.publisher | Programa de Pós-Graduação em Ciência da Computação - PPGCC | |
dc.publisher | Câmpus São Carlos | |
dc.rights | Open access | |
dc.subject | Paráfrases | |
dc.subject | Reconhecimento automático de paráfrases | |
dc.subject | Aprendizado de máquina sem-fim | |
dc.subject | Processamento de língua natural | |
dc.subject | Português do Brasil | |
dc.subject | Paraphrase lexicon | |
dc.subject | Automatic paraphrase recognition | |
dc.subject | Never-ending machine learning | |
dc.subject | Natural language processing | |
dc.subject | Brazilian Portuguese | |
dc.title | Aprendizado sem-fim de paráfrases | |
dc.type | Thesis | |
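
The pivot-based extraction mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual reading of the pivot idea attributed to Bannard and Callison-Burch (2005): a candidate paraphrase is scored by summing, over all pivot-language translations, the probability of translating the source word into the pivot multiplied by the probability of translating the pivot back into the candidate. The function name, the table names, the example words, and every probability below are hypothetical toy values invented for illustration; this is not NEPaL's actual implementation and it omits the classification and never-ending learning steps.

```python
from collections import defaultdict

# Toy translation tables (all values invented for illustration only).
# P(pivot | word): Portuguese word -> English pivot translations.
P_PIVOT_GIVEN_WORD = {
    "sinal": {"traffic light": 0.6, "sign": 0.4},
}
# P(word | pivot): English pivot -> Portuguese back-translations.
P_WORD_GIVEN_PIVOT = {
    "traffic light": {"semáforo": 0.7, "sinal": 0.3},
    "sign": {"sinal": 0.5, "placa": 0.5},
}


def pivot_paraphrases(p_pivot_given_word, p_word_given_pivot, source):
    """Rank paraphrase candidates for `source` via a pivot language.

    score(candidate) = sum over pivots of
        P(pivot | source) * P(candidate | pivot)
    """
    scores = defaultdict(float)
    for pivot, p_forward in p_pivot_given_word.get(source, {}).items():
        for candidate, p_back in p_word_given_pivot.get(pivot, {}).items():
            if candidate != source:  # a word is not its own paraphrase
                scores[candidate] += p_forward * p_back
    # Highest-scoring candidates first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = pivot_paraphrases(P_PIVOT_GIVEN_WORD, P_WORD_GIVEN_PIVOT, "sinal")
for candidate, score in ranked:
    print(f"{candidate}\t{score:.2f}")
```

In a cycle like the one the abstract describes, a ranked list of this kind would only supply candidates; the abstract states that a classification model then decides which ones are kept.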