dc.contributorGisele Lobo Pappa
dc.contributorLuiz Henrique de Campos Merschmann
dc.contributorOmar Paranaiba Vilela Neto
dc.creatorJuliana Oliveira Ferreira
dc.date.accessioned2019-08-12T11:59:03Z
dc.date.accessioned2022-10-03T23:08:20Z
dc.date.available2019-08-12T11:59:03Z
dc.date.available2022-10-03T23:08:20Z
dc.date.created2019-08-12T11:59:03Z
dc.date.issued2013-07-01
dc.identifierhttp://hdl.handle.net/1843/ESBF-9GMPLV
dc.identifier.urihttp://repositorioslatinoamericanos.uchile.cl/handle/2250/3817388
dc.description.abstractThe growing volume of information posted by users on social networks, together with other resources provided by Web 2.0, has demanded a paradigm shift in the way data-driven systems work. A small set of well-behaved data instances has been replaced by a continuous, non-stationary data stream. Hence, traditional mining algorithms used to extract patterns from data have had to be adapted to deal with this new reality. Given the nature of data streams, algorithms must deal with at least three challenges: (i) Which data should be kept and which should be discarded during the learning process? (ii) When should the classification model be updated? (iii) How should it be updated? In this direction, this work proposes an evolutionary algorithm (EA) for learning in data streams that is able to explore the evolution of classifiers together with the evolution of the data. One of the main reasons for using an EA is that it maintains a population of candidate solutions to the problem, which tends to evolve over time through selection of the fittest individuals and the crossover and mutation operators. This feature can be exploited so that, over time, both the models and the data evolve simultaneously. The proposed algorithm works with a dynamic vocabulary and tackles the three aforementioned challenges. It uses a method based on a data repository, which stores a predefined set of instances. The Page-Hinkley (PH) statistical test is used to detect changes in the performance of the classifiers, signaling when the model should be retrained. The model is updated using the evolutionary operators of the EA. The method was tested on four datasets of short texts with extensive vocabularies collected from Twitter, each corresponding to a real-life event. The results were compared with two state-of-the-art algorithms from the literature and were equal to or better than those obtained by these algorithms.
dc.publisherUniversidade Federal de Minas Gerais
dc.publisherUFMG
dc.rightsOpen Access
dc.subjectEvolutionary Algorithms
dc.subjectClassifier
dc.subjectContinuous Data Streams
dc.subjectData Mining
dc.subjectGenetic Algorithms
dc.titleUm algoritmo evolucionário para mineração de fluxo de dados em microblogs (An evolutionary algorithm for data stream mining in microblogs)
dc.typeMaster's Dissertation
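
The abstract states that the Page-Hinkley (PH) test signals when the classifier should be retrained. As an illustration only, the following is a minimal sketch of the textbook PH test for detecting an increase in the mean of a monitored value. It is not the dissertation's implementation: the choice of the classifier error rate as the monitored signal, the class and function names, and the parameter values (`delta`, `threshold`, `min_samples`) are all assumptions made for this example.

```python
class PageHinkley:
    """Textbook Page-Hinkley test for an upward shift in the mean.

    All defaults below are illustrative assumptions, not values
    taken from the dissertation.
    """

    def __init__(self, delta=0.005, threshold=5.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm level (often written lambda)
        self.min_samples = min_samples  # warm-up before alarms are allowed
        self.n = 0                      # number of observations seen
        self.mean = 0.0                 # running mean of the observations
        self.cum = 0.0                  # cumulative deviation m_t
        self.min_cum = 0.0              # smallest m_t seen so far (M_t)

    def update(self, x):
        """Feed one observation; return True if a change is signaled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        ph = self.cum - self.min_cum
        return self.n >= self.min_samples and ph > self.threshold


def first_drift(stream, **kwargs):
    """Return the index of the first alarm in `stream`, or None."""
    detector = PageHinkley(**kwargs)
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


if __name__ == "__main__":
    # Hypothetical error-rate stream: stable at 0.1, then jumps to 0.9.
    errors = [0.1] * 200 + [0.9] * 200
    print("first alarm at index:", first_drift(errors))
```

In a stream-mining setting like the one the abstract describes, an alarm from such a detector would be the trigger for retraining the model. How the dissertation wires the test to the evolutionary algorithm is not specified in this record.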

