Tesis
Obtenção de padrões sequenciais em data streams atendendo requisitos do Big Data
Fecha
2016-06-06Registro en:
Autor
Carvalho, Danilo Codeco
Institución
Resumen
The growing amount of data produced daily, by both businesses and individuals in the web, increased the demand for analysis and extraction of knowledge of this data. While the last two decades the solution was to store and perform data mining algorithms, currently it has become unviable even to supercomputers. In addition, the requirements of the Big Data age go far beyond the large amount of data to analyze. Response time requirements and complexity of the data acquire more weight in many areas in the real world. New models have been researched and developed, often proposing distributed computing or different ways to handle the data stream mining. Current researches shows that an alternative in the data stream mining is to join a real-time event handling mechanism with a classic mining association rules or sequential patterns algorithms. In this work is shown a data stream mining approach to meet the Big Data response time requirement, linking the event handling mechanism in real time Esper and Incremental Miner of Stretchy Time Sequences (IncMSTS) algorithm. The results show that is possible to take a static data mining algorithm for data stream environment and keep tendency
in the patterns, although not possible to continuously read all data coming into the data stream.