Thesis
Hardware architecture for frequent itemset mining in static datasets using a segmentation strategy
Author
MAURO MARTIN LETRAS LUNA
Institution
Abstract
In recent years, the volume of information generated in many different domains has
grown significantly, and the size of the resulting datasets overwhelms the human
capacity to process them and extract valuable information. For this reason, Data
Mining has emerged as a set of techniques and algorithms dedicated to finding
patterns in datasets; these patterns are then used to classify or predict the
behavior of phenomena related to the data. Association Rule Mining is an important
branch of Data Mining that consists of finding relationships among the data in the
form of implication rules. The problem is usually decomposed into two subproblems.
The first is to find the itemsets whose number of occurrences in the database
exceeds a predefined threshold; these itemsets are called frequent itemsets. The
second is to generate association rules from those frequent itemsets.
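As a minimal illustration of the first subproblem, the sketch below counts every itemset in a toy database by brute force and keeps those that reach a threshold. The function name `frequent_itemsets`, the parameter `min_support`, and the sample transactions are hypothetical and are not taken from the thesis. The sketch also assumes the common convention that an itemset is frequent when its count is greater than or equal to the threshold.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets occurring in >= min_support transactions.

    Brute force: every subset of every transaction is enumerated, which is
    exponential in transaction length and only suitable for tiny examples.
    """
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] = counts.get(subset, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical toy database (not from the thesis).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_itemsets(transactions, min_support=2)
```

With this toy data, each single item and each pair reaches the threshold, while the triple appears in only one transaction and is discarded.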
This research explores Frequent Itemset Mining because, given the exhaustive nature
of the problem, the sheer volume of data can make it difficult to obtain a response
within the time the application requires. Many algorithms search for frequent
itemsets; the most widely used are Apriori, FP-Growth, and Eclat. They traverse the
search space with strategies such as breadth-first search and depth-first search.
All of them must scan the dataset, and some, like Apriori, must access it many
times. FP-Growth reads the dataset only twice, but it must keep large amounts of
data in memory. Frequent Itemset Mining is an exhaustive task since the database
must be read many times regardless of where the data is stored (main memory or
hard disk). Two ways to accelerate Frequent Itemset Mining have been reported in
the literature: the first improves existing software algorithms by proposing new
heuristics that save time, and the second develops hardware architectures
dedicated to this task.
The main goal of this research is to design a hardware architecture that accelerates
the Frequent Itemset Mining process. A segmentation strategy based on equivalence
classes is proposed to guarantee that all frequent itemsets are found regardless
of the available hardware resources. The proposed architecture will be implemented
on an FPGA to validate it and to compare it with software-only implementations.
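The abstract does not say how the equivalence classes are defined. The following software sketch assumes prefix-based classes over a vertical (item to transaction-ID set) layout, in the style of Eclat. It is an illustration of the general idea only, not the architecture proposed in the thesis. All names (`vertical`, `mine_class`, `mine_by_classes`) and the toy data are hypothetical. Each top-level class is mined as an independent segment, which is what allows the work to be split into pieces that fit limited resources without missing any frequent itemset.

```python
def vertical(transactions):
    """Build the vertical layout: item -> set of transaction IDs containing it."""
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def mine_class(prefix, members, min_support, out):
    """Mine one equivalence class: all frequent itemsets that extend `prefix`.

    `members` is a list of (item, tidset) pairs; each tidset already holds the
    transactions containing prefix + (item,), and each is frequent.
    """
    for i, (item, tids) in enumerate(members):
        itemset = prefix + (item,)
        out[itemset] = len(tids)
        # Build the next class by intersecting with the remaining members.
        next_members = []
        for other, other_tids in members[i + 1:]:
            common = tids & other_tids
            if len(common) >= min_support:
                next_members.append((other, common))
        if next_members:
            mine_class(itemset, next_members, min_support, out)


def mine_by_classes(transactions, min_support):
    """Mine frequent itemsets one top-level equivalence class (segment) at a time."""
    tidsets = vertical(transactions)
    frequent = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    result = {}
    for i, (item, tids) in enumerate(frequent):
        # Segment: every frequent itemset whose first item is `item`.
        segment = {(item,): len(tids)}
        members = []
        for other, other_tids in frequent[i + 1:]:
            common = tids & other_tids
            if len(common) >= min_support:
                members.append((other, common))
        mine_class((item,), members, min_support, segment)
        result.update(segment)  # segments are disjoint, so merging is a plain union
    return result


# Hypothetical toy database (not from the thesis).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = mine_by_classes(transactions, min_support=2)
```

Because each segment needs only the transaction-ID sets of its own prefix item and of the items that follow it, the segments could in principle be processed one after another or in parallel.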