Dissertação de Mestrado
Evolução automática de algoritmos de redes bayesianas de classificação
Fecha
2014-02-26Autor
Alex Guimarães Cardoso de Sá
Institución
Resumen
When faced with a new machine learning problem, selecting which classifier is the best to perform the task at hand is a very hard problem. The reason for this is the nature of the data used by the classifier, which can differ abruptly from one set to another, consequently affecting the classification outcome. In other words, the same classifier can not be adapted to different types of data. Most solutions proposed in the literature are based on meta-learning, and use meta-data about the problem to recommend an effective algorithm to solve the task. This work proposes a new approach to this problem: to build an algorithm tailored to the application problem at hand. More specifically, we propose an evolutionary algorithm (EA) to automatically evolve Bayesian Network Classifiers (BNCs). The method receives as input a list of the main components of BNC algorithms, and uses an EA to encode these components. Given an input dataset (or a group of datasets), the method tests different combinations of components and returns the best BNC algorithm to that specific application domain. For testing, we divided the experiments in three main parts: (i) tests in specific datasets domains; (ii) tests directed to sets of similar datasets; (iii) tests directed to sets of distinct datasets. For the first part, 15 UCI datasets were chosen to evaluate the proposed approach and generate tailored algorithms for these datasets. The other two parts focused on applying the EA on sets of datasets. In this case, 20 datasets with distinct characteristics were selected in order to cluster them and, thus, create different experiment scenarios. Tests were performed on the AE considering the three parts of experiments and results were compared separately with a greedy search method and, then, with three state-of-art BNC algorithms (Naïve Bayes, TAN and K2). Results showed that the generated BNC algorithms are competitive with those of the state-of-art methods, and in most cases the use of an evolutionary algorithm, rather than a simple greedy search, improved statistically the results.