Tese de Doutorado
LORC: classificação supervisionada baseada em grafos esparsos, robusta para dados com ruído no rótulo
Fecha
2015-06-26Autor
Leticia Cavalari Pinheiro
Institución
Resumen
This thesis presents the development of a new supervised classification method based in sparse graphs. The basic idea is to learn from data instances to build a minimum spanning tree (MST), based on the distances between attributes. Based on a dissimilarity measure calculated from the labels, we obtain a graph partition by pruning the MST edges. This partition defines the classification regions that seek to balance major intra-region homogeneity and great inter-region heterogeneity, providing good results for posterior classifications of instances with unknown labels. A great advancement presented by the developed methodology is the potential classification improvement when the training datasets have label noise. This type of noise is common and impairs the performance of most classification methods. This thesis includes a study about supervised classification and label noise data, the development of a new classification methodology with 4 possible variations making possible to adapt to diferent datasets, the proof of its efficiency under some assumptions, and the quality verification based on comparisions with other popular methods. The results are promising.