Conference proceedings
Combined unsupervised and semi-supervised learning for data classification
Date
2016-11-08
Published in:
IEEE International Workshop on Machine Learning for Signal Processing, MLSP, v. 2016-November.
2161-0371
2161-0363
10.1109/MLSP.2016.7738877
2-s2.0-85002156994
5693860025538327
0000-0002-1123-9784
Author
Universidade Estadual Paulista (Unesp)
Institution
Abstract
Semi-supervised learning methods exploit both labeled and unlabeled data items during training and require only a small subset of labeled items. Although they can drastically reduce the cost of the labeling process, such methods depend directly on the effectiveness of the distance measure used to build the kNN graph. Unsupervised distance learning approaches, on the other hand, aim to capture and exploit the dataset structure in order to compute a more effective distance measure without any labeled data. In this paper, we propose a combined approach that employs both the unsupervised and semi-supervised learning paradigms. An unsupervised distance learning procedure is performed as a pre-processing step to improve the effectiveness of the kNN graph. Based on the improved graph, a semi-supervised learning method is then used for classification. The proposed Combined Unsupervised and Semi-Supervised Learning (CUSSL) approach builds on two recent methods: the Reciprocal kNN Distance is used for unsupervised distance learning, and the semi-supervised classification is performed by Particle Competition and Cooperation (PCC). Experiments conducted on six public datasets demonstrated that the combined approach can achieve effective results, boosting classification accuracy.
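The two-stage pipeline described in the abstract (re-rank distances without labels, build a kNN graph, then classify over the graph) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not give the formulas for the Reciprocal kNN Distance or for PCC, so the sketch substitutes a simple sum of mutual neighbor ranks for the former and plain iterative label propagation with clamping for the latter. All function names (`reciprocal_rank_distance`, `knn_graph`, `propagate_labels`, `cussl_sketch`) and the convention that `-1` marks an unlabeled item are assumptions made for illustration.

```python
import numpy as np


def pairwise_euclidean(X):
    """Dense Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def reciprocal_rank_distance(D):
    """Stand-in for unsupervised distance learning (NOT the paper's formula).

    ranks[i, j] is the position of j in i's neighbor list sorted by D.
    The new distance is the sum of the two mutual ranks, so a pair is
    close only if each item ranks the other highly.
    """
    n = D.shape[0]
    order = np.argsort(D, axis=1, kind="stable")
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(n)[None, :]
    return (ranks + ranks.T).astype(float)


def knn_graph(D, k):
    """Symmetric, unweighted kNN adjacency matrix built from distances D."""
    n = D.shape[0]
    D = D.copy()
    np.fill_diagonal(D, np.inf)  # exclude self-neighbors
    nbrs = np.argsort(D, axis=1, kind="stable")[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(W, W.T)


def propagate_labels(W, labels, n_classes, n_iter=100):
    """Stand-in for PCC: iterative label propagation with clamped labels.

    labels holds a class index for labeled items and -1 for unlabeled ones.
    """
    labeled = labels >= 0
    F = np.zeros((len(labels), n_classes))
    F[labeled, labels[labeled]] = 1.0
    clamp = F.copy()
    deg = W.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0  # avoid division by zero for isolated nodes
    P = W / deg  # row-normalized transition matrix
    for _ in range(n_iter):
        F = P @ F
        F[labeled] = clamp[labeled]  # keep known labels fixed
    return F.argmax(axis=1)


def cussl_sketch(X, labels, n_classes, k):
    """Re-rank distances, build the kNN graph, then classify over it."""
    D = pairwise_euclidean(X)
    D_learned = reciprocal_rank_distance(D)
    W = knn_graph(D_learned, k)
    return propagate_labels(W, labels, n_classes)


if __name__ == "__main__":
    # Two well-separated groups on a line, one labeled item in each.
    X = np.array([[float(v)] for v in list(range(10)) + list(range(100, 110))])
    labels = np.full(20, -1)
    labels[0], labels[10] = 0, 1
    print(cussl_sketch(X, labels, n_classes=2, k=3))
```

The point of the sketch is the ordering of the stages: the graph is built from the re-ranked distances rather than from the raw Euclidean ones, so any improvement from the unsupervised step carries over directly to the classifier that runs on the graph.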