doctoralThesis
Uma proposta de automatização do processo de rotulagem de instâncias em algoritmos de aprendizado semissupervisionado [A proposal for automating the instance labeling process in semi-supervised learning algorithms]
Date
2019-11-22
Registered in:
VALE, Karliane Medeiros Ovidio. Uma proposta de automatização do processo de rotulagem de instâncias em algoritmos de aprendizado semissupervisionado. 2019. 117f. Tese (Doutorado em Ciência da Computação) - Centro de Ciências Exatas e da Terra, Universidade Federal do Rio Grande do Norte, Natal, 2019.
Author
Vale, Karliane Medeiros Ovidio
Abstract
Semi-supervised learning is a kind of machine learning that combines supervised and unsupervised learning mechanisms. In this type of learning, most of the training set labels are unknown, while a small part of the data has known labels. Semi-supervised learning is attractive because of its potential to use both labeled and unlabeled data to perform better than supervised learning. This work is a study in the field of semi-supervised learning and implements changes to the self-training and co-training semi-supervised learning algorithms. In the literature, it is common to find research that changes the structure of these algorithms; however, none of it proposes automating the labeling process for unlabeled instances, which is the main purpose of this work. To achieve this goal, three methods are proposed: FlexCon-G, FlexCon, and FlexCon-C. The main difference among these methods is how the confidence rate is calculated and the strategy used to choose a label in each iteration, including strategies based on ensembles. To evaluate the performance of the proposed methods, we carried out an empirical analysis on 30 datasets with diverse characteristics. The results indicate that the three proposed methods perform better than the original self-training and co-training methods in most cases.
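
For readers unfamiliar with self-training, the following is a minimal sketch of the plain loop that the abstract takes as its starting point: a classifier is fit on the labeled data, unlabeled instances predicted with high confidence are given their predicted label and moved into the labeled set, and the process repeats. It does not reproduce FlexCon-G, FlexCon, or FlexCon-C, whose details are not given in this record. The nearest-centroid classifier, the softmax-style confidence score, the fixed `threshold` value, and all function names are assumptions chosen only to keep the example self-contained.

```python
import math


def centroids(X, y):
    """Compute the mean vector of each class from the labeled points."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict_with_confidence(cents, x):
    """Return (label, confidence) for x.

    Confidence is a softmax over negative Euclidean distances to the
    class centroids (an illustrative choice, not taken from the thesis).
    """
    scores = {c: math.exp(-math.dist(x, m)) for c, m in cents.items()}
    total = sum(scores.values())
    label = max(scores, key=scores.get)
    return label, scores[label] / total


def self_training(X_lab, y_lab, X_unl, threshold=0.9, max_iter=10):
    """Plain self-training with a fixed, hand-set confidence threshold."""
    X_lab, y_lab, X_unl = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(max_iter):
        if not X_unl:
            break
        cents = centroids(X_lab, y_lab)  # refit on the current labeled set
        remaining, added = [], 0
        for x in X_unl:
            label, conf = predict_with_confidence(cents, x)
            if conf >= threshold:
                # Confident prediction: adopt it as the instance's label.
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(x)
        X_unl = remaining
        if added == 0:  # nothing crossed the threshold; stop early
            break
    return X_lab, y_lab, X_unl
```

In this sketch the threshold is chosen by hand and stays constant across iterations, so instances that never reach it are simply left unlabeled.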