dc.contributorCerri, Ricardo
dc.contributorhttp://lattes.cnpq.br/6266519868438512
dc.contributorVelázquez, Isaac Triguero
dc.contributorhttp://lattes.cnpq.br/7631837051398731
dc.creatorAlcantara, Leonardo Utida
dc.date.accessioned2022-04-21T13:37:40Z
dc.date.accessioned2022-10-10T21:39:37Z
dc.date.available2022-04-21T13:37:40Z
dc.date.available2022-10-10T21:39:37Z
dc.date.created2022-04-21T13:37:40Z
dc.date.issued2021-11-19
dc.identifierALCANTARA, Leonardo Utida. Árvore de predição semi-supervisionada para predição de localização subcelular de proteínas. 2021. Trabalho de Conclusão de Curso (Graduação em Engenharia de Computação) – Universidade Federal de São Carlos, São Carlos, 2021. Disponível em: https://repositorio.ufscar.br/handle/ufscar/15893.
dc.identifierhttps://repositorio.ufscar.br/handle/ufscar/15893
dc.identifier.urihttp://repositorioslatinoamericanos.uchile.cl/handle/2250/4045941
dc.description.abstractProtein subcellular localization is an important classification task because the location of a protein inside a cell is directly related to that protein's function. Because many proteins reside in two or more cellular locations at the same time, or move between locations, the problem is usually addressed with supervised multi-label classification methods. This approach is well established in the literature, but it has some disadvantages: (i) it needs a large number of labeled instances to train the classifier; (ii) it ignores the fact that unlabeled instances can provide valuable information for the classification; and (iii) in many areas unlabeled data is abundant, while manually labeling an instance is expensive and time-consuming. Semi-Supervised Learning (SSL) is a subfield of machine learning in which the learner tries to exploit both labeled and unlabeled data at the same time. Semi-Supervised Classification is a subcategory of SSL that uses the available unlabeled data to improve the performance of a classification process that already uses labeled data. The main goal of this project was to develop a semi-supervised multi-label classifier able to use the large number of unlabeled proteins to improve the prediction of protein subcellular localization. The SSL algorithm developed in this work is based on the predictive clustering tree framework. It was built, tested and analysed in many SSL scenarios to determine whether the classifier could use the unlabeled instances to help the classification process on a set of multi-label protein subcellular localization datasets from three taxonomic groups: Viridiplantae, Virus and Fungi.
dc.languagepor
dc.publisherUniversidade Federal de São Carlos
dc.publisherUFSCar
dc.publisherCâmpus São Carlos
dc.publisherEngenharia de Computação - EC
dc.rightshttp://creativecommons.org/licenses/by-nc-nd/3.0/br/
dc.rightsAttribution-NonCommercial-NoDerivs 3.0 Brazil
dc.subjectAprendizado de máquina
dc.subjectClassificação
dc.subjectBioinformática
dc.subjectAprendizado de máquina multirrótulo
dc.subjectAprendizado de máquina semi-supervisionado
dc.subjectMachine learning
dc.subjectClassification
dc.subjectBioinformatics
dc.subjectMulti-label machine learning
dc.subjectSemi-supervised machine learning
dc.titleÁrvore de predição semi-supervisionada para predição de localização subcelular de proteínas
dc.typeOtros

