dc.contributorLUIS ENRIQUE SUCAR SUCCAR
dc.contributorEDUARDO FRANCISCO MORALES MANZANARES
dc.creatorMALLINALI RAMIREZ CORONA
dc.date2014-10
dc.date.accessioned2023-07-25T16:21:05Z
dc.date.available2023-07-25T16:21:05Z
dc.identifierhttp://inaoe.repositorioinstitucional.mx/jspui/handle/1009/188
dc.identifier.urihttps://repositorioslatinoamericanos.uchile.cl/handle/2250/7805410
dc.descriptionThe core of supervised classification consists in assigning to an object or phenomenon one of a previously specified set of categories or classes. In more complex problems, a set of labels, rather than a single label, is assigned to each instance; this is called multi-label classification. When the labels in a multi-label classification problem are ordered in a predefined structure, typically a tree or a Directed Acyclic Graph (DAG), the task is called Hierarchical Multi-label Classification (HMC). Some HMC methods create a global model that takes advantage of the relations (the predefined structure) among the labels. However, these methods tend to create models that are too complex to be usable for large-scale data. Other methods divide the problem into a set of subproblems, an approach that usually does not benefit from the predefined structure. This thesis addresses the problem of hierarchical classification for tree and DAG structures, considering large datasets with a considerable number of labels. A local classifier is trained for each non-leaf node in the hierarchy (the local classifier per parent node approach). Our method exploits the correlation of each label with its ancestors in the hierarchy and evaluates each possible path from the root to a leaf node, taking into account the level of the predicted labels to give a score to each path, and finally returns the path with the best score. In some cases there are instances whose labels do not reach a leaf node; for these cases we developed an extension of the base method for Non Mandatory Leaf Node Prediction (NMLNP), in which a pruning phase is performed before selecting the best path. We noticed that many evaluation measures score short paths that predict only the most general cases better than longer, more specific paths; for that reason we also propose a new evaluation measure that avoids the bias toward conservative predictions in the case of NMLNP.
We tested our methods on 18 datasets with tree- and DAG-structured hierarchies against a number of state-of-the-art methods. The evaluation shows the advantages of our methods in terms of predictive performance, execution time, and scalability compared with other methods for hierarchical classification. Our methods obtained superior results on deep hierarchies and competitive results on shallower ones.
dc.formatapplication/pdf
dc.languageeng
dc.publisherInstituto Nacional de Astrofísica, Óptica y Electrónica
dc.relationcitation:Ramirez-Corona M.
dc.rightsinfo:eu-repo/semantics/openAccess
dc.rightshttp://creativecommons.org/licenses/by-nc-nd/4.0
dc.subjectinfo:eu-repo/classification/Cadenas de clasificación/Classification chains
dc.subjectinfo:eu-repo/classification/Ontologías/Ontologies
dc.subjectinfo:eu-repo/classification/Proteínas/Proteins
dc.subjectinfo:eu-repo/classification/cti/1
dc.subjectinfo:eu-repo/classification/cti/12
dc.subjectinfo:eu-repo/classification/cti/1203
dc.titleHierarchical multi-label classification for tree and DAG hierarchies
dc.typeinfo:eu-repo/semantics/masterThesis
dc.typeinfo:eu-repo/semantics/acceptedVersion
dc.audiencestudents
dc.audienceresearchers
dc.audiencegeneralPublic

