dc.creatorTomas, Jimena Torres
dc.creatorSpolaôr, Newton
dc.creatorCherman, Everton Alvares
dc.creatorMonard, Maria Carolina
dc.date.accessioned2014-05-09T19:13:03Z
dc.date.accessioned2018-07-04T16:47:53Z
dc.date.available2014-05-09T19:13:03Z
dc.date.available2018-07-04T16:47:53Z
dc.date.created2014-05-09T19:13:03Z
dc.date.issued2014-02-25
dc.identifierElectronic Notes in Theoretical Computer Science, Amsterdam, v.302, p.155-176, 2014
dc.identifierhttp://www.producao.usp.br/handle/BDPI/44788
dc.identifier10.1016/j.entcs.2014.01.025
dc.identifierhttp://dx.doi.org/10.1016/j.entcs.2014.01.025
dc.identifier.urihttp://repositorioslatinoamericanos.uchile.cl/handle/2250/1640508
dc.description.abstractA controlled environment, in which the properties of the dataset given to a learning algorithm are known, is useful for empirically evaluating machine learning algorithms. Synthetic (artificial) datasets are used for this purpose. Although there are publicly available frameworks to generate synthetic single-label datasets, this is not the case for multi-label datasets, in which each instance is associated with a set of labels that are usually correlated. This work presents Mldatagen, a multi-label dataset generator framework we have implemented, which is publicly available to the community. Currently, two strategies have been implemented in Mldatagen: hypersphere and hypercube. For each label in the multi-label dataset, these strategies randomly generate a geometric shape (hypersphere or hypercube), which is populated with randomly generated points (instances). Afterwards, each instance is labeled according to the shapes it belongs to, which defines its multi-label. Experiments with a multi-label classification algorithm on six synthetic datasets illustrate the use of Mldatagen.
dc.languageeng
dc.publisherElsevier
dc.publisherAmsterdam
dc.relationElectronic Notes in Theoretical Computer Science
dc.rightsElsevier B.V.
dc.rightsopenAccess
dc.subjectdata generator
dc.subjectartificial datasets
dc.subjectmulti-label learning
dc.subjectpublicly available framework
dc.subjectJava
dc.subjectPHP
dc.titleA framework to generate synthetic multi-label datasets
dc.typeJournal articles
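
The abstract in this record describes the hypersphere strategy only in outline: one random shape per label, random points inside the shapes, and a multi-label derived from shape membership. The following is a minimal Python sketch of that outline, not Mldatagen's actual implementation (the subject keywords suggest the framework itself uses Java and PHP). The function names, the [-1, 1] coordinate range, the radius range, and the round-robin assignment of points to shapes are all assumptions made for illustration.

```python
import math
import random


def random_point_in_sphere(center, radius, rng):
    """Draw a point uniformly from the ball with the given center and radius."""
    dim = len(center)
    # A normalised Gaussian vector gives a uniformly distributed direction.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in direction))
    # Scaling the radius by u**(1/dim) makes the point uniform in volume.
    scale = radius * rng.random() ** (1.0 / dim) / norm
    return [c + scale * x for c, x in zip(center, direction)]


def generate_hypersphere_dataset(n_labels, n_instances, dim, seed=0):
    """Hypothetical helper: returns (spheres, dataset) for the sketch."""
    rng = random.Random(seed)

    # One hypersphere per label. Center and radius ranges are assumptions.
    spheres = []
    for _ in range(n_labels):
        center = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        radius = rng.uniform(0.5, 1.0)
        spheres.append((center, radius))

    dataset = []
    for i in range(n_instances):
        # Populate the shapes in round-robin order (an assumption).
        center, radius = spheres[i % n_labels]
        point = random_point_in_sphere(center, radius, rng)
        # The multi-label is the set of all shapes that contain the point.
        labels = {
            k for k, (c, r) in enumerate(spheres) if math.dist(point, c) <= r
        }
        dataset.append((point, labels))
    return spheres, dataset
```

Calling `generate_hypersphere_dataset(n_labels=3, n_instances=30, dim=4)` returns the generated spheres and a list of `(point, labels)` pairs. Because the spheres may overlap, a point drawn inside one sphere can also fall inside others, which is what produces correlated labels in this sketch.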

