dc.creator | Tomas, Jimena Torres | |
dc.creator | Spolaôr, Newton | |
dc.creator | Cherman, Everton Alvares | |
dc.creator | Monard, Maria Carolina | |
dc.date.accessioned | 2014-05-09T19:13:03Z | |
dc.date.accessioned | 2018-07-04T16:47:53Z | |
dc.date.available | 2014-05-09T19:13:03Z | |
dc.date.available | 2018-07-04T16:47:53Z | |
dc.date.created | 2014-05-09T19:13:03Z | |
dc.date.issued | 2014-02-25 | |
dc.identifier | Electronic Notes in Theoretical Computer Science, Amsterdam, v.302, p.155-176, 2014 | |
dc.identifier | http://www.producao.usp.br/handle/BDPI/44788 | |
dc.identifier | 10.1016/j.entcs.2014.01.025 | |
dc.identifier | http://dx.doi.org/10.1016/j.entcs.2014.01.025 | |
dc.identifier.uri | http://repositorioslatinoamericanos.uchile.cl/handle/2250/1640508 | |
dc.description.abstract | A controlled environment based on known properties of the dataset used by a learning algorithm is useful to empirically evaluate machine learning algorithms. Synthetic (artificial) datasets are used for this purpose. Although there are publicly available frameworks to generate synthetic single-label datasets, this is not the case for multi-label datasets, in which each instance is associated with a set of labels that are usually correlated. This work presents Mldatagen, a publicly available multi-label dataset generator framework that we have implemented. Currently, Mldatagen implements two strategies: hypersphere and hypercube. For each label in the multi-label dataset, these strategies randomly generate a geometric shape (hypersphere or hypercube), which is then populated with randomly generated points (instances). Afterwards, each instance is labeled according to the shapes it belongs to, which defines its multi-label. Experiments with a multi-label classification algorithm on six synthetic datasets illustrate the use of Mldatagen. | |
dc.language | eng | |
dc.publisher | Elsevier | |
dc.publisher | Amsterdam | |
dc.relation | Electronic Notes in Theoretical Computer Science | |
dc.rights | Elsevier B.V. | |
dc.rights | openAccess | |
dc.subject | data generator | |
dc.subject | artificial datasets | |
dc.subject | multi-label learning | |
dc.subject | publicly available framework | |
dc.subject | Java | |
dc.subject | PHP | |
dc.title | A framework to generate synthetic multi-label datasets | |
dc.type | Journal articles | |
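
The hypersphere strategy summarized in the abstract can be illustrated with a minimal sketch. This is not Mldatagen's implementation (the record lists Java and PHP as its languages). The function names, the [-1, 1] range for centers, the radius bounds, the round-robin assignment of instances to shapes, and the use of rejection sampling are all assumptions made for illustration.

```python
import random


def make_hyperspheres(n_labels, n_features, r_min, r_max, rng):
    """Draw one random hypersphere (center, radius) per label.

    Assumption: centers are drawn uniformly from [-1, 1]^n_features.
    """
    spheres = []
    for _ in range(n_labels):
        center = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        radius = rng.uniform(r_min, r_max)
        spheres.append((center, radius))
    return spheres


def inside(point, sphere):
    """Return True if the point lies within the hypersphere."""
    center, radius = sphere
    return sum((p - c) ** 2 for p, c in zip(point, center)) <= radius ** 2


def sample_in_sphere(sphere, rng):
    """Rejection sampling: draw from the bounding cube until a point falls inside.

    Simple, but slow when n_features is large.
    """
    center, radius = sphere
    while True:
        point = [c + rng.uniform(-radius, radius) for c in center]
        if inside(point, sphere):
            return point


def generate(n_instances, n_labels, n_features, r_min=0.3, r_max=0.8, seed=0):
    """Generate (features, label_vector) pairs.

    Assumption: instance i is sampled inside shape i % n_labels, then
    labeled with every shape that contains it.
    """
    rng = random.Random(seed)
    spheres = make_hyperspheres(n_labels, n_features, r_min, r_max, rng)
    dataset = []
    for i in range(n_instances):
        x = sample_in_sphere(spheres[i % n_labels], rng)
        y = [1 if inside(x, s) else 0 for s in spheres]
        dataset.append((x, y))
    return spheres, dataset


if __name__ == "__main__":
    _, data = generate(n_instances=5, n_labels=3, n_features=2)
    for features, labels in data:
        print([round(v, 3) for v in features], labels)
```

Because each instance is sampled inside one shape and then tested against all of them, overlapping shapes produce instances with several labels, which is one way label correlation can arise in such a generator.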