Brazil | Journal articles
dc.creator: HRUSCHKA, Eduardo R.
dc.creator: GARCIA, Antonio J. T.
dc.creator: HRUSCHKA JR., Estevam R.
dc.creator: EBECKEN, Nelson F. F.
dc.date.accessioned: 2012-10-20T03:30:59Z
dc.date.accessioned: 2018-07-04T15:38:01Z
dc.date.available: 2012-10-20T03:30:59Z
dc.date.available: 2018-07-04T15:38:01Z
dc.date.created: 2012-10-20T03:30:59Z
dc.date.issued: 2009
dc.identifier: JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, v.21, n.1, p.43-58, 2009
dc.identifier: 0952-813X
dc.identifier: http://producao.usp.br/handle/BDPI/28784
dc.identifier: 10.1080/09528130802246602
dc.identifier: http://dx.doi.org/10.1080/09528130802246602
dc.identifier.uri: http://repositorioslatinoamericanos.uchile.cl/handle/2250/1625426
dc.description.abstract: The substitution of missing values, also called imputation, is an important data preparation task for many domains. Ideally, the substitution of missing values should not insert biases into the dataset. This aspect has been usually assessed by some measures of the prediction capability of imputation methods. Such measures assume the simulation of missing entries for some attributes whose values are actually known. These artificially missing values are imputed and then compared with the original values. Although this evaluation is useful, it does not allow the influence of imputed values in the ultimate modelling task (e.g. in classification) to be inferred. We argue that imputation cannot be properly evaluated apart from the modelling task. Thus, alternative approaches are needed. This article elaborates on the influence of imputed values in classification. In particular, a practical procedure for estimating the inserted bias is described. As an additional contribution, we have used such a procedure to empirically illustrate the performance of three imputation methods (majority, naive Bayes and Bayesian networks) in three datasets. Three classifiers (decision tree, naive Bayes and nearest neighbours) have been used as modelling tools in our experiments. The achieved results illustrate a variety of situations that can take place in the data preparation practice.
dc.language: eng
dc.publisher: TAYLOR & FRANCIS LTD
dc.relation: Journal of Experimental & Theoretical Artificial Intelligence
dc.rights: Copyright TAYLOR & FRANCIS LTD
dc.rights: restrictedAccess
dc.subject: missing values
dc.subject: classification
dc.subject: imputation
dc.subject: Bayesian methods
dc.title: On the influence of imputation in classification: practical issues
dc.type: Journal articles

