dc.contributorFlores Muñoz, Pablo Javier
dc.contributorPazmiño Maji, Rubén Antonio
dc.creatorVinueza Chalco, Jamilton Daniel
dc.creatorMasaquiza Aragón, Galo Alexander
dc.date.accessioned2022-01-20T17:21:23Z
dc.date.accessioned2022-10-20T18:59:45Z
dc.date.available2022-01-20T17:21:23Z
dc.date.available2022-10-20T18:59:45Z
dc.date.created2022-01-20T17:21:23Z
dc.date.issued2021-08-23
dc.identifierVinueza Chalco, Jamilton Daniel; Masaquiza Aragón, Galo Alexander. (2021). Medición de la efectividad de técnicas de imputación para datos faltantes. Escuela Superior Politécnica de Chimborazo. Riobamba.
dc.identifierhttp://dspace.espoch.edu.ec/handle/123456789/14828
dc.identifier.urihttps://repositorioslatinoamericanos.uchile.cl/handle/2250/4583245
dc.description.abstractThe objective of this research was to measure the effectiveness, in terms of precision and quality of estimation, of different imputation techniques for missing data drawn from a normal distribution. Using the Monte Carlo method, a bivariate matrix made up of observed and missing data was created, in which the missing values were generated by a predefined model. Representative samples of size 5, 10, 30 and 100 were simulated 100,000 times with different percentages of information loss under three scenarios: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR). Imputation by deletion (elimination), mean, median and linear regression was applied. The fit of the imputed data was assessed with a precision measure, and the mean and variance estimators were used to verify whether the imputed data retain the estimation properties of unbiasedness and minimum variance. Using RStudio, it was determined that linear regression is the most accurate technique for samples of size 30 and above, while mean and median imputation yield values closer to the real data in small samples such as size 5. For the unbiasedness of the mean, the best technique is imputation by linear regression, since the property is maintained for samples of size 30 and above. For the unbiasedness of the variance, deletion is the most viable technique under MAR and MCAR for samples of size 30 and 100, and under MNAR for samples of any size. For the minimum variance of the mean and variance estimators, the technique that yielded the lowest variance in most contexts is linear regression. It is recommended to extend the study with multiple imputation and machine learning techniques in search of better results.
dc.languagespa
dc.publisherEscuela Superior Politécnica de Chimborazo
dc.relationUDCTFC;226T0093
dc.rightshttps://creativecommons.org/licenses/by-nc-sa/3.0/ec/
dc.rightsinfo:eu-repo/semantics/openAccess
dc.subjectEXACT AND NATURAL SCIENCES
dc.subjectSTATISTICS
dc.subjectMONTE CARLO METHOD
dc.subjectDATA IMPUTATION
dc.subjectFIT PRECISION
dc.subjectESTIMATOR PROPERTIES
dc.titleMedición de la efectividad de técnicas de imputación para datos faltantes
dc.typeThesis
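
The simulation design summarized in the abstract (bivariate normal data, three missingness mechanisms, four imputation techniques, and a check of the bias of the mean estimator) can be illustrated with a minimal Python sketch. This is not the authors' RStudio code: the linear model `y = 1 + 2x + noise`, the rule of deleting the largest values under MAR and MNAR, and every function name below are assumptions made purely for illustration.

```python
import random
import statistics

TRUE_MEAN_Y = 1.0  # assumed model: x ~ N(0, 1), y = 1 + 2x + N(0, 1) noise


def simulate_pair_sample(n, rng):
    """Draw n (x, y) pairs from the assumed bivariate normal model."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def missing_indices(xs, ys, rate, mechanism, rng):
    """Pick which y values go missing under an illustrative mechanism."""
    n = len(ys)
    k = max(1, min(n - 2, round(rate * n)))  # keep at least two observed y
    if mechanism == "MCAR":  # missingness ignores the data entirely
        return set(rng.sample(range(n), k))
    if mechanism == "MAR":  # missingness depends on the observed x
        key = xs
    elif mechanism == "MNAR":  # missingness depends on the unobserved y
        key = ys
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    order = sorted(range(n), key=lambda i: key[i], reverse=True)
    return set(order[:k])


def impute(xs, ys, missing, technique):
    """Return the y sample after deletion or single imputation."""
    observed = [i for i in range(len(ys)) if i not in missing]
    obs_y = [ys[i] for i in observed]
    if technique == "deletion":
        return obs_y
    if technique == "mean":
        fill = {i: statistics.fmean(obs_y) for i in missing}
    elif technique == "median":
        fill = {i: statistics.median(obs_y) for i in missing}
    elif technique == "regression":
        # Ordinary least squares of y on x, fitted on complete pairs only.
        obs_x = [xs[i] for i in observed]
        mx, my = statistics.fmean(obs_x), statistics.fmean(obs_y)
        sxx = sum((x - mx) ** 2 for x in obs_x)
        sxy = sum((x - mx) * (y - my) for x, y in zip(obs_x, obs_y))
        slope = sxy / sxx
        fill = {i: my + slope * (xs[i] - mx) for i in missing}
    else:
        raise ValueError(f"unknown technique: {technique}")
    return [fill.get(i, ys[i]) for i in range(len(ys))]


def bias_of_mean(n, rate, mechanism, technique, reps, seed=0):
    """Monte Carlo estimate of the bias of the sample mean of y."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs, ys = simulate_pair_sample(n, rng)
        missing = missing_indices(xs, ys, rate, mechanism, rng)
        total += statistics.fmean(impute(xs, ys, missing, technique))
    return total / reps - TRUE_MEAN_Y


if __name__ == "__main__":
    for technique in ("deletion", "mean", "median", "regression"):
        bias = bias_of_mean(30, 0.3, "MAR", technique, reps=2000)
        print(f"{technique:>10}: estimated bias of the mean = {bias:+.3f}")
```

Running the script prints one estimated bias per technique for a single scenario (sample size 30, 30% loss, MAR). The thesis itself covers more sample sizes, loss percentages and mechanisms with 100,000 replications, and also evaluates precision and the variance estimator, none of which this sketch attempts to reproduce.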

