BIG DATA AND THE CENTRAL LIMIT THEOREM: A STATISTICAL LEGEND

Allende-Alonso, S.; Bouza-Herrera, C. N.; Rizvi, S. E. H.; Sautto-Vallejo, J. M.

BIG DATA Y EL TEOREMA DEL LÍMITE CENTRAL: UNA LEYENDA ESTADÍSTICA

dc.creator	Allende-Alonso, S.
dc.creator	Bouza-Herrera, C. N
dc.creator	Rizvi, S. E. H
dc.creator	Sautto-Vallejo, J. M
dc.date	2023-04-11
dc.date.accessioned	2023-05-22T20:48:57Z
dc.date.available	2023-05-22T20:48:57Z
dc.identifier	https://revistas.uh.cu/invoperacional/article/view/2694
dc.identifier.uri	https://repositorioslatinoamericanos.uchile.cl/handle/2250/6330267
dc.description	Big Data is now commonplace. Users of statistics rely on a large sample size n to justify statistical methods based on normality. Usual inference methods typically treat the Normal as the limiting distribution of the sample mean for large n. With large enough sample sizes (> 30 or 40), violation of the normality assumption should not cause major problems. This would imply that parametric procedures can be used even when the data are not normally distributed. At least a goodness-of-fit test must be performed to decide whether normality can be accepted. Monte Carlo (MC) techniques are used to select independent random samples from populations of means of three variables of importance in web network management. Different tests are performed to assess whether normality can be accepted. We did not find reliable results even for samples of size 10 000.	en-US
dc.format	application/pdf
dc.language	eng
dc.publisher	Departamento de Matemática Aplicada. Facultad de Matemática y Computación. Universidad de La Habana	en-US
dc.relation	https://revistas.uh.cu/invoperacional/article/view/2694/2349
dc.rights	https://creativecommons.org/licenses/by/4.0	es-ES
dc.source	Investigación Operacional; Vol. 40 No. 1 (2019): SPECIAL ISSUE: CONTRIBUTIONS IN MATHEMATICAL MODELING WITH IMPACT IN MEDICAL AND ENVIRONMENTS /NÚMERO ESPECIAL : CONTRIBUCIONES EN MODELACIÓN MATEMÁTICA CON IMPACTO EN MEDICINA Y MEDIO AMBIENTE	en-US
dc.source	Investigación Operacional; Vol. 40 Núm. 1 (2019): SPECIAL ISSUE: CONTRIBUTIONS IN MATHEMATICAL MODELING WITH IMPACT IN MEDICAL AND ENVIRONMENTS /NÚMERO ESPECIAL : CONTRIBUCIONES EN MODELACIÓN MATEMÁTICA CON IMPACTO EN MEDICINA Y MEDIO AMBIENTE	es-ES
dc.source	2224-5405
dc.subject	grandes masas de datos	es-ES
dc.subject	pruebas de normalidad	es-ES
dc.subject	normalidad asintótica de medias	es-ES
dc.subject	Big-Data	en-US
dc.subject	normality tests	en-US
dc.subject	asymptotic normality of means	en-US
dc.title	BIG DATA AND THE CENTRAL LIMIT THEOREM: A STATISTICAL LEGEND	en-US
dc.title	BIG DATA Y EL TEOREMA DEL LÍMITE CENTRAL: UNA LEYENDA ESTADÍSTICA	es-ES
dc.type	info:eu-repo/semantics/article
dc.type	info:eu-repo/semantics/publishedVersion
dc.type	Articles	en-US
dc.type	Artículo	es-ES
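
The experiment outlined in the description (draw many independent samples of size n, take their means, and test the means for normality) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the lognormal population is a hypothetical stand-in for a skewed web-traffic variable, the Jarque-Bera test is chosen only because it needs nothing beyond the standard library, and the names jarque_bera and sample_means are invented for this sketch.

```python
import math
import random
import statistics


def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value) for a sequence of floats."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments of order 2, 3 and 4 (population form, divisor n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # Under normality JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


def sample_means(draw, sample_size, replications, rng):
    """Monte Carlo: means of `replications` independent samples of `sample_size`."""
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(replications)
    ]


def skewed_population(rng):
    # Assumption: lognormal(0, 1.5) as a hypothetical heavy-tailed population.
    return rng.lognormvariate(0.0, 1.5)


if __name__ == "__main__":
    rng = random.Random(2019)  # fixed seed so the run is repeatable
    for n in (30, 500):
        means = sample_means(skewed_population, n, 500, rng)
        jb, p = jarque_bera(means)
        print(f"n={n}: JB={jb:.2f}, p-value={p:.4g}")
```

A small p-value means the simulated sample means are still distinguishable from a normal distribution at that n. The Jarque-Bera p-value used here is itself an asymptotic approximation, so it is best read as a rough indicator rather than an exact significance level.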

This item belongs to the following institution

Universidad de la Habana (Cuba)