Extended and updated tables for the Friedman rank test [ASCII files]

López-Vázquez, Carlos; Hochsztain, Esther

Software

Date

2017

Record at:

https://doi.org/10.1080/03610926.2017.1408829

http://hdl.handle.net/20.500.11968/3419

https://repositorioslatinoamericanos.uchile.cl/handle/2250/4911512

Author

López-Vázquez, Carlos

Hochsztain, Esther

Institution

Universidad Ort Uruguay

Abstract

These ASCII files accompany the article "Extended and updated tables for the Friedman rank test", published in Communications in Statistics - Theory and Methods (ISSN: 1532-415X). Summary of the article: Many tests and statistics of interest have complex distributions, which are usually approximated by asymptotic distributions "...for large N, small k,..." and so on. In practice there are few rules for judging whether such an approximation is adequate for a particular k and N. Although the problem is general, this work illustrates it with the Friedman test, developed in 1937 to analyze ordinal data. This non-parametric test (with parameters N and k) has two asymptotic approximations: one valid for any k and large N, and a normal one for large k and small N. In our work, each asymptotic approximation was compared exhaustively against the empirical distribution obtained by Monte Carlo simulation, and bounds on the relative error of the classical percentiles were derived over a wide range of k and N. The results, obtained after more than 100 CPU-years of processing, show that the discrepancy easily exceeds 10%. In addition, a review of cases reported in the literature identified examples in which use of the asymptotic distribution led the authors to erroneous conclusions. This big-computing work therefore contributes to both theoretical and applied statistics.

The companion ASCII files accompany the paper "Extended and updated tables for the Friedman rank test", published in Communications in Statistics - Theory and Methods (ISSN: 1532-415X). Friedman's test is used to assess the independence of repeated experiments whose results are ranks, summarized as a table with k columns and N rows of integer entries ranging from 1 to k. In practice, the hypothesis test relies either on published tables of exact critical values for small k and N, or on an asymptotic analytical approximation valid for large N or large k. The quality of the approximation, measured as the relative difference between the true critical values and those arising from the asymptotic approximation, is simply not known. The literature review shows cases where using the approximation could have led to the wrong conclusion, although it may not be the only cause of opposite decisions. Using Monte Carlo simulation, we conclude that published tables do not cover a large enough set of (k, N) values to assure adequate accuracy. Our proposal is to systematically extend the existing tables in k and N, so that the analytical approximation, when used for values outside the tables, has less than a prescribed relative error. For illustration, some of the tables are included in the paper, but the complete set is provided as source code valid for Octave/Matlab/Scilab etc., which can be ported to other programming languages.
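The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code (the companion files target Octave/Matlab/Scilab) and every function name below is hypothetical. It assumes the usual form of the Friedman statistic, Q = 12/(N k (k+1)) * sum_j R_j^2 - 3 N (k+1), with R_j the sum of the ranks in column j, and the standard large-N approximation by a chi-square distribution with k-1 degrees of freedom. To stay within the Python standard library, the chi-square quantile is coded only for k = 3 (2 degrees of freedom), where it has the closed form -2 ln(1 - p). Dividing the difference by the asymptotic value is also an assumption about how the relative difference is normalized.

```python
import math
import random


def friedman_statistic(rank_table):
    """Friedman statistic for a table of N rows, each a permutation of 1..k."""
    n = len(rank_table)
    k = len(rank_table[0])
    col_sums = [sum(row[j] for row in rank_table) for j in range(k)]
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in col_sums) - 3.0 * n * (k + 1)


def simulate_null(k, n, trials, rng):
    """Sorted Monte Carlo sample of the statistic when every row is a random permutation."""
    base = list(range(1, k + 1))
    stats = []
    for _ in range(trials):
        table = []
        for _ in range(n):
            row = base[:]
            rng.shuffle(row)
            table.append(row)
        stats.append(friedman_statistic(table))
    stats.sort()
    return stats


def empirical_quantile(sorted_stats, p):
    """Smallest sample value whose empirical CDF is at least p."""
    idx = math.ceil(p * len(sorted_stats)) - 1
    idx = max(0, min(len(sorted_stats) - 1, idx))
    return sorted_stats[idx]


def chi2_quantile_df2(p):
    """Chi-square quantile for 2 degrees of freedom only (the k = 3 case)."""
    return -2.0 * math.log(1.0 - p)


def relative_difference(empirical, asymptotic):
    """Relative difference with respect to the asymptotic value (assumed convention)."""
    return abs(empirical - asymptotic) / abs(asymptotic)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    k, n = 3, 5
    sample = simulate_null(k, n, 20000, rng)
    emp = empirical_quantile(sample, 0.95)
    asy = chi2_quantile_df2(0.95)
    print(f"k={k}, N={n}: empirical={emp:.3f}, asymptotic={asy:.3f}, "
          f"relative difference={relative_difference(emp, asy):.1%}")
```

Because the statistic takes only a few distinct values when k and N are small, the empirical quantile moves in jumps, so the relative difference reported for a given (k, N) depends on how the percentile is defined for a discrete distribution.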

Subjects

FRIEDMAN TEST

RANKS

MONTE CARLO SIMULATION

ASYMPTOTIC APPROXIMATION

APPLIED STATISTICS

EXACT DISTRIBUTION

APPROXIMATE DISTRIBUTION

ASYMPTOTIC DISTRIBUTION
