Journal articles
Nonparametric Significance Testing And Group Variable Selection
Published in:
Journal of Multivariate Analysis. Academic Press Inc., v. 133, p. 51-60, 2015.
0047259X
10.1016/j.jmva.2014.08.014
2-s2.0-84907970195
Author
Zambom A.Z.
Akritas M.G.
Institution
Abstract
In the context of a heteroscedastic nonparametric regression model, we develop a test for the null hypothesis that a subset of the predictors has no influence on the regression function. The test uses residuals obtained from local polynomial fitting of the null model and is based on a test statistic inspired by high-dimensional analysis of variance. Using p-values from this test, together with multiple testing ideas, a group variable selection method is proposed that can consistently select even groups of variables with diminishing predictive significance. A backward elimination version of this procedure, called GBEAMS (Group Backward Elimination Anova-type Model Selection), is recommended for practical applications. Simulation studies suggest that the proposed test procedure outperforms the generalized likelihood ratio test when the alternative is non-additive or there is heteroscedasticity. Additional simulation studies reveal that the proposed group variable selection procedure performs competitively against other variable selection methods, and outperforms them in selecting groups having nonlinear or dependent effects. The proposed group variable selection procedure is illustrated on a real data set.
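The abstract describes selection driven by per-group p-values, a multiple testing rule, and backward elimination, but this record does not give the steps of GBEAMS itself. The following is therefore only a minimal sketch of that general pattern, not the authors' procedure. It assumes the Benjamini-Hochberg step-up rule as the multiple testing rule, and it assumes a caller-supplied function, here given the hypothetical name `group_pvalue`, that stands in for the paper's ANOVA-type test. The names `bh_rejections` and `backward_eliminate` are likewise illustrative.

```python
from typing import Callable, List, Sequence


def bh_rejections(pvalues: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up rule: True where the null is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def backward_eliminate(
    groups: List[str],
    group_pvalue: Callable[[str, List[str]], float],
    alpha: float = 0.05,
) -> List[str]:
    """Generic backward elimination on group p-values (assumed scheme).

    group_pvalue(g, active) is a hypothetical stand-in for a test of
    "group g has no effect given the other groups in `active`".
    """
    active = list(groups)
    while active:
        pvals = [group_pvalue(g, active) for g in active]
        rejected = bh_rejections(pvals, alpha)
        if all(rejected):
            break  # every remaining group is declared significant
        # Drop the non-rejected group with the largest p-value, then retest.
        worst = max(
            (i for i in range(len(active)) if not rejected[i]),
            key=lambda i: pvals[i],
        )
        del active[worst]
    return active


if __name__ == "__main__":
    # Toy p-values that ignore the active set; a real test would refit the model.
    table = {"A": 0.001, "B": 0.002, "C": 0.6, "D": 0.3}
    print(backward_eliminate(list(table), lambda g, active: table[g]))
```

In this sketch the p-values are recomputed after every removal, since dropping one group can change the evidence for the others; the toy lookup table in the usage example does not model that dependence.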