Semiparametric smoothing spline to joint mean and variance models with responses from the biparametric exponential family: a bayesian perspective

Zárate Solano, Héctor Manuel

dc.contributor	Cepeda Cuervo, Edilberto
dc.contributor	Inferencia Bayesiana
dc.creator	Zárate Solano, Héctor Manuel
dc.date.accessioned	2022-02-05T00:31:26Z
dc.date.available	2022-02-05T00:31:26Z
dc.date.created	2022-02-05T00:31:26Z
dc.date.issued	2022-01
dc.identifier	https://repositorio.unal.edu.co/handle/unal/80887
dc.identifier	Universidad Nacional de Colombia
dc.identifier	Repositorio Institucional Universidad Nacional de Colombia
dc.identifier	https://repositorio.unal.edu.co/
dc.description.abstract	Statistical applications must address increasing complexity arising from data generated by recent technologies, new phenomena, and diverse sources of uncertainty. The demand for flexible methods that handle non-standard data structures, high-dimensional real-time estimation, and latent-model frameworks has given semiparametric modeling a crucial role in contemporary statistical analysis. We provide flexible Bayesian methods to jointly infer the mean, variance, and skewness functions when the response variable comes from either a two-parameter exponential family or an asymmetric distribution. To this end, we implement Bayesian algorithms based on MCMC sampling techniques and on deterministic variational Bayesian learning theory. In these settings, each sub-model depends on some covariates parametrically and on others nonparametrically. Understanding how the moments change with the predictors is a goal of statistics, and it is of intrinsic interest given the role of moments in approximating other quantities. We propose several modeling scenarios that benefit from fusing the graphical-models approach with Bayesian semiparametric regression under the architecture of generalized linear models (GLMs). The significance of our strategy lies in its potential to contribute to a unified computational methodology that provides insight into many complex models that would otherwise be analytically intractable. Combining data models and algorithms thus helps solve real-world problems, with the crucial advantage of faster computation, which makes it possible not only to explore many models for the data quickly but also to estimate them accurately.
dc.description.abstract	Statistical applications must address ever greater complexity due to the new data that arise with recent technologies, new phenomena, and diverse sources of uncertainty. The demand for methods suited to non-standard data structures, high-dimensional real-time estimation, and latent models has caused semiparametric models to play a crucial role in recent statistical analysis. This thesis implements flexible Bayesian methods to jointly infer the mean, variance, and skewness functions when the response variable comes from the biparametric exponential family or from asymmetric distributions. The approximation is obtained with methods based on Markov chain Monte Carlo simulation techniques and on deterministic variational learning algorithms. In these scenarios, each submodel includes variables in parametric and nonparametric form to analyze the effect of the predictors on the moments. The modeling scenarios benefit from the fusion of graphical models and Bayesian semiparametric regression, using the architecture of generalized linear models. The importance and implications of our strategy lie in its potential to contribute a unified computational methodology that provides information about a wide variety of complex models that could otherwise be analytically intractable. Therefore, the combination of data models and algorithms contributes to solving real-world problems and enjoys crucial advantages related to low computation time, which allows not only exploring many models for the data quickly but also estimating them accurately. (Text taken from the source.)
dc.language	eng
dc.publisher	Universidad Nacional de Colombia
dc.publisher	Bogotá - Ciencias - Doctorado en Ciencias - Estadística
dc.publisher	Departamento de Estadística
dc.publisher	Facultad de Ciencias
dc.publisher	Bogotá, Colombia
dc.publisher	Universidad Nacional de Colombia - Sede Bogotá
dc.relation	Zárate, H. and Cepeda, E. Semiparametric smoothing spline in joint mean and dispersion models with responses from the biparametric exponential family: A bayesian perspective. Statistics, Optimization Information Computing, 9(2):351–367, 2021.
dc.rights	Attribution 4.0 International
dc.rights	http://creativecommons.org/licenses/by/4.0/
dc.rights	info:eu-repo/semantics/openAccess
dc.title	Semiparametric smoothing spline to joint mean and variance models with responses from the biparametric exponential family: a bayesian perspective
dc.type	Doctoral thesis

This item belongs to the following institution

Universidad Nacional de Colombia