dc.contributorGonzalez Osorio, Fabio Augusto
dc.contributorMindLab
dc.creatorCastellanos Martinez, Ivan Yesid
dc.date.accessioned2021-11-18T04:30:09Z
dc.date.available2021-11-18T04:30:09Z
dc.date.created2021-11-18T04:30:09Z
dc.date.issued2021
dc.identifierhttps://repositorio.unal.edu.co/handle/unal/80695
dc.identifierUniversidad Nacional de Colombia
dc.identifierRepositorio Institucional Universidad Nacional de Colombia
dc.identifierhttps://repositorio.unal.edu.co/
dc.description.abstractKernel methods are a group of machine learning algorithms that use a kernel function to implicitly represent data in a high-dimensional space, where linear optimization systems lead to non-linear relationships in the original data space and thus uncover complex patterns within the data. The main disadvantage of these methods is their poor scalability, since many kernel-based algorithms require computing a matrix of quadratic order with respect to the number of examples in the data. This limitation has caused kernel methods to be avoided in large-scale data settings in favor of techniques such as deep learning. However, kernel methods are still relevant for better understanding deep learning methods and can also improve them through hybrid strategies that combine the best of both worlds. The main objective of this thesis is to explore efficient ways of using kernel methods without a large loss in accuracy. To this end, different approaches are presented and formulated, among which we propose the learning-on-a-budget strategy, presented in detail from a theoretical perspective, including a novel procedure for budget selection. In the experimental evaluation, this strategy shows competitive performance and improvements over the standard learning-on-a-budget method, especially when smaller approximations are selected, which are the most useful in large-scale environments. (Text taken from the source)
dc.description.abstractKernel methods are a set of machine learning algorithms that use a kernel function to represent data in an implicit high-dimensional space, where linear optimization systems lead to non-linear relationships in the original data space and therefore allow complex patterns to be found in the data. The main disadvantage of these methods is their poor scalability, as most kernel-based algorithms need to compute a matrix of quadratic order with respect to the number of data samples. This limitation has caused kernel methods to be avoided for large-scale datasets in favor of approaches such as deep learning. However, kernel methods are still relevant for better understanding deep learning methods and can improve them through hybrid settings that combine the best of both worlds. The main goal of this thesis is to explore efficient ways to use kernel methods without a significant loss in accuracy. To this end, different approaches are presented and formulated, among which we propose the learning-on-a-budget strategy, presented in detail from a theoretical perspective, including a novel procedure for budget selection. In the experimental evaluation, this strategy shows competitive performance and improvements over the standard learning-on-a-budget method, especially when selecting smaller approximations, which are the most useful in large-scale environments.
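The abstract above describes kernel matrices that grow quadratically with the number of samples and a learning-on-a-budget strategy that works with a small set of points instead. Purely as an illustration of that idea, and not the budget selection procedure proposed in the thesis, the minimal sketch below builds a Nyström-style approximation from a uniformly sampled budget of landmark points; NumPy, the RBF kernel, uniform sampling, and the names rbf_kernel and nystrom_features are all assumptions of this example.

# Minimal sketch (not the thesis method): Nyström-style budgeted kernel
# approximation, assuming an RBF kernel and uniform landmark selection.
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    # Pairwise squared distances, then the Gaussian (RBF) kernel.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def nystrom_features(X, budget=100, gamma=0.1, seed=0):
    # Pick a budget of landmark points (here: uniformly at random).
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(budget, len(X)), replace=False)
    C = rbf_kernel(X, X[idx], gamma)   # n x m cross-kernel
    W = C[idx]                         # m x m landmark kernel
    # Map each point to features whose inner products approximate K:
    # K ~= C W^+ C^T = Z Z^T with Z = C U S^(-1/2).
    U, s, _ = np.linalg.svd(W)
    return C @ U / np.sqrt(np.maximum(s, 1e-12))

# Usage (hypothetical data): Z = nystrom_features(np.random.randn(10000, 20), budget=200)

With a budget of m landmarks, building these features costs on the order of n*m^2 + m^3 operations rather than the n^2 needed for the full kernel matrix, which is what makes budget-based approximations attractive in large-scale settings.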
dc.languageeng
dc.publisherUniversidad Nacional de Colombia
dc.publisherBogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computación
dc.publisherDepartamento de Ingeniería de Sistemas e Industrial
dc.publisherFacultad de Ingeniería
dc.publisherBogotá, Colombia
dc.publisherUniversidad Nacional de Colombia - Sede Bogotá
dc.relationAhuja, S. and Angra, S. (2017). Machine learning and its applications: A review.
dc.relationBaveye, Y., Dellandréa, E., Chamaret, C., and Chen, L. (2015). Deep learning vs. kernel methods: Performance for emotion prediction in videos. In 2015 International Conference on Affective Computing and Intelligent Interaction (ACII), pages 77–83.
dc.relationBelkin, M., Ma, S., and Mandal, S. (2018). To understand deep learning we need to understand kernel learning.
dc.relationBengio, Y., Delalleau, O., and Le Roux, N. (2005). The curse of dimensionality for local kernel machines. Technical Report, 1258:12.
dc.relationBorgwardt, K. M. (2011). Kernel Methods in Bioinformatics, pages 317–334. Springer Berlin Heidelberg, Berlin, Heidelberg.
dc.relationBousquet, O. and Herrmann, D. J. (2003). On the complexity of learning the kernel matrix. Advances in neural information processing systems, pages 415–422.
dc.relationBoutsidis, C., Mahoney, M. W., and Drineas, P. (2009). An Improved Approximation Algorithm for the Column Subset Selection Problem, pages 968–977.
dc.relationChen, D., Jacob, L., and Mairal, J. (2020). Convolutional kernel networks for graph-structured data. In III, H. D. and Singh, A., editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 1576–1586. PMLR.
dc.relationChitta, R., Jin, R., Havens, T. C., and Jain, A. K. (2014). Scalable kernel clustering: Approximate kernel k-means. CoRR, abs/1402.3849.
dc.relationChitta, R., Jin, R., and Jain, A. K. (2012). Efficient kernel clustering using random fourier features. In 2012 IEEE 12th International Conference on Data Mining, pages 161–170. IEEE.
dc.relationWu, L., Chen, P.-Y., Yen, I. E.-H., Xu, F., Xia, Y., and Aggarwal, C. (2018). Scalable spectral clustering using random binning features. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2506–2515.
dc.relationXiao, H., Rasul, K., and Vollgraf, R. (2017). Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms.
dc.relationYang, M.-H. (2002). Kernel eigenfaces vs. kernel fisherfaces: Face recognition using kernel methods. In FGR, volume 2, page 215.
dc.relationYu, F. X., Suresh, A. T., Choromanski, K., Holtmann-Rice, D. N., and Kumar, S. (2016). Orthogonal random features. CoRR, abs/1610.09072.
dc.relationZhang, D., Wang, J., Cai, D., and Lu, J. (2010). Self-taught hashing for fast similarity search. In Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval, pages 18–25.
dc.relationZhang, Q., Filippi, S., Gretton, A., and Sejdinovic, D. (2017). Large-scale kernel methods for independence testing. Statistics and Computing, 28(1):113–130.
dc.relationTrokicic, A. and Todorovic, B. (2020). On expected error of randomized Nyström kernel regression. Filomat, 34(11):3871–3884.
dc.relationVanegas, J. A., Escalante, H. J., and González, F. A. (2018). Semi-supervised online kernel semantic embedding for multi-label annotation. In Mendoza, M. and Velastín, S., editors, Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, pages 693–701, Cham. Springer International Publishing.
dc.relationWang, D. E. J. (2006). Fast approximation of centrality. Graph algorithms and applications, 5(5):39.
dc.relationWang, J., Cao, B., Yu, P., Sun, L., Bao, W., and Zhu, X. (2018). Deep learning towards mobile applications. In 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS), pages 1385–1393.
dc.relationWang, J., Shen, H. T., Song, J., and Ji, J. (2014). Hashing for similarity search: A survey. CoRR, abs/1408.2927.
dc.relationWang, S., Gittens, A., and Mahoney, M. W. (2019). Scalable kernel k-means clustering with Nyström approximation: relative-error bounds. The Journal of Machine Learning Research, 20(1):431–479.
dc.relationWang, S., Luo, L., and Zhang, Z. (2016). SPSD matrix approximation via column selection: Theories, algorithms, and extensions.
dc.relationWang, S. and Zhang, Z. (2013). Improving CUR matrix decomposition and the Nyström approximation via adaptive sampling. The Journal of Machine Learning Research, 14(1):2729–2769.
dc.relationWang, Y., Liu, X., Dou, Y., Lv, Q., and Lu, Y. (2017). Multiple kernel learning with hybrid kernel alignment maximization. Pattern Recognition, 70:104–111.
dc.relationWang, Z., Crammer, K., and Vucetic, S. (2012). Breaking the curse of kernelization: Budgeted stochastic gradient descent for large-scale SVM training. Journal of Machine Learning Research, 13(100):3103–3131.
dc.relationWilliams, C. K. I. and Seeger, M. (2001). Using the Nyström method to speed up kernel machines. In Leen, T. K., Dietterich, T. G., and Tresp, V., editors, Advances in Neural Information Processing Systems 13, pages 682–688. MIT Press.
dc.relationWitten, R. and Candes, E. (2015). Randomized algorithms for low-rank matrix factorizations: sharp performance bounds. Algorithmica, 72(1):264–281.
dc.rightsReconocimiento 4.0 Internacional
dc.rightshttp://creativecommons.org/licenses/by/4.0/
dc.rightsinfo:eu-repo/semantics/openAccess
dc.titleScalable kernel methods using randomized numerical linear algebra
dc.typeTrabajo de grado - Maestría

