dc.contributor | Hernández Gress, Neil | |
dc.contributor | School of Engineering and Sciences | |
dc.contributor | Ceballos Cancino, Héctor | |
dc.contributor | López Guajardo, Rafael | |
dc.contributor | Ceballos Cancino, Héctor Gibrán | |
dc.contributor | Preciado Arreola, José Luis | |
dc.contributor | Campus Monterrey | |
dc.contributor | puelquio | |
dc.creator | Gómez Cravioto, Daniela Alejandra | |
dc.date.accessioned | 2023-07-13T22:09:45Z | |
dc.date.accessioned | 2023-07-19T20:25:50Z | |
dc.date.available | 2023-07-13T22:09:45Z | |
dc.date.available | 2023-07-19T20:25:50Z | |
dc.date.created | 2023-07-13T22:09:45Z | |
dc.date.issued | 2021-05 | |
dc.identifier | Gómez Cravioto, D. (2021). Analyzing factors that impact alumni income with a machine learning approach. Instituto Tecnológico y de Estudios Superiores de Monterrey. | |
dc.identifier | https://hdl.handle.net/11285/651027 | |
dc.identifier | https://orcid.org/0000-0001-9286-9480 | |
dc.identifier | 972191 | |
dc.identifier.uri | https://repositorioslatinoamericanos.uchile.cl/handle/2250/7716587 | |
dc.description.abstract | This thesis explores different machine-learning algorithms and approaches for predicting alumni income. The aim is to obtain insights into the strongest predictors of income and of membership in a "high" earners class. The study examines alumni sample data obtained from a survey by Tec de Monterrey, a multi-campus Mexican private university. The survey results comprise 17,898 observations before cleaning and preprocessing and 12,275 observations afterwards. The dataset includes values for income and a large set of independent variables, including demographic and occupational attributes of the former students and academic attributes from the institution's history. Income prediction has been attempted several times in both social science and econometric studies. This study investigates whether a data science approach can improve on the accuracy of the conventional algorithms used in econometric research to predict income. Furthermore, we present insights obtained with explainable AI techniques. The results show that the Gradient Boosting Model outperformed the parametric models, Linear Regression and Logistic Regression, in predicting the current income of alumni, with statistically significant results (p<0.05) in three different approaches: OLS regression, Multi-class Classification, and Binary Classification. The study also found that, for predicting alumni's first income after graduation, the Linear and Logistic Regression models were the most accurate methods, as the non-parametric models did not show a significant improvement. In summary, we identified that age, gender, working hours per week, first income after graduation, and factors related to the alumni's job position and firm contributed to explaining their income. By contrast, post-graduation education and family background made an insignificant contribution to the model.
In addition, the results showed a gender wage gap, which indicates that further work is required to enable equality in Mexico. | |
dc.language | eng | |
dc.publisher | Instituto Tecnológico y de Estudios Superiores de Monterrey | |
dc.relation | publishedVersion | |
dc.relation | https://doi.org/10.1109/CSASE48920.2020.9142069 | |
dc.rights | http://creativecommons.org/licenses/by-nc-nd/4.0 | |
dc.rights | openAccess | |
dc.title | Analyzing factors that impact alumni income with a machine learning approach | |
dc.type | Master's Thesis | |