dc.contributorHernández Gress, Neil
dc.contributorSchool of Engineering and Sciences
dc.contributorCeballos Cancino, Héctor
dc.contributorLópez Guajardo, Rafael
dc.contributorCeballos Cancino, Héctor Gibrán
dc.contributorPreciado Arreola, José Luis
dc.contributorCampus Monterrey
dc.contributorpuelquio
dc.creatorGómez Cravioto, Daniela Alejandra
dc.date.accessioned2023-07-13T22:09:45Z
dc.date.accessioned2023-07-19T20:25:50Z
dc.date.available2023-07-13T22:09:45Z
dc.date.available2023-07-19T20:25:50Z
dc.date.created2023-07-13T22:09:45Z
dc.date.issued2021-05
dc.identifierGómez Cravioto, D. (2021). Analyzing factors that impact alumni income with a machine learning approach. Instituto Tecnológico y de Estudios Superiores de Monterrey.
dc.identifierhttps://hdl.handle.net/11285/651027
dc.identifierhttps://orcid.org/0000-0001-9286-9480
dc.identifier972191
dc.identifier.urihttps://repositorioslatinoamericanos.uchile.cl/handle/2250/7716587
dc.description.abstractThis thesis explores different machine-learning algorithms and approaches for predicting alumni income. The aim is to gain insight into the strongest predictors of income and of membership in a "high" earners class. The study examines alumni survey data from Tec de Monterrey, a multi-campus private university in Mexico. The survey results comprise 17,898 observations before cleaning and preprocessing and 12,275 observations afterward. The dataset includes income values and a large set of independent variables, including demographic and occupational attributes of the former students and academic attributes from the institution's history. Income prediction has been attempted several times in both social science and econometric studies. This study, however, investigates whether the accuracy of the conventional algorithms used in econometric research to predict income can be improved with a data science approach. Furthermore, we present insights obtained with explainable AI techniques. The results show that the Gradient Boosting Model outperformed the parametric models, Linear Regression and Logistic Regression, in predicting the current income of alumni, with statistically significant results (p < 0.05) in three different approaches: OLS regression, multi-class classification, and binary classification. The study also found that, for predicting an alum's first income after graduation, the Linear and Logistic Regression models were the most accurate methods, as the non-parametric models did not show a significant improvement. In summary, we identified that age, gender, working hours per week, first income after graduation, and factors related to job position and firm contributed to explaining alumni income, whereas post-graduation education and family background made an insignificant contribution to the model.
In addition, the results, which showed a gender wage gap, indicate that further work is required to achieve equality in Mexico.
dc.languageeng
dc.publisherInstituto Tecnológico y de Estudios Superiores de Monterrey
dc.relationpublishedVersion
dc.relationhttps://doi.org/10.1109/CSASE48920.2020.9142069
dc.rightshttp://creativecommons.org/licenses/by-nc-nd/4.0
dc.rightsopenAccess
dc.titleAnalyzing factors that impact alumni income with a machine learning approach
dc.typeMaster's Thesis

