dc.contributorCruz Pérez, Edwin Andrés
dc.contributorPerdomo Charry, Oscar
dc.contributorhttps://orcid.org/0000-0003-2134-0058
dc.contributorhttps://scholar.google.com/citations?hl=es&user=e6Oad5sAAAAJ
dc.contributorhttps://scienti.minciencias.gov.co/cvlac/visualizador/generarCurriculoCv.do?cod_rh=0001525346
dc.contributorUniversidad Santo Tomás
dc.creatorGil Rubio, Ricardo
dc.date.accessioned2022-09-22T15:43:07Z
dc.date.available2022-09-22T15:43:07Z
dc.date.created2022-09-22T15:43:07Z
dc.date.issued2022-09-22
dc.identifierGil Rubio, R. (2022). Modelos de machine learning para clasificar la cartera en un fondo de pensiones. [Master's thesis, Universidad Santo Tomás]. Institutional repository.
dc.identifierhttp://hdl.handle.net/11634/47294
dc.identifierreponame:Repositorio Institucional Universidad Santo Tomás
dc.identifierinstname:Universidad Santo Tomás
dc.identifierrepourl:https://repository.usta.edu.co
dc.description.abstractThis paper applies several Machine Learning techniques, together with statistical and inferential diagnostics, to propose predictive models that identify, classify and process in a timely manner the companies that are not paying pension contributions for their employees affiliated to the pension fund, so that different collection strategies can be implemented to recover the contributions owed. In evaluating model performance, the Decision Trees technique showed excellent results: it did not require standardization of the data, it achieved an excellent accuracy percentage, and it quickly and efficiently classified the variable to be predicted in a database with an adequate number of records. The other techniques showed good results for class types 0, 3 and 4, with percentages above 96.8% in both recall (completeness) and F-measure. Performance decreased for class type 1, with Logistic Regression at 71.8% and Support Vector Machines at 69.2% in recall, and Bayesian Networks at 18.5% in F-measure. For class type 2, Bayesian Networks was reduced by 24.7% and 29.3% in recall and F-measure, and Support Vector Machines stood at 59.4% in F-measure. This was addressed by treating the unbalanced classes and by applying boosting or ensemble algorithms. Class imbalance is a fairly common problem when working with real data; it arises when samples from one or more classes are overrepresented in a data set. It can occur in many areas, such as spam filtering, cancer detection, fraud identification or disease detection. Strategies to deal with class imbalance include upsampling the minority class, downsampling the majority class, and generating synthetic training samples, most commonly with the SMOTE algorithm.
Once the models with the proposed segmentation were evaluated, strategies were generated that identify the collection management mechanisms according to the type of debtor. For debtors of low criticality, these include a commercial visit, contact center management for preventive collection, or a statement with payment information. For debtors of medium criticality, they include a persuasive collection letter, advice at service points, or text messages. For debtors who are reluctant to pay, they extend to the coercive collection process, asset seizures and other measures.
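As an illustration of two ideas named in the abstract (minority class upsampling, and per-class recall and F-measure), the following is a minimal sketch in plain Python. It is not the thesis's implementation: the function names `upsample_minority` and `recall_and_f_measure`, and the toy labels in the usage note, are hypothetical, and random duplication is used here instead of SMOTE to keep the example dependency-free.

```python
import random
from collections import Counter


def upsample_minority(rows, labels, seed=0):
    """Randomly duplicate samples until every class is as large as the largest one.

    Hypothetical helper: simple random oversampling, not SMOTE.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def recall_and_f_measure(y_true, y_pred, cls):
    """Return (recall, F-measure) for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return recall, f_measure
```

For example, `upsample_minority([[1], [2], [3], [4], [5]], [0, 0, 0, 0, 1])` returns a data set in which classes 0 and 1 each have four samples. Resampling of this kind would normally be applied to the training split only, so that the evaluation metrics still reflect the original class distribution.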
dc.languagespa
dc.publisherUniversidad Santo Tomás
dc.publisherMaestría Estadística Aplicada
dc.publisherFacultad de Estadística
dc.relationAgresti, A. (2002). Categorical Data Analysis. Second edition, John Wiley & Sons, Inc., New York. Online. Retrieved from: http://dx.doi.org/10.1002/0471249688.
dc.relationAlpaydin, E. (2004). Introduction to Machine Learning. The MIT Press, Cambridge, MA.
dc.relationAmat, J. (2016). Regresión logística simple y múltiple. https://www.cienciadedatos.net/documentos/27_regresion_logistica_simple_y_multiple.
dc.relationAruna, R. & Nirmala, K. (2013). Construction of Decision Tree: Attribute Selection Measures. International Journal of Advancements in Research & Technology, Volume 2, Issue 4. Retrieved from: http://www.ijoart.org/docs/Construction-of-Decision-Tree--Attribute-Selection-Measures.pdf.
dc.relationBrito, F. & Artes, R. (2018). Application of Bayesian additive regression trees in the development of credit scoring models in Brazil. Production, 28. https://doi.org/10.1590/0103-6513.20170110.
dc.relationCohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), pp. 37-46. doi: 10.1177/001316446002000104.
dc.relationColfondos. (2013). Manual del participante, Ley 100 de 1993. Online. Retrieved from: https://www.colfondos.com.co/dxp/documents/20143/37693/LEY+100+DE+1993.pdf/c2be65aa-08dd-decc-447c-647409ce4f12.
dc.relationInternational Business Machines Corporation (2019). Funcionamiento de SVM. Retrieved from: https://www.ibm.com/docs/es/spss-modeler/SaaS?topic=models-how-svm-works.
dc.relationGarcia, N. (2020). Qué son los árboles de decisión y para que sirven. Retrieved from: https://www.maximaformacion.es/blog-dat/que-son-los-arboles-de-decision-y-para-que-sirven/.
dc.relationHusejinovic et al. (2018). Application of Machine Learning Algorithms in Credit Card Default Payment Prediction. Retrieved from: https://www.researchgate.net/publication/328026972-Application-of-Machine-Learning-Algorithms-in-Credit-Card-Default-Payment-Prediction.
dc.relationIronhack (2015). ¿En qué consiste el Machine Learning? Online. Retrieved from: https://www.ironhack.com/es/data-analytics/que-es-machine-learning.
dc.relationLópez, R. (2015). Machine Learning con Python. Online. Retrieved from: https://relopezbriega.github.io/blog/2015/10/10/machine-learning-con-python/.
dc.relationMendoza, J. (2020). XGBoost en Python. Online. Retrieved from: https://medium.com/@jboscomendoza/tutorial-xgboost-en-python-53e48fc58f73.
dc.relationMüller, A. & Guido, S. (2017). Introduction to Machine Learning with Python. A Guide for Data Scientists. O'Reilly, United States of America.
dc.relationNaviani (2018). AdaBoost Classifier in Python. Retrieved from: https://www.datacamp.com/tutorial/adaboost-classifier-python#rdl.
dc.relationNieto, S. (2010). Crédito al Consumo: La estadística aplicada a un problema de riesgo crediticio [Master's thesis]. Universidad Autónoma Metropolitana. Retrieved from: http://mat.izt.uam.mx/mcmai/documentos/tesis/Gen.07-O/Nieto-S-Tesis.pdf.
dc.relationOlarte, N. (April 8, 2016). El pequeño dato que puede arruinar su futuro. Revista Semana. Online. Retrieved from: http://www.finanzaspersonales.co/pensiones-y-cesantias/articulo/que-hacer-cuando-la-empresa-no-hace-aportes-a-pension/59958. Accessed: November 2018.
dc.relationOñate (2016). Análisis de la Deserción y Permanencia Académica en la Educación superior Aplicando Minería De Datos. Universidad Nacional de Colombia.
dc.relationParra, F. (2017). Estadística y Machine Learning con R. Rpubs. Retrieved from: https://rpubs.com/PacoParra/293405. Accessed: November 2018.
dc.relationRaschka & Mirjalili (2019). Python Machine Learning. Aprendizaje automático y aprendizaje profundo con Python, scikit-learn y TensorFlow. Marcombo.
dc.relationResolución 2082 de 2016 (2017). Principales cambios o ajustes. Proceso de extracción de conocimiento. Unidad de Gestión Pensional y Parafiscales.
dc.relationRuiz, S. (2016). Algoritmos de clasificación: K-NN, Árboles de decisión simples y múltiples (random forest). Online. Retrieved from: https://rstudio-pubs-static.s3.amazonaws.com.
dc.relationSancho, F. (2017). Redes Neuronales: una visión superficial. Online. Retrieved from: http://www.cs.us.es/~fsancho/?e=72.
dc.relationSrinath & Gururaja (2022). Explainable machine learning in identifying credit card defaulters. Retrieved from: https://www.sciencedirect.com/science/article/pii/S2666285X22000619.
dc.relationStatistical Analysis System (SAS) Institute. (2019). Machine Learning, una expresión de la Inteligencia Artificial. Online. Retrieved from: https://www.sas.com/content/dam/SAS/es_mx/doc/whitepaper1/109075_0917.pdf.
dc.rightshttp://creativecommons.org/licenses/by-nc-nd/2.5/co/
dc.rightsOpen (Full Text)
dc.rightsinfo:eu-repo/semantics/openAccess
dc.rightshttp://purl.org/coar/access_right/c_abf2
dc.rightsAttribution-NonCommercial-NoDerivs 2.5 Colombia
dc.titleModelos de machine learning para clasificar la cartera en un fondo de pensiones

