dc.creator | Acosta-Solano, Jairo | |
dc.creator | Lancheros Cuesta, Diana Janeth | |
dc.creator | Umaña Ibáñez, Samir F. | |
dc.creator | Coronado-Hernandez, Jairo R. | |
dc.date | 2022-06-08T13:28:14Z | |
dc.date | 2022 | |
dc.date.accessioned | 2023-10-03T20:09:18Z | |
dc.date.available | 2023-10-03T20:09:18Z | |
dc.identifier | 1877-0509 | |
dc.identifier | https://hdl.handle.net/11323/9221 | |
dc.identifier | https://doi.org/10.1016/j.procs.2021.12.278 | |
dc.identifier | 10.1016/j.procs.2021.12.278 | |
dc.identifier | Corporación Universidad de la Costa | |
dc.identifier | REDICUC - Repositorio CUC | |
dc.identifier | https://repositorio.cuc.edu.co/ | |
dc.identifier.uri | https://repositorioslatinoamericanos.uchile.cl/handle/2250/9174590 | |
dc.description | This paper evaluates several machine learning models under the CRISP-DM methodology to determine, from their metrics, the best model for predicting the performance of high school students in the Colombian Caribbean region on the Saber 11º test. It also proposes a new methodology for evaluating test results by region, so that the socioeconomic particularities of each region are taken into account. CRISP-DM is taken as the basis because of its maturity: it supports the extraction of business and data knowledge and offers a guide for data preparation, modeling, and model validation. The proposed methodology is expected to be implemented by the Colombian Institute for the Promotion of Higher Education (ICFES), departmental education secretariats, and educational institutions. A variety of techniques and tools were used to develop ETL processes that yield a data set with the most relevant attributes, which was then used to evaluate four machine learning models built with the J48 (C4.5), LMT, PART, and Multilayer Perceptron algorithms. The best data set was obtained with the InfoGain attribute selection method, and the best learning model with the LMT decision tree algorithm. This project will therefore help the actors of the National Education System make decisions that benefit students and the quality of education in the country, especially in the Caribbean region. | |
dc.format | 6 pages | |
dc.format | application/pdf | |
dc.language | eng | |
dc.publisher | Elsevier BV | |
dc.publisher | Netherlands | |
dc.relation | Procedia Computer Science | |
dc.relation | [1] Isis Gómez López, Desarrollo sostenible. Elearning, 2020. | |
dc.relation | [2] ICFES, “Guía de orientación Saber 11.º 2020-1,” 2019. | |
dc.relation | [3] R. Ricardo Timarán-Pereira, J. Caicedo-Zambrano, and A. Hidalgo-Troya, “Árboles de decisión para predecir factores asociados al desempeño académico de estudiantes de bachillerato en las pruebas Saber 11°,” Rev. Investig. Desarro. E Innovación, vol. 9, no. 2, 2019. | |
dc.relation | [4] W. Y. Ayele, “Adapting CRISP-DM for idea mining a data mining process for generating ideas using a textual dataset,” Int. J. Adv. Comput. Sci. Appl., vol. 11, no. 6, pp. 20–32, 2020. | |
dc.relation | [5] R. Wirth, “CRISP-DM: Towards a Standard Process Model for Data Mining,” Proc. Fourth Int. Conf. Pract. Appl. Knowl. Discov. Data Min., no. 24959, pp. 29–39, 2000. | |
dc.relation | [6] F. Martinez-Plumed et al., “CRISP-DM Twenty Years Later: From Data Mining Processes to Data Science Trajectories,” IEEE Trans. Knowl. Data Eng., pp. 1–1, 2019. | |
dc.relation | [7] R. C. Q. Jordi Gironés Roig, Jordi Casas Roma, Julià Minguillón Alfonso, Minería de datos. Editorial UOC, 2017. | |
dc.relation | [8] ICFES, “Icfes Instituto Colombiano para la Evaluación de la Educación - Portal Icfes.” | |
dc.relation | [9] S. U. Ibáñez and Jairo R. Coronado-Hernández, “Código desarrollado en R para el proceso ETL.” | |
dc.relation | [10] C. L. Corso, “Aplicación de algoritmos de clasificación supervisada usando Weka,” Univ. Tecnológica Nac. Fac. Reg. Córdoba, p. 11, 2009. | |
dc.relation | [11] S. Umaña and J. R. Coronado-Hernández, “Métricas de evaluación de los modelos,” figshare, Dataset, 2021. | |
dc.relation | [12] S. Umaña and J. R. Coronado-Hernández, “Estructura del Logistic Model Tree,” figshare, Dataset. | |
dc.relation | [13] S. Umaña and J. R. Coronado-Hernández, “Código en R y gráfica del árbol generado por el algoritmo rpart,” figshare, Dataset, 2021. | |
dc.relation | 517 | |
dc.relation | 512 | |
dc.relation | 198 | |
dc.rights | © 2021 The Authors. Published by Elsevier B.V | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) | |
dc.rights | https://creativecommons.org/licenses/by-nc-nd/4.0/ | |
dc.rights | info:eu-repo/semantics/openAccess | |
dc.rights | http://purl.org/coar/access_right/c_abf2 | |
dc.source | https://www.sciencedirect.com/science/article/pii/S1877050921025175 | |
dc.subject | CRISP-DM methodology | |
dc.subject | Education | |
dc.subject | Learning models | |
dc.subject | National education system | |
dc.subject | Predictive models | |
dc.title | Predictive models assessment based on CRISP-DM methodology for students performance in Colombia - Saber 11 Test | |
dc.type | Journal article | |
dc.type | http://purl.org/coar/resource_type/c_6501 | |
dc.type | Text | |
dc.type | info:eu-repo/semantics/article | |
dc.type | info:eu-repo/semantics/publishedVersion | |
dc.type | http://purl.org/redcol/resource_type/ART | |
dc.type | http://purl.org/coar/version/c_ab4af688f83e57aa | |
dc.coverage | Colombia | |