dc.contributorMartínez Carvajal, Hernán Eduardo
dc.contributorARISTIZABAL GIRALDO, EDIER VICENTE
dc.contributorBranch Bedoya, John Willian
dc.contributorInvestigación en Geología Ambiental Gea
dc.contributorGidia: Grupo de Investigación Y Desarrollo en Inteligencia Artificial
dc.creatorPalacio Jiménez, David
dc.date.accessioned2022-06-06T15:48:29Z
dc.date.accessioned2022-09-21T18:07:31Z
dc.date.available2022-06-06T15:48:29Z
dc.date.available2022-09-21T18:07:31Z
dc.date.created2022-06-06T15:48:29Z
dc.date.issued2022
dc.identifierhttps://repositorio.unal.edu.co/handle/unal/81507
dc.identifierUniversidad Nacional de Colombia
dc.identifierRepositorio Institucional Universidad Nacional de Colombia
dc.identifierhttps://repositorio.unal.edu.co/
dc.identifier.urihttp://repositorioslatinoamericanos.uchile.cl/handle/2250/3406044
dc.description.abstractDebris flows are destructive phenomena characteristic of mountainous regions. In the Department of Antioquia (Colombia), these events occur frequently, and the resulting economic losses and loss of human life underscore the importance of predicting them. Extreme weather conditions, urban expansion, and population growth tend to increase the risk in areas where events have already occurred. Currently, no database compiles the details of the debris flows that have occurred in Antioquia together with their hydrometeorological variables; moreover, most research focuses on identifying the spatial susceptibility of these phenomena. Drawing on the rise of machine learning techniques, this work proposes a binary classification method for the temporal prediction of debris flows from open data. First, multiple information sources are identified to build an inventory of events with their hydrometeorological variables. The data are then preprocessed and explored in depth, and the variables with the greatest influence on debris flow occurrence are selected using wrapper and filter methods. Next, class imbalance is addressed by using different class proportions and by generating synthetic data to evaluate the proposed classifier. Finally, the random forest algorithm performed best on both the balanced data set and the data set imbalanced at a 1:99 ratio between the occurrence and non-occurrence classes: it achieved an F1-score and sensitivity of 85% on the balanced set, versus 66% and 55%, respectively, on the imbalanced set. The variables with the greatest influence on the classification model are 1-day antecedent rainfall, runoff, potential evapotranspiration, and the low vegetation index.
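The abstract reports F1-score and sensitivity rather than accuracy. The sketch below illustrates why those metrics suit a 1:99 class ratio. It is not the thesis's code: the function names and all confusion-matrix counts are hypothetical, chosen only for illustration, and do not reproduce the reported results.

```python
"""Metrics for an imbalanced binary classifier, using hypothetical counts."""


def precision(tp: int, fp: int) -> float:
    """Share of predicted events that were real events: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def sensitivity(tp: int, fn: int) -> float:
    """Share of real events that were detected (recall): TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and sensitivity."""
    p, r = precision(tp, fp), sensitivity(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical test set at a 1:99 ratio: 100 events and 9,900 non-events.
cases = {
    "illustrative model": dict(tp=60, fp=20, fn=40, tn=9880),
    "always predicts 'no event'": dict(tp=0, fp=0, fn=100, tn=9900),
}

for name, c in cases.items():
    print(
        f"{name}: accuracy={accuracy(**c):.3f}, "
        f"sensitivity={sensitivity(c['tp'], c['fn']):.3f}, "
        f"F1={f1_score(c['tp'], c['fp'], c['fn']):.3f}"
    )
```

With these counts, a classifier that never predicts an event still scores high on accuracy while its sensitivity and F1-score are zero, which is why the rarer class needs metrics of its own.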
dc.languagespa
dc.publisherUniversidad Nacional de Colombia
dc.publisherMedellín - Minas - Maestría en Ingeniería - Analítica
dc.publisherDepartamento de la Computación y la Decisión
dc.publisherFacultad de Minas
dc.publisherMedellín, Colombia
dc.publisherUniversidad Nacional de Colombia - Sede Medellín
dc.rightsAttribution-NonCommercial 4.0 International
dc.rightshttp://creativecommons.org/licenses/by-nc/4.0/
dc.rightsinfo:eu-repo/semantics/openAccess
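dc.titleMethod for the temporal prediction of debris flows from open data using machine learning [original title: Método para la predicción temporal de avenidas torrenciales a partir de datos abiertos usando aprendizaje de máquinas]
dc.typeThesis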

