dc.contributor | Cardozo Álvarez, Nicolás | |
dc.contributor | Dusparic, Ivana | |
dc.contributor | Gauthier Umaña, Valerie Elisabeth | |
dc.contributor | FLAG research lab | |
dc.creator | Patiño Sáenz, Michel Andrés | |
dc.date.accessioned | 2023-07-18T14:09:20Z | |
dc.date.accessioned | 2023-09-06T23:44:10Z | |
dc.date.available | 2023-07-18T14:09:20Z | |
dc.date.available | 2023-09-06T23:44:10Z | |
dc.date.created | 2023-07-18T14:09:20Z | |
dc.date.issued | 2023-07-10 | |
dc.identifier | http://hdl.handle.net/1992/68513 | |
dc.identifier | instname:Universidad de los Andes | |
dc.identifier | reponame:Repositorio Institucional Séneca | |
dc.identifier | repourl:https://repositorio.uniandes.edu.co/ | |
dc.identifier.uri | https://repositorioslatinoamericanos.uchile.cl/handle/2250/8726784 | |
dc.description.abstract | Deep neural networks are black-box models for which there is no established formal method for interpreting their behavior. Abductive explanations are formal explanations that entail an observation within a logical system and satisfy certain minimality criteria. Such explanations have previously been computed for deep neural networks with binary input features and for neural networks with continuous input features, but it was not known whether a "deletion" algorithm designed to compute abductive explanations could be adapted and extended to reinforcement learning tasks with continuous input features. Here, we show evidence that the explanations generated by this algorithm may be biased: the algorithm favors the inclusion of features deleted later in the execution, a so-called "order effect". We propose a way to correct this problem and design an elementary algorithm that computes robust, formal, and unbiased explanations of deep reinforcement learning model predictions. Our results suggest that this bias may be present in other implementations of the deletion algorithm for machine learning models in general, including those with discrete input features, and that it affects models with larger input dimensions more strongly. In the future, new methods for computing abductive explanations, or other types of formal explanations, should be explored for deep reinforcement learning and machine learning in general. | |
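For context on the deletion algorithm and the order effect the abstract describes, the sketch below illustrates the idea. It is a minimal illustration, not the thesis's implementation: the `is_invariant` callback, the function names, and the order-randomized diagnostic are assumptions for exposition; in a formal setting, `is_invariant` would be answered by a neural network verifier such as the bound-propagation tools cited in [13]-[17].

```python
# Minimal sketch (not the thesis's code) of the "deletion" algorithm for
# abductive explanations, and of the order effect it can introduce.
# `is_invariant(kept)` is a hypothetical verifier callback: it returns
# True iff fixing only the features in `kept` to their observed values
# (while all other features range over their domains) still guarantees
# the model's original prediction.
import random

def deletion_explanation(features, is_invariant, order=None):
    """Greedy deletion: drop each feature whose removal keeps the
    prediction invariant; whatever remains is a subset-minimal
    abductive explanation. The result can depend on `order`."""
    kept = set(features)
    for f in (order if order is not None else features):
        candidate = kept - {f}
        if is_invariant(candidate):  # prediction provably unchanged without f
            kept = candidate         # f is redundant under this order
    return kept

def order_sensitivity(features, is_invariant, trials=20, seed=0):
    """Diagnostic for the order effect: rerun the deletion algorithm
    under random orders and report how often each feature survives.
    Features kept in every run are robustly relevant; features kept
    only under some orders reflect the bias described in the abstract."""
    rng = random.Random(seed)
    counts = {f: 0 for f in features}
    for _ in range(trials):
        order = list(features)
        rng.shuffle(order)
        for f in deletion_explanation(features, is_invariant, order):
            counts[f] += 1
    return {f: counts[f] / trials for f in features}
```

As a toy check, with the monotone oracle `is_invariant = lambda kept: {0, 2} <= kept` over `features = range(4)`, every deletion order yields the explanation `{0, 2}`, so `order_sensitivity` reports a survival frequency of 1.0 for features 0 and 2 and 0.0 for the rest; non-monotone oracles, like real verifiers over continuous domains, are where the order effect appears.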
dc.language | eng | |
dc.publisher | Universidad de los Andes | |
dc.publisher | Maestría en Ingeniería de Sistemas y Computación | |
dc.publisher | Facultad de Ingeniería | |
dc.publisher | Departamento de Ingeniería de Sistemas y Computación | |
dc.relation | [1] A. Ignatiev, N. Narodytska, and J. Marques-Silva, "Abduction-Based Explanations for Machine Learning Models," Nov. 2018, doi: 10.48550/arXiv.1811.10656. | |
dc.relation | [2] C. Liu, T. Arnon, C. Lazarus, C. Strong, C. Barrett, and M. J. Kochenderfer, "Algorithms for Verifying Deep Neural Networks," Mar. 2019, doi: 10.48550/arXiv.1903.06758. | |
dc.relation | [3] J. Marques-Silva and A. Ignatiev, "Delivering Trustworthy AI through Formal XAI," Proc. AAAI Conf. Artif. Intell., vol. 36, no. 11, Art. no. 11, Jun. 2022, doi: 10.1609/aaai.v36i11.21499. | |
dc.relation | [4] A. Heuillet, F. Couthouis, and N. Díaz-Rodríguez, "Explainability in Deep Reinforcement Learning," Aug. 2020, doi: 10.48550/arXiv.2008.06693. | |
dc.relation | [5] P. Linardatos, V. Papastefanopoulos, and S. Kotsiantis, "Explainable AI: A Review of Machine Learning Interpretability Methods," Entropy, vol. 23, no. 1, Art. no. 1, Jan. 2021, doi: 10.3390/e23010018. | |
dc.relation | [6] A. Krajna, M. Brcic, T. Lipic, and J. Doncevic, "Explainability in reinforcement learning: perspective and position," Mar. 2022, doi: 10.48550/arXiv.2203.11547. | |
dc.relation | [7] E. Puiutta and E. M. Veith, "Explainable Reinforcement Learning: A Survey," arXiv:2005.06247 [cs, stat], May 2020. Accessed: Jun. 26, 2021. [Online]. Available: http://arxiv.org/abs/2005.06247 | |
dc.relation | [8] A. Albarghouthi, "Introduction to Neural Network Verification," Sep. 2021, doi: 10.48550/arXiv.2109.10317. | |
dc.relation | [9] E. La Malfa, A. Zbrzezny, R. Michelmore, N. Paoletti, and M. Kwiatkowska, "On Guaranteed Optimal Robust Explanations for NLP Models," May 2021, doi: 10.24963/ijcai.2021/366. | |
dc.relation | [10] A. Ignatiev, N. Narodytska, N. Asher, and J. Marques-Silva, "On Relating 'Why?' and 'Why Not?' Explanations." arXiv, Dec. 20, 2020. doi: 10.48550/arXiv.2012.11067. | |
dc.relation | [11] T. Eiter and G. Gottlob, "The complexity of logic-based abduction," J. ACM, vol. 42, no. 1, pp. 3-42, Jan. 1995, doi: 10.1145/200836.200838. | |
dc.relation | [12] A. Ignatiev, N. Narodytska, and J. Marques-Silva, "On Relating Explanations and Adversarial Examples," in Advances in Neural Information Processing Systems, Curran Associates, Inc., 2019. Accessed: Dec. 12, 2022. [Online]. Available: https://papers.nips.cc/paper/2019/hash/7392ea4ca76ad2fb4c9c3b6a5c6e31e3-Abstract.html | |
dc.relation | [13] α,β-CROWN (alpha-beta-CROWN): A Fast and Scalable Neural Network Verifier using the Bound Propagation Framework. Verified Intelligence, Jun. 22, 2023. Accessed: Jun. 26, 2023. [Online]. Available: https://github.com/Verified-Intelligence/alpha-beta-CROWN | |
dc.relation | [14] H. Zhang, T.-W. Weng, P.-Y. Chen, C.-J. Hsieh, and L. Daniel, "Efficient Neural Network Robustness Certification with General Activation Functions." arXiv, Nov. 02, 2018. doi: 10.48550/arXiv.1811.00866. | |
dc.relation | [15] H. Salman, G. Yang, H. Zhang, C.-J. Hsieh, and P. Zhang, "A Convex Relaxation Barrier to Tight Robustness Verification of Neural Networks." arXiv, Jan. 09, 2020. doi: 10.48550/arXiv.1902.08722. | |
dc.relation | [16] K. Xu et al., "Automatic Perturbation Analysis for Scalable Certified Robustness and Beyond." arXiv, Oct. 25, 2020. doi: 10.48550/arXiv.2002.12920. | |
dc.relation | [17] K. Xu et al., "Fast and Complete: Enabling Complete Neural Network Verification with Rapid and Massively Parallel Incomplete Verifiers." arXiv, Mar. 16, 2021. doi: 10.48550/arXiv.2011.13824. | |
dc.relation | [18] H. Zhang et al., "General Cutting Planes for Bound-Propagation-Based Neural Network Verification." arXiv, Dec. 04, 2022. doi: 10.48550/arXiv.2208.05740. | |
dc.relation | [19] C. Brix, "vnncomp2022_benchmarks." Apr. 28, 2023. Accessed: Jun. 26, 2023. [Online]. Available: https://github.com/ChristopherBrix/vnncomp2022_benchmarks | |
dc.relation | [20] M. N. Müller, C. Brix, S. Bak, C. Liu, and T. T. Johnson, "The Third International Verification of Neural Networks Competition (VNN-COMP 2022): Summary and Results." arXiv, Feb. 16, 2023. doi: 10.48550/arXiv.2212.10376. | |
dc.relation | [21] OpenAI, "Gym: A toolkit for developing and comparing reinforcement learning algorithms." https://gym.openai.com (accessed Jul. 05, 2021). | |
dc.relation | [22] U. J. Ravaioli, J. Cunningham, J. McCarroll, V. Gangal, K. Dunlap, and K. L. Hobbs, "Safe Reinforcement Learning Benchmark Environments for Aerospace Control Systems," in 2022 IEEE Aerospace Conference (AERO), Mar. 2022, pp. 1-20. doi: 10.1109/AERO53065.2022.9843750. | |
dc.relation | [23] T. Freiesleben, "The Intriguing Relation Between Counterfactual Explanations and Adversarial Examples," Minds Mach., vol. 32, no. 1, pp. 77-109, Mar. 2022, doi: 10.1007/s11023-021-09580-9. | |
dc.relation | [24] X. Ma et al., "Understanding Adversarial Attacks on Deep Learning Based Medical Image Analysis Systems," Pattern Recognit., vol. 110, p. 107332, Feb. 2021, doi: 10.1016/j.patcog.2020.107332. | |
dc.relation | [25] A. Barto and S. Mahadevan, "Recent Advances in Hierarchical Reinforcement Learning," Discrete Event Dyn. Syst. Theory Appl., vol. 13, Dec. 2002, doi: 10.1023/A:1025696116075. | |
dc.relation | [26] V. Mnih et al., "Playing Atari with Deep Reinforcement Learning." arXiv, Dec. 19, 2013. doi: 10.48550/arXiv.1312.5602. | |
dc.relation | [27] M. Fischetti and J. Jo, "Deep neural networks and mixed integer linear optimization," Constraints, vol. 23, no. 3, pp. 296-309, Jul. 2018, doi: 10.1007/s10601-018-9285-6. | |
dc.relation | [28] D. Shriver, S. Elbaum, and M. Dwyer, "DNNV: A Framework for Deep Neural Network Verification," in Computer Aided Verification - 33rd International Conference, CAV 2021, Virtual Event, July 20-23, 2021, Proceedings, Part I, A. Silva and K. R. M. Leino, Eds. Springer International Publishing, Jul. 2021, pp. 137-150. doi: 10.1007/978-3-030-81685-8_6. | |
dc.relation | [29] S. Bak, C. Liu, and T. Johnson, "The Second International Verification of Neural Networks Competition (VNN-COMP 2021): Summary and Results." arXiv, Aug. 30, 2021. doi: 10.48550/arXiv.2109.00498. | |
dc.relation | [30] "Cart Pole - Gym Documentation." https://www.gymlibrary.dev/environments/classic_control/cart_pole/ (accessed Jun. 26, 2023). | |
dc.relation | [31] "Playing CartPole with the Actor-Critic method | TensorFlow Core," TensorFlow. https://www.tensorflow.org/tutorials/reinforcement_learning/actor_critic (accessed Jun. 26, 2023). | |
dc.relation | [32] "Lunar Lander - Gym Documentation." https://www.gymlibrary.dev/environments/box2d/lunar_lander/ (accessed Jun. 26, 2023). | |
dc.relation | [33] "openai/gym." OpenAI, Jun. 27, 2023. Accessed: Jun. 27, 2023. [Online]. Available: https://github.com/openai/gym/blob/dcd185843a62953e27c2d54dc8c2d647d604b635/gym/envs/box2d/lunar_lander.py | |
dc.relation | [34] F. Doshi-Velez and B. Kim, "Towards A Rigorous Science of Interpretable Machine Learning." arXiv, Mar. 02, 2017. doi: 10.48550/arXiv.1702.08608. | |
dc.relation | [35] G. Vilone and L. Longo, "Explainable Artificial Intelligence: a Systematic Review," arXiv, arXiv:2006.00093, Oct. 2020. doi: 10.48550/arXiv.2006.00093. | |
dc.relation | [36] S. Huang, N. Papernot, I. Goodfellow, Y. Duan, and P. Abbeel, "Adversarial Attacks on Neural Network Policies." arXiv, Feb. 07, 2017. doi: 10.48550/arXiv.1702.02284. | |
dc.relation | [37] G. Amir, M. Schapira, and G. Katz, "Towards Scalable Verification of Deep Reinforcement Learning." arXiv, Aug. 13, 2021. doi: 10.48550/arXiv.2105.11931. | |
dc.relation | [38] L. Wells and T. Bednarz, "Explainable AI and Reinforcement Learning - A Systematic Review of Current Approaches and Trends," Front. Artif. Intell., vol. 4, 2021. Accessed: Jun. 05, 2022. [Online]. Available: https://www.frontiersin.org/article/10.3389/frai.2021.550030 | |
dc.relation | [39] Z. Juozapaitis, A. Koul, A. Fern, M. Erwig, and F. Doshi-Velez, "Explainable Reinforcement Learning via Reward Decomposition," 2019. | |
dc.relation | [40] M. T. Ribeiro, S. Singh, and C. Guestrin, "'Why Should I Trust You?': Explaining the Predictions of Any Classifier." arXiv, Aug. 09, 2016. doi: 10.48550/arXiv.1602.04938. | |
dc.relation | [41] M. T. Ribeiro, S. Singh, and C. Guestrin, "Anchors: High-Precision Model-Agnostic Explanations," Proc. AAAI Conf. Artif. Intell., vol. 32, no. 1, Art. no. 1, Apr. 2018, doi: 10.1609/aaai.v32i1.11491. | |
dc.rights | Attribution-NonCommercial 4.0 International | |
dc.rights | http://creativecommons.org/licenses/by-nc/4.0/ | |
dc.rights | info:eu-repo/semantics/openAccess | |
dc.rights | http://purl.org/coar/access_right/c_abf2 | |
dc.title | Formal robust explanations for deep reinforcement learning models | |
dc.type | Master's thesis (Trabajo de grado - Maestría) | |