dc.contributorTakahashi Rodríguez, Silvia
dc.creatorBayona Latorre, Andrés Leonardo
dc.date.accessioned2023-07-27T13:39:23Z
dc.date.available2023-07-27T13:39:23Z
dc.date.created2023-07-27T13:39:23Z
dc.date.issued2023-07-25
dc.identifierhttp://hdl.handle.net/1992/68811
dc.identifierinstname:Universidad de los Andes
dc.identifierreponame:Repositorio Institucional Séneca
dc.identifierrepourl:https://repositorio.uniandes.edu.co/
dc.identifier.urihttps://repositorioslatinoamericanos.uchile.cl/handle/2250/8729119
dc.description.abstractThis document presents a comparative study of the Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO) algorithms in the context of Multi-Agent Reinforcement Learning (MARL) using the Unity ML-Agents framework. The objective is to investigate the performance and adaptability of these algorithms in dynamic environments. A collaborative-competitive multi-agent problem is formulated as a food-gathering task. The proposed solution includes a dynamic environment generator and reward-shaping training techniques. The results demonstrate the effectiveness of SAC and PPO in learning complex behaviors and strategies on the target MARL task. Using dynamic environments and reward shaping enables the agents to exhibit intelligent and adaptive behaviors. This study highlights the potential of MARL algorithms for addressing real-world challenges and their suitability for training agents in dynamic environments with the Unity ML-Agents framework.
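The training setup described in the abstract is driven by an ML-Agents trainer configuration file (see the Training-Configuration-File.md reference below). A minimal sketch of such a file follows; the behavior name `FoodCollector` and the specific hyperparameter values are illustrative assumptions, not the study's actual settings.

```yaml
behaviors:
  FoodCollector:           # hypothetical behavior name for the food-gathering agents
    trainer_type: ppo      # switch to "sac" for the off-policy comparison run
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 3.0e-4
    network_settings:
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:           # shaped rewards are delivered through the environment's extrinsic signal
        gamma: 0.99
        strength: 1.0
    max_steps: 500000
```

When `trainer_type` is changed to `sac`, the PPO-specific entropy and clipping hyperparameters (`beta`, `epsilon`) are replaced by SAC-specific ones such as `buffer_init_steps` and `tau`, which is how ML-Agents lets the two algorithms be compared under an otherwise identical environment and network configuration.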
dc.languageeng
dc.publisherUniversidad de los Andes
dc.publisherIngeniería de Sistemas y Computación
dc.publisherFacultad de Ingeniería
dc.publisherDepartamento de Ingeniería de Sistemas y Computación
dc.relationBengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009). Curriculum learning. Proceedings of the 26th Annual International Conference on Machine Learning, 41-48. https://doi.org/10.1145/1553374.1553380
dc.relationBusoniu, L., Babuska, R., & De Schutter, B. (2008). A Comprehensive Survey of Multiagent Reinforcement Learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 38(2), 156-172. https://doi.org/10.1109/TSMCC.2007.913919
dc.relationHaarnoja, T., Zhou, A., Abbeel, P., & Levine, S. (2018). Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor (arXiv:1801.01290). arXiv. http://arxiv.org/abs/1801.01290
dc.relationJuliani, A., Berges, V.-P., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C., Gao, Y., Henry, H., Mattar, M., & Lange, D. (2020). Unity: A General Platform for Intelligent Agents (arXiv:1809.02627). arXiv. http://arxiv.org/abs/1809.02627
dc.relationSamek, W., Montavon, G., Lapuschkin, S., Anders, C. J., & Muller, K.-R. (2021). Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications. Proceedings of the IEEE, 109(3), 247-278. https://doi.org/10.1109/JPROC.2021.3060483
dc.relationSchulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms (arXiv:1707.06347). arXiv. http://arxiv.org/abs/1707.06347
dc.relationWong, A., Bäck, T., Kononova, A. V., & Plaat, A. (2023). Deep multiagent reinforcement learning: Challenges and directions. Artificial Intelligence Review, 56(6), 5023-5056. https://doi.org/10.1007/s10462-022-10299-x
dc.relationUnity Technologies. (2022, December 14). Training-Configuration-File.md [Documentation]. In ml-agents (develop branch). GitHub. https://github.com/Unity-Technologies/ml-agents
dc.relationNeumann, C., Duboscq, J., Dubuc, C., Ginting, A., Irwan, A. M., Agil, M., Widdig, A., & Engelhardt, A. (2011). Assessing dominance hierarchies: Validation and advantages of progressive evaluation with Elo-rating. Animal Behaviour, 82(4), 911-921. ISSN 0003-3472
dc.relationABL. (2023a, May 30). PPOvsSAC [Video]. YouTube. https://www.youtube.com/watch?v=ZtdtpRmoFSE
dc.relationABL. (2023b, May 30). PPOvsRandom [Video]. YouTube. https://www.youtube.com/watch?v=N-aRvKfYnpI
dc.relationABL. (2023c, May 30). SACvsRandom [Video]. YouTube. https://www.youtube.com/watch?v=744kTLEubK0
dc.rightsAttribution-NonCommercial-NoDerivatives 4.0 International
dc.rightshttps://repositorio.uniandes.edu.co/static/pdf/aceptacion_uso_es.pdf
dc.rightsinfo:eu-repo/semantics/openAccess
dc.rightshttp://purl.org/coar/access_right/c_abf2
dc.titleComparative study of SAC and PPO in multi-agent reinforcement learning using unity ML-agents
dc.typeTrabajo de grado - Pregrado