Undergraduate thesis
Comparative study of SAC and PPO in multi-agent reinforcement learning using Unity ML-Agents
Date
2023-07-25
Registered in:
instname:Universidad de los Andes
reponame:Repositorio Institucional Séneca
Author
Bayona Latorre, Andrés Leonardo
Institution
Abstract
This document presents a comparative study of the Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO) algorithms in the context of Multi-Agent Reinforcement Learning (MARL) using the Unity ML-Agents framework. The objective is to investigate the performance and adaptability of these algorithms in dynamic environments. A collaborative-competitive multi-agent problem is formulated as a food-gathering task. The proposed solution includes a dynamic environment generator and reward-shaping training techniques. The results show that SAC and PPO are both effective at learning complex behaviors and strategies in the target MARL task. Combining dynamic environments with reward shaping enables the agents to exhibit intelligent and adaptive behaviors. This study highlights the potential of MARL algorithms for addressing real-world challenges and their suitability for training agents in dynamic environments with the Unity ML-Agents framework.