GAME BASED DEEP REINFORCEMENT LEARNING FOR TARGET TRACKING

Marco Antonio Esquivel Basaldua


Record at:

http://cimat.repositorioinstitucional.mx/jspui/handle/1008/1160

https://repositorioslatinoamericanos.uchile.cl/handle/2250/7729698

Author

Marco Antonio Esquivel Basaldua

Institution

Centro de Investigación en Matemáticas (México)

Abstract

This work proposes a methodology for solving the tracking problem under classic visibility in the 2D Euclidean space for a pair of omnidirectional antagonistic players, a pursuer and an evader. The methodology begins by deriving motion policies for the players in a discrete state space, applying optimal motion planning to a pursuit-evasion game. The first approach in the continuous state space consists of two neural networks, one per player, acting as motion policies whose inputs are states of the environment and whose outputs are the actions to perform. These policies are trained on the behaviour obtained in the discrete state space. Finally, we improve the pursuer's motion policy using deep reinforcement learning (DRL), considering a fixed trajectory for the evader. In all these cases, the action space is discrete. Three DRL approaches for the pursuer are compared in two proposed environments: a DRL approach from scratch; an initialized DRL approach, which starts from the weights of the neural network trained from the optimal motion planning; and a DRL approach using a master policy (the same neural network trained from the optimal motion planning), which generates transitions during training. Results show that a simple initialization is enough to achieve favorable outcomes in a simple environment, while the use of a master policy is preferred in a more complex one.
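The setting described in the abstract (line-of-sight visibility in the plane, omnidirectional players, a discrete action space) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the nine-action set (eight headings plus "stay"), the representation of obstacles as lists of edges, and the names `ACTIONS`, `visible`, and `step` are all hypothetical.

```python
import math

# Hypothetical discrete action set (an assumption, not the thesis's):
# index 0 is "stay", indices 1..8 are unit headings spaced 45 degrees apart.
ACTIONS = [(0.0, 0.0)] + [
    (math.cos(k * math.pi / 4), math.sin(k * math.pi / 4)) for k in range(8)
]


def _orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p, q, a, b):
    """True if segments pq and ab cross at a point interior to both.

    Grazing and collinear contacts are deliberately not counted as crossings.
    """
    return (_orient(p, q, a) * _orient(p, q, b) < 0
            and _orient(a, b, p) * _orient(a, b, q) < 0)


def visible(pursuer, evader, obstacle_edges):
    """Line-of-sight test: the evader is visible if the segment joining the
    two players crosses no obstacle edge."""
    return not any(segments_cross(pursuer, evader, a, b)
                   for a, b in obstacle_edges)


def step(position, action_index, speed=1.0, dt=0.1):
    """Move an omnidirectional player along the chosen discrete heading."""
    dx, dy = ACTIONS[action_index]
    return (position[0] + speed * dt * dx, position[1] + speed * dt * dy)


# Example obstacle: a square with corners (1, 1) and (2, 2), given as edges.
SQUARE = [((1, 1), (2, 1)), ((2, 1), (2, 2)), ((2, 2), (1, 2)), ((1, 2), (1, 1))]
```

In this sketch, a tracking episode would end as soon as `visible(pursuer, evader, SQUARE)` becomes false. The pursuer's policy, whether hand-designed or a neural network, would map the current state to an index into `ACTIONS`.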