dc.contributor | Takahashi Rodríguez, Silvia | |
dc.creator | González Oviedo, Rodrigo José | |
dc.date.accessioned | 2023-08-16T13:44:18Z | |
dc.date.accessioned | 2023-09-07T01:27:49Z | |
dc.date.available | 2023-08-16T13:44:18Z | |
dc.date.available | 2023-09-07T01:27:49Z | |
dc.date.created | 2023-08-16T13:44:18Z | |
dc.date.issued | 2023-08-15 | |
dc.identifier | http://hdl.handle.net/1992/69749 | |
dc.identifier | instname:Universidad de los Andes | |
dc.identifier | reponame:Repositorio Institucional Séneca | |
dc.identifier | repourl:https://repositorio.uniandes.edu.co/ | |
dc.identifier.uri | https://repositorioslatinoamericanos.uchile.cl/handle/2250/8728383 | |
dc.description.abstract | This work explains in detail how the reinforcement learning algorithms DQN and PPO operate. Additionally, the two algorithms are compared using the OpenAI Gym Retro framework by training a game agent based on each of them. | |
dc.language | spa | |
dc.publisher | Universidad de los Andes | |
dc.publisher | Ingeniería de Sistemas y Computación | |
dc.publisher | Facultad de Ingeniería | |
dc.publisher | Departamento de Ingeniería de Sistemas y Computación | |
dc.relation | Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237-285. | |
dc.relation | Gym.openai.com. (2016). Gym: A toolkit for developing and comparing reinforcement learning algorithms. [online] Available at: <https://gym.openai.com/docs/> [Accessed 23 December 2021]. | |
dc.relation | Pfau, V., Nichol, A., Hesse, C., Schiavo, L., Schulman, J., & Klimov, O. (2018). Gym Retro. [online] OpenAI. Available at: <https://openai.com/blog/gym-retro/> [Accessed 25 December 2021]. | |
dc.relation | Lipovetzky, N., & Sardina, S. (2018). Pacman capture the flag in AI courses. IEEE Transactions on Games, 11(3), 296-299. | |
dc.relation | Nichol, A., Pfau, V., Hesse, C., Klimov, O., & Schulman, J. (2018). Gotta learn fast: A new benchmark for generalization in RL. arXiv preprint arXiv:1804.03720. | |
dc.relation | Alemán de León, C. D. Agente Sonic. Deep Reinforcement Learning. | |
dc.relation | LeBlanc, D. G., & Lee, G. General Deep Reinforcement Learning in NES Games. | |
dc.relation | Bellman, R. E. (1957). Dynamic Programming. Princeton University Press, Princeton, NJ. Republished 2003: Dover. | |
dc.relation | Dusparic, I., & Cardozo, N. (2021). ISIS 4222 RL Markov Decision Processes. | |
dc.relation | Dusparic, I., & Cardozo, N. (2021). ISIS 4222 RL Q-learning. | |
dc.relation | Watkins, C. J. C. H. (1989). Learning from Delayed Rewards. PhD thesis, Cambridge University, Cambridge, England. | |
dc.relation | Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction. MIT Press. | |
dc.relation | Odemakinde, E. (2022). Model-Based and Model-Free Reinforcement Learning: Pytennis Case Study - neptune.ai. [online] neptune.ai. Available at: <https://neptune.ai/blog/model-based-and-model-free-reinforcement-learning-pytennis-case-study> [Accessed 8 June 2022]. | |
dc.relation | Spinningup.openai.com. (2018). Part 2: Kinds of RL Algorithms - Spinning Up documentation. [online] Available at: <https://spinningup.openai.com/en/latest/spinningup/rl_intro2.html> [Accessed 7 June 2022]. | |
dc.relation | Watkins, C. J. C. H., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3-4), 279-292. | |
dc.relation | Lin, L.-J. (1992). Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 8(3-4), 293-321. | |
dc.relation | Fedus, W., Ramachandran, P., Agarwal, R., Bengio, Y., Larochelle, H., Rowland, M., & Dabney, W. (2020, November). Revisiting fundamentals of experience replay. In International Conference on Machine Learning (pp. 3061-3071). PMLR. | |
dc.relation | Li, Y. (2017). Deep reinforcement learning: An overview. arXiv preprint arXiv:1701.07274. | |
dc.relation | Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602. | |
dc.relation | Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347. | |
dc.relation | Schulman, J., Levine, S., Abbeel, P., Jordan, M., & Moritz, P. (2015, June). Trust region policy optimization. In International Conference on Machine Learning (pp. 1889-1897). PMLR. | |
dc.relation | Estes, R. (2020). Rjalnev - DQN [GitHub repository]. Available at: <https://github.com/rjalnev> [Accessed 15 December 2021]. | |
dc.rights | Attribution 4.0 International | |
dc.rights | http://creativecommons.org/licenses/by/4.0/ | |
dc.rights | info:eu-repo/semantics/openAccess | |
dc.rights | http://purl.org/coar/access_right/c_abf2 | |
dc.title | Analysis of two Reinforcement Learning algorithms applied to OpenAI Gym Retro | |
dc.type | Undergraduate thesis | |