dc.contributorBustamante Bello, Martín Rogelio
dc.contributorSchool of Engineering and Sciences
dc.contributorNavarro Durán, David
dc.contributorGaluzzi Aguilera, Renato
dc.contributorCampus Ciudad de México
dc.contributorpuemcuervo
dc.creatorBUSTAMANTE BELLO, MARTIN ROGELIO; 58810
dc.creatorNakasone Nakamurakari, Shun Mauricio
dc.date.accessioned2023-06-23T15:06:58Z
dc.date.accessioned2023-07-19T19:56:08Z
dc.date.available2023-06-23T15:06:58Z
dc.date.available2023-07-19T19:56:08Z
dc.date.created2023-06-23T15:06:58Z
dc.date.issued2022-06-15
dc.identifierNakasone Nakamurakari, S. M. (2022), Reinforcement learning for an attitude control algorithm for racing quadcopters [Unpublished master's thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. Retrieved from: https://hdl.handle.net/11285/650935
dc.identifierhttps://hdl.handle.net/11285/650935
dc.identifierhttps://orcid.org/0000-0002-2660-8378
dc.identifierCVU 1080299
dc.identifier.urihttps://repositorioslatinoamericanos.uchile.cl/handle/2250/7716398
dc.description.abstractFrom their first conception to their wide commercial distribution, Unmanned Aerial Vehicles (UAVs) have always presented an interesting control problem, as their dynamics are difficult to model and exhibit non-linear behavior. These vehicles have improved as their underlying technology has developed, reaching commercial and leisure use in everyday life. Among the many applications for these vehicles, one that has been rising in popularity is drone racing. As technology improves, racing quadcopters have reached capabilities never before seen in flying vehicles. Although hardware and performance have improved throughout the drone racing industry, better and more robust control algorithms have lagged behind. In this thesis, a new control strategy based on Reinforcement Learning (RL) is presented to achieve better attitude-control performance for racing quadcopters. For this process, two different plants were developed: a) a simplified dynamics model to meet the needs of the training process, and b) a higher-fidelity multibody model to validate the resulting controller. Using Proximal Policy Optimization (PPO), the agent is trained through a reward function and interaction with the environment. This dissertation presents a different approach to determining a reward function such that the trained agent learns more effectively and faster. The control algorithm obtained from the training process is simulated and tested against the most common attitude control algorithm used in drone races, Proportional Integral Derivative (PID) control, and is also evaluated on its ability to reject noise in the state signals and external disturbances from the environment. Results from agents trained with and without these disturbances are also presented.
The resulting control policies were comparable to the PID controller and even outperformed it in noise rejection and robustness to external disturbances.
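The abstract names Proportional Integral Derivative (PID) control as the baseline attitude controller. The following is a minimal sketch of such a single-axis discrete PID loop; the gains, the time step, and the toy first-order plant are illustrative assumptions, not values taken from the thesis.

```python
# Hypothetical single-axis PID attitude-control sketch. Gains (kp, ki, kd),
# the time step dt, and the simplistic plant model are illustrative only.

class PID:
    """Discrete PID controller for one attitude axis (e.g. roll)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        # Standard PID law: u = kp*e + ki*integral(e) + kd*de/dt
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


# Drive a toy plant (angle rate proportional to the command) to a setpoint.
pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.01)
angle = 0.0
for _ in range(2000):          # 20 s of simulated time
    u = pid.step(setpoint=1.0, measurement=angle)
    angle += u * 0.01          # integrate the commanded rate
```

A full quadcopter attitude controller would run one such loop per axis (roll, pitch, yaw) on the real vehicle dynamics; this sketch only illustrates the control law the RL policy is compared against.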
dc.languageeng
dc.publisherInstituto Tecnológico y de Estudios Superiores de Monterrey
dc.relationacceptedVersion
dc.relationREPOSITORIO NACIONAL CONACYT
dc.rightshttp://creativecommons.org/licenses/by/4.0
dc.rightsopenAccess
dc.titleReinforcement learning for an attitude control algorithm for racing quadcopters
dc.typeTesis de Maestría / Master's thesis