dc.contributor: William Robson Schwartz
dc.contributor: http://lattes.cnpq.br/0704592200063682
dc.contributor: Erickson Rangel do Nascimento
dc.contributor: Guillermo Camara Chávez
dc.contributor: Leandro Augusto Frata Fernandes
dc.contributor: Ricardo da Silva Torres
dc.creator: Igor Leonardo Oliveira Bastos
dc.date.accessioned: 2022-01-17T16:48:08Z
dc.date.accessioned: 2022-10-04T00:59:05Z
dc.date.available: 2022-01-17T16:48:08Z
dc.date.available: 2022-10-04T00:59:05Z
dc.date.created: 2022-01-17T16:48:08Z
dc.date.issued: 2020-06-12
dc.identifier: http://hdl.handle.net/1843/39110
dc.identifier: https://orcid.org/0000-0001-6998-4771
dc.identifier.uri: http://repositorioslatinoamericanos.uchile.cl/handle/2250/3838002
dc.description.abstract: Gesture recognition corresponds to the mathematical interpretation of human motion by a machine. It involves different aspects and parts of the human body, such as variations in the positioning of hands and arms, facial and body expressions, head positioning, and trunk posture. Since gesture recognition takes into account both appearance (of body parts, for example) and movement, it relies on the extraction of spatiotemporal information from videos and supports a wide range of applications. Consequently, many approaches address this topic, varying in the features and learning algorithms they employ. However, despite this wide range of approaches, gaps remain regarding aspects such as scalability (in terms of the number of gestures), the time required to incorporate new gestures, and operation on unsegmented videos, i.e., videos containing multiple gestures with no information about where each gesture starts and ends. This work therefore presents strategies to fill these gaps along two lines: (i) the creation of scalable models for incremental application on large databases; and (ii) the formulation of a model that detects and recognizes gestures concomitantly in unsegmented videos. For efficient performance on gesture videos, it is important to exploit the well-defined temporal structure of gestures, which implies an ordering of sub-events. To handle this ordering, we propose models that extract spatiotemporal information and also weigh this temporal structure, taking into account the contribution of previous inputs (earlier video snippets) when evaluating subsequent ones. In this way, our models correlate information from different parts of a video, producing richer gesture representations that support more accurate recognition.
Finally, to evaluate the proposed approach, we present the results obtained by applying the models described in this document. These results come from tests on widely used databases, using the evaluation metric standard for each. On ChaLearn LAP Isolated Gestures (ChaLearn IsoGD) and Sheffield Kinect Gestures (SKIG), the proposed method achieved 69.44% and 99.53% accuracy, respectively. On ChaLearn Multimodal Gesture Recognition (ChaLearn Montalbano) and ChaLearn Continuous Gestures (ChaLearn ConGD), it obtained Jaccard scores of 0.919 and 0.623, respectively. Comparisons with approaches from the literature evidence the good performance of the proposed methods, which rival the state of the art on all evaluated databases.
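The abstract's core idea, folding each video snippet into a running summary so that earlier snippets inform the evaluation of later ones, can be sketched with a minimal, untrained GRU recurrence over per-snippet feature vectors. All dimensions, parameters, and names below are illustrative assumptions for a sketch, not the thesis's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 snippets per video, 16-D snippet features,
# 32-D hidden state, 4 gesture classes.
T, D, H, C = 8, 16, 32, 4

# Randomly initialized GRU parameters (illustrative only, untrained).
Wz, Uz, bz = rng.standard_normal((H, D)) * 0.1, rng.standard_normal((H, H)) * 0.1, np.zeros(H)
Wr, Ur, br = rng.standard_normal((H, D)) * 0.1, rng.standard_normal((H, H)) * 0.1, np.zeros(H)
Wh, Uh, bh = rng.standard_normal((H, D)) * 0.1, rng.standard_normal((H, H)) * 0.1, np.zeros(H)
Wo, bo = rng.standard_normal((C, H)) * 0.1, np.zeros(C)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_classify(snippets):
    """Fold snippet features in temporal order; each hidden state mixes the
    current snippet with the accumulated summary of earlier snippets."""
    h = np.zeros(H)
    for x in snippets:                      # one feature vector per snippet
        z = sigmoid(Wz @ x + Uz @ h + bz)   # update gate: how much history to keep
        r = sigmoid(Wr @ x + Ur @ h + br)   # reset gate: how much history to read
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)
        h = (1 - z) * h + z * h_tilde
    logits = Wo @ h + bo
    p = np.exp(logits - logits.max())       # stable softmax over gesture classes
    return p / p.sum()

# One video = a sequence of T snippet feature vectors (random stand-ins here).
probs = gru_classify(rng.standard_normal((T, D)))
```

The gates make the temporal weighting explicit: at each snippet, the model decides how much of the accumulated history to retain and how much to read, which is one standard way to let earlier video parts shape the evaluation of later ones.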
dc.publisher: Universidade Federal de Minas Gerais
dc.publisher: Brasil
dc.publisher: ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO
dc.publisher: Programa de Pós-Graduação em Ciência da Computação
dc.publisher: UFMG
dc.rights: http://creativecommons.org/licenses/by-nc-sa/3.0/pt/
dc.rights: Acesso Aberto (Open Access)
dc.subject: Gesture recognition
dc.subject: Recurrent neural networks
dc.subject: Spatiotemporal information
dc.subject: Isolated gestures
dc.subject: Unsegmented videos
dc.title: Deep-based recurrent approaches for gesture recognition
dc.type: Tese (Thesis)

