bachelorThesis
A study of visualization techniques regarding the use of labels in software repositories
Date
2019-11-29
Citation:
SAMPEDRO, Cláudia Lázara Poiet. Estudo sobre técnicas de visualização quanto ao uso de rótulos em repositórios de software. 2019. Trabalho de Conclusão de Curso (Bacharelado em Ciência da Computação) - Universidade Tecnológica Federal do Paraná, Campo Mourão, 2019.
Author
Sampedro, Cláudia Lázara Poiet
Abstract
Context: Visualization techniques are used to analyze large amounts of data because they enhance human cognitive ability during data exploration through graphical models and visual representations. Repository mining is another widely explored area, which transforms data collected from software repositories into useful information for decision making. By combining these two areas, it is possible to look for unidentified patterns in software projects. Objective: The purpose of this study is to use visualization techniques to analyze the use of labels in issues of projects hosted on social development platforms. Method: The method was organized in five steps: domain knowledge; data collection and preprocessing; pattern extraction and visualization, which generates the visualizations from the preprocessed data and analyzes them visually; postprocessing, which may restart the cycle in search of new patterns by using other techniques and/or parameter settings; and, finally, use of the knowledge obtained. Results: After analyzing the domain of open source projects, social software development platforms, and label-centric collaboration mechanisms, we chose to study labels and their use in issues on the GitHub platform, for the NextCloud repository. For collection and preprocessing, we used the GitHub REST API and scripts developed in Python and JavaScript. To characterize and analyze the use of labels, we used visualizations based on box plot, streamgraph, graph drawing, and Sankey diagram techniques. Using the knowledge obtained in the previous steps, we conclude that the analyzed project uses the labels feature and that this tends to increase the number of comments on issues, improving communication between developers. Issue lifetimes were shorter for issues without labels, which may indicate that these issues are quite simple and are therefore completed quickly.
The label co-occurrence graph shows that, in addition to several labels per issue, a large number of distinct label combinations are used. It was also noted that the project community tends to use more than one label per issue. In the Sankey diagram, it was possible to observe the relationship between labels, number of comments, issue lifetime, and the content discussed in comments, noting, for example, that issues with few comments are finished faster. Conclusions: Visualization techniques facilitate the identification of patterns and evidence regarding the questions established in this study about the use of labels in GitHub repositories, in particular the contribution of labels to the communication process among developers and their overall effect on issue closing time, on communication, and on issue resolution.
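
As an illustration of the kind of preprocessing the abstract describes, the following minimal Python sketch counts label co-occurrences and compares labeled with unlabeled issues. It is not the thesis's actual script, which this record does not include. It assumes issue records shaped like the objects returned by the GitHub REST API (`labels`, `comments`, `created_at`, `closed_at`); the function names and the sample data are hypothetical and invented for the example.

```python
from collections import Counter
from datetime import datetime
from itertools import combinations
from statistics import median


def parse_ts(ts):
    """Parse a UTC timestamp such as '2019-01-02T03:04:05Z'."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def label_names(issue):
    """Sorted, de-duplicated label names attached to one issue."""
    return sorted({label["name"] for label in issue.get("labels", [])})


def cooccurrence(issues):
    """Count how often each unordered pair of labels shares an issue."""
    pairs = Counter()
    for issue in issues:
        pairs.update(combinations(label_names(issue), 2))
    return pairs


def lifetime_days(issue):
    """Days between opening and closing; None while the issue is open."""
    if not issue.get("closed_at"):
        return None
    delta = parse_ts(issue["closed_at"]) - parse_ts(issue["created_at"])
    return delta.total_seconds() / 86400


def median_by_label_presence(issues, metric):
    """Median of metric(issue) for labeled and for unlabeled issues."""
    groups = {"labeled": [], "unlabeled": []}
    for issue in issues:
        value = metric(issue)
        if value is None:
            continue  # e.g. lifetime of an issue that is still open
        key = "labeled" if issue.get("labels") else "unlabeled"
        groups[key].append(value)
    return {k: (median(v) if v else None) for k, v in groups.items()}


# Invented sample records, NOT data from the NextCloud repository.
SAMPLE_ISSUES = [
    {"labels": [{"name": "bug"}, {"name": "high"}], "comments": 8,
     "created_at": "2019-01-01T00:00:00Z", "closed_at": "2019-01-11T00:00:00Z"},
    {"labels": [{"name": "bug"}, {"name": "high"}, {"name": "design"}], "comments": 5,
     "created_at": "2019-02-01T00:00:00Z", "closed_at": "2019-02-05T00:00:00Z"},
    {"labels": [], "comments": 1,
     "created_at": "2019-03-01T00:00:00Z", "closed_at": "2019-03-02T00:00:00Z"},
    {"labels": [{"name": "enhancement"}], "comments": 3,
     "created_at": "2019-03-01T00:00:00Z", "closed_at": None},
]

if __name__ == "__main__":
    print(cooccurrence(SAMPLE_ISSUES).most_common(3))
    print(median_by_label_presence(SAMPLE_ISSUES, lifetime_days))
    print(median_by_label_presence(SAMPLE_ISSUES, lambda i: i["comments"]))
```

The pair counts could serve as edge weights in a co-occurrence graph, and the two medians correspond to the labeled versus unlabeled comparison that a box plot would display. Using the median rather than the mean is a choice made for this sketch, since issue lifetimes tend to be skewed.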