Diagnóstico de desempenho e reconfiguração dinâmica em processamento de dados massivos (Performance diagnosis and dynamic reconfiguration in massive data processing)

Vinicius Vitor dos Santos Dias

dc.contributor	Dorgival Olavo Guedes Neto
dc.contributor	Jussara Marques de Almeida
dc.contributor	Wagner Meira Junior
dc.creator	Vinicius Vitor dos Santos Dias
dc.date.accessioned	2019-08-12T00:27:18Z
dc.date.accessioned	2022-10-03T22:14:08Z
dc.date.available	2019-08-12T00:27:18Z
dc.date.available	2022-10-03T22:14:08Z
dc.date.created	2019-08-12T00:27:18Z
dc.date.issued	2016-12-07
dc.identifier	http://hdl.handle.net/1843/ESBF-AKUNB8
dc.identifier.uri	http://repositorioslatinoamericanos.uchile.cl/handle/2250/3796660
dc.description.abstract	The increasing amount of data being stored and the variety of algorithms proposed to meet the processing demands of data scientists have led to a new generation of computational environments and paradigms. These environments make programming easier through high-level abstractions; however, achieving ideal performance remains a challenge. In this work we investigate important factors in the performance of common big-data applications, taking the Spark framework as the target for our contributions. In particular, we organize our analysis methodology around diagnosis dimensions, which allow us to identify uncommon scenarios that provide valuable information about the environment's limitations and about possible actions to mitigate the issues. First, we validate our observations by showing the potential that manual adjustments have for improving application performance. Finally, we apply the lessons learned from those findings in the design and implementation of an extensible tool that automates the reconfiguration of Spark applications. Our tool takes logs from previous executions as input, enforces configurable adjustment policies over the collected statistics, and makes its decisions taking into account communication behaviors specific to the application being evaluated. To accomplish that, the tool identifies global parameters that should be updated, or points in the user program where the data partitioning can be adjusted, based on those policies. Our results show gains of up to 1.9 in the scenarios considered.
dc.publisher	Universidade Federal de Minas Gerais
dc.publisher	UFMG
dc.rights	Open Access
dc.subject	performance diagnosis
dc.subject	tool
dc.subject	dynamic reconfiguration
dc.subject	massive data
dc.subject	load balancing
dc.title	Diagnóstico de desempenho e reconfiguração dinâmica em processamento de dados massivos
dc.type	Master's Dissertation

This item belongs to the following institution

Universidade Federal de Minas Gerais (Brazil)