dc.contributorDorgival Olavo Guedes Neto
dc.contributorJussara Marques de Almeida
dc.contributorWagner Meira Junior
dc.creatorVinicius Vitor dos Santos Dias
dc.date.accessioned2019-08-12T00:27:18Z
dc.date.accessioned2022-10-03T22:14:08Z
dc.date.available2019-08-12T00:27:18Z
dc.date.available2022-10-03T22:14:08Z
dc.date.created2019-08-12T00:27:18Z
dc.date.issued2016-12-07
dc.identifierhttp://hdl.handle.net/1843/ESBF-AKUNB8
dc.identifier.urihttp://repositorioslatinoamericanos.uchile.cl/handle/2250/3796660
dc.description.abstractThe increasing amount of data being stored and the variety of algorithms proposed to meet the processing demands of data scientists have led to a new generation of computational environments and paradigms. These environments facilitate programming through high-level abstractions; however, achieving ideal performance remains a challenge. In this work we investigate important factors affecting the performance of common big-data applications, taking the Spark framework as the target for our contributions. In particular, we organize our analysis methodology around diagnosis dimensions, which allow the identification of uncommon scenarios that provide valuable information about the environments' limitations and possible actions to mitigate the issues. First, we validate our observations by showing the potential that manual adjustments have for improving application performance. Finally, we apply the lessons learned from these findings through the design and implementation of an extensible tool that automates the reconfiguration of Spark applications. Our tool takes logs from previous executions as input, enforces configurable adjustment policies over the collected statistics, and makes its decisions taking into account communication behaviors specific to the application being evaluated. To accomplish this, the tool identifies global parameters that should be updated, or points in the user program where the data partitioning can be adjusted, based on those policies. Our results show gains of up to 1.9 in the scenarios considered.
dc.publisherUniversidade Federal de Minas Gerais
dc.publisherUFMG
dc.rightsOpen Access
dc.subjectperformance diagnosis
dc.subjecttool
dc.subjectdynamic reconfiguration
dc.subjectbig data
dc.subjectload balancing
dc.titleDiagnóstico de desempenho e reconfiguração dinâmica em processamento de dados massivos [Performance diagnosis and dynamic reconfiguration in big-data processing]
dc.typeMaster's Dissertation

