bachelorThesis
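Implementação do paradigma MapReduce por meio do Hadoop integrado ao framework Hive: um estudo prático [Implementation of the MapReduce paradigm using Hadoop integrated with the Hive framework: a practical study]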
Date
2016-05-24
Registered in:
SILVA, Guilherme Santiago Ribeiro; URBAN, Lincoln Moro. Implementação do paradigma MapReduce por meio do Hadoop integrado ao framework Hive: um estudo prático. 2016. 78 f. Trabalho de Conclusão de Curso (Graduação) - Universidade Tecnológica Federal do Paraná, Ponta Grossa, 2016.
Authors
Silva, Guilherme Santiago Ribeiro
Urban, Lincoln Moro
Abstract
With the advancement of technology and the constant creation of new applications, many companies face a crucial issue in the IT (Information Technology) services segment: the storage and handling of large volumes of data. Companies such as Facebook, Twitter, and Google, among others, have their technologies and innovations guided by a new concept called Big Data. This trend allows the development of solutions that can meet market demand, considering that Relational Database Management Systems, although still widely used, encounter problems with performance, scalability, and the processing of large databases. One of the most widely used concepts in the Big Data context is the MapReduce paradigm. It was developed by Google and is based on distributing and processing data across a set of computers (a cluster) interconnected over a network, thus enabling greater flexibility in handling such data. Several technologies were created to implement the concepts of the MapReduce paradigm; one of them is Hadoop, which has modules that manage and distribute databases across multiple machines. This work proposes a practical implementation of the MapReduce paradigm through Hadoop in a virtualized environment. To that end, an experimental environment was used, composed of virtualization technologies and benchmark techniques that simulate analytical workloads on synthetic databases. The results, in turn, point to an analysis of the execution time of the queries submitted to this environment and also serve as a basis for future work and related research.