dc.contributorSenger, Hermes
dc.contributorhttp://lattes.cnpq.br/3691742159298316
dc.contributorhttp://lattes.cnpq.br/8688712033943534
dc.creatorCastro, Marcelo Rodrigo de
dc.date.accessioned2017-09-25T17:05:03Z
dc.date.available2017-09-25T17:05:03Z
dc.date.created2017-09-25T17:05:03Z
dc.date.issued2017-02-13
dc.identifierCASTRO, Marcelo Rodrigo de. SparkBLAST : utilização da ferramenta Apache Spark para a execução do BLAST em ambiente distribuído e escalável. 2017. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, São Carlos, 2017. Disponível em: https://repositorio.ufscar.br/handle/ufscar/9114.
dc.identifierhttps://repositorio.ufscar.br/handle/ufscar/9114
dc.description.abstractWith the evolution of next-generation sequencing devices, the cost of obtaining genomic data has dropped significantly. As sequencing costs have fallen, the amount of genomic data to be processed has increased exponentially. This data growth outpaces the rate at which computing power increases year after year through hardware and software evolution. Thus, the higher rate of data growth in bioinformatics raises the need to exploit more efficient and scalable techniques based on parallel and distributed processing, on platforms such as clusters and cloud computing. BLAST is a widely used tool for genomic sequence alignment, with native support for multicore-based parallel processing. However, its scalability is limited to a single machine. Cloud computing, on the other hand, has emerged as an important technology for supporting rapid and elastic provisioning of large amounts of resources. Current frameworks such as Apache Hadoop and Apache Spark support the execution of distributed applications. Such environments provide mechanisms for embedding external applications in order to compose large distributed jobs that can be executed on clusters and cloud platforms. In this work, we used Spark to support the highly scalable and efficient parallelization of BLAST (Basic Local Alignment Search Tool) to execute on dozens to hundreds of processing cores on a cloud platform. As a result, our prototype demonstrated better performance and scalability than CloudBLAST, a Hadoop-based parallelization of BLAST.
dc.languagepor
dc.publisherUniversidade Federal de São Carlos
dc.publisherUFSCar
dc.publisherPrograma de Pós-Graduação em Ciência da Computação - PPGCC
dc.publisherCâmpus São Carlos
dc.rightsOpen access
dc.subjectBLAST
dc.subjectApache Spark
dc.subjectNuvens computacionais
dc.subjectSequenciamento genético
dc.subjectCloud computing
dc.subjectGenetic sequencing
dc.subjectHadoop
dc.titleSparkBLAST : utilização da ferramenta Apache Spark para a execução do BLAST em ambiente distribuído e escalável
dc.typeThesis

