Apache Kafka: implementação da técnica de replicação de banco de dados baseada em Middleware para o contexto de raspagem de dados

Benedito Neto, Manoel

bachelorThesis

Date

2022-07-26

Registered in:

BENEDITO NETO, Manoel. Apache Kafka: implementação da técnica de replicação de banco de dados baseada em Middleware para o contexto de raspagem de dados. 2022. 55f. Trabalho de Conclusão de Curso (Graduação em Engenharia de Computação) - Centro de Tecnologia, Universidade Federal do Rio Grande do Norte, Natal, 2022.

https://repositorio.ufrn.br/handle/123456789/48850

http://repositorioslatinoamericanos.uchile.cl/handle/2250/3942770

Author

Benedito Neto, Manoel

Institution

Universidade Federal do Rio Grande do Norte (Brazil)

Abstract

The demand for stable and highly available databases in the age of information and distributed computing is increasingly urgent. The Covid-19 'Data Blackout' of December 2021, which affected DataSUS systems, is an alarming case that could have been mitigated by database replication techniques. These techniques seek to improve consistency, performance and availability through a service architecture capable of fully copying the data held in a database. The general objective of this work is to implement the Middleware-based database replication technique, using the Apache Kafka tool to mediate the exchange of information between a database and its replica in the context of a data scraping application. The data are stored in a PostgreSQL database by a Python application that scrapes meteorological data on fire outbreaks, publicly provided by the National Institute for Space Research (INPE) through an Application Programming Interface (API). Service virtualization concepts were used to instantiate the data scraping application, the database service and a Database Management System (DBMS), the Apache Kafka service architecture, and a control panel for visualizing its performance. It was concluded that the applied methodology resulted in a consistent database replica for the data scraping system developed.
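The replication flow described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: to stay runnable without any services, it assumes an in-memory list as a stand-in for a Kafka topic and two SQLite databases as stand-ins for the PostgreSQL primary and its replica. The table name `fire_outbreaks`, its columns, the helper names and the sample records are all hypothetical.

```python
import json
import sqlite3


class InMemoryTopic:
    """Hypothetical stand-in for a Kafka topic: an ordered, append-only log."""

    def __init__(self):
        self._log = []

    def publish(self, message: str) -> int:
        """Append a message and return its offset in the log."""
        self._log.append(message)
        return len(self._log) - 1

    def read_from(self, offset: int):
        """Return (offset, message) pairs starting at the given offset."""
        return list(enumerate(self._log[offset:], start=offset))


# Hypothetical schema; SQLite stands in for PostgreSQL here.
SCHEMA = (
    "CREATE TABLE fire_outbreaks "
    "(id INTEGER PRIMARY KEY, city TEXT, detected_at TEXT)"
)
UPSERT = (
    "INSERT OR REPLACE INTO fire_outbreaks (id, city, detected_at) "
    "VALUES (:id, :city, :detected_at)"
)


def write_to_primary(primary, topic, record):
    """Store a scraped record in the primary, then publish the change."""
    primary.execute(UPSERT, record)
    primary.commit()
    topic.publish(json.dumps(record))


def apply_to_replica(replica, topic, next_offset):
    """Apply every unread change to the replica; return the new offset."""
    for offset, message in topic.read_from(next_offset):
        replica.execute(UPSERT, json.loads(message))
        next_offset = offset + 1
    replica.commit()
    return next_offset


def snapshot(conn):
    """Return all rows ordered by id, for comparing primary and replica."""
    query = "SELECT id, city, detected_at FROM fire_outbreaks ORDER BY id"
    return conn.execute(query).fetchall()


primary = sqlite3.connect(":memory:")
replica = sqlite3.connect(":memory:")
primary.execute(SCHEMA)
replica.execute(SCHEMA)
topic = InMemoryTopic()

# Made-up sample records, standing in for scraped API results.
records = [
    {"id": 1, "city": "Natal", "detected_at": "2022-07-01T12:00:00"},
    {"id": 2, "city": "Mossoró", "detected_at": "2022-07-01T13:30:00"},
]
for record in records:
    write_to_primary(primary, topic, record)

consumer_offset = apply_to_replica(replica, topic, 0)

# Replaying the log from the start leaves the replica unchanged,
# because each change is applied as an upsert keyed on id.
apply_to_replica(replica, topic, 0)
```

In this sketch the replica only ever sees changes through the log, which is the role the abstract assigns to the middleware. A real deployment would replace `InMemoryTopic` with a Kafka producer and consumer, and would also need to handle the case where the primary write succeeds but the publish does not.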

Subjects

Apache Kafka

Sistemas distribuídos

Replicação de banco de dados

Raspagem de dados

Virtualização de serviços

Distributed systems

Database replication

Data scraping

Service virtualization
