Uma arquitetura para análise de agrupamentos sobre bases de dados distribuídas

Gorgônio, Flavius da Luz e

dc.contributor	Costa, José Alfredo Ferreira
dc.contributor	http://lattes.cnpq.br/7375286161719016
dc.contributor	Martins, Allan de Medeiros
dc.contributor	Canuto, Anne Magaly de Paula
dc.contributor	Barreto, Guilherme de Alencar
dc.contributor	Adeodato, Paulo Jorge Leitão
dc.creator	Gorgônio, Flavius da Luz e
dc.date.accessioned	2020-03-26T18:01:55Z
dc.date.accessioned	2022-10-06T12:31:54Z
dc.date.available	2020-03-26T18:01:55Z
dc.date.available	2022-10-06T12:31:54Z
dc.date.created	2020-03-26T18:01:55Z
dc.date.issued	2009-03-06
dc.identifier	GORGÔNIO, Flavius da Luz e. Uma arquitetura para análise de agrupamentos sobre bases de dados distribuídas. 2009. 156f. Tese (Doutorado em Engenharia Elétrica e Computação) - Centro de Tecnologia, Universidade Federal do Rio Grande do Norte, Natal, 2009.
dc.identifier	https://repositorio.ufrn.br/jspui/handle/123456789/28672
dc.identifier.uri	http://repositorioslatinoamericanos.uchile.cl/handle/2250/3954176
dc.description.abstract	Data mining can be defined as a set of techniques for extracting knowledge and searching for useful, previously unknown patterns in large multidimensional databases. Clustering is the process of discovering groups of data within high-dimensional databases, based on similarities, with minimal knowledge of their structure. Distributed data clustering is a recent approach for dealing with distributed databases, since traditional clustering algorithms require centralizing all databases into a single dataset. Moreover, current privacy requirements in distributed databases demand algorithms able to perform clustering securely. Thus, the increasing need for methods to mine data stored in a distributed way has motivated the development of algorithms that analyze each database separately and combine the partial results into a final result. This thesis presents a framework for cluster analysis in distributed databases using traditional algorithms, such as K-means and self-organizing maps. This approach significantly reduces the amount of data transferred between the remote units and the central unit. The framework includes a strategy, based on vector quantization, that extracts a subset of representatives in order to obtain partial views of the clusters existing in each horizontal and/or vertical partition of the database. The representatives of each local unit are then sent to the central unit, which combines the partial results by applying a clustering algorithm over all representative subsets. Experimental results with different datasets show that the proposed framework obtains results very close to, and with effectiveness comparable to, those of conventional data mining techniques, in which all databases are transferred to a central unit in the pre-processing stage.
dc.publisher	Brasil
dc.publisher	PROGRAMA DE PÓS-GRADUAÇÃO EM ENGENHARIA ELÉTRICA E COMPUTAÇÃO
dc.rights	http://creativecommons.org/licenses/by-nc-nd/3.0/br/
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Brazil
dc.subject	Distributed cluster analysis
dc.subject	Cluster ensembles
dc.subject	K-means
dc.subject	Self-organizing maps
dc.title	Uma arquitetura para análise de agrupamentos sobre bases de dados distribuídas
dc.type	doctoralThesis
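
The two-stage scheme described in the abstract (local vector quantization at each remote unit, followed by clustering of the collected representatives at the central unit) can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes horizontal partitions only, uses a plain Lloyd's K-means both as the vector quantizer and as the central algorithm, and runs on synthetic two-cluster data. All function names, the number of representatives per unit and the seeds are hypothetical choices made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        # An empty group keeps its previous centroid.
        new = [mean(g) if g else centroids[j] for j, g in enumerate(groups)]
        if new == centroids:
            break
        centroids = new
    return centroids


def local_representatives(partition, n_reps, seed=0):
    """Remote unit: compress a local partition into n_reps codebook vectors.

    Assumption: K-means stands in for the vector quantization step.
    """
    return kmeans(partition, n_reps, seed=seed)


def central_clustering(all_reps, k, seed=0):
    """Central unit: cluster the representatives received from all units."""
    return kmeans(all_reps, k, seed=seed)


# Synthetic data (assumption): two well-separated 2-D clusters.
rng = random.Random(42)
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in [(0.0, 0.0), (10.0, 10.0)]
          for _ in range(100)]
rng.shuffle(points)

# Horizontal partitioning: each remote unit holds a disjoint subset of rows.
partitions = [points[i::3] for i in range(3)]

# Only the representatives leave each remote unit, not the raw records.
all_reps = []
for unit_id, part in enumerate(partitions):
    all_reps.extend(local_representatives(part, n_reps=8, seed=unit_id))

final_centroids = central_clustering(all_reps, k=2)
print(len(points), "records reduced to", len(all_reps), "representatives")
print("central centroids:", sorted(final_centroids))
```

In this sketch the central unit receives 8 representatives from each of the 3 units instead of the full set of records, which is the source of the reduction in transferred data mentioned in the abstract. Vertical partitions and self-organizing maps, both covered by the thesis, are outside the scope of the example.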

This item belongs to the following institution

Universidade Federal do Rio Grande do Norte (Brasil)