dc.contributor: Costa, José Alfredo Ferreira
dc.contributor: http://lattes.cnpq.br/7375286161719016
dc.contributor: Martins, Allan de Medeiros
dc.contributor: Canuto, Anne Magaly de Paula
dc.contributor: Barreto, Guilherme de Alencar
dc.contributor: Adeodato, Paulo Jorge Leitão
dc.creator: Gorgônio, Flavius da Luz e
dc.date.accessioned: 2020-03-26T18:01:55Z
dc.date.accessioned: 2022-10-06T12:31:54Z
dc.date.available: 2020-03-26T18:01:55Z
dc.date.available: 2022-10-06T12:31:54Z
dc.date.created: 2020-03-26T18:01:55Z
dc.date.issued: 2009-03-06
dc.identifier: GORGÔNIO, Flavius da Luz e. Uma arquitetura para análise de agrupamentos sobre bases de dados distribuídas. 2009. 156f. Tese (Doutorado em Engenharia Elétrica e Computação) - Centro de Tecnologia, Universidade Federal do Rio Grande do Norte, Natal, 2009.
dc.identifier: https://repositorio.ufrn.br/jspui/handle/123456789/28672
dc.identifier.uri: http://repositorioslatinoamericanos.uchile.cl/handle/2250/3954176
dc.description.abstract: Data mining can be defined as a set of techniques for extracting knowledge and discovering useful, previously unknown patterns in large multidimensional databases. Clustering is the process of discovering groups of similar data within high-dimensional databases with minimal prior knowledge of their structure. Distributed data clustering is a recent approach to dealing with distributed databases, since traditional clustering algorithms require all the data to be centralized in a single dataset. Moreover, current privacy requirements in distributed databases demand algorithms that can perform clustering securely. The growing need for methods to mine data stored in a distributed way has therefore motivated the development of algorithms that analyze each database separately and combine the partial results into a final result. This thesis presents a framework for cluster analysis in distributed databases using traditional algorithms, such as K-means and self-organizing maps. This approach significantly reduces the amount of data transferred between the remote units and the central unit. The framework includes a strategy, based on vector quantization, that extracts a subset of representatives in order to obtain partial views of the clusters that exist in each horizontal and/or vertical partition of the database. The representatives from each local unit are then sent to the central unit, which combines the partial results by applying a clustering algorithm to all the representative subsets. Experimental results with different datasets show that the proposed framework obtains results very close to, and with effectiveness comparable to, those of conventional data mining techniques, in which all the databases are transferred to a central unit in the pre-processing stage.
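The two-stage flow described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (sq_dist, kmeans, cluster_distributed) and the toy data are hypothetical, only horizontal partitions are covered, representatives are combined unweighted, and a deterministic farthest-first initialization is assumed here so that the example is reproducible.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50):
    """Lloyd's K-means with farthest-first initialization (an assumption of this sketch)."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from every centroid chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        # Move each centroid to the mean of its group (unchanged if the group is empty).
        new = [
            [sum(coord) / len(g) for coord in zip(*g)] if g else centroids[j]
            for j, g in enumerate(groups)
        ]
        if new == centroids:
            break
        centroids = new
    return centroids


def cluster_distributed(local_datasets, n_representatives, n_clusters):
    """Quantize each local dataset, then cluster only the representatives centrally."""
    representatives = []
    for data in local_datasets:
        # Local step: each remote unit reduces its data to a few representatives.
        representatives.extend(kmeans(data, n_representatives))
    # Central step: only the representatives are transferred and clustered.
    return kmeans(representatives, n_clusters), len(representatives)


# Hypothetical toy data: two remote units, each holding rows from the same two groups.
site_a = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
site_b = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (9.5, 9.5), (10.5, 9.5), (9.5, 10.5)]

centroids, n_sent = cluster_distributed([site_a, site_b], n_representatives=2, n_clusters=2)
print(n_sent, "representatives sent instead of", len(site_a) + len(site_b), "rows")
print(centroids)
```

In this toy run the central unit receives 4 representatives rather than 14 rows, which is the data-transfer reduction the abstract refers to. Because the representatives are averaged without weights, the final centroids only approximate those of a centralized K-means.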
dc.publisher: Brasil
dc.publisher: PROGRAMA DE PÓS-GRADUAÇÃO EM ENGENHARIA ELÉTRICA E COMPUTAÇÃO
dc.rights: http://creativecommons.org/licenses/by-nc-nd/3.0/br/
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Brazil
dc.subject: Distributed cluster analysis
dc.subject: Cluster ensembles
dc.subject: K-means
dc.subject: Self-organizing maps
dc.title: Uma arquitetura para análise de agrupamentos sobre bases de dados distribuídas
dc.type: doctoralThesis

