Master's Thesis
Utilizando agrupamento com restrições e agrupamento espectral para integração de dados de enzimas
Date
2011-02-28
Author
Elisa Boari de Lima
Institution
Abstract
When multiple data sources are available for data mining, an a priori data integration process is usually required. This process may be costly and may not lead to good results, since important information is likely to be discarded. In this master's thesis, we propose constrained clustering and spectral clustering as strategies for integrating data sources without losing any information. The process consists of adding the complementary data sources as constraints that the clustering algorithms must satisfy, or using them to increase the similarity between pairs of objects for the spectral clustering algorithms.

As a concrete application of our approach, we focus on the problem of enzyme function prediction, which is a hard task usually performed by intensive experimental work. We use constrained and spectral clustering as a means of integrating information from diverse sources, and analyze how this additional information impacts clustering quality in an enzyme clustering application scenario. Our results show that the use of such additional information generally improves the clustering quality when compared to the results using only the main database.

Keywords: constrained clustering, data integration, enzyme clustering, spectral clustering.
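The two integration strategies named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the additive `boost` rule, the cap at 1.0, and the toy matrix are all assumptions made for illustration. The first function raises the similarity of object pairs that a complementary source links, producing a matrix that could then be passed to a spectral clustering routine; the second reports which must-link constraints a given clustering fails to satisfy.

```python
def boost_similarity(similarity, linked_pairs, boost=0.5, cap=1.0):
    """Return a copy of a symmetric similarity matrix in which every pair
    linked by a complementary data source has its similarity increased.

    The additive boost and the cap are illustrative assumptions.
    """
    n = len(similarity)
    combined = [row[:] for row in similarity]  # leave the input untouched
    for i, j in linked_pairs:
        if not (0 <= i < n and 0 <= j < n):
            raise IndexError(f"pair ({i}, {j}) is outside the matrix")
        if i == j:
            continue  # self-similarity is left as is
        value = min(cap, combined[i][j] + boost)
        combined[i][j] = value  # keep the matrix symmetric
        combined[j][i] = value
    return combined


def violated_must_links(labels, must_link):
    """List the must-link pairs that a clustering places in different clusters."""
    return [(i, j) for i, j in must_link if labels[i] != labels[j]]


# Hypothetical toy data: four objects, similarities from the main database.
similarity = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.4],
    [0.2, 0.1, 0.4, 1.0],
]
# A complementary source suggests that objects 2 and 3 are related.
linked_pairs = [(2, 3)]

combined = boost_similarity(similarity, linked_pairs, boost=0.5)
```

In this sketch the same pair list serves both roles: it can reshape the similarity matrix before spectral clustering, or be checked as a must-link constraint against the cluster labels an algorithm returns.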
Related items
Showing items related by title, author, or subject.
- Evolução da semissupervisão em detecção online de agrupamentos
  Silva, Guilherme Alves da
- Análise comparativa de técnicas avançadas de agrupamento
  Piantoni, Jane (Universidade Federal de São Carlos, UFSCar, Programa de Pós-Graduação em Ciência da Computação - PPGCC-So, Câmpus Sorocaba, 2016-01-29)
  The goal of this study is to investigate the characteristics of the new data clustering approaches, carrying out a comparative study of clustering techniques that combine or select multiple solutions, analyzing these latest ...