Journal articles
Open issues for partitioning clustering methods: an overview
Date
2014-05
Registered in:
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, Oxford, v.4, n.3, p.161-177, 2014
1942-4795
10.1002/widm.1127
Authors
Barioni, Maria Camila N.
Razente, Humberto
Marcelino, Alessandra M. R.
Traina, Agma Juci Machado
Traina Junior, Caetano
Institution
Abstract
Over the last decades, a great variety of data mining techniques have been developed to support Knowledge Discovery in Databases. Among them, cluster detection techniques are of major importance. Although these techniques have been explored extensively in the scientific literature, at least two important issues remain open: existing algorithms do not scale to large, high-dimensional datasets, and the unsupervised nature of traditional data clustering makes it very difficult to generate meaningful clusters. This article presents an overview of the strategies being explored to address these issues. It also describes a new semi-supervised clustering strategy that exemplifies the integration of several approaches and can be employed with partitioning algorithms such as PAM and CLARANS. The strategy improves these algorithms by using must-link feedback provided by users in an interactive, visual environment.