Journal articles
Silhouette + Attraction: A Simple and Effective Method for Text Clustering
Date
2015-08-14
Record in:
Errecalde, Marcelo L.; Cagnina, Leticia Cecilia; Rosso, Paolo; Silhouette + Attraction: A Simple and Effective Method for Text Clustering; Cambridge University Press; Natural Language Engineering; 1; 14-8-2015; 1-40
1351-3249
Author
Errecalde, Marcelo L.
Cagnina, Leticia Cecilia
Rosso, Paolo
Abstract
This article presents Sil-Att, a simple and effective method for text clustering based on two main concepts: the silhouette coefficient and the idea of attraction. Combining both principles yields a general technique that can be used either as a boosting method, which improves the results of other clustering algorithms, or as an independent clustering algorithm. The experimental work shows that Sil-Att obtains high-quality results on text corpora with very different characteristics. Furthermore, its stable performance on all the considered corpora indicates that it is a very robust method. This is a notable advantage of Sil-Att over the other algorithms used in the experiments, whose performance depends heavily on specific characteristics of the corpora considered.