Journal articles
Towards improving cluster-based feature selection with a simplified silhouette filter
Date
2011
Registered in:
INFORMATION SCIENCES, v.181, n.18, p.3766-3782, 2011
0020-0255
10.1016/j.ins.2011.04.050
Authors
COVOES, Thiago F.
HRUSCHKA, Eduardo R.
Institution
Abstract
This paper proposes a filter-based algorithm for feature selection. The filter is based on partitioning the set of features into clusters. The number of clusters, and consequently the cardinality of the subset of selected features, is automatically estimated from the data. The computational complexity of the proposed algorithm is also investigated. A variant of this filter that considers feature-class correlations is also proposed for classification problems. Empirical results on ten datasets illustrate the performance of the developed algorithm, which in general obtained competitive classification accuracy when compared to state-of-the-art algorithms that find clusters of features. We show that, if computational efficiency is an important issue, then the proposed filter may be preferred over its counterparts, thus becoming eligible to join a pool of feature selection algorithms to be used in practice. As an additional contribution of this work, a theoretical framework is used to formally analyze some properties of feature selection methods that rely on finding clusters of features. (C) 2011 Elsevier Inc. All rights reserved.
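The general idea in the abstract (cluster the features, estimate the number of clusters from the data, keep one feature per cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the distance, the clustering procedure, or the exact criterion, so the choices below are assumptions made for illustration. In particular, the distance 1 - |Pearson correlation|, the plain alternating k-medoids loop, the medoid-based form of the simplified silhouette, the convention that singleton clusters score 0, and all function names and parameters (`feature_distance`, `k_medoids`, `simplified_silhouette`, `select_features`, `k_max`, `n_restarts`) are hypothetical.

```python
import numpy as np


def feature_distance(X):
    """Assumed distance between features (columns of X): 1 - |Pearson r|."""
    D = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(D, 0.0)
    return np.clip(D, 0.0, None)


def k_medoids(D, k, rng, n_iter=100):
    """Plain alternating k-medoids on a precomputed distance matrix D."""
    medoids = rng.choice(D.shape[0], size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        updated = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # New medoid: the member closest, in total, to its cluster.
                cost = D[np.ix_(members, members)].sum(axis=1)
                updated[c] = members[np.argmin(cost)]
        if np.array_equal(updated, medoids):
            break
        medoids = updated
    return medoids, np.argmin(D[:, medoids], axis=1)


def simplified_silhouette(D, medoids, labels):
    """Mean of (b - a) / max(a, b), where a is the distance to the own
    medoid and b the distance to the nearest other medoid (assumed form)."""
    to_medoids = D[:, medoids]
    rows = np.arange(D.shape[0])
    a = to_medoids[rows, labels]
    rest = to_medoids.copy()
    rest[rows, labels] = np.inf
    b = rest.min(axis=1)
    denom = np.maximum(a, b)
    s = np.zeros_like(a)
    ok = denom > 0
    s[ok] = (b[ok] - a[ok]) / denom[ok]
    # Assumed convention: a feature alone in its cluster scores 0.
    sizes = np.bincount(labels, minlength=len(medoids))
    s[sizes[labels] == 1] = 0.0
    return float(s.mean())


def select_features(X, k_max, n_restarts=20, seed=0):
    """Try k = 2..k_max, keep the partition with the best silhouette,
    and return the medoid features as the selected subset."""
    rng = np.random.default_rng(seed)
    D = feature_distance(X)
    best_score, best_medoids = -np.inf, None
    for k in range(2, k_max + 1):
        for _ in range(n_restarts):
            medoids, labels = k_medoids(D, k, rng)
            score = simplified_silhouette(D, medoids, labels)
            if score > best_score:
                best_score, best_medoids = score, medoids
    return sorted(int(m) for m in best_medoids)


# Synthetic data: 3 independent latent signals, each copied 3 times with noise,
# so columns 0-2, 3-5 and 6-8 form three groups of redundant features.
rng = np.random.default_rng(42)
latent = rng.normal(size=(500, 3))
X = np.repeat(latent, 3, axis=1) + 0.1 * rng.normal(size=(500, 9))
selected = select_features(X, k_max=6)
print(selected)
```

On data like this, where features fall into well-separated redundant groups, the sketch is expected to keep one representative per group. Because the score only needs distances from each feature to the k medoids rather than all pairwise distances, evaluating a candidate partition is cheap, which is in the spirit of the efficiency argument made in the abstract.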