Journal articles
Discovering knowledge from data clustering using automatically defined interval type-2 fuzzy predicates
Date
2017-02
Registered in:
Comas, Diego Sebastián; Meschino, Gustavo Javier; Nowé, Ann; Ballarin, Virginia Laura; Discovering knowledge from data clustering using automatically defined interval type-2 fuzzy predicates; Pergamon-Elsevier Science Ltd.; Expert Systems with Applications; 68; 1; 2-2017; 136-150
0957-4174
CONICET Digital
CONICET
Authors
Comas, Diego Sebastián
Meschino, Gustavo Javier
Nowé, Ann
Ballarin, Virginia Laura
Abstract
In data clustering, fuzzy predicates act as cluster descriptors, providing linguistically expressed knowledge that indicates how features are related to each cluster. Fuzzy predicates obtained directly and automatically from data enable the discovery of knowledge inside clusters, even when there is no prior information about the clustering problem. In this work, a new method for the automatic discovery of interval type-2 fuzzy predicates in data clustering is proposed, called Type-2 Data-based Fuzzy Predicate Clustering (T2-DFPC). In a first stage, a data analysis is performed by making a random partition of the original data and running a clustering scheme that automatically determines a suitable number of clusters. From the results of this stage, interval type-2 fuzzy predicates are discovered. Results obtained on a diverse set of clustering datasets show that the T2-DFPC method was consistently among the best in terms of accuracy. The method preserves all the known advantages of interval type-2 fuzzy logic (FL) for dealing with vagueness: it quantifies the degree of truth of the fuzzy predicates and models the variability of the data inside the clusters. The proposed method is a fast, useful, general, and unsupervised approach to interpretable data clustering, with its knowledge-extraction capabilities being one of the main contributions. Linguistic expressions can be easily adapted to match the terminology of the field the data come from. The predicates are able to generalize the knowledge to new cases (new data), in the manner of an intelligent system. This new approach may be especially useful in contexts where, besides the clustering partition, summary information from the data is of interest.
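To make the idea of an interval type-2 fuzzy predicate more concrete, the following is a minimal Python sketch. It is not the T2-DFPC algorithm: the abstract does not specify the membership functions, the logic operators, or the discovery procedure, so everything here is an assumption made for illustration. In particular, the Gaussian membership functions with an uncertain width, the minimum operator for conjunction, the midpoint rule for choosing a cluster, and all names (`it2_membership`, `predicate_truth`, `assign_cluster`, `cluster_A`, `cluster_B`) are hypothetical.

```python
import math


def gaussian(x, mean, sigma):
    """Ordinary (type-1) Gaussian membership degree in [0, 1]."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def it2_membership(x, mean, sigma_lower, sigma_upper):
    """Interval type-2 membership as a (lower, upper) pair.

    Assumption: uncertainty is modelled by an uncertain width, with
    sigma_lower <= sigma_upper, so the narrow Gaussian bounds the wide
    one from below at every x.
    """
    return gaussian(x, mean, sigma_lower), gaussian(x, mean, sigma_upper)


def it2_and(intervals):
    """Conjunction of interval degrees, using min on each bound (assumed)."""
    lowers = [lo for lo, _ in intervals]
    uppers = [up for _, up in intervals]
    return min(lowers), min(uppers)


def predicate_truth(sample, predicate):
    """Interval degree of truth of 'feature_1 is M_1 AND feature_2 is M_2 ...'.

    `predicate` holds one (mean, sigma_lower, sigma_upper) triple per feature.
    """
    degrees = [it2_membership(x, *params) for x, params in zip(sample, predicate)]
    return it2_and(degrees)


def assign_cluster(sample, predicates):
    """Pick the cluster whose predicate has the largest interval midpoint."""
    truths = {name: predicate_truth(sample, p) for name, p in predicates.items()}
    best = max(truths, key=lambda name: 0.5 * sum(truths[name]))
    return best, truths


# Hypothetical predicates for two clusters over two features.
predicates = {
    "cluster_A": [(0.0, 0.8, 1.2), (0.0, 0.8, 1.2)],
    "cluster_B": [(5.0, 0.8, 1.2), (5.0, 0.8, 1.2)],
}
label, truths = assign_cluster((0.2, -0.1), predicates)
```

Here `truths` maps each cluster to an interval degree of truth rather than a single number; the width of that interval is one simple way to express the variability of the data inside a cluster. A new sample can be evaluated with the same predicates, which is the sense in which predicates generalize to new data.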