Other
Fast Most Similar Neighbor (MSN) classifiers for Mixed Data
Fecha
2010-09-30Registro en:
Revista Computación y Sistemas; Vol. 14 No.1
1405-5546
Autor
Hernández Rodríguez, Selene
Institución
Resumen
Abstract. The k nearest neighbor (k-NN) classifier has been extensively used in Pattern Recognition because of its simplicity and its good performance. However, in large datasets applications, the exhaustive k-NN classifier becomes impractical. Therefore, many fast k-NN classifiers have been developed; most of them rely on metric properties (usually the triangle inequality) to reduce the number of prototype comparisons. Hence, the existing fast k-NN classifiers are applicable only when the comparison function is a metric (commonly for numerical data). However, in some sciences such as Medicine, Geology, Sociology, etc., the prototypes are usually described by qualitative and quantitative features (mixed data). In these cases, the comparison function does not necessarily satisfy metric properties. For this reason, it is important to develop fast k most similar neighbor (k-MSN) classifiers for mixed data, which use non metric comparisons functions. In this thesis, four fast k-MSN classifiers, following the most successful approaches, are proposed. The experiments over different datasets show that the proposed classifiers significantly reduce the number of prototype comparisons.