Artículos de revistas
Improving the space cost of k-NN search in metric spaces by using distance estimators
Fecha
2009Registro en:
Multimed Tools Appl (2009) 41:215–233
DOI 10.1007/s11042-008-0226-z
Autor
Bustos Cárdenas, Benjamín
Navarro, Gonzalo
Institución
Resumen
Similarity searching in metric spaces has a vast number of applications in
several fields like multimedia databases, text retrieval, computational biology, and
pattern recognition. In this context, one of the most important similarity queries is
the k nearest neighbor (k-NN) search. The standard best-first k-NN algorithm uses
a lower bound on the distance to prune objects during the search. Although optimal
in several aspects, the disadvantage of this method is that its space requirements
for the priority queue that stores unprocessed clusters can be linear in the database
size. Most of the optimizations used in spatial access methods (for example, pruning
using MinMaxDist) cannot be applied in metric spaces, due to the lack of geometric
properties. We propose a new k-NN algorithm that uses distance estimators, aiming to
reduce the storage requirements of the search algorithm. The method stays optimal,
yet it can significantly prune the priority queue without altering the output of the
query. Experimental results with synthetic and real datasets confirm the reduction
in storage space of our proposed algorithm, showing savings of up to 80% of the
original space requirement.