Near neighbor searching with K nearest references

Chávez, E.; Graff, M.; Navarro, Gonzalo; Téllez, E. S.

dc.creator	Chávez, E.
dc.creator	Graff, M.
dc.creator	Navarro, Gonzalo
dc.creator	Téllez, E. S.
dc.date.accessioned	2015-08-18T19:08:31Z
dc.date.available	2015-08-18T19:08:31Z
dc.date.created	2015-08-18T19:08:31Z
dc.date.issued	2015
dc.identifier	Information Systems51(2015)43–61
dc.identifier	0306-4379
dc.identifier	DOI: 10.1016/j.is.2015.02.001
dc.identifier	https://repositorio.uchile.cl/handle/2250/132859
dc.description.abstract	Proximity searching is the problem of retrieving,from a given database, those objects closest to a query.To avoid exhaustive searching,data structures called indexes are builton the database prior to serving queries.The curse of dimensionality is a well-known problem for indexes:in spaces with sufficiently concentrated distance histograms,no index out performs an exhaustives can of the database. In recent years,a number of index es for approximate proximity searching have been proposed. These are able to cope with the curse of dimensionality inexchangefor returning an answer that might beslightly different from the correctone. In this paper we show that many of those recent indexes can be understood as variants of a simple general model basedon K-nearest reference signatures.A set o freferences is chosen from the database,and the signature of each object consists of the K references nearest to the object. At query time,the signature of the query is computed and the search examines only the objects whose signature is close enough to that of the query. Many known and novelindexes are obtained by considering different ways to determine how much detail the signature records(e.g.,justthesetofnearestreferences, or also their proximity order to the object, oral so their distances to the object,and soon), how the similarity between signatures is defined,and how the parameters are tuned.In addition,we introduce a space-efficient representation for those families of indexes, making it possible to search very large databases in main memory.Small indexes are cache friendly,inducing faster queries. We perform exhaustive experiments comparing several known and new indexes that derive from our framework,evaluatingtheir time performance,memory usage,and quality of approximation.The best indexes out perform the state of the art,offering an attractive balance between all these aspects,and turn out to be excellent choices in many scenarios.Our framework gives high flexibility to design new indexes.
dc.language	en
dc.publisher	Elsevier
dc.rights	http://creativecommons.org/licenses/by-nc-nd/3.0/cl/
dc.rights	Atribución-NoComercial-SinDerivadas 3.0 Chile
dc.subject	Proximity search
dc.subject	Searching bycontentin multimedia databases
dc.subject	k nearest neighbors
dc.subject	Indexing metric spaces
dc.title	Near neighbor searching with K nearest references
dc.type	Artículo de revista

Este ítem pertenece a la siguiente institución

Universidad de Chile