info:eu-repo/semantics/article
A graph-based cache for large-scale similarity search engines
Date
2018-05
Record in:
Gil Costa, Graciela Verónica; Marin, Mauricio; Bonacic, Carolina; Solar, Roberto; A graph-based cache for large-scale similarity search engines; Springer; Journal of Supercomputing; 74; 5; 5-2018; 2006-2034
0920-8542
1573-0484
CONICET Digital
CONICET
Authors
Gil Costa, Graciela Verónica
Marin, Mauricio
Bonacic, Carolina
Solar, Roberto
Abstract
Large-scale similarity search engines are complex systems devised to process unstructured data such as images and videos. These systems are deployed on clusters of distributed processors connected through high-speed networks. To process a new query, a distance function is evaluated between the query and the objects stored in the database. This process relies on a metric space index distributed among the processors. In this paper, we propose a cache-based strategy that uses pre-computed information to reduce the number of computations required to retrieve the top-k results for user queries. Our proposal executes an approximate similarity search algorithm that takes advantage of the links between objects stored in the cache memory. Those links form a graph of similarity among pre-computed queries. Compared to previous methods in the literature, the proposed approach reduces the number of distance evaluations by up to 60%.
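The general idea described in the abstract, a cache of answered queries linked into a similarity graph and used to answer new queries approximately, can be illustrated with a minimal sketch. This is not the authors' algorithm: the class name `GraphCache`, its methods, the `n_links` parameter, the greedy graph walk, and the use of Euclidean distance are all assumptions chosen for illustration only.

```python
import heapq
import math


def euclidean(a, b):
    """Distance function (an assumption; any metric could be substituted)."""
    return math.dist(a, b)


class GraphCache:
    """Hypothetical toy cache: every cached query keeps its top-k objects
    and links to the most similar cached queries."""

    def __init__(self, dist=euclidean):
        self.dist = dist
        self.queries = []     # cached query points
        self.results = []     # cached top-k object lists, one per query
        self.links = []       # adjacency sets between cached queries
        self.evaluations = 0  # distance computations done while searching

    def _d(self, a, b):
        self.evaluations += 1
        return self.dist(a, b)

    def add(self, query, topk_objects, n_links=2):
        """Cache an answered query and link it to its nearest cached queries.
        Linking cost is not counted in `evaluations`."""
        idx = len(self.queries)
        nearest = heapq.nsmallest(
            n_links, range(idx), key=lambda i: self.dist(query, self.queries[i])
        )
        self.queries.append(query)
        self.results.append(list(topk_objects))
        self.links.append(set(nearest))
        for i in nearest:
            self.links[i].add(idx)
        return idx

    def search(self, query, k, start=0):
        """Approximate top-k: walk greedily to the closest cached query, then
        rank only the objects cached there and at its direct neighbours."""
        if not self.queries:
            return []
        current = start
        best = self._d(query, self.queries[current])
        while True:
            scored = [(self._d(query, self.queries[n]), n)
                      for n in self.links[current]]
            if not scored:
                break
            d, n = min(scored)
            if d >= best:
                break  # no neighbour is closer: local minimum reached
            current, best = n, d
        candidates = []
        for i in [current, *self.links[current]]:
            candidates.extend(self.results[i])
        candidates = list(dict.fromkeys(candidates))  # drop duplicates, keep order
        return heapq.nsmallest(k, candidates, key=lambda o: self._d(query, o))
```

In this sketch, objects and queries are assumed to be tuples of numbers. Distances are evaluated only against cached queries along the walk and against the objects they reference, rather than against the whole database, which is the kind of saving the abstract refers to. The answer is approximate, because objects never returned for a nearby cached query cannot be found.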