dc.creator: Santos, Lúcio Fernandes Dutra
dc.creator: Oliveira, Willian Dener de
dc.creator: Carvalho, Luiz Olmes
dc.creator: Ferreira, Mônica Ribeiro Porto
dc.creator: Traina, Agma Juci Machado
dc.creator: Traina Junior, Caetano
dc.date.accessioned: 2015-06-30T13:56:18Z
dc.date.accessioned: 2018-07-04T17:05:50Z
dc.date.available: 2015-06-30T13:56:18Z
dc.date.available: 2018-07-04T17:05:50Z
dc.date.created: 2015-06-30T13:56:18Z
dc.date.issued: 2015-04
dc.identifier: Symposium on Applied Computing, 30th, 2015, Salamanca.
dc.identifier: 9781450331968
dc.identifier: http://www.producao.usp.br/handle/BDPI/49015
dc.identifier: http://dx.doi.org/10.1145/2695664.2695798
dc.identifier.uri: http://repositorioslatinoamericanos.uchile.cl/handle/2250/1644611
dc.description.abstract: Result diversification methods aim to retrieve elements similar to a given object while also enforcing a certain degree of diversity among them, with the goal of improving answer relevance. Most of these methods are based on optimization, but the underlying problems are NP-hard. Diversity is injected into an otherwise all-too-similar result set in two phases: in the first, the search space is reduced to speed up finding the optimal solution, whereas in the second, a trade-off between diversity and similarity is obtained over the reduced space. The first phase is usually assumed to be carried out by a traditional nearest neighbor algorithm, but no previous investigation has evaluated the impact of the first phase on the second. In this paper, we devised alternative techniques to execute the first phase and evaluated how obtaining a better-quality set of elements in the first phase can improve diversity. Besides the traditional nearest neighbor-based pre-selection, we also considered naive random selection, cluster-based selection, and influence-based selection. Extensive experiments then evaluated a number of state-of-the-art diversity algorithms employed in the second phase, regarding both processing time and answer quality. The results show that although the much more elaborate (and much more time-consuming) methods indeed provide the best answers, other alternatives offer a better compromise between quality and performance. Moreover, the pre-selection techniques can reduce the total running time by up to two orders of magnitude.
dc.language: eng
dc.publisher: Association for Computing Machinery - ACM
dc.publisher: University of Salamanca
dc.publisher: Salamanca
dc.relation: Symposium on Applied Computing, 30th
dc.rights: Copyright ACM
dc.rights: closedAccess
dc.subject: Search result diversification
dc.subject: similarity search
dc.subject: sampling
dc.title: Combine-and-conquer: improving the diversity in similarity search through influence sampling
dc.type: Conference proceedings
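
The two-phase scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the function names (`preselect_knn`, `preselect_random`, `diversify_greedy`), the weight `lam`, and the greedy selection rule are all assumptions chosen for illustration. The greedy step is a simple heuristic stand-in for the optimization-based diversity algorithms the paper evaluates, and the cluster-based and influence-based pre-selections are not sketched here.

```python
import math
import random


def euclidean(a, b):
    """Distance between two points given as equal-length tuples."""
    return math.dist(a, b)


def preselect_knn(query, data, m):
    """Phase 1 (assumed variant): keep the m elements nearest to the query."""
    return sorted(data, key=lambda p: euclidean(query, p))[:m]


def preselect_random(data, m, seed=0):
    """Phase 1 (assumed variant): naive random pre-selection of m elements."""
    rng = random.Random(seed)
    return rng.sample(data, min(m, len(data)))


def diversify_greedy(query, candidates, k, lam=0.5):
    """Phase 2 (hypothetical heuristic): greedily pick k elements.

    Each step maximizes
        (1 - lam) * (-distance to query) + lam * (distance to nearest pick),
    so lam = 0 reduces to plain nearest-neighbor ranking and larger lam
    favors elements far from those already selected.
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def score(p):
            similarity = -euclidean(query, p)
            diversity = min((euclidean(p, s) for s in selected), default=0.0)
            return (1 - lam) * similarity + lam * diversity

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    query = (0.0, 0.0)
    data = [(0.1, 0.0), (0.11, 0.0), (0.12, 0.0), (0.0, 1.0), (5.0, 5.0)]
    reduced = preselect_knn(query, data, m=4)          # phase 1
    print(diversify_greedy(query, reduced, k=2, lam=0.8))  # phase 2
```

In this sketch the cost of the second phase depends on the size `m` of the reduced set rather than on the full dataset, which is the motivation for pre-selection described in the abstract.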