Doctoral Thesis
A fast and low-cost method to detect near-duplicate Images in large dataset based on fingerprint extraction and Deep Learning
Date
2023-03-16
Author
Nasri Shandiz, Fatemeh
Institution
Abstract
Recognizing near-duplicate images from large datasets is a crucial task in image retrieval
and content identification. Finding similar images in order to reduce redundancy is time-consuming in large datasets. Most image representation methods that target
conventional image retrieval problems are, when used to detect duplicates, either computationally
expensive to extract and match or have robustness limitations. In this work, we propose a
fast, computationally inexpensive, and effective method to detect near-duplicate images in a large dataset.
The method uses image fingerprints to determine the similarity between a query
image and near-duplicate images in the dataset. For each image in the dataset, we extract a series of fingerprints
that combine global and local features, together with a deep learning model used as a further fingerprint,
and store them in a separate database. Then we apply successive
filters to the query image, discarding non-similar images in the process until reaching a
final set of near-duplicate images. The method discards most of the non-similar
images in the early stages of the process and focuses on robustness in the later stages,
where the set of near-duplicate candidate images is significantly smaller. This allows the
query process to be performed on the fly. According to the experimental results, the proposed method
provides a good compromise between accuracy and speed in detecting near-duplicate images
in a large dataset, even on a computer with modest performance, such as a home
laptop or a workstation.
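
The successive-filter idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not name the specific fingerprints, so the sketch assumes a simple average hash as the cheap early-stage fingerprint and a cosine similarity over a feature vector (standing in for a deep learning embedding) as the more robust later stage. All function names, the index layout, and the thresholds `max_bits` and `min_cos` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Gray = Sequence[Sequence[int]]          # small grayscale thumbnail, values 0..255
Entry = Tuple[int, Sequence[float]]     # (hash fingerprint, feature vector)


def average_hash(thumb: Gray) -> int:
    """Cheap global fingerprint: one bit per pixel, set when the pixel is above the mean."""
    pixels = [p for row in thumb for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash fingerprints."""
    return bin(a ^ b).count("1")


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sum(x * x for x in u) ** 0.5
    norm_v = sum(y * y for y in v) ** 0.5
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_near_duplicates(
    query_hash: int,
    query_vec: Sequence[float],
    index: Dict[str, Entry],
    max_bits: int = 1,      # assumed threshold for the cheap stage
    min_cos: float = 0.95,  # assumed threshold for the robust stage
) -> List[str]:
    # Stage 1: the cheap hash comparison discards most non-similar images.
    candidates = [
        name for name, (h, _) in index.items()
        if hamming(query_hash, h) <= max_bits
    ]
    # Stage 2: the costlier comparison runs only on the surviving candidates.
    return [
        name for name in candidates
        if cosine(query_vec, index[name][1]) >= min_cos
    ]
```

In this sketch the fingerprints of the dataset images would be computed once and stored in `index`, so a query only pays for the hash comparison against every entry and for the more expensive comparison against the few candidates that remain.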