Sketch-Based multimodal image retrieval using deep learning

Berno, Brenda Cinthya Solari

Master's thesis

Date

2021-05-21

Record in:

BERNO, Brenda Cinthya Solari. Sketch-Based multimodal image retrieval using deep learning. 2021. Dissertação (Mestrado em Engenharia Elétrica e Informática Industrial) - Universidade Tecnológica Federal do Paraná, Curitiba, 2021.

http://repositorio.utfpr.edu.br/jspui/handle/1/25496

https://repositorioslatinoamericanos.uchile.cl/handle/2250/5257859

Author

Berno, Brenda Cinthya Solari

Institution

Universidade Tecnológica Federal do Paraná (Brazil)

Abstract

The volume of multimedia data generated every day keeps growing, which makes retrieving it increasingly difficult. Google is known to retrieve documents well by matching keywords. However, multimedia data rarely contain keywords that identify them. The main objective of this work is to retrieve a photographic image using a query from a different modality, such as a sketch. A sketch differs from a photograph: it is a set of hand-drawn lines in which color and texture are lost, whereas a photograph is a more complex visual representation of the real world. The selected case study for this method is tattoo photograph retrieval using sketches. Because appropriate data for this study were lacking, a new dataset of sketches and tattoo images was created. The proposed model is a Siamese neural network that takes as input visual features previously extracted from each modality and learns an optimal representation for photographs and sketches in an embedding space, where the image of a class lies close to the sketch of the same class. Two cost functions were tested, and experiments showed that the contrastive loss achieved better image retrieval results than the triplet loss. Despite the limited data, the image retrieval experiments reached an average precision of 85% on our dataset for the top-5 results and 85% on Sketchy for the top-10 results. We observed that retrieval results depend on the quality and diversity of the training data, especially in sketch-based image retrieval, which in turn depends on the user's ability to draw. Overall, the proposed methods are promising and the results encourage further research. Future work includes extending the dataset (both tattoo images and sketches) and experimenting with other modalities.
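The two cost functions and the top-k precision measure named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the margin value, and the toy embeddings are all assumptions made for illustration, and the contrastive loss is written in one common form (squared distance for matching pairs, squared hinge on the margin for non-matching pairs).

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(sketch, photo, same_class, margin=1.0):
    """Pull same-class pairs together; push other pairs at least `margin` apart.
    The margin default is an assumed value, not one taken from the thesis."""
    d = euclidean(sketch, photo)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the anchor is closer to the positive than to the negative
    by at least `margin`; positive otherwise."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def precision_at_k(query, query_label, gallery, k):
    """Fraction of the k nearest gallery items that share the query's class.
    `gallery` is a list of (embedding, label) pairs."""
    ranked = sorted(gallery, key=lambda item: euclidean(query, item[0]))
    top = ranked[:k]
    return sum(1 for _, label in top if label == query_label) / len(top)

# Hypothetical 2-D embeddings: two "rose" photos near the origin, two "skull" photos far away.
gallery = [([0.0, 0.1], "rose"), ([0.1, 0.0], "rose"),
           ([1.0, 1.0], "skull"), ([0.9, 1.1], "skull")]
sketch_query = [0.0, 0.0]  # embedding of a hypothetical "rose" sketch

p_at_2 = precision_at_k(sketch_query, "rose", gallery, 2)
p_at_4 = precision_at_k(sketch_query, "rose", gallery, 4)
```

In this toy gallery both nearest photos belong to the query's class, so precision at top-2 is 1.0, while widening to top-4 admits the two "skull" photos and halves it. The contrastive loss scores one sketch-photo pair at a time, whereas the triplet loss compares a matching and a non-matching photo against the same sketch.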

Subjects

Sistemas multimídia

Recuperação de dados (Computação)

Sistemas de recuperação da informação

Redes neurais (Computação)

Visão Computacional

Aprendizado do computador

Tatuagem - Imagem

Multimedia systems

Data recovery (Computer science)

Information storage and retrieval systems

Neural networks (Computer science)

Computer vision

Machine learning

Tattooing - Imaging
