Novel metric-learning methods for generalizable and discriminative few-shot image classification

OCHOA RUIZ, GILBERTO; 352103; Méndez Ruiz, Mauricio

dc.contributor	Ochoa Ruiz, Gilberto
dc.contributor	School of Engineering and Sciences
dc.contributor	Chang Fernández, Leonardo
dc.contributor	Méndez Vázquez, Andrés
dc.contributor	Campus Monterrey
dc.contributor	puelquio/mscuervo
dc.creator	OCHOA RUIZ, GILBERTO; 352103
dc.creator	Méndez Ruiz, Mauricio
dc.date.accessioned	2023-06-22T15:00:42Z
dc.date.accessioned	2023-07-19T19:16:29Z
dc.date.available	2023-06-22T15:00:42Z
dc.date.available	2023-07-19T19:16:29Z
dc.date.created	2023-06-22T15:00:42Z
dc.date.issued	2021-12-09
dc.identifier	Méndez Ruiz, M. (2021). Novel metric-learning methods for generalizable and discriminative few-shot image classification [Unpublished master's thesis]. Instituto Tecnológico y de Estudios Superiores de Monterrey. Retrieved from: https://hdl.handle.net/11285/650931
dc.identifier	https://hdl.handle.net/11285/650931
dc.identifier	1053812
dc.identifier.uri	https://repositorioslatinoamericanos.uchile.cl/handle/2250/7715837
dc.description.abstract	Few-shot learning (FSL) is a challenging and relatively new technique for problems where little data is available. The goal of these methods is to classify previously unseen categories from just a handful of labeled samples. Recent works based on metric learning benefit from the meta-learning process, in which episodic tasks are composed of a support set (training) and a query set (test), and the objective is to learn a similarity metric between those sets. Metric-learning methods have demonstrated that simple models can achieve good performance. However, the feature space learned by a given metric-learning approach may not exploit the information given by a specific few-shot task. Because data is scarce, the way the embedding network is trained becomes important for these models to take better advantage of the similarity metric on a few-shot task. The contributions of the present thesis are three-fold. First, we explore dimensionality reduction techniques as a way to find significant features in the few-shot task, which allows better classification. We measure the quality of the reduced features by assigning a score based on the intra-class and inter-class distances, and select the feature reduction method under which instances of different classes are far apart and instances of the same class are close together. This method outperforms the metric-learning baselines on the miniImageNet dataset by around 2% in accuracy. Second, we propose two distance-based loss functions for few-shot classification. One is inspired by the triplet loss function, while the other evaluates the embedding vectors of a task using the intra-class and inter-class distances among the few samples. Extensive experimental results on the miniImageNet dataset show an increase in accuracy over other metric-based FSL methods by a margin of 2%. Lastly, we evaluate the generalization capabilities of meta-learning-based FSL on two real-life medical datasets with limited data availability. It has been repeatedly shown that deep learning (DL) methods trained on one dataset do not generalize well to datasets from other domains, or even to similar datasets, because of data distribution shifts. We propose a meta-learning-based FSL approach to alleviate these problems and demonstrate, using two datasets of kidney stone samples acquired with different endoscopes and under different acquisition conditions, that such methods are indeed capable of handling domain shifts. Where deep-learning-based methods fail to generalize to instances of the same class drawn from different data distributions, we show that FSL generalizes without a large decrease in performance. The method performs remarkably well even under very limited data conditions, attaining accuracies of 74.38% and 88.52% in the 5-way 5-shot and 5-way 20-shot settings, respectively, while traditional DL methods attained an accuracy of 45% on the same data.
dc.language	eng
dc.publisher	Instituto Tecnológico y de Estudios Superiores de Monterrey
dc.relation	draft
dc.relation	REPOSITORIO NACIONAL CONACYT
dc.rights	http://creativecommons.org/licenses/by/4.0
dc.rights	openAccess
dc.title	Novel metric-learning methods for generalizable and discriminative few-shot image classification
dc.type	Master's thesis

This item belongs to the following institution

Instituto Tecnológico de Monterrey (México)
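
The abstract describes scoring reduced features by their intra-class and inter-class distances, so that instances of the same class are close and instances of different classes are far apart. The record does not give the exact formula used in the thesis, so the following is only a minimal sketch under stated assumptions: the function name `intra_inter_score`, the use of class centroids, Euclidean distance, and the ratio form are all illustrative choices, not the author's definition.

```python
import numpy as np

def intra_inter_score(embeddings, labels):
    """Hypothetical separability score for one few-shot task.

    Returns mean inter-class distance divided by mean intra-class
    distance (higher means better-separated classes). The ratio form
    and the centroid-based distances are assumptions for illustration.
    Requires at least two distinct classes.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)

    # One centroid per class, computed from that class's samples.
    centroids = np.stack(
        [embeddings[labels == c].mean(axis=0) for c in classes]
    )

    # Intra-class: mean distance from each sample to its own centroid,
    # averaged over classes.
    intra = np.mean([
        np.linalg.norm(embeddings[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(classes)
    ])

    # Inter-class: mean distance between each pair of distinct centroids.
    pairwise = np.linalg.norm(
        centroids[:, None, :] - centroids[None, :, :], axis=-1
    )
    upper = np.triu_indices(len(classes), k=1)
    inter = pairwise[upper].mean()

    # Small constant guards against division by zero when every sample
    # sits exactly on its centroid.
    return inter / (intra + 1e-12)
```

Under these assumptions, a candidate feature reduction could be chosen by applying each one to the support-set embeddings of a task and keeping the one with the highest score. For example, the four points `[[0, 0], [0, 1], [10, 0], [10, 1]]` score about 20 with labels `[0, 0, 1, 1]` (two tight, distant clusters) and about 0.2 with labels `[0, 1, 0, 1]` (classes mixed across the clusters).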