doctoralThesis
Estimación de parámetros y clasificación de datos: aplicaciones biomédicas (Parameter estimation and data classification: biomedical applications)
Author
Agnelli, Juan Pablo
Institution
Abstract
This thesis addresses two types of biomedical applications, each treated with different mathematical tools, and the work is accordingly divided into two parts. The first part deals with tumor detection: the goal was to estimate the location, size and thermal parameters of a tumor from temperature profiles measured on the body surface. In the second part, the goal was to develop an algorithm capable of extracting, from a large database, information that resides in the data implicitly. This information is previously unknown and can be useful for describing the process or phenomenon under study. In particular, the algorithm was applied to the classification of different tumor types, using gene expression levels as the database.
In this thesis we propose two main areas of study, so the work is divided into two parts. The
first is concerned with tumor localization and the estimation of parameters associated with tumor
regions, and the second with the development of an algorithm for tumor classification
from gene expression levels.
In the first part the goal is to estimate the position, size and thermal parameters of a tumor
from temperature profiles measured on the top boundary of the domain with a
thermography camera. From the mathematical point of view, studying these problems requires
posing and analyzing inverse problems and developing numerical methods to solve them. In a first
stage, we use partial differential equations to model heat transfer in living tissue; more precisely,
we consider the stationary Pennes equation with mixed boundary conditions. For this elliptic
equation we prove existence and uniqueness of the solution, and we solve the direct problem
with a second-order finite difference scheme. The inverse problems are then
reformulated as optimization problems, which we solve with two
different methodologies. The first is based on the Pattern Search
algorithm, a direct search method that does not use derivatives and is therefore
very easy to implement. The second methodology uses the
derivative of the objective function with respect to the variables to be
estimated. To compute this derivative we use sensitivity analysis tools.
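As a minimal sketch of the derivative-free approach described above, and not the implementation used in the thesis, the following Python code solves a one-dimensional simplification of the stationary Pennes equation with a second-order finite difference scheme and then recovers a single tumor parameter (its depth) with a basic compass-style pattern search. The one-dimensional geometry, the Gaussian-shaped heat source, all function names and all parameter values are assumptions made purely for illustration.

```python
import numpy as np

def solve_pennes_1d(q, k=0.5, beta=2000.0, h=10.0, t_art=37.0, t_inf=25.0, length=0.05):
    """Solve -k T'' + beta (T - t_art) = q(x) on [0, length] (illustrative values).

    x = 0 is the skin surface with a convective condition k T' = h (T - t_inf);
    x = length is the body core with T = t_art. q holds the source at each node.
    """
    n = len(q)
    dx = length / (n - 1)
    a = np.zeros((n, n))
    b = beta * t_art + np.asarray(q, dtype=float)
    # Interior nodes: centered second-order difference for T''.
    for i in range(1, n - 1):
        a[i, i - 1] = a[i, i + 1] = -k / dx**2
        a[i, i] = 2.0 * k / dx**2 + beta
    # Surface node: a ghost point keeps the convective condition second order.
    a[0, 0] = 2.0 * k / dx**2 + 2.0 * h / dx + beta
    a[0, 1] = -2.0 * k / dx**2
    b[0] += 2.0 * h * t_inf / dx
    # Core node: prescribed temperature.
    a[-1, -1] = 1.0
    b[-1] = t_art
    return np.linalg.solve(a, b)

def tumor_source(depth, n=201, length=0.05, q0=400.0, dq=4000.0, radius=0.003):
    """Hypothetical heat source: background q0 plus a smooth bump centered at depth."""
    x = np.linspace(0.0, length, n)
    return q0 + dq * np.exp(-((x - depth) / radius) ** 2)

def surface_temperature(depth):
    """Direct problem: skin temperature produced by a tumor at the given depth."""
    return solve_pennes_1d(tumor_source(depth))[0]

def pattern_search(f, x0, step, tol=1e-6, max_iter=1000):
    """Compass search: poll +/- step along each axis, halve the step on failure."""
    x = np.atleast_1d(np.asarray(x0, dtype=float)).copy()
    fx = f(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(x.size):
            for sign in (1.0, -1.0):
                trial = x.copy()
                trial[i] += sign * step
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= 0.5
    return x, fx

# Synthetic experiment: generate a "measurement" and recover the depth from it.
true_depth = 0.010
measured = surface_temperature(true_depth)
objective = lambda p: (surface_temperature(p[0]) - measured) ** 2
estimate, misfit = pattern_search(objective, x0=[0.020], step=0.005)
print(f"estimated depth: {estimate[0]:.5f} m, misfit: {misfit:.2e}")
```

In this simplified setting a single surface temperature can identify only one unknown. Estimating position, size and thermal parameters together, as the thesis describes, would require a full temperature profile on the boundary of a higher-dimensional domain.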
In the second part of the work, the goal is to build an algorithm capable of extracting, from a
large database, useful information that resides in it implicitly. This information is previously unknown
and may be useful for describing the process or phenomenon under study. In particular, here we are interested in classifying different types of tumors using gene expression levels.
The proposed methodology is based on three main ingredients: 1) the blurring of distinctions between training and testing populations, through the soft assignment of the latter to classes, in an
expectation-maximization framework; 2) a procedure for density estimation through a descent
flow that transforms the original distribution into an isotropic Gaussian distribution; and 3) a measure of the clustering capability of a set of variables, which leads to an effective procedure for
variable selection. The methodology is particularly useful in situations where there are relatively
few observations of a phenomenon that is described by a large number of variables, and no a
priori knowledge that strongly links a small subset of these variables to the classification sought.
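The first ingredient, soft assignment of test samples to classes within an expectation-maximization loop, can be illustrated with the short Python sketch below. It is not the thesis's algorithm: the descent-flow density estimation is replaced here by plain diagonal Gaussian class densities, no variable selection is performed, and the function names and synthetic data are assumptions made for illustration only.

```python
import numpy as np

def log_gauss_diag(x, mean, var):
    """Log-density of a diagonal Gaussian evaluated at each row of x."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var, axis=1)

def em_soft_classify(x_train, y_train, x_test, n_iter=20, var_floor=1e-6):
    """Softly assign test rows to classes while refitting the class densities."""
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    # Training samples keep fixed one-hot memberships throughout.
    r_train = (y_train[:, None] == classes[None, :]).astype(float)
    # Test samples start with zero weight, so the first fit uses training data only.
    r_test = np.zeros((len(x_test), len(classes)))
    x_all = np.vstack([x_train, x_test])
    for _ in range(n_iter):
        # M-step: weighted priors, means and variances from both populations.
        r_all = np.vstack([r_train, r_test])
        weights = r_all.sum(axis=0)
        priors = weights / weights.sum()
        means = (r_all.T @ x_all) / weights[:, None]
        second = (r_all.T @ x_all**2) / weights[:, None]
        var = np.maximum(second - means**2, var_floor)
        # E-step: posterior class memberships of the test samples.
        log_p = np.stack(
            [np.log(priors[c]) + log_gauss_diag(x_test, means[c], var[c])
             for c in range(len(classes))],
            axis=1,
        )
        log_p -= log_p.max(axis=1, keepdims=True)  # stabilize the exponentials
        r_test = np.exp(log_p)
        r_test /= r_test.sum(axis=1, keepdims=True)
    return classes, r_test

# Synthetic data: few labeled samples, more unlabeled ones, two separated classes.
rng = np.random.default_rng(0)
x_train = np.vstack([rng.normal(0.0, 1.0, (5, 5)), rng.normal(4.0, 1.0, (5, 5))])
y_train = np.array([0] * 5 + [1] * 5)
x_test = np.vstack([rng.normal(0.0, 1.0, (20, 5)), rng.normal(4.0, 1.0, (20, 5))])
y_test = np.array([0] * 20 + [1] * 20)

classes, memberships = em_soft_classify(x_train, y_train, x_test)
predicted = classes[memberships.argmax(axis=1)]
print("test accuracy:", np.mean(predicted == y_test))
```

Because the test samples contribute to the density estimates in proportion to their memberships, they help compensate for the small number of labeled observations, which is the situation the abstract highlights.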
According to the results obtained, the methodologies proposed in the first part of this work
can be considered a potential tool for locating tumor regions, such as nodular melanomas, as well
as for estimating parameters associated with them that could be useful for studying
tumor evolution after a treatment procedure. A similar conclusion applies to the methodology
developed in the second part, which could support the diagnosis, prevention and treatment of different diseases based on
gene expression levels.