Thesis
Machine learning classification of single cell RNA-seq across different types of cancer.
Author
Vidal Miranda, Mabel Angélica
Institution
Abstract
Human cancers are complex ecosystems composed of different types of cells. The diverse
populations of co-existing cells within the same tumor that have genetic, functional, and
environmental differences determine the tumor heterogeneity, which is one of the major
challenges facing cancer diagnosis and treatment. The aim of this thesis was to apply
different machine learning methods to classify single cell RNA-seq (scRNA-seq) samples
across nine different types of cancer. We observed that T cell datasets are the most
abundant in public repositories due to the important role of T cells in immunotherapies.
For this reason, we performed an in-silico analysis of scRNA-seq data available in the Gene
Expression Omnibus. A first approach was to analyze and characterize genetic T cell
signatures from five different types of cancer and apply dimensionality reduction and
clustering methods to identify subpopulations from malignant and non-malignant datasets.
This analysis revealed that pathways related to immune response, metabolism and viral
immunoregulation were observed exclusively in samples of malignant origin. A second
approach was to apply two deep learning models to classify cells from nine different
types of cancer, where the cells were grouped according to the diversity of their cell
states, giving us a new perspective on the different classes of tumors present in our
dataset. Finally, we observed that unsupervised methods applied to our data help us
understand the heterogeneity between tumors. Characterization of cellular diversity was associated with
pathways that play a key role in tumor proliferation, progression, and regulation of the
microenvironmental immune response.
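
The clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which normalization or clustering algorithm was used, so everything here is an assumption: library-size normalization with a log transform, followed by plain k-means as a stand-in clustering method. The count matrix `toy_counts` and the function names are hypothetical, and dimensionality reduction is left out for brevity.

```python
import math
import random


def log_normalize(counts, scale=1e4):
    """Scale each cell (row) to `scale` total counts, then apply log(1 + x)."""
    normalized = []
    for cell in counts:
        total = sum(cell)
        factor = scale / total if total > 0 else 0.0
        normalized.append([math.log1p(c * factor) for c in cell])
    return normalized


def squared_distance(a, b):
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means with farthest-point initialization (an assumed stand-in)."""
    rng = random.Random(seed)
    # Start from one random cell, then repeatedly add the cell farthest
    # from all centroids chosen so far.
    centroids = [points[rng.randrange(len(points))]]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(farthest)

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each cell joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy matrix: 6 cells (rows) x 4 genes (columns).
# The first three cells express genes 0-1, the last three genes 2-3.
toy_counts = [
    [90, 80, 2, 1],
    [70, 95, 1, 3],
    [85, 60, 1, 2],
    [1, 3, 75, 90],
    [2, 2, 88, 70],
    [2, 1, 60, 95],
]

labels, centroids = kmeans(log_normalize(toy_counts), k=2)
print(labels)
```

On real scRNA-seq data the matrix would have thousands of genes, so a dimensionality reduction step would normally precede clustering, as the abstract describes.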