Conference object
Machine learning algorithms identified relevant SNPs for milk fat content in cattle
Authors
Ríos, Pablo
Raschia, María Agustina
Maizon, Daniel O.
Demitrio, Daniel
Poli, Mario A.
Institution
Sociedad Argentina de Informática e Investigación Operativa
Abstract
In recent years, machine learning methods have been shown to be efficient at identifying the subset of single nucleotide polymorphisms (SNPs) underlying a trait of interest. The aim of this study was to construct predictive models, using machine learning algorithms, that identify the loci best explaining the variance in milk fat production of dairy cattle. Further objectives were to determine the genes flanking the relevant SNPs and to retrieve the pathways, biological processes, or molecular functions overrepresented among those genes. Fat production values adjusted for fixed effects (FPadj) and estimated breeding values for milk fat production (EBVFP) were used as phenotypes, and SNPs as predictor variables. The models constructed for EBVFP performed better and yielded considerably fewer relevant SNPs than the models for FPadj. Among the genes flanking relevant SNPs, signal transduction pathways and gated channel activities were detected as overrepresented. The loci obtained for EBVFP matched previously reported relevant loci for milk fat content better than those obtained for FPadj. Based on the better performance shown by the models trained for EBVFP, and on their agreement with previously reported results for the trait studied, we conclude that the relationship among individuals should be accounted for in the phenotype used.
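The general idea of using SNPs as predictors and keeping only those a model finds relevant can be illustrated with a minimal sketch. The abstract does not name the algorithms used, so the lasso regression below is a stand-in chosen for illustration only, not the authors' method. The genotypes, the phenotype, the causal SNP indices (3 and 17), the penalty `lam`, and all function names are hypothetical and simulated.

```python
import random


def simulate(n_animals=200, n_snps=30, causal=(3, 17), seed=0):
    """Simulate genotypes coded 0/1/2 and a phenotype driven by a few SNPs.

    All values are synthetic; nothing here comes from the study's data.
    """
    rng = random.Random(seed)
    X = [[rng.choice((0, 1, 2)) for _ in range(n_snps)]
         for _ in range(n_animals)]
    y = [sum(1.5 * row[j] for j in causal) + rng.gauss(0.0, 0.5) for row in X]
    return X, y


def standardize(X, y):
    """Center the phenotype and scale each SNP column to mean 0, variance 1."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    y_centered = [v - y_mean for v in y]
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        m = sum(col) / n
        sd = (sum((v - m) ** 2 for v in col) / n) ** 0.5 or 1.0
        cols.append([(v - m) / sd for v in col])
    return cols, y_centered


def lasso(cols, y, lam, n_iter=50):
    """Coordinate-descent lasso: minimize (1/2n)||y - Xb||^2 + lam * ||b||_1.

    Assumes standardized columns, so each update is a soft-threshold.
    """
    n, p = len(y), len(cols)
    beta = [0.0] * p
    resid = list(y)
    for _ in range(n_iter):
        for j in range(p):
            col = cols[j]
            # Covariance of SNP j with the residual, with j's own
            # current contribution added back in.
            rho = sum(c * r for c, r in zip(col, resid)) / n + beta[j]
            shrunk = max(abs(rho) - lam, 0.0)
            new = shrunk if rho > 0 else -shrunk
            if new != beta[j]:
                delta = new - beta[j]
                resid = [r - c * delta for r, c in zip(resid, col)]
                beta[j] = new
    return beta


X, y = simulate()
cols, y_centered = standardize(X, y)
beta = lasso(cols, y_centered, lam=0.3)

# SNPs with a non-zero coefficient are the ones the model keeps as relevant.
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("Selected SNP indices:", selected)
```

In a sketch like this, the same procedure could be run once per phenotype definition (for example an adjusted record versus an estimated breeding value) and the resulting SNP sets compared, which is the kind of comparison the abstract reports.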