dc.creatorTogninalli, M.
dc.creatorXu Wang
dc.creatorKucera, T.
dc.creatorShrestha, S.
dc.creatorJuliana, P.
dc.creatorMondal, S.
dc.creatorPinto Espinosa, F.
dc.creatorVelu, G.
dc.creatorCrespo-Herrera, L.A.
dc.creatorHuerta-Espino, J.
dc.creatorSingh, R.P.
dc.creatorBorgwardt, K.
dc.creatorPoland, J.A.
dc.date2023-06-29T20:20:11Z
dc.date2023
dc.date.accessioned2023-07-17T20:10:41Z
dc.date.available2023-07-17T20:10:41Z
dc.identifierhttps://hdl.handle.net/10883/22634
dc.identifier10.1093/bioinformatics/btad336
dc.identifier.urihttps://repositorioslatinoamericanos.uchile.cl/handle/2250/7514376
dc.descriptionMotivation: Developing new crop varieties with superior performance is highly important to ensure robust and sustainable global food security. The speed of variety development is limited by long field cycles and advanced generation selections in plant breeding programs. While methods to predict yield from genotype or phenotype data have been proposed, improved performance and integrated models are needed. Results: We propose a machine learning model that leverages both genotype and phenotype measurements by fusing genetic variants with multiple data sources collected by unmanned aerial systems. We use a deep multiple instance learning framework with an attention mechanism that sheds light on the importance given to each input during prediction, enhancing interpretability. Our model reaches 0.754 ± 0.024 Pearson correlation coefficient when predicting yield in similar environmental conditions; a 34.8% improvement over the genotype-only linear baseline (0.559 ± 0.050). We further predict yield on new lines in an unseen environment using only genotypes, obtaining a prediction accuracy of 0.386 ± 0.010, a 13.5% improvement over the linear baseline. Our multi-modal deep learning architecture efficiently accounts for plant health and environment, distilling the genetic contribution and providing excellent predictions. Yield prediction algorithms leveraging phenotypic observations during training therefore promise to improve breeding programs, ultimately speeding up delivery of improved varieties.
dc.languageEnglish
dc.publisherOxford University Press
dc.relationhttps://doi.org/10.5061/dryad.kprr4xh5p
dc.rightsCIMMYT manages Intellectual Assets as International Public Goods. The user is free to download, print, store and share this work. In case you want to translate or create any other derivative work and share or distribute such translation/derivative work, please contact CIMMYT-Knowledge-Center@cgiar.org indicating the work you want to use and the kind of use you intend; CIMMYT will contact you with the suitable license for that purpose
dc.rightsOpen Access
dc.source6
dc.source39
dc.source1367-4803
dc.sourceBioinformatics
dc.sourcebtad336
dc.subjectAGRICULTURAL SCIENCES AND BIOTECHNOLOGY
dc.subjectNew Crop Varieties
dc.subjectPlant Breeding Programs
dc.subjectYield Prediction
dc.subjectLEARNING
dc.subjectGRAIN
dc.subjectYIELDS
dc.subjectWHEAT
dc.subjectBREEDING
dc.subjectFOOD SECURITY
dc.subjectWheat
dc.titleMulti-modal deep learning improves grain yield prediction in wheat breeding by fusing genomics and phenomics
dc.typeArticle
dc.typePublished Version
dc.coverageOxford (United Kingdom)

