info:eu-repo/semantics/article
Algorithmic Learning for Auto-deconvolution of GC-MS Data to Enable Molecular Networking within GNPS
Date
2020-01-14
Registered in:
Aksenov, Alexander; Laponogov, Ivan; Zhang, Zheng; Doran, Sophie L. F.; Belluomo, Ilaria; et al.; Algorithmic Learning for Auto-deconvolution of GC-MS Data to Enable Molecular Networking within GNPS; Cold Spring Harbor Laboratory Press; Nature Biotechnology; 39; 14-1-2020; 1-25
1087-0156
1943-0264
CONICET Digital
CONICET
Authors
Aksenov, Alexander
Laponogov, Ivan
Zhang, Zheng
Doran, Sophie L. F.
Belluomo, Ilaria
Veselkov, Dennis
Bittremieux, Wout
Nothias, Louis Felix
Nothias Esposito, Mélissa
Maloney, Katherine N.
Misra, Biswapriya B.
Melnik, Alexey V.
Jones, Kenneth L.
Dorrestein, Kathleen
Panitchpakdi, Morgan
Ernst, Madeleine
van der Hooft, Justin J.J.
Gonzalez, Mabel
Carazzone, Chiara
Amézquita, Adolfo
Callewaert, Chris
Morton, James
Quinn, Robert
Bouslimani, Amina
Albarracín Orio, Andrea Georgina
Petras, Daniel
Smania, Andrea
Couvillion, Sneha P.
Burnet, Meagan C.
Nicora, Carrie D.
Zink, Erika
Metz, Thomas O.
Artaev, Viatcheslav
Humston Fulmer, Elizabeth
Gregor, Rachel
Meijler, Michael M.
Mizrahi, Itzhak
Eyal, Stav
Anderson, Brooke
Dutton, Rachel
Lugan, Raphaël
Le Boulch, Pauline
Guitton, Yann
Prevost, Stephanie
Poirier, Audrey
Dervilly, Gaud
Le Bizec, Bruno
Fait, Aaron
Sikron Persi, Noga
Song, Chao
Gashu, Kelem
Coras, Roxana
Vasiliou, Vasilis
Schmid, Robin
Borisov, Roman S.
Kulikova, Larisa N.
Knight, Rob
Wang, Mingxun
Hanna, George B.
Dorrestein, Pieter
Veselkov, Kirill
Abstract
Gas chromatography-mass spectrometry (GC-MS) represents an analytical technique with significant practical societal impact. Spectral deconvolution is an essential step for interpreting GC-MS data. No public GC-MS repositories that also enable repository-scale analysis exist, in part because deconvolution requires significant user input. We therefore engineered a scalable machine learning workflow for the Global Natural Product Social Molecular Networking (GNPS) analysis platform to enable the mass spectrometry community to store, process, share, annotate, compare, and perform molecular networking of GC-MS data. The workflow performs auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization, using a Fast Fourier Transform-based strategy to overcome scalability limitations. We introduce a "balance score" that quantifies the reproducibility of fragmentation patterns across all samples. We demonstrate the utility of the platform with breathomics analysis applied to the early detection of oesophago-gastric cancer, and by creating the first molecular spatial map of the human volatilome.
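As a rough illustration of the deconvolution idea named in the abstract, the sketch below applies plain non-negative matrix factorization to a small synthetic intensity matrix (scans by m/z channels) built from two co-eluting compounds. It uses the generic multiplicative-update rule, not the FFT-based strategy or the balance score of the published workflow; the function name `nmf`, the synthetic peaks, and all parameter values are assumptions made for this example only.

```python
import numpy as np


def nmf(X, k, n_iter=1000, seed=0, eps=1e-9):
    """Factor non-negative X (scans x m/z) into W (scans x k) @ H (k x m/z).

    Generic multiplicative updates minimizing the Frobenius error;
    an illustrative stand-in, not the published GNPS implementation.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        # Update spectra (rows of H), then elution profiles (columns of W).
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical data: two overlapping Gaussian elution profiles over 60 scans.
t = np.arange(60)
elution = np.stack(
    [np.exp(-0.5 * ((t - 25) / 4.0) ** 2),
     np.exp(-0.5 * ((t - 32) / 4.0) ** 2)],
    axis=1,
)  # shape (60, 2)

# Hypothetical fragmentation patterns over 5 m/z channels, one row per compound.
spectra = np.array([
    [1.0, 0.0, 0.6, 0.0, 0.2],
    [0.0, 0.8, 0.0, 1.0, 0.3],
])  # shape (2, 5)

X = elution @ spectra  # observed mixed signal, shape (60, 5)

W, H = nmf(X, k=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_error:.4f}")
```

In this toy setting the rows of `H` play the role of deconvolved fragmentation patterns and the columns of `W` the role of elution profiles. The factors are recovered only up to scaling and ordering, so any comparison against reference spectra would need normalization first.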