Evolutionary and structural analyses of SARS‐CoV‐2 D614G spike protein mutation now documented worldwide
Authors
Isabel, Sandra
Graña‑Miraglia, Lucía
Gutierrez, Jahir M.
Bundalovic‑Torma, Cedoljub
Groves, Helen E.
Isabel, Marc R.
Eshaghi, AliReza
Patel, Samir N.
Gubbay, Jonathan B.
Poutanen, Tomi
Guttman, David S.
Poutanen, Susan M.
Institution
Abstract
The COVID-19 pandemic, caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), was
declared by the World Health Organization on March 11, 2020. As of May 31, 2020, more than 6 million
COVID-19 cases had been diagnosed worldwide, with over 370,000 deaths, according to Johns Hopkins.
Thousands of SARS-CoV-2 strains have been sequenced to date, providing a valuable opportunity to
investigate the evolution of the virus on a global scale. We performed a phylogenetic analysis of over
1,225 SARS-CoV-2 genomes spanning from late December 2019 to mid-March 2020. We identified a missense
mutation, D614G, in the spike protein of SARS-CoV-2 that defines a clade which has become predominant
in Europe (954 of 1,449 (66%) sequences) and is spreading worldwide (1,237 of 2,795 (44%) sequences).
Molecular dating analysis estimated that this clade emerged around mid-to-late January (10–25 January)
2020. We also applied structural bioinformatics to assess the potential impact of D614G on the virulence
and epidemiology of SARS-CoV-2. In silico analyses of the spike protein structure suggest that the
mutation is most likely neutral to protein function as it relates to interaction with the human ACE2
receptor. The lack of available clinical metadata prevented us from investigating the association
between viral clade and disease severity phenotype. Future work that combines clinical outcome data
with both viral and human genomic diversity is needed to monitor the pandemic.
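
The clade proportions quoted in the abstract follow directly from the reported sequence counts. As a minimal sketch for checking that arithmetic, the percentages can be re-derived as shown here. The function name `clade_share` is hypothetical and is not taken from the paper, and this is not the authors' analysis pipeline.

```python
def clade_share(with_mutation: int, total: int) -> float:
    """Return the percentage of sequences that carry the mutation."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * with_mutation / total


# Counts as reported in the abstract (D614G sequences / all sequences).
europe = clade_share(954, 1449)
worldwide = clade_share(1237, 2795)

print(f"Europe: {europe:.0f}%")        # Europe: 66%
print(f"Worldwide: {worldwide:.0f}%")  # Worldwide: 44%
```

Rounding to whole percentages reproduces the 66% and 44% figures given in the abstract.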