Background: A mechanistic understanding of the spread of SARS-CoV-2 and diligent tracking of ongoing mutagenesis are of key importance to plan robust strategies for confining its transmission. Large numbers of available sequences and their dates of transmission provide an unprecedented opportunity to analyze evolutionary adaptation in novel ways. Addition of high-resolution structural information can reveal the functional basis of these processes at the molecular level. Integrated systems biology-directed analyses of these data layers afford valuable insights to build a global understanding of the COVID-19 pandemic.

Results: Here we identify globally distributed haplotypes from 15,789 SARS-CoV-2 genomes and model their success based on their duration, dispersal, and frequency in the host population. Our models identify mutations that are likely compensatory adaptive changes that allowed for rapid expansion of the virus. Functional predictions from structural analyses indicate that, contrary to previous reports, the Asp 614 Gly mutation in the spike glycoprotein (S) likely reduced transmission and the subsequent Pro 323 Leu mutation in the RNA-dependent RNA polymerase led to the precipitous spread of the virus. Our model also suggests that two mutations in the nsp13 helicase allowed for the adaptation of the virus to the Pacific Northwest of the USA. Finally, our explainable artificial intelligence algorithm identified a mutational hotspot in the sequence of S that also displays a signature of positive selection and may have implications for tissue or cell-specific expression of the virus.

Conclusions: These results provide valuable insights for the development of drugs and surveillance strategies to combat the current and future pandemics.
[Author keywords] COVID-19, SARS-CoV-2, coronavirus, molecular evolution, adaptive mutation, Local adaptation

[Abstract keywords] Mutation, adaptive, Positive selection, COVID-19 pandemic, spike glycoprotein, Transmission, drug, virus, Helicase, nsp13, Spread, Surveillance, SARS-CoV-2 genome, Algorithm, Pandemics, RNA-dependent RNA polymerase, molecular, USA, information, expression, change, Haplotype, Mutagenesis, Frequency, Analysis, dispersal, High-resolution, tissue, These data, sequence, contrary, rapid expansion, hotspot, Host, implication, robust, Result, identify, subsequent, reduced, functional, build, Pro

[Title keywords] adaptive, SARS-CoV-2 mutation, spatiotemporal