Significance

The COVID-19 pandemic is a worldwide public health emergency caused by the β-coronavirus SARS-CoV-2. A very large and continuously increasing number of high-quality whole-genome sequences is available. We have investigated whether these sequences show effects of epistatic contributions to fitness. In a population evolving under a high rate of recombination, such effects of natural selection can be detected by direct coupling analysis, a global model learning technique. The paper opens up the prospect of leveraging very large collections of genome sequences to find combinatorial weaknesses of highly recombinant pathogens.

Abstract

Genome-wide epistasis analysis is a powerful tool to infer gene interactions, which can guide drug and vaccine development and lead to a deeper understanding of microbial pathogenesis. We have considered all complete severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes deposited in the Global Initiative on Sharing All Influenza Data (GISAID) repository up to four different cutoff dates, and used direct coupling analysis together with an assumption of quasi-linkage equilibrium to infer epistatic contributions to fitness from polymorphic loci. We find eight interactions, of which three are between pairs where one locus lies in gene ORF3a, both loci holding nonsynonymous mutations. We also find interactions between two loci in gene nsp13, both holding nonsynonymous mutations, and four interactions involving one locus holding a synonymous mutation. Altogether, we infer interactions between loci in viral genes ORF3a and nsp2, nsp12, and nsp6, between ORF8 and nsp4, and between loci in genes nsp2, nsp13, and nsp14. The paper opens the prospect of using prominent epistatically linked pairs as a starting point to search for combinatorial weaknesses of recombinant viral pathogens.
【Author keywords】 SARS-CoV-2, Recombination, epistasis, direct coupling analysis
【Abstract keywords】 Vaccine development, coronavirus, Nsp12, Mutation, Pathogenesis, COVID-19 pandemic, Genome, nsp13, ORF3a, ORF8, nsp14, natural selection, Pathogens, public health emergency, GISAID, interactions, Interaction, Analysis, NSP2, Nonsynonymous mutations, genome sequence, microbial, starting point, acute respiratory syndrome, Repository, locus, sequence, powerful tool, Cutoff, loci, assumption, viral pathogens, initiative, Whole-genome sequence, Effect, Complete, nsp4, synonymous, caused, investigated, eight, in viral, Significance
【Title keywords】 SARS-CoV-2 genome, Analysis, eight, reveal, viral gene
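
To make the idea of combining direct coupling analysis with a quasi-linkage equilibrium (QLE) assumption more concrete, the following is a minimal sketch and not the pipeline used in the paper. It assumes a binary encoding of each polymorphic locus (major versus minor allele), swaps in a naive mean-field estimate of the couplings (the negative inverse of the regularized covariance matrix), and then applies the commonly quoted QLE relation f_ij ≈ J_ij · r · c_ij. The function names, the ridge term, the toy data, and the values chosen for the recombination rate r and the crossover probability c_ij are all illustrative assumptions.

```python
import numpy as np


def mean_field_couplings(alignment, ridge=0.1):
    """Naive mean-field DCA couplings for a binary alignment.

    alignment: (n_sequences, n_loci) array of 0/1 entries
               (hypothetical encoding: 1 = minor allele present).
    ridge:     assumed regularization added to the covariance diagonal
               so that the matrix can be inverted.
    """
    spins = 2.0 * np.asarray(alignment, dtype=float) - 1.0  # map 0/1 to -1/+1
    cov = np.atleast_2d(np.cov(spins, rowvar=False, bias=True))
    cov = cov + ridge * np.eye(cov.shape[0])
    couplings = -np.linalg.inv(cov)  # naive mean-field estimate J = -C^{-1}
    np.fill_diagonal(couplings, 0.0)  # diagonal entries are not pair couplings
    return couplings


def qle_epistasis(couplings, recombination_rate=1.0, crossover_prob=0.5):
    """Convert couplings to epistasis under the assumed QLE relation
    f_ij = J_ij * r * c_ij. Both parameter values are placeholders."""
    return couplings * recombination_rate * crossover_prob


def top_pairs(scores, k=3):
    """Return the k locus pairs (i < j) with the largest absolute score."""
    rows, cols = np.triu_indices(scores.shape[0], k=1)
    order = np.argsort(-np.abs(scores[rows, cols]))[:k]
    return [(int(rows[t]), int(cols[t]), float(scores[rows[t], cols[t]]))
            for t in order]


# Toy data (synthetic, not SARS-CoV-2 sequences): five loci, with locus 1
# copying locus 0 in about 90% of the sequences.
rng = np.random.default_rng(0)
toy = rng.integers(0, 2, size=(2000, 5))
copy_mask = rng.random(2000) < 0.9
toy[copy_mask, 1] = toy[copy_mask, 0]

J = mean_field_couplings(toy)
f = qle_epistasis(J)
print(top_pairs(f, k=1))
```

In this toy setting the strongest inferred pair should be the artificially linked loci 0 and 1. On real data, strong couplings can also reflect shared ancestry rather than epistasis, which is why the QLE assumption of a high recombination rate matters for the interpretation.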