Given the global impact and severity of COVID-19, there is a pressing need for a better understanding of the SARS-CoV-2 genome and its mutations. Multi-strain sequence alignments of coronaviruses (CoV) provide important information for interpreting the genome and its variation. We apply a comparative genomics method, ConsHMM, to multi-strain alignments of CoV to annotate every base of the SARS-CoV-2 genome with a conservation state based on sequence alignment patterns among CoV. The learned conservation states show distinct enrichment patterns for genes, protein domains, and other regions of interest. Certain states are strongly enriched for or depleted of SARS-CoV-2 mutations, which can be used to predict potentially consequential mutations. We expect the conservation states to be a resource for interpreting the SARS-CoV-2 genome and its mutations.

Kwon and Ernst applied the comparative genomics method ConsHMM to multi-strain alignments of different coronaviruses in order to annotate every base of the SARS-CoV-2 genome with a conservation state. The conservation states reflect sequence alignment patterns among different coronaviruses and can assist with understanding the functional consequences of SARS-CoV-2 mutations.
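The idea that a conservation state is enriched for or depleted of mutations can be made concrete with a fold-enrichment calculation: the fraction of bases in a state that carry a mutation, divided by the fraction of all bases that carry a mutation. The following is a minimal sketch on toy data, not the authors' actual procedure; the function name `fold_enrichment`, the state labels, and the mutated positions are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def fold_enrichment(states, mutated_positions):
    """Fold enrichment of mutations per state (hypothetical helper).

    states: one state label per base (index 0 is the first base).
    mutated_positions: set of 0-based indices of bases carrying a mutation.
    A value above 1 means the state is enriched for mutations,
    a value below 1 means it is depleted.
    """
    total_bases = len(states)
    total_mutated = len(mutated_positions)
    if total_bases == 0 or total_mutated == 0:
        raise ValueError("need at least one base and one mutated position")

    bases_per_state = Counter(states)
    mutated_per_state = Counter(states[i] for i in mutated_positions)

    # Genome-wide fraction of bases that carry a mutation.
    background = total_mutated / total_bases

    return {
        state: (mutated_per_state[state] / n_bases) / background
        for state, n_bases in bases_per_state.items()
    }


# Toy annotation (assumed data): 10 bases in 3 states, 4 of them mutated.
toy_states = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3]
toy_mutated = {4, 5, 6, 8}
toy_result = fold_enrichment(toy_states, toy_mutated)
for state in sorted(toy_result):
    print(state, round(toy_result[state], 3))
```

In this toy input, state 1 contains no mutated bases and so is fully depleted, while state 2 carries three of the four mutations in only four bases and so is enriched. A real analysis would also need a significance test and a choice of which mutations to count, neither of which is specified here.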
【Author keywords】 Infectious diseases, Evolutionary biology, Sequence annotation
【Abstract keywords】 Coronaviruses, coronavirus, Variation, Genome, mutations, Region, severity of COVID-19, SARS-CoV-2 genome, Comparative genomics, CoV, SARS-CoV-2 mutations, information, resource, predict, Sequence alignment, sequence alignments, functional consequences, sequence, Ernst, Kwon, protein domains, Genes, applied, can be used, assist, functional consequence, the SARS-CoV-2 genome
【Title keywords】 the SARS-CoV-2 genome