Background: COVID-19 became a global pandemic not long after its identification in late 2019. The genomes of SARS-CoV-2 are being rapidly sequenced and shared on public repositories. To keep up with these updates, scientists need to frequently refresh and reclean data sets, which is an ad hoc and labor-intensive process. Further, scientists with limited bioinformatics or programming knowledge may find it difficult to analyze SARS-CoV-2 genomes.

Objective: To address these challenges, we developed CoV-Seq, an integrated web server that enables simple and rapid analysis of SARS-CoV-2 genomes.

Methods: CoV-Seq is implemented in Python and JavaScript. The web server and source code URLs are provided in this article.

Results: Given a new sequence, CoV-Seq automatically predicts gene boundaries and identifies genetic variants, which are displayed in an interactive genome visualizer and are downloadable for further analysis. A command-line interface is available for high-throughput processing. In addition, we aggregated all publicly available SARS-CoV-2 sequences from the Global Initiative on Sharing Avian Influenza Data (GISAID), National Center for Biotechnology Information (NCBI), European Nucleotide Archive (ENA), and China National GeneBank (CNGB), and extracted genetic variants from these sequences for download and downstream analysis. The CoV-Seq database is updated weekly.

Conclusions: We have developed CoV-Seq, an integrated web service for fast and easy analysis of custom SARS-CoV-2 sequences. The web server provides an interactive module for the analysis of custom sequences and a weekly updated database of genetic variants of all publicly accessible SARS-CoV-2 sequences. We believe CoV-Seq will help improve our understanding of the genetic underpinnings of COVID-19.
Author keywords: COVID-19, SARS-CoV-2, Genome, bioinformatics, virus, genetics, web server, data sets, sequence, programming

Abstract keywords: knowledge, Genetic, Biotechnology, database, global pandemic, genetic variants, Genetic variant, GISAID, predict, Analysis, SARS-CoV-2 sequences, SARS-CoV-2 genomes, help, SARS-CoV-2 sequence, center, NCBI, European Nucleotide Archive, initiative, objective, downstream analysis, IMPROVE, Result, identify, sequenced, addition, provided, provide, automatically, China National GeneBank, CNGB, ENA, interactive module, labor-intensive

Title keywords: Usability, genome analysis, development, New
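
To make the variant identification step mentioned in the Results more concrete, the following is a minimal Python sketch of how substitutions and single-base deletions could be read off a pairwise alignment between a reference genome and a query sequence. It is an illustration only and is not taken from the CoV-Seq source code: the names `Variant` and `call_variants` are hypothetical, the alignment is assumed to have been computed beforehand by some external aligner, and the toy sequences are invented for the example.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical record type for this sketch (not CoV-Seq's own format)."""
    pos: int   # 1-based position on the reference
    ref: str   # reference allele
    alt: str   # query allele ("-" marks a deleted base)


def call_variants(aligned_ref: str, aligned_query: str) -> List[Variant]:
    """List substitutions and single-base deletions from a pairwise alignment.

    Both inputs must have the same length, with "-" marking alignment gaps.
    Insertions relative to the reference (gaps in aligned_ref) are skipped
    in this sketch.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")

    variants = []
    ref_pos = 0  # number of reference bases consumed so far
    for r, q in zip(aligned_ref.upper(), aligned_query.upper()):
        if r == "-":
            continue  # insertion in the query; reference coordinate does not advance
        ref_pos += 1
        if q == "N":
            continue  # ambiguous base call: not treated as a variant
        if q != r:
            variants.append(Variant(ref_pos, r, q))
    return variants


# Toy alignment (invented data): one substitution, one deletion, one insertion.
toy_ref = "ATGGCTA-GC"
toy_qry = "ATGACT-TGC"
for v in call_variants(toy_ref, toy_qry):
    print(f"{v.pos}\t{v.ref}\t{v.alt}")
```

Reporting positions in reference coordinates, rather than alignment columns, keeps variants from different query sequences comparable, which is what an aggregated variant database would need. A production pipeline would also have to handle insertions, multi-base indels, and annotation against gene boundaries, all of which this sketch leaves out.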