Abstract
Background: Efficiently calling variants from the growing volume of sequencing data produced daily for multiple viral strains is essential for tracking the spread of those strains across the globe, as the COVID-19 pandemic demonstrated.
Results: We present MALVIRUS, an easy-to-install and easy-to-use application that assists users in the multiple tasks required to analyze a viral population, such as SARS-CoV-2. MALVIRUS allows users to: (1) construct a variant catalog consisting of a set of variations (SNPs/indels) from the population sequences, (2) efficiently genotype and annotate the variants of the catalog that are supported by a read sample, and (3) when the viral species under consideration is SARS-CoV-2, assign the input sample to the most likely Pango lineages using the genotyped variations.
Conclusions: Tests on Illumina and Nanopore samples demonstrated the efficiency and effectiveness of MALVIRUS in analyzing SARS-CoV-2 strain samples with respect to the publicly available data provided by NCBI and the more complete dataset provided by GISAID. A comparison with state-of-the-art tools showed that MALVIRUS is always more precise and often has better recall.
Keywords: Genotyping; Lineage classification; SARS-CoV-2; Sequence analysis; Virus.
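
The precision and recall figures mentioned in the Conclusions can be illustrated with a minimal sketch, assuming that each variant is identified by its position, reference allele, and alternate allele, and that a truth set is available for comparison. The function name `precision_recall`, the `Variant` type, and the toy variant sets are hypothetical and introduced only for illustration; they are not part of MALVIRUS and do not reproduce any result from the evaluation.

```python
from typing import Set, Tuple

# Assumption: a variant is keyed by (position, reference allele, alternate allele).
Variant = Tuple[int, str, str]


def precision_recall(called: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Return (precision, recall) of a called variant set against a truth set."""
    true_positives = len(called & truth)
    # Precision: fraction of called variants that are in the truth set.
    precision = true_positives / len(called) if called else 0.0
    # Recall: fraction of truth variants that were called.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Illustrative toy data only, not taken from the paper's evaluation.
truth = {(241, "C", "T"), (3037, "C", "T"), (14408, "C", "T"), (23403, "A", "G")}
called = {(241, "C", "T"), (3037, "C", "T"), (23403, "A", "G"), (28881, "G", "A")}

precision, recall = precision_recall(called, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the four called variants are in the truth set and three of the four truth variants are called. A tool with higher precision reports fewer spurious variants, while a tool with higher recall misses fewer true ones.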