Summary

Tracking the emergence and spread of pathogen variants is an important component of monitoring infectious disease outbreaks. To that end, accurately estimating the number and prevalence of pathogen variants in a population requires carefully designed surveillance programs. However, current approaches to calculating the number of pathogen samples needed for effective surveillance often do not account for the various processes that can bias which infections are detected and which samples are ultimately characterized as a specific variant. In this article, we introduce a framework that accounts for the logistical and epidemiological processes that may bias variant characterization, and we demonstrate how to use this framework (implemented in a publicly available tool) to calculate the number of sequences needed for surveillance. Our framework is designed to be easy to use while also flexible enough to be adapted to various pathogens and surveillance scenarios.

Highlights

• Tracking pathogen variant spread is important for infectious disease monitoring
• Sample size calculations should account for biological and logistical biases
• A simple framework can be used to calculate samples needed for pathogen surveillance
• Sample size calculation framework is implemented in the R package phylosamp and an Excel workbook

Wohl et al. present a framework for calculating the number of pathogen genome sequences needed for variant surveillance, or for calculating confidence in variant detection or prevalence estimates given a sample size. This study presents concrete examples of sample size calculations, with an emphasis on usability and flexibility.
Author keywords: SARS-CoV-2, variants of concern, Infectious disease, pathogen variants, pathogen genomics, Sample size calculations, variant surveillance
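
To make the kind of calculation described in the summary concrete, the following is a minimal Python sketch of the textbook binomial baseline for variant detection. It is not the framework presented by Wohl et al. and does not use the phylosamp API: it assumes sequenced samples are an unbiased, independent random draw from all infections, which is exactly the assumption the article argues is often violated. The function names `samples_for_detection` and `detection_probability` are hypothetical and chosen only for illustration.

```python
import math


def samples_for_detection(prevalence: float, prob_detect: float = 0.95) -> int:
    """Smallest n with 1 - (1 - prevalence)**n >= prob_detect.

    Assumption (not from the article): every sequenced sample is an
    independent, unbiased draw, so each one is the variant of interest
    with probability `prevalence`.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if not 0.0 < prob_detect < 1.0:
        raise ValueError("prob_detect must be strictly between 0 and 1")
    # Solve (1 - prevalence)**n <= 1 - prob_detect for n and round up.
    return math.ceil(math.log(1.0 - prob_detect) / math.log(1.0 - prevalence))


def detection_probability(prevalence: float, n_samples: int) -> float:
    """Probability that at least one of n_samples is the variant,
    under the same unbiased-sampling assumption."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if n_samples < 0:
        raise ValueError("n_samples must be non-negative")
    return 1.0 - (1.0 - prevalence) ** n_samples


if __name__ == "__main__":
    n = samples_for_detection(prevalence=0.01, prob_detect=0.95)
    print(f"sequences needed: {n}")
    print(f"detection probability at n: {detection_probability(0.01, n):.4f}")
```

Under these simplifying assumptions, the two functions answer the two questions named in the summary: how many sequences are needed for a target detection probability, and how confident one can be given a fixed sample size. Any bias in which infections are detected or sequenced changes the effective prevalence among samples, so the numbers produced here are only a starting point.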