Analysis of SARS-CoV-2 genetic diversity within infected hosts can provide insight into the generation and spread of new viral variants and may enable high resolution inference of transmission chains. However, little is known about temporal aspects of SARS-CoV-2 intrahost diversity and the extent to which shared diversity reflects convergent evolution as opposed to transmission linkage. Here we use high depth of coverage sequencing to identify within-host genetic variants in 325 specimens from hospitalized COVID-19 patients and infected employees at a single medical center. We validated our variant calling by sequencing defined RNA mixtures and identified viral load as a critical factor in variant identification. By leveraging clinical metadata, we found that intrahost diversity is low and does not vary by time from symptom onset. This suggests that variants will only rarely rise to appreciable frequency prior to transmission. Although there was generally little shared variation across the sequenced cohort, we identified intrahost variants shared across individuals who were unlikely to be related by transmission. These variants did not precede a rise in frequency in global consensus genomes, suggesting that intrahost variants may have limited utility for predicting future lineages. These results provide important context for sequence-based inference in SARS-CoV-2 evolution and epidemiology.

Author summary

Understanding the evolution and transmission of SARS-CoV-2 is important for designing public health interventions to prevent outbreaks. Viral genome sequencing has been widely used to reconstruct patterns of SARS-CoV-2 transmission through communities and to monitor the spread of new strains.
However, because SARS-CoV-2 can transmit multiple times before a new mutation fixes, consensus sequences often cannot determine “who infected whom.” Identifying individuals who share the same viral genetic variants at low frequencies within each infection may help resolve this problem, but to do this we need to accurately identify within-host genetic variants and understand how they evolve and spread. We investigated within-host diversity of SARS-CoV-2 with samples collected in southeastern Michigan in March–May 2020. We show that there are relatively few genetic variants present in any given infection, and variants do not tend to accumulate in people over time. We also found that people who are not part of the same epidemic cluster can share the same within-host variants, due to chance or various evolutionary forces.