Background: SARS-CoV-2, a positive-sense RNA virus in the family Coronaviridae, has caused a worldwide pandemic of coronavirus disease 2019 (COVID-19). Coronaviruses generate a tiered series of subgenomic RNAs (sgRNAs) through a process involving homology between transcriptional regulatory sequences (TRS) located after the leader sequence in the 5′ UTR (the TRS-L) and TRS located near the start of ORFs encoding structural and accessory proteins (TRS-B) near the 3′ end of the genome. In addition to the canonical sgRNAs generated by SARS-CoV-2, non-canonical sgRNAs (nc-sgRNAs) have been reported. However, the consistency of these nc-sgRNAs across viral isolates and infection conditions is unknown. The comprehensive definition of SARS-CoV-2 RNA products is a key step in understanding SARS-CoV-2 pathogenesis.

Methods: Here, we report an integrative analysis of eight independent SARS-CoV-2 transcriptomes generated using three sequencing strategies, five host systems, and seven viral isolates. Read-mapping to the SARS-CoV-2 genome was used to determine the 5′ and 3′ coordinates of all junctions in viral RNAs identified in these samples.

Results: Using junctional abundances, we show that nc-sgRNAs make up as much as 33% of total sgRNAs in cell culture models of infection, are largely consistent in abundance across independent transcriptomes, and increase in abundance over time during infection. By assessing the homology between sequences flanking the 5′ and 3′ junction points, we show that nc-sgRNAs are not associated with TRS-like homology. By incorporating read coverage information, we find strong evidence for subgenomic RNAs that contain only 5′ regions of ORF1a. Finally, we show that non-canonical junctions change the landscape of viral open reading frames.

Conclusions: We identify canonical and non-canonical junctions in SARS-CoV-2 sgRNAs and show that these RNA products are consistently generated by many independent viral isolates and sequencing approaches. These analyses highlight the diverse transcriptional activity of SARS-CoV-2 and offer important insights into SARS-CoV-2 biology.

Supplementary information: The online version contains supplementary material available at 10.1186/s13073-020-00802-w.
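
As an illustration of what assessing homology between the sequences flanking the 5′ and 3′ junction points can look like, the sketch below counts how many bases are identical on both sides of a junction's two breakpoints. It is a minimal example under stated assumptions, not the authors' pipeline: the function name `junction_homology`, the 0-based breakpoint convention, and the toy genome are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy genome: the 6-base motif ACGAAC occurs twice,
# at indices 2-7 and 18-23 (0-based), separated by unrelated sequence.
TOY_GENOME = "GG" + "ACGAAC" + "T" * 8 + "CC" + "ACGAAC" + "AAGG"


def junction_homology(genome: str, donor: int, acceptor: int) -> int:
    """Count identical bases flanking the two breakpoints of a junction.

    Assumed convention (not from the paper): a junction read consists of
    genome[:donor] followed by genome[acceptor:], with donor < acceptor.
    """
    if not 0 <= donor < acceptor <= len(genome):
        raise ValueError("expected 0 <= donor < acceptor <= len(genome)")

    # Extend leftwards: compare bases immediately upstream of each breakpoint.
    left = 0
    while donor - 1 - left >= 0 and (
        genome[donor - 1 - left] == genome[acceptor - 1 - left]
    ):
        left += 1

    # Extend rightwards: compare bases starting at each breakpoint.
    right = 0
    while acceptor + right < len(genome) and (
        genome[donor + right] == genome[acceptor + right]
    ):
        right += 1

    return left + right


# A junction joining the two copies of the motif shares 6 flanking bases.
print(junction_homology(TOY_GENOME, 8, 24))   # 6
# A junction between unrelated positions shares none.
print(junction_homology(TOY_GENOME, 12, 26))  # 0
```

In this toy setting, a junction with a long shared flank resembles a TRS-mediated (canonical) event, while a junction with little or no shared flank resembles the non-canonical junctions described in the Results. Any threshold separating the two classes would be an additional assumption.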
【Author keywords】 COVID-19, SARS-CoV-2, Direct RNA sequencing, Transcription
【Abstract keywords】 coronavirus disease, Coronavirus disease 2019, Sequencing, Genome, Infection, RNA, Region, Coverage, Viral, SARS-CoV-2 genome, subgenomic RNA, Cell culture, leader sequence, sgRNA, SARS-CoV-2 RNA, Integrative analysis, RNA virus, information, SARS-CoV-2 pathogenesis, accessory protein, ORFs, Evidence, open reading frames, transcriptomes, make up, Consistency, supplementary material, viral RNAs, approaches, sequences, worldwide pandemic, sequence, abundance, landscape, family Coronaviridae, 3′ end, 5′ UTR, canonical sgRNAs, homology, nc-sgRNAs, non-canonical sgRNAs, ORF1a, positive-sense RNA virus, SARS-CoV-2 biology, SARS-CoV-2 sgRNAs, SARS-CoV-2 transcriptomes, sgRNAs, subgenomic RNAs, transcriptional regulatory sequences, TRS-B, TRS-L, viral isolates, offer, transcriptional activity, Host, FIVE, TRS, independent, highlight, junction, Seven, SARS-CoV-2 transcriptome, Result, identify, was used, caused, reported, addition, generate, eight, ORF, condition, determine, in viral, analysis, increase in, canonical, Coordinate, canonical sgRNA, nc-sgRNA, non-canonical sgRNA, SARS-CoV-2 sgRNA, the SARS-CoV-2 genome, transcriptional regulatory sequence, viral isolate
【Title keywords】 non-canonical subgenomic RNA