Background: Viral protein expression in Escherichia coli (E. coli) is a powerful tool for structural and functional studies as well as for vaccine and diagnostics development. However, numerous factors, such as codon bias, mRNA secondary structure and nucleotide distribution, have been found to hamper this heterologous expression.

Results: In this study, we combined computational and biochemical methods to analyze the influence of these factors on the expression of different segments of the hepatitis E virus (HEV) ORF2 protein and the hepatitis B virus surface antigen (HBsAg). Three of the five HEV antigens were expressed, whereas none of the three HBsAg fragments was. The computational analysis revealed a significant difference in nucleotide distribution between expressed and non-expressed genes. All the non-expressing constructs shared similar, stable 5′-end mRNA secondary structures that reduced the accessibility of both the Shine-Dalgarno (SD) sequence and the AUG start codon. Modifying the 5′-end of the non-expressed HEV and HBV genes significantly increased the total free energy of the mRNA secondary structures (that is, made them less stable), which exposed the SD sequence and the start codon and, in turn, led to the successful expression of these genes in E. coli.

Conclusions: This study demonstrates that the mRNA secondary structure near the start codon is the key limiting factor for efficient expression of HEV ORF2 proteins in E. coli. It also describes a simple and effective strategy for producing viral proteins of different lengths for comparative immunogenicity and antigenicity studies during vaccine and diagnostics development.

Electronic supplementary material: The online version of this article (10.1186/s12934-017-0812-8) contains supplementary material, which is available to authorized users.
【Author keywords】 recombinant protein expression, RNA secondary structure, Escherichia coli, hepatitis E virus
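
The comparison of nucleotide distribution between expressed and non-expressed genes mentioned in the Results can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: the function names, the optional 5′ window, and the two toy sequences are all hypothetical, chosen only to show how base fractions near the start codon might be tabulated. Free-energy prediction of the mRNA structure would need a dedicated folding tool and is not attempted here.

```python
from collections import Counter
from typing import Dict, Optional


def nucleotide_fractions(seq: str, window: Optional[int] = None) -> Dict[str, float]:
    """Return the fraction of A, C, G and U in a sequence.

    DNA input is converted to RNA (T -> U). If `window` is given, only the
    first `window` nucleotides (the 5' end) are counted. Characters other
    than A, C, G, U still count toward the length, so the four fractions
    may sum to less than 1 for ambiguous input.
    """
    rna = seq.upper().replace("T", "U")
    if window is not None:
        rna = rna[:window]
    if not rna:
        raise ValueError("sequence is empty")
    counts = Counter(rna)
    return {base: counts.get(base, 0) / len(rna) for base in "ACGU"}


def gc_content(seq: str, window: Optional[int] = None) -> float:
    """Return the combined G + C fraction, optionally over a 5' window."""
    fractions = nucleotide_fractions(seq, window)
    return fractions["G"] + fractions["C"]


if __name__ == "__main__":
    # Hypothetical toy sequences, NOT taken from HEV ORF2 or HBsAg.
    toy_constructs = {
        "toy_at_rich": "ATGAAAATTAAAGCAACTTTA",
        "toy_gc_rich": "ATGGCCGGCGCGCCCGGGTGC",
    }
    for name, seq in toy_constructs.items():
        print(name, nucleotide_fractions(seq), round(gc_content(seq), 2))
```

Restricting the count to a short 5′ window is one assumed way to focus on the region around the SD sequence and start codon, where the abstract locates the inhibitory structures; the appropriate window length would depend on the construct.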