EP4366767A2

EP4366767A2 - Oligonucleotides and viral untranslated region (utr) for increasing expression of target genes and proteins

Info

Publication number: EP4366767A2
Application number: EP22838408.7A
Authority: EP
Inventors: Wenhui Hu
Original assignee: Temple University of Commonwealth System of Higher Education
Current assignee: Temple University of Commonwealth System of Higher Education
Priority date: 2021-07-08
Filing date: 2022-07-07
Publication date: 2024-05-15
Also published as: WO2023283342A2; WO2023283342A3; CA3226284A1

Abstract

A novel, small (21-mer oligonucleotide) and unique cis-regulatory coding motif can greatly enhance the production of a variety of different types of proteins in mammalian cells, ranging from viral transcripts/proteins, endogenous gene products, vaccines, and antibodies to engineered recombinant proteins. The combination of novel peptide tag(s) having specified short amino acid sequences or derivatives thereof and the untranslated region (UTR) of viruses (snUTR) enhanced production of tagged proteins, including viral transcripts/proteins, endogenous gene products, vaccines, antibodies, and engineered recombinant proteins, in a cell in vitro, ex vivo and in vivo.

Description

OLIGONUCLEOTIDES AND VIRAL UNTRANSLATED REGION (UTR) FOR INCREASING EXPRESSION OF TARGET GENES AND PROTEINS

CROSS REFERENCE TO RELATED APPLICATIONS

This Application claims the benefit of U.S. Provisional Application 63/332,378 filed on April 19, 2022, U.S. Provisional Application 63/219,596 filed on July 8, 2021, U.S. Provisional Application 63/219,599 filed on July 8, 2021, and U.S. Provisional Application 63/219,587 filed on July 8, 2021. The entire contents of these applications are incorporated herein by reference in their entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

This disclosure was made with government support under Grant Number 1R01AI145034 awarded by the National Institutes of Health. The government has certain rights in the disclosure.

FIELD

This disclosure relates to novel oligonucleotides, peptide tag(s) having specified short nucleotide sequences or derivatives thereof, as well as the native untranslated region (UTR) of SARS-CoV-2 (snUTR). Methods utilizing these novel molecules include enhancing production of the targeted proteins, including viral transcripts/proteins, endogenous gene products, vaccines, antibodies, and engineered recombinant proteins, in a cell in vitro, ex vivo and in vivo.

BACKGROUND

Many technologies have been developed to enhance protein expression/production, such as promoter optimization, mRNA stabilization, codon optimization, coding regulation, and protein stabilization, as well as modification of host cellular expression machinery, including humanized yeast systems. Although these optimization strategies have been extensively utilized in the biopharmaceutical industry and biomedical research, additional enhancing technology is still vital to help reduce the cost and speed up production. Computational analysis recently identified the secretion-enhancing cis-regulatory targeting element (SECReTE) that facilitates ER-localized mRNA translation and protein secretion (Cohen-Zontag, Baez et al. 2019). This SECReTE motif is enriched in nearly all mRNAs encoding secreted/membrane proteins in eukaryotes and its addition results in enhanced protein secretion (Cohen-Zontag, Baez et al. 2019). It also boosts protein expression and secretion when added to an mRNA for an exogenously expressed protein such as GFP (Cohen-Zontag, Baez et al. 2019). Various types of peptide (epitope) tags such as Flag, Myc, HA, Ollas, V5, His, C7, and T7 have demonstrated functions in protein labeling, affinity purification, and immune detection (DeCaprio and Kohl, 2019; Katayama et al., 2021; Lee et al., 2020; Mishra, 2020; Peighambardoust et al., 2021; Pina et al., 2021; Traenkle et al., 2020). However, no tagging peptides have been identified that enhance the expression/production of the targeted proteins in mammalian cells.

The 5’-UTR within the SARS-CoV-2 genome is critical to initiate the generation of the entire genomic and subgenomic transcripts (Baldassarre et al., 2020; Yang and Leibowitz, 2015). The 3’-UTR also regulates the viral genome expression and replication (Chan et al., 2020; Zhao et al., 2020). Both the 5’-UTR and 3’-UTR are highly conserved among SARS-CoV genomes and their variants (Baldassarre et al., 2020; Bottaro et al., 2021; Rangan et al., 2020; Rouchka et al., 2020; Ryder et al., 2021; Yang and Leibowitz, 2015). Recent computational studies have identified a very stable four-way junction of the 5’-UTR close to the AUG start codon (Miao et al., 2020).

SUMMARY

Embodiments are directed to novel chimeric molecules comprising an oligonucleotide comprising a cis-regulatory coding motif, a peptide tag, a 5’-untranslated region (5’-UTR), a 3’-untranslated region (3’-UTR) and combinations thereof for use in the enhanced production and expression of a desired biomolecule. The synergistic boosting effect observed has extensive applications and broad research interest. For industrial applications, the strategy will reduce the cost of many widely used products, such as vaccines, antibodies, recombinant proteins, and therapeutic gene products, and facilitate their availability. An immediate and highly important usage of this system would be to boost mRNA vaccines against COVID-19 variants. For biomedical research, novel chimeric molecules will stimulate interest in exploring novel oligonucleotides and peptides that regulate protein expression and secretion as well as screening additional viral native UTRs for protein production boost.

In certain aspects, a composition comprises an expression-enhancing oligonucleotide that has between 15 and 30 nucleic acid bases and includes a cis-regulatory coding motif that is located in the coding regions and retains the open reading frame (ORF) with targeted genes. In certain embodiments, the expression-enhancing oligonucleotide comprises twenty-one nucleic acid bases. In certain embodiments, the expression-enhancing oligonucleotide comprises a nucleic acid sequence having at least a 75% sequence identity to cAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7). In certain embodiments, the expression-enhancing oligonucleotide comprises a nucleic acid sequence having at least a 95% sequence identity to cAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7). In certain embodiments, the expression-enhancing oligonucleotide comprises a nucleic acid sequence comprising cAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7).
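The identity thresholds recited for SEQ ID NO: 7 can be illustrated with a minimal sketch. This passage does not state how sequence identity is to be computed, so the function below assumes the simplest case: an ungapped, position-by-position comparison of two equal-length sequences. The name `percent_identity` and the single-substitution variant are hypothetical and serve only as an illustration.

```python
# SEQ ID NO: 7 as given in the text, upper-cased for comparison.
SEQ_ID_NO_7 = "CAACCGCGGTTCGCGGCCGCT"


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity of two equal-length sequences.

    Assumption: no alignment step; positions are compared one-to-one.
    """
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Hypothetical variant: the final base T is replaced by A.
variant = SEQ_ID_NO_7[:-1] + "A"

print(len(SEQ_ID_NO_7))                                   # 21
print(round(percent_identity(SEQ_ID_NO_7, variant), 1))   # 95.2
```

Under this ungapped assumption, a 21-base sequence needs at least 16 matching positions to reach 75% identity and at least 20 matching positions to reach 95% identity.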

In another aspect, a synthetic oligonucleotide comprises a nucleic acid sequence having at least a 75% sequence identity to cAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7). In certain embodiments, the synthetic oligonucleotide comprises a nucleic acid sequence having at least a 95% sequence identity to cAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7). In certain embodiments, the oligonucleotide comprises a nucleic acid sequence comprising cAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7). In certain embodiments, the oligonucleotide encodes a peptide comprising an amino acid sequence having at least a 70% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the oligonucleotide encodes a peptide comprising an amino acid sequence having at least a 90% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the oligonucleotide encodes a peptide comprising the amino acid sequence QPRFAAA (SEQ ID NO: 1).
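The relationship between the 21-mer (SEQ ID NO: 7) and the seven-residue peptide (SEQ ID NO: 1) can be checked with a short sketch. It reads the oligonucleotide in frame from its first base and uses only the standard genetic code entries needed for these seven codons; the helper name `translate` is hypothetical.

```python
# Standard genetic code, restricted to the codons present in SEQ ID NO: 7.
CODON_TABLE = {
    "CAA": "Q",
    "CCG": "P",
    "CGG": "R",
    "TTC": "F",
    "GCG": "A",
    "GCC": "A",
    "GCT": "A",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence codon by codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = translate("cAACCGCGGTTCGCGGCCGCT")
print(peptide)  # QPRFAAA
```

The seven codons CAA-CCG-CGG-TTC-GCG-GCC-GCT give Q-P-R-F-A-A-A, consistent with the statement that the oligonucleotide encodes QPRFAAA.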

In another aspect, a construct comprises the synthetic oligonucleotide embodied herein.

In another aspect, a chimeric molecule comprises one or more peptide domains and one or more 5’- and/or 3’-untranslated region (UTR) sequences or fragments thereof. In certain embodiments, the one or more peptide domains comprise from about five amino acids to about twenty amino acids. In certain embodiments, the one or more peptide domains comprise about seven amino acids. In certain embodiments, the one or more peptide domains comprise an amino acid sequence having at least a 70% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the peptide comprises an amino acid sequence having at least a 90% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the peptide comprises the amino acid sequence QPRFAAA (SEQ ID NO: 1). In certain embodiments, the peptide comprises X_n-QPRFAAA-X_n, wherein n is independently 0 or greater than or equal to 1 and each X is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof. In certain embodiments, the one or more 5’-untranslated region (UTR) sequences or fragments thereof are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, togaviruses, orthomyxoviruses, rhabdoviruses or combinations thereof. In certain embodiments, the 5’-UTR and/or 3’-UTR are from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 3’-UTR. In certain embodiments, the chimeric molecule further comprises one or more biomolecules operably linked to the one or more peptide domains and/or the one or more 5’-UTR and/or 3’-UTR sequences. In certain embodiments, the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics. In certain embodiments, the chimeric molecule further comprises one or more promoters and/or regulatory sequences operably linked to the UTRs or biomolecule.

In another aspect, a host cell comprises an oligonucleotide embodied herein, or a chimeric molecule embodied herein.

In another aspect, a construct encodes an oligonucleotide embodied herein, or a chimeric molecule embodied herein.

In another aspect, a method of enhancing production of biomolecules comprises tagging a desired peptide or a nucleic acid sequence with the chimeric molecule of any one of claims 1-34, by fusion or cloning, expressing the peptide or nucleic acid sequence, and harvesting the protein. In certain embodiments, the proteins comprise: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

In another aspect, a nucleic acid comprises a promoter, a 5’-untranslated region (5’-UTR) sequence, a biomolecule of interest, an oligonucleotide comprising a cis-regulatory coding motif, a 3’-untranslated region (3’-UTR) sequence and combinations thereof. In certain embodiments, the one or more 5’-untranslated region (UTR) and/or 3’-UTR sequences or fragments thereof are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, togaviruses, orthomyxoviruses, rhabdoviruses or combinations thereof. In certain embodiments, the 5’-UTR and/or 3’-UTR are derived from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 3’-UTR.
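The element list above (promoter, 5’-UTR, biomolecule of interest, motif-containing oligonucleotide, 3’-UTR) can be pictured as a simple ordered assembly. The sketch below is illustrative only: the class `Construct`, the placement of the motif at the 3’ end of the coding sequence, the `TAA` stop codon, and every placeholder sequence are assumptions introduced here, not details taken from the disclosure. The only property it checks is that appending the 21-base motif keeps the coding region in frame.

```python
from dataclasses import dataclass

MOTIF = "CAACCGCGGTTCGCGGCCGCT"  # SEQ ID NO: 7 (21 nt, i.e. 7 codons)


@dataclass
class Construct:
    """Hypothetical container for the elements named in the text."""
    promoter: str
    utr5: str
    cds: str   # coding sequence of the biomolecule, without a stop codon
    utr3: str

    def tagged_cds(self) -> str:
        """Append the motif in frame (assumed C-terminal), then a stop codon."""
        if len(self.cds) % 3 != 0:
            raise ValueError("coding sequence length must be a multiple of 3")
        return self.cds + MOTIF + "TAA"

    def assemble(self) -> str:
        """Concatenate the elements in the order listed in the text."""
        return self.promoter + self.utr5 + self.tagged_cds() + self.utr3


# Placeholder sequences only; these are not real promoter or UTR sequences.
c = Construct(promoter="NNNN", utr5="NNNNN", cds="ATGGCC", utr3="NNN")
print(len(c.tagged_cds()) % 3 == 0)  # True
```

Because the motif is exactly seven codons long, inserting it at any codon boundary leaves the downstream reading frame unchanged.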

In another aspect, a chimeric molecule comprises one or more oligonucleotides comprising a nucleic acid sequence of cAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7) and one or more 5’- and/or 3’-untranslated region (UTR) sequences or fragments thereof. In certain embodiments, the one or more oligonucleotides encode a peptide comprising from about five amino acids to about twenty amino acids. In certain embodiments, the one or more peptides comprise about seven amino acids. In certain embodiments, the one or more peptides comprise an amino acid sequence having at least a 70% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptides comprise an amino acid sequence having at least a 90% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptides comprise the amino acid sequence QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptides comprise a sequence comprising X_n-QPRFAAA-X_n, wherein n is independently 0 or greater than or equal to 1 and each X is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof. In certain embodiments, the one or more 5’-untranslated region (UTR) sequences or fragments thereof are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, togaviruses, orthomyxoviruses, rhabdoviruses or combinations thereof. In certain embodiments, the 5’-UTR and/or 3’-UTR are from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 3’-UTR. In certain embodiments, the chimeric molecule further comprises one or more biomolecules operably linked to the one or more oligonucleotides and/or the one or more 5’-UTR and/or 3’-UTR sequences. In certain embodiments, the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics. In certain embodiments, the chimeric molecule further comprises one or more promoters and/or regulatory sequences operably linked to the UTRs or biomolecule.

In another aspect, an expression vector comprises the nucleic acids embodied herein.

In another aspect, a novel peptide tag comprises a specified short amino acid sequence or its derivative. In certain embodiments the peptide tag is about 5 to about 10 amino acids in length. In certain embodiments, the peptide tag is about 7 amino acids in length. In certain embodiments, the peptide tag comprises two or more tandem repeats of peptides.

In certain aspects, a synthetic peptide tag comprises an amino acid sequence unit of about five to about fifteen amino acids wherein the N-terminal and/or C-terminal amino acids are linked or fused to a target molecule. In certain embodiments, the amino acid sequence unit comprises seven amino acids. In certain embodiments, the amino acid sequence comprises at least a 70% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the amino acid sequence comprises at least a 90% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the amino acid sequence comprises the amino acid sequence QPRFAAA (SEQ ID NO: 1). In certain embodiments, the amino acid sequence comprises X_n-QPRFAAA-X_n, wherein n is independently 0 or greater than or equal to 1 and each X is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof. In certain embodiments, the synthetic peptide tag further comprises a plurality of repeating amino acid sequence units. In certain embodiments, the repeating amino acid sequence units are in tandem. In certain embodiments, the amino acid sequence units are separated by linker molecules or one or more amino acids.
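The tandem-repeat and linker-separated arrangements described above can be sketched as simple string assembly. The function name `tandem_tag` and the example linker `GGS` are hypothetical; the text does not name a specific linker, only that units may be separated by linker molecules or one or more amino acids.

```python
TAG_UNIT = "QPRFAAA"  # SEQ ID NO: 1


def tandem_tag(repeats: int, linker: str = "") -> str:
    """Join `repeats` copies of the tag unit, optionally separated by a linker."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return linker.join([TAG_UNIT] * repeats)


print(tandem_tag(3))          # QPRFAAAQPRFAAAQPRFAAA
print(tandem_tag(2, "GGS"))   # QPRFAAAGGSQPRFAAA (GGS is an illustrative linker)
```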

In another aspect, a synthetic peptide comprises the structure: (AA-AA-AA-AA-AA-AA_z-AA_z)_x, wherein x is greater than or equal to 1, z is 0 or 1 and each AA is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof.

In another aspect, a synthetic peptide comprises the structure: AA1-AA2-AA3-AA4-AA5-AA6-AA7, wherein each AA is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof.

In another aspect, a synthetic peptide comprises an amino acid sequence comprising the structure: X_n-QPRFAAA-X_n, wherein n is independently 0 or greater than or equal to 1 and each X is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof.
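The flanked-core structure recited above (zero or more residues, then QPRFAAA, then zero or more residues) maps naturally onto a regular expression. The sketch below is an illustration under stated assumptions: it uses one-letter codes, represents selenocysteine and pyrrolysine by their conventional letters U and O, and does not attempt to represent modified amino acids. The name `matches_core` is hypothetical.

```python
import re

# One-letter codes for the 20 standard residues plus U (Sec) and O (Pyl).
RESIDUES = "ACDEFGHIKLMNPQRSTVWYUO"

# Zero or more flanking residues on either side of the QPRFAAA core.
PATTERN = re.compile(rf"[{RESIDUES}]*QPRFAAA[{RESIDUES}]*")


def matches_core(peptide: str) -> bool:
    """Return True if the whole peptide fits the flanked-core pattern."""
    return PATTERN.fullmatch(peptide.upper()) is not None


print(matches_core("QPRFAAA"))      # True  (n = 0 on both sides)
print(matches_core("MGQPRFAAAK"))   # True  (hypothetical flanking residues)
print(matches_core("QPRFAA"))       # False (core is truncated)
```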

In another aspect, a fusion protein comprises a synthetic peptide embodied herein fused to one or more target peptides. In certain embodiments, two or more synthetic peptides embodied herein are fused to a target peptide. In another aspect, a fusion molecule comprises a synthetic peptide embodied herein fused to one or more biomolecules. In certain embodiments, the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

In another aspect, a method of enhancing production of proteins comprises tagging a desired peptide or a nucleic acid sequence with the peptide tag embodied herein, by fusion or cloning, expressing the peptide or nucleic acid sequence, and harvesting the protein. In certain embodiments, the proteins comprise: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

In certain aspects, a composition comprises a peptide-tagged biomolecule embodied herein and a pharmaceutically acceptable excipient, diluent or carrier.

In another aspect, a nucleic acid encodes the peptide tags embodied herein.

In another aspect, an expression vector comprises a nucleic acid encoding the peptide tags embodied herein.

In another aspect, a host cell comprises the expression vector encoding the peptide tags embodied herein.

In certain aspects, a method of utilizing the peptide tag(s) comprises enhancing production of the tagged proteins, including viral transcripts/proteins, endogenous gene products, vaccines, antibodies, and engineered recombinant proteins, in a cell in vitro, ex vivo and in vivo. In certain embodiments, tandem peptide repeats further boost production of a targeted molecule. In certain embodiments, a method of increasing protein production in a cell comprises tagging a target molecule in the cell.

In another aspect, a chimeric molecule comprises one or more 5’- and/or 3’-untranslated region (UTR) sequences or fragments thereof associated with one or more biomolecules. In certain embodiments, the one or more 5’-untranslated region (UTR) and/or 3’-UTR sequences or fragments thereof are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, togaviruses, orthomyxoviruses, rhabdoviruses or combinations thereof. In certain embodiments, the 5’-UTR and/or 3’-UTR are from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 3’-UTR. In certain embodiments, the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics. In certain embodiments, the chimeric molecule further comprises one or more promoters and/or regulatory sequences operably linked to the UTRs or biomolecule.

In another aspect, a host cell comprises the chimeric molecule embodied herein.

In another aspect, a construct encodes the chimeric molecules embodied herein.

In another aspect, a method of enhancing production of biomolecules comprises tagging a desired peptide or a nucleic acid sequence with the chimeric molecules embodied herein, by fusion or cloning, expressing the peptide or nucleic acid sequence, and harvesting the protein. In certain embodiments, the proteins comprise: oligonucleotides, polynucleotides, mRNA vaccines, DNA vaccines, viral transcripts/proteins, antibodies, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

In another aspect, a nucleic acid comprises a promoter, a 5’-untranslated region (5’-UTR) sequence, a biomolecule of interest, a peptide domain, a 3’-untranslated region (3’-UTR) sequence and combinations thereof. In certain embodiments, the one or more 5’-untranslated region (UTR) sequences or fragments thereof are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, togaviruses, orthomyxoviruses, rhabdoviruses or combinations thereof. In certain embodiments, the 5’-UTR and/or 3’-UTR are derived from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 3’-UTR.

In another aspect, a host cell comprises the nucleic acids or expression vectors embodied herein.

Definitions

Unless specifically defined otherwise, all technical and scientific terms used herein shall be taken to have the same meaning as commonly understood by one of ordinary skill in the art (e.g., in cell culture, molecular genetics, and biochemistry).

The term “about” or “approximately” means within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, i.e., the limitations of the measurement system. For example, “about” can mean within 1 or more than 1 standard deviation, per the practice in the art. Alternatively, “about” can mean a range of up to 20%, up to 10%, up to 5%, or up to 1% of a given value or range. Alternatively, particularly with respect to biological systems or processes, the term can mean within an order of magnitude, within 5-fold, and also within 2-fold, of a value. Where particular values are described in the application and claims, unless otherwise stated, the term “about” should be assumed to mean within an acceptable error range for the particular value. It is understood that where a parameter range is provided, all integers within that range, and tenths thereof, are also provided by the disclosure. For example, “0.2-5 mg” is a disclosure of 0.2 mg, 0.3 mg, 0.4 mg, 0.5 mg, 0.6 mg etc. up to and including 5.0 mg.
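The range convention in the “0.2-5 mg” example can be written out explicitly. The sketch below enumerates every tenth within a range; it works in integer tenths to avoid floating-point drift, and the function name `tenths` is hypothetical.

```python
def tenths(lo: float, hi: float) -> list[float]:
    """List every value from lo to hi inclusive, in steps of 0.1."""
    start, stop = round(lo * 10), round(hi * 10)
    return [i / 10 for i in range(start, stop + 1)]


values = tenths(0.2, 5.0)
print(values[:5])    # [0.2, 0.3, 0.4, 0.5, 0.6]
print(values[-1])    # 5.0
print(len(values))   # 49
```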

In the descriptions in the disclosure and in the claims, phrases such as “at least one of” or “one or more of” may occur followed by a conjunctive list of elements or features. The term “and/or” may also occur in a list of two or more elements or features. Unless otherwise implicitly or explicitly contradicted by the context in which it is used, such a phrase is intended to mean any of the listed elements or features individually or any of the recited elements or features in combination with any of the other recited elements or features. For example, the phrases “at least one of A and B;” “one or more of A and B;” and “A and/or B” are each intended to mean “A alone, B alone, or A and B together.” A similar interpretation is also intended for lists including three or more items. For example, the phrases “at least one of A, B, and C;” “one or more of A, B, and C;” and “A, B, and/or C” are each intended to mean “A alone, B alone, C alone, A and B together, A and C together, B and C together, or A and B and C together.” In addition, use of the term “based on” is intended to mean “based at least in part on,” such that an unrecited feature or element is also permissible.
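The interpretation given above amounts to enumerating every non-empty combination of the listed items. A minimal sketch, with the hypothetical helper name `at_least_one_of`:

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of the listed items."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


for combo in at_least_one_of(["A", "B", "C"]):
    print(" and ".join(combo))
# A
# B
# C
# A and B
# A and C
# B and C
# A and B and C
```

For three items this yields seven combinations, in the same order as the enumeration in the text.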

The term “amino acid,” as used herein, encompasses both naturally occurring amino acids and non-naturally occurring amino acids. Examples of non-naturally occurring amino acids include, but are not limited to, D-amino acids (i.e., an amino acid of an opposite chirality to the naturally occurring form), N-α-methyl amino acids, C-α-methyl amino acids, β-methyl amino acids and D- or L-β-amino acids. Other non-naturally occurring amino acids include, for example, β-alanine (β-Ala), norleucine (Nle), norvaline (Nva), homoarginine (Har), 4-aminobutyric acid (γ-Abu), 2-aminoisobutyric acid (Aib), 6-aminohexanoic acid (ε-Ahx), ornithine (Orn), sarcosine, α-amino isobutyric acid, 3-aminopropionic acid, 2,3-diaminopropionic acid (2,3-diaP), D- or L-phenylglycine, D-(trifluoromethyl)-phenylalanine, and D-p-fluorophenylalanine.

As used herein, the term “biomolecule” refers to any of the numerous substances that are produced by cells and living organisms. Biomolecules have a wide range of sizes and structures and perform a vast array of functions. The four major types of biomolecules are carbohydrates, lipids, nucleic acids, and proteins or characteristic associated with the peptide and/or protein of interest. The biomolecules may be used in a variety of applications including, but not limited to, curative agents for diseases (e.g., insulin, interferon, interleukins, anti-angiogenic peptides, tumor necrosis factor); molecules that bind to defined cellular targets such as receptors, channels, lipids, cytosolic proteins, and membrane proteins, to name a few; and biomolecules having antimicrobial activity, antiviral activity, anti-cancer activity, anti-inflammatory activity, and the like.

As used herein, “cleavable linker elements”, “peptide linkers”, and “cleavable peptide linkers” will be used interchangeably and refer to cleavable peptide segments found, in certain embodiments, between peptide tags and the biomolecule, e.g., peptide, of interest. After the peptide tags are separated and/or partially purified or purified from the cell lysate, the cleavable linker elements can be cleaved chemically and/or enzymatically to separate the peptide tag from the biomolecule, e.g., peptide, of interest. The fusion peptide may also include a plurality of regions encoding one or more peptides of interest separated by one or more cleavable peptide linkers. The peptide of interest can then be isolated from the peptide tag, if necessary. In one embodiment, the peptide tag(s) and the peptide of interest exhibit different solubilities in a defined medium (typically an aqueous medium), facilitating separation of the peptide tag from the biomolecule, e.g., polypeptide of interest. In an embodiment, the peptide tag is insoluble in an aqueous solution while the protein/polypeptide of interest is appreciably soluble in an aqueous solution. The pH, temperature, and/or ionic strength of the aqueous solution can be adjusted to facilitate recovery of the peptide of interest. In an embodiment, the differential solubility between the inclusion body tag and the peptide of interest occurs in an aqueous solution having a pH of 4 to 11 and a temperature range of 15 to 50 °C. The cleavable peptide linker may be from 1 to about 50 amino acids, or from 1 to about 20 amino acids, in length. The cleavable peptide linkers may be incorporated into the fusion proteins using any number of techniques well known in the art.
Means to prepare the present peptides (peptide tags, cleavable peptide linkers, peptides of interest, and fusion peptides) are well known in the art and in preferred embodiments the entire peptide reagent may be prepared using the recombinant DNA and molecular cloning techniques.

The term “checkpoint proteins” means a group of molecules on the cell surface of CD4⁺ and/or CD8⁺ T cells that fine-tune immune responses by down-modulating or inhibiting an anti-tumor immune response.

As used herein, the terms “comprising,” “comprise” or “comprised,” and variations thereof, in reference to defined or described elements of an item, composition, apparatus, method, process, system, etc. are meant to be inclusive or open ended, permitting additional elements, thereby indicating that the defined or described item, composition, apparatus, method, process, system, etc. includes those specified elements, or, as appropriate, equivalents thereof, and that other elements can be included and still fall within the scope/definition of the defined item, composition, apparatus, method, process, system, etc.

As used herein, the terms “conjugated,” “linked,” “attached,” “fused” and “tethered,” when used with respect to two or more moieties, mean that the moieties or domains are physically associated or connected with one another, either directly or via one or more additional moieties that serve as a linking agent, to form a structure that is sufficiently stable so that the moieties remain physically associated under the conditions in which the structure is used, e.g., physiological conditions. The linkage can be based on genetic fusion according to the methods known in the art or can be performed by, e.g., chemical cross-linking. The compounds and targeting agents may be linked by a flexible linker, such as a polypeptide linker. The polypeptide linker can comprise plural, hydrophilic or peptide-bonded amino acids of varying lengths. The term “associated” will be used for the sake of brevity and is meant to include all possible methods of physically and chemically associating each domain.

As used herein, the terms “enhancement,” “enhance,” “enhanced,” “enhances,” and “enhancing” are used interchangeably and refer to an increase in the specified parameter (e.g., at least about a 1.1-fold, 1.25-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 8-fold, 10-fold, twelve-fold, or even fifteen-fold or more increase) and/or an increase in the specified activity of at least about 5%, 10%, 25%, 35%, 40%, 50%, 60%, 75%, 80%, 90%, 95%, 97%, 98%, 99% or 100% over baseline values.
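As a non-limiting illustration of how the fold and percentage figures in this definition relate to one another, the following Python sketch computes both from a measured value and its baseline. It is not part of the original disclosure; the function names and the example numbers are hypothetical.

```python
def fold_change(value, baseline):
    """Ratio of the measured value to the baseline (e.g., 1.5 means 1.5-fold)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return value / baseline


def percent_increase(value, baseline):
    """Increase over baseline expressed as a percentage of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (value - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units: baseline 20, measured 30.
    print(fold_change(30.0, 20.0))       # 1.5-fold
    print(percent_increase(30.0, 20.0))  # 50% over baseline
```

Under this arithmetic, a 2-fold increase and a 100% increase over baseline describe the same change.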

As used herein, the terms “fusion protein”, “fusion peptide”, “chimeric protein”, and “chimeric peptide” will be used interchangeably and will refer to a polymer of amino acids (peptide, oligopeptide, polypeptide, or protein) comprising at least two portions, each portion comprising a distinct function. At least one first portion of the fusion peptide comprises at least one of the present peptide tags. At least one second portion of the fusion peptide comprises at least one peptide of interest. In certain embodiments, the fusion protein additionally includes at least one cleavable peptide linker that facilitates cleavage (chemical and/or enzymatic) and separation of the peptide tag(s) and the peptide(s) of interest.

“Nucleic acid” refers to nucleotides (e.g., deoxyribonucleotides, ribonucleotides, and 2'-modified nucleotides) and polymers thereof in either single-, double- or multiple-stranded form, or complements thereof. The terms “polynucleotide,” “oligonucleotide,” “oligo” or the like refer, in the usual and customary sense, to a linear sequence of nucleotides. The term “nucleotide” refers, in the usual and customary sense, to a single unit of a polynucleotide, i.e., a monomer. Nucleotides can be ribonucleotides, deoxyribonucleotides, or modified versions thereof. Examples of polynucleotides contemplated herein include single and double stranded DNA, single and double stranded RNA, and hybrid molecules having mixtures of single and double stranded DNA and RNA. Examples of nucleic acid, e.g., polynucleotides contemplated herein include any types of RNA, e.g., mRNA, siRNA, miRNA, and guide RNA and any types of DNA, e.g., genomic DNA, plasmid DNA, and minicircle DNA, and any fragments thereof. The term “duplex” in the context of polynucleotides refers, in the usual and customary sense, to double strandedness.

Nucleic acids, including, e.g., nucleic acids with a phosphorothioate backbone, can include one or more reactive moieties. As used herein, the term reactive moiety includes any group capable of reacting with another molecule, e.g., a nucleic acid or polypeptide, through covalent, non-covalent or other interactions. By way of example, the nucleic acid can include an amino acid reactive moiety that reacts with an amino acid on a protein or polypeptide through a covalent, non-covalent, or other interaction. The terms also encompass nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphodiester derivatives including, e.g., phosphoramidate, phosphorodiamidate, phosphorothioate (also known as phosphothioate having double bonded sulfur replacing oxygen in the phosphate), phosphorodithioate, phosphonocarboxylic acids, phosphonocarboxylates, phosphonoacetic acid, phosphonoformic acid, methyl phosphonate, boron phosphonate, or O-methylphosphoroamidite linkages (see Eckstein, OLIGONUCLEOTIDES AND ANALOGUES: A PRACTICAL APPROACH, Oxford University Press) as well as modifications to the nucleotide bases such as in 5-methyl cytidine or pseudouridine; and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones; non-ionic backbones, modified sugars, and non-ribose backbones (e.g., phosphorodiamidate morpholino oligos or locked nucleic acids (LNA) as known in the art), including those described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, CARBOHYDRATE MODIFICATIONS IN ANTISENSE RESEARCH, Sanghvi & Cook, eds.
Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made. In embodiments, the internucleotide linkages in DNA are phosphodiester, phosphodiester derivatives, or a combination of both.

As used herein, the term “operably linked” refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence (i.e., that the coding sequence is under the transcriptional control of the promoter). In a further embodiment, the definition of “operably linked” may also be extended to describe the products of chimeric genes, such as fusion proteins. As such, “operably linked” will also refer to the linking of a peptide tag to a biomolecule, e.g., a peptide of interest to be produced and recovered. The peptide tag is “operably linked” to the peptide of interest if upon expression the fusion protein is insoluble and accumulates in inclusion bodies in the expressing host cell. In a preferred embodiment, the fusion peptide will include at least one cleavable peptide linker useful in separating the peptide tag from the peptide of interest. The cleavable peptide linkers may be incorporated into the fusion proteins using any number of techniques well known in the art.

As used herein, the terms “polypeptide” and “peptide” will be used interchangeably to refer to a polymer of two or more amino acids joined together by a peptide bond, wherein the peptide is of unspecified length, thus, peptides, oligopeptides, polypeptides, and proteins are included within the present definition. In one aspect, this term also includes post expression modifications of the polypeptide, for example, glycosylations, acetylations, phosphorylations and the like. Included within the definition are, for example, peptides containing one or more analogues of an amino acid or labeled amino acids and peptidomimetics.

As used herein, the terms “protein of interest”, “polypeptide of interest”, “peptide of interest”, “targeted protein”, “targeted polypeptide”, “targeted peptide”, “expressible protein”, and “expressible polypeptide” will be used interchangeably and refer to a protein, polypeptide, or peptide which may be expressed by the genetic machinery of a host cell.

As used herein, the terms “plasmid”, “vector” and “cassette” refer to an extrachromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3' untranslated sequence into a cell. “Transformation cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitates transformation of a particular host cell. “Expression cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host. As used herein, the term “promoter/regulatory sequence” means a nucleic acid sequence which is required for expression of a gene product operably linked to the promoter/regulatory sequence. In some instances, this sequence may be the core promoter sequence and in other instances, this sequence may also include an enhancer sequence and other regulatory elements which are required for expression of the gene product. The promoter/regulatory sequence may, for example, be one which expresses the gene product in a tissue specific manner.

As used herein, the term “promoter” is defined as a DNA sequence recognized by the synthetic machinery of the cell, or introduced synthetic machinery, required to initiate the specific transcription of a polynucleotide sequence. A “constitutive” promoter is a nucleotide sequence which, when operably linked with a polynucleotide which encodes or specifies a gene product, causes the gene product to be produced in a cell under most or all physiological conditions of the cell. An “inducible” promoter is a nucleotide sequence which, when operably linked with a polynucleotide which encodes or specifies a gene product, causes the gene product to be produced in a cell substantially only when an inducer which corresponds to the promoter is present in the cell. A “tissue-specific” promoter is a nucleotide sequence which, when operably linked with a polynucleotide which encodes or is specified by a gene, causes the gene product to be produced in a cell substantially only if the cell is a cell of the tissue type corresponding to the promoter.

As used herein, the terms “target molecule”, “biomolecule” or “target biomolecule” includes any macromolecule, including protein, peptide, polypeptide, gene, polynucleotide, oligonucleotide, carbohydrate, enzyme, polysaccharide, glycoprotein, receptor, antigen, tumor antigen, markers, molecules associated with a disease, an antibody, growth factor; or it may be any small organic molecule including a hormone, substrate, metabolite, cofactor, inhibitor, drug, dye, nutrient, pesticide, peptide; or it may be an inorganic molecule including a metal, metal ion, metal oxide, and metal complex; it may also be an entire organism including a bacterium, virus, and single-cell eukaryote such as a protozoon.

As used herein, the term “translatable” may be used interchangeably with the term “expressible.” These terms can refer to the ability of a polynucleotide, or a portion thereof, to provide a polypeptide, by transcription and/or translation events in a process using biological molecules, or in a cell, or in a natural biological setting. In some settings, translation is a process that can occur when a ribosome creates a polypeptide in a cell. In translation, a messenger RNA (mRNA) can be decoded by a ribosome to produce a specific amino acid chain, or polypeptide.

A translatable polynucleotide can provide a coding sequence region (usually, CDS), or portion thereof, that can be processed to provide a polypeptide, protein, or fragment thereof.

As used herein, the term “3'-untranslated region” (3'-UTR) relates to the section of messenger RNA (mRNA) that immediately follows the translation termination codon. The 3'-UTR may comprise regulatory regions within the 3'-untranslated region which are known to influence polyadenylation and stability of the mRNA. Many 3'-UTRs also contain AU-rich elements (AREs). Furthermore, the 3'-UTR may preferably contain the sequence that directs addition of several hundred adenine residues called the poly(A) tail to the end of the mRNA transcript.

As used herein, the term “5'-untranslated region” (5'-UTR) refers to a polynucleotide sequence that, when linked to a transcript, is capable of recruiting ribosome complexes and initiating translation of the transcript. Typically, a 5'-UTR is positioned directly upstream of the initiation codon of a transcript; specifically, between the cap site and the initiation codon. The 5'-UTR begins at the transcription start site and ends one nucleotide (nt) before the start codon (usually AUG in the mRNA) of the coding region. In eukaryotes, the 5'-UTR is generally from 100 to several thousand nucleotides long, although shorter 5'-UTRs also occur.
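The coordinate convention in this definition (the 5'-UTR runs from the transcription start site to the nucleotide immediately before the start codon) can be sketched in Python as follows. This example is illustrative only and is not part of the original disclosure: the function name, the requirement that the caller supply the start codon position, and the toy sequence are all assumptions, and the sequence is not a sequence from this disclosure.

```python
def five_prime_utr(mrna, start_codon_index, start_codon="AUG"):
    """Return the 5'-UTR of an mRNA given the 0-based start codon position.

    The mRNA string is assumed to begin at the transcription start site.
    The UTR ends one nucleotide before the start codon, so it is the
    slice of the sequence that precedes start_codon_index.
    """
    found = mrna[start_codon_index:start_codon_index + len(start_codon)]
    if found != start_codon:
        raise ValueError(f"expected {start_codon} at index {start_codon_index}, found {found!r}")
    return mrna[:start_codon_index]


if __name__ == "__main__":
    toy_mrna = "GGGACUAUGGCC"  # arbitrary placeholder sequence
    print(five_prime_utr(toy_mrna, 6))  # the six nucleotides before AUG
```

The start codon position is passed in explicitly rather than searched for, because the first AUG in a transcript is not necessarily the initiation codon.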

Throughout this disclosure, various aspects of the disclosure can be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the disclosure. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 2.7, 3, 4, 5, 5.3, and 6. This applies regardless of the breadth of the range. Any compositions or methods provided herein can be combined with one or more of any of the other compositions and methods provided herein.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-1F are a series of graphs, schematic representation and fluorescent microscopic images demonstrating that Qa tagging in SARS-CoV-2 viral proteins robustly boosts the production of dual reporter-fused viral proteins in HEK293T cells.

FIG. 1A: Diagram for 2A-mediated dual reporter gdLuc/dsGFP fused with viral protein and potential multiple measures of viral protein expression/production.

FIGS. 1B-1D: Representative experiments for Qa boosting of SARS-CoV-2 envelope (E) protein dynamic production (FIG. 1B) and average fold induction of 10 experiments determined by gdLuc assay of cultured media at 24-48 h after transfection with indicated pcDNA6B vector (100 ng/well) in quadruplicates for each experiment (FIG. 1C), as well as representative images of Qa-boosted dsGFP expression detected by fluorescent microscopy (FIG. 1D).

FIGS. 1E-1F: Representative gdLuc assay showing various degrees of Qa boosting in other SARS-CoV-2 structural proteins spike (S) and nucleocapsid (N), as well as accessory proteins NSP2, NSP16 and ORF3. Cells were transfected in quadruplicates with indicated pcDNA6B vectors at 100 ng/well. Data represent mean ± SE of gdLuc activity in the supernatant at 48 h after transfection. The fold number indicates relative changes in Qa groups compared with the corresponding control group.
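For readers unfamiliar with the summary statistics named in these legends, the following Python sketch shows one conventional way to compute a mean ± standard error (SE) from replicate wells and a fold change relative to a control group. It is illustrative only and is not part of the original disclosure: the function names are hypothetical, the replicate values are invented placeholders rather than data from the figures, and the use of the sample standard deviation is an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(replicates):
    """Return (mean, standard error) for a list of replicate readings.

    SE is the sample standard deviation divided by sqrt(n).
    """
    n = len(replicates)
    if n < 2:
        raise ValueError("at least two replicates are needed to estimate SE")
    return mean(replicates), stdev(replicates) / sqrt(n)


def fold_over_control(tagged, control):
    """Fold change of the tagged group mean over the control group mean."""
    return mean(tagged) / mean(control)


if __name__ == "__main__":
    # Placeholder luminescence readings for quadruplicate wells.
    control = [1.0, 2.0, 3.0, 4.0]
    tagged = [10.0, 20.0, 30.0, 40.0]
    m, se = mean_and_se(control)
    print(f"control: {m:.2f} ± {se:.2f}")
    print(f"fold: {fold_over_control(tagged, control):.1f}")
```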

FIGS. 2A-2G are a series of graphs and fluorescent microscopic images demonstrating that Qa boosting is versatile in dosages, non-viral proteins, cell types and tagging location.

FIG. 2A: Dose-dependent Qa boosting of SARS-CoV-2 S, M, N and ORF3 to various degrees. Cells were transfected in quadruplicates with indicated pcDNA6B vectors at indicated amounts of vectors. Data represent mean ± SE of gdLuc activity in the supernatant at 48 h after transfection. The fold number indicates relative changes in Qa groups compared with the corresponding control group.

FIGS. 2B-2D: Dose-independent Qa boosting in host cellular gene NIBP and hACE2 determined by gdLuc assay (FIGS. 2B, 2C) and representative fluorescent microscopic images (FIG. 2D) at 48 h after transfection with pcDNA6B regular vectors.

FIGS. 2E, 2F: Dose-independent Qa boosting in secretory IFNγ and IL-2 by gdLuc assay at 48 h after transfection with pRRL LV vectors.

FIG. 2G: Qa boosting on viral proteins E and S as well as non-viral protein hACE2 exhibits similar efficiency in different cell types.

FIGS. 3A-3I are a series of graphs, fluorescent microscopic images, a schematic representation and a blot demonstrating that Qa boosting is accelerated by stronger promoter and SARS-CoV-2 native untranslated region (UTR).

FIGS. 3A-3C: Stronger promoter CAG further increases Qa boosting efficiency in viral proteins E, S and NSP16.

FIGS. 3D, 3E: The 5’-UTR inclusion robustly increases promoter-dependent expression of E protein as determined by Western blot and immunocytochemistry with anti-Flag antibody, and addition of 3’-UTR further increases E protein expression. The different size of E-Flag-Q results from the addition of 37 amino acids within the open reading frame after stop removal during the cloning.

FIGS. 3F, 3G: The 5’-UTR inclusion dramatically increases the CAG-driven expression of Qa tagged S-fused dual reporter as determined by representative fluorescent microscopic images and gdLuc assay.

FIG. 3H: The 5’-UTR inclusion further accelerates Qa boosting efficiency of CMV-driven S dual reporter protein production as compared with LG group at 48 h after transfection with regular vector.

FIG. 3I: The 5’-UTR inclusion accelerates Qa boosting of dual reporter without viral proteins as determined by gdLuc assay at 48 h after transfection with pRRL LV vectors.

FIGS. 4A-4H are a series of graphs, fluorescent microscopic images, a schematic representation and a blot demonstrating that Qa tagging and 5’-UTR inclusion boost the packaging and transduction efficiency of SARS-CoV-2 S protein-pseudotyped lentivirus-like particles (S-LVLP).

FIG. 4A: Diagram for different vectors expressing human codon-optimized Sd18 and the process of S-LVLP packaging.

FIG. 4B: Qa and 5’-UTR increase Sd18 protein expression in the transfected cells as determined by Western blot with serum from SARS-CoV-2 patient.

FIGS. 4C-4F: Qa tagging increases S-LVLP packaging titer for the standard pRRL-GFP LV vector determined by GFP positivity, which is further increased by 5’-UTR inclusion, polybrene treatment and purification.

FIGS. 4G-4H: Qa tagging and 5’-UTR inclusion increase S-LVLP packaging titer for dual reporter LV vectors pRRL-LG or pRRL-E-LG determined by GFP positivity and gdLuc activity.

FIGS. 5A-5I are a series of graphs demonstrating that Qa tagging and 5’-UTR inclusion boost mRNA-dependent production of SARS-CoV-2 viral proteins S, N, E and ORF3 as well as non-viral hACE2 via increasing mRNA stability and translational efficiency.

FIGS. 5A-5C: Qa tagging robustly boosts mRNA-dependent production of dual reporter in a time- and dose-dependent manner, to varying extents with different targeted proteins.

FIGS. 5D, 5E: The 5’-UTR inclusion further accelerates Qa boosting on mRNA-derived production of S, E and hACE2 proteins.

FIGS. 5F-5I: Qa tagging increases posttranscriptional mRNA stability and translational efficiency in the presence of transcriptional inhibitor actinomycin D.

FIGS. 6A-6M are a series of diagrams, blots, graphs, fluorescent microscopic images and a photograph demonstrating that Qa tagging boosts the production yield of anti-SARS monoclonal antibody and lentiviruses.

FIG. 6A: Diagram showing the Qa tagging on the C-terminus of constant regions for heavy and light chains of anti-SARS monoclonal antibody.

FIGS. 6B-6D: Representative ELISA for robust boosting of mAb production at 48 h after co-transfection of H/L or HQ/LQ (50 ng/well) with or without normalization of GFP (FIG. 6G) or firefly-luciferase (FIG. 6L). The mAb amount is quantified by sigmoidal four-parameter logistic (4PL) curve fitting. Relative fold changes are presented as compared with corresponding LG.

FIG. 6E: Average fold changes of 16 experiments based on ELISA results.

FIG. 6F: Western blot analysis confirmed the boost of Qa on mAb production in the supernatant.

FIG. 6G: Qa tagging in the LV transfer vector pRRL-E-LG increased gdLuc activity in the supernatant after LV infection of HEK293T cells.

FIGS. 6H, 6I: Qa tagging in the LV transfer vector pLV-EF1a-Flag-spCas9-Qa-T2A-RFP increases the transgene expression determined by Western blot analysis with anti-Flag antibody (FIG. 6H) but does not increase the packaging efficiency measured with FACS for RFP positivity (FIG. 6I).

FIG. 6J: Representative fluorescent images showing Qa tagging on Pol and RRE boosts LV packaging efficiency for pRRL-GFP transfer LV vector but Qa tagging on Gag impairs LV packaging.

FIGS. 6K-6M: Qa tagging on Pol and RRE boosts LV packaging efficiency for pRRL-UTR-QLG and pLV-EF1a-MS2-spCas9-F2A-GFP determined with LV qPCR titer kit (FIG. 6K), flow cytometry (FIG. 6L) and gdLuc assay (FIG. 6M).

FIGS. 7A-7H are a series of blots and graphs identifying the secretion boost of Qa tagging on various targeted proteins in HEK293T cells.

FIGS. 7A-7B: Qa tagging remarkably decreases the expression level of E-Flag-gdLuc protein in the cell lysate at 48 h after transfection with indicated vectors.

FIG. 7C: Qa tagging decreases the expression levels of secretory IFNγ/IL-2 or non-secretory viral protein N and non-viral protein hACE2 in the cell lysates at 48 h after transfection with indicated vectors. T2A auto-cleaving efficiency varies with targeted proteins, showing different ratios of the cleaved band (c) and non-cleaved band (n).

FIG. 7D: Qa tagging decreases S protein level in the cell lysates while 5’-UTR inclusion does increase the protein level despite the continuous secretion. The cleaved S-Flag-gdLuc fragment was detected with anti-Flag antibody and the cleaved dsGFP fragment was detected with anti-GFP antibody.

FIGS. 7E, 7F: Qa tagging robustly increases the protein level of secretory E-QLG in the supernatant detected by Western blot analysis and gdLuc assay of the supernatant. Cells were transfected with indicated vectors in quadruplicates for 24 h and cultured with FreeStyle™ 293 Expression Medium for 48 h.

FIG. 7G: ER-Golgi trafficking inhibitor brefeldin A completely blocks the secretion of Qa tagged viral proteins and host cellular proteins.

FIG. 7H: Qa tagging does increase the protein expression of non-secretory firefly-luciferase (fLuc) in the cell lysate and has no effect on the background of fLuc activity in the supernatant.

FIGS. 8A-8I are a series of schematics, photographs of stained cells and graphs demonstrating that Exen21/Qa addition in SARS-CoV-2 viral proteins robustly boosts production of dual reporter-fused viral proteins in HEK293T cells.

FIG. 8A: Diagram of 2A-mediated dual reporter gdLuc/dsGFP (LG) and Qa tagged LG (QLG) fused with viral protein and potential multiple measures of viral protein expression/production. The Exen21/Qa stands for the 21-mer nucleotide motif and its corresponding heptapeptide.

FIGS. 8B-8D: Representative experiments showing Exen21 boosting of SARS-CoV-2 envelope (E) protein dynamic production (FIG. 8B) and average fold induction with results of 20 experiments (FIG. 8C) determined by gdLuc assays in supernatants, 24-72 h after transfection with indicated pcDNA6B vector (100 ng/well, quadruplicate), and representative images of Exen21-boosted dsGFP expression detected by fluorescence microscopy (FIG. 8D). Data represent mean ± SE of gdLuc activity with the relative fold changes (in red) in QLG over corresponding LG groups (the same below).

FIGS. 8E-8F: Representative gdLuc assay showing various degrees of Exen21 boosting in other SARS-CoV-2 structural proteins: spike (S), nucleocapsid (N), and accessory proteins: NSP2, NSP16, and ORF3. Cells were transfected with indicated pcDNA6B vectors (100 ng/well, quadruplicates). Data represent mean ± SE of gdLuc activity in supernatants 48 h post-transfection.

FIGS. 8G-8I: Alanine scanning and deletion mutation (FIG. 8G) as well as degenerate (FIG. 8H) and missense (FIG. 8I) mutation assays showing the critical role of the unique and specific Exen21 in boosting E-LG production. Cells were transfected with indicated pcDNA6B-E vectors (100 ng/well, quadruplicates). Data represent mean ± SE of gdLuc activity in supernatants 48 h post-transfection, with the relative percentage changes compared with the parent E-QLG group. Inset in FIG. 8G shows the heptapeptide structure with the residue position. Insets in FIGS. 8H and 8I show the mutated nucleotide and corresponding residues. dQ denotes degenerate QLG mutants and mQ denotes missense QLG mutants.

FIGS. 9A-9G are a series of graphs demonstrating that Exen21 boosting is versatile in dosages, non-viral proteins and cell types.

FIG. 9A: Dose-dependent and varying extents of Exen21-boosted expression of SARS-CoV-2 S, M, N and ORF3 protein levels. Cells were transfected in quadruplicates with indicated pcDNA6B (6B) vectors in indicated amounts. Data represent mean ± SE of gdLuc activity in supernatants 48 h post-transfection. Fold values indicate changes in Exen21 groups relative to those of corresponding control groups.

FIGS. 9B, 9C: Dose-independent boosting by Exen21 of host cellular gene NIBP and hACE2 levels, determined by gdLuc assay (FIGS. 9B, 9C) 48 h after transfection with pcDNA6B regular vectors.

FIGS. 9D, 9E: Dose-independent boosting by Exen21 of secretory IFNγ and IL-2 by gdLuc assay 48 h after transfection with pRRL LV vectors.

FIG. 9F: Stronger promoter CAG further increases boosting efficiency in LG system by Qa (QLG) in viral protein E.

FIG. 9G: Exen21-induced boosting of viral E and S proteins and non-viral protein hACE2 exhibits similar efficiencies across different cell types.

FIGS. 10A-10F are a series of schematics, a photograph, a blot and graphs demonstrating that Exen21 addition boosts production yields of anti-SARS monoclonal antibody (mAb).

FIG. 10A: Diagram showing human anti-SARS mAb and Exen21/Qa tags introduced (right panel) at the C-termini of the constant regions of the heavy and light chains.

FIG. 10B: Representative ELISA showing robust boosting by Exen21/Qa (HQ/LQ) of mAb production 48 h after co-transfection of mAb H/L or HQ/LQ expression vectors (50 ng/well, in triplicates), with normalization vectors empty control (C), GFP (G) or firefly-luciferase (L).

FIG. 10C: Sigmoidal 4-parameter logistic curve (4PL) determination of mAb concentrations.
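For orientation, a four-parameter logistic (4PL) standard curve of the kind commonly used for ELISA quantification can be sketched in Python as follows. This example is illustrative only and is not part of the original disclosure: the parameterization shown is one common convention, the function names are hypothetical, and the parameter values are invented placeholders rather than values fitted to the data of FIG. 10C.

```python
def four_pl(x, a, b, c, d):
    """4PL response for concentration x.

    a: response at zero concentration, d: response at saturating
    concentration, c: inflection point, b: slope factor.
    """
    return d + (a - d) / (1.0 + (x / c) ** b)


def inverse_four_pl(y, a, b, c, d):
    """Concentration that gives response y on the fitted 4PL curve."""
    if not min(a, d) < y < max(a, d):
        raise ValueError("signal lies outside the range of the standard curve")
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)


if __name__ == "__main__":
    # Placeholder curve parameters (assumed, not from the disclosure).
    params = dict(a=0.05, b=1.2, c=10.0, d=2.0)
    signal = four_pl(10.0, **params)
    print(signal)                            # response at the inflection point
    print(inverse_four_pl(signal, **params))  # recovers the concentration
```

In practice the four parameters would first be estimated from a dilution series of a standard of known concentration; sample concentrations are then read off the curve with the inverse function.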

FIG. 10D: Normalized quantitative data from the experiment/assay shown in FIG. 10B. Relative fold changes are presented as compared with corresponding mAb H/L.

FIG. 10E: Average Exen21/Qa-induced fold changes of ELISA-based mAb production across 16 experiments (p<0.0001, Student’s t test).

FIG. 10F: Western blot analysis confirming the boost of Exen21/Qa on mAb production in the supernatant. Membrane staining as a loading control is for densitometric analysis of relative fold changes in light chain (LC) between HQ/LQ and H/L groups.

FIGS. 11A-11K are a series of schematics, blots, graphs and photographs demonstrating that Exen21 addition boosts packaging and transduction efficiencies of SARS-CoV-2 S protein-pseudotyped lentivirus-like particles (S-LVLP) and standard lentiviral packaging.

FIG. 11A: Diagrams of different vectors expressing human codon-optimized Sd18, and the process of S-LVLP packaging in HEK293T cells.

FIG. 11B: Exen21 increases Sd18 protein expression in transfected cells, shown by Western blot with serum from SARS-CoV-2 patient, which contains specific anti-S antibody. Representative fold change for S2 fragment is quantified by densitometric analysis with GAPDH normalization.

FIG. 11C: Exen21 addition increases S-LVLP packaging titer of the standard pRRL-GFP LV vector determined by GFP positivity.

FIGS. 11D-11E: Exen21 addition increases S-LVLP packaging titer for dual reporter LV vectors pRRL-E-QLG determined by GFP positivity (FIG. 11D) and gdLuc activity (FIG. 11E).

FIG. 11F: Exen21/Qa in the LV transfer vector pRRL-E-QLG induces LV dose-related increases in gdLuc activity in supernatants of HEK293T cells 48-72 h after infection with indicated amount of crude LV preparation (μl per well, triplicates). Shown are fold changes in gdLuc activity from E-QLG vs. control E-LG group.

FIGS. 11G, 11H: Exen21/Qa in the LV transfer vector pLV-EF1a-Flag-spCas9-Qa-T2A-RFP (Qa) increases transgene expression vs. untagged vector (Con), seen by Western blot analysis with anti-Flag antibody (FIG. 11G), but does not increase packaging efficiency as measured by FACS for RFP positivity 48 h after infection with crude LV preparation (FIG. 11H), with no significant difference (ns) by Student’s t test.

FIG. 11I: Representative fluorescence images show that Exen21/Qa addition to Pol and RRE enhances pRRL-GFP LV packaging efficiency vs. control (psPAX2) levels, but Exen21/Qa on Gag impairs LV packaging.

FIGS. 11J, 11K: Exen21/Qa tagging on Pol and RRE (PolQ/RREQ) boosts LV packaging efficiency for pRRL-GFP transfer vector, determined by cell counting (FIG. 11J) and flow cytometry (FIG. 11K).

FIG. 11L: The gdLuc assay showing the boosting of Exen21/Qa tagging.

FIGS. 12A-12G are a series of graphs demonstrating that Exen21 addition boosts mRNA-dependent production of SARS-CoV-2 viral proteins S, N, E and ORF3 as well as non-viral hACE2 by increasing mRNA stability and translational efficiency.

FIGS. 12A-12C: Exen21 addition robustly boosts mRNA-dependent production of dual reporter in a time- and dose-dependent manner, to varying extents with different targeted proteins.

FIG. 12A: Time course of responses to different concentrations of capped mRNAs for S-LG vs. S-QLG (ng/well, quadruplicate).

FIG. 12B: Time course of response to indicated mRNAs (100 ng/well, quadruplicate).

FIG. 12C: Dose response to indicated mRNAs at 24 h post transfection.

FIGS. 12D-12G: Exen21 addition (QLG; right panels in D, E) increases posttranscriptional mRNA stability and translational efficiency in the presence of transcriptional inhibitor actinomycin D, shown in time-course plots of reporter activity (FIGS. 12D, 12E) and mRNA decay (FIGS. 12F, 12G). The mRNA levels were determined by RT-qPCR analysis.

FIGS. 13A-13G are a series of blots and graphs demonstrating that Exen21 addition enhances secretion of various targeted proteins in HEK293T cells, shown by Western blot analyses.

FIGS. 13A-13C: Exen21 addition remarkably increases protein expression levels of viral proteins (E, S, N), non-viral protein hACE2 and secretory IFNγ/IL-2 in cell lysates 48 h after transfection with indicated pcDNA6B (6B) vectors. Fold numbers are relative densitometric changes after normalization by the loading control GAPDH or non-specific bands (NS). P2A auto-cleaving efficiency varies with targeted proteins, showing different ratios of the cleaved band (c) and non-cleaved band (n).

FIGS. 13D, 13E: Exen21 addition robustly increases secretory E-QLG protein levels in the supernatants, seen both by Western blot analyses (FIG. 13D) and gdLuc assay (FIG. 13E). Cells were transfected with indicated vectors in quadruplicates for 24 h and cultured with FreeStyle™ 293 Expression Medium for 48 h. Membrane staining as a loading control is for densitometric analysis of relative fold changes in E-QLG over E-LG.

FIG. 13F: ER-Golgi trafficking inhibitor brefeldin A blocks the secretion of Qa-tagged viral E protein (E-QLG) and host cell protein (IFNγ), seen both by Western blot analyses (left) and gdLuc assay (right) of the supernatants 48 h after brefeldin A treatment.

FIG. 13G: Exen21 addition elevates non-secretory firefly-luciferase (fLuc) protein levels in cell lysates. Relative fold change is quantified by densitometric analysis with GAPDH normalization.

FIGS. 14A-14C are a series of photographs showing representative fluorescent microscopy detection of the dual reporter. Related to FIGS. 8A-8I and 9A-9G.

FIG. 14A: Three indicated antibodies detected the dual reporter E-Flag-gdLuc-T2A-GFP, with complete colocalization of 2A and Flag, while some cleaved GFP stayed alone without the corresponding E-Flag-gdLuc-T2A, which may have been secreted.

FIGS. 14B, 14C: Representative fluorescent micrographs showing the boosting of N and ORF3 viral proteins and human ACE2 cellular protein under CMV promoter (FIG. 14B) as well as time-dependent E protein under CAG promoter (FIG. 14C) by Exen21/Qa addition. Images were taken at the same exposure settings. Scale bars = 100 µm.

FIGS. 15A-15B are a series of graphs and photographs demonstrating the dose-dependent Exen21/Qa boosting of SARS-CoV-2 viral proteins and saturation of boosting activity at a higher amount of transfected reporter DNA in all the tested viral dual reporters. Related to FIG. 9A.

FIG. 15A: Exen21/Qa boosting in different dosage determined by NanoLight Gaussia luciferase assay.

FIG. 15B: Representative confocal images from 3 fields of 4 wells per group. All images were taken at the same exposure settings. For lower dosage or lower expression groups, stronger GFP signal can be observed at a longer exposure. Scale bars = 100 µm. HEK293T cells in a 96-well plate were transfected with the indicated gdLuc-P2A-dsGFP reporter at the indicated amount. At 72 h after transfection, EGFP images were taken, and the supernatants were collected for luciferase assay. Data represents relative fold changes compared to corresponding LG group with mean ± SE of 4 wells.

FIGS. 16A-16E are a series of photographs, a blot, a graph and a schematic demonstrating Exen21 boosting of mRNA vaccine production and efficacy. Related to FIGS. 12A-12G.

FIG. 16A: Diagram for in vitro transcription and 5’-Cap modification.

FIG. 16B: Gel electrophoresis (1% agarose) images for transcript length, integrity and quantity of both C0 and C1 5’-capped mRNAs.

FIG. 16C: 10- to 30-fold increases in the expression of dual reporter at equal levels of functional mRNA for viral genes N, E and ORF3. The capped (Cap-C0) and tailed mRNAs of indicated targets were synthesized using HiScribe T7 ARCA mRNA Kit (NEB, E2065) and cDNA template from the corresponding linearized plasmid. Half of the Cap-C0 mRNAs were further methylated at the 2'-O position of the first nucleotide adjacent to the Cap-C0 structure using mRNA Cap 2'-O-Methyltransferase (NEB, M0366). Both Cap-C0 and Cap-C1 mRNAs were purified with Monarch RNA Cleanup Kit (NEB, T2040). HEK293T cells in a 96-well plate were transfected with indicated mRNA (100 ng/well). At 24 h after transfection, the supernatants were collected for NanoLight Gaussia luciferase assay. Data represents relative fold changes compared to corresponding LG group with mean ± SE of 4 independent experiments.

FIGS. 16D, 16E: Representative fluorescent GFP images in live cells at 24 h after transfection with indicated vectors. Scale bars = 100 µm.

FIGS. 17A-17E are a series of blots and a graph demonstrating that Qa tagging robustly increases the protein level of secretory E-QLG in the supernatant detected by Western blot analysis and gdLuc assay of the supernatant. Related to Figures 6C, D.

FIGS. 17A, 17B: Western blot with anti-gdLuc monoclonal antibody (Proteintech, Cat# 60158-1-Ig).

FIGS. 17C, 17D: Western blot with anti-GFP polyclonal antibody (Proteintech, Cat# 50430-2-AP).

FIG. 17E: Relative fold changes of boosting efficiency by gdLuc assay. HEK293T cells were transfected with indicated vectors (100 ng/well) in triplicate for 24 h and cultured with FreeStyle™ 293 Expression Medium for 48 h before analysis.

FIGS. 18A-18D are a series of graphs and blots demonstrating that ER-Golgi trafficking inhibitor brefeldin A blocks the secretion of Qa tagged viral proteins and host cellular proteins. Related to FIGS. 13A-13G.

FIGS. 18A, 18B: Relative gdLuc activity changes in the supernatant (FIG. 18A) and cell lysate (FIG. 18B) after brefeldin A treatment.

FIGS. 18C, 18D: Western blot with anti-gdLuc monoclonal antibody (Proteintech, Cat# 60158-1-Ig) and anti-GFP polyclonal antibody (Proteintech, Cat# 50430-2-AP). HEK293T cells were transfected with indicated vectors (50 ng/well) in quadruplicate for 24 h and cultured with FreeStyle™ 293 Expression Medium for 48 h before analysis.

FIGS. 19A-19C are a series of schematics and a table showing the SARS-CoV-2 UTR-E-Flag-Qa-UTR synthesis and cloning. FIG. 19A: Diagram for the synthetic 5’-UTR-E-Flag-Qa-3’-UTR. FIG. 19B: NEBuilder HiFi DNA assembly cloning of the synthetic nucleotides (946 bp) into pCAG-Flag expression vector. FIG. 19C: List of cloning strategies to obtain the indicated vectors for E protein and S protein fused with the QLG dual reporter.

FIGS. 20A-20C are a series of photographs of stained cells showing that both the 5’-UTR and the 3’-UTR enhanced the promoter-driven expression of Qa-tagged E protein in HEK293T cells. HEK293T cells in a 96-well plate were transfected with indicated vectors in triplicate (100 ng/well). At 48 h after transfection, cells were fixed with 4% PFA for 10 minutes and immunocytochemistry with anti-Flag antibody was performed. FIG. 20A: Representative confocal images. FIG. 20B: Mean fluorescent intensity determined by ImageJ analysis of 6 fields from 3 wells. FIG. 20C: Western blot analysis with anti-Flag antibody and anti-GAPDH for loading control.

FIGS. 21A-21D are a series of schematics, photographs of stained cells and graphs showing that addition of the 5’-UTR between the CAG promoter and the S-Flag-QLG dual reporter enhanced S protein expression. FIG. 21A: Diagram of the dual reporter design with the secretable Gaussia Dura luciferase (gdLuc) plus P2A autocleavable destabilized GFP (dsGFP) and various measures to assess the expression of targeted proteins (here SARS-CoV-2 viral proteins). The novel Q tag is located between the targeted protein and gdLuc. FIGS. 21B-21D: HEK293T cells in a 96-well plate were transfected with indicated vectors in quadruplicate at the indicated amount of DNA (12.5-100 ng/well). At 24-72 h after transfection, EGFP images were taken (FIG. 21B), and the supernatants were collected for NanoLight Gaussia luciferase assay (FIGS. 21B, 21C, 21D). Data represents relative light unit of bioluminescence (FIG. 21C) or fold changes (FIG. 21D) compared to corresponding non-UTR group with mean ± SE of 4 wells.

FIGS. 22A-22C are a series of blots and graphs showing that addition of the 5’-UTR to the pCAG, pcDNA6B and pRRL vectors dramatically increased the protein expression of the transgenes. HEK293T cells in a 24-well plate (FIG. 22A) or 96-well plate (FIGS. 22B, 22C) were transfected with indicated vectors (500 ng/well in FIG. 22A or 100 ng/well in FIGS. 22B, 22C). At 48 h after transfection, EGFP expression was determined with Western blot (FIG. 22A), and the supernatants were collected for NanoLight Gaussia luciferase assay (FIGS. 22B, 22C). Data represents fold changes compared to corresponding LG group (FIG. 22B) or relative light unit of bioluminescence (FIG. 22C) non-UTR group with mean ± SE of 4 wells.

FIGS. 23A, 23B are a series of graphs showing that addition of the 5’-UTR upstream of in vitro transcribed mRNA significantly enhances protein expression in HEK293T cells. HEK293T cells in a 96-well plate were transfected using Lipofectamine® MessengerMAX mRNA Transfection Reagent with indicated mRNAs (50 ng/well) generated by in vitro transcription with a 5’ cap and a 3’ poly(A) tail. The mRNAs encode the indicated viral protein (E or S protein) or endogenous hACE2 protein fused with dual reporter LG or QLG. At 6-24 h after transfection, EGFP images were taken (data not shown), and the supernatants were collected for NanoLight Gaussia luciferase assay. Data represents relative fold changes compared to corresponding LG group with mean ± SE of 4 wells.

FIGS. 24A-24F are a series of schematics, photographs of stained cells, blots and graphs showing that Qa tagging and 5’-UTR inclusion boost the packaging and transduction efficiency of SARS-CoV-2 S protein-pseudotyped lentivirus-like particles (S-LVLP).

FIG. 24A: Diagram for different vectors expressing human codon-optimized Sd18 and the process of S-LVLP packaging.

FIG. 24B: Qa and 5’-UTR increase Sd18 protein expression in the transfected cells as determined by Western blot with serum from a SARS-CoV-2 patient.

FIGS. 24C-24D: Qa tagging increases S-LVLP packaging titer for the standard pRRL-GFP LV vector determined by GFP positivity, which is further increased by 5’-UTR inclusion.

FIGS. 24E-24F: Qa tagging and 5’-UTR inclusion increase S-LVLP packaging titer for dual reporter LV vectors pRRL-LG or pRRL-E-LG determined by GFP positivity and gdLuc activity.

DETAILED DESCRIPTION

The disclosure is based, in part, on the unexpected finding that an oligonucleotide, cAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7), which encodes a short peptide (termed herein “Qa”), significantly boosted the expression/production of a fusion protein. Further expanded studies identified the versatile property of Exen21/Qa tagging in boosting the production (by up to a thousand-fold) of various proteins, including viral proteins, endogenous gene products, vaccines, antibodies, engineered recombinant proteins and virus packaging proteins. Also discovered was the potent boosting of protein production by the SARS-CoV-2 native 5’-UTR, and its synergistic role with Qa tagging. Mechanistically, Qa increased mRNA/protein stability and/or enhanced protein translation, and facilitated protein secretion. These versatile protein-boosting strategies will be broadly beneficial to biomedical science and the protein engineering industry. This is the first evidence for protein regulation/boosting by short peptide tagging and the SARS-CoV-2 native 5’-UTR.
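The relationship between the 21-nucleotide sequence (SEQ ID NO: 7) and the Qa peptide can be checked by translating the oligonucleotide in frame with the standard genetic code. The following is a minimal sketch for illustration only; the function and variable names are assumptions and are not part of the disclosure. Read from its first nucleotide, the sequence yields the seven-residue peptide QPRFAAA, the sequence recited herein as SEQ ID NO: 1.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), listed in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# SEQ ID NO: 7 as written in the text (the case of the first base is kept).
EXEN21 = "cAACCGCGGTTCGCGGCCGCT"

peptide = translate(EXEN21)
print(len(EXEN21), peptide)  # 21 QPRFAAA
```

The codons are CAA (Q), CCG (P), CGG (R), TTC (F), GCG (A), GCC (A) and GCT (A).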

Accordingly, embodiments are directed to novel chimeric molecules comprising a peptide tag and a 5’-untranslated region (5’-UTR) for use in the enhanced production and expression of a desired biomolecule.

5’- and 3’-Untranslated Regions (UTRs)

An untranslated region (or UTR) refers to either of two sections, one on each side of a coding sequence on a strand of mRNA. If it is found on the 5' side, it is called the 5’-UTR (or leader sequence), or if it is found on the 3' side, it is called the 3’-UTR (or trailer sequence). The mRNA is initially transcribed from the corresponding DNA sequence and then translated into protein. However, several regions of the mRNA are usually not translated into protein, including the 5' and 3' UTRs.

Within the 5’-UTR is a sequence that is recognized by the ribosome which allows the ribosome to bind and initiate translation. The mechanism of translation initiation differs in prokaryotes and eukaryotes. The 3' UTR is found immediately following the translation stop codon. The 3' UTR plays a critical role in translation termination as well as post-transcriptional modification.

In this study, it was found that the presence of the native 5’-UTR and 3’-UTR in a standard protein expression system robustly enhances the expression of the viral subgenomic transcripts and further viral protein production. It was also found that the combination of the native 5’-UTR with a short peptide, termed herein the Qa peptide, further facilitated the production of viral and non-viral proteins.

Accordingly, in certain embodiments, a chimeric molecule for use in enhancing the expression and production of a desired biomolecule comprises one or more short peptide domains and one or more UTRs. In certain embodiments, the UTR is a 5’-UTR. In certain embodiments, the UTR is a 3’-UTR.

In certain embodiments, the one or more 5’-untranslated region (UTR) domains or fragments thereof, are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, togaviruses, orthomyxoviruses, rhabdoviruses or combinations thereof. In certain embodiments, the 5’-UTR and/or 3’-UTR are from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2.

In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 5’-UTR. In certain embodiments, the one or more 5’-UTR nucleic acid sequences or fragments thereof, comprise a SARS-CoV-2 5’-UTR.

In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 3’-UTR. In certain embodiments, the one or more 3’-UTR nucleic acid sequences or fragments thereof, comprise a SARS-CoV-2 3’-UTR.

In certain embodiments, the one or more UTR sequences are engineered to include a Shine-Dalgarno sequence (5'-AGGAGGU-3'). This sequence is found 3-10 base pairs upstream from the initiation codon. In certain embodiments, the one or more UTR sequences are engineered to contain a Kozak consensus sequence (ACCAUGG). In certain embodiments, one or more of the 5’-UTR sequences (or nucleic acid molecules each comprising a 5’-UTR sequence) may comprise a synthetic sequence (i.e., a sequence that is not found in nature).
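By way of illustration only, the placement of these two motifs relative to a start codon can be checked computationally. The sketch below is not part of the disclosure: the function names and the two toy sequences are hypothetical, and only the motif strings (AGGAGGU and ACCAUGG) and the 3-10 nucleotide spacing are taken from the text above.

```python
import re

SHINE_DALGARNO = "AGGAGGU"  # motif as recited in the text
KOZAK = "ACCAUGG"           # consensus as recited in the text


def to_rna(seq: str) -> str:
    """Normalize a sequence to upper-case RNA letters."""
    return seq.upper().replace("T", "U")


def kozak_start_codons(mrna: str) -> list[int]:
    """Return 0-based positions of each AUG that sits inside ACCAUGG."""
    rna = to_rna(mrna)
    # Lookahead keeps overlapping matches; the AUG begins 3 bases into the motif.
    return [m.start() + 3 for m in re.finditer(f"(?={KOZAK})", rna)]


def sd_spacing(mrna: str, start_codon: int):
    """Nucleotides between the nearest upstream AGGAGGU and the start codon.

    Returns None when no motif lies entirely upstream of the start codon.
    """
    rna = to_rna(mrna)
    pos = rna.rfind(SHINE_DALGARNO, 0, start_codon)
    if pos == -1:
        return None
    return start_codon - (pos + len(SHINE_DALGARNO))


# Hypothetical toy sequences, not taken from the disclosure.
toy_sd = "GGG" + SHINE_DALGARNO + "AACCC" + "AUG" + "GCU"
toy_kozak = "GGG" + KOZAK + "CU"

spacing = sd_spacing(toy_sd, toy_sd.index("AUG"))
print(spacing, 3 <= spacing <= 10)    # 5 True
print(kozak_start_codons(toy_kozak))  # [6]
```

A spacing outside the 3-10 range, or an empty list of Kozak-embedded start codons, would flag a construct for redesign under the assumptions of this sketch.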

In certain embodiments, one or more of the 5’-UTR sequences (or nucleic acid molecules each comprising a 5’-UTR sequence) may comprise an endogenous 5’-UTR sequence (i.e., a 5’- UTR sequence that is used in nature to recruit ribosome complexes and initiate translation of a transcript). For example, an endogenous 5’-UTR sequence may be part of a mRNA expressed in a cell or population of cells. The cells in the population of cells may be the same type of cell (e.g., HEK-293 cells, PC3 cells, or muscle cells). Alternatively, the population of cells may comprise different cell types (e.g., HEK-293 cells, PC3 cells, and muscle cells). Methods of identifying mRNAs expressed in a cell or population of cells, and of identifying the 5’-UTR sequences of the mRNAs, are known to those having skill in the art. Indeed, various public databases contain cellular mRNA expression and/or 5’-UTR sequence information.

The length of the 5’-UTR sequences (or the nucleic acid molecules each comprising a 5’-UTR sequence) may vary. For example, in some embodiments, at least two of the 5’-UTR sequences have different lengths. In some embodiments, at least two of the 5’-UTR sequences have the same length. In some embodiments, each of the 5’-UTR sequences has the same length. In some embodiments, the length of at least one of the 5’-UTR sequences in the initial chimeric molecule is 3, 4, 5, 6, 7, 8, 9, or 10 base pairs.

In some embodiments, the length of at least one of the 5’-UTR sequences (or the nucleic acid molecules each comprising a 5’-UTR sequence) is at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 150, at least 200, at least 250, at least 300, at least 350, at least 400, at least 450, at least 500, at least 550, at least 600, at least 650, at least 700, at least 750, at least 800, at least 850, at least 900, at least 950, at least 1000, at least 1500, at least 2000, or at least 3000 base pairs in length. In some embodiments, the length of each of the 5’-UTR sequences (or the nucleic acid molecules each comprising a 5’-UTR sequence) is at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 150, at least 200, at least 250, at least 300, at least 350, at least 400, at least 450, at least 500, at least 550, at least 600, at least 650, at least 700, at least 750, at least 800, at least 850, at least 900, at least 950, at least 1000, at least 1500, at least 2000, or at least 3000 base pairs in length.

In some embodiments in which the chimeric molecule comprises one or more coronavirus 5’-UTR and/or 3’-UTR sequences, the length of at least one UTR sequence is increased to a length of interest by adding nucleotides to one or both ends (e.g., by adding repeats of a motif that does not have known secondary structure). Nucleotides may be added to the 5' end, the 3' end, or both the 5' and 3' ends of a 5’-UTR and/or 3’-UTR sequence. In some embodiments, the length of one or more 5’- or 3’-UTR sequences is decreased to a length of interest by removing nucleotides from one or both ends. Nucleotides may be removed from the 5' end, the 3' end, or both the 5' and 3' ends of a 5’-UTR sequence.
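The length adjustments described above amount to simple string operations on the UTR sequence. The following sketch is illustrative only; the function names, the end labels and the filler motif "CAA" are assumptions chosen for the example and are not specified in the disclosure, which does not name a particular motif.

```python
def pad_to_length(utr: str, target: int, motif: str, end: str = "5prime") -> str:
    """Lengthen a UTR to `target` nucleotides by adding repeats of `motif`.

    The filler is added at the 5' end (prepended) or the 3' end (appended).
    A sequence that is already long enough is returned unchanged.
    """
    missing = target - len(utr)
    if missing <= 0:
        return utr
    repeats = -(-missing // len(motif))        # ceiling division
    filler = (motif * repeats)[:missing]       # truncate the last repeat if needed
    return filler + utr if end == "5prime" else utr + filler


def trim_to_length(utr: str, target: int, end: str = "5prime") -> str:
    """Shorten a UTR to `target` nucleotides by removing bases from one end."""
    excess = len(utr) - target
    if excess <= 0:
        return utr
    return utr[excess:] if end == "5prime" else utr[:-excess]


# Hypothetical toy inputs; "CAA" is a placeholder motif for illustration.
print(pad_to_length("AUGC", 10, "CAA"))          # CAACAAAUGC
print(trim_to_length("AUGCAUGC", 5, "3prime"))   # AUGCA
```

Adjusting both ends can be done by calling each function twice with intermediate target lengths.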

In certain embodiments, the UTR sequences comprise one or more mutations. The mutations may be introduced using a genetic algorithm. Examples of genetic algorithms are known to those having skill in the art. See e.g., Scrucca, L. GA: A Package for Genetic Algorithms in R. J. Stat. Softw. (2015). doi:10.18637/jss.v053.i04. The number of mutations introduced into each of the UTR sequences may vary. In some embodiments, at least one UTR sequence is mutated at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotide positions. A mutation may comprise a base pair substitution, a deletion, or an insertion.
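The three mutation types named above (substitution, deletion, insertion) correspond to the mutation step of a genetic algorithm. The sketch below shows only that step, under stated assumptions: it is not the R package cited above, it omits selection and crossover, and the function name, the uniform choice among mutation types and the toy input are hypothetical.

```python
import random

NUCLEOTIDES = "ACGU"


def mutate(seq: str, n_mutations: int, rng: random.Random) -> str:
    """Apply `n_mutations` random edits to an RNA sequence.

    Each edit is a substitution, a deletion, or an insertion. Edits may land
    on the same position, so the number of distinct mutated positions can be
    smaller than `n_mutations`.
    """
    bases = list(seq)
    for _ in range(n_mutations):
        kind = rng.choice(["substitution", "deletion", "insertion"])
        if kind == "substitution" and bases:
            i = rng.randrange(len(bases))
            # Always change the base to a different nucleotide.
            bases[i] = rng.choice([b for b in NUCLEOTIDES if b != bases[i]])
        elif kind == "deletion" and len(bases) > 1:
            del bases[rng.randrange(len(bases))]
        else:
            # Insertion; also the fallback when the sequence is too short
            # for the edit that was drawn.
            bases.insert(rng.randrange(len(bases) + 1), rng.choice(NUCLEOTIDES))
    return "".join(bases)


# Hypothetical toy input; a fixed seed makes the run repeatable.
rng = random.Random(0)
variant = mutate("ACGUACGUACGU", 5, rng)
print(variant, len(variant))
```

In a full genetic algorithm, each variant would then be scored (for example, by measured or predicted expression) and the best-scoring variants carried into the next round.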

Modifications: In certain embodiments, the UTRs comprise one or more chemically modified nucleotides (see Current Opinion in Drug Discovery and Development, 2007, 10:523). Kormann et al. have shown that the replacement of only 25% of uridine and cytidine residues by 2-thiouridine and 5-methylcytidine suffices to increase mRNA stability as well as to reduce the activation of innate immunity triggered by externally administered mRNA in vitro (WO2012/0195936 A1; WO2007024708 A2). Known modifications of RNA molecules can be found, for example, in Genes VI, Chapter 9 (“Interpreting the Genetic Code”), Lewin, ed. (1997, Oxford University Press, New York), and Modification and Editing of RNA, Grosjean and Benne, eds. (1998, ASM Press, Washington D.C.). Modified RNA components include the following: 2'-O-methylcytidine; N⁴-methylcytidine; N⁴-2'-O-dimethylcytidine; N⁴-acetylcytidine; 5-methylcytidine; 5,2'-O-dimethylcytidine; 5-hydroxymethylcytidine; 5-formylcytidine; 2'-O-methyl-5-formylcytidine; 3-methylcytidine; 2-thiocytidine; lysidine; 2'-O-methyluridine; 2-thiouridine; 2-thio-2'-O-methyluridine; 3,2'-O-dimethyluridine; 3-(3-amino-3-carboxypropyl)uridine; 4-thiouridine; ribosylthymine; 5,2'-O-dimethyluridine; 5-methyl-2-thiouridine; 5-hydroxyuridine; 5-methoxyuridine; uridine 5-oxyacetic acid; uridine 5-oxyacetic acid methyl ester; 5-carboxymethyluridine; 5-methoxycarbonylmethyluridine; 5-methoxycarbonylmethyl-2'-O-methyluridine; 5-methoxycarbonylmethyl-2'-thiouridine; 5-carbamoylmethyluridine; 5-carbamoylmethyl-2'-O-methyluridine; 5-(carboxyhydroxymethyl)uridine; 5-(carboxyhydroxymethyl)uridine methyl ester; 5-aminomethyl-2-thiouridine; 5-methylaminomethyluridine; 5-methylaminomethyl-2-thiouridine; 5-methylaminomethyl-2-selenouridine; 5-carboxymethylaminomethyluridine; 5-carboxymethylaminomethyl-2'-O-methyluridine; 5-carboxymethylaminomethyl-2-thiouridine; dihydrouridine; dihydroribosylthymine; 2'-methyladenosine; 2-methyladenosine; N⁶N-methyladenosine; N⁶,N⁶-dimethyladenosine; N⁶,2'-O-trimethyladenosine; 2-methylthio-N⁶N-isopentenyladenosine; N⁶-(cis-hydroxyisopentenyl)-adenosine; 2-methylthio-N⁶-(cis-hydroxyisopentenyl)-adenosine; N⁶-(glycinylcarbamoyl)adenosine; N⁶-threonylcarbamoyl adenosine; N⁶-methyl-N⁶-threonylcarbamoyl adenosine; 2-methylthio-N⁶-methyl-N⁶-threonylcarbamoyl adenosine; N⁶-hydroxynorvalylcarbamoyl adenosine; 2-methylthio-N⁶-hydroxynorvalylcarbamoyl adenosine; 2'-O-ribosyladenosine (phosphate); inosine; 2'-O-methyl inosine; 1-methyl inosine; 1,2'-O-dimethyl inosine; 2'-O-methyl guanosine; 1-methyl guanosine; N²-methyl guanosine; N²,N²-dimethyl guanosine; N²,2'-O-dimethyl guanosine; N²,N²,2'-O-trimethyl guanosine; 2'-O-ribosyl guanosine (phosphate); 7-methyl guanosine; N²,7-dimethyl guanosine; N²,N²,7-trimethyl guanosine; wyosine; methylwyosine; under-modified hydroxywybutosine; wybutosine; hydroxywybutosine; peroxywybutosine; queuosine; epoxyqueuosine; galactosyl-queuosine; mannosyl-queuosine; 7-cyano-7-deazaguanosine; archaeosine [also called 7-formamido-7-deazaguanosine]; and 7-aminomethyl-7-deazaguanosine.

In some embodiments, the UTR is a synthetic oligonucleotide. In some embodiments, the synthetic oligonucleotide comprises a modified nucleotide. Modification of the inter-nucleoside linker (i.e., backbone) can be utilized to increase stability or improve pharmacodynamic properties. For example, inter-nucleoside linker modifications prevent or reduce degradation by cellular nucleases, thus improving the pharmacokinetics and bioavailability of the UTR. Generally, a modified inter-nucleoside linker includes any linker, other than phosphodiester (PO) linkers, that covalently couples two nucleosides together. In some embodiments, the modified inter-nucleoside linker increases the nuclease resistance of the UTR compared to a phosphodiester linker. For naturally occurring oligonucleotides, the inter-nucleoside linker includes phosphate groups creating a phosphodiester bond between adjacent nucleosides. In some embodiments, the UTR comprises one or more inter-nucleoside linkers modified from the natural phosphodiester.

In some embodiments all of the inter-nucleoside linkers of the UTR, or contiguous nucleotide sequence thereof, are modified. For example, in some embodiments the inter-nucleoside linkage comprises sulfur (S), such as a phosphorothioate inter-nucleoside linkage.

Modifications to the ribose sugar or nucleobase can also be utilized herein. Generally, a modified nucleoside includes the introduction of one or more modifications of the sugar moiety or the nucleobase moiety. In some embodiments, the UTRs, as described, comprise one or more nucleosides comprising a modified sugar moiety, wherein the modified sugar moiety is a modification of the sugar moiety when compared to the ribose sugar moiety found in deoxyribose nucleic acid (DNA) and RNA. Numerous nucleosides with modification of the ribose sugar moiety can be utilized, primarily with the aim of improving certain properties of oligonucleotides, such as affinity and/or stability. Such modifications include those where the ribose ring structure is modified. These modifications include replacement with a hexose ring (HNA), a bicyclic ring having a biradical bridge between the C2 and C4 carbons on the ribose ring (e.g. locked nucleic acids (LNA)), or an unlinked ribose ring which typically lacks a bond between the C2 and C3 carbons (e.g. UNA). Other sugar modified nucleosides include, for example, bicyclohexose nucleic acids or tricyclic nucleic acids. Modified nucleosides also include nucleosides where the sugar moiety is replaced with a non-sugar moiety, for example in the case of peptide nucleic acids (PNA), or morpholino nucleic acids.

Sugar modifications also include modifications made by altering the substituent groups on the ribose ring to groups other than hydrogen, or the 2'-OH group naturally found in DNA and RNA nucleosides. Substituents may, for example, be introduced at the 2', 3', 4' or 5' positions. Nucleosides with modified sugar moieties also include 2' modified nucleosides, such as 2' substituted nucleosides. Indeed, much effort has been spent on developing 2' substituted nucleosides, and numerous 2' substituted nucleosides have been found to have beneficial properties when incorporated into oligonucleotides, such as enhanced nuclease resistance and enhanced affinity. A 2' sugar modified nucleoside is a nucleoside that has a substituent other than H or -OH at the 2' position (2' substituted nucleoside) or comprises a 2' linked biradicle, and includes 2' substituted nucleosides and LNA (2'-4' biradicle bridged) nucleosides. Examples of 2' substituted modified nucleosides are 2'-O-alkyl-RNA, 2'-O-methyl-RNA, 2'-alkoxy-RNA, 2'-O-methoxyethyl-RNA (MOE), 2'-amino-DNA, 2'-Fluoro-RNA, and 2'-F-ANA nucleoside. By way of further example, in some embodiments, the modification in the ribose group comprises a modification at the 2' position of the ribose group. In some embodiments, the modification at the 2' position of the ribose group is selected from the group consisting of 2'-O-methyl, 2'-fluoro, 2'-deoxy, and 2'-O-(2-methoxyethyl).

In some embodiments, the UTRs comprise one or more modified sugars. In some embodiments, the UTR comprises only modified sugars. In certain embodiments, the UTR comprises greater than 10%, 25%, 50%, 75%, or 90% modified sugars. In some embodiments, the modified sugar is a bicyclic sugar. In some embodiments, the modified sugar comprises a 2'-O-methoxyethyl group. In some embodiments, the UTR comprises both inter-nucleoside linker modifications and nucleoside modifications.

In additional aspects, the chimeric molecule comprises an internal ribosome entry site (IRES). As is understood in the art, an IRES is an RNA element that allows for translation initiation in an end-independent manner. In exemplary embodiments, the IRES is in the 5' UTR. In other embodiments, the IRES may be outside the 5' UTR.

Peptide Domains

As discussed above, the chimeric molecule for use in enhancing the expression and production of a desired biomolecule comprises one or more short peptide domains and one or more UTRs.

In certain embodiments, the chimeric molecule comprises one or more peptide domains. In certain embodiments, the one or more peptide domains comprise from about five amino acids to about twenty amino acids. In certain embodiments, the one or more peptide domains comprise about seven amino acids. In certain embodiments, the synthetic peptide tag comprises an amino acid sequence having at least about 70% (such as at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or greater) sequence identity to the sequence: QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptide domains comprise an amino acid sequence having at least a 70% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptide domains comprise an amino acid sequence having at least a 90% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptide domains comprise the amino acid sequence QPRFAAA (SEQ ID NO: 1).
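For a sequence as short as QPRFAAA, the identity thresholds recited above translate into a small number of permitted substitutions. The sketch below is illustrative only and rests on an assumption: it computes identity by ungapped, position-by-position comparison of equal-length sequences, whereas the disclosure does not specify an alignment method. The function name and the variant peptides are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch compares equal-length sequences only")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


QA = "QPRFAAA"  # SEQ ID NO: 1

# Hypothetical variants with one, two and three substitutions.
for variant in ("QPRFAAG", "QPKFAAG", "QAKFAAG"):
    print(variant, round(percent_identity(QA, variant), 1))
# QPRFAAG 85.7
# QPKFAAG 71.4
# QAKFAAG 57.1
```

Under this ungapped measure, a seven-residue variant with one substitution (6/7, about 85.7%) or two substitutions (5/7, about 71.4%) meets the 70% threshold, while only the exact sequence meets the 90% threshold.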

In certain embodiments, the one or more peptide domains of the chimeric molecule comprise an amino acid sequence of X_n-QPRFAAA-X_n, wherein each n is independently 0 or greater than or equal to 1 and each X is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof.

In certain embodiments, the one or more peptide domains comprise one or more non-natural amino acids or modified amino acids. Examples of modified amino acids include amino acids that have been phosphorylated, acetylated, glycosylated, carboxylated, hydroxylated, sulfated, and the like. Examples of non-natural amino acids include D-amino acids, homo amino acids, N-methyl amino acids, alpha-methyl amino acids, beta (homo) amino acids, gamma amino acids, helix/turn stabilizing motifs, and backbone modifications (e.g., peptoids). Other examples of amino acids that are contemplated include hydroxyproline (Hyp), beta-alanine, citrulline (Cit), ornithine (Orn), norleucine (Nle), 3-nitrotyrosine, nitroarginine, and pyroglutamic acid (Pyr).

Biomolecules of Interest

A fusion protein or chimeric molecule, e.g. a peptide domain and/or UTR sequence associated with a biomolecule such as a protein, of the present disclosure is obtained by associating a peptide tag with a target protein (also referred to as a fusion protein of a tag and a target protein). One or more chimeric molecules may be bound to the N-terminus of the target protein, to the C-terminus of the target protein, or to both the N-terminus and the C-terminus of the target protein, or may be inserted into an internal region of the tagged protein. The one or more chimeric molecules may be directly bound to the N-terminus and/or the C-terminus of the target protein or may be bound through a sequence of 1 to several amino acids (for example, 1 to 10 amino acids). The sequence of 1 to several amino acids may be any sequence as long as the sequence does not adversely affect the function or the expression level of the chimeric molecule-target protein. However, the chimeric molecules may be isolated from the target protein after expression and purification by using a protease recognition sequence.

In certain embodiments, one or more chimeric molecules are associated with one or more biomolecules of interest. Examples of biomolecules include cytokines, growth factors, viral antigens, tumor antigens, other antigens, polynucleotides, oligonucleotides, hormones, enzymes, checkpoint proteins, antibodies, transcription factors, receptors, ligands, immunoglobulins, immunoglobulin fragments, fluorescent proteins, etc. The length of the biomolecule, e.g. a peptide of interest, may vary as long as the amount of the targeted biomolecule produced is significantly increased when it is expressed in the form of a fusion peptide/chimeric molecule.

Examples of the enzyme include enzymes such as lipase, protease, steroid synthesizing enzyme, kinase, phosphatase, xylanase, esterase, methylase, demethylase, oxidase, reductase, cellulase, aromatase, Carnauba, transglutaminase, glycosidase, and chitinase. Growth factors include, for example, epidermal growth factor (EGF), insulin-like growth factor (IGF), transforming growth factor (TGF), nerve growth factor (NGF), brain derived neurotrophic factor (BDNF), vascular endothelial growth factor (VEGF), granulocyte colony stimulating factor (G-CSF), granulocyte macrophage colony stimulating factor (GM-CSF), platelet derived growth factor (PDGF), erythropoietin (EPO), thrombopoietin, fibroblast growth factor (FGF), and hepatocyte growth factor (HGF). Examples of the hormone include insulin, glucagon, somatostatin, growth hormone, parathyroid hormone, prolactin, leptin and calcitonin. Examples of cytokines include interleukin, interferon (IFN alpha, IFN beta, IFN gamma), and tumor necrosis factor (TNF). Blood proteins include, for example, thrombin, serum albumin, Factor VII, Factor X, and tissue plasminogen activator. Antibody proteins include, for example, F(ab')2, Fc, Fc fusion protein, heavy chain (H chain), light chain (L chain), single chain Fv (scFv), sc(Fv)2, disulfide-linked Fv (sdFv), and diabodies. Immune checkpoint proteins are well known in the art and include, without limitation, CTLA-4, PD-1, VISTA, B7-H2, B7-H3, PD-L1, B7-H4, B7-H6, 2B4, ICOS, HVEM, PD-L2, CD160, gp49B, PIR-B, KIR family receptors, TIM-1, TIM-3, TIM-4, LAG-3, BTLA, SIRPalpha (CD47), CD48, 2B4 (CD244), B7.1, B7.2, ILT-2, ILT-4, TIGIT, and A2aR.

Antigens may be appropriately selected depending on the subject of the immunological response, for example, a protein derived from a pathogenic bacterium, or a protein derived from a pathogenic virus.

The chimeric molecules may be combined with a secretory signal peptide functioning in the host cell for secretory production. When yeast is used as a host, the secretory signal peptide can be exemplified by an invertase secretion signal. In certain embodiments, the secretory signal is obtained from two or more different sources. Various sources include, for example, Bacillus species, Lactococcus lactis, Streptomyces, or Corynebacterium. Other signal sequences include, for example, human IL-2, human chymotrypsin, human interferon gamma, etc.

In certain embodiments, a transport signal peptide such as an endoplasmic reticulum retention signal peptide or a liquid phase transition signal peptide may be added to the chimeric molecules for expression in a specific cell compartment.

The chimeric biomolecules can be chemically synthesized or can be genetically produced. The DNA of the present disclosure is characterized by including nucleic acids encoding the chimeric molecule of the present disclosure.

The DNA of the present disclosure may contain an enhancer sequence or the like functioning in the host cell so as to improve the expression in the host cell. Examples of the enhancer include the Kozak sequence and the 5'-untranslated region of the plant-derived alcohol dehydrogenase gene.

Constructs

Genetic constructs or vectors comprise a nucleotide sequence that encodes a desired protein operably linked to regulatory elements needed for gene expression. Accordingly, incorporation of the DNA or RNA molecule into a living cell results in the expression of the DNA or RNA encoding the desired protein and thus, production of the desired protein. The chimeric molecules of the present disclosure can be produced by a general genetic engineering technique, for example, by using a recombinant vector encoding the chimeric molecule. The recombinant vector of the present disclosure is not particularly limited as long as the nucleic acid sequence encoding the chimeric molecule is inserted into the vector so that it can be expressed in a host cell into which the vector is introduced. The vector is not particularly limited as long as it is replicable in the host cell, and examples thereof include plasmid DNA and viral DNA. The regulatory elements necessary for gene expression of a DNA molecule include: a promoter, an initiation codon, a stop codon, and a polyadenylation signal. In addition, enhancers are often required for gene expression. It is necessary that these elements be operably linked to the sequence that encodes the desired proteins and that the regulatory elements be operable in the individual to whom they are administered.

Initiation codons and stop codons are generally considered to be part of a nucleotide sequence that encodes the desired protein. However, it is necessary that these elements are functional in the individual to whom the gene construct is administered. The initiation and termination codons must be in frame with the coding sequence.
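The in-frame requirement can be illustrated with a simple sequence check. The sketch below assumes a DNA coding sequence written 5' to 3', an ATG initiation codon, and the three stop codons of the standard genetic code; the function name and the short test sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_reading_frame(cds: str) -> list[str]:
    """Return a list of problems found in a DNA coding sequence.

    An empty list means the start codon, the stop codon, and every
    codon in between lie in one continuous reading frame.
    """
    cds = cds.upper()
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length is not a multiple of 3")
    if not cds.startswith("ATG"):
        problems.append("does not begin with an ATG initiation codon")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with an in-frame stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("contains a premature in-frame stop codon")
    return problems


print(check_reading_frame("ATGGCTTAA"))     # start, one codon, stop: no problems
print(check_reading_frame("ATGGCTTA"))      # frame broken by a missing base
print(check_reading_frame("ATGTAAGCTTAA"))  # premature stop codon
```

A check like this is useful after inserting a tag between a target protein and a reporter, since an insertion whose length is not a multiple of three shifts the downstream reading frame.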

The molecule that encodes a desired protein may be DNA or RNA which comprises a nucleotide sequence that encodes the desired protein. These molecules may be cDNA, genomic DNA, synthesized DNA or a hybrid thereof or an RNA molecule such as mRNA. Accordingly, as used herein, the terms “DNA construct”, “genetic construct”, “nucleotide sequence”, and “nucleic acid” are meant to refer to both DNA and RNA molecules.

When taken up by a cell, the genetic construct which includes the nucleotide sequence encoding the desired protein operably linked to the regulatory elements may remain present in the cell as a functioning extrachromosomal molecule or it may integrate into the cell's chromosomal DNA. DNA may be introduced into cells where it remains as separate genetic material in the form of a plasmid. Alternatively, linear DNA which can integrate into the chromosome may be introduced into the cell. When introducing DNA into the cell, reagents which promote DNA integration into chromosomes may be added. DNA sequences which are useful to promote integration may also be included in the DNA molecule. Alternatively, RNA may be administered to the cell. It is also contemplated to provide the genetic construct as a linear minichromosome including a centromere, telomeres and an origin of replication.

Accordingly, in certain embodiments, the present disclosure includes a vector comprising one or more cassettes comprising: a UTR, a biomolecule, and a peptide tag domain, e.g. Qa tag (SEQ ID NO: 1). The vector can be any vector that is known in the art and is suitable for expressing the desired expression cassette. A number of vectors are known or can be designed to be capable of mediating transfer of gene products to mammalian cells, as is known in the art and described herein. In certain aspects, a vector refers to a nucleic acid polynucleotide to be delivered to a host cell, either in vitro or in vivo. In some embodiments, one or more cassettes are provided on a single vector. In some embodiments, one or more cassettes are provided on two or more vectors. In some embodiments, cassettes are provided by one or more vectors comprising an isolated nucleic acid encoding one or more elements of a gene editing system. In some embodiments, the cassettes are provided by one or more vectors comprising an isolated nucleic acid encoding one or more components comprising: UTR(s), biomolecule(s), and peptide tag(s). In some instances, the expression of natural or synthetic nucleic acids encoding an RNA and/or peptide is typically achieved by operably linking a nucleic acid encoding the RNA and/or peptide or portions thereof to a promoter and incorporating the construct into an expression vector. The vectors to be used are suitable for replication and, optionally, integration in eukaryotic cells. Typical vectors contain transcription and translation terminators, initiation sequences, and promoters useful for regulation of the expression of the desired nucleic acid sequence.

The isolated nucleic acids of the disclosure can be cloned into a number of types of vectors. For example, the nucleic acid can be cloned into a vector including, but not limited to a plasmid, a phagemid, a phage derivative, an animal virus, and a cosmid. Vectors of particular interest include expression vectors, replication vectors, probe generation vectors, and sequencing vectors.

Additional promoter elements, e.g., enhancers, regulate the frequency of transcriptional initiation. In some embodiments, the vector also includes conventional control elements which are operably linked to the transgene in a manner which permits its transcription, translation and/or expression in a cell transfected with the plasmid vector or infected with the virus comprising a nucleic acid comprising the described cassettes or compositions. As used herein, “operably linked” sequences include both expression control sequences that are contiguous with the gene of interest and expression control sequences that act in trans or at a distance to control the gene of interest. Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation (polyA) signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (i.e., Kozak consensus sequence); sequences that enhance protein stability; and when desired, sequences that enhance secretion of the encoded product. A great number of expression control sequences, including promoters which are native, constitutive, inducible and/or tissue-specific, are known in the art and can be utilized.

Typically, promoter elements are located in the region 30-110 bp upstream of the start site, although a number of promoters have recently been shown to contain functional elements downstream of the start site as well. The spacing between promoter elements frequently is flexible, so that promoter function is preserved when elements are inverted or moved relative to one another. In the thymidine kinase (tk) promoter, the spacing between promoter elements can be increased to 50 bp apart before activity begins to decline. Depending on the promoter, it appears that individual elements can function either cooperatively or independently to activate transcription.

The selection of appropriate promoters can readily be accomplished. In certain aspects, one would use a high expression promoter. Promoters and polyadenylation signals used must be functional within the cells of the individual. The promoter used in the vector may be appropriately selected depending on the host cell into which the vector is introduced. For example, when expressed in yeast, the GAL1 promoter, the PGK1 promoter, the TEF1 promoter, the ADH1 promoter, the TPI1 promoter, the PYK1 promoter and the like can be used. When expressed in plants, the Cauliflower Mosaic Virus 35S promoter, rice actin promoter, corn ubiquitin promoter, lettuce ubiquitin promoter, and the like can be used. When expressed in Escherichia coli, the T7 promoter and the like can be used. In the case of expression in Brevibacillus, the P2 promoter, the P22 promoter and the like can be mentioned. Inducible promoters may also be used. Examples include, in addition to lac, tac and trc, which are inducible by IPTG: trp, which can be induced by IAA; ara, which can be induced by L-arabinose; Pzt-1, which can be induced by using tetracycline; the λ PL promoter, which is inducible at high temperature (42 °C); and the promoter of the cspA gene, which is one of the cold shock genes. Other examples of promoters useful in the production of a genetic vaccine for humans include, but are not limited to, promoters from Simian Virus 40 (SV40), Mouse Mammary Tumor Virus (MMTV) promoter, Human Immunodeficiency Virus (HIV) such as the HIV Long Terminal Repeat (LTR) promoter, Moloney virus, ALV, Cytomegalovirus (CMV) such as the CMV immediate early promoter, Epstein Barr Virus (EBV), Rous Sarcoma Virus (RSV) as well as promoters from human genes such as human Actin, human Myosin, human Hemoglobin, human muscle creatine and human metallothionein.
Examples of polyadenylation signals useful to practice the present disclosure, especially in the production of a genetic vaccine for humans, include but are not limited to SV40 polyadenylation signals and LTR polyadenylation signals. In particular, the SV40 polyadenylation signal which is in pCEP4 plasmid (Invitrogen, San Diego Calif.), referred to as the SV40 polyadenylation signal, is used.

One example of a suitable promoter is the CAG promoter or the immediate early cytomegalovirus (CMV) promoter sequence. This promoter sequence is a strong constitutive promoter sequence capable of driving high levels of expression of any polynucleotide sequence operatively linked thereto. In certain embodiments, the Rous sarcoma virus (RSV) and MMT promoters are also used. Certain proteins can be expressed using their native promoter. Other elements that can enhance expression can also be included, such as an enhancer or a system that results in high levels of expression such as a tat gene and tar element. This cassette can then be inserted into a vector, e.g., a plasmid vector such as pUC19, pUC118, pBR322, or other known plasmid vectors, that includes, for example, an E. coli origin of replication.

Another example of a suitable promoter is Elongation Growth Factor-1α (EF-1α). However, in some embodiments, other constitutive promoter sequences are used, including, but not limited to the simian virus 40 (SV40) early promoter, mouse mammary tumor virus (MMTV), human immunodeficiency virus (HIV) long terminal repeat (LTR) promoter, MoMuLV promoter, an avian leukemia virus promoter, an Epstein-Barr virus immediate early promoter, a Rous sarcoma virus promoter, as well as human gene promoters such as, but not limited to, the actin promoter, the myosin promoter, the hemoglobin promoter, and the creatine kinase promoter. Further, the disclosure should not be limited to the use of constitutive promoters. Inducible promoters are also contemplated as part of the disclosure. The use of an inducible promoter provides a molecular switch capable of turning on expression of the polynucleotide sequence to which it is operatively linked when such expression is desired or turning off the expression when expression is not desired. Examples of inducible promoters include, but are not limited to a metallothionein promoter, a glucocorticoid promoter, a progesterone promoter, and a tetracycline promoter.

Enhancer sequences found on a vector also regulate expression of the gene contained therein. Typically, enhancers are bound by protein factors to enhance the transcription of a gene. In some instances, enhancers are located upstream or downstream of the gene they regulate. In some instances, enhancers are also tissue-specific to enhance transcription in a specific cell or tissue type. In some embodiments, the vector of the present disclosure comprises one or more enhancers to boost transcription of the gene present within the vector. In some instances, to assess the expression of the nucleic acid and/or protein, the expression vector to be introduced into a cell can also contain either a selectable marker gene or a reporter gene or both to facilitate identification and selection of expressing cells from the population of cells sought to be transfected or infected through viral vectors. In other embodiments, the selectable marker is carried on a separate piece of DNA and used in a co-transfection procedure. Both selectable markers and reporter genes can be flanked with appropriate regulatory sequences to enable expression in the host cells. Useful selectable markers include, for example, antibiotic-resistance genes, such as neo and the like.

If necessary, a terminator sequence may also be included depending on the host cell.

The recombinant vector of the present disclosure can be produced, for example, by digesting a DNA construct with a suitable restriction enzyme, or adding a restriction enzyme site by PCR, and inserting the DNA construct into a restriction enzyme site or a multicloning site of the vector.

Host cells. The host cell used for transformation (“transformant”) may be eukaryotic cells or prokaryotic cells, preferably eukaryotic cells. In certain embodiments, eukaryotic cells such as yeast cells, mammalian cells, plant cells, insect cells and the like are used. Examples of the yeast include Saccharomyces cerevisiae, Candida utilis, Schizosaccharomyces pombe, Pichia pastoris, and the like. In addition, microorganisms such as Aspergillus may be used. Examples of prokaryotic cells include Escherichia coli, Lactobacillus, Bacillus, Brevibacillus, Agrobacterium tumefaciens, actinomycetes and the like. Plant cells include plant cells belonging to Asteraceae, Solanaceae, Brassicaceae, Rosaceae, Chenopodiaceae, etc., such as Lactuca. The transformant used in the present disclosure can be produced by introducing the recombinant vector of the present disclosure into a host cell using a general genetic engineering technique. For example, an electroporation method (Tada, et al., 1990, Theor. Appl. Genet, 80: 475), a protoplast method (Gene, 39, 281-286 (1985)), a polyethylene glycol method, 1993, Transgenic, Res. 2: 218, Hiei, et al., 1994), Agrobacterium-mediated transformation (Hood et al., 1991, Theor. Appl. Genet. J. 6: 271), a particle gun method (Sanford et al., 1987, J. Part. Sci. Tech. 5: 27), or a polycation method (Ohtsuki, et al., FEBS Lett. 3): 235-240) can be used. In addition, gene expression may be a transient expression or a stable expression inserted into a chromosome.

After introducing the recombinant vector of the present disclosure into a host cell, the transformants can be selected according to the phenotype of the selection marker. In addition, the tagged protein can be produced by culturing the selected transformant. The culture medium and conditions used for the culture can be appropriately selected depending on the species of the transformant.

When the host cell is a plant cell, the plant cell can be regenerated by culturing the selected plant cell by a conventional method, and the tag-added protein can be accumulated in the plant cell or outside the cell membrane of the plant cell.

Tagged biomolecules which have accumulated in the culture medium or in the cells can be separated and purified according to methods well known to those skilled in the art. For example, they can be separated and purified by a method known in the art, such as salting out, ethanol precipitation, ultrafiltration, gel filtration chromatography, ion exchange column chromatography, affinity chromatography, medium pressure liquid chromatography, reversed phase chromatography, or hydrophobic chromatography.

Hereinafter, embodiments of the present disclosure will be described, but the present disclosure is not limited to these embodiments.

Examples

Example 1: Tagging for enhancing protein expression/secretion

Proteins play a key role in various physiological processes and pathological conditions. Protein expression is a critical and integral process in biological and medical research, but it can be difficult and costly to increase for large-scale applications. Enhancing the production yield of protein expression/secretion is increasingly important for biopharmaceutical development, immunological/vaccine industry, and biological therapeutics.

Results

Unexpected discovery of short peptide Qa tag to enhance gene expression. During the preliminary study to express SARS-CoV-2 viral proteins in mammalian cells, the aim was to set up a dual reporter system by fusing Gaussia-Dura luciferase (gdLuc) and destabilized GFP (dsGFP), abbreviated as LG, to the C-terminus of SARS-CoV-2 viral proteins (FIG. 1A). The advantage of this dual reporter is the dynamic quantitative measurement of secretory gdLuc-fused target protein in the culture media with a sensitive gdLuc assay, and of dsGFP positivity and intensity by fluorescent microscopy and flow cytometry. Because a lentiviral (LV) system allows a wider range of target cells and easy establishment of stable cell lines, an LV pCDH-nCoV-E-Flag vector (Zhang et al., 2020), which expresses the SARS-CoV-2 structural envelope (E) protein, was selected as the backbone for dual reporter cloning. NEBbuilder-HiFi cloning was performed via NotI/ApaI sites using two fragments derived by PCR using the primers Flag-Not-gdLuc-F and gdLuc-P2A-R for the gdLuc fragment and P2A-dsGFP-F and dsGFP-PCR-Apa-R for the dsGFP fragment. Because this cloning failed, a new pair of primers was designed to generate one fragment by overlap PCR from the PCR products of the above two fragments, and this fragment was cloned into the pcDNA6B-nCoV-E-Flag vector (Zhang et al., 2020) via the SacII cloning site using the NEBbuilder-HiFi kit. After confirmation of the correct clones by restriction enzyme digestion, the positive clones, E1 and E7, were tested for protein expression by fluorescent microscopy and secretory gdLuc reporter assay (FIG. 1A). Surprisingly, it was found that E7 exhibited >20-fold higher luciferase activity than E1. After Sanger sequencing, it was discovered that the E7 clone had an additional 21 nucleotides before the LG and after the Flag tag, which encoded 7 amino acids (aa) in frame. This 7-amino acid peptide was assigned the term “Qa” based on the potential pronunciation of the sequence, and its linked LG is referred to as QLG in the following studies. Further validation studies confirmed that the pcDNA6B-E-QLG had up to 90-fold higher expression than pcDNA6B-E-LG (FIGS. 1A-1D). The effect of this Qa tag on the expression of other structural proteins of SARS-CoV-2, such as spike (S), nucleocapsid (N), and membrane (M), as well as accessory proteins NSP2, NSP16 and ORF3 was examined. It was found that Qa enhanced the expression of all the tested viral proteins by 3- to 4000-fold (FIGS. 1E-1F and FIG. 2A), with the extent depending on each protein. Such variation in enhancing efficiency may result from differences in cellular density/functionality, transfection efficiency, reporter dosage, and viral protein types. Similar enhancing effects are applicable to many non-viral proteins (FIGS. 2B-2E). Interestingly, transfection with a lower amount of plasmid DNA in HEK293T cells showed a higher enhancing efficiency for most viral proteins of SARS-CoV-2 (FIG. 2A), but not for host cellular gene products such as mouse NIBP (FIG. 2B) and human ACE2 (FIGS. 2C, 2D) or cytokines such as IFNγ (FIG. 2E) and IL-2 (FIG. 2F). A similar enhancing function is applicable to other cell types such as HeLa, BHK and others (FIG. 2G). In addition to regular plasmids, this Qa tag also enhances protein expression in viral transfer vectors such as LV vectors (FIGS. 2E, 2F). In summary, Qa tagging is a versatile means of enhancing protein expression/production across various genes, cell types and species.
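Fold enhancement values such as those reported above can be derived from reporter readings in several ways, and the disclosure does not specify the calculation. The sketch below assumes the simplest one, a ratio of mean background-subtracted signal between the tagged and untagged constructs. The function name and all numerical readings are hypothetical and are not data from the present disclosure.

```python
from statistics import mean


def fold_enhancement(tagged: list[float], untagged: list[float],
                     background: float = 0.0) -> float:
    """Ratio of mean background-subtracted reporter signal, tagged vs. untagged."""
    tagged_signal = mean(tagged) - background
    untagged_signal = mean(untagged) - background
    if untagged_signal <= 0:
        raise ValueError("untagged signal must exceed background")
    return tagged_signal / untagged_signal


# Hypothetical relative light units (RLU) from replicate wells.
without_tag = [1000.0, 1100.0, 900.0]
with_tag = [21000.0, 19000.0, 20000.0]
print(fold_enhancement(with_tag, without_tag))
```

Because the ratio is sensitive to a small denominator, weakly expressed untagged constructs can yield very large fold values, which is one reason the extent of enhancement may vary widely between proteins.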

Further enhancement of viral protein expression by optimization of promoter and native UTR. During the initial studies herein to express SARS-CoV-2 viral proteins and examine their cellular distribution and functional role in the pathogenesis of COVID-19, most viral proteins showed absent or low expression (data not shown). In the pcDNA6B expression vector system (CMV promoter), it was found that most viral proteins showed undetectable expression by Western blot and immunocytochemistry, while the host cellular gene hACE2 showed very strong expression. In the pCAG vector system (CAG promoter), it was found that most viral proteins were detectable but to various degrees. However, the expression of E and S proteins remained very weak or undetectable by immunocytochemistry and Western blot with anti-Flag antibody, which is in concordance with several reports (Boson et al., 2020; Hu et al., 2020; Ou et al., 2020; Zhang et al., 2020). To solve this problem, the dual LG reporter system was established for highly sensitive and quantitative analysis. The data suggested that the LG system can detect the expression of both E and S proteins in either the pcDNA6B or the pCAG vector, although the levels remained very low; Qa tagging, however, robustly increased their expression (FIG. 3A). Again, the CAG promoter exhibited higher activity than the CMV promoter, as previously demonstrated (Dou et al., 2021; Zhang et al., 2020). To address whether Qa tagging further enhanced CAG-driven gene expression of viral proteins, parallel experiments were performed between the CMV and CAG promoters for viral proteins E and S. As shown in FIGS. 3A-3C, Qa induced stronger enhancement of viral E and S proteins in the presence of the stronger CAG promoter (5~6-fold). The enhancement by Qa of CAG-driven NSP16 expression reached up to 212-fold (FIG. 3C).

To test whether the SARS-CoV-2 native UTR regulates Qa-accelerated expression of the viral proteins, a DNA fragment containing 5’-UTR-E-Flag-Qa-3’-UTR was synthesized based on the public SARS-CoV-2 strain (Wu et al., 2020) and cloned into the pCAG vector. The E protein was selected because its relatively small size allows cheap and fast synthesis as compared with S protein. Surprisingly, it was found that the addition of the native 5’-UTR robustly enhanced the expression of E protein as determined by Western blot and immunocytochemistry with anti-Flag antibody (FIGS. 3D, 3E). Inclusion of the native 3’-UTR further increased E protein expression (FIGS. 3D, 3E). S protein is very important for vaccine development, pseudovirion production, and drug discovery; however, its expression is the most difficult among the viral proteins of SARS-CoV-2 (Boson et al., 2020; Hu et al., 2020; Ou et al., 2020; Walls et al., 2020; Wang et al., 2020; Zhang et al., 2020). To maximize the production of S protein, the native 5’-UTR was added upstream of Qa-tagged S protein in the pCAG expression vector, which shows 6-fold higher expression than the pcDNA6B vector (FIG. 3C). Addition of the 5’-UTR further enhanced S protein production by 20~70-fold as compared with CAG-driven Qa-tagged S protein as determined by fluorescent immunocytochemistry (FIG. 3F) and gdLuc assay (FIG. 3G). A similar enhancement occurs with the CMV-driven Qa-tagged S protein expression system (FIG. 3H). In the LV vector, addition of the 5’-UTR enhances the dual reporter Flag-tagged LG and QLG in the absence of viral proteins (FIG. 3I). These data provide evidence that optimization of gene expression components (promoter, UTR) further increases Qa-tagged viral protein expression. Importantly, the native 5’-UTR of SARS-CoV-2 alone dramatically enhances the expression of the targeted gene protein.

Enhancing of SARS-CoV-2 S pseudovirion production. Pseudotyped virus has been widely used not only for gene delivery but also for studies of vaccine production, antibody neutralization, cellular entry, and pathogenic mechanisms. Pseudovirions are an excellent alternative to live virus for high-risk viruses that require BSL3 facilities, such as SARS-CoV-2 and its variants (Korber et al., 2020; Muik et al., 2021; Nie et al., 2020; Walls et al., 2020; Weissman et al., 2021; Wibmer et al., 2021a). For example, limited access to BSL3 and ABSL3 facilities has slowed down basic research and vaccine/therapy development for COVID-19. A pseudovirion is a virus-like particle coated with viral surface or membrane proteins that harbor specific cellular tropism (Kuzmina et al., 2021; Walls et al., 2020; Wibmer et al., 2021a). Virus-like particles pseudotyped with S protein will have better immune responses than individual viral proteins due to the similarity of their three-dimensional structure to live virus (Kuzmina et al., 2021; Walls et al., 2020; Wibmer et al., 2021a). SARS-CoV-2 S protein has been widely used to generate S pseudovirions, but the packaging efficiency for lentivirus-like (LVLP) or VSV-like particles (VSVLP) has been very low in most reports, even when using the codon-optimized C-terminal deletion S protein (Korber et al., 2020; Muik et al., 2021; Ou et al., 2020; Walls et al., 2020). Given that Qa tagging enhances S protein production in mammalian cells, it was speculated that Qa could enhance the packaging efficiency of S pseudotyped LVLP (S-LVLP). Using the C-terminal 18 amino acids-deleted codon-optimized SARS-CoV-2 S protein (Sd18), which is widely used for S pseudovirus studies, as the test platform (FIG. 4A), it was validated that Qa addition on the C-terminus of Sd18 (Sd18Q) enhanced Sd18 expression, which was further increased by inclusion of the 5’-UTR, as determined by Western blot analysis (FIG. 4B). It was found that Qa tagging increased the S-LVLP packaging efficiency by ~2-4-fold using a standard pRRL-GFP LV reporter transfer vector, as determined by microscopy (FIG. 4C) and flow cytometry (FIG. 4D) in HEK-hACE2 cells. Addition of the 5’-UTR further increased the packaging efficiency by ~4-10-fold (FIGS. 4C, 4D). As with regular VSV-G-pseudotyped LV packaging, polybrene treatment increased the titer of S-LVLP (FIG. 4E) and simplified high-speed sucrose concentration/purification retained the transduction capability (FIG. 4F). To provide dynamic measurement of S-pseudovirion transduction, the packaging efficiency was tested for the dual reporter LV vectors pRRL-LG and pRRL-E-LG, which can harbor larger inserts than the GFP insert alone (FIGS. 4G, 4H). As expected, the original Sd18 had very low packaging efficiency for pRRL-LG and pRRL-E-LG. However, Qa addition significantly enhanced their transduction efficiency, and 5’-UTR addition further enhanced the transduction efficiency as determined by fluorescent microscopy and gdLuc assay (FIGS. 4G, 4H).
These data provide evidence that both Qa and 5’-UTR additions on the Sd18 expression system significantly enhanced the packaging and transduction efficiency of SARS-CoV-2 S-LVLP.

Enhancing of DNA and mRNA vaccine production. One significant immediate use of Qa tagging could be to enhance vaccine production to meet the urgent need to fight COVID-19. The most promising vaccines against SARS-CoV-2 or its variants are derived from the mRNA or DNA encoding S protein or other SARS-CoV-2 viral proteins. Taking S protein as an example, Qa tagging increased S protein expression by 4~27-fold in the CMV promoter-driven cDNA expression vector (FIG. 1H), and an additional 6-fold increase occurs when the CAG promoter is utilized (FIG. 3C). Inclusion of the 5’-UTR in the CAG-driven cDNA expression vector further increased S protein expression by ~20-70-fold (FIG. 3G). Therefore, Qa tagging plus 5’-UTR regulation for an S protein-encoding DNA vaccine enhanced the vaccine production by at least 200-fold (FIG. 3H). Such an enhancement of DNA vaccine production on a large scale will robustly reduce the cost and expedite the availability of COVID-19 vaccines. Since mRNA vaccines exhibit numerous advantages over other vaccines, and the COVID-19 S protein mRNA vaccine has been well established for extensive human application during the pandemic, it was hypothesized that Qa tagging plus the 5’-UTR would enhance mRNA-dependent translation, leading to increased expression of viral proteins such as S protein for vaccine production. To test this, in vitro transcription was performed to generate capped mRNA with the Qa tag, and it was examined whether Qa tagging can affect the translation of viral proteins after mRNA transfection in HEK293T cells. As shown in FIG. 5A, the presence of the Qa tag significantly increased the production of viral protein S from the transfected functional mRNAs in a time-dependent and dose-dependent manner. Such enhancement is also applicable to the mRNAs of other viral proteins N, E, and ORF3 as well as the host cellular gene ACE2 (FIGS. 5B, 5C). Addition of the 5’-UTR significantly increased the mRNA-dependent translation of Qa-tagged viral S protein (FIG. 5D) as well as viral E protein and cellular ACE2 (FIG. 5E), consistent with the cDNA expression vector (FIGS. 3D-3I). These data provide evidence that Qa tagging and 5’-UTR inclusion enhance translational efficiency, leading to an increase in the production of the reporter protein for all the tested targets, with the extent depending on the targeted protein.

To further determine if Qa tagging regulated mRNA-dependent translation, the dynamic changes of translational products were measured after transcriptional inhibition with actinomycin D. It was found that treatment with actinomycin D completely blocked the production of viral protein S (FIG. 5F) and ORF3 (FIG. 5G) without Qa tagging as determined by the gdLuc activity. However, Qa tagging increased the protein expression/production during transcriptional inhibition, and the product accumulated in a time-dependent manner (FIGS. 5F, 5G). These data provide evidence that Qa tagging on the targeted protein facilitates the protein expression through posttranscriptional regulation (increased translation efficiency and/or mRNA stability). To further determine if Qa tagging influenced mRNA stability of the targeted genes, mRNA decay assays were performed using S and E viral proteins as examples. Although the decay time courses of S and E viral mRNAs showed different patterns, Qa tagging on both S (FIG. 5H) and E (FIG. 5I) viral proteins increased the mRNA half-life by around 6-7 h.
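
As an illustration of how a half-life can be read from an mRNA decay time course such as the one described above, the following minimal sketch fits a first-order decay model to the fraction of mRNA remaining after transcriptional inhibition. The first-order model, the function name and the example time points are assumptions made for illustration; the source does not state how the half-lives were calculated.

```python
import math

def half_life_hours(times_h, remaining_fraction):
    """Estimate mRNA half-life assuming first-order decay.

    Fits ln(fraction remaining) = intercept - k * t by ordinary least
    squares and returns ln(2) / k. Inputs are hypothetical measurements.
    """
    ys = [math.log(f) for f in remaining_fraction]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, ys))
    decay_rate = -sxy / sxx  # k, in 1/h
    return math.log(2) / decay_rate

# Hypothetical time course generated from an exact 4 h half-life.
example_times = [0.0, 2.0, 4.0, 8.0]
example_fractions = [0.5 ** (t / 4.0) for t in example_times]
example_half_life = half_life_hours(example_times, example_fractions)
```

Comparing the value returned for a tagged and an untagged transcript gives the difference in half-life; real decay curves that deviate from first-order kinetics would need a different model.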

Taken all together, Qa tagging and native 5’-UTR inclusion on a target mRNA significantly increased mRNA stability and translational efficiency and thus enhanced the protein expression/production of the targeted mRNA (e.g., S protein mRNA vaccine). Such enhancing not only reduces the cost but also stimulates vaccine response due to higher level of S protein expression/release from vaccinated target cells.

Enhancing of antibody production. Therapeutics based on effective monoclonal antibodies (mAb) require optimization of antibody production in a suitable cell culture platform, which relies on high-performance expression vectors. Various genetic elements in monoclonal antibody production vectors have been widely modified. To determine if the novel Qa tagging would enhance antibody production, the human anti-SARS-CoV monoclonal antibody (BEI, CR3022) was used as a test platform. The Qa tag was cloned into the C-terminus of the immunoglobulin heavy and light chain (H/L) of CR3022, which contains variable regions of heavy and light chains derived from human anti-SARS-CoV mAb (GenBank: DQ168569 and DQ168570, respectively), to generate Qa-tagged HQ and LQ (FIG. 6A). The HQ and LQ were co-transfected into HEK293T cells to generate Qa-tagged monoclonal antibody, using the original H and L vectors (NR52399 and NR52400) as a control. The supernatants containing the monoclonal antibody were collected at 2-3 days after transfection and their levels were measured using sandwich ELISA with SARS-CoV-2 S protein as the coating antigen (FIGS. 6B, 6C). It was found that Qa tagging enhanced the antibody production by up to 37-fold with or without the normalization for transfection efficiency (FIG. 6D). The enhancing efficiency varied with the experimental conditions (cell density, transfection efficiency, and ELISA variations), with an average of 13-fold (FIG. 6E). Western blot analysis of the supernatant validated Qa enhancing of the antibody production (FIG. 6F). These data provide evidence that Qa tagging induces a robust enhancing of antibody production (secretion).
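
The fold enhancement reported above, with or without normalization for transfection efficiency, can be sketched as a simple ratio calculation. The function names, the choice of normalizer and the numbers below are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
def fold_enhancement(tagged_signal, control_signal,
                     tagged_norm=1.0, control_norm=1.0):
    """Ratio of tagged to untagged signal.

    Each signal is first divided by its own normalizer (for example a
    co-transfected reporter reading); with the default normalizers of
    1.0 the result is the unnormalized fold change.
    """
    return (tagged_signal / tagged_norm) / (control_signal / control_norm)

def mean_fold(folds):
    """Arithmetic mean of fold changes from independent experiments."""
    return sum(folds) / len(folds)

# Hypothetical ELISA readings (arbitrary units) from three experiments.
example_folds = [
    fold_enhancement(2.6, 0.2),            # unnormalized
    fold_enhancement(3.0, 0.5, 1.0, 2.0),  # control transfected twice as well
    fold_enhancement(1.4, 0.1),
]
example_mean = mean_fold(example_folds)
```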

Enhancing of lentivirus production. Viral gene therapy has been extensively studied and actively applied to clinical diseases. AAV and LV are the most promising strategies for viral gene therapy. However, viral packaging efficiency (production yield) has been a bottleneck for both AAV and LV gene therapy. In the field of CRISPR/Cas genome editing, viral packaging efficiency is also a rate-limiting factor in developing genome editing and therapeutics. Generally, the level of mRNA from the LV transfer vector could affect the LV packaging efficiency. It was hypothesized that Qa tagging in the LV transfer vector would enhance the efficiency of LV packaging and gene delivery if Qa tagging increases the mRNA level of the transgene during the packaging. To test this, the LV transfer vectors pRRL-E-LG and pRRL-E-QLG were compared for standard LV packaging (psPAX2 and VSV-G). After LV infection of HEK293T cells, Qa tagging increased the production of the transgene reporter gdLuc from the transfer vector (FIG. 6G), similar to the enhancing efficiency in the transfected cells without LV packaging (FIG. 2H). However, Qa tagging on the transfer vector had only a marginal effect on the packaging efficiency, i.e., the titer of packaged LV (data not shown). Similar changes were seen with LV-spCas9-Q-RFP and LV-MS2-spCas9-Q-GFP (FIGS. 6H, 6I), where the packaging efficiency is usually >100-fold less than the standard LV-RFP or LV-GFP. These data provide evidence that the marginal change in the mRNA of the transgene in the LV transfer vector by the Qa tagging did not increase the packaging efficiency, although the production of transgene protein in the transduced cells is enhanced by the Qa tagging (FIG. 6G). This is consistent with the finding that Qa tagging influences translation instead of transcription (FIGS. 5A-5I). It was then tested if Qa tagging on the LV packaging proteins, such as Gag, Pol, and RRE, in the packaging vector psPAX2 could enhance the packaging efficiency. 
Interestingly, Qa tagging on Gag significantly impaired LV packaging, but Qa tagging on Pol and RRE enhanced the LV packaging for pRRL-GFP, pRRL-QLG and LV-MS2-Cas9-GFP; however, the enhancement was only 1-3-fold (FIGS. 6J-6L). Further optimization of the LV packaging enhancement by Qa tagging is warranted.

Qa tagging enhances secretion of targeted proteins. As demonstrated above, Qa tagging increased the expression of various types of targeted proteins. When Western blot analysis was performed using the cell lysates to confirm the enhancing effect of Qa tagging on E dual reporter protein expression, it was unexpectedly found that the E-Flag-gdLuc protein level in the cell lysates was remarkably reduced in the Qa tagging group (FIG. 7A), even though the gdLuc activity in the supernatant was robustly increased by Qa tagging (FIGS. 1A-1F). In the presence of 5’-UTR, the reduction in CAG-driven E-Flag-gdLuc expression level was more robust in the cell lysate (FIG. 7B). This reduction also occurred with other viral proteins, N and S, as well as host cellular genes such as IFNy, IL-2, and hACE2 (FIGS. 7C, 7D). Inclusion of UTR increased the protein expression in the cell lysates (FIGS. 7B, 7D and data not shown).

These unexpected observations prompted the hypothesis that the robust increase in the supernatant gdLuc activity by Qa tagging must involve the protein secretion process. This was supported by the enhanced antibody secretion (FIGS. 6A-6E) as well as the secretory IFNy and IL-2 (FIGS. 2D, 2E). To corroborate the enhanced secretion, the protein level of the secretory E-Flag-gdLuc in the supernatant was analyzed using serum-free culture media. As shown in FIG. 7E, the cleaved E-Flag-gdLuc and GFP as well as the uncleaved E-Flag-gdLuc-GFP were detected by Western blot analysis with anti-gdLuc and anti-GFP antibodies in the unconcentrated supernatant (40%) of the Qa tagging E-QLG group. Densitometric quantification detected a 17-fold increase in the secretory protein, which is consistent with the enhancing detected by the gdLuc assay (FIG. 7F). The secretion can be completely blocked by treatment with the ER-Golgi protein trafficking inhibitor Brefeldin A (FIG. 7G). To further confirm the secretion enhancing of Qa tagging, the non-secretory firefly-luciferase (fLuc) assay was utilized. There was no fLuc activity in the supernatant even in the presence of Qa tagging, but the levels of protein expression and enzyme activity were still significantly increased (FIG. 7H). Altogether, Qa tagging apparently enhanced the expression and secretion of the targeted proteins. We also noted the incomplete auto-cleavage by the 2A system in most targeted proteins, with variable cleaving efficiency for different proteins (FIG. 7C).

Discussion

In both published literature and patents, different types of bioactive peptides have been developed that regulate or enhance the production of targeted proteins (Daliri et al., 2017; Katayama et al., 2021; Peighambardoust et al., 2021). This study presents a novel short peptide (epitope) tag (only 7 amino acids) that enhances targeted protein expression and secretion. Various types of peptide (epitope) tags have been identified previously for protein labeling, tracing, purification, and immunostaining (DeCaprio and Kohl, 2019; Katayama et al., 2021; Lee et al., 2020; Mishra, 2020; Peighambardoust et al., 2021; Pina et al., 2021; Traenkle et al., 2020). However, no peptide tag has been identified for the regulation or enhancing of targeted protein production including expression and secretion. This Qa tag would serve as a universal enhancer for protein production. So far, all the tested target proteins as shown in this study have an enhanced protein production, with some proteins having up to thousand-fold increases. Extensive testing on many other target proteins is warranted in the context of research interests and potential applications. To the inventor’s knowledge, this is the first evidence for protein regulation/enhancing by a short peptide (epitope) tag that traditionally serves for protein labeling/detection and affinity purification. This finding offers a paradigm shift in the context of epitope tagging and protein functional regulation.

Protein and peptide tags have been extensively employed for protein labeling/detection and affinity purification (DeCaprio and Kohl, 2019; Katayama et al., 2021; Lee et al., 2020; Mishra, 2020; Peighambardoust et al., 2021; Pina et al., 2021; Traenkle et al., 2020). The fusion of peptide tags with targeted proteins allows detection by immunostaining and immunoblotting with corresponding highly specific antibodies both in vitro and in vivo. Novel “spaghetti monster” fluorescent protein (smFP) technology with tandem tags dramatically enhances the sensitivity of the tagged protein detection (Viswanathan et al., 2015). Most tags can also be used for protein purification by immunoprecipitation and/or affinity chromatography. Some tags may enhance the yield of protein purification by extending protein half-life or rendering protein soluble (Bhagawati et al., 2019; Han et al., 2020; Li, 2011; Saribas et al., 2018). In some cases, tagging may influence the activity or function of the targeted proteins (Majorek et al., 2014). For example, N-terminal tagging on PI3KCA increases kinase activity while C-terminal tagging affects membrane binding activity (Vasan et al., 2019). The N-terminal secretory signal peptide of the gdLuc not only determines its inherent secretory property but also regulates the protein folding and functional activity (Gaur et al., 2017). For a specific protein, the C-terminal or N-terminal amino acid composition could regulate the protein expression (Cambray et al., 2018; Weber et al., 2020). Modification of the C-terminal endoplasmic reticulum targeting peptide on the gdLuc significantly improves its intracellular retention (Gaur et al., 2017). Some peptides such as PEST (Shumway et al., 1999) or KFERQ (Dong et al., 2020; Park et al., 2016) fused or endogenously contained in the target proteins mark the proteins for proteolysis or degradation. 
However, there is no evidence that these epitope tags could directly boost the protein expression and secretion, particularly in mammalian cells.

In this study, a novel epitope tag Qa was discovered that enhanced the expression and secretion of the tagged SARS-CoV-2 viral proteins and many non-viral proteins. This conclusion is supported by the findings that 1) Qa tag robustly enhances the production of secretory gdluc fusion protein for several viral proteins and host cellular gene products determined by dual reporter assay and fluorescent microscopy (FIGS. 1 A-1F, 2A-2G); 2) Qa tag enhancing is promoter-independent (FIGS. 3A-3I, 5A-5I); 3) Qa significantly enhances mRNA-dependent production of targeted viral and non-viral protein fusion reporter determined by in vitro RNA transcription, mRNA transfection, and dual reporter assay (FIGS. 5A-5I); 4) Qa tagging retained its time-dependent enhancing in the presence of transcriptional inhibition, indicating posttranscriptional mechanism (increased translation efficiency and/or mRNA stability); 5) Qa tagging on S protein enhanced the production yield of S pseudoviruses (FIGS. 4A-4H); 6) Qa tagging on antibody heavy and light chain robustly increased the antibody production; 7) Qa tagging on the LV packaging vector enhanced the packaging efficiency; and 8) Qa tagging enhanced the protein secretion process as shown by Western blot and gdLuc analysis on gdLuc fusion proteins with or without brefeldin A treatment, antibody production, as well as the secretion form of IFNy and IL-2.

Although the mechanisms underlying the enhancement of protein expression/secretion remain to be delineated, the studies herein, using the classical measure of mRNA stability via global transcription inhibition, identified posttranscriptional mechanisms such as increased mRNA stability and translation efficiency. Novel measurements of mRNA decay (Chan et al., 2018) are needed to expand these preliminary observations. The regulation of mRNA involves the dynamic balancing between the synthesis and degradation processes. The synthesis process is well understood; however, less is known about the mRNA decay (Chan et al., 2018). Qa tagging may serve as a novel mechanism to regulate mRNA stability. How the Qa tag regulates mRNA stability and translation initiation/elongation would be an interesting and important direction to be explored. For example, it is important to know whether the RNA sequence encoding the Qa peptide has secondary structure that may directly regulate the mRNA stability of the targeted protein (Boo and Kim, 2020). It would also be of interest to determine whether synonymous substitution of the Qa peptide influences the expression/production of tagged proteins. Whether the amino acid sequence of the Qa tag directly binds to poly-A or 3’-UTR, and which residues contribute to mRNA stabilization and translation enhancing, needs to be determined. For the protein secretion, this Qa tag has no function similar to a secretion peptide, because Qa tagging on a non-secretory protein, i.e., firefly luciferase, does not change the background luciferase activity in the culture media, which likely exists due to partial cell death. However, Qa tagging on secretory proteins such as S protein, antibody, IFNy and IL-2 robustly enhanced their production yields. This is very important for the industrial application of these secretory proteins. 
In particular, S protein for mRNA vaccine would be released more from the vaccinated cells in the presence of Qa tag, which not only reduces the mRNA amount for each vaccination but also promotes the immune response due to the higher level of secretory S protein. Given that brefeldin A is well known to inhibit ER-Golgi trafficking and completely blocks Qa-stimulated protein secretion, we speculate that the Qa tag could regulate protein retrograde or anterograde trafficking. Other secretion inhibitors could be used to identify additional pathways for protein secretion. Whether the Qa tag influences unconventionally secreted proteins remains to be determined (Cohen et al., 2020).
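
The synonymous-substitution question raised in the preceding paragraph could be approached by first enumerating every coding sequence that translates to the same peptide. The sketch below does this with the standard genetic code. The three-residue peptide `MKQ` is a hypothetical placeholder and is not the Qa sequence; the function names are likewise illustrative.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: aa
    for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse lookup: amino acid -> list of codons encoding it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)

def translate(dna):
    """Translate an in-frame DNA string (length divisible by 3)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def synonymous_variants(peptide):
    """Return every DNA sequence that encodes the given peptide."""
    choices = [CODONS_FOR[aa] for aa in peptide]
    return ["".join(codons) for codons in product(*choices)]

EXAMPLE_PEPTIDE = "MKQ"  # hypothetical placeholder, not the Qa tag
variants = synonymous_variants(EXAMPLE_PEPTIDE)
```

Each variant could then be cloned in place of the tag-encoding sequence to separate effects of the RNA sequence from effects of the peptide itself. The number of variants grows quickly with peptide length, so a real design would likely sample rather than enumerate.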

The UTRs at both ends of a viral genome or host cellular mRNA are important in regulating the transcription and translation efficiency (Berkhout et al., 2011; Hinnebusch et al., 2016; Raman and Brian, 2005; Senanayake and Brian, 1999; Williams et al., 1999). In particular, the 5’-UTR of coronaviruses regulates translational rate via ribosomal scanning (Berkhout et al., 2011; Hinnebusch et al., 2016; Shirokikh et al., 2019; Zhang et al., 2015). A synthetic (non-viral) 5’-UTR has been used to enhance the translation of SARS-CoV-2 S mRNA in both the Pfizer and Moderna vaccines. The native UTRs of SARS-CoV-2 are highly conserved and play a key role in viral RNA replication and transcription of the genomic and subgenomic viral transcripts (Baldassarre et al., 2020; Yang and Leibowitz, 2015). Thus, the native 5’-UTR is assumed to enhance accumulation of viral protein. In this study, solid evidence is provided that the native (natural) 5’- and 3’-UTRs of SARS-CoV-2 enhanced the production of viral E-LG fusion protein. Importantly, the native 5’-UTR served as a universal regulator in enhancing not only viral proteins but also many non-viral cellular proteins. It was hypothesized that this potent UTR could be used in enhancing any proteins, particularly for virus packaging systems. For example, it was observed that UTR-Sdl8Q increased the packaging efficiency of S-pseudotyped LVLP or VSVLP. UTR in the viral transfer vector enhanced the lentivirus production. The native UTR would also enhance the AAV packaging and transduction efficiency.

This study identified the combination of Qa tagging and SARS-CoV-2 native UTR as a novel strategy to enhance the production of any targeted gene/protein of interest. For industry applications, this strategy will reduce the cost of many widely used products and facilitate their availability. Since it enhanced the production of all tested viral proteins of SARS-CoV-2, an immediate usage of this method would be the enhancing of vaccine production for the urgent need to fight COVID-19. The studies herein demonstrated at least a 200-fold enhancement for the S mRNA vaccine. This is extremely important to expedite the mRNA vaccine availability when producing new mRNA vaccines against SARS-CoV-2 variants or any other emerging viruses. This strategy can be easily incorporated into the DNA vaccine vector. Thus, enhancing vaccine production yield on a large scale will robustly reduce the cost and expedite the availability of COVID-19 vaccines. Another immediate industry value of the methods herein is to enhance antibody production. Taking the human anti-SARS-CoV monoclonal antibody as an example, it was found that Qa tagging at the C-terminus of the immunoglobulin heavy and light chain variable regions robustly enhanced the antibody secretion by up to 37-fold (average 13-fold). Given that Qa tagging in the middle of a targeted protein shows much stronger enhancing efficiency, optimization of Qa tagging in different regions of the targeted antibody heavy and light chains is expected to achieve higher levels of antibody production enhancement.

Enhancing the production yield of viruses or pseudotyped viruses is also invaluable in the fields of gene therapy and biomedical research. Pseudotyped viruses have facilitated the research on high-risk viruses that require BSL3 facilities. Pseudoviruses of SARS-CoV-2 S protein or its variants have been extensively utilized for evaluation of neutralizing antibodies and vaccination as well as mechanistic and functional studies (Donofrio et al., 2021; Korber et al., 2020; Muik et al., 2021; Ou et al., 2020; Wibmer et al., 2021b). The bottleneck for generation of S pseudovirions is the limited packaging efficiency for LVLP or VSV-like particles (Korber et al., 2020; Muik et al., 2021; Ou et al., 2020; Walls et al., 2020). The method used herein to combine Qa tagging and native 5’-UTR on the Sdl8 expression system robustly enhanced the packaging and transduction efficiency of SARS-CoV-2 S-LVLP. This strategy has facilitated current research on the antiviral effect of EGCG and the protective efficiency of vaccinated serum from patients against the emerging SARS-CoV-2 variants (Liu et al., 2021a; Liu et al., 2021b). One of the challenges for viral gene therapy is the limited viral packaging efficiency (production yield). Using an LV system as a test platform, it was found that Qa tagging in the LV transfer vector had only a marginal effect on the packaging efficiency, although the production of transgene protein in the transduced cells or the transfected packaging cells was enhanced. This is expected because Qa tagging influences translation instead of transcription of targeted genes, while LV packaging needs the presence of intermediate RNA from the transfer vector. Qa tagging at the C-terminus of the Pol and RRE in the packaging vector psPAX2 increased the LV packaging efficiency, but Qa tagging at the Gag C-terminus impaired LV packaging. Thus, optimization of the Qa tagging location in the LV packaging proteins is essential for maximization of Qa enhancing efficiency. 
Given that Sdl8 Qa tagging enhanced Sdl8 expression and the packaging efficiency of S-LVLP, Qa tagging on VSV-G protein could enhance regular LV packaging efficiency. Qa insertion at different locations of VSV-G (Lorenz et al., 2014; Schlehuber and Rose, 2004) may maximize the enhancing efficiency. Like LV packaging, Qa tagging on the AAV packaging system warrants further optimization.

In recombinant protein production systems, the Qa tagging would improve the yield of expressed proteins such as insulin, interferon, interleukin, cytokines, and growth factors. Even a few-fold enhancement would reduce production expenses and expedite clinical applications. For in vivo gene enhancing, Qa tagging via a novel CRISPR/Cas gene knockin strategy could be used to facilitate the expression of loss-of-function genes, particularly in haplo-insufficient mutagenic diseases such as Angelman syndrome, Pitt-Hopkins syndrome, and others. For genetic engineering, Qa tagging enhancement of dominant genes may improve the phenotype of organisms, particularly in agricultural applications. Finally, this novel Qa tag can be used as a general tag in a similar way to other peptide tags such as Flag, Myc, HA, Ollas, C7, and T7 for protein tracing, protein purification, immunostaining, and Western blotting. Importantly, the Qa tagging can enhance the labeling intensity of the endogenous proteins due to its enhancing property. This is very important for neural network tracing.

When performing Western blot analysis, the incomplete cleavage of several targeted proteins via the auto-cleaving 2A system was observed. This is in accord with previous reports on 2A cleaving insufficiency, which varies with different type of 2A peptide and different targeted proteins (Chng et al., 2015; Kim et al., 2011). Addition of a peptide (APVKQLL) to F2A increases the cleavage efficiency (Groot Bramel-Verheije et al., 2000). Whether Qa tagging affects the function of 2A system remains to be determined. The 2A system functions mainly inside the cells but may have relatively low activity outside the cells, because the ratio of cleaved over non-cleaved bands in the cell lysates is apparently higher than that in the cell culture media (FIGS. 7A-7H).

Based on many previous studies of epitope tags such as Flag, HA, Myc, Ollas, C7 and T7 both in vitro and in vivo, it is anticipated that the smaller Qa tag (7 amino acids) should not have any toxicity.

In summary, this study reports a novel peptide tag consisting of a specified short amino acid (7 amino acids) sequence that can be utilized for enhancing production of the tagged proteins, including viral transcripts/proteins, endogenous gene products, vaccine, antibody, and engineered recombinant proteins in a cell in vitro, ex vivo, and in vivo. This novel and universal peptide tag would facilitate protein expression and secretion. It would be invaluable to perform library screening for this master Qa tag to discover optimal peptides that maximize the protein expression/production/secretion. This study also reports the exceptionally potent efficiency of the SARS-CoV-2 native 5’-UTR in boosting the protein expression/production. Combining Qa tagging with the native 5’-UTR offers a synergistic boosting of the production of viral and non-viral proteins. All these strategies are invaluable in biopharmaceutical development, immunological/vaccine industry, and biological therapeutics.

Example 2: Experimental model and subject details

Cell lines:

HEK293T, Hela and BHK cell lines were cultured using standard protocols.

Method Details

Vector cloning

All the PCR reactions for cloning in this study were performed using Phusion High-Fidelity PCR Master Mix kit (Thermo Fisher, F531) and purified using the Monarch PCR & DNA Cleanup Kit (NEB, T1030S). The correct clones were verified by restriction enzyme digestion and Sanger sequencing as well as functional measures.

Dual reporter vectors: The dual reporter LG fragment, encoding Gaussia-Dura luciferase (gdLuc) and destabilized GFP (dsGFP), was generated by overlay PCR: 1) Standard PCR was performed to generate fragment 1 (gdLuc) from template plasmid pMCS-Gaussia-Dura-Luciferase (Thermo Fisher Scientific, Cat#16190) with primer pair T1290/T1291, and fragment 2 (dsGFP) from plasmid pLenti-EFS-EGFPd2PEST-2A-MCS-Hygro (TP1380), a gift from Neville Sanjana (Addgene Cat# 138152), with primer pair T1292/T1293; 2) The two purified fragments (100 ng each), which overlap by 19 nucleotides, were mixed for 8 cycles of PCR; 3) The PCR product at 1:100 dilution was used as template for 28 cycles of standard PCR with primer pair T1292/T1293 to generate the LG fragment. After purification with NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel, Cat# REF740609), this LG fragment (1485 bp) was cloned into the pcDNA6B-nCoV-x-Flag vector encoding various viral proteins of SARS-CoV-2 or the cellular gene hACE2 as listed in the Key Resource Table (Zhang et al., 2020) via the SacII cloning site using NEBuilder® HiFi DNA Assembly cloning kit (NEB, E5520S) to generate pcDNA6B-SARS-CoV-2-x-Flag-LG vectors as listed in the Table. The “x” indicates the gene of interest.
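
Before ordering primers for an overlay PCR of the kind described above, the shared junction between the two fragments can be checked in silico. The following minimal sketch finds the longest suffix of the upstream fragment that matches the prefix of the downstream fragment and joins the two once. The sequences are short hypothetical stand-ins, not the gdLuc or dsGFP fragments, and the 19-nucleotide minimum simply mirrors the overlap length stated in the text.

```python
def overlap_length(upstream, downstream):
    """Length of the longest suffix of `upstream` equal to a prefix of `downstream`."""
    max_len = min(len(upstream), len(downstream))
    for k in range(max_len, 0, -1):
        if upstream.endswith(downstream[:k]):
            return k
    return 0

def join_fragments(upstream, downstream, min_overlap=19):
    """Join two fragments through their shared end, keeping the overlap once."""
    k = overlap_length(upstream, downstream)
    if k < min_overlap:
        raise ValueError(f"overlap of {k} nt is shorter than {min_overlap} nt")
    return upstream + downstream[k:]

# Hypothetical fragments sharing a 19-nt junction.
JUNCTION = "ACGTTGCAAGCTTGGATCC"
fragment_1 = "ATGAAA" + JUNCTION
fragment_2 = JUNCTION + "GGGTAA"
assembled = join_fragments(fragment_1, fragment_2)
```

The expected product length is the sum of the two fragment lengths minus the overlap, which gives a quick check against the band size seen on a gel.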

Unexpectedly, functional assay and Sanger sequencing identified a novel clone assigned as pcDNA6B-SARS-CoV-2-E-Flag-QLG (TP1479), which has a Qa peptide in the open reading frame before LG, assigned as QLG. The insert fragment encoding SARS-CoV-2 S protein from the pcDNA6B-nCoV-S-Flag vector (TP1456) was cloned into TP1479 via XhoI/XbaI sites to generate pcDNA6B-SARS-CoV-2-S-Flag-QLG (TP1487). The insert fragment encoding SARS-CoV2 N protein from the pcDNA6B-nCoV-N-Flag vector (TP1431) or hACE2 from the pcDNA6B-hACE2-Flag vector (TP1470) was cloned into TP1479 via KpnI/XbaI sites to generate pcDNA6B-SARS-CoV2-N-Flag-QLG (TP1490) or pcDNA6B-hACE2-Flag-QLG (TP1491).

The pcDNA6B-NIBP-Flag-LG (TP1560) vector was generated by NEB-HiFi cloning of the NIBP PCR product from pYX-Asc-mNIBP (Genbank # BC070463) into pcDNA6B-hACE2-Flag-LG (TP1540) via NotI/XbaI, while the pcDNA6B-NIBP-Flag-QLG (TP1558) was generated by NEB-HiFi cloning of the NIBP PCR product into pcDNA6B-SARS-CoV-2-E-Flag-QLG (TP1479) via XhoI/XbaI. The pCAG vectors encoding E, S and NSP16 were generated by replacing the CMV promoter in the corresponding pcDNA6B-SARS-CoV2-x-Flag-LG or -QLG vectors with the CAG promoter via SnaBI/KpnI sites.

UTR containing vectors: The DNA fragment containing 5’-UTR-E-Flag-Qa-3’-UTR, designed according to the public SARS-CoV-2 sequence, was synthesized by Synbio Technologies and cloned into the pCAG-Flag vector via EcoRV/AgeI sites using NEBuilder® HiFi DNA Assembly cloning kit (NEB, E5520). This vector, pCAG-UTR-E-Flag-Qa-UTR (TP1583), was digested with SnaBI/EcoRV (both blunt end) to remove the CAG promoter, and re-ligation generated the pUTR-E-Flag-Qa-UTR vector (TP1585). The 3’-UTR in pCAG-UTR-E-Flag-Qa-UTR was removed by NotI digestion and ligation to generate pCAG-UTR-E-Flag-Qa (TP1584) with an additional 37 amino acids in the open reading frame. The pCAG-UTR-S-Flag-QLG vector (TP1586) was generated by replacing the E-Flag-Qa-UTR fragment with the S-Flag-QLG fragment from the pCAG-S-Flag-QLG vector (TP1518) via XhoI/AgeI sites. The pCAG-UTR-Sdl8-Q (TP1595) was generated by NEB HiFi cloning via EcoRI sites of the pCAG-Sdl8-Q vector (TP1506) with the 5’-UTR PCR product from the pUTR-E-Flag-UTR vector (TP1585). UTR-containing pcDNA6B vectors were generated by restriction cloning via KpnI/XhoI to transfer the 5’-UTR from pCAG-UTR-E-Flag-Qa-UTR into corresponding pcDNA6B vectors such as pcDNA6B-S-QLG (TP1487), pcDNA6B-E-Flag-QLG (TP1479), and pcDNA6B-ORF3-Flag-QLG (TP1483).

Antibody vectors: The plasmid set CR3022 for pFUSEss-CHIg-hG1-SARS-CoV2-mAb (NR-52399, TP1565) and pFUSE2ss-CLIg-hk-SARS-CoV2-mAb (NR-52400, TP1566), expressing the heavy (H) and light (L) chains of human anti-SARS-CoV mAb respectively (GenBank: DQ168569 and DQ168570), was produced under HHSN272201400008C and obtained from BEI Resources, NIAID, NIH (Cat# NR-53260). The Qa-tagged HQ (TP1574) and LQ (TP1571) vectors were generated from the H or L plasmids at the NheI site using NEBuilder® HiFi DNA Assembly cloning kit with the synthesized oligonucleotides that contain the Qa-encoding sequence and the C-terminus of the immunoglobulin heavy and light chain (see Table SI for sequences).

Lentiviral vector: The pRRLSIN.cPPT.PGK-GFP.WPRE (TP792), a gift from Didier Trono (Addgene #12252), was used to generate pRRL-E-Flag-LG-GFP (TP1577) by transferring the E-Flag-LG insert from TP1478 to TP792 via BamHI/AgeI. The pRRL-E-Flag-LG (TP1578) was generated from TP1577 by AgeI/KpnI blunt ligation. The pRRL-E-Flag-QLG was generated by transferring E-Flag-QLG from TP1479 to TP1578 via BamHI/BstBI. TP1578 and TP1579 were used as the backbone vectors for NEB-HiFi cloning of human IFNy and IL2 PCR products via the XbaI site to generate pRRL-IFNy-LG (TP1604) or QLG (TP1605) and pRRL-IL2-LG (TP1606) or QLG (TP1607). The PCR fragments of IFNy and IL2 were derived, respectively, from pUC8-IFNy (a gift from Howard Young, Addgene #17600) and pAIP-hIL2-co (a gift from Jeremy Luban, Addgene #90513) using primer pairs T1407/T1408 and T1409/T1410 as listed in Table SI. The pRRL-UTR-Flag-LG (TP1621) and pRRL-UTR-Flag-QLG (TP1622) were generated respectively by NEB-HiFi cloning of 5’-UTR PCR products from TP1583 into TP1578 and TP1579 via XbaI. The pRRL-Flag-LG (TP1685) and pRRL-Flag-QLG (TP1686) were generated respectively from TP1621 and TP1622 via BsmBI/XbaI digestion to remove the UTR and NEB HiFi cloning with an oligonucleotide insert (T1469) to correct the ATG site in the ORF.

The LV packaging vector psPAX2-Gag-Q (TP1618) was generated from psPAX2 (TP592, a gift from Didier Trono, Addgene #12260) via SphI/EcoRV sites by NEB HiFi cloning of two overlay PCR fragments with primer pairs T1396/T1397 and T1398/T1399 using TP592 as PCR template. The psPAX2-Pol-Q-RRE-Q (TP1619) was generated from psPAX2 via SwaI/NheI sites by NEB HiFi cloning of two overlay PCR fragments with primer pairs T1400/T1401 and T1402/T1403 using psPAX2 as PCR template. The psPAX2-Gag-Q-Pol-Q-RRE-Q (TP1620) was generated from TP1619 via SphI/EcoRV sites by NEB HiFi cloning of two overlay PCR fragments with primer pairs T1396/T1397 and T1398/T1399 using TP592 as PCR template.

S-pseudoviral vectors: The vector pCAG-SARS-CoV2-Sdl8Q (TP1506) encoding the human codon-optimized S gene of SARS-CoV2 with C-terminal 18 amino acids deletion (Sdl8) and Qa tag fusion was constructed using NEBuilder® HiFi DNA Assembly cloning kit (NEB, E5520S). Briefly, the Sdl8 expression cassette in the CMV-driven vector pcDNA3.1-SARS2-S, a gift from Fang Li (Addgene Cat # 145032), was transferred to a CAG-driven vector pCAG-Flag-SARS-CoV2-S (gift from Peihui Wang) via EcoRV/NotI sites and PCR with primer pair T1323/T1324. The vector pCAG-SARS-CoV2-Sdl8 (TP1567) encoding Sdl8 without Qa tag was constructed via NEB HiFi cloning with the synthesized oligonucleotide insert T1367 at the SacII/NotI sites of the pCAG-SARS-CoV2-Sdl8Q vector. The pCAG-UTR-Sdl8Q (TP1595) vector was generated as described above.

Plasmid DNA purification and DNA quantification

Plasmid DNAs were purified using commercial kits for endotoxin-free minipreps (Cat# REF 740490) or midipreps (Cat# REF 740420) from Macherey-Nagel (Germany). The E. coli cultures (5 ml for miniprep, 200 ml for midiprep) harboring the relevant plasmids were grown overnight in LB or 2YT media supplemented with 100 µg/ml Carbenicillin, at 30°C for NEB-stable or 37°C for DH5alpha E. coli cells. The bacterial cultures were harvested by centrifugation, and the pellets were processed to purify plasmid DNA according to the manufacturer’s guidelines. The final DNA was dissolved in ultra-pure distilled water, and DNA concentrations were determined using either a Nanodrop 1 UV-Vis Spectrophotometer (Thermo-Fisher) or a Take3 plate in a Bio-Tek multiplate reader.

Cell culture and Transfections

HEK293T human embryonic kidney and HeLa human cervix epithelial cells were obtained from ATCC (http://www.atcc.org). Both cell lines were cultured in Dulbecco’s Modified Eagle’s Medium (DMEM, Gibco) supplemented with Fetal Bovine Serum (FBS) and 1% Penicillin/Streptomycin (Corning). BHK-21/WI-2 cells (EH1011, Kerafast, Boston, MA, USA) were grown in DMEM supplemented with 5% FBS and 1% Penicillin/Streptomycin. All cells were incubated at 37°C under a 5% CO2 atmosphere.

For most experiments, 96-well plates were used. For mRNA stability assays, 24-well plates were used. Cells resuspended in DMEM plus 10% FBS were seeded (3-4 x 10⁴ cells/well for 96-well plates or 1-2 x 10⁵ cells/well for 24-well plates) the night before transfection. For transfections, Transporter 5 transfection reagent (TP5) (Polysciences Cat# 26008) was used at a 1:4 ratio of DNA to reagent. Typically, 50-100 ng plasmid DNA per well of a 96-well plate was mixed with 0.2-0.4 µl TP5 in 0.9% NaCl solution and incubated at room temperature for 20 min. The transfection reagent and DNA solution were mixed again and added to each well dropwise. The transfections were incubated at 37°C in 5% CO2 overnight (16-18 h), after which the media was replaced with DMEM plus 10% FBS.
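The per-well amounts above can be scaled to a master mix with simple arithmetic. The following is a minimal Python sketch, assuming the 1:4 ratio means 4 µl of TP5 per 1 µg of DNA (which matches the stated 50-100 ng with 0.2-0.4 µl); the function name and the `overage` parameter are illustrative and do not come from the protocol.

```python
def tp5_master_mix(dna_ng_per_well, wells, ul_reagent_per_ug_dna=4.0, overage=0.0):
    """Return (total DNA in ng, total TP5 in µl) for a set of wells.

    ul_reagent_per_ug_dna: assumed reading of the 1:4 DNA/reagent ratio.
    overage: optional pipetting excess as a fraction (0.1 = 10% extra);
             this is a hypothetical convenience, not part of the source.
    """
    if dna_ng_per_well <= 0 or wells <= 0:
        raise ValueError("DNA amount and well count must be positive")
    scale = wells * (1.0 + overage)
    total_dna_ng = dna_ng_per_well * scale
    # Convert ng to µg before applying the µl-per-µg ratio.
    total_reagent_ul = (total_dna_ng / 1000.0) * ul_reagent_per_ug_dna
    return total_dna_ng, total_reagent_ul


if __name__ == "__main__":
    dna, tp5 = tp5_master_mix(dna_ng_per_well=100, wells=12, overage=0.1)
    print(f"DNA: {dna:.0f} ng, TP5: {tp5:.2f} µl")
```

With one well at 50 ng the function returns 0.2 µl of reagent, the lower bound quoted in the text.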

Multilabeled fluorescent immunocytochemistry and confocal image analysis

Cells were fixed for 30 min with 4% paraformaldehyde (PFA), washed with 1x PBS, permeabilized with 0.5% TritonX-100/1x phosphate buffered saline (PBS) for 30 minutes, blocked with 10% donkey serum for 1 hour, and incubated with mouse anti-Flag monoclonal or anti-2A primary antibodies (Table 1) in 0.1% TritonX-100/1x PBS overnight at 4°C. The next day, cells were washed with 1x PBS and incubated with the corresponding Alexa Fluor secondary antibodies (Jackson Immuno Research Labs; donkey anti-rabbit, anti-mouse, IgG (H+L) 488, 594, or 680) at a 1:400 dilution for 1 hour at room temperature, using Hoechst 33258 (1:5000) as a nuclear counterstain. Fluorescent confocal images were acquired and analyzed using the Leica SP8 confocal system.

Table 1

Flow Cytometry

Cells expressing the dsGFP reporter were dissociated with Accutase (Corning), passed through a 70 µm nylon cell strainer (Corning) to remove large clumps, and washed with 1x PBS. Dissociated cells were fixed with 4% PFA in PBS, and GFP-positive cells were analyzed using a Cytek Aurora flow cytometer.

RNA Extraction and Reverse Transcription Quantitative PCR (RT-qPCR) for mRNA stability assay

HEK293T cells were transfected with the indicated vectors (500 ng/well for 24-well plates) for 24 h before treatment with the transcriptional inhibitor actinomycin D (10 µM) for various periods. Total RNA was extracted using the Monarch Total RNA Miniprep Kit (NEB, Cat# T2010), which includes two steps of DNA removal. Equal amounts of RNA (0.5 µg) were used to synthesize cDNA using the High Capacity cDNA Reverse Transcription Kit (Thermo Fisher Scientific, Cat# 4368814) with random hexanucleotide primers. Real-time PCR analysis was carried out on a QuantStudio™ 3 System. The mRNA expression levels of the reporter gdLuc luciferase and human β-actin were determined using the iTaq Universal SYBR Green Supermix kit (BioRad, Cat# 1725121). The sequences of the gdLuc primers are (forward) 5’-GATTACAAGGATGACGACGATAAG-3’ (SEQ ID NO: 2) (T1364, targeting Flag) and (reverse) 5’-AAGTCTTCGTTGTTCTCGGTGGG-3’ (SEQ ID NO: 3) (T432, targeting gdLuc). The human β-actin primers are (forward) 5’-AAGAGCTATGAGCTGCCTGA-3’ (SEQ ID NO: 4) and (reverse) 5’-TACGGATGTCAACGTCACAC-3’ (SEQ ID NO: 5). Each sample was tested in triplicate. Cycle threshold (Ct) values were obtained graphically for the reporter and β-actin. The differences in Ct values between the reporter and β-actin were represented as ΔCt values. The ΔΔCt values were obtained by subtracting the ΔCt values of the control samples from those of the samples at different time points. The relative change in gene expression was calculated as 2^(-ΔΔCt). The mRNA decay rate was calculated by non-linear regression curve fitting (one phase decay) using GraphPad Prism 9.1. Three independent experiments were performed.
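The ΔCt, ΔΔCt and 2^(-ΔΔCt) steps described above can be illustrated with a short Python sketch. The decay estimate shown here is a simplification: it fits a straight line to ln(fraction remaining) versus time, which assumes decay to zero with no plateau, whereas the analysis in this study used the one phase decay model in GraphPad Prism. Function names and the example Ct values are hypothetical.

```python
import math


def relative_expression(ct_reporter, ct_actin, ct_reporter_ctrl, ct_actin_ctrl):
    """Relative mRNA level by the 2^(-ΔΔCt) method.

    ΔCt  = Ct(reporter) - Ct(β-actin) for a sample
    ΔΔCt = ΔCt(sample) - ΔCt(control)
    """
    delta_ct = ct_reporter - ct_actin
    delta_ct_ctrl = ct_reporter_ctrl - ct_actin_ctrl
    return 2.0 ** -(delta_ct - delta_ct_ctrl)


def half_life_hours(times_h, fraction_remaining):
    """Estimate mRNA half-life from a log-linear least-squares fit.

    Assumes simple exponential decay N(t) = N0 * exp(-k t) with no plateau;
    this is a simplified stand-in for a non-linear one phase decay fit.
    """
    if len(times_h) != len(fraction_remaining) or len(times_h) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(f) for f in fraction_remaining]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    k = -sxy / sxx  # decay constant per hour
    return math.log(2) / k


if __name__ == "__main__":
    # Hypothetical Ct values: sample at a later time point vs. the 0 h control.
    print(relative_expression(22.0, 18.0, 20.0, 18.0))
    print(half_life_hours([0, 2, 4], [1.0, 0.5, 0.25]))
```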

Luciferase assays:

For the gdLuc assay, the Coelenterazine (CTZ) substrate (Cat # 3032, Nanolight Technology) was dissolved in 10 ml ultra-sterile distilled water to make the stock solution and kept at -20°C until use. The CTZ stock solution was diluted 10-30 times to make working solutions. Equal amounts of CTZ working solution and cell culture media (25-50 µl) collected after transfection were mixed in a Corning (CLS3922) white opaque 96-well optiplate, and the luminescence was measured in a BioTek Synergy LX multiplate reader. For the firefly luciferase assay used in some experiments, the ONE-Glo Luciferase assay kit (Promega Corp, Cat # E6110) was used. Aliquots of 100 µl substrate solution were mixed with 3-5 µl of cell lysates, and the luminescence was measured in a BioTek Synergy LX multiplate reader. Data were presented as relative luciferase activity or fold changes compared with the corresponding group. Experiments were performed at least 3 times, each in quadruplicate.

In vitro transcription and mRNA transfection

For the pcDNA6B vector containing the T7 promoter, the DNA was linearized by AgeI digestion followed by gel purification. For PCR products, the forward primer included the T7 promoter (TTAATACGACTCACTATAGGGTGGAATTCTGCAGATATCCAG (SEQ ID NO: 6), T1427), generating a DNA fragment containing the 5’-UTR, target gene, LG or QLG dual reporter and a poly(A) tail. PCR was performed using the Phusion High-Fidelity PCR Master Mix kit (Thermo Fisher Scientific, F531). The DNA was purified using a gel extraction kit, and the concentration was determined using a Take3 plate in a Bio-Tek multiplate reader. RNA was synthesized from the purified DNA template using the HiScribe™ T7 ARCA mRNA Kit (New England Biolabs, Cat#E2060), cotranscriptionally capped with m7G anti-reverse cap analog (ARCA, Cat#1411), and poly(A) tailed. The synthesized RNA was purified using the Monarch RNA cleanup kit (New England Biolabs, Cat#E2040) and quantified with a Take3 plate. Equal amounts of RNA for the LG and QLG groups at different dosages were transfected into HEK293T cells in quadruplicate with Lipofectamine® MessengerMAX mRNA Transfection Reagent (Thermo Fisher Scientific, Cat#LMRNA015) following the manufacturer’s manual. At 4-72 h post-transfection, the culture media containing gdLuc were collected, and the gdLuc assay was performed as above.

VSV-G or S protein-pseudotyped lentivirus packaging and titration

The recombinant lentivirus carrying the indicated lentiviral vector was produced at small scale using the second-generation LV packaging system according to standard protocols. Briefly, HEK293T cells in one well of a 6-well plate were cotransfected using TP5 with the indicated transfer LV vector (1.4 µg), the packaging vector psPAX2 or its mutants (1 µg) and the VSV-G or Sd18 vector (0.4 µg). At 2-3 days post-transfection, the supernatants containing LV were concentrated and purified with simplified 10% sucrose purification as described previously. The functional titers of the crude and purified lentivirus were determined by counting GFP-expressing HEK293T cells under fluorescent microscopy at 48 h after infection with serial dilutions of the lentiviruses. In some cases, flow cytometry or RT-qPCR analysis was used for LV titration. For PCR analysis, cell culture medium was collected from infected cells and centrifuged at 2,000 g for 5 min. The supernatant was subjected to viral lysis to extract viral RNA. One-step RT-qPCR was performed using the qPCR Lentivirus Complete Titration Kit (Applied Biological Materials Inc., Cat No. LV900-S) and the QuantStudio 3 Real-Time PCR System (Applied Biosystems, Cat No. A28567) according to the manufacturers’ protocols. The resulting data were analyzed using QuantStudio Design and Analysis Desktop Software (Applied Biosystems).
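A functional titer from GFP-positive cell counts is commonly expressed in transducing units (TU) per ml. The exact formula used in this study is not stated, so the following Python sketch assumes the widely used relation titer = (GFP-positive cells x dilution factor) / inoculum volume; the function names and example numbers are hypothetical.

```python
def functional_titer_tu_per_ml(gfp_positive_cells, dilution_factor, inoculum_ml):
    """Transducing units per ml of the undiluted stock.

    gfp_positive_cells: GFP-expressing cells counted in the well
    dilution_factor: fold dilution of the stock applied to that well (e.g. 1000)
    inoculum_ml: volume of diluted virus added to the well, in ml
    """
    if inoculum_ml <= 0 or dilution_factor <= 0:
        raise ValueError("dilution factor and inoculum volume must be positive")
    return gfp_positive_cells * dilution_factor / inoculum_ml


def mean_titer(counts_by_dilution, inoculum_ml):
    """Average the titer over several countable dilutions.

    counts_by_dilution: mapping of dilution factor -> GFP-positive cell count
    """
    titers = [
        functional_titer_tu_per_ml(count, dilution, inoculum_ml)
        for dilution, count in counts_by_dilution.items()
    ]
    return sum(titers) / len(titers)


if __name__ == "__main__":
    # Hypothetical counts from two wells of a serial dilution, 0.1 ml each.
    print(mean_titer({1000: 52, 10000: 5}, inoculum_ml=0.1))
```

Averaging over more than one dilution is a design choice that reduces the effect of counting error in any single well; only dilutions giving well-separated, countable GFP-positive cells should be included.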

Western Blot Analysis

SDS-polyacrylamide gels (10-12%) were home-made, or Mini-PROTEAN TGX gels (Cat# 4561093, 4561096) were purchased from BioRad. The cell lysates were prepared using a lysis buffer composed of 50 mM Tris-HCl pH 7.0, 150 mM NaCl, 5 mM EDTA and 1% Triton X-100, supplemented with PMSF (100x), Aprotinin and Leupeptin (200x). Lysates of 50 µl were prepared from each well after collecting the supernatant. The lysates were incubated at 4°C for 20-30 minutes and centrifuged at maximum speed in an Eppendorf centrifuge. The cleared lysates were either denatured immediately for 5 minutes at 98°C in 1x SDS-PAGE loading dye or stored at -80°C until use. Supernatants were stored at 4°C until they were treated with 1x SDS-PAGE loading dye. Denatured 10-20 µl aliquots of cell lysates or 20-30 µl of supernatants were loaded onto SDS-polyacrylamide gels. SDS-PAGE was performed in Tris-Glycine/SDS buffers under denaturing and reducing conditions.

Proteins in the polyacrylamide gels were transferred to 0.2-µm nitrocellulose membranes (BioRad supported nitrocellulose (NC) membrane, Cat # 162-0097) using either wet transfer or an iBlot®2 device with iBlot®2 NC mini (IB23002) or regular stacks (IB23001). For wet transfer, the following 1x transfer buffer was used: 25 mM Tris-HCl pH 7.6, 192 mM glycine, 20% methanol. The gels were sandwiched together with NC membranes, and transfers were performed in 1x transfer buffer at 250 mA at 4°C for 1-2 hours.

Dry Western blot transfers were performed in an iBlot®2 gel transfer device (Invitrogen, Thermo-Fisher, Ref# IB21001) using mini or regular iBlot®2 stacks for 7 min according to the manufacturer’s guidelines. After the transfer, the membranes were blocked in 1x TBST buffer containing 5% milk. The membranes were then treated with primary antibodies overnight at 4°C or for 2 hours at RT. The membranes were washed three times with 1x TBST buffer minute each followed by incubation with secondary antibodies. The secondary antibodies with infrared tags were diluted 1/10000-1/20000 and incubated with the NC membranes for 45 minutes to an hour. At the end of the incubation, the membranes were washed with 1x TBST buffer three times, 5 minutes each, and scanned on a Li-COR Odyssey image analyzer.

Antibody Detection with Enzyme-Linked Immunosorbent Assay (ELISA)

HEK293T cells were cotransfected with the Qa-tagged HQ (TP1574) and LQ (TP1571) vectors at 50 ng/well in 96-well plates in quadruplicate, with or without the normalization vector pGL4.16-CMV (TP329) or pRRL-E-Flag-LG (TP1578) at 20 ng/well. The original antibody plasmids pFUSEss-CHIg-hG1-SARS-CoV2-mAb (TP1565) and pFUSE2ss-CLIg-hk-SARS-CoV2-mAb (TP1566) were used as the control. ELISA was performed using a Human IgG (Total) Uncoated ELISA Kit (Invitrogen, Thermo-Fisher, Cat # 88-50550-88). A 96-well Costar ELISA plate (Corning) was first coated with SARS-CoV2 Spike (S) protein from BEI (Cat # NR52724) at 100 pg/well overnight at 4°C. The washing and blocking steps were performed using the buffers and solutions provided in the kit. Supernatants containing secreted antibodies were collected from the transfections at 24 and 48 h and kept at 4°C until use. Aliquots of 0.5, 2.5 and 5.0 antibody supernatants were added to the SARS-CoV2-S-coated wells. After overnight incubation, the wells were washed (400 µl per well) with the solutions provided in the kit. The horseradish peroxidase (HRP)-conjugated anti-monoclonal detection antibody was diluted in assay buffer (1/250), added to each well and incubated at room temperature (RT) for 2-3 h. The wells were then washed 3 times (400 µl each) at RT using a buffer provided in the kit and treated with 300 µL of the substrate TMB (3,3’,5,5’-tetramethylbenzidine) for 15 min to develop a blue color, and the reactions were terminated with 2 N HCl. The resulting yellow color was measured at 450 nm using a BioTek microplate reader. The level of anti-SARS-CoV monoclonal antibody was quantified by a sigmoidal four-parameter logistic (4PL) curve fit using Prism GraphPad 9.1.
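For orientation, the four-parameter logistic (4PL) model relates absorbance to concentration, and its inverse converts a reading back to a concentration once the four parameters have been fitted. The Python sketch below only evaluates and inverts the curve; the fit itself was done in Prism, and the parameterization shown (a, b, c, d) is one common convention assumed here, with hypothetical parameter values.

```python
def four_pl(x, a, b, c, d):
    """4PL response at concentration x (x >= 0).

    a: response at zero concentration
    d: response at saturating concentration
    c: inflection point (same units as x)
    b: slope factor
    """
    return d + (a - d) / (1.0 + (x / c) ** b)


def inverse_four_pl(y, a, b, c, d):
    """Concentration that gives response y on a fitted 4PL curve."""
    low, high = min(a, d), max(a, d)
    if not low < y < high:
        raise ValueError("reading is outside the range of the standard curve")
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)


if __name__ == "__main__":
    # Hypothetical fitted parameters for an OD450 standard curve.
    params = dict(a=0.1, b=1.5, c=2.0, d=3.0)
    od = four_pl(2.0, **params)           # response at the inflection point
    print(od, inverse_four_pl(od, **params))
```

Readings at or beyond the asymptotes cannot be interpolated, which is why the inverse raises an error rather than returning a value; such samples would normally be re-assayed at a different dilution.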

ER-Golgi transport inhibition with Brefeldin A

Brefeldin A (BA, AdipoGen Life Sciences, Cat # AG-CN2-0018) was dissolved in DMSO to make a 1 mg/mL working solution. HEK293T cells were transfected with the indicated vectors using TP5 transfection reagent in DMEM plus 10% FBS as described above. The transfected cells were incubated overnight, and 10 µg/ml BA was added prior to the media change and incubated for 3 hours at 37°C in 5% CO2. The culture media was replaced with 293 FreeStyle serum-free media (Gibco, Thermo-Fisher, Cat# 12-338-018) containing 10 µg/ml BA, and incubation was continued for 24 h at 37°C in 5% CO2. Supernatants were withdrawn right after the media replacement and collected again after 24 h. Cell lysates were also prepared at the 24 h time point. The supernatants and cell lysates were tested for gdLuc activity and by Western blot analysis.

Quantification and statistical analysis

Quantification of fold changes in Qa groups or UTR groups compared with the corresponding non-Qa or non-UTR groups was performed using Excel software. Statistical analysis was performed using Prism GraphPad 9.1. Significance at *P < 0.05, **P < 0.01 and ***P < 0.001 was determined using a two-tailed Student’s t-test between two groups or by one-way ANOVA for multiple comparisons. Data were presented as mean ± SE. The size and type of individual samples were indicated and specified in the figure legends.
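The summary statistics named above (fold change, mean ± SE, and the Student’s t statistic) can be sketched in Python with the standard library only. This is an illustrative sketch, not the Excel or Prism workflow used in the study; the p-value lookup is omitted because it requires the t distribution, and all function names and example values are hypothetical.

```python
import math
import statistics


def mean_se(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(len(values))


def fold_change(test_group, reference_group):
    """Ratio of group means, e.g. Qa group over the matching non-Qa group."""
    return statistics.fmean(test_group) / statistics.fmean(reference_group)


def student_t(group_a, group_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * statistics.variance(group_a)
              + (nb - 1) * statistics.variance(group_b)) / df
    t = (statistics.fmean(group_a) - statistics.fmean(group_b)) / math.sqrt(
        pooled * (1.0 / na + 1.0 / nb))
    return t, df


if __name__ == "__main__":
    qa = [10.0, 12.0, 11.0, 13.0]      # hypothetical luciferase readings
    non_qa = [5.0, 6.0, 5.5, 6.5]
    print(fold_change(qa, non_qa), mean_se(qa), student_t(qa, non_qa))
```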

All of the features disclosed in this specification may be combined in any combination. Each feature disclosed in this specification may be replaced by an alternative feature serving the same, equivalent, or similar purpose.

References

Baldassarre, A., Paolini, A., Bruno, S.P., Felli, C., Tozzi, A.E., and Masotti, A. (2020). Potential use of noncoding RNAs and innovative therapeutic strategies to target the 5'UTR of SARS-CoV-2. Epigenomics 12, 1349-1361. 10.2217/epi-2020-0162.

Berkhout, B., Arts, K., and Abbink, T.E. (2011). Ribosomal scanning on the 5'-untranslated region of the human immunodeficiency virus RNA genome. Nucleic Acids Res 39, 5232-5244. 10.1093/nar/gkr113.

Bhagawati, M., Terhorst, T.M.E., Fusser, F., Hoffmann, S., Pasch, T., Pietrokovski, S., and Mootz, H.D. (2019). A mesophilic cysteine-less split intein for protein trans-splicing applications under oxidizing conditions. Proc Natl Acad Sci U S A 116, 22164-22172. 10.1073/pnas.1909825116.

Boo, S.H., and Kim, Y.K. (2020). The emerging role of RNA modifications in the regulation of mRNA stability. Exp Mol Med 52, 400-408. 10.1038/s12276-020-0407-z.

Boson, B., Legros, V., Zhou, B., Siret, E., Mathieu, C., Cosset, F.L., Lavillette, D., and Denolly, S. (2020). The SARS-CoV-2 envelope and membrane proteins modulate maturation and retention of the spike protein, allowing assembly of virus-like particles. J Biol Chem 296, 100111. 10.1074/jbc.RA120.016175.

Bottaro, S., Bussi, G., and Lindorff-Larsen, K. (2021). Conformational Ensembles of Noncoding Elements in the SARS-CoV-2 Genome from Molecular Dynamics Simulations. J Am Chem Soc. 10.1021/jacs.1c01094.

Cambray, G., Guimaraes, J.C., and Arkin, A.P. (2018). Evaluation of 244,000 synthetic sequences reveals design principles to optimize translation in Escherichia coli. Nat Biotechnol 36, 1005-1015. 10.1038/nbt.4238.

Chan, A.P., Choi, Y., and Schork, N.J. (2020). Conserved Genomic Terminals of SARS-CoV-2 as Coevolving Functional Elements and Potential Therapeutic Targets. mSphere 5. 10.1128/mSphere.00754-20.

Chan, L.Y., Mugler, C.F., Heinrich, S., Vallotton, P., and Weis, K. (2018). Non-invasive measurement of mRNA decay reveals translation initiation as the major determinant of mRNA stability. Elife 7. 10.7554/eLife.32536.

Chng, J., Wang, T., Nian, R., Lau, A., Hoi, K.M., Ho, S.C., Gagnon, P., Bi, X., and Yang, Y. (2015). Cleavage efficient 2A peptides for high level monoclonal antibody expression in CHO cells. MAbs 7, 403-412. 10.1080/19420862.2015.1008351.

Corbett, K.S., Edwards, D.K., Leist, S.R., Abiona, O.M., Boyoglu-Barnum, S., Gillespie, R.A., Himansu, S., Schafer, A., Ziwawo, C.T., DiPiazza, A.T., et al. (2020). SARS-CoV-2 mRNA vaccine design enabled by prototype pathogen preparedness. Nature 586, 567-571. 10.1038/s41586-020-2622-0.

Daliri, E.B., Oh, D.H., and Lee, B.H. (2017). Bioactive Peptides. Foods 6. 10.3390/foods6050032.

DeCaprio, J., and Kohl, T.O. (2019). Tandem Immunoaffinity Purification Using Anti-FLAG and Anti-HA Antibodies. Cold Spring Harb Protoc 2019. 10.1101/pdb.prot098657.

Donofrio, G., Franceschi, V., Macchi, F., Russo, L., Rocci, A., Marchica, V., Costa, F., Giuliani, N., Ferrari, C., and Missale, G. (2021). A Simplified SARS-CoV-2 Pseudovirus Neutralization Assay. Vaccines (Basel) 9. 10.3390/vaccines9040389.

Dou, Y., Lin, Y., Wang, T.Y., Wang, X.Y., Jia, Y.L., and Zhao, C.P. (2021). The CAG promoter maintains high-level transgene expression in HEK293 cells. FEBS Open Bio 11, 95-104. 10.1002/2211-5463.13029.

Gaur, S., Bhargava-Shah, A., Hori, S., Afjei, R., Sekar, T.V., Gambhir, S.S., Massoud, T.F., and Paulmurugan, R. (2017). Engineering Intracellularly Retained Gaussia Luciferase Reporters for Improved Biosensing and Molecular Imaging Applications. ACS Chem Biol 12, 2345-2353. 10.1021/acschembio.7b00454.

Groot Bramel-Verheije, M.H., Rottier, P.J., and Meulenberg, J.J. (2000). Expression of a foreign epitope by porcine reproductive and respiratory syndrome virus. Virology 278, 380-389. 10.1006/viro.2000.0525.

Han, X., Ning, W., Ma, X., Wang, X., and Zhou, K. (2020). Improving protein solubility and activity by introducing small peptide tags designed with machine learning models. Metab Eng Commun 11, e00138. 10.1016/j.mec.2020.e00138.

Hinnebusch, A.G., Ivanov, I.P., and Sonenberg, N. (2016). Translational control by 5'-untranslated regions of eukaryotic mRNAs. Science 352, 1413-1416. 10.1126/science.aad9868.

Hu, J., Gao, Q., He, C., Huang, A., Tang, N., and Wang, K. (2020). Development of cell-based pseudovirus entry assay to identify potential viral entry inhibitors and neutralizing antibodies against SARS-CoV-2. Genes Dis 7, 551-557. 10.1016/j.gendis.2020.07.006.

Katayama, S., Corpuz, H.M., and Nakamura, S. (2021). Potential of plant-derived peptides for the improvement of memory and cognitive function. Peptides 142, 170571. 10.1016/j.peptides.2021.170571.

Kim, J.H., Lee, S.R., Li, L.H., Park, H.J., Park, J.H., Lee, K.Y., Kim, M.K., Shin, B.A., and Choi, S.Y. (2011). High cleavage efficiency of a 2A peptide derived from porcine teschovirus-1 in human cell lines, zebrafish and mice. PLoS One 6, e18556. 10.1371/journal.pone.0018556.

Kolahchi, Z., De Domenico, M., Uddin, L.Q., Cauda, V., Grossmann, I., Lacasa, L., Grancini, G., Mahmoudi, M., and Rezaei, N. (2021). COVID-19 and Its Global Economic Impact. Adv Exp Med Biol 1318, 825-837. 10.1007/978-3-030-63761-3_46.

Korber, B., Fischer, W.M., Gnanakaran, S., Yoon, H., Theiler, J., Abfalterer, W., Hengartner, N., Giorgi, E.E., Bhattacharya, T., Foley, B., et al. (2020). Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus. Cell 182, 812-827 e819. 10.1016/j.cell.2020.06.043.

Kuzmina, A., Khalaila, Y., Voloshin, O., Keren-Naus, A., Boehm-Cohen, L., Raviv, Y., Shemer-Avni, Y., Rosenberg, E., and Taube, R. (2021). SARS-CoV-2 spike variants exhibit differential infectivity and neutralization resistance to convalescent or post-vaccination sera. Cell Host Microbe. 10.1016/j.chom.2021.03.008.

Lee, T.H., Kim, K.S., Kim, J.H., Jeong, J.H., Woo, H.R., Park, S.R., Sohn, M.H., Lee, H.J., Rhee, J.H., Cha, S.S., et al. (2020). Novel short peptide tag from a bacterial toxin for versatile applications. J Immunol Methods 479, 112750. 10.1016/j.jim.2020.112750.

Li, Y. (2011). Recombinant production of antimicrobial peptides in Escherichia coli: a review. Protein Expr Purif 80, 260-267. 10.1016/j.pep.2011.08.001.

Liu, J., Bodnar, B.H., Meng, F., Khan, A., Wang, X., Luo, G., Saribas, S., Wang, T., Lohani, S.C., Wang, P., et al. (2021a). Epigallocatechin Gallate from Green Tea Effectively Blocks Infection of SARS-CoV-2 and New Variants by Inhibiting Spike Binding to ACE2 Receptor. bioRxiv, 2021.2003.2017.435637. 10.1101/2021.03.17.435637.

Liu, J., Bodnar, B.H., Wang, X., Wang, P., Meng, F., Khan, A.I., Saribas, A.S., Padhiar, N.H., McCluskey, E., Shah, S., et al. (2021b). Correlation of vaccine-elicited antibody levels and neutralizing activities against SARS-CoV-2 and its variants. bioRxiv, 2021.2005.2031.445871. 10.1101/2021.05.31.445871.

Lorenz, I.C., Nguyen, H.T., Kemelman, M., Lindsay, R.W., Yuan, M., Wright, K.J., Arendt, H., Back, J.W., DeStefano, J., Hoffenberg, S., et al. (2014). The stem of vesicular stomatitis virus G can be replaced with the HIV-1 Env membrane-proximal external region without loss of G function or membrane-proximal external region antigenic properties. AIDS Res Hum Retroviruses 30, 1130-1144. 10.1089/AID.2013.0206.

Majorek, K.A., Kuhn, M.L., Chruszcz, M., Anderson, W.F., and Minor, W. (2014). Double trouble-Buffer selection and His-tag presence may be responsible for nonreproducibility of biomedical experiments. Protein Sci 23, 1359-1368. 10.1002/pro.2520.

Miao, Z., Tidu, A., Eriani, G., and Martin, F. (2020). Secondary structure of the SARS-CoV-2 5'-UTR. RNA Biol, 1-10. 10.1080/15476286.2020.1814556.

Mishra, V. (2020). Affinity Tags for Protein Purification. Curr Protein Pept Sci 21, 821-830. 10.2174/1389203721666200606220109.

Muik, A., Wallisch, A.K., Sanger, B., Swanson, K.A., Muhl, J., Chen, W., Cai, H., Maurus, D., Sarkar, R., Tureci, O., et al. (2021). Neutralization of SARS-CoV-2 lineage B.1.1.7 pseudovirus by BNT162b2 vaccine-elicited human sera. Science 371, 1152-1153. 10.1126/science.abg6105.

Nie, J., Li, Q., Wu, J., Zhao, C., Hao, H., Liu, H., Zhang, L., Nie, L., Qin, H., Wang, M., et al. (2020). Establishment and validation of a pseudovirus neutralization assay for SARS-CoV-2. Emerg Microbes Infect 9, 680-686. 10.1080/22221751.2020.1743767.

Ou, X., Liu, Y., Lei, X., Li, P., Mi, D., Ren, L., Guo, L., Guo, R., Chen, T., Hu, J., et al. (2020). Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV. Nat Commun 11, 1620. 10.1038/s41467-020-15562-9.

Peighambardoust, S.H., Karami, Z., Pateiro, M., and Lorenzo, J.M. (2021). A Review on Health-Promoting, Biological, and Functional Aspects of Bioactive Peptides in Food Applications. Biomolecules 11. 10.3390/biom11050631.

Pina, A.S., Batalha, I.L., Dias, A., and Roque, A.C.A. (2021). Affinity Tags in Protein Purification and Peptide Enrichment: An Overview. Methods Mol Biol 2178, 107-132. 10.1007/978-1-0716-0775-6_10.

Polack, F.P., Thomas, S.J., Kitchin, N., Absalon, J., Gurtman, A., Lockhart, S., Perez, J.L., Perez Marc, G., Moreira, E.D., Zerbini, C., et al. (2020). Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine. N Engl J Med 383, 2603-2615. 10.1056/NEJMoa2034577.

Raman, S., and Brian, D.A. (2005). Stem-loop IV in the 5' untranslated region is a cis-acting element in bovine coronavirus defective interfering RNA replication. J Virol 79, 12434-12446. 10.1128/JVI.79.19.12434-12446.2005.

Rangan, R., Zheludev, I.N., Hagey, R.J., Pham, E.A., Wayment-Steele, H.K., Glenn, J.S., and Das, R. (2020). RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses: a first look. RNA 26, 937-959. 10.1261/rna.076141.120.

Rezaei, N., Ashkevarian, S., Fathi, M.K., Hanaei, S., Kolahchi, Z., Ladi Seyedian, S.S., Rayzan, E., Sarzaeim, M., Vahed, A., Mohamed, K., et al. (2021). Introduction on Coronavirus Disease (COVID-19) Pandemic: The Global Challenge. Adv Exp Med Biol 1318, 1-22. 10.1007/978-3-030-63761-3_1.

Rouchka, E.C., Chariker, J.H., and Chung, D. (2020). Variant analysis of 1,040 SARS-CoV-2 genomes. PLoS One 15, e0241535. 10.1371/journal.pone.0241535.

Ryder, S.P., Morgan, B.R., Coskun, P., Antkowiak, K., and Massi, F. (2021). Analysis of Emerging Variants in Structured Regions of the SARS-CoV-2 Genome. Evol Bioinform Online 17, 11769343211014167. 10.1177/11769343211014167.

Saribas, A.S., White, M.K., and Safak, M. (2018). Structure-based release analysis of the JC virus agnoprotein regions: A role for the hydrophilic surface of the major alpha helix domain in release. J Cell Physiol 233, 2343-2359. 10.1002/jcp.26106.

Schlehuber, L.D., and Rose, J.K. (2004). Prediction and identification of a permissive epitope insertion site in the vesicular stomatitis virus glycoprotein. J Virol 78, 5079-5087. 10.1128/jvi.78.10.5079-5087.2004.

Senanayake, S.D., and Brian, D.A. (1999). Translation from the 5' untranslated region (UTR) of mRNA 1 is repressed, but that from the 5' UTR of mRNA 7 is stimulated in coronavirus-infected cells. J Virol 73, 8003-8009. 10.1128/JVI.73.10.8003-8009.1999.

Shirokikh, N.E., Dutikova, Y.S., Staroverova, M.A., Hannan, R.D., and Preiss, T. (2019). Migration of Small Ribosomal Subunits on the 5' Untranslated Regions of Capped Messenger RNA. Int J Mol Sci 20. 10.3390/ijms20184464.

Traenkle, B., Segan, S., Fagbadebo, F.O., Kaiser, P.D., and Rothbauer, U. (2020). A novel epitope tagging system to visualize and monitor antigens in live cells with chromobodies. Sci Rep 10, 14267. 10.1038/s41598-020-71091-x.

Vasan, N., Razavi, P., Johnson, J.L., Shao, H., Shah, H., Antoine, A., Ladewig, E., Gorelick, A., Lin, T.Y., Toska, E., et al. (2019). Double PIK3CA mutations in cis increase oncogenicity and sensitivity to PI3Kalpha inhibitors. Science 366, 714-723. 10.1126/science.aaw9032.

Viswanathan, S., Williams, M.E., Bloss, E.B., Stasevich, T.J., Speer, C.M., Nern, A., Pfeiffer, B.D., Hooks, B.M., Li, W.P., English, B.P., et al. (2015). High-performance probes for light and electron microscopy. Nat Methods 12, 568-576. 10.1038/nmeth.3365.

Walls, A.C., Park, Y.J., Tortorici, M.A., Wall, A., McGuire, A.T., and Veesler, D. (2020). Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein. Cell 181, 281-292 e286. 10.1016/j.cell.2020.02.058.

Walsh, E.E., Frenck, R.W., Jr., Falsey, A.R., Kitchin, N., Absalon, J., Gurtman, A., Lockhart, S., Neuzil, K., Mulligan, M.J., Bailey, R., et al. (2020). Safety and Immunogenicity of Two RNA-Based Covid-19 Vaccine Candidates. N Engl J Med 383, 2439-2450. 10.1056/NEJMoa2027906.

Wang, Q., Zhang, Y., Wu, L., Niu, S., Song, C., Zhang, Z., Lu, G., Qiao, C., Hu, Y., Yuen, K.Y., et al. (2020). Structural and Functional Basis of SARS-CoV-2 Entry by Using Human ACE2. Cell 181, 894-904 e899. 10.1016/j.cell.2020.03.045.

Weber, M., Burgos, R., Yus, E., Yang, J.S., Lluch-Senar, M., and Serrano, L. (2020). Impact of C-terminal amino acid composition on protein expression in bacteria. Mol Syst Biol 16, e9208. 10.15252/msb.20199208.

Weissman, D., Alameh, M.G., de Silva, T., Collini, P., Hornsby, H., Brown, R., LaBranche, C.C., Edwards, R.J., Sutherland, L., Santra, S., et al. (2021). D614G Spike Mutation Increases SARS CoV-2 Susceptibility to Neutralization. Cell Host Microbe 29, 23-31 e24. 10.1016/j.chom.2020.11.012.

Wibmer, C.K., Ayres, F., Hermanus, T., Madzivhandila, M., Kgagudi, P., Oosthuysen, B., Lambson, B.E., de Oliveira, T., Vermeulen, M., van der Berg, K., et al. (2021a). SARS-CoV-2 501Y.V2 escapes neutralization by South African COVID-19 donor plasma. Nat Med. 10.1038/s41591-021-01285-x.

Wibmer, C.K., Ayres, F., Hermanus, T., Madzivhandila, M., Kgagudi, P., Oosthuysen, B., Lambson, B.E., de Oliveira, T., Vermeulen, M., van der Berg, K., et al. (2021b). SARS-CoV-2 501Y.V2 escapes neutralization by South African COVID-19 donor plasma. Nat Med 27, 622-625. 10.1038/s41591-021-01285-x.

Williams, G.D., Chang, R.Y., and Brian, D.A. (1999). A phylogenetically conserved hairpin-type 3' untranslated region pseudoknot functions in coronavirus RNA replication. J Virol 73, 8349-8355. 10.1128/JVI.73.10.8349-8355.1999.

Wu, F., Zhao, S., Yu, B., Chen, Y.M., Wang, W., Song, Z.G., Hu, Y., Tao, Z.W., Tian, J.H., Pei, Y.Y., et al. (2020). A new coronavirus associated with human respiratory disease in China. Nature 579, 265-269. 10.1038/s41586-020-2008-3.

Yang, D., and Leibowitz, J.L. (2015). The structure and functions of coronavirus genomic 3' and 5' ends. Virus Res 206, 120-133. 10.1016/j.virusres.2015.02.025.

Zhang, J., Cruz-Cosme, R., Zhuang, M.W., Liu, D., Liu, Y., Teng, S., Wang, P.H., and Tang, Q. (2020). A systemic and molecular study of subcellular localization of SARS-CoV-2 proteins. Signal Transduct Target Ther 5, 269. 10.1038/s41392-020-00372-8.

Zhang, J., Roberts, R., and Rakotondrafara, A.M. (2015). The role of the 5' untranslated regions of Potyviridae in translation. Virus Res 206, 74-81. 10.1016/j.virusres.2015.02.005.

Zhao, J., Qiu, J., Aryal, S., Hackett, J.L., and Wang, J. (2020). The RNA Architecture of the SARS-CoV-2 3'-Untranslated Region. Viruses 12. 10.3390/v12121473.

Example 3: Protein Expression/Secretion Boost By An Expression-Enhancing 21-mer Cis-Regulatory Motif (Exen21)

Many technologies have been developed to boost protein production, such as promoter optimization, mRNA regulation, codon optimization, and protein stabilization, as well as modification of host cellular expression machinery, including humanized yeast systems. While these strategies have been used successfully in various research fields and by biopharmaceutical companies, developing a simple, universal method that can increase protein production at lower cost remains a research focus. Studying SARS-CoV-2 has been hampered by the low-level expression of many viral proteins, including the spike (S) protein, in mammalian cells, which has limited the quick response to the COVID-19 pandemic. To optimize SARS-CoV-2 viral protein expression, various expression vectors were developed herein using different promoters and a luciferase/GFP-based dual reporter system. During the vector optimization process, it was discovered that the addition of a novel 21-mer oligonucleotide motif (termed herein “Exen21”, Expression-Enhancing 21) into the vector dramatically increased the expression and secretion of the SARS-CoV-2 envelope (E) protein. This unique Exen21 encodes a specific heptapeptide designated as Qa. The insertion of Exen21/Qa was then extended to various types of proteins and was found to enhance the production of other SARS-CoV-2 proteins, cellular gene products, mRNA vaccines, antibodies, engineered recombinant proteins, and virus-packaging proteins.

Materials and Methods

Vector cloning

All the PCR reactions for cloning in this study were performed using the Phusion High-Fidelity PCR Master Mix kit (Thermo Fisher, F531) and purified using the Monarch PCR & DNA Cleanup Kit (NEB, T1030S). The correct clones were verified by restriction enzyme digestion and Sanger sequencing as well as functional measures.

Dual reporter vectors: The dual reporter LG fragment, encoding Gaussia-Dura luciferase (gdLuc) and destabilized GFP (dsGFP), was generated by overlay PCR: 1) Standard PCR was performed to generate fragment 1 (gdLuc) from the template plasmid pMCS-Gaussia-Dura-Luciferase (Thermo Fisher Scientific, Cat#16190) with primer pair T1290/T1291, while fragment 2 (dsGFP) was generated from the plasmid pLenti-EFS-EGFPd2PEST-2A-MCS-Hygro (TP1380, a gift from Neville Sanjana (Addgene Cat# 138152)) with T1292/T1293; 2) The two purified fragments (100 ng each), which overlap by 19 nucleotides, were mixed for 5 cycles of PCR with primer pair T1304/T1305 at 98°C 15 sec, 58°C 30 sec and 72°C 1 min, followed by 30 cycles of PCR at 98°C 30 sec, 55°C 30 sec and 72°C 1 min to generate the LG fragment. After purification with the NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel, Cat# REF740609), this LG fragment (1485 bp) was cloned into pcDNA6B-nCoV-X-Flag vectors encoding various viral proteins of SARS-CoV-2 or the cellular gene hACE2 via the SacII cloning site using the NEBuilder® HiFi DNA Assembly cloning kit (NEB, Cat# E5520S, assigned as NEB-HiFi) to generate pcDNA6B-SARS-CoV-2-X-Flag-LG vectors. The “X” indicates the gene of interest.
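The joining step of the overlay PCR can be pictured in silico: the 3' end of fragment 1 and the 5' end of fragment 2 share a short identical stretch (19 nucleotides here), and the product contains that stretch once. The Python sketch below is illustrative only; the sequences are invented placeholders and not the gdLuc or dsGFP sequences, and the function name is hypothetical.

```python
def join_by_overlap(fragment_1, fragment_2, min_overlap=15):
    """Join two same-strand sequences that share an end overlap.

    Searches for the longest suffix of fragment_1 that equals a prefix of
    fragment_2 and returns (joined sequence, overlap length).
    """
    fragment_1, fragment_2 = fragment_1.upper(), fragment_2.upper()
    longest = min(len(fragment_1), len(fragment_2))
    for k in range(longest, min_overlap - 1, -1):
        if fragment_1[-k:] == fragment_2[:k]:
            # Keep the shared stretch only once in the product.
            return fragment_1 + fragment_2[k:], k
    raise ValueError("no overlap of the required length was found")


if __name__ == "__main__":
    overlap = "GATTACAAGGATGACGACG"            # placeholder 19-mer
    frag_1 = "ATGGGAGTCAAAGTTCTG" + overlap     # placeholder fragment 1
    frag_2 = overlap + "GTGAGCAAGGGCGAGGAG"     # placeholder fragment 2
    product, k = join_by_overlap(frag_1, frag_2)
    print(k, len(product))
```

The expected product length is the sum of the two fragment lengths minus the overlap, which gives a quick sanity check against the size seen on a gel.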

Unexpectedly, functional assay and Sanger sequencing identified a novel clone assigned as pcDNA6B-SARS-CoV-2-E-Flag-QLG (TP1479), which has a Qa peptide in the ORF before LG, assigned as QLG. The insert fragment encoding SARS-CoV-2 S protein from the pcDNA6B-nCoV-S-Flag vector (TP1456) was cloned into TP1479 via XhoI/XbaI sites to generate pcDNA6B-SARS-CoV-2-S-Flag-QLG (TP1487). The insert fragment encoding SARS-CoV-2 N protein from the pcDNA6B-nCoV-N-Flag vector (TP1431) or hACE2 from the pcDNA6B-hACE2-Flag vector (TP1470) was cloned into TP1479 via KpnI/XbaI sites to generate pcDNA6B-SARS-CoV-2-N-Flag-QLG (TP1490) or pcDNA6B-hACE2-Flag-QLG (TP1491).

The pcDNA6B-NIBP-Flag-LG (TP1560) vector was generated by NEB-HiFi cloning of the NIBP PCR product from pYX-Asc-mNIBP (TP546, Genbank # BC070463) with the primers T1375/T1376 into pcDNA6B-hACE2-Flag-LG (TP1538) via NotI/XbaI, while pcDNA6B-NIBP-Flag-QLG (TP1558) was generated by NEB-HiFi cloning of the NIBP PCR product into pcDNA6B-SARS-CoV-2-E-Flag-QLG (TP1479) via XhoI/XbaI.

The pCAG vectors encoding E were generated by replacing the CMV promoter in the corresponding pcDNA6B-SARS-CoV-2-E-Flag-LG or -QLG vectors with the CAG promoter via SnaBI/KpnI sites.

Mutation vectors: Site-directed or deletion mutagenesis of Exen21/Qa was performed using pcDNA6B-SARS-CoV-2-E-Flag-QLG (TP1479) as a template. Mutagenic primers were designed to change or delete specific nucleotides in the Exen21 sequence. For each mutation, a Phusion High-Fidelity PCR reaction was performed using a universal primer (T1640) matching a region upstream of SARS-CoV-2 E and a mutagenic primer matching the Exen21 sequence except at the position where the desired mutation was introduced. The PCR product carrying the Exen21 mutation was gel purified and cloned into AcoAV/Ao/I-digested 6B-E-QLG DNA using the NEBuilder® HiFi DNA assembly kit.

Antibody vectors: The CR3022 plasmid set, pFUSEss-CHIg-hG1-SARS-CoV-2-mAb (NR-52399, TP1565) and pFUSE2ss-CLIg-hk-SARS-CoV-2-mAb (NR-52400, TP1566), expressing the heavy (H) and light (L) chains of human anti-SARS-CoV mAb respectively (GenBank: DQ168569 and DQ168570), was produced under HHSN272201400008C and obtained from BEI Resources, NIAID, NIH (Cat# NR-53260). The Q-tagged HQ (TP1574) and LQ (TP1571) vectors were generated from the H or L plasmids at the NheI site using NEB-HiFi with synthesized oligonucleotides that contain the Q-encoding sequence and the C-terminus of the immunoglobulin heavy and light chains (T1378, T1380-T1383).

Lentiviral vectors: The vector pRRL-SIN.cPPT.PGK-GFP.WPRE (TP792; Addgene #12252) was used to generate pRRL-E-Flag-LG-GFP (TP1577) by transferring the E-Flag-LG insert from TP1478 to TP792 via BamHI/AgeI. The pRRL-E-Flag-LG (TP1578) vector was generated from TP1577 by AgeI/KpnI blunt ligation. The pRRL-E-Flag-QLG (TP1579) vector was generated by transferring E-Flag-QLG from TP1479 to TP1578 via BamHI/BstBI. The TP1578 and TP1579 vectors were used as the backbone for NEB-HiFi cloning of human IFNγ and IL2 PCR products via the XbaI site to generate pRRL-IFNγ-LG (TP1604) or QLG (TP1605) and pRRL-IL2-LG (TP1606) or QLG (TP1607). The PCR fragments of IFNγ and IL2 were derived, respectively, from pUC8-IFNγ (Addgene #17600) and pAIP-hIL2-co (Addgene #90513) using primer pairs T1407/T1408 and T1409/T1410. The pRRL-Flag-LG (TP1685) and pRRL-Flag-QLG (TP1686) vectors were generated respectively from TP1621 and TP1622 via BsmBI/XbaI digestion and NEB-HiFi cloning with an oligonucleotide insert (T1469). The pLV-EF1a-spCas9-Q-T2A-RFP (TP1562) vector was generated from pLV-EF1a-spCas9-T2A-RFP (TP855) at Ariel site using NEB-HiFi cloning with a synthesized oligonucleotide that contains the Q-encoding sequence (T1361). The pLV-EF1a-MS2-spCas9-Q-F2A-GFP (TP1552) vector was generated from pLV-EF1a-MS2-spCas9-F2A-GFP (TP1081) at Ariel site using NEB-HiFi cloning with the oligonucleotide (T1361).

The LV packaging vector psPAX2-Gag-Q (TP1618) was generated from psPAX2 (TP592, Addgene #12260) via SphI/EcoRV sites by NEB-HiFi cloning of two overlay PCR fragments with primer pairs T1396/T1397 and T1398/T1399 using TP592 as the PCR template. The psPAX2-Pol-Q-RRE-Q (TP1619) vector was generated from psPAX2 via SwaI/NheI sites by NEB-HiFi cloning of two overlay PCR fragments with primer pairs T1400/T1401 and T1402/T1403 using psPAX2 as the PCR template. The psPAX2-Gag-Q-Pol-Q-RRE-Q (TP1620) vector was generated from TP1619 via SphI/EcoRV sites by NEB-HiFi cloning of two overlay PCR fragments with primer pairs T1396/T1397 and T1398/T1399 using TP592 as the PCR template.

S-pseudoviral vectors: The vector pCAG-SARS-CoV-2-Sdl8Q (TP1506), encoding the human codon-optimized S gene of SARS-CoV-2 with a C-terminal 18 aa deletion (Sdl8) and Qa tag fusion, was constructed using NEB-HiFi. Briefly, the Sdl8 expression cassette in the CMV-driven vector pcDNA3.1-SARS2-S (Addgene Cat# 145032) was transferred to a CAG-driven vector pCAG-Flag-SARS-CoV-2-S (gift from Peihui Wang) via EcoRI/NotI sites and PCR with primer pair T1323/T1324. The vector pCAG-SARS-CoV-2-Sdl8 (TP1567), encoding Sdl8 without the Qa tag, was constructed via NEB-HiFi cloning with the synthesized oligonucleotide insert T1367 at the SacII/NotI site of the pCAG-SARS-CoV-2-Sdl8Q vector.

Plasmid DNA purification and DNA quantification

Plasmid DNAs were purified using commercial kits for endotoxin-free miniprep (Cat# REF 740490) or midiprep (Cat# REF 740420) from Macherey-Nagel. The E. coli bacterial cultures (4 ml for miniprep, 200 ml for midiprep) harboring the relevant plasmids were grown overnight in LB or 2YT media supplemented with 100 µg/ml Carbenicillin, 50 µg/ml Kanamycin, 50 µg/ml blasticidin, or 50 µg/ml Zeocin at 30°C for NEB-stable or 37°C for DH5α E. coli cells. The bacterial cultures were harvested by centrifugation, and the pellets were processed to purify plasmid DNA according to the manufacturer’s guidelines. The final DNA was dissolved in ultra-pure DNase/RNase-free distilled water (Thermo Fisher, Cat#10977023) and DNA concentrations were determined using either a Nanodrop 1 UV-Vis Spectrophotometer (Thermo-Fisher) or a Take3 plate in a Bio-Tek multiplate reader.

Cell culture and Transfections

HEK293T human fetal kidney and HeLa human cervix epithelial cells were obtained from ATCC (Cat# CRL-3216 and CCL-2). Both cell lines were cultured in Dulbecco’s Modified Eagle’s Medium (DMEM, Gibco) supplemented with Fetal Bovine Serum (FBS) and 1% Penicillin/Streptomycin (Corning). BHK-21/WI-2 cells (Kerafast, EH1011) were grown in DMEM supplemented with 5% FBS and 1% Penicillin/Streptomycin. All cells were incubated at 37°C under a 5% CO2 atmosphere.

For most experiments, 96-well plates were used. For mRNA stability, 24-well plates were used. Cells resuspended in DMEM plus 10% FBS were seeded (3-4x10^ cells/well for 96-well plate or 1-2x10^ cells/well for 24-well plate) the night before the transfections. For transfections, Transporter 5 transfection reagent (TP5) (Polysciences Cat# 26008) was used at a 1:4 ratio of DNA to reagent. Typically, 50-100 ng plasmid DNA per well of a 96-well plate was mixed with 0.2-0.4 µl TP5 in 0.9% NaCl solution and incubated at room temperature for 20 min. The transfection reagent and DNA solution were mixed again and added to each well dropwise. The transfections were incubated at 37°C in 5% CO2 overnight (16-18 h), after which the media was replaced with DMEM plus 10% FBS.
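The DNA and reagent amounts stated above reduce to simple arithmetic. The following is a minimal sketch only, assuming the 1:4 ratio is read as 1 µg DNA to 4 µl TP5 (which matches 50-100 ng paired with 0.2-0.4 µl). The function names and the pipetting overage are hypothetical and are not part of the described protocol.

```python
def tp5_volume_ul(dna_ng: float, ul_per_ug: float = 4.0) -> float:
    """Return the TP5 volume (µl) for a given DNA mass (ng).

    Assumption: the 1:4 DNA/reagent ratio means 1 µg DNA : 4 µl reagent.
    """
    if dna_ng < 0:
        raise ValueError("DNA mass must be non-negative")
    return dna_ng / 1000.0 * ul_per_ug


def master_mix(dna_ng_per_well: float, wells: int, overage: float = 0.1) -> dict:
    """Scale one well's mix to several wells.

    `overage` is a hypothetical pipetting allowance (10% by default).
    """
    total_dna_ng = dna_ng_per_well * wells * (1.0 + overage)
    return {"dna_ng": total_dna_ng, "tp5_ul": tp5_volume_ul(total_dna_ng)}


if __name__ == "__main__":
    print(tp5_volume_ul(50))        # reagent for one 96-well transfection
    print(master_mix(100, wells=4))  # quadruplicate wells at 100 ng each
```

Keeping the ratio as a parameter makes it easy to adapt the calculation if a different DNA-to-reagent proportion is used.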

Multilabeled fluorescent immunocytochemistry and confocal image analysis

Cells were fixed for 30 min with 4% paraformaldehyde (PFA), washed with 1x PBS, permeabilized with 0.5% Triton X-100/1x phosphate buffered saline (PBS) for 30 min, blocked with 10% donkey serum for 1 h and incubated with mouse anti-Flag monoclonal or anti-2A primary antibodies in 0.1% Triton X-100/1x PBS overnight at 4°C. The next day, cells were washed with 1x PBS and incubated with the corresponding Alexa Fluor secondary antibodies (Jackson Immuno Research Labs; donkey anti-rabbit, anti-mouse, IgG (H+L) 488, 594, or 680) at a 1:400 dilution for 1 h at room temperature, using Hoechst 33258 (1:5000) as a nuclear counterstain. Fluorescent confocal images were acquired and analyzed using the Leica SP8 confocal system.

Flow Cytometry

Cells expressing the dsGFP reporter were dissociated with Accutase (Corning), passed through a 70 µm nylon cell strainer (Corning) to remove large clumps, and washed with 1x PBS. Dissociated cells were fixed with 4% PFA in PBS and GFP-positive cells were analyzed using a Cytek Aurora flow cytometer.

RNA Extraction and Reverse Transcription Quantitative PCR (RT-qPCR) for mRNA stability assay

HEK293T cells were transfected with the indicated vectors (500 ng/well for 24-well plate) for 24 h before treatment with the transcriptional inhibitor actinomycin D (10 µM) for various periods. Total RNA was extracted using the Monarch Total RNA Miniprep Kit (NEB, Cat# T2010), which includes two steps of DNA removal. An equal amount of RNA (0.5 µg) was used to synthesize cDNA using the High-Capacity cDNA Reverse Transcription Kit (Thermo Fisher Scientific, Cat# 4368814) with random hexanucleotide primers. Real-time PCR analysis was carried out on a QuantStudio™ 3 System. The mRNA expression levels of the reporter gdLuc luciferase and human β-actin were determined using the iTaq Universal SYBR Green Supermix kit (BioRad, Cat# 1725121). The sequences for the gdLuc primers are (forward) 5’-GATTACAAGGATGACGACGATAAG-3’ (T1364 targeting Flag) and (reverse) 5’-AAGTCTTCGTTGTTCTCGGTGGG-3’ (T432 targeting gdLuc). The human β-actin primers are (forward) 5’-AAGAGCTATGAGCTGCCTGA-3’ and (reverse) 5’-TACGGATGTCAACGTCACAC-3’. Each sample was tested in triplicate. Cycle threshold (Ct) values were obtained graphically for the reporter and β-actin. The differences in Ct values between the reporter and β-actin were represented as ΔCt values. The ΔΔCt values were obtained by subtracting the ΔCt values of the control samples from those of the samples at different time points. Relative percentage change in gene expression was calculated as 2^-ΔΔCt. The mRNA decay rate was calculated by non-linear regression curve fitting (one phase decay) using GraphPad Prism 9.1. Three independent experiments were performed.
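The ΔΔCt calculation and the half-life estimate described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis actually run: the decay fit here is a log-linear least-squares fit that assumes decay to a zero plateau, substituted for the non-linear one-phase decay fit performed in GraphPad Prism, and all function names and example numbers are hypothetical.

```python
import math


def relative_expression(ct_reporter, ct_actin, ct_reporter_ctrl, ct_actin_ctrl):
    """Return 2^-ΔΔCt for a sample relative to its control."""
    delta_ct = ct_reporter - ct_actin                 # ΔCt of the sample
    delta_ct_ctrl = ct_reporter_ctrl - ct_actin_ctrl  # ΔCt of the control
    return 2.0 ** -(delta_ct - delta_ct_ctrl)         # 2^-ΔΔCt


def half_life_log_linear(times_h, fraction_remaining):
    """Estimate mRNA half-life (h) from ln(N) = ln(N0) - k*t.

    Simplification: assumes a zero plateau; a non-linear one-phase
    decay fit may give a different value on real data.
    """
    logs = [math.log(f) for f in fraction_remaining]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    k = -sxy / sxx  # first-order decay constant (per hour)
    if k <= 0:
        raise ValueError("no decay detected; half-life is undefined")
    return math.log(2) / k


if __name__ == "__main__":
    # Hypothetical Ct values: sample at a later time point vs. time zero.
    print(relative_expression(22.0, 18.0, 20.0, 18.0))
    # Hypothetical fractions remaining after actinomycin D treatment.
    print(half_life_log_linear([0, 2, 4, 8], [1.0, 0.5, 0.25, 0.0625]))
```

The log-linear form is convenient for a quick check because it needs no curve-fitting library, at the cost of ignoring any non-zero plateau.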

Luciferase assays

For the gdLuc assay, the Coelenterazine (CTZ) substrate (Nanolight Technology, Cat # 3032) was dissolved in 10 ml ultra-sterile distilled water to make the stock solution and kept at -20°C until use. The CTZ stock solution was diluted 10-30 times to make working solutions. Equal amounts of CTZ working solution and cell culture media (25-50 µl) collected after transfection were mixed in a white opaque 96-well optiplate (Corning, Cat# CLS3922), and the luminescence was measured in a BioTek Synergy LX multiplate reader. For the firefly luciferase assay used in some experiments, the ONE-Glo Luciferase assay kit (Promega Corp, Cat # E6110) was used. Aliquots of 100 µl substrate solution were mixed with 3-5 µl of cell lysates and the luminescence was measured in a BioTek Synergy LX multiplate reader. Data were presented as relative luciferase activity or fold changes compared with the corresponding group. Experiments were performed at least 3 times, each in quadruplicate.

In vitro transcription and mRNA transfection

For the pcDNA6B vector containing the T7 promoter, the DNA was linearized by AgeI digestion followed by gel purification. For the PCR product, the primers included the T7 promoter (TTAATACGACTCACTATAGGGTGGAATTCTGCAGATATCCAG, T1427), generating a DNA fragment containing the target gene, the LG or QLG dual reporter and a poly(A) tail. PCR was performed using the Phusion High-Fidelity PCR Master Mix kit (Thermo Fisher Scientific, F531). The DNA was purified using a gel extraction kit and the concentration was determined using a Take3 plate in a BioTek multiplate reader. RNA was synthesized from the purified DNA template using the HiScribe™ T7 ARCA mRNA Kit (NEB, Cat#E2060), co-transcriptionally capped with m7G anti-reverse cap analog (ARCA, Cat#1411), and poly(A) tailed. The synthesized RNA was purified using the Monarch RNA cleanup kit (NEB, Cat#E2040) and quantified with a Take3 plate. Equal amounts of RNA for the LG and QLG groups at different dosages were transfected into HEK293T cells in quadruplicate with Lipofectamine® MessengerMAX mRNA Transfection Reagent (Thermo Fisher Scientific, Cat#LMRNA015) following the manufacturer’s manual. At 4-72 h post-transfection, the culture media containing gdLuc were collected, and the gdLuc assay was performed as above.

VSV-G or S protein-pseudotyped lentivirus packaging and titration

The recombinant lentivirus carrying the indicated lentiviral vector was produced on a small scale using the second-generation LV packaging system according to standard protocols. Briefly, HEK293T cells in one well of a 6-well plate were cotransfected using the TP5 kit with the indicated transfer LV vector (1.4 µg), the packaging vector psPAX2 or its mutants (1 µg) and the VSV-G or Sdl8 vector (0.4 µg). At 2-3 days post-transfection, the supernatants containing LV were concentrated and purified with simplified 10% sucrose purification as described previously^{49}. The functional titers of the crude and purified lentivirus were determined by counting GFP-expressing HEK293T cells under fluorescent microscopy at 48 h after infection with serial dilutions of lentiviruses. In some cases, flow cytometry analysis was used for LV titration.
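A functional titer from GFP-positive cell counts at a given dilution is commonly expressed in transducing units (TU) per ml. The sketch below uses the conventional relation TU/ml = (cells at infection × fraction GFP-positive × dilution factor) / inoculum volume; this formula, the function name, and the example numbers are assumptions for illustration and are not stated in the text above.

```python
def functional_titer_tu_per_ml(cells_at_infection: float,
                               fraction_gfp_positive: float,
                               dilution_factor: float,
                               inoculum_ml: float) -> float:
    """Estimate functional titer (TU/ml) from one dilution of the virus stock.

    Assumption: each GFP-positive cell reflects one transduction event,
    which is only reasonable when the GFP-positive fraction is low.
    """
    if not 0.0 <= fraction_gfp_positive <= 1.0:
        raise ValueError("fraction_gfp_positive must be between 0 and 1")
    if inoculum_ml <= 0 or dilution_factor <= 0:
        raise ValueError("inoculum volume and dilution factor must be positive")
    return cells_at_infection * fraction_gfp_positive * dilution_factor / inoculum_ml


if __name__ == "__main__":
    # Hypothetical well: 4e4 cells, 10% GFP-positive, 1:1000 dilution, 0.1 ml.
    print(functional_titer_tu_per_ml(4e4, 0.10, 1000, 0.1))
```

In practice, titers from dilutions giving a low GFP-positive fraction are usually preferred, since higher fractions undercount cells that were transduced more than once.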

Western Blot Analysis

SDS-polyacrylamide gels (10-12%) were home-made, or mini-PROTEAN TGX gels (Cat# 4561093, 4561096) were purchased from BioRad. The cell lysates were prepared using a lysis buffer composed of 50 mM Tris-HCl pH 7.0, 150 mM NaCl, 5 mM EDTA and 1% Triton X-100 supplemented with PMSF (100x), Aprotinin and Leupeptin (200x). Lysates of 50 µl were prepared from each well after collecting the supernatant. The lysates were incubated at 4°C for 20-30 min and centrifuged at maximum speed in an Eppendorf centrifuge.

The clear lysates were either denatured immediately for 5 min at 98°C in 1x SDS-PAGE loading dye or stored at -80°C until use. Supernatants were stored at 4°C until they were treated with 1x SDS-PAGE loading dye. Denatured aliquots of 10-20 µl cell lysates or 20-30 µl supernatants were loaded onto SDS-polyacrylamide gels. SDS-PAGE was performed in Tris-Glycine/SDS buffers under denaturing and reducing conditions.

The polyacrylamide gels were transferred to 0.2-µm nitrocellulose membranes (BioRad supported nitrocellulose (NC) membrane, Cat # 162-0097) using either wet transfer or an iBlot®2 device with iBlot®2 NC mini (IB23002) or regular Stacks (IB23001). For wet transfer, the following 1x transfer buffer was used: 25 mM Tris-HCl pH 7.6, 192 mM glycine, 20% Methanol. The gels were sandwiched together with NC membranes and transfers were performed in 1x transfer buffer at 250 mA at 4°C for 1-2 h.

Dry western blot transfers were performed in an iBlot®2 gel transfer device (Invitrogen, Thermo-Fisher, Ref# IB21001) using mini or regular iBlot®2 stacks for 7 min according to the manufacturer’s guidelines. After the transfer, the membranes were blocked in 1x TBST buffer containing 5% milk. The membranes were then treated with primary antibodies overnight at 4°C or for 2 h at RT. The membranes were washed three times with 1x TBST buffer, followed by incubation with secondary antibodies. The secondary antibodies with infrared tag were diluted 1/10000-120000 and incubated with the NC membranes for 45 min to 1 h. At the end of incubation, the membranes were washed with 1x TBST buffer three times, 5 min each, and scanned on a Li-COR Odyssey image analyzer. The images were analyzed by densitometric measurement with NIH ImageJ (version 1.53). The data were expressed as integrated density times area and presented as relative fold in comparison with the corresponding control.

Antibody Detection with Enzyme-Linked Immunosorbent Assay (ELISA)

HEK293T cells were cotransfected with the Q-tagged HQ (TP1574) and LQ (TP1571) vectors at 50 ng/well of a 96-well plate in quadruplicate, with or without a normalization vector, either pGL4.16-CMV (TP329), which was derived from the promoterless vector pGL4.16 (Promega, Cat#E6711), or pRRL-E-Flag-LG (TP1578), at 20 ng/well. The original antibody plasmids pFUSEss-CHIg-hG1-SARS-CoV-2-mAb (TP1565) and pFUSE2ss-CLIg-hk-SARS-CoV-2-mAb (TP1566) were used as the control. ELISA was performed using a Human IgG (Total) Uncoated ELISA Kit (Invitrogen, Thermo-Fisher, Cat# 88-50550-88). A 96-well Costar ELISA plate (Corning) was first coated with SARS-CoV-2 Spike (S) protein from BEI (Cat # NR52724) at 100 pg/well overnight at 4°C. The washing and blocking steps were performed using the buffers and solutions provided in the kit. Supernatants containing secreted antibodies were collected from the transfections at 24 and 48 h and kept at 4°C until use. Aliquots of 0.5, 2.5 and 5.0 µl antibody supernatants were added to the SARS-CoV-2 S-coated wells. After overnight incubation, the wells were washed 4 times (400 µl per well). The horseradish peroxidase (HRP)-conjugated anti-human IgG detection monoclonal antibody in assay buffer (1/250) was added to each well and incubated at room temperature for 2-3 h. The wells were then washed 3 times (400 µl each) and treated with 300 µL of the substrate TMB (3,3’,5,5’-tetramethylbenzidine) for 15 min to develop a blue color, and the reactions were terminated with 2 N HCl. The resulting yellow color was measured at 450 nm using a BioTek microplate reader. The level of anti-SARS-CoV monoclonal antibody was quantified by a sigmoidal four-parameter logistic (4PL) curve fit using Prism GraphPad 9.1.
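For readers unfamiliar with 4PL quantification, the sketch below shows one common parameterization of the four-parameter logistic curve and its inverse, which converts a measured absorbance back to a concentration once the curve parameters are known. The parameter values are hypothetical and the curve fitting itself (done in Prism per the text above) is not reproduced here.

```python
def four_pl(x: float, bottom: float, top: float, ec50: float, hill: float) -> float:
    """Four-parameter logistic response at concentration x (x > 0)."""
    return bottom + (top - bottom) / (1.0 + (ec50 / x) ** hill)


def inverse_four_pl(y: float, bottom: float, top: float, ec50: float, hill: float) -> float:
    """Return the concentration giving response y on the fitted curve.

    y must lie strictly between the bottom and top asymptotes.
    """
    if not min(bottom, top) < y < max(bottom, top):
        raise ValueError("response is outside the range of the curve")
    ratio = (top - bottom) / (y - bottom) - 1.0  # equals (ec50 / x) ** hill
    return ec50 / ratio ** (1.0 / hill)


if __name__ == "__main__":
    # Hypothetical standard-curve parameters (A450 vs. antibody concentration).
    params = dict(bottom=0.05, top=3.0, ec50=10.0, hill=1.2)
    a450 = four_pl(2.5, **params)
    print(a450, inverse_four_pl(a450, **params))
```

Readings near either asymptote are poorly determined by the inverse, which is one reason several supernatant volumes are typically assayed.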

ER-Golgi transport inhibition with brefeldin A

Brefeldin A (BA, AdipoGen Life Sciences, Cat # AG-CN2-0018) was dissolved in DMSO to make a 1 mg/mL working solution. HEK293T cells were transfected with the indicated vectors using TP5 transfection reagent in DMEM plus 10% FBS as described above. The transfected cells were incubated overnight, and 10 µg/ml BA was added prior to media change and incubated for 3 h at 37°C in 5% CO2. The culture media was replaced with 293 FreeStyle serum-free media (Gibco, Thermo-Fisher, Cat# 12-338-018) containing 10 µg/ml BA and incubation was continued for 24 h at 37°C in 5% CO2. The supernatants were sampled right after media replacement and collected again after 24 h. The cell lysates were also prepared at the 24 h time point. The supernatants and cell lysates were tested for gdLuc activity and by Western blot analysis.

Quantification and statistical analysis

Quantification of fold changes in Q groups compared with the corresponding non-Q groups was performed using Microsoft Excel. Statistical analysis was performed using Prism GraphPad 9.1. Significance at *P < 0.05, **P < 0.01 and ***P < 0.001 was determined using a two-tailed Student’s t-test between two groups or by one-way ANOVA for multiple comparisons. Data were presented as mean ± SE. The size and type of individual samples are specified in the figure legends.
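The summary statistics named above can be illustrated with a short sketch. It computes mean ± SE, the fold change of a Q group over its non-Q control, and the pooled-variance Student's t statistic; the function names and readings are hypothetical, and the P value (which requires the t distribution, available in statistical packages) is deliberately not computed here.

```python
import math
from statistics import mean, stdev, variance


def mean_se(values):
    """Return (mean, standard error of the mean) for replicate readings."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def fold_change(q_group, control_group):
    """Fold change of the Q group mean over the non-Q control mean."""
    return mean(q_group) / mean(control_group)


def student_t(a, b):
    """Pooled-variance (Student's) t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t_stat = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t_stat, df


if __name__ == "__main__":
    q = [2.0, 4.0, 6.0]        # hypothetical luciferase readings, Q group
    control = [1.0, 2.0, 3.0]  # hypothetical readings, non-Q group
    print(mean_se(q), fold_change(q, control), student_t(q, control))
```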

Results

Discovery of a novel heptapeptide Qa in boosting protein expression/production.

To study SARS-CoV-2 viral protein expression in mammalian cells, a dual reporter system was generated to measure viral protein expression quantitatively and dynamically. Gaussia-Dura luciferase (gdLuc) and destabilized green fluorescent protein (dsGFP), abbreviated LG, were fused onto the C-terminus of SARS-CoV-2 E protein (FIGS. 8A and 14A-14C). This design allows dual measures: the secretory gdLuc-fused target protein in culture media by sensitive gdLuc assay, and dsGFP positivity and intensity by fluorescence microscopy and flow cytometry. During the cloning of the E protein-expressing vector, the correct clones were initially screened by restriction enzyme digestion, and the positive clones E1 and E7 were tested for protein expression by gdLuc assay (FIGS. 8B-8C) and fluorescence microscopy (FIGS. 8D and 14A). Surprisingly, E7 exhibited >20-fold higher luciferase activity than E1. The E7 DNA sequence was examined by Sanger sequencing. Unexpectedly, it was discovered that E7 had an additional 21-nucleotide sequence that encodes 7 amino acids (aa) in frame between the Flag tag and LG. This heptapeptide was designated as Qa based on its aa sequence, and its linked LG was named QLG. It was confirmed that transfection of pcDNA6B-E-QLG (E7) exhibited up to 90-fold higher expression than pcDNA6B-E-LG (E1) in HEK293T cells (FIGS. 8B-8D). The effect of this Qa addition on the expression of other SARS-CoV-2 proteins was examined, including the structural proteins S, nucleocapsid (N), and membrane (M), and the accessory proteins NSP2, NSP16 and ORF3. It was found that Qa boosts the production of all the tested viral proteins (FIGS. 8E, 8F, 9A, 14A-14C, and 15A-15B), with efficiency ranging from 3- to 3848-fold, depending upon the respective protein. Such variation of Qa boosting efficiency may result from differences in cellular density/function, transfection efficiency, reporter dosage, and viral protein types.

Novel and Unique 21-mer Oligonucleotide Cis-Regulatory Motif Contributing to Qa Boosting.

Given that the Qa insertion needs an open reading frame (ORF) with the targeted genes for protein expression and functional detection, it was initially speculated that the in-frame heptapeptide Qa plays a critical role in boosting protein production. Thus, alanine scanning and deletion mutation assays were performed (FIG. 8G) to determine the role of amino acid residues in regulating Qa function at the peptide level. All the tested mutations impaired the boosting activity to various extents, from >57% loss of boosting activity to almost complete loss in the 4A mutation, indicating that each residue of this unique Qa heptapeptide appears to be important for the boosting activity and that residue 4 is the most critical one. To explore the contribution of the underlying oligonucleotides at the RNA level, synonymous (silent, degenerate) mutations were created that change only nucleotides but not the amino acids. Unexpectedly, it was found that all the degenerate mutants tested showed significant loss (>90%) of Qa boosting activity (FIG. 8H), indicating that Qa boosting activity derived predominantly from the action of the 21-mer oligonucleotide motif instead of the unique heptapeptide. Then nonsynonymous (missense) mutation assays were performed while retaining the ORF required for reporter expression. All the tested mutants lost the boosting activity to various degrees as compared to the parent Qa group (FIG. 8I). These data provide evidence that the sequence (composition) and nucleotide number of this 21-mer motif are critical for Qa boosting activity. The name Exen21 was assigned to this unique and novel expression-enhancing 21-mer cis-regulatory motif, which encodes an epitope tag (Qa).
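The distinction between synonymous and nonsynonymous mutations of an in-frame 21-mer can be illustrated with the standard genetic code. The sketch below is purely illustrative: the 21-nucleotide sequence is a hypothetical placeholder and is not the Exen21 sequence, and the function names are likewise assumptions.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical placeholder 21-mer (NOT the Exen21 sequence).
PLACEHOLDER_21MER = "GCTGAAAAACTGGGTCGTTCT"


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def classify_mutation(parent: str, mutant: str) -> str:
    """Classify a same-length substitution mutant relative to its parent."""
    if len(parent) != len(mutant):
        raise ValueError("only same-length substitutions are handled")
    if parent == mutant:
        return "identical"
    parent_pep, mutant_pep = translate(parent), translate(mutant)
    if parent_pep == mutant_pep:
        return "synonymous"   # nucleotides changed, peptide unchanged
    if "*" in mutant_pep and "*" not in parent_pep:
        return "nonsense"     # a stop codon was introduced
    return "missense"         # peptide changed, reading frame retained


if __name__ == "__main__":
    print(translate(PLACEHOLDER_21MER))
    print(classify_mutation(PLACEHOLDER_21MER, "GCTGAAAAACTCGGTCGTTCT"))
    print(classify_mutation(PLACEHOLDER_21MER, "GCTGCAAAACTGGGTCGTTCT"))
```

A check of this kind separates mutants that test the nucleotide motif alone (synonymous) from those that also alter the encoded heptapeptide (missense).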

Broad capability of Exen21/Qa addition to boost protein expression/production.

To expand the potential application of Exen21/Qa in boosting protein expression and production, similar assays were performed in different types of proteins, mammalian cells, and species. Similar boosting effects for many non-viral proteins were observed (FIGS. 9B-9E). Interestingly, transfection with a lower amount of plasmid DNA in HEK293T cells yielded higher boosting efficiency for most SARS-CoV-2 viral proteins (FIGS. 9A, 15A and 15B), but not for host cellular gene products such as mouse NIBP4 (FIG. 9B) and human ACE2 (hACE2) (FIG. 9C), or cytokines such as IFNγ (FIG. 9D) and IL-2 (FIG. 9E). Exen21/Qa induced stronger boosting of SARS-CoV-2 E protein in the presence of the stronger CAG promoter (FIG. 9F). It was further found that similar boosting of protein expression and production occurred in other cell types including HeLa, BHK, and others (FIG. 9G). In addition to being functional in regular plasmids, Exen21/Qa also exhibited boosting activity in viral transfer vectors such as lentiviral (LV) vectors (FIGS. 9D and 9E).

In summary, the Exen21/Qa addition has a broad capability of boosting protein expression/production across various gene products, vectors, mammalian cell types, and species.

Exen21/Qa enhancement of antibody production.

Monoclonal antibody (mAb)-based therapeutics require the optimization of antibody production in suitable cell culture platforms, which relies on high-performance expression vectors. To achieve this, genetic elements in mAb production vectors have been widely modified. To determine if Exen21/Qa addition plays a role in boosting antibody production, a human anti-SARS-CoV mAb (BEI, CR3022), which contains the constant regions of heavy and light chains (GenBank: DQ168569 and DQ168570, respectively), was used as a test platform. Exen21 was inserted at the C-termini of the immunoglobulin heavy and light chains (H/L) of CR3022 to generate Qa-tagged HQ and LQ (FIG. 10A). HQ and LQ were cotransfected into HEK293T cells to generate Qa-tagged mAb, using the original H and L vectors (NR52399 and NR52400) as the controls. mAb-containing supernatants were collected 2-3 days after transfection and their mAb levels were measured by ELISA using SARS-CoV-2 S protein as the coating antigen (FIGS. 10B and 10C). It was found that Exen21/Qa boosted mAb production by up to 37-fold, with or without normalization of transfection efficiency (FIG. 10D). A boosting efficiency of at least 13-fold on average was obtained from 16 independent experiments, even with varied experimental conditions (cell density, transfection efficiency, and ELISA variations) (FIG. 10E). It was further confirmed by Western blot analyses of cell culture supernatants that Exen21/Qa boosted mAb production (FIG. 10F). The data indicate that Exen21/Qa addition robustly boosts mAb production/secretion.

Exen21/Qa enhancement of SARS-CoV-2 S pseudovirion production.

Pseudotyped virus has been widely used in studies not only for gene delivery, but also for vaccine production, antibody neutralization, cellular entry, and pathogenic exploration. Pseudovirion is an excellent alternative to high-risk viruses such as SARS-CoV-2 and its variants and does not require BSL3 facilities. Pseudovirions are virus-like particles (VLPs) coated with viral surface or membrane proteins that harbor specific cellular tropisms. VLPs pseudotyped with SARS-CoV-2 S protein evoke stronger immune responses than any individual viral protein due to their 3-dimensional structures, which are like those of live virus^{8}. SARS-CoV-2 S protein has been widely used to generate S pseudovirion, but the packaging efficiency for lentivirus-like (LVLP) or vesicular stomatitis virus-like (VSVLP) particles has been low in most reports, even with the codon-optimized C-terminal deletion S protein^{5, 6, 12}. Given that Exen21/Qa addition boosts S protein production in mammalian cells, it was speculated that it might boost the packaging efficiency of S pseudotyped LVLP (S-LVLP). By applying the widely used C-terminal 18 aa-deleted codon-optimized SARS-CoV-2 S protein (Sdl8) as a test platform (FIG. 11A), it was validated by Western blot analysis that Exen21/Qa addition at the C-terminus of Sdl8 (Sdl8Q) boosted Sdl8 expression (FIG. 11B). It was also found that Exen21/Qa addition increased S-LVLP packaging efficiency by ~2-4-fold in HEK-hACE2 cells (FIG. 11C). To provide dynamic measurement of S-pseudovirion transduction, the packaging efficiency of the dual-reporter LV vector pRRL-E-QLG, which harbors inserts larger than the GFP insert alone, was tested. As expected, the original Sdl8 with transfer vector pRRL-E-QLG showed significantly lower packaging efficiency than Sdl8 with the Exen21/Qa addition (FIGS. 11D and 11E). These data demonstrate that Exen21/Qa addition in the Sdl8 expression system significantly boosts the packaging and transduction efficiencies of SARS-CoV-2 S-LVLP.

Exen21/Qa enhancement of lentivirus production.

Viral gene therapy has been extensively studied and actively applied to clinical diseases. AAV and LV are the most promising strategies for viral gene therapy, but viral packaging efficiency (production yield) has been a bottleneck. In genome editing by CRISPR/Cas, viral packaging efficiency is also a rate-limiting factor for the development of novel therapeutics. Generally, the level of mRNA supplied by the LV transfer vector can affect LV packaging efficiency. It was hypothesized that Exen21 addition in the LV transfer vector can elevate the transgene mRNA levels during packaging and thereby boost the efficiency of LV packaging and gene delivery. This idea was tested by comparing the LV transfer vectors pRRL-E-LG and pRRL-E-QLG in standard LV packaging (psPAX2 and VSV-G). After LV infection of HEK293T cells, Exen21 increased production of the transgene reporter gdLuc from the transfer vector (FIG. 11F), similar to its boosting efficiency in transfected cells without LV packaging (FIGS. 9D, 9E). However, Exen21 addition in the transfer vector only marginally affected packaging efficiency (i.e., the titer of packaged LV; data not shown). Similar changes were observed with LV-spCas9-Q-RFP and LV-MS2-spCas9-Q-GFP (FIGS. 11G and 11H), for which packaging efficiency is usually < 1% that of standard LV-RFP or LV-GFP. These data provide evidence that the Exen21-induced marginal change in mRNA level of the transfer gene in the transfer LV vector does not increase packaging efficiency, although Exen21 addition does enhance production of the transgene protein in the transduced cells (FIG. 11F). This is consistent with the finding that Exen21 addition augments translation, rather than transcription (FIGS. 12A-12G). It was also tested whether Exen21/Qa addition to the LV packaging components such as Gag, Pol, and RRE, via the packaging vector psPAX2, could boost packaging efficiency. Interestingly, Exen21 addition to Gag significantly impaired, rather than augmented, LV packaging, whereas addition to Pol and RRE significantly boosted LV packaging of pRRL-GFP (FIG. 11I). These data provide evidence that proper insertion of Exen21/Qa in the LV packaging vectors could boost the packaging and transduction efficiency.

Exen21/Qa enhancement of vaccine production via increasing mRNA stability and translational efficiency.

Another immediate application of Exen21/Qa addition may be in the elevation of vaccine yields for the urgent fight against COVID-19 pandemic. Currently, the most promising vaccines against SARS-CoV-2 and its variants are derived from mRNA or DNA encoding S proteinl3. As shown in the above results, the Exen21/Qa addition increased S protein expression by -3-24 fold in a CMV-driven cDNA expression vector (FIG. 9A). If such an enhancement of vaccine production is applied in large scale, it would reduce costs and expedite the availability of COVID-19 vaccines. Since mRNA vaccine exhibits numerous advantages over other vaccines and the application of SARS-CoV-2 S protein mRNA-based vaccines are now well-established in humans, it was hypothesized that the Exen21/Qa addition could also boost mRNA-dependent translation of SARS-CoV-2 proteins such as S protein for increasing vaccination efficiency. To test this idea, a capped mRNA was generated with the Exen21 insertion by in vitro transcription (promoter independent) and examined if the Exen21/Qa after mRNA transfection in HEK293T cells (FIGS. 16A-16E). The data showed that the presence of Exen21/Qa significantly increased the production of SARS-CoV-2 protein S from the transfected functional mRNAs in a time- and dose-dependent manner (FIG. 12 A). It was found that such protein production-boosting motif can be universally applicable to mRNAs of other SARS-CoV-2 proteins including N, E, and ORF3 and the host cellular gene hACE2 (FIGS. 12B, 12C, and 16A-16E). These data provide evidence that the Exen21/Qa addition could act in a transcription-independent manner (promoterless) by facilitating mRNA stability and/or translational efficiency. To further determine if the Exen21/Qa addition regulates mRNA-dependent translation, the dynamic changes of translational products were measured after inhibiting transcription with actinomycin D. 
In the absence of Exen21/Qa addition, actinomycin D completely blocked the production of viral protein E (FIG. 12D) and ORF3 (FIG. 12E), measured by gdLuc activity. In contrast, the Exen21/Qa addition showed a time-dependent increase of the protein expression and production/accumulation even with the transcriptional inhibition (FIGS. 12D and 12E), providing evidence that the Exen21/Qa addition in the targeted genes facilitates protein expression and production via posttranscriptional regulation (increased translation efficiency and/or mRNA stability). To further determine if the Exen21 addition influences mRNA stability of the targeted genes, an mRNA decay assay was used for E and S viral proteins. Although E and S viral mRNAs exhibited different patterns of changes during the time course, the Exen21/Qa addition on both viral E (FIG. 12F) and S (FIG. 12G) proteins increased the half-life of the encoding mRNAs by ~6-7 hours.

Taken together, the data indicate that the Exen21 addition in a given target mRNA significantly increases mRNA stability and translational efficiency, thereby boosting protein expression and production of the targeted mRNA (e.g., S protein mRNA vaccine).
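As an illustration of how a half-life can be read out of an mRNA decay time course such as the one described above, the following minimal sketch fits a first-order decay model, ln(fraction remaining) = intercept - k·t, by least squares and reports ln(2)/k. The function name, the time points, and the two data series are hypothetical and synthetic; they are not values from FIGS. 12F or 12G, and the disclosure does not state which fitting procedure was used.

```python
import math


def estimate_half_life(times_h, remaining_fraction):
    """Estimate half-life (hours) from a decay time course.

    Fits ln(fraction) against time by ordinary least squares and
    converts the decay constant k (negative slope) to ln(2) / k.
    """
    logs = [math.log(f) for f in remaining_fraction]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    k = -sxy / sxx  # first-order decay constant, per hour
    if k <= 0:
        raise ValueError("no decay detected; half-life is undefined")
    return math.log(2) / k


# Synthetic, illustrative data only (assumed half-lives of 4 h and 10 h).
times_h = [0, 2, 4, 8, 12]
control = [0.5 ** (t / 4.0) for t in times_h]   # untagged mRNA (hypothetical)
tagged = [0.5 ** (t / 10.0) for t in times_h]   # Exen21-tagged mRNA (hypothetical)

gain_h = estimate_half_life(times_h, tagged) - estimate_half_life(times_h, control)
print(f"half-life gain: {gain_h:.1f} h")
```

With real measurements the points scatter around the fitted line, so the estimate would normally be reported together with replicate variation.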

Exen21/Qa boosting of targeted protein secretion.

As was found above, Exen21/Qa addition elevated expression of various types of targeted proteins. Aiming to test if Exen21/Qa addition boosted E protein dual reporter protein expression within cells (by Western blot analyses on cell lysates), it was unexpectedly found that E-QL protein levels in the lysates were remarkably reduced rather than increased in the Exen21/Qa group, as detected by Western blot analysis with anti-Flag antibody (FIG. 13A), even though Exen21/Qa addition robustly increased gdLuc activity in culture supernatants (FIG. 8C). Similar reductions by Exen21/Qa addition were found in corresponding intracellular levels of other viral proteins (S and N), and the host cellular proteins (IFNγ, IL-2, and hACE2) (FIGS. 13B and 13C).

Based on these unexpected observations, it was hypothesized that the robust Exen21-induced increases in supernatant gdLuc activity must involve the protein secretion process. This idea is supported by the Exen21-induced boosting that was observed in antibody secretion (FIGS. 10A-10F) and secretory IFNγ and IL-2 (FIGS. 9E and 9F) experiments. To corroborate this secretion-boosting activity, the protein levels of secretory E-Flag-gdLuc were analyzed in the cell culture supernatants using serum-free media. It was found that the cleaved E-Flag-gdLuc and GFP as well as the non-cleaved E-Flag-gdLuc-GFP were detectable by Western blot analyses using anti-gdLuc and anti-GFP antibodies in the unconcentrated supernatants (40 µl from 100 µl) of the Qa-tagged E-QLG group (FIGS. 13D and 17A-17E). Densitometric quantification analysis revealed a 17-fold increase in the level of secretory protein (FIG. 13D), consistent with the boosting also seen in the gdLuc assay (FIG. 13E). The protein secretion was blocked by treatment with the endoplasmic reticulum (ER)-Golgi protein trafficking inhibitor brefeldin A (FIGS. 13F and 18A-18D). To further confirm the secretion-enhancing feature of the Exen21/Qa addition, we used the non-secretory firefly-luciferase (fLuc) assay. Cellular levels of the fLuc protein expression and enzyme activity were significantly increased in cell lysates, but no fLuc activity was detectable in the supernatants, even in the presence of Exen21/Qa addition (FIG. 13G), which is consistent with the non-secretory protein spCas9 (FIG. 11G). Thus, the Exen21/Qa addition appears to boost expression of the targeted proteins and facilitate their secretion. It was noted that auto-cleavage by the 2A system of most of the targeted proteins was incomplete, varying among different proteins (FIGS. 11C, 11D and 11F), which has been reported by others^{14, 15}.

Discussion

In this study, the discovery of a novel and unique Exen21/Qa cis-regulatory motif was reported that has versatile capabilities of boosting the expression and secretion of targeted proteins. This cis-regulatory Exen21 resembles the secretion-enhancing cis-regulatory targeting element (SECReTE) that was recently identified by computational analysis to facilitate ER-localized mRNA translation and protein secretion¹⁶. This SECReTE motif is enriched in nearly all mRNAs encoding secreted/membrane proteins in eukaryotes and its addition results in enhanced protein secretion¹⁶. It also boosts protein expression and secretion when added to an mRNA for an exogenously expressed protein such as GFP¹⁶. However, Exen21 has many features different from SECReTE: (1) No triplet repeats such as NNY or NYN; (2) Unique and exclusive composition/order of the 21 nucleotides; (3) Smaller size (21-mer) than SECReTE (> 30-mer from >10 triplet repeats); and (4) Absence in any cellular or viral genes. In addition, Exen21/Qa is also quite different from the activity-enhancing motifs that involve promoter enhancer¹⁷⁻¹⁹ or anti-sense activity²⁰. The data herein indicated that adding the Exen21 motif to a given mRNA could remarkably enhance the corresponding protein expression and secretion. This was also demonstrated in different types of proteins including viral, nonviral, intracellular, structural, and secretory proteins. The extent of such enhancement varied, with proteins such as N and ORF3 exhibiting up to thousands-fold increases. It is believed that these findings may be translatable to a paradigm shift in applied protein production in research and industry. The range, extent, and mechanisms of these Exen21/Qa actions were explored using a variety of tools, approaches, and target proteins. 
The Exen21/Qa addition robustly augmented production of a secretory gdLuc fusion protein derivative of multiple SARS-CoV-2 structural proteins (S, M, N, and E), the accessory proteins (NSP2, NSP16, and ORF3), and the host cellular gene products (FIGS. 8A-8I and 9A-9G). The protein production-enhancing actions of Exen21/Qa were largely independent of the specific promoter used, among those tested, but it did elicit stronger enhancement of protein production in combination with the stronger CAG promoter (FIGS. 9A-9G). The Exen21/Qa addition enhanced mRNA-dependent production of targeted viral and non-viral protein fusion reporters, determined by in vitro RNA transcription and mRNA transfection, followed by dual reporter assays (FIGS. 12A-12G). Exen21/Qa enhanced the yield of S-containing pseudoviruses and lentivirus packaging (FIG. 4). Exen21/Qa addition increased the release of secreted host proteins, including a robust enhancement of antibody production when Exen21/Qa was placed in antibody heavy and light chains, and augmented the secretion of IFNγ and IL-2. Exen21/Qa actions were blocked by the Golgi-trafficking inhibitor brefeldin A. These findings point not only to a wide range of activities elicited by Exen21 addition, but also to potentially important and diverse applications in biotechnology areas such as production of vaccines, monoclonal antibodies, and other biopharmaceuticals where mammalian cell expression systems are needed.
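The size contrast with SECReTE drawn above (more than ten triplet repeats of the NNY or NYN type) can be checked on a sequence with a short script. The sketch below is only an illustration: the function name and the scanning rule (longest run of pyrimidines spaced three nucleotides apart, in any frame) are assumptions made here, not the published SECReTE search procedure, and the Exen21 sequence is taken from SEQ ID NO: 7 as recited in the claims.

```python
PYRIMIDINES = {"C", "T", "U"}


def longest_periodic_pyrimidine_run(seq, period=3):
    """Length of the longest run of pyrimidines spaced `period` nt apart.

    A run of length r means r consecutive triplets that all carry a
    pyrimidine (Y) at the same triplet position, i.e. an NNY-, NYN- or
    YNN-type repeat depending on the reading frame chosen.
    """
    seq = seq.upper()
    best = 0
    for offset in range(period):
        run = 0
        for i in range(offset, len(seq), period):
            if seq[i] in PYRIMIDINES:
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best


EXEN21 = "CAACCGCGGTTCGCGGCCGCT"   # SEQ ID NO: 7, as recited in the claims
SYNTHETIC_REPEAT = "GAT" * 12      # invented positive control: 12 NNY triplets

exen21_run = longest_periodic_pyrimidine_run(EXEN21)
control_run = longest_periodic_pyrimidine_run(SYNTHETIC_REPEAT)
print(exen21_run, control_run)
```

Under this simplified rule the 21-mer cannot reach the ten-repeat threshold at all, since it spans only seven triplets; the scan merely makes the comparison explicit.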

It was found that the Exen21/Qa addition robustly boosted the regulated secretion of secretory proteins such as S protein, antibody, IFNγ and IL-2, but not via any signal peptide-like intracellular targeting mechanism, because it did not induce release of non-secretory proteins such as firefly-luciferase and spCas9. This property could potentially prove invaluable for industrial application of such secreted proteins. For example, the Exen21/Qa addition could presumably enhance the production/secretion of S protein in mRNA-based vaccines against SARS-CoV-2 variants, thereby reducing the amount of mRNA needed per vaccination due to the higher levels of S protein released¹³ while still providing the same host immune responses.

The ability to boost production yields of viruses or pseudotyped viruses can be invaluable to the fields of gene therapy and biomedical research. The use of pseudotyped viruses has facilitated the research on high-risk viruses that require BSL3 facilities. Pseudoviruses of SARS-CoV-2 S protein and its variants have been used extensively in the evaluation of neutralizing antibodies and vaccination, as well as in mechanistic and functional studies^{5, 6, 12, 21, 22}. The bottleneck for generation of S pseudovirions has been the limited packaging efficiencies for LVLP or VSVLP^{5, 6, 8-12}. The new approach herein to add Exen21/Qa in the Sd18 expression system boosted packaging and transduction efficiencies of SARS-CoV-2 S-LVLP. This strategy has facilitated the ongoing research on the antiviral effect of EGCG and the protective efficiency of serum from vaccinated patients against the emerging SARS-CoV-2 variants^{23, 24}. A challenge in viral gene therapy is the limited efficiency of viral packaging. Using an LV system as the test platform, it was found that the Exen21/Qa addition to the LV transfer vector affected packaging efficiency only marginally, but it boosted the production of transgene protein in the transduced cells or the transfected packaging cells. This was expected, because it was found that Exen21/Qa influences posttranscriptional regulation rather than transcription of targeted genes, whereas LV packaging requires the presence of intermediate RNA from the transfer vector. In the packaging vector psPAX2, the Exen21/Qa addition at the C-termini of Pol and RRE increased LV packaging efficiency, but at the Gag C-terminus it impaired LV packaging. Thus, optimizing Exen21/Qa locations within the LV packaging vector will be helpful in applications to maximize Exen21/Qa boosting efficiency. 
Because Exen21/Qa addition boosted both Sd18 expression and the packaging efficiency of S-LVLP, the Exen21/Qa addition in VSV-G protein may boost regular LV packaging efficiency. The Exen21/Qa addition at different locations of VSV-G^{25, 26} may thus maximize its production-boosting efficiency. Likewise, optimizing Exen21/Qa boosting activity on AAV or other viral packaging systems may prove valuable in biopharmaceutical applications.

Many varieties of epitope tags including Flag, Myc, HA, Ollas, V5, His, C7, and T7 developed earlier enable specific research and biotechnological applications such as protein labeling, tracing, immunoaffinity purification, immunostaining, immunodetection enhancing^{27-34}, protein degradation slowing, and solubility conferring^{35-38}. Other tags modulate activity or function of targeted proteins³⁹, such as N- or C-terminal tagging of PIK3CA, which increases its kinase and membrane binding activity, respectively⁴⁰. Until now, however, no epitope tag had been discovered that can stimulate protein expression and secretion. A series of mutation analyses, including alanine scanning, deletion, and synonymous and nonsynonymous mutation, showed that the unique 21-mer motif Exen21 with a specific order/number of the nucleotide composition is essential for its boosting activity, which requires in-frame (ORF) fitting into the targeted genes. Thus, the encoded unique heptapeptide Qa can serve as a novel epitope tag that shares features with well-established epitope tags for general applications. Importantly, the Exen21/Qa addition can enhance the intensity of endogenous protein labeling owing to its boosting capacity and thus improve detection sensitivity in applications such as neural network tracing²⁷. A broad and important area yet to be explored is the potential of the Exen21/Qa addition to enhance the expression of targeted, highly specific bioengineered proteins in vivo, such as via novel CRISPR/Cas gene knock-in strategies that could facilitate expression of loss-of-function genes. Such applications would be valuable in treating disorders such as haplo-insufficient mutagenic diseases including Angelman syndrome, Pitt-Hopkins syndrome, and others. In genetic engineering, the Exen21/Qa boost of dominant genes may improve organism phenotypes, such as in agriculture applications. 
Of course, any potential toxicities or off-target effects of such in vivo expression of Qa-tagged proteins are yet unknown and untested. Nevertheless, based on prior findings with the well-tested epitope tags both in vitro and in vivo, we do not anticipate any propensity for toxicity of the very small 7-aa Qa tag.
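The relationship between the 21-mer motif and the heptapeptide tag, and the requirement that the motif stay in frame with the target ORF, can be illustrated with a short sketch. It translates the Exen21 sequence recited in the claims (SEQ ID NO: 7) with the standard genetic code and appends it to a toy ORF. The helper names, the toy ORF, and the choice to place the tag immediately before the stop codon are assumptions made here for illustration only.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}

EXEN21 = "CAACCGCGGTTCGCGGCCGCT"  # SEQ ID NO: 7, as recited in the claims


def translate(dna):
    """Translate a DNA coding sequence whose length is a multiple of 3."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def append_tag_in_frame(orf, tag):
    """Insert `tag` just before the stop codon of `orf`, keeping the frame.

    Placement before the stop codon is an illustrative assumption.
    """
    orf, tag = orf.upper(), tag.upper()
    if len(orf) % 3 != 0 or len(tag) % 3 != 0:
        raise ValueError("ORF and tag must both be whole codons")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must end with a stop codon")
    return orf[:-3] + tag + orf[-3:]


toy_orf = "ATGGCCTAA"  # hypothetical ORF: Met-Ala-stop
tagged_orf = append_tag_in_frame(toy_orf, EXEN21)
print(translate(EXEN21))      # QPRFAAA
print(translate(tagged_orf))  # MAQPRFAAA*
```

Because the motif is exactly seven codons long, inserting it between whole codons leaves the downstream reading frame unchanged; an insertion offset by one or two nucleotides would encode a different peptide.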

The mechanisms via which Exen21/Qa exerts its actions on the enhancement of protein expression/secretion remain mainly to be delineated. However, the initial findings indicated that the presence of Exen21/Qa slowed mRNA decay as the boosting effects persisted during global transcription inhibition by actinomycin D, providing evidence that Exen21/Qa plays a key role in posttranscriptional regulation, which may include increased mRNA stability and perhaps translation efficiency. This Exen21/Qa finding supports the previous proof of concept that the coding sequence harbors numerous regulatory sites that may regulate mRNA location, stability and translation efficiency⁴¹. It would be interesting to determine if the Exen21/Qa cis-regulatory motif has a special secondary RNA structure that can recruit RNA-binding proteins⁴¹, directly regulates mRNA stability of targeted proteins⁴², or binds directly to poly-A or untranslated region (UTR) to exert its stabilizing effects upon mRNA and boosting of translation. Because brefeldin A, an inhibitor of the conventional ER-Golgi secretion pathway, blocked Exen21/Qa-stimulated protein secretion, it was speculated that Exen21/Qa may regulate protein retrograde or anterograde trafficking within the ER-Golgi network^{43-46} and facilitate ER-targeted mRNA translation and protein secretion like SECReTE¹⁶. Other secretion inhibitors might be used to identify additional pathways involved in the Exen21/Qa-modulated protein secretion, particularly the non-conventionally secreted proteins (e.g., that of cytokines such as IL-1)⁴⁷,⁴⁸.

In summary, a novel, small (21-mer) and unique cis-regulatory motif Exen21/Qa was discovered that can greatly enhance the production of a variety of different types of proteins ranging from viral transcripts/proteins, endogenous gene products, vaccines, antibodies to engineered recombinant proteins in mammalian cells. This Exen21/Qa has a universal protein production-boosting capacity that should facilitate versatile applications in biomedical research and biotechnological industry. Library screening related to this master Exen21/Qa is underway for optimizing the motif that would maximize the protein expression/secretion.

References

1. Zhang, J. et al. A systemic and molecular study of subcellular localization of SARS-CoV-2 proteins. Signal Transduct Target Ther 5, 269 (2020).

2. Rezaei, N. et al. Introduction on Coronavirus Disease (COVID-19) Pandemic: The Global Challenge. Adv Exp Med Biol 1318, 1-22 (2021).

3. Kolahchi, Z. et al. COVID-19 and Its Global Economic Impact. Adv Exp Med Biol 1318, 825-837 (2021).

4. Bodnar, B. et al. Emerging role of NIK/IKK2-binding protein (NIBP)/trafficking protein particle complex 9 (TRAPPC9) in nervous system diseases. Transl Res 224, 55-70 (2020).

5. Korber, B. et al. Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus. Cell 182, 812-827 e819 (2020).

6. Muik, A. et al. Neutralization of SARS-CoV-2 lineage B.1.1.7 pseudovirus by BNT162b2 vaccine-elicited human sera. Science 371, 1152-1153 (2021).

7. Nie, J. et al. Establishment and validation of a pseudovirus neutralization assay for SARS-CoV-2. Emerg Microbes Infect 9, 680-686 (2020).

8. Walls, A.C. et al. Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein. Cell 181, 281-292 e286 (2020).

9. Weissman, D. et al. D614G Spike Mutation Increases SARS CoV-2 Susceptibility to Neutralization. Cell Host Microbe 29, 23-31 e24 (2021).

10. Wibmer, C.K. et al. SARS-CoV-2 501Y.V2 escapes neutralization by South African COVID-19 donor plasma. Nat Med (2021).

11. Kuzmina, A. et al. SARS-CoV-2 spike variants exhibit differential infectivity and neutralization resistance to convalescent or post-vaccination sera. Cell Host Microbe (2021).

12. Ou, X. et al. Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV. Nat Commun 11, 1620 (2020).

13. Rijkers, G.T. et al. Antigen Presentation of mRNA-Based and Virus-Vectored SARS-CoV-2 Vaccines. Vaccines (Basel) 9 (2021).

14. Chaung, J. et al. Cleavage efficient 2A peptides for high level monoclonal antibody expression in CHO cells. MAbs 7, 403-412 (2015).

15. Kim, J.H. et al. High cleavage efficiency of a 2A peptide derived from porcine teschovirus-1 in human cell lines, zebrafish and mice. PLoS One 6, e18556 (2011).

16. Cohen-Zontag, O. et al. A secretion-enhancing cis regulatory targeting element (SECReTE) involved in mRNA localization and protein synthesis. PLoS Genet 15, e1008248 (2019).

17. Erceg, J. et al. Subtle changes in motif positioning cause tissue-specific effects on robustness of an enhancer's activity. PLoS Genet 10, e1004060 (2014).

18. Kheradpour, P. et al. Systematic dissection of regulatory motifs in 2000 predicted human enhancers using a massively parallel reporter assay. Genome Res 23, 800-811 (2013).

19. Ma, S., Shah, S., Bohnert, H.J., Snyder, M. & Dinesh-Kumar, S.P. Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9, e1003840 (2013).

20. Matveeva, O.V. et al. Identification of sequence motifs in oligonucleotides whose presence is correlated with antisense activity. Nucleic Acids Res 28, 2862-2865 (2000).

21. Donofrio, G. et al. A Simplified SARS-CoV-2 Pseudovirus Neutralization Assay. Vaccines (Basel) 9 (2021).

22. Wibmer, C.K. et al. SARS-CoV-2 501Y.V2 escapes neutralization by South African COVID-19 donor plasma. Nat Med 21, 622-625 (2021).

23. Liu, J. et al. Correlation of vaccine-elicited antibody levels and neutralizing activities against SARS-CoV-2 and its variants. Clin Transl Med 11, e644 (2021).

24. Liu, J. et al. Epigallocatechin gallate from green tea effectively blocks infection of SARS-CoV-2 and new variants by inhibiting spike binding to ACE2 receptor. Cell Biosci 11, 168 (2021).

25. Schlehuber, L.D. & Rose, J.K. Prediction and identification of a permissive epitope insertion site in the vesicular stomatitis virus glycoprotein. J Virol 78, 5079-5087 (2004).

26. Lorenz, I.C. et al. The stem of vesicular stomatitis virus G can be replaced with the HIV-1 Env membrane-proximal external region without loss of G function or membrane-proximal external region antigenic properties. AIDS Res Hum Retroviruses 30, 1130-1144 (2014).

27. Viswanathan, S. et al. High-performance probes for light and electron microscopy. Nat Methods 12, 568-576 (2015).

28. Pina, A.S., Batalha, I.L., Dias, A. & Roque, A.C.A. Affinity Tags in Protein Purification and Peptide Enrichment: An Overview. Methods Mol Biol 2178, 107-132 (2021).

29. Peighambardoust, S.H., Karami, Z., Pateiro, M. & Lorenzo, J.M. A Review on Health-Promoting, Biological, and Functional Aspects of Bioactive Peptides in Food Applications. Biomolecules 11 (2021).

30. Katayama, S., Corpuz, H.M. & Nakamura, S. Potential of plant-derived peptides for the improvement of memory and cognitive function. Peptides 142, 170571 (2021).

31. Lee, T.H. et al. Novel short peptide tag from a bacterial toxin for versatile applications. J Immunol Methods 479, 112750 (2020).

32. DeCaprio, J. & Kohl, T.O. Tandem Immunoaffinity Purification Using Anti-FLAG and Anti-HA Antibodies. Cold Spring Harb Protoc 2019 (2019).

33. Traenkle, B., Segan, S., Fagbadebo, F.O., Kaiser, P.D. & Rothbauer, U. A novel epitope tagging system to visualize and monitor antigens in live cells with chromobodies. Sci Rep 10, 14267 (2020).

34. Mishra, V. Affinity Tags for Protein Purification. Curr Protein Pept Sci 21, 821-830 (2020).

35. Li, Y. Recombinant production of antimicrobial peptides in Escherichia coli: a review. Protein Expr Purif 80, 260-267 (2011).

36. Bhagawati, M. et al. A mesophilic cysteine-less split intein for protein trans-splicing applications under oxidizing conditions. Proc Natl Acad Sci USA 116, 22164-22172 (2019).

37. Han, X., Ning, W., Ma, X., Wang, X. & Zhou, K. Improving protein solubility and activity by introducing small peptide tags designed with machine learning models. Metab Eng Commun 11, e00138 (2020).

38. Saribas, A.S., White, M.K. & Safak, M. Structure-based release analysis of the JC virus agnoprotein regions: A role for the hydrophilic surface of the major alpha helix domain in release. J Cell Physiol 233, 2343-2359 (2018).

39. Majorek, K.A., Kuhn, M.L., Chruszcz, M., Anderson, W.F. & Minor, W. Double trouble-Buffer selection and His-tag presence may be responsible for nonreproducibility of biomedical experiments. Protein Sci 23, 1359-1368 (2014).

40. Vasan, N. et al. Double PIK3CA mutations in cis increase oncogenicity and sensitivity to PI3Kalpha inhibitors. Science 366, 714-723 (2019).

41. Ding, Y., Lorenz, W.A. & Chuang, J.H. Coding Motif: exact determination of overrepresented nucleotide motifs in coding sequences. BMC Bioinformatics 13, 32 (2012).

42. Boo, S.H. & Kim, Y.K. The emerging role of RNA modifications in the regulation of mRNA stability. Exp Mol Med 52, 400-408 (2020).

43. Kim, J.J., Lipatova, Z. & Segev, N. TRAPP Complexes in Secretion and Autophagy. Front Cell Dev Biol 4, 20 (2016).

44. Pinar, M. et al. TRAPPII regulates exocytic Golgi exit by mediating nucleotide exchange on the Ypt31 ortholog RabERAB11. Proc Natl Acad Sci USA 112, 4346-4351 (2015).

45. Reitz, C. The role of the retromer complex in aging-related neurodegeneration: a molecular and genomic review. Mol Genet Genomics 290, 413-427 (2015).

46. Vardarajan, B.N. et al. Identification of Alzheimer disease-associated variants in genes that regulate retromer function. Neurobiol Aging 33, 2231 e2215-2231 e2230 (2012).

47. Cohen, M.J., Chirico, W.J. & Lipke, P.N. Through the back door: Unconventional protein secretion. Cell Surf 6, 100045 (2020).

48. Ni, D. et al. Canonical Secretomes, Innate Immune Caspase-1-, 4/11-Gasdermin D Non-Canonical Secretomes and Exosomes May Contribute to Maintain Treg-Ness for Treg Immunosuppression, Tissue Repair and Modulate Anti-Tumor Immunity via ROS Pathways. Front Immunol 12, 678201 (2021).

49. Boroujeni, M.E. & Gardaneh, M. The Superiority of Sucrose Cushion Centrifugation to Ultrafiltration and PEGylation in Generating High-Titer Lentivirus Particles and Transducing Stem Cells with Enhanced Efficiency. Mol Biotechnol 60, 185-193 (2018).

OTHER EMBODIMENTS

While the disclosure has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the disclosure, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

The patent and scientific literature referred to herein establishes the knowledge that is available to those with skill in the art. All United States patents and published or unpublished United States patent applications cited herein are incorporated by reference. All published foreign patents and patent applications cited herein are hereby incorporated by reference. All other published references, documents, manuscripts and scientific literature cited herein are hereby incorporated by reference.

Claims

What is claimed:

1. A composition comprising an expression enhancing oligonucleotide having 21 nucleic acid bases and includes a cis-regulatory coding motif that retains in-frame of target gene.

2. The composition of claim 1, wherein the expression enhancing oligonucleotide comprises a nucleic acid sequence comprising CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7).

3. A synthetic oligonucleotide comprising a nucleic acid sequence comprising CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7).

4. The synthetic oligonucleotide of claim 3, wherein the oligonucleotide encodes a peptide comprising an amino acid sequence having at least a 70% sequence identity to QPRFAAA (SEQ ID NO: 1).

5. The synthetic oligonucleotide of any one of claims 3-4 wherein the oligonucleotide encodes a peptide comprising an amino acid sequence having at least a 90% sequence identity to QPRFAAA (SEQ ID NO: 1).

6. The synthetic oligonucleotide of any one of claims 3-5 wherein the oligonucleotide encodes a peptide comprising the amino acid sequence QPRFAAA (SEQ ID NO: 1).

7. A construct comprising the oligonucleotide of any one of claims 1-6.

8. A chimeric molecule comprising one or more peptide domains and one or more 5’- and/or 3’-untranslated region (UTR) sequences or fragments thereof.

9. The chimeric molecule of claim 8, wherein the one or more peptide domains comprise from about five amino acids to about twenty amino acids.

10. The chimeric molecule of claim 9, wherein the one or more peptide domains comprise about seven amino acids.

11. The chimeric molecule of claim 8, wherein the one or more peptide domains comprise an amino acid sequence having at least a 70% sequence identity to QPRFAAA (SEQ ID NO: 1).

12. The chimeric molecule of claim 8, wherein the peptide comprises an amino acid sequence having at least a 90% sequence identity to QPRFAAA (SEQ ID NO: 1).

13. The chimeric molecule of claim 12, wherein the peptide comprises the amino acid sequence QPRFAAA (SEQ ID NO: 1).

14. The chimeric molecule of claim 8, wherein the peptide domain comprises X_n-QPRFAAA-X_n, wherein n is independently 0 or greater than or equal to 1 and each X is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Aspartate (D), Asparagine (N), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof.

15. The chimeric molecule of claim 8, wherein the one or more 5’-untranslated region (UTR) sequences or fragments thereof, are derived from one or more viruses.

16. The chimeric molecule of claim 15, wherein the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, togaviruses, orthomyxoviruses, rhabdoviruses or combinations thereof.

17. The chimeric molecule of claim 16, wherein the 5’-UTR and/or 3’-UTR are from a coronavirus.

18. The chimeric molecule of claim 17, wherein the coronavirus is SARS-CoV-2.

19. The chimeric molecule of claim 18, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 5’-UTR.

20. The chimeric molecule of claim 19, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 5’-UTR.

21. The chimeric molecule of claim 19, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof, comprise a SARS-CoV-2 5’-UTR.

22. The chimeric molecule of claim 19, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 3’-UTR.

23. The chimeric molecule of claim 19, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 3’-UTR.

24. The chimeric molecule of claim 19, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof, comprise a SARS-CoV-2 3’-UTR.

25. The chimeric molecule of claim 8, further comprising one or more biomolecules operably linked to the one or more peptide domains and/or the one or more 5’-UTR and/or 3’-UTR sequences.

26. The chimeric molecule of claim 25, wherein the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

27. The chimeric molecule of claims 8-26, further comprising one or more promoters and/or regulatory sequences operably linked to the UTRs or biomolecule.

28. A host cell comprising an oligonucleotide of any one of claims 1-7 or the chimeric molecule of any one of claims 8-27.

29. A construct encoding the chimeric molecule of claims 8-27.

30. A method of enhancing production of biomolecules, comprising: tagging a desired peptide or a nucleic acid sequence with the chimeric molecule of any one of claims 1-27, by conjugation or cloning, expressing the peptide or nucleic acid sequence, and, harvesting the protein.

31. The method of claim 30, wherein the proteins comprise: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

32. A nucleic acid comprising a promoter, a 5’-untranslated region (5’-UTR) sequence, a biomolecule of interest, an oligonucleotide comprising a cis-regulatory coding motif, a 3’-untranslated region (3’-UTR) sequence and combinations thereof.

33. The nucleic acid of claim 32, wherein the one or more 5’-untranslated region (UTR) and/or 3’-UTR sequences or fragments thereof, are derived from one or more viruses.

34. The nucleic acid of claim 33, wherein the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, togaviruses, orthomyxoviruses, rhabdoviruses or combinations thereof.

35. The nucleic acid of claim 33, wherein the 5’-UTR and/or 3’-UTR are derived from a coronavirus.

36. The nucleic acid of claim 35, wherein the coronavirus is SARS-CoV-2.

37. The nucleic acid of claim 36, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 5’-UTR.

38. The nucleic acid of claim 36, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 5’-UTR.

39. The nucleic acid of claim 36, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof, comprise a SARS-CoV-2 5’-UTR.

40. The nucleic acid of claim 36, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 3’-UTR.

41. The nucleic acid of claim 36, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof, comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 3’-UTR.

42. The nucleic acid of claim 36, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof, comprise a SARS-CoV-2 3’-UTR.

43. A chimeric molecule comprising one or more oligonucleotides comprising a nucleic acid sequence of CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7) and one or more 5’- and/or 3’-untranslated region (UTR) sequences or fragments thereof.

44. The chimeric molecule of claim 43, wherein the one or more oligonucleotides encode a peptide comprising from about five amino acids to about twenty amino acids.

45. The chimeric molecule of claim 44, wherein the one or more peptides comprise about seven amino acids.

46. The chimeric molecule of claim 44, wherein the one or more peptides comprise an amino acid sequence having at least a 70% sequence identity to QPRFAAA (SEQ ID NO: 1).

47. The chimeric molecule of claim 44, wherein the one or more peptides comprise an amino acid sequence having at least a 90% sequence identity to QPRFAAA (SEQ ID NO: 1).

48. The chimeric molecule of claim 44, wherein the one or more peptides comprise the amino acid sequence QPRFAAA (SEQ ID NO: 1).

49. The chimeric molecule of claim 43, wherein the one or more peptides comprises a sequence comprising Xn-QPRFAAA-Xn, wherein n is independently 0 or greater than or equal to 1 and each X is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof.

50. The chimeric molecule of claim 43, wherein the one or more 5’-untranslated region (UTR) sequences or fragments thereof are derived from one or more viruses.

51. The chimeric molecule of claim 50, wherein the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, togaviruses, orthomyxoviruses, rhabdoviruses or combinations thereof.

52. The chimeric molecule of claim 51, wherein the 5’-UTR and/or 3’-UTR are from a coronavirus.

53. The chimeric molecule of claim 52, wherein the coronavirus is SARS-CoV-2.

54. The chimeric molecule of claim 53, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 5’-UTR.

55. The chimeric molecule of claim 53, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 5’-UTR.

56. The chimeric molecule of claim 53, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 5’-UTR.

57. The chimeric molecule of claim 53, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 3’-UTR.

58. The chimeric molecule of claim 53, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 3’-UTR.

59. The chimeric molecule of claim 53, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 3’-UTR.

60. The chimeric molecule of claim 43, further comprising one or more biomolecules operably linked to the one or more oligonucleotides and/or the one or more 5’-UTR and/or 3’-UTR sequences.

61. The chimeric molecule of claim 60, wherein the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

62. The chimeric molecule of any one of claims 43-61, further comprising one or more promoters and/or regulatory sequences operably linked to the UTRs or biomolecule.

63. An expression vector comprising the nucleic acids of any one of claims 43-62.

64. A host cell comprising the nucleic acid of any one of claims 1-63.

65. A chimeric molecule comprising one or more 5’- and/or 3’-untranslated region (UTR) sequences or fragments thereof associated with one or more biomolecules.

66. The chimeric molecule of claim 65, wherein the one or more 5’-untranslated region (UTR) and/or 3’-UTR sequences or fragments thereof are derived from one or more viruses.

67. The chimeric molecule of claim 66, wherein the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, togaviruses, orthomyxoviruses, rhabdoviruses or combinations thereof.

68. The chimeric molecule of claim 67, wherein the 5’-UTR and/or 3’-UTR are from a coronavirus.

69. The chimeric molecule of claim 68, wherein the coronavirus is SARS-CoV-2.

70. The chimeric molecule of claim 69, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 5’-UTR.

71. The chimeric molecule of claim 69, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 5’-UTR.

72. The chimeric molecule of claim 69, wherein the one or more 5’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 5’-UTR.

73. The chimeric molecule of claim 69, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 70% sequence identity to a SARS-CoV-2 3’-UTR.

74. The chimeric molecule of claim 69, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence having at least a 90% sequence identity to a SARS-CoV-2 3’-UTR.

75. The chimeric molecule of claim 69, wherein the one or more 3’-UTR nucleic acid sequences or fragments thereof comprise a SARS-CoV-2 3’-UTR.

76. The chimeric molecule of claim 65, wherein the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

77. The chimeric molecule of any one of claims 65-76, further comprising one or more promoters and/or regulatory sequences operably linked to the UTRs or biomolecule.

78. A synthetic peptide tag comprising an amino acid sequence unit of about five to about fifteen amino acids wherein the N-terminal and/or C-terminal amino acids are linked or fused to a target molecule.

79. The synthetic peptide tag of claim 78, wherein the amino acid sequence unit comprises seven amino acids.

80. The synthetic peptide tag of claim 79, wherein the amino acid sequence comprises at least a 70% sequence identity to QPRFAAA (SEQ ID NO: 1).

81. The synthetic peptide tag of claim 79, wherein the amino acid sequence comprises at least a 90% sequence identity to QPRFAAA (SEQ ID NO: 1).

82. The synthetic peptide tag of claim 79, wherein the amino acid sequence comprises the amino acid sequence QPRFAAA (SEQ ID NO: 1).

83. The synthetic peptide tag of claim 78, wherein the amino acid sequence comprises Xn-QPRFAAA-Xn, wherein n is independently 0 or greater than or equal to 1 and each X is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof.

84. The synthetic peptide tag of claim 78, further comprising a plurality of repeating amino acid sequence units.

85. The synthetic peptide tag of claim 84, wherein the repeating amino acid sequence units are in tandem.

86. The synthetic peptide tag of claim 85, wherein the amino acid sequence units are separated by linker molecules or one or more amino acids.

87. A synthetic peptide comprising the structure: (AA-AA-AA-AA-AA-AAz-AAz)x, wherein x is greater than or equal to 1, z is 0 or 1 and each AA is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof.

88. A synthetic peptide comprising the structure: AA1-AA2-AA3-AA4-AA5-AA6-AA7, wherein each AA is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof.

89. A synthetic peptide comprising an amino acid sequence comprising the structure: Xn-QPRFAAA-Xn, wherein n is independently 0 or greater than or equal to 1 and each X is independently: Alanine (A), Arginine (R), Asparagine (N), Aspartate (D), Cysteine (C), Glutamate (E), Glutamine (Q), Glycine (G), Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V), Selenocysteine, Pyrrolysine, modified amino acids or combinations thereof.

90. A fusion protein comprising a synthetic peptide of any one of claims 78-89, fused to one or more target peptides.

91. The fusion protein of claim 90, wherein two or more synthetic peptides of any one of claims 78-89 are fused to a target peptide.

92. A fusion molecule comprising a synthetic peptide of any one of claims 78-91, fused to one or more biomolecules.

93. The fusion molecule of claim 92, wherein the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

94. A method of enhancing production of proteins comprising: tagging a desired peptide or a nucleic acid sequence with the peptide tag of any one of claims 78-93, by fusion or cloning, expressing the peptide or nucleic acid sequence, and harvesting the protein.

95. The method of claim 94, wherein the proteins comprise: viral transcripts/proteins, vaccines, antibodies, an mRNA, an mRNA vaccine, a DNA vaccine, peptide vaccines, an oligonucleotide, a polynucleotide, a peptide, a polypeptide, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

96. A composition comprising a peptide-tagged biomolecule according to any one of claims 78-93 and a pharmaceutically acceptable excipient, diluent or carrier.

97. A nucleic acid encoding the peptide tag according to any one of claims 78-93.

98. An expression vector comprising the nucleic acid according to claim 97.

99. A host cell comprising the expression vector according to claim 98.
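
The relationship between the 21-mer oligonucleotide of SEQ ID NO: 7 and the seven-residue peptide of SEQ ID NO: 1 recited in the claims above can be checked with a short illustrative sketch. It assumes translation under the standard genetic code with the reading frame starting at the first nucleotide of SEQ ID NO: 7. The helper names (`translate`, `percent_identity`) are hypothetical, and the identity measure shown is a simple ungapped, position-by-position comparison of equal-length sequences, chosen here only for illustration; the claims as reproduced in this section do not state how sequence identity is to be computed.

```python
# Illustrative sketch only; not part of the claimed subject matter.

# Standard genetic code, restricted to the codons present in SEQ ID NO: 7.
CODON_TABLE = {
    "CAA": "Q",  # Glutamine
    "CCG": "P",  # Proline
    "CGG": "R",  # Arginine
    "TTC": "F",  # Phenylalanine
    "GCG": "A",  # Alanine
    "GCC": "A",  # Alanine
    "GCT": "A",  # Alanine
}

SEQ_ID_NO_7 = "CAACCGCGGTTCGCGGCCGCT"  # 21-mer oligonucleotide from the claims
SEQ_ID_NO_1 = "QPRFAAA"                # 7-residue peptide from the claims


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, starting at position 0 (assumed frame)."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(a: str, b: str) -> float:
    """Ungapped, position-wise percent identity of two equal-length sequences
    (an assumed, simplified measure used here only for illustration)."""
    if len(a) != len(b):
        raise ValueError("this simplified measure needs equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    peptide = translate(SEQ_ID_NO_7)
    print(peptide, peptide == SEQ_ID_NO_1)
    # A hypothetical variant differing from SEQ ID NO: 1 at one position.
    print(round(percent_identity(SEQ_ID_NO_1, "QPRFAAG"), 1))
```

Under this simplified measure, a seven-residue sequence that differs from QPRFAAA at one position scores 6/7, or about 85.7%, and one that differs at two positions scores 5/7, or about 71.4%.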