CN117979994A

CN117979994A - Oligonucleotides and viral untranslated regions (UTRs) for increased expression of target genes and proteins

Info

Publication number: CN117979994A
Application number: CN202280061128.1A
Authority: CN
Inventors: W·胡
Original assignee: Temple University of Commonwealth System of Higher Education
Current assignee: Temple University of Commonwealth System of Higher Education
Priority date: 2021-07-08
Filing date: 2022-07-07
Publication date: 2024-05-03

Abstract

The present disclosure provides a novel, small (21-mer) oligonucleotide comprising a unique cis-regulatory coding motif that can greatly enhance the production of many different types of proteins in mammalian cells, ranging from viral transcripts/proteins, endogenous gene products, vaccines, and antibodies to engineered recombinant proteins. Combining a novel peptide tag having a specified short amino acid sequence, or a derivative thereof, with the untranslated region (UTR) of a virus (snUTR) enhances the production of tagged proteins, including viral transcripts/proteins, endogenous gene products, vaccines, antibodies, and engineered recombinant proteins, in cells in vitro, ex vivo, and in vivo.

Description

Oligonucleotides and viral untranslated regions (UTRs) for increased expression of target genes and proteins

Cross Reference to Related Applications

This application claims the benefit of U.S. Provisional Application No. 63/332,378, filed in April 2022, and U.S. Provisional Application Nos. 63/219,596, 63/219,599, and 63/219,587, each filed in July 2021. The contents of these applications are incorporated herein by reference in their entirety.

Statement regarding federally sponsored research

The present disclosure was made with government support under a grant awarded by the National Institutes of Health (project number: 1R01AI145034). The government has certain rights in this disclosure.

Technical Field

The present disclosure relates to novel oligonucleotides, peptide tags having a specified short amino acid sequence or a derivative thereof, and the natural untranslated regions (UTRs) of SARS-CoV-2 (snUTR). Methods of using these novel molecules include enhancing the production of proteins of interest (including viral transcripts/proteins, endogenous gene products, vaccines, antibodies, and engineered recombinant proteins) in cells in vitro, ex vivo, and in vivo.

Background

Various techniques have been developed to enhance protein expression/production, such as promoter optimization, mRNA stabilization, codon optimization, coding modulation, and protein stabilization, as well as modification of the host cell expression machinery, including humanized yeast systems. Although these optimization strategies are widely used in the biopharmaceutical industry and in biomedical research, additional enhancement techniques remain important for reducing costs and increasing production speed. Recently, computational analysis identified a secretion-enhancing cis regulatory targeting element (SECReTE) that facilitates endoplasmic reticulum (ER)-localized mRNA translation and protein secretion (Cohen-Zontag, Baez et al., 2019). The SECReTE motif is enriched in nearly all mRNAs encoding secreted/membrane proteins in eukaryotic cells, and its addition enhances protein secretion (Cohen-Zontag, Baez et al., 2019). When it is added to mRNAs for exogenously expressed proteins such as GFP, protein expression and secretion are likewise enhanced (Cohen-Zontag, Baez et al., 2019). Various types of peptide (epitope) tags, such as Flag, Myc, HA, OLLAS, V, His, C7, and T7, have been shown to play a role in protein labeling, affinity purification, and immunodetection (DeCaprio and Kohl, 2019; Katayama et al., 2021; Lee et al., 2020; Mishra, 2020; Peighambardoust et al., 2021; Pina et al., 2021; Traenkle et al., 2020). However, no peptide tag has been identified that enhances expression/production of a protein of interest in mammalian cells.

The 5'-UTR within the SARS-CoV-2 genome is critical for initiating the production of the complete genome and subgenomic transcripts (Baldassarre et al., 2020; Yang and Leibowitz, 2015). The 3'-UTR also regulates viral genome expression and replication (Chan et al., 2020; Zhao et al., 2020). Recent computational studies have shown that both the 5'-UTR and the 3'-UTR are highly conserved in the SARS-CoV genome and variants thereof (Baldassarre et al., 2020; Bottaro et al., 2021; Rangan et al., 2020; Rouchka et al., 2020; Ryder et al., 2021; Yang and Leibowitz, 2015), and have identified a very stable four-way junction in the 5'-UTR near the AUG initiation codon (Miao et al., 2020).

Disclosure of Invention

Embodiments are directed to novel chimeric molecules comprising oligonucleotides that comprise cis-regulatory coding motifs, peptide tags, 5'-untranslated regions (5'-UTRs), 3'-untranslated regions (3'-UTRs), and combinations thereof, for use in enhancing the production and expression of desired biomolecules. The observed synergistic effects are expected to be broadly applicable and to attract wide research attention. For industrial applications, this strategy will reduce the cost and improve the availability of numerous widely used products such as vaccines, antibodies, recombinant proteins, and therapeutic gene products. A straightforward and highly important use of this system would be to boost mRNA vaccines against COVID-19 variants. For biomedical research, the novel chimeric molecules will stimulate interest in exploring novel oligonucleotides and peptides that modulate protein expression and secretion, as well as in screening the natural UTRs of other viruses for enhanced protein production.

In certain aspects, the compositions comprise expression-enhancing oligonucleotides that have between 15 and 30 nucleobases and comprise a cis-regulatory coding motif located in the coding region such that the open reading frame (ORF) is maintained with the gene of interest. In certain embodiments, the expression-enhancing oligonucleotide comprises twenty-one nucleobases. In certain embodiments, the expression-enhancing oligonucleotide comprises a nucleic acid sequence having at least 75% sequence identity to CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7). In certain embodiments, the expression-enhancing oligonucleotide comprises a nucleic acid sequence having at least 95% sequence identity to CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7). In certain embodiments, the expression-enhancing oligonucleotide comprises a nucleic acid sequence comprising CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7).
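As an illustration of the identity thresholds above, the following Python sketch computes percent identity for two sequences of equal length and the minimum number of matching positions a threshold implies. It assumes a simple ungapped, position-by-position comparison; the disclosure does not specify an alignment method, and alignment-based tools can give different values. The function names and the one-mismatch variant are hypothetical.

```python
# Minimal sketch: ungapped percent identity between equal-length sequences.
# Assumption: identity = matching positions / length; no alignment or gaps.

SEQ_ID_NO_7 = "CAACCGCGGTTCGCGGCCGCT"  # 21-mer quoted in the text


def percent_identity(reference: str, candidate: str) -> float:
    """Return the percentage of positions at which the two sequences agree."""
    if len(reference) != len(candidate):
        raise ValueError("this sketch only compares sequences of equal length")
    matches = sum(r == c for r, c in zip(reference.upper(), candidate.upper()))
    return 100.0 * matches / len(reference)


def min_matches(length: int, threshold_percent: int) -> int:
    """Smallest number of matching positions that reaches the threshold."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (length * threshold_percent + 99) // 100


# Hypothetical variant with a single substitution at the last position.
variant = "CAACCGCGGTTCGCGGCCGCC"

print(round(percent_identity(SEQ_ID_NO_7, variant), 2))
print(min_matches(21, 75), min_matches(21, 95))
```

Under this simplified definition, a 21-mer meets the 75% threshold with up to five mismatches and the 95% threshold with at most one.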

In other aspects, the synthetic oligonucleotide comprises a nucleic acid sequence having at least 75% sequence identity to CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7). In certain embodiments, the synthetic oligonucleotide comprises a nucleic acid sequence having at least 95% sequence identity to CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7). In certain embodiments, the oligonucleotide comprises a nucleic acid sequence comprising CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7). In certain embodiments, the oligonucleotide encodes a peptide comprising an amino acid sequence having at least 70% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the oligonucleotide encodes a peptide comprising an amino acid sequence having at least 90% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the oligonucleotide encodes a peptide comprising amino acid sequence QPRFAAA (SEQ ID NO: 1).
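The relationship between the 21-mer of SEQ ID NO: 7 and the seven-residue peptide of SEQ ID NO: 1 can be checked by translating the oligonucleotide codon by codon. The Python sketch below assumes the standard genetic code and reading frame 1; the helper names are illustrative only.

```python
# Minimal sketch: translate a DNA coding sequence with the standard genetic code.
# The table is built from the conventional T, C, A, G codon ordering.

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


SEQ_ID_NO_7 = "CAACCGCGGTTCGCGGCCGCT"

# Codons: CAA CCG CGG TTC GCG GCC GCT
print(translate(SEQ_ID_NO_7))  # QPRFAAA
```

Because the oligonucleotide is 21 bases long (seven codons), inserting it within a coding region does not shift the downstream reading frame.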

In another aspect, the construct comprises a synthetic oligonucleotide as exemplified herein.

In another aspect, the chimeric nucleic acid comprises one or more peptide domains and one or more 5'- and/or 3'-untranslated region (UTR) sequences or fragments thereof. In certain embodiments, the one or more peptide domains comprise from about five amino acids to about twenty amino acids. In certain embodiments, the one or more peptide domains comprise about seven amino acids. In certain embodiments, the one or more peptide domains comprise an amino acid sequence having at least 70% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the peptide comprises an amino acid sequence having at least 90% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the peptide comprises amino acid sequence QPRFAAA (SEQ ID NO: 1). In certain embodiments, the peptide domain comprises X_n-QPRFAAA-X_n, wherein each n is independently 0 or an integer greater than or equal to 1, and each X is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof. In certain embodiments, the one or more 5'-untranslated region (UTR) sequences or fragments thereof are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, enveloped viruses, orthomyxoviruses, rhabdoviruses, or combinations thereof. In certain embodiments, the 5'-UTR and/or 3'-UTR is from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 3'-UTR. In certain embodiments, the chimeric molecule further comprises one or more biomolecules operably linked to the one or more peptide domains and/or the one or more 5'-UTR and/or 3'-UTR sequences. In certain embodiments, the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens, or biomimetics. In certain embodiments, the chimeric molecule further comprises one or more promoter and/or regulatory sequences operably linked to the UTR or biomolecule.

In another aspect, the host cell comprises an oligonucleotide as exemplified herein or a chimeric molecule as exemplified herein.

In another aspect, the construct encodes an oligonucleotide as exemplified herein or a chimeric molecule as exemplified herein.

In another aspect, a method of enhancing production of a biomolecule comprises: tagging a desired peptide or nucleic acid sequence by fusion or cloning using a chimeric molecule according to any one of claims 1 to 34; expressing the peptide or nucleic acid sequence; and harvesting the protein. In certain embodiments, the protein comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens, or biomimetics.

In another aspect, the nucleic acid comprises a promoter, a 5'-untranslated region (5'-UTR) sequence, a biomolecule of interest, an oligonucleotide comprising a cis-regulatory coding motif, a 3'-untranslated region (3'-UTR) sequence, and combinations thereof. In certain embodiments, the one or more 5'-untranslated region (UTR) and/or 3'-UTR sequences or fragments thereof are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, enveloped viruses, orthomyxoviruses, rhabdoviruses, or combinations thereof. In certain embodiments, the 5'-UTR and/or 3'-UTR is derived from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 3'-UTR.
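The element order listed above (promoter, 5'-UTR, biomolecule of interest, oligonucleotide with the coding motif, 3'-UTR) can be sketched in Python as a simple string assembly. This is only an illustration under stated assumptions: the oligonucleotide is placed in frame immediately before the stop codon of the gene of interest, and the promoter, UTR, and gene sequences are short toy placeholders, not real promoter, SARS-CoV-2 UTR, or gene sequences. All function names are hypothetical.

```python
# Minimal sketch: assemble promoter / 5'-UTR / tagged ORF / 3'-UTR in order.
# Assumption: the 21-mer is inserted just before the stop codon (C-terminal tag).

TAG_DNA = "CAACCGCGGTTCGCGGCCGCT"  # SEQ ID NO: 7 from the text
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_c_terminal_tag(orf: str, tag: str = TAG_DNA) -> str:
    """Insert the tag before the stop codon, keeping the reading frame."""
    orf = orf.upper()
    if len(orf) % 3 != 0 or len(tag) % 3 != 0:
        raise ValueError("ORF and tag lengths must be multiples of three")
    if not orf.startswith("ATG") or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must start with ATG and end with a stop codon")
    return orf[:-3] + tag + orf[-3:]


def assemble_cassette(promoter: str, utr5: str, tagged_orf: str, utr3: str) -> str:
    """Concatenate the elements in 5' to 3' order."""
    return "".join([promoter, utr5, tagged_orf, utr3])


# Toy placeholder sequences (hypothetical, for illustration only).
toy_promoter = "GGGCCC"
toy_utr5 = "ACACAC"
toy_orf = "ATGGCCAAATAA"
toy_utr3 = "TGTGTG"

tagged = add_c_terminal_tag(toy_orf)
cassette = assemble_cassette(toy_promoter, toy_utr5, tagged, toy_utr3)
print(tagged)
print(len(tagged) % 3 == 0)  # the tagged ORF remains a whole number of codons
```

The frame check matters because a tag whose length is not a multiple of three would shift the downstream codons, including the stop codon.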

In another aspect, the chimeric molecule comprises one or more oligonucleotides comprising the nucleic acid sequence of CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7) and one or more 5'- and/or 3'-untranslated region (UTR) sequences or fragments thereof. In certain embodiments, the one or more oligonucleotides encode a peptide comprising from about five amino acids to about twenty amino acids. In certain embodiments, the one or more peptides comprise about seven amino acids. In certain embodiments, the one or more peptides comprise an amino acid sequence having at least 70% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptides comprise an amino acid sequence having at least 90% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptides comprise amino acid sequence QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptides comprise a sequence comprising X_n-QPRFAAA-X_n, wherein each n is independently 0 or an integer greater than or equal to 1, and each X is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof. In certain embodiments, the one or more 5'-untranslated region (UTR) sequences or fragments thereof are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, enveloped viruses, orthomyxoviruses, rhabdoviruses, or combinations thereof. In certain embodiments, the 5'-UTR and/or 3'-UTR is from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 3'-UTR. In certain embodiments, the chimeric molecule further comprises one or more biomolecules operably linked to the one or more oligonucleotides and/or the one or more 5'-UTR and/or 3'-UTR sequences. In certain embodiments, the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens, or biomimetics. In certain embodiments, the chimeric molecule further comprises one or more promoter and/or regulatory sequences operably linked to the UTR or biomolecule.

In another aspect, the expression vector comprises a nucleic acid as exemplified herein.

In another aspect, the novel peptide tag comprises a specified short amino acid sequence or derivative thereof. In certain embodiments, the peptide tag is about 5 to about 10 amino acids in length. In certain embodiments, the peptide tag is about 7 amino acids in length. In certain embodiments, the peptide tag comprises a tandem repeat sequence of two or more peptides.

In certain aspects, the synthetic peptide tag comprises an amino acid sequence unit of about five to about fifteen amino acids, wherein the N-terminal and/or C-terminal amino acids are linked or fused to a target molecule. In certain embodiments, the amino acid sequence unit comprises seven amino acids. In certain embodiments, the amino acid sequence unit comprises an amino acid sequence having at least 70% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the amino acid sequence unit comprises an amino acid sequence having at least 90% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the amino acid sequence unit comprises amino acid sequence QPRFAAA (SEQ ID NO: 1). In certain embodiments, the amino acid sequence unit comprises X_n-QPRFAAA-X_n, wherein each n is independently 0 or an integer greater than or equal to 1, and each X is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof. In certain embodiments, the synthetic peptide tag further comprises a plurality of repeating amino acid sequence units. In certain embodiments, the repeating amino acid sequence units are in tandem. In certain embodiments, the amino acid sequence units are separated by a linker molecule or one or more amino acids.
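The tandem-repeat and flanking arrangements described for the peptide tag can be illustrated with a short Python sketch that builds such sequences as one-letter amino acid strings. The helper names are hypothetical, and the "GGS" linker and the flanking residues in the usage lines are arbitrary placeholders chosen for illustration; the disclosure does not name a specific linker.

```python
# Minimal sketch: build tandem repeats and flanked versions of the tag unit.
# Assumption: sequences are plain one-letter amino acid strings.

TAG_UNIT = "QPRFAAA"  # SEQ ID NO: 1 from the text


def tandem_tag(copies: int, linker: str = "") -> str:
    """Repeat the tag unit; an optional linker separates adjacent units."""
    if copies < 1:
        raise ValueError("at least one copy of the tag unit is required")
    return linker.join([TAG_UNIT] * copies)


def flanked_tag(n_flank: str = "", c_flank: str = "") -> str:
    """Build X_n-QPRFAAA-X_n, where either flank may be empty (n = 0)."""
    return n_flank + TAG_UNIT + c_flank


print(tandem_tag(2))          # two units directly in tandem
print(tandem_tag(3, "GGS"))   # three units separated by a placeholder linker
print(flanked_tag("M", "K"))  # one arbitrary residue on each side
```

Both flank lengths are independent, so `flanked_tag()` with no arguments returns the bare seven-residue unit.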

In another aspect, the synthetic peptide comprises the structure: (AA-AA-AA-AA-AA_z-AA_z)_x, wherein x is greater than or equal to 1, z is 0 or 1, and each AA is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof.

In another aspect, the synthetic peptide comprises the structure: AA1-AA2-AA3-AA4-AA5-AA6-AA7, wherein each AA is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof.

In another aspect, the synthetic peptide comprises an amino acid sequence comprising the structure: X_n-QPRFAAA-X_n, wherein each n is independently 0 or an integer greater than or equal to 1, and each X is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof.

In another aspect, the fusion protein comprises a synthetic peptide as exemplified herein fused to one or more target peptides. In certain embodiments, two or more synthetic peptides exemplified herein are fused to one target peptide.

In another aspect, the fusion molecule comprises a synthetic peptide as exemplified herein fused to one or more biomolecules. In certain embodiments, the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

In another aspect, a method of enhancing the production of a protein comprises: tagging the desired peptide or nucleic acid sequence by fusion or cloning using the peptide tags exemplified herein; expressing the peptide or nucleic acid sequence; and harvesting the protein. In certain embodiments, the protein comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens, or biomimetics.

In certain aspects, the compositions comprise a peptide-tagged biomolecule exemplified herein, and a pharmaceutically acceptable excipient, diluent, or carrier.

In another aspect, the nucleic acid encodes a peptide tag as exemplified herein.

In another aspect, the expression vector comprises a nucleic acid encoding a peptide tag as exemplified herein.

In another aspect, the host cell comprises an expression vector encoding a peptide tag as exemplified herein.

In certain aspects, methods of using peptide tags comprise enhancing production of tagged proteins (including viral transcripts/proteins, endogenous gene products, vaccines, antibodies, engineered recombinant proteins) in cells in vitro, ex vivo, and in vivo. In certain embodiments, the tandem peptide repeats further enhance the production of the target molecule. In certain embodiments, the method of increasing protein production in a cell comprises labeling a target molecule in the cell.

In another aspect, the chimeric molecule comprises one or more 5'- and/or 3'-untranslated region (UTR) sequences or fragments thereof associated with one or more biomolecules. In certain embodiments, the one or more 5'-untranslated region (UTR) and/or 3'-UTR sequences or fragments thereof are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, enveloped viruses, orthomyxoviruses, rhabdoviruses, or combinations thereof. In certain embodiments, the 5'-UTR and/or 3'-UTR is from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 3'-UTR. In certain embodiments, the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens, or biomimetics. In certain embodiments, the chimeric molecule further comprises one or more promoter and/or regulatory sequences operably linked to the UTR or biomolecule.

In another aspect, the host cell comprises a chimeric molecule exemplified herein.

In another aspect, the construct encodes a chimeric molecule exemplified herein.

In another aspect, a method of enhancing production of a biomolecule comprises: tagging the desired peptide or nucleic acid sequence by fusion or cloning using the chimeric molecules exemplified herein; expressing the peptide or nucleic acid sequence; and harvesting the protein. In certain embodiments, the protein comprises: oligonucleotides, polynucleotides, mRNA vaccines, DNA vaccines, viral transcripts/proteins, antibodies, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens, or biomimetics.

In another aspect, the nucleic acid comprises a promoter, a 5'-untranslated region (5'-UTR) sequence, a biomolecule of interest, a peptide domain, a 3'-untranslated region (3'-UTR) sequence, and combinations thereof. In certain embodiments, the one or more 5'-untranslated region (UTR) sequences or fragments thereof are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, enveloped viruses, orthomyxoviruses, rhabdoviruses, or combinations thereof. In certain embodiments, the 5'-UTR and/or 3'-UTR is derived from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 3'-UTR.

In another aspect, the host cell comprises a nucleic acid or expression vector as exemplified herein.

Definitions

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs (e.g., in cell culture, molecular genetics, and biochemistry).

The term "about" or "approximately" means within an acceptable error range for a particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, i.e., the limitations of the measurement system. For example, according to practice in the art, "about" can mean within 1 or more than 1 standard deviation. Alternatively, "about" can mean a range of up to 20%, up to 10%, up to 5%, or up to 1% of a given value or range. Alternatively, particularly with respect to biological systems or processes, the term can mean within an order of magnitude, within 5-fold, or within 2-fold of a value. Where a particular value is described in the specification and claims, unless otherwise stated, the term "about" should be assumed to mean within an acceptable error range for the particular value. It should be understood that where a parameter range is provided, all integers within that range, and tenths thereof, are also provided by the present disclosure. For example, "0.2 to 5 mg" is a disclosure of 0.2 mg, 0.3 mg, 0.4 mg, 0.5 mg, 0.6 mg, etc., up to 5.0 mg.
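The percentage reading of "about" and the enumeration of a parameter range can be sketched in Python as follows. This assumes that "up to 20%" is read as a symmetric interval around the stated value and that ranges are enumerated in steps of one tenth, as in the "0.2 to 5 mg" example; the function names are hypothetical. Decimal arithmetic is used so that the tenths are exact.

```python
# Minimal sketch: percentage interval for "about" and enumeration in tenths.
# Assumption: "up to N%" is read as value +/- N percent of the value.

from decimal import Decimal


def about_range(value, percent):
    """Return (low, high) for value +/- percent of value."""
    v = Decimal(str(value))
    delta = v * Decimal(str(percent)) / Decimal(100)
    return v - delta, v + delta


def enumerate_range(low, high, step="0.1"):
    """List every value from low to high inclusive, in increments of step."""
    current, upper, increment = Decimal(low), Decimal(high), Decimal(step)
    values = []
    while current <= upper:
        values.append(current)
        current += increment
    return values


low, high = about_range(7, 20)       # "about 7" at the 20% reading
doses = enumerate_range("0.2", "5.0")

print(low, high)
print(doses[0], doses[1], doses[-1], len(doses))
```

Under these assumptions, "about seven amino acids" at the 20% reading spans 5.6 to 8.4, and "0.2 to 5 mg" discloses 49 individual values.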

In the description and claims of the present disclosure, phrases such as "at least one of" or "one or more of" may occur followed by a list of elements or features. The term "and/or" may also occur in a list of two or more elements or features. Unless otherwise implicitly or explicitly contradicted by the context in which it is used, such a phrase is intended to mean any of the listed elements or features individually, or any of the recited elements or features in combination with any of the other recited elements or features. For example, the phrases "at least one of A and B", "one or more of A and B", and "A and/or B" are each intended to mean "A alone, B alone, or A and B together". A similar interpretation applies to lists including three or more items. For example, the phrases "at least one of A, B, and C", "one or more of A, B, and C", and "A, B, and/or C" are each intended to mean "A alone, B alone, C alone, A and B together, A and C together, B and C together, or A and B and C together". Furthermore, use of the term "based on" is intended to mean "based at least in part on", such that an unrecited feature or element is also permissible.
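The reading of "and/or" given above amounts to listing every non-empty combination of the recited items. A minimal Python sketch, using the standard library `itertools.combinations` and a hypothetical helper name, is:

```python
# Minimal sketch: enumerate every non-empty combination covered by "and/or".

from itertools import combinations


def and_or(items):
    """Return all non-empty combinations of the items, smallest first."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


print(and_or(["A", "B"]))
print(len(and_or(["A", "B", "C"])))
```

For two items this yields three combinations and for three items seven, matching the enumerations in the definition.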

As used herein, the term "amino acid" encompasses both naturally occurring amino acids and non-naturally occurring amino acids. Examples of non-naturally occurring amino acids include, but are not limited to, D-amino acids (i.e., amino acids having opposite chirality to naturally occurring amino acids), N-alpha-methyl amino acids, C-alpha-methyl amino acids, beta-methyl amino acids, and D- or L-beta-amino acids. Other non-naturally occurring amino acids include, for example, beta-alanine (beta-Ala), norleucine (Nle), norvaline (Nva), homoarginine (Har), 4-aminobutyric acid (gamma-Abu), 2-aminoisobutyric acid (Aib), 6-aminohexanoic acid (ε-Ahx), ornithine (Orn), sarcosine, alpha-aminoisobutyric acid, 3-aminopropionic acid, 2,3-diaminopropionic acid (2,3-diaP), D- or L-phenylglycine, D-(trifluoromethyl)-phenylalanine, and D-p-fluorophenylalanine.

As used herein, the term "biomolecule" refers to any of a number of substances that can be produced by cells and living organisms. Biomolecules have a wide range of sizes and structures and perform a large number of functions. The four main types of biomolecules are carbohydrates, lipids, nucleic acids, and proteins; the term also refers to features associated with peptides and/or proteins of interest. Biomolecules are useful in a variety of applications, including, but not limited to, therapeutic agents for diseases (e.g., insulin, interferon, interleukin, anti-angiogenic peptides, tumor necrosis factor); molecules that bind to defined cellular targets (for example, receptors, channels, lipids, cytoplasmic proteins, and membrane proteins); and biomolecules having antimicrobial, antiviral, anticancer, or anti-inflammatory activity.

As used herein, "cleavable linker element," "peptide linker," and "cleavable peptide linker" will be used interchangeably and refer to a cleavable peptide segment that, in certain embodiments, is found between a peptide tag and a biomolecule of interest (e.g., a peptide). After separation and/or partial purification or purification of the peptide tag from the cell lysate, the cleavable linker element may be chemically and/or enzymatically cleaved to separate the peptide tag from the biomolecule of interest (e.g., peptide). The fusion peptide may also include a plurality of regions encoding one or more peptides of interest separated by one or more cleavable peptide linkers. If necessary, the peptide of interest can then be separated from the peptide tag. In one embodiment, the peptide tag and the peptide of interest exhibit different solubilities in a defined medium (typically an aqueous medium), thereby facilitating the separation of the peptide tag from the biomolecule of interest (e.g., polypeptide). In one embodiment, the peptide tag is insoluble in aqueous solution, while the protein/polypeptide of interest is suitably soluble in aqueous solution. The pH, temperature, and/or ionic strength of the aqueous solution may be adjusted to facilitate recovery of the peptide of interest. In one embodiment, differential solubility between the inclusion body tag and the peptide of interest occurs in an aqueous solution having a pH of 4 to 11 and a temperature of 15 ℃ to 50 ℃. The cleavable peptide linker can be from 1 to about 50 amino acids in length, preferably from 1 to about 20 amino acids in length. The cleavable peptide linker can be incorporated into the fusion protein using any number of techniques well known in the art.
Means for preparing the peptides of the invention (peptide tags, cleavable peptide linkers, peptides of interest, and fusion peptides) are well known in the art, and in preferred embodiments, recombinant DNA and molecular cloning techniques can be used to prepare the complete peptide reagents.

The term "checkpoint protein" means a group of molecules on the surface of CD4⁺ and/or CD8⁺ T cells that fine-tune the immune response by down-regulating or inhibiting the anti-tumor immune response.

As used herein, the terms "comprises," "comprising," or "including" and variations thereof, when used in relation to elements of a defined or described item, composition, apparatus, method, process, system, etc., are intended to be inclusive or open ended, permitting additional elements. That is, the defined or described item, composition, apparatus, method, process, system, etc., includes those specified elements (or equivalents thereof, if appropriate), and other elements may be included and still fall within the scope/definition of the defined item, composition, apparatus, method, process, system, etc.

As used herein, when the terms "conjugated," "linked," "attached," "fused," and "tethered" are used with respect to two or more moieties, the terms mean that the moieties or domains are physically associated or linked to each other, either directly or via one or more additional moieties that act as linking agents, to form a structure that is stable enough that the moieties remain associated under the conditions under which the structure will be used, such as physiological conditions. Linking may be based on gene fusion according to methods known in the art or may be performed by, for example, chemical cross-linking. The compound and the targeting agent may be linked by a flexible linker, such as a polypeptide linker. The polypeptide linker may comprise a plurality of hydrophilic or peptide-bonded amino acids of various lengths. The term "associating" will be used for brevity and is intended to include all possible methods of physically and chemically associating each domain.

As used herein, the terms "enhance", "enhanced", and "enhancement" are used interchangeably and refer to an increase in a specified parameter (e.g., at least about 1.1-fold, 1.25-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 8-fold, 10-fold, 12-fold, or even 15-fold or more) and/or an increase in a specified activity of at least about 5%, 10%, 25%, 35%, 40%, 50%, 60%, 75%, 80%, 90%, 95%, 97%, 98%, 99% or 100% relative to a baseline value.
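The two ways of expressing enhancement named in this definition, a fold increase and a percent increase relative to a baseline value, are related as in the following minimal sketch. The function names and the example numbers are illustrative assumptions and are not taken from the disclosure.

```python
def fold_change(value, baseline):
    """Ratio of a measured value to its baseline (1.0 means no change)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return value / baseline

def percent_increase(value, baseline):
    """Increase over baseline expressed as a percentage of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (value - baseline) / baseline * 100.0

# Hypothetical readings: a baseline of 100 units and an enhanced value of 150 units.
print(fold_change(150.0, 100.0))       # 1.5-fold
print(percent_increase(150.0, 100.0))  # 50 percent increase
```

A 2-fold increase therefore corresponds to a 100% increase over baseline, and a 1.1-fold increase to a 10% increase.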

As used herein, the terms "fusion protein," "fusion peptide," "chimeric protein," and "chimeric peptide" will be used interchangeably and will refer to a polymer of amino acids (peptide, oligopeptide, polypeptide, or protein) that comprises at least two portions, each portion having a different function. At least one first portion of the fusion peptide comprises at least one of the peptide tags of the invention. At least one second portion of the fusion peptide comprises at least one peptide of interest. In certain embodiments, the fusion protein additionally includes at least one cleavable peptide linker that facilitates cleavage (chemical and/or enzymatic) and separation of the peptide tag from the peptide of interest.

"Nucleic acid" refers to nucleotides (e.g., deoxyribonucleotides, ribonucleotides, and 2'-modified nucleotides) and polymers thereof in single-, double-, or multiple-stranded form, or their complements. In a general and customary sense, the terms "polynucleotide", "oligonucleotide", "oligomer" and the like refer to a linear sequence of nucleotides. In a general and customary sense, the term "nucleotide" refers to a single unit of a polynucleotide, i.e., a monomer. The nucleotide may be a ribonucleotide, a deoxyribonucleotide, or a modified version thereof. Examples of polynucleotides contemplated herein include single- and double-stranded DNA, single- and double-stranded RNA, and hybrid molecules having a mixture of single- and double-stranded DNA and RNA. Examples of nucleic acids (e.g., polynucleotides) contemplated herein include any type of RNA (e.g., mRNA, siRNA, miRNA and guide RNA) and any type of DNA (e.g., genomic DNA, plasmid DNA, and minicircle DNA), or any fragment thereof. In the usual and customary sense, the term "duplex" in the context of a polynucleotide refers to double strandedness.

Nucleic acids (including, for example, nucleic acids having phosphorothioate backbones) may include one or more reactive moieties. As used herein, the term reactive moiety includes any group capable of reacting with another molecule (e.g., a nucleic acid or polypeptide) through covalent, non-covalent, or other interactions. For example, a nucleic acid may include an amino acid reactive moiety that reacts with an amino acid on a protein or polypeptide by covalent, non-covalent, or other interactions.

The term also encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, or non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, but are not limited to, phosphodiester derivatives including, for example, phosphoramidates, phosphorodiamidates, phosphorothioates (also known as phosphothioates, which have double-bonded sulfur replacing oxygen in the phosphate), phosphorodithioates, phosphonocarboxylic acids, phosphonocarboxylic acid esters, phosphonoacetic acid, phosphonoformic acid, methylphosphonates, borophosphates, or O-methyl phosphoramidate linkages (see Eckstein, OLIGONUCLEOTIDES AND ANALOGUES: A PRACTICAL APPROACH, Oxford University Press), and modifications to nucleotide bases such as 5-methylcytidine or pseudouridine; and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those having a positive backbone, a nonionic backbone, modified sugars, and a non-ribose backbone (e.g., phosphorodiamidate morpholino oligonucleotides or Locked Nucleic Acids (LNAs), as known in the art), including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506 and in Chapters 6 and 7 of ASC Symposium Series 580, CARBOHYDRATE MODIFICATIONS IN ANTISENSE RESEARCH, Sanghui & Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acid. Modification of the ribose-phosphate backbone can be done for a variety of reasons, for example, to increase the stability and half-life of such molecules in physiological environments or as probes on biochips.
Mixtures of naturally occurring nucleic acids and analogs can be prepared; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs, can be prepared. In embodiments, the internucleotide linkages in the DNA are phosphodiester, phosphodiester derivatives, or a combination of both.

As used herein, the term "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment such that the function of one is affected by the other. For example, a promoter is operably linked to a coding sequence when it is capable of affecting the expression of the coding sequence (i.e., the coding sequence is under the transcriptional control of the promoter). In yet another embodiment, the definition of "operably linked" can also be extended to describe the product of a chimeric gene, such as a fusion protein. Thus, "operably linked" will also refer to linking of a peptide tag to a biomolecule of interest (e.g., peptide) to be produced and recovered. A peptide tag is "operably linked" to a peptide of interest if the fusion protein is insoluble upon expression and accumulates in inclusion bodies in the expression host cell. In a preferred embodiment, the fusion peptide will include at least one cleavable peptide linker that is useful in separating the peptide tag from the peptide of interest. The cleavable peptide linker can be incorporated into the fusion protein using any number of techniques well known in the art.

As used herein, the terms "polypeptide" and "peptide" will be used interchangeably to refer to a polymer of two or more amino acids joined together by peptide bonds, wherein the length of the peptide is not specified, and thus, peptides, oligopeptides, polypeptides, and proteins are included within the definition of the invention. In one aspect, this term also includes post-expression modifications of the polypeptide, e.g., glycosylation, acetylation, phosphorylation, and the like. Included within this definition are, for example, peptides and peptidomimetics that contain one or more amino acid analogs or labeled amino acids.

As used herein, the terms "protein of interest", "polypeptide of interest", "peptide of interest", "expressible protein" and "expressible polypeptide" will be used interchangeably and refer to a protein, polypeptide or peptide that can be expressed by a genetic element in a host cell.

As used herein, the terms "plasmid," "vector," and "cassette" refer to an extrachromosomal element that normally carries a gene which is not part of the central metabolism of a cell and that is typically in the form of a circular double-stranded DNA molecule. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of single- or double-stranded DNA or RNA, from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construct capable of introducing into a cell a promoter fragment and DNA sequence for a selected gene product along with the appropriate 3' untranslated sequence. "Transformation cassette" refers to a specific vector that contains a foreign gene and has elements, in addition to the foreign gene, that promote transformation of a specific host cell. "Expression cassette" refers to a specific vector that contains a foreign gene and has elements, in addition to the foreign gene, that allow for enhanced expression of the gene in a foreign host.

As used herein, the term "promoter/regulatory sequence" means a nucleic acid sequence that is required for expression of a gene product operably linked to the promoter/regulatory sequence. In some cases, this sequence may be a core promoter sequence; in other cases, this sequence may also include enhancer sequences and other regulatory elements required for expression of the gene product. For example, a promoter/regulatory sequence may be an element that expresses a gene product in a tissue-specific manner.

As used herein, the term "promoter" is defined as a DNA sequence, recognized by the synthetic machinery of a cell or by introduced synthetic machinery, that is required to initiate specific transcription of a polynucleotide sequence. A "constitutive" promoter is a nucleotide sequence which, when operably linked to a polynucleotide encoding or specifying a gene product, results in the production of the gene product in a cell under most or all physiological conditions of the cell. An "inducible" promoter is a nucleotide sequence which, when operably linked to a polynucleotide encoding or specifying a gene product, results in the production of the gene product in a cell substantially only when an inducer corresponding to the promoter is present in the cell. A "tissue-specific" promoter is a nucleotide sequence which, when operably linked to a polynucleotide encoding or specifying a gene product, results in the production of the gene product in a cell substantially only if the cell is a cell of the tissue type corresponding to the promoter.

As used herein, the term "target molecule," "biomolecule," or "target biomolecule" includes any macromolecule, including proteins, peptides, polypeptides, genes, polynucleotides, oligonucleotides, carbohydrates, enzymes, polysaccharides, glycoproteins, receptors, antigens, tumor antigens, markers, molecules associated with disease, antibodies, growth factors; or it may be any small organic molecule including hormones, substrates, metabolites, cofactors, inhibitors, drugs, dyes, nutrients, pesticides, peptides; or it may be an inorganic molecule including metals, metal ions, metal oxides, and metal complexes; it may also be an intact organism including bacteria, viruses and unicellular eukaryotic organisms such as protozoa.

As used herein, the term "translatable" may be used interchangeably with the term "expressible". These terms may refer to the ability of a polynucleotide or portion thereof to provide a polypeptide through transcriptional and/or translational events in a process using biomolecules, or in a cell, or in the natural biological environment. In some circumstances, translation is the process by which ribosomes create polypeptides in cells. In translation, messenger RNAs (mRNAs) may be decoded by ribosomes to produce specific amino acid chains or polypeptides. The translatable polynucleotide may provide a coding sequence region (commonly, CDS) or portion thereof, which may be processed to provide a polypeptide, protein, or fragment thereof.

As used herein, the term "3'-untranslated region" (3'-UTR) refers to the section of messenger RNA (mRNA) that immediately follows the translation stop codon. The 3'-UTR may comprise regulatory regions that are known to affect polyadenylation and stability of mRNA. Many 3'-UTRs also contain AU-rich elements (AREs). Furthermore, the 3'-UTR may preferably contain sequences that direct the addition of hundreds of adenylate residues (the so-called poly(A) tail) to the end of the mRNA transcript.

As used herein, the term "5'-untranslated region" (5'-UTR) refers to a polynucleotide sequence that, when linked to a transcript, is capable of recruiting a ribosomal complex and initiating translation of the transcript. Typically, the 5'-UTR is located immediately upstream of the start codon of the transcript; specifically, between the cap site and the start codon. The 5'-UTR starts at the transcription start site and ends one nucleotide (nt) before the start codon of the coding region (typically AUG in mRNA). In eukaryotic cells, the 5'-UTR is typically from 100 to thousands of nucleotides in length, but sometimes shorter UTRs are also present in eukaryotic cells.
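The positional definitions of the 5'-UTR and 3'-UTR given above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the first AUG in the sequence is taken as the start codon, the coding region is read in frame up to and including the first stop codon, and the function name `split_transcript` and the example sequence are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_transcript(mrna):
    """Split an mRNA into (5'-UTR, coding region including stop codon, 3'-UTR).

    Assumption: translation starts at the first AUG, which is a simplification.
    """
    mrna = mrna.upper().replace("T", "U")
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    # Walk codon by codon from the start codon until an in-frame stop codon.
    for pos in range(start, len(mrna) - 2, 3):
        if mrna[pos:pos + 3] in STOP_CODONS:
            end = pos + 3
            # 5'-UTR ends one nucleotide before the start codon;
            # 3'-UTR begins immediately after the stop codon.
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")

# Hypothetical toy transcript, not a sequence from the disclosure.
five_utr, cds, three_utr = split_transcript("GGGACC" + "AUGGCUUAA" + "CCCAAUAAA")
print(five_utr, cds, three_utr)
```

Real transcripts may use a downstream or non-AUG start codon, so annotation of the actual coding region would normally replace the first-AUG assumption.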

Throughout this disclosure, aspects of the disclosure may be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as a rigid limitation on the scope of the present disclosure. Accordingly, the description of a range should be considered as having specifically disclosed all possible subranges and individual numerical values within that range. For example, the description of a range such as 1 to 6 should be considered as having specifically disclosed sub-ranges such as 1 to 3, 1 to 4, 1 to 5, 2 to 4, 2 to 6, 3 to 6, etc., as well as individual numbers within the range, e.g., 1, 2, 2.7, 3, 4, 5, 5.3, and 6. This applies regardless of the breadth of the range.

Any of the compositions or methods provided herein can be used with one or more of any other compositions and methods provided herein.

Drawings

FIGS. 1A through 1F are a series of graphs, schematic representations and fluorescence microscopy images demonstrating that Qα tagging in SARS-CoV-2 viral proteins strongly enhances production of dual reporter fused viral proteins in HEK293T cells.

Fig. 1A: schematic representation of 2A-mediated dual reporter gdLuc/dsGFP fusion with viral proteins and potential multiplex measurement of viral protein expression/production.

Fig. 1B to 1D: representative experiment showing Qα-enhanced dynamic production of SARS-CoV-2 envelope (E) protein (fig. 1B); mean fold induction across 10 experiments, each measured in quadruplicate by gdLuc assay of the culture medium 24 to 48 hours after transfection with the indicated pcDNA6B vector (100 ng/well) (fig. 1C); and representative images of Qα-enhanced dsGFP expression detected by fluorescence microscopy (fig. 1D).

Fig. 1E to 1F: representative gdLuc experiments showing Qα boosting for the other SARS-CoV-2 structural proteins spike (S) and nucleocapsid (N) and for the accessory proteins NSP2, NSP16 and ORF3. Cells were transfected in quadruplicate with the indicated pcDNA6B vector at 100 ng/well. Data represent mean ± SE of gdLuc activity in the supernatant 48 hours after transfection. Fold numbers indicate relative changes in the Qα group compared to the corresponding control group.
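The "mean ± SE" summary used in these figure descriptions can be computed from replicate wells as in the following minimal sketch. It assumes that SE denotes the standard error of the mean, calculated as the sample standard deviation divided by the square root of the number of replicates; the disclosure does not state the formula, and the readings shown are hypothetical.

```python
import math
import statistics

def mean_and_se(replicates):
    """Return (mean, standard error of the mean) for replicate readings."""
    if len(replicates) < 2:
        raise ValueError("at least two replicates are needed")
    mean = statistics.mean(replicates)
    # Assumed definition: sample standard deviation / sqrt(n)
    se = statistics.stdev(replicates) / math.sqrt(len(replicates))
    return mean, se

# Hypothetical quadruplicate luciferase readings (arbitrary units).
control = [1.0, 2.0, 3.0, 4.0]
tagged = [9.0, 10.0, 11.0, 12.0]

control_mean, control_se = mean_and_se(control)
tagged_mean, tagged_se = mean_and_se(tagged)
print(f"control: {control_mean:.2f} ± {control_se:.2f}")
print(f"tagged:  {tagged_mean:.2f} ± {tagged_se:.2f}")
print(f"fold change of means: {tagged_mean / control_mean:.1f}")
```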

Fig. 2A to 2G are a series of graphs and fluorescence microscopy images demonstrating that Qα boosting is diverse in terms of dose, non-viral proteins, cell type, and tag localization.

Fig. 2A: various degrees of dose-dependent Qα boosting of SARS-CoV-2 S, M, N and ORF3. Cells were transfected in quadruplicate with the indicated pcDNA6B vector in the indicated vector amounts. Data represent mean ± SE of gdLuc activity in the supernatant 48 hours after transfection. Fold numbers indicate relative changes in the Qα group compared to the corresponding control group.

Fig. 2B to 2D: dose-dependent Qα boosting for host cell genes NIBP and hACE2 determined by gdLuc assay (fig. 2B, 2C) and representative fluorescence microscopy images (fig. 2D) 48 hours after transfection with the pcDNA6B conventional vector.

Fig. 2E, 2F: dose-dependent Qα boosting of IFNγ and IL-2 secretion as measured by gdLuc assay 48 hours after transfection with the pRRL LV vector.

Fig. 2G: the Qα boost for viral proteins E and S and non-viral protein hACE2 shows similar efficiency in different cell types.

FIGS. 3A through 3I are a series of graphs, fluorescence microscopy images, schematic presentations and blots showing that Qα enhancement is accelerated by a stronger promoter and SARS-CoV-2 natural untranslated region (UTR).

FIGS. 3A to 3C: the stronger promoter CAG further increases Qα-boosting efficiency for viral proteins E, S and NSP16.

Fig. 3D, 3E: inclusion of the 5'-UTR strongly increased promoter-dependent expression of the E protein, as determined by western blotting and immunocytochemistry using anti-Flag antibodies, and addition of the 3'-UTR further increased E protein expression. The different size of E-Flag-Q resulted from the addition of 37 amino acids to the open reading frame after removal of the stop codon during cloning.

Fig. 3F, 3G: inclusion of the 5'-UTR dramatically increases CAG-driven expression of the Qα-tagged S-fusion dual reporter, as determined by representative fluorescence microscopy images and gdLuc assay.

Fig. 3H: inclusion of the 5'-UTR further accelerates the Qα boosting efficiency of CMV-driven S dual reporter protein production when compared to LG groups 48 hours after transfection with conventional vectors.

Fig. 3I: inclusion of the 5'-UTR accelerates the Qα boost of the dual reporter without viral proteins, as determined by gdLuc assay 48 hours after transfection with pRRL LV virus.

FIGS. 4A through 4H are a series of graphs, fluorescence microscopy images, schematic presentations and blots demonstrating that Qα tagging and 5'-UTR inclusion enhance the packaging and transduction efficiency of SARS-CoV-2 S protein-pseudotyped lentiviral-like particles (S-LVLP).

Fig. 4A: schematic of different vectors for expressing human codon-optimized Sd18, and the S-LVLP packaging process.

Fig. 4B: Qα and 5'-UTR increased Sd18 protein expression in transfected cells, as determined by western blot using serum from SARS-CoV-2 patients.

Fig. 4C to 4F: Qα tagging increased S-LVLP packaging titers for the standard pRRL-GFP LV vector, as determined by GFP positivity; titers were further increased by 5'-UTR inclusion, polybrene treatment and purification.

Fig. 4G to 4H: Qα tagging and 5'-UTR inclusion increased S-LVLP packaging titers for the dual reporter LV vectors pRRL-LG or pRRL-E-LG, as determined by GFP positivity and gdLuc activity.

FIGS. 5A through 5I are a series of graphs demonstrating that Qα tagging and 5'-UTR inclusion enhance mRNA-dependent production of SARS-CoV-2 viral proteins S, N, E and ORF3, as well as non-viral hACE2, via increased mRNA stability and translational efficiency.

Fig. 5A to 5C: Qα tagging strongly potentiates mRNA-dependent production of dual reporters to various degrees in a time- and dose-dependent manner using different target proteins.

Fig. 5D, 5E: inclusion of the 5'-UTR further accelerates Qα enhancement of mRNA-driven production of S, E and hACE2 proteins.

Fig. 5F to 5I: Qα tagging increases post-transcriptional mRNA stability and translation efficiency in the presence of the transcription inhibitor actinomycin D.

Fig. 6A to 6M are a series of schematics, blots, graphs, fluorescence microscopy images and photographs showing that Qα tagging enhances the production of anti-SARS monoclonal antibodies and lentiviruses.

Fig. 6A: schematic diagram showing Qα tagging at the C-terminal end of the constant region of the heavy and light chains of an anti-SARS monoclonal antibody.

Fig. 6B to 6D: representative ELISA showing robust Qα enhancement of mAb production 48 hours after H/L or HQ/LQ (50 ng/well) co-transfection, with or without normalization to GFP (G) or firefly luciferase (L). The amount of mAb was quantified using a sigmoidal four-parameter logistic (4PL) curve. Relative fold changes compared with the corresponding LG are shown.

Fig. 6E: based on ELISA results, the mean fold change for 16 experiments.

Fig. 6F: western blot analysis confirmed that qα enhanced mAb production in the supernatant.

Fig. 6G: after LV infection of HEK293T cells, Qα tagging in LV transfer vector pRRL-E-LG increased gdLuc activity in the supernatant.

Fig. 6H, 6I: qα tagging in LV transfer vector pLV-EF1 α -Flag-spCas9-qα -T2A-RFP increased transgene expression, determined by western blot analysis using anti-Flag antibodies (fig. 6H); but without increasing encapsulation efficiency, measurements were made using FACS positive for RFP (fig. 6I).

Fig. 6J: representative fluorescent images showing that qα labeling on Pol and RRE enhances LV encapsulation efficiency against pRRL-GFP-transferred LV vectors, but qα labeling on Gag compromises LV encapsulation.

Fig. 6K to 6M: qα labeling on Pol and RRE increased LV encapsulation efficiency for pRRL-UTR-QLG and pLV-EF1 α -MS2-spCas9-F2A-GFP, as determined using LV qPCR titre kit (FIG. 6K), flow cytometry (FIG. 6L) and gdLuc assays (FIG. 6M).

Figures 7A to 7H are a series of blots and graphs demonstrating Qα tagging-mediated enhancement of secretion of various target proteins in HEK293T cells.

Fig. 7A to 7B: Qα tagging significantly reduced the expression level of E-Flag-gdLuc protein in cell lysates 48 hours after transfection with the indicated vectors.

Fig. 7C: Qα tagging reduced the expression levels of secreted IFNγ/IL-2 or non-secreted viral protein N and non-viral protein hACE2 in cell lysates 48 hours after transfection with the indicated vectors. T2A self-cleavage efficiency varies with the target protein, showing different ratios of cleaved (c) to non-cleaved (n) bands.

Fig. 7D: Qα tagging reduces S protein levels in cell lysates, while 5'-UTR inclusion does not increase protein levels despite continuous secretion. The cleaved S-Flag-gdLuc fragment was detected with anti-Flag antibody and the cleaved dsGFP fragment was detected with anti-GFP antibody.

Fig. 7E, 7F: Qα tagging strongly increased the protein level of secreted E-QLG in the supernatant, as detected by western blot analysis and gdLuc assay of the supernatant. Cells were transfected in quadruplicate with the indicated vectors for 24 hours and cultured with FreeStyle™ 293 expression medium for 48 hours.

Fig. 7G: the ER-Golgi trafficking inhibitor brefeldin A completely blocks secretion of Qα-tagged viral proteins and host cell proteins.

Fig. 7H: Qα tagging increases protein expression of non-secreted firefly luciferase (fLuc) in cell lysates and has no effect on the background fLuc activity in the supernatant.

FIGS. 8A through 8I are a series of schematic diagrams, photographs of stained cells and graphs demonstrating that Exen21/Qα addition to SARS-CoV-2 viral proteins strongly enhances production of dual reporter-fused viral proteins in HEK293T cells.

Fig. 8A: schematic of 2A-mediated dual reporter gdLuc/dsGFP (LG) and Qα-tagged LG (QLG) fusion with viral proteins and potential multiplex measurement of viral protein expression/production. Exen21/Qα represents the 21-mer nucleotide motif and its corresponding heptapeptide.

Fig. 8B to 8D: representative experiment showing Exen21 enhancement of dynamic production of SARS-CoV-2 envelope (E) protein (fig. 8B); mean fold induction across 20 experiments determined by gdLuc assay in supernatants 24 to 72 hours after transfection with the indicated pcDNA6B vector (100 ng/well, in quadruplicate) (fig. 8C); and representative images of Exen21-enhanced dsGFP expression detected by fluorescence microscopy (fig. 8D). The data represent mean ± SE of gdLuc activity and the relative fold change (red) of QLG relative to the corresponding LG group (the same applies hereinafter).

Fig. 8E to 8F: representative gdLuc experiments showing Exen21 boosting for the other SARS-CoV-2 structural proteins spike (S) and nucleocapsid (N) and for the accessory proteins NSP2, NSP16 and ORF3. Cells were transfected with the indicated pcDNA6B vector (100 ng/well, in quadruplicate). Data represent mean ± SE of gdLuc activity in the supernatant 48 hours after transfection.

Fig. 8G to 8I: alanine scanning and deletion mutation (fig. 8G), degenerate mutation (fig. 8H) and missense mutation (fig. 8I) assays showing a unique and specific key role for Exen21 in enhancing E-LG production. Cells were transfected with the indicated pcDNA6B vector (100 ng/well, in quadruplicate). Data represent mean ± SE of gdLuc activity in the supernatant 48 hours after transfection, and the relative percent change compared to the parental E-QLG group. The inset in fig. 8G shows the heptapeptide structure with residue positions. The insets in figs. 8H and 8I show the mutated nucleotides and corresponding residues. dQ denotes a degenerate QLG mutant and mQ denotes a missense QLG mutant.
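The relationship between a 21-mer coding motif and a heptapeptide (21 nucleotides read as 7 codons), and the distinction between degenerate and missense mutations referred to above, can be sketched with the standard genetic code. The 21-mer used below is a placeholder chosen only for illustration and is not the Exen21 sequence; the function names are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a DNA coding sequence whose length is a multiple of three."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def classify_codon_change(ref_codon, alt_codon):
    """Classify a single-codon substitution by its effect on the encoded residue."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "degenerate"  # synonymous: nucleotide changes, residue does not
    if alt_aa == "*":
        return "nonsense"    # introduces a stop codon
    return "missense"        # residue changes

placeholder_21mer = "ATGGCTGCTGCTGCTGCTGCT"  # hypothetical, NOT the Exen21 motif
print(len(placeholder_21mer), translate(placeholder_21mer))
print(classify_codon_change("GCT", "GCC"))  # Ala -> Ala
print(classify_codon_change("GCT", "GAT"))  # Ala -> Asp
```

A degenerate mutant changes the nucleotide motif while preserving the heptapeptide, whereas a missense mutant changes both, which is one way such assays can separate nucleotide-level from peptide-level effects.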

Figures 9A to 9G are a series of graphs showing that Exen21 boosting is diverse in terms of dose, non-viral proteins and cell type.

Fig. 9A: dose-dependent and varying degrees of Exen21 enhancement of SARS-CoV-2 S, M, N and ORF3 protein expression levels. Cells were transfected in quadruplicate with the indicated pcDNA6B (6B) vectors in the indicated amounts. Data represent mean ± SE of gdLuc activity in the supernatant 48 hours after transfection. Fold values indicate the change in the Exen21 groups relative to the corresponding control groups.

Fig. 9B, 9C: dose-dependent Exen21 boosting of the levels of host cell genes NIBP and hACE2, as determined by gdLuc assay 48 hours after transfection with the pcDNA6B conventional vector.

Fig. 9D, 9E: dose-dependent Exen21 boosting of IFNγ and IL-2 secretion, as measured by gdLuc assay 48 hours after transfection with the pRRL LV vector.

Fig. 9F: the stronger promoter CAG further increases the efficiency of Qα (QLG) boosting for viral protein E in LG systems.

Fig. 9G: the Exen21-induced boost for viral E and S proteins and non-viral protein hACE2 exhibited similar efficiency across different cell types.

FIGS. 10A through 10F are a series of schematic diagrams, photographs, blots and graphs showing that Exen21 addition enhanced the production of anti-SARS monoclonal antibody (mAb).

Fig. 10A: schematic diagram showing the human anti-SARS mAb and the Exen21/Qα tag introduced at the C-terminal end of the constant region of the heavy and light chains (right panel).

Fig. 10B: representative ELISA showing robust enhancement of mAb production by Exen21/Qα (HQ/LQ) 48 hours after co-transfection of mAb H/L or HQ/LQ expression vectors (50 ng/well, in quadruplicate), with normalization to empty vector control (C), GFP (G) or firefly luciferase (L).

Fig. 10C: sigmoidal four-parameter logistic (4PL) curve for mAb concentration.

Fig. 10D: normalized quantitative data from the experiment shown in fig. 10B. The relative fold change is presented in comparison with the corresponding mAb H/L.

Fig. 10E: mean Exen21/Qα-induced fold change in mAb production across 16 ELISA-based experiments (p < 0.0001, Student's t-test).

Fig. 10F: western blot analysis demonstrating Exen21/Qα enhancement of mAb production in the supernatant. Membrane staining was used as a loading control for densitometric analysis of the relative fold change in light chain (LC) between the HQ/LQ and H/L groups.
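The four-parameter logistic (4PL) curve mentioned for ELISA quantification is commonly written as y = d + (a - d) / (1 + (x / c)^b), where a is the response at zero concentration, d the response at saturating concentration, c the inflection point and b the slope factor. The following minimal sketch evaluates this curve and inverts it to read a concentration from a measured signal. The parameterization, function names and parameter values are assumptions made for illustration; the disclosure does not specify them, and fitting the parameters to a standard series is not shown.

```python
def four_pl(x, a, b, c, d):
    """4PL response at concentration x.

    a: response at zero concentration, d: response at saturation,
    c: inflection point (mid-response concentration), b: slope factor.
    """
    return d + (a - d) / (1.0 + (x / c) ** b)

def inverse_four_pl(y, a, b, c, d):
    """Concentration that gives response y on the 4PL curve."""
    if not (min(a, d) < y < max(a, d)):
        raise ValueError("response must lie strictly between the two asymptotes")
    # Rearranged from y = d + (a - d) / (1 + (x / c) ** b)
    ratio = (a - d) / (y - d) - 1.0
    return c * ratio ** (1.0 / b)

# Hypothetical standard-curve parameters (optical density vs. ng/mL).
params = dict(a=0.05, b=1.2, c=10.0, d=2.0)

signal = four_pl(25.0, **params)
print(round(signal, 3))
print(round(inverse_four_pl(signal, **params), 3))
```

Signals at or beyond either asymptote cannot be converted to a concentration, which is why the inverse function rejects them rather than extrapolating.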

FIGS. 11A through 11K are a series of protocol diagrams, blots, graphs and photographs showing that Exen addition enhanced the encapsulation and transduction efficiencies of SARS-CoV-2S protein-pseudotyped lentiviral-like particles (S-LVLPs) and standard lentiviral encapsulation.

Fig. 11A: schematic of different vectors for expressing human codon optimized Sd18, and the encapsulation process of S-LVLP in HEK 293T.

Fig. 11B: exen21 increases Sd18 protein expression in transfected cells, shown by western blotting using serum from SARS-CoV-2 patients (which contains specific anti-S antibodies). Representative fold changes in S2 fragments were quantified by densitometry normalized with GAPDH.

Fig. 11C: the addition of Exen (R) 21 increased the S-LVLP packaging titer of standard pRRL-GFP LV vector as measured by GFP positivity.

Fig. 11D to 11E: the addition of Exen increased the S-LVLP encapsulation conflict of dual reporter LV vector pRRL-E-QLG, as determined by GFP-positivity (fig. 11D) and gdLuc activity (fig. 11E).

Fig. 11F: the Exen/Qα in LV transfer vector pRRL-E-QLG induced a gdLuc active LV dose-dependent increase in supernatant of HEK293T cells 48 to 72 hours after infection with indicated amounts of crude LV preparation (μl per well, in triplicate). Fold changes in gdLuc activity of E-QLG versus control E-LG groups are shown.

Fig. 11G, 11H: Exen21/Qα in the LV transfer vector pLV-EF1α-Flag-spCas9-Qα-T2A-RFP (Qα) increased gene expression relative to the untagged vector (Con), as seen by western blot analysis using anti-Flag antibody (fig. 11G), but did not increase packaging efficiency, as measured by FACS for RFP positivity at 48 hours post infection with crude LV preparation (fig. 11H); no significant difference (ns) was detected by Student's t-test.

Fig. 11I: representative fluorescence images showing that Exen21/Qα addition to Pol and RRE enhanced pRRL-GFP LV packaging efficiency compared to control (psPAX2) levels, whereas Exen21/Qα on Gag compromised LV packaging.

Fig. 11J, 11K: Exen21/Qα tagging on Pol and RRE (PolQ/RREQ) enhanced LV packaging efficiency for the pRRL-GFP transfer vector, as determined by cell counting (fig. 11J) and flow cytometry (fig. 11K).

Fig. 11L: gdLuc assay showing enhancement by Exen21/Qα tagging.

FIGS. 12A through 12G are a series of graphs showing that Exen21 addition enhances mRNA-dependent production of SARS-CoV-2 viral proteins S, N, E and ORF3, as well as non-viral hACE2, by increasing mRNA stability and translation efficiency.

Fig. 12A to 12C: the addition of Exen21 to different target proteins strongly enhanced the mRNA-dependent production of dual reporters to various degrees in a time- and dose-dependent manner.

Fig. 12A: response time course for different concentrations of capped mRNA for S-LG compared to S-QLG (ng/well, quadruplicates).

Fig. 12B: response time course to indicated mRNA (100 ng/well, in quadruplicate).

Fig. 12C: dose response to indicated mRNA 24 hours after transfection.

Fig. 12D to 12G: the addition of Exen21 (QLG; right panels in D, E) in the presence of the transcription inhibitor actinomycin D increases post-transcriptional mRNA stability and translation efficiency, as shown in the time-course graphs of reporter activity (fig. 12D, 12E) and mRNA decay (fig. 12F, 12G). RNA levels were determined by RT-qPCR analysis.

Figures 13A to 13G are a series of blots and graphs showing that the addition of Exen21 enhances secretion of various target proteins in HEK293T cells, as shown by western blot analysis.

Fig. 13A to 13C: the addition of Exen21 significantly increased the protein expression levels of viral proteins (E, S, N), non-viral protein hACE2 and secreted IFNγ/IL-2 in cell lysates 48 hours after transfection with the indicated pcDNA6B (6B) vector. Fold change is the relative densitometric change after normalization to the loading control GAPDH or a non-specific band (NS). The P2A self-cleavage efficiency varies with the target protein, giving different ratios of cleaved bands (c) to non-cleaved bands (n).

Fig. 13D, 13E: the addition of Exen21 strongly increased the secreted E-QLG protein level in the supernatant, as seen by both western blot analysis (fig. 13D) and gdLuc assay (fig. 13E). Cells were transfected in quadruplicate with the indicated vectors for 24 hours and cultured with FreeStyle™ 293 expression medium for 48 hours. Membrane staining was used as a loading control for densitometric analysis of the relative fold change of E-QLG over E-LG.

Fig. 13F: the ER-Golgi trafficking inhibitor brefeldin blocks secretion of Qα-tagged viral E protein (E-QLG) and host cell protein (IFNγ), as seen by both western blot analysis (left panel) and gdLuc assay (right panel) 48 hours after brefeldin treatment.

Fig. 13G: the addition of Exen21 raised the non-secreted firefly luciferase (fLuc) protein level in the cell lysate. The relative fold change was quantified by densitometry normalized to GAPDH.

Fig. 14A-14C are a series of photographs showing representative fluorescence microscopy of dual reporters. Related to fig. 8A to 8I and fig. 9A to 9G.

Fig. 14A: the three indicated antibodies detected that the dual reporter of E-Flag-gdLuc-T2A-GFP was completely co-localized with 2A and Flag, while some cleaved GFP remained alone without the corresponding E-Flag-gdLuc-T2A, which might have been secreted.

Fig. 14B, 14C: representative fluorescence micrographs showing Exen21/Qα enhancement of the N and ORF3 viral proteins as well as the human ACE2 cellular protein under the CMV promoter (fig. 14B), and time-dependent enhancement of the E protein under the CAG promoter (fig. 14C). The images were taken under the same exposure conditions. The scale bar is 100 μm.

FIGS. 15A-15B are a series of graphs and photographs showing dose-dependent Exen21/Qα boosting of SARS-CoV-2 viral proteins and saturation of the boosting activity for all tested viral dual reporters at higher amounts of transfected reporter DNA. Related to fig. 9A.

Fig. 15A: Exen21/Qα boosting at different doses, as determined by the NanoLight Gaussia luciferase assay.

Fig. 15B: representative confocal images from 3 fields of view of 4 wells per group. All images were taken under the same exposure conditions. For lower-dose or lower-expression groups, a stronger GFP signal can be observed after longer exposure. The scale bar is 100 μm. HEK293T cells were transfected in 96-well plates with the indicated gdLuc-P2A-dsGFP reporter in the indicated amounts. At 72 hours post-transfection, EGFP images were taken and supernatants were collected for luciferase assays. The data represent the relative fold change compared to the corresponding LG group, as well as the mean ± SE of 4 wells.

Fig. 16A to 16E are a series of photographs, blots, graphs and schematics demonstrating Exen21 boosting of mRNA vaccine production and efficiency. Related to fig. 12A to 12G.

Fig. 16A: schematic representation of in vitro transcription and 5'-cap modification.

Fig. 16B: gel electrophoresis (1% agarose) images showing transcript length, integrity and quantity for both C0 and C1 5'-capped mRNA.

Fig. 16C: the expression of the dual reporter increased 10- to 30-fold at equivalent levels of functional mRNA of viral genes N, E and ORF3. The capped (Cap-C0) and tailed mRNAs of the indicated targets were synthesized using the HiScribe T7 ARCA mRNA kit (NEB, E2065) and cDNA templates from the corresponding linearized plasmids. Half of the Cap-C0 mRNA was further methylated using the mRNA Cap 2'-O-methyltransferase (NEB, M0366) at the 2'-O position of the first nucleotide adjacent to the Cap-C0 structure. Both Cap-C0 and Cap-C1 mRNA were purified using the Monarch RNA cleanup kit (NEB, T2040). HEK293T cells were transfected with the indicated mRNA (100 ng/well) in 96-well plates. The supernatants were collected 24 hours after transfection for the NanoLight Gaussia luciferase assay. The data represent the relative fold change compared to the corresponding LG group, as well as the mean ± SE of 4 independent experiments.

Fig. 16D, 16E: representative fluorescent GFP images of living cells 24 hours after transfection with the indicated vector. The scale bar is 100 μm.

Figures 17A to 17E are a series of blots and graphs showing that Qα tagging strongly increases the protein level of secreted E-QLG in the supernatant, as detected by western blot analysis and gdLuc assay of the supernatant. Related to fig. 6C, 6D.

Fig. 17A, 17B: western blot obtained using anti-gdLuc monoclonal antibody (Proteintech, catalog number 60158-1-Ig).

Fig. 17C, 17D: western blot obtained using anti-GFP polyclonal antibody (Proteintech, catalog number 50430-2-AP).

Fig. 17E: the relative fold change in boosting efficiency as measured by gdLuc assay. HEK293T cells were transfected in triplicate with the indicated vector (100 ng/well) for 24 hours and cultured with FreeStyle™ 293 expression medium for 48 hours before analysis.

Fig. 18A to 18D are a series of graphs and blots demonstrating that the ER-Golgi trafficking inhibitor brefeldin blocks secretion of Qα-tagged viral proteins and host cell proteins. Related to fig. 13A to 13G.

Fig. 18A, 18B: relative gdLuc activity in the supernatant (fig. 18A) and cell lysate (fig. 18B) following brefeldin treatment.

Fig. 18C, 18D: western blot obtained using anti-gdLuc monoclonal antibody (Proteintech, cat. No. 60158-1-Ig) and anti-GFP polyclonal antibody (Proteintech, cat. No. 50430-2-AP). HEK293T cells were transfected with the indicated vector (50 ng/well) in quadruplicate for 24 hours and cultured with FreeStyle™ 293 expression medium for 48 hours before analysis.

FIGS. 19A through 19C are a series of protocol diagrams and tables showing SARS-CoV-2 UTR-E-Flag-Qα-UTR synthesis and cloning. Fig. 19A: schematic representation of the synthesis of 5'-UTR-E-Flag-Qα-3'-UTR. Fig. 19B: the synthesized nucleotide sequence (946 bp) was cloned into the pCAG-Flag expression vector by NEBuilder HiFi DNA assembly. Fig. 19C: list of the cloning strategies used to obtain the indicated vectors for the E and S proteins fused to QLG dual reporters.

FIGS. 20A through 20C are a series of photographs of stained cells showing that both the 5'-UTR and the 3'-UTR significantly enhance promoter-driven expression of Qα-tagged E protein in HEK293T cells. HEK293T cells were transfected in triplicate (100 ng/well) with the indicated vectors in 96-well plates. At 48 hours post-transfection, cells were fixed with 4% PFA for 10 min and immunocytochemistry was performed using anti-Flag antibodies. Fig. 20A: representative confocal images. Fig. 20B: average fluorescence intensity determined by ImageJ analysis of 6 fields from 3 wells. Fig. 20C: western blot analysis using anti-Flag antibody, with anti-GAPDH as the loading control.

FIGS. 21A to 21D are a series of protocol diagrams, photographs of stained cells and graphs showing that the addition of the 5'-UTR between the CAG promoter and the S-Flag-QLG dual reporter enhances S protein expression. Fig. 21A: schematic of the dual reporter design, comprising a secretable Gaussia Dura luciferase (gdLuc) plus a P2A self-cleavable destabilized GFP (dsGFP), and of the various measurements used to assess expression of the protein of interest (here the SARS-CoV-2 viral protein). The novel Q tag is located between the target protein and gdLuc. Fig. 21B to 21D: HEK293T cells were transfected in triplicate with the indicated vectors in 96-well plates with the indicated amounts of DNA (12.5-100 ng/well). At 24 to 72 hours post-transfection, EGFP images were taken (fig. 21B), and supernatants were collected for NanoLight Gaussia luciferase assays (fig. 21C, 21D). The data represent relative light units of bioluminescence (fig. 21C) or fold change compared to the corresponding non-UTR group (fig. 21D), and the mean ± SE of 4 wells.

FIGS. 22A through 22C are a series of blots and graphs showing that addition of the 5'-UTR to the pCAG, pcDNA6B and pRRL vectors dramatically increased protein expression of the transgene. HEK293T cells were transfected with the indicated vectors (500 ng/well in fig. 22A; 100 ng/well in fig. 22B, 22C) in 24-well plates (fig. 22A) or 96-well plates (fig. 22B, 22C). At 48 hours after transfection, EGFP expression was determined by western blotting (fig. 22A), and supernatants were collected for the NanoLight Gaussia luciferase assay (fig. 22B, 22C). The data represent the fold change versus the corresponding non-UTR LG groups (fig. 22B) or relative light units of bioluminescence (fig. 22C), as well as the mean ± SE of 4 wells.

FIGS. 23A, 23B are a series of graphs showing that the addition of the 5'-UTR upstream of in vitro transcribed mRNA significantly enhances protein expression in HEK293T cells. HEK293T cells in 96-well plates were transfected, using MessengerMAX mRNA transfection reagent, with the indicated mRNAs (50 ng/well) generated by in vitro transcription with a 5'-cap and a 3'-poly(A) tail. The mRNA encodes the indicated viral protein (E or S protein) or endogenous hACE2 protein fused to the dual reporter LG or QLG. At 6 to 24 hours post-transfection, EGFP images were taken (data not shown) and supernatants were collected for the NanoLight Gaussia luciferase assay. The data represent the relative fold change compared to the corresponding LG group, as well as the mean ± SE of 4 wells.

FIGS. 24A to 24F are a series of protocol diagrams, photographs of stained cells, blots and graphs showing that Qα tagging and 5'-UTR inclusion enhance the packaging and transduction efficiencies of SARS-CoV-2 S protein-pseudotyped lentiviral-like particles (S-LVLPs).

Fig. 24A: schematic of different vectors for expressing human codon optimized Sd18, and S-LVLP packaging process.

Fig. 24B: Qα and the 5'-UTR increased Sd18 protein expression in transfected cells, as determined by western blot using serum from SARS-CoV-2 patients.

Fig. 24C to 24D: Qα tagging increased S-LVLP packaging titers for the standard pRRL-GFP LV vector, as determined by GFP positivity, and the titers were further increased by inclusion of the 5'-UTR.

Fig. 24E to 24F: Qα tagging and 5'-UTR inclusion increased the S-LVLP packaging titers for the dual reporter LV vectors pRRL-LG or pRRL-E-LG, as determined by GFP positivity and gdLuc activity.

Detailed Description

The present disclosure is based in part on the unexpected discovery that the oligonucleotide CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7) encodes a short peptide (referred to herein as "Qα") that significantly enhances expression/production of the fusion protein. Further extended studies identified a variety of properties of Exen21/Qα tagging in enhancing (by up to a thousand-fold) the production of various proteins, including viral proteins, endogenous gene products, vaccines, antibodies, engineered recombinant proteins, and viral packaging proteins. Effective enhancement of protein production by the native SARS-CoV-2 5'-UTR, and its synergy with Qα tagging, have also been discovered. Mechanistically, Qα increases mRNA/protein stability and/or enhances protein translation, and also promotes protein secretion. These various protein boosting strategies would provide a wide range of benefits to the biomedical science and protein engineering industries. This is the first evidence showing the role of short peptide tagging and the native SARS-CoV-2 5'-UTR in protein regulation and enhancement.
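As an illustrative check of the coding relationship stated above, the following minimal Python sketch translates the 21-mer of SEQ ID NO: 7 using the standard genetic code. The names `CODON_TABLE`, `EXEN21` and `translate` are hypothetical helpers introduced here for illustration only and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Oligonucleotide given in the text as SEQ ID NO: 7.
EXEN21 = "CAACCGCGGTTCGCGGCCGCT"


def translate(dna):
    """Translate an in-frame DNA (or RNA) sequence into one-letter amino acids."""
    dna = dna.upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print(translate(EXEN21))  # QPRFAAA
```

Under the standard code, the seven codons CAA-CCG-CGG-TTC-GCG-GCC-GCT read as Q-P-R-F-A-A-A, which agrees with the peptide given elsewhere in the text as SEQ ID NO: 1.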

Accordingly, embodiments are directed to novel chimeric molecules comprising a peptide tag and a 5'-untranslated region (5'-UTR) for enhanced production and expression of a desired biomolecule.

5'- and 3'-untranslated regions (UTRs)

An untranslated region (or UTR) refers to either of two segments, one on each side of the coding sequence on a strand of mRNA. If it is found on the 5' side, it is referred to as the 5'-UTR (or leader sequence); if it is found on the 3' side, it is referred to as the 3'-UTR (or trailer sequence). mRNA is transcribed from the corresponding DNA sequence and is subsequently translated into protein. However, several regions of the mRNA are not normally translated into protein, including the 5'- and 3'-UTRs.

Within the 5'-UTR is a sequence recognized by the ribosome that allows the ribosome to bind and initiate translation. The mechanism of translation initiation differs between prokaryotic and eukaryotic cells. The 3'-UTR is found immediately following the translation stop codon. The 3'-UTR plays a key role in translation termination and post-transcriptional modification.

In this study, it was found that the presence of the native 5'-UTR and 3'-UTR in standard protein expression systems strongly enhanced the expression of viral subgenomic transcripts and the subsequent production of viral proteins. It has also been found that the combination of the native 5'-UTR with a short peptide (referred to herein as the Qα peptide) further facilitates the production of viral and non-viral proteins.

Accordingly, in certain embodiments, a chimeric molecule for enhancing expression and production of a desired biomolecule comprises one or more short peptide domains and one or more UTRs. In certain embodiments, the UTR is a 5' -UTR. In certain embodiments, the UTR is a 3' -UTR.

In certain embodiments, the one or more 5'-untranslated region (UTR) domains or fragments thereof are derived from one or more viruses. In certain embodiments, the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, enveloped viruses, orthomyxoviruses, rhabdoviruses, or combinations thereof. In certain embodiments, the 5'-UTR and/or 3'-UTR is from a coronavirus. In certain embodiments, the coronavirus is SARS-CoV-2.

In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 5'-UTR. In certain embodiments, the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 5'-UTR.

In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to the SARS-CoV-2 3'-UTR. In certain embodiments, the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise the SARS-CoV-2 3'-UTR.
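The sequence identity thresholds recited above (at least 70%, at least 90%) can be illustrated with a minimal Python sketch. The text does not specify an alignment method, so this sketch assumes, as a simplification, that the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identical columns over all aligned columns. The function names and the toy sequences are hypothetical; they are not actual SARS-CoV-2 UTR sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: '-' marks a gap; columns where both sequences have a gap
    are ignored, and a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a, aligned_b, threshold):
    """True if percent identity is at least the given threshold (e.g., 70 or 90)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration only.
    reference = "ACGUACGUAC"
    candidate = "ACGUACGAAC"
    print(percent_identity(reference, candidate))
    print(meets_identity(reference, candidate, 90))
```

Other identity definitions (for example, dividing by the shorter sequence length, or computing the alignment with a dedicated tool) would give different values for the same pair of sequences, so the denominator should be stated whenever a threshold is applied.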

In certain embodiments, the one or more UTR sequences are engineered to include the Shine-Dalgarno sequence (5'-AGGAGGU-3'). This sequence is found 3 to 10 base pairs upstream of the start codon. In certain embodiments, the one or more UTR sequences are engineered to contain a Kozak consensus sequence (ACCAUGG).
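As a minimal sketch of how an engineered UTR could be checked for the two motifs named above, the following Python example locates start codons embedded in the Kozak consensus and tests whether a Shine-Dalgarno sequence lies 3 to 10 nucleotides upstream of a start codon. The function names and toy sequences are hypothetical, and the spacing is interpreted here as the number of nucleotides between the end of the motif and the first base of the start codon, which is an assumption.

```python
SHINE_DALGARNO = "AGGAGGU"  # motif as given in the text
KOZAK = "ACCAUGG"           # consensus as given in the text


def kozak_start_sites(mrna):
    """Return 0-based indices of every AUG that sits inside ACCAUGG."""
    sites = []
    pos = mrna.find(KOZAK)
    while pos != -1:
        sites.append(pos + 3)  # the AUG begins 3 nt into the consensus
        pos = mrna.find(KOZAK, pos + 1)
    return sites


def has_shine_dalgarno(mrna, start_idx, min_gap=3, max_gap=10):
    """True if the motif ends min_gap..max_gap nt before the start codon."""
    for gap in range(min_gap, max_gap + 1):
        end = start_idx - gap
        begin = end - len(SHINE_DALGARNO)
        if begin >= 0 and mrna[begin:end] == SHINE_DALGARNO:
            return True
    return False


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(kozak_start_sites("GCCACCAUGGCU"))
    toy = "GG" + SHINE_DALGARNO + "AAAAA" + "AUGGCU"
    print(has_shine_dalgarno(toy, toy.find("AUG")))
```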

In certain embodiments, one or more of the 5'-UTR sequences (or nucleic acid molecules that each comprise a 5'-UTR sequence) may comprise a synthetic sequence (i.e., a sequence that is not found in nature).

In certain embodiments, one or more of the 5'-UTR sequences (or nucleic acid molecules that each comprise a 5'-UTR sequence) may comprise an endogenous 5'-UTR sequence (i.e., a 5'-UTR sequence that is used in nature to recruit ribosomal complexes and initiate translation of transcripts). For example, the endogenous 5'-UTR sequence may be part of an mRNA expressed in a cell or cell population. The cells in the population of cells may be the same type of cells (e.g., HEK-293 cells, PC3 cells, or muscle cells). Alternatively, the cell populations may include different cell types (e.g., HEK-293 cells, PC3 cells, and muscle cells). Methods for identifying mRNAs expressed in a cell or cell population and methods for identifying the 5'-UTR sequence of mRNAs are known to those of ordinary skill in the art. Indeed, various public databases contain cellular mRNA expression and/or 5'-UTR sequence information.

The length of the 5'-UTR sequence (or nucleic acid molecules each comprising a 5'-UTR sequence) is variable. For example, in some embodiments, at least two of the 5'-UTR sequences have different lengths. In some embodiments, at least two of the 5'-UTR sequences have the same length. In some embodiments, each of the 5'-UTR sequences has the same length. In some embodiments, at least one of the 5'-UTR sequences in the initial chimeric molecule is 3, 4, 5, 6, 7, 8, 9, or 10 base pairs in length.

In some embodiments, at least one of the 5'-UTR sequences (or nucleic acid molecules each comprising a 5'-UTR sequence) has a length of at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 150, at least 200, at least 250, at least 300, at least 350, at least 400, at least 450, at least 500, at least 550, at least 600, at least 650, at least 700, at least 750, at least 800, at least 850, at least 900, at least 950, at least 1000, at least 1500, at least 2000, or at least 3000 base pairs. In some embodiments, each of the 5'-UTR sequences (or nucleic acid molecules each comprising a 5'-UTR sequence) is at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 150, at least 200, at least 250, at least 300, at least 350, at least 400, at least 450, at least 500, at least 550, at least 600, at least 650, at least 700, at least 750, at least 800, at least 850, at least 900, at least 950, at least 1000, at least 1500, at least 2000, or at least 3000 base pairs in length.

In some embodiments, the chimeric molecule comprises one or more coronavirus 5'-UTR and/or 3'-UTR sequences, and the length of at least one UTR sequence is increased to the length of interest by adding nucleotides to one or both ends (e.g., by adding a repeat sequence of a motif that does not have a known secondary structure). Nucleotides may be added at the 5' end, the 3' end, or both the 5' and 3' ends of the 5'-UTR and/or 3'-UTR sequences. In some embodiments, the length of one or more 5'- or 3'-UTR sequences is reduced to the length of interest by removing nucleotides at one or both ends. Nucleotides may be removed from the 5' end, the 3' end, or both the 5' and 3' ends of the 5'-UTR sequence.

In certain embodiments, the UTR sequence comprises one or more mutations. Mutations can be introduced using genetic algorithms. Examples of genetic algorithms are known to those skilled in the art. See, e.g., Scrucca, L. GA: A Package for Genetic Algorithms in R. J. Stat. Softw. (2015). doi:10.18637/jss.v053.i04. The number of mutations introduced in each of the UTR sequences is variable. In some embodiments, at least one UTR sequence is mutated at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotide positions. Mutations may include base pair substitutions, deletions or insertions.
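The mutation step described above can be sketched in Python as follows. This is only a substitution operator of the kind a genetic algorithm might apply to a candidate sequence; it is not a full genetic algorithm (no fitness function, selection or crossover) and it does not cover deletions or insertions. The function names, the seed and the toy sequence are hypothetical.

```python
import random


def mutate_substitutions(seq, n_mut, rng, alphabet="ACGU"):
    """Return a copy of seq with exactly n_mut positions substituted.

    Each chosen position is replaced by a different base from alphabet,
    so the result differs from seq at exactly n_mut positions.
    """
    if not 0 <= n_mut <= len(seq):
        raise ValueError("n_mut must be between 0 and len(seq)")
    positions = rng.sample(range(len(seq)), n_mut)
    mutated = list(seq)
    for pos in positions:
        choices = [base for base in alphabet if base != mutated[pos]]
        mutated[pos] = rng.choice(choices)
    return "".join(mutated)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    toy_utr = "ACGUACGUACGUACGUACGUACGU"  # toy sequence, illustration only
    variant = mutate_substitutions(toy_utr, 5, rng)
    print(variant, hamming(toy_utr, variant))
```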

Modification: in certain embodiments, the UTR comprises one or more chemically modified nucleotides (see, e.g., Current Opinion in Drug Discovery and Development, 2007, 10:523). Kormann et al. have demonstrated that replacement of only 25% of uridine and cytidine by 2-thiouridine and 5-methyl-cytidine is sufficient to increase mRNA stability and reduce in vitro activation of innate immunity triggered by externally administered mRNA (WO 2012/0195936A1; WO2007024708 A2). Known modifications of RNA molecules can be found, for example, in Genes VI, Chapter 9 ("Interpreting the Genetic Code"), Lewis, ed. (1997, Oxford University Press, New York), and in Modification and Editing of RNA, Grosjean and Benne, eds. (1998, ASM Press, Washington D.C.). Modified RNA components include the following: 2'-O-methylcytidine; N⁴-methylcytidine; N⁴,2'-O-dimethylcytidine; N⁴-acetylcytidine; 5-methylcytidine; 5,2'-O-dimethylcytidine; 5-hydroxymethylcytosine; 5-formylcytidine; 2'-O-methyl-5-formylcytidine; 3-methylcytidine; 2-thiocytidine; lysidine; 2'-O-methyluridine; 2-thiouridine; 2-thio-2'-O-methyluridine; 3,2'-O-dimethyluridine; 3-(3-amino-3-carboxypropyl)uridine; 4-thiouridine; ribosylthymine; 5,2'-O-dimethyluridine; 5-methyl-2-thiouridine; 5-hydroxyuridine; 5-methoxyuridine; uridine-5-oxyacetic acid; uridine-5-oxyacetic acid methyl ester; 5-carboxymethyluridine; 5-methoxycarbonylmethyluridine; 5-methoxycarbonylmethyl-2'-O-methyluridine; 5-methoxycarbonylmethyl-2'-thiouridine; 5-carbamoylmethyluridine; 5-carbamoylmethyl-2'-O-methyluridine; 5-(carboxyhydroxymethyl)uridine; 5-(carboxyhydroxymethyl)uridine methyl ester; 5-aminomethyl-2-thiouridine; 5-methylaminomethyluridine; 5-methylaminomethyl-2-thiouridine; 5-methylaminomethyl-2-selenouridine; 5-carboxymethylaminomethyluridine; 5-carboxymethylaminomethyl-2'-O-methyluridine; 5-carboxymethylaminomethyl-2-thiouridine; dihydrouridine; dihydroribosylthymine; 2'-methyladenosine; 2-methyladenosine; N⁶-methyladenosine; N⁶,N⁶-dimethyladenosine; N⁶,2'-O-trimethyladenosine; 2-methylsulfanyl-N⁶-isopentenyladenosine; N⁶-(cis-hydroxyisopentenyl)adenosine; 2-methylsulfanyl-N⁶-(cis-hydroxyisopentenyl)adenosine; N⁶-glycinylcarbamoyladenosine; N⁶-threonylcarbamoyladenosine; N⁶-methyl-N⁶-threonylcarbamoyladenosine; 2-methylsulfanyl-N⁶-methyl-N⁶-threonylcarbamoyladenosine; N⁶-hydroxynorvalylcarbamoyladenosine; 2-methylsulfanyl-N⁶-hydroxynorvalylcarbamoyladenosine; 2'-O-ribosyladenosine (phosphate); inosine; 2'-O-methylinosine; 1-methylinosine; 1,2'-O-dimethylinosine; 2'-O-methylguanosine; 1-methylguanosine; N²-methylguanosine; N²,N²-dimethylguanosine; N²,2'-O-dimethylguanosine; N²,N²,2'-O-trimethylguanosine; 2'-O-ribosylguanosine (phosphate); 7-methylguanosine; N²,7-dimethylguanosine; N²,N²,7-trimethylguanosine; wyosine; methylwyosine; under-modified hydroxywybutosine; wybutosine; hydroxywybutosine; peroxywybutosine; queuosine; epoxyqueuosine; galactosyl-queuosine; mannosyl-queuosine; 7-cyano-7-deazaguanosine; archaeosine [also known as 7-carboxamide-7-deazaguanosine]; and 7-aminomethyl-7-deazaguanosine.

In some embodiments, the UTR is a synthetic oligonucleotide. In some embodiments, the synthetic oligonucleotide comprises modified nucleotides. Modifications of the internucleoside linker (i.e., backbone) can be utilized to increase stability or improve pharmacodynamic properties. For example, internucleoside linker modifications prevent or reduce degradation by cellular nucleases, thus improving the pharmacokinetics and bioavailability of UTRs. In general, modified internucleoside linkers include any linker, other than a phosphodiester (PO) linker, that covalently couples two nucleosides together. In some embodiments, the modified internucleoside linker increases nuclease resistance of the UTR compared to the phosphodiester linker. For naturally occurring oligonucleotides, the internucleoside linker comprises a phosphate group that creates a phosphodiester linkage between adjacent nucleosides. In some embodiments, the UTR comprises one or more internucleoside linkers modified from a natural phosphodiester. In some embodiments, all internucleoside linkers of the UTRs, or contiguous nucleotide sequences thereof, are modified. For example, in some embodiments, the internucleoside linkage comprises sulfur (S), such as a phosphorothioate internucleoside linkage.

Modifications to ribose or nucleobases may also be utilized herein. In general, modified nucleosides include incorporation of one or more modifications to the sugar moiety or nucleobase moiety. In some embodiments, a UTR as described comprises one or more nucleosides comprising a modified sugar moiety, wherein the modified sugar moiety is a modification to the sugar moiety when compared to a ribose sugar moiety found in deoxyribonucleic acid (DNA) and RNA. A large number of nucleosides with modifications to the ribose moiety may be utilized, primarily for the purpose of improving certain properties of the oligonucleotide, such as affinity and/or stability. Such modifications include those in which the ribose ring structure is modified. These modifications include substitution with hexose rings (HNAs), bicyclic rings with a biradical bridge between the C2 and C4 carbons of the ribose ring (e.g., locked nucleic acids (LNAs)), or unlinked ribose rings that typically lack a bond between the C2 and C3 carbons (e.g., UNAs). Other sugar-modified nucleosides include, for example, bicyclic hexose nucleic acids or tricyclic nucleic acids. Modified nucleosides also include nucleosides in which the sugar moiety is replaced with a non-sugar moiety, for example, in the case of peptide nucleic acids (PNAs) or morpholino nucleic acids.

Sugar modifications also include modifications made by changing substituents on the ribose ring to groups other than hydrogen, or by changing the 2'-OH group naturally found in DNA and RNA nucleosides. Substituents may be introduced, for example, at the 2', 3', 4' or 5' positions. Nucleosides having modified sugar moieties also include 2' modified nucleosides, such as 2' substituted nucleosides. Indeed, much attention has been devoted to developing 2' substituted nucleosides, and a large number of 2' substituted nucleosides have been found to have beneficial properties when incorporated into oligonucleotides, such as enhanced nuclease resistance and enhanced affinity. A 2' sugar-modified nucleoside is a nucleoside having a substituent other than H or -OH at the 2' position (2' substituted nucleoside) or comprising a 2'-linked biradical, and includes 2' substituted nucleosides and LNA (2'-4' biradical bridged) nucleosides. Examples of 2'-substituted modified nucleosides are 2'-O-alkyl-RNA, 2'-O-methyl-RNA, 2'-alkoxy-RNA, 2'-O-methoxyethyl-RNA (MOE), 2'-amino-DNA, 2'-fluoro-RNA and 2'-F-ANA nucleosides. By way of further example, in some embodiments, the modification in the ribose group comprises a modification at the 2' position of the ribose group. In some embodiments, the modification at the 2' position of the ribose group is selected from the group consisting of 2'-O-methyl, 2'-fluoro, 2'-deoxy, and 2'-O-(2-methoxyethyl).

In some embodiments, the UTR comprises one or more modified sugars. In some embodiments, the gRNA comprises only modified sugars. In certain embodiments, the gRNA comprises more than 10%, 25%, 50%, 75%, or 90% modified sugar. In some embodiments, the modified sugar is a bicyclic sugar. In some embodiments, the modified sugar comprises a 2' -O-methoxyethyl group. In some embodiments, the UTR comprises both an internucleoside linker modification and a nucleoside modification.

In other aspects, the chimeric molecule comprises an internal ribosome entry site (IRES). An IRES, as understood in the art, is an RNA element that allows translation to be initiated in a cap-independent manner. In an exemplary embodiment, the IRES is in the 5'-UTR. In other embodiments, the IRES may be outside the 5'-UTR.

Peptide domains

As described above, chimeric molecules for enhancing expression and production of a desired biomolecule comprise one or more short peptide domains and one or more UTRs.

In certain embodiments, the chimeric molecule comprises one or more peptide domains. In certain embodiments, the one or more peptide domains comprise from about five amino acids to about twenty amino acids. In certain embodiments, the one or more peptide domains comprise about seven amino acids. In certain embodiments, the synthetic peptide tag comprises an amino acid sequence having at least about 70% (such as at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater) sequence identity to sequence QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptide domains comprise an amino acid sequence having at least 70% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptide domains comprise an amino acid sequence having at least 90% sequence identity to QPRFAAA (SEQ ID NO: 1). In certain embodiments, the one or more peptide domains comprise amino acid sequence QPRFAAA (SEQ ID NO: 1).

In certain embodiments, the chimeric molecule comprises one or more peptide domains comprising the amino acid sequence X_n-QPRFAAA-X_n, wherein each n is independently 0 or an integer greater than or equal to 1, and each X is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof.
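The X_n-QPRFAAA-X_n pattern can be illustrated with a minimal Python sketch that locates the core tag in a one-letter protein sequence and reports the lengths of the two flanking stretches. The function name, the use of U and O as one-letter codes for selenocysteine and pyrrolysine, and the toy sequences are assumptions for illustration; modified amino acids are not represented.

```python
CORE_TAG = "QPRFAAA"  # SEQ ID NO: 1 as given in the text

# 20 standard residues plus U (selenocysteine) and O (pyrrolysine).
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYUO")


def locate_tag(protein):
    """Return (n_left, n_right) flank lengths around the first QPRFAAA.

    Returns None if the core tag is absent. Raises ValueError if the
    sequence contains a character outside the allowed one-letter codes.
    """
    protein = protein.upper()
    unknown = set(protein) - ALLOWED
    if unknown:
        raise ValueError("unsupported residue codes: " + "".join(sorted(unknown)))
    start = protein.find(CORE_TAG)
    if start == -1:
        return None
    n_left = start
    n_right = len(protein) - start - len(CORE_TAG)
    return (n_left, n_right)


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(locate_tag("QPRFAAA"))      # (0, 0)
    print(locate_tag("MKQPRFAAAGS"))  # (2, 2)
    print(locate_tag("MKQPRFAAGS"))   # None
```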

In certain embodiments, the one or more peptide domains comprise one or more unnatural amino acids or modified amino acids. Examples of modified amino acids include amino acids that have been phosphorylated, acetylated, glycosylated, carboxylated, hydroxylated, sulfated, and the like. Examples of unnatural amino acids include D-amino acids, homo amino acids, N-methyl amino acids, alpha-methyl amino acids, beta-homo amino acids, gamma amino acids, helix/turn stabilizing motifs, and backbone modifications (e.g., peptidomimetics). Other examples of contemplated amino acids include hydroxyproline (Hyp), beta-alanine, citrulline (Cit), ornithine (Orn), norleucine (Nle), 3-nitrotyrosine, nitroarginine, and pyroglutamic acid (Pyr).

Biomolecules of interest

The fusion proteins or chimeric molecules of the present disclosure, e.g., peptide domains and/or UTR sequences, are associated with a biomolecule such as a protein; for example, a tagged protein is obtained by attaching a peptide tag to a target protein (also referred to as a fusion protein of the tag with the target protein). One or more chimeric molecules may be attached to the N-terminus of the target protein, to the C-terminus of the target protein, or to both the N-terminus and the C-terminus, or may be inserted within an interior region of the target protein. The one or more chimeric molecules may be attached directly to the N-terminus and/or C-terminus of the target protein, or may be attached through a linker sequence of one to several amino acids (e.g., 1 to 10 amino acids). The linker sequence may be any sequence as long as it does not negatively affect the function or the expression level of the chimeric molecule-target protein fusion. In addition, the chimeric molecule may be separated from the target protein after expression and purification by using a protease recognition sequence.

In certain embodiments, at least one or more chimeric molecules are associated with one or more biomolecules of interest. Examples of biomolecules include cytokines, growth factors, viral antigens, tumor antigens, polynucleotides, oligonucleotides, hormones, enzymes, checkpoint proteins, antigens, antibodies, transcription factors, receptors, ligands, immunoglobulins, immunoglobulin fragments, fluorescent proteins, and the like. The length of the biomolecule of interest (e.g., peptide) can vary, so long as the amount of the target biomolecule (e.g., peptide) produced is significantly increased when expressed in the form of a fusion peptide/chimeric molecule.

Examples of enzymes include lipases, proteases, steroid synthases, kinases, phosphatases, xylanases, esterases, methylases, demethylases, oxidases, reductases, cellulases, aromatases, collagenases, transglutaminases, glycosidases and chitinases. Growth factors include, for example, epidermal growth factor (EGF), insulin-like growth factor (IGF), transforming growth factor (TGF), nerve growth factor (NGF), brain-derived neurotrophic factor (BDNF), vascular endothelial growth factor (VEGF), granulocyte colony-stimulating factor (G-CSF), granulocyte-macrophage colony-stimulating factor (GM-CSF), platelet-derived growth factor (PDGF), erythropoietin (EPO), thrombopoietin, fibroblast growth factor (FGF), and hepatocyte growth factor (HGF). Examples of hormones include insulin, glucagon, somatostatin, growth hormone, parathyroid hormone, prolactin, leptin and calcitonin. Examples of cytokines include interleukins, interferons (IFNα, IFNβ, IFNγ), and tumor necrosis factor (TNF). Blood proteins include, for example, thrombin, serum albumin, factor VII, factor X, and tissue-type plasminogen activator. Antibody proteins include, for example, F(ab')₂, Fc fusion proteins, heavy chains (H), light chains (L), single-chain Fv (scFv), sc(Fv)₂, disulfide-linked Fv (sdFv), and diabodies.

Immune checkpoint proteins are well known in the art and include, but are not limited to, CTLA-4, PD-1, VISTA, B7-H2, B7-H3, PD-L1, B7-H4, B7-H6, 2B4, ICOS, HVEM, PD-L2, CD160, gp49B, PIR-B, KIR family receptors, TIM-1, TIM-3, TIM-4, LAG-3, BTLA, SIRPalpha (CD47), CD48, 2B4 (CD244), B7.1, B7.2, ILT-2, ILT-4, TIGIT and A2aR.

The antigen may be appropriately selected depending on the target of the immune response (e.g., a pathogen-derived protein or a pathogenic-virus-derived protein).

The chimeric molecule may be combined with a secretory signal peptide that functions in a host cell for secretory production. When yeast is used as a host, the secretory signal peptide may be exemplified by an invertase secretion signal. In certain embodiments, the secretory signal is obtained from two or more different sources. Various sources include, for example, Bacillus species, Lactococcus lactis, Streptomyces or Corynebacterium. Other signal sequences include, for example, those of human IL-2, human chymotrypsin, human interferon gamma, and the like.

In certain embodiments, the chimeric molecule may be supplemented with a transport signal peptide, such as an endoplasmic reticulum retention signal peptide or a vacuolar transport signal peptide, for expression in a particular cellular compartment.

Chimeric biomolecules may be chemically synthesized or may be produced by genetic engineering. The DNA of the present disclosure is characterized by comprising a nucleic acid encoding a chimeric molecule of the present disclosure.

The DNA of the present disclosure may contain enhancer sequences or the like that function in the host cell to improve expression in the host cell. Examples of enhancers include the Kozak sequence and the 5'-untranslated region of a plant-derived alcohol dehydrogenase gene.

Constructs

The gene construct or vector comprises a nucleotide sequence encoding a desired protein operably linked to regulatory elements required for gene expression. Accordingly, the incorporation of a DNA or RNA molecule into a living cell results in the expression of the DNA or RNA encoding the desired protein and thus in the production of the desired protein. The chimeric molecules of the present disclosure may be produced by general genetic engineering techniques, for example, by using a recombinant vector encoding a chimeric molecule. The recombinant vector of the present disclosure is not particularly limited as long as the nucleic acid sequence encoding the chimeric molecule is inserted into the vector so that it can be expressed in the host cell into which the vector is introduced. The vector is not particularly limited as long as it is replicable in a host cell, and examples thereof include plasmid DNA and viral DNA. Regulatory elements necessary for gene expression of DNA molecules include promoters, start codons, stop codons, and polyadenylation signals. Furthermore, enhancers are often required for gene expression. It is necessary that these elements be operably linked to the sequence encoding the desired protein, and that the elements be operable in the individual to whom they are administered.

The start codon and stop codon are generally considered to be part of the nucleotide sequence encoding the desired protein. However, it is essential that these elements function in the individual to whom the genetic construct is administered. The start and stop codons must be in frame with the coding sequence.
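The in-frame requirement can be expressed as a simple check on a coding sequence. The sketch below is illustrative only: it assumes a DNA-alphabet coding sequence that begins with ATG and uses the standard stop codons, and the function name `is_in_frame_orf` is hypothetical. Real constructs may need additional checks (for example, alternative start codons or read-through designs) that are not covered here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def is_in_frame_orf(cds: str) -> bool:
    """True if cds starts with ATG, ends with a stop codon, has a length
    that is a multiple of three, and contains no premature stop codon."""
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon before the last position would truncate the protein.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])
```

Because an insertion whose length is a multiple of three leaves the downstream reading frame unchanged, a tag-encoding insert of that kind can be placed between a start and a stop codon without shifting the frame, provided the insert itself contains no in-frame stop codon.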

The molecule encoding the desired protein may be DNA or RNA comprising a nucleotide sequence encoding the desired protein. These molecules may be cDNA, genomic DNA, synthetic DNA or hybrids thereof, or RNA molecules such as mRNA. Accordingly, as used herein, the terms "DNA construct", "gene construct", "nucleotide sequence", "nucleic acid" are intended to refer to DNA and RNA molecules.

When taken up by a cell, a genetic construct comprising a nucleotide sequence encoding a desired protein operably linked to a regulatory element may remain present in the cell as an extrachromosomal functional molecule, or it may be integrated into the chromosomal DNA of the cell. The DNA may be introduced into the cell where it is maintained as separate genetic material in the form of a plasmid. Alternatively, linear DNA that can be integrated into the chromosome can be introduced into the cell. When introducing DNA into cells, reagents may be added that promote integration of the DNA into the chromosome. DNA sequences useful for promoting integration may also be included in the DNA molecule. Alternatively, the RNA can be administered to the cell. It is also contemplated to provide the gene construct as a linear minichromosome comprising a centromere, a telomere and an origin of replication.

Accordingly, in certain embodiments, the present disclosure includes a vector comprising one or more cassettes comprising: a UTR, a biomolecule, and a peptide tag domain, e.g., the Qα tag (SEQ ID NO: 1). The vector may be any vector known in the art that is suitable for expressing the desired expression cassette. A number of vectors are known or can be designed to be capable of mediating transfer of gene products to mammalian cells, as known in the art and described herein. In certain aspects, a vector refers to a polynucleotide to be delivered to a host cell in vitro or in vivo. In some embodiments, one or more cassettes are provided on a single vector. In some embodiments, one or more cassettes are provided on two or more vectors. In some embodiments, the cassette is provided by one or more vectors comprising an isolated nucleic acid encoding one or more elements of a gene editing system. In some embodiments, the cassette is provided by one or more vectors comprising isolated nucleic acids encoding one or more components comprising: a UTR, a biomolecule, and a peptide tag. In some cases, expression of a natural or synthetic nucleic acid encoding an RNA and/or peptide is typically achieved by operably linking the nucleic acid encoding the RNA and/or peptide or portion thereof to a promoter and incorporating the construct into an expression vector. The vector to be used is suitable for replication in eukaryotic cells and optionally for integration. Typical vectors contain transcriptional and translational terminators, initiation sequences, and promoters useful for regulating expression of the desired nucleic acid sequence.

The isolated nucleic acids of the present disclosure can be cloned into a variety of types of vectors. For example, the nucleic acid may be cloned into vectors including, but not limited to, plasmids, phagemids, phage derivatives, animal viruses and cosmids. Vectors of particular interest include expression vectors, replication vectors, probe-generating vectors and sequencing vectors.

Other promoter elements (e.g., enhancers) regulate the frequency of transcription initiation. In some embodiments, the vector further comprises a conventional control element operably linked to the transgene in a manner that allows transcription, translation and/or expression of the transgene in cells transfected with the plasmid vector or infected with a virus comprising a nucleic acid comprising the cassette or composition. As used herein, "operably linked" sequences include expression control sequences that are contiguous with the gene of interest as well as expression control sequences that act in trans or at a distance to control the gene of interest. Expression control sequences also include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation (polyA) signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (i.e., Kozak consensus sequences); sequences that enhance protein stability; and sequences that enhance secretion of the encoded product when desired. Numerous natural, constitutive, inducible and/or tissue-specific expression control sequences, including promoters, are known and available in the art.

Typically, these expression control sequences are located in a region 30-110 bp upstream of the start site, although a number of promoters have recently been shown to contain functional elements downstream of the start site as well. The spacing between promoter elements tends to be flexible, so that promoter function is preserved when the elements are inverted or moved relative to each other. In the thymidine kinase (tk) promoter, the spacing between promoter elements may be increased to 50 bp before the activity begins to decrease. Depending on the promoter, it appears that individual elements may function cooperatively or independently to activate transcription.

Selection of an appropriate promoter can be readily accomplished. In certain aspects, a high expression promoter will be used. The promoter and polyadenylation signal used must be functional within the cells of the individual. The promoter used in the vector may be appropriately selected depending on the host cell into which the vector is introduced. For example, when expressed in yeast, the GAL1 promoter, PGK1 promoter, TEF1 promoter, ADH1 promoter, TPI1 promoter, PYK1 promoter, etc. may be used. When expressed in plants, the cauliflower mosaic virus 35S promoter, rice actin promoter, maize ubiquitin promoter, lettuce ubiquitin promoter, and the like can be used. When expressed in E. coli, the T7 promoter and the like can be used. In the case of expression in Brevibacillus, there may be mentioned the P2 promoter, the P22 promoter, and the like. Inducible promoters may also be used. For example, in addition to lac, tac and trc, which can be induced by IPTG, there may be used trp, which can be induced by IAA; ara, which can be induced by L-arabinose; Pzt-1, which can be induced by tetracycline; the PL promoter, which can be induced at high temperature (42 °C); and the promoter of the cspA gene, which is one of the cold shock genes. Other examples of promoters that may be used to produce a genetic vaccine for human use include, but are not limited to, promoters from simian virus 40 (SV40), mouse mammary tumor virus (MMTV), human immunodeficiency virus (HIV) such as the HIV long terminal repeat (LTR) promoter, Moloney virus, ALV, cytomegalovirus (CMV) such as the CMV immediate early promoter, Epstein-Barr virus (EBV), Rous sarcoma virus (RSV), and promoters from human genes such as human actin, human myosin, human hemoglobin, human muscle creatine, and human metallothionein.
Examples of polyadenylation signals useful in the practice of the present disclosure, particularly in the production of genetic vaccines for human use, include, but are not limited to, SV40 polyadenylation signals and LTR polyadenylation signals. In particular, the SV40 polyadenylation signal present in the pCEP4 plasmid (Invitrogen, San Diego, Calif.), referred to as the SV40 polyadenylation signal, is used.

One example of a suitable promoter is the CAG promoter or the immediate early cytomegalovirus (CMV) promoter sequence. This promoter sequence is a strong constitutive promoter sequence capable of driving high levels of expression of any polynucleotide sequence to which it is operably linked. In certain embodiments, Rous sarcoma virus (RSV) and MMT promoters are also used. Some proteins may be expressed using their native promoters. Other elements that may enhance expression may also be included, for example, enhancers or systems that result in high levels of expression, such as the tat gene or tar element. The cassette may then be inserted into a vector, for example, a plasmid vector such as pUC19, pUC118, pBR322 or other known plasmid vectors, including, for example, an e.

Another example of a suitable promoter is elongation factor 1 alpha (EF-1α). However, in some embodiments, other constitutive promoter sequences are used, including, but not limited to, the simian virus 40 (SV40) early promoter, mouse mammary tumor virus (MMTV) promoter, human immunodeficiency virus (HIV) long terminal repeat (LTR) promoter, MoMuLV promoter, avian leukemia virus promoter, Epstein-Barr virus immediate early promoter, Rous sarcoma virus promoter, and human gene promoters such as, but not limited to, the actin promoter, myosin promoter, hemoglobin promoter, and creatine kinase promoter. Furthermore, the present disclosure should not be limited to the use of constitutive promoters. Inducible promoters are also contemplated as part of the present disclosure. The use of an inducible promoter provides a molecular switch capable of turning on expression of the polynucleotide sequence to which it is operably linked when such expression is desired, or turning off expression when it is not desired. Examples of inducible promoters include, but are not limited to, metallothionein promoters, glucocorticoid promoters, and tetracycline promoters.

The enhancer sequences found on the vector also regulate the expression of the genes contained in the vector. Typically, enhancers bind protein factors to increase transcription of a gene. In some cases, the enhancer is located upstream or downstream of the gene it modulates. In some cases, enhancers are also tissue specific, enhancing transcription in a particular cell or tissue type. In some embodiments, the vectors of the present disclosure comprise one or more enhancers to enhance transcription of genes present within the vector. To assess expression of nucleic acids and/or proteins, the expression vector to be introduced into the cell may also contain a selectable marker gene or a reporter gene or both to facilitate identification and selection of the expressing cells from the population of cells intended to be transfected or infected by the viral vector. In other embodiments, the selectable marker is carried on a separate DNA fragment and used in a co-transfection procedure. Both the selectable marker and the reporter gene may be flanked by appropriate regulatory sequences to enable expression in the host cell. Useful selectable markers include, for example, antibiotic resistance genes, such as neo and the like.

If necessary, terminator sequences may also be included, depending on the host cell.

Recombinant vectors of the present disclosure can be produced, for example, by digesting the DNA construct with an appropriate restriction enzyme, or by adding a restriction enzyme site by PCR, and inserting the construct into the restriction enzyme site or multiple cloning site of a vector.

Host cell: the host cell used for transformation (to obtain a "transformant") may be a eukaryotic cell or a prokaryotic cell, preferably a eukaryotic cell. In certain embodiments, eukaryotic cells such as yeast cells, mammalian cells, plant cells, insect cells, and the like are used. Examples of yeasts include Saccharomyces cerevisiae, Candida utilis, Schizosaccharomyces pombe, Pichia pastoris, and the like. In addition, microorganisms such as Aspergillus fungi may be used. Examples of prokaryotic cells include Escherichia coli, Lactobacillus, Bacillus, Brevibacterium, Agrobacterium tumefaciens, actinomycetes, and the like. Plant cells include cells of plants belonging to Asteraceae (such as the genus Lactuca), Solanaceae, Brassicaceae, Rosaceae, Chenopodiaceae, etc.

Transformants used in the present disclosure can be produced by introducing the recombinant vectors of the present disclosure into host cells using general genetic engineering techniques. For example, electroporation (Tada, et al, 1990, theor. Appl. Genet, 80:475), protoplast method (Gene, 39,281-286 (1985)), polyethylene glycol method (1993,Transgenic,Res.2:218,Hiei,et al, 1994), agrobacterium-mediated transformation (Hood et al, 1991, theor. Appl. Genet. J.6:271), particle gun method (Sanford et al, 1987, J.part. Sci. Tech.5:27), polycation method (Ohtsuki, et al, FEBS Lett. 3): 235-240) may be used. Furthermore, gene expression may be transient expression or stable expression by insertion into a chromosome.

After introducing the recombinant vectors of the present disclosure into host cells, transformants can be selected according to the phenotype of the selectable marker. In addition, tagged proteins can be produced by culturing selected transformants. The medium and conditions for the culture may be appropriately selected depending on the kind of transformant.

When the host cell is a plant cell, a plant body can be regenerated by culturing the selected plant cell according to conventional methods, and the tagged protein can accumulate inside the plant cell or outside the cell membrane of the plant cell.

The labeled biomolecules that have accumulated in the cells can be isolated and purified according to methods well known to those skilled in the art. For example, separation and purification can be performed using methods known in the art such as salting out, ethanol precipitation, ultrafiltration, gel filtration chromatography, ion exchange column chromatography, affinity chromatography, medium pressure liquid chromatography, reverse phase chromatography, hydrophobic chromatography.

Embodiments of the present disclosure will be described hereinafter, but the present disclosure is not limited to these embodiments.

Examples

Example 1: Tagging for enhancing protein expression/secretion

Proteins play a critical role in a variety of physiological processes and conditions. Protein expression is a critical and necessary process in biological and medical research, but increasing protein expression in large scale applications can be difficult and expensive. For biopharmaceutical development, immunological/vaccine processes and biotherapy, there is an increasing demand for enhanced protein expression/secretion yields.

Results

Unexpected findings regarding the short peptide Qα tag enhancing gene expression. During preliminary studies on expression of SARS-CoV-2 viral proteins in mammalian cells, the objective was to establish a dual reporter system by fusing Gaussia-Dura luciferase (gdLuc) with destabilized GFP (dsGFP) (abbreviated LG) to the C-terminus of the SARS-CoV-2 viral protein (FIG. 1A). The advantage of this dual reporter is the dynamic quantitative measurement of secreted gdLuc-fused target protein in culture medium using the sensitive gdLuc assay, together with the dynamic quantitative measurement of dsGFP positivity and intensity by fluorescence microscopy and flow cytometry. Since the use of lentiviral (LV) systems would allow a broad range of targeted cells and easy establishment of stable cell lines, the LV pCDH-nCoV-E-Flag vector (Zhang et al., 2020), which expresses the SARS-CoV-2 structural envelope (E) protein, was chosen as the backbone for dual reporter cloning. NEBuilder HiFi cloning was performed via the NotI/ApaI sites using two fragments derived by PCR using primers Flag-Not-gdLuc-F and gdLuc-P2A-R (for fragment gdLuc) and P2A-dsGFP-F and dsGFP-PCR-Apa-R (for fragment dsGFP). Because this cloning failed, a pair of new primers was designed to generate a single fragment by overlapping PCR using the PCR products of the two fragments, and the single fragment was cloned into the pcDNA6B-nCoV-E-Flag vector (Zhang et al., 2020) via the SacII site using the NEBuilder HiFi kit. After the correct clones were confirmed by restriction enzyme digestion, positive clones E1 and E7 were tested for protein expression by fluorescence microscopy and the secreted gdLuc reporter assay (FIG. 1A). Surprisingly, it was found that E7 exhibited >20 times higher luciferase activity than E1. After Sanger sequencing, the E7 clone was found to have an additional 21 nucleotides after the Flag tag and before LG, which encode 7 amino acids (aa) in frame. This 7-amino-acid peptide was designated "Qα" based on the potential pronunciation of the sequence, and Qα linked to LG was designated QLG in the following studies. Further validation studies demonstrated that pcDNA6B-E-QLG had up to 90-fold higher expression than pcDNA6B-E-LG (FIGS. 1A to 1D). The effect of this Qα tag on the expression of other structural proteins of SARS-CoV-2, such as spike (S), nucleocapsid (N) and membrane (M), and of the nonstructural/accessory proteins NSP2, NSP16 and ORF3 was examined. Qα was found to enhance all viral proteins tested by 3- to 4000-fold (FIGS. 1E to 1F and FIG. 2A), with the extent depending on each protein. Such variation in enhancement efficiency may result from differences in cell density/functionality, transfection efficiency, reporter dose, and viral protein type. Similar enhancement effects apply to many non-viral proteins (FIGS. 2B to 2E). Interestingly, transfection with lower amounts of plasmid DNA in HEK293T cells showed higher enhancement efficiency for most viral proteins of SARS-CoV-2 (FIG. 2A), but this dose effect was not seen for host cell gene products such as murine NIBP (FIG. 2B) and human ACE2 (FIGS. 2C, 2D) or cytokines such as IFNγ (FIG. 2E) and IL-2 (FIG. 2F). Similar enhancement applies to other cell types such as HeLa, BHK, etc. (FIG. 2G). In addition to conventional plasmids, this Qα tag also enhances protein expression in viral transfer vectors such as LV vectors (FIGS. 2E, 2F). In summary, Qα tagging is versatile and enhances protein expression/production across a variety of genes, cell types and species.
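The fold-enhancement figures reported above are ratios of reporter signal from a Qα-tagged construct to that from the matched untagged construct. The following minimal sketch shows that arithmetic. The function name `fold_enhancement`, the optional background subtraction, and all numeric readings in the usage note are hypothetical and are not data from the present disclosure.

```python
from statistics import mean


def fold_enhancement(tagged, untagged, background=0.0):
    """Ratio of mean tagged signal to mean untagged signal.

    tagged / untagged: replicate reporter readings (e.g., luciferase
    relative light units) from matched transfections.
    background: signal from mock-transfected wells, subtracted from both
    (assumption: the same background applies to both conditions).
    """
    tagged_signal = mean(tagged) - background
    untagged_signal = mean(untagged) - background
    if untagged_signal <= 0:
        raise ValueError("untagged signal is not above background")
    return tagged_signal / untagged_signal
```

With made-up readings of `[9000, 9100, 8900]` for a tagged construct and `[100, 110, 90]` for the untagged control, `fold_enhancement` returns 90.0. In practice, readings would first be normalized for the variables noted above, such as cell density and transfection efficiency.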

Viral protein expression is further enhanced by optimizing the promoter and native UTR. During the initial studies herein, which sought to express SARS-CoV-2 viral proteins in order to investigate their intracellular distribution and their functional role in COVID-19 pathogenesis, most viral proteins were either absent or poorly expressed (data not shown). In the pcDNA6B expression vector system (CMV promoter), most viral proteins exhibited undetectable expression by western blotting and immunocytochemistry, while the host cell gene hACE2 exhibited very strong expression. In the pCAG vector system (CAG promoter), most viral proteins were detectable, but to varying degrees. However, the expression of E and S proteins was still very weak or could not be detected by immunocytochemistry and western blotting using anti-Flag antibodies, in agreement with several reports (Boson et al., 2020; Hu et al., 2020; Ou et al., 2020; Zhang et al., 2020). To solve this problem, a dual LG reporter system with high sensitivity and quantitative analysis was established. The data indicate that the LG system can detect expression of both E and S proteins in pcDNA6B or pCAG vectors, although the levels are still very low, and that Qα tagging strongly increases their expression (FIG. 3A). Again, the CAG promoter exhibits higher activity than the CMV promoter, as previously demonstrated (Dou et al., 2021; Zhang et al., 2020). To investigate whether Qα tagging further enhanced gene expression of CAG-driven viral proteins, parallel experiments were performed between the CMV promoter and the CAG promoter for viral proteins E and S. As shown in FIGS. 3A to 3C, Qα induced a stronger enhancement (5- to 6-fold) of viral E and S proteins in the presence of the stronger CAG promoter. Qα enhanced CAG-driven NSP16 expression by up to 212-fold (FIG. 3C).

To test whether the native UTRs of SARS-CoV-2 regulate Qα-enhanced expression of viral proteins, a DNA fragment containing 5'-UTR-E-Flag-Qα-3'-UTR was synthesized based on the public SARS-CoV-2 strain (Wu et al., 2020) and cloned into the pCAG vector. The E protein was chosen because its relatively small size makes synthesis less expensive and faster than for the S protein. Surprisingly, it was found that the addition of the native 5'-UTR strongly enhances the expression of E protein as determined by western blotting and immunocytochemistry using anti-Flag antibodies (FIGS. 3D, 3E). Inclusion of the native 3'-UTR further increased E protein expression (FIGS. 3D, 3E). The S protein is very important for vaccine development, pseudovirion production and drug discovery; however, its expression is the most difficult among the viral proteins of SARS-CoV-2 (Boson et al., 2020; Hu et al., 2020; Ou et al., 2020; Walls et al., 2020; Wang et al., 2020; Zhang et al., 2020). To maximize S protein production, the native 5'-UTR was added upstream of the Qα-tagged S protein in the pCAG expression vector, which showed 6-fold higher expression than the pcDNA6B vector (FIG. 3C). The addition of the 5'-UTR further enhanced S protein production by 20- to 70-fold compared with the CAG-driven Qα-tagged S protein, as determined by fluorescent immunocytochemistry (FIG. 3F) and gdLuc assay (FIG. 3G). Similar enhancement occurred with the CMV-driven Qα-tagged S protein expression system (FIG. 3H). In the LV vector, the addition of the 5'-UTR enhanced expression of the Flag-tagged dual reporters LG and QLG in the absence of viral proteins (FIG. 3I). These data provide evidence that optimization of the gene expression components (promoters, UTRs) further increases Qα-tagged viral protein expression. Importantly, the native 5'-UTR of SARS-CoV-2 alone dramatically enhances expression of the target gene/protein.

Enhancing SARS-CoV-2 S pseudovirion production. Pseudotyped viruses have been widely used not only for gene delivery but also for vaccine production and for studies of antibody neutralization, cell entry and pathogenesis. Pseudovirions are excellent alternatives to high-risk viruses that require BSL3 facilities to handle, such as SARS-CoV-2 and variants thereof (Korber et al., 2020; Muik et al., 2021; Nie et al., 2020; Walls et al., 2020; Weissman et al., 2021; Wibmer et al., 2021a). Limited access to, e.g., BSL3 and ABSL3 facilities slows down basic research and vaccine/therapy development for COVID-19. Pseudovirions are virus-like particles coated with viral surface or membrane proteins that confer a specific cell tropism (Kuzmina et al., 2021; Walls et al., 2020; Wibmer et al., 2021a). Virus-like particles pseudotyped with S protein are expected to elicit a better immune response than single viral proteins due to their similarity to the three-dimensional structure of live viruses (Kuzmina et al., 2021; Walls et al., 2020; Wibmer et al., 2021a). SARS-CoV-2 S protein has been widely used to generate S-pseudovirions, but in most reports the packaging efficiency for lentivirus-like particles (LVLP) or VSV-like particles (VSVLP) is very low, even when a codon-optimized, C-terminally deleted S protein is used (Korber et al., 2020; Muik et al., 2021; Ou et al., 2020; Walls et al., 2020). Given that Qα tagging enhances S protein production in mammalian cells, Qα was hypothesized to enhance the packaging efficiency of S-pseudotyped LVLP (S-LVLP). Using the codon-optimized SARS-CoV-2 S protein with a C-terminal 18-amino-acid deletion (Sd18), which is widely used in S-pseudovirus studies, as a test platform (FIG. 4A), it was verified that adding Qα to the C-terminus of Sd18 (Sd18Q) enhances Sd18 expression, which was further increased by inclusion of the 5'-UTR, as determined by western blot analysis (FIG. 4B). Using the standard pRRL-GFP LV reporter transfer vector, it was found by microscopy (FIG. 4C) and flow cytometry (FIG. 4D) that, in HEK-hACE2 cells, Qα tagging increased S-LVLP packaging efficiency by about 2- to 4-fold. The addition of the 5'-UTR further increased the packaging efficiency by about 4- to 10-fold (FIGS. 4C, 4D). Similar to conventional VSV-G pseudotyped LV packaging, polybrene treatment increased the titer of S-LVLP (FIG. 4E), and a simplified high-speed sucrose concentration/purification step preserved transduction capacity (FIG. 4F). To provide a dynamic measurement of S-pseudovirion transduction, the packaging efficiency of the dual reporter LV vectors pRRL-LG and pRRL-E-LG, which carry larger inserts than the GFP-only vector, was tested (FIGS. 4G, 4H). As expected, the original Sd18 has very low pRRL-LG and pRRL-E-LG packaging efficiencies. However, the addition of Qα significantly enhanced transduction efficiency, and the addition of the 5'-UTR further enhanced transduction efficiency, as determined by fluorescence microscopy and gdLuc assay (FIGS. 4G, 4H). These data provide evidence that the addition of both Qα and the 5'-UTR to the Sd18 expression system significantly enhances the packaging and transduction efficiency of SARS-CoV-2 S-LVLP.

Enhancing DNA and mRNA vaccine production. One significant immediate application of Qα tagging would be enhancing vaccine production to address the urgent need to combat COVID-19. The most promising vaccines against SARS-CoV-2 or its variants are derived from mRNA or DNA encoding the S protein or other SARS-CoV-2 viral proteins. Taking the S protein as an example, in the cDNA expression vector driven by the CMV promoter, Qα tagging increased S protein expression by 4- to 27-fold (FIG. 1H), and an additional 6-fold increase occurred when the CAG promoter was utilized (FIG. 3C). Inclusion of the 5'-UTR further increased S protein expression by about 20- to 70-fold in CAG-driven cDNA expression vectors (FIG. 3G). Thus, Qα tagging plus 5'-UTR modulation of DNA vaccines encoding the S protein enhanced vaccine production by at least 200-fold (FIG. 3H). Such enhancement of large-scale DNA vaccine production would drastically reduce costs and speed up COVID-19 vaccine availability. Since mRNA vaccines exhibit a number of advantages over other vaccines, and COVID-19 S protein mRNA vaccines have been widely accepted for broad human use during the pandemic, it was hypothesized that Qα tagging plus the 5'-UTR enhances mRNA-dependent translation, resulting in increased expression of viral proteins such as S protein for vaccine production. To test this, in vitro transcription was performed to generate capped mRNA with a Qα tag, and it was examined whether Qα tagging could affect viral protein translation following mRNA transfection in HEK293T cells. As shown in FIG. 5A, the presence of the Qα tag significantly increased the production of viral protein S from transfected functional mRNA in a time-dependent and dose-dependent manner. Such enhancement is generally applicable to mRNAs of the other viral proteins N, E and ORF3 and of the host cell gene ACE2 (FIGS. 5B, 5C). The addition of the 5'-UTR significantly increased mRNA-dependent translation of Qα-tagged viral S protein (FIG. 5D) as well as viral E protein and cellular ACE2 (FIG. 5E), consistent with the cDNA expression vectors (FIGS. 3D to 3I). These data provide evidence that Qα tagging and 5'-UTR inclusion enhance translation efficiency, leading to increased production of reporter protein for all targets measured, to a degree that depends on the target protein.

To further determine whether Qα tagging regulates mRNA-dependent translation, the dynamic changes in the transcript were measured after transcriptional inhibition with actinomycin D. Treatment with actinomycin D completely blocked production of viral protein S (fig. 5F) and ORF3 (fig. 5G) in the absence of Qα, as determined by gdLuc activity. However, Qα tagging increased protein expression/production during transcriptional inhibition, and this effect accumulated in a time-dependent manner (fig. 5F, 5G). These data provide evidence that Qα tagging on target proteins promotes protein expression through post-transcriptional regulation (increased translation efficiency and/or mRNA stability). To further determine whether Qα tagging affects mRNA stability of the target gene, mRNA decay assays were performed using the S and E viral proteins as examples. Although the time courses showed different patterns for S and E viral mRNA, Qα tagging on both S (fig. 5H) and E (fig. 5I) viral proteins increased mRNA half-life by around 6 to 7 hours.
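As an illustration of how an mRNA half-life can be estimated from an actinomycin D time course, the following Python sketch fits a first-order decay to the remaining mRNA fraction by log-linear least squares. The first-order decay model, the function name `mrna_half_life`, and all time points and values are assumptions for illustration only; they are not data or methods taken from this disclosure.

```python
import math


def mrna_half_life(times_h, remaining_fraction):
    """Estimate half-life in hours, assuming N(t) = N0 * exp(-k * t).

    times_h: time points (hours) after transcriptional inhibition.
    remaining_fraction: mRNA level at each time point relative to t = 0.
    """
    if len(times_h) != len(remaining_fraction) or len(times_h) < 2:
        raise ValueError("need at least two matched time points")
    # Linearize: ln N(t) = ln N0 - k * t, then fit the slope by least squares.
    ys = [math.log(f) for f in remaining_fraction]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, ys))
    slope = sxy / sxx  # slope equals -k
    if slope >= 0:
        raise ValueError("no decay detected in the supplied data")
    return math.log(2) / -slope


# Hypothetical time courses (not measured values).
times = [0, 2, 4, 8]
untagged = [1.0, 0.5, 0.25, 0.0625]
tagged = [math.exp(-math.log(2) * t / 8) for t in times]

gain_h = mrna_half_life(times, tagged) - mrna_half_life(times, untagged)
```

Comparing the fitted half-lives of tagged and untagged transcripts gives the kind of difference in hours reported above; with real qPCR data, replicate measurements and a goodness-of-fit check would be needed.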

Taken together, Qα tagging and native 5'-UTR inclusion on target mRNA significantly increase mRNA stability and translation efficiency, and thus enhance protein expression/production from the target mRNA (e.g., an S protein mRNA vaccine). Such enhancement not only reduces costs, but also stimulates vaccine responses owing to the high level of S protein expressed and released from the vaccinated target cells.

Enhancing antibody production. Therapeutic agents based on potent monoclonal antibodies (mAbs) require optimization of antibody production in a suitable cell culture platform, which depends on high-efficiency expression vectors. The various genetic elements in monoclonal antibody production vectors have been extensively modified. To determine whether the novel Qα tag would enhance antibody production, a human anti-SARS-CoV monoclonal antibody (BEI, CR3022) was used as a test platform. The Qα tag was cloned into the C-terminal end of the immunoglobulin heavy and light chains (H/L) of CR3022, which are derived from a human anti-SARS-CoV mAb (GenBank: DQ168569 and DQ168570, respectively), to generate Qα-tagged HQ and LQ (fig. 6A). HQ and LQ were co-transfected into HEK293T cells to generate Qα-tagged monoclonal antibodies, using the original H and L vectors (NR 52399 and NR 52400) as controls. Supernatants containing the antibodies were collected 2 to 3 days post-transfection, and antibody levels were measured by sandwich ELISA with SARS-CoV-2 S protein as the coating antigen (fig. 6B, 6C). Qα tagging was found to enhance antibody production up to 37-fold, with or without normalization to transfection efficiency (fig. 6D). The enhancement efficiency varied with experimental conditions (cell density, transfection efficiency and ELISA variation), with an average of 13-fold (fig. 6E). Western blot analysis of the supernatant confirmed the enhancement of antibody production by Qα (fig. 6F). These data provide evidence that Qα tagging induces a strong enhancement of antibody production (secretion).
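The fold enhancement described above, with or without normalization to transfection efficiency, can be sketched as a simple ratio. The following Python example is only an illustration: the function name `fold_enhancement`, the way transfection efficiency enters the calculation, and all readout values are assumptions and do not come from this disclosure.

```python
from statistics import mean


def fold_enhancement(tagged, control, tagged_eff=1.0, control_eff=1.0):
    """Ratio of tagged to control signal.

    Each signal is first divided by its transfection efficiency
    (a fraction in (0, 1]); leave the defaults for no normalization.
    """
    if control <= 0 or tagged_eff <= 0 or control_eff <= 0:
        raise ValueError("control signal and efficiencies must be positive")
    return (tagged / tagged_eff) / (control / control_eff)


# Hypothetical ELISA readouts from three independent experiments.
experiments = [
    {"tagged": 2.6, "control": 0.2, "tagged_eff": 1.0, "control_eff": 1.0},
    {"tagged": 3.0, "control": 0.5, "tagged_eff": 0.5, "control_eff": 1.0},
    {"tagged": 1.8, "control": 0.3, "tagged_eff": 0.8, "control_eff": 0.8},
]

folds = [fold_enhancement(**e) for e in experiments]
summary = {"max": max(folds), "mean": mean(folds)}
```

Reporting both the maximum and the mean across experiments mirrors the "up to" and "average" figures quoted in the text.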

Enhancing lentivirus production. Viral gene therapy has been widely studied and actively applied to clinical diseases. AAV and LV are the most promising strategies for viral gene therapy. However, viral packaging efficiency (yield) is a bottleneck for both AAV and LV gene therapy. In the field of CRISPR/Cas genome editing, viral encapsulation efficiency is also the rate limiting factor in the development of genome editing and therapeutics. In general, the level of mRNA from LV transfer vectors will affect LV packaging efficiency. It is hypothesized that if qα tagging increases the mRNA level of the transgene during encapsulation, qα tagging in LV transfer vectors will enhance the efficiency of LV encapsulation and gene delivery. To test this, LV transfer vector pRRL-E-LG was compared to pRRL-E-QLG for standard LV packages (psPax and VSV-G). After LV infection of HEK293T cells, qα tagging increased production of transgenic reporter gdLuc from the transfer vector (fig. 6G), similar to enhanced efficiency in transfected cells without LV encapsulation (fig. 2H). However, qα labelling on transfer vector had only a slight effect on encapsulation efficiency, i.e. potency of the encapsulated LV (data not shown). Similar changes were seen with LV-spCas9-Q-RFP and LV-MS2-spCas9-Q-GFP (FIGS. 6H, 6I), where encapsulation efficiency was typically > 100-fold less than standard LV-RFP or LV-GFP. These data provide evidence that slight changes in mRNA of the transgene in the transferred LV vector caused by qα tagging did not increase packaging efficiency, but qα tagging enhanced transgene production in transduced cells (fig. 6G). This is consistent with the following findings: q alpha labelling affects translation rather than transcription (fig. 5A to 5I). It was then tested whether qα labeling on LV encapsulation proteins (such as Gag, pol, and RRE) would enhance encapsulation efficiency in encapsulation vector psPAX. 
Interestingly, qα labeling on Gag significantly impairs LV encapsulation, but qα labeling on Pol and RRE enhances LV encapsulation of pRRL-GFP, pRRL-QLG and LV-MS2-Cas 9-GFP; however, the enhancement efficiency was only 1 to 3 times (fig. 6J to 6L). It is necessary to further optimize the LV packaging by qα labeling.

Qα tagging enhances secretion of the target protein. As indicated above, Qα tagging increases the expression of various types of target proteins. When western blot analysis was performed using cell lysates to confirm the enhancement effect of Qα tagging on E dual reporter protein expression, it was unexpectedly found that in the Qα-tagged group, the E-Flag-gdLuc protein level in the cell lysate was significantly reduced (fig. 7A), even though Qα tagging strongly increased gdLuc activity in the supernatant (fig. 1A to 1F). In the presence of the 5'-UTR, the reduction in the expression level of CAG-driven E-Flag-gdLuc in cell lysates was more pronounced (fig. 7B). This decrease also occurred with the other viral proteins N and S and with host cell genes such as IFNγ, IL-2 and hACE2 (fig. 7C, 7D). Inclusion of the UTR increased protein expression in cell lysates (fig. 7B, 7D, and data not shown).

These unexpected observations led to the following hypothesis: the strong increase in supernatant gdLuc activity caused by Qα tagging must involve a protein secretion process. This is supported by increased antibody secretion (fig. 6A to 6E) and increased secretion of IFNγ and IL-2 (fig. 2D, 2E). To confirm enhanced secretion, the supernatant was analyzed for protein levels of secreted E-Flag-gdLuc using serum-free medium. As shown in fig. 7E, cleaved E-Flag-gdLuc and GFP and uncleaved E-Flag-gdLuc-GFP were detected in the unconcentrated supernatants (40%) of the Qα-tagged E-QLG group by western blot analysis using anti-gdLuc and anti-GFP antibodies. Densitometric quantification showed a 17-fold increase in secreted protein, consistent with the enhancement detected by the gdLuc assay (fig. 7F). The secretion was completely blocked by treatment with the ER-Golgi protein transport inhibitor brefeldin (fig. 7G). The secretion enhancement by Qα tagging was further confirmed using a non-secreted firefly luciferase (fLuc) assay. Even in the presence of the Qα tag, there was no fLuc activity in the supernatant, but the protein expression level and enzyme activity were still significantly increased (fig. 7H). Taken together, Qα tagging significantly enhances the expression and secretion of the target protein. Incomplete self-cleavage by the 2A system was also noted for most proteins of interest, with cleavage efficiency differing from protein to protein (fig. 7C).

Discussion of the invention

In both the published literature and patents, different types of bioactive peptides have been developed that regulate or enhance the production of a protein of interest (Daliri et al., 2017; Katayama et al., 2021; Peighambardoust et al., 2021). This study presents a novel short peptide (epitope) tag (only 7 amino acids) that enhances target protein expression and secretion. Various types of peptide (epitope) tags have been previously identified for protein labeling, purification, and immunostaining (DeCaprio and Kohl, 2019; Katayama et al., 2021; Lee et al., 2020; Mishra, 2020; Peighambardoust et al., 2021; Pina et al., 2021; Traenkle et al., 2020); however, peptide tags for modulating or enhancing the production of a protein of interest (including expression and secretion) have not been identified. This Qα tag will act as a general enhancer of protein production. To date, all the target proteins tested in this study have shown enhanced protein production, with some proteins showing up to a thousand-fold increase. In view of the research interest and potential applications, extensive testing of many other target proteins is necessary. To the inventors' knowledge, this is the first evidence of protein modulation/enhancement by a short peptide (epitope) tag of the kind traditionally used for protein labeling/detection and affinity purification. This finding provides a paradigm shift in the context of epitope tagging and protein functional modulation.

Protein and peptide tags have been widely used for protein labeling/detection and affinity purification (DeCaprio and Kohl, 2019; Katayama et al., 2021; Lee et al., 2020; Mishra, 2020; Peighambardoust et al., 2021; Pina et al., 2021; Traenkle et al., 2020). Fusion of peptide tags to proteins of interest allows detection by immunostaining in vitro and in vivo using the corresponding highly specific antibodies. The novel "spaghetti monster" fluorescent protein (smFP) technology using tandem tags dramatically enhances the sensitivity of tagged protein detection (Viswanathan et al., 2015). Most tags can also be used for protein purification by immunoprecipitation and/or affinity chromatography. Some tags can enhance the yield of protein purification by extending the half-life of the protein or making it soluble (Bhagawati et al., 2019; Han et al., 2020; Li, 2011; Saribas et al., 2018). In some cases, tagging may affect the activity or function of the target protein (Majorek et al., 2014). For example, N-terminal tagging on PI3KCA increases activity, whereas C-terminal tagging affects membrane binding activity (Vasan et al., 2019). The N-terminal secretory signal peptide of gdLuc not only determines its intrinsic secretory properties, but also regulates protein folding and functional activity (Gaur et al., 2017). For a particular protein, the C-terminal or N-terminal amino acid composition will regulate protein expression (Cambray et al., 2018; Weber et al., 2020). Modification of gdLuc with a C-terminal endoplasmic reticulum (ER) targeting peptide significantly improved its intracellular retention (Gaur et al., 2017). Some peptides fused to or endogenously contained in target proteins, such as PEST (Shumway et al., 1999) or KFERQ (Dong et al., 2020; Park et al., 2016), mark proteins for proteolysis or degradation. However, there is no evidence that these epitope tags would directly enhance protein expression and secretion, particularly in mammalian cells.

In this study, it was found that the novel epitope tag Qα enhances the expression and secretion of both tagged SARS-CoV-2 viral proteins and many non-viral proteins. This conclusion is supported by the following findings: 1) for several viral proteins and host cell gene products, Qα tagging strongly enhanced production of the secreted gdLuc fusion proteins, as determined by dual reporter assays and fluorescence microscopy (fig. 1A to 1F, 2A to 2G); 2) Qα tag enhancement is independent of the promoter (fig. 3A to 3I, 5A to 5I); 3) Qα significantly enhanced mRNA-dependent production of target viral and non-viral fusion protein reporters, as determined by in vitro RNA transcription, mRNA transfection, and dual reporter assays (fig. 5A to 5I); 4) in the presence of transcriptional inhibition, Qα tagging retains its time-dependent enhancement, indicating post-transcriptional mechanisms (increased translation efficiency and/or mRNA stability); 5) Qα tagging on the S protein enhances S pseudovirus yield (fig. 4A to 4H); 6) Qα tagging on antibody heavy and light chains strongly increases antibody production; 7) Qα tagging on LV packaging vectors enhances packaging efficiency; and 8) Qα tagging enhances protein secretion processes, as shown by western blot and gdLuc analysis of gdLuc fusion proteins (with or without brefeldin treatment), antibody production, and secreted forms of IFNγ and IL-2.

Although the underlying mechanisms of protein expression/secretion enhancement remain to be described, the studies herein identify post-transcriptional mechanisms, such as increased mRNA stability and translational efficiency, using global transcriptional inhibition as a classical measure of mRNA stability. Novel measures of mRNA decay (Chan et al., 2018) are required to extend these preliminary observations. Modulation of mRNA involves a dynamic balance between synthesis and degradation processes. The synthesis process is well understood; however, less is known about mRNA decay (Chan et al., 2018). Qα tagging may act as a novel mechanism to regulate mRNA stability. How the Qα tag modulates mRNA stability and translation initiation/elongation would be an interesting and important direction to explore. For example, it is important to know whether the RNA sequence encoding the Qα peptide has a secondary structure that can directly regulate the mRNA stability of the target protein (Boo and Kim, 2020). It is of interest to determine whether synonymous substitutions in the Qα coding sequence affect expression/production of the tagged protein. It is also desirable to determine whether the amino acid sequence of the Qα tag binds directly to the poly-A tail or 3'-UTR, or whether those residues contribute to mRNA stabilization and translational enhancement. With regard to protein secretion, the Qα tag does not function like a secretory signal peptide, because Qα tagging on a non-secreted protein, i.e., firefly luciferase, does not alter the background luciferase activity in the medium, which may be present due to partial cell death. However, Qα tagging on secreted proteins such as the S protein, antibodies, IFNγ and IL-2 strongly enhances their yield. This is important for the industrial application of these secreted proteins. In particular, in the presence of the Qα tag, cells receiving an S protein mRNA vaccine will release more S protein, which not only reduces the amount of mRNA needed for each vaccination, but also promotes an immune response owing to the high level of secreted S protein. Given that brefeldin is known to inhibit ER-Golgi trafficking and completely blocks Qα-stimulated protein secretion, it is speculated that the Qα tag regulates retrograde or anterograde protein trafficking. Other secretion inhibitors may be used to identify other pathways for protein secretion. Whether the Qα tag affects proteins that are secreted unconventionally remains to be determined (Cohen et al., 2020).

UTRs at both ends of a viral genome or host cell mRNA are important in regulating transcription and translation efficiency (Berkhout et al., 2011; Hinnebusch et al., 2016; Raman and Brian, 2005; Senanayake and Brian, 1999; Williams et al., 1999). In particular, coronavirus 5'-UTRs regulate translation rate via ribosome scanning (Berkhout et al., 2011; Hinnebusch et al., 2016; Shirokikh et al., 2019; Zhang et al., 2015). In the Pfizer and Moderna vaccines, synthetic (non-viral) 5'-UTRs have been used to enhance translation of SARS-CoV-2 S mRNA. The natural UTR of SARS-CoV-2 is highly conserved and plays an important role in viral RNA replication and in transcription of genomic and subgenomic viral transcripts (Baldassarre et al., 2020; Yang and Leibowitz, 2015). Thus, the native 5'-UTR is thought to enhance the accumulation of viral proteins. In this study, corroborating evidence was provided that the native (natural) 5'- and 3'-UTRs of SARS-CoV-2 enhance production of viral E-LG fusion proteins. Importantly, the native 5'-UTR acts as a universal regulator, enhancing not only viral proteins but also many non-viral cellular proteins. It is speculated that such potent UTRs may be used to enhance any protein, particularly in viral packaging systems. For example, UTR-Sd18Q was observed to increase the packaging efficiency of S-pseudotyped LVLPs or VSVLP. The UTR enhances lentiviral production in viral transfer vectors. Natural UTRs also enhance AAV packaging and transduction efficiency.

This study identified the combination of Qα tagging with the SARS-CoV-2 natural UTR as a novel strategy to enhance the production of any target gene/protein of interest. For industrial applications, this strategy will reduce the cost and promote the availability of many widely used products. Since it enhances the production of all SARS-CoV-2 viral proteins tested, the immediate use of this approach would be to enhance vaccine production to address the urgent need to combat COVID-19. The studies herein show that the efficiency of the S mRNA vaccine is enhanced by at least 200-fold. This is extremely important in the production of new mRNA vaccines against SARS-CoV-2 variants or any other emerging virus, in order to accelerate the availability of mRNA vaccines. This strategy can be easily incorporated into DNA vaccine vectors. Thus, enhancing the yield of large-scale vaccine production would drastically reduce the cost and speed up the availability of COVID-19 vaccines. Another immediate industrial value of the methods herein is to enhance antibody production. Taking a human anti-SARS-CoV monoclonal antibody as an example, it was found that Qα tagging at the C-terminus of the immunoglobulin heavy and light chain variable regions strongly enhanced antibody secretion by up to 37-fold (13-fold on average). Qα tagging in the middle of the target protein was found to show much stronger enhancement efficiency, and optimization of Qα tagging in different regions of the target antibody heavy and light chains is expected to achieve higher levels of antibody production enhancement.

Enhancing the yield of viruses or pseudotyped viruses is also valuable in the areas of gene therapy and biomedical research. Pseudotyped viruses have facilitated the study of high-risk viruses that require BSL3 facilities. Pseudoviruses bearing the SARS-CoV-2 S protein or variants thereof have been widely used for evaluation of neutralizing antibodies and vaccination and for mechanistic and functional studies (Donofrio et al., 2021; Korber et al., 2020; Muik et al., 2021; Ou et al., 2020; Wibmer et al., 2021b). The bottleneck in the production of S pseudovirions is the limited packaging efficiency of LVLP or VSV-like particles (Korber et al., 2020; Muik et al., 2021; Ou et al., 2020; Walls et al., 2020). The methods used herein, which combine Qα tagging on the Sd18 expression system with the native 5'-UTR, strongly enhance the packaging and transduction efficiency of SARS-CoV-2 S-LVLP. This strategy has facilitated current studies of the antiviral effect of EGCG and of the protective efficiency of immunized serum from patients against the emerging SARS-CoV-2 virus (Liu et al., 2021a; Liu et al., 2021b). One of the challenges of viral gene therapy is limited viral packaging efficiency (yield). Using the LV system as a test platform, Qα tagging in LV transfer vectors was found to have only a slight effect on packaging efficiency, but the production of transgenic proteins in transduced cells or transfected packaging cells was enhanced. This is expected because Qα tagging affects translation of the target gene rather than transcription, whereas LV packaging requires the presence of intermediate RNAs from the transfer vector. Qα tagging at the C-terminus of Pol and RRE in the packaging vector psPAX increases LV packaging efficiency, but Qα tagging at the Gag C-terminus compromises LV packaging. Therefore, optimization of the position of the Qα tag in LV packaging proteins is critical to maximize Qα enhancement efficiency. Given that Qα tagging of Sd18 enhances Sd18 expression and the packaging efficiency of S-LVLP, Qα tagging on the VSV-G protein may in turn enhance conventional LV packaging efficiency. Qα insertion at different locations of VSV-G (Lorenz et al., 2014; Schlehuber and Rose, 2004) may maximize enhancement efficiency. As with LV packaging, further optimization of Qα tagging in AAV packaging systems is necessary.

In recombinant protein production systems, Qα tagging can increase the yield of expressed proteins such as insulin, interferons, interleukins, cytokines and growth factors. Even a few-fold enhancement reduces production costs and speeds up clinical use. For in vivo gene enhancement, Qα tagging via novel CRISPR/Cas gene knock-in strategies can be used to boost expression of deficient genes, particularly in haploinsufficiency disorders such as Angelman syndrome, Pitt-Hopkins syndrome, and the like. For genetic engineering, Qα-tag enhancement of dominant genes can improve the phenotype of organisms, particularly in agricultural applications. Finally, this novel Qα tag can be used as a general tag for protein labeling, protein purification, immunostaining and western blotting in a similar manner to other peptide tags such as Flag, Myc, HA, OLLAS, C and T7. Importantly, owing to its enhancing properties, Qα tagging can increase the labeling intensity of endogenous proteins. This is very important for neural network tracers.

When western blot analysis was performed, incomplete cleavage of several target proteins by the self-cleaving 2A system was observed. This is consistent with previous reports that 2A cleavage is often incomplete and varies with the type of 2A peptide and the protein of interest (Chng et al., 2015; Kim et al., 2011). Addition of a peptide (APVKQLL) to F2A increased cleavage efficiency (Groot Bramel-Verheije et al., 2000). Whether Qα tagging affects the functionality of a 2A system remains to be determined. The 2A system functions primarily inside the cell but may have relatively low activity outside the cell, because the ratio of cleaved to uncleaved bands in the cell lysate is significantly higher than in the cell culture medium (fig. 7A to 7H).

Based on numerous previous in vitro and in vivo studies of epitope tags such as Flag, HA, Myc, OLLAS, C and T7, the smaller Qα tag (7 amino acids) is expected to be free of toxicity.

In summary, the present study reports a novel peptide tag consisting of a specified short amino acid sequence (7 amino acids), which can be used to enhance production of tagged proteins (including viral transcripts/proteins, endogenous gene products, vaccines, antibodies, and engineered recombinant proteins) in cells in vitro, ex vivo and in vivo. This novel and universal peptide tag will promote protein expression and secretion. It would be valuable to perform library screening based on this primary Qα tag to identify variants that maximize protein expression/production/secretion. The present study also reports the unusually potent efficiency of the SARS-CoV-2 native 5'-UTR in enhancing protein expression/production. Combining Qα tagging with the native 5'-UTR provides a synergistic boost in the production of viral and non-viral proteins. These strategies are all valuable for biomedical development, immunological/vaccine processes, and biotherapeutic agents.

Example 2: experimental model and subject details

Cell line:

HEK293T, HeLa and BHK cell lines were cultured according to standard protocols.

Details of the method

Vector cloning

All PCR reactions for the clones used in this study were performed using the Phusion high-fidelity PCR master mix kit (Thermo Fisher, F531), and PCR products were purified using the Monarch PCR & DNA Cleanup Kit (NEB, T1030S). Correct clones were confirmed by restriction enzyme digestion, Sanger sequencing and functional assays.

Dual reporter vector: dual reporter LG fragments encoding Gaussia-Dura luciferase (gdLuc) and destabilized GFP (dsGFP) were generated by overlap PCR: 1) standard PCR was performed using primer pair T1290/T1291 to generate fragment 1 (gdLuc) from template plasmid pMCS-Gaussia-Dura-luciferase (Thermo Fisher Scientific, catalog number 16190), while fragment 2 (dsGFP) was generated from plasmid pLenti-EFS-EGFPd2PEST-2A-MCS-Hygro (TP 1380) (gift from Neville Sanjana; Addgene catalog number 138152) using T1292/T1293; 2) the two purified fragments (100 ng each), which overlap by 19 nucleotides, were mixed and subjected to 8 cycles of PCR; 3) 28 cycles of standard PCR were performed with primer pair T1292/T1293, using a 1:100 dilution of the PCR product as template, to generate the LG fragment. After purification with the NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel, catalog number REF 740609), the LG fragment (1485 bp) was cloned, using the NEBuilder HiFi DNA Assembly Cloning Kit (NEB, E5520S), via the SacII cloning site into the pcDNA6B-nCoV-x-Flag vectors encoding various SARS-CoV-2 viral proteins or the cellular gene hACE2, as listed in the Key Resources Table (Zhang et al., 2020), to generate the pcDNA6B-SARS-CoV-2-x-Flag-LG vectors listed in the Table. "x" indicates a gene of interest.
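The overlap PCR step above relies on the 3' end of fragment 1 matching the 5' end of fragment 2. As a minimal sketch of how such a terminal overlap can be checked in silico before ordering primers, the following Python example finds the longest shared terminal overlap and joins the two fragments. The function names and the short sequences are hypothetical placeholders; they are not the gdLuc or dsGFP sequences, and both fragments are assumed to be given on the same strand.

```python
def overlap_length(fragment1, fragment2, min_overlap=10):
    """Length of the longest suffix of fragment1 that equals a prefix of fragment2.

    Returns 0 if no overlap of at least min_overlap nucleotides exists.
    """
    f1, f2 = fragment1.upper(), fragment2.upper()
    for k in range(min(len(f1), len(f2)), min_overlap - 1, -1):
        if f1[-k:] == f2[:k]:
            return k
    return 0


def fuse(fragment1, fragment2, min_overlap=10):
    """Join two fragments through their shared terminal overlap."""
    k = overlap_length(fragment1, fragment2, min_overlap)
    if k == 0:
        raise ValueError("fragments do not share a terminal overlap")
    return fragment1 + fragment2[k:]


# Hypothetical 19-nt overlap flanked by placeholder sequences.
overlap = "ACGTTGCAAGCTTGGATCC"
frag1 = "ATGGGAGTCAAAGTTCTG" + overlap
frag2 = overlap + "GTGAGCAAGGGCGAGGAG"

fused = fuse(frag1, frag2)
```

A check of this kind only confirms sequence identity at the junction; it says nothing about melting temperature or secondary structure of the overlap.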

Unexpectedly, functional assays and Sanger sequencing identified a novel clone, designated pcDNA6B-SARS-CoV-2-E-Flag-QLG (TP 1479), with a Qα peptide in the open reading frame immediately before LG (designated QLG). The insert encoding the SARS-CoV-2 S protein from the pcDNA6B-nCoV-S-Flag vector (TP 1456) was cloned into TP1479 via XhoI/XbaI sites to generate pcDNA6B-SARS-CoV-2-S-Flag-QLG (TP 1487). The insert encoding the SARS-CoV-2 N protein from the pcDNA6B-nCoV-N-Flag vector (TP 1431) or hACE2 from the pcDNA6B-hACE2-Flag vector (TP 1470) was cloned into TP1479 via KpnI/XbaI sites to generate pcDNA6B-SARS-CoV2-N-Flag-QLG (TP 1490) or pcDNA6B-hACE2-Flag-QLG (TP 1491).

The pcDNA6B-NIBP-Flag-LG (TP 1560) vector was generated by NEB HiFi cloning of the NIBP PCR product from pYX-Asc-mNIBP (GenBank # BC070463) into pcDNA6B-hACE2-Flag-LG (TP 1540) via NotI/XbaI, while pcDNA6B-NIBP-Flag-QLG (TP 1558) was generated by NEB HiFi cloning of the NIBP PCR product into pcDNA6B-SARS-CoV-2-E-Flag-QLG (TP 1479) via XhoI/XbaI.

The pCAG vector encoding E, S and NSP16 was generated by replacing the CMV promoter of the corresponding pcDNA6B-SARS-CoV2-x-Flag-LG or-QLG vector with the CAG promoter via the SnaBI/KpnI site.

UTR-containing vectors: DNA fragments containing 5'-UTR-E-Flag-Qα-3'-UTR, designed according to public SARS-CoV-2 sequencing data, were synthesized by Synbio Technologies and cloned into the pCAG-Flag vector via EcoRV/AgeI sites using the NEBuilder HiFi DNA Assembly Cloning Kit (NEB, E5520). This vector, pCAG-UTR-E-Flag-Qα-UTR (TP 1583), was digested with SnaBI/EcoRV (both blunt ends) to remove the CAG promoter and religated, yielding the pUTR-E-Flag-Qα-UTR vector (TP 1585). The 3'-UTR of pCAG-UTR-E-Flag-Qα-UTR was removed by NotI digestion and religation to generate pCAG-UTR-E-Flag-Qα (TP 1584), which carries an additional 37 amino acids in the open reading frame. The pCAG-UTR-S-Flag-QLG vector (TP 1586) was generated by replacing the E-Flag-Qα-UTR fragment with the S-Flag-QLG fragment from the pCAG-S-Flag-QLG vector (TP 1518) via XhoI/AgeI sites. pCAG-UTR-Sd18-Q (TP 1595) was generated by NEB HiFi cloning of the 5'-UTR PCR product from the pUTR-E-Flag-UTR vector (TP 1585) into the pCAG-Sd18-Q vector (TP 1506) via the EcoRI site. The UTR-containing pcDNA6B vectors were generated by restriction cloning via KpnI/XhoI to transfer the 5'-UTR from pCAG-UTR-E-Flag-Qα-UTR into the corresponding pcDNA6B vectors, such as pcDNA6B-S-QLG (TP 1487), pcDNA6B-E-Flag-QLG (TP 1479) and pcDNA6B-ORF3-Flag-QLG (TP 1483).

Antibody vectors: plasmid set CR3022, expressing the heavy (H) and light (L) chains of a human anti-SARS-CoV antibody, was produced under HHSN272201400008C and obtained from BEI Resources, NIAID, NIH (catalog number NR-53260): pFUSEss-CHIg-hG1-SARS-CoV2-mAb (NR-52399, TP1565) and pFUSE2ss-CLIg-hK-SARS-CoV2-mAb (NR-52400, TP1566) (GenBank: DQ168569 and DQ168570). The Qα-tagged HQ (TP 1574) and LQ (TP 1571) vectors were generated from the H or L plasmids at the NheI site using the NEBuilder HiFi DNA Assembly Cloning Kit with synthetic oligonucleotides containing the Qα coding sequence and the C-terminus of the immunoglobulin heavy or light chain (see Table S1 for sequences).

Lentiviral vector: pRRLSIN.cPPT.PGK-GFP.WPRE (TP 792), a gift from Didier Trono (Addgene # 12252), was used to generate pRRL-E-Flag-LG-GFP (TP 1577) by transferring E-Flag-LG from TP1478 to TP792 via BamHI/AgeI. AgeI/KpnI blunt-end ligation generated pRRL-E-Flag-LG (TP 1578) from TP 1577. pRRL-E-Flag-QLG (TP 1579) was generated by transferring E-Flag-QLG from TP1479 to TP1578 via BamHI/BstBI. TP1578 and TP1579 were used as backbone vectors for NEB HiFi cloning of human IFNγ and IL2 PCR products via XbaI sites to generate pRRL-IFNγ-LG (TP 1604) or QLG (TP 1605) and pRRL-IL2-LG (TP 1606) or QLG (TP 1607). PCR fragments of IFNγ and IL2 were derived from pUC8-IFNγ (gift from Howard Young, Addgene # 17600) and pAIP-hIL2-co (gift from Jeremy Luban, Addgene # 90513), respectively, using primer pair T1407/T1408 as listed in Table S1. pRRL-UTR-Flag-LG (TP 1621) and pRRL-UTR-Flag-QLG (TP 1622) were generated by NEB HiFi cloning of the 5'-UTR PCR product from TP1583 into TP1578 and TP1579, respectively, via XbaI. pRRL-Flag-LG (TP 1685) and pRRL-Flag-QLG (TP 1686) were generated from TP1621 and TP1622, respectively, by BsmBI/XbaI digestion to remove the UTR, followed by NEB HiFi cloning with an oligonucleotide insert (T1469) to correct the ATG site in the ORFs.

The LV packaging vector psPAX-Gag-Q (TP 1618) was generated from psPAX2 (TP 592, gift from Didier Trono, Addgene # 12260) via SphI/EcoRV sites by NEB HiFi cloning of two overlapping PCR fragments, which were amplified with primer pairs T1396/T1397 and T1398/T1399 using TP592 as the PCR template. psPAX-Pol-Q-RRE-Q (TP 1619) was generated from psPAX via SwaI/NheI sites by NEB HiFi cloning of two overlapping PCR fragments, which were amplified with primer pairs T1400/T1401 and T1402/T1403 using psPAX2 as the PCR template. psPAX-Gag-Q-Pol-Q-RRE-Q (TP 1620) was generated from TP1619 via SphI/EcoRV sites by NEB HiFi cloning of two overlapping PCR fragments, which were amplified with primer pairs T1396/T1397 and T1398/T1399 using TP592 as the PCR template.

S-pseudoviral vector: the vector pCAG-SARS-CoV2-Sd18Q (TP 1506), encoding the human codon-optimized S gene of SARS-CoV2 with a C-terminal 18 amino acid deletion (Sd18) fused to the Qα tag, was constructed using the NEBuilder HiFi DNA Assembly Cloning Kit (NEB, E5520S). Briefly, the Sd18 expression cassette in the CMV-driven vector pcDNA3.1-SARS2-S (gift from Fang Li, Addgene catalog number 145032) was transferred to the CAG-driven vector pCAG-Flag-SARS-CoV2-S (gift from Peihui Wang) via EcoRI/NotI sites and PCR using primer pair T1323/T1324. The vector pCAG-SARS-CoV2-Sd18 (TP 1567), encoding Sd18 without the Qα tag, was constructed by NEB HiFi cloning using the synthetic oligonucleotide insert T1367 at the SacII/NotI sites of the pCAG-SARS-CoV2-Sd18Q vector. The pCAG-UTR-Sd18Q (TP 1595) vector was produced as described above.

Plasmid DNA purification and DNA purification

Plasmid DNA was purified using commercial kits for endotoxin-free small-scale preparation (catalog number REF 740490) or medium-scale preparation (catalog number REF 740420) from Macherey-Nagel (Germany). E. coli cultures (5 ml for small-scale preparation and 200 ml for medium-scale preparation) carrying the relevant plasmids were grown overnight at 30 ℃ (for NEB Stable cells) or 37 ℃ (for DH5α E. coli cells) in LB or 2YT medium supplemented with 100 μg/ml carbenicillin. Bacterial cultures were harvested by centrifugation, and the resulting pellet was processed to purify plasmid DNA according to the manufacturer's instructions. The final DNA was dissolved in ultrapure distilled water, and the DNA concentration was determined using a NanoDrop One UV-Vis spectrophotometer (Thermo Fisher) or a Take3 plate on a BioTek multimode microplate reader.
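For readers who want to reproduce the concentration readout by hand, UV spectrophotometers convert absorbance at 260 nm to a double-stranded DNA concentration using the standard factor of about 50 ng/µl per absorbance unit at a 1 cm path length. The following Python sketch applies that conversion; the function name and the example readings are hypothetical, and instrument software may apply its own path-length correction.

```python
def dsdna_concentration_ng_per_ul(a260, dilution_factor=1.0, path_length_cm=1.0):
    """Convert an A260 reading to dsDNA concentration in ng/µl.

    Uses the conventional factor of 50 ng/µl per A260 unit at 1 cm
    for double-stranded DNA.
    """
    if a260 < 0 or dilution_factor <= 0 or path_length_cm <= 0:
        raise ValueError("invalid absorbance, dilution or path length")
    return a260 / path_length_cm * 50.0 * dilution_factor


# Hypothetical readings: an undiluted sample and a 1:10 diluted sample.
undiluted = dsdna_concentration_ng_per_ul(1.0)
diluted = dsdna_concentration_ng_per_ul(0.5, dilution_factor=10)
```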

Cell culture and transfection

HEK293T human embryonic kidney cells and HeLa human cervical epithelial cells were obtained from ATCC (http://www.atcc.org). Both cell lines were cultured in Dulbecco's modified Eagle's medium (DMEM, Gibco) supplemented with fetal bovine serum (FBS) and 1% penicillin/streptomycin (Corning). BHK-21-/WI-2 cells (EH 1011, Kerafast, Boston, MA, USA) were grown in DMEM supplemented with 5% FBS and 1% penicillin/streptomycin. All cells were maintained in a 37 ℃ incubator under a 5% CO₂ atmosphere.

For most experiments, 96-well plates were used. For mRNA stability assays, 24-well plates were used. Resuspended cells (3 to 4 x 10⁴ cells/well for 96-well plates, or 1 to 2 x 10⁵ cells/well for 24-well plates) were seeded in DMEM plus 10% FBS the night before transfection. For transfection, Transporter 5 transfection reagent (TP5) (Polysciences catalog number 26008) was used at a DNA/reagent ratio of 1:4. Typically, 50 to 100 ng of plasmid DNA per well (for 96-well plates) was mixed with 0.2 to 0.4 μl TP5 in 0.9% NaCl solution and incubated for 20 minutes at room temperature. The transfection reagent and DNA solution were mixed again and added dropwise to each well. Transfected cells were incubated overnight (16 to 18 hours) at 37 ℃ in 5% CO₂, and the medium was then replaced with DMEM plus 10% FBS.
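The reagent volumes above follow from the 1:4 DNA/reagent ratio. The sketch below is only an illustration: it assumes the ratio means 1 μg DNA to 4 μl TP5 (which reproduces the 50 to 100 ng / 0.2 to 0.4 μl figures in the text), and the function names and the 10% pipetting overage are hypothetical choices, not part of the described protocol.

```python
def tp5_volume_ul(dna_ng: float, ratio_ul_per_ug: float = 4.0) -> float:
    """Return the TP5 volume (ul) for a given plasmid DNA mass (ng).

    Assumption: the 1:4 DNA/reagent ratio is 1 ug DNA : 4 ul reagent.
    """
    if dna_ng < 0:
        raise ValueError("DNA mass must be non-negative")
    return dna_ng / 1000.0 * ratio_ul_per_ug


def plate_master_mix(dna_ng_per_well: float, wells: int, overage: float = 0.1) -> dict:
    """Scale DNA and TP5 amounts to several wells.

    The overage (default 10%) is a hypothetical allowance for pipetting loss.
    """
    if wells <= 0:
        raise ValueError("wells must be positive")
    factor = wells * (1.0 + overage)
    return {
        "dna_ng": dna_ng_per_well * factor,
        "tp5_ul": tp5_volume_ul(dna_ng_per_well) * factor,
    }


if __name__ == "__main__":
    for ng in (50, 100):
        print(f"{ng} ng DNA -> {tp5_volume_ul(ng):.2f} ul TP5")
    print(plate_master_mix(100, wells=10))
```

Keeping the ratio as a parameter makes it easy to rescale the mix for 24-well or 6-well formats without changing the calculation.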

Multilabelled fluorescent immunocytochemistry and confocal image analysis

Cells were fixed with 4% paraformaldehyde (PFA) for 30 min, washed with 1X PBS, permeabilized with 0.5% Triton X-100/1X PBS for 30 min, blocked with 10% donkey serum for 1 hr, and incubated overnight at 4 ℃ with murine anti-Flag monoclonal antibody or anti-2A primary antibody (Table 1) in 0.1% Triton X-100/1X PBS. The following day, cells were washed with 1X PBS and stained with the corresponding Alexa Fluor secondary antibodies (Jackson Immuno Research Labs; donkey anti-rabbit or anti-mouse IgG (H+L) 488, 594 or 680) at 1:400 dilution for 1 hour at room temperature, with Hoechst 33258 (1:5000) as the nuclear counterstain. Fluorescent confocal images were acquired and analyzed using a Leica SP8 confocal system.

TABLE 1

Flow cytometry

Cells expressing the dsGFP reporter were dissociated with a cell dissociation reagent (Corning), passed through a 70 μm nylon cell strainer (Corning) to remove large clusters, and washed with 1X PBS. Dissociated cells were fixed with 4% PFA in PBS, and GFP-positive cells were analyzed using a Cytek Aurora flow cytometer.

RNA extraction and reverse transcription quantitative PCR (RT-qPCR) for mRNA stability assay

HEK293T cells were transfected with the indicated vector (500 ng/well for 24-well plates) for 24 hours and then treated with the transcription inhibitor actinomycin D (10 μM) for various durations. Total RNA was extracted using the Monarch total RNA miniprep kit (NEB, catalog number T2010), which includes two DNA removal steps. cDNA was synthesized from equal amounts of RNA (0.5 μg) using a high-capacity cDNA reverse transcription kit (Thermo Fisher Scientific, cat. No. 4368814) with random hexamer primers. Real-time PCR analysis was performed on QuantStudio™ systems. mRNA expression levels of the gdLuc luciferase reporter and human β-actin were determined using the iTaq Universal SYBR Green Supermix kit (BioRad, cat No. 1725121). The gdLuc primers were (forward) 5'-GATTACAAGGATGACGACGATAAG-3' (SEQ ID NO: 2) (T1364, targeting Flag) and (reverse) 5'-AAGTCTTCGTTGTTCTCGGTGGG-3' (SEQ ID NO: 3) (T432, targeting gdLuc). The human β-actin primers were (forward) 5'-AAGAGCTATGAGCTGCCTGA-3' (SEQ ID NO: 4) and (reverse) 5'-TACGGATGTCAACGTCACAC-3' (SEQ ID NO: 5). Each sample was tested in triplicate. Cycle threshold (Ct) values for the reporter and β-actin were obtained graphically. The difference in Ct value between the reporter and β-actin is presented as the ΔCt value. The ΔΔCt value was obtained by subtracting the ΔCt value of the control sample from the ΔCt value of the sample at each time point. The relative change in gene expression was calculated as 2^(-ΔΔCt). mRNA decay was calculated by nonlinear regression curve fitting (one-phase decay) using GraphPad Prism 9.1.1. Three independent experiments were performed.
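The ΔΔCt arithmetic described above can be sketched as follows. This is a minimal illustration, not the analysis actually run in Prism: the function names and Ct values are hypothetical, and the half-life helper assumes a simple one-phase exponential decay to a zero plateau, which is a simplification of a fitted one-phase decay model.

```python
import math


def delta_ct(ct_reporter: float, ct_actin: float) -> float:
    """Delta-Ct: reporter Ct minus the beta-actin reference Ct."""
    return ct_reporter - ct_actin


def relative_expression(dct_sample: float, dct_control: float) -> float:
    """Relative expression 2^(-ddCt), where ddCt = dCt(sample) - dCt(control)."""
    ddct = dct_sample - dct_control
    return 2.0 ** (-ddct)


def half_life_hours(t_hours: float, fraction_remaining: float) -> float:
    """Half-life from one time point, assuming N(t) = N0 * exp(-k * t) (zero plateau)."""
    if not 0.0 < fraction_remaining < 1.0:
        raise ValueError("fraction_remaining must be between 0 and 1 (exclusive)")
    k = -math.log(fraction_remaining) / t_hours
    return math.log(2.0) / k


if __name__ == "__main__":
    # Hypothetical Ct values: time-zero control vs. a later actinomycin D time point.
    control = delta_ct(ct_reporter=21.0, ct_actin=16.0)
    sample = delta_ct(ct_reporter=22.0, ct_actin=16.0)
    remaining = relative_expression(sample, control)
    print(f"fraction of mRNA remaining: {remaining:.2f}")
    print(f"estimated half-life: {half_life_hours(4.0, remaining):.1f} h")
```

In practice the decay constant would be fitted across all time points and replicates rather than derived from a single point.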

Luciferase assay:

For the gdLuc assay, coelenterazine (CTZ) substrate (cat No. 3032, Nanolight Technology) was dissolved in 10 ml of ultrapure sterile distilled water to prepare a stock solution and kept at -20 ℃ until use. CTZ stock solutions were diluted 10- to 30-fold to prepare working solutions. Equal volumes of CTZ working solution and transfected cell culture medium (25 to 50 μl) were mixed in Corning (CLS 3922) white opaque 96-well plates, and luminescence was measured in a BioTek Synergy LX multimode microplate reader. In some experiments, a firefly luciferase assay was performed using the ONE-Glo luciferase assay kit (Promega Corp, catalog No. E6110). An aliquot of 100 μl substrate solution was mixed with 3 to 5 μl cell lysate, and luminescence was measured in a BioTek Synergy LX multimode microplate reader. The data are presented as relative luciferase activity or as fold change compared to the corresponding group. Experiments were performed at least 3 times, in quadruplicate each time.

In vitro transcription and mRNA transfection

For pcDNA6B vectors containing the T7 promoter, the DNA was linearized by AgeI digestion followed by gel purification. For PCR products, the forward primer included the T7 promoter (TTAATACGACTCACTATAGGGTGGAATTCTGCAGATATCCAG (SEQ ID NO: 6), T1427), generating DNA fragments containing the 5'-UTR, target gene, LG or QLG dual reporter and poly(A) tail. PCR was performed using Phusion High-Fidelity PCR Master Mix (Thermo Fisher Scientific, F531). DNA was purified using a gel extraction kit, and the concentration was determined in a Bio-Tek multimode microplate reader using a Take3 plate. RNA was synthesized from the purified DNA template using the HiScribe™ T7 ARCA mRNA kit (New England Biolabs, catalog E2060), with co-transcriptional capping by the m7G anti-reverse cap analog (ARCA, catalog 1411) and poly(A) tailing. The synthesized RNA was purified using the Monarch RNA cleanup kit (New England Biolabs, catalog number E2040) and quantified with the Take3 plate. Following the manufacturer's manual, Lipofectamine MessengerMAX mRNA transfection reagent (Thermo Fisher Scientific, cat. No. LMRNA 015) was used to transfect HEK293T cells with equal amounts of RNA for the LG and QLG groups at different doses. Culture medium containing gdLuc was collected 4 to 72 hours post-transfection, and gdLuc activity was measured as described above.

VSV-G or S protein pseudotyped lentivirus packaging and titration

Lentiviruses carrying the indicated lentiviral vectors were produced on a small scale according to standard protocols using a second-generation LV packaging system. Briefly, HEK293T cells in one well of a 6-well plate were co-transfected, using TP5 reagent, with the indicated transfer LV vector (1.4 μg), the packaging vector psPAX2 or a mutant thereof (1 μg) and the VSV-G or Sd18 vector (0.4 μg). Two to three days after transfection, the LV-containing supernatant was concentrated and purified using the simplified 10% sucrose purification method as previously described. Functional titers of crude and purified lentiviruses were determined 48 hours after transduction with serial dilutions of lentivirus by counting GFP-expressing HEK293T cells under a fluorescence microscope. In some cases, LV titration was performed using flow cytometry or RT-qPCR analysis. For PCR analysis, cell culture medium was collected from infected cells and centrifuged at 2,000 g for 5 min. The supernatant was subjected to viral lysis to extract viral RNA. One-step RT-qPCR was performed according to the manufacturer's protocol using the qPCR lentivirus complete titration kit (Applied Biological Materials Inc., catalog No. LV900-S) and the QuantStudio 3 real-time PCR system (Applied Biosystems, catalog No. A28567). The resulting data were analyzed using QuantStudio Design and Analysis desktop software (Applied Biosystems).
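The text does not state how the functional titer is computed from the GFP counts. The sketch below uses a commonly applied formula (transducing units per ml = cells at transduction x fraction GFP-positive x dilution factor / inoculum volume) as an assumption; the function name and all numbers are hypothetical.

```python
def functional_titer_tu_per_ml(
    cells_at_transduction: float,
    fraction_gfp_positive: float,
    volume_ml: float,
    dilution_factor: float,
) -> float:
    """Estimate functional titer in transducing units (TU) per ml.

    Assumed formula (not given in the source text):
        TU/ml = cells * fraction_GFP_positive * dilution_factor / volume_ml
    It is most reliable at dilutions where the GFP-positive fraction is low,
    so that most transduced cells carry a single integration.
    """
    if not 0.0 <= fraction_gfp_positive <= 1.0:
        raise ValueError("fraction_gfp_positive must be within [0, 1]")
    if volume_ml <= 0 or dilution_factor <= 0:
        raise ValueError("volume_ml and dilution_factor must be positive")
    return cells_at_transduction * fraction_gfp_positive * dilution_factor / volume_ml


if __name__ == "__main__":
    # Hypothetical well: 4e4 cells, 10% GFP-positive, 0.1 ml of a 1:100 dilution.
    titer = functional_titer_tu_per_ml(4e4, 0.10, 0.1, 100)
    print(f"{titer:.2e} TU/ml")
```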

Western blot analysis

SDS-polyacrylamide gels (10-12%) were prepared in-house, or Mini-PROTEAN TGX gels (catalog No. 4561093, 4561096) were purchased from BioRad. Cell lysates were prepared using lysis buffer consisting of 50 mM Tris-HCl pH 7.0, 150 mM NaCl, 5 mM EDTA and 1% Triton X-100, supplemented with PMSF (100X), aprotinin and leupeptin (200X). After the supernatant was collected, 50 μl of lysate was prepared from each well. Lysates were incubated at 4 ℃ for 20 to 30 min and centrifuged at maximum speed in an Eppendorf centrifuge. Cleared lysates were either immediately denatured in 1x SDS-PAGE loading dye at 98 ℃ for 5 min or stored at -80 ℃ for later use. The supernatants were stored at 4 ℃ until they were treated with 1x SDS-PAGE loading dye. Aliquots of 10 to 20 μl of denatured cell lysate or 20 to 30 μl of supernatant were loaded onto SDS-polyacrylamide gels. SDS-PAGE was performed in Tris-glycine/SDS buffer under denaturing and reducing conditions.

Using wet transfer or the iBlot 2 device, proteins were transferred from the polyacrylamide gel to 0.2-μm nitrocellulose membranes (BioRad supported nitrocellulose (NC) membrane, catalog No. 162-0097, or iBlot 2 NC mini (IB 23002) or regular (IB 23001) stacks). For wet transfer, the following 1x transfer buffer was used: 25 mM Tris-HCl pH 7.6, 192 mM glycine, 20% methanol. The gel was sandwiched against the NC membrane and transferred in 1x transfer buffer at 250 mA at 4 ℃ for 1 to 2 hours.

Dry transfer was performed for 7 minutes on an iBlot 2 Gel Transfer Device (Invitrogen, Thermo-Fisher, reference IB 21001) using mini or regular iBlot 2 stacks, according to the manufacturer's instructions.

After transfer, the membranes were blocked with 1X TBST buffer containing 5% milk. The membranes were then incubated with primary antibody overnight at 4 ℃ or for 2 hours at room temperature. Membranes were washed three times with 1X TBST buffer for one minute each and then incubated with secondary antibodies. The infrared-labeled secondary antibody was diluted 1/10,000 to 1/20,000 and incubated with the NC membrane for 45 minutes to one hour. At the end of incubation, the membranes were washed three times with 1X TBST buffer for 5 minutes each and scanned on a LI-COR Odyssey image analyzer.

Antibody detection using enzyme-linked immunosorbent assay (ELISA)

HEK293T cells in 96-well plates were co-transfected in quadruplicate with Qα-tagged HQ (TP 1574) and LQ (TP 1571), with or without 20 ng/well of the normalization vector pGL4.16-CMV (TP 329) or pRRL-E-Flag-LG (TP 1578). The original antibody plasmids pFUSEss-CHIg-hG1-SARS-CoV2-mAb (TP 1565) and pFUSE2ss-CLIg-hk-SARS-CoV2-mAb (TP 1566) were used as controls. ELISA was performed using a Human IgG (Total) Uncoated ELISA kit (Invitrogen, Thermo-Fisher, catalog number 88-5050-88). A 96-well Costar ELISA plate (Corning) was first coated with SARS-CoV-2 spike (S) protein from BEI (catalog number NR 52724) at 100 μg/well overnight at 4 ℃. The washing and blocking steps were performed using the buffers and solutions provided in the kit. At 24 hours and 48 hours, the supernatant containing secreted antibody was removed from the transfected cells and kept at 4 ℃ until use. Aliquots of the antibody supernatants (0.5, 2.5 and 5.0) were added to each SARS-CoV-2 S-coated well. After overnight incubation, the wells were washed with the solution provided in the kit (400 μl per well). Horseradish peroxidase (HRP)-conjugated anti-monoclonal detection antibody was diluted in assay buffer (1/250), added to each well and incubated for 2 to 3 hours at room temperature (RT). The wells were then washed 3 times (400 μl each) at RT using the buffer provided in the kit and treated with 300 μl of the substrate TMB (3,3',5,5'-tetramethylbenzidine) for 15 minutes to develop a blue color, and the reaction was quenched with 2N HCl. Yellow color formation was measured at 450 nm using a BioTek multimode microplate reader. The level of anti-SARS-CoV monoclonal antibody was quantified by a sigmoidal four-parameter logistic (4PL) curve fit using GraphPad Prism 9.1.1.
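For readers unfamiliar with the 4PL model, the sketch below shows one common parameterization and its inverse, which is how an absorbance reading is converted back to a concentration once the curve has been fitted. The parameter values are hypothetical, and the fitting step itself (performed in Prism in the text) is not reproduced here.

```python
def four_pl(x: float, a: float, d: float, c: float, b: float) -> float:
    """Four-parameter logistic curve.

    a: response at zero concentration, d: response at saturating concentration,
    c: inflection point (concentration giving the half-way response), b: slope factor.
    """
    return d + (a - d) / (1.0 + (x / c) ** b)


def inverse_four_pl(y: float, a: float, d: float, c: float, b: float) -> float:
    """Interpolate the concentration that gives response y on a fitted 4PL curve."""
    lo, hi = min(a, d), max(a, d)
    if not lo < y < hi:
        raise ValueError("response lies outside the range of the fitted curve")
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)


if __name__ == "__main__":
    # Hypothetical fitted parameters for an OD450 standard curve.
    params = dict(a=0.05, d=2.0, c=10.0, b=1.2)
    od = four_pl(10.0, **params)
    print(f"OD450 at the inflection point: {od:.3f}")
    print(f"back-calculated concentration: {inverse_four_pl(od, **params):.2f}")
```

Readings outside the span between the two asymptotes cannot be interpolated, which is why the inverse raises an error rather than extrapolating.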

ER-Golgi transport inhibition with brefeldin A

Brefeldin A (BA, AdipoGen Life Sciences, catalog AG-CN 2-0018) was dissolved in DMSO to prepare a 1 mg/mL working solution. HEK293T cells were transfected with the indicated vectors using TP5 transfection reagent in DMEM plus 10% FBS as described above. Transfected cells were incubated overnight, after which 10 μg/ml BA was added, followed by medium exchange and incubation at 37 ℃ in 5% CO₂ for 3 hours. The medium was replaced with 293 FreeStyle serum-free medium (Gibco, Thermo-Fisher, catalog No. 12-338-018) containing 10 μg/ml BA, and incubation was continued at 37 ℃ in 5% CO₂ for 24 hours. Supernatant was sampled immediately after medium replacement and collected again after 24 hours. Cell lysates were also prepared at the 24-hour time point. Supernatants and cell lysates were tested for gdLuc activity and subjected to western blot analysis.

Quantitative and statistical analysis

Fold change in the Qα group or UTR group compared to the corresponding non-Qα group or non-UTR group was quantified using Excel. Statistical analysis was performed using GraphPad Prism 9.1.1. Significance (*P < 0.05, **P < 0.01, and ***P < 0.001) between two groups was determined using a two-tailed Student's t-test, or by one-way analysis of variance for multiple comparisons. The data are presented as mean ± SE. The size and type of individual samples are indicated in the figure legends.
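The summary statistics named above can be sketched as follows. This is an illustration only: the function names and luminescence values are hypothetical, the t statistic shown is the equal-variance (pooled) form, and the p-value lookup against the t distribution is left to a statistics package.

```python
import math
from statistics import mean, stdev


def mean_se(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def fold_change(treated, control):
    """Fold change of the treated group mean over the control group mean."""
    return mean(treated) / mean(control)


def student_t(a, b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, na + nb - 2


if __name__ == "__main__":
    # Hypothetical quadruplicate luminescence readings (arbitrary units).
    q_alpha = [2.0, 4.0, 6.0, 8.0]
    non_q_alpha = [1.0, 2.0, 3.0, 4.0]
    m, se = mean_se(q_alpha)
    t, df = student_t(q_alpha, non_q_alpha)
    print(f"Q-alpha group: {m:.2f} +/- {se:.2f} (mean +/- SE)")
    print(f"fold change: {fold_change(q_alpha, non_q_alpha):.2f}")
    print(f"t = {t:.3f}, df = {df}")
```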

All of the features disclosed in this specification may be combined in any combination. Each feature disclosed in this specification may be replaced by alternative features serving the same, equivalent or similar purpose.

Example 3: Boosting protein expression/analysis with an expression-enhancing 21-mer cis-regulatory motif (Exen21)

Various techniques have been developed to enhance protein production, such as promoter optimization, mRNA regulation, codon optimization, and protein stabilization, as well as modification of the host cell expression machinery, including humanized yeast systems. Although these strategies have been used successfully in many research fields and by biopharmaceutical companies, the development of simple, general methods that can increase protein production at lower cost remains a focus of research. The low-level expression of many viral proteins, including the spike (S) protein, in mammalian cells has hampered the study¹ of SARS-CoV-2, which limited the rapid response²,³ to the COVID-19 pandemic. To optimize SARS-CoV-2 viral protein expression, a variety of expression vectors were developed herein using different promoters and a dual reporter system based on luciferase/GFP. During vector optimization, it was found that adding a novel 21-mer oligonucleotide motif, referred to herein as "Exen21", i.e., expression (Ex) enhancement (En) 21, to the vector dramatically increased the expression and secretion of the SARS-CoV-2 envelope (E) protein. This unique Exen21 encodes a specific heptapeptide called Qα. Insertion of Exen21/Qα was extended to various types of proteins and was found to enhance production of other SARS-CoV-2 proteins, cellular gene products, mRNA vaccines, antibodies, engineered recombinant proteins and viral packaging proteins.

Materials and methods

Vector cloning

Dual reporter vector: the dual reporter LG fragment encoding Gaussia-Dura luciferase (gdLuc) and destabilized GFP (dsGFP) was generated by overlap PCR: 1) Standard PCR was performed using primer pair T1290/T1291 to generate fragment 1 (gdLuc) from the template plasmid pMCS-Gaussia-Dura-luciferase (Thermo Fisher Scientific, catalog number 16190), while fragment 2 (dsGFP) was generated from the plasmid pLenti-EFS-EGFPd2PEST-2A-MCS-Hygro (TP 1380) (gift from Neville Sanjana, Addgene catalog number 138152) using T1292/T1293; 2) The two purified fragments (100 ng each), which share a 19-nucleotide overlap, were mixed and subjected to 5 cycles of PCR using primer pair T1304/T1305 (98 ℃ for 15 sec, 58 ℃ for 30 sec and 72 ℃ for 1 min), followed by 30 cycles of PCR (98 ℃ for 30 sec, 55 ℃ for 30 sec and 72 ℃ for 1 min) to produce the LG fragment. After purification with the NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel, catalog number REF 740609), this LG fragment (1485 bp) was cloned, using the NEBuilder HiFi DNA Assembly cloning kit (NEB, catalog number E5520S, designated NEB-HiFi), via the SacII cloning site into pcDNA6B-nCoV-X-Flag vectors encoding various viral proteins of SARS-CoV-2 or the cellular gene hACE2, to generate the pcDNA6B-SARS-CoV-2-X-Flag-LG vectors as listed in the table.
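The logic of joining two fragments through a shared 19-nucleotide overlap can be sketched in a few lines. The sequences below are toy placeholders, not the gdLuc or dsGFP fragments, and the function name is hypothetical; the point is only that the product length equals the sum of the fragment lengths minus the overlap.

```python
def join_by_overlap(fragment_1: str, fragment_2: str, overlap_len: int = 19) -> str:
    """Join two fragments whose ends share an identical overlap.

    The 3' end of fragment_1 must equal the 5' end of fragment_2 over
    overlap_len nucleotides, as in overlap-extension PCR.
    """
    fragment_1, fragment_2 = fragment_1.upper(), fragment_2.upper()
    if len(fragment_1) < overlap_len or len(fragment_2) < overlap_len:
        raise ValueError("fragments are shorter than the requested overlap")
    if fragment_1[-overlap_len:] != fragment_2[:overlap_len]:
        raise ValueError("fragment ends do not share the expected overlap")
    return fragment_1 + fragment_2[overlap_len:]


if __name__ == "__main__":
    # Toy sequences (hypothetical), sharing a 19-nt overlap.
    overlap = "GGATCCGGTAGCGGCTCTG"
    frag_1 = "ATGAAACCC" + overlap
    frag_2 = overlap + "GTGAGCAAGTAA"
    product = join_by_overlap(frag_1, frag_2)
    print(len(frag_1), len(frag_2), len(product))
```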

"X" indicates a gene of interest.

Unexpectedly, functional assays and Sanger sequencing identified a novel clone, designated pcDNA6B-SARS-CoV-2-E-Flag-QLG (TP 1479), carrying a Qα peptide in the ORF immediately before LG (designated QLG). The insert encoding the SARS-CoV-2 S protein from the pcDNA6B-nCoV-S-Flag vector (TP 1456) was cloned into TP1479 via the XhoI/XbaI sites to generate pcDNA6B-SARS-CoV-2-S-Flag-QLG (TP 1487). The insert encoding the SARS-CoV-2 N protein from the pcDNA6B-nCoV-N-Flag vector (TP 1431), or hACE2 from the pcDNA6B-hACE2-Flag vector (TP 1470), was cloned into TP1479 via the KpnI/XbaI sites to generate pcDNA6B-SARS-CoV-2-N-Flag-QLG (TP 1490) or pcDNA6B-hACE2-Flag-QLG (TP 1491), respectively.

The pcDNA6B-NIBP-Flag-LG (TP 1560) vector was generated by NEB-HiFi cloning of the NIBP PCR product from pYX-Asc-mNIBP (TP 546, GenBank # BC070463) into pcDNA6B-hACE-Flag-LG (TP 1538) via NotI/XbaI, while pcDNA6B-NIBP-Flag-QLG (TP 1558) was generated by NEB-HiFi cloning of the NIBP PCR product into pcDNA6B-SARS-CoV-2-E-Flag-QLG (TP 1479) via XhoI/XbaI.

The pCAG vectors encoding E were generated by replacing the CMV promoter of the corresponding pcDNA6B-SARS-CoV-2-E-Flag-LG or -QLG vector with the CAG promoter via the SnaBI/KpnI sites.

Mutation vector: site-directed mutagenesis or deletion mutagenesis of Exen21/Qα was performed using pcDNA6B-SARS-CoV-2-E-Flag-QLG (TP 1479) as a template. The mutant primers were designed to alter or delete particular nucleotides in the Exen21 sequence. For each mutation, a Phusion high-fidelity PCR reaction was performed using a universal primer (T1640) matching the region upstream of SARS-CoV-2 E and a mutant primer matching the Exen21 sequence except for the desired mutation. PCR products carrying Exen21 mutations were gel purified and cloned into EcoRV/NotI-digested 6B-E-QLG DNA using the NEBuilder HiFi DNA Assembly kit.
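The effect of a point substitution or an in-frame deletion on the encoded heptapeptide can be checked in silico before ordering mutant primers. The sketch below is a minimal illustration using the standard genetic code; the 21-mer shown is a hypothetical placeholder and is NOT the Exen21 sequence, and the helper names are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq: str) -> str:
    """Translate a DNA coding sequence in frame 1; trailing partial codons are ignored."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def substitute(seq: str, position: int, base: str) -> str:
    """Replace the nucleotide at a 0-based position."""
    return seq[:position] + base + seq[position + 1:]


def delete(seq: str, start: int, length: int) -> str:
    """Delete `length` nucleotides starting at a 0-based position."""
    return seq[:start] + seq[start + length:]


if __name__ == "__main__":
    placeholder_21mer = "ATGGCTAGCAAAGGTGAACTG"  # hypothetical, not Exen21
    print(translate(placeholder_21mer))                      # 7 codons -> heptapeptide
    print(translate(substitute(placeholder_21mer, 3, "C")))  # point mutation in codon 2
    print(translate(delete(placeholder_21mer, 3, 3)))        # in-frame codon deletion
```

Comparing the translated peptides separates mutations that change only the nucleotide sequence from those that also change the encoded amino acids.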

Antibody vectors: the plasmid set expressing the human heavy (H) and light (L) chains of the anti-SARS-CoV antibody CR3022 was produced under contract HHSN272201400008C and obtained from BEI Resources, NIAID, NIH (catalog number NR-53260): pFUSEss-CHIg-hG1-SARS-CoV-2-mAb (NR-52399, TP1565) and pFUSE2ss-CLIg-hK-SARS-CoV-2-mAb (NR-52400, TP1566) (GenBank: DQ168569 and DQ168570). Q-tagged HQ (TP1574) and LQ (TP1571) vectors were generated from the H or L plasmids at the NheI site by NEB-HiFi cloning, using synthetic oligonucleotides (T1378, T1380 to T1383) containing the Q coding sequence and the C-terminus of the immunoglobulin heavy or light chain.

Lentiviral vectors: the vector pRRL-SIN.cPPT.PGK-GFP.WPRE (TP792) (Addgene # 12252) was used to generate pRRL-E-Flag-LG-GFP (TP1577) by transferring the E-Flag-LG insert from TP1478 to TP792 via BamHI/AgeI. The pRRL-E-Flag-LG (TP1578) vector was generated from TP1577 by AgeI/KpnI digestion and blunt-end ligation. The pRRL-E-Flag-QLG (TP1579) vector was generated by transferring E-Flag-QLG from TP1479 to TP1578 via BamHI/BstBI. The TP1578 and TP1579 vectors were used as backbones for NEB-HiFi cloning of human IFNγ and IL2 PCR products via the XbaI site to generate pRRL-IFNγ-LG (TP1604) or -QLG (TP1605) and pRRL-IL2-LG (TP1606) or -QLG (TP1607). The PCR fragments of IFNγ and IL2 were amplified from pUC8-IFNγ (Addgene # 17600) and pAIP-hIL2-co (Addgene # 90513) using primer pairs T1407/T1408 and T1409/T1410, respectively. The pRRL-Flag-LG (TP1685) and pRRL-Flag-QLG (TP1686) vectors were generated from TP1621 and TP1622, respectively, by BsmBI/XbaI digestion and NEB-HiFi cloning with an oligonucleotide insert (T1469). pLV-EF1a-spCas9-Q-T2A-RFP (TP1562) was generated from pLV-EF1a-spCas9-T2A-RFP (TP855) at the NheI site by NEB-HiFi cloning, using a synthetic oligonucleotide (T1361) containing the Q coding sequence. The pLV-EF1a-MS2-spCas9-Q-F2A-GFP (TP1552) vector was generated from pLV-EF1a-MS2-spCas9-F2A-GFP (TP1081) at the NheI site by NEB-HiFi cloning, using the same oligonucleotide (T1361).

The LV packaging vector psPAX-Gag-Q (TP1618) was generated from psPAX2 (TP592, Addgene # 12260) via the SphI/EcoRV sites by NEB-HiFi cloning of two overlapping PCR fragments amplified with primer pairs T1396/T1397 and T1398/T1399, using TP592 as the PCR template. psPAX-Pol-Q-RRE-Q (TP1619) was generated from psPAX2 via the SwaI/NheI sites by NEB-HiFi cloning of two overlapping PCR fragments amplified with primer pairs T1400/T1401 and T1402/T1403, using psPAX2 as the PCR template. psPAX-Gag-Q-Pol-Q-RRE-Q (TP1620) was generated from TP1619 via the SphI/EcoRV sites by NEB-HiFi cloning of two overlapping PCR fragments amplified with primer pairs T1396/T1397 and T1398/T1399, using TP592 as the PCR template.

S-pseudoviral vector: using NEB-HiFi, the vector pCAG-SARS-CoV-2-Sd18Q (TP1506), encoding the human codon-optimized S gene of SARS-CoV-2 with a C-terminal 18 amino acid deletion (Sd18) fused to the Qα tag, was constructed. Briefly, the Sd18 expression cassette in the CMV-driven vector pcDNA3.1-SARS2-S (Addgene catalog number 145032) was amplified by PCR using primer pair T1323/T1324 and transferred to the CAG-driven vector pCAG-Flag-SARS-CoV-2-S (gift from Peihui Wang) via the EcoRI/NotI sites. The vector pCAG-SARS-CoV-2-Sd18 (TP1567), encoding Sd18 without the Qα tag, was constructed via NEB-HiFi cloning using synthetic oligonucleotide insert T1367 at the SacII/NotI sites of the pCAG-SARS-CoV-2-Sd18Q vector.

Plasmid DNA purification and DNA purification

Plasmid DNA was purified using commercial kits from Macherey-Nagel for endotoxin-free small-scale preparation (catalog No. REF 740490) or medium-scale preparation (catalog No. REF 740420). E. coli cultures (4 ml for small-scale preparations and 200 ml for medium-scale preparations) carrying the relevant plasmids were grown overnight at 30 ℃ (for NEB Stable cells) or 37 ℃ (for DH5α E. coli cells) in LB or 2YT medium supplemented with 100 μg/ml carbenicillin, 50 μg/ml kanamycin, 50 μg/ml blasticidin or 50 μg/ml bleomycin. Bacterial cultures were harvested by centrifugation and the pellet was processed to purify plasmid DNA according to the manufacturer's instructions. The final DNA was dissolved in DNase/RNase-free ultrapure distilled water (Thermo Fisher, catalog number 10977023), and the DNA concentration was determined using a NanoDrop One UV-Vis spectrophotometer (Thermo Fisher) or a Bio-Tek multimode microplate reader with a Take3 plate.

Cell culture and transfection

HEK293T human embryonic kidney cells and HeLa human cervical epithelial cells were obtained from ATCC (catalog numbers CRL-3216 and CCL-2). Both cell lines were cultured in Dulbecco's modified Eagle's medium (DMEM, Gibco) supplemented with fetal bovine serum (FBS) and 1% penicillin/streptomycin (Corning). BHK-21/WI-2 cells (Kerafast, EH1011) were grown in DMEM supplemented with 5% FBS and 1% penicillin/streptomycin. All cells were maintained in an incubator at 37 ℃ under a 5% CO₂ atmosphere.

Multilabelled fluorescent immunocytochemistry and confocal image analysis

Cells were fixed with 4% paraformaldehyde (PFA) for 30 min, washed with 1× PBS, permeabilized with 0.5% Triton X-100/1× PBS for 30 min, blocked with 10% donkey serum for 1 hour, and incubated with mouse anti-Flag monoclonal antibody or anti-2A primary antibody in 0.1% Triton X-100/1× PBS overnight at 4 ℃. The following day, cells were washed with 1× PBS and incubated with the corresponding Alexa Fluor secondary antibodies (Jackson ImmunoResearch Labs; donkey anti-rabbit or anti-mouse IgG (H+L) 488, 594 or 680) at 1:400 dilution for 1 hour at room temperature, with Hoechst 33258 (1:5000) as the nuclear counterstain. Fluorescent confocal images were acquired and analyzed using a Leica SP8 confocal system.

Flow cytometry

Cells expressing the dsGFP reporter were dissociated with Accutase (Corning), passed through a 70 μm nylon cell strainer (Corning) to remove large clusters, and washed with 1× PBS. Dissociated cells were fixed with 4% PFA in PBS, and GFP-positive cells were analyzed using a Cytek Aurora flow cytometer.

HEK293T cells were transfected with the indicated vector (500 ng/well for 24-well plates) for 24 hours and then treated with the transcription inhibitor actinomycin D (10 μM) for various durations. Total RNA was extracted using the Monarch total RNA miniprep kit (NEB, catalog number T2010), which involves two DNA removal steps. cDNA was synthesized from equal amounts of RNA (0.5 μg) with random hexamer primers using a high-capacity cDNA reverse transcription kit (Thermo Fisher Scientific, catalog number 4368814). Real-time PCR analysis was performed on QuantStudio™ systems. mRNA expression levels of the gdLuc reporter and human β-actin were determined using the iTaq Universal SYBR Green Supermix kit (BioRad, catalog number 1725121). The gdLuc primers were (forward) 5'-GATTACAAGGATGACGACGATAAG-3' (T1364, targeting Flag) and (reverse) 5'-AAGTCTTCGTTGTTCTCGGTGGG-3' (T432, targeting gdLuc). The human β-actin primers were (forward) 5'-AAGAGCTATGAGCTGCCTGA-3' and (reverse) 5'-TACGGATGTCAACGTCACAC-3'. Each sample was tested in triplicate. Cycle threshold (Ct) values for the reporter and β-actin were obtained graphically. The difference in Ct value between the reporter and β-actin is presented as the ΔCt value. The ΔΔCt value was obtained by subtracting the ΔCt value of the control sample from the ΔCt value of the sample at each time point. The relative change in gene expression was calculated as 2^(-ΔΔCt). mRNA decay was calculated by nonlinear regression curve fitting (one-phase decay) using GraphPad Prism 9.1.1. Three independent experiments were performed.
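The ΔCt, ΔΔCt and 2^(-ΔΔCt) steps described above can be sketched as follows. This is a minimal illustration only; the function name and the Ct values in the usage example are hypothetical and are not data from the disclosure.

```python
def relative_expression(ct_reporter: float, ct_reference: float,
                        ct_reporter_control: float, ct_reference_control: float) -> float:
    """Relative expression by the 2^(-ddCt) method.

    ct_reporter / ct_reference: Ct values of the reporter and the reference
    gene (for example beta-actin) in the sample at a given time point.
    ct_*_control: the same two Ct values in the control sample.
    """
    delta_ct_sample = ct_reporter - ct_reference
    delta_ct_control = ct_reporter_control - ct_reference_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the reporter rises by 2 cycles relative to the
    # reference gene, corresponding to a four-fold drop in transcript level.
    print(relative_expression(22.0, 18.0, 20.0, 18.0))
```

Multiplying the returned value by 100 gives the percentage of transcript remaining relative to the control sample.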

Luciferase assay

For the gdLuc assay, coelenterazine (CTZ) substrate (Nanolight Technology, catalog No. 3032) was dissolved in 10 ml of ultrapure distilled water to prepare a stock solution and kept at -20 ℃ until use. The CTZ stock solution was diluted 10- to 30-fold to prepare working solutions. Equal volumes of CTZ working solution and transfected cell culture medium (25 to 50 μl) were mixed in white opaque 96-well OptiPlates (Corning, catalog No. CLS3922) and luminescence was measured in a BioTek Synergy LX multimode microplate reader. In some experiments, a firefly luciferase assay was performed using the ONE-Glo luciferase assay kit (Promega Corp, catalog No. E6110). An aliquot of 100 μl substrate solution was mixed with 3 to 5 μl cell lysate and luminescence was measured in a BioTek Synergy LX multimode microplate reader. The data are presented as relative luciferase activity or fold change compared to the corresponding group. Experiments were performed at least 3 times, in quadruplicate each time.

In vitro transcription and mRNA transfection

For pcDNA6B vectors containing the T7 promoter, the DNA was linearized by AgeI digestion followed by gel purification. For PCR products, the primers included the T7 promoter (TTAATACGACTCACTATAGGGTGGAATTCTGCAGATATCCAG, T1427), generating a DNA fragment containing the target gene, the LG or QLG dual reporter and the poly(A) tail. PCR was performed using a Phusion high-fidelity PCR premix kit (Thermo Fisher Scientific, F531). DNA was purified using a gel extraction kit and its concentration was determined in a Bio-Tek multimode microplate reader using a Take3 plate. RNA was synthesized from the purified DNA template using the HiScribe™ T7 ARCA mRNA kit (NEB, catalog number E2060), with co-transcriptional capping by the m7G anti-reverse cap analog (ARCA, catalog number 1411) and poly(A) tailing. The synthesized RNA was purified using the Monarch RNA cleanup kit (NEB, catalog number E2040) and quantified with the Take3 plate. Following the manufacturer's manual, equal amounts of RNA from the LG and QLG groups, at different doses, were transfected into HEK293T cells in quadruplicate using the MessengerMAX mRNA transfection reagent (Thermo Fisher Scientific, catalog number LMRNA015). Culture medium containing gdLuc was collected 4 to 72 hours post-transfection and gdLuc activity was measured as described above.

VSV-G or S protein pseudotyped lentiviral packaging and titration

Lentiviruses carrying the indicated lentiviral vectors were produced on a small scale according to standard protocols using a second-generation LV packaging system. Briefly, HEK293T cells in one well of a 6-well plate were co-transfected, using a TP5 kit, with the indicated LV transfer vector (1.4 μg), the packaging vector psPAX2 or a mutant thereof (1 μg) and the VSV-G or Sd18 vector (0.4 μg). Two to three days after transfection, the supernatant containing LV was concentrated and purified using the simplified 10% sucrose purification method as previously described ⁴⁹. Functional titers of crude and purified lentiviruses were determined 48 hours after transduction with serial dilutions of lentivirus by counting GFP-expressing HEK293T cells under a fluorescence microscope. In some cases, LV titration was performed using flow cytometry.
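The functional titer derived from such a serial dilution can be sketched as below. The formula is the commonly used transducing-unit calculation and is an assumption here, since the disclosure does not state the formula it used; the function name and the numbers in the usage example are hypothetical.

```python
def functional_titer(cells_at_transduction: float, fraction_gfp_positive: float,
                     dilution_factor: float, inoculum_ml: float) -> float:
    """Transducing units per ml of the undiluted virus stock (assumed formula).

    cells_at_transduction: number of cells in the well when virus was added.
    fraction_gfp_positive: fraction of GFP-positive cells (0 to 1), from
        microscopy counts or flow cytometry.
    dilution_factor: fold dilution of the stock (100 for a 1:100 dilution).
    inoculum_ml: volume of diluted virus added to the well, in ml.
    """
    if not 0.0 < fraction_gfp_positive <= 1.0:
        raise ValueError("fraction_gfp_positive must be in (0, 1]")
    if cells_at_transduction <= 0 or dilution_factor <= 0 or inoculum_ml <= 0:
        raise ValueError("cell number, dilution and volume must be positive")
    transduced_cells = cells_at_transduction * fraction_gfp_positive
    return transduced_cells * dilution_factor / inoculum_ml


if __name__ == "__main__":
    # Hypothetical example: 1e5 cells, 10% GFP-positive, 1:100 dilution, 0.5 ml.
    print(f"{functional_titer(1e5, 0.10, 100, 0.5):.2e} TU/ml")
```

Dilutions giving a low percentage of GFP-positive cells are usually preferred for this calculation, because multiple integrations per cell cause the titer to be underestimated at high percentages.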

Western blot analysis

SDS-polyacrylamide gels (10-12%) were prepared in-house, or Mini-PROTEAN TGX gels (catalog Nos. 4561093, 4561096) were purchased from BioRad. Cell lysates were prepared using lysis buffer consisting of 50 mM Tris-HCl pH 7.0, 150 mM NaCl, 5 mM EDTA and 1% Triton X-100, supplemented with PMSF (100X), aprotinin and leupeptin (200X). After collecting the supernatant, 50 μl of lysate was prepared from each well. Lysates were incubated at 4 ℃ for 20 to 30 min and centrifuged at maximum speed in an Eppendorf centrifuge. Cleared lysates were either immediately denatured in 1× SDS-PAGE loading dye at 98 ℃ for 5 min or stored at -80 ℃ for later use. The culture supernatants were stored at 4 ℃ until they were treated with 1× SDS-PAGE loading dye. Aliquots of 10 to 20 μl of denatured cell lysate or 20 to 30 μl of supernatant were loaded onto SDS-polyacrylamide gels. SDS-PAGE was performed in Tris-glycine/SDS buffer under denaturing and reducing conditions.

Proteins were transferred from the polyacrylamide gel to a 0.2 μm nitrocellulose membrane either by wet transfer (BioRad supported nitrocellulose (NC) membrane, catalog No. 162-0097) or with the iBlot 2 device using iBlot 2 NC mini (IB23002) or regular (IB23001) stacks. For wet transfer, the following 1× transfer buffer was used: 25 mM Tris-HCl pH 7.6, 192 mM glycine, 20% methanol. The gel was sandwiched with the NC membrane and transferred in 1× transfer buffer at 250 mA at 4 ℃ for 1 to 2 hours.

Dry western blotting was performed on the iBlot 2 gel transfer device (Invitrogen, Thermo Fisher, reference IB21001) according to the manufacturer's instructions, using mini or regular iBlot 2 stacks with a transfer time of 7 minutes. After transfer, the membranes were blocked with 1× TBST buffer containing 5% milk. The membrane was then incubated with primary antibody overnight at 4 ℃ or at room temperature for 2 hours. Membranes were washed three times with 1× TBST buffer for one minute each and then incubated with secondary antibodies. The infrared-labeled secondary antibody was diluted 1/10000-120000 and incubated with the NC membrane for 45 minutes to one hour. At the end of incubation, the membranes were washed three times with 1× TBST buffer for 5 minutes each and scanned on a LI-COR Odyssey image analyzer. Images were analyzed by densitometry using NIH ImageJ (version 1.53). Data are expressed as integrated density times area and presented as fold change relative to the corresponding controls.

Antibody detection using enzyme-linked immunosorbent assay (ELISA)

HEK293T cells were co-transfected in 96-well plates, in quadruplicate, with Qα-tagged HQ (TP1574) and LQ (TP1571), with or without 20 ng/well of the normalization vector pGL4.16-CMV (TP329, derived from the promoter-free vector pGL4.16 (Promega, catalog number E6711)) or pRRL-E-Flag-LG (TP1578). The original antibody plasmids pFUSEss-CHIg-hG1-SARS-CoV-2-mAb (TP1565) and pFUSE2ss-CLIg-hK-SARS-CoV-2-mAb (TP1566) were used as controls. ELISA was performed using a Human IgG (total) uncoated ELISA kit (Invitrogen, Thermo Fisher, catalog number 88-5050-88). A 96-well Costar ELISA plate (Corning) was first coated with SARS-CoV-2 spike (S) protein from BEI (catalog number NR-52724) at 100 μg/well overnight at 4 ℃. The washing and blocking steps were performed using the buffers and solutions provided in the kit. At 24 hours and 48 hours, the supernatant containing secreted antibody was collected from the transfected cells and kept at 4 ℃ until use. Aliquots of 0.5, 2.5 and 5.0 μl of antibody supernatant were added to each SARS-CoV-2 S-coated well. After overnight incubation, wells were washed 4 times (400 μl per well). Horseradish peroxidase (HRP)-conjugated anti-human IgG detection monoclonal antibody in assay buffer (1/250) was added to each well and incubated for 2 to 3 hours at room temperature. The wells were then washed 3 times (400 μl each) and treated with 300 μl of the substrate TMB (3,3',5,5'-tetramethylbenzidine) for 15 minutes to develop a blue color, and the reaction was quenched with 2 N HCl. The resulting yellow color was measured at 450 nm using a BioTek multimode microplate reader. The level of anti-SARS-CoV monoclonal antibody was quantified by a sigmoidal four-parameter logistic (4PL) curve fit using GraphPad Prism 9.1.1.
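For orientation, the four-parameter logistic model and its inverse, which converts a measured absorbance back to a concentration, can be sketched as follows. The fit itself was performed in GraphPad Prism; the parameter values below are hypothetical placeholders rather than fitted values from the disclosure.

```python
def four_pl(x: float, bottom: float, top: float, ec50: float, hill: float) -> float:
    """Four-parameter logistic response at concentration x (x > 0)."""
    return top + (bottom - top) / (1.0 + (x / ec50) ** hill)


def inverse_four_pl(y: float, bottom: float, top: float, ec50: float, hill: float) -> float:
    """Concentration that gives response y; y must lie strictly between bottom and top."""
    low, high = min(bottom, top), max(bottom, top)
    if not low < y < high:
        raise ValueError("response is outside the range of the standard curve")
    return ec50 * ((bottom - top) / (y - top) - 1.0) ** (1.0 / hill)


if __name__ == "__main__":
    # Hypothetical standard-curve parameters (absorbance at 450 nm vs. ng/ml IgG).
    params = dict(bottom=0.05, top=2.50, ec50=10.0, hill=1.2)
    absorbance = four_pl(10.0, **params)           # midpoint of the curve
    print(absorbance, inverse_four_pl(absorbance, **params))
```

Sample readings that fall outside the fitted range are rejected rather than extrapolated, which is why several supernatant volumes are typically measured per sample.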

ER-Golgi transport inhibition with brefeldin

Quantitative and statistical analysis

Fold change in the Qα group compared to the corresponding non-Q group was quantified using Excel. Statistical analysis was performed using GraphPad Prism 9.1.1. Significance at *P < 0.05, **P < 0.01, and ***P < 0.001 was determined between two groups using a two-tailed Student's t-test, or by one-way analysis of variance for multiple comparisons. Data are presented as mean ± SE. The size and type of individual samples are indicated in the figure legends.
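The fold-change and mean ± SE summaries can be sketched with the standard library as shown below. The replicate values are hypothetical and the function names are assumptions; P values were obtained in GraphPad Prism and are not reproduced here.

```python
import math
import statistics


def mean_and_se(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for a standard error")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def fold_change(q_group, control_group):
    """Mean of the Q-tagged group divided by the mean of the non-Q control group."""
    return statistics.mean(q_group) / statistics.mean(control_group)


if __name__ == "__main__":
    # Hypothetical quadruplicate luciferase readings (arbitrary units).
    q_tagged = [10.0, 12.0, 14.0, 16.0]
    non_q = [1.0, 1.5, 0.5, 1.0]
    mean, se = mean_and_se(q_tagged)
    print(f"Q group: {mean:.2f} ± {se:.2f}; fold change: {fold_change(q_tagged, non_q):.1f}")
```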

Results

Discovery of the novel heptapeptide Qα that enhances protein expression/production.

To study SARS-CoV-2 viral protein expression in mammalian cells, a dual reporter system was generated to quantitatively and dynamically measure viral protein expression. Gaussia-Dura luciferase (gdLuc) and destabilized green fluorescent protein (dsGFP) were fused (abbreviated LG) to the C-terminus of the SARS-CoV-2 E protein (FIGS. 8A and 14A to 14C). This design allows measurement of the secreted gdLuc-fused target protein in culture medium by the sensitive gdLuc assay, and of dsGFP positivity and intensity by fluorescence microscopy and flow cytometry. During cloning of the E protein expression vector, the correct clones were initially screened by restriction enzyme digestion, and positive clones E1 and E7 were tested for protein expression by the gdLuc assay (FIGS. 8B to 8C) and fluorescence microscopy (FIGS. 8D and 14A). Surprisingly, E7 exhibited >20-fold higher luciferase activity than E1. The E7 DNA sequence was determined by Sanger sequencing. Unexpectedly, E7 was found to carry an additional 21-nucleotide sequence encoding 7 amino acids (aa) in-frame, upstream of LG and downstream of the Flag tag. Based on the aa sequence of this heptapeptide, it was named Qα, and LG linked to it was named QLG. Transfection of pcDNA6B-E-QLG (E7) was confirmed to give up to 90-fold higher expression than pcDNA6B-E-LG (E1) in HEK293T cells (FIGS. 8B to 8D). The effect of Qα addition on the expression of SARS-CoV-2 structural proteins, including S, nucleocapsid (N) and membrane (M), and of the accessory proteins NSP2, NSP16 and ORF3 was then examined. Qα was found to enhance production of all viral proteins tested (FIGS. 8E, 8F, 9A, 14A to 14C and 15A to 15B), with efficiencies ranging from 3-fold to 3848-fold depending on the protein. Such variation in Qα potentiation efficiency may result from differences in cell density/function, transfection efficiency, reporter dose and viral protein type.

A novel and unique 21-mer oligonucleotide cis-regulatory motif contributes to the Qα boost.

Given that the Qα insertion must be in the same open reading frame (ORF) as the gene of interest for protein expression and functional detection, it was originally speculated that the in-frame heptapeptide Qα plays a key role in facilitating protein production. Thus, alanine scanning and deletion mutation experiments (FIG. 8G) were performed to determine the role of individual amino acid residues in Qα function at the peptide level. The tested mutations, including the 4A mutation, all impaired the potentiation activity to various degrees, from loss of >57% of the potentiation activity to almost complete loss, indicating that each residue in this unique Qα heptapeptide appears to be important for potentiation activity and that residue 4 is the most critical. To explore the potential contribution of the oligonucleotide at the RNA level, synonymous (silent, degenerate) mutations were created that changed only nucleotides, not amino acids. Unexpectedly, all degenerate mutants tested showed a substantial loss (>90%) of Qα boosting activity (FIG. 8H), indicating that the Qα boost derives mainly from the 21-mer oligonucleotide motif rather than from an effect of the heptapeptide itself. Then, nonsynonymous (missense) mutations that retained the ORF required for expression of the reporter were tested. All mutants tested lost potentiation to varying degrees compared to the parental Qα group (FIG. 8I). These data provide evidence that the sequence (composition) and number of nucleotides of this 21-mer motif are critical for Qα potentiation. Exen21 is the new name assigned to this unique and novel expression-enhancing 21-mer cis-regulatory motif, which encodes an epitope tag (Qα).
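The distinction between synonymous and missense variants of a 21-mer can be checked in silico, as in the minimal sketch below. The reference and variant sequences are hypothetical placeholders chosen for illustration; they are not the Exen21 sequence or any mutant tested in the disclosure.

```python
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def classify_variant(reference: str, variant: str) -> str:
    """Classify a same-length variant as identical, synonymous, missense or nonsense."""
    if len(reference) != len(variant):
        raise ValueError("variant must have the same length as the reference")
    if reference.upper() == variant.upper():
        return "identical"
    ref_pep, var_pep = translate(reference), translate(variant)
    if "*" in var_pep and "*" not in ref_pep:
        return "nonsense"
    return "synonymous" if ref_pep == var_pep else "missense"


# Hypothetical 21-mers (placeholders only).
REFERENCE = "GAAGTTCTGAAACACGGTTCT"      # encodes EVLKHGS
SYNONYMOUS = "GAGGTCCTTAAGCATGGCTCC"     # different nucleotides, same heptapeptide
MISSENSE = "GAAGTTCTGAAACGCGGTTCT"       # one codon changed (H to R)

if __name__ == "__main__":
    for name, seq in [("synonymous", SYNONYMOUS), ("missense", MISSENSE)]:
        print(name, translate(seq), classify_variant(REFERENCE, seq))
```

A synonymous variant of this kind keeps the peptide constant while changing the RNA, which is what allows the nucleotide-level and peptide-level contributions to be separated.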

Broad ability of Exen21/Qα addition to enhance protein expression/production.

To expand the potential application of Exen21/Qα in enhancing protein expression and production, similar experiments were performed with different types of proteins, mammalian cells and species. Similar boosting effects were observed for many non-viral proteins (FIGS. 9B to 9E). Interestingly, transfection with lower amounts of plasmid DNA in HEK293T cells resulted in higher boost efficiencies for most SARS-CoV-2 viral proteins (FIGS. 9A, 15A and 15B), but had no such effect for host cell gene products such as murine NIBP4 (FIG. 9B) and human ACE2 (hACE2) (FIG. 9C) or cytokines such as IFNγ (FIG. 9D) and IL-2 (FIG. 9E). Exen21/Qα induced a stronger enhancement of SARS-CoV-2 E protein in the presence of the stronger CAG promoter (FIG. 9F). Similar enhancement of protein expression and production was found in other cell types, including HeLa and BHK (FIG. 9G). In addition to functioning in conventional plasmids, Exen21/Qα also showed enhancing activity in viral transfer vectors such as lentiviral (LV) vectors (FIGS. 9D and 9E).

In summary, the addition of Exen21/Qα has a broad ability to enhance protein expression/production across a range of gene products, vectors, mammalian cell types and species.

Exen21/Qα enhancement of antibody production.

Monoclonal antibody (mAb)-based therapeutics require optimization of antibody production in a suitable cell culture platform, which depends on high-efficiency expression vectors. To achieve this, genetic elements in mAb production vectors have been extensively modified. To determine whether Exen21/Qα addition plays a role in enhancing antibody production, a human anti-SARS-CoV mAb (BEI, CR3022) containing the consensus regions of the heavy and light chains (GenBank: DQ168569 and DQ168570, respectively) was used as a test platform. Exen21 was inserted at the C-terminus of the immunoglobulin heavy and light chains (H/L) of CR3022 to generate Qα-tagged HQ and LQ (FIG. 10A). HQ and LQ were co-transfected into HEK293T cells to generate Qα-tagged mAbs, with the original H and L vectors (NR-52399 and NR-52400) as controls. The mAb-containing supernatants were collected 2 to 3 days post-transfection and their mAb levels were measured by ELISA using SARS-CoV-2 S protein as the coating antigen (FIGS. 10B and 10C). Exen21/Qα was found to enhance mAb production up to 37-fold, with or without normalization for transfection efficiency (FIG. 10D). An average enhancement of at least 13-fold was obtained from 16 independent experiments, even under different experimental conditions (cell density, transfection efficiency and ELISA variation) (FIG. 10E). The Exen21/Qα-enhanced mAb production was further confirmed by western blot analysis of cell culture supernatants (FIG. 10F). The data indicate that Exen21/Qα addition strongly enhances mAb production/secretion.

Exen21/Qα enhancement of production of SARS-CoV-2 S pseudovirions.

Pseudotyped viruses have been widely used not only for gene delivery but also in vaccine production, antibody neutralization, cell entry and pathogenicity studies. Pseudovirions are excellent substitutes for high-risk viruses such as SARS-CoV-2 and its variants and do not require BSL3 facilities for handling. Pseudovirions are virus-like particles (VLPs) coated with a viral surface or membrane protein that confers a specific cell tropism. Because their three-dimensional structure resembles that of a live virus, VLPs pseudotyped with SARS-CoV-2 S protein elicit an immune response that is stronger than that to any single viral protein ^8,10,11. SARS-CoV-2 S protein has been widely used to generate S-pseudovirions, but in most reports the packaging efficiency for lentivirus-like particles (LVLP) or vesicular stomatitis virus-like particles (VSVLP) is low, even when a codon-optimized, C-terminally deleted S protein is used ^5,6,8,12. Given that Exen21/Qα addition enhances S protein production in mammalian cells, it was speculated that it may enhance the packaging efficiency of S-pseudotyped LVLP (S-LVLP). Using the widely used codon-optimized SARS-CoV-2 S protein with a C-terminal 18 aa deletion (Sd18) as a test platform (FIG. 11A), the addition of Exen21/Qα to the C-terminus of Sd18 (Sd18Q) was verified by western blot analysis to enhance Sd18 expression (FIG. 11B). The addition of Exen21/Qα was also found to increase the S-LVLP packaging efficiency in HEK-hACE2 cells by about 2- to 4-fold (FIG. 11C). To provide a dynamic measurement of S-pseudovirion transduction, the packaging efficiency of the dual reporter LV vector pRRL-E-QLG, which carries a larger insert than GFP alone, was tested. As expected, with transfer vector pRRL-E-QLG, the original Sd18 exhibited a significantly lower packaging efficiency than Sd18 with Exen21/Qα added (FIGS. 11D and 11E). These data indicate that the addition of Exen21/Qα to the Sd18 expression system significantly enhances the packaging and transduction efficiency of SARS-CoV-2 S-LVLP.

Exen21/Qα enhancement of lentivirus production.

Viral gene therapy has been widely studied and actively applied to clinical diseases. For viral gene therapy, AAV and LV are the most promising strategies, but viral packaging efficiency (yield) remains a bottleneck. In gene editing by CRISPR/Cas, viral packaging efficiency is also the rate-limiting factor in the development of novel therapeutic agents. In general, the level of mRNA supplied by the LV transfer vector can affect LV packaging efficiency. It was speculated that the addition of Exen21 to LV transfer vectors could boost transgene mRNA levels during packaging, thereby enhancing the efficiency of LV packaging and gene delivery. This idea was tested by comparing the LV transfer vectors pRRL-E-LG and pRRL-E-QLG in standard LV packaging (psPAX2 and VSV-G). Following LV infection of HEK293T cells, Exen21 increased production of the transgene reporter gdLuc from the transfer vector (FIG. 11F), similar to its boosting efficiency in transfected cells without LV packaging (FIGS. 9D and 9E). However, the addition of Exen21 to the transfer vector only slightly affected the packaging efficiency (i.e., the titer of the packaged LV; data not shown). Similar changes were observed with LV-spCas9-Q-RFP and LV-MS2-spCas9-Q-GFP (FIGS. 11G and 11H), whose packaging efficiencies are typically <1% of that of standard LV-RFP or LV-GFP. These data provide evidence that the slight Exen21-induced changes in mRNA levels from the LV transfer vector did not increase packaging efficiency, although Exen21 addition did enhance production of the transgenic protein in transduced cells (FIG. 11F). This is consistent with the finding that the addition of Exen21 amplifies translation rather than transcription (FIGS. 12A to 12G). It was also tested whether adding Exen21/Qα to LV packaging proteins (such as Gag, Pol, and RRE) via the packaging vector psPAX2 would enhance packaging efficiency. Interestingly, the addition of Exen21 to Gag significantly impaired rather than amplified LV packaging, whereas its addition to Pol and RRE significantly enhanced LV packaging of pRRL-GFP (FIG. 11I). These data provide evidence that proper insertion of Exen21/Qα into LV packaging vectors will enhance packaging and transduction efficiency.

Exen21/Qα enhancement of vaccine production via increased mRNA stability and translation efficiency.

Another immediate application of Exen21/Qα addition may be to boost vaccine yield to address the urgent need to combat the COVID-19 pandemic. Currently, the most promising vaccines against SARS-CoV-2 and its variants are derived from mRNA or DNA encoding the S protein ¹³. As shown in the above results, Exen21/Qα addition increased S protein expression from the CMV-driven cDNA expression vector by about 3- to 24-fold (FIG. 9A). Such enhancement, if applied on a large scale, would reduce the cost of DNA vaccine production and speed up COVID-19 vaccine availability. Since mRNA vaccines exhibit a number of advantages over other vaccines and the use of vaccines based on SARS-CoV-2 S protein mRNA is now widely accepted in humans, it was hypothesized that the addition of Exen21/Qα would also enhance the mRNA-dependent translation of SARS-CoV-2 proteins such as the S protein, to increase vaccination efficiency. To test this idea, capped mRNA carrying the Exen21 insertion was generated by in vitro transcription (independent of the promoter), and it was examined whether Exen21/Qα was effective in HEK293T cells after mRNA transfection (FIGS. 16A to 16E). The data show that the presence of Exen21/Qα significantly increased the production of SARS-CoV-2 S protein from transfected functional mRNA in a time-dependent and dose-dependent manner (FIG. 12A). This protein production-enhancing motif was also generally applicable to mRNAs of other SARS-CoV-2 proteins (including N, E and ORF3) and of the host cell gene hACE2 (FIGS. 12B, 12C and 16A to 16E). These data provide evidence that Exen21/Qα addition functions in a transcription-independent (promoter-free) manner by promoting mRNA stability and/or translational efficiency. To further determine whether the addition of Exen21/Qα regulates mRNA-dependent translation, the dynamic changes in the transcript were measured after inhibition of transcription with actinomycin D. In the absence of Exen21/Qα addition, actinomycin D completely blocked production of viral protein E (FIG. 12D) and ORF3 (FIG. 12E), as measured by gdLuc activity. In contrast, with the addition of Exen21/Qα, protein expression and production/accumulation showed a time-dependent increase even under transcriptional inhibition (FIGS. 12D and 12E), providing evidence that addition of Exen21/Qα to the target gene promotes protein expression and production via post-transcriptional regulation (increased translational efficiency and/or mRNA stability). To further determine whether the addition of Exen21 affects the mRNA stability of the target gene, mRNA decay assays were used for the E and S viral proteins. Although the E and S viral mRNAs exhibited different patterns of variation over the time course, adding Exen21/Qα to the viral E protein (FIG. 12F) and S protein (FIG. 12G) increased the half-life of the encoded mRNA by 6 to 7 hours.
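To illustrate how an mRNA half-life is derived from such a decay time course, the sketch below fits a one-phase exponential decay and reports ln(2)/k. It uses a log-linear least-squares fit with an assumed plateau of zero, which is a simplification of the nonlinear regression performed in GraphPad Prism; the time points and values are hypothetical.

```python
import math


def fit_half_life(times_h, fraction_remaining):
    """Half-life (hours) from a one-phase decay y = y0 * exp(-k * t).

    Fits ln(y) against t by ordinary least squares, assuming the decay
    plateaus at zero, and returns ln(2) / k.
    """
    if len(times_h) != len(fraction_remaining) or len(times_h) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(v) for v in fraction_remaining]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_log = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_log) for t, y in zip(times_h, logs))
    rate_constant = -sxy / sxx
    if rate_constant <= 0:
        raise ValueError("data do not show decay")
    return math.log(2) / rate_constant


if __name__ == "__main__":
    # Hypothetical time course: percent mRNA remaining after actinomycin D.
    hours = [0.0, 2.0, 4.0, 8.0]
    remaining = [100.0, 70.7107, 50.0, 25.0]
    print(f"half-life: {fit_half_life(hours, remaining):.2f} h")
```

Comparing the half-lives fitted for constructs with and without the motif gives the difference in mRNA stability reported in hours.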

Taken together, the data indicate that adding Exen21 to a given target mRNA significantly increases mRNA stability and translational efficiency, thereby enhancing protein expression and production from the target mRNA (e.g., an S protein mRNA vaccine).

Exen21/Qα enhancement of target protein secretion.

As seen above, the addition of Exen21/Qα increases the expression of multiple types of proteins of interest. When testing whether the addition of Exen21/Qα enhances intracellular expression of the E protein dual reporter (by western blot analysis of cell lysates), it was unexpectedly found that, in the Exen21/Qα group, the E-QLG protein levels in lysates were significantly reduced rather than increased, as detected by western blot analysis using anti-Flag antibodies (FIG. 13A), even though the addition of Exen21/Qα strongly increased gdLuc activity in culture supernatants (FIG. 8C). Similar reductions in the corresponding intracellular levels of other viral proteins (S and N) and host cell proteins (IFNγ, IL-2 and hACE2) resulting from the addition of Exen21/Qα were found (FIGS. 13B and 13C).

Based on these unexpected observations, it was hypothesized that the strong Exen21-induced increase in supernatant gdLuc activity must involve a protein secretion process. This idea is supported by the Exen21-induced potentiation observed in the antibody secretion (FIGS. 10A to 10F) and secreted IFNγ and IL-2 (FIGS. 9E and 9F) assays. To confirm this secretion-enhancing activity, serum-free medium was used to analyze the protein level of secreted E-Flag-gdLuc in cell culture supernatants. In the non-concentrated supernatants of the Qα-tagged E-QLG group (40 μl from 100 μl), cleaved E-Flag-gdLuc and GFP as well as non-cleaved E-Flag-gdLuc-GFP were detected by western blot analysis using anti-gdLuc and anti-GFP antibodies (FIGS. 13D and 17A to 17E). Densitometry revealed a 17-fold increase in secreted protein levels (FIG. 13D), consistent with the potentiation also seen in the gdLuc assay (FIG. 13E). Protein secretion was blocked by treatment with the endoplasmic reticulum (ER)-Golgi protein transport inhibitor brefeldin (FIGS. 13F and 18A to 18D). A non-secreted firefly luciferase (fLuc) assay was used to further confirm the secretion-enhancing characteristics of Exen21/Qα addition. The cellular level and enzymatic activity of fLuc protein in the cell lysate increased significantly, but there was no detectable fLuc activity in the supernatant, even with Exen21/Qα addition (FIG. 13G), which is consistent with the results for the non-secreted protein spCas9 (FIG. 11G). Thus, Exen21/Qα addition appears to enhance expression and promote secretion of the protein of interest. Notably, the self-cleavage of most target proteins by the 2A system was incomplete and varied between different proteins (FIGS. 11C, 11D and 11F), as has been reported by others ^14,15.

Discussion

This study reports the discovery of a novel and unique Exen21/Qα cis-regulatory motif with multiple capabilities to enhance expression and secretion of a protein of interest. This cis-regulatory Exen21 appears similar to the secretion-enhancing cis regulatory targeting element (SECReTE) ¹⁶, recently identified by computational analysis as promoting ER-localized mRNA translation and protein secretion. The SECReTE motif is enriched in nearly all mRNAs encoding secreted/membrane proteins in eukaryotic cells, and its addition results in enhanced protein secretion ¹⁶. When it is added to mRNA for exogenously expressed proteins such as GFP, protein expression and secretion are also enhanced ¹⁶. However, Exen21 has several features that distinguish it from SECReTE: (1) no triplet repeat sequence such as NNY or NYN; (2) a unique and exclusive composition/order of 21 nucleotides; (3) a smaller size (21-mer) than SECReTE (≥30-mer from ≥10 triplet repeats); and (4) absence from any cellular or viral gene. Furthermore, Exen21/Qα also differs greatly from the activity-enhancing motifs involved in promoter enhancer ^17-19 or antisense activity ²⁰. The data herein indicate that the addition of the Exen21 motif to a given mRNA will significantly enhance expression and secretion of the corresponding protein. This is demonstrated for different types of proteins, including viral, non-viral, intracellular, structural and secreted proteins. The extent of such enhancement is variable, with proteins such as N and ORF3 exhibiting increases of up to thousands of fold. It is believed that these findings translate into a paradigm shift in protein production for use in research and applications.
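Distinctions (1) and (3) can be checked directly against the 21-mer recited in the claims (SEQ ID NO: 7). The following is a minimal sketch, not the computational method of reference 16: it assumes that a SECReTE-like element is a run of at least 10 consecutive triplets with a pyrimidine (Y) at the same position, as described above, and the function name `longest_triplet_repeat` is hypothetical.

```python
EXEN21 = "CAACCGCGGTTCGCGGCCGCT"  # SEQ ID NO: 7, as recited in the claims
PYRIMIDINES = set("CTU")           # Y = C or T (U in RNA)
SECRETE_MIN_REPEATS = 10           # threshold taken from the text (>=10 triplet repeats)


def longest_triplet_repeat(seq: str, pyr_pos: int) -> int:
    """Longest run of consecutive triplets carrying a pyrimidine at
    position pyr_pos (1 for NYN, 2 for NNY), scanned in all three frames."""
    best = 0
    for frame in range(3):
        run = 0
        # Only complete triplets are considered.
        for i in range(frame, len(seq) - 2, 3):
            if seq[i + pyr_pos] in PYRIMIDINES:
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best


if __name__ == "__main__":
    for name, pos in (("NYN", 1), ("NNY", 2)):
        run = longest_triplet_repeat(EXEN21, pos)
        print(f"{name}: longest run = {run}, "
              f"SECReTE-like = {run >= SECRETE_MIN_REPEATS}")
```

Because a 21-mer holds at most seven complete triplets, it cannot reach the ≥10-repeat length under this definition, whatever its sequence; the scan additionally shows how short the pyrimidine-periodic runs within Exen21 are.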

The scope, extent, and mechanism of these Exen21/Qα actions were explored using various tools, methods, and target proteins. The addition of Exen21/Qα strongly amplified the production of secreted gdLuc fusion derivatives of a variety of SARS-CoV-2 structural proteins (S, M, N and E), accessory proteins (NSP2, NSP16 and ORF3) and host cell gene products (FIGS. 8A to 8I and 9A to 9G). Among the promoters tested, the enhancement of protein production by Exen21/Qα was largely independent of the particular promoter used, although the enhancement was stronger when the stronger CAG promoter was used (FIGS. 9A to 9G). The addition of Exen21/Qα enhanced mRNA-dependent production of target viral and non-viral protein fusion reporters, as determined by in vitro RNA transcription and mRNA transfection followed by a dual reporter assay (FIGS. 12A-12G). Exen21/Qα enhanced the yield of S-containing pseudoviruses and lentiviral packaging (FIG. 4). The addition of Exen21/Qα increased the release of secreted host proteins, including a strong enhancement of antibody production when Exen21/Qα was placed in the heavy and light chains of antibodies, and amplified IFNγ and IL-2 secretion. The action of Exen21/Qα is blocked by the Golgi transport inhibitor brefeldin. These findings point not only to the broad range of activity caused by Exen21 addition, but also to its potentially important and diverse applications in biotechnology fields such as vaccine, monoclonal antibody and other biopharmaceutical production where mammalian cell expression systems are required.

It was found that the addition of Exen21/Qα strongly enhanced the regulated secretion of secreted proteins such as S protein, antibodies, IFNγ and IL-2, but not via any signal peptide or intracellular targeting mechanism, as it did not induce the release of non-secreted proteins such as firefly luciferase and spCas9. This property may prove valuable for industrial use of such secreted proteins. For example, in an mRNA-based vaccine against SARS-CoV-2 variants, the addition of Exen21/Qα would likely enhance S protein production/secretion, thus reducing the amount of mRNA ¹³ required for vaccination owing to the high levels of S protein released, while still providing the same host immune response.

The ability to enhance the yield of viruses or pseudotyped viruses can be valuable in the areas of gene therapy and biomedical research. The use of pseudotyped viruses has facilitated the study of high-risk viruses that require BSL3 facilities. Pseudoviruses bearing the SARS-CoV-2 S protein and its variants have been widely used in mechanistic and functional studies and in the evaluation of neutralizing antibodies and vaccination ^5,6,12,21,22. The bottleneck in generating S-pseudovirions is the limited packaging efficiency of LVLP or VSVLP ^5,6,8,12. The novel approach herein of adding Exen21/Qα to the Sd18 expression system enhances the packaging and transduction efficiency of SARS-CoV-2 S-LVLP. This strategy has facilitated ongoing studies ^23,24 on the antiviral effects of EGCG and on the protective efficiency of serum from immunized patients against emerging SARS-CoV-2 variants. A challenge in viral gene therapy is limited viral packaging efficiency. Using the LV system as a test platform, it was found that adding Exen21/Qα to the LV transfer vector only slightly affected the packaging efficiency, but it enhanced the production of the transgenic protein in the transduced cells or transfected packaging cells. This is expected because Exen21/Qα was found to affect post-transcriptional regulation of the gene of interest rather than transcription, whereas LV packaging requires the presence of intermediate RNAs from the transfer vector. In the packaging vector psPAX, the addition of Exen21/Qα at the C-terminus of Pol and RRE increases LV packaging efficiency, but at the Gag C-terminus it compromises LV packaging. Thus, optimizing the positioning of Exen21/Qα within the LV packaging vectors will help maximize the boosting efficiency of Exen21/Qα in applications. Because the addition of Exen21/Qα enhances both Sd18 expression and the packaging efficiency of S-LVLP, the addition of Exen21/Qα to the VSV-G protein may enhance conventional LV packaging efficiency. Thus, Exen21/Qα addition at different locations of VSV-G ^25,26 may maximize its production enhancement efficiency. Likewise, optimizing the potentiation activity of Exen21/Qα for AAV or other viral packaging systems may prove valuable in biopharmaceutical applications.

Numerous epitope tags (including Flag, Myc, HA, OLLAS, V, His, C7, and T7) developed earlier have enabled specific research and biotechnology applications such as protein labeling, immunoaffinity purification, immunostaining, immunodetection enhancement ^27-34, mitigation of protein degradation, and conferral of solubility ^35-38. Other tags regulate the activity or function ³⁹ of the target protein; for example, N-terminal or C-terminal tagging of PIK3CA increases its kinase and membrane-binding activity, respectively ⁴⁰. However, to date, no epitope tag has been found that can stimulate protein expression and secretion. A series of mutation analyses, including alanine scanning, deletion, and synonymous and non-synonymous mutations, was performed, and it was confirmed that the unique 21-mer motif Exen21, with its specific order and number of nucleotides, is critical for its potentiating activity, which requires the motif to be in frame with the ORF of the target gene. Thus, the encoded unique heptapeptide Qα can act as a novel epitope with features shared with widely accepted epitope tags for general use. Importantly, because of its enhancing ability, the addition of Exen21/Qα can increase the signal strength of endogenous protein markers and thus improve detection sensitivity ²⁷ in applications such as neural network tracers. Yet another important area to explore is that the addition of Exen21/Qα makes it possible to enhance the expression of targeted, highly specific bioengineered proteins in vivo, such as via a novel CRISPR/Cas gene knockout strategy, facilitating expression of deficient genes. Such applications would be valuable in treating conditions caused by haploinsufficient pathogenic mutations (including Angelman syndrome, Pitt-Hopkins syndrome, etc.). In genetic engineering, Exen21/Qα enhancement of dominant genes can improve organism phenotype, such as in agricultural applications. Of course, any potential toxic or off-target effects of such in vivo expression of Qα-tagged proteins remain unknown and untested. Nevertheless, based on the findings with epitope tags that were previously tested adequately in vitro and in vivo, it is expected that the very small 7-aa Qα tag will not be prone to toxicity.
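The relationship between the 21-mer and the heptapeptide, and the distinction between synonymous and non-synonymous changes, can be illustrated with a short translation sketch. It uses the sequences recited in the claims (SEQ ID NO: 7 and SEQ ID NO: 1) and the standard genetic code; the two variant sequences and the helper name `translate` are hypothetical illustrations and are not the mutants tested in the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

EXEN21 = "CAACCGCGGTTCGCGGCCGCT"  # SEQ ID NO: 7
Q_ALPHA = "QPRFAAA"               # SEQ ID NO: 1


def translate(dna: str) -> str:
    """Translate complete codons of a DNA sequence, starting at position 0."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical variants, for illustration only.
synonymous = EXEN21[:-3] + "GCA"   # last codon GCT -> GCA, still alanine
alanine_scan = "GCA" + EXEN21[3:]  # first codon CAA (Q) -> GCA (A)
frameshifted = "A" + EXEN21        # one extra base puts the motif out of frame

if __name__ == "__main__":
    for label, seq in [("Exen21", EXEN21), ("synonymous", synonymous),
                       ("alanine scan", alanine_scan),
                       ("frameshifted", frameshifted)]:
        print(f"{label:13s} {seq:22s} -> {translate(seq)}")
```

In this sketch the synonymous variant changes the nucleotide motif while leaving the encoded peptide unchanged, whereas the frameshifted insert preserves the nucleotide motif but no longer encodes Qα; comparing such variants is one way to separate the contribution of the nucleotide sequence from that of the encoded peptide.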

The mechanism by which Exen21/Qα enhances protein expression/secretion remains largely to be elucidated. However, initial findings indicate that, because the enhancement persists during global transcriptional inhibition by actinomycin D, the presence of Exen21/Qα slows mRNA decay, providing evidence that Exen21/Qα plays a key role in post-transcriptional regulation, possibly including increased mRNA stability and increased translational efficiency. This finding supports the previously proposed concept that the coding sequence carries a large number of regulatory sites ⁴¹ that can regulate mRNA localization, stability, and translation efficiency. It will be of interest to determine whether the Exen21/Qα cis-regulatory motif has a specific RNA secondary structure ⁴¹ that recruits RNA-binding proteins, directly modulates the mRNA stability ⁴² of the target protein, or binds directly to the poly(A) tail or untranslated region (UTR) to exert its stabilizing effect on mRNA and its enhancing effect on translation. Because brefeldin, a traditional inhibitor of the ER-Golgi secretory pathway, blocks Exen21-stimulated protein secretion, Exen21 is presumed to regulate retrograde or anterograde transport ^43-46 between the ER and Golgi networks and to promote ER-targeted mRNA transfer and protein secretion, as SECReTE does. Other secretion inhibitors may be used to identify other pathways ^47,48 involved in Exen21/Qα regulation, particularly for proteins that are not conventionally secreted (e.g., cytokines such as IL-1).

In summary, a novel, small (21-mer) and unique cis-regulatory motif, Exen21/Qα, was discovered that can greatly enhance the production of a variety of different types of proteins in mammalian cells, ranging from viral transcripts/proteins, endogenous gene products, vaccines, and antibodies to engineered recombinant proteins. Exen21/Qα has a general protein production enhancing capability, which should facilitate various applications in biochemical research and biotechnology processes. Library screening based on this lead Exen21/Qα is underway to identify optimized motifs that will maximize protein expression/secretion.

References

1. Zhang, J. et al. A systemic and molecular study of subcellular localization of SARS-CoV-2 proteins. Signal Transduct Target Ther 5, 269 (2020).

2. Rezaei, N. et al. Introduction on Coronavirus Disease (COVID-19) Pandemic: The Global Challenge. Adv Exp Med Biol 1318, 1-22 (2021).

3. Kolahchi, Z. et al. COVID-19 and Its Global Economic Impact. Adv Exp Med Biol 1318, 825-837 (2021).

4. Bodnar, B. et al. Emerging role of NIK/IKK2-binding protein (NIBP)/trafficking protein particle complex 9 (TRAPPC9) in nervous system diseases. Transl Res 224, 55-70 (2020).

5. Korber, B. et al. Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus. Cell 182, 812-827 e819 (2020).

6. Muik, A. et al. Neutralization of SARS-CoV-2 lineage B.1.1.7 pseudovirus by BNT162b2 vaccine-elicited human sera. Science 371, 1152-1153 (2021).

7. Nie, J. et al. Establishment and validation of a pseudovirus neutralization assay for SARS-CoV-2. Emerg Microbes Infect 9, 680-686 (2020).

8. Walls, A.C. et al. Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein. Cell 181, 281-292 e286 (2020).

9. Weissman, D. et al. D614G Spike Mutation Increases SARS CoV-2 Susceptibility to Neutralization. Cell Host Microbe 29, 23-31 e24 (2021).

10. Wibmer, C.K. et al. SARS-CoV-2 501Y.V2 escapes neutralization by South African COVID-19 donor plasma. Nat Med (2021).

11. Kuzmina, A. et al. SARS-CoV-2 spike variants exhibit differential infectivity and neutralization resistance to convalescent or post-vaccination sera. Cell Host Microbe (2021).

12. Ou, X. et al. Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV. Nat Commun 11, 1620 (2020).

13. Rijkers, G.T. et al. Antigen Presentation of mRNA-Based and Virus-Vectored SARS-CoV-2 Vaccines. Vaccines (Basel) 9 (2021).

14. Chaung, J. et al. Cleavage efficient 2A peptides for high level monoclonal antibody expression in CHO cells. MAbs 7, 403-412 (2015).

15. Kim, J.H. et al. High cleavage efficiency of a 2A peptide derived from porcine teschovirus-1 in human cell lines, zebrafish and mice. PLoS One 6, e18556 (2011).

16. Cohen-Zontag, O. et al. A secretion-enhancing cis regulatory targeting element (SECReTE) involved in mRNA localization and protein synthesis. PLoS Genet 15, e1008248 (2019).

17. Erceg, J. et al. Subtle changes in motif positioning cause tissue-specific effects on robustness of an enhancer's activity. PLoS Genet 10, e1004060 (2014).

18. Kheradpour, P. et al. Systematic dissection of regulatory motifs in 2000 predicted human enhancers using a massively parallel reporter assay. Genome Res 23, 800-811 (2013).

19. Ma, S., Shah, S., Bohnert, H.J., Snyder, M. & Dinesh-Kumar, S.P. Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9, e1003840 (2013).

20. Matveeva, O.V. et al. Identification of sequence motifs in oligonucleotides whose presence is correlated with antisense activity. Nucleic Acids Res 28, 2862-2865 (2000).

21. Donofrio, G. et al. A Simplified SARS-CoV-2 Pseudovirus Neutralization Assay. Vaccines (Basel) 9 (2021).

22. Wibmer, C.K. et al. SARS-CoV-2 501Y.V2 escapes neutralization by South African COVID-19 donor plasma. Nat Med 27, 622-625 (2021).

23. Liu, J. et al. Correlation of vaccine-elicited antibody levels and neutralizing activities against SARS-CoV-2 and its variants. Clin Transl Med 11, e644 (2021).

24. Liu, J. et al. Epigallocatechin gallate from green tea effectively blocks infection of SARS-CoV-2 and new variants by inhibiting spike binding to ACE2 receptor. Cell Biosci 11, 168 (2021).

25. Schlehuber, L.D. & Rose, J.K. Prediction and identification of a permissive epitope insertion site in the vesicular stomatitis virus glycoprotein. J Virol 78, 5079-5087 (2004).

26. Lorenz, I.C. et al. The stem of vesicular stomatitis virus G can be replaced with the HIV-1 Env membrane-proximal external region without loss of G function or membrane-proximal external region antigenic properties. AIDS Res Hum Retroviruses 30, 1130-1144 (2014).

27. Viswanathan, S. et al. High-performance probes for light and electron microscopy. Nat Methods 12, 568-576 (2015).

28. Pina, A.S., Batalha, I.L., Dias, A. & Roque, A.C.A. Affinity Tags in Protein Purification and Peptide Enrichment: An Overview. Methods Mol Biol 2178, 107-132 (2021).

29. Peighambardoust, S.H., Karami, Z., Pateiro, M. & Lorenzo, J.M. A Review on Health-Promoting, Biological, and Functional Aspects of Bioactive Peptides in Food Applications. Biomolecules 11 (2021).

30. Katayama, S., Corpuz, H.M. & Nakamura, S. Potential of plant-derived peptides for the improvement of memory and cognitive function. Peptides 142, 170571 (2021).

31. Lee, T.H. et al. Novel short peptide tag from a bacterial toxin for versatile applications. J Immunol Methods 479, 112750 (2020).

32. DeCaprio, J. & Kohl, T.O. Tandem Immunoaffinity Purification Using Anti-FLAG and Anti-HA Antibodies. Cold Spring Harb Protoc 2019 (2019).

33. Traenkle, B., Segan, S., Fagbadebo, F.O., Kaiser, P.D. & Rothbauer, U. A novel epitope tagging system to visualize and monitor antigens in live cells with chromobodies. Sci Rep 10, 14267 (2020).

34. Mishra, V. Affinity Tags for Protein Purification. Curr Protein Pept Sci 21, 821-830 (2020).

35. Li, Y. Recombinant production of antimicrobial peptides in Escherichia coli: a review. Protein Expr Purif 80, 260-267 (2011).

36. Bhagawati, M. et al. A mesophilic cysteine-less split intein for protein trans-splicing applications under oxidizing conditions. Proc Natl Acad Sci U S A 116, 22164-22172 (2019).

37. Han, X., Ning, W., Ma, X., Wang, X. & Zhou, K. Improving protein solubility and activity by introducing small peptide tags designed with machine learning models. Metab Eng Commun 11, e00138 (2020).

38. Saribas, A.S., White, M.K. & Safak, M. Structure-based release analysis of the JC virus agnoprotein regions: A role for the hydrophilic surface of the major alpha helix domain in release. J Cell Physiol 233, 2343-2359 (2018).

39. Majorek, K.A., Kuhn, M.L., Chruszcz, M., Anderson, W.F. & Minor, W. Double trouble-Buffer selection and His-tag presence may be responsible for nonreproducibility of biomedical experiments. Protein Sci 23, 1359-1368 (2014).

40. Vasan, N. et al. Double PIK3CA mutations in cis increase oncogenicity and sensitivity to PI3Kalpha inhibitors. Science 366, 714-723 (2019).

41. Ding, Y., Lorenz, W.A. & Chuang, J.H. Coding Motif: exact determination of overrepresented nucleotide motifs in coding sequences. BMC Bioinformatics 13, 32 (2012).

42. Boo, S.H. & Kim, Y.K. The emerging role of RNA modifications in the regulation of mRNA stability. Exp Mol Med 52, 400-408 (2020).

43. Kim, J.J., Lipatova, Z. & Segev, N. TRAPP Complexes in Secretion and Autophagy. Front Cell Dev Biol 4, 20 (2016).

44. Pinar, M. et al. TRAPPII regulates exocytic Golgi exit by mediating nucleotide exchange on the Ypt31 ortholog RabERAB11. Proc Natl Acad Sci U S A 112, 4346-4351 (2015).

45. Reitz, C. The role of the retromer complex in aging-related neurodegeneration: a molecular and genomic review. Mol Genet Genomics 290, 413-427 (2015).

46. Vardarajan, B.N. et al. Identification of Alzheimer disease-associated variants in genes that regulate retromer function. Neurobiol Aging 33, 2231 e2215-2231 e2230 (2012).

47. Cohen, M.J., Chirico, W.J. & Lipke, P.N. Through the back door: Unconventional protein secretion. Cell Surf 6, 100045 (2020).

48. Ni, D. et al. Canonical Secretomes, Innate Immune Caspase-1-, 4/11-Gasdermin D Non-Canonical Secretomes and Exosomes May Contribute to Maintain Treg-Ness for Treg Immunosuppression, Tissue Repair and Modulate Anti-Tumor Immunity via ROS Pathways. Front Immunol 12, 678201 (2021).

49. Boroujeni, M.E. & Gardaneh, M. The Superiority of Sucrose Cushion Centrifugation to Ultrafiltration and PEGylation in Generating High-Titer Lentivirus Particles and Transducing Stem Cells with Enhanced Efficiency. Mol Biotechnol 60, 185-193 (2018).

Other embodiments

While the present disclosure has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

The patent and scientific literature referred to herein constitutes knowledge available to those skilled in the art. All U.S. patents and published or unpublished U.S. patent applications cited herein are incorporated herein by reference. All published foreign patents and patent applications cited herein are incorporated by reference. All other published references, documents, manuscripts, and scientific literature cited herein are incorporated by reference.

Claims

1. A composition comprising an expression enhancing oligonucleotide having 21 nucleobases and comprising a cis-regulatory coding motif that retains an in-frame portion of a target gene.

2. The composition of claim 1, wherein the expression enhancing oligonucleotide comprises a nucleic acid sequence comprising CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7).

3. A synthetic oligonucleotide comprising a nucleic acid sequence comprising CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7).

4. A synthetic oligonucleotide according to claim 3, wherein said oligonucleotide encodes a peptide comprising an amino acid sequence having at least 70% sequence identity to QPRFAAA (SEQ ID NO: 1).

5. The synthetic oligonucleotide according to any one of claims 3 to 4, wherein said oligonucleotide encodes a peptide comprising an amino acid sequence having at least 90% sequence identity to QPRFAAA (SEQ ID NO: 1).

6. The synthetic oligonucleotide according to any one of claims 3 to 5, wherein said oligonucleotide encodes a peptide comprising amino acid sequence QPRFAAA (SEQ ID NO: 1).

7. A construct comprising the oligonucleotide of any one of claims 1 to 6.

8. A chimeric molecule comprising one or more peptide domains and one or more 5'- and/or 3'-untranslated region (UTR) sequences or fragments thereof.

9. The chimeric molecule of claim 8, wherein the one or more peptide domains comprise from about five amino acids to about twenty amino acids.

10. The chimeric molecule of claim 9, wherein the one or more peptide domains comprise about seven amino acids.

11. The chimeric molecule of claim 8, wherein the one or more peptide domains comprise an amino acid sequence having at least 70% sequence identity to QPRFAAA (SEQ ID NO: 1).

12. The chimeric molecule of claim 8, wherein the peptide comprises an amino acid sequence having at least 90% sequence identity to QPRFAAA (SEQ ID NO: 1).

13. The chimeric molecule of claim 12, wherein the peptide comprises amino acid sequence QPRFAAA (SEQ ID NO: 1).

14. The chimeric molecule of claim 8, wherein the peptide domain comprises X_n-QPRFAAA-X_n, wherein n is independently 0 or greater than or equal to 1, and each X is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), aspartic acid (D), asparagine (N), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof.

15. The chimeric molecule of claim 8, wherein the one or more 5'-untranslated region (UTR) sequences or fragments thereof are derived from one or more viruses.

16. The chimeric molecule of claim 15, wherein the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, enveloped viruses, orthomyxoviruses, rhabdoviruses, or combinations thereof.

17. The chimeric molecule of claim 16, wherein the 5'-UTR and/or 3'-UTR is from a coronavirus.

18. The chimeric molecule of claim 17, wherein the coronavirus is SARS-CoV-2.

19. The chimeric molecule of claim 18, wherein the one or more 5'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence having at least 70% sequence identity to SARS-CoV-2 5'-UTR.

20. The chimeric molecule of claim 19, wherein the one or more 5'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence having at least 90% sequence identity to SARS-CoV-2 5'-UTR.

21. The chimeric molecule of claim 19, wherein the one or more 5'-UTR nucleic acid sequences or fragment thereof comprises a SARS-CoV-2 5'-UTR.

22. The chimeric molecule of claim 19, wherein the one or more 3'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence having at least 70% sequence identity to SARS-CoV-2 3'-UTR.

23. The chimeric molecule of claim 19, wherein the one or more 3'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence having at least 90% sequence identity to SARS-CoV-2 3'-UTR.

24. The chimeric molecule of claim 19, wherein the one or more 3'-UTR nucleic acid sequences or fragment thereof comprises a SARS-CoV-2 3'-UTR.

25. The chimeric molecule of claim 8, further comprising one or more biomolecules operably linked to the one or more peptide domains and/or the one or more 5'-UTR and/or 3'-UTR sequences.

26. The chimeric molecule of claim 25, wherein the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

27. The chimeric molecule of any one of claims 8 to 26, further comprising one or more promoter and/or regulatory sequences operably linked to the UTR or biomolecule.

28. A host cell comprising the oligonucleotide of any one of claims 1 to 7 or the chimeric molecule of any one of claims 8 to 27.

29. A construct encoding the chimeric molecule of claims 8 to 27.

30. A method of enhancing production of a biomolecule, the method comprising:

labelling a desired peptide or nucleic acid sequence with a chimeric molecule according to any one of claims 1 to 27 by conjugation or cloning;

expressing the peptide or nucleic acid sequence; and

harvesting the protein.

31. The method of claim 30, wherein the protein comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

32. A nucleic acid comprising a promoter, a 5'-untranslated region (5'-UTR) sequence, a biomolecule of interest, an oligonucleotide comprising a cis-regulatory coding motif, a 3'-untranslated region (3'-UTR) sequence, and combinations thereof.

33. The nucleic acid of claim 32, wherein the one or more 5'-untranslated region (UTR) and/or 3'-UTR sequences or fragments thereof are derived from one or more viruses.

34. The nucleic acid of claim 33, wherein the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, enveloped viruses, orthomyxoviruses, rhabdoviruses, or combinations thereof.

35. The nucleic acid of claim 33, wherein the 5'-UTR and/or 3'-UTR is derived from a coronavirus.

36. The nucleic acid of claim 35, wherein the coronavirus is SARS-CoV-2.

37. The nucleic acid of claim 36, wherein the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to SARS-CoV-2 5'-UTR.

38. The nucleic acid of claim 36, wherein the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to SARS-CoV-2 5'-UTR.

39. The nucleic acid of claim 36, wherein the one or more 5'-UTR nucleic acid sequences or fragments thereof comprise SARS-CoV-2 5'-UTR.

40. The nucleic acid of claim 36, wherein the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 70% sequence identity to SARS-CoV-2 3'-UTR.

41. The nucleic acid of claim 36, wherein the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise a nucleic acid sequence that has at least 90% sequence identity to SARS-CoV-2 3'-UTR.

42. The nucleic acid of claim 36, wherein the one or more 3'-UTR nucleic acid sequences or fragments thereof comprise SARS-CoV-2 3'-UTR.

43. A chimeric molecule comprising one or more oligonucleotides comprising a nucleic acid sequence of CAACCGCGGTTCGCGGCCGCT (SEQ ID NO: 7) and one or more 5'- and/or 3'-untranslated region (UTR) sequences or fragments thereof.

44. The chimeric molecule of claim 43, wherein said one or more oligonucleotides encode a peptide comprising from about five amino acids to about twenty amino acids.

45. The chimeric molecule of claim 44, wherein said one or more peptides comprise about seven amino acids.

46. The chimeric molecule of claim 44, wherein said one or more peptides comprise an amino acid sequence having at least 70% sequence identity to QPRFAAA (SEQ ID NO: 1).

47. The chimeric molecule of claim 44, wherein said one or more peptides comprise an amino acid sequence having at least 90% sequence identity to QPRFAAA (SEQ ID NO: 1).

48. A chimeric molecule according to claim 44 wherein the one or more peptides comprise amino acid sequence QPRFAAA (SEQ ID NO: 1).

49. The chimeric molecule of claim 43, wherein the one or more peptides comprise a sequence comprising X_n-QPRFAAA-X_n, wherein n is independently 0 or greater than or equal to 1, and each X is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), aspartic acid (D), asparagine (N), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof.

50. The chimeric molecule of claim 43, wherein the one or more 5'-untranslated region (UTR) sequences or fragments thereof are derived from one or more viruses.

51. The chimeric molecule of claim 50, wherein the one or more viruses comprise coronaviruses, retroviruses, picornaviruses, enveloped viruses, orthomyxoviruses, rhabdoviruses, or combinations thereof.

52. The chimeric molecule of claim 51, wherein the 5'-UTR and/or 3'-UTR is from a coronavirus.

53. The chimeric molecule of claim 52, wherein said coronavirus is SARS-CoV-2.

54. The chimeric molecule of claim 53, wherein the one or more 5'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence that has at least 70% sequence identity to SARS-CoV-2 5'-UTR.

55. The chimeric molecule of claim 53, wherein the one or more 5'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence having at least 90% sequence identity to SARS-CoV-2 5'-UTR.

56. The chimeric molecule of claim 53, wherein the one or more 5'-UTR nucleic acid sequences or fragment thereof comprises SARS-CoV-2 5'-UTR.

57. The chimeric molecule of claim 53, wherein the one or more 3'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence having at least 70% sequence identity to SARS-CoV-2 3'-UTR.

58. The chimeric molecule of claim 53, wherein the one or more 3'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence having at least 90% sequence identity to SARS-CoV-2 3'-UTR.

59. The chimeric molecule of claim 53, wherein the one or more 3'-UTR nucleic acid sequences or fragment thereof comprises SARS-CoV-2 3'-UTR.

60. The chimeric molecule of claim 43, further comprising one or more biomolecules operably linked to the one or more oligonucleotides and/or the one or more 5'-UTR and/or 3'-UTR sequences.

61. The chimeric molecule of claim 60, wherein the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

62. The chimeric molecule of claims 43 to 61, further comprising one or more promoter and/or regulatory sequences operably linked to the UTR or biomolecule.

63. An expression vector comprising the nucleic acid of any one of claims 43 to 62.

64. A host cell comprising the nucleic acid of any one of claims 1 to 63.

65. A chimeric molecule comprising one or more 5'- and/or 3'-untranslated region (UTR) sequences or fragments thereof associated with one or more biomolecules.

66. The chimeric molecule of claim 65, wherein the one or more 5'-untranslated region (UTR) and/or 3'-UTR sequences or fragments thereof are derived from one or more viruses.

67. The chimeric molecule of claim 66, wherein the one or more viruses comprise a coronavirus, a retrovirus, a picornavirus, an enveloped virus, an orthomyxovirus, a rhabdovirus, or a combination thereof.

68. The chimeric molecule of claim 67, wherein the 5'-UTR and/or 3'-UTR is from a coronavirus.

69. The chimeric molecule of claim 68, wherein the coronavirus is SARS-CoV-2.

70. The chimeric molecule of claim 69, wherein the one or more 5'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence that has at least 70% sequence identity to SARS-CoV-2 5'-UTR.

71. The chimeric molecule of claim 69, wherein the one or more 5'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence that has at least 90% sequence identity to SARS-CoV-2 5'-UTR.

72. The chimeric molecule of claim 69, wherein the one or more 5'-UTR nucleic acid sequences or fragment thereof comprises a SARS-CoV-2 5'-UTR.

73. The chimeric molecule of claim 69, wherein the one or more 3'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence that has at least 70% sequence identity to SARS-CoV-2 3'-UTR.

74. The chimeric molecule of claim 69, wherein the one or more 3'-UTR nucleic acid sequences or fragment thereof comprises a nucleic acid sequence that has at least 90% sequence identity to SARS-CoV-2 3'-UTR.

75. The chimeric molecule of claim 69, wherein the one or more 3'-UTR nucleic acid sequences or fragment thereof comprises a SARS-CoV-2 3'-UTR.

76. The chimeric molecule of claim 65, wherein the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

77. The chimeric molecule of any one of claims 65 to 76, further comprising one or more promoter and/or regulatory sequences operably linked to the UTR or biomolecule.

78. A synthetic peptide tag comprising amino acid sequence units of about five to about fifteen different amino acids, wherein the N-terminal and/or C-terminal amino acids are linked or fused to a target molecule.

79. The synthetic peptide tag of claim 78, wherein the amino acid sequence unit comprises seven amino acids.

80. The synthetic peptide tag of claim 79, wherein the amino acid sequence has at least 70% sequence identity to QPRFAAA (SEQ ID NO: 1).

81. The synthetic peptide tag of claim 79, wherein the amino acid sequence has at least 90% sequence identity to QPRFAAA (SEQ ID NO: 1).

82. The synthetic peptide tag of claim 79, wherein the amino acid sequence comprises the amino acid sequence QPRFAAA (SEQ ID NO: 1).

83. The synthetic peptide tag of claim 78, wherein the amino acid sequence unit comprises a peptide domain comprising Xn-QPRFAAA-Xn, wherein each n is independently 0 or greater than or equal to 1, and each X is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof.

84. The synthetic peptide tag of claim 78, further comprising a plurality of repeating amino acid sequence units.

85. The synthetic peptide tag of claim 84, wherein the repeating amino acid sequence units are in tandem.

86. The synthetic peptide tag of claim 85, wherein the amino acid sequence units are separated by a linker molecule or one or more amino acids.

87. A synthetic peptide comprising the structure: (Aa-Aa-Aa-Aa-Aa-Aa_z-Aa_z)_x, wherein x is greater than or equal to 1, z is 0 or 1, and each Aa is independently alanine (A), arginine (R), asparagine (N), aspartate (D), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, a modified amino acid, or a combination thereof.

88. A synthetic peptide comprising the structure: AA1-AA2-AA3-AA4-AA5-AA6-AA7, wherein each AA is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof.

89. A synthetic peptide comprising an amino acid sequence comprising the structure: Xn-QPRFAAA-Xn, wherein each n is independently 0 or greater than or equal to 1, and each X is independently: alanine (A), arginine (R), asparagine (N), aspartate (D), cysteine (C), glutamate (E), glutamine (Q), glycine (G), histidine (H), isoleucine (I), leucine (L), lysine (K), methionine (M), phenylalanine (F), proline (P), serine (S), threonine (T), tryptophan (W), tyrosine (Y), valine (V), selenocysteine, pyrrolysine, modified amino acids, or combinations thereof.

90. A fusion protein comprising the synthetic peptide of any one of claims 78 to 89 fused to one or more target peptides.

91. The fusion protein according to claim 90, wherein two or more synthetic peptides according to any one of claims 78 to 89 are fused to one target peptide.

92. A fusion molecule comprising the synthetic peptide of any one of claims 78 to 91 fused to one or more target peptides.

93. The fusion molecule of claim 92, wherein the biomolecule comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

94. A method of enhancing the production of a protein, the method comprising:

tagging a desired peptide or nucleic acid sequence with a peptide tag according to any one of claims 78 to 93 by fusion or cloning;

expressing the peptide or nucleic acid sequence; and

harvesting the protein.

95. The method of claim 94, wherein the protein comprises: viral transcripts/proteins, vaccines, antibodies, mRNA vaccines, DNA vaccines, peptide vaccines, oligonucleotides, polynucleotides, peptides, polypeptides, engineered recombinant proteins, synthetic peptides, natural peptides, cellular proteins, virions, antigens or biomimetics.

96. A composition comprising the peptide tagged biomolecule of any one of claims 78 to 93 and a pharmaceutically acceptable excipient, diluent or carrier.

97. A nucleic acid encoding the peptide tag of any one of claims 78 to 93.

98. An expression vector comprising the nucleic acid of claim 97.

99. A host cell comprising the expression vector of claim 98.
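
Illustrative note (not part of the claims). The sequence-identity thresholds recited for the seven-residue unit QPRFAAA (SEQ ID NO: 1) and the tandem arrangement of repeating units can be made concrete with a minimal Python sketch. It assumes an ungapped, position-by-position comparison of equal-length peptides, which is only one possible way to compute identity; the claims do not specify an alignment method. The function names and the "GS" linker are hypothetical and chosen for illustration only.

```python
REFERENCE_TAG = "QPRFAAA"  # seven-residue unit recited as SEQ ID NO: 1


def percent_identity(candidate: str, reference: str = REFERENCE_TAG) -> float:
    """Ungapped, position-by-position identity (an assumption of this sketch)."""
    if len(candidate) != len(reference):
        raise ValueError("this sketch only compares equal-length peptides")
    matches = sum(a == b for a, b in zip(candidate.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def min_matches(threshold_percent: int, length: int = len(REFERENCE_TAG)) -> int:
    """Smallest number of identical positions that meets an integer threshold."""
    # Integer ceiling of threshold_percent * length / 100.
    return (threshold_percent * length + 99) // 100


def tandem_tag(unit: str = REFERENCE_TAG, repeats: int = 2, linker: str = "") -> str:
    """Join `repeats` copies of `unit`, optionally separated by `linker`."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return linker.join([unit] * repeats)


if __name__ == "__main__":
    # One substitution in seven positions gives 6/7 identity.
    print(round(percent_identity("QPRFAAG"), 1))
    # Matches needed at the 70% and 90% thresholds under this definition.
    print(min_matches(70), min_matches(90))
    # Three tandem units separated by a hypothetical "GS" linker.
    print(tandem_tag(repeats=3, linker="GS"))
```

Under this ungapped definition, a seven-residue variant needs at least five matching positions (5/7, about 71.4%) to reach 70% identity, while 90% identity requires all seven positions to match, because six of seven is only about 85.7%. A gapped or length-normalized alignment method could give different numbers.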