WO2023208118A1 - Nucleic acid construct and use thereof - Google Patents

Nucleic acid construct and use thereof

Publication number
WO2023208118A1
Authority
WO
WIPO (PCT)
Prior art keywords
utr
seq
nucleotide sequence
sequence
virus
Application number
PCT/CN2023/091194
Inventor
胡占英
明鑫
高璐
王立
朱清源
朱奇慧
万海涛
陈罄馨
杨玉洁
Original Assignee
瑞可迪(上海)生物医药有限公司
Application filed by 瑞可迪(上海)生物医药有限公司
Publication of WO2023208118A1

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K39/12 Viral antigens
    • A61K39/215 Coronaviridae, e.g. avian infectious bronchitis virus
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P11/00 Drugs for disorders of the respiratory system
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/12 Antivirals
    • A61P31/14 Antivirals for RNA viruses
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • C07K14/08 RNA viruses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/10 Cells modified by introduction of foreign genetic material

Definitions

  • the present disclosure belongs to the field of nucleic acid drugs, and specifically relates to nucleic acid constructs comprising engineered UTRs and their use for preventing or treating diseases (e.g., viral infections).
  • the 5'UTR is key to recruiting ribosomes to the mRNA and to start-codon selection, playing an important role in regulating translation efficiency and shaping the cellular proteome (Ivanov, et al. Science. 2016, 352(6292):1413-1416).
  • the 3'UTR of eukaryotic mRNA carries a variety of regulatory motifs that can be recognized by microRNAs (miRNAs) and RNA-binding proteins (RBPs) to control the stability, localization and translation of the mRNA (Mazumder et al., 2003; Mayr, 2017).
  • the polyA tail located at the 3' end of the mRNA is another element that determines mRNA stability and protein levels.
  • the present disclosure provides an mRNA vaccine with a new UTR structure, which has the advantage of stable and efficient expression of the target gene. It can be universally applied to mRNA vaccines (for example, influenza or COVID-19 vaccines) to regulate the expression of the target gene, and has excellent prospects for clinical drug application.
  • nucleic acid elements that can regulate expression of genes of interest, as well as nucleic acid constructs.
  • the nucleic acid construct contains at least one nucleic acid element that can regulate the expression of the gene of interest.
  • the nucleic acid element is a 5' untranslated region element (5'UTR).
  • the 5'UTR is selected from or is derived from the 5'UTR of any of the genes Rho GTPase activating protein (ARHGAP), heat shock 27kDa protein 1 (HSPB1), beta globin (hemoglobin subunit beta, HBB) or C-C motif chemokine ligand 13 (CCL13), or a derivative sequence thereof.
  • the 5'UTR in the nucleic acid construct is derived from or is a 5'UTR sequence of the ARHGAP gene or a derivative sequence thereof.
  • the ARHGAP includes ARHGAP1, ARHGAP2, ARHGAP3, ARHGAP4, ARHGAP5, ARHGAP6, ARHGAP7 (DLC1), ARHGAP8, ARHGAP9, ARHGAP10, ARHGAP12, ARHGAP13 (SRGAP1), ARHGAP14 (SRGAP2), ARHGAP15, ARHGAP17 (RICH1), ARHGAP18, ARHGAP19, ARHGAP20, ARHGAP21, ARHGAP22, ARHGAP23, ARHGAP24, ARHGAP25, ARHGAP26.
  • the 5'UTR is derived from or is a 5'UTR sequence of the ARHGAP15 gene or a derivative sequence thereof.
  • the ARHGAP15 gene is derived from any species, such as human ARHGAP15, baboon ARHGAP15, monkey ARHGAP15, mouse ARHGAP15, etc.
  • the 5'UTR is derived from or is a 5'UTR sequence of the human ARHGAP15 gene or a derivative sequence thereof.
  • the 5'UTR of the ARHGAP15 gene includes or is the nucleotide sequence shown in SEQ ID NO: 1 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the aforementioned 5'UTR sequence derived from the ARHGAP15 gene includes a nucleotide sequence in which ATG is mutated into GTG.
  • the 5'UTR of the ARHGAP15 gene includes or is the nucleotide sequence shown in SEQ ID NO: 2 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the aforementioned 5'UTR sequence derived from the ARHGAP15 gene includes a nucleotide sequence in which ATG is mutated into CTG.
  • the 5'UTR of the ARHGAP15 gene includes or is the nucleotide sequence shown in SEQ ID NO: 42 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the truncation method of the 5'UTR truncation includes: deleting a contiguous nucleotide sequence at the 5' end, in the 5'-to-3' direction of the sequence; and/or deleting a contiguous nucleotide sequence at the 3' end, in the 3'-to-5' direction of the sequence. In some embodiments, the truncation method of the 5'UTR truncation includes deleting a contiguous nucleotide sequence at the 5' end, in the 5'-to-3' direction of the sequence, while retaining the nucleotide sequence at the 3' end.
  • the truncation method includes deleting the continuous nucleotide sequence at the 5' end in the direction from the 5' to the 3' end of the ARHGAP15 5'UTR, while retaining the nucleotide sequence at the 3' end.
  • the truncation method of the 5'UTR truncation includes deleting a contiguous nucleotide sequence at the 3' end, in the 3'-to-5' direction of the ARHGAP15 5'UTR, while retaining the nucleotide sequence at the 5' end.
  • the 5'UTR truncation is obtained by deleting a contiguous nucleotide sequence at the 5' end, in the 5'-to-3' direction of the ARHGAP15 5'UTR, and deleting a contiguous nucleotide sequence at the 3' end, in the 3'-to-5' direction of the sequence.
  • the 5'UTR truncation includes at least 5 consecutive nucleotides of the sequence shown in SEQ ID NO: 2 or 44; in some embodiments, the 5'UTR truncation includes SEQ ID NO: 2 or 44.
  • the 5'UTR truncation includes 7-62 consecutive nucleotides of the sequence shown in SEQ ID NO: 2 or 44; in some embodiments, the 5'UTR truncation includes 7-59 consecutive nucleotides of the sequence shown in SEQ ID NO: 2; exemplarily, the 5'UTR truncation contains 5, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 or 62 consecutive nucleotides of the sequence shown in SEQ ID NO: 2 or 44.
  • the 5' terminal nucleotide of the 5'UTR truncation (i.e., the starting nucleotide of the 5'UTR truncation) is the nucleotide at any one of positions 1-57, in natural numbering, of the sequence shown in SEQ ID NO: 2 or 44. In some embodiments, the 5' terminal nucleotide of the 5'UTR truncation is the nucleotide at any one of positions 1-13 or 17-29, in natural numbering, of the sequence shown in SEQ ID NO: 2 or 44.
  • the length of the 5'UTR truncation is 59bp, 55bp, 51bp, 47bp, 43bp, 39bp, 35bp, 31bp, 27bp, 23bp, 19bp, 15bp, 11bp, 7bp.
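The truncation scheme described above (deleting contiguous nucleotides from one end while retaining the other) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the applicant's procedure: `PLACEHOLDER_UTR` is a made-up 63-nucleotide string standing in for a full-length 5'UTR (it is not SEQ ID NO: 2 or 44), and the helper names `truncate_from_5prime` and `truncate_from_3prime` are hypothetical. Only the length series 59, 55, ..., 7 is taken from the text.

```python
# Hypothetical placeholder; NOT a sequence from the disclosure.
PLACEHOLDER_UTR = "ACGT" * 15 + "ACG"  # 63 nt


def truncate_from_5prime(utr: str, keep: int) -> str:
    """Delete contiguous nucleotides at the 5' end, retaining the last `keep` nt (3' end)."""
    if not 0 < keep <= len(utr):
        raise ValueError("keep must be between 1 and len(utr)")
    return utr[-keep:]


def truncate_from_3prime(utr: str, keep: int) -> str:
    """Delete contiguous nucleotides at the 3' end, retaining the first `keep` nt (5' end)."""
    if not 0 < keep <= len(utr):
        raise ValueError("keep must be between 1 and len(utr)")
    return utr[:keep]


# Truncation lengths listed in the text: 59, 55, ..., 7 (steps of 4 nt).
TRUNCATION_LENGTHS = list(range(59, 6, -4))

# One 5'-truncated variant per listed length, keyed by retained length.
truncations = {n: truncate_from_5prime(PLACEHOLDER_UTR, n) for n in TRUNCATION_LENGTHS}
```

Truncating from both ends, as in the combined embodiment, would simply chain the two helpers.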
  • the ARHGAP15 5'UTR truncated body does not comprise the point mutation of any of the preceding embodiments.
  • the 5'UTR of the aforementioned ARHGAP15 gene includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 31-40 or CTATAAT, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity to any of the aforementioned sequences.
  • the ARHGAP15 5'UTR truncated body comprises or is a nucleotide sequence as shown in any one of SEQ ID NO: 41, 76, 137-196 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the aforementioned 5'UTR derived from the HSPB1 gene includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 3-5 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the 5'UTR includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 1-2, 3-5, 28-43, 76, 137-196, or the sequence CTATAAT, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity to any of the aforementioned sequences.
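The recurring "at least 80% ... identity" language can be made concrete with a small Python sketch. It assumes a deliberately simplified definition: the two sequences are already aligned and of equal length, and identity is the fraction of matching positions. The disclosure does not state here which alignment method it relies on, so gapped alignment is out of scope; the function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length, pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_identity_threshold(candidate: str, reference: str, threshold: float = 80.0) -> bool:
    """True if `candidate` reaches the given percent identity to `reference`."""
    return percent_identity(candidate, reference) >= threshold
```

For example, a 10-nucleotide candidate that differs from its reference at one position has 90% identity under this definition, so it satisfies the 80%, 85% and 90% thresholds but not the 95% one.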
  • nucleic acid construct comprising:
  • the ORF is a polynucleotide sequence encoding a gene protein of interest.
  • the gene of interest is heterologous. In other embodiments, the gene of interest is endogenous.
  • the 5'UTR in the nucleic acid construct is located upstream of the open reading frame. In some embodiments, the 5'UTR in the nucleic acid construct is located at the 5' end of the open reading frame.
  • the 5'UTR in the nucleic acid construct is selected from or is derived from the 5'UTR of any gene such as Rho GTPase activating protein (ARHGAP), heat shock 27kDa protein 1 (HSPB1), beta globin (hemoglobin subunit beta, HBB) or C-C motif chemokine ligand 13 (CCL13), or its derivative sequence.
  • the 5'UTR in the nucleic acid construct is derived from or is the 5'UTR sequence of the ARHGAP15 gene or a derivative sequence thereof.
  • the ARHGAP15 gene is derived from any species, such as human ARHGAP15, baboon ARHGAP15, monkey ARHGAP15, mouse ARHGAP15, etc.
  • the 5'UTR in the nucleic acid construct is derived from or is a 5'UTR sequence of the human ARHGAP15 gene or a derivative sequence thereof.
  • the 5'UTR of the ARHGAP15 gene includes or is the nucleotide sequence shown in SEQ ID NO: 1 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the 5'UTR in the nucleic acid construct contains at least one point mutation that can be used to inhibit translation initiation by ATG within the UTR.
  • the point mutation is selected from mutations at any one or more positions of A, T or G of the ATG sequence in the 5'UTR.
  • the point mutation is a mutation at the A position of the ATG sequence in the 5'UTR.
  • the mutation is A to G, C, or T.
  • the mutation is ATG to GTG, CTG or TTG. Further, the mutation can be used to inhibit translation initiation by ATG within the UTR.
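The point mutation described above, changing the A of an ATG inside the 5'UTR to G, C or T, can be illustrated as follows. This is a sketch under assumptions: the function name `mutate_upstream_atg` is hypothetical, the input strings are made-up examples rather than sequences from the disclosure, and the sketch mutates every ATG it finds, whereas the text only requires at least one point mutation.

```python
def mutate_upstream_atg(utr: str, replacement: str = "G") -> str:
    """Replace the A of every ATG in a DNA-notation 5'UTR with G, C or T."""
    if replacement not in ("G", "C", "T"):
        raise ValueError("replacement must be G, C or T")
    seq = utr.upper()
    # Loop until no ATG remains: an A->G change can itself create a new ATG
    # (e.g. ATATG -> ATGTG), so a single pass is not always enough.
    # Each pass removes at least one A, so the loop terminates.
    while "ATG" in seq:
        seq = seq.replace("ATG", replacement + "TG")
    return seq
```

With the default replacement, `mutate_upstream_atg("CCATGCC")` yields `"CCGTGCC"`, which corresponds to the ATG-to-GTG embodiment; passing `"C"` or `"T"` gives the CTG and TTG variants.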
  • the aforementioned 5'UTR sequence derived from the ARHGAP15 gene includes a nucleotide sequence in which ATG is mutated into GTG.
  • the 5'UTR of the ARHGAP15 gene includes or is the nucleotide sequence shown in SEQ ID NO: 2 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the aforementioned 5'UTR sequence derived from the ARHGAP15 gene includes a nucleotide sequence in which ATG is mutated into CTG.
  • the 5'UTR of the ARHGAP15 gene includes or is the nucleotide sequence shown in SEQ ID NO: 42 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the truncation method of the 5'UTR truncation includes deleting a contiguous nucleotide sequence at the 3' end, in the 3'-to-5' direction of the sequence, while retaining the nucleotide sequence at the 5' end.
  • the truncation method of the 5'UTR truncation includes deleting a contiguous nucleotide sequence at the 5' end, in the 5'-to-3' direction of the sequence, and deleting a contiguous nucleotide sequence at the 3' end, in the 3'-to-5' direction of the sequence.
  • the truncation method includes deleting a contiguous nucleotide sequence at the 5' end, in the 5'-to-3' direction of the ARHGAP15 5'UTR, while retaining the nucleotide sequence at the 3' end. In some embodiments, the truncation method of the 5'UTR truncation includes deleting a contiguous nucleotide sequence at the 3' end, in the 3'-to-5' direction of the ARHGAP15 5'UTR, while retaining the nucleotide sequence at the 5' end.
  • the 5'UTR truncation is obtained by deleting a contiguous nucleotide sequence at the 5' end, in the 5'-to-3' direction of the ARHGAP15 5'UTR, and deleting a contiguous nucleotide sequence at the 3' end, in the 3'-to-5' direction of the sequence.
  • the 5'UTR truncation includes at least 5 consecutive nucleotides of the nucleotide sequence shown in SEQ ID NO: 2 or 44; in some embodiments, the 5'UTR truncation includes 5-62 consecutive nucleotides of the sequence shown in SEQ ID NO: 2 or 44; in some embodiments, the 5'UTR truncation includes 7-62 consecutive nucleotides of the sequence shown in SEQ ID NO: 2 or 44; in some embodiments, the 5'UTR truncation includes 7-59 consecutive nucleotides of the sequence shown in SEQ ID NO: 2; exemplarily, the 5'UTR truncation contains 5, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
  • the length of the 5'UTR truncation is 59bp, 55bp, 51bp, 47bp, 43bp, 39bp, 35bp, 31bp, 27bp, 23bp, 19bp, 15bp, 11bp, 7bp.
  • the ARHGAP15 5'UTR truncation further comprises a point mutation of any of the preceding embodiments. In some embodiments, the ARHGAP15 5'UTR truncation further comprises a nucleotide sequence in which ATG is mutated into GTG. In some embodiments, the aforementioned 5'UTR derived from the ARHGAP15 gene includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 28-30 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the ARHGAP15 5'UTR truncated body does not comprise the point mutation of any of the preceding embodiments.
  • the 5'UTR of the aforementioned ARHGAP15 gene includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 31-40 or CTATAAT, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the 5'UTR truncation at least includes CTATAAT or the nucleotide sequence shown in SEQ ID NO: 193.
  • the ARHGAP15 5'UTR truncated body comprises or is a nucleotide sequence as shown in any one of SEQ ID NO: 41, 76, 137-196 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the aforementioned 5'UTR derived from the HSPB1 gene includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 3-5 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the 5'UTR in the nucleic acid constructs of the present disclosure includes or is a nucleotide as shown in any of SEQ ID NO: 1-2, 3-5, 28-43, 76, 137-196
  • the 5'UTR as described in any of the above nucleic acid constructs includes or is a nucleotide sequence as shown in any one of SEQ ID NOs: 1-5, 28-43, 76, 137-196 , or the following sequence: CTATAAT.
  • nucleic acid constructs of the present disclosure further comprise: (c) a 3' untranslated region element (3'UTR).
  • the open reading frame and the 5'UTR and/or the 3'UTR of the present disclosure are derived from different genes.
  • the disclosed nucleic acid constructs comprise at least one open reading frame, at least one 5'UTR, or at least one 3'UTR.
  • the 5'UTR and the 3'UTR included in the nucleic acid constructs of the present disclosure are of the same or different origin, e.g., derived from the same or different genes.
  • the 5'UTR and 3'UTR in the nucleic acid constructs of the present disclosure are derived from the same species or different species.
  • the 3'UTR in the nucleic acid constructs of the present disclosure is located downstream of the open reading frame. In some embodiments, the 3'UTR in the nucleic acid construct is located at the 3' end of the open reading frame.
  • the 3'UTR is derived from or is a 3'UTR sequence of the HBB or ARHGAP15 gene or a derivative sequence thereof.
  • the aforementioned 3'UTR derived from the CORO1A gene includes or is the nucleotide sequence shown in SEQ ID NO: 9 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the aforementioned 3'UTR derived from the HPX gene includes or is the nucleotide sequence shown in SEQ ID NO: 10 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the 3'UTR in the nucleic acid construct of the present disclosure includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 7-10 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the 3'UTR includes or is a nucleotide sequence as shown in SEQ ID NO: 7 or 8.
  • the nucleic acid construct includes a 5'UTR and a 3'UTR, wherein:
  • the 5'UTR is derived from or is the 5'UTR of the ARHGAP15 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the ARHGAP15 gene or a derived sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the HBB gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the CORO1A gene or a derivative sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the HSPB1 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the HPX gene or a derived sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the CCL13 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the ARHGAP15 gene or a derivative sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the CCL13 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the CORO1A gene or a derived sequence thereof; or
  • the nucleic acid construct of the present disclosure includes a 5'UTR and a 3'UTR, and the 5'UTR and 3'UTR are selected from any one of the following:
  • the 5'UTR contains or is a nucleotide sequence as shown in any one of SEQ ID NO: 1-2, 28-43, 76, 137-196 or CTATAAT, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 8 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;
  • the 5'UTR contains or is a nucleotide sequence as shown in any one of SEQ ID NO: 1-2, 28-43, 76, 137-196 or CTATAAT, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 9 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;
  • the 5'UTR contains or is a nucleotide sequence as shown in any one of SEQ ID NO: 1-2, 28-43, 76, 137-196 or CTATAAT, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 10 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 3 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 7 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 3 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 8 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 3 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 9 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 4 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 7 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 4 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 9 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 4 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 10 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 5 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 8 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 5 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 9 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 5 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 10 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the nucleic acid construct includes a 5'UTR and a 3'UTR, wherein:
  • the 5'UTR includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 1-2, 28-43, 76, 137-196 or CTATAAT, and the 3'UTR is derived from or is the 3'UTR of any of the HBB, ARHGAP15, CORO1A and HPX genes or a derivative sequence thereof.
  • the nucleic acid construct includes a 5'UTR and a 3'UTR, wherein:
  • the 5'UTR contains or is a nucleotide sequence as shown in any one of SEQ ID NO: 1-2, 28-43, 76, 137-196 or CTATAAT, and the 3'UTR contains or is a nucleotide sequence as shown in any one of SEQ ID NO: 7-10.
  • nucleic acid constructs of the present disclosure further comprise: (d) a poly(A) tail.
  • the poly-A tail in the nucleic acid construct is located downstream of the 3'UTR. In some embodiments, the poly-A tail in the nucleic acid construct is located at the 3' end of the 3'UTR. In some embodiments, the poly-A tail is at the 3' end of the nucleic acid construct. In some embodiments, the poly-A tail is at least about 50, 100, 150, 200, 300, 400 or 500 nucleotides in length.
  • the poly-A tail includes, but is not limited to, HGH polyA, SV40 polyA, BGH polyA, rbGlob polyA, or SV40late polyA.
  • the poly-A tail comprises or is a nucleotide sequence as shown in SEQ ID NO: 16 or 135.
  • ORF and the 5'UTR and/or the 3'UTR are derived from different genes.
  • the 5'UTR in i), iii) to iv) is selected from the 5'UTR derived from any gene such as ARHGAP, HSPB1, HBB, CCL13, etc. or its derivative sequence (for example, the 5'UTR of ARHGAP15 or its derivative sequence).
  • the 3'UTR in ii) to iv) is selected from the 3'UTR derived from any gene such as HBB, ARHGAP15, CORO1A, HPX, etc. or its derivative sequence (e.g., the 3'UTR of HBB or ARHGAP15 or its derivative sequence).
  • the poly-A tail in iv) to iv) comprises, for example, the nucleotide sequence shown in SEQ ID NO: 16 or 135.
  • the 5'UTR in i) includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 1-5, 28-43, 76, 137-196 or CTATAAT.
  • the 3'UTR in ii) includes or is a nucleotide sequence as shown in any one of SEQ ID NOs: 7-10.
  • the 5'UTR in iii) comprises or is a nucleotide sequence as shown in any one of SEQ ID NO: 1-5, 28-43, 76, 137-196 or CTATAAT, and the 3'UTR comprises or is a nucleotide sequence as shown in any one of SEQ ID NO: 7-10.
  • the 5'UTR in iv) includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 1-5, 28-43, 76, 137-196, CTATAAT, and the 3'UTR includes or is a nucleotide sequence as set forth in any of SEQ ID NO: 7-10, and the poly-A tail contains or is a nucleotide sequence as set forth in SEQ ID NO: 16 or 135.
  • the polypeptide or protein encoded by the ORF is a fluorescent protein or luciferase.
  • the structural protein is selected from the group consisting of spike protein (S protein), envelope protein (E protein), membrane protein (M protein) and nucleocapsid protein (N protein).
  • the structural protein is spike protein.
  • the spike protein is SARS-COV-2 spike protein.
  • the SARS-COV-2 spike protein is selected from the spike protein of any virus strain such as SARS-COV-2 (e.g., wild-type SARS-COV-2), SARS-COV-2 Alpha (B.1.1.7), SARS-COV-2 Beta (B.1.351), SARS-COV-2 Gamma (P.1), SARS-COV-2 Kappa (B.1.617.1), SARS-COV-2 Delta (B.1.617.2), SARS-COV-2 Omicron (B.1.1.529), SARS-COV-2 Omicron (BA.4), etc.
  • the SARS-COV-2 spike protein comprises or is an amino acid sequence as shown in any one of SEQ ID NO: 12, 14, 21, 23, 25, 80, 136 or an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the ORF comprises a codon-optimized nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity to a wild-type nucleotide sequence encoding a SARS-COV-2 antigen (e.g., wild-type SARS-COV-2, SEQ ID NO: 81).
  • the ORF encoding the polypeptide or protein comprises or is a nucleotide sequence as shown in any one of SEQ ID NO: 13, 15, 20, 22, 24, 81, 97 or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the ORF encodes an influenza virus antigen.
  • the influenza virus is selected from type A influenza virus or type B influenza virus.
  • the influenza virus is influenza A virus H1N1, influenza A virus H3N2, influenza A virus H3N8, influenza A virus H2N2, influenza A virus H5N1, influenza A virus H9N2, influenza A virus H7N7, influenza B virus/Victoria (e.g., Influenza B virus/Washington/02/2019), influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013), etc.
  • the influenza virus antigen is a structural protein of influenza virus, for example, hemagglutinin (HA), neuraminidase (NA), ion channel protein M2 (M2 ion channel), matrix protein M1 (matrix protein), nuclear protein NP (nucleoprotein), etc.
  • influenza virus antigen is the HA protein of influenza virus, such as influenza A protein.
  • the influenza virus antigen is the NA protein of influenza virus, such as the NA protein of influenza A virus H1N1, influenza A virus H3N2, influenza B virus Victoria (e.g., Influenza B/Washington/02/2019) or influenza B virus Yamagata (e.g., Influenza B/Phuket/3073/2013).
  • the HA protein comprises or is an amino acid sequence as shown in any one of SEQ ID NO: 83-86, 98-101 or an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the ORF comprises a codon-optimized nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity to a wild-type nucleotide sequence encoding the HA antigen.
  • the ORF includes or is the nucleotide sequence shown in any one of SEQ ID NOs: 87-90, 102-105 or a sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the present disclosure provides a nucleic acid construct that sequentially includes a 5'UTR, an ORF, a 3'UTR and a poly-A tail in the 5' to 3' direction.
  • the 5'UTR includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 1-5, 28-43, 76, 137-196 or CTATAAT, the ORF includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 13, 15, 20, 22, 24, 81, 97, 87-90, 93-96, 102-105, the 3'UTR contains or is a nucleotide sequence as shown in any one of SEQ ID NO: 7-10, and the poly-A tail contains or is the nucleotide sequence shown in SEQ ID NO: 16 or 135.
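The 5'-to-3' element order of the construct (5'UTR, ORF, 3'UTR, poly-A tail) can be sketched as a small Python data structure. Everything except the order of the elements, the CTATAAT motif and the 50-nucleotide minimum tail length is an assumption made for illustration: the class name `Construct`, the toy ORF and 3'UTR strings, and the ORF sanity check (start codon, stop codon, length divisible by three) do not come from the disclosure.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass(frozen=True)
class Construct:
    """Hypothetical container for the four elements, in DNA notation."""
    utr5: str
    orf: str
    utr3: str
    poly_a: str

    def __post_init__(self) -> None:
        orf = self.orf.upper()
        # Illustrative check only: ATG start, in-frame length, terminal stop codon.
        if len(orf) % 3 != 0 or not orf.startswith("ATG") or orf[-3:] not in STOP_CODONS:
            raise ValueError("ORF must start with ATG, end with a stop codon, and be in frame")

    def sequence(self) -> str:
        """Concatenate the elements in 5'-to-3' order."""
        return (self.utr5 + self.orf + self.utr3 + self.poly_a).upper()


# Toy values; only CTATAAT and the 50-nt tail length appear in the text.
example = Construct(utr5="CTATAAT", orf="ATGGCCTAA", utr3="GCTGCC", poly_a="A" * 50)
```

Keeping the elements as separate fields, rather than one concatenated string, makes it straightforward to swap a single element (for example, a different 5'UTR) while holding the others fixed.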
  • RNA molecule comprising:
  • the gene of interest is heterologous. In other embodiments, the gene of interest is endogenous.
  • the 5'UTR in the RNA molecule is located upstream of the open reading frame. In some embodiments, the 5'UTR in the RNA molecule is located at the 5' end of the open reading frame.
  • the 5'UTR in the RNA molecule is, or is derived from, the 5'UTR of any of the following genes, or a derivative sequence thereof: Rho GTPase activating protein (ARHGAP), heat shock 27 kDa protein 1 (HSPB1), beta globin (hemoglobin subunit beta, HBB), C-C motif chemokine ligand 13 (CCL13), or any other gene.
  • the 5'UTR in the RNA molecule is derived from or is the 5'UTR sequence of the ARHGAP gene or a derivative sequence thereof.
  • the ARHGAP includes ARHGAP1, ARHGAP2, ARHGAP3, ARHGAP4, ARHGAP5, ARHGAP6, ARHGAP7 (DLC1), ARHGAP8, ARHGAP9, ARHGAP10, ARHGAP12, ARHGAP13 (SRGAP1), ARHGAP14 (SRGAP2), ARHGAP15, ARHGAP17 (RICH1), ARHGAP18, ARHGAP19, ARHGAP20, ARHGAP21, ARHGAP22, ARHGAP23, ARHGAP24, ARHGAP25, and ARHGAP26.
  • the 5'UTR is derived from or is a 5'UTR sequence of the ARHGAP15 gene or a derivative sequence thereof.
  • the ARHGAP15 gene is derived from any species, such as human ARHGAP15, baboon ARHGAP15, mouse ARHGAP15, etc.
  • the 5'UTR is derived from or is a 5'UTR sequence of the human ARHGAP15 gene or a derivative sequence thereof.
  • the 5'UTR of the ARHGAP15 gene includes or is the nucleotide sequence shown in SEQ ID NO: 45, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the 5'UTR in the RNA molecules of the present disclosure contains at least one point mutation, including the point mutations in any embodiment of the aforementioned nucleic acid constructs.
  • the mutation is A to G, C, or U.
  • the aforementioned 5'UTR sequence derived from the ARHGAP15 gene includes the nucleotide sequence shown in SEQ ID NO: 79, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 79, wherein N 2 is selected from A, G, C or U.
  • the aforementioned 5'UTR sequence derived from the ARHGAP15 gene includes a nucleotide sequence in which AUG is mutated into GUG.
  • the 5'UTR derived from the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 46, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the aforementioned 5'UTR sequence derived from the ARHGAP15 gene includes a nucleotide sequence in which AUG is mutated into UUG.
  • the 5'UTR derived from the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 78, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
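The AUG-to-GUG and AUG-to-UUG variants described above can be illustrated with a minimal Python sketch. The function name `mutate_upstream_aug` and the toy sequence are hypothetical and are not taken from the sequence listing; the sketch also assumes, for illustration only, that every AUG triplet found in the 5'UTR is mutated at its first position.

```python
def mutate_upstream_aug(utr: str, replacement: str = "G") -> str:
    """Replace the A of each AUG triplet in a 5'UTR with G, C or U.

    Hypothetical helper: the disclosure states only that the point
    mutation is A to G, C, or U (e.g., AUG -> GUG or AUG -> UUG).
    """
    if replacement not in {"G", "C", "U"}:
        raise ValueError("replacement must be G, C or U")
    # AUG cannot overlap with itself, so a plain string replace is sufficient.
    return utr.upper().replace("AUG", replacement + "UG")


# Toy 5'UTR fragment (illustrative only, not a sequence from the disclosure).
toy_utr = "GGAUGCC"
gug_variant = mutate_upstream_aug(toy_utr, "G")
uug_variant = mutate_upstream_aug(toy_utr, "U")
```

Because the substitution only removes an A, it cannot create a new AUG triplet elsewhere in the sequence.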
  • the 5'UTR in the RNA molecules of the present disclosure is a 5'UTR truncation.
  • the 5'UTR truncation retains activity similar to that of the natural 5'UTR; for example, it still regulates expression of the protein of the target gene encoded by the ORF.
  • the truncation method of the 5'UTR truncation includes: deleting a contiguous nucleotide sequence at the 5' end, proceeding in the 5' to 3' direction; and/or deleting a contiguous nucleotide sequence at the 3' end, proceeding in the 3' to 5' direction. In some embodiments, the truncation method of the 5'UTR truncation includes deleting a contiguous nucleotide sequence at the 5' end, proceeding in the 5' to 3' direction, while retaining the nucleotide sequence at the 3' end.
  • the truncation method of the 5'UTR truncation includes deleting a contiguous nucleotide sequence at the 3' end, proceeding in the 3' to 5' direction, while retaining the nucleotide sequence at the 5' end.
  • the truncation method of the 5'UTR truncation includes deleting a contiguous nucleotide sequence at the 5' end, proceeding in the 5' to 3' direction, and also deleting a contiguous nucleotide sequence at the 3' end, proceeding in the 3' to 5' direction.
  • the 5'UTR truncation further comprises a point mutation of any of the preceding embodiments. In some embodiments, the 5'UTR truncation does not comprise the point mutation of any of the preceding embodiments. In some embodiments, the 5'UTR truncation retains at least 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 15%, 10%, 5%, or 1% of the length of the native 5'UTR sequence.
  • the aforementioned 5'UTR sequence derived from the ARHGAP15 gene includes or is an ARHGAP15 5'UTR truncation.
  • the truncation method includes deleting a contiguous nucleotide sequence at the 5' end of the ARHGAP15 5'UTR, proceeding in the 5' to 3' direction; and/or deleting a contiguous nucleotide sequence at the 3' end, proceeding in the 3' to 5' direction.
  • the truncation method includes deleting a contiguous nucleotide sequence at the 5' end of the ARHGAP15 5'UTR, proceeding in the 5' to 3' direction, while retaining the nucleotide sequence at the 3' end.
  • the truncation method of the 5'UTR truncation includes deleting a contiguous nucleotide sequence at the 3' end of the ARHGAP15 5'UTR, proceeding in the 3' to 5' direction, while retaining the nucleotide sequence at the 5' end.
  • the 5'UTR truncation is obtained by deleting a contiguous nucleotide sequence at the 5' end of the ARHGAP15 5'UTR, proceeding in the 5' to 3' direction, and also deleting a contiguous nucleotide sequence at the 3' end, proceeding in the 3' to 5' direction.
  • the truncation retains at least 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 15%, 10%, 5%, or 1% of the length of the native ARHGAP15 5'UTR (SEQ ID NO: 45).
  • the 5'UTR truncation comprises at least 5 consecutive nucleotides of the nucleotide sequence shown in SEQ ID NO: 46 or 79; in some embodiments, the 5'UTR truncation comprises 5-62 consecutive nucleotides of the sequence shown in SEQ ID NO: 46 or 79; in some embodiments, the 5'UTR truncation comprises 7-62 consecutive nucleotides of the sequence shown in SEQ ID NO: 46 or 79; in some embodiments, the 5'UTR truncation comprises 7-59 consecutive nucleotides of the sequence shown in SEQ ID NO: 46 or 79; exemplarily, the number of consecutive nucleotides of the sequence shown in SEQ ID NO: 2 that the 5'UTR truncation comprises is 5, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43
  • the 5' terminal nucleotide of the 5'UTR truncation (i.e., the starting nucleotide of the 5'UTR truncation) is the nucleotide at any one of positions 1-57, in natural counting, of the sequence shown in SEQ ID NO: 46 or 79. In some embodiments, the 5' terminal nucleotide of the 5'UTR truncation is the nucleotide at any one of positions 1-13 or 17-29, in natural counting, of the sequence shown in SEQ ID NO: 46 or 79.
  • the length of the 5'UTR truncation is 59 bp, 55 bp, 51 bp, 47 bp, 43 bp, 39 bp, 35 bp, 31 bp, 27 bp, 23 bp, 19 bp, 15 bp, 11 bp, or 7 bp.
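The truncation scheme described above (deleting contiguous nucleotides from the 5' end, the 3' end, or both) can be sketched in Python as follows. The helper names and the placeholder sequence `TOY_UTR` are hypothetical; the actual SEQ ID NO: 45/46 sequences are not reproduced here. The series function additionally assumes, purely for illustration, that each listed length is produced by 5'-end deletion while the 3' end is retained.

```python
def truncate_utr(utr: str, delete_5prime: int = 0, delete_3prime: int = 0) -> str:
    """Delete contiguous nucleotides from the 5' and/or 3' end of a UTR."""
    if delete_5prime < 0 or delete_3prime < 0:
        raise ValueError("deletion lengths must be non-negative")
    if delete_5prime + delete_3prime >= len(utr):
        raise ValueError("truncation must retain at least one nucleotide")
    return utr[delete_5prime:len(utr) - delete_3prime]


def five_prime_deletion_series(utr: str, lengths) -> dict:
    """Map each target length to the truncation that keeps the 3' end."""
    return {
        n: truncate_utr(utr, delete_5prime=len(utr) - n)
        for n in lengths
        if 0 < n <= len(utr)
    }


# Placeholder 64-nt sequence (illustrative only, not from the sequence listing).
TOY_UTR = "ACGU" * 16

# The lengths listed in the text: 59, 55, ..., 11, 7 (steps of 4).
LISTED_LENGTHS = list(range(59, 6, -4))

series = five_prime_deletion_series(TOY_UTR, LISTED_LENGTHS)
```

Calling `truncate_utr` with both arguments non-zero covers the embodiment in which nucleotides are removed from both ends.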
  • the ARHGAP15 5'UTR truncation further comprises a point mutation of any of the preceding embodiments. In some embodiments, the ARHGAP15 5'UTR truncation further comprises a nucleotide sequence in which AUG is mutated into GUG. In some embodiments, the aforementioned 5'UTR derived from the ARHGAP15 gene includes or is the nucleotide sequence shown in any one of SEQ ID NO: 63-65, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the ARHGAP15 5'UTR truncation does not comprise the point mutation of any of the preceding embodiments.
  • the aforementioned 5'UTR derived from the ARHGAP15 gene includes or is the nucleotide sequence shown in any one of SEQ ID NO: 66-75, CAUUAAU, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the aforementioned 5'UTR derived from the ARHGAP15 gene includes or is a sequence corresponding to the sequence shown in any one of SEQ ID NO: 41, 76, 137-196 (substituting U for T in the sequence), or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the aforementioned 5'UTR derived from the HSPB1 gene includes or is the nucleotide sequence shown in SEQ ID NO: 47, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the aforementioned 5'UTR derived from the HBB gene includes or is the nucleotide sequence shown in SEQ ID NO: 48, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the aforementioned 5'UTR derived from the CCL13 gene includes or is the nucleotide sequence shown in SEQ ID NO: 49, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the 5'UTR in the RNA molecule of the present disclosure includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 45-49, 63-75, 77-78, CAUUAAU, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the 5'UTR in the RNA molecule contains the sequence corresponding to the sequence shown in any one of SEQ ID NO: 41, 76, 137-196 (replacing T in the sequence with U), or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the 5'UTR as described in any of the above RNA molecules includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 45-49, 63-75, 77-78, CAUUAAU.
  • the 5'UTR in the RNA molecule contains the sequence corresponding to the sequence shown in SEQ ID NO: 41, 76, 137-196 (replacing T in the sequence with U).
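The phrase "replacing T in the sequence with U" describes converting a DNA-form sequence into its RNA counterpart. A minimal Python sketch follows; the function name `dna_to_rna` is hypothetical, and the only sequence used is the short motif CTATAAT that appears in the nucleic acid construct embodiments.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA counterpart of a DNA sequence by replacing T with U."""
    dna = dna.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(unexpected)}")
    return dna.replace("T", "U")


# The DNA-form motif listed among the 5'UTR options of the nucleic acid construct.
rna_motif = dna_to_rna("CTATAAT")
```

The sketch validates the alphabet first so that a sequence already written in RNA form (containing U) is rejected rather than silently passed through.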
  • RNA molecules of the present disclosure further comprise: (c) a 3' untranslated region element (3'UTR).
  • the open reading frame and the 5'UTR and/or the 3'UTR of the present disclosure are derived from different genes.
  • the RNA molecules of the disclosure comprise at least one open reading frame, at least one 5'UTR, or at least one 3'UTR.
  • the 5'UTR and 3'UTR in the RNA molecules of the present disclosure are of the same or different origins, such as from the same or different genes.
  • the 5'UTR and 3'UTR in the RNA molecules of the present disclosure are derived from the same or different species.
  • the 3'UTR in the RNA molecules of the present disclosure is located downstream of the open reading frame. In some embodiments, the 3'UTR in the RNA molecule is located at the 3' end of the open reading frame. In some embodiments, the 3'UTR in the RNA molecule of the present disclosure is, or is derived from, the 3'UTR of hemoglobin subunit beta (HBB), ARHGAP15, coronin 1A (CORO1A), hemopexin (HPX) or any other gene, or a derivative sequence thereof.
  • the 3'UTR is derived from or is a 3'UTR sequence of the HBB or ARHGAP15 gene or a derivative sequence thereof.
  • the aforementioned 3'UTR derived from the ARHGAP15 gene includes or is the nucleotide sequence shown in SEQ ID NO: 51, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the aforementioned 3'UTR derived from the CORO1A gene includes or is the nucleotide sequence shown in SEQ ID NO: 52, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the aforementioned 3'UTR derived from the HPX gene includes or is the nucleotide sequence shown in SEQ ID NO: 53, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the 3'UTR in the RNA molecules of the present disclosure includes or is the nucleotide sequence shown in any one of SEQ ID NO: 50-53, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
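The recurring "at least 80%, 85%, ... 100% identity" thresholds presuppose some way of scoring two sequences against each other. The disclosure does not state in this passage which alignment method is used, so the Python sketch below is only an assumption: it counts matches with a longest-common-subsequence alignment and divides by the length of the longer sequence. The function names and toy sequences are hypothetical, and a dedicated alignment tool would normally be used in practice.

```python
THRESHOLDS = (80, 85, 90, 95, 96, 97, 98, 99, 100)


def percent_identity(a: str, b: str) -> float:
    """Simplified percent identity: LCS matches / length of the longer sequence.

    Assumption for illustration; not the identity definition of the disclosure.
    """
    a, b = a.upper(), b.upper()
    if not a or not b:
        raise ValueError("sequences must be non-empty")
    # Row-by-row dynamic programming for the longest common subsequence.
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return 100.0 * prev[-1] / max(len(a), len(b))


def highest_threshold_met(a: str, b: str):
    """Return the highest listed threshold satisfied, or None if below 80%."""
    pid = percent_identity(a, b)
    met = [t for t in THRESHOLDS if pid >= t]
    return max(met) if met else None


# Toy sequences (illustrative only): they differ at the last position.
reference = "ACGUACGUAC"
candidate = "ACGUACGUAG"
```

With this scoring, a single substitution in a 10-nt sequence yields 90% identity, so the candidate meets the 80%, 85% and 90% thresholds but not 95%.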
  • the 3'UTR includes or is a nucleotide sequence as shown in SEQ ID NO: 50 or 51.
  • the RNA molecule contains a 5'UTR and a 3'UTR, wherein:
  • the 5'UTR is, or is derived from, the 5'UTR of any gene such as ARHGAP (for example, ARHGAP15), HSPB1, HBB, CCL13 or the like, or a derivative sequence thereof, and the 3'UTR is, or is derived from, the 3'UTR of any gene such as HBB, ARHGAP15, CORO1A, or HPX, or a derivative sequence thereof.
  • the 5'UTR is derived from or is the 5'UTR of the ARHGAP15 gene or its derivative sequence, and the 3'UTR is derived from or is the 3'UTR of the HBB gene or its derived sequence;
  • the 5'UTR is derived from or is the 5'UTR of the ARHGAP15 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the ARHGAP15 gene or a derived sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the ARHGAP15 gene or its derivative sequence, and the 3'UTR is derived from or is the 3'UTR of the CORO1A gene or its derived sequence;
  • the 5'UTR is derived from or is the 5'UTR of the ARHGAP15 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the HPX gene or a derivative sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the HBB gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the HBB gene or a derived sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the HBB gene or its derivative sequence, and the 3'UTR is derived from or is the 3'UTR of the ARHGAP15 gene or its derived sequence;
  • the 5'UTR is derived from or is the 5'UTR of the HBB gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the CORO1A gene or a derivative sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the HBB gene or its derivative sequence, and the 3'UTR is derived from or is the 3'UTR of the HPX gene or its derivative sequence;
  • the 5'UTR is derived from or is the 5'UTR of the HSPB1 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the HBB gene or a derivative sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the HSPB1 gene or its derivative sequence, and the 3'UTR is derived from or is the 3'UTR of the ARHGAP15 gene or its derivative sequence;
  • the 5'UTR is derived from or is the 5'UTR of the HSPB1 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the CORO1A gene or a derivative sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the HSPB1 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the HPX gene or a derived sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the CCL13 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the HBB gene or a derivative sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the CCL13 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the ARHGAP15 gene or a derivative sequence thereof;
  • the 5'UTR is derived from or is the 5'UTR of the CCL13 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the CORO1A gene or a derivative sequence thereof; or
  • the 5'UTR is derived from or is the 5'UTR of the CCL13 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the HPX gene or a derivative sequence thereof.
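The pairings listed above amount to every combination of four 5'UTR source genes with four 3'UTR source genes. A short Python sketch makes the enumeration explicit; the variable and function names are hypothetical, and only the gene symbols come from the text.

```python
from itertools import product

# Source genes named in the text for each UTR position.
FIVE_PRIME_UTR_GENES = ("ARHGAP15", "HBB", "HSPB1", "CCL13")
THREE_PRIME_UTR_GENES = ("HBB", "ARHGAP15", "CORO1A", "HPX")


def utr_combinations():
    """Enumerate every (5'UTR gene, 3'UTR gene) pairing in listing order."""
    return list(product(FIVE_PRIME_UTR_GENES, THREE_PRIME_UTR_GENES))


combinations = utr_combinations()
for utr5_gene, utr3_gene in combinations:
    print(f"5'UTR: {utr5_gene:9s} 3'UTR: {utr3_gene}")
```

The Cartesian product gives 4 x 4 = 16 pairings, including the same-gene pairs ARHGAP15/ARHGAP15 and HBB/HBB.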
  • the RNA molecule of the present disclosure contains a 5'UTR and a 3'UTR, and the 5'UTR and 3'UTR are selected from any one of the following:
  • the 5'UTR contains or is a nucleotide sequence as shown in any one of SEQ ID NO: 45-46, 63-75, 77-78, CAUUAAU, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto, or the 5'UTR contains the sequence corresponding to the sequence shown in any one of SEQ ID NO: 41, 76, 137-196 (replacing T in the sequence with U), or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto; and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 50, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto;
  • the 5'UTR includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 45-46, 63-75, 77-78, CAUUAAU, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto, or the 5'UTR contains the sequence corresponding to the sequence shown in any one of SEQ ID NO: 41, 76, 137-196 (replacing T in the sequence with U), or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto; and the 3'UTR contains or is the nucleotide sequence shown in SEQ ID NO: 53, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 47, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 50, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 47, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 51, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 47, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 52, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 48, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 51, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 48, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 52, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 49, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 50, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 49, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 51, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto;
  • the 5'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 49, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 52, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto; or
  • the 5'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 49, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto, and the 3'UTR contains or is the nucleotide sequence set forth in SEQ ID NO: 53, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the RNA molecule contains a 5'UTR and a 3'UTR, wherein:
  • the 5'UTR contains or is a nucleotide sequence as shown in any one of SEQ ID NO: 45-46, 63-75, 77-78, CUAUAAU, or the 5'UTR contains SEQ ID NO: 41, 76, 137- The sequence corresponding to the sequence shown in 196 (replacing T in the sequence with U); and the 3'UTR contains or is a nucleotide sequence as shown in any one of SEQ ID NO: 50-53.
  • RNA molecules of the present disclosure further comprise: (d) a polyadenylate (poly-A) tail.
  • the poly-A tail includes, but is not limited to, HGH polyA, SV40 polyA, BGH polyA, rbGlob polyA, or SV40late polyA.
  • the 5' cap structure in the RNA molecule is located upstream of the 5'UTR. In some embodiments, the 5' cap structure in the RNA molecule is located at the 5' end of the 5'UTR. In some embodiments, the 5' cap structure is a cap structure known to those skilled in the art, such as Cap0 (methylation of the first nucleobase, e.g., m7GpppN), Cap1 (additional methylation of the ribose of the nucleotide adjacent to m7GpppN, such as m7G(5')ppp(5')(2'OMeA)pG), Cap2 (additional methylation of the ribose of the 2nd nucleotide downstream of m7GpppN), Cap3 (additional methylation of the ribose of the 3rd nucleotide downstream of m7GpppN), Cap4 (additional methylation of the ribose of the 4th nucleotide downstream of m7GpppN), ARCA (anti-reverse cap analog), modified ARCA (e.g., phosphorothioate-modified ARCA), inosine, N1-methyl-guanosine, 2'-fluoro-guanosine, 7-deaza-guanosine, 8-oxo-guanosine, 2-amino-gua
  • RNA synthesis or in vitro RNA transcription is used to form a 5'-cap structure (e.g., Cap0 or Cap1).
  • a 5'-cap structure (e.g., Cap0 or Cap1) is formed via enzymatic capping using a capping enzyme (e.g., vaccinia virus capping enzyme and/or cap-dependent 2'-O methyltransferase).
  • an immobilized capping enzyme is used to add a 5' cap structure (Cap0 or Cap1).
  • the capping methods and means of WO2016/193226 are incorporated herein in their entirety.
  • the 5' cap structure includes, but is not limited to, ARCA, 3'OMe-m7G(5')ppp(5')G, m7G(5')ppp(5')(2'OMeA)pU, m7Gppp(A2'O-MOE)pG, m7G(5')ppp(5')(2'OMeA)pG, m7G(5')ppp(5')(2'OMeG)pG, m7(3'OMeG) (5')ppp(5')(2'OMeG)pG or m7(3'OMeG)(5')ppp(5')(2'OMeA)pG.
  • the 5' cap structure is m7G(5')ppp(5')(2'OMeA)pG.
  • Other 5' cap structures or cap structure analogs may also be used.
  • the RNA molecule as described in any one of the above contains any one of i)-v):
  • ORF and the 5'UTR and/or the 3'UTR are derived from different genes.
  • the 5'UTR in i), iii) to v) is selected from the 5'UTR derived from any gene such as ARHGAP, HSPB1, HBB, CCL13, etc., or a derivative sequence thereof (for example, the 5'UTR of ARHGAP15 or a derivative sequence thereof).
  • the 3'UTR in ii) to v) is selected from the 3'UTR derived from any gene such as HBB, ARHGAP15, CORO1A, HPX, etc., or a derivative sequence thereof (e.g., the 3'UTR of HBB or ARHGAP15 or a derivative sequence thereof).
  • the poly-A tail in iv) to v) comprises, for example, the nucleotide sequence shown in SEQ ID NO: 56.
  • the 5' cap structure in v) includes, but is not limited to, Cap0, Cap1 (eg, m7G(5')ppp(5')(2'OMeA)pG), Cap2, Cap3, Cap4, ARCA.
  • the 5'UTR in i) includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 45-49, 63-75, 77-78, CAUUAAU; or, the 5'UTR includes the sequence corresponding to the sequence shown in any one of SEQ ID NO: 41, 76, 137-196 (replacing T in the sequence with U).
  • the 3'UTR in ii) includes or is a nucleotide sequence as shown in any one of SEQ ID NOs: 50-53.
  • the 5'UTR in iii) includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 45-49, 63-75, 77-78, CAUUAAU, or the 5'UTR includes the sequence corresponding to the sequence shown in any one of SEQ ID NO: 41, 76, 137-196 (replacing T in the sequence with U); and the 3'UTR contains or is a nucleotide sequence as shown in any one of SEQ ID NO: 50-53.
  • the 5'UTR in v) includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 45-49, 63-75, 77-78, CUAUAAU, or the 5'UTR includes the sequence corresponding to the sequence shown in any one of SEQ ID NO: 41, 76, 137-196 (replacing T in the sequence with U); the 3'UTR contains or is a nucleotide sequence as shown in any one of SEQ ID NO: 50-53; the poly-A tail contains or is the nucleotide sequence shown in SEQ ID NO: 56; and the 5' cap structure is m7G(5')ppp(5')(2'OMeA)pG.
  • the ORF includes a nucleotide sequence encoding at least one polypeptide or protein.
  • the nucleotide sequence may be a codon-optimized nucleotide sequence.
  • the polypeptide or protein encoded by the ORF is a fluorescent protein or luciferase.
  • the polypeptide or protein encoded by the ORF is a viral antigen.
  • viral antigens include, but are not limited to, antigens of influenza virus, coronavirus, respiratory syncytial virus, human immunodeficiency virus, herpes simplex virus, rabies virus, Epstein-Barr virus, and the like.
  • the viral antigen is a coronavirus antigen.
  • the coronavirus is a coronavirus that infects humans, such as SARS-CoV-2 (COVID-19), SARS-CoV, HCoV-229E, HCoV-OC43, HCoV-NL63, HCoV-HKU1, or MERS-CoV.
  • the coronavirus is SARS-COV-2.
  • the coronavirus antigen is a structural protein.
  • the structural protein is selected from the group consisting of spike protein (S protein), envelope protein (E protein), membrane protein (M protein) and nucleocapsid protein (N protein).
  • the structural protein is spike protein.
  • the spike protein is SARS-COV-2 spike protein.
  • the SARS-COV-2 spike protein is selected from the spike proteins of SARS-COV-2 (e.g., wild-type SARS-COV-2), SARS-COV-2 Alpha (B.1.1.7), SARS-COV-2 Beta (B.1.351), SARS-COV-2 Gamma (P.1), SARS-COV-2 Kappa (B.1.617.1), SARS-COV-2 Delta (B.1.617.2), SARS-COV-2 Omicron (B.1.1.529), SARS-COV-2 Omicron (BA.4), and others.
  • the ORF comprises a codon-optimized nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity to a wild-type nucleotide sequence encoding a SARS-COV-2 antigen (e.g., wild-type SARS-COV-2, SEQ ID NO: 82).
  • the ORF encoding the polypeptide or protein comprises or is a nucleotide sequence as shown in any one of SEQ ID NO: 54-55, 59-61, 82, 130, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the influenza virus antigen is a structural protein of influenza virus, for example, hemagglutinin (HA), neuraminidase (NA), the M2 ion channel protein, matrix protein M1, nucleoprotein (NP), etc.
  • the influenza virus antigen is the HA protein of an influenza virus, such as the HA protein of influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (for example, Influenza B/Washington/02/2019), or influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013).
  • the influenza virus antigen is the NA protein of an influenza virus, such as the NA protein of influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (for example, Influenza B/Washington/02/2019), or influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013).
  • the HA protein comprises or is an amino acid sequence as shown in any one of SEQ ID NO: 83-86, 98-101, or an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the ORF comprises a codon-optimized nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity to a wild-type nucleotide sequence encoding the HA antigen.
  • the ORF encoding the polypeptide or protein comprises or is a nucleotide sequence as shown in any one of SEQ ID NO: 126-129, 131-134, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity thereto.
  • the present disclosure provides an RNA molecule that sequentially includes a 5'UTR, an ORF, a 3'UTR and a poly-A tail in the 5' to 3' direction.
  • the 5'UTR comprises or is a nucleotide sequence as shown in any one of SEQ ID NO: 45-49, 63-75, 77-78, CAUUAAU, or, the 5'UTR contains the sequence corresponding to the sequence shown in any one of SEQ ID NO: 41, 76, 137-196 (replacing T in the sequence with U);
  • the ORF includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 54-55, 59-61, 82, 126-134
  • the 3'UTR includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 50-53
  • the poly-A tail includes or is the nucleotide sequence shown in SEQ ID NO: 56 or 135.
  • the RNA molecule includes the nucleotide sequence shown in NO
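The element order stated above (5'UTR, ORF, 3'UTR, poly-A tail, read 5' to 3') can be illustrated with a small Python sketch. The class name `RnaConstruct` and all four toy element sequences are hypothetical placeholders; none of them is taken from the sequence listing, and the 5' cap is omitted because it is a chemical structure rather than a template-encoded sequence.

```python
from dataclasses import dataclass

RNA_ALPHABET = set("ACGU")


@dataclass(frozen=True)
class RnaConstruct:
    """Hypothetical container for the four sequence elements, in 5'->3' order."""
    utr5: str
    orf: str
    utr3: str
    poly_a: str

    def __post_init__(self):
        # Reject empty elements and anything outside the RNA alphabet.
        for name in ("utr5", "orf", "utr3", "poly_a"):
            part = getattr(self, name)
            if not part or set(part) - RNA_ALPHABET:
                raise ValueError(f"{name} must be a non-empty RNA string (A, C, G, U)")

    def sequence(self) -> str:
        """Concatenate the elements in the order 5'UTR, ORF, 3'UTR, poly-A."""
        return self.utr5 + self.orf + self.utr3 + self.poly_a


# Toy elements (illustrative only).
toy = RnaConstruct(utr5="GGGAAA", orf="AUGGCCUAA", utr3="CCCUUU", poly_a="A" * 30)
full_length = toy.sequence()
```

Keeping the elements as separate fields makes it straightforward to swap one 5'UTR or 3'UTR for another while holding the ORF constant.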
  • Rho GTPase activating protein 15 (ARHGAP15) belongs to the ARHGAP family and is a Rac1-specific GTPase-activating protein (GAP) and a main negative regulator of Rho family GTPase activity.
  • This disclosure provides more selections and combinations of 5'UTR and 3'UTR with excellent expression efficiency. It also provides mRNA vaccines with the potential to effectively prevent novel coronavirus infection and influenza virus infection, especially infection by their various mutant strains.
  • the present disclosure also provides an isolated polynucleotide encoding one or more polypeptides or proteins, wherein the polypeptides or proteins are coronavirus antigens.
  • the coronavirus is SARS-COV-2.
  • the coronavirus antigen is a structural protein.
  • the structural protein is selected from the group consisting of spike protein (S protein), envelope protein (E protein), membrane protein (M protein) and nucleocapsid protein (N protein).
  • the structural protein is spike protein.
  • the spike protein is SARS-COV-2 spike protein.
  • the SARS-COV-2 spike protein is selected from the spike protein of any virus strain such as SARS-COV-2 (e.g., wild-type SARS-COV-2), SARS-COV-2 Alpha (B.1.1.7), SARS-COV-2 Beta (B.1.351), SARS-COV-2 Gamma (P.1), SARS-COV-2 Kappa (B.1.617.1), SARS-COV-2 Delta (B.1.617.2), SARS-COV-2 Omicron (B.1.1.529), SARS-COV-2 Omicron (BA.4), etc.
  • the ORF encodes an influenza virus antigen.
  • the influenza virus is selected from type A influenza virus or type B influenza virus.
  • the influenza virus is influenza A virus H1N1, influenza A virus H3N2, influenza A virus H3N8, influenza A virus H2N2, influenza A virus H5N1, influenza A virus H9N2, influenza A virus H7N7, influenza B virus/Victoria (e.g., Influenza B/Washington/02/2019), influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013), etc.
  • influenza virus antigen is a structural protein of influenza virus, for example, hemagglutinin (HA), neuraminidase (NA), ion channel protein M2 (M2ion channel), matrix protein M1 (matrix protein), nuclear protein NP (nucleoprotein), etc.
  • influenza virus antigen is the HA protein of influenza virus, such as influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (for example, Influenza B/Washington/02/2019) , HA protein of influenza B virus/Yamagata (eg, Influenza B/Phuket/3073/2013).
  • the ORF comprises a codon-optimized nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity to a wild-type nucleotide sequence encoding a SARS-COV-2 antigen (e.g., wild-type SARS-COV-2 mRNA, SEQ ID NO: 82).
  • the isolated polynucleotide comprises or is the nucleotide sequence set forth in any one of SEQ ID NOs: 54-55, 59-61, 82, 126-134, or a nucleotide sequence at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identical thereto.
  • the polynucleotide is RNA, such as mRNA. In some embodiments, the polynucleotide is an ORF region of an mRNA.
  • the nucleic acid construct or RNA molecule of the present disclosure encodes a protein of interest, wherein the open reading frame (ORF) has enhanced expression of the protein of interest.
  • the RNA or polynucleotide molecules may further comprise one or more modifications (including chemical modifications), such as backbone modifications, sugar modifications, base modifications and/or lipid modifications.
  • the RNA or polynucleotide molecule is uniformly modified to a particular modification (eg, completely modified throughout the sequence).
  • RNA can be modified uniformly with pseudouridines so that every U in the sequence is a pseudouridine.
  • Backbone modifications in connection with the present disclosure refer to chemical modifications of the phosphates of the backbone of the nucleotides contained in the RNA or polynucleotide molecules of the present disclosure.
  • the backbone modification includes, but is not limited to, completely replacing the unmodified phosphate portion of the backbone with a modified phosphate ester.
  • the phosphate of the backbone can be modified by replacing one or more oxygen atoms with different substituent groups.
  • the modified phosphates include, but are not limited to, phosphorothioates, phosphoroselenates, borano phosphates, borano phosphate esters, hydrogen phosphonates, phosphoramidates, alkyl or aryl phosphonates, and phosphotriesters.
  • Sugar modifications in connection with the present disclosure refer to chemical modifications of the sugars of the nucleotides contained in the RNA or polynucleotide molecules of the present disclosure.
  • the sugar modifications include, but are not limited to, modifying or replacing the 2' hydroxyl (OH) group of the RNA molecule with a number of different "oxy" or "deoxy" substituents.
  • the "oxy" modification includes, but is not limited to, substitution modifications of alkoxy, aryloxy, polyethylene glycol (PEG), and the like.
  • the "deoxy" modifications include, but are not limited to, hydrogen and amino (e.g., NH2, alkylamino, dialkylamino, heterocyclyl, arylamino, diarylamino, heteroarylamino, diheteroarylamino or amino acid) modifications.
  • Base modifications in connection with the present disclosure refer to chemical modifications of the base portion of the nucleotides contained in the RNA or polynucleotide molecules of the present disclosure.
  • the base modifications include modifications to adenine, guanine, cytosine, and uracil in nucleotides.
  • the nucleosides and nucleotides described herein can be chemically modified on the major groove face.
  • the major groove chemical modification may include amino, thiol, alkyl, or halogen groups.
  • the base modification includes, but is not limited to, pseudouridine, 1-methyl-pseudouridine, 5-azacytidine, 5-methylcytosine-5'-triphosphate or 2-methoxyadenine modification.
  • the base modification is a pseudouridine modification.
  • the RNA can be uniformly modified with pseudouridines such that each U in the sequence is a pseudouridine.
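The uniform pseudouridine modification described above can be pictured at the sequence level with a minimal sketch. It assumes the RNA is written as a plain string of A, C, G and U and uses the Greek letter Ψ as shorthand for pseudouridine; the function name and the toy sequence are hypothetical and are not taken from the disclosure.

```python
PSEUDOURIDINE = "\u03a8"  # Greek capital psi, used here as shorthand for pseudouridine


def uniformly_modify(rna: str, modified: str = PSEUDOURIDINE) -> str:
    """Return the sequence with every uridine (U) written as the modified nucleoside."""
    rna = rna.upper()
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"not an RNA sequence, unexpected symbols: {sorted(unexpected)}")
    return rna.replace("U", modified)


toy_rna = "AUGGCUUAA"  # hypothetical toy sequence, not a SEQ ID NO of the disclosure
print(uniformly_modify(toy_rna))
```

The sketch only models notation; in practice the modified nucleoside triphosphate would be supplied during synthesis.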
  • Lipid modification in connection with the present disclosure means that the RNA or polynucleotide molecule of the present disclosure contains a lipid modification.
  • the lipid modification includes, but is not limited to, an RNA or polynucleotide molecule of the disclosure covalently attached to at least one linker, and a corresponding linker covalently attached to at least one lipid.
  • the lipid modification includes, but is not limited to, an RNA or polynucleotide molecule of the disclosure covalently attached to at least one lipid (no linker).
  • UTRs (5'UTR and/or 3'UTR) can be provided as flanking regions to the nucleic acid constructs, RNA or polynucleotide molecules of the present disclosure. UTRs may be homologous or heterologous to the coding region in the nucleic acid construct, RNA, or polynucleotide molecule of the present disclosure.
  • the flanking region may contain one or more 5'UTRs and/or 3'UTRs, which may be the same or different sequences. Any portion of the flanking region can be codon optimized. Any portion of the flanking regions may independently contain one or more different structural or chemical modifications before and/or after codon optimization.
  • In order to alter one or more properties of the nucleic acid construct, RNA or polynucleotide of the present disclosure, UTRs heterologous to the ORF of the present disclosure are introduced or engineered into the nucleic acid construct, RNA or polynucleotide of the present disclosure. The recombinant nucleic acid construct, RNA or polynucleotide is then administered to a cell, tissue or organism, and outcomes such as protein level, localization and/or half-life are measured to assess the beneficial effects of the heterologous UTR on the nucleic acid construct, RNA or polynucleotide of the present disclosure.
  • Regulatory elements and other elements useful or necessary for the expression of the encoded polypeptide of the present disclosure are, for example, promoters, terminators, selection markers, leader sequences, reporter genes, etc.
  • the nucleic acid constructs of the present disclosure may be prepared or obtained by known means (e.g., by automated DNA synthesis and/or recombinant DNA technology) based on the information of the nucleotide sequences of the present disclosure, and/or may be isolated from suitable natural sources.
  • the vector of the present disclosure also contains a promoter; for example, the promoter is located at the 5' end of the 5'UTR of the nucleic acid construct; for example, the promoter is a T7 promoter, a T7lac promoter, a Tac promoter, a Lac promoter or a Trp promoter.
  • the present disclosure also provides a host cell comprising the nucleic acid construct, RNA or polynucleotide of any one of the preceding.
  • the cells are capable of expressing polypeptides encoded by one or more nucleic acid constructs, RNAs, or polynucleotides of the present disclosure.
  • the host cell is a bacterial cell, a fungal cell, or a mammalian cell.
  • Bacterial cells include, for example, cells of Gram-negative bacterial strains (such as Escherichia coli strains, Proteus strains, and Pseudomonas strains) and Gram-positive bacterial strains (such as Bacillus strains, Streptomyces strains, Staphylococcus strains and Lactococcus strains).
  • Mammalian cells include, for example, HEK293 cells, CHO cells, BHK cells, HeLa cells, COS cells, and the like.
  • the present disclosure provides a method of preparing a nucleic acid construct, RNA, or polynucleotide of the present disclosure, as well as a method of preparing the polypeptide encoded therein.
  • methods for preparing nucleic acid constructs, RNA or polynucleotides, and the polypeptides encoded by them, including particularly suitable vectors, transformation or transfection methods, selection markers, methods for inducing protein expression, culture conditions, etc., are known in the art.
  • protein isolation and purification techniques suitable for use in methods of making the encoded polypeptides of the present disclosure are well known to those skilled in the art.
  • the method of preparing RNA molecules includes: preparing a nucleic acid construct or vector, and then using the nucleic acid construct or vector as a template for in vitro transcription to obtain an RNA molecule. In some specific embodiments, the method further includes adding a 5' Cap to the 5' end of the RNA molecule.
  • the present disclosure also provides a vaccine comprising the nucleic acid construct of any of the foregoing, the RNA of any of the foregoing, and/or the polynucleotide of any of the foregoing.
  • the nucleic acid construct, RNA or polynucleotide encodes one or more antigens of one or more viral strains.
  • Vaccines provided by the present disclosure may be monovalent vaccines, multivalent vaccines, or combination vaccines.
  • the present disclosure provides a monovalent vaccine comprising RNA encoding an antigen of one organism.
  • the monovalent vaccine contains RNA encoding one antigen of one viral strain.
  • the vaccine may include a nucleic acid construct, RNA, or polynucleotide molecule, or multiple nucleic acid constructs, RNA, or polynucleotide molecules encoding two or more antigens of the same or different species.
  • the vaccine includes RNA or RNAs encoding two or more antigens of the same or different viral strains.
  • the RNA may encode 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more viral antigens.
  • the antigen is a coronavirus antigen, such as SARS-CoV-2 (COVID-19), SARS-CoV, HCoV-229E, HCoV-OC43, HCoV-NL63, HCoV-HKU1 or MERS-CoV.
  • the coronavirus antigens are structural proteins, such as those selected from the group consisting of spike protein (S protein), envelope protein (E protein), membrane protein (M protein) and nucleocapsid protein (N protein).
  • the structural protein is a spike protein, such as SARS-COV-2 spike protein.
  • the SARS-COV-2 spike protein is the spike protein of any virus strain selected from SARS-COV-2 (e.g., wild-type SARS-COV-2), SARS-COV-2 Alpha (B.1.1.7), SARS-COV-2 Beta (B.1.351), SARS-COV-2 Gamma (P.1), SARS-COV-2 Kappa (B.1.617.1), SARS-COV-2 Delta (B.1.617.2), SARS-COV-2 Omicron (B.1.1.529), SARS-COV-2 Omicron (BA.4), etc.
  • the SARS-COV-2 spike protein comprises or is an amino acid sequence as shown in any one of SEQ ID NO: 12, 14, 21, 23, 25, 80, 136, or an amino acid sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the DNA sequence encoding the SARS-COV-2 spike protein includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 13, 15, 20, 22, 24, 81, 97, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the RNA sequence encoding the SARS-COV-2 spike protein includes or is a nucleotide sequence as shown in any one of SEQ ID NO: 54-55, 59-61, 82, 130, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • the ORF encodes an influenza virus antigen.
  • the influenza virus is selected from type A influenza virus or type B influenza virus.
  • the influenza virus is influenza A virus H1N1, influenza A virus H3N2, influenza A virus H3N8, influenza A virus H2N2, influenza A virus H5N1, influenza A virus H9N2, influenza A virus H7N7, influenza B virus/Victoria (e.g., Influenza B/Washington/02/2019), influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013), etc.
  • the influenza virus antigen is a structural protein of influenza virus, for example, hemagglutinin (HA), neuraminidase (NA), ion channel protein M2 (M2 ion channel), matrix protein M1 (matrix protein), nucleoprotein (NP), etc.
  • the influenza virus antigen is the HA protein of an influenza virus, such as the HA protein of influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (e.g., Influenza B/Washington/02/2019) or influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013).
  • the RNA sequence encoding the HA protein comprises or is a nucleotide sequence as shown in any one of SEQ ID NO: 126-129, 131-134, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
  • two or more different RNAs can be formulated in the same lipid nanoparticle.
  • two or more different RNAs can be separately formulated in separate lipid nanoparticles, and the lipid nanoparticles can then be combined and administered as a single vaccine composition (e.g., including multiple RNAs encoding multiple antigens), or can be administered alone.
  • the present disclosure also provides multivalent/combination vaccines comprising RNA encoding one or more coronaviruses or antigens from one or more different organisms. That is, the vaccine of the present disclosure can be a multivalent/combination vaccine that targets one or more antigens of the same strain/species, or one or more antigens of different strains/species.
  • the present disclosure also provides a delivery system comprising the nucleic acid construct described in any one of the foregoing, or the RNA molecule described in any of the foregoing; wherein the delivery vehicle is a cationic lipid delivery particle.
  • the particles are nanoparticles.
  • the delivery vehicle is a lipid nanoparticle.
  • the RNA molecules in the present disclosure can be delivered into cells and/or in vivo using any type of nanolipid particles in the art.
  • nanolipid particles include, but are not limited to, the lipid particles disclosed in WO2017075531, WO2018081480A1, WO2017049245A2, WO2017099823A1, WO2022245888A1, WO2022150717A1, CN101291653A, CN102119217A, WO2011000107A1, and CN107028886A; the above patents are fully incorporated into this disclosure by reference.
  • the present disclosure also provides a pharmaceutical composition, which includes the nucleic acid construct described in any one of the aforementioned, the polynucleotide described in any of the aforementioned, the vaccine described in any of the aforementioned, or the vaccine described in any of the aforementioned.
  • the pharmaceutical composition is a solid preparation, an injection, an external preparation, a spray, a liquid preparation, or a compound preparation.
  • the present disclosure provides a product or kit, which includes the nucleic acid construct described in any one of the aforementioned, the polynucleotide described in any of the aforementioned, the vaccine described in any of the aforementioned, or the vaccine described in any of the aforementioned.
  • the kit can be used to provide relevant detection or diagnostic purposes.
  • the present disclosure also provides a method of administering to a subject in need a therapeutically and/or prophylactically effective amount of the nucleic acid construct described in any of the foregoing, the polynucleotide as described in any of the foregoing, or the polynucleotide as described in any of the foregoing.
  • the disease includes a viral infectious disease or a respiratory disease associated with a viral infection.
  • the virus is a coronavirus.
  • the coronavirus is a coronavirus that infects humans, such as SARS-CoV-2 (COVID-19), SARS-CoV, HCoV-229E, HCoV-OC43, HCoV-NL63, HCoV-HKU1 or MERS-CoV.
  • the coronavirus is SARS-CoV-2.
  • the virus is an influenza virus.
  • the influenza virus is an influenza virus that infects humans, such as influenza A virus or influenza B virus.
  • the influenza virus is influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (e.g., Influenza B/Washington/02/2019), influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013), etc.
  • the respiratory diseases related to viral infection include simple infections (such as fever, cough, sore throat, headache, rhinitis, etc.), pneumonia, acute respiratory tract infection, severe acute respiratory infection (SARI), hypoxic respiratory failure, acute respiratory distress syndrome, sepsis, septic shock, severe acute respiratory syndrome (SARS), etc.
  • the present disclosure also provides a method for treating and/or preventing diseases, comprising administering to a subject in need a therapeutically and/or preventively effective amount of the nucleic acid construct described in any one of the foregoing, or the nucleic acid construct described in any of the foregoing.
  • the disease includes a viral infectious disease or a respiratory disease associated with a viral infection.
  • the virus is a coronavirus.
  • the influenza virus is influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (e.g., Influenza B/Washington/02/2019), influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013), etc.
  • the respiratory diseases related to viral infection include simple infections (such as fever, cough, sore throat, headache, rhinitis, etc.), pneumonia, acute respiratory tract infection, severe acute respiratory infection (SARI), hypoxic respiratory failure, acute respiratory distress syndrome, sepsis, septic shock, severe acute respiratory syndrome (SARS), etc.
  • the present disclosure also provides a method, comprising administering to a subject in need an effective amount of the nucleic acid construct of any of the foregoing, the polynucleotide of any of the foregoing, or the polynucleotide of any of the foregoing.
  • the virus is a coronavirus.
  • the coronavirus is a coronavirus that infects humans, such as SARS-CoV-2 (COVID-19), SARS-CoV, HCoV-229E, HCoV-OC43, HCoV-NL63, HCoV-HKU1 or MERS-CoV.
  • the coronavirus is SARS-CoV-2.
  • the virus is an influenza virus.
  • the influenza virus is an influenza virus that infects humans, such as influenza A virus or influenza B virus.
  • the influenza virus is influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (e.g., Influenza B/Washington/02/2019), influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013), etc.
  • the subject is immune. In some embodiments, the subject has pulmonary disease. In some embodiments, the subject is 5 years old or younger, or 65 years old or older.
  • the method includes administering to the subject at least one dose, two doses, three doses, four doses or more of the nucleic acid construct described in any of the foregoing, the polynucleotide of any of the foregoing, the vaccine according to any of the foregoing, the carrier according to any of the foregoing, the delivery vehicle according to any of the foregoing, the pharmaceutical composition according to any of the foregoing, or the product or kit according to any of the foregoing.
  • Viral antigens such as coronavirus antigens).
  • a neutralizing antibody titer of at least 100 NU/mL, 200 NU/mL, 300 NU/mL, 400 NU/mL, 500 NU/mL, 600 NU/mL, 700 NU/mL, 800 NU/mL, 900 NU/mL or 1000 NU/mL is produced in the subject's serum 1-72 hours after administration.
  • the antibody titer produced in the subject is increased by at least 1 log relative to the control.
  • the antibody titer produced in a subject can be increased by at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 log relative to a control.
  • the antibody titer produced in the subject is increased by at least 2-fold relative to the control.
  • the subject produces an increase in antibody titer of at least 3, 4, 5, 6, 7, 8, 9, or 10-fold relative to a control.
  • the geometric mean is the nth root of the product of n numbers, and is often used to describe proportional growth.
  • the geometric mean is used to characterize the antibody titer produced in a subject.
  • the control may be an unvaccinated subject, or a subject administered a live attenuated virus vaccine, an inactivated virus vaccine, or a protein subunit vaccine.
  • the antigen-specific immune response is characterized by measuring the antibody titer produced against the (coronavirus) antigen in a subject following administration of the preceding embodiments.
  • Antibody titer is a measurement of the amount of antibodies in a subject, for example, antibodies specific to a particular antigen or epitope of an antigen.
  • Antibody titers are usually expressed as the reciprocal of the maximum dilution that provides a positive result.
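The reciprocal-of-maximum-dilution convention described above can be sketched as follows. The sketch assumes each serum dilution has a single assay signal and that a sample counts as positive when its signal exceeds a fixed cutoff; the function name, the cutoff and the readings are hypothetical illustration values, not data from the disclosure.

```python
from typing import Dict, Optional


def endpoint_titer(readings: Dict[int, float], cutoff: float) -> Optional[int]:
    """Return the reciprocal of the highest dilution that still gives a positive signal.

    `readings` maps a dilution factor (100 means a 1:100 dilution) to the assay
    signal measured at that dilution. Returns None when no dilution is positive.
    """
    positive = [factor for factor, signal in readings.items() if signal > cutoff]
    return max(positive) if positive else None


# Hypothetical ELISA-style signals for a four-fold dilution series.
series = {100: 2.10, 400: 1.20, 1600: 0.45, 6400: 0.08}
print(endpoint_titer(series, cutoff=0.20))
```

A real assay would define the cutoff from negative controls and would need a rule for non-monotonic dilution curves, which this sketch does not handle.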
  • Enzyme-linked immunosorbent assay (ELISA) is a common assay used to determine antibody titers.
  • the antibody titer produced in the subject against the (coronavirus) antigen is increased by at least 1 log relative to the control.
  • the antibody titer produced in a subject can be increased by at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 log relative to a control.
  • the antibody titer produced in the subject is increased by at least 2-fold relative to the control.
  • the subject produces an increase in antibody titer of at least 3, 4, 5, 6, 7, 8, 9, or 10-fold relative to a control.
  • the antigen-specific immune response is measured as a ratio of geometric mean titers (GMT) of serum neutralizing antibodies against the coronavirus, termed the geometric mean ratio (GMR).
  • the geometric mean titer (GMT) is the average antibody titer for a group of subjects and is calculated by multiplying all values and taking the nth root, where n is the number of subjects for which data are available.
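The GMT calculation, together with the fold and log increases relative to a control that are mentioned in the preceding embodiments, can be sketched as below. The titers are invented illustration values, and reading the "log" increase as a base-10 logarithm of the titer ratio is an assumption.

```python
import math
from typing import Sequence


def geometric_mean_titer(titers: Sequence[float]) -> float:
    """nth root of the product of n titers, computed in log space for stability."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty sequence of positive values")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


def fold_increase(gmt_test: float, gmt_control: float) -> float:
    """Ratio of two geometric mean titers (a geometric mean ratio)."""
    return gmt_test / gmt_control


def log10_increase(gmt_test: float, gmt_control: float) -> float:
    """The same ratio expressed in log10 units (1 log corresponds to 10-fold)."""
    return math.log10(gmt_test / gmt_control)


vaccinated = [100, 400, 1600]  # hypothetical titers
control = [10, 40, 160]        # hypothetical titers
gmt_v = geometric_mean_titer(vaccinated)
gmt_c = geometric_mean_titer(control)
print(gmt_v, gmt_c, fold_increase(gmt_v, gmt_c), log10_increase(gmt_v, gmt_c))
```

Working in log space avoids overflow when many large titers are multiplied together.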
  • control may be an unvaccinated subject, or a subject administered a live attenuated virus vaccine, an inactivated virus vaccine, or a protein subunit vaccine.
  • Vaccine effectiveness is an assessment of how well a vaccine (which may already have been shown to have high vaccine efficacy) reduces disease in a population. This measure can assess the net balance of beneficial and adverse effects of a vaccination program, not just of the vaccine itself, under natural conditions rather than in controlled clinical trials. Vaccine effectiveness is proportional to vaccine efficacy, but is also affected by how well the target groups in the population are immunized, as well as by other non-vaccine-related factors that influence real-world outcomes such as hospitalizations, outpatient visits, or cost. For example, a retrospective case-control analysis can be used that compares vaccination rates in a group of infected cases and in an appropriate control group. Vaccine effectiveness can be expressed as a rate difference, using the odds ratio (OR) for developing an infection despite vaccination.
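The case-control estimate mentioned above can be sketched as follows. One common convention, assumed here rather than stated in the text, computes effectiveness as (1 − OR) × 100, where OR is the odds ratio of vaccination among infected cases versus controls. The function names and the counts are hypothetical.

```python
def odds_ratio(cases_vaccinated: int, cases_unvaccinated: int,
               controls_vaccinated: int, controls_unvaccinated: int) -> float:
    """Odds of vaccination among cases divided by odds of vaccination among controls."""
    if min(cases_unvaccinated, controls_vaccinated) <= 0:
        raise ValueError("cells in the denominator must be positive")
    return (cases_vaccinated * controls_unvaccinated) / (
        cases_unvaccinated * controls_vaccinated
    )


def vaccine_effectiveness(odds: float) -> float:
    """Effectiveness in percent under the assumed (1 - OR) * 100 convention."""
    return (1.0 - odds) * 100.0


# Hypothetical retrospective case-control counts.
or_value = odds_ratio(cases_vaccinated=20, cases_unvaccinated=80,
                      controls_vaccinated=50, controls_unvaccinated=50)
print(or_value, vaccine_effectiveness(or_value))
```

An odds ratio below 1 means vaccination is less common among cases than among controls, which yields a positive effectiveness estimate.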
  • Figure 1 Schematic diagram of the plasmid template containing the T7 promoter, 5'UTR, 3'UTR, polyA tail and CDS sequence of the foreign gene.
  • Figure 2 Chip electrophoresis of mRNA synthesized by in vitro transcription using linearized plasmid as template.
  • L is RNA molecular weight marker (RNA Ladder)
  • 1 is the purified mRNA sample.
  • Figure 3 ELISA detection of differences in SARS-CoV-2 B.1.351 spike protein expression levels caused by different combinations of 5'UTR and 3'UTR. The abbreviations of 5'UTR and 3'UTR are shown in Table 3.
  • Figure 4 ELISA detection of differences in SARS-CoV-2 B.1.1.7 spike protein expression levels caused by different combinations of 5'UTR and 3'UTR.
  • the abbreviations of 5'UTR and 3'UTR are shown in Table 3.
  • Figure 5 Schematic diagram of 5'UTR I point mutation to 5'UTR I'.
  • Figure 6 Western blot detection of SARS-CoV-2 B.1.617.2 spike protein expression levels before and after 5'UTR I point mutation.
  • the 5'UTR used in samples 1 and 3 is I
  • the 5'UTR used in samples 2 and 4 is I'
  • the mRNA synthesis templates of samples 1 and 2 are from plasmid miniprep
  • the mRNA synthesis templates of samples 3 and 4 are from plasmid maxiprep.
  • Sample 5 is the untransfected cell supernatant control
  • M is the molecular weight marker.
  • Figure 6A is a protein blot detection chart
  • Figure 6B is a sample ratio chart after quantification of the protein blot detection chart.
  • Figure 7 Western blot detection of the influence of 5'UTR I' on the expression levels of different heterologous genes.
  • Figure 7A is a protein blot detection chart
  • Figure 7B is a sample relative value chart after quantification of the protein blot detection chart.
  • Figure 8 Luciferase expression mediated by 5'UTR I'-3'UTR B and 5'UTR I-3'UTR B compared to the BNT162b2 UTR combination.
  • Figure 9 Regulation of luciferase expression by 5'UTR I' truncations and 5'UTR I' mutants.
  • Figure 10 Differences in HA mRNA expression levels under the regulation of 5'UTR I' and a control 5'UTR.
  • 2019 novel coronavirus (2019-nCoV) is severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
  • Nucleic acid or “nucleotide” includes RNA, DNA and cDNA molecules. It will be appreciated that due to the degeneracy of the genetic code, a large number of nucleotide sequences encoding a given protein can be generated.
  • the term nucleic acid is used interchangeably with the term “polynucleotide.”
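The degeneracy point above can be made concrete by counting how many distinct coding sequences encode a given peptide under the standard genetic code. The per-amino-acid codon counts below are those of the standard code; the function name and the example peptide are hypothetical.

```python
import math

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Number of distinct codon sequences that encode `peptide` (stop codon excluded)."""
    return math.prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide.upper())


print(count_encodings("MSL"))  # hypothetical three-residue peptide
```

The count grows multiplicatively with length, which is why a full-length antigen has an astronomically large number of synonymous coding sequences.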
  • Oligonucleotides are short-chain nucleic acid molecules.
  • Promoter refers to any nucleic acid sequence that regulates the expression of another nucleic acid sequence, which may be a heterologous target gene encoding a protein or RNA, by driving the transcription of the nucleic acid sequence. Promoters can be constitutive, inducible, repressive, tissue-specific, or any combination thereof. A promoter is the control region of a nucleic acid sequence where the initiation and rate of transcription of the remainder of the nucleic acid sequence is controlled.
  • “Introduction”, in the context of inserting a nucleic acid sequence into a cell, means “transfection”, “transformation” or “transduction” and includes reference to the incorporation of a nucleic acid sequence into a eukaryotic or prokaryotic cell, in which the nucleic acid sequence may be incorporated into the genome of the cell (e.g., chromosome, plasmid, plastid, or mitochondrial DNA) or transiently expressed (e.g., transfected mRNA).
  • Nucleic acid construct refers to a single- or double-stranded nucleic acid molecule, such as a DNA fragment, that has been modified or synthesized to contain a nucleic acid segment in a manner that does not otherwise exist in nature, said nucleic acid molecule containing one or more control sequences or control element.
  • a nucleic acid construct contains a recombinant nucleotide sequence consisting essentially of one, two, three or more isolated nucleotide sequences, including a 5'UTR, an open reading frame (ORF) and a 3'UTR.
  • the sequences are operably linked to each other in the construct.
  • operably linked is defined herein as a structure in which control sequences, namely promoter sequences and/or 5'UTR sequences, are suitably positioned relative to the coding DNA sequence such that the control sequences direct the coding sequence to be transcribed and translated into the polypeptide sequence encoded by the coding DNA.
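The 5'UTR, ORF and 3'UTR arrangement described above can be sketched as a small data structure that joins the elements in order and checks that the ORF is a plausible reading frame. The class, its field names, the optional poly(A) length and all sequences are hypothetical placeholders, not sequences from the disclosure.

```python
from dataclasses import dataclass

STOP_CODONS = {"UAA", "UAG", "UGA"}


@dataclass(frozen=True)
class Construct:
    """Hypothetical mRNA layout: 5'UTR, ORF, 3'UTR, optional poly(A) tail."""
    utr5: str
    orf: str
    utr3: str
    poly_a_length: int = 0

    def __post_init__(self) -> None:
        # Basic reading-frame checks on the ORF only.
        if not self.orf.startswith("AUG"):
            raise ValueError("ORF must begin with the AUG start codon")
        if len(self.orf) % 3 != 0:
            raise ValueError("ORF length must be a multiple of three")
        if self.orf[-3:] not in STOP_CODONS:
            raise ValueError("ORF must end with a stop codon")

    def sequence(self) -> str:
        """Concatenate the elements in 5' to 3' order."""
        return self.utr5 + self.orf + self.utr3 + "A" * self.poly_a_length


toy = Construct(utr5="GGGACC", orf="AUGGCUUAA", utr3="GCUCGC", poly_a_length=5)
print(toy.sequence())
```

Keeping the elements as separate fields makes it easy to swap one UTR for another while holding the ORF fixed, which mirrors how UTR combinations are compared.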
  • Exogenous refers to any substance introduced or produced from outside an organism, cell, tissue or system.
  • “Homology” is defined as the percentage of nucleotide residues that are identical to the nucleotide residues in the corresponding sequence on the target chromosome after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. Alignments for the purpose of determining percent nucleotide sequence homology can be accomplished in various ways within the skill of the art, for example using publicly available computer software such as BLAST, BLAST-2, ALIGN, ClustalW2 or Megalign (DNASTAR) software. In some specific embodiments, the present disclosure calculates percent sequence homology based on BLAST.
  • a nucleic acid sequence (e.g., a DNA sequence), such as a homology arm, is considered "homologous" when it is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or more identical to the corresponding sequence.
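The percent-identity idea can be sketched for two sequences that have already been aligned, with gaps written as "-". This is not BLAST and does not perform the alignment; it only counts identical columns, and the choice of denominator (all alignment columns) is an assumption, since tools differ on this point. The function name and sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = list(zip(aligned_a.upper(), aligned_b.upper()))
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in pairs if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical pre-aligned toy sequences with one mismatch and one gap.
identity = percent_identity("AUGGCU-AA", "AUGCCUUAA")
print(round(identity, 2), identity >= 80.0)
```

A threshold such as "at least 80% identity" then becomes a simple comparison against the returned value.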
  • substitution is defined as a change in an amino acid or nucleotide sequence resulting from the replacement of one or more amino acids or nucleotides, respectively, by different amino acids or nucleotides, as compared to the amino acid sequence or nucleotide sequence of a reference polypeptide. If the substitution is conservative, the amino acid substituted into the polypeptide has similar structural or chemical properties (eg, charge, polarity, hydrophobicity, etc.) to the amino acid it replaces. In some embodiments, polypeptide variants may have "non-conservative" changes, where the substituted amino acids differ in structure and/or chemical properties.
  • deletion is defined as a change in an amino acid or nucleotide sequence that is lacking one or more amino acid or nucleotide residues, respectively, compared to the amino acid sequence or nucleotide sequence of a reference polypeptide.
  • deletions may involve deletions of 2, 5, 10, up to 20, up to 30 or up to 50 or more amino acid or nucleotide residues.
  • insertion or addition may be up to 10, up to 20, up to 30, up to 50 or more amino acids (or nucleotide residues).
  • Codon optimization refers to replacing codons in the target sequence that are generally rare in highly expressed genes of a given species with codons that are generally common in highly expressed genes of that species, where the codons before and after the replacement encode the same amino acid.
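  • Codon bias (differences in codon usage between organisms) often correlates with the translation efficiency of messenger RNA (mRNA), which in turn is believed to depend on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules.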
  • the dominance of a selected tRNA in a cell is often a reflection of the codons most commonly used in peptide synthesis. Therefore, based on codon optimization, genes can be modified for optimal gene expression in a given organism. Therefore, the choice of optimal codons depends on the codon usage preference of the host genome.
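The synonymous replacement described above can be sketched with the standard genetic code and a small preferred-codon table. The preferred codons below are hypothetical placeholders, not a measured codon usage table for any host, and the sketch replaces each codon independently without considering other design factors such as GC content or secondary structure.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated with the third base varying fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical preferred codons for three amino acids only.
PREFERRED = {"A": "GCC", "L": "CUG", "S": "AGC"}


def translate(rna: str) -> str:
    """Translate an in-frame RNA sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def codon_optimize(rna: str, preferred=PREFERRED) -> str:
    """Swap each codon for the preferred synonymous codon when one is listed."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    optimized = "".join(preferred.get(CODON_TABLE[c], c) for c in codons)
    # The encoded protein must be unchanged by the replacement.
    if translate(optimized) != translate(rna):
        raise ValueError("preferred table changed the encoded protein")
    return optimized


toy_orf = "AUGGCAUUAUCAUAA"  # hypothetical toy ORF
print(codon_optimize(toy_orf), translate(toy_orf))
```

The final check enforces the defining property of codon optimization: the nucleotide sequence changes while the amino acid sequence does not.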
  • Cell or “host cell” includes any cell type susceptible to transformation, transfection, transduction, etc., by the nucleic acid constructs or vectors of the present disclosure.
  • the host cell can be an isolated primary cell, a pluripotent stem cell, a CD34+ cell, an induced pluripotent stem cell, or any of a number of immortalized cell lines (eg, HepG2 cells).
  • the host cell may be an in situ or in vivo cell in a tissue, organ or organism.
  • Treat means administering an internal or external therapeutic agent, such as a composition comprising any of the nucleic acid constructs of the present disclosure, to a patient who has one or more symptoms of a disease on which the therapeutic agent is known to have a therapeutic effect.
  • a therapeutic agent is administered to a subject or population in an amount effective to alleviate one or more symptoms of a disease, to induce regression of such symptoms, or to inhibit the progression of such symptoms to any clinically measurable extent.
  • the amount of therapeutic agent effective to alleviate the symptoms of any particular disease can vary depending on a variety of factors, such as the patient's disease state, age and weight, and the ability of the drug to produce the desired therapeutic effect in the patient. Whether disease symptoms have been alleviated may be assessed by any of the clinical tests commonly used by physicians or other health care professionals to assess the severity or progression of symptoms.
  • an “effective amount” or “pharmaceutically effective amount” includes an amount sufficient to ameliorate or prevent symptoms or conditions of a medical disease.
  • An effective amount also means an amount sufficient to allow or facilitate diagnosis.
  • the effective amount for a particular patient or veterinary subject may vary depending on factors such as the condition being treated, the general health of the patient, the method, route and dosage of administration, and the severity of the side effects.
  • An effective amount may be the maximum dosage or dosage regimen that avoids significant side effects or toxic effects.
  • an "effective amount” is an effective dose of RNA that produces an antigen-specific immune response.
  • “Prophylactically effective dose” refers to an effective dose that prevents viral infection at a clinically acceptable level. In some embodiments, the effective dose is the dose listed in the vaccine package insert.
  • traditional vaccines refer to vaccines other than the mRNA vaccines of the present disclosure.
  • traditional vaccines include, but are not limited to, live microbial vaccines, killed microbial vaccines, subunit vaccines, protein antigen vaccines, DNA vaccines, virus-like particle (VLP) vaccines, etc.
  • the traditional vaccine is one that has obtained regulatory approval and/or been registered by a national drug regulatory agency, such as the China Food and Drug Administration.
  • “Pharmaceutically acceptable” means those therapeutic agents, materials, compositions and/or dosage forms that, within the scope of reasonable medical judgment, are suitable for contact with patient tissue without undue toxicity, irritation, allergic reactions or other problems or complications, have a reasonable benefit/risk ratio, and are effective for the intended use.
  • Combined vaccine refers to a vaccine product for immunization made from multiple serotype antigens of different pathogenic organisms.
  • a vaccine containing the disclosed SARS-CoV-2 strain antigen mixed with several pathogenic microorganisms that can prevent and/or treat other viruses is a combination vaccine.
  • Combination vaccines are generally prepared by the same production method to produce monovalent vaccines, which are then mixed in the same injection for immunization; or, the antigens of several pathogenic microorganisms are mixed and then prepared into vaccine products.
  • the present disclosure involves first mixing pathogenic microorganism antigens and then preparing them into vaccine products. Combined and multivalent vaccines are the future development trend.
  • An "immune response" to a vaccine of the present disclosure refers to a subject's humoral and/or cellular immune response to the viral protein(s) (e.g., coronavirus proteins) present in the vaccine.
  • A "humoral" immune response refers to an immune response mediated by antibody molecules, including, for example, secretory IgA or IgG molecules.
  • A "cellular" immune response is one mediated by T lymphocytes (e.g., CD4+ helper and/or CD8+ T cells, such as CTLs) and/or other leukocytes.
  • An important aspect of cellular immunity involves the antigen-specific response of cytolytic T cells (CTLs).
  • CTLs are specific for peptide antigens that are presented in association with proteins encoded by the major histocompatibility complex (MHC) and expressed on the cell surface. CTLs help induce and promote the destruction of intracellular microorganisms or the lysis of cells infected with these microorganisms.
  • Another aspect of cellular immunity involves an antigen-specific response by helper T cells, which help stimulate the function, and focus the activity, of non-specific effector cells against cells displaying peptide antigens in association with MHC molecules.
  • Cellular immune responses also lead to the production of cytokines, chemokines, and other such molecules by activated T cells and/or other leukocytes.
  • The plasmid was linearized downstream of the poly(A) tail using BspQ I (Vazyme, DD4302-PC, China) and purified with a PCR purification kit (QIAGEN, 28106, Germany). Using the purified linearized plasmid as a template, in vitro transcription was performed with the T7 in vitro transcription kit (Invitrogen, AM1333, USA). The mRNA synthesized by in vitro transcription was purified with the MEGAclear™ Kit (Invitrogen, AM1908, USA) to obtain relatively high-purity mRNA, which was used for subsequent in vitro cell transfection and other experiments.
  • FBS
  • DMEM medium (Gibco)
  • Opti-MEM (Gibco)
  • PBS (Gibco)
  • Trypsin-EDTA (Gibco)
  • Penicillin/streptomycin ("double antibody"; Pen/Strep, Gibco)
  • Lipofectamine™ 2000 (Invitrogen, 52887)
  • Human embryonic kidney cells (293T, ATCC) were grown in DMEM medium supplemented with 10% FBS and 1% penicillin/streptomycin in a humidified atmosphere of 5% CO2.
  • Lipofectamine™ 2000 was used to transfect cells at a ratio of 1 μg of mRNA to 2.5 μL of Lipofectamine™ 2000. The transfection dose of mRNA in each well was 2 μg.
  • The specific experimental method is as follows: dilute Lipofectamine™ 2000 and mRNA separately in Opti-MEM medium, each to a final total volume of 100 μL, then mix the Lipofectamine™ 2000 solution with the mRNA solution and incubate at room temperature for 10-15 minutes to form mRNA-containing lipoplexes.
  • After removing the DMEM medium from the cell culture plate, add the lipoplexes containing the different mRNAs to the corresponding wells, then add 400 μL of Opti-MEM medium containing 10% FBS to each well and incubate the cells at 37°C (5% CO2) for 4-6 hours. Thereafter, remove the transfection medium, add 1 mL of DMEM medium supplemented with 10% FBS and 1% penicillin/streptomycin to each well, and culture the cells at 37°C (5% CO2) for 48 hours. In parallel, some cells were transfected in the same manner with Lipofectamine™ 2000 solution containing no mRNA as a negative control.
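The volumes implied by the transfection protocol above can be worked out with a short calculation. The sketch below uses only the ratio (1 μg mRNA : 2.5 μL Lipofectamine 2000), the per-well dose (2 μg) and the 100 μL dilution volume given in the text. The mRNA stock concentration is a hypothetical input, and treating each 100 μL dilution as a per-well volume is an assumption.

```python
LIPO_UL_PER_UG_MRNA = 2.5   # from the text: 1 ug mRNA : 2.5 uL Lipofectamine 2000
MRNA_UG_PER_WELL = 2.0      # from the text: 2 ug mRNA per well
DILUTION_VOLUME_UL = 100.0  # from the text; ASSUMED to be per well


def per_well_mix(mrna_stock_ug_per_ul: float) -> dict:
    """Return the four pipetting volumes (uL) for one well."""
    if mrna_stock_ug_per_ul <= 0:
        raise ValueError("stock concentration must be positive")
    mrna_ul = MRNA_UG_PER_WELL / mrna_stock_ug_per_ul
    lipo_ul = MRNA_UG_PER_WELL * LIPO_UL_PER_UG_MRNA
    if mrna_ul > DILUTION_VOLUME_UL:
        raise ValueError("stock too dilute for a 100 uL dilution")
    return {
        "mrna_stock_ul": mrna_ul,
        "optimem_for_mrna_ul": DILUTION_VOLUME_UL - mrna_ul,
        "lipofectamine_ul": lipo_ul,
        "optimem_for_lipofectamine_ul": DILUTION_VOLUME_UL - lipo_ul,
    }


# Hypothetical stock at 1 ug/uL.
mix = per_well_mix(1.0)
for name, volume in mix.items():
    print(f"{name}: {volume:.1f} uL")
```

At the stated ratio, each well receives 5 μL of Lipofectamine 2000 regardless of the stock concentration; only the mRNA and Opti-MEM volumes on the mRNA side change.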
  • Mutating the KV residues of the SARS-CoV-2 spike protein to PP stabilizes the spike protein in its prefusion conformation (Structure-based design of prefusion-stabilized SARS-CoV-2 spikes, Science, vol. 369, no. 6510). The resulting conformational stability benefits the design and production of vaccines that use the spike protein as an immunogen.
  • The method comprises three steps. (1) Establish a library of genes with high protein expression, including genes with high protein expression in monocytes, genes with higher protein expression in DCs than in monocytes, genes with higher protein levels, and genes whose UTRs have been tested and shown to increase protein expression; 2140 genes were obtained. (2) Establish a UTR library: download the sequences of all genes with high protein-level expression from the GenBank database (https://www.ncbi.nlm.nih.gov/genbank/), first check their integrity, and then obtain the UTR data by merging duplicates, excising the CDS (coding sequence) and similar steps; 940 UTRs were obtained. (3) Establish a UTR selection library: analyze the UTR library and screen out UTR sequences with appropriate length and free energy.
  • The 5'UTR length is between 40 and 70, and the 3'UTR length is between 100 and 150 or between 300 and 400. Free energy is predicted for the sequences one by one; 5'UTR sequences with higher free energy and 3'UTR sequences with lower free energy are selected, giving 64 UTRs in this selection library.
  • The screening strategy is that the 5'UTR free energy is above -10 and the 3'UTR free energy is very low.
  • Twenty 5'UTRs and 3'UTRs are selected for screening and in-cell verification.
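The length and free-energy criteria of the selection library can be expressed as a simple filter. The sketch below assumes that a free-energy value has already been predicted for every candidate by some folding tool; the tool, the unit, the record layout and all candidate values are hypothetical placeholders, and only the numeric thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UtrCandidate:
    name: str
    kind: str           # "5UTR" or "3UTR"
    length: int         # nucleotides
    free_energy: float  # predicted folding free energy (unit as reported by the tool)


def passes_length_filter(c: UtrCandidate) -> bool:
    """Length windows stated in the text."""
    if c.kind == "5UTR":
        return 40 <= c.length <= 70
    return 100 <= c.length <= 150 or 300 <= c.length <= 400


def screen(candidates, five_prime_min_energy=-10.0):
    """Keep 5'UTRs above the energy threshold; rank 3'UTRs from lowest energy up."""
    five = [c for c in candidates
            if c.kind == "5UTR" and passes_length_filter(c)
            and c.free_energy > five_prime_min_energy]
    three = sorted((c for c in candidates
                    if c.kind == "3UTR" and passes_length_filter(c)),
                   key=lambda c: c.free_energy)
    return five, three


# Placeholder candidates, not data from the disclosure.
candidates = [
    UtrCandidate("u5_a", "5UTR", 55, -6.2),
    UtrCandidate("u5_b", "5UTR", 55, -14.0),   # energy too low for a 5'UTR
    UtrCandidate("u5_c", "5UTR", 90, -3.0),    # too long
    UtrCandidate("u3_a", "3UTR", 120, -25.0),
    UtrCandidate("u3_b", "3UTR", 350, -80.0),
    UtrCandidate("u3_c", "3UTR", 200, -40.0),  # outside both length windows
]
five, three = screen(candidates)
print([c.name for c in five], [c.name for c in three])
```

Because the text gives no numeric cutoff for "very low" 3'UTR free energy, the sketch ranks the 3'UTR candidates instead of applying a threshold.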
  • 5'UTR I, derived from human Rho GTPase activating protein 15 (ARHGAP15) (the DNA sequence is shown in SEQ ID NO: 1 and the RNA sequence in SEQ ID NO: 45), has the potential to efficiently express the target protein in human cells (especially DCs).
  • This example illustrates the preparation process of 5'UTR I-containing mRNA.
  • the sequence containing 5'UTR, 3'UTR, polyA tail and exogenous target gene CDS was artificially synthesized and inserted into the pUC57 vector to prepare a template vector.
  • the schematic diagram is shown in Figure 1. After sequencing, the sequence of the target gene inserted into the vector was correct.
  • The 5'UTR is SEQ ID NO: 1; the 3'UTR is the human hemoglobin beta subunit (HBB) 3'UTR sequence (SEQ ID NO: 7, named 3'UTR B); the CDS is the codon-optimized nucleotide sequence encoding the B.1.617.2 (Delta) spike protein (SEQ ID NO: 13); the polyA tail is a polyadenylic acid sequence (the DNA sequence is shown in SEQ ID NO: 16 and the RNA sequence in SEQ ID NO: 56); and the T7 promoter is SEQ ID NO: 17.
  • The plasmid was linearized downstream of the poly(A) tail using BspQ I (Vazyme, DD4302-PC, China) and purified with a PCR purification kit (QIAGEN, 28106, Germany). Using the purified linearized plasmid as a template, mRNA was synthesized by in vitro transcription as follows. A 100 μL reaction system was prepared as shown in Table 2. In the reaction system, 2 μg of linearized template was added to 100 μL of nuclease-free water, and the reaction was carried out at 37°C for 4 hours, followed by digestion with DNase I for 30 min.
  • the synthesized mRNA contains less by-product dsRNA and the yield is excellent.
  • the capping process is achieved by chemical capping during in vitro transcription and synthesis.
  • The cap structure is m7G(5')ppp(5')(2'OMeA)pG·NH4 (Hongene, ON-134).
  • Relatively high-purity mRNA was obtained by purification with the MEGAclear™ Kit (Invitrogen, AM1908, USA) and used for subsequent in vitro cell transfection and other experiments.
  • The present disclosure has obtained a 5'UTR I with the potential to express the target protein efficiently, whose sequence is SEQ ID NO: 1; the construct containing 5'UTR I can be efficiently transcribed into the corresponding mRNA.
  • the chip electrophoresis identification was consistent with the theoretical size (see Figure 2).
  • The method is as follows: plasmids containing the CDS of the same target gene with different UTR combinations were artificially synthesized.
  • the basic plasmid was constructed according to the method in Example 1, and the CDS sequence of the target gene was recombinantly constructed through artificial synthesis and HindIII/XhoI digestion.
  • the reconstructed plasmid was linearized according to the method in Example 1 and the mRNA was synthesized and purified.
  • The mRNA was transfected into 293T cells using Lipofectamine™ 2000. After 48 hours, the 293T supernatant was collected and analyzed by Western blotting and ELISA to determine the level of spike protein secreted into the cell supernatant.
  • The target genes used are: the gene encoding the SARS-CoV-2 B.1.351 spike protein (CDS shown in SEQ ID NO: 20), or the gene encoding the SARS-CoV-2 B.1.1.7 spike protein (CDS shown in SEQ ID NO: 22).
  • Western blotting method: to detect the in vitro expression level of the mRNA obtained from each construct, the cell supernatant in the well plate was collected after transfection and subjected to polyacrylamide gel electrophoresis, membrane transfer, incubation with primary and secondary antibodies, and color development, yielding the in vitro expression level of each constructed mRNA.
  • The primary antibody was SARS-CoV-2 (2019-nCoV) spike RBD antibody (Sino Biological, 40592-T62), and the secondary antibody was HRP-linked anti-rabbit IgG (Transgene, HS101).
  • ELISA method: after transfection, the corresponding cell supernatant was collected and diluted 1:50 to 1:100, and the in vitro expression level of each constructed mRNA was then measured with an ELISA kit (SARS-CoV-2 (2019-nCoV) Spike Detection ELISA Kit, KIT40591, Sino Biological); the absorbance at 450 nm was read and the concentration of the relevant antigen calculated.
  • the comparison of B.1.351 spike protein expression levels mediated by different UTR combinations is as follows: I-B>A-B>G-B, I-B>I-E. For 5'UTR, I is better than A and G; for related combinations of I, I-B is better than I-E.
  • the comparison of B.1.1.7 spike protein expression levels mediated by different UTR combinations is as follows: I-F>I-B>I-D>I-E.
  • the 5'UTR and 3'UTR in the I-F combination are from the same gene, ARHGAP15.
  • The results in Table 7 show that the I-B combination (SEQ ID NO: 1 and SEQ ID NO: 7) and the I-F combination (SEQ ID NO: 1 and SEQ ID NO: 8) regulate target gene expression to a similar level, further demonstrating the versatility of 5'UTR I, which can be combined with different 3'UTRs to mediate efficient expression of the gene of interest (for example, the gene encoding the spike protein).
  • 5'UTR I (SEQ ID NO: 1) is a naturally occurring human Rho GTPase activating protein family promoter.
  • The experimental data of Examples 1-2 support that 5'UTR I assists efficient expression of the SARS-CoV-2 spike protein. However, 5'UTR I contains an internal ATG, which may drive expression of a 27 aa short peptide and affect normal expression of the target protein.
  • The present disclosure therefore introduced a point mutation (A→G) and obtained the mutant 5'UTR I' (the DNA sequence is shown in SEQ ID NO: 2 and the RNA sequence in SEQ ID NO: 46). The mutation position is shown in Figure 5.
  • The method of Example 1 was used to prepare plasmids in which the I-B and I'-B UTR combinations, respectively, regulate expression of the target gene.
  • The target gene is the gene encoding the B.1.617.2 spike protein (SEQ ID NO: 13), and the plasmids were named Delta 13 and Delta 24, respectively. The mRNA was obtained through PCR transcription, synthesis and purification (the I'-Delta24-B mRNA sequence without the cap is shown in SEQ ID NO: 57), the prepared 293T cells were then transfected with the mRNA, and the Western blotting method of Example 2 was used to detect the B.1.617.2 spike protein.
  • The target gene sequences contained in the plasmids are SEQ ID NO: 13 and SEQ ID NO: 15, respectively. Both are codon-optimized sequences, which encode the target protein sequences shown in SEQ ID NO: 12 and SEQ ID NO: 14, respectively.
  • The mRNA was prepared and purified in the same manner as in Example 1, and 2 μg of mRNA was transfected into 293T cells using Lipofectamine™ 2000. The cell supernatant was collected after 2 days.
  • The Western blotting method of Example 2 was used to detect expression of the B.1.617.2 spike protein from the mRNA regulated by I-B (named Delta 13) and from the mRNA regulated by I'-B (named Delta 24).
  • The sequences of Delta 24 mRNA and Omicron 24 mRNA without the cap structure are shown in SEQ ID NO: 57 and SEQ ID NO: 58, respectively.
  • the expression of Delta 13 is calculated as 1. After normalization, the ratio in Table 9 is obtained.
  • a vector containing the luciferase CDS sequence (SEQ ID NO: 26), including the same T7 promoter (SEQ ID NO: 17) and polyA tail as the aforementioned vector containing the COVID-19 gene.
  • Three vectors Luc, Luc13 and Luc24 with different UTRs were constructed respectively.
  • the UTRs contained in the vectors are shown in Table 10. Among them, the 5'UTR and 3'UTR combination of BNT162b2 in Luc was used as a control, which came from CN113521269A.
  • In vitro transcription was performed to synthesize mRNA containing a cap structure by the same method as in Example 1, and it was transfected in vitro. After collecting the supernatant, luciferase substrate was added for detection, and the plate was read. The results are shown in Table 11 and Figure 8.
  • the results showed that the luciferase protein expression under the regulation of 5'UTR I' after point mutation was higher than that of 5'UTR I.
  • the I-B combination of the present disclosure results in a higher expression level of the target gene, and the I'-B combination significantly increases the expression level of the target gene.
  • the detection results using luciferase as the target gene also prove once again that 5'UTR I or 5'UTR I' can universally regulate the efficient expression of any target protein.
  • The mRNA containing the cap structure was synthesized by in vitro transcription using the same method as in Example 1. After 293T cells had been transfected in vitro for 24 hours, luciferase substrate was added for detection and the plate was read. The results are shown in Figure 9. The luciferase expression levels mediated by Luc24 (the I'-B combination) and by its 5'UTR I' truncations and mutants were significantly higher than that of Luc (5' and 3'UTR of BNT162b2). The 5'UTR I' truncations can retain the activity of 5'UTR I' in regulating target protein expression.
  • The 5'UTR I' truncations (1-I', 2-I', 3-I', 4-I', 5-I', 6-I', 7-I', 9-I', 10-I', 11-I', 13-I') have improved activity in regulating the expression of target proteins.
  • The 5'UTR I' mutants 15-I' and 16-I' give higher expression of the target gene than 5'UTR I', indicating that truncating the 5' end of 5'UTR I' or introducing internal point mutations can further enhance the functional activity of the 5'UTR in regulating the expression of target genes.
  • Plasmids containing the genes of the HA proteins of the following influenza viruses: influenza A virus H1N1 (A/Wisconsin/588/2019) (HA protein: SEQ ID NO: 83); influenza A virus H3N2 (A/Cambodia/e0826360/2020) (SEQ ID NO: 84); influenza B virus (B/Washington/02/2019) (SEQ ID NO: 85); and influenza B virus (B/Phuket/3073/2013) (SEQ ID NO: 86).
  • The target gene sequences contained in the plasmids are SEQ ID NO: 87-90, which are all codon-optimized sequences.
  • HA CDS sequences encoding SEQ ID NO: 83-86 were synthesized with reference to the 5' and 3'UTR sequences (SEQ ID NO: 91-92) of patents WO2022/245888A1 and WO2022/150717A1; these are SEQ ID NO: 93-96.
  • The mRNA was prepared and purified in the same manner as in Example 1, and 1 μg of mRNA was transfected into 293T cells using Lipofectamine™ 2000. The cell lysate was collected after 1 day.
  • the protein blotting method in Example 2 was used to detect the expression of four influenza HA protein mRNAs under the regulation of I'-B.

Abstract

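The present disclosure relates to nucleic acid constructs and uses thereof. Specifically, the present disclosure relates to nucleic acid constructs comprising an engineered UTR and to their use in preventing or treating diseases (for example, preventing viral infection), wherein the UTR can significantly increase the expression efficiency of a gene of interest in the nucleic acid construct.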

Description

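Nucleic acid construct and use thereof
The present disclosure claims priority to the following patent application: Chinese patent application No. CN202210456261.9, filed on April 27, 2022 and entitled "Nucleic acid construct and use thereof"; the entire contents of that application are incorporated into the present disclosure by reference.
Technical Field
The present disclosure belongs to the field of nucleic acid drugs, and specifically relates to nucleic acid constructs comprising an engineered UTR and to their use in preventing or treating diseases (for example, viral infection).
Background
Vaccines are an important field of pharmaceutical research. Existing vaccines include inactivated vaccines, adenovirus vector vaccines, attenuated influenza virus vector vaccines, recombinant protein vaccines and nucleic acid vaccines (including RNA vaccines and DNA vaccines). mRNA is the active ingredient of RNA vaccines and mainly comprises a cap structure (Cap), a 5' untranslated region (UTR), an open reading frame (ORF) encoding the antigen protein, a 3'UTR and a poly(A) tail. The 5'UTR is key to recruiting ribosomes to the mRNA and to start codon selection, and plays an important role in regulating translation efficiency and shaping the cellular proteome (Ivanov, et al. Science. 2016, 352(6292): 1413-1416). Eukaryotic 3'UTRs carry a variety of regulatory motifs that can be recognized by microRNAs (miRNAs) and RNA-binding proteins (RBPs) to control mRNA stability, localization and translation (Mazumder et al., 2003; Mayr, 2017). The poly(A) tail (located at the 3' end of the mRNA) is another element that determines mRNA stability and protein levels. Alternative 3'UTRs not only affect mRNA stability and translation but also control mRNA localization (Tushev et al., 2018). The design and optimization of mRNA, in particular the selection and use of its UTRs, is a key step in the overall preparation process of an mRNA vaccine product. Developing efficient UTRs therefore remains an urgent need in the field of mRNA drugs.
The present disclosure provides mRNA vaccines with a novel UTR structure, which have the advantage of enabling stable and efficient expression of the gene of interest, can be applied generally to mRNA vaccines (for example, influenza or COVID-19 vaccines) to regulate expression of the gene of interest, and have excellent prospects for clinical drug application.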
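Summary of the Invention
The present disclosure provides nucleic acid elements capable of regulating the expression of a gene of interest, as well as nucleic acid constructs. The nucleic acid construct comprises at least one nucleic acid element capable of regulating the expression of a gene of interest.
Nucleic Acid Elements
In some embodiments, the nucleic acid element is a 5' untranslated region element (5'UTR).
In some embodiments, the 5'UTR is selected from a 5'UTR that is, or is derived from, the 5'UTR of any of the following genes, or a derivative sequence thereof: Rho GTPase activating protein (ARHGAP), heat shock 27kDa protein 1 (HSPB1), hemoglobin subunit beta (HBB), C-C motif chemokine ligand 13 (CCL13), and the like.
In some embodiments, the 5'UTR in the nucleic acid construct is, or is derived from, the 5'UTR sequence of an ARHGAP gene, or a derivative sequence thereof. In some embodiments, the ARHGAP includes ARHGAP1, ARHGAP2, ARHGAP3, ARHGAP4, ARHGAP5, ARHGAP6, ARHGAP7 (DLC1), ARHGAP8, ARHGAP9, ARHGAP10, ARHGAP12, ARHGAP13 (SRGAP1), ARHGAP14 (SRGAP2), ARHGAP15, ARHGAP17 (RICH1), ARHGAP18, ARHGAP19, ARHGAP20, ARHGAP21, ARHGAP22, ARHGAP23, ARHGAP24, ARHGAP25 and ARHGAP26.
In some embodiments, the 5'UTR is, or is derived from, the 5'UTR sequence of the ARHGAP15 gene, or a derivative sequence thereof. In some embodiments, the ARHGAP15 gene is from any species, such as human ARHGAP15, baboon ARHGAP15, monkey ARHGAP15, mouse ARHGAP15, etc.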
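In some embodiments, the 5'UTR is, or is derived from, the 5'UTR sequence of the human ARHGAP15 gene, or a derivative sequence thereof. In some embodiments, the 5'UTR of the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 1, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.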
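The identity thresholds recited here (at least 80% up to 100%) can be illustrated with a simple calculation. The sketch below is a deliberate simplification: it compares two sequences of equal length position by position, without gaps, whereas sequence identity is normally computed from an alignment. The disclosure does not specify the alignment method, and the sequences shown are toy strings, not SEQ ID NO: 1.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Ungapped, position-by-position identity for equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this simplified sketch needs equal-length sequences")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity(seq_a: str, seq_b: str, threshold: float = 80.0) -> bool:
    """True if the identity reaches the given percentage threshold."""
    return percent_identity(seq_a, seq_b) >= threshold


# Toy sequences differing at one of ten positions.
reference = "ACGTACGTAC"
variant = "ACGTACGTAA"
identity = percent_identity(reference, variant)
print(identity, meets_identity(reference, variant, 95.0))
```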
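In some embodiments, the 5'UTR comprises at least one point mutation that can be used to inhibit translation initiation from an ATG inside the UTR. In some embodiments, the point mutation is selected from mutations at any one or more of the A, T or G positions of the ATG sequence in the 5'UTR. In some embodiments, the point mutation is a mutation at the A position of the ATG sequence in the 5'UTR. In some embodiments, the mutation is A to G, C or T. In some embodiments, the mutation is ATG to GTG, CTG or TTG. Further, the mutation can be used to inhibit translation initiation from the ATG inside the UTR.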
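The point mutation of an internal ATG described above can be sketched as a string operation. This is an illustrative sketch only: the function name and the input sequence are hypothetical, the sequence is not SEQ ID NO: 1, and the sketch mutates every ATG it finds, whereas a real design would target a specific position.

```python
def mutate_internal_atg(utr: str, replacement: str = "G") -> str:
    """Replace the A of each ATG in a 5'UTR with G, C or T (ATG -> GTG/CTG/TTG)."""
    if replacement not in {"G", "C", "T"}:
        raise ValueError("replacement must be G, C or T")
    return utr.upper().replace("ATG", replacement + "TG")


# Toy 5'UTR containing two internal ATG triplets.
toy_utr = "GGACATGCCTTATGAAC"
gtg_variant = mutate_internal_atg(toy_utr, "G")
ctg_variant = mutate_internal_atg(toy_utr, "C")
ttg_variant = mutate_internal_atg(toy_utr, "T")
print(gtg_variant, ctg_variant, ttg_variant)
```

Each variant keeps the length of the UTR unchanged and no longer contains an ATG triplet, so no start codon remains upstream of the ORF in this toy sequence.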
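In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises at least one point mutation that can be used to inhibit translation initiation from an ATG inside the UTR. In some embodiments, the point mutation is selected from mutations at any one or more of the A, T or G positions of the ATG sequence in the 5'UTR. In some embodiments, the point mutation is a mutation at the A position of the ATG sequence in the 5'UTR. In some embodiments, the mutation is A to G, C or T.
In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises the nucleotide sequence shown in SEQ ID NO: 44, or the 5'UTR comprises a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 44, wherein N1 is selected from A, G, C or T.
In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises a nucleotide sequence in which ATG is mutated to GTG. In some embodiments, the 5'UTR of the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 2, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises a nucleotide sequence in which ATG is mutated to CTG. In some embodiments, the 5'UTR of the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 42, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises a nucleotide sequence in which ATG is mutated to TTG. In some embodiments, the 5'UTR of the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 43, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the 5'UTR is a truncated 5'UTR. In some embodiments, the truncated 5'UTR retains activity similar to that of the native 5'UTR, for example the function of regulating protein expression of the gene of interest encoded by the ORF. In some embodiments, the truncated 5'UTR has enhanced activity compared with the native 5'UTR, for example an enhanced function of regulating protein expression of the gene of interest encoded by the ORF.
In some embodiments, the truncated 5'UTR is obtained by: deleting a contiguous nucleotide sequence from the 5' end, in the 5'-to-3' direction; and/or deleting a contiguous nucleotide sequence from the 3' end, in the 3'-to-5' direction. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 5' end in the 5'-to-3' direction while retaining the nucleotide sequence at the 3' end. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 3' end in the 3'-to-5' direction while retaining the nucleotide sequence at the 5' end. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 5' end in the 5'-to-3' direction and also deleting a contiguous nucleotide sequence from the 3' end in the 3'-to-5' direction. In some embodiments, the truncated 5'UTR further comprises the point mutation of any of the foregoing embodiments. In some embodiments, the truncated 5'UTR does not comprise the point mutation of any of the foregoing embodiments. In some embodiments, the truncated 5'UTR retains at least 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 15%, 10%, 5% or 1% of the length of the native 5'UTR sequence.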
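The three truncation modes described above (from the 5' end, from the 3' end, or from both ends) and the retained-length fraction can be sketched as follows. The function names and the 12-nt input are hypothetical illustrations, not sequences or procedures from the disclosure.

```python
def truncate_utr(utr: str, remove_5prime: int = 0, remove_3prime: int = 0) -> str:
    """Delete contiguous nucleotides from the 5' end and/or the 3' end."""
    if remove_5prime < 0 or remove_3prime < 0:
        raise ValueError("deletion lengths must be non-negative")
    if remove_5prime + remove_3prime >= len(utr):
        raise ValueError("truncation would remove the entire sequence")
    return utr[remove_5prime:len(utr) - remove_3prime]


def retained_fraction(full: str, truncated: str) -> float:
    """Fraction of the native length kept by the truncated sequence."""
    return len(truncated) / len(full)


toy_utr = "AACCGGTTAACC"                   # 12 nt toy sequence
from_5prime = truncate_utr(toy_utr, remove_5prime=4)
from_3prime = truncate_utr(toy_utr, remove_3prime=2)
from_both = truncate_utr(toy_utr, remove_5prime=4, remove_3prime=2)
print(from_5prime, from_3prime, from_both, retained_fraction(toy_utr, from_both))
```

A 5' truncation keeps the 3' end intact, which is the end that sits next to the ORF in the construct; a 3' truncation keeps the 5' end, which sits next to the cap.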
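In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises or is a truncated ARHGAP15 5'UTR. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 5' end of the ARHGAP15 5'UTR in the 5'-to-3' direction; and/or deleting a contiguous nucleotide sequence from the 3' end in the 3'-to-5' direction.
In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 5' end of the ARHGAP15 5'UTR in the 5'-to-3' direction while retaining the nucleotide sequence at the 3' end. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 3' end of the ARHGAP15 5'UTR in the 3'-to-5' direction while retaining the nucleotide sequence at the 5' end. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 5' end of the ARHGAP15 5'UTR in the 5'-to-3' direction and also deleting a contiguous nucleotide sequence from the 3' end in the 3'-to-5' direction.
In some embodiments, the truncated 5'UTR retains at least 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 15%, 10%, 5% or 1% of the length of the native ARHGAP15 5'UTR (SEQ ID NO: 1).
In some embodiments, the truncated 5'UTR comprises at least 5 contiguous nucleotides of the sequence shown in SEQ ID NO: 2 or 44; in some embodiments, the truncated 5'UTR comprises 5-62 contiguous nucleotides of the sequence shown in SEQ ID NO: 2 or 44; in some embodiments, the truncated 5'UTR comprises 7-62 contiguous nucleotides of the sequence shown in SEQ ID NO: 2 or 44; in some embodiments, the truncated 5'UTR comprises 7-59 contiguous nucleotides of the sequence shown in SEQ ID NO: 2; exemplarily, the truncated 5'UTR comprises 5, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 or 62 contiguous nucleotides of the sequence shown in SEQ ID NO: 2 or 44.
In some embodiments, the 5'-terminal nucleotide of the truncated 5'UTR (that is, its first nucleotide) is the nucleotide at any of positions 1-57, by natural counting, of the sequence shown in SEQ ID NO: 2 or 44. In some embodiments, the 5'-terminal nucleotide of the truncated 5'UTR is the nucleotide at any of positions 1-13 or 17-29, by natural counting, of the sequence shown in SEQ ID NO: 2 or 44. For example, it is the nucleotide at any of positions 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56 or 57.
Exemplarily, the 5'-terminal nucleotide of the truncated 5'UTR, and the length of the contiguous nucleotide sequence of SEQ ID NO: 2 that the truncated 5'UTR comprises, are shown in the following table: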


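In some embodiments, the length of the truncated 5'UTR is 59 bp, 55 bp, 51 bp, 47 bp, 43 bp, 39 bp, 35 bp, 31 bp, 27 bp, 23 bp, 19 bp, 15 bp, 11 bp or 7 bp.
In some embodiments, the truncated ARHGAP15 5'UTR further comprises the point mutation of any of the foregoing embodiments. In some embodiments, the truncated ARHGAP15 5'UTR further comprises a nucleotide sequence in which ATG is mutated to GTG. In some embodiments, the aforementioned 5'UTR derived from the ARHGAP15 gene comprises or is the nucleotide sequence shown in any one of SEQ ID NO: 28-30, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the truncated ARHGAP15 5'UTR does not comprise the point mutation of any of the foregoing embodiments. In some embodiments, the aforementioned 5'UTR of the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 31-40, or comprises CTATAAT; or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity to any of the foregoing sequences.
In some embodiments, the truncated 5'UTR comprises at least CTATAAT or the nucleotide sequence shown in SEQ ID NO: 193.
In some embodiments, the truncated ARHGAP15 5'UTR comprises or is the nucleotide sequence shown in any one of SEQ ID NO: 41, 76 and 137-196, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the aforementioned 5'UTR derived from the HSPB1 gene comprises or is the nucleotide sequence shown in any one of SEQ ID NO: 3-5, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity to any one of SEQ ID NO: 3-5.
In some embodiments, the 5'UTR comprises or is the nucleotide sequence shown in any one of SEQ ID NO: 1-2, 3-5, 28-43, 76 and 137-196, or the sequence CTATAAT, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity to any of the foregoing sequences.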
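Nucleic Acid Construct
The present disclosure provides a nucleic acid construct comprising:
(a) an open reading frame (ORF); and
(b) a 5' untranslated region element (5'UTR).
In some embodiments, the ORF is a polynucleotide sequence encoding the protein of the gene of interest.
In some embodiments, the gene of interest is heterologous. In other embodiments, the gene of interest is endogenous.
In some embodiments, the 5'UTR in the nucleic acid construct is located upstream of the open reading frame. In some embodiments, the 5'UTR in the nucleic acid construct is located at the 5' end of the open reading frame.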
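In some embodiments, the 5'UTR in the nucleic acid construct is selected from a 5'UTR that is, or is derived from, the 5'UTR of any of the following genes, or a derivative sequence thereof: Rho GTPase activating protein (ARHGAP), heat shock 27kDa protein 1 (HSPB1), hemoglobin subunit beta (HBB), C-C motif chemokine ligand 13 (CCL13), and the like.
In some embodiments, the 5'UTR in the nucleic acid construct is, or is derived from, the 5'UTR sequence of an ARHGAP gene, or a derivative sequence thereof. In some embodiments, the ARHGAP includes ARHGAP1, ARHGAP2, ARHGAP3, ARHGAP4, ARHGAP5, ARHGAP6, ARHGAP7 (DLC1), ARHGAP8, ARHGAP9, ARHGAP10, ARHGAP12, ARHGAP13 (SRGAP1), ARHGAP14 (SRGAP2), ARHGAP15, ARHGAP17 (RICH1), ARHGAP18, ARHGAP19, ARHGAP20, ARHGAP21, ARHGAP22, ARHGAP23, ARHGAP24, ARHGAP25 and ARHGAP26.
In some embodiments, the 5'UTR in the nucleic acid construct is, or is derived from, the 5'UTR sequence of the ARHGAP15 gene, or a derivative sequence thereof. In some embodiments, the ARHGAP15 gene is from any species, such as human ARHGAP15, baboon ARHGAP15, monkey ARHGAP15, mouse ARHGAP15, etc.
In some embodiments, the 5'UTR in the nucleic acid construct is, or is derived from, the 5'UTR sequence of the human ARHGAP15 gene, or a derivative sequence thereof. In some embodiments, the 5'UTR of the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 1, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.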
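In some embodiments, the 5'UTR in the nucleic acid construct comprises at least one point mutation that can be used to inhibit translation initiation from an ATG inside the UTR. In some embodiments, the point mutation is selected from mutations at any one or more of the A, T or G positions of the ATG sequence in the 5'UTR. In some embodiments, the point mutation is a mutation at the A position of the ATG sequence in the 5'UTR. In some embodiments, the mutation is A to G, C or T. In some embodiments, the mutation is ATG to GTG, CTG or TTG. Further, the mutation can be used to inhibit translation initiation from the ATG inside the UTR.
In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises at least one point mutation that can be used to inhibit translation initiation from an ATG inside the UTR. In some embodiments, the point mutation is selected from mutations at any one or more of the A, T or G positions of the ATG sequence in the 5'UTR. In some embodiments, the point mutation is a mutation at the A position of the ATG sequence in the 5'UTR. In some embodiments, the mutation is A to G, C or T.
In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises the nucleotide sequence shown in SEQ ID NO: 44, or the 5'UTR comprises a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 44, wherein N1 is selected from A, G, C or T.
In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises a nucleotide sequence in which ATG is mutated to GTG. In some embodiments, the 5'UTR of the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 2, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises a nucleotide sequence in which ATG is mutated to CTG. In some embodiments, the 5'UTR of the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 42, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises a nucleotide sequence in which ATG is mutated to TTG. In some embodiments, the 5'UTR of the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 43, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.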
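In some embodiments, the 5'UTR in the nucleic acid construct of the present disclosure is a truncated 5'UTR. In some embodiments, the truncated 5'UTR retains activity similar to that of the native 5'UTR, for example the function of regulating protein expression of the gene of interest encoded by the ORF. In some embodiments, the truncated 5'UTR has enhanced activity compared with the native 5'UTR, for example an enhanced function of regulating protein expression of the gene of interest encoded by the ORF.
In some embodiments, the truncated 5'UTR is obtained by: deleting a contiguous nucleotide sequence from the 5' end, in the 5'-to-3' direction; and/or deleting a contiguous nucleotide sequence from the 3' end, in the 3'-to-5' direction. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 5' end in the 5'-to-3' direction while retaining the nucleotide sequence at the 3' end. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 3' end in the 3'-to-5' direction while retaining the nucleotide sequence at the 5' end. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 5' end in the 5'-to-3' direction and also deleting a contiguous nucleotide sequence from the 3' end in the 3'-to-5' direction.
In some embodiments, the truncated 5'UTR further comprises the point mutation of any of the foregoing embodiments. In some embodiments, the truncated 5'UTR does not comprise the point mutation of any of the foregoing embodiments. In some embodiments, the truncated 5'UTR retains at least 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 15%, 10%, 5% or 1% of the length of the native 5'UTR sequence.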
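In some embodiments, the aforementioned 5'UTR sequence derived from the ARHGAP15 gene comprises or is a truncated ARHGAP15 5'UTR. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 5' end of the ARHGAP15 5'UTR in the 5'-to-3' direction; and/or deleting a contiguous nucleotide sequence from the 3' end in the 3'-to-5' direction.
In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 5' end of the ARHGAP15 5'UTR in the 5'-to-3' direction while retaining the nucleotide sequence at the 3' end. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 3' end of the ARHGAP15 5'UTR in the 3'-to-5' direction while retaining the nucleotide sequence at the 5' end. In some embodiments, the truncation comprises deleting a contiguous nucleotide sequence from the 5' end of the ARHGAP15 5'UTR in the 5'-to-3' direction and also deleting a contiguous nucleotide sequence from the 3' end in the 3'-to-5' direction.
In some embodiments, the truncated 5'UTR retains at least 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 15%, 10%, 5% or 1% of the length of the native ARHGAP15 5'UTR (SEQ ID NO: 1).
In some embodiments, the truncated 5'UTR comprises at least 5 contiguous nucleotides of the nucleotide sequence shown in SEQ ID NO: 2 or 44; in some embodiments, the truncated 5'UTR comprises 5-62 contiguous nucleotides of the sequence shown in SEQ ID NO: 2 or 44; in some embodiments, the truncated 5'UTR comprises 7-62 contiguous nucleotides of the sequence shown in SEQ ID NO: 2 or 44; in some embodiments, the truncated 5'UTR comprises 7-59 contiguous nucleotides of the sequence shown in SEQ ID NO: 2; exemplarily, the truncated 5'UTR comprises 5, 7, 8, 9, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 or 62 contiguous nucleotides of the sequence shown in SEQ ID NO: 2.
In some embodiments, the 5'-terminal nucleotide of the truncated 5'UTR (that is, its first nucleotide) is the nucleotide at any of positions 1-57, by natural counting, of the sequence shown in SEQ ID NO: 2 or 44. In some embodiments, the 5'-terminal nucleotide of the truncated 5'UTR is the nucleotide at any of positions 1-13 or 17-29, by natural counting, of the sequence shown in SEQ ID NO: 2 or 44. For example, it is the nucleotide at any of positions 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56 or 57.
Exemplarily, the 5'-terminal nucleotide of the truncated 5'UTR, and the length of the contiguous nucleotide sequence of SEQ ID NO: 2 that the truncated 5'UTR comprises, are shown in the following table: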


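In some embodiments, the length of the truncated 5'UTR is 59 bp, 55 bp, 51 bp, 47 bp, 43 bp, 39 bp, 35 bp, 31 bp, 27 bp, 23 bp, 19 bp, 15 bp, 11 bp or 7 bp.
In some embodiments, the truncated ARHGAP15 5'UTR further comprises the point mutation of any of the foregoing embodiments. In some embodiments, the truncated ARHGAP15 5'UTR further comprises a nucleotide sequence in which ATG is mutated to GTG. In some embodiments, the aforementioned 5'UTR derived from the ARHGAP15 gene comprises or is the nucleotide sequence shown in any one of SEQ ID NO: 28-30, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the truncated ARHGAP15 5'UTR does not comprise the point mutation of any of the foregoing embodiments. In some embodiments, the aforementioned 5'UTR of the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 31-40, or CTATAAT, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the truncated 5'UTR comprises at least CTATAAT or the nucleotide sequence shown in SEQ ID NO: 193.
In some embodiments, the truncated ARHGAP15 5'UTR comprises or is the nucleotide sequence shown in any one of SEQ ID NO: 41, 76 and 137-196, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the aforementioned 5'UTR derived from the HSPB1 gene comprises or is the nucleotide sequence shown in any one of SEQ ID NO: 3-5, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity to any one of SEQ ID NO: 3-5.
In some embodiments, the 5'UTR in the nucleic acid construct of the present disclosure comprises or is the nucleotide sequence shown in any one of SEQ ID NO: 1-2, 3-5, 28-43, 76 and 137-196, or the sequence CTATAAT; or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some specific embodiments, the 5'UTR in any of the above nucleic acid constructs comprises or is the nucleotide sequence shown in any one of SEQ ID NO: 1-5, 28-43, 76 and 137-196, or the sequence CTATAAT.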
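In some embodiments, the nucleic acid construct of the present disclosure further comprises: (c) a 3' untranslated region element (3'UTR).
In some embodiments, the open reading frame of the present disclosure and the 5'UTR and/or the 3'UTR are derived from different genes. In some embodiments, the nucleic acid construct of the present disclosure comprises at least one open reading frame, at least one 5'UTR or at least one 3'UTR. In some embodiments, the 5'UTR and 3'UTR in the nucleic acid construct of the present disclosure are of the same or different origin, for example derived from the same or different genes. In some embodiments, the 5'UTR and 3'UTR in the nucleic acid construct of the present disclosure are derived from the same species or from different species.
In some embodiments, the 3'UTR in the nucleic acid construct of the present disclosure is located downstream of the open reading frame. In some embodiments, the 3'UTR in the nucleic acid construct is located at the 3' end of the open reading frame.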
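The element order described for the construct (5'UTR upstream of the ORF, optional 3'UTR downstream) can be captured in a small data structure. The class name, field names and sequences below are hypothetical illustrations rather than anything defined in the disclosure, and other elements such as the cap and poly(A) tail are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NucleicAcidConstruct:
    five_prime_utr: str        # element (b), placed upstream of the ORF
    orf: str                   # element (a)
    three_prime_utr: str = ""  # optional element (c), placed downstream of the ORF

    def sequence(self) -> str:
        """Concatenate the elements in 5'-to-3' order."""
        return self.five_prime_utr + self.orf + self.three_prime_utr

    def orf_span(self) -> tuple:
        """Half-open [start, end) coordinates of the ORF within the construct."""
        start = len(self.five_prime_utr)
        return start, start + len(self.orf)


# Toy elements, not sequences from the disclosure.
construct = NucleicAcidConstruct(
    five_prime_utr="GGACC",
    orf="ATGGCTTAA",
    three_prime_utr="GCTAG",
)
start, end = construct.orf_span()
print(construct.sequence(), start, end)
```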
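In some embodiments, the 3'UTR in the nucleic acid construct of the present disclosure is selected from a 3'UTR that is, or is derived from, the 3'UTR of any of the following genes, or a derivative sequence thereof: hemoglobin subunit beta (HBB), ARHGAP15, coronin 1A (CORO1A), hemopexin (HPX), and the like.
In some embodiments, the 3'UTR is, or is derived from, the 3'UTR sequence of the HBB or ARHGAP15 gene, or a derivative sequence thereof.
In some embodiments, the aforementioned 3'UTR derived from the HBB gene comprises or is the nucleotide sequence shown in SEQ ID NO: 7, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the aforementioned 3'UTR derived from the ARHGAP15 gene comprises or is the nucleotide sequence shown in SEQ ID NO: 8, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the aforementioned 3'UTR derived from the CORO1A gene comprises or is the nucleotide sequence shown in SEQ ID NO: 9, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the aforementioned 3'UTR derived from the HPX gene comprises or is the nucleotide sequence shown in SEQ ID NO: 10, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.
In some embodiments, the 3'UTR in the nucleic acid construct of the present disclosure comprises or is the nucleotide sequence shown in any one of SEQ ID NO: 7-10, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto. In some specific embodiments, the 3'UTR comprises or is the nucleotide sequence shown in SEQ ID NO: 7 or 8.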
一些实施方案中,所述核酸构建体中包含5'UTR和3'UTR,其中:
5'UTR选自源自或为ARHGAP(例如ARHGAP15)、HSPB1、HBB、CCL13等任一基因的5'UTR或其衍生序列,和3'UTR选自源自或为HBB、ARHGAP15、CORO1A、HPX等任一基因的3'UTR或其衍生序列。
一些实施方案中,所述核酸构建体中包含5'UTR和3'UTR,所述5'UTR和3'UTR选自如下任意一项:
5'UTR源自或为ARHGAP15基因的5'UTR或其衍生序列,和3'UTR源自或为HBB基因的3'UTR或其衍生序列;
5'UTR源自或为ARHGAP15基因的5'UTR或其衍生序列,和3'UTR源自或为ARHGAP15基因的3'UTR或其衍生序列;
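the 5'UTR is derived from or is the 5'UTR of the ARHGAP15 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the CORO1A gene or a derivative sequence thereof;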
5'UTR源自或为ARHGAP15基因的5'UTR或其衍生序列,和3'UTR源自或为HPX基因的3'UTR或其衍生序列;
5'UTR源自或为HBB基因的5'UTR或其衍生序列,和3'UTR源自或为HBB基因的3'UTR或其衍生序列;
5'UTR源自或为HBB基因的5'UTR或其衍生序列,和3'UTR源自或为ARHGAP15基因的3'UTR或其衍生序列;
5'UTR源自或为HBB基因的5'UTR或其衍生序列,和3'UTR源自或为CORO1A基因的3'UTR或其衍生序列;
5'UTR源自或为HBB基因的5'UTR或其衍生序列,和3'UTR源自或为HPX基因的3'UTR或其衍生序列;
5'UTR源自或为HSPB1基因的5'UTR或其衍生序列,和3'UTR源自或为HBB基因的3'UTR或其衍生序列;
5'UTR源自或为HSPB1基因的5'UTR或其衍生序列,和3'UTR源自或为ARHGAP15基因的3'UTR或其衍生序列;
5'UTR源自或为HSPB1基因的5'UTR或其衍生序列,和3'UTR源自或为CORO1A基因的3'UTR或其衍生序列;
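the 5'UTR is derived from or is the 5'UTR of the HSPB1 gene or a derivative sequence thereof, and the 3'UTR is derived from or is the 3'UTR of the HPX gene or a derivative sequence thereof;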
5'UTR源自或为CCL13基因的5'UTR或其衍生序列,和3'UTR源自或为HBB基因的3'UTR或其衍生序列;
5'UTR源自或为CCL13基因的5'UTR或其衍生序列,和3'UTR源自或为ARHGAP15基因的3'UTR或其衍生序列;
5'UTR源自或为CCL13基因的5'UTR或其衍生序列,和3'UTR源自或为CORO1A基因的3'UTR或其衍生序列;或
5'UTR源自或为CCL13基因的5'UTR或其衍生序列,和3'UTR源自或为HPX基因的3'UTR或其衍生序列。
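The sixteen pairings listed above are the full cross product of four 5'UTR source genes and four 3'UTR source genes. A minimal sketch that enumerates them in the same order follows; the variable and function names are illustrative assumptions, and only the gene names come from the text.

```python
from itertools import product

# Source genes in the order in which the list above walks through them.
FIVE_UTR_GENES = ["ARHGAP15", "HBB", "HSPB1", "CCL13"]
THREE_UTR_GENES = ["HBB", "ARHGAP15", "CORO1A", "HPX"]


def utr_pairings():
    """Return every (5'UTR gene, 3'UTR gene) pairing, 5'UTR gene varying slowest."""
    return list(product(FIVE_UTR_GENES, THREE_UTR_GENES))


pairs = utr_pairings()
for five, three in pairs:
    print(f"5'UTR from {five} + 3'UTR from {three}")
```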
一些实施方案中,本公开核酸构建体中包含5'UTR和3'UTR,所述5'UTR和3'UTR选自如下任意一项:
5'UTR包含或为如SEQ ID NO:1-2、28-43、76、137-196、CTATAAT任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:7所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:1-2、28-43、76、137-196、CTATAAT任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、 100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:8所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:1-2、28-43、76、137-196、CTATAAT任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:9所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:1-2、28-43、76、137-196、CTATAAT任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:10所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:3所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:7所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:3所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:8所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:3所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:9所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:3所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:10所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:4所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:7所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:4所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:8所示的核苷酸序列或与之具有至少80%、85%、90%、95%、 96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:4所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:9所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:4所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:10所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:5所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:7所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:5所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:8所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
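the 5'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:5, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:9, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto; or
the 5'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:5, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:10, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.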
一些实施方案中,所述核酸构建体中包含5'UTR和3'UTR,其中:
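the 5'UTR comprises or is a nucleotide sequence shown in any one of SEQ ID NO:1-2, 28-43, 76, 137-196 or CTATAAT, and the 3'UTR is derived from or is the 3'UTR of any one of the HBB, ARHGAP15, CORO1A and HPX genes, or a derivative sequence thereof.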
一些具体实施方案中,所述核酸构建体中包含5'UTR和3'UTR,其中:
5'UTR包含或为如SEQ ID NO:1-2、28-43、76、137-196、CTATAAT任一所示的核苷酸序列,和3'UTR包含或为如SEQ ID NO:7-10任一所示的核苷酸序列。
一些实施方案中,本公开核酸构建体还进一步包含:(d)多聚腺苷酸(poly-A)尾巴。
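In some embodiments, the poly-A tail in the nucleic acid construct is located downstream of the 3'UTR. In some embodiments, the poly-A tail in the nucleic acid construct is located at the 3' end of the 3'UTR. In some embodiments, the poly-A tail is at the 3' end of the nucleic acid construct. In some embodiments, the poly-A tail is at least about 50, 100, 150, 200, 300, 400 or 500 nucleotides long.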
一些实施方案中,所述poly-A尾巴包括但不限于HGH polyA、SV40polyA、BGH polyA、rbGlob polyA或SV40late polyA。
一些具体实施方案中,所述poly-A尾巴包含或为如SEQ ID NO:16或135所示的核苷酸序列。
一些实施方案中,从5'至3'方向上,如上任一项所述的核酸构建体含有i)-iv)中任一:
i)5'UTR,和开放阅读框(ORF);
ii)开放阅读框(ORF),和3'UTR;
iii)5'UTR,开放阅读框(ORF),和3'UTR;
iv)5'UTR,开放阅读框(ORF),3'UTR,和poly-A尾巴;
其中所述ORF与所述5'UTR和/或所述3'UTR源自不同的基因。
一些实施方案中,i)、iii)至iv)中的5'UTR选自源自ARHGAP、HSPB1、HBB、CCL13等任一基因的5'UTR或其衍生序列(例如ARHGAP15的5'UTR或其衍生序列)。
一些实施方案中,ii)至iv)中的3'UTR选自源自HBB、ARHGAP15、CORO1A、HPX等任一基因的3'UTR或其衍生序列(例如HBB或ARHGAP15的3'UTR或其衍生序列)。
In some embodiments, the poly-A tail in iv) comprises, for example, the nucleotide sequence shown in SEQ ID NO:16 or 135.
一些实施方案中,i)中的5'UTR包含或为如SEQ ID NO:1-5、28-43、76、137-196、CTATAAT任一所示的核苷酸序列。
一些实施方案中,ii)中的3'UTR包含或为如SEQ ID NO:7-10任一所示的核苷酸序列。
一些实施方案中,iii)中的5'UTR包含或为如SEQ ID NO:1-5、28-43、76、137-196、CTATAAT任一所示的核苷酸序列,和3'UTR包含或为如SEQ ID NO:7-10任一所示的核苷酸序列。
一些实施方案中,iv)中的5'UTR包含或为如SEQ ID NO:1-5、28-43、76、137-196、CTATAAT任一所示的核苷酸序列,3'UTR包含或为如SEQ ID NO:7-10任一所示的核苷酸序列,和poly-A尾巴包含或为如SEQ ID NO:16或135所示的核苷酸序列。
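Layouts i) to iv) above differ only in which elements flank the ORF in the 5' to 3' direction. The sketch below models that ordering under stated assumptions: the `Construct` class and its methods are hypothetical, the ORF and 3'UTR strings are toy placeholders rather than any SEQ ID NO from the disclosure, and only CTATAAT is taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    """Elements of a nucleic acid construct, written 5' to 3' (DNA alphabet)."""
    five_utr: str = ""
    orf: str = ""
    three_utr: str = ""
    poly_a: str = ""

    def layout(self) -> str:
        """Classify the construct as layout i), ii), iii) or iv)."""
        if not self.orf:
            raise ValueError("every layout requires an ORF")
        has5, has3, has_a = bool(self.five_utr), bool(self.three_utr), bool(self.poly_a)
        if has5 and not has3 and not has_a:
            return "i"
        if not has5 and has3 and not has_a:
            return "ii"
        if has5 and has3 and not has_a:
            return "iii"
        if has5 and has3 and has_a:
            return "iv"
        raise ValueError("element combination does not match layouts i) to iv)")

    def sequence(self) -> str:
        """Concatenate the elements in 5' to 3' order after validating the layout."""
        self.layout()
        return self.five_utr + self.orf + self.three_utr + self.poly_a


construct = Construct(
    five_utr="CTATAAT",    # motif named in the text
    orf="ATGGCCTAA",       # toy placeholder ORF
    three_utr="GCTCGC",    # toy placeholder 3'UTR
    poly_a="A" * 50,       # shortest tail length recited in the text
)
full_sequence = construct.sequence()
```

Keeping the elements as separate fields, rather than one string, makes it straightforward to swap a single UTR while leaving the ORF untouched, which mirrors how the embodiments vary one element at a time.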
本公开中,上述任一项核酸构建体中,所述ORF包含编码至少一种多肽或蛋白的核苷酸序列。一些实施方案中,所述核苷酸序列可以是密码子优化的核苷酸序列。
In some embodiments, the polypeptide or protein encoded by the ORF is a fluorescent protein or luciferase.
一些实施方案中,所述ORF编码的多肽或蛋白是病毒抗原。示例性地,病毒抗原包括但不限于流感病毒、冠状病毒、呼吸道合胞病毒、人类免疫缺陷病毒、单纯疱疹病毒、狂犬病病毒或EB病毒等的抗原。
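In some embodiments, the viral antigen is a coronavirus antigen. In some embodiments, the coronavirus is a coronavirus that infects humans, for example SARS-CoV-2 (COVID-19), SARS-CoV, HCoV-229E, HCoV-OC43, HCoV-NL63, HCoV-HKU1 or MERS-CoV. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, the coronavirus antigen is a structural protein. In some embodiments, the structural protein is selected from the spike protein (S protein), the envelope protein (E protein), the membrane protein (M protein) and the nucleocapsid protein (N protein). In some embodiments, the structural protein is the spike protein. In some embodiments, the spike protein is the SARS-CoV-2 spike protein. In some embodiments, the SARS-CoV-2 spike protein is selected from the spike proteins of any of the following strains: SARS-CoV-2 (e.g. wild-type SARS-CoV-2), SARS-CoV-2 Alpha (B.1.1.7), SARS-CoV-2 Beta (B.1.351), SARS-CoV-2 Gamma (P.1), SARS-CoV-2 Kappa (B.1.617.1), SARS-CoV-2 Delta (B.1.617.2), SARS-CoV-2 Omicron (B.1.1.529), SARS-CoV-2 Omicron (BA.4), and the like.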
一些实施方案中,其中所述SARS-COV-2刺突蛋白包含或为如SEQ ID NO:12、14、21、23、25、80、136任一所示的氨基酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%的氨基酸同一性的序列。
一些实施方案中,所述ORF包含密码子优化的核苷酸序列,其包含与编码SARS-COV-2抗原的野生型核苷酸序列(例如野生型SARS-COV-2,SEQ ID NO:81)具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的序列。
一些实施方案中,其中编码所述多肽或蛋白的ORF包含或为如SEQ ID NO:13、15、20、22、24、81、97任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
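In some embodiments, the ORF encodes an influenza virus antigen. In some embodiments, the influenza virus is selected from influenza A virus and influenza B virus; illustratively, the influenza virus is influenza A virus H1N1, influenza A virus H3N2, influenza A virus H3N8, influenza A virus H2N2, influenza A virus H5N1, influenza A virus H9N2, influenza A virus H7N7, influenza B virus/Victoria (e.g. Influenza B/Washington/02/2019), influenza B virus/Yamagata (e.g. Influenza B/Phuket/3073/2013), and the like. In some embodiments, the influenza virus antigen is a structural protein of an influenza virus, for example hemagglutinin (HA), neuraminidase (NA), the M2 ion channel protein, matrix protein M1, nucleoprotein (NP), and the like. In some specific embodiments, the influenza virus antigen is the HA protein of an influenza virus, for example the HA protein of influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (e.g. Influenza B/Washington/02/2019) or influenza B virus/Yamagata (e.g. Influenza B/Phuket/3073/2013). In some specific embodiments, the influenza virus antigen is the NA protein of an influenza virus, for example the NA protein of influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (e.g. Influenza B/Washington/02/2019) or influenza B virus/Yamagata (e.g. Influenza B/Phuket/3073/2013).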
一些实施方案中,所述HA蛋白包含或为如SEQ ID NO:83-86、98-101任一所示的氨基酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%的氨基酸同一性的序列。
一些实施方案中,所述ORF包含密码子优化的核苷酸序列,其包含与编码HA抗原的野生型核苷酸序列具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的序列。
一些实施方案中,所述ORF包含或为SEQ ID NO:87-90、102-105任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的序列。
一些实施方案中,所述ORF包含或为SEQ ID NO:93-96任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的序列。
一些实施方案中,本公开提供一种核酸构建体,从5'至3'方向上,所述核酸构建体依次包含5'UTR、ORF、3'UTR和poly-A尾巴。一些实施方案中,所述5'UTR包含或为如SEQ ID NO:1-5、28-43、76、137-196、CTATAAT任一所示的核苷酸序列,所述ORF包含或为如SEQ ID NO:13、15、20、22、24、81、97、87-90、93-96、102-105任一所示的核苷酸序列,所述3'UTR包含或为如SEQ ID NO:7-10任一所示的核苷酸序列,和所述poly-A尾巴包含或为如SEQ ID NO:16或135所示的核苷酸序列。
示例性地,所述核酸构建体包含如SEQ ID NO:18-19所示的核苷酸序列。
本公开中,包括其任何实施方案,所述核酸构建体是一种核酸分子,例如cDNA。
RNA分子
本公开提供一种RNA分子,其包括:
(a)开放阅读框(ORF);和
(b)5'非翻译区元件(5'UTR)。
一些实施方案中,所述ORF为目的基因蛋白的编码多核苷酸序列。
一些实施方案中,所述目的基因为异源的。另一些实施方案中,所述目的基因为内源的。
一些实施方案中,所述RNA分子中的5'UTR位于开放阅读框的上游。一些实施方案中,所述RNA分子中的5'UTR位于开放阅读框的5'末端。
一些实施方案中,所述RNA分子中的5'UTR选自源自或为Rho GTPase激活蛋白(ARHGAP)、热休克27kD蛋白1(heat shock 27kDa protein 1,HSPB1)、β珠蛋白(hemoglobin subunit beta,HBB)、趋化因子c-c-基元配体13(C-C motif chemokine ligand 13,CCL13)等任一基因的5'UTR或其衍生序列。
一些实施方案中,所述RNA分子中的5'UTR是源自或为ARHGAP基因的5'UTR序列或其衍生序列。在一些实施方案中,所述ARHGAP包括ARHGAP1、ARHGAP2、ARHGAP3、ARHGAP4、ARHGAP5、ARHGAP6、ARHGAP7(DLC1)、ARHGAP8、ARHGAP9、ARHGAP10、ARHGAP12、ARHGAP13(SRGAP1)、ARHGAP14(SRGAP2)、ARHGAP15、ARHGAP17(RICH1)、ARHGAP18、ARHGAP19、ARHGAP20、ARHGAP21、ARHGAP22、ARHGAP23、ARHGAP24、ARHGAP25、ARHGAP26。
一些实施方案中,其中所述的5'UTR是源自或为ARHGAP15基因的5'UTR序列或其衍生序列。一些实施方案中,所述ARHGAP15基因是来源任意物种的,如人类ARHGAP15、狒狒ARHGAP15、小鼠ARHGAP15等。
一些实施方案中,其中所述的5'UTR是源自或为人类ARHGAP15基因的5'UTR序列或其衍生序列。一些实施方案中,所述的ARHGAP15基因的5'UTR包含或为如SEQ ID NO:45所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,本公开RNA分子中的5'UTR包含至少一个点突变,所述点突变包括前述核酸构建体任意实施方案中的点突变。一些实施方案中,所述突变为A突变为G、C或U。
一些实施方案中,前述源自ARHGAP15基因的5'UTR序列包含SEQ ID NO:79所示的核苷酸序列,或,所述5'UTR包含与SEQ ID NO:79具有至少80%、85%、90%、95%、96%、97%、98%、99%同一性的核苷酸序列,其中N2选自A、G、C或U。
一些实施方案中,前述源自ARHGAP15基因的5'UTR序列包含AUG突变为GUG的核苷酸序列。一些实施方案中,所述源自ARHGAP15基因的5'UTR包含或为如SEQ ID NO:46所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,前述源自ARHGAP15基因的5'UTR序列包含AUG突变为CUG的核苷酸序列。一些实施方案中,所述源自ARHGAP15基因的5'UTR包含或为如SEQ ID NO:77所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,前述源自ARHGAP15基因的5'UTR序列包含AUG突变为UUG的核苷酸序列。一些实施方案中,所述源自ARHGAP15基因的5'UTR包含或为如SEQ ID NO:78所示的核苷酸序列或与之具有至少80%、85%、90%、95%、 96%、97%、98%、99%、100%同一性的核苷酸序列。
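The three point mutations described above replace the A of an AUG in the 5'UTR with G, C or U. The following minimal sketch generates the three variants from one parent sequence. It assumes the mutated site is the first AUG in the 5'UTR, which the text does not state; the function name and the parent sequence are hypothetical placeholders, not SEQ ID NO:45.

```python
def mutate_first_aug(utr: str, replacement: str) -> str:
    """Replace the A of the first AUG in an RNA 5'UTR with G, C or U.

    Only the first occurrence is changed; later AUG triplets are left alone.
    """
    if replacement not in ("G", "C", "U"):
        raise ValueError("replacement must be G, C or U")
    utr = utr.upper()
    if "AUG" not in utr:
        raise ValueError("no AUG found in the sequence")
    return utr.replace("AUG", replacement + "UG", 1)


parent = "CCAUGGAUGA"  # toy placeholder 5'UTR containing two AUG triplets
variants = {base: mutate_first_aug(parent, base) for base in "GCU"}
```

The resulting dictionary holds the GUG, CUG and UUG variants keyed by the substituted base, so each can be compared against the unmodified parent in an expression assay.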
一些实施方案中,本公开RNA分子中的5'UTR为5'UTR截短体。一些实施方案中,所述5'UTR截短体仍然保持与天然5'UTR相似的活性功能,例如仍然保持调控ORF编码目的基因表达蛋白的功能。
一些实施方案中,所述5'UTR截短体的截短方式包括:从5'至3'末端的序列方向上,删除5'末端的连续的核苷酸序列;和/或,从3'至5'末端的序列方向上,删除3'末端的连续的核苷酸序列。一些实施方案中,所述5'UTR截短体的截短方式包括从5'至3'末端的序列方向上,删除5'末端的连续的核苷酸序列,保留3'末端的核苷酸序列。一些实施方案中,所述5'UTR截短体的截短方式包括从3'至5'末端的序列方向上,删除3'末端的连续的核苷酸序列,保留5'末端的核苷酸序列。一些实施方案中,所述5'UTR截短体的截短方式包括从5'至3'末端的序列方向上,删除5'末端的连续的核苷酸序列;并且,所述5'UTR截短体的截短方式包括从3'至5'末端的序列方向上,删除3'末端的连续的核苷酸序列。一些实施方案中,所述5'UTR截短体还包含前述任意实施方案的点突变。一些实施方案中,所述5'UTR截短体不包含前述任意实施方案的点突变。一些实施方案中,所述5'UTR截短体保留了相比于天然5'UTR序列长度的至少90%、80%、70%、60%、50%、40%、30%、20%、15%、10%、5%、1%长度的序列。
一些实施方案中,前述源自ARHGAP15基因的5'UTR序列包含或为ARHGAP15 5'UTR截短体。一些实施方案中,其截短方式包括从ARHGAP15 5'UTR的5'至3'末端的方向上,缺失5'末端连续的核苷酸序列;和/或,从3’至5’末端的序列方向上,缺失3’末端的连续的核苷酸序列。
一些实施方案中,其截短方式包括从ARHGAP15 5'UTR的5'至3'末端的方向上,删除5'末端连续的核苷酸序列,保留3'末端的核苷酸序列。一些实施方案中,5'UTR截短体的截短方式包括从ARHGAP15 5'UTR的3'至5'末端的方向上,删除3'末端连续的核苷酸序列,保留5'末端的核苷酸序列。一些实施方案中,5'UTR截短体的截短方式是从ARHGAP15 5'UTR的5'至3'末端的方向上,删除5'末端连续的核苷酸序列;并且,从3’至5’末端的序列方向上,缺失3’末端的连续的核苷酸序列。
一些实施方案中,所述截短体保留了相比于天然ARHGAP15 5'UTR(SEQ ID NO:45)至少90%、80%、70%、60%、50%、40%、30%、20%、15%、10%、5%、1%长度的序列。
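In some embodiments, the 5'UTR truncation comprises at least 5 contiguous nucleotides of the nucleotide sequence shown in SEQ ID NO:46 or 79; in some embodiments, the 5'UTR truncation comprises 5-62 contiguous nucleotides of the sequence shown in SEQ ID NO:46 or 79; in some embodiments, the 5'UTR truncation comprises 7-62 contiguous nucleotides of the sequence shown in SEQ ID NO:46 or 79; in some embodiments, the 5'UTR truncation comprises 7-59 contiguous nucleotides of the sequence shown in SEQ ID NO:46 or 79; illustratively, the 5'UTR truncation comprises 5, 7, 8, 9, 10, 11, or any integer number from 15 to 62, of contiguous nucleotides of the sequence shown in SEQ ID NO:2.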
一些实施方案中,所述5'UTR截短体的5’末端的核苷酸(也即,5'UTR截短体的起始核苷酸)是SEQ ID NO:46或79所示序列按自然计数的1-57位的任一位置处的核苷酸。一些实施方案中,所述5'UTR截短体的5’末端的核苷酸是SEQ ID NO:46或79所示序列按自然计数的第1-13或第17-29位中的任一位置处的核苷酸。例如,第1、2、3、4、5、6、7、8、9、10、11、12、13、14、15、16、17、18、19、20、21、22、23、24、25、26、27、28、29、30、31、32、33、34、35、36、37、38、39、40、41、42、43、44、45、46、47、48、49、50、51、52、53、54、55、56、57的任一位置处的核苷酸。
一些实施方案中,所述5'UTR截短体的长度为59bp、55bp、51bp、47bp、43bp、39bp、35bp、31bp、27bp、23bp、19bp、15bp、11bp、7bp。
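The truncation lengths listed above (59, 55, ..., 11, 7) form a series that shrinks by four nucleotides at a time. The sketch below generates such a series by deleting contiguous nucleotides from the 5' end and retaining the 3' end, which is one of the truncation modes described earlier. The parent sequence is a toy placeholder, not the native ARHGAP15 5'UTR, and placing the CUAUAAU motif at its 3' end is an assumption made for the example.

```python
def five_prime_truncations(utr: str, lengths) -> dict:
    """Map each target length to the truncation that keeps that many 3'-terminal nucleotides."""
    truncations = {}
    for n in lengths:
        if not 0 < n <= len(utr):
            raise ValueError(f"length {n} is outside 1..{len(utr)}")
        truncations[n] = utr[len(utr) - n:]
    return truncations


def retained_percent(truncated: str, parent: str) -> float:
    """Length of the truncation as a percentage of the parent length."""
    return 100.0 * len(truncated) / len(parent)


LENGTHS = list(range(59, 6, -4))        # 59, 55, ..., 11, 7
parent = "ACGU" * 14 + "CUAUAAU"        # toy 63-nt placeholder sequence
truncs = five_prime_truncations(parent, LENGTHS)
shortest_share = retained_percent(truncs[7], parent)
```

Under these assumptions the shortest member of the series is the 7-nt motif itself, which retains roughly 11% of the parent length.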
一些实施方案中,所述ARHGAP15 5'UTR截短体还包含前述任意实施方案的点突变。一些实施方案中,所述ARHGAP15 5'UTR截短体还包含AUG突变为GUG的核苷酸序列。一些实施方案中,前述源自ARHGAP15基因的5'UTR包含或为如SEQ ID NO:63-65所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,所述ARHGAP15 5'UTR截短体不包含前述任意实施方案的点突变。一些实施方案中,前述源自ARHGAP15基因的5'UTR包含或为如SEQ ID NO:66-75所示的核苷酸序列、CUAUAAU或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,前述源自ARHGAP15基因的5'UTR包含或为如包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T)或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,前述源自HSPB1基因的5'UTR包含或为如SEQ ID NO:47所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,前述源自HBB基因的5'UTR包含或为如SEQ ID NO:48所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,前述源自CCL13基因的5'UTR包含或为如SEQ ID NO:49所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,本公开所述RNA分子中的5'UTR包含或为如SEQ ID NO:45-49、63-75、77-78、CUAUAAU任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。或者,RNA分子中的5'UTR包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T)或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些具体实施方案中,如上任一项RNA分子中所述的5'UTR包含或为如SEQ ID NO:45-49、63-75、77-78、CUAUAAU任一所示的核苷酸序列。或者,RNA分子中的5'UTR包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T)。
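Several embodiments above define the RNA 5'UTR as the sequence corresponding to a DNA SEQ ID NO "with U replacing T". A minimal sketch of that conversion follows; the function name `dna_to_rna` is an illustrative assumption, and the example uses the CTATAAT motif, whose RNA counterpart CUAUAAU is named in the text.

```python
def dna_to_rna(seq: str) -> str:
    """Return the RNA counterpart of a DNA sequence by replacing every T with U."""
    seq = seq.upper()
    unexpected = set(seq) - set("ACGTN")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.replace("T", "U")


rna_motif = dna_to_rna("CTATAAT")
```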
一些实施方案中,本公开RNA分子还进一步包含:(c)3'非翻译区元件(3'UTR)。
一些实施方案中,本公开所述开放阅读框与所述5'UTR和/或所述3'UTR源自不同的基因。一些实施方案中,本公开RNA分子包含至少一个开放阅读框、至少一个5'UTR或至少一个3'UTR。一些实施方案中,本公开RNA分子中的所述5'UTR和3'UTR是相同或不同来源的,例如源自相同或不同的基因。一些实施方案中,本公开RNA分子中的所述5'UTR和3'UTR源自相同或不同的物种。
一些实施方案中,本公开RNA分子中的3'UTR位于开放阅读框的下游。一些实施方案中,所述RNA分子中的3'UTR位于开放阅读框的3'末端。一些实施方案中,本公开RNA分子中的所述3'UTR选自源自或为β珠蛋白(hemoglobin subunit beta,HBB)、ARHGAP15、冠蛋白1A(coronin 1A,CORO1A)、血红素结合蛋白(hemopexin,HPX)等任一基因的3'UTR或其衍生序列。
一些实施方案中,其中所述的3'UTR是源自或为HBB或ARHGAP15基因的3'UTR序列或其衍生序列。
一些实施方案中,前述源自HBB基因的3'UTR包含或为如SEQ ID NO:50所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,前述源自ARHGAP15基因的3'UTR包含或为如SEQ ID NO:51所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,前述源自CORO1A基因的3'UTR包含或为如SEQ ID NO:52所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,前述源自HPX基因的3'UTR包含或为如SEQ ID NO:53所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,本公开RNA分子中的3'UTR包含或为如SEQ ID NO:50-53 任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些具体实施方案中,其中所述的3'UTR包含或为如SEQ ID NO:50或51所示的核苷酸序列。
一些实施方案中,所述RNA分子中包含5'UTR和3'UTR,其中:
5'UTR选自源自或为ARHGAP(例如ARHGAP15)、HSPB1、HBB、CCL13等任一基因的5'UTR或其衍生序列,和3'UTR选自源自或为HBB、ARHGAP15、CORO1A、HPX等任一基因的3'UTR或其衍生序列。
一些实施方案中,所述RNA分子中包含5'UTR和3'UTR,所述5'UTR和3'UTR选自如下任意一项:
5'UTR源自或为ARHGAP15基因的5'UTR或其衍生序列,和3'UTR源自或为HBB基因的3'UTR或其衍生序列;
5'UTR源自或为ARHGAP15基因的5'UTR或其衍生序列,和3'UTR源自或为ARHGAP15基因的3'UTR或其衍生序列;
5'UTR源自或为ARHGAP15基因的5'UTR或其衍生序列,和3'UTR源自或为CORO1A基因的3'UTR或其衍生序列;
5'UTR源自或为ARHGAP15基因的5'UTR或其衍生序列,和3'UTR源自或为HPX基因的3'UTR或其衍生序列;
5'UTR源自或为HBB基因的5'UTR或其衍生序列,和3'UTR源自或为HBB基因的3'UTR或其衍生序列;
5'UTR源自或为HBB基因的5'UTR或其衍生序列,和3'UTR源自或为ARHGAP15基因的3'UTR或其衍生序列;
5'UTR源自或为HBB基因的5'UTR或其衍生序列,和3'UTR源自或为CORO1A基因的3'UTR或其衍生序列;
5'UTR源自或为HBB基因的5'UTR或其衍生序列,和3'UTR源自或为HPX基因的3'UTR或其衍生序列;
5'UTR源自或为HSPB1基因的5'UTR或其衍生序列,和3'UTR源自或为HBB基因的3'UTR或其衍生序列;
5'UTR源自或为HSPB1基因的5'UTR或其衍生序列,和3'UTR源自或为ARHGAP15基因的3'UTR或其衍生序列;
5'UTR源自或为HSPB1基因的5'UTR或其衍生序列,和3'UTR源自或为CORO1A基因的3'UTR或其衍生序列;
5'UTR源自或为HSPB1基因的5'UTR或其衍生序列,和3'UTR源自或为HPX基因的3'UTR或其衍生序列;或
5'UTR源自或为CCL13基因的5'UTR或其衍生序列,和3'UTR源自或为HBB基因的3'UTR或其衍生序列;
5'UTR源自或为CCL13基因的5'UTR或其衍生序列,和3'UTR源自或为ARHGAP15基因的3'UTR或其衍生序列;
5'UTR源自或为CCL13基因的5'UTR或其衍生序列,和3'UTR源自或为CORO1A基因的3'UTR或其衍生序列;或
5'UTR源自或为CCL13基因的5'UTR或其衍生序列,和3'UTR源自或为HPX基因的3'UTR或其衍生序列。
一些实施方案中,本公开RNA分子中包含5'UTR和3'UTR,所述5'UTR和3'UTR选自如下任意一项:
5'UTR包含或为如SEQ ID NO:45-46、63-75、77-78、CUAUAAU任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,或者,5'UTR包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T)或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;和3'UTR包含或为如SEQ ID NO:50所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:45-46、63-75、77-78、CUAUAAU任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,或者,5'UTR包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T)或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;和3'UTR包含或为如SEQ ID NO:51所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:45-46、63-75、77-78、CUAUAAU任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,或者,5'UTR包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T)或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;和3'UTR包含或为如SEQ ID NO:52所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
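the 5'UTR comprises or is a nucleotide sequence shown in any one of SEQ ID NO:45-46, 63-75, 77-78 or CUAUAAU, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, or the 5'UTR comprises a sequence corresponding to a sequence shown in SEQ ID NO:41, 76 or 137-196 (with U replacing T in the sequence), or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto; and the 3'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:53, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;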
5'UTR包含或为如SEQ ID NO:47所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:50所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:47所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:51所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:47所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:52所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
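the 5'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:47, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:53, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;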
5'UTR包含或为如SEQ ID NO:48所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:50所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:48所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:51所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:48所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:52所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
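the 5'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:48, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:53, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto;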
5'UTR包含或为如SEQ ID NO:49所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:50所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
5'UTR包含或为如SEQ ID NO:49所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列,和3'UTR包含或为如SEQ ID NO:51所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列;
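the 5'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:49, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:52, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto; or
the 5'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:49, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto, and the 3'UTR comprises or is the nucleotide sequence shown in SEQ ID NO:53, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% identity thereto.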
一些实施方案中,所述RNA分子中包含5'UTR和3'UTR,其中:
5'UTR包含或为如SEQ ID NO:45-46、63-75、77-78、CUAUAAU任一所示的核苷酸序列,或者,5'UTR包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T);和3'UTR源自或为HBB、ARHGAP15、CORO1A、HPX任一基因的3'UTR或其衍生序列。
一些具体实施方案中,所述RNA分子中包含5'UTR和3'UTR,其中:
5'UTR包含或为如SEQ ID NO:45-46、63-75、77-78、CUAUAAU任一所示的核苷酸序列,或者,5'UTR包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T);和3'UTR包含或为如SEQ ID NO:50-53任一所示的核苷酸序列。
一些实施方案中,本公开RNA分子还进一步包含:(d)多聚腺苷酸(poly-A)尾巴。
一些实施方案中,所述RNA分子中的poly-A尾巴位于3'UTR的下游。一些实施方案中,所述RNA分子中的poly-A尾巴位于3'UTR的3'末端。一些实施方案中,所述poly-A尾巴在所述RNA分子的3'末端。一些实施方案中,poly-A尾巴的长度为至少约50、100、150、200、300、400、500个核苷酸。
一些实施方案中,所述poly-A尾巴包括但不限于HGH polyA、SV40polyA、BGH polyA、rbGlob polyA或SV40late polyA。
一些具体实施方案中,所述poly-A尾巴包含或为如SEQ ID NO:56或135所示的核苷酸序列。
一些实施方案中,本公开RNA分子还进一步包含:(e)5'帽结构(5'Cap)。
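In some embodiments, the 5' cap structure in the RNA molecule is located upstream of the 5'UTR. In some embodiments, the 5' cap structure in the RNA molecule is located at the 5' end of the 5'UTR. In some embodiments, the 5' cap structure is a cap structure known to those skilled in the art, such as Cap0 (methylation of the first nucleobase, e.g. m7GpppN), Cap1 (additional methylation of the ribose of the nucleotide adjacent to m7GpppN, e.g. m7G(5')ppp(5')(2'OMeA)pG), Cap2 (additional methylation of the ribose of the 2nd nucleotide downstream of m7GpppN), Cap3 (additional methylation of the ribose of the 3rd nucleotide downstream of m7GpppN), Cap4 (additional methylation of the ribose of the 4th nucleotide downstream of m7GpppN), ARCA (anti-reverse cap analog), modified ARCA (e.g. phosphorothioate-modified ARCA), inosine, N1-methyl-guanosine, 2'-fluoro-guanosine, 7-deaza-guanosine, 8-oxo-guanosine, 2-amino-guanosine, LNA-guanosine and 2-azido-guanosine.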
一些实施方案中,使用化学RNA合成或RNA体外转录(共转录加帽)形成5'-帽结构(如Cap0或Cap1)。
一些实施方案中,使用加帽酶(例如牛痘病毒加帽酶和/或帽依赖性2'-O甲基转移酶)经由酶促加帽来形成5'-帽结构(如Cap0或Cap1)。一些实施方案中,使用固定化加帽酶,添加5'帽结构(Cap0或Cap1)。此处全文引入WO2016/193226中的加帽方法和手段。
一些实施方案中,所述5'帽结构包括但不限于ARCA、3'OMe-m7G(5')ppp(5')G、m7G(5')ppp(5')(2'OMeA)pU、m7Gppp(A2'O-MOE)pG、m7G(5')ppp(5')(2'OMeA)pG、m7G(5')ppp(5')(2'OMeG)pG、m7(3'OMeG)(5')ppp(5')(2'OMeG)pG或m7(3'OMeG)(5')ppp(5')(2'OMeA)pG。
一些具体实施方案中,所述5'帽结构是m7G(5')ppp(5')(2'OMeA)pG。还可以使用其他5'帽结构或帽结构类似物。
一些实施方案中,从5'至3'方向上,如上任一项所述的RNA分子含有i)-v)中任一:
i)5'UTR,和开放阅读框(ORF);
ii)开放阅读框(ORF),和3'UTR;
iii)5'UTR,开放阅读框(ORF),和3'UTR;
iv)5'UTR,开放阅读框(ORF),3'UTR,和poly-A尾巴;
v)5'帽结构(5'Cap),5'UTR,开放阅读框(ORF),3'UTR,和poly-A尾巴;
其中所述ORF与所述5'UTR和/或所述3'UTR源自不同的基因。
一些实施方案中,i)、iii)至v)中的5'UTR选自源自ARHGAP、HSPB1、HBB、CCL13等任一基因的5'UTR或其衍生序列(例如ARHGAP15的5'UTR或其衍生序列)。
一些实施方案中,ii)至v)中的3'UTR选自源自HBB、ARHGAP15、CORO1A、HPX等任一基因的3'UTR或其衍生序列(例如HBB或ARHGAP15的3'UTR或其衍生序列)。
一些实施方案中,iv)至v)中的poly-A尾巴包含例如SEQ ID NO:56所示的核苷酸序列。
一些实施方案中,v)中的5'帽结构包括但不限于Cap0、Cap1(例如m7G(5')ppp(5')(2'OMeA)pG)、Cap2、Cap3、Cap4、ARCA。
一些实施方案中,i)中的5'UTR包含或为如SEQ ID NO:45-49、63-75、77-78、CUAUAAU任一所示的核苷酸序列;或者,5'UTR包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T)。
一些实施方案中,ii)中的3'UTR包含或为如SEQ ID NO:50-53任一所示的核苷酸序列。
一些实施方案中,iii)中的5'UTR包含或为如SEQ ID NO:45-49、63-75、77-78、CUAUAAU任一所示的核苷酸序列,或者,5'UTR包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T);和3'UTR包含或为如SEQ ID NO:50-53任一所示的核苷酸序列。
一些实施方案中,iv)中的5'UTR包含或为如SEQ ID NO:45-49、63-75、77-78、CUAUAAU任一所示的核苷酸序列,或者,5'UTR包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T);3'UTR包含或为如SEQ ID NO:50-53任一所示的核苷酸序列,和poly-A尾巴包含或为如SEQ ID NO:56所示的核苷酸序列。
一些实施方案中,v)中的5'UTR包含或为如SEQ ID NO:45-49、63-75、77-78、CUAUAAU任一所示的核苷酸序列,或者,5'UTR包含SEQ ID NO:41、76、137-196所示序列所对应的序列(以U替换序列中的T);3'UTR包含或为如SEQ ID NO:50-53任一所示的核苷酸序列,poly-A尾巴包含或为如SEQ ID NO:56所示的核苷酸序列,和5'帽结构为m7G(5')ppp(5')(2'OMeA)pG。
本公开中,上述任一项RNA分子中,所述ORF包含编码至少一种多肽或蛋白的核苷酸序列。一些实施方案中,所述核苷酸序列可以是密码子优化的核苷酸序列。
一些实施方案中,所述ORF编码的多肽或蛋白是荧光蛋白或荧光素酶(luciferase)。
一些实施方案中,所述ORF编码的多肽或蛋白是病毒抗原。示例性地,病毒抗原包括但不限于流感病毒、冠状病毒、呼吸道合胞病毒、人类免疫缺陷病毒、单纯疱疹病毒、狂犬病病毒或EB病毒等的抗原。
一些实施方案中,所述病毒抗原是冠状病毒抗原。一些实施方案中,所述冠状病毒为感染人的冠状病毒,例如SARS-CoV-2(COVID-19)、SARS-CoV、HCoV-229E、HCoV-OC43、HCoV-NL63、HCoV-HKU1或MERS-CoV。一些实施方案中,所述冠状病毒是SARS-COV-2。一些实施方案中,所述冠状病毒抗原是结构蛋白质。一些实施方案中,所述结构蛋白质选自刺突蛋白(Spike protein,S蛋白或Spike蛋白)、包膜蛋白(envelope protein,E蛋白)、膜蛋白(membrane protein,M蛋白)和核衣壳蛋白(nucleocapsid protein,N蛋白)。一些实施方案中,所述结构蛋白质为刺突蛋白。一些实施方案中,所述刺突蛋白为SARS-COV-2刺突蛋白。一些实施方案中,所述SARS-COV-2刺突蛋白选自SARS-COV-2(例如野生型 SARS-COV-2)、SARS-COV-2Alpha(B.1.1.7)、SARS-COV-2Beta(B.1.351)、SARS-COV-2Gamma(P.1)、SARS-COV-2Kappa(B.1.617.1)、SARS-COV-2Delta(B.1.617.2)、SARS-COV-2Omicron(B.1.1.529)、SARS-COV-2Omicron(BA.4)等任一病毒株的刺突蛋白。一些实施方案中,其中所述SARS-COV-2刺突蛋白包含或为如SEQ ID NO:12、14、21、23、25、80、136任一所示的氨基酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%的氨基酸同一性的序列。
一些实施方案中,所述ORF包含密码子优化的核苷酸序列,其包含与编码SARS-COV-2抗原的野生型核苷酸序列(例如野生型SARS-COV-2,SEQ ID NO:82)具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的序列。
一些实施方案中,其中编码所述多肽或蛋白的ORF包含或为如SEQ ID NO:54-55、59-61、82、130任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,所述ORF编码流感病毒抗原。一些实施方案中,所述流感病毒选自A型流感病毒或B型流感病毒,示例性地,流感病毒为A型流感病毒H1N1、A型流感病毒H3N2、A型流感病毒H3N8、A型流感病毒H2N2、A型流感病毒H5N1、A型流感病毒H9N2、A型流感病毒H7N7,B型流感病毒/Victoria(例如,Influenza B/Washington/02/2019),B型流感病毒/Yamagata(例如,Influenza B/Phuket/3073/2013)等。一些实施方案中,流感病毒抗原为流感病毒的结构蛋白,例如,血凝素(hemagglutinin,HA)、神经氨酸酶(neuraminidase,NA)、离子通道蛋白M2(M2ion channel)、基质蛋白M1(matrix protein)、核蛋白NP(nucleoprotein)等。一些具体的实施方案中,所述流感病毒抗原为流感病毒的HA蛋白,例如A型流感病毒H1N1、A型流感病毒H3N2、B型流感病毒/Victoria(例如,Influenza B/Washington/02/2019),B型流感病毒/Yamagata(例如,Influenza B/Phuket/3073/2013)的HA蛋白。一些具体的实施方案中,所述流感病毒抗原为流感病毒的NA蛋白,例如A型流感病毒H1N1、A型流感病毒H3N2、B型流感病毒Victoria(例如,Influenza B/Washington/02/2019),B型流感病毒Yamagata(例如,Influenza B/Phuket/3073/2013)的NA蛋白。一些实施方案中,所述HA蛋白包含或为如SEQ ID NO:83-86、98-101任一所示的氨基酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%的氨基酸同一性的序列。
一些实施方案中,所述ORF包含密码子优化的核苷酸序列,其包含与编码HA抗原的野生型核苷酸序列具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的序列。
一些实施方案中,其中编码所述多肽或蛋白的ORF包含或为如SEQ ID NO:126-129、131-134任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、 96%、97%、98%、99%、100%同一性的序列。
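In some embodiments, the present disclosure provides an RNA molecule which, in the 5' to 3' direction, comprises in order a 5'UTR, an ORF, a 3'UTR and a poly-A tail. In some embodiments, the 5'UTR comprises or is a nucleotide sequence shown in any one of SEQ ID NO:45-49, 63-75, 77-78 or CUAUAAU, or the 5'UTR comprises a sequence corresponding to a sequence shown in SEQ ID NO:41, 76 or 137-196 (with U replacing T in the sequence); the ORF comprises or is a nucleotide sequence shown in SEQ ID NO:54-55, 59-61, 82 or 126-134; the 3'UTR comprises or is a nucleotide sequence shown in any one of SEQ ID NO:50-53; and the poly-A tail comprises or is the nucleotide sequence shown in SEQ ID NO:56 or 135. In some specific embodiments, the RNA molecule comprises a nucleotide sequence shown in SEQ ID NO:57-58 or 106-125.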
本公开中,包括其任何实施方案,所述RNA分子可以是mRNA。
本公开首次使用源自人类Rho GTPase激活蛋白(ARHGAP)的5'UTR序列用于目的基因(如SARS-CoV-2病毒蛋白)的重组表达,筛选、设计并成功验证了源自ARHGAP15 5'UTR及其改造修饰的5'UTR可以启动不同目的基因的高效表达。Rho GTP酶激活蛋白15(Rho GTPase activating protein 15,ARHGAP15)属于ARHGAP家族,是一种Rac1特异性GTP酶激活蛋白(GAPs),其是Rho家族GTP酶活性的主要负向调节因子。本公开提供了更多的、具有优异表达效率的5'UTR和3'UTR选择和组合,同时,提供了具有高效预防新冠病毒感染、流感病毒感染潜力,特别是针对其各种变异株的mRNA疫苗。
多核苷酸
本公开还提供一种分离的多核苷酸,所述多核苷酸编码一种或多种多肽或蛋白,其中所述多肽或蛋白是冠状病毒抗原。一些实施方案中,所述冠状病毒是SARS-COV-2。一些实施方案中,所述冠状病毒抗原是结构蛋白质。一些实施方案中,所述结构蛋白质选自刺突蛋白(Spike protein,S蛋白或Spike蛋白)、包膜蛋白(envelope protein,E蛋白)、膜蛋白(membrane protein,M蛋白)和核衣壳蛋白(nucleocapsid protein,N蛋白)。一些实施方案中,所述结构蛋白质为刺突蛋白。一些实施方案中,所述刺突蛋白为SARS-COV-2刺突蛋白。一些实施方案中,所述SARS-COV-2刺突蛋白选自SARS-COV-2(例如野生型SARS-COV-2)、SARS-COV-2Alpha(B.1.1.7)、SARS-COV-2Beta(B.1.351)、SARS-COV-2Gamma(P.1)、SARS-COV-2Kappa(B.1.617.1)、SARS-COV-2Delta(B.1.617.2)、SARS-COV-2Omicron(B.1.1.529)、SARS-COV-2Omicron(BA.4)等任一病毒株的刺突蛋白。一些实施方案中,其中所述SARS-COV-2刺突蛋白包含或为如SEQ ID NO:12、14、21、23、25、80、136任一所示的氨基酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%的氨基酸同一性的序列。
一些实施方案中,所述ORF编码流感病毒抗原。一些实施方案中,所述流感病毒选自A型流感病毒或B型流感病毒,示例性地,流感病毒为A型流感病毒 H1N1、A型流感病毒H3N2、A型流感病毒H3N8、A型流感病毒H2N2、A型流感病毒H5N1、A型流感病毒H9N2、A型流感病毒H7N7,B型流感病毒/Victoria(例如,Influenza B/Washington/02/2019),B型流感病毒/Yamagata(例如,Influenza B/Phuket/3073/2013)等。一些实施方案中,流感病毒抗原为流感病毒的结构蛋白,例如,血凝素(hemagglutinin,HA)、神经氨酸酶(neuraminidase,NA)、离子通道蛋白M2(M2ion channel)、基质蛋白M1(matrix protein)、核蛋白NP(nucleoprotein)等。一些具体的实施方案中,所述流感病毒抗原为流感病毒的HA蛋白,例如A型流感病毒H1N1、A型流感病毒H3N2、B型流感病毒/Victoria(例如,Influenza B/Washington/02/2019),B型流感病毒/Yamagata(例如,Influenza B/Phuket/3073/2013)的HA蛋白。一些具体的实施方案中,所述流感病毒抗原为流感病毒的NA蛋白,例如A型流感病毒H1N1、A型流感病毒H3N2、B型流感病毒Victoria(例如,Influenza B/Washington/02/2019),B型流感病毒Yamagata(例如,Influenza B/Phuket/3073/2013)的NA蛋白。一些实施方案中,所述流感病毒HA蛋白包含或为如SEQ ID NO:83-86、98-101任一所示的氨基酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%的氨基酸同一性的序列。
一些实施方案中,所述ORF包含密码子优化的核苷酸序列,其包含与编码SARS-COV-2抗原的野生型核苷酸序列(例如野生型SARS-COV-2mRNA,SEQ ID NO:82)具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的序列。
一些实施方案中,所述分离的多核苷酸包含或为如SEQ ID NO:54-55、59-61、82、126-134所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,所述多核苷酸是RNA,例如mRNA。一些实施方案中,所述多核苷酸为mRNA的ORF区。
一些实施方案中,本公开提供前述任意一种核酸构建体或RNA分子在如下任一项中的用途:(1)制备疫苗;(2)在受试者体内或体外编码病毒抗原;(3)制备在受试者体内或体外编码病毒抗原的药物;(4)用于制备药物。
一些实施方案中,本公开中的核酸构建体或RNA分子编码目标蛋白,其中开放阅读框(ORF)具有增强的目标蛋白表达。
修饰
为了进一步改善本公开的RNA或多核苷酸分子的蛋白质表达的稳定性,所述RNA或多核苷酸分子可进一步包含一种或多种修饰(包括化学修饰),例如骨架修饰、糖修饰、碱基修饰和/或脂质修饰等。在一些实施方案中,所述RNA或多核苷酸分子被均匀地修饰成某个特定修饰(例如,在整个序列中完全修饰)。例如,可以用假尿苷均匀地修饰RNA,使得序列中的每个U是假尿苷。
与本公开有关的骨架修饰是指本公开RNA或多核苷酸分子中包含的核苷酸的骨架的磷酸酯的化学修饰。一些实施方案中,所述骨架修饰包括但不限于用修饰的磷酸酯完全取代骨架中未修饰的磷酸酯部分,例如可以通过用不同的取代基取代一个或多个氧原子来修饰主链的磷酸基团。一些实施方案中,所述修饰的磷酸酯包括但不限于硫代磷酸酯、亚磷酸硒酸酯、硼烷磷酸酯、硼烷磷酸酯、膦酸氢酯、氨基磷酸酯、烷基或芳基膦酸酯和磷酸三酯。
与本公开有关的糖修饰是指本公开RNA或多核苷酸分子中包含的核苷酸的糖的化学修饰。一些实施方案中,所述糖修饰包括但不限于将RNA分子的2'羟基(OH)修饰或替换为许多不同的“氧基”或“脱氧”取代基。一些实施方案中,所述“氧基”修饰包括但不限于烷氧基、芳氧基、聚乙二醇(PEG)等的取代修饰。一些实施方案中,“脱氧”修饰包括但不限于氢、氨基(例如NH2、烷基氨基、二烷基氨基、杂环基、芳基氨基、二芳基氨基、杂芳基氨基、二杂芳基氨基或氨基酸)修饰。
与本公开有关的碱基修饰是指本公开RNA或多核苷酸分子中包含的核苷酸的碱基部分的化学修饰。一些实施方案中,所述碱基修饰包括对核苷酸中腺嘌呤、鸟嘌呤、胞嘧啶和尿嘧啶的修饰。例如,本文所述的核苷和核苷酸可在主凹槽表面上被化学修饰。一些实施方案中,主要的凹槽化学修饰可包括氨基、硫醇基、烷基或卤素基团。一些实施方案中,所述碱基修饰包括但不限于用假尿苷、1-甲基-假尿苷、5-氮杂胞苷、5-甲基胞嘧啶-5'-三磷酸或2-甲氧基腺嘌呤修饰。一些实施方案中,所述碱基修饰为假尿苷修饰。一些具体的实施方案中,可以用假尿苷均匀地修饰RNA,使得序列中的每个U是假尿苷。
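Uniform pseudouridine modification, as described above, means that every U in the sequence is a pseudouridine. As a purely notational sketch, the substitution can be written over a sequence string; the symbol Ψ for pseudouridine and the function name are conventions assumed here, and the code models only the notation, not the chemistry or the synthesis route.

```python
PSI = "\u03a8"  # Greek capital psi, used here as a one-letter symbol for pseudouridine


def uniformly_pseudouridylate(rna: str) -> str:
    """Replace every U in an RNA sequence string with the pseudouridine symbol."""
    rna = rna.upper()
    if "T" in rna:
        raise ValueError("expected an RNA sequence (U, not T)")
    return rna.replace("U", PSI)


modified = uniformly_pseudouridylate("CUAUAAU")  # motif named in the text
modified_positions = [i + 1 for i, base in enumerate(modified) if base == PSI]
```

Recording the 1-based positions alongside the modified string keeps the information needed to describe a partially modified molecule as well, should only a subset of positions be changed.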
与本公开有关的脂质修饰是指本公开RNA或多核苷酸分子中包含脂质修饰。一些实施方案中,所述脂质修饰包括但不限于本公开的RNA或多核苷酸分子共价连接至少一个接头,以及相应的接头与至少一个脂质共价连接。一些实施方案中,所述脂质修饰包括但不限于本公开的RNA或多核苷酸分子与至少一个脂质共价连接(无接头)。
UTRs
UTRs(5'UTR和/或3'UTR)可以作为侧翼区域提供给本公开的核酸构建体、RNA或多核苷酸分子。UTRs可以与在本公开的核酸构建体、RNA或多核苷酸分子中的编码区同源或异源。侧翼区域可以包含一个或多个5'UTR和/或3'UTR,所述UTRs可以是相同的或不同的序列。侧翼区域的任何部分可以进行密码子优化。在密码子优化之前和/或之后,侧翼区域的任何部分可以独立地包含一个或多个不同的结构或化学修饰。
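To alter one or more properties of the nucleic acid construct, RNA or polynucleotide of the present disclosure, UTRs heterologous to the ORF of the present disclosure are introduced into, or engineered into, the nucleic acid construct, RNA or polynucleotide. The recombinant nucleic acid construct, RNA or polynucleotide is then administered to a cell, tissue or organism, and outcomes such as protein level, localization and/or half-life are measured to evaluate the beneficial effect of the heterologous UTR on the nucleic acid construct, RNA or polynucleotide of the present disclosure. In some embodiments, the UTR includes a wild-type UTR or a variant thereof; UTR variants include those in which one or more nucleotides, including A, T, C or G, are added to or removed from a terminus. In some embodiments, UTR variants also include UTRs that have been codon-optimized or modified in any manner. In some embodiments, UTR variants also include the derivative sequences of any embodiment of the present disclosure, for example a native UTR sequence in which an ATG is point-mutated to GTG in order to suppress translation initiation from the ATG inside the UTR.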
载体
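The present disclosure also provides a vector comprising the nucleic acid construct, RNA or polynucleotide of any of the foregoing. The nucleic acid construct, RNA or polynucleotide may be present in the vector and/or may be part of the vector, the vector being, for example, a plasmid, cosmid, YAC or viral vector. The vector may be an expression vector, i.e. a vector that provides for expression of the polypeptide encoded by the nucleic acid construct, RNA or polynucleotide. Such an expression vector typically comprises at least one nucleic acid of the present disclosure operably linked to one or more suitable expression control elements (e.g. promoters, terminators, etc.). Selecting such elements and their sequences for expression in a particular host is within the common knowledge of those skilled in the art. Regulatory elements and other elements useful or necessary for expressing the encoded polypeptide of the present disclosure include, for example, promoters, terminators, selectable markers, leader sequences, reporter genes, and the like.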
本公开的核酸构建体可基于本公开的核苷酸序列的信息通过已知的方式(例如通过自动DNA合成和/或重组DNA技术)制备或获得,和/或可从适合的天然来源加以分离。
一些实施方案中,本公开的载体还包含启动子,例如所述启动子在所述核酸构建体5'UTR的5'末端,例如所述启动子为T7启动子、T7lac启动子、Tac启动子、Lac启动子、Trp启动子。
一些具体实施方案中,其中所述的T7启动子包含或为如SEQ ID NO:17所示的核苷酸序列。
宿主细胞
本公开还提供一种宿主细胞,其包含前述任一项所述的核酸构建体、RNA或多核苷酸。一些实施方案中,所述细胞能够表达一种或多种本公开核酸构建体、RNA或多核苷酸编码的多肽。一些实施方案中,所述宿主细胞为细菌细胞、真菌细胞或哺乳动物细胞。
细菌细胞例如包括革兰氏阴性细菌菌株(例如大肠杆菌(Escherichia coli)菌株、变形杆菌属(Proteus)菌株及假单胞菌属(Pseudomonas)菌株)及革兰氏阳性细菌菌株(例如芽孢杆菌属(Bacillus)菌株、链霉菌属(Streptomyces)菌株、葡萄球菌属(Staphylococcus)菌株及乳球菌属(Lactococcus)菌株)的细胞。
Fungal cells include, for example, cells of species of Trichoderma, Neurospora and Aspergillus; or cells of species of Saccharomyces (e.g. Saccharomyces cerevisiae), Schizosaccharomyces (e.g. Schizosaccharomyces pombe), Pichia (e.g. Pichia pastoris and Pichia methanolica) and Hansenula.
Mammalian cells include, for example, HEK293 cells, CHO cells, BHK cells, HeLa cells, COS cells, and the like.
然而,本公开也可使用两栖类细胞、昆虫细胞、植物细胞及本领域中用于表达异源蛋白的任何其他细胞。
生产或制备方法
本公开提供一种制备本公开核酸构建体、RNA或多核苷酸的方法,以及制备其编码多肽的方法。
用于制备产生核酸构建体、RNA或多核苷酸,及其编码多肽的方法及试剂,例如特定适合载体、转化或转染方法、选择标记物、诱导蛋白表达的方法、培养条件等在本领域中是已知的。类似地,适用于制造本公开的编码多肽的方法中的蛋白分离及纯化技术为本领域技术人员所公知。
一些实施方案中,所述制备核酸构建体、RNA或多核苷酸的方法包括培养前述的宿主细胞,并从培养物中回收产生的核酸构建体、RNA或多核苷酸。本公开的核酸构建体、RNA或多核苷酸,及其编码多肽也可以通过本领域已知的其它产生方法获得,例如化学合成,包括固相或液相合成。
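In some embodiments, the method for preparing an RNA molecule comprises: preparing a nucleic acid construct or vector, and then carrying out transcription (for example, in vitro transcription) using the nucleic acid construct or vector as a template to obtain the RNA molecule. In some specific embodiments, the method further comprises adding a 5' Cap to the 5' end of the RNA molecule.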
疫苗
本公开还提供一种疫苗,其包含前述任一项所述的核酸构建体、前述任一项所述的RNA和/或前述任一项所述的多核苷酸。一些实施方案中,所述核酸构建体、RNA或多核苷酸编码一种或多种病毒株的一个或多个抗原。本公开提供的疫苗可以是单价疫苗、多价疫苗或联合疫苗。
单价疫苗
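The present disclosure provides a monovalent vaccine comprising a nucleic acid construct, RNA or polynucleotide encoding one antigen of one organism. In some embodiments, the monovalent vaccine comprises a nucleic acid construct, RNA or polynucleotide encoding one antigen of one virus strain.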
多价/联合疫苗
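In some embodiments, the vaccine may include a nucleic acid construct, RNA or polynucleotide molecule, or a plurality of nucleic acid constructs, RNAs or polynucleotide molecules, encoding two or more antigens of the same or different species. In some embodiments, the vaccine includes an RNA or a plurality of RNAs encoding two or more antigens of the same or different virus strains. In some embodiments, the RNA may encode 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more viral antigens.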
一些实施方案中,在上述单价疫苗和多价/联合疫苗中,所述抗原是冠状病毒抗原,所述冠状病毒例如为SARS-CoV-2(COVID-19)、SARS-CoV、HCoV-229E、HCoV-OC43、HCoV-NL63、HCoV-HKU1或MERS-CoV。一些实施方案中,所述 冠状病毒抗原是结构蛋白质,例如选自刺突蛋白(Spike protein,S蛋白或Spike蛋白)、包膜蛋白(envelope protein,E蛋白)、膜蛋白(membrane protein,M蛋白)和核衣壳蛋白(nucleocapsid protein,N蛋白)。一些实施方案中,所述结构蛋白质为刺突蛋白,例如为SARS-COV-2刺突蛋白。一些实施方案中,所述SARS-COV-2刺突蛋白选自SARS-COV-2(例如野生型SARS-COV-2mRNA)、SARS-COV-2Alpha(B.1.1.7)、SARS-COV-2Beta(B.1.351)、SARS-COV-2Gamma(P.1)、SARS-COV-2Kappa(B.1.617.1)、SARS-COV-2Delta(B.1.617.2)、SARS-COV-2Omicron(B.1.1.529)、SARS-COV-2Omicron(BA.4)等任一病毒株的刺突蛋白。一些实施方案中,其中所述SARS-COV-2刺突蛋白包含或为如SEQ ID NO:12、14、21、23、25、80、136任一所示的氨基酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%的氨基酸同一性的序列。一些实施方案中,编码所述SARS-COV-2刺突蛋白的DNA序列包含或为如SEQ ID NO:13、15、20、22、24、81、97任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。一些实施方案中,编码所述SARS-COV-2刺突蛋白的RNA序列包含或为如SEQ ID NO:54-55、59-61、82、130所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。
一些实施方案中,所述ORF编码流感病毒抗原。一些实施方案中,所述流感病毒选自A型流感病毒或B型流感病毒,示例性地,流感病毒为A型流感病毒H1N1、A型流感病毒H3N2、A型流感病毒H3N8、A型流感病毒H2N2、A型流感病毒H5N1、A型流感病毒H9N2、A型流感病毒H7N7,B型流感病毒/Victoria(例如,Influenza B/Washington/02/2019),B型流感病毒/Yamagata(例如,Influenza B/Phuket/3073/2013)等。一些实施方案中,流感病毒抗原为流感病毒的结构蛋白,例如,血凝素(hemagglutinin,HA)、神经氨酸酶(neuraminidase,NA)、离子通道蛋白M2(M2ion channel)、基质蛋白M1(matrix protein)、核蛋白NP(nucleoprotein)等。一些具体的实施方案中,所述流感病毒抗原为流感病毒的HA蛋白,例如A型流感病毒H1N1、A型流感病毒H3N2、B型流感病毒/Victoria(例如,Influenza B/Washington/02/2019),B型流感病毒/Yamagata(例如,Influenza B/Phuket/3073/2013)的HA蛋白。一些实施方案中,所述HA蛋白包含或为如SEQ ID NO:83-86、98-101任一所示的氨基酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%的氨基酸同一性的序列。一些实施方案中,编码所述HA蛋白的DNA序列包含或为如SEQ ID NO:87-90、93-97、102-105任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序列。一些实施方案中,编码所述HA蛋白的RNA序列包含或为如SEQ ID NO:126-129、131-134任一所示的核苷酸序列或与之具有至少80%、85%、90%、95%、96%、97%、98%、99%、100%同一性的核苷酸序 列。
一些实施方案中,可以在同一脂质纳米颗粒中配制两种或更多种不同的RNA(例如,mRNA)。其他实施方案中,两个或更多个不同的RNA可以分别配制在单独的脂质纳米颗粒中,然后可以将脂质纳米颗粒组合并作为单个疫苗组合物(例如,包括编码多个抗原的多个RNA),或者可以单独给药。
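The present disclosure also provides a multivalent/combination vaccine comprising RNA encoding antigens of one or more coronaviruses, or antigens of one or more different organisms. That is, the vaccine of the present disclosure may be a multivalent/combination vaccine that targets one or more antigens of the same strain/species, or one or more antigens of different strains/species.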
递送系统
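The present disclosure also provides a delivery system comprising a delivery vehicle and the nucleic acid construct of any of the foregoing or the RNA molecule of any of the foregoing, wherein the delivery vehicle is a cationic lipid delivery particle. In some embodiments, the particle is a nanoparticle. In some embodiments, the delivery vehicle is a lipid nanoparticle. The RNA molecule of the present disclosure can be delivered into cells and/or in vivo using any type of lipid nanoparticle in the art; illustratively, lipid nanoparticles include, but are not limited to, the lipid particles disclosed in WO2017075531, WO2018081480A1, WO2017049245A2, WO2017099823A1, WO2022245888A1, WO2022150717A1, CN101291653A, CN102119217A, WO2011000107A1 and CN107028886A, each of which is incorporated into the present disclosure by reference in its entirety.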
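Pharmaceutical composition
The present disclosure also provides a pharmaceutical composition comprising the nucleic acid construct, the polynucleotide, the vaccine, the vector and/or the delivery vehicle of any of the foregoing embodiments, together with a pharmaceutically acceptable carrier, diluent or excipient. Specifically, the pharmaceutical composition is a solid preparation, an injection, a topical preparation, a spray, a liquid preparation or a compound preparation.
Product or kit
The present disclosure provides a product or kit comprising the nucleic acid construct, the polynucleotide, the vaccine, the vector, the delivery vehicle and/or the pharmaceutical composition of any of the foregoing embodiments. The kit may be used for related detection or diagnostic purposes.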
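Methods of treating and/or preventing disease, and pharmaceutical use
The present disclosure also provides use of the nucleic acid construct, the polynucleotide, the vaccine, the vector, the delivery vehicle, the pharmaceutical composition and/or the product or kit of any of the foregoing embodiments in the preparation of a medicament for treating and/or preventing a disease, the medicament being for administration to a subject in need thereof in a therapeutically and/or prophylactically effective amount. In some embodiments, the disease includes a viral infectious disease or a respiratory disease associated with viral infection. In some embodiments, the virus is a coronavirus. In some embodiments, the coronavirus is a coronavirus that infects humans, for example SARS-CoV-2 (COVID-19), SARS-CoV, HCoV-229E, HCoV-OC43, HCoV-NL63, HCoV-HKU1 or MERS-CoV. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, the virus is an influenza virus. In some embodiments, the influenza virus is an influenza virus that infects humans, for example influenza A virus or influenza B virus. By way of example, the influenza virus is influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (e.g., Influenza B/Washington/02/2019), influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013), and the like. In some embodiments, the respiratory disease associated with viral infection (e.g., a respiratory disease associated with SARS-CoV-2 infection) includes uncomplicated infection (such as fever, cough, sore throat, headache and rhinitis), pneumonia, acute respiratory infection, severe acute respiratory infection (SARI), hypoxic respiratory failure and acute respiratory distress syndrome, sepsis and septic shock, severe acute respiratory syndrome (SARS), and the like.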
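The present disclosure also provides a method of treating and/or preventing a disease, comprising administering to a subject in need thereof a therapeutically and/or prophylactically effective amount of the nucleic acid construct, the polynucleotide, the vaccine, the vector, the delivery vehicle, the pharmaceutical composition and/or the product or kit of any of the foregoing embodiments. In some embodiments, the disease includes a viral infectious disease or a respiratory disease associated with viral infection. In some embodiments, the virus is a coronavirus. In some embodiments, the coronavirus is a coronavirus that infects humans, for example SARS-CoV-2 (COVID-19), SARS-CoV, HCoV-229E, HCoV-OC43, HCoV-NL63, HCoV-HKU1 or MERS-CoV. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, the virus is an influenza virus. In some embodiments, the influenza virus is an influenza virus that infects humans, for example influenza A virus or influenza B virus. By way of example, the influenza virus is influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (e.g., Influenza B/Washington/02/2019), influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013), and the like. In some embodiments, the respiratory disease associated with viral infection (e.g., a respiratory disease associated with SARS-CoV-2 infection) includes uncomplicated infection (such as fever, cough, sore throat, headache and rhinitis), pneumonia, acute respiratory infection, severe acute respiratory infection (SARI), hypoxic respiratory failure and acute respiratory distress syndrome, sepsis and septic shock, severe acute respiratory syndrome (SARS), and the like.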
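The present disclosure also provides a method comprising administering to a subject in need thereof an effective amount of the nucleic acid construct, the polynucleotide, the vaccine, the vector, the delivery vehicle, the pharmaceutical composition and/or the product or kit of any of the foregoing embodiments, which is capable of inducing a neutralizing antibody response and/or a T cell immune response in the subject, for example a neutralizing antibody response against a viral antigen, or a CD4+ and/or CD8+ T cell immune response against a viral antigen. The anti-antigen antibody titer of the subject is increased after vaccination relative to the anti-antigen antibody titer of a subject vaccinated with a prophylactically effective dose of a traditional vaccine against the same antigen. In some embodiments, the virus is a coronavirus. In some embodiments, the coronavirus is a coronavirus that infects humans, for example SARS-CoV-2 (COVID-19), SARS-CoV, HCoV-229E, HCoV-OC43, HCoV-NL63, HCoV-HKU1 or MERS-CoV. In some embodiments, the coronavirus is SARS-CoV-2. In some embodiments, the virus is an influenza virus. In some embodiments, the influenza virus is an influenza virus that infects humans, for example influenza A virus or influenza B virus. By way of example, the influenza virus is influenza A virus H1N1, influenza A virus H3N2, influenza B virus/Victoria (e.g., Influenza B/Washington/02/2019), influenza B virus/Yamagata (e.g., Influenza B/Phuket/3073/2013), and the like.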
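In some embodiments, the subject is immune. In some embodiments, the subject has a pulmonary disease. In some embodiments, the subject is 5 years of age or younger, or 65 years of age or older.
In some embodiments, the method comprises administering to the subject at least one, two, three, four or more doses of the nucleic acid construct, the polynucleotide, the vaccine, the vector, the delivery vehicle, the pharmaceutical composition, or the product or kit of any of the foregoing embodiments.
In some embodiments, a detectable level of viral antigen (e.g., coronavirus antigen) is produced in the serum of the subject 1-72 hours after administration of the nucleic acid construct, the polynucleotide, the vaccine, the vector, the delivery vehicle, the pharmaceutical composition, or the product or kit of any of the foregoing embodiments. In some embodiments, a neutralizing antibody titer of at least 100 NU/mL, 200 NU/mL, 300 NU/mL, 400 NU/mL, 500 NU/mL, 600 NU/mL, 700 NU/mL, 800 NU/mL, 900 NU/mL or 1000 NU/mL is produced in the serum of the subject 1-72 hours after administration.
In some embodiments, the antibody titer produced in the subject is increased by at least 1 log relative to a control. For example, the antibody titer produced in the subject may be increased by at least 2, 3, 4, 5, 6, 7, 8, 9 or 10 log relative to a control.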
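In some embodiments, the antibody titer produced in the subject is increased at least 2-fold relative to a control. For example, the antibody titer produced in the subject is increased at least 3-, 4-, 5-, 6-, 7-, 8-, 9- or 10-fold relative to a control. In some embodiments, the geometric mean, which is the n-th root of the product of n numbers, is generally used to describe proportional growth. In some embodiments, the geometric mean is used to characterize the antibody titer produced in the subject. In some embodiments, the control may be an unvaccinated subject, or a subject administered a live attenuated virus vaccine, an inactivated virus vaccine or a protein subunit vaccine.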
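Vaccine efficacy
In some embodiments, the antigen-specific immune response is characterized by measuring the titer of antibody against the (corona)virus antigen produced in the subject after administration according to the foregoing embodiments. Antibody titer is a measure of the amount of antibody in a subject, for example antibody specific to a particular antigen or to an epitope of an antigen. Antibody titer is usually expressed as the reciprocal of the greatest dilution that gives a positive result. Enzyme-linked immunosorbent assay (ELISA) is a common assay for determining antibody titer.
In some embodiments, antibody titer is used to assess whether a subject has been infected or to determine whether immunization is needed. In some embodiments, antibody titer is used to determine the strength of an autoimmune response, to determine whether a booster immunization is needed, to determine whether a previous vaccine was effective, and to identify any recent or prior infection. According to the present disclosure, antibody titer may be used to determine the strength of the immune response induced in a subject by an immunizing composition (e.g., an RNA vaccine).
In some embodiments, the titer of antibody against the (corona)virus antigen produced in the subject is increased by at least 1 log relative to a control. For example, the antibody titer produced in the subject may be increased by at least 2, 3, 4, 5, 6, 7, 8, 9 or 10 log relative to a control.
In some embodiments, the antibody titer produced in the subject is increased at least 2-fold relative to a control. For example, the antibody titer produced in the subject is increased at least 3-, 4-, 5-, 6-, 7-, 8-, 9- or 10-fold relative to a control.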
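In some embodiments, the antigen-specific immune response is measured as a ratio of geometric mean titers (GMT) of serum neutralizing antibody against the coronavirus, referred to as the geometric mean ratio (GMR). The geometric mean titer (GMT) is the average antibody titer of a group of subjects, calculated by multiplying all values together and taking the n-th root of the product, where n is the number of subjects with available data.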
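The titer calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example titers are hypothetical and do not come from the present disclosure, and the GMR is taken here to be the ratio of the GMT of a test group to the GMT of a control group.

```python
import math

def geometric_mean_titer(titers):
    """Geometric mean of n positive titers, i.e. the n-th root of their product."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty sequence of positive numbers")
    # Averaging the logarithms is equivalent to taking the n-th root of the
    # product, and avoids overflow when many large titers are multiplied.
    return math.exp(sum(math.log(t) for t in titers) / len(titers))

def geometric_mean_ratio(test_titers, control_titers):
    """Ratio of the GMT of the test group to the GMT of the control group (assumed definition)."""
    return geometric_mean_titer(test_titers) / geometric_mean_titer(control_titers)

def log10_increase(test_titers, control_titers):
    """Increase over control expressed in log10 units (1 log = 10-fold)."""
    return math.log10(geometric_mean_ratio(test_titers, control_titers))

# Hypothetical titers, for illustration only.
vaccinated = [100, 1000, 10000]
control = [10, 10, 10]
print(geometric_mean_titer(vaccinated))           # GMT of the vaccinated group
print(geometric_mean_ratio(vaccinated, control))  # fold increase over control
print(log10_increase(vaccinated, control))        # the same increase in log units
```

Working in log space also makes the relation between a fold increase and a log increase explicit: a 100-fold higher GMT corresponds to a 2 log increase.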
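In some embodiments, the control may be an unvaccinated subject, or a subject administered a live attenuated virus vaccine, an inactivated virus vaccine or a protein subunit vaccine.
In some embodiments, vaccine efficacy may be assessed using standard analyses (see, e.g., Weinberg et al., J Infect Dis. 2010 Jun 1;201(11):1607-10). For example, vaccine efficacy may be measured by double-blind, randomized, controlled clinical trials. Vaccine efficacy may be expressed as the proportionate reduction in disease attack rate (AR) between the unvaccinated (ARU) and vaccinated (ARV) study cohorts, and can be calculated from the relative risk (RR) of disease in the vaccinated group using the following formulas:
Efficacy = (ARU - ARV)/ARU × 100; and
Efficacy = (1 - RR) × 100
In other embodiments, vaccine effectiveness may be assessed using standard analyses (see, e.g., Weinberg et al., J Infect Dis. 2010 Jun 1;201(11):1607-10). Vaccine effectiveness is an assessment of how a vaccine (which may already have been shown to have high vaccine efficacy) reduces disease in a population. This measure can assess the net balance of the benefits and adverse effects of a vaccination program, not just of the vaccine itself, under natural conditions rather than in a controlled clinical trial. Vaccine effectiveness is proportional to vaccine efficacy, but is also affected by how well the target groups in the population are immunized and by other non-vaccine-related factors such as hospitalizations, outpatient visits or costs. For example, a retrospective case-control analysis may be used, in which the vaccination rates in a set of infected cases and in appropriate controls are compared. Vaccine effectiveness may be expressed as a rate difference, using the odds ratio (OR) for developing infection despite vaccination:
Effectiveness = (1 - OR) × 100.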
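As a worked illustration of the efficacy and effectiveness formulas above, the following Python sketch evaluates them for made-up cohort counts. The function names and all numbers are hypothetical; the odds ratio is computed in the usual case-control form, which is an assumption, since the text does not spell out how OR is obtained.

```python
def attack_rate(cases, cohort_size):
    """Disease attack rate (AR) in a cohort."""
    return cases / cohort_size

def efficacy_from_attack_rates(aru, arv):
    """Efficacy = (ARU - ARV) / ARU x 100."""
    return (aru - arv) / aru * 100

def efficacy_from_relative_risk(rr):
    """Efficacy = (1 - RR) x 100, where RR = ARV / ARU."""
    return (1 - rr) * 100

def odds_ratio(vacc_cases, unvacc_cases, vacc_controls, unvacc_controls):
    """Odds of vaccination among cases divided by odds of vaccination among controls."""
    return (vacc_cases / unvacc_cases) / (vacc_controls / unvacc_controls)

def effectiveness_from_odds_ratio(or_value):
    """Effectiveness = (1 - OR) x 100."""
    return (1 - or_value) * 100

# Hypothetical trial: 100 cases among 10,000 unvaccinated, 10 among 10,000 vaccinated.
aru = attack_rate(100, 10_000)
arv = attack_rate(10, 10_000)
print(efficacy_from_attack_rates(aru, arv))    # approximately 90
print(efficacy_from_relative_risk(arv / aru))  # same value via the relative risk

# Hypothetical case-control study: 20 of 100 cases and 50 of 100 controls were vaccinated.
print(effectiveness_from_odds_ratio(odds_ratio(20, 80, 50, 50)))  # approximately 75
```

The two efficacy formulas are algebraically identical, since RR = ARV/ARU; the sketch keeps both only to mirror the text.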
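Brief description of the drawings
Figure 1: Schematic of the plasmid template containing the T7 promoter, 5'UTR, 3'UTR, polyA tail and the CDS of the exogenous gene.
Figure 2: Chip electrophoresis of mRNA synthesized by in vitro transcription from a linearized plasmid template. L is the RNA molecular weight marker (RNA ladder); 1 is the purified mRNA sample.
Figure 3: Differences in SARS-CoV-2 B.1.351 spike protein expression levels resulting from different 5'UTR and 3'UTR combinations, detected by ELISA. The abbreviations of the 5'UTRs and 3'UTRs are given in Table 3.
Figure 4: Differences in SARS-CoV-2 B.1.1.7 spike protein expression levels resulting from different 5'UTR and 3'UTR combinations, detected by ELISA. The abbreviations of the 5'UTRs and 3'UTRs are given in Table 3.
Figure 5: Schematic of the point mutation converting 5'UTR I into 5'UTR I’.
Figure 6: Differences in SARS-CoV-2 B.1.617.2 spike protein expression levels before and after the point mutation of 5'UTR I, detected by Western blot. Samples 1 and 3 use 5'UTR I, and samples 2 and 4 use 5'UTR I’; the mRNA synthesis templates of samples 1 and 2 come from plasmid minipreps, and those of samples 3 and 4 from plasmid maxipreps; sample 5 is the supernatant of untransfected cells (control); M is the molecular weight marker. Figure 6A is the Western blot, and Figure 6B shows the sample ratios after quantification of the blot.
Figure 7: Effect of 5'UTR I’ on the expression levels of different heterologous genes, detected by Western blot. Figure 7A is the Western blot, and Figure 7B shows the relative sample values after quantification of the blot.
Figure 8: Luciferase expression mediated by 5'UTR I’-3'UTR B and 5'UTR I-3'UTR B, compared with the BNT162b2 UTR combination.
Figure 9: Luciferase expression regulated by 5'UTR I’ truncations and 5'UTR I’ mutants.
Figure 10: Differences in HA mRNA expression levels under the control of 5'UTR I’ versus the control 5'UTR.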
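Detailed description
Terms
To make the present disclosure easier to understand, certain technical and scientific terms are specifically defined below. Unless otherwise expressly defined herein, all other technical and scientific terms used herein have the meanings commonly understood by one of ordinary skill in the art to which the present disclosure belongs.
The official taxonomic name of the “2019 novel coronavirus (2019-nCoV)” is severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
The official name of the “disease caused by the 2019 novel coronavirus (2019-nCoV)” is COVID-19.
“Nucleic acid” or “nucleotide” includes RNA, DNA and cDNA molecules. It will be understood that, owing to the degeneracy of the genetic code, a large number of nucleotide sequences encoding a given protein can be produced. The term nucleic acid may be used interchangeably with the term “polynucleotide”. An “oligonucleotide” is a short-chain nucleic acid molecule. A “primer” is an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions that induce the synthesis of a primer extension product complementary to a nucleic acid strand (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase, and at a suitable temperature and pH). To maximize amplification efficiency, the primer is preferably single-stranded, but may alternatively be double-stranded. If double-stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is a deoxyribonucleotide. The primer must be long enough to prime the synthesis of extension products in the presence of the inducing agent. The exact length of the primer will depend on many factors, including temperature, the source of the primer and the method used.
“Vector” or “expression vector” refers to a replicon, such as a plasmid, bacmid, phage, virus, virion or cosmid, to which another DNA segment (an “insert”) can be attached so as to bring about replication of the attached segment in a cell. A vector may be a nucleic acid construct designed for delivery to a host cell or for transfer between different host cells. As used herein, a vector may be viral or non-viral in origin and/or in final form, such as the pUC57 DNA vector used herein. The term “vector” encompasses any genetic element that is capable of replication when associated with appropriate control elements and that can transfer gene sequences to cells. In some embodiments, the vector may be an expression vector or a recombinant vector.
“Promoter” refers to any nucleic acid sequence that regulates the expression of another nucleic acid sequence by driving its transcription; the latter may be a heterologous target gene encoding a protein or an RNA. A promoter may be constitutive, inducible, repressible, tissue-specific, or any combination thereof. A promoter is a control region of a nucleic acid sequence at which the initiation and rate of transcription of the remainder of the nucleic acid sequence are controlled.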
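“Gene” refers to a DNA segment involved in producing a polypeptide chain; it may or may not include regions preceding and following the coding region, for example 5' untranslated (5'UTR) or “leader” sequences and 3'UTR or “trailer” sequences, as well as intervening sequences (introns) between individual coding segments (exons).
“Recombinant” means that a polynucleotide is the product of various combinations of cloning, restriction or ligation steps, and of other steps that result in a construct that is distinct and/or different from a polynucleotide found in nature.
“Introducing”, in the context of inserting a nucleic acid sequence into a cell, means “transfection”, “transformation” or “transduction” and includes reference to the incorporation of a nucleic acid sequence into a eukaryotic or prokaryotic cell, where the nucleic acid sequence may be incorporated into the genome of the cell (e.g., chromosome, plasmid, plastid or mitochondrial DNA), converted into an autonomous replicon, or transiently expressed (e.g., transfected mRNA).
“Nucleic acid construct” refers to a single- or double-stranded nucleic acid molecule, such as a DNA fragment, that has been modified or synthesized to contain nucleic acid segments in a manner that would not otherwise exist in nature, the nucleic acid molecule comprising one or more control sequences or regulatory elements. In the context of the present disclosure, a nucleic acid construct contains a recombinant nucleotide sequence consisting essentially of, optionally, one, two, three or more isolated nucleotide sequences, including a 5'UTR, an open reading frame (ORF) and a 3'UTR. In embodiments involving a construct comprising two or more sequences, the sequences are operably linked to each other in the construct.
“Derived sequence” refers to a nucleotide sequence that is highly homologous to a UTR sequence of the present disclosure (e.g., having at least 80%, 85%, 88%, 90%, 93%, 95%, 96%, 97%, 98%, 99% or 100% identity to a UTR sequence of the present disclosure) and still retains the same or a similar functional activity as that UTR sequence. In some embodiments, the derived sequence is a nucleotide sequence obtained from a natural UTR sequence by substitution, deletion or addition of one or more nucleotides. In some embodiments, the derived sequence is a nucleotide sequence obtained by truncating a natural UTR sequence. In some embodiments, the derived sequence is obtained from a natural UTR sequence by point-mutating an ATG to GTG, CTG or TTG, in order to suppress translation initiation from the ATG within the UTR.
“Operably linked” is defined herein as a configuration in which a control sequence, i.e., a promoter sequence and/or a 5'UTR sequence, is appropriately placed at a position relative to a coding DNA sequence such that the control sequence directs transcription of the coding sequence and translation of the mRNA into the polypeptide sequence encoded by the coding DNA.
“Open reading frame”, abbreviated “ORF”, refers to a segment or region of an mRNA molecule that encodes a polypeptide. The ORF comprises consecutive, non-overlapping, in-frame codons, beginning with a start codon and ending with a stop codon, and is translated by the ribosome.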
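The ORF definition above (in-frame, non-overlapping codons running from a start codon to a stop codon) can be illustrated with a short Python sketch. It is a simplified reading of the definition, assuming the standard AUG start codon and the three standard stop codons; the function name and the toy sequence are hypothetical and unrelated to any SEQ ID NO of the present disclosure.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def find_first_orf(mrna):
    """Return the first start-to-stop ORF in an mRNA sequence, or None if there is none.

    The sequence is scanned 5' to 3' for an AUG; from there, codons are read
    in frame until a stop codon is met. If no in-frame stop follows a given
    AUG, the search moves on to the next AUG.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    start = seq.find("AUG")
    while start != -1:
        for i in range(start, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        start = seq.find("AUG", start + 1)
    return None

# Toy sequence for illustration only.
print(find_first_orf("GGAUGGCCUAAGG"))  # AUGGCCUAA
```

The same scan shows why an AUG inside a 5'UTR matters: a ribosome that initiates there reads a different ORF from the one intended in the CDS.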
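“Endogenous” refers to any substance originating from or produced inside an organism, cell, tissue or system.
“Exogenous” refers to any substance introduced from or produced outside an organism, cell, tissue or system.
“Sequence having identity” or “sequence identity” refers to sequence identity between genes at the nucleotide level or between proteins at the amino acid level, and is a measure of identity between proteins at the amino acid level and between nucleic acids at the nucleotide level. Protein sequence identity can be determined by comparing the amino acids at a given position in each sequence when the sequences are aligned. Similarly, nucleic acid sequence identity can be determined by comparing the nucleotides at a given position in each sequence when the sequences are aligned. Methods for aligning sequences for comparison are well known in the art and include GAP, BESTFIT, BLAST, FASTA and TFASTA. The BLAST algorithm calculates percent sequence identity and performs a statistical analysis of the similarity between two sequences. Software for performing BLAST analysis is publicly available through the website of the National Center for Biotechnology Information (NCBI).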
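The position-by-position comparison described above can be sketched as follows. This is not BLAST and does not perform the alignment itself; it assumes two sequences that have already been aligned (with "-" marking gaps) and counts identical columns. The function name and the convention of dividing by the number of aligned columns are assumptions made for illustration, since different tools define the denominator differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # shared gap column carries no information
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Toy alignments for illustration only.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))  # 90.0
print(percent_identity("ACGT-", "ACGTA"))            # 80.0
```

Under this convention, a threshold such as "at least 80% identity" is simply a cut-off applied to the returned value.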
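“Homology” or “homologous” is defined as the percentage of nucleotide residues that are identical to the nucleotide residues in the corresponding sequence on the target chromosome, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. Alignment for the purpose of determining percent nucleotide sequence homology can be achieved in various ways that are within the skill in the art, for example using publicly available computer software such as BLAST, BLAST-2, ALIGN, ClustalW2 or Megalign (DNASTAR). In some specific embodiments, the present disclosure calculates percent sequence homology based on BLAST. Those skilled in the art can determine appropriate parameters for aligning sequences, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared. In some embodiments, a nucleic acid sequence (e.g., a DNA sequence), for example of a homology arm, is considered “homologous” when it is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or more identical to the corresponding native or unedited nucleic acid sequence (e.g., genomic sequence) of the host cell.
“Substitution” is defined as a change in an amino acid or nucleotide sequence in which, compared with the amino acid or nucleotide sequence of a reference polypeptide, one or more amino acids or nucleotides are replaced by different amino acids or nucleotides, respectively. If a substitution is conservative, the amino acid that is substituted into the polypeptide has structural or chemical properties (e.g., charge, polarity, hydrophobicity, etc.) similar to those of the amino acid it replaces. In some embodiments, a polypeptide variant may have “non-conservative” changes, in which the substituted amino acid differs in structural and/or chemical properties.
“Deletion” is defined as a change in an amino acid or nucleotide sequence in which, compared with the amino acid or nucleotide sequence of a reference polypeptide, one or more amino acid or nucleotide residues, respectively, are absent. In the case of a polypeptide or polynucleotide sequence, a deletion may involve the deletion of 2, 5, 10, up to 20, up to 30, or up to 50 or more amino acid or nucleotide residues, taking into account the length of the polypeptide or polynucleotide sequence being modified.
“Insertion” or “addition” refers to a change in an amino acid or nucleotide sequence that results in the addition of one or more amino acid or nucleotide residues, respectively, compared with the amino acid or nucleotide sequence of a reference polypeptide. “Insertion” generally refers to the addition of one or more amino acid residues within the amino acid sequence of a polypeptide (or nucleotide residues within a polynucleotide), whereas “addition” may be an insertion or may refer to amino acid residues added at the N- or C-terminus of a polypeptide (or nucleotide residues added at the 5' or 3' end of a polynucleotide). In the case of a polypeptide or polynucleotide sequence, an insertion or addition may be of up to 10, up to 20, up to 30, or up to 50 or more amino acids (or nucleotide residues).
“Codon optimization” refers to replacing codons present in a target sequence that are generally rare in the highly expressed genes of a given species with codons that are generally common in the highly expressed genes of that species, the codons before and after replacement encoding the same amino acid. Various species exhibit particular preferences for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which in turn is believed to depend, inter alia, on the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism on the basis of codon optimization. The choice of optimal codons therefore depends on the codon usage bias of the host genome.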
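“Anti-antigen antibody” refers to a serum antibody that specifically binds to an antigen.
“Cell” or “host cell” includes any cell type that is susceptible to transformation, transfection, transduction and the like with a nucleic acid construct or vector of the present disclosure. As non-limiting examples, the host cell may be an isolated primary cell, a pluripotent stem cell, a CD34+ cell, an induced pluripotent stem cell, or any of a number of immortalized cell lines (e.g., HepG2 cells). Alternatively, the host cell may be an in situ or in vivo cell in a tissue, organ or organism.
“Treating” means administering a therapeutic agent, internally or externally, such as a composition comprising any of the nucleic acid constructs of the present disclosure, to a patient having one or more disease symptoms for which the agent is known to have therapeutic activity. Typically, the therapeutic agent is administered to the treated patient or population in an amount effective to alleviate one or more disease symptoms, whether by inducing the regression of such symptoms or by inhibiting their progression to any clinically measurable degree. The amount of a therapeutic agent that is effective to alleviate any particular disease symptom (also referred to as the “therapeutically effective amount”) may vary according to factors such as the disease state, age and weight of the patient, and the ability of the drug to elicit the desired response in the patient. Whether a disease symptom has been alleviated can be assessed by any clinical measurement typically used by physicians or other healthcare professionals to assess the severity or progression of that symptom.
“Effective amount” or “pharmaceutically effective amount” encompasses an amount sufficient to ameliorate or prevent a symptom or condition of a medical disease. An effective amount also means an amount sufficient to allow or facilitate diagnosis. The effective amount for a particular patient or veterinary subject may vary depending on factors such as the condition being treated, the overall health of the patient, the route and dose of administration, and the severity of side effects. An effective amount can be the maximal dose or dosing regimen that avoids significant side effects or toxic effects. In some embodiments, an “effective amount” is a dose of RNA that is effective to produce an antigen-specific immune response.
“Prophylactically effective dose” refers to a dose effective to prevent viral infection at a clinically acceptable level. In some embodiments, the effective dose is the dose listed in the package insert of the vaccine. A traditional vaccine, as used herein, refers to a vaccine other than the mRNA vaccine of the present disclosure. For example, traditional vaccines include, but are not limited to, live microorganism vaccines, killed microorganism vaccines, subunit vaccines, protein antigen vaccines, DNA vaccines, virus-like particle (VLP) vaccines, and the like. In exemplary embodiments, a traditional vaccine is a vaccine that has obtained regulatory approval and/or has been registered by a national drug regulatory agency, for example the China Food and Drug Administration.
“Pharmaceutically acceptable” refers to those therapeutic agents, materials, compositions and/or dosage forms that are, within the scope of sound medical judgment, suitable for contact with the tissues of patients without excessive toxicity, irritation, allergic response or other problems or complications, that have a reasonable benefit/risk ratio, and that are effective for their intended use.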
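“Monovalent vaccine” refers to a vaccine preparation for immunization made from one serotype antigen of one pathogenic organism; for example, a vaccine containing only one antigen of one SARS-CoV-2 strain of the present disclosure is a monovalent vaccine.
“Multivalent vaccine” refers to a vaccine preparation for immunization made from multiple serotype antigens of the same pathogenic organism. For example, a vaccine made by mixing an antigen of one SARS-CoV-2 variant with one or more antigens of other, different subtypes of SARS-CoV-2 variants is a multivalent vaccine.
“Combination vaccine” refers to a vaccine preparation for immunization made from multiple serotype antigens of different pathogenic organisms. For example, a vaccine made by mixing an antigen of a SARS-CoV-2 strain of the present disclosure with antigens of several pathogenic microorganisms, for preventing and/or treating other viruses, is a combination vaccine. A combination vaccine is generally made either by preparing monovalent vaccines in the same manner and then mixing them in a single injection for immunization, or by mixing the antigens of several pathogenic microorganisms and then preparing the vaccine product. In some specific embodiments of the present disclosure, the antigens of the pathogenic microorganisms are mixed first and then prepared into a vaccine product. Combination and multivalent vaccines are a trend for future development, with the following advantages: (1) lower vaccination costs; (2) savings on the separate packaging, logistics and assembly of multiple individual vaccines; (3) less harm to recipients, especially infants; (4) increased vaccination coverage; (5) less time spent on vaccination; (6) less vaccine storage space, and so on. Combination or multivalent vaccines can save direct and indirect costs, including those of multiple clinic visits and missed vaccinations; multivalent and combination vaccines in common use today include pneumococcal, meningococcal, poliovirus, rotavirus, influenza and HPV vaccines.
An “immune response” to a vaccine of the present disclosure refers to the development in a subject of a humoral and/or cellular immune response to (one or more) viral proteins (e.g., of a coronavirus) present in the vaccine. For the purposes of the present disclosure, a “humoral” immune response refers to an immune response mediated by antibody molecules, including, for example, secretory IgA or IgG molecules, while a “cellular” immune response is one mediated by T lymphocytes (e.g., CD4+ helper and/or CD8+ T cells (e.g., CTLs)) and/or other white blood cells. One important aspect of cellular immunity involves an antigen-specific response by cytolytic T cells (CTLs). CTLs have specificity for peptide antigens that are presented in association with proteins encoded by the major histocompatibility complex (MHC) and expressed on the surfaces of cells. CTLs help induce and promote the destruction of intracellular microbes, or the lysis of cells infected with such microbes. Another aspect of cellular immunity involves an antigen-specific response by helper T cells. Helper T cells act to help stimulate the function, and focus the activity, of nonspecific effector cells against cells displaying peptide antigens in association with MHC molecules. A cellular immune response also leads to the production of cytokines, chemokines and other such molecules by activated T cells and/or other white blood cells.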
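Examples
The present disclosure is further described below with reference to examples, which are not intended to limit its scope. Experimental methods for which specific conditions are not stated in the examples generally follow conventional conditions, such as those in cell culture manuals and molecular cloning manuals, or the conditions recommended by the manufacturer of the raw material or commercial product. Reagents whose source is not specified are conventional, commercially available reagents.
The following routine experimental procedures 1) to 4) are given by way of example:
1) mRNA production
To produce in vitro transcribed mRNA, the plasmid was linearized downstream of the polyadenylate tail using BspQ I (Vazyme, DD4302-PC, China) and purified with a PCR purification kit (QIAGEN, 28106, Germany). Using the purified linearized plasmid as template, in vitro transcription was carried out with a T7 in vitro transcription kit (Invitrogen, AM1333, USA). The mRNA synthesized by in vitro transcription was purified with the MEGAclear™ Kit (Invitrogen, AM1908, USA) to obtain mRNA of relatively high purity for subsequent experiments such as in vitro cell transfection.
2) Chip electrophoresis
To check the integrity and purity of the mRNA obtained by in vitro transcription, the mRNA was analyzed using a chip electrophoresis instrument (Agilent 2100, USA) and the RNA 6000 Nano kit (Agilent, 5067-1511, USA). The procedure was as follows: the mRNA to be tested was denatured (incubated at 70°C for 2 min, then immediately placed on ice), the treated samples were loaded one by one into the corresponding wells of the chip, and the instrument was run to produce the analysis results.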
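3) Cell culture
Materials: FBS (Gibco), DMEM medium (Gibco), Opti-MEM (Gibco), PBS (Gibco), trypsin-EDTA (Gibco), penicillin/streptomycin (Pen/Strep, Gibco) and Lipofectamine™ 2000 (Invitrogen, 52887).
Human embryonic kidney cells (293T, ATCC) were grown in DMEM medium supplemented with 10% FBS and 1% Pen/Strep in a humidified atmosphere with 5% CO2.
4) In vitro transfection
Eight hours before transfection, cells were seeded in 12-well cell culture plates at 8×10^5 cells/well. Cells were transfected with the commercially available transfection reagent Lipofectamine™ 2000 at a ratio of 2.5 μL of Lipofectamine™ 2000 per 1 μg of mRNA, with an mRNA dose of 2 μg per well. The procedure was as follows: Lipofectamine™ 2000 and mRNA were each diluted in Opti-MEM medium to a final volume of 100 μL each; the Lipofectamine™ 2000 solution and the mRNA solution were then mixed and incubated at room temperature for 10-15 minutes to form mRNA-containing lipid complexes. After the DMEM medium was removed from the wells, the lipid complexes containing the different mRNAs were added to the corresponding wells, 400 μL of Opti-MEM medium containing 10% FBS was then added to each well, and the cells were incubated at 37°C (5% CO2) for 4-6 hours. The transfection medium was then removed, 1 mL of DMEM medium supplemented with 10% FBS and 1% Pen/Strep was added to each well, and the cells were incubated at 37°C (5% CO2) for 48 hours. In parallel, some cells were transfected in the same manner with Lipofectamine™ 2000 solution containing no mRNA, as a negative control.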
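The per-well volumes implied by the transfection procedure above can be worked out with a small Python helper. The ratio (2.5 μL of reagent per μg of mRNA), the dose (2 μg per well), the 100 μL dilutions and the 400 μL top-up are taken from the text; the mRNA stock concentration is not stated there, so the value used below (1 μg/μL) is purely an assumption, as are the function and key names.

```python
def transfection_mix(mrna_ug=2.0, reagent_ul_per_ug=2.5, diluted_volume_ul=100.0,
                     mrna_stock_ug_per_ul=1.0, top_up_medium_ul=400.0):
    """Volumes for one well; mrna_stock_ug_per_ul is a hypothetical stock concentration."""
    reagent_ul = mrna_ug * reagent_ul_per_ug      # transfection reagent needed
    mrna_ul = mrna_ug / mrna_stock_ug_per_ul      # volume of mRNA stock needed
    if reagent_ul > diluted_volume_ul or mrna_ul > diluted_volume_ul:
        raise ValueError("component volume exceeds the dilution volume")
    complex_ul = 2 * diluted_volume_ul            # the two dilutions are combined
    return {
        "reagent_ul": reagent_ul,
        "opti_mem_for_reagent_ul": diluted_volume_ul - reagent_ul,
        "mrna_ul": mrna_ul,
        "opti_mem_for_mrna_ul": diluted_volume_ul - mrna_ul,
        "complex_ul": complex_ul,
        "well_volume_ul": complex_ul + top_up_medium_ul,
    }

for name, volume in transfection_mix().items():
    print(f"{name}: {volume} uL")
```

With the defaults this gives 5 μL of reagent in 95 μL of Opti-MEM and a 600 μL well volume during the 4-6 hour incubation; scaling to several wells is a matter of multiplying each entry.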
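The target gene sequences used in the present disclosure are summarized in Table 1.
Table 1. Sequence information of the target genes of the present disclosure
Mutating the amino acids KV of the SARS-CoV-2 spike protein to PP stabilizes the spike protein in the prefusion conformation, which is otherwise unstable and prone to convert to the post-fusion conformation (Structure-based design of prefusion-stabilized SARS-CoV-2 spikes, Science, Vol. 369, No. 6510). Keeping the spike protein in a conformationally stable state is advantageous for its use as an immunogen in vaccine design and production.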
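Example 1. Screening and preparation of 5'UTR sequences
1. Screening of UTRs
To obtain regulatory elements capable of driving high expression of a target protein in dendritic cells (DCs), this example obtained UTRs of novel structure through four rounds of screening.
The method comprised the following steps. First, a library of genes with high protein expression was built, including genes with high protein expression in monocytes, genes with higher protein expression in DCs than in monocytes, genes with high protein levels, and genes whose UTRs had been tested and shown to increase protein expression, giving 2140 genes. Second, a UTR library was built: all sequences in the library of genes with high protein-level expression were downloaded from the GenBank database (https://www.ncbi.nlm.nih.gov/genbank/), checked for completeness, and then analyzed by methods such as merging duplicates and excising the CDS (coding sequence) to obtain UTR data, giving 940 UTRs. Third, a curated UTR library was built: the UTR library was analyzed to select UTR sequences of suitable length and free energy, with a 5'UTR length of 40 to 70 and a 3'UTR length of 100 to 150 or 300 to 400; the free energy of each sequence was predicted, 5'UTR sequences with higher free energy and 3'UTR sequences with lower free energy were selected, and 64 UTRs were obtained for the curated library. Fourth, the final candidate sequences were determined: the screening strategy required a 5'UTR free energy of -10 or above and a very low 3'UTR free energy, and 20 5'UTRs and 3'UTRs were finally selected for screening and validation in cells.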
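The third and fourth screening steps (length window plus free-energy cut-off) can be expressed as a simple filter, sketched below in Python. The length windows and the -10 threshold for 5'UTRs come from the text; the 3'UTR threshold is only described there as "very low", so the default used here (-30) is a hypothetical placeholder, as are the class and function names and the toy entries. The free-energy values are assumed to have been predicted beforehand with an RNA folding tool of the reader's choice.

```python
from dataclasses import dataclass

@dataclass
class UtrCandidate:
    name: str
    kind: str           # "5UTR" or "3UTR"
    sequence: str
    free_energy: float  # predicted folding free energy; the text states no unit

def length_ok(utr):
    """Length windows from the text: 40-70 for 5'UTRs; 100-150 or 300-400 for 3'UTRs."""
    n = len(utr.sequence)
    if utr.kind == "5UTR":
        return 40 <= n <= 70
    return 100 <= n <= 150 or 300 <= n <= 400

def energy_ok(utr, min_5utr_energy=-10.0, max_3utr_energy=-30.0):
    """5'UTRs need a free energy of -10 or above; the 3'UTR cut-off is a placeholder."""
    if utr.kind == "5UTR":
        return utr.free_energy >= min_5utr_energy
    return utr.free_energy <= max_3utr_energy

def shortlist(utrs, **thresholds):
    """Keep only the candidates that pass both the length and the free-energy filter."""
    return [u for u in utrs if length_ok(u) and energy_ok(u, **thresholds)]

# Toy library for illustration only; the sequences are not real UTRs.
library = [
    UtrCandidate("utr_a", "5UTR", "A" * 50, -5.0),
    UtrCandidate("utr_b", "5UTR", "A" * 50, -20.0),   # too stable for a 5'UTR
    UtrCandidate("utr_c", "5UTR", "A" * 80, -5.0),    # too long
    UtrCandidate("utr_d", "3UTR", "A" * 120, -40.0),
    UtrCandidate("utr_e", "3UTR", "A" * 200, -40.0),  # outside both length windows
]
print([u.name for u in shortlist(library)])
```

Keeping the thresholds as parameters makes it easy to see how the size of the shortlist responds to a stricter or looser cut-off.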
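Based on the above, we consider that the 5'UTR sequence derived from human Rho GTPase activating protein 15 (ARHGAP15), designated 5'UTR I (DNA sequence shown in SEQ ID NO: 1; RNA sequence shown in SEQ ID NO: 45), has the potential to drive efficient expression of a target protein in human cells (particularly DCs).
2. Preparation of UTRs
This example illustrates the preparation of mRNA containing 5'UTR I. First, a sequence containing the 5'UTR, the 3'UTR, the polyA tail and the CDS of the exogenous target gene was synthesized and inserted into the pUC57 vector to prepare the template vector, a schematic of which is shown in Figure 1. Sequencing confirmed that the target gene sequence inserted in the vector was correct.
As shown for the pUC57 Delta13 vector in Figure 1, the 5'UTR is SEQ ID NO: 1; the 3'UTR is the 3'UTR sequence of the human hemoglobin β subunit (HBB) (SEQ ID NO: 7, designated 3'UTR B); the CDS is the codon-optimized sequence encoding the B.1.617.2 (Delta) spike protein (SEQ ID NO: 13); the polyA tail is a polyadenylate sequence (DNA sequence shown in SEQ ID NO: 16; RNA sequence shown in SEQ ID NO: 56); and the T7 promoter is SEQ ID NO: 17.
To produce in vitro transcribed mRNA, the plasmid was linearized downstream of the polyadenylate tail using BspQ I (Vazyme, DD4302-PC, China) and purified with a PCR purification kit (QIAGEN, 28106, Germany). Using the purified linearized plasmid as template, mRNA was synthesized by in vitro transcription as follows. A 100 μL mRNA reaction was set up as shown in Table 2, containing 2 μg of linearized template and made up to 100 μL with nuclease-free water; the reaction was incubated at 37°C for 4 hours and then digested with DNase I for 30 min. On testing, the synthesized mRNA contained little dsRNA by-product and the yield was good. Capping was achieved co-transcriptionally by chemical capping during the in vitro transcription reaction, the cap structure being m7G(5')ppp(5')(2'OMeA)pG·NH4 (Hongene, ON-134). Purification with the MEGAclear™ Kit (Invitrogen, AM1908, USA) gave mRNA of relatively high purity for subsequent experiments such as in vitro cell transfection.
Table 2. mRNA reaction system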

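Through successive rounds of screening and functional validation, the present disclosure obtained 5'UTR I, whose sequence is SEQ ID NO: 1 and which has the potential to drive efficient expression of a target protein; constructs containing 5'UTR I were efficiently transcribed into the corresponding mRNA, whose size as determined by chip electrophoresis agreed with the theoretical size (see Figure 2).
Example 2. Screening combinations of different 5'UTRs and 3'UTRs for efficient expression of a heterologous protein
In this example, combinations of different 5'UTRs and different 3'UTRs were screened to obtain UTR combinations capable of driving efficient expression of a target gene.
The method was as follows: plasmids containing the CDS of the same target gene with different UTR combinations were constructed by synthesis. A base plasmid was constructed by the method of Example 1, the target gene CDS was replaced by recombinant construction using synthesis and HindIII/XhoI digestion, and the reconstructed plasmids were linearized and used to synthesize and purify mRNA as in Example 1. The mRNA was transfected into 293T cells using Lipofectamine™ 2000; 48 h later, the 293T supernatant was collected and subjected to Western blot and ELISA to determine the level of spike protein secreted into the cell supernatant.
The target genes used were the gene encoding the SARS-CoV-2 B.1.351 spike protein (CDS shown in SEQ ID NO: 20) or the gene encoding the SARS-CoV-2 B.1.1.7 spike protein (CDS shown in SEQ ID NO: 22). Information on the different 5'UTRs and 3'UTRs used and their combinations is given in Tables 3 and 4.
Western blot method: to determine the in vitro expression level of the mRNA obtained from each construct, the cell supernatant in the wells was collected after transfection and subjected to polyacrylamide gel electrophoresis, membrane transfer, incubation with primary and secondary antibodies, and color development, giving the in vitro expression level of the mRNA of each construct. The primary antibody was a SARS-CoV-2 (2019-nCoV) spike RBD antibody (Sino Biological, 40592-T62), and the secondary antibody was HRP-linked anti-rabbit IgG (Transgene, HS101).
ELISA method: after transfection, the corresponding cell supernatants were collected and diluted 1:50 to 1:100, and the in vitro expression level of the mRNA of each construct was then determined using an ELISA kit (SARS-CoV-2 (2019-nCoV) spike detection ELISA kit, KIT40591, Sino Biological); the absorbance at 450 nm was read and the concentration of the relevant antigen was calculated.
Table 3. Sources and sequences of the 5'UTRs and 3'UTRs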

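Table 4. Combinations of different 5'UTRs and 3'UTRs
In Table 4, 5'UTRs A, G, I and J are shown in SEQ ID NO: 4, 3, 1 and 5, respectively; 3'UTRs B, E, D and F are shown in SEQ ID NO: 7, 9, 10 and 8, respectively.
The effects of the different 5'UTR and 3'UTR combinations on expression of the SARS-CoV-2 B.1.351 spike protein and the SARS-CoV-2 B.1.1.7 spike protein are shown in Figure 3 and Table 5, and in Figure 4 and Table 6, respectively.
Table 5. Regulation of B.1.351 spike protein expression by different UTR combinations (ELISA)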

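(Sample numbering convention: for example, #1: A-B denotes sample #1, which carries the A-B UTR combination.)
Table 6. Regulation of B.1.1.7 spike protein expression by different UTR combinations (ELISA)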


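(Sample numbering convention: for example, UK-#11: I-E denotes sample UK-#11, which carries the I-E UTR combination.)
Figure 3 and Table 5 show that the B.1.351 spike protein expression levels mediated by the different UTR combinations compare as follows: I-B > A-B > G-B, and I-B > I-E. Among the 5'UTRs, I is superior to A and G; among the combinations containing I, I-B is superior to I-E. Figure 4 and Table 6 show that the B.1.1.7 spike protein expression levels mediated by the different UTR combinations compare as follows: I-F > I-B > I-D > I-E. Taking the B.1.351 and B.1.1.7 spike protein results together, the I-B UTR combination has the more marked advantage and can mediate efficient expression of the target gene; the I-F combination also showed an advantage in high-level expression of the target gene.
We further prepared B.1.617.2 spike mRNA under the control of I-B or I-F; after cell transfection and ELISA, the measured spike protein expression levels are shown in Table 7.
Table 7. B.1.617.2 spike protein expression under the control of the I-B and I-F UTRs (ELISA)
Whereas the 5'UTR and 3'UTR of the I-B combination come from different genes, the 5'UTR and 3'UTR of the I-F combination come from the same gene, ARHGAP15. The results in Table 7 show that the I-B (SEQ ID NO: 1 and SEQ ID NO: 7) and I-F (SEQ ID NO: 1 and SEQ ID NO: 8) combinations regulate target gene expression to a comparable level, further demonstrating the versatility of 5'UTR I, which can be combined with different 3'UTRs to mediate efficient expression of a target gene (e.g., a gene encoding the spike protein).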
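Example 3. Sequence modification of 5'UTR I and preliminary functional validation
1) Bioinformatic analysis and mutation design
5'UTR I (SEQ ID NO: 1) is a naturally occurring 5'UTR of a human Rho GTPase activating protein family gene. The experimental data of Examples 1-2 support the conclusion that 5'UTR I assists efficient expression of the SARS-CoV-2 spike protein; however, 5'UTR I contains an internal ATG, which may lead to expression of a 27 aa short peptide and interfere with normal expression of the target gene protein. To prevent ribosomes from initiating translation at the ATG within the UTR, and thereby increase the probability of translation from the ATG of the CDS and increase expression, the present disclosure introduced a point mutation (A→G) to obtain the mutant 5'UTR I’ (DNA sequence shown in SEQ ID NO: 2; RNA sequence shown in SEQ ID NO: 46). The position of the mutation is shown in Figure 5.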
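The mutation design above (removing an ATG inside the UTR by changing its A) can be sketched in Python as follows. The function names and the toy sequences are hypothetical; the actual 5'UTR I sequence (SEQ ID NO: 1) is not reproduced here. The loop repeats because an A→G change can, in rare contexts, create a new ATG immediately upstream.

```python
def find_atg_positions(utr):
    """0-based start positions of every ATG in a DNA-form UTR sequence."""
    seq = utr.upper()
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]

def mutate_internal_atgs(utr, replacement="G"):
    """Replace the A of each ATG with G, C or T so that no ATG remains in the UTR."""
    if replacement not in {"G", "C", "T"}:
        raise ValueError("replacement must be G, C or T")
    seq = utr.upper()
    # Re-scan after every edit: mutating the A of 'ATATG' to G yields 'ATGTG',
    # which contains a new ATG that must be removed as well.
    while "ATG" in seq:
        pos = seq.find("ATG")
        seq = seq[:pos] + replacement + seq[pos + 1:]
    return seq

# Toy sequences for illustration only.
print(find_atg_positions("CCATGGC"))                      # [2]
print(mutate_internal_atgs("CCATGGC"))                    # CCGTGGC (A to G)
print(mutate_internal_atgs("CCATGGC", replacement="C"))   # CCCTGGC (A to C)
```

The three allowed replacements correspond to the GTG, CTG and TTG variants named in the definition of a derived sequence.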
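2) Point mutation of 5'UTR I to 5'UTR I’ effectively increases expression of the target gene
Plasmids in which expression of the target gene is regulated by 5'UTR I-B or 5'UTR I’-B were prepared by the method of Example 1; the target gene was the gene encoding the B.1.617.2 spike protein (SEQ ID NO: 13), and the plasmids were designated Delta 13 and Delta 24, respectively. mRNA was obtained by in vitro transcription and purification (the sequence of the uncapped I'-Delta24-B mRNA is shown in SEQ ID NO: 57) and transfected into 293T cells, and expression of the B.1.617.2 spike protein was then detected by the Western blot method of Example 2. The bands were quantified by grayscale scanning with Image software; the expression level of sample 1 (Delta 13) was set to 1, and normalization gave the ratios in Table 8.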
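The normalization step described above (setting the reference sample to 1 and expressing every other band relative to it) amounts to a single division per sample. The Python sketch below illustrates it with made-up gray values; the function name and the numbers are hypothetical and are not the data of Table 8.

```python
def normalize_to_reference(gray_values, reference):
    """Express each band intensity as a ratio to the reference sample (reference = 1)."""
    ref = gray_values[reference]
    if ref <= 0:
        raise ValueError("reference gray value must be positive")
    return {sample: value / ref for sample, value in gray_values.items()}

# Made-up gray values, for illustration only.
bands = {"sample 1 (Delta 13)": 200.0, "sample 2 (Delta 24)": 500.0}
for sample, ratio in normalize_to_reference(bands, "sample 1 (Delta 13)").items():
    print(f"{sample}: {ratio:.2f}")
```

Because every value on a blot is divided by the same reference, the ratios are comparable within that blot but not necessarily across blots exposed differently.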
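The results are shown in Figures 6A and 6B and in Table 8. Whether the mRNA was prepared from a plasmid miniprep template (endotoxin not removed) or from a plasmid maxiprep template (endotoxin removed), spike protein expression from the constructs containing the point-mutated 5'UTR I’ was higher than from the constructs containing 5'UTR I. The point mutation of 5'UTR I in Example 3 therefore effectively increased expression of the heterologous protein.
Table 8. Expression levels of the target gene under the control of the I-B and I’-B UTRs (Western blot)
Example 4. Expression of mRNAs of different target genes under the control of 5'UTR I’
The target gene was replaced by restriction digestion while retaining 5'UTR I’ and 3'UTR B, to construct plasmids encoding the B.1.617.2 (Delta) spike protein and the B.1.1.529 (Omicron) spike protein, respectively. The target gene sequences contained in the plasmids are SEQ ID NO: 13 and SEQ ID NO: 15, respectively, both codon-optimized, and they encode the target protein sequences shown in SEQ ID NO: 12 and SEQ ID NO: 14. mRNA was prepared and purified as in Example 1, 2 μg of mRNA was transfected into 293T cells using Lipofectamine™ 2000, and the cell supernatant was collected after 2 d. The Western blot method of Example 2 was used to detect expression of the B.1.617.2 spike protein mRNA under the control of I-B (designated Delta 13), the B.1.617.2 spike protein mRNA under the control of I’-B (designated Delta 24) and the B.1.1.529 spike protein mRNA under the control of I’-B (designated Omicron 24); the sequences of the uncapped Delta 24 mRNA and Omicron 24 mRNA are shown in SEQ ID NO: 57 and SEQ ID NO: 58, respectively. The bands were quantified by grayscale scanning with Image software; the expression level of Delta 13 was set to 1, and normalization gave the ratios in Table 9.
Figures 7A and 7B show that, under the control of the I’-B UTR combination, both the B.1.617.2 and the B.1.1.529 SARS-CoV-2 genes were efficiently expressed. This suggests that 5'UTR I’ is versatile and can drive efficient expression of different target genes. Comparing 5'UTR I with I’, spike protein expression from Delta 24 was clearly higher than from Delta 13, indicating that the point mutation giving 5'UTR I' effectively increased protein expression efficiency; this result further confirms the conclusion of Example 3.
Table 9. Expression levels of the target genes under the control of the I-B and I’-B UTR combinations (Western blot)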
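Example 5. Luciferase expression under the control of the I’-B UTR combination
A vector containing the luciferase CDS (SEQ ID NO: 26) was constructed, with the same T7 promoter (SEQ ID NO: 17) and polyA tail as in the vectors containing the SARS-CoV-2 genes described above. Three vectors differing in their UTRs were constructed, Luc, Luc13 and Luc24; the UTRs contained in each vector are listed in Table 10. The BNT162b2 5'UTR and 3'UTR combination in Luc, taken from CN113521269A, served as the control. Capped mRNA was synthesized by in vitro transcription by the same method as in Example 1 and transfected in vitro; after the supernatant was collected, luciferase substrate was added and the plate was read. The results are shown in Table 11 and Figure 8.
Table 10. Vector structures
Table 11. Luciferase expression under the control of I’-B and I-B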

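Note: values are expressed in relative light units (RLU).
The results show that luciferase expression under the control of the point-mutated 5'UTR I’ was higher than under 5'UTR I. Compared with the BNT162b2 5'UTR and 3'UTR combination in the Luc control, the I-B combination of the present disclosure gave higher expression of the target gene, and the I’-B combination gave markedly higher expression. The results obtained with luciferase as the target gene again indicate that both 5'UTR I and 5'UTR I’ can be used generally to drive efficient expression of any target protein.
Example 6. Luciferase expression under the control of 5'UTR I’ truncations and mutants
6.1 Successive truncations were made from the 5' end of 5'UTR I’, and the efficiency with which 5'UTR I’ truncations of different lengths regulate protein expression was analyzed by a bioinformatic method (Nat Biotechnol. 2019 Jul;37(7):803-809). The MRL value of each 5'UTR I’ truncation relative to 5'UTR I’ was calculated; the results are shown in Table 12. Table 12 shows that successive truncation of the nucleotide sequence at the 5' end of 5'UTR I’ does not cause a marked decrease in the MRL value, indicating that the 5'UTR I’ truncations retain the functional activity of regulating target protein expression. The nucleotide sequence at the 3' end of the 5'UTR I’ truncation shown in SEQ ID NO: 153 was then successively truncated, and the MRL value of each 3'-truncated variant relative to the sequence shown in SEQ ID NO: 153 was calculated; the results are shown in Table 13. Table 13 shows that truncation at the 3' end had no marked effect on the regulatory efficiency of the 5'UTR I’ truncation.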
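The successive truncation scheme of 6.1 can be sketched as follows. The sketch only enumerates the truncated sequences and computes a relative score; it does not implement the MRL predictor of the cited method, so `score_fn` is a hypothetical stand-in that the reader would replace with an actual predictor. The minimum length of 7 and the toy sequence are likewise assumptions made for illustration.

```python
def five_prime_truncations(utr, min_length=7):
    """Variants obtained by removing 1, 2, ... nucleotides from the 5' end."""
    return [utr[i:] for i in range(1, len(utr) - min_length + 1)]

def three_prime_truncations(utr, min_length=7):
    """Variants obtained by removing 1, 2, ... nucleotides from the 3' end."""
    return [utr[:j] for j in range(len(utr) - 1, min_length - 1, -1)]

def relative_score(score_fn, variant, reference):
    """Score of a variant divided by the score of the untruncated reference."""
    return score_fn(variant) / score_fn(reference)

def toy_score(seq):
    """Placeholder scoring function (NOT an MRL predictor): 1 plus the fraction of A/T."""
    return 1.0 + sum(base in "AT" for base in seq) / len(seq)

# Toy sequence for illustration only.
reference = "GCGCTATAAT"
for variant in five_prime_truncations(reference):
    print(variant, round(relative_score(toy_score, variant, reference), 3))
```

A relative score close to or above 1 for a truncation would correspond to the "no marked decrease" reading of Tables 12 and 13.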
Table 12. Relative MRL values of the 5'UTR I’ truncations versus 5'UTR I’


Table 13. Relative MRL values of the 5'UTR I’ truncation after 3'-end truncation versus the untruncated sequence

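6.2 Fourteen 5'UTR I’ truncations of different lengths (SEQ ID NO: 28-40 and CTATAAT) and two point mutants differing from the A→G mutation, namely A→C and A→T (SEQ ID NO: 42-43), were designed and synthesized by Suzhou GENEWIZ (苏州金唯智公司). The 16 synthesized sequences were then inserted into Luc 24 of Example 5 by HindIII and ScaI digestion and ligation, giving the corresponding luciferase constructs carrying the 5'UTR I’ truncations and point mutants, designated LUC1-16; the sequences of the corresponding 5'UTR I’ truncations (1-I’ to 14-I’) and mutants (15-I’ to 16-I’) are shown in SEQ ID NO: 63-78. Capped mRNA was synthesized by in vitro transcription by the same method as in Example 1 and transfected into 293T cells in vitro; 24 h later, luciferase substrate was added and the plate was read. As shown in Figure 9, luciferase expression mediated by Luc24 (the I’-B combination) and by its 5'UTR I’ truncations and mutants was markedly higher than that of Luc (BNT162b2 5' and 3' UTRs). The 5'UTR I’ truncations retain the activity of 5'UTR I’ in regulating target protein expression, and some of them (1-I’, 2-I’, 3-I’, 4-I’, 5-I’, 6-I’, 7-I’, 9-I’, 10-I’, 11-I’ and 13-I’) show increased activity in regulating target protein expression. In addition, the 5'UTR I’ mutants 15-I’ and 16-I’ gave target gene expression levels better than 5'UTR I’, indicating that truncating the 5' end of 5'UTR I’ or introducing internal point mutations can further enhance the functional activity of the 5'UTR in regulating target gene expression.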
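Example 7. Differences in influenza HA mRNA expression levels under the control of a control 5'UTR versus 5'UTR I’
The target gene was replaced by restriction digestion while retaining 5'UTR I’ and 3'UTR B, to construct plasmids encoding the HA protein of influenza A virus H1N1 (A/Wisconsin/588/2019) (SEQ ID NO: 83), the HA protein of influenza A virus H3N2 (A/Cambodia/e0826360/2020) (SEQ ID NO: 84), the HA protein of influenza B virus (Washington/02/2019) (SEQ ID NO: 85) and the HA protein of influenza B virus (Phuket/3073/2013) (SEQ ID NO: 86), respectively; the target gene sequences contained in the plasmids are SEQ ID NO: 87-90, respectively, all codon-optimized. With reference to the 5' and 3' UTR sequences of patents WO2022/245888A1 and WO2022/150717A1 (SEQ ID NO: 91-92), four HA CDS sequences encoding SEQ ID NO: 83-86 were synthesized, namely SEQ ID NO: 93-96. mRNA was prepared and purified as in Example 1, 1 μg of mRNA was transfected into 293T cells using Lipofectamine™ 2000, and cell lysates were collected after 1 d. The Western blot method of Example 2 was used to detect expression of the four influenza HA protein mRNAs under the control of I’-B, with an influenza A virus Hemagglutinin/HA antibody (Sino Biological, 86001-RM01) and an influenza B virus Hemagglutinin/HA antibody (Sino Biological, 11053-R004) as primary antibodies and HRP-linked anti-rabbit IgG (Transgene, HS101) as secondary antibody. As shown in Figures 10 and 11, expression of the four influenza HAs under the control of 5'UTR I’ was clearly higher than under the control 5'UTR.
Some of the sequences of the present disclosure are shown below: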

Claims (35)

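  1. A nucleic acid construct comprising:
    (a) an open reading frame (ORF), and
    (b) a 5' untranslated region element (5'UTR);
    wherein the 5'UTR is derived from an ARHGAP gene;
    preferably, the ARHGAP is ARHGAP15;
    more preferably, the 5'UTR comprises the nucleotide sequence shown in SEQ ID NO: 44 or a truncation thereof.
  2. The nucleic acid construct of claim 1, wherein the 5'UTR comprises at least one point mutation capable of suppressing translation initiation from an ATG within the UTR;
    preferably, the point mutation is selected from mutations at any one or more of the A, T or G positions of the ATG sequence in the 5'UTR;
    more preferably, the mutation is a mutation of A to G, C or T.
  3. The nucleic acid construct of claim 1 or 2, wherein the 5'UTR truncation retains the function of regulating expression of the protein encoded by the ORF; or the 5'UTR truncation has an enhanced function of regulating expression of the protein encoded by the ORF compared with the natural 5'UTR;
    preferably, the 5'UTR truncation is truncated by: deletion of a contiguous nucleotide sequence at the 5' end, in the 5' to 3' direction of the sequence; and/or deletion of a contiguous nucleotide sequence at the 3' end, in the 3' to 5' direction of the sequence.
  4. The nucleic acid construct of any one of claims 1-3, wherein the 5'UTR truncation lacks a contiguous nucleotide sequence at the 5' end, in the 5' to 3' direction of the sequence; and/or lacks a contiguous nucleotide sequence at the 3' end, in the 3' to 5' direction of the sequence;
    preferably, the 5'UTR truncation comprises at least 5 contiguous nucleotides of the sequence shown in SEQ ID NO: 44, more preferably at least 7 contiguous nucleotides of the sequence shown in SEQ ID NO: 2;
    preferably, the nucleotide at the 5' end of the 5'UTR truncation is the nucleotide at any one of positions 1-57, by natural counting, of the sequence shown in SEQ ID NO: 44, more preferably the nucleotide at any one of positions 1-13 or 17-29, by natural counting, of the sequence shown in SEQ ID NO: 2;
    preferably, the 5'UTR truncation comprises CTATAAT or the nucleotide sequence shown in SEQ ID NO: 193.
  5. The nucleic acid construct of any one of claims 1-3, wherein the 5'UTR comprises the nucleotide sequence shown in any one of SEQ ID NO: 2, 28-43, 76 and 137-196, or comprises CTATAAT; or a nucleotide sequence having at least 80% identity to any of the foregoing sequences.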
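  6. The nucleic acid construct of any one of claims 1 to 5, further comprising:
    (c) a 3' untranslated region element (3'UTR);
    preferably, the 3'UTR comprises a 3'UTR derived from any one of the HBB, ARHGAP, CORO1A or HPX genes;
    more preferably, the 3'UTR comprises a 3'UTR derived from ARHGAP15.
  7. The nucleic acid construct of claim 6, wherein
    the 3'UTR comprises the nucleotide sequence shown in any one of SEQ ID NO: 7-10, or the 3'UTR comprises a nucleotide sequence having at least 80% identity to any one of SEQ ID NO: 7-10.
  8. The nucleic acid construct of any one of claims 1 to 7, further comprising:
    (d) a polyadenylate (poly-A) tail.
  9. The nucleic acid construct of claim 8, wherein
    the poly-A tail is selected from HGH polyA, SV40 polyA, BGH polyA, rbGlob polyA and SV40 late polyA;
    preferably, the poly-A tail comprises the nucleotide sequence shown in SEQ ID NO: 16 or SEQ ID NO: 135, or a nucleotide sequence having at least 80% identity thereto.
  10. The nucleic acid construct of any one of claims 1 to 9, wherein the ORF encodes a viral antigen; preferably, the ORF encodes a coronavirus antigen or an influenza virus antigen;
    the coronavirus is preferably SARS-CoV-2, and the coronavirus antigen is preferably the spike protein;
    more preferably, the spike protein is the spike protein of any one strain selected from SARS-CoV-2, SARS-CoV-2 Alpha, SARS-CoV-2 Beta, SARS-CoV-2 Gamma, SARS-CoV-2 Kappa, SARS-CoV-2 Delta and SARS-CoV-2 Omicron;
    the influenza virus is preferably influenza A virus or influenza B virus, and the antigen is hemagglutinin (HA) and/or neuraminidase (NA);
    more preferably, the influenza virus antigen is the hemagglutinin and/or neuraminidase of any one strain selected from influenza A virus H1N1, influenza A virus H3N2, influenza B virus Victoria and influenza B virus Yamagata.
  11. The nucleic acid construct of any one of claims 1 to 10, wherein
    the polypeptide sequence encoded by the ORF comprises the amino acid sequence shown in any one of SEQ ID NO: 12, 14, 21, 23, 25, 80, 83-86, 98-101 and 136;
    preferably, the ORF comprises the nucleotide sequence shown in any one of SEQ ID NO: 13, 15, 20, 22, 24, 81, 87-90, 97 and 102-105, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity to any one of SEQ ID NO: 13, 15, 20, 22, 24, 81, 87-90, 97 and 102-105.
  12. The nucleic acid construct of any one of the preceding claims, comprising the nucleotide sequence shown in SEQ ID NO: 18-19 or a nucleotide sequence having at least 80% identity thereto.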
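  13. An RNA molecule comprising:
    (a) an open reading frame (ORF);
    (b) a 5' untranslated region element (5'UTR); and
    (c) a 3' untranslated region element (3'UTR);
    wherein the 5'UTR is selected from 5'UTRs derived from any one of the ARHGAP, HSPB1, HBB or CCL13 genes, and the 3'UTR is selected from 3'UTRs derived from any one of the HBB, ARHGAP, CORO1A or HPX genes;
    the ARHGAP is preferably ARHGAP15, and the 5'UTR preferably comprises the nucleotide sequence shown in SEQ ID NO: 79 or a truncation thereof.
  14. The RNA molecule of claim 13, wherein the 5'UTR and 3'UTR are selected from any one of the following combinations:
    1) the 5'UTR is derived from the 5'UTR of ARHGAP, and the 3'UTR is derived from the 3'UTR of any one of HBB, ARHGAP, CORO1A or HPX;
    2) the 5'UTR is derived from the 5'UTR of HSPB1, and the 3'UTR is derived from the 3'UTR of any one of HBB, ARHGAP, CORO1A or HPX;
    3) the 5'UTR is derived from the 5'UTR of HBB, and the 3'UTR is derived from the 3'UTR of any one of HBB, ARHGAP, CORO1A or HPX;
    4) the 5'UTR is derived from the 5'UTR of CCL13, and the 3'UTR is derived from the 3'UTR of any one of HBB, ARHGAP, CORO1A or HPX;
    preferably,
    the 5'UTR of ARHGAP comprises the nucleotide sequence shown in any one of SEQ ID NO: 45-46, 63-75 and 77-78, or comprises CUAUAAU, or a nucleotide sequence having at least 80% identity to any of the foregoing sequences;
    the 5'UTR derived from HSPB1 comprises the nucleotide sequence shown in SEQ ID NO: 47 or a nucleotide sequence having at least 80% identity thereto;
    the 5'UTR derived from HBB comprises the nucleotide sequence shown in SEQ ID NO: 48 or a nucleotide sequence having at least 80% identity thereto;
    the 5'UTR derived from CCL13 comprises the nucleotide sequence shown in SEQ ID NO: 49 or a nucleotide sequence having at least 80% identity thereto;
    the 3'UTR derived from HBB comprises the nucleotide sequence shown in SEQ ID NO: 50 or a nucleotide sequence having at least 80% identity thereto;
    the 3'UTR derived from ARHGAP comprises the nucleotide sequence shown in SEQ ID NO: 51 or a nucleotide sequence having at least 80% identity thereto;
    the 3'UTR derived from CORO1A comprises the nucleotide sequence shown in SEQ ID NO: 52 or a nucleotide sequence having at least 80% identity thereto;
    the 3'UTR derived from HPX comprises the nucleotide sequence shown in SEQ ID NO: 53 or a nucleotide sequence having at least 80% identity thereto.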
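  15. The RNA molecule of any one of claims 13 to 14, wherein the ORF encodes a viral antigen; preferably, the ORF encodes a coronavirus antigen or an influenza virus antigen;
    the coronavirus is preferably SARS-CoV-2, and the coronavirus antigen is preferably the spike protein;
    more preferably, the spike protein is the spike protein of any one strain selected from SARS-CoV-2, SARS-CoV-2 Alpha, SARS-CoV-2 Beta, SARS-CoV-2 Gamma, SARS-CoV-2 Kappa, SARS-CoV-2 Delta and SARS-CoV-2 Omicron; or
    the influenza virus is selected from influenza A virus and influenza B virus, and the influenza virus antigen is hemagglutinin (HA) and/or neuraminidase (NA);
    more preferably, the influenza virus antigen is the hemagglutinin and/or neuraminidase of any one strain selected from influenza A virus H1N1, influenza A virus H3N2, influenza B virus Victoria and influenza B virus Yamagata.
  16. The RNA molecule of claim 15, wherein
    the polypeptide sequence encoded by the ORF comprises the amino acid sequence shown in any one of SEQ ID NO: 12, 14, 21, 23, 25, 80, 83-86, 98-101 and 136;
    preferably, the nucleotide sequence of the ORF comprises the sequence shown in any one of SEQ ID NO: 54-55, 59-61, 82 and 126-134, or a nucleotide sequence having at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity to any one of SEQ ID NO: 54-55, 59-61, 82 and 126-134.
  17. The RNA molecule of any one of claims 13 to 16, further comprising:
    (d) a polyadenylate (poly-A) tail.
  18. The RNA molecule of claim 17, wherein
    the poly-A tail is selected from HGH polyA, SV40 polyA, BGH polyA, rbGlob polyA and SV40 late polyA;
    preferably, the poly-A tail comprises the nucleotide sequence shown in SEQ ID NO: 56 or SEQ ID NO: 135, or a nucleotide sequence having at least 80% identity thereto.
  19. The RNA molecule of any one of claims 13 to 18, comprising (a) an open reading frame (ORF), (b) a 5' untranslated region element (5'UTR), (c) a 3' untranslated region element (3'UTR), and (d) a polyadenylate (poly-A) tail;
    preferably, the RNA molecule comprises the nucleotide sequence shown in any one of SEQ ID NO: 57-58 and 106-125.
  20. The RNA molecule of any one of claims 13 to 19, further comprising:
    (e) a 5' cap structure (5'Cap).
  21. The RNA molecule of claim 20, wherein
    the 5'Cap is selected from Cap0, Cap1, Cap2, Cap3, Cap4, ARCA, modified ARCA, inosine, N1-methyl-guanosine, 2'-fluoro-guanosine, 7-deaza-guanosine, 8-oxo-guanosine, 2-amino-guanosine, LNA-guanosine and 2-azido-guanosine;
    preferably, the 5'Cap is selected from ARCA, 3'OMe-m7G(5')ppp(5')G, m7G(5')ppp(5')(2'OMeA)pU, m7Gppp(A2'O-MOE)pG, m7G(5')ppp(5')(2'OMeA)pG, m7G(5')ppp(5')(2'OMeG)pG, m7(3'OMeG)(5')ppp(5')(2'OMeG)pG and m7(3'OMeG)(5')ppp(5')(2'OMeA)pG;
    preferably, the 5'Cap is m7G(5')ppp(5')(2'OMeA)pG.
  22. The RNA molecule of any one of claims 13 to 21, wherein the RNA molecule is an mRNA.
  23. The RNA molecule of any one of claims 13 to 22, further comprising one or more modifications,
    preferably, the modifications include backbone modifications, sugar modifications, base modifications and/or lipid modifications;
    more preferably, the base modification is a pseudouridine modification.
  24. An RNA molecule comprising the nucleotide sequence shown in any one of SEQ ID NO: 54-55, 59-61, 82 and 126-134, or a nucleotide sequence having at least 80% identity to any one of SEQ ID NO: 54-55, 59-61, 82 and 126-134;
    preferably, the RNA molecule further comprises one or more modifications; more preferably, the modification is a pseudouridine modification.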
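  25. Use of the nucleic acid construct of any one of claims 1-12, or the RNA molecule of any one of claims 13-24, in any of the following: (1) preparing a vaccine; (2) encoding a viral antigen in vivo or in vitro in a subject; (3) preparing a medicament that encodes a viral antigen in vivo or in vitro in a subject; (4) preparing a medicament.
  26. A vaccine comprising the RNA molecule of any one of claims 13 to 24; preferably, the RNA molecule encodes one or more antigens of one or more virus strains.
  27. A vector comprising the nucleic acid construct of any one of claims 1 to 12, or the RNA molecule of any one of claims 13 to 24.
  28. A host cell comprising the vector of claim 27.
  29. A method for preparing the RNA molecule of any one of claims 13-24, comprising: transcribing the nucleic acid construct of any one of claims 1 to 12, or the vector of claim 27, to obtain the RNA molecule; preferably, the method further comprises adding a 5'Cap to the 5' end of the RNA molecule.
  30. A delivery system comprising the nucleic acid construct of any one of claims 1 to 12 or the RNA molecule of any one of claims 13 to 24, wherein the delivery vehicle of the delivery system is a cationic lipid delivery particle; preferably, the delivery vehicle is a lipid nanoparticle.
  31. A pharmaceutical composition comprising:
    a pharmaceutically acceptable carrier, diluent or excipient, and
    any one selected from the following:
    the nucleic acid construct of any one of claims 1 to 12, the RNA molecule of any one of claims 13 to 24, the vaccine of claim 26, and/or the delivery system of claim 30.
  32. A product or kit comprising the nucleic acid construct of any one of claims 1 to 12, the RNA molecule of any one of claims 13 to 24, the vaccine of claim 26, the delivery system of claim 30, and/or the pharmaceutical composition of claim 31.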
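  33. Use in the preparation of a medicament for treating and/or preventing a disease, comprising administering to a subject in need thereof an effective amount of the nucleic acid construct of any one of claims 1 to 12, the RNA molecule of any one of claims 13 to 24, the vaccine of claim 26, the delivery system of claim 30, the pharmaceutical composition of claim 31, and/or the product or kit of claim 32; the disease being a viral infectious disease or a respiratory disease associated with viral infection;
    preferably, the virus is a coronavirus or an influenza virus, more preferably SARS-CoV-2, influenza A virus or influenza B virus;
    preferably, the respiratory disease associated with viral infection includes uncomplicated infection, fever, cough, sore throat, rhinitis, headache, pneumonia, acute respiratory infection, severe acute respiratory infection (SARI), hypoxic respiratory failure, acute respiratory distress syndrome, sepsis, septic shock and severe acute respiratory syndrome (SARS).
  34. A method of treating and/or preventing a disease, comprising administering to a subject in need thereof an effective amount of the nucleic acid construct of any one of claims 1 to 12, the RNA molecule of any one of claims 13 to 24, the vaccine of claim 26, the delivery system of claim 30, the pharmaceutical composition of claim 31, and/or the product or kit of claim 32; the disease being a viral infectious disease or a respiratory disease associated with viral infection;
    preferably, the virus is a coronavirus or an influenza virus, more preferably SARS-CoV-2, influenza A virus or influenza B virus;
    preferably, the respiratory disease associated with viral infection includes uncomplicated infection, fever, cough, sore throat, rhinitis, headache, pneumonia, acute respiratory infection, severe acute respiratory infection (SARI), hypoxic respiratory failure, acute respiratory distress syndrome, sepsis, septic shock and severe acute respiratory syndrome (SARS).
  35. A method of inducing a neutralizing antibody response and/or a T cell immune response in a subject, comprising administering to a subject in need thereof an effective amount of the nucleic acid construct of any one of claims 1 to 12, the RNA molecule of any one of claims 13 to 24, the vaccine of claim 26, the delivery system of claim 30, the pharmaceutical composition of claim 31, and/or the product or kit of claim 32;
    preferably, the neutralizing antibody response is a neutralizing antibody response against a viral antigen, and the T cell immune response includes a CD4+ and/or CD8+ T cell immune response;
    preferably, the virus is a coronavirus or an influenza virus, more preferably SARS-CoV-2, influenza A virus or influenza B virus.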
PCT/CN2023/091194 2022-04-27 2023-04-27 Nucleic acid construct and use thereof WO2023208118A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210456261.9 2022-04-27
CN202210456261 2022-04-27

Publications (1)

Publication Number Publication Date
WO2023208118A1 true WO2023208118A1 (zh) 2023-11-02

Family

ID=88517896

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/091194 WO2023208118A1 (zh) 2022-04-27 2023-04-27 Nucleic acid construct and use thereof

Country Status (2)

Country Link
TW (1) TW202400251A (zh)
WO (1) WO2023208118A1 (zh)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100273220A1 (en) * 2009-04-22 2010-10-28 Massachusetts Institute Of Technology Innate immune suppression enables repeated delivery of long rna molecules
CN105007938A (zh) * 2013-03-15 2015-10-28 宾夕法尼亚大学理事会 流感核酸分子和由其制备的疫苗
CN108778308A (zh) * 2015-12-22 2018-11-09 库瑞瓦格股份公司 生产rna分子组合物的方法
CN111671890A (zh) * 2020-05-14 2020-09-18 苏州大学 一种新型冠状病毒疫苗及其应用
CN113186203A (zh) * 2020-02-13 2021-07-30 斯微(上海)生物科技有限公司 治疗或者预防冠状病毒病的疫苗试剂
CN113185613A (zh) * 2021-04-13 2021-07-30 武汉大学 新型冠状病毒s蛋白及其亚单位疫苗
CN113528545A (zh) * 2021-09-17 2021-10-22 艾棣维欣(苏州)生物制药有限公司 编码新型冠状病毒b.1.1.7突变株抗原的核酸序列及其应用

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XU XIAO-FENG, GAO FENG, WANG JIAN-JIANG, LONG CONG, CHEN XING, TAO LAN, YANG LIU, DING LI, JI YONG: "BMX-ARHGAP fusion protein maintains the tumorigenicity of gastric cancer stem cells by activating the JAK/STAT3 signaling pathway", CANCER CELL INTERNATIONAL, vol. 19, no. 1, 1 December 2019 (2019-12-01), pages 133, XP093104751, DOI: 10.1186/s12935-019-0847-5 *
YANG TZU-JING; YU PEI-YU; CHANG YUAN-CHIH; LIANG KANG-HAO; TSO HSIAN-CHENG; HO MENG-RU; CHEN WAN-YU; LIN HSIU-TING; WU HAN-CHUNG; : "Effect of SARS-CoV-2 B.1.1.7 mutations on spike protein structure and function", NATURE STRUCTURAL & MOLECULAR BIOLOGY, NATURE PUBLISHING GROUP US, NEW YORK, vol. 28, no. 9, 12 August 2021 (2021-08-12), New York , pages 731 - 739, XP037561658, ISSN: 1545-9993, DOI: 10.1038/s41594-021-00652-z *

Also Published As

Publication number Publication date
TW202400251A (zh) 2024-01-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23795550

Country of ref document: EP

Kind code of ref document: A1