CN117229371A - Novel S protein mutant of coronavirus variant strain, genetically engineered mRNA thereof and vaccine composition

Info

Publication number: CN117229371A
Application number: CN202210633828.5A
Authority: CN
Inventors: 蒋俊; 辛琪; 王茜婷; 林耀新; 栗世铀; 王浩; 乌磊; 王利娜; 罗晓敏; 张欣月; 白雪皎
Original assignee: Beijing Tricision Biotherapeutics Inc
Current assignee: Beijing Tricision Biotherapeutics Inc
Priority date: 2022-06-06
Filing date: 2022-06-06
Publication date: 2023-12-15

Abstract

The invention provides an S protein mutant that forms a conformationally stable trimer, in which several amino acid residues in the C-terminal region of the extracellular domain of the spike protein (S protein) of a novel coronavirus variant strain are mutated to proline. It also provides mRNA encoding the S protein mutant and a vector from which the mRNA can be prepared by in vitro transcription. The S protein mutant, as well as the mRNA encoding it (including further optimized mRNA), can be used to induce an immune response against the novel coronavirus in a subject, thereby preventing and/or treating a disease or disorder associated with novel coronavirus infection.

Description

Novel S protein mutant of coronavirus variant strain, genetically engineered mRNA thereof and vaccine composition

Technical Field

The invention belongs to the technical field of biological medicines and vaccines, and particularly relates to a recombinant antigen for preparing a vaccine against a novel coronavirus (2019-nCoV) variant strain, genetically engineered mRNA and a vector thereof, and an mRNA vaccine composition thereof.

Background

Traditional inactivated vaccines, attenuated vaccines and polypeptide vaccines have long development cycles and complex production processes. mRNA vaccines, by contrast, build on advances in mRNA modification and delivery tools: once the viral antigen sequence is obtained, a clinical-scale mRNA vaccine can be designed and manufactured within weeks, and production can be standardized, which makes mRNA vaccines attractive for responding to pandemic outbreaks of infectious diseases. In addition, mRNA vaccines carry neither the potential reversion hazard of attenuated vaccines nor the reverse-mutation problem attributed to inactivated vaccines. In terms of immunogenicity, mRNA vaccines can induce both B-cell and T-cell immune responses, elicit immune memory, and express multiple antigens at once, delivering more potent antigens. Furthermore, mRNA only needs to cross the cell membrane to be expressed efficiently as antigen protein in the cytoplasm, with no risk of integration into the genome. Finally, mRNA is readily degraded after being translated into protein; this transient expression contributes to the safety of mRNA drugs, makes the dose controllable, and avoids the antigen immune tolerance caused by long-term exposure to a vaccine drug. Thus, mRNA vaccines have a disruptive advantage in terms of safety, speed of preparation and immunogenicity.

mRNA is transcribed from the template strand of DNA; its sequence matches that of the coding strand (with U in place of T) and is complementary to the template strand. Unlike in prokaryotes, the precursor mRNA carrying genetic information in eukaryotes consists of protein-coding exons alternating with non-coding introns. Only correctly modified and spliced mature mRNA can be transported into the cytoplasm as an informational template for translation into protein.
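The relationship between the template strand, the coding strand and the mRNA can be illustrated with a minimal sketch. The six-base sequences and the function names below are made up for illustration and do not come from the sequence listing.

```python
# Toy illustration only; the six-base sequences are made up for this sketch.
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_3_to_5: str) -> str:
    """Pair each template base (read 3'->5') with its RNA complement,
    yielding the mRNA written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in template_3_to_5.upper())


def coding_strand_to_mrna(coding_5_to_3: str) -> str:
    """The mRNA matches the coding strand, with U in place of T."""
    return coding_5_to_3.upper().replace("T", "U")


coding = "ATGGCC"    # hypothetical coding strand, 5'->3'
template = "TACCGG"  # its complementary template strand, written 3'->5'

mrna = transcribe_template(template)
print(mrna)
print(coding_strand_to_mrna(coding) == mrna)
```

Both routes give the same mRNA, which is the point made in the paragraph above: the transcript is complementary to the template strand and identical in sequence to the coding strand apart from the T-to-U substitution.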

The novel coronavirus (2019-nCoV, also known as SARS-CoV-2) is spherical or ellipsoidal, with a diameter of 80-120 nm. Under electron microscopy, the virion surface shows globular projections consisting of trimeric spike glycoprotein (Spike, S). The viral envelope contains the membrane glycoprotein (M), which is embedded in the envelope by three transmembrane domains. Small amounts of a small transmembrane protein, the envelope (E) protein, are also present in the envelope. Finally, the nucleocapsid (N) protein binds to the RNA genome in a beads-on-a-string arrangement, forming a helically symmetric nucleocapsid. Research shows that the S, M, E and N proteins are the main components of coronaviruses that induce immune responses in the host. In addition, the receptor binding domain (RBD) of the S protein mediates infection of human airway epithelial cells by interacting with the human ACE2 protein.

2019-nCoV mutates at a relatively high frequency, and dominant variant strains can emerge after the virus has circulated in the population for a period of time. By exploiting the rapid development of mRNA vaccines, an mRNA vaccine targeting a dominant variant strain can be designed and produced to counter the immune escape of new variant strains in the population.

Disclosure of Invention

The invention provides an S protein mutant that forms a conformationally stable trimer, in which several amino acid residues in the C-terminal region of the extracellular domain of the spike protein (S protein) of a 2019-nCoV variant strain are mutated to proline. It also provides mRNA encoding the S protein mutant and a vector from which the mRNA can be prepared by in vitro transcription. These S protein mutants, as well as the mRNA encoding them (including further optimized mRNA), can be used to induce an immune response in a subject against 2019-nCoV strains, including the wild-type and variant strains, thereby preventing and/or treating diseases or disorders associated with 2019-nCoV infection.

The major structural components of the 2019-nCoV virus particle known to date include the single-stranded positive-sense nucleic acid, the spike protein (S), the membrane protein (M), the envelope protein (E), and the nucleocapsid protein (N). As shown in FIG. 1, the S protein can be divided into a receptor binding subunit S1 and a membrane fusion subunit S2. Adsorption to and invasion of cells by 2019-nCoV relies primarily on the S protein, which assembles as a homotrimer and is anchored in the viral membrane by its cytoplasmic tail and transmembrane domain. Analysis of the pre-fusion structure of the S protein shows that the RBD of the S1 subunit undergoes a hinge-like conformational movement that hides or exposes the key sites for receptor binding: the down state is the non-receptor-binding state, whereas the up state is the receptor-binding state and is relatively unstable. The up conformation allows the S protein to bind readily to the host receptor, angiotensin converting enzyme 2 (ACE2). Upon binding of the RBD to the receptor, the S2 subunit changes to the post-fusion conformation by inserting the fusion peptide (FP) into the host cell membrane. Cryo-electron microscopy has resolved a large number of trimeric S protein structures in the pre-fusion conformation. Many neutralizing-antibody-sensitive epitopes are present on the pre-fusion S protein, whereas the post-fusion conformation minimizes exposure of the neutralization-sensitive epitopes that are present only in the pre-fusion conformation.

Thus, to serve as a vaccine antigen, the optimized S protein mutant should retain the epitopes present in the pre-fusion conformation of the S protein and induce antibodies capable of inhibiting viral fusion.

The S protein mutant is produced by amino acid mutation of a parent S protein, and the mutation can be a substitution, deletion and/or insertion of amino acids. The parent S protein may be the S protein of a wild-type strain of 2019-nCoV, or the S protein of any variant strain of 2019-nCoV (the mutations of such a variant strain may lie within the S protein region or outside it). The parent S protein may be a full-length S protein or a fragment of a full-length S protein (e.g., a truncated form of the full-length S protein, such as one lacking the cytoplasmic tail and/or transmembrane domain).

In the present invention, the amino acid positions of both the S protein mutant and the parent S protein are described with reference to the amino acid sequence of the wild-type S protein, which is available under NCBI GeneID 43740568 and has a total of 1273 amino acids; its sequence is shown below and is designated SEQ ID NO: 1 in the present invention.

In one embodiment of the invention, the parent S protein is the S protein of the 2019-nCoV B.1.617.2 variant strain, which has the following mutations compared with the S protein of the 2019-nCoV wild-type strain: T19R, G142D, EF156-157del, R158G, L452R, T478K, D614G, P681R, D950N (positions are given according to the amino acid sequence shown in SEQ ID NO: 1).

The first aspect of the present invention is to provide an S protein mutant.

According to the invention, the S protein mutant comprises at least an extracellular domain that carries the following amino acid mutations relative to the extracellular domain of the parent S protein: F817P, A892P, A899P, A942P, and KV986_987PP, with positions given according to the amino acid sequence shown in SEQ ID NO: 1. These amino acid mutations can improve the stability of the S protein mutant.
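Because every mutation is described by its position in SEQ ID NO: 1, deletions must not shift the numbering of later mutations. The following is a minimal sketch of one way to apply such a mutation list against reference numbering. The function name, the regular expressions and the eight-residue toy reference are assumptions made for illustration; the real 1273-residue sequence is not reproduced here.

```python
import re

# Patterns for the three notations used in the text (spaces are stripped first):
#   "T19R"          single substitution
#   "KV986_987PP"   substitution of adjacent residues
#   "EF156-157del"  deletion of a residue range
DELETION = re.compile(r"([A-Z]+)(\d+)-(\d+)del")
SUBSTITUTION = re.compile(r"([A-Z]+)(\d+)(?:_\d+)?([A-Z]+)")


def apply_mutations(reference: str, mutations: list) -> str:
    """Apply mutations given in 1-based reference numbering.

    Each reference position keeps its own slot until the end, so a
    deletion never shifts the coordinates of the other mutations.
    """
    residues = list(reference)
    for raw in mutations:
        label = raw.replace(" ", "")
        deletion = DELETION.fullmatch(label)
        substitution = SUBSTITUTION.fullmatch(label)
        if deletion:
            old, start = deletion.group(1), int(deletion.group(2))
            new = [""] * len(old)
        elif substitution:
            old, start = substitution.group(1), int(substitution.group(2))
            new = list(substitution.group(3))
        else:
            raise ValueError(f"unrecognized mutation: {raw!r}")
        if len(new) != len(old):
            raise ValueError(f"length mismatch in {raw!r}")
        for offset, (expected, replacement) in enumerate(zip(old, new)):
            index = start - 1 + offset
            if index >= len(reference) or reference[index] != expected:
                raise ValueError(f"{raw!r} does not match the reference")
            residues[index] = replacement
    return "".join(residues)


# Hypothetical toy reference and mutations, NOT taken from SEQ ID NO: 1.
toy_reference = "MTEFRKVA"
toy_mutations = ["T2R", "EF 3-4 del", "R5G", "KV6_7PP"]
toy_mutant = apply_mutations(toy_reference, toy_mutations)
print(toy_mutant)
```

Checking the stated wild-type residue against the reference before editing catches numbering mistakes early, for example a position that lies beyond the end of the sequence.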

In some embodiments of the invention, the S protein mutant further has the following mutations relative to the parent S protein: T19R, G142D, EF156-157del, R158G, L452R, T478K, D614G, P681R, D950N, with positions given according to the amino acid sequence shown in SEQ ID NO: 1.

According to the invention, in some embodiments, the S protein mutant has a mutation at the furin cleavage site relative to the parent S protein: the RRAR motif at amino acids 682-685 (positions given according to the amino acid sequence shown in SEQ ID NO: 1) is mutated so that it can no longer be cleaved by furin-like proteases. In one embodiment of the invention, RRAR is mutated to GSAS. Mutating the cleavage site in the S protein prevents the S protein mutant from being cleaved by the protease and further improves its stability.

According to the invention, in some embodiments, the S protein mutant does not comprise the transmembrane domain and/or cytoplasmic tail of the S protein.

According to the invention, in some embodiments, the S protein mutant may also have an amino acid mutation in the fusion peptide domain relative to the parent S protein. Substitution, deletion and/or insertion of one or more amino acid residues in this region results in the fusion peptide domain losing its natural function, i.e., the function of mediating fusion of the virus with the host cell membrane. In some embodiments, the S protein mutant does not comprise a fusion peptide domain. By causing fusion peptide domain mutations in the S protein mutant to render it nonfunctional, the stability of the S protein mutant pre-fusion conformation can be increased such that a large number of neutralizing antibody sensitive epitopes present on the S protein pre-fusion conformation are retained and exposed.

According to the invention, in some embodiments, the C-terminus of the extracellular region of the S protein mutant (amino acids 1-1209, positions given according to the amino acid sequence shown in SEQ ID NO: 1) is fused directly to a domain that facilitates trimer formation. A "domain that facilitates trimer formation" refers to a protein or polypeptide domain that, when expressed, forms trimers spontaneously or upon induction. A variety of such domains are known in the art. Including such a domain in the S protein mutant (e.g., by constructing a fusion protein) can promote formation of the trimeric conformation of the S protein mutant and/or stabilize that conformation. In one embodiment of the invention, the domain that facilitates trimer formation is the T4 fibritin foldon trimerization motif. In one embodiment of the invention, the amino acid sequence of the T4 fibritin foldon trimerization motif is shown in SEQ ID NO. 3.

In some preferred embodiments of the invention, the S protein mutant has the following features, with positions given according to the amino acid sequence shown in SEQ ID NO. 1: six proline mutations in the extracellular domain of the S protein (F817P, A892P, A899P, A942P, and KV986_987PP); the mutations T19R, G142D, EF156-157del, R158G, L452R, T478K, D614G, P681R and D950N; mutation of RRAR at amino acids 682-685 to GSAS; and absence of the transmembrane domain and cytoplasmic tail of the S protein. In one embodiment of the invention, the S protein mutant comprises the amino acid sequence shown in SEQ ID NO. 2.

In some preferred embodiments of the invention, the S protein mutant has the following features, with positions given according to the amino acid sequence shown in SEQ ID NO. 1: six proline mutations in the extracellular domain of the S protein (F817P, A892P, A899P, A942P, and KV986_987PP); the mutations T19R, G142D, EF156-157del, R158G, L452R, T478K, D614G, P681R and D950N; mutation of RRAR at amino acids 682-685 to GSAS; absence of the transmembrane domain and cytoplasmic tail of the S protein; and direct fusion of the trimer-facilitating domain, the T4 fibritin foldon trimerization motif, to the C-terminus of the extracellular region. In one embodiment of the invention, the S protein mutant comprises the amino acid sequence of SEQ ID NO. 2 and the amino acid sequence of SEQ ID NO. 3 directly linked from the N-terminus to the C-terminus. In one embodiment of the invention, the amino acid sequence of the S protein mutant consists of the amino acid sequence of SEQ ID NO. 2 and the amino acid sequence of SEQ ID NO. 3 directly linked from the N-terminus to the C-terminus.

In a second aspect, the invention provides a DNA molecule encoding an S protein mutant according to the first aspect of the invention, an expression vector or a cell comprising said DNA molecule.

According to the invention, the DNA molecule may be present in an expression vector, such as a plasmid vector or a viral vector, and transfected into an engineered cell for expression to obtain the S protein mutant of the invention. Alternatively, the DNA molecule can be recombined into the genome of an engineered cell and expressed there to obtain the S protein mutant.

In some embodiments of the invention, the nucleotide sequence of the DNA molecule comprises a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the nucleotide sequence set forth in SEQ ID NO. 4, which encodes the amino acid sequence set forth in SEQ ID NO. 2.

In some embodiments of the invention, the nucleotide sequence of the DNA molecule comprises a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the nucleotide sequence set forth in SEQ ID NO. 5, which encodes the amino acid sequence set forth in SEQ ID NO. 3.

In some embodiments of the invention, the nucleotide sequence of the DNA molecule comprises a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the nucleotide sequence of SEQ ID NO. 4, and a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the nucleotide sequence of SEQ ID NO. 5, directly linked from the 5 'end to the 3' end. In a specific embodiment of the invention, the nucleotide sequence of the DNA molecule comprises the nucleotide sequence of SEQ ID NO. 4 and the nucleotide sequence of SEQ ID NO. 5 directly linked from the 5 'end to the 3' end.
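The percentage thresholds above presuppose some way of scoring two sequences against each other, which the text does not specify. As a minimal sketch, assuming the two sequences have already been aligned (gaps written as "-") and that the score is percent identity over the aligned columns, the calculation could look like the following. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption for this sketch: identical non-gap columns count as
    matches, and the denominator is the number of alignment columns
    (columns that are gaps in both sequences are ignored).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical toy alignments, not SEQ ID NO. 4 or SEQ ID NO. 5.
print(percent_identity("ATGC", "ATGA"))
print(percent_identity("ATG-C", "ATGAC"))
```

Other conventions exist (for example, dividing by the length of the shorter sequence), so a reported percentage is only meaningful together with the alignment method and the denominator used.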

An expression vector comprising said DNA molecule. According to the invention, the expression vector may be a prokaryotic or eukaryotic expression vector.

A cell comprising the DNA molecule. According to the invention, the DNA molecule may be present outside the genome of the cell or may be recombined into the genome of the cell.

In a third aspect, the invention provides an mRNA encoding the mutant S protein of the first aspect of the invention.

According to the invention, the mRNA comprises an Open Reading Frame (ORF) encoding an S protein mutant.

According to the invention, the mRNA may comprise, from the 5' end to the 3' end, a 5' cap structure, a 5' UTR, an Open Reading Frame (ORF) encoding an S protein mutant, a 3' UTR and a poly-A tail.

5' cap structure: the 5' cap is typically a modified nucleotide (especially a guanine nucleotide) added at the 5' end of the mRNA molecule; atypical cap analogs are also included. Preferably, the 5' cap is attached through a 5'-5' triphosphate linkage (also written m7GpppN). In some embodiments of the invention, the 5' cap structure is CAP1 (additional methylation of the ribose of the nucleotide adjacent to m7G), CAP2 (additional methylation of the ribose of the second nucleotide downstream of m7G), CAP3 (additional methylation of the ribose of the third nucleotide downstream of m7G), or CAP4 (additional methylation of the ribose of the fourth nucleotide downstream of m7G).

The 5' cap structure can be formed using cap analogs during chemical RNA synthesis or during in vitro transcription of RNA (co-transcriptional capping), or can be formed in vitro using capping enzymes (e.g., commercially available capping kits).

In one embodiment of the invention, the 5' Cap structure is a Cap1 structure.

According to the invention, the 5' UTR may comprise the 5' UTR of β-globin or α-globin, or a homologue or fragment thereof. In some embodiments of the invention, the 5' UTR comprises the 5' UTR of β-globin or a homologue or fragment thereof. In some embodiments of the invention, the 5' UTR comprises a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the 5' UTR nucleotide sequence of β-globin shown in SEQ ID NO. 6. In a specific embodiment of the invention, the 5' UTR comprises the 5' UTR nucleotide sequence of β-globin shown in SEQ ID NO. 6.

In some embodiments of the invention, the 5' UTR further comprises a Kozak sequence. In one embodiment of the invention, the Kozak sequence is GCCACC.

According to the invention, the 3' UTR may comprise the 3' UTR of β-globin or α-globin, or a homologue, fragment or combination of fragments thereof. In some embodiments of the invention, the 3' UTR comprises a nucleotide sequence which is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the fragment of the α2-globin 3' UTR shown in SEQ ID NO. 7. In other embodiments of the invention, the 3' UTR comprises 2 nucleotide sequences joined end to end, each of which is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the fragment of the α2-globin 3' UTR shown in SEQ ID NO. 7. In a specific embodiment of the invention, the 3' UTR comprises 2 copies of the nucleotide sequence shown in SEQ ID NO. 7, joined end to end.

According to the invention, the poly-A tail may be 50-200 nucleotides, preferably 100-150 nucleotides, for example 110-120 nucleotides, for example about 110 nucleotides, about 120 nucleotides, about 130 nucleotides, about 140 nucleotides, about 150 nucleotides in length.
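The 5'-to-3' element order described above can be sketched as a simple concatenation. This is an illustration under stated assumptions, not the construct of SEQ ID NO. 9: the UTR and ORF strings are short hypothetical placeholders, the Kozak sequence is assumed to sit at the 3' end of the 5' UTR immediately before the start codon, and the 5' cap is omitted because it is a chemical modification rather than part of the nucleotide string.

```python
KOZAK = "GCCACC"  # Kozak sequence named in the text


def assemble_mrna(utr5: str, orf: str, utr3_unit: str,
                  utr3_copies: int = 2, poly_a_length: int = 120) -> str:
    """Concatenate 5' UTR + Kozak + ORF + tandem 3' UTR units + poly-A tail.

    The 50-200 nt bound on the poly-A tail follows the text; the ORF
    checks (AUG start, length divisible by 3) are sanity checks added
    for this sketch.
    """
    if not 50 <= poly_a_length <= 200:
        raise ValueError("poly-A tail length outside 50-200 nucleotides")
    if not orf.startswith("AUG") or len(orf) % 3 != 0:
        raise ValueError("ORF must start with AUG and be a multiple of 3")
    return utr5 + KOZAK + orf + utr3_unit * utr3_copies + "A" * poly_a_length


# Hypothetical placeholder elements (not SEQ ID NO. 6, 7 or 8).
example = assemble_mrna(utr5="ACAUUUGC", orf="AUGGCCUAA", utr3_unit="GCUGG")
print(len(example))
```

Keeping the 3' UTR unit and its copy number as separate parameters mirrors the embodiment in which two copies of the same fragment are joined end to end.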

In one embodiment of the invention, the nucleotide sequence of the Open Reading Frame (ORF) of the S protein mutant is a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the nucleotide sequence set forth in SEQ ID NO. 8. The amino acid sequence of the S protein mutant after ORF translation consists of an amino acid sequence shown in SEQ ID NO. 2 and an amino acid sequence shown in SEQ ID NO. 3 which are directly connected from the N end to the C end. In one embodiment of the present invention, the nucleotide sequence of the Open Reading Frame (ORF) of the S protein mutant is shown in SEQ ID NO. 8.

In one embodiment of the invention, the mRNA comprises a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the nucleotide sequence set forth in SEQ ID NO. 9. In one embodiment of the invention, the mRNA comprises the nucleotide sequence set forth in SEQ ID NO. 9.

According to the invention, one or more nucleotides in the mRNA may be modified. For example, one or more nucleotides (e.g., all nucleotides) in the mRNA can each independently be replaced with a naturally occurring nucleotide analog or an artificially synthesized nucleotide analog.

In a fourth aspect, the invention provides a nucleic acid molecule encoding an mRNA according to the third aspect of the invention. The nucleic acid molecule may be in the form of a vector, for example a plasmid vector or a viral vector. In some embodiments, the nucleic acid molecules can be used to prepare the mRNA of the invention by transcription in vitro.

In one embodiment of the invention, the nucleic acid molecule is an in vitro transcription vector comprising operably linked nucleotide sequences encoding a 5' UTR, a 3' UTR and a poly-A tail. The 5' UTR comprises a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the 5' UTR nucleotide sequence of β-globin shown in SEQ ID NO. 6. The 3' UTR comprises 2 nucleotide sequences joined end to end, each with at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homology to the fragment of the α2-globin 3' UTR shown in SEQ ID NO. 7. The poly-A tail may be 50-200 nucleotides, preferably 100-150 nucleotides, for example 110-120 nucleotides, such as about 110 nucleotides, about 120 nucleotides, about 130 nucleotides, about 140 nucleotides, about 150 nucleotides in length.

According to the present invention, the vector for in vitro transcription further comprises a nucleotide sequence encoding an ORF of the S protein mutant. The nucleotide sequence of the Open Reading Frame (ORF) of the S protein mutant is a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the nucleotide sequence set forth in SEQ ID NO. 8.

In a specific embodiment of the invention, the vector for in vitro transcription comprises operably linked nucleotide sequences encoding a 5' UTR, the ORF of the S protein mutant, a 3' UTR and a poly-A tail; the 5' UTR comprises the 5' UTR nucleotide sequence of β-globin shown in SEQ ID NO. 6; the 3' UTR comprises 2 copies of the nucleotide sequence shown in SEQ ID NO. 7 joined end to end; the poly-A tail is 50-200 nucleotides in length; and the nucleotide sequence of the open reading frame (ORF) of the S protein mutant is shown in SEQ ID NO. 8.

According to the present invention, a conventional plasmid can be used as the vector. In some embodiments of the invention, the plasmid is pSP73 or pUC57-kana.

The mRNA of the present invention can be prepared by methods known in the art, including but not limited to chemical synthesis and in vitro transcription. In some embodiments of the invention, a nucleic acid molecule encoding the mRNA is synthesized artificially and cloned into a vector to construct a plasmid for in vitro transcription. The constructed plasmid is transformed into host bacteria for culture and amplification, and the plasmid is then extracted. The extracted plasmid is linearized by restriction enzyme digestion, and the linearized plasmid is used as the template for preparing mRNA by in vitro transcription. An in vitro transcription (IVT) system typically comprises a transcription buffer, nucleoside triphosphates (NTPs), an RNase inhibitor, and a polymerase. The NTPs may be selected from, but are not limited to, natural and non-natural (modified) NTPs. The polymerase may be selected from, but is not limited to, T7 RNA polymerase, T3 RNA polymerase, and mutant polymerases. A cap analog can be added during in vitro transcription to obtain capped mRNA directly; alternatively, capping enzymes and methyltransferases may be used to add the cap structure to the mRNA after in vitro transcription is complete. The resulting mRNA may be purified by methods conventional in the art, such as chemical precipitation, magnetic bead purification, affinity chromatography, and the like.

The S protein mutant of the first aspect of the invention can be directly used as an antigen for preparing vaccines.

The mRNA according to the third aspect of the present invention may be encapsulated together with lipid compounds in a liposome, a lipid nanoparticle or the like, and then formulated into a vaccine.

Accordingly, in a fifth aspect the present invention provides a vaccine composition comprising an S protein mutant according to the first aspect of the invention, or an mRNA according to the third aspect of the invention.

According to the invention, in addition to the S protein mutant or the mRNA and the lipid compounds used to form the liposome or lipid nanoparticle, the vaccine or vaccine composition may contain pharmaceutically acceptable excipients and/or immunological adjuvants.

Lipid nanoparticles can be prepared using methods known in the art. For example, the lipid molecules are dissolved in an organic solvent at a given molar ratio to obtain a lipid mixture solution, and this solution, as the organic phase, is mixed with an aqueous solution of the cargo to be delivered (e.g., nucleic acid), as the aqueous phase, to form the lipid nanoparticles. Lipid nanoparticles may also be prepared using other methods including, but not limited to, spray drying, single and double emulsion solvent evaporation, solvent extraction, phase separation, nanoprecipitation, microfluidics, simple and complex coacervation, and others well known to those of ordinary skill in the art. The preparation method may further comprise a step of separating and purifying the lipid nanoparticles, and may further comprise a step of lyophilizing the lipid nanoparticles.

According to the invention, when lipid nanoparticles are used as the carrier in the vaccine or vaccine composition, the mRNA is located inside the lipid nanoparticles, and the lipid nanoparticles contain, as a proportion of total lipid molecules, 30-60 mol% of ionizable/cationic lipid molecules, 5-30 mol% of neutral lipid molecules, 30-50 mol% of cholesterol lipid molecules, and 0.4-10 mol% of PEGylated lipid molecules; preferably 32-55 mol% of ionizable/cationic lipid molecules, 8-20 mol% of neutral lipid molecules, 35-50 mol% of cholesterol lipid molecules, and 0.5-5 mol% of PEGylated lipid molecules; more preferably 39-51 mol% of ionizable/cationic lipid molecules, 9-16 mol% of neutral lipid molecules, 37-49 mol% of cholesterol lipid molecules, and 1.3-2.7 mol% of PEGylated lipid molecules.
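The three nested composition ranges can be checked mechanically. The sketch below encodes the mol% bounds stated above and reports the narrowest tier that a candidate composition satisfies. The tier names, the function name and the example compositions are illustrative assumptions, as is the requirement that the four components sum to 100 mol%.

```python
# mol% bounds taken from the text; the tier names are labels for this sketch.
RANGES = {
    "more_preferred": {"ionizable": (39, 51), "neutral": (9, 16),
                       "cholesterol": (37, 49), "peg": (1.3, 2.7)},
    "preferred": {"ionizable": (32, 55), "neutral": (8, 20),
                  "cholesterol": (35, 50), "peg": (0.5, 5)},
    "broad": {"ionizable": (30, 60), "neutral": (5, 30),
              "cholesterol": (30, 50), "peg": (0.4, 10)},
}


def narrowest_tier(composition: dict, tolerance: float = 1e-6):
    """Return the narrowest tier whose bounds the composition meets, or None.

    Assumes the four lipid classes make up all lipid molecules, so
    their mol% values must sum to 100.
    """
    if abs(sum(composition.values()) - 100.0) > tolerance:
        raise ValueError("mol% values must sum to 100")
    for tier in ("more_preferred", "preferred", "broad"):
        bounds = RANGES[tier]
        if all(low <= composition[name] <= high
               for name, (low, high) in bounds.items()):
            return tier
    return None


# Hypothetical example compositions (mol%), not formulations from the patent.
print(narrowest_tier({"ionizable": 50, "neutral": 10,
                      "cholesterol": 38.5, "peg": 1.5}))
print(narrowest_tier({"ionizable": 35, "neutral": 20,
                      "cholesterol": 40, "peg": 5}))
```

Testing the tiers from narrowest to broadest makes the nesting explicit: any composition in the more preferred range also lies in the two wider ranges.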

The ionizable/cationic lipid molecules may be selected from commercial molecules such as DLin-MC3-DMA, DOTAP, DOTMA, and the ionizable lipid molecules represented by formula C:

[Structural formula (C)]

wherein each n₃ is independent of the others and may be the same or different, each n₃ being selected from integers from 1 to 8, and each m₃ is independent of the others and may be the same or different, each m₃ being selected from integers from 0 to 8; preferably, each n₃ is selected from integers from 4 to 8 and each m₃ is selected from integers from 4 to 8; preferably, all n₃ are identical to one another and all m₃ are identical to one another.

In one embodiment of the invention, preferably, n₃ is 6 and m₃ is 4, and the molecular structure is as follows:

The neutral lipid molecule may be selected from, for example, the phosphatidylcholine compounds represented by formula EE and the phosphatidylethanolamine compounds represented by formula FF, wherein Ra, Rb, Rc and Rd are each independently selected from linear or branched C10-30 alkyl and linear or branched C10-30 alkenyl, preferably CH₃(CH₂)₁₇CH₂-, CH₃(CH₂)₁₅CH₂-, CH₃(CH₂)₁₃CH₂-, CH₃(CH₂)₁₁CH₂-, CH₃(CH₂)₉CH₂-, CH₃(CH₂)₇CH₂-, CH₃(CH₂)₇-CH=CH-(CH₂)₇-, CH₃(CH₂)₄CH=CHCH₂CH=CH(CH₂)₇-, CH₃(CH₂)₇-CH=CH-(CH₂)₉-.

The cholesterol lipid molecule may be selected from cholesterol, 5-heptadecylresorcinol and cholesterol hemisuccinate, for example.

The pegylated lipid molecule comprises a lipid moiety and a PEG-based polymer moiety, denoted as "lipid moiety-PEG-number average molecular weight", said lipid moiety being a diacylglycerol or diacylglycerol amide selected from dilauroylglycerol, dimyristoylglycerol, dipalmitoylglycerol, distearoyl glycerol, dilaurylglycerol amide, dimyristoylglycerol amide, dipalmitoylglycerol amide, distearoyl glyceramide, 1, 2-distearoyl-sn-glycerol-3-phosphoethanolamine, 1, 2-dimyristoyl-sn-glycerol-3-phosphoethanolamine; the PEG has a number average molecular weight of about 130 to about 50,000, such as about 150 to about 30,000, about 150 to about 20,000, about 150 to about 15,000, about 150 to about 10,000, about 150 to about 6,000, about 150 to about 5,000, about 150 to about 4,000, about 150 to about 3,000, about 300 to about 3,000, about 1,000 to about 3,000, about 1,500 to about 2,500, such as about 2000.

In the vaccine composition, the mass ratio of the total mass of lipid molecules to mRNA is 5-20:1.
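As a worked illustration of the 5-20:1 mass ratio, the sketch below computes the window of total lipid mass that corresponds to a given mRNA mass and checks whether a formulation falls inside it. The function names and the 100 µg example are hypothetical.

```python
RATIO_MIN, RATIO_MAX = 5.0, 20.0  # total lipid : mRNA mass ratio from the text


def lipid_mass_window(mrna_mass_ug: float) -> tuple:
    """Return the (minimum, maximum) total lipid mass in µg for an mRNA mass."""
    return mrna_mass_ug * RATIO_MIN, mrna_mass_ug * RATIO_MAX


def ratio_in_range(total_lipid_ug: float, mrna_mass_ug: float) -> bool:
    """Check whether the lipid:mRNA mass ratio lies within 5-20:1."""
    return RATIO_MIN <= total_lipid_ug / mrna_mass_ug <= RATIO_MAX


# Hypothetical example: 100 µg of mRNA.
print(lipid_mass_window(100.0))
print(ratio_in_range(1200.0, 100.0))
```

Note that this ratio is by mass, whereas the lipid composition is specified in mol%, so converting between the two requires the molecular weight of each lipid actually used.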

The S protein mutant of the first aspect of the invention or the mRNA of the third aspect of the invention can be used in the preparation of vaccines.

According to the invention, the vaccine or vaccine composition may be used for the prevention and/or treatment of 2019-nCoV infection or of a disease or disorder associated with 2019-nCoV infection, where the 2019-nCoV may be the wild-type strain or any variant strain thereof. In one embodiment of the invention, the 2019-nCoV is the B.1.617.2 variant strain.

The diseases or conditions associated with 2019-nCoV infection include, but are not limited to, pneumonia caused by 2019-nCoV infection; headache, nasal obstruction, runny nose, cough and/or airway inflammation caused by 2019-nCoV infection; disseminated intravascular coagulation caused by 2019-nCoV infection; and sepsis caused by 2019-nCoV infection.

In a sixth aspect, the invention provides the use of a DNA molecule according to the second aspect of the invention for the preparation of a mutant S protein, and the use of a nucleic acid molecule according to the fourth aspect of the invention for the preparation of an mRNA according to the third aspect of the invention.

List of sequences according to the invention:

The ionizable lipid compounds of formula C of the present invention may be synthesized using methods known in the art, for example, by reacting one or more equivalents of an amine with one or more equivalents of an epoxy-terminated compound under suitable conditions. The synthesis of the ionizable lipid compounds is performed with or without a solvent, and may be performed at an elevated temperature in the range of 25-100 °C. The resulting ionizable lipid compound may optionally be purified.

In some embodiments of the invention, the ionizable lipid compounds of the invention may be prepared using the following general preparation methods.

Step 1: reduction

The carboxyl group of the compound A1 is reduced to a hydroxyl group in the presence of a reducing agent to obtain a compound A2. Examples of reducing agents include, but are not limited to, lithium aluminum hydride, diisobutylaluminum hydride, and the like. Examples of the solvent used in the reaction include, but are not limited to, ethers (such as diethyl ether, tetrahydrofuran, dioxane, etc.), halogenated hydrocarbons (such as chloroform, methylene chloride, dichloroethane, etc.), hydrocarbons (such as n-pentane, n-hexane, benzene, toluene, etc.), and mixed solvents of two or more of these solvents.

Step 2: oxidation

The hydroxyl group of the compound A2 is oxidized to an aldehyde group in the presence of an oxidizing agent to obtain a compound A3. Examples of oxidizing agents include, but are not limited to, 2-iodoxybenzoic acid (IBX), pyridinium chlorochromate (PCC), pyridinium dichromate (PDC), Dess-Martin periodinane, manganese dioxide, and the like. Examples of the solvent used in the reaction include, but are not limited to, halogenated hydrocarbons (such as chloroform, methylene chloride, dichloroethane, etc.), hydrocarbons (such as n-pentane, n-hexane, benzene, toluene, etc.), nitriles (such as acetonitrile, etc.), and mixed solvents of two or more of these solvents.

Step 3: halogenation-reduction

First, the α-hydrogen of the aldehyde in compound A3 is halogenated with a halogenating agent under acidic conditions to obtain an α-halogenated aldehyde intermediate; the aldehyde group of the α-halogenated aldehyde is then reduced to a hydroxyl group in the presence of a reducing agent to obtain the compound A4. Examples of reagents that provide acidic conditions include, but are not limited to, DL-proline. Examples of halogenating agents include, but are not limited to, N-chlorosuccinimide (NCS) and N-bromosuccinimide (NBS). Examples of reducing agents include, but are not limited to, sodium borohydride, sodium cyanoborohydride, and sodium triacetoxyborohydride.

Step 4: epoxidation

The compound A4 is subjected to intramolecular nucleophilic substitution reaction in the presence of a base to obtain an epoxy compound A5. Examples of bases include, but are not limited to, hydroxides or hydrides of alkali metals, such as sodium hydroxide, potassium hydroxide, and sodium hydride. Examples of solvents used in the reaction include, but are not limited to, mixtures of dioxane and water.

Step 5: ring opening reaction

Compound A5 is ring-opened with an amine (e.g., N,N-bis(2-aminoethyl)methylamine) to obtain the final compound. Examples of the solvent for the reaction include, but are not limited to, ethanol, methanol, isopropanol, tetrahydrofuran, chloroform, hexane, toluene, diethyl ether, etc.

The raw material A1 in the preparation method can be obtained commercially or synthesized by a conventional method.

Description of the terminology:

In the present application, the terms novel coronavirus, 2019-nCoV and SARS-CoV-2 have the same meaning.

In the present description and claims, conventional single-letter or three-letter codes for amino acid residues are used. Unless otherwise indicated, amino acid sequences are written in an amino-to-carboxyl orientation from left to right.

For ease of reference, the S protein mutants of the present application are described using the following naming convention: original amino acid, position, substituted amino acid. According to this convention, for example, substitution of asparagine with alanine at position 30 is expressed as Asn30Ala or N30A; deletion of asparagine at the same position is expressed as Asn30* or N30*; insertion of another amino acid residue, e.g., lysine, is expressed as Asn30AsnLys or N30NK; deletion of a consecutive stretch of amino acid residues, e.g., residues 242-244, is expressed as (242-244)*, Δ(242-244) or 242_244del. If an S protein mutant contains a "deletion" relative to other S protein parents and an insertion is made at that position, this is expressed as *36Asp or *36D, indicating insertion of aspartic acid at the deleted position 36. When one or more alternative amino acid residues may be introduced at a given position, this is expressed as N30A,E, or as N30A or N30E. In addition, when a position suitable for modification is identified herein without any particular modification being suggested, it is to be understood that any amino acid residue may be substituted for the amino acid residue at that position. Thus, for example, where reference is made to modifying an asparagine at position 30 without further specification, it is to be understood that the asparagine may be deleted or substituted with any one of the other amino acids, i.e., R, D, A, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y, V. Further, "N30X" refers to any one of the following substitutions: N30R, N30D, N30C, N30Q, N30E, N30G, N30H, N30I, N30L, N30K, N30M, N30F, N30P, N30S, N30T, N30W, N30Y, or N30V; or abbreviated as: N30R,D,C,Q,E,G,H,I,L,K,M,F,P,S,T,W,Y,V.
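The single-letter form of this naming convention can be read mechanically. The following is a minimal sketch, assuming that a trailing asterisk marks a deletion (the usual form of this nomenclature) and covering only single-position substitutions, deletions and insertions; the names `Mutation`, `parse_mutation` and `apply_mutation` are hypothetical and not part of the application.

```python
import re
from dataclasses import dataclass

# The 20 standard amino acids in single-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

@dataclass(frozen=True)
class Mutation:
    kind: str         # "substitution", "deletion" or "insertion"
    position: int     # 1-based position in the parent sequence
    original: str     # residue present in the parent
    replacement: str  # residue(s) present afterwards; "" for a deletion

# original residue, position, then either new residue(s) or "*" for deletion
_PATTERN = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}]+|\*)$")

def parse_mutation(code: str) -> Mutation:
    """Parse codes such as N30A, N30* or N30NK (single-letter form only)."""
    match = _PATTERN.match(code)
    if match is None:
        raise ValueError(f"unsupported mutation code: {code!r}")
    original, position, new = match.group(1), int(match.group(2)), match.group(3)
    if new == "*":
        return Mutation("deletion", position, original, "")
    if len(new) == 1:
        return Mutation("substitution", position, original, new)
    if new[0] == original:
        # e.g. N30NK: the original residue is kept and K is inserted after it
        return Mutation("insertion", position, original, new)
    raise ValueError(f"unsupported mutation code: {code!r}")

def apply_mutation(sequence: str, mutation: Mutation) -> str:
    """Apply one parsed mutation to a parent sequence (1-based positions)."""
    index = mutation.position - 1
    if index >= len(sequence) or sequence[index] != mutation.original:
        raise ValueError("parent residue does not match the mutation code")
    return sequence[:index] + mutation.replacement + sequence[index + 1:]
```

For instance, `parse_mutation("N30A")` is classified as a substitution and `parse_mutation("N30NK")` as an insertion. Range deletions such as 242_244del and the *36D form are deliberately left out of this sketch.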

Domain: as used herein, the term "domain" when referring to a polypeptide refers to a motif of the polypeptide that has one or more identifiable structural or functional features or properties (e.g., binding capacity, serving as a site for protein-protein interaction).

The term "protein mutant" or "polypeptide mutant" refers to a molecule whose amino acid sequence differs from a native or reference sequence. Amino acid sequence mutants may have substitutions, deletions and/or insertions, etc., at certain positions within the amino acid sequence, as compared to the native or reference sequence. Typically, the mutant will have at least about 50% identity, at least about 60% identity, at least about 70% identity, at least about 80% identity, at least about 90% identity, at least about 95% identity, at least about 99% identity to the native or reference sequence.

In the present description and claims, nucleotides are referred to by their commonly accepted single-letter codes. Unless otherwise indicated, nucleotide sequences are written in the 5' to 3' direction from left to right. Nucleobases are represented herein by the commonly known single-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Thus, A represents adenine, C represents cytosine, G represents guanine, T represents thymine, and U represents uracil. The skilled artisan will appreciate that the T bases in the codons disclosed herein are present in DNA, whereas they are replaced by U bases in the corresponding RNA. For example, a codon-nucleotide sequence disclosed herein in the form of DNA, such as a vector or an in vitro transcription (IVT) template, has its T bases transcribed into U bases in the corresponding transcribed mRNA. In this regard, both codon-optimized DNA sequences (comprising T) and their corresponding mRNA sequences (comprising U) are considered codon-optimized nucleotide sequences of the present disclosure. Those skilled in the art will also appreciate that equivalent codon patterns can be generated by substituting one or more bases with non-natural bases.
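The T-to-U correspondence between a DNA coding sequence and its mRNA can be illustrated with a minimal sketch. It assumes the input is the coding (sense) strand written 5' to 3' and contains only unmodified bases; the function name `dna_to_mrna` is hypothetical.

```python
def dna_to_mrna(coding_strand: str) -> str:
    """Return the mRNA sequence corresponding to a DNA coding strand (5'->3').

    The mRNA has the same sequence as the coding strand, with every T
    replaced by U.
    """
    sequence = coding_strand.upper()
    unexpected = set(sequence) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected symbols in DNA sequence: {sorted(unexpected)}")
    return sequence.replace("T", "U")
```

For example, the codons ATG GCT in a DNA template correspond to AUG GCU in the transcribed mRNA.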

The terms "nucleic acid sequence", "nucleotide sequence" or "polynucleotide sequence" are used interchangeably and refer to a contiguous nucleic acid sequence. The sequence may be single-or double-stranded DNA or RNA, such as mRNA.

"nucleotide sequence encoding …" refers to a nucleic acid (e.g., mRNA or DNA molecule) encoding a polypeptide. The coding sequence may further comprise initiation and termination signals operably linked to regulatory elements including promoters and polyadenylation signals capable of directing expression in cells of the individual or mammal to which the nucleic acid is administered.

Homology: as used herein, the term "homology" refers to the overall relatedness between polymer molecules, e.g., between nucleic acid molecules (e.g., DNA molecules and/or RNA molecules) and/or between polypeptide molecules. In general, the term "homology" means the evolutionary relationship between two molecules. Thus, two homologous molecules will have a common evolutionary ancestor. In the context of the present disclosure, the term homology includes identity and similarity.

In some embodiments, polymer molecules are considered "homologous" to each other if at least 25%,30%,35%,40%,45%,50%,55%,60%,65%,70%,75%,80%,85%,90%,95%,96%,97%,98%,99% or 100% of the monomers in the molecule are identical (identical monomers) or similar (conservative substitutions). The term "homologous" necessarily refers to a comparison between at least two sequences (polynucleotide or polypeptide sequences).

Identity: as used herein, the term "identity" refers to overall monomer conservation between polymer molecules, e.g., between polynucleotide molecules (e.g., DNA molecules and/or RNA molecules) and/or between polypeptide molecules. For example, the percent identity of two polynucleotide sequences can be calculated by aligning the two sequences for optimal comparison purposes (e.g., gaps can be introduced in one or both of the first and second nucleic acid sequences for optimal alignment, and non-identical sequences can be disregarded for comparison purposes). In certain embodiments, the length of the sequences aligned for comparison purposes is at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of the length of the reference sequence.

Suitable software programs are available from a variety of sources and can be used to align both protein and nucleotide sequences. For example, one suitable program for determining percent sequence identity is bl2seq, which is part of the BLAST suite of programs available from the National Center for Biotechnology Information of the United States government (blast.ncbi.nlm.nih.gov). Other suitable programs are part of the EMBOSS bioinformatics program suite, for example Needle, Stretcher, Water or Matcher, and are available from the European Bioinformatics Institute (EBI) at www.ebi.ac.uk/Tools/psa. Sequence alignment may also be performed using methods known in the art, such as MAFFT, Clustal (ClustalW, Clustal X or Clustal Omega), MUSCLE, and the like.
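As an illustration of the percent identity calculation, the sketch below counts identical positions in two sequences that have already been aligned (gaps written as "-"). It assumes one common definition, identical positions divided by aligned columns; alignment programs differ in the denominator they use, so this is not necessarily the definition intended here. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a column counts
    as identical only if both sequences carry the same non-gap symbol.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)
```

With this definition, a gap opposite a residue lowers the identity in the same way as a mismatch.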

The terms "coding region" and "region encoding" refer to the open reading frame (ORF) in a polynucleotide that, when expressed, produces a polypeptide or protein.

"operably linked" refers to a functional linkage between two or more molecules, constructs, transcripts, entities, moieties, and the like.

Expression: as used herein, "expression" of a nucleic acid sequence refers to one or more of the following events: (1) Generating an mRNA template from the DNA sequence (e.g., by transcription); (2) Processing of mRNA transcripts (e.g., by splicing, editing, 5 'cap formation, and/or 3' end processing); (3) translating the mRNA into a polypeptide or protein; and (4) post-translational modification of the polypeptide or protein.

5' cap structure: the 5' cap is typically a modified nucleotide (especially a guanine nucleotide) added at the 5' end of the mRNA molecule, and also includes atypical cap analogs. The 5' cap may be added using a 5'-5'-triphosphate linkage (also known as m7GpppN). Additional examples of 5' cap structures include glyceryl, inverted deoxy abasic residues (moieties), 4',5'-methylene nucleotides, 1-(β-D-erythrofuranosyl) nucleotides, 4'-thio nucleotides, carbocyclic nucleotides, 1,5-anhydrohexitol nucleotides, L-nucleotides, α-nucleotides, modified base nucleotides, threo-pentofuranosyl nucleotides, acyclic 3',4'-seco nucleotides, acyclic 3,4-dihydroxybutyl nucleotides, acyclic 3,5-dihydroxypentyl nucleotides, 3'-3'-inverted nucleotide moieties, 3'-3'-inverted abasic moieties, 3'-2'-inverted nucleotide moieties, 3'-2'-inverted abasic moieties, 1,4-butanediol phosphates, 3'-phosphoramidates, hexyl phosphates, aminohexyl phosphates, 3'-phosphorothioates, phosphorodithioates, or bridged or unbridged methylphosphonate moieties. These modified 5' cap structures can be used in the context of the present invention to modify the mRNA sequences of the present invention.

Cap analogue: a cap analogue is a non-polymerizable dinucleotide that functions as a cap, in that it facilitates translation or localization and/or prevents degradation of an RNA molecule when incorporated at the 5' end of that molecule. Non-polymerizable means that the cap analogue is incorporated only at the 5' end, because it does not have a 5' triphosphate and therefore cannot be extended in the 3' direction by a template-dependent RNA polymerase. Cap analogues include, but are not limited to, chemical structures selected from the group consisting of: m7GpppG, m7GpppA, m7GpppC; unmethylated cap analogues (e.g., GpppG); dimethylated cap analogues (e.g., m2,7GpppG); trimethylated cap analogues (e.g., m2,2,7GpppG); dimethylated symmetrical cap analogues (e.g., m7Gpppm7G); and anti-reverse cap analogues (e.g., ARCA; m7,2'OmeGpppG, m7,2'dGpppG, m7,3'OmeGpppG, m7,3'dGpppG, and tetraphosphate derivatives thereof) (Stepinski et al., 2001. RNA 7(10): 1486-95).

Naturally occurring or synthetic nucleotide analogs are selected, for example, from pseudouridine, 2-thiouridine, 5-methyluridine, 5-methylcytidine, N6-methyladenosine, N1-methylpseudouridine, 5-ethynyluridine, pseudouridine triphosphate (pseudo-UTP), N1-methylpseudouridine triphosphate (N1-methyl-pseudo-UTP), 5-ethynyluridine triphosphate (5-ethynyl-UTP), 5-methylcytidine triphosphate (5-methyl-CTP), and the like.

By "pharmaceutically acceptable excipient" is meant any ingredient other than the S protein mutants or mRNAs described herein that is substantially non-toxic and non-inflammatory in the patient, including, but not limited to, any and all solvents, dispersion media or other liquid carriers, dispersing or suspending aids, surfactants, isotonic agents, thickening or emulsifying agents, preservatives, binders, lubricants, antioxidants, diluents, granulating and/or dispersing agents, antimicrobial or antifungal agents, osmolality adjusting agents, pH adjusting agents, colorants, sweeteners or flavoring agents, stabilizers, buffers, chelating agents, cryoprotectants, and/or fillers, as appropriate for the particular dosage form desired. Various excipients for formulating pharmaceutical compositions and techniques for preparing the compositions are known in the art. Exemplary antimicrobial or antifungal agents include, but are not limited to, benzalkonium chloride, benzethonium chloride, methylparaben, ethylparaben, benzoic acid, hydroxybenzoic acid, potassium or sodium benzoate, potassium or sodium sorbate, sodium propionate, sorbic acid, and the like, and combinations thereof. Exemplary preservatives include, but are not limited to, beta-carotene, citric acid, ascorbic acid, butylated hydroxyanisole, sodium lauryl sulfate (SLS), vitamin A, vitamin C, vitamin E, sodium lauryl ether sulfate (SLES), and the like, and combinations thereof. Exemplary buffers to control pH include, but are not limited to, sodium phosphate, sodium succinate, histidine (or histidine-HCl), sodium malate, sodium citrate, sodium carbonate, and the like, and/or combinations thereof. Exemplary cryoprotectants include, but are not limited to, trehalose, lactose, glycerol, mannitol, sucrose, dextrose, and the like, and combinations thereof. Exemplary bulking agents include, but are not limited to, mannitol, glycine, lactose, sucrose, trehalose, raffinose, and combinations thereof.

"And/or" is to be taken as a specific disclosure of each of the two specified features or components with or without the other. Thus, the term "and/or" as used in phrases such as "A and/or B" is intended to include "A and B", "A or B", "A" (alone) and "B" (alone). Likewise, the term "and/or" as used in phrases such as "A, B and/or C" is intended to encompass each of the following aspects: A, B and C; A, B or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).

"comprising" and "including" have the same meaning and are intended to be open and allow for the inclusion of additional elements or steps but not required. When the terms "comprising" or "including" are used herein, the terms "consisting of" and/or "consisting essentially of … …" are therefore also included and disclosed.

"about": the term "about" as used in conjunction with numerical values throughout the specification and claims means a range of accuracy that is familiar and acceptable to those skilled in the art. Typically, this accuracy is in the interval of + -10%.

Drawings

Fig. 1: schematic of the primary structure and the pre-fusion conformation of the 2019-nCoV S protein. Part A of the figure is a schematic of the primary structure of the S protein: SS, signal sequence (signal peptide); NTD, N-terminal domain; RBD, receptor binding domain; S2', S2' protease cleavage site; FP, fusion peptide; HR1, heptad repeat 1; CH, central helix; CD, connector domain; HR2, heptad repeat 2; TM, transmembrane domain; CT, cytoplasmic tail. Arrows mark protease cleavage sites. The S1 subunit lies before the S1/S2 site and the S2 subunit after it. Part B of the figure shows a side view and a top view of the pre-fusion structure of the S protein.

Fig. 2: statistical graphs of the amount of intracellular protein expression of mRNA prepared from different in vitro transcription vectors using Firefly Luc as reporter protein.

Fig. 3: integrity analysis of the B.1.617.2 mRNA on a 2100 Bioanalyzer using an RNA 6000 Nano chip.

Fig. 4: ELISA detection of the expression level of the S protein mutant in the supernatant of CHO-K1 cells transfected with nucleic acid.

Fig. 5: three-dimensional structure model diagram of the S protein mutant.

Fig. 6: S protein expression levels in supernatants after transfection with the mRNA of the invention encapsulated in II-37 (also known as C2) lipid nanoparticles.

Fig. 7: BALB/c mouse immunization strategy

Fig. 7-1: detection results of specific IgG binding antibodies after immunization of BALB/c mice with the mRNA vaccine. BALB/c mice (n=4) were injected intramuscularly with different doses of vaccine or phosphate buffered saline (PBS, control, n=4) on day 0 and day 21. Blood was collected on day 35, and the concentration of IgG binding antibodies specific for the S protein of the SARS-CoV-2 B.1.617.2 strain was determined by ELISA. Each dot represents a single animal; dots with the same value overlap. The numbers shown in the figure are medians.

Fig. 7-2: results of the ACE2 competitive inhibition assay after immunization of BALB/c mice with the mRNA vaccine. BALB/c mice (n=4) were injected intramuscularly with different doses of vaccine or phosphate buffered saline (PBS, control, n=4) on day 0 and day 21. Blood was collected on day 35, and the titer of neutralizing antibodies in the blood sample that compete with ACE2 for binding to the S protein of the SARS-CoV-2 B.1.617.2 strain was measured; the results are expressed as inhibition (%). The numbers shown in the figure are medians, and 20% is the inhibition cut-off.

Fig. 7-3: detection results of pseudovirus neutralizing antibodies after immunization of BALB/c mice with the mRNA vaccine. BALB/c mice (n=4) were injected intramuscularly with different doses of vaccine or phosphate buffered saline (PBS, control, n=4) on day 0 and day 21. Blood was collected on day 35, and the pVNT50 against SARS-CoV-2 B.1.617.2 strain pseudovirus was determined by a reporter gene method (Vazyme). The numbers shown in the figure are medians.

Fig. 8: rhesus monkey immunization strategy schematic

Fig. 8-1: detection results of specific IgG binding antibodies after immunization of rhesus monkeys with the mRNA vaccine. Female and male rhesus monkeys (9-22 years) were injected intramuscularly with 10 μg, 30 μg or 100 μg of mRNA vaccine (n=3) on day 0 and day 28; the control group received physiological saline (n=2). Blood was collected on day 35, and the concentration of IgG binding antibodies specific for the S protein of the SARS-CoV-2 B.1.617.2 strain was determined by ELISA. The numbers shown in the figure are medians.

Fig. 8-2: results of the ACE2 competitive inhibition assay after immunization of rhesus monkeys with the mRNA vaccine. Female and male rhesus monkeys (9-22 years) were injected intramuscularly with 10 μg, 30 μg or 100 μg of mRNA vaccine (n=3) on day 0 and day 28; the control group received physiological saline (n=2). Blood was collected on day 35, and the titer of neutralizing antibodies in the blood sample that compete with ACE2 for binding to the S protein of the B.1.617.2 strain was measured; the results are expressed as inhibition (%). The numbers shown in the figure are medians, and 20% is the inhibition cut-off.

Fig. 8-3: detection results of pseudovirus neutralizing antibodies after immunization of rhesus monkeys with the mRNA vaccine. Female and male rhesus monkeys (9-22 years) were injected intramuscularly with 10 μg, 30 μg or 100 μg of mRNA vaccine (n=3) on day 0 and day 28; the control group received physiological saline (n=2). Blood was collected on day 35, and the pVNT50 against SARS-CoV-2 B.1.617.2 strain pseudovirus was determined by a reporter gene method (Vazyme). The numbers shown in the figure are medians.

Fig. 9: schematic of the H11-K18-hACE2 transgenic mouse immunization strategy

Fig. 9-1: detection results of specific IgG binding antibodies after immunization of H11-K18-hACE2 transgenic mice with the mRNA vaccine. Mice (n=10) were injected intramuscularly with different doses of mRNA vaccine or physiological saline (control group, n=10) on day 0 and day 25; the challenge control group was not injected (n=8). Blood samples were collected on day 32, and the concentration of IgG binding antibodies specific for the S protein of the SARS-CoV-2 B.1.617.2 strain was determined by ELISA. Each dot represents a single animal; dots with the same value overlap. The numbers shown in the figure are medians. P-values were analyzed using one-way analysis of variance (ns, P > 0.05; **, P < 0.01; ***, P < 0.001; ****, P < 0.0001).

Fig. 9-2: neutralizing antibody titer results after immunization of H11-K18-hACE2 transgenic mice with the mRNA vaccine. Blood samples were collected on day 32, and the titer of neutralizing antibodies in the blood samples that compete with ACE2 for binding to the S protein of the B.1.617.2 strain was measured; the results are expressed as inhibition (%). The numbers shown in the figure are medians, and 20% is the inhibition cut-off.

Fig. 10: statistical graph of the viral load in each tissue in the challenge test after immunization of H11-K18-hACE2 transgenic mice with the mRNA vaccine

Detailed Description

The technical scheme of the invention will be further described in detail below with reference to specific embodiments. It is to be understood that the following examples are illustrative only and are not to be construed as limiting the scope of the invention. All techniques implemented based on the above description of the invention are intended to be included within the scope of the invention.

Unless otherwise indicated, the starting materials and reagents used in the following examples were either commercially available or can be prepared by known methods. The experimental methods are conventional molecular biology methods in the art and can be carried out by reference to a molecular biology laboratory manual or the kit product instructions.

Example 1 Efficiency comparison experiment of the IVT vectors of the present invention

In this example, Firefly Luc is used as the reporter protein. Different IVT vectors are constructed for in vitro transcription of mRNA encoding Firefly Luc, and the translation efficiencies of the synthesized mRNAs with different sequence features are compared.

The coding sequence of Firefly Luc was cloned into the multiple cloning site of the corresponding vector by plasmid construction techniques conventional in the art to obtain vectors numbered IVT1, IVT2, IVT3 and IVT4, respectively. The corresponding Firefly Luc mRNA samples were then prepared from these vectors by in vitro transcription using a T7 in vitro transcription kit (cat# AM1344, available from Thermo Fisher).

The vectors IVT1 to IVT4 are all derived from the commercial vector pSP73, with the following sequences inserted at the XhoI/NdeI restriction sites of pSP73. IVT1 contains no UTR sequences and has a polyA tail of 64 A. IVT2 uses the 5' UTR shown in SEQ ID NO. 6 and the 3' UTR sequence GCTCGCTTTCTTGCTGTCCAATTTCTATTAAAGGTTCCTTTGTTCCCTAAGTCCAACTACTAAACTGGGGGATATTATGAAGGGCCTTGAGCATCTGGATTCTGCCTAATAAAAAACATTTATTTTCATTGC (the 3' UTR sequence of beta globin), with a polyA length of 120 A. IVT3 uses the 5' UTR shown in SEQ ID NO. 6 and the 3' UTR sequence shown in SEQ ID NO. 7, with a polyA length of 120 A. IVT4 uses the 5' UTR shown in SEQ ID NO. 6 and two tandem repeats of the 3' UTR sequence shown in SEQ ID NO. 7, with a polyA length of 120 A. A multiple cloning site comprising the common restriction sites HindIII and EcoRI is inserted between the 5' UTR and 3' UTR sequences, and the coding sequence of Firefly Luc is cloned between the HindIII and EcoRI sites. All vectors were constructed by GenScript by gene synthesis.

Each Firefly Luc mRNA sample was transfected into CHO cells using Lipofectamine 2000 (cat# 11668030, available from Thermo Fisher) as the transfection reagent, and luciferase was detected using the Dual-Lumi™ dual luciferase reporter gene assay kit (cat# RG088S, available from Shanghai Beyotime Biotechnology Co., Ltd.). As a positive control, the DNA of Firefly Luc was inserted into the psicheck2 plasmid (psicheck2 plasmid, cat# 60908-6151, available from Beijing Tian Enzem Gene technologies Co., Ltd.). The procedure was as follows: on the first day, CHO cells were seeded into 96-well plates at 1.5×10⁴ cells per well and cultured overnight in F12K + 10% FBS; on the second day, the medium was changed to serum-free F12K medium prior to transfection, and mRNA or DNA was transfected into the CHO cells using Lipofectamine 2000, with 100 ng of nucleic acid and 0.3 μl of liposome per well in a total volume of 100 μl per well, and the cells were cultured overnight; on the third day, the serum-free medium was changed to complete medium (F12K + 10% FBS) and culture was continued for 24 hours; on the fourth day (48 hours post-transfection), the Firefly Luc luminescence values were measured.

The results are shown in FIG. 2. In the figure, "DNA" is the positive control (psicheck2 plasmid carrying the Firefly Luc gene); "IVT1-Luc", "IVT2-Luc", "IVT3-Luc" and "IVT4-Luc" denote the Firefly Luc mRNAs transcribed in vitro from the vectors IVT1, IVT2, IVT3 and IVT4, respectively; and "negative control" is the negative control. As can be seen from FIG. 2, at the same mRNA transfection level, the protein expression level of IVT4-Luc is 2-3 times higher than that of the other three mRNAs, which indicates that IVT4-Luc has good stability and high translation efficiency.
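When several wells receive the same sample, the per-well amounts stated above (100 ng nucleic acid, 0.3 μl transfection reagent, 100 μl total volume) can be scaled to a master mix. The sketch below is only an arithmetic aid; the function name `transfection_mix` and the 10% pipetting overage are assumptions, not part of the described procedure.

```python
# Per-well amounts taken from the procedure described above.
NUCLEIC_ACID_NG_PER_WELL = 100.0
REAGENT_UL_PER_WELL = 0.3
TOTAL_VOLUME_UL_PER_WELL = 100.0

def transfection_mix(n_wells: int, overage: float = 0.10) -> dict:
    """Scale the per-well transfection amounts to n_wells.

    `overage` is a hypothetical allowance for pipetting loss
    (0.10 means 10% extra); set it to 0.0 for exact amounts.
    """
    if n_wells <= 0:
        raise ValueError("n_wells must be positive")
    if overage < 0:
        raise ValueError("overage must not be negative")
    factor = n_wells * (1.0 + overage)
    return {
        "nucleic_acid_ng": NUCLEIC_ACID_NG_PER_WELL * factor,
        "reagent_ul": REAGENT_UL_PER_WELL * factor,
        "total_volume_ul": TOTAL_VOLUME_UL_PER_WELL * factor,
    }
```

For triplicate wells without overage, `transfection_mix(3, overage=0.0)` gives 300 ng nucleic acid in 300 μl total volume.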

Example 2 Preparation of B.1.617.2 mRNA and translation thereof

1. A nucleic acid sequence encoding the mRNA shown in SEQ ID NO: 8 was synthesized and cloned downstream of the T7 promoter in a pUC57-kana vector that had previously been engineered to contain sequences encoding SEQ ID NO: 6, a Kozak sequence, two tandem copies of SEQ ID NO: 7, and a polyA tail. The nucleic acid sequence encoding the mRNA shown in SEQ ID NO: 8 was cloned into the multiple cloning site between the Kozak sequence and the two tandem copies of SEQ ID NO: 7, giving a plasmid for in vitro transcription.

2. The constructed plasmid was transformed into Escherichia coli DH5α, amplified by culture, and extracted.

3. The extracted plasmid was linearized by digestion with the restriction enzyme SpeI, whose site lies immediately downstream of the polyA tail.

4. Using the linearized plasmid as a template, mRNA was prepared by in vitro transcription (in vitro transcription kit A45975, Thermo). The sequence of the mRNA is shown in SEQ ID NO: 9, and the mRNA is hereinafter abbreviated as B.1.617.2 mRNA. Translation of this mRNA yields the S protein mutant, whose amino acid sequence consists of the amino acid sequences of SEQ ID NO: 2 and SEQ ID NO: 3 joined directly in that order from the N-terminus to the C-terminus. After in vitro transcription, a Cap1 structure was added to the mRNA using a capping enzyme and 2'-O-methyltransferase.

5. Purification of mRNA: the mRNA stock solution obtained was purified by affinity chromatography.

6. Quality control of mRNA: the integrity of the prepared mRNA was analyzed on a 2100 Bioanalyzer using an RNA 6000 Nano chip. The results are shown in FIG. 3: the transcribed mRNA appeared as a single band, and no significant degradation was observed.
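The order of elements in the in vitro transcription template described in step 1 (T7 promoter, 5' UTR, Kozak sequence, coding sequence, two tandem 3' UTR copies, polyA tail, SpeI site) can be sketched as simple string assembly. This is an illustrative sketch only: the T7 promoter and Kozak sequences are commonly used ones assumed here because the text does not state them, the polyA length is a free parameter, and any UTR or ORF strings passed in are placeholders rather than SEQ ID NO: 6, 7 or 8. The function name `build_ivt_template` is hypothetical.

```python
# Assumed element sequences (not given in the text above).
T7_PROMOTER = "TAATACGACTCACTATAGG"  # commonly used T7 promoter sequence
KOZAK = "GCCACC"                     # common Kozak context placed before ATG
SPEI_SITE = "ACTAGT"                 # SpeI recognition site used for linearization

STOP_CODONS = {"TAA", "TAG", "TGA"}

def build_ivt_template(utr5: str, orf: str, utr3: str,
                       polya_len: int, utr3_copies: int = 2) -> str:
    """Concatenate template elements 5'->3' as DNA (coding strand).

    The ORF is checked to start with ATG, to have a length that is a
    multiple of three, and to end with a stop codon.
    """
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF must start with ATG")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must end with a stop codon")
    parts = [
        T7_PROMOTER,
        utr5.upper(),
        KOZAK,
        orf,
        utr3.upper() * utr3_copies,  # tandem 3' UTR copies
        "A" * polya_len,             # polyA tail encoded in the template
        SPEI_SITE,                   # cut here to allow run-off transcription
    ]
    return "".join(parts)
```

Placing the restriction site directly after the polyA stretch means that run-off transcription of the linearized plasmid ends close to the end of the tail.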

In addition, spike fragments were excised from the commercial plasmid pCMV3-Spike by restriction enzymes HindIII and EcoRI, and inserted between the HindIII and EcoRI sites of the IVT1 vector of example 1 to give an IVT1-Spike plasmid. And then carrying out point mutation on the plasmid to obtain IVT1-spike-D614G plasmid, and carrying out in vitro transcription by taking the plasmid as a template to obtain spike-D614G mRNA, thereby expressing the full-length S protein containing the D614G mutation.

B.1.617.2 mRNA cellular level expression assay: the CHO-K1 cell line was used as the expression system, and mRNA was transfected with Lipofectamine MessengerMAX Reagent (Invitrogen, cat# 1168-027). After 48 hours of culture, the cell culture supernatants were collected, and the S protein expression level was measured with an ELISA kit for S protein detection to evaluate whether the mRNA could be translated into protein. The results are shown in FIG. 4. In FIG. 4, "spike DNA" is the commercial plasmid pCMV3-spike (purchased from Sino Biological Inc.) expressing the full-length wild-type S protein; "spike-D614G mRNA" is the mRNA expressing the full-length S protein containing the D614G mutation; and "spike B.1.617.2 mRNA" is the B.1.617.2 mRNA. The results show that the mRNA of the invention expresses the S protein mutant at a high level in cells.

The S protein mutant obtained was purified and its structure was analyzed by cryo-electron microscopy. The 3D structure is shown in FIG. 5: the S protein mutant adopts a stable pre-fusion conformation (prefusion spike structure). The sequence of the B.1.617.2 variant strain differs from that of the wild strain at 9 mutation sites, 2 of which are in the RBD region. The RBD domains of the pre-fusion S protein of the wild strain have been reported to be mainly in a 1 open, 2 closed state. The S protein mutant of the invention is mainly in a flexible 2 open, 1 closed state. This structural difference is the structural basis for the enhanced binding of the virus to the receptor ACE2 and its enhanced infectivity. It also leads to a significant difference in the immunogenic epitopes of the S protein, and hence to significant differences in the antibodies, in particular the neutralizing antibodies, induced by the different structures.

EXAMPLE 3 construction of LNP-entrapped mRNA

mRNA-entrapped nanoparticles were prepared using II-37 (also known as C2) as the ionizable lipid. Compound II-37, DSPC, CHOL and DMG-PEG2000 were accurately weighed, and each lipid was fully dissolved in absolute ethanol in a suitable container for later use. The molar ratio was II-37:DSPC:CHOL:DMG-PEG2000 = 45:15:38.5:1.5; the lipid solutions were mixed uniformly in this proportion to serve as the organic phase. The B.1.617.2 mRNA of example 2 was prepared as an aqueous solution (purified water as solvent) at pH 4 to serve as the aqueous phase.
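As a minimal sketch of the arithmetic behind the stated molar ratio, the following splits a total lipid amount across the four components. The 10 µmol total and the function name `lipid_micromoles` are hypothetical and chosen only for illustration; the text does not state the batch size.

```python
# Molar ratio taken from the text:
# II-37 : DSPC : CHOL : DMG-PEG2000 = 45 : 15 : 38.5 : 1.5
MOLAR_RATIO = {"II-37": 45.0, "DSPC": 15.0, "CHOL": 38.5, "DMG-PEG2000": 1.5}


def lipid_micromoles(total_lipid_umol, ratio=MOLAR_RATIO):
    """Split a total lipid amount (in micromoles) by the given molar ratio."""
    total_parts = sum(ratio.values())
    return {
        name: total_lipid_umol * parts / total_parts
        for name, parts in ratio.items()
    }


# Hypothetical batch: 10 micromoles of total lipid (not a value from the text).
amounts = lipid_micromoles(10.0)
for name, umol in amounts.items():
    print(f"{name}: {umol:.2f} umol")
```

Because the four parts sum to 100, each ratio value can also be read directly as a mole percentage.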

The organic phase and the aqueous phase were mixed at a volume ratio of 3:1, and the lipid nanoparticle suspension was prepared on a microfluidic platform (e.g., PNI Ignite). The resulting suspension was purified and concentrated by centrifugal filtration through a 100 kDa ultrafiltration centrifuge tube, and the concentrate was aliquoted.

The particle size, PDI and zeta potential of the prepared lipid nanoparticles were measured with a laser nanoparticle analyzer, and the encapsulation efficiency (EE%) was measured with an ultraviolet spectrophotometer in combination with a RiboGreen RNA kit. A portion of the samples was transfected into A549 cells in the manner of example 2, and the cell transfection efficiency was measured by ELISA.

The physical and chemical quality control data of the prepared lipid nanoparticle are shown in the following table:

Sample information	Particle size (nm)	PDI	Zeta potential	Encapsulation efficiency
mRNA-LNP	147.8±20.6	0.0651	34.28	100%

As a control, liposomes entrapping the B.1.617.2 mRNA of example 2 were also prepared with Lipofectamine MAX, and these and the above lipid nanoparticles were each transfected into A549 cells; the negative control was lipid nanoparticles prepared from II-37 without mRNA. As shown in FIG. 6, after the lipid nanoparticles delivered mRNA into cells, the protein expression level in the cells was very high compared with the control reagent Lipofectamine MAX, indicating that the prepared lipid nanoparticles have very high cell transfection efficiency.

EXAMPLE 4 determination of immunogenicity of protein mutants

The mRNA vaccine used was the mRNA-entrapped LNP (lipid nanoparticle) prepared in example 3, with lipid components II-37:DSPC:CHOL:DMG-PEG2000 in a molar ratio of 45:15:38.5:1.5. The experimental methods were as follows:

ELISA method for detecting specific IgG binding antibody (IgG Binding Antibody)

The content of 2019-nCoV specific IgG antibodies in the plasma of immunized animals was measured by indirect ELISA. Spike antigen protein of the 2019-nCoV B.1.617.2 mutant strain (0.05 μg) was coated on an ELISA plate (Thermo, catalog number #442404) at 2-8 °C overnight. The plate was blocked with 3% BSA (SIGMA, catalog number #A7030) for 1 h at room temperature, diluted mouse plasma (1:50) or monkey plasma (1:500) was added and incubated for 2 h, and the plate was washed 5 times with PBST. HRP-conjugated goat anti-mouse/monkey secondary antibody was then added and incubated for 30-45 min at room temperature, and the plate was washed 5 times with PBST. Color was developed with TMB (Thermo Fisher, catalog number #34029) for 7 min at room temperature and stopped with stop solution (Solarbio, catalog number #C1058), and the antibody content was determined by measuring absorbance at 450 nm. Positive antibodies (mice: Sino Biological (Yiqiao Shenzhou) cat#40591-MM43; rhesus: ACRO cat#SPD-M201) were used to fit a standard curve by a polynomial method, from which the total amount of antibody was calibrated.

ELISA assay for detecting neutralizing antibodies in samples by competitive inhibition of antigen protein binding to ACE2 (ACE2 Binding Inhibition)

Neutralizing antibodies against the Spike RBD of the 2019-nCoV B.1.617.2 mutant strain in plasma were measured with the Anti-SARS-CoV-2 Neutralizing Antibody Titer Serologic Assay Kit (ACRO, catalog number #RAS-N031/RAS-N040/RAS-N056). Diluted plasma was added to a microplate pre-coated with human ACE2 protein and incubated with HRP-SARS-CoV-2 spike at a constant 37 °C for 1 hour; substrate was then added and incubated at a constant 37 °C for 20 min, and the reaction was stopped with stop solution. Sample absorbance values (OD450 nm/OD630 nm) were read at 450 nm/630 nm with a microplate reader (BioTek, SLXFATS). For each well, the OD630 nm reading was subtracted from the OD450 nm reading to reduce background interference. The inhibition rate was calculated as: inhibition rate = (1 - sample OD450 nm / negative control OD450 nm) × 100%.
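The background correction and inhibition-rate formula above can be sketched as follows. The absorbance readings are hypothetical values chosen for illustration, and the function names are not from the kit documentation.

```python
def corrected_od(od450, od630):
    """Subtract the OD630 reference reading from the OD450 reading."""
    return od450 - od630


def inhibition_percent(sample_od, negative_control_od):
    """Inhibition rate = (1 - sample OD450 / negative control OD450) x 100%."""
    if negative_control_od <= 0:
        raise ValueError("negative control OD must be positive")
    return (1.0 - sample_od / negative_control_od) * 100.0


# Hypothetical plate readings (not data from the text).
sample = corrected_od(od450=0.25, od630=0.05)
negative_control = corrected_od(od450=2.05, od630=0.05)
rate = inhibition_percent(sample, negative_control)
print(f"inhibition rate: {rate:.1f}%")
```

With these illustrative readings the sample signal is one tenth of the negative control, which corresponds to an inhibition rate of about 90%.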

Pseudo virus neutralization experiment (reporter gene method)

Neutralizing antibodies block binding of the S protein on the surface of the novel coronavirus to ACE2, thereby preventing pseudovirus infection of host cells. The degree to which the virus is blocked can be deduced from the expression level of the luciferase reporter. Plasma/serum samples were taken from mice/monkeys at different time points before and after vaccine injection, and all samples were heat-inactivated in a 56 °C water bath for 30 min before use. Samples were diluted 20-fold in serum-free DMEM (Gibco, catalog number #C11995500CP), filter-sterilized through a 0.22 μm filter, and then serially diluted 3-fold in DMEM containing 10% FBS (Gibco, catalog number #10099-141C) for a total of 6 gradients. SARS-CoV-2-Fluc pseudovirus (Vazyme) was transferred from -80 °C to a 4 °C refrigerator or onto ice until thawed, and the virus was diluted to 1-2×10⁴ TCID50/mL with DMEM containing 10% FBS before use. The virus suspension was mixed with an equal amount of plasma in 96-well plates and incubated at 37 °C for 1 h. Then 50 μL of ACE2-overexpressing 293 cells was added to each well at a density of 2×10⁴ cells/well. After 48 hours of culture, the 96-well plates were removed, 100 μL of medium was aspirated from each well, 100 μL of room-temperature-equilibrated Bio-Lite reporter gene detection reagent (Vazyme, catalog number #DD1201) was added, the plates were shaken for 2 minutes and left at room temperature for 5 minutes, and chemiluminescence values (RLU) were read with a multifunctional microplate reader (TECAN, Spark).
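The sample dilution scheme can be sketched as a simple fold series. This assumes the 20-fold dilution is the first of the 6 gradients and does not account for the further dilution introduced by mixing with virus suspension; the text does not state either point explicitly.

```python
def dilution_series(initial_fold=20, step_fold=3, n_points=6):
    """Return cumulative dilution folds for a serial dilution.

    Assumption: the initial dilution counts as the first gradient.
    """
    return [initial_fold * step_fold ** i for i in range(n_points)]


series = dilution_series()
print(series)  # [20, 60, 180, 540, 1620, 4860]
```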

The prior research work of the inventor proves that the ACE2 competitive inhibition method and the pseudo-virus neutralization method can well represent the neutralization degree of live viruses in rhesus experiments, and have important reference significance for judging the immunogenicity of vaccines.

BALB/c mice used in this experiment were purchased from Beijing Vital River Laboratory Animal Technology Co., Ltd. (animal production license: SCXK (Beijing) 2021-0006); female BALB/c mice (SPF grade) were used at 6-8 weeks of age. H11-K18-hACE2 transgenic mice were purchased from Jiangsu GemPharmatech Co., Ltd. (animal production license: SCXK (Su) 2018-0008), 6 weeks old, SPF grade. This ACE2-humanized mouse model is built on the C57BL/6JGpt background: hACE2 is overexpressed from the H11 safe-harbor locus under the control of the human cytokeratin 18 (K18) promoter, so as to simulate the human severe COVID-19 phenotype. The rhesus monkeys were 9-22 years old and healthy; during the acclimatization and quarantine period they showed no abnormality in appearance, mental state, posture, respiration, feces and urine, or food and water intake, and therefore met the experimental requirements.

1. mRNA vaccine immunogenicity assay in BALB/c mice

The immunization strategy for BALB/c mice is shown in FIG. 7: two immunizations were given 21 days apart, and blood was collected routinely for antibody detection.

Six dose groups and a separate PBS control group were set up. The doses increased stepwise by approximately 4-fold from the lowest dose of 0.02 μg to the highest dose of 20 μg, i.e., 0.02, 0.08, 0.3, 1.25, 5 and 20 μg.

Results:

The results of specific IgG binding antibody detection are shown in FIG. 7-1. Compared with the PBS control group, all dose groups significantly induced S protein specific IgG antibodies against the B.1.617.2 strain. The median antibody concentration of the 20 μg highest-dose group was 14802 ng/mL, and the medians of the 0.02-5 μg groups were 165, 1355, 4015, 1809 and 7234 ng/mL, respectively, showing a dose-effect relationship.

The results for competitive inhibition of binding between ACE2 and the S protein of the B.1.617.2 strain are shown in FIG. 7-2, expressed as inhibition rate (%). The median inhibition rates of the 0.08 μg, 0.3 μg, 1.25 μg, 5 μg and 20 μg groups were 58%, 80%, 79%, 90% and 91%, respectively. The inhibition rate of the 20 μg high-dose group reached 91% or higher.

The pseudovirus neutralizing antibody levels are shown in FIG. 7-3. As the figure shows, higher levels of neutralizing antibodies were induced starting from the 0.08 μg group.

2. mRNA vaccine immunogenicity Pre-test in rhesus monkey

The immunization strategy for rhesus monkeys is shown in FIG. 8: two immunizations were given 28 days apart, and blood was drawn routinely for antibody detection.

Three dose groups were set up, with 10, 30 and 100 μg as the low, medium and high dose groups, respectively, together with a separate physiological saline control group.

The results of specific IgG binding antibody detection are shown in FIG. 8-1. Compared with the saline control group, all dose groups induced S protein specific IgG antibodies against the B.1.617.2 strain. The median antibody concentrations of the high, medium and low dose groups were 141553, 63249 and 82458 ng/mL, respectively.

The results for competitive inhibition of binding between ACE2 and the S protein of the B.1.617.2 strain are shown in FIG. 8-2, expressed as inhibition rate (%). The median inhibition rates of ACE2 competitive binding in rhesus monkeys were 89%, 88% and 98% for the 10 μg, 30 μg and 100 μg dose groups, respectively.

The pseudovirus neutralizing antibody levels are shown in FIG. 8-3. The GMT values of the high, medium and low dose groups were 3000, 392 and 434, respectively.
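GMT (geometric mean titer) summarizes a group of neutralization titers as the exponential of the mean log titer. A minimal sketch follows; the individual titers are hypothetical, since the text reports only the group-level GMT values.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean of positive titers: exp(mean(ln(titer)))."""
    if not titers:
        raise ValueError("at least one titer is required")
    if any(t <= 0 for t in titers):
        raise ValueError("titers must be positive")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


# Hypothetical per-animal titers (not data from the text).
example_titers = [200, 400, 800]
gmt = geometric_mean_titer(example_titers)
print(f"GMT = {gmt:.0f}")
```

The geometric mean is the usual choice for titers because they come from serial dilutions and are spread multiplicatively rather than additively.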

Antibody levels in rhesus monkeys were dose-dependent, i.e., vaccine-induced immunity rose from the low dose to the high dose, and dose-dependent humoral and cellular immune responses could be induced.

In mice, high levels of neutralizing antibodies were produced at doses as low as 0.08 μg; in rhesus monkeys, highly effective antibodies were produced at doses as low as 10 μg.

3. In vivo efficacy test and challenge test of mRNA vaccine in H11-K18-hACE2 transgenic mice

The immunization strategy for the H11-K18-hACE2 transgenic mice is shown in FIG. 9: two immunizations were given 25 days apart, blood was collected routinely for antibody detection, and the mice were transferred to a P3 laboratory for the challenge experiment 14 days after the second immunization.

Three dose groups were set up, with 0.8 μg, 4 μg and 20 μg as the low, medium and high dose groups, respectively, together with a separate physiological saline control group and a blank mouse control group. The challenge control group was not injected.

The results of specific IgG binding antibody detection are shown in FIG. 9-1. Compared with the saline control group, all dose groups significantly induced S protein specific IgG antibodies against the B.1.617.2 strain. The median antibody concentration of the 20 μg highest-dose group was 2621 ng/mL, and the medians of the 4 μg and 0.8 μg groups were 1121 and 155 ng/mL, respectively; there was no statistical difference compared with the 20 μg group, and a dose-effect relationship was observed.

The results for competitive inhibition of binding between ACE2 and the S protein of the B.1.617.2 strain are shown in FIG. 9-2, expressed as inhibition rate. Because of individual differences, the dose groups showed different inhibitory effects.

B.1.617.2 strain challenge test: mice were challenged by nasal drip with a virus suspension volume of 20 μL/mouse and a challenge dose of 1000 TCID₅₀. The observation period after challenge was 5 days; on day 3 (3 dpi) and day 5 (5 dpi) after challenge, 4 animals from the blank group and 5 animals from each challenge group were euthanized at each time point. Lung, brain, intestinal tissue, heart, liver, kidney and spleen were collected; the viral load of each tissue was measured by qPCR, and each tissue was examined by pathological HE staining.

Compared with challenge control group 2#, among the immune groups (3# low dose group, 4# medium dose group, 5# high dose group), the lung tissue viral load of high dose group 5# decreased significantly, by more than 2 log10, on both day 3 and day 5 post infection; the lung tissue viral load of low dose group 3# decreased by more than 2 log10 on day 5 post infection. The brain tissue viral load of each immune group (3#, 4#, 5#) decreased significantly, by more than 2 log10, on both day 3 and day 5 post infection. The viral loads in heart, liver, spleen, kidney and intestinal tissue of each immune group (3#, 4#, 5#) declined to different degrees compared with challenge control group 2#, with individual tissues at individual time points declining by more than 2 log10 (FIG. 10).
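A "decrease of more than 2 log10" means the viral load in the immunized group is below one hundredth of the control load. The sketch below shows the calculation; the copy numbers are hypothetical and not taken from FIG. 10.

```python
import math


def log10_reduction(control_load, immunized_load):
    """Difference in log10 viral load between control and immunized groups."""
    if control_load <= 0 or immunized_load <= 0:
        raise ValueError("viral loads must be positive")
    return math.log10(control_load) - math.log10(immunized_load)


# Hypothetical qPCR viral loads in copies per unit of tissue.
reduction = log10_reduction(control_load=1e7, immunized_load=5e4)
exceeds_two_logs = reduction > 2
print(f"reduction = {reduction:.2f} log10, more than 2 log10: {exceeds_two_logs}")
```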

Histopathological (HE) changes: all 10 mice in the challenge control group had moderate or severe lung lesions, with widened alveolar septa, congestion and inflammatory cell infiltration. The high, medium and low dose immune groups had mild to moderate lesions with no severe lesions, and the incidence of moderate lesions was lower than in the challenge control group. Comparison of lesion severity in the heart and spleen showed that the high, medium and low dose immune groups were alleviated to different degrees compared with the challenge control group. Other organs (liver, intestinal tissue, brain) showed only slight lesions.

In summary, in the H11-K18-hACE2 transgenic mouse model of SARS-CoV-2 B.1.617.2 infection, the body weight change of the challenge control group was consistent with clinical manifestations, the body weight change of the experimental animals in each immune group (3#, 4#, 5#) at 5 dpi improved to different degrees, and high dose group 5# showed body weight gain. The viral loads of lung and brain tissue were markedly reduced: the lung viral load of high dose group 5# decreased by more than 2 log10 at 3 dpi and 5 dpi, the lung viral load of low dose group 3# decreased significantly by more than 2 log10 at 5 dpi, the brain tissue viral load of each immune group decreased significantly by more than 2 log10, and the viral loads of other tissues decreased to different degrees. The pathological changes of lung tissue in the immune groups (3#, 4#, 5#) improved to different degrees compared with the control group. Thus, the mRNA vaccine against the SARS-CoV-2 B.1.617.2 strain has a protective effect against infection of H11-K18-hACE2 transgenic mice with the SARS-CoV-2 B.1.617.2 strain.

EXAMPLE 5 Synthesis of ionizable lipid specific Compounds II-37 of formula C

Synthesis of linoleyl alcohol (a2): LiAlH₄ (7.20 g) and linoleic acid (50 g, a1) were added to 950 mL of tetrahydrofuran at 0 °C, after which the mixture was stirred at 25 °C for 2 h. After thin layer chromatography (TLC) showed that the reaction was complete, the reaction mixture was quenched by successive addition of water (7.2 mL), aqueous NaOH (7.2 mL, mass fraction 15%) and water (21.6 mL), and an appropriate amount of Na₂SO₄ was added. After stirring for 15 minutes, the mixture was filtered through a Büchner funnel and the filter cake was washed with ethyl acetate; the filtrate was collected and concentrated by evaporation to give 47.4 g of the target product linoleyl alcohol (a2).

¹H NMR (400 MHz, CDCl₃): δ 5.27-5.44 (m, 4H), 3.63 (t, J = 6.63 Hz, 2H), 2.77 (t, J = 6.44 Hz, 2H), 1.97-2.12 (m, 4H), 1.57-1.63 (m, 1H), 1.20-1.46 (m, 18H), 0.83-0.95 (m, 3H)

Synthesis of (9Z,12Z)-octadeca-9,12-dienal (a3): linoleyl alcohol (25.0 g, a2) and 2-iodoxybenzoic acid (39.4 g) were added to 170 mL of acetonitrile at room temperature, and the mixture was stirred at 85 °C for 4 h. The reaction solution was filtered through a Büchner funnel and the filter cake was washed with dichloromethane; the filtrate was collected and concentrated by evaporation to give 24.0 g of the target product (9Z,12Z)-octadeca-9,12-dienal (a3).

¹H NMR (400 MHz, CDCl₃): δ 9.76 (t, J = 1.76 Hz, 1H), 5.25-5.43 (m, 4H), 2.76 (t, J = 6.17 Hz, 2H), 2.41 (td, J = 7.33, 1.87 Hz, 2H), 2.04 (q, J = 6.84 Hz, 4H), 1.56-1.68 (m, 2H), 1.22-1.36 (m, 14H), 0.88 (t, J = 6.73 Hz, 3H)

Synthesis of (9Z,12Z)-2-chloro-octadeca-9,12-dien-1-ol (a4): (9Z,12Z)-octadeca-9,12-dienal (43.0 g, a3), DL-proline (5.62 g) and N-chlorosuccinimide were added to 246 mL of acetonitrile at 0 °C, followed by stirring at 0 °C for 2 h. After completion of the reaction, the reaction mixture was diluted with absolute ethanol (246 mL), sodium borohydride (8.8 g) was added, and the mixture was stirred at 0 °C for 4 h. The reaction mixture was quenched with water (120 mL) and extracted with methyl tert-butyl ether; the combined organic phases were washed with saturated brine, dried over sodium sulfate, filtered and concentrated by evaporation to give the target product (9Z,12Z)-2-chloro-octadeca-9,12-dien-1-ol (a4, 46 g), which was used directly in the next step.

¹H NMR (400 MHz, CDCl₃): δ 5.25-5.51 (m, 4H), 3.97-4.07 (m, 1H), 3.79 (dd, J = 12.01, 3.63 Hz, 1H), 3.59-3.70 (m, 1H), 2.67-2.90 (m, 2H), 1.96-2.15 (m, 5H), 1.64-1.82 (m, 1H), 1.20-1.49 (m, 15H), 0.89 (br t, J = 6.75 Hz, 3H)

Synthesis of 2-[(7Z,10Z)-hexadeca-7,10-dien-1-yl]oxirane (a5): (9Z,12Z)-2-chloro-octadeca-9,12-dien-1-ol (45 g, a4) and aqueous sodium hydroxide solution (120 g of sodium hydroxide in 585 mL of water) were added to 450 mL of 1,4-dioxane at room temperature, and after the addition was complete the mixture was stirred at 35 °C for 2 h. After TLC showed that the reaction was complete, the reaction solution was separated in a separating funnel, and the organic phase was washed with saturated brine, dried over sodium sulfate, filtered and concentrated by evaporation. The residue was purified by flash column chromatography, eluting with petroleum ether/ethyl acetate, to give 29.11 g of the target product 2-[(7Z,10Z)-hexadeca-7,10-dien-1-yl]oxirane (a5).

¹H NMR (400 MHz, CDCl₃): δ 5.27-5.46 (m, 4H), 2.87-2.98 (m, 1H), 2.70-2.85 (m, 3H), 2.46 (dd, J = 5.00, 2.75 Hz, 1H), 1.94-2.21 (m, 4H), 1.24-1.58 (m, 17H), 0.78-1.00 (m, 3H)

Synthesis of II-37: 2-[(7Z,10Z)-hexadeca-7,10-dien-1-yl]oxirane (5 g) and N,N-bis(2-aminoethyl)methylamine (739 mg) were added to 10 mL of ethanol at room temperature, and the mixture was stirred at 90 °C for 36 h. The reaction solution was concentrated by evaporation, and the residue was purified by flash column chromatography, eluting with dichloromethane/methanol, to give crude II-37 (4 g). The crude product was purified again by flash column chromatography with dichloromethane/methanol to give II-37 (2.2 g).

¹H NMR (400 MHz, CDCl₃): δ 5.27-5.44 (m, 12H), 3.48-3.79 (m, 3H), 2.63-3.00 (m, 12H), 2.16-2.61 (m, 12H), 2.05 (q, J = 6.80 Hz, 12H), 1.18-1.57 (m, 51H), 0.89 (t, J = 6.88 Hz, 9H)

ESI-MS: m/z 910.8 [M+H]⁺, 911.8 [M+2H]⁺, 912.8 [M+3H]⁺
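As a rough consistency check on the II-37 step, the sketch below computes the epoxide-to-amine molar ratio from the charged masses and the expected [M+H]⁺ for a product in which three epoxide units have added to one amine. The elemental formulas (C18H32O for the epoxide a5, C5H15N3 for N,N-bis(2-aminoethyl)methylamine) are derived here from the compound names, and the three-to-one adduct is an assumption made for illustration; neither is stated in the text.

```python
# Standard atomic masses (average) and monoisotopic masses, in u.
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON_MASS = 1.00727646

# Formulas derived from the compound names (assumption, not given in the text).
EPOXIDE_A5 = {"C": 18, "H": 32, "O": 1}
TRIAMINE = {"C": 5, "H": 15, "N": 3}


def formula_mass(formula, table):
    """Sum atomic masses over an elemental formula."""
    return sum(table[element] * count for element, count in formula.items())


def combine(*formulas):
    """Add elemental formulas (epoxide ring opening is an addition reaction)."""
    total = {}
    for formula in formulas:
        for element, count in formula.items():
            total[element] = total.get(element, 0) + count
    return total


# Charged amounts from the text: 5 g epoxide, 739 mg amine.
epoxide_mmol = 5.0 / formula_mass(EPOXIDE_A5, AVERAGE_MASS) * 1000
amine_mmol = 0.739 / formula_mass(TRIAMINE, AVERAGE_MASS) * 1000
equivalents = epoxide_mmol / amine_mmol

# Assumed product: one amine plus three epoxide units.
tri_adduct = combine(EPOXIDE_A5, EPOXIDE_A5, EPOXIDE_A5, TRIAMINE)
mz_m_plus_h = formula_mass(tri_adduct, MONOISOTOPIC_MASS) + PROTON_MASS

print(f"epoxide equivalents per amine: {equivalents:.2f}")
print(f"calculated [M+H]+ for assumed adduct: {mz_m_plus_h:.2f}")
```

Under these assumptions the charged amounts correspond to roughly three equivalents of epoxide per amine, and the calculated [M+H]⁺ falls close to the reported m/z of 910.8.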

The embodiments of the present invention have been described above. However, the present invention is not limited to the above embodiment. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

SEQUENCE LISTING

<110> Beijing Qihen Biotechnology Co., ltd

<120> S protein mutant of novel coronavirus variant strain, genetically engineered mRNA thereof and vaccine composition

<130> CPCN22410423

<160> 9

<170> PatentIn version 3.5

<210> 1

<211> 1273

<212> PRT

<213> Unknown

<220>

<223> 2019-nCoV wild-type S protein

<400> 1

Met Phe Val Phe Leu Val Leu Leu Pro Leu Val Ser Ser Gln Cys Val

1 5 10 15

Asn Leu Thr Thr Arg Thr Gln Leu Pro Pro Ala Tyr Thr Asn Ser Phe

20 25 30

Thr Arg Gly Val Tyr Tyr Pro Asp Lys Val Phe Arg Ser Ser Val Leu

35 40 45

His Ser Thr Gln Asp Leu Phe Leu Pro Phe Phe Ser Asn Val Thr Trp

50 55 60

Phe His Ala Ile His Val Ser Gly Thr Asn Gly Thr Lys Arg Phe Asp

65 70 75 80

Asn Pro Val Leu Pro Phe Asn Asp Gly Val Tyr Phe Ala Ser Thr Glu

85 90 95

Lys Ser Asn Ile Ile Arg Gly Trp Ile Phe Gly Thr Thr Leu Asp Ser

100 105 110

Lys Thr Gln Ser Leu Leu Ile Val Asn Asn Ala Thr Asn Val Val Ile

115 120 125

Lys Val Cys Glu Phe Gln Phe Cys Asn Asp Pro Phe Leu Gly Val Tyr

130 135 140

Tyr His Lys Asn Asn Lys Ser Trp Met Glu Ser Glu Phe Arg Val Tyr

145 150 155 160

Ser Ser Ala Asn Asn Cys Thr Phe Glu Tyr Val Ser Gln Pro Phe Leu

165 170 175

Met Asp Leu Glu Gly Lys Gln Gly Asn Phe Lys Asn Leu Arg Glu Phe

180 185 190

Val Phe Lys Asn Ile Asp Gly Tyr Phe Lys Ile Tyr Ser Lys His Thr

195 200 205

Pro Ile Asn Leu Val Arg Asp Leu Pro Gln Gly Phe Ser Ala Leu Glu

210 215 220

Pro Leu Val Asp Leu Pro Ile Gly Ile Asn Ile Thr Arg Phe Gln Thr

225 230 235 240

Leu Leu Ala Leu His Arg Ser Tyr Leu Thr Pro Gly Asp Ser Ser Ser

245 250 255

Gly Trp Thr Ala Gly Ala Ala Ala Tyr Tyr Val Gly Tyr Leu Gln Pro

260 265 270

Arg Thr Phe Leu Leu Lys Tyr Asn Glu Asn Gly Thr Ile Thr Asp Ala

275 280 285

Val Asp Cys Ala Leu Asp Pro Leu Ser Glu Thr Lys Cys Thr Leu Lys

290 295 300

Ser Phe Thr Val Glu Lys Gly Ile Tyr Gln Thr Ser Asn Phe Arg Val

305 310 315 320

Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn Leu Cys

325 330 335

Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val Tyr Ala

340 345 350

Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu

355 360 365

Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val Ser Pro

370 375 380

Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp Ser Phe

385 390 395 400

Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln Thr Gly

405 410 415

Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr Gly Cys

420 425 430

Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly Gly Asn

435 440 445

Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys Pro Phe

450 455 460

Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr Pro Cys

465 470 475 480

Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser Tyr Gly

485 490 495

Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val Val Val

500 505 510

Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly Pro Lys

515 520 525

Lys Ser Thr Asn Leu Val Lys Asn Lys Cys Val Asn Phe Asn Phe Asn

530 535 540

Gly Leu Thr Gly Thr Gly Val Leu Thr Glu Ser Asn Lys Lys Phe Leu

545 550 555 560

Pro Phe Gln Gln Phe Gly Arg Asp Ile Ala Asp Thr Thr Asp Ala Val

565 570 575

Arg Asp Pro Gln Thr Leu Glu Ile Leu Asp Ile Thr Pro Cys Ser Phe

580 585 590

Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Thr Ser Asn Gln Val

595 600 605

Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Glu Val Pro Val Ala Ile

610 615 620

His Ala Asp Gln Leu Thr Pro Thr Trp Arg Val Tyr Ser Thr Gly Ser

625 630 635 640

Asn Val Phe Gln Thr Arg Ala Gly Cys Leu Ile Gly Ala Glu His Val

645 650 655

Asn Asn Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile Cys Ala

660 665 670

Ser Tyr Gln Thr Gln Thr Asn Ser Pro Arg Arg Ala Arg Ser Val Ala

675 680 685

Ser Gln Ser Ile Ile Ala Tyr Thr Met Ser Leu Gly Ala Glu Asn Ser

690 695 700

Val Ala Tyr Ser Asn Asn Ser Ile Ala Ile Pro Thr Asn Phe Thr Ile

705 710 715 720

Ser Val Thr Thr Glu Ile Leu Pro Val Ser Met Thr Lys Thr Ser Val

725 730 735

Asp Cys Thr Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ser Asn Leu

740 745 750

Leu Leu Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Thr

755 760 765

Gly Ile Ala Val Glu Gln Asp Lys Asn Thr Gln Glu Val Phe Ala Gln

770 775 780

Val Lys Gln Ile Tyr Lys Thr Pro Pro Ile Lys Asp Phe Gly Gly Phe

785 790 795 800

Asn Phe Ser Gln Ile Leu Pro Asp Pro Ser Lys Pro Ser Lys Arg Ser

805 810 815

Phe Ile Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly

820 825 830

Phe Ile Lys Gln Tyr Gly Asp Cys Leu Gly Asp Ile Ala Ala Arg Asp

835 840 845

Leu Ile Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu

850 855 860

Leu Thr Asp Glu Met Ile Ala Gln Tyr Thr Ser Ala Leu Leu Ala Gly

865 870 875 880

Thr Ile Thr Ser Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile

885 890 895

Pro Phe Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr

900 905 910

Gln Asn Val Leu Tyr Glu Asn Gln Lys Leu Ile Ala Asn Gln Phe Asn

915 920 925

Ser Ala Ile Gly Lys Ile Gln Asp Ser Leu Ser Ser Thr Ala Ser Ala

930 935 940

Leu Gly Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn

945 950 955 960

Thr Leu Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val

965 970 975

Leu Asn Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln

980 985 990

Ile Asp Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val

995 1000 1005

Thr Gln Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn

1010 1015 1020

Leu Ala Ala Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys

1025 1030 1035

Arg Val Asp Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro

1040 1045 1050

Gln Ser Ala Pro His Gly Val Val Phe Leu His Val Thr Tyr Val

1055 1060 1065

Pro Ala Gln Glu Lys Asn Phe Thr Thr Ala Pro Ala Ile Cys His

1070 1075 1080

Asp Gly Lys Ala His Phe Pro Arg Glu Gly Val Phe Val Ser Asn

1085 1090 1095

Gly Thr His Trp Phe Val Thr Gln Arg Asn Phe Tyr Glu Pro Gln

1100 1105 1110

Ile Ile Thr Thr Asp Asn Thr Phe Val Ser Gly Asn Cys Asp Val

1115 1120 1125

Val Ile Gly Ile Val Asn Asn Thr Val Tyr Asp Pro Leu Gln Pro

1130 1135 1140

Glu Leu Asp Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn

1145 1150 1155

His Thr Ser Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile Asn

1160 1165 1170

Ala Ser Val Val Asn Ile Gln Lys Glu Ile Asp Arg Leu Asn Glu

1175 1180 1185

Val Ala Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu Leu

1190 1195 1200

Gly Lys Tyr Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Ile Trp Leu

1205 1210 1215

Gly Phe Ile Ala Gly Leu Ile Ala Ile Val Met Val Thr Ile Met

1220 1225 1230

Leu Cys Cys Met Thr Ser Cys Cys Ser Cys Leu Lys Gly Cys Cys

1235 1240 1245

Ser Cys Gly Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro

1250 1255 1260

Val Leu Lys Gly Val Lys Leu His Tyr Thr

1265 1270

<210> 2

<211> 1206

<212> PRT

<213> Unknown

<220>

<223> 2019-nCoV S protein mutant

<400> 2

Met Phe Val Phe Leu Val Leu Leu Pro Leu Val Ser Ser Gln Cys Val

1 5 10 15

Asn Leu Arg Thr Arg Thr Gln Leu Pro Pro Ala Tyr Thr Asn Ser Phe

20 25 30

Thr Arg Gly Val Tyr Tyr Pro Asp Lys Val Phe Arg Ser Ser Val Leu

35 40 45

His Ser Thr Gln Asp Leu Phe Leu Pro Phe Phe Ser Asn Val Thr Trp

50 55 60

Phe His Ala Ile His Val Ser Gly Thr Asn Gly Thr Lys Arg Phe Asp

65 70 75 80

Asn Pro Val Leu Pro Phe Asn Asp Gly Val Tyr Phe Ala Ser Ile Glu

85 90 95

Lys Ser Asn Ile Ile Arg Gly Trp Ile Phe Gly Thr Thr Leu Asp Ser

100 105 110

Lys Thr Gln Ser Leu Leu Ile Val Asn Asn Ala Thr Asn Val Val Ile

115 120 125

Lys Val Cys Glu Phe Gln Phe Cys Asn Asp Pro Phe Leu Asp Val Tyr

130 135 140

Tyr His Lys Asn Asn Lys Ser Trp Met Glu Ser Glu Val Tyr Ser Ser

145 150 155 160

Ala Asn Asn Cys Thr Phe Glu Tyr Val Ser Gln Pro Phe Leu Met Asp

165 170 175

Leu Glu Gly Lys Gln Gly Asn Phe Lys Asn Leu Arg Glu Phe Val Phe

180 185 190

Lys Asn Ile Asp Gly Tyr Phe Lys Ile Tyr Ser Lys His Thr Pro Ile

195 200 205

Asn Leu Val Arg Asp Leu Pro Gln Gly Phe Ser Ala Leu Glu Pro Leu

210 215 220

Val Asp Leu Pro Ile Gly Ile Asn Ile Thr Arg Phe Gln Thr Leu Leu

225 230 235 240

Ala Leu His Arg Ser Tyr Leu Thr Pro Gly Asp Ser Ser Ser Gly Trp

245 250 255

Thr Ala Gly Ala Ala Ala Tyr Tyr Val Gly Tyr Leu Gln Pro Arg Thr

260 265 270

Phe Leu Leu Lys Tyr Asn Glu Asn Gly Thr Ile Thr Asp Ala Val Asp

275 280 285

Cys Ala Leu Asp Pro Leu Ser Glu Thr Lys Cys Thr Leu Lys Ser Phe

290 295 300

Thr Val Glu Lys Gly Ile Tyr Gln Thr Ser Asn Phe Arg Val Gln Pro

305 310 315 320

Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn Leu Cys Pro Phe

325 330 335

Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val Tyr Ala Trp Asn

340 345 350

Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu Tyr Asn

355 360 365

Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val Ser Pro Thr Lys

370 375 380

Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp Ser Phe Val Ile

385 390 395 400

Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln Thr Gly Lys Ile

405 410 415

Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr Gly Cys Val Ile

420 425 430

Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly Gly Asn Tyr Asn

435 440 445

Tyr Arg Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys Pro Phe Glu Arg

450 455 460

Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Lys Pro Cys Asn Gly

465 470 475 480

Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser Tyr Gly Phe Gln

485 490 495

Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val Val Val Leu Ser

500 505 510

Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly Pro Lys Lys Ser

515 520 525

Thr Asn Leu Val Lys Asn Lys Cys Val Asn Phe Asn Phe Asn Gly Leu

530 535 540

Thr Gly Thr Gly Val Leu Thr Glu Ser Asn Lys Lys Phe Leu Pro Phe

545 550 555 560

Gln Gln Phe Gly Arg Asp Ile Ala Asp Thr Thr Asp Ala Val Arg Asp

565 570 575

Pro Gln Thr Leu Glu Ile Leu Asp Ile Thr Pro Cys Ser Phe Gly Gly

580 585 590

Val Ser Val Ile Thr Pro Gly Thr Asn Thr Ser Asn Gln Val Ala Val

595 600 605

Leu Tyr Gln Gly Val Asn Cys Thr Glu Val Pro Val Ala Ile His Ala

610 615 620

Asp Gln Leu Thr Pro Thr Trp Arg Val Tyr Ser Thr Gly Ser Asn Val

625 630 635 640

Phe Gln Thr Arg Ala Gly Cys Leu Ile Gly Ala Glu His Val Asn Asn

645 650 655

Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile Cys Ala Ser Tyr

660 665 670

Gln Thr Gln Thr Asn Ser Arg Gly Ser Ala Ser Ser Val Ala Ser Gln

675 680 685

Ser Ile Ile Ala Tyr Thr Met Ser Leu Gly Ala Glu Asn Ser Val Ala

690 695 700

Tyr Ser Asn Asn Ser Ile Ala Ile Pro Thr Asn Phe Thr Ile Ser Val

705 710 715 720

Thr Thr Glu Ile Leu Pro Val Ser Met Thr Lys Thr Ser Val Asp Cys

725 730 735

Thr Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ser Asn Leu Leu Leu

740 745 750

Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Thr Gly Ile

755 760 765

Ala Val Glu Gln Asp Lys Asn Thr Gln Glu Val Phe Ala Gln Val Lys

770 775 780

Gln Ile Tyr Lys Thr Pro Pro Ile Lys Asp Phe Gly Gly Phe Asn Phe

785 790 795 800

Ser Gln Ile Leu Pro Asp Pro Ser Lys Pro Ser Lys Arg Ser Pro Ile

805 810 815

Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly Phe Ile

820 825 830

Lys Gln Tyr Gly Asp Cys Leu Gly Asp Ile Ala Ala Arg Asp Leu Ile

835 840 845

Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu Leu Thr

850 855 860

Asp Glu Met Ile Ala Gln Tyr Thr Ser Ala Leu Leu Ala Gly Thr Ile

865 870 875 880

Thr Ser Gly Trp Thr Phe Gly Ala Gly Pro Ala Leu Gln Ile Pro Phe

885 890 895

Pro Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr Gln Asn

900 905 910

Val Leu Tyr Glu Asn Gln Lys Leu Ile Ala Asn Gln Phe Asn Ser Ala

915 920 925

Ile Gly Lys Ile Gln Asp Ser Leu Ser Ser Thr Pro Ser Ala Leu Gly

930 935 940

Lys Leu Gln Asn Val Val Asn Gln Asn Ala Gln Ala Leu Asn Thr Leu

945 950 955 960

Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val Leu Asn

965 970 975

Asp Ile Leu Ser Arg Leu Asp Pro Pro Glu Ala Glu Val Gln Ile Asp

980 985 990

Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln

995 1000 1005

Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu Ala

1010 1015 1020

Ala Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val

1025 1030 1035

Asp Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ser

1040 1045 1050

Ala Pro His Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ala

1055 1060 1065

Gln Glu Lys Asn Phe Thr Thr Ala Pro Ala Ile Cys His Asp Gly

1070 1075 1080

Lys Ala His Phe Pro Arg Glu Gly Val Phe Val Ser Asn Gly Thr

1085 1090 1095

His Trp Phe Val Thr Gln Arg Asn Phe Tyr Glu Pro Gln Ile Ile

1100 1105 1110

Thr Thr Asp Asn Thr Phe Val Ser Gly Asn Cys Asp Val Val Ile

1115 1120 1125

Gly Ile Val Asn Asn Thr Val Tyr Asp Pro Leu Gln Pro Glu Leu

1130 1135 1140

Asp Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn His Thr

1145 1150 1155

Ser Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile Asn Ala Ser

1160 1165 1170

Val Val Asn Ile Gln Lys Glu Ile Asp Arg Leu Asn Glu Val Ala

1175 1180 1185

Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu Leu Gly Lys

1190 1195 1200

Tyr Glu Gln

1205

<210> 3

<211> 28

<212> PRT

<213> Artificial Sequence

<220>

<223> domain aiding in trimer formation

<400> 3

Gly Tyr Ile Pro Glu Ala Pro Arg Asp Gly Gln Ala Tyr Val Arg Lys

1 5 10 15

Asp Gly Glu Trp Val Leu Leu Ser Thr Phe Leu Gly

20 25

<210> 4

<211> 3618

<212> DNA

<213> Artificial Sequence

<220>

<223> Artificial sequence

<400> 4

atgttcgtgt tcctcgtgct ccttccgctg gtctcgagcc agtgcgtcaa tttgcgcacg 60

aggacgcagt tgccccccgc gtacacgaac tcgtttacgc ggggggtgta ctacccggac 120

aaggtcttcc gcagctctgt cctgcacagc actcaggacc tcttcctccc gttcttctcg 180

aacgtgacgt ggttccacgc cattcacgtg tcggggacga acgggacgaa gaggttcgac 240

aaccctgttc tgccgttcaa cgacggggtg tacttcgctt cgatcgagaa gtccaacatt 300

attcgcgggt ggatattcgg gaccactctc gattcgaaga ctcagtcctt gctgatagtg 360

aacaacgcca cgaacgtggt cattaaggtc tgcgagttcc agttctgtaa tgacccgttc 420

ctggacgttt actatcacaa gaacaacaag tcttggatgg agagtgaggt gtattcgtcc 480

gcgaataatt gtaccttcga gtatgtctcg cagccattct tgatggatct tgagggcaag 540

cagggaaatt tcaagaatct ccgcgagttt gtcttcaaga acatcgacgg gtacttcaag 600

atatactcga agcacacgcc gatcaacctc gtccgtgatc tcccgcaggg cttcagcgct 660

ctggagccgc tggtggatct cccgatcggg atcaacatca cgcggttcca gacgctgctg 720

gccctgcaca ggagttacct gacgccgggt gactccagta gtgggtggac tgcgggtgcc 780

gcggcgtact acgtcgggta cctgcagccg cgcacgttct tgttgaagta caacgagaac 840

gggacgatca cggacgcggt tgattgcgcg ttggaccctc tgtcggagac gaagtgcacc 900

ctgaagtcgt tcacggtgga gaagggtatc tatcagacct cgaacttccg ggtccagccg 960

actgagagta tcgttcggtt cccgaacatt acgaacctgt gtccgttcgg ggaggtcttc 1020

aacgcgacgc ggttcgcgag tgtgtacgct tggaaccgga agaggatctc gaattgtgtg 1080

gcggactaca gtgtgctgta caattcggcg tccttttcca cgttcaagtg ctacggggtg 1140

tcgcccacga agttgaacga cctctgcttc accaacgtgt atgcggattc cttcgtcatc 1200

cgtggtgacg aggtgcgtca gattgcgccg gggcagacgg ggaagatagc ggactataat 1260

tataagttgc ccgacgactt tactggctgc gttattgctt ggaacagcaa taacctggac 1320

agtaaggtcg ggggcaacta taattatcgg taccgtctgt tccggaagag caatctgaag 1380

cccttcgagc gcgatatctc gaccgagatc taccaggccg gctcgaagcc gtgcaacggc 1440

gtcgaggggt ttaattgtta ctttccgtta cagagctacg ggtttcagcc cacgaacggg 1500

gtggggtacc agccctaccg cgtcgtggtg ctgagcttcg agctgctgca cgccccggcc 1560

acggtgtgcg gtccgaagaa aagtacaaac cttgtgaaga acaagtgtgt gaactttaac 1620

ttcaacgggc tcaccgggac gggggtgttg acggagagta acaagaagtt cctgccgttc 1680

cagcagttcg gtcgggatat cgcggacacc acggatgccg tgagggatcc gcagacgctt 1740

gagattctgg acatcacgcc ctgcagcttc gggggcgtca gtgtgatcac gcctggtacg 1800

aacaccagca accaggttgc ggtgttgtac cagggtgtga attgcactga ggtccccgta 1860

gcgatccacg cggatcagct gaccccgacg tggagggtgt actcgacggg gagtaatgtc 1920

ttccagactc gcgcgggttg cctgattggc gctgagcacg tgaacaactc gtacgagtgc 1980

gacattccca ttggggcggg gatctgcgcg tcgtaccaga cccagacgaa cagccggggc 2040

agcgctagca gcgtcgcgtc gcagtcgatc atcgcgtaca cgatgagcct gggggcggag 2100

aacagtgtgg cctattcgaa caacagcata gctatcccca cgaattttac gatcagtgtg 2160

acgaccgaga tcttgcccgt gtcgatgacc aagacctcgg tcgattgcac gatgtacatt 2220

tgtggggata gcactgagtg ttctaacctc ctgctccagt acggcagttt ctgtacgcag 2280

ctcaaccggg cgcttacggg gattgccgtg gagcaggaca agaacactca ggaggtgttt 2340

gcgcaggtca agcagatcta caagacgcct ccgatcaagg atttcggggg gttcaatttc 2400

tcccagatac tccccgaccc ttcgaagccc agcaagcgta gccctattga ggacctgctc 2460

ttcaataagg ttacgcttgc ggacgcgggc ttcatcaagc agtacgggga ctgtctgggg 2520

gacattgccg cccgggacct gatctgtgct cagaagttca atgggctcac tgttctgccg 2580

cccctgctca cggacgagat gatcgcgcag tacacgtcgg cgctcctcgc cggcacgatc 2640

acgtcgggct ggacgtttgg ggctggtcct gcgctgcaga tcccgttccc tatgcagatg 2700

gcgtaccgct tcaatgggat cggggtgacc cagaatgtcc tgtacgagaa tcagaagctc 2760

atcgccaatc agttcaactc ggcgatcggg aagatacagg actccctgtc gagtacgcct 2820

tccgcgttgg ggaagctgca gaacgtggtg aaccagaatg ctcaggcgtt gaacacgttg 2880

gtgaagcagc tgtcgtccaa cttcggggcg atatcctcgg tgctgaacga tattctcagt 2940

cggctggacc cgccggaggc ggaggttcag atcgatagac tcatcactgg tcgcctccag 3000

agtttgcaga cgtacgtgac tcagcagctc atccgggctg ctgagatacg tgcgtctgcg 3060

aacctggcgg cgaccaagat gagtgagtgc gtgctggggc agagcaagcg ggtggacttt 3120

tgcgggaagg gctatcacct gatgtccttc ccgcagtccg cccctcacgg ggtggtcttc 3180

ctgcacgtga cgtatgtgcc ggcgcaggag aagaacttca ccacggcgcc ggccatatgt 3240

cacgacggga aggcccactt cccccgtgag ggggtcttcg tgtcgaatgg gacgcactgg 3300

ttcgtgacgc agcggaattt ctatgagccg cagataatta cgactgacaa cacgtttgtc 3360

agtggtaatt gtgatgtggt catagggatt gttaacaaca ccgtgtatga tcccctccag 3420

ccggagctgg acagcttcaa ggaggagctg gataagtact tcaagaatca cacgtcgccg 3480

gacgtggatc ttggggacat atcggggatc aacgcgagtg ttgttaacat acagaaggag 3540

atcgaccggc tcaatgaggt tgcgaagaac ctcaatgagt cgttgatcga ccttcaggag 3600

ctcggcaagt atgagcag 3618

<210> 5

<211> 84

<212> DNA

<213> Artificial Sequence

<220>

<223> Artificial sequence

<400> 5

ggctatatcc cagaggcccc tagagatggc caggcctacg ttagaaagga cggcgagtgg 60

gtcctgctga gcacattcct gggc 84

<210> 6

<211> 50

<212> RNA

<213> Artificial Sequence

<220>

<223> Artificial sequence

<400> 6

acauuugcuu cugacacaac uguguucacu agcaaccuca aacagacacc 50

<210> 7

<211> 88

<212> RNA

<213> Artificial Sequence

<220>

<223> Artificial sequence

<400> 7

gcuggagccu cgguagccgu uccuccugcc cgcugggccu cccaacgggc ccuccucccc 60

uccuugcacc ggcccuuccu ggucuuug 88

<210> 8

<211> 3705

<212> RNA

<213> Artificial Sequence

<220>

<223> Artificial sequence

<400> 8

auguucgugu uccucgugcu ccuuccgcug gucucgagcc agugcgucaa uuugcgcacg 60

aggacgcagu ugccccccgc guacacgaac ucguuuacgc ggggggugua cuacccggac 120

aaggucuucc gcagcucugu ccugcacagc acucaggacc ucuuccuccc guucuucucg 180

aacgugacgu gguuccacgc cauucacgug ucggggacga acgggacgaa gagguucgac 240

aacccuguuc ugccguucaa cgacggggug uacuucgcuu cgaucgagaa guccaacauu 300

auucgcgggu ggauauucgg gaccacucuc gauucgaaga cucaguccuu gcugauagug 360

aacaacgcca cgaacguggu cauuaagguc ugcgaguucc aguucuguaa ugacccguuc 420

cuggacguuu acuaucacaa gaacaacaag ucuuggaugg agagugaggu guauucgucc 480

gcgaauaauu guaccuucga guaugucucg cagccauucu ugauggaucu ugagggcaag 540

cagggaaauu ucaagaaucu ccgcgaguuu gucuucaaga acaucgacgg guacuucaag 600

auauacucga agcacacgcc gaucaaccuc guccgugauc ucccgcaggg cuucagcgcu 660

cuggagccgc ugguggaucu cccgaucggg aucaacauca cgcgguucca gacgcugcug 720

gcccugcaca ggaguuaccu gacgccgggu gacuccagua guggguggac ugcgggugcc 780

gcggcguacu acgucgggua ccugcagccg cgcacguucu uguugaagua caacgagaac 840

gggacgauca cggacgcggu ugauugcgcg uuggacccuc ugucggagac gaagugcacc 900

cugaagucgu ucacggugga gaaggguauc uaucagaccu cgaacuuccg gguccagccg 960

acugagagua ucguucgguu cccgaacauu acgaaccugu guccguucgg ggaggucuuc 1020

aacgcgacgc gguucgcgag uguguacgcu uggaaccgga agaggaucuc gaauugugug 1080

gcggacuaca gugugcugua caauucggcg uccuuuucca cguucaagug cuacggggug 1140

ucgcccacga aguugaacga ccucugcuuc accaacgugu augcggauuc cuucgucauc 1200

cguggugacg aggugcguca gauugcgccg gggcagacgg ggaagauagc ggacuauaau 1260

uauaaguugc ccgacgacuu uacuggcugc guuauugcuu ggaacagcaa uaaccuggac 1320

aguaaggucg ggggcaacua uaauuaucgg uaccgucugu uccggaagag caaucugaag 1380

cccuucgagc gcgauaucuc gaccgagauc uaccaggccg gcucgaagcc gugcaacggc 1440

gucgaggggu uuaauuguua cuuuccguua cagagcuacg gguuucagcc cacgaacggg 1500

gugggguacc agcccuaccg cgucguggug cugagcuucg agcugcugca cgccccggcc 1560

acggugugcg guccgaagaa aaguacaaac cuugugaaga acaagugugu gaacuuuaac 1620

uucaacgggc ucaccgggac ggggguguug acggagagua acaagaaguu ccugccguuc 1680

cagcaguucg gucgggauau cgcggacacc acggaugccg ugagggaucc gcagacgcuu 1740

gagauucugg acaucacgcc cugcagcuuc gggggcguca gugugaucac gccugguacg 1800

aacaccagca accagguugc gguguuguac caggguguga auugcacuga gguccccgua 1860

gcgauccacg cggaucagcu gaccccgacg uggagggugu acucgacggg gaguaauguc 1920

uuccagacuc gcgcggguug ccugauuggc gcugagcacg ugaacaacuc guacgagugc 1980

gacauuccca uuggggcggg gaucugcgcg ucguaccaga cccagacgaa cagccggggc 2040

agcgcuagca gcgucgcguc gcagucgauc aucgcguaca cgaugagccu gggggcggag 2100

aacagugugg ccuauucgaa caacagcaua gcuaucccca cgaauuuuac gaucagugug 2160

acgaccgaga ucuugcccgu gucgaugacc aagaccucgg ucgauugcac gauguacauu 2220

uguggggaua gcacugagug uucuaaccuc cugcuccagu acggcaguuu cuguacgcag 2280

cucaaccggg cgcuuacggg gauugccgug gagcaggaca agaacacuca ggagguguuu 2340

gcgcagguca agcagaucua caagacgccu ccgaucaagg auuucggggg guucaauuuc 2400

ucccagauac uccccgaccc uucgaagccc agcaagcgua gcccuauuga ggaccugcuc 2460

uucaauaagg uuacgcuugc ggacgcgggc uucaucaagc aguacgggga cugucugggg 2520

gacauugccg cccgggaccu gaucugugcu cagaaguuca augggcucac uguucugccg 2580

ccccugcuca cggacgagau gaucgcgcag uacacgucgg cgcuccucgc cggcacgauc 2640

acgucgggcu ggacguuugg ggcugguccu gcgcugcaga ucccguuccc uaugcagaug 2700

gcguaccgcu ucaaugggau cggggugacc cagaaugucc uguacgagaa ucagaagcuc 2760

aucgccaauc aguucaacuc ggcgaucggg aagauacagg acucccuguc gaguacgccu 2820

uccgcguugg ggaagcugca gaacguggug aaccagaaug cucaggcguu gaacacguug 2880

gugaagcagc ugucguccaa cuucggggcg auauccucgg ugcugaacga uauucucagu 2940

cggcuggacc cgccggaggc ggagguucag aucgauagac ucaucacugg ucgccuccag 3000

aguuugcaga cguacgugac ucagcagcuc auccgggcug cugagauacg ugcgucugcg 3060

aaccuggcgg cgaccaagau gagugagugc gugcuggggc agagcaagcg gguggacuuu 3120

ugcgggaagg gcuaucaccu gauguccuuc ccgcaguccg ccccucacgg gguggucuuc 3180

cugcacguga cguaugugcc ggcgcaggag aagaacuuca ccacggcgcc ggccauaugu 3240

cacgacggga aggcccacuu cccccgugag ggggucuucg ugucgaaugg gacgcacugg 3300

uucgugacgc agcggaauuu cuaugagccg cagauaauua cgacugacaa cacguuuguc 3360

agugguaauu gugauguggu cauagggauu guuaacaaca ccguguauga uccccuccag 3420

ccggagcugg acagcuucaa ggaggagcug gauaaguacu ucaagaauca cacgucgccg 3480

gacguggauc uuggggacau aucggggauc aacgcgagug uuguuaacau acagaaggag 3540

aucgaccggc ucaaugaggu ugcgaagaac cucaaugagu cguugaucga ccuucaggag 3600

cucggcaagu augagcaggg cuauauccca gaggccccua gagauggcca ggccuacguu 3660

agaaaggacg gcgagugggu ccugcugagc acauuccugg gcuga 3705

<210> 9

<211> 4098

<212> RNA

<213> Artificial Sequence

<220>

<223> Artificial sequence

<400> 9

gggagaccgg ccucgagaca uuugcuucug acacaacugu guucacuagc aaccucaaac 60

agacaccaag cuugccacca uguucguguu ccucgugcuc cuuccgcugg ucucgagcca 120

gugcgucaau uugcgcacga ggacgcaguu gccccccgcg uacacgaacu cguuuacgcg 180

ggggguguac uacccggaca aggucuuccg cagcucuguc cugcacagca cucaggaccu 240

cuuccucccg uucuucucga acgugacgug guuccacgcc auucacgugu cggggacgaa 300

cgggacgaag agguucgaca acccuguucu gccguucaac gacggggugu acuucgcuuc 360

gaucgagaag uccaacauua uucgcgggug gauauucggg accacucucg auucgaagac 420

ucaguccuug cugauaguga acaacgccac gaacgugguc auuaaggucu gcgaguucca 480

guucuguaau gacccguucc uggacguuua cuaucacaag aacaacaagu cuuggaugga 540

gagugaggug uauucguccg cgaauaauug uaccuucgag uaugucucgc agccauucuu 600

gauggaucuu gagggcaagc agggaaauuu caagaaucuc cgcgaguuug ucuucaagaa 660

caucgacggg uacuucaaga uauacucgaa gcacacgccg aucaaccucg uccgugaucu 720

cccgcagggc uucagcgcuc uggagccgcu gguggaucuc ccgaucggga ucaacaucac 780

gcgguuccag acgcugcugg cccugcacag gaguuaccug acgccgggug acuccaguag 840

uggguggacu gcgggugccg cggcguacua cgucggguac cugcagccgc gcacguucuu 900

guugaaguac aacgagaacg ggacgaucac ggacgcgguu gauugcgcgu uggacccucu 960

gucggagacg aagugcaccc ugaagucguu cacgguggag aaggguaucu aucagaccuc 1020

gaacuuccgg guccagccga cugagaguau cguucgguuc ccgaacauua cgaaccugug 1080

uccguucggg gaggucuuca acgcgacgcg guucgcgagu guguacgcuu ggaaccggaa 1140

gaggaucucg aauugugugg cggacuacag ugugcuguac aauucggcgu ccuuuuccac 1200

guucaagugc uacggggugu cgcccacgaa guugaacgac cucugcuuca ccaacgugua 1260

ugcggauucc uucgucaucc guggugacga ggugcgucag auugcgccgg ggcagacggg 1320

gaagauagcg gacuauaauu auaaguugcc cgacgacuuu acuggcugcg uuauugcuug 1380

gaacagcaau aaccuggaca guaaggucgg gggcaacuau aauuaucggu accgucuguu 1440

ccggaagagc aaucugaagc ccuucgagcg cgauaucucg accgagaucu accaggccgg 1500

cucgaagccg ugcaacggcg ucgagggguu uaauuguuac uuuccguuac agagcuacgg 1560

guuucagccc acgaacgggg ugggguacca gcccuaccgc gucguggugc ugagcuucga 1620

gcugcugcac gccccggcca cggugugcgg uccgaagaaa aguacaaacc uugugaagaa 1680

caagugugug aacuuuaacu ucaacgggcu caccgggacg gggguguuga cggagaguaa 1740

caagaaguuc cugccguucc agcaguucgg ucgggauauc gcggacacca cggaugccgu 1800

gagggauccg cagacgcuug agauucugga caucacgccc ugcagcuucg ggggcgucag 1860

ugugaucacg ccugguacga acaccagcaa ccagguugcg guguuguacc agggugugaa 1920

uugcacugag guccccguag cgauccacgc ggaucagcug accccgacgu ggagggugua 1980

cucgacgggg aguaaugucu uccagacucg cgcggguugc cugauuggcg cugagcacgu 2040

gaacaacucg uacgagugcg acauucccau uggggcgggg aucugcgcgu cguaccagac 2100

ccagacgaac agccggggca gcgcuagcag cgucgcgucg cagucgauca ucgcguacac 2160

gaugagccug ggggcggaga acaguguggc cuauucgaac aacagcauag cuauccccac 2220

gaauuuuacg aucaguguga cgaccgagau cuugcccgug ucgaugacca agaccucggu 2280

cgauugcacg auguacauuu guggggauag cacugagugu ucuaaccucc ugcuccagua 2340

cggcaguuuc uguacgcagc ucaaccgggc gcuuacgggg auugccgugg agcaggacaa 2400

gaacacucag gagguguuug cgcaggucaa gcagaucuac aagacgccuc cgaucaagga 2460

uuucgggggg uucaauuucu cccagauacu ccccgacccu ucgaagccca gcaagcguag 2520

cccuauugag gaccugcucu ucaauaaggu uacgcuugcg gacgcgggcu ucaucaagca 2580

guacggggac ugucuggggg acauugccgc ccgggaccug aucugugcuc agaaguucaa 2640

ugggcucacu guucugccgc cccugcucac ggacgagaug aucgcgcagu acacgucggc 2700

gcuccucgcc ggcacgauca cgucgggcug gacguuuggg gcugguccug cgcugcagau 2760

cccguucccu augcagaugg cguaccgcuu caaugggauc ggggugaccc agaauguccu 2820

guacgagaau cagaagcuca ucgccaauca guucaacucg gcgaucggga agauacagga 2880

cucccugucg aguacgccuu ccgcguuggg gaagcugcag aacgugguga accagaaugc 2940

ucaggcguug aacacguugg ugaagcagcu gucguccaac uucggggcga uauccucggu 3000

gcugaacgau auucucaguc ggcuggaccc gccggaggcg gagguucaga ucgauagacu 3060

caucacuggu cgccuccaga guuugcagac guacgugacu cagcagcuca uccgggcugc 3120

ugagauacgu gcgucugcga accuggcggc gaccaagaug agugagugcg ugcuggggca 3180

gagcaagcgg guggacuuuu gcgggaaggg cuaucaccug auguccuucc cgcaguccgc 3240

cccucacggg guggucuucc ugcacgugac guaugugccg gcgcaggaga agaacuucac 3300

cacggcgccg gccauauguc acgacgggaa ggcccacuuc ccccgugagg gggucuucgu 3360

gucgaauggg acgcacuggu ucgugacgca gcggaauuuc uaugagccgc agauaauuac 3420

gacugacaac acguuuguca gugguaauug ugaugugguc auagggauug uuaacaacac 3480

cguguaugau ccccuccagc cggagcugga cagcuucaag gaggagcugg auaaguacuu 3540

caagaaucac acgucgccgg acguggaucu uggggacaua ucggggauca acgcgagugu 3600

uguuaacaua cagaaggaga ucgaccggcu caaugagguu gcgaagaacc ucaaugaguc 3660

guugaucgac cuucaggagc ucggcaagua ugagcagggc uauaucccag aggccccuag 3720

agauggccag gccuacguua gaaaggacgg cgaguggguc cugcugagca cauuccuggg 3780

cugagaauuc gcuggagccu cgguagccgu uccuccugcc cgcugggccu cccaacgggc 3840

ccuccucccc uccuugcacc ggcccuuccu ggucuuuggc uggagccucg guagccguuc 3900

cuccugcccg cugggccucc caacgggccc uccuccccuc cuugcaccgg cccuuccugg 3960

ucuuuguuaa uuaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 4020

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 4080

aaaaaaaaaa aaaacuag 4098

Claims

1. An S protein mutant of 2019-nCoV, comprising at least an extracellular domain which, relative to the extracellular domain of the parent S protein, comprises the following amino acid mutations: F817P, A892P, A899P, A942P and KV986_987PP, together with T19R, G142D, EF156-157del, R158G, L452R, T478K, D614G, P681R and D950N, wherein the amino acid positions are numbered according to the amino acid sequence shown in SEQ ID NO: 1.
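Because EF156-157del removes two residues, any position downstream of residue 157 sits two lower in a sequence that carries the deletion than in the SEQ ID NO: 1 numbering used in this claim. In the SEQ ID NO: 2 listing, for example, the Pro-Pro pair corresponding to KV986_987PP appears at positions 984-985. A minimal sketch of that bookkeeping follows; the function name and data layout are illustrative assumptions, not part of the patent.

```python
# Sketch: map parent (SEQ ID NO: 1) numbering onto a sequence carrying deletions.
# `shifted_position` is a hypothetical helper written for this illustration.

def shifted_position(parent_pos, deleted_positions):
    """Return the 1-based position in the deleted sequence,
    or None if the residue itself is deleted."""
    if parent_pos in deleted_positions:
        return None
    # Every deleted residue upstream of parent_pos shifts it down by one.
    return parent_pos - sum(1 for d in deleted_positions if d < parent_pos)

DELETED = {156, 157}  # EF156-157del from claim 1

# Proline substitution sites from claim 1, in parent numbering.
PROLINE_SITES = [817, 892, 899, 942, 986, 987]

mutant_sites = [shifted_position(p, DELETED) for p in PROLINE_SITES]
print(mutant_sites)  # [815, 890, 897, 940, 984, 985]
```

Positions upstream of the deletion, such as T19R, keep their parent numbering.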

2. The S protein mutant of 2019-nCoV of claim 1, further comprising a mutation of the amino acids RRAR at positions 682 to 685, relative to the amino acid sequence set forth in SEQ ID NO: 1, such that cleavage by furin is disabled; preferably, the RRAR is mutated to GSAS;

preferably, the S protein mutant of 2019-nCoV does not comprise the transmembrane domain and/or cytoplasmic tail of the S protein;

preferably, the extracellular domain of the S protein mutant of 2019-nCoV is directly fused at its C-terminus to a domain that assists in trimer formation; preferably, the domain that assists in trimer formation is the T4 fibritin foldon trimerization motif.

3. The S protein mutant of 2019-nCoV according to claim 1 or 2, comprising the amino acid sequence shown in SEQ ID NO: 2;

preferably, the amino acid sequence of the S protein mutant of 2019-nCoV comprises the amino acid sequence of SEQ ID NO: 2 and the amino acid sequence of SEQ ID NO: 3, directly linked from the N-terminus to the C-terminus.

4. A DNA molecule encoding the S protein mutant of 2019-nCoV of any one of claims 1-3;

preferably, the nucleotide sequence of the DNA molecule comprises a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the nucleotide sequence of SEQ ID NO: 4, and a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the nucleotide sequence of SEQ ID NO: 5, the two sequences being directly linked from the 5' end to the 3' end.
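The relationship between the DNA of SEQ ID NO: 5 and the trimerization domain of SEQ ID NO: 3 can be checked by translating the 84 nucleotides with the standard genetic code. The sketch below is illustrative only: the helper name `translate` and the one-letter rendering of SEQ ID NO: 3 are conveniences introduced here, and the sequences are copied from the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a DNA coding sequence (spaces allowed) into one-letter amino acids."""
    dna = dna.upper().replace(" ", "")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

# SEQ ID NO: 5 as printed in the sequence listing (84 nt).
SEQ5 = ("ggctatatcc cagaggcccc tagagatggc caggcctacg ttagaaagga cggcgagtgg "
        "gtcctgctga gcacattcct gggc")

# SEQ ID NO: 3 (28 aa) rewritten in one-letter code.
SEQ3 = "GYIPEAPRDGQAYVRKDGEWVLLSTFLG"

foldon = translate(SEQ5)
print(foldon == SEQ3)  # True
```

The same arithmetic applies to the longer sequence: 3618 nucleotides in SEQ ID NO: 4 correspond to 1206 codons.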

5. An expression vector comprising the DNA molecule of claim 4.

6. A cell comprising the DNA molecule of claim 4 or the expression vector of claim 5.

7. An mRNA molecule comprising an open reading frame encoding the S protein mutant of 2019-nCoV of any one of claims 1-3;

preferably, the nucleotide sequence of the open reading frame is a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the nucleotide sequence set forth in SEQ ID NO. 8.
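The open reading frame of SEQ ID NO: 8 (3705 nt) is consistent with the RNA form of SEQ ID NO: 4 (3618 nt) followed by the RNA form of SEQ ID NO: 5 (84 nt) and a stop codon. The sketch below checks the last 87 nucleotides of SEQ ID NO: 8 against SEQ ID NO: 5; the helper `transcribe` is a hypothetical name, and it simply rewrites the coding strand with U in place of T.

```python
def transcribe(dna):
    """Rewrite a coding-strand DNA sequence as RNA (t -> u), dropping spaces."""
    return dna.lower().replace(" ", "").replace("t", "u")

# SEQ ID NO: 5 as printed in the sequence listing.
SEQ5_DNA = ("ggctatatcc cagaggcccc tagagatggc caggcctacg ttagaaagga cggcgagtgg "
            "gtcctgctga gcacattcct gggc")

# Last 87 nt of SEQ ID NO: 8, copied from the listing (positions 3619-3705).
SEQ8_TAIL = ("gg cuauauccca gaggccccua gagauggcca ggccuacguu "
             "agaaaggacg gcgagugggu ccugcugagc acauuccugg gcuga").replace(" ", "")

expected_tail = transcribe(SEQ5_DNA) + "uga"  # foldon coding region plus stop codon
print(expected_tail == SEQ8_TAIL)             # True
print(3618 + 84 + 3)                          # 3705
```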

8. The mRNA molecule of claim 7, wherein the mRNA comprises, from the 5' end to the 3' end, a 5' UTR, an open reading frame encoding the S protein mutant of 2019-nCoV, a 3' UTR, and a poly-A tail;

preferably, the 5' UTR comprises a 5' UTR of β-globin or α-globin or a homolog or fragment thereof; preferably, the 5' UTR comprises a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the 5' UTR nucleotide sequence of β-globin shown in SEQ ID NO: 6;

preferably, the 3' UTR comprises a 3' UTR of β-globin or α-globin or a homolog or fragment or combination of fragments thereof; preferably, the 3' UTR comprises one nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to a fragment of the α2-globin 3' UTR shown in SEQ ID NO: 7; alternatively, 2 or more nucleotide sequences joined end-to-end that are at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to a fragment of the α2-globin 3' UTR shown in SEQ ID NO: 7;

preferably, the poly-A tail is 50-200 nucleotides in length, preferably 100-150 nucleotides in length;

preferably, the mRNA further contains a Kozak sequence, preferably the Kozak sequence is GCCACC;

preferably, the mRNA further comprises a 5' cap; preferably, the 5' cap is Cap1.

9. The mRNA molecule of claim 7 or 8, comprising a nucleotide sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or about 100% homologous to the nucleotide sequence set forth in SEQ ID No. 9.
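The element order recited in claim 8 can be cross-checked against the SEQ ID NO: 9 listing by tallying segment lengths. The sketch below uses lengths read off that listing; the segment labels (for example "linker") are descriptive assumptions introduced here, as is the choice to count the final "aa" of uuaauuaa as linker rather than as part of the poly-A run.

```python
# Segment lengths read from the SEQ ID NO: 9 listing (labels are illustrative).
SEGMENTS = [
    ("leading sequence gggagaccggccucgag", 17),
    ("5' UTR (SEQ ID NO: 6)", 50),
    ("linker aagcuu", 6),
    ("Kozak sequence gccacc", 6),
    ("open reading frame (SEQ ID NO: 8)", 3705),
    ("linker gaauuc", 6),
    ("3' UTR, two tandem copies of SEQ ID NO: 7", 2 * 88),
    ("linker uuaauuaa", 8),
    ("poly-A run", 120),
    ("trailing cuag", 4),
]

# Compute 1-based inclusive coordinates for each segment.
coords = {}
start = 1
for name, length in SEGMENTS:
    end = start + length - 1
    coords[name] = (start, end)
    print(f"{start:>5}-{end:<5} {name}")
    start = end + 1

total = sum(length for _, length in SEGMENTS)
print(total)  # 4098
```

Under these assumptions the open reading frame occupies positions 80-3784, and the 120-nucleotide poly-A run falls within the preferred 100-150 nucleotide range of claim 8.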

10. A nucleic acid molecule encoding an mRNA molecule according to any one of claims 7 to 9.

11. A vaccine composition comprising the S protein mutant of 2019-nCoV of any one of claims 1-3, or the mRNA molecule of any one of claims 7-9; preferably, the vaccine composition further comprises a pharmaceutically acceptable excipient and/or an immunoadjuvant; preferably, the vaccine composition is for use in the prevention and/or treatment of 2019-nCoV infection or a disease or disorder associated with 2019-nCoV infection; preferably, the disease or disorder associated with 2019-nCoV infection is selected from pneumonia caused by 2019-nCoV infection; headache, nasal obstruction, runny nose, cough and/or airway inflammation caused by 2019-nCoV infection; disseminated intravascular coagulation caused by 2019-nCoV infection; and sepsis caused by 2019-nCoV infection.

12. The vaccine composition of claim 11, further comprising a lipid nanoparticle in which the mRNA is located, the lipid nanoparticle comprising, based on its total lipid molecules, 30-60 mol% ionizable/cationic lipid molecules, 5-30 mol% neutral lipid molecules, 30-50 mol% cholesterol lipid molecules and 0.4-10 mol% PEGylated lipid molecules; preferably 32-55 mol% ionizable/cationic lipid molecules, 8-20 mol% neutral lipid molecules, 35-50 mol% cholesterol lipid molecules and 0.5-5 mol% PEGylated lipid molecules; more preferably 39-51 mol% ionizable/cationic lipid molecules, 9-16 mol% neutral lipid molecules, 37-49 mol% cholesterol lipid molecules and 1.3-2.7 mol% PEGylated lipid molecules;

preferably, the ionizable/cationic lipid molecule is a compound of formula C, wherein each n₃ is independent of the others and may be the same or different, each n₃ being selected from integers from 1 to 8, and each m₃ is independent of the others and may be the same or different, each m₃ being selected from integers from 0 to 8; preferably, each n₃ is selected from integers from 4 to 8 and each m₃ is selected from integers from 4 to 8; preferably, all n₃ are identical to each other and all m₃ are identical to each other;

preferably, the ionizable/cationic lipid molecular structure is as follows:

preferably, the ratio of the total mass of the lipid molecules to the mass of the mRNA is 5-20:1.
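The nested molar ranges of claim 12 lend themselves to a simple range check. The sketch below is an illustration under stated assumptions: the tier names, the dictionary keys, the helper functions and the two example compositions are all hypothetical and do not come from the patent; only the numeric ranges and the 5-20:1 mass ratio are taken from the claim.

```python
# Molar ranges (mol% of total lipid) from claim 12; tier names are illustrative.
RANGES = {
    "broad": {"ionizable": (30, 60), "neutral": (5, 30),
              "cholesterol": (30, 50), "peg": (0.4, 10)},
    "preferred": {"ionizable": (32, 55), "neutral": (8, 20),
                  "cholesterol": (35, 50), "peg": (0.5, 5)},
    "more_preferred": {"ionizable": (39, 51), "neutral": (9, 16),
                       "cholesterol": (37, 49), "peg": (1.3, 2.7)},
}

def tiers_satisfied(composition):
    """Return the names of the tiers whose ranges the composition meets."""
    if abs(sum(composition.values()) - 100.0) > 1e-6:
        raise ValueError("mol% values must sum to 100")
    return [
        tier for tier, limits in RANGES.items()
        if all(lo <= composition[k] <= hi for k, (lo, hi) in limits.items())
    ]

def lipid_mass_range(mrna_mass):
    """Total lipid mass bounds for a lipid:mRNA mass ratio of 5-20:1."""
    return (5 * mrna_mass, 20 * mrna_mass)

# Hypothetical compositions chosen only to exercise the check.
example_a = {"ionizable": 50.0, "neutral": 10.0, "cholesterol": 38.5, "peg": 1.5}
example_b = {"ionizable": 35.0, "neutral": 20.0, "cholesterol": 40.0, "peg": 5.0}

print(tiers_satisfied(example_a))  # ['broad', 'preferred', 'more_preferred']
print(tiers_satisfied(example_b))  # ['broad', 'preferred']
print(lipid_mass_range(100))       # (500, 2000)
```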