CN113185613B - Novel coronavirus S protein and subunit vaccine thereof - Google Patents

Publication number
CN113185613B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110395117.4A
Other languages
Chinese (zh)
Other versions
CN113185613A (en)
Inventor
徐可
蓝柯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Application filed by Wuhan University WHU
Priority to CN202110395117.4A
Publication of CN113185613A
Application granted
Publication of CN113185613B
Legal status: Active
Anticipated expiration

Classifications

    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof, from viruses
    • A61K39/12 Medicinal preparations containing antigens or antibodies: Viral antigens
    • A61P31/14 Antiinfectives: Antivirals for RNA viruses
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts, for animal cells
    • C12N5/0682 Cells of the female genital tract, e.g. endometrium; Non-germinal cells from ovaries, e.g. ovarian follicle cells
    • C07K2319/00 Fusion polypeptide
    • C07K2319/02 Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
    • C07K2319/03 Fusion polypeptide containing a localisation/targetting motif containing a transmembrane segment
    • C12N2510/02 Genetically modified cells: Cells for production
    • C12N2770/20022 Coronaviridae: New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
    • C12N2770/20034 Coronaviridae: Use of virus or viral component as vaccine, e.g. live-attenuated or inactivated virus, VLP, viral protein
    • C12N2800/107 Plasmid DNA for vertebrates, for mammalian
    • C12N2800/22 Vectors comprising a coding region that has been codon optimised for expression in a respective host
    • Y02A50/30 Technologies for adaptation to climate change in human health protection: Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Abstract

The invention provides a novel coronavirus S protein and a subunit vaccine thereof, wherein the furin cleavage site 682-RRAR-685 between the S1 and S2 subunits of the novel coronavirus S protein is replaced by a flexible protein linker. The linker is GSAS, a GS combination consisting of glycine (G) and serine (S) such as (GGGS)n or (GGGGS)n, or (G)n, where n is an integer not less than 1. The novel coronavirus subunit vaccine comprises the recombinant S protein and a pharmaceutically acceptable adjuvant. The invention uses the biologically active S protein in its trimeric conformation to prepare a trimeric subunit vaccine which, after immunization of mice, induces neutralizing antibodies with immune-protective function against SARS-CoV-2 and provides 100 percent protection when the immunized mice are subjected to a lethal challenge with the novel coronavirus.

Description

Novel coronavirus S protein and subunit vaccine thereof
Technical Field
The invention belongs to the technical field of biology, and relates to a novel coronavirus S protein and a subunit vaccine thereof.
Background
On 30 January 2020, the World Health Organization declared the global outbreak caused by the novel coronavirus (SARS-CoV-2) a public health emergency of international concern. Owing to the extremely high transmissibility of the virus, more than 123 million cases of SARS-CoV-2 infection had been reported worldwide by 20 March 2021, with 2,718,896 deaths. To cope with the global epidemic of novel coronavirus pneumonia, governments, enterprises and academia in many countries are developing treatment and prevention strategies, among which the development and application of vaccines and antiviral drugs are the most important. Vaccines are the most effective means of combating viral infections and of protecting uninfected people.
The types of SARS-CoV-2 vaccines currently under investigation or approved for sale are inactivated vaccines, adenoviral vector vaccines, nucleic acid vaccines (including mRNA vaccines and DNA vaccines) and subunit vaccines. Although some vaccines against SARS-CoV-2 have been authorized for emergency use, current vaccine varieties still have many problems: 1. Compared with nucleic acid vaccines and subunit vaccines, the inactivated vaccines mainly used in China have low immunogenicity; the inactivating reagent can damage natural epitopes of the antigen, and other proteins in the virus particle can interfere with the immunogenicity of the S protein. 2. Nucleic acid vaccines are being applied for the first time against the novel coronavirus; no nucleic acid vaccine had been successfully marketed before, and their long-term safety and integration risk are unknown. 3. The adenoviral vectors of adenoviral vector vaccines are subject to interference from pre-existing immunity. 4. The existing vaccines were designed against early epidemic strains, cannot effectively cope with emerging mutant strains, and do not provide conserved, broad protection. Because a subunit vaccine uses the full-length or partial amino acid sequence of the SARS-CoV-2 S protein as antigen, it is strongly immunogenic and can induce high-titer neutralizing antibodies, and recombinant gene technology makes it convenient to update the antigen for mutant strains and to design it for broad-spectrum coverage; subunit vaccines therefore have greater advantages in coping with SARS-CoV-2 mutant strains.
SARS-CoV-2 is an enveloped, single-stranded, positive-sense RNA virus with a genome of about 30 kb, belonging to the genus Betacoronavirus of the family Coronaviridae. The viral genome encodes a variety of proteins, including the spike protein (S), membrane glycoprotein (M), nucleocapsid protein (N), envelope protein (E) and a variety of non-structural proteins. The S protein is a type I viral fusion protein that mediates attachment of the virus to the cell surface receptor angiotensin-converting enzyme 2 (ACE2), after which the genome is released into the cell; the S protein is therefore the target of neutralizing antibodies and the main active component for vaccine preparation. The novel coronavirus S protein consists of 1273 amino acids and comprises 21-35 N-glycosylation sites. The S protein forms a characteristic crown-like (corona) structure on the surface of the virus in the form of trimers, from which the coronavirus takes its name. Under the action of host cell proteases, the S protein is cleaved into the two subunits S1 and S2 at the RRAR cleavage sequence in the middle of the protein; S1 mainly functions to bind the host cell surface receptor, and the S2 subunit mediates virus-cell and cell-cell membrane fusion. The intact trimeric structure of the S protein is the main viral structure recognized by the host's neutralizing antibodies.
Therefore, there is a need to develop a new effective vaccine against a novel coronavirus.
Disclosure of Invention
In order to solve the technical problem, the invention provides a novel coronavirus S protein and subunit vaccine thereof, wherein the S1/S2 cleavage site RRAR is mutated to lose the ability of being cleaved by furin-like protease so as to retain complete S protein antigenicity.
In a first aspect of the invention, a novel coronavirus S protein is provided, wherein the Furin cleavage site 682-RRAR-685 between the S1 and S2 subunits of the novel coronavirus S protein is replaced by a flexible protein linker, and the linker is GSAS, a GS combination consisting of glycine (G) and serine (S) such as (GGGS)n or (GGGGS)n, or (G)n, wherein n is an integer of 1 or more.
Further, the novel coronavirus S protein has at least one of the following modifications 1-4:
modification 1, replacement of the original signal peptide of the novel coronavirus S protein with one of a tPA signal peptide, a CD5 signal peptide, and an IgG signal peptide;
modification 2, replacement of the transmembrane region shown in SEQ ID NO: 24 with a T4 bacteriophage fibritin trimer motif or a GCN4 multimer-forming motif (SEQ ID NO: 25);
modification 3, deletion of the C-terminal domain of the novel coronavirus S protein shown in SEQ ID NO: 7;
modification 4, the novel coronavirus S protein has one or more amino acid residues within amino acid positions 817-987 mutated to proline, said mutation comprising the substitution K986P and/or V987P.
Further, the nucleotide sequence of the novel coronavirus S protein is shown as SEQ ID NO: 8, SEQ ID NO: 10 or SEQ ID NO: 12.
Further, the amino acid sequence of the novel coronavirus S protein is shown as SEQ ID NO: 9, SEQ ID NO: 11 or SEQ ID NO: 13.
In a second aspect of the invention, there is provided a nucleic acid molecule having a nucleotide sequence as shown in SEQ ID NO: 8, SEQ ID NO: 10 or SEQ ID NO: 12.
In a third aspect of the invention, a recombinant expression vector is provided which is capable of expressing the novel coronavirus S protein.
Further, the nucleotide sequence of the expression region of the recombinant expression vector is shown as SEQ ID NO: 8, SEQ ID NO: 10 or SEQ ID NO: 12.
In a fourth aspect of the invention, there is provided an engineered cell comprising said recombinant expression vector.
In a fifth aspect of the present invention, there is provided a method for preparing a novel coronavirus S protein, the method comprising:
obtaining the recombinant expression vector;
transfecting the recombinant expression vector into cells, and obtaining a cell strain for stably expressing the recombinant S protein through glutamine resistance screening and monoclonal screening of a cell population;
and carrying out secretory expression and purification on the cell strain to obtain a purified recombinant novel coronavirus S protein.
In a sixth aspect of the invention, a novel coronavirus subunit vaccine is provided, said novel coronavirus subunit vaccine comprising said recombinant S protein and a pharmaceutically acceptable adjuvant.
Further, the adjuvant comprises at least one of aluminium hydroxide, lecithin, Freund's adjuvant, MPL TM, IL-12, aluminium hydroxide combined CpG ODN composite adjuvant, ISA51VG, ISA720VG, MF59, QS21 and AS03 adjuvants.
One or more technical solutions in the embodiments of the present invention have at least the following technical effects or advantages:
the immunogenic S protein polypeptide has a stable prefusion conformation, and the Furin cleavage site 682-RRAR-685 between the S1 and S2 subunits is replaced by a flexible protein linker, such as GSAS, a GS combination consisting of glycine (G, Gly) and serine (S, Ser) such as (GGGS)n or (GGGGS)n, or (G)n, with the length and effect of the linker adjusted through n. In the present invention, various flexible linkers were compared; the results in FIG. 2B show that GS flexible linkers, including 682-GSAS-685 and 682-GG-685, protect against cleavage better than the reported 682-QQAQ-685 mutation: S2 is cleaved less with the GS mutations, so the uncleaved form of the S protein is better maintained. The invention uses the biologically active S protein in its trimeric conformation to prepare a trimeric subunit vaccine; immunized mice generate neutralizing antibodies with immune-protective function against the original SARS-CoV-2 strain and currently circulating mutant strains, and the vaccine provides 100 percent protection when immunized mice are subjected to a lethal challenge with the novel coronavirus.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the description below are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
FIG. 1 shows the sequence design of secreted S protein from mammalian cells;
FIG. 2 shows the optimized expression of the S protein in CHO-K1 cells; FIG. 2A: comparison of protein expression levels for the original S protein sequence and the codon-optimized sequence; FIG. 2B: comparison of cleavage for the original Furin cleavage site sequence of the S protein and two different mutation strategies; FIG. 2C: comparison of expression in the cell culture supernatant of S protein carrying the original signal peptide and S protein carrying the tPA or IgG signal peptide;
FIG. 3 is a growth curve of CHO-K1 cells stably expressing S protein;
FIG. 4 is an analysis of the expression level of the secretory expression recombinant S protein;
FIG. 5 is an analysis of trimers secreting expressed recombinant S proteins;
FIG. 6 is a graph showing the detection of the level of antigen-elicited specific antibodies of the S protein trimer;
FIG. 7 shows the detection of the neutralizing activity of antibodies generated after immunization of mice with the S protein trimer antigen; in FIGS. 7A and 7B, serum collected from mice given two vaccine doses or PBS and serum collected from SARS-CoV-2 convalescent patients, respectively, was serially diluted, incubated with novel coronavirus for 1 h and used to infect BHK-21-ACE2 cells, which were collected after 24 h to measure firefly luciferase activity;
FIG. 8 is a protection study of the recombinant S protein trimer subunit vaccine; in FIGS. 8A and 8B, mice given two vaccine doses or PBS were infected intranasally with SARS-CoV-2, and the body-weight change and survival of the vaccine group and the control group were recorded;
FIG. 9 shows that the vaccine of the present invention has very high neutralizing activity against pseudoviruses of different novel coronavirus mutant strains; FIG. 9A shows that sera from mice after the second immunization had high neutralizing activity against pseudoviruses of the original strain and of some currently circulating mutant strains; FIG. 9B shows that sera from inactivated-vaccine volunteers had only low neutralizing activity against the original-strain pseudovirus and none against the South African strain pseudovirus;
FIG. 10 is a challenge validation of the recombinant S protein trimer subunit vaccine; FIG. 10A shows that the body weight of vaccine-group mice changed little after infection with SARS-CoV-2; FIG. 10B shows that the recombinant S protein trimer subunit vaccine completely protected against infection with a lethal dose of SARS-CoV-2.
Detailed Description
The present invention will be specifically explained below in conjunction with specific embodiments and examples, and the advantages and various effects of the present invention will be more clearly presented thereby. It will be understood by those skilled in the art that these specific embodiments and examples are for the purpose of illustrating the invention and are not to be construed as limiting the invention.
Throughout the specification, unless otherwise specifically noted, terms used herein should be understood as having meanings as commonly used in the art. Accordingly, unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. If there is a conflict, the present specification will control.
Unless otherwise specifically stated, various raw materials, reagents, instruments, equipment and the like used in the present invention are commercially available or can be prepared by existing methods.
In order to solve the technical problems, the embodiment of the invention provides the following general ideas:
the applicant has found through analysis and experimental verification that the full sequence covering S1 and S2 is the best choice of vaccine antigen: it induces more antibody species and is more broad-spectrum than S1 (or the RBD region) alone or S2 alone. Meanwhile, the complete trimeric structure of the S protein is the main viral structure recognized by the host's neutralizing antibodies. Therefore, if the trimeric structure of the S protein can be reproduced in vitro, and trimeric S protein obtained by in vitro recombinant expression and purification, the natural conformation of the virus can be mimicked and the organism can be stimulated to produce antibodies closest to those recognizing the natural virus; the complete trimeric S protein is thus also the optimal choice for a recombinant subunit vaccine.
Therefore, the invention aims to provide a novel coronavirus trimeric S protein which comprises both the S1 and S2 subunits and can be secreted into the supernatant of mammalian cells, together with its gene sequence, and to provide highly effective protection by preparing a novel coronavirus subunit vaccine from the protein expressed in CHO cell supernatant.
According to a typical embodiment of the present invention, there is provided a novel coronavirus S protein, wherein the Furin cleavage site 682-RRAR-685 between the S1 and S2 subunits of the novel coronavirus S protein is replaced by a flexible protein linker, and the linker is GSAS, a GS combination consisting of glycine (G) and serine (S) such as (GGGS)n or (GGGGS)n, or (G)n, wherein n is an integer of 1 or more.
Any integer n of 1 or more is within the protection scope of the invention; preferably, n in the embodiments of the application is 1-3; when the linker is (G)n, n may be larger and is preferably 1-10;
the core of the present application is that the S1/S2 cleavage site RRAR is mutated so that it can no longer be cleaved by Furin-like proteases, thereby retaining the complete antigenicity of the S protein. The immunogenic S protein polypeptides of the invention have a stable prefusion conformation, and the Furin cleavage site 682-RRAR-685 between the S1 and S2 subunits is replaced with a flexible protein linker, e.g. GSAS, a GS combination consisting of glycine (G, Gly) and serine (S, Ser) such as (GGGS)n or (GGGGS)n, or (G)n, with the length and effect of the linker adjusted through n. In the examples of the invention, various flexible linkers were compared; the results in FIG. 2B show that GS flexible linkers, including 682-GSAS-685 and 682-GG-685, protect against cleavage better than the reported 682-QQAQ-685 mutation: S2 is cleaved less with the GS mutations, so the uncleaved form of the S protein is better maintained.
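The linker families and the site replacement described above can be illustrated with a minimal sketch. The function names, the `kind` labels and the 12-residue toy sequence are hypothetical and are not taken from the sequence listing; the sketch only shows how a linker of a given family and repeat number n could be built and substituted for a checked motif.

```python
def make_linker(kind: str, n: int = 1) -> str:
    """Build a flexible linker. `kind` labels are illustrative, not from the patent."""
    if n < 1:
        raise ValueError("n must be an integer >= 1")
    if kind == "GSAS":
        return "GSAS"  # fixed four-residue linker, n is ignored
    repeat_units = {"GGGS": "GGGS", "GGGGS": "GGGGS", "G": "G"}
    if kind not in repeat_units:
        raise ValueError(f"unknown linker kind: {kind}")
    return repeat_units[kind] * n


def replace_site(seq: str, start: int, end: int, expected: str, linker: str) -> str:
    """Replace residues start..end (1-based, inclusive) after checking the motif."""
    found = seq[start - 1:end]
    if found != expected:
        raise ValueError(f"expected {expected} at {start}-{end}, found {found}")
    return seq[:start - 1] + linker + seq[end:]


# Toy sequence (hypothetical, NOT the S protein) with RRAR at positions 5-8.
toy = "MKTSRRARSVAS"
print(replace_site(toy, 5, 8, "RRAR", make_linker("GSAS")))  # MKTSGSASSVAS
print(replace_site(toy, 5, 8, "RRAR", make_linker("G", 2)))  # MKTSGGSVAS
```

For the real construct the same call would use positions 682-685 of the full-length S protein; checking the wild-type motif first guards against numbering errors.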
The nucleotide sequence of the wild-type novel coronavirus S protein is shown as SEQ ID NO: 1, and the amino acid sequence of the wild-type novel coronavirus S protein is shown as SEQ ID NO: 2; the amino acid sequence of the original signal peptide of the wild-type novel coronavirus S protein is MFVFLVLLPLVSS, shown as SEQ ID NO: 15; the nucleotide sequence of the original signal peptide is shown as SEQ ID NO: 14;
for efficient expression in eukaryotic cells, in some embodiments the codons of the S expression gene are optimized for mammalian codon preference using JAVA Codon adaptation software, so that the expression efficiency in mammalian cells is higher than that of the natural S gene. The optimized nucleotide sequence of the original signal peptide is shown as SEQ ID NO: 16, and the corresponding amino acid sequence as SEQ ID NO: 17 (identical to SEQ ID NO: 15). Other codon optimization methods can also be adopted; specifically, the following codon optimization schemes can be used:
codon optimization scheme 1: the nucleotide sequence is shown as SEQ ID NO: 3;
codon optimization scheme 2: the nucleotide sequence is shown as SEQ ID NO: 4;
codon optimization scheme 3: the nucleotide sequence is shown as SEQ ID NO: 5;
codon optimization scheme 4: the nucleotide sequence is shown as SEQ ID NO: 6;
codon optimization scheme 5: the nucleotide sequence is shown as SEQ ID NO: 7;
and the furin cleavage site 682-RRAR-685 between the S1 and S2 subunits in codon optimization schemes 1-5 is replaced with the codon sequence of the flexible protein linker of the application, wherein the linker is GSAS, a GS combination consisting of glycine (G) and serine (S) such as (GGGS)n or (GGGGS)n, or (G)n, wherein n is an integer of 1 or more.
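At the nucleotide level, replacing the cleavage site amounts to swapping the codons of the replaced residues for codons encoding the linker. The following is a minimal sketch under stated assumptions: the preferred-codon table lists one commonly used mammalian codon per residue and is an illustrative choice, not the codon usage of SEQ ID NO: 3-7; the toy coding sequence and function names are hypothetical.

```python
# Assumed preferred codons (illustrative only; each is a standard codon
# for its residue, but the patent's optimized sequences may differ).
PREFERRED_CODON = {"G": "GGC", "S": "AGC", "A": "GCC"}
# Reverse lookup used only to verify the encoded linker.
CODON_TO_AA = {codon: aa for aa, codon in PREFERRED_CODON.items()}


def encode_linker(linker: str) -> str:
    """Back-translate a G/S/A linker using the assumed preferred codons."""
    return "".join(PREFERRED_CODON[aa] for aa in linker)


def translate_linker(dna: str) -> str:
    """Translate a linker coding sequence back to amino acids (G/S/A only)."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def swap_codons(cds: str, first_res: int, last_res: int, linker_dna: str) -> str:
    """Replace the codons of residues first_res..last_res (1-based) in-frame."""
    if len(cds) % 3 or len(linker_dna) % 3:
        raise ValueError("sequences must be whole codons")
    return cds[:3 * (first_res - 1)] + linker_dna + cds[3 * last_res:]


# Toy CDS (hypothetical): Met-Arg-Arg-Ala-Arg-Ser, with RRAR at residues 2-5.
toy_cds = "ATGCGGCGGGCCCGGAGC"
linker_dna = encode_linker("GSAS")
print(linker_dna)                                  # GGCAGCGCCAGC
print(swap_codons(toy_cds, 2, 5, linker_dna))      # ATGGGCAGCGCCAGCAGC
```

Working in whole codons keeps the downstream S2 coding region in frame whatever the linker length.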
As an alternative embodiment, the novel coronavirus S protein has at least one of the following modifications 1-4:
Modification 1, the original signal peptide of the novel coronavirus S protein may be replaced with one of the tPA signal peptide shown in SEQ ID NO: 18 and SEQ ID NO: 19, the IgG signal peptide shown in SEQ ID NO: 20 and SEQ ID NO: 21, and the CD5 signal peptide shown in SEQ ID NO: 22 and SEQ ID NO: 23. A secretory signal peptide is used so that the S protein can be secreted into the culture supernatant of mammalian cells; among the tPA signal peptide, the natural S protein signal peptide, the IgG signal peptide and the CD5 signal peptide, the tPA signal peptide is preferred.
Modification 2, the transmembrane region shown in SEQ ID NO: 24 is replaced with a T4 phage Fibritin trimer motif or a GCN4 multimer-forming motif (SEQ ID NO: 25). To better form the trimeric structure, the S protein is fused at its C-terminus to the trimeric folding domain of T4 bacteriophage fibritin.
Modification 3, deletion of the C-terminal domain of the novel coronavirus S protein shown in SEQ ID NO: 26 (a transmembrane domain), with the aim of promoting secretory expression of the recombinant S protein.
Modification 4, the novel coronavirus S protein has one or more amino acid residues within amino acid positions 817-987 mutated to proline; the mutations may comprise the substitutions K986P and/or V987P. Substitution with these two proline residues improves the stability of the prefusion conformation.
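Substitutions written in the form K986P (wild-type residue, position, new residue) can be applied programmatically with a check that the expected wild-type residue is present. The sketch below is illustrative only: the function name is hypothetical and the six-residue toy sequence uses toy numbering (K3P, V4P) in place of the real positions 986 and 987.

```python
import re


def apply_substitution(seq: str, mutation: str) -> str:
    """Apply a point substitution such as 'K986P' (1-based numbering)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation: {mutation}")
    wild_type, pos, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if seq[pos - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + new + seq[pos:]


# Toy sequence (hypothetical) with adjacent K and V at positions 3 and 4,
# standing in for K986 and V987 of the full-length S protein.
toy = "ADKVEA"
for mutation in ("K3P", "V4P"):
    toy = apply_substitution(toy, mutation)
print(toy)  # ADPPEA
```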
Any permutation and combination of one or more of the above modifications 1-4 is within the scope of the present invention.
Preferably, the nucleotide sequence of the novel coronavirus S protein can be one of SEQ ID NO: 8, SEQ ID NO: 10 and SEQ ID NO: 12, wherein:
the nucleotide sequence SEQ ID NO: 8 adopts the codon optimization scheme 3 backbone + original signal peptide + 682-GSAS-685 + T4 phage Fibritin trimer motif; the amino acid sequence is shown as SEQ ID NO: 9;
the nucleotide sequence SEQ ID NO: 10 adopts the codon optimization scheme 3 backbone + tPA signal peptide + 691-GSAS-694 + T4 phage Fibritin trimer motif; the amino acid sequence is shown as SEQ ID NO: 11;
the nucleotide sequence SEQ ID NO: 12 adopts the codon optimization scheme 3 backbone + tPA signal peptide + 691-GG-692 + T4 phage Fibritin trimer motif; the amino acid sequence is shown as SEQ ID NO: 13;
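The change in linker numbering between the constructs (682-685 with the original signal peptide, 691-694 or 691-692 with the tPA signal peptide) is consistent with a shift of 9 residues introduced by the longer signal peptide. The sketch below reproduces that arithmetic. The original signal peptide is the 13-residue sequence given in the description; the replacement length of 22 residues is an inference from the 9-residue offset and is not stated in the source, and the function name is hypothetical.

```python
NATIVE_SP = "MFVFLVLLPLVSS"      # original signal peptide from the description
ASSUMED_NEW_SP_LEN = 22          # assumption: inferred from 691 - 682 = 9


def linker_span(site_start: int, native_sp_len: int, new_sp_len: int, linker: str):
    """Return (first, last) residue numbers of the linker in the new construct."""
    start = site_start + (new_sp_len - native_sp_len)
    return start, start + len(linker) - 1


native_len = len(NATIVE_SP)
print(linker_span(682, native_len, native_len, "GSAS"))          # (682, 685)
print(linker_span(682, native_len, ASSUMED_NEW_SP_LEN, "GSAS"))  # (691, 694)
print(linker_span(682, native_len, ASSUMED_NEW_SP_LEN, "GG"))    # (691, 692)
```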
by testing and comparing various expression elements, the embodiments of the invention obtain a recombinant S protein gene sequence and protein sequence that can be efficiently secreted and expressed in mammalian cells. As a preferred embodiment, the sequence is a C-terminally truncated form of the S protein that retains both the S1 and S2 subunits and all functional regions except the transmembrane region, so as to retain as many antibody epitopes of the S protein as possible. In the optimal technical scheme, the nucleotide sequence of the novel coronavirus S protein is shown as SEQ ID NO: 8, SEQ ID NO: 10 or SEQ ID NO: 12, and the amino acid sequence of the novel coronavirus S protein is shown as SEQ ID NO: 9, SEQ ID NO: 11 or SEQ ID NO: 13.
According to another exemplary embodiment of the present invention, there is provided a nucleic acid molecule having a nucleotide sequence as shown in SEQ ID NO: 8, SEQ ID NO: 10 or SEQ ID NO: 12. Biological materials containing the nucleic acid molecule are also within the scope of the present invention, and include one of recombinant DNA, plasmid vectors, phage vectors, viral vectors and engineered bacteria.
According to another exemplary embodiment of the present embodiments, there is provided a recombinant expression vector capable of expressing the novel coronavirus S protein.
According to another exemplary embodiment of the present embodiments, there is provided an engineered cell comprising the recombinant expression vector. The engineering cells can be selected from suspension cells, including CHO series and human vaccine mammalian cell strains such as 293, 293FT and the like, which are within the protection scope of the invention, specifically, the embodiment of the invention uses CHO-K1 cells, and the S gene is transfected into the cells to obtain a stable expression cell strain, so as to efficiently express the recombinant S protein.
According to another exemplary embodiment of the present invention, there is provided a method for preparing a novel coronavirus S protein, the method comprising:
obtaining the recombinant expression vector;
transfecting the recombinant expression vector into cells, and obtaining a cell strain for stably expressing the recombinant S protein through glutamine resistance screening and monoclonal screening of a cell population;
and carrying out secretory expression and purification on the cell strain to obtain a purified recombinant novel coronavirus S protein.
According to another exemplary embodiment of the present invention, there is provided a novel coronavirus subunit vaccine comprising said recombinant S protein and a pharmaceutically acceptable adjuvant.
The adjuvant comprises at least one of aluminium hydroxide, lecithin, Freund's adjuvant, MPL™, IL-12, aluminium hydroxide combined with CpG ODN as a composite adjuvant, ISA51VG, ISA720VG, MF59, QS21, and AS03. In other embodiments, the adjuvant may take other forms.
The novel coronavirus subunit vaccine can be prepared into nasal drops, sprays and intramuscular injections.
The invention uses the obtained biologically active S protein in trimer conformation to prepare a trimer subunit vaccine. After immunization, the vaccine induces mice to produce neutralizing antibodies with an immune-protective function against SARS-CoV-2, and provides 100% protection when the immunized mice are subjected to a lethal challenge with the novel coronavirus.
The effects of the present application will be described in detail below with reference to examples and experimental data.
Example I Recombinant S protein vector construction and expression optimization
1. Construction of S protein Gene expressed in mammalian cell supernatant
The construction of the S protein expression gene of the present invention is schematically shown in FIG. 1. FIG. 1 shows the sequence design of the S protein secreted from mammalian cells: the original signal peptide is retained in the sequence, or is replaced with the tPA signal peptide, CD5 signal peptide, or IgG signal peptide; the Furin cleavage site is mutated from RRAR to GSAS, a GS combination, (GGGS)n, (GGGGS)n, or (G)n; and the C-terminal transmembrane and intracellular domains are replaced with the T4 phage fibritin sequence.
First, for efficient expression in eukaryotic cells, we performed mammalian-preferred codon optimization of the S-expressing gene using Java Codon Adaptation software; the optimized nucleotide sequence of the original signal peptide is shown in SEQ ID NO: 16. The results in FIG. 2A show that the unoptimized native S gene is expressed in CHO cells at very low, barely detectable levels, whereas the codon-optimized S gene expresses the S protein in CHO cells with high efficiency.
Secondly, the immunogenic S protein polypeptides of the invention have a stable prefusion conformation: the furin cleavage site 682-RRAR-685 between the S1 and S2 subunits is replaced with a flexible protein linker based on glycine (G, Gly) and serine (S, Ser), such as GSAS, GS combinations, (GGGS)n, (GGGGS)n, or (G)n, where the length and effect of the linker are adjusted by n. In the present invention, we compared various flexible linkers. The results in FIG. 2B show that the GS flexible linkers, including 682-GSAS-685 and 682-GG-685, protect against cleavage better than the reported 682-QQAQ-685 mutation: the GS mutations yield less cleaved S2, so the uncleaved form of the S protein is better maintained. The results in FIG. 7 show that intact S protein can induce high titers of specific neutralizing antibodies, whereas an antibody targeting only the S2 subunit of the S protein has no virus-neutralizing activity, i.e. it fails to protect the body against SARS-CoV-2 infection.
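The furin-site replacement described above can be illustrated on a short stretch of the native S protein. The following is a minimal sketch, not the procedure used to build the constructs: the function names `flexible_linker` and `replace_site` are hypothetical, and the fragment is residues 673-688 of SEQ ID NO: 2 written in one-letter code.

```python
def flexible_linker(unit: str, n: int) -> str:
    """Build a repeat linker such as (GGGS)n, (GGGGS)n or (G)n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return unit * n


def replace_site(fragment: str, fragment_start: int, site_start: int,
                 site_end: int, expected: str, linker: str) -> str:
    """Replace residues site_start..site_end (1-based, inclusive, full-length
    numbering) of a fragment that begins at residue fragment_start."""
    lo = site_start - fragment_start
    hi = site_end - fragment_start + 1
    if lo < 0 or hi > len(fragment):
        raise ValueError("site lies outside the fragment")
    if fragment[lo:hi] != expected:
        raise ValueError(f"expected {expected!r}, found {fragment[lo:hi]!r}")
    return fragment[:lo] + linker + fragment[hi:]


# Residues 673-688 of the native S protein (SEQ ID NO: 2), one-letter code.
NATIVE_FRAGMENT = "SYQTQTNSPRRARSVA"

gsas_variant = replace_site(NATIVE_FRAGMENT, 673, 682, 685, "RRAR", "GSAS")
print(gsas_variant)

# A longer linker of the (GGGS)n family, here with n = 2.
long_variant = replace_site(NATIVE_FRAGMENT, 673, 682, 685, "RRAR",
                            flexible_linker("GGGS", 2))
print(long_variant)
```

Checking the expected residues before substituting guards against numbering offsets, which matter here because the tPA signal peptide constructs place the same site at positions 691-694.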
Thirdly, in order to express the S protein in mammalian cell supernatant without lysing the cells, we removed the transmembrane region (TM region) of the native S protein and fused the C-terminus of the TM-deleted S protein to the trimeric folding domain of T4 bacteriophage fibritin, enhancing trimer formation. For the secretion signal peptide, we compared the tPA signal peptide, the S protein's own signal peptide, and the human IgG signal peptide. The results in FIG. 2C show that with either the human tissue plasminogen activator (tPA) signal peptide or the S protein's own signal peptide, the S protein can be detected in transfected CHO supernatants; these two signal peptides are therefore preferred for expression in the supernatant.
The nucleotide sequence of the S gene is shown in SEQ ID NO: 8, SEQ ID NO: 10, or SEQ ID NO: 12.
2. Construction of CHO-K1 cell line secreting and expressing recombinant S protein
The S gene constructed in step 1 (nucleotide sequence shown in SEQ ID NO: 8) was cloned into an expression vector (a pC-neo vector from Promega, modified by adding a GS expression tag) to obtain an expression vector carrying a glutamine synthetase (GS) selection marker. The specific steps are as follows: the pC-GS vector is digested with restriction endonucleases NheI and SmaI (Thermo Fisher), and the S gene is digested with NheI and EcoRV (Thermo Fisher); the digested vector and S gene are each recovered after agarose gel electrophoresis, ligated with T4 ligase (New England Biolabs), and transformed into competent DH5α cells; the correct pC-GS-S clone is selected and identified, yielding the S gene expression plasmid.
The S gene expression plasmid was transfected into CHO-K1 cells (China Center for Type Culture Collection, Wuhan University) by electroporation using an electroporator (Bio-Rad). Electroporation protocol: 1 × 10^7 cells were taken into 800 µL of CHO-CD1 serum-free medium (Shanghai culture source biotechnology Co., Ltd.), 40 µg of plasmid was added and mixed, and the suspension was transferred to an electroporation cuvette (Bio-Rad); the voltage was set to 300 V, the capacitance to 960 µF, the resistance to infinity, and the pulse type to exponential. After the pulse, the suspension was transferred to CHO-CD1 medium and the cell concentration was adjusted to 5 × 10^5 cells/mL. Cell density and viability were measured daily after electroporation; once the cell concentration rose steadily to 1 × 10^6 cells/mL, the cell line was considered established. Clones were then isolated by limiting dilution, expanded individually, and further screened for dominant cell lines. After the dominant line was expanded, cells were seeded for growth-cycle analysis.
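The cell-density adjustments in this protocol are simple proportionalities. The sketch below, with hypothetical helper names, shows the arithmetic under the assumption (not stated in the text) that all 1 × 10^7 electroporated cells are recovered; in practice the viable count after the pulse would be measured first.

```python
def volume_for_density_ml(total_cells: float, target_cells_per_ml: float) -> float:
    """Final culture volume (mL) that gives the target cell density."""
    if target_cells_per_ml <= 0:
        raise ValueError("target density must be positive")
    return total_cells / target_cells_per_ml


def cells_in_culture(cells_per_ml: float, volume_ml: float) -> float:
    """Total number of cells in a culture of the given density and volume."""
    return cells_per_ml * volume_ml


# Assumption: every one of the 1e7 electroporated cells is carried over.
post_pulse_volume = volume_for_density_ml(1e7, 5e5)
print(f"Resuspend in {post_pulse_volume:.0f} mL to reach 5e5 cells/mL")

# Density at which the line is considered established, in that same volume.
print(f"{cells_in_culture(1e6, post_pulse_volume):.1e} cells at 1e6 cells/mL")
```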
3. CHO-K1-S cell growth assay
Cells of the dominant line were transferred into 20 mL of CHO-CD1 medium at a final concentration of 1 × 10^6 cells/mL and placed in a shaking incubator (Thermo Fisher) at 37 °C, 5% CO2, and 120 rpm. The day of seeding was taken as day 0; cells were then counted every 24 hours with a hemocytometer (Bio-Rad) to track growth, until the cell number no longer increased or began to decrease. The results show that, at a seeding concentration of 1 × 10^6 cells/mL, the cells of the dominant line enter the logarithmic growth phase on days 2-3 after seeding, reach a plateau on days 6-7, and begin to decline on day 8 (FIG. 3). FIG. 3 is the growth curve of CHO-K1-S cells stably expressing the S protein: 2 × 10^7 CHO cells were seeded in 20 mL of serum-free medium and counted over 8 days of culture.
4. Detection of secreted expressed recombinant S proteins
Cells of the dominant line were transferred into 2 cell culture flasks each containing 20 mL of CHO-CD1 medium, at a final concentration of 1 × 10^6 cells/mL (flask A) or 2 × 10^6 cells/mL (flask B), and placed in a shaking incubator at 37 °C, 5% CO2, and 120 rpm. After 4 days of continuous culture, 100 µL of culture was taken from each of flasks A and B and centrifuged at 800 rpm for 5 minutes; the supernatant was removed and mixed with the corresponding volume of 6× SDS loading buffer, and the cell pellet was resuspended in 40 µL of 1× SDS loading buffer. The samples were heated in a metal bath (DLAB) at 100 °C for 10 minutes, and protein expression was examined by SDS-PAGE. The results show that highly expressed S protein can be detected in the cell supernatant of the dominant line, and that the S protein expression level is higher in the sample with the higher initial seeding concentration (FIG. 4). FIG. 4 is the expression analysis of the secreted recombinant S protein: CHO cells were seeded in serum-free medium at the two seeding densities, designated flask A and flask B, and S protein expression in the cells and in the culture supernatant was detected after 4 days of culture.
After centrifugation of the cell culture from flask B, 10 µL, 20 µL, 40 µL, and 60 µL of the culture supernatant were transferred into EP tubes, and 2 µL, 4 µL, 8 µL, and 12 µL of 6× Native loading buffer were added, respectively; the trimeric form of the S protein in the supernatant was then examined by Native-PAGE. As shown in FIG. 5, only a small amount of S protein monomer can be detected in the cell supernatant of the dominant line, and most of the S protein in the culture supernatant exists as dimer or trimer. This indicates that the CHO-K1-S cell line can express a large amount of S protein with the native multimeric conformation; secretory expression of recombinant S protein by the cell line can reach 3 g/L, which can meet the production requirements of subunit vaccines.
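The loading-buffer volumes above follow from requiring a 1× final concentration: for a sample volume V and a 6× stock, the buffer volume B satisfies 6B = V + B, so B = V / 5. A minimal sketch of this calculation (the function name is hypothetical) reproduces the volumes listed in the text:

```python
def loading_buffer_volume_ul(sample_ul: float, stock_x: float = 6.0,
                             final_x: float = 1.0) -> float:
    """Volume of concentrated loading buffer to add to a sample so that the
    buffer ends up at final_x in the mixture.

    Derived from stock_x * B = final_x * (sample + B).
    """
    if stock_x <= final_x:
        raise ValueError("stock must be more concentrated than the final mix")
    return sample_ul * final_x / (stock_x - final_x)


for sample in (10, 20, 40, 60):
    buffer_ul = loading_buffer_volume_ul(sample)
    print(f"{sample} uL supernatant + {buffer_ul:g} uL of 6x buffer")
```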
Example II immunization and Effect identification of recombinant S protein trimer vaccine
1. Immunization protocol for mice
The mice used in the experiment were male K18-hACE mice purchased from Jiangsu Jiejiaokang Biotechnology Co., Ltd., 6-8 weeks old and weighing 19-28 g; all animal experiments were carried out in an SPF laboratory. The experimental group (4 mice in total) was inoculated with vaccine + adjuvant; the control group (4 mice in total) was inoculated with PBS. The second dose was given 14 days after the first dose, and orbital blood was collected from all mice 35 days after the first dose (i.e., 21 days after the second dose). Mouse sera were taken for testing.
2. ELISA detection procedure
(1) RBD (seikagaku corporation) was diluted with 0.1 M carbonate buffer (pH 9.6) to a final concentration of 1 ng/µL, and 100 µL was added to each well of a 96-well microplate, so that each well contained 100 ng of RBD. Incubate at 37 °C for 3 hours.
(2) The coating solution was discarded, and 250 µL of 0.05% PBST (Morina Biotech Co., Ltd.) was added to each well for washing. Wash 3 times for 5 minutes each.
(3) 200 µL of 5% skim milk (prepared in 0.05% PBST) was added to each well for blocking, and the plate was incubated at 37 °C for 3 hours.
(4) Serum samples from vaccine and control mice were diluted 1:500 in 5% skim milk. The blocking solution was discarded and 100 µL of diluted mouse serum was added to each well, with three replicate wells per mouse serum; the plate was incubated at 4 °C overnight.
(5) Serum was discarded and 200 µL of 0.05% PBST was added to each well for washing. Wash 3 times for 5 minutes each.
(6) Goat anti-mouse IgG secondary antibody (Boerci technologies, Inc.) was diluted 1:5000 in 5% skim milk, and 100 µL of diluted secondary antibody was added to each well. Incubate at 37 °C for 1 h.
(7) The secondary antibody was discarded and 200 µL of 0.05% PBST was added to each well for washing. Wash 3 times for 5 minutes each.
(8) Then, 100 µL of enzyme substrate (Hcm TMB One) was added to each well, and the color was developed in the dark at room temperature for 30 minutes. The reaction was stopped by adding 50 µL of 1 M HCl per well, and the plate was read in a microplate reader to measure OD450.
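The reagent quantities in steps (1) and (4) can be checked with a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, "1:500" is taken to mean 1 part serum in 500 parts total volume, and the optional overage factor is an illustrative convenience rather than part of the protocol.

```python
def coating_mass_ng(conc_ng_per_ul: float, volume_ul: float) -> float:
    """Mass of antigen delivered to one well."""
    return conc_ng_per_ul * volume_ul


def neat_serum_needed_ul(wells: int, volume_per_well_ul: float,
                         dilution_factor: float, overage: float = 1.0) -> float:
    """Undiluted serum needed to fill the given wells at a 1:dilution_factor
    dilution (assumption: 1 part serum in dilution_factor parts total)."""
    total_diluted_ul = wells * volume_per_well_ul * overage
    return total_diluted_ul / dilution_factor


# Step (1): 1 ng/uL coating solution, 100 uL per well.
print(coating_mass_ng(1, 100), "ng RBD per well")

# Step (4): three replicate wells of 100 uL at 1:500, with no overage.
print(neat_serum_needed_ul(3, 100, 500), "uL undiluted serum per mouse")
```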
3. VSV framework new coronavirus packaging process
The new coronavirus Spike protein expression plasmid Sdel-18 (with 18 amino acids deleted at the C-terminal tail; plasmid from Professor Charningshao) was transfected into Vero E6 cells (15 µg per 10 cm dish). 48 hours after transfection, the cells were infected with the seed virus VSV-DG-Luc (from Professor Charningshao) at 300 µL per 10 cm dish. 1 h after infection, the virus fluid was discarded and replaced with fresh complete medium containing VSV-G antibody (1:1000). After culturing at 37 °C for 24 h, the cell supernatant was collected, and the pseudovirus was aliquoted and frozen at -80 °C.
4. Neutralization experimental process of new coronavirus
(1) BHK-21ACE2 cells (from the China Center for Type Culture Collection, Wuhan University) were seeded in 96-well plates, and the experiment was performed when cell confluence reached 90%.
(2) The mouse serum was heat-inactivated at 56 °C for 30 min.
(3) Mouse serum was serially diluted in infection medium (DMEM + 2% FBS + 1% PS).
(4) Pseudovirus was diluted in infection medium (DMEM + 2% FBS + 1% PS) at a ratio of pseudovirus (v) : total volume (v) = 1:10.
(5) Serum dilutions were mixed with pseudovirus dilutions in equal volumes (50 µL + 50 µL per well) and incubated at 37 °C for 1 h.
(6) BHK-21ACE2 cells were washed twice with PBS.
(7) The serum-pseudovirus mixture was added to the cells and cultured at 37 °C for 24 h; the cells were then lysed, firefly luciferase substrate (Promega) was added, and firefly luciferase activity was measured with a Varioskan LUX multimode microplate reader (Thermo Fisher).
5. The serum of mice immunized with the vaccine of the invention contains extremely high levels of anti-RBD IgG
To verify that the designed vaccine can induce mice to produce specific antibodies, mice were immunized with the prepared recombinant S protein vaccine, and control mice were inoculated with an equal volume of 1× PBS. Orbital blood was collected from all mice 35 days after the first dose (i.e., 21 days after the second dose). Subsequently, the sera of the 4 vaccine-group mice and the 4 control-group mice were diluted 1:500, and the content of specific antibody in the serum was detected by enzyme-linked immunosorbent assay (ELISA).
The results of the experiment are shown in FIG. 6. At a dilution of 1:500, the OD450 value for the reaction of vaccine-group mouse serum with the new coronavirus RBD protein is markedly higher than that for control-group mouse serum. This demonstrates that the vaccine prepared by the present invention can induce mice to produce very high levels of specific antibodies against the RBD region of the S protein. Meanwhile, serum from inactivated-vaccine volunteers, at a dilution of 1:2000, showed a level of antibody reactivity to the RBD region similar to that induced by the vaccine.
6. The vaccine of the invention keeps the integrity of S protein antigen and has extremely high neutralizing activity to the new coronavirus
To verify the neutralizing capacity against the new coronavirus of the antibodies induced by the vaccine, serum from mice after the second immunization and a commercial antibody targeting the S2 subunit of the S protein (Okayama technologies, Inc.) were each tested in a pseudovirus neutralization experiment, and their neutralizing activities were compared.
Serum from a mouse in the vaccine second-immunization group (S-55) and from a control-group mouse (PBS-53) was diluted 1:100, 1:1000, 1:10000, 1:20000, 1:40000, and 1:80000; the commercial antibody targeting the S2 subunit of the S protein was diluted 1:100, 1:500, 1:2500, 1:5000, and 1:10000. The dilutions were subjected to neutralization experiments with the VSV-backbone new coronavirus pseudovirus to determine the neutralizing activity of the mouse sera and of the commercial S2 subunit-specific antibody.
The results in FIG. 7A show that serum from vaccine second-immunization mice had neutralizing activity against the new coronavirus at dilutions from 1:100 to 1:10000, reaching 50% neutralization efficiency at a dilution of about 1:8000, whereas control mouse serum had no neutralizing activity against the pseudovirus at any dilution. The results in FIG. 7B show that the commercial S2 subunit-specific antibody targeting the S protein shows only very weak pseudovirus-neutralizing activity, and only at low dilutions. This shows that the recombinant S protein of the vaccine retains the integrity of the S protein antigen and can induce high levels of specific antibodies with a good neutralizing effect on the new coronavirus.
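The specification does not state how the 50% neutralization dilution was read from the curves. One common approach, sketched below purely as an assumption, is log-linear interpolation between the two tested dilutions that bracket 50%. The function name `nt50` and the example values are hypothetical and are not data from FIG. 7.

```python
import math
from typing import Optional, Sequence


def nt50(reciprocal_dilutions: Sequence[float],
         neutralization_pct: Sequence[float],
         threshold: float = 50.0) -> Optional[float]:
    """Estimate the reciprocal dilution at which neutralization falls to the
    threshold, interpolating linearly on log10(dilution).

    Dilutions must be in ascending order. Returns None when the curve never
    crosses the threshold from above within the tested range.
    """
    points = list(zip(reciprocal_dilutions, neutralization_pct))
    for (d1, y1), (d2, y2) in zip(points, points[1:]):
        if y1 >= threshold > y2:
            fraction = (y1 - threshold) / (y1 - y2)
            log_d = math.log10(d1) + fraction * (math.log10(d2) - math.log10(d1))
            return 10 ** log_d
    return None


# Invented example values, for illustration only (not taken from the figures).
example_dilutions = [100, 1000, 10000]
example_neutralization = [95.0, 80.0, 20.0]
print(nt50(example_dilutions, example_neutralization))
```

A four-parameter logistic fit is another widely used option; interpolation is shown here only because it needs no curve-fitting library.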
7. The vaccine of the invention has extremely high neutralization activity on the pseudovirus of the new crown original strain
To compare the neutralizing activity against the new coronavirus pseudovirus of the antibodies induced by the vaccine and by the existing inactivated new coronavirus vaccine, serum from mice after the second immunization and serum from inactivated-vaccine volunteers were tested in a pseudovirus neutralization experiment.
Serum from vaccine second-immunization mice was diluted 1:100, 1:1000, 1:10000, 1:20000, 1:40000, 1:80000, 1:160000, and 1:320000; serum from inactivated new coronavirus vaccine volunteers was diluted 1:10^2, 1:10^3, 1:10^4, 1:10^5, 1:10^6, and 1:10^7. Control mouse serum served as the control. The diluted sera were subjected to neutralization experiments with the VSV-backbone new coronavirus pseudovirus to determine the neutralizing activity of the S protein-specific antibodies in the mouse serum and in the inactivated-vaccine volunteer serum.
The results in FIG. 8A show that serum from vaccine second-immunization mice had neutralizing activity against the new coronavirus at dilutions from 1:100 to 1:32000, reaching 50% neutralization efficiency at a dilution of about 1:6300, whereas control serum had no pseudovirus-neutralizing activity at any dilution. The results in FIG. 8B show that inactivated-vaccine volunteer serum at dilutions from 1:10^3 to 1:10^5 had low neutralizing activity against the new coronavirus and failed to reach 50% neutralization efficiency. This shows that, compared with the existing inactivated vaccine, the vaccine can induce high levels of specific antibodies with a better neutralizing effect on the new coronavirus.
8. The vaccine of the invention has extremely high neutralization activity on pseudoviruses of different new crown mutant strains
In order to compare the neutralizing activity of the antibody generated by the vaccine and the existing new corona inactivated vaccine to different mutant strain pseudoviruses of the new corona, the serum of a mouse immunized twice with the vaccine and the serum of an inactivated vaccine volunteer are subjected to a mutant strain pseudovirus neutralizing experiment.
Serum from vaccine second-immunization mice and from inactivated new coronavirus vaccine volunteers was diluted 1:10^2, 1:10^3, 1:10^4, 1:10^5, and 1:10^6. The diluted sera were subjected to neutralization experiments with VSV-backbone pseudoviruses of different new coronavirus mutant strains, and the dilutions giving 50% or 20% neutralizing activity of the specific antibodies in mouse serum and in inactivated-vaccine volunteer serum were determined and tabulated.
The results in FIG. 9A show that serum from vaccine second-immunization mice had high neutralizing activity against pseudoviruses of the original strain and of some currently circulating mutant strains, although the neutralizing activity was reduced for the South African strain pseudovirus. The results in FIG. 9B show that inactivated-vaccine volunteer serum had only low neutralizing activity against the original strain pseudovirus and no neutralizing activity at all against the South African strain pseudovirus. This shows that the vaccine can induce specific antibodies with a better neutralizing effect and higher titer against both the original new coronavirus strain and the currently circulating mutant strains.
EXAMPLE III challenge validation of recombinant S protein trimer subunit vaccine
The mice were acclimated to the ABSL-3 environment for 2-3 days before the experiment and divided into 2 groups: a control group (PBS group) and an experimental group (vaccine group), with 4 mice per group.
Both groups of mice were challenged with SARS-CoV-2 at a dose of 2.5 × 10^2 PFU/mouse (a clinical isolate from the Wuhan institute of sciences, amplified in the Wuhan University ABSL-3 laboratory). The specific procedure was as follows. The initial titer of the SARS-CoV-2 stock was 6 × 10^6 PFU/mL, 200 µL in total. In a 1.5 mL screw-cap tube, 714 µL of 1× PBS and 6 µL of SARS-CoV-2 stock were mixed, giving 720 µL of virus dilution with a titer of 5 × 10^4 PFU/mL. Then, in a 2 mL screw-cap tube, 1800 µL of 1× PBS and 200 µL of this SARS-CoV-2 dilution were mixed, giving 2000 µL of virus dilution with a titer of 5 × 10^3 PFU/mL; the diluted SARS-CoV-2 was kept on ice until use. Each mouse was picked up with forceps and anesthetized by isoflurane inhalation; once the mouse became unsteady and drowsy, 50 µL of the 5 × 10^3 PFU/mL virus solution was slowly dripped into its nostrils so that the virus solution was inhaled naturally as the mouse breathed. The mouse was returned to its cage after a few seconds. After challenge, the mice were weighed at a fixed time each day for a total of 11 days. Finally, GraphPad Prism software was used to plot the body weight and survival curves of each group.
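The two-step dilution and the per-mouse dose described above can be verified with a short calculation. This is a minimal sketch; the helper name `dilute` is hypothetical, and all volumes and titers are those stated in the text.

```python
def dilute(titer_pfu_per_ml: float, virus_ul: float, diluent_ul: float):
    """Return (new titer in PFU/mL, total volume in uL) after mixing a virus
    volume with a diluent volume."""
    total_ul = virus_ul + diluent_ul
    return titer_pfu_per_ml * virus_ul / total_ul, total_ul


stock_titer = 6e6  # PFU/mL, initial titer of the SARS-CoV-2 stock

# Step 1: 6 uL of stock into 714 uL of 1x PBS.
titer_1, volume_1 = dilute(stock_titer, virus_ul=6, diluent_ul=714)

# Step 2: 200 uL of the first dilution into 1800 uL of 1x PBS.
titer_2, volume_2 = dilute(titer_1, virus_ul=200, diluent_ul=1800)

# Dose delivered by 50 uL of the final dilution (convert uL to mL).
dose_pfu = titer_2 * 50 / 1000

print(f"Step 1: {volume_1:.0f} uL at {titer_1:.1e} PFU/mL")
print(f"Step 2: {volume_2:.0f} uL at {titer_2:.1e} PFU/mL")
print(f"Dose per mouse: {dose_pfu:.0f} PFU")
```

The first step is a 120-fold dilution and the second a 10-fold dilution, so the stated dose of 2.5 × 10^2 PFU/mouse is consistent with the 50 µL inoculum.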
The experimental results show that the vaccine prepared from the recombinant S protein with native trimer structure, obtained by the experimental scheme of the invention, has excellent immunogenicity: the body weight of the vaccine-group mice did not change greatly after SARS-CoV-2 infection (FIG. 10A), and the vaccine-group mice completely resisted infection with a lethal dose of SARS-CoV-2 (FIG. 10B). The invention can therefore serve as a potential SARS-CoV-2 recombinant S protein subunit vaccine.
Finally, it should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Sequence listing
<110> Wuhan university
<120> novel coronavirus S protein and subunit vaccine thereof
<160> 26
<170> SIPOSequenceListing 1.0
<210> 1
<211> 3819
<212> DNA
<213> novel coronavirus (SARS-CoV-2)
<400> 1
atgtttgtgt tcctggtgct gctgccactg gtgtccagcc agtgtgtgaa cctgaccacc 60
aggacccaac ttcctcctgc ctacaccaac tccttcacca ggggagtcta ctaccctgac 120
aaggtgttca ggtcctctgt gctgcacagc acccaggacc tgttcctgcc attcttcagc 180
aatgtgacct ggttccatgc catccatgtg tctggcacca atggcaccaa gaggtttgac 240
aaccctgtgc tgccattcaa tgatggagtc tactttgcca gcacagagaa gagcaacatc 300
atcaggggct ggatttttgg caccaccctg gacagcaaga cccagtccct gctgattgtg 360
aacaatgcca ccaatgtggt gattaaggtg tgtgagttcc agttctgtaa tgacccattc 420
ctgggagtct actaccacaa gaacaacaag tcctggatgg agtctgagtt cagggtctac 480
tcctctgcca acaactgtac ctttgaatat gtgagccaac cattcctgat ggacttggag 540
ggcaagcagg gcaacttcaa gaacctgagg gagtttgtgt tcaagaacat tgatggctac 600
ttcaagattt acagcaaaca cacaccaatc aacctggtga gggacctgcc acagggcttc 660
tctgccttgg aaccactggt ggacctgcca attggcatca acatcaccag gttccagacc 720
ctgctggctc tgcacaggtc ctacctgaca cctggagact cctcctctgg ctggacagca 780
ggagcagcag cctactatgt gggctacctc caaccaagga ccttcctgct gaaatacaat 840
gagaatggca ccatcacaga tgctgtggac tgtgccctgg acccactgtc tgagaccaag 900
tgtaccctga aatccttcac agtggagaag ggcatctacc agaccagcaa cttcagggtc 960
caaccaacag agagcattgt gaggtttcca aacatcacca acctgtgtcc atttggagag 1020
gtgttcaatg ccaccaggtt tgcctctgtc tatgcctgga acaggaagag gattagcaac 1080
tgtgtggctg actactctgt gctctacaac tctgcctcct tcagcacctt caagtgttat 1140
ggagtgagcc caaccaaact gaatgacctg tgtttcacca atgtctatgc tgactccttt 1200
gtgattaggg gagatgaggt gagacagatt gcccctggac aaacaggcaa gattgctgac 1260
tacaactaca aactgcctga tgacttcaca ggctgtgtga ttgcctggaa cagcaacaac 1320
ctggacagca aggtgggagg caactacaac tacctctaca gactgttcag gaagagcaac 1380
ctgaaaccat ttgagaggga catcagcaca gagatttacc aggctggcag cacaccatgt 1440
aatggagtgg agggcttcaa ctgttacttt ccactccaat cctatggctt ccaaccaacc 1500
aatggagtgg gctaccaacc atacagggtg gtggtgctgt cctttgaact gctccatgcc 1560
cctgccacag tgtgtggacc aaagaagagc accaacctgg tgaagaacaa gtgtgtgaac 1620
ttcaacttca atggactgac aggcacagga gtgctgacag agagcaacaa gaagttcctg 1680
ccattccaac agtttggcag ggacattgct gacaccacag atgctgtgag ggacccacag 1740
accttggaga ttctggacat cacaccatgt tcctttggag gagtgtctgt gattacacct 1800
ggcaccaaca ccagcaacca ggtggctgtg ctctaccagg atgtgaactg tactgaggtg 1860
cctgtggcta tccatgctga ccaacttaca ccaacctgga gggtctacag cacaggcagc 1920
aatgtgttcc agaccagggc tggctgtctg attggagcag agcatgtgaa caactcctat 1980
gagtgtgaca tcccaattgg agcaggcatc tgtgcctcct accagaccca gaccaacagc 2040
ccaaggaggg caaggtctgt ggcaagccag agcatcattg cctacacaat gagtctggga 2100
gcagagaact ctgtggctta cagcaacaac agcattgcca tcccaaccaa cttcaccatc 2160
tctgtgacca cagagattct gcctgtgagt atgaccaaga cctctgtgga ctgtacaatg 2220
tatatctgtg gagacagcac agagtgtagc aacctgctgc tccaatatgg ctccttctgt 2280
acccaactta acagggctct gacaggcatt gctgtggaac aggacaagaa cacccaggag 2340
gtgtttgccc aggtgaagca gatttacaag acacctccaa tcaaggactt tggaggcttc 2400
aacttcagcc agattctgcc tgacccaagc aagccaagca agaggtcctt cattgaggac 2460
ctgctgttca acaaggtgac cctggctgat gctggcttca tcaagcaata tggagactgt 2520
ctgggagaca ttgctgccag ggacctgatt tgtgcccaga agttcaatgg actgacagtg 2580
ctgcctccac tgctgacaga tgagatgatt gcccaataca cctctgccct gctggctggc 2640
accatcacct ctggctggac ctttggagca ggagcagccc tccaaatccc atttgctatg 2700
cagatggctt acaggttcaa tggcattgga gtgacccaga atgtgctcta tgagaaccag 2760
aaactgattg ccaaccagtt caactctgcc attggcaaga ttcaggactc cctgtccagc 2820
acagcctctg ccctgggcaa actccaagat gtggtgaacc agaatgccca ggctctgaac 2880
accctggtga agcaactttc cagcaacttt ggagccatct cctctgtgct gaatgacatc 2940
ctgagcagac tggacaaggt ggaggctgag gtccagattg acagactgat tacaggcaga 3000
ctccaatccc tccaaaccta tgtgacccaa caacttatca gggctgctga gattagggca 3060
tctgccaacc tggctgccac caagatgagt gagtgtgtgc tgggacaaag caagagggtg 3120
gacttctgtg gcaagggcta ccacctgatg agttttccac agtctgcccc tcatggagtg 3180
gtgttcctgc atgtgaccta tgtgcctgcc caggagaaga acttcaccac agcccctgcc 3240
atctgccatg atggcaaggc tcactttcca agggagggag tgtttgtgag caatggcacc 3300
cactggtttg tgacccagag gaacttctat gaaccacaga ttatcaccac agacaacacc 3360
tttgtgtctg gcaactgtga tgtggtgatt ggcattgtga acaacacagt ctatgaccca 3420
ctccaacctg aactggactc cttcaaggag gaactggaca aatacttcaa gaaccacacc 3480
agccctgatg tggacctggg agacatctct ggcatcaatg cctctgtggt gaacatccag 3540
aaggagattg acagactgaa tgaggtggct aagaacctga atgagtccct gattgacctc 3600
caagaactgg gcaaatatga acaatacatc aagtggccat ggtacatctg gctgggcttc 3660
attgctggac tgattgccat tgtgatggtg accataatgc tgtgttgtat gacctcctgt 3720
tgttcctgtc tgaaaggctg ttgttcctgt ggctcctgtt gtaagtttga tgaggatgac 3780
tctgaacctg tgctgaaagg agtgaaactg cactacacc 3819
<210> 2
<211> 1273
<212> PRT
<213> novel coronavirus (SARS-CoV-2)
<400> 2
Met Phe Val Phe Leu Val Leu Leu Pro Leu Val Ser Ser Gln Cys Val
1 5 10 15
Asn Leu Thr Thr Arg Thr Gln Leu Pro Pro Ala Tyr Thr Asn Ser Phe
20 25 30
Thr Arg Gly Val Tyr Tyr Pro Asp Lys Val Phe Arg Ser Ser Val Leu
35 40 45
His Ser Thr Gln Asp Leu Phe Leu Pro Phe Phe Ser Asn Val Thr Trp
50 55 60
Phe His Ala Ile His Val Ser Gly Thr Asn Gly Thr Lys Arg Phe Asp
65 70 75 80
Asn Pro Val Leu Pro Phe Asn Asp Gly Val Tyr Phe Ala Ser Thr Glu
85 90 95
Lys Ser Asn Ile Ile Arg Gly Trp Ile Phe Gly Thr Thr Leu Asp Ser
100 105 110
Lys Thr Gln Ser Leu Leu Ile Val Asn Asn Ala Thr Asn Val Val Ile
115 120 125
Lys Val Cys Glu Phe Gln Phe Cys Asn Asp Pro Phe Leu Gly Val Tyr
130 135 140
Tyr His Lys Asn Asn Lys Ser Trp Met Glu Ser Glu Phe Arg Val Tyr
145 150 155 160
Ser Ser Ala Asn Asn Cys Thr Phe Glu Tyr Val Ser Gln Pro Phe Leu
165 170 175
Met Asp Leu Glu Gly Lys Gln Gly Asn Phe Lys Asn Leu Arg Glu Phe
180 185 190
Val Phe Lys Asn Ile Asp Gly Tyr Phe Lys Ile Tyr Ser Lys His Thr
195 200 205
Pro Ile Asn Leu Val Arg Asp Leu Pro Gln Gly Phe Ser Ala Leu Glu
210 215 220
Pro Leu Val Asp Leu Pro Ile Gly Ile Asn Ile Thr Arg Phe Gln Thr
225 230 235 240
Leu Leu Ala Leu His Arg Ser Tyr Leu Thr Pro Gly Asp Ser Ser Ser
245 250 255
Gly Trp Thr Ala Gly Ala Ala Ala Tyr Tyr Val Gly Tyr Leu Gln Pro
260 265 270
Arg Thr Phe Leu Leu Lys Tyr Asn Glu Asn Gly Thr Ile Thr Asp Ala
275 280 285
Val Asp Cys Ala Leu Asp Pro Leu Ser Glu Thr Lys Cys Thr Leu Lys
290 295 300
Ser Phe Thr Val Glu Lys Gly Ile Tyr Gln Thr Ser Asn Phe Arg Val
305 310 315 320
Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn Leu Cys
325 330 335
Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val Tyr Ala
340 345 350
Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu
355 360 365
Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val Ser Pro
370 375 380
Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp Ser Phe
385 390 395 400
Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln Thr Gly
405 410 415
Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr Gly Cys
420 425 430
Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly Gly Asn
435 440 445
Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys Pro Phe
450 455 460
Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr Pro Cys
465 470 475 480
Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser Tyr Gly
485 490 495
Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val Val Val
500 505 510
Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly Pro Lys
515 520 525
Lys Ser Thr Asn Leu Val Lys Asn Lys Cys Val Asn Phe Asn Phe Asn
530 535 540
Gly Leu Thr Gly Thr Gly Val Leu Thr Glu Ser Asn Lys Lys Phe Leu
545 550 555 560
Pro Phe Gln Gln Phe Gly Arg Asp Ile Ala Asp Thr Thr Asp Ala Val
565 570 575
Arg Asp Pro Gln Thr Leu Glu Ile Leu Asp Ile Thr Pro Cys Ser Phe
580 585 590
Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Thr Ser Asn Gln Val
595 600 605
Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Glu Val Pro Val Ala Ile
610 615 620
His Ala Asp Gln Leu Thr Pro Thr Trp Arg Val Tyr Ser Thr Gly Ser
625 630 635 640
Asn Val Phe Gln Thr Arg Ala Gly Cys Leu Ile Gly Ala Glu His Val
645 650 655
Asn Asn Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile Cys Ala
660 665 670
Ser Tyr Gln Thr Gln Thr Asn Ser Pro Arg Arg Ala Arg Ser Val Ala
675 680 685
Ser Gln Ser Ile Ile Ala Tyr Thr Met Ser Leu Gly Ala Glu Asn Ser
690 695 700
Val Ala Tyr Ser Asn Asn Ser Ile Ala Ile Pro Thr Asn Phe Thr Ile
705 710 715 720
Ser Val Thr Thr Glu Ile Leu Pro Val Ser Met Thr Lys Thr Ser Val
725 730 735
Asp Cys Thr Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ser Asn Leu
740 745 750
Leu Leu Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Thr
755 760 765
Gly Ile Ala Val Glu Gln Asp Lys Asn Thr Gln Glu Val Phe Ala Gln
770 775 780
Val Lys Gln Ile Tyr Lys Thr Pro Pro Ile Lys Asp Phe Gly Gly Phe
785 790 795 800
Asn Phe Ser Gln Ile Leu Pro Asp Pro Ser Lys Pro Ser Lys Arg Ser
805 810 815
Phe Ile Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly
820 825 830
Phe Ile Lys Gln Tyr Gly Asp Cys Leu Gly Asp Ile Ala Ala Arg Asp
835 840 845
Leu Ile Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu
850 855 860
Leu Thr Asp Glu Met Ile Ala Gln Tyr Thr Ser Ala Leu Leu Ala Gly
865 870 875 880
Thr Ile Thr Ser Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile
885 890 895
Pro Phe Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr
900 905 910
Gln Asn Val Leu Tyr Glu Asn Gln Lys Leu Ile Ala Asn Gln Phe Asn
915 920 925
Ser Ala Ile Gly Lys Ile Gln Asp Ser Leu Ser Ser Thr Ala Ser Ala
930 935 940
Leu Gly Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn
945 950 955 960
Thr Leu Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val
965 970 975
Leu Asn Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln
980 985 990
Ile Asp Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val
995 1000 1005
Thr Gln Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu
1010 1015 1020
Ala Ala Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val
1025 1030 1035 1040
Asp Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ser Ala
1045 1050 1055
Pro His Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ala Gln Glu
1060 1065 1070
Lys Asn Phe Thr Thr Ala Pro Ala Ile Cys His Asp Gly Lys Ala His
1075 1080 1085
Phe Pro Arg Glu Gly Val Phe Val Ser Asn Gly Thr His Trp Phe Val
1090 1095 1100
Thr Gln Arg Asn Phe Tyr Glu Pro Gln Ile Ile Thr Thr Asp Asn Thr
1105 1110 1115 1120
Phe Val Ser Gly Asn Cys Asp Val Val Ile Gly Ile Val Asn Asn Thr
1125 1130 1135
Val Tyr Asp Pro Leu Gln Pro Glu Leu Asp Ser Phe Lys Glu Glu Leu
1140 1145 1150
Asp Lys Tyr Phe Lys Asn His Thr Ser Pro Asp Val Asp Leu Gly Asp
1155 1160 1165
Ile Ser Gly Ile Asn Ala Ser Val Val Asn Ile Gln Lys Glu Ile Asp
1170 1175 1180
Arg Leu Asn Glu Val Ala Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu
1185 1190 1195 1200
Gln Glu Leu Gly Lys Tyr Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Ile
1205 1210 1215
Trp Leu Gly Phe Ile Ala Gly Leu Ile Ala Ile Val Met Val Thr Ile
1220 1225 1230
Met Leu Cys Cys Met Thr Ser Cys Cys Ser Cys Leu Lys Gly Cys Cys
1235 1240 1245
Ser Cys Gly Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro Val
1250 1255 1260
Leu Lys Gly Val Lys Leu His Tyr Thr
1265 1270
<210> 3
<211> 3819
<212> DNA
<213> Artificial Sequence
<400> 3
atgtttgtgt tcctggtcct gctgcctctt gtgagttcac aatgtgttaa tctgacaacg 60
aggactcagc tcccccccgc ctatacaaat agttttaccc gcggcgtgta ttatccggat 120
aaagtcttca ggtcttctgt gctccacagc acccaggacc tgttcctgcc ttttttttcc 180
aatgtgacct ggttccacgc catccacgtg tctggaacaa acggtaccaa aagattcgat 240
aaccctgtgc tgccctttaa cgatggagtc tactttgcta gcaccgagaa aagcaacatt 300
attagggggt ggatttttgg cactaccctc gacagcaaaa cccagtcatt gcttatcgtc 360
aacaacgcta ccaacgtcgt gattaaggtt tgcgaatttc agttttgcaa tgatcctttc 420
ctcggcgtgt attatcataa gaacaataaa tcttggatgg aatccgagtt ccgagtatat 480
tcaagcgcca acaactgtac ttttgaatat gtgtcccagc cattcctcat ggatctggaa 540
ggcaagcagg ggaactttaa aaatctcaga gagttcgtat tcaagaacat tgacgggtac 600
tttaagatct atagtaagca tacccccatc aaccttgtaa gagacctgcc acaggggttt 660
agtgccctgg agccactcgt ggatctgcca atcggaatca acatcacacg ctttcagact 720
ttgcttgcgc tgcacagaag ctatctgacc ccgggtgata gctcatctgg atggacagcg 780
ggggccgccg cgtactacgt cgggtacctt cagcccagga cgttcctgct gaaatacaac 840
gaaaacggca ccattaccga cgcagtagac tgcgcactcg accccctgag tgaaacaaag 900
tgtacgttga aaagttttac cgtagagaaa ggcatatatc agactagcaa ttttagggtt 960
cagcccacag agtctattgt gcgctttcct aatatcacca atttgtgccc ttttggagaa 1020
gtgtttaatg ccacccgatt tgcgtctgtg tatgcttgga atcgcaaaag gatctcaaac 1080
tgcgtcgccg actattccgt gctgtacaac tctgcttcat ttagcacatt caagtgttat 1140
ggggtgagtc caaccaaatt gaacgacctc tgctttacaa acgtgtacgc tgactcattt 1200
gtcattagag gcgacgaagt gaggcagatt gcccccgggc agacaggaaa aattgcggac 1260
tacaactaca agctccctga tgacttcacg ggctgtgtca tcgcatggaa cagtaacaat 1320
cttgatagca aggtgggcgg caattacaat tacctgtaca gactgtttag aaaatctaat 1380
ctcaaaccct ttgaaaggga catttccact gaaatctatc aggccgggag cactccgtgt 1440
aacggcgtag aggggtttaa ctgctatttc ccactgcagt cctatggatt ccagccaaca 1500
aacggggtgg gctaccaacc ctaccgggta gtggtgctga gctttgaact tctgcatgct 1560
ccggctaccg tctgtggccc aaagaagagc acaaacctcg taaagaacaa gtgtgttaac 1620
ttcaatttta atggcctcac cggaactggc gtcctcactg agtccaataa gaagtttctg 1680
ccgtttcaac agttcggccg ggacatagct gacacgactg acgccgtgag agaccctcaa 1740
accctcgaaa tactggacat cactccttgc tcattcggcg gcgtttctgt gataacacca 1800
ggcacgaaca cttctaatca ggtggctgtg ctttatcagg acgtgaactg cacagaagtg 1860
cctgtcgcca ttcatgccga tcagctcacc cctacttgga gagtttatag caccggctca 1920
aacgtgttcc aaacgagagc aggctgcctt atcggggcag agcacgtgaa caatagctat 1980
gagtgtgata tcccaattgg ggctggcata tgcgctagct accagaccca gacaaactca 2040
cccaggcggg cccggtcagt ggctagccag tctattatcg cctacaccat gtccctgggc 2100
gccgagaaca gtgtcgcgta cagcaataac tccatcgcta tccctaccaa cttcacgatc 2160
tcagtgacga ctgagatatt gccggtttct atgactaaga ccagtgtgga ttgtacaatg 2220
tacatctgtg gtgatagcac agagtgctct aatctcctgc tccaatatgg gagcttttgt 2280
acccagctga acagagcatt gaccgggatt gccgtcgagc aggataagaa cacacaagaa 2340
gtatttgccc aggtgaaaca gatctacaag actcccccta ttaaagactt cggcggcttt 2400
aacttttctc agatactccc cgaccctagc aagcctagca aacggagctt cattgaagat 2460
cttttgttta ataaggtcac attggcggat gccggcttta tcaagcagta cggggattgt 2520
ttgggtgata ttgcggctag ggatctgatt tgtgcccaga agttcaatgg cctgacagtg 2580
ctgccccccc tgcttacaga cgagatgatt gcgcagtaca ccagcgctct gctggcggga 2640
accatcacct ccggctggac ctttggggcc ggagccgcac tccagatccc ttttgccatg 2700
cagatggcct atagattcaa tggaatcggc gtgacacaga acgtcctgta tgagaaccag 2760
aaactcatcg ctaatcagtt taacagcgcc attggcaaaa ttcaggattc tctgagttca 2820
accgcatcag ctttgggtaa actgcaggat gtcgtaaatc agaatgctca ggccctgaat 2880
actcttgtta agcagctctc ctctaacttc ggcgccatca gttctgtgct gaacgacatt 2940
ctgtctagac tggacaaggt ggaggcagag gtacaaatcg accgcctgat caccggacgg 3000
ctgcagtcac tccaaacata cgtgacccaa cagctcatcc gggcagccga aattagagcc 3060
tctgcaaatc tggccgccac aaagatgagt gagtgcgttc tgggtcagtc caaacgagtg 3120
gacttctgcg gcaaaggtta ccacctgatg agtttccccc agtctgcccc gcatggcgtg 3180
gtattcctgc acgtgactta tgtcccagcc caggaaaaga acttcaccac cgccccagca 3240
atttgtcacg atggtaaggc ccacttcccc cgggaaggcg tttttgtgtc caatggcact 3300
cattggttcg tgacacagag aaacttttac gaaccccaaa tcattaccac cgacaacact 3360
ttcgtcagcg ggaattgtga cgtagtaatc gggattgtga acaacaccgt ctatgacccc 3420
ctgcagcccg agcttgactc ctttaaagag gaactggata agtatttcaa gaatcacaca 3480
agccctgatg ttgatctggg cgacatctct ggcattaacg cttcagtggt caacatacaa 3540
aaagagatcg atcgcctcaa tgaagtcgcc aagaatctca atgagtcact catcgatttg 3600
caggaactgg ggaagtacga gcagtatatc aagtggccct ggtacatctg gctgggattt 3660
attgctgggc tcatcgctat cgtaatggtc accattatgt tgtgctgcat gacctcctgt 3720
tgttcctgtc tgaaaggttg ttgtagttgc ggcagttgtt gtaagttcga tgaagatgac 3780
tctgagcctg tgctcaaggg cgtcaagctc cactacaca 3819
<210> 4
<211> 3819
<212> DNA
<213> Artificial Sequence
<400> 4
atgttcgtgt tcctggtgct gctgcccctg gtgagcagcc agtgcgtgaa cctgaccacc 60
agaacccagc tgccccccgc ctacaccaac agcttcacca gaggcgtgta ctaccccgac 120
aaggtgttca gaagcagcgt gctgcacagc acccaggacc tgttcctgcc cttcttcagc 180
aacgtgacct ggttccacgc catccacgtg agcggcacca acggcaccaa gagattcgac 240
aaccccgtgc tgcccttcaa cgacggcgtg tacttcgcca gcaccgagaa gagcaacatc 300
atcagaggct ggatcttcgg caccaccctg gacagcaaga cccagagcct gctgatcgtg 360
aacaacgcca ccaacgtggt gatcaaggtg tgcgagttcc agttctgcaa cgaccccttc 420
ctgggcgtgt actaccacaa gaacaacaag agctggatgg agagcgagtt cagagtgtac 480
agcagcgcca acaactgcac cttcgagtac gtgagccagc ccttcctgat ggacctggag 540
ggcaagcagg gcaacttcaa gaacctgaga gagttcgtgt tcaagaacat cgacggctac 600
ttcaagatct acagcaagca cacccccatc aacctggtga gagacctgcc ccagggcttc 660
agcgccctgg agcccctggt ggacctgccc atcggcatca acatcaccag attccagacc 720
ctgctggccc tgcacagaag ctacctgacc cccggcgaca gcagcagcgg ctggaccgcc 780
ggcgccgccg cctactacgt gggctacctg cagcccagaa ccttcctgct gaagtacaac 840
gagaacggca ccatcaccga cgccgtggac tgcgccctgg accccctgag cgagaccaag 900
tgcaccctga agagcttcac cgtggagaag ggcatctacc agaccagcaa cttcagagtg 960
cagcccaccg agagcatcgt gagattcccc aacatcacca acctgtgccc cttcggcgag 1020
gtgttcaacg ccaccagatt cgccagcgtg tacgcctgga acagaaagag aatcagcaac 1080
tgcgtggccg actacagcgt gctgtacaac agcgccagct tcagcacctt caagtgctac 1140
ggcgtgagcc ccaccaagct gaacgacctg tgcttcacca acgtgtacgc cgacagcttc 1200
gtgatcagag gcgacgaggt gagacagatc gcccccggcc agaccggcaa gatcgccgac 1260
tacaactaca agctgcccga cgacttcacc ggctgcgtga tcgcctggaa cagcaacaac 1320
ctggacagca aggtgggcgg caactacaac tacctgtaca gactgttcag aaagagcaac 1380
ctgaagccct tcgagagaga catcagcacc gagatctacc aggccggcag caccccctgc 1440
aacggcgtgg agggcttcaa ctgctacttc cccctgcaga gctacggctt ccagcccacc 1500
aacggcgtgg gctaccagcc ctacagagtg gtggtgctga gcttcgagct gctgcacgcc 1560
cccgccaccg tgtgcggccc caagaagagc accaacctgg tgaagaacaa gtgcgtgaac 1620
ttcaacttca acggcctgac cggcaccggc gtgctgaccg agagcaacaa gaagttcctg 1680
cccttccagc agttcggcag agacatcgcc gacaccaccg acgccgtgag agacccccag 1740
accctggaga tcctggacat caccccctgc agcttcggcg gcgtgagcgt gatcaccccc 1800
ggcaccaaca ccagcaacca ggtggccgtg ctgtaccagg acgtgaactg caccgaggtg 1860
cccgtggcca tccacgccga ccagctgacc cccacctgga gagtgtacag caccggcagc 1920
aacgtgttcc agaccagagc cggctgcctg atcggcgccg agcacgtgaa caacagctac 1980
gagtgcgaca tccccatcgg cgccggcatc tgcgccagct accagaccca gaccaacagc 2040
cccagaagag ccagaagcgt ggccagccag agcatcatcg cctacaccat gagcctgggc 2100
gccgagaaca gcgtggccta cagcaacaac agcatcgcca tccccaccaa cttcaccatc 2160
agcgtgacca ccgagatcct gcccgtgagc atgaccaaga ccagcgtgga ctgcaccatg 2220
tacatctgcg gcgacagcac cgagtgcagc aacctgctgc tgcagtacgg cagcttctgc 2280
acccagctga acagagccct gaccggcatc gccgtggagc aggacaagaa cacccaggag 2340
gtgttcgccc aggtgaagca gatctacaag acccccccca tcaaggactt cggcggcttc 2400
aacttcagcc agatcctgcc cgaccccagc aagcccagca agagaagctt catcgaggac 2460
ctgctgttca acaaggtgac cctggccgac gccggcttca tcaagcagta cggcgactgc 2520
ctgggcgaca tcgccgccag agacctgatc tgcgcccaga agttcaacgg cctgaccgtg 2580
ctgccccccc tgctgaccga cgagatgatc gcccagtaca ccagcgccct gctggccggc 2640
accatcacca gcggctggac cttcggcgcc ggcgccgccc tgcagatccc cttcgccatg 2700
cagatggcct acagattcaa cggcatcggc gtgacccaga acgtgctgta cgagaaccag 2760
aagctgatcg ccaaccagtt caacagcgcc atcggcaaga tccaggacag cctgagcagc 2820
accgccagcg ccctgggcaa gctgcaggac gtggtgaacc agaacgccca ggccctgaac 2880
accctggtga agcagctgag cagcaacttc ggcgccatca gcagcgtgct gaacgacatc 2940
ctgagcagac tggacaaggt ggaggccgag gtgcagatcg acagactgat caccggcaga 3000
ctgcagagcc tgcagaccta cgtgacccag cagctgatca gagccgccga gatcagagcc 3060
agcgccaacc tggccgccac caagatgagc gagtgcgtgc tgggccagag caagagagtg 3120
gacttctgcg gcaagggcta ccacctgatg agcttccccc agagcgcccc ccacggcgtg 3180
gtgttcctgc acgtgaccta cgtgcccgcc caggagaaga acttcaccac cgcccccgcc 3240
atctgccacg acggcaaggc ccacttcccc agagagggcg tgttcgtgag caacggcacc 3300
cactggttcg tgacccagag aaacttctac gagccccaga tcatcaccac cgacaacacc 3360
ttcgtgagcg gcaactgcga cgtggtgatc ggcatcgtga acaacaccgt gtacgacccc 3420
ctgcagcccg agctggacag cttcaaggag gagctggaca agtacttcaa gaaccacacc 3480
agccccgacg tggacctggg cgacatcagc ggcatcaacg ccagcgtggt gaacatccag 3540
aaggagatcg acagactgaa cgaggtggcc aagaacctga acgagagcct gatcgacctg 3600
caggagctgg gcaagtacga gcagtacatc aagtggccct ggtacatctg gctgggcttc 3660
atcgccggcc tgatcgccat cgtgatggtg accatcatgc tgtgctgcat gaccagctgc 3720
tgcagctgcc tgaagggctg ctgcagctgc ggcagctgct gcaagttcga cgaggacgac 3780
agcgagcccg tgctgaaggg cgtgaagctg cactacacc 3819
<210> 5
<211> 3819
<212> DNA
<213> Artificial Sequence
<400> 5
atgttcgtgt tcctggtgct gctgcccctg gtgagcagcc agtgcgtgaa cctgaccacc 60
cgcacccagc tgccccccgc ctacaccaac agcttcaccc gcggcgtgta ctaccccgac 120
aaggtgttcc gcagcagcgt gctgcacagc acccaggacc tgttcctgcc cttcttcagc 180
aacgtgacct ggttccacgc catccacgtg agcggcacca acggcaccaa gcgcttcgac 240
aaccccgtgc tgcccttcaa cgacggcgtg tacttcgcca gcaccgagaa gagcaacatc 300
atccgcggct ggatcttcgg caccaccctg gacagcaaga cccagagcct gctgatcgtg 360
aacaacgcca ccaacgtggt gatcaaggtg tgcgagttcc agttctgcaa cgaccccttc 420
ctgggcgtgt actaccacaa gaacaacaag agctggatgg agagcgagtt ccgcgtgtac 480
agcagcgcca acaactgcac cttcgagtac gtgagccagc ccttcctgat ggacctggag 540
ggcaagcagg gcaacttcaa gaacctgcgc gagttcgtgt tcaagaacat cgacggctac 600
ttcaagatct acagcaagca cacccccatc aacctggtgc gcgacctgcc ccagggcttc 660
agcgccctgg agcccctggt ggacctgccc atcggcatca acatcacccg cttccagacc 720
ctgctggccc tgcaccgcag ctacctgacc cccggcgaca gcagcagcgg ctggaccgcc 780
ggcgccgccg cctactacgt gggctacctg cagccccgca ccttcctgct gaagtacaac 840
gagaacggca ccatcaccga cgccgtggac tgcgccctgg accccctgag cgagaccaag 900
tgcaccctga agagcttcac cgtggagaag ggcatctacc agaccagcaa cttccgcgtg 960
cagcccaccg agagcatcgt gcgcttcccc aacatcacca acctgtgccc cttcggcgag 1020
gtgttcaacg ccacccgctt cgccagcgtg tacgcctgga accgcaagcg catcagcaac 1080
tgcgtggccg actacagcgt gctgtacaac agcgccagct tcagcacctt caagtgctac 1140
ggcgtgagcc ccaccaagct gaacgacctg tgcttcacca acgtgtacgc cgacagcttc 1200
gtgatccgcg gcgacgaggt gcgccagatc gcccccggcc agaccggcaa gatcgccgac 1260
tacaactaca agctgcccga cgacttcacc ggctgcgtga tcgcctggaa cagcaacaac 1320
ctggacagca aggtgggcgg caactacaac tacctgtacc gcctgttccg caagagcaac 1380
ctgaagccct tcgagcgcga catcagcacc gagatctacc aggccggcag caccccctgc 1440
aacggcgtgg agggcttcaa ctgctacttc cccctgcaga gctacggctt ccagcccacc 1500
aacggcgtgg gctaccagcc ctaccgcgtg gtggtgctga gcttcgagct gctgcacgcc 1560
cccgccaccg tgtgcggccc caagaagagc accaacctgg tgaagaacaa gtgcgtgaac 1620
ttcaacttca acggcctgac cggcaccggc gtgctgaccg agagcaacaa gaagttcctg 1680
cccttccagc agttcggccg cgacatcgcc gacaccaccg acgccgtgcg cgacccccag 1740
accctggaga tcctggacat caccccctgc agcttcggcg gcgtgagcgt gatcaccccc 1800
ggcaccaaca ccagcaacca ggtggccgtg ctgtaccagg acgtgaactg caccgaggtg 1860
cccgtggcca tccacgccga ccagctgacc cccacctggc gcgtgtacag caccggcagc 1920
aacgtgttcc agacccgcgc cggctgcctg atcggcgccg agcacgtgaa caacagctac 1980
gagtgcgaca tccccatcgg cgccggcatc tgcgccagct accagaccca gaccaacagc 2040
ccccgccgcg cccgcagcgt ggccagccag agcatcatcg cctacaccat gagcctgggc 2100
gccgagaaca gcgtggccta cagcaacaac agcatcgcca tccccaccaa cttcaccatc 2160
agcgtgacca ccgagatcct gcccgtgagc atgaccaaga ccagcgtgga ctgcaccatg 2220
tacatctgcg gcgacagcac cgagtgcagc aacctgctgc tgcagtacgg cagcttctgc 2280
acccagctga accgcgccct gaccggcatc gccgtggagc aggacaagaa cacccaggag 2340
gtgttcgccc aggtgaagca gatctacaag acccccccca tcaaggactt cggcggcttc 2400
aacttcagcc agatcctgcc cgaccccagc aagcccagca agcgcagctt catcgaggac 2460
ctgctgttca acaaggtgac cctggccgac gccggcttca tcaagcagta cggcgactgc 2520
ctgggcgaca tcgccgcccg cgacctgatc tgcgcccaga agttcaacgg cctgaccgtg 2580
ctgccccccc tgctgaccga cgagatgatc gcccagtaca ccagcgccct gctggccggc 2640
accatcacca gcggctggac cttcggcgcc ggcgccgccc tgcagatccc cttcgccatg 2700
cagatggcct accgcttcaa cggcatcggc gtgacccaga acgtgctgta cgagaaccag 2760
aagctgatcg ccaaccagtt caacagcgcc atcggcaaga tccaggacag cctgagcagc 2820
accgccagcg ccctgggcaa gctgcaggac gtggtgaacc agaacgccca ggccctgaac 2880
accctggtga agcagctgag cagcaacttc ggcgccatca gcagcgtgct gaacgacatc 2940
ctgagccgcc tggacaaggt ggaggccgag gtgcagatcg accgcctgat caccggccgc 3000
ctgcagagcc tgcagaccta cgtgacccag cagctgatcc gcgccgccga gatccgcgcc 3060
agcgccaacc tggccgccac caagatgagc gagtgcgtgc tgggccagag caagcgcgtg 3120
gacttctgcg gcaagggcta ccacctgatg agcttccccc agagcgcccc ccacggcgtg 3180
gtgttcctgc acgtgaccta cgtgcccgcc caggagaaga acttcaccac cgcccccgcc 3240
atctgccacg acggcaaggc ccacttcccc cgcgagggcg tgttcgtgag caacggcacc 3300
cactggttcg tgacccagcg caacttctac gagccccaga tcatcaccac cgacaacacc 3360
ttcgtgagcg gcaactgcga cgtggtgatc ggcatcgtga acaacaccgt gtacgacccc 3420
ctgcagcccg agctggacag cttcaaggag gagctggaca agtacttcaa gaaccacacc 3480
agccccgacg tggacctggg cgacatcagc ggcatcaacg ccagcgtggt gaacatccag 3540
aaggagatcg accgcctgaa cgaggtggcc aagaacctga acgagagcct gatcgacctg 3600
caggagctgg gcaagtacga gcagtacatc aagtggccct ggtacatctg gctgggcttc 3660
atcgccggcc tgatcgccat cgtgatggtg accatcatgc tgtgctgcat gaccagctgc 3720
tgcagctgcc tgaagggctg ctgcagctgc ggcagctgct gcaagttcga cgaggacgac 3780
agcgagcccg tgctgaaggg cgtgaagctg cactacacc 3819
<210> 6
<211> 3819
<212> DNA
<213> Artificial Sequence
<400> 6
atgttcgtgt tcctggtgct cctgcccctg gtgagctctc agtgcgtgaa cctgacaacc 60
cggacacagc tgcctcctgc ctacaccaac tctttcacaa gaggcgtcta ctatcctgat 120
aaggtgttca gaagctctgt gctgcattct acccaagatc tgttcctgcc tttcttcagc 180
aatgtgacat ggttccacgc catccacgtc tctgggacta acggtacaaa gagattcgac 240
aaccccgtac tgcctttcaa cgacggcgtt tacttcgcca gcaccgaaaa atctaacatc 300
atcaggggat ggatctttgg cacaaccctg gacagcaaga cccaatctct gctgatcgtg 360
aacaacgcca ccaacgtggt gataaaggtt tgtgaattcc agttctgcaa cgaccccttc 420
ctgggcgtgt actaccataa gaacaacaag agctggatgg aaagcgagtt cagagtgtac 480
agctccgcca acaactgcac attcgagtac gtgtcccagc cttttctgat ggacctggaa 540
ggcaaacaag gcaacttcaa gaacctgaga gagttcgtgt ttaagaacat cgacggctac 600
ttcaagatct actccaagca cacccctatc aacctggttc gggatctgcc tcagggcttt 660
tctgctctgg aacctctggt ggacctgcca atcggcatca acatcacacg cttccagacc 720
ttgctcgccc tgcacagatc ctacctgacc cctggcgact cctctagcgg atggaccgcc 780
ggcgcggccg catactacgt gggatatctg cagcctagaa ccttcctgct gaaatacaac 840
gagaatggca ccatcacaga cgccgtcgat tgcgccctgg accctctgag cgagacaaaa 900
tgtaccctga aaagttttac cgtggaaaag ggcatctacc agaccagcaa ttttagagtg 960
cagcccaccg aaagcatcgt gcggttcccc aacatcacca acctgtgccc cttcggcgag 1020
gtcttcaacg ccaccagatt cgcctctgtc tacgcctgga acagaaagag aatcagcaat 1080
tgcgtggccg actacagcgt gctgtacaac agcgccagct tctctacgtt caagtgctac 1140
ggcgtaagcc ctaccaagct gaacgacctg tgcttcacca acgtgtacgc cgactccttt 1200
gtgatccggg gagacgaggt gcggcagatt gcccctggcc agaccggcaa gatcgctgac 1260
tacaactaca agctgcccga tgatttcacc ggctgcgtga tcgcttggaa cagcaacaac 1320
cttgactcaa aggtaggagg caattacaac tacctgtaca gactgtttcg gaagagcaac 1380
ctgaagcctt tcgagagaga tatctcgaca gagatctatc aggccggatc tacgccctgt 1440
aatggcgttg aaggctttaa ctgctacttt cccctgcagt cttacggctt tcagcctacc 1500
aatggagttg gttaccagcc ataccgggtg gtggtgctca gcttcgagct gctccacgcc 1560
ccagctaccg tgtgcggccc taagaagtct accaacctcg ttaagaacaa gtgcgtgaac 1620
ttcaatttca acggcctgac cggaaccggc gtgctgaccg agagcaacaa aaagttcctg 1680
ccgttccaac agtttggcag agacatcgcc gataccacag atgccgttag agatcctcag 1740
acactggaaa tcctggatat cacaccttgc agcttcggcg gagtgagcgt gatcaccccc 1800
ggcaccaaca cctctaacca ggtggctgtg ctgtaccagg acgtgaactg caccgaggtc 1860
cccgtcgcca tccacgccga ccaactgacc cccacctggc gggtgtacag caccggcagc 1920
aacgtgttcc agaccagagc cggctgtctg atcggcgccg agcacgtgaa caatagttat 1980
gaatgtgaca tccccatcgg agctggcatt tgcgcttctt accagactca gaccaattct 2040
ccacgcagag ctcggagcgt ggccagccag tccatcatcg cctatactat gagcctgggc 2100
gctgagaaca gcgtggcata cagcaacaac agcatcgcaa tccccaccaa ttttacaatc 2160
agtgtgacca ccgaaatcct gcctgtgagc atgaccaaga ccagcgtgga ctgcaccatg 2220
tacatctgcg gcgacagcac agagtgcagc aacctgctgc tgcagtacgg ctccttttgc 2280
acccagctga atagagctct gacaggcatc gctgttgaac aggataagaa cacccaagag 2340
gtgttcgccc aggtaaagca gatctacaag acccctccta tcaaggactt cggcggcttt 2400
aacttcagcc agatcctgcc tgacccaagc aaaccctcca aacggagctt tattgaggat 2460
ctgctgttca acaaggtgac cctggccgac gccggattca tcaagcagta cggcgactgc 2520
ctgggcgaca tcgccgccag agatctgatc tgcgcccaga aattcaacgg gctgacagtg 2580
ctgcctccac tgctgaccga tgagatgatc gcccagtata caagcgccct gctcgctggc 2640
acgatcacca gcggatggac attcggagcc ggcgccgctc tgcaaatccc tttcgccatg 2700
cagatggcct acagattcaa cggcatcggc gtgacccaga acgtgctgta cgagaaccag 2760
aagctgatcg ctaaccagtt caatagcgcc atcgggaaga tccaggacag cctgtcatcc 2820
acagccagcg ccctgggcaa gctgcaggac gtggtgaatc aaaacgctca ggcgctgaac 2880
acactggtga agcaactgag cagcaacttc ggcgccatca gctcagtgct gaacgatatt 2940
ctgtctagac tggacaaagt ggaggccgag gtgcagatag atagactgat caccggcaga 3000
ctgcagagcc tgcaaaccta cgtgacccag cagctgatcc gggccgccga aatccgggcc 3060
agcgccaatc tggcagccac taagatgtct gagtgcgtgc tgggccagag caagcgggtg 3120
gacttctgcg gcaagggcta ccacctgatg agcttcccac aatctgcccc tcacggcgtg 3180
gtgttcctac acgtgacata cgtgcctgct caggagaaga atttcacgac cgcccctgct 3240
atctgtcacg acggaaaggc ccacttccct agagaaggcg tctttgtgag caacggaaca 3300
cactggttcg tgacacagag aaacttctac gagcctcaga tcatcacaac tgataacaca 3360
ttcgtgagcg ggaactgcga cgtcgtgatc ggcatcgtga acaataccgt ttacgaccct 3420
ctgcagcctg agctggactc cttcaaagag gaactggata agtacttcaa gaaccacacc 3480
agcccagacg tcgacctggg cgacattagc ggcatcaacg ccagcgtggt caacatccag 3540
aaggaaatcg atagactgaa cgaggtcgcc aagaacctga atgaaagttt gatcgacctg 3600
caggaactgg gcaagtacga gcagtacatc aagtggcctt ggtacatttg gctgggattc 3660
atcgccggcc tgatcgccat cgtgatggtc accatcatgc tgtgttgcat gacaagctgc 3720
tgctcctgcc tgaagggctg ttgttcttgt ggaagctgct gtaaattcga cgaggacgat 3780
tccgagcccg tgctgaaggg cgtgaagctg cactacacc 3819
<210> 7
<211> 3819
<212> DNA
<213> Artificial Sequence
<400> 7
atgttcgtgt tcctggtgct gctgcccctg gtgtcctctc agtgtgtgaa cctgaccacc 60
agaacacagc tgcctccagc ctacaccaac agcttcacca gaggcgtgta ctaccccgac 120
aaggtgttcc ggtcctccgt gctgcattct acccaggacc tgttcctgcc tttcttcagc 180
aacgtgacct ggttccacgc catccatgtg tctggcacca acggcaccaa gagattcgac 240
aaccccgtgc tgcctttcaa cgacggggtg tactttgcct ccaccgagaa gtccaacatc 300
atcagaggct ggatcttcgg caccacactg gacagcaaga cccagagcct gctgatcgtg 360
aacaacgcca ccaacgtggt catcaaagtg tgcgagttcc agttctgcaa cgaccccttc 420
ctgggcgtct actaccacaa gaacaacaag tcctggatgg aatccgagtt ccgggtgtac 480
tcctccgcca acaactgcac cttcgagtac gtgtcccagc ctttcctgat ggacctggaa 540
ggcaagcagg gcaacttcaa gaacctgcgc gagttcgtgt ttaagaacat cgacggctac 600
ttcaagatct actccaagca cacccctatc aacctcgtgc gggatctgcc tcagggcttc 660
tctgctctgg aacccctggt ggatctgccc atcggcatca acatcacccg gtttcagacc 720
ctgctggccc tgcaccggtc ttatttgacc cctggcgact cctcttctgg ctggactgct 780
ggtgccgctg cttactacgt gggctacctg cagcctagaa ccttcctgct gaagtacaac 840
gagaatggca ccatcaccga cgccgtggac tgtgctctgg atcctctgtc cgagacaaag 900
tgcaccctga agtccttcac cgtggaaaag ggcatctacc agacctccaa cttccgggtg 960
cagcccaccg agtctatcgt gcggttccct aacatcacca acctgtgtcc tttcggcgag 1020
gtgttcaatg ccaccagatt cgcctctgtg tacgcctgga accggaagcg gatctctaac 1080
tgcgtggccg actacagcgt gctgtacaac tccgcctcct tcagcacctt caagtgctac 1140
ggcgtgtccc ctaccaagct gaacgacctg tgcttcacaa acgtgtacgc cgactccttc 1200
gtgatccggg gagatgaagt gcggcagatc gctcctggac agaccggcaa gatcgccgat 1260
tacaactaca agctgcccga cgacttcacc ggctgtgtga tcgcttggaa ctccaacaac 1320
ctggactcca aagtcggcgg caactacaac tacctgtacc ggctgttccg gaagtctaac 1380
ctgaagcctt tcgagcggga catcagcacc gagatctacc aggctggcag caccccttgt 1440
aacggcgtgg aaggcttcaa ctgctacttc ccactgcagt cctacggctt tcagcctacc 1500
aatggcgtgg gctatcagcc ctacagagtg gtggtgctgt ccttcgagct gctgcatgct 1560
cctgctaccg tgtgcggccc taagaaatct accaacctgg tcaagaacaa atgcgtgaac 1620
ttcaacttca acggcctgac cggcaccggc gtgctgacag agtccaacaa gaagttcctg 1680
ccattccagc agttcggccg ggatatcgcc gataccacag atgccgtcag ggaccctcag 1740
acactggaaa tcctggacat caccccttgc agcttcggcg gagtgtctgt gatcacccca 1800
ggcaccaaca cctctaacca ggtggccgtg ctgtatcagg acgtgaactg taccgaggtg 1860
cccgtggcta tccatgccga tcagctgacc cctacatggc gcgtgtactc caccggctcc 1920
aacgtgttcc agacaagagc tggctgtctg atcggcgctg agcacgtgaa caattcctac 1980
gagtgcgaca tccccatcgg agccggaatc tgcgcctctt atcagaccca gaccaactct 2040
cccagacggg ccagatctgt ggccagccag tctatcattg cttacaccat gagcctgggc 2100
gccgagaact ctgtggccta cagcaacaac tctatcgcta tccccaccaa cttcaccatc 2160
tccgtgacca cagagatcct gcctgtgtcc atgaccaaga ccagcgtgga ctgcaccatg 2220
tacatctgcg gcgactctac cgagtgctcc aacctgctgc tgcagtacgg ctccttctgc 2280
acccagctga atagagccct gaccggaatc gccgtggaac aggacaagaa cacccaagag 2340
gtgttcgccc aagtgaagca gatctacaag acccctccta tcaaggactt cggcggcttc 2400
aatttctccc agattctgcc cgatcctagc aagccctcca agcggtcttt catcgaggac 2460
ctgctgttca acaaagtgac actggccgac gccggcttca tcaagcagta tggcgattgc 2520
ctgggcgaca ttgccgccag ggatctgatc tgtgcccaga agtttaacgg actgacagtg 2580
ctgcctcctc tgctgaccga tgagatgatc gcccagtaca cctccgcact gctggctggc 2640
acaatcacct ctggatggac atttggcgct ggcgccgctc tgcagatccc tttcgctatg 2700
cagatggcct accggttcaa cggcatcggc gtgacccaga atgtgctgta cgagaaccag 2760
aagctgatcg ccaaccagtt caacagcgcc atcggaaaga tccaggacag cctgtccagc 2820
accgcttctg ccctgggaaa gctgcaggat gtggtcaacc agaacgctca ggccctgaac 2880
accctcgtga agcagctgtc ctctaacttc ggcgccatct cctctgtgct gaacgatatc 2940
ctgagccggc tggacaaggt ggaagccgag gtgcagatcg acagactgat caccggacgg 3000
ctgcagtccc tgcagaccta tgttacccag cagctgatca gagccgccga gattagagcc 3060
tctgccaatc tggccgccac caagatgtct gagtgtgtgc tgggccagtc caagagagtg 3120
gacttttgcg gcaagggcta ccacctgatg agcttccctc agtctgctcc tcacggcgtg 3180
gtgtttctgc acgtgaccta cgtgcccgct caagagaaga actttaccac cgctcctgcc 3240
atctgccacg acggcaaggc tcactttcct cgagaaggcg tgttcgtgtc taacggcacc 3300
cattggttcg tgacacagcg gaacttctac gagccccaga tcatcaccac cgacaacacc 3360
tttgtgtccg gcaactgcga cgtcgtgatc ggaattgtga acaataccgt gtacgaccct 3420
ctgcagcccg agctggactc cttcaaagag gaactggaca agtactttaa gaaccacaca 3480
agccccgacg tggacctggg agacatctct ggcatcaacg cctccgtggt caacatccag 3540
aaagagatcg accggctgaa cgaggtggcc aagaatctga acgagtccct gatcgacctg 3600
caagaactgg ggaagtacga gcagtacatc aagtggccct ggtacatctg gctgggcttt 3660
atcgctggcc tgatcgctat cgtgatggtc acaatcatgc tgtgctgtat gacctcctgc 3720
tgctcttgcc tgaagggctg ctgttcttgc ggctcttgct gcaagttcga cgaggacgac 3780
tctgagcccg tgctgaaagg cgtgaagctg cactacacc 3819
<210> 8
<211> 3708
<212> DNA
<213> Artificial Sequence
<400> 8
atgttcgtgt tcctggtgct gctgcccctg gtgagcagcc agtgcgtgaa cctgaccacc 60
cgcacccagc tgccccccgc ctacaccaac agcttcaccc gcggcgtgta ctaccccgac 120
aaggtgttcc gcagcagcgt gctgcacagc acccaggacc tgttcctgcc cttcttcagc 180
aacgtgacct ggttccacgc catccacgtg agcggcacca acggcaccaa gcgcttcgac 240
aaccccgtgc tgcccttcaa cgacggcgtg tacttcgcca gcaccgagaa gagcaacatc 300
atccgcggct ggatcttcgg caccaccctg gacagcaaga cccagagcct gctgatcgtg 360
aacaacgcca ccaacgtggt gatcaaggtg tgcgagttcc agttctgcaa cgaccccttc 420
ctgggcgtgt actaccacaa gaacaacaag agctggatgg agagcgagtt ccgcgtgtac 480
agcagcgcca acaactgcac cttcgagtac gtgagccagc ccttcctgat ggacctggag 540
ggcaagcagg gcaacttcaa gaacctgcgc gagttcgtgt tcaagaacat cgacggctac 600
ttcaagatct acagcaagca cacccccatc aacctggtgc gcgacctgcc ccagggcttc 660
agcgccctgg agcccctggt ggacctgccc atcggcatca acatcacccg cttccagacc 720
ctgctggccc tgcaccgcag ctacctgacc cccggcgaca gcagcagcgg ctggaccgcc 780
ggcgccgccg cctactacgt gggctacctg cagccccgca ccttcctgct gaagtacaac 840
gagaacggca ccatcaccga cgccgtggac tgcgccctgg accccctgag cgagaccaag 900
tgcaccctga agagcttcac cgtggagaag ggcatctacc agaccagcaa cttccgcgtg 960
cagcccaccg agagcatcgt gcgcttcccc aacatcacca acctgtgccc cttcggcgag 1020
gtgttcaacg ccacccgctt cgccagcgtg tacgcctgga accgcaagcg catcagcaac 1080
tgcgtggccg actacagcgt gctgtacaac agcgccagct tcagcacctt caagtgctac 1140
ggcgtgagcc ccaccaagct gaacgacctg tgcttcacca acgtgtacgc cgacagcttc 1200
gtgatccgcg gcgacgaggt gcgccagatc gcccccggcc agaccggcaa gatcgccgac 1260
tacaactaca agctgcccga cgacttcacc ggctgcgtga tcgcctggaa cagcaacaac 1320
ctggacagca aggtgggcgg caactacaac tacctgtacc gcctgttccg caagagcaac 1380
ctgaagccct tcgagcgcga catcagcacc gagatctacc aggccggcag caccccctgc 1440
aacggcgtgg agggcttcaa ctgctacttc cccctgcaga gctacggctt ccagcccacc 1500
aacggcgtgg gctaccagcc ctaccgcgtg gtggtgctga gcttcgagct gctgcacgcc 1560
cccgccaccg tgtgcggccc caagaagagc accaacctgg tgaagaacaa gtgcgtgaac 1620
ttcaacttca acggcctgac cggcaccggc gtgctgaccg agagcaacaa gaagttcctg 1680
cccttccagc agttcggccg cgacatcgcc gacaccaccg acgccgtgcg cgacccccag 1740
accctggaga tcctggacat caccccctgc agcttcggcg gcgtgagcgt gatcaccccc 1800
ggcaccaaca ccagcaacca ggtggccgtg ctgtaccagg acgtgaactg caccgaggtg 1860
cccgtggcca tccacgccga ccagctgacc cccacctggc gcgtgtacag caccggcagc 1920
aacgtgttcc agacccgcgc cggctgcctg atcggcgccg agcacgtgaa caacagctac 1980
gagtgcgaca tccccatcgg cgccggcatc tgcgccagct accagaccca gaccaacagc 2040
cccggcagcg ccagcagcgt ggccagccag agcatcatcg cctacaccat gagcctgggc 2100
gccgagaaca gcgtggccta cagcaacaac agcatcgcca tccccaccaa cttcaccatc 2160
agcgtgacca ccgagatcct gcccgtgagc atgaccaaga ccagcgtgga ctgcaccatg 2220
tacatctgcg gcgacagcac cgagtgcagc aacctgctgc tgcagtacgg cagcttctgc 2280
acccagctga accgcgccct gaccggcatc gccgtggagc aggacaagaa cacccaggag 2340
gtgttcgccc aggtgaagca gatctacaag acccccccca tcaaggactt cggcggcttc 2400
aacttcagcc agatcctgcc cgaccccagc aagcccagca agcgcagctt catcgaggac 2460
ctgctgttca acaaggtgac cctggccgac gccggcttca tcaagcagta cggcgactgc 2520
ctgggcgaca tcgccgcccg cgacctgatc tgcgcccaga agttcaacgg cctgaccgtg 2580
ctgccccccc tgctgaccga cgagatgatc gcccagtaca ccagcgccct gctggccggc 2640
accatcacca gcggctggac cttcggcgcc ggcgccgccc tgcagatccc cttcgccatg 2700
cagatggcct accgcttcaa cggcatcggc gtgacccaga acgtgctgta cgagaaccag 2760
aagctgatcg ccaaccagtt caacagcgcc atcggcaaga tccaggacag cctgagcagc 2820
accgccagcg ccctgggcaa gctgcaggac gtggtgaacc agaacgccca ggccctgaac 2880
accctggtga agcagctgag cagcaacttc ggcgccatca gcagcgtgct gaacgacatc 2940
ctgagccgcc tggacaaggt ggaggccgag gtgcagatcg accgcctgat caccggccgc 3000
ctgcagagcc tgcagaccta cgtgacccag cagctgatcc gcgccgccga gatccgcgcc 3060
agcgccaacc tggccgccac caagatgagc gagtgcgtgc tgggccagag caagcgcgtg 3120
gacttctgcg gcaagggcta ccacctgatg agcttccccc agagcgcccc ccacggcgtg 3180
gtgttcctgc acgtgaccta cgtgcccgcc caggagaaga acttcaccac cgcccccgcc 3240
atctgccacg acggcaaggc ccacttcccc cgcgagggcg tgttcgtgag caacggcacc 3300
cactggttcg tgacccagcg caacttctac gagccccaga tcatcaccac cgacaacacc 3360
ttcgtgagcg gcaactgcga cgtggtgatc ggcatcgtga acaacaccgt gtacgacccc 3420
ctgcagcccg agctggacag cttcaaggag gagctggaca agtacttcaa gaaccacacc 3480
agccccgacg tggacctggg cgacatcagc ggcatcaacg ccagcgtggt gaacatccag 3540
aaggagatcg accgcctgaa cgaggtggcc aagaacctga acgagagcct gatcgacctg 3600
caggagctgg gcaagtacga gcagggctac atccccgagg ccccccgcga cggccaggcc 3660
tacgtgcgca aggacggcga gtgggtgctg ctgagcacct tcctgtga 3708
<210> 9
<211> 1235
<212> PRT
<213> Artificial Sequence
<400> 9
Met Phe Val Phe Leu Val Leu Leu Pro Leu Val Ser Ser Gln Cys Val
1 5 10 15
Asn Leu Thr Thr Arg Thr Gln Leu Pro Pro Ala Tyr Thr Asn Ser Phe
20 25 30
Thr Arg Gly Val Tyr Tyr Pro Asp Lys Val Phe Arg Ser Ser Val Leu
35 40 45
His Ser Thr Gln Asp Leu Phe Leu Pro Phe Phe Ser Asn Val Thr Trp
50 55 60
Phe His Ala Ile His Val Ser Gly Thr Asn Gly Thr Lys Arg Phe Asp
65 70 75 80
Asn Pro Val Leu Pro Phe Asn Asp Gly Val Tyr Phe Ala Ser Thr Glu
85 90 95
Lys Ser Asn Ile Ile Arg Gly Trp Ile Phe Gly Thr Thr Leu Asp Ser
100 105 110
Lys Thr Gln Ser Leu Leu Ile Val Asn Asn Ala Thr Asn Val Val Ile
115 120 125
Lys Val Cys Glu Phe Gln Phe Cys Asn Asp Pro Phe Leu Gly Val Tyr
130 135 140
Tyr His Lys Asn Asn Lys Ser Trp Met Glu Ser Glu Phe Arg Val Tyr
145 150 155 160
Ser Ser Ala Asn Asn Cys Thr Phe Glu Tyr Val Ser Gln Pro Phe Leu
165 170 175
Met Asp Leu Glu Gly Lys Gln Gly Asn Phe Lys Asn Leu Arg Glu Phe
180 185 190
Val Phe Lys Asn Ile Asp Gly Tyr Phe Lys Ile Tyr Ser Lys His Thr
195 200 205
Pro Ile Asn Leu Val Arg Asp Leu Pro Gln Gly Phe Ser Ala Leu Glu
210 215 220
Pro Leu Val Asp Leu Pro Ile Gly Ile Asn Ile Thr Arg Phe Gln Thr
225 230 235 240
Leu Leu Ala Leu His Arg Ser Tyr Leu Thr Pro Gly Asp Ser Ser Ser
245 250 255
Gly Trp Thr Ala Gly Ala Ala Ala Tyr Tyr Val Gly Tyr Leu Gln Pro
260 265 270
Arg Thr Phe Leu Leu Lys Tyr Asn Glu Asn Gly Thr Ile Thr Asp Ala
275 280 285
Val Asp Cys Ala Leu Asp Pro Leu Ser Glu Thr Lys Cys Thr Leu Lys
290 295 300
Ser Phe Thr Val Glu Lys Gly Ile Tyr Gln Thr Ser Asn Phe Arg Val
305 310 315 320
Gln Pro Thr Glu Ser Ile Val Arg Phe Pro Asn Ile Thr Asn Leu Cys
325 330 335
Pro Phe Gly Glu Val Phe Asn Ala Thr Arg Phe Ala Ser Val Tyr Ala
340 345 350
Trp Asn Arg Lys Arg Ile Ser Asn Cys Val Ala Asp Tyr Ser Val Leu
355 360 365
Tyr Asn Ser Ala Ser Phe Ser Thr Phe Lys Cys Tyr Gly Val Ser Pro
370 375 380
Thr Lys Leu Asn Asp Leu Cys Phe Thr Asn Val Tyr Ala Asp Ser Phe
385 390 395 400
Val Ile Arg Gly Asp Glu Val Arg Gln Ile Ala Pro Gly Gln Thr Gly
405 410 415
Lys Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe Thr Gly Cys
420 425 430
Val Ile Ala Trp Asn Ser Asn Asn Leu Asp Ser Lys Val Gly Gly Asn
435 440 445
Tyr Asn Tyr Leu Tyr Arg Leu Phe Arg Lys Ser Asn Leu Lys Pro Phe
450 455 460
Glu Arg Asp Ile Ser Thr Glu Ile Tyr Gln Ala Gly Ser Thr Pro Cys
465 470 475 480
Asn Gly Val Glu Gly Phe Asn Cys Tyr Phe Pro Leu Gln Ser Tyr Gly
485 490 495
Phe Gln Pro Thr Asn Gly Val Gly Tyr Gln Pro Tyr Arg Val Val Val
500 505 510
Leu Ser Phe Glu Leu Leu His Ala Pro Ala Thr Val Cys Gly Pro Lys
515 520 525
Lys Ser Thr Asn Leu Val Lys Asn Lys Cys Val Asn Phe Asn Phe Asn
530 535 540
Gly Leu Thr Gly Thr Gly Val Leu Thr Glu Ser Asn Lys Lys Phe Leu
545 550 555 560
Pro Phe Gln Gln Phe Gly Arg Asp Ile Ala Asp Thr Thr Asp Ala Val
565 570 575
Arg Asp Pro Gln Thr Leu Glu Ile Leu Asp Ile Thr Pro Cys Ser Phe
580 585 590
Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Thr Ser Asn Gln Val
595 600 605
Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Glu Val Pro Val Ala Ile
610 615 620
His Ala Asp Gln Leu Thr Pro Thr Trp Arg Val Tyr Ser Thr Gly Ser
625 630 635 640
Asn Val Phe Gln Thr Arg Ala Gly Cys Leu Ile Gly Ala Glu His Val
645 650 655
Asn Asn Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile Cys Ala
660 665 670
Ser Tyr Gln Thr Gln Thr Asn Ser Pro Gly Ser Ala Ser Ser Val Ala
675 680 685
Ser Gln Ser Ile Ile Ala Tyr Thr Met Ser Leu Gly Ala Glu Asn Ser
690 695 700
Val Ala Tyr Ser Asn Asn Ser Ile Ala Ile Pro Thr Asn Phe Thr Ile
705 710 715 720
Ser Val Thr Thr Glu Ile Leu Pro Val Ser Met Thr Lys Thr Ser Val
725 730 735
Asp Cys Thr Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ser Asn Leu
740 745 750
Leu Leu Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Thr
755 760 765
Gly Ile Ala Val Glu Gln Asp Lys Asn Thr Gln Glu Val Phe Ala Gln
770 775 780
Val Lys Gln Ile Tyr Lys Thr Pro Pro Ile Lys Asp Phe Gly Gly Phe
785 790 795 800
Asn Phe Ser Gln Ile Leu Pro Asp Pro Ser Lys Pro Ser Lys Arg Ser
805 810 815
Phe Ile Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly
820 825 830
Phe Ile Lys Gln Tyr Gly Asp Cys Leu Gly Asp Ile Ala Ala Arg Asp
835 840 845
Leu Ile Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu
850 855 860
Leu Thr Asp Glu Met Ile Ala Gln Tyr Thr Ser Ala Leu Leu Ala Gly
865 870 875 880
Thr Ile Thr Ser Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile
885 890 895
Pro Phe Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr
900 905 910
Gln Asn Val Leu Tyr Glu Asn Gln Lys Leu Ile Ala Asn Gln Phe Asn
915 920 925
Ser Ala Ile Gly Lys Ile Gln Asp Ser Leu Ser Ser Thr Ala Ser Ala
930 935 940
Leu Gly Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn
945 950 955 960
Thr Leu Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val
965 970 975
Leu Asn Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln
980 985 990
Ile Asp Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val
995 1000 1005
Thr Gln Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu
1010 1015 1020
Ala Ala Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val
1025 1030 1035 1040
Asp Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ser Ala
1045 1050 1055
Pro His Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ala Gln Glu
1060 1065 1070
Lys Asn Phe Thr Thr Ala Pro Ala Ile Cys His Asp Gly Lys Ala His
1075 1080 1085
Phe Pro Arg Glu Gly Val Phe Val Ser Asn Gly Thr His Trp Phe Val
1090 1095 1100
Thr Gln Arg Asn Phe Tyr Glu Pro Gln Ile Ile Thr Thr Asp Asn Thr
1105 1110 1115 1120
Phe Val Ser Gly Asn Cys Asp Val Val Ile Gly Ile Val Asn Asn Thr
1125 1130 1135
Val Tyr Asp Pro Leu Gln Pro Glu Leu Asp Ser Phe Lys Glu Glu Leu
1140 1145 1150
Asp Lys Tyr Phe Lys Asn His Thr Ser Pro Asp Val Asp Leu Gly Asp
1155 1160 1165
Ile Ser Gly Ile Asn Ala Ser Val Val Asn Ile Gln Lys Glu Ile Asp
1170 1175 1180
Arg Leu Asn Glu Val Ala Lys Asn Leu Asn Glu Ser Leu Ile Asp Leu
1185 1190 1195 1200
Gln Glu Leu Gly Lys Tyr Glu Gln Gly Tyr Ile Pro Glu Ala Pro Arg
1205 1210 1215
Asp Gly Gln Ala Tyr Val Arg Lys Asp Gly Glu Trp Val Leu Leu Ser
1220 1225 1230
Thr Phe Leu
1235
<210> 10
<211> 3735
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
atggacgcca tgaagcgcgg cctgtgctgc gtgctgctgc tgtgcggcgc cgtgttcgtg 60
agcgcccagt gcgtgaacct gaccacccgc acccagctgc cccccgccta caccaacagc 120
ttcacccgcg gcgtgtacta ccccgacaag gtgttccgca gcagcgtgct gcacagcacc 180
caggacctgt tcctgccctt cttcagcaac gtgacctggt tccacgccat ccacgtgagc 240
ggcaccaacg gcaccaagcg cttcgacaac cccgtgctgc ccttcaacga cggcgtgtac 300
ttcgccagca ccgagaagag caacatcatc cgcggctgga tcttcggcac caccctggac 360
agcaagaccc agagcctgct gatcgtgaac aacgccacca acgtggtgat caaggtgtgc 420
gagttccagt tctgcaacga ccccttcctg ggcgtgtact accacaagaa caacaagagc 480
tggatggaga gcgagttccg cgtgtacagc agcgccaaca actgcacctt cgagtacgtg 540
agccagccct tcctgatgga cctggagggc aagcagggca acttcaagaa cctgcgcgag 600
ttcgtgttca agaacatcga cggctacttc aagatctaca gcaagcacac ccccatcaac 660
ctggtgcgcg acctgcccca gggcttcagc gccctggagc ccctggtgga cctgcccatc 720
ggcatcaaca tcacccgctt ccagaccctg ctggccctgc accgcagcta cctgaccccc 780
ggcgacagca gcagcggctg gaccgccggc gccgccgcct actacgtggg ctacctgcag 840
ccccgcacct tcctgctgaa gtacaacgag aacggcacca tcaccgacgc cgtggactgc 900
gccctggacc ccctgagcga gaccaagtgc accctgaaga gcttcaccgt ggagaagggc 960
atctaccaga ccagcaactt ccgcgtgcag cccaccgaga gcatcgtgcg cttccccaac 1020
atcaccaacc tgtgcccctt cggcgaggtg ttcaacgcca cccgcttcgc cagcgtgtac 1080
gcctggaacc gcaagcgcat cagcaactgc gtggccgact acagcgtgct gtacaacagc 1140
gccagcttca gcaccttcaa gtgctacggc gtgagcccca ccaagctgaa cgacctgtgc 1200
ttcaccaacg tgtacgccga cagcttcgtg atccgcggcg acgaggtgcg ccagatcgcc 1260
cccggccaga ccggcaagat cgccgactac aactacaagc tgcccgacga cttcaccggc 1320
tgcgtgatcg cctggaacag caacaacctg gacagcaagg tgggcggcaa ctacaactac 1380
ctgtaccgcc tgttccgcaa gagcaacctg aagcccttcg agcgcgacat cagcaccgag 1440
atctaccagg ccggcagcac cccctgcaac ggcgtggagg gcttcaactg ctacttcccc 1500
ctgcagagct acggcttcca gcccaccaac ggcgtgggct accagcccta ccgcgtggtg 1560
gtgctgagct tcgagctgct gcacgccccc gccaccgtgt gcggccccaa gaagagcacc 1620
aacctggtga agaacaagtg cgtgaacttc aacttcaacg gcctgaccgg caccggcgtg 1680
ctgaccgaga gcaacaagaa gttcctgccc ttccagcagt tcggccgcga catcgccgac 1740
accaccgacg ccgtgcgcga cccccagacc ctggagatcc tggacatcac cccctgcagc 1800
ttcggcggcg tgagcgtgat cacccccggc accaacacca gcaaccaggt ggccgtgctg 1860
taccaggacg tgaactgcac cgaggtgccc gtggccatcc acgccgacca gctgaccccc 1920
acctggcgcg tgtacagcac cggcagcaac gtgttccaga cccgcgccgg ctgcctgatc 1980
ggcgccgagc acgtgaacaa cagctacgag tgcgacatcc ccatcggcgc cggcatctgc 2040
gccagctacc agacccagac caacagcccc ggcagcgcca gcagcgtggc cagccagagc 2100
atcatcgcct acaccatgag cctgggcgcc gagaacagcg tggcctacag caacaacagc 2160
atcgccatcc ccaccaactt caccatcagc gtgaccaccg agatcctgcc cgtgagcatg 2220
accaagacca gcgtggactg caccatgtac atctgcggcg acagcaccga gtgcagcaac 2280
ctgctgctgc agtacggcag cttctgcacc cagctgaacc gcgccctgac cggcatcgcc 2340
gtggagcagg acaagaacac ccaggaggtg ttcgcccagg tgaagcagat ctacaagacc 2400
ccccccatca aggacttcgg cggcttcaac ttcagccaga tcctgcccga ccccagcaag 2460
cccagcaagc gcagcttcat cgaggacctg ctgttcaaca aggtgaccct ggccgacgcc 2520
ggcttcatca agcagtacgg cgactgcctg ggcgacatcg ccgcccgcga cctgatctgc 2580
gcccagaagt tcaacggcct gaccgtgctg ccccccctgc tgaccgacga gatgatcgcc 2640
cagtacacca gcgccctgct ggccggcacc atcaccagcg gctggacctt cggcgccggc 2700
gccgccctgc agatcccctt cgccatgcag atggcctacc gcttcaacgg catcggcgtg 2760
acccagaacg tgctgtacga gaaccagaag ctgatcgcca accagttcaa cagcgccatc 2820
ggcaagatcc aggacagcct gagcagcacc gccagcgccc tgggcaagct gcaggacgtg 2880
gtgaaccaga acgcccaggc cctgaacacc ctggtgaagc agctgagcag caacttcggc 2940
gccatcagca gcgtgctgaa cgacatcctg agccgcctgg acaaggtgga ggccgaggtg 3000
cagatcgacc gcctgatcac cggccgcctg cagagcctgc agacctacgt gacccagcag 3060
ctgatccgcg ccgccgagat ccgcgccagc gccaacctgg ccgccaccaa gatgagcgag 3120
tgcgtgctgg gccagagcaa gcgcgtggac ttctgcggca agggctacca cctgatgagc 3180
ttcccccaga gcgcccccca cggcgtggtg ttcctgcacg tgacctacgt gcccgcccag 3240
gagaagaact tcaccaccgc ccccgccatc tgccacgacg gcaaggccca cttcccccgc 3300
gagggcgtgt tcgtgagcaa cggcacccac tggttcgtga cccagcgcaa cttctacgag 3360
ccccagatca tcaccaccga caacaccttc gtgagcggca actgcgacgt ggtgatcggc 3420
atcgtgaaca acaccgtgta cgaccccctg cagcccgagc tggacagctt caaggaggag 3480
ctggacaagt acttcaagaa ccacaccagc cccgacgtgg acctgggcga catcagcggc 3540
atcaacgcca gcgtggtgaa catccagaag gagatcgacc gcctgaacga ggtggccaag 3600
aacctgaacg agagcctgat cgacctgcag gagctgggca agtacgagca gggctacatc 3660
cccgaggccc cccgcgacgg ccaggcctac gtgcgcaagg acggcgagtg ggtgctgctg 3720
agcaccttcc tgtga 3735
<210> 11
<211> 1244
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 11
Met Asp Ala Met Lys Arg Gly Leu Cys Cys Val Leu Leu Leu Cys Gly
1 5 10 15
Ala Val Phe Val Ser Ala Gln Cys Val Asn Leu Thr Thr Arg Thr Gln
20 25 30
Leu Pro Pro Ala Tyr Thr Asn Ser Phe Thr Arg Gly Val Tyr Tyr Pro
35 40 45
Asp Lys Val Phe Arg Ser Ser Val Leu His Ser Thr Gln Asp Leu Phe
50 55 60
Leu Pro Phe Phe Ser Asn Val Thr Trp Phe His Ala Ile His Val Ser
65 70 75 80
Gly Thr Asn Gly Thr Lys Arg Phe Asp Asn Pro Val Leu Pro Phe Asn
85 90 95
Asp Gly Val Tyr Phe Ala Ser Thr Glu Lys Ser Asn Ile Ile Arg Gly
100 105 110
Trp Ile Phe Gly Thr Thr Leu Asp Ser Lys Thr Gln Ser Leu Leu Ile
115 120 125
Val Asn Asn Ala Thr Asn Val Val Ile Lys Val Cys Glu Phe Gln Phe
130 135 140
Cys Asn Asp Pro Phe Leu Gly Val Tyr Tyr His Lys Asn Asn Lys Ser
145 150 155 160
Trp Met Glu Ser Glu Phe Arg Val Tyr Ser Ser Ala Asn Asn Cys Thr
165 170 175
Phe Glu Tyr Val Ser Gln Pro Phe Leu Met Asp Leu Glu Gly Lys Gln
180 185 190
Gly Asn Phe Lys Asn Leu Arg Glu Phe Val Phe Lys Asn Ile Asp Gly
195 200 205
Tyr Phe Lys Ile Tyr Ser Lys His Thr Pro Ile Asn Leu Val Arg Asp
210 215 220
Leu Pro Gln Gly Phe Ser Ala Leu Glu Pro Leu Val Asp Leu Pro Ile
225 230 235 240
Gly Ile Asn Ile Thr Arg Phe Gln Thr Leu Leu Ala Leu His Arg Ser
245 250 255
Tyr Leu Thr Pro Gly Asp Ser Ser Ser Gly Trp Thr Ala Gly Ala Ala
260 265 270
Ala Tyr Tyr Val Gly Tyr Leu Gln Pro Arg Thr Phe Leu Leu Lys Tyr
275 280 285
Asn Glu Asn Gly Thr Ile Thr Asp Ala Val Asp Cys Ala Leu Asp Pro
290 295 300
Leu Ser Glu Thr Lys Cys Thr Leu Lys Ser Phe Thr Val Glu Lys Gly
305 310 315 320
Ile Tyr Gln Thr Ser Asn Phe Arg Val Gln Pro Thr Glu Ser Ile Val
325 330 335
Arg Phe Pro Asn Ile Thr Asn Leu Cys Pro Phe Gly Glu Val Phe Asn
340 345 350
Ala Thr Arg Phe Ala Ser Val Tyr Ala Trp Asn Arg Lys Arg Ile Ser
355 360 365
Asn Cys Val Ala Asp Tyr Ser Val Leu Tyr Asn Ser Ala Ser Phe Ser
370 375 380
Thr Phe Lys Cys Tyr Gly Val Ser Pro Thr Lys Leu Asn Asp Leu Cys
385 390 395 400
Phe Thr Asn Val Tyr Ala Asp Ser Phe Val Ile Arg Gly Asp Glu Val
405 410 415
Arg Gln Ile Ala Pro Gly Gln Thr Gly Lys Ile Ala Asp Tyr Asn Tyr
420 425 430
Lys Leu Pro Asp Asp Phe Thr Gly Cys Val Ile Ala Trp Asn Ser Asn
435 440 445
Asn Leu Asp Ser Lys Val Gly Gly Asn Tyr Asn Tyr Leu Tyr Arg Leu
450 455 460
Phe Arg Lys Ser Asn Leu Lys Pro Phe Glu Arg Asp Ile Ser Thr Glu
465 470 475 480
Ile Tyr Gln Ala Gly Ser Thr Pro Cys Asn Gly Val Glu Gly Phe Asn
485 490 495
Cys Tyr Phe Pro Leu Gln Ser Tyr Gly Phe Gln Pro Thr Asn Gly Val
500 505 510
Gly Tyr Gln Pro Tyr Arg Val Val Val Leu Ser Phe Glu Leu Leu His
515 520 525
Ala Pro Ala Thr Val Cys Gly Pro Lys Lys Ser Thr Asn Leu Val Lys
530 535 540
Asn Lys Cys Val Asn Phe Asn Phe Asn Gly Leu Thr Gly Thr Gly Val
545 550 555 560
Leu Thr Glu Ser Asn Lys Lys Phe Leu Pro Phe Gln Gln Phe Gly Arg
565 570 575
Asp Ile Ala Asp Thr Thr Asp Ala Val Arg Asp Pro Gln Thr Leu Glu
580 585 590
Ile Leu Asp Ile Thr Pro Cys Ser Phe Gly Gly Val Ser Val Ile Thr
595 600 605
Pro Gly Thr Asn Thr Ser Asn Gln Val Ala Val Leu Tyr Gln Asp Val
610 615 620
Asn Cys Thr Glu Val Pro Val Ala Ile His Ala Asp Gln Leu Thr Pro
625 630 635 640
Thr Trp Arg Val Tyr Ser Thr Gly Ser Asn Val Phe Gln Thr Arg Ala
645 650 655
Gly Cys Leu Ile Gly Ala Glu His Val Asn Asn Ser Tyr Glu Cys Asp
660 665 670
Ile Pro Ile Gly Ala Gly Ile Cys Ala Ser Tyr Gln Thr Gln Thr Asn
675 680 685
Ser Pro Gly Ser Ala Ser Ser Val Ala Ser Gln Ser Ile Ile Ala Tyr
690 695 700
Thr Met Ser Leu Gly Ala Glu Asn Ser Val Ala Tyr Ser Asn Asn Ser
705 710 715 720
Ile Ala Ile Pro Thr Asn Phe Thr Ile Ser Val Thr Thr Glu Ile Leu
725 730 735
Pro Val Ser Met Thr Lys Thr Ser Val Asp Cys Thr Met Tyr Ile Cys
740 745 750
Gly Asp Ser Thr Glu Cys Ser Asn Leu Leu Leu Gln Tyr Gly Ser Phe
755 760 765
Cys Thr Gln Leu Asn Arg Ala Leu Thr Gly Ile Ala Val Glu Gln Asp
770 775 780
Lys Asn Thr Gln Glu Val Phe Ala Gln Val Lys Gln Ile Tyr Lys Thr
785 790 795 800
Pro Pro Ile Lys Asp Phe Gly Gly Phe Asn Phe Ser Gln Ile Leu Pro
805 810 815
Asp Pro Ser Lys Pro Ser Lys Arg Ser Phe Ile Glu Asp Leu Leu Phe
820 825 830
Asn Lys Val Thr Leu Ala Asp Ala Gly Phe Ile Lys Gln Tyr Gly Asp
835 840 845
Cys Leu Gly Asp Ile Ala Ala Arg Asp Leu Ile Cys Ala Gln Lys Phe
850 855 860
Asn Gly Leu Thr Val Leu Pro Pro Leu Leu Thr Asp Glu Met Ile Ala
865 870 875 880
Gln Tyr Thr Ser Ala Leu Leu Ala Gly Thr Ile Thr Ser Gly Trp Thr
885 890 895
Phe Gly Ala Gly Ala Ala Leu Gln Ile Pro Phe Ala Met Gln Met Ala
900 905 910
Tyr Arg Phe Asn Gly Ile Gly Val Thr Gln Asn Val Leu Tyr Glu Asn
915 920 925
Gln Lys Leu Ile Ala Asn Gln Phe Asn Ser Ala Ile Gly Lys Ile Gln
930 935 940
Asp Ser Leu Ser Ser Thr Ala Ser Ala Leu Gly Lys Leu Gln Asp Val
945 950 955 960
Val Asn Gln Asn Ala Gln Ala Leu Asn Thr Leu Val Lys Gln Leu Ser
965 970 975
Ser Asn Phe Gly Ala Ile Ser Ser Val Leu Asn Asp Ile Leu Ser Arg
980 985 990
Leu Asp Lys Val Glu Ala Glu Val Gln Ile Asp Arg Leu Ile Thr Gly
995 1000 1005
Arg Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln Gln Leu Ile Arg Ala
1010 1015 1020
Ala Glu Ile Arg Ala Ser Ala Asn Leu Ala Ala Thr Lys Met Ser Glu
1025 1030 1035 1040
Cys Val Leu Gly Gln Ser Lys Arg Val Asp Phe Cys Gly Lys Gly Tyr
1045 1050 1055
His Leu Met Ser Phe Pro Gln Ser Ala Pro His Gly Val Val Phe Leu
1060 1065 1070
His Val Thr Tyr Val Pro Ala Gln Glu Lys Asn Phe Thr Thr Ala Pro
1075 1080 1085
Ala Ile Cys His Asp Gly Lys Ala His Phe Pro Arg Glu Gly Val Phe
1090 1095 1100
Val Ser Asn Gly Thr His Trp Phe Val Thr Gln Arg Asn Phe Tyr Glu
1105 1110 1115 1120
Pro Gln Ile Ile Thr Thr Asp Asn Thr Phe Val Ser Gly Asn Cys Asp
1125 1130 1135
Val Val Ile Gly Ile Val Asn Asn Thr Val Tyr Asp Pro Leu Gln Pro
1140 1145 1150
Glu Leu Asp Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn His
1155 1160 1165
Thr Ser Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile Asn Ala Ser
1170 1175 1180
Val Val Asn Ile Gln Lys Glu Ile Asp Arg Leu Asn Glu Val Ala Lys
1185 1190 1195 1200
Asn Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu Leu Gly Lys Tyr Glu
1205 1210 1215
Gln Gly Tyr Ile Pro Glu Ala Pro Arg Asp Gly Gln Ala Tyr Val Arg
1220 1225 1230
Lys Asp Gly Glu Trp Val Leu Leu Ser Thr Phe Leu
1235 1240
<210> 12
<211> 3729
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
atggacgcca tgaagcgcgg cctgtgctgc gtgctgctgc tgtgcggcgc cgtgttcgtg 60
agcgcccagt gcgtgaacct gaccacccgc acccagctgc cccccgccta caccaacagc 120
ttcacccgcg gcgtgtacta ccccgacaag gtgttccgca gcagcgtgct gcacagcacc 180
caggacctgt tcctgccctt cttcagcaac gtgacctggt tccacgccat ccacgtgagc 240
ggcaccaacg gcaccaagcg cttcgacaac cccgtgctgc ccttcaacga cggcgtgtac 300
ttcgccagca ccgagaagag caacatcatc cgcggctgga tcttcggcac caccctggac 360
agcaagaccc agagcctgct gatcgtgaac aacgccacca acgtggtgat caaggtgtgc 420
gagttccagt tctgcaacga ccccttcctg ggcgtgtact accacaagaa caacaagagc 480
tggatggaga gcgagttccg cgtgtacagc agcgccaaca actgcacctt cgagtacgtg 540
agccagccct tcctgatgga cctggagggc aagcagggca acttcaagaa cctgcgcgag 600
ttcgtgttca agaacatcga cggctacttc aagatctaca gcaagcacac ccccatcaac 660
ctggtgcgcg acctgcccca gggcttcagc gccctggagc ccctggtgga cctgcccatc 720
ggcatcaaca tcacccgctt ccagaccctg ctggccctgc accgcagcta cctgaccccc 780
ggcgacagca gcagcggctg gaccgccggc gccgccgcct actacgtggg ctacctgcag 840
ccccgcacct tcctgctgaa gtacaacgag aacggcacca tcaccgacgc cgtggactgc 900
gccctggacc ccctgagcga gaccaagtgc accctgaaga gcttcaccgt ggagaagggc 960
atctaccaga ccagcaactt ccgcgtgcag cccaccgaga gcatcgtgcg cttccccaac 1020
atcaccaacc tgtgcccctt cggcgaggtg ttcaacgcca cccgcttcgc cagcgtgtac 1080
gcctggaacc gcaagcgcat cagcaactgc gtggccgact acagcgtgct gtacaacagc 1140
gccagcttca gcaccttcaa gtgctacggc gtgagcccca ccaagctgaa cgacctgtgc 1200
ttcaccaacg tgtacgccga cagcttcgtg atccgcggcg acgaggtgcg ccagatcgcc 1260
cccggccaga ccggcaagat cgccgactac aactacaagc tgcccgacga cttcaccggc 1320
tgcgtgatcg cctggaacag caacaacctg gacagcaagg tgggcggcaa ctacaactac 1380
ctgtaccgcc tgttccgcaa gagcaacctg aagcccttcg agcgcgacat cagcaccgag 1440
atctaccagg ccggcagcac cccctgcaac ggcgtggagg gcttcaactg ctacttcccc 1500
ctgcagagct acggcttcca gcccaccaac ggcgtgggct accagcccta ccgcgtggtg 1560
gtgctgagct tcgagctgct gcacgccccc gccaccgtgt gcggccccaa gaagagcacc 1620
aacctggtga agaacaagtg cgtgaacttc aacttcaacg gcctgaccgg caccggcgtg 1680
ctgaccgaga gcaacaagaa gttcctgccc ttccagcagt tcggccgcga catcgccgac 1740
accaccgacg ccgtgcgcga cccccagacc ctggagatcc tggacatcac cccctgcagc 1800
ttcggcggcg tgagcgtgat cacccccggc accaacacca gcaaccaggt ggccgtgctg 1860
taccaggacg tgaactgcac cgaggtgccc gtggccatcc acgccgacca gctgaccccc 1920
acctggcgcg tgtacagcac cggcagcaac gtgttccaga cccgcgccgg ctgcctgatc 1980
ggcgccgagc acgtgaacaa cagctacgag tgcgacatcc ccatcggcgc cggcatctgc 2040
gccagctacc agacccagac caacagcccc ggcggcagcg tggccagcca gagcatcatc 2100
gcctacacca tgagcctggg cgccgagaac agcgtggcct acagcaacaa cagcatcgcc 2160
atccccacca acttcaccat cagcgtgacc accgagatcc tgcccgtgag catgaccaag 2220
accagcgtgg actgcaccat gtacatctgc ggcgacagca ccgagtgcag caacctgctg 2280
ctgcagtacg gcagcttctg cacccagctg aaccgcgccc tgaccggcat cgccgtggag 2340
caggacaaga acacccagga ggtgttcgcc caggtgaagc agatctacaa gacccccccc 2400
atcaaggact tcggcggctt caacttcagc cagatcctgc ccgaccccag caagcccagc 2460
aagcgcagct tcatcgagga cctgctgttc aacaaggtga ccctggccga cgccggcttc 2520
atcaagcagt acggcgactg cctgggcgac atcgccgccc gcgacctgat ctgcgcccag 2580
aagttcaacg gcctgaccgt gctgcccccc ctgctgaccg acgagatgat cgcccagtac 2640
accagcgccc tgctggccgg caccatcacc agcggctgga ccttcggcgc cggcgccgcc 2700
ctgcagatcc ccttcgccat gcagatggcc taccgcttca acggcatcgg cgtgacccag 2760
aacgtgctgt acgagaacca gaagctgatc gccaaccagt tcaacagcgc catcggcaag 2820
atccaggaca gcctgagcag caccgccagc gccctgggca agctgcagga cgtggtgaac 2880
cagaacgccc aggccctgaa caccctggtg aagcagctga gcagcaactt cggcgccatc 2940
agcagcgtgc tgaacgacat cctgagccgc ctggacaagg tggaggccga ggtgcagatc 3000
gaccgcctga tcaccggccg cctgcagagc ctgcagacct acgtgaccca gcagctgatc 3060
cgcgccgccg agatccgcgc cagcgccaac ctggccgcca ccaagatgag cgagtgcgtg 3120
ctgggccaga gcaagcgcgt ggacttctgc ggcaagggct accacctgat gagcttcccc 3180
cagagcgccc cccacggcgt ggtgttcctg cacgtgacct acgtgcccgc ccaggagaag 3240
aacttcacca ccgcccccgc catctgccac gacggcaagg cccacttccc ccgcgagggc 3300
gtgttcgtga gcaacggcac ccactggttc gtgacccagc gcaacttcta cgagccccag 3360
atcatcacca ccgacaacac cttcgtgagc ggcaactgcg acgtggtgat cggcatcgtg 3420
aacaacaccg tgtacgaccc cctgcagccc gagctggaca gcttcaagga ggagctggac 3480
aagtacttca agaaccacac cagccccgac gtggacctgg gcgacatcag cggcatcaac 3540
gccagcgtgg tgaacatcca gaaggagatc gaccgcctga acgaggtggc caagaacctg 3600
aacgagagcc tgatcgacct gcaggagctg ggcaagtacg agcagggcta catccccgag 3660
gccccccgcg acggccaggc ctacgtgcgc aaggacggcg agtgggtgct gctgagcacc 3720
ttcctgtga 3729
<210> 13
<211> 1242
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 13
Met Asp Ala Met Lys Arg Gly Leu Cys Cys Val Leu Leu Leu Cys Gly
1 5 10 15
Ala Val Phe Val Ser Ala Gln Cys Val Asn Leu Thr Thr Arg Thr Gln
20 25 30
Leu Pro Pro Ala Tyr Thr Asn Ser Phe Thr Arg Gly Val Tyr Tyr Pro
35 40 45
Asp Lys Val Phe Arg Ser Ser Val Leu His Ser Thr Gln Asp Leu Phe
50 55 60
Leu Pro Phe Phe Ser Asn Val Thr Trp Phe His Ala Ile His Val Ser
65 70 75 80
Gly Thr Asn Gly Thr Lys Arg Phe Asp Asn Pro Val Leu Pro Phe Asn
85 90 95
Asp Gly Val Tyr Phe Ala Ser Thr Glu Lys Ser Asn Ile Ile Arg Gly
100 105 110
Trp Ile Phe Gly Thr Thr Leu Asp Ser Lys Thr Gln Ser Leu Leu Ile
115 120 125
Val Asn Asn Ala Thr Asn Val Val Ile Lys Val Cys Glu Phe Gln Phe
130 135 140
Cys Asn Asp Pro Phe Leu Gly Val Tyr Tyr His Lys Asn Asn Lys Ser
145 150 155 160
Trp Met Glu Ser Glu Phe Arg Val Tyr Ser Ser Ala Asn Asn Cys Thr
165 170 175
Phe Glu Tyr Val Ser Gln Pro Phe Leu Met Asp Leu Glu Gly Lys Gln
180 185 190
Gly Asn Phe Lys Asn Leu Arg Glu Phe Val Phe Lys Asn Ile Asp Gly
195 200 205
Tyr Phe Lys Ile Tyr Ser Lys His Thr Pro Ile Asn Leu Val Arg Asp
210 215 220
Leu Pro Gln Gly Phe Ser Ala Leu Glu Pro Leu Val Asp Leu Pro Ile
225 230 235 240
Gly Ile Asn Ile Thr Arg Phe Gln Thr Leu Leu Ala Leu His Arg Ser
245 250 255
Tyr Leu Thr Pro Gly Asp Ser Ser Ser Gly Trp Thr Ala Gly Ala Ala
260 265 270
Ala Tyr Tyr Val Gly Tyr Leu Gln Pro Arg Thr Phe Leu Leu Lys Tyr
275 280 285
Asn Glu Asn Gly Thr Ile Thr Asp Ala Val Asp Cys Ala Leu Asp Pro
290 295 300
Leu Ser Glu Thr Lys Cys Thr Leu Lys Ser Phe Thr Val Glu Lys Gly
305 310 315 320
Ile Tyr Gln Thr Ser Asn Phe Arg Val Gln Pro Thr Glu Ser Ile Val
325 330 335
Arg Phe Pro Asn Ile Thr Asn Leu Cys Pro Phe Gly Glu Val Phe Asn
340 345 350
Ala Thr Arg Phe Ala Ser Val Tyr Ala Trp Asn Arg Lys Arg Ile Ser
355 360 365
Asn Cys Val Ala Asp Tyr Ser Val Leu Tyr Asn Ser Ala Ser Phe Ser
370 375 380
Thr Phe Lys Cys Tyr Gly Val Ser Pro Thr Lys Leu Asn Asp Leu Cys
385 390 395 400
Phe Thr Asn Val Tyr Ala Asp Ser Phe Val Ile Arg Gly Asp Glu Val
405 410 415
Arg Gln Ile Ala Pro Gly Gln Thr Gly Lys Ile Ala Asp Tyr Asn Tyr
420 425 430
Lys Leu Pro Asp Asp Phe Thr Gly Cys Val Ile Ala Trp Asn Ser Asn
435 440 445
Asn Leu Asp Ser Lys Val Gly Gly Asn Tyr Asn Tyr Leu Tyr Arg Leu
450 455 460
Phe Arg Lys Ser Asn Leu Lys Pro Phe Glu Arg Asp Ile Ser Thr Glu
465 470 475 480
Ile Tyr Gln Ala Gly Ser Thr Pro Cys Asn Gly Val Glu Gly Phe Asn
485 490 495
Cys Tyr Phe Pro Leu Gln Ser Tyr Gly Phe Gln Pro Thr Asn Gly Val
500 505 510
Gly Tyr Gln Pro Tyr Arg Val Val Val Leu Ser Phe Glu Leu Leu His
515 520 525
Ala Pro Ala Thr Val Cys Gly Pro Lys Lys Ser Thr Asn Leu Val Lys
530 535 540
Asn Lys Cys Val Asn Phe Asn Phe Asn Gly Leu Thr Gly Thr Gly Val
545 550 555 560
Leu Thr Glu Ser Asn Lys Lys Phe Leu Pro Phe Gln Gln Phe Gly Arg
565 570 575
Asp Ile Ala Asp Thr Thr Asp Ala Val Arg Asp Pro Gln Thr Leu Glu
580 585 590
Ile Leu Asp Ile Thr Pro Cys Ser Phe Gly Gly Val Ser Val Ile Thr
595 600 605
Pro Gly Thr Asn Thr Ser Asn Gln Val Ala Val Leu Tyr Gln Asp Val
610 615 620
Asn Cys Thr Glu Val Pro Val Ala Ile His Ala Asp Gln Leu Thr Pro
625 630 635 640
Thr Trp Arg Val Tyr Ser Thr Gly Ser Asn Val Phe Gln Thr Arg Ala
645 650 655
Gly Cys Leu Ile Gly Ala Glu His Val Asn Asn Ser Tyr Glu Cys Asp
660 665 670
Ile Pro Ile Gly Ala Gly Ile Cys Ala Ser Tyr Gln Thr Gln Thr Asn
675 680 685
Ser Pro Gly Gly Ser Val Ala Ser Gln Ser Ile Ile Ala Tyr Thr Met
690 695 700
Ser Leu Gly Ala Glu Asn Ser Val Ala Tyr Ser Asn Asn Ser Ile Ala
705 710 715 720
Ile Pro Thr Asn Phe Thr Ile Ser Val Thr Thr Glu Ile Leu Pro Val
725 730 735
Ser Met Thr Lys Thr Ser Val Asp Cys Thr Met Tyr Ile Cys Gly Asp
740 745 750
Ser Thr Glu Cys Ser Asn Leu Leu Leu Gln Tyr Gly Ser Phe Cys Thr
755 760 765
Gln Leu Asn Arg Ala Leu Thr Gly Ile Ala Val Glu Gln Asp Lys Asn
770 775 780
Thr Gln Glu Val Phe Ala Gln Val Lys Gln Ile Tyr Lys Thr Pro Pro
785 790 795 800
Ile Lys Asp Phe Gly Gly Phe Asn Phe Ser Gln Ile Leu Pro Asp Pro
805 810 815
Ser Lys Pro Ser Lys Arg Ser Phe Ile Glu Asp Leu Leu Phe Asn Lys
820 825 830
Val Thr Leu Ala Asp Ala Gly Phe Ile Lys Gln Tyr Gly Asp Cys Leu
835 840 845
Gly Asp Ile Ala Ala Arg Asp Leu Ile Cys Ala Gln Lys Phe Asn Gly
850 855 860
Leu Thr Val Leu Pro Pro Leu Leu Thr Asp Glu Met Ile Ala Gln Tyr
865 870 875 880
Thr Ser Ala Leu Leu Ala Gly Thr Ile Thr Ser Gly Trp Thr Phe Gly
885 890 895
Ala Gly Ala Ala Leu Gln Ile Pro Phe Ala Met Gln Met Ala Tyr Arg
900 905 910
Phe Asn Gly Ile Gly Val Thr Gln Asn Val Leu Tyr Glu Asn Gln Lys
915 920 925
Leu Ile Ala Asn Gln Phe Asn Ser Ala Ile Gly Lys Ile Gln Asp Ser
930 935 940
Leu Ser Ser Thr Ala Ser Ala Leu Gly Lys Leu Gln Asp Val Val Asn
945 950 955 960
Gln Asn Ala Gln Ala Leu Asn Thr Leu Val Lys Gln Leu Ser Ser Asn
965 970 975
Phe Gly Ala Ile Ser Ser Val Leu Asn Asp Ile Leu Ser Arg Leu Asp
980 985 990
Lys Val Glu Ala Glu Val Gln Ile Asp Arg Leu Ile Thr Gly Arg Leu
995 1000 1005
Gln Ser Leu Gln Thr Tyr Val Thr Gln Gln Leu Ile Arg Ala Ala Glu
1010 1015 1020
Ile Arg Ala Ser Ala Asn Leu Ala Ala Thr Lys Met Ser Glu Cys Val
1025 1030 1035 1040
Leu Gly Gln Ser Lys Arg Val Asp Phe Cys Gly Lys Gly Tyr His Leu
1045 1050 1055
Met Ser Phe Pro Gln Ser Ala Pro His Gly Val Val Phe Leu His Val
1060 1065 1070
Thr Tyr Val Pro Ala Gln Glu Lys Asn Phe Thr Thr Ala Pro Ala Ile
1075 1080 1085
Cys His Asp Gly Lys Ala His Phe Pro Arg Glu Gly Val Phe Val Ser
1090 1095 1100
Asn Gly Thr His Trp Phe Val Thr Gln Arg Asn Phe Tyr Glu Pro Gln
1105 1110 1115 1120
Ile Ile Thr Thr Asp Asn Thr Phe Val Ser Gly Asn Cys Asp Val Val
1125 1130 1135
Ile Gly Ile Val Asn Asn Thr Val Tyr Asp Pro Leu Gln Pro Glu Leu
1140 1145 1150
Asp Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn His Thr Ser
1155 1160 1165
Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile Asn Ala Ser Val Val
1170 1175 1180
Asn Ile Gln Lys Glu Ile Asp Arg Leu Asn Glu Val Ala Lys Asn Leu
1185 1190 1195 1200
Asn Glu Ser Leu Ile Asp Leu Gln Glu Leu Gly Lys Tyr Glu Gln Gly
1205 1210 1215
Tyr Ile Pro Glu Ala Pro Arg Asp Gly Gln Ala Tyr Val Arg Lys Asp
1220 1225 1230
Gly Glu Trp Val Leu Leu Ser Thr Phe Leu
1235 1240
<210> 14
<211> 39
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 14
atgtttgttt ttcttgtttt attgccacta gtctctagt 39
<210> 15
<211> 13
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 15
Met Phe Val Phe Leu Val Leu Leu Pro Leu Val Ser Ser
1 5 10
<210> 16
<211> 39
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 16
atgttcgtgt tcctggtgct gctgcccctg gtgagcagc 39
<210> 17
<211> 13
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 17
Met Phe Val Phe Leu Val Leu Leu Pro Leu Val Ser Ser
1 5 10
<210> 18
<211> 66
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 18
atggacgcca tgaagcgcgg cctgtgctgc gtgctgctgc tgtgcggcgc cgtgttcgtg 60
agcgcc 66
<210> 19
<211> 22
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 19
Met Asp Ala Met Lys Arg Gly Leu Cys Cys Val Leu Leu Leu Cys Gly
1 5 10 15
Ala Val Phe Val Ser Ala
20
<210> 20
<211> 57
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 20
atgggctggt cctgcatcat cctgttcctg gtcgccaccg ctaccggcgt gcatagc 57
<210> 21
<211> 19
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 21
Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly
1 5 10 15
Val His Ser
<210> 22
<211> 72
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 22
atgcccatgg ggtctctgca accgctggcc accttgtacc tgctggggat gctggtcgct 60
tcctgcctcg ga 72
<210> 23
<211> 24
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 23
Met Pro Met Gly Ser Leu Gln Pro Leu Ala Thr Leu Tyr Leu Leu Gly
1 5 10 15
Met Leu Val Ala Ser Cys Leu Gly
20
<210> 24
<211> 27
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 24
Gly Tyr Ile Pro Glu Ala Pro Arg Asp Gly Gln Ala Tyr Val Arg Lys
1 5 10 15
Asp Gly Glu Trp Val Leu Leu Ser Thr Phe Leu
20 25
<210> 25
<211> 31
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 25
Met Lys Gln Ile Glu Asp Lys Ile Glu Glu Ile Leu Ser Lys Ile Tyr
1 5 10 15
His Ile Glu Asn Glu Ile Ala Arg Ile Lys Lys Leu Ile Gly Glu
20 25 30
<210> 26
<211> 25
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 26
Pro Trp Tyr Ile Trp Leu Gly Phe Ile Ala Gly Leu Ile Ala Ile Val
1 5 10 15
Met Val Thr Ile Met Leu Cys Cys Met
20 25
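
The short signal-peptide entries above can be checked against each other by translating the DNA entries with the standard genetic code. The following is a minimal sketch, not part of the patent: the helper names `clean` and `translate` are hypothetical, and it assumes the listing lines are copied verbatim, including the spaces and running position numbers, which the helper strips.

```python
from itertools import product

# Standard genetic code in the conventional t, c, a, g codon ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(listing_text):
    """Keep only nucleotide letters; drops spaces and position numbers."""
    return "".join(ch for ch in listing_text.lower() if ch in "acgt")


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# SEQ ID NO: 14, 16 and 18 as printed in the listing above.
seq14 = clean("atgtttgttt ttcttgtttt attgccacta gtctctagt 39")
seq16 = clean("atgttcgtgt tcctggtgct gctgcccctg gtgagcagc 39")
seq18 = clean(
    "atggacgcca tgaagcgcgg cctgtgctgc gtgctgctgc tgtgcggcgc cgtgttcgtg 60 "
    "agcgcc 66"
)

for name, seq in (("SEQ 14", seq14), ("SEQ 16", seq16), ("SEQ 18", seq18)):
    print(name, len(seq), translate(seq))
```

Under these assumptions, SEQ ID NO: 14 and SEQ ID NO: 16 both translate to the 13-residue peptide listed as SEQ ID NO: 15 and 17 (MFVFLVLLPLVSS in one-letter code), and SEQ ID NO: 18 translates to the 22-residue peptide listed as SEQ ID NO: 19.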

Claims (5)

1. A nucleic acid molecule having the nucleotide sequence set forth in SEQ ID NO: 8, wherein the nucleic acid molecule expresses the S protein trimer of SARS-CoV-2 at a high level.
2. A recombinant expression vector, characterized in that the nucleotide sequence of the expression region of the recombinant expression vector is as set forth in SEQ ID NO: 8.
3. An engineered cell comprising the recombinant expression vector of claim 2.
4. A method for preparing a novel coronavirus S protein, the method comprising: obtaining the recombinant expression vector of claim 2;
transfecting the recombinant expression vector into cells, and obtaining a cell line that stably expresses the recombinant S protein through glutamine resistance screening and monoclonal screening of the cell population;
and subjecting the cell line to secretory expression and purification to obtain purified recombinant novel coronavirus S protein.
5. Use of the nucleic acid molecule of claim 1 for the preparation of a novel coronavirus subunit vaccine.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110395117.4A CN113185613B (en) 2021-04-13 2021-04-13 Novel coronavirus S protein and subunit vaccine thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110395117.4A CN113185613B (en) 2021-04-13 2021-04-13 Novel coronavirus S protein and subunit vaccine thereof

Publications (2)

Publication Number Publication Date
CN113185613A CN113185613A (en) 2021-07-30
CN113185613B true CN113185613B (en) 2022-09-13

Family

ID=76975592

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110395117.4A Active CN113185613B (en) 2021-04-13 2021-04-13 Novel coronavirus S protein and subunit vaccine thereof

Country Status (1)

Country Link
CN (1) CN113185613B (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021249116A1 (en) 2020-06-10 2021-12-16 Sichuan Clover Biopharmaceuticals, Inc. Coronavirus vaccine compositions, methods, and uses thereof
EP4204000A1 (en) * 2020-08-26 2023-07-05 The University of Queensland Modified polypeptides with improved properties
CN113336838B (en) * 2021-05-11 2022-05-17 中国农业科学院哈尔滨兽医研究所(中国动物卫生与流行病学中心哈尔滨分中心) Novel coronavirus pneumonia recombinant vaccinia virus vector vaccine
CN113563483B (en) * 2021-08-09 2024-02-02 广州明药科技有限公司 Phage display novel coronavirus capsid protein and application
WO2023020623A1 (en) * 2021-08-20 2023-02-23 百奥泰生物制药股份有限公司 Fusion protein and spike protein nanoparticle for preventing or treating coronavirus infections, and use thereof
CN113603793A (en) * 2021-08-31 2021-11-05 南华大学 Novel coronavirus recombinant S protein, recombinant plasmid, recombinant bacterium and application for preparing exosome drug or exosome vaccine
TW202321276A (en) * 2021-09-02 2023-06-01 高端疫苗生物製劑股份有限公司 Immunogenic compositions and methods for immunization against variants of severe acute respiratory syndrome coronavirus 2 (sars-cov-2)
CN115772544B (en) * 2021-09-06 2024-04-26 合肥星眸生物科技有限公司 AAV vectors against VEGF-A and ANG-2
CN113817753B (en) * 2021-09-07 2024-04-09 上海交通大学 Construction and use of pseudotyped VSV viruses expressing SARS-CoV-2 spike protein or its variant SΔ21
CA3234490A1 (en) * 2021-10-07 2023-04-13 Precision NanoSystems ULC Rna vaccine lipid nanoparticles
CN114316071B (en) * 2021-12-29 2024-03-08 浙江大学 Recombinant mumps virus particles, composition and application thereof
CN114031675B (en) * 2022-01-10 2022-06-07 广州市锐博生物科技有限公司 Vaccines and compositions based on the S protein of SARS-CoV-2
CN115335390A (en) * 2022-01-10 2022-11-11 广州市锐博生物科技有限公司 Vaccines and compositions based on the S protein of SARS-CoV-2
EP4267108A1 (en) * 2022-02-07 2023-11-01 Seqirus Inc. Self-replicating rna and uses thereof
CN114150005B (en) * 2022-02-09 2022-04-22 广州恩宝生物医药科技有限公司 Adenovirus vector vaccine for prevention of SARS-CoV-2 Omicron strain
CN114574502B (en) * 2022-04-11 2023-07-14 四川大学 Novel coronavirus vaccine using replication-defective adeno-associated virus as vector
TW202400251A (en) * 2022-04-27 2024-01-01 大陸商瑞可迪(上海)生物醫藥有限公司 Nucleic acid constructs and applications thereof
CN114807179B (en) * 2022-06-01 2022-10-21 广州达博生物制品有限公司 Construction and application of novel coronavirus pneumonia vaccine
WO2024008014A1 (en) * 2022-07-07 2024-01-11 成都威斯克生物医药有限公司 Pharmaceutical composition for resisting infection with sars-cov-2 or mutant thereof, and combined drug thereof
WO2024032468A1 (en) * 2022-08-08 2024-02-15 神州细胞工程有限公司 Preparation and use of recombinant five-component sars-cov-2 trimer protein vaccine capable of inducing broad-spectrum neutralization activity
CN117582492A (en) * 2022-08-12 2024-02-23 上海市公共卫生临床中心 Recombinant multivalent vaccine
CN116063411A (en) * 2022-09-16 2023-05-05 广东珩达生物医药科技有限公司 Novel coronavirus antigen polypeptide, recombinant adeno-associated virus thereof and application of recombinant adeno-associated virus in preparation of vaccine
CN115894713B (en) * 2022-09-22 2023-08-01 武汉滨会生物科技股份有限公司 Heterotrimeric fusion proteins, compositions and uses thereof

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111217917B (en) * 2020-02-26 2020-10-23 康希诺生物股份公司 Novel coronavirus SARS-CoV-2 vaccine and preparation method thereof
CN111671890B (en) * 2020-05-14 2022-08-05 苏州大学 Novel coronavirus vaccine and application thereof
CN112076315B (en) * 2020-08-25 2023-09-01 中国农业科学院生物技术研究所 Nanometer antigen particle fused by novel coronavirus S protein and ferritin subunit, novel coronavirus vaccine, preparation method and application thereof
CN112375784B (en) * 2021-01-07 2021-04-16 北京百普赛斯生物科技股份有限公司 Method for preparing recombinant novel coronavirus Spike protein

Also Published As

Publication number Publication date
CN113185613A (en) 2021-07-30

Similar Documents

Publication Publication Date Title
CN113185613B (en) Novel coronavirus S protein and subunit vaccine thereof
CN113929786B (en) Novel coronavirus mutant strain S protein and subunit vaccine thereof
CN113321739B (en) COVID-19 subunit vaccine and preparation method and application thereof
CN111560354B (en) Recombinant novel coronavirus, preparation method and application thereof
CN107427571A (en) Novel multivalent vaccine based on nanoparticles
CN113943373B (en) Beta coronavirus polymer antigen, preparation method and application thereof
CN112707968A (en) Recombinant receptor binding protein and recombinant receptor protein for detecting neutralizing antibody of novel coronavirus
CN112553172B (en) COVID-19 pseudovirus and preparation method and application thereof
CN113354717B (en) Novel coronavirus SARS-CoV-2 broad-spectrum polypeptide antigen and its specific neutralizing antibody and application
CN112575008A (en) Nucleic acid molecules encoding structural proteins of novel coronaviruses and novel coronavirus vaccines
CN113527522B (en) New coronavirus trimer recombinant protein, DNA, mRNA, application and mRNA vaccine
CN111808176B (en) Bovine herpes virus antigen compositions and uses thereof
CN114437185A (en) Coronavirus trimer subunit vaccine and application thereof
KR20230025020A (en) human cytomegalovirus gB polypeptide
CN113748203A (en) Recombinant classical swine fever virus
WO2021239086A1 (en) Sars-cov-2 pseudovirus and method for testing ability of sample to neutralize sars-cov-2
CN111683959A (en) Chimeric polyepitopes of Zika virus comprising non-structural proteins and their use in immunogenic compositions
CN112194712B (en) Zika/dengue vaccine and application thereof
CN114478717A (en) Recombinant novel coronavirus protein vaccine, preparation method and application thereof
CN114213547A (en) Fusion protein for displaying new crown S protein, recombinant virus particle and application thereof
KR20230030653A (en) Recombinant Polypeptides Containing One or More Immunogenic Fragments and Antibody Fc Regions and Uses Thereof
CN112940109A (en) T cell receptor for recognizing EBV antigen and application thereof
CN111732667B (en) Peste des petits ruminants virus genetic engineering subunit vaccine
CN113831414B (en) Porcine circovirus type 2b Capsid-Fc fusion protein, preparation method, gene and construction method and application thereof
CN114306589B (en) Recombinant African swine fever virus antigen cocktail vaccine and application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant