CN114958882B - DNA molecule for expressing varicella-zoster virus gE protein - Google Patents

Publication number
CN114958882B
Authority
CN
China
Prior art keywords
seq
protein
dna molecule
thr
val
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210509802.XA
Other languages
Chinese (zh)
Other versions
CN114958882A (en
Inventor
安文琪
张静静
王斌
邢体坤
宋路萍
杨振苹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shengming Biotechnology Zhengzhou Co ltd
Original Assignee
Shengming Biotechnology Zhengzhou Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shengming Biotechnology Zhengzhou Co ltd
Priority to CN202210509802.XA
Publication of CN114958882A
Application granted
Publication of CN114958882B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K39/12 Viral antigens
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/12 Antivirals
    • A61P31/20 Antivirals for DNA viruses
    • A61P31/22 Antivirals for DNA viruses for herpes viruses
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/569 Immunoassay; Biospecific binding assay; Materials therefor for microorganisms, e.g. protozoa, bacteria, viruses
    • G01N33/56983 Viruses
    • G01N33/56994 Herpetoviridae, e.g. cytomegalovirus, Epstein-Barr virus
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/02 Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2710/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
    • C12N2710/00011 Details
    • C12N2710/16011 Herpesviridae
    • C12N2710/16711 Varicellovirus, e.g. human herpesvirus 3, Varicella Zoster, pseudorabies
    • C12N2710/16722 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2710/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
    • C12N2710/00011 Details
    • C12N2710/16011 Herpesviridae
    • C12N2710/16711 Varicellovirus, e.g. human herpesvirus 3, Varicella Zoster, pseudorabies
    • C12N2710/16734 Use of virus or viral component as vaccine, e.g. live-attenuated or inactivated virus, VLP, viral protein
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2333/00 Assays involving biological materials from specific organisms or of a specific nature
    • G01N2333/005 Assays involving biological materials from specific organisms or of a specific nature from viruses
    • G01N2333/01 DNA viruses
    • G01N2333/03 Herpetoviridae, e.g. pseudorabies virus
    • G01N2333/04 Varicella-zoster virus
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Virology (AREA)
  • Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • General Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Urology & Nephrology (AREA)
  • Biochemistry (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Animal Behavior & Ethology (AREA)
  • Hematology (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Biomedical Technology (AREA)
  • Communicable Diseases (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Tropical Medicine & Parasitology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Cell Biology (AREA)
  • Oncology (AREA)
  • Biophysics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Food Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Analytical Chemistry (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Mycology (AREA)
  • Epidemiology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The invention discloses a DNA molecule for expressing varicella-zoster virus gE protein. The DNA molecule contains a gene encoding a double signal peptide and a gene encoding the varicella-zoster virus gE protein; the double signal peptide is formed by connecting signal peptide 1 and signal peptide 2. In embodiments of the invention, a double signal peptide system is used to express the key antigen gE protein of different varicella-zoster virus strains: an antibody heavy chain signal peptide and the gE protein signal peptide jointly direct the expression of the gE protein or related fusion proteins. The secretory expression yield is markedly higher than with the native signal peptide of the gE protein, and N-terminal sequencing of the expression product is consistent with the cleavage site of the native signal peptide and with literature reports, so the method is better suited to large-scale industrial production and reduces production cost.

Description

DNA molecule for expressing varicella-zoster virus gE protein
Technical Field
The invention relates to the technical field of biology, in particular to a DNA molecule for expressing varicella-zoster virus gE protein.
Background
Humans are the only natural host of varicella-zoster virus, which has a single serotype and a genome of 71 genes encoding 60 different proteins. Infection of children with varicella-zoster virus causes varicella, and after recovery the virus remains latent in the infected person; in old age the virus reactivates to cause herpes zoster, which is often accompanied by serious complications. With the global trend of population aging, the prevention of herpes zoster has become a public health concern, and the research and development of products for preventing herpes zoster in middle-aged and elderly people is of considerable social and economic significance.
The expression and preparation of the key antigen is a primary consideration in vaccine development, but the native signal peptide of the key varicella-zoster virus antigen, gE protein (glycoprotein E), is detrimental to protein expression. Developing stable transgenic cell lines that express varicella-zoster virus gE protein at high levels is therefore a problem that must be solved in the development of varicella-zoster virus vaccines.
Disclosure of Invention
The technical problem to be solved by the invention is how to prepare varicella-zoster virus gE protein and/or how to improve the expression efficiency of varicella-zoster virus gE protein.
To solve the above technical problem, the invention first provides a DNA molecule. The DNA molecule contains a gene encoding a double signal peptide and a gene encoding varicella-zoster virus gE protein. The double signal peptide may be a signal peptide formed by connecting signal peptide 1 and signal peptide 2.
The DNA molecule described above may be an expression cassette. Specifically, the DNA molecule is capable of expressing the double signal peptide and the varicella-zoster virus gE protein. In the DNA molecule, the gene encoding the double signal peptide is located upstream of the gene encoding the varicella-zoster virus antigen component gE protein.
In the DNA molecule described above, the double signal peptide and the gE protein may be arranged in any of the following ways:
1) Signal peptide 1-signal peptide 2-gE protein;
2) Signal peptide 1-signal peptide 2-gE protein-fusion tag;
3) Signal peptide 1-signal peptide 2-fusion tag-gE protein.
The fusion tag may be a histidine tag.
In the above DNA molecule, signal peptide 1 may be an antibody heavy chain signal peptide. Signal peptide 2 may be the native signal peptide of the varicella-zoster virus gE protein. The carboxyl terminus of signal peptide 1 is linked to the amino terminus of signal peptide 2. Signal peptide 1 and/or signal peptide 2 may also be a signal peptide of an antibody light chain, gE protein, human oncostatin M (hOSM), osteonectin (BM40), human serum albumin (HSA), secreted Gaussia luciferase (Gaussia Luc), interleukin 2 (IL-2) and/or human insulin.
The varicella-zoster virus gE protein may be the gE protein of the Ellen strain, or the gE protein of another varicella-zoster virus strain such as the Oka strain, a wild-type strain, the Dumas strain and/or the VZV-32 strain.
The DNA molecule described above may contain regulatory sequences. Such regulatory sequences include, but are not limited to, leader sequences, polyadenylation sequences, propeptide sequences, promoters, signal sequences, and transcription terminators. At a minimum, the regulatory sequences include a promoter and termination signals for transcription and translation. The regulatory sequences may be provided with linkers that introduce specific restriction sites, so that the regulatory sequences can be ligated to the coding region of the nucleic acid sequence encoding the protein. The regulatory sequence may be a suitable promoter sequence, i.e. a nucleic acid sequence recognized by the host cell in which the nucleic acid sequence is expressed. The promoter sequence contains transcriptional regulatory sequences that mediate the expression of the protein. The promoter may be any nucleic acid sequence that is transcriptionally active in the host cell of choice, including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular proteins that are homologous or heterologous to the host cell. The regulatory sequence may also be a suitable transcription termination sequence, i.e. a sequence recognized by the host cell to terminate transcription. The termination sequence is operably linked to the 3' terminus of the nucleic acid sequence encoding the protein. Any terminator that is functional in the host cell of choice may be used in the present invention. The regulatory sequence may also be a suitable leader sequence, i.e. an untranslated region of the mRNA that is important for translation by the host cell. The leader sequence is operably linked to the 5' terminus of the nucleic acid sequence encoding the protein. Any leader sequence that is functional in the host cell of choice may be used in the present invention.
The regulatory sequence may also be a signal peptide coding region, which encodes an amino acid sequence attached to the amino terminus of the protein and directs the encoded protein into the cell's secretory pathway. Any signal peptide coding region that directs the expressed protein into the secretory pathway of the host cell used may be used in the present invention. It may also be desirable to add regulatory sequences that regulate expression of the protein in relation to the growth of the host cell. Examples of such regulatory systems are those that turn gene expression on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Other examples of regulatory sequences are those that enable amplification of the gene. In these cases, the nucleic acid sequence encoding the protein should be operably linked to the regulatory sequences.
In the DNA molecule, the amino acid sequence of signal peptide 1 may be positions 1-19 of SEQ ID No.2. The amino acid sequence of signal peptide 2 may be positions 1-30 of SEQ ID No.1.
In the above DNA molecule, the gE protein of varicella-zoster virus may be any one of the following proteins:
A1) a protein whose amino acid sequence is positions 31-546 of SEQ ID No.1 in the sequence listing;
A2) a protein whose amino acid sequence is positions 50-565 of SEQ ID No.2 in the sequence listing;
A3) a protein whose amino acid sequence is positions 50-565 of SEQ ID No.3 in the sequence listing;
A4) a protein whose amino acid sequence is positions 50-563 of SEQ ID No.4 in the sequence listing;
A5) a protein whose amino acid sequence is positions 50-563 of SEQ ID No.5 in the sequence listing;
A6) a protein whose amino acid sequence is positions 50-563 of SEQ ID No.6 in the sequence listing;
A7) a protein whose amino acid sequence is positions 50-563 of SEQ ID No.7 in the sequence listing;
A8) a fusion protein obtained by fusing a protein tag to the carboxyl terminus and/or the amino terminus of the protein of A1), A2), A3), A4), A5), A6) or A7);
A9) a protein derived from any one of the proteins A1) to A8) by substitution and/or deletion and/or addition of one or more amino acid residues in its amino acid sequence and having the same function, or a protein having 80% or more identity with any one of the proteins A1) to A8) and having the same function.
In the above DNA molecule, the gene encoding the gE protein may be any of the following:
B1) the DNA molecule shown in nucleotides 91-1638 of SEQ ID No.8;
B2) the DNA molecule shown in nucleotides 91-1638 of SEQ ID No.9;
B3) the DNA molecule shown in nucleotides 148-1695 of SEQ ID No.10;
B4) the DNA molecule shown in nucleotides 148-1695 of SEQ ID No.11;
B5) the DNA molecule shown in nucleotides 148-1695 of SEQ ID No.12;
B6) the DNA molecule shown in nucleotides 148-1695 of SEQ ID No.13;
B7) the DNA molecule shown in nucleotides 148-1689 of SEQ ID No.14;
B8) the DNA molecule shown in nucleotides 148-1689 of SEQ ID No.15;
B9) the DNA molecule shown in nucleotides 148-1689 of SEQ ID No.16;
B10) the DNA molecule shown in nucleotides 148-1689 of SEQ ID No.17;
B11) the DNA molecule shown in nucleotides 148-1689 of SEQ ID No.18;
B12) the DNA molecule shown in nucleotides 148-1689 of SEQ ID No.19;
B13) the DNA molecule shown in nucleotides 148-1689 of SEQ ID No.20;
B14) the DNA molecule shown in nucleotides 148-1689 of SEQ ID No.21.
The gene encoding the double signal peptide may be any one of the following:
C1) the DNA molecule shown in nucleotides 1-147 of SEQ ID No.10;
C2) the DNA molecule shown in nucleotides 1-147 of SEQ ID No.12.
The DNA molecule may be any of the following:
D1) the DNA molecule shown in SEQ ID No.10;
D2) the DNA molecule shown in SEQ ID No.11;
D3) the DNA molecule shown in SEQ ID No.12;
D4) the DNA molecule shown in SEQ ID No.13;
D5) the DNA molecule shown in SEQ ID No.14;
D6) the DNA molecule shown in SEQ ID No.15;
D7) the DNA molecule shown in SEQ ID No.16;
D8) the DNA molecule shown in SEQ ID No.17;
D9) the DNA molecule shown in SEQ ID No.18;
D10) the DNA molecule shown in SEQ ID No.19;
D11) the DNA molecule shown in SEQ ID No.20;
D12) the DNA molecule shown in SEQ ID No.21.
In order to solve the technical problems, the invention also provides a recombinant vector containing the DNA molecule.
Recombinant cell lines comprising the DNA molecules as described above or the recombinant vectors as described above are also within the scope of the invention.
The cell line described above may be a CHO or HEK293 cell line. The cell line may also be a yeast cell line and/or an insect cell line, etc.
To solve the above technical problem, the invention also provides a protein. The protein may be any of the following:
E1) a protein whose amino acid sequence is SEQ ID No.1 in the sequence listing;
E2) a protein whose amino acid sequence is SEQ ID No.2 in the sequence listing;
E3) a protein whose amino acid sequence is SEQ ID No.3 in the sequence listing;
E4) a protein whose amino acid sequence is SEQ ID No.4 in the sequence listing;
E5) a protein whose amino acid sequence is SEQ ID No.5 in the sequence listing;
E6) a protein whose amino acid sequence is SEQ ID No.6 in the sequence listing;
E7) a protein whose amino acid sequence is SEQ ID No.7 in the sequence listing;
E8) a fusion protein obtained by fusing a protein tag to the carboxyl terminus and/or the amino terminus of the protein of E1), E2), E3), E4), E5), E6) or E7);
E9) a protein derived from any one of the proteins E1) to E8) by substitution and/or deletion and/or addition of one or more amino acid residues in its amino acid sequence and having the same function, or a protein having 80% or more identity with any one of the proteins E1) to E8) and having the same function.
The protein can be synthesized artificially or obtained by synthesizing the coding gene and then biologically expressing.
In the above proteins, a protein tag refers to a polypeptide or protein that is expressed as a fusion with a target protein by means of in vitro DNA recombination, so as to facilitate the expression, detection, tracing and/or purification of the target protein. The protein tag may be a Flag tag, His tag, MBP tag, HA tag, myc tag, GST tag and/or SUMO tag, etc.
In the above proteins, identity refers to the identity of amino acid sequences. The identity of amino acid sequences can be determined using homology search sites on the internet, such as the BLAST pages of the NCBI website. For example, in advanced BLAST 2.1, the identity of a pair of amino acid sequences can be obtained as a value (%) by using blastp as the program, setting the Expect value to 10, setting all filters to OFF, using BLOSUM62 as the Matrix, and setting Gap existence cost, Per residue gap cost and Lambda ratio to 11, 1 and 0.85 (default values), respectively.
In the above protein, the 80% or more identity may be at least 81%, 82%, 85%, 86%, 88%, 90%, 91%, 92%, 95%, 96%, 98%, 99% or 100% identity.
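As a rough illustration of what an identity percentage expresses, the following minimal Python sketch counts identical residues over the columns of two sequences that have already been aligned. It is a simplified stand-in and not the BLAST procedure described above; the function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Both strings must have the same length; '-' marks a gap.
    Columns in which both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the pair reaches the given identity threshold (80% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy, made-up sequences: one substitution in ten aligned positions.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(meets_identity_threshold("ACDEFGHIKL", "ACDEFGHIKV"))
```

The denominator here is the number of aligned columns; BLAST reports identity over the local alignment it finds, so the two values can differ for sequences that align only in part.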
The following uses of any of the DNA molecules described above, and/or the recombinant vector described above, and/or the recombinant cell line described above, and/or the protein described above, are also within the scope of the invention:
F1) use in preparing a product of the varicella-zoster virus antigen component gE protein;
F2) use in preparing or developing a medicament for diseases caused by varicella-zoster virus;
F3) use in preparing a key antigenic component of a product for preventing varicella-zoster virus;
F4) use in preparing a key component of a diagnostic or detection kit for varicella-zoster virus.
The diseases caused by varicella-zoster virus include, but are not limited to, varicella, herpes zoster and postherpetic neuralgia.
The invention uses a double signal peptide system to express the key antigen gE protein of different varicella-zoster virus strains: an antibody heavy chain signal peptide and the gE protein signal peptide jointly direct the expression of the gE protein or related fusion proteins. The secretory expression yield is markedly higher than with the native signal peptide of the gE protein, and N-terminal sequencing of the expression product is consistent with the cleavage site of the native signal peptide and with literature reports, so the method is better suited to large-scale industrial production and reduces production cost.
Drawings
FIG. 1 is a graph showing the prediction of the native signal peptide of the gE protein. SP (Sec/SPI) is the predicted signal peptide probability, CS is the predicted signal peptidase cleavage site, and OTHERS represents the probability of non-signal peptide. The ordinate indicates the probability that the amino acid sequence is the predicted structure, and the abscissa indicates the protein amino acid sequence.
FIG. 2 is a nucleic acid electrophoresis image of the constructed recombinant expression plasmids. The recombinant gE protein expression plasmids were identified by double digestion with HindIII and Pac I; the theoretical digestion bands are the 7112 bp vector and a target gene of about 1700 bp. M is the DNA marker DL10000; lane 1 is pCGS3-gE01, 2 is pCGS3-gE02, 3 is pCGS3-gE03, 4 is pCGS3-gE04, 5 is pCGS3-gE05, 6 is pCGS3-gE06, 7 is pCGS3-gE07, 8 is pCGS3-gE08, 9 is pCGS3-gE09, 10 is pCGS3-gE10, 11 is pCGS3-gE11, 12 is pCGS3-gE12, 13 is pCGS3-gE13, and 14 is pCGS3-gE14.
FIG. 3 is an SDS-PAGE image of varicella-zoster virus gE protein expression. The recombinant expression plasmids were transfected into CHO cells and the culture supernatants were analyzed by SDS-PAGE. M is the protein marker; lane 1 is pCGS3-gE01/CHO, 2 is pCGS3-gE02/CHO, 3 is pCGS3-gE03/CHO, 4 is pCGS3-gE04/CHO, 5 is pCGS3-gE05/CHO, 6 is pCGS3-gE06/CHO, 7 is pCGS3-gE07/CHO, 8 is pCGS3-gE08/CHO, 9 is pCGS3-gE09/CHO, 10 is pCGS3-gE10/CHO, 11 is pCGS3-gE11/CHO, 12 is pCGS3-gE12/CHO, 13 is pCGS3-gE13/CHO, 14 is pCGS3-gE14/CHO, and 15 is the blank control without plasmid transfection. The gE protein bands are marked with boxes.
FIG. 4 is an SDS-PAGE image of the purified varicella-zoster virus gE protein (His tag). M is the protein marker; lanes 1 to 6 are the purified products of pCGS3-gE03/CHO, pCGS3-gE05/CHO, pCGS3-gE11/CHO, pCGS3-gE12/CHO, pCGS3-gE13/CHO and pCGS3-gE14/CHO, respectively.
FIG. 5 shows the N-terminal sequencing peak profiles of the varicella-zoster virus gE protein. Five amino acids of the purified gE protein were determined by N-terminal sequencing; from top to bottom, the profiles correspond to the amino acid released in each cycle. The proteins tested were the Ellen strain gE protein purified from pCGS3-gE03/CHO (left), the Oka strain gE protein purified from pCGS3-gE11/CHO (middle), and the wild-type strain gE protein purified from pCGS3-gE13/CHO (right).
FIG. 6 shows the cell density in Fed-Batch culture of the cell pools stably expressing varicella-zoster virus gE protein, pCGS3-gE04 (left) and pCGS3-gE06 (right). The ordinate is cell density (cells/mL) and the abscissa is time (days).
FIG. 7 shows the cell viability in Fed-Batch culture of the cell pools stably expressing varicella-zoster virus gE protein, pCGS3-gE04 (left) and pCGS3-gE06 (right). The ordinate is cell viability (%) and the abscissa is time (days).
FIG. 8 shows the protein micro-expression levels in Fed-Batch culture of the cell pools stably expressing varicella-zoster virus gE protein, pCGS3-gE04 (left) and pCGS3-gE06 (right). The ordinate is protein micro-expression (mg/L) and the abscissa is time (days).
Detailed Description
The following detailed description of the invention is provided in connection with the accompanying drawings that are presented to illustrate the invention and not to limit the scope thereof. The examples provided below are intended as guidelines for further modifications by one of ordinary skill in the art and are not to be construed as limiting the invention in any way.
The experimental methods in the following examples, unless otherwise specified, are conventional methods, and are carried out according to techniques or conditions described in the literature in the field or according to the product specifications. Materials, reagents and the like used in the examples described below are commercially available unless otherwise specified.
The key reagents and their manufacturer information in the following examples are as follows:
ExpiCHO-S™ cells: Thermo;
ExpiCHO™ Expression Medium: Thermo;
ExpiFectamine™ CHO Transfection Kit (kit includes: 3.2 mL ExpiFectamine™ CHO Reagent, 6 mL ExpiFectamine™ CHO Enhancer, 330 mL ExpiCHO™ Feed): Thermo;
OptiPRO™ SFM: Thermo;
[Figure BDA0003638902740000061] GS-/- cells: Merck;
[Figure BDA0003638902740000071] CD CHO Fusion Medium: SAFC;
[Figure BDA0003638902740000072] Advanced CHO Fed-batch Medium: SAFC;
pCGS3 expression vector: Merck;
Hind III endonuclease: NEB;
Pac I endonuclease: NEB;
DH5α competent cells: Sangon Biotech (Shanghai) Co., Ltd.;
Ni-NTA 6FF His-tag protein purification kit: Sangon Biotech (Shanghai) Co., Ltd.;
Ultracel-10 regenerated cellulose membrane, 15 mL sample volume: Millipore;
Ultracel-10 regenerated cellulose membrane, 0.5 mL sample volume: Millipore;
PBS, pH 7.4: Thermo;
SurePAGE™ Bis-Tris, 10×8, 4-12%, 15 wells: Nanjing GenScript Biotech Co., Ltd.
The main instruments and their manufacturer information in the following examples are as follows:
Electroporation instrument: Bio-Rad;
Gel imaging system: ProteinSimple;
Cell counter: Shanghai Rui Boyu Biotech Co., Ltd.;
Ultra-clean bench: Suzhou Antai Air Technologies Co., Ltd.;
Electric constant-temperature water bath: Fisher Scientific;
CO2 constant-temperature shaker: CRYSTAL;
HYG-A full constant-temperature shake flask cabinet: Taicang Experimental Equipment Factory;
DYY-6C electrophoresis apparatus: Beijing Liuyi Instrument Factory;
DYCP-31DN horizontal electrophoresis tank: Beijing Liuyi Instrument Factory;
eStain™ L1 protein staining instrument: Nanjing GenScript Biotech Co., Ltd.;
Micropipettes: Eppendorf.
Example 1 gE protein Signal peptide prediction
The signal peptide of the key varicella-zoster virus antigen gE protein was predicted with the SignalP-5.0 online software by entering the amino acid sequence of the gE protein (amino acids 1-546 of SEQ ID No.1 in the sequence listing) into the website https:// servics. The prediction result is shown in FIG. 1: the probability of a signal peptide structure at the N-terminus of the gE protein is 13.36%, and the signal peptidase cleavage site lies between amino acids 30 and 31 of SEQ ID No.1. The prediction indicates that the signal peptide of the gE protein is cleaved inefficiently, which is unfavorable for secretory expression of the gE protein (FIG. 1).
Example 2 construction of expression plasmid
To overcome the adverse effect of the native signal peptide of the gE protein described in Example 1, a dual signal peptide system was designed to express the gE proteins of different strains.
Specifically, a gene synthesis company was commissioned to synthesize 14 plasmids (pUC57-gE01 to pUC57-gE14) according to the information in Table 1. The 14 synthetic plasmids each contain a different DNA molecule encoding gE protein (SEQ ID No.8 to SEQ ID No.21); the DNA molecules differ in signal peptide, strain, codon optimization, and whether they carry a tag.
TABLE 1 Summary of varicella-zoster virus gE protein expression plasmids
Synthetic plasmid  Constructed plasmid  Amino acid sequence  Nucleic acid sequence  Signal peptide       Tag    Codon optimization  Strain
pUC57-gE01         pCGS3-gE01           SEQ ID No.1          SEQ ID No.8            gE signal peptide    C-His  CHO                 Wild type
pUC57-gE02         pCGS3-gE02           SEQ ID No.1          SEQ ID No.9            gE signal peptide    C-His  HEK293              Wild type
pUC57-gE03         pCGS3-gE03           SEQ ID No.2          SEQ ID No.10           Dual signal peptide  C-His  CHO                 Ellen
pUC57-gE04         pCGS3-gE04           SEQ ID No.3          SEQ ID No.11           Dual signal peptide  None   CHO                 Ellen
pUC57-gE05         pCGS3-gE05           SEQ ID No.2          SEQ ID No.12           Dual signal peptide  C-His  HEK293              Ellen
pUC57-gE06         pCGS3-gE06           SEQ ID No.3          SEQ ID No.13           Dual signal peptide  None   HEK293              Ellen
pUC57-gE07         pCGS3-gE07           SEQ ID No.4          SEQ ID No.14           Dual signal peptide  None   CHO                 Oka
pUC57-gE08         pCGS3-gE08           SEQ ID No.4          SEQ ID No.15           Dual signal peptide  None   HEK293              Oka
pUC57-gE09         pCGS3-gE09           SEQ ID No.5          SEQ ID No.16           Dual signal peptide  None   CHO                 Wild type
pUC57-gE10         pCGS3-gE10           SEQ ID No.5          SEQ ID No.17           Dual signal peptide  None   HEK293              Wild type
pUC57-gE11         pCGS3-gE11           SEQ ID No.6          SEQ ID No.18           Dual signal peptide  C-His  CHO                 Oka
pUC57-gE12         pCGS3-gE12           SEQ ID No.6          SEQ ID No.19           Dual signal peptide  C-His  HEK293              Oka
pUC57-gE13         pCGS3-gE13           SEQ ID No.7          SEQ ID No.20           Dual signal peptide  C-His  CHO                 Wild type
pUC57-gE14         pCGS3-gE14           SEQ ID No.7          SEQ ID No.21           Dual signal peptide  C-His  HEK293              Wild type
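For readers who want to work with the construct matrix programmatically, Table 1 can be held as a small data structure and filtered. The sketch below is only an illustration: the `Construct` type, its field names and the `select` helper are hypothetical, while the values are transcribed from Table 1.

```python
from typing import List, NamedTuple, Optional


class Construct(NamedTuple):
    plasmid: str         # constructed expression plasmid
    protein_seq_id: int  # amino acid sequence, SEQ ID No.
    dna_seq_id: int      # nucleic acid sequence, SEQ ID No.
    signal: str          # "native" (gE signal peptide) or "dual"
    tag: str             # "C-His" or "none"
    codon_host: str      # codon optimization target
    strain: str


# (name, protein SEQ ID, DNA SEQ ID, signal peptide, tag, codon host, strain)
_ROWS = [
    ("gE01", 1, 8, "native", "C-His", "CHO", "wild type"),
    ("gE02", 1, 9, "native", "C-His", "HEK293", "wild type"),
    ("gE03", 2, 10, "dual", "C-His", "CHO", "Ellen"),
    ("gE04", 3, 11, "dual", "none", "CHO", "Ellen"),
    ("gE05", 2, 12, "dual", "C-His", "HEK293", "Ellen"),
    ("gE06", 3, 13, "dual", "none", "HEK293", "Ellen"),
    ("gE07", 4, 14, "dual", "none", "CHO", "Oka"),
    ("gE08", 4, 15, "dual", "none", "HEK293", "Oka"),
    ("gE09", 5, 16, "dual", "none", "CHO", "wild type"),
    ("gE10", 5, 17, "dual", "none", "HEK293", "wild type"),
    ("gE11", 6, 18, "dual", "C-His", "CHO", "Oka"),
    ("gE12", 6, 19, "dual", "C-His", "HEK293", "Oka"),
    ("gE13", 7, 20, "dual", "C-His", "CHO", "wild type"),
    ("gE14", 7, 21, "dual", "C-His", "HEK293", "wild type"),
]

CONSTRUCTS = [Construct(f"pCGS3-{name}", *rest) for name, *rest in _ROWS]


def select(signal: Optional[str] = None, tag: Optional[str] = None) -> List[str]:
    """Names of constructs matching the given signal peptide type and/or tag."""
    return [c.plasmid for c in CONSTRUCTS
            if (signal is None or c.signal == signal)
            and (tag is None or c.tag == tag)]


if __name__ == "__main__":
    # Dual signal peptide constructs that carry a C-terminal His tag.
    print(select(signal="dual", tag="C-His"))
```

Filtering for dual signal peptide constructs with a C-His tag returns pCGS3-gE03, -gE05, -gE11, -gE12, -gE13 and -gE14, the same six constructs listed as purified samples in the description of FIG. 4.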
The 14 synthetic plasmids (pUC57-gE01 to pUC57-gE14) listed in Table 1 were each double-digested with the restriction enzymes HindIII (NEB) and Pac I (NEB), and the digested fragments (about 1700 bp) were cloned into the pCGS3 expression vector to construct 14 recombinant expression plasmids (pCGS3-gE01 to pCGS3-gE14). The restriction digestion results for the 14 recombinant expression plasmids are shown in FIG. 2: after double digestion with HindIII and Pac I, the vector fragment is 7112 bp and the target fragment (the DNA molecule encoding the gE protein) is about 1700 bp, which is consistent with expectations.
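The expected band pattern of a double digest follows from the cut positions on the circular plasmid. The following Python sketch shows that arithmetic under stated assumptions: the function name is hypothetical, the insert is taken to be 1731 bp (the length implied by the coordinates given for SEQ ID No.10), and the few bases contributed by the restriction sites and their overhangs are ignored. The real positions of the HindIII and Pac I sites in pCGS3 are not given in the text, so the cut coordinates used here are placeholders.

```python
from typing import Iterable, List


def circular_fragments(plasmid_length: int, cut_positions: Iterable[int]) -> List[int]:
    """Fragment sizes (largest first) after cutting a circular plasmid.

    cut_positions are coordinates on the circle; duplicates are merged.
    With no cuts the plasmid stays circular and is reported as one piece.
    """
    cuts = sorted({p % plasmid_length for p in cut_positions})
    if not cuts:
        return [plasmid_length]
    fragments = []
    for i, start in enumerate(cuts):
        end = cuts[(i + 1) % len(cuts)]
        # Distance to the next cut going around the circle; a single cut
        # yields one linear fragment of full length.
        fragments.append((end - start) % plasmid_length or plasmid_length)
    return sorted(fragments, reverse=True)


if __name__ == "__main__":
    vector_backbone = 7112   # bp, vector fragment stated in the text
    insert = 1731            # bp, assumed insert length (illustrative)
    total = vector_backbone + insert
    # Placeholder coordinates: one site at each end of the insert.
    print(circular_fragments(total, [0, insert]))
```

With these placeholder values the two fragments are 7112 bp and 1731 bp, which corresponds to the vector band and the band of about 1700 bp described for FIG. 2.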
Structure of recombinant expression plasmids pCGS3-gE01 and pCGS3-gE02: the fragment between the HindIII and Pac I recognition sites of the pCGS3 vector was replaced with the DNA fragment shown in SEQ ID No.8 (CHO codon optimized) or SEQ ID No.9 (HEK293 codon optimized) of the sequence listing, respectively, and the other sequences of the pCGS3 vector were left unchanged. In SEQ ID No.8 or SEQ ID No.9, nucleotides 1-90 encode the native signal peptide of the gE protein, nucleotides 91-1638 encode the gE protein, nucleotides 1639-1653 encode the GGGGS linker peptide, nucleotides 1654-1671 encode the histidine tag, and nucleotides 1672-1674 are the stop codon. The amino acid sequence of the native gE signal peptide is positions 1-30 of SEQ ID No.1. The DNA molecule encoding the gE protein shown in SEQ ID No.8 or SEQ ID No.9 expresses the recombinant protein gE-His; the nucleotide sequence of gE-His is positions 91-1671 of SEQ ID No.8 or of SEQ ID No.9, and the amino acid sequence of gE-His is positions 31-557 of SEQ ID No.1. The amino acid sequence of the gE protein is positions 31-546 of SEQ ID No.1.
Structure of recombinant expression plasmids pCGS3-gE03 and pCGS3-gE05: the fragment between the HindIII and Pac I recognition sites of the pCGS3 vector was replaced with the DNA fragment shown in SEQ ID No.10 (CHO codon optimized) or SEQ ID No.12 (HEK293 codon optimized), respectively, and the other sequences of the pCGS3 vector were left unchanged. In the target sequence (SEQ ID No.10 or SEQ ID No.12), nucleotides 1-57 encode the antibody heavy chain signal peptide, nucleotides 58-147 encode the native gE signal peptide, nucleotides 148-1695 encode the gE protein (CHO or HEK293 codon optimized), nucleotides 1696-1710 encode the GGGGS linker peptide, nucleotides 1711-1728 encode the histidine tag, and nucleotides 1729-1731 are the stop codon. Amino acids 1-49 of SEQ ID No.2 are the dual signal peptide, in which the antibody heavy chain signal peptide is positions 1-19 of SEQ ID No.2 and the native gE signal peptide is positions 20-49 of SEQ ID No.2. The DNA molecule encoding the gE protein shown in SEQ ID No.10 or SEQ ID No.12 expresses the recombinant protein gE-His; the nucleotide sequence of gE-His is positions 148-1728 of SEQ ID No.10 or of SEQ ID No.12, and the amino acid sequence of gE-His is positions 50-576 of SEQ ID No.2. The amino acid sequence of the gE protein is positions 50-565 of SEQ ID No.2.
Structure of recombinant expression plasmids pCGS3-gE04 and pCGS3-gE06: the small fragment between the HindIII and Pac I recognition sites of the pCGS3 vector was replaced with the DNA fragment shown in SEQ ID No.11 (CHO codon optimized) or SEQ ID No.13 (HEK293 codon optimized), respectively, and the other sequences of the pCGS3 vector were left unchanged. In the target sequence (SEQ ID No.11 or SEQ ID No.13), nucleotides 1-57 encode the antibody heavy chain signal peptide, nucleotides 58-147 encode the native gE signal peptide, nucleotides 148-1695 encode the gE protein, and nucleotides 1696-1698 are the stop codon. Amino acids 1-49 of SEQ ID No.3 are the dual signal peptide (the antibody heavy chain signal peptide is positions 1-19 of SEQ ID No.3, and the native gE signal peptide is positions 20-49 of SEQ ID No.3). The gE expression cassette shown in SEQ ID No.11 or SEQ ID No.13 expresses the gE protein; the nucleotide sequence of the gE protein is positions 148-1695 of SEQ ID No.11 or of SEQ ID No.13, and the amino acid sequence of the gE protein is positions 50-565 of SEQ ID No.3.
Structure of recombinant expression plasmids pCGS3-gE07, pCGS3-gE08, pCGS3-gE09 and pCGS3-gE10: the fragment between the HindIII and Pac I recognition sites of the pCGS3 vector was replaced with the DNA fragment shown in SEQ ID No.14 (CHO codon optimized), SEQ ID No.15 (HEK293 codon optimized), SEQ ID No.16 (CHO codon optimized) or SEQ ID No.17 (HEK293 codon optimized), respectively, and the other sequences of the pCGS3 vector were left unchanged. In the target sequence (SEQ ID No.14, SEQ ID No.15, SEQ ID No.16 or SEQ ID No.17), nucleotides 1-57 encode the antibody heavy chain signal peptide, nucleotides 58-147 encode the native gE signal peptide, nucleotides 148-1689 encode the gE protein, and nucleotides 1690-1692 are the stop codon.
The DNA molecule encoding the gE protein shown in SEQ ID No.14 or SEQ ID No.15 expresses the gE protein; the nucleotide sequence of the gE protein is positions 148-1689 of SEQ ID No.14 (CHO codon optimized) or positions 148-1689 of SEQ ID No.15 (HEK293 codon optimized), and the amino acid sequence of the gE protein is positions 50-563 of SEQ ID No.4. The DNA molecule encoding the gE protein shown in SEQ ID No.16 or SEQ ID No.17 expresses the gE protein; the nucleotide sequence of the gE protein is positions 148-1689 of SEQ ID No.16 (CHO codon optimized) or positions 148-1689 of SEQ ID No.17 (HEK293 codon optimized), and the amino acid sequence of the gE protein is positions 50-563 of SEQ ID No.5. Amino acids 1-49 of SEQ ID No.5 are the dual signal peptide (the antibody heavy chain signal peptide is positions 1-19 of SEQ ID No.5, and the native gE signal peptide is positions 20-49 of SEQ ID No.5).
Structure of recombinant expression plasmids pCGS3-gE11, pCGS3-gE12, pCGS3-gE13 and pCGS3-gE14: the small fragment between the HindIII and Pac I recognition sites of the pCGS3 vector was replaced with the DNA fragment shown in SEQ ID No.18, SEQ ID No.19, SEQ ID No.20 or SEQ ID No.21, respectively, and the other sequences of the pCGS3 vector were left unchanged. In the target sequence (SEQ ID No.18, SEQ ID No.19, SEQ ID No.20 or SEQ ID No.21), nucleotides 1-57 encode the antibody heavy chain signal peptide, nucleotides 58-147 encode the native gE signal peptide, nucleotides 148-1689 encode the gE protein, nucleotides 1690-1698 encode the GGS linker peptide, nucleotides 1699-1716 encode the histidine tag, and nucleotides 1717-1719 are the stop codon.
The DNA molecule encoding the gE protein shown in SEQ ID No.18 or SEQ ID No.19 expresses the recombinant protein gE-His; the nucleotide sequence of gE-His is positions 148-1716 of SEQ ID No.18 (CHO codon optimized) or positions 148-1716 of SEQ ID No.19 (HEK293 codon optimized), followed by the stop codon at positions 1717-1719. The amino acid sequence of gE-His is positions 50-572 of SEQ ID No.6, and the amino acid sequence of the gE protein is positions 50-563 of SEQ ID No.6. Amino acids 1-49 of SEQ ID No.6 are the dual signal peptide (the antibody heavy chain signal peptide is positions 1-19 of SEQ ID No.6, and the native gE signal peptide is positions 20-49 of SEQ ID No.6).
The DNA molecule encoding the gE protein shown in SEQ ID No.20 or SEQ ID No.21 expresses the recombinant protein gE-His; the nucleotide sequence of gE-His is positions 148-1716 of SEQ ID No.20 (CHO codon optimized) or positions 148-1716 of SEQ ID No.21 (HEK293 codon optimized), of which positions 148-1689 encode the gE protein. The amino acid sequence of gE-His is positions 50-572 of SEQ ID No.7, and the amino acid sequence of the gE protein is positions 50-563 of SEQ ID No.7. Amino acids 1-49 of SEQ ID No.7 are the dual signal peptide (the antibody heavy chain signal peptide is positions 1-19 of SEQ ID No.7, and the native gE signal peptide is positions 20-49 of SEQ ID No.7).
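The nucleotide coordinates of each construct follow directly from the lengths of its parts in codons (for example, a 19-residue heavy chain signal peptide occupies 57 nucleotides). The following is a minimal sketch of that arithmetic, assuming each element is laid out back to back in the reading frame; the function name `layout` and the segment lists are hypothetical and introduced only for illustration. Residue counts are taken from the amino acid ranges stated above (e.g., positions 31-546 give 516 residues).

```python
def layout(segments):
    """Map (name, codon_count) pairs to 1-based inclusive nucleotide ranges."""
    ranges = {}
    start = 1
    for name, codons in segments:
        end = start + 3 * codons - 1  # each codon spans three nucleotides
        ranges[name] = (start, end)
        start = end + 1               # next element begins right after this one
    return ranges


# Segment lengths in codons, taken from the amino acid ranges in the text.
NATIVE_GGGGS_HIS = [  # SEQ ID No.8 / No.9
    ("native gE signal peptide", 30),
    ("gE", 516),
    ("GGGGS linker", 5),
    ("His tag", 6),
    ("stop", 1),
]
DUAL_GGGGS_HIS = [("heavy chain signal peptide", 19)] + NATIVE_GGGGS_HIS  # No.10 / No.12
DUAL_UNTAGGED_516 = [  # SEQ ID No.11 / No.13
    ("heavy chain signal peptide", 19),
    ("native gE signal peptide", 30),
    ("gE", 516),
    ("stop", 1),
]
DUAL_GGS_HIS_514 = [  # SEQ ID No.18 to No.21
    ("heavy chain signal peptide", 19),
    ("native gE signal peptide", 30),
    ("gE", 514),
    ("GGS linker", 3),
    ("His tag", 6),
    ("stop", 1),
]

if __name__ == "__main__":
    for label, segments in [
        ("native + GGGGS-His", NATIVE_GGGGS_HIS),
        ("dual + GGGGS-His", DUAL_GGGGS_HIS),
        ("dual, untagged", DUAL_UNTAGGED_516),
        ("dual + GGS-His", DUAL_GGS_HIS_514),
    ]:
        print(label)
        for name, (start, end) in layout(segments).items():
            print(f"  {name}: {start}-{end}")
```

A check of this kind makes it easy to spot a coordinate that does not continue from the previous element or whose length is not a multiple of three.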
Example 3 transient expression
24 h before transfection, ExpiCHO-S™ cells were passaged at a density of 3.0×10^6 cells/mL into CHO expression medium (ExpiCHO™ Expression Medium); the next day, after cell counting, the cell density was adjusted to 6.0×10^6 cells/mL with fresh expression medium for use.
The volume of plasmid stock required to transfect 100 mL of cells was calculated at 0.8 μg of plasmid per mL of cell culture. Each of the 14 recombinant expression plasmids pCGS3-gE01 to pCGS3-gE14 constructed in Example 2 was mixed with the transfection reagent (ExpiFectamine™ CHO Reagent) in OptiPRO™ SFM and incubated at room temperature for 5 minutes. The plasmid/transfection reagent complex was slowly added dropwise to the ExpiCHO-S™ cell culture (6.0×10^6 cells/mL, in ExpiCHO™ Expression Medium), shaking the culture during addition so that the DNA/transfection reagent complexes were evenly dispersed, and the cells were cultured in an incubator at 37 °C and 8% CO2. Following the maximum-titer protocol, 18-22 h after transfection the feed (ExpiCHO™ Feed from the ExpiFectamine™ CHO Transfection Kit) and the enhancer (ExpiFectamine™ CHO Enhancer from the ExpiFectamine™ CHO Transfection Kit) were added, the culture temperature was lowered from 37 °C to 32 °C and the CO2 concentration from 8% to 5%, and culture was continued; feed was added again on day 5 post-transfection. When cell viability fell to 65%-75%, culture was stopped and the cultures were harvested. The resulting recombinant cell line cultures were named pCGS3-gE01/CHO to pCGS3-gE14/CHO in sequence, and the supernatants were analyzed by SDS-PAGE. Regardless of tag, codon-optimization host, or strain type, the recombinant cell lines containing the dual signal peptide expression system, pCGS3-gE03/CHO to pCGS3-gE14/CHO (lanes 3-14 in FIG. 3), expressed markedly more gE than the recombinant cell lines with only the native gE signal peptide, pCGS3-gE01/CHO and pCGS3-gE02/CHO (lanes 1 and 2 in FIG. 3).
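The plasmid dose in this example is a simple proportion: 0.8 μg of plasmid per mL of culture. The sketch below works it through for a 100 mL transfection. The stock concentration of 1.0 μg/μL is an assumed value chosen only for illustration (the text does not state one), and the function name is hypothetical.

```python
def plasmid_for_transfection(culture_ml, stock_ug_per_ul, dose_ug_per_ml=0.8):
    """Return (plasmid mass in ug, stock volume in uL) for one transfection.

    dose_ug_per_ml defaults to the 0.8 ug per mL of culture used in this example;
    stock_ug_per_ul is the measured concentration of the plasmid preparation.
    """
    if culture_ml <= 0 or stock_ug_per_ul <= 0:
        raise ValueError("culture volume and stock concentration must be positive")
    mass_ug = dose_ug_per_ml * culture_ml
    volume_ul = mass_ug / stock_ug_per_ul
    return mass_ug, volume_ul


if __name__ == "__main__":
    # Assumed stock concentration of 1.0 ug/uL (hypothetical, for illustration).
    mass_ug, volume_ul = plasmid_for_transfection(100, stock_ug_per_ul=1.0)
    print(f"plasmid mass: {mass_ug:.1f} ug, stock volume: {volume_ul:.1f} uL")
```

For 100 mL of culture the required plasmid mass is 80 μg whatever the stock concentration; only the pipetted volume changes with the stock.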
Example 4 protein purification
The cultures of the His-tagged recombinant cell lines obtained in Example 3 (pCGS3-gE03/CHO, pCGS3-gE05/CHO, pCGS3-gE11/CHO, pCGS3-gE12/CHO, pCGS3-gE13/CHO and pCGS3-gE14/CHO) were centrifuged at 6000 g for 30 min, and the supernatants were collected and concentrated by ultrafiltration. The concentrated supernatant was mixed with binding/washing buffer (a component of the Ni-NTA 6FF His-tag protein purification kit) at a volume ratio of 1:1 and left to stand for 20 min for full incubation, giving a mixture of ultrafiltration-concentrated supernatant and binding/washing buffer. The pre-packed column was equilibrated with 1 column volume of binding/washing buffer, and the mixture was then loaded and allowed to pass through the column by gravity flow. The column was washed with 10-15 column volumes of binding/washing buffer, and the flow-through was collected until its absorbance at 280 nm approached baseline. The target protein on the column was eluted with 3-4 column volumes of elution buffer (a component of the Ni-NTA 6FF His-tag protein purification kit), concentrated by ultrafiltration, exchanged into PBS (pH 7.4), and analyzed by SDS-PAGE. The proteins purified from the His-tagged recombinant cell lines (pCGS3-gE03/CHO, pCGS3-gE05/CHO, pCGS3-gE11/CHO, pCGS3-gE12/CHO, pCGS3-gE13/CHO and pCGS3-gE14/CHO) had a purity above 90% by grayscale analysis of the protein bands (lanes 1-6 in FIG. 4).
Example 5 N-terminal sequencing
Expressing the target protein with the dual signal peptide system could leave the signal peptidase cleavage site uncertain, i.e., cleavage could occur after either the first or the second signal peptide. gE proteins purified from the culture supernatants of recombinant cell lines of the different strain types, namely the Ellen strain (pCGS3-gE03/CHO), the Oka strain (pCGS3-gE11/CHO) and the wild-type strain (pCGS3-gE13/CHO), were subjected to N-terminal sequencing by Edman degradation. After the protein was transferred to a PVDF membrane, 600 μL of 0.1% TFA solution was added to the tube containing the membrane, the tube was shaken in a thermomixer at 600 rpm for 1 minute, and the supernatant was removed; this was repeated 3 times. The membrane was taken out, air-dried, and cut into pieces of 0.5 cm^2 for loading. The cut PVDF membrane sample was placed in the reactor, and the sample name, sample number, number of test cycles and method file were set in the PPSQ-30 Analysis software; the run was started once setup was complete. The raw data and chromatograms generated by the PPSQ-33A were identified with the PPSQ-30 data processing software, and the peak profiles were exported. The first 5 N-terminal amino acids of the purified gE protein expressed by the recombinant cell lines of the Ellen strain (pCGS3-gE03/CHO), the Oka strain (pCGS3-gE11/CHO) and the wild-type strain (pCGS3-gE13/CHO) are all SVLRY, consistent with amino acids 31-35 of the gE protein reported in the literature; that is, the expression product of the dual signal peptide system is cleaved at the same position as the native gE signal peptide (FIG. 5).
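The expected N-terminus for each possible cleavage site can be read off the sequence listing. The sketch below is an illustrative check only: it transcribes residues 1-35 of SEQ ID No.1 and the 19-residue antibody heavy chain signal peptide (residues 1-19 of SEQ ID No.2), then prints the first five residues that would follow cleavage after the first signal peptide versus after the second. The helper names are hypothetical.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Residues 1-35 of SEQ ID No.1 (native gE signal peptide plus the first 5 gE residues).
NATIVE_START = (
    "Met Gly Thr Val Asn Lys Pro Val Val Gly Val Leu Met Gly Phe Gly "
    "Ile Ile Thr Gly Thr Leu Arg Ile Thr Asn Pro Val Arg Ala Ser Val "
    "Leu Arg Tyr"
)
# Residues 1-19 of SEQ ID No.2 (antibody heavy chain signal peptide).
HEAVY_CHAIN_SP = (
    "Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly "
    "Val His Ser"
)


def to_one_letter(three_letter: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


def n_terminus_after(precursor: str, signal_len: int, n: int = 5) -> str:
    """First n residues left after removing signal_len residues from the N-terminus."""
    return precursor[signal_len:signal_len + n]


native = to_one_letter(NATIVE_START)
dual = to_one_letter(HEAVY_CHAIN_SP) + native  # dual signal peptide precursor

if __name__ == "__main__":
    print("native precursor, cleaved after residue 30:", n_terminus_after(native, 30))
    print("dual precursor, cleaved after residue 19:  ", n_terminus_after(dual, 19))
    print("dual precursor, cleaved after residue 49:  ", n_terminus_after(dual, 49))
```

Cleavage after the first signal peptide alone would leave an N-terminus beginning MGTVN, whereas cleavage after the second gives SVLRY, so five Edman cycles are enough to distinguish the two outcomes.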
Example 6 development of stably transfected cell lines
One vial of CHOZN GS-/- host cells was thawed. When the cells had grown to (2±1)×10^6 cells/mL, they were passaged at a seeding density of 0.3×10^6 cells/mL into CD CHO Fusion (with 6 mM glutamine) medium. Transfection is most efficient 6-9 days after thawing. 24 hours before transfection, the cells were seeded at 0.5×10^6 cells/mL into CD CHO Fusion (with 6 mM glutamine) medium and used for transfection after 24 hours of culture. At the same time, Condition Medium was prepared: the host cell culture broth was centrifuged, and the supernatant was collected, filtered through a 0.2 μm membrane, and designated Condition Medium for later use.
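Passaging to a target seeding density is a dilution calculation (C1·V1 = C2·V2). As a minimal sketch, the function below computes how much existing culture and how much fresh medium to combine; the 30 mL final volume in the usage example is an assumed working volume, not a value from the text, and the function name is hypothetical.

```python
def passage_volumes(current_density, target_density, final_volume_ml):
    """Return (culture volume, fresh medium volume) in mL for a passage.

    Densities are in cells/mL. Uses C1 * V1 = C2 * V2, i.e. the cells carried
    over in V1 must equal the cells wanted in the final volume.
    """
    if current_density <= 0 or target_density <= 0 or final_volume_ml <= 0:
        raise ValueError("densities and volume must be positive")
    if target_density > current_density:
        raise ValueError("culture is too dilute to reach the target by dilution")
    culture_ml = target_density * final_volume_ml / current_density
    return culture_ml, final_volume_ml - culture_ml


if __name__ == "__main__":
    # Culture at 2.0e6 cells/mL, seeded at 0.3e6 cells/mL in an assumed 30 mL.
    culture_ml, medium_ml = passage_volumes(2.0e6, 0.3e6, 30.0)
    print(f"culture: {culture_ml:.1f} mL, fresh medium: {medium_ml:.1f} mL")
```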
1. Recombinant gE protein expression plasmid electrotransformation
The recombinant gE protein expression plasmids pCGS3-gE04 and pCGS3-gE06 obtained in Example 2 (25 μg each) were transfected into 5×10^6 CHOZN host cells using a Bio-Rad electroporator, with transfection parameters of 300 V, 950 μF, exponential decay. Each plasmid was transfected in both circular form and PvuI-linearized form, for a total of 2 batches. After electroporation, minipools were prepared for each batch by seeding 96-well plates at 5000 cells/200 μL/well (20% CD CHO Fusion + 80% CHO Cloning Medium): 73 96-well plates were seeded for the pCGS3-gE04 plasmid, giving 7008 minipools in total, and 83 96-well plates were seeded for the pCGS3-gE06 plasmid, giving 7968 minipools in total. Also after electroporation, monoclones were prepared by limiting dilution, seeding 96-well plates at 0.5 cells/200 μL/well (20% Condition Medium + 80% CHO Cloning Medium): 20 96-well plates were seeded for each of the pCGS3-gE04 and pCGS3-gE06 plasmids, i.e., 1920 monoclonal wells containing transfected cells per plasmid.
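The well counts above are plate counts multiplied by 96, and the choice of 0.5 cells/well for limiting dilution can be motivated with a Poisson model. The sketch below assumes that the number of cells landing in a well is Poisson-distributed with the seeding density as its mean; this is a standard textbook approximation and an assumption of this illustration, not something stated in the text. Function names are hypothetical.

```python
import math


def wells(plates: int, wells_per_plate: int = 96) -> int:
    """Total number of wells seeded."""
    return plates * wells_per_plate


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cells in a well under a Poisson model."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def limiting_dilution(mean_cells_per_well: float) -> dict:
    """Well-occupancy probabilities for a given mean seeding density."""
    p_empty = poisson_pmf(0, mean_cells_per_well)
    p_single = poisson_pmf(1, mean_cells_per_well)
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
        # Among wells that received at least one cell, the fraction with exactly one.
        "single_given_occupied": p_single / (1.0 - p_empty),
    }


if __name__ == "__main__":
    print("minipool wells, pCGS3-gE04:", wells(73))
    print("minipool wells, pCGS3-gE06:", wells(83))
    print("monoclonal wells per plasmid:", wells(20))
    for name, p in limiting_dilution(0.5).items():
        print(f"{name}: {p:.3f}")
```

Under this model, at 0.5 cells/well roughly 30% of wells receive exactly one cell, and about 77% of the occupied wells are single-cell wells; lower seeding densities raise that fraction at the cost of more empty wells.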
2. Monoclonal screening after electrotransformation
The monoclonal wells containing transfected cells were placed in a humidified CO2 incubator at 37 °C and cultured undisturbed for 5 days. After 5 days the plates were taken out and checked for overgrowth. Evaporated medium was replenished weekly with GS selection medium (L-glutamine-free CD CHO Fusion). After about 21 days of static culture, the minipools showed obvious cell-cluster growth, but no clonal growth was found in the monoclonal wells; the route of direct monoclonal screening after electroporation therefore failed.
3. Minipool screening after electrotransformation
The 96-well plates from step 2 in which minipools had formed after about 21 days of static culture (about 80% confluence) were passaged into new 96-well plates at a dilution ratio of 1:4 and cultured statically for about 7 days, and 30 μL of supernatant was taken to measure the recombinant gE protein expression level at small scale on a Biacore 8K (surface plasmon resonance, SPR) system. Minipools with high expression levels (roughly the top 30%-40%) were selected step by step and expanded in sequence to 24-well plates, 6-well plates, T25 and TPP, and finally to shake flasks. After expansion to shake flasks, the cells were counted and passaged every two days, and once the cells had recovered (cell viability > 95%) the minipools were cryopreserved at 1×10^7 cells/mL per vial. From the two transfection batches, the 5-6 minipools with the highest recombinant gE protein expression were selected for a 15-day Fed Batch evaluation. For each minipool (i.e., each stable recombinant gE protein expression cell pool), the viable cell density (growth curve) and cell viability were monitored, and the recombinant gE protein expression level was measured.
The viable cell density (growth curve) of each minipool is shown in FIG. 6. After 12-15 days of culture, the maximum viable cell density of the stable expression cell pools was 13.3×10^6 cells/mL for the recombinant gE protein expression plasmid pCGS3-gE04 (left panel of FIG. 6) and 11.4×10^6 cells/mL for the recombinant gE protein expression plasmid pCGS3-gE06 (right panel of FIG. 6).
The cell viability of each minipool is shown in FIG. 7. After 12-15 days of culture, the maximum cell viability at harvest of the stable expression cell pools was 81.68% for the recombinant gE protein expression plasmid pCGS3-gE04 (left panel of FIG. 7) and 87.97% for the recombinant gE protein expression plasmid pCGS3-gE06 (right panel of FIG. 7).
The measured recombinant gE protein expression of each minipool is shown in FIG. 8. After 12-15 days of culture, the recombinant gE protein expression of the stable expression cell pools was 91-150 mg/L for the recombinant gE protein expression plasmid pCGS3-gE04 (left panel of FIG. 8) and 90-150 mg/L for the recombinant gE protein expression plasmid pCGS3-gE06 (right panel of FIG. 8). The 11 cell pools screened from the two recombinant plasmids (01 Batch-gE04-11, 01 Batch-gE04A-15, 01 Batch-gE04A-17, 01 Batch-gE04A-20, 02 Batch-gE04A-11, 02 Batch-gE04A-12, and 01 Batch-gE06-1, 01 Batch-gE06-7, 01 Batch-gE06-12, 01 Batch-gE06A-8, 01 Batch-gE06A-20) reached 90-150 mg/L of recombinant gE protein after 12-15 days of Fed Batch culture, markedly higher than the 30 mg/L reported in the prior art.
The present invention is described in detail above. It will be apparent to those skilled in the art that the present invention can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with respect to specific embodiments, it will be appreciated that the invention may be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. The application of some of the basic features may be done in accordance with the scope of the claims that follow.
Sequence listing
<110> Henan Cheng Ming Biotechnology research laboratory Co., ltd
<120> a DNA molecule expressing varicella zoster Virus gE protein
<130> GNCSQ213208
<160> 21
<170> PatentIn version 3.5
<210> 1
<211> 557
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 1
Met Gly Thr Val Asn Lys Pro Val Val Gly Val Leu Met Gly Phe Gly
1 5 10 15
Ile Ile Thr Gly Thr Leu Arg Ile Thr Asn Pro Val Arg Ala Ser Val
20 25 30
Leu Arg Tyr Asp Asp Phe His Thr Asp Glu Asp Lys Leu Asp Thr Asn
35 40 45
Ser Val Tyr Glu Pro Tyr Tyr His Ser Asp His Ala Glu Ser Ser Trp
50 55 60
Val Asn Arg Gly Glu Ser Ser Arg Lys Ala Tyr Asp His Asn Ser Pro
65 70 75 80
Tyr Ile Trp Pro Arg Asn Asp Tyr Asp Gly Phe Leu Glu Asn Ala His
85 90 95
Glu His His Gly Val Tyr Asn Gln Gly Arg Gly Ile Asp Ser Gly Glu
100 105 110
Arg Leu Met Gln Pro Thr Gln Met Ser Ala Gln Glu Asp Leu Gly Asp
115 120 125
Asp Thr Gly Ile His Val Ile Pro Thr Leu Asn Gly Asp Asp Arg His
130 135 140
Lys Ile Val Asn Val Asp Gln Arg Gln Tyr Gly Asp Val Phe Lys Gly
145 150 155 160
Asp Leu Asn Pro Lys Pro Gln Gly Gln Arg Leu Ile Glu Val Ser Val
165 170 175
Glu Glu Asn His Pro Phe Thr Leu Arg Ala Pro Ile Gln Arg Ile Tyr
180 185 190
Gly Val Arg Tyr Thr Glu Thr Trp Ser Phe Leu Pro Ser Leu Thr Cys
195 200 205
Thr Gly Asp Ala Ala Pro Ala Ile Gln His Ile Cys Leu Lys His Thr
210 215 220
Thr Cys Phe Gln Asp Val Val Val Asp Val Asp Cys Ala Glu Asn Thr
225 230 235 240
Lys Glu Asp Gln Leu Ala Glu Ile Ser Tyr Arg Phe Gln Gly Lys Lys
245 250 255
Glu Ala Asp Gln Pro Trp Ile Val Val Asn Thr Ser Thr Leu Phe Asp
260 265 270
Glu Leu Glu Leu Asp Pro Pro Glu Ile Glu Pro Gly Val Leu Lys Val
275 280 285
Leu Arg Thr Glu Lys Gln Tyr Leu Gly Val Tyr Ile Trp Asn Met Arg
290 295 300
Gly Ser Asp Gly Thr Ser Thr Tyr Ala Thr Phe Leu Val Thr Trp Lys
305 310 315 320
Gly Asp Glu Lys Thr Arg Asn Pro Thr Pro Ala Val Thr Pro Gln Pro
325 330 335
Arg Gly Ala Glu Phe His Met Trp Asn Tyr His Ser His Val Phe Ser
340 345 350
Val Gly Asp Thr Phe Ser Leu Ala Met His Leu Gln Tyr Lys Ile His
355 360 365
Glu Ala Pro Phe Asp Leu Leu Leu Glu Trp Leu Tyr Val Pro Ile Asp
370 375 380
Pro Thr Cys Gln Pro Met Arg Leu Tyr Ser Thr Cys Leu Tyr His Pro
385 390 395 400
Asn Ala Pro Gln Cys Leu Ser His Met Asn Ser Gly Cys Thr Phe Thr
405 410 415
Ser Pro His Leu Ala Gln Arg Val Ala Ser Thr Val Tyr Gln Asn Cys
420 425 430
Glu His Ala Asp Asn Tyr Thr Ala Tyr Cys Leu Gly Ile Ser His Met
435 440 445
Glu Pro Ser Phe Gly Leu Ile Leu His Asp Gly Gly Thr Thr Leu Lys
450 455 460
Phe Val Asp Thr Pro Glu Ser Leu Ser Gly Leu Tyr Val Phe Val Val
465 470 475 480
Tyr Phe Asn Gly His Val Glu Ala Val Ala Tyr Thr Val Val Ser Thr
485 490 495
Val Asp His Phe Val Asn Ala Ile Glu Glu Arg Gly Phe Pro Pro Thr
500 505 510
Ala Gly Gln Pro Pro Ala Thr Thr Lys Pro Lys Glu Ile Thr Pro Val
515 520 525
Asn Pro Gly Thr Ser Pro Leu Leu Arg Tyr Ala Ala Trp Thr Gly Gly
530 535 540
Leu Ala Gly Gly Gly Gly Ser His His His His His His
545 550 555
<210> 2
<211> 576
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 2
Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly
1 5 10 15
Val His Ser Met Gly Thr Val Asn Lys Pro Val Val Gly Val Leu Met
20 25 30
Gly Phe Gly Ile Ile Thr Gly Thr Leu Arg Ile Thr Asn Pro Val Arg
35 40 45
Ala Ser Val Leu Arg Tyr Asp Asp Phe His Ile Asp Glu Asp Lys Leu
50 55 60
Asp Thr Asn Ser Val Tyr Glu Pro Tyr Tyr His Ser Asp His Ala Glu
65 70 75 80
Ser Ser Trp Val Asn Arg Gly Glu Ser Ser Arg Lys Ala Tyr Asp His
85 90 95
Asn Ser Pro Tyr Ile Trp Pro Arg Asn Asp Tyr Asp Gly Phe Leu Glu
100 105 110
Asn Ala His Glu His His Gly Val Tyr Asn Gln Gly Arg Gly Ile Asp
115 120 125
Ser Gly Glu Arg Leu Met Gln Pro Thr Gln Met Ser Ala Gln Glu Asp
130 135 140
Leu Gly Asp Asp Thr Gly Ile His Val Ile Pro Thr Leu Asn Gly Asp
145 150 155 160
Asp Arg His Lys Ile Val Asn Val Asp Gln Arg Gln Tyr Gly Asp Val
165 170 175
Phe Lys Gly Asp Leu Asn Pro Lys Pro Gln Gly Gln Arg Leu Ile Glu
180 185 190
Val Ser Val Glu Glu Asn His Pro Phe Thr Leu Arg Ala Pro Ile Gln
195 200 205
Arg Ile Tyr Gly Val Arg Tyr Thr Glu Thr Trp Ser Phe Leu Pro Ser
210 215 220
Leu Thr Cys Thr Gly Asp Ala Ala Pro Ala Ile Gln His Ile Cys Leu
225 230 235 240
Lys His Thr Thr Cys Phe Gln Asp Val Val Val Asp Val Asp Cys Ala
245 250 255
Glu Asn Thr Lys Glu Asp Gln Leu Ala Glu Ile Ser Tyr Arg Phe Gln
260 265 270
Gly Lys Lys Glu Ala Asp Gln Pro Trp Ile Val Val Asn Thr Ser Thr
275 280 285
Leu Phe Asp Glu Leu Glu Leu Asp Pro Pro Glu Ile Glu Pro Gly Val
290 295 300
Leu Lys Val Leu Arg Thr Glu Lys Gln Tyr Leu Gly Val Tyr Ile Trp
305 310 315 320
Asn Met Arg Gly Ser Asp Gly Thr Ser Thr Tyr Ala Thr Phe Leu Val
325 330 335
Thr Trp Lys Gly Asp Glu Lys Thr Arg Asn Pro Thr Pro Ala Val Thr
340 345 350
Pro Gln Pro Arg Gly Ala Glu Phe His Met Trp Asn Tyr His Ser His
355 360 365
Val Phe Ser Val Gly Asp Thr Phe Ser Leu Ala Met His Leu Gln Tyr
370 375 380
Lys Ile His Glu Ala Pro Phe Asp Leu Leu Leu Glu Trp Leu Tyr Val
385 390 395 400
Pro Ile Asp Pro Thr Cys Gln Pro Met Arg Leu Tyr Ser Thr Cys Leu
405 410 415
Tyr His Pro Asn Ala Pro Gln Cys Leu Ser His Met Asn Ser Gly Cys
420 425 430
Thr Phe Thr Ser Pro His Leu Ala Gln Arg Val Ala Ser Thr Val Tyr
435 440 445
Gln Asn Cys Glu His Ala Asp Asn Tyr Thr Ala Tyr Cys Leu Gly Ile
450 455 460
Ser His Met Glu Pro Ser Phe Gly Leu Ile Leu His Asp Gly Gly Thr
465 470 475 480
Thr Leu Lys Phe Val Asp Thr Pro Glu Ser Leu Ser Gly Leu Tyr Val
485 490 495
Phe Val Val Tyr Phe Asn Gly His Val Glu Ala Val Ala Tyr Thr Val
500 505 510
Val Ser Thr Val Asp His Phe Val Asn Ala Ile Glu Glu Arg Gly Phe
515 520 525
Pro Pro Thr Ala Gly Gln Pro Pro Ala Thr Thr Lys Pro Lys Glu Ile
530 535 540
Thr Pro Val Asn Pro Gly Thr Ser Pro Leu Ile Arg Tyr Ala Ala Trp
545 550 555 560
Thr Gly Gly Leu Ala Gly Gly Gly Gly Ser His His His His His His
565 570 575
<210> 3
<211> 565
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 3
Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly
1 5 10 15
Val His Ser Met Gly Thr Val Asn Lys Pro Val Val Gly Val Leu Met
20 25 30
Gly Phe Gly Ile Ile Thr Gly Thr Leu Arg Ile Thr Asn Pro Val Arg
35 40 45
Ala Ser Val Leu Arg Tyr Asp Asp Phe His Ile Asp Glu Asp Lys Leu
50 55 60
Asp Thr Asn Ser Val Tyr Glu Pro Tyr Tyr His Ser Asp His Ala Glu
65 70 75 80
Ser Ser Trp Val Asn Arg Gly Glu Ser Ser Arg Lys Ala Tyr Asp His
85 90 95
Asn Ser Pro Tyr Ile Trp Pro Arg Asn Asp Tyr Asp Gly Phe Leu Glu
100 105 110
Asn Ala His Glu His His Gly Val Tyr Asn Gln Gly Arg Gly Ile Asp
115 120 125
Ser Gly Glu Arg Leu Met Gln Pro Thr Gln Met Ser Ala Gln Glu Asp
130 135 140
Leu Gly Asp Asp Thr Gly Ile His Val Ile Pro Thr Leu Asn Gly Asp
145 150 155 160
Asp Arg His Lys Ile Val Asn Val Asp Gln Arg Gln Tyr Gly Asp Val
165 170 175
Phe Lys Gly Asp Leu Asn Pro Lys Pro Gln Gly Gln Arg Leu Ile Glu
180 185 190
Val Ser Val Glu Glu Asn His Pro Phe Thr Leu Arg Ala Pro Ile Gln
195 200 205
Arg Ile Tyr Gly Val Arg Tyr Thr Glu Thr Trp Ser Phe Leu Pro Ser
210 215 220
Leu Thr Cys Thr Gly Asp Ala Ala Pro Ala Ile Gln His Ile Cys Leu
225 230 235 240
Lys His Thr Thr Cys Phe Gln Asp Val Val Val Asp Val Asp Cys Ala
245 250 255
Glu Asn Thr Lys Glu Asp Gln Leu Ala Glu Ile Ser Tyr Arg Phe Gln
260 265 270
Gly Lys Lys Glu Ala Asp Gln Pro Trp Ile Val Val Asn Thr Ser Thr
275 280 285
Leu Phe Asp Glu Leu Glu Leu Asp Pro Pro Glu Ile Glu Pro Gly Val
290 295 300
Leu Lys Val Leu Arg Thr Glu Lys Gln Tyr Leu Gly Val Tyr Ile Trp
305 310 315 320
Asn Met Arg Gly Ser Asp Gly Thr Ser Thr Tyr Ala Thr Phe Leu Val
325 330 335
Thr Trp Lys Gly Asp Glu Lys Thr Arg Asn Pro Thr Pro Ala Val Thr
340 345 350
Pro Gln Pro Arg Gly Ala Glu Phe His Met Trp Asn Tyr His Ser His
355 360 365
Val Phe Ser Val Gly Asp Thr Phe Ser Leu Ala Met His Leu Gln Tyr
370 375 380
Lys Ile His Glu Ala Pro Phe Asp Leu Leu Leu Glu Trp Leu Tyr Val
385 390 395 400
Pro Ile Asp Pro Thr Cys Gln Pro Met Arg Leu Tyr Ser Thr Cys Leu
405 410 415
Tyr His Pro Asn Ala Pro Gln Cys Leu Ser His Met Asn Ser Gly Cys
420 425 430
Thr Phe Thr Ser Pro His Leu Ala Gln Arg Val Ala Ser Thr Val Tyr
435 440 445
Gln Asn Cys Glu His Ala Asp Asn Tyr Thr Ala Tyr Cys Leu Gly Ile
450 455 460
Ser His Met Glu Pro Ser Phe Gly Leu Ile Leu His Asp Gly Gly Thr
465 470 475 480
Thr Leu Lys Phe Val Asp Thr Pro Glu Ser Leu Ser Gly Leu Tyr Val
485 490 495
Phe Val Val Tyr Phe Asn Gly His Val Glu Ala Val Ala Tyr Thr Val
500 505 510
Val Ser Thr Val Asp His Phe Val Asn Ala Ile Glu Glu Arg Gly Phe
515 520 525
Pro Pro Thr Ala Gly Gln Pro Pro Ala Thr Thr Lys Pro Lys Glu Ile
530 535 540
Thr Pro Val Asn Pro Gly Thr Ser Pro Leu Ile Arg Tyr Ala Ala Trp
545 550 555 560
Thr Gly Gly Leu Ala
565
<210> 4
<211> 563
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 4
Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly
1 5 10 15
Val His Ser Met Gly Thr Val Asn Lys Pro Val Val Gly Val Leu Met
20 25 30
Gly Phe Gly Ile Ile Thr Gly Thr Leu Arg Ile Thr Asn Pro Val Arg
35 40 45
Ala Ser Val Leu Arg Tyr Asp Asp Phe His Ile Asp Glu Asp Lys Leu
50 55 60
Asp Thr Asn Ser Val Tyr Glu Pro Tyr Tyr His Ser Asp His Ala Glu
65 70 75 80
Ser Ser Trp Val Asn Arg Gly Glu Ser Ser Arg Lys Ala Tyr Asp His
85 90 95
Asn Ser Pro Tyr Ile Trp Pro Arg Asn Asp Tyr Asp Gly Phe Leu Glu
100 105 110
Asn Ala His Glu His His Gly Val Tyr Asn Gln Gly Arg Gly Ile Asp
115 120 125
Ser Gly Glu Arg Leu Met Gln Pro Thr Gln Met Ser Ala Gln Glu Asp
130 135 140
Leu Gly Asp Asp Thr Gly Ile His Val Ile Pro Thr Leu Asn Gly Asp
145 150 155 160
Asp Arg His Lys Ile Val Asn Val Asp Gln Arg Gln Tyr Gly Asp Val
165 170 175
Phe Lys Gly Asp Leu Asn Pro Lys Pro Gln Gly Gln Arg Leu Ile Glu
180 185 190
Val Ser Val Glu Glu Asn His Pro Phe Thr Leu Arg Ala Pro Ile Gln
195 200 205
Arg Ile Tyr Gly Val Arg Tyr Thr Glu Thr Trp Ser Phe Leu Pro Ser
210 215 220
Leu Thr Cys Thr Gly Asp Ala Ala Pro Ala Ile Gln His Ile Cys Leu
225 230 235 240
Lys His Thr Thr Cys Phe Gln Asp Val Val Val Asp Val Asp Cys Ala
245 250 255
Glu Asn Thr Lys Glu Asp Gln Leu Ala Glu Ile Ser Tyr Arg Phe Gln
260 265 270
Gly Lys Lys Glu Ala Asp Gln Pro Trp Ile Val Val Asn Thr Ser Thr
275 280 285
Leu Phe Asp Glu Leu Glu Leu Asp Pro Pro Glu Ile Glu Pro Gly Val
290 295 300
Leu Lys Val Leu Arg Thr Glu Lys Gln Tyr Leu Gly Val Tyr Ile Trp
305 310 315 320
Asn Met Arg Gly Ser Asp Gly Thr Ser Thr Tyr Ala Thr Phe Leu Val
325 330 335
Thr Trp Lys Gly Asp Glu Lys Thr Arg Asn Pro Thr Pro Ala Val Thr
340 345 350
Pro Gln Pro Arg Gly Ala Glu Phe His Met Trp Asn Tyr His Ser His
355 360 365
Val Phe Ser Val Gly Asp Thr Phe Ser Leu Ala Met His Leu Gln Tyr
370 375 380
Lys Ile His Glu Ala Pro Phe Asp Leu Leu Leu Glu Trp Leu Tyr Val
385 390 395 400
Pro Ile Asp Pro Thr Cys Gln Pro Met Arg Leu Tyr Ser Thr Cys Leu
405 410 415
Tyr His Pro Asn Ala Pro Gln Cys Leu Ser His Met Asn Ser Gly Cys
420 425 430
Thr Phe Thr Ser Pro His Leu Ala Gln Arg Val Ala Ser Thr Val Tyr
435 440 445
Gln Asn Cys Glu His Ala Asp Asn Tyr Thr Ala Tyr Cys Leu Gly Ile
450 455 460
Ser His Met Glu Pro Ser Phe Gly Leu Ile Leu His Asp Gly Gly Thr
465 470 475 480
Thr Leu Lys Phe Val Asp Thr Pro Glu Ser Leu Ser Gly Leu Tyr Val
485 490 495
Phe Val Val Tyr Phe Asn Gly His Val Glu Ala Val Ala Tyr Thr Val
500 505 510
Val Ser Thr Val Asp His Phe Val Asn Ala Ile Glu Glu Arg Gly Phe
515 520 525
Pro Pro Thr Ala Gly Gln Pro Pro Ala Thr Thr Lys Pro Lys Glu Ile
530 535 540
Thr Pro Val Asn Pro Gly Thr Ser Pro Leu Leu Arg Tyr Ala Ala Trp
545 550 555 560
Thr Gly Gly
<210> 5
<211> 563
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 5
Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly
1 5 10 15
Val His Ser Met Gly Thr Val Asn Lys Pro Val Val Gly Val Leu Met
20 25 30
Gly Phe Gly Ile Ile Thr Gly Thr Leu Arg Ile Thr Asn Pro Val Arg
35 40 45
Ala Ser Val Leu Arg Tyr Asp Asp Phe His Thr Asp Glu Asp Lys Leu
50 55 60
Asp Thr Asn Ser Val Tyr Glu Pro Tyr Tyr His Ser Asp His Ala Glu
65 70 75 80
Ser Ser Trp Val Asn Arg Gly Glu Ser Ser Arg Lys Ala Tyr Asp His
85 90 95
Asn Ser Pro Tyr Ile Trp Pro Arg Asn Asp Tyr Asp Gly Phe Leu Glu
100 105 110
Asn Ala His Glu His His Gly Val Tyr Asn Gln Gly Arg Gly Ile Asp
115 120 125
Ser Gly Glu Arg Leu Met Gln Pro Thr Gln Met Ser Ala Gln Glu Asp
130 135 140
Leu Gly Asp Asp Thr Gly Ile His Val Ile Pro Thr Leu Asn Gly Asp
145 150 155 160
Asp Arg His Lys Ile Val Asn Val Asp Gln Arg Gln Tyr Gly Asp Val
165 170 175
Phe Lys Gly Asp Leu Asn Pro Lys Pro Gln Gly Gln Arg Leu Ile Glu
180 185 190
Val Ser Val Glu Glu Asn His Pro Phe Thr Leu Arg Ala Pro Ile Gln
195 200 205
Arg Ile Tyr Gly Val Arg Tyr Thr Glu Thr Trp Ser Phe Leu Pro Ser
210 215 220
Leu Thr Cys Thr Gly Asp Ala Ala Pro Ala Ile Gln His Ile Cys Leu
225 230 235 240
Lys His Thr Thr Cys Phe Gln Asp Val Val Val Asp Val Asp Cys Ala
245 250 255
Glu Asn Thr Lys Glu Asp Gln Leu Ala Glu Ile Ser Tyr Arg Phe Gln
260 265 270
Gly Lys Lys Glu Ala Asp Gln Pro Trp Ile Val Val Asn Thr Ser Thr
275 280 285
Leu Phe Asp Glu Leu Glu Leu Asp Pro Pro Glu Ile Glu Pro Gly Val
290 295 300
Leu Lys Val Leu Arg Thr Glu Lys Gln Tyr Leu Gly Val Tyr Ile Trp
305 310 315 320
Asn Met Arg Gly Ser Asp Gly Thr Ser Thr Tyr Ala Thr Phe Leu Val
325 330 335
Thr Trp Lys Gly Asp Glu Lys Thr Arg Asn Pro Thr Pro Ala Val Thr
340 345 350
Pro Gln Pro Arg Gly Ala Glu Phe His Met Trp Asn Tyr His Ser His
355 360 365
Val Phe Ser Val Gly Asp Thr Phe Ser Leu Ala Met His Leu Gln Tyr
370 375 380
Lys Ile His Glu Ala Pro Phe Asp Leu Leu Leu Glu Trp Leu Tyr Val
385 390 395 400
Pro Ile Asp Pro Thr Cys Gln Pro Met Arg Leu Tyr Ser Thr Cys Leu
405 410 415
Tyr His Pro Asn Ala Pro Gln Cys Leu Ser His Met Asn Ser Gly Cys
420 425 430
Thr Phe Thr Ser Pro His Leu Ala Gln Arg Val Ala Ser Thr Val Tyr
435 440 445
Gln Asn Cys Glu His Ala Asp Asn Tyr Thr Ala Tyr Cys Leu Gly Ile
450 455 460
Ser His Met Glu Pro Ser Phe Gly Leu Ile Leu His Asp Gly Gly Thr
465 470 475 480
Thr Leu Lys Phe Val Asp Thr Pro Glu Ser Leu Ser Gly Leu Tyr Val
485 490 495
Phe Val Val Tyr Phe Asn Gly His Val Glu Ala Val Ala Tyr Thr Val
500 505 510
Val Ser Thr Val Asp His Phe Val Asn Ala Ile Glu Glu Arg Gly Phe
515 520 525
Pro Pro Thr Ala Gly Gln Pro Pro Ala Thr Thr Lys Pro Lys Glu Ile
530 535 540
Thr Pro Val Asn Pro Gly Thr Ser Pro Leu Leu Arg Tyr Ala Ala Trp
545 550 555 560
Thr Gly Gly
<210> 6
<211> 572
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 6
Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly
1 5 10 15
Val His Ser Met Gly Thr Val Asn Lys Pro Val Val Gly Val Leu Met
20 25 30
Gly Phe Gly Ile Ile Thr Gly Thr Leu Arg Ile Thr Asn Pro Val Arg
35 40 45
Ala Ser Val Leu Arg Tyr Asp Asp Phe His Ile Asp Glu Asp Lys Leu
50 55 60
Asp Thr Asn Ser Val Tyr Glu Pro Tyr Tyr His Ser Asp His Ala Glu
65 70 75 80
Ser Ser Trp Val Asn Arg Gly Glu Ser Ser Arg Lys Ala Tyr Asp His
85 90 95
Asn Ser Pro Tyr Ile Trp Pro Arg Asn Asp Tyr Asp Gly Phe Leu Glu
100 105 110
Asn Ala His Glu His His Gly Val Tyr Asn Gln Gly Arg Gly Ile Asp
115 120 125
Ser Gly Glu Arg Leu Met Gln Pro Thr Gln Met Ser Ala Gln Glu Asp
130 135 140
Leu Gly Asp Asp Thr Gly Ile His Val Ile Pro Thr Leu Asn Gly Asp
145 150 155 160
Asp Arg His Lys Ile Val Asn Val Asp Gln Arg Gln Tyr Gly Asp Val
165 170 175
Phe Lys Gly Asp Leu Asn Pro Lys Pro Gln Gly Gln Arg Leu Ile Glu
180 185 190
Val Ser Val Glu Glu Asn His Pro Phe Thr Leu Arg Ala Pro Ile Gln
195 200 205
Arg Ile Tyr Gly Val Arg Tyr Thr Glu Thr Trp Ser Phe Leu Pro Ser
210 215 220
Leu Thr Cys Thr Gly Asp Ala Ala Pro Ala Ile Gln His Ile Cys Leu
225 230 235 240
Lys His Thr Thr Cys Phe Gln Asp Val Val Val Asp Val Asp Cys Ala
245 250 255
Glu Asn Thr Lys Glu Asp Gln Leu Ala Glu Ile Ser Tyr Arg Phe Gln
260 265 270
Gly Lys Lys Glu Ala Asp Gln Pro Trp Ile Val Val Asn Thr Ser Thr
275 280 285
Leu Phe Asp Glu Leu Glu Leu Asp Pro Pro Glu Ile Glu Pro Gly Val
290 295 300
Leu Lys Val Leu Arg Thr Glu Lys Gln Tyr Leu Gly Val Tyr Ile Trp
305 310 315 320
Asn Met Arg Gly Ser Asp Gly Thr Ser Thr Tyr Ala Thr Phe Leu Val
325 330 335
Thr Trp Lys Gly Asp Glu Lys Thr Arg Asn Pro Thr Pro Ala Val Thr
340 345 350
Pro Gln Pro Arg Gly Ala Glu Phe His Met Trp Asn Tyr His Ser His
355 360 365
Val Phe Ser Val Gly Asp Thr Phe Ser Leu Ala Met His Leu Gln Tyr
370 375 380
Lys Ile His Glu Ala Pro Phe Asp Leu Leu Leu Glu Trp Leu Tyr Val
385 390 395 400
Pro Ile Asp Pro Thr Cys Gln Pro Met Arg Leu Tyr Ser Thr Cys Leu
405 410 415
Tyr His Pro Asn Ala Pro Gln Cys Leu Ser His Met Asn Ser Gly Cys
420 425 430
Thr Phe Thr Ser Pro His Leu Ala Gln Arg Val Ala Ser Thr Val Tyr
435 440 445
Gln Asn Cys Glu His Ala Asp Asn Tyr Thr Ala Tyr Cys Leu Gly Ile
450 455 460
Ser His Met Glu Pro Ser Phe Gly Leu Ile Leu His Asp Gly Gly Thr
465 470 475 480
Thr Leu Lys Phe Val Asp Thr Pro Glu Ser Leu Ser Gly Leu Tyr Val
485 490 495
Phe Val Val Tyr Phe Asn Gly His Val Glu Ala Val Ala Tyr Thr Val
500 505 510
Val Ser Thr Val Asp His Phe Val Asn Ala Ile Glu Glu Arg Gly Phe
515 520 525
Pro Pro Thr Ala Gly Gln Pro Pro Ala Thr Thr Lys Pro Lys Glu Ile
530 535 540
Thr Pro Val Asn Pro Gly Thr Ser Pro Leu Leu Arg Tyr Ala Ala Trp
545 550 555 560
Thr Gly Gly Gly Gly Ser His His His His His His
565 570
<210> 7
<211> 572
<212> PRT
<213> Artificial sequence (Artificial Sequence)
<400> 7
Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly
1 5 10 15
Val His Ser Met Gly Thr Val Asn Lys Pro Val Val Gly Val Leu Met
20 25 30
Gly Phe Gly Ile Ile Thr Gly Thr Leu Arg Ile Thr Asn Pro Val Arg
35 40 45
Ala Ser Val Leu Arg Tyr Asp Asp Phe His Thr Asp Glu Asp Lys Leu
50 55 60
Asp Thr Asn Ser Val Tyr Glu Pro Tyr Tyr His Ser Asp His Ala Glu
65 70 75 80
Ser Ser Trp Val Asn Arg Gly Glu Ser Ser Arg Lys Ala Tyr Asp His
85 90 95
Asn Ser Pro Tyr Ile Trp Pro Arg Asn Asp Tyr Asp Gly Phe Leu Glu
100 105 110
Asn Ala His Glu His His Gly Val Tyr Asn Gln Gly Arg Gly Ile Asp
115 120 125
Ser Gly Glu Arg Leu Met Gln Pro Thr Gln Met Ser Ala Gln Glu Asp
130 135 140
Leu Gly Asp Asp Thr Gly Ile His Val Ile Pro Thr Leu Asn Gly Asp
145 150 155 160
Asp Arg His Lys Ile Val Asn Val Asp Gln Arg Gln Tyr Gly Asp Val
165 170 175
Phe Lys Gly Asp Leu Asn Pro Lys Pro Gln Gly Gln Arg Leu Ile Glu
180 185 190
Val Ser Val Glu Glu Asn His Pro Phe Thr Leu Arg Ala Pro Ile Gln
195 200 205
Arg Ile Tyr Gly Val Arg Tyr Thr Glu Thr Trp Ser Phe Leu Pro Ser
210 215 220
Leu Thr Cys Thr Gly Asp Ala Ala Pro Ala Ile Gln His Ile Cys Leu
225 230 235 240
Lys His Thr Thr Cys Phe Gln Asp Val Val Val Asp Val Asp Cys Ala
245 250 255
Glu Asn Thr Lys Glu Asp Gln Leu Ala Glu Ile Ser Tyr Arg Phe Gln
260 265 270
Gly Lys Lys Glu Ala Asp Gln Pro Trp Ile Val Val Asn Thr Ser Thr
275 280 285
Leu Phe Asp Glu Leu Glu Leu Asp Pro Pro Glu Ile Glu Pro Gly Val
290 295 300
Leu Lys Val Leu Arg Thr Glu Lys Gln Tyr Leu Gly Val Tyr Ile Trp
305 310 315 320
Asn Met Arg Gly Ser Asp Gly Thr Ser Thr Tyr Ala Thr Phe Leu Val
325 330 335
Thr Trp Lys Gly Asp Glu Lys Thr Arg Asn Pro Thr Pro Ala Val Thr
340 345 350
Pro Gln Pro Arg Gly Ala Glu Phe His Met Trp Asn Tyr His Ser His
355 360 365
Val Phe Ser Val Gly Asp Thr Phe Ser Leu Ala Met His Leu Gln Tyr
370 375 380
Lys Ile His Glu Ala Pro Phe Asp Leu Leu Leu Glu Trp Leu Tyr Val
385 390 395 400
Pro Ile Asp Pro Thr Cys Gln Pro Met Arg Leu Tyr Ser Thr Cys Leu
405 410 415
Tyr His Pro Asn Ala Pro Gln Cys Leu Ser His Met Asn Ser Gly Cys
420 425 430
Thr Phe Thr Ser Pro His Leu Ala Gln Arg Val Ala Ser Thr Val Tyr
435 440 445
Gln Asn Cys Glu His Ala Asp Asn Tyr Thr Ala Tyr Cys Leu Gly Ile
450 455 460
Ser His Met Glu Pro Ser Phe Gly Leu Ile Leu His Asp Gly Gly Thr
465 470 475 480
Thr Leu Lys Phe Val Asp Thr Pro Glu Ser Leu Ser Gly Leu Tyr Val
485 490 495
Phe Val Val Tyr Phe Asn Gly His Val Glu Ala Val Ala Tyr Thr Val
500 505 510
Val Ser Thr Val Asp His Phe Val Asn Ala Ile Glu Glu Arg Gly Phe
515 520 525
Pro Pro Thr Ala Gly Gln Pro Pro Ala Thr Thr Lys Pro Lys Glu Ile
530 535 540
Thr Pro Val Asn Pro Gly Thr Ser Pro Leu Leu Arg Tyr Ala Ala Trp
545 550 555 560
Thr Gly Gly Gly Gly Ser His His His His His His
565 570
<210> 8
<211> 1674
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 8
atgggcaccg tgaacaagcc agtggtggga gtgctgatgg gattcggcat catcaccggc 60
accctgagga tcacaaaccc tgtgagagct tctgtgctga gatatgatga ttttcacacc 120
gatgaggata agctggatac aaactctgtg tacgagcctt actaccactc tgatcacgcc 180
gaatcttctt gggtgaatag aggcgagtcc agcagaaagg cctacgatca caactctcct 240
tacatctggc ctaggaacga ttatgatggc ttcctggaga acgcccacga acaccacggc 300
gtgtacaacc agggcagagg aatcgattct ggagagagac tgatgcagcc tacccagatg 360
tctgctcagg aggatctggg agatgataca ggaatccatg tgatccctac actgaacggc 420
gatgataggc acaagattgt gaacgtggat cagagacagt acggagatgt gtttaaggga 480
gatctgaacc ctaagcctca gggccagaga ctgattgagg tgagcgtgga ggaaaaccac 540
ccttttaccc tgagggctcc tatccagaga atctacggag tgagatacac cgagacatgg 600
tcttttctgc ctagcctgac atgtaccggc gatgctgccc ctgccattca gcatatttgt 660
ctgaagcaca caacctgttt tcaggatgtg gtggtggatg tggactgtgc cgagaacaca 720
aaggaggatc agctggccga gatctcttac agatttcagg gcaagaagga ggctgatcag 780
ccttggatcg tggtgaacac aagcaccctg tttgatgagc tggagctgga tcctcctgag 840
attgagcctg gtgtgctgaa ggtgctgaga acagaaaagc agtatctggg cgtgtacatt 900
tggaacatga ggggctctga tggaacaagc acctacgcta ctttcctggt gacctggaag 960
ggcgatgaga aaacaagaaa ccctacacca gccgtgactc ctcagcctag aggagctgag 1020
ttccacatgt ggaactacca ttctcacgtg ttttctgtgg gcgatacatt ttctctggcc 1080
atgcacctgc agtataagat ccacgaggct ccatttgatc tgctgctgga gtggctgtac 1140
gtgcctattg atccaacatg tcagccaatg agactgtact ccacatgtct gtaccaccca 1200
aacgcccctc agtgtctgag ccacatgaac agcggatgta cttttacatc tcctcacctg 1260
gcccagagag tggctagcac agtgtaccag aactgtgagc acgccgataa ctacacagct 1320
tattgtctgg gcatctctca catggagcct tcttttggcc tgatcctgca tgacggaggc 1380
accacactga agtttgtgga tacccctgag tctctgagcg gcctgtacgt gtttgtggtg 1440
tactttaacg gccacgtgga ggctgtggcc tacaccgtgg tgagcacagt ggatcacttt 1500
gtgaatgcca ttgaggagag aggcttccca ccaacagccg gacagcctcc agccacaaca 1560
aagcctaagg agatcacacc tgtgaaccca ggcacctctc ctctgctcag atacgccgcc 1620
tggaccggcg gactggccgg cggcggcggc tctcatcacc accaccacca ctaa 1674
<210> 9
<211> 1674
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 9
atggggacag tgaacaagcc tgtggtgggg gtcctgatgg ggttcgggat catcaccggc 60
accctgagga ttactaaccc cgtgagggcc agcgtcctgc ggtacgacga ttttcacacc 120
gatgaggata agctggatac caactccgtg tacgagccat actaccacag cgaccacgcc 180
gagtccagct gggtgaaccg gggcgagtcc tccaggaagg cctacgacca caactccccc 240
tacatctggc cacggaacga ttacgacggc ttcctcgaga acgcccacga gcaccacggg 300
gtgtacaacc agggccgggg cattgacagc ggggagcggt tgatgcagcc cacccagatg 360
agcgcccagg aggatctggg ggatgatacc gggatccacg tgatcccaac actgaacggc 420
gatgataggc acaagattgt gaacgtggac cagcggcagt acggggatgt gttcaagggg 480
gatctgaacc ccaagcccca gggccagcgg ctcatcgagg tgtccgtgga ggagaaccac 540
cctttcaccc tgcgggcccc cattcagagg atctacgggg tgaggtacac cgagacctgg 600
agcttcctgc cctccctgac atgcacaggg gacgccgccc ccgccatcca gcacatctgc 660
ctgaagcaca ccacatgctt ccaggatgtg gtggtcgatg tggactgcgc cgagaacacc 720
aaggaggacc agctcgccga gatcagctac cggtttcagg gcaagaagga ggccgatcag 780
ccctggattg tcgtgaacac aagcacactg ttcgatgagc tggagctgga cccccccgag 840
atcgagccag gcgtgctgaa ggtgctgcgg acagagaagc agtacctggg cgtgtacatt 900
tggaacatgc ggggcagcga cggcacatcc acctacgcca ccttcctggt gacctggaag 960
ggggacgaaa agacacggaa ccccacaccc gccgtgaccc cacagcccag gggcgccgag 1020
ttccacatgt ggaactacca ctcccacgtg ttctccgtgg gggatacctt cagcctggcc 1080
atgcacctgc agtacaagat ccacgaggcc cctttcgacc tgctgctgga gtggctgtac 1140
gtgcctattg accccacttg ccagcctatg cggctgtaca gcacctgcct gtaccaccct 1200
aacgcccccc agtgcctgtc ccacatgaac agcggctgca cattcacttc cccccacctg 1260
gcccagcggg tggcttccac cgtgtaccag aactgcgagc atgccgataa ctacaccgcc 1320
tactgcctgg ggatcagcca catggagccc agctttgggc tgatcctgca cgacggcggg 1380
accacactga agttcgtgga cacccccgag agcctgagcg gcctgtacgt gtttgtggtg 1440
tactttaacg gccacgtgga ggccgtggcc tacaccgtgg tgagcacagt cgatcacttc 1500
gtgaacgcca ttgaggagag ggggttcccc ccaaccgccg gccagccccc cgccaccaca 1560
aagcccaagg agattacccc tgtgaacccc ggcacaagcc cactgctcag gtacgccgcc 1620
tggacagggg gcctggccgg cggcggcggc tctcatcacc accaccacca ctaa 1674
<210> 10
<211> 1731
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 10
atgggctggt cttgtatcat cctgtttctg gtggccacag ccaccggcgt gcactctatg 60
ggcaccgtga acaagccagt ggtgggagtg ctgatgggat tcggcatcat caccggcacc 120
ctgaggatca caaaccctgt gagagcttct gtgctgagat atgatgattt tcacatcgat 180
gaggataagc tggatacaaa ctctgtgtac gagccttact accactctga tcacgccgaa 240
tcttcttggg tgaatagagg cgagtccagc agaaaggcct acgatcacaa ctctccttac 300
atctggccta ggaacgatta tgatggcttc ctggagaacg cccacgaaca ccacggcgtg 360
tacaaccagg gcagaggaat cgattctgga gagagactga tgcagcctac ccagatgtct 420
gctcaggagg atctgggaga tgatacagga atccatgtga tccctacact gaacggcgat 480
gataggcaca agattgtgaa cgtggatcag agacagtacg gagatgtgtt taagggagat 540
ctgaacccta agcctcaggg ccagagactg attgaggtga gcgtggagga aaaccaccct 600
tttaccctga gggctcctat ccagagaatc tacggagtga gatacaccga gacatggtct 660
tttctgccta gcctgacatg taccggcgat gctgcccctg ccattcagca tatttgtctg 720
aagcacacaa cctgttttca ggatgtggtg gtggatgtgg actgtgccga gaacacaaag 780
gaggatcagc tggccgagat ctcttacaga tttcagggca agaaggaggc tgatcagcct 840
tggatcgtgg tgaacacaag caccctgttt gatgagctgg agctggatcc tcctgagatt 900
gagcctggtg tgctgaaggt gctgagaaca gaaaagcagt atctgggcgt gtacatttgg 960
aacatgaggg gctctgatgg aacaagcacc tacgctactt tcctggtgac ctggaagggc 1020
gatgagaaaa caagaaaccc tacaccagcc gtgactcctc agcctagagg agctgagttc 1080
cacatgtgga actaccattc tcacgtgttt tctgtgggcg atacattttc tctggccatg 1140
cacctgcagt ataagatcca cgaggctcca tttgatctgc tgctggagtg gctgtacgtg 1200
cctattgatc caacatgtca gccaatgaga ctgtactcca catgtctgta ccacccaaac 1260
gcccctcagt gtctgagcca catgaacagc ggatgtactt ttacatctcc tcacctggcc 1320
cagagagtgg ctagcacagt gtaccagaac tgtgagcacg ccgataacta cacagcttat 1380
tgtctgggca tctctcacat ggagccttct tttggcctga tcctgcatga cggaggcacc 1440
acactgaagt ttgtggatac ccctgagtct ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc tgtggcctac accgtggtga gcacagtgga tcactttgtg 1560
aatgccattg aggagagagg cttcccacca acagccggac agcctccagc cacaacaaag 1620
cctaaggaga tcacacctgt gaacccaggc acctctcctc tgatcagata cgccgcctgg 1680
accggcggac tggccggcgg cggcggctct catcaccacc accaccacta a 1731
<210> 11
<211> 1698
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 11
atgggctggt cttgtatcat cctgtttctg gtggccacag ccaccggcgt gcactctatg 60
ggcaccgtga acaagccagt ggtgggagtg ctgatgggat tcggcatcat caccggcacc 120
ctgaggatca caaaccctgt gagagcttct gtgctgagat atgatgattt tcacatcgat 180
gaggataagc tggatacaaa ctctgtgtac gagccttact accactctga tcacgccgaa 240
tcttcttggg tgaatagagg cgagtccagc agaaaggcct acgatcacaa ctctccttac 300
atctggccta ggaacgatta tgatggcttc ctggagaacg cccacgaaca ccacggcgtg 360
tacaaccagg gcagaggaat cgattctgga gagagactga tgcagcctac ccagatgtct 420
gctcaggagg atctgggaga tgatacagga atccatgtga tccctacact gaacggcgat 480
gataggcaca agattgtgaa cgtggatcag agacagtacg gagatgtgtt taagggagat 540
ctgaacccta agcctcaggg ccagagactg attgaggtga gcgtggagga aaaccaccct 600
tttaccctga gggctcctat ccagagaatc tacggagtga gatacaccga gacatggtct 660
tttctgccta gcctgacatg taccggcgat gctgcccctg ccattcagca tatttgtctg 720
aagcacacaa cctgttttca ggatgtggtg gtggatgtgg actgtgccga gaacacaaag 780
gaggatcagc tggccgagat ctcttacaga tttcagggca agaaggaggc tgatcagcct 840
tggatcgtgg tgaacacaag caccctgttt gatgagctgg agctggatcc tcctgagatt 900
gagcctggtg tgctgaaggt gctgagaaca gaaaagcagt atctgggcgt gtacatttgg 960
aacatgaggg gctctgatgg aacaagcacc tacgctactt tcctggtgac ctggaagggc 1020
gatgagaaaa caagaaaccc tacaccagcc gtgactcctc agcctagagg agctgagttc 1080
cacatgtgga actaccattc tcacgtgttt tctgtgggcg atacattttc tctggccatg 1140
cacctgcagt ataagatcca cgaggctcca tttgatctgc tgctggagtg gctgtacgtg 1200
cctattgatc caacatgtca gccaatgaga ctgtactcca catgtctgta ccacccaaac 1260
gcccctcagt gtctgagcca catgaacagc ggatgtactt ttacatctcc tcacctggcc 1320
cagagagtgg ctagcacagt gtaccagaac tgtgagcacg ccgataacta cacagcttat 1380
tgtctgggca tctctcacat ggagccttct tttggcctga tcctgcatga cggaggcacc 1440
acactgaagt ttgtggatac ccctgagtct ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc tgtggcctac accgtggtga gcacagtgga tcactttgtg 1560
aatgccattg aggagagagg cttcccacca acagccggac agcctccagc cacaacaaag 1620
cctaaggaga tcacacctgt gaacccaggc acctctcctc tgatcagata cgccgcctgg 1680
accggcggac tggcctaa 1698
<210> 12
<211> 1731
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 12
atgggctggt cctgcattat tctgtttctg gtggccacag ccacaggcgt gcatagcatg 60
gggacagtga acaagcctgt ggtgggggtc ctgatggggt tcgggatcat caccggcacc 120
ctgaggatta ctaaccccgt gagggccagc gtcctgcggt acgacgattt tcacatcgat 180
gaggataagc tggataccaa ctccgtgtac gagccatact accacagcga ccacgccgag 240
tccagctggg tgaaccgggg cgagtcctcc aggaaggcct acgaccacaa ctccccctac 300
atctggccac ggaacgatta cgacggcttc ctcgagaacg cccacgagca ccacggggtg 360
tacaaccagg gccggggcat tgacagcggg gagcggttga tgcagcccac ccagatgagc 420
gcccaggagg atctggggga tgataccggg atccacgtga tcccaacact gaacggcgat 480
gataggcaca agattgtgaa cgtggaccag cggcagtacg gggatgtgtt caagggggat 540
ctgaacccca agccccaggg ccagcggctc atcgaggtgt ccgtggagga gaaccaccct 600
ttcaccctgc gggcccccat tcagaggatc tacggggtga ggtacaccga gacctggagc 660
ttcctgccct ccctgacatg cacaggggac gccgcccccg ccatccagca catctgcctg 720
aagcacacca catgcttcca ggatgtggtg gtcgatgtgg actgcgccga gaacaccaag 780
gaggaccagc tcgccgagat cagctaccgg tttcagggca agaaggaggc cgatcagccc 840
tggattgtcg tgaacacaag cacactgttc gatgagctgg agctggaccc ccccgagatc 900
gagccaggcg tgctgaaggt gctgcggaca gagaagcagt acctgggcgt gtacatttgg 960
aacatgcggg gcagcgacgg cacatccacc tacgccacct tcctggtgac ctggaagggg 1020
gacgaaaaga cacggaaccc cacacccgcc gtgaccccac agcccagggg cgccgagttc 1080
cacatgtgga actaccactc ccacgtgttc tccgtggggg ataccttcag cctggccatg 1140
cacctgcagt acaagatcca cgaggcccct ttcgacctgc tgctggagtg gctgtacgtg 1200
cctattgacc ccacttgcca gcctatgcgg ctgtacagca cctgcctgta ccaccctaac 1260
gccccccagt gcctgtccca catgaacagc ggctgcacat tcacttcccc ccacctggcc 1320
cagcgggtgg cttccaccgt gtaccagaac tgcgagcatg ccgataacta caccgcctac 1380
tgcctgggga tcagccacat ggagcccagc tttgggctga tcctgcacga cggcgggacc 1440
acactgaagt tcgtggacac ccccgagagc ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc cgtggcctac accgtggtga gcacagtcga tcacttcgtg 1560
aacgccattg aggagagggg gttcccccca accgccggcc agccccccgc caccacaaag 1620
cccaaggaga ttacccctgt gaaccccggc acaagcccac tgatcaggta cgccgcctgg 1680
acagggggcc tggccggcgg cggcggcagc caccatcacc accaccacta a 1731
<210> 13
<211> 1698
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 13
atgggctggt cctgcattat tctgtttctg gtggccacag ccacaggcgt gcatagcatg 60
gggacagtga acaagcctgt ggtgggggtc ctgatggggt tcgggatcat caccggcacc 120
ctgaggatta ctaaccccgt gagggccagc gtcctgcggt acgacgattt tcacatcgat 180
gaggataagc tggataccaa ctccgtgtac gagccatact accacagcga ccacgccgag 240
tccagctggg tgaaccgggg cgagtcctcc aggaaggcct acgaccacaa ctccccctac 300
atctggccac ggaacgatta cgacggcttc ctcgagaacg cccacgagca ccacggggtg 360
tacaaccagg gccggggcat tgacagcggg gagcggttga tgcagcccac ccagatgagc 420
gcccaggagg atctggggga tgataccggg atccacgtga tcccaacact gaacggcgat 480
gataggcaca agattgtgaa cgtggaccag cggcagtacg gggatgtgtt caagggggat 540
ctgaacccca agccccaggg ccagcggctc atcgaggtgt ccgtggagga gaaccaccct 600
ttcaccctgc gggcccccat tcagaggatc tacggggtga ggtacaccga gacctggagc 660
ttcctgccct ccctgacatg cacaggggac gccgcccccg ccatccagca catctgcctg 720
aagcacacca catgcttcca ggatgtggtg gtcgatgtgg actgcgccga gaacaccaag 780
gaggaccagc tcgccgagat cagctaccgg tttcagggca agaaggaggc cgatcagccc 840
tggattgtcg tgaacacaag cacactgttc gatgagctgg agctggaccc ccccgagatc 900
gagccaggcg tgctgaaggt gctgcggaca gagaagcagt acctgggcgt gtacatttgg 960
aacatgcggg gcagcgacgg cacatccacc tacgccacct tcctggtgac ctggaagggg 1020
gacgaaaaga cacggaaccc cacacccgcc gtgaccccac agcccagggg cgccgagttc 1080
cacatgtgga actaccactc ccacgtgttc tccgtggggg ataccttcag cctggccatg 1140
cacctgcagt acaagatcca cgaggcccct ttcgacctgc tgctggagtg gctgtacgtg 1200
cctattgacc ccacttgcca gcctatgcgg ctgtacagca cctgcctgta ccaccctaac 1260
gccccccagt gcctgtccca catgaacagc ggctgcacat tcacttcccc ccacctggcc 1320
cagcgggtgg cttccaccgt gtaccagaac tgcgagcatg ccgataacta caccgcctac 1380
tgcctgggga tcagccacat ggagcccagc tttgggctga tcctgcacga cggcgggacc 1440
acactgaagt tcgtggacac ccccgagagc ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc cgtggcctac accgtggtga gcacagtcga tcacttcgtg 1560
aacgccattg aggagagggg gttcccccca accgccggcc agccccccgc caccacaaag 1620
cccaaggaga ttacccctgt gaaccccggc acaagcccac tgatcaggta cgccgcctgg 1680
acagggggcc tggcctaa 1698
<210> 14
<211> 1692
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 14
atgggctggt cttgtatcat cctgtttctg gtggccacag ccaccggcgt gcactctatg 60
ggcaccgtga acaagccagt ggtgggagtg ctgatgggat tcggcatcat caccggcacc 120
ctgaggatca caaaccctgt gagagcttct gtgctgagat atgatgattt tcacatcgat 180
gaggataagc tggatacaaa ctctgtgtac gagccttact accactctga tcacgccgaa 240
tcttcttggg tgaatagagg cgagtccagc agaaaggcct acgatcacaa ctctccttac 300
atctggccta ggaacgatta tgatggcttc ctggagaacg cccacgaaca ccacggcgtg 360
tacaaccagg gcagaggaat cgattctgga gagagactga tgcagcctac ccagatgtct 420
gctcaggagg atctgggaga tgatacagga atccatgtga tccctacact gaacggcgat 480
gataggcaca agattgtgaa cgtggatcag agacagtacg gagatgtgtt taagggagat 540
ctgaacccta agcctcaggg ccagagactg attgaggtga gcgtggagga aaaccaccct 600
tttaccctga gggctcctat ccagagaatc tacggagtga gatacaccga gacatggtct 660
tttctgccta gcctgacatg taccggcgat gctgcccctg ccattcagca tatttgtctg 720
aagcacacaa cctgttttca ggatgtggtg gtggatgtgg actgtgccga gaacacaaag 780
gaggatcagc tggccgagat ctcttacaga tttcagggca agaaggaggc tgatcagcct 840
tggatcgtgg tgaacacaag caccctgttt gatgagctgg agctggatcc tcctgagatt 900
gagcctggtg tgctgaaggt gctgagaaca gaaaagcagt atctgggcgt gtacatttgg 960
aacatgaggg gctctgatgg aacaagcacc tacgctactt tcctggtgac ctggaagggc 1020
gatgagaaaa caagaaaccc tacaccagcc gtgactcctc agcctagagg agctgagttc 1080
cacatgtgga actaccattc tcacgtgttt tctgtgggcg atacattttc tctggccatg 1140
cacctgcagt ataagatcca cgaggctcca tttgatctgc tgctggagtg gctgtacgtg 1200
cctattgatc caacatgtca gccaatgaga ctgtactcca catgtctgta ccacccaaac 1260
gcccctcagt gtctgagcca catgaacagc ggatgtactt ttacatctcc tcacctggcc 1320
cagagagtgg ctagcacagt gtaccagaac tgtgagcacg ccgataacta cacagcttat 1380
tgtctgggca tctctcacat ggagccttct tttggcctga tcctgcatga cggaggcacc 1440
acactgaagt ttgtggatac ccctgagtct ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc tgtggcctac accgtggtga gcacagtgga tcactttgtg 1560
aatgccattg aggagagagg cttcccacca acagccggac agcctccagc cacaacaaag 1620
cctaaggaga tcacacctgt gaacccaggc acctctcctc tgctcagata cgccgcctgg 1680
accggcggat aa 1692
<210> 15
<211> 1692
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 15
atgggctggt cctgcattat tctgtttctg gtggccacag ccacaggcgt gcatagcatg 60
gggacagtga acaagcctgt ggtgggggtc ctgatggggt tcgggatcat caccggcacc 120
ctgaggatta ctaaccccgt gagggccagc gtcctgcggt acgacgattt tcacatcgat 180
gaggataagc tggataccaa ctccgtgtac gagccatact accacagcga ccacgccgag 240
tccagctggg tgaaccgggg cgagtcctcc aggaaggcct acgaccacaa ctccccctac 300
atctggccac ggaacgatta cgacggcttc ctcgagaacg cccacgagca ccacggggtg 360
tacaaccagg gccggggcat tgacagcggg gagcggttga tgcagcccac ccagatgagc 420
gcccaggagg atctggggga tgataccggg atccacgtga tcccaacact gaacggcgat 480
gataggcaca agattgtgaa cgtggaccag cggcagtacg gggatgtgtt caagggggat 540
ctgaacccca agccccaggg ccagcggctc atcgaggtgt ccgtggagga gaaccaccct 600
ttcaccctgc gggcccccat tcagaggatc tacggggtga ggtacaccga gacctggagc 660
ttcctgccct ccctgacatg cacaggggac gccgcccccg ccatccagca catctgcctg 720
aagcacacca catgcttcca ggatgtggtg gtcgatgtgg actgcgccga gaacaccaag 780
gaggaccagc tcgccgagat cagctaccgg tttcagggca agaaggaggc cgatcagccc 840
tggattgtcg tgaacacaag cacactgttc gatgagctgg agctggaccc ccccgagatc 900
gagccaggcg tgctgaaggt gctgcggaca gagaagcagt acctgggcgt gtacatttgg 960
aacatgcggg gcagcgacgg cacatccacc tacgccacct tcctggtgac ctggaagggg 1020
gacgaaaaga cacggaaccc cacacccgcc gtgaccccac agcccagggg cgccgagttc 1080
cacatgtgga actaccactc ccacgtgttc tccgtggggg ataccttcag cctggccatg 1140
cacctgcagt acaagatcca cgaggcccct ttcgacctgc tgctggagtg gctgtacgtg 1200
cctattgacc ccacttgcca gcctatgcgg ctgtacagca cctgcctgta ccaccctaac 1260
gccccccagt gcctgtccca catgaacagc ggctgcacat tcacttcccc ccacctggcc 1320
cagcgggtgg cttccaccgt gtaccagaac tgcgagcatg ccgataacta caccgcctac 1380
tgcctgggga tcagccacat ggagcccagc tttgggctga tcctgcacga cggcgggacc 1440
acactgaagt tcgtggacac ccccgagagc ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc cgtggcctac accgtggtga gcacagtcga tcacttcgtg 1560
aacgccattg aggagagggg gttcccccca accgccggcc agccccccgc caccacaaag 1620
cccaaggaga ttacccctgt gaaccccggc acaagcccac tgctcaggta cgccgcctgg 1680
acagggggct aa 1692
<210> 16
<211> 1692
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 16
atgggctggt cttgtatcat cctgtttctg gtggccacag ccaccggcgt gcactctatg 60
ggcaccgtga acaagccagt ggtgggagtg ctgatgggat tcggcatcat caccggcacc 120
ctgaggatca caaaccctgt gagagcttct gtgctgagat atgatgattt tcacaccgat 180
gaggataagc tggatacaaa ctctgtgtac gagccttact accactctga tcacgccgaa 240
tcttcttggg tgaatagagg cgagtccagc agaaaggcct acgatcacaa ctctccttac 300
atctggccta ggaacgatta tgatggcttc ctggagaacg cccacgaaca ccacggcgtg 360
tacaaccagg gcagaggaat cgattctgga gagagactga tgcagcctac ccagatgtct 420
gctcaggagg atctgggaga tgatacagga atccatgtga tccctacact gaacggcgat 480
gataggcaca agattgtgaa cgtggatcag agacagtacg gagatgtgtt taagggagat 540
ctgaacccta agcctcaggg ccagagactg attgaggtga gcgtggagga aaaccaccct 600
tttaccctga gggctcctat ccagagaatc tacggagtga gatacaccga gacatggtct 660
tttctgccta gcctgacatg taccggcgat gctgcccctg ccattcagca tatttgtctg 720
aagcacacaa cctgttttca ggatgtggtg gtggatgtgg actgtgccga gaacacaaag 780
gaggatcagc tggccgagat ctcttacaga tttcagggca agaaggaggc tgatcagcct 840
tggatcgtgg tgaacacaag caccctgttt gatgagctgg agctggatcc tcctgagatt 900
gagcctggtg tgctgaaggt gctgagaaca gaaaagcagt atctgggcgt gtacatttgg 960
aacatgaggg gctctgatgg aacaagcacc tacgctactt tcctggtgac ctggaagggc 1020
gatgagaaaa caagaaaccc tacaccagcc gtgactcctc agcctagagg agctgagttc 1080
cacatgtgga actaccattc tcacgtgttt tctgtgggcg atacattttc tctggccatg 1140
cacctgcagt ataagatcca cgaggctcca tttgatctgc tgctggagtg gctgtacgtg 1200
cctattgatc caacatgtca gccaatgaga ctgtactcca catgtctgta ccacccaaac 1260
gcccctcagt gtctgagcca catgaacagc ggatgtactt ttacatctcc tcacctggcc 1320
cagagagtgg ctagcacagt gtaccagaac tgtgagcacg ccgataacta cacagcttat 1380
tgtctgggca tctctcacat ggagccttct tttggcctga tcctgcatga cggaggcacc 1440
acactgaagt ttgtggatac ccctgagtct ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc tgtggcctac accgtggtga gcacagtgga tcactttgtg 1560
aatgccattg aggagagagg cttcccacca acagccggac agcctccagc cacaacaaag 1620
cctaaggaga tcacacctgt gaacccaggc acctctcctc tgctcagata cgccgcctgg 1680
accggcggat aa 1692
<210> 17
<211> 1692
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 17
atgggctggt cctgcattat tctgtttctg gtggccacag ccacaggcgt gcatagcatg 60
gggacagtga acaagcctgt ggtgggggtc ctgatggggt tcgggatcat caccggcacc 120
ctgaggatta ctaaccccgt gagggccagc gtcctgcggt acgacgattt tcacaccgat 180
gaggataagc tggataccaa ctccgtgtac gagccatact accacagcga ccacgccgag 240
tccagctggg tgaaccgggg cgagtcctcc aggaaggcct acgaccacaa ctccccctac 300
atctggccac ggaacgatta cgacggcttc ctcgagaacg cccacgagca ccacggggtg 360
tacaaccagg gccggggcat tgacagcggg gagcggttga tgcagcccac ccagatgagc 420
gcccaggagg atctggggga tgataccggg atccacgtga tcccaacact gaacggcgat 480
gataggcaca agattgtgaa cgtggaccag cggcagtacg gggatgtgtt caagggggat 540
ctgaacccca agccccaggg ccagcggctc atcgaggtgt ccgtggagga gaaccaccct 600
ttcaccctgc gggcccccat tcagaggatc tacggggtga ggtacaccga gacctggagc 660
ttcctgccct ccctgacatg cacaggggac gccgcccccg ccatccagca catctgcctg 720
aagcacacca catgcttcca ggatgtggtg gtcgatgtgg actgcgccga gaacaccaag 780
gaggaccagc tcgccgagat cagctaccgg tttcagggca agaaggaggc cgatcagccc 840
tggattgtcg tgaacacaag cacactgttc gatgagctgg agctggaccc ccccgagatc 900
gagccaggcg tgctgaaggt gctgcggaca gagaagcagt acctgggcgt gtacatttgg 960
aacatgcggg gcagcgacgg cacatccacc tacgccacct tcctggtgac ctggaagggg 1020
gacgaaaaga cacggaaccc cacacccgcc gtgaccccac agcccagggg cgccgagttc 1080
cacatgtgga actaccactc ccacgtgttc tccgtggggg ataccttcag cctggccatg 1140
cacctgcagt acaagatcca cgaggcccct ttcgacctgc tgctggagtg gctgtacgtg 1200
cctattgacc ccacttgcca gcctatgcgg ctgtacagca cctgcctgta ccaccctaac 1260
gccccccagt gcctgtccca catgaacagc ggctgcacat tcacttcccc ccacctggcc 1320
cagcgggtgg cttccaccgt gtaccagaac tgcgagcatg ccgataacta caccgcctac 1380
tgcctgggga tcagccacat ggagcccagc tttgggctga tcctgcacga cggcgggacc 1440
acactgaagt tcgtggacac ccccgagagc ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc cgtggcctac accgtggtga gcacagtcga tcacttcgtg 1560
aacgccattg aggagagggg gttcccccca accgccggcc agccccccgc caccacaaag 1620
cccaaggaga ttacccctgt gaaccccggc acaagcccac tgctcaggta cgccgcctgg 1680
acagggggct aa 1692
<210> 18
<211> 1719
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 18
atgggctggt cttgtatcat cctgtttctg gtggccacag ccaccggcgt gcactctatg 60
ggcaccgtga acaagccagt ggtgggagtg ctgatgggat tcggcatcat caccggcacc 120
ctgaggatca caaaccctgt gagagcttct gtgctgagat atgatgattt tcacatcgat 180
gaggataagc tggatacaaa ctctgtgtac gagccttact accactctga tcacgccgaa 240
tcttcttggg tgaatagagg cgagtccagc agaaaggcct acgatcacaa ctctccttac 300
atctggccta ggaacgatta tgatggcttc ctggagaacg cccacgaaca ccacggcgtg 360
tacaaccagg gcagaggaat cgattctgga gagagactga tgcagcctac ccagatgtct 420
gctcaggagg atctgggaga tgatacagga atccatgtga tccctacact gaacggcgat 480
gataggcaca agattgtgaa cgtggatcag agacagtacg gagatgtgtt taagggagat 540
ctgaacccta agcctcaggg ccagagactg attgaggtga gcgtggagga aaaccaccct 600
tttaccctga gggctcctat ccagagaatc tacggagtga gatacaccga gacatggtct 660
tttctgccta gcctgacatg taccggcgat gctgcccctg ccattcagca tatttgtctg 720
aagcacacaa cctgttttca ggatgtggtg gtggatgtgg actgtgccga gaacacaaag 780
gaggatcagc tggccgagat ctcttacaga tttcagggca agaaggaggc tgatcagcct 840
tggatcgtgg tgaacacaag caccctgttt gatgagctgg agctggatcc tcctgagatt 900
gagcctggtg tgctgaaggt gctgagaaca gaaaagcagt atctgggcgt gtacatttgg 960
aacatgaggg gctctgatgg aacaagcacc tacgctactt tcctggtgac ctggaagggc 1020
gatgagaaaa caagaaaccc tacaccagcc gtgactcctc agcctagagg agctgagttc 1080
cacatgtgga actaccattc tcacgtgttt tctgtgggcg atacattttc tctggccatg 1140
cacctgcagt ataagatcca cgaggctcca tttgatctgc tgctggagtg gctgtacgtg 1200
cctattgatc caacatgtca gccaatgaga ctgtactcca catgtctgta ccacccaaac 1260
gcccctcagt gtctgagcca catgaacagc ggatgtactt ttacatctcc tcacctggcc 1320
cagagagtgg ctagcacagt gtaccagaac tgtgagcacg ccgataacta cacagcttat 1380
tgtctgggca tctctcacat ggagccttct tttggcctga tcctgcatga cggaggcacc 1440
acactgaagt ttgtggatac ccctgagtct ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc tgtggcctac accgtggtga gcacagtgga tcactttgtg 1560
aatgccattg aggagagagg cttcccacca acagccggac agcctccagc cacaacaaag 1620
cctaaggaga tcacacctgt gaacccaggc acctctcctc tgctcagata cgccgcctgg 1680
accggcggag gcggctctca tcaccaccac caccactaa 1719
<210> 19
<211> 1719
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 19
atgggctggt cctgcattat tctgtttctg gtggccacag ccacaggcgt gcatagcatg 60
gggacagtga acaagcctgt ggtgggggtc ctgatggggt tcgggatcat caccggcacc 120
ctgaggatta ctaaccccgt gagggccagc gtcctgcggt acgacgattt tcacatcgat 180
gaggataagc tggataccaa ctccgtgtac gagccatact accacagcga ccacgccgag 240
tccagctggg tgaaccgggg cgagtcctcc aggaaggcct acgaccacaa ctccccctac 300
atctggccac ggaacgatta cgacggcttc ctcgagaacg cccacgagca ccacggggtg 360
tacaaccagg gccggggcat tgacagcggg gagcggttga tgcagcccac ccagatgagc 420
gcccaggagg atctggggga tgataccggg atccacgtga tcccaacact gaacggcgat 480
gataggcaca agattgtgaa cgtggaccag cggcagtacg gggatgtgtt caagggggat 540
ctgaacccca agccccaggg ccagcggctc atcgaggtgt ccgtggagga gaaccaccct 600
ttcaccctgc gggcccccat tcagaggatc tacggggtga ggtacaccga gacctggagc 660
ttcctgccct ccctgacatg cacaggggac gccgcccccg ccatccagca catctgcctg 720
aagcacacca catgcttcca ggatgtggtg gtcgatgtgg actgcgccga gaacaccaag 780
gaggaccagc tcgccgagat cagctaccgg tttcagggca agaaggaggc cgatcagccc 840
tggattgtcg tgaacacaag cacactgttc gatgagctgg agctggaccc ccccgagatc 900
gagccaggcg tgctgaaggt gctgcggaca gagaagcagt acctgggcgt gtacatttgg 960
aacatgcggg gcagcgacgg cacatccacc tacgccacct tcctggtgac ctggaagggg 1020
gacgaaaaga cacggaaccc cacacccgcc gtgaccccac agcccagggg cgccgagttc 1080
cacatgtgga actaccactc ccacgtgttc tccgtggggg ataccttcag cctggccatg 1140
cacctgcagt acaagatcca cgaggcccct ttcgacctgc tgctggagtg gctgtacgtg 1200
cctattgacc ccacttgcca gcctatgcgg ctgtacagca cctgcctgta ccaccctaac 1260
gccccccagt gcctgtccca catgaacagc ggctgcacat tcacttcccc ccacctggcc 1320
cagcgggtgg cttccaccgt gtaccagaac tgcgagcatg ccgataacta caccgcctac 1380
tgcctgggga tcagccacat ggagcccagc tttgggctga tcctgcacga cggcgggacc 1440
acactgaagt tcgtggacac ccccgagagc ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc cgtggcctac accgtggtga gcacagtcga tcacttcgtg 1560
aacgccattg aggagagggg gttcccccca accgccggcc agccccccgc caccacaaag 1620
cccaaggaga ttacccctgt gaaccccggc acaagcccac tgctcaggta cgccgcctgg 1680
acagggggcg gcggctctca tcaccaccac caccactaa 1719
<210> 20
<211> 1719
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 20
atgggctggt cttgtatcat cctgtttctg gtggccacag ccaccggcgt gcactctatg 60
ggcaccgtga acaagccagt ggtgggagtg ctgatgggat tcggcatcat caccggcacc 120
ctgaggatca caaaccctgt gagagcttct gtgctgagat atgatgattt tcacaccgat 180
gaggataagc tggatacaaa ctctgtgtac gagccttact accactctga tcacgccgaa 240
tcttcttggg tgaatagagg cgagtccagc agaaaggcct acgatcacaa ctctccttac 300
atctggccta ggaacgatta tgatggcttc ctggagaacg cccacgaaca ccacggcgtg 360
tacaaccagg gcagaggaat cgattctgga gagagactga tgcagcctac ccagatgtct 420
gctcaggagg atctgggaga tgatacagga atccatgtga tccctacact gaacggcgat 480
gataggcaca agattgtgaa cgtggatcag agacagtacg gagatgtgtt taagggagat 540
ctgaacccta agcctcaggg ccagagactg attgaggtga gcgtggagga aaaccaccct 600
tttaccctga gggctcctat ccagagaatc tacggagtga gatacaccga gacatggtct 660
tttctgccta gcctgacatg taccggcgat gctgcccctg ccattcagca tatttgtctg 720
aagcacacaa cctgttttca ggatgtggtg gtggatgtgg actgtgccga gaacacaaag 780
gaggatcagc tggccgagat ctcttacaga tttcagggca agaaggaggc tgatcagcct 840
tggatcgtgg tgaacacaag caccctgttt gatgagctgg agctggatcc tcctgagatt 900
gagcctggtg tgctgaaggt gctgagaaca gaaaagcagt atctgggcgt gtacatttgg 960
aacatgaggg gctctgatgg aacaagcacc tacgctactt tcctggtgac ctggaagggc 1020
gatgagaaaa caagaaaccc tacaccagcc gtgactcctc agcctagagg agctgagttc 1080
cacatgtgga actaccattc tcacgtgttt tctgtgggcg atacattttc tctggccatg 1140
cacctgcagt ataagatcca cgaggctcca tttgatctgc tgctggagtg gctgtacgtg 1200
cctattgatc caacatgtca gccaatgaga ctgtactcca catgtctgta ccacccaaac 1260
gcccctcagt gtctgagcca catgaacagc ggatgtactt ttacatctcc tcacctggcc 1320
cagagagtgg ctagcacagt gtaccagaac tgtgagcacg ccgataacta cacagcttat 1380
tgtctgggca tctctcacat ggagccttct tttggcctga tcctgcatga cggaggcacc 1440
acactgaagt ttgtggatac ccctgagtct ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc tgtggcctac accgtggtga gcacagtgga tcactttgtg 1560
aatgccattg aggagagagg cttcccacca acagccggac agcctccagc cacaacaaag 1620
cctaaggaga tcacacctgt gaacccaggc acctctcctc tgctcagata cgccgcctgg 1680
accggcggag gcggctctca tcaccaccac caccactaa 1719
<210> 21
<211> 1719
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<400> 21
atgggctggt cctgcattat tctgtttctg gtggccacag ccacaggcgt gcatagcatg 60
gggacagtga acaagcctgt ggtgggggtc ctgatggggt tcgggatcat caccggcacc 120
ctgaggatta ctaaccccgt gagggccagc gtcctgcggt acgacgattt tcacaccgat 180
gaggataagc tggataccaa ctccgtgtac gagccatact accacagcga ccacgccgag 240
tccagctggg tgaaccgggg cgagtcctcc aggaaggcct acgaccacaa ctccccctac 300
atctggccac ggaacgatta cgacggcttc ctcgagaacg cccacgagca ccacggggtg 360
tacaaccagg gccggggcat tgacagcggg gagcggttga tgcagcccac ccagatgagc 420
gcccaggagg atctggggga tgataccggg atccacgtga tcccaacact gaacggcgat 480
gataggcaca agattgtgaa cgtggaccag cggcagtacg gggatgtgtt caagggggat 540
ctgaacccca agccccaggg ccagcggctc atcgaggtgt ccgtggagga gaaccaccct 600
ttcaccctgc gggcccccat tcagaggatc tacggggtga ggtacaccga gacctggagc 660
ttcctgccct ccctgacatg cacaggggac gccgcccccg ccatccagca catctgcctg 720
aagcacacca catgcttcca ggatgtggtg gtcgatgtgg actgcgccga gaacaccaag 780
gaggaccagc tcgccgagat cagctaccgg tttcagggca agaaggaggc cgatcagccc 840
tggattgtcg tgaacacaag cacactgttc gatgagctgg agctggaccc ccccgagatc 900
gagccaggcg tgctgaaggt gctgcggaca gagaagcagt acctgggcgt gtacatttgg 960
aacatgcggg gcagcgacgg cacatccacc tacgccacct tcctggtgac ctggaagggg 1020
gacgaaaaga cacggaaccc cacacccgcc gtgaccccac agcccagggg cgccgagttc 1080
cacatgtgga actaccactc ccacgtgttc tccgtggggg ataccttcag cctggccatg 1140
cacctgcagt acaagatcca cgaggcccct ttcgacctgc tgctggagtg gctgtacgtg 1200
cctattgacc ccacttgcca gcctatgcgg ctgtacagca cctgcctgta ccaccctaac 1260
gccccccagt gcctgtccca catgaacagc ggctgcacat tcacttcccc ccacctggcc 1320
cagcgggtgg cttccaccgt gtaccagaac tgcgagcatg ccgataacta caccgcctac 1380
tgcctgggga tcagccacat ggagcccagc tttgggctga tcctgcacga cggcgggacc 1440
acactgaagt tcgtggacac ccccgagagc ctgagcggcc tgtacgtgtt tgtggtgtac 1500
tttaacggcc acgtggaggc cgtggcctac accgtggtga gcacagtcga tcacttcgtg 1560
aacgccattg aggagagggg gttcccccca accgccggcc agccccccgc caccacaaag 1620
cccaaggaga ttacccctgt gaaccccggc acaagcccac tgctcaggta cgccgcctgg 1680
acagggggcg gcggctctca tcaccaccac caccactaa 1719
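
Each row of the listing above holds up to six 10-nucleotide blocks followed by the running position of the row's last nucleotide. As a minimal sketch (the helper name `join_listing_rows` and its `start` parameter are illustrative and not part of the patent), the rows can be joined and the running counts cross-checked as follows, using the final two rows of SEQ ID No. 21:

```python
def join_listing_rows(rows, start=0):
    """Join sequence-listing rows into one lowercase nucleotide string.

    Each row is whitespace-separated nucleotide blocks followed by the
    running position of the row's last nucleotide. `start` is the number
    of nucleotides that precede the first row (hypothetical parameter).
    """
    parts = []
    position = start
    for row in rows:
        *blocks, count = row.split()
        chunk = "".join(blocks).lower()
        position += len(chunk)
        # The printed count must match the number of nucleotides seen so far.
        if position != int(count):
            raise ValueError(f"row ends at {position}, listing says {count}")
        parts.append(chunk)
    return "".join(parts)


# Final two rows of SEQ ID No. 21, copied from the listing;
# 1620 nucleotides precede them.
tail = join_listing_rows(
    [
        "cccaaggaga ttacccctgt gaaccccggc acaagcccac tgctcaggta cgccgcctgg 1680",
        "acagggggcg gcggctctca tcaccaccac caccactaa 1719",
    ],
    start=1620,
)
print(len(tail))   # 99
print(tail[-3:])   # taa
```

The final position, 1719, agrees with the `<211>` length field of the entry.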

Claims (9)

  1. A DNA molecule, characterized in that: the DNA molecule contains a gene encoding a double signal peptide and a gene encoding the varicella-zoster virus gE protein; the double signal peptide is formed by linking signal peptide 1 and signal peptide 2;
    the amino acid sequence of signal peptide 1 is positions 1-19 of SEQ ID No. 2; the amino acid sequence of signal peptide 2 is positions 1-30 of SEQ ID No. 1.
  2. The DNA molecule of claim 1, wherein: signal peptide 1 is an antibody heavy chain signal peptide, signal peptide 2 is the native signal peptide of the varicella-zoster virus gE protein, and the carboxyl terminus of signal peptide 1 is linked to the amino terminus of signal peptide 2.
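
The layout described in claims 1 and 2 can be checked against the sequence listing. The sketch below translates the first 147 nucleotides of SEQ ID No. 19 with the standard genetic code and splits the result after residue 19. It assumes that SEQ ID No. 19 begins with the coding region of the double signal peptide (claim 4 states this explicitly only for SEQ ID Nos. 10 and 12); the function and variable names are illustrative.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.replace(" ", "").lower()
    if len(dna) % 3:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Nucleotides 1-147 of SEQ ID No. 19, copied from the sequence listing.
leader_dna = (
    "atgggctggt cctgcattat tctgtttctg gtggccacag ccacaggcgt gcatagcatg "
    "gggacagtga acaagcctgt ggtgggggtc ctgatggggt tcgggatcat caccggcacc "
    "ctgaggatta ctaaccccgt gagggcc"
)

leader = translate(leader_dna)
# Assumed split: 19 residues for signal peptide 1, the rest for signal peptide 2.
signal_peptide_1, signal_peptide_2 = leader[:19], leader[19:]
print(signal_peptide_1)  # MGWSCIILFLVATATGVHS
print(signal_peptide_2)  # MGTVNKPVVGVLMGFGIITGTLRITNPVRA
```

The two segments are 19 and 30 residues long, that is 49 codons or 147 nucleotides in total.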
  3. The DNA molecule of claim 1 or 2, characterized in that: the varicella-zoster virus gE protein is any one of the following proteins:
    A1) a protein whose amino acid sequence is positions 31-546 of SEQ ID No. 1 in the sequence listing,
    A2) a protein whose amino acid sequence is positions 50-565 of SEQ ID No. 2 in the sequence listing,
    A3) a protein whose amino acid sequence is positions 50-565 of SEQ ID No. 3 in the sequence listing,
    A4) a protein whose amino acid sequence is positions 50-563 of SEQ ID No. 4 in the sequence listing,
    A5) a protein whose amino acid sequence is positions 50-563 of SEQ ID No. 5 in the sequence listing,
    A6) a protein whose amino acid sequence is positions 50-563 of SEQ ID No. 6 in the sequence listing,
    A7) a protein whose amino acid sequence is positions 50-563 of SEQ ID No. 7 in the sequence listing,
    A8) a fusion protein obtained by fusing a protein tag to the carboxyl terminus and/or the amino terminus of the protein shown in A1), A2), A3), A4), A5), A6) or A7).
  4. The DNA molecule of claim 1 or 2, characterized in that: the coding gene of the gE protein is any one of the following:
    B1) a DNA molecule shown in nucleotides 91-1638 of SEQ ID No. 8;
    B2) a DNA molecule shown in nucleotides 91-1638 of SEQ ID No. 9;
    B3) a DNA molecule shown in nucleotides 148-1695 of SEQ ID No. 10;
    B4) a DNA molecule shown in nucleotides 148-1695 of SEQ ID No. 11;
    B5) a DNA molecule shown in nucleotides 148-1695 of SEQ ID No. 12;
    B6) a DNA molecule shown in nucleotides 148-1695 of SEQ ID No. 13;
    B7) a DNA molecule shown in nucleotides 148-1689 of SEQ ID No. 14;
    B8) a DNA molecule shown in nucleotides 148-1689 of SEQ ID No. 15;
    B9) a DNA molecule shown in nucleotides 148-1689 of SEQ ID No. 16;
    B10) a DNA molecule shown in nucleotides 148-1689 of SEQ ID No. 17;
    B11) a DNA molecule shown in nucleotides 148-1689 of SEQ ID No. 18;
    B12) a DNA molecule shown in nucleotides 148-1689 of SEQ ID No. 19;
    B13) a DNA molecule shown in nucleotides 148-1689 of SEQ ID No. 20;
    B14) a DNA molecule shown in nucleotides 148-1689 of SEQ ID No. 21;
    the coding gene of the double signal peptide is any one of the following:
    C1) a DNA molecule shown in nucleotides 1-147 of SEQ ID No. 10;
    C2) a DNA molecule shown in nucleotides 1-147 of SEQ ID No. 12.
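
The amino-acid ranges in claim 3 and the nucleotide ranges in claim 4 are related by codon arithmetic: residue n occupies nucleotides 3n-2 to 3n of the open reading frame. A minimal sketch, assuming 1-based inclusive coordinates and a reading frame that starts at nucleotide 1 (the helper name `aa_to_nt_range` is hypothetical):

```python
def aa_to_nt_range(aa_start, aa_end):
    """Map an inclusive, 1-based amino-acid range to the inclusive,
    1-based nucleotide range spanned by its codons."""
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    return 3 * (aa_start - 1) + 1, 3 * aa_end


# Ranges recited in claim 3, plus the 49-residue double signal peptide (19 + 30).
print(aa_to_nt_range(31, 546))  # (91, 1638)
print(aa_to_nt_range(50, 565))  # (148, 1695)
print(aa_to_nt_range(50, 563))  # (148, 1689)
print(aa_to_nt_range(1, 49))    # (1, 147)
```

The results match the nucleotide ranges recited in B1) to B14), C1) and C2). Which protein SEQ ID corresponds to which DNA SEQ ID is inferred here from the matching lengths and is not stated in the claims.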
  5. The DNA molecule of claim 1 or 2, characterized in that: the DNA molecule is any one of the following:
    D1) a DNA molecule shown in SEQ ID No. 10;
    D2) a DNA molecule shown in SEQ ID No. 11;
    D3) a DNA molecule shown in SEQ ID No. 12;
    D4) a DNA molecule shown in SEQ ID No. 13;
    D5) a DNA molecule shown in SEQ ID No. 14;
    D6) a DNA molecule shown in SEQ ID No. 15;
    D7) a DNA molecule shown in SEQ ID No. 16;
    D8) a DNA molecule shown in SEQ ID No. 17;
    D9) a DNA molecule shown in SEQ ID No. 18;
    D10) a DNA molecule shown in SEQ ID No. 19;
    D11) a DNA molecule shown in SEQ ID No. 20;
    D12) a DNA molecule shown in SEQ ID No. 21.
  6. A recombinant vector comprising the DNA molecule according to any one of claims 1 to 5.
  7. A recombinant cell line comprising the DNA molecule according to any one of claims 1 to 5 or the recombinant vector according to claim 6.
  8. A protein, characterized in that: the protein is any one of the following:
    E1) a protein whose amino acid sequence is SEQ ID No. 1 in the sequence listing,
    E2) a protein whose amino acid sequence is SEQ ID No. 2 in the sequence listing,
    E3) a protein whose amino acid sequence is SEQ ID No. 3 in the sequence listing,
    E4) a protein whose amino acid sequence is SEQ ID No. 4 in the sequence listing,
    E5) a protein whose amino acid sequence is SEQ ID No. 5 in the sequence listing,
    E6) a protein whose amino acid sequence is SEQ ID No. 6 in the sequence listing,
    E7) a protein whose amino acid sequence is SEQ ID No. 7 in the sequence listing,
    E8) a fusion protein obtained by fusing a protein tag to the carboxyl terminus and/or the amino terminus of the protein shown in E1), E2), E3), E4), E5), E6) or E7).
  9. Use of the DNA molecule according to any one of claims 1 to 5 and/or the recombinant vector according to claim 6 and/or the recombinant cell line according to claim 7 and/or the protein according to claim 8 for any one of the following:
    F1) preparation of a varicella-zoster virus antigen component gE protein product;
    F2) preparation or development of a medicament for diseases caused by varicella-zoster virus;
    F3) preparation of a key antigenic component of a varicella-zoster virus prophylactic product;
    F4) a key component of a diagnostic or detection kit for varicella-zoster virus.
CN202210509802.XA 2022-05-11 2022-05-11 DNA molecule for expressing varicella-zoster virus gE protein Active CN114958882B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210509802.XA CN114958882B (en) 2022-05-11 2022-05-11 DNA molecule for expressing varicella-zoster virus gE protein

Publications (2)

Publication Number Publication Date
CN114958882A (en) 2022-08-30
CN114958882B (en) 2023-05-26

Family

ID=82982247

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210509802.XA Active CN114958882B (en) 2022-05-11 2022-05-11 DNA molecule for expressing varicella-zoster virus gE protein

Country Status (1)

Country Link
CN (1) CN114958882B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108315344A (en) * 2018-02-14 2018-07-24 武汉博沃生物科技有限公司 VZV glycoprotein E genes expression vector and its restructuring yeast strains and application
CN112870344A (en) * 2019-11-29 2021-06-01 北京绿竹生物技术股份有限公司 Recombinant varicella zoster virus vaccine
CN113683704A (en) * 2021-07-28 2021-11-23 安徽智飞龙科马生物制药有限公司 Varicella-zoster virus r-gE fusion protein, recombinant varicella-zoster vaccine, and preparation method and application thereof
WO2022055176A1 (en) * 2020-09-11 2022-03-17 주식회사 유바이오로직스 Vaccine composition for chickenpox or varicella zoster and method of using same

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112142829B (en) * 2019-06-28 2022-02-22 怡道生物科技(苏州)有限公司 Varicella-zoster virus gE protein mutant and expression method thereof
CN114324878A (en) * 2022-01-06 2022-04-12 郑州大学 Fluorescence-labeled varicella-zoster virus immunochromatography detection test paper and application thereof
CN114891072B (en) * 2022-03-11 2023-07-04 上海博唯生物科技有限公司 Truncated vaccine antigen peptide for preventing and/or treating herpesvirus, and preparation method and application thereof

Also Published As

Publication number Publication date
CN114958882A (en) 2022-08-30

Similar Documents

Publication Publication Date Title
CN112209995B (en) Preparation method of SARS-CoV-2 surface protein receptor binding region
CN112390863B (en) Modified new coronavirus Spike protein extracellular domain and application thereof
CN109280656A (en) Recombinate muscardine Proteinase K mutant PK-M1 and preparation method
CN116574172B (en) Recombinant humanized type I collagen and preparation method thereof
WO2024087784A1 (en) Recombinant type xvii humanized collagen expressed in yeast and preparation method therefor
CN114014940A (en) Preparation method of 2019-nCoV surface protein receptor binding region fusion protein
CN104232611B (en) A kind of recombination muscardine Proteinase K and its industrialized production and purification process
CN114958882B (en) DNA molecule for expressing varicella-zoster virus gE protein
CN109207460A (en) Recombinate muscardine Proteinase K mutant PK-M2 and preparation method
KR100227405B1 (en) Yeast agglutination genes and yeast containing them
CN116102663A (en) Monkey poxvirus B6R antigen and preparation method and application thereof
CN114196688B (en) Expression of prokaryotic alkaline phosphatase in yeast
CN114349831B (en) aspA gene mutant, recombinant bacterium and method for preparing L-valine
CN111349575B (en) Pichia pastoris engineering bacteria for constitutive expression of porcine pepsinogen C and application thereof
JP4491588B2 (en) Purification method of killer protein
JP2023533160A (en) Methods for Producing Varicella-Zoster Virus Surface Protein Antigens
CN108265059B (en) Recombinant dust mite 2-class allergen protein and preparation method and application thereof
CN111019927A (en) Recombinant plasmid and recombinant engineering bacterium for expressing TEV protein, and method for preparing and purifying TEV protein
CN109295041A (en) With active polypeptide of serrapeptase and preparation method thereof
CN114561395B (en) Fusion tag-free rhIL-11 and soluble expression and efficient purification method of mutant thereof
CN114807193B (en) Carbonic anhydrase gene and application thereof
CN113564185B (en) Alkaline phosphatase gene, CHO stable cell strain thereof and ALP preparation method
CN111349576B (en) Pichia pastoris engineering bacteria for constitutive expression of porcine pepsinogen A and application thereof
CN114957410A (en) Preparation method of surface protein receptor binding region of kappa strain 2019-nCoV
CN116496384A (en) Preparation method of A/Cambodia/e0826360/2020 (H3N 2) HA antigen

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230508

Address after: Room 1014, 10th Floor, Yuyan Building, No. 29 Chongde Street, Zhengdong New District, Zhengzhou City, Henan Province, 450046

Applicant after: Shengming Biotechnology (Zhengzhou) Co.,Ltd.

Address before: 453000 No.1-2 Huanghe Road, Pingyuan demonstration area, Xinxiang City, Henan Province

Applicant before: Henan Shengming Biotechnology Research Institute Co.,Ltd.

GR01 Patent grant