CN111718419B - Fusion protein containing RNA binding protein and expression vector used in combination with same - Google Patents

Publication number
CN111718419B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910212861.9A
Other languages
Chinese (zh)
Other versions
CN111718419A (en)
Inventor
郭敏
王静
许乃庆
姜灵轩
于雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kangma Healthcode Shanghai Biotech Co Ltd
Original Assignee
Kangma Healthcode Shanghai Biotech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kangma Healthcode Shanghai Biotech Co Ltd
Priority to CN201910212861.9A
Publication of CN111718419A
Application granted
Publication of CN111718419B

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80: Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81: Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80: Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81: Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
    • C12N15/815: Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts for yeasts other than Saccharomyces
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P21/00: Preparation of peptides or proteins
    • C12P21/02: Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide
    • C07K2319/85: Fusion polypeptide containing an RNA binding domain

Abstract

The present invention provides fusion proteins comprising an RNA-binding protein, and expression vectors for use with them. Specifically, the fusion protein of the invention can be used in combination with a matching expression vector to greatly improve in vitro translation efficiency. In addition, unlike typical eukaryotic mRNA, the mRNA transcribed from the expression vector of the invention need not contain a poly(A) structure. The invention further provides the use of these RNA-binding-protein fusion proteins together with the expression vectors for cell-free protein synthesis.

Description

Fusion protein containing RNA binding protein and expression vector used in combination with same
Technical Field
The invention relates to the field of genetic engineering, in particular to a fusion protein containing RNA binding protein, a corresponding expression vector thereof and application of the fusion protein and the corresponding expression vector in a cell-free protein synthesis system and in improving protein synthesis.
Background
Proteins are important molecules in cells and take part in almost all cellular functions. Differences in the sequence and structure of proteins determine differences in their function. In cells, proteins act as enzymes that catalyze biochemical reactions and as signaling molecules that coordinate the activities of the organism; they also support biological morphology, store energy, transport molecules, and enable movement. In the biomedical field, antibody proteins used as targeted drugs are an important means of treating diseases such as cancer.
Protein translation comprises four stages: initiation, elongation, termination, and ribosome recycling. Initiation is the most heavily regulated and the rate-limiting stage [1]. Translation initiation on eukaryotic mRNA bearing a "cap structure" (m7GpppN) proceeds mainly by a cap-dependent scanning mechanism [1-3]. The cap structure recruits the translation initiation factor eIF4F, which in turn recruits to the mRNA the downstream 43S pre-initiation complex (PIC), comprising the ribosomal 40S small subunit and the translation initiation factors eIF1, eIF3, and eIF5, together with the ternary complex comprising the translation initiation factor eIF2 and the initiator tRNA (Met-tRNAi). eIF4F is composed of three protein subunits: eIF4E, eIF4G, and eIF4A. eIF4E binds specifically to the cap structure and anchors eIF4F at the 5' untranslated region of the mRNA; eIF4A is an RNA helicase; eIF4G is a scaffold protein for almost the entire initiation process, interacting with multiple translation initiation factors and playing an important role in recruiting downstream factors. In addition to the cap structure, most eukaryotic mRNAs carry a polyadenylate chain at the 3' end, which is bound by the poly(A)-binding protein (pAB1). Besides binding the poly(A) at the 3' end of the mRNA, pAB1 also binds the translation initiation factor eIF4G. Because the pAB1 binding site on eIF4G is distinct from the eIF4G binding site for eIF4E (which binds the 5' cap structure of the mRNA), pAB1 and eIF4E can bind eIF4G simultaneously. Mediated by eIF4E, eIF4G, and pAB1, the 5' and 3' ends of the mRNA are brought close together, forming a loop structure.
The functional significance of this circularized mRNA structure in eukaryotes is not yet well defined. One possible explanation is that it helps ribosomes remain bound to the mRNA after translation is complete, so that they can pass directly from the 3' end to the 5' end, resume scanning, and restart translation, which reduces the need to reassemble ribosomal subunits and translation initiation factor complexes [2].
Prokaryotes have three translation initiation factors: IF1, IF2, and IF3. IF1, the smallest, regulates how tightly IF2 binds to the ribosome. At the initiation of translation, IF1 binds the A site of the ribosome, preventing aminoacyl-tRNAs, which bind this site during elongation, from binding to the ribosome prematurely. IF2, the largest, has a critical role in initiating translation: it promotes binding of the initiator aminoacyl-tRNA to the P site on the ribosomal 30S small subunit and promotes joining of the 50S large subunit with the 30S small subunit. By binding 16S rRNA, IF3 promotes dissociation of empty 70S ribosomes, allowing the ribosomal subunits to be recycled for the next round of translation initiation.
Among eukaryotic translation initiation factors, eIF3 is the largest complex. eIF3 has 13 different subunits, eIF3a to eIF3m, ranging in size from 25 kD to 170 kD, and the complete eIF3 complex formed by all subunits is about 800 kD [4]. eIF3a, known as RPG1 in Saccharomyces cerevisiae, is a core component of the eIF3 complex and promotes binding of mRNA and tRNA(i)Met to the ribosome. eIF3 in yeast consists of five core subunits and several non-core subunits. The core subunits eIF3a, eIF3b, eIF3c, eIF3g, and eIF3i form a conserved core complex that can perform the key functions of eIF3 [5]; eIF3a is the largest subunit of eIF3.
eIF3a interacts with many other proteins. Within the eIF family, eIF2β, eIF4G, and eIF4B all interact with eIF3a [6-8]. The N-terminus of eIF3a also binds both eIF1 and eIF5 [9-10]. Studies in Saccharomyces cerevisiae showed that, within the eIF3 complex, eIF3a interacts with eIF3b, eIF3c, eIF3d, eIF3f, eIF3g, eIF3h, eIF3i, eIF3j, and eIF3k [4]. eIF3a can also bind the ribosome directly: the C-terminus of Saccharomyces cerevisiae eIF3a binds helices 16-18 of 18S rRNA, and N-terminal residues 1-396 bind tightly to the RPS0A protein of the ribosomal 40S small subunit, forming an important bridge between the eIF3 initiation complex and the 40S small subunit [11-12].
The capsid proteins of bacteriophage MS2 and bacteriophage Qβ are specific RNA-binding proteins that bind different RNA stem-loop structures, achieving specific protein-RNA recognition. The MS2 phage capsid protein (MS2 coat protein, abbreviated MS2 or MS2 CP) interacts specifically with a stem-loop RNA sequence of 19 bases at the 5' end of the phage replicase gene, termed the MS2 binding site (MS2 bs). The specific recognition and binding site sequence for the MS2 capsid protein used in the present invention is AAACATGAGGATTACCCATGT [13-14]. The Qβ phage capsid protein (Qβ coat protein, abbreviated Qβ or Qβ CP) is another mRNA-binding protein; the specific recognition and binding site sequence selected for it in the present invention is TAAGGATTAATTGCATGTCTAAGACAGCAA [15].
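As an illustrative sketch only (it is not part of the patent disclosure), the two recognition-site sequences quoted above can be transcribed to their RNA form, and the MS2 site can be checked for the self-complementary arms expected of a stem-loop. The helper names `reverse_complement` and `transcribe` are hypothetical; only the two DNA sequences are taken from the text.

```python
# Recognition-site sequences exactly as quoted in the text (DNA form).
MS2_SITE = "AAACATGAGGATTACCCATGT"
QBETA_SITE = "TAAGGATTAATTGCATGTCTAAGACAGCAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def transcribe(dna: str) -> str:
    """Return the RNA form of a sense-strand DNA string (T replaced by U)."""
    return dna.replace("T", "U")


# Assumption for illustration: the five bases after the leading "AA" and the
# last five bases of the MS2 site are the outer arms of the stem.
left_arm = MS2_SITE[2:7]
right_arm = MS2_SITE[-5:]
arms_pair = reverse_complement(left_arm) == right_arm

print(transcribe(MS2_SITE))
print(transcribe(QBETA_SITE))
print("MS2 outer arms are reverse complements:", arms_pair)
```

The check covers only the outer five base pairs of the MS2 site; it does not model the internal structure of either hairpin.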
Cell lysates contain abundant protein synthesis and translation machinery, and a target protein can be synthesized by adding an exogenous template. In yeast cell lysates, however, the endogenous mRNA continues to consume resources to synthesize non-target proteins (i.e., contaminating proteins), which reduces the efficiency and yield of target protein synthesis. The present invention addresses this problem of non-target protein expression by investigating the interaction between RNA-binding proteins and translation initiation factors, with the aim of improving the expression of the target protein.
References
1. Sonenberg & Hinnebusch, Regulation of translation initiation in eukaryotes: mechanisms and biological targets. Cell, 2009. 136(4): 731-45.
2. Dever et al., Mechanism and Regulation of Protein Synthesis in Saccharomyces cerevisiae. Genetics, 2016. 203(1): 65-107.
3. M., D.J.R.G., Nucleic Acid. Encyclopedia of Cell Biology. Elsevier, 2015.
4. Saletta et al., The translational regulator eIF3a: The tricky eIF3 subunit! Biochimica et Biophysica Acta, 2010. 1806: 275-286.
5. Hinnebusch et al., eIF3: a versatile scaffold for translation initiation complexes, Trends Biochem. Sci. 2006. 31: 553-562.
6. Valasek et al., Direct eIF2–eIF3 contact in the multifactor complex is important for translation initiation in vivo, EMBO J. 21 (2002) 5886–5898.
7. A. Gradi, H. Imataka, Y.V. Svitkin, E. Rom, B. Raught, S. Morino, N. Sonenberg, A novel functional human eukaryotic translation initiation factor 4G, Mol. Cell. Biol. 18 (1998) 334–342.
8. N. Methot, M.S. Song, N. Sonenberg, A region rich in aspartic acid, arginine, tyrosine, and glycine (DRYG) mediates eukaryotic initiation factor 4B (eIF4B) self-association and interaction with eIF3, Mol. Cell. Biol. 16 (1996) 5328–5334.
9. K. Asano, J. Clayton, A. Shalev, A.G. Hinnebusch, A multifactor complex of eukaryotic initiation factors, eIF1, eIF2, eIF3, eIF5, and initiator tRNA(Met) is an important translation initiation intermediate in vivo, Genes Dev. 14 (2000) 2534–2546.
10. L. Phan, L.W. Schoenfeld, L. Valasek, K.H. Nielsen, A.G. Hinnebusch, A subcomplex of three eIF3 subunits binds eIF1 and eIF5 and stimulates ribosome binding of mRNA and tRNA(i)Met, EMBO J. 20 (2001) 2954–2965.
11. L. Valasek, A.A. Mathew, B.S. Shin, K.H. Nielsen, B. Szamecz, A.G. Hinnebusch, The yeast eIF3 subunits TIF32/a, NIP1/c, and eIF5 make critical connections with the 40S ribosome in vivo, Genes Dev. 17 (2003) 786–799.
12. B. Szamecz, E. Rutkai, L. Cuchalova, V. Munzarova, A. Herrmannova, K.H. Nielsen, L. Burela, A.G. Hinnebusch, L. Valasek, eIF3a cooperates with sequences 5′ of uORF1 to promote resumption of scanning by post-termination ribosomes for reinitiation on GCN4 mRNA, Genes Dev. 22 (2008) 2414–2425.
13. Min Jou et al., Nucleotide sequence of the gene coding for the bacteriophage MS2 coat protein. Nature, 1972. 237: 82-88.
14. Peabody, The RNA binding site of bacteriophage MS2 coat protein. The EMBO Journal, 1993. 12(2): 595-600.
15. Lim et al., The RNA-binding site of bacteriophage Qβ coat protein. The Journal of Biological Chemistry, 1996. 271(50): 31839-31845.
Disclosure of Invention
The invention aims to provide an engineered fusion protein capable of binding RNA, together with a corresponding nucleic acid construct, which can effectively enhance the efficiency of protein biosynthesis.
To achieve this object, the first aspect of the present invention provides a design in which a translation initiation factor element is modified by fusion to an RNA-binding protein.
Specifically, provided is a fusion protein, which has a structure shown in formula Ia or Ib:
A-B-C (Ia)
C-B-A (Ib);
wherein,
A is a translation initiation factor element;
B is absent or a linker peptide;
C is an RNA binding protein element;
each "-" is a peptide bond.
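The two arrangements of formulas Ia and Ib can be sketched as simple string concatenation of amino-acid sequences. This is a minimal illustration, not the construction method of the invention: the function name `assemble_fusion`, the toy element sequences, and the "GGGGS" linker are all hypothetical placeholders, since the text here gives no actual sequences for A, B, or C.

```python
def assemble_fusion(a: str, c: str, b: str = "", form: str = "Ia") -> str:
    """Join elements A, B, C in the order of formula Ia (A-B-C) or Ib (C-B-A).

    a: translation initiation factor element (amino-acid string)
    c: RNA binding protein element (amino-acid string)
    b: linker peptide; an empty string models "B is absent"
    """
    if form == "Ia":
        return a + b + c
    if form == "Ib":
        return c + b + a
    raise ValueError("form must be 'Ia' or 'Ib'")


# Toy placeholder sequences (hypothetical, for illustration only).
A_ELEMENT = "MAAAA"   # stands in for a translation initiation factor element
C_ELEMENT = "MCCCC"   # stands in for an RNA binding protein element
LINKER = "GGGGS"      # hypothetical linker peptide

print(assemble_fusion(A_ELEMENT, C_ELEMENT, LINKER, form="Ia"))
print(assemble_fusion(A_ELEMENT, C_ELEMENT, form="Ib"))
```

In a real construct the internal start methionine of the downstream element and the reading frame would need separate consideration; the sketch only shows the element order.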
Further, the translation initiation factor element is eIF3a.
In another preferred embodiment, the translation initiation factor element is from a eukaryotic cell.
In another preferred embodiment, the translation initiation factor element is from a yeast cell.
In another preferred embodiment, the yeast cell is selected from the group consisting of: Kluyveromyces, Saccharomyces cerevisiae, Pichia pastoris, or a combination thereof.
In another preferred embodiment, the Kluyveromyces is selected from the group consisting of: Kluyveromyces lactis, Kluyveromyces marxianus, Kluyveromyces dobzhanskii, or a combination thereof.
In another preferred embodiment, the translation initiation factor element is from a prokaryote.
Further, the prokaryote is a bacterium, an actinomycete, a fungus, or a combination thereof.
In another preferred embodiment, the RNA binding protein element sequences are all from a bacteriophage.
Further, the RNA binding protein element is the MS2 capsid protein (MS2 CP) or the Qβ capsid protein (Qβ CP).
In a second aspect, the present invention provides an isolated polynucleotide encoding a fusion protein according to the first aspect of the invention.
In a third aspect, the invention provides a vector or vector combination comprising a polynucleotide according to the second aspect.
In a fourth aspect, the invention provides a genetically engineered cell having a polynucleotide of the second aspect integrated at one or more sites in its genome or comprising a vector or combination of vectors of the third aspect.
In another preferred embodiment, the cell is selected from the group consisting of: Kluyveromyces, Saccharomyces cerevisiae, Pichia pastoris, or a combination thereof.
In another preferred embodiment, the cell is selected from the group consisting of: Kluyveromyces lactis, Kluyveromyces marxianus, Kluyveromyces dobzhanskii, or a combination thereof.
In a fifth aspect, the present invention provides a cell extract or a cell lysate of the genetically engineered cell of the fourth aspect, wherein the cell extract or the cell lysate contains the fusion protein of the first aspect.
In a sixth aspect of the invention, there is also provided a modified nucleic acid construct containing the mRNA binding sequence that corresponds to the fusion protein of the first aspect.
Specifically, an expression vector for use in combination with the fusion protein of the first aspect is provided, the expression vector comprising at least (a) a template DNA sequence; (b) the specific recognition binding site sequence of the RNA binding protein element.
Further, the expression vector further comprises one or more elements selected from the group consisting of: a promoter element, an enhancer element, the tobacco mosaic virus 5' leader sequence, a Kozak sequence, a poly(A) sequence, and a terminator element.
Further, the invention may be embodied as an expression vector for use with the fusion protein of the first aspect, and illustratively, the expression vector comprises a nucleic acid sequence having a structure represented by formula II:
Z1-Z2-Z3-Z4-Z5-Z8-Z6-Z7 (II)
wherein,
Z1-Z8 are each an element used to build the construct;
each "-" is independently a bond or a nucleotide linking sequence;
Z1 is a promoter element selected from the group consisting of: a T7 promoter, a T3 promoter, an SP6 promoter, or a combination thereof;
Z2 is absent or an enhancer element, including an IRES element;
Z3 is absent or the tobacco mosaic virus 5' leader sequence (omega sequence);
Z4 is absent or a Kozak sequence;
Z5 is the coding sequence of the foreign protein;
Z6 is absent or a poly(A) sequence selected from the group consisting of: 50A, 70A, 90A, or a combination thereof;
Z7 is a terminator element selected from the group consisting of: a T7 terminator, a T3 terminator, an SP6 terminator, or a combination thereof;
Z8 is the specific recognition and binding site sequence of the RNA binding protein of the fusion protein of the first aspect. Z8 may be located at any position of the expression vector and need not stand in any particular order relative to the other elements; its position in formula II is merely one example. The Z8 element is selected from the group consisting of: one or more mRNA binding sequences of MS2 CP or mutant sequences thereof that retain binding activity, one or more mRNA binding sequences of Qβ CP or mutant sequences thereof that retain binding activity, or a combination thereof.
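The element order of formula II, with its optional elements and the freely positioned Z8, can be sketched as follows. This is a hedged illustration rather than the construct design used by the inventors: the function `build_construct`, the `z8_after` parameter, and the element labels are hypothetical, and the elements are represented by names instead of real nucleotide sequences.

```python
from typing import Dict, List, Optional

# Order of formula II without Z8; Z8 is slotted in separately because the
# text states that its position in formula II is only one example.
BASE_ORDER = ["Z1", "Z2", "Z3", "Z4", "Z5", "Z6", "Z7"]
OPTIONAL = {"Z2", "Z3", "Z4", "Z6"}  # elements that may be absent


def build_construct(parts: Dict[str, Optional[str]], z8_after: str = "Z5") -> List[str]:
    """Return the ordered list of present elements, with Z8 placed after `z8_after`."""
    for key in BASE_ORDER + ["Z8"]:
        if key not in OPTIONAL and parts.get(key) is None:
            raise ValueError(f"{key} is required")
    if z8_after not in BASE_ORDER:
        raise ValueError("z8_after must name one of Z1..Z7")
    layout: List[str] = []
    for key in BASE_ORDER:
        if parts.get(key) is not None:
            layout.append(parts[key])
        if key == z8_after:
            layout.append(parts["Z8"])
    return layout


# Example: a template without enhancer and without poly(A), as formula II permits.
example = {
    "Z1": "T7 promoter",
    "Z2": None,
    "Z3": "omega leader",
    "Z4": "Kozak",
    "Z5": "foreign protein CDS",
    "Z6": None,
    "Z7": "T7 terminator",
    "Z8": "MS2 CP binding site",
}
print(" - ".join(build_construct(example)))
```

With the default `z8_after="Z5"` the output follows the Z5-Z8-Z6-Z7 order shown in formula II; passing another element name moves Z8 elsewhere.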
In a seventh aspect of the present invention, there is provided an in vitro cell-free protein synthesis system, wherein the synthesis system at least comprises the cell extract or cell lysate of the fifth aspect and the expression vector of the sixth aspect.
Further, the synthesis system further comprises one or more components selected from the group consisting of: substrates for the synthesis of RNA, substrates for the synthesis of proteins, polyethylene glycol, magnesium ions, potassium ions, buffers, RNA polymerase, energy regeneration systems, dithiothreitol, optionally an aqueous solvent.
The in vitro cell-free protein synthesis system may be a eukaryotic in vitro cell-free protein synthesis system, in particular one derived from yeast.
In another preferred embodiment, the yeast is selected from the group consisting of: Kluyveromyces, Saccharomyces cerevisiae, Pichia pastoris, or a combination thereof.
In an eighth aspect of the present invention, there is provided a method for synthesizing a foreign protein, the method comprising the steps of:
(i) constructing an expression vector of the sixth aspect containing a foreign protein template DNA sequence;
(ii) providing the in vitro cell-free protein synthesis system of the seventh aspect, and incubating for a period of time and under suitable conditions to synthesize the exogenous protein;
and optionally (iii) isolating or detecting the foreign protein.
In another preferred embodiment, the coding sequence of the foreign protein is from a prokaryote or a eukaryote.
In another preferred embodiment, the coding sequence of the foreign protein is from an animal, a plant, a microorganism, or a pathogen.
In another preferred embodiment, the coding sequence for the foreign protein is from a mammal, preferably a primate, a rodent, including human, mouse, rat.
According to a ninth aspect of the present invention, there is provided a kit comprising a container and, separately located in the container, each of the components of the in vitro cell-free protein synthesis system of the seventh aspect.
The tenth aspect of the present invention provides the use of each of the above aspects for improving the in vitro protein synthesis capacity of an in vitro cell-free protein synthesis system.
The invention fuses a translation initiation factor with an RNA binding protein and adds the specific recognition and binding site sequence of that RNA binding protein to the exogenous template, thereby increasing the probability that the target protein is translated and, in turn, improving the efficiency and yield of in vitro protein synthesis.
It is to be understood that, within the scope of the present invention, the features described above and those specifically described below (e.g., in the examples) may be combined with each other to form new or preferred embodiments. For reasons of space, these combinations are not enumerated here.
The main advantages of the invention include:
(a) the invention constructs a fusion protein by fusing a translation initiation factor element and an RNA binding protein for the first time, and the fusion protein is used in an in vitro cell-free protein synthesis system.
(b) The invention discovers for the first time that when an intracellular translation initiation factor element is linked to an RNA binding protein to form a fusion protein, and the fusion protein is used together with a template (expression vector) containing the specific recognition and binding site sequence of that RNA binding protein, ribosomal components can be recruited directly to the target mRNA, dependence on other translation initiation factors is reduced, and the in vitro protein synthesis capacity is significantly enhanced.
(c) The invention discovers for the first time that the polyA structure of the template can be replaced by the recognition site of RNA binding protein, the degradation of the template can be reduced without the polyA structure, and the in vitro protein synthesis capacity can still be obviously improved in the cell extract or cell lysate containing the fusion protein.
Drawings
FIG. 1 shows the pKMcas9_KleIF3aC plasmid map.
FIG. 2 is a schematic map of the constructed pMD18T-eIF3a-MS2 CP plasmid.
FIG. 3 is a schematic map of the constructed pMD18T-MS2 CP-eIF3a plasmid.
FIG. 4 is a schematic map of the constructed pMD18T-eIF3a-Qβ CP plasmid.
FIG. 5 is a schematic map of the constructed pMD18T-Qβ CP-eIF3a plasmid.
FIG. 6 shows the measured in vitro translation activity of the template containing the MS2 CP binding site in the eIF3a-MS2 CP strain.
FIG. 7 shows the measured in vitro translation activity of the template containing the Qβ CP binding site in the eIF3a-Qβ strain.
Detailed Description
After extensive and intensive research involving a large amount of screening and trial and error, the present inventors unexpectedly discovered for the first time that a fusion protein formed from a translation initiation factor and an RNA binding protein, when used together with a template (expression vector) containing the specific recognition and binding site sequence of that RNA binding protein, can greatly improve in vitro translation efficiency. In addition, the inventors found that the specific recognition and binding site sequence of the RNA binding protein can effectively replace the poly(A) structure of the original template without impairing protein synthesis activity. On this basis, the present inventors have completed the present invention.
RNA binding proteins
An RNA binding protein is a protein that can specifically recognize and bind a particular RNA structure, site, or sequence, thereby achieving specific binding between RNA and protein. Preferably, it is an mRNA binding protein that recognizes mRNA. The capsid proteins of bacteriophage MS2 and bacteriophage Qβ, namely the MS2 capsid protein and the Qβ capsid protein, are phage-derived RNA binding proteins that bind different RNA stem-loop structures to achieve specific protein-RNA recognition.
The specific recognition and binding site sequence of the RNA binding protein of the present invention refers to a polynucleotide sequence that is specifically recognized and bound by the RNA binding protein; it may be given as the corresponding DNA sequence or RNA sequence.
Eukaryotic in vitro cell-free protein synthesis system
The eukaryotic in vitro cell-free protein synthesis system is a coupled transcription-translation system based on eukaryotic cells. It can synthesize RNA starting from a DNA template, or carry out in vitro protein synthesis using DNA or RNA as the template. Eukaryotic cells include yeast cells, rabbit reticulocytes, wheat germ cells, insect cells, human cells, and the like. The advantages of a eukaryotic in vitro cell-free protein synthesis system include the ability to synthesize RNA or proteins with complex structures and to modify proteins post-translationally.
In the present invention, the eukaryotic in vitro cell-free protein synthesis system is not particularly limited. A preferred eukaryotic in vitro cell-free protein synthesis system is a yeast in vitro cell-free protein synthesis system, more preferably a Kluyveromyces lactis expression system.
Yeast combines the advantages of simple culture, efficient protein folding, and post-translational modification. Saccharomyces cerevisiae and Pichia pastoris are model organisms for expressing complex eukaryotic proteins and membrane proteins, and yeast can also be used as a raw material for preparing in vitro translation systems.
Kluyveromyces is a genus of ascosporogenous yeasts, of which Kluyveromyces marxianus and Kluyveromyces lactis are widely used in industry. Compared with other yeasts, Kluyveromyces lactis has many advantages, such as superior secretion capacity, better large-scale fermentation characteristics, food-grade safety, and the ability to modify proteins post-translationally.
In the present invention, the eukaryotic in vitro cell-free protein synthesis system at least comprises: eukaryotic cell extract or cell lysate.
Further, the synthesis system further comprises one or more components selected from the group consisting of: substrates for the synthesis of RNA, substrates for the synthesis of proteins, polyethylene glycol, magnesium ions, potassium ions, buffers, RNA polymerase, energy regeneration systems, dithiothreitol, optionally an aqueous solvent.
In a particularly preferred embodiment, the invention provides an in vitro cell-free protein synthesis system comprising: eukaryotic cell extract or cell lysate, 4-hydroxyethylpiperazine ethanesulfonic acid (HEPES), potassium acetate, magnesium acetate, adenosine triphosphate (ATP), guanosine triphosphate (GTP), cytidine triphosphate (CTP), thymidine triphosphate (TTP), an amino acid mixture, creatine phosphate, dithiothreitol (DTT), creatine phosphokinase, RNase inhibitor, luciferin, luciferase DNA, and RNA polymerase.
In the present invention, the RNA polymerase is not particularly limited and may be selected from one or more RNA polymerases, and a typical RNA polymerase is T7 RNA polymerase.
In the present invention, the proportion of the eukaryotic cell extract in the in vitro cell-free protein synthesis system is not particularly limited, and usually the eukaryotic cell extract accounts for 20 to 70%, preferably 30 to 60%, more preferably 40 to 50% of the in vitro cell-free protein synthesis system.
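As a small worked example of the proportions above, the extract volume for a given reaction can be computed directly. This is a sketch under stated assumptions: the text does not say whether the percentage is by volume, so volume/volume is assumed here, and the function name `extract_volume_ul` and the 30 µL reaction size are hypothetical.

```python
def extract_volume_ul(reaction_ul: float, fraction: float) -> float:
    """Volume of cell extract (µL) for a reaction, assuming a v/v proportion.

    `fraction` must lie within the broad 20-70% range stated in the text.
    """
    if not 0.20 <= fraction <= 0.70:
        raise ValueError("fraction outside the stated 20-70% range")
    return reaction_ul * fraction


# Hypothetical 30 µL reaction at the 'more preferred' 40-50% range.
low = extract_volume_ul(30.0, 0.40)
high = extract_volume_ul(30.0, 0.50)
print(f"extract: {low:.1f}-{high:.1f} µL of a 30.0 µL reaction")
```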
In the present invention, the eukaryotic cell extract does not contain intact cells. A typical eukaryotic cell extract includes the various RNA polymerases required for RNA synthesis, ribosomes for protein translation, transfer RNAs, aminoacyl-tRNA synthetases, the initiation and elongation factors required for protein synthesis, and termination release factors. In addition, the eukaryotic cell extract also contains other proteins, especially soluble proteins, originating from the cytoplasm of the eukaryotic cell.
In the present invention, the eukaryotic cell extract contains 20-100 mg/mL of protein, preferably 50-100 mg/mL. The method for determining the protein content is a Coomassie brilliant blue determination method.
In the present invention, the preparation method of the eukaryotic cell extract is not limited, and a preferred preparation method comprises the following steps:
(i) providing a eukaryotic cell;
(ii) washing the eukaryotic cells to obtain washed eukaryotic cells;
(iii) breaking the washed eukaryotic cells to obtain a crude extract of the eukaryotic cells;
(iv) subjecting the crude eukaryotic cell extract to solid-liquid separation to obtain the liquid fraction, i.e. the eukaryotic cell extract.
In the present invention, the solid-liquid separation method is not particularly limited, and a preferable method is centrifugation.
In a preferred embodiment, the centrifugation is carried out in the liquid state.
In the present invention, the centrifugation conditions are not particularly limited, and one preferable centrifugation condition is 5000-.
In the present invention, the centrifugation time is not particularly limited, and a preferable centrifugation time is 0.5 min to 2 h, preferably 20 min to 50 min.
In the present invention, the temperature of the centrifugation is not particularly limited, and it is preferable that the centrifugation is performed at 1 to 10 ℃, preferably, 2 to 6 ℃.
In the present invention, the washing treatment is not particularly limited; a preferred washing treatment uses a washing solution at pH 7 to 8 (preferably 7.4). The washing solution is not particularly limited and is typically selected from the group consisting of: potassium 4-hydroxyethylpiperazine ethanesulfonate, potassium acetate, magnesium acetate, and combinations thereof.
In the present invention, the manner of the cell disruption treatment is not particularly limited, and a preferable cell disruption treatment includes high-pressure disruption, freeze-thawing (e.g., liquid nitrogen low-temperature disruption).
The nucleoside triphosphate mixture in the in vitro cell-free protein synthesis system consists of adenosine triphosphate, guanosine triphosphate, cytidine triphosphate and uridine triphosphate. In the present invention, the concentration of each nucleoside triphosphate is not particularly limited; it is usually 0.5 to 5 mM, preferably 1.0 to 2.0 mM.
The amino acid mixture in the in vitro cell-free protein synthesis system can comprise natural or unnatural amino acids, and can comprise D-type or L-type amino acids. Representative amino acids include (but are not limited to) the 20 natural amino acids: glycine, alanine, valine, leucine, isoleucine, phenylalanine, proline, tryptophan, serine, tyrosine, cysteine, methionine, asparagine, glutamine, threonine, aspartic acid, glutamic acid, lysine, arginine, and histidine. The concentration of each amino acid is usually 0.01-0.5 mM, preferably 0.02-0.2 mM, such as 0.05, 0.06, 0.07, 0.08 mM.
In a preferred embodiment, the in vitro cell-free protein synthesis system further comprises polyethylene glycol or an analog thereof. The concentration of polyethylene glycol or its analogue is not particularly limited, and generally, the concentration (w/v) of polyethylene glycol or its analogue is 0.1 to 8%, preferably 0.5 to 4%, more preferably 1 to 2%, based on the total weight of the biosynthesis system. Representative PEG examples include (but are not limited to): PEG3000, PEG8000, PEG6000 and PEG 3350. It is understood that the systems of the present invention may also include other polyethylene glycols of various molecular weights (e.g., PEG200, 400, 1500, 2000, 4000, 6000, 8000, 10000, etc.).
In a preferred embodiment, the in vitro cell-free protein synthesis system further comprises sucrose. The concentration of sucrose is not particularly limited, and generally, the concentration of sucrose is 0.03 to 40wt%, preferably 0.08 to 10wt%, more preferably 0.1 to 5wt%, based on the total weight of the protein synthesis system.
Vector, genetically engineered cell
The invention also provides a vector or combination of vectors comprising the nucleic acid construct of the invention. Preferably, the vector is selected from: bacterial plasmids, bacteriophages, yeast plasmids, animal cell vectors and shuttle vectors; alternatively, the vector is a transposon vector. Methods for preparing recombinant vectors are well known to those of ordinary skill in the art. Any plasmid or vector may be used as long as it can replicate and is stable in the host.
One of ordinary skill in the art can use well-known methods to construct expression vectors containing the promoter and/or gene sequences of interest described herein. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques, in vivo recombinant techniques, and the like.
The invention also provides a genetic engineering cell, which contains the construct or the vector combination, or the chromosome of the genetic engineering cell is integrated with the construct or the vector combination. In another preferred embodiment, the genetically engineered cell further comprises a vector comprising a gene encoding a transposase or having a transposase gene integrated into its chromosome.
Preferably, the genetically engineered cell is a eukaryotic cell.
In another preferred embodiment, the eukaryotic cell includes (but is not limited to): human cells, Chinese hamster ovary cells, insect cells, wheat germ cells, rabbit reticulocytes and other higher eukaryotic cells.
In another preferred embodiment, the eukaryotic cell includes (but is not limited to): a yeast cell (preferably, a kluyveromyces cell, more preferably a kluyveromyces lactis cell).
The constructs or vectors of the invention may be used to transform appropriate genetically engineered cells.
Template DNA (template DNA)
The template DNA is a nucleotide sequence for coding any target protein to be synthesized, can be an original sequence, and also can be a synthetic sequence or a modified sequence, and the template DNA can be used for synthesizing corresponding RNA and/or protein.
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Experimental procedures in the following examples for which no specific conditions are noted generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the manufacturer's recommendations. Unless otherwise indicated, percentages and parts are percentages and parts by weight. The invention takes Kluyveromyces lactis (abbreviated K. lactis or Kl) as an example; the same design, analysis and experimental methods are also applicable to other eukaryotic cells, such as other yeasts and animal cells, and to prokaryotic cells.
The gene editing techniques used in the examples of the present invention are all conventional techniques. One of them is used as an example, which does not mean that editing can only be performed with this technique.
Unless otherwise specified, the materials and reagents used in the examples of the present invention are commercially available products.
Example 1 Targeted insertion of MS2 CP at the C-terminus or N-terminus of the eIF3a gene
The nucleotide sequence encoding KleIF3a is shown in SEQ ID NO: 1, and its amino acid sequence is shown in SEQ ID NO: 2. The nucleotide sequence of the linker sequence connecting KleIF3a and MS2 CP is shown in SEQ ID NO: 3, and its amino acid sequence is shown in SEQ ID NO: 4. The nucleotide sequence encoding MS2 CP is shown in SEQ ID NO: 5, and its amino acid sequence is shown in SEQ ID NO: 6. The nucleotide sequence of the KleIF3a-MS2 CP fusion protein is shown in SEQ ID NO: 7, and its amino acid sequence is shown in SEQ ID NO: 8.
KleIF3a and MS2 CP were joined, with or without a linker sequence, to construct the KleIF3a-MS2 CP fusion protein.
1.1 Kl sequence determination
(1) Based on the design for inserting MS2 CP at the C-terminus of the KleIF3a gene, a PAM sequence (NGG) was selected and the corresponding gRNA sequence was determined. The principles of gRNA selection in this example are: moderate GC content (40%-60%) and avoidance of poly-T structures. In this example, the KleIF3aC gRNA sequence is GAACTTCTTGAAAAGAAGAA.
The plasmid construction and transformation method is as follows: PCR amplification was performed with primers pCas9-KleIF3aC-gRNA-PF: GAACTTCTTGAAAAGAAGAAGTTTTAGAGCTAGAAATAGC and pCas9-KleIF3aC-gRNA-PR: TTCTTCTTTTCAAGAAGTTCAAAGTCCCATTCGCCACCCG, using the pCAS plasmid as a template. To 17 μL of the amplification product, 0.2 μL Dpn I and 2 μL 10× digestion buffer were added, mixed, and incubated in a 37 °C water bath for 3 h. 10 μL of the Dpn I-treated product was added to 50 μL DH5α competent cells, placed on ice for 30 min, and heat-shocked at 42 °C for 45 s; 1 mL of LB liquid medium was then added and the cells were cultured with shaking at 37 °C for 1 h, spread on a Kan-resistant LB plate, and cultured inverted at 37 °C until single colonies grew. Two single colonies were picked and each cultured with shaking in LB liquid medium for several days; positive clones were detected by colony PCR and confirmed by sequencing, and the plasmid was extracted, stored, and named pKMCas9_KleIF3aC.
(2) Based on the design for inserting MS2 CP at the N-terminus of the KleIF3a gene, a PAM sequence (NGG) was selected and the corresponding gRNA sequence was determined. The principles of gRNA selection in this example are: moderate GC content (40%-60%) and avoidance of poly-T structures. In this example, the upstream gene sequence of KleIF3a is shown in SEQ ID NO: 9. The selected KleIF3aN gRNA sequence is ATGATATCAACTTTTGATGA.
The plasmid construction and transformation method is as follows: PCR amplification was performed with primers pCas9-KleIF3aN-gRNA-PF: ATGATATCAACTTTTGATGAGTTTTAGAGCTAGAAATAGC and pCas9-KleIF3aN-gRNA-PR: TCATCAAAAGTTGATATCATAAAGTCCCATTCGCCACCCG, using the pCAS plasmid as a template. To 17 μL of the amplification product, 0.2 μL Dpn I and 2 μL 10× digestion buffer were added, mixed, and incubated in a 37 °C water bath for 3 h. 10 μL of the Dpn I-treated product was added to 50 μL DH5α competent cells, placed on ice for 30 min, and heat-shocked at 42 °C for 45 s; 1 mL of LB liquid medium was then added and the cells were cultured with shaking at 37 °C for 1 h, spread on a Kan-resistant LB plate, and cultured inverted at 37 °C until single colonies grew. Two single colonies were picked and cultured with shaking in LB liquid medium for several days; positive clones were detected by colony PCR and confirmed by sequencing, and the plasmid was extracted, stored, and named pKMCas9_KleIF3aN.
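The two gRNA primer pairs listed above follow a common pattern: each forward primer is the 20-nt guide followed by a constant 20-nt tail, and each reverse primer is the reverse complement of the guide followed by a second constant tail. The following minimal sketch reproduces that pattern. The function and constant names are hypothetical, and the reading of the two constant tails as the regions that anneal to the pCAS template is an assumption, not something stated in the text.

```python
# Constant 3' tails copied from the primers given in the text.
# Interpreting them as pCAS-annealing regions is an assumption.
FORWARD_TAIL = "GTTTTAGAGCTAGAAATAGC"
REVERSE_TAIL = "AAAGTCCCATTCGCCACCCG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def guide_primers(guide: str) -> tuple[str, str]:
    """Build the (forward, reverse) primer pair for a 20-nt guide sequence."""
    guide = guide.upper()
    forward = guide + FORWARD_TAIL
    reverse = reverse_complement(guide) + REVERSE_TAIL
    return forward, reverse


if __name__ == "__main__":
    for name, guide in [("KleIF3aC", "GAACTTCTTGAAAAGAAGAA"),
                        ("KleIF3aN", "ATGATATCAACTTTTGATGA")]:
        pf, pr = guide_primers(guide)
        print(name, "PF:", pf)
        print(name, "PR:", pr)
```

Run on the two guide sequences of this example, the sketch yields the same four primer sequences as listed in the text, which is a convenient consistency check when designing primers for a new guide.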
1.2 Donor DNA plasmid construction and amplification
To facilitate the storage and amplification of linear donor DNA, this example inserts the donor DNA into the pMD18T plasmid and amplifies by PCR to obtain a linear donor DNA sequence.
Using the pMD18T plasmid as a template, PCR amplification was performed with primers pMD18T-PF: ATCGTCGACCTGCAGGCATG and pMD18T-PR: ATCTCTAGAGGATCCCCGGG. To 17 μL of the amplification product, 1 μL Dpn I and 2 μL 10× digestion buffer were added, mixed, and incubated in a 37 °C water bath for 3 h to obtain the linear plasmid backbone fragment pMD18T-vector.
Construction of donor plasmids pMD18T-eIF3a-MS2 CP and pMD18T-MS2 CP-eIF3a
(1) Using Kluyveromyces lactis genomic DNA as a template, PCR amplification was performed with primers KleIF3aC-HR1-PF: AATGAAGAAGTTGATTGCTG and KleIF3aC-gRNA mutant-PR: CTACCATTTTTTTTCTCCAACAACTCTTCGATTTCTCTTTGCTTTC; the product was named eIF3aC-F1. Using Kluyveromyces lactis genomic DNA as a template, PCR amplification was performed with primers KleIF3aC-gRNA mutant-PF: TCGAAGAGTTGTTGGAGAAAAAAAATGGTAGCTCTAGATCTAGCCCTG and KleIF3aC-HR1-PR: ACCAACAGTAGACTTACCAGCATGTGGACCTTGAACTTCATCTTGAGTTGAACCTCCACCTCCAGATCCACCTCCACCACCTCTTCCAGCATTCATTC; the product was named eIF3aC-F2. Using synthetic MS2 CP-encoding DNA as a template, PCR amplification was performed with primers MS2 CP-PF1: ACTCAAGATGAAGTTCAAGGTCCACATGCTGGTAAGTCTACTGTTGGTGGAGGTGGATCTATGGCTTCTAATTTCACTCA and MS2 CP-PR: ATAAATACCAGAATTAGCAG; the product was named MS2 CP.
Using Kluyveromyces lactis genomic DNA as a template, PCR amplification was performed with primers KleIF3aC-HR2-PF: TCTGCTATTGCTGCTAATTCTGGTATTTATTAATATCTTGCATATCTCATTC and KleIF3aC-HR2-PR: CCGTAGTGGGGTTCTCAATC; the product was named eIF3aC-F3.
1 μL each of the amplification products eIF3aC-F1, eIF3aC-F2, MS2 CP, eIF3aC-F3 and pMD18T-vector was added to 5 μL Cloning Mix (pEASY-Uni Seamless Cloning and Assembly Kit, from TransGen Biotech, the same below), mixed, and incubated in a 50 °C water bath for 1 h. After the water bath, the reaction was placed on ice for 2 min; 10 μL of the reaction solution was added to 50 μL of Trans-T1 competent cells (from TransGen Biotech, the same below), placed on ice for 30 min, and heat-shocked at 42 °C for 30 s; 1 mL of LB liquid medium was then added and the cells were cultured with shaking at 37 °C for 1 h, spread on Amp-resistant LB solid medium, and cultured inverted at 37 °C until single colonies grew. Six single colonies were picked and cultured with shaking in LB liquid medium; after PCR detection was positive and sequencing confirmed the clones, the plasmid was extracted, stored, and named pMD18T-eIF3a-MS2 CP (see FIG. 2). The sequence of the donor fragment is shown in SEQ ID NO: 10.
(2) Using Kluyveromyces lactis genomic DNA as a template, PCR amplification was performed with primers KleIF3aN-HR1-PF: GTACCCGGGGATCCTCTAGAGATCCAACTCCGTAGACGACATTTTGAAAAACGGTGC and KleIF3aN-gRNA mutant-PR: ACCTTCGTCGAACGTCGAGATCATTGAGAACGAAGTTGAAGTTTATGTGCTATGTCGC; the product was named eIF3aN-F1. Using Kluyveromyces lactis genomic DNA as a template, PCR amplification was performed with primers KleIF3aN-gRNA mutant-PF: TTCTCAATGATCTCGACGTTCGACGAAGGTTGAAATTTTTCAGCATGCTTCCGTTGAAG and KleIF3aN-HR1-PR: ACCAGCATGTGGACCTTGAACTTCATCTTGAGTGATGAAATGCTAGTATAATCGATGG; the product was named eIF3aN-F2.
Using synthetic MS2 CP-encoding DNA as a template, PCR amplification was performed with primers MS2 CP-PF1: ACTCAAGATGAAGTTCAAGGTCCACATGCTGGTAAGTCTACTGTTGGTGGAGGTGGATCTATGGCTTCTAATTTCACTCA and MS2 CP-PR: ATAAATACCAGAATTAGCAG; the product was named MS2 CP.
Using Kluyveromyces lactis genomic DNA as a template, PCR amplification was performed with primers KleIF3aN-HR2-PF: CTGCTATTGCTGCTAATTCTGGTATTTATATGGCTCCACCGGCTTTACGTCCTGAG and KleIF3aN-HR2-PR: CATGCCTGCAGGTCGACGATTCGTAGTAGTTGGCCAACACAGAGGGCTTTGGAGCACG; the product was named eIF3aN-F3.
1 μL each of the amplification products eIF3aN-F1, eIF3aN-F2, MS2 CP, eIF3aN-F3 and pMD18T-vector was added to 5 μL Cloning Mix (pEASY-Uni Seamless Cloning and Assembly Kit, from TransGen Biotech, the same below), mixed, and incubated in a 50 °C water bath for 1 h. After the water bath, the reaction was placed on ice for 2 min; 10 μL of the reaction solution was added to 50 μL of Trans-T1 competent cells (from TransGen Biotech, the same below), placed on ice for 30 min, and heat-shocked at 42 °C for 30 s; 1 mL of LB liquid medium was then added and the cells were cultured with shaking at 37 °C for 1 h, spread on Amp-resistant LB solid medium, and cultured inverted at 37 °C until single colonies grew. Six single colonies were picked and cultured with shaking in LB liquid medium; after PCR detection was positive and sequencing confirmed the clones, the plasmid was extracted, stored, and named pMD18T-MS2 CP-eIF3a (see FIG. 3). The DNA sequence of the donor fragment is shown in SEQ ID NO: 11.
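Seamless assembly of the fragments above relies on adjacent fragments sharing terminal homology, which here is built into the primers. As an illustrative check (a sketch with hypothetical function names, not part of the described method), one can test whether two oppositely oriented primers that define a junction share a stretch of sequence. The minimum overlap of 15 bp is an assumed threshold chosen only for this illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primers_overlap(primer_a: str, primer_b: str, min_len: int = 15) -> bool:
    """True if some min_len-nt window of primer_a's reverse complement
    occurs in primer_b, i.e. the fragment ends they define share homology."""
    rc = reverse_complement(primer_a)
    target = primer_b.upper()
    return any(rc[i:i + min_len] in target
               for i in range(len(rc) - min_len + 1))


# Primer sequences copied from the text.
PMD18T_PF = "ATCGTCGACCTGCAGGCATG"
PMD18T_PR = "ATCTCTAGAGGATCCCCGGG"
MS2CP_PR = "ATAAATACCAGAATTAGCAG"
EIF3AN_HR1_PF = "GTACCCGGGGATCCTCTAGAGATCCAACTCCGTAGACGACATTTTGAAAAACGGTGC"
EIF3AN_HR2_PR = "CATGCCTGCAGGTCGACGATTCGTAGTAGTTGGCCAACACAGAGGGCTTTGGAGCACG"
EIF3AC_HR2_PF = "TCTGCTATTGCTGCTAATTCTGGTATTTATTAATATCTTGCATATCTCATTC"

if __name__ == "__main__":
    print("vector / eIF3aN-F1 junction:", primers_overlap(PMD18T_PR, EIF3AN_HR1_PF))
    print("eIF3aN-F3 / vector junction:", primers_overlap(PMD18T_PF, EIF3AN_HR2_PR))
    print("MS2 CP / eIF3aC-F3 junction:", primers_overlap(MS2CP_PR, EIF3AC_HR2_PF))
```

A check of this kind only confirms that the primer tails are mutually consistent; it says nothing about assembly efficiency.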
1.3 K. lactis electrotransformation
Competent cells were taken from a -80 °C freezer and thawed on ice; 400 ng of gRNA & Cas9 plasmid (or gRNA/Cas9 fragment) and 1000 ng of donor DNA fragment were added, mixed, transferred into an electroporation cuvette, and kept on ice for 2 min. The cuvette was placed in an electroporator and pulsed (parameters: 1.5 kV, 200 Ω, 25 μF). Immediately after the pulse, 700 μL of YPD was added and the cells were incubated on a shaker at 30 °C and 200 rpm for 1-3 h. 2-200 μL of the culture was spread onto YPD plates containing G418 and cultured at 30 °C for 2-3 days until single colonies appeared.
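For orientation, the resistance and capacitance settings above imply a nominal pulse time constant τ = R·C. The short sketch below computes it. The function name is hypothetical, and the value is only a nominal upper bound under the assumption of an ideal RC discharge, since the sample's own resistance is not given in the text.

```python
def rc_time_constant_ms(resistance_ohm: float, capacitance_uF: float) -> float:
    """Nominal RC time constant in milliseconds (ideal exponential decay)."""
    seconds = resistance_ohm * capacitance_uF * 1e-6
    return seconds * 1e3


if __name__ == "__main__":
    # Settings from the text: 200 ohm, 25 uF -> about 5 ms.
    print(f"{rc_time_constant_ms(200, 25):.1f} ms")
```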
1.4 Positive identification
(1) 12-24 single colonies were picked from the transformed plates and, using the cells as templates, PCR detection was performed with primers eIF3aC-CF: AGGCAAAGACGTATGGCTGA and eIF3aC-CR: TATGGACATGCCTTCATGGC. The positive band is about 2319 bp and the negative band is about 1836 bp. A strain with a positive PCR result that was confirmed by sequencing was determined to be a positive strain and named eIF3a-MS2.
(2) 12-24 single colonies were picked from the transformed plates and, using the cells as templates, PCR detection was performed with primers eIF3aN-CF: GTACCCGGGGATCCTCTAGAGATCCAACTCCGTAGACGACATTTTGAAAAACGGTGC and eIF3aN-CR: CATGCCTGCAGGTCGACGATTCGTAGTAGTTGGCCAACACAGAGGGCTTTGGAGCACG. The positive band is about 2353 bp and the negative band is about 1903 bp. A strain with a positive PCR result that was confirmed by sequencing was determined to be a positive strain and named MS2-eIF3a.
Example 2 Targeted insertion of Qβ CP at the C-terminus or N-terminus of the eIF3a gene
The nucleotide sequence encoding KleIF3a is shown in SEQ ID NO: 1, and its amino acid sequence is shown in SEQ ID NO: 2. The nucleotide sequence encoding Qβ CP is shown in SEQ ID NO: 12, and its amino acid sequence is shown in SEQ ID NO: 13. The nucleotide sequence of the KleIF3a-Qβ fusion protein is shown in SEQ ID NO: 14, and its amino acid sequence is shown in SEQ ID NO: 15.
KleIF3a and Qβ CP were joined, with or without a linker sequence.
2.1 Kl plasmids
The gRNA design for inserting Qβ CP at the C-terminus of the KleIF3a gene is the same as in (1) of Example 1.1;
the gRNA design for inserting Qβ CP at the N-terminus of the KleIF3a gene is the same as in (2) of Example 1.1.
2.2 Donor DNA plasmid construction and amplification
The pMD18T-vector, eIF3aC-F1 and eIF3aN-F1 used are the same as in Example 1.
Construction of donor plasmids pMD18T-eIF3a-Qβ CP and pMD18T-Qβ CP-eIF3a
(1) Using Kluyveromyces lactis genomic DNA as a template, PCR amplification was performed with primers KleIF3a-gRNA mutant-PF: tcgaagaGTtGTtGgaGaaAaaAaaTggtagctctagatctagccctg and KleIF3aC-HR1-PR2: ACCTAGAGTTACCGTTTCTAGTTTGGCCATacctcttccagcattcattc; the product was named eIF3aC-F4.
Using the synthesized Qβ CP-encoding DNA as a template, PCR amplification was performed with primers Qβ CP-PF: ATGGCCAAACTAGAAACGGT and Qβ CP-PR: ATAAGCTGGGTTTAGTTGGT; the product was named Qβ CP.
Using Kluyveromyces lactis genomic DNA as a template, PCR amplification was performed with primers KleIF3aC-HR2-PF2: GACGCCATAGACCAACTAAACCCAGCTTATTAATATCTTGCATATCTCATTC and KleIF3aC-HR2-PR: ccgtagtggggttctcaatc; the product was named eIF3aC-F5.
1 μL each of the amplification products eIF3aC-F1, eIF3aC-F4, Qβ CP, eIF3aC-F5 and pMD18T-vector was added to 5 μL Cloning Mix, and molecular cloning was carried out by the same method as in Example 1 to obtain the target plasmid, named pMD18T-eIF3a-Qβ CP (see FIG. 4). The DNA sequence of the donor fragment is shown in SEQ ID NO: 16.
(2) Using Kluyveromyces lactis genomic DNA as a template, PCR amplification was performed with primers KleIF3aN-gRNA mutant-PF: TTCTCAATGATCTCGACGTTCGACGAAGGTTGAAATTTTTCAGCATGCTTCCGTTGAAG and KleIF3aN-HR1-PR2: GTTACCTAGAGTTACCGTTTCTAGTTTGGCCATGATGAAATGCTAGTATAATCGATGGG; the product was named eIF3aN-F4.
Using the synthesized Qβ CP-encoding DNA as a template, PCR amplification was performed with primers Qβ CP-PF: ATGGCCAAACTAGAAACGGT and Qβ CP-PR: ATAAGCTGGGTTTAGTTGGT; the product was named Qβ CP.
Using Kluyveromyces lactis genomic DNA as a template, PCR amplification was performed with primers KleIF3aN-HR2-PF2: CCATAGACCAACTAAACCCAGCTTATtaaATGGCTCCACCGGCTTTACGTCCTGAG and KleIF3aN-HR2-PR: CATGCCTGCAGGTCGACGATTCGTAGTAGTTGGCCAACACAGAGGGCTTTGGAGCACG; the product was named eIF3aN-F5.
1 μL each of the amplification products eIF3aN-F1, eIF3aN-F4, Qβ CP, eIF3aN-F5 and pMD18T-vector was added to 5 μL Cloning Mix, and molecular cloning was carried out by the same method as in Example 1 to obtain the target plasmid, named pMD18T-Qβ CP-eIF3a (see FIG. 5). The DNA sequence of the donor fragment is shown in SEQ ID NO: 17.
2.3 K. lactis electrotransformation (same as Example 1)
2.4 Positive identification
(1) The primers and method used for identification are the same as in Example 1; the positive band is about 2238 bp and the negative band is about 1836 bp. A strain with a positive PCR result that was confirmed by sequencing was determined to be a positive strain and named eIF3a-Qβ.
(2) The primers and method used for identification are the same as in Example 1; the positive band is about 2305 bp and the negative band is about 1903 bp. A strain with a positive PCR result that was confirmed by sequencing was determined to be a positive strain and named Qβ-eIF3a.
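The colony PCR screens in Examples 1 and 2 distinguish edited from unedited loci by band size. The sketch below tabulates the expected sizes quoted in the text and classifies an observed band. The function names, the ASCII strain labels ("Qbeta" for Qβ) and the 100 bp tolerance are illustrative assumptions, and sequencing remains the confirming step as described.

```python
# (positive_bp, negative_bp) for each strain, as quoted in the text.
EXPECTED_BANDS_BP = {
    "eIF3a-MS2": (2319, 1836),
    "MS2-eIF3a": (2353, 1903),
    "eIF3a-Qbeta": (2238, 1836),
    "Qbeta-eIF3a": (2305, 1903),
}


def insert_size_bp(strain: str) -> int:
    """Size difference between the positive and negative bands."""
    positive, negative = EXPECTED_BANDS_BP[strain]
    return positive - negative


def classify_band(strain: str, observed_bp: float, tolerance_bp: float = 100) -> str:
    """Label an observed band as 'positive', 'negative' or 'ambiguous'.

    tolerance_bp is a hypothetical gel-reading tolerance, not a value
    taken from the text.
    """
    positive, negative = EXPECTED_BANDS_BP[strain]
    if abs(observed_bp - positive) <= tolerance_bp:
        return "positive"
    if abs(observed_bp - negative) <= tolerance_bp:
        return "negative"
    return "ambiguous"


if __name__ == "__main__":
    for strain in EXPECTED_BANDS_BP:
        print(strain, "insert adds", insert_size_bp(strain), "bp")
    print(classify_band("eIF3a-MS2", 2300))
```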
Example 3 construction of an in vitro cell-free protein Synthesis System plasmid containing an RNA-binding protein MS2 CP-specific recognition binding site sequence
3.1 construction of plasmid pKM01 (SEQ ID NO: 18)
3.1.1 A sequence containing two MS2 CP-specific recognition binding sites (SEQ ID NO: 19) was inserted between the reporter gene eGFP and the polyA in the control template SP plasmid (SEQ ID NO: 20).
Using SEQ ID NO: 19 as a template, PCR amplification was performed with primers OeGFP vF: GGACGAGCTGTACAAGTAAAAACATGAGGATTACCCATGTTATATCG and LT-R: GGCTATGTCGCTTTTTTTTTTTTATTGGCATCC to obtain the insert fragment;
using the SP plasmid as a template, PCR amplification was performed with primers LAC4ter F: ATAAGGATTAATTACTTGGATGCCAATAAAAAAAAAAAAGCGACATAGCC and OeGFP iR: GGGTAATCCTCATGTTTTTACTTGTACAGCTCGTCCATGCCG to obtain the vector fragment.
3.1.2 Transformation of DH5α competent cells
10 μL each of the insert fragment amplification product and the vector fragment amplification product were mixed; 1 μL Dpn I was added to the 20 μL of amplification products and incubated at 37 °C for 6 h; 4 μL of the Dpn I-treated product was added to 50 μL DH5α competent cells, placed on ice for 30 min, heat-shocked at 42 °C for 45 s, and placed on ice for 3 min; 200 μL of LB liquid medium was added and the cells were cultured with shaking at 37 °C for 4 h, then spread on LB solid medium containing Amp and cultured overnight; 6 single colonies were picked for expansion culture, confirmed by sequencing, and the plasmid was extracted and stored.
3.2 construction of plasmid pKM02 (SEQ ID NO: 21):
3.2.1 A sequence containing two MS2 CP-specific recognition binding sites (SEQ ID NO: 19) was inserted into the control template SP plasmid (SEQ ID NO: 20), replacing the polyA.
Using SEQ ID NO: 19 as a template, PCR amplification was performed with primers OeGFP vF: GGACGAGCTGTACAAGTAAAAACATGAGGATTACCCATGTTATATCG and LT-R: GGCTATGTCGCTTTTTTTTTTTTATTGGCATCC to obtain the insert fragment;
using the SP plasmid as a template, PCR amplification was performed with primers (LT) T7 ter-F: GGATGCCAATAAAAAAAAAAAAGCGACATAGCCCTAGCATAACCCCTTGGGGCC and OeGFP iR: GGGTAATCCTCATGTTTTTACTTGTACAGCTCGTCCATGCCG to obtain the vector fragment.
3.2.2 Transformation of DH5α competent cells (same as 3.1.2)
3.3 construction of plasmid pKM03 (SEQ ID NO: 22):
3.3.1 A sequence containing three MS2 CP-specific recognition binding sites (SEQ ID NO: 23) was inserted into the SP plasmid (SEQ ID NO: 20), replacing the polyA.
Using SEQ ID NO: 23 as a template, PCR amplification was performed with primers GFP14MS F: CGAGCTGTACAAGTAAATAAGGATTAATTAAAACATGAGGATTACCCATGTTATATCGC and lac 36R: GGCTATGTCGCTTTTTTTTTTTTATTGGCATCCAAG to obtain the insert fragment;
using the SP plasmid as a template, PCR amplification was performed with primers (LT) T7 ter-F: GGATGCCAATAAAAAAAAAAAAGCGACATAGCCCTAGCATAACCCCTTGGGGCC and MS14GFP R: CATGGGTAATCCTCATGTTTTAATTAATCCTTATTTACTTGTACAGCTCGTCCATGCCG to obtain the vector fragment.
3.3.2 Transformation of DH5α competent cells (same as 3.1.2)
Example 4 Construction of an in vitro cell-free protein synthesis system plasmid containing an RNA-binding protein Qβ CP-specific recognition binding site sequence
4.1 construction of plasmid pKM04 (SEQ ID NO: 24):
4.1.1 A sequence containing two Qβ CP-specific recognition binding sites (SEQ ID NO: 25) was inserted between the reporter gene eGFP and the polyA in the SP plasmid (SEQ ID NO: 20).
Using SEQ ID NO: 25 as a template, PCR amplification was performed with primers GFP14Q 5F: GTACAAGTAAATAAGGATTAATTATAAGGATTAATTGCATGTCTAAGACAGCAATAAGG and LT-R: GGCTATGTCGCTTTTTTTTTTTTATTGGCATCC to obtain the insert fragment;
using the SP plasmid as a template, PCR amplification was performed with primers lac 36F: CTTGGATGCCAATAAAAAAAAAAAAGCGACATAGCC and Q514GFP R: GACATGCAATTAATCCTTATAATTAATCCTTATTTACTTGTACAGCTCGTCCATGCCG to obtain the vector fragment.
4.1.2 Transformation of DH5α competent cells (same as 3.1.2).
4.2 construction of plasmid pKM05 (SEQ ID NO: 26)
4.2.1 A sequence containing three Qβ CP-specific recognition binding sites (SEQ ID NO: 27) was inserted between the reporter gene eGFP and the polyA in the SP plasmid (SEQ ID NO: 20).
Using SEQ ID NO: 27 as a template, PCR amplification was performed with primers GFP14Q 5F: GTACAAGTAAATAAGGATTAATTATAAGGATTAATTGCATGTCTAAGACAGCAATAAGG and LT-R: GGCTATGTCGCTTTTTTTTTTTTATTGGCATCC to obtain the insert fragment;
using the SP plasmid as a template, PCR amplification was performed with primers lac 36F: CTTGGATGCCAATAAAAAAAAAAAAGCGACATAGCC and Q514GFP R: GACATGCAATTAATCCTTATAATTAATCCTTATTTACTTGTACAGCTCGTCCATGCCG to obtain the vector fragment.
4.2.2 Transformation of DH5α competent cells (same as 3.1.2).
4.3 construction of plasmid pKM06 (SEQ ID NO: 28):
4.3.1 A sequence containing three Qβ CP-specific recognition binding sites (SEQ ID NO: 27) was inserted into the SP plasmid (SEQ ID NO: 20), replacing the polyA.
Using SEQ ID NO: 27 as a template, PCR amplification was performed with primers GFP14Q 5F: GTACAAGTAAATAAGGATTAATTATAAGGATTAATTGCATGTCTAAGACAGCAATAAGG and LT-R: GGCTATGTCGCTTTTTTTTTTTTATTGGCATCC to obtain the insert fragment;
using the SP plasmid as a template, PCR amplification was performed with primers (LT) T7 ter-F: GGATGCCAATAAAAAAAAAAAAGCGACATAGCCCTAGCATAACCCCTTGGGGCC and Q514GFP R: GACATGCAATTAATCCTTATAATTAATCCTTATTTACTTGTACAGCTCGTCCATGCCG to obtain the vector fragment.
4.3.2 Transformation of DH5α competent cells (same as 3.1.2).
Example 5 expression of templates containing sequences that specifically recognize binding sites in the in vitro cell-free protein Synthesis System prepared in example 1 or 2
5.1 preparation of cell extract (cell lysate)
The preparation method of the cell extracting solution (cell lysate) comprises the following steps:
(i) providing cells, namely the eIF3a-MS2 and MS2-eIF3a cells prepared in Example 1 and the eIF3a-Qβ and Qβ-eIF3a cells prepared in Example 2, respectively;
(ii) washing the cells to obtain washed cells;
(iii) subjecting the washed cells to cell disruption treatment, thereby obtaining a crude cell extract;
(iv) subjecting the crude cell extract to solid-liquid separation to obtain the liquid fraction, i.e. the cell extract (cell lysate).
The solid-liquid separation method is not particularly limited, and the separation method selected in this example is centrifugation. The centrifugation parameters were 4 ℃ at 30000 Xg, and centrifugation for 30 min.
The washing treatment method is not particularly limited, and the washing treatment method selected in this example is a method of treating the substrate with a washing solution at a pH of 7.4, the washing solution is not particularly limited, and typically the washing solution is selected from the following group: potassium 4-hydroxyethyl piperazine ethanesulfonate, potassium acetate, magnesium acetate, or combinations thereof. Potassium acetate was chosen for this example.
Wherein, the cell disruption treatment is not particularly limited, and a preferable cell disruption treatment includes high pressure disruption, freeze-thaw (e.g., liquid nitrogen low temperature) disruption.
5.2 preparation of in vitro cell-free protein Synthesis System
The system contains, at final concentrations: 22 mM 4-hydroxyethylpiperazine ethanesulfonic acid (pH 7.4), 30-150 mM potassium acetate, 1.0-5.0 mM magnesium acetate, 1.5-4 mM nucleoside triphosphate mixture (adenosine triphosphate, guanosine triphosphate, cytidine triphosphate and uridine triphosphate), 0.08-0.24 mM natural or unnatural amino acid mixture (including but not limited to glycine, alanine, valine, leucine, isoleucine, phenylalanine, proline, tryptophan, serine, tyrosine, cysteine, methionine, asparagine, glutamine, threonine, aspartic acid, glutamic acid, lysine, arginine and histidine), 25 mM creatine phosphate, 1.7 mM dithiothreitol, 0.027-0.054 mg/mL T7 RNA polymerase, 0.27 mg/mL creatine phosphate kinase, 1%-4% polyethylene glycol and 0.5%-2% sucrose; finally, cell extract (cell lysate) is added to 50% by volume.
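As a worked illustration of assembling such a reaction, the sketch below converts final concentrations into pipetting volumes using C1·V1 = C2·V2. Only the four fixed final concentrations and the 50% extract fraction come from the text; the stock concentrations, the 30 μL reaction volume and all names are hypothetical, and components specified as ranges are left in the remaining volume.

```python
# Final concentrations taken from the text (mM, except kinase in mg/mL).
FINAL = {
    "HEPES pH 7.4": 22.0,
    "creatine phosphate": 25.0,
    "dithiothreitol": 1.7,
    "creatine phosphate kinase": 0.27,
}

# Hypothetical stock concentrations, in the same units as FINAL.
STOCK = {
    "HEPES pH 7.4": 1000.0,
    "creatine phosphate": 1000.0,
    "dithiothreitol": 100.0,
    "creatine phosphate kinase": 10.0,
}


def component_volumes(total_uL, finals, stocks, extract_fraction=0.5):
    """Return the volume (uL) of each stock needed for a reaction of total_uL."""
    volumes = {name: total_uL * finals[name] / stocks[name] for name in finals}
    volumes["cell extract"] = total_uL * extract_fraction
    remaining = total_uL - sum(volumes.values())
    if remaining < 0:
        raise ValueError("stocks are too dilute for this reaction volume")
    # Space left for the range-specified components and water.
    volumes["remaining (other components + water)"] = remaining
    return volumes


if __name__ == "__main__":
    for name, vol in component_volumes(30.0, FINAL, STOCK).items():
        print(f"{name}: {vol:.2f} uL")
```

Because the extract alone takes half the volume, the stocks of the remaining components need to be concentrated enough to fit in the other half, which is why the sketch raises an error when they do not.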
5.3 in vitro cell-free protein Synthesis reactions
The fragment between the T7 transcription start and termination sequences in all plasmids (pKM01-pKM06) was amplified by PCR using primers T7_pET21a_F: CGCGAAATTAATACGACTCACTATAGG and T7ter_pET21a_R: TCCGGATATAGTTCCTCCTTTCAG.
The amplified DNA fragment was purified and concentrated by ethanol precipitation: 1/10 volume of 3 M sodium acetate (pH 5.2) was added to the PCR product, followed by 2.5-3 volumes of 95% ethanol, and the mixture was incubated on ice for 15 min; it was centrifuged at room temperature at more than 14000 g for 30 min and the supernatant was discarded; the pellet was washed with 70% ethanol and centrifuged for 15 min, the supernatant was discarded, the pellet was dissolved in ultrapure water, and the DNA concentration was determined.
The purified DNA fragment was added to the synthesis system of step 5.2 of this example. The reaction system was placed at 25-30 °C and incubated without shaking for about 3 h. The relative fluorescence intensity of enhanced green fluorescent protein (eGFP) was used to indicate the capacity of the in vitro biosynthesis system to synthesize recombinant protein. After the reaction, 10 μL of the reaction solution was added to a 96-well plate and immediately placed in an Envision 2120 multifunctional microplate reader (Perkin Elmer), and the relative fluorescence unit (RFU) value of eGFP was read as the activity unit. The SP plasmid, which does not contain a DNA fragment of the mRNA-binding protein recognition site sequence, was used as the control (PC), and three independent experiments were designed for each sample.
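The reagent volumes for the ethanol precipitation scale with the sample volume. A minimal sketch follows, under the assumption (not stated explicitly in the text) that both the 1/10 volume of sodium acetate and the 2.5-3 volumes of ethanol are calculated from the starting PCR product volume; the function name is hypothetical.

```python
def ethanol_precipitation_volumes(sample_uL: float, ethanol_fold: float = 2.5) -> dict:
    """Volumes of 3 M sodium acetate and 95% ethanol for a given sample volume.

    Assumes both additions are relative to the starting sample volume.
    """
    if not 2.5 <= ethanol_fold <= 3.0:
        raise ValueError("the text specifies 2.5-3 volumes of ethanol")
    return {
        "sodium_acetate_3M_uL": sample_uL / 10,
        "ethanol_95pct_uL": sample_uL * ethanol_fold,
    }


if __name__ == "__main__":
    # Hypothetical 100 uL PCR product.
    print(ethanol_precipitation_volumes(100.0))
```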
Results of the experiment
1. Application of template containing MS2 CP specific recognition binding site sequence
The eIF3a-MS2 strain was used to prepare a yeast in vitro cell-free protein synthesis system, and green fluorescent protein gene synthesis templates containing an MS2 CP-specific recognition binding site sequence (i.e., pKM01-pKM03) were added to determine the protein translation capacity of the engineered strain; a green fluorescent protein gene synthesis template containing no MS2 CP-specific recognition binding site sequence (i.e., the SP plasmid, PC) was used as the control. The reaction system was placed at 25-30 °C and incubated without shaking for about 3 h. The activity of the strain after modification of eIF3a was not reduced compared with that before modification. The activity of the different templates in the in vitro translation system prepared from strain eIF3a-MS2 is shown in FIG. 6. The relative fluorescence unit value of the control without the MS2 CP-specific recognition binding site sequence was 56, while the experimental groups were all improved to different degrees: the relative fluorescence unit value of green fluorescent protein was 144 after insertion of 2 MS2 CP binding recognition sites (plasmid pKM01), 83 after replacement of the polyA by 2 MS2 CP binding recognition sites (plasmid pKM02), and 126 after replacement of the polyA by 3 MS2 CP binding recognition sites (plasmid pKM03). The activities of the three experimental groups were thus 2.6 times, 1.5 times and 2.3 times that of the control (PC), respectively.
The MS2-eIF3a strain was likewise used to prepare a yeast in vitro cell-free protein synthesis system; green fluorescent protein gene synthesis templates containing the MS2 CP specific recognition binding site sequence (i.e., pKM01-pKM03) were added to determine the protein translation ability of the engineered strain, and a green fluorescent protein gene synthesis template containing no MS2 CP specific recognition binding site sequence (i.e., the SP plasmid, PC) was used as a control. The experimental results showed that the activity values of the experimental groups were improved compared with those of the control group (PC).
2. Application of template containing Qβ CP specific recognition binding site sequence
A yeast in vitro cell-free protein synthesis system was prepared from the eIF3a-Qβ strain, and green fluorescent protein gene synthesis templates containing the Qβ CP specific recognition binding site sequence (i.e., pKM04-pKM06) were added to determine the protein translation ability of the modified strain; a green fluorescent protein gene synthesis template containing no Qβ CP specific recognition binding site sequence (i.e., the SP plasmid, PC) was used as a control. The activity of the strain after the modification of eIF3a was not reduced compared with that before the modification. The results of the activity assay of the different templates in the in vitro translation system prepared from strain eIF3a-Qβ are shown in FIG. 7. The relative fluorescence unit value of the control template not containing the Qβ CP specific recognition binding site sequence was 26; the relative fluorescence unit value of the green fluorescent protein was 35 after insertion of 2 Qβ CP binding recognition sites (plasmid pKM04), 42 after insertion of 3 Qβ CP binding recognition sites (plasmid pKM05), and 46 after replacement of polyA by 3 Qβ CP binding recognition sites (plasmid pKM06), which were 1.3 times, 1.6 times and 1.8 times that of the SP control template (PC), respectively.
The Qβ-eIF3a strain was also used to prepare a yeast in vitro cell-free protein synthesis system; green fluorescent protein gene synthesis templates containing the Qβ CP specific recognition binding site sequence (i.e., pKM04-pKM06) were added to determine the protein translation ability of the engineered strain, and a green fluorescent protein gene synthesis template containing no Qβ CP specific recognition binding site sequence (i.e., the SP plasmid, PC) was used as a control. The experimental results showed that the activity values were improved compared with the control group (PC).
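The fold values quoted in sections 1 and 2 above can be re-derived from the reported RFU readings by dividing each template's RFU by the RFU of the SP control (PC). The following is a minimal sketch of that arithmetic, not part of the original disclosure. The function name `fold_over_control` and the dictionary layout are illustrative assumptions. Rounding half-up to one decimal place is also an assumption, chosen because it reproduces the quoted 2.3 for 126/56 = 2.25.

```python
from decimal import Decimal, ROUND_HALF_UP

# RFU values as reported in the text for FIG. 6 and FIG. 7.
# "Qbeta" is written in ASCII here purely for portability of the snippet.
RFU = {
    "eIF3a-MS2": {"PC": 56, "pKM01": 144, "pKM02": 83, "pKM03": 126},
    "eIF3a-Qbeta": {"PC": 26, "pKM04": 35, "pKM05": 42, "pKM06": 46},
}


def fold_over_control(readings, control="PC"):
    """Return {template: RFU / control RFU}, rounded half-up to one decimal.

    Hypothetical helper for illustration; the rounding mode is an assumption.
    """
    base = Decimal(readings[control])
    return {
        name: (Decimal(value) / base).quantize(
            Decimal("0.1"), rounding=ROUND_HALF_UP
        )
        for name, value in readings.items()
        if name != control
    }


if __name__ == "__main__":
    for strain, readings in RFU.items():
        for template, fold in fold_over_control(readings).items():
            print(f"{strain} {template}: {fold}x of PC")
```

With Python's built-in `round`, 2.25 would round to 2.2 rather than 2.3, which is why the sketch uses `decimal` with an explicit rounding mode.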
All documents referred to herein are incorporated by reference into this application as if each were individually incorporated by reference. Furthermore, it should be understood that various changes and modifications can be made by those skilled in the art after reading the above disclosure, and equivalents also fall within the scope of the invention as defined by the appended claims.
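As a reading aid for the sequence listing that follows, the `<211>` lengths can be checked for internal consistency: a coding DNA sequence that ends in a stop codon should be 3 × (protein length + 1) nucleotides long, and a fusion protein should be as long as the sum of its parts. The sketch below is illustrative only. The pairing of DNA and protein entries, and the decomposition of SEQ ID NO: 8 into SEQ ID NOs: 2, 4 and 6 and of SEQ ID NO: 15 into SEQ ID NOs: 2 and 13, are assumptions read off the listing itself rather than statements from this section. The helper names are hypothetical.

```python
# <211> lengths copied from the sequence listing, keyed by SEQ ID NO.
DNA_LEN = {1: 2778, 3: 90, 5: 393, 7: 3258, 12: 402, 14: 3177}
PROTEIN_LEN = {2: 925, 4: 30, 6: 130, 8: 1085, 13: 133, 15: 1058}

# (DNA SEQ ID, protein SEQ ID, ends_with_stop_codon).
# The pairing is an assumption based on the order of the listing;
# SEQ ID NO: 3 has no terminal stop codon.
PAIRS = [
    (1, 2, True),
    (3, 4, False),
    (5, 6, True),
    (7, 8, True),
    (12, 13, True),
    (14, 15, True),
]

# Assumed composition of the two fusion proteins (protein SEQ ID NOs).
FUSIONS = {8: (2, 4, 6), 15: (2, 13)}


def codon_count_matches(dna_id, protein_id, has_stop):
    """True if the DNA length equals 3 nt per residue, plus 3 nt for a stop."""
    expected_nt = 3 * (PROTEIN_LEN[protein_id] + (1 if has_stop else 0))
    return DNA_LEN[dna_id] == expected_nt


def fusion_length_matches(fusion_id, part_ids):
    """True if the fusion length equals the sum of its assumed parts."""
    return PROTEIN_LEN[fusion_id] == sum(PROTEIN_LEN[p] for p in part_ids)


if __name__ == "__main__":
    for dna_id, protein_id, has_stop in PAIRS:
        ok = codon_count_matches(dna_id, protein_id, has_stop)
        print(f"SEQ ID NO: {dna_id} vs {protein_id}: consistent={ok}")
    for fusion_id, parts in FUSIONS.items():
        ok = fusion_length_matches(fusion_id, parts)
        print(f"SEQ ID NO: {fusion_id} = sum of {parts}: consistent={ok}")
```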
Sequence listing
<110> Kangma (Shanghai) Biotech Co., Ltd
<120> fusion protein comprising RNA-binding protein and expression vector used in combination with the same
<130> 2019
<141> 2019-03-20
<160> 28
<170> SIPOSequenceListing 1.0
<210> 1
<211> 2778
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 1
atggctccac cggctttacg tcctgagaat gccattagaa gagctgacga attagtctct 60
gttggcgagc caatggctgc gttgcaatct ctatttgatt tattatcttc aagaaggtct 120
cgttttgctg atgctgccac tttggaacct ataatcttca agttcttgga acttggtgtt 180
gaattgagga agggtaaaat gatcaaggaa ggtttatacc aatacaagaa gcatatgcaa 240
cacactcccg aaggtttgat ttctgtaggt gctgttgctc gtaaattcat cgatttgatc 300
gaaactaaga tgaccaacat ccaagcgcaa actgatgcca aagaagaatc caacaaggac 360
caagccgaag aggatctaga gggtggtgtc accccagaaa atttgttggt ttctgtttac 420
gaacaagaac aaactgttgg tggattcaac aatgatgatg tttcagcttg gttgagattt 480
acctgggaat cttaccgtac cactctagat ttcttgagaa ataattctca attggaaatc 540
acgtatgcgg gtgttgttaa cagaaccatg caattctgtt acaaatataa ccgtaagaat 600
gaattcaagc gtttagctga aatgttacgt caacatttgg atgccgcaaa ctaccaacaa 660
cagagatatg gtcaccacac tgtcgattta tcagatcctg acactttgca acgttattct 720
gaccaacgtt tccaacaagt taacgtttct gttaaattgg aattatggca tgaagctttc 780
agatccattg aagatgttca tcatttgatg cgtctctcga agcgtgctcc aaagccctct 840
gtgttggcca actactacga aaatttggcc aagatcttct ttgtctctgg taactattta 900
ttgcacgctg ctgcatggga aaaattctac aatttgtact tgaagaatcc aaatgcttcc 960
gaagaagact ttaagttcta ttcatctcaa tttgtcttgt ccgctttggc aattcaattg 1020
gatgacttac caattgctgg tttcgatcct caaattcgtc tatgtgactt actagacctt 1080
gaaagtaaac caaagagaaa ggatttgatt actgctgctg gtgaacaaca agtcgtagag 1140
aaagctgacg ctgatatttt gaaattcttc aatatattgg aaactaattt cgatgtgaag 1200
tctgctaagt ctcaattgtc tgcacttttg ccaaacttgg ttgaaaagcc atatttcgcc 1260
caatacgtgg ctccattgag aaacctattc atcagaagat ccatcattga agtctcaaag 1320
gctcaaacat ctattcactt agttgaattg catgaaatgt tgtcactgcc agctcctttc 1380
gaattatctg tttttgaact agaaaaatac ttaatccaag ctgctatgga tgattatgtt 1440
agtatttcca ttgaccatga aactgatacc gtttcatttg cccaagatcc atttgacgct 1500
tggcaagcat ctcttgttga agttcccgaa tctagcactt ctgatgaagc aaagaactct 1560
gaatccgaag aggaaacctc ccaagaaacg catgctgatg aggaacagaa tgaacaagtc 1620
ttcactcgta actcagaagt ccgttctaag ttgactgatc tatccaagat cttaaaggcc 1680
aacgaagaat acgaaaatgg ttcttactat tacagagtca aacttgtgcg tgaagaattg 1740
atcagaagaa aggaagaagt tatcaagtta gaaaaggaag ctgctgaaat tagagctaag 1800
agtaacgctg aacgcaagaa gagaagcgaa gaagaaaaca agattcttgc caagaaggct 1860
ctagaagaaa ggcaaagacg tatggctgag gaaaaggctg ctgttgaatc ttctatggag 1920
aaagaagcgg aacgtcgtgc tgaagaaatg atggaacgtg agagagaagc gatccatgaa 1980
caagaaatga agaagttgat tgctgaaact aatgccaatg gtgtcattca tattgatcca 2040
aaggaagcca agaacctaac aagtgataag atcaaccaaa tggtcattga acaagtagcc 2100
aagaacaaga aggatttgac tgaacgtatg acctatgcct tcaagaagtt agatcacctg 2160
gaaagagcct atagacaaat ggaattgcca ttgttagaaa aggacgctga agagcaaaaa 2220
aagagggata gagagaatta cgacaatttc aagaagaagt taattgagac ttccaaggcc 2280
gactatgaaa agaaattggc tctacatcaa cgtttgaaca aaatctacag tactttcaac 2340
caatacaagt catctgttat cgctgaaaag aaggaagagt tagaaaaaca acgcgccttg 2400
aaggaagctc aattagaaga agctaagaag caaagaattg aacaagtccg taaggaacgt 2460
tatgaagcta aagttgctga aatacaagct gcaattgaag ctgaagctgc tgaaaaggag 2520
gctttggcta aggaggaaga acttgccaag agacgtgccg aacgtgaaag aatcaacaag 2580
gaaagagacg aaattgctag aaagcaaaga gaaatcgaag aacttcttga aaagaagaac 2640
ggtagctcta gatctagccc tgttccttct actccaaccc cagcaccagc accagcacaa 2700
actgctccgg tatccaataa accaatgtct atggctgaaa agttgagact gaagagaatg 2760
aatgctggaa gaggttaa 2778
<210> 2
<211> 925
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 2
Met Ala Pro Pro Ala Leu Arg Pro Glu Asn Ala Ile Arg Arg Ala Asp
1 5 10 15
Glu Leu Val Ser Val Gly Glu Pro Met Ala Ala Leu Gln Ser Leu Phe
20 25 30
Asp Leu Leu Ser Ser Arg Arg Ser Arg Phe Ala Asp Ala Ala Thr Leu
35 40 45
Glu Pro Ile Ile Phe Lys Phe Leu Glu Leu Gly Val Glu Leu Arg Lys
50 55 60
Gly Lys Met Ile Lys Glu Gly Leu Tyr Gln Tyr Lys Lys His Met Gln
65 70 75 80
His Thr Pro Glu Gly Leu Ile Ser Val Gly Ala Val Ala Arg Lys Phe
85 90 95
Ile Asp Leu Ile Glu Thr Lys Met Thr Asn Ile Gln Ala Gln Thr Asp
100 105 110
Ala Lys Glu Glu Ser Asn Lys Asp Gln Ala Glu Glu Asp Leu Glu Gly
115 120 125
Gly Val Thr Pro Glu Asn Leu Leu Val Ser Val Tyr Glu Gln Glu Gln
130 135 140
Thr Val Gly Gly Phe Asn Asn Asp Asp Val Ser Ala Trp Leu Arg Phe
145 150 155 160
Thr Trp Glu Ser Tyr Arg Thr Thr Leu Asp Phe Leu Arg Asn Asn Ser
165 170 175
Gln Leu Glu Ile Thr Tyr Ala Gly Val Val Asn Arg Thr Met Gln Phe
180 185 190
Cys Tyr Lys Tyr Asn Arg Lys Asn Glu Phe Lys Arg Leu Ala Glu Met
195 200 205
Leu Arg Gln His Leu Asp Ala Ala Asn Tyr Gln Gln Gln Arg Tyr Gly
210 215 220
His His Thr Val Asp Leu Ser Asp Pro Asp Thr Leu Gln Arg Tyr Ser
225 230 235 240
Asp Gln Arg Phe Gln Gln Val Asn Val Ser Val Lys Leu Glu Leu Trp
245 250 255
His Glu Ala Phe Arg Ser Ile Glu Asp Val His His Leu Met Arg Leu
260 265 270
Ser Lys Arg Ala Pro Lys Pro Ser Val Leu Ala Asn Tyr Tyr Glu Asn
275 280 285
Leu Ala Lys Ile Phe Phe Val Ser Gly Asn Tyr Leu Leu His Ala Ala
290 295 300
Ala Trp Glu Lys Phe Tyr Asn Leu Tyr Leu Lys Asn Pro Asn Ala Ser
305 310 315 320
Glu Glu Asp Phe Lys Phe Tyr Ser Ser Gln Phe Val Leu Ser Ala Leu
325 330 335
Ala Ile Gln Leu Asp Asp Leu Pro Ile Ala Gly Phe Asp Pro Gln Ile
340 345 350
Arg Leu Cys Asp Leu Leu Asp Leu Glu Ser Lys Pro Lys Arg Lys Asp
355 360 365
Leu Ile Thr Ala Ala Gly Glu Gln Gln Val Val Glu Lys Ala Asp Ala
370 375 380
Asp Ile Leu Lys Phe Phe Asn Ile Leu Glu Thr Asn Phe Asp Val Lys
385 390 395 400
Ser Ala Lys Ser Gln Leu Ser Ala Leu Leu Pro Asn Leu Val Glu Lys
405 410 415
Pro Tyr Phe Ala Gln Tyr Val Ala Pro Leu Arg Asn Leu Phe Ile Arg
420 425 430
Arg Ser Ile Ile Glu Val Ser Lys Ala Gln Thr Ser Ile His Leu Val
435 440 445
Glu Leu His Glu Met Leu Ser Leu Pro Ala Pro Phe Glu Leu Ser Val
450 455 460
Phe Glu Leu Glu Lys Tyr Leu Ile Gln Ala Ala Met Asp Asp Tyr Val
465 470 475 480
Ser Ile Ser Ile Asp His Glu Thr Asp Thr Val Ser Phe Ala Gln Asp
485 490 495
Pro Phe Asp Ala Trp Gln Ala Ser Leu Val Glu Val Pro Glu Ser Ser
500 505 510
Thr Ser Asp Glu Ala Lys Asn Ser Glu Ser Glu Glu Glu Thr Ser Gln
515 520 525
Glu Thr His Ala Asp Glu Glu Gln Asn Glu Gln Val Phe Thr Arg Asn
530 535 540
Ser Glu Val Arg Ser Lys Leu Thr Asp Leu Ser Lys Ile Leu Lys Ala
545 550 555 560
Asn Glu Glu Tyr Glu Asn Gly Ser Tyr Tyr Tyr Arg Val Lys Leu Val
565 570 575
Arg Glu Glu Leu Ile Arg Arg Lys Glu Glu Val Ile Lys Leu Glu Lys
580 585 590
Glu Ala Ala Glu Ile Arg Ala Lys Ser Asn Ala Glu Arg Lys Lys Arg
595 600 605
Ser Glu Glu Glu Asn Lys Ile Leu Ala Lys Lys Ala Leu Glu Glu Arg
610 615 620
Gln Arg Arg Met Ala Glu Glu Lys Ala Ala Val Glu Ser Ser Met Glu
625 630 635 640
Lys Glu Ala Glu Arg Arg Ala Glu Glu Met Met Glu Arg Glu Arg Glu
645 650 655
Ala Ile His Glu Gln Glu Met Lys Lys Leu Ile Ala Glu Thr Asn Ala
660 665 670
Asn Gly Val Ile His Ile Asp Pro Lys Glu Ala Lys Asn Leu Thr Ser
675 680 685
Asp Lys Ile Asn Gln Met Val Ile Glu Gln Val Ala Lys Asn Lys Lys
690 695 700
Asp Leu Thr Glu Arg Met Thr Tyr Ala Phe Lys Lys Leu Asp His Leu
705 710 715 720
Glu Arg Ala Tyr Arg Gln Met Glu Leu Pro Leu Leu Glu Lys Asp Ala
725 730 735
Glu Glu Gln Lys Lys Arg Asp Arg Glu Asn Tyr Asp Asn Phe Lys Lys
740 745 750
Lys Leu Ile Glu Thr Ser Lys Ala Asp Tyr Glu Lys Lys Leu Ala Leu
755 760 765
His Gln Arg Leu Asn Lys Ile Tyr Ser Thr Phe Asn Gln Tyr Lys Ser
770 775 780
Ser Val Ile Ala Glu Lys Lys Glu Glu Leu Glu Lys Gln Arg Ala Leu
785 790 795 800
Lys Glu Ala Gln Leu Glu Glu Ala Lys Lys Gln Arg Ile Glu Gln Val
805 810 815
Arg Lys Glu Arg Tyr Glu Ala Lys Val Ala Glu Ile Gln Ala Ala Ile
820 825 830
Glu Ala Glu Ala Ala Glu Lys Glu Ala Leu Ala Lys Glu Glu Glu Leu
835 840 845
Ala Lys Arg Arg Ala Glu Arg Glu Arg Ile Asn Lys Glu Arg Asp Glu
850 855 860
Ile Ala Arg Lys Gln Arg Glu Ile Glu Glu Leu Leu Glu Lys Lys Asn
865 870 875 880
Gly Ser Ser Arg Ser Ser Pro Val Pro Ser Thr Pro Thr Pro Ala Pro
885 890 895
Ala Pro Ala Gln Thr Ala Pro Val Ser Asn Lys Pro Met Ser Met Ala
900 905 910
Glu Lys Leu Arg Leu Lys Arg Met Asn Ala Gly Arg Gly
915 920 925
<210> 3
<211> 90
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 3
ggtggaggtg gatctggagg tggaggttca actcaagatg aagttcaagg tccacatgct 60
ggtaagtcta ctgttggtgg aggtggatct 90
<210> 4
<211> 30
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 4
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Thr Gln Asp Glu Val Gln
1 5 10 15
Gly Pro His Ala Gly Lys Ser Thr Val Gly Gly Gly Gly Ser
20 25 30
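The relationship between SEQ ID NO: 3 and SEQ ID NO: 4 above can be checked by translating the 90-nucleotide sequence with the standard genetic code. The sketch below is a reading aid and not part of the filed listing. Use of the standard codon table is an assumption, and the helper names `translate` and `to_one_letter` are hypothetical. The three-letter map covers only the residues that occur in SEQ ID NO: 4.

```python
# Standard genetic code in the conventional t, c, a, g codon order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Three-letter to one-letter codes, limited to residues in SEQ ID NO: 4.
THREE_TO_ONE = {
    "Gly": "G", "Ser": "S", "Thr": "T", "Gln": "Q", "Asp": "D", "Glu": "E",
    "Val": "V", "Pro": "P", "His": "H", "Ala": "A", "Lys": "K",
}

# Sequences copied from the listing (whitespace is ignored when parsing).
SEQ_ID_3 = (
    "ggtggaggtg gatctggagg tggaggttca actcaagatg aagttcaagg tccacatgct "
    "ggtaagtcta ctgttggtgg aggtggatct"
)
SEQ_ID_4 = (
    "Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Thr Gln Asp Glu Val Gln "
    "Gly Pro His Ala Gly Lys Ser Thr Val Gly Gly Gly Gly Ser"
)


def translate(dna):
    """Translate DNA in frame 1; any trailing partial codon is ignored."""
    dna = "".join(dna.split()).lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def to_one_letter(three_letter):
    """Convert a space-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


if __name__ == "__main__":
    print(translate(SEQ_ID_3))
    print(translate(SEQ_ID_3) == to_one_letter(SEQ_ID_4))
```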
<210> 5
<211> 393
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 5
atggcttcta atttcactca attcgttttg gttgataatg gtggtactgg tgatgttact 60
gttgctccat ctaatttcgc taatggtgtt gctgaatgga tttcttctaa ttctagatct 120
caagcttata aagttacttg ttctgttaga caatcttctg ctcaaaatag aaaatatact 180
attaaagttg aagttccaaa agttgctact caaactgttg gtggtgttga attgccagtt 240
gctgcttgga gatcttattt gaatatggaa ttgactattc caattttcgc tactaattct 300
gattgtgaat tgattgttaa agctatgcaa ggtttgttga aagatggtaa tccaattcca 360
tctgctattg ctgctaattc tggtatttat taa 393
<210> 6
<211> 130
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 6
Met Ala Ser Asn Phe Thr Gln Phe Val Leu Val Asp Asn Gly Gly Thr
1 5 10 15
Gly Asp Val Thr Val Ala Pro Ser Asn Phe Ala Asn Gly Val Ala Glu
20 25 30
Trp Ile Ser Ser Asn Ser Arg Ser Gln Ala Tyr Lys Val Thr Cys Ser
35 40 45
Val Arg Gln Ser Ser Ala Gln Asn Arg Lys Tyr Thr Ile Lys Val Glu
50 55 60
Val Pro Lys Val Ala Thr Gln Thr Val Gly Gly Val Glu Leu Pro Val
65 70 75 80
Ala Ala Trp Arg Ser Tyr Leu Asn Met Glu Leu Thr Ile Pro Ile Phe
85 90 95
Ala Thr Asn Ser Asp Cys Glu Leu Ile Val Lys Ala Met Gln Gly Leu
100 105 110
Leu Lys Asp Gly Asn Pro Ile Pro Ser Ala Ile Ala Ala Asn Ser Gly
115 120 125
Ile Tyr
130
<210> 7
<211> 3258
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 7
atggctccac cggctttacg tcctgagaat gccattagaa gagctgacga attagtctct 60
gttggcgagc caatggctgc gttgcaatct ctatttgatt tattatcttc aagaaggtct 120
cgttttgctg atgctgccac tttggaacct ataatcttca agttcttgga acttggtgtt 180
gaattgagga agggtaaaat gatcaaggaa ggtttatacc aatacaagaa gcatatgcaa 240
cacactcccg aaggtttgat ttctgtaggt gctgttgctc gtaaattcat cgatttgatc 300
gaaactaaga tgaccaacat ccaagcgcaa actgatgcca aagaagaatc caacaaggac 360
caagccgaag aggatctaga gggtggtgtc accccagaaa atttgttggt ttctgtttac 420
gaacaagaac aaactgttgg tggattcaac aatgatgatg tttcagcttg gttgagattt 480
acctgggaat cttaccgtac cactctagat ttcttgagaa ataattctca attggaaatc 540
acgtatgcgg gtgttgttaa cagaaccatg caattctgtt acaaatataa ccgtaagaat 600
gaattcaagc gtttagctga aatgttacgt caacatttgg atgccgcaaa ctaccaacaa 660
cagagatatg gtcaccacac tgtcgattta tcagatcctg acactttgca acgttattct 720
gaccaacgtt tccaacaagt taacgtttct gttaaattgg aattatggca tgaagctttc 780
agatccattg aagatgttca tcatttgatg cgtctctcga agcgtgctcc aaagccctct 840
gtgttggcca actactacga aaatttggcc aagatcttct ttgtctctgg taactattta 900
ttgcacgctg ctgcatggga aaaattctac aatttgtact tgaagaatcc aaatgcttcc 960
gaagaagact ttaagttcta ttcatctcaa tttgtcttgt ccgctttggc aattcaattg 1020
gatgacttac caattgctgg tttcgatcct caaattcgtc tatgtgactt actagacctt 1080
gaaagtaaac caaagagaaa ggatttgatt actgctgctg gtgaacaaca agtcgtagag 1140
aaagctgacg ctgatatttt gaaattcttc aatatattgg aaactaattt cgatgtgaag 1200
tctgctaagt ctcaattgtc tgcacttttg ccaaacttgg ttgaaaagcc atatttcgcc 1260
caatacgtgg ctccattgag aaacctattc atcagaagat ccatcattga agtctcaaag 1320
gctcaaacat ctattcactt agttgaattg catgaaatgt tgtcactgcc agctcctttc 1380
gaattatctg tttttgaact agaaaaatac ttaatccaag ctgctatgga tgattatgtt 1440
agtatttcca ttgaccatga aactgatacc gtttcatttg cccaagatcc atttgacgct 1500
tggcaagcat ctcttgttga agttcccgaa tctagcactt ctgatgaagc aaagaactct 1560
gaatccgaag aggaaacctc ccaagaaacg catgctgatg aggaacagaa tgaacaagtc 1620
ttcactcgta actcagaagt ccgttctaag ttgactgatc tatccaagat cttaaaggcc 1680
aacgaagaat acgaaaatgg ttcttactat tacagagtca aacttgtgcg tgaagaattg 1740
atcagaagaa aggaagaagt tatcaagtta gaaaaggaag ctgctgaaat tagagctaag 1800
agtaacgctg aacgcaagaa gagaagcgaa gaagaaaaca agattcttgc caagaaggct 1860
ctagaagaaa ggcaaagacg tatggctgag gaaaaggctg ctgttgaatc ttctatggag 1920
aaagaagcgg aacgtcgtgc tgaagaaatg atggaacgtg agagagaagc gatccatgaa 1980
caagaaatga agaagttgat tgctgaaact aatgccaatg gtgtcattca tattgatcca 2040
aaggaagcca agaacctaac aagtgataag atcaaccaaa tggtcattga acaagtagcc 2100
aagaacaaga aggatttgac tgaacgtatg acctatgcct tcaagaagtt agatcacctg 2160
gaaagagcct atagacaaat ggaattgcca ttgttagaaa aggacgctga agagcaaaaa 2220
aagagggata gagagaatta cgacaatttc aagaagaagt taattgagac ttccaaggcc 2280
gactatgaaa agaaattggc tctacatcaa cgtttgaaca aaatctacag tactttcaac 2340
caatacaagt catctgttat cgctgaaaag aaggaagagt tagaaaaaca acgcgccttg 2400
aaggaagctc aattagaaga agctaagaag caaagaattg aacaagtccg taaggaacgt 2460
tatgaagcta aagttgctga aatacaagct gcaattgaag ctgaagctgc tgaaaaggag 2520
gctttggcta aggaggaaga acttgccaag agacgtgccg aacgtgaaag aatcaacaag 2580
gaaagagacg aaattgctag aaagcaaaga gaaatcgaag agttgttgga gaaaaaaaat 2640
ggtagctcta gatctagccc tgttccttct actccaaccc cagcaccagc accagcacaa 2700
actgctccgg tatccaataa accaatgtct atggctgaaa agttgagact gaagagaatg 2760
aatgctggaa gaggtggtgg aggtggatct ggaggtggag gttcaactca agatgaagtt 2820
caaggtccac atgctggtaa gtctactgtt ggtggaggtg gatctatggc ttctaatttc 2880
actcaattcg ttttggttga taatggtggt actggtgatg ttactgttgc tccatctaat 2940
ttcgctaatg gtgttgctga atggatttct tctaattcta gatctcaagc ttataaagtt 3000
acttgttctg ttagacaatc ttctgctcaa aatagaaaat atactattaa agttgaagtt 3060
ccaaaagttg ctactcaaac tgttggtggt gttgaattgc cagttgctgc ttggagatct 3120
tatttgaata tggaattgac tattccaatt ttcgctacta attctgattg tgaattgatt 3180
gttaaagcta tgcaaggttt gttgaaagat ggtaatccaa ttccatctgc tattgctgct 3240
aattctggta tttattaa 3258
<210> 8
<211> 1085
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 8
Met Ala Pro Pro Ala Leu Arg Pro Glu Asn Ala Ile Arg Arg Ala Asp
1 5 10 15
Glu Leu Val Ser Val Gly Glu Pro Met Ala Ala Leu Gln Ser Leu Phe
20 25 30
Asp Leu Leu Ser Ser Arg Arg Ser Arg Phe Ala Asp Ala Ala Thr Leu
35 40 45
Glu Pro Ile Ile Phe Lys Phe Leu Glu Leu Gly Val Glu Leu Arg Lys
50 55 60
Gly Lys Met Ile Lys Glu Gly Leu Tyr Gln Tyr Lys Lys His Met Gln
65 70 75 80
His Thr Pro Glu Gly Leu Ile Ser Val Gly Ala Val Ala Arg Lys Phe
85 90 95
Ile Asp Leu Ile Glu Thr Lys Met Thr Asn Ile Gln Ala Gln Thr Asp
100 105 110
Ala Lys Glu Glu Ser Asn Lys Asp Gln Ala Glu Glu Asp Leu Glu Gly
115 120 125
Gly Val Thr Pro Glu Asn Leu Leu Val Ser Val Tyr Glu Gln Glu Gln
130 135 140
Thr Val Gly Gly Phe Asn Asn Asp Asp Val Ser Ala Trp Leu Arg Phe
145 150 155 160
Thr Trp Glu Ser Tyr Arg Thr Thr Leu Asp Phe Leu Arg Asn Asn Ser
165 170 175
Gln Leu Glu Ile Thr Tyr Ala Gly Val Val Asn Arg Thr Met Gln Phe
180 185 190
Cys Tyr Lys Tyr Asn Arg Lys Asn Glu Phe Lys Arg Leu Ala Glu Met
195 200 205
Leu Arg Gln His Leu Asp Ala Ala Asn Tyr Gln Gln Gln Arg Tyr Gly
210 215 220
His His Thr Val Asp Leu Ser Asp Pro Asp Thr Leu Gln Arg Tyr Ser
225 230 235 240
Asp Gln Arg Phe Gln Gln Val Asn Val Ser Val Lys Leu Glu Leu Trp
245 250 255
His Glu Ala Phe Arg Ser Ile Glu Asp Val His His Leu Met Arg Leu
260 265 270
Ser Lys Arg Ala Pro Lys Pro Ser Val Leu Ala Asn Tyr Tyr Glu Asn
275 280 285
Leu Ala Lys Ile Phe Phe Val Ser Gly Asn Tyr Leu Leu His Ala Ala
290 295 300
Ala Trp Glu Lys Phe Tyr Asn Leu Tyr Leu Lys Asn Pro Asn Ala Ser
305 310 315 320
Glu Glu Asp Phe Lys Phe Tyr Ser Ser Gln Phe Val Leu Ser Ala Leu
325 330 335
Ala Ile Gln Leu Asp Asp Leu Pro Ile Ala Gly Phe Asp Pro Gln Ile
340 345 350
Arg Leu Cys Asp Leu Leu Asp Leu Glu Ser Lys Pro Lys Arg Lys Asp
355 360 365
Leu Ile Thr Ala Ala Gly Glu Gln Gln Val Val Glu Lys Ala Asp Ala
370 375 380
Asp Ile Leu Lys Phe Phe Asn Ile Leu Glu Thr Asn Phe Asp Val Lys
385 390 395 400
Ser Ala Lys Ser Gln Leu Ser Ala Leu Leu Pro Asn Leu Val Glu Lys
405 410 415
Pro Tyr Phe Ala Gln Tyr Val Ala Pro Leu Arg Asn Leu Phe Ile Arg
420 425 430
Arg Ser Ile Ile Glu Val Ser Lys Ala Gln Thr Ser Ile His Leu Val
435 440 445
Glu Leu His Glu Met Leu Ser Leu Pro Ala Pro Phe Glu Leu Ser Val
450 455 460
Phe Glu Leu Glu Lys Tyr Leu Ile Gln Ala Ala Met Asp Asp Tyr Val
465 470 475 480
Ser Ile Ser Ile Asp His Glu Thr Asp Thr Val Ser Phe Ala Gln Asp
485 490 495
Pro Phe Asp Ala Trp Gln Ala Ser Leu Val Glu Val Pro Glu Ser Ser
500 505 510
Thr Ser Asp Glu Ala Lys Asn Ser Glu Ser Glu Glu Glu Thr Ser Gln
515 520 525
Glu Thr His Ala Asp Glu Glu Gln Asn Glu Gln Val Phe Thr Arg Asn
530 535 540
Ser Glu Val Arg Ser Lys Leu Thr Asp Leu Ser Lys Ile Leu Lys Ala
545 550 555 560
Asn Glu Glu Tyr Glu Asn Gly Ser Tyr Tyr Tyr Arg Val Lys Leu Val
565 570 575
Arg Glu Glu Leu Ile Arg Arg Lys Glu Glu Val Ile Lys Leu Glu Lys
580 585 590
Glu Ala Ala Glu Ile Arg Ala Lys Ser Asn Ala Glu Arg Lys Lys Arg
595 600 605
Ser Glu Glu Glu Asn Lys Ile Leu Ala Lys Lys Ala Leu Glu Glu Arg
610 615 620
Gln Arg Arg Met Ala Glu Glu Lys Ala Ala Val Glu Ser Ser Met Glu
625 630 635 640
Lys Glu Ala Glu Arg Arg Ala Glu Glu Met Met Glu Arg Glu Arg Glu
645 650 655
Ala Ile His Glu Gln Glu Met Lys Lys Leu Ile Ala Glu Thr Asn Ala
660 665 670
Asn Gly Val Ile His Ile Asp Pro Lys Glu Ala Lys Asn Leu Thr Ser
675 680 685
Asp Lys Ile Asn Gln Met Val Ile Glu Gln Val Ala Lys Asn Lys Lys
690 695 700
Asp Leu Thr Glu Arg Met Thr Tyr Ala Phe Lys Lys Leu Asp His Leu
705 710 715 720
Glu Arg Ala Tyr Arg Gln Met Glu Leu Pro Leu Leu Glu Lys Asp Ala
725 730 735
Glu Glu Gln Lys Lys Arg Asp Arg Glu Asn Tyr Asp Asn Phe Lys Lys
740 745 750
Lys Leu Ile Glu Thr Ser Lys Ala Asp Tyr Glu Lys Lys Leu Ala Leu
755 760 765
His Gln Arg Leu Asn Lys Ile Tyr Ser Thr Phe Asn Gln Tyr Lys Ser
770 775 780
Ser Val Ile Ala Glu Lys Lys Glu Glu Leu Glu Lys Gln Arg Ala Leu
785 790 795 800
Lys Glu Ala Gln Leu Glu Glu Ala Lys Lys Gln Arg Ile Glu Gln Val
805 810 815
Arg Lys Glu Arg Tyr Glu Ala Lys Val Ala Glu Ile Gln Ala Ala Ile
820 825 830
Glu Ala Glu Ala Ala Glu Lys Glu Ala Leu Ala Lys Glu Glu Glu Leu
835 840 845
Ala Lys Arg Arg Ala Glu Arg Glu Arg Ile Asn Lys Glu Arg Asp Glu
850 855 860
Ile Ala Arg Lys Gln Arg Glu Ile Glu Glu Leu Leu Glu Lys Lys Asn
865 870 875 880
Gly Ser Ser Arg Ser Ser Pro Val Pro Ser Thr Pro Thr Pro Ala Pro
885 890 895
Ala Pro Ala Gln Thr Ala Pro Val Ser Asn Lys Pro Met Ser Met Ala
900 905 910
Glu Lys Leu Arg Leu Lys Arg Met Asn Ala Gly Arg Gly Gly Gly Gly
915 920 925
Gly Ser Gly Gly Gly Gly Ser Thr Gln Asp Glu Val Gln Gly Pro His
930 935 940
Ala Gly Lys Ser Thr Val Gly Gly Gly Gly Ser Met Ala Ser Asn Phe
945 950 955 960
Thr Gln Phe Val Leu Val Asp Asn Gly Gly Thr Gly Asp Val Thr Val
965 970 975
Ala Pro Ser Asn Phe Ala Asn Gly Val Ala Glu Trp Ile Ser Ser Asn
980 985 990
Ser Arg Ser Gln Ala Tyr Lys Val Thr Cys Ser Val Arg Gln Ser Ser
995 1000 1005
Ala Gln Asn Arg Lys Tyr Thr Ile Lys Val Glu Val Pro Lys Val Ala
1010 1015 1020
Thr Gln Thr Val Gly Gly Val Glu Leu Pro Val Ala Ala Trp Arg Ser
1025 1030 1035 1040
Tyr Leu Asn Met Glu Leu Thr Ile Pro Ile Phe Ala Thr Asn Ser Asp
1045 1050 1055
Cys Glu Leu Ile Val Lys Ala Met Gln Gly Leu Leu Lys Asp Gly Asn
1060 1065 1070
Pro Ile Pro Ser Ala Ile Ala Ala Asn Ser Gly Ile Tyr
1075 1080 1085
<210> 9
<211> 1000
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 9
ccaactccgt agacgacatt ttgaaaaacg gtgctagata cgtccgtcaa gtgagagagt 60
cagacaaatc aagactagta tccttactag tgcatggtcc tccaggttca ggaaaaactg 120
ccttagctgc cgccattgca ttgaagtctg agttcccatt tgttagatta atttcccctg 180
aagaaatagc aggtatgtcg gaaacggcca agatcgcata catcgacaat accttcagag 240
atgcttacaa atcgccactt aacattctag ttatcgattc gatagagact ctagttgact 300
gggttccaat tggtccaagg ttctctaata atatcctaca agtactcaag gtctatttga 360
aaagaaaacc tccaaataat cgtcgtttgc ttatcatctc tactacgtcc gcttacacgg 420
tgcttaagca aatggatata ctaagttgtt tcgacaacga aattgctgtc ccaaacgtat 480
caaacctaga tgaactgaat aacattatga cagattccgg atttcttgac gatgctggta 540
gagtcgaagt tatccgcaaa ttatcccaag ttacatctac tctcaacgtc ggcgtgaaaa 600
aagtactcac aaacattgaa acagcaagac acgatgagga tcccgtcaat gaacttgttg 660
atctaatggt gcaatcatct tgaattatac tattcatttc taatatgata catttatata 720
tatatatata ttatacatat gtgatatgta ctcgcgacat agcacataaa cttcaacttc 780
gttctcaatg atatcaactt ttgatgaagg ttgaaatttt tcagcatgct tccgttgaag 840
atcaaagaga ataagtgaaa aaaaactttt cagtctattt atgtgaaact tcttcctatt 900
gcaccttgca aaataaagat atatacttgg atctagtctc tgatttagaa agggtagata 960
acaaccgttc tcgacccatc gattatacta gcatttcatc 1000
<210> 10
<211> 2100
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 10
aatgaagaag ttgattgctg aaactaatgc caatggtgtc attcatattg atccaaagga 60
agccaagaac ctaacaagtg ataagatcaa ccaaatggtc attgaacaag tagccaagaa 120
caagaaggat ttgactgaac gtatgaccta tgccttcaag aagttagatc acctggaaag 180
agcctataga caaatggaat tgccattgtt agaaaaggac gctgaagagc aaaaaaagag 240
ggatagagag aattacgaca atttcaagaa gaagttaatt gagacttcca aggccgacta 300
tgaaaagaaa ttggctctac atcaacgttt gaacaaaatc tacagtactt tcaaccaata 360
caagtcatct gttatcgctg aaaagaagga agagttagaa aaacaacgcg ccttgaagga 420
agctcaatta gaagaagcta agaagcaaag aattgaacaa gtccgtaagg aacgttatga 480
agctaaagtt gctgaaatac aagctgcaat tgaagctgaa gctgctgaaa aggaggcttt 540
ggctaaggag gaagaacttg ccaagagacg tgccgaacgt gaaagaatca acaaggaaag 600
agacgaaatt gctagaaagc aaagagaaat cgaagagttg ttggagaaaa aaaatggtag 660
ctctagatct agccctgttc cttctactcc aaccccagca ccagcaccag cacaaactgc 720
tccggtatcc aataaaccaa tgtctatggc tgaaaagttg agactgaaga gaatgaatgc 780
tggaagaggt ggtggaggtg gatctggagg tggaggttca actcaagatg aagttcaagg 840
tccacatgct ggtaagtcta ctgttggtgg aggtggatct atggcttcta atttcactca 900
attcgttttg gttgataatg gtggtactgg tgatgttact gttgctccat ctaatttcgc 960
taatggtgtt gctgaatgga tttcttctaa ttctagatct caagcttata aagttacttg 1020
ttctgttaga caatcttctg ctcaaaatag aaaatatact attaaagttg aagttccaaa 1080
agttgctact caaactgttg gtggtgttga attgccagtt gctgcttgga gatcttattt 1140
gaatatggaa ttgactattc caattttcgc tactaattct gattgtgaat tgattgttaa 1200
agctatgcaa ggtttgttga aagatggtaa tccaattcca tctgctattg ctgctaattc 1260
tggtatttat taatatcttg catatctcat tcagatgaca aatatatatt aattaacacc 1320
tttcataaac ataaacatat ccaactaaag atatactcct agaaaagtgg ttcatttcct 1380
attctcagaa tggatcctcc ttaagacaac atttaattgt agtatgtcgt gttcgttcat 1440
tcttgtcttg tcgtgttgat gttttgaaat ctgaaaaatt tcgaattttc tttgccagtg 1500
taaacatatg gaaattcaac gtgaaatatt atttccaatt catccatcat gtgataagcc 1560
acactacaaa tcaatattga agtaactatt gagcaagtat ccacatcact gaaaagtgca 1620
tcatttagta atacatcgga taaaagcaac taagcaacat taactgcata gtatttgttg 1680
atcattttgt ggtttgaagt gcctttcatt tgctcttgac acttttccag atataaattt 1740
agaacgccat gtcaaccttt ttcgatgaaa tgaagatgtc tttcgagaca gttcctgtgg 1800
atcgggataa caagatttca acgtccgagt ttttggaagc atcagaatcc cttgtcaaat 1860
tgttcgatct tttgggaaat gctgcttttg tcgttgttca aaacgattta aacgggaaca 1920
ttgccaagct tcgcaaaaga ctgttggcca ctcccgacaa atctgctacg ttacaagatc 1980
tagttactaa tgaaagagca gagggtaaga aaacagccag tgaaggttta ttatggttaa 2040
ccagaggctt gcagttcacg gcccaggcca tgaaagaaac gattgagaac cccactacgg 2100
<210> 11
<211> 2353
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 11
gtacccgggg atcctctaga gatccaactc cgtagacgac attttgaaaa acggtgctag 60
atacgtccgt caagtgagag agtcagacaa atcaagacta gtatccttac tagtgcatgg 120
tcctccaggt tcaggaaaaa ctgccttagc tgccgccatt gcattgaagt ctgagttccc 180
atttgttaga ttaatttccc ctgaagaaat agcaggtatg tcggaaacgg ccaagatcgc 240
atacatcgac aataccttca gagatgctta caaatcgcca cttaacattc tagttatcga 300
ttcgatagag actctagttg actgggttcc aattggtcca aggttctcta ataatatcct 360
acaagtactc aaggtctatt tgaaaagaaa acctccaaat aatcgtcgtt tgcttatcat 420
ctctactacg tccgcttaca cggtgcttaa gcaaatggat atactaagtt gtttcgacaa 480
cgaaattgct gtcccaaacg tatcaaacct agatgaactg aataacatta tgacagattc 540
cggatttctt gacgatgctg gtagagtcga agttatccgc aaattatccc aagttacatc 600
tactctcaac gtcggcgtga aaaaagtact cacaaacatt gaaacagcaa gacacgatga 660
ggatcccgtc aatgaacttg ttgatctaat ggtgcaatca tcttgaatta tactattcat 720
ttctaatatg atacatttat atatatatat atattataca tatgtgatat gtactcgcga 780
catagcacat aaacttcaac ttcgttctca atgatctcga cgttcgacga aggttgaaat 840
ttttcagcat gcttccgttg aagatcaaag agaataagtg aaaaaaaact tttcagtcta 900
tttatgtgaa acttcttcct attgcacctt gcaaaataaa gatatatact tggatctagt 960
ctctgattta gaaagggtag ataacaaccg ttctcgaccc atcgattata ctagcatttc 1020
atcactcaag atgaagttca aggtccacat gctggtaagt ctactgttgg tggaggtgga 1080
tctatggctt ctaatttcac tcaattcgtt ttggttgata atggtggtac tggtgatgtt 1140
actgttgctc catctaattt cgctaatggt gttgctgaat ggatttcttc taattctaga 1200
tctcaagctt ataaagttac ttgttctgtt agacaatctt ctgctcaaaa tagaaaatat 1260
actattaaag ttgaagttcc aaaagttgct actcaaactg ttggtggtgt tgaattgcca 1320
gttgctgctt ggagatctta tttgaatatg gaattgacta ttccaatttt cgctactaat 1380
tctgattgtg aattgattgt taaagctatg caaggtttgt tgaaagatgg taatccaatt 1440
ccatctgcta ttgctgctaa ttctggtatt tatatggctc caccggcttt acgtcctgag 1500
aatgccatta gaagagctga cgaattagtc tctgttggcg agccaatggc tgcgttgcaa 1560
tctctatttg atttattatc ttcaagaagg tctcgttttg ctgatgctgc cactttggaa 1620
cctataatct tcaagttctt ggaacttggt gttgaattga ggaagggtaa aatgatcaag 1680
gaaggtttat accaatacaa gaagcatatg caacacactc ccgaaggttt gatttctgta 1740
ggtgctgttg ctcgtaaatt catcgatttg atcgaaacta agatgaccaa catccaagcg 1800
caaactgatg ccaaagaaga atccaacaag gaccaagccg aagaggatct agagggtggt 1860
gtcaccccag aaaatttgtt ggtttctgtt tacgaacaag aacaaactgt tggtggattc 1920
aacaatgatg atgtttcagc ttggttgaga tttacctggg aatcttaccg taccactcta 1980
gatttcttga gaaataattc tcaattggaa atcacgtatg cgggtgttgt taacagaacc 2040
atgcaattct gttacaaata taaccgtaag aatgaattca agcgtttagc tgaaatgtta 2100
cgtcaacatt tggatgccgc aaactaccaa caacagagat atggtcacca cactgtcgat 2160
ttatcagatc ctgacacttt gcaacgttat tctgaccaac gtttccaaca agttaacgtt 2220
tctgttaaat tggaattatg gcatgaagct ttcagatcca ttgaagatgt tcatcatttg 2280
atgcgtctct cgaagcgtgc tccaaagccc tctgtgttgg ccaactacta cgaatcgtcg 2340
acctgcaggc atg 2353
<210> 12
<211> 402
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 12
atggccaaac tagaaacggt aactctaggt aacattggta aggatgggaa gcagactttg 60
gtgttaaacc ctcgtggtgt taacccgacg aacggggtgg catcactatc ccaggcaggg 120
gcagtgccag ccctagagaa gagagtcacc gtctcagtat cccaaccgtc gaggaatagg 180
aagaattaca aagtccaggt caaaatccag aatccgacag cttgcacggc taatggatct 240
tgtgatccat cggtgacaag acaggcatac gccgatgtaa ctttctcatt tactcaatac 300
tcaacagacg aagagcgtgc ctttgtaagg acagagttgg ctgctcttct agcctcccct 360
ctacttattg acgccataga ccaactaaac ccagcttatt aa 402
<210> 13
<211> 133
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 13
Met Ala Lys Leu Glu Thr Val Thr Leu Gly Asn Ile Gly Lys Asp Gly
1 5 10 15
Lys Gln Thr Leu Val Leu Asn Pro Arg Gly Val Asn Pro Thr Asn Gly
20 25 30
Val Ala Ser Leu Ser Gln Ala Gly Ala Val Pro Ala Leu Glu Lys Arg
35 40 45
Val Thr Val Ser Val Ser Gln Pro Ser Arg Asn Arg Lys Asn Tyr Lys
50 55 60
Val Gln Val Lys Ile Gln Asn Pro Thr Ala Cys Thr Ala Asn Gly Ser
65 70 75 80
Cys Asp Pro Ser Val Thr Arg Gln Ala Tyr Ala Asp Val Thr Phe Ser
85 90 95
Phe Thr Gln Tyr Ser Thr Asp Glu Glu Arg Ala Phe Val Arg Thr Glu
100 105 110
Leu Ala Ala Leu Leu Ala Ser Pro Leu Leu Ile Asp Ala Ile Asp Gln
115 120 125
Leu Asn Pro Ala Tyr
130
<210> 14
<211> 3177
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 14
atggctccac cggctttacg tcctgagaat gccattagaa gagctgacga attagtctct 60
gttggcgagc caatggctgc gttgcaatct ctatttgatt tattatcttc aagaaggtct 120
cgttttgctg atgctgccac tttggaacct ataatcttca agttcttgga acttggtgtt 180
gaattgagga agggtaaaat gatcaaggaa ggtttatacc aatacaagaa gcatatgcaa 240
cacactcccg aaggtttgat ttctgtaggt gctgttgctc gtaaattcat cgatttgatc 300
gaaactaaga tgaccaacat ccaagcgcaa actgatgcca aagaagaatc caacaaggac 360
caagccgaag aggatctaga gggtggtgtc accccagaaa atttgttggt ttctgtttac 420
gaacaagaac aaactgttgg tggattcaac aatgatgatg tttcagcttg gttgagattt 480
acctgggaat cttaccgtac cactctagat ttcttgagaa ataattctca attggaaatc 540
acgtatgcgg gtgttgttaa cagaaccatg caattctgtt acaaatataa ccgtaagaat 600
gaattcaagc gtttagctga aatgttacgt caacatttgg atgccgcaaa ctaccaacaa 660
cagagatatg gtcaccacac tgtcgattta tcagatcctg acactttgca acgttattct 720
gaccaacgtt tccaacaagt taacgtttct gttaaattgg aattatggca tgaagctttc 780
agatccattg aagatgttca tcatttgatg cgtctctcga agcgtgctcc aaagccctct 840
gtgttggcca actactacga aaatttggcc aagatcttct ttgtctctgg taactattta 900
ttgcacgctg ctgcatggga aaaattctac aatttgtact tgaagaatcc aaatgcttcc 960
gaagaagact ttaagttcta ttcatctcaa tttgtcttgt ccgctttggc aattcaattg 1020
gatgacttac caattgctgg tttcgatcct caaattcgtc tatgtgactt actagacctt 1080
gaaagtaaac caaagagaaa ggatttgatt actgctgctg gtgaacaaca agtcgtagag 1140
aaagctgacg ctgatatttt gaaattcttc aatatattgg aaactaattt cgatgtgaag 1200
tctgctaagt ctcaattgtc tgcacttttg ccaaacttgg ttgaaaagcc atatttcgcc 1260
caatacgtgg ctccattgag aaacctattc atcagaagat ccatcattga agtctcaaag 1320
gctcaaacat ctattcactt agttgaattg catgaaatgt tgtcactgcc agctcctttc 1380
gaattatctg tttttgaact agaaaaatac ttaatccaag ctgctatgga tgattatgtt 1440
agtatttcca ttgaccatga aactgatacc gtttcatttg cccaagatcc atttgacgct 1500
tggcaagcat ctcttgttga agttcccgaa tctagcactt ctgatgaagc aaagaactct 1560
gaatccgaag aggaaacctc ccaagaaacg catgctgatg aggaacagaa tgaacaagtc 1620
ttcactcgta actcagaagt ccgttctaag ttgactgatc tatccaagat cttaaaggcc 1680
aacgaagaat acgaaaatgg ttcttactat tacagagtca aacttgtgcg tgaagaattg 1740
atcagaagaa aggaagaagt tatcaagtta gaaaaggaag ctgctgaaat tagagctaag 1800
agtaacgctg aacgcaagaa gagaagcgaa gaagaaaaca agattcttgc caagaaggct 1860
ctagaagaaa ggcaaagacg tatggctgag gaaaaggctg ctgttgaatc ttctatggag 1920
aaagaagcgg aacgtcgtgc tgaagaaatg atggaacgtg agagagaagc gatccatgaa 1980
caagaaatga agaagttgat tgctgaaact aatgccaatg gtgtcattca tattgatcca 2040
aaggaagcca agaacctaac aagtgataag atcaaccaaa tggtcattga acaagtagcc 2100
aagaacaaga aggatttgac tgaacgtatg acctatgcct tcaagaagtt agatcacctg 2160
gaaagagcct atagacaaat ggaattgcca ttgttagaaa aggacgctga agagcaaaaa 2220
aagagggata gagagaatta cgacaatttc aagaagaagt taattgagac ttccaaggcc 2280
gactatgaaa agaaattggc tctacatcaa cgtttgaaca aaatctacag tactttcaac 2340
caatacaagt catctgttat cgctgaaaag aaggaagagt tagaaaaaca acgcgccttg 2400
aaggaagctc aattagaaga agctaagaag caaagaattg aacaagtccg taaggaacgt 2460
tatgaagcta aagttgctga aatacaagct gcaattgaag ctgaagctgc tgaaaaggag 2520
gctttggcta aggaggaaga acttgccaag agacgtgccg aacgtgaaag aatcaacaag 2580
gaaagagacg aaattgctag aaagcaaaga gaaatcgaag agttgttgga gaaaaaaaat 2640
ggtagctcta gatctagccc tgttccttct actccaaccc cagcaccagc accagcacaa 2700
actgctccgg tatccaataa accaatgtct atggctgaaa agttgagact gaagagaatg 2760
aatgctggaa gaggtatggc caaactagaa acggtaactc taggtaacat tggtaaggat 2820
gggaagcaga ctttggtgtt aaaccctcgt ggtgttaacc cgacgaacgg ggtggcatca 2880
ctatcccagg caggggcagt gccagcccta gagaagagag tcaccgtctc agtatcccaa 2940
ccgtcgagga ataggaagaa ttacaaagtc caggtcaaaa tccagaatcc gacagcttgc 3000
acggctaatg gatcttgtga tccatcggtg acaagacagg catacgccga tgtaactttc 3060
tcatttactc aatactcaac agacgaagag cgtgcctttg taaggacaga gttggctgct 3120
cttctagcct cccctctact tattgacgcc atagaccaac taaacccagc ttattaa 3177
<210> 15
<211> 1058
<212> PRT
<213> Artificial sequence (artificial sequence)
<400> 15
Met Ala Pro Pro Ala Leu Arg Pro Glu Asn Ala Ile Arg Arg Ala Asp
1 5 10 15
Glu Leu Val Ser Val Gly Glu Pro Met Ala Ala Leu Gln Ser Leu Phe
20 25 30
Asp Leu Leu Ser Ser Arg Arg Ser Arg Phe Ala Asp Ala Ala Thr Leu
35 40 45
Glu Pro Ile Ile Phe Lys Phe Leu Glu Leu Gly Val Glu Leu Arg Lys
50 55 60
Gly Lys Met Ile Lys Glu Gly Leu Tyr Gln Tyr Lys Lys His Met Gln
65 70 75 80
His Thr Pro Glu Gly Leu Ile Ser Val Gly Ala Val Ala Arg Lys Phe
85 90 95
Ile Asp Leu Ile Glu Thr Lys Met Thr Asn Ile Gln Ala Gln Thr Asp
100 105 110
Ala Lys Glu Glu Ser Asn Lys Asp Gln Ala Glu Glu Asp Leu Glu Gly
115 120 125
Gly Val Thr Pro Glu Asn Leu Leu Val Ser Val Tyr Glu Gln Glu Gln
130 135 140
Thr Val Gly Gly Phe Asn Asn Asp Asp Val Ser Ala Trp Leu Arg Phe
145 150 155 160
Thr Trp Glu Ser Tyr Arg Thr Thr Leu Asp Phe Leu Arg Asn Asn Ser
165 170 175
Gln Leu Glu Ile Thr Tyr Ala Gly Val Val Asn Arg Thr Met Gln Phe
180 185 190
Cys Tyr Lys Tyr Asn Arg Lys Asn Glu Phe Lys Arg Leu Ala Glu Met
195 200 205
Leu Arg Gln His Leu Asp Ala Ala Asn Tyr Gln Gln Gln Arg Tyr Gly
210 215 220
His His Thr Val Asp Leu Ser Asp Pro Asp Thr Leu Gln Arg Tyr Ser
225 230 235 240
Asp Gln Arg Phe Gln Gln Val Asn Val Ser Val Lys Leu Glu Leu Trp
245 250 255
His Glu Ala Phe Arg Ser Ile Glu Asp Val His His Leu Met Arg Leu
260 265 270
Ser Lys Arg Ala Pro Lys Pro Ser Val Leu Ala Asn Tyr Tyr Glu Asn
275 280 285
Leu Ala Lys Ile Phe Phe Val Ser Gly Asn Tyr Leu Leu His Ala Ala
290 295 300
Ala Trp Glu Lys Phe Tyr Asn Leu Tyr Leu Lys Asn Pro Asn Ala Ser
305 310 315 320
Glu Glu Asp Phe Lys Phe Tyr Ser Ser Gln Phe Val Leu Ser Ala Leu
325 330 335
Ala Ile Gln Leu Asp Asp Leu Pro Ile Ala Gly Phe Asp Pro Gln Ile
340 345 350
Arg Leu Cys Asp Leu Leu Asp Leu Glu Ser Lys Pro Lys Arg Lys Asp
355 360 365
Leu Ile Thr Ala Ala Gly Glu Gln Gln Val Val Glu Lys Ala Asp Ala
370 375 380
Asp Ile Leu Lys Phe Phe Asn Ile Leu Glu Thr Asn Phe Asp Val Lys
385 390 395 400
Ser Ala Lys Ser Gln Leu Ser Ala Leu Leu Pro Asn Leu Val Glu Lys
405 410 415
Pro Tyr Phe Ala Gln Tyr Val Ala Pro Leu Arg Asn Leu Phe Ile Arg
420 425 430
Arg Ser Ile Ile Glu Val Ser Lys Ala Gln Thr Ser Ile His Leu Val
435 440 445
Glu Leu His Glu Met Leu Ser Leu Pro Ala Pro Phe Glu Leu Ser Val
450 455 460
Phe Glu Leu Glu Lys Tyr Leu Ile Gln Ala Ala Met Asp Asp Tyr Val
465 470 475 480
Ser Ile Ser Ile Asp His Glu Thr Asp Thr Val Ser Phe Ala Gln Asp
485 490 495
Pro Phe Asp Ala Trp Gln Ala Ser Leu Val Glu Val Pro Glu Ser Ser
500 505 510
Thr Ser Asp Glu Ala Lys Asn Ser Glu Ser Glu Glu Glu Thr Ser Gln
515 520 525
Glu Thr His Ala Asp Glu Glu Gln Asn Glu Gln Val Phe Thr Arg Asn
530 535 540
Ser Glu Val Arg Ser Lys Leu Thr Asp Leu Ser Lys Ile Leu Lys Ala
545 550 555 560
Asn Glu Glu Tyr Glu Asn Gly Ser Tyr Tyr Tyr Arg Val Lys Leu Val
565 570 575
Arg Glu Glu Leu Ile Arg Arg Lys Glu Glu Val Ile Lys Leu Glu Lys
580 585 590
Glu Ala Ala Glu Ile Arg Ala Lys Ser Asn Ala Glu Arg Lys Lys Arg
595 600 605
Ser Glu Glu Glu Asn Lys Ile Leu Ala Lys Lys Ala Leu Glu Glu Arg
610 615 620
Gln Arg Arg Met Ala Glu Glu Lys Ala Ala Val Glu Ser Ser Met Glu
625 630 635 640
Lys Glu Ala Glu Arg Arg Ala Glu Glu Met Met Glu Arg Glu Arg Glu
645 650 655
Ala Ile His Glu Gln Glu Met Lys Lys Leu Ile Ala Glu Thr Asn Ala
660 665 670
Asn Gly Val Ile His Ile Asp Pro Lys Glu Ala Lys Asn Leu Thr Ser
675 680 685
Asp Lys Ile Asn Gln Met Val Ile Glu Gln Val Ala Lys Asn Lys Lys
690 695 700
Asp Leu Thr Glu Arg Met Thr Tyr Ala Phe Lys Lys Leu Asp His Leu
705 710 715 720
Glu Arg Ala Tyr Arg Gln Met Glu Leu Pro Leu Leu Glu Lys Asp Ala
725 730 735
Glu Glu Gln Lys Lys Arg Asp Arg Glu Asn Tyr Asp Asn Phe Lys Lys
740 745 750
Lys Leu Ile Glu Thr Ser Lys Ala Asp Tyr Glu Lys Lys Leu Ala Leu
755 760 765
His Gln Arg Leu Asn Lys Ile Tyr Ser Thr Phe Asn Gln Tyr Lys Ser
770 775 780
Ser Val Ile Ala Glu Lys Lys Glu Glu Leu Glu Lys Gln Arg Ala Leu
785 790 795 800
Lys Glu Ala Gln Leu Glu Glu Ala Lys Lys Gln Arg Ile Glu Gln Val
805 810 815
Arg Lys Glu Arg Tyr Glu Ala Lys Val Ala Glu Ile Gln Ala Ala Ile
820 825 830
Glu Ala Glu Ala Ala Glu Lys Glu Ala Leu Ala Lys Glu Glu Glu Leu
835 840 845
Ala Lys Arg Arg Ala Glu Arg Glu Arg Ile Asn Lys Glu Arg Asp Glu
850 855 860
Ile Ala Arg Lys Gln Arg Glu Ile Glu Glu Leu Leu Glu Lys Lys Asn
865 870 875 880
Gly Ser Ser Arg Ser Ser Pro Val Pro Ser Thr Pro Thr Pro Ala Pro
885 890 895
Ala Pro Ala Gln Thr Ala Pro Val Ser Asn Lys Pro Met Ser Met Ala
900 905 910
Glu Lys Leu Arg Leu Lys Arg Met Asn Ala Gly Arg Gly Met Ala Lys
915 920 925
Leu Glu Thr Val Thr Leu Gly Asn Ile Gly Lys Asp Gly Lys Gln Thr
930 935 940
Leu Val Leu Asn Pro Arg Gly Val Asn Pro Thr Asn Gly Val Ala Ser
945 950 955 960
Leu Ser Gln Ala Gly Ala Val Pro Ala Leu Glu Lys Arg Val Thr Val
965 970 975
Ser Val Ser Gln Pro Ser Arg Asn Arg Lys Asn Tyr Lys Val Gln Val
980 985 990
Lys Ile Gln Asn Pro Thr Ala Cys Thr Ala Asn Gly Ser Cys Asp Pro
995 1000 1005
Ser Val Thr Arg Gln Ala Tyr Ala Asp Val Thr Phe Ser Phe Thr Gln
1010 1015 1020
Tyr Ser Thr Asp Glu Glu Arg Ala Phe Val Arg Thr Glu Leu Ala Ala
1025 1030 1035 1040
Leu Leu Ala Ser Pro Leu Leu Ile Asp Ala Ile Asp Gln Leu Asn Pro
1045 1050 1055
Ala Tyr
<210> 16
<211> 2019
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 16
aatgaagaag ttgattgctg aaactaatgc caatggtgtc attcatattg atccaaagga 60
agccaagaac ctaacaagtg ataagatcaa ccaaatggtc attgaacaag tagccaagaa 120
caagaaggat ttgactgaac gtatgaccta tgccttcaag aagttagatc acctggaaag 180
agcctataga caaatggaat tgccattgtt agaaaaggac gctgaagagc aaaaaaagag 240
ggatagagag aattacgaca atttcaagaa gaagttaatt gagacttcca aggccgacta 300
tgaaaagaaa ttggctctac atcaacgttt gaacaaaatc tacagtactt tcaaccaata 360
caagtcatct gttatcgctg aaaagaagga agagttagaa aaacaacgcg ccttgaagga 420
agctcaatta gaagaagcta agaagcaaag aattgaacaa gtccgtaagg aacgttatga 480
agctaaagtt gctgaaatac aagctgcaat tgaagctgaa gctgctgaaa aggaggcttt 540
ggctaaggag gaagaacttg ccaagagacg tgccgaacgt gaaagaatca acaaggaaag 600
agacgaaatt gctagaaagc aaagagaaat cgaagagttg ttggagaaaa aaaatggtag 660
ctctagatct agccctgttc cttctactcc aaccccagca ccagcaccag cacaaactgc 720
tccggtatcc aataaaccaa tgtctatggc tgaaaagttg agactgaaga gaatgaatgc 780
tggaagaggt atggccaaac tagaaacggt aactctaggt aacattggta aggatgggaa 840
gcagactttg gtgttaaacc ctcgtggtgt taacccgacg aacggggtgg catcactatc 900
ccaggcaggg gcagtgccag ccctagagaa gagagtcacc gtctcagtat cccaaccgtc 960
gaggaatagg aagaattaca aagtccaggt caaaatccag aatccgacag cttgcacggc 1020
taatggatct tgtgatccat cggtgacaag acaggcatac gccgatgtaa ctttctcatt 1080
tactcaatac tcaacagacg aagagcgtgc ctttgtaagg acagagttgg ctgctcttct 1140
agcctcccct ctacttattg acgccataga ccaactaaac ccagcttatt aatatcttgc 1200
atatctcatt cagatgacaa atatatatta attaacacct ttcataaaca taaacatatc 1260
caactaaaga tatactccta gaaaagtggt tcatttccta ttctcagaat ggatcctcct 1320
taagacaaca tttaattgta gtatgtcgtg ttcgttcatt cttgtcttgt cgtgttgatg 1380
ttttgaaatc tgaaaaattt cgaattttct ttgccagtgt aaacatatgg aaattcaacg 1440
tgaaatatta tttccaattc atccatcatg tgataagcca cactacaaat caatattgaa 1500
gtaactattg agcaagtatc cacatcactg aaaagtgcat catttagtaa tacatcggat 1560
aaaagcaact aagcaacatt aactgcatag tatttgttga tcattttgtg gtttgaagtg 1620
cctttcattt gctcttgaca cttttccaga tataaattta gaacgccatg tcaacctttt 1680
tcgatgaaat gaagatgtct ttcgagacag ttcctgtgga tcgggataac aagatttcaa 1740
cgtccgagtt tttggaagca tcagaatccc ttgtcaaatt gttcgatctt ttgggaaatg 1800
ctgcttttgt cgttgttcaa aacgatttaa acgggaacat tgccaagctt cgcaaaagac 1860
tgttggccac tcccgacaaa tctgctacgt tacaagatct agttactaat gaaagagcag 1920
agggtaagaa aacagccagt gaaggtttat tatggttaac cagaggcttg cagttcacgg 1980
cccaggccat gaaagaaacg attgagaacc ccactacgg 2019
<210> 17
<211> 2305
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 17
gtacccgggg atcctctaga gatccaactc cgtagacgac attttgaaaa acggtgctag 60
atacgtccgt caagtgagag agtcagacaa atcaagacta gtatccttac tagtgcatgg 120
tcctccaggt tcaggaaaaa ctgccttagc tgccgccatt gcattgaagt ctgagttccc 180
atttgttaga ttaatttccc ctgaagaaat agcaggtatg tcggaaacgg ccaagatcgc 240
atacatcgac aataccttca gagatgctta caaatcgcca cttaacattc tagttatcga 300
ttcgatagag actctagttg actgggttcc aattggtcca aggttctcta ataatatcct 360
acaagtactc aaggtctatt tgaaaagaaa acctccaaat aatcgtcgtt tgcttatcat 420
ctctactacg tccgcttaca cggtgcttaa gcaaatggat atactaagtt gtttcgacaa 480
cgaaattgct gtcccaaacg tatcaaacct agatgaactg aataacatta tgacagattc 540
cggatttctt gacgatgctg gtagagtcga agttatccgc aaattatccc aagttacatc 600
tactctcaac gtcggcgtga aaaaagtact cacaaacatt gaaacagcaa gacacgatga 660
ggatcccgtc aatgaacttg ttgatctaat ggtgcaatca tcttgaatta tactattcat 720
ttctaatatg atacatttat atatatatat atattataca tatgtgatat gtactcgcga 780
catagcacat aaacttcaac ttcgttctca atgatctcga cgttcgacga aggttgaaat 840
ttttcagcat gcttccgttg aagatcaaag agaataagtg aaaaaaaact tttcagtcta 900
tttatgtgaa acttcttcct attgcacctt gcaaaataaa gatatatact tggatctagt 960
ctctgattta gaaagggtag ataacaaccg ttctcgaccc atcgattata ctagcatttc 1020
atcatggcca aactagaaac ggtaactcta ggtaacattg gtaaggatgg gaagcagact 1080
ttggtgttaa accctcgtgg tgttaacccg acgaacgggg tggcatcact atcccaggca 1140
ggggcagtgc cagccctaga gaagagagtc accgtctcag tatcccaacc gtcgaggaat 1200
aggaagaatt acaaagtcca ggtcaaaatc cagaatccga cagcttgcac ggctaatgga 1260
tcttgtgatc catcggtgac aagacaggca tacgccgatg taactttctc atttactcaa 1320
tactcaacag acgaagagcg tgcctttgta aggacagagt tggctgctct tctagcctcc 1380
cctctactta ttgacgccat agaccaacta aacccagctt attaaatggc tccaccggct 1440
ttacgtcctg agaatgccat tagaagagct gacgaattag tctctgttgg cgagccaatg 1500
gctgcgttgc aatctctatt tgatttatta tcttcaagaa ggtctcgttt tgctgatgct 1560
gccactttgg aacctataat cttcaagttc ttggaacttg gtgttgaatt gaggaagggt 1620
aaaatgatca aggaaggttt ataccaatac aagaagcata tgcaacacac tcccgaaggt 1680
ttgatttctg taggtgctgt tgctcgtaaa ttcatcgatt tgatcgaaac taagatgacc 1740
aacatccaag cgcaaactga tgccaaagaa gaatccaaca aggaccaagc cgaagaggat 1800
ctagagggtg gtgtcacccc agaaaatttg ttggtttctg tttacgaaca agaacaaact 1860
gttggtggat tcaacaatga tgatgtttca gcttggttga gatttacctg ggaatcttac 1920
cgtaccactc tagatttctt gagaaataat tctcaattgg aaatcacgta tgcgggtgtt 1980
gttaacagaa ccatgcaatt ctgttacaaa tataaccgta agaatgaatt caagcgttta 2040
gctgaaatgt tacgtcaaca tttggatgcc gcaaactacc aacaacagag atatggtcac 2100
cacactgtcg atttatcaga tcctgacact ttgcaacgtt attctgacca acgtttccaa 2160
caagttaacg tttctgttaa attggaatta tggcatgaag ctttcagatc cattgaagat 2220
gttcatcatt tgatgcgtct ctcgaagcgt gctccaaagc cctctgtgtt ggccaactac 2280
tacgaatcgt cgacctgcag gcatg 2305
<210> 18
<211> 1311
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 18
cgcgaaatta atacgactca ctatagggaa aaaagaaatc tctcaagctg aaattaaacc 60
aaaactctaa tataagaaaa aaaaatagaa aggtattttt acaacaatta ccaacaacaa 120
caaacaacaa acaacattac aattactatt tacaattaca aaaaaaaaaa atgattacag 180
aaacatcatc accgttcaga tctatattct cccacagtgg gaaacaccac catcaccacc 240
accatcacgg gagcggcgtg agcaagggcg aggagctgtt caccggggtg gtgcccatcc 300
tggtcgagct ggacggcgac gtaaacggcc acaagttcag cgtgcgcggc gagggcgagg 360
gcgatgccac caacggcaag ctgaccctga agttcatctg caccaccggc aagctgcccg 420
tgccctggcc caccctcgtg accaccctga cctacggcgt gcagtgcttc agccgctacc 480
ccgaccacat gaagcagcac gacttcttca agtccgccat gcccgaaggc tacgtccagg 540
agcgcaccat ctccttcaag gacgacggca cctacaagac ccgcgccgag gtgaagttcg 600
agggcgacac cctggtgaac cgcatcgagc tgaagggcat cgacttcaag gaggacggca 660
acatcctggg gcacaagctg gagtacaact tcaacagcca caacgtctat atcacggccg 720
acaagcagaa gaacggcatc aaggcgaact tcaagatccg ccacaacgtc gaggacggca 780
gcgtgcagct cgccgaccac taccagcaga acacccccat cggcgacggc cccgtgctgc 840
tgcccgacaa ccactacctg agcacccagt ccaagctgag caaagacccc aacgagaagc 900
gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg gatcactctc ggcatggacg 960
agctgtacaa gtaaaaacat gaggattacc catgttatat cgcgcgaaaa catgaggatt 1020
acccatgtat aaggattaat tacttggatg ccaataaaaa aaaaaaagcg acatagccaa 1080
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1140
aaaaaaaaaa aaaaaaaaaa aaaaaaaaac gaactcgagc accaccacca ccaccactga 1200
gatccggctg ctaacaaagc ccgaaaggaa gctgagttgg ctgctgccac cgctgagcaa 1260
taactagcat aaccccttgg ggcctctaaa cgggtcttga ggggtttttt g 1311
<210> 19
<211> 104
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 19
aaacatgagg attacccatg ttatatcgcg cgaaaacatg aggattaccc atgtataagg 60
attaattact tggatgccaa taaaaaaaaa aaagcgacat agcc 104
<210> 20
<211> 1257
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 20
cgcgaaatta atacgactca ctatagggaa aaaagaaatc tctcaagctg aaattaaacc 60
aaaactctaa tataagaaaa aaaaatagaa aggtattttt acaacaatta ccaacaacaa 120
caaacaacaa acaacattac aattactatt tacaattaca aaaaaaaaaa atgattacag 180
aaacatcatc accgttcaga tctatattct cccacagtgg gaaacaccac catcaccacc 240
accatcacgg gagcggcgtg agcaagggcg aggagctgtt caccggggtg gtgcccatcc 300
tggtcgagct ggacggcgac gtaaacggcc acaagttcag cgtgcgcggc gagggcgagg 360
gcgatgccac caacggcaag ctgaccctga agttcatctg caccaccggc aagctgcccg 420
tgccctggcc caccctcgtg accaccctga cctacggcgt gcagtgcttc agccgctacc 480
ccgaccacat gaagcagcac gacttcttca agtccgccat gcccgaaggc tacgtccagg 540
agcgcaccat ctccttcaag gacgacggca cctacaagac ccgcgccgag gtgaagttcg 600
agggcgacac cctggtgaac cgcatcgagc tgaagggcat cgacttcaag gaggacggca 660
acatcctggg gcacaagctg gagtacaact tcaacagcca caacgtctat atcacggccg 720
acaagcagaa gaacggcatc aaggcgaact tcaagatccg ccacaacgtc gaggacggca 780
gcgtgcagct cgccgaccac taccagcaga acacccccat cggcgacggc cccgtgctgc 840
tgcccgacaa ccactacctg agcacccagt ccaagctgag caaagacccc aacgagaagc 900
gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg gatcactctc ggcatggacg 960
agctgtacaa gtaaataagg attaattact tggatgccaa taaaaaaaaa aaagcgacat 1020
agccaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1080
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaacgaac tcgagcacca ccaccaccac 1140
cactgagatc cggctgctaa caaagcccga aaggaagctg agttggctgc tgccaccgct 1200
gagcaataac tagcataacc ccttggggcc tctaaacggg tcttgagggg ttttttg 1257
<210> 21
<211> 1126
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 21
cgcgaaatta atacgactca ctatagggaa aaaagaaatc tctcaagctg aaattaaacc 60
aaaactctaa tataagaaaa aaaaatagaa aggtattttt acaacaatta ccaacaacaa 120
caaacaacaa acaacattac aattactatt tacaattaca aaaaaaaaaa atgattacag 180
aaacatcatc accgttcaga tctatattct cccacagtgg gaaacaccac catcaccacc 240
accatcacgg gagcggcgtg agcaagggcg aggagctgtt caccggggtg gtgcccatcc 300
tggtcgagct ggacggcgac gtaaacggcc acaagttcag cgtgcgcggc gagggcgagg 360
gcgatgccac caacggcaag ctgaccctga agttcatctg caccaccggc aagctgcccg 420
tgccctggcc caccctcgtg accaccctga cctacggcgt gcagtgcttc agccgctacc 480
ccgaccacat gaagcagcac gacttcttca agtccgccat gcccgaaggc tacgtccagg 540
agcgcaccat ctccttcaag gacgacggca cctacaagac ccgcgccgag gtgaagttcg 600
agggcgacac cctggtgaac cgcatcgagc tgaagggcat cgacttcaag gaggacggca 660
acatcctggg gcacaagctg gagtacaact tcaacagcca caacgtctat atcacggccg 720
acaagcagaa gaacggcatc aaggcgaact tcaagatccg ccacaacgtc gaggacggca 780
gcgtgcagct cgccgaccac taccagcaga acacccccat cggcgacggc cccgtgctgc 840
tgcccgacaa ccactacctg agcacccagt ccaagctgag caaagacccc aacgagaagc 900
gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg gatcactctc ggcatggacg 960
agctgtacaa gtaaaaacat gaggattacc catgttatat cgcgcgaaaa catgaggatt 1020
acccatgtat aaggattaat tacttggatg ccaataaaaa aaaaaaagcg acatagccct 1080
agcataaccc cttggggcct ctaaacgggt cttgaggggt tttttg 1126
<210> 22
<211> 1159
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 22
cgcgaaatta atacgactca ctatagggaa aaaagaaatc tctcaagctg aaattaaacc 60
aaaactctaa tataagaaaa aaaaatagaa aggtattttt acaacaatta ccaacaacaa 120
caaacaacaa acaacattac aattactatt tacaattaca aaaaaaaaaa atgattacag 180
aaacatcatc accgttcaga tctatattct cccacagtgg gaaacaccac catcaccacc 240
accatcacgg gagcggcgtg agcaagggcg aggagctgtt caccggggtg gtgcccatcc 300
tggtcgagct ggacggcgac gtaaacggcc acaagttcag cgtgcgcggc gagggcgagg 360
gcgatgccac caacggcaag ctgaccctga agttcatctg caccaccggc aagctgcccg 420
tgccctggcc caccctcgtg accaccctga cctacggcgt gcagtgcttc agccgctacc 480
ccgaccacat gaagcagcac gacttcttca agtccgccat gcccgaaggc tacgtccagg 540
agcgcaccat ctccttcaag gacgacggca cctacaagac ccgcgccgag gtgaagttcg 600
agggcgacac cctggtgaac cgcatcgagc tgaagggcat cgacttcaag gaggacggca 660
acatcctggg gcacaagctg gagtacaact tcaacagcca caacgtctat atcacggccg 720
acaagcagaa gaacggcatc aaggcgaact tcaagatccg ccacaacgtc gaggacggca 780
gcgtgcagct cgccgaccac taccagcaga acacccccat cggcgacggc cccgtgctgc 840
tgcccgacaa ccactacctg agcacccagt ccaagctgag caaagacccc aacgagaagc 900
gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg gatcactctc ggcatggacg 960
agctgtacaa gtaaataagg attaattaaa acatgaggat tacccatgtt atatcgcgcg 1020
aaaacatgag gattacccat gttatatcgc gcgaaaacat gaggattacc catgtcttgg 1080
atgccaataa aaaaaaaaaa gcgacatagc cctagcataa ccccttgggg cctctaaacg 1140
ggtcttgagg ggttttttg 1159
<210> 23
<211> 137
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 23
ataaggatta attaaaacat gaggattacc catgttatat cgcgcgaaaa catgaggatt 60
acccatgtta tatcgcgcga aaacatgagg attacccatg tcttggatgc caataaaaaa 120
aaaaaagcga catagcc 137
<210> 24
<211> 1317
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 24
cgcgaaatta atacgactca ctatagggaa aaaagaaatc tctcaagctg aaattaaacc 60
aaaactctaa tataagaaaa aaaaatagaa aggtattttt acaacaatta ccaacaacaa 120
caaacaacaa acaacattac aattactatt tacaattaca aaaaaaaaaa atgattacag 180
aaacatcatc accgttcaga tctatattct cccacagtgg gaaacaccac catcaccacc 240
accatcacgg gagcggcgtg agcaagggcg aggagctgtt caccggggtg gtgcccatcc 300
tggtcgagct ggacggcgac gtaaacggcc acaagttcag cgtgcgcggc gagggcgagg 360
gcgatgccac caacggcaag ctgaccctga agttcatctg caccaccggc aagctgcccg 420
tgccctggcc caccctcgtg accaccctga cctacggcgt gcagtgcttc agccgctacc 480
ccgaccacat gaagcagcac gacttcttca agtccgccat gcccgaaggc tacgtccagg 540
agcgcaccat ctccttcaag gacgacggca cctacaagac ccgcgccgag gtgaagttcg 600
agggcgacac cctggtgaac cgcatcgagc tgaagggcat cgacttcaag gaggacggca 660
acatcctggg gcacaagctg gagtacaact tcaacagcca caacgtctat atcacggccg 720
acaagcagaa gaacggcatc aaggcgaact tcaagatccg ccacaacgtc gaggacggca 780
gcgtgcagct cgccgaccac taccagcaga acacccccat cggcgacggc cccgtgctgc 840
tgcccgacaa ccactacctg agcacccagt ccaagctgag caaagacccc aacgagaagc 900
gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg gatcactctc ggcatggacg 960
agctgtacaa gtaaataagg attaattata aggattaatt gcatgtctaa gacagcaata 1020
aggattaatt gcatgtctaa gacagcaact tggatgccaa taaaaaaaaa aaagcgacat 1080
agccaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1140
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaacgaac tcgagcacca ccaccaccac 1200
cactgagatc cggctgctaa caaagcccga aaggaagctg agttggctgc tgccaccgct 1260
gagcaataac tagcataacc ccttggggcc tctaaacggg tcttgagggg ttttttg 1317
<210> 25
<211> 110
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 25
ataaggatta attataagga ttaattgcat gtctaagaca gcaataagga ttaattgcat 60
gtctaagaca gcaacttgga tgccaataaa aaaaaaaaag cgacatagcc 110
<210> 26
<211> 1347
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 26
cgcgaaatta atacgactca ctatagggaa aaaagaaatc tctcaagctg aaattaaacc 60
aaaactctaa tataagaaaa aaaaatagaa aggtattttt acaacaatta ccaacaacaa 120
caaacaacaa acaacattac aattactatt tacaattaca aaaaaaaaaa atgattacag 180
aaacatcatc accgttcaga tctatattct cccacagtgg gaaacaccac catcaccacc 240
accatcacgg gagcggcgtg agcaagggcg aggagctgtt caccggggtg gtgcccatcc 300
tggtcgagct ggacggcgac gtaaacggcc acaagttcag cgtgcgcggc gagggcgagg 360
gcgatgccac caacggcaag ctgaccctga agttcatctg caccaccggc aagctgcccg 420
tgccctggcc caccctcgtg accaccctga cctacggcgt gcagtgcttc agccgctacc 480
ccgaccacat gaagcagcac gacttcttca agtccgccat gcccgaaggc tacgtccagg 540
agcgcaccat ctccttcaag gacgacggca cctacaagac ccgcgccgag gtgaagttcg 600
agggcgacac cctggtgaac cgcatcgagc tgaagggcat cgacttcaag gaggacggca 660
acatcctggg gcacaagctg gagtacaact tcaacagcca caacgtctat atcacggccg 720
acaagcagaa gaacggcatc aaggcgaact tcaagatccg ccacaacgtc gaggacggca 780
gcgtgcagct cgccgaccac taccagcaga acacccccat cggcgacggc cccgtgctgc 840
tgcccgacaa ccactacctg agcacccagt ccaagctgag caaagacccc aacgagaagc 900
gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg gatcactctc ggcatggacg 960
agctgtacaa gtaaataagg attaattata aggattaatt gcatgtctaa gacagcaata 1020
aggattaatt gcatgtctaa gacagcaata aggattaatt gcatgtctaa gacagcaact 1080
tggatgccaa taaaaaaaaa aaagcgacat agccaaaaaa aaaaaaaaaa aaaaaaaaaa 1140
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1200
aaaaacgaac tcgagcacca ccaccaccac cactgagatc cggctgctaa caaagcccga 1260
aaggaagctg agttggctgc tgccaccgct gagcaataac tagcataacc ccttggggcc 1320
tctaaacggg tcttgagggg ttttttg 1347
<210> 27
<211> 140
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 27
ataaggatta attataagga ttaattgcat gtctaagaca gcaataagga ttaattgcat 60
gtctaagaca gcaataagga ttaattgcat gtctaagaca gcaacttgga tgccaataaa 120
aaaaaaaaag cgacatagcc 140
<210> 28
<211> 1162
<212> DNA
<213> Artificial sequence (artificial sequence)
<400> 28
cgcgaaatta atacgactca ctatagggaa aaaagaaatc tctcaagctg aaattaaacc 60
aaaactctaa tataagaaaa aaaaatagaa aggtattttt acaacaatta ccaacaacaa 120
caaacaacaa acaacattac aattactatt tacaattaca aaaaaaaaaa atgattacag 180
aaacatcatc accgttcaga tctatattct cccacagtgg gaaacaccac catcaccacc 240
accatcacgg gagcggcgtg agcaagggcg aggagctgtt caccggggtg gtgcccatcc 300
tggtcgagct ggacggcgac gtaaacggcc acaagttcag cgtgcgcggc gagggcgagg 360
gcgatgccac caacggcaag ctgaccctga agttcatctg caccaccggc aagctgcccg 420
tgccctggcc caccctcgtg accaccctga cctacggcgt gcagtgcttc agccgctacc 480
ccgaccacat gaagcagcac gacttcttca agtccgccat gcccgaaggc tacgtccagg 540
agcgcaccat ctccttcaag gacgacggca cctacaagac ccgcgccgag gtgaagttcg 600
agggcgacac cctggtgaac cgcatcgagc tgaagggcat cgacttcaag gaggacggca 660
acatcctggg gcacaagctg gagtacaact tcaacagcca caacgtctat atcacggccg 720
acaagcagaa gaacggcatc aaggcgaact tcaagatccg ccacaacgtc gaggacggca 780
gcgtgcagct cgccgaccac taccagcaga acacccccat cggcgacggc cccgtgctgc 840
tgcccgacaa ccactacctg agcacccagt ccaagctgag caaagacccc aacgagaagc 900
gcgatcacat ggtcctgctg gagttcgtga ccgccgccgg gatcactctc ggcatggacg 960
agctgtacaa gtaaataagg attaattata aggattaatt gcatgtctaa gacagcaata 1020
aggattaatt gcatgtctaa gacagcaata aggattaatt gcatgtctaa gacagcaact 1080
tggatgccaa taaaaaaaaa aaagcgacat agccctagca taaccccttg gggcctctaa 1140
acgggtcttg aggggttttt tg 1162
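
The DNA and protein entries in this listing can be cross-checked against each other. The following is a minimal sketch, not part of the patent, assuming the standard genetic code; the function name `translate` is hypothetical. It translates the first and last codons of SEQ ID NO: 14, for comparison with the termini of SEQ ID NO: 15, and checks that 3177 nt corresponds to 1058 residues plus one stop codon.

```python
# Illustrative sketch only; assumes the standard genetic code (NCBI table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    seq = "".join(dna.split()).upper()  # drop the listing's spacing
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First 30 nt and last 12 nt of SEQ ID NO: 14, copied from the listing above.
seq14_start = "atggctccac cggctttacg tcctgagaat"
seq14_end = "ccagcttattaa"

print(translate(seq14_start))  # compare with the start of SEQ ID NO: 15
print(translate(seq14_end))    # compare with the end of SEQ ID NO: 15

# Length check: 3177 nt should give 1059 codons, i.e. 1058 residues + 1 stop.
codons, remainder = divmod(3177, 3)
print(codons - 1, remainder)
```

The one-letter output can be read against the three-letter residues in SEQ ID NO: 15 (Met Ala Pro Pro Ala ... Pro Ala Tyr).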

Claims (11)

1. A fusion protein having the structure of formula Ia or formula Ib:
A-B-C (Ia)
C-B-A (Ib);
wherein
A is a translation initiation factor element, eIF3a;
B is absent or is a linker peptide;
C is an RNA-binding protein element, namely the MS2 capsid protein or the Qβ capsid protein;
each "-" is a peptide bond.
2. An isolated polynucleotide encoding the fusion protein of claim 1.
3. A vector or vector combination comprising the polynucleotide of claim 2.
4. A genetically engineered cell having the polynucleotide of claim 2 integrated at one or more sites in its genome or comprising the vector or combination of vectors of claim 3.
5. A cell extract or cell lysate obtained from the genetically engineered cell of claim 4, wherein the cell extract or cell lysate comprises the fusion protein of claim 1.
6. An expression vector for use with the fusion protein of claim 1, characterized in that the expression vector comprises at least: a template DNA sequence and the specific recognition and binding site sequence of the RNA-binding protein element.
7. The expression vector of claim 6, wherein the expression vector further comprises one or more elements selected from the group consisting of: a promoter element, an enhancer element, a tobacco mosaic virus 5' leader sequence, a Kozak sequence, a polyadenylation sequence, and a terminator element.
8. An in vitro cell-free protein synthesis system comprising at least the cell extract or cell lysate of claim 5 and the expression vector of claim 6 or 7.
9. The synthesis system according to claim 8, characterized in that the synthesis system further comprises one or more components selected from the group consisting of: substrates for RNA synthesis, substrates for protein synthesis, polyethylene glycol, magnesium ions, potassium ions, buffers, RNA polymerase, an energy regeneration system, dithiothreitol, and optionally an aqueous solvent.
10. A method for synthesizing a foreign protein, comprising the steps of:
(i) constructing an expression vector according to claim 6 or 7 containing a foreign protein template DNA sequence;
(ii) providing the in vitro cell-free protein synthesis system of claim 8 or 9, and performing an incubation reaction to synthesize the foreign protein;
and optionally (iii) isolating or detecting the foreign protein.
11. A kit comprising a container and the components of the in vitro cell-free protein synthesis system of claim 8 or 9 separately disposed in the container.
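
Claims 1 and 6 describe two pieces that work together: a fusion protein assembled in the order A-B-C or C-B-A, and an expression vector that carries the recognition site of the RNA-binding protein. The sketch below is illustrative only and is not part of the claims. The names `build_fusion` and `find_sites`, the placeholder element strings, and the choice of the 19-nt repeat `acatgaggattacccatgt` (which occurs in SEQ ID NO: 19) as a stand-in recognition site are all assumptions made for the example.

```python
# Illustrative sketch; function names and placeholder elements are hypothetical.

def build_fusion(a: str, c: str, linker: str = "", formula: str = "Ia") -> str:
    """Concatenate elements as A-B-C (formula Ia) or C-B-A (formula Ib).

    `linker` plays the role of element B; an empty string models "B is absent".
    """
    if formula == "Ia":
        return a + linker + c
    if formula == "Ib":
        return c + linker + a
    raise ValueError("formula must be 'Ia' or 'Ib'")


def find_sites(dna: str, motif: str) -> list[int]:
    """Return the 0-based start position of every occurrence of `motif`."""
    seq = "".join(dna.split()).lower()  # drop the listing's spacing
    motif = motif.lower()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


# SEQ ID NO: 19 (104 nt), copied from the sequence listing.
SEQ19 = (
    "aaacatgagg attacccatg ttatatcgcg cgaaaacatg aggattaccc atgtataagg "
    "attaattact tggatgccaa taaaaaaaaa aaagcgacat agcc"
)
# Assumed stand-in for the recognition site: a 19-nt repeat seen in SEQ ID NO: 19.
MOTIF = "acatgaggattacccatgt"

print(build_fusion("<eIF3a>", "<capsid>", linker="<linker>", formula="Ia"))
print(build_fusion("<eIF3a>", "<capsid>", formula="Ib"))
print(find_sites(SEQ19, MOTIF))
```

Keeping the linker as a separate argument mirrors the claim language, where B may be absent without changing the order of A and C.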
CN201910212861.9A 2019-03-20 2019-03-20 Fusion protein containing RNA binding protein and expression vector used in combination with same Active CN111718419B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910212861.9A CN111718419B (en) 2019-03-20 2019-03-20 Fusion protein containing RNA binding protein and expression vector used in combination with same

Publications (2)

Publication Number Publication Date
CN111718419A CN111718419A (en) 2020-09-29
CN111718419B true CN111718419B (en) 2022-04-12

Family

ID=72562468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910212861.9A Active CN111718419B (en) 2019-03-20 2019-03-20 Fusion protein containing RNA binding protein and expression vector used in combination with same

Country Status (1)

Country Link
CN (1) CN111718419B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230134868A1 (en) 2019-11-30 2023-05-04 Kangma-Healthcode (Shanghai) Biotech Co., Ltd Biomagnetic microsphere and preparation method therefor and use thereof

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108690139A (en) * 2017-07-31 2018-10-23 Kangma-Healthcode (Shanghai) Biotech Co., Ltd Preparation of a novel fusion protein and its use in improving protein synthesis

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
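Eukaryotic translation initiation factor 3 plays distinct roles at the mRNA entry and exit channels of the ribosomal preinitiation complex; Colin Echeverría Aitken et al.; eLife; 2016-10-31; Vol. 5; abstract *
Faithful and efficient translation of homologous and heterologous mRNAs in an mRNA-dependent cell-free system from Saccharomyces cerevisiae; M. F. Tuite, J et al.; JBC; 1980-09-25; Vol. 255, No. 18; full text *
Quantitative profiling of in vivo-assembled RNA-protein complexes using a novel integrated proteomic approach; Becky Pinjou Tsai et al.; Mol Cell Proteomics; 2011-04-30; Vol. 10, No. 4; full text *
Research progress in in vitro display technology; Lu Mingfeng; Chinese Bulletin of Life Sciences; 2010-08-15; No. 08; full text *
Research progress on cell-free protein synthesis systems in the expression and application of viral proteins; Yang Jie et al.; Chinese Journal of Experimental and Clinical Infectious Diseases (Electronic Edition); 2013-12-15; No. 06; p. 916, right column, paragraph 3 *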

Also Published As

Publication number Publication date
CN111718419A (en) 2020-09-29

Similar Documents

Publication Publication Date Title
CN109971775B (en) Nucleic acid construct and method for regulating protein synthesis by using same
US11946084B2 (en) Fusion protein comprising a Pab1 element and an eIF4G element and use of the fusion protein for improving protein synthesis
CN110093284B (en) Method for improving protein synthesis efficiency in cell
CN110408635B (en) Application of nucleic acid construct containing streptavidin element in protein expression and purification
CN110845622B (en) Preparation of fusion protein with deletion of different structural domains and application of fusion protein in improvement of protein synthesis
CN110938649A (en) Protein synthesis system for improving expression quantity of foreign protein and application method thereof
CN113481226A (en) Signal peptide related sequence and application thereof in protein synthesis
CN110551745A (en) Multiple histidine sequence tag and application thereof in protein expression and purification
CN111718419B (en) Fusion protein containing RNA binding protein and expression vector used in combination with same
WO2019100431A1 (en) Tandem dna element capable of enhancing protein synthesis efficiency
AU769879B2 (en) Method for producing in vivo proteins chemically diversified by incorporating non-standard amino acids
CN111378706B (en) Method for changing in vitro protein synthesis capacity through Edc3 gene knockout and application thereof
CN112342248B (en) Method for changing in-vitro protein synthesis capacity by gene knockout and application thereof
CN111778270B (en) Method for reflecting in vitro cell-free protein expression level by integrating luminescent reporter gene
CN111118065A (en) Gene modification method of eukaryote, corresponding gene engineering cell and application thereof
WO2024051855A1 (en) Nucleic acid construct and use thereof in ivtt system
CN111254127B (en) Method for modifying protein kinase A catalytic subunit TPK2 gene and application thereof
EP1280890B1 (en) Mutant strains capable of producing chemically diversified proteins by incorporation of non-conventional amino acids
CN118662662A (en) Carrier based on phase separation system and capable of responding to CRISPR-Cas9 system, construction method and application thereof
CN115873852A (en) Recombinant nucleic acid sequence, genetic engineering bacteria and method for producing 1,5-pentanediamine
CN118389546A (en) Saccharomyces cerevisiae engineering strain for synthesizing D-limonene and construction method thereof
CN111254128A (en) Method for modifying gene of protein kinase A catalytic subunit TPK1 and application thereof
CN118530960A (en) Alpha-1, 3-fucosyltransferase mutant and application thereof
MXPA01004302A (en) Method for producing in vivo proteins chemically diversified by incorporating non-standard amino acids

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant