CN111748022B - Curvularia lunata-derived steroid substance transport protein and coding gene and application thereof

Info

Publication number
CN111748022B
CN111748022B CN201910247616.1A CN201910247616A CN111748022B CN 111748022 B CN111748022 B CN 111748022B CN 201910247616 A CN201910247616 A CN 201910247616A CN 111748022 B CN111748022 B CN 111748022B
Authority
CN
China
Prior art keywords
nucleic acid
protein
acid molecule
leu
ala
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910247616.1A
Other languages
Chinese (zh)
Other versions
CN111748022A (en)
Inventor
张学礼
陈晶
樊飞宇
唐金磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Institute of Industrial Biotechnology of CAS
Original Assignee
Tianjin Institute of Industrial Biotechnology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Institute of Industrial Biotechnology of CAS
Priority to CN201910247616.1A
Publication of CN111748022A
Application granted
Publication of CN111748022B

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/37 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/905 Stable introduction of foreign DNA into chromosome using homologous recombination in yeast
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P33/00 Preparation of steroids
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P20/00 Technologies relating to chemical industry
    • Y02P20/50 Improvements relating to the production of bulk chemicals
    • Y02P20/54 Improvements relating to the production of bulk chemicals using solvents, e.g. supercritical solvents or ionic liquids

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mycology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Medicinal Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)

Abstract

The invention discloses a steroid substance transport protein derived from Curvularia lunata, together with its encoding gene and applications. The invention provides the protein shown in SEQ ID No. 1 (i.e., the ClCDR4 protein). The protein can be used as a steroid transport protein: compared with the engineering bacterium HC206, which does not express the protein, the engineering bacterium HC207, which expresses it, gives a markedly higher hydrocortisone yield. The discovery and mining of this steroid transporter (the ClCDR4 protein) therefore offers a feasible solution to the bottleneck of low substrate loading in industrial hydrocortisone synthesis.

Description

Curvularia lunata-derived steroid substance transport protein and coding gene and application thereof
Technical Field
The invention relates to the field of biotechnology, and in particular to a steroid substance transport protein derived from Curvularia lunata, its encoding gene and applications thereof.
Background
In nature there is a large class of compounds represented by cholesterol and its analogues and derivatives. These compounds share cyclopentanoperhydrophenanthrene as the parent nucleus and differ in physiological activity because they carry different side-chain modifications. Nearly 300 steroids have been found in nature; well-known examples include cholesterol, phytosterol, androstenedione and hydrocortisone. Steroid drugs are mainly classified into adrenocortical hormones, anabolic hormones and sex hormones. Owing to their success in disease treatment, metabolic regulation and other fields, market demand for steroid drugs has grown steadily year after year, and they now rank as the second-largest category in the pharmaceutical market (Fernandez-Cabezon et al., 2018).
With the continuous expansion of market demand, the biosynthesis of steroid drugs has attracted wide attention. At present it relies mainly on semi-synthesis, in which natural steroids serve as the skeleton for structural modification. However, because of their structure, most steroid substrates have extremely low solubility, so current steroid drug production faces two major bottlenecks: 1. low substrate loading; 2. a high proportion of catalytic by-products (Xiong, S., Wang, Y., Yao, M., Liu, H., Zhou, X., Xiao, W., Yuan, Y., 2017. Cell foundry with high product specificity and catalytic activity for 21-deoxycortisol biotransformation. Microb Cell Fact 16, 105.). Most research on low substrate loading has so far focused on using different organic solvents, surfactants or cyclodextrins to enhance substrate solubility, and on biotransformation in supercritical systems (Lu, W.-Y., Du, L.-X., Wang, M., Wen, J.-P., Sun, B., Guo, Y.-W., 2006. Effect of two-step substrate addition on 11β-hydroxylation by Curvularia lunata CL-114. Biochemical Engineering Journal 32, 233-238; Lu, W., Du, L., Wang, M., Guo, Y., Lu, F., Sun, B., Wen, J., Jia, X., 2007). However, organic solvents and surfactants damage biological cells to varying degrees and reduce catalytic efficiency. Taking hydrocortisone production as an example, after decades of research the maximum loading of the precursor compound 17α-hydroxypregn-4-ene-3,20-dione-21-acetate (Cortexolone-21-acetate, RSA) has reached only 7 g/L, which greatly limits the biosynthesis of hydrocortisone.
Therefore, an efficient substrate and product transport mechanism is the key of steroid drug biosynthesis, and the discovery of efficient transport proteins from biological cells is a very effective way for solving the problem.
Disclosure of Invention
The invention aims to provide a steroid substance transport protein derived from Curvularia lunata, its encoding gene and applications thereof.
In a first aspect, the invention claims a protein or a protein set.
The protein claimed by the invention is a steroid substance transporter derived from Curvularia lunata, named the ClCDR4 protein, and may specifically be any one of the following (A1) to (A4):
(A1) a protein whose amino acid sequence is shown in SEQ ID No. 1;
(A2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence defined in (A1) and having the same function;
(A3) a protein having a homology of 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more with the amino acid sequence defined in (A1) or (A2) and having the same function;
(A4) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in any one of (A1) to (A3).
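Item (A3) is phrased in terms of percentage thresholds, so it may help to see how such a percentage could be computed. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length with `-` marking gaps. The function names and the toy peptides are hypothetical, and the patent does not state which alignment tool or identity definition is intended.

```python
# Illustrative only: the toy peptides are hypothetical, not SEQ ID No. 1.
THRESHOLDS = (99, 95, 90, 85, 80)  # tiers listed in (A3), highest first


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap columns are skipped under this (assumed) definition
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def highest_tier(identity: float):
    """Return the highest listed threshold that the identity meets, or None."""
    for tier in THRESHOLDS:
        if identity >= tier:
            return tier
    return None


toy_reference = "MLEALLAKKL"
toy_variant = "MLEALLAKRL"  # one substitution in ten residues
identity = percent_identity(toy_reference, toy_variant)
print(identity, highest_tier(identity))
```

Other identity definitions (for example, counting gap columns in the denominator) give different percentages, so the definition should be fixed before comparing against a threshold.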
The protein set claimed by the invention consists of protein A, protein B, protein C and protein D.
The protein A is a protein (i.e., a ClCDR4 protein) as shown in any one of (A1) to (A4) above.
The protein B is the Ac-CPR protein derived from Absidia coerulea, and may specifically be any one of the following (B1) to (B4):
(B1) a protein whose amino acid sequence is shown in SEQ ID No. 19;
(B2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence defined in (B1) and having the same function;
(B3) a protein having a homology of 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more with the amino acid sequence defined in (B1) or (B2) and having the same function;
(B4) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in any one of (B1) to (B3).
The protein C is the Ac-Cytb5 protein derived from Absidia coerulea, and may specifically be any one of the following (C1) to (C4):
(C1) a protein whose amino acid sequence is shown in SEQ ID No. 20;
(C2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence defined in (C1) and having the same function;
(C3) a protein having a homology of 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more with the amino acid sequence defined in (C1) or (C2) and having the same function;
(C4) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in any one of (C1) to (C3).
The protein D is the Ac-CYP003 protein derived from Absidia coerulea, and may specifically be any one of the following (D1) to (D4):
(D1) a protein whose amino acid sequence is shown in SEQ ID No. 21;
(D2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence defined in (D1) and having the same function;
(D3) a protein having a homology of 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more with the amino acid sequence defined in (D1) or (D2) and having the same function;
(D4) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in any one of (D1) to (D3).
In a second aspect, the invention claims a nucleic acid molecule or a set of nucleic acid molecules.
The nucleic acid molecule claimed by the present invention is a nucleic acid molecule encoding the protein described in the first aspect above (i.e., the ClCDR4 protein).
The nucleic acid molecule set claimed by the present invention is composed of a nucleic acid molecule A, a nucleic acid molecule B, a nucleic acid molecule C and a nucleic acid molecule D.
The nucleic acid molecule A is a nucleic acid molecule encoding the protein A (i.e., the ClCDR4 protein) of the first aspect above.
The nucleic acid molecule B is a nucleic acid molecule encoding the protein B (i.e., the Ac-CPR protein) of the first aspect above.
The nucleic acid molecule C is a nucleic acid molecule encoding the protein C (i.e., the Ac-Cytb5 protein) of the first aspect above.
The nucleic acid molecule D is a nucleic acid molecule encoding the protein D (i.e., the Ac-CYP003 protein) of the first aspect above.
Further, the nucleic acid molecule is a gene encoding the ClCDR4 protein, named the ClCDR4 gene, and may specifically be a DNA molecule as shown in any one of (a1) to (a3) below:
(a1) a DNA molecule whose nucleotide sequence is shown at positions 887-5653 of SEQ ID No. 2;
(a2) a DNA molecule that hybridizes under stringent conditions to the DNA molecule defined in (a1) and encodes a ClCDR4 protein;
(a3) a DNA molecule which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% of homology with the DNA sequence defined in (a1) or (a2) and encodes ClCDR4 protein.
Further, the nucleic acid molecule A is a DNA molecule (i.e., the ClCDR4 gene) as shown in any one of (a1) to (a3) above.
Further, the nucleic acid molecule B is a gene encoding the Ac-CPR protein, named the Ac-CPR gene, and may specifically be a DNA molecule as shown in any one of (b1) to (b3) below:
(b1) a DNA molecule whose nucleotide sequence is shown at positions 813-2864 of SEQ ID No. 16;
(b2) a DNA molecule that hybridizes under stringent conditions to the DNA molecule defined in (b1) and encodes an Ac-CPR protein;
(b3) a DNA molecule which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% homology with the DNA sequence defined in (b1) or (b2) and encodes Ac-CPR protein.
Further, the nucleic acid molecule C is a gene encoding the Ac-Cytb5 protein, named the Ac-Cytb5 gene, and may specifically be a DNA molecule as shown in any one of (c1) to (c3) below:
(c1) a DNA molecule whose nucleotide sequence is shown at positions 887-1276 of SEQ ID No. 17;
(c2) a DNA molecule which hybridizes under stringent conditions to the DNA molecule defined in (c1) and encodes the Ac-Cytb5 protein;
(c3) a DNA molecule which has more than 99%, more than 95%, more than 90%, more than 85% or more than 80% homology with the DNA sequence defined in (c1) or (c2) and encodes the Ac-Cytb5 protein.
Further, the nucleic acid molecule D is a gene encoding the Ac-CYP003 protein, named the Ac-CYP003 gene, and may specifically be a DNA molecule as shown in any one of (d1) to (d3) below:
(d1) a DNA molecule whose nucleotide sequence is shown at positions 501-2084 of SEQ ID No. 18;
(d2) a DNA molecule that hybridizes under stringent conditions to the DNA molecule defined in (d1) and encodes the Ac-CYP003 protein;
(d3) a DNA molecule which has a homology of 99% or more, 95% or more, 90% or more, 85% or more or 80% or more with the DNA sequence defined in (d1) or (d2) and encodes Ac-CYP003 protein.
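Items (a1), (b1), (c1) and (d1) locate each coding region by position within a longer expression-cassette sequence. The sketch below reads those positions as 1-based inclusive ranges (an assumption; the patent does not state the convention) and checks that each range is a whole number of codons. The helper names and the toy sequence are hypothetical; the real sequences are in the sequence listing and are not reproduced here.

```python
# Position ranges as cited in (a1), (b1), (c1) and (d1); assumed 1-based, inclusive.
CDS_RANGES = {
    "ClCDR4 in SEQ ID No. 2": (887, 5653),
    "Ac-CPR in SEQ ID No. 16": (813, 2864),
    "Ac-Cytb5 in SEQ ID No. 17": (887, 1276),
    "Ac-CYP003 in SEQ ID No. 18": (501, 2084),
}


def cds_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based inclusive range."""
    return end - start + 1


def extract(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based inclusive range out of a Python (0-based) string."""
    return sequence[start - 1:end]


for name, (start, end) in CDS_RANGES.items():
    length = cds_length(start, end)
    # A coding region should span a whole number of 3-nucleotide codons.
    print(f"{name}: {length} nt, {length // 3} codons, remainder {length % 3}")

# Toy demonstration of the slicing convention (hypothetical sequence).
print(extract("AACCGGTT", 3, 6))
```

All four ranges come out as multiples of three, which is consistent with the inclusive reading. Whether a range includes the stop codon is not stated in this section, so no protein length is inferred here.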
The stringent conditions may be as follows: hybridization at 50 ℃ in a mixed solution of 7% sodium dodecyl sulfate (SDS), 0.5 M Na3PO4 and 1 mM EDTA, then rinsing at 50 ℃ in 2× SSC, 0.1% SDS; or: hybridization at 50 ℃ in a mixed solution of 7% SDS, 0.5 M Na3PO4 and 1 mM EDTA, then rinsing at 50 ℃ in 1× SSC, 0.1% SDS; or: hybridization at 50 ℃ in a mixed solution of 7% SDS, 0.5 M Na3PO4 and 1 mM EDTA, then rinsing at 50 ℃ in 0.5× SSC, 0.1% SDS; or: hybridization at 50 ℃ in a mixed solution of 7% SDS, 0.5 M Na3PO4 and 1 mM EDTA, then rinsing at 50 ℃ in 0.1× SSC, 0.1% SDS; or: hybridization at 50 ℃ in a mixed solution of 7% SDS, 0.5 M Na3PO4 and 1 mM EDTA, then rinsing at 65 ℃ in 0.1× SSC, 0.1% SDS; or: hybridization at 65 ℃ in a solution of 6× SSC, 0.5% SDS, followed by one wash each in 2× SSC, 0.1% SDS and 1× SSC, 0.1% SDS.
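The rinse steps above differ mainly in SSC strength, so a small calculation can make the salt concentrations explicit. This sketch assumes the conventional recipe in which 20× SSC is 3 M NaCl with 0.3 M sodium citrate (so 1× SSC is 150 mM NaCl and 15 mM sodium citrate). The patent itself does not state the recipe, and the function names are hypothetical.

```python
from fractions import Fraction

# Assumed standard recipe, not stated in the patent:
# 20x SSC = 3 M NaCl + 0.3 M sodium citrate, i.e. 1x = 150 mM NaCl + 15 mM citrate.
NACL_MM_AT_1X = 150
CITRATE_MM_AT_1X = 15
STOCK_STRENGTH = 20


def ssc_composition(strength: str):
    """Return (NaCl mM, sodium citrate mM) for an SSC strength such as '0.5'."""
    factor = Fraction(strength)  # exact arithmetic avoids float artefacts
    return float(NACL_MM_AT_1X * factor), float(CITRATE_MM_AT_1X * factor)


def stock_volume_ml(strength: str, final_volume_ml: int) -> float:
    """Volume of 20x stock needed to prepare final_volume_ml at the given strength."""
    return float(Fraction(strength) * final_volume_ml / STOCK_STRENGTH)


for strength in ("6", "2", "1", "0.5", "0.1"):
    nacl, citrate = ssc_composition(strength)
    print(f"{strength}x SSC: {nacl} mM NaCl, {citrate} mM citrate, "
          f"{stock_volume_ml(strength, 500)} mL of 20x stock per 500 mL")
```

Lower salt and higher temperature both make a wash more stringent, which is why the listed alternatives step from 2× SSC down to 0.1× SSC and from 50 ℃ up to 65 ℃.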
In a third aspect, the invention claims any of the following biomaterials:
(c1) A recombinant vector comprising a nucleic acid molecule as described in the second aspect above.
(c2) An expression cassette comprising a nucleic acid molecule as described in the second aspect above.
(c3) A transgenic cell line comprising a nucleic acid molecule as described in the second aspect above.
(c4) A recombinant bacterium comprising a nucleic acid molecule as described in the second aspect above.
(c5) A set of recombinant vectors consisting of recombinant vector A, recombinant vector B, recombinant vector C and recombinant vector D, which are recombinant vectors comprising, respectively, the nucleic acid molecule A, the nucleic acid molecule B, the nucleic acid molecule C and the nucleic acid molecule D described in the second aspect above.
(c6) A set of expression cassettes consisting of expression cassette A, expression cassette B, expression cassette C and expression cassette D, which are expression cassettes comprising, respectively, the nucleic acid molecule A, the nucleic acid molecule B, the nucleic acid molecule C and the nucleic acid molecule D described in the second aspect above.
(c7) A set of transgenic cell lines consisting of transgenic cell line A, transgenic cell line B, transgenic cell line C and transgenic cell line D, which are transgenic cell lines comprising, respectively, the nucleic acid molecule A, the nucleic acid molecule B, the nucleic acid molecule C and the nucleic acid molecule D described in the second aspect above.
(c8) A set of recombinant bacteria consisting of recombinant bacterium A, recombinant bacterium B, recombinant bacterium C and recombinant bacterium D, which are recombinant bacteria comprising, respectively, the nucleic acid molecule A, the nucleic acid molecule B, the nucleic acid molecule C and the nucleic acid molecule D described in the second aspect above.
In a fourth aspect, the invention claims a method for constructing engineering bacteria.
The method for constructing engineering bacteria claimed by the invention can be the following method A or method B:
Method A: a method for constructing engineering bacterium A, comprising modifying a yeast so that it expresses the protein of the first aspect; the modified yeast is engineering bacterium A.
Method B: a method for constructing engineering bacterium B, comprising modifying a yeast so that it expresses the protein set of the first aspect; the modified yeast is engineering bacterium B.
Further, method A may comprise the following step: introducing the nucleic acid molecule of the second aspect into the yeast to obtain a recombinant yeast expressing the protein of the first aspect, i.e., engineering bacterium A.
Further, method B may comprise the following step: introducing the set of nucleic acid molecules of the second aspect into the yeast to obtain a recombinant yeast expressing the protein set of the first aspect, i.e., engineering bacterium B.
Still further, in the method A, the nucleic acid molecule may be introduced into the yeast in the form of the recombinant vector or the expression cassette described in the foregoing third aspect. In method B, the set of nucleic acid molecules may be introduced into the yeast in the form of the set of recombinant vectors or the set of expression cassettes described in the third aspect above.
Still further, the nucleic acid molecule or the set of nucleic acid molecules is integrated into the genome of the yeast at one or more of the following sites: Gal7 site, NDT80 site, Gal80 site, ADH1 site.
More specifically, the nucleic acid molecule (or the nucleic acid molecule a) is integrated into the genome of the yeast at the Gal7 site or the ADH1 site; the nucleic acid molecule B, the nucleic acid molecule C and the nucleic acid molecule D are integrated into the genome of the yeast at the Gal7 site, NDT80 site, Gal80 site and/or ADH1 site.
Further, the yeast may be Saccharomyces cerevisiae (such as the BY4742 ΔTRP strain) or the like.
In one embodiment of the invention, the nucleic acid molecule is introduced into the yeast in the form of an expression cassette, namely the expression cassette pTDH3-ClCDR4-TPI1t, whose sequence is SEQ ID No. 2. When the expression cassette pTDH3-ClCDR4-TPI1t is introduced into the yeast, the homologous arm marker fragments gal7S-URA3-up and gal7S-URA3-down are introduced as well, so that the cassette is integrated at the Gal7 site of Saccharomyces cerevisiae; the sequence of gal7S-URA3-up is shown in SEQ ID No. 6, and that of gal7S-URA3-down in SEQ ID No. 7.
In another embodiment of the invention, the set of nucleic acid molecules is introduced into the yeast in the form of a set of expression cassettes consisting of the expression cassettes pPgk-Ac-CPR-ADH1t, pTDH3-Ac-Cytb5-TPI1t, pTEF-Ac-CYP003-CYC1t and pFBA1-ClCDR4-TDH2t, whose sequences are SEQ ID No. 16, SEQ ID No. 17, SEQ ID No. 18 and SEQ ID No. 3, respectively. When each expression cassette is introduced into the yeast, homologous arm marker fragments are introduced as well. The homologous arm marker fragments gal7-URA3-up (SEQ ID No. 8) and gal7-URA3-down (SEQ ID No. 9) are introduced to achieve integration at the Gal7 site of Saccharomyces cerevisiae. The homologous arm marker fragments NDT80-his3-up (SEQ ID No. 10) and NDT80-his3-down (SEQ ID No. 11) are introduced to achieve integration at the NDT80 site. The homologous arm marker fragments Gal80-leu2-up (SEQ ID No. 12) and Gal80-leu2-down (SEQ ID No. 13) are introduced to achieve integration at the Gal80 site. The homologous arm marker fragments ADH1-trp1-up (SEQ ID No. 14) and ADH1-trp1-down (SEQ ID No. 15) are introduced to achieve integration at the ADH1 site.
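The cassette names in these embodiments appear to follow a promoter-gene-terminator pattern (for example, pTDH3 as promoter and TPI1t as terminator). The sketch below parses the names on that assumption and records the SEQ ID numbers given in the text. The `Cassette` type and `parse_cassette` helper are hypothetical illustrations, not part of the patent.

```python
from typing import NamedTuple


class Cassette(NamedTuple):
    promoter: str
    gene: str
    terminator: str
    seq_id: int


def parse_cassette(name: str, seq_id: int) -> Cassette:
    """Split 'pPROMOTER-GENE-TERMINATORt' into its parts (assumed naming pattern)."""
    parts = name.split("-")
    if len(parts) < 3:
        raise ValueError(f"unexpected cassette name: {name}")
    promoter, terminator = parts[0], parts[-1]
    if not promoter.startswith("p") or not terminator.endswith("t"):
        raise ValueError(f"name does not match the assumed pattern: {name}")
    # Gene names such as 'Ac-CPR' contain a hyphen, so rejoin the middle tokens.
    return Cassette(promoter, "-".join(parts[1:-1]), terminator, seq_id)


# Cassette names and SEQ ID numbers as given in the two embodiments.
CASSETTE_SET = [
    parse_cassette("pPgk-Ac-CPR-ADH1t", 16),
    parse_cassette("pTDH3-Ac-Cytb5-TPI1t", 17),
    parse_cassette("pTEF-Ac-CYP003-CYC1t", 18),
    parse_cassette("pFBA1-ClCDR4-TDH2t", 3),
]

# Integration site -> (up-arm SEQ ID No., down-arm SEQ ID No.) for the cassette set.
HOMOLOGY_ARMS = {"Gal7": (8, 9), "NDT80": (10, 11), "Gal80": (12, 13), "ADH1": (14, 15)}

for cassette in CASSETTE_SET:
    print(cassette)
print("distinct promoters:", len({c.promoter for c in CASSETTE_SET}))
```

Under this reading, each of the four cassettes in the set is driven by a different promoter and closed by a different terminator.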
In a fifth aspect, the invention claims an engineering bacterium (i.e., the engineering bacterium A or the engineering bacterium B) prepared by the method described in the fourth aspect.
In a specific embodiment of the invention, the engineering bacterium A is the engineering bacterium HC202, which is prepared by the following steps:
(d1) Introducing a Cas9 plasmid into Saccharomyces cerevisiae BY4742 ΔTRP to obtain the recombinant yeast BY4742 Cas9.
Further, the Cas9 plasmid can be p414-TEF1p-Cas9-CYC1t (Addgene, # 43802).
(d2) Introducing the pGCY1gRNA plasmid and the ΔGCY1 fragment into BY4742 Cas9 to obtain the recombinant yeast BY4742 ΔTRP ΔGCY1.
The pGCY1gRNA plasmid carries the DNA sequence that expresses the spacer of the sgRNA specific for the gcy1 gene (i.e., the part of the crRNA within that sgRNA that recognizes the target sequence in the gcy1 gene). Further, the pGCY1gRNA plasmid is obtained by PCR amplification using the plasmid p426-SNR52p-gRNA.CAN1.Y-SUP4t (Addgene, #43803) as the template with primers 43803-up (5'-gatcatttatctttcactgcg-3') and 43803-GCY1gRNA-down1 (5'-cgcagtgaaagataaatgatcCTCAAATAGGTTTAGGTACGgttttagagctagaaatagcaag-3'). The sequence of the ΔGCY1 fragment is shown in SEQ ID No. 4.
(d3) Introducing the pYPR1gRNA plasmid and the ΔYPR1 fragment into BY4742 ΔTRP ΔGCY1 to obtain the recombinant yeast BY4742 ΔTRP ΔGCY1 ΔYPR1, named HC201.
The pYPR1gRNA plasmid carries the DNA sequence that expresses the spacer of the sgRNA specific for the YPR1 gene (i.e., the part of the crRNA within that sgRNA that recognizes the target sequence in the YPR1 gene). Further, the pYPR1gRNA plasmid is obtained by PCR amplification using the plasmid p426-SNR52p-gRNA.CAN1.Y-SUP4t (Addgene, #43803) as the template with primers 43803-up (5'-gatcatttatctttcactgcg-3') and 43803-YPR1gRNA-down1 (5'-cgcagtgaaagataaatgatcCAGTGTTGGGTTTCGGCACTgttttagagctagaaatagcaag-3'). The sequence of the ΔYPR1 fragment is shown in SEQ ID No. 5.
(d4) Introducing the expression cassette pTDH3-ClCDR4-TPI1t (SEQ ID No. 2), the homologous arm marker fragment gal7S-URA3-up (SEQ ID No. 6) and the homologous arm marker fragment gal7S-URA3-down (SEQ ID No. 7) into HC201; the resulting recombinant bacterium is the engineering bacterium HC202.
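The two down-primers quoted in steps (d2) and (d3) share a visible layout: a lowercase stretch that is the reverse complement of primer 43803-up, a 20-nt uppercase gene-specific spacer, and a lowercase stretch that matches in both primers. The sketch below rebuilds both primers from those parts and compares them with the quoted sequences. Calling the trailing stretch the start of the gRNA scaffold is an interpretation rather than something the patent states, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string, preserving letter case."""
    return seq.translate(_COMPLEMENT)[::-1]


UP_PRIMER = "gatcatttatctttcactgcg"         # primer 43803-up, as quoted
SHARED_3PRIME = "gttttagagctagaaatagcaag"   # shared tail (interpreted as scaffold start)
SPACERS = {                                  # uppercase 20-nt blocks in the quoted primers
    "GCY1": "CTCAAATAGGTTTAGGTACG",
    "YPR1": "CAGTGTTGGGTTTCGGCACT",
}

# Quoted down-primers, written without whitespace.
QUOTED = {
    "GCY1": "cgcagtgaaagataaatgatcCTCAAATAGGTTTAGGTACGgttttagagctagaaatagcaag",
    "YPR1": "cgcagtgaaagataaatgatcCAGTGTTGGGTTTCGGCACTgttttagagctagaaatagcaag",
}


def build_down_primer(spacer: str) -> str:
    """Assemble revcomp(up primer) + spacer + shared 3' tail."""
    return reverse_complement(UP_PRIMER) + spacer.upper() + SHARED_3PRIME


for gene, spacer in SPACERS.items():
    primer = build_down_primer(spacer)
    print(gene, len(primer), primer == QUOTED[gene])
```

Under this reading, retargeting the guide to another gene would only require swapping the 20-nt spacer, which is how the GCY1 and YPR1 primers differ.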
In a specific embodiment of the invention, the engineering bacterium B is the engineering bacterium HC207, which is prepared by the following steps:
(e1) Introducing the expression cassette pPgk-Ac-CPR-ADH1t (SEQ ID No. 16), the expression cassette pTDH3-Ac-Cytb5-TPI1t (SEQ ID No. 17), the expression cassette pTEF-Ac-CYP003-CYC1t (SEQ ID No. 18), the homologous arm marker fragment gal7-URA3-up (SEQ ID No. 8) and the homologous arm marker fragment gal7-URA3-down (SEQ ID No. 9) into HC201; the resulting recombinant bacterium is named HC203.
(e2) Introducing the expression cassette pPgk-Ac-CPR-ADH1t (SEQ ID No. 16), the expression cassette pTDH3-Ac-Cytb5-TPI1t (SEQ ID No. 17), the expression cassette pTEF-Ac-CYP003-CYC1t (SEQ ID No. 18), the homologous arm marker fragment NDT80-his3-up (SEQ ID No. 10) and the homologous arm marker fragment NDT80-his3-down (SEQ ID No. 11) into HC203; the resulting recombinant bacterium is named HC204.
(e3) Introducing the expression cassette pPgk-Ac-CPR-ADH1t (SEQ ID No. 16), the expression cassette pTDH3-Ac-Cytb5-TPI1t (SEQ ID No. 17), the expression cassette pTEF-Ac-CYP003-CYC1t (SEQ ID No. 18), the homologous arm marker fragment Gal80-leu2-up (SEQ ID No. 12) and the homologous arm marker fragment Gal80-leu2-down (SEQ ID No. 13) into HC204; the resulting recombinant bacterium is named HC205.
(e4) Introducing the expression cassette pPgk-Ac-CPR-ADH1t (SEQ ID No. 16), the expression cassette pTDH3-Ac-Cytb5-TPI1t (SEQ ID No. 17), the expression cassette pTEF-Ac-CYP003-CYC1t (SEQ ID No. 18), the expression cassette pFBA1-ClCDR4-TDH2t (SEQ ID No. 3), the homologous arm marker fragment ADH1-trp1-up (SEQ ID No. 14) and the homologous arm marker fragment ADH1-trp1-down (SEQ ID No. 15) into HC205; the resulting recombinant bacterium is the engineering bacterium HC207.
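Steps (e1) to (e4) form a linear lineage in which the same three Ac cassettes are introduced at a new site each round. The sketch below tallies the cassettes along that lineage. It assumes that each step integrates exactly one copy of every listed cassette and that none is lost, which is an idealisation the text does not verify. The integration sites are inferred from the names of the homologous arms, and the function name is hypothetical.

```python
from collections import Counter

AC_CASSETTES = ("pPgk-Ac-CPR-ADH1t", "pTDH3-Ac-Cytb5-TPI1t", "pTEF-Ac-CYP003-CYC1t")
CLCDR4_CASSETTE = "pFBA1-ClCDR4-TDH2t"

# (parent strain, child strain, cassettes introduced, site implied by the arm names)
STEPS = [
    ("HC201", "HC203", AC_CASSETTES, "Gal7"),
    ("HC203", "HC204", AC_CASSETTES, "NDT80"),
    ("HC204", "HC205", AC_CASSETTES, "Gal80"),
    ("HC205", "HC207", AC_CASSETTES + (CLCDR4_CASSETTE,), "ADH1"),
]


def cassette_copies(target: str, base: str = "HC201") -> Counter:
    """Count cassettes accumulated from the base strain up to the target strain."""
    counts = Counter()
    if target == base:
        return counts
    current = base
    for parent, child, cassettes, _site in STEPS:
        if parent != current:
            raise ValueError(f"lineage broken at {parent}")
        counts.update(cassettes)  # assumption: one copy per cassette per step
        current = child
        if current == target:
            return counts
    raise KeyError(f"{target} is not in the lineage")


for strain in ("HC203", "HC204", "HC205", "HC207"):
    print(strain, dict(cassette_copies(strain)))
```

On these assumptions HC207 would carry four copies of each Ac cassette and one copy of the ClCDR4 cassette.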
In a sixth aspect, the invention claims any of the following applications:
(A) use of a protein as described hereinbefore in the first aspect (i.e. the ClCDR4 protein) as a steroid transporter;
(B) use of a protein or a protein set as described in the first aspect hereinbefore or a nucleic acid molecule set as described in the second aspect hereinbefore or a biological material as described in the third aspect hereinbefore or an engineered bacterium as described in the fifth aspect hereinbefore: transporting steroids or preparing products capable of transporting steroids;
(C) use of a protein or a protein set as described in the first aspect hereinbefore or a nucleic acid molecule set as described in the second aspect hereinbefore or a biological material as described in the third aspect hereinbefore or an engineered bacterium as described in the fifth aspect hereinbefore: increasing the transport capacity for steroid substrates and/or the efflux capacity for steroid products during steroid synthesis, or preparing a product capable of increasing the transport capacity for steroid substrates and/or the efflux capacity for steroid products during steroid synthesis;
(D) use of the protein set of the first aspect, the set of nucleic acid molecules of the second aspect, or the set of recombinant vectors, set of expression cassettes, set of transgenic cell lines or set of recombinant bacteria of the third aspect in improving a strain's capacity to synthesize hydrocortisone by improving the strain's transport capacity for steroid substrates and/or its efflux capacity for steroid products.
In a seventh aspect, the invention claims a method for preparing hydrocortisone.
The method for preparing hydrocortisone provided by the invention is a whole-cell catalysis method, and specifically comprises the following steps: culturing the engineering bacterium B by fermentation, collecting the cells, adding a substrate, and carrying out the catalytic reaction, the reaction product containing hydrocortisone. The substrate is a steroid that can be converted to hydrocortisone by steroid 11β-hydroxylase. Because the engineering bacterium B expresses not only steroid 11β-hydroxylase but also the ClCDR4 protein as a steroid transport protein, the method can improve the strain's capacity to synthesize hydrocortisone by improving its transport capacity for steroid substrates and/or its efflux capacity for steroid products.
In each of the above aspects, the steroid may be a steroid (steroid). The steroid compound (steroid) is a generic term for a group of compounds having a steroid nucleus, i.e., a cyclopentane-phenanthrene carbon skeleton. Further, in the present invention, the steroid may be 7 α -hydroxypregn-4-ene-3, 20-dione-21-acetate (also called desoxyhydrocortisone 21-acetate, Cortexolone-21-acetate, RSA), deoxycorticosterol (RS) or Hydrocortisone (HC).
Experiments prove that the ClCDR4 protein derived from curvularia lunata provided by the invention can be used as a steroid transport protein, and compared with an engineering bacterium HC206 not expressing the protein, the yield of hydrocortisone of the engineering bacterium HC207 expressing the protein is remarkably improved. Therefore, the discovery and discovery of the steroid transporter (i.e., the ClCDR4 protein) in the invention provides a feasible solution to the bottleneck problem of low substrate dosage in the hydrocortisone synthesis industry.
Drawings
FIG. 1 shows a comparison of the ability of strains HC201 and HC202 to catalyze the hydrolysis of the substrate RSA. A is the intracellular and extracellular proportion of the substrate RSA; B is the intracellular and extracellular proportion of the hydrolysis product RS.
FIG. 2 shows a comparison of the ability of strains HC206 and HC207 to synthesize hydrocortisone.
Detailed Description
The experimental procedures used in the following examples are all conventional procedures unless otherwise specified.
Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
Curvularia lunata (Curvularia lunata): ATCC12017, product of bantam organism (BN Bio).
Plasmid M4: a SexA1 cleavage site, the pTDH3 promoter, a green fluorescent protein (GFP) gene, the TPI1t terminator and an Asc1 cleavage site were inserted in this order into the multiple cloning site of the pEASY-Blunt Simple vector (TransGen Biotech Co., Ltd.).
Plasmid M8: a SexA1 cleavage site, the pFBA1 promoter, a green fluorescent protein (GFP) gene, the TDH2t terminator and an Asc1 cleavage site were inserted in this order into the multiple cloning site of the pEASY-Blunt Simple vector (TransGen Biotech Co., Ltd.).
Saccharomyces cerevisiae BY4742: product of Thermo Fisher.
Saccharomyces cerevisiae BY4742 ΔTRP: the strain "BY4742-TRP" listed in Table 1 of "Zhubo Dai, et al. Producing aglycons of ginsenosides in bakers' yeast. Sci Rep. 2014 Jan 15". The strain is available to the public from the applicant and may be used only for repeating the experiments of the present invention.
Example 1 Cloning and expression of the Curvularia lunata steroid transporter ClCDR4
The cloning and expression of the gene are divided into the following 3 steps:
1. Extraction of Curvularia lunata total RNA
First, Curvularia lunata was cultured on a plate for several days, and spores were collected, inoculated into 50 mL of potato-glucose (PDA) medium and cultured overnight until abundant mycelium had grown. The mycelia were then collected by centrifugation, washed with potassium phosphate buffer (PBS), resuspended in 50 mL of buffer, and induced for 1 h with the substrate desoxyhydrocortisone 21-acetate (RSA) at a final concentration of 170 mg/L; samples were then taken for RNA extraction.
The RNA extraction method comprises the following steps:
(1) To a 2.0 mL conical tube (RNase-free) were added 0.5 mm grinding beads (enough to fill the conical bottom of the tube) and 1 mL of Trizol, followed by pieces of mycelium quick-frozen in liquid nitrogen (about 100 mg).
(2) The sample was disrupted in a bead mill (BeadBeater) at maximum speed for 30 s; this was repeated twice.
(3) The tube was shaken gently at room temperature for 5 min to dissociate nucleoprotein complexes.
(4) 0.2 mL of chloroform was added and the sample was bead-milled at maximum speed for 15 s.
(5) The tube was shaken gently at room temperature for 2 min.
(6) The sample was centrifuged at 12000 g for 15 min at 4 ℃, and the supernatant (about 0.5 mL) was transferred to a new EP tube (RNase-free).
(7) An equal volume (0.5 mL) of isopropanol was added and mixed well.
(8) After gentle shaking at room temperature for 10 min, a white precipitate was visible.
(9) The sample was centrifuged at 12000 g for 10 min at 4 ℃, and the supernatant was removed.
(10) The precipitate was washed with 1 mL of 75% ethanol prepared in DEPC water and centrifuged at 7500 g for 5 min at 4 ℃, and the supernatant was removed completely.
(11) The pellet was air-dried at room temperature for about 15-20 min to remove the ethanol; over-drying makes the RNA difficult to dissolve.
(12) 50 μL of DEPC water was added, and the tube was held in a 60 ℃ water bath for 10 min to aid dissolution.
(13) The concentration and purity of RNA were determined by NanoDrop. The ratio of A260/A280 of the RNA sample with better purity is 1.9-2.0, and A260/A230 is usually more than 2.
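The purity criteria in step (13) can be written as a small check. This is only a minimal sketch: the function names are hypothetical, the thresholds are the ones quoted above, and the conversion of 1 A260 unit to about 40 ng/μL of RNA is a general spectrophotometric convention assumed here, not a value taken from this document.

```python
def rna_concentration_ng_per_ul(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate RNA concentration, assuming 1 A260 unit ~ 40 ng/uL (general convention)."""
    return a260 * 40.0 * dilution_factor


def rna_purity_ok(a260_a280: float, a260_a230: float) -> bool:
    """Apply the criteria quoted in step (13): A260/A280 of 1.9-2.0 and A260/A230 above 2."""
    return 1.9 <= a260_a280 <= 2.0 and a260_a230 > 2.0


if __name__ == "__main__":
    # Hypothetical NanoDrop readings, for illustration only.
    print(rna_concentration_ng_per_ul(0.5))
    print(rna_purity_ok(1.95, 2.1))
    print(rna_purity_ok(1.70, 2.1))
```

A sample that fails the check would typically be re-precipitated or re-extracted before reverse transcription.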
2. Reverse transcription PCR and gene amplification
First-strand cDNA synthesis (reverse transcription): cDNA was synthesized in an RNase-free PCR tube according to the Thermo reverse transcription kit. The specific steps were as follows: 3 μL of template total RNA, 1 μL of oligo(dT)18 primer, 1 μL of 10 mM dNTP Mix and 10 μL of RNase-free water were combined, centrifuged briefly, incubated at 65 ℃ for 5 min in a PCR instrument, and cooled rapidly on ice. The following were then added: 4 μL of 5× RT Buffer and 1 μL of Maxima H Minus Enzyme Mix. After brief centrifugation, the reaction was run in the PCR instrument at 50 ℃ for 50 min and 85 ℃ for 5 min, then held at 4 ℃.
PCR amplification: the cDNA obtained by reverse transcription was used as the template, and the target gene was amplified with primers ClCDR4-up-Pac1/ClCDR4-down-Asc1 (see Table 1). The amplification system was: 10 μL of TAKARA PrimeSTAR HS DNA polymerase buffer, 10 μL of dNTP mix, 1 μL of each primer (see Table 1), 0.5 μL of cDNA template, 0.5 μL of PrimeSTAR HS polymerase (2.5 U/μL), and distilled water to a total volume of 50 μL. Amplification conditions were: pre-denaturation at 98 ℃ for 2 min (1 cycle); denaturation at 98 ℃ for 10 seconds, annealing at 56 ℃ for 15 seconds, and extension at 72 ℃ for 5 minutes (30 cycles); final extension at 72 ℃ for 8 min (1 cycle). The resulting amplification product was designated ClCDR4. The PCR amplification product was purified with a PCR product purification kit from Sangon Biotech (Shanghai) Co., Ltd., the purified DNA fragment was digested with Pac1 and Asc1 from Thermo, and the product was recovered for later use.
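The cycling conditions stated above can be captured as a small data structure, which also gives a rough lower bound on run time. This is an illustrative sketch: the class and constant names are hypothetical, and the estimate counts hold times only, ignoring temperature ramping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Values taken from the amplification conditions stated above.
PRE_DENATURATION = Step("pre-denaturation", 98, 2 * 60)
CYCLE = (
    Step("denaturation", 98, 10),
    Step("annealing", 56, 15),
    Step("extension", 72, 5 * 60),
)
N_CYCLES = 30
FINAL_EXTENSION = Step("final extension", 72, 8 * 60)


def hold_time_seconds() -> int:
    """Sum of programmed hold times (ramp times between steps are not included)."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return PRE_DENATURATION.seconds + N_CYCLES * per_cycle + FINAL_EXTENSION.seconds


if __name__ == "__main__":
    total = hold_time_seconds()
    print(f"{total} s = {total / 60:.1f} min")
```

The 5 min extension dominates the run time, which is consistent with amplifying a coding sequence of several kilobases.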
TABLE 1 ClCDR4 Gene amplification primers
Primer name Sequence (5 '-3')
ClCDR4-up-Pac1 CCTTAATTAAATGAGCCTAGTCGGCAATTTCA
ClCDR4-down-Asc1 ggcgcgccTTATACCTTCTCCGATATTTCT
The sequence of the ClCDR4 gene (wild type) is shown at positions 887-5653 of SEQ ID No.2, and it encodes the protein shown in SEQ ID No.1 (named the ClCDR4 protein).
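The two primers in Table 1 can be sanity-checked in a few lines. The sketch below assumes the standard PacI (TTAATTAA) and AscI (GGCGCGCC) recognition sequences and the standard genetic code; all function names are hypothetical. It confirms that each primer carries its restriction site, that the forward primer opens with codons matching the start of SEQ ID No.1 (Met Ser Leu Val Gly Asn Phe), and that the reverse primer places a stop codon at the end of the coding strand.

```python
PAC1_SITE = "TTAATTAA"  # PacI recognition sequence (assumed standard site)
ASC1_SITE = "GGCGCGCC"  # AscI recognition sequence (assumed standard site)

FORWARD = "CCTTAATTAAATGAGCCTAGTCGGCAATTTCA"  # ClCDR4-up-Pac1, from Table 1
REVERSE = "ggcgcgccTTATACCTTCTCCGATATTTCT"    # ClCDR4-down-Asc1, from Table 1

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; a trailing partial codon is ignored."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def after_site(primer: str, site: str) -> str:
    """Return the gene-specific part of a primer, i.e. everything 3' of the site."""
    primer = primer.upper()
    return primer[primer.index(site) + len(site):]


if __name__ == "__main__":
    print(translate(after_site(FORWARD, PAC1_SITE)))
    print(reverse_complement(after_site(REVERSE, ASC1_SITE)))
```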
3. Construction of Yeast Gene integration fragment
Plasmids M4 and M8 were digested with Pac1 and Asc1 from Thermo, and the digested products were gel-recovered for later use. 50 ng of the ClCDR4 gene fragment obtained in step 2 was added to each ligation system: 2× Quick Ligation Buffer (NEB), 0.5 μL of Quick Ligase (NEB, 400,000 cohesive end units/ml), and distilled water to 10 μL. The mixture was reacted at room temperature for 10 min to obtain the ligation product, which was transformed into Trans1-T1 competent cells: ice bath for 30 min, heat shock at 42 ℃ for 30 sec, then immediately on ice for 2 min. 800 μL of LB medium was added and the cells were incubated at 250 rpm and 37 ℃ for 1 hour, spread on LB plates containing ampicillin, and cultured overnight. Five single colonies were screened by PCR, the positive clones were grown in liquid culture, and their plasmids were extracted and verified by sequencing. The plasmids in which sequencing confirmed insertion of the target fragment into vectors M4 and M8 were named M4-ClCDR4 and M8-ClCDR4, respectively.
PCR was performed with the constructed plasmid M4-ClCDR4 as the template and primers Xp-M-pEASY-M13F-R and Xp-M-pEASY-M13R-F (see Table 2) (same method as in step 2 of Example 1) to obtain the pTDH3-ClCDR4-TPI1t fragment (SEQ ID No.2), which comprises the TDH3 promoter (positions 87-886 of SEQ ID No.2), the ClCDR4 gene derived from Curvularia lunata (positions 887-5653 of SEQ ID No.2) and the TPI1 terminator (positions 5654-6055 of SEQ ID No.2). PCR was performed with the constructed plasmid M8-ClCDR4 as the template and primers S-4G-4-M-ADH1t-FBA1-F and S-4G-4-M-TDH2t-TDH3-R (see Table 2) (same method as in step 2 of Example 1) to obtain the pFBA1-ClCDR4-TDH2t fragment (SEQ ID No.3), which comprises the FBA1 promoter (positions 55-876 of SEQ ID No.3), the ClCDR4 gene derived from Curvularia lunata (positions 877-5643 of SEQ ID No.3) and the TDH2 terminator (positions 5644-6044 of SEQ ID No.3).
The amplified target fragments were gel-recovered for later use.
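The feature coordinates given for SEQ ID No.2 and SEQ ID No.3 can be cross-checked against the stated protein length of 1588 amino acids. The following is a minimal sketch with hypothetical names; it treats the coordinates as 1-based and inclusive, which is an assumption about the numbering convention.

```python
PROTEIN_LENGTH_AA = 1588  # length of the ClCDR4 protein as stated in this document

# (feature, start, end), 1-based inclusive coordinates as quoted above.
CASSETTES = {
    "pTDH3-ClCDR4-TPI1t (SEQ ID No.2)": [
        ("TDH3 promoter", 87, 886),
        ("ClCDR4", 887, 5653),
        ("TPI1 terminator", 5654, 6055),
    ],
    "pFBA1-ClCDR4-TDH2t (SEQ ID No.3)": [
        ("FBA1 promoter", 55, 876),
        ("ClCDR4", 877, 5643),
        ("TDH2 terminator", 5644, 6044),
    ],
}


def feature_length(start: int, end: int) -> int:
    return end - start + 1


def is_contiguous(features) -> bool:
    """True if each feature starts right after the previous one ends."""
    return all(nxt[1] == prev[2] + 1 for prev, nxt in zip(features, features[1:]))


def orf_matches_protein(features) -> bool:
    """True if the ClCDR4 span equals 3 * (protein length + 1 stop codon)."""
    (_, start, end), = [f for f in features if f[0] == "ClCDR4"]
    return feature_length(start, end) == 3 * (PROTEIN_LENGTH_AA + 1)


if __name__ == "__main__":
    for name, features in CASSETTES.items():
        print(name, is_contiguous(features), orf_matches_protein(features))
```

Both cassettes give a 4767 bp coding span, which corresponds to 1588 codons plus a stop codon.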
TABLE 2 amplification primers for ClCDR4 Gene integration fragments
[Table 2 is provided as an image in the original publication: Figure BDA0002011485240000121]
Example 2 construction of Saccharomyces cerevisiae Chassis Strain HC201
The construction of the Saccharomyces cerevisiae chassis strain HC201 by the CRISPR-Cas9 method is divided into the following 6 steps:
1. Construction of gRNA plasmids for the endogenous gcy1 and ypr1 genes of Saccharomyces cerevisiae
First, PCR amplification was performed with plasmid p426-SNR52p-gRNA.CAN1.Y-SUP4t (Addgene, #43803) as the template and primers 43803-up and 43803-GCY1gRNA-down1 (see Table 3) (same method as in step 2 of Example 1). The PCR product was digested with Dpn1 in the following system: 10 μL of 10× Dpn1 Buffer (Thermo), 5 μL of Dpn1 (Thermo, 400,000 covalent end units/ml), 80 μL of PCR amplification product, and distilled water to 100 μL. After digestion for 4 hours, the treated product was gel-recovered for later use. The recovered digest was transformed into Trans1-T1 competent cells: ice bath for 30 min, heat shock at 42 ℃ for 30 sec, then immediately on ice for 2 min. 800 μL of LB medium was added and the cells were incubated at 250 rpm and 37 ℃ for 1 hour, spread on an LB plate containing ampicillin, and cultured overnight. Five single colonies were screened by PCR, the positive clones were grown in liquid culture, and their plasmids were extracted and verified by sequencing. The plasmid with the correct sequence was named pGCY1gRNA; it contains the N20 sequence corresponding to the gcy1 gene (i.e., the DNA encoding the spacer of the gcy1-specific sgRNA, which is the crRNA portion that recognizes the target sequence in the gcy1 gene). The gRNA plasmid pYPR1gRNA, corresponding to the ypr1 gene, was constructed in the same manner with primers 43803-up and 43803-YPR1gRNA-down1 (see Table 3).
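To illustrate what an N20 sequence is, the sketch below scans a DNA string for 20-nt stretches immediately followed by an NGG motif. It rests on assumptions not stated in this document: that the Cas9 used recognizes an NGG PAM, and that candidate sites are picked on the forward strand only. The function name and the toy sequence are hypothetical, and the actual spacers used for gcy1 and ypr1 are those encoded by the primers in Table 3.

```python
import re


def find_protospacers(seq: str):
    """Return (position, N20, PAM) for every 20-mer followed by NGG on the given strand.

    A lookahead is used so that overlapping candidates are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence for illustration only; it is not taken from gcy1 or ypr1.
    toy = "A" * 20 + "TGG" + "C" * 5
    for position, n20, pam in find_protospacers(toy):
        print(position, n20, pam)
```

In practice a candidate would also be screened for off-target matches elsewhere in the genome, which this sketch does not attempt.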
2. Construction of knockout recombinant fragments for the endogenous GCY1 and YPR1 genes of Saccharomyces cerevisiae
First, PCR amplification was performed with the genome of wild-type Saccharomyces cerevisiae BY4742 as the template and primers GCY1-QC-YZ-up and GCY1-QC-M (see Table 3) (same method as in step 2 of Example 1) to obtain the GCY1 knockout recombinant fragment ΔGCY1 (SEQ ID No.4), which comprises about 500 bp upstream and 50 bp downstream of the GCY1 gene. The amplified target fragment was gel-recovered for later use. The YPR1 knockout recombinant fragment ΔYPR1 (SEQ ID No.5) was constructed in the same manner as ΔGCY1, with primers YPR1-QC-YZ-up and YPR1-QC-M (see Table 3).
3. Transformation of Cas9 plasmid
The starting strain Saccharomyces cerevisiae BY4742 ΔTRP was cultured overnight in the screening medium. The screening medium consisted of: SD-Ura-His-Leu-Trp (Beijing Pankeno (Functional Genome) Science and Technology Co., Ltd.), 2% glucose, 0.005% His, 0.01% Leu, 0.01% Ura, 0.01% Trp (each percentage indicates g/100 mL). 1 mL of culture (OD about 0.6-1.0) was dispensed into a 1.5 mL EP tube and centrifuged at 10000 g for 1 min at 4 ℃, and the supernatant was discarded; the pellet was washed with sterile water (4 ℃) and centrifuged under the same conditions, and the supernatant was discarded. The cells were resuspended in 1 mL of treatment solution (10 mM LiAc; 10 mM DTT; 0.6 M sorbitol; 10 mM Tris-HCl, pH 7.5) and incubated at 25 ℃ for 20 min. After centrifugation, the supernatant was discarded, and the cells were resuspended in 1 mL of 1 M sorbitol (sterilized by filtration through a 0.22 μm aqueous-phase membrane) and centrifuged, and the supernatant was discarded; this wash with 1 M sorbitol was performed twice, leaving a final volume of about 90 μL. 1 μL of Cas9 plasmid p414-TEF1p-Cas9-CYC1t (Addgene, #43802) was added, and after mixing the cells were transferred to an electroporation cuvette and shocked at 2.7 kV for 5.6 ms. 1 mL of 1 M sorbitol was added, the cells were recovered at 30 ℃ for 1 h and then spread on a screening medium plate (formula: 0.8% yeast selection medium SD-Ura-His-Leu-Trp, 2% glucose, 0.005% His, 0.01% Leu, 0.01% Ura, 1.5% agar; each percentage indicates g/100 mL). Screening culture conditions: 30 ℃ for 36 hours or more. The correct positive clone was identified by PCR and named strain BY4742 Cas9.
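Because every medium in this document is specified in g/100 mL, the amounts to weigh out scale linearly with the batch volume. The sketch below uses the plate formula from step 3; the function name and the 500 mL batch size are hypothetical choices for illustration.

```python
# Plate formula from step 3, in g per 100 mL.
SCREENING_PLATE_G_PER_100ML = {
    "SD-Ura-His-Leu-Trp": 0.8,
    "glucose": 2.0,
    "His": 0.005,
    "Leu": 0.01,
    "Ura": 0.01,
    "agar": 1.5,
}


def grams_for_volume(recipe: dict, volume_ml: float) -> dict:
    """Scale a g/100 mL recipe to the requested volume, in grams per component."""
    return {name: round(g_per_100ml * volume_ml / 100.0, 6) for name, g_per_100ml in recipe.items()}


if __name__ == "__main__":
    # Hypothetical 500 mL batch.
    for name, grams in grams_for_volume(SCREENING_PLATE_G_PER_100ML, 500).items():
        print(f"{name}: {grams} g")
```

The plate omits Trp so that only cells carrying the Cas9 plasmid grow, so a variant recipe is obtained by adding or dropping one dictionary entry.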
4. Co-transformation of gRNA plasmids and knock-out recombinant fragments
The starting strain BY4742 Cas9 was cultured overnight in the screening medium. The screening medium consisted of: SD-Ura-His-Leu-Trp (Beijing Pankeno (Functional Genome) Science and Technology Co., Ltd.), 2% glucose, 0.005% His, 0.01% Leu, 0.01% Ura (each percentage indicates g/100 mL). Competent Saccharomyces cerevisiae cells were then prepared. 1 μL of pGCY1gRNA plasmid and 1 μL of ΔGCY1 fragment were added to the competent BY4742 Cas9 cells, and after mixing the cells were transferred to an electroporation cuvette and shocked at 2.7 kV for 5.6 ms. 1 mL of 1 M sorbitol was added, the cells were recovered at 30 ℃ for 1 h and then spread on a screening medium plate (formula: 0.8% yeast selection medium SD-Ura-His-Leu-Trp, 2% glucose, 0.005% His, 0.01% Leu, 1.5% agar; each percentage indicates g/100 mL). Screening culture conditions: 30 ℃ for 36 hours or more. The resulting strains were verified by PCR with primers GCY1-QC-YZ-up/GCY1-QC-YZ-down, and the correct positive clone was named strain BY4742 ΔTRP ΔGCY1.
5. Elimination of pGCY1gRNA plasmid
The starting strain BY4742 ΔTRP ΔGCY1 was streaked on a plate of screening medium containing 5-fluoroorotic acid (5-FOA) (formula: 0.8% yeast selection medium SD-Ura-His-Leu-Trp, 2% glucose, 0.005% His, 0.01% Leu, 0.01% Ura, 0.05% 5-FOA, 1.5% agar; each percentage indicates g/100 mL) and cultured at 30 ℃ for 24 hours. Single clones were then streaked on SD-Ura-Trp and SD-Trp screening medium plates, respectively. Clones that grew on the SD-Trp plate but not on the SD-Ura-Trp plate were selected as the BY4742 ΔTRP ΔGCY1 strain from which the pGCY1gRNA plasmid had been correctly eliminated.
6. Knock-out of YPR1 Gene
The Saccharomyces cerevisiae BY4742 ΔTRP ΔGCY1 strain obtained in step 5 was co-transformed with the pYPR1gRNA plasmid and the ΔYPR1 fragment (same method as in step 4), and the resulting strains were verified by PCR with primers YPR1-QC-YZ-up/YPR1-QC-YZ-down. The pYPR1gRNA plasmid was then eliminated from the verified strain (same method as in step 5), and the resulting strain, BY4742 ΔTRP ΔGCY1 ΔYPR1, was named HC201.
TABLE 3 GCY1 and YPR1 Gene knockout primers
[Table 3 is provided as an image in the original publication: Figure BDA0002011485240000141]
Example 3 construction of Saccharomyces cerevisiae engineering bacterium HC202
The starting strain Saccharomyces cerevisiae HC201 (BY4742 ΔTRP ΔYPR1 ΔGCY1) was cultured overnight in the screening medium. The screening medium consisted of: SD-Ura-His-Leu-Trp (Beijing Pankeno (Functional Genome) Science and Technology Co., Ltd.), 2% glucose, 0.005% His, 0.01% Leu, 0.01% Ura, 0.01% Trp (each percentage indicates g/100 mL). 1 mL of culture (OD about 0.6-1.0) was dispensed into a 1.5 mL EP tube and centrifuged at 10000 g for 1 min at 4 ℃, and the supernatant was discarded; the pellet was washed with sterile water (4 ℃) and centrifuged under the same conditions, and the supernatant was discarded. The cells were resuspended in 1 mL of treatment solution (10 mM LiAc; 10 mM DTT; 0.6 M sorbitol; 10 mM Tris-HCl, pH 7.5) and incubated at 25 ℃ for 20 min. After centrifugation, the supernatant was discarded, and the cells were resuspended in 1 mL of 1 M sorbitol (sterilized by filtration through a 0.22 μm aqueous-phase membrane) and centrifuged, and the supernatant was discarded; this wash with 1 M sorbitol was performed twice, leaving a final volume of about 90 μL. To integrate the ClCDR4 gene fragment at the Gal7 site of Saccharomyces cerevisiae BY4742, 1 μL each of the following was added: the fragment pTDH3-ClCDR4-TPI1t obtained in Example 1, and the homologous arm marker fragments Gal7S-URA3-up (SEQ ID No.6; comprising a 400 bp homologous region upstream of the Gal7 site, the URA3 marker gene, and an 86 bp homologous region upstream of pTDH3-ClCDR4-TPI1t) and Gal7S-URA3-down (SEQ ID No.7; comprising a homologous region downstream of pTDH3-ClCDR4-TPI1t and a 300 bp homologous region downstream of the Gal7 site). After mixing, the cells were transferred to an electroporation cuvette and shocked at 2.7 kV for 5.6 ms. 1 mL of 1 M sorbitol was added, the cells were recovered at 30 ℃ for 1 h and then spread on a screening medium plate (formula: 0.8% yeast selection medium SD-Ura-His-Leu-Trp, 2% glucose, 0.005% His, 0.01% Leu, 0.01% Trp, 1.5% agar; each percentage indicates g/100 mL). Screening culture conditions: 30 ℃ for 36 hours or more. The correct positive clone was identified by PCR and named strain HC202.
Example 4 Hydrolysis of the substrate RSA catalyzed by Saccharomyces cerevisiae strain HC202
Shake-flask fermentation and catalysis: the HC202 yeast strain was activated on solid selection medium (formula: solid yeast selection medium SD-Ura-His-Leu-Trp, 2% glucose, 0.01% Trp, 1.5% agar; each percentage indicates g/100 mL), and a seed culture was prepared in the corresponding liquid selection medium (formula: liquid yeast selection medium SD-Ura-His-Leu-Trp, 2% glucose, 0.005% His, 0.01% Leu, 0.01% Trp; each percentage indicates g/100 mL) at 30 ℃ and 250 rpm for 16 hours. 1 mL of seed culture was inoculated into each of three 500 mL triangular flasks containing 100 mL of YPD liquid medium (2% glucose, 2% peptone, 1% yeast extract) and cultured with shaking at 30 ℃ and 250 rpm for 2 days. The yeast cells were collected by centrifugation at 5000 rpm, washed with PBS buffer, and resuspended in a 250 mL triangular flask containing 30 mL of PBS. The substrate RSA was added to a final concentration of 1000 mg/L, and the catalytic reaction was carried out with shaking at 30 ℃ and 250 rpm for 1 day.
Product extraction:
1. Extraction of the whole reaction mixture: 5 mL of catalytic reaction liquid was placed in a separating funnel and an equal volume of extractant (methanol:chloroform = 1:9, v/v) was added. 4 mL of the lower organic phase was transferred to a 10 mL centrifuge tube, dried, and redissolved in 1 mL of methanol. After centrifugation, the supernatant was filtered through a 0.22 μm organic-phase membrane into an HPLC vial for HPLC detection. Detection method: Shimadzu InertSustain column, 250 mm × 4.6 mm, 5 μm (Guangzhou Lvbaicao Bio Co., Ltd., Guangdong); mobile phase methanol:water = 6:4 (v/v); flow rate 0.8 mL/min; column temperature 25 ℃; detection wavelength 254 nm. The product was analyzed on an Agilent 1260 high performance liquid chromatograph (HPLC).
2. Extraction of extracellular products: 5 mL of catalytic reaction liquid was centrifuged at 5000 rpm and the supernatant was collected; an equal volume of extractant (methanol:chloroform = 1:9, v/v) was added. 2 mL of the lower organic phase was transferred to a 10 mL centrifuge tube, dried, and redissolved in 500 μL of methanol. After centrifugation, the supernatant was filtered through a 0.22 μm organic-phase membrane into an HPLC vial for HPLC detection (same detection method as in step 1). The product was analyzed on an Agilent 1260 high performance liquid chromatograph (HPLC).
3. Extraction of intracellular products: 5 mL of catalytic reaction liquid was centrifuged at 5000 rpm and the cell pellet was collected; an equal volume of extractant (methanol:chloroform = 1:9, v/v) was added. 2 mL of the lower organic phase was transferred to a 10 mL centrifuge tube, dried, and redissolved in 500 μL of methanol. After centrifugation, the supernatant was filtered through a 0.22 μm organic-phase membrane into an HPLC vial for HPLC detection (same detection method as in step 1). The product was analyzed on an Agilent 1260 high performance liquid chromatograph (HPLC).
Comparison of the intracellular, extracellular and whole-reaction data for Saccharomyces cerevisiae strains HC201 and HC202 (FIG. 1) shows the following. With the substrate RSA added at 1000 mg/L, strain HC201 left 29% of the RSA unconverted after 24 hours of catalysis, and the remaining 71% was converted into the hydrolysis product 11-deoxycorticosterol (RS): 52% intracellular and 19% extracellular, both expressed as a share of the added substrate. In strain HC202, which expresses the ClCDR4 gene derived from Curvularia lunata, no RSA remained, indicating that all of the 1000 mg/L of RSA was converted into the hydrolysis product RS, of which 74% was intracellular and 26% extracellular. These data indicate that expression of the Curvularia lunata-derived ClCDR4 protein promotes the uptake of the steroid substrate and the efflux of the steroid product. The ClCDR4 protein contains 1588 amino acids. By sequence alignment, the ClCDR4 amino acid sequence shares 51% and 48% homology with the two reported ABC transporters Candida albicans CDR4 and Saccharomyces cerevisiae PDR5, respectively; combined with the experimental results, the protein is inferred to be a steroid ABC transporter derived from a filamentous fungus.
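The percentages reported for the two strains can be tallied with a few lines of arithmetic. This is a sketch with hypothetical names; it takes the figures exactly as stated above and assumes they are all expressed relative to the RSA initially added.

```python
# Percentages as reported for the 24 h reaction with 1000 mg/L RSA.
RESULTS = {
    "HC201": {"residual_rsa": 29, "rs_intracellular": 52, "rs_extracellular": 19},
    "HC202": {"residual_rsa": 0, "rs_intracellular": 74, "rs_extracellular": 26},
}


def conversion_percent(r: dict) -> int:
    """Share of the added RSA recovered as the hydrolysis product RS."""
    return r["rs_intracellular"] + r["rs_extracellular"]


def balance_closes(r: dict) -> bool:
    """True if residual substrate plus product accounts for 100% of the input."""
    return r["residual_rsa"] + conversion_percent(r) == 100


if __name__ == "__main__":
    for strain, r in RESULTS.items():
        print(strain, conversion_percent(r), r["rs_extracellular"], balance_closes(r))
```

On these figures, conversion rises from 71% to 100% and extracellular RS rises from 19% to 26% of the input when ClCDR4 is expressed.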
Example 5 construction of Saccharomyces cerevisiae HC206 and HC207
1. The starting strain Saccharomyces cerevisiae HC201 was cultured in screening medium consisting of: SD-Ura-His-Leu-Trp (Beijing Pankeno (Functional Genome) Science and Technology Co., Ltd.), 2% glucose, 0.005% His, 0.01% Leu, 0.01% Ura (each percentage indicates g/100 mL). 1 mL of culture (OD about 0.6-1.0) was dispensed into a 1.5 mL EP tube and centrifuged at 10000 g for 1 min at 4 ℃, and the supernatant was discarded; the pellet was washed with sterile water (4 ℃) and centrifuged under the same conditions, and the supernatant was discarded. The cells were resuspended in 1 mL of treatment solution (10 mM LiAc; 10 mM DTT; 0.6 M sorbitol; 10 mM Tris-HCl, pH 7.5) and incubated at 25 ℃ for 20 min. After centrifugation, the supernatant was discarded, and the cells were resuspended in 1 mL of 1 M sorbitol (sterilized by filtration through a 0.22 μm aqueous-phase membrane) and centrifuged, and the supernatant was discarded; this wash with 1 M sorbitol was performed twice, leaving a final volume of about 90 μL. To integrate the AcCPR, AcCytb5 and AcCYP003 gene fragments at the Gal7 site of Saccharomyces cerevisiae HC201, 1 μL each of the following fragments was added. Recombinant fragments already available in the laboratory: pPgk-Ac-CPR-ADH1t (SEQ ID No.16), comprising the Pgk promoter (positions 63-812 of SEQ ID No.16), the Absidia coerulea-derived Ac-CPR gene (positions 813-2864 of SEQ ID No.16; the amino acid sequence of the Ac-CPR protein is shown in SEQ ID No.19) and the ADH1 terminator (positions 2865-3022 of SEQ ID No.16); pTDH3-Ac-Cytb5-TPI1t (SEQ ID No.17), comprising the pTDH3 promoter (positions 87-886 of SEQ ID No.17), the Absidia coerulea-derived Ac-Cytb5 gene (positions 887-1276 of SEQ ID No.17; the amino acid sequence of the Ac-Cytb5 protein is shown in SEQ ID No.20) and the TPI1 terminator; and the recombinant fragment of SEQ ID No.18, comprising the TEF promoter (positions 51-500 of SEQ ID No.18), the Absidia coerulea-derived Ac-CYP003 gene (positions 501-2084 of SEQ ID No.18; the amino acid sequence of the Ac-CYP003 protein is shown in SEQ ID No.21) and the CYC1 terminator (positions 2085-2391 of SEQ ID No.18). Homologous arm marker fragments: Gal7-URA3-up (SEQ ID No.8; comprising a 400 bp homologous region upstream of the Gal7 site, the URA3 marker gene and a 400 bp homologous region of the Pgk promoter) and Gal7-URA3-down (SEQ ID No.9; comprising a 200 bp homologous region of the CYC1 terminator and a 300 bp homologous region downstream of the Gal7 site). After mixing, the cells were transferred to an electroporation cuvette and shocked at 2.7 kV for 5.7 ms. 1 mL of 1 M sorbitol was added, the cells were recovered at 30 ℃ for 1 hour and then spread on solid screening medium (formula: yeast selection medium SD-Ura-His-Leu-Trp, 2% glucose, 0.005% His, 0.01% Leu, 1.5% agar; each percentage indicates g/100 mL). Screening culture conditions: 30 ℃ for 36 hours or more. The correct positive clone was identified by PCR and named strain HC203.
2. Starting from Saccharomyces cerevisiae strain HC203, the AcCPR, AcCytb5 and AcCYP003 gene fragments were further integrated at the NDT80 site of HC203 (same method as in step 1), with the homologous arm marker fragments replaced by NDT80-His3-up (SEQ ID No.10; comprising a 400 bp homologous region upstream of the NDT80 site, the His3 marker gene and a 400 bp homologous region of the Pgk promoter) and NDT80-His3-down (SEQ ID No.11; comprising a 200 bp homologous region of the CYC1 terminator and a 300 bp homologous region downstream of the Gal7 site). The correct positive clone obtained was named strain HC204.
3. Starting from Saccharomyces cerevisiae strain HC204, the AcCPR, AcCytb5 and AcCYP003 gene fragments were integrated at the Gal80 site of HC204 (same method as in step 1), with the homologous arm marker fragments replaced by Gal80-Leu2-up (SEQ ID No.12; comprising a 400 bp homologous region upstream of the NDT80 site, the Leu2 marker gene and a 400 bp homologous region of the Pgk promoter) and NDT80-Leu2-down (SEQ ID No.13; comprising a 200 bp homologous region of the CYC1 terminator and a 300 bp homologous region downstream of the Gal80 site). The correct positive clone obtained was named strain HC205.
4. Starting from Saccharomyces cerevisiae strain HC205, the same AcCPR, AcCytb5 and AcCYP003 gene fragments as in steps 2 and 3 were integrated at the ADH1 site of Saccharomyces cerevisiae (same method as in step 1), with the homologous arm marker fragments replaced by ADH1-Trp1-up (SEQ ID No.14; comprising a 400 bp homologous region upstream of the ADH1 site, the Trp1 marker gene and a 400 bp homologous region of the Pgk promoter) and ADH1-Trp1-down (SEQ ID No.15; comprising a 200 bp homologous region of the CYC1 terminator and a 300 bp homologous region downstream of the ADH1 site). The correct positive clone obtained was named strain HC206.
5. Starting from Saccharomyces cerevisiae strain HC205, the same AcCPR, AcCytb5 and AcCYP003 gene fragments as in steps 2, 3 and 4, together with the recombinant fragment pFBA1-ClCDR4-TDH2t (SEQ ID No.3), were integrated at the ADH1 site of Saccharomyces cerevisiae (same method as in step 1), with the homologous arm marker fragments replaced by ADH1-Trp1-up (SEQ ID No.14; comprising a 400 bp homologous region upstream of the ADH1 site, the Trp1 marker gene and a 400 bp homologous region of the Pgk promoter) and ADH1-Trp1-down (SEQ ID No.15; comprising a 200 bp homologous region of the CYC1 terminator and a 300 bp homologous region downstream of the ADH1 site). The correct positive clone obtained was named strain HC207.
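The strain lineage built in steps 1 to 5 can be summarized as a small table in code, which makes it easy to see how HC206 and HC207 differ. This is an illustrative sketch: the dictionary layout and function names are hypothetical, and it assumes each integration step adds exactly one copy of each listed fragment.

```python
from collections import Counter

HYDROXYLASE_GENES = ("AcCPR", "AcCytb5", "AcCYP003")

# parent strain, integration locus, selection marker, and fragments added at each step.
STRAINS = {
    "HC201": {"parent": None, "locus": None, "marker": None, "genes": ()},
    "HC203": {"parent": "HC201", "locus": "Gal7", "marker": "URA3", "genes": HYDROXYLASE_GENES},
    "HC204": {"parent": "HC203", "locus": "NDT80", "marker": "His3", "genes": HYDROXYLASE_GENES},
    "HC205": {"parent": "HC204", "locus": "Gal80", "marker": "Leu2", "genes": HYDROXYLASE_GENES},
    "HC206": {"parent": "HC205", "locus": "ADH1", "marker": "Trp1", "genes": HYDROXYLASE_GENES},
    "HC207": {"parent": "HC205", "locus": "ADH1", "marker": "Trp1",
              "genes": HYDROXYLASE_GENES + ("ClCDR4",)},
}


def lineage(strain: str) -> list:
    """Return the chain of strains from the chassis strain down to the given strain."""
    chain = []
    while strain is not None:
        chain.append(strain)
        strain = STRAINS[strain]["parent"]
    return chain[::-1]


def gene_copies(strain: str) -> dict:
    """Count integrated copies of each fragment, assuming one copy per integration step."""
    counts = Counter()
    for s in lineage(strain):
        counts.update(STRAINS[s]["genes"])
    return dict(counts)


if __name__ == "__main__":
    print(lineage("HC207"))
    print(gene_copies("HC206"))
    print(gene_copies("HC207"))
```

Under that assumption HC206 and HC207 carry the same number of hydroxylase-related cassettes and differ only in the presence of ClCDR4, which is what makes them a matched pair for Example 6.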
Example 6 Synthesis of hydrocortisone by catalytic fermentation of Saccharomyces cerevisiae HC206 and HC207
Shake-flask fermentation and catalysis: the HC206 and HC207 Saccharomyces cerevisiae strains were activated on solid selection medium (formula: solid yeast selection medium SD-Ura-His-Leu-Trp, 2% glucose, 1.5% agar; each percentage indicates g/100 mL), and seed cultures were prepared in the corresponding liquid selection medium (formula: liquid yeast selection medium SD-Ura-His-Leu-Trp, 2% glucose; each percentage indicates g/100 mL) at 30 ℃ and 250 rpm for 16 h. 1 mL of seed culture was inoculated into each of three 500 mL flasks containing 100 mL of YPD liquid medium (2% glucose, 2% peptone, 1% yeast extract; each percentage indicates g/100 mL) and cultured with shaking at 30 ℃ and 250 rpm for 2 days. The yeast cells were collected by centrifugation at 5000 rpm, washed with PBS buffer, and resuspended in a 250 mL flask containing 30 mL of PBS. The substrate RSA was added to a final concentration of 850 mg/L, and the catalytic reaction was carried out with shaking at 30 ℃ and 250 rpm for 1 day.
Product extraction (whole reaction mixture): 5 mL of catalytic reaction liquid was placed in a separating funnel and an equal volume of extractant (methanol:chloroform = 1:9, v/v) was added. 4 mL of the lower organic phase was transferred to a 10 mL centrifuge tube, dried, and redissolved in 1 mL of methanol. After centrifugation, the supernatant was filtered through a 0.22 μm organic-phase membrane into an HPLC vial for HPLC detection (same detection method as in step 1 of Example 4). The product was analyzed on an Agilent 1260 high performance liquid chromatograph (HPLC).
The ability of strains HC206 and HC207 to synthesize hydrocortisone by shake-flask catalytic fermentation was compared; the results are shown in FIG. 2. The hydrocortisone production rate of strain HC206 was 136 mg/(L·d), while that of strain HC207, which additionally expresses the Curvularia lunata-derived transporter ClCDR4 relative to HC206, was 167 mg/(L·d), an improvement of 23% over strain HC206. The result shows that the Curvularia lunata-derived steroid transporter ClCDR4 effectively improves the hydrocortisone synthesis capacity of the strain by improving transport of the substrate RSA. The mining and discovery of this steroid transporter provides a feasible solution to the bottleneck problem of low substrate loading in the hydrocortisone synthesis industry.
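The stated 23% improvement follows directly from the two production rates. A minimal check, with a hypothetical function name:

```python
def relative_improvement_percent(baseline: float, improved: float) -> float:
    """Percentage increase of `improved` over `baseline`."""
    return (improved - baseline) / baseline * 100.0


if __name__ == "__main__":
    hc206_rate = 136.0  # mg/(L·d), strain HC206
    hc207_rate = 167.0  # mg/(L·d), strain HC207
    print(f"{relative_improvement_percent(hc206_rate, hc207_rate):.1f}%")
```

The unrounded value is about 22.8%, which the text reports as 23%.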
<110> Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences
<120> Curvularia lunata-derived steroid substance transport protein and coding gene and application thereof
<130> GNCLN190431
<160> 21
<170> PatentIn version 3.5
<210> 1
<211> 1588
<212> PRT
<213> Curvularia lunata
<400> 1
Met Ser Leu Val Gly Asn Phe Thr Ser Asn Phe Asp Arg Glu Ala Val
1 5 10 15
Ser Gly Gly Ala Pro Ser Pro Glu Met Ile Ala Glu Gln Gln Arg Gln
20 25 30
His Tyr Gln Asp Asp Glu Arg Asp His Gln Ala Ala Glu Ser Asp Ala
35 40 45
Ser Thr Ile Ala Ala Gly Glu Gly Ser Pro Pro Ser Gln His Asn Arg
50 55 60
Ala Asp His Lys Asn Ala Ser His Asn Thr Arg Asp Asp Asp Glu Ile
65 70 75 80
Ala Glu Thr Met Arg Arg Glu Glu Ala Val His Gln Leu Ala Arg Arg
85 90 95
Leu Thr Ala Gln Ser His Gln Ser Ser Ser Gln Ala Asn Pro Phe Asn
100 105 110
Ala Pro Pro Asn Ser Ala Leu Asp Pro Asn Gly Glu His Phe Asn Ala
115 120 125
Arg Ala Trp Thr Lys Ala Met Leu Asn Leu Gln Leu Gln Asp Glu Asn
130 135 140
Ala Pro Pro Val Arg Thr Ala Gly Val Ala Phe Arg Asn Leu Asn Val
145 150 155 160
His Gly Phe Gly Thr Asp Ala Asp Tyr Gln Lys Ser Val Gly Asn Val
165 170 175
Trp Leu Glu Gly Pro Ser Leu Val Lys Lys Leu Met Gly Asp Lys Gly
180 185 190
Arg Lys Ile Asn Ile Leu Arg Asp Cys Asp Gly Leu Val Glu Ala Gly
195 200 205
Glu Met Leu Val Val Leu Gly Pro Pro Gly Ser Gly Cys Ser Thr Phe
210 215 220
Leu Lys Thr Ile Thr Gly Glu Thr His Gly Phe Phe Val Asp Gln Asn
225 230 235 240
Ser His Ile Asn Tyr Gln Gly Ile Ser Pro Glu Ile Met Asn Lys Asn
245 250 255
Tyr Arg Gly Glu Ala Ile Tyr Thr Ala Glu Val Asp Val His Phe Pro
260 265 270
Met Met Thr Val Gly Glu Thr Leu Tyr Phe Ala Ala Gln Ala Arg Arg
275 280 285
Pro Arg His Ile Pro Gly Gly Val Ser Val Gln Gln Tyr Ala Glu His
290 295 300
Gln Arg Asp Val Ile Met Ala Leu Tyr Gly Ile Ser His Thr Leu Asn
305 310 315 320
Thr Arg Val Gly Asn Asp Phe Leu Arg Gly Val Ser Gly Gly Glu Arg
325 330 335
Lys Arg Val Thr Ile Ala Glu Ala Ser Leu Ser Arg Ala Pro Leu Gln
340 345 350
Ala Trp Asp Asn Ser Thr Arg Gly Leu Asp Ser Ala Asn Ala Ile Glu
355 360 365
Phe Cys Lys Thr Leu Arg Met Glu Thr Glu Ile Asn Gly Ser Thr Ala
370 375 380
Cys Val Ala Ile Tyr Gln Ala Pro Gln Ala Ala Tyr Asp Leu Phe Asp
385 390 395 400
Lys Ala Leu Val Leu Tyr Glu Gly Arg Gln Ile Phe Phe Gly Lys Thr
405 410 415
Thr Asp Ala Lys Ala Tyr Phe Val Asn Met Gly Phe His Cys Pro Asp
420 425 430
Arg Gln Thr Asp Ala Asp Phe Leu Thr Ser Met Thr Ser Pro Leu Glu
435 440 445
Arg Ile Val Arg Glu Gly Phe Glu Gly Arg Val Pro Arg Thr Pro Asp
450 455 460
Glu Phe Ala Gln Arg Trp Leu Asp Ser Pro Glu Arg Ala Ala Leu Leu
465 470 475 480
Arg Asp Ile Glu Ala Tyr Glu Gln Lys Tyr Pro Ile Gly Gly Glu Ser
485 490 495
Ser Gln Lys Phe Lys Glu Ser Arg Gln Leu Gln Lys Ala Lys Gly Gln
500 505 510
Arg Glu Thr Ser Pro Tyr Thr Leu Ser Tyr Met Asp Gln Val Lys Leu
515 520 525
Cys Leu Trp Arg Gly Phe Val Arg Leu Lys Ala Asp Pro Ser Ile Thr
530 535 540
Leu Thr Gln Leu Ile Ala Asn Ser Ile Met Ala Leu Ile Ile Ser Ser
545 550 555 560
Val Phe Tyr Asn Leu Gln Pro Thr Thr Ser Ser Phe Tyr Ser Arg Ser
565 570 575
Ala Leu Leu Phe Phe Ala Ile Leu Met Asn Ala Phe Gly Ser Ala Leu
580 585 590
Glu Ile Leu Thr Leu Tyr Ala Gln Arg Pro Ile Val Glu Lys His Ser
595 600 605
Arg Tyr Ala Leu Tyr His Pro Ser Ala Glu Ala Phe Ala Ser Met Leu
610 615 620
Thr Asp Leu Pro Tyr Lys Ile Val Asn Ala Ile Thr Phe Asn Leu Val
625 630 635 640
Leu Tyr Phe Met Thr Asn Leu Arg Arg Glu Pro Gly Asn Phe Phe Phe
645 650 655
Phe Val Leu Ile Ser Phe Thr Leu Thr Leu Val Met Ser Met Phe Phe
660 665 670
Arg Ser Ile Ala Ala Leu Ser Arg Ser Leu Val Gln Ala Leu Ala Pro
675 680 685
Ala Ala Ile Leu Ile Leu Gly Leu Val Met Tyr Thr Gly Phe Ala Ile
690 695 700
Pro Pro Asn Tyr Met Leu Gly Trp Ser Lys Trp Ile Arg Tyr Ile Asn
705 710 715 720
Pro Val Ser Tyr Gly Phe Glu Ala Leu Met Val Asn Glu Phe His Asn
725 730 735
Arg Arg Phe Glu Cys Asn Asp Tyr Ile Pro Ser Ser Gly Gly Leu Pro
740 745 750
Ala Leu Ser Ala Tyr Asp Asn Ile Ser Gly Pro Gln Arg Ala Cys Arg
755 760 765
Ala Ile Gly Ser Val Pro Gly Gln Pro Tyr Val Glu Gly Asp Ala Tyr
770 775 780
Ile Asn Ser Ser Phe Asn Tyr Tyr Ala Ser His Lys Trp Arg Asn Phe
785 790 795 800
Gly Ile Met Trp Ala Phe Met Phe Gly Leu Met Phe Val Tyr Leu Ala
805 810 815
Gly Thr Glu Tyr Ile Thr Ala Lys Lys Ser Lys Gly Glu Val Leu Val
820 825 830
Phe Arg Arg Gly His Lys Leu Pro Ala Pro Lys Ser Lys Ser Gln Glu
835 840 845
Asp Leu Glu Ala Ala Asp Pro Gly Arg Asn Val Ala Val Gln Asn Asp
850 855 860
Asn Ser Asp Ser Ile Ala Ile Ile Glu Arg Gln Thr Ala Ile Phe Gln
865 870 875 880
Trp Glu Asp Val Cys Tyr Asp Ile Thr Ile Lys Lys Glu Pro Arg Arg
885 890 895
Ile Leu Asp His Val Asp Gly Trp Val Lys Pro Gly Thr Leu Thr Ala
900 905 910
Leu Met Gly Val Ser Gly Ala Gly Lys Thr Thr Leu Leu Asp Cys Leu
915 920 925
Ala Thr Arg Thr Thr Met Gly Val Ile Thr Gly Gln Met Leu Val Asp
930 935 940
Gly Lys Pro Arg Asp Glu Ser Phe Gln Arg Lys Thr Gly Tyr Ala Gln
945 950 955 960
Gln Gln Asp Leu His Leu Ser Thr Ser Thr Val Arg Glu Ala Leu Ile
965 970 975
Phe Ser Ala Val Leu Arg Gln Pro Ala His Val Ser Arg Gln Glu Lys
980 985 990
Leu Glu Tyr Val Glu Glu Val Ile Lys Leu Leu Glu Met Thr Glu Tyr
995 1000 1005
Ala Asp Ala Val Val Gly Val Pro Gly Glu Gly Leu Asn Val Glu
1010 1015 1020
Gln Arg Lys Arg Leu Thr Ile Gly Val Glu Leu Ala Ala Lys Pro
1025 1030 1035
Ala Leu Leu Leu Phe Leu Asp Glu Pro Thr Ser Gly Leu Asp Ser
1040 1045 1050
Gln Thr Ser Trp Ala Ile Leu Asp Leu Leu Asp Lys Leu Lys Lys
1055 1060 1065
Asn Gly Gln Ala Ile Leu Cys Thr Ile His Gln Pro Ser Ala Met
1070 1075 1080
Leu Phe Gln Arg Phe Asp Arg Leu Leu Phe Leu Ala Arg Gly Gly
1085 1090 1095
Arg Thr Val Tyr Tyr Gly Asp Ile Gly Glu Asn Ser Gln Thr Leu
1100 1105 1110
Val Asn Tyr Phe Val Arg Asn Gly Gly Pro Pro Cys Pro Pro Asp
1115 1120 1125
Ala Asn Pro Ala Glu Trp Met Leu Glu Val Ile Gly Ala Ala Pro
1130 1135 1140
Gly Ser His Thr Asp Ile Asp Trp His Gln Thr Trp Arg Gln Ser
1145 1150 1155
Pro Glu Tyr Thr Glu Val Lys Arg His Leu Ala Glu Leu Lys Ser
1160 1165 1170
Glu Arg Gly Gln Ala Glu Ala Leu Gln Arg Thr Leu Ser Ala Gln
1175 1180 1185
Lys Arg Glu Asp Lys Ala Ala Tyr Arg Glu Phe Ala Ala Pro Phe
1190 1195 1200
Ala Val Gln Leu Arg Glu Thr Leu Val Arg Val Phe Gln Gln Tyr
1205 1210 1215
Trp Arg Thr Pro Ser Tyr Ile Tyr Ser Lys Thr Phe Leu Cys Val
1220 1225 1230
Leu Ser Ala Leu Phe Ile Gly Phe Ser Leu Phe Gln Met Pro Asn
1235 1240 1245
Thr Gln Thr Gly Leu Gln Asn Gln Met Phe Gly Ile Phe Met Leu
1250 1255 1260
Leu Thr Ile Phe Gly Gln Leu Val Gln Gln Ile Met Pro His Phe
1265 1270 1275
Val Thr Gln Arg Ala Leu Tyr Glu Val Arg Glu Arg Pro Ser Lys
1280 1285 1290
Ala Tyr Ser Trp Lys Ala Phe Met Ile Ala Asn Ile Val Val Glu
1295 1300 1305
Leu Pro Trp Asn Ser Leu Met Ala Val Leu Ile Phe Phe Cys Trp
1310 1315 1320
Tyr Tyr Pro Ile Gly Leu Tyr Lys Asn Ala Glu Tyr Thr Asp Ala
1325 1330 1335
Val Thr Leu Arg Gly Phe Gln Leu Phe Leu Phe Val Trp Met Phe
1340 1345 1350
Leu Leu Phe Thr Ser Thr Phe Thr His Met Val Ile Ala Gly Met
1355 1360 1365
Asp His Ala Glu Thr Gly Gly Asn Val Ala Asn Leu Met Phe Ser
1370 1375 1380
Leu Cys Leu Ile Phe Cys Gly Val Leu Ala Gln Pro Ser Gln Phe
1385 1390 1395
Pro Arg Phe Trp Ile Phe Met Tyr Arg Val Ser Pro Phe Thr Tyr
1400 1405 1410
Met Val Ser Gly Met Leu Ser Ala Gly Leu Ala Asn Ser Gln Val
1415 1420 1425
Asn Cys Ala Pro Asn Glu Leu Ile His Phe Asp Pro Ser Gln Gly
1430 1435 1440
Gln Thr Cys Gly Glu Tyr Ile Lys Pro Trp Ile Ser Val Ser Gly
1445 1450 1455
Gly Lys Met Leu Asn Pro Asp Ala Thr Ser Asp Cys Asn Phe Cys
1460 1465 1470
Ala Ile Gln Asp Thr Asn Val Phe Leu Ser Ser Ile Ser Ser Ser
1475 1480 1485
Tyr Ser Asp Leu Trp Arg Asn Phe Gly Ile Leu Trp Val Tyr Val
1490 1495 1500
Ile Phe Asn Ile Ala Ala Ala Leu Ala Leu Tyr Tyr Leu Ile Arg
1505 1510 1515
Met Pro Lys Pro Lys Lys Glu Glu Thr Lys Glu Ala Ser Pro Ala
1520 1525 1530
Thr Ala Ala Ser Ala Thr Thr Ala Gly Glu Pro Ser Arg Val Gly
1535 1540 1545
Ser Ser His Asn Ser Asp Ser Gly Arg Gln Gly Glu Lys Asp Gly
1550 1555 1560
Glu Thr Ala Arg Tyr Ala Ser Ile Thr Gly Ala Ser Thr Pro Pro
1565 1570 1575
Gln Gln Pro Thr Glu Ile Ser Glu Lys Val
1580 1585
<210> 2
<211> 6143
<212> DNA
<213> Artificial sequence
<400> 2
gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc gctttccagt 60
cggtcacaca ggaaacagct atgaccatac tagcgttgaa tgttagcgtc aacaacaaga 120
agtttaatga cgcggaggcc aaggcaaaaa gattccttga ttacgtaagg gagttagaat 180
cattttgaat aaaaaacacg ctttttcagt tcgagtttat cattatcaat actgccattt 240
caaagaatac gtaaataatt aatagtagtg attttcctaa ctttatttag tcaaaaaatt 300
agccttttaa ttctgctgta acccgtacat gcccaaaata gggggcgggt tacacagaat 360
atataacatc gtaggtgtct gggtgaacag tttattcctg gcatccacta aatataatgg 420
agcccgcttt ttaagctggc atccagaaaa aaaaagaatc ccagcaccaa aatattgttt 480
tcttcaccaa ccatcagttc ataggtccat tctcttagcg caactacaga gaacaggggc 540
acaaacaggc aaaaaacggg cacaacctca atggagtgat gcaacctgcc tggagtaaat 600
gatgacacaa ggcaattgac ccacgcatgt atctatctca ttttcttaca ccttctatta 660
ccttctgctc tctctgattt ggaaaaagct gaaaaaaaag gttgaaacca gttccctgaa 720
attattcccc tacttgacta ataagtatat aaagacggta ggtattgatt gtaattctgt 780
aaatctattt cttaaacttc ttaaattcta cttttatagt tagtcttttt tttagtttta 840
aaacaccaag aacttagttt cgaataaaca cacataaaca aacaaaatga gcctagtcgg 900
caatttcacc tccaactttg atcgggaggc tgtatccggc ggcgccccat caccagaaat 960
gatagccgaa cagcaacgcc aacactacca ggacgacgag cgggaccacc aggctgccga 1020
gtcagatgcg tccacaatag cagcgggcga gggctcgccg cccagtcagc acaaccgcgc 1080
tgaccacaag aatgcctctc acaacacacg tgacgacgac gaaatcgccg agaccatgcg 1140
gagagaagag gcagtccacc aacttgccag acgcctgact gcccagagcc atcaatcttc 1200
atcacaagca aaccctttca acgcaccccc taattcggct ctggatccca atggcgagca 1260
cttcaacgcc cgtgcctgga caaaggctat gctcaatctg caacttcaag acgagaatgc 1320
accaccggtc cgaaccgctg gtgttgcttt ccgcaacctg aatgtccatg gtttcggtac 1380
cgacgctgat taccagaaga gcgtcggcaa cgtttggctc gaaggcccca gtctcgtcaa 1440
gaagctcatg ggcgacaagg gccgcaagat caacattctg cgagactgtg atggtcttgt 1500
cgaggctggt gaaatgttgg ttgttctagg acctcctgga tctggttgct caaccttctt 1560
gaagaccatt actggcgaaa cacacggttt ctttgtcgat cagaactcac acattaacta 1620
ccagggtatc agcccagaga ttatgaacaa gaactaccgt ggagaggcca tctacacggc 1680
cgaggtcgac gtccacttcc ccatgatgac tgtcggagaa accctctact ttgctgctca 1740
agctcgccgc cccaggcaca tcccaggcgg cgtcagcgta caacaatacg ctgaacacca 1800
gcgcgatgtc atcatggcct tgtacggcat ctcccacact ctcaacactc gggtcggaaa 1860
cgacttcctc cgcggtgtct ctggtggaga gcgaaagcgt gtcaccatcg ccgaagcatc 1920
ccttagcaga gcacccctgc aagcctggga caactcgacc agaggtctcg actctgcaaa 1980
cgccatcgag ttctgcaaaa ccctccgtat ggagactgag atcaacggca gcactgcatg 2040
tgttgccatt taccaagcac cacaggcagc ttatgatctt tttgacaagg cgcttgtttt 2100
gtacgagggc cgccaaatct tctttggtaa gacgacagac gcaaaggcct actttgtcaa 2160
catgggcttc cattgccctg accgccaaac ggatgccgat ttcttaacct ccatgaccag 2220
tccgctggag cgcatcgtgc gtgaaggctt cgagggccgc gttccccgca cacctgacga 2280
gttcgcacag cgctggctag attctcctga gcgagcagcg ctgcttcgag acattgaagc 2340
ttatgagcaa aaatacccca ttggtggcga gtctagccag aaattcaagg agtctcgcca 2400
gcttcaaaag gcaaagggtc agcgtgagac atcgccttac acgctctctt acatggacca 2460
ggtcaagctt tgcctctggc gaggcttcgt ccgcttaaag gccgatccta gtatcactct 2520
cacccaactc atcgctaact ccatcatggc tcttatcatc tcgtccgtct tctacaacct 2580
ccagcctacg acatcaagtt tctactctcg atccgctctg ctcttcttcg ctatcctgat 2640
gaacgccttc ggttccgcgc ttgagatctt gacgctctac gcccaacgtc ccattgttga 2700
gaagcactcg cgctacgcct tgtatcaccc atctgctgag gcttttgcta gtatgttgac 2760
ggatttgcct tacaagattg taaacgccat tacgttcaat ttggttttgt acttcatgac 2820
caacctgcgg agagaaccgg gcaatttctt cttctttgtc ctcatctcct ttacgcttac 2880
gctcgtcatg tccatgttct tccgatccat tgccgctttg tcgcgctccc tggtacaggc 2940
gttggcgcct gctgccattc tcattctcgg cctcgtcatg tacaccggtt tcgccattcc 3000
tccgaactac atgttgggat ggtccaagtg gatccgctac atcaaccctg tcagctatgg 3060
atttgaggcc ctcatggtca atgaattcca caacaggcga tttgagtgca atgactacat 3120
tccttccagc ggtggtcttc cagctctcag tgcctatgac aacatctctg gtccccaacg 3180
agcttgccgg gctatcggat ctgttcctgg acagccatat gtcgagggcg acgcttacat 3240
caactcgtcc ttcaattact acgcttcgca caagtggcgc aactttggaa tcatgtgggc 3300
attcatgttc ggtctcatgt ttgtctacct cgctggtacc gaatacatca ctgccaagaa 3360
gtccaagggt gaggtccttg ttttccgccg cggccacaag ctccccgctc ccaagagcaa 3420
gtcacaagag gaccttgaag ctgctgaccc cggacgcaac gttgctgtgc aaaacgacaa 3480
ctcggacagc attgccatca tcgaacgcca aaccgccatc ttccagtggg aggacgtgtg 3540
ctacgacatc acgatcaaga aggagcctcg ccgaattctc gaccacgttg atggatgggt 3600
caagcctggt actttgaccg cgctcatggg cgtttctggt gccggtaaaa caacgctttt 3660
ggactgcttg gctacacgta caaccatggg tgtgattaca ggtcagatgc tggttgatgg 3720
caaacctcgc gatgaatctt tccagcgtaa gacaggttac gctcaacaac aggatctcca 3780
tctgtcaacc tcgaccgttc gcgaagccct cattttctct gctgtcctac gtcaacctgc 3840
acacgtctct cgccaagaga agctcgaata tgtcgaagaa gttatcaagc ttctggaaat 3900
gaccgagtac gctgatgccg tcgttggtgt acctggcgaa ggtcttaatg tcgagcaacg 3960
aaagcgtctt actatcggtg tggagttggc tgctaagcct gcgctccttc tcttcttgga 4020
tgaacctacc tctggacttg actctcaaac ctcttgggct atcctggatc ttctcgacaa 4080
gctgaagaag aacggccagg ctattctctg cactatccac cagcccagtg ccatgttgtt 4140
ccagcgcttc gaccgtcttc tgttcttggc ccgtggtgga cgtactgtct actatggaga 4200
tatcggtgaa aactcgcaaa ctcttgtcaa ctactttgtt cgcaacggcg gccctccatg 4260
cccacctgat gccaatcccg ctgaatggat gttggaagtc atcggtgctg ctcctggctc 4320
gcatactgat attgactggc atcagacttg gcgacagtcc cctgagtata ccgaggtcaa 4380
gcgccatctt gctgaactca agtccgaacg tggccaggct gaggctcttc agcgtaccct 4440
gtctgctcag aagcgggagg acaaggctgc ataccgcgag tttgccgcgc catttgccgt 4500
ccagctccgc gagaccttgg ttcgtgtctt ccaacaatac tggcgcactc cttcgtacat 4560
ctactccaag actttcctct gtgttctttc ggccctcttc atcggtttct cgctcttcca 4620
gatgcccaac acccagactg ggctccagaa ccagatgttt ggtatcttca tgttgttgac 4680
catcttcggt cagttggtcc aacagatcat gccgcacttt gtcacccagc gtgccttgta 4740
cgaggttcgt gagcgaccca gcaaggccta ttcctggaag gcattcatga ttgccaacat 4800
cgtcgtcgaa cttccctgga actccctgat ggctgtgctc attttcttct gctggtacta 4860
cccgattggt ctctacaaga atgccgagta caccgatgcc gtcaccctcc gcggtttcca 4920
gttgttcctc tttgtctgga tgttcctgct ctttacctcg accttcacgc acatggttat 4980
tgctggtatg gaccacgccg agactggtgg caatgtggcc aacttgatgt tctcgctctg 5040
cctgattttc tgcggtgtcc tcgcccagcc ttcccagttc ccccgcttct ggatcttcat 5100
gtaccgtgtc tcaccgttta cgtacatggt gtccggtatg ttgtccgctg gtcttgccaa 5160
cagtcaggtg aactgcgctc caaacgagct cattcatttc gacccatccc agggccagac 5220
ttgcggcgaa tacatcaagc cttggatttc ggtctctggt ggcaagatgc tgaacccgga 5280
tgctacaagt gattgcaact tctgtgctat ccaggatacc aatgtcttcc tctcaagcat 5340
ctcatcctcc tactccgacc tttggcgtaa ctttggaatc ctctgggtct atgtcatctt 5400
caacatcgct gccgcgctcg ccttgtacta tcttatccgt atgcccaagc ccaagaagga 5460
ggagacaaag gaagctagcc cggctactgc agccagtgcg acaacggccg gtgagccttc 5520
ccgtgttggc agcagccaca actcggattc cggccgccaa ggtgagaagg atggcgagac 5580
tgctcgctac gcttccatca ctggtgccag cacaccacca caacaaccta cagaaatatc 5640
ggagaaggta taagattaat ataattatat aaaaatatta tcttcttttc tttatatcta 5700
gtgttatgta aaataaattg atgactacgg aaagcttttt tatattgttt ctttttcatt 5760
ctgagccact taaatttcgt gaatgttctt gtaagggacg gtagatttac aagtgataca 5820
acaaaaagca aggcgctttt tctaataaaa agaagaaaag catttaacaa ttgaacacct 5880
ctatatcaac gaagaatatt actttgtctc taaatccttg taaaatgtgt acgatctcta 5940
tatgggttac tcataagtgt accgaagact gcattgaaag tttatgtttt ttcactggag 6000
gcgtcatttt cgcgttgaga agatgttctt atccaaattt caactgttat atagaactgg 6060
ccgtcgtttt acaacgtcgt ccattcaggc tgcgcaactg ttgggaaggg cgatcggtgc 6120
gggcctcttc gctattacgc cag 6143
<210> 3
<211> 6098
<212> DNA
<213> Artificial sequence
<400> 3
ctcaggtata gcatgaggtc gctcttattg accacacctc taccggcatg ccgagatcca 60
actggcaccg ctggcttgaa caacaatacc agccttccaa cttctgtaaa taacggcggt 120
acgccagtgc caccagtacc gttacctttc ggtatacctc ctttccccat gtttccaatg 180
cccttcatgc ctccaacggc tactatcaca aatcctcatc aagctgacgc aagccctaag 240
aaatgaataa caatactgac agtactaaat aattgcctac ttggcttcac atacgttgca 300
tacgtcgata tagataataa tgataatgac agcaggatta tcgtaatacg taatagttga 360
aaatctcaaa aatgtgtggg tcattacgta aataatgata ggaatgggat tcttctattt 420
ttcctttttc cattctagca gccgtcggga aaacgtggca tcctctcttt cgggctcaat 480
tggagtcacg ctgccgtgag catcctctct ttccatatct aacaactgag cacgtaacca 540
atggaaaagc atgagcttag cgttgctcca aaaaagtatt ggatggttaa taccatttgt 600
ctgttctctt ctgactttga ctcctcaaaa aaaaaaaatc tacaatcaac agatcgcttc 660
aattacgccc tcacaaaaac ttttttcctt cttcttcgcc cacgttaaat tttatccctc 720
atgttgtcta acggatttct gcacttgatt tattataaaa agacaaagac ataatacttc 780
tctatcaatt tcagttattg ttcttccttg cgttattctt ctgttcttct ttttcttttg 840
tcatatataa ccataaccaa gtaatacata ttcaaaatga gcctagtcgg caatttcacc 900
tccaactttg atcgggaggc tgtatccggc ggcgccccat caccagaaat gatagccgaa 960
cagcaacgcc aacactacca ggacgacgag cgggaccacc aggctgccga gtcagatgcg 1020
tccacaatag cagcgggcga gggctcgccg cccagtcagc acaaccgcgc tgaccacaag 1080
aatgcctctc acaacacacg tgacgacgac gaaatcgccg agaccatgcg gagagaagag 1140
gcagtccacc aacttgccag acgcctgact gcccagagcc atcaatcttc atcacaagca 1200
aaccctttca acgcaccccc taattcggct ctggatccca atggcgagca cttcaacgcc 1260
cgtgcctgga caaaggctat gctcaatctg caacttcaag acgagaatgc accaccggtc 1320
cgaaccgctg gtgttgcttt ccgcaacctg aatgtccatg gtttcggtac cgacgctgat 1380
taccagaaga gcgtcggcaa cgtttggctc gaaggcccca gtctcgtcaa gaagctcatg 1440
ggcgacaagg gccgcaagat caacattctg cgagactgtg atggtcttgt cgaggctggt 1500
gaaatgttgg ttgttctagg acctcctgga tctggttgct caaccttctt gaagaccatt 1560
actggcgaaa cacacggttt ctttgtcgat cagaactcac acattaacta ccagggtatc 1620
agcccagaga ttatgaacaa gaactaccgt ggagaggcca tctacacggc cgaggtcgac 1680
gtccacttcc ccatgatgac tgtcggagaa accctctact ttgctgctca agctcgccgc 1740
cccaggcaca tcccaggcgg cgtcagcgta caacaatacg ctgaacacca gcgcgatgtc 1800
atcatggcct tgtacggcat ctcccacact ctcaacactc gggtcggaaa cgacttcctc 1860
cgcggtgtct ctggtggaga gcgaaagcgt gtcaccatcg ccgaagcatc ccttagcaga 1920
gcacccctgc aagcctggga caactcgacc agaggtctcg actctgcaaa cgccatcgag 1980
ttctgcaaaa ccctccgtat ggagactgag atcaacggca gcactgcatg tgttgccatt 2040
taccaagcac cacaggcagc ttatgatctt tttgacaagg cgcttgtttt gtacgagggc 2100
cgccaaatct tctttggtaa gacgacagac gcaaaggcct actttgtcaa catgggcttc 2160
cattgccctg accgccaaac ggatgccgat ttcttaacct ccatgaccag tccgctggag 2220
cgcatcgtgc gtgaaggctt cgagggccgc gttccccgca cacctgacga gttcgcacag 2280
cgctggctag attctcctga gcgagcagcg ctgcttcgag acattgaagc ttatgagcaa 2340
aaatacccca ttggtggcga gtctagccag aaattcaagg agtctcgcca gcttcaaaag 2400
gcaaagggtc agcgtgagac atcgccttac acgctctctt acatggacca ggtcaagctt 2460
tgcctctggc gaggcttcgt ccgcttaaag gccgatccta gtatcactct cacccaactc 2520
atcgctaact ccatcatggc tcttatcatc tcgtccgtct tctacaacct ccagcctacg 2580
acatcaagtt tctactctcg atccgctctg ctcttcttcg ctatcctgat gaacgccttc 2640
ggttccgcgc ttgagatctt gacgctctac gcccaacgtc ccattgttga gaagcactcg 2700
cgctacgcct tgtatcaccc atctgctgag gcttttgcta gtatgttgac ggatttgcct 2760
tacaagattg taaacgccat tacgttcaat ttggttttgt acttcatgac caacctgcgg 2820
agagaaccgg gcaatttctt cttctttgtc ctcatctcct ttacgcttac gctcgtcatg 2880
tccatgttct tccgatccat tgccgctttg tcgcgctccc tggtacaggc gttggcgcct 2940
gctgccattc tcattctcgg cctcgtcatg tacaccggtt tcgccattcc tccgaactac 3000
atgttgggat ggtccaagtg gatccgctac atcaaccctg tcagctatgg atttgaggcc 3060
ctcatggtca atgaattcca caacaggcga tttgagtgca atgactacat tccttccagc 3120
ggtggtcttc cagctctcag tgcctatgac aacatctctg gtccccaacg agcttgccgg 3180
gctatcggat ctgttcctgg acagccatat gtcgagggcg acgcttacat caactcgtcc 3240
ttcaattact acgcttcgca caagtggcgc aactttggaa tcatgtgggc attcatgttc 3300
ggtctcatgt ttgtctacct cgctggtacc gaatacatca ctgccaagaa gtccaagggt 3360
gaggtccttg ttttccgccg cggccacaag ctccccgctc ccaagagcaa gtcacaagag 3420
gaccttgaag ctgctgaccc cggacgcaac gttgctgtgc aaaacgacaa ctcggacagc 3480
attgccatca tcgaacgcca aaccgccatc ttccagtggg aggacgtgtg ctacgacatc 3540
acgatcaaga aggagcctcg ccgaattctc gaccacgttg atggatgggt caagcctggt 3600
actttgaccg cgctcatggg cgtttctggt gccggtaaaa caacgctttt ggactgcttg 3660
gctacacgta caaccatggg tgtgattaca ggtcagatgc tggttgatgg caaacctcgc 3720
gatgaatctt tccagcgtaa gacaggttac gctcaacaac aggatctcca tctgtcaacc 3780
tcgaccgttc gcgaagccct cattttctct gctgtcctac gtcaacctgc acacgtctct 3840
cgccaagaga agctcgaata tgtcgaagaa gttatcaagc ttctggaaat gaccgagtac 3900
gctgatgccg tcgttggtgt acctggcgaa ggtcttaatg tcgagcaacg aaagcgtctt 3960
actatcggtg tggagttggc tgctaagcct gcgctccttc tcttcttgga tgaacctacc 4020
tctggacttg actctcaaac ctcttgggct atcctggatc ttctcgacaa gctgaagaag 4080
aacggccagg ctattctctg cactatccac cagcccagtg ccatgttgtt ccagcgcttc 4140
gaccgtcttc tgttcttggc ccgtggtgga cgtactgtct actatggaga tatcggtgaa 4200
aactcgcaaa ctcttgtcaa ctactttgtt cgcaacggcg gccctccatg cccacctgat 4260
gccaatcccg ctgaatggat gttggaagtc atcggtgctg ctcctggctc gcatactgat 4320
attgactggc atcagacttg gcgacagtcc cctgagtata ccgaggtcaa gcgccatctt 4380
gctgaactca agtccgaacg tggccaggct gaggctcttc agcgtaccct gtctgctcag 4440
aagcgggagg acaaggctgc ataccgcgag tttgccgcgc catttgccgt ccagctccgc 4500
gagaccttgg ttcgtgtctt ccaacaatac tggcgcactc cttcgtacat ctactccaag 4560
actttcctct gtgttctttc ggccctcttc atcggtttct cgctcttcca gatgcccaac 4620
acccagactg ggctccagaa ccagatgttt ggtatcttca tgttgttgac catcttcggt 4680
cagttggtcc aacagatcat gccgcacttt gtcacccagc gtgccttgta cgaggttcgt 4740
gagcgaccca gcaaggccta ttcctggaag gcattcatga ttgccaacat cgtcgtcgaa 4800
cttccctgga actccctgat ggctgtgctc attttcttct gctggtacta cccgattggt 4860
ctctacaaga atgccgagta caccgatgcc gtcaccctcc gcggtttcca gttgttcctc 4920
tttgtctgga tgttcctgct ctttacctcg accttcacgc acatggttat tgctggtatg 4980
gaccacgccg agactggtgg caatgtggcc aacttgatgt tctcgctctg cctgattttc 5040
tgcggtgtcc tcgcccagcc ttcccagttc ccccgcttct ggatcttcat gtaccgtgtc 5100
tcaccgttta cgtacatggt gtccggtatg ttgtccgctg gtcttgccaa cagtcaggtg 5160
aactgcgctc caaacgagct cattcatttc gacccatccc agggccagac ttgcggcgaa 5220
tacatcaagc cttggatttc ggtctctggt ggcaagatgc tgaacccgga tgctacaagt 5280
gattgcaact tctgtgctat ccaggatacc aatgtcttcc tctcaagcat ctcatcctcc 5340
tactccgacc tttggcgtaa ctttggaatc ctctgggtct atgtcatctt caacatcgct 5400
gccgcgctcg ccttgtacta tcttatccgt atgcccaagc ccaagaagga ggagacaaag 5460
gaagctagcc cggctactgc agccagtgcg acaacggccg gtgagccttc ccgtgttggc 5520
agcagccaca actcggattc cggccgccaa ggtgagaagg atggcgagac tgctcgctac 5580
gcttccatca ctggtgccag cacaccacca caacaaccta cagaaatatc ggagaaggta 5640
taaatttaac tccttaagtt actttaatga tttagttttt attattaata attcatgctc 5700
atgacatctc atatacacgt ttataaaact taaatagatt gaaaatgtat taaagattcc 5760
tcagggattc gatttttttg gaagtttttg tttttttttc cttgagatgc tgtagtattt 5820
gggaacaatt atacaatcga aagatatatg cttacattcg accgttttag ccgtgatcat 5880
tatcctatag taacataacc tgaagcataa ctgacactac tatcatcaat acttgtcaca 5940
tgagaactct gtgaataatt aggccactga aatttgatgc ctgaaggacc ggcatcacgg 6000
attttcgata aagcacttag tatcacacta attggctttt cgccatacta gcgttgaatg 6060
ttagcgtcaa caacaagaag tttaatgacg cggaggcc 6098
<210> 4
<211> 621
<212> DNA
<213> Artificial sequence
<400> 4
gaagtatgac ctctgttaaa tttttttttt tttaaatttc actttctaaa gtcccagaaa 60
tccgcttgaa tgtcttacat attgcaatgg atatgcttgg gtgatcatac ttcctggctt 120
tagatatttg aaacttaact cttgtcaaca aacttcctat ggagtgtata agaattgtaa 180
gttataacac cggcgaacaa tcggggcaga ctattccggg gaagaacaag gaagggcggt 240
cttttctccc tcattgtcat agcaaggtca tttcgccttc tcagaaaggg gtagaatcaa 300
tctagcacgc agattgcaaa cacggcttaa taatatgcct atcaggcatt cacccgtgtg 360
acgaatcgca caccgctgct ctccttaatt ccctagagta gaaaccgagc tttcaggaaa 420
agactacggc agtaaagaat tgctttactg ggcgtataaa accgggagaa tcaagacatt 480
ctaatgactt gattcaggat gagagcttaa taggtgcatc ttagcaagct aaaatttgga 540
cagctctcat tactaaatta agatagaaaa ttgtttttgc gtgtttctcg tatgattgta 600
atatgtagat aaattaaaca t 621
<210> 5
<211> 581
<212> DNA
<213> Artificial sequence
<400> 5
aagtgcaact gaaagctacg aatataccat gactatttta attacgttgg tgtcattgat 60
attcaggttc tactttaact tcattttggc atcttttgtt caagaattgt tacaccaccc 120
caaatattta gttgataggg atgacgtaga acaaaacttg aaaaacaagc ctatttggaa 180
aagactgtgg gctaagagcc aaaaaggttg ttataagcta tgtaagaatt tgttagagta 240
aatagtaaaa taaagcccta gcgttatagc gttttacact gatgaagaaa tgtttctatt 300
ctgacatcat aaaatcattc tcactgatgc tattgtcact tttcatcacg tgtgtattta 360
agtagctctt acgtaatttt gaaaaaaaag gaaaaaggaa actatatagc acgcaattcc 420
ctatttggtt gcaattcaat tccgtgaaac ccttttcttt tctaaagtga taataaagta 480
actttgcaat ataatcaggt cgcaaatata cccacagata ataatctacc gcgcgctaca 540
ttacagttaa tgcctccagc aacctgtagt gcttctttaa a 581
<210> 6
<211> 1290
<212> DNA
<213> Artificial sequence
<400> 6
ggaaaagttg taaatattat tggtagtatt cgtttggtaa agtagagggg gtaatttttc 60
ccctttattt tgttcataca ttcttaaatt gctttgcctc tccttttgga aagctatact 120
tcggagcact gttgagcgaa ggctcattag atatattttc tgtcattttc cttaacccaa 180
aaataaggga aagggtccaa aaagcgctcg gacaactgtt gaccgtgatc cgaaggactg 240
gctatacagt gttcacaaaa tagccaagct gaaaataatg tgtagctatg ttcagttagt 300
ttggctagca aagatataaa agcaggtcgg aaatatttat gggcattatt atgcagagca 360
tcaacatgat aaaaaaaaac agttgaatat tccctcaaaa atgtcgaaag ctacatataa 420
ggaacgtgct gctactcatc ctagtcctgt tgctgccaag ctatttaata tcatgcacga 480
aaagcaaaca aacttgtgtg cttcattgga tgttcgtacc accaaggaat tactggagtt 540
agttgaagca ttaggtccca aaatttgttt actaaaaaca catgtggata tcttgactga 600
tttttccatg gagggcacag ttaagccgct aaaggcatta tccgccaagt acaatttttt 660
actcttcgaa gacagaaaat ttgctgacat tggtaataca gtcaaattgc agtactctgc 720
gggtgtatac agaatagcag aatgggcaga cattacgaat gcacacggtg tggtgggccc 780
aggtattgtt agcggtttga agcaggcggc agaagaagta acaaaggaac ctagaggcct 840
tttgatgtta gcagaattgt catgcaaggg ctccctatct actggagaat atactaaggg 900
tactgttgac attgcgaaga gcgacaaaga ttttgttatc ggctttattg ctcaaagaga 960
catgggtgga agagatgaag gttacgattg gttgattatg acacccggtg tgggtttaga 1020
tgacaaggga gacgcattgg gtcaacagta tagaaccgtg gatgatgtgg tctctacagg 1080
atctgacatt attattgttg gaagaggact atttgcaaag ggaagggatg ctaaggtaga 1140
gggtgaacgt tacagaaaag caggctggga agcatatttg agaagatgcg gccagcaaaa 1200
ctaagtgcct aatgagtgag ctaactcaca ttaattgcgt tgcgctcact gcccgctttc 1260
cagtcggtca cacaggaaac agctatgacc 1290
<210> 7
<211> 376
<212> DNA
<213> Artificial sequence
<400> 7
aaagaaagtg gaatattcat tcatatcata ttttttctat taactgcctg gtttctttta 60
aattttttat tggttgtcga cttgaacgga gtgacaatat atatatatat atatttaata 120
atgacatcat tatctgtaaa tctgattctt aatgctattc tagttatgta agagtggtcc 180
tttccataaa aaaaaaaaaa aagaaaaaag aattttagga atacaatgca gcttgtaagt 240
aaaatctgga atattcatat cgccacaact tcttatgctt ataaaagcac taatgcctgg 300
cgaagaggcc cgcaccgatc gcccttccca acagttgcgc agcctgaatg gacgacgttg 360
taaaacgacg gccagt 376
<210> 8
<211> 1604
<212> DNA
<213> Artificial sequence
<400> 8
ggaaaagttg taaatattat tggtagtatt cgtttggtaa agtagagggg gtaatttttc 60
ccctttattt tgttcataca ttcttaaatt gctttgcctc tccttttgga aagctatact 120
tcggagcact gttgagcgaa ggctcattag atatattttc tgtcattttc cttaacccaa 180
aaataaggga aagggtccaa aaagcgctcg gacaactgtt gaccgtgatc cgaaggactg 240
gctatacagt gttcacaaaa tagccaagct gaaaataatg tgtagctatg ttcagttagt 300
ttggctagca aagatataaa agcaggtcgg aaatatttat gggcattatt atgcagagca 360
tcaacatgat aaaaaaaaac agttgaatat tccctcaaaa atgtcgaaag ctacatataa 420
ggaacgtgct gctactcatc ctagtcctgt tgctgccaag ctatttaata tcatgcacga 480
aaagcaaaca aacttgtgtg cttcattgga tgttcgtacc accaaggaat tactggagtt 540
agttgaagca ttaggtccca aaatttgttt actaaaaaca catgtggata tcttgactga 600
tttttccatg gagggcacag ttaagccgct aaaggcatta tccgccaagt acaatttttt 660
actcttcgaa gacagaaaat ttgctgacat tggtaataca gtcaaattgc agtactctgc 720
gggtgtatac agaatagcag aatgggcaga cattacgaat gcacacggtg tggtgggccc 780
aggtattgtt agcggtttga agcaggcggc agaagaagta acaaaggaac ctagaggcct 840
tttgatgtta gcagaattgt catgcaaggg ctccctatct actggagaat atactaaggg 900
tactgttgac attgcgaaga gcgacaaaga ttttgttatc ggctttattg ctcaaagaga 960
catgggtgga agagatgaag gttacgattg gttgattatg acacccggtg tgggtttaga 1020
tgacaaggga gacgcattgg gtcaacagta tagaaccgtg gatgatgtgg tctctacagg 1080
atctgacatt attattgttg gaagaggact atttgcaaag ggaagggatg ctaaggtaga 1140
gggtgaacgt tacagaaaag caggctggga agcatatttg agaagatgcg gccagcaaaa 1200
ctaaacgcac agatattata acatctgcac aataggcatt tgcaagaatt actcgtgagt 1260
aaggaaagag tgaggaacta tcgcatacct gcatttaaag atgccgattt gggcgcgaat 1320
cctttatttt ggcttcaccc tcatactatt atcagggcca gaaaaaggaa gtgtttccct 1380
ccttcttgaa ttgatgttac cctcataaag cacgtggcct cttatcgaga aagaaattac 1440
cgtcgctcgt gatttgtttg caaaaagaac aaaactgaaa aaacccagac acgctcgact 1500
tcctgtcttc ctattgattg cagcttccaa tttcgtcaca caacaaggtc ctagcgacgg 1560
ctcacaggtt ttgtaacaag caatcgaagg ttctggaatg gcgg 1604
<210> 9
<211> 499
<212> DNA
<213> Artificial sequence
<400> 9
agtctaggtc cctatttatt tttttatagt tatgttagta ttaagaacgt tatttatatt 60
tcaaattttt cttttttttc tgtacagacg cgtgtacgca tgtaacatta tactgaaaac 120
cttgcttgag aaggttttgg gacgctcgaa ggctttaatt tgcaagctgc ggccctgcat 180
taatgaatcg gccaacgcgc aaagaaagtg gaatattcat tcatatcata ttttttctat 240
taactgcctg gtttctttta aattttttat tggttgtcga cttgaacgga gtgacaatat 300
atatatatat atatttaata atgacatcat tatctgtaaa tctgattctt aatgctattc 360
tagttatgta agagtggtcc tttccataaa aaaaaaaaaa aagaaaaaag aattttagga 420
atacaatgca gcttgtaagt aaaatctgga atattcatat cgccacaact tcttatgctt 480
ataaaagcac taatgcctg 499
<210> 10
<211> 1463
<212> DNA
<213> Artificial sequence
<400> 10
caagtttgtg taatagatag cgttatatta tagaactata aaggtccttg aatatacata 60
gtgtttcatt cctattactg tatatgtgac tttacattgt tacttccgcg gctatttgac 120
gttttctgct tcaggtgcgg cttggagggc aaagtgtcag aaaatcggcc aggccgtatg 180
acacaaaaga gtagaaaacg agatctcaaa tatctcgagg cctgtcctct atacaaccgc 240
ccagctctct gacaaagctc cagaacggtt gtcttttgtt tcgaaaagcc aaggtccctt 300
ataattgccc tccattttgt gtcacctatt taagcaaaaa attgaaagtt tactaacctt 360
tcattaaaga gaaataacaa tattataaaa agcgcttaaa atgacagagc agaaagccct 420
agtaaagcgt attacaaatg aaaccaagat tcagattgcg atctctttaa agggtggtcc 480
cctagcgata gagcactcga tcttcccaga aaaagaggca gaagcagtag cagaacaggc 540
cacacaatcg caagtgatta acgtccacac aggtataggg tttctggacc atatgataca 600
tgctctggcc aagcattccg gctggtcgct aatcgttgag tgcattggtg acttacacat 660
agacgaccat cacaccactg aagactgcgg gattgctctc ggtcaagctt ttaaagaggc 720
cctaggggcc gtgcgtggag taaaaaggtt tggatcagga tttgcgcctt tggatgaggc 780
actttccaga gcggtggtag atctttcgaa caggccgtac gcagttgtcg aacttggttt 840
gcaaagggag aaagtaggag atctctcttg cgagatgatc ccgcattttc ttgaaagctt 900
tgcagaggct agcagaatta ccctccacgt tgattgtctg cgaggcaaga atgatcatca 960
ccgtagtgag agtgcgttca aggctcttgc ggttgccata agagaagcca cctcgcccaa 1020
tggtaccaac gatgttccct ccaccaaagg tgttcttatg tagacgcaca gatattataa 1080
catctgcaca ataggcattt gcaagaatta ctcgtgagta aggaaagagt gaggaactat 1140
cgcatacctg catttaaaga tgccgatttg ggcgcgaatc ctttattttg gcttcaccct 1200
catactatta tcagggccag aaaaaggaag tgtttccctc cttcttgaat tgatgttacc 1260
ctcataaagc acgtggcctc ttatcgagaa agaaattacc gtcgctcgtg atttgtttgc 1320
aaaaagaaca aaactgaaaa aacccagaca cgctcgactt cctgtcttcc tattgattgc 1380
agcttccaat ttcgtcacac aacaaggtcc tagcgacggc tcacaggttt tgtaacaagc 1440
aatcgaaggt tctggaatgg cgg 1463
<210> 11
<211> 616
<212> DNA
<213> Artificial sequence
<400> 11
agtctaggtc cctatttatt tttttatagt tatgttagta ttaagaacgt tatttatatt 60
tcaaattttt cttttttttc tgtacagacg cgtgtacgca tgtaacatta tactgaaaac 120
cttgcttgag aaggttttgg gacgctcgaa ggctttaatt tgcaagctgc ggccctgcat 180
taatgaatcg gccaacgcgc ataaactaat gattttaaat cgttaaaaaa atatgcgaat 240
tctgtggatc gaacacagga cctccagata acttgaccga agttttttct tcagtctggc 300
gctctcccaa ctgagctaaa tccgcttact atttgttatc agttcccttc atatctacat 360
agaataggtt aagtatttta ttagttgcca gaagaactac tgatagttgg gaatatttgg 420
tgaataatga agattgggtg aataatttga taattttgag attcaattgt taatcaatgt 480
tacaatatta tgtatacaga gtatactaga agttctcttc ggagatcttg aagttcacaa 540
aagggaatcg atatttctac ataatattat cattacttct tccccatctt atatttgtca 600
ttcattattg attatg 616
<210> 12
<211> 1775
<212> DNA
<213> Artificial sequence
<400> 12
gcgcaagttt tccgctttgt aatatatatt tatacccctt tcttctctcc cctgcaatat 60
aatagtttaa ttctaatatt aataatatcc tatattttct tcatttaccg gcgcactctc 120
gcccgaacga cctcaaaatg tctgctacat tcataataac caaaagctca taactttttt 180
ttttgaacct gaatatatat acatcacata tcactgctgg tccttgccga ccagcgtata 240
caatctcgat agttggtttc ccgttctttc cactcccgtc atgtctgccc ctaagaagat 300
cgtcgttttg ccaggtgacc acgttggtca agaaatcaca gccgaagcca ttaaggttct 360
taaagctatt tctgatgttc gttccaatgt caagttcgat ttcgaaaatc atttaattgg 420
tggtgctgct atcgatgcta caggtgttcc acttccagat gaggcgctgg aagcctccaa 480
gaaggctgat gccgttttgt taggtgctgt gggtggtcct aaatggggta ccggtagtgt 540
tagacctgaa caaggtttac taaaaatccg taaagaactt caattgtacg ccaacttaag 600
accatgtaac tttgcatccg actctctttt agacttatct ccaatcaagc cacaatttgc 660
taaaggtact gacttcgttg ttgtcagaga attagtggga ggtatttact ttggtaagag 720
aaaggaagac gatggtgatg gtgtcgcttg ggatagtgaa caatacaccg ttccagaagt 780
gcaaagaatc acaagaatgg ccgctttcat ggccctacaa catgagccac cattgcctat 840
ttggtccttg gataaagcta atgttttggc ctcttcaaga ttatggagaa aaactgtgga 900
ggaaaccatc aagaacgaat tccctacatt gaaggttcaa catcaattga ttgattctgc 960
cgccatgatc ctagttaaga acccaaccca cctaaatggt attataatca ccagcaacat 1020
gtttggtgat atcatctccg atgaagcctc cgttatccca ggttccttgg gtttgttgcc 1080
atctgcgtcc ttggcctctt tgccagacaa gaacaccgca tttggtttgt acgaaccatg 1140
ccacggttct gctccagatt tgccaaagaa taaggtcaac cctatcgcca ctatcttgtc 1200
tgctgcaatg atgttgaaat tgtcattgaa cttgcctgaa gaaggtaagg ccattgaaga 1260
tgcagttaaa aaggttttgg atgcaggtat cagaactggt gatttaggtg gttccaacag 1320
taccaccgaa gtcggtgatg ctgtcgccga agaagttaag aaaatccttg cttaaacgca 1380
cagatattat aacatctgca caataggcat ttgcaagaat tactcgtgag taaggaaaga 1440
gtgaggaact atcgcatacc tgcatttaaa gatgccgatt tgggcgcgaa tcctttattt 1500
tggcttcacc ctcatactat tatcagggcc agaaaaagga agtgtttccc tccttcttga 1560
attgatgtta ccctcataaa gcacgtggcc tcttatcgag aaagaaatta ccgtcgctcg 1620
tgatttgttt gcaaaaagaa caaaactgaa aaaacccaga cacgctcgac ttcctgtctt 1680
cctattgatt gcagcttcca atttcgtcac acaacaaggt cctagcgacg gctcacaggt 1740
tttgtaacaa gcaatcgaag gttctggaat ggcgg 1775
<210> 13
<211> 532
<212> DNA
<213> Artificial sequence
<400> 13
agtctaggtc cctatttatt tttttatagt tatgttagta ttaagaacgt tatttatatt 60
tcaaattttt cttttttttc tgtacagacg cgtgtacgca tgtaacatta tactgaaaac 120
cttgcttgag aaggttttgg gacgctcgaa ggctttaatt tgcaagctgc ggccctgcat 180
taatgaatcg gccaacgcgc aagcatcttg ccctgtgctt ggcccccagt gcagcgaacg 240
ttataaaaac gaatactgag tatatatcta tgtaaaacaa ccatatcatt tcttgttctg 300
aactttgttt acctaactag ttttaaattt ccctttttcg tgcatgcggg tgttcttatt 360
tattagcata ctacatttga aatatcaaat ttccttagta gaaaagtgag agaaggtgca 420
ctgacacaaa aaataaaatg ctacgtataa ctgtcaaaac tttgcagcag cgggcatcct 480
tccatcatag cttcaaacat attagcgttc ctgatcttca tacccgtgct ca 532
<210> 14
<211> 1535
<212> DNA
<213> Artificial sequence
<400> 14
ttcactaccc tttttccatt tgccatctat tgaagtaata ataggcgcat gcaacttctt 60
ttcttttttt ttcttttctc tctcccccgt tgttgtctca ccatatccgc aatgacaaaa 120
aaatgatgga agacactaaa ggaaaaaatt aacgacaaag acagcaccaa cagatgtcgt 180
tgttccagag ctgatgaggg gtatctcgaa gcacacgaaa ctttttcctt ccttcattca 240
cgcacactac tctctaatga gcaacggtat acggccttcc ttccagttac ttgaatttga 300
aataaaaaaa agtttgctgt cttgctatca agtataaata gacctgcaat tattaatctt 360
ttgtttcctc gtcattgttc tcgttccctt tcttccttgt ttctttttct gcacaatatt 420
tcaagctata ccaagcatac aatcaactat ctcatataca atgtctgtta ttaatttcac 480
aggtagttct ggtccattgg tgaaagtttg cggcttgcag agcacagagg ccgcagaatg 540
tgctctagat tccgatgctg acttgctggg tattatatgt gtgcccaata gaaagagaac 600
aattgacccg gttattgcaa ggaaaatttc aagtcttgta aaagcatata aaaatagttc 660
aggcactccg aaatacttgg ttggcgtgtt tcgtaatcaa cctaaggagg atgttttggc 720
tctggtcaat gattacggca ttgatatcgt ccaactgcat ggagatgagt cgtggcaaga 780
ataccaagag ttcctcggtt tgccagttat taaaagactc gtatttccaa aagactgcaa 840
catactactc agtgcagctt cacagaaacc tcattcgttt attcccttgt ttgattcaga 900
agcaggtggg acaggtgaac ttttggattg gaactcgatt tctgactggg ttggaaggca 960
agagagcccc gaaagcttac attttatgtt agctggtgga ctgacgccag aaaatgttgg 1020
tgatgcgctt agattaaatg gcgttattgg tgttgatgta agcggaggtg tggagacaaa 1080
tggtgtaaaa gactctaaca aaatagcaaa tttcgtcaaa aatgctaaga aatagacgca 1140
cagatattat aacatctgca caataggcat ttgcaagaat tactcgtgag taaggaaaga 1200
gtgaggaact atcgcatacc tgcatttaaa gatgccgatt tgggcgcgaa tcctttattt 1260
tggcttcacc ctcatactat tatcagggcc agaaaaagga agtgtttccc tccttcttga 1320
attgatgtta ccctcataaa gcacgtggcc tcttatcgag aaagaaatta ccgtcgctcg 1380
tgatttgttt gcaaaaagaa caaaactgaa aaaacccaga cacgctcgac ttcctgtctt 1440
cctattgatt gcagcttcca atttcgtcac acaacaaggt cctagcgacg gctcacaggt 1500
tttgtaacaa gcaatcgaag gttctggaat ggcgg 1535
<210> 15
<211> 613
<212> DNA
<213> Artificial sequence
<400> 15
agtctaggtc cctatttatt tttttatagt tatgttagta ttaagaacgt tatttatatt 60
tcaaattttt cttttttttc tgtacagacg cgtgtacgca tgtaacatta tactgaaaac 120
cttgcttgag aaggttttgg gacgctcgaa ggctttaatt tgcaagctgc ggccctgcat 180
taatgaatcg gccaacgcgc gcgaatttct tatgatttat gatttttatt attaaataag 240
ttataaaaaa aataagtgta tacaaatttt aaagtgactc ttaggtttta aaacgaaaat 300
tcttattctt gagtaactct ttcctgtagg tcaggttgct ttctcaggta tagcatgagg 360
tcgctcttat tgaccacacc tctaccggca tgccgagcaa atgcctgcaa atcgctcccc 420
atttcaccca attgtagata tgctaactcc agcaatgagt tgatgaatct cggtgtgtat 480
tttatgtcct cagaggacaa cacctgttgt aatcgttctt ccacacggat ccacagccta 540
gccttcagtt gggctctatc ttcatcgtca ttcattgcat ctactagccc cttacctgag 600
cttcaagacg tta 613
<210> 16
<211> 3072
<212> DNA
<213> Artificial sequence
<400> 16
ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg agccttaatt 60
aaacgcacag atattataac atctgcacaa taggcatttg caagaattac tcgtgagtaa 120
ggaaagagtg aggaactatc gcatacctgc atttaaagat gccgatttgg gcgcgaatcc 180
tttattttgg cttcaccctc atactattat cagggccaga aaaaggaagt gtttccctcc 240
ttcttgaatt gatgttaccc tcataaagca cgtggcctct tatcgagaaa gaaattaccg 300
tcgctcgtga tttgtttgca aaaagaacaa aactgaaaaa acccagacac gctcgacttc 360
ctgtcttcct attgattgca gcttccaatt tcgtcacaca acaaggtcct agcgacggct 420
cacaggtttt gtaacaagca atcgaaggtt ctggaatggc gggaaagggt ttagtaccac 480
atgctatgat gcccactgtg atctccagag caaagttcgt tcgatcgtac tgttactctc 540
tctctttcaa acagaattgt ccgaatcgtg tgacaacaac agcctgttct cacacactct 600
tttcttctaa ccaagggggt ggtttagttt agtagaacct cgtgaaactt acatttacat 660
atatataaac ttgcataaat tggtcaatgc aagaaataca tatttggtct tttctaattc 720
gtagtttttc aagttcttag atgctttctt tttctctttt ttacagatca tcaaggaagt 780
aattatctac tttttacaac aaatataaaa caatggatct ccctacagca actgatatca 840
atgaaaaacc caaattgtcc aaggaagaac aagatccacg caatttcgta aagttaatga 900
acgatcagaa tcgaaatgaa ttgatcatct tttatggttc tcaaactggt actggtgaag 960
actatgcgca acgcttggga aaggaatgca agaagcgatt caacatacaa ccaatggtgg 1020
ccgatctaga aaactatgat cttggctatt tggatacact ccccaaggaa acgattgccg 1080
tgtttgttat ctctacttat ggtgaaggcg accctacgga tagtgcagtc aacttttggg 1140
aactcttgaa caaggatgta cctaccttct ctaaaggttg cgcggtggaa cgacctctta 1200
aagatttacg ctactttgtc tttgggcttg gcaatcgaac gtatgaatac tttaatggag 1260
cagctattgg agtggataaa caacttactc agcttggtgc aacacgattg ggcgaagtag 1320
gaatggggga tgatgataac tctttggaag acgattttat tcaatggcaa gatcaagtat 1380
ggcctttatt agcggatgct ttagcgacaa gcacggatac agtggatgaa caagcacaag 1440
cacaacatgc gtacaaggtg atgatgggcc aagaaaagga agatgaatca ttttactata 1500
tgggtgagct tggcgatact cagcttacaa catggagtgc gaagcgacct taccctgcac 1560
ctgtcaagat tcatgacctc acacctgctt ctcgtgatca acgtcattgc ttacacctgg 1620
atgtggactt gtccaacagc aacatctctt atactactgg cgatcatctc ggtatttggc 1680
ctacgaacaa cgaagacgaa gtgtttttgg tatctagtct ttttggttgg aatgacgctt 1740
atctggatca agtaatcaat gttgttccca cagattccac caacaaacct ccattccccc 1800
agcctaccac cttacggtct gctcttcgtc attacttgga tattgctcaa cttccttctc 1860
gatctactct tgatttactg cttccttctt gctcaaacga cagcctaaag tctttcttac 1920
agaacttggt caacgataaa gatgaacaca agcgggtggt attggatcaa gttcgtaacc 1980
tgggccagct tctctctttt gctttggaaa ctattggatc cacgactact gatggtgctt 2040
tgaaggatat acccgtggaa gttgtattgg aatgctactc tcggcttcaa cctcgttatt 2100
actctatatc atcatcatcc agcgaatcag caactacagt tagtgcgacc gctgtcactt 2160
tgaaatacaa cccaactcct gatcgaactg tatatggcgt gaacaccaat tacctttggg 2220
cgatccatca atcaatgtca tcgactccat catcggatgt gccaaagtat gtagtggatg 2280
ggccgcgtca acaatatctg atcaccaagg aagccaacag cgactcgatt aaaatcaaga 2340
ttcctgtaca tattcgcaaa tccaccttcc gtctacctcc ttcatcaagc actcctgtca 2400
ttatggttgg tcctggtact ggtgttgctc ctttccgtgg atttgtacgt gaacgtgtct 2460
accaaaagca agtcttgggc gaagatgttg gtgctactgt cctcttcttt ggctgccgac 2520
gatccaccga agactatctt tatgctgacg aatggccaag attattcaag tccctgggaa 2580
atggtccttc tagaatcatc accgccttct ctagagaatc tgaagaaaag aaggtctacg 2640
tacagcaacg actagccgaa catggacagg aaatgtggga cttgttggca aatcaagggg 2700
cttactttta tgtctgtggt gatgcaaagt atatggcgaa ggatgtgcaa caaaccgtga 2760
tcgacatggc aaagtctttt ggtggtcttg gcgataacga agctactacc tttattcaag 2820
aattacggaa atccaatcga tatgtggaag acgtgtgggc atagagttat aaaaaaaata 2880
agtgtataca aattttaaag tgactcttag gttttaaaac gaaaattctt attcttgagt 2940
aactctttcc tgtaggtcag gttgctttct caggtatagc atgaggtcgc tcttattgac 3000
cacacctcta ccggcatgcc gattaattaa agtgatcccc cacacaccat agcttcaaaa 3060
tgtttctact cc 3072
<210> 17
<211> 1766
<212> DNA
<213> Artificial sequence
<400> 17
gtgcctaatg agtgagctaa ctcacattaa ttgcgttgcg ctcactgccc gctttccagt 60
cggtcacaca ggaaacagct atgaccatac tagcgttgaa tgttagcgtc aacaacaaga 120
agtttaatga cgcggaggcc aaggcaaaaa gattccttga ttacgtaagg gagttagaat 180
cattttgaat aaaaaacacg ctttttcagt tcgagtttat cattatcaat actgccattt 240
caaagaatac gtaaataatt aatagtagtg attttcctaa ctttatttag tcaaaaaatt 300
agccttttaa ttctgctgta acccgtacat gcccaaaata gggggcgggt tacacagaat 360
atataacatc gtaggtgtct gggtgaacag tttattcctg gcatccacta aatataatgg 420
agcccgcttt ttaagctggc atccagaaaa aaaaagaatc ccagcaccaa aatattgttt 480
tcttcaccaa ccatcagttc ataggtccat tctcttagcg caactacaga gaacaggggc 540
acaaacaggc aaaaaacggg cacaacctca atggagtgat gcaacctgcc tggagtaaat 600
gatgacacaa ggcaattgac ccacgcatgt atctatctca ttttcttaca ccttctatta 660
ccttctgctc tctctgattt ggaaaaagct gaaaaaaaag gttgaaacca gttccctgaa 720
attattcccc tacttgacta ataagtatat aaagacggta ggtattgatt gtaattctgt 780
aaatctattt cttaaacttc ttaaattcta cttttatagt tagtcttttt tttagtttta 840
aaacaccaag aacttagttt cgaataaaca cacataaaca aacaaaatgt ccaaagtcta 900
ttctactgct gatgtcaatg cccaccaaac ccgtggtgac ctttgggttg tgattcacaa 960
caaggtttac gatattaccg aattcgttct tgaacatcct ggtggtgaag aagtcttgct 1020
cgacgaagct ggtaaggatg ctactgaatc cttcgaagac attggccact ctgacgaagc 1080
tcgctacatt ttggaaaagt acctcattgg tgaattggat gctgcttctc gtgttgacaa 1140
ccacaagttt gatcccatcc gtgctggaga actccctgaa caaaagcaat ctggatctgc 1200
cctccgtgtg atcatccctg ctcttgctgt tgccggtgtc cttgtctaca agtttatcct 1260
ctccaacaag aactaggatt aatataatta tataaaaata ttatcttctt ttctttatat 1320
ctagtgttat gtaaaataaa ttgatgacta cggaaagctt ttttatattg tttctttttc 1380
attctgagcc acttaaattt cgtgaatgtt cttgtaaggg acggtagatt tacaagtgat 1440
acaacaaaaa gcaaggcgct ttttctaata aaaagaagaa aagcatttaa caattgaaca 1500
cctctatatc aacgaagaat attactttgt ctctaaatcc ttgtaaaatg tgtacgatct 1560
ctatatgggt tactcataag tgtaccgaag actgcattga aagtttatgt tttttcactg 1620
gaggcgtcat tttcgcgttg agaagatgtt cttatccaaa tttcaactgt tatatagaac 1680
tggccgtcgt tttacaacgt cgtccattca ggctgcgcaa ctgttgggaa gggcgatcgg 1740
tgcgggcctc ttcgctatta cgccag 1766
<210> 18
<211> 2445
<212> DNA
<213> Artificial sequence
<400> 18
ggtatagcat gaggtcgctc ttattgacca cacctctacc ggcatgccga ttaattaaag 60
tgatccccca agtgatcccc cacacaccat agcttcaaaa tgtttctact ccttttttac 120
tcttccagat tttctcggac tccgcgcatc gccgtaccac ttcaaaacac ccaagcacag 180
catactaaat ttcccctctt tcttcctcta gggtgtcgtt aattacccgt actaaaggtt 240
tggaaaagaa aaaagagacc gcctcgtttc tttttcttcg tcgaaaaagg caataaaaat 300
ttttatcacg tttctttttc ttgaaaattt ttttttttga tttttttctc tttcgatgac 360
ctcccattga tatttaagtt aataaacggt cttcaatttc tcaagtttca gtttcatttt 420
tcttgttcta ttacaacttt ttttacttct tgctcattag aaagaaagca tagcaatcta 480
atctaagttt taattacaaa atgcttacag agtacattca tcattttatc aacaactttg 540
atcaaaagaa gactatggat caattacaaa ccatggtgtc atccaaagaa ggtatgatcg 600
gtttggccac agctgcagtc cttatgtctg gtgcagccgt ttacaaatca accaggatcg 660
aacggggatg ccctcaagta cctaaccagt cctactttat gggatctaca aaagaatatc 720
gcaacaaccc tgctgcattc atcgagaaat gggaaaagga actgggccct gtttatggtg 780
cttatttgtt tggtcagtat actactgttg tctctggtcc tcaagtgcgc gaagtattct 840
tgaacgatga ctttgacttt attgctggca tccgacgaga ctttgatacc aacttgttat 900
caaatggcgg cgatcttcga gacttgcctg tacacaagtt tgcgggtagt atcaagaaga 960
acctcagccc caaactccca ttctacacca gccgcgtgat tgaacatctc aagattggat 1020
taaaggaatt ctgtggagta gtgccagatg aaggaaagga attcgatcat gtgtatccct 1080
tggtgcaaca tatggtagcc aaagctagtg catccgtctt tgtgggtcct gaattagcca 1140
agaatgaaca attgatcgac tcgtttaaga atatggtcct cgaagtggga tctgaattag 1200
cccccaagcc ttatttagaa ttcttcccta atctgatgcg tctacgaatg tggtttattg 1260
gcaaaacgtc acaaaaggtc aagagacatc gggatcagct tcgtgcggcc ttggcacctc 1320
aagtggaata tcgactcaag gctatgaagg aaaacgatag caactgggat cgacctaacg 1380
actttttaca agatattctg gaaagcggcg atatcccagc tcatgtggat gtcacggatc 1440
attgttgcga ttggatgaca caaattatct ttgcagctct acacacaacc agtgaaaacg 1500
gcacattatc cttctatcgc ttattggata acccaaaggt gttggaagat ttactggaag 1560
aacaaaatca agtgttggaa gatgcaggct atgatagctc cgttggccct gaagtcttta 1620
cccgggaaat cctcaacaaa ttcgtcaaga tggatagtgt gattcgtgaa acgagtcgac 1680
tacgaaacga ctatatcggt ctccctcaca agaatattag ctcaaagaca attacattat 1740
cagggggtgc tatgatccgt ccaggtgaac gtgcttatgt gaatgcttat tccaaccatc 1800
gtgatggaac tatccaaaaa gtgacggaca acttaaaatc atttgaaccc taccgatttg 1860
taaaccaaga caggaactct acaaagattg gtgaagactt tatattcttt ggaatgggta 1920
aacatgcctg tcctggtcga tggttcgcca ttcaagaaat taaaacgatt attgccatga 1980
tgatccgaag ctaccaatta tctgcccttg gtcctgttac cttccccacc gatgactatt 2040
ccagaatacc catgggtcga ttcaagattg tgccaagaaa gtaaccgctg atcctagagg 2100
gccgcatcat gtaattagtt atgtcacgct tacattcacg ccctcccccc acatccgctc 2160
taaccgaaaa ggaaggagtt agacaacctg aagtctaggt ccctatttat ttttttatag 2220
ttatgttagt attaagaacg ttatttatat ttcaaatttt tctttttttt ctgtacagac 2280
gcgtgtacgc atgtaacatt atactgaaaa ccttgcttga gaaggttttg ggacgctcga 2340
aggctttaat ttgcaagctg cggccctgca ttaatgaatc ggccaacgcg ccagggtttt 2400
cccagtcacg acgttgtaaa acgacggcca gtgaattgta atacg 2445
<210> 19
<211> 683
<212> PRT
<213> Artificial sequence
<400> 19
Met Asp Leu Pro Thr Ala Thr Asp Ile Asn Glu Lys Pro Lys Leu Ser
1 5 10 15
Lys Glu Glu Gln Asp Pro Arg Asn Phe Val Lys Leu Met Asn Asp Gln
20 25 30
Asn Arg Asn Glu Leu Ile Ile Phe Tyr Gly Ser Gln Thr Gly Thr Gly
35 40 45
Glu Asp Tyr Ala Gln Arg Leu Gly Lys Glu Cys Lys Lys Arg Phe Asn
50 55 60
Ile Gln Pro Met Val Ala Asp Leu Glu Asn Tyr Asp Leu Gly Tyr Leu
65 70 75 80
Asp Thr Leu Pro Lys Glu Thr Ile Ala Val Phe Val Ile Ser Thr Tyr
85 90 95
Gly Glu Gly Asp Pro Thr Asp Ser Ala Val Asn Phe Trp Glu Leu Leu
100 105 110
Asn Lys Asp Val Pro Thr Phe Ser Lys Gly Cys Ala Val Glu Arg Pro
115 120 125
Leu Lys Asp Leu Arg Tyr Phe Val Phe Gly Leu Gly Asn Arg Thr Tyr
130 135 140
Glu Tyr Phe Asn Gly Ala Ala Ile Gly Val Asp Lys Gln Leu Thr Gln
145 150 155 160
Leu Gly Ala Thr Arg Leu Gly Glu Val Gly Met Gly Asp Asp Asp Asn
165 170 175
Ser Leu Glu Asp Asp Phe Ile Gln Trp Gln Asp Gln Val Trp Pro Leu
180 185 190
Leu Ala Asp Ala Leu Ala Thr Ser Thr Asp Thr Val Asp Glu Gln Ala
195 200 205
Gln Ala Gln His Ala Tyr Lys Val Met Met Gly Gln Glu Lys Glu Asp
210 215 220
Glu Ser Phe Tyr Tyr Met Gly Glu Leu Gly Asp Thr Gln Leu Thr Thr
225 230 235 240
Trp Ser Ala Lys Arg Pro Tyr Pro Ala Pro Val Lys Ile His Asp Leu
245 250 255
Thr Pro Ala Ser Arg Asp Gln Arg His Cys Leu His Leu Asp Val Asp
260 265 270
Leu Ser Asn Ser Asn Ile Ser Tyr Thr Thr Gly Asp His Leu Gly Ile
275 280 285
Trp Pro Thr Asn Asn Glu Asp Glu Val Phe Leu Val Ser Ser Leu Phe
290 295 300
Gly Trp Asn Asp Ala Tyr Leu Asp Gln Val Ile Asn Val Val Pro Thr
305 310 315 320
Asp Ser Thr Asn Lys Pro Pro Phe Pro Gln Pro Thr Thr Leu Arg Ser
325 330 335
Ala Leu Arg His Tyr Leu Asp Ile Ala Gln Leu Pro Ser Arg Ser Thr
340 345 350
Leu Asp Leu Leu Leu Pro Ser Cys Ser Asn Asp Ser Leu Lys Ser Phe
355 360 365
Leu Gln Asn Leu Val Asn Asp Lys Asp Glu His Lys Arg Val Val Leu
370 375 380
Asp Gln Val Arg Asn Leu Gly Gln Leu Leu Ser Phe Ala Leu Glu Thr
385 390 395 400
Ile Gly Ser Thr Thr Thr Asp Gly Ala Leu Lys Asp Ile Pro Val Glu
405 410 415
Val Val Leu Glu Cys Tyr Ser Arg Leu Gln Pro Arg Tyr Tyr Ser Ile
420 425 430
Ser Ser Ser Ser Ser Glu Ser Ala Thr Thr Val Ser Ala Thr Ala Val
435 440 445
Thr Leu Lys Tyr Asn Pro Thr Pro Asp Arg Thr Val Tyr Gly Val Asn
450 455 460
Thr Asn Tyr Leu Trp Ala Ile His Gln Ser Met Ser Ser Thr Pro Ser
465 470 475 480
Ser Asp Val Pro Lys Tyr Val Val Asp Gly Pro Arg Gln Gln Tyr Leu
485 490 495
Ile Thr Lys Glu Ala Asn Ser Asp Ser Ile Lys Ile Lys Ile Pro Val
500 505 510
His Ile Arg Lys Ser Thr Phe Arg Leu Pro Pro Ser Ser Ser Thr Pro
515 520 525
Val Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg Gly Phe
530 535 540
Val Arg Glu Arg Val Tyr Gln Lys Gln Val Leu Gly Glu Asp Val Gly
545 550 555 560
Ala Thr Val Leu Phe Phe Gly Cys Arg Arg Ser Thr Glu Asp Tyr Leu
565 570 575
Tyr Ala Asp Glu Trp Pro Arg Leu Phe Lys Ser Leu Gly Asn Gly Pro
580 585 590
Ser Arg Ile Ile Thr Ala Phe Ser Arg Glu Ser Glu Glu Lys Lys Val
595 600 605
Tyr Val Gln Gln Arg Leu Ala Glu His Gly Gln Glu Met Trp Asp Leu
610 615 620
Leu Ala Asn Gln Gly Ala Tyr Phe Tyr Val Cys Gly Asp Ala Lys Tyr
625 630 635 640
Met Ala Lys Asp Val Gln Gln Thr Val Ile Asp Met Ala Lys Ser Phe
645 650 655
Gly Gly Leu Gly Asp Asn Glu Ala Thr Thr Phe Ile Gln Glu Leu Arg
660 665 670
Lys Ser Asn Arg Tyr Val Glu Asp Val Trp Ala
675 680
<210> 20
<211> 129
<212> PRT
<213> Artificial sequence
<400> 20
Met Ser Lys Val Tyr Ser Thr Ala Asp Val Asn Ala His Gln Thr Arg
1 5 10 15
Gly Asp Leu Trp Val Val Ile His Asn Lys Val Tyr Asp Ile Thr Glu
20 25 30
Phe Val Leu Glu His Pro Gly Gly Glu Glu Val Leu Leu Asp Glu Ala
35 40 45
Gly Lys Asp Ala Thr Glu Ser Phe Glu Asp Ile Gly His Ser Asp Glu
50 55 60
Ala Arg Tyr Ile Leu Glu Lys Tyr Leu Ile Gly Glu Leu Asp Ala Ala
65 70 75 80
Ser Arg Val Asp Asn His Lys Phe Asp Pro Ile Arg Ala Gly Glu Leu
85 90 95
Pro Glu Gln Lys Gln Ser Gly Ser Ala Leu Arg Val Ile Ile Pro Ala
100 105 110
Leu Ala Val Ala Gly Val Leu Val Tyr Lys Phe Ile Leu Ser Asn Lys
115 120 125
Asn
<210> 21
<211> 527
<212> PRT
<213> Artificial sequence
<400> 21
Met Leu Thr Glu Tyr Ile His His Phe Ile Asn Asn Phe Asp Gln Lys
1 5 10 15
Lys Thr Met Asp Gln Leu Gln Thr Met Val Ser Ser Lys Glu Gly Met
20 25 30
Ile Gly Leu Ala Thr Ala Ala Val Leu Met Ser Gly Ala Ala Val Tyr
35 40 45
Lys Ser Thr Arg Ile Glu Arg Gly Cys Pro Gln Val Pro Asn Gln Ser
50 55 60
Tyr Phe Met Gly Ser Thr Lys Glu Tyr Arg Asn Asn Pro Ala Ala Phe
65 70 75 80
Ile Glu Lys Trp Glu Lys Glu Leu Gly Pro Val Tyr Gly Ala Tyr Leu
85 90 95
Phe Gly Gln Tyr Thr Thr Val Val Ser Gly Pro Gln Val Arg Glu Val
100 105 110
Phe Leu Asn Asp Asp Phe Asp Phe Ile Ala Gly Ile Arg Arg Asp Phe
115 120 125
Asp Thr Asn Leu Leu Ser Asn Gly Gly Asp Leu Arg Asp Leu Pro Val
130 135 140
His Lys Phe Ala Gly Ser Ile Lys Lys Asn Leu Ser Pro Lys Leu Pro
145 150 155 160
Phe Tyr Thr Ser Arg Val Ile Glu His Leu Lys Ile Gly Leu Lys Glu
165 170 175
Phe Cys Gly Val Val Pro Asp Glu Gly Lys Glu Phe Asp His Val Tyr
180 185 190
Pro Leu Val Gln His Met Val Ala Lys Ala Ser Ala Ser Val Phe Val
195 200 205
Gly Pro Glu Leu Ala Lys Asn Glu Gln Leu Ile Asp Ser Phe Lys Asn
210 215 220
Met Val Leu Glu Val Gly Ser Glu Leu Ala Pro Lys Pro Tyr Leu Glu
225 230 235 240
Phe Phe Pro Asn Leu Met Arg Leu Arg Met Trp Phe Ile Gly Lys Thr
245 250 255
Ser Gln Lys Val Lys Arg His Arg Asp Gln Leu Arg Ala Ala Leu Ala
260 265 270
Pro Gln Val Glu Tyr Arg Leu Lys Ala Met Lys Glu Asn Asp Ser Asn
275 280 285
Trp Asp Arg Pro Asn Asp Phe Leu Gln Asp Ile Leu Glu Ser Gly Asp
290 295 300
Ile Pro Ala His Val Asp Val Thr Asp His Cys Cys Asp Trp Met Thr
305 310 315 320
Gln Ile Ile Phe Ala Ala Leu His Thr Thr Ser Glu Asn Gly Thr Leu
325 330 335
Ser Phe Tyr Arg Leu Leu Asp Asn Pro Lys Val Leu Glu Asp Leu Leu
340 345 350
Glu Glu Gln Asn Gln Val Leu Glu Asp Ala Gly Tyr Asp Ser Ser Val
355 360 365
Gly Pro Glu Val Phe Thr Arg Glu Ile Leu Asn Lys Phe Val Lys Met
370 375 380
Asp Ser Val Ile Arg Glu Thr Ser Arg Leu Arg Asn Asp Tyr Ile Gly
385 390 395 400
Leu Pro His Lys Asn Ile Ser Ser Lys Thr Ile Thr Leu Ser Gly Gly
405 410 415
Ala Met Ile Arg Pro Gly Glu Arg Ala Tyr Val Asn Ala Tyr Ser Asn
420 425 430
His Arg Asp Gly Thr Ile Gln Lys Val Thr Asp Asn Leu Lys Ser Phe
435 440 445
Glu Pro Tyr Arg Phe Val Asn Gln Asp Arg Asn Ser Thr Lys Ile Gly
450 455 460
Glu Asp Phe Ile Phe Phe Gly Met Gly Lys His Ala Cys Pro Gly Arg
465 470 475 480
Trp Phe Ala Ile Gln Glu Ile Lys Thr Ile Ile Ala Met Met Ile Arg
485 490 495
Ser Tyr Gln Leu Ser Ala Leu Gly Pro Val Thr Phe Pro Thr Asp Asp
500 505 510
Tyr Ser Arg Ile Pro Met Gly Arg Phe Lys Ile Val Pro Arg Lys
515 520 525
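For readers who handle the listing programmatically: the protein entries above alternate residue rows (sixteen three-letter codes per row) with position-index rows. The following is a minimal Python sketch, not part of the patent, that converts such a block into a one-letter string. The function name `parse_prt_block` and the constant names are illustrative assumptions; the sample input is the first two rows of SEQ ID No. 20 as printed above.

```python
# Standard three-letter to one-letter codes for the 20 canonical amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_prt_block(text):
    """Convert a PRT block (residue rows + index rows) to a one-letter string.

    Hypothetical helper: rows made up only of integers are treated as
    position indices and skipped; every other token must be a residue code.
    """
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or all(tok.isdigit() for tok in tokens):
            continue  # blank line or position-index row
        residues.extend(tokens)
    return "".join(THREE_TO_ONE[res] for res in residues)


# First two residue rows of SEQ ID No. 20, copied from the listing above.
SEQ20_HEAD = """\
Met Ser Lys Val Tyr Ser Thr Ala Asp Val Asn Ala His Gln Thr Arg
1 5 10 15
Gly Asp Leu Trp Val Val Ile His Asn Lys Val Tyr Asp Ile Thr Glu
20 25 30
"""

one_letter = parse_prt_block(SEQ20_HEAD)
print(one_letter, len(one_letter))
```

Applied to a complete entry, the length of the returned string can be compared with the value given in the `<211>` field of that entry.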

Claims (13)

1. A protein, which is the protein shown in (A1) or (A2) below:
(A1) a protein having the amino acid sequence shown in SEQ ID No. 1;
(A2) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in (A1).
2. A set of proteins consisting of protein A, protein B, protein C and protein D;
the protein A is the protein shown in (A1) or (A2) below:
(A1) a protein having the amino acid sequence shown in SEQ ID No. 1;
(A2) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in (A1);
the protein B is the protein shown in (B1) or (B2) below:
(B1) a protein having the amino acid sequence shown in SEQ ID No. 19;
(B2) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in (B1);
the protein C is the protein shown in (C1) or (C2) below:
(C1) a protein having the amino acid sequence shown in SEQ ID No. 20;
(C2) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in (C1);
the protein D is the protein shown in (D1) or (D2) below:
(D1) a protein having the amino acid sequence shown in SEQ ID No. 21;
(D2) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in (D1).
3. A nucleic acid molecule encoding the protein of claim 1.
4. The nucleic acid molecule of claim 3, wherein: the nucleic acid molecule is a DNA molecule having the nucleotide sequence shown at positions 887-5653 of SEQ ID No. 2.
5. A set of nucleic acid molecules consisting of a nucleic acid molecule A, a nucleic acid molecule B, a nucleic acid molecule C and a nucleic acid molecule D;
the nucleic acid molecule A is a nucleic acid molecule encoding the protein A of claim 2;
the nucleic acid molecule B is a nucleic acid molecule encoding the protein B of claim 2;
the nucleic acid molecule C is a nucleic acid molecule encoding the protein C according to claim 2;
the nucleic acid molecule D is a nucleic acid molecule encoding the protein D according to claim 2.
6. The set of nucleic acid molecules according to claim 5, wherein: the nucleic acid molecule A is a DNA molecule having the nucleotide sequence shown at positions 887-5653 of SEQ ID No. 2;
the nucleic acid molecule B is a DNA molecule having the nucleotide sequence shown at positions 813-2864 of SEQ ID No. 16;
the nucleic acid molecule C is a DNA molecule having the nucleotide sequence shown at positions 887-1276 of SEQ ID No. 17;
the nucleic acid molecule D is a DNA molecule having the nucleotide sequence shown at positions 501-2084 of SEQ ID No. 18.
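The position ranges in claim 6 can be checked against the protein lengths given in the sequence listing (`<211>` of SEQ ID Nos. 19, 20 and 21). The sketch below is illustrative only and not part of the patent: it assumes the positions are 1-based and inclusive and that each range ends with a stop codon, which matches the printed sequences (`tag`, `tag` and `taa` respectively). The names `CODING_REGIONS`, `region_length`, `extract_region` and `is_consistent` are hypothetical. Molecule A is left out because the length of SEQ ID No. 1 is not shown in this part of the document.

```python
# Position ranges from claim 6 and protein lengths from the <211> fields
# of SEQ ID Nos. 19-21. Positions are assumed 1-based and inclusive.
CODING_REGIONS = {
    "B": {"seq_id": 16, "start": 813, "end": 2864, "protein_len": 683},
    "C": {"seq_id": 17, "start": 887, "end": 1276, "protein_len": 129},
    "D": {"seq_id": 18, "start": 501, "end": 2084, "protein_len": 527},
}


def region_length(start, end):
    """Number of nucleotides in a 1-based inclusive range."""
    return end - start + 1


def extract_region(sequence, start, end):
    """Slice a 1-based inclusive range out of a 0-based Python string."""
    return sequence[start - 1:end]


def is_consistent(region):
    """True if the range is a whole number of codons and codes for
    protein_len residues plus one terminal stop codon (an assumption)."""
    n = region_length(region["start"], region["end"])
    return n % 3 == 0 and n // 3 == region["protein_len"] + 1


for name, region in CODING_REGIONS.items():
    n = region_length(region["start"], region["end"])
    print(name, n, is_consistent(region))

# Toy sequence (hypothetical, not from the listing) to show the slicing rule.
toy = "ccATGAAATAGgg"
print(extract_region(toy, 3, 11))
```

Under these assumptions the three ranges are 2052, 390 and 1584 nucleotides long, that is 684, 130 and 528 codons, each one more than the corresponding protein length.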
7. Any one of the following biomaterials:
(c1) a recombinant vector comprising the nucleic acid molecule of claim 3 or 4;
(c2) an expression cassette comprising the nucleic acid molecule of claim 3 or 4;
(c3) a transgenic cell line comprising the nucleic acid molecule of claim 3 or 4;
(c4) a recombinant bacterium comprising the nucleic acid molecule according to claim 3 or 4;
(c5) a set of recombinant vectors consisting of a recombinant vector A, a recombinant vector B, a recombinant vector C and a recombinant vector D; the recombinant vector A is a recombinant vector containing the nucleic acid molecule A of claim 5 or 6; the recombinant vector B is a recombinant vector containing the nucleic acid molecule B of claim 5 or 6; the recombinant vector C is a recombinant vector containing the nucleic acid molecule C of claim 5 or 6; the recombinant vector D is a recombinant vector containing the nucleic acid molecule D of claim 5 or 6;
(c6) a set of expression cassettes consisting of an expression cassette A, an expression cassette B, an expression cassette C and an expression cassette D; the expression cassette A is an expression cassette containing the nucleic acid molecule A of claim 5 or 6; the expression cassette B is an expression cassette containing the nucleic acid molecule B of claim 5 or 6; the expression cassette C is an expression cassette containing the nucleic acid molecule C of claim 5 or 6; the expression cassette D is an expression cassette containing the nucleic acid molecule D of claim 5 or 6;
(c7) a set of transgenic cell lines consisting of a transgenic cell line A, a transgenic cell line B, a transgenic cell line C and a transgenic cell line D; the transgenic cell line A is a transgenic cell line containing the nucleic acid molecule A of claim 5 or 6; the transgenic cell line B is a transgenic cell line containing the nucleic acid molecule B of claim 5 or 6; the transgenic cell line C is a transgenic cell line containing the nucleic acid molecule C of claim 5 or 6; the transgenic cell line D is a transgenic cell line containing the nucleic acid molecule D of claim 5 or 6;
(c8) a set of recombinant bacteria consisting of a recombinant bacterium A, a recombinant bacterium B, a recombinant bacterium C and a recombinant bacterium D; the recombinant bacterium A is a recombinant bacterium containing the nucleic acid molecule A of claim 5 or 6; the recombinant bacterium B is a recombinant bacterium containing the nucleic acid molecule B of claim 5 or 6; the recombinant bacterium C is a recombinant bacterium containing the nucleic acid molecule C of claim 5 or 6; the recombinant bacterium D is a recombinant bacterium containing the nucleic acid molecule D of claim 5 or 6.
8. A method for constructing an engineered bacterium, which is method A or method B below:
method A: a method for constructing an engineered bacterium A, comprising the following steps: modifying yeast and introducing the nucleic acid molecule of claim 3 or 4 into the yeast to obtain a recombinant yeast expressing the protein of claim 1, namely the engineered bacterium A;
method B: a method for constructing an engineered bacterium B, comprising the following steps: modifying yeast and introducing the set of nucleic acid molecules of claim 5 or 6 into the yeast to obtain a recombinant yeast expressing the set of proteins of claim 2, namely the engineered bacterium B.
9. The method of claim 8, wherein: said nucleic acid molecule is introduced into said yeast in the form of said recombinant vector or said expression cassette of claim 7; the set of nucleic acid molecules is introduced into the yeast in the form of the set of recombinant vectors or the set of expression cassettes of claim 7.
10. The method of claim 8, wherein: the nucleic acid molecule or the set of nucleic acid molecules is integrated into the genome of the yeast at one or more of the following sites: Gal7 site, NDT80 site, Gal80 site, ADH1 site.
11. An engineered bacterium, which is the engineered bacterium A prepared by method A of any one of claims 8 to 10 or the engineered bacterium B prepared by method B of any one of claims 8 to 10.
12. Any one of the following applications:
(A) use of the protein of claim 1 as a steroid transporter; the steroid is 17 alpha-hydroxypregna-4-ene-3,20-dione-21-acetate;
(B) use of the protein of claim 1 or the set of proteins of claim 2 or the nucleic acid molecule of claim 3 or 4 or the set of nucleic acid molecules of claim 5 or 6 or the biological material of claim 7 or the engineered bacterium of claim 11 for transporting a steroid or for preparing a product capable of transporting a steroid; the steroid is 17 alpha-hydroxypregna-4-ene-3,20-dione-21-acetate;
(C) use of the set of proteins of claim 2 or the set of nucleic acid molecules of claim 5 or 6 or the set of recombinant vectors, the set of expression cassettes, the set of transgenic cell lines or the set of recombinant bacteria of claim 7 for preparing a product capable of improving the transport of 17 alpha-hydroxypregna-4-ene-3,20-dione-21-acetate in hydrocortisone synthesis;
(D) use of the set of proteins of claim 2 or the set of nucleic acid molecules of claim 5 or 6 or the set of recombinant vectors, the set of expression cassettes, the set of transgenic cell lines or the set of recombinant bacteria of claim 7 for improving the ability of a strain to synthesize hydrocortisone by improving the ability of the strain to transport the substrate 17 alpha-hydroxypregna-4-ene-3,20-dione-21-acetate.
13. A method for preparing hydrocortisone, which is a whole-cell catalysis method comprising the following steps: fermenting and culturing the engineered bacterium B of claim 11, collecting the cells, adding a substrate, and carrying out a catalytic reaction, wherein the reaction product contains hydrocortisone; the substrate is 17 alpha-hydroxypregna-4-ene-3,20-dione-21-acetate.
CN201910247616.1A 2019-03-29 2019-03-29 Curvularia lunata-derived steroid substance transport protein and coding gene and application thereof Active CN111748022B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910247616.1A CN111748022B (en) 2019-03-29 2019-03-29 Curvularia lunata-derived steroid substance transport protein and coding gene and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910247616.1A CN111748022B (en) 2019-03-29 2019-03-29 Curvularia lunata-derived steroid substance transport protein and coding gene and application thereof

Publications (2)

Publication Number Publication Date
CN111748022A CN111748022A (en) 2020-10-09
CN111748022B true CN111748022B (en) 2022-05-31

Family

ID=72671232

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910247616.1A Active CN111748022B (en) 2019-03-29 2019-03-29 Curvularia lunata-derived steroid substance transport protein and coding gene and application thereof

Country Status (1)

Country Link
CN (1) CN111748022B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4546078A (en) * 1982-11-09 1985-10-08 Schering Aktiengesellschaft Polymer containing biocatalyst
CN1594590A (en) * 2004-07-01 2005-03-16 天津科技大学 Method for increasing conversion of hydrocortisone
CN109097343A (en) * 2018-08-09 2018-12-28 中国科学院天津工业生物技术研究所 Steroid 11 beta-hydroxylase in Curvularia lunata and its encoding gene and application

Also Published As

Publication number Publication date
CN111748022A (en) 2020-10-09

Similar Documents

Publication Publication Date Title
CN109097343B (en) Steroid 11 beta-hydroxylase in curvularia lunata as well as coding gene and application thereof
CN110607247B (en) Method for improving capacity of saccharomyces cerevisiae in synthesizing squalene
CN112852650B (en) Saccharomyces cerevisiae engineering bacterium for high yield of santalene and santalol and construction method and application thereof
CN114133438B (en) Purple sweet potato anthocyanin synthesis regulation factor IbEIN3-2 and application thereof
CN107034150B (en) Recombinant yarrowia lipolytica strain and construction method and application thereof
WO2020029564A1 (en) Tripterygium wilfordii triterpene synthase twosc1, coding gene therefor, and application thereof
JPH0659218B2 (en) DNA transfer vector
CN116286900B (en) Acetic acid permease A gene RkAcpa and application thereof
CN110747178B (en) Application of tripterygium wilfordii cytochrome p450 oxidase in preparation of abietane-type diterpene compound
CN114106130B (en) Purple sweet potato anthocyanin synthesis regulation factor IbJOX4 and application thereof
CN110791468A (en) Construction method and application of Mycobacterium genetic engineering bacteria
CN112094797B (en) Genetically engineered bacterium and application thereof in preparation of 9 alpha, 22-dihydroxy-23, 24-bis-cholesta-4-en-3-one
CN112063646B (en) Method for integrating multiple copies of target gene, recombinant bacterium and preparation method of recombinant human serum albumin
CN110982720A (en) Recombinant Yarrowia lipolytica producing dammarenediol and protopanaxadiol and use thereof
CN109097342B (en) Steroid 11 beta-hydroxylase in Absidia coerulea, coding gene and application thereof
CN116987603A (en) Recombinant Saccharomyces cerevisiae strain for high yield of cannabigerolic acid as well as construction method and application thereof
CN111334522B (en) Recombinant Saccharomyces cerevisiae for producing ambergris alcohol and construction method
CN111748022B (en) Curvularia lunata-derived steroid substance transport protein and coding gene and application thereof
CN113249241B (en) Construction and application of Saccharomyces cerevisiae protease deletion strain
CN114717173B (en) Genetically engineered strain for producing sterol side chain incomplete degradation product, construction method and application thereof
CN112708602B (en) Dioscorea zingiberensis-derived diosgenin synthesis related protein, coding gene and application
CN112359092A (en) Construction method of genome short fragment library
CN114316005B (en) Upstream regulation factor IbWRKY1 and application thereof in regulation of IbMYB1 expression of purple sweet potato
CN115747238A (en) Aldolase gene salA and application thereof in construction of high-yield ADD genetic engineering bacteria
CN115786360A (en) Gene with herbicide metabolism detoxification function and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant