CN112708602B - Dioscorea zingiberensis-derived diosgenin synthesis related protein, coding gene and application - Google Patents

Dioscorea zingiberensis-derived diosgenin synthesis related protein, coding gene and application

Info

Publication number
CN112708602B
Authority
CN
China
Prior art keywords
protein
nucleic acid
acid molecule
leu
recombinant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911021712.0A
Other languages
Chinese (zh)
Other versions
CN112708602A (en
Inventor
张学礼
陈晶
程健
江会锋
樊飞宇
戴住波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Institute of Industrial Biotechnology of CAS
Original Assignee
Tianjin Institute of Industrial Biotechnology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Institute of Industrial Biotechnology of CAS
Priority to CN201911021712.0A
Publication of CN112708602A
Application granted
Publication of CN112708602B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C12N9/0006 Oxidoreductases (1.) acting on CH-OH groups as donors (1.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/52 Genes encoding for enzymes or proenzymes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80 Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81 Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C12N9/001 Oxidoreductases (1.) acting on the CH-CH group of donors (1.3)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C12N9/0012 Oxidoreductases (1.) acting on nitrogen containing compounds as donors (1.4, 1.5, 1.6, 1.7)
    • C12N9/0036 Oxidoreductases (1.) acting on nitrogen containing compounds as donors (1.4, 1.5, 1.6, 1.7) acting on NADH or NADPH (1.6)
    • C12N9/0038 Oxidoreductases (1.) acting on nitrogen containing compounds as donors (1.4, 1.5, 1.6, 1.7) acting on NADH or NADPH (1.6) with a heme protein as acceptor (1.6.2)
    • C12N9/0042 NADPH-cytochrome P450 reductase (1.6.2.4)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P33/00 Preparation of steroids
    • C12P33/20 Preparation of steroids containing heterocyclic rings
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y101/00 Oxidoreductases acting on the CH-OH group of donors (1.1)
    • C12Y101/03 Oxidoreductases acting on the CH-OH group of donors (1.1) with oxygen as acceptor (1.1.3)
    • C12Y101/03006 Cholesterol oxidase (1.1.3.6)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y103/00 Oxidoreductases acting on the CH-CH group of donors (1.3)
    • C12Y103/01 Oxidoreductases acting on the CH-CH group of donors (1.3) with NAD+ or NADP+ as acceptor (1.3.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y103/00 Oxidoreductases acting on the CH-CH group of donors (1.3)
    • C12Y103/01 Oxidoreductases acting on the CH-CH group of donors (1.3) with NAD+ or NADP+ as acceptor (1.3.1)
    • C12Y103/01072 DELTA24-sterol reductase (1.3.1.72)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y106/00 Oxidoreductases acting on NADH or NADPH (1.6)
    • C12Y106/02 Oxidoreductases acting on NADH or NADPH (1.6) with a heme protein as acceptor (1.6.2)
    • C12Y106/02004 NADPH-hemoprotein reductase (1.6.2.4), i.e. NADP-cytochrome P450-reductase

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Medicinal Chemistry (AREA)
  • Mycology (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Plant Pathology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)

Abstract

The invention discloses diosgenin synthesis-related proteins derived from Dioscorea zingiberensis, their coding genes, and applications thereof. The diosgenin synthesis-related proteins provided by the invention are the two proteins shown in SEQ ID No. 9 and SEQ ID No. 7. Introducing the Dioscorea zingiberensis-derived diosgenin synthesis-related genes provided by the invention, together with a gene encoding cytochrome P450 reductase, into Saccharomyces cerevisiae capable of synthesizing cholesterol yields an engineered Saccharomyces cerevisiae strain capable of synthesizing diosgenin. The invention thereby realizes the biosynthesis of diosgenin in Saccharomyces cerevisiae.

Description

Dioscorea zingiberensis-derived diosgenin synthesis related protein, coding gene and application
Technical Field
The invention relates to the field of biotechnology, and in particular to a Dioscorea zingiberensis-derived diosgenin synthesis-related protein, its coding gene, and applications thereof.
Background
Dioscorea plants are rich in starch or steroidal saponins and have been widely used in the food and pharmaceutical industries worldwide (Piperno, D.R., Ranere, A.J., Holst, I., Hansell, P., 2000. Starch grains reveal early root crop horticulture in the Panamanian tropical forest. Nature 407, 894-7.). Among them, peltate yam (Dioscorea zingiberensis) is an important Chinese medicinal plant that is widely used in the treatment of rheumatoid arthritis, anthrax, cough, heart disease, etc. (Jesus, M., Martins, A.P., Gallardo, E., Silvestre, S., 2016. Diosgenin: Recent Highlights on Pharmacology and Analytical Methodology. J. Anal. Methods Chem. 2016, 4156293.). Diosgenin is a steroidal sapogenin that accumulates in the rhizome of Dioscorea zingiberensis and is an important precursor for synthesizing various steroid hormone drugs, including antioxidants, anti-inflammatory drugs, sex hormones, steroids, cortisones, contraceptives, birth control compounds, anabolic drugs, and the like (Willaman, J.J., Fenske, C.S., Correll, D.S., 1953. Occurrence of alkaloids in Dioscorea. Science 118, 329-30.).
At present, diosgenin in China is produced mainly by direct extraction from Dioscoreaceae plants. The source plant, Dioscorea zingiberensis, is grown mainly in the middle and lower reaches of the Yangtze River and in the South-to-North Water Diversion region of China. However, because cultivation is unstable and slow (more than 2 years per cycle) and the extraction process is complicated (Bai, Y., Zhang, L., Jin, W., Wei, M., Zhou, P., Zheng, G., Niu, L., Nie, L., Zhang, Y., Wang, H., Yu, L., 2015. In situ high-valued utilization and transformation of sugars from Dioscorea zingiberensis C.H. Wright for clean production of diosgenin. Bioresour. Technol. 196, 642-7.), the supply and price of diosgenin fluctuate, which in turn affects the steroid hormone industry. Steroid hormones are currently the second largest class of drugs after antibiotics, and China is a major exporter of steroid hormone bulk drugs and intermediates. A stable and efficient supply of diosgenin and related substitute raw materials is the basis for the sustained and healthy development of the steroid hormone industry in China. Shifting the upstream raw materials of steroid drugs from plant extraction to microbial synthesis is therefore a revolutionary change that will drive further rapid development of China's steroid hormone industry. Combining biotransformation with chemical synthesis to produce steroid hormones replaces highly polluting, costly plant raw materials and offers remarkable economic and social benefits.
In plants, diosgenin is biosynthesized from the precursor cholesterol in ten steps (see fig. 1): at least three oxidation reactions at C-22, C-26 and C-16 of cholesterol, followed by two cyclization reactions, result in the synthesis of diosgenin (Sonawane, P.D., et al., 2016). The synthetic pathway of diosgenin has been successfully identified in Paris polyphylla and fenugreek (Trigonella foenum-graecum). That work shows that diosgenin synthesis in these two plants requires two key P450 proteins (PpCYP90G4/TfCYP90B50 and PpCYP94D108/TfCYP82J17): the former catalyzes the double oxidation of cholesterol at C-16 and C-22, and the latter carries out the hydroxylation at C-26, which together complete the biosynthesis from cholesterol to diosgenin. Dioscorea zingiberensis is the main source plant of diosgenin in China, yet its diosgenin synthesis pathway has not been resolved so far and the key P450 proteins involved have not been identified. Identifying the P450 genes related to diosgenin synthesis in Dioscorea zingiberensis is therefore of great significance for the biosynthesis of diosgenin in Saccharomyces cerevisiae.
Disclosure of Invention
The invention aims to provide a dioscorea zingiberensis-derived diosgenin synthesis related protein, a coding gene and application.
In a first aspect, the invention claims a protein or a protein set.
The protein claimed in the present invention is protein A or protein B.
The protein set claimed by the invention is protein set A or protein set B or protein set C.
The protein set A is composed of the protein A and the protein B.
The protein set B consists of the protein A, the protein B and the protein C.
The protein set C is composed of the protein A, the protein B, the protein C, the protein D and the protein E.
The protein A (a cholesterol 16-position and 22-position dual oxidase derived from Dioscorea zingiberensis, named DGCYP033) is a protein shown in any one of the following (A1) to (A4):
(A1) a protein with the amino acid sequence shown in SEQ ID No. 9;
(A2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence defined in (A1) and having the same function;
(A3) a protein having a homology of 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more with the amino acid sequence defined in (A1) or (A2) and having the same function;
(A4) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in any one of (A1) to (A3).
The protein B (a cholesterol 26-position oxidase derived from Dioscorea zingiberensis, named DGCYP029) is a protein shown in any one of the following (B1) to (B4):
(B1) a protein with the amino acid sequence shown in SEQ ID No. 7;
(B2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence defined in (B1) and having the same function;
(B3) a protein having a homology of 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more with the amino acid sequence defined in (B1) or (B2) and having the same function;
(B4) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in any one of (B1) to (B3).
The protein C (grape-derived cytochrome P450 reductase, designated VvCPR) is a protein represented by any one of the following (C1) to (C4):
(C1) a protein with the amino acid sequence shown in SEQ ID No. 11;
(C2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence defined in (C1) and having the same function;
(C3) a protein having a homology of 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more with the amino acid sequence defined in (C1) or (C2) and having the same function;
(C4) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in any one of (C1) to (C3).
The protein D (a sterol 7-position reductase derived from zebrafish, named DrDHCR7) is a protein shown in any one of the following (D1) to (D4):
(D1) a protein with the amino acid sequence shown in SEQ ID No. 1;
(D2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence defined in (D1) and having the same function;
(D3) a protein having a homology of 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more with the amino acid sequence defined in (D1) or (D2) and having the same function;
(D4) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in any one of (D1) to (D3).
The protein E (a sterol 24-position reductase derived from zebrafish, named DrDHCR24) is a protein shown in any one of the following (E1) to (E4):
(E1) a protein with the amino acid sequence shown in SEQ ID No. 2;
(E2) a protein obtained by substituting and/or deleting and/or adding one or more amino acid residues to the amino acid sequence defined in (E1) and having the same function;
(E3) a protein having a homology of 99% or more, 95% or more, 90% or more, 85% or more, or 80% or more with the amino acid sequence defined in (E1) or (E2) and having the same function;
(E4) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in any one of (E1) to (E3).
In the above protein, the tag is a polypeptide or protein that is expressed by fusion with a target protein using in vitro recombinant DNA technology, so as to facilitate expression, detection, tracking and/or purification of the target protein. The protein tag may be a His tag, a Flag tag, an MBP tag, an HA tag, a myc tag, a GST tag, and/or a SUMO tag, among others.
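The tiered homology thresholds used in items (A3) to (E3) can be illustrated with a minimal Python sketch. It assumes that "homology" is measured as percent identity over the columns of a pre-computed pairwise alignment, which the text does not specify; the function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
# Minimal sketch: percent identity between two pre-aligned sequences and the
# highest homology tier (99/95/90/85/80 %) that the pair reaches.
# Assumption: identity is counted over alignment columns, gaps written as "-".

THRESHOLDS = (99, 95, 90, 85, 80)  # tiers named in items (A3) to (E3)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over all columns that are not gap-vs-gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # a column with two gaps carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def highest_tier(identity: float):
    """Return the largest threshold met by `identity`, or None if below 80 %."""
    for tier in THRESHOLDS:
        if identity >= tier:
            return tier
    return None


if __name__ == "__main__":
    # Toy peptides for illustration only.
    identity = percent_identity("MLLAV", "MLLGV")
    print(identity, highest_tier(identity))
```

A real comparison would first align the candidate against SEQ ID No. 9, 7, 11, 1 or 2 with an alignment tool and, as the claims require, would also have to show that the variant retains the same function.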
In a second aspect, the invention claims a nucleic acid molecule or a set of nucleic acid molecules.
The nucleic acid molecules claimed in the present invention are nucleic acid molecules encoding the proteins described hereinbefore.
The nucleic acid molecule is a nucleic acid molecule A or a nucleic acid molecule B.
The set of nucleic acid molecules claimed by the invention is set of nucleic acid molecules A or set of nucleic acid molecules B or set of nucleic acid molecules C.
The set of nucleic acid molecules A consists of the nucleic acid molecule A and the nucleic acid molecule B.
The nucleic acid molecule set B consists of the nucleic acid molecule A, the nucleic acid molecule B and the nucleic acid molecule C.
The set of nucleic acid molecules C consists of the nucleic acid molecule A, the nucleic acid molecule B, the nucleic acid molecule C, the nucleic acid molecule D and the nucleic acid molecule E.
The nucleic acid molecule A is a nucleic acid molecule encoding the protein A described above.
The nucleic acid molecule B is a nucleic acid molecule encoding the protein B described above.
The nucleic acid molecule C is a nucleic acid molecule encoding the protein C described above.
The nucleic acid molecule D is a nucleic acid molecule encoding the protein D as described above.
The nucleic acid molecule E is a nucleic acid molecule which codes for the protein E described above.
Further, the nucleic acid molecule A encodes DGCYP033 and is a DNA molecule shown in any one of the following (a1) to (a3):
(a1) a DNA molecule with the nucleotide sequence shown at positions 11 to 1477 of SEQ ID No. 10;
(a2) a DNA molecule which hybridizes under stringent conditions to the DNA molecule defined in (a1) and which encodes a protein represented by any one of (A1) to (A4) as described hereinbefore;
(a3) a DNA molecule which has 99% or more, 95% or more, 90% or more, 85% or more or 80% or more homology to the DNA sequence defined in (a1) or (a2) and which encodes a protein represented by any one of (A1) to (A4) described above.
The nucleic acid molecule B encodes DGCYP029 and is a DNA molecule shown in any one of the following (b1) to (b3):
(b1) a DNA molecule with the nucleotide sequence shown at positions 11 to 1501 of SEQ ID No. 8;
(b2) a DNA molecule which hybridizes under stringent conditions with the DNA molecule defined in (b1) and which encodes the protein B as described hereinbefore;
(b3) a DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more or 80% or more homology to the DNA sequence defined in (b1) or (b2) and encoding the protein B as described above.
The nucleic acid molecule C encodes VvCPR and is a DNA molecule shown in any one of the following (c1) to (c3):
(c1) a DNA molecule with the nucleotide sequence shown at positions 11 to 2125 of SEQ ID No. 12;
(c2) a DNA molecule which hybridizes under stringent conditions with the DNA molecule defined in (c1) and which encodes the protein C as described hereinbefore;
(c3) a DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more or 80% or more homology to the DNA sequence defined in (c1) or (c2) and encoding the protein C as described above.
The nucleic acid molecule D encodes DrDHCR7 and is a DNA molecule shown in any one of the following (d1) to (d3):
(d1) a DNA molecule with the nucleotide sequence shown at positions 813 to 2249 of SEQ ID No. 3;
(d2) a DNA molecule which hybridizes under stringent conditions with the DNA molecule defined in (d1) and which encodes the protein D as described hereinbefore;
(d3) a DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more or 80% or more homology to the DNA sequence defined in (d1) or (d2) and encoding the protein D as described above.
The nucleic acid molecule E encodes DrDHCR24 and is a DNA molecule shown in any one of the following (e1) to (e3):
(e1) a DNA molecule with the nucleotide sequence shown at positions 493 to 2043 of SEQ ID No. 4;
(e2) a DNA molecule which hybridizes under stringent conditions with the DNA molecule defined in (e1) and which encodes the protein E as described hereinbefore;
(e3) a DNA molecule having 99% or more, 95% or more, 90% or more, 85% or more or 80% or more homology to the DNA sequence defined in (e1) or (e2) and encoding the protein E as described above.
The stringent conditions may be as follows: hybridization at 50 ℃ in a mixed solution of 7% sodium dodecyl sulfate (SDS), 0.5 M Na3PO4 and 1 mM EDTA, followed by rinsing in 2×SSC, 0.1% SDS at 50 ℃; or hybridization at 50 ℃ in a mixed solution of 7% SDS, 0.5 M Na3PO4 and 1 mM EDTA, followed by rinsing in 1×SSC, 0.1% SDS at 50 ℃; or hybridization at 50 ℃ in a mixed solution of 7% SDS, 0.5 M Na3PO4 and 1 mM EDTA, followed by rinsing in 0.5×SSC, 0.1% SDS at 50 ℃; or hybridization at 50 ℃ in a mixed solution of 7% SDS, 0.5 M Na3PO4 and 1 mM EDTA, followed by rinsing in 0.1×SSC, 0.1% SDS at 50 ℃; or hybridization at 50 ℃ in a mixed solution of 7% SDS, 0.5 M Na3PO4 and 1 mM EDTA, followed by rinsing in 0.1×SSC, 0.1% SDS at 65 ℃; or hybridization at 65 ℃ in a solution of 6×SSC, 0.5% SDS, followed by one wash each in 2×SSC, 0.1% SDS and 1×SSC, 0.1% SDS.
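As a quick consistency check on the coding regions named in items (a1) to (e1), the following Python sketch reads each pair of positions as a 1-based inclusive range (an interpretation of the text, not a statement from it) and computes the length of each region and the number of codons it spans. The dictionary and function names are hypothetical.

```python
# Minimal sketch: lengths of the coding regions cited in (a1) to (e1),
# reading the positions as 1-based inclusive coordinates (an assumption).

CODING_REGIONS = {
    # gene name: (sequence listing entry, first position, last position)
    "DGCYP033": ("SEQ ID No. 10", 11, 1477),
    "DGCYP029": ("SEQ ID No. 8", 11, 1501),
    "VvCPR": ("SEQ ID No. 12", 11, 2125),
    "DrDHCR7": ("SEQ ID No. 3", 813, 2249),
    "DrDHCR24": ("SEQ ID No. 4", 493, 2043),
}


def region_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based inclusive range."""
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based inclusive range out of a Python (0-based) string."""
    return sequence[start - 1:end]


if __name__ == "__main__":
    for gene, (seq_id, start, end) in CODING_REGIONS.items():
        length = region_length(start, end)
        # A complete open reading frame has a length divisible by three.
        print(gene, seq_id, length, "nt,", length // 3, "codons,",
              "in frame" if length % 3 == 0 else "NOT in frame")
```

Under this reading, all five regions have lengths that are multiples of three, which is what one would expect of complete open reading frames.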
In a third aspect, the invention claims any of the following biomaterials:
(c1) A recombinant vector comprising a nucleic acid molecule as described above.
(c2) An expression cassette comprising a nucleic acid molecule as described above.
(c3) A transgenic cell line comprising a nucleic acid molecule as described above.
(c4) A recombinant bacterium comprising a nucleic acid molecule as described above.
(c5) A complete set of recombinant vectors, which is the complete set of vector A, the complete set of vector B or the complete set of vector C.
The complete set of vector A consists of a recombinant vector A and a recombinant vector B.
The complete set of vector B consists of the recombinant vector A, the recombinant vector B and the recombinant vector C.
The complete set of vector C consists of the recombinant vector A, the recombinant vector B, the recombinant vector C, the recombinant vector D and the recombinant vector E.
The recombinant vector A is a recombinant vector containing the nucleic acid molecule A; the recombinant vector B is a recombinant vector containing the nucleic acid molecule B; the recombinant vector C is a recombinant vector containing the nucleic acid molecule C; the recombinant vector D is a recombinant vector containing the nucleic acid molecule D; the recombinant vector E is a recombinant vector containing the nucleic acid molecule E described above.
(c6) A complete set of expression cassettes, which is the complete set of expression cassette A, the complete set of expression cassette B or the complete set of expression cassette C.
The complete set of expression cassette A consists of an expression cassette A and an expression cassette B.
The complete set of expression cassette B consists of the expression cassette A, the expression cassette B and the expression cassette C.
The complete set of expression cassette C consists of the expression cassette A, the expression cassette B, the expression cassette C, the expression cassette D and the expression cassette E.
The expression cassette A is an expression cassette comprising the nucleic acid molecule A as described above; the expression cassette B is an expression cassette comprising the nucleic acid molecule B as described above; the expression cassette C is an expression cassette comprising a nucleic acid molecule C as described hereinbefore; the expression cassette D is an expression cassette comprising a nucleic acid molecule D as described hereinbefore; the expression cassette E is an expression cassette which contains the nucleic acid molecule E described above.
(c7) A complete set of transgenic cell lines, which is the complete set of transgenic cell line A, the complete set of transgenic cell line B or the complete set of transgenic cell line C.
The complete set of transgenic cell line A consists of a transgenic cell line A and a transgenic cell line B.
The complete set of transgenic cell line B consists of the transgenic cell line A, the transgenic cell line B and the transgenic cell line C.
The complete set of transgenic cell line C consists of the transgenic cell line A, the transgenic cell line B, the transgenic cell line C, the transgenic cell line D and the transgenic cell line E.
The transgenic cell line A is a transgenic cell line containing the nucleic acid molecule A as described above; the transgenic cell line B is a transgenic cell line containing the nucleic acid molecule B as described above; the transgenic cell line C is a transgenic cell line comprising a nucleic acid molecule C as described above; the transgenic cell line D is a transgenic cell line containing the nucleic acid molecule D; the transgenic cell line E is a transgenic cell line which contains the nucleic acid molecule E described above.
(c8) A complete set of recombinant bacteria, which is the complete set of recombinant bacterium A, the complete set of recombinant bacterium B or the complete set of recombinant bacterium C.
The complete set of recombinant bacterium A consists of a recombinant bacterium A and a recombinant bacterium B.
The complete set of recombinant bacterium B consists of the recombinant bacterium A, the recombinant bacterium B and the recombinant bacterium C.
The complete set of recombinant bacterium C consists of the recombinant bacterium A, the recombinant bacterium B, the recombinant bacterium C, the recombinant bacterium D and the recombinant bacterium E.
The recombinant bacterium A is a recombinant bacterium containing the nucleic acid molecule A; the recombinant bacterium B is a recombinant bacterium containing the nucleic acid molecule B; the recombinant bacterium C is a recombinant bacterium containing the nucleic acid molecule C; the recombinant bacterium D is a recombinant bacterium containing the nucleic acid molecule D; the recombinant bacterium E is a recombinant bacterium containing the nucleic acid molecule E.
In a fourth aspect, the invention claims a method for constructing engineering bacteria.
The method claimed by the invention for constructing an engineered yeast strain that synthesizes diosgenin may comprise the following step: modifying a yeast capable of synthesizing cholesterol so that it expresses the protein A, the protein B and a cytochrome P450 reductase; the modified yeast is the target engineering bacterium.
Further, the yeast capable of synthesizing cholesterol may be prepared according to a method comprising the steps of: modifying the starting yeast to express sterol 7-position reductase and sterol 24-position reductase, wherein the modified yeast is the yeast capable of synthesizing cholesterol.
At the gene level, the method may comprise the following step: introducing the nucleic acid molecule A, the nucleic acid molecule B and the gene encoding the cytochrome P450 reductase into the yeast capable of synthesizing cholesterol to obtain a recombinant yeast expressing the protein A, the protein B and the cytochrome P450 reductase, which is the target engineering bacterium.
Further, the yeast capable of synthesizing cholesterol may be prepared according to a method comprising the following step: introducing the gene encoding the sterol 7-position reductase and the gene encoding the sterol 24-position reductase into the starting yeast to obtain a recombinant yeast expressing the sterol 7-position reductase and the sterol 24-position reductase, which is the yeast capable of synthesizing cholesterol.
Wherein the cytochrome P450 reductase may be protein C as described above; the sterol 7-position reductase may be protein D as described hereinbefore; the sterol 24-reductase can be protein E as described above.
At the gene level, the gene encoding the cytochrome P450 reductase may be the nucleic acid molecule C as described above; the gene encoding the sterol 7-position reductase may be the nucleic acid molecule D as described above; the gene encoding the sterol 24-position reductase may be the nucleic acid molecule E as described above.
Further, each of the nucleic acid molecules or encoding genes may be introduced into the corresponding recipient yeast in the form of a recombinant vector or an expression cassette.
In a particular embodiment of the invention, the nucleic acid molecule A, the nucleic acid molecule B and the gene encoding the cytochrome P450 reductase (the nucleic acid molecule C as described above) are integrated into the genome of the yeast capable of synthesizing cholesterol at the Gal80 site. The gene encoding the sterol 7-position reductase (the nucleic acid molecule D as described above) and the gene encoding the sterol 24-position reductase (the nucleic acid molecule E as described above) are integrated into the genome of the starting yeast at the Gal7 site.
Further, the yeast may be Saccharomyces cerevisiae or the like.
In a specific embodiment of the invention, the starting yeast is specifically Saccharomyces cerevisiae BY-T3.
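The two-stage construction described above (first a cholesterol-producing yeast, then the diosgenin-producing strain) can be summarized as a small Python sketch. It only tracks which genes are integrated at which site; the class, the function names and the derived strain names are hypothetical illustrations, not part of the disclosure.

```python
# Minimal bookkeeping sketch of the two-stage strain construction.
# Gene names, integration sites and the starting strain come from the text;
# everything else (class, functions, derived strain names) is hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    name: str
    integrations: tuple = ()  # sequence of (site, genes) pairs, in order

    def genes(self) -> frozenset:
        """All heterologous genes carried by this strain."""
        return frozenset(g for _, gs in self.integrations for g in gs)


def integrate(strain: Strain, site: str, genes, new_name: str) -> Strain:
    """Return a new strain with `genes` added at the genomic `site`."""
    return Strain(new_name, strain.integrations + ((site, tuple(genes)),))


CHOLESTEROL_GENES = ("DrDHCR7", "DrDHCR24")          # proteins D and E
DIOSGENIN_GENES = ("DGCYP033", "DGCYP029", "VvCPR")  # proteins A, B and C


def make_cholesterol_yeast(start: Strain) -> Strain:
    """Stage 1: integrate the two sterol reductase genes at the Gal7 site."""
    return integrate(start, "Gal7", CHOLESTEROL_GENES,
                     start.name + "+cholesterol")


def make_diosgenin_strain(cholesterol_yeast: Strain) -> Strain:
    """Stage 2: integrate the two P450 genes and the CPR gene at Gal80."""
    missing = set(CHOLESTEROL_GENES) - cholesterol_yeast.genes()
    if missing:
        raise ValueError(f"recipient lacks cholesterol genes: {sorted(missing)}")
    return integrate(cholesterol_yeast, "Gal80", DIOSGENIN_GENES,
                     cholesterol_yeast.name + "+diosgenin")


if __name__ == "__main__":
    final = make_diosgenin_strain(make_cholesterol_yeast(Strain("BY-T3")))
    print(final.name, final.integrations)
```

The check in the second stage mirrors the requirement in the text that the recipient yeast must already be able to synthesize cholesterol before the P450 genes are introduced.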
In a specific embodiment of the present invention, the nucleic acid molecule D and the nucleic acid molecule E are introduced into the starting yeast in the form of expression cassettes. The expression cassettes are the expression cassette Ppgk-DrDHCR7-ADH1t and the expression cassette pGal1-DrDHCR24-CYC1t. The sequence of the expression cassette Ppgk-DrDHCR7-ADH1t is shown in SEQ ID No. 3; the sequence of the expression cassette pGal1-DrDHCR24-CYC1t is shown in SEQ ID No. 4. When the expression cassette Ppgk-DrDHCR7-ADH1t and the expression cassette pGal1-DrDHCR24-CYC1t are introduced into the starting yeast, a homologous arm marker fragment Gal7-URA3-up and a homologous arm marker fragment Gal7-URA3-down are also introduced, which enable integration at the Gal7 site of Saccharomyces cerevisiae. The sequence of the homologous arm marker fragment Gal7-URA3-up is shown in SEQ ID No. 5; the sequence of the homologous arm marker fragment Gal7-URA3-down is shown in SEQ ID No. 6.
In a specific embodiment of the present invention, the nucleic acid molecule a, the nucleic acid molecule B and the nucleic acid molecule C are introduced into the yeast capable of synthesizing cholesterol in the form of expression cassettes. The expression cassette is expression cassette pPgk-DGCYP029-ADH1t, expression cassette pTDH3-VvCPR-TPI1t and expression cassette pTEF-DGCYP033-CYC1 t.
The expression cassette pPgk-DGCYP029-ADH1t is a fragment obtained by PCR amplification with primer 1-M-pEASY-PGK1-F and primer 3G-1-M-ADHt-TDH3-R, using plasmid pM2-DGCYP029-sy-Sc as the template. The sequence of primer 1-M-pEASY-PGK1-F is shown in SEQ ID No.15; the sequence of primer 3G-1-M-ADHt-TDH3-R is shown in SEQ ID No.16. Plasmid pM2-DGCYP029-sy-Sc is a recombinant plasmid obtained by replacing the small fragment between the SexA1 and Asc1 cleavage sites of plasmid pM2 with the DNA fragment (DGCYP029-sy-Sc) shown at positions 11-1501 of SEQ ID No.8.
The expression cassette pTDH3-VvCPR-TPI1t is a fragment obtained by PCR amplification with primer 3G-3-M-ADHt-TDH3-F and primer 3G-3-M-TPI1t-TEF1-R, using plasmid pM4-VvCPR-sy-Sc as the template. The sequence of primer 3G-3-M-ADHt-TDH3-F is shown in SEQ ID No.17; the sequence of primer 3G-3-M-TPI1t-TEF1-R is shown in SEQ ID No.18. Plasmid pM4-VvCPR-sy-Sc is a recombinant plasmid obtained by replacing the small fragment between the SexA1 and Asc1 cleavage sites of plasmid pM4 with the DNA fragment (VvCPR-sy-Sc) shown at positions 11-2125 of SEQ ID No.12.
The expression cassette pTEF-DGCYP033-CYC1t is a fragment obtained by PCR amplification with primer 3G-2-M-TPI1t-TEF1-F and primer M-CYC1t-pEASY-R, using plasmid pM3-DGCYP033-sy-Sc as the template. The sequence of primer 3G-2-M-TPI1t-TEF1-F is shown in SEQ ID No.19; the sequence of primer M-CYC1t-pEASY-R is shown in SEQ ID No.20. Plasmid pM3-DGCYP033-sy-Sc is a recombinant plasmid obtained by replacing the small fragment between the SexA1 and Asc1 cleavage sites of plasmid pM3 with the DNA fragment (DGCYP033-sy-Sc) shown at positions 11-1477 of SEQ ID No.10.
When the expression cassette pPgk-DGCYP029-ADH1t, the expression cassette pTDH3-VvCPR-TPI1t and the expression cassette pTEF-DGCYP033-CYC1t are introduced into the yeast capable of synthesizing cholesterol, the homologous arm marker fragments gal80-Leu-up and gal80-Leu-down are introduced as well (to achieve integration at the Gal80 site of Saccharomyces cerevisiae); the sequence of gal80-Leu-up is shown in SEQ ID No.13, and the sequence of gal80-Leu-down is shown in SEQ ID No.14.
In a fifth aspect, the invention claims an engineered bacterium prepared by the method described in the fourth aspect.
In a specific embodiment of the invention, the engineered bacterium is engineered bacterium DG001, which is prepared according to the following steps:
(d1) introducing the expression cassette Ppgk-DrDHCR7-ADH1t (SEQ ID No.3), the expression cassette pGal1-DrDHCR24-CYC1t (SEQ ID No.4), the homologous arm marker fragment gal7-URA3-up (SEQ ID No.5) and the homologous arm marker fragment gal7-URA3-down (SEQ ID No.6) into Saccharomyces cerevisiae BY-T3 to obtain a recombinant strain, which is the yeast capable of synthesizing cholesterol;
(d2) introducing the expression cassette pPgk-DGCYP029-ADH1t, the expression cassette pTDH3-VvCPR-TPI1t, the expression cassette pTEF-DGCYP033-CYC1t, the homologous arm marker fragment gal80-Leu-up (SEQ ID No.13) and the homologous arm marker fragment gal80-Leu-down (SEQ ID No.14) into the yeast capable of synthesizing cholesterol to obtain a recombinant strain, which is the engineered bacterium DG001.
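The two-step strain lineage in (d1) and (d2) can be written down as a small data structure. This is only an illustrative sketch: the names `IntegrationStep`, `STEPS`, `lineage` and `cassettes_in` are hypothetical, the intermediate strain is labelled DG-Cho as in Example 1, and the marker labels (URA3, LEU2) are read from the names and descriptions of the homologous arm marker fragments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationStep:
    parent: str       # strain that receives the fragments
    child: str        # resulting recombinant strain
    locus: str        # genomic integration site
    marker: str       # selection marker carried by the upstream homology-arm fragment
    cassettes: tuple  # expression cassettes introduced in this step


# Steps (d1) and (d2) as described in the text (labels are illustrative).
STEPS = (
    IntegrationStep("BY-T3", "DG-Cho", "Gal7", "URA3",
                    ("Ppgk-DrDHCR7-ADH1t", "pGal1-DrDHCR24-CYC1t")),
    IntegrationStep("DG-Cho", "DG001", "Gal80", "LEU2",
                    ("pPgk-DGCYP029-ADH1t", "pTDH3-VvCPR-TPI1t",
                     "pTEF-DGCYP033-CYC1t")),
)


def lineage(strain, steps=STEPS):
    """Return the integration steps leading to `strain`, oldest first."""
    by_child = {s.child: s for s in steps}
    path = []
    while strain in by_child:
        step = by_child[strain]
        path.append(step)
        strain = step.parent
    return list(reversed(path))


def cassettes_in(strain):
    """List every expression cassette accumulated in `strain`."""
    return [c for step in lineage(strain) for c in step.cassettes]
```

Calling `cassettes_in("DG001")` lists the five cassettes carried by the final strain, and `lineage("DG001")` gives the Gal7 step followed by the Gal80 step.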
In a sixth aspect, the invention claims the use of the protein or the set of proteins or the nucleic acid molecule or the set of nucleic acid molecules or the biological material or the engineered bacteria in the preparation of diosgenin.
In a seventh aspect, the invention claims a method for preparing diosgenin.
The method for preparing diosgenin claimed by the invention may comprise the following steps: fermenting the engineered bacterium of the fifth aspect and collecting the fermentation product; the fermentation product contains diosgenin.
Further, the culture temperature is 30 ℃.
Further, the culture conditions are: 1) culture in selection medium with 2% glucose (the percentage indicates g/100 mL) as the carbon source at 30 ℃ and 250 rpm for 30 h; 2) transfer of the cells to selection medium with 2% galactose (the percentage indicates g/100 mL) as the carbon source and culture at 30 ℃ and 250 rpm for 90 h.
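Because the percentages are defined as g/100 mL, the amount of carbon source for a given culture volume follows directly. The sketch below is a minimal illustration; the function names are hypothetical, and the 15 mL volume in the usage note is the flask volume used in the examples, not a requirement of this passage.

```python
def grams_for_percent_wv(percent_wv: float, volume_ml: float) -> float:
    """Mass of solute in grams for a w/v percentage defined as g per 100 mL."""
    return percent_wv * volume_ml / 100.0


# Two-stage schedule from the text: (carbon source, % w/v, hours at 30 C and 250 rpm)
STAGES = [("glucose", 2.0, 30), ("galactose", 2.0, 90)]


def total_hours(stages=STAGES) -> int:
    """Total culture time across all stages, in hours."""
    return sum(hours for _, _, hours in stages)
```

For a 15 mL culture, `grams_for_percent_wv(2.0, 15)` gives 0.3 g of sugar per stage, and `total_hours()` gives 120 h for the two stages combined.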
Experiments show that an engineered Saccharomyces cerevisiae strain capable of synthesizing diosgenin can be obtained by introducing the diosgenin synthesis-related genes derived from Dioscorea zingiberensis provided by the invention, together with a cytochrome P450 reductase coding gene, into Saccharomyces cerevisiae capable of synthesizing cholesterol. The invention thereby achieves the biosynthesis of diosgenin in Saccharomyces cerevisiae.
Drawings
FIG. 1 shows the synthesis of diosgenin starting from cholesterol.
FIG. 2 shows the GC-MS identification of the fermentation products of strain DG-Cho.
FIG. 3 shows the GC-MS identification of the fermentation products of strain DG001.
Detailed Description
The experimental procedures used in the following examples are all conventional procedures unless otherwise specified.
Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
Plasmid pM2: a SexA1 cleavage site, the pPGK1 promoter, a green fluorescent protein (GFP) gene, the ADHt terminator and an Asc1 cleavage site were inserted in this order into the multiple cloning site of the pEASY-Blunt Simple vector (TransGen Biotech).
Plasmid pM3: a SexA1 cleavage site, the pTEF1 promoter, a green fluorescent protein (GFP) gene, the CYC1t terminator and an Asc1 cleavage site were inserted in this order into the multiple cloning site of the pEASY-Blunt Simple vector (TransGen Biotech).
Plasmid pM4: a SexA1 cleavage site, the pTDH3 promoter, a green fluorescent protein (GFP) gene, the TPI1t terminator and an Asc1 cleavage site were inserted in this order into the multiple cloning site of the pEASY-Blunt Simple vector (TransGen Biotech).
Plasmid pYES2.0: a product of Addgene.
Saccharomyces cerevisiae BY-T3: described in Dai, Z. et al. Identification of a novel cytochrome P450 enzyme that catalyzes the C-2α hydroxylation of pentacyclic triterpenoids and its application in yeast cell factories. Metabolic Engineering 51, 70-78 (2019). The public can obtain the strain from the applicant; it may be used only for repeating the experiments of the invention and not for any other purpose.
Example 1 Construction of the Saccharomyces cerevisiae cholesterol-synthesizing chassis strain DG-Cho
1. Construction of yeast multi-gene integration fragments
Sterol 7-reductase (DrDHCR7) and sterol 24-reductase (DrDHCR24) from zebrafish (Danio rerio) were selected and codon-optimized; the optimization and gene synthesis were carried out by Nanjing GenScript Biotech Co., Ltd., and the optimized genes were named DrDHCR7-sy-Sc and DrDHCR24-sy-Sc. The sequence of the DrDHCR7-sy-Sc gene is shown at positions 813-2249 of SEQ ID No.3 and encodes the protein shown in SEQ ID No.1; the sequence of the DrDHCR24-sy-Sc gene is shown at positions 493-2043 of SEQ ID No.4 and encodes the protein shown in SEQ ID No.2. The cloning vector plasmids pM2 and pYES2.0 and the gene fragments DrDHCR7-sy-Sc and DrDHCR24-sy-Sc were digested with SexA1 and Asc1 (Thermo), and the digestion products were recovered with a PCR product gel recovery kit from Sangon Biotech (Shanghai) Co., Ltd. for later use. 50 ng each of the digested vector pM2 and the digested DrDHCR7-sy-Sc gene fragment were added to a ligation system containing 2× Quick Ligation Buffer (NEB) and 0.5 μL of Quick Ligase (NEB, 400,000 cohesive end units/mL), with distilled water added to 10 μL, and the mixture was reacted at room temperature for 10 min to obtain the ligation product. The ligation product was transferred into Trans1-T1 competent cells, which were kept on ice for 30 min, heat-shocked at 42 ℃ for 30 s, and immediately placed on ice for 2 min. After 800 μL of LB medium was added and the cells were incubated at 250 rpm and 37 ℃ for 1 h, the bacterial suspension was spread on an LB plate containing ampicillin. After overnight culture, 5 positive single colonies were screened by PCR, the positive clones were grown in liquid culture, and the plasmids were extracted for sequencing verification. The sequencing results showed that the target fragment had been inserted into vector pM2, giving plasmid pM2-DrDHCR7-sy-Sc.
The plasmid pYes-DrDHCR24-sy-Sc was constructed using the pYES2.0 vector and the DrDHCR24-sy-Sc gene fragment prepared by the enzymatic cleavage in the same manner as described above.
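The coordinates quoted for the two optimized genes can be sanity-checked against the protein lengths given in the `<211>` fields of SEQ ID No.1 (478 residues) and SEQ ID No.2 (516 residues). The sketch below assumes that each quoted range is 1-based and inclusive and that it contains exactly one stop codon; the function name is hypothetical.

```python
def cds_matches_protein(start: int, end: int, protein_len: int) -> bool:
    """Check that a 1-based inclusive CDS range is a whole number of codons
    and encodes `protein_len` residues plus one stop codon (assumption)."""
    n_nt = end - start + 1
    return n_nt % 3 == 0 and n_nt // 3 == protein_len + 1


checks = {
    # gene: (start, end, protein length from the sequence listing)
    "DrDHCR7-sy-Sc": cds_matches_protein(813, 2249, 478),
    "DrDHCR24-sy-Sc": cds_matches_protein(493, 2043, 516),
}
```

Both entries of `checks` evaluate to `True`: 1437 nt corresponds to 479 codons and 1551 nt to 517 codons, i.e. the protein length plus a stop codon in each case.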
PCR amplification was performed using the constructed plasmid pM2-DrDHCR7-sy-Sc as the template and primers 1-M-pEASY-PGK1-F and 1-M-ADHt-Gal1-R (see Table 1). The amplification system was: 10 μL of TAKARA PrimeSTAR® HS DNA polymerase 5× Buffer, 4 μL of dNTP mix, 1 μL of each primer (see Table 1), 0.5 μL of plasmid pM2-DrDHCR7-sy-Sc template, 0.5 μL of PrimeSTAR HS polymerase (2.5 U/μL), and distilled water to a total volume of 50 μL. The amplification conditions were: pre-denaturation at 98 ℃ for 3 min (1 cycle); denaturation at 98 ℃ for 10 s, annealing at 56 ℃ for 15 s and extension at 72 ℃ for 2 min (30 cycles); final extension at 72 ℃ for 10 min (1 cycle). The resulting amplification product was designated Ppgk-DrDHCR7-ADH1t (SEQ ID No.3); the fragment contains the Pgk promoter (positions 63-812 of SEQ ID No.3), the zebrafish-derived DrDHCR7 gene (positions 813-2249 of SEQ ID No.3) and the ADH1 terminator (positions 2250-2407 of SEQ ID No.3). Using the constructed plasmid pYes-DrDHCR24-sy-Sc as the template and primers 2-M-ADHt-Gal1-F and M-CYC1t-pEASY-R (see Table 1), PCR was performed in the same way to obtain the pGal1-DrDHCR24-CYC1t fragment (SEQ ID No.4), which contains the Gal1 promoter (positions 51-492 of SEQ ID No.4), the zebrafish-derived DrDHCR24 gene (positions 493-2043 of SEQ ID No.4) and the CYC1 terminator (positions 2044-2243 of SEQ ID No.4). The amplified target fragments were gel-purified for later use.
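The reaction set-up and cycling programme above reduce to simple arithmetic. The following sketch computes the volume of water needed to reach 50 μL and the summed hold time of the programme; the dictionary keys and function names are illustrative, and the time estimate deliberately ignores temperature ramping, which depends on the instrument.

```python
# Component volumes in microlitres, as listed in the text.
COMPONENTS_UL = {
    "5x PrimeSTAR buffer": 10.0,
    "dNTP mix": 4.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "plasmid template": 0.5,
    "PrimeSTAR HS polymerase": 0.5,
}
TOTAL_UL = 50.0


def water_volume(components=COMPONENTS_UL, total=TOTAL_UL) -> float:
    """Volume of distilled water needed to bring the reaction to `total` uL."""
    used = sum(components.values())
    if used > total:
        raise ValueError("components exceed the total reaction volume")
    return total - used


def program_seconds(initial=180, cycles=30, steps=(10, 15, 120), final=600) -> int:
    """Sum of hold times: initial denaturation, cycled steps, final extension."""
    return initial + cycles * sum(steps) + final
```

With the values from the text, `water_volume()` is 33.0 μL and `program_seconds()` is 5130 s, or about 85.5 min of hold time.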
TABLE 1 amplification primers for DrDHCR7-sy-Sc and DrDHCR24-sy-Sc gene integration fragments
Figure BDA0002247419530000111
2. Construction of Saccharomyces cerevisiae Strain DG-Cho
The starting strain, Saccharomyces cerevisiae BY-T3, was inoculated into liquid screening medium (formulation: 0.8% SD-His (Beijing FunGenome Technology Co., Ltd.), 2% glucose; each percentage indicates g/100 mL) and cultured overnight. 1 mL of culture (OD600 of about 0.6-1.0) was dispensed into a 1.5 mL EP tube and centrifuged at 4 ℃ and 10000 g for 1 min; the supernatant was discarded, the pellet was washed with sterile water (4 ℃) and centrifuged under the same conditions, and the supernatant was discarded. The cells were resuspended in 1 mL of treatment solution (formulation: 10 mM LiAc; 10 mM DTT; 0.6 M sorbitol; 10 mM Tris-HCl (pH 7.5); the DTT is added immediately before use) and left at 25 ℃ for 20 min. After centrifugation, the supernatant was discarded, and the cells were resuspended in 1 mL of 1 M sorbitol (sterilized by filtration through a 0.22 μm aqueous-phase membrane) and centrifuged again, and the supernatant was discarded; this wash with 1 M sorbitol was carried out twice, leaving a final volume of about 90 μL. 1.5 μL each of the fragments Ppgk-DrDHCR7-ADH1t and pGal1-DrDHCR24-CYC1t obtained in step 1 and of the homologous arm marker fragments gal7-URA3-up (SEQ ID No.5; this homologous arm fragment comprises a 400 bp homologous region upstream of the Gal7 site, the URA3 marker gene and a 400 bp region homologous to the Pgk promoter) and gal7-URA3-down (SEQ ID No.6; this homologous arm fragment comprises a 200 bp region homologous to the CYC1 terminator and a 300 bp homologous region downstream of the Gal7 site) were added, so as to integrate the DrDHCR7-sy-Sc and DrDHCR24-sy-Sc gene fragments at the Gal7 site of Saccharomyces cerevisiae BY-T3. After mixing, the cells were transferred to an electroporation cuvette and electroporated at 2.7 kV for 5.7 ms; 1 mL of 1 M sorbitol was added, and the cells were plated on screening medium (formulation: 0.8% SD dropout powder, given in the source as "Trra-0.8% His"; 2% glucose; 0.01% Trp; 0.01% Leu; 1.5% agar; each percentage indicates g/100 mL). The screening culture conditions were: culture at 30 ℃ for 36 h or more.
PCR identified the correct positive clone, designated strain DG-Cho.
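The sequence listing suggests how neighbouring fragments can recombine with one another: the 3' end of SEQ ID No.3 repeats the 5' end of SEQ ID No.4. The sketch below measures that shared region with a plain suffix-prefix comparison. The two strings are the last 120 nt of SEQ ID No.3 and the first 120 nt of SEQ ID No.4 as printed in the listing; the function name is hypothetical, and the interpretation of the overlap as a recombination junction is an inference from the sequences rather than a statement of the text.

```python
def suffix_prefix_overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


# Positions 2341-2460 of SEQ ID No.3 (end of Ppgk-DrDHCR7-ADH1t).
SEQ3_TAIL = (
    "gtcaggttgc tttctcaggt atagcatgag gtcgctctta ttgaccacac ctctaccggc"
    "atgccgacgg attagaagcc gccgagcggg tgacagccct ccgaaggaag actctcctcc"
).replace(" ", "")

# Positions 1-120 of SEQ ID No.4 (start of pGal1-DrDHCR24-CYC1t).
SEQ4_HEAD = (
    "ggtatagcat gaggtcgctc ttattgacca cacctctacc ggcatgccga cggattagaa"
    "gccgccgagc gggtgacagc cctccgaagg aagactctcc tccgtgcgtc ctcgtcttca"
).replace(" ", "")

overlap = suffix_prefix_overlap(SEQ3_TAIL, SEQ4_HEAD)
```

For these excerpts `overlap` is 103 nt. This is consistent with the annotations given for the two fragments: the ADH1 terminator ends at position 2407 of SEQ ID No.3, and the Gal1 promoter begins at position 51 of SEQ ID No.4, so the shared region spans the terminator-promoter junction.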
Example 2 Production of cholesterol by fermentation of Saccharomyces cerevisiae strain DG-Cho
The Saccharomyces cerevisiae strain DG-Cho and the control strain BY-T3 were activated on solid selection medium (same as in step 2 of Example 1), and seed cultures were prepared in the corresponding liquid selection medium (same as in step 2 of Example 1; 30 ℃, 250 rpm, 12 h). Each seed culture was inoculated at 1% into a 100 mL triangular flask containing 15 mL of the corresponding liquid selection medium and cultured at 30 ℃ and 250 rpm for 30 h. The cells were then collected by centrifugation at 5000 rpm, resuspended in 15 mL of the corresponding liquid selection medium with 2% galactose (the percentage indicates g/100 mL) as the carbon source, and cultured at 30 ℃ and 250 rpm for a further 90 h to obtain the fermentation broth. Cell growth (OD600) and the product were then measured.
6 mL of fermentation broth was placed in a disruption tube and centrifuged at 13000 rpm for 1 min, and the supernatant was discarded; the pellet was washed with sterile water and centrifuged under the same conditions, and the supernatant was discarded. An appropriate amount of glass beads (0.5 mm in diameter) and 1 mL of extraction solvent (methanol:acetone = 1:1) were added; the cells were disrupted by shaking for 5 min and by sonication for 30 min and centrifuged at 13000 rpm for 2 min, and the supernatant was filtered through a 0.22 μm organic-phase membrane into a sample vial for GC-MS detection. GC-MS detection method: gas chromatography-mass spectrometry (GC-MS) was carried out on an Agilent Technologies 5975C inert XL MSD with Triple-Axis Detector, equipped with an HP-5ms column (30 m × 0.25 mm × 0.5 μm). GC-MS measurement conditions: injection port temperature 300 ℃, injection volume 1 μL, splitless injection, solvent delay 5 min. Chromatographic conditions: 240 ℃ held for 5 min, raised to 300 ℃ at 10 ℃/min, and held at 300 ℃ for 25 min, 36 min in total. MS conditions: selected ion monitoring (SIM) at m/z 69, 139, 282 and 414.
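The stated 36 min total can be reproduced from the oven programme. The sketch below is a minimal check for a single-ramp programme; the function and constant names are illustrative.

```python
def gc_runtime_min(start_c, initial_hold_min, ramp_c_per_min, final_c, final_hold_min):
    """Total run time of a single-ramp GC oven programme, in minutes."""
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# 240 C for 5 min, 10 C/min to 300 C, then 300 C for 25 min (values from the text).
runtime = gc_runtime_min(240, 5, 10, 300, 25)

# Ions monitored in SIM mode (m/z), as listed in the text.
SIM_IONS = (69, 139, 282, 414)
```

Here `runtime` comes to 36.0 min (5 min hold, 6 min ramp, 25 min hold), which matches the total given in the method.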
Detection results: GC-MS analysis of the fermentation product showed that strain DG-Cho, obtained by integrating the zebrafish-derived DrDHCR7 and DrDHCR24 into Saccharomyces cerevisiae BY-T3, produced cholesterol (FIG. 2). A Saccharomyces cerevisiae chassis strain, DG-Cho, capable of synthesizing cholesterol in vivo was thus successfully constructed.
Example 3 Construction of an engineered strain for de novo synthesis of diosgenin
1. Construction of yeast multi-gene integration fragments
The dioxygenase acting at positions 16 and 22 of cholesterol (DGCYP033) and the oxidase acting at position 26 of cholesterol (DGCYP029), both from Dioscorea zingiberensis (D. zingiberensis), and the cytochrome P450 reductase from grape (VvCPR) were selected and codon-optimized; the optimization and gene synthesis were entrusted to Nanjing GenScript Biotech Co., Ltd., and the optimized genes were named DGCYP029-sy-Sc, DGCYP033-sy-Sc and VvCPR-sy-Sc. The sequence of the DGCYP029-sy-Sc gene flanked by the recognition sequences of the SexA1 and Asc1 cleavage sites is shown in SEQ ID No.8, and positions 11-1501 of SEQ ID No.8 encode the protein shown in SEQ ID No.7; the sequence of the DGCYP033-sy-Sc gene flanked by the recognition sequences of the SexA1 and Asc1 cleavage sites is shown in SEQ ID No.10, and positions 11-1477 of SEQ ID No.10 encode the protein shown in SEQ ID No.9; the sequence of the VvCPR-sy-Sc gene flanked by the recognition sequences of the SexA1 and Asc1 cleavage sites is shown in SEQ ID No.12, and positions 11-2125 of SEQ ID No.12 encode the protein shown in SEQ ID No.11. The cloning vector plasmids pM2, pM3 and pM4 and the gene fragments DGCYP029-sy-Sc, DGCYP033-sy-Sc and VvCPR-sy-Sc flanked by the SexA1 and Asc1 recognition sequences were digested with SexA1 and Asc1 (Thermo), and the digestion products were recovered with a PCR product gel recovery kit from Sangon Biotech (Shanghai) Co., Ltd. for later use.
The digested vector pM2 was ligated with the DGCYP029-sy-Sc gene fragment, the digested vector pM3 with the DGCYP033-sy-Sc gene fragment, and the digested vector pM4 with the VvCPR-sy-Sc gene fragment (same method as in step 1 of Example 1), giving the plasmids pM2-DGCYP029-sy-Sc (a recombinant plasmid obtained by replacing the small fragment between the SexA1 and Asc1 cleavage sites of plasmid pM2 with the DNA fragment shown at positions 11-1501 of SEQ ID No.8), pM3-DGCYP033-sy-Sc (a recombinant plasmid obtained by replacing the small fragment between the SexA1 and Asc1 cleavage sites of plasmid pM3 with the DNA fragment shown at positions 11-1477 of SEQ ID No.10) and pM4-VvCPR-sy-Sc (a recombinant plasmid obtained by replacing the small fragment between the SexA1 and Asc1 cleavage sites of plasmid pM4 with the DNA fragment shown at positions 11-2125 of SEQ ID No.12).
PCR amplification was performed with plasmid pM2-DGCYP029-sy-Sc as the template and primers 1-M-pEASY-PGK1-F and 3G-1-M-ADHt-TDH3-R; with plasmid pM4-VvCPR-sy-Sc as the template and primers 3G-3-M-ADHt-TDH3-F and 3G-3-M-TPI1t-TEF1-R; and with plasmid pM3-DGCYP033-sy-Sc as the template and primers 3G-2-M-TPI1t-TEF1-F and M-CYC1t-pEASY-R, giving the fragments pPgk-DGCYP029-ADH1t, pTDH3-VvCPR-TPI1t and pTEF-DGCYP033-CYC1t, respectively. The amplified fragments were purified and recovered for later use (PCR method as in Example 1; for the primers see Table 2).
TABLE 2 amplification primers for the DGCYP029-sy-Sc, DGCYP033-sy-Sc and VvCPR-sy-Sc gene integration fragments
Figure BDA0002247419530000141
2. Construction of Saccharomyces cerevisiae strain DG001
The starting strain, Saccharomyces cerevisiae DG-Cho, was inoculated into liquid screening medium (formulation: 0.8% SD-His-Ura (Beijing FunGenome Technology Co., Ltd.), 2% glucose; each percentage indicates g/100 mL) and cultured overnight. 1 mL of culture (OD600 of about 0.6-1.0) was dispensed into a 1.5 mL EP tube and centrifuged at 4 ℃ and 10000 g for 1 min; the supernatant was discarded, the pellet was washed with sterile water (4 ℃) and centrifuged under the same conditions, and the supernatant was discarded. The cells were resuspended in 1 mL of treatment solution (formulation: 10 mM LiAc; 10 mM DTT; 0.6 M sorbitol; 10 mM Tris-HCl (pH 7.5); the DTT is added immediately before use) and left at 25 ℃ for 20 min. After centrifugation, the supernatant was discarded, and the cells were resuspended in 1 mL of 1 M sorbitol (sterilized by filtration through a 0.22 μm aqueous-phase membrane) and centrifuged again, and the supernatant was discarded; this wash with 1 M sorbitol was carried out twice, leaving a final volume of about 90 μL. The fragments pPgk-DGCYP029-ADH1t, pTDH3-VvCPR-TPI1t and pTEF-DGCYP033-CYC1t obtained in step 1 and the homologous arm marker fragments gal80-Leu-up (SEQ ID No.13; this homologous arm fragment comprises a 400 bp homologous region upstream of the Gal80 site, the LEU2 marker gene and a 400 bp region homologous to the Pgk promoter) and gal80-Leu-down (SEQ ID No.14; this homologous arm fragment comprises a 200 bp region homologous to the CYC1 terminator and a homologous region downstream of the Gal80 site) were added, so as to integrate the DGCYP029-sy-Sc, VvCPR-sy-Sc and DGCYP033-sy-Sc gene fragments at the Gal80 site of Saccharomyces cerevisiae DG-Cho. After mixing, the cells were transferred to an electroporation cuvette and electroporated at 2.7 kV for 5.7 ms; 1 mL of 1 M sorbitol was added, and the cells were plated on screening medium (formulation: SD dropout medium lacking Leu, 2% glucose, 0.01% Trp, 1.5% agar; each percentage indicates g/100 mL).
The conditions of the screening culture are as follows: culturing at 30 deg.C for 36 hr or more. The correct positive clone was identified by PCR and was designated strain DG 001.
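The five fragments introduced in this step are meant to join in a fixed order at the Gal80 site, each one sharing a homologous end with its neighbour. The sketch below models that chaining with junction labels. The labels are assumptions read from the fragment descriptions and primer names (for example, `ADHt/TDH3` from primers 3G-1-M-ADHt-TDH3-R and 3G-3-M-ADHt-TDH3-F), and the type and function names are hypothetical.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    name: str
    left: str   # junction label at the 5' end
    right: str  # junction label at the 3' end


# Deliberately listed out of order; junction labels are illustrative.
FRAGMENTS = [
    Fragment("pTEF-DGCYP033-CYC1t", "TPI1t/TEF1", "CYC1t"),
    Fragment("gal80-Leu-down", "CYC1t", "Gal80-downstream"),
    Fragment("pPgk-DGCYP029-ADH1t", "Pgk", "ADHt/TDH3"),
    Fragment("gal80-Leu-up", "Gal80-upstream", "Pgk"),
    Fragment("pTDH3-VvCPR-TPI1t", "ADHt/TDH3", "TPI1t/TEF1"),
]


def assembly_order(fragments, start_label):
    """Chain fragments by matching each right junction to the next left junction."""
    by_left = {f.left: f for f in fragments}
    if len(by_left) != len(fragments):
        raise ValueError("two fragments share the same left junction")
    order, label = [], start_label
    while label in by_left:
        frag = by_left.pop(label)  # pop so that a fragment is used only once
        order.append(frag.name)
        label = frag.right
    if by_left:
        unplaced = sorted(f.name for f in by_left.values())
        raise ValueError(f"unplaced fragments: {unplaced}")
    return order
```

Calling `assembly_order(FRAGMENTS, "Gal80-upstream")` returns the upstream arm, the three cassettes in the order DGCYP029, VvCPR, DGCYP033, and then the downstream arm. A missing or mislabelled junction raises an error instead of silently returning a partial chain.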
Example 4 Production of diosgenin by fermentation of Saccharomyces cerevisiae strain DG001
The Saccharomyces cerevisiae strain DG001 and the control strain DG-Cho were activated on solid selection medium (same as in step 2 of Example 1), and seed cultures were prepared in the corresponding liquid selection medium (same as in step 2 of Example 1; 30 ℃, 250 rpm, 12 h). Each seed culture was inoculated at 1% into a 100 mL triangular flask containing 15 mL of the corresponding liquid selection medium and cultured at 30 ℃ and 250 rpm for 30 h. The cells were then collected by centrifugation at 5000 rpm, resuspended in a 100 mL triangular flask in 15 mL of the corresponding liquid selection medium with 2% galactose (the percentage indicates g/100 mL) as the carbon source, and cultured with shaking at 30 ℃ and 250 rpm for a further 90 h to obtain the fermentation broth. Cell growth (OD600) and the product were then measured (the product was extracted and measured as in Example 2).
And (3) detection results: the DGCYP029-sy-Sc, VvCPR-sy-Sc and DGCYP033-sy-Sc gene fragments are integrated into the Saccharomyces cerevisiae DG-Cho by detection, and the obtained fermentation product of the strain DG-001 is qualitatively analyzed for the generation of diosgenin by GC-MS detection (figure 3). Therefore, the saccharomyces cerevisiae chassis strain DG001 capable of synthesizing the diosgenin in vivo from the beginning is successfully constructed, and the yield of the diosgenin is 3.2 mg/L.
<110> Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences
<120> Dioscorea zingiberensis-derived diosgenin synthesis-related protein, coding gene and application
<130> GNCLN192228
<160> 20
<170> PatentIn version 3.5
<210> 1
<211> 478
<212> PRT
<213> Artificial sequence
<400> 1
Met Met Ala Ser Asp Arg Val Arg Lys Arg His Lys Gly Ser Ala Asn
1 5 10 15
Gly Ala Gln Thr Val Glu Lys Glu Pro Ser Lys Glu Pro Ala Gln Trp
20 25 30
Gly Arg Ala Trp Glu Val Asp Trp Phe Ser Leu Ser Gly Val Ile Leu
35 40 45
Leu Leu Cys Phe Ala Pro Phe Leu Val Ser Phe Phe Ile Met Ala Cys
50 55 60
Asp Gln Tyr Gln Cys Ser Ile Ser His Pro Leu Leu Asp Leu Tyr Asn
65 70 75 80
Gly Asp Ala Thr Leu Phe Thr Ile Trp Asn Arg Ala Pro Ser Phe Thr
85 90 95
Trp Ala Ala Ala Lys Ile Tyr Ala Ile Trp Val Thr Phe Gln Val Val
100 105 110
Leu Tyr Met Cys Val Pro Asp Phe Leu His Lys Ile Leu Pro Gly Tyr
115 120 125
Val Gly Gly Val Gln Asp Gly Ala Arg Thr Pro Ala Gly Leu Ile Asn
130 135 140
Lys Tyr Glu Val Asn Gly Leu Gln Cys Trp Leu Ile Thr His Val Leu
145 150 155 160
Trp Val Leu Asn Ala Gln His Phe His Trp Phe Ser Pro Thr Ile Ile
165 170 175
Ile Asp Asn Trp Ile Pro Leu Leu Trp Cys Thr Asn Ile Leu Gly Tyr
180 185 190
Ala Val Ser Thr Phe Ala Phe Ile Lys Ala Tyr Leu Phe Pro Thr Asn
195 200 205
Pro Glu Asp Cys Lys Phe Thr Gly Asn Met Phe Tyr Asn Tyr Met Met
210 215 220
Gly Ile Glu Phe Asn Pro Arg Ile Gly Lys Trp Phe Asp Phe Lys Leu
225 230 235 240
Phe Phe Asn Gly Arg Pro Gly Ile Val Ala Trp Thr Leu Ile Asn Leu
245 250 255
Ser Tyr Ala Ala Lys Gln Gln Glu Leu Tyr Gly Tyr Val Thr Asn Ser
260 265 270
Met Ile Leu Val Asn Val Leu Gln Ala Val Tyr Val Val Asp Phe Phe
275 280 285
Trp Asn Glu Ala Trp Tyr Leu Lys Thr Ile Asp Ile Cys His Asp His
290 295 300
Phe Gly Trp Tyr Leu Gly Trp Gly Asp Cys Val Trp Leu Pro Phe Leu
305 310 315 320
Tyr Thr Leu Gln Gly Leu Tyr Leu Val Tyr Asn Pro Ile Gln Leu Ser
325 330 335
Thr Pro His Ala Ala Gly Val Leu Ile Leu Gly Leu Val Gly Tyr Tyr
340 345 350
Ile Phe Arg Val Thr Asn His Gln Lys Asp Leu Phe Arg Arg Thr Glu
355 360 365
Gly Asn Cys Ser Ile Trp Gly Lys Lys Pro Thr Phe Ile Glu Cys Ser
370 375 380
Tyr Gln Ser Ala Asp Gly Ala Ile His Lys Ser Lys Leu Met Thr Ser
385 390 395 400
Gly Phe Trp Gly Val Ala Arg His Met Asn Tyr Thr Gly Asp Leu Met
405 410 415
Gly Ser Leu Ala Tyr Cys Leu Ala Cys Gly Gly Asn His Leu Leu Pro
420 425 430
Tyr Phe Tyr Ile Ile Tyr Met Thr Ile Leu Leu Val His Arg Cys Ile
435 440 445
Arg Asp Glu His Arg Cys Ser Asn Lys Tyr Gly Lys Asp Trp Glu Arg
450 455 460
Tyr Thr Ala Ala Val Ser Tyr Arg Leu Leu Pro Asn Ile Phe
465 470 475
<210> 2
<211> 516
<212> PRT
<213> Artificial sequence
<400> 2
Met Asp Pro Leu Leu Tyr Leu Gly Gly Leu Ala Val Leu Phe Leu Ile
1 5 10 15
Trp Ile Lys Val Lys Gly Leu Glu Tyr Val Ile Ile His Gln Arg Trp
20 25 30
Ile Phe Val Cys Leu Phe Leu Leu Pro Leu Ser Val Val Phe Asp Val
35 40 45
Tyr Tyr His Leu Arg Ala Trp Ile Ile Phe Lys Met Cys Ser Ala Pro
50 55 60
Lys Gln His Asp Gln Arg Val Arg Asp Ile Gln Arg Gln Val Arg Glu
65 70 75 80
Trp Arg Lys Asp Gly Gly Lys Lys Tyr Met Cys Thr Gly Arg Pro Gly
85 90 95
Trp Leu Thr Val Ser Leu Arg Val Gly Lys Tyr Lys Lys Thr His Lys
100 105 110
Asn Ile Met Ile Asn Met Met Asp Ile Leu Glu Val Asp Thr Lys Arg
115 120 125
Lys Val Val Arg Val Glu Pro Leu Ala Asn Met Gly Gln Val Thr Ala
130 135 140
Leu Leu Asn Ser Ile Gly Trp Thr Leu Pro Val Leu Pro Glu Leu Asp
145 150 155 160
Asp Leu Thr Val Gly Gly Leu Val Met Gly Thr Gly Ile Glu Ser Ser
165 170 175
Ser His Ile Tyr Gly Leu Phe Gln His Ile Cys Val Ala Phe Glu Leu
180 185 190
Val Leu Ala Asp Gly Ser Leu Val Arg Cys Thr Glu Lys Glu Asn Ser
195 200 205
Asp Leu Phe Tyr Ala Val Pro Trp Ser Cys Gly Thr Leu Gly Phe Leu
210 215 220
Val Ala Ala Glu Ile Arg Ile Ile Pro Ala Gln Lys Trp Val Lys Leu
225 230 235 240
His Tyr Glu Pro Val Arg Gly Leu Asp Ala Ile Cys Lys Lys Phe Ala
245 250 255
Glu Glu Ser Ala Asn Lys Glu Asn Gln Phe Val Glu Gly Leu Gln Tyr
260 265 270
Ser Arg Asp Glu Ala Val Ile Met Thr Gly Val Met Thr Asp His Ala
275 280 285
Glu Pro Asp Lys Thr Asn Cys Ile Gly Tyr Tyr Tyr Lys Pro Trp Phe
290 295 300
Phe Arg His Val Glu Ser Phe Leu Lys Gln Asn Arg Val Ala Val Glu
305 310 315 320
Tyr Ile Pro Leu Arg His Tyr Tyr His Arg His Thr Arg Ser Ile Phe
325 330 335
Trp Glu Leu Gln Asp Ile Ile Pro Phe Gly Asn Asn Pro Leu Phe Arg
340 345 350
Tyr Val Phe Gly Trp Met Val Pro Pro Lys Ile Ser Leu Leu Lys Leu
355 360 365
Thr Gln Gly Glu Thr Ile Arg Lys Leu Tyr Glu Gln His His Val Val
370 375 380
Gln Asp Met Leu Val Pro Met Lys Asp Ile Lys Ala Ala Ile Gln Arg
385 390 395 400
Phe His Glu Asp Ile His Val Tyr Pro Leu Trp Leu Cys Pro Phe Leu
405 410 415
Leu Pro Asn Gln Pro Gly Met Val His Pro Lys Gly Asp Glu Asp Glu
420 425 430
Leu Tyr Val Asp Ile Gly Ala Tyr Gly Glu Pro Lys Val Lys His Phe
435 440 445
Glu Ala Thr Ser Ser Thr Arg Gln Leu Glu Lys Phe Val Arg Asp Val
450 455 460
His Gly Phe Gln Met Leu Tyr Ala Asp Val Tyr Met Glu Arg Lys Glu
465 470 475 480
Phe Trp Glu Met Phe Asp Gly Thr Leu Tyr His Lys Leu Arg Glu Glu
485 490 495
Leu Gly Cys Lys Asp Ala Phe Pro Glu Val Phe Asp Lys Ile Cys Lys
500 505 510
Ser Ala Arg His
515
<210> 3
<211> 2460
<212> DNA
<213> Artificial sequence
<400> 3
ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg agccttaatt 60
aaacgcacag atattataac atctgcacaa taggcatttg caagaattac tcgtgagtaa 120
ggaaagagtg aggaactatc gcatacctgc atttaaagat gccgatttgg gcgcgaatcc 180
tttattttgg cttcaccctc atactattat cagggccaga aaaaggaagt gtttccctcc 240
ttcttgaatt gatgttaccc tcataaagca cgtggcctct tatcgagaaa gaaattaccg 300
tcgctcgtga tttgtttgca aaaagaacaa aactgaaaaa acccagacac gctcgacttc 360
ctgtcttcct attgattgca gcttccaatt tcgtcacaca acaaggtcct agcgacggct 420
cacaggtttt gtaacaagca atcgaaggtt ctggaatggc gggaaagggt ttagtaccac 480
atgctatgat gcccactgtg atctccagag caaagttcgt tcgatcgtac tgttactctc 540
tctctttcaa acagaattgt ccgaatcgtg tgacaacaac agcctgttct cacacactct 600
tttcttctaa ccaagggggt ggtttagttt agtagaacct cgtgaaactt acatttacat 660
atatataaac ttgcataaat tggtcaatgc aagaaataca tatttggtct tttctaattc 720
gtagtttttc aagttcttag atgctttctt tttctctttt ttacagatca tcaaggaagt 780
aattatctac tttttacaac aaatataaaa caatgatggc atctgataga gttagaaaaa 840
gacataaagg ttcagcaaat ggtgctcaaa ctgttgaaaa agaaccatct aaagaaccag 900
cacaatgggg tagagcttgg gaagttgatt ggttctcttt gtcaggtgtt attttgttgt 960
tgtgtttcgc accatttttg gtttctttct ttatcatggc ttgtgatcaa taccaatgtt 1020
ctatctcaca tccattgttg gatttgtata atggtgacgc aactttgttt actatttgga 1080
atagagctcc atcttttact tgggctgcag ctaagatcta tgctatctgg gttacattcc 1140
aagttgtttt gtacatgtgt gttccagatt tcttgcataa aattttgcca ggttatgttg 1200
gtggtgttca agatggtgca agaacaccag ctggtttgat taataagtac gaagttaacg 1260
gtttgcaatg ttggttgatc actcatgttt tgtgggtttt gaacgcacaa catttccatt 1320
ggttctctcc aacaatcatc atcgataact ggatcccatt gttgtggtgt actaacatct 1380
tgggttacgc tgtttcaaca ttcgctttta ttaaggctta cttatttcca actaatccag 1440
aagattgtaa gtttactggt aacatgttct acaactacat gatgggtatc gaattcaatc 1500
caagaatcgg taaatggttc gatttcaagt tgtttttcaa tggtagacca ggaattgttg 1560
cttggacttt gattaatttg tcttacgcag ctaagcaaca agaattgtac ggttacgtta 1620
caaactcaat gatcttggtt aacgttttgc aagcagttta cgttgttgat ttcttttgga 1680
acgaagcttg gtacttgaag actatcgata tctgtcatga tcatttcggt tggtatttgg 1740
gttggggtga ctgtgtttgg ttgccatttt tatacacttt gcaaggtttg tatttggttt 1800
acaacccaat ccaattgtct acaccacatg cagctggtgt tttgatcttg ggtttggttg 1860
gttactacat ttttagagtt actaaccatc aaaaggattt gtttagaaga acagagggta 1920
actgttcaat ctggggtaaa aagccaactt ttattgaatg ttcttaccaa tcagcagatg 1980
gtgctatcca taagtctaag ttgatgactt caggtttctg gggtgttgct agacatatga 2040
attatactgg tgacttgatg ggttctttgg catactgttt agcttgtggt ggtaatcatt 2100
tgttgccata cttctacatc atctatatga ctatcttatt ggttcataga tgtatcagag 2160
atgaacatag atgttctaat aagtacggta aagattggga aagatataca gcagctgttt 2220
catacagatt attgccaaat attttctaaa gttataaaaa aaataagtgt atacaaattt 2280
taaagtgact cttaggtttt aaaacgaaaa ttcttattct tgagtaactc tttcctgtag 2340
gtcaggttgc tttctcaggt atagcatgag gtcgctctta ttgaccacac ctctaccggc 2400
atgccgacgg attagaagcc gccgagcggg tgacagccct ccgaaggaag actctcctcc 2460
<210> 4
<211> 2297
<212> DNA
<213> Artificial sequence
<400> 4
ggtatagcat gaggtcgctc ttattgacca cacctctacc ggcatgccga cggattagaa 60
gccgccgagc gggtgacagc cctccgaagg aagactctcc tccgtgcgtc ctcgtcttca 120
ccggtcgcgt tcctgaaacg cagatgtgcc tcgcgccgca ctgctccgaa caataaagat 180
tctacaatac tagcttttat ggttatgaag aggaaaaatt ggcagtaacc tggccccaca 240
aaccttcaaa tgaacgaatc aaattaacaa ccataggatg ataatgcgat tagtttttta 300
gccttatttc tggggtaatt aatcagcgaa gcgatgattt ttgatctatt aacagatata 360
taaatgcaaa aactgcataa ccactttaac taatactttc aacattttcg gtttgtatta 420
cttcttattc aaatgtaata aaagtatcaa caaaaaattg ttaatatacc tctatacttt 480
aacgtcaagg agatggatcc attgttatac ttgggtggtt tagctgtttt gtttttaatc 540
tggatcaagg ttaaaggttt agaatatgtt attattcatc aaagatggat ttttgtttgt 600
ttatttttgt tgccattgtc agttgttttc gatgtttact accatttgag agcttggatc 660
atttttaaga tgtgttctgc accaaagcaa catgatcaaa gagttagaga tatccaaaga 720
caagttagag aatggagaaa agatggtggt aaaaagtaca tgtgtactgg tagaccagga 780
tggttgacag tttcattaag agttggtaaa tacaagaaaa ctcataagaa catcatgatt 840
aatatgatgg atattttaga agttgataca aagagaaagg ttgttagagt tgaaccattg 900
gctaatatgg gtcaagttac tgcattgttg aactctatcg gttggacatt gccagtttta 960
ccagaattgg atgatttgac tgttggtggt ttagttatgg gtacaggtat cgaatcttca 1020
tctcatatct atggtttgtt ccaacatatc tgtgttgctt tcgaattggt tttagcagat 1080
ggttctttag ttagatgtac tgaaaaggaa aactcagatt tgttttacgc tgttccatgg 1140
tcttgtggta cattgggttt cttggttgct gcagaaatca gaatcatccc agctcaaaag 1200
tgggttaagt tgcattacga accagttaga ggtttggatg caatctgtaa gaaattcgct 1260
gaagaatcag caaataagga aaaccaattc gttgaaggtt tacaatactc tagagatgaa 1320
gctgttatta tgactggtgt tatgacagat catgcagaac cagataagac taactgtatc 1380
ggttactact acaagccatg gtttttcaga catgttgaat catttttaaa gcaaaacaga 1440
gttgcagttg aatacatccc attgagacat tactaccata gacatacaag atctattttc 1500
tgggaattgc aagatatcat cccattcggt aacaacccat tgtttagata cgtttttggt 1560
tggatggttc caccaaagat ctcattgttg aagttgactc aaggtgaaac aatcagaaag 1620
ttgtacgaac aacatcatgt tgttcaagat atgttggttc caatgaagga tatcaaggct 1680
gcaatccaaa gattccatga agatatccat gtttacccat tgtggttgtg tccatttttg 1740
ttaccaaatc aaccaggaat ggttcatcca aaaggtgacg aagatgaatt gtacgttgat 1800
attggtgctt acggtgaacc aaaggttaag catttcgaag caacttcatc tacaagacaa 1860
ttggaaaagt ttgttagaga tgttcatggt ttccaaatgt tgtacgctga tgtttacatg 1920
gaaagaaagg aattctggga aatgttcgat ggtactttgt accataagtt gagagaagaa 1980
ttgggttgta aggatgcttt tccagaagtt tttgataaaa tttgtaaatc tgcaagacat 2040
taaagtctag gtccctattt atttttttat agttatgtta gtattaagaa cgttatttat 2100
atttcaaatt tttctttttt ttctgtacag acgcgtgtac gcatgtaaca ttatactgaa 2160
aaccttgctt gagaaggttt tgggacgctc gaaggcttta atttgcaagc tgcggccctg 2220
cattaatgaa tcggccaacg cgccagggtt ttcccagtca cgacgttgta aaacgacggc 2280
cagtgaattg taatacg 2297
<210> 5
<211> 1604
<212> DNA
<213> Artificial sequence
<400> 5
ggaaaagttg taaatattat tggtagtatt cgtttggtaa agtagagggg gtaatttttc 60
ccctttattt tgttcataca ttcttaaatt gctttgcctc tccttttgga aagctatact 120
tcggagcact gttgagcgaa ggctcattag atatattttc tgtcattttc cttaacccaa 180
aaataaggga aagggtccaa aaagcgctcg gacaactgtt gaccgtgatc cgaaggactg 240
gctatacagt gttcacaaaa tagccaagct gaaaataatg tgtagctatg ttcagttagt 300
ttggctagca aagatataaa agcaggtcgg aaatatttat gggcattatt atgcagagca 360
tcaacatgat aaaaaaaaac agttgaatat tccctcaaaa atgtcgaaag ctacatataa 420
ggaacgtgct gctactcatc ctagtcctgt tgctgccaag ctatttaata tcatgcacga 480
aaagcaaaca aacttgtgtg cttcattgga tgttcgtacc accaaggaat tactggagtt 540
agttgaagca ttaggtccca aaatttgttt actaaaaaca catgtggata tcttgactga 600
tttttccatg gagggcacag ttaagccgct aaaggcatta tccgccaagt acaatttttt 660
actcttcgaa gacagaaaat ttgctgacat tggtaataca gtcaaattgc agtactctgc 720
gggtgtatac agaatagcag aatgggcaga cattacgaat gcacacggtg tggtgggccc 780
aggtattgtt agcggtttga agcaggcggc agaagaagta acaaaggaac ctagaggcct 840
tttgatgtta gcagaattgt catgcaaggg ctccctatct actggagaat atactaaggg 900
tactgttgac attgcgaaga gcgacaaaga ttttgttatc ggctttattg ctcaaagaga 960
catgggtgga agagatgaag gttacgattg gttgattatg acacccggtg tgggtttaga 1020
tgacaaggga gacgcattgg gtcaacagta tagaaccgtg gatgatgtgg tctctacagg 1080
atctgacatt attattgttg gaagaggact atttgcaaag ggaagggatg ctaaggtaga 1140
gggtgaacgt tacagaaaag caggctggga agcatatttg agaagatgcg gccagcaaaa 1200
ctaaacgcac agatattata acatctgcac aataggcatt tgcaagaatt actcgtgagt 1260
aaggaaagag tgaggaacta tcgcatacct gcatttaaag atgccgattt gggcgcgaat 1320
cctttatttt ggcttcaccc tcatactatt atcagggcca gaaaaaggaa gtgtttccct 1380
ccttcttgaa ttgatgttac cctcataaag cacgtggcct cttatcgaga aagaaattac 1440
cgtcgctcgt gatttgtttg caaaaagaac aaaactgaaa aaacccagac acgctcgact 1500
tcctgtcttc ctattgattg cagcttccaa tttcgtcaca caacaaggtc ctagcgacgg 1560
ctcacaggtt ttgtaacaag caatcgaagg ttctggaatg gcgg 1604
<210> 6
<211> 499
<212> DNA
<213> Artificial sequence
<400> 6
agtctaggtc cctatttatt tttttatagt tatgttagta ttaagaacgt tatttatatt 60
tcaaattttt cttttttttc tgtacagacg cgtgtacgca tgtaacatta tactgaaaac 120
cttgcttgag aaggttttgg gacgctcgaa ggctttaatt tgcaagctgc ggccctgcat 180
taatgaatcg gccaacgcgc aaagaaagtg gaatattcat tcatatcata ttttttctat 240
taactgcctg gtttctttta aattttttat tggttgtcga cttgaacgga gtgacaatat 300
atatatatat atatttaata atgacatcat tatctgtaaa tctgattctt aatgctattc 360
tagttatgta agagtggtcc tttccataaa aaaaaaaaaa aagaaaaaag aattttagga 420
atacaatgca gcttgtaagt aaaatctgga atattcatat cgccacaact tcttatgctt 480
ataaaagcac taatgcctg 499
<210> 7
<211> 496
<212> PRT
<213> Artificial sequence
<400> 7
Met Gly Leu Met Gly Pro Leu Leu Leu Thr Leu Ala Ala Leu Ala Val
1 5 10 15
Thr Val Phe Leu Leu Arg Arg Arg Arg Gln Pro Ser Ser Lys Thr Ser
20 25 30
Lys Pro Leu Ala Ser Ser Gly Thr Leu Ser Glu Leu Met Lys Asn Gly
35 40 45
His Arg Ile Leu Asp Trp Thr Thr Glu Leu Leu Ser Ser Ser Gln Thr
50 55 60
Gly Thr Val Thr Thr Phe Met Gly Val Val Thr Ala Asn Pro Ser Asn
65 70 75 80
Val Glu His Ile Leu Lys Ser His Phe Pro Asn Tyr Pro Lys Gly Ser
85 90 95
His Ser Thr Thr Ile Leu Ser Asp Phe Leu Gly Ala Gly Ile Phe Asn
100 105 110
Ser Asp Gly Glu His Trp Arg Leu Gln Arg Lys Thr Ala Ser Leu Glu
115 120 125
Phe Thr Thr Lys Ser Ile Arg Ser Phe Val Ser Ser Asn Val Arg Leu
130 135 140
Glu Thr Ser Ser Arg Leu Leu Pro Val Leu His Ser Phe Ala Arg Ser
145 150 155 160
Gly Gln Ile Val Asp Leu Gln Asp Leu Phe Asp Cys Leu Ala Phe Asp
165 170 175
Asn Val Cys Gln Val Thr Phe Gly Tyr Asp Pro Ala Arg Leu Asp Ser
180 185 190
Ser Gly Asp Pro Asp Ser Val Ala Phe Ser Arg Ala Phe Asp Arg Ala
195 200 205
Thr Ala Leu Ser Val Arg Arg Phe Ser His Pro Phe Pro Phe Thr Trp
210 215 220
Lys Leu Leu Arg Phe Leu Asn Ala Gly Tyr Glu Arg Glu Leu Lys Ala
225 230 235 240
Glu Val Ala Lys Val His Arg Phe Ala Met Gln Val Val Arg Arg Arg
245 250 255
Lys Lys Asp Gly Asp Leu Gly Asp Asp Leu Leu Ser Arg Phe Ile Ala
260 265 270
Glu Ala Asp Tyr Ser Asp Glu Phe Leu Arg Asp Ile Ile Ile Ser Phe
275 280 285
Val Leu Ala Gly Arg Asp Thr Thr Ser Ala Thr Leu Thr Trp Phe Phe
290 295 300
Trp Leu Ile Ala Ser Arg Pro Glu Val Lys Ala Arg Val Leu Asp Glu
305 310 315 320
Ile Arg Ala Ala Arg Glu Gln Glu Arg Glu Arg Thr Gly Thr Ala Thr
325 330 335
Ser Glu Ala Val Leu Thr Leu Asp Gln Val Arg Gly Met Asp Tyr Leu
340 345 350
His Ala Ala Leu Ser Glu Thr Leu Arg Leu Tyr Pro Pro Val Pro Leu
355 360 365
Gln Thr Arg Ala Cys Ala Glu Asp Asp Leu Leu Pro Asp Gly Thr Pro
370 375 380
Val Lys Lys Gly Ser Thr Val Met Tyr Ser Ala Tyr Ala Met Gly Arg
385 390 395 400
Ser Glu Ser Ile Trp Gly Glu Asp Trp Lys Glu Phe Arg Pro Glu Arg
405 410 415
Trp Leu Glu Asn Gly Val Phe Arg Pro Ala Ser Ser Phe Arg Phe Pro
420 425 430
Val Phe His Ala Gly Pro Arg Met Cys Leu Gly Lys Asp Met Ala Tyr
435 440 445
Ile Gln Met Lys Ala Val Ala Ala Ala Val Met Glu Arg Phe Glu Leu
450 455 460
Glu Val Val Asp Glu Glu Lys Pro Arg Glu Pro Glu Phe Thr Met Ile
465 470 475 480
Leu Arg Met Lys Gly Gly Leu Pro Val Arg Ile Arg Glu Lys Glu Phe
485 490 495
<210> 8
<211> 1509
<212> DNA
<213> Artificial sequence
<400> 8
gcgaccaggt atgggtttga tgggtccttt attattgact ttagccgctt tagccgttac 60
agtattttta ttgagaagaa gaagacaacc tagtagtaaa acatcaaaac cattggcttc 120
ttcaggtact ttgtctgaat tgatgaagaa cggtcataga atcttggatt ggactacaga 180
attgttatct tcatctcaaa ctggtacagt tactactttt atgggtgttg ttacagctaa 240
tccatcaaac gttgaacata tcttgaagtc tcatttccca aactacccaa agggttcaca 300
ttctactaca atcttgtcag atttcttggg tgcaggtatt ttcaattctg atggtgaaca 360
ttggagattg caaagaaaga cagcttcttt ggaattcact acaaagtcaa tcagatcttt 420
cgtttcatct aacgttagat tggaaacttc atctagattg ttgccagttt tgcattcatt 480
tgctagatct ggtcaaatcg ttgatttgca agatttgttc gattgtttgg cattcgataa 540
cgtttgtcaa gttactttcg gttacgatcc agctagatta gattcatctg gtgacccaga 600
ttcagttgca ttttctagag cttttgatag agcaacagct ttgtcagtta gaagattttc 660
tcatccattc ccttttactt ggaagttgtt gagatttttg aacgcaggtt acgaaagaga 720
attgaaggca gaagttgcta aggttcatag atttgctatg caagttgtta gaagaagaaa 780
gaaagatggt gacttgggtg acgatttgtt atcaagattc attgcagaag ctgattactc 840
tgatgaattc ttgagagata tcatcatctc atttgtttta gcaggtagag atactacatc 900
tgctactttg acttggtttt tctggttaat tgcatctaga ccagaagtta aggctagagt 960
tttggatgaa atcagagctg caagagaaca agaaagagaa agaactggta cagcaacttc 1020
agaagctgtt ttgacattgg atcaagttag aggcatggat tatttgcatg ctgcattatc 1080
tgaaacattg agattatacc caccagttcc attacaaact agagcatgtg ctgaagatga 1140
tttgttacca gatggtacac cagttaagaa aggttcaact gttatgtatt ctgcatacgc 1200
tatgggtaga tcagaatcta tttggggtga agactggaaa gaattcagac cagaaagatg 1260
gttggaaaat ggtgttttta gaccagcatc atcttttaga tttccagttt ttcatgctgg 1320
tccaagaatg tgtttgggta aagatatggc atacatccaa atgaaggctg ttgctgcagc 1380
tgttatggaa agattcgaat tggaagttgt tgatgaagaa aagccaagag aaccagaatt 1440
tactatgatt ttgagaatga aaggtggttt gccagttaga ataagagaaa aggaattttg 1500
aggcgcgcc 1509
<210> 9
<211> 488
<212> PRT
<213> Artificial sequence
<400> 9
Met Phe Pro Leu Ala Ile Ile Val Leu Leu Phe Pro Thr Leu Leu Leu
1 5 10 15
Leu Phe Ile Gly Val Ala Leu Gly Leu Arg Ser Gly Ala Asn Glu Ser
20 25 30
Trp Lys Lys Arg Gly Leu Asn Ile Pro Pro Gly Ser Met Gly Trp Pro
35 40 45
Leu Leu Gly Glu Thr Ile Ala Phe Arg Lys Leu His Pro Cys Thr Ser
50 55 60
Leu Gly Glu Tyr Met Glu Asp Arg Leu Gln Arg Tyr Gly Lys Ile Tyr
65 70 75 80
Arg Ser Asn Leu Phe Gly Ala Pro Thr Val Val Ser Ala Asp Ala Glu
85 90 95
Leu Asn Arg Phe Val Leu Met Asn Asp Gly Lys Leu Phe Glu Pro Ser
100 105 110
Trp Pro Lys Ser Val Ala Asp Ile Leu Gly Lys Thr Ser Met Leu Val
115 120 125
Leu Thr Gly Glu Met His Arg Tyr Met Lys Ser Leu Ser Val Asn Phe
130 135 140
Met Gly Ile Ala Arg Leu Arg Asn His Phe Leu Gly Asp Ser Glu Arg
145 150 155 160
Tyr Ile Leu Glu Asn Leu Ala Thr Trp Lys Glu Gly Val Pro Phe Pro
165 170 175
Ala Lys Glu Glu Ala Cys Lys Ile Thr Phe Asn Leu Met Val Lys Asn
180 185 190
Ile Leu Ser Met Asn Pro Gly Glu Pro Glu Thr Glu Arg Leu Arg Ile
195 200 205
Leu Tyr Met Ser Phe Met Lys Gly Val Ile Ala Met Pro Leu Asn Phe
210 215 220
Pro Gly Thr Ala Tyr Arg Lys Ala Ile Gln Ser Arg Ala Thr Ile Leu
225 230 235 240
Lys Thr Ile Glu His Leu Met Glu Asp Arg Leu Glu Lys Lys Lys Ala
245 250 255
Gly Thr Asp Asn Ile Gly Glu Ala Asp Leu Leu Gly Phe Ile Leu Glu
260 265 270
Gln Ser Asn Leu Asp Ala Glu Gln Phe Gly Asp Leu Leu Leu Gly Leu
275 280 285
Leu Phe Gly Gly His Glu Thr Ser Ser Thr Ala Ile Thr Leu Ala Ile
290 295 300
Tyr Phe Leu Glu Gly Cys Pro Lys Ala Val Gln Glu Leu Arg Glu Glu
305 310 315 320
His Leu Asn Leu Val Arg Met Lys Lys Gln Arg Gly Glu Ser Lys Ala
325 330 335
Leu Thr Trp Glu Asp Tyr Lys Ser Met Asp Phe Ala Gln Cys Val Val
340 345 350
Ser Glu Thr Leu Arg Leu Gly Asn Ile Ile Lys Phe Val His Arg Lys
355 360 365
Ala Asn Thr Asp Val Gln Phe Lys Gly Tyr Asp Ile Pro Ser Gly Trp
370 375 380
Ser Val Ile Pro Val Phe Ala Ala Ala His Leu Asp Pro Thr Val Tyr
385 390 395 400
Asp Asn Pro Gln Lys Phe Asp Pro Trp Arg Trp Gln Thr Ile Ser Ser
405 410 415
Ser Thr Ala Arg Ile Asp Asn Tyr Met Pro Phe Gly Gln Gly Leu Arg
420 425 430
Asn Cys Ala Gly Leu Glu Leu Ala Lys Met Glu Ile Ala Val Phe Leu
435 440 445
His His Leu Val Leu Asn Phe Asp Trp Glu Leu Ala Glu Pro Asp His
450 455 460
Pro Leu Ala Tyr Ala Phe Pro Glu Phe Glu Lys Gly Leu Pro Ile Lys
465 470 475 480
Val Arg Lys Leu Ser Ile Leu Glu
485
<210> 10
<211> 1485
<212> DNA
<213> Artificial sequence
<400> 10
gcgaccaggt atgtttcctc tagctatcat cgtcttgcta tttcccacac tgctgctcct 60
cttcatagga gtggccctgg gtttgagaag tggagccaat gagagctgga agaagagggg 120
gctcaacatt cccccaggaa gcatgggctg gccgctcctc ggcgagacca tcgccttccg 180
gaagctccat ccctgcacct ctctcggcga gtacatggag gatcgtctcc agaggtatgg 240
aaagatctac cgctcgaact tgttcggcgc gccgacggtg gtttcggcgg atgcagagct 300
gaaccggttc gtgctgatga acgacgggaa gctgttcgag ccgagctggc cgaagagcgt 360
ggcggacata ctgggaaaga cgtcgatgct ggtgctcaca ggggagatgc atcgctacat 420
gaagtccttg tccgtcaact tcatggggat cgctaggctt cggaatcact tccttgggga 480
ctctgagcgc tatatcttgg agaaccttgc gacctggaag gagggcgttc ctttccctgc 540
taaagaagaa gcttgcaaga taaccttcaa tttaatggtg aagaacatac tgagtatgaa 600
tcctggggag ccagagaccg agaggttgcg cattctctac atgtccttca tgaagggagt 660
gatagctatg cctctcaatt tccctggaac tgcatacagg aaagccattc agtctagagc 720
tacaatcctg aaaaccattg agcatttgat ggaggatagg ctggagaaga agaaggcagg 780
cactgataat atcggagaag ctgatcttct aggtttcatt ctagagcagt cgaacttgga 840
tgctgaacaa ttcggagact tgctgttagg tttgcttttt ggtggccatg agacctcctc 900
cactgccatc actctggcta tctacttcct tgaaggatgc cctaaagctg tacaagaact 960
aagggaagag catttgaacc tggtgaggat gaagaagcag agaggagagt ccaaagcact 1020
cacttgggaa gactacaaat ccatggactt tgcacagtgt gtggtgagtg agactctaag 1080
gctgggaaac atcatcaagt ttgtgcacag gaaggctaac actgatgtcc aatttaaagg 1140
atatgacata ccgagtggct ggagtgtgat tccggtgttc gccgcagctc atttagatcc 1200
tactgtctat gacaatcctc agaaatttga tccttggaga tggcagacaa tctcctccag 1260
cactgctagg attgacaatt acatgccatt cggtcagggg ctgcgcaact gtgctggcct 1320
tgagctcgcc aagatggaga tcgccgtgtt ccttcaccac cttgtcctta acttcgactg 1380
ggagcttgct gagccagatc accccctcgc ctacgccttc cctgaattcg aaaagggcct 1440
tcctatcaaa gttcgcaagc tatccatcct agaatgaggc gcgcc 1485
<210> 11
<211> 704
<212> PRT
<213> Artificial sequence
<400> 11
Met Gln Ser Ser Ser Val Lys Val Ser Pro Phe Asp Leu Met Ser Ala
1 5 10 15
Ile Ile Lys Gly Ser Met Asp Gln Ser Asn Val Ser Ser Glu Ser Gly
20 25 30
Gly Ala Ala Ala Met Val Leu Glu Asn Arg Glu Phe Ile Met Ile Leu
35 40 45
Thr Thr Ser Ile Ala Val Leu Ile Gly Cys Val Val Val Leu Ile Trp
50 55 60
Arg Arg Ser Gly Gln Lys Gln Ser Lys Thr Pro Glu Pro Pro Lys Pro
65 70 75 80
Leu Ile Val Lys Asp Leu Glu Val Glu Val Asp Asp Gly Lys Gln Lys
85 90 95
Val Thr Ile Phe Phe Gly Thr Gln Thr Gly Thr Ala Glu Gly Phe Ala
100 105 110
Lys Ala Leu Ala Glu Glu Ala Lys Ala Arg Tyr Glu Lys Ala Ile Phe
115 120 125
Lys Val Val Asp Leu Asp Asp Tyr Ala Gly Asp Asp Asp Glu Tyr Glu
130 135 140
Glu Lys Leu Lys Lys Glu Thr Leu Ala Phe Phe Phe Leu Ala Thr Tyr
145 150 155 160
Gly Asp Gly Glu Pro Thr Asp Asn Ala Ala Arg Phe Tyr Lys Trp Phe
165 170 175
Ala Glu Gly Lys Glu Arg Gly Glu Trp Leu Gln Asn Leu Lys Tyr Gly
180 185 190
Val Phe Gly Leu Gly Asn Arg Gln Tyr Glu His Phe Asn Lys Val Ala
195 200 205
Lys Val Val Asp Asp Ile Ile Thr Glu Gln Gly Gly Lys Arg Ile Val
210 215 220
Pro Val Gly Leu Gly Asp Asp Asp Gln Cys Ile Glu Asp Asp Phe Ala
225 230 235 240
Ala Trp Arg Glu Leu Leu Trp Pro Glu Leu Asp Gln Leu Leu Arg Asp
245 250 255
Glu Asp Asp Ala Thr Thr Val Ser Thr Pro Tyr Thr Ala Ala Val Leu
260 265 270
Glu Tyr Arg Val Val Phe His Asp Pro Glu Gly Ala Ser Leu Gln Asp
275 280 285
Lys Ser Trp Gly Ser Ala Asn Gly His Thr Val His Asp Ala Gln His
290 295 300
Pro Cys Arg Ala Asn Val Ala Val Arg Lys Glu Leu His Thr Pro Ala
305 310 315 320
Ser Asp Arg Ser Cys Thr His Leu Glu Phe Asp Ile Ser Gly Thr Gly
325 330 335
Leu Thr Tyr Glu Thr Gly Asp His Val Gly Val Tyr Cys Glu Asn Leu
340 345 350
Pro Glu Thr Val Glu Glu Ala Glu Arg Leu Leu Gly Phe Ser Pro Asp
355 360 365
Val Tyr Phe Ser Ile His Thr Glu Arg Glu Asp Gly Thr Pro Leu Ser
370 375 380
Gly Ser Ser Leu Ser Pro Pro Phe Pro Pro Cys Thr Leu Arg Thr Ala
385 390 395 400
Leu Thr Arg Tyr Ala Asp Val Leu Ser Ser Pro Lys Lys Ser Ala Leu
405 410 415
Val Ala Leu Ala Ala His Ala Ser Asp Pro Ser Glu Ala Asp Arg Leu
420 425 430
Lys Tyr Leu Ala Ser Pro Ser Gly Lys Asp Glu Tyr Ala Gln Trp Val
435 440 445
Val Ala Ser Gln Arg Ser Leu Leu Glu Ile Met Ala Glu Phe Pro Ser
450 455 460
Ala Lys Pro Pro Leu Gly Val Phe Phe Ala Ala Val Ala Pro Arg Leu
465 470 475 480
Gln Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Pro Lys Met Val Pro Ser
485 490 495
Arg Ile His Val Thr Cys Ala Leu Val Cys Asp Lys Met Pro Thr Gly
500 505 510
Arg Ile His Lys Gly Ile Cys Ser Thr Trp Met Lys Tyr Ala Val Pro
515 520 525
Leu Glu Glu Ser Gln Asp Cys Ser Trp Ala Pro Ile Phe Val Arg Gln
530 535 540
Ser Asn Phe Lys Leu Pro Ala Asp Thr Ser Val Pro Ile Ile Met Ile
545 550 555 560
Gly Pro Gly Thr Gly Leu Ala Pro Phe Arg Gly Phe Leu Gln Glu Arg
565 570 575
Phe Ala Leu Lys Glu Ala Gly Ala Glu Leu Gly Ser Ser Ile Leu Phe
580 585 590
Phe Gly Cys Arg Asn Arg Lys Met Asp Tyr Ile Tyr Glu Asp Glu Leu
595 600 605
Asn Gly Phe Val Glu Ser Gly Ala Leu Ser Glu Leu Ile Val Ala Phe
610 615 620
Ser Arg Glu Gly Pro Thr Lys Glu Tyr Val Gln His Lys Met Met Glu
625 630 635 640
Lys Ala Ser Asp Ile Trp Asn Val Ile Ser Gln Gly Gly Tyr Ile Tyr
645 650 655
Val Cys Gly Asp Ala Lys Gly Met Ala Arg Asp Val His Arg Thr Leu
660 665 670
His Thr Ile Leu Gln Glu Gln Gly Ser Leu Asp Ser Ser Lys Ala Glu
675 680 685
Ser Met Val Lys Asn Leu Gln Met Thr Gly Arg Tyr Leu Arg Asp Val
690 695 700
<210> 12
<211> 2133
<212> DNA
<213> Artificial sequence
<400> 12
gcgaccaggt atgcaatcat cctccgtaaa ggtatcccca ttcgacttaa tgtcagcaat 60
catcaagggt tctatggacc aatcaaacgt atcatcagaa tcaggtggtg ctgcagccat 120
ggttttggaa aacagagaat tcattatgat cttgactaca tccattgctg ttttgatcgg 180
ttgtgttgtc gtattgatat ggagaagatc aggtcaaaaa caatccaaga ctccagaacc 240
acctaaacct ttgattgtta aggatttgga agtagaagtt gatgacggta aacaaaaggt 300
tacaatattt ttcggtacac aaaccggtac tgctgaaggt ttcgcaaaag ccttggctga 360
agaagcaaag gccagatacg aaaaggcaat ttttaaggtt gtcgatttgg atgactatgc 420
cggtgacgac gatgaatacg aagaaaaatt gaaaaaggaa actttggcct ttttcttttt 480
ggctacatat ggtgacggtg aaccaaccga caatgctgca agattctaca aatggtttgc 540
tgagggtaaa gaacgtggtg aatggttgca aaacttaaag tatggtgttt tcggtttggg 600
taacagacaa tacgaacatt tcaacaaagt tgcaaaggta gttgacgata taatcacaga 660
acaaggtggt aaaagaatcg tcccagtagg tttgggtgac gatgaccaat gtattgaaga 720
tgacttcgcc gcttggagag aattattatg gcctgaatta gatcaattgt taagagacga 780
agatgacgct accactgtat ctacaccata taccgcagcc gttttggaat acagagtcgt 840
atttcatgat cctgaaggtg catcattaca agacaagtca tggggttccg ccaatggtca 900
tactgttcac gatgctcaac acccatgtag agccaacgtt gctgtcagaa aagaattgca 960
tactcctgct agtgatagat cttgcacaca cttggaattc gacatttctg gtactggttt 1020
aacatatgaa accggtgacc atgtaggtgt ttactgtgaa aatttgccag aaacagtcga 1080
agaagcagaa agattgttag gtttctcacc tgatgtatat ttttccatac acaccgaaag 1140
agaagacggt actccattaa gtggttcttc attgtctcca ccttttccac cttgcacttt 1200
gagaacagca ttaaccagat acgccgatgt tttgtccagt cctaaaaagt ctgcattggt 1260
cgccttagct gcacatgcat cagatccatc cgaagccgac agattgaaat atttggctag 1320
tccttctggt aaagatgaat acgctcaatg ggttgtcgca agtcaaagat ctttgttaga 1380
aattatggcc gaatttccat ctgctaagcc acctttgggt gtcttctttg ccgctgtagc 1440
tccaagattg caacctagat actacagtat ctcttcatcc ccaaagatgg ttccttctag 1500
aatacatgtt acctgtgcat tggtctgcga taaaatgcca actggtagaa tccacaaggg 1560
tatttgttca acatggatga aatatgccgt tccattagaa gaatcacaag attgctcctg 1620
ggcacctatc ttcgttagac aatcaaactt caaattgcca gctgatacct ccgtccctat 1680
cattatgatt ggtccaggta caggtttagc tcctttcaga ggtttcttgc aagaaagatt 1740
tgcattgaag gaagctggtg cagaattggg tagttctatc ttgttctttg gttgtagaaa 1800
cagaaagatg gattacatct acgaagacga attgaacggt ttcgtagaaa gtggtgcttt 1860
gtctgaattg atcgttgcat tttcaagaga aggtccaact aaggaatacg ttcaacataa 1920
gatgatggaa aaggctagtg atatctggaa cgtcatctct caaggtggtt atatatacgt 1980
atgcggtgac gctaagggta tggcaagaga cgttcataga actttgcaca caatcttaca 2040
agaacaaggt tctttagatt catccaaggc tgaatcaatg gtaaagaact tacaaatgac 2100
tggtagatac ttgagagatg tctaaggcgc gcc 2133
<210> 13
<211> 1775
<212> DNA
<213> Artificial sequence
<400> 13
gcgcaagttt tccgctttgt aatatatatt tatacccctt tcttctctcc cctgcaatat 60
aatagtttaa ttctaatatt aataatatcc tatattttct tcatttaccg gcgcactctc 120
gcccgaacga cctcaaaatg tctgctacat tcataataac caaaagctca taactttttt 180
ttttgaacct gaatatatat acatcacata tcactgctgg tccttgccga ccagcgtata 240
caatctcgat agttggtttc ccgttctttc cactcccgtc atgtctgccc ctaagaagat 300
cgtcgttttg ccaggtgacc acgttggtca agaaatcaca gccgaagcca ttaaggttct 360
taaagctatt tctgatgttc gttccaatgt caagttcgat ttcgaaaatc atttaattgg 420
tggtgctgct atcgatgcta caggtgttcc acttccagat gaggcgctgg aagcctccaa 480
gaaggctgat gccgttttgt taggtgctgt gggtggtcct aaatggggta ccggtagtgt 540
tagacctgaa caaggtttac taaaaatccg taaagaactt caattgtacg ccaacttaag 600
accatgtaac tttgcatccg actctctttt agacttatct ccaatcaagc cacaatttgc 660
taaaggtact gacttcgttg ttgtcagaga attagtggga ggtatttact ttggtaagag 720
aaaggaagac gatggtgatg gtgtcgcttg ggatagtgaa caatacaccg ttccagaagt 780
gcaaagaatc acaagaatgg ccgctttcat ggccctacaa catgagccac cattgcctat 840
ttggtccttg gataaagcta atgttttggc ctcttcaaga ttatggagaa aaactgtgga 900
ggaaaccatc aagaacgaat tccctacatt gaaggttcaa catcaattga ttgattctgc 960
cgccatgatc ctagttaaga acccaaccca cctaaatggt attataatca ccagcaacat 1020
gtttggtgat atcatctccg atgaagcctc cgttatccca ggttccttgg gtttgttgcc 1080
atctgcgtcc ttggcctctt tgccagacaa gaacaccgca tttggtttgt acgaaccatg 1140
ccacggttct gctccagatt tgccaaagaa taaggtcaac cctatcgcca ctatcttgtc 1200
tgctgcaatg atgttgaaat tgtcattgaa cttgcctgaa gaaggtaagg ccattgaaga 1260
tgcagttaaa aaggttttgg atgcaggtat cagaactggt gatttaggtg gttccaacag 1320
taccaccgaa gtcggtgatg ctgtcgccga agaagttaag aaaatccttg cttaaacgca 1380
cagatattat aacatctgca caataggcat ttgcaagaat tactcgtgag taaggaaaga 1440
gtgaggaact atcgcatacc tgcatttaaa gatgccgatt tgggcgcgaa tcctttattt 1500
tggcttcacc ctcatactat tatcagggcc agaaaaagga agtgtttccc tccttcttga 1560
attgatgtta ccctcataaa gcacgtggcc tcttatcgag aaagaaatta ccgtcgctcg 1620
tgatttgttt gcaaaaagaa caaaactgaa aaaacccaga cacgctcgac ttcctgtctt 1680
cctattgatt gcagcttcca atttcgtcac acaacaaggt cctagcgacg gctcacaggt 1740
tttgtaacaa gcaatcgaag gttctggaat ggcgg 1775
<210> 14
<211> 532
<212> DNA
<213> Artificial sequence
<400> 14
agtctaggtc cctatttatt tttttatagt tatgttagta ttaagaacgt tatttatatt 60
tcaaattttt cttttttttc tgtacagacg cgtgtacgca tgtaacatta tactgaaaac 120
cttgcttgag aaggttttgg gacgctcgaa ggctttaatt tgcaagctgc ggccctgcat 180
taatgaatcg gccaacgcgc aagcatcttg ccctgtgctt ggcccccagt gcagcgaacg 240
ttataaaaac gaatactgag tatatatcta tgtaaaacaa ccatatcatt tcttgttctg 300
aactttgttt acctaactag ttttaaattt ccctttttcg tgcatgcggg tgttcttatt 360
tattagcata ctacatttga aatatcaaat ttccttagta gaaaagtgag agaaggtgca 420
ctgacacaaa aaataaaatg ctacgtataa ctgtcaaaac tttgcagcag cgggcatcct 480
tccatcatag cttcaaacat attagcgttc ctgatcttca tacccgtgct ca 532
<210> 15
<211> 80
<212> DNA
<213> Artificial sequence
<400> 15
ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg agccttaatt 60
aaacgcacag atattataac 80
<210> 16
<211> 74
<212> DNA
<213> Artificial sequence
<400> 16
cctccgcgtc attaaacttc ttgttgttga cgctaacatc aacgctagta ttcggcatgc 60
cggtagaggt gtgg 74
<210> 17
<211> 76
<212> DNA
<213> Artificial sequence
<400> 17
caggtatagc atgaggtcgc tcttattgac cacacctcta ccggcatgcc gaatactagc 60
gttgaatgtt agcgtc 76
<210> 18
<211> 75
<212> DNA
<213> Artificial sequence
<400> 18
aggagtagaa acattttgaa gctatggtgt gtgggggatc actttaatta atctatataa 60
cagttgaaat ttgga 75
<210> 19
<211> 76
<212> DNA
<213> Artificial sequence
<400> 19
gtcattttcg cgttgagaag atgttcttat ccaaatttca actgttatat agattaatta 60
aagtgatccc ccacac 76
<210> 20
<211> 78
<212> DNA
<213> Artificial sequence
<400> 20
cgtattacaa ttcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgcgt 60
tggccgattc attaatgc 78
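
The claims identify coding regions by nucleotide position within these listings (for example, a range within SEQ ID No. 10). As a minimal sketch, assuming the rows are copied exactly as printed (ten-base groups followed by a running count) and that the standard genetic code applies, such a region can be extracted and translated for comparison with the corresponding protein listing. The helper names `join_listing_rows`, `extract_region` and `translate` are hypothetical and introduced only for illustration.

```python
import re

# Standard genetic code; codons enumerated in t, c, a, g order at each position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def join_listing_rows(rows):
    """Concatenate listing rows, dropping spaces and the trailing running count."""
    return "".join(re.sub(r"[^acgt]", "", row.lower()) for row in rows)


def extract_region(seq, start, end):
    """Return the region from start to end, 1-based and inclusive."""
    return seq[start - 1:end]


def translate(cds):
    """Translate codon by codon; '*' marks a stop, an incomplete last codon is ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


# First row of SEQ ID No. 10 as printed in the listing above.
rows = ["gcgaccaggt atgtttcctc tagctatcat cgtcttgcta tttcccacac tgctgctcct 60"]
seq = join_listing_rows(rows)
peptide = translate(extract_region(seq, 11, 58))
print(peptide)  # MFPLAIIVLLFPTLLL
```

The printed peptide corresponds to the first sixteen residues of SEQ ID No. 9 (Met Phe Pro Leu Ala Ile Ile Val Leu Leu Phe Pro Thr Leu Leu Leu). Passing all rows of a listing and the full position range would translate the complete coding region in the same way.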

Claims (22)

1. A protein, which is protein A or protein B;
the protein A is a protein shown in any one of the following (A1) - (A3):
(A1) protein with amino acid sequence shown as SEQ ID No. 9;
(A2) a protein which has more than 99% homology with the amino acid sequence defined in (A1), is derived from Dioscorea zingiberensis, and has cholesterol 16-position and 22-position dual oxidase function;
(A3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of a protein defined in any one of (A1) to (A2);
the protein B is a protein shown in any one of the following (B1) to (B3):
(B1) protein with amino acid sequence shown as SEQ ID No. 7;
(B2) a protein which has more than 99% homology with the amino acid sequence defined in (B1), is derived from Dioscorea zingiberensis, and has cholesterol 26-position oxidase function;
(B3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in (B1).
2. A set of proteins, characterized in that:
the protein set is protein set A or protein set B or protein set C;
the protein set A consists of protein A and protein B;
the protein set B consists of protein A, protein B and protein C;
the protein set C consists of protein A, protein B, protein C, protein D and protein E;
the protein A is a protein shown in any one of the following (A1) - (A3):
(A1) protein with amino acid sequence shown as SEQ ID No. 9;
(A2) a protein which has more than 99% homology with the amino acid sequence defined in (A1), is derived from Dioscorea zingiberensis, and has cholesterol 16-position and 22-position dual oxidase function;
(A3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of a protein defined in any one of (A1) to (A2);
the protein B is a protein shown in any one of the following (B1) to (B3):
(B1) protein with amino acid sequence shown as SEQ ID No. 7;
(B2) a protein which has more than 99% homology with the amino acid sequence defined in (B1), is derived from Dioscorea zingiberensis, and has cholesterol 26-position oxidase function;
(B3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of a protein defined in any one of (B1) to (B2);
the protein C is a protein shown in any one of the following (C1) to (C3):
(C1) protein with amino acid sequence shown as SEQ ID No. 11;
(C2) a protein which has 99% or more homology with the amino acid sequence defined in (C1), is derived from grape, and has cytochrome P450 reductase function;
(C3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of a protein defined in any one of (C1) to (C2);
the protein D is a protein shown in any one of the following (D1) to (D3):
(D1) protein with amino acid sequence shown as SEQ ID No. 1;
(D2) a protein which has more than 99% homology with the amino acid sequence defined in (D1), is derived from zebrafish, and has sterol 7-position reductase function;
(D3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of a protein defined in any one of (D1) to (D2);
the protein E is a protein shown in any one of the following (E1) to (E3):
(E1) protein with amino acid sequence shown as SEQ ID No. 2;
(E2) a protein which has more than 99% homology with the amino acid sequence defined in (E1), is derived from zebrafish, and has sterol 24-position reductase function;
(E3) a fusion protein obtained by attaching a tag to the N-terminus and/or C-terminus of the protein defined in any one of (E1) to (E2).
3. A nucleic acid molecule, which is a nucleic acid molecule A or a nucleic acid molecule B;
the nucleic acid molecule A is a nucleic acid molecule encoding the protein A of claim 1;
the nucleic acid molecule B is a nucleic acid molecule encoding the protein B according to claim 1.
4. The nucleic acid molecule of claim 3, wherein:
the nucleic acid molecule A is a DNA molecule shown as any one of (a1) to (a3) as follows:
(a1) a DNA molecule with the nucleotide sequence shown at positions 11 to 1477 of SEQ ID No. 10;
(a2) a DNA molecule which hybridizes under stringent conditions with the DNA molecule defined in (a1) and which encodes the protein A as claimed in claim 1;
(a3) a DNA molecule having 80% or more homology with the DNA sequence defined in (a1) or (a2) and encoding the protein A as defined in claim 1;
the nucleic acid molecule B is a DNA molecule shown as any one of (b1) to (b3) as follows:
(b1) a DNA molecule with the nucleotide sequence shown at positions 11 to 1501 of SEQ ID No. 8;
(b2) a DNA molecule which hybridizes under stringent conditions with the DNA molecule defined in (b1) and which encodes the protein B of claim 1;
(b3) a DNA molecule having 80% or more homology with the DNA sequence defined in (b1) or (b2) and encoding the protein B of claim 1.
5. A set of nucleic acid molecules, characterized in that: the nucleic acid molecule set is a nucleic acid molecule set A or a nucleic acid molecule set B or a nucleic acid molecule set C;
the nucleic acid molecule set A consists of a nucleic acid molecule A and a nucleic acid molecule B;
the nucleic acid molecule set B consists of a nucleic acid molecule A, a nucleic acid molecule B and a nucleic acid molecule C;
the nucleic acid molecule set C consists of a nucleic acid molecule A, a nucleic acid molecule B, a nucleic acid molecule C, a nucleic acid molecule D and a nucleic acid molecule E;
the nucleic acid molecule A is a nucleic acid molecule encoding the protein A of claim 2;
the nucleic acid molecule B is a nucleic acid molecule encoding the protein B of claim 2;
the nucleic acid molecule C is a nucleic acid molecule encoding the protein C according to claim 2;
the nucleic acid molecule D is a nucleic acid molecule encoding the protein D according to claim 2;
the nucleic acid molecule E is a nucleic acid molecule which codes for a protein E as claimed in claim 2.
6. The set of nucleic acid molecules according to claim 5, wherein:
the nucleic acid molecule A is a DNA molecule shown as any one of (a1) to (a3) as follows:
(a1) a DNA molecule with the nucleotide sequence shown at positions 11 to 1477 of SEQ ID No. 10;
(a2) a DNA molecule which hybridizes under stringent conditions with the DNA molecule defined in (a1) and which encodes the protein A of claim 2;
(a3) a DNA molecule having 80% or more homology with the DNA sequence defined in (a1) or (a2) and encoding the protein A as defined in claim 2;
the nucleic acid molecule B is a DNA molecule shown as any one of (b1) to (b3) as follows:
(b1) a DNA molecule with the nucleotide sequence shown at positions 11 to 1501 of SEQ ID No. 8;
(b2) a DNA molecule which hybridizes under stringent conditions with the DNA molecule defined in (b1) and which encodes the protein B of claim 2;
(b3) a DNA molecule having 80% or more homology with the DNA sequence defined in (b1) or (b2) and encoding the protein B of claim 2;
the nucleic acid molecule C is a DNA molecule shown as any one of (c1) to (c3) as follows:
(c1) a DNA molecule with the nucleotide sequence shown at positions 11 to 2125 of SEQ ID No. 12;
(c2) a DNA molecule which hybridizes under stringent conditions with the DNA molecule defined in (c1) and which encodes the protein C of claim 2;
(c3) a DNA molecule having 80% or more homology with the DNA sequence defined in (c1) or (c2) and encoding the protein C of claim 2;
the nucleic acid molecule D is a DNA molecule shown as any one of (d1) to (d3) as follows:
(d1) a DNA molecule with the nucleotide sequence shown at positions 813 to 2249 of SEQ ID No. 3;
(d2) a DNA molecule which hybridizes under stringent conditions with the DNA molecule defined in (d1) and which encodes the protein D of claim 2;
(d3) a DNA molecule having 80% or more homology with the DNA sequence defined in (d1) or (d2) and encoding the protein D of claim 2;
the nucleic acid molecule E is a DNA molecule shown as any one of (e1) to (e3) as follows:
(e1) a DNA molecule with the nucleotide sequence shown at positions 493 to 2043 of SEQ ID No. 4;
(e2) a DNA molecule which hybridizes under stringent conditions with the DNA molecule defined in (e1) and which encodes the protein E of claim 2;
(e3) a DNA molecule having 80% or more homology with the DNA sequence defined in (e1) or (e2) and encoding the protein E of claim 2.
7. Any of the following biological materials:
(c1) a recombinant vector comprising the nucleic acid molecule of claim 3 or 4;
(c2) an expression cassette comprising the nucleic acid molecule of claim 3 or 4;
(c3) a transgenic cell line comprising the nucleic acid molecule of claim 3 or 4;
(c4) a recombinant bacterium comprising the nucleic acid molecule according to claim 3 or 4;
(c5) a complete set of recombinant vectors, which is the complete set of vectors A, the complete set of vectors B or the complete set of vectors C;
the complete set of vectors A consists of a recombinant vector A and a recombinant vector B;
the complete set of vectors B consists of the recombinant vector A, the recombinant vector B and a recombinant vector C;
the complete set of vectors C consists of the recombinant vector A, the recombinant vector B, the recombinant vector C, a recombinant vector D and a recombinant vector E;
the recombinant vector A is a recombinant vector containing the nucleic acid molecule A in claim 5 or 6; the recombinant vector B is a recombinant vector containing the nucleic acid molecule B in claim 5 or 6; the recombinant vector C is a recombinant vector containing the nucleic acid molecule C of claim 5 or 6; the recombinant vector D is a recombinant vector containing the nucleic acid molecule D of claim 5 or 6; the recombinant vector E is a recombinant vector containing the nucleic acid molecule E of claim 5 or 6;
(c6) a complete set of expression cassettes, which is complete expression cassette set A, complete expression cassette set B or complete expression cassette set C;
the complete expression cassette set A consists of an expression cassette A and an expression cassette B;
the complete expression cassette set B consists of the expression cassette A, the expression cassette B and an expression cassette C;
the complete expression cassette set C consists of the expression cassette A, the expression cassette B, the expression cassette C, an expression cassette D and an expression cassette E;
the expression cassette A is an expression cassette containing the nucleic acid molecule A of claim 5 or 6; the expression cassette B is an expression cassette containing the nucleic acid molecule B of claim 5 or 6; the expression cassette C is an expression cassette containing the nucleic acid molecule C of claim 5 or 6; the expression cassette D is an expression cassette containing the nucleic acid molecule D of claim 5 or 6; the expression cassette E is an expression cassette containing the nucleic acid molecule E of claim 5 or 6;
(c7) a complete set of transgenic cell lines, which is complete transgenic cell line set A, complete transgenic cell line set B or complete transgenic cell line set C;
the complete transgenic cell line set A consists of a transgenic cell line A and a transgenic cell line B;
the complete transgenic cell line set B consists of the transgenic cell line A, the transgenic cell line B and a transgenic cell line C;
the complete transgenic cell line set C consists of the transgenic cell line A, the transgenic cell line B, the transgenic cell line C, a transgenic cell line D and a transgenic cell line E;
the transgenic cell line A is a transgenic cell line containing the nucleic acid molecule A of claim 5 or 6; the transgenic cell line B is a transgenic cell line containing the nucleic acid molecule B of claim 5 or 6; the transgenic cell line C is a transgenic cell line containing the nucleic acid molecule C of claim 5 or 6; the transgenic cell line D is a transgenic cell line containing the nucleic acid molecule D of claim 5 or 6; the transgenic cell line E is a transgenic cell line containing the nucleic acid molecule E of claim 5 or 6;
(c8) a complete set of recombinant bacteria, which is complete recombinant bacterium set A, complete recombinant bacterium set B or complete recombinant bacterium set C;
the complete recombinant bacterium set A consists of a recombinant bacterium A and a recombinant bacterium B;
the complete recombinant bacterium set B consists of the recombinant bacterium A, the recombinant bacterium B and a recombinant bacterium C;
the complete recombinant bacterium set C consists of the recombinant bacterium A, the recombinant bacterium B, the recombinant bacterium C, a recombinant bacterium D and a recombinant bacterium E;
the recombinant bacterium A is a recombinant bacterium containing the nucleic acid molecule A of claim 5 or 6; the recombinant bacterium B is a recombinant bacterium containing the nucleic acid molecule B of claim 5 or 6; the recombinant bacterium C is a recombinant bacterium containing the nucleic acid molecule C of claim 5 or 6; the recombinant bacterium D is a recombinant bacterium containing the nucleic acid molecule D of claim 5 or 6; the recombinant bacterium E is a recombinant bacterium containing the nucleic acid molecule E of claim 5 or 6.
8. A method for constructing an engineered yeast for synthesizing diosgenin, comprising the following step: modifying a yeast capable of synthesizing cholesterol to express the protein A of claim 1, the protein B of claim 1 and a cytochrome P450 reductase, wherein the modified yeast is the target engineered bacterium.
9. The method of claim 8, wherein: the yeast capable of synthesizing cholesterol is prepared according to a method comprising the following steps: modifying the starting yeast to express sterol 7-position reductase and sterol 24-position reductase, wherein the modified yeast is the yeast capable of synthesizing cholesterol.
10. The method of claim 8, wherein: the target engineered bacterium is a recombinant yeast that expresses the protein A, the protein B and the cytochrome P450 reductase and is obtained by introducing the nucleic acid molecule A of claim 5 or 6, the nucleic acid molecule B of claim 5 or 6 and the gene encoding the cytochrome P450 reductase into the yeast capable of synthesizing cholesterol.
11. The method of claim 9, wherein: the yeast capable of synthesizing cholesterol is prepared according to a method comprising the following step: introducing the gene encoding the sterol 7-position reductase and the gene encoding the sterol 24-position reductase into the starting yeast to obtain a recombinant yeast expressing the sterol 7-position reductase and the sterol 24-position reductase, which is the yeast capable of synthesizing cholesterol.
12. The method of claim 8, wherein: the cytochrome P450 reductase is the protein C according to claim 2.
13. The method of claim 9, wherein: the sterol 7-position reductase is the protein D according to claim 2.
14. The method of claim 9, wherein: the sterol 24-position reductase is the protein E according to claim 2.
15. The method of claim 10, wherein: the cytochrome P450 reductase encoding gene is the nucleic acid molecule C according to claim 5 or 6.
16. The method of claim 11, wherein: the gene encoding sterol 7-position reductase is the nucleic acid molecule D according to claim 5 or 6.
17. The method of claim 11, wherein: the gene encoding sterol 24-position reductase is the nucleic acid molecule E according to claim 5 or 6.
18. The method of claim 10, wherein: the nucleic acid molecule A, the nucleic acid molecule B and the gene encoding the cytochrome P450 reductase are integrated into the Gal80 site of the genome of the yeast capable of synthesizing cholesterol.
19. The method of claim 11, wherein: the coding gene of sterol 7-position reductase and the coding gene of sterol 24-position reductase are integrated into the Gal7 locus of the genome of the starting yeast.
20. An engineered bacterium produced by the method of any one of claims 8 to 19.
21. Use of the protein of claim 1, the set of proteins of claim 2, the nucleic acid molecule of claim 3 or 4, the set of nucleic acid molecules of claim 5 or 6, the biological material of claim 7, or the engineered bacterium of claim 20 in the preparation of diosgenin.
22. A method for preparing diosgenin, comprising the following steps: carrying out fermentation culture of the engineered bacterium of claim 20 and collecting the fermentation product, wherein the fermentation product contains diosgenin.
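The construction route described in claims 8 to 19 can be read as two successive genomic integrations. The Python sketch below is only an illustrative bookkeeping model of that route, not part of the patent: the class names, function names and string labels are hypothetical, and it assumes the dependent-claim variant in which the reductase genes are the nucleic acid molecules C, D and E (claims 15 to 17) integrated at the Gal7 and Gal80 loci (claims 18 and 19).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Integration:
    """One genomic integration step (hypothetical model, not from the patent)."""
    locus: str      # integration site named in the claims
    genes: tuple    # labels of the coding genes introduced in this step


@dataclass(frozen=True)
class Strain:
    """A strain record: its name, the introduced genes and the steps applied."""
    name: str
    genes: frozenset = frozenset()
    history: tuple = ()

    def integrate(self, step: Integration, new_name: str) -> "Strain":
        # Return a new record; the original strain record is left unchanged.
        return Strain(new_name,
                      self.genes | frozenset(step.genes),
                      self.history + (step,))


# Claims 9, 11, 16, 17 and 19: sterol 7- and 24-position reductase genes at Gal7.
GAL7_STEP = Integration("Gal7", ("nucleic acid molecule D",
                                 "nucleic acid molecule E"))

# Claims 8, 10, 15 and 18: genes for protein A, protein B and the
# cytochrome P450 reductase at Gal80.
GAL80_STEP = Integration("Gal80", ("nucleic acid molecule A",
                                   "nucleic acid molecule B",
                                   "nucleic acid molecule C"))


def build_engineered_strain(starting: Strain) -> Strain:
    """Apply the two integrations in the order implied by claims 9 and 8."""
    cholesterol_yeast = starting.integrate(
        GAL7_STEP, "yeast capable of synthesizing cholesterol")
    return cholesterol_yeast.integrate(
        GAL80_STEP, "target engineered bacterium")


if __name__ == "__main__":
    strain = build_engineered_strain(Strain("starting yeast"))
    for step in strain.history:
        print(step.locus, "<-", ", ".join(step.genes))
```

Keeping each step as an immutable record makes the order of the integrations explicit: the cholesterol-synthesizing intermediate of claim 9 exists as its own record before the diosgenin-related genes are added.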
CN201911021712.0A 2019-10-25 2019-10-25 Dioscorea zingiberensis-derived diosgenin synthesis related protein, coding gene and application Active CN112708602B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911021712.0A CN112708602B (en) 2019-10-25 2019-10-25 Dioscorea zingiberensis-derived diosgenin synthesis related protein, coding gene and application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911021712.0A CN112708602B (en) 2019-10-25 2019-10-25 Dioscorea zingiberensis-derived diosgenin synthesis related protein, coding gene and application

Publications (2)

Publication Number Publication Date
CN112708602A (en) 2021-04-27
CN112708602B (en) 2022-04-05

Family

ID=75541389

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911021712.0A Active CN112708602B (en) 2019-10-25 2019-10-25 Dioscorea zingiberensis-derived diosgenin synthesis related protein, coding gene and application

Country Status (1)

Country Link
CN (1) CN112708602B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113817692B (en) * 2021-08-27 2023-01-20 Shanghai University Diosgenin synthesis related protein, coding gene and application thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101705275A (en) * 2009-10-30 2010-05-12 Shaanxi University of Science and Technology Process for producing diosgenin and method for processing peltate yam after extracting same
CN109097343A (en) * 2018-08-09 2018-12-28 Tianjin Institute of Industrial Biotechnology of CAS Steroid 11β-hydroxylase in Curvularia lunata, encoding gene and application thereof
CN110396403A (en) * 2018-04-24 2019-11-01 Shanghai Jiao Tong University Near-infrared fluorescent probe targeting the CYP1B1 enzyme, and preparation and use thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2357766A (en) * 1999-12-24 2001-07-04 Univ Wales Aberystwyth Production of heterologous proteins

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
NP_001008645.1;NCBI;《NCBI》;20190324;full text *
Q7SXF1.1;NCBI;《NCBI》;20190731;full text *
The effect of crude diosgenin extract from purple and yellow greater yams (Dioscorea alata) on the lipid profile of dyslipidemia rats;Harijono et al.;《EMIRATES JOURNAL OF FOOD AND AGRICULTURE》;20160630;Vol. 28, No. 7;full text *
XP_002270732.2;NCBI;《NCBI》;20161122;full text *
Synthetic biology research on plant natural products;Dai Zhubo et al.;《Bulletin of Chinese Academy of Sciences》;20181120;Vol. 33, No. 11;full text *

Also Published As

Publication number Publication date
CN112708602A (en) 2021-04-27

Similar Documents

Publication Publication Date Title
CN109402212B (en) Method for preparing tauroursodeoxycholic acid through biotransformation and application thereof
CN108060092B (en) Recombinant bacterium and application thereof
CN109097343B (en) Steroid 11 beta-hydroxylase in Curvularia lunata as well as coding gene and application thereof
CN111434773B (en) Recombinant yeast for high-yield sandalwood oil and construction method and application thereof
CN111205993B (en) Recombinant yeast for producing ursolic acid and oleanolic acid as well as construction method and application thereof
CN110607286B (en) Application of Grifola frondosa ergothioneine genes Gfegt1 and Gfegt2 in synthesis of ergothioneine
CN115011616B (en) Acetaldehyde dehydrogenase gene RKALDH and application thereof
CN110747178B (en) Application of Tripterygium wilfordii cytochrome P450 oxidase in preparation of abietane-type diterpene compound
CN110628738B (en) Method for improving activity of glucose oxidase, mutant and application thereof
CN116987603A (en) Recombinant Saccharomyces cerevisiae strain for high yield of cannabigerolic acid as well as construction method and application thereof
CN112708602B (en) Dioscorea zingiberensis-derived diosgenin synthesis related protein, coding gene and application
CN108034667A (en) A kind of red monascus alpha-amylase gene, its preparation method and application
JP2022530774A (en) Vanillin biosynthesis from isoeugenol
CN107488638B (en) 15 α -hydroxylase and preparation method and application thereof
CN109097342B (en) Steroid 11 beta-hydroxylase in Absidia coerulea, coding gene and application thereof
CN112080479A (en) 17 beta-hydroxysteroid dehydrogenase mutant and application thereof
CN111748535B (en) Alanine dehydrogenase mutant and application thereof in fermentation production of L-alanine
CN109628476B (en) Method for producing 4-hydroxyisoleucine by using whole cell transformation
CN111334522A (en) Recombinant Saccharomyces cerevisiae for producing ambergris alcohol and construction method thereof
CN110904062B (en) Strain capable of producing L-alanine at high yield
CN107287172B (en) Method for producing thymidine phosphorylase by using Escherichia coli fermentation
CN111808836B (en) Heat-resistant mutant enzyme of pullulanase I and preparation method and application thereof
CN115976118A (en) Method and carrier for biologically synthesizing nootkatone
CN115873836A (en) Nerolidol synthetase and application thereof
CN112646834A (en) Lupeol derivative and synthesis method and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant