WO2018210208A1 - Glycosyltransferase, mutant, and application thereof - Google Patents

Publication number
WO2018210208A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
amino acid
polypeptide
acid sequence
glycosyltransferase
Prior art date
Application number
PCT/CN2018/086738
Other languages
French (fr)
Chinese (zh)
Inventor
周志华
严兴
王平平
魏维
李晓东
Original Assignee
中国科学院上海生命科学研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院上海生命科学研究院
Priority to KR1020197037134A (KR102418138B1)
Priority to JP2019563883A (JP7086107B2)
Priority to CN201880005455.9A (CN110462033A)
Publication of WO2018210208A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/1048 Glycosyltransferases (2.4)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8242 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
    • C12N15/8243 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P33/00 Preparation of steroids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P33/00 Preparation of steroids
    • C12P33/20 Preparation of steroids containing heterocyclic rings

Definitions

  • the present invention relates to the field of biotechnology and plant biology, and in particular, to a glycosyltransferase, a glycosyltransferase mutant for use in the synthesis of ginsenoside Rh2, and uses thereof.
  • Ginsenosides are the main active substances in plants of the genus Panax (such as ginseng, notoginseng, and American ginseng). In recent years, some ginsenosides have also been found in the Cucurbitaceae plant Gynostemma pentaphyllum. To date, researchers have isolated at least 100 saponins from ginseng, notoginseng, and related plants. Ginsenosides are triterpenoid saponins. Some ginsenosides have been shown to have a wide range of physiological functions and medicinal properties, including anti-tumor, immunomodulatory, anti-fatigue, cardioprotective, and hepatoprotective effects. Many of these saponins are already in clinical use.
  • the drug Shenyi Capsule, whose main component is the ginsenoside Rg3 monomer, relieves symptoms of qi deficiency in cancer patients and improves immune function.
  • Jinshen Capsule, whose main component is the ginsenoside Rh2 monomer, is a health-care medicine for improving immunity and enhancing disease resistance.
  • a ginsenoside is a biologically active small molecule formed by glycosylation of a sapogenin.
  • the sapogenins of ginsenosides are mainly the dammarane-type protopanaxadiol and protopanaxatriol, and the oleanane-type amyrins.
  • the structural differences between ginsenosides lie mainly in the different glycosylation of the sapogenins.
  • the sugar chain of a ginsenoside is generally bound to the hydroxyl group at C3, C6, or C20 of the sapogenin; the sugars may be glucose, rhamnose, xylose, or arabinose.
  • ginsenosides Rb1, Rd and Rc are saponins whose sapogenin is protopanaxadiol. They differ only in their glycosylation, yet their physiological functions differ considerably.
  • Rb1 stabilizes the central nervous system, whereas Rc suppresses it.
  • Rb1 has a wide range of physiological functions, while Rd has only a very limited number of functions.
  • Rare ginsenoside refers to a saponin with very low content in ginseng.
  • the ginsenoside Rh2 (3-O-β-(D-glucopyranosyl)-20(S)-protopanaxadiol) is a protopanaxadiol-type saponin in which a glucosyl group is attached to the C-3 hydroxyl group of the sapogenin.
  • the content of ginsenoside Rh2 is only about one ten-thousandth of the dry weight of ginseng.
  • ginsenoside Rh2 has good antitumor activity and is one of the most important antitumor active ingredients in ginseng. It can inhibit tumor cell growth, induce tumor cell apoptosis, and inhibit tumor metastasis.
  • ginsenoside Rh2 can inhibit the proliferation of 3LL lung cancer cells (mouse), Morris hepatoma cells (rat), B-16 melanoma cells (mouse), and HeLa cells (human).
  • ginsenoside Rh2 combined with radiotherapy or chemotherapy can enhance the effects of radiotherapy and chemotherapy.
  • ginsenoside Rh2 also has anti-allergic effects, enhances the body's immunity, and inhibits the inflammation caused by NO and PGE.
  • the function of a glycosyltransferase is to transfer a glycosyl group from a glycosyl donor (a nucleoside diphosphate sugar, such as UDP-glucose) to a glycosyl acceptor.
  • glycosyl acceptors of glycosyltransferases include sugars, lipids, proteins, nucleic acids, antibiotics, and other small molecules.
  • a glycosyltransferase involved in saponin glycosylation in ginseng transfers a glycosyl group from a glycosyl donor to the C3, C6 or C20 hydroxyl group of a sapogenin or aglycone, thereby forming saponins with different medicinal values.
  • the object of the present invention is to provide a kind of glycosyltransferase and its application for synthesizing rare ginsenoside Rh2 and ginsenoside F2.
  • the amino acid sequence of the isolated polypeptide is non-Gln at the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19, and/or non-Ala at the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19;
  • the amino acid sequence of the isolated polypeptide is non-Asn (N) at the amino acid residue corresponding to position 247 of the amino acid sequence shown in SEQ ID NO: 19, and/or non-Lys (K) at the amino acid residue corresponding to position 280 of the amino acid sequence shown in SEQ ID NO: 19.
  • the isolated polypeptide :
  • the sequence defined by i) is passed through one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1
  • the isolated polypeptide :
  • the sequence defined by i) is passed through one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1
  • the isolated polypeptide has the sequence defined by i) passing through one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably.
  • amino acid sequence of the isolated polypeptide at amino acid residue corresponding to position 222 of the amino acid sequence set forth in SEQ ID NO: 19 is selected from at least one of the following amino acids: His, Asn, Gln, Lys and Arg.
  • the amino acid sequence of the isolated polypeptide is His at the amino acid residue corresponding to position 222 of the amino acid sequence shown by SEQ ID NO: 19.
  • amino acid sequence of the isolated polypeptide at amino acid residue corresponding to position 322 of the amino acid sequence set forth in SEQ ID NO: 19 is selected from at least one of the following amino acids: Val, Ile, Leu, Met and Phe.
  • amino acid sequence of the isolated polypeptide is Val at amino acid residue corresponding to position 322 of the amino acid sequence set forth in SEQ ID NO: 19.
  • the amino acid sequence of the isolated polypeptide is His at the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19, and Val at the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19.
  • amino acid sequence of the isolated polypeptide at amino acid residue corresponding to position 247 of the amino acid sequence set forth in SEQ ID NO: 19 is selected from at least one of the following amino acids: Ser(S), Pro (P), Ala (A) or Thr (T).
  • amino acid sequence of the isolated polypeptide is Ser(S) at the amino acid residue corresponding to position 247 of the amino acid sequence shown by SEQ ID NO: 19.
  • amino acid sequence of the isolated polypeptide at amino acid residue corresponding to position 280 of the amino acid sequence set forth in SEQ ID NO: 19 is selected from at least one of the following amino acids: Ile (I), Asn (N), Ser (S) or Ala (A).
  • the amino acid sequence of the isolated polypeptide is Ile(I) at the amino acid residue corresponding to position 280 of the amino acid sequence set forth in SEQ ID NO: 19.
  • the amino acid sequence of the isolated polypeptide is Ser (S) at the amino acid residue corresponding to position 247 of the amino acid sequence set forth in SEQ ID NO: 19, and Ile (I) at the amino acid residue corresponding to position 280 of the amino acid sequence set forth in SEQ ID NO: 19.
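The residue constraints listed above can be read as a simple per-position check. The following is a minimal sketch, not the applicant's method: it assumes the query sequence already uses the numbering of SEQ ID NO: 19 (1-based, no alignment step), and the names `EXCLUDED`, `PREFERRED`, `check_positions` and `satisfies_and_or` are hypothetical. The sequence used in the usage note is a toy placeholder, not SEQ ID NO: 19.

```python
from typing import Dict, Sequence

# Residues the text excludes at each position (one-letter codes).
EXCLUDED = {222: "Q", 322: "A", 247: "N", 280: "K"}
# Residues the text names as the single preferred choice at each position.
PREFERRED = {222: "H", 322: "V", 247: "S", 280: "I"}


def residue_at(seq: str, position: int) -> str:
    """Return the residue at a 1-based position (assumed SEQ ID NO: 19 numbering)."""
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} is outside a sequence of length {len(seq)}")
    return seq[position - 1]


def check_positions(seq: str, positions: Sequence[int]) -> Dict[int, str]:
    """Label each position as 'excluded', 'preferred', or 'other'."""
    report = {}
    for pos in positions:
        aa = residue_at(seq, pos)
        if aa == EXCLUDED[pos]:
            report[pos] = "excluded"
        elif aa == PREFERRED[pos]:
            report[pos] = "preferred"
        else:
            report[pos] = "other"
    return report


def satisfies_and_or(seq: str, positions: Sequence[int]) -> bool:
    """'and/or' wording: at least one listed position must avoid its excluded residue."""
    return any(label != "excluded" for label in check_positions(seq, positions).values())
```

For example, a toy 330-residue string with His at position 222 and Ala at position 322 is reported as `{222: "preferred", 322: "excluded"}` and still satisfies the "and/or" condition for the 222/322 pair.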
  • the isolated polypeptide :
  • amino acid sequence shown in SEQ ID NO: 19 and the amino acid residue at position 222 is non-Gln and/or the amino acid residue at position 322 is non-Ala, or
  • the sequence defined by iii) is passed through one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1
  • the isolated polypeptide :
  • amino acid sequence shown in SEQ ID NO: 19 and the amino acid residue at position 247 is non-Asn (N) and/or the amino acid residue at position 280 is non-Lys (K), or
  • the sequence defined by iii) is passed through one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1
  • the isolated polypeptide has a sequence defined by iii) via one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably.
  • amino acid sequence of the isolated polypeptide having the amino acid sequence at position 222 of the amino acid sequence set forth in SEQ ID NO: 19 is selected from at least one of the following amino acids: His, Asn, Gln, Lys, and Arg.
  • the amino acid sequence of the isolated polypeptide has the amino acid residue at position 222 of the amino acid sequence set forth in SEQ ID NO: 19 as His.
  • amino acid sequence of the isolated polypeptide having the amino acid sequence at position 322 of the amino acid sequence set forth in SEQ ID NO: 19 is selected from at least one of the following amino acids: Val, Ile, Leu, Met, and Phe.
  • amino acid sequence of the isolated polypeptide having the amino acid sequence at position 322 of the amino acid sequence set forth in SEQ ID NO: 19 is Val.
  • the amino acid sequence of the isolated polypeptide is His at the amino acid residue at position 222 of the amino acid sequence shown in SEQ ID NO: 19, and Val at the amino acid residue at position 322 of the amino acid sequence shown in SEQ ID NO: 19.
  • amino acid sequence of the isolated polypeptide having the amino acid sequence at position 247 of the amino acid sequence set forth in SEQ ID NO: 19 is selected from at least one of the following amino acids: Ser (S), Pro (P), Ala (A) or Thr (T).
  • amino acid sequence of the isolated polypeptide having the amino acid sequence at position 247 of the amino acid sequence set forth in SEQ ID NO: 19 is Ser(S).
  • amino acid sequence of the isolated polypeptide having the amino acid sequence at position 280 of the amino acid sequence set forth in SEQ ID NO: 19 is selected from at least one of the following amino acids: Ile (I), Asn (N), Ser (S) or Ala (A).
  • amino acid sequence of the isolated polypeptide having the amino acid sequence at position 280 of the amino acid sequence set forth in SEQ ID NO: 19 is Ile(I).
  • the amino acid sequence of the isolated polypeptide has Ser (S) as the amino acid residue at position 247 of the amino acid sequence shown in SEQ ID NO: 19, and Ile (I) as the amino acid residue at position 280 of the amino acid sequence shown in SEQ ID NO: 19.
  • the isolated polypeptide is a glycosyltransferase.
  • the glycosyltransferase is derived from a plant of the genus Panax.
  • the glycosyltransferase is derived from ginseng, American ginseng, and/or notoginseng.
  • polypeptide is selected from the group consisting of:
  • sequence (c) is a fusion protein formed by adding a tag sequence, a signal sequence or a secretion signal sequence to (a) or (b).
  • the glycosyltransferase activity refers to an activity capable of transferring a glycosyl group of a glycosyl donor to a hydroxyl group at the C-3 position of a tetracyclic triterpenoid.
  • the glycosyltransferase increases the yield of ginsenoside Rh2 and/or ginsenoside F2.
  • in the artificially constructed strain, the glycosyltransferase can increase the yield of ginsenoside Rh2, preferably by 5-150%, more preferably by 10-100%, more preferably by 20-80%, most preferably by 28-70%.
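The nested yield ranges above amount to simple percentage arithmetic. The sketch below is illustrative only: it computes a relative increase and reports the narrowest stated range containing it. The tier labels and function names are hypothetical, and any titers passed in are placeholders rather than data from the application.

```python
from typing import Optional

# Ranges from the text, ordered from widest to narrowest (percent increase).
# The labels are hypothetical shorthand for the "preferably" tiers.
RANGES = [
    ("preferred", 5.0, 150.0),
    ("more preferred", 10.0, 100.0),
    ("still more preferred", 20.0, 80.0),
    ("most preferred", 28.0, 70.0),
]


def percent_increase(baseline: float, improved: float) -> float:
    """Relative increase of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline yield must be positive")
    return (improved - baseline) / baseline * 100.0


def narrowest_range(pct: float) -> Optional[str]:
    """Return the label of the narrowest stated range containing pct, or None."""
    label = None
    for name, low, high in RANGES:
        if low <= pct <= high:
            label = name  # later entries are nested inside earlier ones
    return label
```

With placeholder titers of 100 and 150 (arbitrary units), `percent_increase` gives 50.0, which falls in the narrowest 28-70% tier.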
  • the artificially constructed strain is selected from the group consisting of a Saccharomyces cerevisiae strain, an Escherichia coli strain, a Pichia strain, a fission yeast strain, and a Kluyveromyces strain.
  • an isolated polynucleotide is provided, the polynucleotide being a sequence selected from the group consisting of:
  • (C) a nucleotide sequence as set forth in SEQ ID NO.: 3, SEQ ID NO.: 22, SEQ ID NO.: 36, SEQ ID NO.: 38, SEQ ID NO.: 40 or SEQ ID NO.: 42;
  • (E) a nucleotide sequence formed by truncating, or adding 1-60 (preferably 1-30, more preferably 1-10) nucleotides to, the 5' end and/or the 3' end of the nucleotide sequence of SEQ ID NO.: 3, SEQ ID NO.: 22, SEQ ID NO.: 36, SEQ ID NO.: 38, SEQ ID NO.: 40 or SEQ ID NO.: 42;
  • a vector comprising the polynucleotide of the second aspect of the invention is provided.
  • the vector comprises an expression vector, a shuttle vector, an integration vector.
  • in a fourth aspect of the invention, there is provided the use of the isolated polypeptide of the first aspect of the invention to catalyze the following reaction, or to prepare a catalytic preparation which catalyzes the following reaction:
  • the glycosyl donor comprises a nucleoside diphosphate sugar selected from the group consisting of UDP-glucose, ADP-glucose, TDP-glucose, CDP-glucose, GDP-glucose, UDP-acetylglucose, ADP-acetylglucose, TDP-acetylglucose, CDP-acetylglucose, GDP-acetylglucose, UDP-xylose, ADP-xylose, TDP-xylose, CDP-xylose, GDP-xylose, UDP-xylose, UDP-galacturonic acid, ADP-galacturonic acid, TDP-galacturonic acid, CDP-galacturonic acid, GDP-galacturonic acid, UDP-galactose, ADP-galactose, TDP-galactose, CDP-galactose, GDP-galactose, U
  • the glycosyl donor comprises a uridine diphosphate (UDP) sugar selected from the group consisting of UDP-glucose, UDP-xylose, UDP-galacturonic acid, UDP-galactose, UDP-arabinose, UDP-rhamnose, or other uridine diphosphate hexose or uridine diphosphate pentose, or a combination thereof.
  • the isolated polypeptide is used to catalyze the following reactions or to prepare a catalytic preparation that catalyzes the following reactions:
  • R1 is H or OH
  • R2 is H or OH
  • R3 is H or a glycosyl group
  • R4 is a glycosyl group.
  • the isolated polypeptide is used to catalyze the following reactions or to prepare a catalytic preparation that catalyzes the following reactions:
  • the polypeptide is selected from the group consisting of SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: A polypeptide or a polypeptide derived therefrom.
  • the glycosyl group is selected from the group consisting of glucosyl, galacturonic acid group, xylosyl, galactosyl, arabinosyl, rhamnosyl, and other hexosyl or pentosyl groups.
  • the reaction product of reaction (A) and/or (B) includes, but is not limited to, a dammarane-type tetracyclic triterpenoid of the S configuration or the R configuration, a lanostane-type tetracyclic triterpenoid, an apotirucallane-type tetracyclic triterpenoid, a tirucallane-type tetracyclic triterpenoid, a cycloartane-type tetracyclic triterpenoid, a cucurbitane-type tetracyclic triterpenoid, or a meliacane-type tetracyclic triterpenoid.
  • an in vitro glycosylation method comprising the steps of:
  • glycosyltransferase is the polypeptide of the first aspect of the invention or a polypeptide derived therefrom.
  • the derivative polypeptide is selected from the group consisting of:
  • the glycosyltransferase activity refers to an activity capable of transferring a glycosyl group of a glycosyl donor to a hydroxyl group at the C-3 position of a tetracyclic triterpenoid.
  • a method of performing a glycosyl-catalyzed reaction comprising the steps of performing a glycosyl-catalyzed reaction in the presence of the polypeptide of the first aspect of the invention or a polypeptide derived therefrom.
  • the method further includes the steps of:
  • the compound of formula (I) is converted to the compound of formula (II) in the presence of a glycosyl donor and a polypeptide of the first aspect of the invention or a polypeptide derived therefrom.
  • the compound of formula (I) is protopanaxadiol (PPD), and the compound of formula (II) is ginsenoside Rh2;
  • the compound of formula (I) is Compound K and the compound of formula (II) is ginsenoside F2.
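Both conversions above add one glucosyl unit to the acceptor, which can be checked with formula arithmetic. The sketch below assumes glucose as the transferred sugar and uses standard molecular formulas that are not stated in this text (protopanaxadiol C30H52O3, Compound K C36H62O8) together with rounded standard atomic masses. The function names are hypothetical.

```python
from collections import Counter

# Rounded standard atomic masses (g/mol); an assumption of this sketch.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Standard molecular formulas, taken as assumptions (not given in the text).
PPD = Counter(C=30, H=52, O=3)          # protopanaxadiol
COMPOUND_K = Counter(C=36, H=62, O=8)   # Compound K
GLUCOSYL = Counter(C=6, H=10, O=5)      # glucose (C6H12O6) minus water


def glycosylate(acceptor: Counter) -> Counter:
    """Add one glucosyl unit to an acceptor formula (net effect of O-glucosylation)."""
    return acceptor + GLUCOSYL


def molar_mass(formula: Counter) -> float:
    """Molar mass in g/mol from the element counts."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def as_text(formula: Counter) -> str:
    """Render a C/H/O formula in conventional order."""
    return "".join(f"{element}{formula[element]}" for element in ("C", "H", "O"))
```

Under these assumptions, `as_text(glycosylate(PPD))` gives the formula of ginsenoside Rh2 (C36H62O8) and `as_text(glycosylate(COMPOUND_K))` gives that of ginsenoside F2 (C42H72O13). Rh2 and Compound K share a formula because they differ only in where the glucose is attached (C-3 versus C-20).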
  • the method further comprises separately adding the polypeptide and the polypeptide derived therefrom to a catalytic reaction; and/or
  • the polypeptide and its derived polypeptide are simultaneously added to the catalytic reaction.
  • the method further comprises co-expressing, in a host cell, a nucleotide sequence encoding the glycosyltransferase together with key genes in the anabolic pathway of dammarenediol and/or protopanaxadiol and/or other glycosyltransferase genes, to obtain the compound of formula (II).
  • the compound of formula (II) is ginsenoside Rh2 or ginsenoside F2.
  • the host cell is a yeast or Escherichia coli.
  • the method further comprises: providing an additive for regulating enzyme activity to the reaction system.
  • the additive for regulating enzyme activity is an additive that increases enzyme activity or inhibits enzyme activity.
  • the additive for regulating enzyme activity is selected from the group consisting of Ca2+, Co2+, Mn2+, Ba2+, Al3+, Ni2+, Zn2+, or Fe2+.
  • the additive for regulating enzyme activity is a substance that provides Ca2+, Co2+, Mn2+, Ba2+, Al3+, Ni2+, Zn2+, or Fe2+.
  • the glycosyl donor is a nucleoside diphosphate sugar selected from the group consisting of UDP-glucose, ADP-glucose, TDP-glucose, CDP-glucose, GDP-glucose, UDP-acetylglucose, ADP-acetylglucose, TDP-acetylglucose, CDP-acetylglucose, GDP-acetylglucose, UDP-xylose, ADP-xylose, TDP-xylose, CDP-xylose, GDP-xylose, UDP-xylose, UDP-galacturonic acid, ADP-galacturonic acid, TDP-galacturonic acid, CDP-galacturonic acid, GDP-galacturonic acid, UDP-galactose, ADP-galactose, TDP-galactose, CDP-galactose, GDP-galactose,
  • the glycosyl donor is a uridine diphosphate sugar selected from the group consisting of UDP-glucose, UDP-xylose, UDP-galacturonic acid, UDP-galactose, UDP-arabinose, UDP-rhamnose, or other uridine diphosphate hexose or uridine diphosphate pentose, or a combination thereof.
  • the pH of the reaction system is: pH 4.0 to 10.0, preferably pH 5.5 to 9.0.
  • the temperature of the reaction system is from 10 °C to 105 °C, preferably from 20 °C to 50 °C.
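The pH and temperature windows stated above can be expressed as a small validation helper. This is a minimal sketch for illustration: the class and function names are hypothetical, and only the numeric bounds come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Closed numeric interval [low, high]."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Bounds as stated in the text.
PH_BROAD = Window(4.0, 10.0)
PH_PREFERRED = Window(5.5, 9.0)
TEMP_BROAD_C = Window(10.0, 105.0)
TEMP_PREFERRED_C = Window(20.0, 50.0)


def classify_conditions(ph: float, temp_c: float) -> str:
    """Classify a (pH, temperature in degrees C) pair against the stated windows."""
    if not (PH_BROAD.contains(ph) and TEMP_BROAD_C.contains(temp_c)):
        return "outside stated range"
    if PH_PREFERRED.contains(ph) and TEMP_PREFERRED_C.contains(temp_c):
        return "preferred"
    return "within stated range"
```

For instance, pH 7.5 at 30 °C falls in both preferred windows, whereas pH 4.5 at 30 °C is inside the broad pH window only.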
  • the key genes in the dammarenediol anabolic pathway include, but are not limited to, the dammarenediol synthase gene.
  • the key genes in the protopanaxadiol anabolic pathway include, but are not limited to, a dammarenediol synthase gene, the cytochrome P450 gene CYP716A47 for protopanaxadiol synthesis and its reductase gene, or a combination thereof.
  • the substrate of the glycosyl catalyzed reaction is a compound of formula (I), and the product is a compound of formula (II).
  • the compound of formula (I) is protopanaxadiol (PPD), and the compound of formula (II) is ginsenoside Rh2;
  • the compound of formula (I) is Compound K and the compound of formula (II) is ginsenoside F2.
  • a genetically engineered host cell is provided, which comprises the vector of the third aspect of the invention, or has the polynucleotide of the second aspect of the invention integrated into its genome.
  • the glycosyltransferase is the polypeptide of the first aspect of the invention or a polypeptide derived therefrom.
  • nucleotide sequence encoding the glycosyltransferase is as described in the second aspect of the invention.
  • the cell is a prokaryotic cell or a eukaryotic cell.
  • the host cell is a eukaryotic cell, such as a yeast cell or a plant cell.
  • the host cell is a Saccharomyces cerevisiae cell.
  • the host cell is a prokaryotic cell, such as E. coli.
  • the host cell is a ginseng cell.
  • the host cell is not a cell that naturally produces a compound of formula (II).
  • the host cell is not a cell that naturally produces ginsenoside Rh2 or ginsenoside F2.
  • the key genes in the dammarenediol anabolic pathway include, but are not limited to, the dammarenediol synthase gene.
  • the host cell contains key genes in the protopanaxadiol anabolic pathway including, but not limited to, a dammarenediol synthase gene, the cytochrome P450 gene CYP716A47 for protopanaxadiol synthesis and its reductase gene, or a combination thereof.
  • use of the host cell of the seventh aspect of the invention for the preparation of an enzyme catalytic reagent, or for the production of a glycosyltransferase, or as a catalytic cell, or for the production of glycosylated tetracyclic triterpenoids.
  • the tetracyclic triterpenoid is a compound of formula (II).
  • the host cell is used to produce a compound of formula (II) by glycosylation of a compound of formula (I).
  • the host cell is used to produce ginsenoside Rh2 and/or ginsenoside F2 by glycosylation of protopanaxadiol PPD and/or Compound K.
  • a method of producing a transgenic plant, comprising the step of regenerating the genetically engineered host cell of the seventh aspect of the invention into a plant, wherein said genetically engineered host cell is a plant cell.
  • the genetically engineered host cell is selected from the group consisting of: ginseng cells, American ginseng cells, notoginseng cells, and tobacco cells.
  • Figure 1 shows the agarose gel electrophoresis of the glycosyltransferase gene Pn50.
  • Figure 2 shows the TLC analysis of ginsenosides produced with the glycosyltransferase Pn50.
  • Figure 3 shows the HPLC analysis of ginsenoside Rh2 production by the recombinant S. cerevisiae strain using Pn50.
  • Figure 4 is a comparative analysis of the Rh2 fermentation yields of recombinant Saccharomyces cerevisiae strains constructed with the Pn50 protein mutants Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 and UGT-MUT3.
  • through extensive and intensive research, the present inventors provide, for the first time, mutants of the notoginseng glycosyltransferase Pn50 (SEQ ID NO.: 4) and of the ginseng-derived glycosyltransferase UGTPg45 (SEQ ID NO.: 21). The mutants of the notoginseng glycosyltransferase Pn50 are:
  • Pn50-Q222H-VE (SEQ ID NO.: 41)
  • UGT-MUT1 (SEQ ID NO.: 35)
  • UGT-MUT2 (SEQ ID NO.: 37)
  • UGT-MUT3 (SEQ ID NO.: 39)
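The mutant name Pn50-Q222H-VE contains the label "Q222H", which in the usual substitution shorthand means that Gln (Q) at position 222 is replaced by His (H). The sketch below parses such a label and applies it to a sequence string. It is an illustration only: the function names are hypothetical, the "VE" suffix is not interpreted, and the sequence in the usage note is a toy placeholder rather than any SEQ ID of the application.

```python
import re
from typing import Tuple

# One-letter wild-type residue, 1-based position, one-letter replacement.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Tuple[str, int, str]:
    """Split a label such as 'Q222H' into (wild-type residue, position, new residue)."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a single-residue substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitution(seq: str, label: str) -> str:
    """Return a copy of seq with the substitution applied, checking the wild-type residue."""
    wild_type, position, replacement = parse_substitution(label)
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} is outside a sequence of length {len(seq)}")
    if seq[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {seq[position - 1]}"
        )
    return seq[: position - 1] + replacement + seq[position:]
```

Applied to a toy string with Q at position 222, `apply_substitution(toy, "Q222H")` returns a string of the same length with H at that position. The wild-type check guards against applying a label to a sequence that uses a different numbering.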
  • the glycosyltransferase of the present invention is capable of specifically and efficiently catalyzing the conversion of a tetracyclic triterpenoid substrate and/or transferring a glycosyl group from a glycosyl donor to the hydroxyl group at the C-3 position of a tetracyclic triterpenoid.
  • the protopanaxadiol PPD can be efficiently converted into the rare ginsenoside Rh2 having antitumor activity, and the ginsenoside Compound K (having a glycosyl modification at the C20 position of the PPD) can be converted into the product ginsenoside F2.
  • the recombinant Saccharomyces cerevisiae strain ZWBY04RS-Pn50, constructed by introducing the Panax notoginseng-derived glycosyltransferase gene Pn50 into a protopanaxadiol-producing Saccharomyces cerevisiae strain, can synthesize the rare ginsenoside Rh2 and is introduced into the wild-type ginseng source.
  • Pn50-Q222H-VE was further engineered by site-directed mutagenesis to obtain the mutant gene UGT-MUT1, and the Rh2 yield of the ginsenoside Rh2-synthesizing strain ZWBY04RS-MUT1 constructed with UGT-MUT1 was compared with that of the strain constructed using UGTPg45.
  • Pn50-Q222H-VE was further engineered by site-directed mutagenesis to obtain the mutant gene UGT-MUT2, and the Rh2 yield of the ginsenoside Rh2-synthesizing strain ZWBY04RS-MUT2 constructed with UGT-MUT2 was compared with that of the strain constructed using UGTPg45.
  • Pn50-Q222H-VE was further engineered by site-directed mutagenesis to obtain the mutant gene UGT-MUT3, and the Rh2 yield of the ginsenoside Rh2-synthesizing strain ZWBY04RS-MUT3 constructed with UGT-MUT3 was compared with that of the strain constructed using UGTPg45.
  • the invention also provides methods of transformation and catalysis.
  • the glycosyltransferase of the present invention may also be co-expressed in host cells with key enzymes in the anabolic pathway of dammarenediol and/or protopanaxadiol (for example, the dammarenediol synthase gene PgDDS, the cytochrome P450 gene CYP716A47 for protopanaxadiol synthesis, and its reductase gene PgCPR1), or used in the preparation of genetically engineered cells producing ginsenoside Rh2, and used to construct strains for the artificial synthesis of the rare ginsenoside Rh2.
  • the glycosyltransferase of the present invention can also be co-expressed in a host cell with a key enzyme in the metabolic pathway of dammarenediol and/or protopanaxadiol, and used to construct a strain for the artificial synthesis of the rare ginsenoside Rh2.
  • the present invention has been completed on this basis.
  • glycosyltransferase of the present invention has an activity of transferring a glycosyl group of a glycosyl donor to a hydroxyl group at the C-3 position of a tetracyclic triterpenoid.
  • the amino acid sequence of the glycosyltransferase of the present invention is non-Gln at the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19, and/or non-Ala at the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19.
  • in a preferred embodiment of the invention, the glycosyltransferase of the invention:
  • the sequence defined by i) is passed through one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1
  • the glycosyltransferase of the invention is a polypeptide derived from i), whose sequence is formed from the sequence defined by i) by the addition of one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1, and which substantially retains the function of the isolated polypeptide defined by i).
  • the amino acid sequence of the glycosyltransferase of the present invention has one of the following amino acids at the amino acid residue corresponding to position 222 of the amino acid sequence shown by SEQ ID NO: 19: His, Asn, Gln, Lys, and Arg.
  • the amino acid sequence of the glycosyltransferase of the present invention is His at the amino acid residue corresponding to position 222 of the amino acid sequence shown by SEQ ID NO: 19.
  • the amino acid sequence of the glycosyltransferase of the present invention has one of the following amino acids at the amino acid residue corresponding to position 322 of the amino acid sequence shown by SEQ ID NO: 19: Val, Ile, Leu, Met and Phe.
  • the amino acid sequence of the glycosyltransferase of the present invention is Val at the amino acid residue corresponding to position 322 of the amino acid sequence shown by SEQ ID NO: 19.
  • the amino acid sequence of the glycosyltransferase of the present invention is His at the amino acid residue corresponding to position 222 of the amino acid sequence shown by SEQ ID NO: 19, and is Val at the amino acid residue corresponding to position 322 of the amino acid sequence shown by SEQ ID NO: 19.
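  • The substitutions named above (His at position 222, Val at position 322) can be written in the usual shorthand as Q222H and A322V. The following is only a minimal sketch of how such shorthand could be applied to a sequence string. The function name `apply_substitutions` and the 330-residue placeholder sequence are assumptions made for illustration; the placeholder is not SEQ ID NO: 19.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply point substitutions written like 'Q222H' (1-based positions).

    Each substitution is checked against the residue currently present,
    so a wrong position or wrong original residue raises an error.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(
                f"expected {old} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# Hypothetical placeholder sequence (NOT SEQ ID NO: 19): glycine everywhere,
# with Gln at position 222 and Ala at position 322.
placeholder = ["G"] * 330
placeholder[221] = "Q"
placeholder[321] = "A"
reference = "".join(placeholder)

mutant = apply_substitutions(reference, ["Q222H", "A322V"])
print(mutant[221], mutant[321])
```

  • The check against the original residue is a deliberate choice: it catches numbering mistakes, which matter whenever a tag or truncation shifts positions relative to the reference sequence.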
  • the amino acid sequence of the glycosyltransferase of the present invention is non-Asn (N) at the amino acid residue corresponding to position 247 of the amino acid sequence shown by SEQ ID NO: 19, and/or is non-Lys (K) at the amino acid residue corresponding to position 280 of the amino acid sequence shown by SEQ ID NO: 19.
  • in a preferred embodiment of the invention, the glycosyltransferase of the invention:
  • the sequence defined by i) is passed through one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1
  • the glycosyltransferase of the invention has a sequence formed from the sequence defined by i) by the addition of one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1 amino acid residue, and is an isolated polypeptide derived from i) that substantially retains the function of the isolated polypeptide defined by i).
  • the amino acid sequence of the glycosyltransferase of the present invention has one of the following amino acids at the amino acid residue corresponding to position 247 of the amino acid sequence shown by SEQ ID NO: 19: Ser (S), Pro (P), Ala (A) or Thr (T).
  • amino acid sequence of the glycosyltransferase of the present invention is Ser(S) at the amino acid residue corresponding to position 247 of the amino acid sequence shown by SEQ ID NO: 19.
  • the amino acid sequence of the glycosyltransferase of the present invention has one of the following amino acids at the amino acid residue corresponding to position 280 of the amino acid sequence shown by SEQ ID NO: 19: Ile (I), Asn (N), Ser (S) or Ala (A).
  • the amino acid sequence of the glycosyltransferase of the present invention is Ile(I) at the amino acid residue corresponding to position 280 of the amino acid sequence shown by SEQ ID NO: 19.
  • the amino acid sequence of the glycosyltransferase of the present invention is Ser (S) at the amino acid residue corresponding to position 247 of the amino acid sequence shown by SEQ ID NO: 19, and is Ile (I) at the amino acid residue corresponding to position 280 of the amino acid sequence shown by SEQ ID NO: 19.
  • glycosyltransferase of the invention is selected from the group consisting of:
  • the glycosyltransferase activity of the present invention refers to an activity capable of transferring a glycosyl group of a glycosyl donor to a hydroxyl group at the C-3 position of a tetracyclic triterpenoid.
  • a novel glycosyltransferase gene, Pn50, cloned from Panax notoginseng is provided; this novel glycosyltransferase catalyzes glycosylation at the C3 position of various dammarane-type ginsenosides.
  • the full-length sequence of the glycosyltransferase gene Pn50 was assembled by analyzing Panax notoginseng (Sanqi) transcriptome data. It was cloned into the cloning vector PMDT-18T, and expression primers were then designed to construct it into the E. coli expression vector pET28a for induced expression in E. coli.
  • the obtained protein is capable of catalyzing glycosylation of the C3 hydroxyl group of protopanaxadiol and Compound K.
  • a glycosyltransferase mutant protein 8E7 is provided.
  • the glycosyltransferase mutant protein is a mutant protein of the ginseng-derived wild-type gene UGTPg45, and replacing the ginseng-derived wild-type gene UGTPg45 with the mutant glycosyltransferase gene 8E7 can significantly increase Rh2 production (an increase of 70%).
  • a glycosyltransferase mutant protein Pn50-Q222H-VE is provided.
  • the glycosyltransferase mutant protein is a mutant protein of the wild-type gene Pn50 derived from Panax notoginseng, and replacing the ginseng-derived wild-type gene UGTPg45 with the mutant glycosyltransferase gene Pn50-Q222H-VE can greatly increase the yield of Rh2 (an increase of 66%).
  • a glycosyltransferase mutant protein UGT-MUT1 is provided.
  • the glycosyltransferase mutant protein is a mutant protein of the wild-type gene Pn50 derived from Panax notoginseng.
  • replacing the ginseng-derived wild-type gene UGTPg45 with the mutant glycosyltransferase gene UGT-MUT1 can greatly increase the yield of Rh2 (an increase of 90%).
  • a glycosyltransferase mutant protein UGT-MUT2 is provided.
  • the glycosyltransferase mutant protein is a mutant protein of the wild-type gene Pn50 derived from Panax notoginseng, and replacing the ginseng-derived wild-type gene UGTPg45 with the mutant glycosyltransferase gene UGT-MUT2 can greatly increase the yield of Rh2 (an increase of 120%).
  • a glycosyltransferase mutant protein UGT-MUT3 is provided.
  • the glycosyltransferase mutant protein is a mutant protein of the wild-type gene Pn50 derived from Panax notoginseng.
  • replacing the ginseng-derived wild-type gene UGTPg45 with the mutant glycosyltransferase gene UGT-MUT3 can greatly increase the yield of Rh2 (an increase of 134%).
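  • For comparison, the percentage increases in Rh2 yield stated above (each relative to a strain carrying the wild-type gene UGTPg45) can be converted to fold values. The sketch below only restates that arithmetic; the dictionary and function names are illustrative assumptions, and no absolute yields are implied.

```python
# Reported increases in Rh2 yield relative to UGTPg45, in percent,
# as stated in the text above.
reported_increase_percent = {
    "8E7": 70,
    "Pn50-Q222H-VE": 66,
    "UGT-MUT1": 90,
    "UGT-MUT2": 120,
    "UGT-MUT3": 134,
}


def fold_over_reference(increase_percent):
    """Convert 'increased by X %' into a fold value (reference = 1.0)."""
    return 1 + increase_percent / 100


fold_values = {
    name: fold_over_reference(pct)
    for name, pct in reported_increase_percent.items()
}

# List the variants from highest to lowest relative yield.
for name, fold in sorted(fold_values.items(), key=lambda item: -item[1]):
    print(f"{name}: {fold:.2f}-fold of the UGTPg45 reference")
```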
  • the polypeptide of the present invention may be further mutated on the basis that the amino acid residue corresponding to position 222 of the amino acid sequence shown by SEQ ID NO: 19 is non-Gln and/or the amino acid residue corresponding to position 322 of the amino acid sequence shown by SEQ ID NO: 19 is non-Ala, while still having the function and activity of the glycosyltransferase of the present invention.
  • the glycosyltransferase of the present invention is (a) a polypeptide having the amino acid sequence shown as SEQ ID NO: 21; or (b) a polypeptide derived from (a), having a sequence formed by substitution, deletion or addition of one or more amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1 amino acid residue, in the sequence defined by (a), and having substantially the function of the polypeptide defined in (a).
  • the glycosyltransferase of the present invention comprises a mutant in which, compared to the glycosyltransferase whose amino acid sequence is shown by SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41, up to 20, preferably up to 10, more preferably up to 3, more preferably up to 2, optimally up to 1 amino acid is replaced by an amino acid of similar or close properties. Mutants with these conservative variations can be produced according to, for example, the amino acid substitutions shown in the table below.
  • the term "polynucleotide encoding a polypeptide" can mean a polynucleotide that encodes the polypeptide, or a polynucleotide that further comprises additional coding and/or non-coding sequences.
  • "corresponding to" has the meaning commonly understood by one of ordinary skill in the art. Specifically, "corresponding to" means the position in one sequence that matches a specified position in another sequence after the two sequences have been aligned by homology or sequence identity. Therefore, as for "amino acid residue corresponding to position 222/322/247/280 of the amino acid sequence shown in SEQ ID NO: 19", if a 6-His tag is added to one end of the amino acid sequence shown by SEQ ID NO: 19, the position in the obtained mutant corresponding to position 222/322/247/280 of the amino acid sequence shown in SEQ ID NO: 19 may be position 228/328/253/286.
  • likewise, the position in the obtained mutant corresponding to position 222/322/247/280 of the amino acid sequence shown by SEQ ID NO: 19 may be position 220/320/245/278, and so on.
  • the position in the resulting mutant corresponding to position 222/322/247/280 of the amino acid sequence shown by SEQ ID NO: 19 may be position 202/302/227/260.
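  • The notion of a "corresponding" position can be illustrated with a small sketch that walks a gapped pairwise alignment and reports which residue of the second sequence sits in the same alignment column as a given reference position. The function name `corresponding_position` and the short toy sequences are assumptions made for illustration only; they are not SEQ ID NO: 19, and the alignment itself is assumed to have been produced beforehand by any standard alignment tool.

```python
def corresponding_position(aligned_ref, aligned_query, ref_pos):
    """Map a 1-based reference position to the query position in the same
    alignment column. Gaps are written as '-'. Returns None when the
    reference residue is aligned to a gap in the query."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    query_count = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            ref_count += 1
        if query_char != "-":
            query_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            return query_count if query_char != "-" else None
    raise ValueError("ref_pos lies beyond the end of the reference")


# Toy reference (hypothetical, 8 residues).
toy_ref = "MKTAYIAK"

# Case 1: a 6-His tag added at the N-terminus shifts numbering by +6.
tagged = "HHHHHH" + toy_ref
print(corresponding_position("------" + toy_ref, tagged, 3))

# Case 2: two N-terminal residues removed shifts numbering by -2.
truncated = toy_ref[2:]
print(corresponding_position(toy_ref, "--" + truncated, 3))
```

  • With a simple N-terminal extension or truncation the mapping reduces to a constant offset, which is consistent with the 222 to 228 example given above for a 6-His tag.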
  • the homology or sequence identity may be 80% or more, preferably 90% or more, more preferably 95% to 98%, and most preferably 99% or more.
  • Methods for determining sequence homology or identity include, but are not limited to, those described in: Computational Molecular Biology, Lesk, A. M., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A. M. and Griffin, H. G., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; Sequence Analysis Primer, Gribskov, M. and Devereux, J., M Stockton Press, New York, 1991; and Carillo, H. and Lipman, D., SIAM J.
  • the preferred method of determining identity is to obtain the largest match between the sequences tested.
  • the method of determining identity is compiled in a publicly available computer program.
  • Preferred computer program methods for determining identity between two sequences include, but are not limited to, the GCG package (Devereux, J. et al., 1984), BLASTP, BLASTN, and FASTA (Altschul, S. F. et al., 1990).
  • the BLASTX program is available to the public from NCBI and other sources (BLAST Handbook, Altschul, S. et al, NCBI NLM NIH Bethesda, Md. 20894; Altschul, S. et al, 1990).
  • the well-known Smith-Waterman algorithm can also be used to determine identity.
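  • As a rough illustration of the Smith-Waterman approach mentioned above, the following sketch computes only the best local alignment score between two sequences with a linear gap penalty. The scoring values (match +2, mismatch -1, gap -2) are arbitrary assumptions chosen for the example; real protein comparisons normally use a substitution matrix and affine gap penalties, and established programs should be preferred in practice.

```python
def smith_waterman_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score (Smith-Waterman, linear gaps).

    Only the previous and current rows of the dynamic-programming matrix
    are kept, since just the maximum score is reported.
    """
    previous = [0] * (len(seq_b) + 1)
    best = 0
    for i in range(1, len(seq_a) + 1):
        current = [0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            pair_score = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            current[j] = max(
                0,                              # start a new local alignment
                previous[j - 1] + pair_score,   # align the two residues
                previous[j] + gap,              # gap in seq_b
                current[j - 1] + gap,           # gap in seq_a
            )
            best = max(best, current[j])
        previous = current
    return best


print(smith_waterman_score("XXACGTXX", "ACGT"))
```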
  • the ginsenosides and sapogenins referred to herein are ginsenosides and sapogenins of the C20 position S and/or R configuration.
  • isolated polypeptide means that the polypeptide is substantially free of other proteins, lipids, carbohydrates or other materials with which it is naturally associated.
  • One skilled in the art can purify the polypeptide using standard protein purification techniques. A substantially pure polypeptide produces a single major band on a non-reducing polyacrylamide gel. The purity of the polypeptide can also be further analyzed using amino acid sequences.
  • the active polypeptide of the present invention may be a recombinant polypeptide, a natural polypeptide, or a synthetic polypeptide.
  • the polypeptides of the invention may be naturally purified products, chemically synthesized products, or products produced by recombinant techniques from prokaryotic or eukaryotic hosts (e.g., bacteria, yeast, plants).
  • the polypeptide of the invention may be glycosylated or may be non-glycosylated, depending on the host used in the recombinant production protocol.
  • Polypeptides of the invention may also or may not include an initial methionine residue.
  • the invention also includes fragments, derivatives and analogs of the polypeptides.
  • fragment refers to a polypeptide that substantially retains the same biological function or activity of the polypeptide.
  • the polypeptide fragment, derivative or analog of the present invention may be (i) a polypeptide in which one or more conservative or non-conservative amino acid residues (preferably conservative amino acid residues) are substituted, where such substituted amino acid residues may or may not be encoded by the genetic code, or (ii) a polypeptide having a substituent group in one or more amino acid residues, or (iii) a polypeptide formed by fusing the mature polypeptide with another compound (such as a compound that extends the half-life of the polypeptide, for example polyethylene glycol), or (iv) a polypeptide formed by fusing an additional amino acid sequence to the polypeptide sequence (such as a leader or secretion sequence, a sequence used to purify the polypeptide, a proprotein sequence, or a fusion protein formed with an antigenic IgG fragment).
  • the active polypeptide of the present invention has glycosyltransferase activity and is capable of catalyzing one or more of the following reactions:
  • the polypeptide sequence is SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41, or a derivative thereof.
  • the term also encompasses variant forms of the sequences of SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41 that have the same function as the polypeptide shown.
  • These variants include, but are not limited to, deletions, insertions and/or substitutions of one or more (usually 1-50, preferably 1-30, more preferably 1-20, optimally 1-10) amino acids, and the addition of one or several (usually within 20, preferably within 10, more preferably within 5) amino acids at the C-terminus and/or N-terminus.
  • the term also encompasses active fragments and active derivatives of the proteins of the invention.
  • the invention also provides analogs of the polypeptides.
  • the difference between these analogs and the natural polypeptide may be a difference in amino acid sequence, a difference in the modification form which does not affect the sequence, or a combination thereof.
  • These polypeptides include natural or induced genetic variants. Induced variants can be obtained by a variety of techniques, such as random mutagenesis by irradiation or exposure to a mutagen, or by site-directed mutagenesis or other techniques known to molecular biology. Analogs also include analogs having residues other than the native L-amino acid (such as D-amino acids), as well as analogs having non-naturally occurring or synthetic amino acids (such as beta, gamma-amino acids). It is to be understood that the polypeptide of the present invention is not limited to the representative polypeptides exemplified above.
  • Modifications include chemically derived forms of the polypeptide, such as acetylation or carboxylation, in vivo or in vitro. Modifications also include glycosylation, such as those produced by glycosylation modifications in the synthesis and processing of the polypeptide or in further processing steps. Such modification can be accomplished by exposing the polypeptide to an enzyme that performs glycosylation, such as a mammalian glycosylation enzyme or a deglycosylation enzyme. Modified forms also include sequences having phosphorylated amino acid residues such as phosphotyrosine, phosphoserine, phosphothreonine. Also included are polypeptides that have been modified to enhance their resistance to protease hydrolysis or to optimize solubility properties.
  • the amino terminus or carboxy terminus of the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2, UGT-MUT3 polypeptides or derived polypeptides thereof of the present invention may further comprise one or more polypeptide fragments as a protein tag.
  • Any suitable tag can be used in the present invention.
  • the tags can be FLAG, HA, HA1, c-Myc, Poly-His, Poly-Arg, Strep-TagII, AU1, EE, T7, 4A6, ε, B, gE, and Ty1. These tags can be used to purify proteins. Table 1 lists some of these tags and their sequences.
  • a signal peptide sequence, such as the pelB signal peptide, may also be added to the amino terminus of the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2, UGT-MUT3 polypeptide or a polypeptide derived therefrom. The signal peptide can be cleaved off during secretion of the polypeptide from the cell.
  • the polynucleotide of the present invention may be in the form of DNA or RNA.
  • DNA forms include cDNA, genomic DNA or synthetic DNA.
  • DNA can be single-stranded or double-stranded.
  • the DNA can be a coding strand or a non-coding strand.
  • the coding region sequence encoding the mature polypeptide can be identical to the coding region sequence shown in SEQ ID NO.: 3, SEQ ID NO.: 22, SEQ ID NO.: 36, SEQ ID NO.: 38, SEQ ID NO.: 40 or SEQ ID NO.: 42, or can be a degenerate variant thereof.
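  • "degenerate variant" in the present invention means a nucleic acid sequence that encodes a polypeptide having SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37 or SEQ ID NO.: 39, but whose nucleotide sequence differs from the coding region sequences listed above.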
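  • Whether two coding sequences are degenerate variants of each other can be checked by translating both and comparing the resulting proteins. The sketch below assumes the standard genetic code and uses short made-up coding sequences; the function names are illustrative and none of the sequences are taken from the sequence listing.

```python
# Standard genetic code, codons ordered with bases T, C, A, G
# ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a coding DNA sequence (length divisible by 3)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(dna_a, dna_b):
    """True when the nucleotide sequences differ but encode the same protein."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


# Made-up examples: both encode Met-Gln-Ala with different codons.
print(is_degenerate_variant("ATGCAAGCT", "ATGCAGGCC"))
# Changing CAA (Gln) to CAT (His) alters the protein, so this is not degenerate.
print(is_degenerate_variant("ATGCAAGCT", "ATGCATGCT"))
```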
  • Polynucleotides encoding the mature polypeptides include: coding sequences encoding only the mature polypeptides; coding sequences for the mature polypeptides and various additional coding sequences; coding sequences for the mature polypeptides (and optionally additional coding sequences) and non-coding sequences.
  • the term "polynucleotide encoding a polypeptide" may mean a polynucleotide that encodes the polypeptide, or a polynucleotide that further comprises additional coding and/or non-coding sequences.
  • the invention also relates to variants of the above polynucleotides which encode polypeptides having the same amino acid sequence as the invention, or fragments, analogs and derivatives of such polypeptides.
  • Variants of this polynucleotide may be naturally occurring allelic variants or non-naturally occurring variants. These nucleotide variants include substitution variants, deletion variants, and insertion variants.
  • an allelic variant is an alternative form of a polynucleotide that may be a substitution, deletion or insertion of one or more nucleotides, but does not substantially alter the function of the polypeptide encoded thereby.
  • the invention also relates to polynucleotides which hybridize to the sequences described above and which have at least 50%, preferably at least 70%, more preferably at least 80% identity between the two sequences.
  • the invention particularly relates to polynucleotides that hybridize to the polynucleotides of the invention under stringent conditions.
  • "stringent conditions" means: (1) hybridization and elution at a lower ionic strength and higher temperature, such as 0.2 x SSC, 0.1% SDS, 60 °C; or (2) hybridization in the presence of a denaturing agent, such as 50% (v/v) formamide, 0.1% calf serum / 0.1% Ficoll, 42 °C, etc.; or (3) hybridization that occurs only when the identity between the two sequences is at least 90%, more preferably at least 95%.
  • the polypeptide encoded by the hybridizable polynucleotide has the same biological function and activity as the mature polypeptide shown by SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41.
  • the invention also relates to nucleic acid fragments that hybridize to the sequences described above.
  • a "nucleic acid fragment” is at least 15 nucleotides in length, preferably at least 30 nucleotides, more preferably at least 50 nucleotides, and most preferably at least 100 nucleotides or more.
  • Nucleic acid fragments can be used in nucleic acid amplification techniques (such as PCR) to identify and/or isolate polynucleotides encoding Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2, UGT-MUT3 polypeptides or derived polypeptides thereof.
  • polypeptides and polynucleotides of the invention are preferably provided in isolated form, more preferably purified to homogeneity.
  • the full-length nucleotide sequence of the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2, UGT-MUT3 polypeptide or derived polypeptide thereof of the present invention, or a fragment thereof, can generally be obtained by PCR amplification, recombinant methods or artificial synthesis.
  • For PCR amplification, primers can be designed in accordance with the disclosed nucleotide sequences, particularly the open reading frame sequences, and a commercially available cDNA library, or a cDNA library prepared by conventional methods known to those skilled in the art, can be used as a template to amplify the relevant sequences. When the sequence is long, it is often necessary to perform two or more PCR amplifications, and then the amplified fragments are spliced together in the correct order.
  • the recombinant method can be used to obtain the relevant sequences in large quantities. This is usually done by cloning the sequence into a vector, transferring it to a cell, and then isolating the relevant sequence from the proliferated host cell by conventional methods.
  • artificial synthesis can also be used to produce the relevant sequences, especially when the fragment length is short.
  • fragments with long sequences can be obtained by first synthesizing a plurality of small fragments and then ligating them.
  • a DNA sequence encoding the protein of the present invention (or a fragment thereof, or a derivative thereof) can be obtained completely by chemical synthesis.
  • the DNA sequence can then be introduced into various existing DNA molecules (or vectors) and cells known in the art.
  • mutations can also be introduced into the protein sequences of the invention by chemical synthesis.
  • a method of amplifying DNA/RNA using PCR technology is preferably used to obtain the gene of the present invention.
  • for example, the RACE method (rapid amplification of cDNA ends) may be used.
  • primers for PCR can be appropriately selected according to the sequence information of the present invention disclosed herein, and can be synthesized by conventional methods.
  • the amplified DNA/RNA fragment can be isolated and purified by conventional methods such as by gel electrophoresis.
  • the present invention also relates to a vector comprising the polynucleotide of the present invention, to a host cell genetically engineered with the vector of the present invention or with a coding sequence of the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2, UGT-MUT3 polypeptide or a polypeptide derived therefrom, and to methods of producing the polypeptides of the invention by recombinant techniques.
  • the polynucleotide sequences of the present invention can be used to express or produce recombinant Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2, UGT-MUT3 polypeptides or derived polypeptides thereof by conventional recombinant DNA techniques. Generally there are the following steps:
  • a polynucleotide sequence encoding a Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2, UGT-MUT3 polypeptide or a polypeptide derived therefrom can be inserted into a recombinant expression vector.
  • the term "recombinant expression vector" refers to bacterial plasmids, phage, yeast plasmids, plant cell viruses, mammalian cell viruses such as adenoviruses, retroviruses or other vectors well known in the art. Any plasmid and vector can be used as long as it can replicate and remain stable in the host.
  • An important feature of expression vectors is that they typically contain an origin of replication, a promoter, a marker gene, and a translational control element.
  • Methods well known to those skilled in the art can be used to construct an expression vector containing the DNA sequence encoding the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2, UGT-MUT3 polypeptide or a polypeptide derived therefrom and suitable transcription/translation control signals. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques, in vivo recombinant techniques, and the like.
  • the DNA sequence can be operably linked to an appropriate promoter in an expression vector to direct mRNA synthesis. Representative examples of such promoters are: lac or trp promoter of E.
  • the expression vector also includes a ribosome binding site for translation initiation and a transcription terminator.
  • the expression vector preferably comprises one or more selectable marker genes to provide phenotypic traits for selection of transformed host cells, such as dihydrofolate reductase, neomycin resistance, and green fluorescent protein (GFP) for eukaryotic cell culture, or tetracycline or ampicillin resistance for E. coli.
  • Vectors comprising the appropriate DNA sequences described above, as well as appropriate promoters or control sequences, can be used to transform appropriate host cells to enable expression of the protein.
  • the host cell can be a prokaryotic cell, such as a bacterial cell; or a lower eukaryotic cell, such as a yeast cell; or a higher eukaryotic cell, such as a mammalian cell.
  • Representative examples are: bacterial cells such as Escherichia coli, Streptomyces and Salmonella typhimurium; fungal cells such as yeast; plant cells; insect cells such as Drosophila S2 or Sf9; animal cells such as CHO, COS, 293 cells, or Bowes melanoma cells, etc.
  • an enhancer sequence is inserted into the vector.
  • An enhancer is a cis-acting factor of DNA, usually about 10 to 300 base pairs, acting on a promoter to enhance transcription of the gene.
  • Usable examples include a 100 to 270 base pair SV40 enhancer on the late side of the replication initiation point, a polyoma enhancer on the late side of the replication initiation site, and an adenovirus enhancer.
  • Transformation of host cells with recombinant DNA can be carried out using conventional techniques well known to those skilled in the art.
  • when the host is a prokaryote such as E. coli, competent cells capable of taking up DNA can be harvested after the exponential growth phase and treated by the CaCl2 method, using procedures well known in the art.
  • Another method is to use MgCl2.
  • Transformation can also be carried out by electroporation if desired.
  • when the host is a eukaryote, the following DNA transfection methods can be used: calcium phosphate coprecipitation, conventional mechanical methods such as microinjection, electroporation, liposome packaging, and the like.
  • the obtained transformant can be cultured by a conventional method to express the polypeptide encoded by the gene of the present invention.
  • the medium used in the culture may be selected from various conventional media depending on the host cell used.
  • the cultivation is carried out under conditions suitable for the growth of the host cell.
  • the selected promoter is induced by a suitable method (such as temperature conversion or chemical induction) and the cells are cultured for a further period of time.
  • the recombinant polypeptide in the above method can be expressed intracellularly, or on the cell membrane, or secreted outside the cell.
  • the recombinant protein can be isolated and purified by various separation methods using its physical, chemical, and other properties. These methods are well known to those skilled in the art. Examples of such methods include, but are not limited to, conventional renaturation treatment, treatment with a protein precipitant (salting out), centrifugation, osmotic lysis, sonication, ultracentrifugation, molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, high performance liquid chromatography (HPLC), various other liquid chromatography techniques, and combinations of these methods.
  • the use of the active polypeptide or glycosyltransferase Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2, UGT-MUT3 polypeptide or polypeptide derived therefrom according to the present invention includes, but is not limited to, specifically and efficiently catalyzing the transfer of the glycosyl group from a glycosyl donor to the hydroxyl group at the C-3 position of a tetracyclic triterpenoid, in particular converting a compound of the formula (I) into a compound of the formula (II), for example, converting protopanaxadiol (PPD) into the rare ginsenoside Rh2, which has superior antitumor activity, and converting Compound K into ginsenoside F2.
  • the tetracyclic triterpene compound includes, but is not limited to, tetracyclic triterpenoids in the S configuration or the R configuration of the dammarane type, lanostane type, tirucallane type, cycloartane type, apotirucallane type, cucurbitane type or meliacane type.
  • the present invention provides an industrial catalytic method comprising: under the condition of providing a glycosyl donor, using the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2, UGT-MUT3 polypeptide of the present invention or a polypeptide derived therefrom to obtain ginsenoside Rh2 and ginsenoside F2.
  • the polypeptide used in the (A) reaction is selected from the polypeptide represented by SEQ ID NO.: 4 or SEQ ID NO.: 21 or a polypeptide derived therefrom; and the polypeptide used in the (B) reaction is the polypeptide represented by SEQ ID NO.: 4 or a polypeptide derived therefrom.
  • a method for synthesizing ginsenoside Rh2 in Saccharomyces cerevisiae using the aforementioned glycosyltransferase Pn50 and glycosyltransferase mutant protein 8E7 is provided.
  • a method for synthesizing ginsenoside Rh2 in Saccharomyces cerevisiae using the aforementioned glycosyltransferase Pn50 mutant protein Pn50-Q222H-VE of Panax notoginseng is provided.
  • a method for synthesizing ginsenoside Rh2 in Saccharomyces cerevisiae using the aforementioned glycosyltransferase Pn50 mutant protein UGT-MUT1 of Panax notoginseng is provided.
  • the glycosyl donor is a nucleoside diphosphate sugar selected from the group consisting of UDP-glucose, ADP-glucose, TDP-glucose, CDP-glucose, GDP-glucose, UDP-acetylglucose, ADP-acetylglucose, TDP-acetylglucose, CDP-acetylglucose, GDP-acetylglucose, UDP-xylose, ADP-xylose, TDP-xylose, CDP-xylose, GDP-xylose, UDP-galacturonic acid, ADP-galacturonic acid, TDP-galacturonic acid, CDP-galacturonic acid, GDP-galacturonic acid, UDP-galactose, ADP-galactose, TDP-galactose, CDP-galactose, GDP-galactose,
  • the glycosyl donor is preferably a uridine diphosphate (UDP) sugar, selected from the group consisting of UDP-glucose, UDP-xylose, UDP-rhamnose, UDP-galacturonic acid, UDP-galactose, UDP-arabinose, or other uridine diphosphate hexose or uridine diphosphate pentose, or a combination thereof.
  • an enzyme activity additive (an additive that increases enzyme activity or inhibits enzyme activity) may also be added.
  • the enzyme activity additive may be selected from the group consisting of Ca2+, Co2+, Mn2+, Ba2+, Al3+, Ni2+, Zn2+, or Fe2+; or a substance capable of generating Ca2+, Co2+, Mn2+, Ba2+, Al3+, Ni2+, Zn2+, or Fe2+.
  • the pH conditions of the process are: pH 4.0-10.0, preferably pH 6.0-pH 8.5, more preferably 8.5.
  • the temperature conditions of the process are from 10 °C to 105 °C, preferably from 25 °C to 35 °C, more preferably 35 °C.
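  • The pH and temperature windows just stated can be captured in a small check, for example when screening planned reaction conditions. This is only a sketch under the ranges given in the two preceding items (pH 4.0-10.0, preferably 6.0-8.5; 10-105 °C, preferably 25-35 °C); the class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed numeric interval [low, high]."""
    low: float
    high: float

    def contains(self, value):
        return self.low <= value <= self.high


# Ranges as stated in the text above.
PH_ALLOWED = Range(4.0, 10.0)
PH_PREFERRED = Range(6.0, 8.5)
TEMP_ALLOWED_C = Range(10.0, 105.0)
TEMP_PREFERRED_C = Range(25.0, 35.0)


def classify_conditions(ph, temperature_c):
    """Classify a pH / temperature pair against the stated ranges."""
    if not (PH_ALLOWED.contains(ph) and TEMP_ALLOWED_C.contains(temperature_c)):
        return "outside stated range"
    if PH_PREFERRED.contains(ph) and TEMP_PREFERRED_C.contains(temperature_c):
        return "preferred"
    return "allowed"


print(classify_conditions(8.5, 35))
print(classify_conditions(5.0, 50))
print(classify_conditions(3.0, 35))
```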
  • the present invention also provides a composition comprising an effective amount of the active polypeptide or glycosyltransferase Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2, UGT-MUT3 polypeptide or derivative thereof of the present invention.
  • Such carriers include, but are not limited to, water, buffer, dextrose, glycerol, ethanol, and combinations thereof.
  • Substances which modulate the glycosyltransferase activity of the present invention may also be added to the composition. Any substance that increases the activity of the enzyme may be used. Preferably, the substance which increases the glycosyltransferase activity of the present invention is mercaptoethanol.
  • many substances can reduce enzyme activity, including but not limited to: Ca²⁺, Co²⁺, Mn²⁺, Ba²⁺, Al³⁺, Ni²⁺, Zn²⁺ and Fe²⁺, or substances that can be hydrolyzed to form Ca²⁺, Co²⁺, Mn²⁺, Ba²⁺, Al³⁺, Ni²⁺, Zn²⁺ or Fe²⁺.
  • the enzyme can be conveniently applied by the practitioner for transglycosylation, especially the transglycosylation of dammarenediol and protopanaxadiol.
  • two methods for forming rare ginsenosides are provided. The first method comprises: using the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide according to the present invention, or a polypeptide derived therefrom, to treat a substrate to be transglycosylated, said substrate comprising a tetracyclic triterpenoid such as dammarenediol, protopanaxadiol and derivatives thereof.
  • the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide or a polypeptide derived therefrom is used to treat the substrate to be transglycosylated at pH 3.5-10.
  • the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide or a polypeptide derived therefrom is used to treat the substrate to be transglycosylated at a temperature of 30-105 °C.
  • the second method comprises the steps of: transferring the gene encoding the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide according to the invention, or a polypeptide derived therefrom, into an engineered strain (for example, an engineered yeast or E. coli) capable of synthesizing protopanaxadiol (PPD); or co-expressing that gene with the key genes of the dammarenediol and protopanaxadiol (PPD) anabolic pathways, and optionally other glycosyltransferase genes, in host cells (e.g. yeast cells or E. coli), so as to directly produce the rare ginsenoside Rh2 and/or ginsenoside F2.
  • the nucleotide sequence encoding the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide or a polypeptide derived therefrom is co-expressed in host cells with the key enzymes of the dammarenediol and/or protopanaxadiol (PPD) anabolic pathway, and optionally with other glycosyltransferases and key enzymes for the synthesis of UDP-rhamnose, and is used to construct recombinant strains that synthesize ginsenoside Rh2 and ginsenoside F2.
  • the key genes in the dammarenediol anabolic pathway include, but are not limited to, the dammarenediol synthase gene.
  • the key genes in the protopanaxadiol anabolic pathway include, but are not limited to, the dammarenediol synthase gene PgDDS, the cytochrome P450 gene CYP716A47 for protopanaxadiol synthesis and its reductase gene, or a combination thereof, or isozymes of these enzymes and combinations thereof.
  • dammarenediol synthase converts 2,3-oxidosqualene (synthesized by Saccharomyces cerevisiae itself) into dammarenediol
  • cytochrome P450 CYP716A47 and its reductase convert dammarenediol into protopanaxadiol (PPD) (Han et al., Plant & Cell Physiology, 2011, 52: 2062-73).
  • the method for producing ginsenoside Rh2 by using the Saccharomyces cerevisiae has the advantages of low cost, short cycle, stable quality, and the like compared with the conventional method relying on ginseng plant extract and glycosyl hydrolysis;
  • the glycosyltransferase Pn50, obtained for the first time from Panax notoginseng (Sanqi) by the present invention, can catalyze the conversion of PPD to Rh2 and of CK to F2; when introduced into a PPD-producing strain, it synthesizes the rare ginsenoside Rh2 more efficiently than the wild-type ginseng glycosyltransferase UGTPg45;
  • the present invention obtains the mutant gene 8E7 by random mutation of the wild-type ginseng glycosyltransferase UGTPg45; when 8E7 or the Panax notoginseng glycosyltransferase gene Pn50 is introduced into the PPD-producing strain, the synthesis efficiency of rare ginsenoside Rh2 is significantly increased compared with the wild-type ginseng glycosyltransferase UGTPg45.
  • the present invention obtains the mutant genes Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 and UGT-MUT3 by random mutation of the wild-type glycosyltransferase Pn50 of Panax notoginseng; when introduced into the PPD-producing strain, they significantly increase the synthesis efficiency of rare ginsenoside Rh2 compared with the wild-type Panax notoginseng glycosyltransferase Pn50.
  • Two primers were synthesized: Pn50 cloning primer F, SEQ ID NO: 1 (ATGGAGAGAGAAATGTTGAGCA), and Pn50 cloning primer R, SEQ ID NO: 2 (TCAGGAGGAAACAAGCTTTGAA).
  • the cDNA obtained by reverse transcription of RNA extracted from Panax notoginseng was used as a template, and PCR was carried out using the above primers.
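As a quick sanity check on the two cloning primers quoted above, the following minimal sketch tests whether primer F begins with a start codon and whether the reverse complement of primer R ends with a stop codon. The helper name `reverse_complement` and the variable names are illustrative assumptions, not part of the source.

```python
# Sanity check on the Pn50 cloning primers (SEQ ID NO: 1 and SEQ ID NO: 2).
# Helper and variable names are hypothetical; only the sequences come from the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


PN50_CLONE_F = "ATGGAGAGAGAAATGTTGAGCA"  # SEQ ID NO: 1
PN50_CLONE_R = "TCAGGAGGAAACAAGCTTTGAA"  # SEQ ID NO: 2

# The forward primer lies on the coding strand, so it should start with ATG.
starts_with_atg = PN50_CLONE_F.startswith("ATG")

# The reverse primer anneals to the coding strand, so its reverse complement
# is the 3' end of the coding strand and should end with a stop codon.
coding_3prime = reverse_complement(PN50_CLONE_R)
ends_with_stop = coding_3prime[-3:] in {"TGA", "TAA", "TAG"}

print(starts_with_atg, coding_3prime, ends_with_stop)
```

Under these assumptions the primer pair spans the coding sequence from the start codon through the stop codon.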
  • the DNA polymerase used was the high-fidelity KOD DNA polymerase from Biotech Engineering Co., Ltd.
  • the PCR amplification program was: 94 °C for 2 min; 35 cycles of 94 °C for 15 s, 58 °C for 30 s and 68 °C for 2 min; 68 °C for 10 min; then cooling to 10 °C.
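The programmed duration of this thermal profile can be worked out with a short sketch. The `Step` class and function name are hypothetical, and the figure ignores ramp times and the final 10 °C hold, so real run time on an instrument would be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One thermocycler step: temperature in deg C and hold time in seconds."""
    temp_c: int
    seconds: int


# Values taken from the program described in the text.
INITIAL = Step(94, 2 * 60)
CYCLE = (Step(94, 15), Step(58, 30), Step(68, 2 * 60))
N_CYCLES = 35
FINAL = Step(68, 10 * 60)


def programmed_seconds(initial, cycle, n_cycles, final):
    """Sum of hold times only; ramping and the 10 deg C hold are not counted."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


total = programmed_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total} s = {total / 60:.2f} min")
```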
  • the PCR product was detected by agarose gel electrophoresis and the results are shown in Figure 1.
  • the Pn50 gene has the nucleotide sequence of SEQ ID NO: 3.
  • nucleotides 1-1368 from the 5' end of SEQ ID NO: 3 are the open reading frame (ORF) of Pn50; nucleotides 1-3 from the 5' end of SEQ ID NO: 3 are the start codon ATG of the Pn50 gene, and nucleotides 1366-1368 from the 5' end of SEQ ID NO: 3 are the stop codon TGA of the Pn50 gene.
  • the glycosyltransferase Pn50 gene encodes a protein of 455 amino acids, Pn50, having the amino acid residue sequence of SEQ ID NO: 4, which is predicted by software to have a theoretical molecular weight of 51.1 kDa and an isoelectric point pI of 5.10.
  • amino acids 332-375 from the amino terminus of SEQ ID NO: 4 form the glycosyltransferase PSPG conserved domain.
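The coordinates given for Pn50 can be cross-checked arithmetically: an ORF that includes its stop codon encodes one residue fewer than it has codons. The sketch below is illustrative only; the function name `orf_summary` is an assumption.

```python
def orf_summary(orf_start: int, orf_end: int) -> dict:
    """Derive codon-level facts from 1-based, inclusive ORF coordinates.

    Assumes the ORF includes the stop codon as its last three nucleotides.
    """
    length = orf_end - orf_start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length // 3
    return {
        "length_nt": length,
        "residues": codons - 1,            # the stop codon encodes no residue
        "stop_codon": (orf_end - 2, orf_end),
    }


# Pn50: ORF at nucleotides 1-1368 of SEQ ID NO: 3.
pn50 = orf_summary(1, 1368)

# PSPG domain: residues 332-375 of SEQ ID NO: 4 (inclusive).
pspg_length = 375 - 332 + 1

print(pn50, pspg_length)
```

The result agrees with the stated 455-residue protein and the stop codon at nucleotides 1366-1368.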
  • Two primers were synthesized: Pn50 expression primer F, SEQ ID NO: 5 (GGATCCATGGAGAGAGAAATGTTGAGCA), and Pn50 expression primer R, SEQ ID NO: 6 (CTCGAGTCAGGAGGAAACAAGCTTTGAA).
  • BamH I and Xho I restriction sites were added to the synthesized primers F and R, respectively, and PCR was carried out using cDNA obtained from the plant as a template.
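The relationship between the expression primers and the cloning primers can be checked directly: each expression primer should be a restriction site followed by the corresponding cloning primer. This is a minimal sketch; the dictionary and variable names are assumptions, while the recognition sequences GGATCC (BamH I) and CTCGAG (Xho I) are the standard ones.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}

CLONE_F = "ATGGAGAGAGAAATGTTGAGCA"        # SEQ ID NO: 1
CLONE_R = "TCAGGAGGAAACAAGCTTTGAA"        # SEQ ID NO: 2
EXPR_F = "GGATCCATGGAGAGAGAAATGTTGAGCA"   # SEQ ID NO: 5
EXPR_R = "CTCGAGTCAGGAGGAAACAAGCTTTGAA"   # SEQ ID NO: 6

# Each expression primer is expected to be "site + cloning primer".
f_ok = EXPR_F == SITES["BamHI"] + CLONE_F
r_ok = EXPR_R == SITES["XhoI"] + CLONE_R

print(f_ok, r_ok)
```

Note that the quoted primers carry no extra bases 5' of the restriction sites; whether protective bases were used in practice is not stated in the text.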
  • the DNA polymerase used was the high-fidelity KOD DNA polymerase from Biotech Engineering Co., Ltd.
  • the PCR amplification program was: 94 °C for 2 min; 35 cycles of 94 °C for 15 s, 58 °C for 30 s and 68 °C for 2 min; 68 °C for 10 min; then cooling to 10 °C.
  • the PCR product was detected by agarose gel electrophoresis, and the target DNA band was excised under ultraviolet light. The DNA fragment of the amplified glycosyltransferase gene was then recovered from the agarose gel using an Axygen Gel Extraction Kit (AXYGEN).
  • the two recovered PCR products were digested with BamH I and Xho I and ligated with pET28a that had likewise been digested with BamH I and Xho I; the ligation product was transformed into E. coli EPI300 competent cells, and the transformed E. coli suspension was spread on LB plates supplemented with 50 µg/mL kanamycin. The recombinant clones were further verified by PCR and restriction enzyme digestion.
  • One of the clones was selected and the recombinant plasmid was extracted and verified by sequencing. After sequencing, the recombinant plasmid was transformed into E. coli BL21 (DE3) to induce expression.
  • the induction method was as follows: a single clone was picked from the plate into a tube of LB containing 50 µg/mL kanamycin and cultured overnight; a 1% inoculum was transferred to a 50 mL flask and shaken at 37 °C until the OD600 reached 0.6-0.7. IPTG was added to a final concentration of 0.1 mM and expression was induced at 18 °C for 16 h. The cells were collected by centrifugation at 12000 g for 3 min and resuspended in 10 mL of PBS buffer per gram of wet cell weight; the cells were lysed and the lysate was centrifuged to obtain the supernatant as a crude enzyme solution.
  • the glycosyltransferase Pn50 catalytic reaction system (100 µL) was prepared as follows:
  • the reaction was carried out in a water bath at 37 °C for 2 h. After the reaction, an equal volume of n-butanol was added for extraction and the n-butanol phase was collected. After concentration in vacuo, the reaction product was dissolved in 10 µL of methanol and analyzed by TLC or HPLC. As shown in Fig. 2, the glycosyltransferase Pn50 used in the present invention catalyzes the conversion of protopanaxadiol (PPD) into a new product (see Formula A for the reaction) whose migration position on the TLC plate matches that of Rh2, indicating that the new product is ginsenoside Rh2. In addition, Pn50 also acts on compound K; based on the migration position on the TLC plate and the regiospecificity of Pn50, the resulting product is presumed to be ginsenoside F2 (Fig. 2).
  • the glycosyltransferase gene Pn50 derived from Panax notoginseng can catalyze glycosylation of the C3 hydroxyl group of protopanaxadiol to synthesize the rare ginsenoside Rh2.
  • the present invention first constructed a Saccharomyces cerevisiae chassis strain capable of producing protopanaxadiol.
  • the recombinant Saccharomyces cerevisiae strain ZWBY04RS was transformed with the conventional LiAc/ssDNA transformation method of Saccharomyces cerevisiae to obtain recombinant Saccharomyces cerevisiae strain ZWBY04RS-Pn50.
  • Solid medium: 1% yeast extract, 2% peptone, 2% dextrose (glucose), 2% agar powder.
  • Liquid medium: 1% yeast extract, 2% peptone, 2% dextrose (glucose).
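To turn these recipes into weighed amounts, the percentages can be scaled to a batch volume. The sketch below assumes the percentages are weight/volume (g per 100 mL), which the text does not state explicitly; the function name and the 500 mL batch size are illustrative.

```python
# Percentages from the text, assumed here to be w/v (g per 100 mL).
SOLID_MEDIUM_PCT = {
    "yeast extract": 1.0,
    "peptone": 2.0,
    "dextrose": 2.0,
    "agar powder": 2.0,
}
# The liquid medium is the same recipe without agar.
LIQUID_MEDIUM_PCT = {k: v for k, v in SOLID_MEDIUM_PCT.items() if k != "agar powder"}


def grams_needed(recipe_pct: dict, volume_ml: float) -> dict:
    """Grams of each component for the given volume, assuming w/v percentages."""
    return {name: pct * volume_ml / 100 for name, pct in recipe_pct.items()}


# Hypothetical batch size of 500 mL of solid medium.
solid_500 = grams_needed(SOLID_MEDIUM_PCT, 500)
# The 10 mL liquid cultures used for fermentation in the text.
liquid_10 = grams_needed(LIQUID_MEDIUM_PCT, 10)

print(solid_500)
print(liquid_10)
```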
  • the recombinant S. cerevisiae strains ZWBY04RS-UGTPg45 and ZWBY04RS-Pn50, streaked on solid medium plates, were picked and shaken overnight in test tubes containing 5 mL of liquid medium (30 °C, 250 rpm, 16 h); the cells were collected by centrifugation and transferred to 50 mL flasks containing 10 mL of liquid medium, the OD600 was adjusted to 0.05, and the cultures were shaken at 30 °C and 250 rpm for 4 days to obtain the fermentation product. Parallel experiments were set up for each recombinant yeast.
  • Extraction and detection of protopanaxadiol and rare ginsenoside Rh2: 100 µL was taken from the 10 mL of fermentation broth, the yeast was lysed with a FastPrep, an equal volume of n-butanol was added for extraction, and the n-butanol was then evaporated to dryness under vacuum. The residue was dissolved in 100 µL of methanol and the yield of the target product was measured by HPLC. The HPLC results are shown in Figure 3.
  • the recombinant Saccharomyces cerevisiae strain ZWBY04RS-UGTPg45, constructed by introducing the ginseng-derived glycosyltransferase gene UGTPg45 into the protopanaxadiol-producing Saccharomyces cerevisiae strain, can synthesize the rare ginsenoside Rh2 with a yield of 35.66 mg/L.
  • the recombinant Saccharomyces cerevisiae strain ZWBY04RS-Pn50, constructed by introducing the Panax notoginseng-derived glycosyltransferase gene Pn50 into the protopanaxadiol-producing Saccharomyces cerevisiae strain, can synthesize the rare ginsenoside Rh2 with a yield of 45.55 mg/L.
  • Error-prone PCR was performed with the GeneMorph II Random Mutagenesis Kit (Agilent Technologies) using the plasmid UGTPg45-pMD18T as a template and, as primers, UGTPg45 random mutagenesis primer F, SEQ ID NO: 23 (Gcatagcaatctaatctaagttttaattacaaaatggagagagaaatgttgagcaaac), and UGTPg45 random mutagenesis primer R, SEQ ID NO: 24 (Gaaaagaagataatatttttatataattatattaatctcaggaggaaacaagctttgaa). The program was as follows: pre-denaturation at 95 °C for 2 min; 30 cycles of denaturation at 95 °C for 30 s, annealing at 58 °C for 30 s and extension at 72 °C for 3 min 15 s; final extension at 72 °C for 10 min.
  • Template amounts of 1 µg, 1.5 µg and 2 µg were tested to explore the mutation rate.
  • the PCR product was recovered by gel excision, A-tailed with Taq polymerase, ligated into the vector pMD18T, and transformed into E. coli TOP10.
  • Ten positive clones were randomly picked and sequenced; a mutation rate of 1-2 bases per gene was obtained at a template amount of 1.5 µg.
  • This condition was used for error-prone PCR in subsequent experiments to construct a random mutation library of UGTPg45.
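The stated rate of 1-2 bases per gene can be expressed per kilobase for comparison with kit documentation. The sketch below assumes the 1374-nucleotide UGTPg45 open reading frame as the gene length and ignores the primer-derived flanking sequence; the function name is hypothetical.

```python
def mutations_per_kb(mutations_per_gene: float, gene_length_bp: int) -> float:
    """Convert a per-gene mutation count into mutations per kilobase."""
    return mutations_per_gene / (gene_length_bp / 1000)


# Assumption: use the 1374-nt UGTPg45 ORF length given in the text and
# neglect the flanking sequence contributed by the long primers.
GENE_LENGTH_BP = 1374

low = mutations_per_kb(1, GENE_LENGTH_BP)
high = mutations_per_kb(2, GENE_LENGTH_BP)

print(f"{low:.2f}-{high:.2f} mutations per kb")
```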
  • fragment 7 was obtained by PCR using S.
  • fragment 8 primer F: Ttgatgagttcatttcaaagcttgtttcctctgagattaatataattatataaaaata
  • fragment 8 primer R (SEQ ID NO: 28): Actgtcaaggagggtattctgggcctccatgtcgctgctatataacagttgaaatttgg (obtained by PCR using S.)
  • fragment 9 primer F: Cccaaagctaagagtcccatttattc
  • fragment 9 primer R (SEQ ID NO: 30): Gaagagtaaaaaggagtagaaacattttgaagctatctgctcttgaatggcgacagcc
  • fragment 10 primer F (SEQ ID NO: 31)
  • fragment 10 primer R: Tctggtgaggatttacggtatg
  • Fragments 9 and 10 were obtained by S.
  • Fragment 11 was obtained by PCR using plasmid PLKAN as a template with SEQ ID NO: 33 (fragment 11 primer F) (Aagatgttcttatccaaatttcaactgttatatagcagcgacatggaggcccagaatac) and SEQ ID NO: 34 (fragment 11 primer R) (tacttcttgcagacatcagacatactattgtaattctcgacactggatggcggcgttag).
  • the above fragments 7-11 were mixed in an equimolar ratio and transformed into the PPD high-yield strain ZWBY04RS; the cells were spread on YPD plates (100 µg/mL G418) and cultured at 30 °C for 2 days to obtain yeast transformants expressing the UGTPg45 mutant genes. Similarly, wild-type UGTPg45 was transformed to construct yeast transformants as controls.
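The text does not say how the mixed fragments are joined in the yeast cell. Assuming they assemble through shared terminal homology, as the long primer tails suggest, the overlap between neighbouring primers can be measured. The sketch below compares the sequence listed for fragment 11 primer F with the reverse complement of the sequence that appears to belong to fragment 8 primer R; that assignment is a reading of the garbled primer listing, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (case-insensitive, returns lower case)."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def suffix_prefix_overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    a, b = a.lower(), b.lower()
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


# Sequences copied from the text; the fragment 8 primer R assignment is an assumption.
FRAG11_PRIMER_F = "Aagatgttcttatccaaatttcaactgttatatagcagcgacatggaggcccagaatac"
FRAG8_PRIMER_R = "Actgtcaaggagggtattctgggcctccatgtcgctgctatataacagttgaaatttgg"

overlap = suffix_prefix_overlap(FRAG11_PRIMER_F, reverse_complement(FRAG8_PRIMER_R))
print(f"shared stretch: {overlap} nt")
```

A long shared stretch between the two primers is consistent with fragments 8 and 11 being designed as neighbours, but it does not by itself confirm the assembly method.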
  • 600 µL of YPD medium (100 µg/mL G418) was added to each well of a 96-well plate; single yeast colonies were picked into the medium and cultured at 30 °C with shaking at 280 rpm for 1 day. 6 µL of the culture was transferred to a new 96-well plate containing 600 µL of YPD medium per well and incubated at 30 °C for 3 days with shaking at 280 rpm. 600 µL of n-butanol was added to each well, the plate was covered with a rubber cap mat, sealed with tape and agitated for 3 h. After centrifugation at 4000 rpm for 10 min, 150 µL of the n-butanol phase was pipetted into a new 96-well plate and the product was determined by HPLC.
  • the Rh2 yield of the strain ZWBY04RS-8E7, constructed using the mutant gene 8E7, was 70% higher than that of the strain ZWBY04RS-UGTPg45 constructed using UGTPg45, reaching 60.48 mg/L.
  • the wild type gene UGTPg45 has the nucleotide sequence of SEQ ID NO: 20.
  • nucleotides 1-1374 from the 5' end of SEQ ID NO: 20 are the open reading frame of UGTPg45; nucleotides 1-3 from the 5' end of SEQ ID NO: 20 are the start codon ATG of the UGTPg45 gene, and nucleotides 1372-1374 from the 5' end of SEQ ID NO: 20 are the stop codon TGA of the UGTPg45 gene.
  • the glycosyltransferase UGTPg45 gene encodes a 457 amino acid protein UGTPg45 having the amino acid residue sequence of SEQ ID NO: 19, which was predicted by software to have a theoretical molecular weight of 51.1 kDa and an isoelectric point pI of 5.10.
  • amino acids 332-375 from the amino terminus of SEQ ID NO: 19 form the glycosyltransferase PSPG conserved domain.
  • the mutant gene 8E7 has the nucleotide sequence of SEQ ID NO: 22. Nucleotides 1-1374 from the 5' end of SEQ ID NO: 22 are the open reading frame of 8E7; nucleotides 1-3 from the 5' end of SEQ ID NO: 22 are the start codon ATG of the 8E7 gene, and nucleotides 1372-1374 from the 5' end of SEQ ID NO: 22 are the stop codon TGA of the 8E7 gene.
  • the glycosyltransferase 8E7 gene encodes a protein of 457 amino acids, 8E7, having the amino acid residue sequence of SEQ ID NO:21.
  • the inventors mutated the amino acid residue Q at position 222 of Pn50 to H, deleted the amino acid residues at positions 322 and 323 of Pn50, and inserted the two amino acids VE after position 321 to obtain the Pn50 mutant Pn50-Q222H-VE.
  • Extraction and detection of protopanaxadiol and rare ginsenoside Rh2: 100 µL was taken from the 10 mL of fermentation broth, the yeast was lysed with a FastPrep, an equal volume of n-butanol was added for extraction, and the n-butanol was then evaporated to dryness under vacuum. The residue was dissolved in 100 µL of methanol and the yield of the target product was measured by HPLC.
  • the recombinant Saccharomyces cerevisiae strain ZWBY04RS-QE, constructed by introducing the Panax notoginseng-derived glycosyltransferase gene Pn50-Q222H-VE into the protopanaxadiol-producing Saccharomyces cerevisiae strain, can synthesize the rare ginsenoside Rh2 with a yield of 59.09 mg/L (Figure 4).
  • UGT-MUT1 mutates amino acid residue K at position 280 of Pn50-Q222H-VE to amino acid residue I
  • UGT-MUT2 mutates amino acid residue N at position 247 of Pn50-Q222H-VE to amino acid residue S
  • UGT-MUT3 mutates the amino acid residue K at position 280 of Pn50-Q222H-VE to amino acid residue I
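Substitutions of the kind described above are conventionally written as wild-type residue, 1-based position, new residue (for example K280I or N247S). The sketch below parses such labels and applies one to a sequence while checking the wild-type residue. The function names are assumptions, and because the real sequences are not reproduced here, the demonstration uses a short made-up toy sequence.

```python
import re

# Pattern for labels such as "K280I": wild-type residue, position, new residue.
SUBSTITUTION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> tuple:
    """Split a label like 'K280I' into ('K', 280, 'I')."""
    match = SUBSTITUTION_RE.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def apply_substitution(sequence: str, label: str) -> str:
    """Apply a substitution after verifying the wild-type residue (1-based position)."""
    wild_type, position, mutant = parse_substitution(label)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}")
    return sequence[: position - 1] + mutant + sequence[position:]


# Labels corresponding to the substitutions named in the text.
print(parse_substitution("K280I"), parse_substitution("N247S"))

# Toy example on a made-up four-residue sequence, not a real protein.
print(apply_substitution("MEQK", "Q3H"))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which matter here because positions are defined relative to a reference sequence.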
  • Saccharomyces cerevisiae strain ZWBY04RS was transformed by the conventional LiAc/ssDNA transformation method of Saccharomyces cerevisiae to obtain recombinant Saccharomyces cerevisiae strains ZWBY04RS-MUT1, ZWBY04RS-MUT2 and ZWBY04RS-MUT3.
  • Extraction and detection of protopanaxadiol and rare ginsenoside Rh2: 100 µL was taken from the 10 mL of fermentation broth, the yeast was lysed with a FastPrep, an equal volume of n-butanol was added for extraction, and the n-butanol was then evaporated to dryness under vacuum. The residue was dissolved in 100 µL of methanol and the yield of the target product was measured by HPLC. The HPLC results are shown in Figure 4.
  • the recombinant Saccharomyces cerevisiae strain ZWBY04RS-MUT1, constructed by introducing the Panax notoginseng-derived glycosyltransferase gene UGT-MUT1 into the protopanaxadiol-producing Saccharomyces cerevisiae strain, can synthesize the rare ginsenoside Rh2 with a yield of 67.96 mg/L.
  • the recombinant Saccharomyces cerevisiae strain ZWBY04RS-MUT2, constructed by introducing the Panax notoginseng-derived glycosyltransferase gene UGT-MUT2 into the protopanaxadiol-producing Saccharomyces cerevisiae strain, can synthesize the rare ginsenoside Rh2 with a yield of 78.29 mg/L.
  • the recombinant Saccharomyces cerevisiae strain ZWBY04RS-MUT3, constructed by introducing the Panax notoginseng-derived glycosyltransferase gene UGT-MUT3 into the protopanaxadiol-producing Saccharomyces cerevisiae strain, can synthesize the rare ginsenoside Rh2 with a yield of 83.53 mg/L.
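The Rh2 yields reported for the individual strains can be compared on a common footing. The sketch below collects the stated values and computes relative gains; the dictionary keys and the function name are illustrative, and the numbers are taken as quoted without any replicate or error information.

```python
# Rh2 yields (mg/L) as stated in the text for each ZWBY04RS-derived strain.
RH2_YIELD_MG_PER_L = {
    "UGTPg45": 35.66,
    "Pn50": 45.55,
    "8E7": 60.48,
    "Pn50-Q222H-VE": 59.09,
    "UGT-MUT1": 67.96,
    "UGT-MUT2": 78.29,
    "UGT-MUT3": 83.53,
}


def percent_gain(strain: str, reference: str) -> float:
    """Relative increase in Rh2 yield of `strain` over `reference`, in percent."""
    return (RH2_YIELD_MG_PER_L[strain] / RH2_YIELD_MG_PER_L[reference] - 1) * 100


for strain in RH2_YIELD_MG_PER_L:
    if strain != "UGTPg45":
        print(f"{strain}: {percent_gain(strain, 'UGTPg45'):+.1f}% vs UGTPg45")
```

Computed this way, the gain of 8E7 over UGTPg45 comes out close to the 70% quoted in the text.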
  • A large number of glycosyltransferase candidate genes have been discovered through transcriptome analysis of ginseng, American ginseng and Panax notoginseng, but only a very small number of glycosyltransferases have been verified to be involved in the synthesis of ginsenosides. Glycosyltransferases involved in the synthesis of ginsenosides in Panax notoginseng have not been reported so far.
  • Panax notoginseng also synthesizes the same ginsenosides
  • the discovery of Panax notoginseng-derived glycosyltransferases on the one hand allows us to better understand the pathways by which these two plants synthesize ginsenosides, and on the other hand is of great significance for synthesizing a richer range of ginsenosides.

Abstract

Provided are a glycosyltransferase, a mutant, and an in vitro glycosylation method. The method comprises the steps of: transferring a glycosyl group of a glycosyl donor to a hydroxyl group at the C-3 position of a tetracyclic triterpenoid compound in the presence of a glycosyltransferase to form a glycosylated tetracyclic triterpenoid compound, wherein the glycosyltransferase is a glycosyltransferase as shown in SEQ ID NO.:4 or a polypeptide derived therefrom, or a glycosyltransferase as shown in SEQ ID NO.:21 or a polypeptide derived therefrom.

Description

Glycosyltransferase, mutant and application thereof
Technical field
The present invention relates to the fields of biotechnology and plant biology. Specifically, it relates to glycosyltransferases and glycosyltransferase mutants for the synthesis of ginsenoside Rh2, and to uses thereof.
Background art
Ginsenosides are the main active substances in plants of the genus Panax of the family Araliaceae (such as ginseng, Panax notoginseng and American ginseng); in recent years, some ginsenosides have also been found in the Cucurbitaceae plant Panax notoginseng. To date, scientists at home and abroad have isolated at least 100 saponins from plants such as ginseng and Panax notoginseng; ginsenosides are triterpenoid saponins. Some ginsenosides have been shown to have a wide range of physiological functions and medicinal value, including anti-tumor, immunomodulatory, anti-fatigue, cardioprotective and hepatoprotective effects. Several of these saponins are already in clinical use: for example, Shenyi Capsule, whose main component is the ginsenoside Rg3 monomer, can relieve qi-deficiency symptoms in tumor patients and improve immune function. Jinxing Capsule, whose main component is the ginsenoside Rh2 monomer, is a health product used to improve immunity and enhance disease resistance.
Structurally, ginsenosides are bioactive small molecules formed by glycosylation of sapogenins. There are only a few ginsenoside sapogenins, mainly the dammarane-type protopanaxadiol and protopanaxatriol and the oleanane-type amyrin. Apart from differences in the sapogenin, structural differences among ginsenosides lie mainly in the different glycosylation modifications of the sapogenin. The sugar chains of ginsenosides are generally attached to the hydroxyl groups at C3, C6 or C20 of the sapogenin; the sugar residues may be glucose, rhamnose, xylose or arabinose.
Differences in glycosylation site and in sugar-chain composition and length give ginsenosides very different physiological functions and medicinal value. For example, ginsenosides Rb1, Rd and Rc all have protopanaxadiol as the sapogenin and differ only in their glycosyl modifications, yet their physiological functions differ considerably. Rb1 stabilizes the central nervous system, whereas Rc inhibits it; Rb1 has a very broad range of physiological functions, whereas Rd has only a few.
Rare ginsenosides are saponins present in ginseng at very low levels. Ginsenoside Rh2 (3-O-β-(D-glucopyranosyl)-20(S)-protopanaxadiol) is a protopanaxadiol-type saponin bearing one glucosyl group on the C-3 hydroxyl of the sapogenin. Rh2 accounts for only about one ten-thousandth of the dry weight of ginseng, but it has good anti-tumor activity and is one of the most important anti-tumor constituents of ginseng: it inhibits tumor cell growth, induces tumor cell apoptosis and counteracts tumor metastasis. Studies have shown that ginsenoside Rh2 inhibits the proliferation of 3LL lung cancer cells (mice), Morris liver cancer cells (rats), B-16 melanoma cells (mice) and HeLa cells (human). Clinically, combining ginsenoside Rh2 with radiotherapy or chemotherapy can enhance the effect of those treatments. In addition, ginsenoside Rh2 has anti-allergic activity, improves immunity, and inhibits inflammation caused by NO and PGE.
The function of a glycosyltransferase is to transfer the glycosyl group of a glycosyl donor (a nucleoside diphosphate sugar such as UDP-glucose) to various glycosyl acceptors. Glycosyltransferases are currently divided into 94 families on the basis of amino acid sequence. More than a hundred different glycosyltransferases have been found in the plant genomes sequenced to date. Their glycosyl acceptors include sugars, lipids, proteins, nucleic acids, antibiotics and other small molecules. In ginseng, the glycosyltransferases involved in saponin glycosylation transfer the glycosyl group of the donor to the C3, C6 or C20 hydroxyl of the sapogenin or aglycone, forming saponins with different medicinal value.
The field currently lacks an effective method for producing the rare ginsenoside Rh2 and ginsenoside F2, so there is an urgent need to develop a variety of specific and efficient glycosyltransferases.
Summary of the invention
The object of the present invention is to provide a class of glycosyltransferases and their use for synthesizing the rare ginsenoside Rh2 and ginsenoside F2.
In a first aspect of the invention, an isolated polypeptide is provided, wherein
(1) in the amino acid sequence of the isolated polypeptide, the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19 is not Gln, and/or the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19 is not Ala;
(2) in the amino acid sequence of the isolated polypeptide, the amino acid residue corresponding to position 247 of the amino acid sequence shown in SEQ ID NO: 19 is not Asn (N), and/or the amino acid residue corresponding to position 280 of the amino acid sequence shown in SEQ ID NO: 19 is not Lys (K).
In another preferred embodiment, the isolated polypeptide:
i) has the amino acid sequence shown in SEQ ID NO: 19 in which the amino acid residue at position 222 is not Gln and/or the amino acid residue at position 322 is not Ala, or
ii) is an isolated polypeptide derived from i), having a sequence formed by substitution, deletion or addition of one or several amino acid residues (preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1) in the sequence defined in i), and substantially retaining the function of the isolated polypeptide defined in i).
In another preferred embodiment, the isolated polypeptide:
i) has the amino acid sequence shown in SEQ ID NO: 19 in which the amino acid residue at position 247 is not Asn (N) and/or the amino acid residue at position 280 is not Lys (K), or
ii) is an isolated polypeptide derived from i), having a sequence formed by substitution, deletion or addition of one or several amino acid residues (preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1) in the sequence defined in i), and substantially retaining the function of the isolated polypeptide defined in i).
In another preferred embodiment, the isolated polypeptide is an isolated polypeptide derived from i), having a sequence formed by the addition of one or several amino acid residues (preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1) to the sequence defined in i), and substantially retaining the function of the isolated polypeptide defined in i).
In another preferred embodiment, in the amino acid sequence of the isolated polypeptide, the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19 is selected from at least one of the following amino acids: His, Asn, Gln, Lys and Arg.
In another preferred embodiment, in the amino acid sequence of the isolated polypeptide, the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19 is His.
In another preferred embodiment, in the amino acid sequence of the isolated polypeptide, the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19 is selected from at least one of the following amino acids: Val, Ile, Leu, Met and Phe.
In another preferred embodiment, in the amino acid sequence of the isolated polypeptide, the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19 is Val.
In another preferred embodiment, in the amino acid sequence of the isolated polypeptide, the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19 is His, and the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19 is Val.
在另一优选例中,所述分离的多肽的氨基酸序列在对应于SEQ ID NO:19所示氨基酸序列的第247位的氨基酸残基选自以下氨基酸的至少一种:Ser(S)、Pro(P),Ala(A)或Thr(T)。In another preferred embodiment, the amino acid sequence of the isolated polypeptide at amino acid residue corresponding to position 247 of the amino acid sequence set forth in SEQ ID NO: 19 is selected from at least one of the following amino acids: Ser(S), Pro (P), Ala (A) or Thr (T).
在另一优选例中,所述分离的多肽的氨基酸序列在对应于SEQ ID NO:19所示 氨基酸序列的第247位的氨基酸残基为Ser(S)。In another preferred embodiment, the amino acid sequence of the isolated polypeptide is Ser(S) at the amino acid residue corresponding to position 247 of the amino acid sequence shown by SEQ ID NO: 19.
在另一优选例中,所述分离的多肽的氨基酸序列在对应于SEQ ID NO:19所示氨基酸序列的第280位的氨基酸残基选自以下氨基酸的至少一种:Ile(I)、Asn(N),Ser(S)或Alan(A)。In another preferred embodiment, the amino acid sequence of the isolated polypeptide at amino acid residue corresponding to position 280 of the amino acid sequence set forth in SEQ ID NO: 19 is selected from at least one of the following amino acids: Ile(I), Asn (N), Ser(S) or Alan (A).
在另一优选例中,所述分离的多肽的氨基酸序列在对应于SEQ ID NO:19所示氨基酸序列的第280位的氨基酸残基为Ile(I)。In another preferred embodiment, the amino acid sequence of the isolated polypeptide is Ile(I) at the amino acid residue corresponding to position 280 of the amino acid sequence set forth in SEQ ID NO: 19.
在另一优选例中,所述分离的多肽的氨基酸序列在对应于SEQ ID NO:19所示氨基酸序列的第247位的氨基酸残基为Ser(S),在对应于SEQ ID NO:19所示氨基酸序列的第280位的氨基酸残基为Ile(I)。In another preferred embodiment, the amino acid sequence of the isolated polypeptide is Ser(S) at amino acid residue corresponding to position 247 of the amino acid sequence set forth in SEQ ID NO: 19, corresponding to SEQ ID NO: 19 The amino acid residue at position 280 of the amino acid sequence is Ile(I).
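The positional embodiments above can be illustrated with a minimal sketch that checks which residue a polypeptide carries at positions 222, 247, 280 and 322. This is only an illustration under stated assumptions: the helper names are hypothetical, the toy sequence is invented, and the sequence is assumed to already share the numbering of SEQ ID NO: 19 (a real check of a position "corresponding to" SEQ ID NO: 19 would first require a sequence alignment, which is not shown).

```python
# Illustrative sketch only; names and the toy sequence are hypothetical.
# Assumption: the query sequence uses the same numbering as SEQ ID NO: 19.

# Residues (one-letter codes) listed in the embodiments for each position.
ALLOWED = {
    222: set("HNQKR"),  # His, Asn, Gln, Lys, Arg
    322: set("VILMF"),  # Val, Ile, Leu, Met, Phe
    247: set("SPAT"),   # Ser, Pro, Ala, Thr
    280: set("INSA"),   # Ile, Asn, Ser, Ala
}


def residue_at(seq: str, position: int) -> str:
    """Return the residue at a 1-based position of the sequence."""
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    return seq[position - 1]


def check_positions(seq: str, positions: list[int]) -> dict[int, bool]:
    """Report, per position, whether the residue is in the listed set."""
    return {p: residue_at(seq, p) in ALLOWED[p] for p in positions}


# Toy 330-residue sequence (not a real enzyme): poly-Gly with four edits.
toy = ["G"] * 330
toy[222 - 1] = "H"
toy[322 - 1] = "V"
toy[247 - 1] = "S"
toy[280 - 1] = "K"
toy_seq = "".join(toy)

result = check_positions(toy_seq, [222, 322, 247, 280])
print(result)
```

In this toy case positions 222, 322 and 247 carry listed residues, while Lys at position 280 does not.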
In another preferred embodiment, the isolated polypeptide is:
iii) a polypeptide having the amino acid sequence shown in SEQ ID NO: 19 in which the amino acid residue at position 222 is not Gln and/or the amino acid residue at position 322 is not Ala; or
iv) an isolated polypeptide derived from iii), having a sequence formed from the sequence defined in iii) by substitution, deletion or addition of one or several amino acid residues (preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1), and substantially having the function of the isolated polypeptide defined in iii).
In another preferred embodiment, the isolated polypeptide is:
iii) a polypeptide having the amino acid sequence shown in SEQ ID NO: 19 in which the amino acid residue at position 247 is not Asn (N) and/or the amino acid residue at position 280 is not Lys (K); or
iv) an isolated polypeptide derived from iii), having a sequence formed from the sequence defined in iii) by substitution, deletion or addition of one or several amino acid residues (preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1), and substantially having the function of the isolated polypeptide defined in iii).
In another preferred embodiment, the isolated polypeptide is a polypeptide derived from iii), having a sequence formed from the sequence defined in iii) by addition of one or several amino acid residues (preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1), and substantially having the function of the isolated polypeptide defined in iii).
In another preferred embodiment, the amino acid residue of the isolated polypeptide at position 222 of the amino acid sequence shown in SEQ ID NO: 19 is selected from the following amino acids: His, Asn, Gln, Lys and Arg.
In another preferred embodiment, the amino acid residue of the isolated polypeptide at position 222 of the amino acid sequence shown in SEQ ID NO: 19 is His.
In another preferred embodiment, the amino acid residue of the isolated polypeptide at position 322 of the amino acid sequence shown in SEQ ID NO: 19 is selected from the following amino acids: Val, Ile, Leu, Met and Phe.
In another preferred embodiment, the amino acid residue of the isolated polypeptide at position 322 of the amino acid sequence shown in SEQ ID NO: 19 is Val.
In another preferred embodiment, the amino acid residue of the isolated polypeptide at position 222 of the amino acid sequence shown in SEQ ID NO: 19 is His, and the amino acid residue at position 322 of the amino acid sequence shown in SEQ ID NO: 19 is Val.
In another preferred embodiment, the amino acid residue of the isolated polypeptide at position 247 of the amino acid sequence shown in SEQ ID NO: 19 is selected from the following amino acids: Ser (S), Pro (P), Ala (A) or Thr (T).
In another preferred embodiment, the amino acid residue of the isolated polypeptide at position 247 of the amino acid sequence shown in SEQ ID NO: 19 is Ser (S).
In another preferred embodiment, the amino acid residue of the isolated polypeptide at position 280 of the amino acid sequence shown in SEQ ID NO: 19 is selected from the following amino acids: Ile (I), Asn (N), Ser (S) or Ala (A).
In another preferred embodiment, the amino acid residue of the isolated polypeptide at position 280 of the amino acid sequence shown in SEQ ID NO: 19 is Ile (I).
In another preferred embodiment, the amino acid residue of the isolated polypeptide at position 247 of the amino acid sequence shown in SEQ ID NO: 19 is Ser (S), and the amino acid residue at position 280 of the amino acid sequence shown in SEQ ID NO: 19 is Ile (I).
In another preferred embodiment, the isolated polypeptide is a glycosyltransferase.
In another preferred embodiment, the glycosyltransferase is derived from a plant of the genus Panax.
In another preferred embodiment, the glycosyltransferase is derived from ginseng (Panax ginseng), American ginseng and/or Panax notoginseng.
In another preferred embodiment, the polypeptide is selected from the group consisting of:
(a) a polypeptide having the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41;
(b) a derived polypeptide that has glycosyltransferase activity and is formed from a polypeptide having the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41 by substitution, deletion or addition of one or several amino acid residues (preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1), or by addition of a signal peptide sequence;
(c) a derived polypeptide whose sequence contains the polypeptide sequence described in (a) or (b);
(d) a derived polypeptide that has glycosyltransferase activity and whose amino acid sequence has ≥85% (preferably ≥90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%) homology to the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41.
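The homology threshold in item (d) can be illustrated with a minimal sketch. The text does not say how homology is computed, so the sketch makes assumptions that are not taken from the source: homology is approximated as percent identity over the columns of a pre-computed pairwise alignment, gaps are written as "-", and the function names and toy sequences are hypothetical.

```python
# Illustrative sketch only. Assumptions (not from the source text):
# - "homology" is approximated as percent identity,
# - both inputs are already aligned to equal length, with "-" for gaps.


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """True if the identity reaches the threshold (85% is the broadest value in item (d))."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy peptides, not real enzyme sequences: one mismatch in ten columns.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
print(meets_threshold("ACDEFGHIKL", "ACDEFGHIKV"))
```

Different alignment tools and identity definitions (for example, how gap columns are counted) give somewhat different percentages, so the denominator used here is a design choice rather than a standard.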
In another preferred embodiment, the polypeptide of (c) is a fusion protein formed by adding a tag sequence, a signal sequence or a secretion signal sequence to (a) or (b).
In another preferred embodiment, the glycosyltransferase activity refers to the ability to transfer a glycosyl group from a glycosyl donor to the C-3 hydroxyl group of a tetracyclic triterpenoid.
In another preferred embodiment, the glycosyltransferase increases the yield of ginsenoside Rh2 and/or ginsenoside F2.
In another preferred embodiment, the glycosyltransferase increases the yield of ginsenoside Rh2 in an artificially constructed strain; preferably by 5-150%; more preferably by 10-100%; more preferably by 20-80%; most preferably by 28-70%.
In another preferred embodiment, the artificially constructed strain is selected from the group consisting of a Saccharomyces cerevisiae strain, an Escherichia coli strain, a Pichia strain, a Schizosaccharomyces strain and a Kluyveromyces strain.
In a second aspect, the present invention provides an isolated polynucleotide, the polynucleotide being a sequence selected from the group consisting of:
(A) a nucleotide sequence encoding the polypeptide of the first aspect of the present invention;
(B) a nucleotide sequence encoding the polypeptide shown in SEQ ID NO.: 4 or SEQ ID NO.: 21, or a polypeptide derived therefrom;
(C) the nucleotide sequence shown in SEQ ID NO.: 3, SEQ ID NO.: 22, SEQ ID NO.: 36, SEQ ID NO.: 38, SEQ ID NO.: 40 or SEQ ID NO.: 42;
(D) a nucleotide sequence having ≥90% (preferably ≥91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%) homology to the sequence shown in SEQ ID NO.: 3, SEQ ID NO.: 22, SEQ ID NO.: 36, SEQ ID NO.: 38, SEQ ID NO.: 40 or SEQ ID NO.: 42;
(E) a nucleotide sequence formed by truncating, or adding, 1-60 (preferably 1-30, more preferably 1-10) nucleotides at the 5' end and/or 3' end of the nucleotide sequence shown in SEQ ID NO.: 3, SEQ ID NO.: 22, SEQ ID NO.: 36, SEQ ID NO.: 38, SEQ ID NO.: 40 or SEQ ID NO.: 42;
(F) a nucleotide sequence complementary to the nucleotide sequence of any one of (A) to (E).
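Item (F) refers to complementary nucleotide sequences. As a minimal sketch of what that means for a DNA strand, the function below returns the reverse complement written 5' to 3'. The function name and the toy input are hypothetical, and the sketch assumes plain A/C/G/T DNA without ambiguity codes.

```python
# Illustrative sketch only; assumes unambiguous DNA letters (A, C, G, T).

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written in the 5'->3' direction."""
    invalid = set(seq) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unsupported characters: {sorted(invalid)}")
    # Complement each base, then reverse so the result reads 5'->3'.
    return seq.translate(_COMPLEMENT)[::-1]


# Toy fragment, not taken from any SEQ ID NO in this document.
print(reverse_complement("ATGGCC"))
```

Applying the function twice returns the original strand, which is a quick sanity check when handling sequence files.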
In a third aspect, the present invention provides a vector containing the polynucleotide of the second aspect of the present invention.
In another preferred embodiment, the vector includes an expression vector, a shuttle vector or an integration vector.
In a fourth aspect, the present invention provides a use of the isolated polypeptide of the first aspect of the present invention for catalyzing the following reaction, or for preparing a catalytic preparation that catalyzes the following reaction:
(i) transferring a glycosyl group from a glycosyl donor to the C-3 hydroxyl group of a tetracyclic triterpenoid.
In another preferred embodiment, the glycosyl donor includes a nucleoside diphosphate sugar selected from the group consisting of: UDP-glucose, ADP-glucose, TDP-glucose, CDP-glucose, GDP-glucose, UDP-acetylglucose, ADP-acetylglucose, TDP-acetylglucose, CDP-acetylglucose, GDP-acetylglucose, UDP-xylose, ADP-xylose, TDP-xylose, CDP-xylose, GDP-xylose, UDP-galacturonic acid, ADP-galacturonic acid, TDP-galacturonic acid, CDP-galacturonic acid, GDP-galacturonic acid, UDP-galactose, ADP-galactose, TDP-galactose, CDP-galactose, GDP-galactose, UDP-arabinose, ADP-arabinose, TDP-arabinose, CDP-arabinose, GDP-arabinose, UDP-rhamnose, ADP-rhamnose, TDP-rhamnose, CDP-rhamnose, GDP-rhamnose, other nucleoside diphosphate hexoses or nucleoside diphosphate pentoses, or a combination thereof.
In another preferred embodiment, the glycosyl donor includes a uridine diphosphate (UDP) sugar selected from the group consisting of: UDP-glucose, UDP-xylose, UDP-galacturonic acid, UDP-galactose, UDP-arabinose, UDP-rhamnose, other uridine diphosphate hexoses or uridine diphosphate pentoses, or a combination thereof.
In another preferred embodiment, the isolated polypeptide is used to catalyze the following reaction, or to prepare a catalytic preparation that catalyzes the following reaction:
Figure PCTCN2018086738-appb-000001
wherein R1 is H or OH; R2 is H or OH; R3 is H or a glycosyl group; and R4 is a glycosyl group.
In another preferred embodiment, the isolated polypeptide is used to catalyze the following reaction, or to prepare a catalytic preparation that catalyzes the following reaction:
Figure PCTCN2018086738-appb-000002
The polypeptide is selected from the polypeptides shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41, or polypeptides derived therefrom.
In another preferred embodiment, the glycosyl group is selected from: glucosyl, galacturonic acid group, xylosyl, galactosyl, arabinosyl, rhamnosyl, and other hexosyl or pentosyl groups.
In another preferred embodiment, the products of reaction (A) and/or (B) include (but are not limited to): S- or R-configuration dammarane-type tetracyclic triterpenoids, lanostane-type tetracyclic triterpenoids, apotirucallane-type tetracyclic triterpenoids, tirucallane-type tetracyclic triterpenoids, cycloartane-type tetracyclic triterpenoids, cucurbitane-type tetracyclic triterpenoids, or meliacane-type tetracyclic triterpenoids.
In a fifth aspect, the present invention provides an in vitro glycosylation method, comprising the step of:
in the presence of a glycosyltransferase, transferring a glycosyl group from a glycosyl donor to the C-3 hydroxyl group of a tetracyclic triterpenoid, thereby forming a glycosylated tetracyclic triterpenoid;
wherein the glycosyltransferase is the polypeptide of the first aspect of the present invention or a polypeptide derived therefrom.
In another preferred embodiment, the derived polypeptide is selected from:
a derived polypeptide that has glycosyltransferase activity and is formed from a polypeptide having the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41 by substitution, deletion or addition of one or several amino acid residues, or by addition of a signal peptide sequence; or
a derived polypeptide that has glycosyltransferase activity and whose amino acid sequence has ≥85% (preferably ≥90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%) homology to the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41;
wherein the glycosyltransferase activity refers to the ability to transfer a glycosyl group from a glycosyl donor to the C-3 hydroxyl group of a tetracyclic triterpenoid.
In a sixth aspect, the present invention provides a method for carrying out a glycosyl-catalyzed reaction, comprising the step of carrying out the glycosyl-catalyzed reaction in the presence of the polypeptide of the first aspect of the present invention or a polypeptide derived therefrom.
In another preferred embodiment, the method further comprises the step of:
converting the compound of formula (I) into the compound of formula (II) in the presence of a glycosyl donor and the polypeptide of the first aspect of the present invention or a polypeptide derived therefrom.
In another preferred embodiment, the compound of formula (I) is protopanaxadiol (PPD) and the compound of formula (II) is ginsenoside Rh2; or
the compound of formula (I) is Compound K and the compound of formula (II) is ginsenoside F2.
In another preferred embodiment, the method further comprises adding the polypeptide and its derived polypeptide to the catalytic reaction separately; and/or
adding the polypeptide and its derived polypeptide to the catalytic reaction simultaneously.
In another preferred embodiment, the method further comprises co-expressing, in a host cell, the nucleotide sequence encoding the glycosyltransferase together with key genes of the dammarenediol and/or protopanaxadiol anabolic pathway and/or other glycosyltransferase genes, thereby obtaining the compound of formula (II).
In another preferred embodiment, the compound of formula (II) is ginsenoside Rh2 or ginsenoside F2.
In another preferred embodiment, the host cell is a yeast or Escherichia coli.
In another preferred embodiment, the method further comprises providing the reaction system with an additive for regulating enzyme activity.
In another preferred embodiment, the additive for regulating enzyme activity is an additive that increases or inhibits enzyme activity.
In another preferred embodiment, the additive for regulating enzyme activity is selected from the group consisting of: Ca²⁺, Co²⁺, Mn²⁺, Ba²⁺, Al³⁺, Ni²⁺, Zn²⁺ and Fe²⁺.
In another preferred embodiment, the additive for regulating enzyme activity is a substance that can generate Ca²⁺, Co²⁺, Mn²⁺, Ba²⁺, Al³⁺, Ni²⁺, Zn²⁺ or Fe²⁺.
In another preferred embodiment, the glycosyl donor is a nucleoside diphosphate sugar selected from the group consisting of: UDP-glucose, ADP-glucose, TDP-glucose, CDP-glucose, GDP-glucose, UDP-acetylglucose, ADP-acetylglucose, TDP-acetylglucose, CDP-acetylglucose, GDP-acetylglucose, UDP-xylose, ADP-xylose, TDP-xylose, CDP-xylose, GDP-xylose, UDP-galacturonic acid, ADP-galacturonic acid, TDP-galacturonic acid, CDP-galacturonic acid, GDP-galacturonic acid, UDP-galactose, ADP-galactose, TDP-galactose, CDP-galactose, GDP-galactose, UDP-arabinose, ADP-arabinose, TDP-arabinose, CDP-arabinose, GDP-arabinose, UDP-rhamnose, ADP-rhamnose, TDP-rhamnose, CDP-rhamnose, GDP-rhamnose, other nucleoside diphosphate hexoses or nucleoside diphosphate pentoses, or a combination thereof.
In another preferred embodiment, the glycosyl donor is a uridine diphosphate sugar selected from the group consisting of: UDP-glucose, UDP-xylose, UDP-galacturonic acid, UDP-galactose, UDP-arabinose, UDP-rhamnose, other uridine diphosphate hexoses or uridine diphosphate pentoses, or a combination thereof.
In another preferred embodiment, the pH of the reaction system is 4.0-10.0, preferably 5.5-9.0.
In another preferred embodiment, the temperature of the reaction system is 10°C-105°C, preferably 20°C-50°C.
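The pH and temperature ranges just stated can be captured in a small lookup, for instance when screening planned reaction conditions. The sketch below is only an illustration: the function names and the three labels are hypothetical, and only the numeric bounds come from the text.

```python
# Illustrative sketch only; function names and labels are hypothetical.
# Numeric bounds are the ranges stated for the reaction system.

PH_BROAD, PH_PREFERRED = (4.0, 10.0), (5.5, 9.0)
TEMP_BROAD_C, TEMP_PREFERRED_C = (10.0, 105.0), (20.0, 50.0)


def classify(value: float, broad: tuple[float, float], preferred: tuple[float, float]) -> str:
    """Label a value as 'preferred', 'allowed' (broad range only) or 'outside'."""
    if preferred[0] <= value <= preferred[1]:
        return "preferred"
    if broad[0] <= value <= broad[1]:
        return "allowed"
    return "outside"


def classify_conditions(ph: float, temp_c: float) -> dict[str, str]:
    """Classify a pH and a temperature (in degrees Celsius) against the stated ranges."""
    return {
        "pH": classify(ph, PH_BROAD, PH_PREFERRED),
        "temperature": classify(temp_c, TEMP_BROAD_C, TEMP_PREFERRED_C),
    }


print(classify_conditions(7.5, 30.0))
print(classify_conditions(4.5, 60.0))
```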
In another preferred embodiment, the key genes of the dammarenediol anabolic pathway include (but are not limited to) the dammarenediol synthase gene.
In another preferred embodiment, the key genes of the protopanaxadiol anabolic pathway include (but are not limited to): the dammarenediol synthase gene, the cytochrome P450 gene CYP716A47 for protopanaxadiol synthesis and its reductase gene, or a combination thereof.
In another preferred embodiment, the substrate of the glycosyl-catalyzed reaction is a compound of formula (I), and the product is a compound of formula (II).
In another preferred embodiment, the compound of formula (I) is protopanaxadiol (PPD) and the compound of formula (II) is ginsenoside Rh2; or
the compound of formula (I) is Compound K and the compound of formula (II) is ginsenoside F2.
In a seventh aspect, the present invention provides a genetically engineered host cell, which contains the vector of the third aspect of the present invention, or has the polynucleotide of the second aspect of the present invention integrated into its genome.
In another preferred embodiment, the glycosyltransferase is the polypeptide of the first aspect of the present invention or a polypeptide derived therefrom.
In another preferred embodiment, the nucleotide sequence encoding the glycosyltransferase is as described in the second aspect of the present invention.
In another preferred embodiment, the cell is a prokaryotic cell or a eukaryotic cell.
In another preferred embodiment, the host cell is a eukaryotic cell, such as a yeast cell or a plant cell.
In another preferred embodiment, the host cell is a Saccharomyces cerevisiae cell.
In another preferred embodiment, the host cell is a prokaryotic cell, such as Escherichia coli.
In another preferred embodiment, the host cell is a ginseng cell.
In another preferred embodiment, the host cell is not a cell that naturally produces the compound of formula (II).
In another preferred embodiment, the host cell is not a cell that naturally produces ginsenoside Rh2 or ginsenoside F2.
In another preferred embodiment, the key genes of the dammarenediol anabolic pathway include (but are not limited to) the dammarenediol synthase gene.
In another preferred embodiment, the host cell contains key genes of the protopanaxadiol anabolic pathway, including (but not limited to): the dammarenediol synthase gene, the cytochrome P450 gene CYP716A47 for protopanaxadiol synthesis and its reductase gene, or a combination thereof.
In an eighth aspect, the present invention provides a use of the host cell of the seventh aspect of the present invention for preparing an enzyme catalytic reagent, for producing a glycosyltransferase, as a catalytic cell, or for producing a glycosylated tetracyclic triterpenoid.
In another preferred embodiment, the tetracyclic triterpenoid is a compound of formula (II).
In another preferred embodiment, the host cell is used to produce a compound of formula (II) by glycosylation of a compound of formula (I).
In another preferred embodiment, the host cell is used to produce ginsenoside Rh2 and/or ginsenoside F2 by glycosylation of protopanaxadiol (PPD) and/or Compound K.
In a ninth aspect, the present invention provides a method for producing a transgenic plant, comprising the step of regenerating the genetically engineered host cell of the seventh aspect of the present invention into a plant, wherein the genetically engineered host cell is a plant cell.
In another preferred embodiment, the genetically engineered host cell is selected from: a ginseng cell, an American ginseng cell, a Panax notoginseng cell and a tobacco cell.
It should be understood that, within the scope of the present invention, the technical features described above and the technical features specifically described below (for example, in the Examples) may be combined with one another to form new or preferred technical solutions. For brevity, they are not described one by one here.
Brief Description of the Drawings
Figure 1 shows the agarose gel electrophoresis of the glycosyltransferase gene Pn50.
Figure 2 shows the TLC analysis of ginsenoside catalysis by the glycosyltransferase Pn50.
Figure 3 shows the HPLC analysis of rare ginsenoside Rh2 produced by the recombinant Saccharomyces cerevisiae strain constructed with Pn50.
Figure 4 shows a comparison of the fermentation yields of rare ginsenoside Rh2 from recombinant Saccharomyces cerevisiae strains constructed with the Pn50 protein mutants Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 and UGT-MUT3.
Detailed Description
After extensive and in-depth research, the inventors provide for the first time the use of the Panax notoginseng glycosyltransferase Pn50 (SEQ ID NO.: 4), the mutant 8E7 (SEQ ID NO.: 21) of the ginseng-derived glycosyltransferase UGTPg45, and the mutants Pn50-Q222H-VE (SEQ ID NO.: 41), UGT-MUT1 (SEQ ID NO.: 35), UGT-MUT2 (SEQ ID NO.: 37) and UGT-MUT3 (SEQ ID NO.: 39) of the Panax notoginseng glycosyltransferase Pn50 in catalyzing the glycosylation of terpenoids and in the synthesis of new saponins.
Specifically, the glycosyltransferases of the present invention can specifically and efficiently catalyze the glycosylation of tetracyclic triterpenoid substrates and/or transfer a glycosyl group from a glycosyl donor to the C-3 hydroxyl group of a tetracyclic triterpenoid. In particular, they can efficiently convert protopanaxadiol (PPD) into the rare ginsenoside Rh2, which has antitumor activity, and convert the rare ginsenoside Compound K (PPD carrying one glycosyl modification at C20) into ginsenoside F2.
Unexpectedly, the recombinant Saccharomyces cerevisiae strain ZWBY04RS-Pn50, constructed by introducing the Panax notoginseng-derived glycosyltransferase gene Pn50 into a protopanaxadiol-producing Saccharomyces cerevisiae strain, can synthesize the rare ginsenoside Rh2, and its Rh2 yield is 28% higher (45.55/35.66-1=28%) than that of strain ZWBY04RS-UGTPg45, which carries the wild-type ginseng-derived glycosyltransferase gene UGTPg45. The Rh2-producing strain ZWBY04RS-8E7, constructed with the mutant gene 8E7 obtained by random mutagenesis of UGTPg45, has an Rh2 yield 70% higher (60.48/35.66-1=70%) than strain ZWBY04RS-UGTPg45. The Rh2-producing strain ZWBY04RS-QE, constructed with the mutant gene Pn50-Q222H-VE obtained by site-directed mutagenesis of Pn50, has an Rh2 yield 66% higher (59.09/35.66-1=66%) than strain ZWBY04RS-UGTPg45. Further site-directed mutagenesis of Pn50-Q222H-VE gave the mutant genes UGT-MUT1, UGT-MUT2 and UGT-MUT3. Relative to strain ZWBY04RS-UGTPg45, the Rh2 yield of strain ZWBY04RS-MUT1 (constructed with UGT-MUT1) is 90% higher (67.96/35.66-1=90%), that of strain ZWBY04RS-MUT2 (constructed with UGT-MUT2) is 120% higher (78.29/35.66-1=120%), and that of strain ZWBY04RS-MUT3 (constructed with UGT-MUT3) is 134% higher (83.53/35.66-1=134%).
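The relative increases quoted above all follow the same arithmetic, (yield of the test strain / yield of the UGTPg45 strain) - 1. The sketch below recomputes them from the yield figures given in the text; the variable and function names are hypothetical, and the yield unit is not stated in this passage, so the values are treated as unitless.

```python
# Illustrative sketch only; names are hypothetical.
# Rh2 yields as quoted in the text (unit not stated in this passage).
BASELINE = "UGTPg45"
rh2_yield = {
    "UGTPg45": 35.66,
    "Pn50": 45.55,
    "8E7": 60.48,
    "Pn50-Q222H-VE": 59.09,
    "UGT-MUT1": 67.96,
    "UGT-MUT2": 78.29,
    "UGT-MUT3": 83.53,
}


def percent_increase(value: float, baseline: float) -> float:
    """Relative increase over the baseline, in percent: (value / baseline - 1) * 100."""
    return (value / baseline - 1.0) * 100.0


increases = {
    name: percent_increase(value, rh2_yield[BASELINE])
    for name, value in rh2_yield.items()
    if name != BASELINE
}

for name, pct in increases.items():
    print(f"{name}: +{pct:.1f}%")
```

Printed to one decimal place, the UGT-MUT1 value comes out slightly above 90.5%, so the 90% quoted in the text corresponds to truncation rather than rounding; the other figures match the text when rounded to whole percent.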
The present invention also provides transformation and catalysis methods. The glycosyltransferase of the present invention can also be co-expressed in a host cell with key enzymes of the dammarenediol and/or protopanaxadiol anabolic pathway (for example, the dammarenediol synthase gene PgDDS, the cytochrome P450 gene CYP716A47 for protopanaxadiol synthesis, and its reductase gene PgCPR1), or used in genetically engineered cells for producing ginsenoside Rh2, and thereby applied to the construction of strains for the artificial synthesis of the rare ginsenoside Rh2.
In addition, the glycosyltransferase of the present invention can also be co-expressed in a host cell with key enzymes of the dammarenediol and/or protopanaxadiol anabolic pathway and applied to the construction of strains for the artificial synthesis of the rare ginsenoside Rh2. The present invention has been completed on this basis.
Definitions
As used herein, the terms "active polypeptide", "polypeptide of the present invention and derivative polypeptides thereof", "enzyme of the present invention", "glycosyltransferase" and "glycosyltransferase of the present invention" are used interchangeably and have the meaning commonly understood by a person of ordinary skill in the art. The glycosyltransferase of the present invention has the activity of transferring a glycosyl group from a glycosyl donor to the C-3 hydroxyl group of a tetracyclic triterpenoid compound.
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19 is not Gln and/or the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19 is not Ala.
In a preferred embodiment of the present invention, the glycosyltransferase of the present invention:
i). has the amino acid sequence shown in SEQ ID NO: 19 in which the amino acid residue at position 222 is not Gln and/or the amino acid residue at position 322 is not Ala, or
ii). is an isolated polypeptide derived from i) that has a sequence formed from the sequence defined in i) by substitution, deletion or addition of one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1 amino acid residue, and that substantially retains the function of the isolated polypeptide defined in i).
In a preferred embodiment of the present invention, the glycosyltransferase of the present invention is an isolated polypeptide derived from i) that has a sequence formed from the sequence defined in i) by addition of one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1 amino acid residue, and that substantially retains the function of the isolated polypeptide defined in i).
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19 is at least one selected from the following amino acids: His, Asn, Gln, Lys and Arg.
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19 is His.
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19 is at least one selected from the following amino acids: Val, Ile, Leu, Met and Phe.
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19 is Val.
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19 is His, and the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19 is Val.
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 247 of the amino acid sequence shown in SEQ ID NO: 19 is not Asn (N) and/or the amino acid residue corresponding to position 280 of the amino acid sequence shown in SEQ ID NO: 19 is not Lys (K).
In a preferred embodiment of the present invention, the glycosyltransferase of the present invention:
i). has the amino acid sequence shown in SEQ ID NO: 19 in which the amino acid residue at position 247 is not Asn (N) and/or the amino acid residue at position 280 is not Lys (K), or
ii). is an isolated polypeptide derived from i) that has a sequence formed from the sequence defined in i) by substitution, deletion or addition of one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1 amino acid residue, and that substantially retains the function of the isolated polypeptide defined in i).
In a preferred embodiment of the present invention, the glycosyltransferase of the present invention is an isolated polypeptide derived from i) that has a sequence formed from the sequence defined in i) by addition of one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1 amino acid residue, and that substantially retains the function of the isolated polypeptide defined in i).
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 247 of the amino acid sequence shown in SEQ ID NO: 19 is at least one selected from the following amino acids: Ser (S), Pro (P), Ala (A) or Thr (T).
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 247 of the amino acid sequence shown in SEQ ID NO: 19 is Ser (S).
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 280 of the amino acid sequence shown in SEQ ID NO: 19 is at least one selected from the following amino acids: Ile (I), Asn (N), Ser (S) or Ala (A).
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 280 of the amino acid sequence shown in SEQ ID NO: 19 is Ile (I).
In a preferred embodiment of the present invention, in the amino acid sequence of the glycosyltransferase of the present invention, the amino acid residue corresponding to position 247 of the amino acid sequence shown in SEQ ID NO: 19 is Ser (S), and the amino acid residue corresponding to position 280 of the amino acid sequence shown in SEQ ID NO: 19 is Ile (I).
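The preferred substitutions described above can be written in the usual point-mutation notation (for example Q222H, A322V, N247S and K280I). The following is a minimal sketch of applying such substitutions to a sequence and checking that the expected original residue is present. The reference used here is a hypothetical placeholder built only for illustration; it is not SEQ ID NO: 19, and the function names are assumptions.

```python
import re


def make_placeholder_reference(length: int = 400) -> str:
    """Build a hypothetical stand-in sequence (NOT SEQ ID NO: 19).

    Only the four positions discussed in the text carry meaningful residues.
    """
    residues = ["G"] * length
    for position, residue in {222: "Q", 247: "N", 280: "K", 322: "A"}.items():
        residues[position - 1] = residue  # positions are 1-based
    return "".join(residues)


def apply_point_mutations(sequence: str, mutations: list[str]) -> str:
    """Apply substitutions such as 'Q222H' after checking the original residue."""
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"Unrecognized mutation: {mutation!r}")
        original, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if residues[position - 1] != original:
            raise ValueError(f"Position {position} is not {original}")
        residues[position - 1] = replacement
    return "".join(residues)


reference = make_placeholder_reference()
variant = apply_point_mutations(reference, ["Q222H", "A322V", "N247S", "K280I"])
print(variant[221], variant[321], variant[246], variant[279])
```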
In a preferred embodiment of the present invention, the glycosyltransferase of the present invention is selected from the group consisting of:
(a) a polypeptide having the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41;
(b) a derivative polypeptide that has glycosyltransferase activity and is formed from a polypeptide having the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41 by substitution, deletion or addition of one or several amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1 amino acid residue, or by addition of a signal peptide sequence;
(c) a derivative polypeptide whose sequence contains the polypeptide sequence described in (a) or (b);
(d) a derivative polypeptide that has glycosyltransferase activity and whose amino acid sequence has ≥85% homology (preferably ≥90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%) with the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41.
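Item (d) relies on a percentage threshold. The following is a minimal sketch of computing percent identity from two already-aligned sequences (gaps written as "-") and comparing it with the 85% threshold. It assumes one particular convention, identical columns divided by alignment length; other conventions for the denominator exist, and the text does not specify which is intended. The short sequences are made-up examples.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned and of equal length, with '-' for gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("Aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("No alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """Return True if the percent identity is at or above the threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Made-up 10-residue example with a single mismatch (9 of 10 columns identical).
print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
print(meets_threshold("MKTAYIAKQR", "MKTAYLAKQR"))
```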
In specific embodiments, the glycosyltransferase activity of the present invention refers to the activity of transferring a glycosyl group from a glycosyl donor to the C-3 hydroxyl group of a tetracyclic triterpenoid compound.
In a preferred embodiment of the present invention, a novel glycosyltransferase gene, Pn50, was cloned from Panax notoginseng; this novel glycosyltransferase can catalyze glycosylation at the C3 position of a variety of dammarane-type ginsenosides.
By analyzing Panax notoginseng transcriptome data, a full-length glycosyltransferase gene sequence was assembled and named Pn50. It was cloned into the cloning vector PMDT-18T; expression primers were then designed to construct it into the E. coli expression vector pET28a, and its expression was induced in E. coli. The protein obtained can catalyze glycosylation of the C3 hydroxyl group of protopanaxadiol and Compound K.
Replacing the ginseng-derived UGTPg45 with the Panax notoginseng glycosyltransferase gene Pn50 substantially increases Rh2 yield (an increase of 28%).
In another preferred embodiment of the present invention, a glycosyltransferase mutant protein 8E7 is provided. This mutant protein is a mutant of the protein encoded by the ginseng-derived wild-type gene UGTPg45; replacing the ginseng-derived wild-type gene UGTPg45 with the mutant glycosyltransferase gene 8E7 substantially increases Rh2 yield (an increase of 70%).
In another preferred embodiment of the present invention, a glycosyltransferase mutant protein Pn50-Q222H-VE is provided. This mutant protein is a mutant of the protein encoded by the Panax notoginseng-derived wild-type gene Pn50; replacing the ginseng-derived wild-type gene UGTPg45 with the mutant glycosyltransferase gene Pn50-Q222H-VE substantially increases Rh2 yield (an increase of 66%).
In another preferred embodiment of the present invention, a glycosyltransferase mutant protein UGT-MUT1 is provided. This mutant protein is a mutant of the protein encoded by the Panax notoginseng-derived wild-type gene Pn50; replacing the ginseng-derived wild-type gene UGTPg45 with the mutant glycosyltransferase gene UGT-MUT1 substantially increases Rh2 yield (an increase of 90%).
In another preferred embodiment of the present invention, a glycosyltransferase mutant protein UGT-MUT2 is provided. This mutant protein is a mutant of the protein encoded by the Panax notoginseng-derived wild-type gene Pn50; replacing the ginseng-derived wild-type gene UGTPg45 with the mutant glycosyltransferase gene UGT-MUT2 substantially increases Rh2 yield (an increase of 120%).
In another preferred embodiment of the present invention, a glycosyltransferase mutant protein UGT-MUT3 is provided. This mutant protein is a mutant of the protein encoded by the Panax notoginseng-derived wild-type gene Pn50; replacing the ginseng-derived wild-type gene UGTPg45 with the mutant glycosyltransferase gene UGT-MUT3 substantially increases Rh2 yield (an increase of 134%).
A person of ordinary skill in the art will readily appreciate that changing a small number of amino acid residues in certain regions of a polypeptide, for example non-critical regions, does not substantially alter its biological activity; for example, a sequence obtained by appropriately replacing certain amino acids does not lose its activity (see Watson et al., Molecular Biology of The Gene, Fourth Edition, 1987, The Benjamin/Cummings Pub. Co., P224). A person of ordinary skill in the art can therefore make such replacements and ensure that the resulting molecule still has the desired biological activity.
Accordingly, the polypeptide of the present invention may be further mutated, on the basis that the amino acid residue corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19 is not Gln and/or the amino acid residue corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19 is not Ala, while still retaining the function and activity of the glycosyltransferase of the present invention. For example, the glycosyltransferase of the present invention (a) has the amino acid sequence shown in SEQ ID NO: 21; or (b) is a polypeptide derived from (a) that comprises a sequence formed from the sequence defined in (a) by substitution, deletion or addition of one or more amino acid residues, preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1 amino acid residue, and that substantially retains the function of the polypeptide defined in (a).
In the present invention, the glycosyltransferase of the present invention includes mutants in which, compared with the glycosyltransferase whose amino acid sequence is shown in SEQ ID NO.: 4, SEQ ID NO: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41, at most 20, preferably at most 10, more preferably at most 3, still more preferably at most 2, and most preferably at most 1 amino acid is replaced by an amino acid of similar or closely related properties. Such conservatively varied mutants can be produced by amino acid substitutions made, for example, according to the table below.
Initial residue | Representative substitutions | Preferred substitution
Ala (A) | Val; Leu; Ile | Val
Arg (R) | Lys; Gln; Asn | Lys
Asn (N) | Gln; His; Lys; Arg | Gln
Asp (D) | Glu | Glu
Cys (C) | Ser | Ser
Gln (Q) | Asn | Asn
Glu (E) | Asp | Asp
Gly (G) | Pro; Ala | Ala
His (H) | Asn; Gln; Lys; Arg | Arg
Ile (I) | Leu; Val; Met; Ala; Phe | Leu
Leu (L) | Ile; Val; Met; Ala; Phe | Ile
Lys (K) | Arg; Gln; Asn | Arg
Met (M) | Leu; Phe; Ile | Leu
Phe (F) | Leu; Val; Ile; Ala; Tyr | Leu
Pro (P) | Ala | Ala
Ser (S) | Thr | Thr
Thr (T) | Ser | Ser
Trp (W) | Tyr; Phe | Tyr
Tyr (Y) | Trp; Phe; Thr; Ser | Phe
Val (V) | Ile; Leu; Met; Phe; Ala | Leu
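The substitution table above can be encoded as a simple lookup. The following is a minimal sketch, using one-letter residue codes, that checks whether a given replacement appears among the representative substitutions and returns the preferred one; the names are illustrative and the helper is not part of the disclosure.

```python
# Each entry: original residue -> (representative substitutions, preferred substitution),
# transcribed from the table above using one-letter codes.
CONSERVATIVE_SUBSTITUTIONS = {
    "A": ("VLI", "V"),
    "R": ("KQN", "K"),
    "N": ("QHKR", "Q"),
    "D": ("E", "E"),
    "C": ("S", "S"),
    "Q": ("N", "N"),
    "E": ("D", "D"),
    "G": ("PA", "A"),
    "H": ("NQKR", "R"),
    "I": ("LVMAF", "L"),
    "L": ("IVMAF", "I"),
    "K": ("RQN", "R"),
    "M": ("LFI", "L"),
    "F": ("LVIAY", "L"),
    "P": ("A", "A"),
    "S": ("T", "T"),
    "T": ("S", "S"),
    "W": ("YF", "Y"),
    "Y": ("WFTS", "F"),
    "V": ("ILMFA", "L"),
}


def is_listed_substitution(original: str, replacement: str) -> bool:
    """True if the replacement is a representative substitution for the original residue."""
    representative, _preferred = CONSERVATIVE_SUBSTITUTIONS[original]
    return replacement in representative


def preferred_substitution(original: str) -> str:
    """Return the preferred substitution listed for the original residue."""
    return CONSERVATIVE_SUBSTITUTIONS[original][1]


print(is_listed_substitution("A", "V"))
print(preferred_substitution("H"))
```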
The present invention also provides polynucleotides encoding the polypeptides of the present invention. The term "polynucleotide encoding a polypeptide" may cover a polynucleotide that encodes only the polypeptide, as well as a polynucleotide that further includes additional coding and/or non-coding sequences.
Accordingly, as used herein, "containing", "having" or "including" encompasses "comprising", "consisting mainly of", "consisting essentially of" and "consisting of"; "consisting mainly of", "consisting essentially of" and "consisting of" are subordinate concepts of "containing", "having" or "including".
Amino acid residues corresponding to positions 222/322/247/280 of the amino acid sequence shown in SEQ ID NO: 19
A person of ordinary skill in the art knows that various mutations, such as substitutions, additions or deletions, can be made to some amino acid residues in the amino acid sequence of a protein while the resulting mutant still retains the function or activity of the original protein. A person of ordinary skill in the art can therefore make certain changes to the amino acid sequences specifically disclosed in the present invention and obtain mutants that still have the desired activity. In such a mutant, the amino acid residues corresponding to positions 222/322/247/280 of the amino acid sequence shown in SEQ ID NO: 19 may no longer be located at positions 222/322/247/280, but the mutant so obtained should still fall within the scope of protection of the present invention.
The term "corresponding to" as used herein has the meaning commonly understood by a person of ordinary skill in the art. Specifically, "corresponding to" denotes the position in one sequence that, after two sequences have been aligned by homology or sequence identity, matches a specified position in the other sequence. Thus, with regard to "the amino acid residues corresponding to positions 222/322/247/280 of the amino acid sequence shown in SEQ ID NO: 19", if a 6-His tag is added to one end of the amino acid sequence shown in SEQ ID NO: 19, the positions in the resulting mutant that correspond to positions 222/322/247/280 of SEQ ID NO: 19 may be positions 228/328/253/286; and if a few amino acid residues (for example, 2) are deleted from the amino acid sequence shown in SEQ ID NO: 19, the positions in the resulting mutant that correspond to positions 222/322/247/280 of SEQ ID NO: 19 may be positions 220/320/245/278, and so on. As a further example, if a sequence of 400 amino acid residues has high homology or sequence identity with positions 20-420 of the amino acid sequence shown in SEQ ID NO: 19, the positions in that sequence that correspond to positions 222/322/247/280 of SEQ ID NO: 19 may be positions 202/302/227/260.
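The notion of "corresponding" positions can be illustrated by walking along a pairwise alignment and counting residues in each sequence. The following is a minimal sketch that reproduces the 6-His tag and two-residue deletion examples given above. The reference is a hypothetical 400-residue placeholder, not SEQ ID NO: 19, the alignments are written by hand with "-" for gaps, and the deletion is assumed to lie upstream of position 222.

```python
def map_reference_positions(aligned_ref: str, aligned_query: str, positions):
    """Map 1-based reference positions to 1-based query positions via an alignment.

    Both inputs are gapped strings of equal length ('-' marks a gap).
    A reference position aligned to a gap in the query maps to None.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("Aligned sequences must have the same length")
    wanted = set(positions)
    mapping = {}
    ref_pos = 0
    query_pos = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            ref_pos += 1
        if query_char != "-":
            query_pos += 1
        if ref_char != "-" and ref_pos in wanted:
            mapping[ref_pos] = query_pos if query_char != "-" else None
    return mapping


# Hypothetical placeholder reference (NOT SEQ ID NO: 19).
reference = "ACDEFGHIKLMNPQRSTVWY" * 20
sites = [222, 322, 247, 280]

# Case 1: a 6-His tag added at the N-terminus shifts every position by 6.
tagged = map_reference_positions("-" * 6 + reference, "HHHHHH" + reference, sites)
print(tagged)

# Case 2: two residues (positions 10-11) deleted upstream shift every position by -2.
deleted = map_reference_positions(reference, reference[:9] + "--" + reference[11:], sites)
print(deleted)
```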
In specific embodiments, the homology or sequence identity may be 80% or more, preferably 90% or more, more preferably 95%-98%, and most preferably 99% or more.
Methods for determining sequence homology or identity that are well known to a person of ordinary skill in the art include, but are not limited to: Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M. and Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; and Carrillo, H. and Lipman, D., SIAM J. Applied Math., 48:1073 (1988). Preferred methods for determining identity are designed to give the largest match between the sequences tested. Methods for determining identity are codified in publicly available computer programs. Preferred computer program methods for determining identity between two sequences include, but are not limited to: the GCG program package (Devereux, J. et al., 1984), BLASTP, BLASTN and FASTA (Altschul, S.F. et al., 1990). The BLASTX program is publicly available from NCBI and other sources (BLAST Manual, Altschul, S. et al., NCBI NLM NIH Bethesda, Md. 20894; Altschul, S. et al., 1990). The well-known Smith Waterman algorithm can also be used to determine identity.
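For orientation, the Smith Waterman local alignment mentioned above can be sketched in a few lines. This is a minimal illustration that returns only the best local alignment score. It assumes a simple match/mismatch scheme and a linear gap penalty, whereas real protein comparisons normally use a substitution matrix and affine gap penalties as implemented in the programs cited above; the parameter values and test sequences are arbitrary examples.

```python
def smith_waterman_score(seq_a: str, seq_b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local alignment score between two sequences.

    Simplified scoring (assumption): fixed match/mismatch scores, linear gap penalty.
    Only two rows of the dynamic-programming matrix are kept in memory.
    """
    previous = [0] * (len(seq_b) + 1)
    best = 0
    for char_a in seq_a:
        current = [0]
        for j, char_b in enumerate(seq_b, start=1):
            diagonal = previous[j - 1] + (match if char_a == char_b else mismatch)
            up = previous[j] + gap        # gap in seq_b
            left = current[j - 1] + gap   # gap in seq_a
            score = max(0, diagonal, up, left)  # local alignment never drops below 0
            current.append(score)
            best = max(best, score)
        previous = current
    return best


print(smith_waterman_score("ACACACTA", "AGCACACA"))
```

Recovering the aligned residues themselves, and hence a percent identity, would additionally require a traceback through the full scoring matrix, which this sketch omits.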
Unless otherwise specified, the ginsenosides and sapogenins referred to herein are ginsenosides and sapogenins with the S and/or R configuration at the C20 position.
As used herein, "isolated polypeptide" means that the polypeptide is substantially free of other proteins, lipids, carbohydrates or other substances with which it is naturally associated. A person skilled in the art can purify the polypeptide using standard protein purification techniques. A substantially pure polypeptide produces a single major band on a non-reducing polyacrylamide gel. The purity of the polypeptide can be further analyzed by amino acid sequencing.
The active polypeptide of the present invention may be a recombinant polypeptide, a natural polypeptide or a synthetic polypeptide. The polypeptide of the present invention may be a naturally purified product, a chemically synthesized product, or a product produced by recombinant techniques from a prokaryotic or eukaryotic host (for example, bacteria, yeast or plants). Depending on the host used in the recombinant production scheme, the polypeptide of the present invention may be glycosylated or non-glycosylated. The polypeptide of the present invention may or may not include an initial methionine residue.
The present invention also includes fragments, derivatives and analogs of the polypeptide. As used herein, the terms "fragment", "derivative" and "analog" refer to polypeptides that substantially retain the same biological function or activity as the polypeptide.
A polypeptide fragment, derivative or analog of the present invention may be (i) a polypeptide in which one or more conservative or non-conservative amino acid residues (preferably conservative amino acid residues) are substituted, where the substituted amino acid residues may or may not be encoded by the genetic code; or (ii) a polypeptide having a substituent group on one or more amino acid residues; or (iii) a polypeptide formed by fusing the mature polypeptide to another compound (such as a compound that extends the half-life of the polypeptide, for example polyethylene glycol); or (iv) a polypeptide formed by fusing an additional amino acid sequence to the polypeptide sequence (such as a leader sequence, a secretion sequence, a sequence used to purify the polypeptide, a proprotein sequence, or a fusion protein formed with an antigen IgG fragment). In light of the teachings herein, these fragments, derivatives and analogs are within the scope known to those skilled in the art.
The active polypeptide of the present invention has glycosyltransferase activity and can catalyze one or more of the following reactions:
Figure PCTCN2018086738-appb-000003
The sequence of the polypeptide is SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO: 41, or a derivative polypeptide thereof. The term also includes variant forms of the sequences of SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO: 41 that have the same function as the polypeptide shown. These variant forms include (but are not limited to): deletion, insertion and/or substitution of one or more (usually 1-50, preferably 1-30, more preferably 1-20, most preferably 1-10) amino acids, and addition of one or several (usually up to 20, preferably up to 10, more preferably up to 5) amino acids at the C-terminus and/or N-terminus. For example, it is known in the art that substitution with amino acids of similar or closely related properties generally does not change the function of a protein. Likewise, adding one or several amino acids at the C-terminus and/or N-terminus generally does not change the function of a protein. The term also includes active fragments and active derivatives of the protein of the present invention. The present invention also provides analogs of the polypeptide. These analogs may differ from the natural polypeptide in amino acid sequence, in modifications that do not affect the sequence, or in both. These polypeptides include natural or induced genetic variants. Induced variants can be obtained by various techniques, such as random mutagenesis by radiation or exposure to mutagens, site-directed mutagenesis, or other known molecular biology techniques. Analogs also include analogs containing residues other than natural L-amino acids (such as D-amino acids) and analogs containing non-naturally occurring or synthetic amino acids (such as β- and γ-amino acids). It should be understood that the polypeptide of the present invention is not limited to the representative polypeptides exemplified above.
Modified forms (which generally do not change the primary structure) include chemically derivatized forms of the polypeptide, in vivo or in vitro, such as acetylated or carboxylated forms. Modifications also include glycosylation, such as polypeptides produced by glycosylation during synthesis and processing of the polypeptide or in further processing steps. Such modification can be accomplished by exposing the polypeptide to an enzyme that carries out glycosylation (such as a mammalian glycosylase or deglycosylase). Modified forms also include sequences containing phosphorylated amino acid residues (such as phosphotyrosine, phosphoserine or phosphothreonine). Also included are polypeptides that have been modified to improve their resistance to proteolysis or to optimize their solubility.
The amino terminus or carboxyl terminus of the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide of the present invention, or of a derivative polypeptide thereof, may further carry one or more polypeptide fragments serving as protein tags. Any suitable tag can be used in the present invention. For example, the tag may be FLAG, HA, HA1, c-Myc, Poly-His, Poly-Arg, Strep-TagII, AU1, EE, T7, 4A6, ε, B, gE or Ty1. These tags can be used to purify the protein. Table 1 lists some of these tags and their sequences.
Table 1
Tag            Number of residues     Sequence
Poly-Arg       5-6 (usually 5)        RRRRR
Poly-His       2-10 (usually 6)       HHHHHH
FLAG           8                      DYKDDDDK
Strep-TagII    8                      WSHPQFEK
C-myc          10                     WQKLISEEDL
GST            220                    the last 6 residues are LVPRGS
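As an illustration of how a tag from Table 1 could be attached to a polypeptide at the sequence level, the following minimal Python sketch builds N- or C-terminal fusions as plain strings. The tag sequences are those listed in Table 1 (Poly-Arg and Poly-His at their usual lengths); the function name `add_tag` and the placeholder polypeptide `MPEPTIDE` are hypothetical and are not sequences of the invention.

```python
# Tag sequences as listed in Table 1 (one-letter amino acid codes).
# Poly-Arg and Poly-His are shown at their usual lengths (5 and 6 residues).
TAGS = {
    "Poly-Arg": "RRRRR",
    "Poly-His": "HHHHHH",
    "FLAG": "DYKDDDDK",
    "Strep-TagII": "WSHPQFEK",
    "C-myc": "WQKLISEEDL",
}


def add_tag(polypeptide: str, tag_name: str, terminus: str = "N") -> str:
    """Return the sequence with the named tag fused at the N- or C-terminus."""
    tag = TAGS[tag_name]
    if terminus == "N":
        return tag + polypeptide
    if terminus == "C":
        return polypeptide + tag
    raise ValueError("terminus must be 'N' or 'C'")


# Hypothetical placeholder sequence, not a sequence of the invention.
example = "MPEPTIDE"
print(add_tag(example, "Poly-His", "C"))
print(add_tag(example, "FLAG", "N"))
```

This is a string-level illustration only; in practice the fusion is made at the DNA level, and an N-terminal tag needs its own start codon.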
To have the translated protein secreted (for example, out of the cell), a signal peptide sequence, such as the pelB signal peptide, may also be added to the amino terminus of the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide or a derivative polypeptide thereof. The signal peptide can be cleaved off as the polypeptide is secreted from the cell.
The polynucleotide of the present invention may be in the form of DNA or RNA. DNA forms include cDNA, genomic DNA and synthetic DNA. The DNA may be single-stranded or double-stranded, and may be the coding strand or the non-coding strand. The coding region sequence encoding the mature polypeptide may be identical to the coding region sequence shown in SEQ ID NO.: 3, SEQ ID NO.: 22, SEQ ID NO.: 36, SEQ ID NO.: 38, SEQ ID NO.: 40 or SEQ ID NO.: 42, or may be a degenerate variant thereof. As used herein, a "degenerate variant" refers to a nucleic acid sequence that encodes a polypeptide having the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41, or a derivative polypeptide thereof, but that differs from the coding sequence of that polypeptide, preferably from the sequence shown in SEQ ID NO.: 3, SEQ ID NO.: 22, SEQ ID NO.: 36, SEQ ID NO.: 38, SEQ ID NO.: 40 or SEQ ID NO.: 42. Polynucleotides encoding the mature polypeptide shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41, or a derivative polypeptide thereof, include: a coding sequence encoding only the mature polypeptide; the coding sequence of the mature polypeptide together with various additional coding sequences; and the coding sequence of the mature polypeptide (and optionally additional coding sequences) together with non-coding sequences.
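The idea of a degenerate variant can be made concrete with a short Python sketch that translates two different DNA sequences with the standard genetic code and shows that they encode the same peptide. The first sequence is the first 21 nucleotides of the Pn50 cloning primer F (SEQ ID NO: 1) given in Example 1; the second is a hypothetical synonymous recoding written for this illustration and is not a sequence disclosed in the document. The names `CODON_TABLE` and `translate` are likewise illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon (frame 1)."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 21 nt of primer F (SEQ ID NO: 1), i.e. the start of the Pn50 ORF.
original = "ATGGAGAGAGAAATGTTGAGC"
# Hypothetical degenerate variant: synonymous codons swapped in.
variant = "ATGGAACGTGAGATGCTGTCT"

print(translate(original))
print(translate(variant))
print(original != variant and translate(original) == translate(variant))
```

Both sequences translate to the same seven residues even though they differ at several nucleotide positions, which is the property the definition above relies on.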
The term "polynucleotide encoding a polypeptide" may refer to a polynucleotide that includes the sequence encoding the polypeptide, or to a polynucleotide that further includes additional coding and/or non-coding sequences.
The present invention also relates to variants of the above polynucleotides, which encode polypeptides having the same amino acid sequence as those of the present invention, or fragments, analogs and derivatives of such polypeptides. Such polynucleotide variants may be naturally occurring allelic variants or non-naturally occurring variants. These nucleotide variants include substitution variants, deletion variants and insertion variants. As is known in the art, an allelic variant is an alternative form of a polynucleotide that may have a substitution, deletion or insertion of one or more nucleotides without substantially altering the function of the polypeptide it encodes.
The present invention also relates to polynucleotides that hybridize to the sequences described above and that have at least 50%, preferably at least 70%, and more preferably at least 80% identity with them. The present invention particularly relates to polynucleotides that hybridize to the polynucleotides of the present invention under stringent conditions. In the present invention, "stringent conditions" means: (1) hybridization and washing at lower ionic strength and higher temperature, such as 0.2×SSC, 0.1% SDS, 60°C; or (2) hybridization in the presence of a denaturing agent, such as 50% (v/v) formamide, 0.1% calf serum/0.1% Ficoll, 42°C; or (3) hybridization that occurs only when the identity between the two sequences is at least 90%, preferably at least 95%. Furthermore, the polypeptide encoded by the hybridizable polynucleotide has the same biological function and activity as the mature polypeptide shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41.
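The identity thresholds above can be illustrated with a small Python sketch. It assumes the two sequences are already aligned, of equal length and without gaps, which is a simplification; real comparisons would first use an alignment tool. The two 20-nucleotide sequences are hypothetical test data, and the names `percent_identity`, `THRESHOLDS` and `tiers_met` are illustrative.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, gapless sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Identity levels named in the text (percent).
THRESHOLDS = {"minimum": 50.0, "preferred": 70.0, "more preferred": 80.0}


def tiers_met(identity: float) -> list:
    """Return the names of all thresholds that the identity value reaches."""
    return [name for name, level in THRESHOLDS.items() if identity >= level]


# Hypothetical 20-nt sequences differing at two positions.
seq_a = "ATGGAGAGAGAAATGTTGAG"
seq_b = "ATGGAAAGAGAAATGTTAAG"

identity = percent_identity(seq_a, seq_b)
print(identity, tiers_met(identity))
```

With 18 of 20 positions matching, the pair reaches all three identity levels; note that percent identity alone does not show that two sequences hybridize under the stringent conditions described.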
The present invention also relates to nucleic acid fragments that hybridize to the sequences described above. As used herein, a "nucleic acid fragment" is at least 15 nucleotides in length, preferably at least 30 nucleotides, more preferably at least 50 nucleotides, and most preferably at least 100 nucleotides. Nucleic acid fragments can be used in nucleic acid amplification techniques (such as PCR) to identify and/or isolate polynucleotides encoding the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptides or derivative polypeptides thereof.
The polypeptides and polynucleotides of the present invention are preferably provided in isolated form, and more preferably are purified to homogeneity.
The full-length nucleotide sequence encoding the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide of the present invention or a derivative polypeptide thereof, or a fragment of that sequence, can generally be obtained by PCR amplification, recombinant methods or chemical synthesis. For PCR amplification, primers can be designed according to the nucleotide sequences disclosed herein, in particular the open reading frame sequences, and the relevant sequence can be amplified using as template a commercially available cDNA library or a cDNA library prepared by conventional methods known to those skilled in the art. When the sequence is long, two or more rounds of PCR amplification are often required, after which the amplified fragments are spliced together in the correct order.
Once the relevant sequence has been obtained, it can be produced in large quantities by recombinant methods. This is usually done by cloning it into a vector, transferring the vector into cells, and then isolating the relevant sequence from the proliferated host cells by conventional methods.
In addition, the relevant sequence can be produced by chemical synthesis, especially when the fragment is short. Typically, a long sequence is obtained by first synthesizing multiple small fragments and then ligating them.
At present, a DNA sequence encoding the protein of the present invention (or a fragment or derivative thereof) can be obtained entirely by chemical synthesis. The DNA sequence can then be introduced into various existing DNA molecules (such as vectors) and cells known in the art. In addition, mutations can be introduced into the protein sequence of the present invention by chemical synthesis.
Amplification of DNA/RNA by PCR is the preferred way of obtaining the gene of the present invention. In particular, when it is difficult to obtain full-length cDNA from a library, the RACE method (rapid amplification of cDNA ends) is preferably used. Primers for PCR can be appropriately selected according to the sequence information of the present invention disclosed herein and can be synthesized by conventional methods. The amplified DNA/RNA fragments can be isolated and purified by conventional methods such as gel electrophoresis.
The present invention also relates to vectors comprising the polynucleotide of the present invention, to host cells genetically engineered with the vector of the present invention or with the coding sequence of the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide or a derivative polypeptide thereof, and to methods of producing the polypeptides of the present invention by recombinant techniques.
The polynucleotide sequences of the present invention can be used to express or produce recombinant Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptides or derivative polypeptides thereof by conventional recombinant DNA techniques. This generally involves the following steps:
(1) transforming or transducing a suitable host cell with a polynucleotide (or variant) of the present invention encoding the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide or a derivative polypeptide thereof, or with a recombinant expression vector containing the polynucleotide;
(2) culturing the host cell in a suitable medium;
(3) isolating and purifying the protein from the medium or the cells.
In the present invention, a polynucleotide sequence encoding the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide or a derivative polypeptide thereof can be inserted into a recombinant expression vector. The term "recombinant expression vector" refers to bacterial plasmids, bacteriophages, yeast plasmids, plant cell viruses, mammalian cell viruses such as adenoviruses and retroviruses, or other vectors well known in the art. Any plasmid or vector can be used as long as it can replicate and remain stable in the host. An important feature of expression vectors is that they usually contain an origin of replication, a promoter, a marker gene and translation control elements.
Methods well known to those skilled in the art can be used to construct expression vectors containing the DNA sequence encoding the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide or a derivative polypeptide thereof and suitable transcription/translation control signals. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques and in vivo recombination techniques. The DNA sequence can be operably linked to an appropriate promoter in the expression vector to direct mRNA synthesis. Representative examples of such promoters are: the lac or trp promoter of E. coli; the λ phage PL promoter; eukaryotic promoters, including the CMV immediate early promoter, the HSV thymidine kinase promoter, the early and late SV40 promoters and retroviral LTRs; and other promoters known to control gene expression in prokaryotic or eukaryotic cells or their viruses. The expression vector also includes a ribosome binding site for translation initiation and a transcription terminator.
In addition, the expression vector preferably contains one or more selectable marker genes to provide a phenotypic trait for selecting transformed host cells, such as dihydrofolate reductase, neomycin resistance and green fluorescent protein (GFP) for eukaryotic cell culture, or tetracycline or ampicillin resistance for E. coli.
A vector containing the appropriate DNA sequence described above together with an appropriate promoter or control sequence can be used to transform a suitable host cell so that it can express the protein.
The host cell may be a prokaryotic cell, such as a bacterial cell; a lower eukaryotic cell, such as a yeast cell; or a higher eukaryotic cell, such as a mammalian cell. Representative examples include: bacterial cells such as Escherichia coli, Streptomyces and Salmonella typhimurium; fungal cells such as yeast; plant cells; insect cells such as Drosophila S2 or Sf9; and animal cells such as CHO, COS, 293 or Bowes melanoma cells.
When the polynucleotide of the present invention is expressed in higher eukaryotic cells, transcription is enhanced if an enhancer sequence is inserted into the vector. Enhancers are cis-acting elements of DNA, usually about 10 to 300 base pairs in length, that act on a promoter to enhance transcription of a gene. Examples include the 100 to 270 base pair SV40 enhancer on the late side of the replication origin, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.
Those of ordinary skill in the art know how to select appropriate vectors, promoters, enhancers and host cells.
Transformation of host cells with recombinant DNA can be carried out using conventional techniques well known to those skilled in the art. When the host is a prokaryote such as E. coli, competent cells capable of taking up DNA can be harvested after the exponential growth phase and treated by the CaCl2 method, the steps of which are well known in the art. Another method is to use MgCl2. If desired, transformation can also be carried out by electroporation. When the host is a eukaryote, the following DNA transfection methods can be used: calcium phosphate co-precipitation, or conventional mechanical methods such as microinjection, electroporation and liposome packaging.
The transformants obtained can be cultured by conventional methods to express the polypeptide encoded by the gene of the present invention. Depending on the host cell used, the culture medium may be selected from various conventional media. Culturing is carried out under conditions suitable for growth of the host cell. Once the host cells have grown to an appropriate cell density, the selected promoter is induced by a suitable method (such as a temperature shift or chemical induction), and the cells are cultured for a further period.
The recombinant polypeptide in the above method may be expressed inside the cell or on the cell membrane, or secreted out of the cell. If desired, the recombinant protein can be isolated and purified by various separation methods that exploit its physical, chemical and other properties. These methods are well known to those skilled in the art. Examples include, but are not limited to: conventional renaturation treatment, treatment with a protein precipitant (salting out), centrifugation, osmotic lysis, sonication, ultracentrifugation, molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, high performance liquid chromatography (HPLC), various other liquid chromatography techniques, and combinations of these methods.
Application
Uses of the active polypeptides or glycosyltransferases of the present invention, namely the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 and UGT-MUT3 polypeptides or derivative polypeptides thereof, include (but are not limited to): specifically and efficiently transferring a glycosyl group from a glycosyl donor to the hydroxyl group at the C-3 position of a tetracyclic triterpenoid compound. In particular, they can convert a compound of formula (I) into the compound of formula (II), for example converting protopanaxadiol (PPD) into the rare ginsenoside Rh2, which has better antitumor activity, and converting Compound K into ginsenoside F2.
The tetracyclic triterpenoid compounds include (but are not limited to): tetracyclic triterpenoids of the dammarane, lanostane, tirucallane, cycloartane, apotirucallane, cucurbitane and meliacane types, in the S or R configuration.
The present invention provides an industrial catalytic method comprising: obtaining ginsenoside Rh2 and ginsenoside F2 using the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide of the present invention or a derivative polypeptide thereof, in the presence of a glycosyl donor. Specifically, the polypeptide used in reaction (A) is selected from the polypeptide shown in SEQ ID NO.: 4 or SEQ ID NO.: 21 or a derivative polypeptide thereof; the polypeptide used in reaction (B) is the polypeptide shown in SEQ ID NO.: 4 or a derivative polypeptide thereof.
In a preferred embodiment of the present invention, a method is provided for synthesizing ginsenoside Rh2 in Saccharomyces cerevisiae using the aforementioned Panax notoginseng glycosyltransferase Pn50 and the glycosyltransferase mutant protein 8E7.
In a preferred embodiment of the present invention, a method is provided for synthesizing ginsenoside Rh2 in Saccharomyces cerevisiae using Pn50-Q222H-VE, a mutant protein of the aforementioned Panax notoginseng glycosyltransferase Pn50.
In a preferred embodiment of the present invention, a method is provided for synthesizing ginsenoside Rh2 in Saccharomyces cerevisiae using UGT-MUT1, a mutant protein of the aforementioned Panax notoginseng glycosyltransferase Pn50.
In a preferred embodiment of the present invention, a method is provided for synthesizing ginsenoside Rh2 in Saccharomyces cerevisiae using UGT-MUT2, a mutant protein of the aforementioned Panax notoginseng glycosyltransferase Pn50.
In a preferred embodiment of the present invention, a method is provided for synthesizing ginsenoside Rh2 in Saccharomyces cerevisiae using UGT-MUT3, a mutant protein of the aforementioned Panax notoginseng glycosyltransferase Pn50.
The glycosyl donor is a nucleoside diphosphate sugar selected from the group consisting of: UDP-glucose, ADP-glucose, TDP-glucose, CDP-glucose, GDP-glucose, UDP-acetylglucose, ADP-acetylglucose, TDP-acetylglucose, CDP-acetylglucose, GDP-acetylglucose, UDP-xylose, ADP-xylose, TDP-xylose, CDP-xylose, GDP-xylose, UDP-galacturonic acid, ADP-galacturonic acid, TDP-galacturonic acid, CDP-galacturonic acid, GDP-galacturonic acid, UDP-galactose, ADP-galactose, TDP-galactose, CDP-galactose, GDP-galactose, UDP-arabinose, ADP-arabinose, TDP-arabinose, CDP-arabinose, GDP-arabinose, UDP-rhamnose, ADP-rhamnose, TDP-rhamnose, CDP-rhamnose, GDP-rhamnose, other nucleoside diphosphate hexoses or nucleoside diphosphate pentoses, and combinations thereof.
The glycosyl donor is preferably a uridine diphosphate sugar selected from the group consisting of: UDP-glucose, UDP-xylose, UDP-rhamnose, UDP-galacturonic acid, UDP-galactose, UDP-arabinose, other uridine diphosphate hexoses or uridine diphosphate pentoses, and combinations thereof.
In the method, an enzyme activity additive (an additive that increases or inhibits enzyme activity) may also be added. The enzyme activity additive may be selected from the group consisting of Ca²⁺, Co²⁺, Mn²⁺, Ba²⁺, Al³⁺, Ni²⁺, Zn²⁺ and Fe²⁺, or may be a substance that can generate Ca²⁺, Co²⁺, Mn²⁺, Ba²⁺, Al³⁺, Ni²⁺, Zn²⁺ or Fe²⁺.
The pH condition of the method is pH 4.0-10.0, preferably pH 6.0-8.5, more preferably pH 8.5.
The temperature condition of the method is 10°C-105°C, preferably 25°C-35°C, more preferably 35°C.
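The stated pH and temperature windows can be expressed as a small check. The following Python sketch classifies a proposed reaction condition against the ranges given above; the names `Range` and `classify` and the labels "preferred", "allowed" and "outside" are illustrative assumptions, not part of the disclosed method.

```python
from typing import NamedTuple


class Range(NamedTuple):
    low: float
    high: float

    def contains(self, value: float) -> bool:
        """True if value lies within the closed interval [low, high]."""
        return self.low <= value <= self.high


# Ranges as stated in the text above.
PH_ALLOWED = Range(4.0, 10.0)
PH_PREFERRED = Range(6.0, 8.5)
TEMP_ALLOWED = Range(10.0, 105.0)    # degrees Celsius
TEMP_PREFERRED = Range(25.0, 35.0)   # degrees Celsius


def classify(ph: float, temp_c: float) -> str:
    """Return 'preferred', 'allowed' or 'outside' for a pH/temperature pair."""
    if PH_PREFERRED.contains(ph) and TEMP_PREFERRED.contains(temp_c):
        return "preferred"
    if PH_ALLOWED.contains(ph) and TEMP_ALLOWED.contains(temp_c):
        return "allowed"
    return "outside"


print(classify(8.5, 35.0))  # the most preferred point named in the text
print(classify(5.0, 50.0))
print(classify(3.0, 35.0))
```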
The present invention also provides a composition containing an effective amount of the active polypeptide or glycosyltransferase of the present invention (the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide or a derivative polypeptide thereof) and a food-grade or industrially acceptable carrier or excipient. Such carriers include (but are not limited to): water, buffer, glucose, glycerol, ethanol, and combinations thereof.
A substance that modulates the activity of the glycosyltransferase of the present invention may also be added to the composition. Any substance that increases the enzyme activity can be used. Preferably, the substance that increases the activity of the glycosyltransferase of the present invention is mercaptoethanol. In addition, many substances can reduce the enzyme activity, including but not limited to: Ca²⁺, Co²⁺, Mn²⁺, Ba²⁺, Al³⁺, Ni²⁺, Zn²⁺ and Fe²⁺; or substances that, after being added to the substrate, hydrolyze to form Ca²⁺, Co²⁺, Mn²⁺, Ba²⁺, Al³⁺, Ni²⁺, Zn²⁺ or Fe²⁺.
Once the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide of the present invention or a derivative polypeptide thereof has been obtained, those skilled in the art can readily use the enzyme for transglycosylation, in particular of dammarenediol and protopanaxadiol. As preferred modes of the present invention, two methods of forming rare ginsenosides are also provided. The first method comprises: treating a substrate to be glycosylated with the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide of the present invention or a derivative polypeptide thereof, the substrate including tetracyclic triterpenoid compounds such as dammarenediol, protopanaxadiol and derivatives thereof. Preferably, the substrate is treated with the enzyme at pH 3.5-10. Preferably, the substrate is treated with the enzyme at a temperature of 30-105°C.
The second method comprises: transferring the gene encoding the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide of the present invention or a derivative polypeptide thereof into an engineered strain capable of synthesizing protopanaxadiol (PPD) (for example, engineered yeast or E. coli); or co-expressing that gene with key genes of the dammarenediol and protopanaxadiol (PPD) biosynthetic pathways, and optionally other glycosyltransferase genes, in a host cell (for example, a yeast cell or E. coli), to obtain a recombinant strain that directly produces the rare ginsenoside Rh2 and/or ginsenoside F2. Alternatively, the nucleotide sequence encoding the Pn50, 8E7, Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 or UGT-MUT3 polypeptide or a derivative polypeptide thereof is co-expressed in a host cell with key enzymes of the dammarenediol and/or protopanaxadiol (PPD) biosynthetic pathways, optionally other glycosyltransferases, and key enzymes for the synthesis of UDP-rhamnose, for use in constructing recombinant strains that synthesize the rare ginsenoside Rh2 and ginsenoside F2.
Key genes of the dammarenediol biosynthetic pathway include (but are not limited to): the dammarenediol synthase gene.
In another preferred embodiment, key genes of the protopanaxadiol biosynthetic pathway include (but are not limited to): the dammarenediol synthase gene PgDDS, the cytochrome P450 gene CYP716A47 for protopanaxadiol synthesis and its reductase gene, or combinations thereof, or isoenzymes of the above enzymes and combinations thereof. Dammarenediol synthase converts epoxysqualene (synthesized by Saccharomyces cerevisiae itself) into dammarenediol, and cytochrome P450 CYP716A47 and its reductase then convert dammarenediol into protopanaxadiol (PPD) (Han et al., Plant & Cell Physiology, 2011, 52, 2062-73).
The main advantages of the present invention:
(1) Compared with the traditional methods that rely on extraction from Panax plants and glycosyl hydrolysis, the method of the present invention for producing ginsenoside Rh2 in Saccharomyces cerevisiae has advantages such as low cost, short cycle time and stable quality;
(2) The glycosyltransferase Pn50, obtained by the present invention from Panax notoginseng for the first time, can catalyze the conversion of PPD to Rh2 and of CK to F2, and when introduced into a PPD-producing strain it synthesizes the rare ginsenoside Rh2 more efficiently than the wild-type ginseng glycosyltransferase UGTPg45;
(3) Introducing the mutant gene 8E7, obtained by the present invention through random mutagenesis of the wild-type ginseng glycosyltransferase UGTPg45, or the Panax notoginseng glycosyltransferase gene Pn50 into a PPD-producing strain significantly improves the synthesis efficiency of the rare ginsenoside Rh2 compared with the wild-type ginseng glycosyltransferase UGTPg45;
(4) Introducing the mutant genes Pn50-Q222H-VE, UGT-MUT1, UGT-MUT2 and UGT-MUT3, obtained by the present invention through random mutagenesis of the wild-type Panax notoginseng glycosyltransferase Pn50, into a PPD-producing strain significantly improves the synthesis efficiency of the rare ginsenoside Rh2 compared with wild-type Pn50.
The present invention is further illustrated below with reference to specific examples. It should be understood that these examples serve only to illustrate the present invention and not to limit its scope. Experimental methods for which no specific conditions are given in the following examples are generally carried out under conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or under the conditions recommended by the manufacturer. Unless otherwise stated, percentages and parts are by weight.
The implementation of the present invention can be further understood from the following specific examples.
Example 1. Cloning of the Panax notoginseng glycosyltransferase gene Pn50
Two primers were synthesized: Pn50 cloning primer F, SEQ ID NO: 1 (ATGGAGAGAGAAATGTTGAGCA), and Pn50 cloning primer R, SEQ ID NO: 2 (TCAGGAGGAAACAAGCTTTGAA). cDNA obtained by reverse transcription of RNA extracted from Panax notoginseng was used as the template for PCR with these primers. The DNA polymerase was the high-fidelity KOD DNA polymerase from Takara Biotechnology Co., Ltd. The PCR program was: 94 °C for 2 min; 35 cycles of 94 °C for 15 s, 58 °C for 30 s and 68 °C for 2 min; 68 °C for 10 min; then cooling to 10 °C. The PCR product was analyzed by agarose gel electrophoresis; the result is shown in Figure 1.
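As a quick sanity check on the two cloning primers quoted above, the following Python sketch reports their length, GC content and a rough melting temperature, and confirms that the reverse primer ends the amplicon with a TGA stop codon. The Wallace rule (2 °C per A/T, 4 °C per G/C) is an assumption introduced here for illustration only; the source does not say how the 58 °C annealing temperature was chosen, and the function names are hypothetical.

```python
# Cloning primers quoted in Example 1 (SEQ ID NO: 1 and SEQ ID NO: 2).
PN50_F = "ATGGAGAGAGAAATGTTGAGCA"
PN50_R = "TCAGGAGGAAACAAGCTTTGAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough Tm estimate in degrees C (Wallace rule, an assumption here)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, primer in (("F", PN50_F), ("R", PN50_R)):
    print(name, len(primer), round(gc_fraction(primer), 3), wallace_tm(primer))

# The reverse primer anneals at the 3' end of the sense strand, so the last
# three bases of its reverse complement are the final codon of the amplicon.
print("last codon of amplicon:", reverse_complement(PN50_R)[-3:])
```

The forward primer begins with ATG and the reverse primer's complement ends with TGA, which matches the start and stop codons described for the Pn50 open reading frame.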
The target DNA band was excised under UV illumination. The DNA was then recovered from the agarose gel with the Axygen Gel Extraction Kit (AXYGEN); this is the amplified DNA fragment of the glycosyltransferase gene. Using the PMD18-T cloning kit from Takara Biotechnology (Dalian) Co., Ltd. (Takara), the recovered PCR product was cloned into the PMDT vector, and the resulting construct was named PMDT-Pn50. The Pn50 gene sequence was obtained by sequencing.
The Pn50 gene has the nucleotide sequence of SEQ ID NO: 3. Nucleotides 1-1368 from the 5' end of SEQ ID NO: 3 are the open reading frame (ORF) of Pn50; nucleotides 1-3 from the 5' end are the start codon ATG of the Pn50 gene, and nucleotides 1366-1368 from the 5' end are its stop codon TGA. The glycosyltransferase gene Pn50 encodes a protein of 455 amino acids, Pn50, which has the amino acid sequence of SEQ ID NO: 4. Software prediction gives the protein a theoretical molecular weight of 51.1 kDa and an isoelectric point (pI) of 5.10. Amino acids 332-375 from the amino terminus of SEQ ID NO: 4 form the conserved PSPG domain of glycosyltransferases.
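The coordinates quoted for the Pn50 open reading frame can be cross-checked with simple arithmetic. The sketch below is illustrative only (the helper name `orf_summary` is hypothetical); it derives the number of encoded residues and the stop-codon coordinates from the ORF length, using the same 1-based, inclusive numbering as the text.

```python
def orf_summary(orf_length_nt: int) -> dict:
    """Derive codon counts and codon coordinates from an ORF length.

    Coordinates are 1-based and inclusive. The ORF length is taken to
    include the stop codon, as in the description above.
    """
    if orf_length_nt % 3 != 0:
        raise ValueError("an ORF length must be a multiple of 3")
    codons = orf_length_nt // 3
    return {
        "codons_including_stop": codons,
        "amino_acids": codons - 1,  # the stop codon encodes no residue
        "start_codon": (1, 3),
        "stop_codon": (orf_length_nt - 2, orf_length_nt),
    }


pn50 = orf_summary(1368)
print(pn50)
```

For a 1,368-nt ORF this gives 456 codons, 455 amino acids and a stop codon at nucleotides 1366-1368, in agreement with the values stated for Pn50.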
Pn50 nucleotide sequence, SEQ ID NO: 3
Figure PCTCN2018086738-appb-000004
Figure PCTCN2018086738-appb-000005
Pn50 amino acid sequence, SEQ ID NO: 4
Figure PCTCN2018086738-appb-000006
Example 2. Expression of the Panax notoginseng glycosyltransferase gene Pn50 in Escherichia coli
Two primers were synthesized: Pn50 expression primer F, SEQ ID NO: 5 (GGATCCATGGAGAGAGAAATGTTGAGCA), and Pn50 expression primer R, SEQ ID NO: 6 (CTCGAGTCAGGAGGAAACAAGCTTTGAA). BamH I and Xho I restriction sites were added to primers F and R, respectively, and PCR was performed with cDNA extracted from the plant as the template. The DNA polymerase was the high-fidelity KOD DNA polymerase from Takara Biotechnology Co., Ltd. The PCR program was: 94 °C for 2 min; 35 cycles of 94 °C for 15 s, 58 °C for 30 s and 68 °C for 2 min; 68 °C for 10 min; then cooling to 10 °C. The PCR product was analyzed by agarose gel electrophoresis, and the target DNA band was excised under UV illumination. The DNA was then recovered from the agarose gel with the Axygen Gel Extraction Kit (AXYGEN); this is the amplified DNA fragment of the glycosyltransferase gene. The recovered PCR product was digested with BamH I and Xho I and ligated into pET28a digested with the same two enzymes. The ligation product was transformed into competent E. coli EPI300 cells, the transformed cells were spread on LB plates containing 50 μg/mL kanamycin, and recombinant clones were further verified by PCR and restriction digestion. One clone was selected, and its recombinant plasmid was extracted and verified by sequencing. After the sequence was confirmed, the recombinant plasmid was transformed into E. coli BL21(DE3) for induced expression, as follows: a single colony was picked from the plate into an LB tube containing 50 μg/mL kanamycin and grown overnight; a 1% inoculum was transferred to a 50 mL Erlenmeyer flask and cultured with shaking at 37 °C until the OD600 reached 0.6-0.7; IPTG was added to a final concentration of 0.1 mM, and expression was induced at 18 °C for 16 h. The cells were harvested at 12,000 g for 3 min, resuspended in 10 mL of PBS buffer per gram of wet cell weight, and lysed; after centrifugation, the supernatant was taken as the crude enzyme solution.
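The relationship between the expression primers and the cloning primers of Example 1 can be checked directly: each expression primer should be a restriction site followed by the corresponding gene-specific sequence. The Python sketch below performs that check on the sequences quoted in the text; the helper name `strip_site` is hypothetical, and GGATCC and CTCGAG are the standard recognition sequences of BamH I and Xho I.

```python
BAMHI_SITE = "GGATCC"  # BamH I recognition sequence
XHOI_SITE = "CTCGAG"   # Xho I recognition sequence

CLONING_F = "ATGGAGAGAGAAATGTTGAGCA"            # SEQ ID NO: 1
CLONING_R = "TCAGGAGGAAACAAGCTTTGAA"            # SEQ ID NO: 2
EXPRESSION_F = "GGATCCATGGAGAGAGAAATGTTGAGCA"   # SEQ ID NO: 5
EXPRESSION_R = "CTCGAGTCAGGAGGAAACAAGCTTTGAA"   # SEQ ID NO: 6


def strip_site(primer: str, site: str) -> str:
    """Return the gene-specific part of a primer that starts with `site`."""
    if not primer.startswith(site):
        raise ValueError(f"primer does not start with {site}")
    return primer[len(site):]


# Each expression primer is the restriction site plus the cloning primer.
print(strip_site(EXPRESSION_F, BAMHI_SITE) == CLONING_F)
print(strip_site(EXPRESSION_R, XHOI_SITE) == CLONING_R)
```

Both comparisons hold for the sequences given, so the insert is flanked by a 5' BamH I site and a 3' Xho I site, which fixes its orientation in pET28a.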
Example 3. Reactions of the Panax notoginseng glycosyltransferase Pn50 with different substrates and detection of the products
The Pn50 catalytic reaction system (100 μL) was prepared as follows:
Figure PCTCN2018086738-appb-000007
The reaction was incubated in a 37 °C water bath for 2 h. After the reaction, an equal volume of n-butanol was added for extraction, and the n-butanol phase was collected and concentrated under vacuum. The reaction product was dissolved in 10 μL of methanol and analyzed by TLC or HPLC. As shown in Figure 2, the glycosyltransferase Pn50 used in the present invention converts protopanaxadiol (PPD) into a new product (see Formula A) whose migration position on the TLC plate matches that of Rh2, demonstrating that the new product is ginsenoside Rh2. Pn50 also acts on compound K; based on the migration position on the TLC plate and the regiospecificity of Pn50, the product is inferred to be ginsenoside F2 (Figure 2).
Figure PCTCN2018086738-appb-000008
Example 4. Synthesis of the rare ginsenoside Rh2 in Saccharomyces cerevisiae using the Panax notoginseng-derived glycosyltransferase gene Pn50
(1) The Panax notoginseng-derived glycosyltransferase Pn50 catalyzes glycosylation of the C3 hydroxyl group of protopanaxadiol to form the rare ginsenoside Rh2. To synthesize Rh2 in Saccharomyces cerevisiae, a S. cerevisiae chassis strain that produces protopanaxadiol was first constructed. The dammarenediol synthase gene PgDDS, the cytochrome P450 gene CYP716A47 for protopanaxadiol synthesis, and its reductase gene PgCPR1 were introduced into wild-type S. cerevisiae, so that protopanaxadiol can be synthesized from the 2,3-oxidosqualene produced by the yeast's own mevalonate pathway. The artificially constructed protopanaxadiol pathway was then optimized, including optimization of the rate-limiting steps and of precursor supply, which yielded the high-producing protopanaxadiol strain ZWBY04RS.
(2) Twelve primers with the sequences SEQ ID NO: 7-18 were synthesized, and the promoter, terminator, ORF, selection marker, and upstream and downstream homology arm fragments for glycosyltransferase expression were obtained by PCR, using the PCR method of Example 1. 100 ng of each of these PCR fragments and of the UGTPg45 ORF were mixed and transformed into the recombinant S. cerevisiae strain ZWBY04RS by the conventional LiAc/ssDNA transformation method for S. cerevisiae, yielding the recombinant strain ZWBY04RS-UGTPg45. Similarly, 100 ng of each of these PCR fragments and of the Pn50 ORF were mixed and transformed into strain ZWBY04RS by the same method, yielding the recombinant strain ZWBY04RS-Pn50.
Promoter primer F, SEQ ID NO: 7
Figure PCTCN2018086738-appb-000009
Promoter primer R, SEQ ID NO: 8
Figure PCTCN2018086738-appb-000010
ORF primer F, SEQ ID NO: 9
Figure PCTCN2018086738-appb-000011
ORF primer R, SEQ ID NO: 10
Figure PCTCN2018086738-appb-000012
Terminator primer F, SEQ ID NO: 11
Figure PCTCN2018086738-appb-000013
Terminator primer R, SEQ ID NO: 12
Figure PCTCN2018086738-appb-000014
Selection marker primer F, SEQ ID NO: 13
Figure PCTCN2018086738-appb-000015
Selection marker primer R, SEQ ID NO: 14
Figure PCTCN2018086738-appb-000016
Upstream homology arm primer F, SEQ ID NO: 15
Figure PCTCN2018086738-appb-000017
Upstream homology arm primer R, SEQ ID NO: 16
Figure PCTCN2018086738-appb-000018
Downstream homology arm primer F, SEQ ID NO: 17
Figure PCTCN2018086738-appb-000019
Downstream homology arm primer R, SEQ ID NO: 18
Figure PCTCN2018086738-appb-000020
(3) Solid medium: 1% yeast extract, 2% peptone, 2% dextrose (glucose), 2% agar powder.
Liquid medium: 1% yeast extract, 2% peptone, 2% dextrose (glucose).
The recombinant S. cerevisiae strains ZWBY04RS-UGTPg45 and ZWBY04RS-Pn50, streaked on solid medium plates, were each picked into a test tube containing 5 mL of liquid medium and cultured overnight with shaking (30 °C, 250 rpm, 16 h). The cells were collected by centrifugation and transferred to a 50 mL Erlenmeyer flask containing 10 mL of liquid medium, the OD600 was adjusted to 0.05, and the culture was shaken at 30 °C and 250 rpm for 4 days to obtain the fermentation product. One parallel (replicate) culture was set up for each recombinant yeast strain.
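Adjusting the starting OD600 to 0.05 in 10 mL amounts to a simple dilution calculation (C1·V1 = C2·V2). The following sketch shows one way to compute how much overnight culture to harvest; the overnight OD600 of 8.0 is a hypothetical reading chosen for illustration, since the source does not report one, and the function name is likewise an assumption.

```python
def inoculum_volume_ml(overnight_od600: float,
                       target_od600: float = 0.05,
                       final_volume_ml: float = 10.0) -> float:
    """Volume of overnight culture whose cells give the target OD600
    when resuspended in the final volume (C1 * V1 = C2 * V2)."""
    if overnight_od600 <= target_od600:
        raise ValueError("overnight culture is not denser than the target")
    return target_od600 * final_volume_ml / overnight_od600


# Hypothetical overnight reading; replace with the measured value.
volume = inoculum_volume_ml(overnight_od600=8.0)
print(f"harvest {volume * 1000:.1f} uL of overnight culture")
```

Dense cultures are usually diluted before measurement so that the reading stays in the linear range of the spectrophotometer; the measured value should be multiplied back by that dilution factor before it is passed to the function.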
Extraction and detection of protopanaxadiol and the rare ginsenoside Rh2: 100 μL of fermentation broth was taken from the 10 mL culture, the yeast cells were lysed by shaking in a Fastprep instrument, and the lysate was extracted with an equal volume of n-butanol; the n-butanol was then evaporated to dryness under vacuum. The residue was dissolved in 100 μL of methanol, and the yield of the target product was determined by HPLC. The HPLC results are shown in Figure 3.
The recombinant S. cerevisiae strain ZWBY04RS-UGTPg45, constructed by introducing the ginseng-derived glycosyltransferase gene UGTPg45 into the protopanaxadiol-producing S. cerevisiae strain, synthesizes the rare ginsenoside Rh2 with a yield of 35.66 mg/L. The recombinant strain ZWBY04RS-Pn50, constructed by introducing the Panax notoginseng-derived glycosyltransferase gene Pn50 into the same protopanaxadiol-producing strain, synthesizes Rh2 with a yield of 45.55 mg/L.
Example 5. Construction of a random mutation library of the ginseng-derived glycosyltransferase UGTPg45
Using plasmid UGTPg45-pMD18T as the template, error-prone PCR was performed with the GeneMorph II Random Mutagenesis Kit (Agilent Technology) and the primers UGTPg45 random mutagenesis primer F, SEQ ID NO: 23 (Gcatagcaatctaatctaagttttaattacaaaatggagagagaaatgttgagcaaaac), and UGTPg45 random mutagenesis primer R, SEQ ID NO: 24 (Gaaaagaagataatatttttatataattatattaatctcaggaggaaacaagctttgaa). The program was: initial denaturation at 95 °C for 2 min; 30 cycles of denaturation at 95 °C for 30 s, annealing at 58 °C for 30 s and extension at 72 °C for 3 min 15 s; final extension at 72 °C for 10 min. Following the kit instructions, 1 μg, 1.5 μg and 2 μg of template were tested to tune the mutation rate. The PCR products were gel-purified, A-tailed with Taq polymerase, ligated into the vector pMD18T, and transformed into E. coli TOP10. Ten positive clones were picked at random and sequenced, which showed that a template amount of 1.5 μg gave a mutation rate of 1-2 bases per gene. This condition was used for error-prone PCR in subsequent experiments to construct the UGTPg45 random mutation library.
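Mutation frequencies for error-prone PCR are often quoted per kilobase rather than per gene. The sketch below converts the 1-2 bases per gene reported above into that unit. It assumes the gene length is the 1,374-nt UGTPg45 open reading frame stated elsewhere in this example and ignores the flanking primer sequence; the function name is hypothetical.

```python
UGTPG45_ORF_BP = 1374  # ORF length of UGTPg45 as stated in the description


def mutations_per_kb(mutations_per_gene: float,
                     gene_length_bp: int = UGTPG45_ORF_BP) -> float:
    """Convert a per-gene mutation count into mutations per kilobase."""
    return mutations_per_gene * 1000 / gene_length_bp


low = mutations_per_kb(1)
high = mutations_per_kb(2)
print(f"{low:.2f} to {high:.2f} mutations per kb")
```

Under these assumptions the chosen condition corresponds to somewhat under 1.5 mutations per kilobase, a low rate at which most library members carry only one or two base changes.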
Fragment 7 was obtained by PCR with S. cerevisiae genomic DNA as the template, using primers SEQ ID NO: 25 (fragment 7 primer F) (Acactggggcaataggctgtcgccattcaagagcagatagcttcaaaatgtttctactc) and SEQ ID NO: 26 (fragment 7 primer R) (Cataatgtgagttttgctcaacatttctctctccattttgtaattaaaacttagattag). Fragment 8 was obtained by PCR with S. cerevisiae genomic DNA as the template, using primers SEQ ID NO: 27 (fragment 8 primer F) (Ttgatgagttcatttcaaagcttgtttcctcctgagattaatataattatataaaaata) and SEQ ID NO: 28 (fragment 8 primer R) (Actgtcaaggagggtattctgggcctccatgtcgctgctatataacagttgaaatttgg). Fragments 9 and 10 were obtained by PCR with S. cerevisiae genomic DNA as the template, using primers SEQ ID NO: 29 (fragment 9 primer F) (Cccaaagctaagagtcccattttattc) and SEQ ID NO: 30 (fragment 9 primer R) (Gaagagtaaaaaaggagtagaaacattttgaagctatctgctcttgaatggcgacagcc), and primers SEQ ID NO: 31 (fragment 10 primer F) (Tgtcgattcgatactaacgccgccatccagtgtcgagaattacaatagtatgtctgatg) and SEQ ID NO: 32 (fragment 10 primer R) (Tctggtgaggatttacggtatg). Fragment 11 was obtained by PCR with plasmid PLKAN as the template, using primers SEQ ID NO: 33 (fragment 11 primer F) (Aagatgttcttatccaaatttcaactgttatatagcagcgacatggaggcccagaatac) and SEQ ID NO: 34 (fragment 11 primer R) (tacttcttgcagacatcagacatactattgtaattctcgacactggatggcggcgttag). Fragments 7-11 were mixed in equimolar amounts and transformed into the high-producing PPD strain ZWBY04RS; the cells were spread on YPD plates (containing 100 μg/mL G418) and cultured at 30 °C for 2 days until yeast transformants expressing the UGTPg45 mutant genes grew. Similarly, wild-type UGTPg45 was transformed to construct yeast transformants as the control.
600 μL of YPD medium (containing 100 μg/mL G418) was added to each well of a 96-well plate, single yeast colonies were picked into the medium, and the plate was cultured with shaking at 30 °C and 280 rpm for 1 day. 6 μL of each culture was transferred to a new 96-well plate containing 600 μL of YPD medium per well and cultured with shaking at 30 °C and 280 rpm for 3 days. 600 μL of n-butanol was added to each well, the plate was closed with a rubber lid and sealed with tape, and the cultures were extracted with rotation for 3 h. After centrifugation at 4000 rpm for 10 min, 150 μL of the n-butanol phase was transferred to a new 96-well plate and the products were determined by HPLC.
The Rh2-producing strain ZWBY04RS-8E7, constructed with the mutant gene 8E7, reached an Rh2 yield of 60.48 mg/L, a 70% increase over strain ZWBY04RS-UGTPg45 constructed with UGTPg45.
The wild-type gene UGTPg45 has the nucleotide sequence of SEQ ID NO: 20. Nucleotides 1-1374 from the 5' end of SEQ ID NO: 20 are the open reading frame of UGTPg45; nucleotides 1-3 from the 5' end are the start codon ATG of the UGTPg45 gene, and nucleotides 1372-1374 from the 5' end are its stop codon TGA. The glycosyltransferase gene UGTPg45 encodes a protein of 457 amino acids, UGTPg45, which has the amino acid sequence of SEQ ID NO: 19. Software prediction gives the protein a theoretical molecular weight of 51.1 kDa and an isoelectric point (pI) of 5.10. Amino acids 332-375 from the amino terminus of SEQ ID NO: 19 form the conserved PSPG domain of glycosyltransferases.
UGTPg45 amino acid sequence, SEQ ID NO: 19
Figure PCTCN2018086738-appb-000021
UGTPg45 nucleotide sequence, SEQ ID NO: 20
Figure PCTCN2018086738-appb-000022
The mutant gene 8E7 has the nucleotide sequence of SEQ ID NO: 22. Nucleotides 1-1374 from the 5' end of SEQ ID NO: 22 are the open reading frame of 8E7; nucleotides 1-3 from the 5' end are the start codon ATG of the 8E7 gene, and nucleotides 1372-1374 from the 5' end are its stop codon TGA. The glycosyltransferase gene 8E7 encodes a protein of 457 amino acids, 8E7, which has the amino acid sequence of SEQ ID NO: 21.
8E7 amino acid sequence, SEQ ID NO: 21
Figure PCTCN2018086738-appb-000023
8E7 nucleotide sequence, SEQ ID NO: 22
Figure PCTCN2018086738-appb-000024
These results show that replacing the ginseng-derived glycosyltransferase gene UGTPg45 with either the Panax notoginseng-derived glycosyltransferase gene Pn50 of the present invention or the mutant gene 8E7, obtained by engineering the wild-type glycosyltransferase gene, substantially increases the synthesis efficiency and yield of the rare ginsenoside Rh2, which is a significant beneficial effect.
Example 6. Synthesis of Rh2 in Saccharomyces cerevisiae using Pn50-Q222H-VE, a mutant of the Panax notoginseng glycosyltransferase gene Pn50
The inventors mutated the amino acid residue Q at position 222 of Pn50 to H. In addition, the amino acid residues at positions 322 and 323 are absent in Pn50, and the inventors inserted two amino acids, VE, after position 321, yielding the Pn50 mutant Pn50-Q222H-VE.
(1) Twelve primers with the sequences SEQ ID NO: 7-18 were synthesized, and the promoter, terminator, ORF, selection marker, and upstream and downstream homology arm fragments for glycosyltransferase expression were obtained by PCR, using the PCR method of Example 1. 100 ng of each of these PCR fragments and of the Pn50-Q222H-VE ORF were mixed and transformed into the recombinant S. cerevisiae strain ZWBY04RS by the conventional LiAc/ssDNA transformation method for S. cerevisiae, yielding the recombinant strain ZWBY04RS-QE.
(2) The recombinant S. cerevisiae strain ZWBY04RS-QE, streaked on a solid medium plate, was picked into a test tube containing 5 mL of liquid medium and cultured overnight with shaking (30 °C, 250 rpm, 16 h). The cells were collected by centrifugation and transferred to a 50 mL Erlenmeyer flask containing 10 mL of liquid medium, the OD600 was adjusted to 0.05, and the culture was shaken at 30 °C and 250 rpm for 4 days to obtain the fermentation product. One parallel (replicate) culture was set up for each recombinant yeast strain.
Extraction and detection of protopanaxadiol and the rare ginsenoside Rh2: 100 μL of fermentation broth was taken from the 10 mL culture, the yeast cells were lysed by shaking in a Fastprep instrument, and the lysate was extracted with an equal volume of n-butanol; the n-butanol was then evaporated to dryness under vacuum. The residue was dissolved in 100 μL of methanol, and the yield of the target product was determined by HPLC.
The recombinant S. cerevisiae strain ZWBY04RS-QE, constructed by introducing the Panax notoginseng-derived glycosyltransferase gene Pn50-Q222H-VE into the protopanaxadiol-producing S. cerevisiae strain, synthesizes the rare ginsenoside Rh2 with a yield of 59.09 mg/L (Figure 4).
Amino acid sequence of Pn50-Q222H-VE, SEQ ID NO: 41
Figure PCTCN2018086738-appb-000025
Nucleotide sequence of Pn50-Q222H-VE, SEQ ID NO: 42
Figure PCTCN2018086738-appb-000026
Example 7. Synthesis of Rh2 in Saccharomyces cerevisiae using the glycosyltransferase mutant genes UGT-MUT1, UGT-MUT2 and UGT-MUT3
By saturation mutagenesis of amino acid residue K at position 280 and/or amino acid residue N at position 247 of Pn50-Q222H-VE, the inventors obtained three mutants with improved catalytic efficiency: UGT-MUT1, UGT-MUT2 and UGT-MUT3. In UGT-MUT1, amino acid residue K at position 280 of Pn50-Q222H-VE is mutated to I. In UGT-MUT2, amino acid residue N at position 247 of Pn50-Q222H-VE is mutated to S. In UGT-MUT3, amino acid residue K at position 280 of Pn50-Q222H-VE is mutated to I and amino acid residue N at position 247 is mutated to S.
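Substitution labels such as K280I and N247S encode the wild-type residue, its 1-based position and the replacement residue. The sketch below parses such labels and applies them to a sequence, checking that the expected wild-type residue is actually present. Because the real sequences are given only as figures, the demonstration uses a short made-up peptide with small position numbers; the class and function names are hypothetical.

```python
import re
from typing import Iterable, NamedTuple


class Substitution(NamedTuple):
    wild_type: str
    position: int  # 1-based residue number
    mutant: str


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'K280I' into its three components."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


def apply_substitutions(sequence: str, labels: Iterable[str]) -> str:
    """Apply point substitutions, verifying each wild-type residue."""
    residues = list(sequence)
    for label in labels:
        sub = parse_substitution(label)
        found = residues[sub.position - 1]
        if found != sub.wild_type:
            raise ValueError(f"{label}: found {found} at {sub.position}")
        residues[sub.position - 1] = sub.mutant
    return "".join(residues)


# Made-up 8-residue peptide, NOT a fragment of Pn50-Q222H-VE.
toy_parent = "MKNAQLKV"
print(parse_substitution("K280I"))
print(apply_substitutions(toy_parent, ["N3S", "K7I"]))
```

Applying both toy substitutions at once mirrors how UGT-MUT3 combines the two single changes carried separately by UGT-MUT1 and UGT-MUT2.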
(1) Twelve primers with the sequences SEQ ID NO: 7-18 were synthesized, and the promoter, terminator, ORF, selection marker, and upstream and downstream homology arm fragments for glycosyltransferase expression were obtained by PCR, using the PCR method of Example 1. 100 ng of each of these PCR fragments and of the ORF of UGT-MUT1 (Pn50-Q222H-VE-K280I), UGT-MUT2 (Pn50-Q222H-VE-N247S) or UGT-MUT3 (Pn50-Q222H-VE-K280I-N247S) were mixed and transformed into the recombinant S. cerevisiae strain ZWBY04RS by the conventional LiAc/ssDNA transformation method for S. cerevisiae, yielding the recombinant strains ZWBY04RS-MUT1, ZWBY04RS-MUT2 and ZWBY04RS-MUT3.
(2) The recombinant S. cerevisiae strains ZWBY04RS-MUT1, ZWBY04RS-MUT2 and ZWBY04RS-MUT3, streaked on solid medium plates, were each picked into a test tube containing 5 mL of liquid medium and cultured overnight with shaking (30 °C, 250 rpm, 16 h). The cells were collected by centrifugation and transferred to a 50 mL Erlenmeyer flask containing 10 mL of liquid medium, the OD600 was adjusted to 0.05, and the culture was shaken at 30 °C and 250 rpm for 4 days to obtain the fermentation product. One parallel (replicate) culture was set up for each recombinant yeast strain.
Extraction and detection of protopanaxadiol and the rare ginsenoside Rh2: 100 μL of fermentation broth was taken from the 10 mL culture, the yeast cells were lysed by shaking in a Fastprep instrument, and the lysate was extracted with an equal volume of n-butanol; the n-butanol was then evaporated to dryness under vacuum. The residue was dissolved in 100 μL of methanol, and the yield of the target product was determined by HPLC. The HPLC results are shown in Figure 4.
The recombinant S. cerevisiae strain ZWBY04RS-MUT1, constructed by introducing the Panax notoginseng-derived glycosyltransferase gene UGT-MUT1 into the protopanaxadiol-producing S. cerevisiae strain, synthesizes the rare ginsenoside Rh2 with a yield of 67.96 mg/L. The corresponding strain ZWBY04RS-MUT2, constructed with UGT-MUT2, synthesizes Rh2 with a yield of 78.29 mg/L, and strain ZWBY04RS-MUT3, constructed with UGT-MUT3, synthesizes Rh2 with a yield of 83.53 mg/L.
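The Rh2 yields reported in Examples 4 to 7 can be put on a common footing by expressing each as a percentage increase over a reference strain. The sketch below does this with the values quoted in the text. It assumes the yields from the different examples are directly comparable, which the source does not state explicitly; the dictionary and function names are hypothetical.

```python
# Rh2 yields (mg/L) as reported in Examples 4-7, keyed by glycosyltransferase.
RH2_YIELD_MG_PER_L = {
    "UGTPg45": 35.66,
    "Pn50": 45.55,
    "8E7": 60.48,
    "Pn50-Q222H-VE": 59.09,
    "UGT-MUT1": 67.96,
    "UGT-MUT2": 78.29,
    "UGT-MUT3": 83.53,
}


def percent_increase(new: float, reference: float) -> float:
    """Relative increase of `new` over `reference`, in percent."""
    return (new - reference) / reference * 100


for enzyme, value in RH2_YIELD_MG_PER_L.items():
    vs_ugtpg45 = percent_increase(value, RH2_YIELD_MG_PER_L["UGTPg45"])
    vs_pn50 = percent_increase(value, RH2_YIELD_MG_PER_L["Pn50"])
    print(f"{enzyme:14s} {value:6.2f} mg/L  "
          f"{vs_ugtpg45:+7.1f}% vs UGTPg45  {vs_pn50:+7.1f}% vs Pn50")
```

The first column of percentages reproduces the roughly 70% improvement stated for 8E7 over UGTPg45, and the second shows how each Pn50-derived mutant compares with wild-type Pn50.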
Amino acid sequence of glycosyltransferase mutant UGT-MUT1, SEQ ID NO: 35
Figure PCTCN2018086738-appb-000027
Nucleotide sequence of glycosyltransferase mutant gene UGT-MUT1, SEQ ID NO: 36
Figure PCTCN2018086738-appb-000028
Amino acid sequence of glycosyltransferase mutant UGT-MUT2, SEQ ID NO: 37
Figure PCTCN2018086738-appb-000029
Nucleotide sequence of glycosyltransferase mutant gene UGT-MUT2, SEQ ID NO: 38
Figure PCTCN2018086738-appb-000030
Amino acid sequence of glycosyltransferase mutant UGT-MUT3, SEQ ID NO: 39
Figure PCTCN2018086738-appb-000031
Nucleotide sequence of glycosyltransferase mutant gene UGT-MUT3, SEQ ID NO: 40
Figure PCTCN2018086738-appb-000032
Discussion
To date, transcriptome analyses of ginseng, American ginseng and Panax notoginseng have identified a large number of candidate glycosyltransferase genes, but only a very small number of glycosyltransferases have been verified to participate in ginsenoside synthesis. No glycosyltransferase involved in ginsenoside synthesis in Panax notoginseng has been reported so far. Because Panax notoginseng synthesizes the same ginsenosides, discovering Panax notoginseng-derived glycosyltransferases is important for two reasons: it improves our understanding of the ginsenoside biosynthetic pathways in these two kinds of plants, and it provides a richer set of parts for synthetic biology research on ginsenosides.
All documents mentioned in the present invention are incorporated by reference in this application, as if each document were individually incorporated by reference. It should also be understood that, after reading the above teachings of the present invention, those skilled in the art may make various changes or modifications to the invention, and such equivalent forms likewise fall within the scope defined by the claims appended to this application.

Claims (13)

  1. An isolated polypeptide, characterized in that
    (1) in the amino acid sequence of the isolated polypeptide, the amino acid residue at the position corresponding to position 222 of the amino acid sequence shown in SEQ ID NO: 19 is not Gln, and/or the amino acid residue at the position corresponding to position 322 of the amino acid sequence shown in SEQ ID NO: 19 is not Ala; or
    (2) in the amino acid sequence of the isolated polypeptide, the amino acid residue at the position corresponding to position 247 of the amino acid sequence shown in SEQ ID NO: 19 is not Asn (N), and/or the amino acid residue at the position corresponding to position 280 of the amino acid sequence shown in SEQ ID NO: 19 is not Lys (K).
  2. An isolated polypeptide, characterized in that the polypeptide is selected from the group consisting of:
    (a) a polypeptide having the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41;
    (b) a derivative polypeptide that is formed from a polypeptide having the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41 by substitution, deletion or addition of one or several amino acid residues (preferably 1-20, more preferably 1-15, more preferably 1-10, more preferably 1-3, most preferably 1), or by addition of a signal peptide sequence, and that has glycosyltransferase activity;
    (c) a derivative polypeptide whose sequence contains the polypeptide sequence described in (a) or (b);
    (d) a derivative polypeptide whose amino acid sequence has ≥85% homology (preferably ≥90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%) with the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41, and that has glycosyltransferase activity.
  3. An isolated polynucleotide, characterized in that the polynucleotide is a sequence selected from the group consisting of:
    (A) a nucleotide sequence encoding the polypeptide of claim 1 or 2;
    (B) a nucleotide sequence encoding the polypeptide shown in SEQ ID NO.: 4 or SEQ ID NO.: 21, or a derivative polypeptide thereof;
    (C) a nucleotide sequence as shown in SEQ ID NO.: 3, SEQ ID NO.: 22, SEQ ID NO.: 36, SEQ ID NO.: 38, SEQ ID NO.: 40 or SEQ ID NO.: 42;
    (D) a nucleotide sequence having ≥90% homology (preferably ≥91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%) with the sequence shown in SEQ ID NO.: 3, SEQ ID NO.: 22, SEQ ID NO.: 36, SEQ ID NO.: 38, SEQ ID NO.: 40 or SEQ ID NO.: 42;
    (E) a nucleotide sequence formed by truncating or adding 1-60 (preferably 1-30, more preferably 1-10) nucleotides at the 5' end and/or the 3' end of the nucleotide sequence shown in SEQ ID NO.: 3, SEQ ID NO.: 22, SEQ ID NO.: 36, SEQ ID NO.: 38, SEQ ID NO.: 40 or SEQ ID NO.: 42;
    (F) a nucleotide sequence complementary to the nucleotide sequence of any one of (A) to (E).
  4. A vector, characterized in that the vector contains the polynucleotide of claim 3.
  5. Use of the isolated polypeptide of claim 1 or 2, characterized in that it is used to catalyze the following reaction, or to prepare a catalytic preparation that catalyzes the following reaction:
    (i) transferring a glycosyl group from a glycosyl donor to the hydroxyl group at the C-3 position of a tetracyclic triterpenoid compound.
  6. The use according to claim 5, characterized in that the isolated polypeptide is used to catalyze the following reaction, or to prepare a catalytic preparation that catalyzes the following reaction:
    Figure PCTCN2018086738-appb-100001
    wherein R1 is H or OH; R2 is H or OH; R3 is H or a glycosyl group; and R4 is a glycosyl group.
  7. An in vitro glycosylation method, characterized by comprising the step of:
    in the presence of a glycosyltransferase, transferring a glycosyl group from a glycosyl donor to the hydroxyl group at the C-3 position of a tetracyclic triterpenoid compound, thereby forming a glycosylated tetracyclic triterpenoid compound;
    wherein the glycosyltransferase is the polypeptide of claim 1 or 2 or a derivative polypeptide thereof.
  8. The method according to claim 7, characterized in that the derivative polypeptide is selected from:
    a derivative polypeptide that is formed from a polypeptide having the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41 by substitution, deletion or addition of one or several amino acid residues, or by addition of a signal peptide sequence, and that has glycosyltransferase activity; or
    a derivative polypeptide whose amino acid sequence has ≥85% homology (preferably ≥90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%) with the amino acid sequence shown in SEQ ID NO.: 4, SEQ ID NO.: 21, SEQ ID NO.: 35, SEQ ID NO.: 37, SEQ ID NO.: 39 or SEQ ID NO.: 41, and that has glycosyltransferase activity;
    wherein the glycosyltransferase activity refers to the activity of transferring a glycosyl group from a glycosyl donor to the hydroxyl group at the C-3 position of a tetracyclic triterpenoid compound.
  9. A method for performing a glycosyl catalytic reaction, characterized by comprising the step of: carrying out the glycosyl catalytic reaction in the presence of the polypeptide of claim 1 or 2 or a derivative polypeptide thereof.
  10. The method according to claim 9, characterized in that the substrate of the glycosyl catalytic reaction is a compound of formula (I), and the product is a compound of formula (II).
  11. A genetically engineered host cell, characterized in that the host cell contains the vector of claim 4, or has the polynucleotide of claim 3 integrated into its genome.
  12. Use of the host cell of claim 11, characterized in that it is used for preparing an enzyme catalytic reagent, or for producing a glycosyltransferase, or as a catalytic cell, or for producing a glycosylated tetracyclic triterpenoid compound.
  13. A method for producing a transgenic plant, characterized by comprising the step of: regenerating the genetically engineered host cell of claim 11 into a plant, wherein the genetically engineered host cell is a plant cell.
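
Illustrative note (not part of the claims): claim 1 refers to residues at positions "corresponding to" positions of SEQ ID NO: 19, and claim 2(d) to a percentage homology threshold. The Python sketch below shows one way such checks might be carried out on a pairwise alignment that has already been computed. It rests on several assumptions that are not taken from the application: the helper names are hypothetical, gaps are written as "-", homology is approximated as percent identity over all alignment columns, and the sequences are toy placeholders rather than any sequence from the sequence listing.

```python
# Illustrative sketch only. The sequences used with these helpers are toy
# placeholders, not SEQ ID NO: 19 or any other sequence from the application.

GAP = "-"  # assumed gap symbol in the pre-computed pairwise alignment


def column_of_reference_position(aligned_ref: str, position: int) -> int:
    """Map a 1-based position in the ungapped reference to an alignment column."""
    seen = 0
    for column, residue in enumerate(aligned_ref):
        if residue != GAP:
            seen += 1
            if seen == position:
                return column
    raise ValueError(f"reference has fewer than {position} residues")


def residue_at_reference_position(aligned_ref: str, aligned_query: str, position: int) -> str:
    """Return the query residue (or gap) aligned to the given reference position."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    return aligned_query[column_of_reference_position(aligned_ref, position)]


def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Identical non-gap columns as a percentage of all alignment columns.

    This denominator is an assumed convention; other definitions exist.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_ref:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r == q and r != GAP
    )
    return 100.0 * matches / len(aligned_ref)


if __name__ == "__main__":
    toy_ref = "MKQ-LAN"    # hypothetical aligned reference
    toy_query = "MKHGLAN"  # hypothetical aligned candidate
    residue = residue_at_reference_position(toy_ref, toy_query, 3)
    print("residue aligned to reference position 3:", residue)
    print("differs from Q:", residue != "Q")
    print("percent identity:", round(percent_identity(toy_ref, toy_query), 1))
```

With a real alignment against SEQ ID NO: 19, the same lookup could be applied to positions 222, 247, 280 and 322. How a gap at one of those positions should be treated, and which alignment method and identity definition apply, are left open here.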
PCT/CN2018/086738 2017-05-16 2018-05-14 Glycosyltransferase, mutant, and application thereof WO2018210208A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
KR1020197037134A KR102418138B1 (en) 2017-05-16 2018-05-14 Glycosyltransferases, mutants and applications thereof
JP2019563883A JP7086107B2 (en) 2017-05-16 2018-05-14 Glycosyltransferases, mutants and their use
CN201880005455.9A CN110462033A (en) 2017-05-16 2018-05-14 Glycosyl transferase, mutant and its application

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710344730.7 2017-05-16
CN201710344730.7A CN108866020A (en) 2017-05-16 2017-05-16 Glycosyl transferase, mutant and its application

Publications (1)

Publication Number Publication Date
WO2018210208A1 (en) 2018-11-22

Family

ID=64273390

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/086738 WO2018210208A1 (en) 2017-05-16 2018-05-14 Glycosyltransferase, mutant, and application thereof

Country Status (4)

Country Link
JP (1) JP7086107B2 (en)
KR (1) KR102418138B1 (en)
CN (2) CN108866020A (en)
WO (1) WO2018210208A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107475214B (en) * 2017-08-14 2019-10-08 中国科学院华南植物园 A kind of 7-O- glycosyl transferase and its encoding gene and application
CN109295027B (en) * 2018-11-06 2020-12-29 浙江华睿生物技术有限公司 Glycosyltransferase mutant
CN111378681B (en) * 2018-12-27 2023-01-17 中国医学科学院药物研究所 Recombinant bacterium for producing dammarenediol-II glucoside and application thereof
CN113444703B (en) * 2020-03-26 2023-09-01 生合万物(苏州)生物科技有限公司 Glycosyltransferase mutant for catalyzing sugar chain extension and application thereof

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014051215A1 (en) * 2012-09-27 2014-04-03 Korea Advanced Institute Of Science And Technology Novel udp-glycosyltransferase derived from ginseng and use thereof
CN103849672A (en) * 2012-12-06 2014-06-11 中国科学院上海生命科学研究院 Group of glycosyl transferase and application thereof
CN104232723A (en) * 2013-06-07 2014-12-24 中国科学院上海生命科学研究院 Glycosyl transferases and applications of glycosyl transferases
WO2015167282A1 (en) * 2014-04-30 2015-11-05 Korea Advanced Institute Of Science And Technology A novel method for glycosylation of ginsenoside using a glycosyltransferase derived from panax ginseng
CN105087739A (en) * 2014-05-12 2015-11-25 中国科学院上海生命科学研究院 Novel catalytic system for preparing rare ginsenosides and application thereof
CN105177100A (en) * 2014-06-09 2015-12-23 中国科学院上海生命科学研究院 A group of glycosyl transferase, and applications thereof
CN105985938A (en) * 2015-01-30 2016-10-05 中国科学院上海生命科学研究院 Glycosyl transferase mutant protein and applications thereof

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5952203A (en) * 1997-04-11 1999-09-14 The University Of British Columbia Oligosaccharide synthesis using activated glycoside derivative, glycosyl transferase and catalytic amount of nucleotide phosphate
CN1324940A (en) * 2000-05-19 2001-12-05 上海博德基因开发有限公司 New polypeptide UDP glycosyltransferase (UGT) and cobalamin binding protein 11 and polynucleotides for encoding same
CN104357418B (en) * 2014-10-11 2017-11-10 上海交通大学 The application of a kind of glycosyl transferase and its mutant in ginseng saponin Rh 2 is synthesized

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG, P.P. ET AL.: "Production of Bioactive Ginsenosides Rh2 and Rg3 by Metabolically Engineered Yeasts", METABOLIC ENGINEERING, vol. 29, 11 March 2015 (2015-03-11), pages 97 - 105, XP029590692 *

Also Published As

Publication number Publication date
CN108866020A (en) 2018-11-23
JP2020520244A (en) 2020-07-09
KR102418138B1 (en) 2022-07-07
CN110462033A (en) 2019-11-15
JP7086107B2 (en) 2022-06-17
KR20200016268A (en) 2020-02-14

Similar Documents

Publication Publication Date Title
JP6479084B2 (en) A series of glycosyltransferases and their applications
US20230203458A1 (en) Group of udp-glycosyltransferase for catalyzing carbohydrate chain elongation and application thereof
WO2018210208A1 (en) Glycosyltransferase, mutant, and application thereof
WO2015188742A2 (en) Group of glycosyltransferases and use thereof
WO2016119756A1 (en) Mutant protein of glycosyltransferase and uses thereof
WO2018133844A1 (en) Cytochrome p450 mutant protein and applications thereof
WO2023006109A1 (en) Highly specific glycosyltransferase for rhamnose, and use thereof
WO2022105729A1 (en) Cytochrome p450 mutant protein and use thereof
CN112831481B (en) Glycosyltransferase and method for catalyzing sugar chain extension
CN113444703B (en) Glycosyltransferase mutant for catalyzing sugar chain extension and application thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18802446

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019563883

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 20197037134

Country of ref document: KR

Kind code of ref document: A

122 Ep: pct application non-entry in european phase

Ref document number: 18802446

Country of ref document: EP

Kind code of ref document: A1