WO2011121456A2 - Nucleic acids and protein sequences of costunolide synthase - Google Patents

Nucleic acids and protein sequences of costunolide synthase

Info

Publication number
WO2011121456A2
Authority
WO
WIPO (PCT)
Prior art keywords
germacrene
protein
costunolide
plate
synthase
Prior art date
Application number
PCT/IB2011/001609
Other languages
English (en)
Other versions
WO2011121456A3 (fr)
Inventor
Dae-Kyun Ro
Nobuhiro Ikezawa
Don Trinh Nguyen
Original Assignee
Uti Limited Partnership
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Uti Limited Partnership
Publication of WO2011121456A2
Publication of WO2011121456A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y114/00 Oxidoreductases acting on paired donors, with incorporation or reduction of molecular oxygen (1.14)
    • C12Y114/13 Oxidoreductases acting on paired donors, with incorporation or reduction of molecular oxygen (1.14) with NADH or NADPH as one donor, and incorporation of one atom of oxygen (1.14.13)
    • C12Y114/1312 Costunolide synthase (1.14.13.120)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8242 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
    • C12N15/8243 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P5/00 Preparation of hydrocarbons or halogenated hydrocarbons
    • C12P5/007 Preparation of hydrocarbons or halogenated hydrocarbons containing one or more isoprene units, i.e. terpenes

Definitions

  • the present invention relates to the fields of botany, molecular biology and biochemistry. More particularly, the invention relates to the identification and characterization of costunolide synthase (LsCS) from lettuce (Lactuca sativa), and uses therefor.
  • LsCS costunolide synthase
  • Terpenoids are structurally the most diverse class of natural products, derived from isopentenyl diphosphate (IPP), with many known biological functions and commercial applications (Gershenzon and Dudareva, 2007).
  • IPP isopentenyl diphosphate
  • One subclass of terpenoids is the sesquiterpene lactone (STL), characterized by its α-methylene γ-lactone moiety on the 15-carbon core backbone (Fischer, 1990).
  • Although STLs are found in several plant families including Cupressaceae (Picman, 1986), liverwort (Asakawa et al, 1981; Knoche et al, 1969) and even in fungus (Pittayakhajonwut et al, 2009; Vidari et al, 1976), their occurrence in nature is by far the most frequent among Asteraceae (or Compositae) plants, the second largest plant family, comprising 23,000 plant species (Herz, 1977; Panero and Funk, 2008).
  • STLs have benefited human health and wellness as anti-inflammatory (e.g., parthenolide), sedative and analgesic (e.g., lactucopicrin), anti-cancer (e.g., thapsigargin), and anti-malarial (e.g., artemisinin) medicines.
  • anti-inflammatory e.g., parthenolide
  • sedative and analgesic e.g., lactucopicrin
  • anti-cancer e.g., thapsigargin
  • anti-malarial e.g., artemisinin
  • costunolide An additional hydroxylation at C6 position of germacrene A acid facilitates non-enzymatic lactonization of C6 hydroxyl and C12 carboxylic group, yielding costunolide.
  • the costunolide in turn serves as a framework of guaianolides, eudesmanolides, germacranolides and other STLs by as yet unknown mechanisms.
  • elaborate chemical decorations of STL scaffolds are carried out by P450 and several other modifying enzymes in order to produce more complex and biologically active STL end- products.
  • artemisinic acid biosynthesis is a specific evolutionary event that only occurred in a single modern species A. annua, it can be theorized that biochemistry of artemisinic acid was diverged from more general germacrene A acid biosynthesis. Comparative analysis of artemisinic acid and germacrene A acid biosynthesis would be an interesting model to understand the adaptive evolution of enzymes.
  • Germacrene A acid is certainly a necessary chemical to further investigate the STL biochemistry in Asteraceae.
  • this compound is difficult to obtain because germacrene A acid is a low-abundance, transient intermediate in the STL biosynthetic pathway (de Kraker et al, 2001b), and the chemical synthesis of terpenoids is difficult to achieve.
  • One report showed that a minute amount (2 mg) of germacrene A acid could be purified from 300 g of costus (Saussurea lappa) (de Kraker et al, 2001b).
  • microbial production of germacrene A acid would be a convenient alternative to acquire germacrene A acid, and necessitates the identification of its biosynthetic genes and subsequent reconstruction in microbes for de novo synthesis.
  • STLs have major classes of structures such as guaianolides, eudesmanolides, and germacranolides. While germacranolides is the largest group and is believed to be the precursor of guaianolides and eudesmanolides, the simplest germacranolides (+)-costunolide is generally accepted as the common intermediate of germacranolide-derived lactones (Herz, 1977; Fischer, 1990; Seaman, 1982; Song et al, 1995; de Kraker et al, 2002).
  • the biosynthetic scheme of costunolide, depicted in FIG. 6 is as follows (de Kraker et al, 2001a; de Kraker et al, 2002).
  • FPP Farnesyl pyrophosphate
  • GAS germacrene A synthase
  • C12 is oxidized three times to yield germacrene A acid [germacra-1(10),4,11(13)-trien-12-oic acid].
  • An additional hydroxylation at C6 position of germacrene A acid facilitates non-enzymatic lactonization of C6 hydroxyl and C12 carboxylic groups, yielding costunolide.
  • GAS genes While a number of GAS genes have been isolated and characterized in Asteraceae plants (Bennett et al, 2002; Bertea et al, 2006; Bouwmeester et al, 2002; Göpfert et al, 2009), the inventors' recent work has shown the gene isolation and characterization of one P450 (germacrene A oxidase, GAO) which catalyzes three-step oxidation from germacrene A to germacrene A acid (Nguyen et al, 2010).
  • costunolide synthase which hydroxylates C6 position of germacrene A acid
  • P450 de Kraker et al, 2002
  • lactone ring structure α-methylene-gamma-lactone group
  • identification of this costunolide synthase gene is crucial for downstream application such as enzymatic synthesis of STLs.
  • an isolated P450 having sesquiterpene lactone-forming activity and having at least 60% sequence homology to SEQ ID NO:1.
  • the isolated P450 may be fused to a non-costunolide synthase or polypeptide sequence.
  • the isolated P450 may have 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence homology to SEQ ID NO:1.
  • the isolated P450 may comprise or consist of the sequence of SEQ ID NO: 1.
  • an isolated nucleic acid encoding a P450 having sesquiterpene lactone-forming activity and having at least 60% sequence homology to SEQ ID NO: 1.
  • the nucleic acid may have at least 50% sequence homology to SEQ ID NO:2, or at least 55%, 60%, 65%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence homology to SEQ ID NO:2.
  • the nucleic acid may comprise or consist of the sequence of SEQ ID NO:2.
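The percentage thresholds recited above can be checked numerically once a pairwise alignment is available. The following is a minimal sketch in Python, assuming that "sequence homology" is scored as percent identity over the ungapped columns of a precomputed alignment. This passage does not state a scoring method, so that choice is an assumption, as are the function names `percent_identity` and `meets_threshold` and the toy peptide strings, which are not SEQ ID NO:1 or SEQ ID NO:2.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    Scoring homology as identity is an assumption made for this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold_percent: float) -> bool:
    """True if the pair reaches the given percentage, e.g. 60 for a protein."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Hypothetical toy peptides: one gapped column, one mismatch in ten compared.
TOY_A = "MKT-AYIAKQR"
TOY_B = "MKTLAYLAKQR"
print(percent_identity(TOY_A, TOY_B))
print(meets_threshold(TOY_A, TOY_B, 60))
```

In practice the alignment itself would come from an established alignment tool, and the result depends on the gap treatment and on whether similar residues are counted, so any figure computed this way should be reported together with the method used.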
  • an expression cassette comprising a promoter operably linked to a nucleic acid encoding a P450 having sesquiterpene lactone-forming activity and having 60% sequence homology to SEQ ID NO:1.
  • the promoter may be a eukaryotic promoter, such as a plant promoter or a yeast promoter.
  • the nucleic acid has 50% homology to SEQ ID NO:2.
  • an expression vector comprising a promoter operably linked to a nucleic acid encoding a P450 having sesquiterpene lactone-forming activity and having 60% sequence homology to SEQ ID NO:1 and an origin of replication.
  • the promoter may be a eukaryotic promoter, such as a plant promoter or a yeast promoter.
  • the vector may be a transposon, a yeast artificial chromosome or a bacterial plasmid.
  • a recombinant cell comprising a heterologous expression cassette comprising a promoter operably linked to a nucleic acid encoding a P450 having sesquiterpene lactone-forming activity and having 60% sequence homology to SEQ ID NO:1.
  • the cell may be a plant cell, a bacterial cell or a yeast cell.
  • the promoter may be heterologous to a native P450 gene, such as a plant, bacterial or yeast promoter.
  • the expression cassette may be comprised in a transposon, a yeast artificial chromosome, or a bacterial plasmid.
  • the recombinant cell may further comprise a heterologous selectable marker.
  • Yet another embodiment comprises a transgenic plant, cells of which comprise a P450 costunolide synthase gene under the control of heterologous promoter.
  • the plant may be Asteraceae, such as Lactuca sativa.
  • a yeast cell comprising a germacrene A synthase, a germacrene A oxidase, a cytochrome P450 reductase, and a P450 costunolide synthase; and (b) culturing the yeast cell with a source of germacrene acid under conditions supporting the production of costunolide.
  • the germacrene A synthase may be from Asteraceae
  • the germacrene A oxidase may be from Asteraceae
  • the cytochrome P450 reductase is from Asteraceae
  • the P450 costunolide synthase may be from Asteraceae.
  • Genes encoding two, three or all four of germacrene A synthase, the germacrene A oxidase, the cytochrome P450 reductase, and the P450 costunolide synthase may be located on a single expression vector. Alternatively, genes encoding two, three or all four of germacrene A synthase, the germacrene A oxidase, the cytochrome P450 reductase, and the P450 costunolide synthase are located on two, three or four expression vectors.
  • the yeast may be Saccharomyces cerevisiae.
  • the germacrene acid may be provided exogenously, or the yeast cell may produce germacrene acid.
  • Still a further embodiment comprises a system comprising (a) a yeast cell comprising a germacrene A synthase, a germacrene A oxidase, a cytochrome P450 reductase, and a P450 having costunolide synthase activity; and (b) a medium-containing vessel suitable for culturing the yeast cell.
  • the germacrene A synthase may be from Asteraceae
  • the germacrene A oxidase may be from Asteraceae
  • the cytochrome P450 reductase is from Asteraceae
  • the P450 costunolide synthase may be from Asteraceae.
  • Genes encoding two, three or all four of germacrene A synthase, the germacrene A oxidase, the cytochrome P450 reductase, and the P450 costunolide synthase may be located on a single expression vector.
  • genes encoding two, three or all four of germacrene A synthase, the germacrene A oxidase, the cytochrome P450 reductase, and the P450 having costunolide synthase activity are located on two, three or four expression vectors.
  • the yeast may be Saccharomyces cerevisiae.
  • the germacrene acid may be provided exogenously, or the yeast cell may produce germacrene acid.
  • nucleic acid encoding a P450 having costunolide synthase activity that hybridizes under medium stringency conditions to SEQ ID NO:2.
  • the nucleic acid may hybridize under medium high or high stringency conditions to SEQ ID NO:2.
  • Another embodiment comprises an oligonucleotide of 15 to 100 bases and comprising at least 15 contiguous bases of SEQ ID NO:2.
  • the oligonucleotide may be 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 75, 80, 90 or 100 bases in length.
  • the oligonucleotide may comprise 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 75, 80, 90 or 100 contiguous bases of SEQ ID NO:2.
  • the oligonucleotide may be RNA or DNA.
  • the oligonucleotide may comprise a detectable marker, such as a sequential, radioactive, chemiluminescent, fluorescent, magnetic, colorimetric or enzymatic marker.
  • the oligonucleotide may comprise a non-Asteraceae sequence.
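The length and contiguity limits recited for the oligonucleotide lend themselves to a simple check. The sketch below, in Python, tests whether a candidate is 15 to 100 bases long and contains at least 15 contiguous bases of a reference sequence. The function name `is_candidate_oligo`, the choice to compare only against the given strand (no reverse complement), and the toy reference string are assumptions for illustration; the toy string is not SEQ ID NO:2.

```python
def is_candidate_oligo(oligo: str, reference: str,
                       min_len: int = 15, max_len: int = 100,
                       min_contiguous: int = 15) -> bool:
    """Check length limits and presence of a contiguous reference stretch.

    RNA input is accepted by mapping U to T before comparison.
    Only the strand given in `reference` is searched (an assumption).
    """
    oligo = oligo.upper().replace("U", "T")
    reference = reference.upper().replace("U", "T")
    if not (min_len <= len(oligo) <= max_len):
        return False
    if not set(oligo) <= set("ACGT"):
        return False  # reject ambiguous or non-nucleotide characters
    # Slide a window of min_contiguous bases along the oligonucleotide.
    for start in range(len(oligo) - min_contiguous + 1):
        if oligo[start:start + min_contiguous] in reference:
            return True
    return False


# Hypothetical 36-base reference used only to exercise the function.
TOY_REFERENCE = "ATGGCTCTTTCACTCACCACTTCCATTGCTCTTGCC"

print(is_candidate_oligo(TOY_REFERENCE[5:25], TOY_REFERENCE))            # 20-mer from the reference
print(is_candidate_oligo("AAAAA" + TOY_REFERENCE[:15], TOY_REFERENCE))   # foreign 5' tail plus 15 reference bases
print(is_candidate_oligo(TOY_REFERENCE[:14], TOY_REFERENCE))             # too short
```

The second call illustrates the case of an oligonucleotide that also carries non-reference sequence: it still qualifies under this check because one 15-base window matches the reference.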
  • FIG. 1 Sesquiterpene lactone biosynthetic pathways in Asteraceae.
  • the left panel shows the proposed biosynthetic pathway of general sesquiterpene lactones in Asteraceae, and the right panel shows the artemisinic acid biosynthetic pathway in Artemisia annua.
  • FPP farnesyl diphosphate
  • GAS germacrene A synthase
  • ADS amorphadiene synthase.
  • FIGS. 2A-C Analyses of the metabolites de novo synthesized from transgenic yeast.
  • FIG. 2A GC-MS chromatographs are shown for the sesquiterpenoids from the yeast transformed with the indicated genes (GAS, germacrene A synthase; GAO, germacrene A oxidase, CPR, cytochrome P450 reductase).
  • Line a and b are negative controls, and line c displays the metabolites unique to the yeast transformed with three genes (GAS, GAO, and CPR).
  • FIG. 2B Compound 2 and 3 were not separated by DB-5 MS column but were clearly separated by the chiral column (Cyclodex-B column).
  • the mass fragmentation patterns of the compound 1 to 4 are given.
  • FIG. 2C Proposed acid-induced rearrangements of germacrene A acid to α-, β-, and γ-costic acid and additional modification of costic acids to ilicic acid in yeast culture. The speculated structure of peak 2 as α-costic acid is also given with a question mark.
  • FIG. 3A GC-MS chromatographs of the sesquiterpenoid products. Inlet temperatures and chemical treatment are shown in front of the chromatograms. TCA, trichloroacetic acid; bracket, α-, β-, γ-costic acids; asterisk, germacrene A acid; arrow, heat-induced rearranged product of germacrene A acid.
  • m/z 171 corresponds to the internal standard (IS), decanoic acid, with Rt 7.68 min; m/z 233 detects germacrene A acid (*) with Rt 8.41 min and α-, β-, γ-costic acids (in bracket) with Rt 9.02 min and 9.17 min (two of the costic acids co-migrate at 9.02 min); m/z 251 detects ilicic acid (triangle) with Rt 5.38 min. (FIG. 3C) LC-MS chromatography (negative m/z 233) of in vitro enzyme assay products.
  • FIG. 4A The Asteraceae phylogeny simplified from the figure by Panero and Funk (2008) is shown. Asterisks indicate the four subfamilies where GAOs were isolated, and the parentheses indicate specific species names. Representative sesquiterpene lactones were given.
  • FIG. 4B Phylogenetic tree of AMO/GAO in
  • HPO and EAH used as outgroups are Hyoscyamus muticus premnaspirodiene oxygenase and tobacco 5-epiaristolochene dihydroxylase, respectively. Bootstrap values in percentage from 1,000 replicates are shown. The bracket indicates the GAO clade that is clearly distinguished from BsGAO and AMO.
  • FIG. 4C Alignment of deduced amino acids from GAOs and AMO from Asteraceae. Amino acid sequences were obtained from cDNAs deposited at NCBI.
  • AMO amorphadiene oxidase from Artemisia annua (DQ268763 or DQ315671); LsGAO, germacrene A oxidase from Lactuca sativa (GU198171), or from Cichorium intybus (Ci; GU256644), Saussurea lappa (Sl; GU256646), Helianthus annuus (Ha; GU256645), and Barnadesia spinosa (Bs; GU256647).
  • the alignment is shaded to a 50% consensus. Dark and light shading indicate identical and similar residues, respectively.
  • FIGS. 5A-B Biochemical analyses of GAOs from various Asteraceae plants.
  • FIG. 5A LC-MS chromatography at selective negative 233 ion for germacrene A acid and 171 ion for internal standard decanoic acid. The arrow is germacrene A acid, and the arrowhead is internal standard (IS), decanoic acid. Yields of germacrene A acid from four independent transformants are given at the start of the chromatographs (mean ± SD).
  • FIG. 5B Immunoblot analysis of recombinant GAOs was shown. FLAG secondary antibodies were used to detect the epitope tags at the C-termini of GAOs. Loaded microsome amounts were indicated.
  • Ls Lactuca sativa
  • Ci Cichorium intybus
  • Sl Saussurea lappa
  • Bs Barnadesia spinosa
  • Ha Helianthus annuus.
  • FIG. 6 Sesquiterpene lactone biosynthetic pathways in Asteraceae.
  • the pathway shows the proposed biosynthetic pathway of general sesquiterpene lactones in Asteraceae through (+)-costunolide.
  • FPP is farnesyl diphosphate
  • GAS germacrene A synthase
  • GAO germacrene A oxidase
  • CS costunolide synthase.
  • FIG. 7 - LC-MS analysis of products produced from in vitro assays.
  • Chemical structure of costunolide is given together with its LC-MS fractionation pattern in 233 selective ion in positive ion mode.
  • Sample indicates the enzymatic product from the in vitro assay using microsomes isolated from yeast expressing the newly identified costunolide synthase and CPR.
  • Control is the in vitro assay using microsomes from yeast expressing only CPR. Germacrene A acid was used as the substrate.
  • FIG. 8 A Chromatograms of m/z 233 in positive ion mode are shown.
  • Products from the yeast transformed with the indicated genes (GAS, germacrene A synthase; GAO, germacrene A oxidase; CPR, cytochrome P450 reductase; LsCS, costunolide synthase) were analyzed by LC-MS.
  • Negative control shows the product profile from yeast transformed with pESC-Leu2d::GAS/GAO/CPR-Gal10_cassette (without LsCS).
  • FIG. 8B Fragmentation patterns of authentic costunolide and the major product produced from 4 gene-expressing transgenic yeast (both compounds are the ones starred in the chromatograms of FIG. 8A).
  • FIG. 8C Scheme of costunolide synthase reaction. Germacrene acid is hydroxylated first and lactone ring closure proceeds spontaneously.
  • FIGS. 9A-B Structures of STLs found in various Asteraceae plants.
  • FIG. 9A Four representative structures of STLs with distinct regio- and stereo-characteristics of γ-lactone rings are shown, and their sesquiterpene backbones are labelled in red.
  • FIG. 9B Structures of STLs found in sunflower cv. HA300 and lettuce.
  • FIG. 10 Costunolide biosynthetic pathway in Asteraceae. Abbreviations used are:
  • GAS germacrene A synthase
  • GAO germacrene A oxidase
  • COS costunolide synthase
  • FIGS. 11A-C Biochemical and chemical characterizations of germacrene A acid 8β-hydroxylase.
  • FIGS. 11A and 11B (+/-)LC-MS analyses of C12 enzymatic product profiles are shown.
  • Microsomes from the yeast expressing C12 and CPR catalyze the synthesis of a compound (arrowhead) with [M-H2O+H]+ ion at m/z 233 and with [M-H]- ion at m/z 249.
  • the 6-hydroxy GAA was prepared by alkaline hydrolysis of authentic costunolide standard, and the peaks marked by arrow indicate the 6-hydroxy GAA. The identity of this compound (arrow) was confirmed by reverting it to costunolide.
  • FIG. 11C Structure of the new compound (arrowhead) purified from the in vivo feeding assay (8β-hydroxy germacrene A acid) and its rearranged product in an acidic condition (8β-hydroxy ilicic acid).
  • 8β-hydroxy germacrene A acid
  • 8β-hydroxy ilicic acid
  • FIGS. 12A-D Biochemical and chemical characterization of costunolide synthase.
  • FIG. 12A (+)LC-MS scan at m/z 233 demonstrated that the peak 3 and 4 showed identical retention times with 6-hydroxy germacrene A acid and costunolide, respectively.
  • FIG. 12B The structures of the standards are depicted.
  • FIG. 12C Metabolite profile of the culture extraction from the EPY300 strain expressing GAS, LsGAO, CPR, and with or without LsCOS by (+)LC-MS scan at m/z 233.
  • FIG. 12D Product ion scans of the costunolide standard and peak 4 by (+)LC-MS-MS showed identical fragmenting patterns. Diamonds indicate the parental ion at m/z 233.
  • FIG. 13 Bioinformatics analyses of LsCOS and C12 in Asteraceae.
  • the Asteraceae phylogeny was adapted from Panero and Funk (2008).
  • M. recutita is Matricaria recutita (German Chamomile);
  • C. coronarium is Chrysanthemum coronarium (Garland chrysanthemum);
  • two Arnica species are A. montana and A. chamissonis; three Helianthus species are H. annuus, H. argophyllus and H. ciliaris.
  • X. strumarium is Xanthium strumarium; five Lactuca species are L. sativa, L. serriola, L. saligna, L. virosa, and L. perennis; B. spinosa is Barnadesia spinosa.
  • FIGS. 14A-C - LC-MS results of the initial in vivo screening.
  • FIG. 14A Selective ion scan at m/z 233 showed a new compound from the EPY300-GAA strain expressing C12, but it displayed a different retention time from that of costunolide.
  • FIG. 14B Total ion scan of the C12-enzymatic reaction product showed a fragmented ion at m/z 233 and a sodium adduct at m/z 273.
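The m/z values quoted in the figure descriptions (171, 233, 249, 251 and 273) can be cross-checked by simple arithmetic on monoisotopic masses. The following Python sketch is one way to do so. The molecular formulas are not given in this text and are supplied here from standard chemistry as assumptions: costunolide C15H20O2, germacrene A acid C15H22O2, a mono-hydroxylated germacrene A acid C15H22O3, ilicic acid C15H24O3, and decanoic acid C10H20O2. The names `neutral_mass` and `ion_mz` are illustrative.

```python
# Monoisotopic masses in unified atomic mass units.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "Na": 22.98976928}
ELECTRON = 0.00054858
PROTON = MONOISOTOPIC["H"] - ELECTRON
WATER = 2 * MONOISOTOPIC["H"] + MONOISOTOPIC["O"]

# Assumed formulas (element -> atom count); not stated in the source text.
FORMULAS = {
    "costunolide": {"C": 15, "H": 20, "O": 2},
    "germacrene A acid": {"C": 15, "H": 22, "O": 2},
    "hydroxy germacrene A acid": {"C": 15, "H": 22, "O": 3},
    "ilicic acid": {"C": 15, "H": 24, "O": 3},
    "decanoic acid": {"C": 10, "H": 20, "O": 2},
}


def neutral_mass(formula: dict) -> float:
    """Monoisotopic mass of the neutral molecule."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ion_mz(formula: dict, adduct: str) -> float:
    """m/z of a singly charged ion for a few common adduct types."""
    mass = neutral_mass(formula)
    if adduct == "[M+H]+":
        return mass + PROTON
    if adduct == "[M-H]-":
        return mass - PROTON
    if adduct == "[M-H2O+H]+":
        return mass - WATER + PROTON
    if adduct == "[M+Na]+":
        return mass + MONOISOTOPIC["Na"] - ELECTRON
    raise ValueError(f"unsupported adduct: {adduct}")


checks = [
    ("costunolide", "[M+H]+"),
    ("germacrene A acid", "[M-H]-"),
    ("hydroxy germacrene A acid", "[M-H2O+H]+"),
    ("hydroxy germacrene A acid", "[M-H]-"),
    ("hydroxy germacrene A acid", "[M+Na]+"),
    ("ilicic acid", "[M-H]-"),
    ("decanoic acid", "[M-H]-"),
]
for name, adduct in checks:
    print(f"{name:28s} {adduct:12s} {ion_mz(FORMULAS[name], adduct):.4f}")
```

Under these assumed formulas, a hydroxylated germacrene A acid that loses water on protonation appears at the same nominal m/z as protonated costunolide, which is consistent with the text distinguishing the two compounds by retention time rather than by the m/z 233 signal alone.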
  • FIG. 15 Immunoblot analysis of the recombinant P450s.
  • the ECL-Plus detection method was used to detect the fluorescent or luminescent signals from recombinant enzymes.
  • the Typhoon fluorescent imager for fluorescence and X-ray film for luminescence were used to visualize signals. The amount of microsomal protein loaded and primary antibodies used are indicated.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • sesquiterpene lactones are characteristic natural products in Asteraceae which constitutes approximately 8% of all plant species.
  • biochemistry and evolution of STLs in Asteraceae remain unexplored at the molecular level.
  • GAO germacrene A oxidase gene
  • evolutionarily conserved in all major subfamilies of Asteraceae encodes an enzyme that catalyzes three consecutive oxidations of germacrene A to yield germacrene A acid and possesses a latent activity of amorphadiene oxidation.
  • homologous genes were further isolated from the representative species of three major subfamilies of Asteraceae (sunflower, chicory, and costus from Asteroideae, Cichorioideae, and Carduoideae, respectively) and also from the phylogenetically basal species, Barnadesia spinosa, from Barnadesioideae.
  • the recombinant GAOs from these genes clearly showed germacrene A oxidase activities, suggesting that GAO activity is widely conserved in Asteraceae including the basal lineage.
  • GAOs could catalyze the conversion of non-natural substrate amorphadiene to artemisinic acid, whereas amorphadiene oxidase (AMO) diverged from GAO displayed negligible activity for germacrene A oxidation.
  • AMO amorphadiene oxidase
  • the observed amorphadiene oxidase activity in GAOs suggested that the catalytic plasticity is embedded in ancestral enzymes that may drive the chemical diversity in nature.
  • cytochrome P450 monooxygenase gene from lettuce (Lactuca sativa) that is responsible for the last step in costunolide biosynthesis (FIG. 6).
  • This newly identified cytochrome P450 enzyme can catalyze the conversion of germacrene acid to costunolide as confirmed by liquid chromatography tandem mass spectrometry (LC-MS/MS).
  • LC-MS/MS liquid chromatography tandem mass spectrometry
  • the development of a quadruple expression plasmid carrying GAS, GAO, CPR and this new gene enabled de novo production of costunolide in yeast. Use of this platform in yeast and other hosts such as plants, bacteria and fungi will permit large-scale production of costunolide. Since costunolide is generally accepted as a precursor to various naturally-occurring STLs, its production in mass quantities will permit exploitation of STLs in a variety of industries.
  • Taxonomy and Biology The Asteraceae or Compositae, the aster, daisy, or sunflower family, is the second largest family of flowering plants, in terms of number of species.
  • the name Asteraceae is derived from the type genus Aster, while Compositae, an older but still valid name, means composite and refers to the characteristic inflorescence, a special type of pseudanthium found in only a few other angiosperm families.
  • the family has been universally recognized and placed in the order Asterales. The study of this family is known as synantherology.
  • the family comprises more than
  • Asteroideae or Tubuliflorae
  • Cichorioideae or Liguliflorae
  • the latter is paraphyletic and has been divided into many minor groups in most newer systems.
  • the four subfamilies Asteroideae, Cichorioideae, Carduoideae and Mutisioideae comprise 99% of the specific diversity of the whole family (appr. 70%, 14%, 11% and 3% respectively).
  • Other subfamilies have been recognised by some authors, e.g., Helianthoideae.
  • Asteraceae are especially common in open and dry environments. Many members of the Asteraceae are pollinated by insects, which explains their value in attracting beneficial insects, but anemophily is also present (e.g., Ambrosia, Artemisia). There are many apomictic species in the family.
  • Seeds are ordinarily dispersed intact with the fruiting body, the cypsela.
  • Wind dispersal (anemochory) is common, assisted by a hairy pappus.
  • epizoochory, in which the dispersal unit, a single cypsela (e.g., Bidens) or entire capitulum (e.g., Arctium) provided with hooks, spines or some equivalent structure, sticks to the fur or plumage of an animal (or even to clothes) only to fall off later far from its mother plant.
  • Lactuca, commonly known as lettuce, is a genus of flowering plants in the daisy family Asteraceae. The genus includes about 100 species, distributed worldwide, but mainly in temperate Eurasia. Its best-known representative is the garden lettuce (Lactuca sativa), with its many varieties. "Wild lettuce" commonly refers to the wild-growing cousins of common garden lettuce.
  • … used as herbs and in herbal teas and other beverages.
  • Chamomile which comes from two different species, the annual Matricaria recutita or German chamomile, and the perennial Chamaemelum nobile, also called Roman chamomile.
  • Calendula also called the pot marigold is grown commercially for herbal teas and the potpourri industry.
  • Echinacea (Echinacea purpurea), used as a medicinal tea.
  • Winter tarragon also called Mexican mint marigold, Tagetes lucida is commonly grown and used as a tarragon substitute in climates where tarragon will not survive.
  • the wormwood genus Artemisia includes absinthe (A. absinthium) and tarragon (A. dracunculus).
  • Marigold Tinutica
  • Plants in Asteraceae are medically important in areas that do not have access to
  • Centaurea knapweed
  • Helianthus annuus domestic sunflower
  • Solidago goldenrod
  • Tanacetum, Chrysanthemum and Pulicaria contain species with insecticidal properties.
  • Parthenium argentatum is a source of hypoallergenic latex.
  • Some members of the Asteraceae are economically important as weeds. Notably in the United States are the ragwort, Senecio jacobaea, groundsel Senecio vulgaris and Taraxacum (dandelion).
  • Asteraceae are most usually herbs, but some shrubs, trees and climbers do exist. They are generally easy to distinguish, mainly because of their characteristic inflorescence and many shared apomorphies.
  • the leaves and the stems very often contain secretory canals with resin or latex (particularly common among the Cichorioideae).
  • the leaves can be alternate, opposite, or whorled. They may be simple, but are often deeply lobed or otherwise incised, often conduplicate or revolute.
  • the margins can be entire or dentate.
  • Asteraceae The most evident characteristic of Asteraceae is perhaps their inflorescence: a specialised capitulum, technically called a calathid or calathidium, but generally referred to as flower head or, alternatively, simply capitulum.
  • the capitulum is a contracted raceme composed of numerous individual sessile flowers, called the florets, all sharing the same receptacle.
  • the capitulum of the Asteraceae has evolved many characteristics that make it look superficially like a big single flower. Flower-like inflorescences of this kind are quite widespread amongst plants and have been given the name pseudanthia.
  • the bracts can be free or fused, and arranged in one to many rows, overlapping like the tiles of a roof (imbricate) or not (this variation is important in identification of tribes and genera).
  • Each floret may itself be subtended by a bract, called a "palea" or "receptacular bract." These bracts as a group are often called "chaff." The presence or absence of these bracts, their distribution on the receptacle, and their size and shape are all important diagnostic characteristics for genera and tribes.
  • the florets have five petals fused at the base to form a corolla tube and they may be either actinomorphic or zygomorphic.
  • Disc florets are usually actinomorphic, with five petal lips on the rim of the corolla tube. The petal lips may be either very short, or long, in which case they form deeply lobed petals. The latter is the only kind of floret in the Carduoideae, while the first kind is more widespread.
  • Ray florets are always highly zygomorphic and are characterised by the presence of a ligule, a strap-shaped structure on the edge of the corolla tube consisting of fused petals.
  • the calyx of the florets may be absent, but when present it is always modified into a pappus of two or more teeth, scales or bristles and this is often involved in the dispersion of the seeds. As with the bracts, the nature of the pappus is an important diagnostic feature.
  • the filaments are fused to the corolla, while the anthers are generally connate (syngenesious anthers), thus forming a sort of tube around the style (theca). They commonly have basal and/or apical appendages. Pollen is released inside the tube and is collected around the growing style, expelled with a sort of pump mechanism (nüdelspritze) or a brush.
  • the pistil is made of two connate carpels.
  • the style has two lobes; stigmatic tissue may be located in the interior surface or form two lateral lines.
  • the ovary is inferior and has only one ovule, with basal placentation.
  • The fruit of the Asteraceae is achene-like and is called a cypsela.
  • Asteraceae generally store energy in the form of inulin. They produce iso/chlorogenic acid, sesquiterpene lactones, pentacyclic triterpene alcohols, various alkaloids, acetylenes (cyclic, aromatic, with vinyl end groups), and tannins. They have terpenoid essential oils which never contain iridoids.
  • Costunolide, which was first isolated from costus roots (Saussurea lappa Clarke) (Rao et al., 1960), is not only the accepted common intermediate of STLs but also has several biological activities of its own. Reported activities of costunolide include anti-inflammatory and anti-pyretic activities (Kassuya et al., 2009), anti-carcinogenic activity (Robinson et al., 2008), anti-diabetic activity (Eliza et al., 2009), in vitro inhibition of Nuclear Factor kappa B (NF-κB) (Nam, 2006), and anti-fungal (Wedge et al., 2000) and anti-viral activities (Chen et al., 1995).
  • Costunolide also exhibited cytotoxic effects on various human cancer cells, including carcinoma and leukemia cells (Park et al, 2001).
  • the supply of costunolide usually depends on the extraction from plant materials.
  • Because costunolide occurs together with other STLs in various plant species such as Saussurea lappa and Magnolia spp., its preparation (extraction and purification from plant sources) is costly and environmentally unfriendly.
II. Enzymes
  • GAO Germacrene A oxidase
  • GAO is a member of the cytochrome P450 monooxygenase family, whose members typically cleave molecular oxygen, inserting one oxygen atom into the substrate and reducing the other to water.
  • The products of a cytochrome P450 reaction on a substrate (R) are often singly oxygenated compounds such as alcohols (R-OH).
  • Of particular interest, GAO can catalyze three consecutive oxygen-atom additions to the substrate (R) to yield a highly oxidized carboxylic acid end product (R-COOH).
  • an unstable diol intermediate is formed.
  • it can be non-enzymatically converted to an aldehyde intermediate in conjunction with dehydration (removal of one water molecule).
  • The GAO-mediated reaction yields four molecules of H2O: three from the cytochrome P450 catalytic cycles and one from the dehydration step that yields the aldehyde intermediate.
  • GAS Germacrene A synthase
  • This enzyme has one substrate, 2-trans,6-trans-farnesyl diphosphate, and two products, (+)-(R)-germacrene A and diphosphate. It belongs to the family of lyases, specifically those carbon-oxygen lyases acting on phosphates.
  • The systematic name of this enzyme class is 2-trans,6-trans-farnesyl-diphosphate diphosphate-lyase [(+)-(R)-germacrene-A-forming].
  • Other names in common use include (+)-germacrene A synthase, (+)-(10R)-2-trans,6-trans-farnesyl-diphosphate diphosphate-lyase, and germacrene-A-forming.
  • CPR: Cytochrome P450 reductase, also known as NADPH:ferrihemoprotein oxidoreductase, NADPH:hemoprotein oxidoreductase, NADPH:P450 oxidoreductase, NADPH reductase, and POR.
  • Eukaryotic microsomal cytochrome P450 enzymes receive electrons from a FAD- and FMN-containing enzyme NADPH cytochrome P450 reductase.
  • Microsomal CPR is a membrane-bound protein that interacts with different P450s. The general scheme of electron flow in the CPR/P450 system is:
  • CPR → cytochrome P450
  • the reduction of cytochrome P450 is not the only physiological function of CPR.
  • The final step of heme oxidation by mammalian heme oxygenase requires CPR and O2.
  • CPR affects the ferrireductase activity, probably transferring electrons to the flavocytochrome ferric reductase.
  • LsCS is 490 residues in length. Structural analysis shows that LsCS has conserved eukaryotic P450 regions: a helix K region, an aromatic region, and a heme-binding region at the C-terminal end. In addition, its N-terminal region contains hydrophobic domains corresponding to the membrane anchor sequences of microsomal P450 species, suggesting that LsCS is localized in the endoplasmic reticulum. The full length sequence of LsCS is provided in SEQ ID NO: 1.
  • Fragments of LsCS may be generated by genetic engineering of translation stop sites within the coding region (discussed below).
  • Treatment of the LsCS molecule with proteolytic enzymes, known as proteases, can produce a variety of N-terminal, C-terminal and internal fragments.
  • Fragments may include contiguous residues of the LsCS sequence given in SEQ ID NO:1 of 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 75, 80, 85, 90, 95, 100, 125, 150, 200, 250, 300, 350, 400, 450 or more amino acids in length.
  • fragments may be purified according to known methods, such as precipitation (e.g., ammonium sulfate), HPLC, ion exchange chromatography, affinity chromatography (including immunoaffinity chromatography) or various size separations (sedimentation, gel electrophoresis, gel filtration).
  • Amino acid sequence variants of the polypeptide can be substitutional, insertional or deletion variants.
  • Deletion variants lack one or more residues of the native protein which are not essential for function or immunogenic activity, and are exemplified by the variants lacking a transmembrane sequence described above.
  • Another common type of deletion variant is one lacking secretory signal sequences or signal sequences directing a protein to bind to a particular part of a cell.
  • Insertional mutants typically involve the addition of material at a non-terminal point in the polypeptide. This may include the insertion of an immunoreactive epitope or simply a single residue. Terminal additions, called fusion proteins, are discussed below.
  • a specialized kind of insertional variant is the fusion protein.
  • This molecule generally has all or a substantial portion of the native molecule, linked at the N- or C-terminus, to all or a portion of a second polypeptide.
  • fusions typically employ leader sequences from other species to permit the recombinant expression of a protein in a heterologous host.
  • Another useful fusion includes the addition of an immunologically active domain, such as an antibody epitope, to facilitate purification of the fusion protein. Inclusion of a cleavage site at or near the fusion junction will facilitate removal of the extraneous polypeptide after purification.
  • Substitutional variants typically contain the exchange of one amino acid for another at one or more sites within the protein, and may be designed to modulate one or more properties of the polypeptide, such as stability against proteolytic cleavage, without the loss of other functions or properties. Substitutions of this kind preferably are conservative, that is, one amino acid is replaced with one of similar shape and charge.
  • Conservative substitutions are well known in the art and include, for example, the changes of: alanine to serine; arginine to lysine; asparagine to glutamine or histidine; aspartate to glutamate; cysteine to serine; glutamine to asparagine; glutamate to aspartate; glycine to proline; histidine to asparagine or glutamine; isoleucine to leucine or valine; leucine to valine or isoleucine; lysine to arginine; methionine to leucine or isoleucine; phenylalanine to tyrosine, leucine or methionine; serine to threonine; threonine to serine; tryptophan to tyrosine; tyrosine to tryptophan or phenylalanine; and valine to isoleucine or leucine.
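  • The substitution list above can be read as a lookup table. The following is a minimal sketch, assuming one-letter amino acid codes; the helper names `is_conservative` and `classify_variant` are hypothetical and the example sequences are invented for illustration, not taken from SEQ ID NO:1.

```python
# Conservative substitutions transcribed from the list in the text
# (one-letter codes; proline has no entry in that list).
CONSERVATIVE = {
    "A": {"S"}, "R": {"K"}, "N": {"Q", "H"}, "D": {"E"}, "C": {"S"},
    "Q": {"N"}, "E": {"D"}, "G": {"P"}, "H": {"N", "Q"}, "I": {"L", "V"},
    "L": {"V", "I"}, "K": {"R"}, "M": {"L", "I"}, "F": {"Y", "L", "M"},
    "S": {"T"}, "T": {"S"}, "W": {"Y"}, "Y": {"W", "F"}, "V": {"I", "L"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if original -> replacement appears in the listed changes."""
    return replacement.upper() in CONSERVATIVE.get(original.upper(), set())


def classify_variant(native: str, variant: str) -> list:
    """List (position, native residue, variant residue, conservative?) per difference.

    Positions are 1-based. Both sequences must have the same length, so this
    hypothetical helper covers substitutional variants only.
    """
    if len(native) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, a, b, is_conservative(a, b))
        for i, (a, b) in enumerate(zip(native.upper(), variant.upper()))
        if a != b
    ]


if __name__ == "__main__":
    # Invented four-residue example: K2R is listed, A4W is not.
    print(classify_variant("MKIA", "MRIW"))
```

  • The list is directional as written: phenylalanine to leucine is listed, but leucine to phenylalanine is not, so the lookup is deliberately not made symmetric.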
  • amino acids of a protein may be substituted for other amino acids in a protein structure without appreciable loss of interactive binding capacity with structures such as, for example, antigen-binding regions of antibodies or binding sites on substrate molecules. Since it is the interactive capacity and nature of a protein that defines that protein's biological functional activity, certain amino acid substitutions can be made in a protein sequence, and its underlying DNA coding sequence, and nevertheless obtain a protein with like properties. It is thus contemplated by the inventors that various changes may be made in the DNA sequences of genes without appreciable loss of their biological utility or activity, as discussed below. Table 1 shows the codons that encode particular amino acids.
  • the hydropathic index of amino acids may be considered.
  • the importance of the hydropathic amino acid index in conferring interactive biologic function on a protein is generally understood in the art (Kyte & Doolittle, 1982). It is accepted that the relative hydropathic character of the amino acid contributes to the secondary structure of the resultant protein, which in turn defines the interaction of the protein with other molecules, for example, enzymes, substrates, receptors, DNA, antibodies, antigens, and the like.
  • Each amino acid has been assigned a hydropathic index on the basis of its hydrophobicity and charge characteristics (Kyte & Doolittle, 1982); these are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); proline (-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5); asparagine (-3.5); lysine (-3.9); and arginine (-4.5).
  • amino acids may be substituted by other amino acids having a similar hydropathic index or score and still result in a protein with similar biological activity, i.e., still obtain a biological functionally equivalent protein.
  • Substitution of amino acids whose hydropathic indices are within ±2 is preferred, those which are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred.
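  • As an illustration of the thresholds above, the following sketch classifies a proposed substitution by the difference between the hydropathic indices listed in the text. The function name `substitution_tier` is hypothetical, and the difference is rounded to one decimal place to avoid floating-point noise at the tier boundaries.

```python
# Hydropathic indices as listed in the text (Kyte & Doolittle, 1982),
# keyed by one-letter amino acid code.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def substitution_tier(original: str, replacement: str) -> str:
    """Classify a substitution by the absolute hydropathic-index difference."""
    diff = round(abs(HYDROPATHY[original.upper()] - HYDROPATHY[replacement.upper()]), 1)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return "outside the preferred range"


if __name__ == "__main__":
    for pair in [("I", "V"), ("K", "R"), ("L", "M"), ("I", "R")]:
        print(pair, substitution_tier(*pair))
```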
  • As detailed in U.S. Patent 4,554,101, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate (+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (-0.4); proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4).
  • An amino acid can be substituted for another having a similar hydrophilicity value and still yield a biologically and immunologically equivalent protein.
  • Substitution of amino acids whose hydrophilicity values are within ±2 is preferred, those that are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred.
  • amino acid substitutions are generally based on the relative similarity of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, size, and the like.
  • substitutions that take various foregoing characteristics into consideration are well known to those of skill in the art and include: arginine and lysine; glutamate and aspartate; serine and threonine; glutamine and asparagine; and valine, leucine and isoleucine.
  • Mimetics are peptide-containing molecules that mimic elements of protein secondary structure. See, for example, Johnson et al. (1993).
  • the underlying rationale behind the use of peptide mimetics is that the peptide backbone of proteins exists chiefly to orient amino acid side chains in such a way as to facilitate molecular interactions, such as those of antibody and antigen.
  • a peptide mimetic is expected to permit molecular interactions similar to the natural molecule.
  • Protein purification techniques are well known to those of skill in the art. These techniques involve, at one level, the crude fractionation of the cellular milieu into polypeptide and non-polypeptide fractions. Having separated the polypeptide from other proteins, the polypeptide of interest may be further purified using chromatographic and electrophoretic techniques to achieve partial or complete purification (or purification to homogeneity). Analytical methods particularly suited to the preparation of a pure peptide are ion-exchange chromatography, exclusion chromatography, polyacrylamide gel electrophoresis, and isoelectric focusing. A particularly efficient method of purifying peptides is fast protein liquid chromatography or even HPLC.
  • Certain aspects of the present invention concern the purification of, and in particular embodiments, the substantial purification of an encoded protein or peptide.
  • the term "purified protein or peptide" as used herein, is intended to refer to a composition, isolatable from other components, wherein the protein or peptide is purified to any degree relative to its naturally-obtainable state.
  • a purified protein or peptide therefore also refers to a protein or peptide, free from the environment in which it may naturally occur.
  • “Purified” will refer to a protein or peptide composition that has been subjected to fractionation to remove various other components, and which composition substantially retains its expressed biological activity. Where the term “substantially purified” is used, this designation will refer to a composition in which the protein or peptide forms the major component of the composition, such as constituting about 50%, about 60%, about 70%, about 80%, about 90%, about 95% or more of the proteins in the composition.
  • Various methods for quantifying the degree of purification of the protein or peptide will be known to those of skill in the art in light of the present disclosure. These include, for example, determining the specific activity of an active fraction, or assessing the amount of polypeptides within a fraction by SDS/PAGE analysis.
  • a preferred method for assessing the purity of a fraction is to calculate the specific activity of the fraction, to compare it to the specific activity of the initial extract, and to thus calculate the degree of purity, herein assessed by a "-fold purification number.”
  • the actual units used to represent the amount of activity will, of course, be dependent upon the particular assay technique chosen to follow the purification and whether or not the expressed protein or peptide exhibits a detectable activity.
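  • The "-fold purification" calculation described above can be sketched as follows. The activity units, protein amounts and function names are illustrative assumptions; as noted, the actual units depend on the assay chosen.

```python
def specific_activity(total_activity_units: float, total_protein_mg: float) -> float:
    """Specific activity = activity units per mg of total protein."""
    if total_protein_mg <= 0:
        raise ValueError("total protein must be positive")
    return total_activity_units / total_protein_mg


def fold_purification(fraction_sa: float, initial_sa: float) -> float:
    """Ratio of a fraction's specific activity to that of the initial extract."""
    if initial_sa <= 0:
        raise ValueError("initial specific activity must be positive")
    return fraction_sa / initial_sa


if __name__ == "__main__":
    # Invented numbers: crude extract vs. a fraction after one column step.
    crude = specific_activity(total_activity_units=1000.0, total_protein_mg=500.0)
    fraction = specific_activity(total_activity_units=600.0, total_protein_mg=10.0)
    print(f"crude: {crude} U/mg, fraction: {fraction} U/mg")
    print(f"fold purification: {fold_purification(fraction, crude)}")
    print(f"activity recovered: {100.0 * 600.0 / 1000.0}%")
```

  • In this invented example the fraction is purified 30-fold while retaining 60% of the starting activity, which illustrates the trade-off between degree of purification and total recovery.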
  • Partial purification may be accomplished by using fewer purification steps in combination, or by utilizing different forms of the same general purification scheme. For example, it is appreciated that a cation-exchange column chromatography performed utilizing an HPLC apparatus will generally result in a greater "-fold" purification than the same technique utilizing a low pressure chromatography system. Methods exhibiting a lower degree of relative purification may have advantages in total recovery of protein product, or in maintaining the activity of an expressed protein.
  • HPLC High Performance Liquid Chromatography
  • Gel chromatography is a special type of partition chromatography that is based on molecular size.
  • the theory behind gel chromatography is that the column, which is prepared with tiny particles of an inert substance that contain small pores, separates larger molecules from smaller molecules as they pass through or around the pores, depending on their size.
  • the sole factor determining rate of flow is the size.
  • molecules are eluted from the column in decreasing size, so long as the shape is relatively constant.
  • Gel chromatography is unsurpassed for separating molecules of different size because separation is independent of all other factors such as pH, ionic strength, temperature, etc. There also is virtually no adsorption, less zone spreading, and the elution volume is related in a simple manner to molecular weight.
  • Affinity Chromatography is a chromatographic procedure that relies on the specific affinity between a substance to be isolated and a molecule that it can specifically bind to. This is a receptor-ligand type interaction.
  • the column material is synthesized by covalently coupling one of the binding partners to an insoluble matrix. The column material is then able to specifically adsorb the substance from the solution. Elution occurs by changing the conditions to those in which binding will not occur (alter pH, ionic strength, temperature, etc.).
  • Lectins are a class of substances that bind to a variety of polysaccharides and glycoproteins. Lectins are usually coupled to agarose by cyanogen bromide. Concanavalin A coupled to Sepharose was the first material of this sort to be used and has been widely used in the isolation of polysaccharides and glycoproteins. Other lectins that have been used include lentil lectin, wheat germ agglutinin (which has been useful in the purification of N-acetyl glucosaminyl residues), and Helix pomatia lectin.
  • Lectins themselves are purified using affinity chromatography with carbohydrate ligands. Lactose has been used to purify lectins from castor bean and peanuts; maltose has been useful in extracting lectins from lentils and jack bean; N-acetyl-D-galactosamine is used for purifying lectins from soybean; N-acetyl glucosaminyl binds to lectins from wheat germ; D-galactosamine has been used in obtaining lectins from clams; and L-fucose will bind to lectins from lotus.
  • the matrix should be a substance that itself does not adsorb molecules to any significant extent and that has a broad range of chemical, physical and thermal stability.
  • the ligand should be coupled in such a way as to not affect its binding properties.
  • the ligand should also provide relatively tight binding. And it should be possible to elute the substance without destroying the sample or the ligand.
  • One of the most common forms of affinity chromatography is immunoaffinity chromatography. The generation of antibodies that would be suitable for use in accord with the present invention is discussed below.
  • the present invention also describes LsCS-related peptides for use in various embodiments of the present invention.
  • the peptides of the invention can also be synthesized in solution or on a solid support in accordance with conventional techniques.
  • Various automatic synthesizers are commercially available and can be used in accordance with known protocols. See, for example, Stewart and Young (1984); Tam et al. (1983); Merrifield (1986); and Barany and Merrifield (1979), each incorporated herein by reference.
  • Short peptide sequences, or libraries of overlapping peptides usually from about 6 up to about 35 to 50 amino acids, which correspond to the selected regions, described herein, can be readily synthesized and then screened in screening assays designed to identify reactive peptides.
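  • A library of overlapping peptides of the kind described above can be enumerated with a sliding window. This is a minimal sketch; the function name, window length, step size and example sequence are illustrative assumptions and are not taken from SEQ ID NO:1.

```python
def overlapping_peptides(sequence: str, length: int, step: int) -> list:
    """Return peptides of `length` residues whose start positions advance by `step`.

    Consecutive peptides overlap by (length - step) residues when step < length.
    A trailing segment shorter than `length` is not emitted.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    sequence = sequence.upper()
    return [
        sequence[start:start + length]
        for start in range(0, len(sequence) - length + 1, step)
    ]


if __name__ == "__main__":
    # Invented 12-residue sequence, 6-mers offset by 3 residues.
    for peptide in overlapping_peptides("MKTLLVAGSAWQ", length=6, step=3):
        print(peptide)
```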
  • recombinant DNA technology may be employed wherein a nucleotide sequence which encodes a peptide of the invention is inserted into an expression vector, transformed or transfected into an appropriate host cell and cultivated under conditions suitable for expression.
  • the present invention also provides, in another embodiment, genes encoding LsCS.
  • the native gene for the LsCS enzyme has been provided as SEQ ID NO:2.
  • the present invention is not limited in scope to this gene, however, as one of ordinary skill in the art could readily identify related homologs in various other plant species as discussed above.
  • An LsCS gene may contain a variety of different bases and yet still produce a corresponding polypeptide that is functionally indistinguishable from, and in some cases structurally identical to, the polypeptide encoded by the gene disclosed herein.
  • any reference to a nucleic acid should be read as encompassing a host cell containing that nucleic acid and, in some cases, capable of expressing the product of that nucleic acid.
  • cells expressing nucleic acids of the present invention may prove useful in the context of screening for agents that induce, repress, inhibit, augment, interfere with, block, abrogate, stimulate or enhance the function of LsCS.
  • Nucleic acids according to the present invention may encode an entire LsCS gene, a domain of LsCS that expresses enzyme activity, or any other fragment of the LsCS sequences set forth herein.
  • the nucleic acid may be derived from genomic DNA, i.e., cloned directly from the genome of a particular organism. In particular embodiments, however, the nucleic acid would comprise complementary DNA (cDNA).
  • cDNA is intended to refer to DNA prepared using messenger RNA (mRNA) as template.
  • the advantage of using a cDNA, as opposed to genomic DNA or DNA polymerized from a genomic, non- or partially- processed RNA template, is that the cDNA primarily contains coding sequences of the corresponding protein. There may be times when the full or partial genomic sequence is preferred, such as where the non-coding regions are required for optimal expression or where non-coding regions such as introns are to be targeted in an antisense strategy.
  • a given LsCS from a given lettuce species may be represented by natural variants that have slightly different nucleic acid sequences but, nonetheless, encode the same protein (see Table 1, above).
  • a nucleic acid encoding a LsCS refers to a nucleic acid molecule that has been isolated free of total cellular nucleic acid.
  • the invention concerns a nucleic acid sequence essentially as set forth in SEQ ID NO:2.
  • the term “as set forth in SEQ ID NO:2” means that the nucleic acid sequence substantially corresponds to a portion of SEQ ID NO:2.
  • the term “functionally equivalent codon” is used herein to refer to codons that encode the same amino acid, such as the six codons for arginine or serine, and also refers to codons that encode biologically equivalent amino acids, as discussed in the following pages.
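  • The first sense of "functionally equivalent codon" (codons encoding the same amino acid) can be illustrated with the standard genetic code. The sketch below assumes the standard code written with DNA bases in T, C, A, G order; the helper names are hypothetical, and the second sense (codons for biologically equivalent amino acids) is not covered.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order
# (third base varying fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid: str) -> list:
    """All codons encoding the given one-letter amino acid, sorted."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid.upper())


def synonymous_codons(codon: str) -> list:
    """All codons encoding the same amino acid as `codon` (including itself)."""
    return codons_for(CODON_TABLE[codon.upper()])


if __name__ == "__main__":
    print("Arg:", codons_for("R"))
    print("Ser:", codons_for("S"))
    print("Synonyms of AGA:", synonymous_codons("AGA"))
```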
  • sequences that have at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% of nucleotides that are identical to the nucleotides of SEQ ID NO:2.
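  • Percent nucleotide identity of the kind recited above can be computed position by position once two sequences have been aligned. The sketch below assumes the inputs are already aligned to equal length and contain no gap characters; the alignment step itself is not shown, and the example sequences are invented rather than drawn from SEQ ID NO:2.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions with identical nucleotides in two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity(seq_a: str, seq_b: str, threshold_percent: float) -> bool:
    """True if the two sequences share at least `threshold_percent` identity."""
    return percent_identity(seq_a, seq_b) >= threshold_percent


if __name__ == "__main__":
    # Invented 10-nucleotide sequences differing at one position.
    a, b = "ATGGCTAAAC", "ATGGCAAAAC"
    print(percent_identity(a, b))
    print(meets_identity(a, b, 90.0), meets_identity(a, b, 95.0))
```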
  • Sequences that are essentially the same as those set forth in SEQ ID NO:2 also may be functionally defined as sequences that are capable of hybridizing to a nucleic acid segment containing the complement of SEQ ID NO:2 under standard conditions.
  • the DNA segments of the present invention include those encoding biologically functional equivalent LsCS proteins and peptides, as described above. Such sequences may arise as a consequence of codon redundancy and amino acid functional equivalency that are known to occur naturally within nucleic acid sequences and the proteins thus encoded.
  • functionally equivalent proteins or peptides may be created via the application of recombinant DNA technology, in which changes in the protein structure may be engineered, based on considerations of the properties of the amino acids being exchanged. Changes designed by man may be introduced through the application of site-directed mutagenesis techniques or may be introduced randomly and screened later for the desired function, as described below.
  • nucleic acid sequences that are “complementary” are those that are capable of base-pairing according to the standard Watson-Crick complementary rules.
  • complementary sequences means nucleic acid sequences that are substantially complementary, as may be assessed by the same nucleotide comparison set forth above, or as defined as being capable of hybridizing to the nucleic acid segment of SEQ ID NO:2 under relatively stringent conditions such as those described herein. Such sequences may encode the entire LsCS protein or functional or non-functional fragments thereof.
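  • Watson-Crick complementarity can be illustrated with a reverse-complement helper. This sketch tests only perfect complementarity of DNA sequences written 5' to 3'; "substantial" complementarity and hybridization behaviour are not modelled, and the function names are hypothetical.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (read 5' to 3')."""
    return seq.translate(_COMPLEMENT)[::-1]


def are_complementary(seq_a: str, seq_b: str) -> bool:
    """True if the two strands, both written 5' to 3', pair perfectly."""
    return reverse_complement(seq_a).upper() == seq_b.upper()


if __name__ == "__main__":
    probe = "ATGC"
    print(reverse_complement(probe))
    print(are_complementary(probe, "GCAT"))
```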
  • The hybridizing segments may be shorter oligonucleotides. Sequences 17 bases long should occur only once in the human genome and, therefore, suffice to specify a unique target sequence. Although shorter oligomers are easier to make and increase in vivo accessibility, numerous other factors are involved in determining the specificity of hybridization. Both the binding affinity and the sequence specificity of an oligonucleotide for its complementary target increase with increasing length. It is contemplated that exemplary oligonucleotides of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more base pairs will be used, although others are contemplated.
  • oligonucleotides encoding 250, 500, 1000, 1470, 1500, 2000, 2500, 3000 or longer are contemplated as well. Such oligonucleotides will find use, for example, as probes in Southern and Northern blots and as primers in amplification reactions.
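  • The statement that a 17-base sequence should occur only once in a genome can be checked with a simple estimate. The sketch below assumes a genome of about 3.2 billion base pairs, equal and independent base frequencies, and a search of both strands; all three are simplifying assumptions, since real genomes contain repeats and biased base composition.

```python
def expected_occurrences(length: int, genome_bp: float, both_strands: bool = True) -> float:
    """Expected number of chance matches to one specific sequence of `length` bases.

    Assumes each base is equally likely and independent (a simplification).
    """
    strands = 2 if both_strands else 1
    return strands * genome_bp / 4 ** length


def shortest_unique_length(genome_bp: float, both_strands: bool = True) -> int:
    """Smallest length whose expected number of chance matches falls below one."""
    n = 1
    while expected_occurrences(n, genome_bp, both_strands) >= 1:
        n += 1
    return n


if __name__ == "__main__":
    genome_bp = 3.2e9  # assumed approximate genome size, not a value from the text
    print(4 ** 17)
    print(expected_occurrences(16, genome_bp), expected_occurrences(17, genome_bp))
    print(shortest_unique_length(genome_bp))
```

  • Under these assumptions, 17 is the shortest length for which fewer than one chance match is expected, consistent with the figure given above.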
  • hybridization conditions will be well known to those of skill in the art. In certain applications, for example, substitution of amino acids by site-directed mutagenesis, it is appreciated that lower stringency conditions are required. Under these conditions, hybridization may occur even though the sequences of probe and target strand are not perfectly complementary, but are mismatched at one or more positions. Conditions may be rendered less stringent by increasing salt concentration and decreasing temperature. For example, a medium stringency condition could be provided by about 0.1 to 0.25 M NaCl at temperatures of about 37°C to about 55°C, while a low stringency condition could be provided by about 0.15 M to about 0.9 M salt, at temperatures ranging from about 20°C to about 55°C. Thus, hybridization conditions can be readily manipulated, and thus will generally be a method of choice depending on the desired results.
  • High stringency conditions are defined as those permitting nucleic acid hybridization using a long DNA probe (>100 base pairs) and incubation in 2X SSC (17.53g NaCl and 8.82g sodium citrate per litre, pH 7.0) and 0.1% SDS at 65°C. Furthermore, DNA must remain hybridized after washing with the following two solutions at 65°C: a) 2X SSC/0.1% SDS and b) 0.1X SSC/0.1% SDS.
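  • The 2X SSC recipe above can be converted to molar concentrations to compare the two washes. The sketch assumes the sodium citrate is the trisodium salt dihydrate (molecular weight about 294.10 g/mol), which the text does not specify; the function names are hypothetical.

```python
NACL_MW = 58.44      # g/mol, sodium chloride
CITRATE_MW = 294.10  # g/mol, ASSUMED trisodium citrate dihydrate

# Grams per litre for 2X SSC as given in the text.
NACL_G_PER_L_2X = 17.53
CITRATE_G_PER_L_2X = 8.82


def molarity(grams_per_litre: float, molecular_weight: float) -> float:
    """Convert a mass concentration (g/L) to molarity (mol/L)."""
    return grams_per_litre / molecular_weight


def ssc_molarities(strength: float) -> tuple:
    """Approximate (NaCl, citrate) molarity of an 'nX' SSC, scaled from the 2X recipe."""
    factor = strength / 2.0
    return (
        molarity(NACL_G_PER_L_2X, NACL_MW) * factor,
        molarity(CITRATE_G_PER_L_2X, CITRATE_MW) * factor,
    )


if __name__ == "__main__":
    for strength in (2.0, 0.1):
        nacl, citrate = ssc_molarities(strength)
        print(f"{strength}X SSC: {nacl:.3f} M NaCl, {citrate:.4f} M citrate")
```

  • On these assumptions 2X SSC is about 0.30 M NaCl and 0.030 M citrate, and the 0.1X wash is about 0.015 M NaCl; the second wash is therefore the more stringent of the two because of its lower salt concentration.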
  • One method of using probes and primers of the present invention is in the search for genes related to LsCS or, more particularly, homologs of LsCS from other species.
  • the target DNA will be a genomic or cDNA library, although screening may involve analysis of RNA molecules.
  • Site-specific mutagenesis is a technique useful in the preparation of individual peptides, or biologically functional equivalent proteins or peptides, through specific mutagenesis of the underlying DNA.
  • the technique further provides a ready ability to prepare and test sequence variants, incorporating one or more of the foregoing considerations, by introducing one or more nucleotide sequence changes into the DNA.
  • Site-specific mutagenesis allows the production of mutants through the use of specific oligonucleotide sequences which encode the DNA sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on both sides of the deletion junction being traversed.
  • a primer of about 17 to 25 nucleotides in length is preferred, with about 5 to 10 residues on both sides of the junction of the sequence being altered.
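  • The primer geometry described above (a mutated position flanked on both sides by unaltered sequence) can be sketched as follows. The template, the mutation and the function names are invented for illustration; the sketch builds the sense-strand primer for a substitution only and ignores melting temperature, secondary structure and other practical design criteria.

```python
def mutagenic_primer(template: str, position: int, new_bases: str, flank: int = 10) -> str:
    """Build a substitution primer: `flank` native bases, the mutation, `flank` native bases.

    `position` is the 0-based index of the first template base to be replaced;
    len(new_bases) template bases are replaced.
    """
    template = template.upper()
    new_bases = new_bases.upper()
    start = position - flank
    end = position + len(new_bases) + flank
    if start < 0 or end > len(template):
        raise ValueError("not enough template sequence on one side of the mutation")
    upstream = template[start:position]
    downstream = template[position + len(new_bases):end]
    return upstream + new_bases + downstream


def within_preferred_length(primer: str) -> bool:
    """True if the primer falls in the 17 to 25 nucleotide range given in the text."""
    return 17 <= len(primer) <= 25


if __name__ == "__main__":
    # Invented 30-nucleotide template; replace the G at index 15 with C.
    template = "ATGGCTAAACGTCTGGATCCGTTTGAAGCA"
    primer = mutagenic_primer(template, position=15, new_bases="C", flank=10)
    print(primer, len(primer), within_preferred_length(primer))
```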
  • the technique typically employs a bacteriophage vector that exists in both a single stranded and double stranded form.
  • Typical vectors useful in site-directed mutagenesis include vectors such as the M13 phage. These phage vectors are commercially available and their use is generally well known to those skilled in the art.
  • Double-stranded plasmids are also routinely employed in site directed mutagenesis, which eliminates the step of transferring the gene of interest from a phage to a plasmid.
  • site-directed mutagenesis is performed by first obtaining a single-stranded vector, or melting of two strands of a double-stranded vector which includes within its sequence a DNA sequence encoding the desired protein.
  • An oligonucleotide primer bearing the desired mutated sequence is synthetically prepared.
  • This primer is then annealed with the single-stranded DNA preparation, taking into account the degree of mismatch when selecting hybridization conditions, and subjected to DNA polymerizing enzymes such as E. coli polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing strand.
  • a heteroduplex is formed wherein one strand encodes the original non-mutated sequence and the second strand bears the desired mutation.
  • This heteroduplex vector is then used to transform appropriate cells, such as E. coli cells, and clones are selected that include recombinant vectors bearing the mutated sequence arrangement.
  • The preparation of sequence variants of the selected gene using site-directed mutagenesis is provided as a means of producing potentially useful species and is not meant to be limiting, as there are other ways in which sequence variants of genes may be obtained.
  • recombinant vectors encoding the desired gene may be treated with mutagenic agents, such as hydroxylamine, to obtain sequence variants.
  • expression vectors are employed to express the LsCS polypeptide product, which can then be purified for various uses.
  • the expression vectors are used in gene therapy. Expression requires that appropriate signals be provided in the vectors, and which include various regulatory elements, such as enhancers/promoters from both viral and mammalian sources that drive expression of the genes of interest in host cells. Elements designed to optimize messenger RNA stability and translatability in host cells also are defined. The conditions for the use of a number of dominant drug selection markers for establishing permanent, stable cell clones expressing the products are also provided, as is an element that links expression of the drug selection markers to expression of the polypeptide.
  • expression construct is meant to include any type of genetic construct containing a nucleic acid coding for a gene product in which part or all of the nucleic acid encoding sequence is capable of being transcribed.
  • the transcript may be translated into a protein, but it need not be.
  • expression includes both transcription of a gene and translation of mRNA into a gene product. In other embodiments, expression only includes transcription of the nucleic acid encoding a gene of interest.
  • vector is used to refer to a carrier nucleic acid molecule into which a nucleic acid sequence can be inserted for introduction into a cell where it can be replicated.
  • a nucleic acid sequence can be "exogenous,” which means that it is foreign to the cell into which the vector is being introduced or that the sequence is homologous to a sequence in the cell but in a position within the host cell nucleic acid in which the sequence is ordinarily not found.
  • Vectors include plasmids, cosmids, viruses (bacteriophage, animal viruses, and plant viruses), and artificial chromosomes (e.g., YACs).
  • expression vector refers to a vector containing a nucleic acid sequence coding for at least part of a gene product capable of being transcribed. In some cases, RNA molecules are then translated into a protein, polypeptide, or peptide. In other cases, these sequences are not translated, for example, in the production of antisense molecules or ribozymes.
  • Expression vectors can contain a variety of "control sequences,” which refer to nucleic acid sequences necessary for the transcription and possibly translation of an operably linked coding sequence in a particular host organism. In addition to control sequences that govern transcription and translation, vectors and expression vectors may contain nucleic acid sequences that serve other functions as well and are described infra.
  • a “promoter” is a control sequence that is a region of a nucleic acid sequence at which initiation and rate of transcription are controlled. It may contain genetic elements at which regulatory proteins and molecules may bind such as RNA polymerase and other transcription factors.
  • the phrases "operatively positioned,” “operatively linked,” “under control,” and “under transcriptional control” mean that a promoter is in a correct functional location and/or orientation in relation to a nucleic acid sequence to control transcriptional initiation and/or expression of that sequence.
  • a promoter may or may not be used in conjunction with an "enhancer,” which refers to a cis-acting regulatory sequence involved in the transcriptional activation of a nucleic acid sequence.
  • a promoter may be one naturally-associated with a gene or sequence, as may be obtained by isolating the 5' non-coding sequences located upstream of the coding segment and/or exon. Such a promoter can be referred to as "endogenous.”
  • an enhancer may be one naturally associated with a nucleic acid sequence, located either downstream or upstream of that sequence.
  • certain advantages will be gained by positioning the coding nucleic acid segment under the control of a recombinant or heterologous promoter, which refers to a promoter that is not normally associated with a nucleic acid sequence in its natural environment.
  • a recombinant or heterologous enhancer refers also to an enhancer not normally associated with a nucleic acid sequence in its natural environment.
  • Such promoters or enhancers may include promoters or enhancers of other genes, and promoters or enhancers isolated from any other prokaryotic, viral, or eukaryotic cell, and promoters or enhancers not "naturally-occurring," i.e., containing different elements of different transcriptional regulatory regions, and/or mutations that alter expression.
  • sequences may be produced using recombinant cloning and/or nucleic acid amplification technology, including PCR™, in connection with the compositions disclosed herein (see U.S. Patent 4,683,202, U.S. Patent 5,928,906, each incorporated herein by reference).
  • control sequences that direct transcription and/or expression of sequences within non-nuclear organelles such as mitochondria, chloroplasts, and the like, can be employed as well.
  • it is important to employ a promoter and/or enhancer that effectively directs the expression of the DNA segment in the cell type, organelle, and organism chosen for expression.
  • One example is the native LsCS promoter.
  • the promoters employed may be constitutive, tissue-specific, inducible, and/or useful under the appropriate conditions to direct high level expression of the introduced DNA segment, such as is advantageous in the large-scale production of recombinant proteins and/or peptides.
  • the promoter may be heterologous or endogenous.
  • Other useful promoters include bacterial promoters, yeast promoters and other plant promoters.
  • Promoters useful for expressing genes in plants include (i) promoters for constitutive expression, such as the Cauliflower Mosaic Virus (CaMV) 35S promoter and nopaline synthase (nos) promoter (Gruber and Crosby, 1993), (ii) Tobacco Mosaic Virus (TMV) and Tobacco Rattle Virus (TRV)-derived promoters (Grill et al, 2002), and (iii) Sail promoter (Elleuch et al, 2001). Promoters for alkaloid biosynthetic genes, including PrPsBBE, PrPs4'OMT2, PrPs7OMT and PrPsSAT, may also be used (Apuya et al, 2008).
  • CaMV Cauliflower Mosaic Virus
  • nos nopaline synthase
  • TMV Tobacco Mosaic Virus
  • TRV Tobacco Rattle Virus
  • IRES elements are used to create multigene, or polycistronic, messages.
  • IRES elements are able to bypass the ribosome scanning model of 5'-methylated cap-dependent translation and begin translation at internal sites (Pelletier and Sonenberg, 1988).
  • IRES elements from two members of the picornavirus family (polio and encephalomyocarditis) have been described (Pelletier and Sonenberg, 1988), as well as an IRES from a mammalian message (Macejak and Sarnow, 1991).
  • IRES elements can be linked to heterologous open reading frames. Multiple open reading frames can be transcribed together, each separated by an IRES, creating polycistronic messages.
  • each open reading frame is accessible to ribosomes for efficient translation.
  • Multiple genes can be efficiently expressed using a single promoter/enhancer to transcribe a single message (see U.S. Patents 5,925,565 and 5,935,819, herein incorporated by reference).
  • Vectors can include a multiple cloning site (MCS), which is a nucleic acid region that contains multiple restriction enzyme sites, any of which can be used in conjunction with standard recombinant technology to digest the vector.
  • MCS multiple cloning site
  • Restriction enzyme digestion refers to catalytic cleavage of a nucleic acid molecule with an enzyme that functions only at specific locations in a nucleic acid molecule. Many of these restriction enzymes are commercially available. Use of such enzymes is widely understood by those of skill in the art.
  • a vector is linearized or fragmented using a restriction enzyme that cuts within the MCS to enable exogenous sequences to be ligated to the vector.
  • "Ligation” refers to the process of forming phosphodiester bonds between two nucleic acid fragments, which may or may not be contiguous with each other. Techniques involving restriction enzymes and ligation reactions are well known to those of skill in the art of recombinant technology.
  • RNA molecules will undergo RNA splicing to remove introns from the primary transcripts.
  • Vectors containing genomic eukaryotic sequences may require donor and/or acceptor splicing sites to ensure proper processing of the transcript for protein expression (see Chandler et al., 1997, herein incorporated by reference.)
  • the vectors or constructs of the present invention will generally comprise at least one termination signal.
  • a “termination signal” or “terminator” is comprised of the DNA sequences involved in specific termination of an RNA transcript by an RNA polymerase. Thus, in certain embodiments a termination signal that ends the production of an RNA transcript is contemplated. A terminator may be necessary in vivo to achieve desirable message levels.
  • the terminator region may also comprise specific DNA sequences that permit site-specific cleavage of the new transcript so as to expose a polyadenylation site.
  • RNA molecules modified with this polyA tail appear to be more stable and are translated more efficiently.
  • terminator comprises a signal for the cleavage of the RNA, and it is more preferred that the terminator signal promotes polyadenylation of the message.
  • the terminator and/or polyadenylation site elements can serve to enhance message levels and/or to minimize read through from the cassette into other sequences.
  • Terminators contemplated for use in the invention include any known terminator of transcription described herein or known to one of ordinary skill in the art, including but not limited to, for example, the termination sequences of genes, such as for example the bovine growth hormone terminator or viral termination sequences, such as for example the SV40 terminator.
  • the termination signal may be a lack of transcribable or translatable sequence, such as due to a sequence truncation.
  • In expression, particularly eukaryotic expression, one will typically include a polyadenylation signal to effect proper polyadenylation of the transcript.
  • the nature of the polyadenylation signal is not believed to be crucial to the successful practice of the invention, and/or any such sequence may be employed.
  • Preferred embodiments include the SV40 polyadenylation signal and/or the bovine growth hormone polyadenylation signal, convenient and/or known to function well in various target cells. Polyadenylation may increase the stability of the transcript or may facilitate cytoplasmic transport.
  • a vector in a host cell may contain one or more origin of replication sites (often termed "ori"), each of which is a specific nucleic acid sequence at which replication is initiated.
  • ori origins of replication sites
  • ARS autonomously replicating sequence
  • cells containing a nucleic acid construct of the present invention may be identified in vitro or in vivo by including a marker in the expression vector.
  • markers would confer an identifiable change to the cell permitting easy identification of cells containing the expression vector.
  • a selectable marker is one that confers a property that allows for selection.
  • a positive selectable marker is one in which the presence of the marker allows for its selection, while a negative selectable marker is one in which its presence prevents its selection.
  • An example of a positive selectable marker is a drug resistance marker.
  • a drug selection marker aids in the cloning and identification of transformants
  • genes that confer resistance to neomycin, puromycin, hygromycin, DHFR, GPT, zeocin and histidinol are useful selectable markers.
  • in addition to markers conferring a phenotype that allows for the discrimination of transformants based on the implementation of conditions, other types of markers, including screenable markers such as GFP, whose basis is colorimetric analysis, are also contemplated.
  • screenable enzymes such as herpes simplex virus thymidine kinase (tk) or chloramphenicol acetyltransferase (CAT) may be utilized.
  • a viable alternative to conventional agriculture is the production of valuable bioproducts in microbial systems.
  • the field of synthetic biology holds promise for the assembly of entire pathways in yeast and/or bacteria, using natural or novel enzymes and biosynthetic routes (Martin et al, 2009; Carothers et al, 2009; Picataggio 2009; Keasling 2008).
  • the terms “cell,” “cell line,” and “cell culture” may be used interchangeably. All of these terms also include their progeny, which is any and all subsequent generations. It is understood that all progeny may not be identical due to deliberate or inadvertent mutations.
  • "host cell” refers to a prokaryotic or eukaryotic cell, and it includes any transformable organism that is capable of replicating a vector and/or expressing a heterologous gene encoded by a vector. A host cell can be, and has been, used as a recipient for vectors.
  • a host cell may be "transfected” or “transformed,” which refers to a process by which exogenous nucleic acid is transferred or introduced into the host cell.
  • a transformed cell includes the primary subject cell and its progeny.
  • Host cells may be derived from prokaryotes or eukaryotes, depending upon whether the desired result is replication of the vector or expression of part or all of the vector-encoded nucleic acid sequences. Numerous cell lines and cultures are available for use as a host cell, and they can be obtained through the American Type Culture Collection (ATCC), which is an organization that serves as an archive for living cultures and genetic materials (www.atcc.org). An appropriate host can be determined by one of skill in the art based on the vector backbone and the desired result.
  • ATCC American Type Culture Collection
  • a plasmid or cosmid can be introduced into a prokaryote host cell for replication of many vectors.
  • Bacterial cells used as host cells for vector replication and/or expression include DH5α, JM109, and KC8, as well as a number of commercially available bacterial hosts such as SURE® Competent Cells and SOLOPACK™ Gold Cells (STRATAGENE®, La Jolla).
  • bacterial cells such as E. coli LE392 could be used as host cells for phage viruses.
  • Some vectors may employ control sequences that allow them to be replicated and/or expressed in both prokaryotic and eukaryotic cells.
  • One of skill in the art would further understand the conditions under which to incubate all of the above described host cells to maintain them and to permit replication of a vector.
  • Also understood and known are techniques and conditions that would allow large-scale production of vectors, as well as production of the nucleic acids encoded by vectors and their cognate polypeptides, proteins, or peptides.
IV. Transformation Methods
  • Agrobacterium transformation processes are the choice when transforming plants.
  • Agrobacterium tumefaciens is the causal agent of crown gall disease in over 140 species of dicot. It is a rod-shaped, Gram-negative soil bacterium (Smith et al, 1998). Symptoms are caused by the insertion of a small segment of DNA (known as the T-DNA, for 'transfer DNA') into the plant cell, which is incorporated at a semi-random location into the plant genome.
  • T-DNA small segment of DNA
  • Agrobacterium tumefaciens (or A. tumefaciens) is an alphaproteobacterium of the family Rhizobiaceae, which includes the nitrogen fixing legume symbionts. Unlike the nitrogen fixing symbionts, tumor producing Agrobacterium are pathogenic and do not benefit the plant. The wide variety of plants affected by Agrobacterium makes it of great concern to the agriculture industry. Economically, A. tumefaciens is a serious pathogen of walnuts, grape vines, stone fruits, nut trees, sugar beets, horse radish and rhubarb. In order to transfer the T-DNA into the plant cell, A. tumefaciens uses a Type IV secretion mechanism, involving the production of a T-pilus.
  • the VirA/VirG two component sensor system is able to detect phenolic signals released by wounded plant cells, in particular acetosyringone. This leads to a signal transduction event activating the expression of 11 genes within the VirB operon which are responsible for the formation of the T-pilus.
  • the VirB pro-pilin is formed. This is a polypeptide of 121 amino acids which requires processing by the removal of 47 residues to form a T-pilus subunit. The subunit is circularized by the formation of a peptide bond between the two ends of the polypeptide.
  • Products of the other VirB genes are used to transfer the subunits across the plasma membrane.
  • Yeast two-hybrid studies provide evidence that VirB6, VirB7, VirB8, VirB9 and VirB10 may all encode components of the transporter.
  • An ATPase for the active transport of the subunits would also be required.
  • the T-DNA must be cut out of the circular plasmid.
  • a VirD1/D2 complex nicks the DNA at the left and right border sequences.
  • the VirD2 protein is covalently attached to the 5' end.
  • VirD2 contains a motif that leads to the nucleoprotein complex being targeted to the type IV secretion system (T4SS).
  • T4SS type IV secretion system
  • the T-DNA complex becomes coated with VirE2 proteins, which are exported through the T4SS independently from the T-DNA complex.
  • Nuclear localization signals, or NLS, located on the VirE2 and VirD2 are recognised by the importin alpha protein, which then associates with importin beta and the nuclear pore complex to transfer the T-DNA into the nucleus.
  • VIP1 also appears to be an important protein in the process, possibly acting as an adapter to bring the VirE2 to the importin. Once inside the nucleus, VIP2 may target the T-DNA to areas of chromatin that are being actively transcribed, so that the T-DNA can integrate into the host genome.
  • the DNA transmission capabilities of Agrobacterium have been extensively exploited in biotechnology as a means of inserting foreign genes into plants.
  • Van Montagu and Schell, (University of Ghent and Plant Genetic Systems, Belgium) discovered the gene transfer mechanism between Agrobacterium and plants, which resulted in the development of methods to alter Agrobacterium into an efficient delivery system for genetic engineering in plants.
  • the plasmid T-DNA that is transferred to the plant is an ideal vehicle for genetic engineering. This is done by cloning a desired gene sequence into the T-DNA that will be inserted into the host DNA. This process has been performed using the firefly luciferase gene to produce glowing plants. This luminescence has been a useful device in the study of plant chloroplast function and as a reporter gene. It is also possible to transform Arabidopsis by dipping the flowers into a broth of Agrobacterium; the seed produced will be transgenic. Under laboratory conditions the T-DNA has also been transferred to human cells, demonstrating the diversity of insertion application.
  • yeast cells are easily transformed using a wide variety of known methods. These methods generally rely on the use of various forms of polyethylene glycol to induce transformation.
  • An exemplary transformation protocol is as follows. Briefly, yeast cells are cultured overnight, and 1 mL of saturated yeast culture is freshly inoculated into 50 mL of medium. After 5 hours of cultivation at 30°C with shaking at 200 rpm, the yeast cells are washed with sterile water and mixed with 35% polyethylene glycol, 100 mM lithium acetate, and 100 µg of carrier DNA. The mixture is incubated at 42°C for 40 min and transferred onto ice for 2 min. A portion of the mixture is plated on solid medium lacking a specific amino acid for selection. Incubation at 30°C for 3 days usually provides a sufficient number of transgenic yeasts.
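As a rough illustration of the mixing arithmetic behind the protocol above, the following Python sketch applies C1·V1 = C2·V2 to the stated final concentrations (35% polyethylene glycol, 100 mM lithium acetate). The stock concentrations and the total mix volume are assumptions chosen only for illustration; they are not given in the protocol.

```python
def stock_volume(final_conc, stock_conc, total_volume):
    """Volume of stock needed to reach final_conc in total_volume (C1*V1 = C2*V2)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * total_volume / stock_conc


# Hypothetical values, not taken from the protocol.
TOTAL_UL = 360.0        # assumed final volume of the transformation mix, in µL
PEG_STOCK_PCT = 50.0    # assumed polyethylene glycol stock, in %
LIAC_STOCK_MM = 1000.0  # assumed lithium acetate stock, in mM

# Final concentrations as stated in the protocol.
peg_ul = stock_volume(35.0, PEG_STOCK_PCT, TOTAL_UL)
liac_ul = stock_volume(100.0, LIAC_STOCK_MM, TOTAL_UL)

# Volume left for cells, carrier DNA, plasmid DNA and water.
rest_ul = TOTAL_UL - peg_ul - liac_ul

print(f"PEG stock: {peg_ul:.1f} µL, LiAc stock: {liac_ul:.1f} µL, remainder: {rest_ul:.1f} µL")
```

With other stock solutions or a different mix volume, only the three constants need to change.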
  • yeast platform strains allow for simple high-throughput screening of enzyme activities as opposed to relying on time consuming and costly enzyme assays.
  • Simple inexpensive sugars can be used to generate, endogenously, expensive substrates or intermediates that are not commercially available.
  • inherently unstable membrane enzymes such as P450s can be reliably tested in vivo; the microsome preparation required for in vitro P450 assays is not necessary.
  • these strains often yield sufficient amount of the desired natural products (mg level) that can be easily identified by standard analytical techniques such as LC-MS, GC-MS or NMR. It has been demonstrated that S. cerevisiae has efficient non-specific efflux pumps (PDR5, SNQ2 and other related ABC transporters), which can facilitate secretion of some terpenoids and potentially other phytochemicals.
  • in heterologous protein expression, low expression or the formation of denatured proteins may be attributable to the differences in synonymous codon usage between the heterologous host and the natural host (Kimchi-Sarfaty et al, 2007; Komar et al, 1999).
  • Optimizing codon usage and eliminating mRNA secondary structure can significantly improve the levels of target protein expression in a heterologous host (Komar et al, 1999; Yoshikuni et al, 2008).
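Synonymous codon usage of a coding sequence can be inspected before any optimization is attempted. The Python sketch below counts codons in frame and reports how often each synonymous codon for one amino acid is used. The function names are illustrative, and the short test sequence is only the ATG-initiated stretch that appears in the LsGAO forward primer later in this document, read here without its spacing as an assumption.

```python
from collections import Counter


def codon_counts(cds):
    """Count codons in a coding sequence read in frame from its first base."""
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return Counter(seq[i:i + 3] for i in range(0, len(seq), 3))


def synonymous_fraction(counts, codons):
    """Fraction of each codon among a set of synonymous codons."""
    total = sum(counts[c] for c in codons)
    return {c: (counts[c] / total if total else 0.0) for c in codons}


# The six leucine codons of the standard genetic code.
LEU_CODONS = ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]

# Short illustrative sequence (seven codons), not a full open reading frame.
toy_cds = "ATGGAGCTTTCAATAACCACC"

counts = codon_counts(toy_cds)
leu_usage = synonymous_fraction(counts, LEU_CODONS)
print(counts.most_common(3))
print(leu_usage)
```

Comparing such fractions with the usage of highly expressed host genes is one common way to flag rare codons; the host reference table itself is not part of this sketch.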
  • a heterologous target protein may be poorly expressed.
  • Costunolide produced, e.g., in yeast system can be purified using preparative HPLC.
  • An FPP high-producing yeast strain carrying pESC-Leu2d::GAS/GAO/CPR/LsCS can be cultured under neutral pH conditions (HEPES/NaOH-buffered, pH 7.5) for 4 days. The pH of the culture supernatant is then adjusted to 6.0, and the supernatant is extracted with ethyl acetate three times. The ethyl acetate layer can be evaporated under a stream of nitrogen gas, and the remaining residue can be dissolved in MeOH and subjected to preparative HPLC.
  • the targeted compound, costunolide, can be separated on a C18 column. The costunolide-containing fractions can be pooled and the solvent (acetonitrile and water) evaporated to obtain pure costunolide.
  • 13-amino costunolide derivatives can be synthesized from costunolide as described in Srivastava et al. (2006). Some of these compounds were shown to have better potential as anticancer agents than costunolide. Costunolide can be chemically modified to several other compounds as described in Kalsi et al. (1985).
  • costunolide can be biotransformed to further modified compounds by microbes such as Mucor polymorphosporus and Aspergillus candidus as described in Ma et al. (2007). Plants which produce STL can also be used to biotransform costunolide to more complicated STLs such as leucodin as described in de Kraker et al. (2002).
  • costunolide can be subjected to further stereo- and regio-specific modifications to make various STLs.
  • modifications including but not restricted to reduction, hydroxylation, epoxidation, methylation, acetylation, and hydrogenation yield a wide range of STLs such as the anti-inflammatory parthenolide, the anti-prostate cancer thapsigargin (Drew et al., 2009), leucodin, the sedative lactucin, the cytotoxic/anti-tumour deoxy-lactucin, the hypoglycaemic lactucopicrine (Picman, 1986; de Kraker et al., 2002), the anti-cancer 13-amino costunolide derivatives (Srivastava et al., 2006), and the anti-tumour nobilin (Picman, 1986).
  • the aforementioned modifications can be achieved via bio-transformation using a host cell system or crude enzyme preparation from cells (Ma et al., 2007; de Kraker et al., 2002), and/or chemical semi-synthesis (Srivastava et al, 2006; Wedge et al., 2000; Kalsi et al., 1985; Rodrigues et al, 1978).
  • kits are also within the scope of the invention.
  • Such kits can comprise a carrier, package or container that is compartmentalized to receive one or more containers such as vials, tubes, and the like, each of the container(s) comprising one of the separate elements of the present invention, and those elements to be used in methods of the present invention, in particular, polypeptides, nucleic acids, recombinant vectors, and cells.
  • the kit of the invention will typically comprise the container described above and one or more other containers comprising materials desirable from a commercial end user standpoint, including buffers, diluents, filters, plates, media, and package inserts with instructions for use.
  • a label can be provided on the container to indicate that the composition is used for a specific application.
  • Directions and/or other information can also be included on an insert which is included with the kit.
VIII. Examples
  • Plasmid construct for gene expression. Sequence information at the start and stop codons of LsGAO was obtained from the Compositae Genome Project Database at the University of California Davis (compgenomics.ucdavis.edu). LsGAO was amplified from the cDNA template prepared from lettuce leaf by a forward primer, 5'-CGAGGrCT (Z4 ATGGAGCTTTCAATAACC ACC-3' (SEQ ID NO:29), and a reverse primer, 5'-GCCCTCTA GA GC AAAACTCGGTACGAGTAAC AAC-3' (SEQ ID NO:30). The amplified product was digested by XbaI and ligated into the SpeI site of pESC-Ura plasmid.
  • plasmid stability was enhanced by coding three genes in a single plasmid.
  • the expression cassette of GAS was amplified from GAS::pESC-Leu plasmid by a forward primer, 5'-GTCAATC4 CTA GTGAGTACGGATTAGAAG CCGCCGA-3' (SEQ ID NO:31), and a reverse primer, 5'-GTC AATGCCGGCCTTCGAGCGTCCC AAAACCT-3' (SEQ ID NO:32).
  • the amplified product was digested by DraIII and NaeI, and the digested fragment was ligated to the corresponding sites of the empty pESC-Leu2d. This DNA manipulation freed two multiple cloning sites for further cloning.
  • Two expression cassettes for LsGAO and CPR were digested from the LsGAO/CPR::pESC-Ura by PacI and ScaI, and the digested fragment was ligated to the corresponding site of the newly generated GAS::pESC-Leu2d, resulting in the triple expression plasmid named GAS/LsGAO/CPR::pESC-Leu2d.
  • Bioinformatic analyses identified start and stop codons of chicory, sunflower, and Barnadesia spinosa, and their ORF sequence data were used to design appropriate primers. When necessary, clones were ordered from the Arizona Genomics Institute at the University of Arizona to obtain additional sequence information.
  • a 1.4-kb fragment of SlGAO was first obtained from costus cDNA using primers designed at the highly conserved domains of other GAOs.
  • the primer pair used was a forward primer, 5'-ACCGTGGCTCAAAGCTCTCAGTC-3' (SEQ ID NO:33), and a reverse primer, 5'-GACTCCCCATAATCGGTCACATGC-3' (SEQ ID NO:34). Both 5'- and 3'-RACE were conducted to determine start and stop codons of SlGAO. All the isolated GAOs were first cloned to pESC-Leu vector to make translational fusions to FLAG epitope.
  • HaGAO was amplified using a forward primer, 5'- GCA CTA G ATGGAAGTCTCCCTCACCACTTC-3' (SEQ ID NO:35), and a reverse primer, 5'-CG AT4 CTA G GC A A A ACTTGGTAC A AGC ATC A A-3' (SEQ ID NO:36).
  • SlGAO was amplified using a forward primer, 5'-
  • CiGAO was amplified using a forward primer, 5'- ACGTCTA GL4 ATGGAGCTCACTCACTACTTCCA-3 ' (SEQ ID NO:39), and a reverse primer, 5'-ACGrCr ⁇ 04GCAAAACTTGGTACGAGTATCAATTCGGT-3' (SEQ ID NO:40).
  • BsGAO was amplified using a forward primer, 5'- ATArcr ⁇ 3 ⁇ 4(ACCATGGAACTCACTCTCACCACTTCCC-3' (SEQ ID NO:41), and a reverse primer, 5 ' - AT ⁇ 4 CTA G CGAGC AG AGTTGTTAGC AGTCTTGTAAGCTG-3 ' (SEQ ID NO:42).
  • the amplified fragments were digested by XbaI or SpeI and cloned into the SpeI site of pESC-Leu vector.
  • the entire ORFs in fusion with the FLAG epitope were digested by NotI and PacI, and the LsGAO in the triple expression vector was removed and replaced with the NotI- and PacI-digested GAO ORFs.
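Cloning an XbaI-digested fragment into an SpeI site works because both enzymes leave the same 5' overhang. The Python sketch below makes that check explicit. The recognition sequences and cut positions for XbaI (T^CTAGA) and SpeI (A^CTAGT) are standard; the helper names are illustrative, and the overhang logic assumes palindromic sites cut symmetrically to leave 5' overhangs.

```python
# Recognition sequence and cut offset on the top strand (5'->3').
ENZYMES = {
    "XbaI": ("TCTAGA", 1),  # T^CTAGA
    "SpeI": ("ACTAGT", 1),  # A^CTAGT
}


def overhang(name):
    """5' overhang left by a symmetric cut in a palindromic site."""
    site, cut = ENZYMES[name]
    return site[cut:len(site) - cut]


def compatible(a, b):
    """True when the two enzymes leave identical cohesive ends."""
    return overhang(a) == overhang(b)


def ligated_junction(left, right):
    """Six-base junction formed when an end cut with `left` joins an end cut with `right`."""
    site_l, cut_l = ENZYMES[left]
    site_r, cut_r = ENZYMES[right]
    return site_l[:cut_l] + site_r[cut_r:]


junction = ligated_junction("XbaI", "SpeI")
recuttable = junction in {site for site, _ in ENZYMES.values()}

print(overhang("XbaI"), overhang("SpeI"), compatible("XbaI", "SpeI"))
print(junction, recuttable)
```

The hybrid junction is recognized by neither enzyme, which is why an XbaI/SpeI ligation cannot be reopened with either one.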
  • Yeast culture and metabolite sample preparation. For standard yeast culture, the transgenic yeast strain of interest was inoculated in 3 mL Synthetic Complete (SC) medium omitting appropriate amino acids with 2% Glc. The inoculum was cultured overnight at 30 °C at 200 rpm. The starter culture was diluted 100-fold in the SC medium omitting appropriate amino acids with 1.8% Gal and 0.2% Glc. One hundred mL medium was cultured for metabolite profiling, whereas 1.0-2.5 L medium was cultured for chemical purification and identification. Rearranged sesquiterpenoids (costic acids) were extracted and analyzed by GC-MS according to the published methods (Ro et al, 2008).
  • culture medium was adjusted to contain 100 mM HEPES/NaOH (pH 7.5). After cultivating yeast for 48-72 hr at 30 °C, the culture medium was adjusted to pH 6.0 with 2 M HCl and extracted twice with ethyl acetate. The ethyl acetate fractions were evaporated under nitrogen gas to concentrate samples to 1 mL, and 1 µL was analyzed by GC-MS. For LC-MS analyses, the solvent was replaced with methanol.
  • Microsomes were prepared according to the published protocol (Pompon et al, 1996) except that a micro-beadbeater (Biospec Products, Bartlesville, USA) was used for 90 s with glass beads (500 µm diameter).
  • microsomal proteins were separated on 10% SDS-PAGE and transferred onto Polyvinylidene Fluoride (PVDF) membrane.
  • the membrane was blocked with 5% non-fat milk in TBST buffer (25 mM Tris-HCl, pH 7.5, 150 mM NaCl, and 0.05% Tween 20) for at least 1 h, incubated with anti-FLAG M2 primary antibodies (Sigma-Aldrich) at 1:5000 dilution, washed three times with TBST, and incubated with goat anti-mouse secondary antibody (GE Healthcare) at 1:5000 dilution. After washing the membrane three times with TBST, the bound secondary antibodies were detected with ECL Plus detection reagents (GE Healthcare). For in vitro enzyme assay, the protease-deficient S.
  • yeast culture was shifted to fresh medium with 2% Gal, and the yeasts were further cultivated for 24 h.
  • the in vitro enzyme reactions were carried out in 3 mL of 50 mM HEPES/NaOH (pH 7.5) buffer containing 3 mg of microsomal protein, 200 µM germacrene A, 500 µM NADPH, and an NADPH regeneration system (10 mM Glc-6-phosphate and three units of Glc-6-phosphate dehydrogenase).
  • reaction occurred at 23 °C for 4 hr with gentle agitation.
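To make the scale of the in vitro reaction concrete, the Python sketch below converts the stated concentrations into absolute amounts for the 3 mL reaction volume. It assumes that the substrate and cofactor concentrations are in micromolar (200 µM germacrene A, 500 µM NADPH), which is how the damaged unit symbols in the text are read here.

```python
def micromoles(conc_uM, volume_mL):
    """Amount in µmol for a concentration in µM (µmol/L) and a volume in mL."""
    return conc_uM * volume_mL / 1000.0


VOLUME_ML = 3.0  # reaction volume stated in the text

# Concentrations as read from the text (units assumed to be µM).
germacrene_a_umol = micromoles(200.0, VOLUME_ML)
nadph_umol = micromoles(500.0, VOLUME_ML)

# 10 mM Glc-6-phosphate expressed in µM.
g6p_umol = micromoles(10_000.0, VOLUME_ML)

# 3 mg of microsomal protein in 3 mL.
protein_mg_per_ml = 3.0 / VOLUME_ML

print(f"germacrene A: {germacrene_a_umol:.2f} µmol")
print(f"NADPH: {nadph_umol:.2f} µmol")
print(f"Glc-6-phosphate: {g6p_umol:.1f} µmol")
print(f"microsomal protein: {protein_mg_per_ml:.1f} mg/mL")
```

Under these assumptions NADPH is supplied in 2.5-fold molar excess over germacrene A before regeneration is taken into account.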
  • the reaction product was acidified with 2 M HCl to pH 6.0 and extracted with ethyl acetate. The extract solvent was then replaced with methanol for LC-MS analysis.
  • Retention indices (RI) for methyl esters of ⁇ -, ⁇ -, ⁇ - costic acid and ilicic acid were: in DB-5 column, 1807, 1805, 1788, and 1966, respectively; in Cyclodex-B column, 1914, 1920, 1889, and 2208, respectively.
  • RI values for native forms of ⁇ -, ⁇ -, ⁇ - costic acid and ilicic acid were 1873, 1870, 1852, and 2103, respectively, in DB5 column.
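Retention indices such as those listed above are usually obtained by interpolating an analyte's retention time between the two n-alkanes that bracket it. The Python sketch below uses the linear (temperature-programmed) form of the retention index; whether this form or the logarithmic Kovats form was used for the values above is not stated, and the retention times in the example are invented purely for illustration.

```python
def linear_retention_index(t_x, alkanes):
    """Linear retention index of an analyte eluting at t_x.

    alkanes: iterable of (carbon_number, retention_time) pairs for n-alkane standards.
    """
    points = sorted(alkanes)
    for (n, t_n), (m, t_m) in zip(points, points[1:]):
        if t_n <= t_x <= t_m:
            # Interpolate between the bracketing alkanes (normally m == n + 1).
            return 100.0 * (n + (m - n) * (t_x - t_n) / (t_m - t_n))
    raise ValueError("retention time is outside the range of the alkane standards")


# Hypothetical retention times in minutes for C18 and C19 alkanes and one analyte.
standards = [(18, 20.0), (19, 22.0)]
ri = linear_retention_index(21.0, standards)
print(ri)
```

An analyte eluting exactly with an alkane standard receives 100 times that alkane's carbon number.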
  • Cope-rearranged product, elemenoic acid showed RI value of 1762, and EI-MS relative ion intensity as follows.
  • NMR analyses. For costic acids and ilicic acid, NMR spectra were recorded in 3 mm standard NMR tubes on a Varian Unity Inova 500 MHz spectrometer equipped with a 3 mm ID-PFG probe. The 1H and 13C NMR chemical shifts were referenced to solvent signals at δH/δC 7.14/127.68 (C6D6) relative to TMS. 1D and 2D homonuclear NMR spectra were measured with standard Varian pulse sequences, and the experiments performed include gCOSY, TOCSY, ROESY, gHSQCAD, and gHMBCAD.
  • Adiabatic broadband and band-selective gHSQCAD and gHMBCAD spectra were recorded using CHEMPACK 4.0 pulse sequences (implemented in Varian VnmrJ 2.1B spectrometer software).
  • 1H and 13C NMR spectra were acquired in 5 mm standard NMR tubes at 400.13 and 100.6 MHz on a Bruker AVANCE 400 Spectrometer equipped with a 5 mm inverse probe with triple axis gradients. Chemical shifts (δ) were referenced to internal TMS for both 13C and 1H.
  • Spectra were recorded with standard Bruker pulse sequences under Xwinnmr. Experiments performed included 1D proton, 1D 13C with proton decoupling, 13C attached proton test with proton decoupling, COSY with double quantum filter, TOCSY with 60 ms mixing time, HSQC and HMBC.
  • LC-MS analyses of sesquiterpenoids. Metabolite mass profiles were generated by an Agilent 1200 Rapid Resolution LC (RRLC) system coupled with an Agilent 6410 MS using 10 µL injections of samples onto a reverse-phase C18 column (2.1 x 50 mm, 1.8 µm, Eclipse Plus C18 Zorbax) with a solvent gradient of 80:20 (A:B) to 20:80 (A:B) over 12 min at 0.4 mL/min and 40 °C column temperature (A: H2O with 1% acetic acid; B: 100% acetonitrile).
  • the initial 30 s of LC was operated at an isocratic mode with solvent composition 80:20 (A:B). Total ion scans were used in both negative and positive mode and specific ion masses were selected for further mass analysis.
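The gradient described above can be written as a simple function of time. The Python sketch below returns the percentage of solvent B at a given time, assuming a linear ramp from 20% B to 80% B and assuming that the 12 min run time includes the initial 30 s isocratic hold; the text does not state whether the hold is counted inside or outside the 12 min, so both times are left as parameters.

```python
def percent_b(t_min, hold_min=0.5, end_min=12.0, start_b=20.0, end_b=80.0):
    """Percentage of solvent B at time t_min for a hold followed by a linear ramp."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= hold_min:
        return start_b  # isocratic hold at 80:20 (A:B)
    if t_min >= end_min:
        return end_b    # final composition 20:80 (A:B)
    fraction = (t_min - hold_min) / (end_min - hold_min)
    return start_b + fraction * (end_b - start_b)


for t in (0.0, 0.5, 6.25, 12.0):
    b = percent_b(t)
    print(f"{t:5.2f} min: {100.0 - b:.1f}:{b:.1f} (A:B)")
```

If the hold is instead followed by a full 12 min ramp, calling the function with end_min=12.5 gives the corresponding profile.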
  • ClustalW algorithm Phylogenetic analysis was performed using the Phylogenetic Analysis Using Parsimony (PAUP) 4.0 software. The first 21 amino acids corresponding to the membrane domain were excluded, and characters were reweighted according to the rescaled consistency index. Parsimony analysis was performed using the tree-bisection-reconnection (TBR) algorithm. 1,000 replicates of the bootstrap analysis were performed to evaluate the statistical significance of each node.
  • PAUP Parsimony
  • TBR tree-bisection-reconnection
  • PCR amplification using primers designed on the start and stop codons of the identified ESTs allowed isolation of a full-length gene from lettuce leaf cDNA.
  • the isolated cDNA encodes a polypeptide of 488 amino acids with a predicted molecular mass of 54.9 kDa.
  • the deduced amino acid sequences from this gene showed 86.7% identity to those from A. annua AMO.
  • This P450 gene was designated as germacrene A oxidase (GAO) based on its catalytic property (see below).
  • yeast strain EPY300 engineered to produce a markedly increased level of farnesyl diphosphate (FPP, an immediate precursor of germacrene A) served as a platform strain.
  • FPP farnesyl diphosphate
  • the production of the hydrocarbon germacrene A in the EPY300 strain expressing previously characterized lettuce GAS has been demonstrated (Bennett et al, 2002; Gopfert et al, 2009).
  • For in vivo catalytic coupling of GAS and GAO, open reading frames of GAO with FLAG epitope tag and A.
  • CPR: cytochrome P450 reductase
  • the fourth peak (4) also showed abundant m/z 248 but displayed significantly delayed retention time.
  • the EI-MS analysis of their native (non-methylated) forms revealed an identical parent mass of m/z 234 in each case.
  • The observation of m/z 248 (methyl ester) and 234 (native form) suggested that the hydrocarbon germacrene A (Mr 204) is sequentially oxidized three times by GAO, as is the case for the biosynthesis of artemisinic acid by AMO (FIG. 1).
  • (-/+)-LC-ESI-MS analyses of 1, 2, and 3 displayed their [M-H]⁻ ions at m/z 233 and their [M+H]⁺ ions at m/z 235, confirming the above findings.
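The mass values quoted above can be checked with simple nominal-mass arithmetic. In this sketch the molecular formulas (C15H24 for germacrene A, C15H22O2 for germacrene A acid, C16H24O2 for its methyl ester) are standard values supplied for illustration rather than taken from the text, and the helper name is hypothetical.

```python
# Nominal (integer) atomic masses
NOMINAL = {"C": 12, "H": 1, "O": 16}


def nominal_mass(formula):
    """Sum nominal atomic masses for a formula given as {element: count}."""
    return sum(NOMINAL[element] * count for element, count in formula.items())


# Assumed molecular formulas (standard chemistry, not stated in the text)
germacrene_a = {"C": 15, "H": 24}
germacrene_a_acid = {"C": 15, "H": 22, "O": 2}
acid_methyl_ester = {"C": 16, "H": 24, "O": 2}

m_hydrocarbon = nominal_mass(germacrene_a)
m_acid = nominal_mass(germacrene_a_acid)
m_ester = nominal_mass(acid_methyl_ester)

print("germacrene A:", m_hydrocarbon)
print("germacrene A acid:", m_acid, "methyl ester:", m_ester)
print("[M-H]-:", m_acid - 1, "[M+H]+:", m_acid + 1)
```

The hydrocarbon-to-acid difference of 30 mass units corresponds to the net gain of two oxygen atoms and loss of two hydrogen atoms across the three oxidation steps.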
  • The putative germacrene A acid observed in neutral culture was further analyzed by (-)LC-MS in comparison to the products from non-buffered medium.
  • costic acids: m/z 233
  • ilicic acid: m/z 251
  • FIG. 3B top
  • the amount of costic acids and ilicic acid decreased to 15% and 34%, respectively, relative to their levels in non-buffered culture.
  • a new peak of m/z 233 at a slightly earlier retention time than costic acids increased by 44-fold (FIG.
  • GAO activity is highly conserved in Asteraceae.
  • the convenient in vivo system was used to trace the advent of GAO in various Asteraceae plant species.
  • EST-mining was used to trace the advent of GAO in various Asteraceae plant species.
  • RACE methods were used to isolate full-length clones of AMO/GAO homologs from selected Asteraceae plants.
  • Sunflower (Helianthus annuus)
  • Chicory (Cichorium intybus) and costus (Saussurea lappa)
  • Barnadesia spinosa was selected as the representative of the most basal phylogenetic lineage, subfamily Barnadesioideae (FIG. 4A) (Panero and Funk, 2008).
  • the enzymatic activities of the isolated GAO clones were examined in the yeast in vivo system by co-expressing with GAS and CPR.
  • the 233 negative ions from the germacrene A acid were detected in all samples tested, and quantitative analyses showed that comparable amounts of germacrene A acid were synthesized in the yeast strain expressing lettuce, chicory, costus, or Barnadesia clone (FIG. 5A).
  • Although the amount of germacrene A acid from the yeast expressing the sunflower clone was one order of magnitude lower (10-15 fold) than that of the others, semi-quantitative immunoblot analysis of isolated microsomes revealed that the level of sunflower recombinant protein was at least 10-fold lower than those from the other clones (FIG.
  • sunflower enzyme also resulted in the comparable level of catalytic activity to other enzymes.
  • GAOs and their corresponding enzymatic activities are conserved at the phylogenetic basal clade of Asteraceae (i.e., Barnadesioideae) and retained in three major subfamilies of Asteraceae.
  • the deduced amino acids from these clones shared significant sequence identities ranging from 78.4% to 97.3% (FIG. 4C).
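Pairwise identity values of this kind are computed from aligned sequences. The sketch below shows one common convention (identical residues divided by the number of columns where neither sequence has a gap); the text does not say which convention was used, and the sequences are invented toy data.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity over aligned columns in which neither sequence is gapped."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


# Toy aligned fragments: 8 comparable columns, 7 identical
print(percent_identity("MKTL-VAGS", "MRTLQVAGS"))
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives slightly different percentages for the same alignment.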
  • AMO shared a higher degree of homology to the GAOs from lettuce, chicory, sunflower, and costus (84.2-86.8%) than BsGAO did with these GAOs (79.6-82.6%).
  • A phylogenetic tree was reconstructed from AMO and five GAOs using two cytochrome P450s for sesquiterpene oxidations as outgroups (Ralston et al., 2001; Takahashi et al., 2007).
  • The phylogenetic analysis showed that AMO forms a distinctive node from the major GAO clade within the lineage that originated from the Barnadesia GAO (FIG. 4B).
  • sunflower GAO constitutes part of the major GAO clade (FIG. 4B, bracket) that can be distinguished from AMO by a strong statistical support.
  • AMO in Artemisia annua recently underwent a specific biochemical micro-evolution that was not mirrored by the overall speciation patterns of the subfamily Asteroideae.
  • Co-expression plasmids for the swapped gene pairs (GAS/AMO or ADS/GAOs) were constructed in the CPR::pESC-Leu2d plasmid and transformed into the EPY300 strain.
  • the GAS and AMO pair displayed negligible activity (0.04%) for germacrene A acid synthesis, relative to the activity detected from the native enzyme pair, GAS and GAO (Table 2).
  • All five GAOs from various Asteraceae plants displayed ~10²-10³-fold higher relative activities (5-40%) for artemisinic acid synthesis than that from the GAS/AMO pair.
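The fold difference quoted above follows directly from the two sets of relative activities. A minimal sketch, using only the percentages given in the text (0.04% for the GAS/AMO pair and 5-40% for the GAOs); the function name is hypothetical:

```python
import math


def fold_change(value, reference):
    """Ratio of two relative activities expressed in the same units (here, %)."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return value / reference


gas_amo_percent = 0.04          # relative activity of the GAS/AMO pair
gao_percent_range = (5.0, 40.0)  # relative activities of the five GAOs

folds = [fold_change(v, gas_amo_percent) for v in gao_percent_range]
for v, f in zip(gao_percent_range, folds):
    print(f"{v:4.0f}% vs 0.04%: {f:.0f}-fold (10^{math.log10(f):.1f})")
```

The lower and upper bounds come out at roughly two and three orders of magnitude, matching the range stated in the text.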
  • brm: broad multiplet; bs: broad singlet; bt: broad triplet; d: doublet; dd: doublet of doublets; ddd: doublet of doublets of doublets; m: multiplet; tt: triplet of triplets; (partially) overlapped signals
  • ⁇ - and ⁇ -costic acids identified in this study are known natural products which can be derived from germacrene A acid in vitro (de Kraker et al, 2001b). It was therefore not surprising to detect these products in highly acidic yeast culture conditions, but the identification of ilicic acid was not expected from the yeast system.
  • Ilicic acid and its hydroxyl derivatives have been reported as natural products in Asteraceae plants such as Inula viscosa, Laggera alata, and Dittrichia graveolens (Hernandez et al., 2001; Abou-Douh, 2008; Zheng et al., 2003), and their anti-tumor and anti-inflammatory activities have been evaluated (Hernandez et al., 2001; Leon et al., 2009).
  • the formation of ilicic acid from costic acids could be mediated by non-specific yeast enzyme(s) due to the stereo-specificity of the hydroxyl group at C4 position, but the nature of this chemical conversion was not further pursued in this study.
  • the occurrence of costic acids and ilicic acid from plants should be interpreted with caution since their presence in nature could be artifacts as shown in this work.
  • Asteraceae originated about 30 million years ago in South America and radiated rapidly to all continents except Antarctica (Panero and Funk, 2008; Cronquist, 1977; Jansen and Palmer, 1987).
  • A 22 kb DNA inversion in the chloroplast genome shared by all other Asteraceae plants is lacking in the Barnadesioideae group, and this inversion has since served as key molecular evidence to support an ancient split between Barnadesioideae and the rest of Asteraceae (Jansen and Palmer, 1987).
  • Artemisia annua is the only plant species known to produce artemisinic acid and artemisinin. Specific evolutionary events seem to have occurred to drive the advent of ADS in A. annua, and GAO may have subsequently evolved to accommodate a new sesquiterpene hydrocarbon (amorphadiene) for the biosynthesis of artemisinic acid. This study indicates that AMO compromised its activity for germacrene A oxidation on its way to acquiring amorphadiene oxidation activity. However, all the GAOs displayed noticeable activities to oxidize amorphadiene, even though these enzymes have not been exposed to amorphadiene in nature.
  • (+)-Costunolide was purchased from AvaChem Scientific LLC (San Antonio, TX, USA). Germacrene A acid was prepared as described above.
  • 3'-Rapid Amplification of cDNA Ends. 3'-RACE was performed using a SMART™ RACE cDNA Amplification Kit (Clontech) following the manufacturer's instructions. Gene-specific primers, 3'-GSP1 (5'-GAAGAGAGCACAAGAAGAAGTGAGATCG-3' (SEQ ID NO:43)) and 3'-GSP2 (5'-ATTAGTACCCAGAGAATGCCGACAAGC-3' (SEQ ID NO:44)), were designed. The sequence of the universal primers for 3'-RACE was given in the user manual for the kit. The resultant PCR products of about 600 bp were subcloned into pGEM®-T Easy Vector (Promega) and their nucleotide sequences were determined.
  • LsCS: costunolide synthase
  • A quadruple expression plasmid, pESC-Leu2d::GAS/GAO/CPR/LsCS, was made as follows.
  • pESC-Leu2d::GAS/GAO/CPR was digested with ScaI and BspEI, and the digested product that contains GAO/CPR was ligated to the corresponding sites of pESC-Leu2d::GAS, resulting in a plasmid harboring GAS/GAO/CPR and a newly introduced Gal10 promoter-multiple cloning site-ADH1 terminator cassette (pESC-Leu2d::GAS/GAO/CPR-Gal10_cassette).
  • LsCS was amplified from the pESC-Ura::CPR/LsCS plasmid with a forward primer, 5'- ⁇ CTA G ATGGAGCCTCTC ACC ATCGTC-3' (SEQ ID NO:27), and a reverse primer, 5'- ⁇ CTA GTGCGGACTTGAGGATCGGGACG-3' (SEQ ID NO:28) (SpeI site is a).
  • PCR products were first subcloned into pGEM®-T Easy Vector to confirm their nucleotide sequences, and then digested with SpeI.
  • The generated LsCS coding fragments were ligated into the SpeI site of pESC-Leu2d::GAS/GAO/CPR-Gal10_cassette to generate the quadruple expression plasmid, pESC-Leu2d::GAS/GAO/CPR/LsCS.
  • The reaction (1 mL) consisted of 50 mM HEPES/NaOH buffer (pH 7.5), 500 µM NADPH, substrate (crude germacrene A acid solution), and 330 µL of microsomal fraction. After 3 h of incubation at 28 °C, the reaction was extracted three times with ethyl acetate. The ethyl acetate was completely evaporated under a stream of nitrogen, and the compounds were then dissolved in 50 µL of MeOH. Twenty µL of the MeOH sample was diluted with 80 µL of MeOH and 400 µL of water (final MeOH concentration 20%), followed by filtration (0.22 µm syringe filter). Forty µL of the sample was analyzed by LC-MS.
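The final methanol concentration stated for the diluted sample can be verified with a short volume calculation. This sketch ignores any volume contraction on mixing methanol and water, and the function name is hypothetical.

```python
def volume_fraction(solvent_ul, diluent_ul):
    """Volume fraction of solvent after mixing, assuming volumes are additive."""
    total = solvent_ul + diluent_ul
    if total <= 0:
        raise ValueError("total volume must be positive")
    return solvent_ul / total


# 20 uL of MeOH sample plus 80 uL of MeOH, then 400 uL of water
methanol_ul = 20 + 80
water_ul = 400

fraction = volume_fraction(methanol_ul, water_ul)
print(f"MeOH: {fraction:.0%} of {methanol_ul + water_ul} uL")
```

The same helper applies to any two-component dilution where additive volumes are an acceptable approximation.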
  • the culture was centrifuged to separate supernatant and pellet.
  • The pH of the supernatant was adjusted to 6.0 with 5 N HCl, and the supernatant was extracted twice with ethyl acetate.
  • Compounds were dissolved in 1 mL of MeOH. Twenty µL of the MeOH sample was diluted with 80 µL of MeOH and 400 µL of water (final MeOH concentration 20%), followed by filtration (0.22 µm syringe filter). Forty µL of the sample was analyzed by LC-MS.
  • Isolation of cytochrome P450 cDNAs.
  • The lactone moiety of STL is formed by the lactonization of the C6 hydroxyl and C12 carboxylic acid groups in the germacrene A backbone (FIG. 6).
  • The conversion of germacrene A to germacrene A acid was shown to be catalyzed by a cytochrome P450 highly homologous to CYP71AV1 from Artemisia annua (Ro et al., 2006; manuscript in preparation).
  • A yeast expression plasmid for CYP_LsA was constructed and introduced into yeast YPL154C (PEP4). Since CYP_LsA had a putative endoplasmic reticulum-localizing signal, microsomal fractions were prepared from recombinant yeast cells and their enzymatic activity was determined using LC-MS analysis. LC-MS analysis showed that microsomal fractions of CYP_LsA-expressing yeast could react with germacrene A acid.
  • the major product from the CYP_LsA recombinant enzyme showed the identical retention time and m/z as those of authentic (+)-costunolide standard (FIG. 7). This result indicated that this major product is costunolide.
  • The newly identified gene, CYP_LsA, was identified as a costunolide synthase and was accordingly renamed costunolide synthase (LsCS).
  • LsCS: costunolide synthase gene
  • the inventors developed a single expression plasmid designed to express four genes (GAS, GAO, CPR, and LsCS) simultaneously and introduced it to high FPP-producing yeast strain, EPY300 (Ro et al., 2008). After a 4-day-culture of the engineered yeast, the inventors determined whether costunolide was produced with LC-MS analysis.
  • LC-MS analysis in positive ion mode at m/z 233 showed that the engineered yeast can produce costunolide and minor amounts of by-products (FIG. 8A). Costunolide formation from germacrene A acid occurs in two steps: first, C6 hydroxylation of germacrene A acid, and second, spontaneous lactone ring closure (FIG. 8C). One of the minor products is therefore likely to be C6-hydroxylated germacrene A acid. Further LC-MS/MS analysis unambiguously confirmed the identity of the LsCS reaction product as costunolide in comparison with the authentic costunolide standard. The reaction product and the authentic costunolide standard showed identical product ions and relative abundances (FIG.
  • The amount of costunolide, which was de novo biosynthesized in this experiment, was about 900 µg L⁻¹. This result demonstrated that costunolide can be synthesized from a simple carbon source using the newly identified cytochrome P450 gene, LsCS.
  • Plant materials. Helianthus annuus L. cv. HA300 and Lactuca sativa cv. Mariska were grown under greenhouse conditions with 16 h illumination (330 ⁇ s-1 m-2) and a night length of 8 h.
  • Costunolide standard and structural confirmation. Costunolide was purchased from AvaChem Scientific (San Antonio, TX, USA). The structure of the purchased costunolide was confirmed by 1D and 2D NMR analyses. Experiments performed were 1D proton, 1D 13C with proton decoupling, 13C attached proton test with proton decoupling, COSY, TOCSY, HSQC, and HMBC.
  • RNA isolation and cDNA library construction. Trichomes found in the anther appendages of sunflower florets were used to generate a trichome-specific library. Trichomes in the secretory stage were manually isolated as described (Gopfert et al., 2005). The trichomes were collected in 200 µL of ice-cold RNA extraction buffer (Aurum Total RNA Isolation Kit, Bio-Rad). When trichomes from 200 florets (approximately 40,000 trichomes) had been collected, the vial was frozen at -80 °C. Altogether, trichomes from 5,000 florets were isolated. For total RNA isolation, the frozen aliquots were thawed on ice.
  • The ORFs of C49, C113, C63, and C7 were amplified using primers (1a - 4b).
  • C49 and C113 were reamplified with a pair of primers, 5a and 5b, and then cloned into the pDONR 221 plasmid by the Gateway BP reaction.
  • C63 and C7 were cloned into the pENTR/D-TOPO vector according to the provided protocol (Invitrogen).
  • The Gateway LR reactions were performed for C49, C113, C63, and C7 to generate the respective yeast expression plasmids in pYES-DEST52 according to the provided protocol (Invitrogen).
  • ORFs were cloned as translational fusions with the V5 epitope in the pYES-DEST52 vector.
  • The ORFs of C12, C100, C28, S1, S2, and S3 were amplified using primers (6a - 11b). Amplified fragments were digested with NheI or XbaI and subcloned into the SpeI site of the pESC-Ura vector to make translational fusions to the FLAG epitope. Expression of the cloned P450 genes was assessed by immunoblots using commercially available anti-V5 or anti-FLAG antibodies.
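Subcloning NheI- or XbaI-digested fragments into a SpeI site works because all three enzymes leave the same 5' overhang. The sketch below illustrates that compatibility check; the recognition sites and cut positions are standard enzyme properties supplied here for illustration, not details taken from the text, and the helper names are hypothetical.

```python
# Recognition site and top-strand cut offset for a few palindromic cutters
# (standard enzyme properties, listed here as illustrative assumptions).
ENZYMES = {
    "SpeI": ("ACTAGT", 1),
    "XbaI": ("TCTAGA", 1),
    "NheI": ("GCTAGC", 1),
    "BamHI": ("GGATCC", 1),
}


def overhang(enzyme):
    """Return the 5' single-stranded overhang left by a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a, enzyme_b):
    """Two cohesive ends can anneal and be ligated if their overhangs match."""
    return overhang(enzyme_a) == overhang(enzyme_b)


for name in ("XbaI", "NheI", "BamHI"):
    print(f"{name} ({overhang(name)}) into SpeI site: {compatible(name, 'SpeI')}")
```

A junction formed between ends from two different enzymes of this group no longer matches either recognition site, so the insert cannot be released again with the same enzymes.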
  • these plasmids and substrate-supplying plasmid, pESC-Leu2d::GAS/LsGAO/CPR were cotransformed in the EPY300 strain (Nguyen et al, 2010; Ro et al, 2008).
  • Sunflower C12 and LsCOS were co-expressed with Artemisia annua CPR in the pESC-Ura vector.
  • A. annua CPR from the pESC-Ura::CPR plasmid was digested with BamHI and SalI, and the digested fragment was ligated to the corresponding sites in pESC-Ura::C12, resulting in pESC-Ura::C12/CPR.
  • Partial sequences at the start codon of LsCOS were obtained from the Compositae Genome Project Database of the University of California Davis (compgenomics.ucdavis.edu).
  • The ORF of LsCOS was amplified from lettuce leaf cDNA templates with a primer pair, 12a and 12b, followed by digestion with XbaI and ligation into the SpeI site of the pESC-Ura::CPR plasmid, resulting in pESC-Ura::CPR/LsCOS.
  • A quadruple expression plasmid, pESC-Leu2d::GAS/LsGAO/CPR/LsCOS, was constructed as follows.
  • The plasmid pESC-Leu2d::GAS/LsGAO/CPR (Nguyen et al., 2010) was digested with ScaI and BspEI.
  • The digested product containing a partial sequence of GAS and full-length sequences of LsGAO and CPR was ligated to the corresponding sites of pESC-Leu2d::GAS.
  • This cloning created a plasmid, pESC-Leu2d::GAS/LsGAO/CPR-Gal10_Cassette, which contains GAS/LsGAO/CPR and a newly introduced empty cloning site (i.e., a Gal10 promoter-multiple cloning site-ADH1 terminator cassette).
  • An ORF of LsCOS was amplified from the pESC-Ura::CPR/LsCOS plasmid using primers 13a and 13b. The amplified PCR products were first cloned into pGEM-T Easy vector and then digested with SpeI.
  • The digested products were ligated into the SpeI site of pESC-Leu2d::GAS/LsGAO/CPR-Gal10_Cassette. This cloning created a quadruple expression plasmid, pESC-Leu2d::GAS/LsGAO/CPR/LsCOS.
  • the transgenic yeast strain of interest was inoculated in 3 ml of synthetic complete (SC) medium omitting the appropriate amino acids with 2% Glc.
  • SC: synthetic complete
  • The inocula were cultured overnight at 30 °C and 200 rpm.
  • The starter culture was diluted 25-fold in SC medium omitting the appropriate amino acids, with 1.8% Gal and 0.2% Glc.
  • methionine was added to the culture at a final concentration of 1 mM.
  • 100 or 150 mM HEPES/NaOH (pH 7.5) was added to the culture medium to maintain the culture pH above 6.0.
  • After the yeast was cultured for 72 to 120 h at 30 °C and 200 rpm, the culture medium was adjusted to pH 6 with 5 N HCl, and the medium was extracted with ethyl acetate. The ethyl acetate fractions were evaporated under N2 gas or with a rotary evaporator, and the metabolites were dissolved in methanol.
  • GAA was purified using an HPLC system (Waters 2795 Separation Module; Waters SunFire C18 column, 3.5 µm, 4.6 x 150 mm; Waters 2996 Photodiode Array Detector with UV wavelength at 195 nm). The separation was achieved with a solvent gradient of 30:70 (A:B) to 28:72 (A:B) over 8 min at 1 mL min⁻¹ and 40 °C column temperature (A: H2O with 0.1% acetic acid; B: 100% acetonitrile). To avoid acid-induced cyclization of GAA, the GAA fractions were collected into 250 mM ammonium acetate solution in which the pH was kept above 6.
  • The ammonium acetate solution was adjusted to pH 6.0 with acetic acid, and GAA was recovered from the ammonium acetate solution using a Sep-Pak Plus C18 cartridge (Waters). After elution of GAA from the cartridge with 100% acetonitrile, the acetonitrile fraction was evaporated under an N2 stream. The purified GAA was dissolved in DMSO.
  • Microsome preparation and in vitro enzyme assay were carried out in 1 mL of 50 mM HEPES/NaOH.
  • LC-MS analysis was performed using an Agilent 1200 Rapid
  • The purification was conducted by HPLC with a solvent gradient of 50:50 (A:B) to 40:60 (A:B) over 8 min at 1 mL min⁻¹ and 40 °C column temperature (A: H2O with 0.1% acetic acid; B: 100% acetonitrile).
  • the eluted C6 hydroxy GAA was collected in 500 mM HEPES/NaOH (pH 7.5) to avoid acid-induced cyclization (final concentration of HEPES/NaOH after collection was about 50 mM).
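The relationship between the collection-buffer concentration and the final concentration can be worked out from a simple dilution balance. This sketch assumes additive volumes and that the eluate itself contributes no HEPES; the function name is hypothetical.

```python
def eluate_per_buffer_volume(stock_mm, final_mm):
    """Volumes of eluate added per volume of buffer stock.

    From C_stock * V_buffer = C_final * (V_buffer + V_eluate).
    """
    if not 0 < final_mm <= stock_mm:
        raise ValueError("final concentration must be positive and not exceed stock")
    return stock_mm / final_mm - 1


ratio = eluate_per_buffer_volume(stock_mm=500, final_mm=50)
print(f"{ratio:.0f} volumes of eluate per volume of 500 mM buffer")
```

Read the other way around, a tenfold drop in buffer concentration means the collected eluate made up about nine-tenths of the final volume.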
  • The HPLC analysis confirmed that the purified C6 hydroxy GAA was spontaneously converted to costunolide even at neutral pH, and that the conversion was accelerated at elevated temperatures.
  • The extract was fractionated with a solvent gradient of 40:60 (A:B) to 38.4:61.6 (A:B) over 8 min at 1 mL min⁻¹ and 40 °C column temperature (A: H2O with 0.1% acetic acid; B: 100% acetonitrile).
  • The eluted C12 metabolic product was collected in 200 mM HEPES/NaOH (pH 7.5) to avoid acid-induced cyclization (final concentration of HEPES/NaOH after collection was about 140 mM).
  • The purified fraction was diluted 4 times with H2O, and its pH was adjusted to 6 with 1 N HCl.
  • The C12 metabolic product was purified using the Sep-Pak Plus C18 cartridge or ethyl acetate extraction.
  • NMR spectra for standard and de novo synthesized costunolide were obtained on a Varian 700 MHz spectrometer equipped with an inverse-detection, cryo-cooled, triple-resonance, Z-gradient probe. 1H-NMR chemical shifts are reported using the residual proton resonance of the solvent as reference (CDCl3, δH 7.24), and 13C-NMR chemical shifts are reported relative to CDCl3 (δC 77.0).
  • NMR spectra were recorded in 3 mm standard NMR tubes on a Varian Unity Inova 500 MHz spectrometer equipped with a 3 mm ID-PFG probe.
  • Costunolide synthase, which catalyzes 6α-hydroxylation of GAA, was shown to be a cytochrome P450 using a cell-free assay of chicory root (de Kraker et al., 2002). Since P450s form a diverse protein superfamily and a few hundred P450s are found in the genomes of higher plants (Nelson et al., 2004), a selection strategy is critical to narrow down the candidate P450 genes. Lettuce latex, containing millimolar levels of STLs, was initially considered as a source of transcripts for STL biosynthesis due to the easy sample accessibility (Sessa et al., 2000). However, q-PCR analysis of the lettuce GAS, which catalyzes the first committed step in STL biosynthesis, showed that this gene is expressed 150 times higher in stem than in latex.
  • RNA isolated from pure HA300 trichomes was used to generate a plasmid cDNA library.
  • A total of 1,130 clones were sequenced by single-pass 5'-end Sanger sequencing, and the resulting ESTs were assembled into 116 contigs and 651 singletons, yielding 767 unigenes.
  • 539 unigenes (70.2%) were annotated by the UniProt database (for a full list of annotated genes, see Table 5).
  • Previously reported GAS and GAO transcripts were present in 3 and 9 copies, respectively.
  • GAO was the 5th most abundant transcript (0.8% of total transcripts) in the trichome EST database, whereas only 4 copies of GAO were found from 86,398 ESTs generated from various tissues of sunflower (0.004%) (world-wide-web at cgpdb.ucdavis.edu/cgpdb2). This result suggests that the trichome cDNA library is highly enriched for the transcripts of STL biosynthesis.
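The library statistics quoted above can be reproduced with a few lines of arithmetic. The sketch assumes the trichome percentage is taken over the 1,130 sequenced clones and the whole-plant percentage over the 86,398 ESTs, which is how the text's figures appear to be derived; the variable names are illustrative.

```python
def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


contigs, singletons = 116, 651
unigenes = contigs + singletons

trichome_clones = 1130   # sequenced clones in the trichome library
gao_in_trichome = 9      # GAO transcript copies in the trichome library
all_tissue_ests = 86398  # ESTs from various sunflower tissues
gao_in_all = 4           # GAO copies among those ESTs

trichome_pct = percent(gao_in_trichome, trichome_clones)
all_tissue_pct = percent(gao_in_all, all_tissue_ests)
enrichment = trichome_pct / all_tissue_pct

print(f"unigenes: {unigenes}")
print(f"GAO in trichome library: {trichome_pct:.2f}%")
print(f"GAO in all-tissue ESTs: {all_tissue_pct:.4f}%")
print(f"enrichment: about {enrichment:.0f}-fold")
```

On these numbers GAO transcripts are enriched by more than two orders of magnitude in the trichome library, which is the basis for the conclusion that the library is enriched for STL biosynthesis transcripts.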
  • Table 5 Annotation of Transcripts Isolated From Sunflower Trichomes
  • Contig1 480 2 Q4U3F7_HELAN Helianthus annuus germacrene A synthase 1 (HaGAS1); nCL51 1.00E-60 EU439590

Abstract

Sesquiterpene lactones (STLs) are characteristic natural products in Asteraceae, but the biochemistry and evolution of STLs in Asteraceae remain unexplored at the molecular level. The present invention provides gene and protein sequences for costunolide synthases, with exemplary sequences from Asteraceae. The present invention further provides recombinant microorganisms containing these genes, and methods of their use.
PCT/IB2011/001609 2010-03-16 2011-03-16 Nucleic acids and protein sequences of costunolide synthase WO2011121456A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US31443910P 2010-03-16 2010-03-16
US61/314,439 2010-03-16

Publications (2)

Publication Number Publication Date
WO2011121456A2 true WO2011121456A2 (fr) 2011-10-06
WO2011121456A3 WO2011121456A3 (fr) 2012-02-16

Family

ID=44712689

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2011/001609 WO2011121456A2 (fr) 2010-03-16 2011-03-16 Nucleic acids and protein sequences of costunolide synthase

Country Status (1)

Country Link
WO (1) WO2011121456A2 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
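CN113846083A (zh) * 2021-09-23 2021-12-28 Huazhong Agricultural University Pyrethrum germacrene D synthase TcGDS1, its encoding gene, and use thereof
CN115197188A (zh) * 2021-04-12 2022-10-18 Nankai University A class of sesquiterpene hydroquinone compounds with a pentacyclic skeleton and preparation method thereof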

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1916307A1 (fr) * 2001-09-17 2008-04-30 Plant Research International B.V. Plant enzymes for bioconversion

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1916307A1 (fr) * 2001-09-17 2008-04-30 Plant Research International B.V. Plant enzymes for bioconversion

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DATABASE EBI 22 December 2005 'Lettuce and sunflower ESTs from the compositae genome project.' Database accession no. DW045509 *
DATABASE EBI 23 December 2005 'Lettuce and sunflower ESTs from the compositae genome project.' Database accession no. DW114856 *
DE KRAKER, J. ET AL.: 'Biosynthesis of costunolide, dihydrocostunolide, and leucodin. Demonstration of cytochrome P450-catalyzed formation of the lactone ring present in sesquiterpene lactones of chicory.' PLANT PHYSIOLOGY vol. 129, no. 1, May 2002, ISSN 1532-2548 pages 257 - 268 *
IKEZAWA, N. ET AL.: 'Lettuce costunolide synthase (CYP71BL2) and its homolog (CYP71BL1) from sunflower catalyze distinct regio- and stereoselective hydroxylations in sesquiterpene lactone metabolism.' JOURNAL OF BIOLOGICAL CHEMISTRY vol. 286, no. 24, 17 June 2011, ISSN 1083-351X pages 21601 - 21611 *
LIU, Q. ET AL.: 'Reconstitution of the costunolide biosynthetic pathway in yeast and Nicotiana benthamiana.' PLOS ONE. vol. 6, no. 8, August 2011, ISSN 1932-6203 pages 1 - 12 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
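CN115197188A (zh) * 2021-04-12 2022-10-18 Nankai University A class of sesquiterpene hydroquinone compounds with a pentacyclic skeleton and preparation method thereof
CN115197188B (zh) * 2021-04-12 2023-12-26 Nankai University A class of sesquiterpene hydroquinone compounds with a pentacyclic skeleton and preparation method thereof
CN113846083A (zh) * 2021-09-23 2021-12-28 Huazhong Agricultural University Pyrethrum germacrene D synthase TcGDS1, its encoding gene, and use thereof
CN113846083B (zh) * 2021-09-23 2023-07-21 Huazhong Agricultural University Pyrethrum germacrene D synthase TcGDS1, its encoding gene, and use thereof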

Also Published As

Publication number Publication date
WO2011121456A3 (fr) 2012-02-16

Similar Documents

Publication Publication Date Title
US11932887B2 (en) CYP76AD1-beta clade polynucleotides, polypeptides, and uses thereof
Wang et al. The control of red colour by a family of MYB transcription factors in octoploid strawberry (Fragaria× ananassa) fruits
Brown et al. Identification of a 12-gene fusaric acid biosynthetic gene cluster in Fusarium species through comparative and functional genomics
KR102181638B1 (ko) 스테비올 글리코시드의 재조합 생산
Wang et al. The soybean Dof‐type transcription factor genes, GmDof4 and GmDof11, enhance lipid content in the seeds of transgenic Arabidopsis plants
CN103667370B (zh) 编码类异戊烯修饰酶的多核苷酸和其使用方法
WO2011058446A2 (fr) Thebaine 6-O-demethylase and codeine O-demethylase from Papaver somniferum
US20140148622A1 (en) Engineering Plants to Produce Farnesene and Other Terpenoids
US11827915B2 (en) Method for production of novel diterpene scaffolds
US20040002105A1 (en) Methods of identifying genes for the manipulation of triterpene saponins
US7935802B2 (en) Lignan glycosidase and utilization of the same
WO2011121456A2 (fr) Nucleic acids and protein sequences of costunolide synthase
US20220315940A1 (en) Germacrene a synthase mutants
DK2992756T3 (en) Reduced onions that do not generate tear-inducing component
TW585918B (en) Isolated grand fir (Abies grandis) monoterpene synthase protein, replicable expression vectors and host cells comprising nucleotide sequences of said monoterpene synthases, method of enhancing the production of a gymnosperm monoterpene synthase
JP6018915B2 (ja) 28位がヒドロキシメチル基またはカルボキシル基である五環系トリテルペン化合物の製造方法
JP2009511071A (ja) コーヒーのフェニルプロパノイド及びフラボノイド生合成経路酵素をコードするポリヌクレオチド
US20070218461A1 (en) Indole-Diterpene Biosynthesis
Ming-Li et al. Cloning and expression analysis of Dihydroflavonol 4-Reductase gene in Brassica juncea
WO2023144199A1 (fr) Plants having reduced levels of bitter-taste metabolites
Miranda Chávez Elucidation of the Function of Dihydrochalcones in Apple
Zhao et al. Cytochrome b5 diversity in green lineages preceded the evolution of syringyl lignin biosynthesis
Christinet Characterization and functional identification of a novel plant extradiol 4, 5-dioxygenase involved in betalain pigment biosynthesis in Portulaca grandiflora
Polturak Pathway discovery and metabolic engineering of betalains
Shewmaker et al. Engineering vitamin E content: from Arabidopsis mutant to soy oil

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11762088

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11762088

Country of ref document: EP

Kind code of ref document: A2