US20210010035A1 - Production of manool - Google Patents

Production of manool

Publication number
US20210010035A1
Authority
US
United States
Prior art keywords
seq
polypeptide
amino acid
acid sequence
sequence identity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/938,605
Inventor
Michel Schalk
Laurent Daviet
Letizia ROCCI
Daniel Solis Escalante
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Firmenich SA
Original Assignee
Firmenich SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Firmenich SA
Priority to US16/938,605
Publication of US20210010035A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P5/00 Preparation of hydrocarbons or halogenated hydrocarbons
    • C12P5/007 Preparation of hydrocarbons or halogenated hydrocarbons containing one or more isoprene units, i.e. terpenes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00 Preparation of oxygen-containing organic compounds
    • C12P7/02 Preparation of oxygen-containing organic compounds containing a hydroxy group
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N1/00 Microorganisms, e.g. protozoa; Compositions thereof; Processes of propagating, maintaining or preserving microorganisms or compositions thereof; Processes of preparing or isolating a composition containing a microorganism; Culture media therefor
    • C12N1/14 Fungi; Culture media therefor
    • C12N1/16 Yeasts; Culture media therefor
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N1/00 Microorganisms, e.g. protozoa; Compositions thereof; Processes of propagating, maintaining or preserving microorganisms or compositions thereof; Processes of preparing or isolating a composition containing a microorganism; Culture media therefor
    • C12N1/20 Bacteria; Culture media therefor
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/70 Vectors or expression systems specially adapted for E. coli
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80 Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81 Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/88 Lyases (4.)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/90 Isomerases (5.)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00 Preparation of oxygen-containing organic compounds
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y402/00 Carbon-oxygen lyases (4.2)
    • C12Y402/03 Carbon-oxygen lyases (4.2) acting on phosphates (4.2.3)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/10 Plasmid DNA
    • C12N2800/101 Plasmid DNA for bacteria
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/10 Plasmid DNA
    • C12N2800/102 Plasmid DNA for yeast
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12R INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
    • C12R2001/00 Microorganisms; Processes using microorganisms
    • C12R2001/01 Bacteria or Actinomycetales; using bacteria or Actinomycetales
    • C12R2001/185 Escherichia
    • C12R2001/19 Escherichia coli
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12R INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
    • C12R2001/00 Microorganisms; Processes using microorganisms
    • C12R2001/645 Fungi; Processes using fungi
    • C12R2001/85 Saccharomyces
    • C12R2001/865 Saccharomyces cerevisiae
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y505/00 Intramolecular lyases (5.5)
    • C12Y505/01 Intramolecular lyases (5.5.1)
    • C12Y505/01012 Copalyl diphosphate synthase (5.5.1.12)

Definitions

  • Terpenes are found in most organisms (microorganisms, animals and plants). These compounds are made up of five carbon units called isoprene units and are classified by the number of these units present in their structure. Thus monoterpenes, sesquiterpenes and diterpenes are terpenes containing 10, 15 and 20 carbon atoms respectively. Sesquiterpenes, for example, are widely found in the plant kingdom. Many sesquiterpene molecules are known for their flavor and fragrance properties and their cosmetic, medicinal and antimicrobial effects. Numerous sesquiterpene hydrocarbons and sesquiterpenoids have been identified.
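The classification by isoprene units described above can be illustrated with a minimal sketch. The function and dictionary names are hypothetical and are not part of the disclosure; only the three classes named in the text are covered, and any other carbon count is reported as unclassified.

```python
# Each isoprene unit contributes five carbon atoms (from the text above).
ISOPRENE_CARBONS = 5

# Only the classes named in the text; the mapping name is illustrative.
TERPENE_CLASSES = {2: "monoterpene", 3: "sesquiterpene", 4: "diterpene"}


def classify_terpene(carbon_atoms: int) -> str:
    """Return the terpene class for a carbon count, or 'unclassified'."""
    units, remainder = divmod(carbon_atoms, ISOPRENE_CARBONS)
    if remainder != 0:
        # Not a whole number of isoprene units.
        return "unclassified"
    return TERPENE_CLASSES.get(units, "unclassified")
```

For instance, a compound with 20 carbon atoms corresponds to four isoprene units and is classified as a diterpene.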
  • Biosynthetic production of terpenes involves enzymes called terpene synthases. These enzymes convert an acyclic terpene precursor into one or more terpene products.
  • Diterpene synthases produce diterpenes by cyclization of the precursor geranylgeranyl diphosphate (GGPP).
  • The cyclization of GGPP often requires two enzyme polypeptides, a type I and a type II diterpene synthase, working in combination in two successive enzymatic reactions.
  • The type II diterpene synthases catalyze a cyclization/rearrangement of GGPP initiated by the protonation of the terminal double bond of GGPP, leading to a cyclic diterpene diphosphate intermediate. This intermediate is then further converted by a type I diterpene synthase catalyzing an ionization-initiated cyclization.
  • Diterpene synthases are present in plants and other organisms and use substrates such as GGPP, but they have different product profiles. Genes and cDNAs encoding diterpene synthases have been cloned and the corresponding recombinant enzymes characterized.
  • Copalyl diphosphate (CPP) synthase enzymes and sclareol synthase enzymes are enzymes that occur in plants. Hence, it is desirable to discover and use these enzymes and variants in biochemical processes to generate (+)-manool.
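The two successive enzymatic reactions described above (GGPP to CPP by a CPP synthase, then CPP to (+)-manool by a sclareol synthase) can be sketched as a simple lookup. This is a toy model under stated assumptions: the names `PATHWAY` and `convert` are hypothetical, and the model ignores side products and any activity an enzyme may have on a substrate other than the one listed.

```python
# Two-step route as described in the text. Each entry is
# (substrate, enzyme, product); the structure is illustrative only.
PATHWAY = (
    ("GGPP", "CPP synthase", "CPP"),
    ("CPP", "sclareol synthase", "(+)-manool"),
)


def convert(substrate: str, enzymes: set) -> str:
    """Apply the pathway steps in order for the enzymes that are present."""
    compound = substrate
    for required_substrate, enzyme, product in PATHWAY:
        # A step only fires when its substrate and its enzyme are both available.
        if compound == required_substrate and enzyme in enzymes:
            compound = product
    return compound
```

In this simplified model, `convert("GGPP", {"CPP synthase", "sclareol synthase"})` yields `"(+)-manool"`, whereas supplying only the CPP synthase stops at the CPP intermediate.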
  • (+)-manool comprising:
  • polypeptide having CPP synthase activity wherein the polypeptide comprises
  • polypeptide having sclareol synthase activity wherein the polypeptide comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, and SEQ ID NO: 25.
  • nucleic acid encoding a polypeptide described above.
  • nucleic acid encoding a CPP synthase wherein the nucleic acid comprises a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to the nucleic acid sequence as set forth in SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • nucleic acid encoding a sclareol synthase wherein the nucleic acid comprises a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
  • an expression vector comprising the nucleic acids described above, a non-human host organism or cell comprising the nucleic acids described above or comprising the expression vector, non-human host organisms or cells capable of producing GGPP, methods of transforming a non-human host organism or cell, and methods for culturing the non-human host organisms or cells for producing (+)-manool.
  • FIG. 1 Enzymatic pathway from geranylgeranyl diphosphate (GGPP) to (+)-manool.
  • FIG. 2 Enzymatic pathways from geranylgeranyl diphosphate (GGPP) to (+)-manool and sclareol.
  • FIG. 3 GCMS analysis of the in vitro enzymatic conversion of GGPP.
  • A. Using the recombinant SmCPS enzyme.
  • B. Using the recombinant ScScS enzyme.
  • C. Combining the SmCPS with ScScS enzymes in a single assay.
  • FIG. 4 GCMS analysis of (+)-manool produced using Escherichia coli cells expressing SmCPS, ScScS and mevalonate pathway enzymes.
  • A Total ion chromatogram of an extract of the E. coli culture medium.
  • B Total ion chromatogram of a (+)-manool standard.
  • C Mass spectrum of the major peak (retention time of 14.55 min) in chromatogram A.
  • D Mass spectrum of the (+)-manool authentic standard.
  • FIG. 5 GCMS analysis of (+)-manool produced using E. coli cells expressing mevalonate pathway enzymes, a GGPP synthase, ScSCS and five different CPP synthases: SmCPS2 from Salvia miltiorrhiza, CfCPS1 from Coleus forskohlii, TaTps1 from Triticum aestivum, MvCps3 from Marrubium vulgare and RoCPS1 from Rosmarinus officinalis.
  • FIG. 6 GCMS analysis of (+)-manool produced using E. coli cells expressing mevalonate pathway enzymes, a GGPP synthase, SmCPS2 and a class I diterpene synthase: NgSCS-del29 from Nicotiana glutinosa or SsScS from Salvia sclarea.
  • FIG. 7 Saccharomyces cerevisiae expression plasmids were constructed in vivo by co-transformation of yeast with six DNA fragments: a) LEU2 yeast marker, b) AmpR E. coli marker, c) Yeast origin of replication, d) E. coli replication origin, e) a fragment for co-expression of CrtE and one of the sclareol synthases coding sequences tested, and f) a fragment for expression of one of the copalyl diphosphate (CPP) synthases coding sequences tested.
  • FIG. 8 GCMS analysis of (+)-manool produced using the modified S. cerevisiae strain YST045 expressing a GGPP synthase, ScSCS and five different truncated versions of CPP synthases: SmCPS2 from Salvia miltiorrhiza, CfCPS1 from Coleus forskohlii, TaTps1 from Triticum aestivum, MvCps3 from Marrubium vulgare and RoCPS1 from Rosmarinus officinalis.
  • polypeptide means an amino acid sequence of consecutively polymerized amino acid residues, for instance, at least 15 residues, at least 30 residues, at least 50 residues.
  • a polypeptide comprises an amino acid sequence that is an enzyme, or a fragment, or a variant thereof.
  • isolated polypeptide refers to an amino acid sequence that is removed from its natural environment by any method or combination of methods known in the art and includes recombinant, biochemical and synthetic methods.
  • protein refers to an amino acid sequence of any length wherein amino acids are linked by covalent peptide bonds, and includes oligopeptide, peptide, polypeptide and full length protein whether naturally occurring or synthetic.
  • biological function refers to the ability of the CPP synthase and the sclareol synthase activity to catalyze the formation of (+)-manool.
  • nucleic acid sequence refers to a sequence of nucleotides.
  • a nucleic acid sequence may be a single-stranded or double-stranded deoxyribonucleotide, or ribonucleotide of any length, and include coding and non-coding sequences of a gene, exons, introns, sense and anti-sense complementary sequences, genomic DNA, cDNA, miRNA, siRNA, mRNA, rRNA, tRNA, recombinant nucleic acid sequences, isolated and purified naturally occurring DNA and/or RNA sequences, synthetic DNA and RNA sequences, fragments, primers and nucleic acid probes; and the complement of such sequences.
  • the skilled artisan is aware that the nucleic acid sequences of RNA are identical to the DNA sequences with the difference of thymine (T) being replaced by uracil (U).
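The relationship between a DNA sequence and the corresponding RNA sequence stated above can be sketched in a few lines. The function name `dna_to_rna` and the example sequence are hypothetical and serve only as an illustration.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA sequence for a DNA sequence by replacing T with U."""
    # Upper-casing first so that lower-case input is handled consistently.
    return dna.upper().replace("T", "U")
```

For example, the DNA sequence `ATGTTT` corresponds to the RNA sequence `AUGUUU`.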
  • An isolated nucleic acid or isolated nucleic acid sequence refers to a nucleic acid or nucleic acid sequence that is in an environment different from that in which the nucleic acid or nucleic acid sequence naturally occurs.
  • the term “naturally-occurring” as used herein as applied to a nucleic acid refers to a nucleic acid that is found in a cell in nature.
  • a nucleic acid sequence that is present in an organism, for instance in the cells of an organism, that can be isolated from a source in nature and which has not been intentionally modified by a human in the laboratory is naturally occurring.
  • purified refers to the state of being free of other, dissimilar compounds with which the compound of the invention is normally associated in its natural state, so that the “purified,” “substantially purified,” and “isolated” subject comprises at least 0.5%, 1%, 5%, 10%, or 20%, or at least 50% or 75% of the mass, by weight, of a given sample. In one particular embodiment, these terms refer to the compound of the invention comprising at least 95, 96, 97, 98, 99 or 100% of the mass, by weight, of a given sample.
  • isolated when referring to a nucleic acid or protein, of nucleic acids or proteins, also refers to a state of purification or concentration different than that which occurs naturally in a cell or organism. Any degree of purification or concentration greater than that which occurs naturally in a cell or organism, including (1) the purification from other associated structures or compounds or (2) the association with structures or compounds to which it is not normally associated in the cell or organism, are within the meaning of “isolated.”
  • the nucleic acid or protein or classes of nucleic acids or proteins, described herein may be isolated, or otherwise associated with structures or compounds to which they are not normally associated in nature, according to a variety of methods and processes known to those of skill in the art.
  • the terms “amplifying” and “amplification” refer to the use of any suitable amplification methodology for generating or detecting recombinant or naturally expressed nucleic acid, as described in detail below.
  • the invention provides methods and reagents (e.g., specific degenerate oligonucleotide primer pairs, oligo dT primer) for amplifying (e.g., by polymerase chain reaction, PCR) naturally expressed (e.g., genomic DNA or mRNA) or recombinant (e.g., cDNA) nucleic acids of the invention in vivo, ex vivo or in vitro.
  • Recombinant nucleic acid sequences are nucleic acid sequences that result from the use of laboratory methods (molecular cloning) to bring together genetic material from more than one source, creating a nucleic acid sequence that does not occur naturally and would not be otherwise found in biological organisms.
  • Recombinant DNA technology refers to molecular biology procedures to prepare a recombinant nucleic acid sequence as described, for instance, in Laboratory Manuals edited by Weigel and Glazebrook, 2002 Cold Spring Harbor Lab Press; and Sambrook et al., 1989 Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press.
  • gene means a DNA sequence comprising a region, which is transcribed into an RNA molecule, e.g., an mRNA in a cell, operably linked to suitable regulatory regions, e.g., a promoter.
  • a gene may thus comprise several operably linked sequences, such as a promoter, a 5′ leader sequence comprising, e.g., sequences involved in translation initiation, a coding region of cDNA or genomic DNA, introns, exons, and/or a 3′ non-translated sequence comprising, e.g., transcription termination sites.
  • a “chimeric gene” refers to any gene, which is not normally found in nature in a species, in particular, a gene in which one or more parts of the nucleic acid sequence are present that are not associated with each other in nature.
  • the promoter is not associated in nature with part or all of the transcribed region or with another regulatory region.
  • the term “chimeric gene” is understood to include expression constructs in which a promoter or transcription regulatory sequence is operably linked to one or more coding sequences or to an antisense, i.e., reverse complement of the sense strand, or inverted repeat sequence (sense and antisense, whereby the RNA transcript forms double stranded RNA upon transcription).
  • the term “chimeric gene” also includes genes obtained through the combination of portions of one or more coding sequences to produce a new gene.
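The antisense sequence referred to in the chimeric gene definition, i.e., the reverse complement of the sense strand, can be computed as in the following minimal sketch. The function name `reverse_complement` is hypothetical, and the sketch assumes an unambiguous DNA alphabet (A, C, G, T only).

```python
# Translation table for complementary DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sense: str) -> str:
    """Return the reverse complement (antisense) of a DNA sense strand."""
    # Complement every base, then reverse the order of the bases.
    return sense.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sense strand, which is a convenient sanity check.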
  • a “3′ UTR” or “3′ non-translated sequence” refers to the nucleic acid sequence found downstream of the coding sequence of a gene, which comprises for example a transcription termination site and (in most, but not all eukaryotic mRNAs) a polyadenylation signal such as AAUAAA or variants thereof. After termination of transcription, the mRNA transcript may be cleaved downstream of the polyadenylation signal and a poly(A) tail may be added, which is involved in the transport of the mRNA to the site of translation, e.g., cytoplasm.
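As an illustration of the polyadenylation signal mentioned above, the following sketch locates occurrences of the canonical AAUAAA motif in an mRNA string. The function name and the example transcript are hypothetical; variant signals are not handled unless passed explicitly through the `signal` parameter.

```python
from typing import List


def find_polya_signals(mrna: str, signal: str = "AAUAAA") -> List[int]:
    """Return the 0-based start positions of every occurrence of the signal."""
    sequence = mrna.upper()
    positions = []
    start = sequence.find(signal)
    while start != -1:
        positions.append(start)
        # Continue searching one base further to catch overlapping matches.
        start = sequence.find(signal, start + 1)
    return positions
```

An empty list indicates that no signal of the given form is present in the transcript.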
  • “Expression of a gene” involves transcription of the gene and translation of the mRNA into a protein. Overexpression refers to the production of the gene product as measured by levels of mRNA, polypeptide and/or enzyme activity in transgenic cells or organisms that exceeds levels of production in non-transformed cells or organisms of a similar genetic background.
  • “Expression vector” as used herein means a nucleic acid molecule engineered using molecular biology methods and recombinant DNA technology for delivery of foreign or exogenous DNA into a host cell.
  • the expression vector typically includes sequences required for proper transcription of the nucleotide sequence.
  • the coding region usually codes for a protein of interest but may also code for an RNA, e.g., an antisense RNA, siRNA and the like.
  • an “expression vector” as used herein includes any linear or circular recombinant vector including but not limited to viral vectors, bacteriophages and plasmids. The skilled person is capable of selecting a suitable vector according to the expression system.
  • the expression vector includes the nucleic acid of an embodiment herein operably linked to at least one regulatory sequence, which controls transcription, translation, initiation and termination, such as a transcriptional promoter, operator or enhancer, or an mRNA ribosomal binding site and, optionally, including at least one selection marker.
  • Nucleotide sequences are “operably linked” when the regulatory sequence functionally relates to the nucleic acid of an embodiment herein.
  • regulatory sequence refers to a nucleic acid sequence that determines the expression level of the nucleic acid sequences of an embodiment herein and is capable of regulating the rate of transcription of the nucleic acid sequence operably linked to the regulatory sequence. Regulatory sequences comprise promoters, enhancers, transcription factors, promoter elements and the like.
  • Promoter refers to a nucleic acid sequence that controls the expression of a coding sequence by providing a binding site for RNA polymerase and other factors required for proper transcription including without limitation transcription factor binding sites, repressor and activator protein binding sites.
  • the meaning of the term promoter also includes the term “promoter regulatory sequence”.
  • Promoter regulatory sequences may include upstream and downstream elements that may influence transcription, RNA processing or stability of the associated coding nucleic acid sequence. Promoters include naturally-derived and synthetic sequences.
  • the coding nucleic acid sequence is usually located downstream of the promoter with respect to the direction of the transcription starting at the transcription initiation site.
  • constitutive promoter refers to an unregulated promoter that allows for continual transcription of the nucleic acid sequence it is operably linked to.
  • operably linked refers to a linkage of polynucleotide elements in a functional relationship.
  • a nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence.
  • a promoter or rather a transcription regulatory sequence, is operably linked to a coding sequence if it affects the transcription of the coding sequence.
  • Operably linked means that the DNA sequences being linked are typically contiguous.
  • the nucleotide sequence associated with the promoter sequence may be of homologous or heterologous origin with respect to the cell or organism, e.g. host cell, plant cell, plant, or microorganism, to be transformed. The sequence also may be entirely or partially synthetic.
  • the nucleic acid sequence associated with the promoter sequence will be expressed or silenced in accordance with promoter properties to which it is linked.
  • the associated nucleic acid may code for a protein that is desired to be expressed or suppressed throughout the organism at all times or, alternatively, at a specific time or in specific tissues, cells, or cell compartment.
  • Such nucleotide sequences particularly encode proteins conferring desirable phenotypic traits to the host cells or organism altered or transformed therewith. More particularly, the associated nucleotide sequence leads to the production of a (+)-manool synthase in the host cell or organism.
  • Target peptide refers to an amino acid sequence which targets a protein, or polypeptide to intracellular organelles, i.e., mitochondria, or plastids, or to the extracellular space (secretion signal peptide).
  • a nucleic acid sequence encoding a target peptide may be fused to the nucleic acid sequence encoding the amino terminal end, e.g., N-terminal end, of the protein or polypeptide, or may be used to replace a native targeting polypeptide.
  • primer refers to a short nucleic acid sequence that is hybridized to a template nucleic acid sequence and is used for polymerization of a nucleic acid sequence complementary to the template.
  • the term “host cell” or “transformed cell” refers to a cell (or organism) altered to harbor at least one nucleic acid molecule, for instance, a recombinant gene encoding a desired protein or nucleic acid sequence which upon transcription yields a CPP synthase protein and/or a sclareol synthase protein or which together produce (+)-manool.
  • the host cell is particularly a bacterial cell, a fungal cell or a plant cell.
  • the host cell may contain a recombinant gene which has been integrated into the nuclear or organelle genomes of the host cell. Alternatively, the host may contain the recombinant gene extra-chromosomally.
  • Homologous sequences include orthologous or paralogous sequences. Methods of identifying orthologs or paralogs including phylogenetic methods, sequence similarity and hybridization methods are known in the art and are described herein.
  • Paralogs result from gene duplication that gives rise to two or more genes with similar sequences and similar functions. Paralogs typically cluster together and are formed by duplications of genes within related plant species. Paralogs are found in groups of similar genes using pair-wise Blast analysis or during phylogenetic analysis of gene families using programs such as CLUSTAL. In paralogs, consensus sequences can be identified characteristic to sequences within related genes and having similar functions of the genes.
  • Orthologs are sequences similar to each other because they are found in species that descended from a common ancestor. For instance, plant species that have common ancestors are known to contain many enzymes that have similar sequences and functions. The skilled artisan can identify orthologous sequences and predict the functions of the orthologs, for example, by constructing a phylogenetic tree for a gene family of one species using, for example, CLUSTAL or BLAST programs.
  • selectable marker refers to any gene which upon expression may be used to select a cell or cells that include the selectable marker. Examples of selectable markers are described below. The skilled artisan will know that different antibiotic, fungicide, auxotrophic or herbicide selectable markers are applicable to different target species.
  • organism refers to any non-human multicellular or unicellular organism such as a plant, or a microorganism. Particularly, a microorganism is a bacterium, a yeast, an alga or a fungus.
  • plant is used interchangeably to include plant cells including plant protoplasts, plant tissues, plant cell tissue cultures giving rise to regenerated plants, or parts of plants, or plant organs such as roots, stems, leaves, buds, flowers, petioles, petals, pollen, ovules, embryos, tubers, fruits, seed, progeny thereof and the like. Any plant can be used to carry out the methods of an embodiment herein.
  • a method for transforming a host cell or non-human organism comprising transforming a host cell or non-human organism with a nucleic acid encoding a polypeptide having a copalyl diphosphate synthase activity and with a nucleic acid encoding a polypeptide having a sclareol synthase activity, wherein the polypeptide having the copalyl diphosphate activity comprises
  • amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21.
  • the polypeptide having the sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, and SEQ ID NO: 25.
  • a method comprising cultivating a non-human host organism or cell capable of producing a geranylgeranyl diphosphate (GGPP) and transformed to express a polypeptide having a copalyl diphosphate synthase activity wherein the polypeptide having the copalyl diphosphate synthase activity comprises
  • the polypeptide having the sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, and SEQ ID NO: 25.
  • an expression vector comprising a nucleic acid encoding a CPP synthase wherein the CPP synthase comprises a polypeptide comprising
  • amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and further the expression vector comprises a nucleic acid encoding a sclareol synthase enzyme.
  • the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25.
  • the two enzymes i.e. the CPP synthase and the sclareol synthase, could be on two different vectors or plasmids transformed in the same cell. In a further embodiment, these two enzymes could be on two different vectors or plasmids transformed in two different cells.
  • non-human host organism or cell comprising or transformed to harbor at least one nucleic acid encoding a CPP synthase wherein the CPP synthase comprises
  • a polypeptide comprising an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
  • the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25.
  • non-human host organism or cell comprising or transformed to harbor at least one nucleic acid encoding a CPP synthase wherein the CPP synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 2; and
  • At least one nucleic acid encoding a sclareol synthase enzyme wherein the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 23 and SEQ ID NO: 25.
  • the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence that has at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence having 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • the nucleic acid that encodes for a CPP synthase provided herein comprises the nucleotide sequence as set forth in SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • the CPP synthase comprises a polypeptide comprising
  • amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% sequence identity to SEQ ID NO: 14.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
  • the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 14.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
  • the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 15.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
  • the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 17.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
  • the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 18.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 20.
  • the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 20.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 21.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 21.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 21.
  • the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 21.
  • the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 21.
  • the nucleic acid encoding the sclareol synthase enzyme comprises a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
  • the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4.
  • the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4.
  • the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 4.
  • the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 4.
  • the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 4.
  • the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 5.
  • the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 5.
  • the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 5.
  • the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 5.
  • the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 5.
  • the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 23.
  • the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 23.
  • the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 23.
  • the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 23.
  • the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 23.
  • the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 25.
  • the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 25.
  • the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 25.
  • the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 25.
  • the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 25.
  • an expression vector comprising at least one of the nucleic acids described herein.
  • a non-human host organism or cell that comprises one or more expression vectors comprising a nucleic acid encoding a CPP synthase as described herein and a nucleic acid encoding a sclareol synthase as described herein.
  • a non-human host organism or cell comprising or transformed to harbor at least one nucleic acid described herein so that it heterologously expresses or over-expresses at least one polypeptide described herein.
  • the present invention provides a transformed cell or organism, in which the polypeptides are expressed in higher quantity than in the same cell or organism not so transformed.
  • transgenic host organisms or cells such as plants, fungi, prokaryotes, or cultures of higher eukaryotic cells.
  • Appropriate cloning and expression vectors for use with bacterial, fungal, yeast, plant and mammalian cellular hosts are described, for example, in Pouwels et al., Cloning Vectors: A Laboratory Manual, 1985, Elsevier, New York and Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, 1989, Cold Spring Harbor Laboratory Press.
  • Cloning and expression vectors for higher plants and/or plant cells in particular are available to the skilled person. See for example Schardl et al., Gene, 1987, 61:1-11.
  • Methods for transforming host organisms or cells to harbor transgenic nucleic acids are familiar to the skilled person.
  • for the creation of transgenic plants, current methods include: electroporation of plant protoplasts, liposome-mediated transformation, Agrobacterium-mediated transformation, polyethylene-glycol-mediated transformation, particle bombardment, microinjection of plant cells, and transformation using viruses.
  • transformed DNA is integrated into a chromosome of a non-human host organism and/or cell such that a stable recombinant system results.
  • Any chromosomal integration method known in the art may be used in the practice of the invention, including but not limited to recombinase-mediated cassette exchange (RMCE), viral site-specific chromosomal insertion, adenovirus and pronuclear injection.
  • One embodiment provides the above method for producing manool further comprising prior to step a), transforming a non-human host organism or host cell capable of producing GGPP with
  • the non-human host organism or host cell capable of producing GGPP comprises
  • the non-human host organism provided herein is a plant, a prokaryote or a fungus.
  • the non-human host provided herein is a microorganism, particularly bacteria or yeast.
  • the bacterium provided herein is Escherichia coli and the yeast is Saccharomyces cerevisiae.
  • the non-human organism provided herein is Saccharomyces cerevisiae.
  • the cell is a prokaryotic cell.
  • the cell is a bacterial cell.
  • the cell is a eukaryotic cell.
  • the eukaryotic cell is a yeast cell or a plant cell.
  • the manool can be produced by culturing the transformed bacteria or yeast described herein, including through fermentation, for example as described in Paddon et al., Nature, 2013, 496:528-532.
  • the process of producing (+)-manool produces the (+)-manool at a purity of at least 98.5%.
  • a method provided herein further comprising processing the (+)-manool to a derivative using a chemical or biochemical synthesis or a combination of both using methods commonly known in the art.
  • the (+)-manool derivative is selected from the group consisting of a hydrocarbon, an alcohol, acetal, aldehyde, acid, ether, ketone, lactone, acetate and an ester.
  • said (+)-manool derivative is a C10 to C25 compound optionally comprising one, two or three oxygen atoms.
  • the derivative is selected from the group consisting of manool acetate ((3R)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-1-penten-3-yl acetate), copalol ((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-penten-1-ol), copalol acetate ((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-penten-1-yl acetate), copalal ((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthaleny
  • a method provided herein further comprises contacting the (+)-manool with a suitable reacting system to convert said (+)-manool into a suitable (+)-manool derivative.
  • Said suitable reacting system can be of enzymatic nature (e.g. requiring one or more enzymes) or of chemical nature (e.g. requiring one or more synthetic chemicals).
  • (+)-manool may be enzymatically converted to manooloxy or gamma-ambrol using a process described in the literature, for example as set forth in U.S. Pat. No. 7,294,492, wherein said patent is hereby incorporated by reference in its entirety herein.
  • the (+)-manool derivative is copalol and its esters with C1-C5 carboxylic acids.
  • the (+)-manool derivative is a (+)-manool ester with a C1-C5 carboxylic acid.
  • the (+)-manool derivative is copalal.
  • the (+)-manool derivative is manooloxy.
  • the (+)-manool derivative is Z-11.
  • the (+)-manool derivative is an ambrol or is a mixture thereof and its esters with C1-C5 carboxylic acids, and in particular gamma-ambrol and its esters.
  • the (+)-manool derivative is Ambrox®, sclareolide (also known as 3a,6,6,9a-tetramethyldecahydronaphtho[2,1-b]furan-2(1H)-one and all its diastereoisomer and stereoisomers), 3,4a,7,7,10a-pentamethyldodecahydro-1H-benzo[f]chromen-3-ol or 3,4a,7,7,10a-pentamethyl-4a,5,6,6a,7,8,9,10,10a,10b-decahydro-1H-benzo[f]chromene and all their diastereoisomer and stereoisomers cyclic ketone and open form, (1R,2R,4aS,8aS)-1-(2-hydroxyethyl)-2,5,5,8a-tetramethyldecahydronaphthalen-2-ol DOL, gamma-ambrol.
  • manool obtained according to the invention can be processed into Manooloxy (a ketone, as per known methods) and then into ambrol (an alcohol) and ambrox (an ether), according to EP 212254.
  • Polypeptides are also meant to include truncated polypeptides provided that they keep their (+)-manool synthase activity and their sclareol synthase activity.
  • a nucleotide sequence may be obtained by modifying the sequences described herein using any method known in the art, for example by introducing any type of mutation such as deletion, insertion or substitution mutations. Examples of such methods are cited in the part of the description relative to the variant polypeptides and the methods to prepare them.
  • the percentage of identity between two peptide or nucleotide sequences is a function of the number of amino acids or nucleotide residues that are identical in the two sequences when an alignment of these two sequences has been generated. Identical residues are defined as residues that are the same in the two sequences in a given position of the alignment.
  • the percentage of sequence identity is calculated from the optimal alignment by taking the number of residues identical between the two sequences, dividing it by the total number of residues in the shortest sequence, and multiplying by 100.
  • the optimal alignment is the alignment in which the percentage of identity is the highest possible. Gaps may be introduced into one or both sequences in one or more positions of the alignment to obtain the optimal alignment.
  • Alignment for the purpose of determining the percentage of amino acid or nucleic acid sequence identity can be achieved in various ways using computer programs and for instance publicly available computer programs available on the world wide web.
  • the BLAST program (Tatusova and Madden, FEMS Microbiol Lett., 1999, 174:247-250) set to the default parameters, available from the National Center for Biotechnology Information (NCBI) at http://www.ncbi.nlm.nih.gov/BLAST/bl2seq/wblast2.cgi, can be used to obtain an optimal alignment of protein or nucleic acid sequences and to calculate the percentage of sequence identity.
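The identity calculation described above can be illustrated with a minimal Python sketch. It assumes the optimal alignment has already been produced by an external tool (for example BLAST) and is supplied as two equal-length strings in which gaps are written as "-". The function name `percent_identity` and the example sequences are hypothetical and do not come from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Identical residues are counted position by position, and the count is
    divided by the number of residues in the shortest (ungapped) sequence,
    then multiplied by 100, following the definition given in the text.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # A position counts as identical only if both residues match and
    # neither is a gap character.
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    # Ungapped lengths of the two original sequences.
    len_a = sum(1 for x in aligned_a if x != gap)
    len_b = sum(1 for y in aligned_b if y != gap)
    shortest = min(len_a, len_b)
    if shortest == 0:
        raise ValueError("sequences must contain at least one residue")

    return 100.0 * identical / shortest


if __name__ == "__main__":
    # Hypothetical toy alignment: 4 identical positions, shortest sequence
    # has 5 residues.
    print(percent_identity("MKT-AY", "MKTLAF"))
```

A value computed this way can then be compared against any of the identity thresholds recited above (for example, at least 90%).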
  • the polypeptide to be contacted with GGPP in vitro can be obtained by extraction from any organism expressing it, using standard protein or enzyme extraction technologies. If the host organism is a unicellular organism or cell releasing the polypeptide of an embodiment herein into the culture medium, the polypeptide may simply be collected from the culture medium, for example by centrifugation, optionally followed by washing steps and re-suspension in suitable buffer solutions. In another embodiment, the GGPP may be contacted with the polypeptide in the culture medium where the polypeptide may be released from the host organism, unicellular organism or cell. If the organism or cell accumulates the polypeptide within its cells, the polypeptide may be obtained by disruption or lysis of the cells. The GGPP may be contacted with the polypeptide upon further extraction of the polypeptide from the cell lysate or through contact with the cell lysate without necessarily conducting such an extraction.
  • the method of any of the above-described embodiments is carried out in vivo.
  • These embodiments provided herein are particularly advantageous since it is possible to carry out the method in vivo without previously isolating the polypeptide.
  • the reaction occurs directly within the organism or cell transformed to express said polypeptide.
  • the organism or cell is meant to “express” a polypeptide, provided that the organism or cell is transformed to harbor a nucleic acid encoding said polypeptide, this nucleic acid is transcribed to mRNA and the polypeptide is found in the host organism or cell.
  • the term “express” encompasses “heterologously express” and “over-express”, the latter referring to levels of mRNA, polypeptide and/or enzyme activity over and above what is measured in a non-transformed organism or cell.
  • a particular organism or cell is meant to be “capable of producing GGPP” when it produces GGPP naturally or when it does not produce GGPP naturally but is transformed to produce GGPP, either prior to the transformation with a nucleic acid as described herein or together with said nucleic acid.
  • Organisms or cells transformed to produce a higher amount of GGPP than the naturally occurring organism or cell are also encompassed by the “organisms or cells capable of producing GGPP”.
  • Several methods to transform organisms, for example microorganisms, so that they produce GGPP are known, for example in Schalk et al., J. Am. Chem. Soc., 2012, 134:18900-18903.
  • Non-human host organisms suitable to carry out the method of an embodiment herein in vivo may be any non-human multicellular or unicellular organisms.
  • the non-human host organism used to carry out an embodiment herein in vivo is a plant, a prokaryote or a fungus. Any plant, prokaryote or fungus can be used. Particularly useful plants are those that naturally produce high amounts of terpenes.
  • the non-human host organism used to carry out the method of an embodiment herein in vivo is a microorganism. Any microorganism can be used but according to an even more particular embodiment said microorganism is a bacterium or yeast. Most particularly, said bacterium is E. coli and said yeast is Saccharomyces cerevisiae.
  • these organisms do not produce GGPP naturally or only in small amounts.
  • these organisms have to be transformed to produce said precursor or engineered to produce said precursor in larger amounts. They can be so transformed either before the modification with the nucleic acid described according to any of the above embodiments or simultaneously, as explained above.
  • the non-human host organism or cell capable of producing GGPP is transformed with a nucleic acid encoding a CPP synthase or variant thereof as described herein and a nucleic acid encoding a sclareol synthase or variant thereof as described herein, wherein the non-human host organism or cell capable of producing GGPP has been engineered to over-express a GGPP synthase or transformed with a nucleic acid encoding a GGPP synthase.
  • the non-human host organism or cell comprises a nucleic acid encoding a GGPP synthase, a nucleic acid encoding a CPP synthase or variant thereof as described herein, and a nucleic acid encoding a sclareol synthase or variant thereof as described herein, wherein at least one of said nucleic acids is heterologous to the non-human host organism or cell.
  • Isolated higher eukaryotic cells can also be used, instead of complete organisms, as hosts to carry out the method of an embodiment herein in vivo.
  • Suitable eukaryotic cells may be any non-human cell, but are particularly plant or fungal cells.
  • polypeptides having a CPP synthase activity used in any of the embodiments described herein or encoded by the nucleic acids described herein may be variants obtained by genetic engineering, provided that said variant keeps its CPP synthase activity.
  • polypeptides having a sclareol synthase activity used in any of the embodiments described herein or encoded by the nucleic acids described herein may be variants obtained by genetic engineering, provided that said variant keeps its sclareol synthase activity or has manool synthase activity.
  • polypeptide is intended as a polypeptide or peptide fragment that encompasses the amino acid sequences identified herein, as well as truncated or variant polypeptides, provided that they keep their CPP synthase activity and their sclareol synthase activity and/or manool synthase activity.
  • variant polypeptides are naturally occurring proteins that result from alternate mRNA splicing events or from proteolytic cleavage of the polypeptides described herein. Variations attributable to proteolysis include, for example, differences in the N- or C-termini upon expression in different types of host cells, due to proteolytic removal of one or more terminal amino acids from the polypeptides of an embodiment herein. Polypeptides encoded by a nucleic acid obtained by natural or artificial mutation of a nucleic acid of an embodiment herein, as described thereafter, are also encompassed by an embodiment herein.
  • Polypeptide variants resulting from a fusion of additional peptide sequences at the amino and carboxyl terminal ends can also be used in the methods of an embodiment herein.
  • a fusion can enhance expression of the polypeptides, be useful in the purification of the protein or improve the enzymatic activity of the polypeptide in a desired environment or expression system.
  • additional peptide sequences may be signal peptides, for example.
  • encompassed herein are methods using variant polypeptides, such as those obtained by fusion with other oligo- or polypeptides and/or those which are linked to signal peptides.
  • Polypeptides resulting from a fusion with another functional protein, such as another protein from the terpene biosynthesis pathway can also advantageously be used in the methods of an embodiment herein.
  • a variant may also differ from the polypeptide of an embodiment herein by attachment of modifying groups which are covalently or non-covalently linked to the polypeptide backbone.
  • the variant also includes a polypeptide which differs from the polypeptide described herein by introduced N-linked or O-linked glycosylation sites, and/or an addition of cysteine residues.
  • the skilled artisan will recognize how to modify an amino acid sequence and preserve biological activity.
  • the present invention provides a method for preparing a variant polypeptide having a CPP synthase activity or a sclareol synthase activity or a manool synthase activity, as described in any of the above embodiments, and comprising the steps of:
  • the variant polypeptide prepared when in combination with either a polypeptide with CPP synthase activity or a sclareol synthase activity is capable of producing (+)-manool.
  • a large number of mutant nucleic acid sequences may be created, for example by random mutagenesis, site-specific mutagenesis, or DNA shuffling.
  • the detailed procedures of gene shuffling are found in Stemmer, DNA shuffling by random fragmentation and reassembly: in vitro recombination for molecular evolution (Proc Natl Acad Sci USA., 1994, 91(22): 10747-10751).
  • DNA shuffling refers to a process of random recombination of known sequences in vitro, involving at least two nucleic acids selected for recombination.
  • mutations can be introduced at particular loci by synthesizing oligonucleotides containing a mutant sequence, flanked by restriction sites enabling ligation to fragments of the native sequence. Following ligation, the resulting reconstructed sequence encodes an analog having the desired amino acid insertion, substitution, or deletion.
  • oligonucleotide-directed site-specific mutagenesis procedures can be employed to provide an altered gene wherein predetermined codons can be altered by substitution, deletion or insertion.
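As an in silico illustration of altering predetermined codons by substitution, deletion or insertion, the following Python sketch edits a coding sequence at a chosen codon position. It operates on strings only and is not a laboratory protocol. The function names and the example sequence are hypothetical and are not taken from the sequence listing.

```python
def _check_frame(coding_seq: str) -> None:
    # The helpers below assume an in-frame coding sequence.
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")


def _check_codon(codon: str) -> None:
    if len(codon) != 3:
        raise ValueError("a codon must be exactly 3 nucleotides long")


def substitute_codon(coding_seq: str, codon_index: int, new_codon: str) -> str:
    """Replace the codon at zero-based position codon_index with new_codon."""
    _check_frame(coding_seq)
    _check_codon(new_codon)
    start = codon_index * 3
    if not 0 <= start < len(coding_seq):
        raise IndexError("codon index out of range")
    return coding_seq[:start] + new_codon + coding_seq[start + 3:]


def delete_codon(coding_seq: str, codon_index: int) -> str:
    """Remove the codon at zero-based position codon_index (in-frame deletion)."""
    _check_frame(coding_seq)
    start = codon_index * 3
    if not 0 <= start < len(coding_seq):
        raise IndexError("codon index out of range")
    return coding_seq[:start] + coding_seq[start + 3:]


def insert_codon(coding_seq: str, codon_index: int, new_codon: str) -> str:
    """Insert new_codon before the codon at zero-based position codon_index."""
    _check_frame(coding_seq)
    _check_codon(new_codon)
    start = codon_index * 3
    if not 0 <= start <= len(coding_seq):
        raise IndexError("codon index out of range")
    return coding_seq[:start] + new_codon + coding_seq[start:]


if __name__ == "__main__":
    seq = "ATGGCCAAA"  # hypothetical sequence: Met-Ala-Lys
    print(substitute_codon(seq, 1, "GAT"))  # Ala codon replaced by an Asp codon
    print(delete_codon(seq, 1))
    print(insert_codon("ATGAAA", 1, "GCC"))
```

Because every edit adds or removes whole codons, the reading frame downstream of the modification is preserved.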
  • Mutant nucleic acids may be obtained and separated, which may be used for transforming a host cell according to standard procedures, for example such as disclosed in the present examples.
  • in step (d), the polypeptide obtained in step (c) is screened for at least one modified property, for example a desired modified enzymatic activity.
  • desired enzymatic activities for which an expressed polypeptide may be screened include enhanced or reduced enzymatic activity, as measured by KM or Vmax value, modified regio-chemistry or stereochemistry and altered substrate utilization or product distribution.
  • the screening of enzymatic activity can be performed according to procedures familiar to the skilled person and those disclosed in the present examples.
  • Step (e) provides for repetition of process steps (a)-(d), which may preferably be performed in parallel. Accordingly, by creating a significant number of mutant nucleic acids, many host cells may be transformed with different mutant nucleic acids at the same time, allowing for the subsequent screening of an elevated number of polypeptides. The chances of obtaining a desired variant polypeptide may thus be increased at the discretion of the skilled person.
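The KM and Vmax values mentioned above for the screening step can be related to a reaction rate with the standard Michaelis-Menten equation, v = Vmax·[S] / (KM + [S]). The Python sketch below assumes that a screened variant follows simple Michaelis-Menten kinetics, which the text does not state. The function name and the numerical values are hypothetical.

```python
def michaelis_menten_rate(substrate_conc: float, v_max: float, k_m: float) -> float:
    """Initial reaction rate under the standard Michaelis-Menten model.

    substrate_conc and k_m must be in the same concentration units; the
    returned rate is in the units of v_max.
    """
    if substrate_conc < 0 or v_max < 0 or k_m <= 0:
        raise ValueError("concentrations and rates must be non-negative, k_m > 0")
    return v_max * substrate_conc / (k_m + substrate_conc)


if __name__ == "__main__":
    # Hypothetical comparison of a parent enzyme and a variant with a lower
    # K_M at the same substrate concentration.
    substrate = 2.0
    parent = michaelis_menten_rate(substrate, v_max=10.0, k_m=2.0)
    variant = michaelis_menten_rate(substrate, v_max=10.0, k_m=0.5)
    print(parent, variant)
```

Under this model, a variant with a lower KM at unchanged Vmax gives a higher rate at sub-saturating substrate concentrations, which is one way an "enhanced" activity could show up in a screen.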
  • DNA sequence polymorphisms may exist within a given population, which may lead to changes in the amino acid sequence of the polypeptides disclosed herein.
  • Such genetic polymorphisms may exist in cells from different populations or within a population due to natural allelic variation. Allelic variants may also include functional equivalents.
  • nucleic acid encoding the polypeptide of an embodiment herein is a useful tool to modify non-human host organisms or cells intended to be used when the method is carried out in vivo.
  • a nucleic acid encoding a polypeptide according to any of the above-described embodiments is therefore also provided herein.
  • nucleic acid of an embodiment herein can be defined as including deoxyribonucleotide or ribonucleotide polymers in either single- or double-stranded form (DNA and/or RNA).
  • nucleotide sequence should also be understood as comprising a polynucleotide molecule or an oligonucleotide molecule in the form of a separate fragment or as a component of a larger nucleic acid.
  • Nucleic acids of an embodiment herein also encompass certain isolated nucleotide sequences including those that are substantially free from contaminating endogenous material.
  • the nucleic acid of an embodiment herein may be truncated, provided that it encodes a polypeptide encompassed herein, as described above.
  • the nucleic acid of an embodiment herein that encodes for a CPP synthase can be either present naturally in a plant such as Salvia miltiorrhiza, or other species, such as Coleus forskohlii, Triticum aestivum, Marrubium vulgare or Rosmarinus officinalis, or be obtained by modifying SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • nucleic acid of an embodiment herein that encodes for a sclareol synthase can be either present naturally in a plant such as Salvia sclarea, or other species such as Nicotiana glutinosa, or can be obtained by modifying SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
  • Mutations may be any kind of mutations of these nucleic acids, such as point mutations, deletion mutations, insertion mutations and/or frame shift mutations.
  • a variant nucleic acid may be prepared in order to adapt its nucleotide sequence to a specific expression system. For example, bacterial expression systems are known to more efficiently express polypeptides if amino acids are encoded by particular codons.
  • nucleic acid sequences encoding the CPP synthase and the sclareol synthase may be optimized for increased expression in the host cell.
  • nucleotides of an embodiment herein may be synthesized using codons particular to a host for improved expression.
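The idea of synthesizing a sequence with host-preferred codons can be sketched as a simple back-translation. The example below is a minimal illustration in Python. The table `PREFERRED_CODONS` is an assumption made for this example: it picks one valid standard-code codon per amino acid, loosely reflecting codons often described as frequent in E. coli, and it is not codon-usage data from this document. The function name and the example peptide are likewise hypothetical.

```python
# Illustrative one-codon-per-amino-acid table (assumption, not measured
# usage data). Every codon encodes the stated residue under the standard
# genetic code.
PREFERRED_CODONS = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"


def back_translate(protein: str, add_stop: bool = True) -> str:
    """Encode a protein sequence using one preferred codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in PREFERRED_CODONS:
            raise ValueError(f"unsupported residue: {residue!r}")
        codons.append(PREFERRED_CODONS[residue])
    if add_stop:
        codons.append(STOP_CODON)
    return "".join(codons)


if __name__ == "__main__":
    # Hypothetical three-residue peptide, not a sequence from the listing.
    print(back_translate("MKV"))
```

Practical codon optimization usually weighs further factors, such as mRNA secondary structure and avoidance of unwanted restriction sites, which this sketch deliberately ignores.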
  • Another important tool for transforming host organisms or cells suitable to carry out the method of an embodiment herein in vivo is an expression vector comprising a nucleic acid according to any embodiment herein. Such a vector is therefore also provided herein.
  • Recombinant non-human host organisms and cells transformed to harbor at least one nucleic acid of an embodiment herein so that it heterologously expresses or over-expresses at least one polypeptide of an embodiment herein are also very useful tools to carry out the method of an embodiment herein. Such non-human host organisms and cells are therefore also provided herein.
  • a nucleic acid according to any of the above-described embodiments can be used to transform the non-human host organisms and cells and the expressed polypeptide can be any of the above-described polypeptides.
  • Non-human host organisms of an embodiment herein may be any non-human multicellular or unicellular organisms.
  • the non-human host organism is a plant, a prokaryote or a fungus. Any plant, prokaryote or fungus is suitable to be transformed according to the methods provided herein. Particularly useful plants are those that naturally produce high amounts of terpenes.
  • the non-human host organism is a microorganism.
  • Any microorganism is suitable to be used herein, but according to an even more particular embodiment said microorganism is a bacterium or yeast.
  • said bacterium is E. coli and said yeast is Saccharomyces cerevisiae.
  • Isolated higher eukaryotic cells can also be transformed, instead of complete organisms.
  • By “higher eukaryotic cells” we mean here any non-human eukaryotic cell except yeast cells.
  • Particular higher eukaryotic cells are plant cells or fungal cells.
  • Embodiments provided herein include, but are not limited to cDNA, genomic DNA and RNA sequences.
  • Genes including the polynucleotides of an embodiment herein can be cloned on the basis of the available nucleotide sequence information, such as found in the attached sequence listing, and by methods known in the art. These include e.g. the design of DNA primers representing the flanking sequences of such a gene, of which one is generated in sense orientation and initiates synthesis of the sense strand and the other is created in reverse complementary fashion and generates the antisense strand. Thermostable DNA polymerases such as those used in polymerase chain reaction are commonly used to carry out such experiments. Alternatively, DNA sequences representing genes can be chemically synthesized and subsequently introduced into DNA vector molecules that can be multiplied by compatible bacteria such as E. coli.
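  • The primer design described above (one primer in sense orientation, the other as the reverse complement of the opposite flank) can be sketched as follows. The function names and the default primer length are assumptions for illustration; melting temperature, GC content and added restriction sites are not considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(template, length=20):
    """Pick a sense primer from the 5' flank and an antisense primer from the 3' flank."""
    template = template.upper()
    if len(template) < 2 * length:
        raise ValueError("template is too short for two non-overlapping primers")
    forward = template[:length]                        # same orientation as the sense strand
    reverse = reverse_complement(template[-length:])   # anneals to the sense strand
    return forward, reverse


if __name__ == "__main__":
    # Toy template, not taken from the sequence listing.
    print(design_primers("ATGGCCAAATTTGGGTAA", length=5))
```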
  • nucleic acid sequences obtained by mutations of SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32, and SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34; such mutations can be routinely made. It is clear to the skilled artisan that mutations, deletions, insertions, and/or substitutions of one or more nucleotides can be introduced into these DNA sequence
  • the nucleic acid sequences of an embodiment herein encoding CPP synthase and the sclareol synthase proteins can be inserted in expression vectors and/or be contained in chimeric genes inserted in expression vectors, to produce CPP synthase and sclareol synthase in a host cell or host organism.
  • the vectors for inserting transgenes into the genome of host cells are well known in the art and include plasmids, viruses, cosmids and artificial chromosomes.
  • Binary or co-integration vectors into which a chimeric gene is inserted are also used for transforming host cells.
  • An embodiment provided herein provides recombinant expression vectors comprising nucleic acids encoding a CPP synthase and a sclareol synthase, each separately operably linked to associated nucleic acid sequences such as, for instance, promoter sequences.
  • the promoter sequence may already be present in a vector so that the nucleic acid sequence which is to be transcribed is inserted into the vector downstream of the promoter sequence.
  • Vectors are typically engineered to have an origin of replication, a multiple cloning site, and a selectable marker.
  • GGPP geranylgeranyl diphosphate
  • CPP copalyl diphosphate
  • the codon usage of the cDNAs encoding the five CPP synthases was modified for optimal expression in E. coli (DNA 2.0, Menlo Park, Calif. 94025) and the NdeI and KpnI restriction sites were added at the 5′-end and 3′-end, respectively.
  • the cDNAs were designed to express the recombinant CPP synthases with deletion of the predicted signal peptide (58, 63, 59, 63 and 67 amino acids for SmCPS, CfCPS1, TaTps1, MvCps3 and RoCPS1, respectively).
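  • The N-terminal truncations listed above can be expressed as a small helper. The deletion lengths are those given in the text; the function name, the toy input sequence and the choice to prepend a start methionine when the truncated sequence lacks one are assumptions of this sketch.

```python
# Number of N-terminal residues removed per enzyme, as listed in the text.
SIGNAL_PEPTIDE_LENGTH = {
    "SmCPS": 58, "CfCPS1": 63, "TaTps1": 59, "MvCps3": 63, "RoCPS1": 67,
}


def truncate_n_terminus(protein, n_removed, add_start_met=True):
    """Drop the first n_removed residues and optionally restore a start Met."""
    if not 0 <= n_removed < len(protein):
        raise ValueError("n_removed must be within the sequence length")
    mature = protein[n_removed:]
    # Assumption: a start codon is needed for expression of the truncated form.
    if add_start_met and not mature.startswith("M"):
        mature = "M" + mature
    return mature


if __name__ == "__main__":
    toy = "MAAAKLVQ"  # toy sequence, not from the sequence listing
    print(truncate_n_terminus(toy, 3))
```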
  • the sclareol synthase from Salvia sclarea (SsScS) was used (NCBI accession No AET21246.1, WO2009095366).
  • the codon usage of the cDNA was optimized for E. coli expression (DNA 2.0, Menlo Park, Calif. 94025), the first 50 N-terminal codons were removed and the NdeI and KpnI restriction sites were added at the 5′-end and 3′-end, respectively. All the cDNAs were synthesized in vitro and cloned in the pJ208 or pJ401 plasmid (DNA 2.0, Menlo Park, Calif. 94025, USA).
  • the modified SmCPS-encoding cDNA (SmCPS2) and sclareol synthase (SsScS)-encoding cDNA (1132-2-5_opt) were digested with NdeI and KpnI and ligated into the pETDuet-1 plasmid providing the pETDuet-SmCPS2 and pETDuet-1132opt expression plasmids, respectively.
  • the CrtE gene from Pantoea agglomerans (NCBI accession M38424.1) encoding for a GGPP synthase (NCBI accession number AAA24819.1) was used.
  • the CrtE gene was synthesized with codon optimization and addition of the NcoI and BamHI restriction enzyme recognition sites at the 3′ and 5′ ends (DNA 2.0, Menlo Park, Calif.
  • the SmCPS2 encoding cDNA was digested with NdeI and KpnI and ligated into the pETDuet-1-CrtE plasmid thus providing the pETDuet-CrtE-SmCPS2 construct.
  • the optimized cDNA (1132-2-5_opt) encoding for the truncated SsScS was then introduced in the pETDuet-CrtE-SmCPS2 plasmid using the In-Fusion® technique (Clontech, Takara Bio Europe).
  • the pETDuet-1132opt was used as template in a PCR amplification using the forward primer SmCPS2-1132Inf_F1 5′-CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGCGAAAATGAAGGAGAACTTTAAACG-3′ (SEQ ID NO: 9) and the reverse primer 1132-pET_Inf_R1 5′-GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCATGTCCTCT-3′ (SEQ ID NO: 10).
  • the PCR product was ligated in the plasmid pETDuet-CrtE-SmCPS2 digested with the KpnI and XhoI restriction enzymes and using the In-Fusion® Dry-Down PCR Cloning Kit (Clontech, Takara Bio Europe), providing the new plasmid pETDuet-CrtE-SmCPS2-SsScS.
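  • As a quick consistency check on the cloning step above, the two primers (SEQ ID NO: 9 and SEQ ID NO: 10) can be scanned for the recognition sequences of the restriction enzymes named in these examples. The recognition sequences are the commonly cited ones for these enzymes; `find_sites` is a hypothetical helper, and only the given strand is scanned because all five sites are palindromic.

```python
# Commonly cited recognition sequences for the enzymes named in the examples.
SITES = {
    "NdeI": "CATATG", "KpnI": "GGTACC", "XhoI": "CTCGAG",
    "NcoI": "CCATGG", "BamHI": "GGATCC",
}

FORWARD = ("CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGCG"
           "AAAATGAAGGAGAACTTTAAACG")                          # SEQ ID NO: 9
REVERSE = "GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCATGTCCTCT"  # SEQ ID NO: 10


def find_sites(seq, sites=SITES):
    """Map enzyme name to 0-based start positions of its site in seq."""
    seq = seq.upper()
    hits = {}
    for name, pattern in sites.items():
        positions = []
        start = seq.find(pattern)
        while start != -1:
            positions.append(start)
            start = seq.find(pattern, start + 1)  # allow overlapping matches
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    print("forward:", find_sites(FORWARD))
    print("reverse:", find_sites(REVERSE))
```

  • The forward primer carries a KpnI site and the reverse primer an XhoI site, matching the KpnI/XhoI digestion of the recipient plasmid described above.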
  • the CrtE gene is under the control of the first T7 promoter of the pETDuet plasmid and the CPP synthase and sclareol synthase encoding cDNAs are organized in a bi-cistronic construct under the control of the second T7 promoter.
  • the pETDuet-CrtE-SmCPS2-SsScS plasmid was used as template for construction of new expression plasmids carrying the cDNAs encoding the four other CPP synthases.
  • the SmCPS2 cDNA was replaced by one of the four new CPP synthase encoding cDNA using an NdeI-KpnI restriction digestion-ligation approach providing the new plasmids pETDuet-CrtE-CfCPS1del63-SsScS, pETDuet-CrtE-TaTps1del59-SsScS, pETDuet-CrtE-MvCps3del63-SsScS and pETDuet-CrtE-RoCPS1del67-SsScS.
  • the expression plasmids (pETDuet-SmCPS2 or pETDuet-1132opt) were used to transform BL21(DE3) E. coli cells (Novagen, Madison, Wis.). Single colonies of transformed cells were used to inoculate 25 ml LB medium. After 5 to 6 hours incubation at 37° C., the cultures were transferred to a 20° C. incubator and left 1 hour for equilibration. Expression of the protein was then induced by the addition of 0.1 mM IPTG and the culture was incubated overnight at 20° C.
  • the cells were collected by centrifugation, re-suspended in 0.1 volume of 50 mM MOPSO (3-morpholino-2-hydroxypropanesulfonic acid sodium salt, 3-(N-morpholinyl)-2-hydroxypropanesulfonic acid sodium salt) buffer at pH 7, 10% glycerol, 1 mM DTT and lysed by sonication.
  • MOPSO 3-morpholino-2-hydroxypropanesulfonic acid sodium salt
  • the extracts were cleared by centrifugation (30 min at 20,000 g) and the supernatants containing the soluble proteins were used for further experiments.
  • Enzymatic assays were performed in Teflon sealed glass tubes using 50 to 100 µl of protein extract in a final volume of 1 mL of 50 mM MOPSO pH 7, 10% glycerol supplemented with 20 mM MgCl2 and 50 to 200 µM purified geranylgeranyl diphosphate (GGPP) (prepared as described by Keller and Thompson, J. Chromatogr, 1993, 645(1):161-167). The tubes were incubated 5 to 48 hours at 30° C. and the enzyme products were extracted twice with one volume of pentane. After concentration under a nitrogen flux, the extracts were analyzed by GC-MS and compared to extracts from control proteins (obtained from cells transformed with the empty plasmid).
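  • For orientation, the substrate amounts implied by the assay conditions above follow from concentration times volume. The sketch below is simple unit arithmetic; the helper name is an assumption.

```python
def amount_nmol(concentration_um, volume_ml):
    """Amount of solute in nmol: (umol/L) x (mL) = nmol."""
    if concentration_um < 0 or volume_ml < 0:
        raise ValueError("concentration and volume must be non-negative")
    return concentration_um * volume_ml


if __name__ == "__main__":
    # 50 to 200 uM GGPP in a 1 mL assay
    print(amount_nmol(50, 1.0), amount_nmol(200, 1.0))
    # 20 mM MgCl2 (20,000 uM) in the same 1 mL volume
    print(amount_nmol(20_000, 1.0))
```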
  • GC-MS analyses were performed on an Agilent 6890 series GC system equipped with a DB1 column (30 m × 0.25 mm × 0.25 µm film thickness; Agilent) and coupled with a 5975 series mass spectrometer.
  • the carrier gas was helium at a constant flow of 1 ml/min.
  • Injection was in split-less mode with the injector temperature set at 260° C. and the oven temperature was programmed from 100° C. to 225° C. at 10° C./min and to 280° C. at 30° C./min.
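  • The duration of the oven program above can be worked out from its two ramps. The sketch assumes no hold times, since none are stated for this program; the function names are hypothetical.

```python
def ramp_minutes(start_c, end_c, rate_c_per_min):
    """Time needed for one linear temperature ramp."""
    if rate_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return abs(end_c - start_c) / rate_c_per_min


def program_minutes(segments, hold_minutes=0.0):
    """Total time for a list of (start, end, rate) ramps plus any holds."""
    return hold_minutes + sum(ramp_minutes(*segment) for segment in segments)


# 100 C to 225 C at 10 C/min, then to 280 C at 30 C/min (from the text).
OVEN_PROGRAM = [(100, 225, 10), (225, 280, 30)]

if __name__ == "__main__":
    print(f"{program_minutes(OVEN_PROGRAM):.2f} min")
```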
  • the identities of the products were confirmed based on the concordance of the retention indices and mass spectra of authentic standards.
  • the in vivo production of manool using cultures of whole cells was evaluated using E. coli cells.
  • the CrtE gene inserted in the co-expression plasmids described in Example 2 encodes for an enzyme having GGPP synthase activity that uses farnesyl-diphosphate (FPP) to produce geranylgeranyl diphosphate (GGPP).
  • FPP farnesyl-diphosphate
  • coli acetoacetyl-CoA thiolase (atoB), a Staphylococcus aureus HMG-CoA synthase (mvaS), a Staphylococcus aureus HMG-CoA reductase (mvaA) and a Saccharomyces cerevisiae FPP synthase (ERG20) genes was synthesized in vitro (DNA2.0, Menlo Park, Calif., USA) and ligated into the NcoI-BamHI digested pACYCDuet-1 vector (Invitrogen) yielding pACYC-29258.
  • a second operon containing a mevalonate kinase (MvaK1), a phosphomevalonate kinase (MvaK2), a mevalonate diphosphate decarboxylase (MvaD), and an isopentenyl diphosphate isomerase (idi) was amplified from genomic DNA of Streptococcus pneumoniae (ATCC BAA-334) and ligated into the second multicloning site of pACYC-29258 providing the plasmid pACYC-29258-4506.
  • This plasmid thus contains the genes encoding all enzymes of the biosynthetic pathway leading from acetyl-coenzyme A to FPP.
  • KRX E. coli cells were co-transformed with the plasmid pACYC-29258-4506 and one plasmid selected from pETDuet-CrtE-SmCPS2-SsScS, pETDuet-CrtE-CfCPS1del63-SsScS, pETDuet-CrtE-TaTps1del59-SsScS, pETDuet-CrtE-MvCps3del63-SsScS, or pETDuet-CrtE-RoCPS1del67-SsScS.
  • Transformed cells were selected on carbenicillin (50 µg/ml) and chloramphenicol (34 µg/ml) LB-agarose plates. Single colonies were used to inoculate 5 mL liquid LB medium supplemented with the same antibiotics. The cultures were incubated overnight at 37° C. The next day 2 mL of TB medium supplemented with the same antibiotics were inoculated with 0.2 mL of the overnight culture. After 6 hours incubation at 37° C., the culture was cooled down to 28° C. and 0.1 mM IPTG, 0.2% rhamnose and 1:10 volume of decane were added to each tube. The cultures were incubated for 48 hours at 28° C.
  • the cultures were then extracted twice with 2 volumes of MTBE (methyl tert-butyl ether), the organic phases were concentrated to 500 µL and analyzed by GC-MS as described above in Example 4 except for the oven temperature program, which was a 1 min hold at 100° C., followed by a temperature gradient of 10° C./min to 220° C. and 20° C./min to 300° C.
  • MTBE methyl tert-butyl ether
  • E. coli culture was prepared in the conditions described in Example 5, using the SmCPS/SsScS enzyme combination, except that the decane organic phase was replaced by 50 g/L Amberlite XAD-4 for solid phase extraction.
  • the culture medium was filtered to recover the resin.
  • the resin was then washed with 3 column volumes of water, and eluted using 3 column volumes of MTBE.
  • the product was then further purified by flash chromatography on silica gel using a mobile phase composed of heptane:MTBE 8:2 (v/v).
  • the structure of manool was confirmed by 1H- and 13C-NMR using a Bruker Avance 500 MHz spectrometer.
  • Sclareol synthases from the plant Nicotiana glutinosa are described in WO 2014/022434 and are shown to produce sclareol from labdenediol diphosphate (LPP).
  • LPP labdenediol diphosphate
  • Two of the sclareol synthases described in WO 2014/022434 were evaluated, NgSCS-del29 (corresponding to SEQ ID NO: 78 in WO 2014/0224) and NgSCS-del38 (corresponding to SEQ ID NO: 40 of WO 2014/022434) for the production of (+)-manool under conditions similar to Example 5.
  • a cDNA encoding for NgSCS-del29 was designed with a codon usage optimal for E. coli expression and including the KpnI and XhoI sites at the 5′-end and 3′-end respectively. This DNA was synthesized by DNA 2.0 (Newark, CA 94560).
  • the pETDuet-CrtE-SmCPS2-SsScS plasmid (Example 2) was used as template for construction of a new expression plasmid.
  • the pETDuet-CrtE-SmCPS2-SsScS plasmid was digested with the KpnI and XhoI restriction enzymes to replace the SsScS cDNA with the NgSCS-del29 cDNA, providing the new pETDuet-CrtE-SmCPS2-del29 plasmid.
  • KRX E. coli cells (Promega) were co-transformed with the plasmid pACYC-29258-4506 (Example 5) and the pETDuet-CrtE-SmCPS2-del29 plasmid. Transformed cells were selected and cultivated in conditions for production of diterpene as described in Example 5. The production of diterpenes was evaluated using GC-MS analysis and the diterpene compounds produced were quantified using an internal standard (alpha-longipinene). With the new combination of the diterpene synthases SmCPS2 and NgSCS-del29, manool was produced by transformed E. coli cells (FIG. 6).
  • the combination of the diterpene synthases SmCPS2 and NgSCS-del38 did not produce manool under the experimental conditions used.
  • at least one of the Nicotiana glutinosa sclareol synthases tested can also be used to produce manool from CPP.
  • the quantities produced using the Nicotiana glutinosa synthase were much lower than with the SsScS synthase (see table below).
  • manool acetate obtained in the above examples was converted into its trienes according to the following experimental part (herein below as example into its Sclarene and (Z+E)-Biformene):
  • copalyl acetate obtained in the above examples was converted into Copalol according to the following experimental part:
  • the codon usage of the DNA encoding for different CPP synthases was modified for optimal expression in S. cerevisiae.
  • the DNA sequences were designed to express the recombinant CPP synthase with deletion of the predicted peptide signal (58, 63, 59, 63 and 67 amino acids for SmCPS, CfCPS1, TaTps1, MvCps3 and RoCPS1, respectively).
  • the NgSCS-del38, NgSCS-del29 and SaSCS DNA sequences were also codon optimized for S. cerevisiae expression.
  • For expression of the different genes in S. cerevisiae, a set of plasmids was constructed in vivo using yeast endogenous homologous recombination as previously described in Kuijpers et al., Microb Cell Fact., 2013, 12:47. Each plasmid is composed of six DNA fragments which were used for S. cerevisiae co-transformation. The fragments were:
  • tHMG1 truncated HMG1
  • GAL4 under the control of a mutated version of its own promoter, as described in Griggs and Johnston, Proc Natl Acad Sci USA, 1991, 88:8597-8601, was integrated upstream of the ERG9 promoter region.
  • the endogenous promoter of ERG9 was replaced by the yeast promoter region of CTR3 generating the strain YST035.
  • YST035 was mated with the strain CEN.PK2-1D (Euroscarf, Frankfurt, Germany) obtaining a diploid strain termed YST045.
  • YST045 was transformed with the above described fragments required for in vivo plasmid assembly.
  • Yeast transformations were performed with the lithium acetate protocol as described in Gietz and Woods, Methods Enzymol., 2002, 350:87-96. Transformation mixtures were plated on SmLeu-media containing 6.7 g/L of Yeast Nitrogen Base without amino acids (BD Difco, New Jersey, USA), 1.6 g/L Dropout supplement without leucine (Sigma Aldrich, Missouri, USA), 20 g/L glucose and 20 g/L agar. Plates were incubated for 3-4 days at 30° C. Single cells were used to produce manool in cultures as described in Westfall et al., Proc Natl Acad Sci USA, 2012, 109:E111-118.
  • manool was produced with some combinations of type II and type I diterpene synthases.
  • the production of manool was evaluated using GC-MS analysis and quantified using an internal standard.
  • the table below shows the quantities of manool produced relative to the SmCPS/SsScS combination (under these experimental conditions, the concentration of manool produced by cells expressing the SmCPS and the SsScS was 100 to 250 mg/L, the highest quantity of manool produced).
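  • Quantification against an internal standard and normalization to a reference enzyme combination can be sketched as below. The peak areas, the standard concentration, the response factor of 1.0 and the function names are illustrative assumptions, not values from these experiments.

```python
def quantify(analyte_area, standard_area, standard_mg_per_l, response_factor=1.0):
    """Estimate analyte concentration from peak areas and a known internal standard."""
    if standard_area <= 0 or response_factor <= 0:
        raise ValueError("standard area and response factor must be positive")
    return analyte_area / standard_area * standard_mg_per_l / response_factor


def relative_to_reference(titers, reference):
    """Express each titer as a percentage of the reference combination."""
    ref = titers[reference]
    if ref <= 0:
        raise ValueError("reference titer must be positive")
    return {name: 100.0 * value / ref for name, value in titers.items()}


if __name__ == "__main__":
    # Hypothetical peak areas, for illustration only.
    titers = {
        "SmCPS/SsScS": quantify(200.0, 100.0, 50.0),
        "other combination": quantify(50.0, 100.0, 50.0),
    }
    print(relative_to_reference(titers, "SmCPS/SsScS"))
```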
  • SEQ ID NO: 1 SmCPS, full-length copalyl diphosphate synthase from Salvia miltiorrhiza MASLSSTILSRSPAARRRITPASAKLHRPECFATSAWMGSSSKNLSL SYQLNHKKISVATVDAPQVHDHDGTTVHQGHDAVKNIEDPIEYIRTL LRTTGDGRISVSPYDTAWVAMIKDVEGRDGPQFPSSLEWIVQNQLED GSWGDQKLFCVYDRLVNTIACVVALRSWNVHAHKVKRGVTYKENVDK LMEGNEEHMTCGFEWFPALLQKAKSLGIEDLPYDSPAVQEVYHVREQ KLKRIPLEIMHKIPTSLLFSLEGLENLDWDKLLKLQSADGSFLTSPS STAFAFMQTKDEKCYQFIKNTIDTFNGGAPHTYPVDWGRLWAIDRLQ RLGISRFFEPEIADCLSHIHKFWTDKGVFSGRESEFCDIDDTSMGMR

Abstract

Described herein are methods of producing (+)-manool, the methods including: contacting geranylgeranyl diphosphate with a copalyl diphosphate (CPP) synthase to form a (9S, 10S)-copalyl diphosphate and contacting the CPP with a sclareol synthase enzyme to form (+)-manool and derivatives thereof. Also described herein are nucleic acids encoding CPP synthases and sclareol synthases for use in the methods. Further described herein are expression vectors and non-human host organisms and cells including nucleic acids encoding a CPP synthase and a sclareol synthase as described herein.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a Divisional Application of U.S. Non-Provisional application Ser. No. 16/472,120 filed Jun. 20, 2019, which claims priority to U.S. National Phase Application of PCT/EP2017/083372, filed Dec. 18, 2017, which claims the benefit of priority to European Patent Application No. 16206349.9, filed Dec. 22, 2016, the entire contents of which are hereby incorporated by reference herein.
  • TECHNICAL FIELD
  • Provided herein are biochemical methods of producing (+)-manool using a copalyl diphosphate synthase and a sclareol synthase.
  • BACKGROUND
  • Terpenes are found in most organisms (microorganisms, animals and plants). These compounds are made up of five carbon units called isoprene units and are classified by the number of these units present in their structure. Thus monoterpenes, sesquiterpenes and diterpenes are terpenes containing 10, 15 and 20 carbon atoms respectively. Sesquiterpenes, for example, are widely found in the plant kingdom. Many sesquiterpene molecules are known for their flavor and fragrance properties and their cosmetic, medicinal and antimicrobial effects. Numerous sesquiterpene hydrocarbons and sesquiterpenoids have been identified.
  • Biosynthetic production of terpenes involves enzymes called terpene synthases. These enzymes convert an acyclic terpene precursor into one or more terpene products. In particular, diterpene synthases produce diterpenes by cyclization of the precursor geranylgeranyl diphosphate (GGPP). The cyclization of GGPP often requires two enzyme polypeptides, a type I and a type II diterpene synthase working in combination in two successive enzymatic reactions. The type II diterpene synthases catalyze a cyclization/rearrangement of GGPP initiated by the protonation of the terminal double bond of GGPP leading to a cyclic diterpene diphosphate intermediate. This intermediate is then further converted by a type I diterpene synthase catalyzing an ionization initiated cyclization.
  • Diterpene synthases are present in plants and other organisms and use substrates such as GGPP, but they have different product profiles. Genes and cDNAs encoding diterpene synthases have been cloned and the corresponding recombinant enzymes characterized.
  • Copalyl diphosphate (CPP) synthase enzymes and sclareol synthase enzymes are enzymes that occur in plants. Hence, it is desirable to discover and use these enzymes and variants in biochemical processes to generate (+)-manool.
  • SUMMARY
  • Provided herein is a method of producing (+)-manool comprising:
      • a) contacting geranylgeranyl diphosphate (GGPP) with a copalyl diphosphate (CPP) synthase to form a copalyl diphosphate, wherein the CPP synthase comprises
        • i) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15; or
        • ii) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 17 and SEQ ID NO: 18; or
        • iii) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21; and
      • b) contacting the CPP with a sclareol synthase to form (+)-manool; and
      • c) optionally isolating the (+)-manool.
  • Provided herein is the above method further comprising further processing the (+)-manool to a (+)-manool derivative.
  • Also provided herein is a polypeptide having CPP synthase activity, wherein the polypeptide comprises
      • a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15; or
      • b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 17 and SEQ ID NO: 18; or
      • c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21.
  • Further provided is a polypeptide having sclareol synthase activity, wherein the polypeptide comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, and SEQ ID NO: 25.
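  • The sequence identity thresholds recited above can be made concrete with a small calculation on two already-aligned sequences. This is a minimal sketch under the assumption that identity is counted over all alignment columns, including gapped ones; alignment tools differ on the denominator, and the text excerpted here does not fix one. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the pair reaches at least the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Toy alignment, not taken from the sequence listing.
    print(round(percent_identity("MKT-LV", "MKTALV"), 1))
    print(meets_threshold("MKT-LV", "MKTALV", 90))
```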
  • Also provided herein is a nucleic acid encoding a polypeptide described above.
  • Also provided herein is a nucleic acid encoding a CPP synthase wherein the nucleic acid comprises a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to the nucleic acid sequence as set forth in SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • Further provided herein is a nucleic acid encoding a sclareol synthase wherein the nucleic acid comprises a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
  • Also provided is an expression vector comprising the nucleic acids described above, a non-human host organism or cell comprising the nucleic acids described above or comprising the expression vector, non-human host organisms or cells capable of producing GGPP, methods of transforming a non-human host organism or cell, and methods for culturing the non-human host organisms or cells for producing (+)-manool.
  • DESCRIPTION OF THE DRAWINGS
  • FIG. 1. Enzymatic pathway from geranylgeranyl diphosphate (GGPP) to (+)-manool.
  • FIG. 2. Enzymatic pathways from geranylgeranyl diphosphate (GGPP) to (+)-manool and sclareol.
  • FIG. 3. GCMS analysis of the in vitro enzymatic conversion of GGPP. A. Using the recombinant SmCPS enzyme. B. Using the recombinant ScScS enzyme. C. Combining the SmCPS with ScScS enzymes in a single assay.
  • FIG. 4. GCMS analysis of (+)-manool produced using Escherichia coli cells expressing SmCPS, ScScS and mevalonate pathway enzymes. A. Total ion chromatogram of an extract of the E. coli culture medium. B. Total ion chromatogram of a (+)-manool standard. C. Mass spectrum of the major peak (retention time of 14.55 min) in chromatogram A. D. Mass spectrum of the (+)-manool authentic standard.
  • FIG. 5. GCMS analysis of (+)-manool produced using E. coli cells expressing mevalonate pathway enzymes, a GGPP synthase, ScSCS and five different CPP synthases: SmCPS2 from Salvia miltiorrhiza, CfCPS1 from Coleus forskohlii, TaTps1 from Triticum aestivum, MvCps3 from Marrubium vulgare and RoCPS1 from Rosmarinus officinalis.
  • FIG. 6. GCMS analysis of (+)-manool produced using E. coli cells expressing mevalonate pathway enzymes, a GGPP synthase, SmCPS2 and a class I diterpene synthase: NgSCS-del29 from Nicotiana glutinosa or SsScS from Salvia sclarea.
  • FIG. 7. Saccharomyces cerevisiae expression plasmids were constructed in vivo by co-transformation of yeast with six DNA fragments: a) LEU2 yeast marker, b) AmpR E. coli marker, c) Yeast origin of replication, d) E. coli replication origin, e) a fragment for co-expression of CrtE and one of the sclareol synthases coding sequences tested, and f) a fragment for expression of one of the copalyl diphosphate (CPP) synthases coding sequences tested.
  • FIG. 8. GCMS analysis of (+)-manool produced using the modified S. cerevisiae strain YST045 expressing a GGPP synthase, ScSCS and five different truncated versions of CPP synthases: SmCPS2 from Salvia miltiorrhiza, CfCPS1 from Coleus forskohlii, TaTps1 from Triticum aestivum, MvCps3 from Marrubium vulgare and RoCPS1 from Rosmarinus officinalis.
  • DETAILED DESCRIPTION
  • Definitions
  • For the descriptions herein and the appended claims, the use of “or” means “and/or” unless stated otherwise.
  • Similarly, “comprise,” “comprises,” “comprising,” “include,” “includes,” and “including” are interchangeable and not intended to be limiting.
  • It is to be further understood that where descriptions of various embodiments use the term “comprising,” those skilled in the art would understand that in some specific instances, an embodiment can be alternatively described using language “consisting essentially of” or “consisting of.”
  • The following terms have the meanings ascribed to them unless specified otherwise.
  • The term “polypeptide” means an amino acid sequence of consecutively polymerized amino acid residues, for instance, at least 15 residues, at least 30 residues, at least 50 residues. In some embodiments provided herein, a polypeptide comprises an amino acid sequence that is an enzyme, or a fragment, or a variant thereof.
  • The term “isolated” polypeptide refers to an amino acid sequence that is removed from its natural environment by any method or combination of methods known in the art and includes recombinant, biochemical and synthetic methods.
  • The term “protein” refers to an amino acid sequence of any length wherein amino acids are linked by covalent peptide bonds, and includes oligopeptide, peptide, polypeptide and full length protein whether naturally occurring or synthetic.
  • The terms “biological function,” “function,” “biological activity” or “activity” refer to the ability of the CPP synthase and the sclareol synthase to catalyze the formation of (+)-manool.
  • The terms “nucleic acid sequence,” “nucleic acid,” and “polynucleotide” are used interchangeably meaning a sequence of nucleotides. A nucleic acid sequence may be a single-stranded or double-stranded deoxyribonucleotide, or ribonucleotide of any length, and includes coding and non-coding sequences of a gene, exons, introns, sense and anti-sense complementary sequences, genomic DNA, cDNA, miRNA, siRNA, mRNA, rRNA, tRNA, recombinant nucleic acid sequences, isolated and purified naturally occurring DNA and/or RNA sequences, synthetic DNA and RNA sequences, fragments, primers and nucleic acid probes; and the complement of such sequences. The skilled artisan is aware that the nucleic acid sequences of RNA are identical to the DNA sequences with the difference of thymine (T) being replaced by uracil (U).
  • An isolated nucleic acid or isolated nucleic acid sequence refers to a nucleic acid or nucleic acid sequence that is in an environment different from that in which the nucleic acid or nucleic acid sequence naturally occurs. The term “naturally-occurring” as used herein as applied to a nucleic acid refers to a nucleic acid that is found in a cell in nature. For example, a nucleic acid sequence that is present in an organism, for instance in the cells of an organism, that can be isolated from a source in nature and which has not been intentionally modified by a human in the laboratory is naturally occurring.
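  • The DNA/RNA relationship stated above (identical sequences with thymine replaced by uracil) is a one-line conversion; the function names here are illustrative.

```python
def dna_to_rna(dna):
    """Write a DNA sequence as RNA by replacing thymine (T) with uracil (U)."""
    return dna.upper().replace("T", "U")


def rna_to_dna(rna):
    """Inverse conversion: replace uracil (U) with thymine (T)."""
    return rna.upper().replace("U", "T")


if __name__ == "__main__":
    print(dna_to_rna("ATGGCGAAAATG"))
```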
  • The terms “purified,” “substantially purified,” and “isolated” as used herein refer to the state of being free of other, dissimilar compounds with which the compound of the invention is normally associated in its natural state, so that the “purified,” “substantially purified,” and “isolated” subject comprises at least 0.5%, 1%, 5%, 10%, or 20%, or at least 50% or 75% of the mass, by weight, of a given sample. In one particular embodiment, these terms refer to the compound of the invention comprising at least 95, 96, 97, 98, 99 or 100% of the mass, by weight, of a given sample. As used herein, the terms “purified,” “substantially purified,” and “isolated,” when referring to a nucleic acid or protein, or classes of nucleic acids or proteins, also refer to a state of purification or concentration different than that which occurs naturally in a cell or organism. Any degree of purification or concentration greater than that which occurs naturally in a cell or organism, including (1) the purification from other associated structures or compounds or (2) the association with structures or compounds to which it is not normally associated in the cell or organism, is within the meaning of “isolated.” The nucleic acid or protein or classes of nucleic acids or proteins, described herein, may be isolated, or otherwise associated with structures or compounds to which they are not normally associated in nature, according to a variety of methods and processes known to those of skill in the art.
  • As used herein, the terms “amplifying” and “amplification” refer to the use of any suitable amplification methodology for generating or detecting recombinant or naturally expressed nucleic acid, as described in detail, below. For example, the invention provides methods and reagents (e.g., specific degenerate oligonucleotide primer pairs, oligo dT primer) for amplifying (e.g., by polymerase chain reaction, PCR) naturally expressed (e.g., genomic DNA or mRNA) or recombinant (e.g., cDNA) nucleic acids of the invention in vivo, ex vivo or in vitro.
  • “Recombinant nucleic acid sequences” are nucleic acid sequences that result from the use of laboratory methods (molecular cloning) to bring together genetic material from more than one source, creating a nucleic acid sequence that does not occur naturally and would not be otherwise found in biological organisms.
  • “Recombinant DNA technology” refers to molecular biology procedures to prepare a recombinant nucleic acid sequence as described, for instance, in Laboratory Manuals edited by Weigel and Glazebrook, 2002 Cold Spring Harbor Lab Press; and Sambrook et al., 1989 Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press.
  • The term “gene” means a DNA sequence comprising a region, which is transcribed into an RNA molecule, e.g., an mRNA in a cell, operably linked to suitable regulatory regions, e.g., a promoter. A gene may thus comprise several operably linked sequences, such as a promoter, a 5′ leader sequence comprising, e.g., sequences involved in translation initiation, a coding region of cDNA or genomic DNA, introns, exons, and/or a 3′ non-translated sequence comprising, e.g., transcription termination sites.
  • A “chimeric gene” refers to any gene, which is not normally found in nature in a species, in particular, a gene in which one or more parts of the nucleic acid sequence are present that are not associated with each other in nature. For example the promoter is not associated in nature with part or all of the transcribed region or with another regulatory region. The term “chimeric gene” is understood to include expression constructs in which a promoter or transcription regulatory sequence is operably linked to one or more coding sequences or to an antisense, i.e., reverse complement of the sense strand, or inverted repeat sequence (sense and antisense, whereby the RNA transcript forms double stranded RNA upon transcription). The term “chimeric gene” also includes genes obtained through the combination of portions of one or more coding sequences to produce a new gene.
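The antisense orientation mentioned above, i.e., the reverse complement of the sense strand, can be illustrated with a minimal Python sketch. The function name `reverse_complement` and the short example sequence are hypothetical and are not taken from the sequence listing; only the four unambiguous DNA bases are handled.

```python
# Translation table for the four unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement (antisense strand, 5'->3') of a DNA string."""
    # Complement every base, then reverse the string.
    return dna.translate(_COMPLEMENT)[::-1]


# Hypothetical toy sense-strand fragment, for illustration only.
sense = "ATGGCC"
antisense = reverse_complement(sense)
print(sense, antisense)
```

In an inverted repeat construct of the kind described, a sense fragment and its reverse complement would both be present in the same transcript, which is what allows the RNA to fold back into a double-stranded form.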
  • A “3′ UTR” or “3′ non-translated sequence” (also referred to as “3′ untranslated region,” or “3′ end”) refers to the nucleic acid sequence found downstream of the coding sequence of a gene, which comprises for example a transcription termination site and (in most, but not all eukaryotic mRNAs) a polyadenylation signal such as AAUAAA or variants thereof. After termination of transcription, the mRNA transcript may be cleaved downstream of the polyadenylation signal and a poly(A) tail may be added, which is involved in the transport of the mRNA to the site of translation, e.g., cytoplasm.
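As a minimal illustration of the polyadenylation signal named above, the following Python sketch scans a transcript for the canonical AAUAAA hexamer. The function name `find_polya_signals` and the example sequence are hypothetical and are not taken from the sequence listing; signal variants and sequence context, which matter in practice, are not considered here.

```python
def find_polya_signals(mrna, signal="AAUAAA"):
    """Return the 0-based start positions of every occurrence of `signal`."""
    # Normalise case and accept DNA-style input by converting T to U.
    seq = mrna.upper().replace("T", "U")
    positions = []
    start = seq.find(signal)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping occurrences are also reported.
        start = seq.find(signal, start + 1)
    return positions


# Hypothetical toy transcript tail, for illustration only.
example = "GCCUAAGCAAUAAAGGCUUCAAAAAAAA"
print(find_polya_signals(example))
```

The returned positions mark where a candidate signal starts; cleavage and poly(A) addition would occur downstream of such a position.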
  • “Expression of a gene” involves transcription of the gene and translation of the mRNA into a protein. Overexpression refers to the production of the gene product as measured by levels of mRNA, polypeptide and/or enzyme activity in transgenic cells or organisms that exceeds levels of production in non-transformed cells or organisms of a similar genetic background.
  • “Expression vector” as used herein means a nucleic acid molecule engineered using molecular biology methods and recombinant DNA technology for delivery of foreign or exogenous DNA into a host cell. The expression vector typically includes sequences required for proper transcription of the nucleotide sequence. The coding region usually codes for a protein of interest but may also code for an RNA, e.g., an antisense RNA, siRNA and the like.
  • An “expression vector” as used herein includes any linear or circular recombinant vector including but not limited to viral vectors, bacteriophages and plasmids. The skilled person is capable of selecting a suitable vector according to the expression system. In one embodiment, the expression vector includes the nucleic acid of an embodiment herein operably linked to at least one regulatory sequence, which controls transcription, translation, initiation and termination, such as a transcriptional promoter, operator or enhancer, or an mRNA ribosomal binding site and, optionally, including at least one selection marker. Nucleotide sequences are “operably linked” when the regulatory sequence functionally relates to the nucleic acid of an embodiment herein. “Regulatory sequence” refers to a nucleic acid sequence that determines the expression level of the nucleic acid sequences of an embodiment herein and is capable of regulating the rate of transcription of the nucleic acid sequence operably linked to the regulatory sequence. Regulatory sequences comprise promoters, enhancers, transcription factors, promoter elements and the like.
  • “Promoter” refers to a nucleic acid sequence that controls the expression of a coding sequence by providing a binding site for RNA polymerase and other factors required for proper transcription including without limitation transcription factor binding sites, repressor and activator protein binding sites. The meaning of the term promoter also includes the term “promoter regulatory sequence”. Promoter regulatory sequences may include upstream and downstream elements that may influence transcription, RNA processing or stability of the associated coding nucleic acid sequence. Promoters include naturally-derived and synthetic sequences. The coding nucleic acid sequence is usually located downstream of the promoter with respect to the direction of the transcription starting at the transcription initiation site.
  • The term “constitutive promoter” refers to an unregulated promoter that allows for continual transcription of the nucleic acid sequence it is operably linked to.
  • As used herein, the term “operably linked” refers to a linkage of polynucleotide elements in a functional relationship. A nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. For instance, a promoter, or rather a transcription regulatory sequence, is operably linked to a coding sequence if it affects the transcription of the coding sequence. Operably linked means that the DNA sequences being linked are typically contiguous. The nucleotide sequence associated with the promoter sequence may be of homologous or heterologous origin with respect to the cell or organism, e.g. host cell, plant cell, plant, or microorganism, to be transformed. The sequence also may be entirely or partially synthetic. Regardless of the origin, the nucleic acid sequence associated with the promoter sequence will be expressed or silenced in accordance with promoter properties to which it is linked. The associated nucleic acid may code for a protein that is desired to be expressed or suppressed throughout the organism at all times or, alternatively, at a specific time or in specific tissues, cells, or cell compartment. Such nucleotide sequences particularly encode proteins conferring desirable phenotypic traits to the host cells or organism altered or transformed therewith. More particularly, the associated nucleotide sequence leads to the production of a (+)-manool synthase in the host cell or organism.
  • “Target peptide” refers to an amino acid sequence which targets a protein, or polypeptide to intracellular organelles, i.e., mitochondria, or plastids, or to the extracellular space (secretion signal peptide). A nucleic acid sequence encoding a target peptide may be fused to the nucleic acid sequence encoding the amino terminal end, e.g., N-terminal end, of the protein or polypeptide, or may be used to replace a native targeting polypeptide.
  • The term “primer” refers to a short nucleic acid sequence that is hybridized to a template nucleic acid sequence and is used for polymerization of a nucleic acid sequence complementary to the template.
  • As used herein, the term “host cell” or “transformed cell” refers to a cell (or organism) altered to harbor at least one nucleic acid molecule, for instance, a recombinant gene encoding a desired protein or nucleic acid sequence which upon transcription yields a CPP synthase protein and/or a sclareol synthase protein or which together produce (+)-manool.
  • The host cell is particularly a bacterial cell, a fungal cell or a plant cell. The host cell may contain a recombinant gene which has been integrated into the nuclear or organelle genomes of the host cell. Alternatively, the host may contain the recombinant gene extra-chromosomally. Homologous sequences include orthologous or paralogous sequences. Methods of identifying orthologs or paralogs including phylogenetic methods, sequence similarity and hybridization methods are known in the art and are described herein.
  • Paralogs result from gene duplication that gives rise to two or more genes with similar sequences and similar functions. Paralogs typically cluster together and are formed by duplications of genes within related plant species. Paralogs are found in groups of similar genes using pair-wise Blast analysis or during phylogenetic analysis of gene families using programs such as CLUSTAL. In paralogs, consensus sequences can be identified that are characteristic of sequences within related genes having similar functions.
  • Orthologs, or orthologous sequences, are sequences similar to each other because they are found in species that descended from a common ancestor. For instance, plant species that have common ancestors are known to contain many enzymes that have similar sequences and functions. The skilled artisan can identify orthologous sequences and predict the functions of the orthologs, for example, by constructing a phylogenetic tree for a gene family of one species using for example CLUSTAL or BLAST programs.
  • The term “selectable marker” refers to any gene which upon expression may be used to select a cell or cells that include the selectable marker. Examples of selectable markers are described below. The skilled artisan will know that different antibiotic, fungicide, auxotrophic or herbicide selectable markers are applicable to different target species.
  • The term “organism” refers to any non-human multicellular or unicellular organism such as a plant, or a microorganism. Particularly, a microorganism is a bacterium, a yeast, an alga or a fungus.
  • The term “plant” is used interchangeably to include plant cells including plant protoplasts, plant tissues, plant cell tissue cultures giving rise to regenerated plants, or parts of plants, or plant organs such as roots, stems, leaves, buds, flowers, petioles, petals, pollen, ovules, embryos, tubers, fruits, seed, progeny thereof and the like. Any plant can be used to carry out the methods of an embodiment herein.
  • Particular Embodiments
  • In one embodiment provided herein is a method for transforming a host cell or non-human organism comprising transforming a host cell or non-human organism with a nucleic acid encoding a polypeptide having a copalyl diphosphate synthase activity and with a nucleic acid encoding a polypeptide having a sclareol synthase activity, wherein the polypeptide having the copalyl diphosphate synthase activity comprises
  • a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15; or
  • b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 17 and SEQ ID NO: 18; or
  • c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21.
  • In one embodiment, the polypeptide having the sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, and SEQ ID NO: 25.
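The identity thresholds recited above can be checked computationally once a candidate sequence has been aligned to a reference. The Python sketch below is illustrative only: it assumes the two sequences are already aligned (equal length, `-` for gaps) and uses the number of aligned columns as the denominator, which is one common convention and an assumption here rather than the method prescribed by this document. The names `percent_identity`, `meets_threshold` and `CPP_MIN_IDENTITY` are hypothetical, and the example peptides are toy strings, not entries from the sequence listing.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of two pre-aligned sequences.

    Assumptions: equal-length strings, '-' marks a gap, columns in which
    both sequences have a gap are ignored, and the denominator is the
    number of remaining columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Lowest identity threshold recited for each CPP synthase reference sequence.
CPP_MIN_IDENTITY = {
    "SEQ ID NO: 14": 50.0, "SEQ ID NO: 15": 50.0,
    "SEQ ID NO: 17": 71.0, "SEQ ID NO: 18": 71.0,
    "SEQ ID NO: 20": 90.0, "SEQ ID NO: 21": 90.0,
}


def meets_threshold(identity, reference_id, thresholds=CPP_MIN_IDENTITY):
    """True if `identity` reaches the lowest threshold for `reference_id`."""
    return identity >= thresholds[reference_id]


# Toy aligned peptides differing at one of ten positions.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(identity, meets_threshold(identity, "SEQ ID NO: 20"))
```

In practice the alignment itself would come from an established alignment program, and the identity value depends on the alignment parameters and on how gaps are counted, so the figures produced by a sketch like this are only comparable when the same convention is used throughout.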
  • In one embodiment provided herein is a method comprising cultivating a non-human host organism or cell capable of producing a geranylgeranyl diphosphate (GGPP) and transformed to express a polypeptide having a copalyl diphosphate synthase activity wherein the polypeptide having the copalyl diphosphate synthase activity comprises
  • a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15; or
  • b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 17 and SEQ ID NO: 18; or
  • c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21; and
  • further transformed to express a polypeptide having a sclareol synthase activity.
  • Particularly, the polypeptide having the sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, and SEQ ID NO: 25.
  • Further provided herein is an expression vector comprising a nucleic acid encoding a CPP synthase wherein the CPP synthase comprises a polypeptide comprising
  • a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
  • b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
  • c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and further the expression vector comprises a nucleic acid encoding a sclareol synthase enzyme.
  • Particularly, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25. In a particular embodiment, the two enzymes, i.e. the CPP synthase and the sclareol synthase, could be on two different vectors or plasmids transformed in the same cell. In a further embodiment, these two enzymes could be on two different vectors or plasmids transformed in two different cells.
  • Further provided herein is a non-human host organism or cell comprising or transformed to harbor at least one nucleic acid encoding a CPP synthase wherein the CPP synthase comprises
  • a) a polypeptide comprising an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
  • b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
  • c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and at least one nucleic acid encoding a sclareol synthase enzyme.
  • Particularly, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25.
  • Further provided herein is a non-human host organism or cell comprising or transformed to harbor at least one nucleic acid encoding a CPP synthase wherein the CPP synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 2; and
  • at least one nucleic acid encoding a sclareol synthase enzyme wherein the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 23 and SEQ ID NO: 25.
  • In one embodiment, the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence that has at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • In one embodiment, the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • In one embodiment, the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • In one embodiment, the nucleic acid that encodes for a CPP synthase provided herein comprises a nucleotide sequence having 99% or 100% sequence identity to SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • In one embodiment, the nucleic acid that encodes for a CPP synthase provided herein comprises the nucleotide sequence as set forth in SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising
  • a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 14 and SEQ ID NO: 15; or
  • b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 17 and SEQ ID NO: 18; or
  • c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a polypeptide selected from the group consisting of SEQ ID NO: 20 and SEQ ID NO: 21.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% sequence identity to SEQ ID NO: 14.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 14.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 14.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 15.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 15.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 17.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 17.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 18.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 18.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 20.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 20.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 21.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 21.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 21.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 21.
  • In one embodiment, the CPP synthase comprises a polypeptide comprising the amino acid sequence as set forth in SEQ ID NO: 21.
  • In one embodiment, the nucleic acid encoding the sclareol synthase enzyme comprises a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 4.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 4.
  • In one embodiment, the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 4.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 5.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 5.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 5.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 5.
  • In one embodiment, the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 5.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 23.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 23.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 23.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 23.
  • In one embodiment, the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 23.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 25.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 25.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 25.
  • In one embodiment, the sclareol synthase comprises an amino acid sequence having at least 99% sequence identity to SEQ ID NO: 25.
  • In one embodiment, the sclareol synthase comprises the amino acid sequence as set forth in SEQ ID NO: 25.
  • In another embodiment, provided herein is an expression vector comprising at least one of the nucleic acids described herein.
  • In another embodiment, provided herein is a non-human host organism or cell that comprises one or more expression vectors comprising a nucleic acid encoding a CPP synthase as described herein and a nucleic acid encoding a sclareol synthase as described herein.
  • In another embodiment, provided herein is a non-human host organism or cell comprising or transformed to harbor at least one nucleic acid described herein so that it heterologously expresses or over-expresses at least one polypeptide described herein.
  • In an embodiment, the present invention provides a transformed cell or organism, in which the polypeptides are expressed in higher quantity than in the same cell or organism not so transformed.
  • There are several methods known in the art for the creation of transgenic host organisms or cells such as plants, fungi, prokaryotes, or cultures of higher eukaryotic cells. Appropriate cloning and expression vectors for use with bacterial, fungal, yeast, plant and mammalian cellular hosts are described, for example, in Pouwels et al., Cloning Vectors: A Laboratory Manual, 1985, Elsevier, New York and Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, 1989, Cold Spring Harbor Laboratory Press. Cloning and expression vectors for higher plants and/or plant cells in particular are available to the skilled person. See for example Schardl et al., Gene, 1987, 61:1-11.
  • Methods for transforming host organisms or cells to harbor transgenic nucleic acids are familiar to the skilled person. For the creation of transgenic plants, for example, current methods include: electroporation of plant protoplasts, liposome-mediated transformation, Agrobacterium-mediated transformation, polyethylene-glycol-mediated transformation, particle bombardment, microinjection of plant cells, and transformation using viruses.
  • In one embodiment, transformed DNA is integrated into a chromosome of a non-human host organism and/or cell such that a stable recombinant system results. Any chromosomal integration method known in the art may be used in the practice of the invention, including but not limited to recombinase-mediated cassette exchange (RMCE), viral site-specific chromosomal insertion, adenovirus and pronuclear injection.
  • In one embodiment for carrying out the method for producing (+)-manool, herein provided is a method of making at least one polypeptide having a CPP synthase activity and at least one polypeptide having a sclareol synthase activity as described in any embodiment of the invention.
  • One embodiment provides a method for producing manool comprising
      • a) contacting geranylgeranyl diphosphate (GGPP) with a copalyl diphosphate (CPP) synthase as described herein to form a copalyl diphosphate; and
      • b) contacting the CPP with a sclareol synthase as described herein to form (+)-manool;
        wherein step a) comprises culturing a non-human host organism or host cell capable of producing GGPP and transformed with one or more nucleic acids as described herein or with one or more expression vectors as described herein, so that the non-human host organism or host cell harbors a nucleic acid encoding a polypeptide having CPP synthase activity as described herein and a nucleic acid encoding a polypeptide having a sclareol synthase activity as described herein and expresses or over-expresses the polypeptides.
  • One embodiment provides the above method for producing manool further comprising prior to step a), transforming a non-human host organism or host cell capable of producing GGPP with
      • a) at least one nucleic acid encoding a polypeptide comprising
        • i. an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
        • ii. an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
        • iii. an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and
        • having a CPP synthase activity, so that said organism or cell expresses said polypeptide having a CPP synthase activity; and
      • b) at least one nucleic acid encoding a polypeptide having a sclareol synthase activity as described herein, so that said organism or cell expresses said polypeptide having a sclareol synthase activity.
  • In one embodiment, the non-human host organism or host cell capable of producing GGPP comprises
      • a) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 15 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 5; or
      • b) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 18 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 5; or
      • c) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 21 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 5; or
      • d) a nucleic acid comprising SEQ ID NO: 16 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 6 which encodes for a sclareol synthase; or
      • e) a nucleic acid comprising SEQ ID NO: 19 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 6 which encodes for a sclareol synthase; or
      • f) a nucleic acid comprising SEQ ID NO: 22 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 6 which encodes for a sclareol synthase; or
      • g) a nucleic acid comprising SEQ ID NO: 26 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which encodes for a sclareol synthase; or
      • h) a nucleic acid comprising SEQ ID NO: 29 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which encodes for a sclareol synthase; or
      • i) a nucleic acid comprising SEQ ID NO: 30 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which encodes for a sclareol synthase; or
      • j) a nucleic acid comprising SEQ ID NO: 31 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which encodes for a sclareol synthase; or
      • k) a nucleic acid comprising SEQ ID NO: 32 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 27 which encodes for a sclareol synthase; or
      • l) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 2 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 23; or
      • m) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 15 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 23; or
      • n) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 18 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 23; or
      • o) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 21 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 23; or
      • p) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 2 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 25; or
      • q) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 15 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 25; or
      • r) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 18 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 25; or
      • s) a nucleic acid encoding a CPP synthase comprising SEQ ID NO: 21 and a nucleic acid encoding a sclareol synthase comprising SEQ ID NO: 25; or
      • t) a nucleic acid comprising SEQ ID NO: 16 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 24 which encodes for a sclareol synthase; or
      • u) a nucleic acid comprising SEQ ID NO: 19 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 24 which encodes for a sclareol synthase; or
      • v) a nucleic acid comprising SEQ ID NO: 22 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 24 which encodes for a sclareol synthase; or
      • w) a nucleic acid comprising SEQ ID NO: 26 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which encodes for a sclareol synthase; or
      • x) a nucleic acid comprising SEQ ID NO: 26 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which encodes for a sclareol synthase; or
      • y) a nucleic acid comprising SEQ ID NO: 29 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which encodes for a sclareol synthase; or
      • z) a nucleic acid comprising SEQ ID NO: 29 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which encodes for a sclareol synthase; or
      • aa) a nucleic acid comprising SEQ ID NO: 30 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which encodes for a sclareol synthase; or
      • bb) a nucleic acid comprising SEQ ID NO: 30 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which encodes for a sclareol synthase; or
      • cc) a nucleic acid comprising SEQ ID NO: 31 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which encodes for a sclareol synthase; or
      • dd) a nucleic acid comprising SEQ ID NO: 31 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which encodes for a sclareol synthase; or
      • ee) a nucleic acid comprising SEQ ID NO: 32 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 33 which encodes for a sclareol synthase; or
      • ff) a nucleic acid comprising SEQ ID NO: 32 which encodes for a CPP synthase and a nucleic acid comprising SEQ ID NO: 34 which encodes for a sclareol synthase;
        wherein the above combinations of nucleic acid sequences and/or synthases also comprise the variants and various percent identities to the SEQ ID NO enumerated as described herein.
  • In one embodiment, the non-human host organism provided herein is a plant, a prokaryote or a fungus.
  • In one embodiment, the non-human host provided herein is a microorganism, particularly bacteria or yeast.
  • In one embodiment, the bacterium provided herein is Escherichia coli and yeast is Saccharomyces cerevisiae.
  • In one embodiment, the non-human organism provided herein is Saccharomyces cerevisiae.
  • In one embodiment, the cell is a prokaryotic cell.
  • In another embodiment, the cell is a bacterial cell.
  • In one embodiment, the cell is a eukaryotic cell.
  • In one embodiment, the eukaryotic cell is a yeast cell or a plant cell.
  • In one embodiment, the manool can be produced by culturing the transformed bacteria or yeast described herein, including through fermentation, for example as described in Paddon et al., Nature, 2013, 496:528-532.
  • In one embodiment, the process of producing (+)-manool produces the (+)-manool at a purity of at least 98.5%.
  • In another embodiment, a method provided herein further comprises processing the (+)-manool to a derivative using a chemical or biochemical synthesis or a combination of both using methods commonly known in the art.
  • In one embodiment, the (+)-manool derivative is selected from the group consisting of a hydrocarbon, an alcohol, acetal, aldehyde, acid, ether, ketone, lactone, acetate and an ester.
  • According to any embodiment of the invention, said (+)-manool derivative is a C10 to C25 compound optionally comprising one, two or three oxygen atoms.
  • In a further embodiment, the derivative is selected from the group consisting of manool acetate ((3R)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-1-penten-3-yl acetate), copalol ((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-penten-1-ol), copalol acetate ((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-penten-1-yl acetate), copalal ((2E)-3-methyl-5-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-pentenal), (+)-manooloxy (4-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]-2-butanone), Z-11 ((3S,5aR,7aS,11aS,11bR)-3,8,8,11a-tetramethyldodecahydro-3,5a-epoxynaphtho[2,1-c]oxepin), gamma-ambrol (2-[(1S,4aS,8aS)-5,5,8a-trimethyl-2-methylenedecahydro-1-naphthalenyl]ethanol) and Ambrox® ((3aR,5aS,9aS,9bR)-3a,6,6,9a-tetramethyldodecahydronaphtho[2,1-b]furan).
  • In another embodiment, a method provided herein further comprises contacting the (+)-manool with a suitable reacting system to convert said (+)-manool into a suitable (+)-manool derivative. Said suitable reacting system can be of enzymatic nature (e.g. requiring one or more enzymes) or of chemical nature (e.g. requiring one or more synthetic chemicals).
  • For example, (+)-manool may be enzymatically converted to manooloxy or gamma-ambrol using a process described in the literature, for example as set forth in U.S. Pat. No. 7,294,492, wherein said patent is hereby incorporated by reference in its entirety herein.
  • In yet another embodiment, the (+)-manool derivative is copalol and its esters with C1-C5 carboxylic acids.
  • In yet another embodiment, the (+)-manool derivative is a (+)-manool ester with a C1-C5 carboxylic acid.
  • In one embodiment, the (+)-manool derivative is copalal.
  • In one embodiment, the (+)-manool derivative is manooloxy.
  • In yet another embodiment, the (+)-manool derivative is Z-11.
  • In one embodiment, the (+)-manool derivative is an ambrol or is a mixture thereof and its esters with C1-C5 carboxylic acids, and in particular gamma-ambrol and its esters.
  • In a further embodiment, the (+)-manool derivative is Ambrox®, sclareolide (also known as 3a,6,6,9a-tetramethyldecahydronaphtho[2,1-b]furan-2(1H)-one and all its diastereoisomer and stereoisomers), 3,4a,7,7,10a-pentamethyldodecahydro-1H-benzo[f]chromen-3-ol or 3,4a,7,7,10a-pentamethyl-4a,5,6,6a,7,8,9,10,10a,10b-decahydro-1H-benzo[f]chromene and all their diastereoisomer and stereoisomers cyclic ketone and open form, (1R,2R,4aS,8aS)-1-(2-hydroxyethyl)-2,5,5,8a-tetramethyldecahydronaphthalen-2-ol DOL, gamma-ambrol.
  • Specific examples of how said derivatives (e.g. a triene hydrocarbon, an acetate or copalol) can be obtained are detailed in the Examples.
  • For instance, the manool obtained according to the invention can be processed into Manooloxy (a ketone, as per known methods) and then into ambrol (an alcohol) and ambrox (an ether), according to EP 212254.
  • The ability of a polypeptide to catalyze the synthesis of a particular diterpene can be confirmed by performing the enzyme assay as detailed in the Examples provided herein.
  • Polypeptides are also meant to include truncated polypeptides provided that they keep their (+)-manool synthase activity and their sclareol synthase activity.
  • As intended herein below, modification of the sequences described herein to obtain a nucleotide sequence may be performed using any method known in the art, for example by introducing any type of mutation, such as deletion, insertion or substitution mutations. Examples of such methods are cited in the part of the description relative to the variant polypeptides and the methods to prepare them.
  • The percentage of identity between two peptide or nucleotide sequences is a function of the number of amino acids or nucleotide residues that are identical in the two sequences when an alignment of these two sequences has been generated. Identical residues are defined as residues that are the same in the two sequences in a given position of the alignment. The percentage of sequence identity, as used herein, is calculated from the optimal alignment by taking the number of residues identical between two sequences dividing it by the total number of residues in the shortest sequence and multiplying by 100. The optimal alignment is the alignment in which the percentage of identity is the highest possible. Gaps may be introduced into one or both sequences in one or more positions of the alignment to obtain the optimal alignment. These gaps are then taken into account as non-identical residues for the calculation of the percentage of sequence identity. Alignment for the purpose of determining the percentage of amino acid or nucleic acid sequence identity can be achieved in various ways using computer programs and for instance publicly available computer programs available on the world wide web. Preferably, the BLAST program (Tatiana et al., FEMS Microbiol Lett., 1999, 174:247-250) set to the default parameters, available from the National Center for Biotechnology Information (NCBI) at http://www.ncbi.nlm.nih.gov/BLAST/bl2seq/wblast2.cgi, can be used to obtain an optimal alignment of protein or nucleic acid sequences and to calculate the percentage of sequence identity.
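  • The calculation described above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (for example by BLAST) and are supplied as equal-length strings with "-" marking gaps; it does not itself search for the optimal alignment. The function name and the toy sequences are hypothetical and serve only as an illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Residues are compared column by column. A column containing a gap is
    never counted as identical. The denominator is the number of residues
    (gaps excluded) in the shorter of the two sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    residues_a = len(aligned_a.replace(gap, ""))
    residues_b = len(aligned_b.replace(gap, ""))
    shortest = min(residues_a, residues_b)
    if shortest == 0:
        raise ValueError("each sequence must contain at least one residue")

    return 100.0 * identical / shortest


# Toy alignment (not taken from the sequence listing):
# 5 identical columns, shorter sequence has 6 residues.
identity = percent_identity("MKT-LLV", "MKTALAV")
```

  • A value computed this way could then be compared with a threshold such as the 50%, 71% or 90% identity levels recited in the embodiments above.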
  • The polypeptide to be contacted with GGPP in vitro can be obtained by extraction from any organism expressing it, using standard protein or enzyme extraction technologies. If the host organism is a unicellular organism or cell releasing the polypeptide of an embodiment herein into the culture medium, the polypeptide may simply be collected from the culture medium, for example by centrifugation, optionally followed by washing steps and re-suspension in suitable buffer solutions. In another embodiment, the GGPP may be contacted with the polypeptide in the culture medium where the polypeptide may be released from the host organism, unicellular organism or cell. If the organism or cell accumulates the polypeptide within its cells, the polypeptide may be obtained by disruption or lysis of the cells. The GGPP may be contacted with the polypeptide upon further extraction of the polypeptide from the cell lysate or through contact with the cell lysate without necessarily conducting such an extraction.
  • According to another particular embodiment, the method of any of the above-described embodiments is carried out in vivo. These embodiments provided herein are particularly advantageous since it is possible to carry out the method in vivo without previously isolating the polypeptide. The reaction occurs directly within the organism or cell transformed to express said polypeptide.
  • The organism or cell is meant to “express” a polypeptide, provided that the organism or cell is transformed to harbor a nucleic acid encoding said polypeptide, this nucleic acid is transcribed to mRNA and the polypeptide is found in the host organism or cell. The term “express” encompasses “heterologously express” and “over-express”, the latter referring to levels of mRNA, polypeptide and/or enzyme activity over and above what is measured in a non-transformed organism or cell. A more detailed description of suitable methods to transform a non-human host organism or cell will be described later on in the part of the specification that is dedicated to such transformed non-human host organisms or cells.
  • A particular organism or cell is meant to be “capable of producing GGPP” when it produces GGPP naturally or when it does not produce GGPP naturally but is transformed to produce GGPP, either prior to the transformation with a nucleic acid as described herein or together with said nucleic acid. Organisms or cells transformed to produce a higher amount of GGPP than the naturally occurring organism or cell are also encompassed by the “organisms or cells capable of producing GGPP”. Several methods to transform organisms, for example microorganisms, so that they produce GGPP are known, for example in Schalk et al., J. Am. Chem. Soc., 2013, 134:18900-18903.
  • Non-human host organisms suitable to carry out the method of an embodiment herein in vivo may be any non-human multicellular or unicellular organisms. In a particular embodiment, the non-human host organism used to carry out an embodiment herein in vivo is a plant, a prokaryote or a fungus. Any plant, prokaryote or fungus can be used. Particularly useful plants are those that naturally produce high amounts of terpenes. In a more particular embodiment the non-human host organism used to carry out the method of an embodiment herein in vivo is a microorganism. Any microorganism can be used but according to an even more particular embodiment said microorganism is a bacterium or a yeast. Most particularly, said bacterium is E. coli and said yeast is Saccharomyces cerevisiae.
  • Some of these organisms do not produce GGPP naturally or only in small amounts. To be suitable to carry out the method of an embodiment herein, these organisms have to be transformed to produce said precursor or engineered to produce said precursor in larger amounts. They can be so transformed either before the modification with the nucleic acid described according to any of the above embodiments or simultaneously, as explained above.
  • In one embodiment, the non-human host organism or cell capable of producing GGPP is transformed with a nucleic acid encoding a CPP synthase or variant thereof as described herein and a nucleic acid encoding a sclareol synthase or variant thereof as described herein, wherein the non-human host organism or cell capable of producing GGPP has been engineered to over-express a GGPP synthase or transformed with a nucleic acid encoding a GGPP synthase.
  • In one embodiment, the non-human host organism or cell comprises a nucleic acid encoding a GGPP synthase, a nucleic acid encoding a CPP synthase or variant thereof as described herein, and a nucleic acid encoding a sclareol synthase or variant thereof as described herein, wherein at least one of said nucleic acids is heterologous to the non-human host organism or cell.
  • Isolated higher eukaryotic cells can also be used, instead of complete organisms, as hosts to carry out the method of an embodiment herein in vivo. Suitable eukaryotic cells may be any non-human cell, but are particularly plant or fungal cells.
  • According to another embodiment, the polypeptides having a CPP synthase activity used in any of the embodiments described herein or encoded by the nucleic acids described herein may be variants obtained by genetic engineering, provided that said variant keeps its CPP synthase activity.
  • According to another embodiment, the polypeptides having a sclareol synthase activity used in any of the embodiments described herein or encoded by the nucleic acids described herein may be variants obtained by genetic engineering, provided that said variant keeps its sclareol synthase activity or has manool synthase activity.
  • As used herein, the polypeptide is intended as a polypeptide or peptide fragment that encompasses the amino acid sequences identified herein, as well as truncated or variant polypeptides, provided that they keep their CPP synthase activity and their sclareol synthase activity and/or manool synthase activity.
  • Examples of variant polypeptides are naturally occurring proteins that result from alternate mRNA splicing events or from proteolytic cleavage of the polypeptides described herein. Variations attributable to proteolysis include, for example, differences in the N- or C-termini upon expression in different types of host cells, due to proteolytic removal of one or more terminal amino acids from the polypeptides of an embodiment herein. Polypeptides encoded by a nucleic acid obtained by natural or artificial mutation of a nucleic acid of an embodiment herein, as described thereafter, are also encompassed by an embodiment herein.
  • Polypeptide variants resulting from a fusion of additional peptide sequences at the amino and carboxyl terminal ends can also be used in the methods of an embodiment herein. In particular such a fusion can enhance expression of the polypeptides, be useful in the purification of the protein or improve the enzymatic activity of the polypeptide in a desired environment or expression system. Such additional peptide sequences may be signal peptides, for example. Accordingly, encompassed herein are methods using variant polypeptides, such as those obtained by fusion with other oligo- or polypeptides and/or those which are linked to signal peptides. Polypeptides resulting from a fusion with another functional protein, such as another protein from the terpene biosynthesis pathway, can also advantageously be used in the methods of an embodiment herein.
  • A variant may also differ from the polypeptide of an embodiment herein by attachment of modifying groups which are covalently or non-covalently linked to the polypeptide backbone.
  • The variant also includes a polypeptide which differs from the polypeptide described herein by introduced N-linked or O-linked glycosylation sites, and/or an addition of cysteine residues. The skilled artisan will recognize how to modify an amino acid sequence and preserve biological activity.
  • Therefore, in an embodiment, the present invention provides a method for preparing a variant polypeptide having a CPP synthase activity or a sclareol synthase activity or a manool synthase activity, as described in any of the above embodiments, and comprising the steps of:
    • (a) selecting a nucleic acid according to any of the embodiments exposed above;
    • (b) modifying the selected nucleic acid to obtain at least one mutant nucleic acid;
    • (c) transforming host cells or unicellular organisms with the mutant nucleic acid sequence to express a polypeptide encoded by the mutant nucleic acid sequence;
    • (d) screening the polypeptide for at least one modified property; and,
    • (e) optionally, if the polypeptide has no desired variant CPP synthase activity, sclareol synthase activity, or manool synthase activity repeating the process steps (a) to (d) until a polypeptide with a desired variant CPP synthase activity, sclareol synthase activity, or manool synthase activity is obtained;
    • (f) optionally, if a polypeptide having a desired variant CPP synthase activity or a sclareol synthase activity or manool synthase activity was identified in step (d), isolating the corresponding mutant nucleic acid obtained in step (c).
  • According to an embodiment, the variant polypeptide prepared when in combination with either a polypeptide with CPP synthase activity or a sclareol synthase activity is capable of producing (+)-manool.
  • In step (b), a large number of mutant nucleic acid sequences may be created, for example by random mutagenesis, site-specific mutagenesis, or DNA shuffling. The detailed procedures of gene shuffling are found in Stemmer, DNA shuffling by random fragmentation and reassembly: in vitro recombination for molecular evolution (Proc Natl Acad Sci USA., 1994, 91(22): 10747-1075). In short, DNA shuffling refers to a process of random recombination of known sequences in vitro, involving at least two nucleic acids selected for recombination. For example mutations can be introduced at particular loci by synthesizing oligonucleotides containing a mutant sequence, flanked by restriction sites enabling ligation to fragments of the native sequence. Following ligation, the resulting reconstructed sequence encodes an analog having the desired amino acid insertion, substitution, or deletion. Alternatively, oligonucleotide-directed site-specific mutagenesis procedures can be employed to provide an altered gene wherein predetermined codons can be altered by substitution, deletion or insertion.
  • Mutant nucleic acids may be obtained and separated, which may be used for transforming a host cell according to standard procedures, for example such as disclosed in the present examples.
  • In step (d), the polypeptide obtained in step (c) is screened for at least one modified property, for example a desired modified enzymatic activity. Examples of desired enzymatic activities, for which an expressed polypeptide may be screened, include enhanced or reduced enzymatic activity, as measured by KM or Vmax value, modified regio-chemistry or stereochemistry and altered substrate utilization or product distribution. The screening of enzymatic activity can be performed according to procedures familiar to the skilled person and those disclosed in the present examples.
  • Step (e) provides for repetition of process steps (a)-(d), which may preferably be performed in parallel. Accordingly, by creating a significant number of mutant nucleic acids, many host cells may be transformed with different mutant nucleic acids at the same time, allowing for the subsequent screening of an elevated number of polypeptides. The chances of obtaining a desired variant polypeptide may thus be increased at the discretion of the skilled person.
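  • Steps (b) to (d) can be sketched in silico as follows. This is only an illustration under stated assumptions: the mutagenesis is limited to random single-base substitutions, and the `score` function passed to `screen` is a hypothetical stand-in for an enzyme assay, which in practice is performed on the expressed polypeptide and not on the sequence. All names and the toy template are assumptions made for the example.

```python
import random

BASES = "ACGT"


def point_mutants(template: str, n: int, rng: random.Random) -> list[str]:
    """Return n mutants, each with exactly one base substitution (step (b))."""
    mutants = []
    for _ in range(n):
        pos = rng.randrange(len(template))
        # Choose a base different from the one already at this position.
        new_base = rng.choice([b for b in BASES if b != template[pos]])
        mutants.append(template[:pos] + new_base + template[pos + 1:])
    return mutants


def screen(mutants: list[str], score, threshold: float) -> list[str]:
    """Keep mutants whose score reaches the threshold (step (d)).

    `score` is a hypothetical placeholder for a measured property.
    """
    return [m for m in mutants if score(m) >= threshold]


rng = random.Random(0)        # fixed seed so the sketch is reproducible
template = "ATGGCGAAAGAA"     # toy sequence, not from the sequence listing
library = point_mutants(template, 20, rng)
```

  • Generating many mutants in one pass mirrors the parallel screening described for step (e); repeating the call on a selected mutant would correspond to a further round.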
  • In addition to the gene sequences shown in the sequences disclosed herein, it will be apparent for the person skilled in the art that DNA sequence polymorphisms may exist within a given population, which may lead to changes in the amino acid sequence of the polypeptides disclosed herein. Such genetic polymorphisms may exist in cells from different populations or within a population due to natural allelic variation. Allelic variants may also include functional equivalents.
  • Further embodiments also relate to the molecules derived by such sequence polymorphisms from the concretely disclosed nucleic acids. These natural variations usually bring about a variance of about 1 to 5% in the nucleotide sequence of a gene or in the amino acid sequence of the polypeptides disclosed herein. As mentioned above, the nucleic acid encoding the polypeptide of an embodiment herein is a useful tool to modify non-human host organisms or cells intended to be used when the method is carried out in vivo.
  • A nucleic acid encoding a polypeptide according to any of the above-described embodiments is therefore also provided herein.
  • The nucleic acid of an embodiment herein can be defined as including deoxyribonucleotide or ribonucleotide polymers in either single- or double-stranded form (DNA and/or RNA). The terms “nucleotide sequence” should also be understood as comprising a polynucleotide molecule or an oligonucleotide molecule in the form of a separate fragment or as a component of a larger nucleic acid. Nucleic acids of an embodiment herein also encompass certain isolated nucleotide sequences including those that are substantially free from contaminating endogenous material. The nucleic acid of an embodiment herein may be truncated, provided that it encodes a polypeptide encompassed herein, as described above.
  • In one embodiment, the nucleic acid of an embodiment herein that encodes for a CPP synthase can be either present naturally in a plant such as Salvia miltiorrhiza, or other species, such as Coleus forskohlii, Triticum aestivum, Marrubium vulgare or Rosmarinus officinalis, or be obtained by modifying SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
  • In a further embodiment, the nucleic acid of an embodiment herein that encodes for a sclareol synthase can be either present naturally in a plant such as Salvia sclarea, or other species such as Nicotiana glutinosa, or can be obtained by modifying SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
  • Mutations may be any kind of mutations of these nucleic acids, such as point mutations, deletion mutations, insertion mutations and/or frame shift mutations. A variant nucleic acid may be prepared in order to adapt its nucleotide sequence to a specific expression system. For example, bacterial expression systems are known to more efficiently express polypeptides if amino acids are encoded by particular codons.
  • Due to the degeneracy of the genetic code, more than one codon may encode the same amino acid, so multiple nucleic acid sequences can code for the same protein or polypeptide, all these DNA sequences being encompassed by an embodiment herein. Where appropriate, the nucleic acid sequences encoding the CPP synthase and the sclareol synthase may be optimized for increased expression in the host cell. For example, nucleotides of an embodiment herein may be synthesized using codons particular to a host for improved expression.
  • Another important tool for transforming host organisms or cells suitable to carry out the method of an embodiment herein in vivo is an expression vector comprising a nucleic acid according to any embodiment herein. Such a vector is therefore also provided herein.
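  • The effect of codon degeneracy can be shown with a small sketch that recodes a DNA sequence without changing the encoded amino acids. The table below covers only four amino acids, and treating the first codon listed for each amino acid as the "preferred" one is an assumption made for the example, not measured codon usage of any host.

```python
# Partial genetic code (standard assignments) for four amino acids.
# The first codon in each list is treated as "preferred" purely for illustration.
SYNONYMOUS_CODONS = {
    "M": ["ATG"],
    "A": ["GCG", "GCC", "GCA", "GCT"],
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
}
CODON_TO_AA = {
    codon: aa for aa, codons in SYNONYMOUS_CODONS.items() for codon in codons
}


def translate(dna: str) -> str:
    """Translate a coding sequence using the partial table above."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str) -> str:
    """Replace every codon by the first-listed synonymous codon."""
    return "".join(SYNONYMOUS_CODONS[aa][0] for aa in translate(dna))


original = "ATGGCTAAGGAG"      # toy sequence encoding M-A-K-E
optimized = recode(original)   # different nucleotides, same amino acids
```

  • Here `original` and `optimized` differ at the nucleotide level but translate to the same peptide, which is the property relied upon when a cDNA is adapted to a given expression host.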
  • Recombinant non-human host organisms and cells transformed to harbor at least one nucleic acid of an embodiment herein so that it heterologously expresses or over-expresses at least one polypeptide of an embodiment herein are also very useful tools to carry out the method of an embodiment herein. Such non-human host organisms and cells are therefore also provided herein.
  • A nucleic acid according to any of the above-described embodiments can be used to transform the non-human host organisms and cells and the expressed polypeptide can be any of the above-described polypeptides.
  • Non-human host organisms of an embodiment herein may be any non-human multicellular or unicellular organisms. In a particular embodiment, the non-human host organism is a plant, a prokaryote or a fungus. Any plant, prokaryote or fungus is suitable to be transformed according to the methods provided herein. Particularly useful plants are those that naturally produce high amounts of terpenes.
  • In a more particular embodiment the non-human host organism is a microorganism. Any microorganism is suitable to be used herein, but according to an even more particular embodiment said microorganism is a bacterium or a yeast. Most particularly, said bacterium is E. coli and said yeast is Saccharomyces cerevisiae.
  • Isolated higher eukaryotic cells can also be transformed, instead of complete organisms. As higher eukaryotic cells, we mean here any non-human eukaryotic cell except yeast cells. Particular higher eukaryotic cells are plant cells or fungal cells.
  • Embodiments provided herein include, but are not limited to cDNA, genomic DNA and RNA sequences.
  • Genes, including the polynucleotides of an embodiment herein, can be cloned on basis of the available nucleotide sequence information, such as found in the attached sequence listing and by methods known in the art. These include e.g. the design of DNA primers representing the flanking sequences of such gene of which one is generated in sense orientations and which initiates synthesis of the sense strand and the other is created in reverse complementary fashion and generates the antisense strand. Thermo stable DNA polymerases such as those used in polymerase chain reaction are commonly used to carry out such experiments. Alternatively, DNA sequences representing genes can be chemically synthesized and subsequently introduced in DNA vector molecules that can be multiplied by e.g. compatible bacteria such as e.g. E. coli.
  • Provided herein are nucleic acid sequences obtained by mutations of SEQ ID NO: 3, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32, and SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34; such mutations can be routinely made. It is clear to the skilled artisan that mutations, deletions, insertions, and/or substitutions of one or more nucleotides can be introduced into these DNA sequences.
  • The nucleic acid sequences of an embodiment herein encoding CPP synthase and the sclareol synthase proteins can be inserted in expression vectors and/or be contained in chimeric genes inserted in expression vectors, to produce CPP synthase and sclareol synthase in a host cell or host organism. The vectors for inserting transgenes into the genome of host cells are well known in the art and include plasmids, viruses, cosmids and artificial chromosomes. Binary or co-integration vectors into which a chimeric gene is inserted are also used for transforming host cells.
  • An embodiment provided herein provides recombinant expression vectors comprising a nucleic acid encoding a CPP synthase and a nucleic acid encoding a sclareol synthase, each of which is, separately, operably linked to associated nucleic acid sequences such as, for instance, promoter sequences.
  • Alternatively, the promoter sequence may already be present in a vector so that the nucleic acid sequence which is to be transcribed is inserted into the vector downstream of the promoter sequence. Vectors are typically engineered to have an origin of replication, a multiple cloning site, and a selectable marker.
  • EXAMPLES
  • Example 1
  • Diterpene Synthase Genes.
  • Two diterpene synthases are necessary for the conversion of geranylgeranyl diphosphate (GGPP) to manool: a type II and a type I diterpene synthase. In the following examples, several type II and type I diterpene synthase combinations were selected and evaluated for the production of manool. For the type II synthases, five copalyl diphosphate (CPP) synthases were selected:
      • SmCPS, NCBI accession No ABV57835.1, from Salvia miltiorrhiza.
      • CfCPS1, NCBI accession No AHW04046.1, from Coleus forskohlii.
      • TaTps1, NCBI accession No BAH56559.1, from Triticum aestivum.
      • MvCps3, NCBI accession No AIE77092.1, from Marrubium vulgare.
      • RoCPS1, NCBI accession No AHL67261.1, from Rosmarinus officinalis.
  • The codon usage of the cDNAs encoding the five CPP synthases was modified for optimal expression in E. coli (DNA 2.0, Menlo Park, Calif. 94025) and the NdeI and KpnI restriction sites were added at the 5′-end and 3′-end, respectively. In addition, the cDNAs were designed to express the recombinant CPP synthases with deletion of the predicted signal peptide (58, 63, 59, 63 and 67 amino acids for SmCPS, CfCPS1, TaTps1, MvCps3 and RoCPS1, respectively).
  • For the type I diterpene synthase, the sclareol synthase from Salvia sclarea (SsScS) was used (NCBI accession No AET21246.1, WO2009095366). The codon usage of the cDNA was optimized for E. coli expression (DNA 2.0, Menlo Park, Calif. 94025), the first 50 N-terminal codons were removed and the NdeI and KpnI restriction sites were added at the 5′-end and 3′-end, respectively. All the cDNAs were synthesized in vitro and cloned in the pJ208 or pJ401 plasmid (DNA 2.0, Menlo Park, Calif. 94025, USA).
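  • The N-terminal truncations described above can be illustrated with a minimal Python sketch. It uses only the first 94 residues of SEQ ID NO: 1 and SEQ ID NO: 4 as printed in the Sequence Listing. The function name `truncate_n_terminus` and the rule that a start methionine is prepended only when the truncated sequence does not already begin with one are assumptions made for illustration, not a statement of how the constructs were actually designed.

```python
def truncate_n_terminus(protein: str, n_removed: int) -> str:
    """Drop the first n_removed residues and make sure the result starts with Met.

    Assumption (illustrative): a start Met is prepended only when the
    truncated sequence does not already begin with one.
    """
    mature = protein[n_removed:]
    return mature if mature.startswith("M") else "M" + mature


# First 94 residues of SEQ ID NO: 1 (full-length SmCPS), as printed.
SMCPS_NTERM = (
    "MASLSSTILSRSPAARRRITPASAKLHRPECFATSAWMGSSSKNLSL"
    "SYQLNHKKISVATVDAPQVHDHDGTTVHQGHDAVKNIEDPIEYIRTL"
)
# First 94 residues of SEQ ID NO: 4 (full-length SsScS), as printed.
SSSCS_NTERM = (
    "MSLAFNVGVTPFSGQRVGSRKEKFPVQGFPVTTPNRSRLIVNCSLTT"
    "IDFMAKMKENFKREDDKFPTTTTLRSEDIPSNLCIIDTLQRLGVDQF"
)

# 58 residues removed for SmCPS, 50 for SsScS (values taken from the text).
smcps2_start = truncate_n_terminus(SMCPS_NTERM, 58)[:15]
ssscs_start = truncate_n_terminus(SSSCS_NTERM, 50)[:15]

print(smcps2_start)
print(ssscs_start)
```

  • The two printed strings can be compared by eye with the first residues of SEQ ID NO: 2 and SEQ ID NO: 5.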
  • Example 2
  • Expression Plasmids.
  • The modified SmCPS-encoding cDNA (SmCPS2) and sclareol synthase (SsScS)-encoding cDNA (1132-2-5_opt) were digested with NdeI and KpnI and ligated into the pETDuet-1 plasmid providing the pETDuet-SmCPS2 and pETDuet-1132opt expression plasmids, respectively.
  • Another plasmid was constructed to co-express the SmCPS2 and SsScS enzymes together with a geranylgeranyl diphosphate (GGPP) synthase. For the GGPP synthase, the CrtE gene from Pantoea agglomerans (NCBI accession M38424.1) encoding a GGPP synthase (NCBI accession number AAA24819.1) was used. The CrtE gene was synthesized with codon optimization and addition of the NcoI and BamHI restriction enzyme recognition sites at the 3′ and 5′ ends (DNA 2.0, Menlo Park, Calif. 94025, USA) and ligated between the NcoI and BamHI sites of the pETDuet-1 plasmid to obtain the pETDuet-CrtE plasmid. The SmCPS2-encoding cDNA was digested with NdeI and KpnI and ligated into the pETDuet-CrtE plasmid, thus providing the pETDuet-CrtE-SmCPS2 construct. The optimized cDNA (1132-2-5_opt) encoding the truncated SsScS was then introduced in the pETDuet-CrtE-SmCPS2 plasmid using the In-Fusion® technique (Clontech, Takara Bio Europe). For this cloning, the pETDuet-1132opt was used as template in a PCR amplification using the forward primer SmCPS2-1132Inf_F1 5′-CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGCGAAAATGAAGGAGAACTTTAAACG-3′ (SEQ ID NO: 9) and the reverse primer 1132-pET_Inf_R1 5′-GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCATGTCCTCT-3′ (SEQ ID NO: 10). The PCR product was ligated into the plasmid pETDuet-CrtE-SmCPS2, digested with the KpnI and XhoI restriction enzymes, using the In-Fusion® Dry-Down PCR Cloning Kit (Clontech, Takara Bio Europe), providing the new plasmid pETDuet-CrtE-SmCPS2-SsScS. In this plasmid the CrtE gene is under the control of the first T7 promoter of the pETDuet plasmid and the CPP synthase and sclareol synthase encoding cDNAs are organized in a bi-cistronic construct under the control of the second T7 promoter.
  • The pETDuet-CrtE-SmCPS2-SsScS plasmid was used as template for construction of new expression plasmids carrying the cDNAs encoding the four other CPP synthases. The SmCPS2 cDNA was replaced by one of the four new CPP synthase-encoding cDNAs using an NdeI-KpnI restriction digestion-ligation approach, providing the new plasmids pETDuet-CrtE-CfCPS1del63-SsScS, pETDuet-CrtE-TaTps1del59-SsScS, pETDuet-CrtE-MvCps3del63-SsScS and pETDuet-CrtE-RoCPS1del67-SsScS.
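  • As a small illustration of the cloning design above, the following Python sketch locates the KpnI and XhoI recognition sequences in the two In-Fusion primers (SEQ ID NO: 9 and SEQ ID NO: 10). The recognition sequences GGTACC (KpnI) and CTCGAG (XhoI) are the standard ones; the helper name `find_sites` and the 0-based position reporting are illustrative choices. The primer strings are copied from the text with the line-break space removed.

```python
# Standard recognition sequences of the two enzymes named in the text.
SITES = {"KpnI": "GGTACC", "XhoI": "CTCGAG"}

# Primers as printed in the text, written 5'->3'.
FORWARD_SEQ_ID_9 = (
    "CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGCGAAAATG"
    "AAGGAGAACTTTAAACG"
)
REVERSE_SEQ_ID_10 = "GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCATGTCCTCT"


def find_sites(seq: str) -> dict:
    """Return {enzyme: 0-based position of the first site} for sites present in seq.

    Both sites are palindromic, so scanning one strand is sufficient.
    """
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


forward_sites = find_sites(FORWARD_SEQ_ID_9)
reverse_sites = find_sites(REVERSE_SEQ_ID_10)
print("forward primer:", forward_sites)
print("reverse primer:", reverse_sites)
```

  • In this sketch the forward primer carries only the KpnI site and the reverse primer only the XhoI site, which is consistent with insertion between the KpnI and XhoI sites of the digested vector.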
  • Example 3
  • Heterologous Expression in E. coli and Enzymatic Activities.
  • The expression plasmids (pETDuet-SmCPS2 or pETDuet-1132opt) were used to transform BL21(DE3) E. coli cells (Novagen, Madison, Wis.). Single colonies of transformed cells were used to inoculate 25 ml LB medium. After 5 to 6 hours of incubation at 37° C., the cultures were transferred to a 20° C. incubator and left for 1 hour for equilibration. Expression of the protein was then induced by the addition of 0.1 mM IPTG and the culture was incubated overnight at 20° C. The next day, the cells were collected by centrifugation, re-suspended in 0.1 volume of 50 mM MOPSO (3-morpholino-2-hydroxypropanesulfonic acid sodium salt, 3-(N-morpholinyl)-2-hydroxypropanesulfonic acid sodium salt) buffer at pH 7, 10% glycerol, 1 mM DTT and lysed by sonication. The extracts were cleared by centrifugation (30 min at 20,000 g) and the supernatants containing the soluble proteins were used for further experiments.
  • Example 4
  • In Vitro Diterpene Synthase Activity Assays.
  • Enzymatic assays were performed in Teflon-sealed glass tubes using 50 to 100 μl of protein extract in a final volume of 1 mL of 50 mM MOPSO pH 7, 10% glycerol supplemented with 20 mM MgCl2 and 50 to 200 μM purified geranylgeranyl diphosphate (GGPP) (prepared as described by Keller and Thompson, J. Chromatogr, 1993, 645(1):161-167). The tubes were incubated for 5 to 48 hours at 30° C. and the enzyme products were extracted twice with one volume of pentane. After concentration under a nitrogen flux, the extracts were analyzed by GC-MS and compared to extracts from control proteins (obtained from cells transformed with the empty plasmid). GC-MS analyses were performed on an Agilent 6890 series GC system equipped with a DB1 column (30 m×0.25 mm×0.25 μm film thickness; Agilent) and coupled with a 5975 series mass spectrometer. The carrier gas was helium at a constant flow of 1 ml/min. Injection was in split-less mode with the injector temperature set at 260° C. and the oven temperature was programmed from 100° C. to 225° C. at 10° C./min and to 280° C. at 30° C./min. The identities of the products were confirmed based on the concordance of the retention indices and mass spectra with those of authentic standards.
  • In these conditions and with the recombinant protein from E. coli cells transformed with the plasmids pETDuet-SmCPS2 or pETDuet-1132opt (heterologously expressing the SmCPS or SsScS enzymes, respectively) no production of diterpene molecules was detected in the solvent extracts (the diphosphate-containing diterpenes are not detected in these conditions). Similar assays were then performed but combining the 2 protein extracts containing the recombinant SmCPS and SsScS in a single assay. In these assays, one major product was formed and was identified as being (+)-manool by matching of the mass spectrum and retention index with authentic standards (FIG. 3). This experiment demonstrated that a sclareol synthase can be used together with a CPP synthase to produce manool.
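  • The oven program given above can be turned into a ramp duration with a few lines of Python. This is a sketch only: the function name `ramp_minutes` is hypothetical, and initial or final hold times are not stated for this example and are therefore not included.

```python
def ramp_minutes(start_c: float, segments) -> float:
    """Total ramp time in minutes.

    segments is a list of (target_temperature_C, rate_C_per_min) tuples
    applied in order, starting from start_c. Hold times are not modelled.
    """
    total = 0.0
    current = start_c
    for target, rate in segments:
        total += abs(target - current) / rate
        current = target
    return total


# Example 4 program: 100 C to 225 C at 10 C/min, then to 280 C at 30 C/min.
example4_minutes = ramp_minutes(100, [(225, 10), (280, 30)])
print(f"{example4_minutes:.2f} min of temperature ramp")
```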
  • Example 5
  • In Vivo Manool Production Using E. coli Cells.
  • The in vivo production of manool using cultures of whole cells was evaluated using E. coli cells. The CrtE gene inserted in the co-expression plasmids described in Example 2 encodes an enzyme having GGPP synthase activity that uses farnesyl-diphosphate (FPP) to produce geranylgeranyl diphosphate (GGPP). To increase the level of the endogenous GGPP pool and therefore the diterpene productivity of the cells, a heterologous complete mevalonate pathway leading to FPP was co-expressed in the same cells. The enzymes of this pathway were expressed using a single plasmid containing all the genes organized in two operons under the control of two promoters. The construction of this expression plasmid is described in patent application WO2013064411 or in Schalk et al. (J. Am. Chem. Soc., 2013, 134:18900-18903). Briefly, a first synthetic operon consisting of the genes encoding an E. coli acetoacetyl-CoA thiolase (atoB), a Staphylococcus aureus HMG-CoA synthase (mvaS), a Staphylococcus aureus HMG-CoA reductase (mvaA) and a Saccharomyces cerevisiae FPP synthase (ERG20) was synthesized in vitro (DNA2.0, Menlo Park, Calif., USA) and ligated into the NcoI-BamHI digested pACYCDuet-1 vector (Invitrogen), yielding pACYC-29258. A second operon containing a mevalonate kinase (MvaK1), a phosphomevalonate kinase (MvaK2), a mevalonate diphosphate decarboxylase (MvaD), and an isopentenyl diphosphate isomerase (idi) was amplified from genomic DNA of Streptococcus pneumoniae (ATCC BAA-334) and ligated into the second multicloning site of pACYC-29258, providing the plasmid pACYC-29258-4506. This plasmid thus contains the genes encoding all enzymes of the biosynthetic pathway leading from acetyl-coenzyme A to FPP.
  • KRX E. coli cells (Promega) were co-transformed with the plasmid pACYC-29258-4506 and one plasmid selected from pETDuet-CrtE-SmCPS2-SsScS, pETDuet-CrtE-CfCPS1del63-SsScS, pETDuet-CrtE-TaTps1del59-SsScS, pETDuet-CrtE-MvCps3del63-SsScS, or pETDuet-CrtE-RoCPS1del67-SsScS. Transformed cells were selected on carbenicillin (50 μg/ml) and chloramphenicol (34 μg/ml) LB-agarose plates. Single colonies were used to inoculate 5 mL liquid LB medium supplemented with the same antibiotics. The cultures were incubated overnight at 37° C. The next day 2 mL of TB medium supplemented with the same antibiotics were inoculated with 0.2 mL of the overnight culture. After 6 hours of incubation at 37° C., the culture was cooled down to 28° C. and 0.1 mM IPTG, 0.2% rhamnose and 1:10 volume of decane were added to each tube. The cultures were incubated for 48 hours at 28° C. The cultures were then extracted twice with 2 volumes of MTBE (methyl tert-butyl ether), the organic phases were concentrated to 500 μL and analyzed by GC-MS as described above in Example 4, except for the oven temperature program, which was a 1 min hold at 100° C., followed by a temperature gradient of 10° C./min to 220° C. and 20° C./min to 300° C.
  • Under these culture conditions, manool was produced with each combination of type II diterpene synthase and the Salvia sclarea sclareol synthase (SsScS) (FIGS. 4 and 5). The amounts of diterpene compounds produced were quantified using an internal standard (alpha-longipinene). The table below shows the quantities of manool produced relative to the SmCPS/SsScS combination when the SsScS is combined with various type II diterpene synthases (under these experimental conditions, the concentration of manool produced by cells expressing the SmCPS and the SsScS was 300 to 500 mg/L (FIG. 4)). Under these conditions, the highest relative quantity of manool was produced with the TaTps1del59 combination.
  • Type II diterpene synthase | Type I diterpene synthase | Relative quantity of manool produced
    SmCPS2 | SsScS | 100
    CfCPS1del63 | SsScS | 125.3
    TaTps1del59 | SsScS | 139.4
    MvCps3del63 | SsScS | 14.9
    RoCPS1del67 | SsScS | 77.7
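  • The relative quantities in the table can be related to the stated 300 to 500 mg/L range of the SmCPS2/SsScS reference with a short Python sketch. This is an illustration under an assumption not made in the text: that each relative value scales linearly with the reference concentration. The resulting ranges are therefore rough estimates, not measured values, and the name `estimated_range` is hypothetical.

```python
# Reference concentration range (mg/L) stated for the SmCPS2/SsScS combination.
REFERENCE_MG_PER_L = (300.0, 500.0)

# Relative quantities of manool (reference = 100), taken from the table.
RELATIVE = {
    "SmCPS2": 100.0,
    "CfCPS1del63": 125.3,
    "TaTps1del59": 139.4,
    "MvCps3del63": 14.9,
    "RoCPS1del67": 77.7,
}


def estimated_range(relative_percent: float) -> tuple:
    """Scale the reference range linearly (an assumption, see lead-in)."""
    low, high = REFERENCE_MG_PER_L
    return (low * relative_percent / 100, high * relative_percent / 100)


best = max(RELATIVE, key=RELATIVE.get)
for name, rel in RELATIVE.items():
    low, high = estimated_range(rel)
    print(f"{name:12s} {low:6.1f} to {high:6.1f} mg/L (estimate)")
print("highest relative quantity:", best)
```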
  • Example 6
  • Production of (+)-Manool Using Recombinant Cells, Purification and NMR Analysis.
  • One litre of E. coli culture was prepared in the conditions described in Example 5, using the SmCPS/SsScS enzyme combination, except that the decane organic phase was replaced by 50 g/L Amberlite XAD-4 for solid phase extraction. The culture medium was filtered to recover the resin. The resin was then washed with 3 column volumes of water, and eluted using 3 column volumes of MTBE. The product was then further purified by flash chromatography on silica gel using a mobile phase composed of heptane:MTBE 8:2 (v/v). The structure of manool was confirmed by 1H- and 13C-NMR using a Bruker Avance 500 MHz spectrometer. The optical rotation was measured using a Perkin-Elmer 241 polarimeter and the value of [α]D 20=+26.9° (0.3%, CHCl3) confirmed the production of (+)-manool.
  • Example 7
  • In Vivo Manool Production in E. coli Cells Using Sclareol Synthases from Nicotiana glutinosa.
  • Sclareol synthases from the plant Nicotiana glutinosa are described in WO 2014/022434 and are shown to produce sclareol from labdenediol diphosphate (LPP). Two of the sclareol synthases described in WO 2014/022434, NgSCS-del29 (corresponding to SEQ ID NO: 78 of WO 2014/022434) and NgSCS-del38 (corresponding to SEQ ID NO: 40 of WO 2014/022434), were evaluated for the production of (+)-manool under conditions similar to Example 5.
  • A cDNA encoding NgSCS-del29 was designed with a codon usage optimal for E. coli expression and including the KpnI and XhoI sites at the 5′-end and 3′-end, respectively. This DNA was synthesized by DNA 2.0 (Newark, CA 94560).
  • The pETDuet-CrtE-SmCPS2-SsScS plasmid (Example 2) was used as template for construction of a new expression plasmid. The pETDuet-CrtE-SmCPS2-SsScS plasmid was digested with the KpnI and XhoI restriction enzymes to replace the SsScS cDNA with the NgSCS-del29 cDNA, providing the new pETDuet-CrtE-SmCPS2-del29 plasmid.
  • KRX E. coli cells (Promega) were co-transformed with the plasmid pACYC-29258-4506 (Example 5) and the pETDuet-CrtE-SmCPS2-del29 plasmid. Transformed cells were selected and cultivated in conditions for production of diterpene as described in Example 5. The production of diterpenes was evaluated using GC-MS analysis and the diterpene compounds produced were quantified using an internal standard (alpha-longipinene). With the new combination of the diterpene synthases SmCPS2 and NgSCS-del29, manool was produced by transformed E. coli cells (FIG. 6). The combination of the diterpene synthases SmCPS2 and NgSCS-del38 did not produce manool under the experimental conditions used. Thus at least one of the Nicotiana glutinosa sclareol synthases tested can also be used to produce manool from CPP. However, the quantities produced using the Nicotiana glutinosa synthase were much lower than with the SsScS synthase (see table below).
  • Type II diterpene synthase | Type I diterpene synthase | Relative quantity of manool produced
    SmCPS2 | SsScS | 100
    SmCPS2 | NgSCS-del29 | 3.1
  • Example 8
  • The manool obtained in the above examples was converted into its esters according to the following experimental procedure (exemplified below by conversion into its acetate):
  • Figure US20210010035A1-20210114-C00001
  • Following the literature (G. Ohloff, Helv. Chim. Acta 41, 845 (1958)), 32.0 g (0.11 mole) of pure crystalline (+)-Manool were treated with 20.0 g (0.25 mole) of acetyl chloride in 100 ml of dimethyl aniline for 5 days at room temperature. The mixture was additionally heated for 7 hours at 50° to reach 100% conversion. After cooling, the reaction mixture was diluted with ether, washed successively with 10% H2SO4, aqueous NaHCO3 and water to neutrality. After drying (Na2SO4) and concentration, the product was distilled (bulb-to-bulb, B.p.=160°, 0.1 mbar) to give 20.01 g (79.4%) of Manool Acetate which was used without further purification.
  • MS: M+ 332 (0); m/e: 272 (27), 257 (83), 137 (62), 95 (90), 81 (100).
  • 1H-NMR (CDCl3): 0.67, 0.80, 0.87, 1.54 and 2.01 (5 s, 3H each), 4.49 (s, 1H), 4.80 (s, 1H), 5.11 (m, 1H), 5.13 (m, 1H), 5.95 (m, 1H).
  • 13C-NMR (CDCl3): 14.5 (q), 17.4 (t), 19.4 (t), 21.7 (q), 22.2 (q), 23.5 (q), 24.2 (t), 33.5 (s), 33.6 (t), 38.3 (t), 39.0 (t), 39.3 (t), 39.8 (s), 42.2 (t), 55.6 (d), 57.2 (t), 83.4 (s), 106.4 (t), 113.0 (t), 142.0 (d), 148.6 (s), 169.9 (s).
  • Example 9
  • The manool acetate obtained in the above examples was converted into its trienes according to the following experimental procedure (exemplified below by conversion into Sclarene and (Z+E)-Biformene):
  • Figure US20210010035A1-20210114-C00002
  • To a solution of 0.4 g of Manool Acetate in 4 ml of cyclohexane at room temperature was added 0.029 g (0.05 eq.) of BF3.AcOH complex. After 15 minutes at room temperature, the reaction was quenched with aqueous NaHCO3 and washed with water to neutrality. GC-MS analysis showed only hydrocarbons which were identified as Sclarene, (Z) and (E)-biformene. No Copalol Acetate was detected.
    Another trial with more catalyst (0.15 eq) gave the same result.
  • Sclarene: MS: M+ 272 (18); m/e: 257 (100), 149 (15), 105 (15).
  • (Z) and (E)-Biformene (identical spectra): MS: M+ 272 (29); m/e: 257 (100), 187 (27), 161 (33), 105 (37).
  • Example 10
  • The manool obtained in the above examples was converted into Copalyl esters according to the following experimental procedure (exemplified below by conversion into the acetate):
  • Figure US20210010035A1-20210114-C00003
  • To a solution of 0.474 g (0.826 mmole, 0.27 eq.) of BF3.AcOH in 100 ml of cyclohexane at room temperature was added 4.4 g of acetic anhydride and 12.1 g of acetic acid. At room temperature, 10.0 g (33 mmole) of pure crystalline Manool in 15 ml of cyclohexane were added (slightly exothermic) and the temperature was maintained at room temperature using a water bath. After 30 min. of stirring at room temperature, a GC control showed no starting material. The reaction mixture was quenched with 300 ml of aq. saturated NaHCO3 and treated as usual. The crude mixture (9.9 g) was purified by flash chromatography (SiO2, pentane/ether 95:5) and bulb-to-bulb distillation (B.p.=130°, 0.1 mbar) to give 4.34 g (37.1%) of a 27/73 mixture of (Z) and (E)-Copalyl Acetate.
  • (Z)-Copalyl Acetate:
  • MS: M+ 332 (0); m/e: 317 (2), 272 (35), 257 (100), 137 (48), 95 (68), 81 (70).
  • 1H-NMR (CDCl3): 0.67, 0.80, 0.87, 1.76 and 2.04 (5s, 3H each), 4.86 (s, 1H), 5.35 (t: J=6 Hz, 1H).
  • (E)-Copalyl Acetate:
  • MS: M+ 332 (0); m/e: 317 (2), 272 (33), 257 (100), 137 (54), 95 (67), 81 (74).
  • 1H-NMR (CDCl3): 0.68, 0.80, 0.87, 1.70 and 2.06 (5s, 3H each), 4.82 (s, 1H), 5.31 (t: J=6 Hz, 1H).
  • 13C-NMR (CDCl3): (Spectrum recorded on (Z/E) mixture, only significant signals are given): 61.4 (t), 106.2 (t), 117.9 (d), 143.1 (s), 148.6 (s), 171.1 (s).
  • Example 11
  • The copalyl acetate obtained in the above examples was converted into Copalol according to the following experimental procedure:
  • Figure US20210010035A1-20210114-C00004
  • Copalyl Acetate (4.17 g, 12.5 mmole), KOH pellets (3.35 g, 59.7 mmole), water (1.5 g) and EtOH (9.5 ml) were mixed together and stirred for 3 hours at 50°. After usual workup, 3.7 g of crude (Z+E)-Copalol were obtained and purified by flash chromatography (SiO2, pentane/ether 7:2). After evaporation of the solvent, a bulb-to-bulb distillation (B.p.=170°, 0.1 mbar) furnished 3.25 g (92%) of a 27/73 mixture of (Z) and (E)-Copalol.
  • (Z)-Copalol
  • MS: M+ 290 (3); m/e: 275 (18), 272 (27), 257 (82), 137 (71), 95 (93), 81 (100), 69 (70).
  • 1H-NMR (CDCl3): 0.67, 0.80, 0.87 and 1.74 (4s, 3H each); 4.06 (m, 2H), 4.55 (s, 1H), 4.86 (s, 1H), 5.42 (t: J=6 Hz, 1H).
  • (E)-Copalol
  • MS: M+ 290 (3); m/e: 275 (27), 272 (22), 257 (75), 137 (75), 95 (91), 81 (100), 69 (68).
  • 1H-NMR (CDCl3): 0.68, 0.80, 0.87 and 1.67 (4s, 3H each); 4.15 (m, 2H), 4.51 (s, 1H), 4.83 (s, 1H), 5.39 (t, J=6 Hz, 1H)
  • 13C-NMR (CDCl3): (Spectrum recorded on (Z/E) mixture, only significant signals are given): 59.4 (t), 106.2 (t), 123.0 (d), 140.6 (s), 148.6 (s).
  • Example 12
  • In Vivo Manool Production in Saccharomyces cerevisiae Cells Using Different Combinations of CPP Synthases and Sclareol Synthases.
  • Different combinations of class I and class II diterpene synthases were evaluated for the production of manool in S. cerevisiae cells.
  • For the class II diterpene synthase, five CPP synthases were selected:
      • SmCPS, NCBI accession No ABV57835.1, from Salvia miltiorrhiza.
      • CfCPS1, NCBI accession No AHW04046.1, from Coleus forskohlii.
      • TaTps1, NCBI accession No BAH56559.1, from Triticum aestivum.
      • MvCps3, NCBI accession No AIE77092.1, from Marrubium vulgare.
      • RoCPS1, NCBI accession No AHL67261.1, from Rosmarinus officinalis.
  • For the class I, two putative sclareol synthases from Nicotiana glutinosa and one from Salvia sclarea were selected:
      • NgSCS-del38 (corresponding to SEQ ID NO: 40 of WO 2014/022434).
      • NgSCS-del29 (corresponding to SEQ ID NO: 78 of WO 2014/022434).
      • SsScS, NCBI accession No AET21246.1, from Salvia sclarea.
  • The codon usage of the DNA encoding the different CPP synthases was modified for optimal expression in S. cerevisiae. In addition, the DNA sequences were designed to express the recombinant CPP synthases with deletion of the predicted signal peptide (58, 63, 59, 63 and 67 amino acids for SmCPS, CfCPS1, TaTps1, MvCps3 and RoCPS1, respectively). The NgSCS-del38, NgSCS-del29 and SsScS DNA sequences were also codon optimized for S. cerevisiae expression.
  • For expression of the different genes in S. cerevisiae, a set of plasmids was constructed in vivo using yeast endogenous homologous recombination as previously described in Kuijpers et al., Microb Cell Fact., 2013, 12:47. Each plasmid is composed of six DNA fragments which were used for S. cerevisiae co-transformation. The fragments were:
      • a) LEU2 yeast marker, constructed by PCR using the primers 5′-AGGTGCAGTTCGCGTGCAATTATAACGTCGTGGCAACTGTTATCAGTCGTACCGCGCCATTCGACTACGTCGTAAGGCC-3′ (SEQ ID NO: 44) and 5′-TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCGTTGTTGCTGACCATCGACGGTCGAGGAGAACTT-3′ (SEQ ID NO: 45) with the plasmid pESC-LEU (Agilent Technologies, California, USA) as template;
      • b) AmpR E. coli marker, constructed by PCR using the primers 5′-TGGTCAGCAACAACGCCGAAGAATCACTCTCGTGTTGAGAATTGCACGCCTTGACCACGACACGTTAAGGGATTTTGGTCATGAG-3′ (SEQ ID NO: 37) and 5′-AACGCGTACCCTAAGTACGGCACCACAGTGACTATGCAGTCCGCACTTTGCCAATGCCAAAAATGTGCGCGGAACCCCTA-3′ (SEQ ID NO: 38) with the plasmid pESC-URA as template;
      • c) Yeast origin of replication, obtained by PCR using the primers 5′-TTGGCATTGGCAAAGTGCGGACTGCATAGTCACTGTGGTGCCGTACTTAGGGTACGCGTTCCTGAACGAAGCATCTGTGCTTCA-3′ (SEQ ID NO: 39) and 5′-CCGAGATGCCAAAGGATAGGTGCTATGTTGATGACTACGACACAGAACTGCGGGTGACATAATGATAGCATTGAAGGATGAGACT-3′ (SEQ ID NO: 40) with pESC-URA as template;
      • d) E. coli replication origin, obtained by PCR using the primers 5′-ATGTCACCCGCAGTTCTGTGTCGTAGTCATCAACATAGCACCTATCCTTTGGCATCTCGGTGAGCAAAAGGCCAGCAAAAGG-3′ (SEQ ID NO: 41) and 5′-CTCAGATGTACGGTGATCGCCACCATGTGACGGAAGCTATCCTGACAGTGTAGCAAGTGCTGAGCGTCAGACCCCGTAGAA-3′ (SEQ ID NO: 42) with the plasmid pESC-URA as template;
      • e) a fragment composed of the last 60 nucleotides of the fragment “d”, 200 nucleotides downstream of the stop codon of the yeast gene PGK1, the GGPP synthase coding sequence CrtE (from Pantoea agglomerans, NCBI accession M38424.1) codon optimized for its expression in S. cerevisiae, the bidirectional yeast promoter of GAL10/GAL1, one of the tested sclareol synthase coding sequences, 200 nucleotides downstream of the stop codon of the yeast gene CYC1 and the sequence 5′-ATTCCTAGTGACGGCCTTGGGAACTCGATACACGATGTTCAGTAGACCGCTCACACATGG-3′ (SEQ ID NO: 43); this fragment was obtained by DNA synthesis (DNA 2.0, Menlo Park, Calif. 94025); and
      • f) a fragment composed of the last 60 nucleotides of fragment “e”, 200 nucleotides downstream of the stop codon of the yeast gene CYC1, one of the tested CPP synthase coding sequences, the bidirectional yeast promoter of GAL10/GAL1 and 60 nucleotides corresponding to the beginning of the fragment “a”; this fragment was obtained by DNA synthesis (DNA 2.0, Menlo Park, Calif. 94025).
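  • In vivo assembly by homologous recombination relies on adjacent fragments sharing terminal homology. The following Python sketch checks this for one junction, between fragment “a” (LEU2 marker) and fragment “b” (AmpR marker), using the primers SEQ ID NO: 45 and SEQ ID NO: 37 as printed above. It assumes that the first 60 nucleotides of each primer form the homology tail (the 60-nucleotide figure is the overlap length the text gives for fragments “e” and “f”); the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Primers as printed in the text (line-break spaces removed), written 5'->3'.
SEQ_ID_45 = (  # second primer of fragment a (LEU2 marker)
    "TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCGTT"
    "GTTGCTGACCATCGACGGTCGAGGAGAACTT"
)
SEQ_ID_37 = (  # first primer of fragment b (AmpR marker)
    "TGGTCAGCAACAACGCCGAAGAATCACTCTCGTGTTGAGAATTGCACGCC"
    "TTGACCACGACACGTTAAGGGATTTTGGTCATGAG"
)


def shares_homology_tail(primer_a: str, primer_b: str, length: int = 60) -> bool:
    """True if the 5' tails of the two primers are reverse complements.

    When this holds, the two PCR products end in the same double-stranded
    block of `length` base pairs and can recombine at that junction.
    """
    return reverse_complement(primer_a[:length]) == primer_b[:length]


junction_ok = shares_homology_tail(SEQ_ID_45, SEQ_ID_37)
print("fragments a and b share a 60-nt homology block:", junction_ok)
```

  • The same check could be repeated for each adjacent primer pair to confirm that the six fragments close into a circle.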
  • In total 15 plasmids were constructed which cover all the possible combinations of class I and class II diterpene synthases listed above. The table below shows all the plasmids.
  • Plasmid name | Class II diterpene synthase | Class I diterpene synthase
    Nm | SmCPS2 | SsScS
    Cf | CfCPS1del63 | SsScS
    Mv | MvCps3del63 | SsScS
    Ro | RoCPS1del67 | SsScS
    Ta | TaTps1del59 | SsScS
    Nt_Sm | SmCPS2 | NgSCS-del38
    Nt_Cf | CfCPS1del63 | NgSCS-del38
    Nt_Mv | MvCps3del63 | NgSCS-del38
    Nt_Ro | RoCPS1del67 | NgSCS-del38
    Nt_Ta | TaTps1del59 | NgSCS-del38
    Nt2_Sm | SmCPS2 | NgSCS-del29
    Nt2_Cf | CfCPS1del63 | NgSCS-del29
    Nt2_Mv | MvCps3del63 | NgSCS-del29
    Nt2_Ro | RoCPS1del67 | NgSCS-del29
    Nt2_Ta | TaTps1del59 | NgSCS-del29
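  • The count of 15 plasmids follows from pairing each of the five class II synthases with each of the three class I synthases. A minimal Python sketch of that enumeration is given below; the variable names are illustrative and the plasmid names from the table are deliberately not regenerated.

```python
from itertools import product

# Class II (CPP synthase) and class I (sclareol synthase) variants from the text.
CLASS_II = ["SmCPS2", "CfCPS1del63", "MvCps3del63", "RoCPS1del67", "TaTps1del59"]
CLASS_I = ["SsScS", "NgSCS-del38", "NgSCS-del29"]

# Iterate class I in the outer loop so the order matches the table.
combinations = [(cpp, scs) for scs, cpp in product(CLASS_I, CLASS_II)]

for cpp, scs in combinations:
    print(f"{cpp} + {scs}")
print("number of combinations:", len(combinations))
```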
  • To increase the level of the endogenous farnesyl-diphosphate (FPP) pool in S. cerevisiae cells, an extra copy of all the yeast endogenous genes involved in the mevalonate pathway, from ERG10 coding for acetyl-CoA C-acetyltransferase to ERG20 coding for FPP synthetase, was integrated in the genome of the S. cerevisiae strain CEN.PK2-1C (Euroscarf, Frankfurt, Germany) under the control of galactose-inducible promoters, similarly as described in Paddon et al., Nature, 2013, 496:528-532. Briefly, three cassettes were integrated in the LEU2, TRP1 and URA3 loci, respectively. The first cassette contained the genes ERG20 and a truncated HMG1 (tHMG1) as described in Donald et al., Proc Natl Acad Sci USA, 1997, 109:E111-8, under the control of the bidirectional promoter GAL10/GAL1, and the genes ERG19 and ERG13, also under the control of the GAL10/GAL1 promoter; the cassette was flanked by two 100-nucleotide regions corresponding to the up- and down-stream sections of LEU2. In the second cassette, the genes IDI1 and tHMG1 were under the control of the GAL10/GAL1 promoter and the gene ERG13 was under the control of the promoter region of GAL7; the cassette was flanked by two 100-nucleotide regions corresponding to the up- and down-stream sections of TRP1. The third cassette contained the genes ERG10, ERG12, tHMG1 and ERG8, all under the control of GAL10/GAL1 promoters; the cassette was flanked by two 100-nucleotide regions corresponding to the up- and down-stream sections of URA3. All genes in the three cassettes included 200 nucleotides of their own terminator regions. Also, an extra copy of GAL4 under the control of a mutated version of its own promoter, as described in Griggs and Johnston, Proc Natl Acad Sci USA, 1991, 88:8597-8601, was integrated upstream of the ERG9 promoter region. In addition, the endogenous promoter of ERG9 was replaced by the yeast promoter region of CTR3, generating the strain YST035. Finally, YST035 was mated with the strain CEN.PK2-1D (Euroscarf, Frankfurt, Germany), obtaining a diploid strain termed YST045.
  • YST045 was transformed with the above described fragments required for in vivo plasmid assembly. Yeast transformations were performed with the lithium acetate protocol as described in Gietz and Woods, Methods Enzymol., 2002, 350:87-96. Transformation mixtures were plated on SmLeu-media containing 6.7 g/L of Yeast Nitrogen Base without amino acids (BD Difco, New Jersey, USA), 1.6 g/L Dropout supplement without leucine (Sigma Aldrich, Missouri, USA), 20 g/L glucose and 20 g/L agar. Plates were incubated for 3-4 days at 30° C. Single cells were used to produce manool in cultures as described in Westfall et al., Proc Natl Acad Sci USA, 2012, 109:E111-118.
  • Under these culture conditions, manool was produced with some combinations of type II and type I diterpene synthases. The production of manool was evaluated using GC-MS analysis and quantified using an internal standard. The table below shows the quantities of manool produced relative to the SmCPS/SsScS combination (under these experimental conditions, the concentration of manool produced by cells expressing the SmCPS and the SsScS was 100 to 250 mg/L, the highest quantity of manool produced).
  • Class II diterpene synthase | Class I diterpene synthase | Relative quantity of manool produced
    SmCPS2 | SsScS | 100
    CfCPS1del63 | SsScS | 67
    MvCps3del63 | SsScS | 1
    RoCPS1del67 | SsScS | 29
    TaTps1del59 | SsScS | 16
    SmCPS2 | NgSCS-del38 | 0
    CfCPS1del63 | NgSCS-del38 | 0
    MvCps3del63 | NgSCS-del38 | 0
    RoCPS1del67 | NgSCS-del38 | 0
    TaTps1del59 | NgSCS-del38 | 0
    SmCPS2 | NgSCS-del29 | 0
    CfCPS1del63 | NgSCS-del29 | 0
    MvCps3del63 | NgSCS-del29 | 0
    RoCPS1del67 | NgSCS-del29 | 0
    TaTps1del59 | NgSCS-del29 | 0
  • Sequence Listing.
    SEQ ID NO: 1
    SmCPS, full-length copalyl diphosphate
    synthase from Salvia miltiorrhiza
    MASLSSTILSRSPAARRRITPASAKLHRPECFATSAWMGSSSKNLSL
    SYQLNHKKISVATVDAPQVHDHDGTTVHQGHDAVKNIEDPIEYIRTL
    LRTTGDGRISVSPYDTAWVAMIKDVEGRDGPQFPSSLEWIVQNQLED
    GSWGDQKLFCVYDRLVNTIACVVALRSWNVHAHKVKRGVTYKENVDK
    LMEGNEEHMTCGFEWFPALLQKAKSLGIEDLPYDSPAVQEVYHVREQ
    KLKRIPLEIMHKIPTSLLFSLEGLENLDWDKLLKLQSADGSFLTSPS
    STAFAFMQTKDEKCYQFIKNTIDTFNGGAPHTYPVDWGRLWAIDRLQ
    RLGISRFFEPEIADCLSHIHKFWTDKGVFSGRESEFCDIDDTSMGMR
    LMRMHGYDVDPNVLRNFKQKDGKFSCYGGQMIESPSPIYNLYRASQL
    RFPGEEILEDAKRFAYDFLKEKLANNQILDKWVISKHLPDEIKLGLE
    MPWLATLPRVEAKYYIQYYAGSGDVWIGKTLYRMPEISNDTYHDLAK
    TDFKRCQAKHQFEWLYMQEWYESCGIEEFGISRKDLLLSYFLATASI
    FELERTNERIAWAKSQIIAKMITSFFNKETTSEEDKRALLNELGNIN
    GLNDTNGAGREGGAGSIALATLTQFLEGFDRYTRHQLKNAWSVWLTQ
    LQHGEADDAELLTNTLNICAGHIAFREEILAHNEYKALSNLTSKICR
    QLSFIQSEKEMGVEGEIAAKSSIKNKELEEDMQMLVKLVLEKYGGID
    RNIKKAFLAVAKTYYYRAYHAADTIDTHMFKVLFEPVA
    SEQ ID NO: 2
    SmCPS2, truncated copalyl diphosphate
    synthase from S. miltiorrhiza
    MATVDAPQVHDHDGTTVHQGHDAVKNIEDPIEYIRTLLRTTGDGRIS
    VSPYDTAWVAMIKDVEGRDGPQFPSSLEWIVQNQLEDGSWGDQKLFC
    VYDRLVNTIACVVALRSWNVHAHKVKRGVTYIKENVDKLMEGNEEHM
    TCGFEVVFPALLQKAKSLGIEDLPYDSPAVQEVYHVREQKLKRIPLE
    IMHKIPTSLLFSLEGLENLDWDKLLKLQSADGSFLTSPSSTAFAFMQ
    TKDEKCYQFIKNTIDTFNGGAPHTYPVDVFGRLWAIDRLQRLGISRF
    FEPEIADCLSHIHKFWTDKGVFSGRESEFCDIDDTSMGMRLMRMHGY
    DVDPNVLRNFKQKDGKFSCYGGQMIESPSPIYNLYRASQLRFPGEEI
    LEDAKRFAYDFLKEKLANNQILDKWVISKHLPDEIKLGLEMPWLATL
    PRVEAKYYIQYYAGSGDVWIGKTLYRMPEISNDTYHDLAKTDFKRCQ
    AKHQFEWLYMQEWYESCGIEEFGISRKDLLLSYFLATASIFELERTN
    ERIAWAKSQIIAKMITSFFNKETTSEEDKRALLNELGNINGLNDTNG
    AGREGGAGSIALATLTQFLEGFDRYTRHQLKNAWSVWLTQLQHGEAD
    DAELLTNTLNICAGHIAFREEILAHNEYKALSNLTSKICRQLSFIQS
    EKEMGVEGEIAAKSSIKNKELEEDMQMLVKLVLEKYGGIDRNIKKAF
    LAVAKTYYYRAYHAADTIDTHMFKVLFEPVA
    SEQ ID NO: 3
    SmCPS2opt, optimized cDNA for E. coli
    expression encoding for SmCPS2
    ATGGCAACTGTTGACGCACCTCAAGTCCATGATCACGATGGCACCAC
    CGTTCACCAGGGTCACGACGCGGTGAAGAACATCGAGGACCCGATCG
    AATACATTCGTACCCTGCTGCGTACCACTGGTGATGGTCGCATCAGC
    GTCAGCCCGTATGACACGGCGTGGGTGGCGATGATTAAAGACGTCGA
    GGGTCGCGATGGCCCGCAATTTCCTTCTAGCCTGGAGTGGATTGTCC
    AAAATCAGCTGGAAGATGGCTCGTGGGGTGACCAGAAGCTGTTTTGT
    GTTTACGATCGCCTGGTTAATACCATCGCATGTGTGGTTGCGCTGC
    GTAGCTGGAATGTTCACGCTCATAAAGTCAAACGTGGCGTGACGTAT
    ATCAAGGAAAACGTGGATAAGCTGATGGAAGGCAACGAAGAACACAT
    GACGTGTGGCTTCGAGGTTGTTTTTCCAGCCTTGCTGCAGAAAGCAA
    AGTCCCTGGGTATTGAGGATCTGCCGTACGACTCGCCGGCAGTGCAA
    GAAGTCTATCACGTCCGCGAGCAGAAGCTGAAACGCATCCCGCTGGA
    GATTATGCATAAGATTCCGACCTCTCTGCTGTTCTCTCTGGAAGGTC
    TGGAGAACCTGGATTGGGACAAACTGCTGAAGCTGCAGTCCGCTGAC
    GGTAGCTTTCTGACCAGCCCGAGCAGCACGGCCTTTGCGTTTATGCA
    GACCAAAGATGAGAAGTGCTATCAATTCATCAAGAATACTATTGATA
    CCTTCAACGGTGGCGCACCGCACACGTACCCAGTAGACGTTTTTGGT
    CGCCTGTGGGCGATTGACCGTTTGCAGCGTCTGGGTATCAGCCGTTT
    CTTCGAGCCGGAGATTGCGGACTGCTTGAGCCATATTCACAAATTCT
    GGACGGACAAAGGCGTGTTCAGCGGTCGTGAGAGCGAGTTCTGCGAC
    ATCGACGATACGAGCATGGGTATGCGTCTGATGCGTATGCACGGTTA
    CGACGTGGACCCGAATGTGTTGCGCAACTTCAAGCAAAAAGATGGCA
    AGTTTAGCTGCTACGGTGGCCAAATGATTGAGAGCCCGAGCCCGATC
    TATAACTTATATCGTGCGAGCCAACTGCGTTTCCCGGGTGAAGAAAT
    TCTGGAAGATGCGAAGCGTTTTGCGTATGACTTCCTGAAGGAAAAGC
    TCGCAAACAATCAAATCTTGGATAAATGGGTGATCAGCAAGCACTTG
    CCGGATGAGATTAAACTGGGTCTGGAGATGCCGTGGTTGGCCACCCT
    GCCGAGAGTTGAGGCGAAATACTATATTCAGTATTACGCGGGTAGCG
    GTGATGTTTGGATTGGCAAGACCCTGTACCGCATGCCGGAGATCAGC
    AATGATACCTATCATGACCTGGCCAAGACCGACTTCAAACGCTGTCA
    AGCGAAACATCAATTTGAATGGTTATACATGCAAGAGTGGTACGAAA
    GCTGCGGCATCGAAGAGTTCGGTATCTCCCGTAAAGATCTGCTGCTG
    TCTTACTTTCTGGCAACGGCCAGCATTTTCGAGCTGGAGCGTACCAA
    TGAGCGTATTGCCTGGGCGAAATCACAAATCATTGCTAAGATGATTA
    CGAGCTTTTTCAATAAAGAAACCACGTCCGAGGAAGATAAACGTGCT
    CTGCTGAATGAACTGGGCAACATCAACGGTCTGAATGACACCAACGG
    TGCCGGTCGTGAGGGTGGCGCAGGCAGCATTGCACTGGCCACGCTGA
    CCCAGTTCCTGGAAGGTTTCGACCGCTACACCCGTCACCAGCTGAAG
    AACGCGTGGTCCGTCTGGCTGACCCAGCTGCAGCATGGTGAGGCAGA
    CGACGCGGAGCTGCTGACCAACACGTTGAATATCTGCGCTGGCCATA
    TCGCGTTTCGCGAAGAGATTCTGGCGCACAACGAGTACAAAGCCCTG
    AGCAATCTGACCTCTAAAATCTGTCGTCAGCTTAGCTTTATTCAGAG
    CGAGAAAGAAATGGGCGTGGAAGGTGAGATCGCGGCAAAATCCAGCA
    TCAAGAACAAAGAACTGGAAGAAGATATGCAGATGTTGGTCAAGCTC
    GTCCTGGAGAAGTATGGTGGCATCGACCGTAATATCAAGAAAGCGTT
    TCTGGCCGTGGCGAAAACGTATTACTACCGCGCGTACCACGCGGCAG
    ATACCATTGACACCCACATGTTTAAGGTTTTGTTTGAGCCGGTTGCT
    TAA
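    The correspondence between a codon-optimized cDNA and the protein it encodes can be checked by in-silico translation. The minimal Python sketch below builds the standard genetic code table and translates the first 45 nucleotides and the last 15 nucleotides of SEQ ID NO: 3 as printed above. The function name `translate` is illustrative, and only these two short excerpts are used rather than the full sequence.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an upper-case DNA string in frame 1, stopping at a stop codon."""
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 45 and last 15 nucleotides of SEQ ID NO: 3 (SmCPS2opt), as printed.
SMCPS2OPT_START = "ATGGCAACTGTTGACGCACCTCAAGTCCATGATCACGATGGCACC"
SMCPS2OPT_END = "GAGCCGGTTGCTTAA"

start_peptide = translate(SMCPS2OPT_START)
end_peptide = translate(SMCPS2OPT_END)
print(start_peptide)
print(end_peptide)
```

    The two peptides can be compared with the first and last residues of SEQ ID NO: 2.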
    SEQ ID NO: 4
    Full-length sclareol synthase from Salvia
    sclarea
    MSLAFNVGVTPFSGQRVGSRKEKFPVQGFPVTTPNRSRLIVNCSLTT
    IDFMAKMKENFKREDDKFPTTTTLRSEDIPSNLCIIDTLQRLGVDQF
    FQYEINTILDNTFRLWQEKHKVIYGNVTTHAMAFRLLRVKGYEVSSE
    ELAPYGNQEAVSQQTNDLPMIIELYRAANERIYEEERSLEKILAWTT
    IFLNKQVQDNSIPDKKLHKLVEFYLRNYKGITIRLGARRNLELYDMT
    YYQALKSTNRFSNLCNEDFLVFAKQDFDIHEAQNQKGLQQLQRWYAD
    CRLDTLNFGRDVVIIANYLASLIIGDHAFDYVRLAFAKTSVLVTIMD
    DFFDCHGSSQECDKIIELVKEWKENPDAEYGSEELEILFMALYNTVN
    ELAERARVEQGRSVKEFLVKLWVEILSAFKIELDTWSNGTQQSFDEY
    ISSSWLSNGSRLTGLLTMQFVGVKLSDEMLMSEECTDLARHVCMVGR
    LLNDVCSSEREREENIAGKSYSILLATEKDGRKVSEDEAIAEINEMV
    EYHWRKVLQIVYKKESILPRRCKDVFLEMAKGTFYAYGINDELTSPQ
    QSKEDMKSFVF
    SEQ ID NO: 5
    Truncated sclareol synthase from Salvia sclarea
    (SsScS)
    MAKMKENFKREDDKFPTTTTLRSEDIPSNLCIIDTLQRLGVDQFFQY
    EINTILDNTFRLWQEKHKVIYGNVTTHAMAFRLLRVKGYEVSSEELA
    PYGNQEAVSQQTNDLPMIIELYRAANERIYEEERSLEKILAWTTIFL
    NKQVQDNSIPDKKLHKLVEFYLRNYKGITIRLGARRNLELYDMTYYQ
    ALKSTNRFSNLCNEDFLVFAKQDFDIHEAQNQKGLQQLQRWYADCRL
    DTLNFGRDVVIIANYLASLIIGDHAFDYVRLAFAKTSVLVTIMDDFF
    DCHGSSQECDKIIELVKEWKENPDAEYGSEELEILFMALYNTVNELA
    ERARVEQGRSVKEFLVKLWVEILSAFKIELDTWSNGTQQSFDEYISS
    SWLSNGSRLTGLLTMQFVGVKLSDEMLMSEECTDLARHVCMVGRLLN
    DVCSSEREREENIAGKSYSILLATEKDGRKVSEDEAIAEINEMVEYH
    WRKVLQIVYKKESILPRRCKDVFLEMAKGTFYAYGINDELTSPQQSK
    EDMKSFVF
    SEQ ID NO: 6
    1132-2-5_opt, optimized cDNA for E. coli
    expression encoding the truncated sclareol
    synthase from Salvia sclarea
    ATGGCGAAAATGAAGGAGAACTTTAAACGCGAGGACGATAAATTCCC
    GACGACCACGACCCTGCGCAGCGAGGATATCCCGAGCAACCTGTGCA
    TCATTGATACCCTGCAGCGCCTGGGTGTCGATCAGTTCTTCCAATAC
    GAAATCAATACCATTCTGGACAATACTTTTCGTCTGTGGCAAGAGAA
    ACACAAAGTGATCTACGGCAACGTTACCACCCACGCGATGGCGTTCC
    GTTTGTTGCGTGTCAAGGGCTACGAGGTTTCCAGCGAGGAACTGGCG
    CCGTACGGTAATCAGGAAGCAGTTAGCCAACAGACGAATGATCTGCC
    TATGATCATTGAGCTGTATCGCGCAGCAAATGAGCGTATCTACGAAG
    AGGAACGCAGCCTGGAAAAGATCCTGGCGTGGACCACGATCTTCCTG
    AACAAACAAGTTCAAGACAATTCTATTCCTGATAAGAAGCTGCATAA
    ACTGGTCGAATTCTATCTGCGTAATTACAAGGGCATCACGATCCGTC
    TGGGCGCACGCCGTAACCTGGAGTTGTATGATATGACGTATTACCAG
    GCTCTGAAAAGCACCAATCGTTTCTCCAATCTGTGTAATGAGGATTT
    TCTGGTGTTCGCCAAGCAGGATTTTGACATCCACGAGGCGCAAAATC
    AAAAAGGTCTGCAACAACTGCAACGTTGGTACGCTGACTGTCGCCTG
    GACACCCTGAATTTCGGTCGCGACGTTGTCATTATTGCAAACTATCT
    GGCCAGCCTGATCATCGGTGATCACGCATTCGACTACGTCCGCCTGG
    CCTTCGCTAAGACCAGCGTTCTGGTGACCATTATGGATGATTTCTTC
    GATTGCCACGGTTCTAGCCAGGAATGCGACAAAATCATTGAGCTGGT
    GAAAGAGTGGAAAGAAAACCCTGATGCGGAATACGGTTCCGAAGAGT
    TGGAGATCCTGTTTATGGCCTTGTACAACACCGTGAATGAACTGGCC
    GAGCGTGCTCGTGTGGAGCAGGGCCGTTCTGTGAAGGAGTTTTTGGT
    CAAGTTGTGGGTGGAAATCCTGTCCGCGTTCAAGATCGAACTGGATA
    CGTGGTCGAATGGTACGCAACAGAGCTTCGACGAATACATCAGCAGC
    AGCTGGCTGAGCAATGGCAGCCGTCTGACCGGTTTGCTGACCATGCA
    ATTTGTGGGTGTTAAACTGTCCGATGAAATGCTGATGAGCGAAGAAT
    GCACCGACCTGGCACGCCATGTGTGTATGGTGGGTCGCCTGCTGAAC
    GACGTCTGCAGCAGCGAACGTGAGCGCGAGGAAAACATTGCAGGCAA
    GAGCTACAGCATCTTGTTGGCCACCGAGAAAGATGGTCGCAAAGTGT
    CTGAGGACGAAGCAATTGCAGAGATTAATGAAATGGTCGAGTACCAC
    TGGCGTAAGGTTTTGCAGATTGTGTATAAGAAAGAGAGCATCTTGCC
    GCGTCGCTGTAAGGATGTTTTCTTGGAGATGGCGAAGGGCACGTTCT
    ATGCGTACGGCATTAACGACGAGCTGACGAGCCCGCAACAATCGAAA
    GAGGACATGAAGAGCTTCGTGTTCTGAGGTAC
    SEQ ID NO: 7
    GGPP synthase from Pantoea agglomerans
    MVSGSKAGVSPHREIEVMRQSIDDHLAGLLPETDSQDIVSLAMREGV
    MAPGKRIRPLLMLLAARDLRYQGSMPTLLDLACAVELTHTASLMLDD
    MPCMDNAELRRGQPTTHKKFGESVAILASVGLLSKAFGLIAATGDLP
    GERRAQAVNELSTAVGVQGLVLGQFRDLNDAALDRTPDAILSTNHLK
    TGILFSAMLQIVAIASASSPSTRETLHAFALDFGQAFQLLDDLRDDH
    PETGKDRNKDAGKSTLVNRLGADAARQKLREHIDSADKHLTFACPQG
    GAIRQFMHLWFGHHLADWSPVMKIA
    SEQ ID NO: 8
    CrtEopt, optimized cDNA encoding the GGPP
    synthase from Pantoea agglomerans.
    ATGGTTTCTGGTTCGAAAGCAGGAGTATCACCTCATAGGGAAATCGA
    AGTCATGAGACAGTCCATTGATGACCACTTAGCAGGATTGTTGCCAG
    AAACAGATTCCCAGGATATCGTTAGCCTTGCTATGAGAGAAGGTGTT
    ATGGCACCTGGTAAACGTATCAGACCTTTGCTGATGTTACTTGCTGC
    AAGAGACCTGAGATATCAGGGTTCTATGCCTACACTACTGGATCTAG
    CTTGTGCTGTTGAACTGACACATACTGCTTCCTTGATGCTGGATGAC
    ATGCCTTGTATGGACAATGCGGAACTTAGAAGAGGTCAACCAACAAC
    CCACAAGAAATTCGGAGAATCTGTTGCCATTTTGGCTTCTGTAGGTC
    TGTTGTCGAAAGCTTTTGGCTTGATTGCTGCAACTGGTGATCTTCCA
    GGTGAAAGGAGAGCACAAGCTGTAAACGAGCTATCTACTGCAGTTGG
    TGTTCAAGGTCTAGTCTTAGGACAGTTCAGAGATTTGAATGACGCAG
    CTTTGGACAGAACTCCTGATGCTATCCTGTCTACGAACCATCTGAAG
    ACTGGCATCTTGTTCTCAGCTATGTTGCAAATCGTAGCCATTGCTTC
    TGCTTCTTCACCATCTACTAGGGAAACGTTACACGCATTCGCATTGG
    ACTTTGGTCAAGCCTTTCAACTGCTAGACGATTTGAGGGATGATCAT
    CCAGAGACAGGTAAAGACCGTAACAAAGACGCTGGTAAAAGCACTCT
    AGTCAACAGATTGGGTGCTGATGCAGCTAGACAGAAACTGAGAGAGC
    ACATTGACTCTGCTGACAAACACCTGACATTTGCATGTCCACAAGGA
    GGTGCTATAAGGCAGTTTATGCACCTATGGTTTGGACACCATCTTGC
    TGATTGGTCTCCAGTGATGAAGATCGCCTAA
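    The relationship between SEQ ID NO: 7 and SEQ ID NO: 8 can be spot-checked by translating the cDNA with the standard genetic code. The following is a minimal sketch, not part of the original disclosure: the function name translate and the way the codon table is built are illustrative assumptions, and only the first line of SEQ ID NO: 8 is compared against the first 15 residues of SEQ ID NO: 7.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third base varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon.

    Whitespace is ignored and a trailing partial codon is dropped.
    """
    dna = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First line of SEQ ID NO: 8 and first 15 residues of SEQ ID NO: 7,
# both copied from the listing above.
crte_opt_start = "ATGGTTTCTGGTTCGAAAGCAGGAGTATCACCTCATAGGGAAATCGA"
ggpps_start = "MVSGSKAGVSPHREI"

print(translate(crte_opt_start) == ggpps_start)
```

    The same function can be applied to any of the other optimized cDNA sequences in this listing, provided the full sequence is concatenated first so that the reading frame is preserved across line breaks.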
    SEQ ID NO: 9
    Forward primer SmCPS2-1132Inf_F1
    CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGC
    GAAAATGAAGGAGAACTTTAAACG
    SEQ ID NO: 10
    Reverse primer 1132-pET_Inf_R1
    GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCA
    TGTCCTCT
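    As a rough check of how the two primers above relate to SEQ ID NO: 6, the sketch below finds the longest 3' stretch of each primer that matches the template (the forward primer against the sense strand, the reverse primer against its reverse complement). The helper names reverse_complement and longest_3prime_match are hypothetical and introduced only for illustration; the 5' portions of the primers are not examined here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def longest_3prime_match(primer: str, template: str) -> str:
    """Return the longest 3' suffix of `primer` that occurs in `template`."""
    for k in range(len(primer), 0, -1):
        if primer[-k:] in template:
            return primer[-k:]
    return ""


# First line and last two lines of SEQ ID NO: 6, copied from the listing.
seq6_head = "ATGGCGAAAATGAAGGAGAACTTTAAACGCGAGGACGATAAATTCCC"
seq6_tail = (
    "ATGCGTACGGCATTAACGACGAGCTGACGAGCCCGCAACAATCGAAA"
    "GAGGACATGAAGAGCTTCGTGTTCTGAGGTAC"
)

# SEQ ID NO: 9 and SEQ ID NO: 10, copied from the listing.
forward_primer = (
    "CTGTTTGAGCCGGTCGCCTAAGGTACCAGAAGGAGATAAATAATGGC"
    "GAAAATGAAGGAGAACTTTAAACG"
)
reverse_primer = (
    "GCAGCGGTTTCTTTACCAGACTCGAGGTCAGAACACGAAGCTCTTCA"
    "TGTCCTCT"
)

forward_match = longest_3prime_match(forward_primer, seq6_head)
reverse_match = longest_3prime_match(reverse_primer, reverse_complement(seq6_tail))

print(len(forward_match), forward_match)
print(len(reverse_match), reverse_complement(reverse_match))
```

    Only the first and last lines of SEQ ID NO: 6 are used as template here, which is enough to locate the 3' ends of both primers; matching against the full sequence would work the same way.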
    SEQ ID NO: 11
    CfCPS1, full-length copalyl diphosphate
    synthase from Coleus forskohlii
    MGSLSTMNLNHSPMSYSGILPSSSAKAKLLLPGCFSISAWMNNGKNL
    NCQLTHKKISKVAEIRVATVNAPPVHDQDDSTENQCHDAVNNIEDPI
    EYIRTLLRTTGDGRISVSPYDTAWVALIKDLQGRDAPEFPSSLEWII
    QNQLADGSWGDAKFFCVYDRLVNTIACVVALRSWDVHAEKVERGVRY
    INENVEKLRDGNEEHMTCGFEVVFPALLQRAKSLGIQDLPYDAPVIQ
    EIYHSREQKSKRIPLEMMHKVPTSLLFSLEGLENLEWDKLLKLQSAD
    GSFLTSPSSTAFAFMQTRDPKCYQFIKNTIQTFNGGAPHTYPVDVFG
    RLWAIDRLQRLGISRFFESEIADCIAHIHRFWTEKGVFSGRESEFCD
    IDDTSMGVRLMRMHGYDVDPNVLKNFKKDDKFSCYGGQMIESPSPIY
    NLYRASQLRFPGEQILEDANKFAYDFLQEKLAHNQILDKWVISKHLP
    DEIKLGLEMPWYATLPRVEARYYIQYYAGSGDVWIGKTLYRMPEISN
    DTYHELAKTDFKRCQAQHQFEWIYMQEWYESCNMEEFGISRKELLV
    AYFLATASIFELERANERIAWAKSQIISTIIASFFNNQNTSPEDKLA
    FLTDFKNGNSTNMALVTLTQFLEGFDRYTSHQLKNAWSVWLRKLQQG
    EGNGGADAELLVNTLNICAGHIAFREEILAHNDYKTLSNLTSKICRQ
    LSQIQNEKELETEGQKTSIKNKELEEDMQRLVKLVLEKSRVGINRDM
    KKTFLAVVKTYYYKAYHSAQAIDNHMFKVLFEPVA
    SEQ ID NO: 12
    CfCPS1-del63, truncated copalyl diphosphate
    synthase from Coleus forskohlii
    MVATVNAPPVHDQDDSTENQCHDAVNNIEDPIEYIRTLLRTTGDGRI
    SVSPYDTAWVALIKDLQGRDAPEFPSSLEWIIQNQLADGSWGDAKFF
    CVYDRLVNTIACVVALRSWDVHAEKVERGVRYINENVEKLRDGNEEH
    MTCGFEVVFPALLQRAKSLGIQDLPYDAPVIQEIYHSREQKSKRIPL
    EMMHKVPTSLLFSLEGLENLEWDKLLKLQSADGSFLTSPSSTAFAFM
    QTRDPKCYQFIKNTIQTFNGGAPHTYPVDVFGRLWAIDRLQRLGISR
    FFESEIADCIAHIHRFWTEKGVFSGRESEFCDIDDTSMGVRLMRMHG
    YDVDPNVLKNFKKDDKFSCYGGQMIESPSPIYNLYRASQLRFPGEQI
    LEDANKFAYDFLQEKLAHNQILDKWVISKHLPDEIKLGLEMPWYATL
    PRVEARYYIQYYAGSGDVWIGKTLYRMPEISNDTYHELAKTDFKRCQ
    AQHQFEWIYMQEWYESCNMEEFGISRKELLVAYFLATASIFELERAN
    ERIAWAKSQIISTIIASFFNNQNTSPEDKLAFLTDFKNGNSTNMALV
    TLTQFLEGFDRYTSHQLKNAWSVWLRKLQQGEGNGGADAELLVNTLN
    ICAGHIAFREEILAHNDYKTLSNLTSKICRQLSQIQNEKELETEGQK
    TSIKNKELEEDMQRLVKLVLEKSRVGINRDMKKTFLAVVKTYYYKAY
    HSAQAIDNHMFKVLFEPVA
    SEQ ID NO: 13
    Optimized cDNA for E. coli expression
    encoding CfCPS1-del63
    ATGGTCGCTACTGTCAATGCTCCACCGGTCCACGATCAAGACGACAG
    CACTGAGAATCAATGTCATGATGCCGTAAACAATATTGAAGATCCAA
    TCGAGTATATCCGTACCCTGTTGCGCACGACGGGTGATGGTCGTATC
    AGCGTCAGCCCGTACGATACCGCGTGGGTGGCGCTGATCAAAGATCT
    GCAGGGCCGTGACGCACCGGAGTTTCCGTCCTCTCTTGAGTGGATCA
    TTCAAAACCAGCTGGCCGACGGTTCTTGGGGCGACGCCAAATTTTTC
    TGCGTGTATGACCGTCTGGTGAACACCATCGCGTGCGTCGTTGCGCT
    GCGTTCCTGGGACGTCCACGCGGAAAAAGTTGAGCGTGGCGTGCGCT
    ATATCAACGAAAATGTCGAAAAGCTGCGCGACGGTAATGAAGAACAC
    ATGACCTGTGGCTTTGAAGTTGTTTTCCCGGCGCTCCTGCAGCGCGC
    GAAGTCTCTGGGTATTCAAGATCTGCCGTACGATGCTCCGGTGATCC
    AAGAGATTTATCACTCTCGTGAGCAGAAGTCCAAGCGTATCCCGTTG
    GAGATGATGCACAAAGTTCCGACGAGCCTGCTGTTCAGCTTGGAAGG
    CCTGGAAAATCTGGAGTGGGACAAACTGCTGAAGCTGCAGAGCGCGG
    ACGGTAGCTTCCTGACGAGCCCGAGCAGCACCGCATTTGCATTTATG
    CAGACCCGTGACCCGAAGTGTTACCAATTTATTAAGAACACGATTCA
    GACGTTTAACGGTGGTGCACCGCATACCTATCCGGTAGACGTCTTTG
    GTCGCCTGTGGGCAATTGATCGTCTGCAGCGTTTGGGTATCAGCCGC
    TTCTTCGAAAGCGAAATTGCAGATTGTATCGCACACATCCATCGTTT
    TTGGACCGAGAAAGGCGTCTTTAGCGGCCGTGAGTCTGAGTTCTGTG
    ACATCGATGACACGAGCATGGGTGTCCGTCTGATGCGTATGCATGGC
    TATGATGTTGACCCGAACGTGCTGAAGAATTTTAAAAAAGATGACAA
    GTTTAGCTGCTACGGCGGTCAGATGATTGAGAGCCCGAGCCCGATTT
    ATAATCTGTACCGCGCGAGCCAACTGCGTTTCCCGGGTGAACAGATT
    CTGGAAGATGCCAATAAATTCGCGTATGATTTCCTGCAGGAAAAACT
    GGCGCACAATCAGATCCTGGATAAATGGGTTATCAGCAAGCATCTGC
    CTGACGAAATCAAATTGGGCCTGGAGATGCCGTGGTATGCGACCTTG
    CCGCGTGTCGAAGCGCGTTACTACATCCAGTACTATGCGGGTAGCGG
    CGATGTCTGGATTGGTAAGACGCTGTACCGTATGCCAGAGATTAGCA
    ACGACACCTACCATGAATTGGCAAAGACCGATTTCAAGCGTTGCCAA
    GCCCAACACCAGTTCGAGTGGATTTACATGCAAGAGTGGTACGAGTC
    GTGCAACATGGAAGAGTTCGGTATTAGCCGCAAAGAACTGCTGGTTG
    CATATTTCCTGGCCACGGCGAGCATCTTTGAGCTGGAGCGTGCGAAT
    GAACGCATTGCATGGGCAAAAAGCCAAATCATTTCTACCATTATCGC
    TTCGTTCTTTAATAACCAAAATACGAGCCCTGAGGATAAACTGGCGT
    TTCTGACTGATTTCAAAAATGGCAACAGCACCAACATGGCTCTGGTG
    ACCCTGACCCAGTTCCTGGAAGGCTTTGACCGCTACACTTCCCATCA
    ACTGAAAAACGCGTGGAGCGTTTGGCTGCGTAAGCTGCAACAGGGTG
    AGGGTAATGGCGGTGCCGACGCCGAGTTACTGGTGAATACGCTGAAC
    ATTTGCGCGGGTCACATCGCGTTCCGTGAAGAAATTCTGGCACATAA
    TGACTATAAAACGTTGTCGAACCTGACCAGCAAGATTTGTCGCCAGC
    TGAGCCAGATTCAGAATGAAAAAGAATTGGAAACCGAAGGCCAAAAG
    ACTTCCATTAAGAACAAAGAACTGGAAGAAGATATGCAGCGCCTGGT
    TAAACTGGTTTTGGAGAAAAGCCGTGTGGGTATCAATCGTGACATGA
    AGAAAACGTTCCTGGCTGTGGTGAAAACCTACTATTACAAAGCATAC
    CACTCCGCGCAGGCAATCGATAACCACATGTTCAAGGTTCTGTTCGA
    ACCGGTGGCCTAA
    SEQ ID NO: 14
    TaTps1, full-length copalyl diphosphate
    synthase from Triticum aestivum.
    MLTFTAALRHVPVLDQPTSEPWRRLSLHLHSQRRPCGLVLISKSPSY
    PEVDVGEWKVDEYRQRTDEPSETRQMIDDIRTALASLGDDETSMSVS
    AYDTALVALVKNLDGGDGPQFPSCIDWIVQNQLPDGSWGDPAFFMVQ
    DRMISTLACVVAVKSWNIDRDNLCDRGVLFIKENMSRLVEEEQDWMPC
    GFEINFPALLEKAKDLDLDIPYDHPVLEEIYAKRNLKLLKIPLDVLH
    AIPTTLLFSVEGMVDLPLDWEKLLRLRCPDGSFHSSPAATAAALSHT
    GDKECHAFLDRLIQKFEGGVPCSHSMDTFEQLWVVDRLMRLGISRHF
    TSEIQQCLEFIYRRWTQKGLAHNMHCPIPDIDDTAMGFRLLRQHGYD
    VTPSVFKHFEKDGKFVCFPMETNHASVTPMHNTYRASQFMFPGDDDV
    LARAGRYCRAFLQERQSSNKLYDKWIITKDLPGEVGYTLNFPWKSSL
    PRIETRMYLDQYGGNNDVWIAKVLYRMNLVSNDLYLKMAKADFTEYQ
    RLSRIEWNGLRKWYFRNHLQRYGATPKSALKAYFLASANIFEPGRAA
    ERLAWARMAVLAEAVTTHFRHIGGPCYSTENLEELIDLVSFDDVSGG
    LREAWKQWLMAWTAKESHGSVDGDTALLFVRTIEICSGRIVSSEQKL
    NLWDYSQLEQLTSSICHKLATIGLSQNEASMENTEDLHQQVDLEMQE
    LSWRVHQGCHGINRETRQTFLNVVKSFYYSAHCSPETVDSHIAKVIF
    QDVI
    SEQ ID NO: 15
    TaTps1-del59, truncated copalyl diphosphate
    synthase from Triticum aestivum.
    MYRQRTDEPSETRQMIDDIRTALASLGDDETSMSVSAYDTALVALVK
    NLDGGDGPQFPSCIDWIVQNQLPDGSWGDPAFFMVQDRMISTLACVV
    AVKSWNIDRDNLCDRGVLFIKENMSRLVEEEQDWMPCGFEINFPALL
    EKAKDLDLDIPYDHPVLEEIYAKRNLKLLKIPLDVLHAIPTTLLFSV
    EGMVDLPLDWEKLLRLRCPDGSFHSSPAATAAALSHTGDKECHAFLD
    RLIQKFEGGVPCSHSMDTFEQLWVVDRLMRLGISRHFTSEIQQCLEF
    IYRRWTQKGLAHNMHCPIPDIDDTAMGFRLLRQHGYDVTPSVFKHFE
    KDGKFVCFPMETNHASVTPMHNTYRASQFMFPGDDDVLARAGRYCRA
    FLQERQSSNKLYDKWIITKDLPGEVGYTLNFPWKSSLPRIETRMYLD
    QYGGNNDVWIAKVLYRMNLVSNDLYLKMAKADFTEYQRLSRIEWNGL
    RKWYFRNHLQRYGATPKSALKAYFLASANIFEPGRAAERLAWARMAV
    LAEAVTTHFRHIGGPCYSTENLEELIDLVSFDDVSGGLREAWKQWLM
    AWTAKESHGSVDGDTALLFVRTIEICSGRIVSSEQKLNLWDYSQLEQ
    LTSSICHKLATIGLSQNEASMENTEDLHQQVDLEMQELSWRVHQGCH
    GINRETRQTFLNVVKSFYYSAHCSPETVDSHIAKVIFQDVI
    SEQ ID NO: 16
    Optimized cDNA for E. coli expression encoding
    TaTps1-del59
    ATGTATCGCCAAAGAACTGATGAGCCAAGCGAAACCCGCCAGATGAT
    CGATGATATTCGCACCGCTTTGGCTAGCCTGGGTGACGATGAAACCA
    GCATGAGCGTGAGCGCATACGACACCGCCCTGGTTGCCCTGGTGAAG
    AACCTGGACGGTGGCGATGGCCCGCAGTTCCCGAGCTGCATTGACTG
    GATTGTTCAGAACCAGCTGCCGGACGGTAGCTGGGGCGACCCGGCTT
    TCTTTATGGTTCAGGACCGTATGATCAGCACCCTGGCCTGTGTCGTG
    GCCGTGAAATCCTGGAATATCGATCGTGACAACTTGTGCGATCGTGG
    TGTCCTGTTTATCAAAGAAAACATGTCGCGTCTGGTTGAAGAAGAAC
    AAGATTGGATGCCATGTGGCTTCGAGATTAACTTTCCTGCACTGTTG
    GAGAAAGCTAAAGACCTGGACTTGGACATTCCGTACGATCATCCTGT
    GCTGGAAGAGATTTACGCGAAGCGTAATCTGAAACTGCTGAAGATTC
    CGTTAGATGTCCTCCATGCGATCCCGACGACGCTGTTGTTTTCCGTT
    GAGGGTATGGTCGATCTGCCGCTGGATTGGGAGAAACTGCTGCGTCT
    GCGTTGCCCGGACGGTTCTTTTCATTCTAGCCCGGCGGCGACGGCAG
    CGGCGCTGAGCCACACGGGTGACAAAGAGTGTCACGCCTTCCTGGAC
    CGCCTGATTCAAAAGTTCGAGGGTGGCGTCCCGTGCTCCCACAGCAT
    GGACACCTTCGAGCAACTGTGGGTTGTTGACCGTTTGATGCGTCTGG
    GTATCAGCCGTCATTTTACGAGCGAGATCCAGCAGTGCTTGGAGTTC
    ATCTATCGTCGTTGGACCCAGAAAGGTCTGGCGCACAATATGCACTG
    CCCGATCCCGGACATTGATGACACTGCGATGGGTTTTCGTCTGTTGA
    GACAGCACGGTTACGACGTGACCCCGTCGGTTTTCAAGCATTTCGAG
    AAAGACGGCAAGTTCGTATGCTTCCCGATGGAAACCAACCATGCGAG
    CGTGACGCCGATGCACAATACCTACCGTGCGAGCCAGTTCATGTTCC
    CGGGTGATGACGACGTGCTGGCCCGTGCCGGCCGCTACTGTCGCGCA
    TTCTTGCAAGAGCGTCAGAGCTCTAACAAGTTGTACGATAAGTGGAT
    TATCACGAAAGATCTGCCGGGTGAGGTTGGCTACACGCTGAACTTTC
    CGTGGAAAAGCTCCCTGCCGCGTATTGAAACTCGTATGTATCTGGAT
    CAGTACGGTGGCAATAACGATGTCTGGATTGCAAAGGTCCTGTATCG
    CATGAACCTGGTTAGCAATGACCTGTACCTGAAAATGGCGAAAGCCG
    ACTTTACCGAGTATCAACGTCTGTCTCGCATTGAGTGGAACGGCCTG
    CGCAAATGGTATTTTCGCAATCATCTGCAGCGTTACGGTGCGACCCC
    GAAGTCCGCGCTGAAAGCGTATTTCCTGGCGTCGGCAAACATCTTTG
    AGCCTGGCCGCGCAGCCGAGCGCCTGGCATGGGCACGTATGGCCGTG
    CTGGCTGAAGCTGTAACGACTCATTTCCGTCACATTGGCGGCCCGTG
    CTACAGCACCGAGAATCTGGAAGAACTGATCGACCTTGTTAGCTTCG
    ACGACGTGAGCGGCGGCTTGCGTGAGGCGTGGAAGCAATGGCTGATG
    GCGTGGACCGCAAAAGAATCACACGGCAGCGTGGACGGTGACACGGC
    ACTGCTGTTTGTCCGCACGATTGAGATTTGCAGCGGCCGCATCGTTT
    CCAGCGAGCAGAAACTGAATCTGTGGGATTACAGCCAGTTAGAGCAA
    TTGACCAGCAGCATCTGTCATAAACTGGCCACCATCGGTCTGAGCCA
    GAACGAAGCTAGCATGGAAAATACCGAAGATCTGCACCAACAAGTCG
    ATTTGGAAATGCAAGAACTGTCATGGCGTGTTCACCAGGGTTGTCAC
    GGTATTAATCGCGAAACCCGTCAAACCTTCCTGAATGTTGTTAAGTC
    TTTTTATTACTCCGCACACTGCAGCCCGGAAACCGTGGACAGCCATA
    TTGCAAAAGTGATCTTTCAAGACGTTATCTGA
    SEQ ID NO: 17
    MvCps3, full-length copalyl diphosphate
    synthase from Marrubium vulgare.
    MGSLSTLNLIKTCVTLASSEKLNQPSQCYTISTCMKSSNNPPFNYYQ
    INGRKKMSTAIDSSVNAPPEQKYNSTALEHDTEIIEIEDHIECIRRL
    LRTAGDGRISVSPYDTAWIALIKDLDGHDSPQFPSSMEWVADNQLPD
    GSWGDEHFVCVYDRLVNTIACVVALRSWNVHAHKCEKGIKYIKENVH
    KLEDANEEHMTCGFEVVFPALLQRAQSMGIKGIPYNAPVIEEIYNSR
    EKKLKRIPMEVVHKVATSLLFSLEGLENLEWEKLLKLQSPDGSFLTS
    PSSTAFAFIHTKDRKCFNFINNIVHTFKGGAPHTYPVDIFGRLWAVD
    RLQRLGISRFFESEIAEFLSHVHRFWSDEAGVFSGRESVFCDIDDTS
    MGLRLLRMHGYHVDPNVLKNFKQSDKFSCYGGQMMECSSPIYNLYR
    ASQLQFPGEEILEEANKFAYKFLQEKLESNQILDKWLISNHLSDEIKV
    GLEMPWYATLPRVETSYYIHHYGGGDDVWIGKTLYRMPEISNDTYRE
    LARLDFRRCQAQHQLEWIYMQRWYESCRMQEFGISRKEVLRAYFLAS
    GTIFEVERAKERVAWARSQIISHMIKSFFNKETTSSDQKQALLTELL
    FGNISASETEKRELDGVVVATLRQFLEGFDIGTRHQVKAAWDVWLRKV
    EQGEAHGGADAELCTTTLNTCANQHLSSHPDYNTLSKLTNKICHKLS
    QIQHQKEMKGGIKAKCSINNKEVDIEMQWLVKLVLEKSGLNRKAKQA
    FLSIAKTYYYRAYYADQTMDAHIFKVLFEPVV
    SEQ ID NO: 18
    MvCps3-del63, truncated copalyl diphosphate
    synthase from Marrubium vulgare
    MAPPEQKYNSTALEHDTEIIEIEDHIECIRRLLRTAGDGRISVSPYD
    TAWIALIKDLDGHDSPQFPSSMEWVADNQLPDGSWGDEHFVCVYDRL
    VNTIACVVALRSWNVHAHKCEKGIKYIKENVHKLEDANEEHMTCGFEV
    VFPALLQRAQSMGIKGIPYNAPVIEEIYNSREKKLKRIPMEVVHKVA
    TSLLFSLEGLENLEWEKLLKLQSPDGSFLTSPSSTAFAFIHTKDRKC
    FNFINNIVHTFKGGAPHTYPVDIFGRLWAVDRLQRLGISRFFESEIA
    EFLSHVHRFWSDEAGVFSGRESVFCDIDDTSMGLRLLRMHGYHVDPN
    VLKNFKQSDKFSCYGGQMMECSSPIYNLYRASQLQFPGEEILEEANK
    FAYKFLQEKLESNQILDKWLISNHLSDEIKVGLEMPWYATLPRVETS
    YYIHHYGGGDDVWIGKTLYRMPEISNDTYRELARLDFRRCQAQHQLE
    WIYMQRWYESCRMQEFGISRKEVLRAYFLASGTIFEVERAKERVAWA
    RSQIISHMIKSFFNKETTSSDQKQALLTELLFGNISASETEKRELDG
    VVVATLRQFLEGFDIGTRHQVKAAWDVWLRKVEQGEAHGGADAELCT
    TTLNTCANQHLSSHPDYNTLSKLTNKICHKLSQIQHQKEMKGGIKAK
    CSINNKEVDIEMQWLVKLVLEKSGLNRKAKQAFLSIAKTYYYRAYYA
    DQTMDAHIFKVLFEPVV
    SEQ ID NO: 19
    Optimized cDNA for E. coli expression encoding
    MvCps3-del63
    ATGGCCCCGCCGGAACAAAAGTACAACAGCACTGCATTAGAACACGA
    CACCGAGATTATTGAGATCGAGGACCACATCGAGTGTATCCGCCGTC
    TGCTGCGTACCGCGGGTGATGGTCGTATTAGCGTGAGCCCGTATGAT
    ACCGCGTGGATTGCACTGATTAAAGATTTGGATGGCCACGACTCCCC
    GCAATTCCCGTCGAGCATGGAATGGGTTGCTGATAATCAGCTGCCGG
    ACGGTAGCTGGGGTGACGAGCACTTCGTTTGCGTTTACGATCGCCTG
    GTTAATACCATCGCATGCGTCGTGGCGCTGCGCAGCTGGAATGTCCA
    TGCACATAAGTGCGAGAAAGGTATTAAGTACATTAAAGAAAATGTCC
    ACAAACTGGAAGATGCGAACGAAGAACACATGACTTGCGGCTTCGAA
    GTCGTTTTTCCGGCCTTGCTGCAGCGTGCACAGAGCATGGGTATTAA
    GGGCATCCCGTACAACGCGCCTGTCATTGAAGAAATTTACAATTCCC
    GTGAGAAAAAGCTGAAACGTATTCCGATGGAAGTTGTCCACAAAGTC
    GCGACCAGCCTGCTGTTCTCCCTGGAAGGTCTGGAGAACCTGGAGTG
    GGAGAAATTGCTGAAACTGCAGAGCCCGGACGGTTCGTTTCTGACCA
    GCCCGAGCTCTACGGCATTCGCGTTTATCCATACCAAAGACCGTAAA
    TGTTTTAACTTTATTAACAATATCGTTCATACCTTTAAGGGTGGTGC
    ACCGCACACGTACCCTGTGGACATCTTTGGCCGCCTGTGGGCAGTGG
    ATCGCTTGCAGCGTCTGGGTATTAGCCGCTTCTTCGAGAGCGAGATC
    GCGGAATTTCTGAGCCACGTGCACCGTTTTTGGAGCGACGAAGCGGG
    CGTTTTCAGCGGCCGTGAGAGCGTGTTCTGTGATATTGATGACACCA
    GCATGGGTCTGCGCCTGCTTCGTATGCATGGCTACCATGTAGACCCA
    AACGTTCTGAAGAACTTCAAGCAATCTGACAAGTTTAGCTGCTACGG
    TGGCCAGATGATGGAATGCAGCAGCCCAATTTACAATCTGTACCGTG
    CGAGCCAACTGCAATTTCCGGGTGAAGAAATCTTGGAAGAGGCTAAC
    AAATTCGCGTATAAGTTTTTGCAAGAGAAACTGGAGTCCAATCAGAT
    TCTGGACAAGTGGCTGATCTCCAACCACCTGAGCGACGAAATCAAAG
    TTGGCCTGGAAATGCCGTGGTATGCGACCTTGCCGCGCGTTGAGACT
    AGCTATTATATTCACCATTACGGCGGTGGCGACGATGTGTGGATTGG
    TAAAACGCTGTATCGCATGCCGGAAATTAGCAACGACACCTACCGTG
    AGCTGGCACGTCTGGACTTCCGCCGCTGCCAGGCGCAGCACCAGTTG
    GAATGGATCTATATGCAACGTTGGTATGAGAGCTGTCGTATGCAAGA
    ATTTGGTATTTCCCGCAAAGAAGTCCTGCGTGCCTACTTCCTGGCCT
    CTGGCACGATTTTCGAAGTTGAGCGCGCCAAAGAGCGCGTGGCGTGG
    GCTCGTAGCCAAATCATTTCCCACATGATCAAGAGCTTCTTCAATAA
    AGAAACCACGAGCAGCGATCAGAAACAAGCGCTGCTGACCGAGTTGC
    TGTTTGGTAACATCTCTGCAAGCGAGACTGAGAAACGTGAGCTGGAT
    GGTGTTGTGGTTGCGACCCTGCGTCAGTTCCTGGAAGGCTTCGATAT
    CGGCACCCGTCACCAAGTGAAGGCAGCGTGGGATGTGTGGCTGCGTA
    AAGTCGAACAGGGTGAGGCACATGGTGGCGCGGACGCCGAGTTGTGT
    ACGACGACGCTGAACACGTGCGCGAATCAGCATCTGTCTAGCCATCC
    GGACTACAATACCCTGTCGAAACTCACCAATAAGATTTGTCACAAGC
    TGTCCCAAATCCAGCATCAGAAAGAAATGAAGGGCGGTATTAAGGCA
    AAGTGCTCTATCAATAACAAAGAAGTGGATATCGAGATGCAATGGCT
    GGTCAAACTGGTCCTGGAGAAATCCGGTCTGAACCGCAAGGCTAAAC
    AAGCGTTTCTGAGCATTGCCAAAACCTATTATTATCGTGCTTACTAT
    GCCGACCAGACGATGGATGCCCACATCTTCAAGGTCCTGTTTGAACC
    GGTCGTGTAA
    SEQ ID NO: 20
    RoCPS1, full-length copalyl diphosphate
    synthase from Rosmarinus officinalis
    MTSMSSLNLSRAPAISRRLQLPAKVQLPEFYAVCSWLNNSSKHTPLS
    CHIHRKQLSKVTKCRVASLDASQVSEKGTSSPVQTPEEVNEKIENYI
    EYIKNLLTTSGDGRISVSPYDTSIVALIKDLKGRDTPQFPSCLEWIA
    QHQMADGSWGDEFFCIYDRILNTLACVVALKSWNVHADMIEKGVTYVN
    ENVQKLEDGNLEHMTSGFEIVVPALVQRAQDLGIQGLPYDHPLIKEI
    ANTKEGRLKKIPKDMIYQKPTTLLFSLEGLGDLEWEKILKLQSGDGS
    FLTSPSSTAHVFMKTKDEKCLKFIENAVKNCNGGAPHTYPVDVFARL
    WAVDRLQRLGISRFFQQEIKYFLDHINSVWTENGVFSGRDSEFCDID
    DTSMGIRLLKMHGYDIDPNALEHFKQQDGKFSCYGGQMIESASPIYN
    LYRAAQLRFPGEEILEEATKFAYNFLQEKIANDQFQEKWVISDHLID
    EVKLGLKMPWYATLPRVEAAYYLQYYAGCGDVWIGKVFYRMPEIS
    NDTYKKLAILDFNRCQAQHQFEWIYMQEWYHRSSVSEFGISKKDLLR
    AYFLAAATIFEPERTQERLVWAKTQIVSGMITSFVNSGTTLSLHQKT
    ALLSQIGHNFDGLDEIISAMKDHGLAATLLTTFQQLLDGFDRYTRHQ
    LKNAWSQWFMKLQQGEASGGEDAELLANTLNICAGLIAFNEDVLSHH
    EYTTLSTLTNKICKRLTQIQDKKTLEVVDGSIKDKELEKDIQMLVKL
    VLEENGGGVDRNIKHTFLSVFKTFYYNAYHDDETTDVHIFKVLFGPV
    V
    SEQ ID NO: 21
    RoCPS1-del67, truncated copalyl diphosphate
    synthase from Rosmarinus officinalis
    MASQVSEKGTSSPVQTPEEVNEKIENYIEYIKNLLTTSGDGRISVSP
    YDTSIVALIKDLKGRDTPQFPSCLEWIAQHQMADGSWGDEFFCIYDR
    ILNTLACVVALKSWNVHADMIEKGVTYVNENVQKLEDGNLEHMTSGF
    EIVVPALVQRAQDLGIQGLPYDHPLIKEIANTKEGRLKKIPKDMIYQ
    KPTTLLFSLEGLGDLEWEKILKLQSGDGSFLTSPSSTAHVFMKTKDE
    KCLKFIENAVKNCNGGAPHTYPVDVFARLWAVDRLQRLGISRFFQQE
    IKYFLDHINSVWTENGVFSGRDSEFCDIDDTSMGIRLLKMHGYDIDP
    NALEHFKQQDGKFSCYGGQMIESASPIYNLYRAAQLRFPGEEILEEA
    TKFAYNFLQEKIANDQFQEKWVISDHLIDEVKLGLKMPWYATLPRVE
    AAYYLQYYAGCGDVWIGKVFYRMPEISNDTYKKLAILDFNRCQAQHQ
    FEWIYMQEWYHRSSVSEFGISKKDLLRAYFLAAATIFEPERTQERL
    VWAKTQIVSGMITSFVNSGTTLSLHQKTALLSQIGHNFDGLDEIISA
    MKDHGLAATLLTTFQQLLDGFDRYTRHQLKNAWSQWFMKLQQGEASG
    GEDAELLANTLNICAGLIAFNEDVLSHHEYTTLSTLTNKICKRLTQI
    QDKKTLEVVDGSIKDKELEKDIQMLVKLVLEENGGGVDRNIKHTFLSV
    FKTFYYNAYHDDETTDVHIFKVLFGPVV
    SEQ ID NO: 22
    Optimized cDNA for E. coli expression encoding
    RoCPS1-del67
    ATGGCATCACAAGTTAGCGAGAAAGGCACCAGCTCCCCAGTTCAAAC
    GCCAGAGGAAGTGAACGAAAAGATCGAGAATTACATTGAGTATATTA
    AAAATCTGCTGACTACTTCGGGCGACGGCCGCATCAGCGTCAGCCCG
    TACGACACGAGCATCGTTGCCCTGATTAAAGACCTGAAGGGTCGTGA
    CACCCCGCAGTTTCCGTCCTGTCTGGAGTGGATTGCCCAACACCAAA
    TGGCCGATGGTTCCTGGGGTGATGAATTTTTCTGCATTTACGACCGC
    ATCCTGAATACGCTGGCTTGTGTTGTCGCCCTGAAGTCCTGGAATGT
    TCATGCAGACATGATCGAAAAGGGTGTCACTTACGTTAACGAAAACG
    TGCAGAAACTGGAAGATGGCAATCTGGAGCACATGACGAGCGGTTTC
    GAGATTGTTGTCCCGGCGCTGGTTCAGAGAGCGCAAGACCTGGGCAT
    CCAGGGCCTGCCGTATGATCATCCGTTGATCAAAGAAATCGCAAACA
    CCAAAGAGGGCCGCCTGAAGAAAATTCCTAAAGACATGATTTATCAG
    AAACCGACTACGCTGCTGTTCAGCCTGGAAGGCTTGGGCGACCTGGA
    GTGGGAAAAGATCCTGAAGTTACAGTCTGGTGATGGTTCTTTCCTGA
    CCAGCCCGAGCTCTACGGCCCATGTTTTCATGAAAACCAAAGATGAG
    AAGTGTCTGAAGTTTATTGAAAATGCCGTCAAGAATTGCAACGGTGG
    CGCGCCTCACACCTACCCGGTGGACGTTTTCGCTCGTCTGTGGGCCG
    TCGATCGTCTGCAACGCCTGGGCATCTCGCGTTTCTTCCAGCAAGAG
    ATTAAGTACTTCCTGGACCACATTAATAGCGTGTGGACCGAAAACGG
    CGTTTTCAGCGGTCGCGACAGCGAGTTTTGTGATATTGATGACACCT
    CTATGGGTATCCGTTTGCTGAAGATGCACGGTTACGACATTGACCCG
    AATGCCCTGGAGCACTTTAAACAACAGGATGGTAAGTTCTCCTGCTA
    CGGTGGTCAGATGATTGAGAGCGCGAGCCCGATCTACAACCTGTACC
    GTGCTGCGCAGCTGCGTTTTCCGGGTGAAGAGATTCTGGAAGAGGCC
    ACCAAATTTGCGTATAATTTTTTGCAAGAGAAAATTGCAAACGACCA
    ATTCCAGGAAAAATGGGTTATTAGCGATCACCTTATCGATGAAGTGA
    AACTGGGTTTGAAGATGCCGTGGTACGCGACGCTGCCACGTGTCGAG
    GCAGCGTATTATCTGCAGTATTATGCGGGCTGTGGTGATGTGTGGAT
    CGGCAAAGTGTTCTACCGTATGCCGGAAATCAGCAATGACACCTACA
    AGAAACTGGCCATCCTGGATTTCAACCGTTGCCAGGCGCAACACCAA
    TTCGAGTGGATCTACATGCAAGAGTGGTATCATCGTAGCAGCGTTTC
    TGAGTTTGGCATTTCCAAAAAAGACTTGCTGCGCGCGTATTTTCTGG
    CGGCAGCGACCATTTTCGAACCGGAGCGCACCCAGGAACGTCTGGTG
    TGGGCTAAGACGCAAATCGTCAGCGGTATGATTACGTCCTTTGTTAA
    TAGCGGTACGACTCTGAGCCTGCACCAGAAAACGGCACTGTTGAGCC
    AAATCGGTCATAACTTTGACGGCCTGGATGAGATTATCAGCGCGATG
    AAAGACCACGGCCTGGCAGCGACGCTGTTAACGACCTTTCAACAGCT
    GCTGGACGGCTTCGATCGCTACACCCGTCATCAGCTGAAAAACGCGT
    GGAGCCAGTGGTTCATGAAGCTGCAACAGGGTGAGGCGTCGGGTGGC
    GAAGATGCTGAGCTGCTGGCTAATACCCTGAACATTTGCGCGGGTTT
    GATTGCGTTTAATGAAGATGTGTTGAGCCACCATGAGTACACCACCC
    TGAGCACCCTGACCAACAAGATCTGTAAGCGCTTGACTCAAATCCAG
    GATAAGAAAACGCTGGAAGTCGTGGATGGTAGCATCAAAGATAAAGA
    ACTGGAAAAAGACATTCAAATGCTGGTGAAACTGGTCCTTGAAGAGA
    ACGGCGGTGGCGTTGACCGTAACATCAAGCACACCTTCCTGAGCGTC
    TTTAAAACCTTTTATTATAATGCCTATCATGACGATGAAACGACCGA
    CGTGCACATTTTCAAAGTTCTGTTCGGTCCGGTCGTGTAA
    SEQ ID NO: 23
    NgSCS-del29, truncated putative sclareol
    synthase from Nicotiana glutinosa
    MANFHRPSRVRCSHSTASSLEEAKERIRETFGKNELSPSSYDTAWVA
    MVPSRYSMNQPCFPRCLDWILENQREDGSWGLNPSHPLLVKDSLSST
    LACLLALRKWRIGDNQVQRGLGFIETHGWAVDNVDQISPLGFDIIFP
    SMIKYAEKLNLDLPFDPNLVNMMLRERELTIERALKNEFEGNMANVE
    YFAEGLGELCHWKEIMLHQRRNGSLFDSPATTAAALIYHQHDEKCFG
    YLSSILKLHENWVPTIYPTKVHSNLFFVDALQNLGVDRYFKTELKSV
    LDEIYRLWLEKNEEIFSDIAHCAMAFRLLRMNNYEVSSEELEGFVDQ
    EHFFTTSGGKLISHVAILELHRASQVDIQEGKDLILDKISTWTRNFM
    EQELLDNQILDRSKKEMEFAMRKFYGTFDRVETRRYIESYKMDSFKI
    LKAAYRSSNINNIDLLKFSEHDFNLCQARHKEELQQIKRWFADCKL
    EQVGSSQNYLYTSYFPIAAILFEPEYGDARLAFAKCGIIATTVDDFF
    DGFACNEELQNIIELVERWDGYPTVGFRSERVRIFFLALYKMIEEIA
    AKAETKQGRCVKDLLINLWIDLLKCMLVELDLWKIKSTTPSIEEYLS
    IACVTTGVKCLILISLHLLGPKLSKDVTESSEVSALWNCTAVVARLN
    NDIHSYKREQAESSTNMAAILISQSQRTISEEEAIRQIKEMMESKRR
    ELLGMVLQNKESQLPQVCKDLFWTTFKAAYSIYTHGDEYRFPQELKN
    HINDVIYKPLNQYSP
    SEQ ID NO: 24
    Optimized cDNA for E. coli expression encoding
    NgSCS-del29
    ATGGCTAATTTCCATCGCCCATCCCGTGTTCGTTGTTCCCACTCTAC
    CGCAAGCTCCCTGGAAGAGGCAAAAGAGCGCATCCGTGAAACCTTCG
    GCAAAAATGAACTCTCTCCTTCTAGCTATGATACGGCCTGGGTTGCT
    ATGGTCCCGAGCCGCTACAGCATGAACCAGCCGTGCTTTCCGCGCTG
    CCTGGACTGGATTCTGGAGAACCAACGTGAGGATGGCAGCTGGGGTC
    TGAACCCGAGCCATCCGTTACTGGTGAAAGACAGCTTGAGCAGCACG
    CTGGCGTGTTTGCTGGCGCTGCGTAAGTGGCGTATTGGCGACAACCA
    AGTCCAGCGTGGCCTGGGTTTTATCGAGACTCATGGTTGGGCAGTGG
    ACAACGTAGACCAGATCTCTCCACTGGGTTTTGACATCATTTTCCCG
    AGCATGATTAAATATGCGGAAAAGCTGAATCTGGATTTGCCTTTTGA
    TCCGAACCTGGTGAACATGATGCTGCGCGAGCGCGAGCTGACGATCG
    AGCGTGCGCTGAAAAACGAATTTGAGGGTAATATGGCTAATGTCGAG
    TACTTCGCCGAGGGTTTGGGTGAGCTGTGTCACTGGAAAGAAATCAT
    GCTGCACCAACGCCGTAACGGTAGCCTGTTCGACTCTCCGGCAACGA
    CCGCCGCGGCTCTTATTTATCATCAGCACGATGAGAAGTGCTTCGGC
    TATCTGTCTAGCATCCTGAAATTACACGAGAACTGGGTGCCGACCAT
    CTATCCGACCAAGGTTCACTCCAATCTGTTTTTCGTCGATGCGCTGC
    AGAACCTGGGTGTTGACCGTTACTTCAAAACCGAACTGAAGTCCGTC
    CTGGATGAGATCTACCGTTTGTGGCTGGAGAAAAACGAAGAGATCTT
    CAGCGATATTGCGCACTGCGCAATGGCGTTTCGCCTGTTGCGCATGA
    ATAATTACGAGGTTAGCAGCGAAGAACTGGAAGGCTTCGTGGACCAA
    GAACATTTTTTCACCACGTCGGGTGGCAAGCTGATCAGCCACGTTGC
    CATCCTGGAACTGCACCGTGCAAGCCAAGTGGACATTCAGGAGGGCA
    AAGACCTGATCCTGGACAAAATTAGCACCTGGACTCGCAACTTTATG
    GAACAGGAACTGCTGGATAACCAGATCTTGGATCGTAGCAAAAAAGA
    AATGGAATTTGCAATGCGTAAGTTTTACGGTACGTTCGATCGCGTGG
    AAACCCGTCGTTATATTGAAAGCTACAAAATGGATTCCTTCAAGATC
    CTGAAGGCAGCGTACCGTAGCTCCAACATTAACAATATTGACCTGTT
    GAAGTTCAGCGAGCACGACTTCAATCTCTGCCAGGCGCGTCACAAGG
    AAGAACTGCAGCAAATCAAACGCTGGTTCGCAGATTGCAAACTGGAG
    CAAGTCGGTAGCAGCCAGAACTACTTGTACACCTCTTACTTCCCGAT
    CGCGGCCATTTTGTTCGAGCCGGAGTATGGCGACGCACGCCTGGCGT
    TCGCGAAGTGCGGTATTATCGCGACCACCGTTGACGATTTTTTTGAC
    GGTTTTGCATGTAATGAAGAACTGCAAAACATCATCGAACTGGTCGA
    GAGATGGGACGGTTATCCGACGGTTGGTTTCCGCTCCGAGCGTGTGC
    GCATTTTCTTTCTGGCGCTGTACAAAATGATTGAAGAAATTGCCGCG
    AAAGCGGAAACGAAACAGGGCCGTTGCGTGAAAGATCTGTTGATCAA
    TCTGTGGATTGATCTGCTGAAATGCATGCTGGTCGAACTGGATCTGT
    GGAAAATTAAGAGCACGACCCCGAGCATTGAAGAGTATCTGAGCATT
    GCCTGTGTGACGACCGGCGTTAAGTGCTTGATCCTGATTAGCCTGCA
    TCTGCTGGGCCCGAAACTGAGCAAAGACGTGACCGAATCCAGCGAAG
    TTAGCGCTCTGTGGAACTGTACGGCCGTGGTTGCGCGCCTGAACAAC
    GACATTCATAGCTACAAGCGTGAGCAAGCCGAGAGCAGCACTAATAT
    GGCCGCAATCCTGATTTCGCAAAGCCAGCGTACCATCTCAGAAGAAG
    AAGCTATCCGCCAGATCAAAGAGATGATGGAATCGAAACGCCGTGAG
    CTGCTGGGCATGGTGCTGCAGAATAAAGAGAGCCAATTGCCGCAAGT
    CTGCAAAGACCTGTTTTGGACCACCTTCAAAGGCGCGTACAGCATTT
    ATACCCACGGTGATGAGTACCGTTTTCCACAAGAACTGAAGAACCAT
    ATCAACGATGTCATCTATAAGCCGTTAAATCAATACAGCCCTTAA
    SEQ ID NO: 25
    NgSCS-del38, putative sclareol synthase from
    Nicotiana glutinosa
    MSHSTASSLEEAKERIRETFGKNELSSSSYDTAWVAMVPSRYSMNQP
    CFPRCLDWILENQREDGSWGLNPSLPLLVKDSLSSTLACLLALRKWR
    IGDNQVQRGLGFIETHGWAVDNVDQISPLGFDIIFPSMIKYAEKLNL
    DLPFDPNLVNMMLRERELTIERALKNEFEGNMANVEYFAEGLGELCH
    WKEIMLHQRRNGSPFDSPATTAAALIYHQHDEKCFGYLSSILKLHEN
    WVPTIYPTKVHSNLFFVDALQNLGVDRYFKTELKSVLDEIYRLWLEK
    NEEIFSDIAHCAMAFRLLRMNNYEVSSEELEGFVDQEHFFTTSGGKL
    ISHVAILELHRASQVDIQEGKDLILDKISTWTRNFMEQELLDNQILD
    RSKKEMEFAMRKFYGTFDRVETRRYIESYKMDSFKILKAAYRSSNIN
    NIDLLKFSEHDFNLCQARHKEELQQIKRWFADCKLEQVGSSQNYLYT
    SYFPIAAILFEPEYGDARLAFAKCGIIATTVDDFFDGFACNEELQNI
    IELVERWDGYPTVGFRSERVRIFFLALYKMIEEIAAKAETKQGRCVK
    DLLINLWIDLLKCMLVELDLWKIKSTTPSIEEYLSIACVTTGVKCLI
    LISLHLLGPKLSKDVTESSEVSALWNCTAVVARLNNDIHSYKREQAE
    SSTNMVAILISQSQRTISEEEAIRQIKEMMESKRRELLGMVLQNKES
    QLPQVCKDLFWTTFKAAYSIYTHGDEYRFPQELKNHINDVIYKPLNQ
    YSP
    SEQ ID NO: 26
    Optimized cDNA for Saccharomyces cerevisiae
    expression encoding SmCPS2.
    ATGGCTACTGTTGACGCTCCACAAGTTCACGACCACGACGGTACTAC
    TGTTCACCAAGGTCACGACGCTGTTAAGAACATCGAAGACCCAATCG
    AATACATCAGAACTTTGTTGAGAACTACTGGTGACGGTAGAATCTCT
    GTTTCTCCATACGACACTGCTTGGGTTGCTATGATCAAGGACGTTGA
    AGGTAGAGACGGTCCACAATTCCCATCTTCTTTGGAATGGATCGTTC
    AAAACCAATTGGAAGACGGTTCTTGGGGTGACCAAAAGTTGTTCTGT
    GTTTACGACAGATTGGTTAACACTATCGCTTGTGTTGTTGCTTTGAG
    ATCTTGGAACGTTCACGCTCACAAGGTTAAGAGAGGTGTTACTTACA
    TCAAGGAAAACGTTGACAAGTTGATGGAAGGTAACGAAGAACACATG
    ACTTGTGGTTTCGAAGTTGTTTTCCCAGCTTTGTTGCAAAAGGCTAA
    GTCTTTGGGTATCGAAGACTTGCCATACGACTCTCCAGCTGTTCAAG
    AAGTTTACCACGTTAGAGAACAAAAGTTGAAGAGAATCCCATTGGAA
    ATCATGCACAAGATCCCAACTTCTTTGTTGTTCTCTTTGGAAGGTTT
    GGAAAACTTGGACTGGGACAAGTTGTTGAAGTTGCAATCTGCTGACG
    GTTCTTTCTTGACTTCTCCATCTTCTACTGCTTTCGCTTTCATGCAA
    ACTAAGGACGAAAAGTGTTACCAATTCATCAAGAACACTATCGACAC
    TTTCAACGGTGGTGCTCCACACACTTACCCAGTTGACGTTTTCGGTA
    GATTGTGGGCTATCGACAGATTGCAAAGATTGGGTATCTCTAGATTC
    TTCGAACCAGAAATCGCTGACTGTTTGTCTCACATCCACAAGTTCTG
    GACTGACAAGGGTGTTTTCTCTGGTAGAGAATCTGAATTCTGTGACA
    TCGACGACACTTCTATGGGTATGAGATTGATGAGAATGCACGGTTAC
    GACGTTGACCCAAACGTTTTGAGAAACTTCAAGCAAAAGGACGGTAA
    GTTCTCTTGTTACGGTGGTCAAATGATCGAATCTCCATCTCCAATCT
    ACAACTTGTACAGAGCTTCTCAATTGAGATTCCCAGGTGAAGAAATC
    TTGGAAGACGCTAAGAGATTCGCTTACGACTTCTTGAAGGAAAAGTT
    GGCTAACAACCAAATCTTGGACAAGTGGGTTATCTCTAAGCACTTGC
    CAGACGAAATCAAGTTGGGTTTGGAAATGCCATGGTTGGCTACTTTG
    CCAAGAGTTGAAGCTAAGTACTACATCCAATACTACGCTGGTTCTGG
    TGACGTTTGGATCGGTAAGACTTTGTACAGAATGCCAGAAATCTCTA
    ACGACACTTACCACGACTTGGCTAAGACTGACTTCAAGAGATGTCAA
    GCTAAGCACCAATTCGAATGGTTGTACATGCAAGAATGGTACGAATC
    TTGTGGTATCGAAGAATTCGGTATCTCTAGAAAGGACTTGTTGTTGT
    CTTACTTCTTGGCTACTGCTTCTATCTTCGAATTGGAAAGAACTAAC
    GAAAGAATCGCTTGGGCTAAGTCTCAAATCATCGCTAAGATGATCAC
    TTCTTTCTTCAACAAGGAAACTACTTCTGAAGAAGACAAGAGAGCTT
    TGTTGAACGAATTGGGTAACATCAACGGTTTGAACGACACTAACGGT
    GCTGGTAGAGAAGGTGGTGCTGGTTCTATCGCTTTGGCTACTTTGAC
    TCAATTCTTGGAAGGTTTCGACAGATACACTAGACACCAATTGAAGA
    ACGCTTGGTCTGTTTGGTTGACTCAATTGCAACACGGTGAAGCTGAC
    GACGCTGAATTGTTGACTAACACTTTGAACATCTGTGCTGGTCACAT
    CGCTTTCAGAGAAGAAATCTTGGCTCACAACGAATACAAGGCTTTGT
    CTAACTTGACTTCTAAGATCTGTAGACAATTGTCTTTCATCCAATCT
    GAAAAGGAAATGGGTGTTGAAGGTGAAATCGCTGCTAAGTCTTCTAT
    CAAGAACAAGGAATTGGAAGAAGACATGCAAATGTTGGTTAAGTTGG
    TTTTGGAAAAGTACGGTGGTATCGACAGAAACATCAAGAAGGCTTTC
    TTGGCTGTTGCTAAGACTTACTACTACAGAGCTTACCACGCTGCTGA
    CACTATCGACACTCACATGTTCAAGGTTTTGTTCGAACCAGTTGCTT
    AA
    SEQ ID NO: 27
    Optimized cDNA for S. cerevisiae expression
    encoding truncated SsScS from
    Salvia sclarea
    ATGGCTAAGATGAAGGAAAACTTCAAGAGAGAAGACGACAAGTTCCC
    AACTACTACTACTTTGAGATCTGAAGACATCCCATCTAACTTGTGTA
    TCATCGACACTTTGCAAAGATTGGGTGTTGACCAATTCTTCCAATAC
    GAAATCAACACTATCTTGGACAACACTTTCAGATTGTGGCAAGAAAA
    GCACAAGGTTATCTACGGTAACGTTACTACTCACGCTATGGCTTTCA
    GATTGTTGAGAGTTAAGGGTTACGAAGTTTCTTCTGAAGAATTGGCT
    CCATACGGTAACCAAGAAGCTGTTTCTCAACAAACTAACGACTTGCC
    AATGATCATCGAATTGTACAGAGCTGCTAACGAAAGAATCTACGAAG
    AAGAAAGATCTTTGGAAAAGATCTTGGCTTGGACTACTATCTTCTTG
    AACAAGCAAGTTCAAGACAACTCTATCCCAGACAAGAAGTTGCACAA
    GTTGGTTGAATTCTACTTGAGAAACTACAAGGGTATCACTATCAGAT
    TGGGTGCTAGAAGAAACTTGGAATTGTACGACATGACTTACTACCAA
    GCTTTGAAGTCTACTAACAGATTCTCTAACTTGTGTAACGAAGACTT
    CTTGGTTTTCGCTAAGCAAGACTTCGACATCCACGAAGCTCAAAACC
    AAAAGGGTTTGCAACAATTGCAAAGATGGTACGCTGACTGTAGATTG
    GACACTTTGAACTTCGGTAGAGACGTTGTTATCATCGCTAACTACTT
    GGCTTCTTTGATCATCGGTGACCACGCTTTCGACTACGTTAGATTGG
    CTTTCGCTAAGACTTCTGTTTTGGTTACTATCATGGACGACTTCTTC
    GACTGTCACGGTTCTTCTCAAGAATGTGACAAGATCATCGAATTGGT
    TAAGGAATGGAAGGAAAACCCAGACGCTGAATACGGTTCTGAAGAAT
    TGGAAATCTTGTTCATGGCTTTGTACAACACTGTTAACGAATTGGCT
    GAAAGAGCTAGAGTTGAACAAGGTAGATCTGTTAAGGAATTCTTGGT
    TAAGTTGTGGGTTGAAATCTTGTCTGCTTTCAAGATCGAATTGGACA
    CTTGGTCTAACGGTACTCAACAATCTTTCGACGAATACATCTCTTCT
    TCTTGGTTGTCTAACGGTTCTAGATTGACTGGTTTGTTGACTATGCA
    ATTCGTTGGTGTTAAGTTGTCTGACGAAATGTTGATGTCTGAAGAAT
    GTACTGACTTGGCTAGACACGTTTGTATGGTTGGTAGATTGTTGAAC
    GACGTTTGTTCTTCTGAAAGAGAAAGAGAAGAAAACATCGCTGGTAA
    GTCTTACTCTATCTTGTTGGCTACTGAAAAGGACGGTAGAAAGGTTT
    CTGAAGACGAAGCTATCGCTGAAATCAACGAAATGGTTGAATACCAC
    TGGAGAAAGGTTTTGCAAATCGTTTACAAGAAGGAATCTATCTTGCC
    AAGAAGATGTAAGGACGTTTTCTTGGAAATGGCTAAGGGTACTTTCT
    ACGCTTACGGTATCAACGACGAATTGACTTCTCCACAACAATCTAAG
    GAAGACATGAAGTCTTTCGTTTTCTAA
    SEQ ID NO: 28
    Optimized cDNA for S. cerevisiae expression
    encoding the GGPP synthase from Pantoea
    agglomerans
    ATGGTTTCTGGTTCTAAGGCTGGTGTTTCTCCACACAGAGAAATCGA
    AGTTATGAGACAATCTATCGACGACCACTTGGCTGGTTTGTTGCCAG
    AAACTGACTCTCAAGACATCGTTTCTTTGGCTATGAGAGAAGGTGTT
    ATGGCTCCAGGTAAGAGAATCAGACCATTGTTGATGTTGTTGGCTGC
    TAGAGACTTGAGATACCAAGGTTCTATGCCAACTTTGTTGGACTTGG
    CTTGTGCTGTTGAATTGACTCACACTGCTTCTTTGATGTTGGACGAC
    ATGCCATGTATGGACAACGCTGAATTGAGAAGAGGTCAACCAACTAC
    TCACAAGAAGTTCGGTGAATCTGTTGCTATCTTGGCTTCTGTTGGTT
    TGTTGTCTAAGGCTTTCGGTTTGATCGCTGCTACTGGTGACTTGCCA
    GGTGAAAGAAGAGCTCAAGCTGTTAACGAATTGTCTACTGCTGTTGG
    TGTTCAAGGTTTGGTTTTGGGTCAATTCAGAGACTTGAACGACGCTG
    CTTTGGACAGAACTCCAGACGCTATCTTGTCTACTAACCACTTGAAG
    ACTGGTATCTTGTTCTCTGCTATGTTGCAAATCGTTGCTATCGCTTC
    TGCTTCTTCTCCATCTACTAGAGAAACTTTGCACGCTTTCGCTTTGG
    ACTTCGGTCAAGCTTTCCAATTGTTGGACGACTTGAGAGACGACCAC
    CCAGAAACTGGTAAGGACAGAAACAAGGACGCTGGTAAGTCTACTTT
    GGTTAACAGATTGGGTGCTGACGCTGCTAGACAAAAGTTGAGAGAAC
    ACATCGACTCTGCTGACAAGCACTTGACTTTCGCTTGTCCACAAGGT
    GGTGCTATCAGACAATTCATGCACTTGTGGTTCGGTCACCACTTGGC
    TGACTGGTCTCCAGTTATGAAGATCGCTTAA
    SEQ ID NO: 29
    Optimized cDNA for S. cerevisiae expression
    encoding for CfCPS1-del63
    ATGGTTGCTACTGTTAACGCTCCACCAGTTCACGACCAAGACGACTC
    TACTGAAAACCAATGTCACGACGCTGTTAACAACATCGAAGACCCAA
    TCGAATACATCAGAACTTTGTTGAGAACTACTGGTGACGGTAGAATC
    TCTGTTTCTCCATACGACACTGCTTGGGTTGCTTTGATCAAGGACTT
    GCAAGGTAGAGACGCTCCAGAATTCCCATCTTCTTTGGAATGGATCA
    TCCAAAACCAATTGGCTGACGGTTCTTGGGGTGACGCTAAGTTCTTC
    TGTGTTTACGACAGATTGGTTAACACTATCGCTTGTGTTGTTGCTTT
    GAGATCTTGGGACGTTCACGCTGAAAAGGTTGAAAGAGGTGTTAGAT
    ACATCAACGAAAACGTTGAAAAGTTGAGAGACGGTAACGAAGAACAC
    ATGACTTGTGGTTTCGAAGTTGTTTTCCCAGCTTTGTTGCAAAGAGC
    TAAGTCTTTGGGTATCCAAGACTTGCCATACGACGCTCCAGTTATCC
    AAGAAATCTACCACTCTAGAGAACAAAAGTCTAAGAGAATCCCATTG
    GAAATGATGCACAAGGTTCCAACTTCTTTGTTGTTCTCTTTGGAAGG
    TTTGGAAAACTTGGAATGGGACAAGTTGTTGAAGTTGCAATCTGCTG
    ACGGTTCTTTCTTGACTTCTCCATCTTCTACTGCTTTCGCTTTCATG
    CAAACTAGAGACCCAAAGTGTTACCAATTCATCAAGAACACTATCCA
    AACTTTCAACGGTGGTGCTCCACACACTTACCCAGTTGACGTTTTCG
    GTAGATTGTGGGCTATCGACAGATTGCAAAGATTGGGTATCTCTAGA
    TTCTTCGAATCTGAAATCGCTGACTGTATCGCTCACATCCACAGATT
    CTGGACTGAAAAGGGTGTTTTCTCTGGTAGAGAATCTGAATTCTGTG
    ACATCGACGACACTTCTATGGGTGTTAGATTGATGAGAATGCACGGT
    TACGACGTTGACCCAAACGTTTTGAAGAACTTCAAGAAGGACGACAA
    GTTCTCTTGTTACGGTGGTCAAATGATCGAATCTCCATCTCCAATCT
    ACAACTTGTACAGAGCTTCTCAATTGAGATTCCCAGGTGAACAAATC
    TTGGAAGACGCTAACAAGTTCGCTTACGACTTCTTGCAAGAAAAGTT
    GGCTCACAACCAAATCTTGGACAAGTGGGTTATCTCTAAGCACTTGC
    CAGACGAAATCAAGTTGGGTTTGGAAATGCCATGGTACGCTACTTTG
    CCAAGAGTTGAAGCTAGATACTACATCCAATACTACGCTGGTTCTGG
    TGACGTTTGGATCGGTAAGACTTTGTACAGAATGCCAGAAATCTCTA
    ACGACACTTACCACGAATTGGCTAAGACTGACTTCAAGAGATGTCAA
    GCTCAACACCAATTCGAATGGATCTACATGCAAGAATGGTACGAATC
    TTGTAACATGGAAGAATTCGGTATCTCTAGAAAGGAATTGTTGGTTG
    CTTACTTCTTGGCTACTGCTTCTATCTTCGAATTGGAAAGAGCTAAC
    GAAAGAATCGCTTGGGCTAAGTCTCAAATCATCTCTACTATCATCGC
    TTCTTTCTTCAACAACCAAAACACTTCTCCAGAAGACAAGTTGGCTT
    TCTTGACTGACTTCAAGAACGGTAACTCTACTAACATGGCTTTGGTT
    ACTTTGACTCAATTCTTGGAAGGTTTCGACAGATACACTTCTCACCA
    ATTGAAGAACGCTTGGTCTGTTTGGTTGAGAAAGTTGCAACAAGGTG
    AAGGTAACGGTGGTGCTGACGCTGAATTGTTGGTTAACACTTTGAAC
    ATCTGTGCTGGTCACATCGCTTTCAGAGAAGAAATCTTGGCTCACAA
    CGACTACAAGACTTTGTCTAACTTGACTTCTAAGATCTGTAGACAAT
    TGTCTCAAATCCAAAACGAAAAGGAATTGGAAACTGAAGGTCAAAAG
    ACTTCTATCAAGAACAAGGAATTGGAAGAAGACATGCAAAGATTGGT
    TAAGTTGGTTTTGGAAAAGTCTAGAGTTGGTATCAACAGAGACATGA
    AGAAGACTTTCTTGGCTGTTGTTAAGACTTACTACTACAAGGCTTAC
    CACTCTGCTCAAGCTATCGACAACCACATGTTCAAGGTTTTGTTCGA
    ACCAGTTGCTTAA
    SEQ ID NO: 30
    Optimized cDNA for S. cerevisiae expression
    encoding for TaTps1-del59
    ATGTACAGACAAAGAACTGACGAACCATCTGAAACTAGACAAATGAT
    CGACGACATCAGAACTGCTTTGGCTTCTTTGGGTGACGACGAAACTT
    CTATGTCTGTTTCTGCTTACGACACTGCTTTGGTTGCTTTGGTTAAG
    AACTTGGACGGTGGTGACGGTCCACAATTCCCATCTTGTATCGACTG
    GATCGTTCAAAACCAATTGCCAGACGGTTCTTGGGGTGACCCAGCTT
    TCTTCATGGTTCAAGACAGAATGATCTCTACTTTGGCTTGTGTTGTT
    GCTGTTAAGTCTTGGAACATCGACAGAGACAACTTGTGTGACAGAGG
    TGTTTTGTTCATCAAGGAAAACATGTCTAGATTGGTTGAAGAAGAAC
    AAGACTGGATGCCATGTGGTTTCGAAATCAACTTCCCAGCTTTGTTG
    GAAAAGGCTAAGGACTTGGACTTGGACATCCCATACGACCACCCAGT
    TTTGGAAGAAATCTACGCTAAGAGAAACTTGAAGTTGTTGAAGATCC
    CATTGGACGTTTTGCACGCTATCCCAACTACTTTGTTGTTCTCTGTT
    GAAGGTATGGTTGACTTGCCATTGGACTGGGAAAAGTTGTTGAGATT
    GAGATGTCCAGACGGTTCTTTCCACTCTTCTCCAGCTGCTACTGCTG
    CTGCTTTGTCTCACACTGGTGACAAGGAATGTCACGCTTTCTTGGAC
    AGATTGATCCAAAAGTTCGAAGGTGGTGTTCCATGTTCTCACTCTAT
    GGACACTTTCGAACAATTGTGGGTTGTTGACAGATTGATGAGATTGG
    GTATCTCTAGACACTTCACTTCTGAAATCCAACAATGTTTGGAATTC
    ATCTACAGAAGATGGACTCAAAAGGGTTTGGCTCACAACATGCACTG
    TCCAATCCCAGACATCGACGACACTGCTATGGGTTTCAGATTGTTGA
    GACAACACGGTTACGACGTTACTCCATCTGTTTTCAAGCACTTCGAA
    AAGGACGGTAAGTTCGTTTGTTTCCCAATGGAAACTAACCACGCTTC
    TGTTACTCCAATGCACAACACTTACAGAGCTTCTCAATTCATGTTCC
    CAGGTGACGACGACGTTTTGGCTAGAGCTGGTAGATACTGTAGAGCT
    TTCTTGCAAGAAAGACAATCTTCTAACAAGTTGTACGACAAGTGGAT
    CATCACTAAGGACTTGCCAGGTGAAGTTGGTTACACTTTGAACTTCC
    CATGGAAGTCTTCTTTGCCAAGAATCGAAACTAGAATGTACTTGGAC
    CAATACGGTGGTAACAACGACGTTTGGATCGCTAAGGTTTTGTACAG
    AATGAACTTGGTTTCTAACGACTTGTACTTGAAGATGGCTAAGGCTG
    ACTTCACTGAATACCAAAGATTGTCTAGAATCGAATGGAACGGTTTG
    AGAAAGTGGTACTTCAGAAACCACTTGCAAAGATACGGTGCTACTCC
    AAAGTCTGCTTTGAAGGCTTACTTCTTGGCTTCTGCTAACATCTTCG
    AACCAGGTAGAGCTGCTGAAAGATTGGCTTGGGCTAGAATGGCTGTT
    TTGGCTGAAGCTGTTACTACTCACTTCAGACACATCGGTGGTCCATG
    TTACTCTACTGAAAACTTGGAAGAATTGATCGACTTGGTTTCTTTCG
    ACGACGTTTCTGGTGGTTTGAGAGAAGCTTGGAAGCAATGGTTGATG
    GCTTGGACTGCTAAGGAATCTCACGGTTCTGTTGACGGTGACACTGC
    TTTGTTGTTCGTTAGAACTATCGAAATCTGTTCTGGTAGAATCGTTT
    CTTCTGAACAAAAGTTGAACTTGTGGGACTACTCTCAATTGGAACAA
    TTGACTTCTTCTATCTGTCACAAGTTGGCTACTATCGGTTTGTCTCA
    AAACGAAGCTTCTATGGAAAACACTGAAGACTTGCACCAACAAGTTG
    ACTTGGAAATGCAAGAATTGTCTTGGAGAGTTCACCAAGGTTGTCAC
    GGTATCAACAGAGAAACTAGACAAACTTTCTTGAACGTTGTTAAGTC
    TTTCTACTACTCTGCTCACTGTTCTCCAGAAACTGTTGACTCTCACA
    TCGCTAAGGTTATCTTCCAAGACGTTATCTAA
    SEQ ID NO: 31
    Optimized cDNA for S. cerevisiae expression
    encoding for MvCps3-del63
    ATGGCTCCACCAGAACAAAAGTACAACTCTACTGCTTTGGAACACGA
    CACTGAAATCATCGAAATCGAAGACCACATCGAATGTATCAGAAGAT
    TGTTGAGAACTGCTGGTGACGGTAGAATCTCTGTTTCTCCATACGAC
    ACTGCTTGGATCGCTTTGATCAAGGACTTGGACGGTCACGACTCTCC
    ACAATTCCCATCTTCTATGGAATGGGTTGCTGACAACCAATTGCCAG
    ACGGTTCTTGGGGTGACGAACACTTCGTTTGTGTTTACGACAGATTG
    GTTAACACTATCGCTTGTGTTGTTGCTTTGAGATCTTGGAACGTTCA
    CGCTCACAAGTGTGAAAAGGGTATCAAGTACATCAAGGAAAACGTTC
    ACAAGTTGGAAGACGCTAACGAAGAACACATGACTTGTGGTTTCGAA
    GTTGTTTTCCCAGCTTTGTTGCAAAGAGCTCAATCTATGGGTATCAA
    GGGTATCCCATACAACGCTCCAGTTATCGAAGAAATCTACAACTCTA
    GAGAAAAGAAGTTGAAGAGAATCCCAATGGAAGTTGTTCACAAGGTT
    GCTACTTCTTTGTTGTTCTCTTTGGAAGGTTTGGAAAACTTGGAATG
    GGAAAAGTTGTTGAAGTTGCAATCTCCAGACGGTTCTTTCTTGACTT
    CTCCATCTTCTACTGCTTTCGCTTTCATCCACACTAAGGACAGAAAG
    TGTTTCAACTTCATCAACAACATCGTTCACACTTTCAAGGGTGGTGC
    TCCACACACTTACCCAGTTGACATCTTCGGTAGATTGTGGGCTGTTG
    ACAGATTGCAAAGATTGGGTATCTCTAGATTCTTCGAATCTGAAATC
    GCTGAATTCTTGTCTCACGTTCACAGATTCTGGTCTGACGAAGCTGG
    TGTTTTCTCTGGTAGAGAATCTGTTTTCTGTGACATCGACGACACTT
    CTATGGGTTTGAGATTGTTGAGAATGCACGGTTACCACGTTGACCCA
    AACGTTTTGAAGAACTTCAAGCAATCTGACAAGTTCTCTTGTTACGG
    TGGTCAAATGATGGAATGTTCTTCTCCAATCTACAACTTGTACAGAG
    CTTCTCAATTGCAATTCCCAGGTGAAGAAATCTTGGAAGAAGCTAAC
    AAGTTCGCTTACAAGTTCTTGCAAGAAAAGTTGGAATCTAACCAAAT
    CTTGGACAAGTGGTTGATCTCTAACCACTTGTCTGACGAAATCAAGG
    TTGGTTTGGAAATGCCATGGTACGCTACTTTGCCAAGAGTTGAAACT
    TCTTACTACATCCACCACTACGGTGGTGGTGACGACGTTTGGATCGG
    TAAGACTTTGTACAGAATGCCAGAAATCTCTAACGACACTTACAGAG
    AATTGGCTAGATTGGACTTCAGAAGATGTCAAGCTCAACACCAATTG
    GAATGGATCTACATGCAAAGATGGTACGAATCTTGTAGAATGCAAGA
    ATTCGGTATCTCTAGAAAGGAAGTTTTGAGAGCTTACTTCTTGGCTT
    CTGGTACTATCTTCGAAGTTGAAAGAGCTAAGGAAAGAGTTGCTTGG
    GCTAGATCTCAAATCATCTCTCACATGATCAAGTCTTTCTTCAACAA
    GGAAACTACTTCTTCTGACCAAAAGCAAGCTTTGTTGACTGAATTGT
    TGTTCGGTAACATCTCTGCTTCTGAAACTGAAAAGAGAGAATTGGAC
    GGTGTTGTTGTTGCTACTTTGAGACAATTCTTGGAAGGTTTCGACAT
    CGGTACTAGACACCAAGTTAAGGCTGCTTGGGACGTTTGGTTGAGAA
    AGGTTGAACAAGGTGAAGCTCACGGTGGTGCTGACGCTGAATTGTGT
    ACTACTACTTTGAACACTTGTGCTAACCAACACTTGTCTTCTCACCC
    AGACTACAACACTTTGTCTAAGTTGACTAACAAGATCTGTCACAAGT
    TGTCTCAAATCCAACACCAAAAGGAAATGAAGGGTGGTATCAAGGCT
    AAGTGTTCTATCAACAACAAGGAAGTTGACATCGAAATGCAATGGTT
    GGTTAAGTTGGTTTTGGAAAAGTCTGGTTTGAACAGAAAGGCTAAGC
    AAGCTTTCTTGTCTATCGCTAAGACTTACTACTACAGAGCTTACTAC
    GCTGACCAAACTATGGACGCTCACATCTTCAAGGTTTTGTTCGAACC
    AGTTGTTTAA
    SEQ ID NO: 32
    Optimized cDNA for S. cerevisiae expression
    encoding for RoCPS1-del67
    ATGGCTTCTCAAGTTTCTGAAAAGGGTACTTCTTCTCCAGTTCAAAC
    TCCAGAAGAAGTTAACGAAAAGATCGAAAACTACATCGAATACATCA
    AGAACTTGTTGACTACTTCTGGTGACGGTAGAATCTCTGTTTCTCCA
    TACGACACTTCTATCGTTGCTTTGATCAAGGACTTGAAGGGTAGAGA
    CACTCCACAATTCCCATCTTGTTTGGAATGGATCGCTCAACACCAAA
    TGGCTGACGGTTCTTGGGGTGACGAATTCTTCTGTATCTACGACAGA
    ATCTTGAACACTTTGGCTTGTGTTGTTGCTTTGAAGTCTTGGAACGT
    TCACGCTGACATGATCGAAAAGGGTGTTACTTACGTTAACGAAAACG
    TTCAAAAGTTGGAAGACGGTAACTTGGAACACATGACTTCTGGTTTC
    GAAATCGTTGTTCCAGCTTTGGTTCAAAGAGCTCAAGACTTGGGTAT
    CCAAGGTTTGCCATACGACCACCCATTGATCAAGGAAATCGCTAACA
    CTAAGGAAGGTAGATTGAAGAAGATCCCAAAGGACATGATCTACCAA
    AAGCCAACTACTTTGTTGTTCTCTTTGGAAGGTTTGGGTGACTTGGA
    ATGGGAAAAGATCTTGAAGTTGCAATCTGGTGACGGTTCTTTCTTGA
    CTTCTCCATCTTCTACTGCTCACGTTTTCATGAAGACTAAGGACGAA
    TAAGTGTTGAAGTTCATCGAAAACGCTGTTAAGAACTGTAACGGTGG
    TGCTCCACACACTTACCCAGTTGACGTTTTCGCTAGATTGTGGGCTG
    TTGACAGATTGCAAAGATTGGGTATCTCTAGATTCTTCCAACAAGAA
    ATCAAGTACTTCTTGGACCACATCAACTCTGTTTGGACTGAAAACGG
    TGTTTTCTCTGGTAGAGACTCTGAATTCTGTGACATCGACGACACTT
    CTATGGGTATCAGATTGTTGAAGATGCACGGTTACGACATCGACCCA
    AACGCTTTGGAACACTTCAAGCAACAAGACGGTAAGTTCTCTTGTTA
    CGGTGGTCAAATGATCGAATCTGCTTCTCCAATCTACAACTTGTACA
    GAGCTGCTCAATTGAGATTCCCAGGTGAAGAAATCTTGGAAGAAGCT
    ACTAAGTTCGCTTACAACTTCTTGCAAGAAAAGATCGCTAACGACCA
    ATTCCAAGAAAAGTGGGTTATCTCTGACCACTTGATCGACGAAGTTA
    AGTTGGGTTTGAAGATGCCATGGTACGCTACTTTGCCAAGAGTTGAA
    TGCTGCTTACTACTTGCAATACTACGCTGGTTGTGGGACGTTTGGAT
    CGGTAAGGTTTTCTACAGAATGCCAGAAATCTCTAACGACACTTACA
    AGAAGTTGGCTATCTTGGACTTCAACAGATGTCAAGCTCAACACCAA
    TTCGAATGGATCTACATGCAAGAATGGTACCACAGATCTTCTGTTTC
    TGAATTCGGTATCTCTAAGAAGGACTTGTTGAGAGCTTACTTCTTGG
    CTGCTGCTACTATCTTCGAACCAGAAAGAACTCAAGAAAGATTGGTT
    TGGGCTAAGACTCAAATCGTTTCTGGTATGATCACTTCTTTCGTTAA
    CTCTGGTACTACTTTGTCTTTGCACCAAAAGACTGCTTTGTTGTCTC
    AAATCGGTCACAACTTCGACGGTTTGGACGAAATCATCTCTGCTATG
    AAGGACCACGGTTTGGCTGCTACTTTGTTGACTACTTTCCAACAATT
    GTTGGACGGTTTCGACAGATACACTAGACACCAATTGAAGAACGCTT
    GGTCTCAATGGTTCATGAAGTTGCAACAAGGTGAAGCTTCTGGTGGT
    GAAGACGCTGAATTGTTGGCTAACACTTTGAACATCTGTGCTGGTTT
    GATCGCTTTCAACGAAGACGTTTTGTCTCACCACGAATACACTACTT
    TGTCTACTTTGACTAACAAGATCTGTAAGAGATTGACTCAAATCCAA
    GACAAGAAGACTTTGGAAGTTGTTGACGGTTCTATCAAGGACAAGGA
    ATTGGAAAAGGACATCCAAATGTTGGTTAAGTTGGTTTTGGAAGAAA
    ACGGTGGTGGTGTTGACAGAAACATCAAGCACACTTTCTTGTCTGTT
    TTCAAGACTTTCTACTACAACGCTTACCACGACGACGAAACTACTGA
    CGTTCACATCTTCAAGGTTTTGTTCGGTCCAGTTGTTTAA
    SEQ ID NO: 33
    Optimized cDNA for S. cerevisiae expression
    encoding for NgSCS-del29
    ATGGCTAACTTCCACAGACCATCTAGAGTTAGATGTTCTCACTCTAC
    TGCTTCTTCTTTGGAAGAAGCTAAGGAAAGAATCAGAGAAACTTTCG
    GTAAGAACGAATTGTCTCCATCTTCTTACGACACTGCTTGGGTTGCT
    ATGGTTCCATCTAGATACTCTATGAACCAACCATGTTTCCCAAGATG
    TTTGGACTGGATCTTGGAAAACCAAAGAGAAGACGGTTCTTGGGGTT
    TGAACCCATCTCACCCATTGTTGGTTAAGGACTCTTTGTCTTCTACT
    TTGGCTTGTTTGTTGGCTTTGAGAAAGTGGAGAATCGGTGACAACCA
    AGTTCAAAGAGGTTTGGGTTTCATCGAAACTCACGGTTGGGCTGTTG
    ACAACGTTGACCAAATCTCTCCATTGGGTTTCGACATCATCTTCCCA
    TCTATGATCAAGTACGCTGAAAAGTTGAACTTGGACTTGCCATTCGA
    CCCAAACTTGGTTAACATGATGTTGAGAGAAAGAGAATTGACTATCG
    AAAGAGCTTTGAAGAACGAATTCGAAGGTAACATGGCTAACGTTGAA
    TACTTCGCTGAAGGTTTGGGTGAATTGTGTCACTGGAAGGAAATCAT
    GTTGCACCAAAGAAGAAACGGTTCTTTGTTCGACTCTCCAGCTACTA
    CTGCTGCTGCTTTGATCTACCACCAACACGACGAAAAGTGTTTCGGT
    TACTTGTCTTCTATCTTGAAGTTGCACGAAAACTGGGTTCCAACTAT
    CTACCCAACTAAGGTTCACTCTAACTTGTTCTTCGTTGACGCTTTGC
    AAAACTTGGGTGTTGACAGATACTTCAAGACTGAATTGAAGTCTGTT
    TTGGACGAAATCTACAGATTGTGGTTGGAAAAGAACGAAGAAATCTT
    CTCTGACATCGCTCACTGTGCTATGGCTTTCAGATTGTTGAGAATGA
    ACAACTACGAAGTTTCTTCTGAAGAATTGGAAGGTTTCGTTGACCAA
    GAACACTTCTTCACTACTTCTGGTGGTAAGTTGATCTCTCACGTTGC
    TATCTTGGAATTGCACAGAGCTTCTCAAGTTGACATCCAAGAAGGTA
    AGGACTTGATCTTGGACAAGATCTCTACTTGGACTAGAAACTTCATG
    GAACAAGAATTGTTGGACAACCAAATCTTGGACAGATCTAAGAAGGA
    AATGGAATTCGCTATGAGAAAGTTCTACGGTACTTTCGACAGAGTTG
    AAACTAGAAGATACATCGAATCTTACAAGATGGACTCTTTCAAGATC
    TTGAAGGCTGCTTACAGATCTTCTAACATCAACAACATCGACTTGTT
    GAAGTTCTCTGAACACGACTTCAACTTGTGTCAAGCTAGACACAAGG
    AAGAATTGCAACAAATCAAGAGATGGTTCGCTGACTGTAAGTTGGAA
    CAAGTTGGTTCTTCTCAAAACTACTTGTACACTTCTTACTTCCCAAT
    CGCTGCTATCTTGTTCGAACCAGAATACGGTGACGCTAGATTGGCTT
    TCGCTAAGTGTGGTATCATCGCTACTACTGTTGACGACTTCTTCGAC
    GGTTTCGCTTGTAACGAAGAATTGCAAAACATCATCGAATTGGTTGA
    AAGATGGGACGGTTACCCAACTGTTGGTTTCAGATCTGAAAGAGTTA
    GAATCTTCTTCTTGGCTTTGTACAAGATGATCGAAGAAATCGCTGCT
    AAGGCTGAAACTAAGCAAGGTAGATGTGTTAAGGACTTGTTGATCAA
    CTTGTGGATCGACTTGTTGAAGTGTATGTTGGTTGAATTGGACTTGT
    GGAAGATCAAGTCTACTACTCCATCTATCGAAGAATACTTGTCTATC
    GCTTGTGTTACTACTGGTGTTAAGTGTTTGATCTTGATCTCTTTGCA
    CTTGTTGGGTCCAAAGTTGTCTAAGGACGTTACTGAATCTTCTGAAG
    TTTCTGCTTTGTGGAACTGTACTGCTGTTGTTGCTAGATTGAACAAC
    GACATCCACTCTTACAAGAGAGAACAAGCTGAATCTTCTACTAACAT
    GGCTGCTATCTTGATCTCTCAATCTCAAAGAACTATCTCTGAAGAAG
    AAGCTATCAGACAAATCAAGGAAATGATGGAATCTAAGAGAAGAGAA
    TTGTTGGGTATGGTTTTGCAAAACAAGGAATCTCAATTGCCACAAGT
    TTGTAAGGACTTGTTCTGGACTACTTTCAAGGCTGCTTACTCTATCT
    ACACTCACGGTGACGAATACAGATTCCCACAAGAATTGAAGAACCAC
    ATCAACGACGTTATCTACAAGCCATTGAACCAATACTCTCCATAA
    SEQ ID NO: 34
    Optimized cDNA for S. cerevisiae expression
    encoding for NgSCS-del38
    ATGTCTCACTCTACTGCTTCTTCTTTGGAAGAAGCTAAGGAAAGAAT
    CAGAGAAACTTTCGGTAAGAACGAATTGTCTTCTTCTTCTTACGACA
    CTGCTTGGGTTGCTATGGTTCCATCTAGATACTCTATGAACCAACCA
    TGTTTCCCAAGATGTTTGGACTGGATCTTGGAAAACCAAAGAGAAGA
    CGGTTCTTGGGGTTTGAACCCATCTTTGCCATTGTTGGTTAAGGACT
    CTTTGTCTTCTACTTTGGCTTGTTTGTTGGCTTTGAGAAAGTGGAGA
    ATCGGTGACAACCAAGTTCAAAGAGGTTTGGGTTTCATCGAAACTCA
    CGGTTGGGCTGTTGACAACGTTGACCAAATCTCTCCATTGGGTTTCG
    ACATCATCTTCCCATCTATGATCAAGTACGCTGAAAAGTTGAACTTG
    GACTTGCCATTCGACCCAAACTTGGTTAACATGATGTTGAGAGAAAG
    AGAATTGACTATCGAAAGAGCTTTGAAGAACGAATTCGAAGGTAACA
    TGGCTAACGTTGAATACTTCGCTGAAGGTTTGGGTGAATTGTGTCAC
    TGGAAGGAAATCATGTTGCACCAAAGAAGAAACGGTTCTCCATTCGA
    CTCTCCAGCTACTACTGCTGCTGCTTTGATCTACCACCAACACGACG
    AAAAGTGTTTCGGTTACTTGTCTTCTATCTTGAAGTTGCACGAAAAC
    TGGGTTCCAACTATCTACCCAACTAAGGTTCACTCTAACTTGTTCTT
    CGTTGACGCTTTGCAAAACTTGGGTGTTGACAGATACTTCAAGACTG
    AATTGAAGTCTGTTTTGGACGAAATCTACAGATTGTGGTTGGAAAAG
    AACGAAGAAATCTTCTCTGACATCGCTCACTGTGCTATGGCTTTCAG
    ATTGTTGAGAATGAACAACTACGAAGTTTCTTCTGAAGAATTGGAAG
    GTTTCGTTGACCAAGAACACTTCTTCACTACTTCTGGTGGTAAGTTG
    ATCTCTCACGTTGCTATCTTGGAATTGCACAGAGCTTCTCAAGTTGA
    CATCCAAGAAGGTAAGGACTTGATCTTGGACAAGATCTCTACTTGGA
    CTAGAAACTTCATGGAACAAGAATTGTTGGACAACCAAATCTTGGAC
    AGATCTAAGAAGGAAATGGAATTCGCTATGAGAAAGTTCTACGGTAC
    TTTCGACAGAGTTGAAACTAGAAGATACATCGAATCTTACAAGATGG
    ACTCTTTCAAGATCTTGAAGGCTGCTTACAGATCTTCTAACATCAAC
    AACATCGACTTGTTGAAGTTCTCTGAACACGACTTCAACTTGTGTCA
    AGCTAGACACAAGGAAGAATTGCAACAAATCAAGAGATGGTTCGCTG
    ACTGTAAGTTGGAACAAGTTGGTTCTTCTCAAAACTACTTGTACACT
    TCTTACTTCCCAATCGCTGCTATCTTGTTCGAACCAGAATACGGTGA
    CGCTAGATTGGCTTTCGCTAAGTGTGGTATCATCGCTACTACTGTTG
    ACGACTTCTTCGACGGTTTCGCTTGTAACGAAGAATTGCAAAACATC
    ATCGAATTGGTTGAAAGATGGGACGGTTACCCAACTGTTGGTTTCAG
    ATCTGAAAGAGTTAGAATCTTCTTCTTGGCTTTGTACAAGATGATCG
    AAGAAATCGCTGCTAAGGCTGAAACTAAGCAAGGTAGATGTGTTAAG
    GACTTGTTGATCAACTTGTGGATCGACTTGTTGAAGTGTATGTTGGT
    TGAATTGGACTTGTGGAAGATCAAGTCTACTACTCCATCTATCGAAG
    AATACTTGTCTATCGCTTGTGTTACTACTGGTGTTAAGTGTTTGATC
    TTGATCTCTTTGCACTTGTTGGGTCCAAAGTTGTCTAAGGACGTTAC
    TGAATCTTCTGAAGTTTCTGCTTTGTGGAACTGTACTGCTGTTGTTG
    CTAGATTGAACAACGACATCCACTCTTACAAGAGAGAACAAGCTGAA
    TCTTCTACTAACATGGTTGCTATCTTGATCTCTCAATCTCAAAGAAC
    TATCTCTGAAGAAGAAGCTATCAGACAAATCAAGGAAATGATGGAAT
    CTAAGAGAAGAGAATTGTTGGGTATGGTTTTGCAAAACAAGGAATCT
    CAATTGCCACAAGTTTGTAAGGACTTGTTCTGGACTACTTTCAAGGC
    TGCTTACTCTATCTACACTCACGGTGACGAATACAGATTCCCACAAG
    AATTGAAGAACCACATCAACGACGTTATCTACAAGCCATTGAACCAA
    TACTCTCCATAA
    SEQ ID NO: 35
    Primer for construction of fragment “a”
    (URA3 yeast marker)
    AGGTGCAGTTCGCGTGCAATTATAACGTCGTGGCAACTGTTATCAGT
    CGTACCGCGCCATTGAGAGTGCACCATACCACAGCTTT
    SEQ ID NO: 36
    Primer for construction of fragment “a”
    (URA3 yeast marker)
    TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCG
    TTGTTGCTGACCAGCGGTATTTCACACCGCATAGGGTA
    SEQ ID NO: 37
    Primer for construction of fragment “b” (AmpR
    E. coli marker)
    TGGTCAGCAACAACGCCGAAGAATCACTCTCGTGTTGAGAATTGCAC
    GCCTTGACCACGACACGTTAAGGGATTTTGGTCATGAG
    SEQ ID NO: 38
    Primer for construction of fragment “b”
    (AmpR E. coli marker)
    AACGCGTACCCTAAGTACGGCACCACAGTGACTATGCAGTCCGCACT
    TTGCCAATGCCAAAAATGTGCGCGGAACCCCTA
    SEQ ID NO: 39
    Primer for construction of fragment “c” (Yeast
    origin of replication)
    TTGGCATTGGCAAAGTGCGGACTGCATAGTCACTGTGGTGCCGTACT
    TAGGGTACGCGTTCCTGAACGAAGCATCTGTGCTTCA
    SEQ ID NO: 40
    Primer for construction of fragment “c” (Yeast
    origin of replication)
    CCGAGATGCCAAAGGATAGGTGCTATGTTGATGACTACGACACAGAA
    CTGCGGGTGACATAATGATAGCATTGAAGGATGAGACT
    SEQ ID NO: 41
    Primer for construction of fragment “d”
    (E. coli origin of replication)
    ATGTCACCCGCAGTTCTGTGTCGTAGTCATCAACATAGCACCTATCC
    TTTGGCATCTCGGTGAGCAAAAGGCCAGCAAAAGG
    SEQ ID NO: 42
    Primer for construction of fragment “d”
    (E. coli origin of replication)
    CTCAGATGTACGGTGATCGCCACCATGTGACGGAAGCTATCCTGACA
    GTGTAGCAAGTGCTGAGCGTCAGACCCCGTAGAA
    SEQ ID NO: 43
    Part of fragment “d” obtained by DNA synthesis
    ATTCCTAGTGACGGCCTTGGGAACTCGATACACGATGTTCAGTAGAC
    CGCTCACACATGG
    SEQ ID NO: 44
    Primer for construction of fragment “a” (LEU2
    yeast marker)
    AGGTGCAGTTCGCGTGCAATTATAACGTCGTGGCAACTGTTATCAGT
    CGTACCGCGCCATTCGACTACGTCGTAAGGCC
    SEQ ID NO: 45
    Primer for construction of fragment “a” (LEU2
    yeast marker)
    TCGTGGTCAAGGCGTGCAATTCTCAACACGAGAGTGATTCTTCGGCG
    TTGTTGCTGACCATCGACGGTCGAGGAGAACTT
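
The optimized cDNA entries in the listing above are printed as wrapped lines of nucleotides. A reader who copies one of them may want to confirm that the joined text still reads as a single open reading frame. The following is a minimal sketch of such a check, not part of the disclosed method. It assumes the standard stop codons (TAA, TAG, TGA); the function names and the short toy input are hypothetical and are not taken from the listing.

```python
# Sanity check for a coding sequence copied from wrapped lines.
# Assumption: standard stop codons; helper names are illustrative only.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def join_wrapped(lines):
    """Join wrapped sequence lines, dropping all whitespace, and upper-case the result."""
    return "".join("".join(line.split()) for line in lines).upper()


def check_orf(seq):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if set(seq) - set("ACGT"):
        problems.append("non-ACGT characters")
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    if not seq.startswith("ATG"):
        problems.append("does not start with ATG")
    if seq[-3:] not in STOP_CODONS:
        problems.append("does not end with a stop codon")
    # Split into in-frame codons, ignoring any trailing partial codon.
    full = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, full, 3)]
    # The final codon is allowed to be a stop only when the frame is complete.
    internal = codons[:-1] if len(seq) % 3 == 0 else codons
    if any(codon in STOP_CODONS for codon in internal):
        problems.append("internal in-frame stop codon")
    return problems


if __name__ == "__main__":
    # Hypothetical toy input: eight codons split over two wrapped lines.
    toy = join_wrapped(["    ATGGTTTCTGGT", "    TCTAAGGCTTAA"])
    print(check_orf(toy))
```

A non-empty result only flags that the copied text deserves a second look, for example because a line was dropped or altered during extraction; it says nothing about whether the encoded polypeptide is active.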

Claims (17)

1. A method of producing (+)-manool, the method comprising:
a) contacting geranylgeranyl diphosphate (GGPP) with a copalyl diphosphate (CPP) synthase to form a copalyl diphosphate, wherein the CPP synthase comprises
a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and
b) contacting the CPP with a sclareol synthase to form the (+)-manool; and
c) optionally isolating the (+)-manool.
2. The method of claim 1, wherein the CPP synthase comprises
a) a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
b) a polypeptide comprising an amino acid sequence having at least 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
c) a polypeptide comprising an amino acid sequence having at least 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21.
3. The method of claim 1, wherein step a) further comprises culturing a non-human host organism or cell capable of producing GGPP and transformed to express at least one polypeptide comprising
a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21;
and having a CPP synthase activity, under conditions conducive to a production of CPP.
4. The method of claim 1, wherein the method further comprises, prior to step a), transforming a non-human host organism or cell capable of producing GGPP with
a) at least one nucleic acid encoding a polypeptide comprising
a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and
having a CPP synthase activity, so that said organism or cell expresses said polypeptide having a CPP synthase activity; and
b) at least one nucleic acid encoding a polypeptide having a sclareol synthase activity, so that said organism or cell expresses said polypeptide having a sclareol synthase activity.
5. The method as recited in claim 4, wherein the polypeptide having sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25.
6. The method as recited in claim 1, further comprising processing the (+)-manool to a (+)-manool derivative using a chemical or biochemical synthesis or a combination of both.
7. The method as recited in claim 6, wherein the derivative is an alcohol, acetal, aldehyde, acid, ether, ketone, lactone, acetate or an ester.
8. The method as recited in claim 6, wherein the derivative is selected from the group consisting of copalol, copalal, manooloxy, Z-11, gamma-ambrol and ambrox.
9. A method for transforming a host cell or non-human organism, the method comprising transforming a host cell or non-human organism with a nucleic acid encoding a polypeptide having a copalyl diphosphate synthase activity and a nucleic acid encoding a polypeptide having a sclareol synthase activity,
wherein the polypeptide having copalyl diphosphate synthase activity comprises
a) an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
b) an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
c) an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and
wherein the polypeptide having sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99%, or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25.
10. The method as recited in claim 4, wherein the host cell or non-human organism is a plant, a prokaryote, or a fungus.
11. The method as recited in claim 4, wherein the non-human host organism or cell is E. coli or Saccharomyces cerevisiae.
12. An expression vector comprising
a) a nucleic acid encoding a polypeptide having a CPP synthase activity comprising
a) an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
b) an amino acid sequence having at least 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
c) an amino acid sequence having at least 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; or
b) a nucleic acid encoding a polypeptide having a CPP synthase activity comprising a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to a nucleic acid sequence as set forth in SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, or SEQ ID NO: 32.
13. The expression vector of claim 12 further comprising
a) a nucleic acid encoding a polypeptide having a sclareol synthase activity, wherein the polypeptide having sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25; or
b) a nucleic acid encoding a polypeptide having a sclareol synthase activity comprising a nucleotide sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 6, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID NO: 33, or SEQ ID NO: 34.
14. A non-human host organism or cell comprising
a) the expression vector as recited in claim 12; or
b) a nucleic acid encoding a polypeptide having a copalyl diphosphate synthase activity and a nucleic acid encoding a polypeptide having a sclareol synthase activity,
wherein the polypeptide having copalyl diphosphate synthase activity comprises
i. an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 14 or SEQ ID NO: 15; or
ii. an amino acid sequence having at least 71%, 72%, 75%, 80%, 85%, 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 17 or SEQ ID NO: 18; or
iii. an amino acid sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 20 or SEQ ID NO: 21; and
wherein the polypeptide having sclareol synthase activity comprises an amino acid sequence having at least 90%, 95%, 98%, 99% or 100% sequence identity to SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 23, or SEQ ID NO: 25; and
wherein at least one of the nucleic acids is heterologous to the non-human host organism or cell.
15. The non-human host organism or cell of claim 14, wherein the non-human host organism or cell is a plant, a prokaryote, a fungus, Escherichia coli, or Saccharomyces cerevisiae.
16. The method as recited in claim 9, wherein the host cell or non-human organism is a plant, a prokaryote, or a fungus.
17. The method as recited in claim 9, wherein the non-human host organism or cell is E. coli or Saccharomyces cerevisiae.
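
The claims above repeatedly recite a minimum percentage of sequence identity to a reference SEQ ID NO. As an illustration only, the sketch below shows one common way such a percentage can be computed from two sequences that have already been aligned. It assumes that identity is counted as identical columns divided by all alignment columns (gap columns included, columns where both sequences have a gap ignored). This convention, the function names and the toy peptides are assumptions made for the example; the definition given in the specification governs how the claims are read.

```python
# Percent identity over a pre-computed pairwise alignment.
# Assumption: '-' marks a gap; identity = identical columns / counted columns.
def percent_identity(aligned_a, aligned_b):
    """Return percent identity for two aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column with two gaps carries no information
        columns += 1
        if a == b:  # double gaps were skipped, so equal symbols are residues
            matches += 1
    if columns == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / columns


def meets_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the aligned pair reaches at least the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical toy peptides differing at one of eight positions.
    print(percent_identity("MKTAYIAK", "MKTAYIAR"))
    print(meets_threshold("MKTAYIAK", "MKTAYIAR", 90))
```

Producing the alignment itself is outside this sketch; in practice an established alignment tool would be used, and the resulting figure depends on that tool's scoring parameters.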
US16/938,605 2016-12-22 2020-07-24 Production of manool Abandoned US20210010035A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/938,605 US20210010035A1 (en) 2016-12-22 2020-07-24 Production of manool

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
EP16206349 2016-12-22
EP16206349.9 2016-12-22
PCT/EP2017/083372 WO2018114839A2 (en) 2016-12-22 2017-12-18 Production of manool
US201916472120A 2019-06-20 2019-06-20
US16/938,605 US20210010035A1 (en) 2016-12-22 2020-07-24 Production of manool

Related Parent Applications (2)

Application Number Title Priority Date Filing Date
PCT/EP2017/083372 Division WO2018114839A2 (en) 2016-12-22 2017-12-18 Production of manool
US16/472,120 Division US10752922B2 (en) 2016-12-22 2017-12-18 Production of manool

Publications (1)

Publication Number Publication Date
US20210010035A1 (en) 2021-01-14

Family

ID=57629393

Family Applications (2)

Application Number Title Priority Date Filing Date
US16/472,120 Active US10752922B2 (en) 2016-12-22 2017-12-18 Production of manool
US16/938,605 Abandoned US20210010035A1 (en) 2016-12-22 2020-07-24 Production of manool

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US16/472,120 Active US10752922B2 (en) 2016-12-22 2017-12-18 Production of manool

Country Status (8)

Country Link
US (2) US10752922B2 (en)
EP (1) EP3559245B1 (en)
JP (1) JP7160811B2 (en)
CN (2) CN117604043A (en)
BR (1) BR112019013014A2 (en)
ES (1) ES2899121T3 (en)
MX (2) MX2019006635A (en)
WO (1) WO2018114839A2 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3997215A1 (en) * 2019-07-10 2022-05-18 Firmenich SA Biocatalytic method for the controlled degradation of terpene compounds
KR20240032089A (en) 2021-07-06 2024-03-08 아이소바이오닉스 비.브이. Recombinant production of C-20 terpenoid alcohols
CN114150011B (en) * 2021-11-17 2023-04-18 天津大学 Recombinant saccharomyces cerevisiae for heterogeneously synthesizing carnosic acid and construction method thereof

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE3681853D1 (en) 1985-08-12 1991-11-14 Firmenich & Cie METHOD FOR PRODUCING OXYGEN-CONTAINING DECALINE DERIVATIVES.
JPH0696571B2 (en) * 1986-12-24 1994-11-30 ハリマ化成株式会社 Method for producing 8α, 12-epoxy-13,14,15,16-tetranorlabdane
JPH07206655A (en) * 1994-01-14 1995-08-08 Pola Chem Ind Inc Melanogenesis suppressor and skin external preparation
US7294492B2 (en) 2005-01-07 2007-11-13 International Flavors & Fragrances Inc. Process for the manufacture of spiroketals
WO2009044336A2 (en) * 2007-10-05 2009-04-09 Firmenich Sa Method for producing diterpenes
CN104031945B (en) 2008-01-29 2017-08-11 弗门尼舍有限公司 The method for producing sclareol
CN101939430B (en) * 2008-02-15 2015-05-13 弗门尼舍有限公司 Method for producing sclareol
MX352318B (en) 2011-11-01 2017-11-21 Firmenich & Cie Cytochrome p450 and use thereof for the enzymatic oxidation of terpenes.
AU2012343263B2 (en) * 2011-11-21 2017-10-26 The University Of British Columbia Diterpene synthases and method for producing diterpenoids
US20150339371A1 (en) 2012-06-28 2015-11-26 Nokia Corporation Method and apparatus for classifying significant places into place categories
US9353385B2 (en) 2012-07-30 2016-05-31 Evolva, Inc. Sclareol and labdenediol diphosphate synthase polypeptides, encoding nucleic acid molecules and uses thereof
EP3502264A3 (en) * 2013-05-31 2019-11-06 DSM IP Assets B.V. Microorganisms for diterpene production
US20180037912A1 (en) * 2014-01-31 2018-02-08 University Of Copenhagen Methods for Producing Diterpenes
WO2015197075A1 (en) 2014-06-23 2015-12-30 University Of Copenhagen Methods and materials for production of terpenoids

Also Published As

Publication number Publication date
MX2021011770A (en) 2021-10-22
US10752922B2 (en) 2020-08-25
US20190352673A1 (en) 2019-11-21
WO2018114839A2 (en) 2018-06-28
EP3559245B1 (en) 2021-08-25
JP2020513755A (en) 2020-05-21
JP7160811B2 (en) 2022-10-25
CN110100003B (en) 2023-11-17
WO2018114839A3 (en) 2018-08-30
EP3559245A2 (en) 2019-10-30
ES2899121T3 (en) 2022-03-10
CN110100003A (en) 2019-08-06
CN117604043A (en) 2024-02-27
BR112019013014A2 (en) 2020-01-14
MX2019006635A (en) 2019-08-21

Similar Documents

Publication Publication Date Title
US20210010035A1 (en) Production of manool
CN104846020B (en) Method for producing sclareol
US11932894B2 (en) Method for producing albicanol and/or drimenol
US20180208948A1 (en) Drimenol synthases i
US11773414B2 (en) Sesquiterpene synthases for production of drimenol and mixtures thereof
US10385361B2 (en) Production of manool
US11293040B2 (en) Methods of producing sesquiterpene compounds
US10337031B2 (en) Production of fragrant compounds
BR112017028183B1 (en) MANOOL PRODUCTION

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general

Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION