US20220403346A1 - Production of cannabinoids - Google Patents

Production of cannabinoids

Info

Publication number
US20220403346A1
Authority
US
United States
Prior art keywords
seq
pks
variant
domain
composition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US17/636,322
Inventor
Maxim Mikheev
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Biomedican Inc
Original Assignee
Biomedican Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Biomedican Inc filed Critical Biomedican Inc
Priority to US17/636,322 priority Critical patent/US20220403346A1/en
Assigned to BIOMEDICAN, INC. reassignment BIOMEDICAN, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: MIKHEEV, MAXIM
Publication of US20220403346A1 publication Critical patent/US20220403346A1/en
Pending legal-status Critical Current

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10Transferases (2.)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/10Processes for the isolation, preparation or purification of DNA or RNA
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/11DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/52Genes encoding for enzymes or proenzymes
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004Oxidoreductases (1.)
    • C12N9/001Oxidoreductases (1.) acting on the CH-CH group of donors (1.3)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10Transferases (2.)
    • C12N9/1025Acyltransferases (2.3)
    • C12N9/1029Acyltransferases (2.3) transferring groups other than amino-acyl groups (2.3.1)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10Transferases (2.)
    • C12N9/1085Transferases (2.) transferring alkyl or aryl groups other than methyl groups (2.5)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12PFERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P17/00Preparation of heterocyclic carbon compounds with only O, N, S, Se or Te as ring hetero atoms
    • C12P17/02Oxygen as only ring hetero atoms
    • C12P17/06Oxygen as only ring hetero atoms containing a six-membered hetero ring, e.g. fluorescein
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12PFERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00Preparation of compounds containing saccharide radicals
    • C12P19/26Preparation of nitrogen-containing carbohydrates
    • C12P19/28N-glycosides
    • C12P19/30Nucleotides
    • C12P19/34Polynucleotides, e.g. nucleic acids, oligoribonucleotides
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y101/00Oxidoreductases acting on the CH-OH group of donors (1.1)
    • C12Y101/01Oxidoreductases acting on the CH-OH group of donors (1.1) with NAD+ or NADP+ as acceptor (1.1.1)
    • C12Y101/01034Hydroxymethylglutaryl-CoA reductase (NADPH) (1.1.1.34)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y103/00Oxidoreductases acting on the CH-CH group of donors (1.3)
    • C12Y103/03Oxidoreductases acting on the CH-CH group of donors (1.3) with oxygen as acceptor (1.3.3)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y121/00Oxidoreductases acting on X-H and Y-H to form an X-Y bond (1.21)
    • C12Y121/03Oxidoreductases acting on X-H and Y-H to form an X-Y bond (1.21) with oxygen as acceptor (1.21.3)
    • C12Y121/03007Tetrahydrocannabinolic acid synthase (1.21.3.7)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y121/00Oxidoreductases acting on X-H and Y-H to form an X-Y bond (1.21)
    • C12Y121/03Oxidoreductases acting on X-H and Y-H to form an X-Y bond (1.21) with oxygen as acceptor (1.21.3)
    • C12Y121/03008Cannabidiolic acid synthase (1.21.3.8)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y203/00Acyltransferases (2.3)
    • C12Y203/01Acyltransferases (2.3) transferring groups other than amino-acyl groups (2.3.1)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y205/00Transferases transferring alkyl or aryl groups, other than methyl groups (2.5)
    • C12Y205/01Transferases transferring alkyl or aryl groups, other than methyl groups (2.5) transferring alkyl or aryl groups, other than methyl groups (2.5.1)
    • C12Y205/0101(2E,6E)-Farnesyl diphosphate synthase (2.5.1.10), i.e. geranyltranstransferase
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y207/00Transferases transferring phosphorus-containing groups (2.7)
    • C12Y207/08Transferases for other substituted phosphate groups (2.7.8)
    • C12Y207/08007Holo-[acyl-carrier-protein] synthase (2.7.8.7)
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12YENZYMES
    • C12Y404/00Carbon-sulfur lyases (4.4)
    • C12Y404/01Carbon-sulfur lyases (4.4.1)
    • C12Y404/01026Olivetolic acid cyclase (4.4.1.26)

Definitions

  • the present disclosure relates to improved methods of producing cannabinoids.
  • Cannabinoids are a general class of chemicals that act on cannabinoid receptors and other target molecules to modulate a wide range of physiological behaviour such as neurotransmitter release.
  • Cannabinoids are produced naturally in humans (called endocannabinoids) and by several plant species (called phytocannabinoids) including Cannabis sativa .
  • Cannabinoids have been shown to have several beneficial medical/therapeutic effects and therefore they are an active area of investigation by the pharmaceutical industry for use as pharmaceutical products for various diseases.
  • Production of cannabinoids for pharmaceutical or other uses is done by chemical synthesis or through the extraction of cannabinoids from plants that produce these cannabinoids, for example C. sativa.
  • the chemical synthesis of various cannabinoids is a costly process when compared to the extraction of cannabinoids from naturally occurring plants.
  • the chemical synthesis of cannabinoids also involves the use of chemicals that are not environmentally friendly, which can be considered as an additional cost to their production.
  • synthetically produced cannabinoids have been classified as less pharmacologically active than those extracted from plants such as C. sativa.
  • the other method that is currently used to produce cannabinoids is production of cannabinoids in plants that naturally produce these chemicals; the most used plant for this is C. sativa .
  • the plant C. sativa is cultivated and during the flowering cycle various cannabinoids are produced naturally by the plant.
  • the plant can be harvested and the cannabinoids can be ingested for pharmaceutical purposes in various methods directly from the plant itself or the cannabinoids can be extracted from the plant.
  • Extraction places the part of the C. sativa plant that contains the cannabinoids into a chemical solution that selectively solubilizes the cannabinoids.
  • Various chemical solutions and methods are used to do this, such as hexane, cold water extraction methods, CO2 extraction methods, and others.
  • This chemical solution, now containing all the different cannabinoids, can then be removed, leaving behind the excess plant material.
  • the cannabinoid containing solution can then be further processed for use.
  • FIG. 1A illustrates a first enzymatic pathway as described herein for producing Compound I from the starting materials of either Compound III and/or Compound II.
  • FIG. 1B illustrates a second enzymatic pathway as described herein for producing Compound I from the starting materials of either Compound II and/or Acetyl-CoA and Malonyl CoA.
  • FIG. 1C illustrates a third enzymatic pathway as described herein for producing Compound I from the starting materials Acetyl-CoA and Malonyl CoA.
  • FIG. 2 is a diagram of the cannabinoid synthesis pathway, including nonenzymatic steps, starting with a CBGA-Analog.
  • FIG. 3 illustrates the enzymatic pathway as described herein for producing GPP from different carbohydrate sources.
  • FIG. 4 describes the structures for Compounds I, II, III and IV.
  • FIGS. 5A-B describe the structures for Cannabinoid Precursors (FIG. 5A) and Cannabinoids (FIG. 5B).
  • FIG. 6A is an alignment of SEQ ID NOs: 3-5 and 40 showing identical (*) vs conserved (.) amino acids between the aligned sequences.
  • FIG. 6B is an alignment of SEQ ID NOs: 3-5 and 40-42 showing identical (*) vs conserved (.) amino acids between the six sequences.
  • FIG. 7 provides a list of abbreviations used throughout the specification.
  • FIG. 8 is an enzymatic assay used to illustrate the effect of different mutations in the NphB gene on the production of Olivetolic Acid.
  • FIG. 9A is a Western blot showing the production of cytoplasmic THCAS when no ProA signal sequence is used.
  • FIG. 9B shows the production of correctly glycosylated THCAS when ProA24 is used in dPRB1, dPEP4 and dPRB1+dPEP4 knockout yeast strains.
  • FIG. 9C shows that the ProA19-ProA24 signal sequence can produce equally large amounts of THCAS.
  • FIG. 9D shows THCA production is 10 times greater when produced in dPRB1 and/or dPEP4 knockout strains with THCAS fused to a ProA signal sequence.
  • a cannabinoid precursor includes a plurality of precursors, including mixtures thereof.
  • a polynucleotide includes a plurality of polynucleotides.
  • compositions and methods include the recited elements, but do not exclude other elements.
  • Consisting essentially of shall mean excluding other elements of any essential significance to the combination. Thus, compositions consisting essentially of produced cannabinoids would not exclude trace contaminants from the isolation and purification method and pharmaceutically acceptable carriers, such as phosphate buffered saline, preservatives, and the like.
  • Consisting of shall mean excluding more than trace elements of other ingredients and substantial method steps for produced cannabinoids. Embodiments defined by each of these transition terms are within the scope of this invention.
  • the term “about” or “approximately” means within an acceptable range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, e.g., the limitations of the measurement system. For example, “about” can mean a range of up to 20%, preferably up to 10%, more preferably up to 5%, and more preferably still up to 1% of a given value. Alternatively, particularly with respect to biological systems or processes, the term can mean within an order of magnitude, preferably within 5 fold, and more preferably within 2 fold, of a value. Unless otherwise stated, the term “about” means within an acceptable error range for the particular value, such as ±1-20%, preferably ±1-10% and more preferably ±1-5%.
  • polynucleotide and “nucleic acid molecule” are used interchangeably to refer to polymeric forms of nucleotides of any length.
  • the polynucleotides may contain deoxyribonucleotides, ribonucleotides, and/or their analogs.
  • Nucleotides may have any three-dimensional structure, and may perform any function, known or unknown.
  • polynucleotide includes, for example, single-, double-stranded and triple helical molecules, a gene or gene fragment, exons, introns, mRNA, tRNA, rRNA, ribozymes, antisense molecules, cDNA, recombinant polynucleotides, branched polynucleotides, aptamers, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers.
  • a nucleic acid molecule may also comprise modified nucleic acid molecules (e.g., comprising modified bases, sugars, and/or internucleotide linkers).
  • peptide refers to a compound of two or more subunit amino acids, amino acid analogs, or peptidomimetics.
  • the subunits may be linked by peptide bonds or by other bonds (e.g., as esters, ethers, and the like).
  • amino acid refers to either natural and/or unnatural or synthetic amino acids, including glycine and both D or L optical isomers, and amino acid analogs and peptidomimetics.
  • a peptide of three or more amino acids is commonly called an oligopeptide if the peptide chain is short. If the peptide chain is long (e.g., greater than about 10 amino acids), the peptide is commonly called a polypeptide or a protein.
  • protein encompasses the term “polypeptide”
  • a “polypeptide” may be a less than full-length protein.
  • expression refers to the process by which polynucleotides are transcribed into mRNA and/or translated into peptides, polypeptides, or proteins. If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA transcribed from the genomic DNA.
  • under transcriptional control or “operably linked” refers to expression (e.g., transcription or translation) of a polynucleotide sequence which is controlled by an appropriate juxtaposition of an expression control element and a coding sequence.
  • a DNA sequence is “operatively linked” to an expression control sequence when the expression control sequence controls and regulates the transcription of that DNA sequence.
  • coding sequence is a sequence which is transcribed and translated into a polypeptide when placed under the control of appropriate expression control sequences. The boundaries of a coding sequence are determined by a start codon at the 5′ (amino) terminus and a translation stop codon at the 3′ (carboxyl) terminus.
  • a coding sequence can include, but is not limited to, a prokaryotic sequence, cDNA from eukaryotic mRNA, a genomic DNA sequence from eukaryotic (e.g., yeast, or mammalian) DNA, and even synthetic DNA sequences.
  • a polyadenylation signal and transcription termination sequence will usually be located 3′ to the coding sequence.
  • two coding sequences “correspond” to each other if the sequences or their complementary sequences encode the same amino acid sequences.
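The boundary rule for a coding sequence (a start codon at the 5′ end and an in-frame translation stop codon at the 3′ end) can be illustrated with a minimal sketch. The function name `find_coding_sequence` is hypothetical, and the sketch assumes the standard genetic code (ATG start; TAA, TAG, TGA stops) on a single DNA strand. It is not a method taken from the disclosure.

```python
from typing import Optional

# Assumption: standard genetic code stop codons, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna: str) -> Optional[str]:
    """Return the first ATG...stop open reading frame on the given strand, or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the reading frame set by this start codon.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        # No in-frame stop codon found; try the next ATG downstream.
        start = dna.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    print(find_coding_sequence("ccATGGCTTAAgg"))
```

The sketch ignores introns, the reverse strand, and alternative start codons, all of which a real annotation tool would need to handle.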
  • signal sequence denotes the endoplasmic reticulum translocation sequence. This sequence encodes a signal peptide that communicates to a cell to direct a polypeptide to which it is linked (e.g., via a chemical bond) to an endoplasmic reticulum vesicular compartment, to enter an exocytic/endocytic organelle, to be delivered either to a cellular vesicular compartment, the cell surface or to secrete the polypeptide. This signal sequence is sometimes clipped off by the cell in the maturation of a protein. Signal sequences can be found associated with a variety of proteins native to prokaryotes and eukaryotes.
  • hybridization refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues.
  • the hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner.
  • the complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these.
  • a hybridization reaction may constitute a step in a more extensive process, such as the initiation of a PCR reaction, or the enzymatic cleavage of a polynucleotide by a ribozyme.
  • Two polypeptide sequences are “substantially homologous” or “substantially similar” when at least about 50%, at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99% of amino acid residues of the polypeptide match conservative amino acids over a defined length of the polypeptide sequence.
  • Sequences that are similar can be identified by comparing the sequences using standard software available in sequence data banks.
  • Substantially homologous nucleic acid sequences also can be identified in a Southern hybridization experiment under, for example, stringent conditions as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art.
  • stringent conditions can be: hybridization at 5× SSC and 50% formamide at 42° C., and washing at 0.1× SSC and 0.1% sodium dodecyl sulfate at 6° C.
  • Further examples of stringent hybridization conditions include: incubation temperatures of about 25 degrees C. to about 37 degrees C.; hybridization buffer concentrations of about 6× SSC to about 10× SSC; formamide concentrations of about 0% to about 25%; and wash solutions of about 6× SSC.
  • Examples of moderate hybridization conditions include: incubation temperatures of about 40 degrees C. to about 50 degrees C.; buffer concentrations of about 9× SSC to about 2× SSC; formamide concentrations of about 30% to about 50%; and wash solutions of about 5× SSC to about 2× SSC.
  • Examples of high stringency conditions include: incubation temperatures of about 55 degrees C. to about 68 degrees C.; buffer concentrations of about 1× SSC to about 0.1× SSC; formamide concentrations of about 55% to about 75%; and wash solutions of about 1× SSC, 0.1× SSC, or deionized water.
  • hybridization incubation times are from 5 minutes to 24 hours, with 1, 2, or more washing steps, and wash incubation times are about 1, 2, or 15 minutes.
  • SSC is 0.15 M NaCl and 15 mM citrate buffer. It is understood that equivalents of SSC using other buffer systems can be employed. Similarity can be verified by sequencing, but preferably, is also or alternatively, verified by function (e.g., ability to traffic to an endosomal compartment, and the like), using assays suitable for the particular domain in question.
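Because the hybridization conditions are stated as multiples of SSC, the 1× definition given here (0.15 M NaCl and 15 mM citrate) lets the component concentrations be worked out for any multiple. The short sketch below does that arithmetic. The function name `ssc_concentrations` is hypothetical, and simple linear scaling with the fold value is assumed.

```python
from typing import Tuple

# 1x SSC as defined in the text: 0.15 M NaCl and 15 mM citrate buffer.
NACL_M_AT_1X = 0.15
CITRATE_MM_AT_1X = 15.0


def ssc_concentrations(fold: float) -> Tuple[float, float]:
    """Return (NaCl in M, citrate in mM) for a buffer at `fold` x SSC."""
    if fold < 0:
        raise ValueError("fold must be non-negative")
    # Assumption: both components scale linearly with the fold value.
    return fold * NACL_M_AT_1X, fold * CITRATE_MM_AT_1X


if __name__ == "__main__":
    for fold in (5, 1, 0.1):
        nacl, citrate = ssc_concentrations(fold)
        print(f"{fold}x SSC: {nacl:.3f} M NaCl, {citrate:.2f} mM citrate")
```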
  • sequence similarity generally refers to the degree of identity or similarity between different nucleotide sequences of nucleic acid molecules or amino acid sequences of polypeptides that may or may not share a common evolutionary origin (see Reeck et al., supra). Sequence identity can be determined using any of a number of publicly available sequence comparison algorithms, such as BLAST, FASTA, DNA Strider, GCG (Genetics Computer Group, Program Manual for the GCG Package, Version 7, Madison, Wis.), etc.
  • the sequences are aligned for optimal comparison purposes.
  • the two sequences are, or are about, of the same length.
  • the percent identity between two sequences can be determined using techniques similar to those described below, with or without allowing gaps. In calculating percent sequence identity, typically exact matches are counted.
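Counting exact matches over an alignment can be sketched as follows. The function name `percent_identity` and the `count_gaps` switch are hypothetical. Whether gap columns belong in the denominator is an assumption made here for illustration, since the text only says that identity may be determined with or without allowing gaps.

```python
def percent_identity(aligned_a: str, aligned_b: str, count_gaps: bool = True) -> float:
    """Percent of compared alignment columns that are exact matches.

    Both inputs must already be aligned (same length, '-' marks a gap).
    count_gaps=True keeps gap columns in the denominator; False skips them.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            if count_gaps:
                compared += 1  # a gap column is never a match
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACGT", "A-GT"))
    print(percent_identity("ACGT", "A-GT", count_gaps=False))
```

Percent similarity would follow the same pattern, except that a column also counts when the two residues are conservative substitutions under a chosen scoring matrix.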
  • the determination of percent identity between two sequences can be accomplished using a mathematical algorithm.
  • a non-limiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin and Altschul, Proc. Natl. Acad. Sci. USA 1990, 87:2264, modified as in Karlin and Altschul, Proc. Natl. Acad. Sci. USA 1993, 90:5873-5877.
  • Such an algorithm is incorporated into the NBLAST and XBLAST programs of Altschul et al, J. Mol. Biol. 1990; 215: 403.
  • Gapped BLAST can be utilized as described in Altschul et al, Nucleic Acids Res. 1997, 25:3389.
  • PSI-Blast can be used to perform an iterated search that detects distant relationship between molecules. See Altschul et al. (1997) supra.
  • the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.
  • the sequences are also aligned for optimal comparison purposes.
  • the two sequences are, or are about, of the same length.
  • the percent similarity between two sequences can be determined using techniques similar to those described below, with or without allowing gaps. In calculating percent sequence similarity, typically conserved matches are counted.
  • the percent identity between two amino acid sequences is determined using the algorithm of Needleman and Wunsch (J. Mol. Biol. 1970, 48:444-453), which has been incorporated into the GAP program in the GCG software package (Accelrys, Burlington, Mass.; available at accelrys.com on the WorldWideWeb), using either a BLOSUM62 matrix or a PAM250 matrix, a gap weight of 16, 14, 12, 10, 8, 6, or 4, and a length weight of 1, 2, 3, 4, 5, or 6.
  • the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package using a NWSgapdna.CMP matrix, a gap weight of 40, 50, 60, 70, or 80, and a length weight of 1, 2, 3, 4, 5, or 6.
  • a particularly preferred set of parameters is using a BLOSUM62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
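The global alignment idea behind the Needleman and Wunsch algorithm can be shown with a compact sketch. This is a simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM62 or PAM250 matrix, and a single linear gap penalty in place of separate gap and length weights. The default values are arbitrary illustrative choices.

```python
from typing import Tuple


def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> Tuple[str, str, int]:
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

For real protein comparisons, an established implementation with a substitution matrix and affine gap penalties would normally be used instead of this sketch.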
  • percent identity can be determined by using software programs such as those described in Current Protocols In Molecular Biology (F. M. Ausubel et al., eds., 1987) Supplement 30, section 7.7.18, Table 7.7.1.
  • default parameters are used for alignment.
  • a preferred alignment program is BLAST, using default parameters.
  • Conservatively modified variants of domain sequences also can be provided.
  • conservatively modified variants refer to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences.
  • degenerate codon substitutions can be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer, et al., 1991, Nucleic Acid Res. 19: 5081; Ohtsuka, et al., 1985, J. Biol. Chem. 260: 2605-2608; Rossolini et al., 1994, Mol. Cell. Probes 8: 91-98).
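A degenerate codon substitution of this kind can be sketched by rewriting the third position of selected codons with a placeholder symbol. The function name `degenerate_third_positions` is hypothetical. Using the IUPAC code `N` (any base) as the mixed-base symbol, or `I` for deoxyinosine, is an assumption made for illustration.

```python
from typing import Iterable, Optional


def degenerate_third_positions(cds: str,
                               codon_indices: Optional[Iterable[int]] = None,
                               symbol: str = "N") -> str:
    """Replace the third base of selected codons (all codons if None) with `symbol`."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codon_indices is None:
        selected = set(range(len(codons)))
    else:
        selected = set(codon_indices)
    return "".join(codon[:2] + symbol if idx in selected else codon
                   for idx, codon in enumerate(codons))


if __name__ == "__main__":
    print(degenerate_third_positions("ATGGCTTAA"))
    print(degenerate_third_positions("ATGGCTTAA", codon_indices=[1]))
```

Not every codon tolerates any base at its third position (ATG for methionine and TGG for tryptophan are single-codon amino acids), so in practice the codons to degenerate would be selected rather than all rewritten.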
  • variants of the disclosed gene retain the activity of the wild type protein from which the variant was derived, although the activity may not be at the same level.
  • the variants have at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence.
  • the variant has improved activity as compared to the original sequence.
  • variants with improved activity have at least about 110%, at least about 120%, at least about 130%, at least about 140%, at least about 150%, or at least about 160% efficacy compared to the original sequence.
  • For example, a variant of a common cannabinoid synthesising protein such as CBDAS must retain the ability to cyclize CBGA to produce CBDA with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence.
  • biologically active fragment possesses a biological activity that is at least substantially equal to (e.g., not significantly different from) the biological activity of the wild type protein as measured using an assay suitable for detecting the activity.
  • the term “isolated” or “purified” means separated (or substantially free) from constituents, cellular and otherwise, in which the polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, are normally associated with in nature.
  • a non-naturally occurring polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof does not require “isolation” to distinguish it from its naturally occurring counterpart.
  • By “substantially free” or “substantially purified” it is meant that at least 50% of the population, preferably at least 70%, more preferably at least 80%, and even more preferably at least 90%, are free of the components with which they are associated in nature.
  • a cell has been “transformed”, “transduced”, or “transfected” when nucleic acids have been introduced inside the cell.
  • Transforming DNA may or may not be integrated (covalently linked) with chromosomal DNA making up the genome of the cell.
  • the polynucleotide may be maintained on an episomal element, such as a plasmid. A stably transformed cell is one in which the polynucleotide has become integrated into a chromosome so that it is inherited by daughter cells through chromosome replication. This stability is demonstrated by the ability of the cell to establish cell lines or clones comprised of a population of daughter cells containing the transformed polynucleotide.
  • a “clone” is a population of cells derived from a single cell or common ancestor by mitosis.
  • a “cell line” is a clone of a primary cell that is capable of stable growth in vitro for many generations (e.g., at least about 10).
  • a “vector” includes plasmids and viruses and any DNA or RNA molecule, whether self-replicating or not, which can be used to transform or transfect a cell.
  • a “genetic modification” refers to any addition, deletion and/or substitution to a cell's normal nucleotides and/or addition of heterologous sequences. Any method which can achieve the genetic modification is within the spirit and scope of this invention. Art recognized methods include viral mediated gene transfer, liposome mediated transfer, transformation, transfection and transduction.
  • A high-level biosynthetic route to produce cannabinoids and/or cannabinoid precursors is shown in FIGS. 1-3.
  • the focus of one of these pathways is the production of Compound I from Compound II as shown in FIGS. 1A-1B using a PKS Enzyme in combination with a npgA Enzyme. Additional pathways can be added to this core pathway, including the production of (a) Compound II from Compound III; and/or (b) the production of Compound II from Acetyl-CoA and Malonyl CoA; and/or (c) the production of Compound III from Compound IV; and/or (d) the production of Compound III from Compound IV.
  • FIG. 1C shows the production of Compound I from acetyl-CoA and malonyl CoA using the described enzymes.
  • the biosynthetic routes as shown in FIGS. 1-3 can be used to produce Compounds described in FIGS. 4-5.
  • the compounds comprise identical core structures but comprise different lengths in the C-tails (C-3 Tail, C-5 Tail, or C-7 Tail).
  • the starting materials e.g., Compound I-IV
  • the enzymatic pathways described herein can be used to convert each core structure.
  • Compound I can be enzymatically produced from Compound II using a PKS Enzyme in combination with a npgA Enzyme.
  • PKS Enzyme is defined as any one of the following amino acid sequences:
  • sequences corresponding to SEQ ID NO:1-7 and 40-42 are as follows:
  • Uncialis -PKS (GenBank Accession AUW31177.1) SEQ ID NO: 5 MTLPNNVVLFGDQTVDPCPIIKQLYRQSRDSLTLQALFRQSYDAVRREIATSEYSDRTLFPSFD SIQGLAEKQTERHNEAVSTVLHCIAQLGLLLIHADQDDFRLDARPSRTYLVGLCTGMLPAAALA ASSSASQLLRLAPEIVLVALRLGLEANRRSAQIEASTESWASVVPGMAPQEQQEALAQFNDEFM IPTSKQAYISAESDSTATLSGPPSTLVSLFSLSDSFRKARRIKLPITAAFHAPHLRLPNVEKII GSLSHSDEYPLRNDVVIISTRSGKPITAQSLGDALQHIILDILREPIRWSTVVEEMINNFEDQG ANLTSVGPVRAADSLRQRMATAGIEILKSTELQPQEPRTKTRSNDIAIIGYAARLPESETLEE AWKILEDGRDVHKKIPSDRF
  • Grayi -PKS-dACP2 SEQ ID NO: 7 MTLPNNVVLFGDQTVDPCPIIKQLYRQSRDSLTLQTLFRQSYDAVRREIATSEASDRALFPSFD SFQDLAEKQNERHNEAVSTVLLCIAQLGLLMIHVDQDDSTFDARPSRTYLVGLCTGMLPAAALA ASSSTSQLLRLAPEIVLVALRLGLEANRRSAQIEASTESWASVVPGMAPQEQQEALAQFNDEFM IPTSKQAYISAESDSSATLSGPPSTLLSLFSSSDIFKKARRIKLPITAAFHAPHLRVPDVEKIL GSLSHSDEYPLRNDVVIVSTRSGKPITAQSLGDALQHIIMDILREPMRVVSRWEEMINGLKDQG AILTSAGPVRAADSLRQRMASAGIEVSRSTEMQPRQEQRTKPRSSDIAIIGYAARLPESETLEE VWKILEDGRDVHKKIPSDRFDVDTHCDPSGKIKNTSYTP
  • furfuracea -PKS
  • SEQ ID NO: 40 MTTTSRVVLFGDQTVDPSPLIKQLCRHSTHSLTLQTFLQKTYFAVRQELAICEISDRANFPSFD
  • variants of SEQ ID NOs:1-7 and 40-42 are made to retain PKS activity while retaining only one active ACP domain, the location of which is defined in Table 2:
  • Mutations that inactivate an ACP domain can be made by mutating the highly conserved amino acids of the ACP domain, while retaining the PKS activity. Examples of such mutations include:
  • In PKS Variant Enzymes, when two ACP domains are present, the PKS activity is retained.
  • amino acids that should be maintained include those that are known to be highly conserved between homologs and/or orthologs.
  • PKS Enzymes including the described variants derived from SEQ ID NO:1-5 or 40 in combination with a npgA Enzyme can be used to produce Compound I from Compound II in the methods described herein.
  • Variants of such PKS enzymes retain the ability to catalyze the conversion of Compound II into Compound I in combination with a npgA Enzyme, with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence.
  • a variant PKS enzyme has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalyzing the conversion of Compound II into Compound I as compared to the sequence from which the improved variant is derived.
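  • The retained-activity and improved-activity thresholds above can be expressed as a simple comparison against the parent sequence. The following is a minimal sketch, assuming that conversion rates for the variant and the parent have been measured in the same arbitrary units; the function names and the example rates are hypothetical, and the default thresholds are the lowest tiers recited above (50% and 110%).

```python
def relative_activity(variant_rate, parent_rate):
    """Variant activity as a percentage of the parent sequence's activity."""
    if parent_rate <= 0:
        raise ValueError("parent_rate must be positive")
    return 100.0 * variant_rate / parent_rate


def classify_variant(variant_rate, parent_rate, retain_min=50.0, improved_min=110.0):
    """Label a variant as improved, retained, or below threshold.

    retain_min and improved_min default to the lowest tiers in the text;
    stricter tiers (e.g. 90.0 or 150.0) can be passed instead.
    """
    pct = relative_activity(variant_rate, parent_rate)
    if pct > improved_min:
        return "improved"
    if pct >= retain_min:
        return "retained"
    return "below threshold"


# Hypothetical measurements in arbitrary units.
print(classify_variant(6.0, 10.0))   # 60% of parent activity
print(classify_variant(12.0, 10.0))  # 120% of parent activity
```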
  • any of these PKS Enzymes (including the described variants) derived from SEQ ID NO:41 or 42 in combination with SEQ ID NO:43 or 44 (including variants) along with a npgA enzyme can be used to produce Compound I from acetyl-CoA and malonyl-CoA in the methods described herein.
  • Variants of such PKS enzymes retain the ability to catalyse the conversion of acetyl-CoA and malonyl-CoA into Compound I with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence from which the variant sequence was derived.
  • such a variant PKS enzyme derived from SEQ ID NO:41 or 42 has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalysing the conversion of acetyl-CoA and malonyl-CoA into Compound I as compared to the sequence from which the improved variant is derived.
  • cs-OLAS-1 (SEQ ID NO:41) when combined with cs-HEX-1 (SEQ ID NO:43) and a npgA enzyme can generate Olivetolic Acid from acetyl-CoA and malonyl CoA.
  • Diviaric Acid-Synthase (pp-DVAS-1)(SEQ ID NO:42), Butiryl synthase (pp-BUT-1) (SEQ ID NO:44), and a npgA enzyme can produce Diviaric Acid from acetyl-CoA and malonyl CoA.
  • Variants derived from these sequences as described herein can also be used so long as the variants retain the ability to produce Olivetolic Acid or Diviaric Acid (respectively) as compared to the sequences from which the variants were derived.
  • cs-OLAS-1 variant enzymes comprise a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:41.
  • cs-OLAS-1 variant enzymes comprise a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:41.
  • When producing Olivetolic Acid, any of these cs-OLAS-1 variant enzymes can be used in combination with a cs-HEX-1 enzyme (including variants) as described herein.
  • cs-HEX-1 variant enzymes comprise a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:43.
  • cs-HEX-1 variant enzymes comprise a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:43.
  • pp-DVAS-1 variant enzymes comprise a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:42.
  • pp-DVAS-1 variant enzymes comprise a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:42.
  • any of these pp-DVAS-1 variant enzymes can be used in combination with a Butiryl (pp-BUT-1) synthase (including variants) as described herein.
  • Butiryl (pp-BUT-1) synthase variants comprise a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:44.
  • Butiryl (pp-BUT-1) synthase variants comprise a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:44.
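  • Whether a candidate variant meets one of the recited sequence identity thresholds can be checked once the two sequences have been aligned. The following is a minimal sketch, assuming the sequences are already aligned to equal length with `-` as the gap character and that identity is counted as matching columns divided by all aligned columns; other definitions of percent identity exist, and the function names and the short example fragments are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a, aligned_b, threshold=70.0):
    """True if the alignment reaches the chosen identity threshold (in percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-column alignment: one substitution and one gap.
reference = "MTLPNNVVLF"
candidate = "MTLPQNVV-F"
print(percent_identity(reference, candidate))
print(meets_identity(reference, candidate, threshold=85.0))
```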
  • sequences corresponding to SEQ ID NO:43 and 44 are as follows:
  • the inventors have discovered that PKS Enzymes derived from SEQ ID NO:1-5 or 40-44 require activation of the ACP domain. NpgA can catalyze this reaction.
  • the npgA enzyme comprises the following sequence (SEQ ID NO:8):
  • a npgA enzyme refers to any one or combination of the enzymes listed in Table 3 and/or SEQ ID NOs:8 or 31-33.
  • variants of any of these npgA enzymes can be used in combination with PKS Enzyme described herein to produce Compound I from Compound II in the methods described herein.
  • variants of the npgA enzymes retain the ability to catalyze the conversion of Compound II into Compound I in combination with a PKS Enzyme derived from SEQ ID NO:1-5 or 40, with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence.
  • a variant npgA enzyme has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalyzing the conversion of Compound II into Compound I as compared to the sequence from which the improved variant is derived.
  • variants of the npgA enzymes retain the ability to catalyze the conversion of malonyl-CoA and acetyl-CoA in combination with cs-OLAS-1 of SEQ ID NO:41 (or variant thereof) in combination with the cs-HEX-1 of SEQ ID NO:43 (or variant thereof), with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence from which the npgA variant is derived.
  • a variant npgA enzyme has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalyzing the conversion of malonyl-CoA and acetyl-CoA in combination with the enzymes of SEQ ID NO: 41 and 43 (or variants thereof) as compared to the npgA sequence from which the improved variant is derived.
  • variants of the npgA enzymes retain the ability to catalyze the conversion of malonyl-CoA and acetyl-CoA in combination with pp-DVAS-1 of SEQ ID NO:42 (or variant thereof) in combination with a pp-BUT-1 of SEQ ID NO:44 (or variant thereof), with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence from which the npgA variant is derived.
  • a variant npgA enzyme has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalyzing the conversion of malonyl-CoA and acetyl-CoA in combination with the enzymes of SEQ ID NO: 42 and 44 (or variants thereof) as compared to the npgA sequence from which the improved variant is derived.
  • npgA homolog from P. furfuracea (SEQ ID NO: 31) MTYHLCNADDDDGDGQTKAFRWLLDVQALWPAPGGGSQSAQSTAHWAT GTAAQHALALLADGERARALRFYRPSDAKLSLGSNLLKHRAIANTCRV PWSEAVISEGANRKPCYKPLGPRSKSLEFNVSHHGSLVALVGCPGEAV KLGVDVVKMNWERDYTTVMKDGFEAWANVYEAVFSEREIKDIAGFVPP IRGTQPDEIRAKLRHFYTHWCLKEAYVKMTGEALLAPWLKDLEFRNVQ VPLPASQMHASGQIGGDWGQTCGGVEIWFYGKRVTDVRLEIQAFREDY MIGTASSSVEMGLSVFKELDVERDVYPTQET npgA homolog from C.
  • Compound II can be produced by two different mechanisms.
  • Compound II can be produced by enzymatically converting Compound III into Compound II by an enzyme selected from AAL1, AAL1ΔSKL, and/or CsAAE1.
  • the AAL1 enzyme comprises the following sequence (SEQ ID NO:9):
  • the AAL1ΔSKL sequence is identical to SEQ ID NO:9, except that amino acids 614-616 have been deleted.
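  • The deletion described above can be illustrated as a string operation on the protein sequence. The following is a minimal sketch; the function name is hypothetical, and the placeholder sequence below is not SEQ ID NO:9 but an assumed 616-residue stand-in ending in S-K-L, used only to show the 1-based, inclusive position arithmetic.

```python
def delete_residues(sequence, first, last):
    """Delete residues first..last (1-based, inclusive) from a protein sequence."""
    if not (1 <= first <= last <= len(sequence)):
        raise ValueError("positions out of range")
    # Python slices are 0-based and end-exclusive, hence first - 1 and last.
    return sequence[: first - 1] + sequence[last:]


# Placeholder only: 613 arbitrary residues followed by S-K-L (616 residues in total).
placeholder = "A" * 613 + "SKL"
deleted = delete_residues(placeholder, 614, 616)
print(len(placeholder), len(deleted))
```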
  • the CsAAE1 enzyme comprises the following sequence (SEQ ID NO:10):
  • variants of AAL1, AAL1ΔSKL, and/or CsAAE1 can also be used to produce Compound II from Compound III in the methods described herein.
  • Variants of the AAL1, AAL1ΔSKL, and/or CsAAE1 retain the ability to catalyze the conversion of Compound III into Compound II with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence.
  • a variant AAL1, AAL1ΔSKL, and/or CsAAE1 enzyme has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalyzing the conversion of Compound III into Compound II as compared to the sequence from which the improved variant is derived.
  • The second way in which Compound II can be produced is shown in FIG. 1 B.
  • Acetyl-CoA and Malonyl CoA are enzymatically converted to produce Compound II using a combination of enzymes selected from:
  • HexA & HexB encode the alpha (hexA) and beta (hexB) subunits of the hexanoate synthase (HexS) from Aspergillus parasiticus SU-1 (Hitchman et al. 2001).
  • the genes StcJ and StcK are from Aspergillus nidulans and encode yeast-like FAS proteins (Brown et al. 1996).
  • many fungi would have hexanoate synthase or fatty acid synthase genes, which could readily be identified by sequencing of the DNA and sequence alignments with the known genes disclosed herein. Similarly, the skilled person would understand that homologous genes in different organisms may also be suitable.
  • Mutated FAS produces short-chain fatty acids, such as hexanoic acid.
  • Several different combinations of mutations enable the production of hexanoic acid.
  • the mutations include: FAS1 I306A and FAS2 G1250S; FAS1 I306A and FAS2 G1250S and M1251W; and FAS1 I306A, R1834K and FAS2 G1250S (Gajewski et al. 2017).
  • Mutated FAS2 and FAS1 may be expressed under the control of any suitable promoter, including, but not limited to the alcohol dehydrogenase II promoter of Y. lipolytica .
  • genomic FAS2 and FAS1 can be directly mutated using, for example, homologous recombination or CRISPR-Cas9 genome editing technology.
  • HexA comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:16.
  • HexA comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:16.
  • HexB comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:17.
  • HexB comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:17.
  • StcJ comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:18.
  • StcJ comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:18.
  • StcK comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:19.
  • StcK comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:19.
  • FAS2 comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:20 and one of the combinations of mutations defined above.
  • FAS2 comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:20 and one of the combinations of mutations defined above.
  • FAS1 comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:21 and one of the combinations of mutations defined above.
  • FAS1 comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:21 and one of the combinations of mutations defined above.
  • Variants of the Compound II producing proteins retain the ability to catalyse the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH.
  • a variant of a Compound II producing protein must retain the ability to catalyse the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence.
  • a variant of a Compound II producing protein has improved activity over the sequence from which it is derived in that the improved variant common cannabinoid protein has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalysing the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH, as compared to the sequence from which the improved variant is derived.
  • the hexanoyl-CoA synthases HexA & HexB, StcJ & StcK, or mutated FAS1&2 may be expressed using, for example, a constitutive TEF intron promoter or native promoter (Wong et al. 2017) and synthesized short terminator (Curran et al. 2015).
  • the production of Compound II may be determined by directly measuring the concentration of Compound II using LC-MS.
  • Compound III can be enzymatically produced from Compound IV using, for example, ADH alone or ADH in combination with FAO and one of the four FALDH enzymes (FALDH1-4). See, for example, Gatter, M., et al., (2014) FEMS Yeast Research 14(6), 858-872 and Salid, A., et al., (2013) Applied Biochemistry and Biotechnology 171(8), 2273-2284. Carbon sources that can be used to produce Compound III include alkanes, such as, for example, hexane and octane.
  • FIG. 3 describes the preferred method of producing GPP.
  • GPP may be produced by a mutated farnesyl diphosphate synthase.
  • ERG20 condenses isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) to provide geranyl pyrophosphate (GPP) and then condenses GPP with a further molecule of IPP to provide farnesyl pyrophosphate (FPP).
  • A mutated ERG20 that has a reduced ability or an inability to produce FPP may be used to increase the production of GPP.
  • Two sets of mutations have been identified in S. cerevisiae that increase GPP production.
  • the first mutation is a substitution of K197E and the second is a double substitution of F96W and N127W.
  • equivalent mutations may be introduced into ERG20 from Y. lipolytica .
  • the first mutation is a substitution of K189E and the second is a double substitution of F88W and N119W.
  • Introducing Y. lipolytica ERG20 (K189E) increases the production of GPP, but growth is slightly slower compared to wild-type yeast. Introducing Y. lipolytica ERG20 (F88W and N119W) produces fast-growing clones with a high level of GPP.
  • the sequences for the Y. lipolytica and S. cerevisiae genes are shown herein; however, the skilled person would understand that homologous genes may also be suitable. Examples of ERG20 homologs are shown in Table 8. Accordingly, in certain embodiments, the one or more GPP producing genes comprise: a mutated farnesyl diphosphate synthase; a mutated S. cerevisiae ERG20 comprising a K197E substitution; a double mutated S. cerevisiae ERG20 comprising F96W and N127W substitutions; a mutated Y. lipolytica ERG20 comprising a K189E substitution; or a double mutated Y. lipolytica ERG20 comprising F88W and N119W substitutions; or a combination thereof.
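  • Substitutions written in the form used above (original residue, 1-based position, replacement residue, e.g. F88W) can be applied to a sequence programmatically, with a check that the expected original residue is actually present at the stated position. The following is a minimal sketch; the function name and the five-residue toy sequence are hypothetical and are not taken from any SEQ ID NO.

```python
import re

# One-letter original residue, 1-based position, one-letter replacement (e.g. "F88W").
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence, substitutions):
    """Apply point substitutions, verifying the original residue at each position."""
    residues = list(sequence)
    for label in substitutions:
        match = _SUBSTITUTION.match(label)
        if match is None:
            raise ValueError(f"unrecognized substitution: {label!r}")
        original, replacement = match.group(1), match.group(3)
        position = int(match.group(2))
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        found = residues[position - 1]
        if found != original:
            raise ValueError(f"{label}: expected {original}, found {found}")
        residues[position - 1] = replacement
    return "".join(residues)


# Toy sequence only; positions 3 and 5 stand in for the real ERG20 positions.
print(apply_substitutions("MAFKN", ["F3W", "N5W"]))
```

Checking the original residue guards against numbering mismatches, such as the offset of eight positions between the S. cerevisiae and Y. lipolytica ERG20 numbering given above.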
  • For the SEQ IDs described herein, mutations are shown with a solid underline.
  • In certain embodiments, S. cerevisiae ERG20 (K197E) comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:25.
  • S. cerevisiae ERG20 (K197E) comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:25.
  • S. cerevisiae ERG20 (F96W and N127W) comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:26.
  • S. cerevisiae ERG20 (F96W and N127W) comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:26.
  • Y. lipolytica ERG20 (K189E) comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:27.
  • Y. lipolytica ERG20 (K189E) comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:27.
  • Y. lipolytica ERG20 (F88W and N119W) comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:28.
  • Y. lipolytica ERG20 (F88W and N119W) comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:28.
  • GPP proteins such as ERG20 retain the ability to, for example, condense isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) to geranyl pyrophosphate (GPP) and yet have reduced GPP to FPP activity.
  • a variant of a GPP protein, such as ERG20, retains the ability to condense isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) to geranyl pyrophosphate (GPP) with at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence, while the ability to condense GPP to FPP is reduced by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% (null mutation) as compared to the sequence from which it is derived.
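  • The two-part criterion above (GPP formation retained, FPP formation reduced) can be checked together. The following is a minimal sketch, assuming both activities have been measured for the variant and for the parent in the same arbitrary units; the function name and the example rates are hypothetical, and the defaults are the lowest tiers recited above (80% retained, 10% reduced).

```python
def is_gpp_selective(gpp_variant, gpp_parent, fpp_variant, fpp_parent,
                     min_gpp_retained=80.0, min_fpp_reduction=10.0):
    """True if GPP formation is retained and GPP-to-FPP activity is reduced."""
    if gpp_parent <= 0 or fpp_parent <= 0:
        raise ValueError("parent rates must be positive")
    gpp_retained = 100.0 * gpp_variant / gpp_parent
    # A reduction of 100% corresponds to a null mutation for FPP formation.
    fpp_reduction = 100.0 * (1.0 - fpp_variant / fpp_parent)
    return gpp_retained >= min_gpp_retained and fpp_reduction >= min_fpp_reduction


# Hypothetical rates in arbitrary units.
print(is_gpp_selective(9.0, 10.0, 2.0, 10.0))
print(is_gpp_selective(9.0, 10.0, 10.0, 10.0))
```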
  • ERG20 (K197E) SEQ ID NO: 25 MASEKEIRRERFLNVFPKLVEELNASLLAYGMPKEACDWYAHSLNYNTP GGKLNRGLSVVDTYAILSNKTVEQLGQEEYEKVAILGWCIELLQAYFLV ADDMMDKSITRRGQPCWYKVPEVGEIAINDAFMLEAAIYKLLKSHFRNE KYYIDITELFHEVTFQTELGQLMDLITAPEDKVDLSKFSLKKHSFIVTF ETAYYSFYLPVALAMYVAGITDEKDLKQARDVLIPLGEYFQIQDDYLDC FGTPEQIGKIGTDIQDNKCSWVINKALELASAEQRKTLDENYGKKDSVA EAKCKKIFNDLKIEQLYHEYEESIAKDLKAKISQVDESRGFKADVLTAF LNKVYKRSK* ERG20 (F96W and N127W) SEQ ID NO: 26 MASEKEIRRERFLNVFPKLVEELNAS
  • ERG20 (K189E) SEQ ID NO: 27 MSKAKFESVFPRISEELVQLLRDEGLPQDAVQWFSDSLQYNCVGGKLNR GLSVVDTYQLLTGKKELDDEEYYRLALLGWLIELLQAFFLVSDDIMDES KTRRGQPCWYLKPKVGMIAINDAFMLESGIYILLKKHFRQEKYYIDLVE LFHDISFKTELGQLVDLLTAPEDEVDLNRFSLDKHSFIVRYETAYYSFY LPVVLAMYVAGITNPKDLQQAMDVLIPLGEYFQVQDDYLDNFGDPEFIG KIGTDIQDNKCSWLVNKALQKATPEQRQILEDNYGVKDKSKELVIKKLY DDMKIEQDYLDYEEEVVGDIKKKIEQVDESRGFKKEVLNAFLAKIYKRQ K Y.
  • ERG20 (F88W and N119W) SEQ ID NO: 28 ASKAKFESVFPRISEELVQLLRDEGLPQDAVQWFSDSLQYNCVGGKLNR GLSVVDTYQLLTGKKELDDEEYYRLALLGWLIELLQAFWLVSDDIMDES KTRRGQPCWYLKPKVGMIAIWDAFMLESGIYILLKKHFRQEKYYIDLVE LFHDISFKTELGQLVDLLTAPEDEVDLNRFSLDKHSFIVRYKTAYYSFY LPVVLAMYVAGITNPKDLQQAMDVLIPLGEYFQVQDDYLDNFGDPEFIG KIGTDIQDNKCSWLVNKALQKATPEQRQILEDNYGVKDKSKELVIKKLY DDMKIEQDYLDYEEEVVGDIKKKIEQVDESRGFKKEVLNAFLAKIYKRQ K
  • HMGR Hydroxymethylglutaryl-CoA reductase
  • HMG-CoA and NADPH Hydroxymethylglutaryl-CoA reductase
  • HMGR is a rate limiting step in the GPP pathway in yeast. Accordingly, overexpressing HMGR may increase flux through the pathway and increase the production of GPP.
  • HMGR is a GPP pathway gene.
  • Other GPP pathway genes include those genes that are involved in the GPP pathway, the products of which either directly produce GPP or produce intermediates in the GPP pathway, for example, ERG10, ERG13, ERG12, ERG8, ERG19, IDI1 or ERG20.
  • The HMGR1 sequence from Y. lipolytica consists of 999 amino acids (aa) (SEQ ID NO: 29), of which the first 500 aa harbor multiple transmembrane domains and a response element for signal regulation. The remaining 499 C-terminal residues contain a catalytic domain and an NADPH-binding region. Truncated HMGR1 (tHMGR) has been generated by deleting the N-terminal 500 aa (Gao et al. 2017). tHMGR is able to avoid self-degradation mediated by its N-terminal domain and is thus stabilized in the cytoplasm, which increases flux through the GPP pathway.
  • the N-terminal 500 aa are shown with a dashed underline in SEQ ID NO:29.
  • the N-terminal 500 aa are deleted in SEQ ID NO:30.
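  • The relationship between the full-length and truncated sequences can be illustrated as a simple slice. The following is a minimal sketch; the function name is hypothetical, and the placeholder below is not SEQ ID NO:29 but a 999-character stand-in used only to confirm the residue counts given above (999 in total, 500 removed, 499 remaining).

```python
def truncate_n_terminus(sequence, n_removed=500):
    """Return the sequence with its first n_removed residues deleted."""
    if not 0 <= n_removed < len(sequence):
        raise ValueError("n_removed must be smaller than the sequence length")
    return sequence[n_removed:]


# Placeholder: 500 'N' characters for the N-terminal region, 499 'C' for the remainder.
placeholder = "N" * 500 + "C" * 499
truncated = truncate_n_terminus(placeholder)
print(len(placeholder), len(truncated))
```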
  • the one or more GPP pathway genes comprise a hydroxymethylglutaryl-CoA reductase (HMGR); a truncated hydroxymethylglutaryl-CoA reductase (tHMGR); or a combination thereof.
  • the sequence for the Y. lipolytica gene is shown herein; however, the skilled person would understand that homologous genes may also be suitable. Examples of HMGR homologs are shown in Table 9.
  • HMGR comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:29.
  • HMGR comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:29.
  • tHmgR comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:30.
  • tHmgR comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:30.
  • the GPP producing and GPP pathway genes may be expressed using, for example, a constitutive TEF intron promoter or native promoter (Wong et al. 2017) and synthesized short terminator (Curran et al. 2015). Increased production of GPP can be determined by overexpressing a single heterologous gene encoding linalool synthase and then determining the production of linalool using, for example, a colorimetric assay (Ghorai 2012). Increased production of GPP may be indicated by a linalool concentration of at least 0.5 mg/L, 0.7 mg/L, 0.9 mg/L or preferably at least about 1 mg/L.
  • HMGR1 (underlined sequence is removed in tHMGR1) SEQ ID NO: 29 MLQAAIGKIVGFAVNRPIHTVVLTSIVASTAYLAILDIAIPGFEGTQPI SYYHPAAKSYDNPADWTHIAEADIPSDAYRLAFAQIRVSDVQGGEAPTI PGAVAVSDLDHRIVMDYKQWAPWTASNEQIASENHIWKHSFKDHVAFSW IKWFRWAYLRLSTLIQGADNFDIAVVALGYLAMHYTFFSLFRSMRKVGS HFWLASMALVSSTFAFLLAVVASSSLGYRPSMITMSEGLPFLVVAIGFD RKVNLASEVLTSKSSQLAPMVQVITKIASKALFEYSLEVAALFAGAYTG VPRLSQFCFLSAWILIFDYMFLLTFYSAVLAIKFEINHIKRNRMIQDAL KEDGVSAAVAEKVADSSPDAKLDRKSDVSLFGASGAIAVFKIFMVLGFL GLNLINLTAIPH
  • THCA cannabinoids tetrahydrocannabinolic acid
  • CBDA cannabidiolic acid
  • CBCA cannabichromenic acid
  • CBGA-analogs may be produced by a membrane-bound CBGA synthase (CBGAS) from C. sativa .
  • CBGAS is also known as geranylpyrophosphate olivetolate geranyltransferase, of which there are several forms, CsPT1, CsPT3 and CsPT4.
  • the one or more cannabinoid precursor or cannabinoid producing genes comprise: a soluble aromatic prenyltransferase; a cannabigerolic acid synthase (CBGAS); or a combination thereof; either alone or in combination with the cannabinoid producing genes: tetrahydrocannabinolic acid synthase (THCAS); cannabidiolic acid synthase (CBDAS); cannabichromenic acid synthase (CBCAS); or any combination thereof.
  • CBGA synthase comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:31.
  • CBGA synthase comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:32.
  • CBGA synthase comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:33.
  • CBGA synthase comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NOS: 31, 32 or 33.
  • CBGA may also be formed by heterologous expression of a soluble aromatic prenyltransferase.
  • the soluble aromatic prenyltransferase is NphB from Streptomyces sp. strain CL190 (ie wild type NphB) (Bonitz et al., 2011; Kuzuyama et al., 2005; Zirpel et al., 2017).
  • the soluble aromatic prenyltransferase is NphB, comprising at least one mutation selected from (a) Q161A; (b) G286S; (c) Y288A; (d) A232S; (e) Y288A+G286S; (f) Y288A+G286S+Q161A; (g) Q161A+G286S; (h) Q161A+Y288A; or (i) Y288A+A232S. It is expected that the mutants of NphB (e.g., Q161A) produce more CBGA than wild-type NphB (Muntendam 2015).
  • NphB comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:34.
  • NphB comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:34.
  • Variants of the cannabinoid precursor or cannabinoid producing protein, such as an NphB variant (e.g., at least one of Q161A, G286S, Y288A, or A232S), retain the ability to attach geranyl groups to aromatic substrates, such as converting Compound I and GPP to CBGA-analog.
  • a variant Cannabinoid precursor or cannabinoid producing protein such as NphB variant (e.g., at least one of Q161A, G286S, Y288A, A232S), must retain the ability to attach geranyl groups to aromatic substrates, such as converting Compound I and GPP to CBGA-analog, with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence.
  • a variant of a Cannabinoid precursor or cannabinoid producing protein, such as an NphB variant (e.g., at least one of Q161A, G286S, Y288A, A232S), has improved activity over the sequence from which it is derived in that the improved variant common cannabinoid protein has more than 110%, 120%, 130%, 140%, or 150% improved activity in attaching geranyl groups to aromatic substrates, such as converting Compound I and GPP to CBGA-analog, as compared to the sequence from which the improved variant is derived.
  • the cannabinoid precursor or cannabinoid producing genes CBGAS, soluble aromatic prenyltransferase, THCAS, CBDAS and CBCAS may be expressed using, for example, a constitutive TEF intron promoter or native promoter (Wong et al. 2017) and synthesized short terminator (Curran et al. 2015).
  • the production of one or more cannabinoid precursors or cannabinoids may be determined using a variety of methods. For example, if all of the precursors are available in the yeast cell, then the presence of the product, such as THCA, may be determined using HPLC or gas chromatography (GC).
  • cannabinoids will not be present and the activity of one or more genes can be checked by adding a gene and precursor.
  • To check CBGAS activity, Compound I and GPP are added to a crude cellular lysate.
  • To check THCAS or CBDAS activity, a CBGA-analog is added to a crude cellular lysate.
  • a crude lysate or purified proteins may be used. Further, it may be necessary to use an aqueous/organic two-liquid phase setup in order to solubilize the hydrophobic substrate (eg CBGA) and to allow in situ product removal.
  • CsPT1 SEQ ID NO: 31 MGLSSVCTFSFQTNYHTLLNPHNNNPKTSLLCYRHPKTPIKYSYNNFPS KHCSTKSFHLQNKCSESLSIAKNSIRAATTNQTEPPESDNHSVATKILN FGKACWKLQRPYTIIAFTSCACGLFGKELLHNTNLISWSLMFKAFFFLV AILCIASFTTTINQIYDLHIDRINKPDLPLASGEISVNTAWIMSIIVAL FGLIITIKMKGGPLYIFGYCFGIFGGIVYSVPPFRWKQNPSTAFLLNFL AHIITNFTFYYASRAALGLPFELRPSFTFLLAFMKSMGSALALIKDASD VEGDTKFGISTLASKYGSRNLTLFCSGIVLLSYVAAILAGIIWPQAFNS NVMLLSHAILAFWLILQTRDFALTNYDPEAGRRFYEFMWKLYYAEYLVY VFI CsPT3 SEQ ID NO: 32 MGLSLVCTFSF
  • Producing a CBGA-analog is an initial step in producing many cannabinoids. Once a CBGA-analog is produced, a single additional enzymatic step is required to turn the CBGA-analog into many other cannabinoids (ie, CBDA-analog, THCA-analog, CBCA-analog, etc.).
  • the acidic forms of the cannabinoids can be used as a pharmaceutical product, or the acidic cannabinoids can be turned into their neutral forms for use; for example, Cannabidiol (CBD) is produced from CBDA through decarboxylation.
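  • Decarboxylation removes one molecule of CO2 from the acidic cannabinoid, so the mass of neutral product obtainable from a given mass of acid is fixed by the ratio of molar masses. The following is a minimal sketch of that arithmetic for CBDA and CBD; the molecular formula of CBDA (C22H30O4) and the rounded atomic masses are standard values assumed here rather than taken from the text, complete conversion is assumed, and the function names are hypothetical.

```python
# Rounded standard atomic masses in g/mol (assumed values, not from the text).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def molar_mass(composition):
    """Molar mass in g/mol for a composition given as element -> atom count."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


CBDA = {"C": 22, "H": 30, "O": 4}
CO2 = {"C": 1, "O": 2}
# Decarboxylation: CBD is CBDA minus one CO2, i.e. C21H30O2.
CBD = {element: CBDA[element] - CO2.get(element, 0) for element in CBDA}


def max_cbd_yield_mg(cbda_mg):
    """Theoretical maximum CBD mass (mg) from a CBDA mass (mg), assuming full conversion."""
    return cbda_mg * molar_mass(CBD) / molar_mass(CBDA)


print(round(molar_mass(CBDA), 2), round(molar_mass(CBD), 2))
print(round(max_cbd_yield_mg(100.0), 1))
```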
  • the resulting cannabinoid products will be used in the pharmaceutical/nutraceutical industry to treat a wide range of health issues.
  • THCAS tetrahydrocannabinolic acid synthase
  • CBDAS cannabidiolic acid synthase
  • CBCAS cannabichromenic acid synthase
  • THCAS comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:13.
  • THCAS comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:13.
  • CBDAS comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:14.
  • CBDAS comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:14.
  • CBCAS comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:15.
  • CBCAS comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:15.
  • the one or more cannabinoid precursor or cannabinoid producing genes comprise soluble aromatic prenyltransferase, cannabigerolic acid synthase (CBGAS), tetrahydrocannabinolic acid synthase (THCAS), cannabidiolic acid synthase (CBDAS) and cannabichromenic acid synthase (CBCAS).
  • the properties of the reactants have to be taken into account, since they determine preferences for process variables and reaction conditions.
  • the THCAS is active in specialized structures called trichomes (Sirikantaramas et al., 2005). These glandular trichomes harbor a storage cavity (Mahlberg and Kim, 1992), containing the hydrophobic and for plant cells toxic cannabinoids in oil droplets (Morimoto et al., 2007). In this manner, the plant solves solubility and toxicity issues of the cannabinoids (Kim and Mahlberg, 2003).
  • lipid bodies will perform the role of lipid droplets in plants.
  • Cannabinoids are almost insoluble in the aqueous phase, but they are highly soluble in oils (lipids). By using strains with a high content of lipids and lipid bodies, we provide safe (non-toxic) storage for the produced cannabinoids.
  • the production of fatty acids and fats in yeast may be increased by expressing rate limiting genes in the lipid biosynthesis pathway.
  • Y. lipolytica naturally produces Acetyl-CoA.
  • the overexpression of ACC increases the amount of Malonyl-CoA, which is the first step in fatty acid production.
  • the one or more genetic modifications that result in increased production of fatty acids or fats comprise Acetyl-CoA carboxylase (ACC1) and Diacylglyceride acyl-transferase (DGA1).
  • ACC1 Acetyl-CoA carboxylase
  • DGA1 Diacylglyceride acyl-transferase
  • ACC comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:23.
  • ACC1 comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:23.
  • DGA1 comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:24.
  • DGA1 comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:24.
  • ACC1 and DGA1 may be overexpressed in yeast by adding extra copies of the genes driven by native or stronger promoters.
  • native promoters may be substituted by stronger promoters such as TEFin, hp4d, hp8d and others, as would be appreciated by the person skilled in the art.
  • the overexpression of ACC and DGA1 may be determined by quantitative PCR, Microarrays, or next generation sequencing technologies, such as RNA-seq.
  • the product of increased enzyme levels will be increased production of fatty acids. Fatty acid production may be determined using chemical titration, thermometric titration, measurement of metal-fatty acid complexes using spectrophotometry, enzymatic methods or using a fatty acid binding protein.
  • Variants of the fatty acid and fat producing proteins retain the ability to produce malonyl-CoA from acetyl-CoA plus bicarbonate.
  • a variant of a fatty acid and fat producing protein, such as ACC1, must retain the ability to produce malonyl-CoA from acetyl-CoA plus bicarbonate with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence.
  • NADPH is critical for the production of fatty acids: 16 molecules of NADPH are required to produce one molecule of stearic acid. By using NADPH, cells create an excess of NADH. NADPH is also important for the production of cannabinoids. Four molecules of NADPH are required to produce 1 molecule of GPP.
  • Production of OLA from Hexanoyl-CoA does not require any additional NADPH. Therefore, we will need 8 molecules of NADPH to directly produce 1 molecule of a cannabinoid precursor.
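The NADPH counts quoted above can be checked with a short calculation. This is a minimal sketch, assuming the textbook fatty acid synthase stoichiometry of 2 NADPH per two-carbon elongation cycle and assuming that the 8 NADPH per cannabinoid precursor are split between the hexanoyl starter unit and GPP; the text states the totals but not this breakdown, and the function and constant names are illustrative only.

```python
# Assumption: each elongation cycle of fatty acid synthesis adds two carbons
# and consumes 2 NADPH; the initial C2 primer consumes none.
NADPH_PER_ELONGATION_CYCLE = 2
NADPH_PER_GPP = 4  # figure stated in the text


def nadph_for_saturated_acyl_chain(carbons: int) -> int:
    """NADPH needed to build a saturated, even-numbered acyl chain from a C2 primer."""
    if carbons < 2 or carbons % 2:
        raise ValueError("expected an even chain length of at least 2 carbons")
    cycles = carbons // 2 - 1
    return cycles * NADPH_PER_ELONGATION_CYCLE


stearic = nadph_for_saturated_acyl_chain(18)   # C18 stearic acid
hexanoyl = nadph_for_saturated_acyl_chain(6)   # C6 hexanoyl starter unit
precursor_total = hexanoyl + NADPH_PER_GPP     # OLA formation itself adds none
print(stearic, hexanoyl, precursor_total)
```

Under these assumptions the calculation reproduces both figures given in the text: 16 NADPH for stearic acid and 8 NADPH for one cannabinoid precursor.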
  • Preferred methods of overexpressing NADP+ include, but are not limited to use of glucose-6-phosphate dehydrogenase, which is encoded by, for example ZWF1 (see, for example, Yuzbasheva, E.
  • ProA proteinase A
  • Such a ProA signal may also increase production of the CBDA, THCA or CBCA analog.
  • Examples of such ProA signals that can be added to the N-terminus include any one of SEQ ID NO:45-46.
  • MKFTAAVSVLAAAGSVSAAV >ProA21 (SEQ ID NO: 46) MKFTAAVSVLAAAGSVSAAVS >ProA22 (SEQ ID NO: 47) MKFTAAVSVLAAAGSVSAAVSK >ProA23 (SEQ ID NO: 48) MKFTAAVSVLAAAGSVSAAVSKV >ProA24 (SEQ ID NO: 49) MKFTAAVSVLAAAGSVSAAVSKVS
  • any one of SEQ ID NO:45-49 can be added to the N-terminus of any one of SEQ ID NO:13-15 (or variants thereof) to aid in the expression, activity and production of the CBDA, THCA or CBCA analog.
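The N-terminal fusion described above amounts to a string concatenation at the protein level. The sketch below uses the four ProA signal sequences whose labels appear in the listing (SEQ ID NO: 46-49); the enzyme sequence is a hypothetical placeholder, not SEQ ID NO: 13-15, and dropping the enzyme's initial methionine is an assumed design choice that the text does not specify.

```python
# ProA signal peptides as labelled in the sequence listing above.
PROA_SIGNALS = {
    "ProA21": "MKFTAAVSVLAAAGSVSAAVS",     # SEQ ID NO: 46
    "ProA22": "MKFTAAVSVLAAAGSVSAAVSK",    # SEQ ID NO: 47
    "ProA23": "MKFTAAVSVLAAAGSVSAAVSKV",   # SEQ ID NO: 48
    "ProA24": "MKFTAAVSVLAAAGSVSAAVSKVS",  # SEQ ID NO: 49
}

# Hypothetical stand-in for a synthase sequence; replace with the real one.
PLACEHOLDER_ENZYME = "MAAAKKK"


def fuse_signal(signal: str, enzyme: str, drop_initial_met: bool = True) -> str:
    """Return signal + enzyme, optionally removing the enzyme's start methionine."""
    if drop_initial_met and enzyme.startswith("M"):
        enzyme = enzyme[1:]
    return signal + enzyme


for name, signal in PROA_SIGNALS.items():
    # The number in each name is the signal length in residues.
    assert len(signal) == int(name[4:])
    print(name, fuse_signal(signal, PLACEHOLDER_ENZYME))
```

Keeping the signals in a dictionary keyed by name makes it straightforward to generate one construct per signal length for side-by-side comparison.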
  • the addition of the ProA signal sequence to the N-terminus of THCAS, CBDAS and/or CBCAS substantially improved activity when the enzyme was expressed in a recombinant host having inactivated or deleted PEP4 and/or PRB1 genes, or in recombinant hosts lacking functional PEP4 and/or PRB1 genes (e.g., lacking endogenous sequences).
  • Y. lipolytica strains in which YALI0F27071p and/or YALI0B16500p and/or YALI0A06435p is inactivated are preferably used to express THCAS, CBDAS and/or CBCAS having ProA signal sequences.
  • the microorganism employed in a method of the invention or contained in the composition of the invention may be a microorganism which has been genetically modified by the introduction of a nucleic acid molecule encoding a corresponding enzyme.
  • the microorganism is a recombinant microorganism which has been genetically modified to have an increased activity of at least one enzyme described above for the conversions of the method according to the present invention. This can be achieved e.g. by transforming the microorganism with a nucleic acid encoding a corresponding enzyme.
  • the nucleic acid molecule introduced into the microorganism is a nucleic acid molecule which is heterologous with respect to the microorganism, i.e. it does not naturally occur in said microorganism.
  • microorganism in the context of the present invention refers to bacteria, as well as to fungi, such as yeasts, and also to algae and archaea.
  • the microorganism is a bacterium.
  • any bacterium can be used.
  • Preferred bacteria to be employed in the process according to the invention are bacteria of the genus Bacillus, Clostridium, Corynebacterium, Pseudomonas, Zymomonas or Escherichia .
  • the bacterium belongs to the genus Escherichia and even more preferred to the species Escherichia coli .
  • the bacterium belongs to the species Pseudomonas putida or to the species Zymomonas mobilis or to the species Corynebacterium glutamicum or to the species Bacillus subtilis . It is also possible to employ an extremophilic bacterium such as Thermus thermophilus , or anaerobic bacteria from the family Clostridiae.
  • an “increased activity” means that the expression and/or the activity of an enzyme in the genetically modified microorganism is at least 10%, preferably at least 20%, more preferably at least 30% or 50%, even more preferably at least 70% or 80% and particularly preferred at least 90% or 100% higher than in the corresponding non-modified microorganism.
  • the increase in expression and/or activity may be at least 150%, at least 200% or at least 500%.
  • the expression is at least 10-fold, more preferably at least 100-fold and even more preferred at least 1000-fold higher than in the corresponding non-modified microorganism.
  • the term “increased” expression/activity also covers the situation in which the corresponding non-modified microorganism does not express a corresponding enzyme so that the corresponding expression/activity in the non-modified microorganism is zero.
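The percentage and fold thresholds above describe the same quantity on two scales. A minimal sketch of the conversion follows; the function names are illustrative, and returning infinity for a zero baseline is an assumed convention for the case in which the non-modified microorganism has no activity at all.

```python
def percent_increase(modified: float, unmodified: float) -> float:
    """Percent by which the modified value exceeds the unmodified value."""
    if unmodified == 0:
        # Zero baseline: any positive activity counts as "increased".
        return float("inf") if modified > 0 else 0.0
    return (modified - unmodified) / unmodified * 100.0


def fold_change(modified: float, unmodified: float) -> float:
    """Ratio of modified to unmodified activity."""
    if unmodified == 0:
        return float("inf") if modified > 0 else 1.0
    return modified / unmodified


# A 100% increase corresponds to a 2-fold change.
print(percent_increase(2.0, 1.0), fold_change(2.0, 1.0))
```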
  • the concentration of the overexpressed enzyme is at least 5%, 10%, 20%, 30%, or 40% of the total host cell protein. Additionally, as would be appreciated by the person skilled in the art, increased expression of a gene may provide increased activity of the gene product.
  • overexpression of a gene can increase the activity of the gene product by about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 100%, about 105%, about 110%, about 115%, about 120%, about 125%, about 130%, about 135%, about 140%, about 145%, about 150%, about 155%, about 160%, about 165%, about 170%, about 175%, about 180%, about 185%, about 190%, about 195%, or about 200%.
  • Methods for measuring the level of expression of a given protein in a cell are well known to the person skilled in the art.
  • the measurement of the level of expression is done by measuring the amount of the corresponding protein.
  • Corresponding methods are well known to the person skilled in the art and include Western Blot, ELISA etc.
  • the measurement of the level of expression is done by measuring the amount of the corresponding RNA.
  • Corresponding methods are well known to the person skilled in the art and include, e.g., Northern Blot.
  • the transformation of the host cell with a polynucleotide or vector as described above can be carried out by standard methods, as for instance described in Sambrook and Russell (2001), Molecular Cloning: A Laboratory Manual, CSH Press, Cold Spring Harbor, N.Y., USA; Methods in Yeast Genetics, A Laboratory Course Manual, Cold Spring Harbor Laboratory Press, 1990.
  • the host cell is cultured in nutrient media meeting the requirements of the particular host cell used, in particular in respect of the pH value, temperature, salt concentration, aeration, antibiotics, vitamins, trace elements etc.
  • the disclosed genes may be under the control of any suitable promoter.
  • Many native promoters are available, for example, for Y. lipolytica , native promoters are available from the genes for translational elongation factor EF-1 alpha, acyl-CoA: diacylglycerol acyltransferase, acetyl-CoA-carboxylase 1, ATP citrate lyase 2, fatty acid synthase subunit beta, fatty acid synthase subunit alpha, isocitrate lyase 1, POX4 fatty-acyl coenzyme A oxidase, ZWF1 glucose-6-phosphate dehydrogenase, cytosolic NADP-specific isocitrate dehydrogenase, glyceraldehyde 3-phosphate dehydrogenase, the TEF intron promoter or native promoter (Wong et al.
  • Short synthetic terminators are particularly suitable and are readily available, see for example, MacPherson et al. 2016.
  • Increased production of Compound I may be detected using high-performance liquid chromatography (HPLC) or liquid chromatography-mass spectrometry (LC/MS). For example, as yeast do not produce OA endogenously, the presence of OA indicates that the PKS enzyme is functioning.
  • HPLC high-performance liquid chromatography
  • LC/MS Liquid chromatography-mass spectrometry
  • the microorganism is a fungus, more preferably a fungus of the genus Saccharomyces, Schizosaccharomyces, Aspergillus, Trichoderma, Kluyveromyces or Pichia and even more preferably of the species Saccharomyces cerevisiae, Schizosaccharomyces pombe, Aspergillus niger, Trichoderma reesei, Kluyveromyces marxianus, Kluyveromyces lactis, Pichia pastoris, Pichia torula or Pichia utilis.
  • genetically modified yeasts comprising one or more genetic modifications that result in the production of at least one cannabinoid or cannabinoid precursor and methods for their creation.
  • the disclosed yeast may produce various cannabinoids from a simple sugar source, for example, where the main carbon source available to the yeast is a sugar (glucose, galactose, fructose, sucrose, honey, molasses, raw sugar, etc.).
  • Genetic engineering of the yeast involves inserting various genes that produce the appropriate enzymes and/or altering the natural metabolic pathway in the yeast to achieve the production of a desired compound. Through genetic engineering of yeast, these metabolic pathways can be introduced into these yeast and the same metabolic products that are produced in the plant C. sativa can be produced by the yeast.
  • the benefit of this method is that once the yeast is engineered, the production of the cannabinoid is low cost and reliable; only a specific cannabinoid or a subset of cannabinoids is produced, depending on the organism and the genetic manipulation.
  • the purification of the cannabinoid is straightforward since there is only a single cannabinoid or a selected few cannabinoids present in the yeast.
  • the process is a sustainable process which is more environmentally friendly than synthetic production.
  • the biosynthetic pathways shown in FIGS. 1 - 3 are produced in yeast having at least 5% dry weight of fatty acids or fats, such as oily yeasts, for example, Y. lipolytica.
  • Cannabinoids have limited solubility in aqueous solutions, yet they are highly soluble in hydrophobic liquids such as lipids, oils or fats. If the hydrophobic medium is limited or completely removed, then a CBGA-analog will not be solubilized and will have limited availability to the downstream cannabinoid synthases.
  • purified THCA synthase is almost unable to convert CBGA into THCA.
  • unpurified yeast lysate converts CBGA much more efficiently.
  • CBGA was dissolved in the lipid fraction.
  • another paper (Lange et al.
  • cannabinoid in traditional yeast like S. cerevisiae, K. phaffii, K. marxianus .
  • cannabinoids, like the main mass of lipids, are deposited in the lipid membrane.
  • these yeasts have almost no oily bodies. In such a case, any cannabinoids that are produced will be dissolved in this membrane. Too many cannabinoids will destabilize the membrane, which will cause cell death. It was reported that in the best conditions, with high sugar content and without nitrogen supply, these yeasts can have a maximum of 2-3% dry weight of oils (i.e., fats and fatty acids).
  • there are several non-traditional yeasts, like Y. lipolytica .
  • the natural form of Y. lipolytica can have up to 17% dry weight of oils.
  • the main mass of oil is located in oily bodies.
  • Cannabinoids dissolved in such bodies will not cause membrane instability.
  • Y. lipolytica can have a much higher cannabinoid production level.
  • Several works have demonstrated modifications for Y. lipolytica which can bring the lipid content above 80% of dry mass (Qiao et al. 2015).
  • cannabinoids can be produced to some percentage of the oil content in yeast. This gives a correlation—more oil means more cannabinoid production.
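The stated correlation between oil content and cannabinoid capacity can be illustrated numerically. In the sketch below the lipid fractions are the figures quoted in the text (a 3% upper bound for traditional yeast, up to 17% for natural Y. lipolytica, above 80% for engineered strains); the fraction of the oil that can be occupied by cannabinoids is a purely hypothetical value chosen for illustration, since the text only says "some percentage".

```python
# Lipid content as a fraction of dry cell weight, from the figures in the text.
LIPID_FRACTION_OF_DRY_WEIGHT = {
    "traditional yeast (upper bound)": 0.03,
    "Y. lipolytica, natural": 0.17,
    "Y. lipolytica, engineered": 0.80,
}

# Hypothetical: share of the oil phase that cannabinoids may occupy.
ASSUMED_CANNABINOID_FRACTION_OF_OIL = 0.05


def capacity_mg_per_g_dcw(lipid_fraction: float, cannabinoid_fraction: float) -> float:
    """Cannabinoid storage capacity in mg per gram of dry cell weight."""
    return lipid_fraction * cannabinoid_fraction * 1000.0


for strain, lipid_fraction in LIPID_FRACTION_OF_DRY_WEIGHT.items():
    capacity = capacity_mg_per_g_dcw(lipid_fraction, ASSUMED_CANNABINOID_FRACTION_OF_OIL)
    print(f"{strain}: {capacity:.1f} mg/g dry cell weight")
```

Whatever value is assumed for the cannabinoid share of the oil, the capacity scales linearly with lipid content, so the ratio between strains depends only on the lipid fractions.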
  • oily yeasts as a backbone for cannabinoid and/or cannabinoid precursor production.
  • the yeast comprises at least 5% dry weight of fatty acids or fats.
  • the yeast may be oleaginous. Any oleaginous yeast may be suitable, however, particularly suitable yeast may be selected from the genera Rhodosporidium, Rhodotorula, Yarrowia, Cryptococcus, Candida, Lipomyces and Trichosporon .
  • the yeast is a Yarrowia lipolytica , a Lipomyces starkeyi , a Rhodosporidium toruloides , a Rhodotorula glutinis , a Trichosporon fermentans or a Cryptococcus curvatus .
  • the yeast may be naturally oleaginous. Accordingly, in certain embodiments, the yeast comprises at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, at least 15%, at least 16%, at least 17%, at least 18%, at least 19%, at least 20%, at least 21%, at least 22%, at least 23%, at least 24%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75% or at least 80% dry weight of fatty acids or fats.
  • the yeast may also be genetically modified to accumulate or produce more fatty acids or fats.
  • the yeast is genetically modified to produce at least 5%, at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, at least 15%, at least 16%, at least 17%, at least 18%, at least 19%, at least 20%, at least 21%, at least 22%, at least 23%, at least 24%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75% or at least 80% dry weight of fatty acids or fats.
  • the method according to the present invention can also be carried out in a cell-free system (e.g., in vitro).
  • An in vitro reaction is understood to be a reaction in which no cells are employed, i.e. an acellular reaction.
  • in vitro preferably means in a cell-free system.
  • the term “in vitro” in one embodiment means in the presence of isolated enzymes (or enzyme systems optionally comprising possibly required cofactors).
  • the enzymes employed in the method are used in purified form.
  • the substrates for the reaction and the enzymes are incubated under conditions (buffer, temperature, cosubstrates, cofactors etc.) allowing the enzymes to be active and the enzymatic conversion to occur.
  • the reaction is allowed to proceed for a time sufficient to produce the respective product.
  • the production of the respective products can be measured by methods known in the art, such as gas chromatography possibly linked to mass spectrometry detection.
  • the enzymes described herein may be in any suitable form allowing the enzymatic reaction to take place. They may be purified or partially purified or in the form of crude cellular extracts or partially purified extracts. It is also possible that the enzymes are immobilized on a suitable carrier.
  • compositions as described herein comprising contacting the compositions as described herein with a carbohydrate source under conditions and for a time sufficient to produce the at least one cannabinoid or cannabinoid precursor.
  • examples of the culture conditions for producing at least one cannabinoid or cannabinoid precursor include a batch process and a fed batch or repeated fed batch process in a continuous manner, but are not limited thereto.
  • Carbon sources that may be used for producing at least one cannabinoid or cannabinoid precursor may include sugars and carbohydrates such as glucose, sucrose, lactose, fructose, maltose, starch, xylose and cellulose; oils and fats such as soybean oil, sunflower oil, castor oil, coconut oil, chicken fat and beef tallow; fatty acids such as palmitic acid, stearic acid, oleic acid and linoleic acid; alcohols such as glycerol and ethanol; and organic acids such as gluconic acid, acetic acid, malic acid and pyruvic acid, but these are not limited thereto.
  • Nitrogen sources that may be used in the present disclosure may include peptone, yeast extract, meat extract, malt extract, corn steep liquor, defatted soybean cake, and urea or inorganic compounds, such as ammonium sulfate, ammonium chloride, ammonium phosphate, ammonium carbonate, and ammonium nitrate, but these are not limited thereto. These nitrogen sources may also be used alone or in a mixture.
  • Phosphorus sources that may be used in the present disclosure may include potassium dihydrogen phosphate or dipotassium hydrogen phosphate, or corresponding sodium-containing salts, but these are not limited thereto.
  • the culture medium may contain a metal salt such as magnesium sulfate or iron sulfate, which may be required for growth.
  • essential growth factors such as amino acids and vitamins may be used.
  • Basic compounds such as sodium hydroxide, potassium hydroxide, or ammonia, or acidic compounds such as phosphoric acid or sulfuric acid may be added to the culture medium in a suitable manner to adjust the pH of the culture medium.
  • an anti-foaming agent such as fatty acid polyglycol ester may be used to suppress the formation of bubbles.
  • the culture medium is maintained in an aerobic state, accordingly, oxygen or oxygen-containing gas (e.g., air) may be injected into the culture medium.
  • the temperature of the culture medium may be usually 20° C. to 35° C., preferably 25° C. to 32° C., but may be changed depending on conditions.
  • the culture may be continued until the maximum amount of a desired cannabinoid precursor or cannabinoid is produced, and it may generally be achieved within 5 hours to 160 hours.
  • the cannabinoid precursor or cannabinoid may be released into the culture medium or contained in the recombinant microorganisms.
  • the method of the present disclosure for producing at least one cannabinoid or cannabinoid precursor may include a step of recovering the at least one cannabinoid or cannabinoid precursor from the microorganism or the medium.
  • Methods known in the art such as centrifugation, filtration, anion-exchange chromatography, crystallization, HPLC, etc., may be used for the method for recovering at least one cannabinoid or cannabinoid precursor from the microorganism or the culture, but the method is not limited thereto.
  • the step of recovering may include a purification process. Specifically, following an overnight culture, 1 L cultures are pelleted by centrifugation, resuspended, washed in PBS and pelleted.
  • the cells are lysed by either chemical or mechanical methods or a combination of methods.
  • Mechanical methods can include a French Press or glass bead milling or other standard methods.
  • Chemical methods can include enzymatic cell lysis, solvent cell lysis, or detergent based cell lysis.
  • a liquid-liquid extraction of the cannabinoids is performed using the appropriate chemical solvent in which the cannabinoids are highly soluble and the solvent is not miscible in water. Examples include hexane, ethyl acetate, and cyclohexane, preferably solvents with straight or branched alkane chains (C5-C8) or mixtures thereof.
  • the at least one cannabinoid or cannabinoid precursor comprises a CBGA-analog, a THCA-analog, a CBDA-analog or a CBCA-analog.
  • the production of one or more cannabinoid precursors or cannabinoids may be determined using a variety of methods as described herein.
  • An example protocol for analysing a CBDA-analog is as follows:
  • in a third aspect of the present disclosure, there is provided a cannabinoid precursor, cannabinoid or a combination thereof produced using the methods described herein.
  • the at least one cannabinoid or cannabinoid precursor comprises a CBGA-analog, a THCA-analog, a CBDA-analog or a CBCA-analog.
  • Y. lipolytica episomal plasmids comprise a centromere, an origin and a bacterial replicative backbone. Fragments for these regions were synthesized by Twist Bioscience and cloned to make an episomal parent vector pBM-pa. Plasmids were constructed by Gibson Assembly, Golden Gate assembly, ligation or sequence- and ligation-independent cloning (SLIC). Genomic DNA isolation from bacteria ( E. coli ) and yeast ( Yarrowia lipolytica ) was performed using the Wizard Genomic DNA purification kit according to the manufacturer's protocol (Promega, USA). Synthetic genes were codon-optimized using GeneGenie or Genscript (USA) and assembled from gene fragments purchased from Twist Bioscience.
  • All the engineered Y. lipolytica strains were constructed by transforming the corresponding plasmids. All gene expression cassettes were constructed using a TEF intron promoter and synthesized short terminator. Up to six expression cassettes were cloned into episomal expression vectors through SLIC.
  • E. coli minipreps were performed using the Zyppy Plasmid Miniprep Kit (Zymo Research Corporation). Transformation of E. coli strains was performed using Mix & Go Competent Cells (Zymo Research, USA). Transformation of Y. lipolytica with episomal expression plasmids was performed using the Zymogen Frozen EZ Yeast Transformation Kit II (Zymo Research Corporation), and the cells were spread on selective plates. Transformation of Y. lipolytica with linearized cassettes was performed using the LiOAc method. Briefly, Y. lipolytica strains were inoculated from glycerol stocks directly into 10 ml YPD media, grown overnight and harvested at an OD600 between 9 and 15 by centrifugation at 1,000 g for 3 min. Cells were washed twice in sterile water. Cells were dispensed into separate microcentrifuge tubes for each transformation, spun down and resuspended in 1.0 ml 100 mM LiOAc. Cells were incubated with shaking at 30° C. for 60 min, spun down, resuspended in 90 µl 100 mM LiOAc and placed on ice.
  • Dithiothreitol Cells were incubated at 30° C. with shaking for 60 min, heat-shocked for 10 min in a 39° C. water bath, spun down and resuspended in 1
  • E. coli strain DH10B was used for cloning and plasmid propagation.
  • DH10B was grown at 37° C. with constant shaking in Luria-Bertani Broth supplemented with 100 mg/L of ampicillin for plasmid propagation.
  • Y. lipolytica strain W29 was used as the base strain for all experiments.
  • Y. lipolytica was cultivated at 30° C. with constant agitation. Cultures (2 ml) of Y. lipolytica used in large-scale screens were grown in a shaking incubator at speed 250 rpm for 1 to 3 days, and larger culture volumes were shaken in 50 ml flasks or fermented in a bioreactor.
  • Y. lipolytica was grown in YPD liquid medium containing 10 g/L yeast extract, 20 g/L peptone and 20 g/L glucose, or on YPD agar plates with the addition of 20 g/L of agar. Medium was often supplemented with 150 to 300 mg/L Hygromycin B or 250 to 500 mg/L nourseothricin for selection, as appropriate.
  • modified YPD media with 0.1 to 1 g/L yeast extract was used for promoting lipid accumulation and often supplemented with 0.2 g/L and 5 g/L ammonium sulphate as an alternative nitrogen source.
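The YPD composition given above can be scaled to any batch volume with a small helper. This is a minimal sketch using the concentrations stated in the text; the function name and the choice of grams as the output unit are illustrative assumptions.

```python
# Standard YPD concentrations from the text, in g/L.
YPD_G_PER_L = {"yeast extract": 10.0, "peptone": 20.0, "glucose": 20.0}
AGAR_G_PER_L = 20.0  # added only for plates


def ypd_recipe(volume_l: float, agar: bool = False) -> dict[str, float]:
    """Return grams of each component for the requested volume of YPD."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    recipe = {name: conc * volume_l for name, conc in YPD_G_PER_L.items()}
    if agar:
        recipe["agar"] = AGAR_G_PER_L * volume_l
    return recipe


# Example: 0.5 L of liquid medium and 1 L of agar medium.
print(ypd_recipe(0.5))
print(ypd_recipe(1.0, agar=True))
```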
  • Y. lipolytica cultures from the shaking flask experiment or bioreactor are pelleted and homogenized in acetonitrile, followed by incubation on ice for 15 min. Supernatants are filtered (0.45 µm, Nylon) after centrifugation (13,100 g, 4° C., 20 min) and analyzed by HPLC-DAD. Quantification of products is based on integrated peak areas of the UV-chromatograms at 225 nm. Standard curves are generated for CBGA and THCA. The identity of all compounds can be confirmed by comparing mass and tandem mass spectra of each sample with coeluting standards analysed by Bruker Compact™ ESI-Q-TOF using positive ionization mode.
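Quantification against a standard curve, as described above, can be sketched as a linear least-squares fit of peak area against known concentration, followed by inversion of the fit for unknown samples. The calibration values below are hypothetical and chosen only so that the arithmetic is easy to follow; a linear detector response over the calibrated range is also an assumption.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_area(area: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to get concentration from a peak area."""
    return (area - intercept) / slope


# Hypothetical standards: concentration (mg/L) and integrated area at 225 nm.
standard_conc = [1.0, 5.0, 10.0, 50.0]
standard_area = [20.0, 100.0, 200.0, 1000.0]

slope, intercept = fit_line(standard_conc, standard_area)
sample_conc = concentration_from_area(400.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, sample={sample_conc:.2f} mg/L")
```

A separate curve would be fitted for each analyte (the text names CBGA and THCA), because peak-area response differs between compounds.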
  • Embodiment 1 Y. lipolytica ERG20 comprising F88W and N119W substitutions; tHMGR; OLS: OAC; CBGAS; THCAS; HexA and HexB.
  • Embodiment 2 Y. lipolytica ERG20 comprising F88W and N119W substitutions; HMGR; OLS: OAC; NphB Q161A; THCAS; FAS1 I306A, M1251W and FAS2 G1250S.
  • Embodiment 3 S. cerevisiae ERG20 comprising a K197E substitution; OLS: OAC; NphB Q161A; CBDAS; StcJ and StcK.
  • Embodiment 4 Y. lipolytica ERG20 comprising a K189E substitution; HMGR; OLS: OAC; CBGAS; CBCAS; HexA and HexB.
  • Embodiment 5 Y. lipolytica ERG20 comprising a K189E substitution; tHMGR; OLS: OAC; CBGAS; CBDAS; StcJ and StcK.
  • the genetically modified yeast of the present disclosure enable the production of cannabinoid precursors and cannabinoids.
  • the accumulation of fatty acids or fats in the yeast of at least 5% dry weight provides a storage location for the cannabinoid precursors and cannabinoids removed from the plasma membrane. This reduces the accumulation of cannabinoid precursors and cannabinoids in the plasma membrane, reducing membrane destabilisation and reducing the chances of cell death.
  • Oily yeast such as Y. lipolytica can be engineered to have a fatty acid or fat (e.g., lipid) content above 80% dry weight, compared to 2-3% for yeast such as S. cerevisiae . Accordingly, cannabinoid precursor and cannabinoid production can be much higher in oily yeast, particularly oily yeast engineered to have a high fatty acid or fat (e.g., lipid) content.
  • NphB gene mutations were used to express NphBs.
  • Wild-type NphB and its mutants, each carrying a thrombin-6×His tag at the N-terminus, are expressed episomally under the TEF intron promoter.
  • To assay NphB activity, in vitro assays containing 5 mM GPP, 2 mM OA, 5 mM MgCl2 and 0.5 mg/mL purified NphB enzyme were incubated for 24 h at room temperature. The reaction was then stopped and extracted by adding 200 µl acetonitrile and vortexing for 30 s. The solution was centrifuged at 18000 g for 3 min before being subjected to HPLC analysis.
  • the flow rate was held at 0.2 ml/min for 12 min, increased from 0.2 ml/min to 0.4 ml/min in 0.5 min, and held at 0.4 ml/min for 3 min.
  • the total liquid chromatography run time was 15.5 min.
  • FIGS. 9 A and 9 B show the results when different ProA signal sequences were tested.
  • a lipid accumulation strain Y12 (W29 Δpex10 AURA3 hp4d-YlACBP hp4d-YZWF1 hp4d-YlACC1 TEFin-YDGA1 TEFin-ScSUC2 TEFin-YlHXK) was used for THCAS episomal expression.
  • All THCAS constructs have a 3×His tag attached at the C-terminus for Western blot detection. All THCAS constructs are driven by the TEF intron promoter with the XPR2 terminator. Different lengths of the vacuolar proteinase A (YALI0F27071g) signal peptide are attached at the N-terminus of THCAS.
  • One THCAS variant carries two mutations, N89Q and N499Q, which remove two glycosylation sites.
  • YPD-hygromycin yeast extract peptone dextrose hygromycin
  • THCA in vivo production
  • strains were incubated under the same culture conditions for 48 h to allow biomass growth.
  • CBGA was spiked at different concentrations and incubated for another 48 or 72 hours for THCA production.
  • CBGA stock solution (1 mg/ml CBGA in F127 surfactant with 1% (v/v) canola oil) was used for spiking.
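The volume of the 1 mg/ml CBGA stock needed for a given spike level follows from a mass balance that accounts for the added volume. This is a minimal sketch; the culture volume and target concentration in the example are hypothetical, since the text does not list the spike levels used.

```python
STOCK_MG_PER_ML = 1.0  # CBGA stock concentration stated in the text


def spike_volume_ml(culture_ml: float, target_mg_per_l: float,
                    stock_mg_per_ml: float = STOCK_MG_PER_ML) -> float:
    """Stock volume so that the final mixture reaches the target concentration.

    Solves stock * v = target * (culture + v) for v.
    """
    target_mg_per_ml = target_mg_per_l / 1000.0
    if target_mg_per_ml >= stock_mg_per_ml:
        raise ValueError("target must be below the stock concentration")
    return target_mg_per_ml * culture_ml / (stock_mg_per_ml - target_mg_per_ml)


# Hypothetical example: 2 ml culture spiked to 100 mg/L CBGA.
volume = spike_volume_ml(2.0, 100.0)
print(f"add {volume:.3f} ml of stock")
```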
  • THCAS production was evaluated by Western blot using a primary antibody (6×-His Tag Polyclonal Antibody, PA1-983B) and a secondary antibody (Goat anti-Rabbit IgG (H+L) Cross-Adsorbed Secondary Antibody, HRP, G-21234) against the C-terminal 3×His tag on THCAS.
  • Western blot detection was performed using 1-Step Ultra TMB-Blotting Solution (Thermo Fisher Scientific).
  • Extraction of cannabinoids was performed by adding 1 ml culture, 0.3 ml ethyl acetate/formic acid (0.05% (v/v)) and 0.2 ml equivalent of glass beads to an Omni homogenizer tube.
  • Products were analysed using high-performance liquid chromatography with UV detection.
  • the mobile phase was composed of 0.05% (v/v) formic acid in water (solvent A) and 0.05% (v/v) formic acid in acetonitrile (solvent B).
  • Olivetolic acid and cannabinoids were separated via gradient elution as follows: linearly increased from 45% B to 62.5% B in 3 min, held at 62.5% B for 4 min, increased from 62.5% B to 97% B in 1 min, held at 97% B for 4 min, decreased from 97% B to 45% B in 0.5 min, and held at 45% B for 3 min.
  • the flow rate was held at 0.2 ml/min for 12 min, increased from 0.2 ml/min to 0.4 ml/min in 0.5 min, and held at 0.4 ml/min for 3 min.
  • the total liquid chromatography run time was 15.5 min.
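The gradient and flow-rate programs described above can be written as time/value breakpoints and checked against the stated 15.5 min run time. This is a minimal sketch, assuming both programs start at time zero and that changes between breakpoints are linear; the table and function names are illustrative.

```python
# (time in min, % solvent B) breakpoints built from the segment durations in the text.
GRADIENT_B = [(0.0, 45.0), (3.0, 62.5), (7.0, 62.5), (8.0, 97.0),
              (12.0, 97.0), (12.5, 45.0), (15.5, 45.0)]

# (time in min, flow in ml/min) breakpoints.
FLOW_ML_MIN = [(0.0, 0.2), (12.0, 0.2), (12.5, 0.4), (15.5, 0.4)]


def interpolate(program: list[tuple[float, float]], t: float) -> float:
    """Piecewise-linear value of a program at time t (minutes)."""
    if not program[0][0] <= t <= program[-1][0]:
        raise ValueError("time is outside the program")
    for (t0, v0), (t1, v1) in zip(program, program[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("program breakpoints are not ordered")


# Both programs end at the same total run time.
print(GRADIENT_B[-1][0], FLOW_ML_MIN[-1][0])
print(interpolate(GRADIENT_B, 1.5), interpolate(FLOW_ML_MIN, 12.25))
```

Summing the segment durations gives 3 + 4 + 1 + 4 + 0.5 + 3 = 15.5 min for the gradient and 12 + 0.5 + 3 = 15.5 min for the flow program, which agrees with the stated total run time.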
  • FIG. 9 A shows that THCAS without ProA (S1) produces a large amount of cytoplasmic enzyme with a mass of 53 kD. This enzyme is not glycosylated and has a predicted molecular weight of 53 kD. ProA19 (S3) also produces a significant amount of unglycosylated enzyme. No amount of correctly glycosylated THCAS (69 kD) detectable by Western blot was obtained in strains with active PRB1 and PEP4, showing that without ProA and knockout almost no enzyme present in
  • FIG. 9B shows the effect of protease knockout on ProA24-THCAS production.
  • Correctly glycosylated (69 kD) enzyme is produced in the dPRB1, dPEP4 and dPRB1+dPEP4 strains (lanes S15-S16, S18-S20, and S22-S23).
  • dPRB2 shows no detectable amount of any form of THCAS (lanes S17 and S21).
  • FIG. 9C shows that ProA19-24 can produce a large amount of correctly glycosylated enzyme in the dPRB1 strain.
  • FIG. 9D provides the in vivo THCA production by strains expressing THCAS with different ProA signal peptides and protease knockouts. This figure shows that strains expressing THCAS fused to a ProA signal sequence in dPRB1 and/or dPEP4 knockout backgrounds produce more than 10-fold more THCA than strains without ProA and protease knockout.

Abstract

The present disclosure relates to the production of cannabinoids in either recombinant microorganisms or cell-free systems using a combination of enzymes, including but not limited to a PKS enzyme, a npgA enzyme, a cs-OLAS-1, a pp-DVAS-1, a cs-HEX-1 and/or a butyryl synthase.

Description

    TECHNICAL FIELD
  • The present disclosure relates to improved methods of producing cannabinoids.
  • BACKGROUND
  • Cannabinoids are a general class of chemicals that act on cannabinoid receptors and other target molecules to modulate a wide range of physiological behaviour such as neurotransmitter release. Cannabinoids are produced naturally in humans (called endocannabinoids) and by several plant species (called phytocannabinoids) including Cannabis sativa. Cannabinoids have been shown to have several beneficial medical/therapeutic effects and therefore they are an active area of investigation by the pharmaceutical industry for use as pharmaceutical products for various diseases.
  • Currently the production of cannabinoids for pharmaceutical or other uses is done by chemical synthesis or through the extraction of cannabinoids from plants that are producing these cannabinoids, for example C. sativa. There are several drawbacks to the current methods of cannabinoid production. The chemical synthesis of various cannabinoids is a costly process when compared to the extraction of cannabinoids from naturally occurring plants. The chemical synthesis of cannabinoids also involves the use of chemicals that are not environmentally friendly, which can be considered as an additional cost to their production. Furthermore, synthetically produced cannabinoids have been classified as less pharmacologically active than those extracted from plants such as C. sativa. Although there are drawbacks to chemically synthesized cannabinoids, the benefit of this production method is that the end product is a highly pure single cannabinoid. This level of purity is preferred for pharmaceutical use. The level of purity required by the pharmaceutical industry is reflected by the fact that no plant-extract-based cannabinoid production has received FDA approval yet and only synthetic compounds have been approved.
  • In contrast to the synthetic chemical production of cannabinoids, the other method that is currently used to produce cannabinoids is production of cannabinoids in plants that naturally produce these chemicals; the most used plant for this is C. sativa. In this method, the plant C. sativa is cultivated and during the flowering cycle various cannabinoids are produced naturally by the plant. The plant can be harvested and the cannabinoids can be ingested for pharmaceutical purposes in various methods directly from the plant itself or the cannabinoids can be extracted from the plant. There are multiple methods to extract the cannabinoids from the plant C. sativa. All of these methods typically involve placing the plant, C. sativa that contains the cannabinoids, into a chemical solution that selectively solubilizes the cannabinoids into this solution. There are various chemical solutions used to do this such as hexane, cold water extraction methods, CO2 extraction methods, and others. This chemical solution, now containing all the different cannabinoids, can then be removed, leaving behind the excess plant material. The cannabinoid containing solution can then be further processed for use.
  • There are several drawbacks of the natural production and extraction of cannabinoids in plants such as C. sativa. Since there are numerous cannabinoids produced by C. sativa it is often difficult to reproduce identical cannabinoid profiles in plants using an extraction process. Furthermore, variations in plant growth will lead to different levels of cannabinoids in the plant itself making reproducible extraction difficult. Different cannabinoid profiles will have different pharmaceutical effects which are not desired for a pharmaceutical product. Furthermore, the extraction of cannabinoids from C. sativa extracts produces a mixture of cannabinoids and not a highly pure single pharmaceutical compound. Since many cannabinoids are similar in structure it is difficult to purify these mixtures to a high level resulting in cannabinoid contamination of the end product.
  • There is thus a need to provide improved methods of cannabinoid production.
  • SUMMARY
  • This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Other features, details, utilities, and advantages of the claimed subject matter will be apparent from the following written Detailed Description including those aspects illustrated in the accompanying drawings and defined in the appended claims. As described herein, the following claims are made:
    • 1. A Polyketide Synthase (PKS) enzyme comprising the amino acid sequence selected from:
      • a. SEQ ID NO:1 (C. stellaris-OLAs-dACP1);
      • b. SEQ ID NO:2 (C. stellaris-OLAs-dACP2);
      • c. SEQ ID NO:3 (C. stellaris-OLAs-wt (wild type C. stellaris));
      • d. SEQ ID NO:6 (C. grayi-PKS-dACP1);
      • e. SEQ ID NO:7 (C. grayi-PKS-dACP2);
      • f. SEQ ID NO:40 (P. furfuracea);
      • g. SEQ ID NO:41 (cs-OLAS-1);
      • h. SEQ ID NO:42 (pp-DVAS-1);
      • i. a PKS enzyme variant of any one of SEQ ID NO:4-5 and 40 (C. grayi, C. uncialis), wherein one of the two ACP domains has been inactivated;
      • j. a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any one of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain;
      • k. a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence similarity to any one of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain;
      • l. a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any one of the domains selected from: SAT domain, KS domain, AT domain, PT domain, ACP1 domain, ACP2 domain, and TE domain of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain; or
      • m. any combination of (a)-(l).
    • 2. A polynucleotide encoding the PKS enzyme of claim 1.
    • 3. A composition comprising:
      • a. the PKS enzyme of claim 1 selected from SEQ ID NO:1-7 and 40 or variant thereof and a npgA enzyme;
      • b. the cs-OLAS-1 of SEQ ID NO:41 or variant thereof, a cs-HEX-1 of SEQ ID NO:43 or variant thereof, and a npgA enzyme; or
      • c. the pp-DVAS-1 of SEQ ID NO:42 or variant thereof, a pp-BUT-1 of SEQ ID NO:44 or variant thereof, and a npgA enzyme.
    • 4. The composition of claim 3, wherein said composition is a cell-free composition.
    • 5. The composition of claim 3, wherein said composition further comprises a recombinant microorganism.
    • 6. The composition of claim 5, wherein said recombinant microorganism:
      • a. expresses the PKS enzyme of claim 1; and/or
      • b. expresses the npgA enzyme; and/or
      • c. expresses the cs-OLAS-1 or variant thereof and the cs-HEX-1 or variant thereof; and/or
      • d. expresses the pp-DVAS-1 or variant thereof and the pp-BUT-1 or variant thereof; and/or
      • e. comprises the polynucleotide of claim 2.
    • 7. The composition of any one of claims 3-6, wherein said composition further comprises at least one enzyme selected from:
      • a. a FAS1 mutant, wherein mutations are selected from I306A, R1834K;
      • b. a FAS2 mutant, wherein said mutation is selected from G1250S, M1251W;
      • c. StcJ and StcK;
      • d. HexA and HexB;
      • e. ERG10;
      • f. ERG13;
      • g. HMGR;
      • h. tHMGR (truncated HMGR);
      • i. ERG12;
      • j. ERG8;
      • k. ERG19;
      • l. IDI1;
      • m. an ERG20 mutant, wherein said mutant is selected from
        • i. S. cerevisiae ERG20F96W/N127W or Y. lipolytica ERG20F88W/N119W or
        • ii. S. cerevisiae ERG20K197E or Y. lipolytica ERG20K189E;
      • n. a mutant NphB (mutNphB) (preferably with at least one of the mutations Q161A, G286S, Y288A, A232S);
      • o. csPT1;
      • p. csPT4;
      • q. a tetrahydrocannabinolic acid synthase (THCAS);
      • r. a cannabidiolic acid synthase (CBDAS);
      • s. a cannabichromenic acid synthase (CBCAS); or
      • t. any combination of (a)-(s).
    • 8. The composition of any one of claims 5-7, wherein said recombinant microorganism overexpresses a protein selected from:
      • a. the PKS enzyme of claim 1;
      • b. the npgA enzyme;
      • c. the cs-OLAS-1 or variant thereof and the cs-HEX-1 or variant thereof;
      • d. the pp-DVAS-1 or variant thereof and the pp-BUT-1 or variant thereof; and/or
      • e. the enzyme of claim 7.
    • 9. The composition of claim 8, wherein said protein is overexpressed by:
      • a. operably associating a strong promoter with a polynucleotide encoding the protein; and/or
      • b. multiple copies of a polynucleotide encoding the protein by the recombinant microorganism.
    • 10. The composition of any one of claims 5-9, wherein said recombinant microorganism further comprises inactivation of:
      • a. PEX10; and/or
      • b. CPR1; and/or
      • c. PEP4 (from S. cerevisiae, YALI0F27071p in YL); and/or
      • d. PRB1 (from S. cerevisiae, YALI0B16500p and/or YALI0A06435p in YL).
    • 11. The composition of any one of claims 3-10, wherein the composition further comprises any one of:
      • a. Compound II, wherein n is 1 (Butyryl-CoA), 2 (Hexanoyl-CoA) or 3 (Octanoyl-CoA);
  • Figure US20220403346A1-20221222-C00001
  • and/or
      • b. Compound III, wherein n is 1 (Butyric Acid), 2 (Hexanoic Acid) or 3 (Octanoic Acid);
  • Figure US20220403346A1-20221222-C00002
    • 12. The composition of any one of claims 3-11, wherein the composition further comprises at least one cannabinoid or cannabinoid precursor.
    • 13. The composition of claim 12, wherein the at least one cannabinoid or cannabinoid precursor comprises CBGA, THCA, CBDA, CBCA, CBD, THC, CBC, CBGVA, THCVA, CBDVA, CBCVA, CBDV, THCV, CBCV, THCA-C7, CBDA-C7, CBGA-C7, CBCA-C7, CBD-C7, THC-C7, CBC-C7, or CBN analog.
    • 14. A method of producing Compound I, wherein said method comprises contacting the composition of any one of claims 3-13 with a carbohydrate source to enzymatically produce Compound I, wherein Compound I is
  • Figure US20220403346A1-20221222-C00003
      • wherein n is selected from 1 (Divaric acid), 2 (Olivetolic acid), or 3 (2,4-Dihydroxy-6-heptylbenzoic acid).
    • 15. The method of claim 14, wherein the carbohydrate source is selected from:
      • a. Acetyl-CoA;
      • b. Malonyl-CoA;
      • c. Mevalonate;
      • d. Compound II;
      • e. Compound III; and/or
      • f. Compound IV, wherein Compound IV is

  • CH3—(CH2)2n—OH  Compound IV
        • wherein n is selected from 1 (propanol), 2 (pentanol), or 3 (heptanol);
    • 16. The method of either claim 14 or 15, wherein the carbohydrate source is exogenously provided.
    • 17. The method of any one of claims 14-16, wherein said carbohydrate source is provided by enzymatically converting Compound III into Compound II.
    • 18. The method of claim 17, wherein the enzyme that converts Compound III into Compound II is selected from:
      • a. CsAAE1
      • b. AAL1ΔSKL; or
      • c. AAL1.
    • 19. The method of any one of claims 14-16, wherein acetyl-CoA and malonyl-CoA are enzymatically converted into Compound II by the combination of enzymes selected from:
      • a. StcJ and StcK;
      • b. HexA and HexB; or
      • c. MutFas1 and MutFas2.
    • 20. The method of any one of claims 14-19, wherein Compound II is enzymatically converted into Compound I.
    • 21. The method of claim 20, wherein the conversion of Compound II into Compound I is by the PKS enzyme of claim 1 (a)-(f) or (i)-(m) and a npgA enzyme.
    • 22. The method of any one of claims 14-16, wherein acetyl-CoA and malonyl-CoA are enzymatically converted into Compound I by the combination of enzymes selected from:
      • a. the cs-OLAS-1 of SEQ ID NO:41 or variant thereof, cs-HEX-1 of SEQ ID NO:43 or variant thereof, and the npgA enzyme; or
      • b. the pp-DVAS-1 of SEQ ID NO:42 or variant thereof, a pp-BUT-1 of SEQ ID NO:44 or variant thereof and the npgA enzyme.
    • 23. The method of any one of claims 14-22, wherein said method further comprises enzymatically converting Acetyl-CoA into Mevalonate by:
      • a. ERG10;
      • b. ERG13; or
      • c. one or both of HMGR or tHMGR.
    • 24. The method of claim 23, wherein Mevalonate is further enzymatically converted into Geranyldiphosphate (GPP) by:
      • a. ERG12;
      • b. ERG8;
      • c. ERG19;
      • d. IDI1; and
      • e. an ERG20 mutant, wherein said mutant is selected from
        • i. S. cerevisiae ERG20F96W/N127W or Y. lipolytica ERG20F88W/N119W or
        • ii. S. cerevisiae ERG20K197E or Y. lipolytica ERG20K189E.
    • 25. The method of any one of claims 14-24, wherein Geranyldiphosphate is exogenously provided.
    • 26. The method of either claim 24 or 25 wherein said method further comprises enzymatically converting Compound I and Geranyldiphosphate into at least one cannabinoid or cannabinoid precursor.
    • 27. The method of claim 26, wherein the at least one cannabinoid or cannabinoid precursor comprises CBGA, THCA, CBDA, CBCA, CBD, THC, CBC, CBGVA, THCVA, CBDVA, CBCVA, CBDV, THCV, CBCV, THCA-C7, CBDA-C7, CBGA-C7, CBCA-C7, CBD-C7, THC-C7, CBC-C7, or CBN analog.
    • 28. The method of either claim 26 or 27, wherein Compound I and Geranyldiphosphate are enzymatically converted into the at least one cannabinoid precursor by mutNphB, csPT1 and/or csPT4.
    • 29. The method of any one of claims 26-28, wherein the cannabinoid precursor is a CBGA analog.
    • 30. The method of claim 29, wherein the CBGA analog is further enzymatically converted into a CBDA analog, a THCA analog and/or a CBCA analog by a CBDAS, a THCAS, and/or a CBCAS.
    • 31. The method of claim 30, wherein the CBDAS, THCAS, and/or the CBCAS comprises a ProA signal sequence.
    • 32. The method of any one of claims 14-31, wherein the method is carried out in a microorganism lacking functional PEP4 and/or PRB1 activity.
    • 33. The method of any one of claims 14-32, wherein Compound I, the at least one cannabinoid or cannabinoid precursor, or the CBGA, THCA, CBDA, CBCA, CBD, THC, CBC, CBGVA, THCVA, CBDVA, CBCVA, CBDV, THCV, CBCV, THCA-C7, CBDA-C7, CBGA-C7, CBCA-C7, CBD-C7, THC-C7, CBC-C7, or CBN analog is recovered.
    • 34. The method of any one of claims 14-32, wherein Compound I, the at least one cannabinoid or cannabinoid precursor, or the CBGA, THCA, CBDA, CBCA, CBD, THC, CBC, CBGVA, THCVA, CBDVA, CBCVA, CBDV, THCV, CBCV, THCA-C7, CBDA-C7, CBGA-C7, CBCA-C7, CBD-C7, THC-C7, CBC-C7, or CBN analog is purified.
    • 35. The Compound I, the at least one cannabinoid or cannabinoid precursor, or the CBGA, THCA, CBDA, CBCA, CBD, THC, CBC, CBGVA, THCVA, CBDVA, CBCVA, CBDV, THCV, CBCV, THCA-C7, CBDA-C7, CBGA-C7, CBCA-C7, CBD-C7, THC-C7, CBC-C7, or CBN analog acid produced by the method of any one of claims 14-34.
    • 36. The composition of any one of claims 5-13 or the method of any one of claims 14-35, wherein the recombinant microorganism is selected from: bacteria, fungi, yeasts, algae, and archaea.
    • 37. The composition or method of claim 36, wherein said recombinant microorganism is a yeast.
    • 38. The composition or method of claim 37, wherein said yeast is oleaginous.
    • 39. The composition or method of claim 38, wherein the yeast is selected from the genera Rhodosporidium, Rhodotorula, Yarrowia, Cryptococcus, Candida, Lipomyces and Trichosporon.
    • 40. The composition or method of claim 38, wherein said yeast is a Yarrowia lipolytica, a Lipomyces starkeyi, a Rhodosporidium toruloides, a Rhodotorula glutinis, a Trichosporon fermentans or a Cryptococcus curvatus.
    • 41. The composition or method of any one of claims 36-40, wherein the yeast comprises at least 5%, at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, at least 15%, at least 16%, at least 17%, at least 18%, at least 19%, at least 20%, at least 21%, at least 22%, at least 23%, at least 24%, or at least 25% dry weight of fatty acids or fats.
    • 42. The composition or method of any one of claims 36-40, wherein the yeast is genetically modified to produce at least 5%, at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, at least 15%, at least 16%, at least 17%, at least 18%, at least 19%, at least 20%, at least 21%, at least 22%, at least 23%, at least 24%, or at least 25% dry weight of fatty acids or fats.
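  • As a reading aid for the chain-length index n used for Compounds I to IV in the claims above, the following sketch tabulates the names listed in the claims and works out the carbon counts implied by the Compound IV formula CH3-(CH2)2n-OH. The function names are hypothetical, and the 2n+2 relation for the acyl chain of Compounds II and III is inferred from the names given in the claims (butyryl, hexanoyl, octanoyl) rather than stated in the text.

```python
# Names as listed in the claims for each value of n.
NAMES = {
    1: {"II": "Butyryl-CoA", "III": "Butyric acid", "IV": "propanol"},
    2: {"II": "Hexanoyl-CoA", "III": "Hexanoic acid", "IV": "pentanol"},
    3: {"II": "Octanoyl-CoA", "III": "Octanoic acid", "IV": "heptanol"},
}


def compound_iv_carbons(n):
    """Carbon count of CH3-(CH2)2n-OH: one methyl carbon plus 2n methylene carbons."""
    return 2 * n + 1


def compound_iv_formula(n):
    """Molecular formula of the saturated alcohol, C(c)H(2c+2)O."""
    c = compound_iv_carbons(n)
    return f"C{c}H{2 * c + 2}O"


def acyl_carbons(n):
    """Acyl chain length of Compounds II and III (assumption inferred from the names)."""
    return 2 * n + 2


if __name__ == "__main__":
    for n in sorted(NAMES):
        print(n, NAMES[n]["IV"], compound_iv_formula(n),
              NAMES[n]["II"], f"C{acyl_carbons(n)} acyl chain")
```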
    BRIEF DESCRIPTION OF DRAWINGS
  • Embodiments of the present disclosure will be discussed with reference to the accompanying drawings wherein:
  • FIG. 1A illustrates a first enzymatic pathway as described herein for producing Compound I from the starting materials of either Compound III and/or Compound II.
  • FIG. 1B illustrates a second enzymatic pathway as described herein for producing Compound I from the starting materials of either Compound II and/or Acetyl-CoA and Malonyl CoA.
  • FIG. 1C illustrates a third enzymatic pathway as described herein for producing Compound I from the starting materials Acetyl-CoA and Malonyl CoA.
  • FIG. 2 is a diagram of the cannabinoid synthesis pathway including nonenzymatic steps starting with a CBGA-Analog.
  • FIG. 3 illustrates the enzymatic pathway as described herein for producing GPP from different carbohydrate sources.
  • FIG. 4 describes the structures for Compound I, II, III and IV.
  • FIGS. 5A-B describe the structures for Cannabinoid Precursors (FIG. 5A) and Cannabinoids (FIG. 5B).
  • FIG. 6A is an alignment of SEQ ID NOs: 3-5 and 40 showing identical (*) vs conserved amino acid (.) between the four sequences.
  • FIG. 6B is an alignment of SEQ ID NOs: 3-5 and 40-42 showing identical (*) vs conserved amino acid (.) between the six sequences.
  • FIG. 7 provides a list of abbreviations used throughout the specification.
  • FIG. 8 is an enzymatic assay used to illustrate the effect of different mutations on the NphB gene on the production of Olivetolic Acid.
  • FIG. 9A is a Western blot showing the production of cytoplasmic THCAS when no ProA signal sequence is used. FIG. 9B shows the production of correctly glycosylated THCAS when ProA24 is used in dPRB1, dPEP4 and dPRB1+dPEP4 knockout yeast strains. FIG. 9C shows that the ProA19-ProA24 signal sequence can produce equally large amounts of THCAS. FIG. 9D shows THCA production is 10 times greater when produced in dPRB1 and/or dPEP4 knockout strains with THCAS fused to a ProA signal sequence.
  • DESCRIPTION OF EMBODIMENTS
  • Definitions
  • The following definitions are provided for specific terms which are used in the following written description.
  • As used in the specification and claims, the singular forms “a”, “an” and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a cannabinoid precursor” includes a plurality of precursors, including mixtures thereof. The term “a polynucleotide” includes a plurality of polynucleotides.
  • As used herein, the term “comprising” is intended to mean that the compositions and methods include the recited elements, but do not exclude other elements. “Consisting essentially of” shall mean excluding other elements of any essential significance to the combination. Thus, compositions consisting essentially of produced cannabinoids would not exclude trace contaminants from the isolation and purification method and pharmaceutically acceptable carriers, such as phosphate buffered saline, preservatives, and the like. “Consisting of” shall mean excluding more than trace elements of other ingredients and substantial method steps for produced cannabinoids. Embodiments defined by each of these transition terms are within the scope of this invention.
  • The term “about” or “approximately” means within an acceptable range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, e.g., the limitations of the measurement system. For example, “about” can mean a range of up to 20%, preferably up to 10%, more preferably up to 5%, and more preferably still up to 1% of a given value. Alternatively, particularly with respect to biological systems or processes, the term can mean within an order of magnitude, preferably within 5 fold, and more preferably within 2 fold, of a value. Unless otherwise stated, the term ‘about’ means within an acceptable error range for the particular value, such as ±1-20%, preferably ±1-10% and more preferably ±1-5%.
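  • The two readings of “about” given above (a percentage band around a value, or a fold range for biological systems) can be expressed as simple checks. This is a minimal sketch; the function names and the default thresholds (20% and 10-fold, taken as the widest bounds named in the definition, with "an order of magnitude" read as 10-fold) are choices made for illustration only.

```python
def within_about(value, reference, percent=20.0):
    """True if value lies within +/- percent of reference."""
    if reference == 0:
        raise ValueError("reference must be non-zero for a percentage band")
    return abs(value - reference) <= abs(reference) * percent / 100.0


def within_fold(value, reference, fold=10.0):
    """True if value lies between reference/fold and reference*fold (positive values only)."""
    if value <= 0 or reference <= 0 or fold < 1:
        raise ValueError("value and reference must be positive and fold >= 1")
    ratio = value / reference
    return 1.0 / fold <= ratio <= fold


if __name__ == "__main__":
    print(within_about(108, 100, percent=10))   # inside a 10% band
    print(within_about(112, 100, percent=10))   # outside a 10% band
    print(within_fold(30, 10, fold=5))          # within 5 fold
```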
  • Where a range of values is provided, it is understood that each intervening value, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
  • As used herein, the terms “polynucleotide” and “nucleic acid molecule” are used interchangeably to refer to polymeric forms of nucleotides of any length. The polynucleotides may contain deoxyribonucleotides, ribonucleotides, and/or their analogs. Nucleotides may have any three-dimensional structure, and may perform any function, known or unknown. The term “polynucleotide” includes, for example, single-, double-stranded and triple helical molecules, a gene or gene fragment, exons, introns, mRNA, tRNA, rRNA, ribozymes, antisense molecules, cDNA, recombinant polynucleotides, branched polynucleotides, aptamers, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A nucleic acid molecule may also comprise modified nucleic acid molecules (e.g., comprising modified bases, sugars, and/or internucleotide linkers).
  • As used herein, the term “peptide” refers to a compound of two or more subunit amino acids, amino acid analogs, or peptidomimetics. The subunits may be linked by peptide bonds or by other bonds (e.g., as esters, ethers, and the like).
  • As used herein, the term “amino acid” refers to either natural and/or unnatural or synthetic amino acids, including glycine and both D or L optical isomers, and amino acid analogs and peptidomimetics. A peptide of three or more amino acids is commonly called an oligopeptide if the peptide chain is short. If the peptide chain is long (e.g., greater than about 10 amino acids), the peptide is commonly called a polypeptide or a protein. While the term “protein” encompasses the term “polypeptide”, a “polypeptide” may be a less than full-length protein.
  • As used herein, “expression” refers to the process by which polynucleotides are transcribed into mRNA and/or translated into peptides, polypeptides, or proteins. If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA transcribed from the genomic DNA.
  • As used herein, “under transcriptional control” or “operably linked” refers to expression (e.g., transcription or translation) of a polynucleotide sequence which is controlled by an appropriate juxtaposition of an expression control element and a coding sequence. In one aspect, a DNA sequence is “operatively linked” to an expression control sequence when the expression control sequence controls and regulates the transcription of that DNA sequence.
  • As used herein, “coding sequence” is a sequence which is transcribed and translated into a polypeptide when placed under the control of appropriate expression control sequences. The boundaries of a coding sequence are determined by a start codon at the 5′ (amino) terminus and a translation stop codon at the 3′ (carboxyl) terminus. A coding sequence can include, but is not limited to, a prokaryotic sequence, cDNA from eukaryotic mRNA, a genomic DNA sequence from eukaryotic (e.g., yeast, or mammalian) DNA, and even synthetic DNA sequences. A polyadenylation signal and transcription termination sequence will usually be located 3′ to the coding sequence.
  • As used herein, two coding sequences “correspond” to each other if the sequences or their complementary sequences encode the same amino acid sequences.
  • As used herein, “signal sequence” denotes the endoplasmic reticulum translocation sequence. This sequence encodes a signal peptide that communicates to a cell to direct a polypeptide to which it is linked (e.g., via a chemical bond) to an endoplasmic reticulum vesicular compartment, to enter an exocytic/endocytic organelle, to be delivered either to a cellular vesicular compartment, the cell surface or to secrete the polypeptide. This signal sequence is sometimes clipped off by the cell in the maturation of a protein. Signal sequences can be found associated with a variety of proteins native to prokaryotes and eukaryotes.
  • As used herein, “hybridization” refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of a PCR reaction, or the enzymatic cleavage of a polynucleotide by a ribozyme.
  • As used herein, a polynucleotide or polynucleotide domain (or a polypeptide or polypeptide domain) which has a certain percentage (for example, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%) of “sequence identity” to another sequence means that, when maximally aligned, using software programs routine in the art, that percentage of bases (or amino acids) are the same in comparing the two sequences.
  • Two polypeptide sequences are “substantially homologous” or “substantially similar” when at least about 50%, at least about 60%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99% of amino acid residues of the polypeptide match conservative amino acids over a defined length of the polypeptide sequence.
  • Sequences that are similar (e.g., substantially homologous) can be identified by comparing the sequences using standard software available in sequence data banks.
  • Substantially homologous nucleic acid sequences also can be identified in a Southern hybridization experiment under, for example, stringent conditions as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. For example, stringent conditions can be: hybridization at 5×SSC and 50% formamide at 42° C., and washing at 0.1×SSC and 0.1% sodium dodecyl sulfate at 60° C. Further examples of stringent hybridization conditions include: incubation temperatures of about 25 degrees C. to about 37 degrees C.; hybridization buffer concentrations of about 6×SSC to about 10×SSC; formamide concentrations of about 0% to about 25%; and wash solutions of about 6×SSC. Examples of moderate hybridization conditions include: incubation temperatures of about 40 degrees C. to about 50 degrees C.; buffer concentrations of about 9×SSC to about 2×SSC; formamide concentrations of about 30% to about 50%; and wash solutions of about 5×SSC to about 2×SSC. Examples of high stringency conditions include: incubation temperatures of about 55 degrees C. to about 68 degrees C.; buffer concentrations of about 1×SSC to about 0.1×SSC; formamide concentrations of about 55% to about 75%; and wash solutions of about 1×SSC, 0.1×SSC, or deionized water. In general, hybridization incubation times are from 5 minutes to 24 hours, with 1, 2, or more washing steps, and wash incubation times are about 1, 2, or 15 minutes. SSC is 0.15 M NaCl and 15 mM citrate buffer. It is understood that equivalents of SSC using other buffer systems can be employed. Similarity can be verified by sequencing, but preferably, is also or alternatively, verified by function (e.g., ability to traffic to an endosomal compartment, and the like), using assays suitable for the particular domain in question.
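  • Because the buffer strengths above are all given as multiples of SSC, it can help to convert them to molar concentrations using the stated definition (1×SSC is 0.15 M NaCl and 15 mM citrate). The short sketch below does only that arithmetic; the function name `ssc_composition` is hypothetical.

```python
NACL_M_PER_X = 0.15       # mol/l NaCl in 1x SSC, as stated in the text
CITRATE_MM_PER_X = 15.0   # mmol/l citrate in 1x SSC, as stated in the text


def ssc_composition(fold):
    """Return (NaCl in M, citrate in mM) for a buffer of fold x SSC."""
    if fold < 0:
        raise ValueError("fold must be non-negative")
    return fold * NACL_M_PER_X, fold * CITRATE_MM_PER_X


if __name__ == "__main__":
    for fold in (0.1, 1, 2, 5, 6, 10):
        nacl, citrate = ssc_composition(fold)
        print(f"{fold}x SSC: {nacl:.3f} M NaCl, {citrate:.1f} mM citrate")
```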
  • The terms “percent (%) sequence similarity”, “percent (%) sequence identity”, and the like, generally refer to the degree of identity or similarity between different nucleotide sequences of nucleic acid molecules or amino acid sequences of polypeptides that may or may not share a common evolutionary origin (see Reeck et al., supra). Sequence identity can be determined using any of a number of publicly available sequence comparison algorithms, such as BLAST, FASTA, DNA Strider, GCG (Genetics Computer Group, Program Manual for the GCG Package, Version 7, Madison, Wis.), etc.
  • To determine the percent identity between two amino acid sequences or two nucleic acid molecules, the sequences are aligned for optimal comparison purposes. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., percent identity=number of identical positions/total number of positions (e.g., overlapping positions)×100). In one embodiment, the two sequences are, or are about, of the same length. The percent identity between two sequences can be determined using techniques similar to those described below, with or without allowing gaps. In calculating percent sequence identity, typically exact matches are counted.
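  • The percent identity formula above can be illustrated on two sequences that have already been aligned. This is a minimal sketch, not the procedure of any particular alignment program: the function name is hypothetical, "-" is assumed as the gap character, and "overlapping positions" is interpreted as columns where neither sequence has a gap (set `count_gaps=True` to count gapped columns in the denominator instead).

```python
def percent_identity(aligned_a, aligned_b, gap="-", count_gaps=False):
    """Percent identity = identical positions / total positions x 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    total = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue                      # column carries no information
        if a == gap or b == gap:
            if count_gaps:
                total += 1                # gapped column counts as a non-match
            continue
        total += 1
        if a == b:
            identical += 1                # exact matches only
    if total == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / total


if __name__ == "__main__":
    print(percent_identity("MKT-AL", "MKSQAL"))                   # gaps excluded
    print(percent_identity("MKT-AL", "MKSQAL", count_gaps=True))  # gaps counted
```

  • The choice of denominator changes the result, which is one reason the text notes that identity can be computed with or without allowing gaps.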
  • The determination of percent identity between two sequences can be accomplished using a mathematical algorithm. A non-limiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin and Altschul, Proc. Natl. Acad. Sci. USA 1990, 87:2264, modified as in Karlin and Altschul, Proc. Natl. Acad. Sci. USA 1993, 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs of Altschul et al, J. Mol. Biol. 1990; 215: 403. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12, to obtain nucleotide sequences homologous to sequences of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3, to obtain amino acid sequences homologous to protein sequences of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al, Nucleic Acids Res. 1997, 25:3389. Alternatively, PSI-Blast can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra. When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See ncbi.nlm.nih.gov/BLAST/ on the WorldWideWeb.
  • To determine the percent similarity between two amino acid sequences, the sequences are also aligned for optimal comparison purposes. The percent similarity between the two sequences is a function of the number of conserved amino acids at positions shared by the sequences (i.e., percent similarity=number of conserved amino acids positions/total number of positions (e.g., overlapping positions)×100). In one embodiment, the two sequences are, or are about, of the same length. The percent similarity between two sequences can be determined using techniques similar to those described below, with or without allowing gaps. In calculating percent sequence similarity, typically conserved matches are counted.
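  • A minimal sketch of the percent similarity calculation follows. The text does not specify which substitutions count as conserved, so the grouping in `CONSERVATIVE_GROUPS` is a hypothetical choice made for illustration (a substitution matrix such as BLOSUM62 could be used instead). Identical residues are treated as conserved, and the input is assumed to be already aligned.

```python
# Hypothetical conservative-substitution groups (an assumption for this sketch;
# the text does not define which substitutions count as "conserved").
CONSERVATIVE_GROUPS = ("AVLIM", "FYW", "KRH", "DE", "STNQ")
GROUP_OF = {res: idx for idx, group in enumerate(CONSERVATIVE_GROUPS) for res in group}


def is_conserved(x: str, y: str) -> bool:
    """True if the two aligned residues are identical or share a group."""
    if x == "-" or y == "-":
        return False  # a gap is never a conserved position
    if x == y:
        return True
    return x in GROUP_OF and GROUP_OF.get(x) == GROUP_OF.get(y)


def percent_similarity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions that hold conserved residues."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    conserved = sum(is_conserved(x, y) for x, y in zip(aligned_a, aligned_b))
    return 100.0 * conserved / len(aligned_a)
```

  • Under these assumed groups, a V-to-I substitution counts as conserved while a P-to-L substitution does not, so two sequences can have a higher percent similarity than percent identity.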
  • Another non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS 1988; 4: 11-17. Such an algorithm is incorporated into the ALIGN program (version 2.0), which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used.
  • In a preferred embodiment, the percent identity between two amino acid sequences is determined using the algorithm of Needleman and Wunsch (J. Mol. Biol. 1970, 48:444-453), which has been incorporated into the GAP program in the GCG software package (Accelrys, Burlington, Mass.; available at accelrys.com on the WorldWideWeb), using either a BLOSUM62 matrix or a PAM250 matrix, a gap weight of 16, 14, 12, 10, 8, 6, or 4, and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package using a NWSgapdna.CMP matrix, a gap weight of 40, 50, 60, 70, or 80, and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that can be used if the practitioner is uncertain about what parameters should be applied to determine if a molecule is within a sequence identity or homology limitation of the invention) is a BLOSUM62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
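  • For orientation, the global alignment idea behind the Needleman and Wunsch algorithm can be sketched as below. This is not the GAP program: it uses unit match and mismatch scores and a single linear gap penalty instead of a substitution matrix with separate gap and length weights. The function name and default scores are assumptions for the sketch.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment with a linear gap penalty.

    Returns (aligned_a, aligned_b, score); '-' marks a gap.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

  • The two returned strings have equal length, so identical positions can be counted column by column to obtain a percent identity. Production tools differ in scoring details, so values from this sketch should not be expected to match GAP or BLAST output.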
  • Another non-limiting example of how percent identity can be determined is by using software programs such as those described in Current Protocols In Molecular Biology (F. M. Ausubel et al., eds., 1987) Supplement 30, section 7.7.18, Table 7.7.1. Preferably, default parameters are used for alignment. A preferred alignment program is BLAST, using default parameters. In particular, preferred programs are BLASTN and BLASTP, using the following default parameters: Genetic code=standard; filter=none; strand=both; cutoff=60; expect=10; Matrix=BLOSUM62; Descriptions=50 sequences; sort by=HIGH SCORE; Databases=non-redundant, GenBank+EMBL+DDBJ+PDB+GenBank CDS translations+SwissProtein+SPupdate+PIR. Details of these programs can be found at the following Internet address: http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST.
  • Statistical analysis of the properties described herein may be carried out by standard tests, for example, t-tests, ANOVA, or Chi squared tests. Typically, statistical significance will be measured to a level of p=0.05 (5%), more preferably p=0.01, p=0.001, p=0.0001, or p=0.000001.
  • “Conservatively modified variants” of domain sequences also can be provided. With respect to particular nucleic acid sequences, conservatively modified variants refer to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Specifically, degenerate codon substitutions can be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer, et al., 1991, Nucleic Acid Res. 19: 5081; Ohtsuka, et al., 1985, J. Biol. Chem. 260: 2605-2608; Rossolini et al., 1994, Mol. Cell. Probes 8: 91-98).
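  • The simplest case of a conservatively modified nucleic acid variant, in which synonymous third-position changes leave the encoded amino acid sequence unchanged, can be checked as sketched below. The sketch assumes the standard genetic code and plain A/C/G/T input; it does not model mixed-base or deoxyinosine residues, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (length divisible by 3) to protein."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_protein(dna_a: str, dna_b: str) -> bool:
    """True if both nucleic acids encode identical amino acid sequences."""
    return translate(dna_a) == translate(dna_b)
```

  • For example, ATGACTCCG and ATGACCCCA differ at two third-codon positions but both translate to MTP, so `encode_same_protein` returns True for that pair.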
  • Unless otherwise described, variants of the disclosed gene retain the activity of the wild type protein from which the variant was derived, although the activity may not be at the same level. In preferred embodiments, the variants have at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence. In preferred embodiments, the variant has improved activity as compared to the original sequence. For example, variants with improved activity have at least about 110%, at least about 120%, at least about 130%, at least about 140%, at least about 150%, or at least about 160% efficacy compared to the original sequence.
  • For example, a variant common cannabinoid synthesising protein, such as CBDAS, must retain the ability to cyclize CBGA to produce CBDA with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence. In preferred embodiments, a variant common cannabinoid protein, such as CBDAS, has improved activity over the sequence from which it is derived in that the improved variant common cannabinoid protein has more than 110%, 120%, 130%, 140%, or 150% improved activity in cyclizing CBGA to produce CBDA, as compared to the sequence from which the improved variant is derived.
  • The terms “biologically active fragment”, “biologically active form”, “biologically active equivalent”, and “functional derivative” of a wild-type protein refer to a form that possesses a biological activity that is at least substantially equal to (e.g., not significantly different from) the biological activity of the wild type protein as measured using an assay suitable for detecting the activity.
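  • The efficacy thresholds above amount to a ratio of variant activity to original activity. A minimal sketch follows; the activity values are assumed to come from the same assay in the same units (for example, CBDA formed per unit time), and the function names, labels, and default thresholds of 50% and 110% are illustrative choices drawn from the ranges in the text.

```python
def relative_efficacy(variant_activity: float, original_activity: float) -> float:
    """Variant activity as a percentage of the original sequence's activity."""
    if original_activity <= 0:
        raise ValueError("original activity must be positive")
    return 100.0 * variant_activity / original_activity


def classify_variant(
    variant_activity: float,
    original_activity: float,
    retain_threshold: float = 50.0,     # lowest "retained" level named in the text
    improved_threshold: float = 110.0,  # lowest "improved" level named in the text
) -> str:
    """Label a variant as 'improved', 'retained', or 'below threshold'."""
    pct = relative_efficacy(variant_activity, original_activity)
    if pct >= improved_threshold:
        return "improved"
    if pct >= retain_threshold:
        return "retained"
    return "below threshold"
```

  • With an original activity of 10.0 units, a variant measuring 6.0 units is at 60% and is labelled "retained", while one measuring 12.0 units is at 120% and is labelled "improved".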
  • As used herein, the term “isolated” or “purified” means separated (or substantially free) from constituents, cellular and otherwise, in which the polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, are normally associated with in nature. As is apparent to those of skill in the art, a non-naturally occurring polynucleotide, peptide, polypeptide, protein, antibody, or fragments thereof, does not require “isolation” to distinguish it from its naturally occurring counterpart. By substantially free or substantially purified, it is meant at least 50% of the population, preferably at least 70%, more preferably at least 80%, and even more preferably at least 90%, are free of the components with which they are associated in nature.
  • A cell has been “transformed”, “transduced”, or “transfected” when nucleic acids have been introduced inside the cell. Transforming DNA may or may not be integrated (covalently linked) with chromosomal DNA making up the genome of the cell. For example, the polynucleotide may be maintained on an episomal element, such as a plasmid. A stably transformed cell is one in which the polynucleotide has become integrated into a chromosome so that it is inherited by daughter cells through chromosome replication. This stability is demonstrated by the ability of the cell to establish cell lines or clones comprised of a population of daughter cells containing the transformed polynucleotide. A “clone” is a population of cells derived from a single cell or common ancestor by mitosis. A “cell line” is a clone of a primary cell that is capable of stable growth in vitro for many generations (e.g., at least about 10).
  • A “vector” includes plasmids and viruses and any DNA or RNA molecule, whether self-replicating or not, which can be used to transform or transfect a cell.
  • As used herein, a “genetic modification” refers to any addition, deletion and/or substitution to a cell's normal nucleotides and/or addition of heterologous sequences. Any method which can achieve the genetic modification is within the spirit and scope of this invention. Art recognized methods include viral mediated gene transfer, liposome mediated transfer, transformation, transfection and transduction.
  • The practice of the present invention employs, unless otherwise indicated, conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Maniatis, Fritsch & Sambrook, In Molecular Cloning: A Laboratory Manual (1982); DNA Cloning: A Practical Approach, Volumes I and II (D. N. Glover, ed., 1985); Oligonucleotide Synthesis (M. J. Gait, ed., 1984); Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins, eds., 1985); Transcription and Translation (B. D. Hames & S. J. Higgins, eds., 1984); Animal Cell Culture (R. I. Freshney, ed., 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide to Molecular Cloning (1984).
  • Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications mentioned herein are incorporated by reference for the purpose of describing and disclosing devices, formulations and methodologies that may be used in connection with the presently described invention.
  • Pathway
  • A high-level biosynthetic route to produce cannabinoids and/or cannabinoid precursors is shown in FIGS. 1-3. The focus of one of these pathways is the production of Compound I from Compound II as shown in FIGS. 1A-1B using a PKS Enzyme in combination with an npgA Enzyme. Additional pathways can be added to this core pathway, including the production of (a) Compound II from Compound III; and/or (b) the production of Compound II from Acetyl-CoA and Malonyl CoA; and/or (c) the production of Compound III from Compound IV; and/or (d) the production of Compound III from Compound IV.
  • Alternatively, FIG. 1C shows the production of Compound I from acetyl-CoA and malonyl CoA using the described enzymes.
  • The biosynthetic routes as shown in FIGS. 1-3 can be used to produce the Compounds described in FIGS. 4-5. As shown in the Tables in FIGS. 4-5, the compounds comprise identical core structures but differ in the length of the C-tail (C-3 Tail, C-5 Tail, or C-7 Tail). Whether the starting materials (e.g., Compounds I-IV) comprise a C-3, C-5, or C-7 tail determines the resulting cannabinoid analogs and/or cannabinoid precursor analogs. Regardless of the length of the C-tail contained in the starting materials, the enzymatic pathways described herein can be used to convert each core structure.
  • Production of Compound I
  • As shown in FIGS. 1A and 1B, Compound I can be enzymatically produced from Compound II using a PKS Enzyme in combination with an npgA Enzyme. As used herein, a “PKS Enzyme” is defined as any one of the following amino acid sequences:
      • a. SEQ ID NO:1 (C. stellaris-OLAs-dACP1);
      • b. SEQ ID NO:2 (C. stellaris-OLAs-dACP2);
      • c. SEQ ID NO:3 (C. stellaris-OLAs-wt (wild type C. stellaris));
      • d. SEQ ID NO:6 (C. grayi-PKS-dACP1);
      • e. SEQ ID NO:7 (C. grayi-PKS-dACP2);
      • f. SEQ ID NO:40 (P. furfuracea);
      • g. SEQ ID NO:41 (cs-OLAS-1);
      • h. SEQ ID NO:42 (pp-DVAS-1);
      • i. a PKS enzyme variant of any one of SEQ ID NO:4-5 and 40 (C. stellaris, C. grayi, C. uncialis, P. furfuracea), wherein one of the two ACP domains has been inactivated;
      • j. a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any one of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain;
      • k. a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence similarity to any one of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain;
      • l. a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any one of the domains selected from: SAT domain, KS domain, AT domain, PT domain, ACP1 domain, ACP2 domain, and TE domain of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain; or
      • m. any combination of (a)-(l).
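  • Item (j) of the list above combines three conditions: a minimum sequence identity, retained PKS activity, and exactly one active ACP domain. A hedged sketch of that check is given below; the class `PksCandidate`, its field names, and the function `meets_item_j` are hypothetical, and the identity value and activity flags are assumed to have been determined separately (by alignment and by assay).

```python
from dataclasses import dataclass


@dataclass
class PksCandidate:
    """Hypothetical record describing a candidate PKS enzyme variant."""
    identity_to_reference: float  # percent identity to a reference SEQ ID NO
    retains_pks_activity: bool    # result of a suitable activity assay
    acp1_active: bool             # True if the ACP1 domain is active
    acp2_active: bool             # True if the ACP2 domain is active


def meets_item_j(candidate: PksCandidate, min_identity: float = 70.0) -> bool:
    """True if the candidate satisfies the three conditions of item (j)."""
    # Exactly one of the two ACP domains is active.
    one_active_acp = candidate.acp1_active != candidate.acp2_active
    return (
        candidate.identity_to_reference >= min_identity
        and candidate.retains_pks_activity
        and one_active_acp
    )
```

  • The default of 70% is the lowest identity level recited in item (j); a stricter level such as 95% can be passed as `min_identity`.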
  • The sequences corresponding to SEQ ID NO:1-7 and 40-42 are as follows:
  • C. stellaris-OLAs-dACP1
    (SEQ ID NO: 1)
    MTPPNNVVLFGDQTVDPCPVIKQLYRQSRDSLALQAFFRQSYEAVRREIATSEYSDRALFPSFD
    SIRALAEKQPEKHNEAVSTVLLCIAQLGLLLVHSDQDDSMFDAGPSKTYLVGLCTGMLPAAALA
    ASSSTSQLLRLAPEIVLVALRLGLEANRRSAQIEASTESWASVVPGMAPQEQQEALAQFNNEFM
    IPTSKQAYISAESDSTATISGPPSTLVSLFTSSDSFRKARRVKLPITAAFHAPHLRVPDSEKII
    GSLLNSDEYPLRNDVVIVSTRSGKPIRAQSLGDALQHIILDILREPIRWSRVIEEMIPNLKDQG
    VILTSAGPVRAADSLRQRMASAGIEVLMSTEMQPLREPRTKPRSSDIAIIGYAARLPESETLEE
    VWKILEDGRDVHKKIPNDRFDVDTHCDPSGKIKNTTYTPYGCFLDRPGFFDARLFNMSPREASQ
    TDPAQRLLLLTTYEALEMAGYTPDGSPSSAGDRIGTFFGQTLDDYREANASQNIEMYYVSGGIR
    AFGAGRLNYHFKWEGPSYCVDAACSSSTLSIQMAMSSLRTHECDTAVAGGTNVLTGVDMFSGLS
    RGSFLSPTGSCKTFDNDADGYCRGDGVGTVILKRLDDAIADGDNIQAVIKSAATNHSAHAVSIT
    HPHAGAQQNLMRQVLREADVEPSEIDYVEMHGTGTQAGDATEFASVTNVISGRTRDNPLHVGAI
    KANFGHAEAAAGTNSLVKVLMMMRKNAIPPHVGIKGRINEKFPPLDKINVRINRTMTPFVARAG
    GDGKRRVLLNNFNATGGNTSLLLEDAPKTDVRGHDLRSAHVIAISAKTSYSFKQNTQRLLEYLQ
    LNPETQIQDLSYTTTARRMHHVIRKAYAVQSTEQLVQSMKKDISNSSELGATTELSSAIFLFTG
    QGSQYLGMGRQLFQTNTAFRKSISESDNICVRQGLPSFEWIVTAESSEERVPSPSESQLALVAI
    ALALASLWQSWGITPKAVIGHSLGEYAALCVAGVLSISDTLYLVGKRAEMMEKKCIANSHSMLA
    IQSDSESIQQIISGGQMPSCEIACLNGPSNTVVSGSLKDIHSLEEKLNALGTKTTLLKLPFAFH
    SVQMDPILEDIRALAQNVQFRKPNVPIASTLLGTLVKDHGIITADYLARQARQAVRFQEALQAC
    KAESIASDDTLWIEVGPHPLCHGMVRSTLGLSPTKALPSLKRDEDCWSTISRSIANAYNSGVKV
    SWIDYHRDFQGALRLLELPSYAFDLKNYWIQHEGDWSLRKGETTHTNAPPPQASFSTTCLQVIE
    NETFTQNSASVTFSSQLSEPKLNTAVRGHLVSGIGLCPSSVYADVAFTAAWYIASRMTPSDPVP
    AMDLSTMEVFRPLIVDSKETPQLLKVSASRNANEQVVNIKISSQDDKGRQEHAHCTVMYGDGHQ
    WMDEWQRNAYLVESRIDKLTQPSSPGIHRMLKEMIYKQFQTVVTYSPEYHNIDEIFMDCDLNET
    AANINFQSMAGNGEFIYSPYWIDTVAHLAGFILNANVKTPTDTVFISHGWQSFRIAAPLSDEKT
    YRGYVRMQPSSGRGVMAGDVYIFDGDEIVVVCKGIKFQQMKRTTLQSLLGVSPAATPISKPIPA
    KPSGPHPVTARKAAVTQSLSAGFSRVLDTIASEVGVDVSELSDDVKISDVGVD
    Figure US20220403346A1-20221222-P00001
    LLTISILGRL
    RPETGLDLSSSLFIEHPSIAELRAFFLDKMDVPQAIANDDDSDDSSEDDGPGFSRSQSTSTIST
    PEEPDVVNILMSIIAREVGVEESEIQLSTPFAEIGVDSLLTISILDAFKTEIGMNLSANFFHDH
    PTFADVQKALGAPSTPQKPLDLPLCRLEQSSKPLSQTPRAKSVLLQGRPDKGKPALFLLPDGAG
    SLFSYISLPSLPSGLPVYGLDSPFHNNPSEYTISFSAVATIYIAAIRAIQPKGPYMLGGWSLGG
    IHAYETARQLIEQGETISNLIMIDSPCPGTLPPLPAPTLSLLEKAGIFDGLSTSGAPITERTRL
    HFLGCVRALENYTVTPLPPGKSPGKVTVIWAQEGVLEGREEQGKEYMAATSSGDLNKDMDKAKE
    WLTGKRTSFGPSGWDKLTGTEVHCHVVSGNHFSIMFPPKVCWQSTSSFSPSMDYDTNAYNLQIT
    AVAEAVATGLPEK*
    C. stellaris-OLAs-dACP2
    (SEQ ID NO: 2)
    MTPPNNVVLFGDQTVDPCPVIKQLYRQSRDSLALQAFFRQSYEAVRREIATSEYSDRALFPSFD
    SIRALAEKQPEKHNEAVSTVLLCIAQLGLLLVHSDQDDSMFDAGPSKTYLVGLCTGMLPAAALA
    ASSSTSQLLRLAPEIVLVALRLGLEANRRSAQIEASTESWASVVPGMAPQEQQEALAQFNNEFM
    IPTSKQAYISAESDSTATISGPPSTLVSLFTSSDSFRKARRVKLPITAAFHAPHLRVPDSEKII
    GSLLNSDEYPLRNDVVIVSTRSGKPIRAQSLGDALQHIILDILREPIRWSRVIEEMIPNLKDQG
    VILTSAGPVRAADSLRQRMASAGIEVLMSTEMQPLREPRTKPRSSDIAIIGYAARLPESETLEE
    VWKILEDGRDVHKKIPNDRFDVDTHCDPSGKIKNTTYTPYGCFLDRPGFFDARLFNMSPREASQ
    TDPAQRLLLLTTYEALEMAGYTPDGSPSSAGDRIGTFFGQTLDDYREANASQNIEMYYVSGGIR
    AFGAGRLNYHFKWEGPSYCVDAACSSSTLSIQMAMSSLRTHECDTAVAGGTNVLTGVDMFSGLS
    RGSFLSPTGSCKTFDNDADGYCRGDGVGTVILKRLDDAIADGDNIQAVIKSAATNHSAHAVSIT
    HPHAGAQQNLMRQVLREADVEPSEIDYVEMHGTGTQAGDATEFASVTNVISGRTRDNPLHVGAI
    KANFGHAEAAAGTNSLVKVLMMMRKNAIPPHVGIKGRINEKFPPLDKINVRINRTMTPFVARAG
    GDGKRRVLLNNFNATGGNTSLLLEDAPKTDVRGHDLRSAHVIAISAKTSYSFKQNTQRLLEYLQ
    LNPETQIQDLSYTTTARRMHHVIRKAYAVQSTEQLVQSMKKDISNSSELGATTELSSAIFLFTG
    QGSQYLGMGRQLFQTNTAFRKSISESDNICVRQGLPSFEWIVTAESSEERVPSPSESQLALVAI
    ALALASLWQSWGITPKAVIGHSLGEYAALCVAGVLSISDTLYLVGKRAEMMEKKCIANSHSMLA
    IQSDSESIQQIISGGQMPSCEIACLNGPSNTVVSGSLKDIHSLEEKLNALGTKTTLLKLPFAFH
    SVQMDPILEDIRALAQNVQFRKPNVPIASTLLGTLVKDHGIITADYLARQARQAVRFQEALQAC
    KAESIASDDTLWIEVGPHPLCHGMVRSTLGLSPTKALPSLKRDEDCWSTISRSIANAYNSGVKV
    SWIDYHRDFQGALRLLELPSYAFDLKNYWIQHEGDWSLRKGETTHTNAPPPQASFSTTCLQVIE
    NETFTQNSASVTFSSQLSEPKLNTAVRGHLVSGIGLCPSSVYADVAFTAAWYIASRMTPSDPVP
    AMDLSTMEVFRPLIVDSKETPQLLKVSASRNANEQVVNIKISSQDDKGRQEHAHCTVMYGDGHQ
    WMDEWQRNAYLVESRIDKLTQPSSPGIHRMLKEMIYKQFQTVVTYSPEYHNIDEIFMDCDLNET
    AANINFQSMAGNGEFIYSPYWIDTVAHLAGFILNANVKTPTDTVFISHGWQSFRIAAPLSDEKT
    YRGYVRMQPSSGRGVMAGDVYIFDGDEIVVVCKGIKFQQMKRTTLQSLLGVSPAATPISKPIPA
    KPSGPHPVTARKAAVTQSLSAGFSRVLDTIASEVGVDVSELSDDVKISDVGVDSLLTISILGRL
    RPETGLDLSSSLFIEHPSIAELRAFFLDKMDVPQAIANDDDSDDSSEDDGPGFSRSQSTSTIST
    PEEPDVVNILMSIIAREVGVEESEIQLSTPFAEIGVD
    Figure US20220403346A1-20221222-P00002
    LLTISILDAFKTEIGMNLSANFFHDH
    PTFADVQKALGAPSTPQKPLDLPLCRLEQSSKPLSQTPRAKSVLLQGRPDKGKPALFLLPDGAG
    SLFSYISLPSLPSGLPVYGLDSPFHNNPSEYTISFSAVATIYIAAIRAIQPKGPYMLGGWSLGG
    IHAYETARQLIEQGETISNLIMIDSPCPGTLPPLPAPTLSLLEKAGIFDGLSTSGAPITERTRL
    HFLGCVRALENYTVTPLPPGKSPGKVTVIWAQEGVLEGREEQGKEYMAATSSGDLNKDMDKAKE
    WLTGKRTSFGPSGWDKLTGTEVHCHVVSGNHFSIMFPPKVCWQSTSSFSPSMDYDTNAYNLQIT
    AVAEAVATGLPEK
    C. stellaris-OLAs-wt
    (SEQ ID NO: 3)
    MTPPNNVVLFGDQTVDPCPVIKQLYRQSRDSLALQAFFRQSYEAVRREIATSEYSDRALFPSFD
    SIRALAEKQPEKHNEAVSTVLLCIAQLGLLLVHSDQDDSMFDAGPSKTYLVGLCTGMLPAAALA
    ASSSTSQLLRLAPEIVLVALRLGLEANRRSAQIEASTESWASVVPGMAPQEQQEALAQFNNEFM
    IPTSKQAYISAESDSTATISGPPSTLVSLFTSSDSFRKARRVKLPITAAFHAPHLRVPDSEKII
    GSLLNSDEYPLRNDVVIVSTRSGKPIRAQSLGDALQHIILDILREPIRWSRVIEEMIPNLKDQG
    VILTSAGPVRAADSLRQRMASAGIEVLMSTEMQPLREPRTKPRSSDIAIIGYAARLPESETLEE
    VWKILEDGRDVHKKIPNDRFDVDTHCDPSGKIKNTTYTPYGCFLDRPGFFDARLFNMSPREASQ
    TDPAQRLLLLTTYEALEMAGYTPDGSPSSAGDRIGTFFGQTLDDYREANASQNIEMYYVSGGIR
    AFGAGRLNYHFKWEGPSYCVDAACSSSTLSIQMAMSSLRTHECDTAVAGGTNVLTGVDMFSGLS
    RGSFLSPTGSCKTFDNDADGYCRGDGVGTVILKRLDDAIADGDNIQAVIKSAATNHSAHAVSIT
    HPHAGAQQNLMRQVLREADVEPSEIDYVEMHGTGTQAGDATEFASVTNVISGRTRDNPLHVGAI
    KANFGHAEAAAGTNSLVKVLMMMRKNAIPPHVGIKGRINEKFPPLDKINVRINRTMTPFVARAG
    GDGKRRVLLNNFNATGGNTSLLLEDAPKTDVRGHDLRSAHVIAISAKTSYSFKQNTQRLLEYLQ
    LNPETQIQDLSYTTTARRMHHVIRKAYAVQSTEQLVQSMKKDISNSSELGATTELSSAIFLFTG
    QGSQYLGMGRQLFQTNTAFRKSISESDNICVRQGLPSFEWIVTAESSEERVPSPSESQLALVAI
    ALALASLWQSWGITPKAVIGHSLGEYAALCVAGVLSISDTLYLVGKRAEMMEKKCIANSHSMLA
    IQSDSESIQQIISGGQMPSCEIACLNGPSNTVVSGSLKDIHSLEEKLNALGTKTTLLKLPFAFH
    SVQMDPILEDIRALAQNVQFRKPNVPIASTLLGTLVKDHGIITADYLARQARQAVRFQEALQAC
    KAESIASDDTLWIEVGPHPLCHGMVRSTLGLSPTKALPSLKRDEDCWSTISRSIANAYNSGVKV
    SWIDYHRDFQGALRLLELPSYAFDLKNYWIQHEGDWSLRKGETTHTNAPPPQASFSTTCLQVIE
    NETFTQNSASVTFSSQLSEPKLNTAVRGHLVSGIGLCPSSVYADVAFTAAWYIASRMTPSDPVP
    AMDLSTMEVFRPLIVDSKETPQLLKVSASRNANEQVVNIKISSQDDKGRQEHAHCTVMYGDGHQ
    WMDEWQRNAYLVESRIDKLTQPSSPGIHRMLKEMIYKQFQTVVTYSPEYHNIDEIFMDCDLNET
    AANINFQSMAGNGEFIYSPYWIDTVAHLAGFILNANVKTPTDTVFISHGWQSFRIAAPLSDEKT
    YRGYVRMQPSSGRGVMAGDVYIFDGDEIVVVCKGIKFQQMKRTTLQSLLGVSPAATPISKPIPA
    KPSGPHPVTARKAAVTQSLSAGFSRVLDTIASEVGVDVSELSDDVKISDVGVDSLLTISILGRL
    RPETGLDLSSSLFIEHPSIAELRAFFLDKMDVPQAIANDDDSDDSSEDDGPGFSRSQSTSTIST
    PEEPDVVNILMSIIAREVGVEESEIQLSTPFAEIGVDSLLTISILDAFKTEIGMNLSANFFHDH
    PTFADVQKALGAPSTPQKPLDLPLCRLEQSSKPLSQTPRAKSVLLQGRPDKGKPALFLLPDGAG
    SLFSYISLPSLPSGLPVYGLDSPFHNNPSEYTISFSAVATIYIAAIRAIQPKGPYMLGGWSLGG
    IHAYETARQLIEQGETISNLIMIDSPCPGTLPPLPAPTLSLLEKAGIFDGLSTSGAPITERTRL
    HFLGCVRALENYTVTPLPPGKSPGKVTVIWAQEGVLEGREEQGKEYMAATSSGDLNKDMDKAKE
    WLTGKRTSFGPSGWDKLTGTEVHCHVVSGNHFSIMFPPKVCWQSTSSFSPSMDYDTNAYNLQIT
    AVAEAVATGLPEK
    (C. grayi-PKS) (GenBank Accession E9KMQ2.1)
    SEQ ID NO: 4
    MTLPNNVVLFGDQTVDPCPIIKQLYRQSRDSLTLQTLFRQSYDAVRREIATSEASDRALFPSFD
    SFQDLAEKQNERHNEAVSTVLLCIAQLGLLMIHVDQDDSTFDARPSRTYLVGLCTGMLPAAALA
    ASSSTSQLLRLAPEIVLVALRLGLEANRRSAQIEASTESWASVVPGMAPQEQQEALAQFNDEFM
    IPTSKQAYISAESDSSATLSGPPSTLLSLFSSSDIFKKARRIKLPITAAFHAPHLRVPDVEKIL
    GSLSHSDEYPLRNDVVIVSTRSGKPITAQSLGDALQHIIMDILREPMRWSRVVEEMINGLKDQG
    AILTSAGPVRAADSLRQRMASAGIEVSRSTEMQPRQEQRTKPRSSDIAIIGYAARLPESETLEE
    VWKILEDGRDVHKKIPSDRFDVDTHCDPSGKIKNTSYTPYGCFLDRPGFFDARLFNMSPREASQ
    TDPAQRLLLLTTYEALEMAGYTPDGTPSTAGDRIGTFFGQTLDDYREANASQNIEMYYVSGGIR
    AFGPGRLNYHFKWEGPSYCVDAACSSSTLSIQMAMSSLRAHECDTAVAGGTNVLTGVDMFSGLS
    RGSFLSPTGSCKTFDNDADGYCRGDGVGSVILKRLDDAIADGDNIQAVIKSAATNHSAHAVSIT
    HPHAGAQQNLMRQVLREGDVEPADIDYVEMHGTGTQAGDATEFASVTNVITGRTRDNPLHVGAV
    KANFGHAEAAAGTNSLVKVLMMMRKNAIPPHIGIKGRINEKFPPLDKINVRINRTMTPFVARAG
    GDGKRRVLLNNFNATGGNTSLLIEDAPKTDIQGHDLRSAHVVAISAKTPYSFRQNTQRLLEYLQ
    LNPETQLQDLSYTTTARRMHHVIRKAYAVQSIEQLVQSLKKDISSSSEPGATTEHSSAVFLFTG
    QGSQYLGMGRQLYQTNKAFRKSISESDSICIRQGLPSFEWIVSAEPSEERITSPSESQLALVAI
    ALALASLWQSWGITPKAVMGHSLGEYAALCVAGVLSISDTLYLVGKRAQMMEKKCIANTHSMLA
    IQSDSESIQQIISGGQMPSCEIACLNGPSNTVVSGSLTDIHSLEEKLNAMGTKTTLLKLPFAFH
    SVQMDPILEDIRALAQNVQFRKPIVPIASTLLGTLVKDHGIITADYLTRQARQAVRFQEALQAC
    RAENIATDDTLWVEVGAHPLCHGMVRSTLGLSPTKALPSLKRDEDCWSTISRSIANAYNSGVKV
    SWIDYHRDFQGALRLLELPSYAFDLKNYWIQHEGDWSLRKGETTRTTAPPPQASFSTTCLQVIE
    NETFTQDSASVTFSSQLSEPKLNTAVRGHLVSGTGLCPSSVYADVAFTAAWYIASRMTPSDPVP
    AMDLSSMEVFRPLIVDSNETSQLLRVSATRNPNEQIVNIKISSQDDKGRQEHAHCTVMYGDGHQ
    WMEEWQRNAYLIQSRIDKLTQPSSPGIHRMLKEMIYKQFQTVVTYSPEYHNIDEIFMDCDLNET
    AANIKLQSTAGHGEFIYSPYWIDTVAHLAGFILNANVKTPADTVFISHGWQSFQIAAPLSAEKT
    YRGYVRMQPSSGRGVMAGDVYIFDGDEIVVVCKGIKFQQMKRTTLQSLLGVSPAATPTSKSIAA
    KSTRPQLVTVRKAAVTQSPVAGFSKVLDTIASEVGVDVSELSDDVKISDVGVDSLLTISILGRL
    RPETGLDLSSSLFIEHPTIAELRAFFLDKMDMPQATANDDDSDDSSDDEGPGFSRSQSNSTIST
    PEEPDVVNVLMSIIAREVGIQESEIQLSTPFAEIGVDSLLTISILDALKTEIGMNLSANFFHDH
    PTFADVQKALGAAPTPQKPLDLPLARLEQSPRPSSQALRAKSVLLQGRPEKGKPALFLLPDGAG
    SLFSYISLPSLPSGLPIYGLDSPFHNNPSEFTISFSDVATIYIAAIRAIQPKGPYMLGGWSLGG
    IHAYETARQLIEQGETISNLIMIDSPCPGTLPPLPAPTLSLLEKAGIFDGLSTSGAPITERTRL
    HFLGCVRALENYTVTPLPPGKSPGKVTVIWAQDGVLEGREEQGKEYMAATSSGDLNKDMDKAKE
    WLTGKRTSFGPSGWDKLTGTEVHCHVVGGNHFSIMFPPKVC
    (C. uncialis-PKS) (GenBank Accession AUW31177.1)
    SEQ ID NO: 5
    MTLPNNVVLFGDQTVDPCPIIKQLYRQSRDSLTLQALFRQSYDAVRREIATSEYSDRTLFPSFD
    SIQGLAEKQTERHNEAVSTVLHCIAQLGLLLIHADQDDFRLDARPSRTYLVGLCTGMLPAAALA
    ASSSASQLLRLAPEIVLVALRLGLEANRRSAQIEASTESWASVVPGMAPQEQQEALAQFNDEFM
    IPTSKQAYISAESDSTATLSGPPSTLVSLFSLSDSFRKARRIKLPITAAFHAPHLRLPNVEKII
    GSLSHSDEYPLRNDVVIISTRSGKPITAQSLGDALQHIILDILREPIRWSTVVEEMINNFEDQG
    ANLTSVGPVRAADSLRQRMATAGIEILKSTELQPQQEPRTKTRSNDIAIIGYAARLPESETLEE
    AWKILEDGRDVHKKIPSDRFDVDTHCDPSGKIKNTTYTPYGCFLDRPGFFDARLFNMSPREASQ
    TDPAQRLLLLTTYEALEMAGYTPDGTPSTAGDRIGTFFGQTLDDYREANASQNIEMYYVSGGIR
    AFGAGRLNYHFKWEGPSYCVDAACSSSTLSIQMAMSSLRAHECDTAVAGGTNVLTGVDMFSGLS
    RGSFLSPTGSCKTFDNDADGYCRGDGVGSVILKRLDDAVADGDNIQAVIKSAATNHSAHAVSIT
    HPHAGAQQNLMRQVLREADVEPSEIDYVEMHGTGTQAGDATEFTSVTNVISGRTRDNPLYVGAV
    KANFGHAEAAAGTNSLVKVLMMMRKNAIPPHIGIKGRINEKFPPLDKINVRINRTMTPFVARAG
    GDGKRRVLLNNFNATGGNTSLLLEDAPKTDIRGHDPRSAHVIAISAKTPYSFRQNTQRLLEYLQ
    QNPDTQLQNLSYTTTARRMHHAIRKAYAVQSIEELVQSMKKDVSNSSELGATTEHSTAIFLFTG
    QGSQYLGMGRQLFQTNTSFRKSISDSDNLCIRQGLPSFEWIVSAEPSEERVPTPSESQLALVAI
    ALALASLWQSWGITPKAVIGHSLGEYAALCVAGVLSISDTLYLVGKRAEMMEKKCIANTHSMLA
    VQSASDSIQQIISGGQMPSCEIACLNGPTNTVVSGSLKDIHSLKEKLDTMGTKTTLLKLPFAFH
    SVQMDPILEDIRALAQNVQFRKPIVPIASTLLGTLVKDHGIITADYLTRQARQAVRFQGALQAC
    KAESIAGDDTLWIELGPHPLCHGMVRSTLGVSPAKALPSLKRDEDCWSTLSRSIANAYNSGVKM
    SWIDYHRDFQGALKLLELPSYAFDLKNYWIQHEGDWSLRKGETTRTTAPPPQASFSTTCLQVVE
    NETFTQDSASVTFSSQLSEPKLNAAIRGHLVSGIGLCPSSVYADVAFTAAWYIASHMTPSDPVP
    AMDLSTMEVFRPLIVDSNETPQLLKVSASKNSNEQVVNIKISSRDDKGRQEHAHCTVMYGDGHQ
    WIDEWQRNAYLFESRIAKLTQPSSPGIHRMLKEMIYKQFQTVVTYSREYHNIDEIFMDCDLNET
    AANIKLQSMAGNGEFIYSPYWIDTIAHLAGFILNANVKTPADTVFISHGWQSFRIAAPLSAEKK
    YRGYVCMQPSSGRGVMAGDVYLFDGDQIVVVCKGIKFQQMKRTTLQSLLGVSPAATPMSKPITA
    KSTRPHPVAVRKVVVTQSPGAGFSKVLDTIASEVGVDASELSDDVKISDIGVDSLLTISILGRL
    RPETGLDLSSSLFIEHPTIAELRAFFLDKMVVPQATVNDDDSDDSSEDGGPGFSRSQSNSTIST
    PEEPDVVSILMSIIAREVGVEESEIQLSTPFAEIGVDSLLTISILDAFKTEIGMNLSANFFHDH
    PTVADVQKALGTASTPQKPLDLPLHRVEQNSKPLSQNLRAKSVLLQGRPEKGKPALFLLPDGAG
    SLFSYISLPSLPSGLPVYGLDSPFHHNPSEYTISFAAVATIYIAAIRAIQPKGPYMLGGWSLGG
    IHAYETARQLIEQGETISNLIMIDSPCPGTLPPLPAPTLSLLEKAGIFDGLSTSGAPITERTRL
    HFLGCVRALENYTVTPLPPGKSPGKVTVIWAQEGVLEGREEQGKEYMAATSSGDLNKDMDKAKE
    WLTGKRTSFGPSGWDKLTGTDVHCHVVGGNHFSIMFPPKVCWRSTFSLSSSIDNDTNAYNLQIA
    AVAKAVATGLPEK
    (C. grayi-PKS-dACP1)
    SEQ ID NO: 6
    MTLPNNVVLFGDQTVDPCPIIKQLYRQSRDSLTLQTLFRQSYDAVRREIATSEASDRALFPSFD
    SFQDLAEKQNERHNEAVSTVLLCIAQLGLLMIHVDQDDSTFDARPSRTYLVGLCTGMLPAAALA
    ASSSTSQLLRLAPEIVLVALRLGLEANRRSAQIEASTESWASVVPGMAPQEQQEALAQFNDEFM
    IPTSKQAYISAESDSSATLSGPPSTLLSLFSSSDIFKKARRIKLPITAAFHAPHLRVPDVEKIL
    GSLSHSDEYPLRNDVVIVSTRSGKPITAQSLGDALQHIIMDILREPMRWSRVVEEMINGLKDQG
    AILTSAGPVRAADSLRQRMASAGIEVSRSTEMQPRQEQRTKPRSSDIAIIGYAARLPESETLEE
    VWKILEDGRDVHKKIPSDRFDVDTHCDPSGKIKNTSYTPYGCFLDRPGFFDARLFNMSPREASQ
    TDPAQRLLLLTTYEALEMAGYTPDGTPSTAGDRIGTFFGQTLDDYREANASQNIEMYYVSGGIR
    AFGPGRLNYHFKWEGPSYCVDAACSSSTLSIQMAMSSLRAHECDTAVAGGTNVLTGVDMFSGLS
    RGSFLSPTGSCKTFDNDADGYCRGDGVGSVILKRLDDAIADGDNIQAVIKSAATNHSAHAVSIT
    HPHAGAQQNLMRQVLREGDVEPADIDYVEMHGTGTQAGDATEFASVTNVITGRTRDNPLHVGAV
    KANFGHAEAAAGTNSLVKVLMMMRKNAIPPHIGIKGRINEKFPPLDKINVRINRTMTPFVARAG
    GDGKRRVLLNNFNATGGNTSLLIEDAPKTDIQGHDLRSAHVVAISAKTPYSFRQNTQRLLEYLQ
    LNPETQLQDLSYTTTARRMHHVIRKAYAVQSIEQLVQSLKKDISSSSEPGATTEHSSAVFLFTG
    QGSQYLGMGRQLYQTNKAFRKSISESDSICIRQGLPSFEWIVSAEPSEERITSPSESQLALVAI
    ALALASLWQSWGITPKAVMGHSLGEYAALCVAGVLSISDTLYLVGKRAQMMEKKCIANTHSMLA
    IQSDSESIQQIISGGQMPSCEIACLNGPSNTVVSGSLTDIHSLEEKLNAMGTKTTLLKLPFAFH
    SVQMDPILEDIRALAQNVQFRKPIVPIASTLLGTLVKDHGIITADYLTRQARQAVRFQEALQAC
    RAENIATDDTLWVEVGAHPLCHGMVRSTLGLSPTKALPSLKRDEDCWSTISRSIANAYNSGVKV
    SWIDYHRDFQGALRLLELPSYAFDLKNYWIQHEGDWSLRKGETTRTTAPPPQASFSTTCLQVIE
    NETFTQDSASVTFSSQLSEPKLNTAVRGHLVSGTGLCPSSVYADVAFTAAWYIASRMTPSDPVP
    AMDLSSMEVFRPLIVDSNETSQLLRVSATRNPNEQIVNIKISSQDDKGRQEHAHCTVMYGDGHQ
    WMEEWQRNAYLIQSRIDKLTQPSSPGIHRMLKEMIYKQFQTVVTYSPEYHNIDEIFMDCDLNET
    AANIKLQSTAGHGEFIYSPYWIDTVAHLAGFILNANVKTPADTVFISHGWQSFQIAAPLSAEKT
    YRGYVRMQPSSGRGVMAGDVYIFDGDEIVVVCKGIKFQQMKRTTLQSLLGVSPAATPTSKSIAA
    KSTRPQLVTVRKAAVTQSPVAGFSKVLDTIASEVGVDVSELSDDVKISDVGVD
    Figure US20220403346A1-20221222-P00003
    LLTISILGRL
    RPETGLDLSSSLFIEHPTIAELRAFFLDKMDMPQATANDDDSDDSSDDEGPGFSRSQSNSTIST
    PEEPDVVNVLMSIIAREVGIQESEIQLSTPFAEIGVDSLLTISILDALKTEIGMNLSANFFHDH
    PTFADVQKALGAAPTPQKPLDLPLARLEQSPRPSSQALRAKSVLLQGRPEKGKPALFLLPDGAG
    SLFSYISLPSLPSGLPIYGLDSPFHNNPSEFTISFSDVATIYIAAIRAIQPKGPYMLGGWSLGG
    IHAYETARQLIEQGETISNLIMIDSPCPGTLPPLPAPTLSLLEKAGIFDGLSTSGAPITERTRL
    HFLGCVRALENYTVTPLPPGKSPGKVTVIWAQDGVLEGREEQGKEYMAATSSGDLNKDMDKAKE
    WLTGKRTSFGPSGWDKLTGTEVHCHVVGGNHFSIMFPPKVC
    (C. grayi-PKS-dACP2)
    SEQ ID NO: 7
    MTLPNNVVLFGDQTVDPCPIIKQLYRQSRDSLTLQTLFRQSYDAVRREIATSEASDRALFPSFD
    SFQDLAEKQNERHNEAVSTVLLCIAQLGLLMIHVDQDDSTFDARPSRTYLVGLCTGMLPAAALA
    ASSSTSQLLRLAPEIVLVALRLGLEANRRSAQIEASTESWASVVPGMAPQEQQEALAQFNDEFM
    IPTSKQAYISAESDSSATLSGPPSTLLSLFSSSDIFKKARRIKLPITAAFHAPHLRVPDVEKIL
    GSLSHSDEYPLRNDVVIVSTRSGKPITAQSLGDALQHIIMDILREPMRVVSRWEEMINGLKDQG
    AILTSAGPVRAADSLRQRMASAGIEVSRSTEMQPRQEQRTKPRSSDIAIIGYAARLPESETLEE
    VWKILEDGRDVHKKIPSDRFDVDTHCDPSGKIKNTSYTPYGCFLDRPGFFDARLFNMSPREASQ
    TDPAQRLLLLTTYEALEMAGYTPDGTPSTAGDRIGTFFGQTLDDYREANASQNIEMYYVSGGIR
    AFGPGRLNYHFKWEGPSYCVDAACSSSTLSIQMAMSSLRAHECDTAVAGGTNVLTGVDMFSGLS
    RGSFLSPTGSCKTFDNDADGYCRGDGVGSVILKRLDDAIADGDNIQAVIKSAATNHSAHAVSIT
    HPHAGAQQNLMRQVLREGDVEPADIDYVEMHGTGTQAGDATEFASVTNVITGRTRDNPLHVGAV
    KANFGHAEAAAGTNSLVKVLMMMRKNAIPPHIGIKGRINEKFPPLDKINVRINRTMTPFVARAG
    GDGKRRVLLNNFNATGGNTSLLIEDAPKTDIQGHDLRSAHVVAISAKTPYSFRQNTQRLLEYLQ
    LNPETQLQDLSYTTTARRMHHVIRKAYAVQSIEQLVQSLKKDISSSSEPGATTEHSSAVFLFTG
    QGSQYLGMGRQLYQTNKAFRKSISESDSICIRQGLPSFEWIVSAEPSEERITSPSESQLALVAI
    ALALASLWQSWGITPKAVMGHSLGEYAALCVAGVLSISDTLYLVGKRAQMMEKKCIANTHSMLA
    IQSDSESIQQIISGGQMPSCEIACLNGPSNTVVSGSLTDIHSLEEKLNAMGTKTTLLKLPFAFH
    SVQMDPILEDIRALAQNVQFRKPIVPIASTLLGTLVKDHGIITADYLTRQARQAVRFQEALQAC
    RAENIATDDTLWVEVGAHPLCHGMVRSTLGLSPTKALPSLKRDEDCWSTISRSIANAYNSGVKV
    SWIDYHRDFQGALRLLELPSYAFDLKNYWIQHEGDWSLRKGETTRTTAPPPQASFSTTCLQVIE
    NETFTQDSASVTFSSQLSEPKLNTAVRGHLVSGTGLCPSSVYADVAFTAAWYIASRMTPSDPVP
    AMDLSSMEVFRPLIVDSNETSQLLRVSATRNPNEQIVNIKISSQDDKGRQEHAHCTVMYGDGHQ
    WMEEWQRNAYLIQSRIDKLTQPSSPGIHRMLKEMIYKQFQTVVTYSPEYHNIDEIFMDCDLNET
    AANIKLQSTAGHGEFIYSPYWIDTVAHLAGFILNANVKTPADTVFISHGWQSFQIAAPLSAEKT
    YRGYVRMQPSSGRGVMAGDVYIFDGDEIVVVCKGIKFQQMKRTTLQSLLGVSPAATPTSKSIAA
    KSTRPQLVTVRKAAVTQSPVAGFSKVLDTIASEVGVDVSELSDDVKISDVGVDSLLTISILGRL
    RPETGLDLSSSLFIEHPTIAELRAFFLDKMDMPQATANDDDSDDSSDDEGPGFSRSQSNSTIST
    PEEPDVVNVLMSIIAREVGIQESEIQLSTPFAEIGVD
    Figure US20220403346A1-20221222-P00004
    LLTISILDALKTEIGMNLSANFFHDH
    PTFADVQKALGAAPTPQKPLDLPLARLEQSPRPSSQALRAKSVLLQGRPEKGKPALFLLPDGAG
    SLFSYISLPSLPSGLPIYGLDSPFHNNPSEFTISFSDVATIYIAAIRAIQPKGPYMLGGWSLGG
    IHAYETARQLIEQGETISNLIMIDSPCPGTLPPLPAPTLSLLEKAGIFDGLSTSGAPITERTRL
    HFLGCVRALENYTVTPLPPGKSPGKVTVIWAQDGVLEGREEQGKEYMAATSSGDLNKDMDKAKE
    WLTGKRTSFGPSGWDKLTGTEVHCHVVGGNHFSIMFPPKVC
    (P. furfuracea-PKS)
    SEQ ID NO: 40
    MTTTSRVVLFGDQTVDPSPLIKQLCRHSTHSLTLQTFLQKTYFAVRQELAICEISDRANFPSFD
    SILALAETYSQSNESNEAVSTVLLCIAQLGLLLSREYNDNVINDSSCYSTTYLVGLCTGMLPAA
    ALAFASSTTQLLELAPEVVRISVRLGLEASRRSAQIEKSHESWATLVPGIPLQEQRDILHRFHD
    VYPIPASKRAYISAESDSTTTISGPPSTLASLFSFSESLRNTRKISLPITAAFHAPHLGSSDTD
    KIIGSLSKGNEYHLRRDAVIISTSTGDQITGRSLGEALQQVVWDILREPLRWSTVTHAIAAKFR
    DQDAVLISAGPVRAANSLRREMTNAGVKIVDSYEMQPLQVSQSRNTSGDIAIVGVAGRLPGGET
    LEEIWENLEKGKDLHKEDRFDVKTHCDPSGKIKNTTLTPYGCFLDRPGFFDARLFNMSPREAAQ
    TDPAQRLLLLTTYEALEMSGYTPNGSPSSASDRIGTFFGQTLDDYREANASQNIDMYYVTGGIR
    AFGPGRLNYHFKWEGPSYCVDAACSSSALSVQMAMSSLRARECDTAVAGGTNILTGVDMFSGLS
    RGSFLSPTGSCKTFDDEADGYCRGEGVGSVVLKRLEDAIAEGDNIQAVIKSAATNHSAHAISIT
    HPHAGTQQKLIRQVLREADVEADEIDYVEMHGTGTQAGDATEFTSVTKVLSDRTKDNPLHIGAV
    KANFGHAEAAAGTNSLIKILMMMRKNKIPPHVGIKGRINHKFPPLDKVNVSIDRALVAFKAHAK
    GDGKRRVLLNNFNATGGNTSLVLEDPPETVTEGEDPRTAWVVAVSAKTSNSFTQNQQRLLNYVE
    SNPETQLQDLSYTTTARRMHHDTYRKAYAVESMDQLVRSMRKDLSSPSEPTAITGSSPSIFAFT
    GQGAQYLGMGRQLFETNTSFRQNILDFDRICVRQGLPSFKWLVTSSTSDESVPSPSESQLAMVS
    IAVALVSLWQSWGIVPSAVIGHSLGEYAALCVAGVLSVSDTLYLVGKRAEMMEKKCIANSHAML
    AVQSGSELIQQIIHAEKISTCELACSNGPSNTVVSGTGKDINSLAEKLDDMGVKKTLLKLPYAF
    HSAQMDPILEDIRAIASNVEFLKPTVPIASTLLGSLVRDQGVITAEYLSRQTRQPVKFQEALYS
    LRSEGIAGDEALWIEVGAHPLCHSMVRSTLGLSPTKALPTLRRDEDCWSTISKSISNAYNSGAK
    FMWTEYHRDFRGALKLLELPSYAFDLKNYWIQHEGDWSLRKGEKMIASSTPTVPQQTFSTTCLQ
    KVESETFTQDSASVAFSSRLAEPSLNTAVRGHLVNNVGLCPSSVYADVAFTAAWYIASRMAPSE
    LVPAMDLSTMEVFRPLIVDKETSQILHVSASRKPGEQVVKVQISSQDMNGSKDHANCTVMYGDG
    QQWIDEWQLNAYLVQSRVDQLIQPVKPASVHRLLKEMIYRQFQTVVTYSKEYHNIDEIFMDCDL
    NETAANIRFQPTAGNGNFIYSPYWIDTVAHLAGFVLNASTKTPADTVFISHGWQSFRIAAPLSD
    EKTYRGYVRMQPIGTRGVMAGDVYIFDGDRIVVLCKGIKFQKMKRNILQSLLSTGHEETPPARP
    VPSKRTVQGSVTETKAAITPSIKAASGGFSNILETIASEVGIEVSEITDDGKISDLGVDSLLTI
    SILGRLRSETGLDLPSSLFIAYPTVAQLRNFFLDKVATSQSVFDDEESEMSSSTAGSTPGSSTS
    HGNQNTTVTTPAEPDVVAILMSIIAREVGIDATEIQPSTPFADLGVDSLLTISILDSFKSEMRM
    SLAATFFHENPTFTDVQKALGAPSMPQKSLKMPSEFPEMNMGPSNQSVRSKSSILQGRPASNRP
    ALFLLPDGAGSMFSYISLPALPSGVPVYGLDSPFHNSPKDYTVSFEEVASIFIKEIRAIQPRGP
    YMLGGWSLGGILAYEASRQLIAQGETITNLIMIDSPCPGTLPPLPSPTLNLLEKAGIFDGLSAS
    SGPITERTRLHFLGSVRALENYTVKPIPADRSPGKVTVIWAQDGVLEGREDVGGEEWMADSSGG
    DANADMEKAKQWLTGKRTSFGPSGWDKLTGAEVQCHVVGGNHFSIMFPPKLCGEEKLANASWNN
    cs-OLAS-1
    SEQ ID NO: 41
    MASQVLLLFGDQNAEKLPEIRRLDRVSRSSPPLQRFLREATDVVQNEVAKLSLHRRKAFFAFDN
    LVTLAEKHAKQDCPDDVVSTVLITIIRLGGLILYMQQNPRVLESSETAVHSLGLCVGLFPAAVA
    AVSRNSEDVRIFGLEIVAICIRLMERVRSRSQKIEAAPGAWAYTVVGAGAEDSKSVLDNFHQAQ
    NLPDHNRAFIGVSSKTWTTIFGPPSTLDKLWIHSPQLGLAPKLKLNAFGAVHASHLPMLDMETI
    IGDSSLLMTPLTSKVRIVSSSTCAPFVASDLGTLLYEMILDIAQNTLRLTDTVQTIVSDLRRIG
    DVELVVLGPTAHTTVMQSALRENYINVNLVSELEAPVSSQDLRGGSNLIAVVGMSGRFPGSENV
    YEFWETLKKGTDFAEKIPSSRFDINKHFDADGVEKNALSTLYGCFLERPGVFDTRFFNISPREA
    AQMDPTQRLLLMASYEALELAGYTPDGSTSMNAKRIATYIAQVTDDWRTINECQGTDIYYIPGS
    CRAFTPGRLNYHYKWEGASLSLDAACAGGTTAVTLACSALLSRECDTALAGGGSILAVPGPWSG
    LCRGSFLSSTGNCKPLRDDADGYCRGEGIGIVVLKRLEDAIADNDNIQALINSSARTYSAGAVS
    ITQPHAESQAKLYKRVLQEANLDPLDIGFVEMHGTGTQWGDLMEVQSISEVFAEGRTKEYPLVI
    GAVKANVGHGEAAAGMSSLIKSIMLFREPEAIPPQPGWPFKLNPKLPCMEKMNIRVADGQAPFL
    PRPSGDGTKRLLVNNFDASGGNTCVILSEPPERPQKSQDPRTYHVVACSARTSYSLKANKKRLL
    QYLQSDEDVAISDVAYTTTARRMHNVLRSSYVAQTSKDLIKLISNDLEQSAEAEIKSTSSNRVV
    FAFTGQGSLYPGMGKQLFETSAIFRESILSYQRILDSQGFPYVVDIIADDGVTIESKDMAQVQL
    AIVFIELALAELWKSWGVQPDLLIGHSLGEYAALCVSGVLSVSDALYLVGQRSSMIMKNCTPGS
    SGMLTVAASAKTIEETLANHDLASCEISCVNAPEMSVVSGTHEDLKSLQALLNAKFRTTFLKVA
    YGFHSAQIDPILESLETSASGITFAKPQIPIASTLLSDIVSDNGTFNPEYLARQAREPVNFSGT
    LQTCRSKGFVDDQTLWIELGPDPVCLGLVRSTLEIPSERLLPSLKSKEENWKTITNAVSRAYLS
    KQPVAWVDFHHEYIGCLTLLELPTYAFDLQNHWASYKQEQLFPAAQQLQNQLIIAAAPERKFLP
    TTCVQWVEKESFTGDEISATFSSHTSEPKLFSAIQGHLVDNTAICPATIFCEMAYTAAKYLYEG
    TNPGKAVPQMSLWTLDITHPLVVPVSDPLQIVEISAAKSAGRDWSIHVTFTSKDQASTHEHGSC
    DVRFGKSDERKALFSRSLHLVKKRIDALRSSAVAGLSHRLQRPIVYKLFRSLVDYGEKYRGLEE
    VYLDNTGYGDAVARVKLGSSADLGNFTHSPYWTDTIIHLAGFVLNGDVSLSPDDAYISAGFEAF
    HLFEELSDSKAYTTYVAMQPADKPGIVTGDIFVFEDDKLVALCGGLYFHKMTKKVLRIIFGQGG
    QAPAKKTSQSKTAAPIKQQPEAVDIEPSSQGSLPDSDDRSAYDSSGSGAIQSSPPSSVDNDNEP
    DVAEVLLAIISKETGFSTADMEPSTKFTDMGLDSLMSIAITAAAKREIDLELPASYFTDNATVG
    NVTKDFGKAPAVQAVATLPAKVKEAPAPAPALVPSRVQSAEYMANNPEPYEKKGDIVTPGSSGA
    SSPAPERVTMAMPVKATIPTPKAKQALKPKAVAAAKADLSQYSSNLVLVRGKRSSKETPLMLVT
    DGAGSATAYIYLPAMKTGTPIYALESPFLQDPLAYNCSVEEVSALYVKGLQKTQPKGPYLIGGW
    SAGAVHAYEVARQLLEAGEKVLGLILIDMRVPKGMPDALEPSLEIIESAGLLTSLERAGQADTP
    QATKTKQHLVGTVKGLVQYTPRPVPASNKPSHTALIWAQKGLSEAGQEDVVRLPAAERMAAAAQ
    EANMGQEDVGPEDSHTELASWFYSKRNAFGPNGWDKLVQGKVDCHVIEGADHFSMVVPPKAKIL
    GQIIEDVVRKCIAGGSPRINGEDH
    Diviaric Acid Synthase pp-DVAS-1
    SEQ ID NO: 42
    MTSQVLLLFGDQTAEKLLSIQRLTRVAKTSPLLQRFLREATDVVQAEVGKLSLERRNAFFAFDN
    LINLAEKHAKQDCPDDVISTALITIIRLGDLVIYVQSNPRLFEDPETAVHSLGFCTGLLPAAVA
    AVSRNTEDLHRFGLEIVAISIRLMEAICNRSRQIEAVPRSWGYTVVGAGSEDSKAVLDDFHLAQ
    NLPDHNRAFIGVSSRTWTTIFGPPSTLDKLWTHSPQLGLAPKLKLHAYTAVHASHLPVLDMEKI
    VGESPMLMTPLTSKVRIVSSSTCTSFVASDLGTLLHEMILDIAQNTLRLTETVQTIVSDLRKIG
    DVDLVVLGPTAHTSLVQNALREKSINVKLISEPEAPVSAHDLRGGSGLVAIVGMSGRFPGSDSV
    HQFWETLRNGQDLHQEIPLSRFDIDEHFDPDGVMKNSLSTRYGCFIEKPGLFDNRLFNVSPREA
    AQMDPLQRLLLMASYEAMEMAGYAPDGSVSTSTKRIATYMAQTTDDWRSVNECQGIDIYNIPSV
    ARAFTPGRLNYHFKWEGASHCIDAACAGGSTSVALACSALLARECDTALAGGGSILAAPGLWSG
    LSRGGFLSPSGNCKPLRDDADGYCRGEAIGVVVLKRLEDAIADNDNIQAVIKSSARSYSAEAVS
    ITQPHAESQAKLYRRVLQEAGADPLDIGYVEMHGTGTQWGDLMEVQSISEVFAEGRTREYPLVI
    GAVKANVGHGEAAAGVTSLIKNIMMFREPDSIPSQPGWPFKLNPKLPRLDKMNIKVADGNTSFI
    PRPTGDGEKMLLLNNFDASGGNTCIVLGEPPERPQKSQDPRTHHIVACSARTPISLRANKERLL
    QCLRSDEEISISDVAYTTTARRMQDVLRSSYVAQTSKDLIRLITDDLKQTAVAKPKSSSHSRVV
    FAFTGQGSLYAGIGRQLFETSANFRDNIFMYQKICDSQGLPYVVDIIADDGADIESKNMAQIQL
    AIVFVELALANLWKSWGVQPDLLIGHSLGEYAALCVSGVLSVSDALYLVGKRSSMIMKKCTPGS
    SGMLAVAAPVKAIEEALANQDLASCEISCMNAPEMSVVSGTHKDLRSLQALLSSGVRTTFLKVT
    YGFHSSQIDPILKDLENSASGITFAKPQIPIASTLIGDIVSDVGTFSPNYLARQAREPVNFSGA
    LRASKSKGFVDDQTLWMEIGPDPVCLGLVRSTLEIPSEKLLPSLKSNEENWKTISNAIARAYLS
    KQPVAWADFHHEYVGSLTLLELPTYAFDLKEYWSSYKQELLVAGAQQTPSKLPGPAGPERKHLG
    MTCVQWVEKESFKGDEISATFSSHTSEPKLFAAIQGHLVDNTAICPATVFCEIAFTAAKYLYEG
    ANPGKAAPLMSLWALDITHPLVVPVSDPLQIIEISAVKSADRDWLVHVSFNSKDSTSSHGHGSC
    DVQFGRNDERKAEFSRSLHLVNKRVDALTSSAVAGISHRLQRPIVYNLFASFVKYGEKYQGLEE
    VYLDTTGYGDTAARIKLGPNADSGTFTQSPYWTDTVIHLAGFVLNGDVTLSPSDAYISTGFEAF
    HIFEELSHTKTYITYVSMQPSEKSNVLTGDVYVFEGDRLIALCGGLNFHKMTKKVLRIIFGQGG
    QTSAKKTVQPKAAAPIRSKPHSISTETSKKVSPPDSDASSAYDSSGSGTNASSPPSSVDNDDEP
    NVVQNLLAIIAAESGFDVAEMEPSTEFADMGLDSLMSIAIVAAAKRDLDLELPASFFTDNARVA
    DITKEFGKASPAPKPAPAAVAPSAKVNEAPAHVQSTESMANDPEPYEKRGEIATSDSSAGSSPT
    PEKAAPAMPVNAMIPTPKPTAKSKQAAKPTLSQHTSNVVLIRGKRSSKEIPLMLVTDGAGSAAA
    YIHLPAMKTGTPIYALESPYLRDPHAYKCSVEEVCDLYIAGIRKTQPKGPYIIGGWSAGAVYAY
    EVACKLLEAGEKILGLILIDMRVPKAMPDALEPSLDLIESAGLSTGVDRAGQADSPQGMILKEH
    LVSTVKALVRYSPRPVPHSNKPNHTTLIWAQKGMSEAGKDNVLKMSTDEGSLLAGDLGEANMGQ
    VAEGEDPEGGMKSWFFARRSAFGPNGWDKLVGGEVDCRVIEGADHFSMVVPPKVKELGKILEDA
    VRKCIADEN
  • As can be deduced from the alignment shown in FIG. 6, variants of SEQ ID NOs:1-7 and 40-42 are made to retain PKS activity while retaining only one active ACP domain, the location of which is defined in Table 2:
  • TABLE 2
    Name | Accession | Description | AA for SEQ ID NO: 3 (C. Stelaris) | AA for SEQ ID NO: 4 (C. Grayi) | AA for SEQ ID NO: 5 (C. Uncialis) | AA for SEQ ID NO: 40 (P. furfuracea) | AA for SEQ ID NO: 41 | AA for SEQ ID NO: 42 (Protousnea poepiggi)
    PksD | COG3321, Cd00833 | Acyl transferase domain in polyketide synthase (PKS) enzymes | 367-795 | 367-795 | 367-795 | 370-795 | 369-799 | 369-799
    PT_fungal_PKS | TIGR04532 | iterative type I PKS product template domain | 1273-1587 | 1273-1587 | 1273-1587 | 1276-1590 | 1281-1589 | 1281-1589
    SAT | pfam16073 | Starter unit: ACP transacylase in aflatoxin biosynthesis | 8-243 | 8-243 | 8-243 | 8-246 | 7-244 | 7-244
    EntF | COG3319 | Thioesterase domain of type I polyketide synthase or non-ribosomal peptide synthetase | 1847-2122 | 1847-2122 | 1847-2089 | 1857-2112 | 1851-2124 | 1843-2117
    PP-binding (PKS_PP), ACP Domain 1 | pfam00550, smart00823 | Phosphopantetheine attachment site | 1625-1692 | 1625-1692 | 1625-1692 | 1631-1698 | 1630-1732 | 1670-1732
    PP-binding (PKS_PP), ACP Domain 2 | pfam00550, smart00823 | Phosphopantetheine attachment site | 1738-1802 | 1738-1802 | 1738-1802 | 1748-1812 | not present | not present
    PKS_AT | smart00827 | Acyl transferase domain in polyketide synthase (PKS) enzymes | 893-1195 | 893-1195 | 893-1195 | 894-1196 | 898-1199 | 898-1199
  • Mutations that inactivate an ACP domain can be made by mutating the highly conserved amino acids of the ACP domain, while retaining the PKS activity. Examples of such mutations include:
      • a. Substituting the serine at position 1654 or 1766 with any amino acid, such as, for example, alanine, in SEQ ID NO:3 or at the corresponding position in SEQ ID NO:4 and 5 (see, for example, SEQ ID NOs:1-2 and 6-7);
      • b. Substituting L1655 with R, H or K; D1653 with R, H or K; or L1656 with R, H or K.
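The substitutions listed above can be cross-checked against the domain boundaries in Table 2. The following is a minimal Python sketch and is not part of the original disclosure: the names `DOMAINS_SEQ3`, `domain_of` and `substitute` are hypothetical helpers, the coordinates are copied from the SEQ ID NO:3 column of Table 2, and positions are assumed to be 1-based and inclusive.

```python
# Hypothetical helpers for illustration only (not from the source document).
# Domain boundaries are the SEQ ID NO:3 values from Table 2 (1-based, inclusive).
DOMAINS_SEQ3 = {
    "SAT": (8, 243),
    "PksD": (367, 795),
    "PKS_AT": (893, 1195),
    "PT_fungal_PKS": (1273, 1587),
    "ACP Domain 1": (1625, 1692),
    "ACP Domain 2": (1738, 1802),
    "EntF": (1847, 2122),
}


def domain_of(position, domains=DOMAINS_SEQ3):
    """Return the names of all domains whose range contains the 1-based position."""
    return [name for name, (start, end) in domains.items() if start <= position <= end]


def substitute(sequence, position, expected, replacement):
    """Replace the residue at a 1-based position after checking the wild-type residue."""
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    found = sequence[position - 1]
    if found != expected:
        raise ValueError(f"expected {expected} at {position}, found {found}")
    return sequence[: position - 1] + replacement + sequence[position:]
```

With the Table 2 values, `domain_of(1654)` returns `['ACP Domain 1']` and `domain_of(1766)` returns `['ACP Domain 2']`, so each of the two serines named in item a lies in a different ACP domain. A call such as `substitute(seq, 1654, "S", "A")` would then model the alanine substitution on a full-length sequence held in `seq`.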
  • Even though one of the two ACP domains is preferably inactivated in PKS Variant Enzymes (when two ACP domains are present), the PKS activity is retained. Examples of amino acids that should be maintained include those that are known to be highly conserved between homologs and/or orthologs.
  • Any of these PKS Enzymes (including the described variants) derived from SEQ ID NO:1-5 or 40, in combination with a npgA Enzyme, can be used to produce Compound I from Compound II in the methods described herein. Variants of such PKS enzymes retain the ability to catalyze the conversion of Compound II into Compound I in combination with a npgA Enzyme, with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence. In preferred embodiments, a variant PKS enzyme has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalyzing the conversion of Compound II into Compound I as compared to the sequence from which the improved variant is derived.
  • Alternatively, any of these PKS Enzymes (including the described variants) derived from SEQ ID NO:41 or 42, in combination with SEQ ID NO:43 or 44 (including variants) and a npgA enzyme, can be used to produce Compound I from acetyl-CoA and malonyl-CoA in the methods described herein. Variants of such PKS enzymes retain the ability to catalyse the conversion of acetyl-CoA and malonyl-CoA into Compound I with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence from which the variant sequence was derived. In preferred embodiments, such a variant PKS enzyme derived from SEQ ID NO:41 or 42 has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalysing the conversion of acetyl-CoA and malonyl-CoA into Compound I as compared to the sequence from which the improved variant is derived.
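As an illustration of how the efficacy thresholds above might be applied, the following Python sketch expresses a variant's measured activity as a percentage of the parent enzyme's activity. It is an assumption-laden example rather than a method from this disclosure: the function names are hypothetical, the activity units are left open, "at least about 50%" is treated as a hard 50% cut-off, and "more than 110% improved activity" is read as more than 110% of the parent activity.

```python
def relative_activity(variant_rate, parent_rate):
    """Variant activity as a percentage of the parent enzyme's activity."""
    if parent_rate <= 0:
        raise ValueError("parent activity must be positive")
    return 100.0 * variant_rate / parent_rate


def classify_variant(variant_rate, parent_rate, retain_min=50.0, improved_min=110.0):
    """Label a variant using the lowest thresholds quoted in the text (assumed readings)."""
    pct = relative_activity(variant_rate, parent_rate)
    if pct > improved_min:
        return "improved"
    if pct >= retain_min:
        return "retains activity"
    return "below threshold"
```

Both rates must be measured under the same assay conditions for the comparison to be meaningful. The stricter tiers in the text (60% to 100%, and 120% to 150%) can be applied by passing different `retain_min` and `improved_min` values.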
  • Specifically, it was surprisingly discovered that cs-OLAS-1 (SEQ ID NO:41), when combined with cs-HEX-1 (SEQ ID NO:43) and a npgA enzyme, can generate Olivetolic Acid from acetyl-CoA and malonyl-CoA. Similarly, Diviaric Acid Synthase (pp-DVAS-1) (SEQ ID NO:42), Butiryl synthase (pp-BUT-1) (SEQ ID NO:44), and a npgA enzyme can produce Diviaric Acid from acetyl-CoA and malonyl-CoA. Variants derived from these sequences as described herein can also be used so long as the variants retain the ability to produce Olivetolic Acid or Diviaric Acid (respectively) as compared to the sequences from which the variants were derived.
  • Accordingly, in certain embodiments, cs-OLAS-1 variant enzymes comprise a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:41. In certain embodiments, cs-OLAS-1 variant enzymes comprise a polypeptide that has at least 70%, 75%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:41. When producing Olivetolic Acid, any of these cs-OLAS-1 variant enzymes can be used in combination with a cs-HEX-1 enzyme (including variants) as described herein. For example, in certain embodiments, cs-HEX-1 variant enzymes comprise a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:43. In certain embodiments, cs-HEX-1 variant enzymes comprise a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:43.
  • Additionally, in certain embodiments, pp-DVAS-1 variant enzymes comprise a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:42. In certain embodiments, pp-DVAS-1 variant enzymes comprise a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:42. When producing Diviaric Acid, any of these pp-DVAS-1 variant enzymes can be used in combination with a Butiryl (pp-BUT-1) synthase (including variants) as described herein. For example, in certain embodiments, Butiryl (pp-BUT-1) synthase variants comprise a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:44. In certain embodiments, Butiryl (pp-BUT-1) synthase variants comprise a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:44.
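The sequence identity thresholds above presuppose a way of scoring two aligned sequences. The following Python sketch shows one common convention, offered only as an illustration: the function name `percent_identity` is hypothetical, the inputs are assumed to be already aligned (equal length, with `-` marking gaps), and the choice to exclude gapped columns from the denominator is an assumption, since this passage does not specify the alignment tool or the denominator to be used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (one convention among several)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

A variant would meet, for instance, the "at least 90%" tier when `percent_identity(variant_aln, reference_aln) >= 90.0`, where the two arguments are the aligned variant and reference sequences. Other conventions (for example, dividing by the full alignment length) give different values for the same pair, so the convention should be stated alongside any reported figure.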
  • The sequences corresponding to SEQ ID NO:43 and 44 are as follows:
  • >SEQ ID NO: 43-cs-HEX-1
    MPYFLSPERRASGTDDPNSVAVVGLACRFPGDAENGPAFWDFLCKARSAY
    SESDRFNMNAFHSTAKGRLDTSITRGAHFLRQDIAAFDANFFSMSHSEAI
    AMDPNQRLMLEVAYEAFENAGLPLEAVAGTNTSCYIGNFTTDYRDMLFRD
    PDAMPLYSMSGSGYELISNRVSWFYDLRGPSFTLGTACSSSLVAVHQGCQ
    SLRTGESNTAIVGGSNLLLNPEMFLALSNQQFLAQDGRSKSFDIRGDGYG
    RGEGFAALVLKRVDDAIRDGDPIRAIIRGTGVNQDGKTKSITVPNADAQA
    DLTRSTYQSAGLSYKDTQYFEAHGTGTKAGDPLELKALSETLAAGRTANN
    KLIVGSVKPNIGHLEATAGLAGIIKSIYILEHAIIPPNIHFHQANPRIPF
    DEWNIEVPTKIMPWPVEGQRRISVQGFGYGGTNAHVILDDALHYLEKRRL
    KGNHFTKPFVPTPNGARGLVRNQTTNHSIKALKLKLSQSQKKLRLFVLSA
    QDQDGLNRQKTSLSIYLRKCLAGPTPPSSEYLRDLAFTLGHRRSRLAWKT
    FLTASSPDELLSNLENKSLDVPSFRPSSEPRIGFIFTGQGAQWARMGAEL
    NQYPIFRESVEASDEYLRSELKCKWSAMEEMLREEDQSKVNLPAYSQPIC
    TILQIALVDMLESWNIVPVAITGHSSGEIAGAYCLGALSKEDALKAAYYR
    GLLSSQMKTISPSVHGSMMAVGASESEAEEWIARITSGDLVVACVNSPSS
    VTISGDTPAIDELEAILKKDGVFARKLKVETAYHSPHMEMISVPYLQSMM
    DIQPQKGCPSRKMYSAVTGELVEPSELGPINWVRNLVSPVLFYDALYDLL
    RPMEAGRRSPDTAVDVLLEIGPHSALQGPANQTMKEHGIKGVDYRSVLSR
    GKNGIQTALAAVGALFSQGLTVNVKEVNGDTDDAQPLVDLPSYNWNHSRT
    FWSESRVTKEFRSRQHPPMRLLGAPCPSFGESERLWRGFMRISEEPWIRD
    HQIQGSIIYPAAGYICAAIEAACQLAAEGQDIKEFRLRDVQIIAPALITE
    ESDLELIVQIRPHLIGTQNNSSTWYEFTVSSCLNGQALRQNCHGLLLIEY
    KPAGDSGMSIERNLEDQTAQAQYTKTESLCPTQENAKDFYTELASVGLNY
    GSTFQNISKIRRGRGNSCCDVDISEQAFPAVSGTFKRPHVIHPTTLDAMF
    HAVFAAYKDQKGRLKEAMVPTSIDEMVISAAAPFEAGSRFKGFCKASKHG
    FRELMADLVMLDESSNWPAVTLKGFRLAAISGSSGASDEDIGPTSKKLFS
    KMVWKPALELLSLDQRKVMLNGTMPKAVTSESVSGLEKSEKLALHFISQI
    LERVPIDAVKKPHLQGFYRWMQEQQDQVNTYCHFLQTPNEGYLGIDDETA
    GLYEGAVNSEGAEGEALCRLGKNLEDILLGNVDAAELLLKDELTARVQHE
    IRGLDECFEKIGKFVNVLAHNNPDLSVLELGSARGGLAASLFSEPSDAMQ
    GLPNYVFSASHEGDLEEAKGYLAATNASITFRTLSIEKELASQGFESGTF
    DIIIASNPLRAQDDKTLTNMKTLLKPEGKLCLVSVARPAIGLSMVFRCLA
    SSLSSKLHYPCITDSESLDTVLKRTKLRTEFGISDFEDARYQHLSLAIAT
    NSETVGQDRQDREMIILEGSSPSDRSSALVTQLIHELESRNIKPSRMTWD
    QTKHDFSHKECISLLELEASFLEDLSEADFSAVKNLILDSANLTWVTALD
    GPACAVASGMARSIRNEIPGKSFRSLQVQEKSLDTPDKLAFLVGQVATTV
    IPDDEFREDAGVLQVCRVVEDAPMNEDITQLLVEGKENVEDMALDQVNGP
    QMLAIRAQGMLDTLCVEDDDVAVNELGNDEVEIDVKATGLNFRDVMVAMG
    QIPDNLLGFEASGIIKRVGRDVAGLEAGDSVCTLGHGCHRTLFRNKAIFC
    QRIPDGVSFADAATLPLVHCTAFYALVHVARVRPKQSVLIHAAAGGVGQA
    AIQIAKHFDLEIFATVGSTEKRNLIQEVYGIPDDHIFNSRDLSFEKGVLR
    MTNGRGVDCIINSLSGEALRRTWRCIAPFGTFVEIGMKDILGNTGLEMRP
    FLQDATFTFFNLKHVMTANPQLMAEIIEGTFDFLRQGISRPVSPVTIYPV
    SEVENAFRLMQTGKHRGKIAITWDGKDVVTVLHRTDNSLKLDENATYVLV
    GGLGGLGRSLSNLLVDLGARNLCFISRSGDQSTSAQKLLQDLEQRNVKTS
    VYRCDIADKGSVAETISYCAEKMPPIKGCFQCAMVLRDVLFEKMTHTQWT
    ESLRPKVQGSWNLHTLLPKELDFFVILSSFAGIFGNRTQSNYAAAGAYQD
    ALAHHRRAQGLKAVTVDLGIMRDVGVIAEHGATDYLKEWEEPFGIRETEL
    HVLIKKIINAELQFTSTDTETQLPPQILTGFATGGTAHLANIRRPFYFDD
    PRFAILTHTGLSASHSSTASASGPNGSVTLKDLLPHITVPADAEIAMKDA
    LIARIAKSLQIETSEIDEKRPLHSYGVDSLVAVEIANWIFKEIKVTVSVF
    DILASMPITALAGKVVIKSPFLPADVEAK
    SEQ ID NO: 44-Butiryl (BUT) synthase (pp-BUT-1)
    MPHSLSPESSDSVADDPNSVAVIGFACRFPGDAENGPAFWEFLCKARSAY
    SETDRFNINAFHSTAKDRLATSAAQGAHFLRQDVAAFDANFFSISHNEAM
    AMDPNQRFMLEVAYEAFENAGLPLETIAGTNTSCYIGNYTTDYREMLFRD
    SEAMRLYSMSGLGSELISNRVSWFYDLRGPSFTLGTACSSSLVAIHQGCQ
    SLRIRESSMAIVGGSNLLLNPEMFIALSNQQFLAQDGRSKSFDIRGDGYG
    RGEGFAALVLKRVDDAIRDGDPIRAILRGTGVNQDGKTKSITVPSADAQA
    DLIRSTYRSAGLSLKDTHYFEAHGTGTKAGDTTEMKALSETLAAGRKPSN
    KLIVGSVKSNIGHLEATAGIAGVIKAIYILEHAIIPPNIHFHQANPRIPF
    EKWNIEVPTKVMPWPVEGQRRISVQSFGYGGTNAHAILDDAYHYLEKRGV
    KGFHFTNPSISTISNGAGGFMRSQTTNPAIKALKLKLSHSQQKPRLFVLS
    AHDQDGLNRQKKSLSKYVRNFLAGAAHPSVDFLRDLAFTLGHRRSRLAWK
    TYLVASSPDDLLAKLENKALDVPFFRPSSEPRVGFIFTGQGAQWARMGAE
    LNQYPIFRESVEASDEYLRSCLKCNWSAMEEILRKEDQSNINLPAYSQPI
    CTILQIALVDLLETWNIVPSAITGHSSGEIAGAYCLGALSKEDALKAAYY
    RGFLSSQMKTISPSVHGSMMAVGASESEAEDWISRLTRGDVVVACVNSPS
    SVTVSGDAVAINELETMLKKEGIFARKLKVETAYHSPHMEMISVPYLQSM
    TDIQPKEGYPSRKMHSAVTGELVEPSELGPINWVRNLVSPVLFYDALHDL
    LRPMEAGRRSSDTAVDVLLEIGPHSALQGPANQTMKKHGIKGVDYRSFLS
    RGKNGVETALAAVGALFSQGLSVNVKEVNGDTDNAQTLVDLPYYHWNHSR
    TFWSESRITKEFRLRQHPRMRLLGAPCPTLGESERLWRGFMRISEETWIR
    DHQIQGSIIYPAAGYICAAIEAACQLAAEGQVIRDFWLRDVQIIAPALIT
    EESDLELWQIRPHFSGTQSSSSTWSEFTVSSCLNGQSLRKNCNGLLLIEY
    TSAEDSDMSAERDLEDQTAQAQCGKTESLCPTRTNTKDFYTELASVGLNY
    GSTFQNVSNIRRGRGVSCCDVNISEHAFPALSGEAERPHIIHPTTLDAMF
    HAVFAAYKDPKGRLREAMVPTSIDEMIISADAPFEVGSRFKGFSNASKHG
    FRELMADLIMLSETSNRPAVTVKGFCFAAISGSAGASDEDMEPTTKKLFS
    KMVWKPALELLSSDQKHRMLNVVMPKALAPEIASGLEKSEQLALHFISQV
    LERVSIDAVQKTRLQDLYRWMEEQQDQVNTCGRFLHTTNQGYLGIDEETA
    KLYERDVISDGAEGEAVCQIGQNLDDILLGKTDAAELLLKNELIARLQHE
    IRGLDECFGKMKEYVNLLAHNDPDLSVLELGTARGGLARSLFSSAPELSH
    TMPSLTQYVFSTSTEVDLKEAKEHLDITNTSITFKILSIENELTGQGFEG
    GAFDIIIASNFLRAQFDEKTLTNMKKLLKPGGKLWLVNVARPVTGLSMVF
    RCLASSLNLKYNYPDVADNEPLDTILKRNNLRAEFRISDFQDARYEHLSL
    TMAKFSEPVGQEYGDREIIILEASNPSDRSSALASRLVKELESRAVKASR
    VTWDRRTCDLTPKECISLMELEASFLEDLSEADFDAVRRIILDSANLTWV
    TALNGPAGAIASGMARSIRNEIPGKLFRSLQVQDKSLDSPDELAFLVGNV
    ATSVTPDDEFREDAGVLHVCRMVEDAPMSEEITQLLVEGRESVEDMSLEQ
    VGGPQMLAIRAQGMLDTLCVEEDDVAGNELERDEIEIEVKATGLNFRDVM
    VAMGQIPDNLLGFEASGIITHVGHDVTHFEVGDSVCTLGHGSHRTLFRNK
    AIFCQRIPDGISFAEAATFPLVHCTAFYSLVHVARVRPKQSILIHAAAGG
    VGQAVIQIAKHFDLEIFATVGSKDKRKLIQEEYGIPDNHIFNSRDLSFEK
    GVLRMTNGRGVDCIINSLSGEALRRTWRCIAPFGTFIEIGMKDILGNTGL
    EMRPFLQDATFTFINLKHVMTANPQLMAEIIEGTFDFLRQGISRAVSPVT
    VYPVSEVEDAFRLMQTGKHRGKIAITWDGKDVVPVLHHASNIAMLDEHAT
    YVLVGGLGGLGRSLSNLLVDLGARNLCFVSRSGDQSTSAQRLIRDLGQKN
    VKTSVYRCDIANRDSVAKTISNCSEHMPPIKGVFQCAMVLRDVLFEKMTH
    TQWTESLRPKVQGSWNLHSLLPKDLDFFVILSSFAGIFGNRTQSNYAAAS
    AYQDALAYHRRAEGLKAVTIDLGIMRDVGVIAEHGTTDYLKEWEEPFGIR
    ETELHALIKKIITAELQSSSTDNETQLPSQFLTGFATGGTVHLANIRRPF
    YFDDPRFSILAQTGLSASLSSTPGSSGPNGTVVLRDLLPHVTTAADAGIA
    MKDALISRVAKSLQTETSEIDEARPLHSYGVDSLVAVEIANWIFKEIKVI
    VSVFDVLASMPIAALAEMVVAKSPFLPADMVAK

    npgA Enzyme
  • The inventors have discovered that the PKS Enzymes derived from SEQ ID NO:1-5 or 40-44 require activation of the ACP domain. NpgA can catalyze this reaction.
  • In preferred embodiments, the npgA enzyme comprises the following sequence (SEQ ID NO:8):
  • MVQDTSSASTSPILTRWYIDTRPLTASTAALPLLETLQPADQISVQKYY
    HLKDKHMSLASNLLKYLFVHRNCRIPWSSIVISRTPDPHRRPCYIPPSG
    SQEDSFKDGYTGINVEFNVSHQASMVAIAGTAFTPNSGGDSKLKPEVGI
    DITCVNERQGRNGEERSLESLRQYIDIFSEVFSTAEMANIRRLDGVSSS
    SLSADRLVDYGYRLFYTYWALKEAYIKMTGEALLAPWLRELEFSNVVAP
    AAVAESGDSAGDFGEPYTGVRTTLYKNLVEDVRIEVAALGGDYLFATAA
    RGGGIGASSRPGGGPDGSGIRSQDPWRPFKKLDIERDIQPCATGVCNCL
    S
  • As used herein, “a npgA enzyme” refers to any one or combination of the enzymes listed in Table 3 and/or SEQ ID NOs:8 or 31-33.
  • Moreover, variants of any of these npgA enzymes can be used in combination with a PKS Enzyme described herein to produce Compound I from Compound II in the methods described herein. In these embodiments, variants of the npgA enzymes retain the ability to catalyze the conversion of Compound II into Compound I in combination with a PKS Enzyme derived from SEQ ID NO:1-5 or 40, with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence. In preferred embodiments, a variant npgA enzyme has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalyzing the conversion of Compound II into Compound I as compared to the sequence from which the improved variant is derived.
  • Alternatively, variants of the npgA enzymes retain the ability to catalyze the conversion of malonyl-CoA and acetyl-CoA in combination with cs-OLAS-1 of SEQ ID NO:41 (or variant thereof) in combination with the cs-HEX-1 of SEQ ID NO:43 (or variant thereof), with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence from which the npgA variant is derived. In preferred embodiments, a variant npgA enzyme has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalyzing the conversion of malonyl-CoA and acetyl-CoA in combination with the enzymes of SEQ ID NO: 41 and 43 (or variants thereof) as compared to the npgA sequence from which the improved variant is derived.
  • In further embodiments, variants of the npgA enzymes retain the ability to catalyze the conversion of malonyl-CoA and acetyl-CoA in combination with pp-DVAS-1 of SEQ ID NO:42 (or variant thereof) in combination with a pp-BUT-1 of SEQ ID NO:44 (or variant thereof), with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence from which the npgA variant is derived. In preferred embodiments, a variant npgA enzyme has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in catalyzing the conversion of malonyl-CoA and acetyl-CoA in combination with the enzymes of SEQ ID NO: 42 and 44 (or variants thereof) as compared to the npgA sequence from which the improved variant is derived.
  • npgA homolog from P. furfuracea
    (SEQ ID NO: 31)
    MTYHLCNADDDDGDGQTKAFRWLLDVQALWPAPGGGSQSAQSTAHWAT
    GTAAQHALALLADGERARALRFYRPSDAKLSLGSNLLKHRAIANTCRV
    PWSEAVISEGANRKPCYKPLGPRSKSLEFNVSHHGSLVALVGCPGEAV
    KLGVDVVKMNWERDYTTVMKDGFEAWANVYEAVFSEREIKDIAGFVPP
    IRGTQPDEIRAKLRHFYTHWCLKEAYVKMTGEALLAPWLKDLEFRNVQ
    VPLPASQMHASGQIGGDWGQTCGGVEIWFYGKRVTDVRLEIQAFREDY
    MIGTASSSVEMGLSVFKELDVERDVYPTQET
    npgA homolog from C. Stelaris
    (SEQ ID NO: 32)
    MNGPKVFRWVLDVQSLWPTPPDGPNGLQPSAREATARWASGKEAQYAL
    SLLASEEQAKVLRFYRPSDAKLSLASCLLKHRAIATTCEIPWSEATIG
    EDSNRKPCYKPSNPGGNTLEFNVSHHGTLVALVGCPGKAVRLGVDIVR
    MNWDKDYATVMKEGFQSWAKTYEAVFSDREVQDIAHYVTPKHDDLQDT
    IRAKLRHFYAHWCLKEAYVKMTGEALLAPWLKDVEFRNVQVPLPTSRA
    VDGAPEVNLWGQTCTDVEIWAHGNRVTDVQLEIQAFRDDYMIATASSH
    IGAKFSAFKELDLGKDVYP
    npgA homolog from C. Grayi
    (SEQ ID NO: 33)
    MAMTGPKVYRWVLDVQSLWPTPPDGTNHLQPSGREATAQWASGKEARY
    ALSLLTPEEQAKVLRFYRPSDAKLSLASCLLKRRAIATTCEVPWSEAT
    IGEDSNRKPCYKPSNPEGKAVEFNVSHHGSLVALVGCPGKDVSLGVDV
    VRMNWDKDYAGVMREGFESWARTYEAVFSDREVEDIAHYVAPTHDNVQ
    DTIRAKLRHFYAHWCLKEAYVKMTGEALLAPWLKDVEFRNVQVPLPTG
    LAADGASENNLWGQTCTDVEIWAHGNRVTDVQLEIQAFRDDYMIATAS
    SHVGAEFSAFRELDLEKDVYP
  • TABLE 3
    npgA Enzymes
    Accession No. Protein Name % identity to SEQ ID NO: 8
    XP_663744.1 hypothetical protein AN6140.2 [Aspergillus nidulans FGSC A4] 100.00%
    XP_026607463.1 Uncharacterized protein DSM5745_02284 [Aspergillus mulundensis]  75.29%
    OJJ01434.1 hypothetical protein ASPVEDRAFT_82959 [Aspergillus versicolor CBS 583.65]  68.35%
    OJJ58831.1 hypothetical protein ASPSYDRAFT_58043 [Aspergillus sydowii CBS 593.65]  66.76%
    GAQ06841.1 hypothetical protein ALT_4162 [Aspergillus lentulus]  57.79%
    KKK21491.1 hypothetical protein AOCH_005987 [Aspergillus ochraceoroseus]  58.13%
    XP_001260366.1 4′-phosphopantetheinyl transferase NpgA [Aspergillus fischeri NRRL 181]  57.35%
    CEL00884.1 hypothetical protein ASPCAL00476 [Aspergillus calidoustus]  66.28%
    XP_026618747.1 hypothetical protein CDV56_106897 [Aspergillus thermomutatus]  55.80%
    KKK11895.1 hypothetical protein ARAM_003790 [Aspergillus rambellii]  57.10%
    RHZ72079.1 hypothetical protein CDV55_108504 [Aspergillus turcosus]  55.41%
    XP_002378105.1 aflYg/npgA protein, putative [Aspergillus flavus NRRL3357]  56.82%
    RAQ52488.1 aflYg/npgA protein [Aspergillus flavus]  57.47%
    EDP54396.1 4′-phosphopantetheinyl transferase NpgA [Aspergillus fumigatus A1163]  56.86%
    OXN06337.1 hypothetical protein CDV58_05090 [Aspergillus fumigatus]  56.57%
    XP_755193.1 4′-phosphopantetheinyl transferase NpgA/CfwA [Aspergillus fumigatus Af293]  56.57%
    XP_022585045.1 hypothetical protein ASPZODRAFT_200027 [Penicilliopsis zonata CBS 506.65]  55.16%
    KEY77082.1 4′-phosphopantetheinyl transferase NpgA [Aspergillus fumigatus var. RP-2014]  56.16%
    PYI23618.1 4′-phosphopantetheinyl transferase [Aspergillus violaceofuscus CBS 115571]  54.78%
    ODM20598.1 hypothetical protein SI65_03651 [Aspergillus cristatus]  52.72%
    KJK61502.1 Sfp [Aspergillus parasiticus SU-1]  56.82%
    GAO86809.1 L-aminoadipate-semialdehyde dehydrogenase-phosphopantetheinyl transferase [Aspergillus udagawae]  56.37%
    PIG80832.1 aflYg/npgA protein [Aspergillus arachidicola]  56.82%
    XP_025504279.1 hypothetical protein BO66DRAFT_81606 [Aspergillus aculeatinus CBS 121060]  52.57%
    RJE25168.1 4′-phosphopantetheinyl transferase NpgA [Aspergillus sclerotialis]  55.84%
    XP_001267784.1 4′-phosphopantetheinyl transferase NpgA [Aspergillus clavatus NRRL 1]  57.43%
    RWQ96577.1 4′-phosphopantetheinyl transferase NpgA [Byssochlamys spectabilis]  52.08%
    RAK81669.1 hypothetical protein BO72DRAFT_444212 [Aspergillus fijiensis CBS 313.89]  51.74%
    XP_025431842.1 hypothetical protein BP01DRAFT_356077 [Aspergillus saccharolyticus JOP 1030-1]  51.46%
    OJJ31021.1 hypothetical protein ASPWEDRAFT_176122 [Aspergillus wentii DTO 134E9]  55.59%
    XP_025576628.1 4′-phosphopantetheinyl transferase [Aspergillus ibericus CBS 121593]  54.11%
    XP_020059757.1 hypothetical protein ASPACDRAFT_1852401 [Aspergillus aculeatus ATCC 16872]  53.20%
    PYI30524.1 4′-phosphopantetheinyl transferase [Aspergillus indologenus CBS 114.80]  54.84%
    XP_015403697.1 putative aflYg/npgA protein [Aspergillus nomius NRRL 13137]  54.60%
    XP_025470021.1 4′-phosphopantetheinyl transferase NpgA [Aspergillus sclerotioniger CBS 115572]  54.46%
    PYI08903.1 4′-phosphopantetheinyl transferase [Aspergillus sclerotiicarbonarius CBS 121057]  53.98%
    XP_025446590.1 hypothetical protein BO95DRAFT_478940 [Aspergillus brunneoviolaceus CBS 621.78]  52.66%
    XP_023093666.1 unnamed protein product [Aspergillus oryzae RIB40]  53.76%
    XP_025495634.1 4′-phosphopantetheinyl transferase [Aspergillus uvarum CBS 121591]  55.33%
    EIT78712.1 hypothetical protein A03042_05000 [Aspergillus oryzae 3.042]  53.48%
    XP_020121487.1 hypothetical protein UA08_03648 [Talaromyces atroroseus]  50.42%
    XP_022401752.1 hypothetical protein ASPGLDRAFT_124818 [Aspergillus glaucus CBS 516.65]  53.30%
    XP_025530903.1 4′-phosphopantetheinyl transferase [Aspergillus japonicus CBS 114.51]  54.21%
    XP_022388698.1 aflYg/npgA protein [Aspergillus bombycis]  55.43%
    KUL90071.1 hypothetical protein ZTR_02868 [Talaromyces verruculosus]  51.12%
    PCH00357.1 4′-phosphopantetheinyl transferase [Penicillium sp. ‘occitanis’]  49.72%
    KFX47391.1 L-aminoadipate-semialdehyde dehydrogenase-phosphopantetheinyl transferase [Talaromyces marneffei PM1]  49.73%
    XP_002146553.1 4′-phosphopantetheinyl transferase NpgA/CfwA [Talaromyces marneffei ATCC 18224]  49.73%
    CRG90513.1 hypothetical protein PISL3812_07557 [Talaromyces islandicus]  52.66%
    PGH13396.1 hypothetical protein AJ79_03675 [Helicocarpus griseus UAMH5409]  50.14%
    PLN81137.1 hypothetical protein BDW42DRAFT_102289 [Aspergillus taichungensis]  54.24%
    GAD93105.1 4′-phosphopantetheinyl transferase NpgA/CfwA [Byssochlamys spectabilis No. 5]  53.95%
    PGH08948.1 4′-phosphopantetheinyl transferase [Blastomyces parvus]  48.78%
    XP_024667956.1 hypothetical protein BDW47DRAFT_113120 [Aspergillus candidus]  55.90%
    RAO71122.1 hypothetical protein BHQ10_007134 [Talaromyces amestolkiae]  50.29%
    EEQ83341.1 4′-phosphopantetheinyl transferase NpgA [Blastomyces dermatitidis ER-3]  49.59%
    EYE91721.1 hypothetical protein EURHEDRAFT_236841 [Aspergillus ruber CBS 135680]  52.29%
    EQL35867.1 hypothetical protein BDFG_02477 [Blastomyces dermatitidis ATCC 26199]  50.14%
    XP_024691353.1 hypothetical protein P168DRAFT_272258 [Aspergillus campestris IBT 28561]  56.13%
    GAA86427.1 aflYg/npgA protein [Aspergillus kawachii IFO 4308]  51.75%
    EGE81927.1 4′-phosphopantetheinyl transferase NpgA [Blastomyces dermatitidis ATCC 18188]  50.14%
    XP_002621466.1 4′-phosphopantetheinyl transferase NpgA [Blastomyces gilchristii SLH14081]  50.27%
    OJD18353.1 hypothetical protein AJ78_01597 [Emergomyces pasteurianus Ep9510]  49.60%
    XP_024687280.1 4′-phosphopantetheinyl transferase [Aspergillus novofumigatus IBT 16806]  56.07%
    GCB28155.1 L-aminoadipate-semialdehyde dehydrogenase-phosphopantetheinyl transferase [Aspergillus awamori]  52.05%
    XP_025454152.1 4′-phosphopantetheinyl transferase [Aspergillus lacticoffeatus CBS 101883]  52.05%
    XP_001395469.1 npgA protein [Aspergillus niger CBS 513.88]  52.84%
    KLJ10976.1 hypothetical protein EMPG_09807 [Emmonsia parva UAMH 139]  50.00%
    XP_026628569.1 4′-phosphopantetheinyl transferase [Aspergillus welwitschiae]  51.75%
    OJJ67400.1 hypothetical protein ASPBRDRAFT_200113 [Aspergillus brasiliensis CBS 101740]  51.87%
    RDK45378.1 4′-phosphopantetheinyl transferase [Aspergillus phoenicis ATCC 13157]  52.63%
    OOF92416.1 hypothetical protein ASPCADRAFT_509391 [Aspergillus carbonarius ITEM 5010]  52.57%
    XP_002790645.2 4′-phosphopantetheinyl transferase NpgA [Paracoccidioides lutzii pb01]  49.33%
    PYH95779.1 4′-phosphopantetheinyl transferase [Aspergillus ellipticus CBS 707.79]  53.69%
    OJD20335.1 hypothetical protein ACJ73_08332 [Blastomyces percursus]  49.59%
    XP_002541282.1 conserved hypothetical protein [Uncinocarpus reesii 1704]  50.43%
    XP_025565104.1 aflYg/npgA protein [Aspergillus vadensis CBS 113365]  53.22%
    ODH48202.1 hypothetical protein GX48_05693 [Paracoccidioides brasiliensis]  47.14%
    XP_025535897.1 aflYg/npgA protein [Aspergillus costaricaensis CBS 115574]  51.92%
    OAX77444.1 hypothetical protein ACJ72_08257 [Emmonsia sp. CAC-2015a]  48.83%
    OXV06433.1 hypothetical protein Egran_05801 [Elaphomyces granulatus]  48.78%
    XP_025554268.1 4′-phosphopantetheinyl transferase [Aspergillus homomorphus CBS 101889]  50.97%
    GAQ45036.1 aflYg/npgA protein [Aspergillus niger]  52.19%
    XP_010760919.1 hypothetical protein PADG_05197 [Paracoccidioides brasiliensis Pb18]  46.58%
    EEH17147.2 hypothetical protein PABG_07234 [Paracoccidioides brasiliensis Pb03]  46.59%
    XP_013324640.1 4′-phosphopantetheinyl transferase NpgA [Rasamsonia emersonii CBS 393.64]  52.80%
    OJI80632.1 hypothetical protein ASPTUDRAFT_130475 [Aspergillus tubingensis CBS 134.48]  50.73%
    XP_024702426.1 4′-phosphopantetheinyl transferase [Aspergillus steynii IBT 23096]  52.68%
    XP_025477897.1 aflYg/npgA protein [Aspergillus neoniger CBS 115656]  50.29%
    OXV06984.1 hypothetical protein Egran_05250 [Elaphomyces granulatus]  47.34%
    XP_025395965.1 4′-phosphopantetheinyl transferase [Aspergillus heteromorphus CBS 117.55]  49.86%
    XP_001218317.1 conserved hypothetical protein [Aspergillus terreus NIH2624]  50.14%
    KMP00727.1 phosphopantetheinyl transferase A [Coccidioides immitis RMSCC 2394]  47.38%
    XP_001247064.2 4′-phosphopantetheinyl transferase NpgA [Coccidioides immitis RS]  47.38%
    PGH23632.1 hypothetical protein AJ80_02238 [Polytolypa hystricis UAMH7299]  46.83%
    AAU07984.1 putative 4′-phosphopantetheinyl transferase [Aspergillus fumigatus]  56.45%
    XP_002478852.1 4′-phosphopantetheinyl transferase NpgA/CfwA [Talaromyces stipitatus ATCC 10500]  47.34%
    EEH07682.1 4′-phosphopantetheinyl transferase NpgA [Histoplasma capsulatum G186AR]  47.95%
    EFW15615.1 4′-phosphopantetheinyl transferase NpgA [Coccidioides posadasii str. Silveira]  45.86%
    PGH36127.1 4′-phosphopantetheinyl transferase [Emmonsia crescens]  46.90%
  • Production of Compound II
  • As shown in FIGS. 1A and 1B, Compound II can be produced by two different mechanisms.
  • First, Compound II can be produced by enzymatically converting Compound III into Compound II by an enzyme selected from AAL1, AAL1ΔSKL, and/or CsAAE1.
  • In preferred embodiments, the AAL1 enzyme comprises the following sequence (SEQ ID NO:9):
  • MPQIIHKSAWGDIPLSTFFYGNVTDYLRSKKSFGSDKIGYIDAETGEGI
    TYKQLWKLANGISAVLYHHYGIGHARAPVASDHTLGDVVMLHAPNSRFF
    PSLHYGMLDMGCTITSASVSYDVADLAHQLRVTDASLVLCYQEKENNVR
    QAIKEAQKDAAFPGITHPVRILLIENLLTMACNISEEKINSAMARKFEY
    SPQECTKRIAYLSMSSGTTGGIPKAVRLTHFNMSSCDTLGTLSTPSFST
    GDDIRVAAIVPMTHQYGLTKFIFNMCSSHATTVVHRQFDLVKLLESQKK
    YKLNRLMLVPPVIVKMAKDPAVEPYIPSLYEHVDFITTGAAPLPGSAVT
    NLLTRITGNPQGIRHSQSGRPPLTISQGYGLTETSPLCAVFDPLDPDVD
    FRSAGKATSHVEIRIVSEDGVDQPQLKLDDLSHLDGMLKRDEPLPVGEV
    LIRGPMIMDGYHKNRQSSEESFDRSQEDPKTLIHWQDKWLKTGDIGMVD
    QKGRLMIVDRNKEMIKSMSKQVAPAELESLLLNHDQVIDCAVIGVNSEA
    KATESARAFLVLKDPSYDAVKIKAWLDGQVPSYKRLYGGVVVLKNEQIP
    KNPSGKILRRILRTRKDDFIQGIDVSQL
  • The AAL1ΔSKL sequence is identical to SEQ ID NO:9, except that amino acids 614-616 have been deleted.
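The relationship between the two sequences can be sketched in Python as a deletion over a 1-based, inclusive residue range. This is an illustrative helper only: `delete_range` is a hypothetical name, and the numbering convention is an assumption based on how positions are given elsewhere in this document.

```python
def delete_range(sequence, start, end):
    """Remove residues start..end (1-based, inclusive) from a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range is outside the sequence")
    return sequence[: start - 1] + sequence[end:]
```

Applied as `delete_range(aal1, 614, 616)`, where `aal1` holds SEQ ID NO:9, the call returns residues 1-613 followed by anything after position 616. If the parent sequence is 616 residues long, the deletion simply removes the three C-terminal residues.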
  • In preferred embodiments, the CsAAE1 enzyme comprises the following sequence (SEQ ID NO:10):
  • MAYKSLDAISVSDIQALGIASPAAEKLFKEISDIITHYGAATPQTWSRIS
    KRLLNPDLPFSFHQIMYYGCYKDFGPDPPAWLPDPKTAGFTNVWKLLEKR
    GYEFLGSNYLDPISSFSAFQEFSVSNPEVYWKTVLDEMSVSFSVPPQCIL
    REDSPLSNPGGQWLPGAHLNPAKNCLSLNSESSSNDVAITWRDEGSDHLP
    VSCMTLEELRTEVWSVAYALNALGLDRGAAIAINMPMNVKSVIIYLAIVL
    AGYVVVSIADSFAPVEISTRLKISQAKAIFTQDLIIRGEKSIPLYSRVVD
    AQSPMAIVIPTKGSNFSMKLRDGDISWRDFLERVNNLRGNEFAAVEQPVE
    AYTNILFSSGTTGEPKAIPWINATPLKAAADAWCHMDIRKGDIVAWPTNL
    GWMMGPWLVYASLLNGACIALYNGSPIGSGFAKFVQDAKVTILGVIPSIV
    RTWKSTNCTAGYDWSAIRCFGSTGEASNVDEYLWLMGRAHYKPIIEYCGG
    TEIGGAFITGSLLQPQSLAAFSTPTMGCSLFILGNDGYPIPHNVPGMGEL
    ALGSLMFGASSSLLNGDHYKVYYKGMPVWNGKILRRHGDVFERTSRGYYH
    AHGRADDTMNLGGIKVSSVELERLCNAADSSILETAAIGVPPPQGGPERL
    VIAVVFKHPDNSTPDLEELKKSFNSVVQKKLNPLFRVSRVVPLPSLPRTA
    TNKVMRRILRQRFVQREQNSKL
  • Moreover, variants of AAL1, AAL1ΔSKL, and/or CsAAE1 can also be used to produce Compound II from Compound III in the methods described herein. Variants of AAL1, AAL1ΔSKL, and/or CsAAE1 retain the ability to catalyze the conversion of Compound III into Compound II with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence. In preferred embodiments, a variant AAL1, AAL1ΔSKL, and/or CsAAE1 enzyme has improved activity over the sequence from which it is derived, in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% activity in catalyzing the conversion of Compound III into Compound II as compared to the sequence from which the improved variant is derived.
  • The second way in which Compound II can be produced is shown in FIG. 1B. In this pathway, acetyl-CoA and malonyl-CoA are enzymatically converted to Compound II using a combination of enzymes selected from:
      • a. StcJ and StcK;
      • b. HexA and HexB; and
      • c. MutFas1 and MutFas2.
  • The genes HexA and HexB encode the alpha (hexA) and beta (hexB) subunits of the hexanoate synthase (HexS) from Aspergillus parasiticus SU-1 (Hitchman et al. 2001). The genes StcJ and StcK are from Aspergillus nidulans and encode yeast-like FAS proteins (Brown et al. 1996). As would be understood by the person skilled in the art, many fungi have hexanoate synthase or fatty acid synthase genes, which could readily be identified by DNA sequencing and sequence alignment with the known genes disclosed herein. Similarly, the skilled person would understand that homologous genes in different organisms may also be suitable. Examples of HexA and HexB homologs are shown in Tables 4 and 5. Examples of FAS1 and FAS2 homologs are shown in Tables 6 and 7. The endogenous yeast genes FAS1 (fatty acid synthase subunit beta) and FAS2 (fatty acid synthase subunit alpha) form the fatty acid synthase FAS, which catalyses the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH. Mutated FAS produces short-chain fatty acids, such as hexanoic acid. Several different combinations of mutations enable the production of hexanoic acid. The mutations include: FAS1 I306A and FAS2 G1250S; FAS1 I306A and FAS2 G1250S and M1251W; and FAS1 I306A and R1834K and FAS2 G1250S (Gajewski et al. 2017). Mutated FAS2 and FAS1 may be expressed under the control of any suitable promoter, including, but not limited to, the alcohol dehydrogenase II promoter of Y. lipolytica. Alternatively, genomic FAS2 and FAS1 can be directly mutated using, for example, homologous recombination or CRISPR-Cas9 genome editing technology.
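  • The substitution notation used above (for example I306A: wild-type residue, 1-based position, replacement residue) can be applied to a sequence programmatically. The following is a minimal sketch under stated assumptions: the helper apply_substitutions and the five-residue toy sequence are hypothetical and introduced only for illustration, and the residue numbering is assumed to match the reference sequence being edited. The three combinations are those named in the text.

```python
import re

# Pattern for a single substitution such as "I306A".
SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")

# The three mutation combinations listed in the text (Gajewski et al. 2017).
COMBINATIONS = [
    {"FAS1": ["I306A"], "FAS2": ["G1250S"]},
    {"FAS1": ["I306A"], "FAS2": ["G1250S", "M1251W"]},
    {"FAS1": ["I306A", "R1834K"], "FAS2": ["G1250S"]},
]


def apply_substitutions(sequence, substitutions):
    """Apply substitutions to `sequence`, checking each wild-type residue first."""
    residues = list(sequence)
    for code in substitutions:
        match = SUBSTITUTION.fullmatch(code)
        if match is None:
            raise ValueError(f"unrecognised substitution: {code}")
        wild_type, position, replacement = match.groups()
        index = int(position) - 1  # convert 1-based position to 0-based index
        if not 0 <= index < len(residues):
            raise ValueError(f"{code}: position outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(f"{code}: found {residues[index]} at that position")
        residues[index] = replacement
    return "".join(residues)


# Hypothetical toy sequence (not FAS1 or FAS2), used only to show the mechanics.
print(apply_substitutions("MAGMK", ["G3S", "M4W"]))
```

The wild-type check is a deliberate design choice: it fails loudly if the numbering of the supplied sequence is offset from the numbering used in the mutation name.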
  • Accordingly, in certain embodiments, HexA comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:16. In certain embodiments, HexA comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:16. In certain embodiments, HexB comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:17. In certain embodiments, HexB comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:17. In certain embodiments, StcJ comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:18. In certain embodiments, StcJ comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:18. In certain embodiments, StcK comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:19. In certain embodiments, StcK comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:19. In certain embodiments, FAS2 comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:20 and one of the combinations of mutations defined above. In certain embodiments, FAS2 comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:20 and one of the combinations of mutations defined above. In certain embodiments, FAS1 comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:21 and one of the combinations of mutations defined above. In certain embodiments, FAS1 comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:21 and one of the combinations of mutations defined above.
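  • The text does not state how percent sequence identity is computed, and several conventions exist. As a sketch of one common convention only, identity can be taken as the number of identical aligned positions divided by the number of alignment columns. The function names, the gap character "-" and the short example sequences below are assumptions for illustration; the inputs are assumed to have been aligned already by some external tool.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Columns in which both sequences carry a gap are ignored; a residue
    aligned against a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the pair reaches the given identity threshold (default 70%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical pre-aligned fragments: nine of ten columns are identical.
print(percent_identity("MASEKEIRRE", "MASEKDIRRE"))
```

Other definitions (for example, dividing by the length of the shorter sequence) give different values for the same pair, so the convention should be fixed before comparing a candidate against a threshold.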
  • Variants of the Compound II producing proteins retain the ability to catalyse the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH. For example, a variant of a Compound II producing protein must retain the ability to catalyse the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence. In preferred embodiments, a variant of a Compound II producing protein has improved activity over the sequence from which it is derived, in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% activity in catalysing the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH, as compared to the sequence from which the improved variant is derived.
  • The hexanoyl-CoA synthases HexA and HexB, StcJ and StcK, or mutated FAS1 and FAS2 may be expressed using, for example, a constitutive TEF intron promoter or a native promoter (Wong et al. 2017) and a synthesized short terminator (Curran et al. 2015). The production of Compound II may be determined by directly measuring the concentration of Compound II using LC-MS.
  • HexA
    SEQ ID NO: 16
    MVIQGKRLAASSIQLLASSLDAKKLCYEYDERQAPGVTQITEEAPTEQPPLSTPPSLPQTPNIS
    PISASKIVIDDVALSRVQIVQALVARKLKTAIAQLPTSKSIKELSGGRSSLQNELVGDIHNEFS
    SIPDAPEQILLRDFGDANPTVQLGKTSSAAVAKLISSKMPSDFNANAIRAHLANKWGLGPLRQT
    AVLLYAIASEPPSRLASSSAAEEYWDNVSSMYAESCGITLRPRQDTMNEDAMASSAIDPAVVAE
    FSKGHRRLGVQQFQALAEYLQIDLSGSQASQSDALVAELQQKVDLWTAEMTPEFLAGISPMLDV
    KKSRRYGSWWNMARQDVLAFYRRPSYSEFVDDALAFKVFLNRLCNRADEALLNMVRSLSCDAYF
    KQGSLPGYHAASRLLEQAITSTVADCPKARLILPAVGPHTTITKDGTIEYAEAPRQGVSGPTAY
    IQSLRQGASFIGLKSADVDTQSNLTDALLDAMCLALHNGISFVGKTFLVTGAGQGSIGAGVVRL
    LLEGGARVLVTTSREPATTSRYFQQMYDNHGAKFSELRVVPCNLASAQDCEGLIRHVYDPRGLN
    WDLDAILPFAAASDYSTEMHDIRGQSELGHRLMLVNVFRVLGHIVHCKRDAGVDCHPTQVLLPL
    SPNHGIFGGDGMYPESKLALESLFHRIRSESWSDQLSICGVRIGWTRSTGLMTAHDIIAETVEE
    HGIRTFSVAEMALNIAMLLTPDFVAHCEDGPLDADFTGSLGTLGSIPGFLAQLHQKVQLAAEVI
    RAVQAEDEHERFLSPGTKPTLQAPVAPMHPRSSLRVGYPRLPDYEQEIRPLSPRLERLQDPANA
    VVVVGYSELGPWGSARLRWEIESQGQWTSAGYVELAWLMNLIRHVNDESYVGWVDTQTGKPVRD
    GEIQALYGDHIDNHTGIRPIQSTSYNPERMEVLQEVAVEEDLPEFEVSQLTADAMRLRHGANVS
    IRPSGNPDACHVKLKRGAVILVPKTVPFVWGSCAGELPKGWTPAKYGIPENLIHQVDPVTLYTI
    CCVAEAFYSAGITHPLEVFRHIHLSELGNFIGSSMGGPTKTRQLYRDVYFDHEIPSDVLQDTYL
    NTPAAWVNMLLLGCTGPIKTPVGACATGVESIDSGYESIMAGKTKMCLVGGYDDLQEEASYGFA
    QLKATVNVEEEIACGRQPSEMSRPMAESRAGFVEAHGCGVQLLCRGDIALQMGLPIYAVIASSA
    MAADKIGSSVPAPGQGILSFSRERARSSMISVTSRPSSRSSTSSEVSDKSSLTSITSISNPAPR
    AQRARSTTDMAPLRAALATWGLTIDDLDVASLHGTSTRGNDLNEPEVIETQMRHLGRTPGRPLW
    AICQKSVTGHPKAPAAAWMLNGCLQVLDSGLVPGNRNLDTLDEALRSASHLCFPTRTVQLREVK
    AFLLTSFGFGQKGGQVVGVAPKYFFATLPRPEVEGYYRKVRVRTEAGDRAYAAAVMSQAVVKIQ
    TQNPYDEPDAPRIFLDPLARISQDPSTGQYRFRSDATPALDDDALPPPGEPTELVKGISSAWIE
    EKVRPHMSPGGTVGVDLVPLASFDAYKNAIFVERNYTVRERDWAEKSADVRAAYASRWCAKEAV
    FKCLQTHSQGAGAAMKEIEIEHGGNGAPKVKLRGAAQTAARQRGLEGVQLSISYGDDAVIAVAL
    GLMSGAS
    HexB
    SEQ ID NO: 17
    MGSVSREHESIPIQAAQRGAARICAAFGGQGSNNLDVLKGLLELYKRYGPDLDELLDVASNTLS
    QLASSPAAIDVHEPWGFDLRQWLTTPEVAPSKEILALPPRSFPLNTLLSLALYCATCRELELDP
    GQFRSLLHSSTGHSQGILAAVAITQAESWPTFYDACRTVLQISFWIGLEAYLFTPSSAASDAMI
    QDCIEHGEGLLSSMLSVSGLSRSQVERVIEHVNKGLGECNRWVHLALVNSHEKFVLAGPPQSLW
    AVCLHVRRIRADNDLDQSRILFRNRKPIVDILFLPISAPFHTPYLDGVQDRVIEALSSASLALH
    SIKIPLYHTGTGSNLQELQPHQLIPTLIRAITVDQLDWPLVCRGLNATHVLDFGPGQTCSLIQE
    LTQGTGVSVIQLTTQSGPKPVGGHLAAVNWEAEFGLRLHANVHGAAKLHNRMTTLLGKPPVMVA
    FMTPTTVRWDFVAAVAQAGYHVELAGGGYHAERQFEAEIRRLATAIPADHGITCNLLYAKPTTF
    SWQISVIKDLVRQGVPVEGITIGAGIPSPEVVQECVQSIGLKHISFKPGSFEAIHQVIQIARTH
    PNFLIGLQWTAGRGGGHHSWEDFHGPILATYAQIRSCPNILLVVGSGFGGGPDTFPYLTGQWAQ
    AFGYPCMPFDGVLLGSRMMVAREAHTSAQAKRLIIDAQGVGDADWHKSFDEPTGGVVTVNSEFG
    QPIHVLATRGVMLWKELDNRVFSIKDTSKRLEYLRNHRQEIVSRLNADFARPWFAVDGHGQNVE
    LEDMTYLEVLRRLCDLTYVSHQKRWVDPSYRILLLDFVHLLRERFQCAIDNPGEYPLDIIVRVE
    ESLKDKAYRTLYPEDVSLLMHLFSRRDIKPVPFIPRLDERFETWFKKDSLWQSEDVEAVIGQDV
    QRIFIIQGPMAVQYSISDDESVKDILHNICNHYVEALQADSRETSIGDVHSITQKPLSAFPGLK
    VTTNRVQGLYKFEKVGAVPEMDVLFEHIVGLSKSWARTCLMSKSVFRDGSRLHNPIRAALQLQR
    GDTIEVLLTADSEIRKIRLISPTGDGGSTSKVVLEIVSNDGQRVFATLAPNIPLSPEPSVVFCF
    KVDQKPNEWTLEEDASGRAERIKALYMSLWNLGFPNKASVLGLNSQFTGEELMITTDKIRDFER
    VLRQTSPLQLQSWNPQGCVPIDYCVVIAWSALTKPLMVSSLKCDLLDLLHSAISFHYAPSVKPL
    RVGDIVKTSSRILAVSVRPRGTMLTVSADIQRQGQHVVTVKSDFFLGGPVLACETPFELTEEPE
    MVVHVDSEVRRAILHSRKWLMREDRALDLLGRQLLFRLKSEKLFRPDGQLALLQVTGSVFSYSP
    DGSTTAFGRVYFESESCTGNVVMDFLHRYGAPRAQLLELQHPGWTGTSTVAVRGPRRSQSYARV
    SLDHNPIHVCPAFARYAGLSGPIVHGMETSAMMRRIAEWAIGDADRSRFRSWHITLQAPVHPND
    PLRVELQHKAMEDGEMVLKVQAFNERTEERVAEADAHVEQETTAYVFCGQGSQRQGMGMDLYVN
    CPEAKALWARADKHLWEKYGFSILHIVQNNPPALTVHFGSQRGRRIRANYLRMMGQPPIDGRHP
    PILKGLTRNSTSYTFSYSQGLLMSTQFAQPALALMEMAQFEWLKAQGVVQKGARFAGHSLGEYA
    ALGACASFLSFEDLISLIFYRGLKMQNALPRDANGHTDYGMLAADPSRIGKGFEEASLKCLVHI
    IQQETGWFVEVVNYNINSQQYVCAGHFRALWMLGKICDDLSCHPQPETVEGQELRAMVWKHVPT
    VEQVPREDRMERGRATIPLPGIDIPYHSTMLRGEIEPYREYLSERIKVGDVKPCELVGRWIPNV
    VGQPFSVDKSYVQLVHGITGSPRLHSLLQQMA
    StcJ
    SEQ ID NO: 18
    MTQKTIQQVPRQGLELLASTQDLAQLCYIYGEPAEGEDSTADESIINTPQCSTIPEVAVEPEVQ
    PIPDTPLTAIFIIRALVARKLRRSETEIDPSRSIKELCGGKSTLQNELIGELGNEFQTSLPDRA
    EDVSLADLDAALGEVSLGPTSVSLLQRVFTAKMPARMTVSNVRERLAEIWGLGFHRQTAVLVAA
    LAAEPHSRLTSLEAAYQYWDGLNEAYGQSLGLFLRKAISQQAARSDDQGAQAIAPADSLGSKDL
    ARKQYEALREYLGIRTPTTKQDGLDLADLQQKLDCWTAEFSDDFLSQISRRFDARKTRWYRDWW
    NSARQELLTICQNSNVQWTDKMREHFVQRAEEGLVEIARAHSLAKPLVPDLIQAISLPPVVRLG
    RLATMMPRTVVTLKGEIQCEEHEREPSCFVEFFSSWIQANNIRCTIQSNGEDLTSVFINSLVHA
    SQQGVSFPNHTYLITGAGPGSIGQHIVRRLLTGGARVIVTTSREPLPAAAFFKELYSKCGNRGS
    QLHLVPFNQASVVDCERLIGYIYDDLGLDLDAILPFAATSQVGAEIDGLDASNEAAFRLMLVNV
    LRLVGFVVSQKRRRGISCRPTQVVLPLSPNHGILGGDGLYAESKRGLETLIQRFHSESWKEELS
    ICGVSIGWTRSTGLMAANDLVAETAEKQGRVLTFSVDEMGDLISLLLTPQLATRCEDAPVMADF
    SGNLSCWRDASAQLAAARASLRERADTARALAQEDEREYRCRRAGSTQEPVDQRVSLHLGFPSL
    PEYDPLLHPDLVPADAVVVVGFAELGPWGSARIRWEMESRGCLSPAGYVETAWLMNLIRHVDNV
    NYVGWVDGEDGKPVADADIPKRYGERILSNAGIRSLPSDNREVFQEIVLEQDLPSFETTRENAE
    ALQQRHGDMVQVSTLKNGLCLVQLQHGATIRVPKSIMSPPGVAGQLPTGWSPERYGIPAEIVQQ
    VDPVALVLLCCVAEAFYSAGISDPMEIFEHIHLSELGNFVGSSMGGVVNTRALYHDVCLDKDVQ
    SDALQETYLNTAPAWVNMLYLGAAGPIKTPVGACATALESVDSAVESIKAGQTKICLVGGYDDL
    QPEESAGFARMKATVSVRDEQARGREPGEMSRPTAASRSGFVESQGCGVQLLCRGDVALAMGLP
    IYGIIAGTGMASDGIGRSVPAPGQGILTFAQEDAQNPAPSRTALARWGLGIDDITVASLHATST
    PANDTNEPLVIQREMTHLGRTSGRPLWAICQKFVTGHPKAPAAAWMLNGCLQVLDTGLVPGNRN
    ADDVDPALRSFSHLCFPIRSIQTDGIKAFLLNSCGFGQKEAQLVGVHPRYFLGLLSEPEFEEYR
    TRRQLRIAGAERAYISAMMTNSIVCVQSHPPFGPAEMHSILLDPSARICLDSSTNSYRVTKAST
    PVYTGFQRPHDKREDPRPSTIGVDTVTLSSFNAHENAIFLQRNYTERERQSLQLQSHRSFRSAV
    ASGWCAKEAVFKCLQTVSKGAGAAMSEIEIVRVQGAPSVLHGDALAAAQKAGLDNIQLSLSYGD
    DCVVAVALGVRKWCLWPLASHR
    StcK
    SEQ ID NO: 19
    MTPSPFLDAVDAGLSRLYACFGGQGPSNWAGLDELVHLSHAYADCAPIQDLLDSSARRLESQQR
    SHTDRHFLLGAGSNYRPGSTTLLHPHHLPEDLALSPYSFPINTLLSLLHYAITAYSLQLDPGQL
    RQKLQGAIGHSQGVFVAAAIAISHTDHGWPSFYRAADLALQLSFWVGLESHHASPRSILCANEV
    IDCLENGEGAPSHLLSVTGLDINHLERLVRKLNDQGGDSLYISLINGHNKFVLAGAPHALRGVC
    IALRSVKASPELDQSRVPFPLRRSVVDVQFLPVSAPYHSSLLSSVELRVTDAIGGLRLRGNDLA
    IPVYCQANGSLRNLQDYGTHDILLTLIQSVTVERVNWPALCWAMNDATHVLSFGPGAVGSLVQD
    VLEGTGMNVVNLSGQSMASNLSLLNLSAFALPLGKDWGRKYRPRLRKAAEGSAHASIETKMTRL
    LGTPHVMVAGMTPTTCSPELVAAIIQADYHVEFACGGYYNRATLETALRQLSRSIPPHRSITCN
    VIYASPKALSWQTQVLRRLIMEEGLPIDGITVGAGIPSPEVVKEWIDMLAISHIWFKPGSVDAI
    DRVLTIARQYPTLPVGIQWTGGRAGGHHSCEDFHLPILDCYARIRNCENVILVAGSGFGGAEDT
    WPYMNGSWSCKLGYAPMPFDGILLGSRMMVAREAKTSFAVKQLIVEAPGVKDDGNDNGAWAKCE
    HDAVGGVISVTSEMGQPIHVLATRAMRLWKEFDDRFFSIRDPKRLKAALKQHRVEIINRLNNDF
    ARPWFAQTDSSKPTEIEELSYRQVLRRLCQLTYVQHQARWIDSSYLSLVHDFLRLAQGRLGSGS
    EAELRFLSCNTPIELEASFDAAYGVQGDQILYPEDVSLLINLFRRQGQKPVPFIPRLDADFQTW
    FKKDSLWQSEDVDAVVDQDAQRVCIIQGPVAVRHSRVCDEPVKDILDGITEAHLKMMLKEAASD
    NGYTWANQRDEKGNRLPGIETSQEGSLCRYYLVGPTLPSTEAIVEHLVGECAWGYAALSQKKVV
    FGQNRAPNPIRDAFKPDIGDVIEAKYMDGCLREITLYHSLRRQGDPRAIRAALGLIHLDGNKVS
    VTLLTRSKGKRPALEFKMELLGGTMGPLILKMHRTDYLDSVRRLYTDLWIGRDLPSPTSVGLNS
    EFTGDRVTITAEDVNTFLAIVGQAGPARCRAWGTRGPVVPIDYAVVIAWTALTKPILLEALDAD
    PLRLLHQSASTRFVPGIRPLHVGDTVTTSSRITERTITTIGQRVEISAELLREGKPVVRLQTTF
    IIQRRPEESVSQQQFRCVEEPDMVIRVDSHTKLRVLMSRKWFLLDGPCSDLIGKILIFQLHSQT
    VFDAAGAPASLQVSGSVSLAPSDTSVVCVSSVGTRIGRVYMEEEGFGANPVMDFLNRHGAPRVQ
    RQPLPRAGWTGDDAASISFTAPAQSEGYAMVSGDTNPIHVCPLFSRFAGLGQPVVHGLHLSATV
    RRILEWIIGDNERTRFCSWAPSFDGLVRANDRLRMEIQHFAMADGCMVVHVRVLKESTGEQVMH
    AEAVLEQAQTTYVFTGQGTQERGMGMALYDTNAAARAVWDRAERHFRSQYGISLLHIVRENPTS
    LTVNFGSRRGRQIRDIYLSMSDSDPSMLPGLTRDSRSYTFNYPSGLLMSTQFAQPALAVMEIAE
    YAHLQAQGVVQTQAIFAGHSLGEYSSLGACTTIMPFESLLSLILYRGLKMQNTLPRNANGRTDY
    GMVAADPSRIRSDFTEDRLIELVRLVSQATGVLLEVVNYNVHSRQYVCAGHVRSLWVLSHACDD
    LSRSTSPNSPQTMSECIAHHIPSSCSVTNETELSRGRATIPLAGVDIPFHSQMLRGHIDGYRQY
    LRHHLRVSDIKPEELVGRWIPNVTGKPFALDAPYIRLVQGVTQSRPLLELLRRVEENR
    FAS alpha | FAS2
    SEQ ID NO: 20
    MRPEIEQELAHTLLVELLAYQFASPVRWIETQDVILAEKRTERIVEIGPADTLGGMARRTLASK
    YEAYDAATSVQRQILCYNKDAKEIYYDVDPVEEETESAPEAAAAPPTSAAPAAAVVAAPAPAAS
    APSAGPAAPVEDAPVTALDIVRTLVAQKLKKALSDVPLNKAIKDLVGGKSTLQNEILGDLGKEF
    GSTPEKPEDTPLDELGASMQATFNGQLGKQSSSLIARLVSSKMPGGFNITAVRKYLETRWGLGP
    GRQDGVLLLALTMEPASRIGSEPDAKVFLDDVANKYAANSGISLNVPTASGDGGASAGGMLMDP
    AAIDALTKDQRALFKQQLEIIARYLKMDLRDGQKAFVASQETQKTLQAQLDLWQAEHGDFYASG
    IEPSFDPLKARVYDSSWNWARQDALSMYYDIIFGRLKVVDREIVSQCIRIMNRSNPLLLEFMQY
    HIDNCPTERGETYQLAKELGEQLIENCKEVLGVSPVYKDVAVPTGPQTTIDARGNIEYQEVPRA
    SARKLEHYVKQMAEGGPISEYSNRAKVQNDLRSVYKLIRRQHRLSKSSQLQFNALYKDVVRALS
    MNENQIMPQENGSTKKPGRNGSVRNGSPRAGKVETIPFLHLKKKNEHGWDYSKKLTGIYLDVLE
    SAARSGLTFQGKNVLMTGAGAGSIGAEVLQGLISGGAKVIVTTSRYSREVTEYYQAMYARYGAR
    GSQLVVVPFNQGSKQDVEALVDYIYDTKKGLGWDLDFIVPFAAIPENGREIDSIDSKSELAHRI
    MLTNLLRLLGSVKAQKQANGFETRPAQVILPLSPNHGTFGNDGLYSESKLALETLFNRWYSENW
    SNYLTICGAVIGWTRGTGLMSGNNMVAEGVEKLGVRTFSQQEMAFNLLGLMAPAIVNLCQLDPV
    WADLNGGLQFIPDLKDLMTRLRTEIMETSDVRRAVIKETAIENKVVNGEDSEVLYKKVIAEPRA
    NIKFQFPNLPTWDEDIKPLNENLKGMVNLDKVVVVTGFSEVGPWGNSRTRWEMEASGKFSLEGC
    VEMAWIMGLIRHHNGPIKGKTYSGWVDSKTGEPVDDKDVKAKYEKYILEHSGIRLIEPELFKGY
    DPKKKQLLQEIVIEEDLEPFEASKETAEEFKREHGEKVEIFEVLESGEYTVRLKKGATLLIPKA
    LQFDRLVAGQVPTGWDARRYGIPEDIIEQVDPVTLFVLVCTAEAMLSAGVTDPYEFYKYVHLSE
    VGNCIGSGIGGTHALRGMYKDRYLDKPLQKDILQESFINTMSAWVNMLLLSSTGPIKTPVGACA
    TAVESVDIGYETIVEGKARVCFVGGFDDFQEEGSYEFANMKATSNAEDEFAHGRTPQEMSRPTT
    TTRAGFMESQGCGMQLIMSAQLALDMGVPIYGIIALTTTATDKIGRSVPAPGQGVLTTARENPG
    KFPSPLLDIKYRRRQLELRKRQIREWQESELLYLQEEAEAIKAQNPADFVVEEYLQERAQHINR
    EAIRQEKDAQFSLGNNFWKQDSRIAPLRGALATWGLTVDEIGVASFHGTSTVANDKNESDVICQ
    QMKHLGRKKGNALLGIFQKYLTGHPKGAAGAWMFNGCLQVLDSGLVPGNRNADNVDKVMEKFDY
    IVYPSRSIQTDGIKAFSVTSFGFGQKGAQVIGIHPKYLYATLDRAQFEAYRAKVETRQKKAYRY
    FHNGLVNNSIFVAKNKAPYEDELQSKVFLNPDYRVAADKKTSELKYPPKPPVATDAGSESTKAV
    IESLAKAHATENSKIGVDVESIDSINISNETFIERILPASEQQYCQNAPSPQSSFAGRWSAKEA
    VFKSLGVCSKGAGAPLKDIEIENDSNGAPTLHGVAAEAAKEAGVKHISVSISHSDMQAVAVAIS
    QF
    FAS beta | FAS1
    SEQ ID NO: 21 
    MYGTSTGPQTGINTPRSSQSLRPLILSHGSLEFSFLVPTSLHFHASQLKDTFTASLPEPTDELA
    QDDEPSSVAELVARYIGHVAHEVEEGEDDAHGTNQDVLKLTLNEFERAFMRGNDVHAVAATLPG
    ITAKKVLVVEAYYAGRAAAGRPTKPYDSALFRAASDEKARIYSVLGGQGNIEEYFDELREVYNT
    YTSFVDDLISSSAELLQSLSREPDANKLYPKGLNVMQWLREPDTQPDVDYLVSAPVSLPLIGLV
    QLAHFAVTCRVLGKEPGEILERFSGTTGHSQGIVTAAAIATATTWESFHKAVANALTMLFWIGL
    RSQQAYPRTSIAPSVLQDSIENGEGTPTPMLSIRDLPRTAVQEHIDMTNQHLPEDRHISISLVN
    SARNFVVTGPPLSLYGLNLRLRKVKAPTGLDQNRVPFTQRKVRFVNRFLPITAPFHSQYLYSAF
    DRIMEDLEDVEISPKSLTIPVYGTKTGDDLRAISDANVVPALVRMITHDPVNWEQTTAFPNATH
    IVDFGPGGISGLGVLTNRNKDGTGVRVILAGSMDGTNAEVGYKPELFDRDEHSVKYAIDWVKEY
    GPRLVKNATGQTFVDTKMSRLLGIPPIMVAGMTPTTVPWDFVAATMNAGYHIELAGGGYYNAKT
    MTEAITKIEKAIPPGRGITVNLIYVNPRAMGWQIPLIGKLRADGVPIEGLTIGAGVPSIEVANE
    YIETLGIKHIAFKPGSVDAIQQVINIAKANPKFPVILQWTGGRGGGHHSFEDFHQPILQMYSRI
    RRHENIILVAGSGFGGAEDTYPYLSGNWSSRFGYPPMPFDGCLFGSRMMTAKEAHTSKNAKQAI
    VDAPGLDDQDWEKTYKGAAGGVVTVLSEMGEPIHKLATRGVLFWHEMDQKIFKLDKAKRVPELK
    KQRDYIIKKLNDDFQKVWFGRNSAGETVDLEDMTYAEVVHRMVDLMYVKHEGRWIDDSLKKLTG
    DFIRRVEERFTTAEGQASLLQNYSELNVPYPAVDNILAAYPEAATQLINAQDVQHFLLLCQRRG
    QKPVPFVPSLDENFEYWFKKDSLWQSEDLEAVVGQDVGRTCILQGPMAAKFSTVIDEPVGDILN
    SIHQGHIKSLIKDMYNGDETTIPITEYFGGRLSEAQEDIEMDGLTISEDANKISYRLSSSAADL
    PEVNRWCRLLAGRSYSWRHALFSADVFVQGHRFQTNPLKRVLAPSTGMYVEIANPEDAPKTVIS
    VREPYQSGKLVKTVDIKLNEKGPIALTLYEGRTAENGVVPLTFLFTYHPDTGYAPIREVMDSRN
    DRIKEFYYRIWFGNKDVPFYTPTTATFNGGRETITSQAVADFVHAVGNTGEAFVERPGKEVFAP
    MDFAIVAGWKAITKPIFPRTIDGDLLKLVHLSNGFKMVPGAQPLKVGDVLDTTAQINSIINEES
    GKIVEVCGTIRRDGKPIMHVTSQFLYRGAYTDFENTFQRKDEVPMQVHLASSRDVAILRSKEWF
    RLDMDDVELLGQTLTFRLQSLIRFKNKNVFSQVQTMGQVLLELPTKEVIQVASVDYEAGTSHGN
    PVIDYLQRNGTSIEQPVYFENPIPLSGKTPLVLRAPASNETYARVSGDYNPIHVSRVFSSYANL
    PGTITHGMYTSAAVRSLVETWAAENNIGRVRGFHVSLVDMVLPNDLITVRLQHVGMIAGRKIIK
    VEASNKETEDKVLLGEAEVEQPVTAYVFTGQGSQEQGMGMELYATSPVAKEVWDRPSFHWNYGL
    SIIDIVKNNPKERTVHFGGPRGKAIRQNYMSMTFETVNADGTIKSEKIFKEIDETTTSYTYRSP
    TGLLSATQFTQPALTLMEKASFEDMRSKGLVQRDSSFAGHSLGEYSALADLADVMLIESLVSVV
    FYRGLTMQVAVERDEQGRSNYSMCAVNPSRISKTFNEQALQYVVGNISEQTGWLLEIVNYNVAN
    MQYVAAGDLRALDCLTNLLNYLKAQNIDIPALMQSMSLEDVKAHLVNIIHECVKQTEAKPKPIN
    LERGFATIPLKGIDVPFHSTFLRSGVKPFRSFLIKKINKTTIDPSKLVGKYIPNVTARPFEITK
    EYFEDVYRLTNSPRIAHILANWEKYEEGTEGGSRHGGTTAASS
  • TABLE 4
    HEXA HOMOLOGS
    Description Ident Accession
    hypothetical protein [Aspergillus parasiticus SU-1] 99% KJK60794.1
    sterigmatocystin biosynthesis fatty acid synthase subunit alpha [Aspergillus flavus AF70] 98% KOC17633.1
    fatty acid synthase alpha subunit [Aspergillus flavus NRRL3357] 98% XP_002379948.1
    HexA [Aspergillus flavus] 98% AAS90024.1
    unnamed protein product [Aspergillus oryzae RIB40] 98% XP_001821514.3
    sterigmatocystin biosynthesis fatty acid synthase subunit alpha [Aspergillus arachidicola] 97% PIG79619.1
    sterigmatocystin biosynthesis fatty acid synthase subunit alpha [Aspergillus bombycis] 92% XP_022391210.1
    sterigmatocystin biosynthesis fatty acid synthase subunit alpha [Aspergillus nomius NRRL 13137] 92% XP_015404699.1
  • TABLE 5
    HEXB HOMOLOGS
    Description Ident Accession
    hypothetical protein [Aspergillus parasiticus SU-1] 99% KJK60796.1
    fatty acid synthase beta subunit [Aspergillus flavus NRRL3357] 99% XP_002379947.1
    HexB [Aspergillus flavus] 99% AAS90085.1
    unnamed protein product [Aspergillus oryzae RIB40] 98% XP_001821515.1
    fatty acid synthase beta subunit [Aspergillus flavus AF70] 98% KOC17632.1
    fatty acid synthase beta subunit [Aspergillus arachidicola] 96% PIG79622.1
    HexB [Aspergillus flavus] 96% AAS90002.1
    enoyl reductase domain of FAS1 [Aspergillus oryzae 3.042] 98% EIT81347.1
    fatty acid synthase beta subunit [Aspergillus bombycis] 89% XP_022391135.1
    HexB [Aspergillus nomius] 90% AAS90050.1
    fatty acid synthase beta subunit [Aspergillus nomius NRRL 13137] 90% XP_015404698.1
  • TABLE 6
    FAS1 HOMOLOGS
    Description Ident Accession
    fatty acid synthase, beta subunit [Aspergillus nidulans] 100% AAB41494.1
    hypothetical protein [Aspergillus nidulans FGSC A4] 99% XP_682677.1
    hypothetical protein [Aspergillus sydowii CBS 593.65] 94% OJJ52999.1
    Putative Fatty acid synthase beta subunit dehydratase [Aspergillus calidoustus] 94% CEN62087.1
    hypothetical protein [Aspergillus versicolor CBS 583.65] 93% OJJ08968.1
    hypothetical protein [Aspergillus rambellii] 91% KKK18959.1
    hypothetical protein [Aspergillus ochraceoroseus] 91% KKK13726.1
    fatty acid synthase beta subunit dehydratase [Aspergillus terreus NIH2624] 91% XP_001213436.1
    hypothetical protein [Aspergillus carbonarius ITEM 5010] 89% OOF94457.1
    hypothetical protein [Aspergillus turcosus] 90% OXN14637.1
    fatty acid synthase beta subunit [Aspergillus sclerotioniger CBS 115572] 89% PWY96795.1
    fatty acid synthase beta subunit [Aspergillus heteromorphus CBS 117.55] 89% XP_025394299.1
    fatty acid synthase beta subunit [Aspergillus sclerotiicarbonarius CBS 121057] 89% PYI01270.1
    hypothetical protein [Aspergillus thermomutatus] 90% OXS11585.1
  • TABLE 7
    FAS2 HOMOLOGS
    Description Ident Accession
    RecName: Full = Fatty acid synthase subunit alpha; Includes: RecName: Full = Acyl carrier; Includes: RecName: Full = 3-oxoacyl-[acyl-carrier-protein] reductase; AltName: Full = Beta-ketoacyl reductase; Includes: RecName: Full = 3-oxoacyl-[acyl-carrier-protein] synthase; AltName: Full = Beta-ketoacyl synthase 100% P78615.1
    FAS2_PENPA Fatty acid synthase subunit alpha [Aspergillus nidulans FGSC A4] 99% XP_682676.1
    TPA: Fatty acid synthase, alpha subunit [Source: UniProtKB/TrEMBL; Acc: P78615] [Aspergillus nidulans FGSC A4] 99% CBF87553.1
    hypothetical protein ASPVEDRAFT_144895 [Aspergillus versicolor CBS 583.65] 93% OJJ08967.1
    Putative Fatty acid synthase subunit alpha reductase [Aspergillus calidoustus] 93% CEN62088.1
    hypothetical protein ASPSYDRAFT_564317 [Aspergillus sydowii CBS 593.65] 93% OJJ52998.1
    hypothetical protein BP01DRAFT_383520 [Aspergillus saccharolyticus JOP 1030-1] 91% XP_025430630.1
    putative fatty acid synthase alpha subunit FasA [Aspergillus indologenus CBS 114.80] 91% PYI32058.1
    hypothetical protein ASPCADRAFT_208136 [Aspergillus carbonarius ITEM 5010] 90% OOF94458.1
    hypothetical protein ASPACDRAFT_79663 [Aspergillus aculeatus ATCC 16872] 90% XP_020055233.1
    fatty acid synthase alpha subunit FasA [Aspergillus kawachii IFO 4308] 91% GAA92751.1
    putative fatty acid synthase alpha subunit FasA [Aspergillus fijiensis CBS 313.89] 90% RAK72625.1
    putative fatty acid synthase alpha subunit FasA [Aspergillus aculeatinus CBS 121060] 90% XP_025498650.1
    putative fatty acid synthase alpha subunit FasA [Aspergillus violaceofuscus CBS 115571] 90% PYI15679.1
    fatty acid synthase alpha subunit FasA [Aspergillus piperis CBS 112811] 91% XP_025520376.1
    fatty acid synthase alpha subunit FasA [Aspergillus vadensis CBS 113365] 91% PYH66515.1
    putative fatty acid synthase alpha subunit FasA [Aspergillus brunneoviolaceus CBS 621.78] 90% XP_025442388.1
    fatty acid synthase alpha subunit FasA [Aspergillus neoniger CBS 115656] 91% XP_025476115.1
    fatty acid synthase alpha subunit FasA [Aspergillus costaricaensis CBS 115574] 91% RAK83984.1
  • Production of Compound III
  • Compound III can be enzymatically produced from Compound IV using, for example, ADH alone or a combination of ADH, FAO and one of the four enzymes FALDH1-4. See, for example, Gatter, M., et al., (2014) FEMS Yeast Research 14(6), 858-872 and Salid, A., et al., (2013) Applied Biochemistry and Biotechnology 171(8), 2273-2284. Carbon sources used to produce Compound III include alkanes, such as, for example, hexane and octane.
  • Production of GPP
  • FIG. 3 describes the preferred method of producing GPP. Specifically, GPP may be produced by a mutated farnesyl diphosphate synthase. For example, normally in yeast, the farnesyl diphosphate synthase ERG20 condenses isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) to provide geranyl pyrophosphate (GPP) and then condenses GPP with a further molecule of IPP to provide farnesyl pyrophosphate (FPP). However, only a low level of GPP remains as ERG20 converts most of the GPP to FPP. More GPP is required for the commercial scale production of cannabinoids. Accordingly, a mutated ERG20 that has a reduced ability, or an inability, to produce FPP may be used to increase the production of GPP. Two sets of mutations have been identified in S. cerevisiae that increase GPP production. The first mutation is a substitution of K197E and the second is a double substitution of F96W and N127W. As would be readily appreciated by the person skilled in the art, due to the high homology between ERG20 from S. cerevisiae and ERG20 from Y. lipolytica, equivalent mutations may be introduced into ERG20 from Y. lipolytica. In Y. lipolytica the first mutation is a substitution of K189E and the second is a double substitution of F88W and N119W. Introducing Y. lipolytica ERG20 (K189E) increases the production of GPP, but growth is slightly slower than that of wild-type yeast. Introducing Y. lipolytica ERG20 (F88W and N119W) produces fast growing clones with a high level of GPP. The sequences for the Y. lipolytica and S. cerevisiae genes are shown herein; however, the skilled person would understand that homologous genes may also be suitable. Examples of ERG20 homologs are shown in Table 8. Accordingly, in certain embodiments, the one or more GPP producing genes comprise: a mutated farnesyl diphosphate synthase; a mutated S. cerevisiae ERG20 comprising a K197E substitution; a double mutated S. cerevisiae ERG20 comprising F96W and N127W substitutions; a mutated Y. lipolytica ERG20 comprising a K189E substitution; or a double mutated Y. lipolytica ERG20 comprising F88W and N119W substitutions; or a combination thereof. For the SEQ ID NOs described herein, mutations are shown with a solid underline. In certain embodiments, S. cerevisiae ERG20 (K197E) comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:25. In certain embodiments, S. cerevisiae ERG20 (K197E) comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:25. In certain embodiments, S. cerevisiae ERG20 (F96W and N127W) comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:26. In certain embodiments, S. cerevisiae ERG20 (F96W and N127W) comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:26. The equivalent Y. lipolytica amino acid sequences are shown in SEQ ID NOS: 27 and 28. In certain embodiments, Y. lipolytica ERG20 (K189E) comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:27. In certain embodiments, Y. lipolytica ERG20 (K189E) comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:27. In certain embodiments, Y. lipolytica ERG20 (F88W and N119W) comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:28. In certain embodiments, Y. lipolytica ERG20 (F88W and N119W) comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:28.
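  • The S. cerevisiae and Y. lipolytica substitutions named above differ in position by a constant 8 residues (197 vs 189, 96 vs 88, 127 vs 119). As a sketch only, that renumbering can be expressed as a fixed offset. The helper shift_substitution and the constant OFFSET are assumptions inferred from those stated positions; a fixed offset would hold only where the two sequences align without insertions or deletions between the sites, so in practice the mapping should be confirmed against a sequence alignment.

```python
import re


def shift_substitution(code: str, offset: int) -> str:
    """Renumber a substitution such as 'K197E' by a fixed position offset."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"unrecognised substitution: {code}")
    wild_type, position, replacement = match.groups()
    new_position = int(position) + offset
    if new_position < 1:
        raise ValueError(f"{code}: shifted position is not positive")
    return f"{wild_type}{new_position}{replacement}"


# Substitutions stated in the text for S. cerevisiae ERG20.
S_CEREVISIAE = ["K197E", "F96W", "N127W"]

# Assumption: constant offset inferred from the positions given in the text.
OFFSET = -8

y_lipolytica = [shift_substitution(code, OFFSET) for code in S_CEREVISIAE]
print(y_lipolytica)  # ['K189E', 'F88W', 'N119W']
```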
  • Variants of the GPP proteins, such as ERG20, retain the ability to, for example, condense isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) to geranyl pyrophosphate (GPP) and yet have reduced GPP to FPP activity. For example, a variant of a GPP protein, such as ERG20, retains the ability to condense isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) to geranyl pyrophosphate (GPP) with at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence, while the ability to condense GPP to FPP is reduced by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% (null mutation) as compared to the sequence from which it is derived.
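  • The two criteria in the preceding paragraph can be read as a pair of inequalities on measured activities relative to the parent enzyme. The sketch below is illustrative only: the function name, the parameter names and the example numbers are assumptions, the activity values are in arbitrary but consistent units, and the default thresholds are the least stringent values recited above (at least about 80% GPP-forming activity retained, FPP-forming activity reduced by at least 10%).

```python
def is_acceptable_gpp_variant(
    gpp_activity: float,
    fpp_activity: float,
    parent_gpp_activity: float,
    parent_fpp_activity: float,
    min_gpp_retained: float = 0.80,
    min_fpp_reduction: float = 0.10,
) -> bool:
    """Check a variant against both activity criteria relative to its parent."""
    if parent_gpp_activity <= 0 or parent_fpp_activity <= 0:
        raise ValueError("parent activities must be positive")
    # Criterion 1: GPP-forming activity is retained at or above the threshold.
    retains_gpp = gpp_activity >= min_gpp_retained * parent_gpp_activity
    # Criterion 2: FPP-forming activity is reduced by at least the threshold.
    reduced_fpp = fpp_activity <= (1.0 - min_fpp_reduction) * parent_fpp_activity
    return retains_gpp and reduced_fpp


# Hypothetical measurements: 95% GPP activity retained, FPP activity cut by 60%.
print(is_acceptable_gpp_variant(95.0, 40.0, 100.0, 100.0))
```

A null mutation for FPP formation corresponds to fpp_activity equal to zero, which satisfies the second criterion for any threshold up to 100%.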
  • ERG20 (K197E)
    SEQ ID NO: 25
    MASEKEIRRERFLNVFPKLVEELNASLLAYGMPKEACDWYAHSLNYNTP
    GGKLNRGLSVVDTYAILSNKTVEQLGQEEYEKVAILGWCIELLQAYFLV
    ADDMMDKSITRRGQPCWYKVPEVGEIAINDAFMLEAAIYKLLKSHFRNE
    KYYIDITELFHEVTFQTELGQLMDLITAPEDKVDLSKFSLKKHSFIVTF
    ETAYYSFYLPVALAMYVAGITDEKDLKQARDVLIPLGEYFQIQDDYLDC
    FGTPEQIGKIGTDIQDNKCSWVINKALELASAEQRKTLDENYGKKDSVA
    EAKCKKIFNDLKIEQLYHEYEESIAKDLKAKISQVDESRGFKADVLTAF
    LNKVYKRSK*
    ERG20 (F96W and N127W)
    SEQ ID NO: 26
    MASEKEIRRERFLNVFPKLVEELNASLLAYGMPKEACDWYAHSLNYNTP
    GGKLNRGLSVVDTYAILSNKTVEQLGQEEYEKVAILGWCIELLQAYWLV
    ADDMMDKSITRRGQPCWYKVPEVGEIAIWDAFMLEAAIYKLLKSHFRNE
    KYYIDITELFHEVTFQTELGQLMDLITAPEDKVDLSKFSLKKHSFIVTF
    KTAYYSFYLPVALAMYVAGITDEKDLKQARDVLIPLGEYFQIQDDYLDC
    FGTPEQIGKIGTDIQDNKCSWVINKALELASAEQRKTLDENYGKKDSVA
    EAKCKKIFNDLKIEQLYHEYEESIAKDLKAKISQVDESRGFKADVLTAF
    LNKVYKRSK*
    Y. lipolytica ERG20 (K189E)
    SEQ ID NO: 27
    MSKAKFESVFPRISEELVQLLRDEGLPQDAVQWFSDSLQYNCVGGKLNR
    GLSVVDTYQLLTGKKELDDEEYYRLALLGWLIELLQAFFLVSDDIMDES
    KTRRGQPCWYLKPKVGMIAINDAFMLESGIYILLKKHFRQEKYYIDLVE
    LFHDISFKTELGQLVDLLTAPEDEVDLNRFSLDKHSFIVRYETAYYSFY
    LPVVLAMYVAGITNPKDLQQAMDVLIPLGEYFQVQDDYLDNFGDPEFIG
    KIGTDIQDNKCSWLVNKALQKATPEQRQILEDNYGVKDKSKELVIKKLY
    DDMKIEQDYLDYEEEVVGDIKKKIEQVDESRGFKKEVLNAFLAKIYKRQ
    K
    Y. lipolytica ERG20 (F88W and N119W)
    SEQ ID NO: 28
    ASKAKFESVFPRISEELVQLLRDEGLPQDAVQWFSDSLQYNCVGGKLNR
    GLSVVDTYQLLTGKKELDDEEYYRLALLGWLIELLQAFWLVSDDIMDES
    KTRRGQPCWYLKPKVGMIAIWDAFMLESGIYILLKKHFRQEKYYIDLVE
    LFHDISFKTELGQLVDLLTAPEDEVDLNRFSLDKHSFIVRYKTAYYSFY
    LPVVLAMYVAGITNPKDLQQAMDVLIPLGEYFQVQDDYLDNFGDPEFIG
    KIGTDIQDNKCSWLVNKALQKATPEQRQILEDNYGVKDKSKELVIKKLY
    DDMKIEQDYLDYEEEVVGDIKKKIEQVDESRGFKKEVLNAFLAKIYKRQ
    K
  • TABLE 8
    ERG20 HOMOLOGS
    Description Ident Accession
    YALI0E05753P [Yarrowia lipolytica CLIB122] 99% XP_503599.1
    hypothetical protein [Nadsonia fulvescens var. elongata DSM 6958] 71% ODQ67901.1
    hypothetical protein [Lipomyces starkeyi NRRL Y-11557] 70% ODQ75043.1
    Farnesyl pyrophosphate synthetase [Galactomyces candidus] 68% CDO55796.1
    hypothetical protein [Kazachstania naganishii CBS 8797] 68% XP_022463460.1
    farnesyl pyrophosphate synthase [Saitoella complicata NRRL Y-17804] 66% XP_019025287.1
    hypothetical protein [Tetrapisispora blattae CBS 6284] 67% XP_004179894.1
    hypothetical protein [Torulaspora delbrueckii] 67% XP_003680478.1
    unnamed protein product [Zymoseptoria tritici ST99CH_1E4] 66% SMR57088.1
    ERG20 farnesyl diphosphate synthase [Zymoseptoria tritici IPO323] 66% XP_003850094.1
    LAFE_0G04434g1_1 [Lachancea fermentati] 68% SCW03167.1
    ERG20-like protein [Saccharomyces kudriavzevii IFO 1802] 66% EJT43164.1
    hypothetical protein [Dactylellina haptotyla CBS 200.50] 66% EPS37682.1
    CYFA0S07e04962g1_1 [Cyberlindnera fabianii] 65% CDR41679.1
    probable farnesyl pyrophosphate synthetase [Ramularia collo-cygni] 65% XP_023628194.1
    farnesyl pyrophosphate synthetase [Kluyveromyces marxianus DMKU3-1042] 65% XP_022673909.1
    polyprenyl synt-domain-containing protein [Sphaerulina musiva SO2202] 67% XP_016759989.1
  • High levels of GPP production depend on adequate mevalonate production. Hydroxymethylglutaryl-CoA reductase (HMGR) catalyses the production of mevalonate from HMG-CoA and NADPH. HMGR catalyses a rate-limiting step in the GPP pathway in yeast. Accordingly, overexpressing HMGR may increase flux through the pathway and increase the production of GPP. HMGR is a GPP pathway gene. Other GPP pathway genes include those genes that are involved in the GPP pathway, the products of which either directly produce GPP or produce intermediates in the GPP pathway, for example, ERG10, ERG13, ERG12, ERG8, ERG19, IDI1 or ERG20. The HMGR1 sequence from Y. lipolytica consists of 999 amino acids (aa) (SEQ ID NO: 29), of which the first 500 aa harbor multiple transmembrane domains and a response element for signal regulation. The remaining 499 C-terminal residues contain a catalytic domain and an NADPH-binding region. Truncated HMGR1 (tHmgR) has been generated by deleting the N-terminal 500 aa (Gao et al. 2017). tHMGR is able to avoid self-degradation mediated by its N-terminal domain and is thus stabilized in the cytoplasm, which increases flux through the GPP pathway. The N-terminal 500 aa are shown with a dashed underline in SEQ ID NO:29. The N-terminal 500 aa are deleted in SEQ ID NO:30. In certain embodiments, the one or more GPP pathway genes comprise a hydroxymethylglutaryl-CoA reductase (HMGR); a truncated hydroxymethylglutaryl-CoA reductase (tHMGR); or a combination thereof. The sequence for the Y. lipolytica gene is shown herein; however, the skilled person would understand that homologous genes may also be suitable. Examples of HMGR homologs are shown in Table 9. In certain embodiments, HMGR comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:29. 
In certain embodiments, HMGR comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:29. In certain embodiments, tHmgR comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:30. In certain embodiments, tHmgR comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:30.
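  • The N-terminal truncation described above can be illustrated with a minimal sketch. It assumes only that the protein is held as a one-letter amino acid string; the function name `truncate_n_terminus` and the placeholder sequence are hypothetical and are not part of the disclosed sequences.

```python
def truncate_n_terminus(protein: str, n_residues: int) -> str:
    """Return the protein with its first n_residues amino acids removed."""
    if not 0 <= n_residues < len(protein):
        raise ValueError("n_residues must be >= 0 and smaller than the protein length")
    return protein[n_residues:]


# Hypothetical stand-in for a 999-residue protein (not a real sequence):
# 500 'N' characters mark the N-terminal region, 499 'C' the C-terminal region.
full_length = "N" * 500 + "C" * 499

# Delete the N-terminal 500 residues, leaving the 499 C-terminal residues.
truncated = truncate_n_terminus(full_length, 500)
print(len(full_length), len(truncated))
```

With a real sequence in place of the placeholder, the same slice would keep only the residues that follow position 500.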
  • The GPP producing and GPP pathway genes may be expressed using, for example, a constitutive TEF intron promoter or a native promoter (Wong et al. 2017) and a synthesized short terminator (Curran et al. 2015). Increased production of GPP can be determined by overexpressing a single heterologous gene encoding linalool synthase and then determining the production of linalool using, for example, a colorimetric assay (Ghorai 2012). Increased production of GPP may be indicated by a linalool concentration of at least 0.5 mg/L, 0.7 mg/L, 0.9 mg/L or preferably at least about 1 mg/L.
  • HMGR1 (underlined sequence is removed in tHMGR1)
    SEQ ID NO: 29
    MLQAAIGKIVGFAVNRPIHTVVLTSIVASTAYLAILDIAIPGFEGTQPI
    SYYHPAAKSYDNPADWTHIAEADIPSDAYRLAFAQIRVSDVQGGEAPTI
    PGAVAVSDLDHRIVMDYKQWAPWTASNEQIASENHIWKHSFKDHVAFSW
    IKWFRWAYLRLSTLIQGADNFDIAVVALGYLAMHYTFFSLFRSMRKVGS
    HFWLASMALVSSTFAFLLAVVASSSLGYRPSMITMSEGLPFLVVAIGFD
    RKVNLASEVLTSKSSQLAPMVQVITKIASKALFEYSLEVAALFAGAYTG
    VPRLSQFCFLSAWILIFDYMFLLTFYSAVLAIKFEINHIKRNRMIQDAL
    KEDGVSAAVAEKVADSSPDAKLDRKSDVSLFGASGAIAVFKIFMVLGFL
    GLNLINLTAIPHLGKAAAAAQSVTPITLSPELLHAIPASVPVVVTFVPS
    VVYEHSQLILQLEDALTTFLAACSKTIGDPVISKYIFLCLMVSTALNVY
    LFGATREVVRTQSVKVVEKHVPIVIEKPSEKEEDTSSEDSIELTVGKQP
    KPVTETRSLDDLEAIMKAGKTKLLEDHEVVKLSLEGKLPLYALEKQLGD
    NTRAVGIRRSIISQQSNTKTLETSKLPYLHYDYDRVFGACCENVIGYMP
    LPVGVAGPMNIDGKNYHIPMATTEGCLVASTMRGCKAINAGGGVTTVLT
    QDGMTRGPCVSFPSLKRAGAAKIWLDSEEGLKSMRKAFNSTSRFARLQS
    LHSTLAGNLLFIRFRTTTGDAMGMNMISKGVEHSLAVMVKEYGFPDMDI
    VSVSGNYCTDKKPAAINWIEGRGKSVVAEATIPAHIVKSVLKSEVDALV
    ELNISKNLIGSAMAGSVGGFNAHAANLVTAIYLATGQDPAQNVESSNCI
    TLMSNVDGNLLISVSMPSIEVGTIGGGTILEPQGAMLEMLGVRGPHIET
    PGANAQQLARIIASGVLAAELSLCSALAAGHLVQSHMTHNRSQAPTPAK
    QSQADLQRLQNGSNICIRS
    tHmgR
    SEQ ID NO: 30
    TQSVKVVEKHVPIVIEKPSEKEEDTSSEDSIELTVGKQPKPVTETRSLD
    DLEAIMKAGKTKLLEDHEVVKLSLEGKLPLYALEKQLGDNTRAVGIRRS
    IISQQSNTKTLETSKLPYLHYDYDRVFGACCENVIGYMPLPVGVAGPMN
    IDGKNYHIPMATTEGCLVASTMRGCKAINAGGGVTTVLTQDGMTRGPCV
    SFPSLKRAGAAKIWLDSEEGLKSMRKAFNSTSRFARLQSLHSTLAGNLL
    FIRFRTTTGDAMGMNMISKGVEHSLAVMVKEYGFPDMDIVSVSGNYCTD
    KKPAAINWIEGRGKSVVAEATIPAHIVKSVLKSEVDALVELNISKNLIG
    SAMAGSVGGFNAHAANLVTAIYLATGQDPAQNVESSNCITLMSNVDGNL
    LISVSMPSIEVGTIGGGTILEPQGAMLEMLGVRGPHIETPGANAQQLAR
    IIASGVLAAELSLCSALAAGHLVQSHMTHNRSQAPTPAKQSQADLQRLQ
    NGSNICIRS
  • TABLE 9
    HMGR HOMOLOGS
    Description Ident Accession
    YALI0E04807P [Yarrowia lipolytica CLIB122] 100% XP_503558.1
    hypothetical protein [Nadsonia fulvescens var. elongata DSM 6958]  75% ODQ65159.1
    hypothetical protein [Galactomyces candidum]  74% CDO55526.1
    hypothetical protein [Lipomyces starkeyi NRRL Y-11557]  74% ODQ70929.1
    hypothetical protein [Meyerozyma guilliermondii ATCC 6260]  76% EDK40614.2
    HMG1 [Sugiyamaella lignohabitans]  73% XP_018736018.1
    hypothetical protein [Meyerozyma guilliermondii ATCC 6260]  76% XP_001482757.1
    hypothetical protein [Babjeviella inositovora NRRL Y-12698]  76% XP_018984841.1
    DEHA2D09372P [Debaryomyces hansenii CBS767]  75% XP_458872.2
    3-hydroxy-3-methylglutaryl-coenzyme A reductase 1 [[Candida] glabrata]  75% KTB22480.1
    hypothetical protein [Vanderwaltozyma polyspora DSM 70294]  72% XP_001643950.1
    LAFE_0A01552g1_1 [Lachancea fermentati]  76% SCV99364.1
    hypothetical protein [Debaryomyces fabryi]  75% XP_015466829.1
    uncharacterized protein [Kuraishia capsulata CBS 1993]  76% XP_022457391.1
    uncharacterized protein [[Candida] glabrata]  75% XP_449268.1
  • The production of the cannabinoids tetrahydrocannabinolic acid (THCA), cannabidiolic acid (CBDA) and cannabichromenic acid (CBCA) involves the prenylation of OA with GPP to form CBGA (as shown in FIGS. 1A-1C) by an aromatic prenyltransferase, followed by conversion of CBGA to CBDA, THCA or CBCA by CBDAS, THCAS or CBCAS, respectively.
  • As described herein CBGA-analogs may be produced by a membrane-bound CBGA synthase (CBGAS) from C. sativa. CBGAS is also known as geranylpyrophosphate olivetolate geranyltransferase, of which there are several forms, CsPT1, CsPT3 and CsPT4. In certain embodiments, the one or more cannabinoid precursor or cannabinoid producing genes comprise: a soluble aromatic prenyltransferase; a cannabigerolic acid synthase (CBGAS); or a combination thereof; either alone or in combination with the cannabinoid producing genes: tetrahydrocannabinolic acid synthase (THCAS); cannabidiolic acid synthase (CBDAS); cannabichromenic acid synthase (CBCAS); or any combination thereof. The sequences for the Cannabis sativa genes CBGAS, THCAS, CBDAS and CBCAS are shown herein, however the skilled person would understand that homologous genes may also be suitable.
  • In certain embodiments, CBGA synthase comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:31. In certain embodiments, CBGA synthase comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:32. In certain embodiments, CBGA synthase comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:33. In certain embodiments, CBGA synthase comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NOS: 31, 32 or 33. CBGA may also be formed by heterologous expression of a soluble aromatic prenyltransferase. In certain embodiments, the soluble aromatic prenyltransferase is NphB from Streptomyces sp. strain CL190 (i.e., wild type NphB) (Bonitz et al., 2011; Kuzuyama et al., 2005; Zirpel et al., 2017). In certain embodiments, the soluble aromatic prenyltransferase is NphB, comprising at least one mutation selected from (a) Q161A; (b) G286S; (c) Y288A; (d) A232S; (e) Y288A+G286S; (f) Y288A+G286S+Q161A; (g) Q161A+G286S; (h) Q161A+Y288A; or (i) Y288A+A232S. It is expected that mutants of NphB (e.g., Q161A) produce more CBGA than wild type NphB (Muntendam 2015).
  • Wild type NphB produces 15% CBGA and 85% of another by-product. The sequence for the Streptomyces sp. strain CL190 gene NphB is shown herein, however the skilled person would understand that homologous genes may also be suitable. In certain embodiments, NphB comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:34. In certain embodiments, NphB comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:34.
  • Variants of the cannabinoid precursor or cannabinoid producing protein, such as an NphB variant (e.g., at least one of Q161A, G286S, Y288A, or A232S), retain the ability to attach geranyl groups to aromatic substrates, such as converting Compound I and GPP to a CBGA-analog. For example, a variant cannabinoid precursor or cannabinoid producing protein, such as an NphB variant (e.g., at least one of Q161A, G286S, Y288A, or A232S), must retain the ability to attach geranyl groups to aromatic substrates, such as converting Compound I and GPP to a CBGA-analog, with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence. In preferred embodiments, a variant of a cannabinoid precursor or cannabinoid producing protein, such as an NphB variant (e.g., at least one of Q161A, G286S, Y288A, or A232S), has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in attaching geranyl groups to aromatic substrates, such as converting Compound I and GPP to a CBGA-analog, as compared to the sequence from which the improved variant is derived.
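  • Substitution labels of the form Q161A (original residue, 1-based position, replacement residue), joined with "+" for combinations, can be applied to a sequence programmatically. The following is a minimal sketch under that reading of the notation; the function `apply_mutations` and the four-residue toy peptide are hypothetical illustrations and do not represent NphB or any disclosed sequence.

```python
import re

# One substitution label: original residue, 1-based position, replacement residue.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(sequence: str, mutations: str) -> str:
    """Apply '+'-joined substitutions such as 'Q2A+G3S' to a protein sequence."""
    residues = list(sequence)
    for label in mutations.split("+"):
        match = MUTATION_PATTERN.match(label.strip())
        if match is None:
            raise ValueError(f"unrecognised mutation label: {label!r}")
        original, replacement = match.group(1), match.group(3)
        position = int(match.group(2))
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        # Guard against numbering mismatches: the stated original residue
        # must actually be present at the stated position.
        if residues[position - 1] != original:
            raise ValueError(
                f"expected {original} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Toy peptide (hypothetical): Q at position 2 and G at position 3.
toy_wild_type = "MQGY"
toy_variant = apply_mutations(toy_wild_type, "Q2A+G3S")
print(toy_variant)  # MASY
```

The residue check is a deliberate design choice: it fails loudly when a label is applied to a sequence whose numbering or starting residue differs from the one the label assumes.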
  • The cannabinoid precursor or cannabinoid producing genes CBGAS, soluble aromatic prenyltransferase, THCAS, CBDAS and CBCAS may be expressed using, for example, a constitutive TEF intron promoter or a native promoter (Wong et al. 2017) and a synthesized short terminator (Curran et al. 2015). The production of one or more cannabinoid precursors or cannabinoids may be determined using a variety of methods. For example, if all of the precursors are available in the yeast cell, then the presence of the product, such as THCA, may be determined using HPLC or gas chromatography (GC). Alternatively, if only a portion of the cannabinoid synthesis pathway is present, then cannabinoids will not be present and the activity of one or more genes can be checked by adding a gene and precursor. For example, to check CBGAS activity, Compound I and GPP are added to a crude cellular lysate. For checking CBCAS, THCAS or CBDAS activity, a CBGA-analog is added to a crude cellular lysate. A crude lysate or purified proteins may be used. Further, it may be necessary to use an aqueous/organic two-liquid phase setup in order to solubilize the hydrophobic substrate (e.g., CBGA) and to allow in situ product removal.
  • CsPT1
    SEQ ID NO: 31
    MGLSSVCTFSFQTNYHTLLNPHNNNPKTSLLCYRHPKTPIKYSYNNFPS
    KHCSTKSFHLQNKCSESLSIAKNSIRAATTNQTEPPESDNHSVATKILN
    FGKACWKLQRPYTIIAFTSCACGLFGKELLHNTNLISWSLMFKAFFFLV
    AILCIASFTTTINQIYDLHIDRINKPDLPLASGEISVNTAWIMSIIVAL
    FGLIITIKMKGGPLYIFGYCFGIFGGIVYSVPPFRWKQNPSTAFLLNFL
    AHIITNFTFYYASRAALGLPFELRPSFTFLLAFMKSMGSALALIKDASD
    VEGDTKFGISTLASKYGSRNLTLFCSGIVLLSYVAAILAGIIWPQAFNS
    NVMLLSHAILAFWLILQTRDFALTNYDPEAGRRFYEFMWKLYYAEYLVY
    VFI
    CsPT3
    SEQ ID NO: 32
    MGLSLVCTFSFQTNYHTLLNPHNKNPKNSLLSYQHPKTPIIKSSYDNFP
    SKYCLTKNFHLLGLNSHNRISSQSRSIRAGSDQIEGSPHHESDNSIATK
    ILNFGHTCWKLQRPYVVKGMISIACGLFGRELFNNRHLFSWGLMWKAFF
    ALVPILSFNFFAAIMNQIYDVDIDRINKPDLPLVSGEMSIETAWILSII
    VALTGLIVTIKLKSAPLFVFIYIFGIFAGFAYSVPPIRWKQYPFTNFLI
    TISSHVGLAFTSYSATTSALGLPFVWRPAFSFIIAFMTVMGMTIAFAKD
    ISDIEGDAKYGVSTVATKLGARNMTFVVSGVLLLNYLVSISIGIIWPQV
    FKSNIMILSHAILAFCLIFQTRELALANYASAPSRQFFEFIWLLYYAEY
    FVYVFI
    CsPT4
    SEQ ID NO: 33
    MVFSSVCSFPSSLGTNFKLVPRSNFKASSSHYHEINNFINNKPIKFSYF
    SSRLYCSAKPIVHRENKFTKSFSLSHLQRKSSIKAHGEIEADGSNGTSE
    FNVMKSGNAIWRFVRPYAAKGVLFNSAAMFAKELVGNLNLFSWPLMFKI
    LSFTLVILCIFVSTSGINQIYDLDIDRLNKPNLPVASGEISVELAWLLT
    IVCTISGLTLTIITNSGPFFPFLYSASIFFGFLYSAPPFRWKKNPFTAC
    FCNVMLYVGTSVGVYYACKASLGLPANWSPAFCLLFWFISLLSIPISIA
    KDLSDIEGDRKFGIITFSTKFGAKPIAYICHGLMLLNYVSVMAAAIIWP
    QFFNSSVILLSHAFMAIWVLYQAWILEKSNYATETCQKYYIFLWIIFSL
    EHAFYLFM
    NphB
    SEQ ID NO: 34
    MSEAADVERVYAAMEEAAGLLGVACARDKIYPLLSTFQDTLVEGGSVVV
    FSMASGRHSTELDFSISVPTSHGDPYATVVEKGLFPATGHPVDDLLADT
    QKHLPVSMFAIDGEVTGGFKKTYAFFPTDNMPGVAELSAIPSMPPAVAE
    NAELFARYGLDKVAMTSMDYKKRQVNLYFSELSAQTLEAESVLALVREL
    GLHVPNELGLKFCKRSFSVYPTLNWETGKIDRLCFAVISNDPTLVPSSD
    EGDIEKFHNYATKAPYAYVGEKRTLVYGLTLSPKEEYYKLGAYYHITDV
    QRGLLKAFDSLED
  • Producing a CBGA-analog is an initial step in producing many cannabinoids. Once a CBGA-analog is produced, a single additional enzymatic step is required to turn the CBGA-analog into many other cannabinoids (i.e., CBDA-analog, THCA-analog, CBCA-analog, etc.). The acidic forms of the cannabinoids can be used as a pharmaceutical product, or the acidic cannabinoids can be converted into their neutral form for use; for example, cannabidiol (CBD) is produced from CBDA through decarboxylation. The resulting cannabinoid products will be used in the pharmaceutical/nutraceutical industry to treat a wide range of health issues.
  • The genes for tetrahydrocannabinolic acid synthase (THCAS), cannabidiolic acid synthase (CBDAS) and cannabichromenic acid synthase (CBCAS) may be derived from C. sativa, however, the skilled person would understand that homologous genes may also be suitable. In certain embodiments, THCAS comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:13. In certain embodiments, THCAS comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:13. In certain embodiments, CBDAS comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:14. In certain embodiments, CBDAS comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:14. In certain embodiments, CBCAS comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:15. In certain embodiments, CBCAS comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:15. Accordingly, in certain embodiments, the one or more cannabinoid precursor or cannabinoid producing genes comprise soluble aromatic prenyltransferase, cannabigerolic acid synthase (CBGAS), tetrahydrocannabinolic acid synthase (THCAS), cannabidiolic acid synthase (CBDAS) and cannabichromenic acid synthase (CBCAS).
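  • The sequence identity thresholds recited above can be checked with a short calculation once two sequences have been aligned. The sketch below assumes the alignment has already been produced by a separate tool and that identity is counted as identical columns divided by total alignment columns; other conventions for the denominator exist, and the text does not specify one. The function names and the ten-column toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both residues agree and neither is a gap.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(identity: float, threshold: float = 70.0) -> bool:
    """True if the identity is at least the given threshold (in percent)."""
    return identity >= threshold


# Toy alignment (hypothetical): one substitution and one gap in ten columns.
reference = "ACDEFGHIKL"
candidate = "ACDEYGHI-L"
identity = percent_identity(reference, candidate)
print(identity, meets_identity_threshold(identity, 70.0))
```

Because the choice of denominator changes the result for gapped alignments, the convention used should be stated alongside any reported identity value.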
  • THCAS
    SEQ ID NO: 13
    NPRENFLKCFSKHIPNNVANPKLVYTQHDQLYMSILNSTIQNLRFISDTT
    PKPLVIVTPSNNSHIQATILCSKKVGLQIRTRSGGHDAEGMSYISQVPFV
    VVDLRNMHSIKIDVHSQTAWVEAGATLGEVYYWINEKNENLSFPGGYCPT
    VGVGGHFSGGGYGALMRNYGLAADNIIDAHLVNVDGKVLDRKSMGEDLFW
    AIRGGGGENFGIIAAWKIKLVAVPSKSTIFSVKKNMEIHGLVKLFNKWQN
    IAYKYDKDLVLMTHFITKNITDNHGKNKTTVHGYFSSIFHGGVDSLVDLM
    NKSFPELGIKKTDCKEFSWIDTTIFYSGVVNFNTANFKKEILLDRSAGKK
    TAFSIKLDYVKKPIPETAMVKILEKLYEEDVGAGMYVLYPYGGIMEEISE
    SAIPFPHRAGIMYELWYTASWEKQEDNEKHINWVRSVYNFTTPYVSQNPR
    LAYLNYRDLDLGKTNHASPNNYTQARIWGEKYFGKNFNRLVKVKTKVDPN
    NFFRNEQSIPPLPPHHH
    CBDAS
    SEQ ID NO: 14
    NPRENFLKCFSQYIPNNATNLKLVYTQNNPLYMSVLNSTIHNLRFTSDTT
    PKPLVIVTPSHVSHIQGTILCSKKVGLQIRTRSGGHDSEGMSYISQVPFV
    IVDLRNMRSIKIDVHSQTAWVEAGATLGEVYYWVNEKNENLSLAAGYCPT
    VCAGGHFGGGGYGPLMRNYGLAADNIIDAHLVNVHGKVLDRKSMGEDLFW
    ALRGGGAESFGIIVAWKIRLVAVPKSTMFSVKKIMEIHELVKLVNKWQNI
    AYKYDKDLLLMTHFITRNITDNQGKNKTAIHTYFSSVFLGGVDSLVDLMN
    KSFPELGIKKTDCRQLSWIDTIIFYSGVVNYDTDNFNKEILLDRSAGQNG
    AFKIKLDYVKKPIPESVFVQILEKLYEEDIGAGMYALYPYGGIMDEISES
    AIPFPHRAGILYELWYICSWEKQEDNEKHLNWIRNIYNFMTPYVSKNPRL
    AYLNYRDLDIGINDPKNPNNYTQARIWGEKYFGKNFDRLVKVKTLVDPNN
    FFRNEQSIPPLPRHRH
    CBCAS
    SEQ ID NO: 15
    NPQENFLKCFSEYIPNNPANPKFIYTQHDQLYMSVLNSTIQNLRFTSDTT
    PKPLVIVTPSNVSHIQASILCSKKVGLQIRTRSGGHDAEGLSYISQVPFA
    IVDLRNMHTVKVDIHSQTAWVEAGATLGEVYYWINEMNENFSFPGGYCPT
    VGVGGHFSGGGYGALMRNYGLAADNIIDAHLVNVDGKVLDRKSMGEDLFW
    AIRGGGGENFGIIAAWKIKLVVVPSKATIFSVKKNMEIHGLVKLFNKWQN
    IAYKYDKDLMLTTHFRTRNITDNHGKNKTTVHGYFSSIFLGGVDSLVDLM
    NKSFPELGIKKTDCKELSWIDTTIFYSGVVNYNTANFKKEILLDRSAGKK
    TAFSIKLDYVKKLIPETAMVKILEKLYEEEVGVGMYVLYPYGGIMDEISE
    SAIPFPHRAGIMYELWYTATWEKQEDNEKHINWVRSVYNFTTPYVSQNPR
    LAYLNYRDLDLGKTNPESPNNYTQARIWGEKYFGKNFNRLVKVKTKADPN
    NFFRNEQSIPPLPPRHH
  • Fatty Acid and Fat Producing Genes:
  • For successful process development and application of THCAS, the properties of the reactants (cannabinoids and enzyme) have to be taken into account, since they determine preferences for process variables and reaction conditions. In C. sativa L., THCAS is active in specialized structures called trichomes (Sirikantaramas et al., 2005). These glandular trichomes harbor a storage cavity (Mahlberg and Kim, 1992) containing the hydrophobic cannabinoids, which are toxic to plant cells, in oil droplets (Morimoto et al., 2007). In this manner, the plant solves the solubility and toxicity issues of the cannabinoids (Kim and Mahlberg, 2003). A similar strategy has been used for biotechnological cannabinoid production, since multi-phase production systems are one of the concepts applied in reaction engineering to avoid limitations caused by toxicity, volatility, or low solubility of substrates and/or products (Willrodt et al., 2015). It was shown that THCAS is active in a two-liquid phase setup using hexane as the organic phase for continuous substrate supply and in situ product removal (1.5 U g⁻¹ total protein) (Lange et al., 2015b). In another study, whole cells of P. pastoris were able to produce THCA with a maximal space-time yield of 0.059 g L⁻¹ h⁻¹ (Zirpel et al., 2015).
  • A similar environment can be reproduced inside Y. lipolytica, which contains lipid bodies. In this case, the lipid bodies perform the role of the lipid droplets in plants. Cannabinoids are almost insoluble in the aqueous phase, but they are highly soluble in oils (lipids). Using strains with a high content of lipids and lipid bodies provides safe (non-toxic) storage for the produced cannabinoids.
  • Thus, the production of fatty acids and fats in yeast may be increased by expressing rate limiting genes in the lipid biosynthesis pathway. Y. lipolytica naturally produces acetyl-CoA. The overexpression of ACC1 increases the amount of malonyl-CoA, the formation of which is the first step in fatty acid production. In certain embodiments, the one or more genetic modifications that result in increased production of fatty acids or fats comprise Acetyl-CoA carboxylase (ACC1) and Diacylglyceride acyl-transferase (DGA1). The sequences for the native Y. lipolytica genes are shown herein; however, the skilled person would understand that homologous genes may also be suitable. Examples of DGA1 homologs are shown in Table 5. In certain embodiments, ACC1 comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:23. In certain embodiments, ACC1 comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:23. In certain embodiments, DGA1 comprises a polynucleotide encoding a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:24. In certain embodiments, DGA1 comprises a polypeptide that has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO:24.
  • ACC1 and DGA1 may be overexpressed in yeast by adding extra copies of the genes driven by native or stronger promoters. Alternatively, native promoters may be substituted by stronger promoters such as TEFin, hp4d, hp8d and others, as would be appreciated by the person skilled in the art. The overexpression of ACC1 and DGA1 may be determined by quantitative PCR, microarrays, or next generation sequencing technologies, such as RNA-seq. Alternatively, increased enzyme levels may be detected through the resulting increase in fatty acid production. Fatty acid production may be determined using chemical titration, thermometric titration, measurement of metal-fatty acid complexes using spectrophotometry, enzymatic methods or using a fatty acid binding protein.
  • Variants of the fatty acid and fat producing proteins, such as ACC1, retain the ability to produce malonyl-CoA from acetyl-CoA plus bicarbonate. For example, a variant of a fatty acid and fat producing protein, such as ACC1, must retain the ability to produce malonyl-CoA from acetyl-CoA plus bicarbonate with at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, or at least about 100% efficacy compared to the original sequence. In preferred embodiments, a variant of a fatty acid and fat producing protein, such as ACC1, has improved activity over the sequence from which it is derived in that the improved variant has more than 110%, 120%, 130%, 140%, or 150% improved activity in producing malonyl-CoA from acetyl-CoA plus bicarbonate, as compared to the sequence from which the improved variant is derived.
  • ACC1
    SEQ ID NO: 23
    MRLQLRTLTRRFFSMASGSSTPDVAPLVDPNIHKGLASHFFGLNSVHTA
    KPSKVKEFVASHGGHTVINKVLIANNGIAAVKEIRSVRKWAYETFGDER
    AISFTVMATPEDLAANADYIRMADQYVEVPGGTNNNNYANVELIVDVAE
    RFGVDAVWAGWGHASENPLLPESLAASPRKIVFIGPPGAAMRSLGDKIS
    STIVAQHAKVPCIPWSGTGVDEVVVDKSTNLVSVSEEVYTKGCTTGPKQ
    GLEKAKQIGFPVMIKASEGGGGKGIRKVEREEDFEAAYHQVEGEIPGSP
    IFIMQLAGNARHLEVQLLADQYGNNISLFGRDCSVQRRHQKIIEEAPVT
    VAGQQTFTAMEKAAVRLGKLVGYVSAGTVEYLYSHEDDKFYFLELNPRL
    QVEHPTTEMVTGVNLPAAQLQIAMGIPLDRIKDIRLFYGVNPHTTTPID
    FDFSGEDADKTQRRPVPRGHTTACRITSEDPGEGFKPSGGTMHELNFRS
    SSNVWGYFSVGNQGGIHSFSDSQFGHIFAFGENRSASRKHMVVALKELS
    IRGDFRTTVEYLIKLLETPDFEDNTITTGWLDELISNKLTAERPDSFLA
    VVCGAATKAHRASEDSIATYMASLEKGQVPARDILKTLFPVDFIYEGQR
    YKFTATRSSEDSYTLFINGSRCDIGVRPLSDGGILCLVGGRSHNVYWKE
    EVGATRLSVDSKTCLLEVENDPTQLRSPSPGKLVKFLVENGDHVRANQP
    YAEIEVMKMYMTLTAQEDGIVQLMKQPGSTIEAGDILGILALDDPSKVK
    HAKPFEGQLPELGPPTLSGNKPHQRYEHCQNVLHNILLGFDNQVVMKST
    LQEMVGLLRNPELPYLQWAHQVSSLHTRMSAKLDATLAGLIDKAKQRGG
    EFPAKQLLRALEKEASSGEVDALFQQTLAPLFDLAREYQDGLAIHELQV
    AAGLLQAYYDSEARFCGPNVRDEDVILKLREENRDSLRKVVMAQLSHSR
    VGAKNNLVLALLDEYKVADQAGTDSPASNVHVAKYLRPVLRKIVELESR
    ASAKVSLKAREILIQCALPSLKERTDQLEHILRSSVVESRYGEVGLEHR
    TPRADILKEVVDSKYIVFDVLAQFFAHDDPWIVLAALELYIRRACKAYS
    ILDINYHQDSDLPPVISWRFRLPTMSSALYNSVVSSGSKTPTSPSVSRA
    DSVSDFSYTVERDSAPARTGAIVAVPHLDDLEDALTRVLENLPKRGAGL
    AISVGASNKSAAASARDAAAAAASSVDTGLSNICNVMIGRVDESDDDDT
    LIARISQVIEDFKEDFEACSLRRITFSFGNSRGTYPKYFTFRGPAYEED
    PTIRHIEPALAFQLELARLSNFDIKPVHTDNRNIHVYEATGKNAASDKR
    FFTRGIVRPGRLRENIPTSEYLISEADRLMSDILDALEVIGTTNSDLNH
    IFINFSAVFALKPEEVEAAFGGFLERFGRRLWRLRVTGAEIRMMVSDPE
    TGSAFPLRAMINNVSGYVVQSELYAEAKNDKGQWIFKSLGKPGSMHMRS
    INTPYPTKEWLQPKRYKAHLMGTTYCYDFPELFRQSIESDWKKYDGKAP
    DDLMTCNELILDEDSGELQEVNREPGANNVGMVAWKFEAKTPEYPRGRS
    FIVVANDITFQIGSFGPAEDQFFFKVTELARKLGIPRIYLSANSGARIG
    IADELVGKYKVAWNDETDPSKGFKYLYFTPESLATLKPDTVVTTEIEEE
    GPNGVEKRHVIDYIVGEKDGLGVECLRGSGLIAGATSRAYKDIFTLTLV
    TCRSVGIGAYLVRLGQRAIQIEGQPIILTGAPAINKLLGREVYSSNLQL
    GGTQIMYNNGVSHLTARDDLNGVHKIMQWLSYIPASRGLPVPVLPHKTD
    VWDRDVTFQPVRGEQYDVRWLISGRTLEDGAFESGLFDKDSFQETLSGW
    AKGVVVGRARLGGIPFGVIGVETATVDNTTPADPANPDSIEMSTSEAGQ
    VWYPNSAFKTSQAINDFNHGEALPLMILANWRGFSGGQRDMYNEVLKYG
    SFIVDALVDYKQPIMVYIPPTGELRGGSWVVVDPTINSDMMEMYADVES
    RGGVLEPEGMVGIKYRRDKLLDTMARLDPEYSSLKKQLEESPDSEELKV
    KLSVREKSLMPIYQQISVQFADLHDRAGRMEAKGVIREALVWKDARRFF
    FWRIRRRLVEEYLITKINSILPSCTRLECLARIKSWKPATLDQGSDRGV
    AEWFDENSDAVSARLSELKKDASAQSFASQLRKDRQGTLQGMKQALASL
    SEAERAELLKGL
    DGA1
    SEQ ID NO: 24
    MTIDSQYYKSRDKNDTAPKIAGIRYAPLSTPLLNRCETFSLVWHIFSIP
    TFLTIFMLCCAIPLLWPFVIAYVVYAVKDDSPSNGGVVKRYSPISRNFF
    IWKLFGRYFPITLHKTVDLEPTHTYYPLDVQEYHLIAERYWPQNKYLRA
    IISTIEYFLPAFMKRSLSINEQEQPAERDPLLSPVSPSSPGSQPDKWIN
    HDSRYSRGESSGSNGHASGSELNGNGNNGTTNRRPLSSASAGSTASDST
    LLNGSLNSYANQIIGENDPQLSPTKLKPTGRKYIFGYHPHGIIGMGAFG
    GIATEGAGWSKLFPGIPVSLMTLTNNFRVPLYREYLMSLGVASVSKKSC
    KALLKRNQSICIVVGGAQESLLARPGVMDLVLLKRKGFVRLGMEVGNVA
    LVPIMAFGENDLYDQVSNDKSSKLYRFQQFVKNFLGFTLPLMHARGVFN
    YDVGLVPYRRPVNIVVGSPIDLPYLPHPTDEEVSEYHDRYIAELQRIYN
    EHKDEYFIDWTEEGKGAPEFRMIE
  • TABLE 5
    DGA1 HOMOLOGS
    Description Ident Accession
    YALI0E32769P [Yarrowia lipolytica CLIB122] 100% XP_504700.1
    Diacylglycerol acyltransferase [Galactomyces candidus]  44% CDO57007.1
    hypothetical protein [Lipomyces starkeyi NRRL Y-11557]  60% ODQ70106.1
    DAGAT-domain-containing protein [Nadsonia fulvescens var. elongata DSM 6958]  60% ODQ67305.1
    hypothetical protein [Tortispora caseinolytica NRRL Y-17796]  65% ODV90514.1
    diacylglycerol acyltransferase [Saitoella complicata NRRL Y-17804]  60% XP_019022950.1
    uncharacterized protein KUCA_T00002736001 [Kuraishia capsulata CBS 1993]  51% XP_022458761.1
    diacylglycerol O-acyltransferase-like protein 2B [Meliniomyces bicolor E]  55% XP_024728739.1
    Diacylglycerol O-acyltransferase 1 [Hanseniaspora osmophila]  57% OEJ83128.1
    DAGAT-domain-containing protein [Ascoidea rubescens DSM 1968]  49% XP_020048004.1
  • NADPH Balance
  • NADPH is critical for the production of fatty acids: 16 molecules of NADPH are required to produce one molecule of stearic acid. By using NADPH, cells create an excess of NADH. NADPH is also important for the production of cannabinoids. Four molecules of NADPH are required to produce one molecule of GPP.
  • Thus, to produce one molecule of hexanoyl-CoA, 4 molecules of NADPH are required. Production of OLA from hexanoyl-CoA does not require any additional NADPH. Therefore, 8 molecules of NADPH are needed to directly produce 1 molecule of a cannabinoid precursor. Preferred methods of increasing the NADPH supply include, but are not limited to, the use of glucose-6-phosphate dehydrogenase, which is encoded by, for example, ZWF1 (see, for example, Yuzbasheva, E. Y., et al., New Biotechnology 39 (Pt A), 18-21), or the use of GAPC and/or MCE2 (see, for example, Qiao, K., et al., (2017) Nature Biotechnology 35(2), 173-177).
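  • The NADPH figures above can be reproduced with a short calculation. The sketch assumes the usual stoichiometry of saturated fatty acid synthesis, in which each two-carbon elongation cycle consumes two NADPH starting from a two-carbon primer; this assumption is not stated in the text but yields the same 16 and 4 molecules. The figure of 4 NADPH per GPP is taken directly from the text, and all names are hypothetical.

```python
# Assumption: two NADPH-dependent reductions per two-carbon elongation cycle.
NADPH_PER_ELONGATION_CYCLE = 2
# Figure stated in the text: four NADPH per molecule of GPP.
NADPH_PER_GPP = 4


def nadph_for_saturated_acyl_chain(carbons: int) -> int:
    """NADPH needed to build a saturated, even-numbered acyl chain from a C2 primer."""
    if carbons < 4 or carbons % 2 != 0:
        raise ValueError("carbons must be an even number of at least 4")
    elongation_cycles = (carbons - 2) // 2
    return elongation_cycles * NADPH_PER_ELONGATION_CYCLE


stearic_acid = nadph_for_saturated_acyl_chain(18)   # C18: 8 cycles
hexanoyl_coa = nadph_for_saturated_acyl_chain(6)    # C6: 2 cycles
precursor_total = hexanoyl_coa + NADPH_PER_GPP      # hexanoyl-CoA plus GPP

print(stearic_acid, hexanoyl_coa, precursor_total)  # 16 4 8
```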
  • ProA Signals
  • It was surprisingly found that the addition of a proteinase A (ProA) signal sequence directly to the N-terminus of any one of THCAS, CBDAS or CBCAS may aid in the correct targeting of the synthase to a vacuole and correct protein assembly and glycosylation, which, in turn, increases the activity and conversion of the CBGA analog to the corresponding CBDA, THCA or CBCA analog. Such a ProA signal may also increase production of the CBDA, THCA or CBCA analog. Examples of such ProA signals that can be added to the N-terminus include any one of SEQ ID NO:45-46.
  • >ProA20
    (SEQ ID NO: 45)
    MKFTAAVSVLAAAGSVSAAV
    >ProA21
    (SEQ ID NO: 46)
    MKFTAAVSVLAAAGSVSAAVS
    >ProA22
    (SEQ ID NO: 47)
    MKFTAAVSVLAAAGSVSAAVSK
    >ProA23
    (SEQ ID NO: 48)
    MKFTAAVSVLAAAGSVSAAVSKV
    >ProA24
    (SEQ ID NO: 49)
    MKFTAAVSVLAAAGSVSAAVSKVS
  • Thus, any one of SEQ ID NO:45-49 can be added to the N-terminus of any one of SEQ ID NO:13-15 (or variants thereof) to aid in the expression, activity and production of the CBDA, THCA or CBCA analog.
  • In preferred embodiments, THCAS, CBDAS and/or CBCAS with the ProA signal sequence added to the N-terminus had substantially improved activity when expressed in a recombinant host having inactivated or deleted PEP4 and/or PRB1 genes, or in recombinant hosts lacking functional PEP4 and/or PRB1 genes (e.g., lacking endogenous sequences). For example, Y. lipolytica strains with inactivation of YALI0F27071p and/or YALI0B16500p and/or YALI0A06435p are preferably used to express THCAS, CBDAS and/or CBCAS having ProA signal sequences.
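  • The N-terminal fusion described above amounts to placing the signal sequence directly in front of the synthase sequence. The following minimal sketch uses the five ProA signal sequences listed above; the function `fuse_signal` is hypothetical, and the ten-residue string stands in for a full synthase sequence (it is only the first ten residues of SEQ ID NO: 13, used to keep the example short).

```python
# ProA signal sequences as listed for SEQ ID NO: 45-49.
PROA_SIGNALS = {
    "ProA20": "MKFTAAVSVLAAAGSVSAAV",
    "ProA21": "MKFTAAVSVLAAAGSVSAAVS",
    "ProA22": "MKFTAAVSVLAAAGSVSAAVSK",
    "ProA23": "MKFTAAVSVLAAAGSVSAAVSKV",
    "ProA24": "MKFTAAVSVLAAAGSVSAAVSKVS",
}


def fuse_signal(signal_name: str, synthase: str) -> str:
    """Place the named ProA signal directly at the N-terminus of the synthase."""
    return PROA_SIGNALS[signal_name] + synthase


# Placeholder only: the first ten residues of SEQ ID NO: 13, not the full protein.
synthase_fragment = "NPRENFLKCF"

fusion = fuse_signal("ProA20", synthase_fragment)
print(fusion)
```

Each signal name encodes its length (for example, ProA20 is 20 residues), which gives a simple sanity check on transcription of the sequences.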
  • Recombinant Microorganisms
  • As described above, the microorganism employed in a method of the invention or contained in the composition of the invention may be a microorganism which has been genetically modified by the introduction of a nucleic acid molecule encoding a corresponding enzyme. Thus, in a preferred embodiment, the microorganism is a recombinant microorganism which has been genetically modified to have an increased activity of at least one enzyme described above for the conversions of the method according to the present invention. This can be achieved e.g. by transforming the microorganism with a nucleic acid encoding a corresponding enzyme. Preferably, the nucleic acid molecule introduced into the microorganism is a nucleic acid molecule which is heterologous with respect to the microorganism, i.e. it does not naturally occur in said microorganism.
  • The term “microorganism” in the context of the present invention refers to bacteria, as well as to fungi, such as yeasts, and also to algae and archaea. In one preferred embodiment, the microorganism is a bacterium. In principle any bacterium can be used. Preferred bacteria to be employed in the process according to the invention are bacteria of the genus Bacillus, Clostridium, Corynebacterium, Pseudomonas, Zymomonas or Escherichia. In a particularly preferred embodiment, the bacterium belongs to the genus Escherichia and even more preferred to the species Escherichia coli. In another preferred embodiment the bacterium belongs to the species Pseudomonas putida or to the species Zymomonas mobilis or to the species Corynebacterium glutamicum or to the species Bacillus subtilis. It is also possible to employ an extremophilic bacterium such as Thermus thermophilus, or anaerobic bacteria from the family Clostridiae.
  • It is also conceivable to use in the method according to the invention a combination of microorganisms wherein different microorganisms express different enzymes as described above.
  • In the context of the present invention, an “increased activity” means that the expression and/or the activity of an enzyme in the genetically modified microorganism is at least 10%, preferably at least 20%, more preferably at least 30% or 50%, even more preferably at least 70% or 80% and particularly preferred at least 90% or 100% higher than in the corresponding non-modified microorganism. In even more preferred embodiments, the increase in expression and/or activity may be at least 150%, at least 200% or at least 500%. In particularly preferred embodiments the expression is at least 10-fold, more preferably at least 100-fold and even more preferred at least 1000-fold higher than in the corresponding non-modified microorganism.
  • The term “increased” expression/activity also covers the situation in which the corresponding non-modified microorganism does not express a corresponding enzyme so that the corresponding expression/activity in the non-modified microorganism is zero. Preferably, the concentration of the overexpressed enzyme is at least 5%, 10%, 20%, 30%, or 40% of the total host cell protein. Additionally, as would be appreciated by the person skilled in the art, increased expression of a gene may provide increased activity of the gene product. In certain embodiments, overexpression of a gene can increase the activity of the gene product by about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 100%, about 105%, about 110%, about 115%, about 120%, about 125%, about 130%, about 135%, about 140%, about 145%, about 150%, about 155%, about 160%, about 165%, about 170%, about 175%, about 180%, about 185%, about 190%, about 195%, or about 200%.
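  • The thresholds above mix two conventions: a percentage increase over the non-modified microorganism (e.g., "at least 100% higher") and a fold change (e.g., "at least 10-fold"). The following is a minimal sketch of how the two relate, assuming that "X% higher" means modified = baseline × (1 + X/100). The function names are illustrative and do not come from the disclosure.

```python
import math


def percent_increase_to_fold(percent_increase: float) -> float:
    """Convert 'X% higher than the non-modified strain' to a fold change."""
    return 1.0 + percent_increase / 100.0


def fold_to_percent_increase(fold: float) -> float:
    """Convert a fold change to the equivalent percentage increase."""
    return (fold - 1.0) * 100.0


def percent_increase(modified: float, baseline: float) -> float:
    """Percentage increase of a measured level over the baseline level.

    When the non-modified strain shows no expression (baseline == 0),
    a ratio is undefined; any positive level is reported as infinite.
    """
    if baseline == 0:
        return math.inf if modified > 0 else 0.0
    return (modified - baseline) / baseline * 100.0


if __name__ == "__main__":
    for pct in (10, 100, 500):
        print(f"{pct}% higher = {percent_increase_to_fold(pct):.1f}-fold")
    print(f"10-fold = {fold_to_percent_increase(10):.0f}% higher")
```

  • Under this reading, a 100% increase corresponds to a 2-fold level, and a 10-fold level corresponds to a 900% increase.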
  • Methods for measuring the level of expression of a given protein in a cell are well known to the person skilled in the art. In one embodiment, the measurement of the level of expression is done by measuring the amount of the corresponding protein. Corresponding methods are well known to the person skilled in the art and include Western Blot, ELISA etc. In another embodiment the measurement of the level of expression is done by measuring the amount of the corresponding RNA. Corresponding methods are well known to the person skilled in the art and include, e.g., Northern Blot.
  • In addition, it is possible to insert different mutations into the polynucleotides by methods usual in molecular biology (see for instance Sambrook and Russell (2001), Molecular Cloning: A Laboratory Manual, CSH Press, Cold Spring Harbor, N.Y., USA), leading to the synthesis of polypeptides possibly having modified biological properties. The introduction of point mutations is conceivable at positions at which a modification of the amino acid sequence for instance influences the biological activity or the regulation of the polypeptide. Similarly, CRISPR-Cas9 genome editing technology can be used to modify the disclosed sequences to produce enzyme variants.
  • The transformation of the host cell with a polynucleotide or vector as described above can be carried out by standard methods, as for instance described in Sambrook and Russell (2001), Molecular Cloning: A Laboratory Manual, CSH Press, Cold Spring Harbor, N.Y., USA; Methods in Yeast Genetics, A Laboratory Course Manual, Cold Spring Harbor Laboratory Press, 1990. The host cell is cultured in nutrient media meeting the requirements of the particular host cell used, in particular in respect of the pH value, temperature, salt concentration, aeration, antibiotics, vitamins, trace elements etc.
  • The disclosed genes may be under the control of any suitable promoter. Many native promoters are available, for example, for Y. lipolytica, native promoters are available from the genes for translational elongation factor EF-1 alpha, acyl-CoA: diacylglycerol acyltransferase, acetyl-CoA-carboxylase 1, ATP citrate lyase 2, fatty acid synthase subunit beta, fatty acid synthase subunit alpha, isocitrate lyase 1, POX4 fatty-acyl coenzyme A oxidase, ZWF1 glucose-6-phosphate dehydrogenase, cytosolic NADP-specific isocitrate dehydrogenase, glyceraldehyde 3-phosphate dehydrogenase, the TEF intron promoter or native promoter (Wong et al. 2017), a synthesized short terminator (Curran et al. 2015), or the alcohol dehydrogenase II promoter of Y. lipolytica. Any suitable terminator may be used. Short synthetic terminators are particularly suitable and are readily available, see for example, MacPherson et al. 2016.
  • Increased production of Compound I may be detected using high-performance liquid chromatography (HPLC) or liquid chromatography-mass spectrometry (LC/MS). For example, as yeast do not produce OA endogenously, the presence of OA indicates that the PKS enzyme is functioning.
  • Genetically Modified Yeast Strains
  • In another preferred embodiment the microorganism is a fungus, more preferably a fungus of the genus Saccharomyces, Schizosaccharomyces, Aspergillus, Trichoderma, Kluyveromyces or Pichia and even more preferably of the species Saccharomyces cerevisiae, Schizosaccharomyces pombe, Aspergillus niger, Trichoderma reesei, Kluyveromyces marxianus, Kluyveromyces lactis, Pichia pastoris, Pichia torula or Pichia utilis.
  • In further preferred embodiments, there are provided genetically modified yeasts comprising one or more genetic modifications that result in the production of at least one cannabinoid or cannabinoid precursor, and methods for their creation. The disclosed yeast may produce various cannabinoids from a simple sugar source, for example, where the main carbon source available to the yeast is a sugar (glucose, galactose, fructose, sucrose, honey, molasses, raw sugar, etc.). Genetic engineering of the yeast involves inserting various genes that produce the appropriate enzymes and/or altering the natural metabolic pathway in the yeast to achieve the production of a desired compound. Through genetic engineering of yeast, these metabolic pathways can be introduced into the yeast, and the same metabolic products that are produced in the plant C. sativa can be produced by the yeast. The benefit of this method is that once the yeast is engineered, the production of the cannabinoid is low cost and reliable; only a specific cannabinoid or a subset of cannabinoids is produced, depending on the organism and the genetic manipulation. The purification of the cannabinoid is straightforward since there is only a single cannabinoid or a selected few cannabinoids present in the yeast. The process is a sustainable process which is more environmentally friendly than synthetic production.
  • In the past, there have been multiple attempts to produce cannabinoids in yeasts. At present, no one has been able to reach a reasonable production price due to extremely low yield. We have identified how the yield can be increased.
  • In preferred embodiments, the biosynthetic pathways shown in FIGS. 1-3 are produced in yeast having at least 5% dry weight of fatty acids or fats, such as oily yeasts, for example, Y. lipolytica.
  • Additionally and as described below, we also propose (1) making additional genetic modifications that will increase the oil production level in the engineered yeast; (2) adding additional genes from the cannabinoid production pathway in combination with genes from alternative pathways that produce cannabinoid intermediates, such as for example NphB; (3) increasing production of GPP by, for example, genetically mutating ERG20 and/or by using equivalent genes from alternative pathways; (4) increasing production of compounds from the fatty acid pathway for use in the cannabinoid production pathway, for example, increasing the production of malonyl-CoA by overexpressing ACC1.
  • Cannabinoids have a limited solubility in water solutions. Yet, they have a high solubility in hydrophobic liquids like lipids, oils or fats. If hydrophobic media is limited or completely removed, then a CBGA-analog will not be solubilized and will have limited availability to the downstream cannabinoid synthases. As an example, in the paper (Zirpel et al. 2015) it was shown that purified THCA synthase is almost unable to convert CBGA into THCA. In the same paper the authors demonstrated that unpurified yeast lysate converts CBGA much more efficiently. The authors also demonstrated that CBGA was dissolved in the lipid fraction. In another paper (Lange et al. 2016) the authors took the next step in improving a cell-free process. They used a two-phase reaction with an organic, hydrophobic phase and an aqueous phase. The authors demonstrated a high yield of THCA from CBGA. They found that CBGA was dissolved in the organic phase. They also demonstrated that THCA moved back to the organic phase. We can therefore conclude that a hydrophobic phase is required for successful synthesis and that cannabinoids are mostly present in the organic phase.
  • Production of cannabinoids in traditional yeast, like S. cerevisiae, K. phaffii or K. marxianus, results in the cannabinoids, like the main mass of lipids, being deposited in the lipid membrane. These types of yeast have almost no oily bodies. In such a case, any cannabinoids that are produced will be dissolved in this membrane. Too many cannabinoids will destabilize the membrane, which will cause cell death. It was reported that in the best conditions, with high sugar content and without nitrogen supply, these yeasts can have a maximum of 2-3% dry weight of oils (i.e., fats and fatty acids).
  • However, there are several non-traditional yeasts, like Y. lipolytica. The natural form of Y. lipolytica can have up to 17% dry weight of oils. The main mass of oil is located in oily bodies. Cannabinoids dissolved in such bodies will not cause membrane instability. As a result, Y. lipolytica can have a much higher cannabinoid production level. Several works have demonstrated modifications for Y. lipolytica which can bring the lipid content above 80% of dry mass (Qiao et al. 2015).
  • Therefore, we propose that cannabinoids can be produced to some percentage of the oil content in yeast. This gives a correlation—more oil means more cannabinoid production.
  • A review paper (Ângela et al. 2017) analysed different types of yeast as potential producers of cannabinoids. TABLE 1 is adapted from the summary table in Angela et al. 2017, in which the authors compared 4 yeast types by different parameters. Yet, they completely ignored oil content, theoretical maximal limit of production and minimal cost of goods for production. The far right two columns show maximum oil amount as a percentage of dry weight, and the production cost if there is only 1% of cannabinoid in the oil. The bottom row shows an embodiment of a modified Yarrowia lipolytica of the present disclosure. Finally, the authors in Angela et al. 2017 rated acetyl-CoA pool engineering as "+" for Y. lipolytica. However, we have found that YL has a large concentration of acetyl-CoA without modifications.
  • Therefore, in preferred embodiments, we are proposing to use oily yeasts as a backbone for cannabinoid and/or cannabinoid precursor production.
  • TABLE 6
    COMPARISON OF DIFFERENT MICROBIAL EXPRESSION HOSTS REGARDING THEIR CAPACITY OF HETEROLOGOUS CANNABINOID BIOSYNTHESIS
    Column headers: Genetic tools available | Strains, promoters, vectors | Plant protein expression capacity | Post-translational modifications | GPP engineering | Hexanoic acid engineering | Acetyl-CoA pool engineering | Maximal oil amount, % of dry weight* | Production cost with only 1% of cannabinoids from oils
    E. coli ++ + ++ + + ++ + +  2% $12.50
    S. cerevisiae ++ + ++ + ++ ++ +++ ++ +++  2% $12.50
    P. pastoris + ++ +++ ++ + ++  3%  $8.33
    K. marxianus ++ + ++ ++  3%  $8.33
    Y. lipolytica + + ++ ++ + ++ +, YL has large concentration of ac-CoA without modifications 17%  $1.47
    Y. lipolytica (modified) + + ++ ++ + ++ +, YL has large concentration of ac-CoA without modifications 80%  $0.31
    * Maximal oil % means how much oil can be produced in the best cultivation conditions; % calculated from dried mass.

    Table 1 adapted from Carvalho, Ângela, et al. “Designing microorganisms for heterologous biosynthesis of cannabinoids.” FEMS Yeast Research 17.4 (2017). +++, many publications available, well established; ++, publications available, optimization potential; +, first publications available, not yet established/not working; −, not possible; 'empty', not yet described.
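  • The two right-hand columns of the table are consistent with a simple inverse relation between oil content and production cost. The sketch below assumes cost = k / (oil % of dry weight), with k = 25 inferred from the tabulated values (for example, 2% × $12.50 = 25). Neither the constant nor the formula is stated in the disclosure, and the cost basis (per unit of what) is not given, so this is an illustration of the trend only.

```python
# Assumption: inferred from the table, cost * oil% is about 25 in every row.
COST_CONSTANT = 25.0

# Maximal oil content (% of dry weight) as listed in the table.
MAX_OIL_PERCENT = {
    "E. coli": 2,
    "S. cerevisiae": 2,
    "P. pastoris": 3,
    "K. marxianus": 3,
    "Y. lipolytica": 17,
    "Y. lipolytica (modified)": 80,
}


def production_cost(oil_percent: float) -> float:
    """Estimated cost when cannabinoids make up 1% of the oil fraction."""
    if oil_percent <= 0:
        raise ValueError("oil content must be positive")
    return COST_CONSTANT / oil_percent


if __name__ == "__main__":
    for host, oil in MAX_OIL_PERCENT.items():
        print(f"{host:26s} {oil:3d}%  ${production_cost(oil):.2f}")
```

  • Under this assumption, raising the oil content from 2% to 80% of dry weight lowers the estimated cost 40-fold, which is the correlation between oil content and cannabinoid production argued above.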
  • As described above, in certain embodiments, the yeast comprises at least 5% dry weight of fatty acids or fats. Accordingly, the yeast may be oleaginous. Any oleaginous yeast may be suitable, however, particularly suitable yeast may be selected from the genera Rhodosporidium, Rhodotorula, Yarrowia, Cryptococcus, Candida, Lipomyces and Trichosporon. In certain embodiments, the yeast is a Yarrowia lipolytica, a Lipomyces starkeyi, a Rhodosporidium toruloides, a Rhodotorula glutinis, a Trichosporon fermentans or a Cryptococcus curvatus. The yeast may be naturally oleaginous. Accordingly, in certain embodiments, the yeast comprises at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, at least 15%, at least 16%, at least 17%, at least 18%, at least 19%, at least 20%, at least 21%, at least 22%, at least 23%, at least 24%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75% or at least 80% dry weight of fatty acids or fats. The yeast may also be genetically modified to accumulate or produce more fatty acids or fats. Accordingly, in certain embodiments, the yeast is genetically modified to produce at least 5%, at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, at least 15%, at least 16%, at least 17%, at least 18%, at least 19%, at least 20%, at least 21%, at least 22%, at least 23%, at least 24%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75% or at least 80% dry weight of fatty acids or fats.
  • Cell-Free Production
  • The method according to the present invention can also be carried out in a cell-free system (e.g., in vitro). An in vitro reaction is understood to be a reaction in which no cells are employed, i.e. an acellular reaction. Thus, in vitro preferably means in a cell-free system. The term “in vitro” in one embodiment means in the presence of isolated enzymes (or enzyme systems optionally comprising possibly required cofactors). In one embodiment, the enzymes employed in the method are used in purified form.
  • For carrying out the method in vitro the substrates for the reaction and the enzymes are incubated under conditions (buffer, temperature, cosubstrates, cofactors etc.) allowing the enzymes to be active and the enzymatic conversion to occur. The reaction is allowed to proceed for a time sufficient to produce the respective product. The production of the respective products can be measured by methods known in the art, such as gas chromatography possibly linked to mass spectrometry detection.
  • The enzymes described herein may be in any suitable form allowing the enzymatic reaction to take place. They may be purified or partially purified or in the form of crude cellular extracts or partially purified extracts. It is also possible that the enzymes are immobilized on a suitable carrier.
  • Carbohydrate Sources
  • In another aspect of the present disclosure, there is provided a method of producing at least one cannabinoid or cannabinoid precursor comprising contacting the compositions as described herein with a carbohydrate source under conditions and for a time sufficient to produce the at least one cannabinoid or cannabinoid precursor.
  • Specifically, examples of the culture conditions for producing at least one cannabinoid or cannabinoid precursor include a batch process and a fed batch or repeated fed batch process in a continuous manner, but are not limited thereto. Carbon sources that may be used for producing at least one cannabinoid or cannabinoid precursor may include sugars and carbohydrates such as glucose, sucrose, lactose, fructose, maltose, starch, xylose and cellulose; oils and fats such as soybean oil, sunflower oil, castor oil, coconut oil, chicken fat and beef tallow; fatty acids such as palmitic acid, stearic acid, oleic acid and linoleic acid; alcohols such as glycerol and ethanol; and organic acids such as gluconic acid, acetic acid, malic acid and pyruvic acid, but these are not limited thereto. These substances may be used alone or in a mixture. Nitrogen sources that may be used in the present disclosure may include peptone, yeast extract, meat extract, malt extract, corn steep liquor, defatted soybean cake, and urea or inorganic compounds, such as ammonium sulfate, ammonium chloride, ammonium phosphate, ammonium carbonate, and ammonium nitrate, but these are not limited thereto. These nitrogen sources may also be used alone or in a mixture. Phosphorus sources that may be used in the present disclosure may include potassium dihydrogen phosphate or dipotassium hydrogen phosphate, or corresponding sodium-containing salts, but these are not limited thereto. In addition, the culture medium may contain a metal salt such as magnesium sulfate or iron sulfate, which may be required for growth. Lastly, in addition to the above-described substances, essential growth factors such as amino acids and vitamins may be used. Such a variety of culture methods is disclosed, for example, in the literature (“Biochemical Engineering” by James M. Lee, Prentice-Hall International Editions, pp 138-176).
  • Basic compounds such as sodium hydroxide, potassium hydroxide, or ammonia, or acidic compounds such as phosphoric acid or sulfuric acid may be added to the culture medium in a suitable manner to adjust the pH of the culture medium. In addition, an anti-foaming agent such as fatty acid polyglycol ester may be used to suppress the formation of bubbles. In certain embodiments, the culture medium is maintained in an aerobic state, accordingly, oxygen or oxygen-containing gas (e.g., air) may be injected into the culture medium. The temperature of the culture medium may be usually 20° C. to 35° C., preferably 25° C. to 32° C., but may be changed depending on conditions. The culture may be continued until the maximum amount of a desired cannabinoid precursor or cannabinoid is produced, and it may generally be achieved within 5 hours to 160 hours. The cannabinoid precursor or cannabinoid may be released into the culture medium or contained in the recombinant microorganisms.
  • The method of the present disclosure for producing at least one cannabinoid or cannabinoid precursor may include a step of recovering the at least one cannabinoid or cannabinoid precursor from the microorganism or the medium. Methods known in the art, such as centrifugation, filtration, anion-exchange chromatography, crystallization, HPLC, etc., may be used for the method for recovering at least one cannabinoid or cannabinoid precursor from the microorganism or the culture, but the method is not limited thereto. The step of recovering may include a purification process. Specifically, following an overnight culture, 1 L cultures are pelleted by centrifugation, resuspended, washed in PBS and pelleted. The cells are lysed by either chemical or mechanical methods or a combination of methods. Mechanical methods can include a French Press or glass bead milling or other standard methods. Chemical methods can include enzymatic cell lysis, solvent cell lysis, or detergent based cell lysis. A liquid-liquid extraction of the cannabinoids is performed using the appropriate chemical solvent in which the cannabinoids are highly soluble and the solvent is not miscible in water. Examples include hexane, ethyl acetate, and cyclohexane, preferably solvents with straight or branched alkane chains (C5-C8) or mixtures thereof.
  • In certain embodiments, the at least one cannabinoid or cannabinoid precursor comprises a CBGA-analog, a THCA-analog, a CBDA-analog or a CBCA-analog. The production of one or more cannabinoid precursors or cannabinoids may be determined using a variety of methods as described herein. An example protocol for analysing a CBDA-analog is as follows:
      • 1. Remove solvent from samples under vacuum.
      • 2. Re-suspend dry samples in either 100 uL of dry hexane or dry ethyl acetate
      • 3. Add 20 uL of N-Methyl-N-(trimethylsilyl)trifluoroacetamide (MSTFA)
      • 4. Briefly mix
      • 5. Heat solution to 60° C. for 10-15 minutes
      • 6. GC-MS Method
        • a. Instrument Agilent 6890-5975 GC-MS (Model Number: Agilent 19091S-433)
        • b. Column HP-5MS 5% Phenyl Methyl Siloxane
        • c. OVEN:
          • i. Initial temp: 100° C. (On) Maximum temp: 300° C.
          • ii. Initial time: 3.00 min Equilibration time: 0.50 min
          • iii. Ramps:
            • # | Rate (° C./min) | Final temp (° C.) | Final time (min)
            • 1 | 30.00 | 280 | 1.00
            • 2 | 70.00 | 300 | 5.00
            • 3 | 0.0 (Off)
          • iv. Post temp: 0° C.
          • v. Post time: 0.00 min
          • vi. Run time: 15.29 Min
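  • The stated run time can be cross-checked from the oven program. The sketch below assumes the ramp entries are read as ramp 1 at 30° C./min to 280° C. with a 1.00 min hold and ramp 2 at 70° C./min to 300° C. with a 5.00 min hold, starting from the 3.00 min initial hold at 100° C. The function name is illustrative only.

```python
def oven_run_time(initial_temp, initial_hold, ramps):
    """Total GC oven program time in minutes.

    ramps is a list of (rate in deg C/min, final temp in deg C, hold in min).
    """
    total = initial_hold
    temp = initial_temp
    for rate, final_temp, hold in ramps:
        # Time spent ramping to the next plateau, plus the hold there.
        total += abs(final_temp - temp) / rate + hold
        temp = final_temp
    return total


# Oven program as listed in the GC-MS method (ramp 3 is off).
RAMPS = [(30.0, 280.0, 1.0), (70.0, 300.0, 5.0)]
run_time = oven_run_time(100.0, 3.0, RAMPS)

if __name__ == "__main__":
    print(f"Run time: {run_time:.2f} min")
```

  • With these values the program gives 3 + 6 + 1 + 0.29 + 5 = 15.29 min, in agreement with the listed run time.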
  • In a third aspect of the present disclosure, there is provided a cannabinoid precursor, cannabinoid or a combination thereof produced using the methods described herein. In certain embodiments, the at least one cannabinoid or cannabinoid precursor comprises a CBGA-analog, a THCA-analog, a CBDA-analog or a CBCA-analog.
  • EXAMPLES
  • Example 1: Vector Construction and Transformation
  • Y. lipolytica episomal plasmids comprise a centromere, origin and bacterial replicative backbone. Fragments for these regions were synthesized by Twist Bioscience and cloned to make an episomal parent vector pBM-pa. Plasmids were constructed by Gibson Assembly, Golden Gate assembly, ligation or sequence- and ligation-independent cloning (SLIC). Genomic DNA isolation from bacteria (E. coli) and yeast (Yarrowia lipolytica) was performed using the Wizard Genomic DNA purification kit according to the manufacturer's protocol (Promega, USA). Synthetic genes were codon-optimized using GeneGenie or Genscript (USA) and assembled from gene fragments purchased from Twist Bioscience. All the engineered Y. lipolytica strains were constructed by transforming the corresponding plasmids. All gene expression cassettes were constructed using a TEF intron promoter and a synthesized short terminator. Up to six expression cassettes were cloned into episomal expression vectors through SLIC.
  • E. coli minipreps were performed using the Zyppy Plasmid Miniprep Kit (Zymo Research Corporation). Transformation of E. coli strains was performed using Mix & Go Competent Cells (Zymo Research, USA). Transformation of Y. lipolytica with episomal expression plasmids was performed using the Zymogen Frozen EZ Yeast Transformation Kit II (Zymo Research Corporation), and cells were spread on selective plates. Transformation of Y. lipolytica with linearized cassettes was performed using the LiOAc method. Briefly, Y. lipolytica strains were inoculated from glycerol stocks directly into 10 ml YPD media, grown overnight and harvested at an OD600 between 9 and 15 by centrifugation at 1,000 g for 3 min. Cells were washed twice in sterile water. Cells were dispensed into separate microcentrifuge tubes for each transformation, spun down and resuspended in 1.0 ml 100 mM LiOAc. Cells were incubated with shaking at 30° C. for 60 min, spun down, resuspended in 90 μl 100 mM LiOAc and placed on ice. Linearized DNA (1-5 mg) was added to each transformation mixture in a total volume of 10 μl, followed by 25 μl of 50 mg/ml boiled salmon sperm DNA. Cells were incubated at 30° C. for 15 min with shaking, before adding 720 μl PEG buffer (50% PEG8000, 100 mM LiOAc, pH=6.0) and 45 μl 2 M Dithiothreitol. Cells were incubated at 30° C. with shaking for 60 min, heat-shocked for 10 min in a 39° C. water bath, spun down and resuspended in 1 ml sterile water. Cells (200 μl) were plated on appropriate selection plates.
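  • As a rough check on the LiOAc transformation mixture, the final volume and the final concentrations after all additions can be estimated from the listed volumes. This is a sketch assuming additive volumes and a negligible cell pellet volume; the dictionary and function names are illustrative and not part of the protocol.

```python
# Volumes (in microlitres) added to each transformation, from the protocol.
COMPONENTS_UL = {
    "cells in 100 mM LiOAc": 90.0,
    "linearized DNA": 10.0,
    "salmon sperm DNA (50 mg/ml)": 25.0,
    "PEG buffer (50% PEG8000, 100 mM LiOAc)": 720.0,
    "2 M dithiothreitol": 45.0,
}

total_ul = sum(COMPONENTS_UL.values())


def final_concentration(stock_conc: float, stock_volume_ul: float) -> float:
    """Dilute a stock into the full mixture (result in the stock's units)."""
    return stock_conc * stock_volume_ul / total_ul


peg_percent = final_concentration(50.0, 720.0)        # % PEG8000
dtt_mm = final_concentration(2000.0, 45.0)            # mM DTT
lioac_mm = final_concentration(100.0, 90.0 + 720.0)   # mM LiOAc from both sources
carrier_mg_ml = final_concentration(50.0, 25.0)       # mg/ml carrier DNA

if __name__ == "__main__":
    print(f"Total volume: {total_ul:.0f} ul")
    print(f"PEG8000: {peg_percent:.1f}%  DTT: {dtt_mm:.1f} mM")
    print(f"LiOAc: {lioac_mm:.1f} mM  carrier DNA: {carrier_mg_ml:.2f} mg/ml")
```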
  • Example 2: Yeast Culture Conditions
  • E. coli strain DH10B was used for cloning and plasmid propagation. DH10B was grown at 37° C. with constant shaking in Luria-Bertani Broth supplemented with 100 mg/L of ampicillin for plasmid propagation. Y. lipolytica strain W29 was used as the base strain for all experiments. Y. lipolytica was cultivated at 30° C. with constant agitation. Cultures (2 ml) of Y. lipolytica used in large-scale screens were grown in a shaking incubator at 250 rpm for 1 to 3 days, and larger culture volumes were shaken in 50 ml flasks or fermented in a bioreactor.
  • For colony screening and cell propagation, Y. lipolytica was grown in YPD liquid medium containing 10 g/L yeast extract, 20 g/L peptone and 20 g/L glucose, or on YPD agar plates with the addition of 20 g/L of agar. Medium was often supplemented with 150 to 300 mg/L Hygromycin B or 250 to 500 mg/L nourseothricin for selection, as appropriate. For cannabinoid producing strains, modified YPD medium with 0.1 to 1 g/L yeast extract was used to promote lipid accumulation and was often supplemented with 0.2 g/L and 5 g/L ammonium sulphate as alternative nitrogen source.
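  • The per-litre recipe above can be scaled to any batch volume. The following is a minimal sketch; the helper name and the 0.5 L example volume are hypothetical, while the concentrations are those given for YPD.

```python
# YPD composition in grams per litre, as described above.
YPD_G_PER_L = {"yeast extract": 10.0, "peptone": 20.0, "glucose": 20.0}
AGAR_G_PER_L = 20.0  # added for YPD agar plates only


def grams_needed(recipe_g_per_l, volume_l, agar=False):
    """Scale a g/L recipe to the requested volume in litres."""
    amounts = {name: conc * volume_l for name, conc in recipe_g_per_l.items()}
    if agar:
        amounts["agar"] = AGAR_G_PER_L * volume_l
    return amounts


if __name__ == "__main__":
    # Example: 0.5 L of YPD agar (volume chosen for illustration).
    for name, grams in grams_needed(YPD_G_PER_L, 0.5, agar=True).items():
        print(f"{name}: {grams:.1f} g")
```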
  • Example 3: Cannabinoid Isolation
  • Y. lipolytica cultures from the shaking flask experiment or bioreactor are pelleted and homogenized in acetonitrile followed by incubation on ice for 15 min. Supernatants are filtered (0.45 μm, Nylon) after centrifugation (13,100 g, 4° C., 20 min) and analyzed by HPLC-DAD. Quantification of products is based on integrated peak areas of the UV-chromatograms at 225 nm. Standard curves are generated for CBGA and THCA. The identity of all compounds can be confirmed by comparing mass and tandem mass spectra of each sample with coeluting standards analysed by Bruker Compact™ ESI-Q-TOF using positive ionization mode.
  • Example 4: Gene Combinations
  • Embodiment 1: Y. lipolytica ERG20 comprising F88W and N119W substitutions; tHMGR; OLS: OAC; CBGAS; THCAS; HexA and HexB.
  • Embodiment 2: Y. lipolytica ERG20 comprising F88W and N119W substitutions; HMGR; OLS: OAC; NphB Q161A; THCAS; FAS1 I306A, M1251W and FAS2 G1250S.
  • Embodiment 3: S. cerevisiae ERG20 comprising a K197E substitution; OLS: OAC; NphB Q161A; CBDAS; StcJ and StcK.
  • Embodiment 4: Y. lipolytica ERG20 comprising a K189E substitution; HMGR; OLS: OAC; CBGAS; CBCAS; HexA and HexB.
  • Embodiment 5: Y. lipolytica ERG20 comprising a K189E substitution; tHMGR; OLS: OAC; CBGAS; CBDAS; StcJ and StcK.
  • The genetically modified yeast of the present disclosure enable the production of cannabinoid precursors and cannabinoids. The accumulation of fatty acids or fats in the yeast of at least 5% dry weight provides a storage location for the cannabinoid precursors and cannabinoids removed from the plasma membrane. This reduces the accumulation of cannabinoid precursors and cannabinoids in the plasma membrane, reducing membrane destabilisation and reducing the chances of cell death. Oily yeast such as Y. lipolytica can be engineered to have a fatty acid or fat (eg lipid) content above 80% dry weight, compared to 2-3% for yeast such as S. cerevisiae. Accordingly, cannabinoid precursor and cannabinoid production can be much higher in oily yeast, particularly oily yeast engineered to have a high fatty acid or fat (eg lipid) content.
  • Example 5: Production of Diviaric Acid/Olivetolic Acid from Compound I
  • It is known that the production of Diviaric Acid and/or Olivetolic Acid is a major bottleneck in cannabinoid biosynthesis. In an effort to eliminate this block, microorganisms capable of producing Diviaric Acid and/or Olivetolic Acid directly from Acetyl-CoA and Malonyl-CoA as illustrated in FIG. 1C were analysed and novel sequences corresponding to SEQ ID NO:41-42 were isolated. It was determined that:
      • a. to produce Olivetolic Acid, a combination of cs-OLAS-1 of SEQ ID NO:41 and cs-HEX-1 of SEQ ID NO:43 is needed;
      • b. to produce Diviaric Acid, a combination of SEQ ID NO:42 and SEQ ID NO:44 is needed.
  • Example 6: Effect of NphB Gene Mutations on Product Quality
  • To evaluate the effect of NphB gene mutations on product quality, a lipid accumulation strain (W29 Δpex10 AURA3 hp4d-YlACBP hp4d-YlZWF1 hp4d-YlACC1 TEFin-YlDGA1 TEFin-ScSUC2 TEFin-YlHXK1) was used to express NphBs. Wild-type NphB and NphB mutants carrying a thrombin-6×His tag at the N-terminus were expressed episomally, driven by the TEF intron promoter.
  • Strains were pre-grown in yeast extract peptone dextrose hygromycin (YPD-hygromycin) medium overnight and then back-diluted to OD 600 nm=0.2 into YPD-hygromycin medium. Strains were incubated for 48 h in incubator shaker (250 r.p.m.) at 30° C. while supplementing with 50 mg/L hygromycin every 24 h.
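  • The back-dilution step follows the usual C1·V1 = C2·V2 relation. The sketch below computes the volume of overnight culture needed to start a culture at OD 600 nm = 0.2. The overnight OD of 12 and the 2 ml final volume in the example are hypothetical, and the calculation assumes the OD reading is in the linear range (i.e., the overnight culture was diluted before measurement).

```python
def inoculum_volume(overnight_od: float, target_od: float, final_volume: float) -> float:
    """Volume of overnight culture to add, in the same units as final_volume."""
    if overnight_od <= 0 or target_od <= 0 or final_volume <= 0:
        raise ValueError("ODs and volume must be positive")
    if overnight_od < target_od:
        raise ValueError("overnight culture is below the target OD")
    # C1 * V1 = C2 * V2  ->  V1 = C2 * V2 / C1
    return target_od * final_volume / overnight_od


if __name__ == "__main__":
    v = inoculum_volume(overnight_od=12.0, target_od=0.2, final_volume=2.0)
    print(f"Add {v * 1000:.1f} ul overnight culture, fresh medium to 2 ml")
```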
  • Cells were centrifuged at 3000 g for 5 min. The pellet was resuspended in binding buffer (His gravitrap, GE). Beads and EZBlock protease inhibitor cocktail V were added to the cells before homogenization on an Omni homogenizer for 90 s at 4° C. His-tagged proteins were purified with the His gravitrap kit according to the manufacturer's manual. Purified protein was then buffer exchanged on a PD-10 desalting column (GE). The thrombin-6×His tag was removed by thrombin digestion at 25° C. for 16 h followed by purification through a His gravitrap column to obtain tag-free protein. Proteins were concentrated and buffer exchanged into assay buffer (50 mM NaH2PO4, 300 mM NaCl, 20 mM β-mercaptoethanol) using Spin-X UF concentrators (Corning).
  • To assay NphB activity, in vitro assays containing 5 mM GPP, 2 mM OA, 5 mM MgCl2 and 0.5 mg/mL purified NphB enzyme were incubated for 24 h at room temperature and subsequently extracted by adding 200 μl acetonitrile to stop the reaction and vortexing for 30 s. The solution was centrifuged at 18000 g for 3 min before being subjected to HPLC analysis.
  • Products were then analysed using high-performance liquid chromatography with UV detection. The mobile phase was composed of 0.05% (v/v) formic acid in water (solvent A) and 0.05% (v/v) formic acid in acetonitrile (solvent B). Olivetolic acid and cannabinoids were separated via gradient elution as follows: linearly increased from 45% B to 62.5% B in 3 min, held at 62.5% B for 4 min, increased from 62.5% B to 97% B in 1 min, held at 97% B for 4 min, decreased from 97% B to 45% B in 0.5 min, and held at 45% B for 3 min. The flow rate was held at 0.2 ml/min for 12 min, increased from 0.2 ml/min to 0.4 ml/min in 0.5 min, and held at 0.4 ml/min for 3 min. The total liquid chromatography run time was 15.5 min.
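  • The gradient and flow programs above can be written as piecewise-linear breakpoints, which makes it easy to confirm that both add up to the stated 15.5 min run. This is a sketch only: the breakpoint lists are derived from the segment durations in the text, and the function name is illustrative.

```python
# (time in min, % solvent B) breakpoints built from the segment durations.
GRADIENT_B = [(0.0, 45.0), (3.0, 62.5), (7.0, 62.5), (8.0, 97.0),
              (12.0, 97.0), (12.5, 45.0), (15.5, 45.0)]

# (time in min, flow rate in ml/min) breakpoints.
FLOW = [(0.0, 0.2), (12.0, 0.2), (12.5, 0.4), (15.5, 0.4)]


def interpolate(program, t):
    """Linearly interpolate a (time, value) program at time t."""
    if not program[0][0] <= t <= program[-1][0]:
        raise ValueError("time outside the program")
    for (t0, v0), (t1, v1) in zip(program, program[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


if __name__ == "__main__":
    print("Total run time:", GRADIENT_B[-1][0], "min")
    for t in (0.0, 1.5, 7.5, 12.25, 15.5):
        print(f"t={t:5.2f} min  B={interpolate(GRADIENT_B, t):5.1f}%  "
              f"flow={interpolate(FLOW, t):.2f} ml/min")
```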
  • Summary of enzymatic assay products quantified by HPLC.
  • Mutation | Name | OA (area) | CBGA (area) | CBGA byproduct (area)
    Q161A | NphB1 | 266,298 | 547 | 8,105
    Q161A + G286S + Y288A | NphB3 | 258,567 | 26,667 | 374
    Y288A | NphB5 | 303,916 | 6,417 | N.D.
    Y288A + A232S | NphB6 | 268,441 | 11,647 | N.D.
    Y288A + G286S | NphB7 | 287,361 | 19,613 | 219
    G286S | NphB8 | 273,713 | 1,570 | 812
  • As shown in FIG. 8, different mutations in NphB shown in the above table produce Olivetolic Acid and CBGA with low amounts of CBGA by-product.
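  • One way to compare the variants is the share of CBGA in the total prenylated product signal. The sketch below assumes the three area columns are OA, CBGA and CBGA byproduct in that order, treats "N.D." as zero, and uses raw peak areas without response-factor correction. These are simplifying assumptions made for illustration and are not an analysis performed in the disclosure.

```python
# name: (OA area, CBGA area, CBGA byproduct area); None marks "N.D."
PEAK_AREAS = {
    "NphB1": (266298, 547, 8105),
    "NphB3": (258567, 26667, 374),
    "NphB5": (303916, 6417, None),
    "NphB6": (268441, 11647, None),
    "NphB7": (287361, 19613, 219),
    "NphB8": (273713, 1570, 812),
}


def cbga_fraction(name: str) -> float:
    """CBGA area divided by (CBGA + byproduct) area; N.D. counted as 0."""
    _, cbga, byproduct = PEAK_AREAS[name]
    byproduct = byproduct or 0
    return cbga / (cbga + byproduct)


# Variant with the largest CBGA peak area.
highest_cbga = max(PEAK_AREAS, key=lambda n: PEAK_AREAS[n][1])

if __name__ == "__main__":
    for name in PEAK_AREAS:
        print(f"{name}: CBGA area {PEAK_AREAS[name][1]:>6}, "
              f"CBGA fraction {cbga_fraction(name):.3f}")
    print("Highest CBGA area:", highest_cbga)
```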
  • Example 7-Improved Activity of Using ProASignal Sequences
  • It was surprisingly found that the addition of a ProA signal sequence (e.g., one of SEQ ID Nos:45-49) to THCAS, CBDAS and/or CBCAS improves functionality of these enzymes and increases production of the resulting cannabinoids analogs. For example, FIGS. 9A and 9B show the results when different ProA signal sequences were tested.
  • Specifically, a lipid accumulation strain Y12 (W29 Δpex10 AURA3 hp4d-YlACBP hp4d-YZWF1 hp4d-YlACC1 TEFin-YDGA1 TEFin-ScSUC2 TEFin-YlHXK) was used for THCAS episomal expression. All THCAS has 3×-is tag attached at C-terminal for Western Blot detection. All THCAS are driven by TEF intron promoter with XPR2 terminator. Different length of vacuolar proteinase A (YALI0F27071g) single peptide are attached at N-terminal of THCAS. One THCAS variant is with two mutations at N89Q and N499Q for 2 glycosylation site removal.
  • Strain number | Genotype | Plasmid
    S1 | Y12 | TEFin-THCA-His-XPR2
    S2 | Y12 | TEFin-ProA18-THCAS-His-XPR2
    S3 | Y12 | TEFin-ProA19-THCAS-His-XPR2
    S4 | Y12 | TEFin-ProA20-THCAS-His-XPR2
    S5 | Y12 | TEFin-ProA21-THCAS-His-XPR2
    S6 | Y12 | TEFin-ProA22-THCAS-His-XPR2
    S7 | Y12 | TEFin-ProA23-THCAS-His-XPR2
    S8 | Y12 | TEFin-ProA24-THCAS-His-XPR2
    S9 | Y12, ΔPRB1 | TEFin-ProA18-THCAS-His-XPR2
    S10 | Y12, ΔPRB1 | TEFin-ProA19-THCAS-His-XPR2
    S11 | Y12, ΔPRB1 | TEFin-ProA20-THCAS-His-XPR2
    S12 | Y12, ΔPRB1 | TEFin-ProA21-THCAS-His-XPR2
    S13 | Y12, ΔPRB1 | TEFin-ProA22-THCAS-His-XPR2
    S14 | Y12, ΔPRB1 | TEFin-ProA23-THCAS-His-XPR2
    S15 | Y12, ΔPRB1 | TEFin-ProA24-THCAS-His-XPR2
    S16 | Y12, ΔPEP4 | TEFin-ProA24-THCAS-His-XPR2
    S17 | Y12, ΔPRB2 | TEFin-ProA24-THCAS-His-XPR2
    S18 | Y12, ΔPEP4, ΔPRB1 | TEFin-ProA24-THCAS-His-XPR2
    S19 | Y12, ΔPRB1 | TEFin-ProA24-THCAS-His-2M-XPR2
    S20 | Y12, ΔPEP4 | TEFin-ProA24-THCAS-His-2M-XPR2
    S21 | Y12, ΔPRB2 | TEFin-ProA24-THCAS-His-2M-XPR2
    S22 | Y12, ΔPEP4, ΔPRB1 | TEFin-ProA24-THCAS-His-2M-XPR2
    S23 | Y12, ΔPRB1, ΔPRB2 | TEFin-ProA24-THCAS-His-2M-XPR2
  • Strains were pre-grown overnight in yeast extract peptone dextrose hygromycin (YPD-hygromycin) medium and then back-diluted to OD 600 nm=0.2 in YPD-hygromycin medium. For protein production, strains were incubated for 48 h in an incubator shaker (250 r.p.m.) at 30° C. while supplementing with 50 mg/L hygromycin every 24 h. For in vivo THCA production, strains were incubated under the same culture conditions for 48 h for biomass growth. CBGA was then spiked in at different concentrations and the cultures were incubated for another 48 or 72 hours for THCA production. A CBGA stock solution (1 mg/ml CBGA in F127 surfactant with 1% (v/v) canola oil) was used for spiking.
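  • The back-dilution to OD 600 nm=0.2 and the CBGA spiking both follow the standard dilution relation C1·V1 = C2·V2. The following is a minimal Python sketch of that arithmetic. The overnight OD of 8.0, the 25 ml culture volume and the 20 mg/L CBGA target are made-up illustration values and do not come from this example; the function name `transfer_volume_ml` is hypothetical.

```python
def transfer_volume_ml(stock_conc, target_conc, final_volume_ml):
    """Volume of stock (ml) so that final_volume_ml ends up at target_conc.

    Uses C1 * V1 = C2 * V2; stock_conc and target_conc must share units.
    The final volume is taken to include the transferred stock volume.
    """
    if stock_conc <= 0 or final_volume_ml <= 0:
        raise ValueError("stock concentration and final volume must be positive")
    if target_conc < 0 or target_conc > stock_conc:
        raise ValueError("target must be between 0 and the stock concentration")
    return target_conc * final_volume_ml / stock_conc


if __name__ == "__main__":
    # Hypothetical overnight culture at OD600 = 8.0, diluted to 0.2 in 25 ml.
    inoculum = transfer_volume_ml(8.0, 0.2, 25.0)
    print(f"inoculum: {inoculum:.3f} ml")

    # CBGA stock is 1 mg/ml = 1000 mg/L; hypothetical target of 20 mg/L.
    spike = transfer_volume_ml(1000.0, 20.0, 25.0)
    print(f"CBGA stock: {spike:.3f} ml")
```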
  • Cells were centrifuged at 15000 g for 3 min. The pellet was resuspended in THCAS assay buffer (100 mM Na-citrate buffer, pH 4.5). Beads and 1% (v/v) EZBlock protease inhibitor cocktail V were added to the cells before homogenization on an Omni homogenizer for 90 s at 4° C. The cell lysate obtained by centrifugation at 15000 g for 5 min was used for western blot. THCAS production was evaluated by western blot using a primary antibody (6×-His Tag Polyclonal Antibody, PA1-983B) and a secondary antibody (Goat anti-Rabbit IgG (H+L) Cross-Adsorbed Secondary Antibody, HRP, G-21234) against the C-terminal 3×His tag on THCAS. Western blot detection was performed using 1-Step Ultra TMB-Blotting Solution (Thermo Fisher Scientific).
  • Extraction of cannabinoids was performed by adding 1 ml culture, 0.3 ml ethyl acetate/formic acid (0.05% (v/v)) and 0.2 ml equivalent of glass beads to an Omni homogenizer tube.
  • Cells were cooled on ice for 2 min and then homogenized at Speed 5 for 90 s at 4° C. The organic and aqueous layers were separated by centrifugation at 18,000 g for 2 min. Samples were extracted three times with ethyl acetate/formic acid (0.05% (v/v)). The combined organic layers were evaporated in a fume hood and the residue was resuspended in 300 µl of acetonitrile/H2O/formic acid (80%/20%/0.05% (v/v/v)). The product was filtered before being subjected to HPLC analysis.
  • Products were analysed using high-performance liquid chromatography with UV detection. The mobile phase was composed of 0.05% (v/v) formic acid in water (solvent A) and 0.05% (v/v) formic acid in acetonitrile (solvent B). Olivetolic acid and cannabinoids were separated via gradient elution as follows: linearly increased from 45% B to 62.5% B in 3 min, held at 62.5% B for 4 min, increased from 62.5% B to 97% B in 1 min, held at 97% B for 4 min, decreased from 97% B to 45% B in 0.5 min, and held at 45% B for 3 min. The flow rate was held at 0.2 ml/min for 12 min, increased from 0.2 ml/min to 0.4 ml/min in 0.5 min, and held at 0.4 ml/min for 3 min. The total liquid chromatography run time was 15.5 min.
  • FIG. 9A shows that THCAS without ProA (S1) produces a large amount of cytoplasmic enzyme with a mass of 53 kD. This enzyme is not glycosylated and has a predicted molecular weight of 53 kD. ProA19 (S3) also produces a significant amount of unglycosylated enzyme. THCAS with correct glycosylation (69 kD) was not detectable by Western blot in strains with active PRB1 and PEP4, showing that without ProA and knockout almost no enzyme is present in
  • FIG. 9B shows the effect of protease knockout on ProA24-THCAS production. Correctly glycosylated (69 kD) enzyme was produced in the dPRB1, dPEP4 and dPRB1+dPEP4 strains (lanes S15-S16, S18-S20, and S22-S23). The dPRB2 strains show no detectable amount of any form of THCAS (lanes S17 and S21).
  • FIG. 9C shows that ProA19-24 can produce large amounts of correctly glycosylated enzyme in the dPRB1 strain.
  • FIG. 9D shows in vivo THCA production by strains expressing THCAS with different ProA signal peptides and protease knockouts. According to this figure, strains expressing THCAS fused to a ProA signal sequence in dPRB1 and/or dPEP4 knockout backgrounds produce more than 10-fold more THCA than strains without ProA and protease knockout.
  • The reference to any prior art in this specification is not, and should not be taken as, an acknowledgement of any form of suggestion that such prior art forms part of the common general knowledge.
  • It will be appreciated by those skilled in the art that the disclosure is not restricted in its use to the particular application described. Neither is the present disclosure restricted in its preferred embodiment with regard to the particular elements and/or features described or depicted herein. It will be appreciated that the disclosure is not limited to the embodiment or embodiments disclosed, but is capable of numerous rearrangements, modifications and substitutions without departing from the scope of the disclosure as set forth and defined by the following claims.
  • REFERENCES
    • Angela, C., Hansen, E. H., Kayser, O., Carlsen, S. and Stehle, F. 2017. Microorganism design for heterologous biosynthesis of cannabinoids. FEMS Yeast Research.
    • Bonitz, T., Alva, V., Saleh, O., Lupas, A. N. and Heide, L., 2011. Evolutionary relationships of microbial aromatic prenyltransferases. PloS one, 6(11), p. e27336.
    • Brown, D. W., Adams, T. H. and Keller, N. P., 1996. Aspergillus has distinct fatty acid synthases for primary and secondary metabolism. Proceedings of the National Academy of Sciences, 93(25), pp. 14873-14877.
    • Curran, K. A., Morse, N.J., Markham, K. A., Wagman, A. M., Gupta, A. and Alper, H. S., 2015. Short synthetic terminators for improved heterologous gene expression in yeast. ACS synthetic biology, 4(7), pp. 824-832.
    • Gao, S., Tong, Y., Zhu, L., Ge, M., Zhang, Y., Chen, D., Jiang, Y. and Yang, S., 2017. Iterative integration of multiple-copy pathway genes in Yarrowia lipolytica for heterologous β-carotene production. Metabolic engineering, 41, pp. 192-201.
    • Gajewski, J., Pavlovic, R., Fischer, M., Boles, E. and Grininger, M., 2017. Engineering fungal de novo fatty acid synthesis for short chain fatty acid production. Nature Communications, 8, p. 14650.
    • Ghorai, N., Chakraborty, S., Gucchait, S., Saha, S. K. and Biswas, S., 2012. Estimation of total Terpenoids concentration in plant tissues using a monoterpene, Linalool as standard reagent. Protocol Exchange, 5.
    • Hitchman, T. S., Schmidt, E. W., Trail, F., Rarick, M. D., Linz, J. E. and Townsend, C. A., 2001. Hexanoate synthase, a specialized type I fatty acid synthase in aflatoxin B1 biosynthesis. Bioorganic chemistry, 29(5), pp. 293-307.
    • Kampranis, S. C. and Makris, A. M. 2012. Developing a yeast cell factory for the production of terpenoids. Computational and structural biotechnology journal 3, p. e201210006.
    • Kuzuyama, T., Noel, J. P. and Richard, S. B., 2005. Structural basis for the promiscuous biosynthetic prenylation of aromatic natural products. Nature, 435(7044), p. 983.
    • Lange, K., Schmid, A. and Julsing, M. K. 2016. A9-Tetrahydrocannabinolic acid synthase: The application of a plant secondary metabolite enzyme in biocatalytic chemical synthesis. Journal of Biotechnology 233, pp. 42-48.
    • MacPherson, M. and Saka, Y., 2016. Short synthetic terminators for assembly of transcription units in vitro and stable chromosomal integration in yeast S. cerevisiae. ACS synthetic biology, 6(1), pp. 130-138.
    • Muntendam, R. (2015). Metabolomics and bioanalysis of terpenoid derived secondary metabolites: Analysis of Cannabis sativa L. metabolite production and prenylases for cannabinoid production [Groningen].
    • Poulos, J. L. and Farnia, A. 2016. Patent US20160010126—Production of cannabinoids in yeast—Google Patents. Available at: https://www.google.com/patents/US20160010126 [Accessed: r May 2017].
    • Qiao, K., Imam Abidi, S. H., Liu, H., Zhang, H., Chakraborty, S., Watson, N., Kumaran Ajikumar, P. and Stephanopoulos, G. 2015. Engineering lipid overproduction in the oleaginous yeast Yarrowia lipolytica. Metabolic Engineering 29, pp. 56-65.
    • Zhao, J., Bao, X., Li, C., Shen, Y. and Hou, J. 2016. Improving monoterpene geraniol production through geranyl diphosphate synthesis regulation in Saccharomyces cerevisiae. Applied Microbiology and Biotechnology 100(10), pp. 4561-4571.
    • Zhuang, X. U. N. Engineering Novel Terpene Production Platforms In The Yeast Saccharomyces Cerevisiae.
    • Zirpel, B., Degenhardt, F., Martin, C., Kayser, O. and Stehle, F. 2017. Engineering yeasts as platform organisms for cannabinoid biosynthesis. Journal of Biotechnology.
    • Zirpel, B., Stehle, F. and Kayser, O. 2015. Production of A9-tetrahydrocannabinolic acid from cannabigerolic acid by whole cells of Pichia (Komagataella) pastoris expressing Δ9-tetrahydrocannabinolic acid synthase from Cannabis sativa L. Biotechnology Letters 37(9), pp. 1869-1875.

Claims (26)

1. A Polyketide Synthase (PKS) enzyme comprising the amino acid sequence selected from:
a. SEQ ID NO:1 (C. stellaris-OLAs-dACP1);
b. SEQ ID NO:2 (C. stellaris-OLAs-dACP2);
c. SEQ ID NO:3 (C. stellaris-OLAs-wt (wild type C. stellaris));
d. SEQ ID NO:6 (C. grayi-PKS-dACP1);
e. SEQ ID NO:7 (C. grayi-PKS-dACP2);
f. SEQ ID NO:40 (P. furfuracea);
g. SEQ ID NO:41 (cs-OLAS-1);
h. SEQ ID NO:42 (pp-DVAS-1);
i. a PKS enzyme variant of any one of SEQ ID NO:4-5 and 40 (C. grayi, C. uncialis), wherein one of the two ACP domains has been inactivated;
j. a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any one of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain;
k. a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence similarity to any one of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain;
l. a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any one of the domains selected from: SAT domain, KS domain, AT domain, PT domain, ACP1 domain, ACP2 domain, and TE domain of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain; or
m. any combination of (a)-(l).
2. A polynucleotide encoding the PKS enzyme of claim 1.
3. A composition comprising:
a. the PKS enzyme of claim 1 selected from SEQ ID NO:1-7 and 40 or variant thereof and a npgA enzyme;
b. the cs-OLAS-1 of SEQ ID NO:41 or variant thereof, a cs-HEX-1 of SEQ ID NO:43 or variant thereof, and a npgA enzyme; or
c. the pp-DVAS-1 of SEQ ID NO:42 or variant thereof, a pp-BUT-1 of SEQ ID NO:44 or variant thereof, and a npgA enzyme.
4. The composition of claim 3, wherein said composition is a cell-free composition.
5. The composition of claim 3, wherein said composition further comprises a recombinant microorganism.
6. The composition of claim 5, wherein said recombinant microorganism:
a. expresses the PKS enzyme comprising the amino acid sequence selected from:
1) SEQ ID NO:1 (C. stellaris-OLAs-dACP1);
2) SEQ ID NO:2 (C. stellaris-OLAs-dACP2);
3) SEQ ID NO:3 (C. stellaris-OLAs-wt (wild type C. stellaris));
4) SEQ ID NO:6 (C. grayi-PKS-dACP1);
5) SEQ ID NO:7 (C. grayi-PKS-dACP2);
6) SEQ ID NO:40 (P. furfuracea);
7) SEQ ID NO:41 (cs-OLAS-1);
8) SEQ ID NO:42 (pp-DVAS-1);
9) a PKS enzyme variant of any one of SEQ ID NO:4-5 and 40 (C. grayi, C. uncialis), wherein one of the two ACP domains has been inactivated;
10) a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any one of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain;
11) a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence similarity to any one of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain;
12) a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any one of the domains selected from: SAT domain, KS domain, AT domain, PT domain, ACP1 domain, ACP2 domain, and TE domain of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain; or
13) any combination of 1)-12); and/or
b. expresses the npgA enzyme; and/or
c. expresses the cs-OLAS-1 or variant thereof and the cs-HEX-1 or variant thereof; and/or
d. expresses the pp-DVAS-1 or variant thereof and the pp-BUT-1 or variant thereof.
7. The composition of claim 3, wherein said composition further comprises at least one enzyme selected from:
a. a FAS1 mutant, wherein mutations are selected from I306A, R1834K;
b. a FAS2 mutant, wherein said mutation is selected from G1250S, M1251W;
c. StcJ and StcK;
d. HexA and HexB;
e. ERG10;
f. ERG13;
g. HMGR;
h. tHMGR (truncated HMGR);
i. ERG12;
j. ERG8;
k. ERG19;
l. IDI1;
m. an ERG20 mutant, wherein said mutant is selected from
i. S. cerevisiae ERG20(F96W/N127W) or Y. lipolytica ERG20(F88W/N119W) or
ii. S. cerevisiae ERG20(K197E) or Y. lipolytica ERG20(K189E);
n. a mutant NphB (mutNphB) (preferably with at least one of the mutations Q161A, G286S, Y288A, A232S);
o. csPT1;
p. csPT4;
q. a tetrahydrocannabinolic acid synthase (THCAS);
r. a cannabidiolic acid synthase (CBDAS);
s. a cannabichromenic acid synthase (CBCAS); or
t. any combination of (a)-(s).
8. The composition of claim 5, wherein said recombinant microorganism overexpresses a protein selected from:
a. the PKS enzyme comprising the amino acid sequence selected from:
1) SEQ ID NO:1 (C. stellaris-OLAs-dACP1);
2) SEQ ID NO:2 (C. stellaris-OLAs-dACP2);
3) SEQ ID NO:3 (C. stellaris-OLAs-wt (wild type C. stellaris));
4) SEQ ID NO:6 (C. grayi-PKS-dACP1);
5) SEQ ID NO:7 (C. grayi-PKS-dACP2);
6) SEQ ID NO:40 (P. furfuracea);
7) SEQ ID NO:41 (cs-OLAS-1);
8) SEQ ID NO:42 (pp-DVAS-1);
9) a PKS enzyme variant of any one of SEQ ID NO:4-5 and 40 (C. grayi, C. uncialis), wherein one of the two ACP domains has been inactivated;
10) a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any one of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain;
11) a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence similarity to any one of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain;
12) a PKS enzyme variant having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to any one of the domains selected from: SAT domain, KS domain, AT domain, PT domain, ACP1 domain, ACP2 domain, and TE domain of SEQ ID NOS: 1-7 or 40-42, wherein said PKS enzyme variant has retained PKS activity and has only one active ACP domain; or
13) any combination of 1)-12);
b. the npgA enzyme;
c. cs-OLAS-1 or variant thereof and the cs-HEX-1 or variant thereof;
d. the pp-DVAS-1 or variant thereof and the pp-BUT-1 or variant thereof; and/or
e. at least one enzyme selected from:
1) a FAS1 mutant, wherein mutations are selected from I306A, R1834K;
2) a FAS2 mutant, wherein said mutation is selected from G1250S, M1251W;
3) StcJ and StcK;
4) HexA and HexB;
5) ERG10;
6) ERG13;
7) HMGR;
8) tHMGR (truncated HMGR);
9) ERG12;
10) ERG8;
11) ERG19;
12) IDI1;
13) an ERG20 mutant, wherein said mutant is selected from
a. S. cerevisiae ERG20(F96W/N127W) or Y. lipolytica ERG20(F88W/N119W) or
b. S. cerevisiae ERG20(K197E) or Y. lipolytica ERG20(K189E);
14) a mutant NphB (mutNphB) (preferably with at least one of the mutations Q161A, G286S, Y288A, A232S);
15) csPT1;
16) csPT4;
17) a tetrahydrocannabinolic acid synthase (THCAS);
18) a cannabidiolic acid synthase (CBDAS);
19) a cannabichromenic acid synthase (CBCAS); or
20) any combination of 1)-19).
9. The composition of claim 8, wherein said protein is overexpressed by:
a. operably associating a strong promoter with a polynucleotide encoding the protein; and/or
b. expression of multiple copies of a polynucleotide encoding the protein by the recombinant microorganism.
10. The composition of claim 5, wherein said recombinant microorganism further comprises inactivation of:
a. PEX10; and/or
b. CPR1; and/or
c. PEP4 (from S. cerevisiae, YALI0F27071p in YL); and/or
d. PRB1 (from S. cerevisiae, YALI0B16500p and/or YALI0A06435p in YL).
11. The composition of claim 3, wherein the composition further comprises any one of:
a. Compound II, wherein n is 1 (Butyryl-CoA), 2 (Hexanoyl-CoA) or 3 (Octanoyl-CoA);
Figure US20220403346A1-20221222-C00004
and/or
b. Compound III, wherein n is 1 (Butyric Acid), 2 (Hexanoic Acid) or 3 (Octanoic Acid);
Figure US20220403346A1-20221222-C00005
12. The composition of claim 3, wherein the composition further comprises at least one cannabinoid or cannabinoid precursor.
13. The composition of claim 12, wherein the at least one cannabinoid or cannabinoid precursor comprises CBGA, THCA, CBDA, CBCA, CBD, THC, CBC, CBGVA, THCVA, CBDVA, CBCVA, CBDV, THCV, CBCV, THCA-C7, CBDA-C7, CBGA-C7, CBCA-C7, CBD-C7, THC-C7, CBC-C7, or CBN analog.
14. A method of producing Compound I, wherein said method comprises contacting the composition of claim 3 with a carbohydrate source to enzymatically produce Compound I, wherein Compound I is
Figure US20220403346A1-20221222-C00006
wherein n is selected from 1 (Divaric Acid), 2 (Olivetolic acid), or 3 (2,4-Dihydroxy-6-heptylbenzoic acid).
15-31. (canceled)
32. The method of claim 14, wherein the method is carried out in a microorganism lacking functional PEP4 and/or PRB1 activity.
33. (canceled)
34. (canceled)
35. (canceled)
36. The composition of claim 5 or the method of claim 32, wherein the recombinant microorganism is selected from: bacteria, fungi, yeasts, algae, and archaea.
37. The composition or method of claim 36, wherein said recombinant microorganism is a yeast.
38. The composition or method of claim 37, wherein said yeast is oleaginous.
39. The composition or method of claim 38, wherein the yeast is selected from the genera Rhodosporidium, Rhodotorula, Yarrowia, Cryptococcus, Candida, Lipomyces and Trichosporon.
40. The composition or method of claim 38, wherein said yeast is a Yarrowia lipolytica, a Lipomyces starkeyi, a Rhodosporidium toruloides, a Rhodotorula glutinis, a Trichosporon fermentans or a Cryptococcus curvatus.
41. The composition or method of claim 36, wherein the yeast comprises at least 5%, at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, at least 15%, at least 16%, at least 17%, at least 18%, at least 19%, at least 20%, at least 21%, at least 22%, at least 23%, at least 24%, or at least 25% dry weight of fatty acids or fats.
42. The composition or method of claim 36, wherein the yeast is genetically modified to produce at least 5%, at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, at least 15%, at least 16%, at least 17%, at least 18%, at least 19%, at least 20%, at least 21%, at least 22%, at least 23%, at least 24%, or at least 25% dry weight of fatty acids or fats.
US17/636,322 2019-08-19 2020-08-18 Production of cannabinoids Pending US20220403346A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/636,322 US20220403346A1 (en) 2019-08-19 2020-08-18 Production of cannabinoids

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201962888738P 2019-08-19 2019-08-19
PCT/US2020/046837 WO2021034847A1 (en) 2019-08-19 2020-08-18 Production of cannabinoids
US17/636,322 US20220403346A1 (en) 2019-08-19 2020-08-18 Production of cannabinoids

Publications (1)

Publication Number Publication Date
US20220403346A1 true US20220403346A1 (en) 2022-12-22

Family

ID=72292671

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/636,322 Pending US20220403346A1 (en) 2019-08-19 2020-08-18 Production of cannabinoids

Country Status (6)

Country Link
US (1) US20220403346A1 (en)
EP (1) EP4017972A1 (en)
AU (1) AU2020333745A1 (en)
CA (1) CA3148628A1 (en)
MX (1) MX2022002121A (en)
WO (1) WO2021034847A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220195469A1 (en) * 2020-12-18 2022-06-23 Debut Biotechnology, Inc. Cell-free production of geranyl pyrophosphate from glycerol in a cell-free manufacturing system
CN114657078A (en) * 2022-01-27 2022-06-24 森瑞斯生物科技(深圳)有限公司 Construction method and application of high-yield cannabidiolic acid saccharomyces cerevisiae strain

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3130763A1 (en) 2019-02-25 2020-09-03 Ginkgo Bioworks, Inc. Biosynthesis of cannabinoids and cannabinoid precursors

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9822384B2 (en) 2014-07-14 2017-11-21 Librede Inc. Production of cannabinoids in yeast
US11149291B2 (en) * 2017-07-12 2021-10-19 Biomedican, Inc. Production of cannabinoids in yeast

Also Published As

Publication number Publication date
EP4017972A1 (en) 2022-06-29
AU2020333745A1 (en) 2022-03-31
MX2022002121A (en) 2022-07-27
WO2021034847A1 (en) 2021-02-25
CA3148628A1 (en) 2021-02-25

Legal Events

Date Code Title Description
AS Assignment

Owner name: BIOMEDICAN, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MIKHEEV, MAXIM;REEL/FRAME:059041/0211

Effective date: 20220119

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION