WO2023069921A1 - Recombinant thca synthase polypeptides engineered for enhanced biosynthesis of cannabinoids - Google Patents

Publication number
WO2023069921A1
Authority
WO
WIPO (PCT)
Application number
PCT/US2022/078258
Other languages
French (fr)
Inventor
Trish Choudhary
Xueyang FENG
Amy LUM
Gisele PASSAIA PRIETSCH
Prumjot PANESAR
Original Assignee
Epimeron Usa, Inc.
Willow Biosciences, Inc.
Application filed by Epimeron Usa, Inc. and Willow Biosciences, Inc.

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80: Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/81: Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts

Definitions

  • the present disclosure relates to recombinant THCA synthase polypeptides engineered with enhanced activity and the use of recombinant genes encoding these polypeptides in recombinant host cell systems for the production of cannabinoid compounds.
  • Cannabinoids are a class of compounds that act on endocannabinoid receptors and include the phytocannabinoids naturally produced by Cannabis sativa.
  • Cannabinoids include the more prevalent and well-known compounds, Δ9-tetrahydrocannabinol (THC) and cannabidiol (CBD), as well as 80 or more less prevalent cannabinoids, cannabinoid precursors, related metabolites, and synthetically produced derivative compounds.
  • Cannabinoids are increasingly used to treat a range of diseases and conditions such as multiple sclerosis and chronic pain. Current large-scale production of cannabinoids for pharmaceutical or other use is through extraction from plants.
  • the present disclosure relates generally to recombinant polypeptides engineered with increased THCA synthase activity relative to the naturally occurring THCA synthase from Cannabis sativa, and the use of these recombinant polypeptides in recombinant host cell systems and methods for the preparation of cannabinoids.
  • This summary is intended to introduce the subject matter of the present disclosure, but does not cover each and every embodiment, combination, or variation that is contemplated and described within the present disclosure. Further embodiments are contemplated and described by the disclosure of the detailed description, drawings, and claims.
  • the present disclosure provides a recombinant polypeptide having THCA synthase activity, wherein the polypeptide comprises an amino acid sequence of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18:
  • the polypeptide is encoded by a polynucleotide sequence having at least 80% identity to SEQ ID NO: 17, and a neutral codon difference as compared to SEQ ID NO: 17 at a position encoding an amino acid residue selected from: V75, H108, K136, K137, V184, K187, G328, F337, A368, P404, L415, T464, D498, and H516; optionally, wherein the neutral codon difference as compared to SEQ ID NO: 17 is selected from: V75 (GTA>GTG), H108 (CAC>CAT), K136 (AAG>AAA), V184 (GTA>GTG), K187 (AAG>AAA), R191R (AGG>AGA), G328 (GGC>GGT), F337 (TTC>TTT), A368A (GCC>GCG), P404 (CCT>CCC), L415L (TTA>CTG), T464T (ACC>ACG), D498 (GAC>G
  • the present disclosure provides a recombinant polypeptide having THCA synthase activity, wherein the polypeptide comprises an amino acid sequence of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18, wherein the amino acid difference is:
  • the polypeptide comprises at least two amino acid differences as compared to SEQ ID NO: 18 selected from: H23N and A335C; Q41R and D258R; L43G and K276Q; S72A and V293I; K234R and K496E; I266Q and H517Y; N301D and Y472I; A335T and G348A; and A335T and H517V.
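The substitution labels used throughout (reference residue, position in the reference sequence, replacement residue) can be parsed and applied mechanically. The following is a minimal sketch, assuming 1-based positions; the helper names and the 10-residue toy sequence are hypothetical and are not SEQ ID NO: 18.

```python
import re

# A label such as "H23N": reference residue, 1-based reference position, new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'H23N' into ('H', 23, 'N'); stray spaces are ignored."""
    match = SUBSTITUTION.match(label.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    ref, pos, new = match.groups()
    return ref, int(pos), new


def apply_substitutions(reference, labels):
    """Return a copy of `reference` with every substitution applied."""
    residues = list(reference)
    for label in labels:
        ref, pos, new = parse_substitution(label)
        # Guard against labels that do not match the reference residue.
        if residues[pos - 1] != ref:
            raise ValueError(f"{label}: reference has {residues[pos - 1]} at {pos}")
        residues[pos - 1] = new
    return "".join(residues)


# Hypothetical toy reference for illustration only (NOT SEQ ID NO: 18).
toy_reference = "MKHLAVCDEF"
variant = apply_substitutions(toy_reference, ["H3N", "A5C"])
```

The residue check makes a mislabeled substitution fail loudly rather than silently producing the wrong variant.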
  • the polypeptide comprises neutral codon differences as compared to SEQ ID NO: 17 selected from: V75 (GTA>GTG), H108 (CAC>CAT), K136 (AAG>AAA), V184 (GTA>GTG), K187 (AAG>AAA), R191R (AGG>AGA), G328 (GGC>GGT), F337 (TTC>TTT), A368A (GCC>GCG), P404 (CCT>CCC), L415L (TTA>CTG), T464T (ACC>ACG), D498 (GAC>GAT), and H516 (CAT>CAC).
  • the polypeptide comprises an amino acid sequence of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to a sequence selected from the group consisting of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, and 168.
  • the polypeptide comprises an N-terminal secretion peptide; optionally, wherein the N-terminal secretion peptide comprises an amino acid sequence selected from SEQ ID NO: 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, and 122.
  • the THCA synthase activity of the polypeptide as compared to a polypeptide consisting of SEQ ID NO: 18 is increased at least 1.2-fold, at least 1.5-fold, at least 2-fold, at least 5-fold, or more.
  • the THCA synthase activity of the polypeptide is measured as the rate of conversion of the substrate cannabigerolic acid (CBGA) to THCA under suitable reaction conditions.
  • the THCA synthase activity of the polypeptide is measured as the rate of conversion of the substrate CBGVA to THCVA under suitable reaction conditions.
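The fold-increase comparison described above reduces to a ratio of conversion rates measured under the same conditions. A minimal sketch follows; the concentrations, the 60-minute reaction time, and the function names are hypothetical illustration values, not data from the disclosure.

```python
# Fold thresholds recited in the text: 1.2-fold, 1.5-fold, 2-fold, 5-fold.
THRESHOLDS = (1.2, 1.5, 2.0, 5.0)


def conversion_rate(product_formed_um, minutes):
    """Average rate of product formation (e.g., CBGA -> THCA) in uM per minute."""
    if minutes <= 0:
        raise ValueError("reaction time must be positive")
    return product_formed_um / minutes


def fold_improvement(variant_rate, reference_rate):
    """Activity of a variant relative to the reference polypeptide."""
    if reference_rate <= 0:
        raise ValueError("reference rate must be positive")
    return variant_rate / reference_rate


def thresholds_met(fold):
    """List every recited threshold that the measured fold-increase reaches."""
    return [t for t in THRESHOLDS if fold >= t]


# Hypothetical measurements: THCA formed after 60 minutes.
reference_rate = conversion_rate(10.0, 60)  # reference polypeptide
variant_rate = conversion_rate(25.0, 60)    # engineered variant
fold = fold_improvement(variant_rate, reference_rate)
```

With these illustrative numbers the variant is about 2.5-fold more active, so it meets the 1.2-, 1.5-, and 2-fold thresholds but not the 5-fold one.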
  • the present disclosure also provides a polynucleotide encoding a recombinant polypeptide having THCA synthase activity of the present disclosure.
  • the polynucleotide comprises:
  • the polynucleotide encoding the polypeptide further comprises a polynucleotide sequence encoding an N-terminal secretion peptide comprising an amino acid sequence selected from SEQ ID NO: 100, 102, 104, 106, 108, 110, 112, 114, 116,
  • polynucleotide sequence encoding the N-terminal secretion peptide is selected from SEQ ID NO: 99, 101, 103, 105, 107, 109, 111, 113, 115, 117,
  • the present disclosure also provides an expression vector comprising a polynucleotide encoding a recombinant polypeptide having THCA synthase activity of the present disclosure, optionally wherein, the expression vector comprises a control sequence.
  • the present disclosure also provides a recombinant host cell comprising: (a) a polynucleotide encoding a recombinant polypeptide having THCA synthase activity of the present disclosure, or (b) an expression vector comprising a polynucleotide encoding a recombinant polypeptide having THCA synthase activity of the present disclosure.
  • the present disclosure provides a method for preparing a recombinant polypeptide having THCA synthase activity of the present disclosure wherein the method comprises culturing a recombinant host cell of the present disclosure and isolating the polypeptide from the cell.
  • the present disclosure provides a method for preparing a recombinant polypeptide having THCA synthase activity comprising:
  • the present disclosure also provides a recombinant host cell comprising a nucleic acid encoding a recombinant polypeptide having THCA synthase activity of the present disclosure.
  • the host cell further comprises a pathway of enzymes capable of producing a cannabinoid or cannabinoid precursor; optionally, wherein the cannabinoid or cannabinoid precursor is selected from divarinic acid (DA), olivetolic acid (OA), cannabigerovarinic acid (CBGVA), and cannabigerolic acid (CBGA).
  • the host cell further comprises a pathway of enzymes capable of converting hexanoic acid (HA) to cannabigerolic acid (CBGA); optionally, wherein the pathway comprises enzymes capable of catalyzing reactions (i) - (iv): and
  • the host cell further comprises a pathway of enzymes capable of converting hexanoic acid (HA) to cannabigerolic acid (CBGA), wherein the pathway comprises at least the enzymes AAE, OLS, OAC, and PT4; optionally, wherein the enzymes AAE, OLS, OAC, and PT4 have an amino acid sequence of at least 90% identity to SEQ ID NO: 2 (AAE), SEQ ID NO: 4 (OLS), SEQ ID NO: 6 (OAC), and SEQ ID NO: 8 or 10 (PT4), respectively.
  • the host cell is capable of producing a cannabinoid selected from cannabigerolic acid (CBGA), cannabigerol (CBG), cannabidiolic acid (CBDA), cannabidiol (CBD), Δ9-tetrahydrocannabinolic acid (Δ9-THCA), Δ9-tetrahydrocannabinol (Δ9-THC), Δ8-tetrahydrocannabinolic acid (Δ8-THCA), Δ8-tetrahydrocannabinol (Δ8-THC), cannabichromenic acid (CBCA), cannabichromene (CBC), cannabinolic acid (CBNA), cannabinol (CBN), cannabidivarinic acid (CBDVA), cannabidivarin (CBDV), Δ9-tetrahydrocannabivarinic acid (Δ9-THCVA),
  • the host cell comprises a pathway capable of producing THCA, and the production of THCA is increased at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or more, relative to a control recombinant host cell comprising a pathway with the recombinant polypeptide having THCA synthase activity replaced by a polypeptide of SEQ ID NO: 18.
  • the recombinant host cell capable of increased production of THCA comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18 at one or more positions selected from: R3, L23, H28, L31, M33, L43, S72, V75, K137, G207, K233, I266, K268, K276, H282, V293, N301, A335, G348, T418, Y472, N500, and H517; optionally, wherein the amino acid residue difference as compared to SEQ ID NO: 18 is selected from R3V, L23I, L23V, H28G, H28Q, L31D, L31G, M33D, L43G, L43S, S72A, V75A, V75Y, K137C, K137F, K137M, K137S, K137Y, G207A,
  • the recombinant host cell capable of increased production of THCA comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and a neutral codon difference as compared to SEQ ID NO: 17 selected from: V75 (GTA>GTG), V184 (GTA>GTG), G328 (GGC>GGT), F337 (TTC>TTT), P404 (CCT>CCC), D498 (GAC>GAT), and H516 (CAT>CAC).
  • the host cell comprises a pathway capable of producing THCVA, and the production of THCVA is increased at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or more, relative to a control recombinant host cell comprising a pathway with the recombinant polypeptide having THCA synthase activity replaced by a polypeptide of SEQ ID NO: 18.
  • the recombinant host cell capable of increased production of THCVA comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18 at one or more positions selected from: K12, A19, H28, Q41, V75, H108, V151, A214, K234, D256, D258, H274, F317, F332, A335, S354, T367, and K496; optionally, wherein the amino acid residue difference as compared to SEQ ID NO: 18 is selected from K12G, A19E, A19G, A19Q, H28N, Q41R, V75A, H108R, V151G, A214T, K234R, D256S, D258R, H274C, H274E, H274Q, F317L, F332L, A335C, S354C,
  • the recombinant host cell capable of increased production of THCVA comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and a neutral codon difference as compared to SEQ ID NO: 17 selected from: H108 (CAC>CAT), K136 (AAG>AAA), K187 (AAG>AAA), R191R (AGG>AGA), A368A (GCC>GCG), L415L (TTA>CTG), and T464T (ACC>ACG).
  • the source of the host cell is selected from Saccharomyces cerevisiae, Yarrowia lipolytica, Pichia pastoris, and Escherichia coli.
  • the present disclosure also provides a method for producing a cannabinoid comprising: (a) culturing in a suitable medium a recombinant host cell of the present disclosure; and (b) recovering the produced cannabinoid.
  • the method further comprises contacting a cell-free extract of the culture with a biocatalytic reagent or chemical reagent.
  • the present disclosure also provides a method for preparing a compound of structural formula (I) wherein R1 is C1-C7 alkyl; the method comprising contacting under suitable reaction conditions a recombinant polypeptide having THCA synthase activity of the present disclosure and a compound of structural formula (II) wherein R1 is C1-C7 alkyl.
  • (a) the compound of structural formula (I) is Δ9-tetrahydrocannabinolic acid (Δ9-THCA) and the compound of structural formula (II) is cannabigerolic acid (CBGA); or (b) the compound of structural formula (I) is Δ9-tetrahydrocannabivarinic acid (Δ9-THCVA) and the compound of structural formula (II) is cannabigerovarinic acid (CBGVA).
  • FIG. 1 depicts an exemplary four enzyme pathway capable of converting hexanoic acid (HA) to the cannabinoid precursor, olivetolic acid (OA), and then further converting OA to the cannabinoid, cannabigerolic acid (CBGA).
  • the four enzymes catalyzing the steps in the biosynthetic pathway are AAE, OLS, OAC, and PT.
  • FIG. 2 depicts three exemplary two step pathways for converting the cannabinoid, CBGA, to one or more of the cannabinoids, Δ9-THCA, CBDA, and/or CBCA, and then, optionally, further converting them to the decarboxylated cannabinoids, Δ9-THC, CBD, and/or CBC.
  • the first conversion from CBGA to Δ9-THCA, CBDA, and/or CBCA can be catalyzed by a cannabinoid synthase: THCA synthase (THCAS), CBDA synthase (CBDAS), and/or CBCA synthase (CBCAS), respectively.
  • A single cannabinoid synthase (e.g., CBDAS) is capable of catalyzing not only the conversion of CBGA to its preferred product (e.g., CBDAS preferentially converts CBGA to CBDA), but also the conversion of CBGA to one or both of the other cannabinoid acid products, typically in lesser amounts.
  • FIG. 3 depicts an exemplary four enzyme pathway capable of converting butyric acid (BA) to the rare cannabinoid precursor, divarinic acid (DA), and then further converting DA to the rare cannabinoid, cannabigerovarinic acid (CBGVA).
  • the four enzymes catalyzing the steps in the biosynthetic pathway are AAE, OLS, OAC, and PT.
  • FIG. 4 depicts three exemplary two step pathways for converting the rare cannabinoid, CBGVA, to one or more of the rare cannabinoids, Δ9-THCVA, CBDVA, and/or CBCVA, and then, optionally, further converting them to the decarboxylated cannabinoids, Δ9-THCV, CBDV, and/or CBCV.
  • the first conversion from CBGVA to Δ9-THCVA, CBDVA, and/or CBCVA can be catalyzed by a single cannabinoid synthase: THCAS, CBDAS, and/or CBCAS, respectively.
  • A single cannabinoid synthase (e.g., CBDAS) is capable of catalyzing not only the conversion of CBGVA to its preferred product (e.g., CBDAS preferentially converts CBGVA to CBDVA), but also the conversion of CBGVA to one or both of the other cannabinoid acid products, typically in lesser amounts.
  • “Cannabinoid” refers to a compound that acts on a cannabinoid receptor, and is intended to include the endocannabinoid compounds that are produced naturally in animals, the phytocannabinoid compounds produced naturally in cannabis plants, and synthetic cannabinoid compounds.
  • Cannabinoids as referenced in the present disclosure include, but are not limited to, the exemplary naturally occurring and synthetic cannabinoid product compounds shown in Table 1 (below).
  • “Pathway” refers to an ordered sequence of enzymes that act in a linked series to convert an initial substrate molecule into a final product molecule.
  • pathway is intended to encompass naturally-occurring pathways and non-naturally occurring, recombinant pathways. Accordingly, a pathway of the present disclosure can include a series of enzymes that are naturally-occurring and/or non-naturally occurring, and can include a series of enzymes that act in vivo or in vitro.
  • “Pathway capable of producing a cannabinoid” refers to a pathway that can convert a cannabinoid precursor molecule, such as hexanoic acid, into a cannabinoid product molecule, such as cannabigerolic acid (CBGA).
  • the four enzymes AAE, OLS, OAC, and PT which convert hexanoic acid to CBGA form a pathway capable of producing a cannabinoid.
  • “Cannabinoid precursor” refers to a compound capable of being converted into a cannabinoid by a pathway capable of producing a cannabinoid.
  • Cannabinoid precursors as referenced in the present disclosure include, but are not limited to, the exemplary naturally occurring and synthetic cannabinoid precursors with varying alkyl carbon chain lengths summarized in Table 2 (below).
  • “Conversion” as used herein refers to the enzymatic conversion of a substrate(s) to a corresponding product(s). “Percent conversion” refers to the percent of the substrate that is converted to the product within a period of time under specified conditions. Thus, the “enzymatic activity” or “activity” of an enzymatic conversion can be expressed as “percent conversion” of the substrate to the product.
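Percent conversion as defined above can be computed from the substrate concentration at the start and end of the reaction period. This is a minimal sketch, assuming substrate loss is attributed entirely to the enzymatic conversion; the function name and the concentrations are hypothetical.

```python
def percent_conversion(substrate_initial, substrate_remaining):
    """Percent of substrate converted within the reaction period.

    Assumption: all substrate that disappears was converted to product.
    Both arguments use the same concentration unit (e.g., uM).
    """
    if substrate_initial <= 0:
        raise ValueError("initial substrate must be positive")
    if not 0 <= substrate_remaining <= substrate_initial:
        raise ValueError("remaining substrate must be between 0 and the initial amount")
    return 100.0 * (substrate_initial - substrate_remaining) / substrate_initial


# Hypothetical example: 200 uM CBGA supplied, 50 uM left at the end of the assay.
example = percent_conversion(200.0, 50.0)
```

Because the value depends on the time period and conditions, two percent conversions are only comparable when both were measured under the same specified conditions.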
  • Substrate as used herein in the context of an enzyme mediated process refers to the compound or molecule acted on by the enzyme.
  • “Product” as used herein in the context of an enzyme mediated process refers to the compound or molecule resulting from the activity of the enzyme.
  • “Host cell” as used herein refers to a cell capable of being functionally modified with recombinant nucleic acids and functioning to express recombinant products, including polypeptides and compounds produced by activity of the polypeptides.
  • “Nucleic acid” or “polynucleotide” are used herein interchangeably to refer to two or more nucleosides that are covalently linked together.
  • the nucleic acid may be wholly comprised of ribonucleosides (e.g., RNA), wholly comprised of 2'-deoxyribonucleotides (e.g., DNA), or mixtures of ribo- and 2'-deoxyribonucleosides.
  • the nucleoside units of the nucleic acid can be linked together via phosphodiester linkages (e.g., as in naturally occurring nucleic acids), or the nucleic acid can include one or more non-natural linkages (e.g., phosphorothioester linkage).
  • Nucleic acid or polynucleotide is intended to include single-stranded or double-stranded molecules, or molecules having both single-stranded regions and double-stranded regions.
  • Nucleic acid or polynucleotide is intended to include molecules composed of the naturally occurring nucleobases (i.e., adenine, guanine, uracil, thymine and cytosine), or molecules that include one or more modified and/or synthetic nucleobases, such as, for example, inosine, xanthine, hypoxanthine, etc.
  • “Protein,” “polypeptide,” and “peptide” are used herein interchangeably to denote a polymer of at least two amino acids covalently linked by an amide bond, regardless of length or post-translational modification (e.g., glycosylation, phosphorylation, lipidation, myristoylation, ubiquitination, etc.).
  • A “protein,” “polypeptide,” or “peptide” polymer can include D- and L-amino acids, and mixtures of D- and L-amino acids.
  • Naturally-occurring or wild-type refers to the form as found in nature.
  • a naturally occurring nucleic acid sequence is the sequence present in an organism that can be isolated from a source in nature and which has not been intentionally modified by human manipulation.
  • Non-limiting examples include, among others, recombinant cells expressing genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise expressed at a different level.
  • “Nucleic acid derived from” refers to a nucleic acid having a sequence at least substantially identical to a sequence found naturally in an organism.
  • Examples include cDNA molecules prepared by reverse transcription of mRNA isolated from an organism, or nucleic acid molecules prepared synthetically to have a sequence at least substantially identical to, or which hybridizes to a sequence at least substantially identical to, a nucleic acid sequence found in an organism.
  • Coding sequence refers to that portion of a nucleic acid (e.g., a gene) that encodes an amino acid sequence of a protein.
  • Heterologous nucleic acid refers to any polynucleotide that is introduced into a host cell by laboratory techniques, and includes polynucleotides that are removed from a host cell, subjected to laboratory manipulation, and then reintroduced into a host cell.
  • Codon degenerate describes a nucleotide sequence that has one or more different codons relative to the reference nucleotide sequence but which encodes a polypeptide that is identical to the polypeptide encoded by a reference nucleotide sequence.
  • the different codons between the nucleotide sequence and the reference nucleotide sequence are called “synonyms” or “synonymous” codons in that they use different triplets of nucleotides to encode the same amino acid in a polypeptide.
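Whether a codon change is synonymous (a "neutral codon difference") can be checked against the standard genetic code. The sketch below builds the standard table and tests codon pairs as read from the neutral codon differences listed in this disclosure; pairs whose separator is garbled in the text are an assumed reading, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(old_codon, new_codon):
    """True when the codons differ but encode the same amino acid."""
    return old_codon != new_codon and CODON_TABLE[old_codon] == CODON_TABLE[new_codon]


# Codon pairs as read from the disclosure (assumed reading where the text is garbled).
NEUTRAL_DIFFERENCES = {
    "V75": ("GTA", "GTG"), "H108": ("CAC", "CAT"), "K136": ("AAG", "AAA"),
    "V184": ("GTA", "GTG"), "K187": ("AAG", "AAA"), "R191": ("AGG", "AGA"),
    "G328": ("GGC", "GGT"), "F337": ("TTC", "TTT"), "A368": ("GCC", "GCG"),
    "P404": ("CCT", "CCC"), "L415": ("TTA", "CTG"), "T464": ("ACC", "ACG"),
    "D498": ("GAC", "GAT"), "H516": ("CAT", "CAC"),
}

# Each pair must be synonymous AND encode the residue named in its label.
results = {
    label: is_synonymous(old, new) and CODON_TABLE[old] == label[0]
    for label, (old, new) in NEUTRAL_DIFFERENCES.items()
}
```

Every listed pair passes both checks under the standard code, which is consistent with the disclosure describing them as neutral.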
  • Codon optimized refers to changes in the codons of the polynucleotide encoding a protein to those preferentially used in a particular organism such that the encoded protein is efficiently expressed in the organism of interest.
  • Although the genetic code is degenerate in that most amino acids are represented by several different “synonymous” codons, it is well known that codon usage by particular organisms is nonrandom and biased towards particular codon triplets. This codon usage bias may be higher in reference to a given gene, genes of common function or ancestral origin, highly expressed proteins versus low copy number proteins, and the aggregate protein coding regions of an organism's genome.
  • the polynucleotides encoding the THCA synthase polypeptides may be codon optimized for optimal production from the host organism selected for expression.
  • “Preferred, optimal, high codon usage bias codons” refers to codons that are used at higher frequency in the protein coding regions than other codons that code for the same amino acid.
  • the preferred codons may be determined in relation to codon usage in a single gene, a set of genes of common function or origin, highly expressed genes, the codon frequency in the aggregate protein coding regions of the whole organism, codon frequency in the aggregate protein coding regions of related organisms, or combinations thereof. Codons whose frequency increases with the level of gene expression are typically optimal codons for expression.
  • A variety of methods are known for determining the codon frequency (e.g., codon usage, relative synonymous codon usage) and codon preference in specific organisms, including multivariate analysis, for example, using cluster analysis or correspondence analysis, and the effective number of codons used in a gene (see GCG CodonPreference, Genetics Computer Group Wisconsin Package; CodonW, John Peden, University of Nottingham; McInerney, J. O, 1998, Bioinformatics 14:372-73; Stenico et al., 1994, Nucleic Acids Res. 22:2437-46; Wright, F., 1990, Gene 87:23-29).
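Relative synonymous codon usage (RSCU), one of the codon-frequency measures mentioned above, is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch using the standard genetic code; the function name and the toy coding sequence are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(coding_sequence):
    """Return {codon: RSCU} for every amino acid that occurs in the sequence."""
    seq = coding_sequence.upper()
    # Read complete in-frame codons only; a trailing partial codon is ignored.
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence
        expected = total / len(family)  # equal usage of all synonymous codons
        for c in family:
            values[c] = counts[c] / expected
    return values


# Hypothetical toy sequence: Lys (AAG, AAG, AAA) and Val (GTA, GTG).
usage = rscu("AAGAAGAAAGTAGTG")
```

An RSCU above 1 indicates a codon used more often than expected under equal usage, and a value below 1 indicates one used less often.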
  • Codon usage tables are available for a growing list of organisms (see for example, Wada et al., 1992, Nucleic Acids Res. 20:2111-2118; Nakamura et al., 2000, Nucl. Acids Res. 28:292; Duret, et al., supra; Henaut and Danchin, "Escherichia coli and Salmonella,"
  • the data source for obtaining codon usage may rely on any available nucleotide sequence capable of coding for a protein.
  • These data sets include nucleic acid sequences actually known to encode expressed proteins (e.g., complete protein coding sequences-CDS), expressed sequence tags (ESTS), or predicted coding regions of genomic sequences (see for example, Mount, D., Bioinformatics: Sequence and Genome Analysis, Chapter 8, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 2001 ; Uberbacher, E. C., 1996, Methods Enzymol. 266:259-281 ; Tiwari et al.,
  • Control sequence refers to all sequences, which are necessary or advantageous for the expression of a polynucleotide and/or polypeptide as used in the present disclosure.
  • Each control sequence may be native or foreign to the nucleic acid sequence encoding a polypeptide.
  • control sequences include, but are not limited to, a leader, a promoter, a polyadenylation sequence, a pro-peptide sequence, a signal peptide sequence, and a transcription terminator.
  • control sequences typically include a promoter, and transcriptional and translational stop signals.
  • the control sequences may be provided with linkers for the purpose of introducing specific restriction sites facilitating ligation of the control sequences with the coding region of the nucleic acid sequence encoding a polypeptide.
  • “Operably linked” as used herein refers to a configuration in which a control sequence is appropriately placed (e.g., in a functional relationship) at a position relative to a polynucleotide sequence or polypeptide sequence of interest such that the control sequence directs or regulates the expression of the sequence of interest.
  • Promoter sequence refers to a nucleic acid sequence that is recognized by a host cell for expression of a polynucleotide of interest, such as a coding sequence.
  • the promoter sequence contains transcriptional control sequences, which mediate the expression of a polynucleotide of interest.
  • the promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.
  • “Percentage of sequence identity,” “percent sequence identity,” “percent sequence homology,” or “percent homology” are used interchangeably herein to refer to values quantifying comparisons of the sequences of polynucleotides or polypeptides, and are determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (or gaps) as compared to the reference sequence for optimal alignment of the two sequences.
  • the percentage values may be calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
  • Alternatively, the percentage may be calculated by determining the number of positions at which either the identical nucleic acid base or amino acid residue occurs in both sequences or a nucleic acid base or amino acid residue is aligned with a gap to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
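The two ways of counting matched positions described above can be sketched for a pair of sequences that have already been aligned. This is an illustration only, assuming `-` marks a gap and that no column has a gap in both sequences; the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, count_gaps_as_matches=False):
    """Percent identity over a window of two pre-aligned, equal-length sequences.

    count_gaps_as_matches=False: only identical residues count as matched.
    count_gaps_as_matches=True: residues aligned with a gap also count as matched,
    following the alternative calculation described in the text.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matched = 0
    for a, b in zip(aligned_a, aligned_b):
        if "-" in (a, b):
            if count_gaps_as_matches:
                matched += 1
        elif a == b:
            matched += 1
    # Divide by the total number of positions in the comparison window.
    return 100.0 * matched / len(aligned_a)


# Hypothetical toy alignment: one substitution (H>N) and one gap.
reference = "MKHLAVCDEF"
candidate = "MKNLA-CDEF"
strict = percent_identity(reference, candidate)
lenient = percent_identity(reference, candidate, count_gaps_as_matches=True)
```

The two conventions give different values for the same alignment (80% versus 90% here), so a stated identity threshold is only meaningful together with the counting rule used.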
  • Those of skill in the art appreciate that there are many established algorithms available to align two sequences.
  • Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman, 1981 , Adv. Appl. Math. 2:482, by the homology alignment algorithm of Needleman and Wunsch, 1970, J. Mol. Biol. 48:443, by the search for similarity method of Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the GCG Wisconsin Software Package), or by visual inspection (see generally, Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc.
  • Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0).
  • For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached.
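The three halting conditions can be illustrated with a simplified, ungapped extension of a word hit in one direction. This is a teaching sketch, not the BLAST implementation: it extends only to the right, scores nucleotides with a match reward M and mismatch penalty N, and the default values of M, N, and X are hypothetical.

```python
def extend_right(query, subject, q_start, s_start, seed_len, M=1, N=-3, X=5):
    """Extend an exact-match seed to the right; return (best_score, best_length).

    Extension halts when the score drops by X or more from its maximum,
    when the cumulative score reaches zero or below, or when either
    sequence ends.
    """
    score = seed_len * M  # the seed is assumed to be an exact match
    best_score, best_len = score, seed_len
    i = seed_len
    while q_start + i < len(query) and s_start + i < len(subject):
        score += M if query[q_start + i] == subject[s_start + i] else N
        i += 1
        if score > best_score:
            best_score, best_len = score, i
        if best_score - score >= X or score <= 0:
            break
    return best_score, best_len


# Hypothetical sequences: identical for 7 bases, then diverging.
result = extend_right("ACGTACGTTTTT", "ACGTACGAAAAA", 0, 0, seed_len=4)
```

A larger X lets the extension tolerate longer runs of mismatches before giving up, which raises sensitivity at the cost of speed.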
  • the BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
  • the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, 1992, Proc Natl Acad Sci USA 89:10915).
  • Exemplary determination of sequence alignment and % sequence identity can employ the BESTFIT or GAP programs in the GCG Wisconsin Software package (Accelrys, Madison Wis.), using default parameters provided.
  • Reference sequence refers to a defined sequence used as a basis for a sequence comparison.
  • a reference sequence may be a subset of a larger sequence, for example, a segment of a full-length nucleic acid or polypeptide sequence.
  • a reference sequence typically is at least 20 nucleotide or amino acid residue units in length, but can also be the full length of the nucleic acid or polypeptide.
  • Because two polynucleotides or polypeptides may each (1) comprise a sequence (i.e., a portion of the complete sequence) that is similar between the two sequences, and (2) may further comprise a sequence that is divergent between the two sequences, sequence comparisons between two (or more) polynucleotides or polypeptides are typically performed by comparing sequences of the two polynucleotides or polypeptides over a “comparison window” to identify and compare local regions of sequence similarity.
  • Comparison window refers to a conceptual segment of at least about 20 contiguous nucleotide positions or amino acid residues wherein a sequence may be compared to a reference sequence of at least 20 contiguous nucleotides or amino acids and wherein the portion of the sequence in the comparison window may comprise additions or deletions (or gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
  • “Substantial identity” or “substantially identical” refers to a polynucleotide or polypeptide sequence that has at least 70% sequence identity, at least 80% sequence identity, at least 85% sequence identity, at least 90% sequence identity, at least 95% sequence identity, or at least 99% sequence identity, as compared to a reference sequence over a comparison window of at least 20 nucleotide or amino acid residue positions, frequently over a window of at least 30-50 positions, wherein the percentage of sequence identity is calculated by comparing the reference sequence to a sequence that includes deletions or additions which total 20 percent or less of the reference sequence over the window of comparison.
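  • As an illustration of the thresholds above, the following sketch computes percent identity over a pre-aligned comparison window and tests it against a chosen cutoff. The function names, the "-" gap character, and the choice to count every aligned column (including gapped columns) in the denominator are assumptions for this example; alignment programs may differ in how they treat gaps.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding the same residue in both sequences.

    Both inputs must already be aligned to equal length. Gapped columns
    count toward the window length but never as matches (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b)
                  if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


def is_substantially_identical(aligned_a, aligned_b, threshold=70.0,
                               min_window=20):
    """True if the window is long enough and identity meets the threshold."""
    return (len(aligned_a) >= min_window
            and percent_identity(aligned_a, aligned_b) >= threshold)


if __name__ == "__main__":
    ref = "ACDEFGHIKLMNPQRSTVWY"   # toy 20-residue window, not a real SEQ ID
    var = "ACDEFGHIKLMNPQRSTVAA"   # two residue differences
    print(percent_identity(ref, var))                  # 90.0
    print(is_substantially_identical(ref, var, 95.0))  # False
```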
  • “Corresponding to,” “reference to,” or “relative to” when used in the context of the numbering of a given amino acid or polynucleotide sequence refers to the numbering of the residues of a specified reference sequence when the given amino acid or polynucleotide sequence is compared to the reference sequence.
  • the residue number or residue position of a given polymer is designated with respect to the reference sequence rather than by the actual numerical position of the residue within the given amino acid or polynucleotide sequence.
  • a given amino acid sequence such as that of an engineered imine reductase, can be aligned to a reference sequence by introducing gaps to optimize residue matches between the two sequences. In these cases, although the gaps are present, the numbering of the residue in the given amino acid or polynucleotide sequence is made with respect to the reference sequence to which it has been aligned.
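  • The reference-based numbering convention described above can be sketched as follows. Given two sequences that have already been aligned with gap characters, the function reports, for each reference position, the residue of the given sequence aligned to it together with that residue's own position. The function name and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def number_by_reference(aligned_ref, aligned_query, gap="-"):
    """Map reference positions (1-based) to (query_position, query_residue).

    Columns where either sequence has a gap are skipped, so a residue in
    the query is always named by the reference position it aligns to.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    ref_pos = query_pos = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res != gap:
            ref_pos += 1
        if query_res != gap:
            query_pos += 1
        if ref_res != gap and query_res != gap:
            mapping[ref_pos] = (query_pos, query_res)
    return mapping


if __name__ == "__main__":
    # The query lacks the first two reference residues (e.g., a truncation).
    mapping = number_by_reference("MKTAYI", "--TAYV")
    # The final V is residue 4 of the query but is designated position 6.
    print(mapping[6])
```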
  • isolated as used herein in reference to a molecule means that the molecule (e.g., cannabinoid, polynucleotide, polypeptide) is substantially separated from other compounds that naturally accompany it, e.g., protein, lipids, and polynucleotides.
  • the term embraces nucleic acids which have been removed or purified from their naturally-occurring environment or expression system (e.g., host cell or in vitro synthesis).
  • substantially pure refers to a composition in which a desired molecule is the predominant species present (i.e., on a molar or weight basis it is more abundant than any other individual macromolecular species in the composition), and is generally a substantially purified composition when the object species comprises at least about 50 percent of the macromolecular species present by mole or % weight.
  • “Recovered” as used herein in relation to an enzyme, protein, or cannabinoid compound refers to a more or less pure form of the enzyme, protein, or cannabinoid.
  • the present disclosure provides engineered genes that encode recombinant polypeptides having THCA synthase activity.
  • a recombinant host cell e.g., S. cerevisiae
  • CBGA cannabigerolic acid
  • the THCA synthase product cannabinoid, Δ9-tetrahydrocannabinolic acid (Δ9-THCA), is produced by the host cell in greater yield relative to a comparable recombinant host cell integrated with the Cannabis sativa THCA synthase (“d28_THCAS”), which corresponds to the polypeptide of SEQ ID NO: 18.
  • the reaction in C. sativa catalyzed by the d28_THCAS polypeptide is the oxidative cyclization of the monoterpene moiety of cannabigerolic acid (CBGA) (compound (2)) coupled with the reduction of FAD co-substrate, to form the cannabinoid product Δ9-THCA (compound (1)), as shown in Scheme 1.
  • the recombinant polypeptides with THCA synthase activity of the present disclosure when incorporated in a recombinant host cell comprising a pathway that produces a cannabinoid, such as CBGA (compound (2)), are capable, in the presence of FAD, of oxidatively cyclizing that substrate to form a cannabinoid product, such as THCA (compound (1)).
  • the enhanced yield of the cyclized cannabinoid product is correlated with one or more residue differences in recombinant polypeptides of the present disclosure, as compared to the d28_THCAS amino acid sequence of SEQ ID NO: 18, and/or correlated with codon differences in the nucleotide sequences encoding the polypeptides, as compared to the recombinant nucleic acid sequence of SEQ ID NO: 17.
  • Exemplary engineered genes and encoded recombinant polypeptides with THCA synthase activity that exhibit the unexpected and surprising technical effect of increased cannabinoid product yield when integrated in a recombinant host cell are summarized in Table 3 below.
  • the recombinant polypeptides having THCA synthase activity and increased activity have one or more residue differences as compared to the reference C. sativa THCA synthase polypeptide of SEQ ID NO: 18.
  • the recombinant polypeptides have one or more residue differences at residue positions selected from R3, K12, A19, L23, H28, L31, M33, Q41, L43, S72, V75, H108, K137, V151, G207, A214, K233, K234, D256, D258, I266, K268, H274, K276, H282, V293, N301, F317, F332, A335, G348, S354, T367, T418, Y472, K496, N500, and H517.
  • the amino acid residue differences are: R3V, K12G, A19E, A19G, A19Q, L23I, L23V, H28G, H28N, H28Q, L31D, L31G, M33D, Q41R, L43G, L43S, S72A, V75A, V75Y, H108R, K137C, K137F, K137M, K137S, K137Y, V151G, G207A, A214T, K233G, K233S, K233T, K234R, D256S, D258R, I266Q, K268E, H274C, H274E, H274Q, K276Q, H282L, V293I, N301D, F317L, F332L, A335C, A335T, G348A, S354C, T367E, T418V, Y472I, K496E, K496Q, N500D,
  • residue differences relative to SEQ ID NO: 18 at residue positions associated with increased THCA synthase activity can be used in various combinations to form recombinant THCA synthase polypeptides having desirable functional characteristics when integrated in a recombinant host cell, for example increased yield of the cannabinoid product compound, THCA.
  • Some exemplary combinations are described in Table 3 and elsewhere herein.
  • the present disclosure provides a recombinant polypeptide having increased THCA synthase activity and amino acid residue differences as compared to SEQ ID NO: 18 at the following pairs of positions: H23 and A335; Q41 and D258; L43 and K276; S72 and V293; K234 and K496; I266 and H517; N301 and Y472; A335 and G348; and A335 and H517.
  • the recombinant polypeptides can have at least the following residue differences in combination: H23N and A335C; Q41R and D258R; L43G and K276Q; S72A and V293I; K234R and K496E; I266Q and H517Y; N301D and Y472I; A335T and G348A; and A335T and H517V.
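  • The residue-difference notation used above (for example, A335T means alanine at reference position 335 replaced by threonine) lends itself to a small parser. The sketch below is illustrative only: the function names are hypothetical, and the five-residue reference string is a toy sequence, not SEQ ID NO: 18.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'A335T' into ('A', 335, 'T')."""
    m = _SUBSTITUTION.match(label)
    if not m:
        raise ValueError(f"not a substitution label: {label!r}")
    return m.group(1), int(m.group(2)), m.group(3)


def apply_substitutions(reference, labels):
    """Return the reference sequence with each labeled substitution applied.

    Raises ValueError if a position is out of range or the wild-type
    residue in the label does not match the reference at that position.
    """
    seq = list(reference)
    for label in labels:
        wild_type, pos, new = parse_substitution(label)
        if not 1 <= pos <= len(seq):
            raise ValueError(f"{label}: position outside the reference")
        if reference[pos - 1] != wild_type:
            raise ValueError(f"{label}: reference has {reference[pos - 1]}")
        seq[pos - 1] = new
    return "".join(seq)


if __name__ == "__main__":
    toy_reference = "MNRSA"  # hypothetical toy sequence
    print(parse_substitution("A335T"))
    print(apply_substitutions(toy_reference, ["R3V", "A5T"]))
```

  • Checking the wild-type letter against the reference catches labels whose residue and position do not agree, which is a common transcription error in long substitution lists.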
  • polypeptide comprises an amino acid sequence comprising one or more of the amino acid differences or sets of amino acid differences (relative to SEQ ID NO: 18) disclosed in any one of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162,
  • a recombinant polypeptide of the present disclosure having THCA synthase activity can have an amino acid sequence comprising one or more of the amino acid differences or sets of amino acid differences (relative to SEQ ID NO: 18) disclosed in any one of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, and 82, and additionally have 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-14, 1-15, 1-16, 1-18, 1-20, 1-22, 1-24, 1-26, 1-30, 1-35, 1-40, 1-45, 1-50, 1-55, or 1-60 residue differences at other residue positions.
  • the number of differences can be 1, 2, 3, 4, 5, 6, 7,
  • any of the engineered THCA synthase polypeptides disclosed herein can further comprise other residue differences relative to the reference polypeptide of SEQ ID NO: 18 at other residue positions.
  • Residue differences at these other residue positions can provide for additional variations in the amino acid sequence without adversely affecting the ability of the recombinant polypeptide to carry out the desired biocatalytic conversion (e.g., conversion of compound (2) to compound (1)).
  • the recombinant polypeptides can have additionally 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-14, 1-15, 1-16, 1-18, 1-20, 1-22, 1-24, 1-26, 1-30, 1-35, or 1-40 residue differences at other amino acid residue positions as compared to SEQ ID NO: 18.
  • the number of differences can be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 18, 20, 22, 24, 26, 30, 35, or 40 residue differences at other residue positions.
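  • Counting residue differences against a reference is a simple position-by-position comparison. The sketch below assumes the two sequences have equal length and are already in register (no insertions or deletions); the function name and toy sequences are hypothetical.

```python
def residue_differences(reference, variant):
    """List differences as labels such as 'R3V' (1-based positions).

    Assumes equal-length, ungapped sequences that are already in register.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [f"{ref}{pos}{var}"
            for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
            if ref != var]


if __name__ == "__main__":
    diffs = residue_differences("MNRSA", "MNVSG")  # toy sequences
    print(diffs, len(diffs))
```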
  • the residue difference at these other positions can include conservative changes or non-conservative changes.
  • the residue differences can comprise conservative substitutions and non-conservative substitutions as compared to the reference polypeptide of SEQ ID NO: 18.
  • the recombinant polypeptides of the disclosure can be in the form of fusion polypeptides in which the engineered polypeptides are fused to other polypeptides, such as, by way of example and not limitation, antibody tags (e.g., myc epitope), purification sequences (e.g., His tags for binding to metals), and cell localization signals (e.g., secretion signals).
  • the recombinant polypeptides described herein can be used with or without fusions to other polypeptides.
  • the recombinant polypeptides described herein are not restricted to the genetically encoded amino acids.
  • the polypeptides described herein may be comprised, either in whole or in part, of naturally-occurring and/or synthetic non-encoded amino acids.
  • THCA cannabinoid
  • a recombinant host cell e.g., yeast
  • THCA synthase enzyme that converts CBGA to THCA.
  • the present disclosure contemplates that any of the recombinant polypeptides having THCA synthase activity of the present disclosure may be used as a fusion polypeptide construct with an N-terminal secretion peptide, particularly where the recombinant polypeptide is expressed in a recombinant host cell (e.g., yeast) as described elsewhere herein.
  • N-terminal secretion peptide (SP) sequences include those disclosed elsewhere herein, including Table 4, the Examples and accompanying Sequence Listing, and those disclosed as fusions with d28_THCAS in US Provisional Patent Application No. 63/164,510, filed March 22, 2021, which is hereby incorporated by reference herein.
  • the present disclosure provides polynucleotides encoding the recombinant polypeptides having THCA synthase activity and increased activity and/or yield as described herein.
  • the polynucleotide encodes a recombinant polypeptide having THCA synthase activity that comprises an amino acid sequence that is at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more identical to the polypeptide sequence of SEQ ID NO: 18.
  • the polynucleotide encodes a recombinant polypeptide comprising an amino acid sequence that has the percent identity described above and has one or more amino acid residue differences as compared to SEQ ID NO: 18 described elsewhere herein.
  • the polynucleotide has a sequence encoding a recombinant polypeptide that does not include an amino acid difference relative to SEQ ID NO: 18, but which polynucleotide sequence has one or more codon differences relative to SEQ ID NO: 17, which codon differences result in increased yield of the cannabinoid product produced by a recombinant host cell in which the polynucleotide sequence is integrated.
  • the polynucleotide has a sequence of at least 80% identity to SEQ ID NO: 17, and a codon difference as compared to SEQ ID NO: 17 at a position encoding an amino acid residue selected from: V75, H108, K136, V184, K187, R191, G328, F337, A368, P404, L415, T464, D498, and H516.
  • the codon differences at positions V75, H108, K136, V184, K187, R191, G328, F337, A368, P404, L415, T464, D498, and H516 are selected from: V75 (GTA>GTG), H108 (CAC>CAT), K136 (AAG>AAA), V184 (GTA>GTG), K187 (AAG>AAA), R191R (AGG>AGA), G328 (GGOGGT), F337 (TTC>TTT), A368A (GCC>GCG), P404 (CCT>CCC), L415L (TTA>CTG), T464T (ACC>ACG), D498 (GAC>GAT), and H516 (CAT>CAC).
  • the polynucleotides encoding the recombinant polypeptides having THCA synthase activity and increased activity and/or yield as described herein can include a combination of one or more codon differences relative to SEQ ID NO: 17, wherein at least one of the codon differences encodes an amino acid difference as compared to SEQ ID NO: 18 and at least one codon difference does not encode an amino acid difference as compared to SEQ ID NO: 18.
  • the present disclosure provides a polynucleotide sequence encoding a recombinant polypeptide having THCA synthase activity, wherein the polynucleotide sequence comprises a combination of a codon difference encoding an amino acid difference and a codon difference selected from: V75 (GTA>GTG), H108 (CAC>CAT), K136 (AAG>AAA), V184 (GTA>GTG), K187 (AAG>AAA), R191R (AGG>AGA),
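  • Whether a codon difference encodes an amino acid difference can be checked against the standard genetic code. The sketch below builds the standard codon table and classifies a codon change as synonymous or not; the function names are hypothetical, and the example changes are ones quoted in the text above (e.g., GTA>GTG, AAG>AAA, TTA>CTG, CAT>CAC).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)}


def is_synonymous(old_codon, new_codon):
    """True if the codons differ but encode the same amino acid."""
    return (old_codon != new_codon
            and CODON_TABLE[old_codon] == CODON_TABLE[new_codon])


def describe_change(old_codon, new_codon):
    """Return a label such as 'V>V (synonymous)' or 'H>Q (amino acid change)'."""
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    kind = ("synonymous" if is_synonymous(old_codon, new_codon)
            else "amino acid change")
    return f"{old_aa}>{new_aa} ({kind})"


if __name__ == "__main__":
    for old, new in [("GTA", "GTG"), ("AAG", "AAA"), ("TTA", "CTG"),
                     ("CAT", "CAC"), ("CAT", "CAA")]:
        print(old, new, describe_change(old, new))
```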
  • the polynucleotide comprises a sequence encoding an exemplary recombinant polypeptide having THCA synthase activity as disclosed in Table 3 and accompanying Sequence Listing.
  • the polynucleotide comprises a sequence of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identity to a sequence selected from the group consisting of SEQ ID NO: 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 131, 133, 135, 137, 139, 141, 143, 145,
  • the polynucleotide comprises a codon degenerate sequence of a sequence selected from the group consisting of SEQ ID NO: 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, and 167.
  • the polynucleotide sequences encoding the recombinant polypeptides of the present disclosure may be operatively linked to one or more heterologous regulatory sequences that control gene expression to create a recombinant polynucleotide capable of expressing the polypeptide.
  • Expression constructs containing a heterologous polynucleotide encoding the recombinant polypeptide can be introduced into appropriate host cells to express the corresponding polypeptide. Because of the knowledge of the codons corresponding to the various amino acids, availability of a protein sequence provides a description of all the polynucleotides capable of encoding the subject.
  • the codons can be selected to fit the host cell in which the protein is being produced.
  • preferred codons used in bacteria are used to express the gene in bacteria; preferred codons used in yeast are used for expression in yeast; and preferred codons used in mammals are used for expression in mammalian cells. It is contemplated that all codons need not be replaced to optimize the codon usage of the recombinant polypeptide since the natural sequence will comprise preferred codons and because use of preferred codons may not be required for all amino acid residues. Consequently, codon optimized polynucleotides encoding the recombinant polypeptide may contain preferred codons at about 40%, 50%, 60%, 70%, 80%, or greater than 90% of codon positions of the full length coding region.
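  • The fraction of codon positions that carry a preferred codon can be computed directly from a coding sequence. In the sketch below, the function name and the small preferred-codon set are hypothetical placeholders; a real analysis would substitute a codon usage table for the intended host.

```python
def preferred_codon_fraction(coding_sequence, preferred_codons):
    """Fraction of codons in the coding sequence found in preferred_codons.

    The sequence must be non-empty and a whole number of codons long.
    """
    if not coding_sequence or len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence must be a non-empty multiple of 3")
    codons = [coding_sequence[i:i + 3]
              for i in range(0, len(coding_sequence), 3)]
    hits = sum(1 for codon in codons if codon in preferred_codons)
    return hits / len(codons)


if __name__ == "__main__":
    # Hypothetical preferred set for illustration only, not real host data.
    toy_preferred = {"GTG", "AAA", "CAC"}
    fraction = preferred_codon_fraction("GTGAAACATGTA", toy_preferred)
    print(f"{fraction:.0%} of codon positions use a preferred codon")
```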
  • the present disclosure also provides an expression vector comprising a polynucleotide encoding a recombinant polypeptide having increased THCA synthase activity, and one or more expression regulating regions such as a promoter, a terminator, a replication origin, or the like, depending on the type of hosts into which they are to be introduced.
  • the various nucleic acid and control sequences described above may be joined together to produce a recombinant expression vector which may include one or more convenient restriction sites to allow for insertion or substitution of the nucleic acid sequence encoding the recombinant polypeptide at such sites.
  • a polynucleotide sequence of the present disclosure may be expressed by inserting the nucleic acid sequence or a nucleic acid construct comprising the sequence into an appropriate vector for expression.
  • the coding sequence is located in the vector so that the coding sequence is operably linked with the appropriate control sequences for expression.
  • the recombinant expression vector may be any vector (e.g., a plasmid or virus), which can be conveniently subjected to recombinant DNA procedures and can bring about the expression of the polynucleotide sequence.
  • the choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced.
  • the vectors may be linear or closed circular plasmids.
  • the expression vector may be an autonomously replicating vector, i.e., a vector that exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a mini-chromosome, or an artificial chromosome.
  • the vector may contain any means for assuring self-replication.
  • the vector may be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated.
  • the expression vector further comprises one or more selectable markers, which permit easy selection of transformed cells.
  • the present disclosure also provides a host cell comprising a polynucleotide or expression vector encoding a recombinant polypeptide of the present disclosure, wherein the polynucleotide is operatively linked to one or more control sequences for expression of the polypeptide having THCA synthase activity in the host cell.
  • Host cells for use in expressing the polypeptides encoded by the expression vectors of the present invention are well known in the art and include but are not limited to, bacterial cells, such as E.
  • the present disclosure provides a method for producing a cannabinoid comprising: (a) culturing in a suitable medium a recombinant host cell of the present disclosure; and (b) recovering the produced cannabinoid.
  • the recombinant polynucleotides of the present disclosure that encode recombinant polypeptides having THCA synthase activity can be incorporated into recombinant host cells for enhanced in vivo cannabinoid biosynthesis.
  • the recombinant polynucleotides can be incorporated into a pathway capable of producing a cannabinoid, such as CBGA and CBGVA, and thereby provide the THCA synthase activity for biosynthesis of further cyclized cannabinoids, such as THCA and THCVA, by the cells.
  • HA hexanoic acid
  • CBGA cannabigerolic acid
  • the recombinant polynucleotides encoding polypeptides having THCA synthase activity of the present disclosure can be integrated into recombinant host cells with a shorter cannabinoid pathway capable of converting the cannabinoid precursor, olivetolic acid (OA) to cannabigerolic acid (CBGA).
  • the recombinant host cells exhibit enhanced yields of the further cyclized cannabinoid product, THCA, when fed the OA compound.
  • the cannabinoid pathway of the recombinant host cell is made up of a sequence of linked enzymes that produce a cannabinoid precursor substrate (e.g., OA) and then convert that precursor to a prenylated cannabinoid compound (e.g., CBGA).
  • the pathway comprises at least a THCA synthase capable of oxidatively cyclizing the monoterpene moiety of the prenylated cannabinoid compound using a redox acceptor cosubstrate, such as FAD. Further decarboxylation of the produced cannabinoid compound can also be part of the cannabinoid pathway.
  • cannabinoid compounds can be produced biosynthetically by a recombinant host cell integrated with such a cannabinoid pathway.
  • Methods and techniques for integrating polynucleotides expressing pathway enzymes into recombinant host cells, such as yeast, are well known in the art and described elsewhere herein including the Examples.
  • One exemplary cannabinoid pathway is depicted in FIG. 1. As shown in FIG. 1, this pathway is capable of converting hexanoic acid (HA) to the cannabinoid, cannabigerolic acid (CBGA).
  • the pathway of FIG. 1 includes the sequence of four enzymes: (1) acyl activating enzyme (AAE), a CoA ligase enzyme of class E.C. 6.2.1.1, or a fatty acyl-CoA ligase (FACL) of class E.C. 6.2.1.3 (e.g., FAA1 or FAA4); (2) olivetol synthase (OLS), a CoA synthase enzyme of class E.C.
  • AAE acyl activating enzyme
  • FACL fatty acyl-CoA ligase
  • OLS olivetol synthase
  • OAC olivetolic acid cyclase
  • PT prenyltransferase
  • the first two enzymes carry out the conversion of the HA starting compound to the precursor tetraketide-CoA compound, 3,5,7-trioxododecanoyl-CoA.
  • the third enzyme, OAC, catalyzes the CoA lyase reaction and cyclization of the tetraketide-CoA to provide the cannabinoid precursor, olivetolic acid (OA).
  • the prenyltransferase activity of the fourth enzyme catalyzes the prenylation of OA with geranyl pyrophosphate (GPP), thereby forming the cannabinoid compound, CBGA.
  • GPP geranyl pyrophosphate
  • further enzymatic modification of the prenylated cannabinoid compound, CBGA, to provide cannabinoids, such as CBDA, THCA, and/or CBCA can be carried out by including a cannabinoid synthase (e.g., CBDAS, THCAS) as a fifth enzyme in the pathway.
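  • The enzyme sequence described above can be modeled as an ordered list of conversion steps, which makes it easy to see why a host cell fed OA needs only the later part of the pathway. The sketch below is illustrative: the data structure and function name are hypothetical, and the hexanoyl-CoA intermediate between AAE and OLS is supplied from general knowledge of acyl activating enzymes rather than from the passage above.

```python
# (enzyme, substrate, product) in pathway order. Co-substrates such as
# geranyl pyrophosphate (GPP) and FAD are omitted for brevity.
PATHWAY = [
    ("AAE", "HA", "hexanoyl-CoA"),             # intermediate is an assumption
    ("OLS", "hexanoyl-CoA", "tetraketide-CoA"),
    ("OAC", "tetraketide-CoA", "OA"),
    ("PT", "OA", "CBGA"),
    ("THCAS", "CBGA", "THCA"),
]


def trace_route(start_compound, steps=PATHWAY):
    """Follow the ordered steps from a starting compound.

    Returns (compounds, enzymes): the compounds visited and the enzymes
    required to get from the start to the end of the pathway.
    """
    compounds, enzymes = [start_compound], []
    current = start_compound
    for enzyme, substrate, product in steps:
        if substrate == current:
            enzymes.append(enzyme)
            compounds.append(product)
            current = product
    return compounds, enzymes


if __name__ == "__main__":
    print(trace_route("HA"))  # full five-enzyme route to THCA
    print(trace_route("OA"))  # shorter route when the cells are fed OA
```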
  • Exemplary cannabinoid pathway enzymes that can be introduced into a recombinant host cell to provide the pathways as illustrated in FIGS. 1 and 2 include, but are not limited to, the enzymes derived from C. sativa, AAE1, OLS, OAC, PT4, CBDAS, and/or THCAS, listed in Table 4 (below), and homologs and variants of these enzymes, as described elsewhere herein.
  • sequences of the exemplary cannabinoid pathway enzymes AAE1, OLS, OAC, PT4, CBDAS, and THCAS listed in Table 4 are naturally occurring sequences from the plant source, Cannabis sativa.
  • the THCAS enzyme of SEQ ID NO: 16 or 18 is replaced in the host cell by a recombinant polynucleotide encoding a recombinant polypeptide having THCA synthase activity of the present disclosure, e.g., a THCA synthase provided in Table 3 and the accompanying Sequence Listing.
  • the other heterologous cannabinoid pathway enzymes used in the recombinant host can include naturally occurring sequence homologs of the AAE1, OLS, OAC, and PT4 enzymes and/or enzymes having non-naturally occurring sequences.
  • enzymes with amino acid sequences engineered to function optimally in a particular enzyme pathway, and/or optimally for production of a particular cannabinoid, and/or optimally in a particular host can include naturally occurring sequence homologs of the AAE1, OLS, OAC, and PT4 enzymes and/or enzymes having non-naturally occurring sequences.
  • cannabinoid pathway enzymes contemplated by the present disclosure include modification of the enzyme’s amino acid sequence at either its N- or C- terminus by truncation or fusion.
  • versions of the AAE1, OLS, OAC, and/or PT4 enzymes that are engineered with amino acid substitutions and/or truncated at the N- or C-terminus can be prepared using methods known in the art, and used in the compositions and methods of the present disclosure.
  • a PT4 enzyme of SEQ ID NO: 8 that is truncated at the N-terminus by 82 amino acids can be used.
  • the amino acid sequence of such a truncated PT4 is provided herein as the d82_PT4 enzyme of SEQ ID NO: 10.
  • the pathway capable of producing a cannabinoid comprises at least enzymes having an amino acid sequence of at least 90% identity to SEQ ID NO: 2 (AAE1), SEQ ID NO: 4 (OLS), SEQ ID NO: 6 (OAC), SEQ ID NO: 10 (d82_PT4), and an amino acid sequence of at least 90% identity to a recombinant polypeptide having THCA synthase activity of the present disclosure as provided in Tables 3, 6, and 8, and the accompanying Sequence Listing.
  • the present disclosure provides engineered recombinant polypeptides that have THCA synthase activity and which exhibit enhanced THCA production when expressed in a recombinant host cell with a cannabinoid pathway capable of producing CBGA.
  • The amino acid sequences of these engineered polypeptides include one or more amino acid differences relative to the naturally occurring THCA synthase sequence of SEQ ID NO: 18 that result in increased THCA titer from the cells when fed hexanoic acid.
  • the present disclosure provides a recombinant host cell comprising nucleic acids encoding a cannabinoid pathway comprising an engineered polypeptide with THCA synthase activity capable of converting CBGA to THCA, wherein the production of THCA by the recombinant host cell is increased at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or more, relative to a control recombinant host cell comprising a pathway with the recombinant polypeptide having THCA synthase activity replaced by the naturally occurring THCA synthase polypeptide of SEQ ID NO: 18.
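  • The fold-increase comparison described above is a ratio of the THCA titer from the engineered host cell to that of the control host cell expressing the SEQ ID NO: 18 polypeptide. The sketch below shows the arithmetic; the function names are hypothetical and the titer values are made-up placeholders, not results from the Examples.

```python
def fold_increase(variant_titer, control_titer):
    """Ratio of variant titer to control titer (same units for both)."""
    if control_titer <= 0:
        raise ValueError("control titer must be positive")
    return variant_titer / control_titer


def highest_threshold_met(fold, thresholds=(2, 3, 4, 5)):
    """Largest 'at least N-fold' threshold satisfied, or None if none is."""
    met = [t for t in thresholds if fold >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Placeholder titers in arbitrary but matching units (assumption).
    fold = fold_increase(variant_titer=90.0, control_titer=30.0)
    print(fold, highest_threshold_met(fold))
```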
  • This increased THCA titer is achieved due to the host cell including a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18 at one or more positions selected from: R3, L23, H28, L31, M33, L43, S72, V75, K137, G207, K233, I266, K268, K276, H282, V293, N301, A335, G348, T418, Y472, N500, and H517.
  • specific exemplary amino acid residue differences relative to SEQ ID NO: 18 include one or more of: R3V, L23I, L23V, H28G, H28Q, L31D, L31G, M33D, L43G, L43S, S72A, V75A, V75Y, K137C, K137F, K137M, K137S, K137Y, G207A, K233G, K233S, K233T, I266Q, K268E, K276Q, H282L, V293I, N301D, A335T, G348A, T418V, Y472I, N500D, H517R, H517V, and H517Y.
  • the recombinant polynucleotide encoding the engineered polypeptide with THCA synthase activity can further include neutral codon differences at certain amino acid encoding positions that result in enhanced THCA titer from recombinant host cells.
  • Exemplary neutral codon differences resulting in enhanced THCA titer from host cells fed hexanoic acid (HA) include: V75 (GTA>GTG), V184 (GTA>GTG), G328 (GGOGGT), F337 (TTC>TTT), P404 (CCT>CCC), D498 (GAC>GAT), and H516 (CAT>CAC).
  • the present disclosure provides a recombinant host cell comprising a pathway capable of producing a cannabinoid, wherein the pathway comprises enzymes capable of catalyzing reactions (i)-(iv):
  • exemplary enzymes capable of catalyzing reactions are: (i) acyl activating enzyme (AAE); (ii) olivetol synthase (OLS); (iii) olivetolic acid cyclase (OAC); and (iv) prenyltransferase (PT).
  • AAE acyl activating enzyme
  • OLS olivetol synthase
  • OAC olivetolic acid cyclase
  • PT prenyltransferase
  • the cannabinoid compound, CBGA, that is produced by the four enzyme cannabinoid pathway of FIG. 1, can be further converted to any of at least three other different cannabinoid compounds, Δ9-tetrahydrocannabinolic acid (THCA), cannabidiolic acid (CBDA), and/or cannabichromenic acid (CBCA).
  • CBDA cannabidiolic acid
  • CBCA cannabichromenic acid
  • This further enzymatic cyclization of CBGA can include the conversion of (v) CBGA to Δ9-THCA; (vi) CBGA to CBDA; and/or (vii) CBGA to CBCA, as shown in the reaction schemes below.
  • a recombinant host cell comprising a pathway capable of converting hexanoic acid to CBGA (or simply OA to CBGA) can be further extended to include enzymes capable of catalyzing a reaction (v), (vi), and/or (vii), and thereby produce any or all of the cyclized cannabinoid product compounds.
  • exemplary enzymes capable of catalyzing reaction (v)-(vii) are: (v) THCA synthase (THCAS); (vi) CBDA synthase (CBDAS); and (vii) CBCA synthase (CBCAS).
  • THCAS THCA synthase
  • CBDAS CBDA synthase
  • CBCAS CBCA synthase
  • further integrating into a recombinant host cell a recombinant polynucleotide sequence capable of expressing a cannabinoid synthase (e.g., CBDAS, THCAS, and/or CBCAS) can thus provide a cell capable of biosynthetic production of one or more of the further cyclized cannabinoids, Δ9-THCA, CBDA, and/or CBCA.
  • a recombinant host cell comprising a cannabinoid pathway, such as AAE, OLS, OAC, and PT, capable of converting HA to CBGA, or even a single enzyme pathway of PT, capable of converting OA to CBGA, could be further modified by integrating a recombinant polynucleotide capable of expressing a recombinant polypeptide with THCAS activity of the present disclosure.
  • the addition of the THCAS activity to the pathway allows for the conversion of the cannabinoid, CBGA to the further cyclized cannabinoid, THCA.
  • the resulting cannabinoid pathway combines the pathway of FIG.
  • any of the recombinant polynucleotides encoding recombinant polypeptides having THCAS activity of the present disclosure can be incorporated in a host cell to provide such a pathway.
  • the cannabinoids, Δ9-THCA, CBDA, and CBCA can be further decarboxylated to provide the cannabinoids, Δ9-THC, CBD, and/or CBC. Accordingly, it is contemplated that in some embodiments this further decarboxylation reaction can be carried out under in vitro reaction conditions using the cannabinoid acids separated and/or isolated from the recombinant host cells.
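  • Decarboxylation removes one CO2 from the cannabinoid acid, so the mass of neutral cannabinoid obtainable from a given mass of acid follows from the molar masses. The sketch below works this out for THCA (C22H30O4) and THC (C21H30O2); the function names are hypothetical, the atomic masses are rounded standard values, and complete conversion with no losses is assumed.

```python
# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

THCA = {"C": 22, "H": 30, "O": 4}   # cannabinoid acid
THC = {"C": 21, "H": 30, "O": 2}    # decarboxylated product
CO2 = {"C": 1, "O": 2}


def molar_mass(formula):
    """Molar mass in g/mol from an element-count dictionary."""
    return sum(ATOMIC_MASS[element] * count
               for element, count in formula.items())


def decarboxylated_mass(acid_mass_g):
    """Mass of THC from a mass of THCA, assuming complete conversion."""
    return acid_mass_g * molar_mass(THC) / molar_mass(THCA)


if __name__ == "__main__":
    print(round(molar_mass(THCA), 2), round(molar_mass(THC), 2))
    print(round(decarboxylated_mass(100.0), 1), "g THC per 100 g THCA")
```

  • CBDA and CBCA share the molecular formula of THCA, and CBD and CBC share that of THC, so the same mass ratio applies to those acid and neutral pairs.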
  • cannabinoid pathway enzymes useful in the recombinant host cells and associated methods of the present disclosure are known in the art, and can include naturally occurring enzymes obtained or derived from cannabis plants, or non-naturally occurring enzymes that have been engineered based on the naturally occurring cannabis plant sequences. It is also contemplated that enzymes obtained or derived from other organisms (e.g., microorganisms) having a catalytic activity related to a desired conversion activity useful in a cannabinoid pathway can be engineered for use in a recombinant host cell of the present disclosure.
  • FIGS. 1-2 depict the production of the more common naturally occurring cannabinoids, CBGA, Δ9-THCA, CBDA, and CBCA
  • the recombinant polypeptides, cannabinoid pathways, recombinant host cells, and associated methods of the present disclosure can also be used to biosynthesize a range of additional rarely occurring, and/or synthetic cannabinoid compounds.
  • Table 1 depicts the names and structures of a wide range of exemplary rarely occurring, and/or synthetic cannabinoid compounds that are contemplated for production using the recombinant polypeptides, host cells, compositions and methods of the present disclosure.
  • Table 2 depicts additional rarely occurring, and/or synthetic cannabinoid precursor compounds that could be produced by such recombinant host cells in the pathway for production of certain rarely occurring, and/or synthetic cannabinoid compounds of Table 1.
  • a recombinant host cell that includes a pathway to a cannabinoid and that expresses a recombinant polypeptide having THCA synthase activity of the present disclosure (e.g., as in Tables 3, 6, and 8) can be used for the biosynthetic production of a rarely occurring, and/or synthetic cannabinoid compound, or a composition comprising such a cannabinoid compound.
  • a recombinant host cell of the present disclosure can be used for production of a cannabinoid compound selected from cannabigerolic acid (CBGA), cannabigerol (CBG), cannabidiolic acid (CBDA), cannabidiol (CBD), Δ9-tetrahydrocannabinolic acid (Δ9-THCA), Δ9-tetrahydrocannabinol (Δ9-THC), Δ8-tetrahydrocannabinolic acid (Δ8-THCA), Δ8-tetrahydrocannabinol (Δ8-THC), cannabichromenic acid (CBCA), cannabichromene (CBC), cannabinolic acid (CBNA), cannabinol (CBN), cannabidivarinic acid (CBDVA),
  • compositions and methods of the present disclosure can be used for the production of the more rarely occurring varin series of cannabinoids, CBGVA, Δ9-THCVA, CBDVA, and CBCVA.
  • the varin cannabinoids feature a 3 carbon propyl side-chain rather than the 5 carbon pentyl side chain found in the common cannabinoids, CBGA, Δ9-THCA, CBDA, and CBCA.
  • the pathway capable of producing a cannabinoid comprises enzymes capable of catalyzing reactions (i) - (iv):
  • Exemplary enzymes capable of catalyzing reactions are: (i) acyl activating enzyme (AAE); (ii) olivetol synthase (OLS); (iii) olivetolic acid cyclase (OAC); and (iv) PT.
  • Exemplary enzymes, AAE1, OLS, OAC, and PT4 derived from C. sativa are known in the art and also provided in Table 4 and the accompanying Sequence Listing.
  • the rare varin cannabinoid CBGVA can be converted to the rare varin cannabinoids, cannabidivarinic acid (CBDVA), Δ9-tetrahydrocannabivarinic acid (Δ9-THCVA), and cannabichromevarinic acid (CBCVA).
  • Enzymes capable of carrying out these conversions include the C. sativa CBDA synthase, THCA synthase, and CBCA synthase, respectively.
  • the present disclosure provides a recombinant host cell comprising a pathway capable of converting BA to CBGVA and further comprising an enzyme capable of catalyzing the conversion of (v) CBGVA to Δ9-THCVA; (vi) CBGVA to CBDVA; and/or (vii) CBGVA to CBCVA.
  • the recombinant host cell comprising a pathway capable of converting BA to CBGVA further comprises enzymes capable of catalyzing a reaction (v), (vi), and/or (vii):
  • exemplary enzymes capable of catalyzing the reactions (v)-(vii) are: (v) THCA synthase (THCAS); (vi) CBDA synthase (CBDAS); and (vii) CBCA synthase (CBCAS).
  • Exemplary THCAS, CBDAS, and CBCAS enzymes are provided in Table 4.
  • a recombinant host cell comprising a four enzyme pathway, such as AAE, OLS, OAC, and PT, capable of converting BA to CBGVA, or even a single enzyme pathway of PT, capable of converting the rare cannabinoid precursor, DA to CBGVA, could be further modified by integrating a recombinant polynucleotide capable of expressing a recombinant polypeptide with THCA synthase activity to convert the rare cannabinoid, CBGVA to the cyclized rare cannabinoid, THCVA.
  • the resulting cannabinoid pathway combines the pathway of FIG.
  • any of the recombinant polynucleotides encoding recombinant polypeptides having THCA synthase activity of the present disclosure can be incorporated in a host cell to provide such a combined pathway capable of producing a rare cannabinoid, such as THCVA.
  • the present disclosure provides engineered recombinant polypeptides with THCA synthase activity that exhibit enhanced THCVA production when expressed in a recombinant host cell with a cannabinoid pathway capable of producing CBGVA.
  • The amino acid sequences of these engineered polypeptides include one or more amino acid differences relative to the naturally occurring THCA synthase sequence of SEQ ID NO: 18 that result in increased THCVA titer from the cells when fed butyric acid (BA).
  • the present disclosure provides a recombinant host cell comprising nucleic acids encoding a cannabinoid pathway comprising an engineered polypeptide with THCA synthase activity capable of converting CBGVA to THCVA, wherein the production of THCVA by the recombinant host cell is increased at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or more, relative to a control recombinant host cell comprising a pathway with the recombinant polypeptide having THCA synthase activity replaced by the naturally occurring THCA synthase polypeptide of SEQ ID NO: 18.
  • This increased THCVA titer is achieved due to the host cell including a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18 at one or more positions selected from: K12, A19, H28, Q41, V75, H108, V151, A214, K234, D256, D258, H274, F317, F332, A335, S354, T367, and K496; optionally, wherein the amino acid residue difference as compared to SEQ ID NO: 18 is selected from K12G, A19E, A19G, A19Q, H28N, Q41R, V75A, H108R, V151G, A214T, K234R, D256S, D258R, H274C, H274E, H274Q, F317L, F332L, A335C, S354C, T367E, K496E
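The substitution labels above follow the pattern parent residue, position, new residue (for example, K12G replaces the lysine at position 12 with glycine). As a minimal sketch of how such labels could be parsed and applied to a parent sequence, the Python below uses a hypothetical 20-residue toy sequence, not SEQ ID NO: 18; the function names and the toy sequence are illustrative assumptions and are not part of the disclosure.

```python
import re

# Labels of the form <parent residue><1-based position><new residue>, e.g. "K12G".
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'K12G' into (parent residue, position, new residue)."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    parent, position, new = match.groups()
    return parent, int(position), new


def apply_substitutions(parent_seq, labels):
    """Return a copy of parent_seq with every labelled substitution applied.

    A ValueError is raised when a position is out of range or when the
    residue found in parent_seq differs from the one named in the label.
    """
    residues = list(parent_seq)
    for label in labels:
        parent, position, new = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != parent:
            raise ValueError(f"{label}: parent has {residues[position - 1]} at {position}")
        residues[position - 1] = new
    return "".join(residues)


# Hypothetical toy parent (NOT SEQ ID NO: 18): K at position 12, A at position 19.
TOY_PARENT = "M" * 11 + "K" + "M" * 6 + "A" + "M"
variant = apply_substitutions(TOY_PARENT, ["K12G", "A19E"])
```

Checking the parent residue before substituting guards against numbering mismatches, for example when positions are counted on a sequence that still carries a secretion peptide.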
  • the recombinant polynucleotide encoding the engineered polypeptide with THCA synthase activity can further include neutral codon differences at certain amino acid encoding positions that result in enhanced THCVA titer from recombinant host cells fed butyric acid (BA).
  • neutral codon differences resulting in enhanced THCA titer include: H108 (CAC>CAT), K136 (AAG>AAA), K187 (AAG>AAA), R191R (AGG>AGA), A368A (GCC>GCG), L415L (TTA>CTG), and T464T (ACC>ACG).
  • the rare cannabinoid acids, CBDVA, Δ9-THCVA, and CBCVA can undergo a further decarboxylation reaction to provide the varin cannabinoid products, cannabidivarin (CBDV), Δ9-tetrahydrocannabivarin (Δ9-THCV), and cannabichromevarin (CBCV), respectively.
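A neutral codon difference changes the nucleotide triplet without changing the encoded amino acid. The sketch below checks this property against the standard genetic code, reading each listed pair as original codon > replacement codon (for example, CAC>CAT at H108). The table construction and the helper name `is_neutral` are illustrative assumptions, not part of the disclosure.

```python
from itertools import product

# Standard genetic code; amino acids are listed in TCAG order for codon
# positions 1, 2 and 3 ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# (residue, position, original codon, replacement codon) as read from the text.
CODON_CHANGES = [
    ("H", 108, "CAC", "CAT"),
    ("K", 136, "AAG", "AAA"),
    ("K", 187, "AAG", "AAA"),
    ("R", 191, "AGG", "AGA"),
    ("A", 368, "GCC", "GCG"),
    ("L", 415, "TTA", "CTG"),
    ("T", 464, "ACC", "ACG"),
]


def is_neutral(residue, old_codon, new_codon):
    """True when both codons encode the stated residue (protein unchanged)."""
    return CODON_TABLE[old_codon] == residue and CODON_TABLE[new_codon] == residue


results = {
    f"{residue}{position}": is_neutral(residue, old, new)
    for residue, position, old, new in CODON_CHANGES
}
```

Under this reading every listed change is synonymous, which is consistent with the description of these differences as neutral.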
  • a heterologous cannabinoid pathway comprising the sequence of at least the four enzymes AAE, OLS, OAC, and PT is capable of converting a precursor substrate compound, such as hexanoic acid (HA) to an initial cannabinoid compound, such as cannabigerolic acid (CBGA) or CBGVA.
  • These initial cannabinoid product compounds can themselves be used as a substrate for the in vitro biosynthesis of a range of further cannabinoid product compounds, such as THCA and THCVA, as shown in FIGS. 2 and 4.
  • cannabinoid compounds such as those shown in Table 1, are contemplated for in vivo biosynthetic production in a recombinant host cell of the present disclosure or via a partial or full in vitro biosynthesis process using the recombinant THCAS polypeptides of the present disclosure.
  • the heterologous cannabinoid pathways of the present disclosure can be incorporated (e.g., by recombinant transformation) into a range of host cells to provide a system for biosynthetic production of cannabinoids (e.g., CBGA, CBGVA, CBDA, CBDVA, THCA, THCVA).
  • the host cell used in the recombinant host cells of the present disclosure can be any cell that can be recombinantly modified with nucleic acids and cultured to express the recombinant products of those nucleic acids, including polypeptides and metabolites produced by the activity of the recombinant polypeptides.
  • exemplary host cell sources useful as recombinant host cells of the present disclosure include, but are not limited to, Saccharomyces cerevisiae, Yarrowia lipolytica, Pichia pastoris, and Escherichia coli. It is also contemplated that the host cell source for a recombinant host cell of the present disclosure can include a non- naturally occurring cell source, e.g., an engineered host cell. For example, a non-naturally occurring source host cell, such as a yeast cell previously engineered for improved production of recombinant genes, may be used to prepare the recombinant host cell of the present disclosure.
  • the recombinant host cells of the present disclosure comprise heterologous nucleic acids encoding a pathway of enzymes capable of producing a cannabinoid (e.g., CBGA or CBGVA), and a heterologous nucleic acid comprising a sequence encoding a recombinant polypeptide having THCA synthase activity capable of oxidatively cyclizing a prenylated cannabinoid substrate using a redox active co-substrate, such as FAD, and thereby form a cyclized cannabinoid product, such as THCA or THCVA.
  • nucleic acid sequences encoding the cannabinoid pathway enzymes are known in the art, and provided herein (see e.g., Table 4), and can readily be used in accordance with the present disclosure.
  • the nucleic acid sequence encoding enzymes which form a part of a cannabinoid pathway further include one or more additional nucleic acid sequences, for example, a nucleic acid sequence controlling expression of the enzymes which form a part of a cannabinoid biosynthetic enzyme pathway, and these one or more additional nucleic acid sequences together with the nucleic acid sequence encoding the enzyme can be considered a heterologous nucleic acid sequence.
  • heterologous nucleic acid sequences such as nucleic acid sequences encoding the cannabinoid pathway enzymes (e.g., AAE, OLS, OAC, and PT)
  • the THCA synthase polypeptide that occurs naturally in C. sativa includes a 28 amino acid secretion peptide fused to its N-terminus.
  • This N-terminal fusion of C. sativa THCAS is provided in Table 4 as SEQ ID NO: 16.
  • the recombinant polypeptides of the present disclosure may be expressed in a recombinant host cell as a fusion polypeptide construct with an N-terminal secretion peptide to provide efficient production of THCA.
  • Exemplary N-terminal secretion peptide (SP) sequences include those disclosed in Table 5 below and the accompanying Sequence Listing.
  • any of the recombinant polynucleotides encoding recombinant polypeptides having THCAS activity of the present disclosure can be modified with a polynucleotide sequence (e.g., SEQ ID NO: 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121) so as to express a recombinant polypeptide with an N-terminal secretion peptide sequence of any one of SEQ ID NO: 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, and 122.
  • a recombinant polypeptide of the present disclosure (e.g., any one of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, and 168) further comprises an N-terminal SP-AT secretion peptide of SEQ ID NO: 108.
  • the heterologous nucleic acids encoding the recombinant THCA synthase enzymes and/or other pathway enzymes will further comprise transcriptional promoters capable of controlling expression of the enzymes in the recombinant host cell.
  • the transcriptional promoters are selected to be compatible with the host cell, so that promoters obtained from bacterial cells are used when a bacterial host cell is selected in accordance herewith, while a fungal promoter is used when a fungal host cell is selected, a plant promoter is used when a plant cell is selected, and so on.
  • Promoters useful in the recombinant host cells of the present disclosure may be constitutive or inducible, provided such promoters are operable in the host cells.
  • Promoters that may be used to control expression in fungal host cells are well known in the art and include, but are not limited to: inducible promoters, such as a Gal1 promoter or Gal10 promoter; a constitutive promoter, such as an alcohol dehydrogenase (ADH) promoter or a glyceraldehyde-3-phosphate dehydrogenase (GPD) promoter; or an S. pombe Nmt or ADH promoter.
  • Exemplary promoters that may be used to control expression in bacterial cells can include the Escherichia coli promoters lac, tac, trc, trp, or the T7 promoter.
  • Exemplary promoters that may be used to control expression in plant cells include, for example, a Cauliflower Mosaic Virus 35S promoter (Odell et al. (1985) Nature 313:810-812), a ubiquitin promoter (U.S. Pat. No. 5,510,474; Christensen et al. (1989)), or a rice actin promoter (McElroy et al. (1990) Plant Cell 2:163-171).
  • Exemplary promoters that can be used in mammalian cells include a viral promoter, such as an SV40 promoter, or a metallothionein promoter. All of these host cell promoters are well known by and readily available to one of ordinary skill in the art.
  • nucleic acid control elements useful for controlling expression in a recombinant host cell can include transcriptional terminators, enhancers and the like, all of which may be used with the heterologous nucleic acids incorporated in the recombinant host cells of the present disclosure.
  • the heterologous nucleic acid sequences of the present disclosure comprise a promoter capable of controlling expression in a host cell, wherein the promoter is linked to a nucleic acid sequence encoding a recombinant polypeptide having THCA synthase activity of the present disclosure, and as necessary, other enzymes constituting a cannabinoid pathway (e.g., AAE, OLS, OAC, PT).
  • This heterologous nucleic acid sequence can be integrated into a recombinant expression vector which ensures good expression in the desired host cell, wherein the expression vector is suitable for expression in a host cell, meaning that the recombinant expression vector comprises the heterologous nucleic acid sequence linked to any genetic elements required to achieve expression in the host cell.
  • Genetic elements that may be included in the expression vector in this regard include a transcriptional termination region, one or more nucleic acid sequences encoding marker genes, one or more origins of replication, and the like.
  • the expression vector further comprises genetic elements required for the integration of the vector or a portion thereof in the host cell's genome.
  • an expression vector comprising a heterologous nucleic acid of the present disclosure may further contain a marker gene.
  • Marker genes useful in accordance with the present disclosure include any genes that allow the distinction of transformed cells from non-transformed cells, including all selectable and screenable marker genes.
  • a marker gene may be a resistance marker such as an antibiotic resistance marker against, for example, kanamycin or ampicillin.
  • Screenable markers that may be employed to identify transformants through visual inspection include β-glucuronidase (GUS) (U.S. Pat. Nos. 5,268,463 and 5,599,670) and green fluorescent protein (GFP) (Niedz et al., 1995, Plant Cell Rep., 14: 403).
  • the present disclosure also provides a method for producing a cannabinoid, wherein a heterologous nucleic acid encoding a recombinant polypeptide having THCA synthase activity (e.g., an exemplary engineered polypeptide of Tables 3, 6, and 8) can be introduced into a recombinant host cell.
  • the recombinant host cell can then be used for production of the polypeptide, or incorporated in a biocatalytic process that utilizes the THCA synthase activity of the recombinant polypeptide expressed by the host cell for the catalytic oxidative cyclization of a prenylated cannabinoid substrate, e.g., the oxidative cyclization of CBGA with FAD to produce THCA.
  • the recombinant host cell can further comprise a pathway of enzymes capable of producing a prenylated cannabinoid (e.g., CBGA or CBGVA) which can act as a substrate for the recombinant polypeptide with THCA synthase activity.
  • a recombinant host cell comprising a heterologous nucleic acid encoding a recombinant polypeptide having THCA synthase activity of the present disclosure can provide improved biosynthesis of a desired cannabinoid (e.g., THCA) product in terms of titer, yield, and production rate, due to the improved characteristics of the expressed THCA synthase activity in the cell associated with the amino acid and codon differences engineered in the gene.
  • the present disclosure provides a method of producing a cannabinoid derivative, wherein the method comprises: (a) culturing in a suitable medium a recombinant host cell of the present disclosure; and (b) recovering the produced cannabinoid derivative.
  • the method of producing a cannabinoid derivative further comprises contacting a cell-free extract of the culture containing the produced cannabinoid with a biocatalytic reagent or chemical reagent capable of converting the cannabinoid to a cannabinoid derivative.
  • the biocatalytic reagent is an enzyme capable of converting the produced cannabinoid to a different cannabinoid or a cannabinoid derivative compound.
  • the chemical reagent is capable of chemically modifying the produced cannabinoid to produce a different cannabinoid or a cannabinoid derivative compound.
  • the method for producing a cannabinoid can further comprise contacting a cell-free extract of the culture containing the produced cannabinoid with a biocatalytic reagent or chemical reagent.
  • the cannabinoid, or cannabinoid derivative produced using the methods of the present disclosure can be produced and/or recovered from the reaction in the form of a salt. In at least one embodiment, the recovered salt of the cannabinoid, cannabinoid precursor, cannabinoid precursor derivative, or cannabinoid derivative is a pharmaceutically acceptable salt.
  • Such pharmaceutically acceptable salts retain the biological effectiveness and properties of the free base compound.
  • polypeptides with THCA synthase activity of the present disclosure can be incorporated in any biosynthesis method requiring a THCA synthase catalyzed biocatalytic step.
  • the recombinant polypeptides having THCA synthase activity can be used in a method for preparing a cannabinoid compound of structural formula (I) wherein, R1 is C1-C7 alkyl, wherein the method comprises contacting a recombinant polypeptide having THCA synthase activity of the present disclosure (e.g., an exemplary recombinant polypeptide of Tables 3, 6, and 8) under suitable reaction conditions, with a cannabinoid precursor compound of structural formula (II) wherein, R1 is C1-C7 alkyl.
  • Exemplary conversions of cannabinoid compounds of structural formula (II) to cannabinoid compounds of structural formula (I) that are catalyzed by the recombinant polypeptides having THCA synthase activity of the present disclosure include: (1) conversion of cannabigerolic acid (CBGA) to Δ9-tetrahydrocannabinolic acid (Δ9-THCA); and (2) conversion of cannabigerovarinic acid (CBGVA) to Δ9-tetrahydrocannabivarinic acid (Δ9-THCVA).
  • the recombinant polypeptides having THCA synthase activity of the present disclosure can catalyze the conversion of other cannabinoid compounds that are structural analogs of CBGA and CBGVA, including but not limited to the exemplary cannabinoid compounds listed in Table 1.
  • the compound of structural formula (II) is CBGA and the compound of structural formula (I) is Δ9-THCA.
  • the compound of structural formula (II) is CBGVA and the compound of structural formula (I) is Δ9-THCVA.
  • Suitable reaction conditions for the biosynthesis of cannabinoids are known in the art, and can be used with the recombinant polypeptides having THCA synthase activity of the present disclosure.
  • the suitable reaction conditions comprise the presence of a redox active co-substrate molecule, such as FAD, which is capable of acting as an electron acceptor molecule.
  • suitable reaction conditions for the exemplary polypeptides of the present disclosure can be determined using routine techniques known in the art for optimizing biocatalytic reactions.
  • Suitable reaction conditions can be readily determined and optimized for particular reactions by routine experimentation that includes, but is not limited to, contacting the recombinant polypeptide and substrate under experimental reaction conditions of concentration, pH, temperature, solvent conditions, and detecting the production of the desired compound of structural formula (I).
  • the suitable reaction conditions comprise a reaction solution of ~pH 7-8, a temperature of 25 °C to 37 °C; optionally, the reaction conditions comprise a reaction solution of ~pH 7 and a temperature of ~30 °C. In at least one embodiment, the reaction solution is allowed to incubate at a temperature of 25 °C to 37 °C for a reaction time of at least 1, 6, 12, 24, or 48 hours, before the amount of reaction product is determined.
  • the methods for biocatalytic conversion of a cannabinoid compound of structural formula (II) to a cannabinoid compound of structural formula (I) using a recombinant polypeptide having THCA synthase activity of the present disclosure can comprise additional chemical or biocatalytic steps carried out on the product compound of structural formula (I), including steps of product compound work-up, extraction, isolation, purification, and/or crystallization, each of which can be carried out under a range of conditions.
  • Example 1 Preparation and Screening of Engineered Polypeptides with Improved THCA synthase Activity
  • This example illustrates preparation of site saturation mutagenesis libraries of polypeptides derived from the parent polypeptide, d28_THCAS of SEQ ID NO: 18 and screening for improved activity in the conversion of CBGA to THCA relative to the activity of the parent polypeptide of SEQ ID NO: 18.
  • the polynucleotide sequence encoding the d28_THCAS polypeptide (SEQ ID NO: 18) from Cannabis sativa was codon optimized as SEQ ID NO: 17. This codon-optimized gene was synthesized fused to a polynucleotide sequence (SEQ ID NO: 107) encoding the secretion peptide SP_AT, MRFPSIFTAVLFAASSALA (SEQ ID NO: 108).
  • the synthetic gene (SEQ ID NO: 123) encoding the complete SP_AT-THCAS fusion (SEQ ID NO: 124) was expressed under the pGal1 promoter (SEQ ID NO: 125) and ALD4 terminator (SEQ ID NO: 126).
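At the protein level, the fusion described above can be pictured as a simple concatenation: the native N-terminal secretion peptide (28 residues, per the text) is absent from the d28 parent, and the SP_AT peptide is placed at the new N-terminus. The sketch below illustrates this under stated assumptions: the toy full-length sequence is a hypothetical placeholder (not SEQ ID NO: 16 or 18), "d28" is assumed to denote removal of the first 28 residues, and the function names are illustrative.

```python
# Secretion peptide SP_AT as quoted in the text (SEQ ID NO: 108).
SP_AT = "MRFPSIFTAVLFAASSALA"

# Length of the native N-terminal secretion peptide stated in the text.
NATIVE_SIGNAL_LENGTH = 28


def truncate_native_signal(full_length, n=NATIVE_SIGNAL_LENGTH):
    """Drop the first n residues (assumed meaning of the 'd28' prefix)."""
    if len(full_length) <= n:
        raise ValueError("sequence is not longer than the signal peptide")
    return full_length[n:]


def fuse_secretion_peptide(secretion_peptide, mature):
    """Place the secretion peptide directly at the N-terminus of the mature part."""
    return secretion_peptide + mature


# Hypothetical placeholder: 28 'signal' residues followed by a 9-residue 'mature' part.
toy_mature = "ACDEFGHIK"
toy_full_length = "M" + "S" * 27 + toy_mature

d28_toy = truncate_native_signal(toy_full_length)
fusion = fuse_secretion_peptide(SP_AT, d28_toy)
```

Whether any linker or extra start codon sits between the two parts in the actual construct is not stated here, so the direct concatenation is only an approximation.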
  • the construct was integrated into the X-3 site (Easy-Clone 2.0) of a yeast strain which already had integrated genes encoding the cannabinoid pathway enzymes AAE1 (SEQ ID NO: 2), OLS (SEQ ID NO: 4), OAC (SEQ ID NO: 6), and d82_PT4 (SEQ ID NO: 10).
  • the resulting strain EVT001 integrated with the SP_AT-THCAS fusion gene thus included a cannabinoid pathway of the enzymes AAE, OLS, OAC, PT4, and THCAS capable of converting hexanoic acid (HA) to THCA.
  • This EVT001 strain was used as a control strain in screening the saturation mutagenesis library strains for fold-improvement in THCA titer as described below.
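Fold-improvement in a screen of this kind is the ratio of a variant strain's titer to the titer of the control strain measured under the same conditions. The following minimal sketch shows the arithmetic; the strain names and titer values are hypothetical placeholders rather than data from the disclosure, and the function names are illustrative.

```python
def fold_improvement(variant_titer, control_titer):
    """Ratio of variant titer to control titer (same units for both)."""
    if control_titer <= 0:
        raise ValueError("control titer must be positive")
    return variant_titer / control_titer


def select_hits(titers, control_titer, min_fold=2.0):
    """Return {strain: fold} for strains at or above min_fold, best first."""
    folds = {
        name: fold_improvement(titer, control_titer)
        for name, titer in titers.items()
    }
    hits = {name: fold for name, fold in folds.items() if fold >= min_fold}
    return dict(sorted(hits.items(), key=lambda item: item[1], reverse=True))


# Hypothetical titers in arbitrary units; the control value stands in for EVT001.
control = 10.0
library = {"variant_A": 25.0, "variant_B": 12.0, "variant_C": 51.0}
hits = select_hits(library, control, min_fold=2.0)
```

In practice the control is usually measured in replicate on each plate, so a mean control titer per plate would replace the single value used here.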
  • Genomic DNA from the EVT001 strain with the SP_AT-THCAS fusion integrated at X-3 was used as the template to generate two PCR products: (1) a first PCR product (Fragment A), which does not harbor any degenerate codons, and (2) a second PCR product (Fragment B), which has sequence overlap with the Fragment A, and is amplified harboring one NNK degenerate codon only. Primers used for amplification of Fragments A and B and overlap extension were designed according to standard site-saturation mutagenesis protocols.
  • Fragment B was amplified with a series of forward primers that included the single NNK degenerate codon scanned across the various desired positions and a single reverse primer: 5’-CGGGTATAAGCGAAGAAGCGCAAT-3’ (SEQ ID NO: 127). Fragment A was amplified using a single forward primer: 5’-AGGCGAGAGCCGACATACGA-3’ (SEQ ID NO: 128) and a series of reverse primers designed according to the location of the mutagenesis site.
  • the two fragments A and B were assembled by overlap extension PCR using forward primer, 5’-AGCCCTCCGAAGGAACACTCTC-3’ (SEQ ID NO: 129) and reverse primer of 5’-CGACCTTCCATGGGGTCGC-3’ (SEQ ID NO: 130).
  • the assembled OE-PCR products were then pooled together and gel purified to provide a saturation mutagenesis library of linear donor DNA.
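The NNK degenerate codon used for Fragment B follows the IUPAC convention in which N is any base and K is G or T. As an illustrative sketch (the helper name `expand_degenerate` is an assumption), the code below enumerates the codons that NNK represents and translates them with the standard genetic code, to show which amino acids and stop codons a single NNK position can encode.

```python
from itertools import product

# Standard genetic code; amino acids listed in TCAG order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC degenerate symbols needed for NNK: N = any base, K = G or T.
IUPAC = {"N": "ACGT", "K": "GT"}


def expand_degenerate(codon):
    """List every concrete codon represented by a degenerate codon."""
    choices = (IUPAC.get(base, base) for base in codon)
    return ["".join(bases) for bases in product(*choices)]


nnk_codons = expand_degenerate("NNK")
encoded = {CODON_TABLE[codon] for codon in nnk_codons}
amino_acids = sorted(encoded - {"*"})
stop_codons = [codon for codon in nnk_codons if CODON_TABLE[codon] == "*"]
```

NNK expands to 32 codons that cover all 20 amino acids while retaining only one stop codon (TAG), which is why it is a common choice for site-saturation libraries.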
  • the pooled saturation mutagenesis library linear donor DNA was transformed and integrated as a knock-in using CRISPR-Cas9 into an m-Venus cassette in a yeast strain, EVT000.
  • the m-Venus cassette was integrated at the X-3 site under control of the pGal1 promoter and ALD4 terminator.
  • the EVT000 strain (like the control EVT001) already had integrated genes encoding the cannabinoid pathway enzyme activities of AAE, OLS, OAC, and PT4.
  • HPLC sample preparation: The whole broth of the culture was extracted and diluted with MeOH for sample preparation. The prepared samples were loaded onto a RapidFire 365 coupled with a triple quadrupole mass spectrometry detector. Metabolites OA, CBGA, and THCA were detected using MRM mode. Calibration curves of OA, CBGA, and THCA were generated by running serial dilutions of standards, and then used to calculate concentrations of each metabolite.
  • HPLC instrumentation and parameters: HPLC system: Agilent RapidFire 365; Column: Agilent Cartridge C18 (12 µL, type C); Mobile phase: Pump 1 uses H2O with 0.1% formic acid at 1 mL/min; Pump 2 uses 20:80 acetonitrile:H2O at 0.8 mL/min; Pump 3 uses MeOH with 0.1% formic acid; Aqueous wash uses H2O; Organic wash uses acetonitrile; RapidFire cycle time: Aspiration 600 ms; Load/wash 3000 ms; Extra wash 2000 ms; Elute 4000 ms; Re-equilibration 500 ms.
  • This example illustrates preparation of site saturation mutagenesis libraries of polypeptides derived from the parent polypeptide, d28_THCAS of SEQ ID NO: 18 and screening for improved activity in the conversion of CBGVA to THCVA relative to the activity of the parent polypeptide of SEQ ID NO: 18.
  • a site saturation mutagenesis library was prepared as described in Example 1.
  • LC-MS/MS sample preparation: The whole broth of the culture was extracted in 80% acetonitrile/20% ethanol and diluted with 100% acetonitrile for sample preparation. The prepared samples were loaded onto a UHPLC coupled to a triple quadrupole mass spectrometry detector. Metabolites DA, CBGVA, and THCVA were detected using SRM mode. Calibration curves of DA, CBGVA, and THCVA were generated by running serial dilutions of standards, and then used to calculate concentrations of each metabolite.
  • UHPLC MS instrumentation and parameters: UHPLC system: A Thermo Scientific Vanquish™ UHPLC system equipped with a pump (VF-P10-A), an autosampler (VF-A10-A), and a column compartment (VH-C10-A) was used for the chromatographic separation. Separation was achieved with a Thermo Accucore™ C18 column, 2.6 µm, 150 x 2.1 mm (Thermo Scientific) at 40 °C, with an injection volume of 2 µL.
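The calibration step described above can be sketched as a straight-line fit of detector response against standard concentration, followed by back-calculation of sample concentrations. The example below assumes a linear response over the calibrated range; the standard concentrations, peak areas, and dilution factor are hypothetical values chosen for illustration, and the function names are not taken from the disclosure.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard concentrations must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration(peak_area, slope, intercept, dilution_factor=1.0):
    """Back-calculate concentration from a peak area, correcting for dilution."""
    return (peak_area - intercept) / slope * dilution_factor


# Hypothetical serial dilution of a standard (concentration, peak area).
std_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
std_area = [1050.0, 2050.0, 4050.0, 8050.0, 16050.0]

slope, intercept = fit_line(std_conc, std_area)
sample_conc = concentration(6050.0, slope, intercept, dilution_factor=10.0)
```

One curve would be fitted per metabolite, since each analyte has its own response factor.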
  • the mobile phase consists of 0.1% formic acid in water (A) and 0.1% formic acid in acetonitrile (B).
  • the flow rate is 0.8 mL/min, and the gradient elution program is as follows: 10-95% B (0-1.0 min), 95% B (1.0-2.5 min), 95-10% B (2.5-2.6 min), and 10% B (2.6-3.5 min).
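The gradient program above can be written as a table of time points and %B values, with linear ramps between them. The sketch below encodes the stated breakpoints and interpolates %B at any time in the run; treating the ramps as linear is an assumption, and the names `GRADIENT` and `percent_b` are illustrative.

```python
# (time in minutes, % mobile phase B) breakpoints taken from the program above.
GRADIENT = [(0.0, 10.0), (1.0, 95.0), (2.5, 95.0), (2.6, 10.0), (3.5, 10.0)]


def percent_b(t):
    """Return %B at time t (minutes), assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t <= GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    raise ValueError("time is outside the gradient program")


midpoint_of_ramp = percent_b(0.5)
plateau = percent_b(2.0)
re_equilibration = percent_b(3.0)
```

Percent A at any time is simply 100 minus the returned value, since the mobile phase is a binary mixture of A and B.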
  • Mass spectrometry measurements were performed on a Thermo Scientific TSQ Altis™ triple quadrupole mass spectrometer. Samples were introduced to the MS via electrospray ionization (ESI) in negative mode with selected reaction monitoring (SRM).
  • The mass spectrometer was operated under the following conditions: sheath gas flow rate, 60 Arb; auxiliary gas, 15 Arb.
  • The ESI voltage was 2900 V and the source temperature was 350 °C.
  • The parameters for quantification of the SRM transitions are shown below in Table 7.

Abstract

The present disclosure relates to recombinant polypeptides that have THCA synthase activity, nucleic acids encoding these recombinant polypeptides, recombinant host cells that produce these recombinant polypeptides, and compositions comprising the recombinant polypeptides, nucleic acids, and/or recombinant host cells. The present disclosure also relates to uses of these recombinant polypeptides, nucleic acids encoding them, and recombinant host cells comprising them, in methods for the preparation of cannabinoids, such as Δ9-tetrahydrocannabinolic acid (THCA), and Δ9-tetrahydrocannabivarinic acid (THCVA).

Description

RECOMBINANT THCA SYNTHASE POLYPEPTIDES ENGINEERED FOR ENHANCED BIOSYNTHESIS OF CANNABINOIDS
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority of U.S. Provisional Patent Application Number 63/257,523, filed October 19, 2021, the entirety of which is hereby incorporated by reference herein.
FIELD
[0002] The present disclosure relates to recombinant THCA synthase polypeptides engineered with enhanced activity and the use of recombinant genes encoding these polypeptides in recombinant host cell systems for the production of cannabinoid compounds.
REFERENCE TO SEQUENCE LISTING
[0003] The official copy of the Sequence Listing is submitted concurrently with the specification via USPTO Patent Center as a WIPO Standard ST.26 formatted XML file with file name “13421-013WO1.xml”, a creation date of October 13, 2022, and a size of 289,064 bytes. This Sequence Listing filed via USPTO Patent Center is part of the specification and is incorporated in its entirety by reference herein.
BACKGROUND
[0004] Cannabinoids are a class of compounds that act on endocannabinoid receptors and include the phytocannabinoids naturally produced by Cannabis sativa. Cannabinoids include the more prevalent and well-known compounds, Δ9-tetrahydrocannabinol (THC) and cannabidiol (CBD), as well as 80 or more less prevalent cannabinoids, cannabinoid precursors, related metabolites, and synthetically produced derivative compounds. Cannabinoids are increasingly used to treat a range of diseases and conditions such as multiple sclerosis and chronic pain. Current large-scale production of cannabinoids for pharmaceutical or other use is through extraction from plants. These plant-based production processes, however, have several challenges including susceptibility of the plants to inconsistent production caused by variance in biotic and abiotic factors, difficulty reproducing identical cannabinoid accumulation profiles, and difficulty in producing a single cannabinoid compound with purity high enough for pharmaceutical applications. While some cannabinoids can be produced as a single pure product via chemical synthesis, these processes have proven too costly for large-scale production.
[0005] More economical biosynthetic approaches to cannabinoid production are being developed using microbial hosts. These processes have the potential to be robust, scalable, and capable of producing a single cannabinoid compound with higher purity compared to other current processes. Several biosynthetic systems for cannabinoid compounds have been reported (see e.g., WO2019071000, WO2018200888, WO2018148849, WO2019014490, US20180073043, US20180334692, and WO2019046941). These biosynthetic systems are capable of producing the cannabinoid, CBGA, to some extent, but are not capable of efficient production of the downstream cannabinoid compounds, CBDA and THCA.
[0006] There exists a need for improved recombinant genes encoding the cannabinoid pathway enzyme, THCAS, that, when integrated in a recombinant host cell system, enhance the biosynthetic production of cannabinoids, such as THCA, and the rare cannabinoid THCVA.
SUMMARY
[0007] The present disclosure relates generally to recombinant polypeptides engineered with increased THCA synthase activity relative to the naturally occurring THCA synthase from Cannabis sativa, and the use of these recombinant polypeptides in recombinant host cell systems and methods for the preparation of cannabinoids. This summary is intended to introduce the subject matter of the present disclosure, but does not cover each and every embodiment, combination, or variation that is contemplated and described within the present disclosure. Further embodiments are contemplated and described by the disclosure of the detailed description, drawings, and claims.
[0008] In at least one embodiment, the present disclosure provides a recombinant polypeptide having THCA synthase activity, wherein the polypeptide comprises an amino acid sequence of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18:
(a) at one or more positions selected from: R3, K12, A19, L23, H28, L31, M33, Q41, L43, S72, V75, H108, K137, V151, G207, A214, K233, K234, D256, D258, I266, K268, H274, K276, H282, V293, N301, F317, F332, A335, G348, S354, T367, T418, Y472, K496, N500, and H517;
(b) selected from: R3V, K12G, A19E, A19G, A19Q, L23I, L23V, H28G, H28N, H28Q, L31D, L31G, M33D, Q41R, L43G, L43S, S72A, V75A, V75Y, H108R, K137C, K137F, K137M, K137S, K137Y, V151G, G207A, A214T, K233G, K233S, K233T, K234R, D256S, D258R, I266Q, K268E, H274C, H274E, H274Q, K276Q, H282L, V293I, N301D, F317L, F332L, A335C, A335T, G348A, S354C, T367E, T418V, Y472I, K496E, K496Q, N500D, H517R, H517V, and H517Y.
[0009] In at least one embodiment, the polypeptide is encoded by a polynucleotide sequence having at least 80% identity to SEQ ID NO: 17, and a neutral codon difference as compared to SEQ ID NO: 17 at a position encoding an amino acid residue selected from: V75, H108, K136, K137, V184, K187, G328, F337, A368, P404, L415, T464, D498, and H516; optionally, wherein the neutral codon difference as compared to SEQ ID NO: 17 is selected from: V75 (GTA>GTG), H108 (CAC>CAT), K136 (AAG>AAA), V184 (GTA>GTG), K187 (AAG>AAA), R191R (AGG>AGA), G328 (GGC>GGT), F337 (TTC>TTT), A368A (GCC>GCG), P404 (CCT>CCC), L415L (TTA>CTG), T464T (ACC>ACG), D498 (GAC>GAT), and H516 (CAT>CAC).
[0010] In at least one embodiment, the present disclosure provides a recombinant polypeptide having THCA synthase activity, wherein the polypeptide comprises an amino acid sequence of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18, wherein the amino acid difference is:
(a) at one or more positions selected from: K12, A19, Q41, H108, V151, A214, K234, D256, D258, I266, K268, H274, K276, H282, V293, N301, F317, F332, A335, G348, S354, T367, T418, and K496; and/or
(b) selected from: R3V, K12G, A19E, A19G, A19Q, L23V, H28N, H28G, H28Q, L31D, L31G, M33D, Q41R, L43G, L43S, V75A, V75Y, H108R, K137C, K137F, K137M, K137S, K137Y, V151G, G207A, A214T, K233G, K233S, K233T, K234R, D256S, D258R, I266Q, K268E, H274C, H274E, H274Q, K276Q, H282L, V293I, N301D, F317L, F332L, A335C, A335T, G348A, S354C, T367E, T418V, Y472I, K496E, K496Q, N500D, H517R, H517V, and H517Y.
[0011] In at least one embodiment, the polypeptide comprises at least two amino acid differences as compared to SEQ ID NO: 18 selected from: H23N and A335C; Q41R and D258R; L43G and K276Q; S72A and V293I; K234R and K496E; I266Q and H517Y; N301D and Y472I; A335T and G348A; and A335T and H517V.
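The residue-difference notation used in the preceding paragraphs (reference residue, 1-based position, replacement residue, e.g., “Q41R”) can be handled programmatically. The following is a minimal Python sketch, offered for illustration only and not as part of the disclosed methods. The function names `parse_substitution` and `apply_substitutions` and the six-residue toy sequence are hypothetical; the toy sequence is not SEQ ID NO: 18.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    ref: str  # residue present in the reference sequence
    pos: int  # 1-based position in the reference sequence
    alt: str  # residue present in the variant


# One-letter codes of the 20 standard amino acids.
_RESIDUE = "[ACDEFGHIKLMNPQRSTVWY]"
_PATTERN = re.compile(rf"^({_RESIDUE})(\d+)({_RESIDUE})$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'Q41R' into (ref, pos, alt)."""
    match = _PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    ref, pos, alt = match.groups()
    return Substitution(ref, int(pos), alt)


def apply_substitutions(reference: str, labels: list[str]) -> str:
    """Return the variant sequence, checking each reference residue first."""
    residues = list(reference)
    for label in labels:
        sub = parse_substitution(label)
        if not 1 <= sub.pos <= len(residues):
            raise ValueError(f"{label}: position outside the reference")
        if reference[sub.pos - 1] != sub.ref:
            raise ValueError(
                f"{label}: reference has {reference[sub.pos - 1]} at {sub.pos}"
            )
        residues[sub.pos - 1] = sub.alt
    return "".join(residues)


# Hypothetical toy reference, used only to exercise the functions.
TOY_REFERENCE = "MNRSAK"
variant = apply_substitutions(TOY_REFERENCE, ["R3V", "K6G"])  # "MNVSAG"
```

Checking the reference residue before substituting catches labels whose position numbering does not match the sequence supplied.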
[0012] In at least one embodiment, the polypeptide comprises neutral codon differences as compared to SEQ ID NO: 17 selected from: V75 (GTA>GTG), H108 (CAC>CAT), K136 (AAG>AAA), V184 (GTA>GTG), K187 (AAG>AAA), R191R (AGG>AGA), G328 (GGC>GGT), F337 (TTC>TTT), A368A (GCC>GCG), P404 (CCT>CCC), L415L (TTA>CTG), T464T (ACC>ACG), D498 (GAC>GAT), and H516 (CAT>CAC).
[0013] In at least one embodiment, the polypeptide comprises an amino acid sequence of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to a sequence selected from the group consisting of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, and 168.
[0014] In at least one embodiment, the polypeptide comprises an N-terminal secretion peptide; optionally, wherein the N-terminal secretion peptide comprises an amino acid sequence selected from SEQ ID NO: 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, and 122.
[0015] In at least one embodiment, the THCA synthase activity of the polypeptide as compared to a polypeptide consisting of SEQ ID NO: 18 is increased at least 1.2-fold, at least 1.5-fold, at least 2-fold, at least 5-fold, or more. In at least one embodiment, the THCA synthase activity of the polypeptide is measured as the rate of conversion of the substrate cannabigerolic acid (CBGA) to THCA under suitable reaction conditions. In at least one embodiment, the THCA synthase activity of the polypeptide is measured as the rate of conversion of the substrate CBGVA to THCVA under suitable reaction conditions.
[0016] In at least one embodiment, the present disclosure also provides a polynucleotide encoding a recombinant polypeptide having THCA synthase activity of the present disclosure. In at least one embodiment, the polynucleotide comprises:
(a) a sequence of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to a sequence selected from the group consisting of SEQ ID NO: 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, and 167;
(b) a codon degenerate sequence of a sequence selected from the group consisting of SEQ ID NO: 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, and 167.
[0017] In at least one embodiment, the polynucleotide encoding the polypeptide further comprises a polynucleotide sequence encoding an N-terminal secretion peptide comprising an amino acid sequence selected from SEQ ID NO: 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, and 122; optionally, wherein the polynucleotide sequence encoding the N-terminal secretion peptide is selected from SEQ ID NO: 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, and 121.
[0018] In at least one embodiment, the present disclosure also provides an expression vector comprising a polynucleotide encoding a recombinant polypeptide having THCA synthase activity of the present disclosure; optionally, wherein the expression vector comprises a control sequence.
[0019] In at least one embodiment, the present disclosure also provides a recombinant host cell comprising: (a) a polynucleotide encoding a recombinant polypeptide having THCA synthase activity of the present disclosure, or (b) an expression vector comprising a polynucleotide encoding a recombinant polypeptide having THCA synthase activity of the present disclosure.
[0020] In at least one embodiment, the present disclosure provides a method for preparing a recombinant polypeptide having THCA synthase activity of the present disclosure wherein the method comprises culturing a recombinant host cell of the present disclosure and isolating the polypeptide from the cell.
[0021] In at least one embodiment, the present disclosure provides a method for preparing a recombinant polypeptide having THCA synthase activity comprising:
(a) transforming a host cell with an expression vector comprising a polynucleotide encoding a recombinant polypeptide having THCA synthase activity of the present disclosure;
(b) culturing said transformed host cell under conditions whereby said recombinant polypeptide is produced by said host cell; and
(c) recovering said recombinant polypeptide from said host cells.
[0022] In at least one embodiment, the present disclosure also provides a recombinant host cell comprising a nucleic acid encoding a recombinant polypeptide having THCA synthase activity of the present disclosure.
[0023] In at least one embodiment of the recombinant host cell, the host cell further comprises a pathway of enzymes capable of producing a cannabinoid or cannabinoid precursor; optionally, wherein the cannabinoid or cannabinoid precursor is selected from divarinic acid (DA), olivetolic acid (OA), cannabigerovarinic acid (CBGVA), and cannabigerolic acid (CBGA).
[0024] In at least one embodiment of the recombinant host cell, the host cell further comprises a pathway of enzymes capable of converting hexanoic acid (HA) to cannabigerolic acid (CBGA); optionally, wherein the pathway comprises enzymes capable of catalyzing reactions (i) - (iv):
[Reaction schemes (i) through (iv), shown in Figures imgf000007_0001 and imgf000008_0001; reaction scheme (iv) involves geranyl diphosphate]
[0025] In at least one embodiment of the recombinant host cell, the host cell further comprises a pathway of enzymes capable of converting hexanoic acid (HA) to cannabigerolic acid (CBGA), wherein the pathway comprises at least the enzymes AAE, OLS, OAC, and PT4; optionally, wherein the enzymes AAE, OLS, OAC, and PT4 have an amino acid sequence of at least 90% identity to SEQ ID NO: 2 (AAE), SEQ ID NO: 4 (OLS), SEQ ID NO: 6 (OAC), and SEQ ID NO: 8 or 10 (PT4), respectively.
[0026] In at least one embodiment of the recombinant host cell, the host cell is capable of producing a cannabinoid selected from cannabigerolic acid (CBGA), cannabigerol (CBG), cannabidiolic acid (CBDA), cannabidiol (CBD), Δ9-tetrahydrocannabinolic acid (Δ9-THCA), Δ9-tetrahydrocannabinol (Δ9-THC), Δ8-tetrahydrocannabinolic acid (Δ8-THCA), Δ8-tetrahydrocannabinol (Δ8-THC), cannabichromenic acid (CBCA), cannabichromene (CBC), cannabinolic acid (CBNA), cannabinol (CBN), cannabidivarinic acid (CBDVA), cannabidivarin (CBDV), Δ9-tetrahydrocannabivarinic acid (Δ9-THCVA), Δ9-tetrahydrocannabivarin (Δ9-THCV), cannabidibutolic acid (CBDBA), cannabidibutol (CBDB), Δ9-tetrahydrocannabutolic acid (Δ9-THCBA), Δ9-tetrahydrocannabutol (Δ9-THCB), cannabidiphorolic acid (CBDPA), cannabidiphorol (CBDP), Δ9-tetrahydrocannabiphorolic acid (Δ9-THCPA), Δ9-tetrahydrocannabiphorol (Δ9-THCP), cannabichromevarinic acid (CBCVA), cannabichromevarin (CBCV), cannabigerovarinic acid (CBGVA), cannabigerovarin (CBGV), cannabicyclolic acid (CBLA), cannabicyclol (CBL), cannabielsoinic acid (CBEA), cannabielsoin (CBE), cannabicitranic acid (CBTA), cannabicitran (CBT), and any combination thereof.
[0027] In at least one embodiment of the recombinant host cell, the host cell comprises a pathway capable of producing THCA, and the production of THCA is increased at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or more, relative to a control recombinant host cell comprising a pathway with the recombinant polypeptide having THCA synthase activity replaced by a polypeptide of SEQ ID NO: 18.
[0028] In at least one embodiment, the recombinant host cell capable of increased production of THCA comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18 at one or more positions selected from: R3, L23, H28, L31, M33, L43, S72, V75, K137, G207, K233, I266, K268, K276, H282, V293, N301, A335, G348, T418, Y472, N500, and H517; optionally, wherein the amino acid residue difference as compared to SEQ ID NO: 18 is selected from R3V, L23I, L23V, H28G, H28Q, L31D, L31G, M33D, L43G, L43S, S72A, V75A, V75Y, K137C, K137F, K137M, K137S, K137Y, G207A, K233G, K233S, K233T, I266Q, K268E, K276Q, H282L, V293I, N301D, A335T, G348A, T418V, Y472I, N500D, H517R, H517V, and H517Y.
[0029] In at least one embodiment, the recombinant host cell capable of increased production of THCA comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and a neutral codon difference as compared to SEQ ID NO: 17 selected from: V75 (GTA>GTG), V184 (GTA>GTG), G328 (GGC>GGT), F337 (TTC>TTT), P404 (CCT>CCC), D498 (GAC>GAT), and H516 (CAT>CAC).
[0030] In at least one embodiment of the recombinant host cell, the host cell comprises a pathway capable of producing THCVA, and the production of THCVA is increased at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or more, relative to a control recombinant host cell comprising a pathway with the recombinant polypeptide having THCA synthase activity replaced by a polypeptide of SEQ ID NO: 18.
[0031] In at least one embodiment, the recombinant host cell capable of increased production of THCVA comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18 at one or more positions selected from: K12, A19, H28, Q41, V75, H108, V151, A214, K234, D256, D258, H274, F317, F332, A335, S354, T367, and K496; optionally, wherein the amino acid residue difference as compared to SEQ ID NO: 18 is selected from K12G, A19E, A19G, A19Q, H28N, Q41R, V75A, H108R, V151G, A214T, K234R, D256S, D258R, H274C, H274E, H274Q, F317L, F332L, A335C, S354C, T367E, K496E, and K496Q.
[0032] In at least one embodiment, the recombinant host cell capable of increased production of THCVA comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and a neutral codon difference as compared to SEQ ID NO: 17 selected from: H108 (CAC>CAT), K136 (AAG>AAA), K187 (AAG>AAA), R191R (AGG>AGA), A368A (GCC>GCG), L415L (TTA>CTG), and T464T (ACC>ACG).
[0033] In at least one embodiment of the recombinant host cell, the source of the host cell is selected from Saccharomyces cerevisiae, Yarrowia lipolytica, Pichia pastoris, and Escherichia coli.
[0034] In at least one embodiment, the present disclosure also provides a method for producing a cannabinoid comprising: (a) culturing in a suitable medium a recombinant host cell of the present disclosure; and (b) recovering the produced cannabinoid. In at least one embodiment, the method further comprises contacting a cell-free extract of the culture with a biocatalytic reagent or chemical reagent.
[0035] In at least one embodiment, the present disclosure also provides a method for preparing a compound of structural formula (I)
Figure imgf000010_0001
wherein, R1 is C1-C7 alkyl; the method comprising contacting under suitable reaction conditions a recombinant polypeptide having THCA synthase activity of the present disclosure and a compound of structural formula (II)
Figure imgf000010_0002
wherein, R1 is C1-C7 alkyl.
[0036] In at least one embodiment of the method: (a) the compound of structural formula (I) is Δ9-tetrahydrocannabinolic acid (Δ9-THCA) and the compound of structural formula (II) is cannabigerolic acid (CBGA); or (b) the compound of structural formula (I) is Δ9-tetrahydrocannabivarinic acid (Δ9-THCVA) and the compound of structural formula (II) is cannabigerovarinic acid (CBGVA).
BRIEF DESCRIPTION OF THE DRAWINGS
[0037] A better understanding of the novel features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized, and the accompanying drawings (also “Figure” and “FIG.” herein), of which:
[0038] FIG. 1 depicts an exemplary four enzyme pathway capable of converting hexanoic acid (HA) to the cannabinoid precursor, olivetolic acid (OA), and then further converting OA to the cannabinoid, cannabigerolic acid (CBGA). The four enzymes catalyzing the steps in the biosynthetic pathway are AAE, OLS, OAC, and PT.
[0039] FIG. 2 depicts three exemplary two step pathways for converting the cannabinoid, CBGA, to one or more of the cannabinoids, Δ9-THCA, CBDA, and/or CBCA, and then, optionally, further converting them to the decarboxylated cannabinoids, Δ9-THC, CBD, and/or CBC. The first conversion from CBGA to Δ9-THCA, CBDA, and/or CBCA can be catalyzed by a cannabinoid synthase, CBDA synthase (CBDAS), THCA synthase (THCAS) and/or CBCA synthase (CBCAS), respectively. As described elsewhere herein, in some embodiments the single cannabinoid synthase (e.g., CBDAS) is capable of catalyzing not only the conversion of CBGA to its preferred product (e.g., CBDAS preferentially converts CBGA to CBDA), but also converts CBGA to one or both of the other cannabinoid acid products, typically in lesser amounts.
[0040] FIG. 3 depicts an exemplary four enzyme pathway capable of converting butyric acid (BA) to the rare cannabinoid precursor, divarinic acid (DA), and then further converting DA to the rare cannabinoid, cannabigerovarinic acid (CBGVA). The four enzymes catalyzing the steps in the biosynthetic pathway are AAE, OLS, OAC, and PT.
[0041] FIG. 4 depicts three exemplary two step pathways for converting the rare cannabinoid, CBGVA, to one or more of the rare cannabinoids, Δ9-THCVA, CBDVA, and/or CBCVA, and then, optionally, further converting them to the decarboxylated cannabinoids, Δ9-THCV, CBDV, and/or CBCV. The first conversion from CBGVA to Δ9-THCVA, CBDVA, and/or CBCVA can be catalyzed by a single cannabinoid synthase, CBDAS, THCAS and/or CBCAS, respectively. As described elsewhere herein, in some embodiments the single cannabinoid synthase (e.g., CBDAS) is capable of catalyzing not only the conversion of CBGVA to its preferred product (e.g., CBDAS preferentially converts CBGVA to CBDVA), but also converts CBGVA to one or both of the other cannabinoid acid products, typically in lesser amounts.
DETAILED DESCRIPTION
[0042] For the descriptions herein and the appended claims, the singular forms “a” and “an” include plural referents unless the context clearly indicates otherwise. Thus, for example, reference to “a protein” includes more than one protein, and reference to “a compound” refers to more than one compound. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements, or use of a “negative” limitation. The use of “comprise,” “comprises,” “comprising,” “include,” “includes,” and “including” are interchangeable and not intended to be limiting. It is to be further understood that where descriptions of various embodiments use the term “comprising,” those skilled in the art would understand that in some specific instances, an embodiment can be alternatively described using language “consisting essentially of” or “consisting of.”
[0043] Where a range of values is provided, unless the context clearly dictates otherwise, it is understood that each intervening integer of the value, and each tenth of each intervening integer of the value, between the upper and lower limit of that range, and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of these limits, ranges excluding (i) either or (ii) both of those included limits are also included in the invention. For example, “1 to 50,” includes “2 to 25,” “5 to 20,” “25 to 50,” “1 to 10,” etc.
[0044] Generally, the nomenclature used herein and the techniques and procedures described herein include those that are well understood and commonly employed by those of ordinary skill in the art, such as the common techniques and methodologies described in e.g., Green and Sambrook, Molecular Cloning: A Laboratory Manual (Fourth Edition), Vols. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 2012 (hereinafter “Sambrook”); and Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., originally published in 1987 in book form by Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., and regularly supplemented through 2011, and now available in journal format online as Current Protocols in Molecular Biology, Vols. 00 - 130, (1987-2020), published by Wiley & Sons, Inc. in the Wiley Online Library (hereinafter “Ausubel”).
[0045] All publications, patents, patent applications, and other documents referenced in this disclosure are hereby incorporated by reference in their entireties for all purposes to the same extent as if each individual publication, patent, patent application or other document were individually indicated to be incorporated by reference herein for all purposes.
[0046] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the present invention pertains. It is to be understood that the terminology used herein is for describing particular embodiments only and is not intended to be limiting. For purposes of interpreting this disclosure, the following description of terms will apply and, where appropriate, a term used in the singular form will also include the plural form and vice versa.
[0047] Definitions
[0048] “Cannabinoid” refers to a compound that acts on a cannabinoid receptor, and is intended to include the endocannabinoid compounds that are produced naturally in animals, the phytocannabinoid compounds produced naturally in cannabis plants, and the synthetic cannabinoid compounds. Cannabinoids as referenced in the present disclosure include, but are not limited to, the exemplary naturally occurring and synthetic cannabinoid product compounds shown in Table 1 (below).
[0049] TABLE 1: Exemplary cannabinoid product compounds
Figure imgf000013_0001
Figure imgf000014_0001
Figure imgf000015_0001
Figure imgf000016_0001
Figure imgf000017_0001
Figure imgf000018_0001
[0050] “Pathway” refers to an ordered sequence of enzymes that act in a linked series to convert an initial substrate molecule into a final product molecule. As used herein, “pathway” is intended to encompass naturally-occurring pathways and non-naturally occurring, recombinant pathways. Accordingly, a pathway of the present disclosure can include a series of enzymes that are naturally-occurring and/or non-naturally occurring, and can include a series of enzymes that act in vivo or in vitro.
[0051] “Pathway capable of producing a cannabinoid” refers to a pathway that can convert a cannabinoid precursor molecule, such as hexanoic acid, into a cannabinoid product molecule, such as cannabigerolic acid (CBGA). For example, the four enzymes AAE, OLS, OAC, and PT, which convert hexanoic acid to CBGA, form a pathway capable of producing a cannabinoid.
[0052] “Cannabinoid precursor” as used herein refers to a compound capable of being converted into a cannabinoid by a pathway capable of producing a cannabinoid. Cannabinoid precursors as referenced in the present disclosure include, but are not limited to, the exemplary naturally occurring and synthetic cannabinoid precursors with varying alkyl carbon chain lengths summarized in Table 2 (below).
[0053] TABLE 2: Exemplary cannabinoid precursor compounds
Figure imgf000019_0001
[0054] “Conversion” as used herein refers to the enzymatic conversion of a substrate(s) to a corresponding product(s). “Percent conversion” refers to the percent of the substrate that is converted to the product within a period of time under specified conditions. Thus, the “enzymatic activity” or “activity” of an enzymatic conversion can be expressed as “percent conversion” of the substrate to the product.
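As a worked illustration of the “percent conversion” definition above, the following minimal Python sketch computes percent conversion from molar amounts and expresses one activity relative to another as a fold change. The function names and all numeric values are hypothetical and are not measurements from the present disclosure; the sketch assumes that one mole of substrate yields one mole of product.

```python
def percent_conversion(substrate_initial_mol: float, product_formed_mol: float) -> float:
    """Percent of the initial substrate converted to product.

    Assumes 1:1 substrate-to-product stoichiometry (an assumption of this sketch).
    """
    if substrate_initial_mol <= 0:
        raise ValueError("initial substrate amount must be positive")
    if not 0 <= product_formed_mol <= substrate_initial_mol:
        raise ValueError("product formed must lie between 0 and the initial substrate")
    return 100.0 * product_formed_mol / substrate_initial_mol


def fold_change(variant_value: float, reference_value: float) -> float:
    """Ratio of a variant measurement to a reference measurement."""
    if reference_value <= 0:
        raise ValueError("reference value must be positive")
    return variant_value / reference_value


# Hypothetical amounts, for illustration only.
reference_pct = percent_conversion(2.0, 0.25)  # 12.5
variant_pct = percent_conversion(2.0, 0.5)     # 25.0
increase = fold_change(variant_pct, reference_pct)  # 2.0
```

Both measurements must be taken over the same period of time and under the same specified conditions for the fold change to be meaningful.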
[0055] “Substrate” as used herein in the context of an enzyme mediated process refers to the compound or molecule acted on by the enzyme.
[0056] “Product” as used herein in the context of an enzyme mediated process refers to the compound or molecule resulting from the activity of the enzyme.
[0057] “Host cell” as used herein refers to a cell capable of being functionally modified with recombinant nucleic acids and functioning to express recombinant products, including polypeptides and compounds produced by activity of the polypeptides.
[0058] “Nucleic acid” and “polynucleotide” are used herein interchangeably to refer to two or more nucleosides that are covalently linked together. The nucleic acid may be wholly comprised of ribonucleosides (e.g., RNA), wholly comprised of 2'-deoxyribonucleotides (e.g., DNA) or mixtures of ribo- and 2'-deoxyribonucleosides. The nucleoside units of the nucleic acid can be linked together via phosphodiester linkages (e.g., as in naturally occurring nucleic acids), or the nucleic acid can include one or more non-natural linkages (e.g., phosphorothioester linkage). Nucleic acid or polynucleotide is intended to include single-stranded or double-stranded molecules, or molecules having both single-stranded regions and double-stranded regions. Nucleic acid or polynucleotide is intended to include molecules composed of the naturally occurring nucleobases (i.e., adenine, guanine, uracil, thymine and cytosine), or molecules that include one or more modified and/or synthetic nucleobases, such as, for example, inosine, xanthine, hypoxanthine, etc.
[0059] “Protein,” “polypeptide,” and “peptide” are used herein interchangeably to denote a polymer of at least two amino acids covalently linked by an amide bond, regardless of length or post-translational modification (e.g., glycosylation, phosphorylation, lipidation, myristoylation, ubiquitination, etc.). As used herein “protein” or “polypeptide” or “peptide” polymer can include D- and L-amino acids, and mixtures of D- and L-amino acids.
[0060] “Naturally-occurring” or “wild-type” as used herein refers to the form as found in nature. For example, a naturally occurring nucleic acid sequence is the sequence present in an organism that can be isolated from a source in nature and which has not been intentionally modified by human manipulation.
[0061] “Recombinant,” “engineered,” or “non-naturally occurring” when used herein with reference to, e.g., a cell, nucleic acid, or polypeptide, refers to a material, or a material corresponding to the natural or native form of the material, that has been modified in a manner that would not otherwise exist in nature, or is identical thereto but is produced or derived from synthetic materials and/or by manipulation using recombinant techniques. Non-limiting examples include, among others, recombinant cells expressing genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise expressed at a different level.
[0062] “Nucleic acid derived from” as used herein refers to a nucleic acid having a sequence at least substantially identical to a sequence found naturally in an organism. Examples include cDNA molecules prepared by reverse transcription of mRNA isolated from an organism, and nucleic acid molecules prepared synthetically to have a sequence at least substantially identical to, or which hybridizes to a sequence at least substantially identical to, a nucleic acid sequence found in an organism.
[0063] “Coding sequence” refers to that portion of a nucleic acid (e.g., a gene) that encodes an amino acid sequence of a protein.
[0064] “Heterologous nucleic acid” as used herein refers to any polynucleotide that is introduced into a host cell by laboratory techniques, and includes polynucleotides that are removed from a host cell, subjected to laboratory manipulation, and then reintroduced into a host cell.
[0065] “Codon degenerate” describes a nucleotide sequence that has one or more different codons relative to the reference nucleotide sequence but which encodes a polypeptide that is identical to the polypeptide encoded by a reference nucleotide sequence. The different codons between the nucleotide sequence and the reference nucleotide sequence are called “synonyms” or “synonymous” codons in that they use different triplets of nucleotides to encode the same amino acid in a polypeptide.
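The “codon degenerate” definition above can be checked computationally by translating both nucleotide sequences and comparing the encoded polypeptides. The following minimal Python sketch is illustrative only; it assumes the standard genetic code, and the function names and the nine-nucleotide toy sequences are hypothetical (they are not SEQ ID NO: 17). The two synonymous changes in the example, GTA>GTG and AAG>AAA, are of the kind recited elsewhere herein.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(sequence: str) -> list[str]:
    """Split an in-frame DNA coding sequence into codons."""
    seq = sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(sequence: str) -> str:
    """Translate a DNA coding sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[c] for c in codons(sequence))


def synonymous_differences(candidate: str, reference: str) -> list[tuple[int, str, str]]:
    """List (1-based codon position, reference codon, candidate codon) differences.

    Raises ValueError if the two sequences do not encode the same polypeptide.
    """
    if translate(candidate) != translate(reference):
        raise ValueError("sequences encode different polypeptides")
    pairs = zip(codons(reference), codons(candidate))
    return [(i, r, c) for i, (r, c) in enumerate(pairs, start=1) if r != c]


def is_codon_degenerate(candidate: str, reference: str) -> bool:
    """True if candidate differs in at least one codon yet encodes the same polypeptide."""
    try:
        return len(synonymous_differences(candidate, reference)) > 0
    except ValueError:
        return False


# Hypothetical toy sequences encoding Met-Val-Lys.
TOY_REFERENCE = "ATGGTAAAG"
TOY_CANDIDATE = "ATGGTGAAA"
differences = synonymous_differences(TOY_CANDIDATE, TOY_REFERENCE)
# differences == [(2, "GTA", "GTG"), (3, "AAG", "AAA")]
```

A sequence identical to the reference returns `False` here, because the definition requires one or more different codons.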
[0066] “Codon optimized” refers to changes in the codons of the polynucleotide encoding a protein to those preferentially used in a particular organism such that the encoded protein is efficiently expressed in the organism of interest. Although the genetic code is degenerate in that most amino acids are represented by several different “synonymous” codons, it is well known that codon usage by particular organisms is nonrandom and biased towards particular codon triplets. This codon usage bias may be higher in reference to a given gene, genes of common function or ancestral origin, highly expressed proteins versus low copy number proteins, and the aggregate protein coding regions of an organism's genome. In some embodiments, the polynucleotides encoding the recombinant polypeptides having THCA synthase activity may be codon optimized for optimal production from the host organism selected for expression.
[0067] “Preferred, optimal, high codon usage bias codons” refers to codons that are used at higher frequency in the protein coding regions than other codons that code for the same amino acid. The preferred codons may be determined in relation to codon usage in a single gene, a set of genes of common function or origin, highly expressed genes, the codon frequency in the aggregate protein coding regions of the whole organism, codon frequency in the aggregate protein coding regions of related organisms, or combinations thereof. Codons whose frequency increases with the level of gene expression are typically optimal codons for expression. A variety of methods are known for determining the codon frequency (e.g., codon usage, relative synonymous codon usage) and codon preference in specific organisms, including multivariate analysis, for example, using cluster analysis or correspondence analysis, and the effective number of codons used in a gene (see GCG CodonPreference, Genetics Computer Group Wisconsin Package; CodonW, John Peden, University of Nottingham; McInerney, J. O., 1998, Bioinformatics 14:372-73; Stenico et al., 1994, Nucleic Acids Res. 22:2437-46; Wright, F., 1990, Gene 87:23-29). Codon usage tables are available for a growing list of organisms (see for example, Wada et al., 1992, Nucleic Acids Res. 20:2111-2118; Nakamura et al., 2000, Nucl. Acids Res. 28:292; Duret, et al., supra; Henaut and Danchin, "Escherichia coli and Salmonella," 1996, Neidhardt, et al. Eds., ASM Press, Washington D.C., p. 2047-2066). The data source for obtaining codon usage may rely on any available nucleotide sequence capable of coding for a protein. These data sets include nucleic acid sequences actually known to encode expressed proteins (e.g., complete protein coding sequences-CDS), expressed sequence tags (ESTS), or predicted coding regions of genomic sequences (see for example, Mount, D., Bioinformatics: Sequence and Genome Analysis, Chapter 8, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 2001; Uberbacher, E. C., 1996, Methods Enzymol. 266:259-281; Tiwari et al., 1997, Comput. Appl. Biosci. 13:263-270).
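One of the codon frequency measures mentioned above, relative synonymous codon usage (RSCU), can be sketched in a few lines of Python. The sketch below is illustrative only and is not the implementation of any of the cited software packages. It assumes the standard genetic code and the commonly used formulation in which RSCU is the observed count of a codon divided by the mean count of all codons encoding the same amino acid; the function name and the toy sequence are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def relative_synonymous_codon_usage(coding_sequence: str) -> dict[str, float]:
    """Return RSCU values for every codon of each amino acid seen in the sequence.

    RSCU = (count of codon) * (number of synonymous codons) / (count of amino acid).
    A value above 1.0 means the codon is used more often than expected under
    uniform usage of its synonyms. Stop codons are treated as one synonym family.
    """
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq), 3))
    unknown = set(counts) - set(CODON_TABLE)
    if unknown:
        raise ValueError(f"unrecognized codons: {sorted(unknown)}")

    # Group codons by the amino acid they encode.
    synonyms: dict[str, list[str]] = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        synonyms[amino_acid].append(codon)

    rscu: dict[str, float] = {}
    for amino_acid, family in synonyms.items():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent from the sequence
        for codon in family:
            rscu[codon] = counts[codon] * len(family) / total
    return rscu


# Hypothetical toy sequence of four valine codons: GTA, GTG, GTG, GTT.
toy_rscu = relative_synonymous_codon_usage("GTAGTGGTGGTT")
# toy_rscu == {"GTT": 1.0, "GTC": 0.0, "GTA": 1.0, "GTG": 2.0}
```

In practice the counts would be pooled over many coding sequences from the host organism, for example a set of highly expressed genes, rather than taken from a single short sequence.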
[0068] “Control sequence” as used herein refers to all sequences, which are necessary or advantageous for the expression of a polynucleotide and/or polypeptide as used in the present disclosure. Each control sequence may be native or foreign to the nucleic acid sequence encoding a polypeptide. Such control sequences include, but are not limited to, a leader, a promoter, a polyadenylation sequence, a pro-peptide sequence, a signal peptide sequence, and a transcription terminator. At a minimum, control sequences typically include a promoter, and transcriptional and translational stop signals. The control sequences may be provided with linkers for the purpose of introducing specific restriction sites facilitating ligation of the control sequences with the coding region of the nucleic acid sequence encoding a polypeptide.
[0069] “Operably linked” as used herein refers to a configuration in which a control sequence is appropriately placed (e.g., in a functional relationship) at a position relative to a polynucleotide sequence or polypeptide sequence of interest such that the control sequence directs or regulates the expression of the sequence of interest.
[0070] “Promoter sequence” refers to a nucleic acid sequence that is recognized by a host cell for expression of a polynucleotide of interest, such as a coding sequence. The promoter sequence contains transcriptional control sequences, which mediate the expression of a polynucleotide of interest. The promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.
[0071] “Percentage of sequence identity,” “percent sequence identity,” “percentage homology,” or “percent homology” are used interchangeably herein to refer to values quantifying comparisons of the sequences of polynucleotides or polypeptides, and are determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (or gaps) as compared to the reference sequence for optimal alignment of the two sequences. The percentage values may be calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Alternatively, the percentage may be calculated by determining the number of positions at which either the identical nucleic acid base or amino acid residue occurs in both sequences or a nucleic acid base or amino acid residue is aligned with a gap to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Those of skill in the art appreciate that there are many established algorithms available to align two sequences. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman, 1981 , Adv. Appl. Math. 2:482, by the homology alignment algorithm of Needleman and Wunsch, 1970, J. Mol. Biol. 48:443, by the search for similarity method of Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. 
USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the GCG Wisconsin Software Package), or by visual inspection (see generally, Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1995 Supplement) (Ausubel)). Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., 1990, J. Mol. Biol. 215:403-410 and Altschul et al., 1997, Nucleic Acids Res. 25:3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information website. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, 1992, Proc Natl Acad Sci USA 89:10915).
Exemplary determination of sequence alignment and % sequence identity can employ the BESTFIT or GAP programs in the GCG Wisconsin Software package (Accelrys, Madison Wis.), using default parameters provided.
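As a non-limiting sketch, the two percentage calculations described above can be written in Python for a pair of sequences that have already been optimally aligned by one of the named algorithms. The function name `percent_identity`, the `gaps_count_as_match` flag, and the toy sequences are assumptions introduced for this example; the alignment step itself is not shown.

```python
def percent_identity(aligned_a, aligned_b, gaps_count_as_match=False):
    """Percent identity over a comparison window of two aligned sequences.

    Both inputs must have equal length, with '-' marking a gap.
    By default only positions with the identical residue in both sequences
    count as matched. With gaps_count_as_match=True, positions where a
    residue is aligned with a gap are also counted (the alternative
    calculation described in the text).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")

    matched = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == b and a != "-":
            matched += 1  # identical residue in both sequences
        elif gaps_count_as_match and (a == "-") != (b == "-"):
            matched += 1  # residue aligned with a gap
    # matched positions / total positions in the window, times 100
    return 100.0 * matched / len(aligned_a)


# Hypothetical six-position window: four identities, one gap, one mismatch.
print(percent_identity("MKV-LA", "MKVQLG"))
print(percent_identity("MKV-LA", "MKVQLG", gaps_count_as_match=True))
```

The denominator is the full window length in both modes, so the two calculations differ only in which positions are counted as matched.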
[0072] “Reference sequence” refers to a defined sequence used as a basis for a sequence comparison. A reference sequence may be a subset of a larger sequence, for example, a segment of a full-length nucleic acid or polypeptide sequence. A reference sequence typically is at least 20 nucleotide or amino acid residue units in length, but can also be the full length of the nucleic acid or polypeptide. Since two polynucleotides or polypeptides may each (1) comprise a sequence (i.e., a portion of the complete sequence) that is similar between the two sequences, and (2) may further comprise a sequence that is divergent between the two sequences, sequence comparisons between two (or more) polynucleotides or polypeptide are typically performed by comparing sequences of the two polynucleotides or polypeptides over a “comparison window” to identify and compare local regions of sequence similarity.
“Comparison window” refers to a conceptual segment of at least about 20 contiguous nucleotide positions or amino acids residues wherein a sequence may be compared to a reference sequence of at least 20 contiguous nucleotides or amino acids and wherein the portion of the sequence in the comparison window may comprise additions or deletions (or gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
[0073] “Substantial identity” or “substantially identical” refers to a polynucleotide or polypeptide sequence that has at least 70% sequence identity, at least 80% sequence identity, at least 85% sequence identity, at least 90% sequence identity, at least 95% sequence identity, or at least 99% sequence identity, as compared to a reference sequence over a comparison window of at least 20 nucleotide or amino acid residue positions, frequently over a window of at least 30-50 positions, wherein the percentage of sequence identity is calculated by comparing the reference sequence to a sequence that includes deletions or additions which total 20 percent or less of the reference sequence over the window of comparison.
[0074] “Corresponding to,” “reference to,” or “relative to” when used in the context of the numbering of a given amino acid or polynucleotide sequence refers to the numbering of the residues of a specified reference sequence when the given amino acid or polynucleotide sequence is compared to the reference sequence. In other words, the residue number or residue position of a given polymer is designated with respect to the reference sequence rather than by the actual numerical position of the residue within the given amino acid or polynucleotide sequence. For example, a given amino acid sequence, such as that of a recombinant polypeptide having THCA synthase activity, can be aligned to a reference sequence by introducing gaps to optimize residue matches between the two sequences. In these cases, although the gaps are present, the numbering of the residue in the given amino acid or polynucleotide sequence is made with respect to the reference sequence to which it has been aligned.
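The numbering convention described above can be illustrated with the following Python sketch, which assigns reference positions to the residues of an aligned sequence and reports residue differences in reference numbering. The function names and the short toy sequences are hypothetical and are not SEQ ID NO: 18 or any sequence of the Sequence Listing.

```python
def reference_numbering(aligned_ref, aligned_query):
    """Map each aligned query residue to the position of the reference residue.

    Positions are 1-based and advance only when the reference column holds a
    residue, so gaps introduced for alignment do not shift the numbering.
    Columns with a gap in either sequence are skipped.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    mapping = []
    ref_pos = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r != "-":
            ref_pos += 1
        if r != "-" and q != "-":
            mapping.append((ref_pos, r, q))
    return mapping


def residue_differences(aligned_ref, aligned_query):
    """List substitutions as reference residue, reference position, new residue."""
    return [
        f"{r}{pos}{q}"
        for pos, r, q in reference_numbering(aligned_ref, aligned_query)
        if r != q
    ]


# Hypothetical alignment: the query lacks reference residue 2 and carries an
# insertion after reference residue 3, yet the final change is still "V5I".
print(residue_differences("MKR-LV", "M-RALI"))
```

In this toy case the substituted residue sits at position 5 of the query's own sequence only by coincidence; the reported number is always that of the reference.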
[0075] “Isolated” as used herein in reference to a molecule means that the molecule (e.g., cannabinoid, polynucleotide, polypeptide) is substantially separated from other compounds that naturally accompany it, e.g., protein, lipids, and polynucleotides. The term embraces nucleic acids which have been removed or purified from their naturally-occurring environment or expression system (e.g., host cell or in vitro synthesis).
[0076] “Substantially pure” refers to a composition in which a desired molecule is the predominant species present (i.e., on a molar or weight basis it is more abundant than any other individual macromolecular species in the composition), and is generally a substantially purified composition when the object species comprises at least about 50 percent of the macromolecular species present by mole or % weight.
[0077] “Recovered” as used herein in relation to an enzyme, protein, or cannabinoid compound, refers to a more or less pure form of the enzyme, protein, or cannabinoid.
[0078] Recombinant Polypeptides with Enhanced THCA Synthase Activity
[0079] The present disclosure provides engineered genes that encode recombinant polypeptides having THCA synthase activity. When integrated into a recombinant host cell (e.g., S. cerevisiae) having a pathway capable of producing a cannabinoid, such as cannabigerolic acid (CBGA), the presence of the engineered genes expressing the recombinant polypeptides results in an increased yield of the THCA synthase product of the cannabinoid. In the case of a recombinant host cell capable of producing the cannabinoid, CBGA, the THCA synthase product cannabinoid, Δ9-tetrahydrocannabinolic acid (Δ9-THCA), is produced by the host cell in greater yield relative to a comparable recombinant host cell integrated with the Cannabis sativa THCA synthase (“d28_THCAS”), which corresponds to the polypeptide of SEQ ID NO: 18. The enzymatic reaction in the cannabinoid pathway of C. sativa catalyzed by the d28_THCAS polypeptide is the oxidative cyclization of the monoterpene moiety of cannabigerolic acid (CBGA) (compound (2)) coupled with the reduction of FAD co-substrate, to form the cannabinoid product Δ9-THCA (compound (1)), as shown in Scheme 1.
Scheme 1 [reaction scheme image: oxidative cyclization of CBGA (compound (2)) to THCA (compound (1))]
[0080] The recombinant polypeptides with THCA synthase activity of the present disclosure when incorporated in a recombinant host cell comprising a pathway that produces a cannabinoid, such as CBGA (compound (2)), are capable, in the presence of FAD, of oxidatively cyclizing that substrate to form a cannabinoid product, such as THCA (compound (1)). Without intending to be bound by any particular theory or mechanism, the conversion of the cannabinoid substrate, CBGA (compound (2)), to the THCA product (compound (1)) as in Scheme 1 , when carried out by the recombinant polypeptides with THCA synthase activity of the present disclosure integrated in a recombinant host cell results in a greater yield of the THCA, relative to a control recombinant host cell strain integrated with a pathway that instead expresses the d28_THCAS polypeptide of SEQ ID NO: 18. The enhanced yield of the cyclized cannabinoid product is correlated with one or more residue differences in recombinant polypeptides of the present disclosure, as compared to the d28_THCAS amino acid sequence of SEQ ID NO: 18, and/or correlated with codon differences in the nucleotide sequences encoding the polypeptides, as compared to the recombinant nucleic acid sequence of SEQ ID NO: 17. Exemplary engineered genes and encoded recombinant polypeptides with THCA synthase activity that exhibit the unexpected and surprising technical effect of increased cannabinoid product yield when integrated in a recombinant host cell are summarized in Table 3 below.
[0081] TABLE 3: Recombinant polypeptides with THCA synthase activity
[Table 3 is reproduced only as page images in the source publication.]
[0082] In at least one embodiment, the recombinant polypeptides having THCA synthase activity and increased activity have one or more residue differences as compared to the reference C. sativa THCA synthase polypeptide of SEQ ID NO: 18. In some embodiments, the recombinant polypeptides have one or more residue differences at residue positions selected from R3, K12, A19, L23, H28, L31, M33, Q41, L43, S72, V75, H108, K137, V151, G207, A214, K233, K234, D256, D258, I266, K268, H274, K276, H282, V293, N301, F317, F332, A335, G348, S354, T367, T418, Y472, K496, N500, and H517. In at least one embodiment, the amino acid residue differences are: R3V, K12G, A19E, A19G, A19Q, L23I, L23V, H28G, H28N, H28Q, L31D, L31G, M33D, Q41R, L43G, L43S, S72A, V75A, V75Y, H108R, K137C, K137F, K137M, K137S, K137Y, V151G, G207A, A214T, K233G, K233S, K233T, K234R, D256S, D258R, I266Q, K268E, H274C, H274E, H274Q, K276Q, H282L, V293I, N301D, F317L, F332L, A335C, A335T, G348A, S354C, T367E, T418V, Y472I, K496E, K496Q, N500D, H517R, H517V, and H517Y.
[0083] It is contemplated that the residue differences relative to SEQ ID NO: 18 at residue positions associated with increased THCA synthase activity can be used in various combinations to form recombinant THCA synthase polypeptides having desirable functional characteristics when integrated in a recombinant host cell, for example increased yield of the cannabinoid product compound, THCA. Some exemplary combinations are described in Table 3 and elsewhere herein. For example, the present disclosure provides a recombinant polypeptide having increased THCA synthase activity and amino acid residue differences as compared to SEQ ID NO: 18 at the following pairs of positions: H23 and A335; Q41 and D258; L43 and K276; S72 and V293; K234 and K496; I266 and H517; N301 and Y472; A335 and G348; and A335 and H517. In at least one embodiment, the recombinant polypeptides can have at least the following residue differences in combination: H23N and A335C; Q41R and D258R; L43G and K276Q; S72A and V293I; K234R and K496E; I266Q and H517Y; N301D and Y472I; A335T and G348A; and A335T and H517V.
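For illustration, residue differences written in the notation used above (reference residue, position, new residue, e.g., R3V) can be applied to a reference sequence with a short Python routine such as the following. The function name `apply_substitutions` and the five-residue toy reference are assumptions of this sketch; the toy reference is not SEQ ID NO: 18.

```python
import re

# One-letter reference residue, 1-based position, one-letter new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(reference, substitutions):
    """Return `reference` with each substitution (e.g., "R3V") applied.

    Raises ValueError if a substitution is malformed, out of range, or names
    a reference residue that does not match the reference sequence.
    """
    residues = list(reference)
    for sub in substitutions:
        match = SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"malformed substitution: {sub!r}")
        expected, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range: {sub!r}")
        if residues[position - 1] != expected:
            raise ValueError(f"reference residue mismatch: {sub!r}")
        residues[position - 1] = new
    return "".join(residues)


# Hypothetical toy reference with R at position 3 and L at position 5.
print(apply_substitutions("MARKL", ["R3V", "L5I"]))
```

Checking the named reference residue before substituting guards against applying a difference to the wrong numbering scheme, for example a sequence that still carries a signal peptide.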
[0084] Based on the correlation of recombinant polypeptide functional information provided herein with the sequence information provided in Table 3 and the accompanying Sequence Listing, one of ordinary skill can recognize that the present disclosure provides a range of recombinant polypeptides having THCA synthase activity, wherein the polypeptide comprises an amino acid sequence comprising one or more of the amino acid differences or sets of amino acid differences (relative to SEQ ID NO: 18) disclosed in any one of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, and 168, and otherwise have at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to a sequence selected from the group consisting of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, and 168.
[0085] Thus, in at least one embodiment, a recombinant polypeptide of the present disclosure having THCA synthase activity can have an amino acid sequence comprising one or more of the amino acid differences or sets of amino acid differences (relative to SEQ ID NO: 18) disclosed in any one of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, and 82, and additionally have 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-14, 1-15, 1-16, 1-18, 1-20, 1-22, 1-24, 1-26, 1-30, 1-35, 1-40, 1-45, 1-50, 1-55, or 1-60 residue differences at other residue positions. In some embodiments, the number of differences can be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 18, 20, 22, 24, 26, 30, 35, 40, 45, 50, 55, or 60 residue differences at the other residue positions.
[0086] In addition to the residue positions specified above, any of the engineered THCA synthase polypeptides disclosed herein can further comprise other residue differences relative to the reference polypeptide of SEQ ID NO: 18 at other residue positions.
[0087] Residue differences at these other residue positions can provide for additional variations in the amino acid sequence without adversely affecting the ability of the recombinant polypeptide to carry out the desired biocatalytic conversion (e.g., conversion of compound (2) to compound (1)). In some embodiments, the recombinant polypeptides can additionally have 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-14, 1-15, 1-16, 1-18, 1-20, 1-22, 1-24, 1-26, 1-30, 1-35, or 1-40 residue differences at other amino acid residue positions as compared to SEQ ID NO: 18. In some embodiments, the number of differences can be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 18, 20, 22, 24, 26, 30, 35, or 40 residue differences at other residue positions. The residue differences at these other positions can include conservative changes or non-conservative changes. In some embodiments, the residue differences can comprise conservative substitutions and non-conservative substitutions as compared to the reference polypeptide of SEQ ID NO: 18.
[0088] In some embodiments, the recombinant polypeptides of the disclosure can be in the form of fusion polypeptides in which the engineered polypeptides are fused to other polypeptides, such as, by way of example and not limitation, antibody tags (e.g., myc epitope), purification sequences (e.g., His tags for binding to metals), and cell localization signals (e.g., secretion signals). Thus, the recombinant polypeptides described herein can be used with or without fusions to other polypeptides. It is also contemplated that the recombinant polypeptides described herein are not restricted to the genetically encoded amino acids. In addition to the genetically encoded amino acids, the polypeptides described herein may be comprised, either in whole or in part, of naturally-occurring and/or synthetic non-encoded amino acids.
[0089] Based on the relatively high level of production of the cannabinoid, THCA, in the trichome cells of the C. sativa plant, it is believed that production of THCA in a recombinant host cell (e.g., yeast) may require secretion of the THCA synthase enzyme that converts CBGA to THCA. The present disclosure contemplates that any of the recombinant polypeptides having THCA synthase activity of the present disclosure may be made or used as a fusion polypeptide construct with an N-terminal secretion peptide, particularly where the recombinant polypeptide is expressed in a recombinant host cell (e.g., yeast) as described elsewhere herein. Exemplary N-terminal secretion peptide (SP) sequences include those disclosed elsewhere herein, including Table 4, the Examples, and the accompanying Sequence Listing, and those disclosed as fusions with d28_THCAS in US Provisional Patent Application No. 63/164,510, filed March 22, 2021, which is hereby incorporated by reference herein.
[0090] In another aspect, the present disclosure provides polynucleotides encoding the recombinant polypeptides having THCA synthase activity and increased activity and/or yield as described herein. In at least one embodiment, the polynucleotide encodes a recombinant polypeptide having THCA synthase activity that comprises an amino acid sequence that is at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more identical to the polypeptide sequence of SEQ ID NO: 18. In some embodiments, the polynucleotide encodes a recombinant polypeptide comprising an amino acid sequence that has the percent identity described above and has one or more amino acid residue differences as compared to SEQ ID NO: 18 described elsewhere herein.
[0091] In at least one embodiment, the polynucleotide has a sequence encoding a recombinant polypeptide that does not include an amino acid difference relative to SEQ ID NO: 18, but which polynucleotide sequence has one or more codon differences relative to SEQ ID NO: 17, which codon differences result in increased yield of the cannabinoid product produced by a recombinant host cell in which the polynucleotide sequence is integrated. In at least one embodiment, the polynucleotide has a sequence of at least 80% identity to SEQ ID NO: 17, and a codon difference as compared to SEQ ID NO: 17 at a position encoding an amino acid residue selected from: V75, H108, K136, V184, K187, R191, G328, F337, A368, P404, L415, T464, D498, and H516. In at least one embodiment, the codon differences at positions V75, H108, K136, V184, K187, R191, G328, F337, A368, P404, L415, T464, D498, and H516 are selected from: V75 (GTA>GTG), H108 (CAC>CAT), K136 (AAG>AAA), V184 (GTA>GTG), K187 (AAG>AAA), R191R (AGG>AGA), G328 (GGOGGT), F337 (TTC>TTT), A368A (GCC>GCG), P404 (CCT>CCC), L415L (TTA>CTG), T464T (ACC>ACG), D498 (GAC>GAT), and H516 (CAT>CAC).
[0092] It is also contemplated that the polynucleotides encoding the recombinant polypeptides having THCA synthase activity and increased activity and/or yield as described herein, can include a combination of one or more codon differences relative to SEQ ID NO: 17, wherein at least one of the codon differences encodes an amino acid difference as compared to SEQ ID NO: 18 and at least one codon difference does not encode an amino acid difference as compared to SEQ ID NO: 18. Accordingly, in at least one embodiment, the present disclosure provides a polynucleotide sequence encoding a recombinant polypeptide having THCA synthase activity, wherein the polynucleotide sequence comprises a combination of a codon difference encoding an amino acid difference and a codon difference selected from: V75 (GTA>GTG), H108 (CAC>CAT), K136 (AAG>AAA), V184 (GTA>GTG), K187 (AAG>AAA), R191R (AGG>AGA), G328 (GGOGGT), F337 (TTC>TTT), A368A (GCC>GCG), P404 (CCT>CCC), L415L (TTA>CTG), T464T (ACC>ACG), D498 (GAC>GAT), and H516 (CAT>CAC).
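As a hedged illustration, whether a codon difference leaves the encoded amino acid unchanged can be checked against the standard genetic code, as in the Python sketch below. The helper names are assumptions of this example, the standard genetic code is assumed, and only a handful of the codon differences listed above (V75, K187, P404, L415, and H516) are used as inputs.

```python
from itertools import product

# Standard genetic code, built in TCAG order (an assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def parse_codon_change(text):
    """Split a change written as 'GTA>GTG' into (old_codon, new_codon)."""
    old, new = text.split(">")
    return old.strip().upper(), new.strip().upper()


def is_synonymous(old_codon, new_codon):
    """True if the codons differ but encode the same amino acid."""
    return old_codon != new_codon and CODON_TABLE[old_codon] == CODON_TABLE[new_codon]


# Codon differences copied from the list above, keyed by residue position.
changes = {
    "V75": "GTA>GTG",
    "K187": "AAG>AAA",
    "P404": "CCT>CCC",
    "L415": "TTA>CTG",
    "H516": "CAT>CAC",
}

for label, change in changes.items():
    old, new = parse_codon_change(change)
    # The first letter of the label is the residue both codons should encode.
    print(label, is_synonymous(old, new) and CODON_TABLE[old] == label[0])
```

By contrast, a change such as AAG to AGG replaces lysine with arginine, so `is_synonymous("AAG", "AGG")` is false and the change would count as an amino acid difference.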
[0093] In at least one embodiment, the polynucleotide comprises a sequence encoding an exemplary recombinant polypeptide having THCA synthase activity as disclosed in Table 3 and accompanying Sequence Listing. In at least one embodiment, the polynucleotide comprises a sequence of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identity to a sequence selected from the group consisting of SEQ ID NO: 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, and 167. In at least one embodiment, the polynucleotide comprises a codon degenerate sequence of a sequence selected from the group consisting of SEQ ID NO: 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, and 167.
[0094] The polynucleotide sequences encoding the recombinant polypeptides of the present disclosure may be operatively linked to one or more heterologous regulatory sequences that control gene expression to create a recombinant polynucleotide capable of expressing the polypeptide. Expression constructs containing a heterologous polynucleotide encoding the recombinant polypeptide can be introduced into appropriate host cells to express the corresponding polypeptide. Because of the knowledge of the codons corresponding to the various amino acids, availability of a protein sequence provides a description of all the polynucleotides capable of encoding the subject polypeptide. The degeneracy of the genetic code, where the same amino acids are encoded by alternative or synonymous codons, allows an extremely large number of nucleic acids to be made, all of which encode the recombinant polypeptides having THCA synthase activity disclosed herein. Thus, having identified a particular amino acid sequence, those skilled in the art could make any number of different nucleic acids by simply modifying the sequence of one or more codons in a way which does not change the amino acid sequence of the protein. In this regard, the present disclosure specifically contemplates each and every possible variation of polynucleotides that could be made by selecting combinations based on the possible codon choices, and all such variations are to be considered specifically disclosed for any polypeptide disclosed herein, including the amino acid sequences presented in Table 3 and the accompanying Sequence Listing.
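To give a sense of scale for the degeneracy discussed above, the number of distinct coding sequences for a given amino acid sequence is the product of the number of synonymous codons at each residue. The following Python sketch assumes the standard genetic code; the function name `count_encodings` and the short peptides are hypothetical and chosen only for the example.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, built in TCAG order (an assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons available for each amino acid (stop codons excluded).
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def count_encodings(peptide):
    """Number of distinct coding sequences (without a stop codon) for `peptide`."""
    for aa in peptide:
        if aa not in DEGENERACY:
            raise ValueError(f"not a genetically encoded amino acid: {aa!r}")
    return prod(DEGENERACY[aa] for aa in peptide)


# Hypothetical peptides: M has 1 codon, K has 2, L has 6.
print(count_encodings("MKL"))
print(count_encodings("MKLLRS"))
```

Because the count multiplies at every residue, even a short peptide has many possible encodings, and a full-length polypeptide has far too many to enumerate explicitly.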
[0095] The codons can be selected to fit the host cell in which the protein is being produced. For example, preferred codons used in bacteria are used to express the gene in bacteria; preferred codons used in yeast are used for expression in yeast; and preferred codons used in mammals are used for expression in mammalian cells. It is contemplated that all codons need not be replaced to optimize the codon usage of the recombinant polypeptide since the natural sequence will comprise preferred codons and because use of preferred codons may not be required for all amino acid residues. Consequently, codon optimized polynucleotides encoding the recombinant polypeptide may contain preferred codons at about 40%, 50%, 60%, 70%, 80%, or greater than 90% of codon positions of the full length coding region.
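The fraction of codon positions occupied by preferred codons, as referred to above, could be estimated along the following lines. This Python sketch is illustrative only: the function name `preferred_codon_fraction`, the toy coding sequence, and in particular the small set of "preferred" codons are placeholders invented for the example and are not a codon usage table for any real organism.

```python
def preferred_codon_fraction(cds, preferred_codons):
    """Fraction of complete codons in `cds` that belong to `preferred_codons`."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("coding sequence contains no complete codon")
    hits = sum(1 for codon in codons if codon in preferred_codons)
    return hits / len(codons)


# Placeholder preferred set and toy sequence (hypothetical values).
toy_preferred = {"AAG", "GTT"}
toy_cds = "AAGAAAGTT"  # AAG (preferred), AAA (not preferred), GTT (preferred)
print(preferred_codon_fraction(toy_cds, toy_preferred))
```

A value computed this way for a full-length coding region could then be compared against thresholds such as the 40% to greater than 90% range recited above.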
[0096] The present disclosure also provides an expression vector comprising a polynucleotide encoding a recombinant polypeptide having increased THCA synthase activity, and one or more expression regulating regions such as a promoter, a terminator, a replication origin, or the like, depending on the type of hosts into which they are to be introduced. The various nucleic acid and control sequences described above may be joined together to produce a recombinant expression vector which may include one or more convenient restriction sites to allow for insertion or substitution of the nucleic acid sequence encoding the recombinant polypeptide at such sites. Alternatively, a polynucleotide sequence of the present disclosure may be expressed by inserting the nucleic acid sequence or a nucleic acid construct comprising the sequence into an appropriate vector for expression. In creating the expression vector, the coding sequence is located in the vector so that the coding sequence is operably linked with the appropriate control sequences for expression. The recombinant expression vector may be any vector (e.g., a plasmid or virus), which can be conveniently subjected to recombinant DNA procedures and can bring about the expression of the polynucleotide sequence. The choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vectors may be linear or closed circular plasmids.
[0097] The expression vector may be an autonomously replicating vector, i.e., a vector that exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a mini-chromosome, or an artificial chromosome. The vector may contain any means for assuring self-replication. Alternatively, the vector may be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated.
Furthermore, a single vector or plasmid or two or more vectors or plasmids which together contain the total DNA to be introduced into the genome of the host cell, or a transposon may be used. In at least one embodiment, the expression vector further comprises one or more selectable markers, which permit easy selection of transformed cells.
[0098] The present disclosure also provides a host cell comprising a polynucleotide or expression vector encoding a recombinant polypeptide of the present disclosure, wherein the polynucleotide is operatively linked to one or more control sequences for expression of the polypeptide having THCA synthase activity in the host cell. Host cells for use in expressing the polypeptides encoded by the expression vectors of the present invention are well known in the art and include, but are not limited to, bacterial cells, such as E. coli; fungal cells, such as Saccharomyces cerevisiae or Pichia pastoris; insect cells, such as Drosophila S2 and Spodoptera Sf9; animal cells, such as CHO, COS, BHK, and 293; and plant cells. Appropriate culture media and growth conditions for the above-described host cells are well known in the art. Accordingly, in at least one embodiment, the present disclosure provides a method for producing a cannabinoid comprising: (a) culturing in a suitable medium a recombinant host cell of the present disclosure; and (b) recovering the produced cannabinoid.
[0099] Use in Recombinant Host Cells
[0100] The recombinant polynucleotides of the present disclosure that encode recombinant polypeptides having THCA synthase activity can be incorporated into recombinant host cells for enhanced in vivo cannabinoid biosynthesis. In the context of recombinant host cells, the recombinant polynucleotides can be incorporated into a pathway capable of producing a cannabinoid, such as CBGA and CBGVA, and thereby provide the THCA synthase activity for biosynthesis of further cyclized cannabinoids, such as THCA and THCVA, by the cells. As described elsewhere herein, the recombinant polynucleotides encoding recombinant polypeptides having THCA synthase activity of the present disclosure when integrated into recombinant host cells with a pathway capable of converting hexanoic acid (HA) to cannabigerolic acid (CBGA) exhibit enhanced yields of the further cyclized cannabinoid product, THCA. Similarly, the recombinant polynucleotides encoding recombinant polypeptides having THCA synthase activity of the present disclosure when integrated into recombinant host cells with a pathway capable of converting butyric acid (BA) to cannabigerovarinic acid (CBGVA) exhibit enhanced yields of the rare cyclized cannabinoid product, THCVA.
[0101] Alternatively, it is contemplated that the recombinant polynucleotides encoding polypeptides having THCA synthase activity of the present disclosure can be integrated into recombinant host cells with a shorter cannabinoid pathway capable of converting the cannabinoid precursor, olivetolic acid (OA) to cannabigerolic acid (CBGA). In such an embodiment, the recombinant host cells exhibit enhanced yields of the further cyclized cannabinoid product, THCA, when fed the OA compound.
[0102] Generally, the cannabinoid pathway of the recombinant host cell is made up of a sequence of linked enzymes that produce a cannabinoid precursor substrate (e.g., OA) and then convert that precursor to a prenylated cannabinoid compound (e.g., CBGA). Accordingly, the pathway comprises at least a THCA synthase capable of oxidatively cyclizing the monoterpene moiety of the prenylated cannabinoid compound using a redox acceptor co-substrate, such as FAD. Further decarboxylation of the produced cannabinoid compound can also be part of the cannabinoid pathway. As described elsewhere herein, it is contemplated that a wide range of cannabinoid compounds can be produced biosynthetically by a recombinant host cell integrated with such a cannabinoid pathway. Methods and techniques for integrating polynucleotides expressing pathway enzymes into recombinant host cells, such as yeast, are well known in the art and described elsewhere herein, including the Examples.
[0103] One exemplary cannabinoid pathway is depicted in FIG. 1. As shown in FIG. 1, this pathway is capable of converting hexanoic acid (HA) to the cannabinoid, cannabigerolic acid (CBGA). The pathway of FIG. 1 includes the sequence of four enzymes: (1) acyl activating enzyme (AAE), a CoA ligase enzyme of class E.C. 6.2.1.1, or a fatty acyl-CoA ligase (FACL) of class E.C. 6.2.1.3 (e.g., FAA1 or FAA4); (2) olivetol synthase (OLS), a CoA synthase enzyme of class E.C. 2.3.1.206; (3) olivetolic acid cyclase (OAC), a carbon-sulfur lyase enzyme of class E.C. 4.4.1.26; and (4) prenyltransferase (PT), a transferase of class E.C. 2.5.1.102. The first two enzymes carry out the conversion of the HA starting compound to the precursor tetraketide-CoA compound, 3,5,7-trioxododecanoyl-CoA. The activity of the third enzyme, OAC, catalyzes the CoA lyase reaction and cyclization of the tetraketide-CoA to provide the cannabinoid precursor, olivetolic acid (OA). The prenyltransferase activity of the fourth enzyme catalyzes the prenylation of OA with geranyl pyrophosphate (GPP), thereby forming the cannabinoid compound, CBGA. As illustrated by FIG. 2, further enzymatic modification of the prenylated cannabinoid compound, CBGA, to provide cannabinoids, such as CBDA, THCA, and/or CBCA, can be carried out by including a cannabinoid synthase (e.g., CBDAS, THCAS) as a fifth enzyme in the pathway.
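The four-enzyme sequence described above can be sketched as an ordered list of steps, which makes it easy to check that each product feeds the next enzyme. This is only an illustrative data structure, not part of the disclosure: the names `Step`, `HA_TO_CBGA`, `is_linked`, and `extend` are hypothetical, and the intermediate "hexanoyl-CoA" is assumed as the AAE product because the paragraph does not name it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    """One enzymatic conversion in a pathway (illustrative only)."""
    enzyme: str
    ec_class: str  # E.C. class as stated in the text, or "" if not stated
    substrate: str
    product: str


# Steps of the FIG. 1 pathway as described in the text.
# "hexanoyl-CoA" is an assumption: the paragraph does not name the AAE product.
HA_TO_CBGA: List[Step] = [
    Step("AAE", "6.2.1.1", "hexanoic acid", "hexanoyl-CoA"),
    Step("OLS", "2.3.1.206", "hexanoyl-CoA", "3,5,7-trioxododecanoyl-CoA"),
    Step("OAC", "4.4.1.26", "3,5,7-trioxododecanoyl-CoA", "olivetolic acid"),
    Step("PT", "2.5.1.102", "olivetolic acid", "CBGA"),
]


def is_linked(steps: List[Step]) -> bool:
    """Return True if every step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(steps, steps[1:]))


def extend(steps: List[Step], synthase: str, product: str) -> List[Step]:
    """Append a fifth, synthase-catalyzed step acting on the last product."""
    return steps + [Step(synthase, "", steps[-1].product, product)]


# Adding THCAS as the fifth enzyme, as the text describes for FIG. 2.
ha_to_thca = extend(HA_TO_CBGA, "THCAS", "THCA")
```

Here `ha_to_thca` holds five linked steps ending in THCA. The geranyl pyrophosphate co-substrate of the PT step is left out to keep the sketch linear.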
[0104] Exemplary cannabinoid pathway enzymes that can be introduced into a recombinant host cell to provide the pathways as illustrated in FIGS. 1 and 2 include, but are not limited to, the enzymes derived from C. sativa, AAE1, OLS, OAC, PT4, CBDAS, and/or THCAS, listed in Table 4 (below), and homologs and variants of these enzymes, as described elsewhere herein.
[0105] TABLE 4: Exemplary cannabinoid pathway enzymes
Figure imgf000046_0001
[0106] The sequences of the exemplary cannabinoid pathway enzymes AAE1, OLS, OAC, PT4, CBDAS, and THCAS listed in Table 4 are naturally occurring sequences from the plant source, Cannabis sativa. In the recombinant host cell embodiments of the present disclosure, it is contemplated that the THCAS enzyme of SEQ ID NO: 16 or 18 is replaced in the host cell by a recombinant polynucleotide encoding a recombinant polypeptide having THCA synthase activity of the present disclosure, e.g., a THCA synthase provided in Table 3 and the accompanying Sequence Listing.
[0107] It is also contemplated that the other heterologous cannabinoid pathway enzymes used in the recombinant host can include naturally occurring sequence homologs of the AAE1, OLS, OAC, and PT4 enzymes and/or enzymes having non-naturally occurring sequences. For example, such enzymes can have amino acid sequences engineered to function optimally in a particular enzyme pathway, and/or optimally for production of a particular cannabinoid, and/or optimally in a particular host. Methods for preparing such non-naturally occurring enzyme sequences are known in the art and include methods for enzyme engineering such as directed evolution (see, e.g., Stemmer, 1994, Proc Natl Acad Sci USA 91: 10747-10751; PCT Publ. Nos. WO 95/22625, WO 97/0078, WO 97/35966, WO 98/27230, WO 00/42651, and WO 01/75767; U.S. Pat. Nos. 6,537,746; 6,117,679; 6,376,246; and 6,586,182; and U.S. Pat. Publ. Nos. 20080220990A1 and 20090312196A1; each of which is hereby incorporated by reference herein). Other modifications of cannabinoid pathway enzymes contemplated by the present disclosure include modification of the enzyme’s amino acid sequence at either its N- or C-terminus by truncation or fusion. For example, in at least one embodiment of the pathway for producing a cannabinoid, versions of the AAE1, OLS, OAC, and/or PT4 enzymes that are engineered with amino acid substitutions and/or truncated at the N- or C-terminus can be prepared using methods known in the art, and used in the compositions and methods of the present disclosure. In one embodiment, a PT4 enzyme of SEQ ID NO: 8 that is truncated at the N-terminus by 82 amino acids can be used. The amino acid sequence of such a truncated PT4 is provided herein as the d82_PT4 enzyme of SEQ ID NO: 10. 
Accordingly, in at least one embodiment of the recombinant host cell, the pathway capable of producing a cannabinoid comprises at least enzymes having an amino acid sequence of at least 90% identity to SEQ ID NO: 2 (AAE1), SEQ ID NO: 4 (OLS), SEQ ID NO: 6 (OAC), and SEQ ID NO: 10 (d82_PT4), and an amino acid sequence of at least 90% identity to a recombinant polypeptide having THCA synthase activity of the present disclosure as provided in Tables 3, 6, and 8, and the accompanying Sequence Listing.
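As a rough illustration of the "at least 90% identity" criterion, the sketch below computes percent identity between two sequences that have already been aligned to equal length. It is a simplified assumption rather than the definition used in this disclosure: identity can be calculated with different denominators and alignment methods, the function name `percent_identity` is hypothetical, and the short sequences are toy strings, not any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding identical residues.

    Both inputs must already be aligned (same length, '-' for gaps).
    Columns that are gaps in both sequences are ignored; a column with
    a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy 10-residue example with a single substitution at the last position.
reference = "MKTAYIAKQR"
variant = "MKTAYIAKQK"
identity = percent_identity(reference, variant)
meets_90 = identity >= 90.0
```

In practice the alignment itself would come from a dedicated alignment tool, and the threshold would be applied to the full-length enzyme sequences.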
[0108] The present disclosure provides engineered recombinant polypeptides that have THCA synthase activity and that exhibit enhanced THCA production when expressed in a recombinant host cell with a cannabinoid pathway capable of producing CBGA. The amino acid sequences of these engineered polypeptides include one or more amino acid differences relative to the naturally occurring THCA synthase sequence of SEQ ID NO: 18 that result in increased THCA titer from the cells when fed hexanoic acid. Accordingly, in at least one embodiment, the present disclosure provides a recombinant host cell comprising nucleic acids encoding a cannabinoid pathway comprising an engineered polypeptide with THCA synthase activity capable of converting CBGA to THCA, wherein the production of THCA by the recombinant host cell is increased at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or more, relative to a control recombinant host cell comprising a pathway with the recombinant polypeptide having THCA synthase activity replaced by the naturally occurring THCA synthase polypeptide of SEQ ID NO: 18. This increased THCA titer is achieved by the host cell including a nucleic acid encoding a recombinant polypeptide having at least 80% identity to SEQ ID NO: 18 and an amino acid residue difference as compared to SEQ ID NO: 18 at one or more positions selected from: R3, L23, H28, L31, M33, L43, S72, V75, K137, G207, K233, I266, K268, K276, H282, V293, N301, A335, G348, T418, Y472, N500, and H517. As described elsewhere herein, specific exemplary amino acid residue differences relative to SEQ ID NO: 18 that can be used include one or more of: R3V, L23I, L23V, H28G, H28Q, L31D, L31G, M33D, L43G, L43S, S72A, V75A, V75Y, K137C, K137F, K137M, K137S, K137Y, G207A, K233G, K233S, K233T, I266Q, K268E, K276Q, H282L, V293I, N301D, A335T, G348A, T418V, Y472I, N500D, H517R, H517V, and H517Y.
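The residue-difference notation used above (e.g., R3V: arginine at position 3 of the reference replaced by valine) can be read mechanically. The sketch below, offered only as an illustration, parses such codes and applies them to a reference string using 1-based positions. The function names are hypothetical and the reference is a six-residue toy sequence chosen so that position 3 is R; it is not SEQ ID NO: 18.

```python
import re
from typing import Iterable, Tuple

# One-letter reference residue, 1-based position, one-letter new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code: str) -> Tuple[str, int, str]:
    """Split a code such as 'K137C' into ('K', 137, 'C')."""
    match = _SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(reference: str, codes: Iterable[str]) -> str:
    """Return the reference with each substitution applied.

    Raises ValueError if a position is out of range or the reference
    residue at that position does not match the code.
    """
    residues = list(reference)
    for code in codes:
        old, position, new = parse_substitution(code)
        if not 1 <= position <= len(reference):
            raise ValueError(f"{code}: position outside reference")
        if reference[position - 1] != old:
            raise ValueError(f"{code}: reference has {reference[position - 1]}")
        residues[position - 1] = new
    return "".join(residues)


# Toy reference (NOT SEQ ID NO: 18); position 3 is R so that R3V applies.
toy_reference = "MNRSAL"
toy_variant = apply_substitutions(toy_reference, ["R3V"])
```

Checking the reference residue before substituting catches numbering mistakes, for example when a position is counted on the full-length rather than the truncated sequence.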
[0109] Additionally, in at least one embodiment, the recombinant polynucleotide encoding the engineered polypeptide with THCA synthase activity can further include neutral codon differences at certain amino acid encoding positions that result in enhanced THCA titer from recombinant host cells. Exemplary neutral codon differences resulting in enhanced THCA titer from host cells fed hexanoic acid (HA) include: V75 (GTA>GTG), V184 (GTA>GTG), G328 (GGC>GGT), F337 (TTC>TTT), P404 (CCT>CCC), D498 (GAC>GAT), and H516 (CAT>CAC).
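A "neutral" codon difference changes the DNA codon without changing the encoded amino acid. The sketch below checks this against the standard genetic code for three of the codon changes listed above. It is an illustration only: the names `CODON_TABLE` and `is_synonymous` are hypothetical, and the standard nuclear code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # TTT..TGG
    "LLLLPPPPHHQQRRRR"  # CTT..CGG
    "IIIMTTTTNNKKSSRR"  # ATT..AGG
    "VVVVAAAADDEEGGGG"  # GTT..GGG
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def is_synonymous(old_codon: str, new_codon: str) -> bool:
    """Return True if both codons encode the same amino acid."""
    return CODON_TABLE[old_codon.upper()] == CODON_TABLE[new_codon.upper()]


# Three of the neutral codon differences listed in the paragraph above.
listed_changes = {
    "V75": ("GTA", "GTG"),
    "P404": ("CCT", "CCC"),
    "H516": ("CAT", "CAC"),
}
all_neutral = all(is_synonymous(old, new) for old, new in listed_changes.values())
```

The same lookup also confirms the residue letter in each label, for example that GTA encodes V (valine) at position 75.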
[0110] Accordingly, in at least one embodiment, the present disclosure provides a recombinant host cell comprising a pathway capable of producing a cannabinoid, wherein the pathway comprises enzymes capable of catalyzing reactions (i) - (iv):
Figure imgf000048_0002
and
(iv)
Figure imgf000048_0001
Geranyldiphosphate
[0111] As shown in FIG. 1, exemplary enzymes capable of catalyzing reactions (i) - (iv) are: (i) acyl activating enzyme (AAE); (ii) olivetol synthase (OLS); (iii) olivetolic acid cyclase (OAC); and (iv) prenyltransferase (PT).
[0112] As shown in FIG. 2, the cannabinoid compound, CBGA, that is produced by the four enzyme cannabinoid pathway of FIG. 1, can be further converted to any of at least three other different cannabinoid compounds, Δ9-tetrahydrocannabinolic acid (THCA), cannabidiolic acid (CBDA), and/or cannabichromenic acid (CBCA). This further enzymatic cyclization of CBGA can include the conversion of (v) CBGA to Δ9-THCA; (vi) CBGA to CBDA; and/or (vii) CBGA to CBCA, as shown in the reaction schemes below.
Figure imgf000049_0001
[0113] Thus, a recombinant host cell comprising a pathway capable of converting hexanoic acid to CBGA (or simply OA to CBGA) can be further extended to include enzymes capable of catalyzing a reaction (v), (vi), and/or (vii), and thereby produce any or all of the cyclized cannabinoid product compounds. As shown in FIG. 2, exemplary enzymes capable of catalyzing reactions (v)-(vii) are: (v) THCA synthase (THCAS); (vi) CBDA synthase (CBDAS); and (vii) CBCA synthase (CBCAS). The extension of the four enzyme exemplary pathway of FIG. 1 in a recombinant host cell can be carried out by further integrating a recombinant polynucleotide sequence capable of expressing a cannabinoid synthase (e.g., CBDAS, THCAS, and/or CBCAS), and can thus provide a cell capable of biosynthetic production of one or more of the further cyclized cannabinoids, Δ9-THCA, CBDA, and/or CBCA.
[0114] The present disclosure contemplates that a recombinant host cell comprising a cannabinoid pathway, such as AAE, OLS, OAC, and PT, capable of converting HA to CBGA, or even a single enzyme pathway of PT, capable of converting OA to CBGA, could be further modified by integrating a recombinant polynucleotide capable of expressing a recombinant polypeptide with THCAS activity of the present disclosure. The addition of the THCAS activity to the pathway allows for the conversion of the cannabinoid, CBGA to the further cyclized cannabinoid, THCA. The resulting cannabinoid pathway combines the pathway of FIG. 1 with the further THCAS catalyzed conversion depicted in FIG. 2. It is contemplated that any of the recombinant polynucleotides encoding recombinant polypeptides having THCAS activity of the present disclosure can be incorporated in a host cell to provide such a pathway.
[0115] Furthermore, as shown in FIG. 2, the cannabinoids, Δ9-THCA, CBDA, and CBCA, can be further decarboxylated to provide the cannabinoids, Δ9-THC, CBD, and/or CBC. Accordingly, it is contemplated that in some embodiments this further decarboxylation reaction can be carried out under in vitro reaction conditions using the cannabinoid acids separated and/or isolated from the recombinant host cells.
[0116] Other cannabinoid pathway enzymes useful in the recombinant host cells and associated methods of the present disclosure are known in the art, and can include naturally occurring enzymes obtained or derived from cannabis plants, or non-naturally occurring enzymes that have been engineered based on the naturally occurring cannabis plant sequences. It is also contemplated that enzymes obtained or derived from other organisms (e.g., microorganisms) having a catalytic activity related to a desired conversion activity useful in a cannabinoid pathway can be engineered for use in a recombinant host cell of the present disclosure.
[0117] Although the cannabinoid pathways of FIGS. 1-2 depict the production of the more common naturally occurring cannabinoids, CBGA, Δ9-THCA, CBDA, and CBCA, it is also contemplated that the recombinant polypeptides, cannabinoid pathways, recombinant host cells, and associated methods of the present disclosure can also be used to biosynthesize a range of additional rarely occurring, and/or synthetic cannabinoid compounds. Table 1 depicts the names and structures of a wide range of exemplary rarely occurring, and/or synthetic cannabinoid compounds that are contemplated for production using the recombinant polypeptides, host cells, compositions and methods of the present disclosure. Similarly, Table 2 depicts additional rarely occurring, and/or synthetic cannabinoid precursor compounds that could be produced by such recombinant host cells in the pathway for production of certain rarely occurring, and/or synthetic cannabinoid compounds of Table 1. Accordingly, in at least one embodiment, a recombinant host cell that includes a pathway to a cannabinoid and that expresses a recombinant polypeptide having THCA synthase activity of the present disclosure (e.g., as in Tables 3, 6, and 8) can be used for the biosynthetic production of a rarely occurring, and/or synthetic cannabinoid compound, or a composition comprising such a cannabinoid compound. It is contemplated that the produced rarely occurring, and/or synthetic cannabinoid compound can include, but is not limited to, the cannabinoid compounds of Table 1. 
Accordingly, in at least one embodiment, a recombinant host cell of the present disclosure can be used for production of a cannabinoid compound selected from cannabigerolic acid (CBGA), cannabigerol (CBG), cannabidiolic acid (CBDA), cannabidiol (CBD), Δ9-tetrahydrocannabinolic acid (Δ9-THCA), Δ9-tetrahydrocannabinol (Δ9-THC), Δ8-tetrahydrocannabinolic acid (Δ8-THCA), Δ8-tetrahydrocannabinol (Δ8-THC), cannabichromenic acid (CBCA), cannabichromene (CBC), cannabinolic acid (CBNA), cannabinol (CBN), cannabidivarinic acid (CBDVA), cannabidivarin (CBDV), Δ9-tetrahydrocannabivarinic acid (Δ9-THCVA), Δ9-tetrahydrocannabivarin (Δ9-THCV), cannabidibutolic acid (CBDBA), cannabidibutol (CBDB), Δ9-tetrahydrocannabutolic acid (Δ9-THCBA), Δ9-tetrahydrocannabutol (Δ9-THCB), cannabidiphorolic acid (CBDPA), cannabidiphorol (CBDP), Δ9-tetrahydrocannabiphorolic acid (Δ9-THCPA), Δ9-tetrahydrocannabiphorol (Δ9-THCP), cannabichromevarinic acid (CBCVA), cannabichromevarin (CBCV), cannabigerovarinic acid (CBGVA), cannabigerovarin (CBGV), cannabicyclolic acid (CBLA), cannabicyclol (CBL), cannabielsoinic acid (CBEA), cannabielsoin (CBE), cannabicitranic acid (CBTA), cannabicitran (CBT), and any combination thereof.
[0118] In at least one embodiment, the compositions and methods of the present disclosure can be used for the production of the more rarely occurring varin series of cannabinoids, CBGVA, Δ9-THCVA, CBDVA, and CBCVA. As shown in Table 1, the varin cannabinoids feature a 3 carbon propyl side chain rather than the 5 carbon pentyl side chain found in the common cannabinoids, CBGA, Δ9-THCA, CBDA, and CBCA. An exemplary cannabinoid pathway capable of producing the rare naturally occurring cannabinoid, cannabigerovarinic acid (CBGVA), is depicted in FIG. 3. Instead of starting with hexanoic acid, the pathway of FIG. 3 is fed butyric acid (BA), which is converted to divarinic acid (DA) via the same three enzyme pathway of AAE, OLS, and OAC. The cannabinoid precursor DA is then converted by a prenyltransferase to the rare prenylated cannabinoid, CBGVA. Accordingly, in at least one embodiment of the recombinant host cell, the pathway capable of producing a cannabinoid comprises enzymes capable of catalyzing reactions (i) - (iv):
(i)
Figure imgf000051_0001
Butyric acid (BA) Butanoyl-CoA
(ii) an
Figure imgf000052_0001
Figure imgf000052_0003
Figure imgf000052_0002
Cannabigerovarinic acid (CBGVA)
Geranyldiphosphate
[0119] Exemplary enzymes capable of catalyzing reactions (i) - (iv) are: (i) acyl activating enzyme (AAE); (ii) olivetol synthase (OLS); (iii) olivetolic acid cyclase (OAC); and (iv) PT. Exemplary enzymes, AAE1 , OLS, OAC, and PT4 derived from C. sativa are known in the art and also provided in Table 4 and the accompanying Sequence Listing.
[0120] As shown by the exemplary pathway of FIG. 4, with the incorporation of one or more synthase enzymes, the rare varin cannabinoid, CBGVA, can be converted to the rare varin cannabinoids, cannabidivarinic acid (CBDVA), Δ9-tetrahydrocannabivarinic acid (Δ9-THCVA), and cannabichromevarinic acid (CBCVA). Enzymes capable of carrying out these conversions include the C. sativa CBDA synthase, THCA synthase, and CBCA synthase, respectively. Accordingly, in at least one embodiment, the present disclosure provides a recombinant host cell comprising a pathway capable of converting BA to CBGVA and further comprising an enzyme capable of catalyzing the conversion of (v) CBGVA to Δ9-THCVA; (vi) CBGVA to CBDVA; and/or (vii) CBGVA to CBCVA. Thus, in at least one embodiment, the recombinant host cell comprising a pathway capable of converting BA to CBGVA further comprises enzymes capable of catalyzing a reaction (v), (vi), and/or (vii):
(v)
Figure imgf000053_0001
Cannabigerovarinic acid (CBGVA)
Figure imgf000053_0002
Figure imgf000053_0003
Cannabigerovarinic acid (CBGVA) Cannabidivarinic acid (CBDVA)
Figure imgf000053_0004
Cannabigerovarinic acid (CBGVA) Cannabichromevarinic acid (CBCVA)
[0121] As shown in FIG. 4, exemplary enzymes capable of catalyzing the reactions (v)-(vii) are: (v) THCA synthase (THCAS); (vi) CBDA synthase (CBDAS); and (vii) CBCA synthase (CBCAS). Exemplary THCAS, CBDAS, and CBCAS enzymes are provided in Table 4.
[0122] The extension of the four enzyme exemplary pathway of FIG. 3 with a polynucleotide sequence capable of expressing such a cannabinoid synthase (e.g., CBDAS, THCAS, and/or CBCAS) allows for the biosynthetic production of one or more of the rare varin cannabinoids, Δ9-THCVA, CBDVA, and/or CBCVA.
[0123] The present disclosure contemplates that a recombinant host cell comprising a four enzyme pathway, such as AAE, OLS, OAC, and PT, capable of converting BA to CBGVA, or even a single enzyme pathway of PT, capable of converting the rare cannabinoid precursor, DA to CBGVA, could be further modified by integrating a recombinant polynucleotide capable of expressing a recombinant polypeptide with THCA synthase activity to convert the rare cannabinoid, CBGVA to the cyclized rare cannabinoid, THCVA. The resulting cannabinoid pathway combines the pathway of FIG. 3 with the further THCA synthase catalyzed conversion depicted in FIG. 4. It is contemplated that any of the recombinant polynucleotides encoding recombinant polypeptides having THCA synthase activity of the present disclosure can be incorporated in a host cell to provide such a combined pathway capable of producing a rare cannabinoid, such as THCVA.
[0124] As with the production of THCA, the present disclosure provides engineered recombinant polypeptides with THCA synthase activity that exhibit enhanced THCVA production when expressed in a recombinant host cell with a cannabinoid pathway capable of producing CBGVA. The amino acid sequences of these engineered polypeptides include one or more amino acid differences relative to the naturally occurring THCA synthase sequence of SEQ ID NO: 18 that result in increased THCVA titer from the cells when fed butyric acid (BA). Accordingly, in at least one embodiment, the present disclosure provides a recombinant host cell comprising nucleic acids encoding a cannabinoid pathway comprising an engineered polypeptide with THCA synthase activity capable of converting CBGVA to THCVA, wherein the production of THCVA by the recombinant host cell is increased at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or more, relative to a control recombinant host cell comprising a pathway with the recombinant polypeptide having THCA synthase activity replaced by the naturally occurring THCA synthase polypeptide of SEQ ID NO: 18. This increased THCVA titer is achieved by the host cell including a nucleic acid encoding a recombinant polypeptide having at least 80% identity to SEQ ID NO: 18 and an amino acid residue difference as compared to SEQ ID NO: 18 at one or more positions selected from: K12, A19, H28, Q41, V75, H108, V151, A214, K234, D256, D258, H274, F317, F332, A335, S354, T367, and K496; optionally, wherein the amino acid residue difference as compared to SEQ ID NO: 18 is selected from K12G, A19E, A19G, A19Q, H28N, Q41R, V75A, H108R, V151G, A214T, K234R, D256S, D258R, H274C, H274E, H274Q, F317L, F332L, A335C, S354C, T367E, K496E, and K496Q.
[0125] Additionally, in at least one embodiment, the recombinant polynucleotide encoding the engineered polypeptide with THCA synthase activity can further include neutral codon differences at certain amino acid encoding positions that result in enhanced THCVA titer from recombinant host cells fed butyric acid (BA). Exemplary neutral codon differences resulting in enhanced THCVA titer include: H108 (CAC>CAT), K136 (AAG>AAA), K187 (AAG>AAA), R191R (AGG>AGA), A368A (GCC>GCG), L415L (TTA>CTG), and T464T (ACC>ACG).
[0126] Furthermore, as shown in FIG. 4, the rare cannabinoid acids, CBDVA, Δ9-THCVA, and CBCVA, can undergo a further decarboxylation reaction to provide the varin cannabinoid products, cannabidivarin (CBDV), Δ9-tetrahydrocannabivarin (Δ9-THCV), and cannabichromevarin (CBCV), respectively. Accordingly, it is contemplated that in some embodiments this further decarboxylation reaction can be carried out under in vitro reaction conditions using the rare cannabinoid acids separated and/or isolated from the recombinant host cells.
[0127] As shown in FIGS. 1 and 3, a heterologous cannabinoid pathway comprising the sequence of at least the four enzymes AAE, OLS, OAC, and PT is capable of converting a precursor substrate compound, such as hexanoic acid (HA) or butyric acid (BA), to an initial cannabinoid compound, such as cannabigerolic acid (CBGA) or CBGVA. These initial cannabinoid product compounds can themselves be used as a substrate for the in vitro biosynthesis of a range of further cannabinoid product compounds, such as THCA and THCVA, as shown in FIGS. 2 and 4. A wide range of cannabinoid compounds, such as those shown in Table 1, are contemplated for in vivo biosynthetic production in a recombinant host cell of the present disclosure or via a partial or full in vitro biosynthesis process using the recombinant THCAS polypeptides of the present disclosure.
[0128] As described herein, the heterologous cannabinoid pathways of the present disclosure can be incorporated (e.g., by recombinant transformation) into a range of host cells to provide a system for biosynthetic production of cannabinoids (e.g., CBGA, CBGVA, CBDA, CBDVA, THCA, THCVA). Generally, the host cell used in the recombinant host cells of the present disclosure can be any cell that can be recombinantly modified with nucleic acids and cultured to express the recombinant products of those nucleic acids, including polypeptides and metabolites produced by the activity of the recombinant polypeptides. A wide range of suitable sources of host cells are known in the art, and exemplary host cell sources useful as recombinant host cells of the present disclosure include, but are not limited to, Saccharomyces cerevisiae, Yarrowia lipolytica, Pichia pastoris, and Escherichia coli. It is also contemplated that the host cell source for a recombinant host cell of the present disclosure can include a non-naturally occurring cell source, e.g., an engineered host cell. For example, a non-naturally occurring source host cell, such as a yeast cell previously engineered for improved production of recombinant genes, may be used to prepare the recombinant host cell of the present disclosure.
[0129] The recombinant host cells of the present disclosure comprise heterologous nucleic acids encoding a pathway of enzymes capable of producing a cannabinoid (e.g., CBGA or CBGVA), and a heterologous nucleic acid comprising a sequence encoding a recombinant polypeptide having THCA synthase activity capable of oxidatively cyclizing a prenylated cannabinoid substrate using a redox active co-substrate, such as FAD, and thereby forming a cyclized cannabinoid product, such as THCA or THCVA. As described elsewhere herein, nucleic acid sequences encoding the cannabinoid pathway enzymes are known in the art and provided herein (see e.g., Table 4), and can readily be used in accordance with the present disclosure. Typically, the nucleic acid sequences encoding enzymes that form a part of a cannabinoid pathway further include one or more additional nucleic acid sequences, for example, a nucleic acid sequence controlling expression of the enzymes that form a part of a cannabinoid biosynthetic enzyme pathway, and these one or more additional nucleic acid sequences together with the nucleic acid sequence encoding the enzyme can be considered a heterologous nucleic acid sequence. A variety of techniques and methodologies are available and well known in the art for introducing heterologous nucleic acid sequences, such as nucleic acid sequences encoding the cannabinoid pathway enzymes (e.g., AAE, OLS, OAC, and PT), into a host cell so as to attain expression in the host cell. Such techniques are well known to the skilled artisan and can, for example, be found in Sambrook et al., Molecular Cloning, a Laboratory Manual, Cold Spring Harbor Laboratory Press, 2012, Fourth Ed.
[0130] The THCA synthase polypeptide that occurs naturally in C. sativa includes a 28 amino acid secretion peptide fused to its N-terminus. This N-terminal fusion of C. sativa THCAS is provided in Table 4 as SEQ ID NO: 16. As described elsewhere herein, it is contemplated that the recombinant polypeptides of the present disclosure may be expressed in a recombinant host cell as a fusion polypeptide construct with an N-terminal secretion peptide to provide efficient production of THCA. Exemplary N-terminal secretion peptide (SP) sequences include those disclosed in Table 5 below and the accompanying Sequence Listing.
[0131] TABLE 5
Figure imgf000056_0001
Figure imgf000057_0001
[0132] It has previously been shown that the secretion peptides SP-Alpha (SEQ ID NO: 100), SP_AA (SEQ ID NO: 102), SP_AT (SEQ ID NO: 108), SP_IN (SEQ ID NO: 114), SPJV (SEQ ID NO: 116), SP_KP (SEQ ID NO: 118), SP_LZ (SEQ ID NO: 120), or SP_SA (SEQ ID NO: 122), when combined as N-terminal fusions with the recombinant d28_THCAS polypeptide of SEQ ID NO: 18 and integrated into a yeast strain capable of producing CBGA, resulted in enhanced production of THCA. See e.g., US Provisional Patent Application No. 63/164,510, filed March 22, 2021, which is hereby incorporated by reference herein. Accordingly, in at least one embodiment, it is contemplated that any of the recombinant polynucleotides encoding recombinant polypeptides having THCAS activity of the present disclosure can be modified with a polynucleotide sequence (e.g., SEQ ID NO: 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121) so as to express a recombinant polypeptide with an N-terminal secretion peptide sequence of any one of SEQ ID NO: 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, and 122. In at least one embodiment, as exemplified in the Examples, a recombinant polypeptide of the present disclosure (e.g., any one of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, and 168) further comprises an N-terminal SP_AT secretion peptide of SEQ ID NO: 108.
[0133] One of ordinary skill will recognize that the heterologous nucleic acids encoding the recombinant THCA synthase enzymes and/or other pathway enzymes will further comprise transcriptional promoters capable of controlling expression of the enzymes in the recombinant host cell. Generally, the transcriptional promoters are selected to be compatible with the host cell, so that promoters obtained from bacterial cells are used when a bacterial host cell is selected in accordance herewith, while a fungal promoter is used when a fungal host cell is selected, a plant promoter is used when a plant cell is selected, and so on. Promoters useful in the recombinant host cells of the present disclosure may be constitutive or inducible, provided such promoters are operable in the host cells. Promoters that may be used to control expression in fungal host cells, such as Saccharomyces cerevisiae, are well known in the art and include, but are not limited to: inducible promoters, such as a Gal1 promoter or Gal10 promoter, and constitutive promoters, such as an alcohol dehydrogenase (ADH) promoter, a glyceraldehyde-3-phosphate dehydrogenase (GPD) promoter, or an S. pombe Nmt or ADH promoter. Exemplary promoters that may be used to control expression in bacterial cells can include the Escherichia coli promoters lac, tac, trc, trp, or the T7 promoter. Exemplary promoters that may be used to control expression in plant cells include, for example, a Cauliflower Mosaic Virus 35S promoter (Odell et al. (1985) Nature 313:810-812), a ubiquitin promoter (U.S. Pat. No. 5,510,474; Christensen et al. (1989)), or a rice actin promoter (McElroy et al. (1990) Plant Cell 2:163-171). Exemplary promoters that can be used in mammalian cells include a viral promoter, such as an SV40 promoter, or a metallothionein promoter. All of these host cell promoters are well known by and readily available to one of ordinary skill in the art. 
Further nucleic acid control elements useful for controlling expression in a recombinant host cell can include transcriptional terminators, enhancers and the like, all of which may be used with the heterologous nucleic acids incorporated in the recombinant host cells of the present disclosure.
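The host-compatibility rule for promoters described above can be sketched as a simple lookup keyed by host cell type. This is only an illustrative arrangement of the exemplary promoters named in the paragraph: the dictionary name, the host-type keys, and the helper function are hypothetical, and the lists are examples rather than an exhaustive set.

```python
from typing import Dict, List

# Exemplary promoters named in the text, grouped by host cell type.
# The grouping keys are hypothetical labels chosen for this sketch.
PROMOTERS_BY_HOST: Dict[str, List[str]] = {
    "fungal": ["Gal1", "Gal10", "ADH", "GPD", "Nmt"],
    "bacterial": ["lac", "tac", "trc", "trp", "T7"],
    "plant": ["CaMV 35S", "ubiquitin", "rice actin"],
    "mammalian": ["SV40", "metallothionein"],
}


def is_listed_for_host(host_type: str, promoter: str) -> bool:
    """Return True if the promoter is one of the examples for that host type."""
    return promoter in PROMOTERS_BY_HOST.get(host_type, [])


# A yeast (fungal) host paired with a galactose-inducible promoter.
yeast_gal1_ok = is_listed_for_host("fungal", "Gal1")
# A bacterial promoter is not among the examples for a fungal host.
yeast_t7_ok = is_listed_for_host("fungal", "T7")
```

A lookup of this kind only records the examples given; whether a promoter is actually operable in a particular host still has to be established experimentally.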
[0134] A wide variety of techniques are well known in the art for linking transcriptional promoters and other control elements to heterologous nucleic acid sequences encoding cannabinoid pathway genes. Such techniques are described in e.g., Sambrook et al., Molecular Cloning, a Laboratory Manual, Cold Spring Harbor Laboratory Press, 2012, Fourth Ed. Accordingly, in at least one embodiment, the heterologous nucleic acid sequences of the present disclosure comprise a promoter capable of controlling expression in a host cell, wherein the promoter is linked to a nucleic acid sequence encoding a recombinant polypeptide having THCA synthase activity of the present disclosure, and as necessary, other enzymes constituting a cannabinoid pathway (e.g., AAE, OLS, OAC, PT). This heterologous nucleic acid sequence can be integrated into a recombinant expression vector which ensures good expression in the desired host cell, wherein the expression vector is suitable for expression in a host cell, meaning that the recombinant expression vector comprises the heterologous nucleic acid sequence linked to any genetic elements required to achieve expression in the host cell. Genetic elements that may be included in the expression vector in this regard include a transcriptional termination region, one or more nucleic acid sequences encoding marker genes, one or more origins of replication, and the like. In some embodiments, the expression vector further comprises genetic elements required for the integration of the vector or a portion thereof in the host cell's genome.
[0135] It is also contemplated that in some embodiments an expression vector comprising a heterologous nucleic acid of the present disclosure may further contain a marker gene. Marker genes useful in accordance with the present disclosure include any genes that allow the distinction of transformed cells from non-transformed cells, including all selectable and screenable marker genes. A marker gene may be a resistance marker such as an antibiotic resistance marker against, for example, kanamycin or ampicillin. Screenable markers that may be employed to identify transformants through visual inspection include β-glucuronidase (GUS) (U.S. Pat. Nos. 5,268,463 and 5,599,670) and green fluorescent protein (GFP) (Niedz et al., 1995, Plant Cell Rep., 14: 403).
[0136] In at least one embodiment, the present disclosure also provides a method for producing a cannabinoid, wherein a heterologous nucleic acid encoding a recombinant polypeptide having THCA synthase activity (e.g., an exemplary engineered polypeptide of Tables 3, 6, and 8) can be introduced into a recombinant host cell. The recombinant host cell can then be used for production of the polypeptide, or incorporated in a biocatalytic process that utilizes the THCA synthase activity of the recombinant polypeptide expressed by the host cell for the catalytic oxidative cyclization of a prenylated cannabinoid substrate, e.g., the oxidative cyclization of CBGA with FAD to produce THCA. In at least one embodiment, the recombinant host cell can further comprise a pathway of enzymes capable of producing a prenylated cannabinoid (e.g., CBGA or CBGVA) which can act as a substrate for the recombinant polypeptide with THCA synthase activity. 
It is contemplated that a recombinant host cell comprising a heterologous nucleic acid encoding a recombinant polypeptide having THCA synthase activity of the present disclosure can provide improved biosynthesis of a desired cannabinoid (e.g., THCA) product in terms of titer, yield, and production rate, due to the improved characteristics of the expressed THCA synthase activity in the cell associated with the amino acid and codon differences engineered in the gene.
[0137] Accordingly, in at least one embodiment, the present disclosure provides a method of producing a cannabinoid derivative, wherein the method comprises: (a) culturing in a suitable medium a recombinant host cell of the present disclosure; and (b) recovering the produced cannabinoid derivative. In at least one embodiment, the method of producing a cannabinoid derivative further comprises contacting a cell-free extract of the culture containing the produced cannabinoid with a biocatalytic reagent or chemical reagent capable of converting the cannabinoid to a cannabinoid derivative. In at least one embodiment, the biocatalytic reagent is an enzyme capable of converting the produced cannabinoid to a different cannabinoid or a cannabinoid derivative compound. In at least one embodiment, the chemical reagent is capable of chemically modifying the produced cannabinoid to produce a different cannabinoid or a cannabinoid derivative compound. In at least one embodiment of the method for producing a cannabinoid, the method can further comprise contacting a cell-free extract of the culture containing the produced cannabinoid with a biocatalytic reagent or chemical reagent.
[0138] It is contemplated that the cannabinoid or cannabinoid derivative produced using the methods of the present disclosure can be produced and/or recovered from the reaction in the form of a salt. In at least one embodiment, the recovered salt of the cannabinoid, cannabinoid precursor, cannabinoid precursor derivative, or cannabinoid derivative is a pharmaceutically acceptable salt. Such pharmaceutically acceptable salts retain the biological effectiveness and properties of the free base compound.
[0139] It also is contemplated that the recombinant polypeptides with THCA synthase activity of the present disclosure can be incorporated in any biosynthesis method requiring a THCA synthase catalyzed biocatalytic step. Thus, in at least one embodiment, the recombinant polypeptides having THCA synthase activity (e.g., exemplary polypeptides of Tables 3, 6, and 8) can be used in a method for preparing a cannabinoid compound of structural formula (I)
Figure imgf000060_0001
wherein, R1 is C1-C7 alkyl, wherein the method comprises contacting a recombinant polypeptide having THCA synthase activity of the present disclosure (e.g., an exemplary recombinant polypeptide of Tables 3, 6, and 8) under suitable reaction conditions, with a cannabinoid precursor compound of structural formula (II)
Figure imgf000060_0002
wherein, R1 is C1-C7 alkyl.
[0140] Exemplary conversions of cannabinoid compounds of structural formula (II) to cannabinoid compounds of structural formula (I) that are catalyzed by the recombinant polypeptides having THCA synthase activity of the present disclosure include: (1) conversion of cannabigerolic acid (CBGA) to Δ9-tetrahydrocannabinolic acid (Δ9-THCA); and (2) conversion of cannabigerovarinic acid (CBGVA) to Δ9-tetrahydrocannabivarinic acid (Δ9-THCVA). It is contemplated that the recombinant polypeptides having THCA synthase activity of the present disclosure (e.g., polypeptides disclosed in Tables 3, 6, and 8) can catalyze the conversion of other cannabinoid compounds that are structural analogs of CBGA and CBGVA, including but not limited to the exemplary cannabinoid compounds listed in Table 1. Accordingly, in at least one embodiment of the biosynthesis method for conversion of a cannabinoid compound of structural formula (II) to a cannabinoid compound of structural formula (I), the compound of structural formula (II) is CBGA and the compound of structural formula (I) is Δ9-THCA. In at least one embodiment, the compound of structural formula (II) is CBGVA and the compound of structural formula (I) is Δ9-THCVA.
[0141] Suitable reaction conditions for the biosynthesis of cannabinoids are known in the art, and can be used with the recombinant polypeptides having THCA synthase activity of the present disclosure. In at least one embodiment, the suitable reaction conditions comprise the presence of a redox active co-substrate molecule, such as FAD, which is capable of acting as an electron acceptor molecule. Additionally, suitable reaction conditions for the exemplary polypeptides of the present disclosure can be determined using routine techniques known in the art for optimizing biocatalytic reactions. It is contemplated that various ranges of suitable reaction conditions can be used with the recombinant polypeptides of the present disclosure, including but not limited to ranges of pH, temperature, buffer, solvent system, substrate loading, polypeptide loading, co-substrate or co-factor loading, atmosphere, and reaction time. Suitable reaction conditions can be readily determined and optimized for particular reactions by routine experimentation that includes, but is not limited to, contacting the recombinant polypeptide and substrate under experimental reaction conditions of concentration, pH, temperature, and solvent conditions, and detecting the production of the desired compound of structural formula (I).
[0142] In at least one embodiment, the suitable reaction conditions comprise a reaction solution of ~pH 7-8 and a temperature of 25°C to 37°C; optionally, the reaction conditions comprise a reaction solution of ~pH 7 and a temperature of ~30°C. In at least one embodiment, the reaction solution is allowed to incubate at a temperature of 25°C to 37°C for a reaction time of at least 1, 6, 12, 24, or 48 hours, before the amount of reaction product is determined.
[0143] The present disclosure also contemplates that the methods for biocatalytic conversion of a cannabinoid compound of structural formula (II) to a cannabinoid compound of structural formula (I) using a recombinant polypeptide having THCA synthase activity of the present disclosure can comprise additional chemical or biocatalytic steps carried out on the product compound of structural formula (I), including steps of product compound work-up, extraction, isolation, purification, and/or crystallization, each of which can be carried out under a range of conditions.
EXAMPLES
[0144] Various features and embodiments of the disclosure are illustrated in the following representative examples, which are intended to be illustrative, and not limiting. Those skilled in the art will readily appreciate that the specific examples are only illustrative of the invention as described more fully in the claims which follow thereafter. Every embodiment and feature described in the application should be understood to be interchangeable and combinable with every embodiment contained within.
Example 1: Preparation and Screening of Engineered Polypeptides with Improved THCA Synthase Activity
[0145] This example illustrates preparation of site saturation mutagenesis libraries of polypeptides derived from the parent polypeptide, d28_THCAS of SEQ ID NO: 18 and screening for improved activity in the conversion of CBGA to THCA relative to the activity of the parent polypeptide of SEQ ID NO: 18.
[0146] Materials and methods
[0147] A. Site saturation mutagenesis library build:
[0148] The polynucleotide sequence encoding the d28_THCAS polypeptide (SEQ ID NO: 18) from Cannabis sativa was codon optimized as SEQ ID NO: 17. This codon-optimized gene was synthesized fused to a polynucleotide sequence (SEQ ID NO: 107) encoding the secretion peptide SP_AT, MRFPSIFTAVLFAASSALA (SEQ ID NO: 108). The synthetic gene (SEQ ID NO: 123) encoding the complete SP_AT-THCAS fusion (SEQ ID NO: 124) was expressed under the pGal1 promoter (SEQ ID NO: 125) and ALD4 terminator (SEQ ID NO: 126). The construct was integrated into the X-3 site (Easy-Clone 2.0) of a yeast strain which already had integrated genes encoding the cannabinoid pathway enzymes AAE1 (SEQ ID NO: 2), OLS (SEQ ID NO: 4), OAC (SEQ ID NO: 6), and d82_PT4 (SEQ ID NO: 10). The resulting strain EVT001 integrated with the SP_AT-THCAS fusion gene thus included a cannabinoid pathway of the enzymes AAE, OLS, OAC, PT4, and THCAS capable of converting hexanoic acid (HA) to THCA. This EVT001 strain was used as a control strain in screening the saturation mutagenesis library strains for fold-improvement in THCA titer as described below.
[0149] Genomic DNA from the EVT001 strain with the SP_AT-THCAS fusion integrated at X-3 was used as the template to generate two PCR products: (1) a first PCR product (Fragment A), which does not harbor any degenerate codons, and (2) a second PCR product (Fragment B), which has sequence overlap with the Fragment A, and is amplified harboring one NNK degenerate codon only. Primers used for amplification of Fragments A and B and overlap extension were designed according to standard site-saturation mutagenesis protocols. Fragment B was amplified with a series of forward primers that included the single NNK degenerate codon scanned across the various desired positions and a single reverse primer: 5’-CGGGTATAAGCGAAGAAGCGCAAT-3’ (SEQ ID NO: 127). Fragment A was amplified using a single forward primer: 5’-AGGCGAGAGCCGACATACGA-3’ (SEQ ID NO: 128) and a series of reverse primers designed according to the location of the mutagenesis site. The two fragments A and B were assembled by overlap extension PCR using forward primer, 5’- AGCCCTCCGAAGGAACACTCTC-3’ (SEQ ID NO: 129) and reverse primer of 5’- CGACCTTCCATGGGGTCGC-3’ (SEQ ID NO: 130). The assembled OE-PCR products were then pooled together and gel purified to provide a saturation mutagenesis library of linear donor DNA.
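The NNK degenerate codon used in the Fragment B forward primers can be illustrated with a short Python sketch. It assumes the standard IUPAC meaning of the symbols (N = A, C, G, or T; K = G or T) and the standard genetic code; the function and variable names are hypothetical and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[i]
               for i, (a, b, c) in enumerate(product(BASES, repeat=3))}

def nnk_codons():
    """Enumerate every codon matched by NNK (N = A/C/G/T, K = G/T)."""
    return [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]

codons = nnk_codons()
encoded = {CODON_TABLE[c] for c in codons}
amino_acids = sorted(encoded - {"*"})          # residues reachable at the site
stop_codons = [c for c in codons if CODON_TABLE[c] == "*"]

print(len(codons), "codons;", len(amino_acids), "amino acids; stops:", stop_codons)
```

Under these assumptions a single NNK position samples 32 codons that together cover all 20 amino acids while admitting only one stop codon, which is the usual reason for choosing NNK over NNN in site saturation mutagenesis.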
[0150] The pooled saturation mutagenesis library linear donor DNA was transformed and integrated as a knock-in using CRISPR-Cas9 into an m-Venus cassette in a yeast strain, EVT000. The m-Venus cassette was integrated at the X-3 site under control of the pGal1 promoter and ALD4 terminator. The EVT000 strain (like the control EVT001) already had integrated genes encoding the cannabinoid pathway enzyme activities of AAE, OLS, OAC, and PT4.
[0151] B. Screening of site saturation mutagenesis library for cannabinoid biosynthesis:
[0152] Individual clones from the saturation mutagenesis library integrated into EVT000 and the EVT001 control strain were picked and grown in 0.3 mL YPD in 96-well plates. The culture plates were incubated in shaking incubators for 48 h at 30°C, 85% humidity, and 250 rpm. Cultures were then sub-cultured into 0.27 mL fresh YPD and fed with hexanoic acid (HA) to 2 mM final concentration. Subculture plates were grown in shaking incubators for 48 hours at 30°C, 85% humidity, and 250 rpm. The whole broth from these subculture plates was extracted and analyzed for the presence of the cannabinoid precursor compound, OA, the cannabinoid, CBGA, and the final cannabinoid, THCA, using HPLC, as described below.
[0153] 1. HPLC sample preparation: The whole broth of the culture was extracted and diluted with MeOH for sample preparation. The prepared samples were loaded onto a RapidFire 365 coupled with a triple quadrupole mass spectrometry detector. Metabolites OA, CBGA, and THCA were detected using MRM mode. Calibration curves of OA, CBGA, and THCA were generated by running serial dilutions of standards, and then used to calculate concentrations of each metabolite.
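The calibration step described above can be sketched in Python as an ordinary least-squares line through the standards, followed by back-calculation of a sample concentration. This is a minimal illustration only: the linear model, the standard concentrations, the peak areas, the units, and the dilution factor are all hypothetical and are not taken from the disclosure.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def concentration(peak_area, slope, intercept, dilution_factor=1.0):
    """Invert the calibration line and correct for sample dilution."""
    return (peak_area - intercept) / slope * dilution_factor

# Hypothetical serial dilution of a standard (concentration in mg/L, MRM peak area).
standard_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
standard_area = [550.0, 1050.0, 2050.0, 4050.0, 8050.0]

slope, intercept = fit_line(standard_conc, standard_area)

# Hypothetical sample measured after a 10-fold dilution.
sample_conc = concentration(3050.0, slope, intercept, dilution_factor=10.0)
print(round(slope, 3), round(intercept, 3), round(sample_conc, 3))
```

In practice a separate curve would be fitted for each metabolite, and a weighted or non-linear fit may be preferable if the detector response is not linear over the working range.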
[0154] 2. HPLC instrumentation and parameters: HPLC system: Agilent RapidFire 365; Column: Agilent Cartridge C18 (12 µL, type C); Mobile phase: Pump 1 uses H2O with 0.1% formic acid at 1 mL/min; Pump 2 uses 20:80 acetonitrile:H2O at 0.8 mL/min; Pump 3 uses MeOH with 0.1% formic acid; Aqueous wash uses H2O; Organic wash uses acetonitrile; RapidFire cycle time: Aspiration 600 ms; Load/wash 3000 ms; Extra wash 2000 ms; Elute 4000 ms; Re-equilibration 500 ms.
[0155] C. Sequencing
[0156] Those clones from the saturation mutagenesis library determined by screening to exhibit an increased THCA titer were re-tested and sequenced using Sanger sequencing technology to determine the specific codon differences and amino acid differences.
[0157] D. Results
[0158] Results of screening the saturation mutagenesis library strains for fold-improvement in production of THCA titer from HA feeding (FIOPC), relative to the control strain, EVT001, which expresses the parent d28_THCAS polypeptide of SEQ ID NO: 18, are summarized in Table 6 (below).
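As a rough illustration of the FIOPC metric, the following Python sketch divides each library clone's titer by the mean titer of the control wells on the same plate. The titers, well names, use of the plate mean, and the 1.2-fold hit threshold are assumptions made for this example; the disclosure states only that fold-improvement is measured relative to the EVT001 control strain.

```python
from statistics import mean

def fiopc(variant_titer, control_titers):
    """Fold improvement of a variant titer over the mean control titer."""
    return variant_titer / mean(control_titers)

# Hypothetical titers (arbitrary units) from one 96-well plate.
control_wells = [10.0, 12.0, 11.0]          # control strain replicates
library_wells = {"well_A1": 22.0, "well_A2": 9.9, "well_A3": 16.5}

scores = {well: fiopc(titer, control_wells) for well, titer in library_wells.items()}

# Clones above an assumed 1.2-fold cut-off would be re-tested and sequenced.
hits = sorted(well for well, score in scores.items() if score > 1.2)
print(scores, hits)
```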
[0159] TABLE 6
Figure imgf000063_0001
Figure imgf000064_0001
[0160] As shown by the results in Table 6, the presence of the following amino acid differences in the recombinant polypeptides having THCA synthase activity expressed in the strains from the EVT000 saturation mutagenesis libraries resulted in increased THCA titer produced by the yeast strain: R3V, L23I, L23V, H28G, H28Q, L31D, L31G, M33D, L43G, L43S, S72A, V75A, V75Y, K137C, K137F, K137M, K137S, K137Y, G207A, K233G, K233S, K233T, I266Q, K268E, K276Q, H282L, V293I, N301D, A335T, G348A, T418V, Y472I, N500D, H517R, H517V, and H517Y. Additionally, at least the following combinations of residue differences in the expressed recombinant polypeptides resulted in increased THCA titer produced by the yeast strain: I266Q and H517Y; L43G and K276Q; S72A and V293I; A335T and G348A; A335T and H517V; and N301D and Y472I.
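The substitution labels above follow the usual convention of parent residue, position, and replacement residue. A minimal Python sketch of how such labels could be parsed and applied to a parent sequence is given below; the five-residue parent sequence is a made-up stand-in (SEQ ID NO: 18 is not reproduced here), and the function names are hypothetical.

```python
import re

def parse_substitution(label):
    """Split a label such as 'K137C' into (parent residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)

def apply_substitutions(parent, labels):
    """Return the variant sequence, checking each parent residue (1-based positions)."""
    residues = list(parent)
    for label in labels:
        old, position, new = parse_substitution(label)
        if residues[position - 1] != old:
            raise ValueError(f"{label}: parent has {residues[position - 1]} at {position}")
        residues[position - 1] = new
    return "".join(residues)

# Hypothetical toy parent sequence, used only to exercise the functions.
toy_parent = "MKRLA"
variant = apply_substitutions(toy_parent, ["R3V", "A5T"])
print(parse_substitution("K137C"), variant)
```

Checking the parent residue before substituting catches position offsets, for example when a label is numbered against the mature polypeptide but applied to a sequence that still carries a secretion peptide.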
[0161] It also was observed that certain neutral codon changes, which did not result in an amino acid change in the recombinant polypeptide sequence, resulted in increased THCA titer produced by the yeast strain. Specifically, the following neutral codon changes were observed: V75 (GTA>GTG), V184 (GTA>GTG), G328 (GGC>GGT), F337 (TTC>TTT), P404 (CCT>CCC), D498 (GAC>GAT), and H516 (CAT>CAC).
Example 2: Preparation and Screening of Engineered Polypeptides with Improved THCVA Synthase Activity
[0162] This example illustrates preparation of site saturation mutagenesis libraries of polypeptides derived from the parent polypeptide, d28_THCAS of SEQ ID NO: 18 and screening for improved activity in the conversion of CBGVA to THCVA relative to the activity of the parent polypeptide of SEQ ID NO: 18.
[0163] Materials and methods
[0164] A. Site saturation mutagenesis library build:
[0165] A site saturation mutagenesis library was prepared as described in Example 1.
[0166] B. Screening of site saturation mutagenesis library for cannabinoid biosynthesis:
[0167] Individual clones from the saturation mutagenesis library integrated into EVT000 and the EVT001 control strain were picked and grown in 0.3 mL YPD in 96-well plates. The culture plates were incubated in shaking incubators for 24 h at 30°C, 90% humidity, and 600 rpm (3 mm throw). Cultures were then sub-cultured into 0.27 mL fresh YPD and fed with butyric acid (BA) to 3 mM final concentration. Subculture plates were grown in shaking incubators for 72 hours at 30°C, 90% humidity, and 600 rpm (3 mm throw). The whole broth from these subculture plates was extracted and analyzed for the presence of the cannabinoid precursor compound, DA, the cannabinoid, CBGVA, and the final cannabinoid, THCVA, using HPLC, as described below.
[0168] 1. LC-MS/MS sample preparation: The whole broth of the culture was extracted in 80% acetonitrile/20% ethanol and diluted with 100% acetonitrile for sample preparation. The prepared samples were loaded onto UHPLC coupled to a triple quadrupole mass spectrometry detector. Metabolites DA, CBGVA, and THCVA were detected using SRM mode. Calibration curves of DA, CBGVA, and THCVA were generated by running serial dilutions of standards, and then used to calculate concentrations of each metabolite.
[0169] 2. UHPLC MS instrumentation and parameters: UHPLC system: A Thermo Scientific Vanquish™ UHPLC system equipped with a pump (VF-P10-A), an autosampler (VF-A10-A), and a column compartment (VH-C10-A) was used for the chromatographic separation. Separation was achieved with a Thermo Accucore™ C18 column, 2.6 µm, 150 x 2.1 mm (Thermo Scientific) at 40°C, with an injection volume of 2 µL. The mobile phase consists of 0.1% formic acid in water (A) and 0.1% formic acid in acetonitrile (B). The flow rate is 0.8 mL/min, and the gradient elution program is as follows: 10-95% B (0-1.0 min), 95% B (1.0-2.5 min), 95-10% B (2.5-2.6 min), and 10% B (2.6-3.5 min). Seal wash: 10% acetonitrile in water. Needle wash: IPA:water:methanol:acetonitrile (1:1:1:1).
[0170] Mass spectrometry measurements were performed on a Thermo Scientific TSQ Altis™ triple quadrupole mass spectrometer. Samples were introduced to the MS via electrospray ionization (ESI) in negative mode with selected reaction monitoring (SRM). The mass spectrometer was operated under the following conditions: sheath gas flow rate, 60 Arb; auxiliary gas, 15 Arb. The ESI voltage was 2900 V and the source temperature was 350°C. The parameters for quantification of the SRM transitions are shown below in Table 7.
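The gradient elution program of paragraph [0169] can be written as a table of time/composition breakpoints, as in the Python sketch below. It assumes linear ramps between the stated breakpoints, which the text does not state explicitly; the names are hypothetical.

```python
# (time in min, % mobile phase B) breakpoints taken from the stated gradient program.
GRADIENT = [(0.0, 10.0), (1.0, 95.0), (2.5, 95.0), (2.6, 10.0), (3.5, 10.0)]

def percent_b(t_min):
    """Return %B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-3.5 min program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

for t in (0.0, 0.5, 2.0, 3.0):
    print(t, percent_b(t))
```

At the stated flow rate of 0.8 mL/min, the 3.5 min program corresponds to about 2.8 mL of mobile phase per injection.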
[0171] TABLE 7: Parameters for quantification SRM transitions.
Figure imgf000066_0001
[0172] The data was collected and processed by Chromeleon (7).
[0173] C. Sequencing:
[0174] Those clones from the saturation mutagenesis library determined by screening to exhibit an increased THCVA titer were re-tested and sequenced using Sanger sequencing technology to determine the specific codon differences and amino acid differences.
[0175] D. Results:
[0176] Results of screening the saturation mutagenesis library strains for fold-improvement in production of THCVA titer from BA feeding (FIOPC), relative to the control strain, EVT001, which expresses the parent d28_THCAS polypeptide of SEQ ID NO: 18, are summarized below in Table 8.
[0177] TABLE 8
Figure imgf000066_0002
Figure imgf000067_0001
[0178] As shown by the results in Table 8, the presence of the following amino acid differences in the recombinant polypeptides having THCA synthase activity expressed in the strains from the EVT000 saturation mutagenesis libraries resulted in increased THCVA titer produced by the yeast strain: K12G, A19E, A19G, A19Q, H28N, Q41R, V75A, H108R, V151G, A214T, K234R, D256S, D258R, H274C, H274E, H274Q, F317L, F332L, A335C, S354C, T367E, K496E, and K496Q.
[0179] It also was observed that certain neutral codon changes, which did not result in an amino acid change in the recombinant polypeptide sequence, resulted in increased THCA titer produced by the yeast strain. Specifically, the following neutral codon changes were observed: H108 (CAC>CAT), K136 (AAG>AAA), K187 (AAG>AAA), R191R (AGG>AGA), A368A (GCC>GCG), L415L (TTA>CTG), and T464T (ACC>ACG).
[0180] While the foregoing disclosure of the present invention has been described in some detail by way of example and illustration for purposes of clarity and understanding, this disclosure, including the examples, descriptions, and embodiments described herein, is for illustrative purposes, is intended to be exemplary, and should not be construed as limiting the present disclosure. It will be clear to one skilled in the art that various modifications or changes to the examples, descriptions, and embodiments described herein can be made and are to be included within the spirit and purview of this disclosure and the appended claims. Further, one of skill in the art will recognize a number of equivalent methods and procedures to those described herein. All such equivalents are to be understood to be within the scope of the present disclosure and are covered by the appended claims.
[0181] Additional embodiments of the invention are set forth in the following claims.
[0182] The disclosures of all publications, patent applications, patents, or other documents mentioned herein are expressly incorporated by reference in their entirety for all purposes to the same extent as if each such individual publication, patent, patent application or other document were individually specifically indicated to be incorporated by reference herein in its entirety for all purposes and were set forth in its entirety herein. In case of conflict, the present specification, including specified terms, will control.

Claims

What is claimed is:
1. A recombinant polypeptide having THCA synthase activity, wherein the polypeptide comprises an amino acid sequence of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18, wherein the amino acid difference is:
(a) at one or more positions selected from: K12, A19, Q41, H108, V151, A214, K234, D256, D258, I266, K268, H274, K276, H282, V293, N301, F317, F332, A335, G348, S354, T367, T418, and K496; and/or
(b) selected from: R3V, K12G, A19E, A19G, A19Q, L23V, H28N, H28G, H28Q, L31D, L31G, M33D, Q41R, L43G, L43S, V75A, V75Y, H108R, K137C, K137F, K137M, K137S, K137Y, V151G, G207A, A214T, K233G, K233S, K233T, K234R, D256S, D258R, I266Q, K268E, H274C, H274E, H274Q, K276Q, H282L, V293I, N301D, F317L, F332L, A335C, A335T, G348A, S354C, T367E, T418V, Y472I, K496E, K496Q, N500D, H517R, H517V, and H517Y.
2. The polypeptide of claim 1, wherein the polypeptide comprises at least two amino acid differences selected from: H28N and A335C; Q41R and D258R; L43G and K276Q; S72A and V293I; K234R and K496E; I266Q and H517Y; N301D and Y472I; A335T and G348A; and A335T and H517V.
3. The polypeptide of claim 1, wherein the polypeptide is encoded by a polynucleotide sequence having at least 80% identity to SEQ ID NO: 17, and a codon difference as compared to SEQ ID NO: 17 selected from: V75 (GTA>GTG), H108 (CAC>CAT), K136 (AAG>AAA), V184 (GTA>GTG), K187 (AAG>AAA), R191R (AGG>AGA), G328 (GGC>GGT), F337 (TTC>TTT), A368A (GCC>GCG), P404 (CCT>CCC), L415L (TTA>CTG), T464T (ACC>ACG), D498 (GAC>GAT), and H516 (CAT>CAC).
4. The polypeptide of claim 1, wherein the amino acid difference is:
(a) at one or more positions selected from: A335, I266, K268, K276, H282, V293, N301, G348, and T418; and/or
(b) selected from: R3V, L23V, H28G, H28Q, L31D, L31G, M33D, L43G, L43S, V75A, V75Y, K137C, K137F, K137M, K137S, K137Y, G207A, K233G, K233S, K233T, I266Q, K268E, K276Q, H282L, V293I, N301D, A335T, G348A, T418V, Y472I, N500D, H517R, H517V, and H517Y.
5. The polypeptide of claim 1, wherein the amino acid difference is:
(a) at one or more positions selected from: K12, A19, Q41, H108, V151, A214, K234, D256, D258, H274, F317, F332, A335, S354, T367, and K496; and/or
(b) selected from: K12G, A19E, A19G, A19Q, H28N, Q41R, V75A, H108R, V151G, A214T, K234R, D256S, D258R, H274C, H274E, H274Q, F317L, F332L, A335C, S354C, T367E, K496E, and K496Q.
6. The polypeptide of any one of claims 1-5 in which the polypeptide comprises an amino acid sequence of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to a sequence selected from the group consisting of SEQ ID NO: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, and 168.
7. The polypeptide of any one of claims 1-6 in which the polypeptide comprises an N-terminal secretion peptide; optionally, wherein the N-terminal secretion peptide comprising an amino acid sequence selected from SEQ ID NO: 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, and 122.
8. The polypeptide of any one of claims 1-7 in which the THCA synthase activity of the polypeptide as compared to a polypeptide consisting of SEQ ID NO: 18 is increased at least 1.2-fold, at least 1.5-fold, at least 2-fold, at least 5-fold, or more.
9. A polynucleotide encoding the polypeptide of any one of claims 1-8.
10. The polynucleotide of claim 9 in which the polynucleotide sequence comprises:
(a) a sequence of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to a sequence selected from the group consisting of SEQ ID NO: 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, and 167; or,
(b) a codon degenerate sequence of a sequence selected from the group consisting of SEQ ID NO: 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 159, 161, 163, 165, and 167.
11. The polynucleotide of any one of claims 9-10, wherein the polynucleotide encoding the polypeptide further comprises a polynucleotide sequence encoding an N-terminal secretion peptide comprising an amino acid sequence selected from SEQ ID NO: 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, and 122; optionally, wherein the polynucleotide sequence encoding the N-terminal secretion peptide is selected from SEQ ID NO: 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, and 121.
12. An expression vector comprising the polynucleotide of any one of claims 9-11.
13. The expression vector of claim 12 comprising a control sequence.
14. A host cell comprising the polynucleotide of any one of claims 8-9 or the expression vector of any one of claims 12-13.
15. A method for preparing a polypeptide of any one of claims 1-8 comprising culturing a host cell of claim 14 and isolating the polypeptide from the cell.
16. A method for preparing a recombinant polypeptide having THCA synthase activity comprising:
(a) transforming a host cell with an expression vector comprising a polynucleotide encoding a recombinant polypeptide of any one of claims 1-8;
(b) culturing said transformed host cell under conditions whereby said recombinant polypeptide is produced by said host cell; and
(c) recovering said recombinant polypeptide from said host cells.
17. A recombinant host cell comprising a nucleic acid encoding a recombinant polypeptide having THCA synthase activity of any one of claims 1-8.
18. The host cell of claim 17, wherein the host cell further comprises a pathway of enzymes capable of producing a cannabinoid and/or a cannabinoid precursor; optionally, wherein the cannabinoid or cannabinoid precursor is selected from olivetolic acid (OA), divarinic acid (DA), cannabigerolic acid (CBGA), and cannabigerovarinic acid (CBGVA).
19. The host cell of claim 18, wherein the pathway comprises enzymes capable of converting hexanoic acid (HA) to cannabigerolic acid (CBGA).
20. The host cell of claim 19, wherein the pathway comprises enzymes capable of catalyzing reactions (i) - (iv):
(i) o o
Figure imgf000071_0001
Hexanoic acid Hexanoyl-CoA
Figure imgf000072_0001
Geranyldiphosphate
21. The host cell of any one of claims 17-20, wherein the pathway comprises at least the enzymes AAE, OLS, OAC, and PT4; optionally, wherein the enzymes AAE, OLS, OAC, PT4 have an amino acid sequence of at least 90% identity to SEQ ID NO: 2 (AAE), SEQ ID NO: 4 (OLS), SEQ ID NO: 6 (OAC), and SEQ ID NO: 8 or 10 (PT4) respectively.
22. The host cell of any one of claims 17-21, wherein the cell produces a cannabinoid selected from cannabigerolic acid (CBGA), cannabigerol (CBG), cannabidiolic acid (CBDA), cannabidiol (CBD), Δ9-tetrahydrocannabinolic acid (Δ9-THCA), Δ9-tetrahydrocannabinol (Δ9-THC), Δ8-tetrahydrocannabinolic acid (Δ8-THCA), Δ8-tetrahydrocannabinol (Δ8-THC), cannabichromenic acid (CBCA), cannabichromene (CBC), cannabinolic acid (CBNA), cannabinol (CBN), cannabidivarinic acid (CBDVA), cannabidivarin (CBDV), Δ9-tetrahydrocannabivarinic acid (Δ9-THCVA), Δ9-tetrahydrocannabivarin (Δ9-THCV), cannabidibutolic acid (CBDBA), cannabidibutol (CBDB), Δ9-tetrahydrocannabutolic acid (Δ9-THCBA), Δ9-tetrahydrocannabutol (Δ9-THCB), cannabidiphorolic acid (CBDPA), cannabidiphorol (CBDP), Δ9-tetrahydrocannabiphorolic acid (Δ9-THCPA), Δ9-tetrahydrocannabiphorol (Δ9-THCP), cannabichromevarinic acid (CBCVA), cannabichromevarin (CBCV), cannabigerovarinic acid (CBGVA), cannabigerovarin (CBGV), cannabicyclolic acid (CBLA), cannabicyclol (CBL), cannabielsoinic acid (CBEA), cannabielsoin (CBE), cannabicitranic acid (CBTA), cannabicitran (CBT), and any combination thereof.
23. The host cell of any one of claims 17-22, wherein the cell produces the cannabinoid, THCA.
24. The host cell of claim 23, wherein the production of THCA is increased at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or more, relative to a control recombinant host cell comprising a pathway with the recombinant polypeptide having THCA synthase activity replaced by a polypeptide of SEQ ID NO: 18.
25. The host cell of any one of claims 23-24, wherein the host cell comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and an amino acid residue difference as compared to SEQ ID NO: 18 at one or more positions selected from: R3, L23, H28, L31, M33, L43, S72, V75, K137, G207, K233, I266, K268, K276, H282, V293, N301, A335, G348, T418, Y472, N500, and H517; optionally, wherein the amino acid residue difference as compared to SEQ ID NO: 18 is selected from R3V, L23I, L23V, H28G, H28Q, L31D, L31G, M33D, L43G, L43S, S72A, V75A, V75Y, K137C, K137F, K137M, K137S, K137Y, G207A, K233G, K233S, K233T, I266Q, K268E, K276Q, H282L, V293I, N301D, A335T, G348A, T418V, Y472I, N500D, H517R, H517V, and H517Y.
26. The host cell of any one of claims 23-25, wherein the host cell comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and the nucleic acid comprises a neutral codon difference as compared to SEQ ID NO: 17 selected from: V75 (GTA>GTG), V184 (GTA>GTG), G328 (GGC>GGT), F337 (TTC>TTT), P404 (CCT>CCC), D498 (GAC>GAT), and H516 (CAT>CAC).
27. The host cell of claim 18, wherein the pathway comprises enzymes capable of converting butyric acid (BA) to cannabigerovarinic acid (CBGVA).
28. The host cell of claim 27, wherein the pathway comprises enzymes capable of catalyzing reactions (i) - (iv):
(i)
Figure imgf000073_0001
Butyric acid (BA) Butanoyl-CoA
Figure imgf000074_0001
and
Figure imgf000074_0003
Figure imgf000074_0002
Cannabigerovarinic acid (CBGVA)
Geranyldiphosphate
29. The host cell of any one of claims 27-28, wherein the pathway comprises at least the enzymes AAE, OLS, OAC, and PT4; optionally, wherein the enzymes AAE, OLS, OAC, PT4 have an amino acid sequence of at least 90% identity to SEQ ID NO: 2 (AAE), SEQ ID NO: 4 (OLS), SEQ ID NO: 6 (OAC), and SEQ ID NO: 8 or 10 (PT4) respectively.
30. The host cell of any one of claims 17-29, wherein the cell produces the cannabinoid, THCVA.
31. The host cell of claim 30, wherein the production of THCVA is increased at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or more, relative to a control recombinant host cell comprising a pathway with the recombinant polypeptide having THCA synthase activity replaced by a polypeptide of SEQ ID NO: 18.
32. The host cell of any one of claims 30-31, wherein the host cell comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18 and an amino acid residue difference as compared to SEQ ID NO: 18 at one or more positions selected from: K12, A19, H28, Q41, V75, H108, V151, A214, K234, D256, D258, H274, F317, F332, A335, S354, T367, and K496; optionally, wherein the amino acid residue difference as compared to SEQ ID NO: 18 is selected from K12G, A19E, A19G, A19Q, H28N, Q41R, V75A, H108R, V151G, A214T, K234R, D256S, D258R, H274C, H274E, H274Q, F317L, F332L, A335C, S354C, T367E, K496E, and K496Q.
33. The host cell of any one of claims 30-32, wherein the host cell comprises a nucleic acid encoding a recombinant polypeptide of at least 80% identity to SEQ ID NO: 18, and the nucleic acid comprises a neutral codon difference as compared to SEQ ID NO: 17 selected from: H108 (CAC>CAT), K136 (AAG>AAA), K187 (AAG>AAA), R191R (AGG>AGA), A368A (GCC>GCG), L415L (TTA>CTG), and T464T (ACC>ACG).
34. The host cell of any one of claims 17-33, wherein the recombinant host cell source is selected from Saccharomyces cerevisiae, Yarrowia lipolytica, Pichia pastoris, and Escherichia coli.
35. A method for producing a cannabinoid comprising:
(a) culturing in a suitable medium a recombinant host cell of any one of claims 17-34; and
(b) recovering the produced cannabinoid.
36. The method of claim 35, wherein the method further comprises contacting a cell-free extract of the culture with a biocatalytic reagent or chemical reagent.
37. A method for preparing a compound of structural formula (I)
Figure imgf000075_0001
wherein, R1 is C1-C7 alkyl, comprising contacting under suitable reaction conditions geranyl pyrophosphate (GPP) and a compound of structural formula (II)
Figure imgf000076_0001
wherein, R1 is C1-C7 alkyl, and a recombinant polypeptide of any one of claims 1-8.
38. The method of claim 37, wherein:
(a) the compound of structural formula (I) is Δ9-tetrahydrocannabinolic acid (Δ9-THCA) and the compound of structural formula (II) is cannabigerolic acid (CBGA); or
(b) the compound of structural formula (I) is Δ9-tetrahydrocannabivarinic acid (Δ9-THCVA) and the compound of structural formula (II) is cannabigerovarinic acid (CBGVA).
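As an illustration only, and not part of the claims, the "neutral codon difference" recited in claim 33 can be checked against the standard genetic code: a codon change is neutral (synonymous) when both codons encode the same amino acid. The sketch below assumes the codon pairs of claim 33 read CAC>CAT, AAG>AAA, AGG>AGA, GCC>GCG, TTA>CTG, and ACC>ACG; the names `CLAIM_33_CHANGES`, `translate`, and `is_synonymous` are hypothetical and introduced here for the example.

```python
# Illustrative sketch: verify that codon changes are synonymous under the
# standard genetic code. Names and the reading of the codon pairs are
# assumptions made for this example, not taken from the application.

BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (standard genetic code).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES),
        AMINO_ACIDS,
    )
}

# (position label, original codon, replacement codon), as assumed above.
CLAIM_33_CHANGES = [
    ("H108", "CAC", "CAT"),
    ("K136", "AAG", "AAA"),
    ("K187", "AAG", "AAA"),
    ("R191", "AGG", "AGA"),
    ("A368", "GCC", "GCG"),
    ("L415", "TTA", "CTG"),
    ("T464", "ACC", "ACG"),
]


def translate(codon: str) -> str:
    """Return the one-letter amino acid (or '*' for stop) for a DNA codon."""
    return CODON_TABLE[codon.upper()]


def is_synonymous(original: str, replacement: str) -> bool:
    """True when both codons encode the same amino acid."""
    return translate(original) == translate(replacement)


if __name__ == "__main__":
    for label, old, new in CLAIM_33_CHANGES:
        status = "synonymous" if is_synonymous(old, new) else "NOT synonymous"
        print(f"{label}: {old}>{new} ({translate(old)}>{translate(new)}) {status}")
```

Under the assumed reading, each listed pair encodes the amino acid given in its position label, which is consistent with the claim describing these differences as neutral.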
PCT/US2022/078258 2021-10-19 2022-10-18 Recombinant thca synthase polypeptides engineered for enhanced biosynthesis of cannabinoids WO2023069921A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163257523P 2021-10-19 2021-10-19
US63/257,523 2021-10-19

Publications (1)

Publication Number Publication Date
WO2023069921A1 (en) 2023-04-27

Family

ID=84330654

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/078258 WO2023069921A1 (en) 2021-10-19 2022-10-18 Recombinant thca synthase polypeptides engineered for enhanced biosynthesis of cannabinoids

Country Status (1)

Country Link
WO (1) WO2023069921A1 (en)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22801682

Country of ref document: EP

Kind code of ref document: A1