WO2023006699A1 - Cells and method for producing isoprenoid molecules with canonical and non-canonical structures
- Publication number
- WO2023006699A1 (PCT/EP2022/070854)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- seq
- cell
- kinase
- nucleic acid
- group containing
Links
- 150000003505 terpenes Chemical class 0.000 title claims abstract description 201
- 238000004519 manufacturing process Methods 0.000 title claims abstract description 93
- -1 pyrophosphate terpenoid Chemical class 0.000 claims abstract description 272
- 210000004027 cell Anatomy 0.000 claims abstract description 222
- 150000007523 nucleic acids Chemical group 0.000 claims abstract description 148
- 108091000080 Phosphotransferase Proteins 0.000 claims abstract description 93
- 102000020233 phosphotransferase Human genes 0.000 claims abstract description 93
- 239000002243 precursor Substances 0.000 claims abstract description 71
- 150000003138 primary alcohols Chemical class 0.000 claims abstract description 71
- 235000007586 terpenes Nutrition 0.000 claims abstract description 68
- 210000003527 eukaryotic cell Anatomy 0.000 claims abstract description 59
- 235000011180 diphosphates Nutrition 0.000 claims abstract description 52
- 108091028043 Nucleic acid sequence Proteins 0.000 claims abstract description 51
- 238000000034 method Methods 0.000 claims abstract description 47
- 150000004712 monophosphates Chemical class 0.000 claims abstract description 26
- 102000039446 nucleic acids Human genes 0.000 claims description 92
- 108020004707 nucleic acids Proteins 0.000 claims description 92
- 240000004808 Saccharomyces cerevisiae Species 0.000 claims description 82
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 claims description 80
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 claims description 61
- ASUAYTHWZCLXAN-UHFFFAOYSA-N prenol Chemical compound CC(C)=CCO ASUAYTHWZCLXAN-UHFFFAOYSA-N 0.000 claims description 54
- 238000006243 chemical reaction Methods 0.000 claims description 36
- CPJRRXSHAYUTGL-UHFFFAOYSA-N isopentenyl alcohol Chemical compound CC(=C)CCO CPJRRXSHAYUTGL-UHFFFAOYSA-N 0.000 claims description 33
- 241000499912 Trichoderma reesei Species 0.000 claims description 30
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 claims description 26
- 230000000694 effects Effects 0.000 claims description 25
- UHOVQNZJYSORNB-UHFFFAOYSA-N Benzene Chemical class C1=CC=CC=C1 UHOVQNZJYSORNB-UHFFFAOYSA-N 0.000 claims description 24
- 210000005253 yeast cell Anatomy 0.000 claims description 23
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 20
- 102000001253 Protein Kinase Human genes 0.000 claims description 14
- 108060004127 isopentenyl phosphate kinase Proteins 0.000 claims description 14
- 230000000865 phosphorylative effect Effects 0.000 claims description 14
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 claims description 13
- 229910052739 hydrogen Inorganic materials 0.000 claims description 13
- 239000001257 hydrogen Substances 0.000 claims description 13
- 229910052757 nitrogen Inorganic materials 0.000 claims description 13
- 229910052760 oxygen Inorganic materials 0.000 claims description 13
- 239000001301 oxygen Substances 0.000 claims description 13
- 108060006633 protein kinase Proteins 0.000 claims description 12
- 240000006439 Aspergillus oryzae Species 0.000 claims description 11
- 235000002247 Aspergillus oryzae Nutrition 0.000 claims description 11
- ZOXJGFHDIHLPTG-UHFFFAOYSA-N Boron Chemical compound [B] ZOXJGFHDIHLPTG-UHFFFAOYSA-N 0.000 claims description 11
- OAICVXFJPJFONN-UHFFFAOYSA-N Phosphorus Chemical compound [P] OAICVXFJPJFONN-UHFFFAOYSA-N 0.000 claims description 11
- NINIDFKCEFEMDL-UHFFFAOYSA-N Sulfur Chemical compound [S] NINIDFKCEFEMDL-UHFFFAOYSA-N 0.000 claims description 11
- 229910052796 boron Inorganic materials 0.000 claims description 11
- 229910052698 phosphorus Inorganic materials 0.000 claims description 11
- 239000011574 phosphorus Substances 0.000 claims description 11
- 241000228245 Aspergillus niger Species 0.000 claims description 8
- 241000235070 Saccharomyces Species 0.000 claims description 8
- 125000002009 alkene group Chemical group 0.000 claims description 8
- 125000002355 alkine group Chemical group 0.000 claims description 8
- 125000004122 cyclic group Chemical group 0.000 claims description 8
- 229910052736 halogen Inorganic materials 0.000 claims description 8
- 150000002367 halogens Chemical class 0.000 claims description 8
- 229910052752 metalloid Inorganic materials 0.000 claims description 8
- 150000002738 metalloids Chemical class 0.000 claims description 8
- 125000002496 methyl group Chemical group [H]C([H])([H])* 0.000 claims description 8
- 229910052755 nonmetal Inorganic materials 0.000 claims description 8
- ZAMOUSCENKQFHK-UHFFFAOYSA-N Chlorine atom Chemical group [Cl] ZAMOUSCENKQFHK-UHFFFAOYSA-N 0.000 claims description 7
- PXGOKWXKJXAPGV-UHFFFAOYSA-N Fluorine Chemical group FF PXGOKWXKJXAPGV-UHFFFAOYSA-N 0.000 claims description 7
- 239000000460 chlorine Chemical group 0.000 claims description 7
- 229910052801 chlorine Inorganic materials 0.000 claims description 7
- 229910052731 fluorine Inorganic materials 0.000 claims description 7
- 239000011737 fluorine Chemical group 0.000 claims description 7
- 239000005864 Sulphur Substances 0.000 claims description 6
- MQCJHQBRIPSIKA-UHFFFAOYSA-N prenyl phosphate Chemical compound CC(C)=CCOP(O)(O)=O MQCJHQBRIPSIKA-UHFFFAOYSA-N 0.000 claims description 6
- 229910052717 sulfur Inorganic materials 0.000 claims description 5
- 239000011593 sulfur Substances 0.000 claims description 5
- 241001495180 Arthrospira Species 0.000 claims description 4
- 241001465318 Aspergillus terreus Species 0.000 claims description 4
- 244000027711 Brettanomyces bruxellensis Species 0.000 claims description 4
- 235000000287 Brettanomyces bruxellensis Nutrition 0.000 claims description 4
- WKBOTKDWSSQWDR-UHFFFAOYSA-N Bromine atom Chemical group [Br] WKBOTKDWSSQWDR-UHFFFAOYSA-N 0.000 claims description 4
- 241000195597 Chlamydomonas reinhardtii Species 0.000 claims description 4
- 240000009108 Chlorella vulgaris Species 0.000 claims description 4
- 235000007089 Chlorella vulgaris Nutrition 0.000 claims description 4
- 241000195633 Dunaliella salina Species 0.000 claims description 4
- 108091029865 Exogenous DNA Proteins 0.000 claims description 4
- 241000168517 Haematococcus lacustris Species 0.000 claims description 4
- 241001501873 Isochrysis galbana Species 0.000 claims description 4
- 235000014663 Kluyveromyces fragilis Nutrition 0.000 claims description 4
- 241000235058 Komagataella pastoris Species 0.000 claims description 4
- 241001300629 Nannochloropsis oceanica Species 0.000 claims description 4
- 241000224476 Nannochloropsis salina Species 0.000 claims description 4
- 241000221961 Neurospora crassa Species 0.000 claims description 4
- 244000253911 Saccharomyces fragilis Species 0.000 claims description 4
- 235000018368 Saccharomyces fragilis Nutrition 0.000 claims description 4
- 241000311449 Scheffersomyces Species 0.000 claims description 4
- 241000235347 Schizosaccharomyces pombe Species 0.000 claims description 4
- 241000235015 Yarrowia lipolytica Species 0.000 claims description 4
- GDTBXPJZTBHREO-UHFFFAOYSA-N bromine Chemical group BrBr GDTBXPJZTBHREO-UHFFFAOYSA-N 0.000 claims description 4
- 229910052794 bromium Inorganic materials 0.000 claims description 4
- 210000005254 filamentous fungi cell Anatomy 0.000 claims description 4
- 125000002887 hydroxy group Chemical group [H]O* 0.000 claims description 4
- 229940031154 kluyveromyces marxianus Drugs 0.000 claims description 4
- 241000320412 Ogataea angusta Species 0.000 claims description 3
- 241001563619 Ogataea parapolymorpha Species 0.000 claims description 3
- 238000012258 culturing Methods 0.000 claims description 3
- 150000002431 hydrogen Chemical class 0.000 claims 3
- 108090000765 processed proteins & peptides Proteins 0.000 description 131
- 102000004196 processed proteins & peptides Human genes 0.000 description 104
- 229920001184 polypeptide Polymers 0.000 description 100
- 102000004190 Enzymes Human genes 0.000 description 77
- 108090000790 Enzymes Proteins 0.000 description 77
- 229940088598 enzyme Drugs 0.000 description 77
- 230000014509 gene expression Effects 0.000 description 58
- 108090000623 proteins and genes Proteins 0.000 description 56
- 239000003795 chemical substances by application Substances 0.000 description 50
- 125000003275 alpha amino acid group Chemical group 0.000 description 46
- 150000001875 compounds Chemical class 0.000 description 44
- 230000037361 pathway Effects 0.000 description 42
- CBIDRCWHNCKSTO-UHFFFAOYSA-N prenyl diphosphate Chemical compound CC(C)=CCO[P@](O)(=O)OP(O)(O)=O CBIDRCWHNCKSTO-UHFFFAOYSA-N 0.000 description 40
- 230000015572 biosynthetic process Effects 0.000 description 37
- CDOSHBSSFJOMGT-UHFFFAOYSA-N linalool Chemical compound CC(C)=CCCC(C)(O)C=C CDOSHBSSFJOMGT-UHFFFAOYSA-N 0.000 description 34
- 239000002773 nucleotide Substances 0.000 description 31
- NUHSROFQTUXZQQ-UHFFFAOYSA-N isopentenyl diphosphate Chemical compound CC(=C)CCO[P@](O)(=O)OP(O)(O)=O NUHSROFQTUXZQQ-UHFFFAOYSA-N 0.000 description 30
- XMGQYMWWDOXHJM-UHFFFAOYSA-N limonene Chemical compound CC(=C)C1CCC(C)=CC1 XMGQYMWWDOXHJM-UHFFFAOYSA-N 0.000 description 30
- 125000003729 nucleotide group Chemical group 0.000 description 30
- 239000000758 substrate Substances 0.000 description 30
- GVVPGTZRZFNKDS-JXMROGBWSA-N geranyl diphosphate Chemical compound CC(C)=CCC\C(C)=C\CO[P@](O)(=O)OP(O)(O)=O GVVPGTZRZFNKDS-JXMROGBWSA-N 0.000 description 27
- 108010087432 terpene synthase Proteins 0.000 description 27
- 108091026890 Coding region Proteins 0.000 description 26
- 101000724418 Homo sapiens Neutral amino acid transporter B(0) Proteins 0.000 description 26
- 102100028267 Neutral amino acid transporter B(0) Human genes 0.000 description 26
- 239000013598 vector Substances 0.000 description 26
- JTTIOYHBNXDJOD-UHFFFAOYSA-N 2,4,6-triaminopyrimidine Chemical compound NC1=CC(N)=NC(N)=N1 JTTIOYHBNXDJOD-UHFFFAOYSA-N 0.000 description 25
- FVFVNNKYKYZTJU-UHFFFAOYSA-N 6-chloro-1,3,5-triazine-2,4-diamine Chemical group NC1=NC(N)=NC(Cl)=N1 FVFVNNKYKYZTJU-UHFFFAOYSA-N 0.000 description 23
- 101000829958 Homo sapiens N-acetyllactosaminide beta-1,6-N-acetylglucosaminyl-transferase Proteins 0.000 description 23
- 102100023315 N-acetyllactosaminide beta-1,6-N-acetylglucosaminyl-transferase Human genes 0.000 description 23
- 150000004354 sesquiterpene derivatives Chemical class 0.000 description 23
- QFXSWGXWZXSGLC-UHFFFAOYSA-N 3-methylpent-2-en-1-ol Chemical compound CCC(C)=CCO QFXSWGXWZXSGLC-UHFFFAOYSA-N 0.000 description 22
- 125000004432 carbon atom Chemical group C* 0.000 description 22
- 230000026731 phosphorylation Effects 0.000 description 22
- 238000006366 phosphorylation reaction Methods 0.000 description 22
- 238000003786 synthesis reaction Methods 0.000 description 22
- 150000003648 triterpenes Chemical class 0.000 description 22
- 150000001298 alcohols Chemical class 0.000 description 21
- 102000004169 proteins and genes Human genes 0.000 description 21
- 229930004725 sesquiterpene Natural products 0.000 description 21
- 125000002298 terpene group Chemical group 0.000 description 21
- GLZPCOQZEFWAFX-UHFFFAOYSA-N Geraniol Chemical group CC(C)=CCCC(C)=CCO GLZPCOQZEFWAFX-UHFFFAOYSA-N 0.000 description 20
- 229930003658 monoterpene Natural products 0.000 description 20
- 101000957437 Homo sapiens Mitochondrial carnitine/acylcarnitine carrier protein Proteins 0.000 description 18
- 102100038738 Mitochondrial carnitine/acylcarnitine carrier protein Human genes 0.000 description 18
- 230000001965 increasing effect Effects 0.000 description 18
- 239000001490 (3R)-3,7-dimethylocta-1,6-dien-3-ol Substances 0.000 description 17
- CDOSHBSSFJOMGT-JTQLQIEISA-N (R)-linalool Natural products CC(C)=CCC[C@@](C)(O)C=C CDOSHBSSFJOMGT-JTQLQIEISA-N 0.000 description 17
- VWFJDQUYCIWHTN-UHFFFAOYSA-N Farnesyl pyrophosphate Natural products CC(C)=CCCC(C)=CCCC(C)=CCOP(O)(=O)OP(O)(O)=O VWFJDQUYCIWHTN-UHFFFAOYSA-N 0.000 description 17
- 150000001413 amino acids Chemical class 0.000 description 17
- 229930003827 cannabinoid Natural products 0.000 description 17
- 239000003557 cannabinoid Substances 0.000 description 17
- 229930007744 linalool Natural products 0.000 description 17
- 239000000047 product Substances 0.000 description 17
- VWFJDQUYCIWHTN-YFVJMOTDSA-N 2-trans,6-trans-farnesyl diphosphate Chemical compound CC(C)=CCC\C(C)=C\CC\C(C)=C\CO[P@](O)(=O)OP(O)(O)=O VWFJDQUYCIWHTN-YFVJMOTDSA-N 0.000 description 16
- 102100022524 Alpha-1-antichymotrypsin Human genes 0.000 description 16
- 101000678026 Homo sapiens Alpha-1-antichymotrypsin Proteins 0.000 description 16
- 235000002577 monoterpenes Nutrition 0.000 description 16
- NDVASEGYNIMXJL-UHFFFAOYSA-N sabinene Chemical compound C=C1CCC2(C(C)C)C1C2 NDVASEGYNIMXJL-UHFFFAOYSA-N 0.000 description 16
- 108050006837 Prenyltransferases Proteins 0.000 description 15
- 102000019337 Prenyltransferases Human genes 0.000 description 15
- 230000008859 change Effects 0.000 description 15
- 229940087305 limonene Drugs 0.000 description 15
- 235000001510 limonene Nutrition 0.000 description 15
- 238000006467 substitution reaction Methods 0.000 description 15
- 108010006731 Dimethylallyltranstransferase Proteins 0.000 description 14
- 102000005454 Dimethylallyltranstransferase Human genes 0.000 description 14
- UAHWPYUMFXYFJY-UHFFFAOYSA-N beta-myrcene Chemical compound CC(C)=CCCC(=C)C=C UAHWPYUMFXYFJY-UHFFFAOYSA-N 0.000 description 14
- 239000001177 diphosphate Substances 0.000 description 14
- 239000012634 fragment Substances 0.000 description 14
- 241000219195 Arabidopsis thaliana Species 0.000 description 13
- 229940065144 cannabinoids Drugs 0.000 description 13
- 229930002368 sesterterpene Natural products 0.000 description 13
- XPPKVPWEQAFLFU-UHFFFAOYSA-N diphosphoric acid Chemical group OP(O)(=O)OP(O)(O)=O XPPKVPWEQAFLFU-UHFFFAOYSA-N 0.000 description 12
- 125000004435 hydrogen atom Chemical class [H]* 0.000 description 12
- 150000002653 sesterterpene derivatives Chemical class 0.000 description 12
- 102100034330 Chromaffin granule amine transporter Human genes 0.000 description 11
- 108010019686 Farnesol kinase Proteins 0.000 description 11
- 101000641221 Homo sapiens Chromaffin granule amine transporter Proteins 0.000 description 11
- RRHGJUQNOFWUDK-UHFFFAOYSA-N Isoprene Chemical group CC(=C)C=C RRHGJUQNOFWUDK-UHFFFAOYSA-N 0.000 description 11
- KQAZVFVOEIRWHN-UHFFFAOYSA-N alpha-thujene Natural products CC1=CCC2(C(C)C)C1C2 KQAZVFVOEIRWHN-UHFFFAOYSA-N 0.000 description 11
- 229910052799 carbon Inorganic materials 0.000 description 11
- 229930004069 diterpene Natural products 0.000 description 11
- 125000000567 diterpene group Chemical group 0.000 description 11
- 108010071062 pinene cyclase I Proteins 0.000 description 11
- YYGNTYWPHWGJRM-UHFFFAOYSA-N (6E,10E,14E,18E)-2,6,10,15,19,23-hexamethyltetracosa-2,6,10,14,18,22-hexaene Chemical compound CC(C)=CCCC(C)=CCCC(C)=CCCC=C(C)CCC=C(C)CCC=C(C)C YYGNTYWPHWGJRM-UHFFFAOYSA-N 0.000 description 10
- UFZFIJQVBUIVBK-QPJJXVBHSA-N (e)-3,4-dimethylpent-2-en-1-ol Chemical compound CC(C)C(\C)=C\CO UFZFIJQVBUIVBK-QPJJXVBHSA-N 0.000 description 10
- PUCCIVZQJKKAET-UHFFFAOYSA-N 3-ethylpent-2-en-1-ol Chemical compound CCC(CC)=CCO PUCCIVZQJKKAET-UHFFFAOYSA-N 0.000 description 10
- 108020004414 DNA Proteins 0.000 description 10
- 102000053602 DNA Human genes 0.000 description 10
- 101710139115 Terpineol synthase, chloroplastic Proteins 0.000 description 10
- FAMPSKZZVDUYOS-UHFFFAOYSA-N alpha-Caryophyllene Natural products CC1=CCC(C)(C)C=CCC(C)=CCC1 FAMPSKZZVDUYOS-UHFFFAOYSA-N 0.000 description 10
- SEEZIOZEUUMJME-FOWTUZBSSA-N cannabigerolic acid Chemical compound CCCCCC1=CC(O)=C(C\C=C(/C)CCC=C(C)C)C(O)=C1C(O)=O SEEZIOZEUUMJME-FOWTUZBSSA-N 0.000 description 10
- 230000004186 co-expression Effects 0.000 description 10
- XPPKVPWEQAFLFU-UHFFFAOYSA-J diphosphate(4-) Chemical compound [O-]P([O-])(=O)OP([O-])([O-])=O XPPKVPWEQAFLFU-UHFFFAOYSA-J 0.000 description 10
- 150000002773 monoterpene derivatives Chemical class 0.000 description 10
- 239000013612 plasmid Substances 0.000 description 10
- 230000035897 transcription Effects 0.000 description 10
- 238000013518 transcription Methods 0.000 description 10
- NPNUFJAVOOONJE-ZIAGYGMSSA-N β-(E)-Caryophyllene Chemical compound C1CC(C)=CCCC(=C)[C@H]2CC(C)(C)[C@@H]21 NPNUFJAVOOONJE-ZIAGYGMSSA-N 0.000 description 10
- GRWFGVWFFZKLTI-IUCAKERBSA-N 1S,5S-(-)-alpha-Pinene Natural products CC1=CC[C@@H]2C(C)(C)[C@H]1C2 GRWFGVWFFZKLTI-IUCAKERBSA-N 0.000 description 9
- YCOXTKKNXUZSKD-UHFFFAOYSA-N 3,4-xylenol Chemical compound CC1=CC=C(O)C=C1C YCOXTKKNXUZSKD-UHFFFAOYSA-N 0.000 description 9
- 241000196324 Embryophyta Species 0.000 description 9
- BHEOSNUKNHRBNM-UHFFFAOYSA-N Tetramethylsqualene Natural products CC(=C)C(C)CCC(=C)C(C)CCC(C)=CCCC=C(C)CCC(C)C(=C)CCC(C)C(C)=C BHEOSNUKNHRBNM-UHFFFAOYSA-N 0.000 description 9
- PRAKJMSDJKAYCZ-UHFFFAOYSA-N dodecahydrosqualene Natural products CC(C)CCCC(C)CCCC(C)CCCCC(C)CCCC(C)CCCC(C)C PRAKJMSDJKAYCZ-UHFFFAOYSA-N 0.000 description 9
- 230000006870 function Effects 0.000 description 9
- 230000001976 improved effect Effects 0.000 description 9
- 239000002609 medium Substances 0.000 description 9
- 125000001844 prenyl group Chemical group [H]C([*])([H])C([H])=C(C([H])([H])[H])C([H])([H])[H] 0.000 description 9
- 229940031439 squalene Drugs 0.000 description 9
- TUHBEKDERLKLEC-UHFFFAOYSA-N squalene Natural products CC(=CCCC(=CCCC(=CCCC=C(/C)CCC=C(/C)CC=C(C)C)C)C)C TUHBEKDERLKLEC-UHFFFAOYSA-N 0.000 description 9
- 239000000126 substance Substances 0.000 description 9
- XMGQYMWWDOXHJM-JTQLQIEISA-N (+)-α-limonene Chemical compound CC(=C)[C@@H]1CCC(C)=CC1 XMGQYMWWDOXHJM-JTQLQIEISA-N 0.000 description 8
- KJTLQQUUPVSXIM-ZCFIWIBFSA-N (R)-mevalonic acid Chemical compound OCC[C@](O)(C)CC(O)=O KJTLQQUUPVSXIM-ZCFIWIBFSA-N 0.000 description 8
- KJTLQQUUPVSXIM-UHFFFAOYSA-N DL-mevalonic acid Natural products OCCC(O)(C)CC(O)=O KJTLQQUUPVSXIM-UHFFFAOYSA-N 0.000 description 8
- 102100035111 Farnesyl pyrophosphate synthase Human genes 0.000 description 8
- 239000005792 Geraniol Substances 0.000 description 8
- GVVPGTZRZFNKDS-YFHOEESVSA-N Geranyl diphosphate Natural products CC(C)=CCC\C(C)=C/COP(O)(=O)OP(O)(O)=O GVVPGTZRZFNKDS-YFHOEESVSA-N 0.000 description 8
- 108010026318 Geranyltranstransferase Proteins 0.000 description 8
- 102100040870 Glycine amidinotransferase, mitochondrial Human genes 0.000 description 8
- 101000893303 Homo sapiens Glycine amidinotransferase, mitochondrial Proteins 0.000 description 8
- 230000009471 action Effects 0.000 description 8
- 238000007792 addition Methods 0.000 description 8
- 239000007864 aqueous solution Substances 0.000 description 8
- 230000001747 exhibiting effect Effects 0.000 description 8
- 229940113087 geraniol Drugs 0.000 description 8
- 238000011534 incubation Methods 0.000 description 8
- 230000001939 inductive effect Effects 0.000 description 8
- SXFKFRRXJUJGSS-UHFFFAOYSA-N olivetolic acid Chemical compound CCCCCC1=CC(O)=CC(O)=C1C(O)=O SXFKFRRXJUJGSS-UHFFFAOYSA-N 0.000 description 8
- 229920000642 polymer Polymers 0.000 description 8
- 241000894007 species Species 0.000 description 8
- NDVASEGYNIMXJL-NXEZZACHSA-N (+)-sabinene Natural products C=C1CC[C@@]2(C(C)C)[C@@H]1C2 NDVASEGYNIMXJL-NXEZZACHSA-N 0.000 description 7
- CKWCTBJSFBLMOH-FNORWQNLSA-N (e)-3-methylhex-2-en-1-ol Chemical compound CCC\C(C)=C\CO CKWCTBJSFBLMOH-FNORWQNLSA-N 0.000 description 7
- IFPAPBFCYGUMJZ-UHFFFAOYSA-N 3-methylidenepentan-1-ol Chemical compound CCC(=C)CCO IFPAPBFCYGUMJZ-UHFFFAOYSA-N 0.000 description 7
- SEEZIOZEUUMJME-VBKFSLOCSA-N Cannabigerolic acid Natural products CCCCCC1=CC(O)=C(C\C=C(\C)CCC=C(C)C)C(O)=C1C(O)=O SEEZIOZEUUMJME-VBKFSLOCSA-N 0.000 description 7
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 7
- GLZPCOQZEFWAFX-YFHOEESVSA-N Geraniol Natural products CC(C)=CCC\C(C)=C/CO GLZPCOQZEFWAFX-YFHOEESVSA-N 0.000 description 7
- JMVSBFJBMXQNJW-GIXZANJISA-N all-trans-pentaprenyl diphosphate Chemical compound CC(C)=CCC\C(C)=C\CC\C(C)=C\CC\C(C)=C\CC\C(C)=C\COP(O)(=O)OP(O)(O)=O JMVSBFJBMXQNJW-GIXZANJISA-N 0.000 description 7
- VYBREYKSZAROCT-UHFFFAOYSA-N alpha-myrcene Natural products CC(=C)CCCC(=C)C=C VYBREYKSZAROCT-UHFFFAOYSA-N 0.000 description 7
- SEEZIOZEUUMJME-UHFFFAOYSA-N cannabinerolic acid Natural products CCCCCC1=CC(O)=C(CC=C(C)CCC=C(C)C)C(O)=C1C(O)=O SEEZIOZEUUMJME-UHFFFAOYSA-N 0.000 description 7
- 239000001963 growth medium Substances 0.000 description 7
- 230000003834 intracellular effect Effects 0.000 description 7
- GRWFGVWFFZKLTI-UHFFFAOYSA-N rac-alpha-Pinene Natural products CC1=CCC2C(C)(C)C1C2 GRWFGVWFFZKLTI-UHFFFAOYSA-N 0.000 description 7
- 230000001105 regulatory effect Effects 0.000 description 7
- 229930006696 sabinene Natural products 0.000 description 7
- DTGKSKDOIYIVQL-WEDXCCLWSA-N (+)-borneol Chemical compound C1C[C@@]2(C)[C@@H](O)C[C@@H]1C2(C)C DTGKSKDOIYIVQL-WEDXCCLWSA-N 0.000 description 6
- CRDAMVZIKSXKFV-YFVJMOTDSA-N (2-trans,6-trans)-farnesol Chemical group CC(C)=CCC\C(C)=C\CC\C(C)=C\CO CRDAMVZIKSXKFV-YFVJMOTDSA-N 0.000 description 6
- IHPKGUQCSIINRJ-CSKARUKUSA-N (E)-beta-ocimene Chemical compound CC(C)=CC\C=C(/C)C=C IHPKGUQCSIINRJ-CSKARUKUSA-N 0.000 description 6
- 239000001169 1-methyl-4-propan-2-ylcyclohexa-1,4-diene Substances 0.000 description 6
- QPRQEDXDYOZYLA-UHFFFAOYSA-N 2-methylbutan-1-ol Chemical compound CCC(C)CO QPRQEDXDYOZYLA-UHFFFAOYSA-N 0.000 description 6
- 241000351920 Aspergillus nidulans Species 0.000 description 6
- 101710129460 Beta-phellandrene synthase Proteins 0.000 description 6
- 108010008885 Cellulose 1,4-beta-Cellobiosidase Proteins 0.000 description 6
- XEKOWRVHYACXOJ-UHFFFAOYSA-N Ethyl acetate Chemical compound CCOC(C)=O XEKOWRVHYACXOJ-UHFFFAOYSA-N 0.000 description 6
- WEEGYLXZBRQIMU-UHFFFAOYSA-N Eucalyptol Chemical compound C1CC2CCC1(C)OC2(C)C WEEGYLXZBRQIMU-UHFFFAOYSA-N 0.000 description 6
- 102100037047 Fucose-1-phosphate guanylyltransferase Human genes 0.000 description 6
- 101001029296 Homo sapiens Fucose-1-phosphate guanylyltransferase Proteins 0.000 description 6
- 229910019142 PO4 Inorganic materials 0.000 description 6
- MOYAFQVGZZPNRA-UHFFFAOYSA-N Terpinolene Chemical compound CC(C)=C1CCC(C)=CC1 MOYAFQVGZZPNRA-UHFFFAOYSA-N 0.000 description 6
- 239000002253 acid Substances 0.000 description 6
- XCPQUQHBVVXMRQ-UHFFFAOYSA-N alpha-Fenchene Natural products C1CC2C(=C)CC1C2(C)C XCPQUQHBVVXMRQ-UHFFFAOYSA-N 0.000 description 6
- 125000000539 amino acid group Chemical group 0.000 description 6
- CKDOCTFBFTVPSN-UHFFFAOYSA-N borneol Natural products C1CC2(C)C(C)CC1C2(C)C CKDOCTFBFTVPSN-UHFFFAOYSA-N 0.000 description 6
- CRPUJAZIXJMDBK-UHFFFAOYSA-N camphene Chemical compound C1CC2C(=C)C(C)(C)C1C2 CRPUJAZIXJMDBK-UHFFFAOYSA-N 0.000 description 6
- BQOFWKZOCNGFEC-UHFFFAOYSA-N carene Chemical compound C1C(C)=CCC2C(C)(C)C12 BQOFWKZOCNGFEC-UHFFFAOYSA-N 0.000 description 6
- HIPIENNKVJCMAP-UHFFFAOYSA-N chrysanthemol Chemical compound CC(C)=CC1C(CO)C1(C)C HIPIENNKVJCMAP-UHFFFAOYSA-N 0.000 description 6
- DTGKSKDOIYIVQL-UHFFFAOYSA-N dl-isoborneol Natural products C1CC2(C)C(O)CC1C2(C)C DTGKSKDOIYIVQL-UHFFFAOYSA-N 0.000 description 6
- 238000002290 gas chromatography-mass spectrometry Methods 0.000 description 6
- CZVXBFUKBZRMKR-UHFFFAOYSA-N lavandulol Chemical compound CC(C)=CCC(CO)C(C)=C CZVXBFUKBZRMKR-UHFFFAOYSA-N 0.000 description 6
- 239000012528 membrane Substances 0.000 description 6
- 108020004999 messenger RNA Proteins 0.000 description 6
- CJWXCNXHAIFFMH-AVZHFPDBSA-N n-[(2s,3r,4s,5s,6r)-2-[(2r,3r,4s,5r)-2-acetamido-4,5,6-trihydroxy-1-oxohexan-3-yl]oxy-3,5-dihydroxy-6-methyloxan-4-yl]acetamide Chemical compound C[C@H]1O[C@@H](O[C@@H]([C@@H](O)[C@H](O)CO)[C@@H](NC(C)=O)C=O)[C@H](O)[C@@H](NC(C)=O)[C@@H]1O CJWXCNXHAIFFMH-AVZHFPDBSA-N 0.000 description 6
- XNGKCOFXDHYSGR-UHFFFAOYSA-N perillene Chemical compound CC(C)=CCCC=1C=COC=1 XNGKCOFXDHYSGR-UHFFFAOYSA-N 0.000 description 6
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 6
- 239000010452 phosphate Substances 0.000 description 6
- YKFLAYDHMOASIY-UHFFFAOYSA-N γ-terpinene Chemical compound CC(C)C1=CCC(C)=CC1 YKFLAYDHMOASIY-UHFFFAOYSA-N 0.000 description 6
- 239000000260 (2E,6E)-3,7,11-trimethyldodeca-2,6,10-trien-1-ol Substances 0.000 description 5
- 101710165761 (2E,6E)-farnesyl diphosphate synthase Proteins 0.000 description 5
- NMSPFMZHXLPNEI-FNORWQNLSA-N (2e)-3-methylhexa-2,5-dien-1-ol Chemical compound C=CCC(/C)=C/CO NMSPFMZHXLPNEI-FNORWQNLSA-N 0.000 description 5
- 101710195549 (S)-beta-macrocarpene synthase Proteins 0.000 description 5
- DELPMNLSQZDOGP-UHFFFAOYSA-N 5-chloro-3-methylpent-2-en-1-ol Chemical compound ClCCC(C)=CCO DELPMNLSQZDOGP-UHFFFAOYSA-N 0.000 description 5
- 241000894006 Bacteria Species 0.000 description 5
- 108091033409 CRISPR Proteins 0.000 description 5
- 101150051438 CYP gene Proteins 0.000 description 5
- 244000025254 Cannabis sativa Species 0.000 description 5
- 101710156207 Farnesyl diphosphate synthase Proteins 0.000 description 5
- 101710125754 Farnesyl pyrophosphate synthase Proteins 0.000 description 5
- 101710089428 Farnesyl pyrophosphate synthase erg20 Proteins 0.000 description 5
- 108010007508 Farnesyltranstransferase Proteins 0.000 description 5
- 241000233866 Fungi Species 0.000 description 5
- 102100039291 Geranylgeranyl pyrophosphate synthase Human genes 0.000 description 5
- 102000013404 Geranyltranstransferase Human genes 0.000 description 5
- 101710150389 Probable farnesyl diphosphate synthase Proteins 0.000 description 5
- 108020004511 Recombinant DNA Proteins 0.000 description 5
- 101150050559 SOAT1 gene Proteins 0.000 description 5
- 240000003768 Solanum lycopersicum Species 0.000 description 5
- 102100021993 Sterol O-acyltransferase 1 Human genes 0.000 description 5
- FRJSECSOXKQMOD-HQRMLTQVSA-N Taxa-4(5),11(12)-diene Chemical compound C1C[C@]2(C)CCC=C(C)[C@H]2C[C@@H]2CCC(C)=C1C2(C)C FRJSECSOXKQMOD-HQRMLTQVSA-N 0.000 description 5
- YPXGTKHZRCDZTL-KSFOROOFSA-N [(2r,3s)-2,3,4-trihydroxypentyl] dihydrogen phosphate Chemical compound CC(O)[C@H](O)[C@H](O)COP(O)(O)=O YPXGTKHZRCDZTL-KSFOROOFSA-N 0.000 description 5
- 108090000637 alpha-Amylases Proteins 0.000 description 5
- HMTAHNDPLDKYJT-CBBWQLFWSA-N amorpha-4,11-diene Chemical compound C1=C(C)CC[C@H]2[C@H](C)CC[C@@H](C(C)=C)[C@H]21 HMTAHNDPLDKYJT-CBBWQLFWSA-N 0.000 description 5
- HMTAHNDPLDKYJT-UHFFFAOYSA-N amorphadiene Natural products C1=C(C)CCC2C(C)CCC(C(C)=C)C21 HMTAHNDPLDKYJT-UHFFFAOYSA-N 0.000 description 5
- NPNUFJAVOOONJE-UHFFFAOYSA-N beta-cariophyllene Natural products C1CC(C)=CCCC(=C)C2CC(C)(C)C21 NPNUFJAVOOONJE-UHFFFAOYSA-N 0.000 description 5
- NPNUFJAVOOONJE-UONOGXRCSA-N caryophyllene Natural products C1CC(C)=CCCC(=C)[C@@H]2CC(C)(C)[C@@H]21 NPNUFJAVOOONJE-UONOGXRCSA-N 0.000 description 5
- 229930002886 farnesol Natural products 0.000 description 5
- 229940043259 farnesol Drugs 0.000 description 5
- 230000002538 fungal effect Effects 0.000 description 5
- 230000012010 growth Effects 0.000 description 5
- 229930005303 indole alkaloid Natural products 0.000 description 5
- 239000011159 matrix material Substances 0.000 description 5
- 244000005700 microbiome Species 0.000 description 5
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 5
- 102000040430 polynucleotide Human genes 0.000 description 5
- 108091033319 polynucleotide Proteins 0.000 description 5
- 239000002157 polynucleotide Substances 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 229920002477 rna polymer Polymers 0.000 description 5
- 150000003535 tetraterpenes Chemical class 0.000 description 5
- 235000009657 tetraterpenes Nutrition 0.000 description 5
- CRDAMVZIKSXKFV-UHFFFAOYSA-N trans-Farnesol Natural products CC(C)=CCCC(C)=CCCC(C)=CCO CRDAMVZIKSXKFV-UHFFFAOYSA-N 0.000 description 5
- 238000013519 translation Methods 0.000 description 5
- FQTLCLSUCSAZDY-UHFFFAOYSA-N (+) E(S) nerolidol Natural products CC(C)=CCCC(C)=CCCC(C)(O)C=C FQTLCLSUCSAZDY-UHFFFAOYSA-N 0.000 description 4
- 229940099369 (+)-limonene Drugs 0.000 description 4
- QEBNYNLSCGVZOH-NFAWXSAZSA-N (+)-valencene Chemical compound C1C[C@@H](C(C)=C)C[C@@]2(C)[C@H](C)CCC=C21 QEBNYNLSCGVZOH-NFAWXSAZSA-N 0.000 description 4
- OSVMTKAMIBTGSB-HMFGEACVSA-N (1S,9S,12R,14S)-3-hydroxy-5,9,13,13-tetramethyl-8-oxatetracyclo[7.4.1.02,7.012,14]tetradeca-2(7),3,5-triene-4-carboxylic acid Chemical compound Cc1cc2O[C@@]3(C)CC[C@@H]4[C@H]3[C@H](c2c(O)c1C(O)=O)C4(C)C OSVMTKAMIBTGSB-HMFGEACVSA-N 0.000 description 4
- 239000001890 (2R)-8,8,8a-trimethyl-2-prop-1-en-2-yl-1,2,3,4,6,7-hexahydronaphthalene Substances 0.000 description 4
- PKWSKCFMABZMMV-SFHVURJKSA-N (2S)-7-hydroxy-2,5-dimethyl-2-(4-methylpent-3-enyl)chromene-6-carboxylic acid Chemical compound OC1=C(C(O)=O)C(C)=C2C=C[C@@](CCC=C(C)C)(C)OC2=C1 PKWSKCFMABZMMV-SFHVURJKSA-N 0.000 description 4
- XBZYWSMVVKYHQN-MYPRUECHSA-N (4as,6as,6br,8ar,9r,10s,12ar,12br,14bs)-10-hydroxy-2,2,6a,6b,9,12a-hexamethyl-9-[(sulfooxy)methyl]-1,2,3,4,4a,5,6,6a,6b,7,8,8a,9,10,11,12,12a,12b,13,14b-icosahydropicene-4a-carboxylic acid Chemical compound C1C[C@H](O)[C@@](C)(COS(O)(=O)=O)[C@@H]2CC[C@@]3(C)[C@]4(C)CC[C@@]5(C(O)=O)CCC(C)(C)C[C@H]5C4=CC[C@@H]3[C@]21C XBZYWSMVVKYHQN-MYPRUECHSA-N 0.000 description 4
- FLXLJBCLEUWWCG-UHFFFAOYSA-N 2-methylbut-2-ene-1,4-diol Chemical compound OCC(C)=CCO FLXLJBCLEUWWCG-UHFFFAOYSA-N 0.000 description 4
- LRKICXFRWQUFSZ-UHFFFAOYSA-N 3-ethylpent-4-en-1-ol Chemical compound CCC(C=C)CCO LRKICXFRWQUFSZ-UHFFFAOYSA-N 0.000 description 4
- OEJVZRIAPBOYJN-UHFFFAOYSA-N 3-methylhept-4-en-1-ol Chemical compound CCC=CC(C)CCO OEJVZRIAPBOYJN-UHFFFAOYSA-N 0.000 description 4
- IWTBVKIGCDZRPL-UHFFFAOYSA-N 3-methylpentanol Chemical compound CCC(C)CCO IWTBVKIGCDZRPL-UHFFFAOYSA-N 0.000 description 4
- LQUGIUKHQGEKIC-UHFFFAOYSA-N 5-hydroxy-2,7-dimethyl-2-(4-methylpent-3-enyl)chromene-6-carboxylic acid Chemical compound CC1=C(C(O)=O)C(O)=C2C=CC(CCC=C(C)C)(C)OC2=C1 LQUGIUKHQGEKIC-UHFFFAOYSA-N 0.000 description 4
- 239000004382 Amylase Substances 0.000 description 4
- 102000013142 Amylases Human genes 0.000 description 4
- 108010065511 Amylases Proteins 0.000 description 4
- 101000757144 Aspergillus niger Glucoamylase Proteins 0.000 description 4
- 101100274507 Caenorhabditis elegans cki-1 gene Proteins 0.000 description 4
- 235000008697 Cannabis sativa Nutrition 0.000 description 4
- 101710118490 Copalyl diphosphate synthase Proteins 0.000 description 4
- 241000588724 Escherichia coli Species 0.000 description 4
- 102100040004 Gamma-glutamylcyclotransferase Human genes 0.000 description 4
- 101000886680 Homo sapiens Gamma-glutamylcyclotransferase Proteins 0.000 description 4
- 241000235649 Kluyveromyces Species 0.000 description 4
- 235000007688 Lycopersicon esculentum Nutrition 0.000 description 4
- 108030004881 Myrcene synthases Proteins 0.000 description 4
- LRHPLDYGYMQRHN-UHFFFAOYSA-N N-Butanol Chemical compound CCCCO LRHPLDYGYMQRHN-UHFFFAOYSA-N 0.000 description 4
- FQTLCLSUCSAZDY-ATGUSINASA-N Nerolidol Chemical compound CC(C)=CCC\C(C)=C\CC[C@](C)(O)C=C FQTLCLSUCSAZDY-ATGUSINASA-N 0.000 description 4
- 101710163270 Nuclease Proteins 0.000 description 4
- 101710093888 Pentalenene synthase Proteins 0.000 description 4
- 241000235648 Pichia Species 0.000 description 4
- 241000223252 Rhodotorula Species 0.000 description 4
- 244000114218 Salvia fruticosa Species 0.000 description 4
- 235000006293 Salvia fruticosa Nutrition 0.000 description 4
- 101710115850 Sesquiterpene synthase Proteins 0.000 description 4
- 229930182558 Sterol Natural products 0.000 description 4
- 108700015934 Triose-phosphate isomerases Proteins 0.000 description 4
- 101710174833 Tuberculosinyl adenosine transferase Proteins 0.000 description 4
- 241000235013 Yarrowia Species 0.000 description 4
- ZSLZBFCDCINBPY-ZSJPKINUSA-N acetyl-CoA Chemical compound O[C@@H]1[C@H](OP(O)(O)=O)[C@@H](COP(O)(=O)OP(O)(=O)OCC(C)(C)[C@@H](O)C(=O)NCCC(=O)NCCSC(=O)C)O[C@H]1N1C2=NC=NC(N)=C2N=C1 ZSLZBFCDCINBPY-ZSJPKINUSA-N 0.000 description 4
- 125000003158 alcohol group Chemical group 0.000 description 4
- XXROGKLTLUQVRX-UHFFFAOYSA-N allyl alcohol Chemical compound OCC=C XXROGKLTLUQVRX-UHFFFAOYSA-N 0.000 description 4
- MVNCAPSFBDBCGF-UHFFFAOYSA-N alpha-pinene Natural products CC1=CCC23C1CC2C3(C)C MVNCAPSFBDBCGF-UHFFFAOYSA-N 0.000 description 4
- 108010025592 aminoadipoyl-cysteinyl-allylglycine Proteins 0.000 description 4
- 235000019418 amylase Nutrition 0.000 description 4
- 150000001491 aromatic compounds Chemical class 0.000 description 4
- 230000001580 bacterial effect Effects 0.000 description 4
- 238000005119 centrifugation Methods 0.000 description 4
- 238000009833 condensation Methods 0.000 description 4
- 230000005494 condensation Effects 0.000 description 4
- 238000010276 construction Methods 0.000 description 4
- WCASXYBKJHWFMY-UHFFFAOYSA-N crotyl alcohol Chemical compound CC=CCO WCASXYBKJHWFMY-UHFFFAOYSA-N 0.000 description 4
- 230000004927 fusion Effects 0.000 description 4
- 108020001507 fusion proteins Proteins 0.000 description 4
- 102000037865 fusion proteins Human genes 0.000 description 4
- CCCXGQLQJHWTLZ-UHFFFAOYSA-N geranyl linalool Natural products CC(=CCCC(=CCCCC(C)(O)CCC=C(C)C)C)C CCCXGQLQJHWTLZ-UHFFFAOYSA-N 0.000 description 4
- IQDXAJNQKSIPGB-HQSZAHFGSA-N geranyllinalool Chemical compound CC(C)=CCC\C(C)=C\CC\C(C)=C\CCC(C)(O)C=C IQDXAJNQKSIPGB-HQSZAHFGSA-N 0.000 description 4
- 102000006602 glyceraldehyde-3-phosphate dehydrogenase Human genes 0.000 description 4
- 108020004445 glyceraldehyde-3-phosphate dehydrogenase Proteins 0.000 description 4
- PHTQWCKDNZKARW-UHFFFAOYSA-N isoamylol Chemical compound CC(C)CCO PHTQWCKDNZKARW-UHFFFAOYSA-N 0.000 description 4
- QVDTXNVYSHVCGW-ONEGZZNKSA-N isopentenol Chemical compound CC(C)\C=C\O QVDTXNVYSHVCGW-ONEGZZNKSA-N 0.000 description 4
- BDAGIHXWWSANSR-UHFFFAOYSA-N methanoic acid Natural products OC=O BDAGIHXWWSANSR-UHFFFAOYSA-N 0.000 description 4
- WASNIKZYIWZQIP-AWEZNQCLSA-N nerolidol Natural products CC(=CCCC(=CCC[C@@H](O)C=C)C)C WASNIKZYIWZQIP-AWEZNQCLSA-N 0.000 description 4
- 230000008488 polyadenylation Effects 0.000 description 4
- 229920001550 polyprenyl Polymers 0.000 description 4
- 125000001185 polyprenyl group Polymers 0.000 description 4
- 150000003097 polyterpenes Chemical class 0.000 description 4
- BDERNNFJNOPAEC-UHFFFAOYSA-N propan-1-ol Chemical compound CCCO BDERNNFJNOPAEC-UHFFFAOYSA-N 0.000 description 4
- 150000003432 sterols Chemical class 0.000 description 4
- 235000003702 sterols Nutrition 0.000 description 4
- 230000008093 supporting effect Effects 0.000 description 4
- 125000003396 thiol group Chemical group [H]S* 0.000 description 4
- WCTNXGFHEZQHDR-UHFFFAOYSA-N valencene Natural products C1CC(C)(C)C2(C)CC(C(=C)C)CCC2=C1 WCTNXGFHEZQHDR-UHFFFAOYSA-N 0.000 description 4
- WTARULDDTDQWMU-RKDXNWHRSA-N (+)-β-pinene Chemical compound C1[C@H]2C(C)(C)[C@@H]1CCC2=C WTARULDDTDQWMU-RKDXNWHRSA-N 0.000 description 3
- WTARULDDTDQWMU-IUCAKERBSA-N (-)-Nopinene Natural products C1[C@@H]2C(C)(C)[C@H]1CCC2=C WTARULDDTDQWMU-IUCAKERBSA-N 0.000 description 3
- REPVLJRCJUVQFA-UHFFFAOYSA-N (-)-isopinocampheol Natural products C1C(O)C(C)C2C(C)(C)C1C2 REPVLJRCJUVQFA-UHFFFAOYSA-N 0.000 description 3
- GQVMHMFBVWSSPF-SOYUKNQTSA-N (4E,6E)-2,6-dimethylocta-2,4,6-triene Chemical compound C\C=C(/C)\C=C\C=C(C)C GQVMHMFBVWSSPF-SOYUKNQTSA-N 0.000 description 3
- 239000001306 (7E,9E,11E,13E)-pentadeca-7,9,11,13-tetraen-1-ol Substances 0.000 description 3
- CZVXBFUKBZRMKR-JTQLQIEISA-N (R)-lavandulol Natural products CC(C)=CC[C@@H](CO)C(C)=C CZVXBFUKBZRMKR-JTQLQIEISA-N 0.000 description 3
- WUOACPNHFRMFPN-SECBINFHSA-N (S)-(-)-alpha-terpineol Chemical compound CC1=CC[C@@H](C(C)(C)O)CC1 WUOACPNHFRMFPN-SECBINFHSA-N 0.000 description 3
- RFFOTVCVTJUTAD-AOOOYVTPSA-N 1,4-cineole Chemical compound CC(C)[C@]12CC[C@](C)(CC1)O2 RFFOTVCVTJUTAD-AOOOYVTPSA-N 0.000 description 3
- BYDRTKVGBRTTIT-UHFFFAOYSA-N 2-methylprop-2-en-1-ol Chemical compound CC(=C)CO BYDRTKVGBRTTIT-UHFFFAOYSA-N 0.000 description 3
- UWKAYLJWKGQEPM-UHFFFAOYSA-N 3,7-dimethylocta-1,6-dien-3-yl acetate Chemical compound CC(C)=CCCC(C)(C=C)OC(C)=O UWKAYLJWKGQEPM-UHFFFAOYSA-N 0.000 description 3
- LEEZDPXWPYCRRM-UHFFFAOYSA-N 3-hydroxy-beta-eudesmol Natural products C1CC(O)C(=C)C2CC(C(C)(O)C)CCC21C LEEZDPXWPYCRRM-UHFFFAOYSA-N 0.000 description 3
- 102100039217 3-ketoacyl-CoA thiolase, peroxisomal Human genes 0.000 description 3
- HIQIXEFWDLTDED-UHFFFAOYSA-N 4-hydroxy-1-piperidin-4-ylpyrrolidin-2-one Chemical compound O=C1CC(O)CN1C1CCNCC1 HIQIXEFWDLTDED-UHFFFAOYSA-N 0.000 description 3
- 102100034044 All-trans-retinol dehydrogenase [NAD(+)] ADH1B Human genes 0.000 description 3
- 101710193111 All-trans-retinol dehydrogenase [NAD(+)] ADH4 Proteins 0.000 description 3
- WPYMKLBDIGXBTP-UHFFFAOYSA-N Benzoic acid Natural products OC(=O)C1=CC=CC=C1 WPYMKLBDIGXBTP-UHFFFAOYSA-N 0.000 description 3
- 239000005711 Benzoic acid Substances 0.000 description 3
- 238000010354 CRISPR gene editing Methods 0.000 description 3
- ZJMVJDFTNPZVMB-UHFFFAOYSA-N Casbene Chemical compound C1CC(C)=CCCC(C)=CCCC(C)=CC2C(C)(C)C12 ZJMVJDFTNPZVMB-UHFFFAOYSA-N 0.000 description 3
- 108010059892 Cellulase Proteins 0.000 description 3
- 229930008410 Chrysanthemol Natural products 0.000 description 3
- 241000195493 Cryptophyta Species 0.000 description 3
- 108030005251 Cucurbitadienol synthases Proteins 0.000 description 3
- IBVJWOMJGCHRRW-UHFFFAOYSA-N Delta22-Carene Natural products C1CC(C)=CC2C(C)(C)C12 IBVJWOMJGCHRRW-UHFFFAOYSA-N 0.000 description 3
- 241000223221 Fusarium oxysporum Species 0.000 description 3
- 241000567178 Fusarium venenatum Species 0.000 description 3
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 3
- OINNEUNVOZHBOX-XBQSVVNOSA-N Geranylgeranyl diphosphate Natural products [P@](=O)(OP(=O)(O)O)(OC/C=C(\CC/C=C(\CC/C=C(\CC/C=C(\C)/C)/C)/C)/C)O OINNEUNVOZHBOX-XBQSVVNOSA-N 0.000 description 3
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 3
- 101100153048 Homo sapiens ACAA1 gene Proteins 0.000 description 3
- DGAQECJNVWCQMB-PUAWFVPOSA-M Ilexoside XXIX Chemical compound C[C@@H]1CC[C@@]2(CC[C@@]3(C(=CC[C@H]4[C@]3(CC[C@@H]5[C@@]4(CC[C@@H](C5(C)C)OS(=O)(=O)[O-])C)C)[C@@H]2[C@]1(C)O)C)C(=O)O[C@H]6[C@@H]([C@H]([C@@H]([C@H](O6)CO)O)O)O.[Na+] DGAQECJNVWCQMB-PUAWFVPOSA-M 0.000 description 3
- DTGKSKDOIYIVQL-MRTMQBJTSA-N Isoborneol Natural products C1C[C@@]2(C)[C@H](O)C[C@@H]1C2(C)C DTGKSKDOIYIVQL-MRTMQBJTSA-N 0.000 description 3
- 102100027612 Kallikrein-11 Human genes 0.000 description 3
- WHXSMMKQMYFTQS-UHFFFAOYSA-N Lithium Chemical compound [Li] WHXSMMKQMYFTQS-UHFFFAOYSA-N 0.000 description 3
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 3
- GLZPCOQZEFWAFX-JXMROGBWSA-N Nerol Natural products CC(C)=CCC\C(C)=C\CO GLZPCOQZEFWAFX-JXMROGBWSA-N 0.000 description 3
- 235000010676 Ocimum basilicum Nutrition 0.000 description 3
- 240000007926 Ocimum gratissimum Species 0.000 description 3
- 241001112159 Ogataea Species 0.000 description 3
- 108010085186 Peroxisomal Targeting Signals Proteins 0.000 description 3
- 102000012288 Phosphopyruvate Hydratase Human genes 0.000 description 3
- 108010022181 Phosphopyruvate Hydratase Proteins 0.000 description 3
- 235000008566 Pinus taeda Nutrition 0.000 description 3
- 241000218679 Pinus taeda Species 0.000 description 3
- ZLMJMSJWJFRBEC-UHFFFAOYSA-N Potassium Chemical compound [K] ZLMJMSJWJFRBEC-UHFFFAOYSA-N 0.000 description 3
- PXRCIOIWVGAZEP-UHFFFAOYSA-N Primaeres Camphenhydrat Natural products C1CC2C(O)(C)C(C)(C)C1C2 PXRCIOIWVGAZEP-UHFFFAOYSA-N 0.000 description 3
- WTARULDDTDQWMU-UHFFFAOYSA-N Pseudopinene Natural products C1C2C(C)(C)C1CCC2=C WTARULDDTDQWMU-UHFFFAOYSA-N 0.000 description 3
- BUGBHKTXTAQXES-UHFFFAOYSA-N Selenium Chemical compound [Se] BUGBHKTXTAQXES-UHFFFAOYSA-N 0.000 description 3
- XUIMIQQOPSSXEZ-UHFFFAOYSA-N Silicon Chemical compound [Si] XUIMIQQOPSSXEZ-UHFFFAOYSA-N 0.000 description 3
- 241000896499 Solanum habrochaites Species 0.000 description 3
- 235000014296 Solanum habrochaites Nutrition 0.000 description 3
- 241001453296 Synechococcus elongatus Species 0.000 description 3
- 238000010459 TALEN Methods 0.000 description 3
- 241000015728 Taxus canadensis Species 0.000 description 3
- 241000204673 Thermoplasma acidophilum Species 0.000 description 3
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 3
- 101710152431 Trypsin-like protease Proteins 0.000 description 3
- 150000007513 acids Chemical class 0.000 description 3
- 102000004139 alpha-Amylases Human genes 0.000 description 3
- OVKDFILSBMEKLT-UHFFFAOYSA-N alpha-Terpineol Natural products CC(=C)C1(O)CCC(C)=CC1 OVKDFILSBMEKLT-UHFFFAOYSA-N 0.000 description 3
- 229940024171 alpha-amylase Drugs 0.000 description 3
- PSVBPLKYDMHILE-UHFFFAOYSA-N alpha-humulene Natural products CC1=C/CC(C)(C)C=CCC=CCC1 PSVBPLKYDMHILE-UHFFFAOYSA-N 0.000 description 3
- 229940088601 alpha-terpineol Drugs 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 3
- 235000010233 benzoic acid Nutrition 0.000 description 3
- 229930006722 beta-pinene Natural products 0.000 description 3
- 230000006696 biosynthetic metabolic pathway Effects 0.000 description 3
- 229940116229 borneol Drugs 0.000 description 3
- 229930006739 camphene Natural products 0.000 description 3
- ZYPYEBYNXWUCEA-UHFFFAOYSA-N camphenilone Natural products C1CC2C(=O)C(C)(C)C1C2 ZYPYEBYNXWUCEA-UHFFFAOYSA-N 0.000 description 3
- 229930006737 car-3-ene Natural products 0.000 description 3
- 239000007833 carbon precursor Substances 0.000 description 3
- 229940117948 caryophyllene Drugs 0.000 description 3
- 229930009323 casbene Natural products 0.000 description 3
- RFFOTVCVTJUTAD-UHFFFAOYSA-N cineole Natural products C1CC2(C)CCC1(C(C)C)O2 RFFOTVCVTJUTAD-UHFFFAOYSA-N 0.000 description 3
- GQVMHMFBVWSSPF-UHFFFAOYSA-N cis-alloocimene Natural products CC=C(C)C=CC=C(C)C GQVMHMFBVWSSPF-UHFFFAOYSA-N 0.000 description 3
- 238000010367 cloning Methods 0.000 description 3
- 238000005520 cutting process Methods 0.000 description 3
- IBVJWOMJGCHRRW-BDAKNGLRSA-N delta-2-carene Chemical compound C1CC(C)=C[C@@H]2C(C)(C)[C@H]12 IBVJWOMJGCHRRW-BDAKNGLRSA-N 0.000 description 3
- 230000005014 ectopic expression Effects 0.000 description 3
- 125000000524 functional group Chemical group 0.000 description 3
- 229930182830 galactose Natural products 0.000 description 3
- LCWMKIHBLJLORW-UHFFFAOYSA-N gamma-carene Natural products C1CC(=C)CC2C(C)(C)C21 LCWMKIHBLJLORW-UHFFFAOYSA-N 0.000 description 3
- 230000002068 genetic effect Effects 0.000 description 3
- 239000008103 glucose Substances 0.000 description 3
- 239000000543 intermediate Substances 0.000 description 3
- PNDPGZBMCMUPRI-UHFFFAOYSA-N iodine Chemical compound II PNDPGZBMCMUPRI-UHFFFAOYSA-N 0.000 description 3
- QMZRXYCCCYYMHF-UHFFFAOYSA-N isopentenyl phosphate Chemical compound CC(=C)CCOP(O)(O)=O QMZRXYCCCYYMHF-UHFFFAOYSA-N 0.000 description 3
- 229940074928 isopropyl myristate Drugs 0.000 description 3
- 229910052744 lithium Inorganic materials 0.000 description 3
- 230000004807 localization Effects 0.000 description 3
- 239000003550 marker Substances 0.000 description 3
- 239000000203 mixture Substances 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 238000010369 molecular cloning Methods 0.000 description 3
- VLKZOEOYAKHREP-UHFFFAOYSA-N n-Hexane Chemical compound CCCCCC VLKZOEOYAKHREP-UHFFFAOYSA-N 0.000 description 3
- 230000007935 neutral effect Effects 0.000 description 3
- 210000003463 organelle Anatomy 0.000 description 3
- 210000002824 peroxisome Anatomy 0.000 description 3
- 238000003752 polymerase chain reaction Methods 0.000 description 3
- 239000011591 potassium Substances 0.000 description 3
- 229910052700 potassium Inorganic materials 0.000 description 3
- 239000000843 powder Substances 0.000 description 3
- 229930008679 prenylflavonoid Natural products 0.000 description 3
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 3
- 229910052711 selenium Inorganic materials 0.000 description 3
- 239000011669 selenium Substances 0.000 description 3
- 229910052710 silicon Inorganic materials 0.000 description 3
- 239000010703 silicon Substances 0.000 description 3
- 229910052708 sodium Inorganic materials 0.000 description 3
- 239000011734 sodium Substances 0.000 description 3
- 238000002470 solid-phase micro-extraction Methods 0.000 description 3
- 239000000243 solution Substances 0.000 description 3
- 239000002904 solvent Substances 0.000 description 3
- 230000008685 targeting Effects 0.000 description 3
- 229940116411 terpineol Drugs 0.000 description 3
- 238000001890 transfection Methods 0.000 description 3
- RRBYUSWBLVXTQN-UHFFFAOYSA-N tricyclene Chemical compound C12CC3CC2C1(C)C3(C)C RRBYUSWBLVXTQN-UHFFFAOYSA-N 0.000 description 3
- RRBYUSWBLVXTQN-VZCHMASFSA-N tricyclene Natural products C([C@@H]12)C3C[C@H]1C2(C)C3(C)C RRBYUSWBLVXTQN-VZCHMASFSA-N 0.000 description 3
- 238000011144 upstream manufacturing Methods 0.000 description 3
- IHPKGUQCSIINRJ-UHFFFAOYSA-N β-ocimene Natural products CC(C)=CCC=C(C)C=C IHPKGUQCSIINRJ-UHFFFAOYSA-N 0.000 description 3
- 101710135150 (+)-T-muurolol synthase ((2E,6E)-farnesyl diphosphate cyclizing) Proteins 0.000 description 2
- YMBFCQPIMVLNIU-SOUVJXGZSA-N (-)-exo-alpha-bergamotene Chemical compound C1[C@@H]2[C@@](CCC=C(C)C)(C)[C@H]1CC=C2C YMBFCQPIMVLNIU-SOUVJXGZSA-N 0.000 description 2
- DMHADBQKVWXPPM-PDDCSNRZSA-N (1e,3z,6e,10z,14s)-3,7,11-trimethyl-14-propan-2-ylcyclotetradeca-1,3,6,10-tetraene Chemical compound CC(C)[C@@H]\1CC\C(C)=C/CC\C(C)=C\C\C=C(\C)/C=C/1 DMHADBQKVWXPPM-PDDCSNRZSA-N 0.000 description 2
- 101710116641 (2Z,6E)-farnesyl diphosphate synthase Proteins 0.000 description 2
- MSQFYBKYYIVVAG-SOFGYWHQSA-N (2e)-3-methylhepta-2,6-dien-1-ol Chemical compound OC\C=C(/C)CCC=C MSQFYBKYYIVVAG-SOFGYWHQSA-N 0.000 description 2
- WQOSNKWCIQZRGH-IHSQGBLNSA-N (2r)-2-[(3e)-4,8-dimethylnona-3,7-dienyl]-2,7-dimethylchromen-5-ol Chemical compound CC1=CC(O)=C2C=C[C@](CC/C=C(C)/CCC=C(C)C)(C)OC2=C1 WQOSNKWCIQZRGH-IHSQGBLNSA-N 0.000 description 2
- UYLFTJMQPWWDCW-MVLVPLOLSA-N (2s)-2-[(3e)-4,8-dimethylnona-3,7-dienyl]-5-hydroxy-2,7-dimethylchromene-6-carboxylic acid Chemical compound CC1=C(C(O)=O)C(O)=C2C=C[C@@](CC/C=C(C)/CCC=C(C)C)(C)OC2=C1 UYLFTJMQPWWDCW-MVLVPLOLSA-N 0.000 description 2
- 239000001618 (3R)-3-methylpentan-1-ol Substances 0.000 description 2
- FQVLRGLGWNWPSS-BXBUPLCLSA-N (4r,7s,10s,13s,16r)-16-acetamido-13-(1h-imidazol-5-ylmethyl)-10-methyl-6,9,12,15-tetraoxo-7-propan-2-yl-1,2-dithia-5,8,11,14-tetrazacycloheptadecane-4-carboxamide Chemical compound N1C(=O)[C@@H](NC(C)=O)CSSC[C@@H](C(N)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)NC(=O)[C@@H]1CC1=CN=CN1 FQVLRGLGWNWPSS-BXBUPLCLSA-N 0.000 description 2
- JSNRRGGBADWTMC-UHFFFAOYSA-N (6E)-7,11-dimethyl-3-methylene-1,6,10-dodecatriene Chemical compound CC(C)=CCCC(C)=CCCC(=C)C=C JSNRRGGBADWTMC-UHFFFAOYSA-N 0.000 description 2
- NEJDKFPXHQRVMV-UHFFFAOYSA-N (E)-2-Methyl-2-buten-1-ol Natural products CC=C(C)CO NEJDKFPXHQRVMV-UHFFFAOYSA-N 0.000 description 2
- OJISWRZIEWCUBN-QIRCYJPOSA-N (E,E,E)-geranylgeraniol Chemical group CC(C)=CCC\C(C)=C\CC\C(C)=C\CC\C(C)=C\CO OJISWRZIEWCUBN-QIRCYJPOSA-N 0.000 description 2
- QYIMSPSDBYKPPY-RSKUXYSASA-N (S)-2,3-epoxysqualene Chemical compound CC(C)=CCC\C(C)=C\CC\C(C)=C\CC\C=C(/C)CC\C=C(/C)CC[C@@H]1OC1(C)C QYIMSPSDBYKPPY-RSKUXYSASA-N 0.000 description 2
- IIFRHZRUHAELRO-SOFGYWHQSA-N (e)-3-methylhept-2-en-1-ol Chemical compound CCCC\C(C)=C\CO IIFRHZRUHAELRO-SOFGYWHQSA-N 0.000 description 2
- NPMCAXBLYVUWTC-VQHVLOKHSA-N (e)-3-methyloct-2-en-1-ol Chemical compound CCCCC\C(C)=C\CO NPMCAXBLYVUWTC-VQHVLOKHSA-N 0.000 description 2
- TWUDHDJKTHYMGY-QHHAFSJGSA-N (e)-3-methylpent-2-ene-1,5-diol Chemical compound OCCC(/C)=C/CO TWUDHDJKTHYMGY-QHHAFSJGSA-N 0.000 description 2
- SZPKMIRLEAFMBV-ZZXKWVIFSA-N (e)-3-methylpent-3-en-1-ol Chemical compound C\C=C(/C)CCO SZPKMIRLEAFMBV-ZZXKWVIFSA-N 0.000 description 2
- DSQXMIYNGKOIQC-GORDUTHDSA-N (e)-4-bromo-3-methylbut-2-en-1-ol Chemical compound BrCC(/C)=C/CO DSQXMIYNGKOIQC-GORDUTHDSA-N 0.000 description 2
- QZYCIZBUCPJIGT-GORDUTHDSA-N (e)-4-chloro-3-methylbut-2-en-1-ol Chemical compound ClCC(/C)=C/CO QZYCIZBUCPJIGT-GORDUTHDSA-N 0.000 description 2
- SRQGZQPUPABHCN-RQOWECAXSA-N (z)-3-chlorobut-2-en-1-ol Chemical compound C\C(Cl)=C\CO SRQGZQPUPABHCN-RQOWECAXSA-N 0.000 description 2
- YVLPJIGOMTXXLP-UHFFFAOYSA-N 15-cis-phytoene Chemical compound CC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CC=CC=C(C)CCC=C(C)CCC=C(C)CCC=C(C)C YVLPJIGOMTXXLP-UHFFFAOYSA-N 0.000 description 2
- QPIZDZGIXDKCRC-JTCWOHKRSA-N 2,4-dihydroxy-6-methyl-3-[(2e,6e)-3,7,11-trimethyldodeca-2,6,10-trienyl]benzoic acid Chemical compound CC(C)=CCC\C(C)=C\CC\C(C)=C\CC1=C(O)C=C(C)C(C(O)=O)=C1O QPIZDZGIXDKCRC-JTCWOHKRSA-N 0.000 description 2
- NEJDKFPXHQRVMV-HWKANZROSA-N 2-Methyl-2-buten-1-ol Chemical compound C\C=C(/C)CO NEJDKFPXHQRVMV-HWKANZROSA-N 0.000 description 2
- NVGOATMUHKIQQG-UHFFFAOYSA-N 2-Methyl-3-buten-1-ol Chemical compound OCC(C)C=C NVGOATMUHKIQQG-UHFFFAOYSA-N 0.000 description 2
- JKLUVCHKXQJGIG-UHFFFAOYSA-N 2-Methylenebutan-1-ol Chemical compound CCC(=C)CO JKLUVCHKXQJGIG-UHFFFAOYSA-N 0.000 description 2
- ZMEAEUNNMHQAJK-UHFFFAOYSA-N 2-[3-[3-[2,2-diphenylethyl-[(4-methoxyphenyl)methyl]amino]propoxy]phenyl]acetamide Chemical compound C1=CC(OC)=CC=C1CN(CC(C=1C=CC=CC=1)C=1C=CC=CC=1)CCCOC1=CC=CC(CC(N)=O)=C1 ZMEAEUNNMHQAJK-UHFFFAOYSA-N 0.000 description 2
- YAJSVGNJAFUMAC-UHFFFAOYSA-N 2-methylene-1,4-butanediol Chemical compound OCCC(=C)CO YAJSVGNJAFUMAC-UHFFFAOYSA-N 0.000 description 2
- OINNEUNVOZHBOX-QIRCYJPOSA-N 2-trans,6-trans,10-trans-geranylgeranyl diphosphate Chemical compound CC(C)=CCC\C(C)=C\CC\C(C)=C\CC\C(C)=C\COP(O)(=O)OP(O)(O)=O OINNEUNVOZHBOX-QIRCYJPOSA-N 0.000 description 2
- GJGGAZPKDCTKQF-UHFFFAOYSA-N 3,4-dimethylpent-3-en-1-ol Chemical compound CC(C)=C(C)CCO GJGGAZPKDCTKQF-UHFFFAOYSA-N 0.000 description 2
- ZURHFDBKAOZNNI-UHFFFAOYSA-N 3-(fluoromethyl)but-3-en-1-ol Chemical compound OCCC(=C)CF ZURHFDBKAOZNNI-UHFFFAOYSA-N 0.000 description 2
- RTKMFQOHBDVEBC-UHFFFAOYSA-N 3-bromo-3-buten-1-ol Chemical compound OCCC(Br)=C RTKMFQOHBDVEBC-UHFFFAOYSA-N 0.000 description 2
- WGSISXQCCOFXKW-UHFFFAOYSA-N 3-bromobut-2-en-1-ol Chemical compound CC(Br)=CCO WGSISXQCCOFXKW-UHFFFAOYSA-N 0.000 description 2
- KXJLZGFRZZIHKT-UHFFFAOYSA-N 3-chlorobut-3-en-1-ol Chemical compound OCCC(Cl)=C KXJLZGFRZZIHKT-UHFFFAOYSA-N 0.000 description 2
- BKUZULFHUKNCGI-UHFFFAOYSA-N 3-ethylpent-3-en-1-ol Chemical compound CCC(=CC)CCO BKUZULFHUKNCGI-UHFFFAOYSA-N 0.000 description 2
- DVEFUHVVWJONKR-UHFFFAOYSA-N 3-ethylpentan-1-ol Chemical compound CCC(CC)CCO DVEFUHVVWJONKR-UHFFFAOYSA-N 0.000 description 2
- JTWTYCHOXSBENH-UHFFFAOYSA-N 3-methylhept-3-en-1-ol Chemical compound CCCC=C(C)CCO JTWTYCHOXSBENH-UHFFFAOYSA-N 0.000 description 2
- YENBGXRPXGPABV-UHFFFAOYSA-N 3-methylhept-6-en-1-ol Chemical compound OCCC(C)CCC=C YENBGXRPXGPABV-UHFFFAOYSA-N 0.000 description 2
- MUPPEBVXFKNMCI-UHFFFAOYSA-N 3-methylheptan-1-ol Chemical compound CCCCC(C)CCO MUPPEBVXFKNMCI-UHFFFAOYSA-N 0.000 description 2
- BQURYLHIDQYWFD-UHFFFAOYSA-N 3-methylhex-3-en-1-ol Chemical compound CCC=C(C)CCO BQURYLHIDQYWFD-UHFFFAOYSA-N 0.000 description 2
- RJLCWOGYMNKNGB-UHFFFAOYSA-N 3-methylhex-4-en-1-ol Chemical compound CC=CC(C)CCO RJLCWOGYMNKNGB-UHFFFAOYSA-N 0.000 description 2
- YMTRRNPGRBDHDL-UHFFFAOYSA-N 3-methylhex-5-en-1-ol Chemical compound OCCC(C)CC=C YMTRRNPGRBDHDL-UHFFFAOYSA-N 0.000 description 2
- MOEBANQVGCZVQC-UHFFFAOYSA-N 3-methylhexa-2,4-dien-1-ol Chemical compound CC=CC(C)=CCO MOEBANQVGCZVQC-UHFFFAOYSA-N 0.000 description 2
- YGZVAQICDGBHMD-UHFFFAOYSA-N 3-methylhexan-1-ol Chemical compound CCCC(C)CCO YGZVAQICDGBHMD-UHFFFAOYSA-N 0.000 description 2
- AGVSLWNSTURZLY-UHFFFAOYSA-N 3-methylidenehept-6-en-1-ol Chemical compound OCCC(=C)CCC=C AGVSLWNSTURZLY-UHFFFAOYSA-N 0.000 description 2
- ZPLYSVMRQUOURS-UHFFFAOYSA-N 3-methylideneheptan-1-ol Chemical compound CCCCC(=C)CCO ZPLYSVMRQUOURS-UHFFFAOYSA-N 0.000 description 2
- BWROKMCJAGKDDI-UHFFFAOYSA-N 3-methylidenehexan-1-ol Chemical compound CCCC(=C)CCO BWROKMCJAGKDDI-UHFFFAOYSA-N 0.000 description 2
- PUYKIRCNGYGEGO-UHFFFAOYSA-N 3-methylideneoctan-1-ol Chemical compound CCCCCC(=C)CCO PUYKIRCNGYGEGO-UHFFFAOYSA-N 0.000 description 2
- CKMILIDJFMEGAX-UHFFFAOYSA-N 3-methylidenepent-4-en-1-ol Chemical compound OCCC(=C)C=C CKMILIDJFMEGAX-UHFFFAOYSA-N 0.000 description 2
- PJQKGOHDVPQQAA-UHFFFAOYSA-N 3-methyloct-3-en-1-ol Chemical compound CCCCC=C(C)CCO PJQKGOHDVPQQAA-UHFFFAOYSA-N 0.000 description 2
- ATNLQUNMECVMJZ-UHFFFAOYSA-N 3-methyloct-6-en-1-ol Chemical compound CC=CCCC(C)CCO ATNLQUNMECVMJZ-UHFFFAOYSA-N 0.000 description 2
- NPFMMNUXJYELCI-UHFFFAOYSA-N 3-methyloct-7-en-1-ol Chemical compound OCCC(C)CCCC=C NPFMMNUXJYELCI-UHFFFAOYSA-N 0.000 description 2
- SRDFQYARCLNJAG-UHFFFAOYSA-N 3-methylocta-2,4-dien-1-ol Chemical compound C(CC)C=CC(=CCO)C SRDFQYARCLNJAG-UHFFFAOYSA-N 0.000 description 2
- CLFSZAMBOZSCOS-UHFFFAOYSA-N 3-methyloctan-1-ol Chemical compound CCCCCC(C)CCO CLFSZAMBOZSCOS-UHFFFAOYSA-N 0.000 description 2
- ZSJHASYJQIRSLE-UHFFFAOYSA-N 3-methylpent-2-en-4-yn-1-ol Chemical compound C#CC(C)=CCO ZSJHASYJQIRSLE-UHFFFAOYSA-N 0.000 description 2
- RRURBGUJFHZKAL-UHFFFAOYSA-N 3-methylpent-4-yn-1-ol Chemical compound C#CC(C)CCO RRURBGUJFHZKAL-UHFFFAOYSA-N 0.000 description 2
- MOBKGOLFIBXIOP-UHFFFAOYSA-N 3-methylpenta-2,4-dien-1-ol Chemical compound C=CC(C)=CCO MOBKGOLFIBXIOP-UHFFFAOYSA-N 0.000 description 2
- OSJPPGNTCRNQQC-UWTATZPHSA-N 3-phospho-D-glyceric acid Chemical compound OC(=O)[C@H](O)COP(O)(O)=O OSJPPGNTCRNQQC-UWTATZPHSA-N 0.000 description 2
- OSWFIVFLDKOXQC-UHFFFAOYSA-N 4-(3-methoxyphenyl)aniline Chemical compound COC1=CC=CC(C=2C=CC(N)=CC=2)=C1 OSWFIVFLDKOXQC-UHFFFAOYSA-N 0.000 description 2
- GCHJBJOOADXJFT-UHFFFAOYSA-N 4-hydroxy-2-methylbut-2-enal Chemical compound O=CC(C)=CCO GCHJBJOOADXJFT-UHFFFAOYSA-N 0.000 description 2
- HXONYKIVDNJZBB-UHFFFAOYSA-N 4-hydroxy-2-methylidenebutanal Chemical compound OCCC(=C)C=O HXONYKIVDNJZBB-UHFFFAOYSA-N 0.000 description 2
- XECQOLIOBNMTQA-UHFFFAOYSA-N 4-methyl-3-methylidenepentan-1-ol Chemical compound CC(C)C(=C)CCO XECQOLIOBNMTQA-UHFFFAOYSA-N 0.000 description 2
- NVEQFIOZRFFVFW-UHFFFAOYSA-N 9-epi-beta-caryophyllene oxide Natural products C=C1CCC2OC2(C)CCC2C(C)(C)CC21 NVEQFIOZRFFVFW-UHFFFAOYSA-N 0.000 description 2
- ZRLNBWWGLOPJIC-PYQRSULMSA-N A'-neogammacerane Chemical compound C([C@]1(C)[C@H]2CC[C@H]34)CCC(C)(C)[C@@H]1CC[C@@]2(C)[C@]4(C)CC[C@@H]1[C@]3(C)CC[C@@H]1C(C)C ZRLNBWWGLOPJIC-PYQRSULMSA-N 0.000 description 2
- 102000007698 Alcohol dehydrogenase Human genes 0.000 description 2
- 108010021809 Alcohol dehydrogenase Proteins 0.000 description 2
- 102100034035 Alcohol dehydrogenase 1A Human genes 0.000 description 2
- QGZKDVFQNNGYKY-UHFFFAOYSA-O Ammonium Chemical compound [NH4+] QGZKDVFQNNGYKY-UHFFFAOYSA-O 0.000 description 2
- 108010037870 Anthranilate Synthase Proteins 0.000 description 2
- 241000228212 Aspergillus Species 0.000 description 2
- 101000690713 Aspergillus niger Alpha-glucosidase Proteins 0.000 description 2
- 101000756530 Aspergillus niger Endo-1,4-beta-xylanase B Proteins 0.000 description 2
- IVRMZWNICZWHMI-UHFFFAOYSA-N Azide Chemical compound [N-]=[N+]=[N-] IVRMZWNICZWHMI-UHFFFAOYSA-N 0.000 description 2
- 241000194107 Bacillus megaterium Species 0.000 description 2
- 239000002028 Biomass Substances 0.000 description 2
- IHJMVDDWXNKYOC-UHFFFAOYSA-N BrCC(CCO)=C Chemical compound BrCC(CCO)=C IHJMVDDWXNKYOC-UHFFFAOYSA-N 0.000 description 2
- GAWIXWVDTYZWAW-UHFFFAOYSA-N C[CH]O Chemical group C[CH]O GAWIXWVDTYZWAW-UHFFFAOYSA-N 0.000 description 2
- 241000222120 Candida <Saccharomycetales> Species 0.000 description 2
- KXDHJXZQYSOELW-UHFFFAOYSA-M Carbamate Chemical compound NC([O-])=O KXDHJXZQYSOELW-UHFFFAOYSA-M 0.000 description 2
- 102100031065 Choline kinase alpha Human genes 0.000 description 2
- 235000005979 Citrus limon Nutrition 0.000 description 2
- 244000131522 Citrus pyriformis Species 0.000 description 2
- 235000009088 Citrus pyriformis Nutrition 0.000 description 2
- 108700010070 Codon Usage Proteins 0.000 description 2
- UANVCGQMNRTKGM-UHFFFAOYSA-N Confluentinsaeure Natural products CCCCCC(=O)CC1=CC(OC)=CC(O)=C1C(=O)OC1=CC(CCCCC)=C(C(O)=O)C(OC)=C1 UANVCGQMNRTKGM-UHFFFAOYSA-N 0.000 description 2
- 241001527609 Cryptococcus Species 0.000 description 2
- 240000008067 Cucumis sativus Species 0.000 description 2
- 235000009849 Cucumis sativus Nutrition 0.000 description 2
- XFXPMWWXUTWYJX-UHFFFAOYSA-N Cyanide Chemical compound N#[C-] XFXPMWWXUTWYJX-UHFFFAOYSA-N 0.000 description 2
- 102000002004 Cytochrome P-450 Enzyme System Human genes 0.000 description 2
- 108010015742 Cytochrome P-450 Enzyme System Proteins 0.000 description 2
- 101710132690 Endo-1,4-beta-xylanase A Proteins 0.000 description 2
- 208000033962 Fontaine progeroid syndrome Diseases 0.000 description 2
- 101000649352 Fusarium oxysporum f. sp. lycopersici (strain 4287 / CBS 123668 / FGSC 9935 / NRRL 34936) Endo-1,4-beta-xylanase A Proteins 0.000 description 2
- 101150094690 GAL1 gene Proteins 0.000 description 2
- 102100028501 Galanin peptides Human genes 0.000 description 2
- 108700028146 Genetic Enhancer Elements Proteins 0.000 description 2
- 101000892220 Geobacillus thermodenitrificans (strain NG80-2) Long-chain-alcohol dehydrogenase 1 Proteins 0.000 description 2
- 101710119400 Geranylfarnesyl diphosphate synthase Proteins 0.000 description 2
- 101710107752 Geranylgeranyl diphosphate synthase Proteins 0.000 description 2
- 108010070675 Glutathione transferase Proteins 0.000 description 2
- 102000005720 Glutathione transferase Human genes 0.000 description 2
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 2
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 2
- 108020005004 Guide RNA Proteins 0.000 description 2
- 101000780443 Homo sapiens Alcohol dehydrogenase 1A Proteins 0.000 description 2
- 101100121078 Homo sapiens GAL gene Proteins 0.000 description 2
- 101000869690 Homo sapiens Protein S100-A8 Proteins 0.000 description 2
- OWIKHYCFFJSOEH-UHFFFAOYSA-N Isocyanic acid Chemical compound N=C=O OWIKHYCFFJSOEH-UHFFFAOYSA-N 0.000 description 2
- 241000204082 Kitasatospora griseola Species 0.000 description 2
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical class C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 2
- 241001149698 Lipomyces Species 0.000 description 2
- UPYKUZBSLRQECL-UKMVMLAPSA-N Lycopene Natural products CC(=C/C=C/C=C(C)/C=C/C=C(C)/C=C/C1C(=C)CCCC1(C)C)C=CC=C(/C)C=CC2C(=C)CCCC2(C)C UPYKUZBSLRQECL-UKMVMLAPSA-N 0.000 description 2
- JEVVKJMRZMXFBT-XWDZUXABSA-N Lycophyll Natural products OC/C(=C/CC/C(=C\C=C\C(=C/C=C/C(=C\C=C\C=C(/C=C/C=C(\C=C\C=C(/CC/C=C(/CO)\C)\C)/C)\C)/C)\C)/C)/C JEVVKJMRZMXFBT-XWDZUXABSA-N 0.000 description 2
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 2
- 108090000157 Metallothionein Proteins 0.000 description 2
- 241001432455 Methanolobus tindarius DSM 2278 Species 0.000 description 2
- WKDHACJYWCUZDW-UHFFFAOYSA-N NCC(=C)CCO Chemical compound NCC(=C)CCO WKDHACJYWCUZDW-UHFFFAOYSA-N 0.000 description 2
- 108030006316 Nerylneryl diphosphate synthases Proteins 0.000 description 2
- NHNBFGGVMKEFGY-UHFFFAOYSA-N Nitrate Chemical compound [O-][N+]([O-])=O NHNBFGGVMKEFGY-UHFFFAOYSA-N 0.000 description 2
- 241001195348 Nusa Species 0.000 description 2
- 102000004316 Oxidoreductases Human genes 0.000 description 2
- 108090000854 Oxidoreductases Proteins 0.000 description 2
- 102000002508 Peptide Elongation Factors Human genes 0.000 description 2
- 108010068204 Peptide Elongation Factors Proteins 0.000 description 2
- NBIIXXVUZAFLBC-UHFFFAOYSA-N Phosphoric acid Chemical compound OP(O)(O)=O NBIIXXVUZAFLBC-UHFFFAOYSA-N 0.000 description 2
- 102100032442 Protein S100-A8 Human genes 0.000 description 2
- MUPFEKGTMRGPLJ-RMMQSMQOSA-N Raffinose Natural products O(C[C@H]1[C@@H](O)[C@H](O)[C@@H](O)[C@@H](O[C@@]2(CO)[C@H](O)[C@@H](O)[C@@H](CO)O2)O1)[C@@H]1[C@H](O)[C@@H](O)[C@@H](O)[C@@H](CO)O1 MUPFEKGTMRGPLJ-RMMQSMQOSA-N 0.000 description 2
- 240000001329 Salvia pomifera Species 0.000 description 2
- 235000016524 Salvia pomifera Nutrition 0.000 description 2
- 241000235346 Schizosaccharomyces Species 0.000 description 2
- 241000607762 Shigella flexneri Species 0.000 description 2
- 101710165017 Short-chain Z-isoprenyl diphosphate synthase Proteins 0.000 description 2
- 102000002669 Small Ubiquitin-Related Modifier Proteins Human genes 0.000 description 2
- 108010043401 Small Ubiquitin-Related Modifier Proteins Proteins 0.000 description 2
- 108091081024 Start codon Proteins 0.000 description 2
- 101000895629 Synechococcus sp. (strain ATCC 27264 / PCC 7002 / PR-6) Geranylgeranyl pyrophosphate synthase Proteins 0.000 description 2
- 108030004979 Terpentetriene synthases Proteins 0.000 description 2
- 102100036407 Thioredoxin Human genes 0.000 description 2
- 241000223230 Trichosporon Species 0.000 description 2
- 102000005924 Triose-Phosphate Isomerase Human genes 0.000 description 2
- MUPFEKGTMRGPLJ-UHFFFAOYSA-N UNPD196149 Natural products OC1C(O)C(CO)OC1(CO)OC1C(O)C(O)C(O)C(COC2C(C(O)C(O)C(CO)O2)O)O1 MUPFEKGTMRGPLJ-UHFFFAOYSA-N 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- 230000002159 abnormal effect Effects 0.000 description 2
- 125000004036 acetal group Chemical group 0.000 description 2
- 108010048241 acetamidase Proteins 0.000 description 2
- FPIPGXGPPPQFEQ-OVSJKPMPSA-N all-trans-retinol Chemical compound OC\C=C(/C)\C=C\C=C(/C)\C=C\C1=C(C)CCCC1(C)C FPIPGXGPPPQFEQ-OVSJKPMPSA-N 0.000 description 2
- NRKQTNOYIVOQOH-RCDVYXTJSA-N anthopogocyclolic acid Natural products O1C2=CC(O)=C(C(O)=O)C(C)=C2[C@@H]2C(C)(C)[C@H]3[C@@H]2[C@]1(C)CC3 NRKQTNOYIVOQOH-RCDVYXTJSA-N 0.000 description 2
- 108010047754 beta-Glucosidase Proteins 0.000 description 2
- 102000006995 beta-Glucosidase Human genes 0.000 description 2
- ZADPBFCGQRWHPN-UHFFFAOYSA-N boronic acid Chemical compound OBO ZADPBFCGQRWHPN-UHFFFAOYSA-N 0.000 description 2
- 125000001246 bromo group Chemical group Br* 0.000 description 2
- NEEDEQSZOUAJMU-UHFFFAOYSA-N but-2-yn-1-ol Chemical compound CC#CCO NEEDEQSZOUAJMU-UHFFFAOYSA-N 0.000 description 2
- LBHVNQAMWXEMLK-UHFFFAOYSA-N but-3-en-1-ol Chemical compound OCCC=C.OCCC=C LBHVNQAMWXEMLK-UHFFFAOYSA-N 0.000 description 2
- OTJZCIYGRUNXTP-UHFFFAOYSA-N but-3-yn-1-ol Chemical compound OCCC#C OTJZCIYGRUNXTP-UHFFFAOYSA-N 0.000 description 2
- JXKCVRNKAPHWJG-UHFFFAOYSA-N buta-2,3-dien-1-ol Chemical compound OCC=C=C JXKCVRNKAPHWJG-UHFFFAOYSA-N 0.000 description 2
- 125000003917 carbamoyl group Chemical group [H]N([H])C(*)=O 0.000 description 2
- 150000001244 carboxylic acid anhydrides Chemical class 0.000 description 2
- 101150038500 cas9 gene Proteins 0.000 description 2
- 230000003197 catalytic effect Effects 0.000 description 2
- 210000000170 cell membrane Anatomy 0.000 description 2
- 230000001413 cellular effect Effects 0.000 description 2
- DMHADBQKVWXPPM-SBHJBAJOSA-N cembrene Natural products CC(C)C1CCC(=C/CCC(=CCC=C(C)/C=C/1)C)C DMHADBQKVWXPPM-SBHJBAJOSA-N 0.000 description 2
- 125000001309 chloro group Chemical group Cl* 0.000 description 2
- 210000003763 chloroplast Anatomy 0.000 description 2
- WQOSNKWCIQZRGH-JOCHJYFZSA-N confluentin Natural products CC(C)=CCCC(C)=CCC[C@@]1(C)Oc2cc(C)cc(O)c2C=C1 WQOSNKWCIQZRGH-JOCHJYFZSA-N 0.000 description 2
- 238000010168 coupling process Methods 0.000 description 2
- 238000005859 coupling reaction Methods 0.000 description 2
- XLJMAIOERFSOGZ-UHFFFAOYSA-N cyanic acid Chemical compound OC#N XLJMAIOERFSOGZ-UHFFFAOYSA-N 0.000 description 2
- UYLFTJMQPWWDCW-UHFFFAOYSA-N daurichromenic acid Natural products CC1=C(C(O)=O)C(O)=C2C=CC(CCC=C(C)CCC=C(C)C)(C)OC2=C1 UYLFTJMQPWWDCW-UHFFFAOYSA-N 0.000 description 2
- 230000007423 decrease Effects 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- 102000024323 dimethylallyltranstransferase activity proteins Human genes 0.000 description 2
- 108040001168 dimethylallyltranstransferase activity proteins Proteins 0.000 description 2
- SNRUBQQJIBEYMU-UHFFFAOYSA-N dodecane Chemical compound CCCCCCCCCCCC SNRUBQQJIBEYMU-UHFFFAOYSA-N 0.000 description 2
- 238000012377 drug delivery Methods 0.000 description 2
- 108010091384 endoglucanase 2 Proteins 0.000 description 2
- 108010092413 endoglucanase V Proteins 0.000 description 2
- 125000004185 ester group Chemical group 0.000 description 2
- 125000001495 ethyl group Chemical group [H]C([H])([H])C([H])([H])* 0.000 description 2
- 108010038658 exo-1,4-beta-D-xylosidase Proteins 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- 229930009668 farnesene Natural products 0.000 description 2
- CXENHBSYCFFKJS-UHFFFAOYSA-N farnesene group Chemical group C=CC(C)=CCC=C(C)CCC=C(C)C CXENHBSYCFFKJS-UHFFFAOYSA-N 0.000 description 2
- 125000001153 fluoro group Chemical group F* 0.000 description 2
- 235000019253 formic acid Nutrition 0.000 description 2
- 125000002485 formyl group Chemical group [H]C(*)=O 0.000 description 2
- 230000037433 frameshift Effects 0.000 description 2
- 238000000769 gas chromatography-flame ionisation detection Methods 0.000 description 2
- 235000021472 generally recognized as safe Nutrition 0.000 description 2
- 238000010353 genetic engineering Methods 0.000 description 2
- XWRJRXQNOHXIOX-UHFFFAOYSA-N geranylgeraniol Natural products CC(C)=CCCC(C)=CCOCC=C(C)CCC=C(C)C XWRJRXQNOHXIOX-UHFFFAOYSA-N 0.000 description 2
- 125000002686 geranylgeranyl group Chemical group [H]C([*])([H])/C([H])=C(C([H])([H])[H])/C([H])([H])C([H])([H])/C([H])=C(C([H])([H])[H])/C([H])([H])C([H])([H])/C([H])=C(C([H])([H])[H])/C([H])([H])C([H])([H])C([H])=C(C([H])([H])[H])C([H])([H])[H] 0.000 description 2
- OJISWRZIEWCUBN-UHFFFAOYSA-N geranylnerol Natural products CC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CCO OJISWRZIEWCUBN-UHFFFAOYSA-N 0.000 description 2
- 229930184727 ginkgolide Natural products 0.000 description 2
- 239000005090 green fluorescent protein Substances 0.000 description 2
- 125000005067 haloformyl group Chemical group 0.000 description 2
- 125000001976 hemiacetal group Chemical group 0.000 description 2
- 125000005842 heteroatom Chemical group 0.000 description 2
- 125000004029 hydroxymethyl group Chemical group [H]OC([H])([H])* 0.000 description 2
- QPIZDZGIXDKCRC-UHFFFAOYSA-N ilicicolinic acid B Natural products CC(C)=CCCC(C)=CCCC(C)=CCC1=C(O)C=C(C)C(C(O)=O)=C1O QPIZDZGIXDKCRC-UHFFFAOYSA-N 0.000 description 2
- 125000005462 imide group Chemical group 0.000 description 2
- 238000000338 in vitro Methods 0.000 description 2
- 230000000977 initiatory effect Effects 0.000 description 2
- 125000002346 iodo group Chemical group I* 0.000 description 2
- 125000001449 isopropyl group Chemical group [H]C([H])([H])C([H])(*)C([H])([H])[H] 0.000 description 2
- GRHBQAYDJPGGLF-UHFFFAOYSA-N isothiocyanic acid Chemical compound N=C=S GRHBQAYDJPGGLF-UHFFFAOYSA-N 0.000 description 2
- 229930002697 labdane diterpene Natural products 0.000 description 2
- 150000001761 labdane diterpenoid derivatives Chemical class 0.000 description 2
- 230000000670 limiting effect Effects 0.000 description 2
- 235000012661 lycopene Nutrition 0.000 description 2
- 229960004999 lycopene Drugs 0.000 description 2
- 239000001751 lycopene Substances 0.000 description 2
- OAIJSZIZWZSQBC-GYZMGTAESA-N lycopene Chemical compound CC(C)=CCC\C(C)=C\C=C\C(\C)=C\C=C\C(\C)=C\C=C\C=C(/C)\C=C\C=C(/C)\C=C\C=C(/C)CCC=C(C)C OAIJSZIZWZSQBC-GYZMGTAESA-N 0.000 description 2
- 238000001819 mass spectrum Methods 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 230000002503 metabolic effect Effects 0.000 description 2
- BGVUIJDZTQIJIO-AZUAARDMSA-N miltiradiene Chemical compound CC1(C)CCC[C@]2(C)C(CC=C(C3)C(C)C)=C3CC[C@H]21 BGVUIJDZTQIJIO-AZUAARDMSA-N 0.000 description 2
- 239000006151 minimal media Substances 0.000 description 2
- 210000003470 mitochondria Anatomy 0.000 description 2
- 230000002438 mitochondrial effect Effects 0.000 description 2
- 125000002560 nitrile group Chemical group 0.000 description 2
- 230000036963 noncompetitive effect Effects 0.000 description 2
- 235000015097 nutrients Nutrition 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 125000001181 organosilyl group Chemical group [SiH3]* 0.000 description 2
- 125000002092 orthoester group Chemical group 0.000 description 2
- 230000003647 oxidation Effects 0.000 description 2
- 238000007254 oxidation reaction Methods 0.000 description 2
- 125000003544 oxime group Chemical group 0.000 description 2
- 230000036961 partial effect Effects 0.000 description 2
- 125000000864 peroxy group Chemical group O(O*)* 0.000 description 2
- 239000013600 plasmid vector Substances 0.000 description 2
- 229920001195 polyisoprene Polymers 0.000 description 2
- 239000013587 production medium Substances 0.000 description 2
- TVDSBUOJIPERQY-UHFFFAOYSA-N prop-2-yn-1-ol Chemical compound OCC#C TVDSBUOJIPERQY-UHFFFAOYSA-N 0.000 description 2
- 125000001436 propyl group Chemical group [H]C([*])([H])C([H])([H])C([H])([H])[H] 0.000 description 2
- 125000004076 pyridyl group Chemical group 0.000 description 2
- MUPFEKGTMRGPLJ-ZQSKZDJDSA-N raffinose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO[C@@H]2[C@@H]([C@@H](O)[C@@H](O)[C@@H](CO)O2)O)O1 MUPFEKGTMRGPLJ-ZQSKZDJDSA-N 0.000 description 2
- 230000008707 rearrangement Effects 0.000 description 2
- 230000009467 reduction Effects 0.000 description 2
- 238000006722 reduction reaction Methods 0.000 description 2
- 230000004044 response Effects 0.000 description 2
- XVULBTBTFGYVRC-HHUCQEJWSA-N sclareol Chemical compound CC1(C)CCC[C@]2(C)[C@@H](CC[C@](O)(C)C=C)[C@](C)(O)CC[C@H]21 XVULBTBTFGYVRC-HHUCQEJWSA-N 0.000 description 2
- 125000000467 secondary amino group Chemical group [H]N([*:1])[*:2] 0.000 description 2
- 239000006152 selective media Substances 0.000 description 2
- 238000002741 site-directed mutagenesis Methods 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 230000002194 synthesizing effect Effects 0.000 description 2
- AWIJRPNMLHPLNC-UHFFFAOYSA-N thiocarboxylic acid group Chemical group C(=S)O AWIJRPNMLHPLNC-UHFFFAOYSA-N 0.000 description 2
- ZMZDMBWJUHKJPS-UHFFFAOYSA-N thiocyanic acid Chemical compound SC#N ZMZDMBWJUHKJPS-UHFFFAOYSA-N 0.000 description 2
- 108060008226 thioredoxin Proteins 0.000 description 2
- 229940094937 thioredoxin Drugs 0.000 description 2
- YMBFCQPIMVLNIU-UHFFFAOYSA-N trans-alpha-bergamotene Natural products C1C2C(CCC=C(C)C)(C)C1CC=C2C YMBFCQPIMVLNIU-UHFFFAOYSA-N 0.000 description 2
- ZCIHMQAPACOQHT-ZGMPDRQDSA-N trans-isorenieratene Natural products CC(=C/C=C/C=C(C)/C=C/C=C(C)/C=C/c1c(C)ccc(C)c1C)C=CC=C(/C)C=Cc2c(C)ccc(C)c2C ZCIHMQAPACOQHT-ZGMPDRQDSA-N 0.000 description 2
- 238000010361 transduction Methods 0.000 description 2
- 230000026683 transduction Effects 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- APJYDQYYACXCRM-UHFFFAOYSA-N tryptamine Chemical compound C1=CC=C2C(CCN)=CNC2=C1 APJYDQYYACXCRM-UHFFFAOYSA-N 0.000 description 2
- YOVSPTNQHMDJAG-QLFBSQMISA-N β-eudesmene Chemical compound C1CCC(=C)[C@@H]2C[C@H](C(=C)C)CC[C@]21C YOVSPTNQHMDJAG-QLFBSQMISA-N 0.000 description 2
- KWFJIXPIFLVMPM-KHMAMNHCSA-N (+)-alpha-santalene Chemical compound CC(C)=CCC[C@]1(C)[C@@H]2C[C@H]3[C@@H](C2)[C@@]13C KWFJIXPIFLVMPM-KHMAMNHCSA-N 0.000 description 1
- BBPXZLJCPUPNGH-CMKODMSKSA-N (-)-Abietadiene Chemical compound CC1(C)CCC[C@]2(C)[C@@H](CCC(C(C)C)=C3)C3=CC[C@H]21 BBPXZLJCPUPNGH-CMKODMSKSA-N 0.000 description 1
- PGBNIHXXFQBCPU-ILXRZTDVSA-N (-)-beta-santalene Chemical compound C1C[C@H]2C(=C)[C@@](CCC=C(C)C)(C)[C@@H]1C2 PGBNIHXXFQBCPU-ILXRZTDVSA-N 0.000 description 1
- YMBFCQPIMVLNIU-KKUMJFAQSA-N (-)-endo-alpha-bergamotene Chemical compound C1[C@@H]2[C@](CCC=C(C)C)(C)[C@H]1CC=C2C YMBFCQPIMVLNIU-KKUMJFAQSA-N 0.000 description 1
- BQPPJGMMIYJVBR-UHFFFAOYSA-N (10S)-3c-Acetoxy-4.4.10r.13c.14t-pentamethyl-17c-((R)-1.5-dimethyl-hexen-(4)-yl)-(5tH)-Delta8-tetradecahydro-1H-cyclopenta[a]phenanthren Natural products CC12CCC(OC(C)=O)C(C)(C)C1CCC1=C2CCC2(C)C(C(CCC=C(C)C)C)CCC21C BQPPJGMMIYJVBR-UHFFFAOYSA-N 0.000 description 1
- ALEWCKXBHSDCCT-YFVJMOTDSA-N (2E,6E)-farnesyl monophosphate Chemical compound CC(C)=CCC\C(C)=C\CC\C(C)=C\COP(O)(O)=O ALEWCKXBHSDCCT-YFVJMOTDSA-N 0.000 description 1
- YHTCXUSSQJMLQD-GIXZANJISA-N (2E,6E,10E,14E)-geranylfarnesol Chemical compound CC(C)=CCC\C(C)=C\CC\C(C)=C\CC\C(C)=C\CC\C(C)=C\CO YHTCXUSSQJMLQD-GIXZANJISA-N 0.000 description 1
- DIGQNXIGRZPYDK-WKSCXVIASA-N (2R)-6-amino-2-[[2-[[(2S)-2-[[2-[[(2R)-2-[[(2S)-2-[[(2R,3S)-2-[[2-[[(2S)-2-[[2-[[(2S)-2-[[(2S)-2-[[(2R)-2-[[(2S,3S)-2-[[(2R)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[2-[[(2S)-2-[[(2R)-2-[[2-[[2-[[2-[(2-amino-1-hydroxyethylidene)amino]-3-carboxy-1-hydroxypropylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1-hydroxyethylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxyethylidene]amino]-1-hydroxypropylidene]amino]-1,3-dihydroxypropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxybutylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1-hydroxypropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxyethylidene]amino]-1,5-dihydroxy-5-iminopentylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxybutylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxyethylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1-hydroxyethylidene]amino]hexanoic acid Chemical compound C[C@@H]([C@@H](C(=N[C@@H](CS)C(=N[C@@H](C)C(=N[C@@H](CO)C(=NCC(=N[C@@H](CCC(=N)O)C(=NC(CS)C(=N[C@H]([C@H](C)O)C(=N[C@H](CS)C(=N[C@H](CO)C(=NCC(=N[C@H](CS)C(=NCC(=N[C@H](CCCCN)C(=O)O)O)O)O)O)O)O)O)O)O)O)O)O)O)N=C([C@H](CS)N=C([C@H](CO)N=C([C@H](CO)N=C([C@H](C)N=C(CN=C([C@H](CO)N=C([C@H](CS)N=C(CN=C(C(CS)N=C(C(CC(=O)O)N=C(CN)O)O)O)O)O)O)O)O)O)O)O)O DIGQNXIGRZPYDK-WKSCXVIASA-N 0.000 description 1
- 108030001843 (2Z,6Z)-farnesyl diphosphate synthases Proteins 0.000 description 1
- CHGIKSSZNBCNDW-UHFFFAOYSA-N (3beta,5alpha)-4,4-Dimethylcholesta-8,24-dien-3-ol Natural products CC12CCC(O)C(C)(C)C1CCC1=C2CCC2(C)C(C(CCC=C(C)C)C)CCC21 CHGIKSSZNBCNDW-UHFFFAOYSA-N 0.000 description 1
- KRLBLPBPZSSIGH-CSKARUKUSA-N (6e)-3,7-dimethylnona-1,6-dien-3-ol Chemical compound CC\C(C)=C\CCC(C)(O)C=C KRLBLPBPZSSIGH-CSKARUKUSA-N 0.000 description 1
- KQJSQWZMSAGSHN-UHFFFAOYSA-N (9beta,13alpha,14beta,20alpha)-3-hydroxy-9,13-dimethyl-2-oxo-24,25,26-trinoroleana-1(10),3,5,7-tetraen-29-oic acid Natural products CC12CCC3(C)C4CC(C)(C(O)=O)CCC4(C)CCC3(C)C2=CC=C2C1=CC(=O)C(O)=C2C KQJSQWZMSAGSHN-UHFFFAOYSA-N 0.000 description 1
- 239000001707 (E,7R,11R)-3,7,11,15-tetramethylhexadec-2-en-1-ol Substances 0.000 description 1
- 108010052418 (N-(2-((4-((2-((4-(9-acridinylamino)phenyl)amino)-2-oxoethyl)amino)-4-oxobutyl)amino)-1-(1H-imidazol-4-ylmethyl)-1-oxoethyl)-6-(((-2-aminoethyl)amino)methyl)-2-pyridinecarboxamidato) iron(1+) Proteins 0.000 description 1
- RUJPNZNXGCHGID-UHFFFAOYSA-N (Z)-beta-Terpineol Natural products CC(=C)C1CCC(C)(O)CC1 RUJPNZNXGCHGID-UHFFFAOYSA-N 0.000 description 1
- HNSDLXPSAYFUHK-UHFFFAOYSA-N 1,4-bis(2-ethylhexyl) sulfosuccinate Chemical compound CCCCC(CC)COC(=O)CC(S(O)(=O)=O)C(=O)OCC(CC)CCCC HNSDLXPSAYFUHK-UHFFFAOYSA-N 0.000 description 1
- GZCWLCBFPRFLKL-UHFFFAOYSA-N 1-prop-2-ynoxypropan-2-ol Chemical compound CC(O)COCC#C GZCWLCBFPRFLKL-UHFFFAOYSA-N 0.000 description 1
- FPIPGXGPPPQFEQ-UHFFFAOYSA-N 13-cis retinol Natural products OCC=C(C)C=CC=C(C)C=CC1=C(C)CCCC1(C)C FPIPGXGPPPQFEQ-UHFFFAOYSA-N 0.000 description 1
- XYTLYKGXLMKYMV-UHFFFAOYSA-N 14alpha-methylzymosterol Natural products CC12CCC(O)CC1CCC1=C2CCC2(C)C(C(CCC=C(C)C)C)CCC21C XYTLYKGXLMKYMV-UHFFFAOYSA-N 0.000 description 1
- YVLPJIGOMTXXLP-UUKUAVTLSA-N 15,15'-cis-Phytoene Natural products C(=C\C=C/C=C(\CC/C=C(\CC/C=C(\CC/C=C(\C)/C)/C)/C)/C)(\CC/C=C(\CC/C=C(\CC/C=C(\C)/C)/C)/C)/C YVLPJIGOMTXXLP-UUKUAVTLSA-N 0.000 description 1
- YVLPJIGOMTXXLP-BAHRDPFUSA-N 15Z-phytoene Natural products CC(=CCCC(=CCCC(=CCCC(=CC=C/C=C(C)/CCC=C(/C)CCC=C(/C)CCC=C(C)C)C)C)C)C YVLPJIGOMTXXLP-BAHRDPFUSA-N 0.000 description 1
- XYHKNCXZYYTLRG-UHFFFAOYSA-N 1h-imidazole-2-carbaldehyde Chemical compound O=CC1=NC=CN1 XYHKNCXZYYTLRG-UHFFFAOYSA-N 0.000 description 1
- MDHNGUIZZLNVNV-UHFFFAOYSA-N 2,4a,4b,7,7,10a-hexamethyl-2-(4-methylpentyl)-3,4,5,6,6a,8,9,10,10b,11,12,12a-dodecahydro-1h-chrysene Chemical compound C1CCC(C)(C)C2CCC3(C)C4(C)CCC(CCCC(C)C)(C)CC4CCC3C21C MDHNGUIZZLNVNV-UHFFFAOYSA-N 0.000 description 1
- ULSFDANQPKOXES-UHFFFAOYSA-N 2,5,6,8-tetramethyltricyclo[6.3.0.01,5]undecane Chemical class CC1CCC2(C)C(C)CC3(C)C12CCC3 ULSFDANQPKOXES-UHFFFAOYSA-N 0.000 description 1
- IXYJWLXGDVMKAY-UHFFFAOYSA-N 2,6,6,10-tetramethyltricyclo[5.3.1.01,5]undecane Chemical class C1C23C(C)CCC3C(C)(C)C1CCC2C IXYJWLXGDVMKAY-UHFFFAOYSA-N 0.000 description 1
- YNSUBIQKQNBKJT-UHFFFAOYSA-N 2-(1,3-thiazol-2-yl)ethanol Chemical compound OCCC1=NC=CS1 YNSUBIQKQNBKJT-UHFFFAOYSA-N 0.000 description 1
- 102100025230 2-amino-3-ketobutyrate coenzyme A ligase, mitochondrial Human genes 0.000 description 1
- QNGIKNUNNYQTHX-UHFFFAOYSA-N 2-methyl-2-(4-methylpentyl)tetracyclo[3.2.0.01,6.03,7]heptane Chemical compound CC(C)CCCC1(C)C2CC3C11C2C13 QNGIKNUNNYQTHX-UHFFFAOYSA-N 0.000 description 1
- OINNEUNVOZHBOX-QIRCYJPOSA-K 2-trans,6-trans,10-trans-geranylgeranyl diphosphate(3-) Chemical compound CC(C)=CCC\C(C)=C\CC\C(C)=C\CC\C(C)=C\COP([O-])(=O)OP([O-])([O-])=O OINNEUNVOZHBOX-QIRCYJPOSA-K 0.000 description 1
- GWYFCOCPABKNJV-UHFFFAOYSA-M 3-Methylbutanoic acid Natural products CC(C)CC([O-])=O GWYFCOCPABKNJV-UHFFFAOYSA-M 0.000 description 1
- TZCILBOEMCKTEA-UHFFFAOYSA-N 3-ethylpent-2-enyl phosphono hydrogen phosphate Chemical compound CCC(=CCOP(=O)(O)OP(=O)(O)O)CC TZCILBOEMCKTEA-UHFFFAOYSA-N 0.000 description 1
- LOWCSZZPBRBKHY-UHFFFAOYSA-N 3-methylhex-2-enyl phosphono hydrogen phosphate Chemical compound CCCC(C)=CCOP(O)(=O)OP(O)(O)=O LOWCSZZPBRBKHY-UHFFFAOYSA-N 0.000 description 1
- NFIFVJFRFHCXBE-UHFFFAOYSA-N 3-methylpent-2-enyl phosphono hydrogen phosphate Chemical compound CCC(C)=CCOP(O)(=O)OP(O)(O)=O NFIFVJFRFHCXBE-UHFFFAOYSA-N 0.000 description 1
- XZEUYTKSAYNYPK-UHFFFAOYSA-N 3beta-29-Norcycloart-24-en-3-ol Natural products C1CC2(C)C(C(CCC=C(C)C)C)CCC2(C)C2CCC3C(C)C(O)CCC33C21C3 XZEUYTKSAYNYPK-UHFFFAOYSA-N 0.000 description 1
- FPTJELQXIUUCEY-UHFFFAOYSA-N 3beta-Hydroxy-lanostan Natural products C1CC2C(C)(C)C(O)CCC2(C)C2C1C1(C)CCC(C(C)CCCC(C)C)C1(C)CC2 FPTJELQXIUUCEY-UHFFFAOYSA-N 0.000 description 1
- HNDJAZAWUZFLNZ-UHFFFAOYSA-N 4-but-1-en-2-yl-1-methylcyclohexene Chemical compound CCC(=C)C1CCC(C)=CC1 HNDJAZAWUZFLNZ-UHFFFAOYSA-N 0.000 description 1
- JCAIWDXKLCEQEO-ATPOGHATSA-N 5alpha,9alpha,10beta-labda-8(20),13-dien-15-yl diphosphate Chemical compound CC1(C)CCC[C@]2(C)[C@@H](CCC(/C)=C/COP(O)(=O)OP(O)(O)=O)C(=C)CC[C@H]21 JCAIWDXKLCEQEO-ATPOGHATSA-N 0.000 description 1
- CFDDGLYSWVWBIP-UHFFFAOYSA-N 7-methyl-3-methylidenedeca-1,6-diene Chemical compound CCCC(C)=CCCC(=C)C=C CFDDGLYSWVWBIP-UHFFFAOYSA-N 0.000 description 1
- BBPXZLJCPUPNGH-UHFFFAOYSA-N Abietadien Natural products CC1(C)CCCC2(C)C(CCC(C(C)C)=C3)C3=CCC21 BBPXZLJCPUPNGH-UHFFFAOYSA-N 0.000 description 1
- 102000013563 Acid Phosphatase Human genes 0.000 description 1
- 108010051457 Acid Phosphatase Proteins 0.000 description 1
- 108010087522 Aeromonas hydrophila lipase-acyltransferase Proteins 0.000 description 1
- BIADSXOKHZFLSN-GCJBHHCISA-N Ambrein Natural products CC(=CCC[C@H]1[C@](C)(O)CC[C@H]2C(C)(C)CCC[C@]12C)CC[C@@H]3C(=C)CCCC3(C)C BIADSXOKHZFLSN-GCJBHHCISA-N 0.000 description 1
- 108010039224 Amidophosphoribosyltransferase Proteins 0.000 description 1
- 101000977998 Arabidopsis thaliana Isopentenyl phosphate kinase Proteins 0.000 description 1
- 241000203069 Archaea Species 0.000 description 1
- 102000004580 Aspartic Acid Proteases Human genes 0.000 description 1
- 108010017640 Aspartic Acid Proteases Proteins 0.000 description 1
- 101000961203 Aspergillus awamori Glucoamylase Proteins 0.000 description 1
- 101900127796 Aspergillus oryzae Glucoamylase Proteins 0.000 description 1
- 101900318521 Aspergillus oryzae Triosephosphate isomerase Proteins 0.000 description 1
- HRQKOYFGHJYEFS-UHFFFAOYSA-N Beta psi-carotene Chemical compound CC(C)=CCCC(C)=CC=CC(C)=CC=CC(C)=CC=CC=C(C)C=CC=C(C)C=CC1=C(C)CCCC1(C)C HRQKOYFGHJYEFS-UHFFFAOYSA-N 0.000 description 1
- PJNXQRZFCCVFBC-UHFFFAOYSA-N CC1CC(C)C(C)CC2CC(C)(C)CC12 Chemical class CC1CC(C)C(C)CC2CC(C)(C)CC12 PJNXQRZFCCVFBC-UHFFFAOYSA-N 0.000 description 1
- 102100039866 CTP synthase 1 Human genes 0.000 description 1
- 108010018956 CTP synthetase Proteins 0.000 description 1
- 101100327917 Caenorhabditis elegans chup-1 gene Proteins 0.000 description 1
- DNJVYWXIDISQRD-UHFFFAOYSA-N Cafestol Natural products C1CC2(CC3(CO)O)CC3CCC2C2(C)C1C(C=CO1)=C1CC2 DNJVYWXIDISQRD-UHFFFAOYSA-N 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- AQKDBFWJOPNOKZ-UHFFFAOYSA-N Celastrol Natural products CC12CCC3(C)C4CC(C)(C(O)=O)CCC4(C)CCC3(C)C2=CC=C2C1=CC(=O)C(=O)C2C AQKDBFWJOPNOKZ-UHFFFAOYSA-N 0.000 description 1
- 108010084185 Cellulases Proteins 0.000 description 1
- 102000005575 Cellulases Human genes 0.000 description 1
- 108010018888 Choline kinase Proteins 0.000 description 1
- 102100029172 Choline-phosphate cytidylyltransferase A Human genes 0.000 description 1
- 101710100763 Choline-phosphate cytidylyltransferase A Proteins 0.000 description 1
- 108020004638 Circular DNA Proteins 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- JCAIWDXKLCEQEO-LXOWHHAPSA-N Copalyl diphosphate Natural products [P@@](=O)(OP(=O)(O)O)(OC/C=C(\CC[C@H]1C(=C)CC[C@H]2C(C)(C)CCC[C@@]12C)/C)O JCAIWDXKLCEQEO-LXOWHHAPSA-N 0.000 description 1
- RRTBTJPVUGMUNR-UHFFFAOYSA-N Cycloartanol Natural products C12CCC(C(C(O)CC3)(C)C)C3C2(CC)CCC2(C)C1(C)CCC2C(C)CCCC(C)C RRTBTJPVUGMUNR-UHFFFAOYSA-N 0.000 description 1
- PMPVIKIVABFJJI-UHFFFAOYSA-N Cyclobutane Chemical compound C1CCC1 PMPVIKIVABFJJI-UHFFFAOYSA-N 0.000 description 1
- 102000018832 Cytochromes Human genes 0.000 description 1
- 108010052832 Cytochromes Proteins 0.000 description 1
- KVSNMTUIMXZPLU-UHFFFAOYSA-N D:A-friedo-oleanane Natural products CC12CCC3(C)C4CC(C)(C)CCC4(C)CCC3(C)C2CCC2(C)C1CCCC2C KVSNMTUIMXZPLU-UHFFFAOYSA-N 0.000 description 1
- 230000004544 DNA amplification Effects 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 229920002307 Dextran Polymers 0.000 description 1
- XVULBTBTFGYVRC-UHFFFAOYSA-N Episclareol Natural products CC1(C)CCCC2(C)C(CCC(O)(C)C=C)C(C)(O)CCC21 XVULBTBTFGYVRC-UHFFFAOYSA-N 0.000 description 1
- 240000006890 Erythroxylum coca Species 0.000 description 1
- WSPRAEIJBDUDRX-UHFFFAOYSA-N Euferol Natural products CC12CCC3(C)C(C(CCC=C(C)C)C)CCC3(C)C1CC=C1C2CCC(O)C1(C)C WSPRAEIJBDUDRX-UHFFFAOYSA-N 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- JUUHNUPNMCGYDT-UHFFFAOYSA-N Friedelin Natural products CC1CC2C(C)(CCC3(C)C4CC(C)(C)CCC4(C)CCC23C)C5CCC(=O)C(C)C15 JUUHNUPNMCGYDT-UHFFFAOYSA-N 0.000 description 1
- 101150038242 GAL10 gene Proteins 0.000 description 1
- 101150108358 GLAA gene Proteins 0.000 description 1
- 102100024637 Galectin-10 Human genes 0.000 description 1
- 108030001780 Geranylfarnesyl diphosphate synthases Proteins 0.000 description 1
- BKLIAINBCQPSOV-UHFFFAOYSA-N Gluanol Natural products CC(C)CC=CC(C)C1CCC2(C)C3=C(CCC12C)C4(C)CCC(O)C(C)(C)C4CC3 BKLIAINBCQPSOV-UHFFFAOYSA-N 0.000 description 1
- 108010073178 Glucan 1,4-alpha-Glucosidase Proteins 0.000 description 1
- 102000005744 Glycoside Hydrolases Human genes 0.000 description 1
- 108010031186 Glycoside Hydrolases Proteins 0.000 description 1
- MCNAURNYDFSEML-UHFFFAOYSA-N Guaiane Natural products CC1CCC(C(C)=C)C(O)C2=C(C)C(=O)CC12 MCNAURNYDFSEML-UHFFFAOYSA-N 0.000 description 1
- 239000000899 Gutta-Percha Substances 0.000 description 1
- 244000043261 Hevea brasiliensis Species 0.000 description 1
- 102100024023 Histone PARylation factor 1 Human genes 0.000 description 1
- 101001047783 Homo sapiens Histone PARylation factor 1 Proteins 0.000 description 1
- 101001128634 Homo sapiens NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 2, mitochondrial Proteins 0.000 description 1
- HVXLSFNCWWWDPA-UHFFFAOYSA-N Isocycloartenol Natural products C1CC(O)C(C)(C)C2C31CC13CCC3(C)C(C(CCCC(C)=C)C)CCC3(C)C1CC2 HVXLSFNCWWWDPA-UHFFFAOYSA-N 0.000 description 1
- JEKMKNDURXDJAD-UHFFFAOYSA-N Kahweol Natural products C1CC2(CC3(CO)O)CC3CCC2C2(C)C1C(C=CO1)=C1C=C2 JEKMKNDURXDJAD-UHFFFAOYSA-N 0.000 description 1
- 241001138401 Kluyveromyces lactis Species 0.000 description 1
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- FBOZXECLQNJBKD-ZDUSSCGKSA-N L-methotrexate Chemical compound C=1N=C2N=C(N)N=C(N)C2=NC=1CN(C)C1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 FBOZXECLQNJBKD-ZDUSSCGKSA-N 0.000 description 1
- LOPKHWOTGJIQLC-UHFFFAOYSA-N Lanosterol Natural products CC(CCC=C(C)C)C1CCC2(C)C3=C(CCC12C)C4(C)CCC(C)(O)C(C)(C)C4CC3 LOPKHWOTGJIQLC-UHFFFAOYSA-N 0.000 description 1
- AOCXQJSHHUIKPB-AEGPPILISA-N Laurencenone A Natural products Br[C@]1(C)[C@@H](Cl)C[C@@]2(C(C)=CC(=O)CC2(C)C)CC1 AOCXQJSHHUIKPB-AEGPPILISA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 101150085283 MYS gene Proteins 0.000 description 1
- LAEIZWJAQRGPDA-UHFFFAOYSA-N Manoyloxid Natural products CC1(C)CCCC2(C)C3CC=C(C)OC3(C)CCC21 LAEIZWJAQRGPDA-UHFFFAOYSA-N 0.000 description 1
- 102000003792 Metallothionein Human genes 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 101000997933 Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv) (2E,6E)-farnesyl diphosphate synthase Proteins 0.000 description 1
- 101001015102 Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv) Dimethylallyltranstransferase Proteins 0.000 description 1
- 102100032194 NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 2, mitochondrial Human genes 0.000 description 1
- CAHGCLMLTWQZNJ-UHFFFAOYSA-N Nerifoliol Natural products CC12CCC(O)C(C)(C)C1CCC1=C2CCC2(C)C(C(CCC=C(C)C)C)CCC21C CAHGCLMLTWQZNJ-UHFFFAOYSA-N 0.000 description 1
- 108091005461 Nucleic proteins Chemical group 0.000 description 1
- ITZMWVKTANMHQA-YDHLFZDLSA-N Obtusadiene Natural products C1=CC(C)=CC[C@]21C(C)(C)[C@@H](Br)[C@@H](O)CC2=C ITZMWVKTANMHQA-YDHLFZDLSA-N 0.000 description 1
- 241001099341 Ogataea polymorpha Species 0.000 description 1
- 101000894711 Origanum vulgare Bicyclo-germacrene synthase Proteins 0.000 description 1
- 229930012538 Paclitaxel Natural products 0.000 description 1
- 240000000342 Palaquium gutta Species 0.000 description 1
- 108700020962 Peroxidase Proteins 0.000 description 1
- 102000003992 Peroxidases Human genes 0.000 description 1
- BLUHKGOSFDHHGX-UHFFFAOYSA-N Phytol Natural products CC(C)CCCC(C)CCCC(C)CCCC(C)C=CO BLUHKGOSFDHHGX-UHFFFAOYSA-N 0.000 description 1
- HXQRIQXPGMPSRW-UHZRDUGNSA-N Pollinastanol Natural products O[C@@H]1C[C@H]2[C@@]3([C@]4([C@H]([C@@]5(C)[C@@](C)([C@H]([C@H](CCCC(C)C)C)CC5)CC4)CC2)C3)CC1 HXQRIQXPGMPSRW-UHZRDUGNSA-N 0.000 description 1
- WHWHDGKOSUKYOV-GDXNDQEESA-N Polpunonic acid Chemical compound C([C@H]1[C@]2(C)CC[C@@]34C)[C@](C)(C(O)=O)CC[C@]1(C)CC[C@]2(C)[C@H]4CC[C@@]1(C)[C@H]3CCC(=O)[C@@H]1C WHWHDGKOSUKYOV-GDXNDQEESA-N 0.000 description 1
- 229920000388 Polyphosphate Polymers 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 241000235403 Rhizomucor miehei Species 0.000 description 1
- 101000968489 Rhizomucor miehei Lipase Proteins 0.000 description 1
- 101900006077 Saccharomyces cerevisiae Alcohol dehydrogenase 1 Proteins 0.000 description 1
- 101900354623 Saccharomyces cerevisiae Galactokinase Proteins 0.000 description 1
- 101900084120 Saccharomyces cerevisiae Triosephosphate isomerase Proteins 0.000 description 1
- 241001072909 Salvia Species 0.000 description 1
- 235000017276 Salvia Nutrition 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- XWQLUVQFUZQPDY-UHFFFAOYSA-N Sesquifenchene Natural products CC(=CCCC1(C)C2CCC1C(=C)C2)C XWQLUVQFUZQPDY-UHFFFAOYSA-N 0.000 description 1
- 235000002560 Solanum lycopersicum Nutrition 0.000 description 1
- 241000779819 Syncarpia glomulifera Species 0.000 description 1
- 102100036049 T-complex protein 1 subunit gamma Human genes 0.000 description 1
- 108700026226 TATA Box Proteins 0.000 description 1
- 229940123237 Taxane Drugs 0.000 description 1
- HNZBNQYXWOLKBA-UHFFFAOYSA-N Tetrahydrofarnesol Natural products CC(C)CCCC(C)CCCC(C)=CCO HNZBNQYXWOLKBA-UHFFFAOYSA-N 0.000 description 1
- 108010022394 Threonine synthase Proteins 0.000 description 1
- 108091023040 Transcription factor Proteins 0.000 description 1
- 102000040945 Transcription factor Human genes 0.000 description 1
- 102100033598 Triosephosphate isomerase Human genes 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- 102100021436 UDP-glucose 4-epimerase Human genes 0.000 description 1
- 108010075202 UDP-glucose 4-epimerase Proteins 0.000 description 1
- 108030003566 Valencene synthases Proteins 0.000 description 1
- KIQXKOUFPHTUQS-OQDDGJPTSA-N Valerenol Natural products OC/C(=C\[C@H]1C2=C(C)CC[C@@H]2[C@H](C)CC1)/C KIQXKOUFPHTUQS-OQDDGJPTSA-N 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- IXKSXJFAGXLQOQ-XISFHERQSA-N WHWLQLKPGQPMY Chemical compound C([C@@H](C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)C1=CNC=N1 IXKSXJFAGXLQOQ-XISFHERQSA-N 0.000 description 1
- 101000720152 Zea mays Acyclic sesquiterpene synthase Proteins 0.000 description 1
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 1
- 229930014549 abietadiene Natural products 0.000 description 1
- 150000000150 abietanes Chemical class 0.000 description 1
- 230000021736 acetylation Effects 0.000 description 1
- 238000006640 acetylation reaction Methods 0.000 description 1
- UXUIXVKLBWVGHM-UHFFFAOYSA-N acorane group Chemical class CC(C)C1CCC(C)C12CCC(C)CC2 UXUIXVKLBWVGHM-UHFFFAOYSA-N 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 239000012190 activator Substances 0.000 description 1
- 125000002015 acyclic group Chemical group 0.000 description 1
- 229930003651 acyclic monoterpene Natural products 0.000 description 1
- 150000002841 acyclic monoterpene derivatives Chemical class 0.000 description 1
- 229930009629 acyclic sesterterpene Natural products 0.000 description 1
- 150000000102 acyclic sesterterpene derivatives Chemical class 0.000 description 1
- 230000002776 aggregation Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- YHTCXUSSQJMLQD-UHFFFAOYSA-N all-E-geranylfarnesol Natural products CC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CCO YHTCXUSSQJMLQD-UHFFFAOYSA-N 0.000 description 1
- BOTWFXYSPFMFNR-OALUTQOASA-N all-rac-phytol Natural products CC(C)CCC[C@H](C)CCC[C@H](C)CCCC(C)=CCO BOTWFXYSPFMFNR-OALUTQOASA-N 0.000 description 1
- 229930002945 all-trans-retinaldehyde Natural products 0.000 description 1
- KVQOADNSNSUAJT-NHULGOKLSA-N alpha-Patchoulene Natural products C[C@@H]1[C@H]2[C@]3(C(C)(C)[C@H](C2)CC=C3C)CC1 KVQOADNSNSUAJT-NHULGOKLSA-N 0.000 description 1
- KIQXKOUFPHTUQS-UHFFFAOYSA-N alpha-Valerenol Natural products CC1CCC(C=C(C)CO)C2=C(C)CCC12 KIQXKOUFPHTUQS-UHFFFAOYSA-N 0.000 description 1
- FSLPMRQHCOLESF-UHFFFAOYSA-N alpha-amyrenol Natural products C1CC(O)C(C)(C)C2CCC3(C)C4(C)CCC5(C)CCC(C)C(C)C5C4=CCC3C21C FSLPMRQHCOLESF-UHFFFAOYSA-N 0.000 description 1
- FSLPMRQHCOLESF-SFMCKYFRSA-N alpha-amyrin Chemical compound C1C[C@H](O)C(C)(C)[C@@H]2CC[C@@]3(C)[C@]4(C)CC[C@@]5(C)CC[C@@H](C)[C@H](C)[C@H]5C4=CC[C@@H]3[C@]21C FSLPMRQHCOLESF-SFMCKYFRSA-N 0.000 description 1
- SJMCNAVDHDBMLL-UHFFFAOYSA-N alpha-amyrin Natural products CC1CCC2(C)CCC3(C)C(=CCC4C5(C)CCC(O)CC5CCC34C)C2C1C SJMCNAVDHDBMLL-UHFFFAOYSA-N 0.000 description 1
- 235000003903 alpha-carotene Nutrition 0.000 description 1
- 125000003447 alpha-pinene group Chemical group 0.000 description 1
- KWFJIXPIFLVMPM-BSMMKNRVSA-N alpha-santalene Natural products C(=C\CC[C@]1(C)C2(C)[C@H]3[C@@H]2CC1C3)(\C)/C KWFJIXPIFLVMPM-BSMMKNRVSA-N 0.000 description 1
- OZQAPQSEYFAMCY-UHFFFAOYSA-N alpha-selinene Natural products C1CC=C(C)C2CC(C(=C)C)CCC21C OZQAPQSEYFAMCY-UHFFFAOYSA-N 0.000 description 1
- JGINTSAQGRHGMG-SOUVJXGZSA-N alpha-trans-bergamotenol Natural products C1[C@@H]2[C@@](CCC=C(CO)C)(C)[C@H]1CC=C2C JGINTSAQGRHGMG-SOUVJXGZSA-N 0.000 description 1
- WYTGDNHDOZPMIW-RCBQFDQVSA-N alstonine Chemical compound C1=CC2=C3C=CC=CC3=NC2=C2N1C[C@H]1[C@H](C)OC=C(C(=O)OC)[C@H]1C2 WYTGDNHDOZPMIW-RCBQFDQVSA-N 0.000 description 1
- BFNBIHQBYMNNAN-UHFFFAOYSA-N ammonium sulfate Chemical compound N.N.OS(O)(=O)=O BFNBIHQBYMNNAN-UHFFFAOYSA-N 0.000 description 1
- 229910052921 ammonium sulfate Inorganic materials 0.000 description 1
- 239000001166 ammonium sulphate Substances 0.000 description 1
- 235000011130 ammonium sulphate Nutrition 0.000 description 1
- 239000000883 anti-obesity agent Substances 0.000 description 1
- 229940125710 antiobesity agent Drugs 0.000 description 1
- 229930003362 apo carotenoid Natural products 0.000 description 1
- 125000000135 apo carotenoid group Chemical group 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- 108060000514 aromatic prenyltransferase Proteins 0.000 description 1
- 210000004507 artificial chromosome Anatomy 0.000 description 1
- 125000003365 atisane group Chemical group 0.000 description 1
- 125000004429 atom Chemical group 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- JFSHUTJDVKUMTJ-QHPUVITPSA-N beta-amyrin Chemical compound C1C[C@H](O)C(C)(C)[C@@H]2CC[C@@]3(C)[C@]4(C)CC[C@@]5(C)CCC(C)(C)C[C@H]5C4=CC[C@@H]3[C@]21C JFSHUTJDVKUMTJ-QHPUVITPSA-N 0.000 description 1
- QQFMRPIKDLHLKB-UHFFFAOYSA-N beta-amyrin Natural products CC1C2C3=CCC4C5(C)CCC(O)C(C)(C)C5CCC4(C)C3(C)CCC2(C)CCC1(C)C QQFMRPIKDLHLKB-UHFFFAOYSA-N 0.000 description 1
- PDNLMONKODEGSE-UHFFFAOYSA-N beta-amyrin acetate Natural products CC(=O)OC1CCC2(C)C(CCC3(C)C4(C)CCC5(C)CCC(C)(C)CC5C4=CCC23C)C1(C)C PDNLMONKODEGSE-UHFFFAOYSA-N 0.000 description 1
- 235000013734 beta-carotene Nutrition 0.000 description 1
- 239000011648 beta-carotene Substances 0.000 description 1
- 150000001579 beta-carotenes Chemical class 0.000 description 1
- YOVSPTNQHMDJAG-UHFFFAOYSA-N beta-helmiscapene Natural products C1CCC(=C)C2CC(C(=C)C)CCC21C YOVSPTNQHMDJAG-UHFFFAOYSA-N 0.000 description 1
- GWYFCOCPABKNJV-UHFFFAOYSA-N beta-methyl-butyric acid Natural products CC(C)CC(O)=O GWYFCOCPABKNJV-UHFFFAOYSA-N 0.000 description 1
- 125000000045 beyerane group Chemical group 0.000 description 1
- 229930003642 bicyclic monoterpene Natural products 0.000 description 1
- 150000001604 bicyclic monoterpene derivatives Chemical class 0.000 description 1
- 230000002210 biocatalytic effect Effects 0.000 description 1
- 230000003115 biocidal effect Effects 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 229920001222 biopolymer Polymers 0.000 description 1
- 230000001851 biosynthetic effect Effects 0.000 description 1
- 150000001372 bisabolanes Chemical class 0.000 description 1
- 150000001636 bornane derivatives Chemical class 0.000 description 1
- 239000000872 buffer Substances 0.000 description 1
- FZZNNPQZDRVKLU-YTFOTSKYSA-N cadinane group Chemical group [C@@H]12CC[C@H](C)C[C@H]1[C@@H](CC[C@@H]2C)C(C)C FZZNNPQZDRVKLU-YTFOTSKYSA-N 0.000 description 1
- DNJVYWXIDISQRD-JTSSGKSMSA-N cafestol Chemical compound C([C@H]1C[C@]2(C[C@@]1(CO)O)CC1)C[C@H]2[C@@]2(C)[C@H]1C(C=CO1)=C1CC2 DNJVYWXIDISQRD-JTSSGKSMSA-N 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 229960001714 calcium phosphate Drugs 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 235000021466 carotenoid Nutrition 0.000 description 1
- 150000001747 carotenoids Chemical class 0.000 description 1
- 150000002426 caryophyllane derivatives Chemical class 0.000 description 1
- 125000000745 casbane group Chemical group 0.000 description 1
- 125000002142 cassane group Chemical group 0.000 description 1
- 101150062912 cct3 gene Proteins 0.000 description 1
- 229930002312 cedrane Natural products 0.000 description 1
- 125000000591 cedrane group Chemical group 0.000 description 1
- KQJSQWZMSAGSHN-JJWQIEBTSA-N celastrol Chemical compound C([C@H]1[C@]2(C)CC[C@@]34C)[C@](C)(C(O)=O)CC[C@]1(C)CC[C@]2(C)C4=CC=C1C3=CC(=O)C(O)=C1C KQJSQWZMSAGSHN-JJWQIEBTSA-N 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- 230000004663 cell proliferation Effects 0.000 description 1
- 210000002421 cell wall Anatomy 0.000 description 1
- 150000004385 cembranes Chemical class 0.000 description 1
- 150000000121 chamigrane derivatives Chemical class 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 210000004978 chinese hamster ovary cell Anatomy 0.000 description 1
- 238000011210 chromatographic step Methods 0.000 description 1
- 230000002759 chromosomal effect Effects 0.000 description 1
- AORLUAKWVIEOLL-UHFFFAOYSA-N chrysanthemyl diphosphate Chemical compound CC(C)=CC1C(COP(O)(=O)OP(O)(O)=O)C1(C)C AORLUAKWVIEOLL-UHFFFAOYSA-N 0.000 description 1
- NXJJBCPAGHGVJC-IFDWNBOGSA-N cis-dehydrosqualene Natural products CC(=CCCC(=CCCC(=CC=C/C=C(C)/CCC=C(/C)CCC=C(C)C)C)C)C NXJJBCPAGHGVJC-IFDWNBOGSA-N 0.000 description 1
- 108010069282 cis-prenyl transferase Proteins 0.000 description 1
- RUJPNZNXGCHGID-MGCOHNPYSA-N cis-β-terpineol Chemical compound CC(=C)[C@H]1CC[C@](C)(O)CC1 RUJPNZNXGCHGID-MGCOHNPYSA-N 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 125000000517 cleistanthane group Chemical group 0.000 description 1
- 235000008957 cocaer Nutrition 0.000 description 1
- ZPUCINDJVBIVPJ-LJISPDSOSA-N cocaine Chemical compound O([C@H]1C[C@@H]2CC[C@@H](N2C)[C@H]1C(=O)OC)C(=O)C1=CC=CC=C1 ZPUCINDJVBIVPJ-LJISPDSOSA-N 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 239000002299 complementary DNA Substances 0.000 description 1
- 229940125904 compound 1 Drugs 0.000 description 1
- 229940125782 compound 2 Drugs 0.000 description 1
- 229940126214 compound 3 Drugs 0.000 description 1
- 229940125898 compound 5 Drugs 0.000 description 1
- 239000000470 constituent Substances 0.000 description 1
- ZDGVATANBJCRHY-NUKBDRAPSA-N copal-8-ol diphosphate Chemical compound CC1(C)CCC[C@]2(C)[C@@H](CCC(/C)=C/COP(O)(=O)OP(O)(O)=O)[C@](C)(O)CC[C@H]21 ZDGVATANBJCRHY-NUKBDRAPSA-N 0.000 description 1
- 239000002537 cosmetic Substances 0.000 description 1
- WSPRAEIJBDUDRX-FBJXRMALSA-N cucurbitadienol Chemical compound C([C@H]1[C@]2(C)CC[C@@H]([C@]2(CC[C@]11C)C)[C@@H](CCC=C(C)C)C)C=C2[C@H]1CC[C@H](O)C2(C)C WSPRAEIJBDUDRX-FBJXRMALSA-N 0.000 description 1
- 150000001928 cycloartanes Chemical class 0.000 description 1
- ONQRKEUAIJMULO-YBXTVTTCSA-N cycloartenol Chemical compound CC(C)([C@@H](O)CC1)[C@H]2[C@@]31C[C@@]13CC[C@]3(C)[C@@H]([C@@H](CCC=C(C)C)C)CC[C@@]3(C)[C@@H]1CC2 ONQRKEUAIJMULO-YBXTVTTCSA-N 0.000 description 1
- YNBJLDSWFGUFRT-UHFFFAOYSA-N cycloartenol Natural products CC(CCC=C(C)C)C1CCC2(C)C1(C)CCC34CC35CCC(O)C(C)(C)C5CCC24C YNBJLDSWFGUFRT-UHFFFAOYSA-N 0.000 description 1
- FODTZLFLDFKIQH-UHFFFAOYSA-N cycloartenol trans-ferulate Natural products C1=C(O)C(OC)=CC(C=CC(=O)OC2C(C3CCC4C5(C)CCC(C5(C)CCC54CC53CC2)C(C)CCC=C(C)C)(C)C)=C1 FODTZLFLDFKIQH-UHFFFAOYSA-N 0.000 description 1
- 229930007927 cymene Natural products 0.000 description 1
- 210000000172 cytosol Anatomy 0.000 description 1
- 150000001620 daucane derivatives Chemical class 0.000 description 1
- 238000005034 decoration Methods 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 235000015872 dietary supplement Nutrition 0.000 description 1
- QBSJHOGDIUQWTH-UHFFFAOYSA-N dihydrolanosterol Natural products CC(C)CCCC(C)C1CCC2(C)C3=C(CCC12C)C4(C)CCC(C)(O)C(C)(C)C4CC3 QBSJHOGDIUQWTH-UHFFFAOYSA-N 0.000 description 1
- 125000002228 disulfide group Chemical group 0.000 description 1
- 150000004141 diterpene derivatives Chemical class 0.000 description 1
- 230000034431 double-strand break repair via homologous recombination Effects 0.000 description 1
- 150000003483 drimane derivatives Chemical class 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 230000009977 dual effect Effects 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 125000001513 elemane group Chemical group 0.000 description 1
- 108010091371 endoglucanase 1 Proteins 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- IEICDHBPEPUHOB-UHFFFAOYSA-N ent-beta-selinene Natural products C1CCC(=C)C2CC(C(C)C)CCC21C IEICDHBPEPUHOB-UHFFFAOYSA-N 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- AJWBFJHTFGRNDG-GBJTYRQASA-N eremophilane group Chemical group [C@H]12CCC[C@H](C)[C@@]1(C)C[C@@H](CC2)C(C)C AJWBFJHTFGRNDG-GBJTYRQASA-N 0.000 description 1
- 108010044215 ethanolamine kinase Proteins 0.000 description 1
- 150000000174 eudesmanes Chemical class 0.000 description 1
- 239000013613 expression plasmid Substances 0.000 description 1
- 239000013604 expression vector Substances 0.000 description 1
- 150000004563 farnesanes Chemical class 0.000 description 1
- 125000002895 fenchane group Chemical group 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 229930003935 flavonoid Natural products 0.000 description 1
- 150000002215 flavonoids Chemical class 0.000 description 1
- 235000017173 flavonoids Nutrition 0.000 description 1
- 239000000796 flavoring agent Substances 0.000 description 1
- 235000019634 flavors Nutrition 0.000 description 1
- 230000004907 flux Effects 0.000 description 1
- 238000009472 formulation Methods 0.000 description 1
- 238000005194 fractionation Methods 0.000 description 1
- 239000003205 fragrance Substances 0.000 description 1
- OFMXGFHWLZPCFL-SVRPQWSVSA-N friedelin Chemical compound C([C@H]1[C@]2(C)CC[C@@]34C)C(C)(C)CC[C@]1(C)CC[C@]2(C)[C@H]4CC[C@@]1(C)[C@H]3CCC(=O)[C@@H]1C OFMXGFHWLZPCFL-SVRPQWSVSA-N 0.000 description 1
- MFVJCHSUSSRHRH-UHFFFAOYSA-N friedeline Natural products CC1(C)CCC2(C)CCC3C4(C)CCC5C(C)(C)C(=O)CCC5(C)C4CCC3(C)C2C1 MFVJCHSUSSRHRH-UHFFFAOYSA-N 0.000 description 1
- 239000000446 fuel Substances 0.000 description 1
- 235000000633 gamma-carotene Nutrition 0.000 description 1
- 239000011663 gamma-carotene Substances 0.000 description 1
- HRQKOYFGHJYEFS-RZWPOVEWSA-N gamma-carotene Natural products C(=C\C=C\C(=C/C=C/C=C(\C=C\C=C(/C=C/C=1C(C)(C)CCCC=1C)\C)/C)\C)(\C=C\C=C(/CC/C=C(\C)/C)\C)/C HRQKOYFGHJYEFS-RZWPOVEWSA-N 0.000 description 1
- BXWQUXUDAGDUOS-UHFFFAOYSA-N gamma-humulene Natural products CC1=CCCC(C)(C)C=CC(=C)CCC1 BXWQUXUDAGDUOS-UHFFFAOYSA-N 0.000 description 1
- 150000002269 gammaceranes Chemical class 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 230000030279 gene silencing Effects 0.000 description 1
- 238000012637 gene transfection Methods 0.000 description 1
- 230000004077 genetic alteration Effects 0.000 description 1
- 231100000118 genetic alteration Toxicity 0.000 description 1
- HIGQPQRQIQDZMP-UHFFFAOYSA-N geranil acetate Natural products CC(C)=CCCC(C)=CCOC(C)=O HIGQPQRQIQDZMP-UHFFFAOYSA-N 0.000 description 1
- HIGQPQRQIQDZMP-DHZHZOJOSA-N geranyl acetate Chemical compound CC(C)=CCC\C(C)=C\COC(C)=O HIGQPQRQIQDZMP-DHZHZOJOSA-N 0.000 description 1
- 125000002350 geranyl group Chemical group [H]C([*])([H])/C([H])=C(C([H])([H])[H])/C([H])([H])C([H])([H])C([H])=C(C([H])([H])[H])C([H])([H])[H] 0.000 description 1
- 229930001431 germacrane Natural products 0.000 description 1
- IBMAYSYTZAVZPY-QDMKHBRRSA-N germacrane group Chemical group CC(C)[C@H]1CC[C@@H](C)CCC[C@@H](C)CC1 IBMAYSYTZAVZPY-QDMKHBRRSA-N 0.000 description 1
- 229930001612 germacrene Natural products 0.000 description 1
- YDLBHMSVYMFOMI-SDFJSLCBSA-N germacrene Chemical compound CC(C)[C@H]1CC\C(C)=C\CC\C(C)=C\C1 YDLBHMSVYMFOMI-SDFJSLCBSA-N 0.000 description 1
- 125000000668 gibberellane group Chemical group 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- UACIBCPNAKBWHX-CTBOZYAPSA-N gonane Chemical compound C1CCC[C@@H]2[C@H]3CC[C@@H]4CCC[C@H]4[C@@H]3CCC21 UACIBCPNAKBWHX-CTBOZYAPSA-N 0.000 description 1
- 125000002749 guaiane group Chemical group 0.000 description 1
- 235000019382 gum benzoic Nutrition 0.000 description 1
- 229920000588 gutta-percha Polymers 0.000 description 1
- 238000010438 heat treatment Methods 0.000 description 1
- 229910001385 heavy metal Inorganic materials 0.000 description 1
- DMEGYFMYUHOHGS-UHFFFAOYSA-N heptamethylene Natural products C1CCCCCC1 DMEGYFMYUHOHGS-UHFFFAOYSA-N 0.000 description 1
- 238000004128 high performance liquid chromatography Methods 0.000 description 1
- 125000002857 himachalane group Chemical group 0.000 description 1
- 229930004610 hirsutane Natural products 0.000 description 1
- 125000000869 hirsutane group Chemical group 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 150000002422 hopanes Chemical class 0.000 description 1
- 125000001050 humulane group Chemical group 0.000 description 1
- QBNFBHXQESNSNP-UHFFFAOYSA-N humulene Natural products CC1=CC=CC(C)(C)CC=C(/C)CCC1 QBNFBHXQESNSNP-UHFFFAOYSA-N 0.000 description 1
- 125000001229 illudane group Chemical group 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 230000006698 induction Effects 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 238000007689 inspection Methods 0.000 description 1
- 229930000618 isocedrane Natural products 0.000 description 1
- 238000006317 isomerization reaction Methods 0.000 description 1
- JEKMKNDURXDJAD-HWUKTEKMSA-N kahweol Chemical compound C([C@@H]1C[C@]2(C[C@@]1(CO)O)CC1)C[C@H]2[C@@]2(C)[C@H]1C(C=CO1)=C1C=C2 JEKMKNDURXDJAD-HWUKTEKMSA-N 0.000 description 1
- 125000003618 kaurane group Chemical group 0.000 description 1
- CAHGCLMLTWQZNJ-RGEKOYMOSA-N lanosterol Chemical compound C([C@]12C)C[C@@H](O)C(C)(C)[C@H]1CCC1=C2CC[C@]2(C)[C@H]([C@H](CCC=C(C)C)C)CC[C@@]21C CAHGCLMLTWQZNJ-RGEKOYMOSA-N 0.000 description 1
- 229940058690 lanosterol Drugs 0.000 description 1
- LHLLBECTIHFNGQ-UHFFFAOYSA-N lavandulyl diphosphate Chemical compound CC(C)=CCC(C(C)=C)COP(O)(=O)OP(O)(O)=O LHLLBECTIHFNGQ-UHFFFAOYSA-N 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 238000001972 liquid chromatography-electrospray ionisation mass spectrometry Methods 0.000 description 1
- XIXADJRWDQXREU-UHFFFAOYSA-M lithium acetate Chemical compound [Li+].CC([O-])=O XIXADJRWDQXREU-UHFFFAOYSA-M 0.000 description 1
- 229930015765 longipinane Natural products 0.000 description 1
- OVEGUURYEATKCQ-UHFFFAOYSA-N longipinane group Chemical group CC1CCC2C3C1C2(C)CCCC3(C)C OVEGUURYEATKCQ-UHFFFAOYSA-N 0.000 description 1
- 150000002654 lupanes Chemical class 0.000 description 1
- MQYXUWHLBZFQQO-QGTGJCAVSA-N lupeol Chemical compound C1C[C@H](O)C(C)(C)[C@@H]2CC[C@@]3(C)[C@]4(C)CC[C@@]5(C)CC[C@@H](C(=C)C)[C@@H]5[C@H]4CC[C@@H]3[C@]21C MQYXUWHLBZFQQO-QGTGJCAVSA-N 0.000 description 1
- PKGKOZOYXQMJNG-UHFFFAOYSA-N lupeol Natural products CC(=C)C1CC2C(C)(CCC3C4(C)CCC5C(C)(C)C(O)CCC5(C)C4CCC23C)C1 PKGKOZOYXQMJNG-UHFFFAOYSA-N 0.000 description 1
- 210000004962 mammalian cell Anatomy 0.000 description 1
- 229930000005 marasmane Natural products 0.000 description 1
- 125000001837 marasmane group Chemical group 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000037353 metabolic pathway Effects 0.000 description 1
- 230000004060 metabolic process Effects 0.000 description 1
- 239000002207 metabolite Substances 0.000 description 1
- 229960000485 methotrexate Drugs 0.000 description 1
- 230000011987 methylation Effects 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 125000002950 monocyclic group Chemical group 0.000 description 1
- 229930003647 monocyclic monoterpene Natural products 0.000 description 1
- 150000002767 monocyclic monoterpene derivatives Chemical class 0.000 description 1
- 229930014570 monocyclic sesterterpene Natural products 0.000 description 1
- 150000004494 monocyclic sesterterpene derivatives Chemical class 0.000 description 1
- 239000000178 monomer Substances 0.000 description 1
- 229920003052 natural elastomer Polymers 0.000 description 1
- 229930014626 natural product Natural products 0.000 description 1
- 229920001194 natural rubber Polymers 0.000 description 1
- 239000002417 nutraceutical Substances 0.000 description 1
- 235000021436 nutraceutical agent Nutrition 0.000 description 1
- 108090000021 oryzin Proteins 0.000 description 1
- 238000007248 oxidative elimination reaction Methods 0.000 description 1
- 125000004430 oxygen atom Chemical group O* 0.000 description 1
- 150000002932 p-cymene derivatives Chemical class 0.000 description 1
- 229960001592 paclitaxel Drugs 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 244000052769 pathogen Species 0.000 description 1
- 239000008188 pellet Substances 0.000 description 1
- RGSFGYAAUTVSQA-UHFFFAOYSA-N pentamethylene Natural products C1CCCC1 RGSFGYAAUTVSQA-UHFFFAOYSA-N 0.000 description 1
- OWFXHGABDKORFT-UHFFFAOYSA-N perrottetinenic acid Natural products C1CC(C)=CC(C2=C(O)C=3C(O)=O)C1C(C)(C)OC2=CC=3CCC1=CC=CC=C1 OWFXHGABDKORFT-UHFFFAOYSA-N 0.000 description 1
- 239000000575 pesticide Substances 0.000 description 1
- JTJMJGYZQZDUJJ-UHFFFAOYSA-N phencyclidine Chemical compound C1CCCCN1C1(C=2C=CC=CC=2)CCCCC1 JTJMJGYZQZDUJJ-UHFFFAOYSA-N 0.000 description 1
- 102000029799 phosphatidate cytidylyltransferase Human genes 0.000 description 1
- 108091022886 phosphatidate cytidylyltransferase Proteins 0.000 description 1
- 230000000243 photosynthetic effect Effects 0.000 description 1
- 125000000442 phytane group Chemical group 0.000 description 1
- 235000011765 phytoene Nutrition 0.000 description 1
- BOTWFXYSPFMFNR-PYDDKJGSSA-N phytol Chemical compound CC(C)CCC[C@@H](C)CCC[C@@H](C)CCC\C(C)=C\CO BOTWFXYSPFMFNR-PYDDKJGSSA-N 0.000 description 1
- 150000003137 picrotoxane derivatives Chemical class 0.000 description 1
- 125000003400 pimarane group Chemical group 0.000 description 1
- XOKSLPVRUOBDEW-UHFFFAOYSA-N pinane group Chemical group C12C(CCC(C1(C)C)C2)C XOKSLPVRUOBDEW-UHFFFAOYSA-N 0.000 description 1
- 239000001739 pinus spp. Substances 0.000 description 1
- 210000002706 plastid Anatomy 0.000 description 1
- 229930009338 polycyclic sesterterpene Natural products 0.000 description 1
- 150000004277 polycyclic sesterterpene derivatives Chemical class 0.000 description 1
- 239000001205 polyphosphate Substances 0.000 description 1
- 235000011176 polyphosphates Nutrition 0.000 description 1
- 239000011148 porous material Substances 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- WYQAYJYHTGZNNK-UHFFFAOYSA-N prezizaane group Chemical group CC1CCC2C13CCC(C3)C(C)C2(C)C WYQAYJYHTGZNNK-UHFFFAOYSA-N 0.000 description 1
- 210000001236 prokaryotic cell Anatomy 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 230000004844 protein turnover Effects 0.000 description 1
- 150000003187 protostanes Chemical class 0.000 description 1
- WHWHDGKOSUKYOV-UHFFFAOYSA-N pulpononic acid Natural products CC12CCC3(C)C4CC(C)(C(O)=O)CCC4(C)CCC3(C)C2CCC2(C)C1CCC(=O)C2C WHWHDGKOSUKYOV-UHFFFAOYSA-N 0.000 description 1
- KKOXKGNSUHTUBV-UHFFFAOYSA-N racemic zingiberene Natural products CC(C)=CCCC(C)C1CC=C(C)C=C1 KKOXKGNSUHTUBV-UHFFFAOYSA-N 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 238000011084 recovery Methods 0.000 description 1
- 238000006578 reductive coupling reaction Methods 0.000 description 1
- 230000037425 regulation of transcription Effects 0.000 description 1
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000029058 respiratory gaseous exchange Effects 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 239000011604 retinal Substances 0.000 description 1
- 230000002207 retinal effect Effects 0.000 description 1
- NCYCYZXNIZJOKI-OVSJKPMPSA-N retinal group Chemical group C\C(=C/C=O)\C=C\C=C(\C=C\C1=C(CCCC1(C)C)C)/C NCYCYZXNIZJOKI-OVSJKPMPSA-N 0.000 description 1
- 235000020944 retinol Nutrition 0.000 description 1
- 239000011607 retinol Substances 0.000 description 1
- 229960003471 retinol Drugs 0.000 description 1
- 238000007363 ring formation reaction Methods 0.000 description 1
- 150000003839 salts Chemical class 0.000 description 1
- 239000000523 sample Substances 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 239000010686 shark liver oil Substances 0.000 description 1
- 229940069764 shark liver oil Drugs 0.000 description 1
- 238000007086 side reaction Methods 0.000 description 1
- 239000003381 stabilizer Substances 0.000 description 1
- 230000010473 stable expression Effects 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 150000003431 steroids Chemical class 0.000 description 1
- 238000003860 storage Methods 0.000 description 1
- 210000004895 subcellular structure Anatomy 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 108010014539 taxa-4(5),11(12)-diene synthase Proteins 0.000 description 1
- RCINICONZNJXQF-MZXODVADSA-N taxol Chemical compound O([C@@H]1[C@@]2(C[C@@H](C(C)=C(C2(C)C)[C@H](C([C@]2(C)[C@@H](O)C[C@H]3OC[C@]3([C@H]21)OC(C)=O)=O)OC(=O)C)OC(=O)[C@H](O)[C@@H](NC(=O)C=1C=CC=CC=1)C=1C=CC=CC=1)O)C(=O)C1=CC=CC=C1 RCINICONZNJXQF-MZXODVADSA-N 0.000 description 1
- 125000000101 thioether group Chemical group 0.000 description 1
- OECAXGZVYRHQKP-UHFFFAOYSA-N thujopsane group Chemical group CC1CCC2(C)CCCC(C)(C)C23CC13 OECAXGZVYRHQKP-UHFFFAOYSA-N 0.000 description 1
- 238000005809 transesterification reaction Methods 0.000 description 1
- 230000010474 transient expression Effects 0.000 description 1
- 230000014621 translational initiation Effects 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- 239000001226 triphosphate Substances 0.000 description 1
- 235000011178 triphosphate Nutrition 0.000 description 1
- UNXRWKVEANCORM-UHFFFAOYSA-N triphosphoric acid Chemical compound OP(O)(=O)OP(O)(=O)OP(O)(O)=O UNXRWKVEANCORM-UHFFFAOYSA-N 0.000 description 1
- 230000007306 turnover Effects 0.000 description 1
- 229940036248 turpentine Drugs 0.000 description 1
- 238000001195 ultra high performance liquid chromatography Methods 0.000 description 1
- 238000002604 ultrasonography Methods 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 210000003934 vacuole Anatomy 0.000 description 1
- 150000004235 valerane derivatives Chemical class 0.000 description 1
- 235000015112 vegetable and seed oil Nutrition 0.000 description 1
- 239000008158 vegetable oil Substances 0.000 description 1
- 239000003981 vehicle Substances 0.000 description 1
- 239000013603 viral vector Substances 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- NCYCYZXNIZJOKI-UHFFFAOYSA-N vitamin A aldehyde Natural products O=CC=C(C)C=CC=C(C)C=CC1=C(C)CCCC1(C)C NCYCYZXNIZJOKI-UHFFFAOYSA-N 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- 239000007222 ypd medium Substances 0.000 description 1
- 239000007221 ypg medium Substances 0.000 description 1
- 239000011701 zinc Substances 0.000 description 1
- 229910052725 zinc Inorganic materials 0.000 description 1
- KKOXKGNSUHTUBV-LSDHHAIUSA-N zingiberene Chemical compound CC(C)=CCC[C@H](C)[C@H]1CC=C(C)C=C1 KKOXKGNSUHTUBV-LSDHHAIUSA-N 0.000 description 1
- 229930001895 zingiberene Natural products 0.000 description 1
- KVQOADNSNSUAJT-UHFFFAOYSA-N α-patchoulene Chemical compound CC1=CCC(C2(C)C)CC3C12CCC3C KVQOADNSNSUAJT-UHFFFAOYSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N1/00—Microorganisms, e.g. protozoa; Compositions thereof; Processes of propagating, maintaining or preserving microorganisms or compositions thereof; Processes of preparing or isolating a composition containing a microorganism; Culture media therefor
- C12N1/14—Fungi; Culture media therefor
- C12N1/16—Yeasts; Culture media therefor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/80—Vectors or expression systems specially adapted for eukaryotic hosts for fungi
- C12N15/81—Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/1085—Transferases (2.) transferring alkyl or aryl groups other than methyl groups (2.5)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1205—Phosphotransferases with an alcohol group as acceptor (2.7.1), e.g. protein kinases
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1229—Phosphotransferases with a phosphate group as acceptor (2.7.4)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P5/00—Preparation of hydrocarbons or halogenated hydrocarbons
- C12P5/007—Preparation of hydrocarbons or halogenated hydrocarbons containing one or more isoprene units, i.e. terpenes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y205/00—Transferases transferring alkyl or aryl groups, other than methyl groups (2.5)
- C12Y205/01—Transferases transferring alkyl or aryl groups, other than methyl groups (2.5) transferring alkyl or aryl groups, other than methyl groups (2.5.1)
- C12Y205/0101—(2E,6E)-Farnesyl diphosphate synthase (2.5.1.10), i.e. geranyltranstransferase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y207/00—Transferases transferring phosphorus-containing groups (2.7)
- C12Y207/04—Phosphotransferases with a phosphate group as acceptor (2.7.4)
- C12Y207/04026—Isopentenyl phosphate kinase (2.7.4.26)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
Definitions
- the present invention relates to the production of isoprenoids in eukaryotic cells.
- Isoprenoids are widely used as pharmaceuticals, cosmetics, nutraceuticals, flavours, fragrances, and pesticides, and have recently also found applications as drop-in jet fuels and biopolymers.
- Eukaryotic microorganisms such as yeasts are considered good hosts for high-value compound production because of their capacity to synthesize complex chemical structures.
- production of isoprenoids in microorganisms currently depends on the feeding of sugar. This can frequently be a challenging and inefficient process, as microorganisms prefer to use the sugar for growth and other metabolic processes instead of producing the desired product.
- Isoprenoids are synthesized by the successive addition of 5-carbon containing building blocks and, as a result, their structures mostly contain multiples of five carbon atoms.
- terpene synthases (TS)
- prenyltransferases, as in the case of cannabinoids or prenylated flavonoids
- the isopentenol utilization pathway can produce isopentenyl diphosphate or dimethylallyl diphosphate, the main precursors for isoprenoid synthesis, through sequential phosphorylation of the isopentenol isomers isoprenol or prenol (Chatzivasileiou et al., PNAS, 2019).
- This alternative pathway converts primary alcohols into the corresponding pyrophosphates. Conversion systems for the production of DMAPP- and IPP-derived compounds from prenol and isoprenol in bacteria have, however, only been demonstrated for obtaining canonical isoprenoids. Moreover, the enzymes used in these bacterial systems have not been demonstrated to be functional, or found to be efficient, in yeast.
- Another bioconversion system utilizes the enzymes SfPhoN and AtIPK. This system has also been shown to function in E. coli. However, the activity of this system is non-existent or very low when it is expressed in yeast (https://pubs.acs.org/doi/10.1021/acssynbio.8b00383).
- a further object of the invention is to present a new system for the two-step conversion of isoprenol, prenol and prenol-like alcohols to terpene precursor compounds in eukaryotic cells such as yeast.
- alcohol kinase enzymes such as Arabidopsis thaliana farnesol kinase (AtFKI) and optionally an isopentenyl phosphate kinase (AtIPK) or another prenyl phosphate kinase, its yield being considerably higher compared to the other systems described in the background section herein above.
- the invention relates to cells comprising nucleic acid sequences encoding kinases or parts thereof, e.g., polypeptides or polypeptide analogues with kinase activity as described herein.
- the invention relates to a genetically engineered eukaryotic cell for production of an isoprenoid (or terpene or terpenoid) comprising a first nucleic acid sequence encoding a first kinase that converts a primary alcohol to a mono- or pyrophosphate isoprenoid precursor; and optionally a second nucleic acid sequence encoding a phosphokinase that converts a monophosphate precursor to an isoprenoid pyrophosphate precursor; wherein the first kinase comprises SEQ ID NO: 2 or a homolog or variant thereof having at least 75% identity thereto while exhibiting kinase activity.
- the present invention provides a genetically engineered eukaryotic cell for the production of a terpene or terpenoid comprising a first nucleic acid sequence encoding a first kinase that phosphorylates a primary alcohol to a mono- or pyrophosphate terpenoid precursor; and optionally a second nucleic acid sequence encoding a phosphokinase that phosphorylates a monophosphate precursor to a terpenoid pyrophosphate precursor wherein the first kinase comprises SEQ ID NO: 2 or a homolog or variant thereof having at least 75% identity thereto while exhibiting kinase activity.
- vectors comprising the above nucleic acids, as well as host cells comprising said vectors and/or said nucleic acids or polypeptides.
- the nucleic acids may be comprised in a vector, e.g., a plasmid, cosmid, virus, or another vector conventionally used in genetic engineering.
- the vector may comprise further sequences such as marker sequences, which allow for the selection of the vector in a suitable host cell and under suitable conditions.
- the vector may comprise expression control elements allowing proper expression of the coding regions in suitable hosts. Such control elements are known to the person skilled in the art and may include a promoter, a splice cassette, and a translation initiation codon, amongst others.
- the nucleic acids may be integrated in a certain chromosomal locus in the employed cell, in combination with the expression control elements described above.
- the nucleic acid of the invention is operatively attached to expression control elements allowing expression in eukaryotic cells.
- control elements ensuring expression in eukaryotic cells are well known to those skilled in the art and are further explained herein below.
- nucleic acid molecules for construction of vectors comprising nucleic acid molecules, for introduction of vectors into appropriately chosen host cells, for insertion of DNA fragments into genomic loci of said cells, or for causing or achieving expression of nucleic acid molecules are well-known in the art. Further detail and exemplary methods are detailed herein below.
- the system advantageously enables the biocatalytic synthesis of isoprenoids in eukaryotic microorganisms by a method that by-passes the MEP pathway and/or the MVA pathway for the production of DMAPP and IPP.
- said first nucleic acid sequence encodes a kinase that is capable of both alcohol phosphorylation and phosphate phosphorylation.
- said nucleic acid sequence encodes a single kinase enzyme with bi-catalytic activity and capable of sequential phosphorylation of alcohol and monophosphate substrates.
- the first nucleic acid sequence encodes a kinase that is capable of phosphorylating a primary alcohol to a monophosphate terpenoid or isoprenoid precursor.
- Said kinase may comprise SEQ ID NO: 2 or a homolog or variant thereof having at least 75% identity thereto, such as at least 80% identity thereto, such as at least 85% identity thereto, such as at least 90% identity thereto, such as at least 95% identity thereto, such as at least 96% identity thereto, such as at least 97% identity thereto, such as at least 98% identity thereto, such as at least 99% identity thereto, while exhibiting kinase activity.
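The identity thresholds recited above (at least 75%, 80%, and so on up to 99%) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length by some external tool, and that identity is counted over all aligned columns; the text does not specify the alignment method or the denominator, so both are assumptions. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where both sequences carry a gap ('-') are ignored,
    and the denominator is the number of remaining aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A gap opposite a residue never counts as a match.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 75.0) -> bool:
    """True if the pair reaches the given identity threshold (default 75%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy peptides, 6 of 8 positions identical.
    print(percent_identity("MKTAYIAK", "MKTAYLAR"))
    print(meets_identity_threshold("MKTAYIAK", "MKTAYLAR"))
```

A sequence passing this check would still need to exhibit kinase activity to fall within the definition given above; identity alone is not sufficient.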
- said nucleic acid sequence encodes an alcohol kinase that is capable of phosphorylating a non-canonical, prenol-like primary alcohol to a non-canonical monophosphate isoprenoid precursor.
- the kinase may comprise SEQ ID NO: 1 or a homolog or variant thereof having at least 75% identity thereto, such as at least 80% identity thereto, such as at least 85% identity thereto, such as at least 90% identity thereto, such as at least 95% identity thereto, such as at least 96% identity thereto, such as at least 97% identity thereto, such as at least 98% identity thereto, such as at least 99% identity thereto, while exhibiting kinase activity.
- the engineered cells can further include an exogenous nucleic acid sequence encoding a phosphate kinase, i.e., a phosphokinase, such as prenyl phosphate kinase (or isopentenyl phosphate kinase, IPK) that can phosphorylate a phosphate precursor, e.g., dimethylallyl phosphate (DMAP), to dimethylallyl pyrophosphate (DMAPP) or isopentenyl phosphate (IP) to isopentenyl pyrophosphate (IPP).
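The two sequential phosphorylation steps can be sketched as a pair of lookup tables: an alcohol kinase step followed by a phosphokinase step. The second table (DMAP to DMAPP, IP to IPP) is stated above; the first (prenol to DMAP, isoprenol to IP) is the corresponding alcohol phosphorylation and is written out here as an assumption of this sketch. The function and variable names are hypothetical.

```python
# Step 1 (alcohol kinase): primary alcohol -> monophosphate precursor.
ALCOHOL_KINASE_STEP = {"prenol": "DMAP", "isoprenol": "IP"}

# Step 2 (phosphokinase, e.g. an IPK): monophosphate -> pyrophosphate precursor.
PHOSPHOKINASE_STEP = {"DMAP": "DMAPP", "IP": "IPP"}


def phosphorylation_route(alcohol: str) -> list:
    """Return [alcohol, monophosphate, pyrophosphate] for a canonical alcohol."""
    if alcohol not in ALCOHOL_KINASE_STEP:
        raise KeyError(f"no route recorded for {alcohol!r}")
    monophosphate = ALCOHOL_KINASE_STEP[alcohol]
    pyrophosphate = PHOSPHOKINASE_STEP[monophosphate]
    return [alcohol, monophosphate, pyrophosphate]


if __name__ == "__main__":
    for name in ("prenol", "isoprenol"):
        print(" -> ".join(phosphorylation_route(name)))
```

A non-canonical alcohol would follow the same two-step shape, with its own monophosphate and diphosphate entries added to the tables.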
- said phosphokinase comprises Arabidopsis thaliana IPK (AtIPK) SEQ ID NO: 3; or a homolog or variant thereof having at least 75% identity thereto, such as at least 80% identity thereto, such as at least 85% identity thereto, such as at least 90% identity thereto, such as at least 95% identity thereto, such as at least 96% identity thereto, such as at least 97% identity thereto, such as at least 98% identity thereto, such as at least 99% identity thereto, while exhibiting kinase activity.
- said phosphokinase comprises MtIPK, SEQ ID NO: 4; or a homolog or variant thereof having at least 75% identity thereto, such as at least 80% identity thereto, such as at least 85% identity thereto, such as at least 90% identity thereto, such as at least 95% identity thereto, such as at least 96% identity thereto, such as at least 97% identity thereto, such as at least 98% identity thereto, such as at least 99% identity thereto, while exhibiting kinase activity.
- said phosphokinase comprises TaIPK, SEQ ID NO: 5; or a homolog or variant thereof having at least 75% identity thereto, such as at least 80% identity thereto, such as at least 85% identity thereto, such as at least 90% identity thereto, such as at least 95% identity thereto, such as at least 96% identity thereto, such as at least 97% identity thereto, such as at least 98% identity thereto, such as at least 99% identity thereto, while exhibiting kinase activity.
- said phosphokinase comprises TaIPK(204G), SEQ ID NO: 6.
- the phosphokinase comprises ScCK (ScCKI, Cki1p), ScMK, EcGK, EcHK, HvIPK, MjIPK, or TaIPK-3m.
- said phosphokinase is capable of phosphorylating a non-canonical monophosphate isoprenoid precursor, resulting in a non-canonical pyrophosphate isoprenoid precursor, such as 3-ethylpent-2-en-1-yl-diphosphate, 4-fluoro-3-methylbut-2-en-1-yl-diphosphate, 3-methyl-4-(methylthio)but-2-en-1-yl-diphosphate, 3-methylpent-2-en-1-yl-diphosphate, 3-methylhex-2-en-1-yl-diphosphate, or 3,4-dimethylpent-2-en-1-yl-diphosphate.
- Contacting canonical or non-canonical monophosphate precursors with phosphokinases advantageously enables an alternative pathway for the production of canonical or non-canonical pyrophosphate isoprenoid precursors, which can be further converted to canonical and non-canonical isoprenoids by the action of, e.g., terpene synthases and/or prenyltransferases.
- the combined action of one or more of these enzymes provides an isoprenoid biosynthetic pathway that allows de-coupling of isoprenoid biosynthesis from biomass production and enables channelling more substrate into product, thus providing a non-competitive system.
- a 10- to 100-fold increase in production titers of canonical isoprenoid compounds can be achieved.
- it is a highly efficient method to avoid common bottlenecks in currently used methods for the production of isoprenoids in yeast and other eukaryotic cells.
- the kinase or kinases according to the invention are fused to one or more peptides or peptide analogues resulting in fusion proteins.
- Said fusion proteins may, depending on the functional characteristics of the said peptide or peptide analogue, advantageously confer additional functionality to the kinase or kinases of the invention.
- a peptide or peptide analogue may allow improved enzyme kinetics of the kinase domain or domains; an intracellular localisation peptide may increase the rate of localisation of the kinase or kinases according to the invention to sub-cellular organelles, such as chloroplasts, mitochondria or peroxisomes, via a chloroplastic, mitochondrial or peroxisomal targeting signal, respectively, thereby increasing the enzyme kinetics of the enzymes according to the invention.
- stability-increasing, and half-life-increasing peptides may contribute to a longer activity of the enzymes by reducing protein turnover, thus increasing the concentration of active enzymes, and total catalytic activity.
- Enzymatic promiscuity may be increased by the fusion of a kinase according to the invention to a peptide or peptide analogue comprising an additional domain, such as a kinase domain, such as a phosphokinase domain, and/or one or more peptide sequences improving enzyme kinetics.
- fusion to specific domains or peptides can facilitate the correct folding of the kinase, or can improve the solubility of the kinase-containing polypeptide, resulting in higher overall intracellular activity.
- the peptide or peptide analogue is maltose-binding protein, green fluorescent protein, thioredoxin, glutathione S-transferase, yeast farnesyl diphosphate synthase (Erg20p), ATP-synthase, CTP synthase, GTP synthase, UTP synthase, NusA, or small ubiquitin related modifier Smt3, or a fragment thereof.
- the peptide consists of naturally encoded amino acid residues, i.e., amino acids found in the genetic code.
- the primary alcohol is prenol, isoprenol or a prenol-like primary alcohol.
- the primary alcohol is an alcohol with the structure of formula 1, wherein R1 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, a group containing a reactive nonmetal; a group containing a metalloid; a group containing a halogen, a group containing oxygen, a group containing nitrogen, a group containing sulphur, a group containing phosphorus and/or a group containing boron;
- R2 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, a group containing a reactive nonmetal; a group containing a metalloid; a group containing a halogen, a group containing oxygen, a group containing nitrogen, a group containing sulphur, a group containing phosphorus and/or a group containing boron; and R3 is hydrogen, methyl, fluorine, chlorine, bromine, iodine, sulphydryl, or hydroxyl.
- the primary alcohol is 3-methylbut-2-en-1-ol, 4-fluoro-3-methylbut-2-en-1-ol, 3-methylpent-2-en-1-ol, 3,4-dimethylpent-2-en-1-ol, 3-ethylpent-2-en-1-ol, 3-methylhex-2-en-1-ol, 3-methylhexa-2,5-dien-1-ol, 3-methylbut-3-en-1-ol, 3-methylenepentan-1-ol, 2-methylprop-2-en-1-ol, 3-methyl-4-(methylthio)but-2-en-1-ol, or 5-chloro-3-methylpent-2-en-1-ol.
- the primary alcohol is a prenol-like primary alcohol that is not geraniol.
- the primary alcohol is a prenol-like primary alcohol that is not farnesol.
- the primary alcohol is a prenol-like primary alcohol that is not geranylgeraniol.
- non-canonical, prenol-like alcohols which comprise an incremental number of carbons (other than multiple of 5 carbons) or different heteroatoms, referred to herein as non-canonical (or prenyl-like) alcohols
- the production of novel isoprenoid building blocks with carbon size different than five or alternative structures, i.e. non-canonical building blocks is advantageously enabled.
- the number of potential isoprenoids obtained is expanded, many of which could have improved properties over canonical isoprenoids or could provide new functions.
- a 10- to 100-fold increase in production titers of canonical isoprenoid compounds can be achieved when compared to previously disclosed systems in yeast.
- the cell according to the invention comprises an exogenous nucleic acid sequence enabling expression or increased expression of an enzyme capable of catalysing the production of canonical and / or non-canonical terpenes, terpenoids, isoprenoids or structures containing isoprenoid groups.
- said nucleic acid sequence comprises an expression control sequence as described herein.
- said nucleic acid comprises a sequence coding for an enzyme.
- an enzyme is a terpene or terpenoid synthase (TS), such as a monoterpene synthase, a sesquiterpene synthase, a diterpene synthase, a sesterterpene synthase, or a triterpene synthase or a fragment thereof, which may convert the canonical or non-canonical pyrophosphate terpene precursors into any of a wide array of terpene and / or terpenoid compounds.
- the exogenous nucleic acid sequence enabling increased expression of an enzyme capable of catalysing the production of canonical and / or non-canonical terpenes, terpenoids, isoprenoids or structures containing isoprenoid groups is a prenyl transferase, whereby the terpene compounds are produced by appending precursors of the invention to other non-terpenoid skeletons (as in the case of cannabinoids and prenylated flavonoids).
- terpene synthases and prenyl transferases include but are not limited to limonene and limonene synthase, myrcene and myrcene synthase, geraniol and geraniol synthase, linalool and linalool synthase, taxadiene and taxadiene synthase, amorphadiene and amorphadiene synthase, valencene and valencene synthase, santalol and santalol synthase.
- the terpene synthase or the prenyl transferase is capable of using non-canonical isoprenoid building blocks as substrate.
- by using novel building blocks with carbon size different than five (or a multiple of five), or building blocks containing heteroatoms, as substrate, the number of potential isoprenoids produced by the system according to the invention is advantageously greatly expanded.
- the terpene synthase enzyme or isoprenoid synthase enzyme or other enzyme or a fragment thereof capable of catalysing the production of canonical isoprenoids (or terpenes, or terpenoids) or structures containing isoprenoid groups comprises a change in the amino acid sequence that enables improved enzyme kinetics for utilisation of non-canonical isoprenoid building blocks.
- Such a change in the peptide sequence can, e.g., enhance the affinity of the enzyme for non-canonical substrates, or reduce the activation energy, which results in a greater reaction efficiency.
- the terpene and / or terpenoid yield is improved.
- the host cell is a yeast cell. Any yeast species may be appropriate.
- the genus of said yeast is selected from Saccharomyces, Pichia, Ogataea, Yarrowia, Kluyveromyces, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Schizosaccharomyces, Trichosporon and Lipomyces.
- the genus of said yeast is Saccharomyces, Pichia, Ogataea, Kluyveromyces or Yarrowia.
- Preferred species include Saccharomyces spp., Saccharomyces cerevisiae, Schizosaccharomyces pombe, Scheffersomyces stipitis, Pichia pastoris, Ogataea polymorpha, Kluyveromyces marxianus, Kluyveromyces lactis, Yarrowia lipolytica, or Dekkera bruxellensis.
- the host cell of the invention is a filamentous fungus cell, such as a cell derived from the mycelia or germ bodies of a filamentous fungus.
- Preferred species include but are not limited to Aspergillus niger, Aspergillus oryzae, Aspergillus terreus, Neurospora crassa, or Trichoderma reesei.
- the host cell of the invention is an algal cell, such as a cell derived from a multicellular alga.
- the algal cell is a microalga, such as a cell of the species Nannochloropsis gaditana, Nannochloropsis oceanica, Nannochloropsis salina, Chlamydomonas reinhardtii, Arthrospira, Chlorella vulgaris, Dunaliella salina, Haematococcus pluvialis, Phaeodactylum tricornutum, or Isochrysis galbana.
- eukaryotic systems comprising yeast incubators or microalgae photosynthetic production systems are adjusted for optimal industry-oriented production methods and are, in addition, readily up-scalable depending on demand.
- eukaryotic cells contain organelles and subcellular structures that may be beneficial for bioproduction, by facilitating metabolic channelling, avoiding metabolite crosstalk or inadvertent inhibition, and potentially improving the function and stability of enzymes by enabling membrane association or confinement in a cellular compartment/substructure.
- a further advantage of the host cell of the invention being a eukaryotic cell is the ability to perform CYP-driven oxidations on the new structures in eukaryotic cells that have not been possible in bacterial cells.
- eukaryotic cells provide, in general, a better environment to functionally express enzymes that belong to the group of cytochrome P450s (CYPs), particularly if these are originally found in another eukaryotic cell. This is due to the fact that said CYPs are membrane-bound enzymes and correct association with the membrane is essential for optimal activity.
- Eukaryotic cells, such as yeast, contain appropriate membrane structures (e.g. ER) for exogenous CYPs to function.
- Such membranes are lacking from bacteria and, therefore, functional expression of said CYPs in prokaryotic cells is far less optimal.
- although kinases have been reported to phosphorylate prenol and isoprenol in bacterial cells, establishing a bacterial system that converts prenol and isoprenol into CYP-decorated isoprenoids would be challenging due to the limitation described above.
- the invention relates to a method for the production of a terpene or an isoprenoid.
- said method comprises the steps of providing an engineered eukaryotic cell comprising a DNA sequence coding for a primary alcohol kinase, and culturing said engineered cell in a medium containing a primary alcohol.
- the cell provided further comprises an exogenous nucleic acid sequence coding for a phosphokinase.
- the primary alcohol is at an initial concentration within a range of 0.01% to 1% v/v, such as within a range of 0.05% to 0.6% v/v, such as within a range of 0.1% to 0.3% v/v.
- the primary alcohol is at an initial concentration of 0.1% v/v.
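As a worked illustration of the concentrations above, the volume of primary alcohol corresponding to a given % v/v can be computed directly. The sketch below assumes the culture volume is taken as the final volume (a reasonable approximation at these low concentrations); the function names and the 50 mL example volume are hypothetical and are not taken from the text.

```python
def alcohol_volume_ul(culture_volume_ml: float, percent_v_v: float) -> float:
    """Volume of alcohol (in microlitres) giving percent_v_v in the culture.

    Assumption: the culture volume is treated as the final volume.
    """
    if culture_volume_ml <= 0:
        raise ValueError("culture volume must be positive")
    if not 0 <= percent_v_v <= 100:
        raise ValueError("percent v/v must lie between 0 and 100")
    # mL -> uL, then take the requested percentage.
    return culture_volume_ml * 1000.0 * percent_v_v / 100.0


def within_broadest_range(percent_v_v: float) -> bool:
    """True if the value lies in the broadest recited range, 0.01% to 1% v/v."""
    return 0.01 <= percent_v_v <= 1.0


if __name__ == "__main__":
    # Hypothetical 50 mL culture at 0.1% v/v.
    print(alcohol_volume_ul(50.0, 0.1), "uL")
    print(within_broadest_range(0.1))
```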
- Fig. 1 & Fig. 2 Identification and characterization of efficient alcohol kinases. Ion count of the 93-ion from SPME-GC-MS of the headspace of yeast EGY48 co-expressing alcohol kinase candidates and AtIPK, to identify efficient conversion of prenol into isoprenoid building blocks.
- Fig. 3 SPME-GC-MS: Comparison of linalool production with either AtFKI or A65AtFKI in presence of prenol.
- Fig. 4 SPME-GC-MS: Comparison of AtFKI and A65AtFKI when co-expressed with the phosphokinase AtIPK in yeast in the presence of prenol.
- the production of linalool demonstrates that AtFKI performs well in combination with various phosphokinases.
- Fig. 5a Co-expression of the alcohol conversion pathway AtFKI-AtIPK and CIMyrS in yeast cells increases myrcene production (pink) compared with no alcohol feeding (black).
- Fig. 5b Co-expression of the alcohol conversion pathway AtFKI-AtIPK and CISabS in yeast cells increases sabinene production (pink) compared with no alcohol feeding (black).
- Fig. 5c Limonene titer when feeding different ratios of prenol:isoprenol in the strain containing the alcohol feeding pathway and limonene downstream building block.
- Fig. 5d Limonene yield when feeding different concentrations of alcohols in the strain containing the alcohol feeding pathway and limonene downstream building block.
- Fig. 6 Comparison of production of CBGA with and without conversion of prenol and isoprenol.
- Fig. 7 Alcohols found to be converted with AtFKI and AtIPK and utilized by CILimS.
- Fig. 8 SPME-GC-MS analysis of non-canonical terpenes produced when AtFKI, AtIPK and CILimS are co-expressed. Based on the alcohols that have been converted and the mass spectra of the novel compounds, the suggested structures are shown for the peaks.
- Fig. 9 Novel isoprenoid compounds.
- Fig. 10 Suggested core alcohol structure.
- Fig. 11 Comparison of total production of non-canonical limonene variants using AtFKI-AtIPK alcohol conversion combined with the limonene synthase CILimS, or the CILimS mutants L505G, F483A and I284G, and the alcohols 3,4-dimethylpent-2-en-1-ol (3,4-DMP) and 3-ethylpent-2-en-1-ol (3E2E).
- Fig. 12 Synthesis of unnatural cannabigerolic acid analogues by using non-canonical prenyl diphosphates precursors.
- Noncanonical CBGA no alcohols .
- Yeast cells expressing the Erg20p(127W)-CsPT4 alone, AtFKI-AtIPK alone a: 3-methylpent-2-en-1-ol.
- c 3-methylhex-2-en-1-ol.
- d 3-ethylpent-2-en-1-ol.
- Fig. 13 shows the production of C16 and C17 sesquiterpenes using the alcohol conversion pathway and 3M2E.
- Fig. 14 evaluates the efficiency of different prenyltransferases in supporting the production of C16 and C17 sesquiterpenes when feeding 3M2E.
- Fig. 15 evaluates the efficiency of different prenyltransferases in supporting the production of C16 and C17 sesquiterpenes when feeding 3,4-DMP.
- Fig. 16 shows the efficiency of different prenyltransferases in supporting the production of C16 and C17 sesquiterpenes when feeding 3-MP.
- Fig. 17 evaluates the efficiency of different prenyltransferases in supporting the production of C17 sesquiterpenes when feeding 3E2E.
- Fig. 18 shows the production of a C16 sesquiterpenoid by Salvia fruticosa caryophyllene synthase (Sf126) and 3-MP.
- Fig. 19 shows the production of non-canonical squalene using 3MP.
- Fig. 20 shows the production of a non-canonical triterpenoid by cucurbitadienol synthase and 3MP.
- Fig. 21 shows the production of a non-canonical triterpenoid by BmeTC(373C) and 3MP.
DETAILED DESCRIPTION
- IUP isopentenol utilization pathway
- the inventors have surprisingly found that the ectopic expression of Arabidopsis thaliana farnesol kinase (FKI) and isopentenyl phosphate kinase leads to a high yield of canonical and non-canonical pyrophosphorylated isoprenoid precursors in yeast.
- Isoprenoids are a naturally occurring group of chemical molecules displaying a wide structural diversity of carbon skeletons made up from basic isoprene units (typically C5) and including compounds otherwise named as terpenes, terpenoids, or isoprenoids.
- the terms "terpene", "terpenoid" and "isoprenoid" are used interchangeably.
- terpenes are classified as: hemiterpenes, C5; monoterpenes, C10; sesquiterpenes, C15; diterpenes, C20; sesterterpenes, C25; triterpenes, C30; and tetraterpenes, C40.
- Terpenes consist of compounds with the formula (C5H8)n. They are further classified by the number of carbons: hemiterpenes (C5), monoterpenes (C10), sesquiterpenes (C15), diterpenes (C20), etc.
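The size classes and the general formula given above can be written as a small lookup. This is only an illustrative sketch: the table holds the classes listed in the text, and the function names are hypothetical.

```python
from typing import Optional

# Carbon count -> class name, following the classification listed above.
TERPENE_CLASS_BY_CARBONS = {
    5: "hemiterpene",
    10: "monoterpene",
    15: "sesquiterpene",
    20: "diterpene",
    25: "sesterterpene",
    30: "triterpene",
    40: "tetraterpene",
}


def terpene_formula(n_units: int) -> str:
    """Molecular formula of a terpene hydrocarbon built from n isoprene units."""
    if n_units < 1:
        raise ValueError("at least one isoprene unit is required")
    return f"C{5 * n_units}H{8 * n_units}"


def classify_terpene(carbons: int) -> Optional[str]:
    """Class name for a carbon count, or None if it is not a listed size."""
    return TERPENE_CLASS_BY_CARBONS.get(carbons)


if __name__ == "__main__":
    print(terpene_formula(2), classify_terpene(10))
    print(terpene_formula(4), classify_terpene(20))
    print(classify_terpene(16))  # not a listed canonical size
```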
- a well-known monoterpene is alpha-pinene, a major component of turpentine.
- Terpenoids are modified terpenes, wherein methyl groups have been moved or removed, or oxygen atoms added, or contain other decorations or modifications.
- the term "terpene” may also be used more broadly, to include the terpenoids. Just like terpenes, the terpenoids can be classified according to the number of isoprene units that comprise the parent terpene.
- Hemiterpenes consist of a single isoprene unit. Isoprene itself is the only hemiterpene, but oxygen-containing derivatives such as prenol and isovaleric acid are hemiterpenoids.
- Monoterpenes are molecules comprising a 10-carbon isoprenoid structure.
- Monoterpenoids may, in addition to the 10-carbon isoprenoid structure, also comprise moieties not having isoprenoid structure.
- the biosynthesis of mono-terpenoids involves several additional steps following the initial conversion of GPP to the basic monoterpene skeleton. These additional steps may be oxidations (e.g. catalysed by a cytochrome P450 enzyme), reductions, isomerizations, acetylations, methylations, etc.
- examples of sesquiterpenes and sesquiterpenoids include humulene, amorphadiene, farnesenes, farnesol, valencene, etc. (the sesqui- prefix means one and a half).
- Diterpenes are composed of four isoprene units and have the molecular formula C20H32. They derive from geranylgeranyl pyrophosphate. Examples of diterpenes and diterpenoids are casbene, abietadiene, miltiradiene, ginkgolides, cafestol, kahweol, cembrene and taxadiene (precursor of taxol). Diterpenes also form the basis for compounds such as retinol, retinal, and phytol.
- Sesterterpenes, terpenes having 25 carbons and five isoprene units, are rare relative to the other sizes (the sester- prefix means two and a half).
- An example of a sesterterpenoid is geranylfarnesol.
- Triterpenes consist of six isoprene units and have the molecular formula C30H48.
- the linear triterpene squalene, the major constituent of shark liver oil, is derived from the reductive coupling of two molecules of farnesyl pyrophosphate. Squalene is then processed biosynthetically to generate either lanosterol or cycloartenol, the structural precursors to the steroids.
- Sesquarterpenes are composed of seven isoprene units and have the molecular formula C35H56. Sesquarterpenes are typically microbial in their origin. Examples of sesquarterpenoids are ferrugicadiol and tetraprenylcurcumene.
- Tetraterpenes contain eight isoprene units and have the molecular formula C40H64. These include the acyclic lycopene, the monocyclic gamma-carotene, and the bicyclic alpha- and beta-carotenes.
- Polyterpenes consist of long chains of many isoprene units. Natural rubber consists of polyisoprene in which the double bonds are cis. Some plants produce a polyisoprene with trans double bonds, known as gutta-percha.
- Norisoprenoids such as the C13-norisoprenoid 3-oxo-α-ionol and 7,8-dihydroionone derivatives, such as megastigmane-3,9-diol and 3-oxo-7,8-dihydro-α-ionol, can be produced by fungal peroxidases or glycosidases. Many norisoprenoids are the product of oxidative cleavage of a larger isoprenoid molecule by specific enzymes.
- Iridoids are a group of compounds found in plants and some animals, which are biosynthetically derived from 8-oxogeraniol.
- Monoterpene indole alkaloids refer to a large and diverse group of plant chemical compounds derived from a unit of tryptamine and a 10-carbon or 9-carbon unit of terpenoid origin that is, in turn, derived from 8-oxo-geraniol.
- Higher terpenes are intended to mean molecules comprising more than 10 carbon atoms of isoprenoid structure. Examples include sesquiterpenes, diterpenes and triterpenes. Higher terpenes may include moieties not having the isoprenoid structure in addition to the terpene structure.
- Cannabinoids refer to a group of compounds, members of which were initially isolated from the plant Cannabis sativa. Many cannabinoids are biosynthesized by the addition of GPP to olivetolic acid.
- Meroterpenoids refer to compounds that contain an isoprenoid moiety as part of a larger compound. Such compounds are, for example, the group of cannabinoids, the group of monoterpene indole alkaloids, or other prenylated aromatic compounds.
- Canonical terpenes refer to terpenes synthesized using the canonical terpene precursors IPP, DMAPP, GPP, FPP, GGPP, etc, or their “cis-” counterparts, and which have a number of carbon atoms that is a multiple of 5, as their biosynthesis is based on 5-carbon precursors.
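The carbon-count criterion above can be illustrated with a short sketch that sums the carbons of the building blocks condensed into a product and checks whether the total is a multiple of five. It assumes that condensation conserves the carbon count of each building block, and the function names are hypothetical. A multiple-of-five total is necessary but not sufficient for a canonical structure: a C5 building block carrying a heteroatom (for example one derived from 4-fluoro-3-methylbut-2-en-1-ol) is still non-canonical.

```python
from typing import Sequence


def product_carbons(building_block_carbons: Sequence[int]) -> int:
    """Total carbon count of a product assembled from the given building blocks.

    Assumption: every carbon of every building block is retained on condensation.
    """
    if not building_block_carbons:
        raise ValueError("at least one building block is required")
    if any(c < 1 for c in building_block_carbons):
        raise ValueError("carbon counts must be positive")
    return sum(building_block_carbons)


def has_canonical_size(carbons: int) -> bool:
    """True if the carbon count is a positive multiple of five."""
    return carbons > 0 and carbons % 5 == 0


if __name__ == "__main__":
    # Three C5 units give a canonical-size C15 skeleton.
    print(product_carbons([5, 5, 5]), has_canonical_size(15))
    # One or two C6 units (e.g. from 3-methylpent-2-en-1-ol) give C16 or C17.
    print(product_carbons([6, 5, 5]), has_canonical_size(16))
    print(product_carbons([6, 6, 5]), has_canonical_size(17))
```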
- DMAPP and IPP Dimethylallyl pyrophosphate (or dimethylallyl diphosphate; DMAPP) and isopentenyl pyrophosphate (or isopentenyl diphosphate; IPP) are 5-carbon precursors of isoprenoids.
- GPP Geranyl diphosphate (or geranyl pyrophosphate; GDP).
- GPP is formed by condensation of one DMAPP and one IPP molecule.
- GPP is a branch point molecule in isoprenoid synthesis, and it can, by addition of an IPP molecule, be converted into FPP, and thereby be directed into the biosynthesis of sesqui-, di- or tri-terpenes or sterol synthesis, or it can, by the action of a monoterpene synthase, be directed into the synthesis of monoterpenoids, iridoids, and monoterpene indole alkaloids.
- FPP Farnesyl pyrophosphate (or farnesyl diphosphate; FDP) is formed by condensing GPP with an IPP molecule.
- FPP is the precursor for the synthesis of sesquiterpenes, diterpenes, triterpenes and sterols.
- GGPP Geranylgeranyl pyrophosphate (or geranylgeranyl diphosphate; GGDP).
- GGPP is formed by condensing an FPP with an IPP molecule.
- GGPP is a precursor for the synthesis of diterpenes.
- GFPP Geranylfarnesyl pyrophosphate (or geranylfarnesyl diphosphate; GFDP).
- GFPP is formed by condensing a GGPP with an IPP molecule.
- GFPP is a precursor for the synthesis of sesterterpenes.
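The stepwise elongation described in the definitions above (DMAPP plus IPP gives GPP, GPP plus IPP gives FPP, FPP plus IPP gives GGPP, and GGPP plus IPP gives GFPP) can be sketched as an ordered series. The function names are hypothetical, and the sketch covers only the canonical trans series named in the text.

```python
# Each entry is formed from the previous one by adding one IPP (C5) unit.
ELONGATION_SERIES = ["DMAPP", "GPP", "FPP", "GGPP", "GFPP"]


def add_ipp(precursor: str) -> str:
    """Return the prenyl diphosphate obtained by adding one IPP unit."""
    index = ELONGATION_SERIES.index(precursor)  # ValueError if unknown
    if index == len(ELONGATION_SERIES) - 1:
        raise ValueError(f"no longer precursor recorded after {precursor}")
    return ELONGATION_SERIES[index + 1]


def ipp_units_needed(target: str) -> int:
    """Number of IPP units added to DMAPP to reach the target precursor."""
    return ELONGATION_SERIES.index(target)


def carbon_count(precursor: str) -> int:
    """Carbon count of the prenyl chain (five carbons per unit)."""
    return 5 * (ELONGATION_SERIES.index(precursor) + 1)


if __name__ == "__main__":
    for name in ELONGATION_SERIES:
        print(name, ipp_units_needed(name), carbon_count(name))
```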
- Structural analogues, also referred to as chemical analogues or analogues, are compounds that possess structural similarity to a specific compound but differ from said compound in one or more ways.
- Analogues can be, but are not limited to, compounds with one or more atoms added or substituted, with functional groups added, removed or substituted, and with substructures changed, isomerized, or modified.
- Non-canonical isoprenoids are chemical analogues of canonical isoprenoids produced by removal of diphosphate groups from non-canonical isoprenoid building blocks. When the diphosphate group is removed, several different reactions can occur, including cyclization of the molecule, rearrangement or formation of single, double and triple bonds, and reaction with water or oxygen to form functional groups.
- Non-canonical meroterpenoids are analogues of canonical meroterpenoids (i.e. meroterpenoids with a canonical isoprenoid moiety).
- Non-canonical cannabinoids are analogues of compounds belonging to the cannabinoid group of compounds. These analogues are defined by the utilization of a non-canonical isoprenoid building block instead of GPP.
- Non-canonical monoterpenes, sesquiterpenes, diterpenes, sesterterpenes, triterpenes, tetraterpenes, polyterpenes, sterols are chemical analogues of monoterpenes, sesquiterpenes, diterpenes, sesterterpenes, triterpenes, tetraterpenes, polyterpenes, sterols, or molecules with a resembling structure or means of production.
- Diphosphate(s), also referred to as pyrophosphate(s), is any molecule with a diphosphate group.
- in this work, the term often refers to, but is not limited to, organic molecules with a diphosphate group and their analogues.
- Prenol-like primary alcohol is used here to describe an alcohol with a structure that is an analog of the alcohols prenol or isoprenol.
- MEP pathway The methylerythritol 4-phosphate (MEP) pathway forming IPP and DMAPP.
- MVA pathway The mevalonate pathway (MVA pathway) is an essential metabolic pathway present in eukaryotes and in some bacteria forming IPP and DMAPP starting from acetyl-CoA.
- Alternative MVA pathway The alternative MVA pathway is found in archaea and provides IPP and DMAPP, starting from acetyl-CoA but utilizing isopentenyl phosphate as intermediate.
- Terpene synthases typically synthesize multiple products, but the diversity of products varies among terpene synthases. Some terpene synthases have high product specificity, catalysing the synthesis of a limited number of products, and other terpene synthases have low product specificity, catalysing the synthesis of a large variety of different terpenes.
- terpene synthases are frequently classified as hemiterpene synthases (if they accept DMAPP or IPP), monoterpene synthases (if they accept GPP), sesquiterpene synthases (if they accept FPP), diterpene synthases (if they accept GGPP), sesterterpene synthases (if they accept GFPP), or triterpene synthases (if they accept oxidosqualene or squalene).
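The substrate-based classification of terpene synthases above maps directly onto a lookup table. The sketch below is illustrative only; the function name is hypothetical, and substrates outside the list (for example non-canonical diphosphates) simply return None.

```python
from typing import Optional

# Accepted substrate -> synthase class, as stated in the classification above.
SYNTHASE_CLASS_BY_SUBSTRATE = {
    "DMAPP": "hemiterpene synthase",
    "IPP": "hemiterpene synthase",
    "GPP": "monoterpene synthase",
    "FPP": "sesquiterpene synthase",
    "GGPP": "diterpene synthase",
    "GFPP": "sesterterpene synthase",
    "oxidosqualene": "triterpene synthase",
    "squalene": "triterpene synthase",
}


def classify_synthase(substrate: str) -> Optional[str]:
    """Synthase class for an accepted substrate, or None if it is not listed."""
    return SYNTHASE_CLASS_BY_SUBSTRATE.get(substrate)


if __name__ == "__main__":
    for substrate in ("GPP", "FPP", "squalene", "unknown"):
        print(substrate, "->", classify_synthase(substrate))
```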
- Prenyltransferases are enzymes that append a prenyl moiety to isoprenoid or non-isoprenoid skeletons. Many prenyltransferases that append a prenyl moiety to other isoprenoid chains are involved in the synthesis of the prenyl diphosphate precursors, such as GPP (GPP synthases), FPP (FPP synthases), GGPP (GGPP synthases) or geranylfarnesyl diphosphate (GFPP synthases). These enzymes typically add IPP units to extend DMAPP to larger size prenyl diphosphates in the trans-configuration.
- these enzymes are also termed trans-polyprenyl synthases or trans-polyprenyltransferases.
- other prenyltransferase enzymes catalyse the cis-condensation and elongation of DMAPP with IPP. These enzymes, termed cis-prenyltransferases, cis-polyprenyl diphosphate synthases, or cis-polyprenyltransferases, are responsible for the synthesis of neryl diphosphate, cis,cis-farnesyl diphosphate, and nerylneryl diphosphate.
- prenyltransferases have been reported to condense two DMAPP molecules to lavandulyl diphosphate or chrysanthemyl diphosphate.
- Prenyltransferases that append a prenyl moiety to non-isoprenoid scaffolds add DMAPP, GPP, FPP or GGPP to non-isoprenoid compounds, including flavonoids, amino acid residues and peptides, aromatic compounds, and other chemical compounds in general.
- Such prenyltransferase enzymes are involved in the biosynthesis of many different natural products including, but not limited to, cannabinoids, prenylated flavonoids, or other meroterpenoids. In the case of cannabinoid synthesis, this enzyme is a geranyldiphosphate:olivetolate geranyltransferase.
- the prenyltransferase may be part of separate polypeptides or fused into one polypeptide chain.
- the prenyltransferase may also be fused to another prenyltransferase (e.g. Erg20p; an FPP synthase), a terpene synthase, or another non-terpene synthesizing protein.
- the prenyltransferase may also be fused to an enzyme that naturally localizes to the peroxisome matrix or its membrane in yeasts or in another organism, or fused to a polypeptide chain that is itself fused to a peroxisomal targeting signal.
- An aromatic prenyltransferase is selected among any enzyme with prenyltransferase activity, identified from any organism or engineered, that is able to transfer an isoprenoid moiety to another isoprenoid or non-isoprenoid compound.
- Prenyl diphosphate synthase refers to any polypeptide with prenyl diphosphate synthesizing capacity that utilizes prenyl pyrophosphate compounds as substrate(s). In most cases, a prenyl diphosphate synthase is a prenyltransferase (see above).
- pyrophosphate is used interchangeably herein with “diphosphate”. In this document, pyrophosphates is an umbrella term for organic molecules that contain a pyrophosphate group.
- prenyl diphosphate is used interchangeably with “prenyl pyrophosphate”, “isoprenyl diphosphate”, or “isoprenyl pyrophosphate”, and includes monoprenyl diphosphates containing a single prenyl or isoprenyl group (such as DMAPP or IPP), and polyprenyl diphosphates with two or more prenyl/isoprenyl groups (such as GPP, FPP, GGPP, etc.).
- Non-canonical pyrophosphates refers to structural analogues of compounds containing a single prenyl or isoprenyl group (such as DMAPP or IPP), as well as structural analogues of polyprenyl diphosphates with two or more prenyl/isoprenyl groups (such as GPP, FPP, GGPP, etc.).
- Canonical isoprenoid building blocks refer to prenyl pyrophosphate compounds with a carbon number that is a multiple of 5, which serve as substrates in the biosynthesis of either larger prenyl pyrophosphate compounds or of canonical terpenes (terpenoids and/or isoprenoids) and meroterpenoids. While not limited to these processes, these building blocks can usually be utilized by prenyltransferases and functional analogues, or by enzymes capable of removing the diphosphate group to catalyse a reaction, or can be modified or folded by oxidoreductases. Squalene, dehydrosqualene, oxidosqualene, and phytoene are also considered herein as canonical isoprenoid building blocks, despite the fact that they do not carry a diphosphate group.
- Non-canonical isoprenoid building blocks refers to pyrophosphate group- containing organic molecules that are structural analogues of the canonical isoprenoid building blocks and can serve as substrates in the biosynthesis of either larger pyrophosphate-containing compounds by the action of prenyltransferases or prenyl diphosphate synthase enzymes, also described as condensation and elongation in this work, or in the biosynthesis of non-canonical isoprenoids (terpenes or isoprenoids) by the action of terpene synthases, or non-canonical meroterpenoids by the action of corresponding prenyltransferases.
- non-canonical isoprenoid building blocks can also be utilized by enzymes capable of removing the diphosphate group to catalyse a reaction, or can be modified or folded by oxidoreductases, among other reactions.
- genetically engineered refers to the genetic alteration of a cell resulting from the direct uptake of exogenous genetic material from its surroundings through the cell membrane(s), or by other means, such as viral transduction, whether or not said exogenous genetic material is incorporated into the cell’s genome, thus possibly leading to either stable or transient expression.
- the host cell of the invention comprises at least a first nucleic acid comprising or consisting of a variant of SEQ ID NO: 15 encoding a first kinase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 15 and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises at least a first nucleic acid comprising or consisting of a variant of SEQ ID NO: 15.
- the host cell of the invention comprises at least a first nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 2 (corresponding to a first kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 2 and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises a polypeptide comprising or consisting of a variant of SEQ ID NO: 2.
- the host cell of the invention comprises a polypeptide with between 1 and 5 amino acid substitutions as compared to SEQ ID NO: 2.
- said polypeptide has between 1 and 3 amino acid substitutions as compared to SEQ ID NO: 2.
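The identity window ("at least 75%, but less than 100%") and the substitution counts used in these embodiments can be illustrated with a short calculation. The sketch below is an assumption-laden simplification: it compares two pre-aligned, ungapped sequences of equal length, whereas real identity values are normally derived from a sequence alignment. The toy sequences and all function names are hypothetical and are not the sequences of the SEQ ID NOs.

```python
def percent_identity(reference: str, variant: str) -> float:
    """Percent identity of two pre-aligned, equal-length, ungapped sequences."""
    if not reference or len(reference) != len(variant):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(reference, variant))
    return 100.0 * matches / len(reference)


def count_substitutions(reference: str, variant: str) -> int:
    """Number of positions at which the variant differs from the reference."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be of equal length")
    return sum(a != b for a, b in zip(reference, variant))


def is_variant_in_window(reference: str, variant: str, min_identity: float = 75.0) -> bool:
    """True if identity is at least min_identity but below 100%."""
    return min_identity <= percent_identity(reference, variant) < 100.0


if __name__ == "__main__":
    toy_reference = "MKTAYIAKQRQISFVKSHFS"  # hypothetical 20-residue sequence
    toy_variant = "MKTAYIAKQRQLSFVKSHFA"    # two substitutions: I12L and S20A
    print(percent_identity(toy_reference, toy_variant))
    print(count_substitutions(toy_reference, toy_variant))
    print(is_variant_in_window(toy_reference, toy_variant))
```

An unmodified reference sequence falls outside the window because its identity is exactly 100%.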
- the host cell of the invention comprises at least a first nucleic acid comprising or consisting of a variant of SEQ ID NO: 14 encoding a first kinase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 14 and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises at least a first nucleic acid comprising or consisting of a variant of SEQ ID NO: 14.
- the host cell of the invention comprises at least a first nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 1 (corresponding to a first kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 1 and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises a polypeptide comprising or consisting of a variant of SEQ ID NO: 1.
- the host cell of the invention comprises a polypeptide with between 1 and 5 amino acid substitutions as compared to SEQ ID NO: 1.
- said polypeptide has between 1 and 3 amino acid substitutions as compared to SEQ ID NO: 1.
- the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 17 encoding a second kinase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 17 and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 17.
- the host cell of the invention comprises a second nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 3 (i.e., encoding a second kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 3 and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises a polypeptide comprising or consisting of a variant of SEQ ID NO: 3.
- the host cell of the invention comprises a polypeptide with between 1 and 5 amino acid substitutions as compared to SEQ ID NO: 3.
- said polypeptide has between 1 and 3 amino acid substitutions as compared to SEQ ID NO: 3.
- the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 18 encoding a second kinase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 18 and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 18.
- the host cell of the invention comprises a second nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 4 (i.e. encoding a second kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 4 and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises a polypeptide comprising or consisting of a variant of SEQ ID NO: 4. In another embodiment, the host cell of the invention comprises a polypeptide with between 1 and 5 amino acid substitutions as compared to SEQ ID NO: 4.
- said polypeptide has between 1 and 3 amino acid substitutions as compared to SEQ ID NO: 4.
- the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 19 encoding a second kinase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 19 and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 19.
- the host cell of the invention comprises a second nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 5 (i.e. encoding a second kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 5 and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises a polypeptide comprising or consisting of a variant of SEQ ID NO: 5.
- the host cell of the invention comprises a polypeptide with between 1 and 5 amino acid substitutions as compared to SEQ ID NO: 5.
- said polypeptide has between 1 and 3 amino acid substitutions as compared to SEQ ID NO: 5.
- the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 19 encoding a second kinase polypeptide wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 19 and comprising the amino acid substitution 204G and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 19 and comprising the amino acid substitution 204G.
- the host cell of the invention comprises a second nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 6 (i.e., encoding a second kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 6 and comprising the amino acid substitution 204G and wherein said polypeptide is capable of phosphorylation in said host cell.
- the host cell of the invention comprises a polypeptide according to SEQ ID NO: 6.
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 20 encoding a prenyl diphosphate synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 20.
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 20.
- the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 7 encoding a prenyl diphosphate synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 7.
- the host cell of the invention comprises a polypeptide according to SEQ ID NO: 7.
- the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 8 encoding a prenyl diphosphate synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 8 and comprises amino acid change N127W.
- the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 8.
- the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 8 and comprises amino acid change N127W.
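Substitution labels such as N127W (original residue, position, new residue) or 204G (position and new residue only) can be applied to a sequence mechanically. The following minimal sketch assumes 1-based positions and reads 204G as "residue 204 replaced by glycine"; the helper name `apply_substitution` and the toy sequence are hypothetical.

```python
import re

# Optional original residue, 1-based position, new residue (e.g. "N127W" or "204G").
_MUTATION_PATTERN = re.compile(r"^([A-Z])?(\d+)([A-Z])$")


def apply_substitution(sequence: str, mutation: str) -> str:
    """Return the sequence with a single amino acid substitution applied."""
    match = _MUTATION_PATTERN.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation!r}")
    original, position, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    index = position - 1
    # When the label names the original residue, check it against the sequence.
    if original is not None and sequence[index] != original:
        raise ValueError(f"expected {original} at position {position}, found {sequence[index]}")
    return sequence[:index] + new + sequence[index + 1:]


if __name__ == "__main__":
    toy = "MANGT"  # hypothetical 5-residue sequence
    print(apply_substitution(toy, "N3W"))
    print(apply_substitution(toy, "4A"))
```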
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 21 encoding a (+)-limonene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 21.
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 21.
- the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 9 encoding a (+)-limonene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 9.
- the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 9.
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 22 encoding a Beta-myrcene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 22.
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 22.
- the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 10 encoding a Beta-myrcene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 10.
- the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 10.
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 23 encoding a Sabinene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 23.
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 23.
- the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 11 encoding a Sabinene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 11.
- the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 11.
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 24 encoding an α-pinene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 24.
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 24.
- the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 12 encoding an α-pinene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 12.
- the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 12.
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 25 encoding a geranyldiphosphate : olivetolate geranyltransferase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 25.
- the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 25.
- the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 13 encoding a geranyldiphosphate : olivetolate geranyltransferase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 13.
- the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 13.
- the host cell of the invention comprises a polypeptide consisting of naturally encoded amino acid residues, i.e., amino acids found in the genetic code.
- first kinase SEQ ID NO: 2
- a phosphokinase that phosphorylates a monophosphate precursor to a terpenoid pyrophosphate precursor herein may be useful for providing a genetically engineered eukaryotic cell, such as a yeast cell
- specific combinations of the first kinase and phosphokinase may be of particular interest in the context of the present invention.
- the first kinase and the phosphokinase are: i) AtFKI and AtIPK; ii) AtFKI and TalPK; iii) AtFKI and TalPK(204G); or functional variants thereof having at least 70% homology thereto, such as at least 71%, such as at least 72%, such as at least 73%, such as at least 74%, such as at least 75%, such as at least 76%, such as at least 77%, such as at least 78%, such as at least 79%, such as at least 80%, such as at least 81%, such as at least 82%, such as at least 83%, such as at least 84%, such as at least 85%, such as at least 86%, such as at least 87%, such as at least 88%, such as at least 89%, such as at least 90%, such as at least 91%, such as at least 92%, such as at least 93%, such as at least 94%, such as at least 95%, such as at least 9
- the combined action of one or more of these enzymes provides an isoprenoid biosynthetic pathway that allows de-coupling of isoprenoid biosynthesis from biomass production and enables channelling more substrate into product, thus providing a non-competitive system.
- a 10- to 100-fold increase in production titers of canonical isoprenoid compounds can be achieved.
- it is a highly efficient method to avoid common bottlenecks in currently used methods for the production of isoprenoids in yeast and other eukaryotic cells.
- the present methods allow production of isoprenoids with a total titer of at least 10 mg/L, such as at least 30 mg/L, such as at least 100 mg/L, such as at least 300 mg/L, such as at least 1 g/L, such as at least 3 g/L, such as at least 10 g/L, such as at least 30 g/L or more, wherein the total titer is the sum of the intracellular isoprenoids titer and the extracellular isoprenoids titer.
- the produced isoprenoids may be secreted from the cell (extracellular isoprenoids) or they may be retained in the cell (intracellular isoprenoids).
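The total titer defined above is the sum of the intracellular and extracellular titers, and it can be compared against the stated thresholds. A minimal sketch, assuming all values are expressed in mg/L; the function names and the example measurements are hypothetical.

```python
from typing import Optional

# Thresholds named in the text, converted to mg/L (1 g/L = 1000 mg/L).
TITER_THRESHOLDS_MG_PER_L = [10, 30, 100, 300, 1_000, 3_000, 10_000, 30_000]


def total_titer(intracellular_mg_per_l: float, extracellular_mg_per_l: float) -> float:
    """Total titer as the sum of intracellular and extracellular titers."""
    if intracellular_mg_per_l < 0 or extracellular_mg_per_l < 0:
        raise ValueError("titers cannot be negative")
    return intracellular_mg_per_l + extracellular_mg_per_l


def highest_threshold_met(total_mg_per_l: float) -> Optional[int]:
    """Largest listed threshold that the total titer reaches, or None."""
    met = [t for t in TITER_THRESHOLDS_MG_PER_L if total_mg_per_l >= t]
    return max(met) if met else None


if __name__ == "__main__":
    total = total_titer(40.0, 75.0)  # hypothetical measurements
    print(total, highest_threshold_met(total))
```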
- the method may also comprise a step of recovering the produced isoprenoids. This may involve a heating step to precipitate cell material and to release intracellular isoprenoids, a centrifugation or filtration step to remove the cell debris and precipitated materials, and pH-adjusting and chromatographic steps, optionally involving solvents, to vary the solubility of the isoprenoids and to purify them from other components.
- recovery of isoprenoids involves the addition of a non-miscible solvent overlay in the yeast culture. Said solvent may be hexane, dodecane, isopropyl myristate, or a vegetable oil.
- the recovered isoprenoids may be used directly as a nutritional supplement together with their naive or processed host cells.
- Kinases of the invention meet the definition of an enzyme that catalyses the transfer of phosphate groups from high-energy, phosphate-donating molecules such as CTP, ATP, GTP, UTP, NTP, CDP, ADP, GDP, UDP, NDP, or diphosphate, triphosphate or polyphosphate to specific substrates.
- This process is known as phosphorylation, where the substrate gains a phosphate group, and the high-energy, e.g., NTP molecule donates a phosphate group.
- This transesterification produces a phosphorylated substrate and NDP, as illustrated in the below schematic:
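A generic form of the two phosphoryl transfers described in this section can be sketched as follows. This is an illustrative reconstruction and not the schematic of the original filing; R-OH is an assumed placeholder for a primary alcohol substrate, P denotes a phosphate group, PP a diphosphate group, and protonation states are omitted.

```latex
\begin{align*}
\text{alcohol kinase:} \quad & \mathrm{R{-}OH} + \mathrm{NTP} \longrightarrow \mathrm{R{-}O{-}P} + \mathrm{NDP} \\
\text{phosphokinase:} \quad & \mathrm{R{-}O{-}P} + \mathrm{NTP} \longrightarrow \mathrm{R{-}O{-}PP} + \mathrm{NDP}
\end{align*}
```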
- Kinases as referred to herein encompass both alcohol kinases and phosphate kinases, i.e., phosphokinases according to the invention.
- Arabidopsis thaliana FKI is a farnesol kinase belonging to the phosphatidate cytidylyltransferase family of enzymes (Brenda EC 2.7.1.216, UniProt Accession Q67ZM7) that can phosphorylate farnesol using an NTP donor. It has also been shown to phosphorylate geraniol and geranylgeraniol. Phosphorylation of farnesol proceeds according to the following reaction:
- NTP + (2E,6E)-farnesol → NDP + (2E,6E)-farnesyl phosphate
- the present invention relates to a genetically engineered eukaryotic cell capable of producing mono- or pyrophosphate isoprenoid precursors, such as terpenoid precursors.
- the genetically engineered eukaryotic cell can be any appropriate cell.
- the genetically engineered eukaryotic cell is a yeast cell.
- the yeast cell is a cell from a GRAS (Generally Recognized As Safe) organism or a non-pathogenic organism or strain.
- the cell according to the invention is a eukaryotic cell.
- a eukaryotic cell is described herein as a host cell insofar as it is the recipient of and/or comprises nucleic acids or polypeptides according to the invention.
- a eukaryotic cell may be a yeast cell.
- the host cell is a yeast cell. Any yeast species may be appropriate.
- the genus of said yeast is selected from Saccharomyces, Pichia, Yarrowia, Kluyveromyces, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Schizosaccharomyces, Trichosporon and Lipomyces.
- the genus of said yeast is Saccharomyces, Pichia, Yarrowia, Ogataea or Kluyveromyces
- the yeast cell may be selected from the group consisting of Saccharomyces spp., Saccharomyces cerevisiae, Schizosaccharomyces pombe, Scheffersomyces stipitis, Pichia pastoris, Hansenula polymorpha (syn. Ogataea parapolymorpha), Kluyveromyces marxianus, Yarrowia lipolytica, Kluyveromyces lactis, or Dekkera bruxellensis.
- the eukaryotic cell according to the invention may also be a cell derived from a filamentous fungus, such as a cell derived from the mycelia or germ bodies of a filamentous fungus.
- Non-limiting exemplary species of filamentous fungi according to the invention are Aspergillus niger, Aspergillus oryzae, Aspergillus terreus, Neurospora crassa, or Trichoderma reesei.
- the eukaryotic cell according to the invention may also be an algal cell, such as a cell derived from a multicellular alga.
- the algal cell of the invention may also be a microalga, such as a cell of the species Nannochloropsis gaditana, Nannochloropsis oceanica, Nannochloropsis salina, Chlamydomonas reinhardtii, Arthrospira, Chlorella vulgaris, Dunaliella salina, Haematococcus pluvialis, Phaeodactylum tricornutum, or Isochrysis galbana. It is understood that other cells of other genera and, in particular, other species or strains of the same genera, are equally appropriate to use as host cells.
- the genetically engineered eukaryotic cell is not a plant cell.
- Cells of the invention can be produced, for example, by use of a combination of recombinant DNA techniques and gene transfection methods as are well known in the art (Morrison, S. (1985) Science 229: 1202).
- the nucleic acid sequence(s) of interest e.g., a kinase coding sequence
- expression vectors such as a eukaryotic expression plasmid such as used in the expression system disclosed in examples 1 and 2, or other expression systems well known in the art.
- a purified plasmid with the cloned sequences can be introduced into yeast cells or other eukaryotic host cells such as filamentous fungi cells, algae cells, mammalian cells such as CHO cells, HEK293T cells or HeLa cells or alternatively other eukaryotic cells like plant derived cells.
- the method used to introduce these genes can be methods described in the art including, but not limited to electroporation, chemical transformation, such as PEG/lithium acetate-mediated transformation, calcium-phosphate precipitation or DEAE-dextran transfection, transfection, such as lipofectamine transfection, transduction, ultrasound transformation and the like.
- genes can be expressed in the cytosol, or can be targeted to mitochondrion, peroxisome, vacuole, or other organelles by the addition of a suitable targeting sequence such as a chloroplastic, mitochondrial or peroxisomal targeting signal suitable for the host cells.
- nucleic acid sequence to remove or include a targeting sequence
- genes can be subjected to codon optimization with techniques well known in the art to achieve optimized expression of the proteins, i.e., it may be preferred to modify said nucleic acids for the sake of optimization of codon usage, in particular if said nucleic acids, optionally fused to heterologous nucleic acids such as nucleic acids derived from other organisms as described herein, are to be expressed in cells from an organism different from the cell of origin.
- nucleic acid sequences encoding alcohol or phosphate kinases originating from, e.g., Arabidopsis thaliana according to the invention can be modified to include one or more, preferably at least 1, 2, 3, 4, 5, 10, 15, 20 and preferably up to 10, 15, 20, 25, 30, 50, 70 or 100 or more nucleotide replacements resulting in an optimized codon usage in, e.g. a preferred yeast genus.
- nucleotide replacements preferably relate to replacements of nucleotides not resulting in a change in the encoded amino acid sequence.
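The idea of synonymous nucleotide replacements, which change codons without changing the encoded amino acid sequence, can be illustrated as follows. This is a minimal sketch under stated assumptions: the codon table is a small subset of the standard genetic code, the "preferred" codon per amino acid is a hypothetical choice made for illustration (a real workflow would take it from a codon usage table of the chosen host), and the toy coding sequence is not one of the sequences disclosed herein.

```python
# Subset of the standard genetic code, enough for the toy sequence below.
CODON_TO_AA = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E",
    "TAA": "*", "TAG": "*", "TGA": "*",
}

# Hypothetical preferred codon per amino acid (illustration only).
PREFERRED_CODON = {"M": "ATG", "A": "GCT", "K": "AAG", "E": "GAA", "*": "TAA"}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon using the table above."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds: str) -> str:
    """Replace every codon by the preferred synonymous codon."""
    return "".join(PREFERRED_CODON[aa] for aa in translate(cds))


def count_nucleotide_replacements(original: str, modified: str) -> int:
    """Number of nucleotide positions that differ between two equal-length sequences."""
    if len(original) != len(modified):
        raise ValueError("sequences must be of equal length")
    return sum(a != b for a, b in zip(original, modified))


if __name__ == "__main__":
    toy_cds = "ATGGCGAAAGAGTGA"  # hypothetical sequence encoding M-A-K-E-stop
    optimized = codon_optimize(toy_cds)
    # The encoded protein must be unchanged by the replacements.
    assert translate(optimized) == translate(toy_cds)
    print(optimized, count_nucleotide_replacements(toy_cds, optimized))
```

The internal assertion checks the property stated in the text: the replacements do not change the encoded amino acid sequence.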
- the degree of identity between a specific nucleic acid sequence and a nucleic acid sequence which is modified with respect to, or which is a variant of, said specific nucleic acid sequence will be at least 70%, preferably at least 75%, more preferably at least 80%, even more preferably at least 90% or most preferably at least 95%, 96%, 97%, 98% or 99%.
- Cells according to the invention may also be prepared using various site-directed mutagenesis methods, which for example can be designed based on the sequence of AtFKI, which is accessible under the UniProt entry Q67ZM7 and provided herein as SEQ ID NO: 2.
- the cell of the invention is prepared using any one of CRISPR, a TALEN, a zinc finger, meganuclease, and a DNA-cutting antibiotic as described in WO 2017/138986.
- the cell is prepared using the CRISPR/Cas9 technique, e.g. using RNA-guided Cas9 nuclease. This may be done as described in Lawrenson et al.
- the host cell is prepared using a combination of both TALEN and CRISPR/Cas9 techniques, e.g., using RNA-guided Cas9 nuclease. This may be done as described in Holme et al., Plant Mol Biol (2017) 95:111-121.
- the cell of the invention is prepared using homology directed repair, a combination of a DNA cutting nuclease and a donor DNA fragment. This may be done as described in Sun et al., Molecular Plant (2016) 9:628-631; DOI: https://doi.org/10.1016/j.molp.2016.01.001, except that the DNA cutting nuclease and donor DNA fragment are designed based on the gene sequences provided herein.
- cells expressing the kinase(s) can be identified and selected as would be known to the person skilled in the art according to the marker(s) used. These cells can then be amplified for their expression level and upscaled to produce canonical and non-canonical isoprenoids by use of the canonical and non-canonical isoprenoid precursors.
- nucleic acid is intended to include deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) and refers to a polynucleotide comprising a polymer of nucleotides.
- Nucleic acids comprise according to the invention genomic DNA, cDNA, mRNA, recombinantly produced and chemically synthesized molecules.
- a nucleic acid may be present as a single-stranded or double-stranded and linear or covalently circularly closed molecule.
- the nucleic acids may have been codon-optimized for expression in a yeast cell as is known in the art.
- isolated nucleic acid means according to the invention that the nucleic acid was (i) amplified in vitro, for example by polymerase chain reaction (PCR), (ii) recombinantly produced by cloning, (iii) purified, for example by cleavage and gel-electrophoretic fractionation, or (iv) synthesized, for example by chemical synthesis.
- An isolated nucleic acid is a nucleic acid which is available for manipulation by recombinant DNA techniques.
- Nucleic acids may, according to the invention, be present alone or in combination with other nucleic acids, which may be homologous or heterologous.
- a nucleic acid is functionally linked to expression control sequences which may be homologous or heterologous with respect to said nucleic acid wherein the term “homologous” means that the nucleic acid is also functionally linked to the expression control sequence naturally and the term “heterologous” means that the nucleic acid is not functionally linked to the expression control sequence naturally.
- a nucleic acid such as a nucleic acid expressing RNA and/or protein or peptide, and an expression control sequence are "functionally” linked to one another, if they are covalently linked to one another in such a way that expression or transcription of said nucleic acid is under the control or under the influence of said expression control sequence. Since the nucleic acid is to be translated into a functional protein, and where an expression control sequence is functionally linked to a coding sequence, induction of said expression control sequence results in transcription of said nucleic acid without causing a frame shift in the coding sequence or said coding sequence otherwise not being capable of being translated into the desired protein or peptide.
- expression control sequence or "expression control element” comprises according to the invention promoters, ribosome binding sites, IRES, enhancers and other control elements which regulate transcription of a gene or translation of a mRNA.
- the expression control sequences can be regulated.
- the exact structure of expression control sequences may vary as a function of the species or cell type, but generally comprises 5'-untranscribed and 5'- and 3'-untranslated sequences which are involved in initiation of transcription and translation, respectively, such as TATA box, capping sequence, CAAT sequence, and the like. More specifically, 5'-untranscribed expression control sequences comprise a promoter region which includes a promoter sequence for transcriptional control of the functionally linked nucleic acid. Expression control sequences may also comprise enhancer sequences or upstream activator sequences.
- promoter or “promoter region” relates to a nucleic acid sequence which is located upstream (5') to the nucleic acid sequence being expressed and controls expression of the sequence by providing a recognition and binding site for RNA polymerase.
- the "promoter region” may include further recognition and binding sites for further factors which are involved in the regulation of transcription of a gene.
- a promoter may be "inducible” by way of initiating transcription in response to an inducing agent or may be “constitutive” if transcription is not controlled by an inducing agent.
- a gene which is under the control of an inducible promoter is not expressed or only expressed to a small extent if an inducing agent is absent. In the presence of the inducing agent the gene is switched on or the level of transcription is increased. This is mediated, in general, by binding of a specific transcription factor.
- Promoters which are preferred according to the invention include promoters useful for expression in a yeast host, including but not limited to promoters obtained from the genes for Saccharomyces cerevisiae enolase (ENO1), Saccharomyces cerevisiae galactokinase (GAL1), Saccharomyces cerevisiae UDP-glucose-4-epimerase (GAL10), Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase (TDH3), Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH1, ADH2/GAP), Saccharomyces cerevisiae triose phosphate isomerase (TPI), Saccharomyces cerevisiae metallothionein (CUP1), and Saccharomyces cerevisiae 3-phosphoglycerate kina
- promoters for directing transcription of the nucleic acid constructs of the present invention in a filamentous fungal host cell are promoters obtained from the genes for Aspergillus nidulans acetamidase, Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Aspergillus oryzae TAKA amylase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Fusarium oxysporum trypsin-like protease (WO 96/00787), Fusarium venenatum amyloglucosidase (WO 00/56900), Fusarium venenatum Daria (WO 00/56900), Fusarium venenatum Quinn (
- Trichoderma reesei endoglucanase II, Trichoderma reesei endoglucanase III,
- Trichoderma reesei endoglucanase V, Trichoderma reesei xylanase I, Trichoderma reesei xylanase II, Trichoderma reesei xylanase III, Trichoderma reesei beta-xylosidase, and Trichoderma reesei translation elongation factor, as well as the NA2-tpi promoter (a modified promoter from an Aspergillus neutral alpha-amylase gene in which the untranslated leader has been replaced by an untranslated leader from an Aspergillus triose phosphate isomerase gene; non-limiting examples include modified promoters from an Aspergillus niger neutral alpha-amylase gene in which the untranslated leader has been replaced by an untranslated leader from an
- the control sequence may also be a transcription terminator, which is recognized by a host cell to terminate transcription.
- the terminator is operably linked to the 3'-terminus of the polynucleotide encoding the polypeptide. Any terminator that is functional in the host cell may be used in the present invention.
- Preferred terminators for yeast host cells are obtained from the genes for Saccharomyces cerevisiae enolase, Saccharomyces cerevisiae cytochrome C (CYC1), Saccharomyces cerevisiae alcohol dehydrogenase 1 (ADH1), and Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase.
- Other useful terminators for yeast host cells are described by Romanos et al., 1992, supra.
- Preferred terminators for filamentous fungal host cells are obtained from the genes for Aspergillus nidulans acetamidase, Aspergillus nidulans anthranilate synthase, Aspergillus niger glucoamylase, Aspergillus niger alpha-glucosidase, Aspergillus oryzae TAKA amylase, Fusarium oxysporum trypsin-like protease, Trichoderma reesei beta-glucosidase, Trichoderma reesei cellobiohydrolase I, Trichoderma reesei cellobiohydrolase
- Trichoderma reesei endoglucanase I, Trichoderma reesei endoglucanase II, Trichoderma reesei endoglucanase III, Trichoderma reesei endoglucanase V, Trichoderma reesei xylanase I, Trichoderma reesei xylanase II, Trichoderma reesei xylanase III, Trichoderma reesei beta-xylosidase, and Trichoderma reesei translation elongation factor.
- control sequence may also be an mRNA stabilizer region downstream of a promoter and upstream of the coding sequence of a gene which increases expression of the gene.
- the control sequence may also be a leader, a non-translated region of an mRNA that is important for translation by the host cell.
- the leader is operably linked to the 5'-terminus of the polynucleotide encoding the polypeptide. Any leader that is functional in the host cell may be used.
- Suitable leaders for yeast host cells are obtained from the genes for Saccharomyces cerevisiae enolase (ENO1), Saccharomyces cerevisiae 3-phosphoglycerate kinase, Saccharomyces cerevisiae alpha-factor, and Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH2/GAP).
- Preferred leaders for filamentous fungal host cells are obtained from the genes for Aspergillus oryzae TAKA amylase and Aspergillus nidulans triose phosphate isomerase.
- the control sequence may also be a polyadenylation sequence, a sequence operably linked to the 3'-terminus of the polynucleotide and, when transcribed, is recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation sequence that is functional in the host cell may be used.
- polyadenylation sequences for yeast host cells are described by Guo and Sherman,
- Preferred polyadenylation sequences for filamentous fungal host cells are obtained from the genes for Aspergillus nidulans anthranilate synthase, Aspergillus niger glucoamylase, Aspergillus niger alpha-glucosidase, Aspergillus oryzae TAKA amylase, and Fusarium oxysporum trypsin-like protease.
- regulatory sequences that regulate expression of the polypeptide relative to the growth of the host cell.
- regulatory sequences are those that cause expression of the gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound.
- In yeast, the ADH2 system or GAL1 system may be used.
- In filamentous fungi, the Aspergillus niger glucoamylase promoter, Aspergillus oryzae TAKA alpha-amylase promoter, Aspergillus oryzae glucoamylase promoter, Trichoderma reesei cellobiohydrolase I promoter, and Trichoderma reesei cellobiohydrolase promoter may be used.
- regulatory sequences are those that allow for gene amplification.
- these regulatory sequences include the dihydrofolate reductase gene that is amplified in the presence of methotrexate, and the metallothionein genes that are amplified with heavy metals.
- the polynucleotide encoding the polypeptide would be operably linked to the regulatory sequence.
- the term “expression” is used in its most general meaning and comprises the production of RNA or of RNA and protein/peptide. It also comprises partial expression of nucleic acids. Furthermore, expression may be carried out transiently or stably. According to the invention, the term expression also includes an “increased expression” or “abnormal expression”.
- “Increased expression” or “abnormal expression” or “silenced” means according to the invention that expression is altered, increased or decreased, compared to a reference, preferably compared to the state in a normal cell in normal growing physical and chemical conditions, not undergoing above-normal respiration and/or apoptosis, and in any case in the same conditions and state as the cell whose expression is being compared.
- An increase in expression refers to an increase by at least 10%, in particular at least 20%, at least 50%, at least 100%, at least 200%, at least 500%, at least 1000%, at least 10000% or more.
- a decrease in expression or silencing refers to a decrease by at least 10%, in particular at least 20%, at least 50%, at least 100%, at least 200%, at least 500%, at least 1000%, at least 10000% or more.
- a nucleic acid molecule is according to the invention present in a vector, where appropriate with a promoter, which controls expression of the nucleic acid.
- vector is used here in its most general meaning and comprises any intermediary vehicle for a nucleic acid which enables said nucleic acid, for example, to be introduced into eukaryotic cells and preferably expressed and, where appropriate, to be replicated and / or integrated into a genome.
- vector as used herein generally relates to genetic material that is at least at the time of introduction into the host cell, extrachromosomal, usually circular DNA duplex.
- a vector containing foreign DNA is termed recombinant DNA.
- vector therefore comprises, but is not limited to, plasmids, viral vectors, cosmids, and artificial chromosomes. Common to most engineered vectors is an origin of replication, one or multiple cloning sites, and one or multiple selectable markers.
- a vector for expression of one or more kinases according to the invention may either be of a vector type in which the first and the second kinases are present in different vectors or a vector type in which both are present in the same vector.
- nucleic acid and amino acid sequences e.g. those shown in the sequence listing, is to be construed so as to also relate to modifications of said specific sequences resulting in sequences which are functionally equivalent to said specific sequences, e.g. nucleic acid sequences encoding amino acid sequences exhibiting properties identical or similar to those of the amino acid sequences encoded by the specific nucleic acid sequences.
- ectopic particularly in relation to “ectopic expression” as used herein, relates to the occurrence of gene expression in a cell in which it is normally not expressed.
- ectopic expression can be caused by the introduction and expression of a nucleic acid in a vector as defined herein or by juxtaposition of novel enhancer elements to a gene. Such techniques are known to the person skilled in the art.
- “Growth medium” or “culture medium” as used herein refers to a solid, liquid, or semi-solid medium designed to support the growth of a population of microorganisms or cells via the process of cell proliferation.
- Different types of media are used for growing different types of cells and are known to the person skilled in the art. Examples of media are YPD medium, YPG medium, YPAD medium, YPGal medium, synthetic minimal medium, synthetic complex medium, selective minimal medium, and selective inducing minimal medium.
- peptide analogue refers to a compound comprising a peptide, wherein the peptide may be modified with moieties that do not necessarily consist of proteinogenic amino acids and are thus non-proteinogenic amino acids residues.
- Non-proteinogenic amino acids are those not naturally encoded or found in the genetic code of any organism. These may be, e.g., intermediates in biosynthesis, or post-translationally formed in proteins.
- fusion protein or “fusion” or “recombinant protein” refers to a single polypeptide chain having at least two polypeptide domains that are not normally present in a single, natural polypeptide. Such a fusion protein is typically obtained by the expression of recombinant DNA molecules.
- Recombinant DNA molecules are DNA molecules formed by genetic recombination (such as molecular cloning) that bring together genetic material from multiple sources, creating sequences that would not otherwise be found in the genome.
- sequence identity or, for example, comprising a “sequence 75% identical to,” as used herein, refer to the extent that sequences are identical on a nucleotide-by-nucleotide basis or an amino acid-by-amino acid basis over a window of comparison.
- a “percentage of sequence identity” may be calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, I) or the identical amino acid residue (e.g., Ala, Pro, Ser, Thr, Gly, Val, Leu, Ile, Phe, Tyr, Trp, Lys, Arg, His, Asp, Glu, Asn, Gln, Cys and Met) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, i.e., the window size, and multiplying the result by 100 to yield the percentage of sequence identity.
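The calculation just described can be illustrated with a minimal Python sketch. It assumes the two sequences have already been optimally aligned and padded to equal length, with "-" marking a gap; the function name `percent_identity` and the gap symbol are illustrative choices, not part of the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over a comparison window.

    Assumption of this sketch: both inputs are already optimally aligned,
    have equal length, and use '-' to mark a gap position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    window_size = len(aligned_a)
    if window_size == 0:
        raise ValueError("comparison window is empty")
    # A position is matched when both sequences carry the same residue
    # and that residue is not a gap symbol.
    matched = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    # Matched positions divided by window size, multiplied by 100.
    return 100.0 * matched / window_size


# Hypothetical 8-position window in which 7 positions match.
print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

Gap positions count towards the window size but never as matches, which is one common convention; other conventions exclude gapped columns from the denominator.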
- sequence relationships between two or more nucleic acid polymers or polypeptides include “reference sequence,” “comparison window,” “sequence identity,” “percentage of sequence identity” and “substantial identity”.
- a “reference sequence” is at least 12 but frequently 15 to 18 and often at least 25 monomer units, inclusive of nucleotides and amino acid residues, in length.
- two nucleic acid or polypeptide polymers may each comprise (1) a sequence (i.e., only a portion of a complete polymer) that is similar between the two polymers, and (2) a sequence that is divergent between the two polymers
- sequence comparisons between two (or more) polymers are typically performed by comparing sequences of the two polymers over a "comparison window" to identify and compare local regions of sequence similarity.
- a “comparison window” refers to a conceptual segment of at least 6 contiguous positions, usually about 50 to about 100, more usually about 100 to about 150 in which a sequence is compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.
- the comparison window may comprise additions or deletions (i.e., gaps) of about 20% or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
- Optimal alignment of sequences for aligning a comparison window may be conducted by computerized implementations of algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Drive, Madison, WI, USA) or by inspection and the best alignment (i.e., resulting in the highest percentage homology over the comparison window) generated by any of the various methods selected.
- sequence similarity or sequence identity between sequences can be performed as follows: To determine the percent identity of two nucleic acid sequences, or of two amino acid sequences, the sequences can be aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes).
- the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 75%, 80%, 90%, 100% of the length of the reference sequence.
- amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position.
- the percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
- the comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm.
- the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (1970, J. Mol. Biol. 48: 444-453) algorithm which has been incorporated into the GAP program in the GCG software package, using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6.
- the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package, using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6.
- a particularly preferred set of parameters is a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
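For orientation only, the global alignment principle of Needleman and Wunsch can be sketched in a few lines of Python. This sketch is not the GAP program: it assumes a simple match/mismatch score instead of a BLOSUM62 or PAM250 matrix and a single linear gap penalty instead of separate gap and length weights, and the function name and default scores are illustrative.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2):
    """Global alignment of a and b (simplified scoring, linear gap penalty).

    Returns (aligned_a, aligned_b, score). The scoring values are
    assumptions of this sketch, not the parameters named in the text.
    """
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
print(aligned_a)
print(aligned_b)
print(total)
```

The two returned strings have equal length and can be passed to an identity calculation over the aligned window; for real sequence work an established implementation with the substitution matrices named above would be used instead.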
- the percent identity between two amino acid or nucleotide sequences can also be determined using the algorithm of E. Meyers and W. Miller (1989, Cabios, 4: 11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.
- nucleic acid and protein sequences described herein can be used as a "query sequence" to perform a search against public databases, for example, to identify other family members or related sequences.
- Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al., (1990, J. Mol. Biol, 215: 403-10).
- Gapped BLAST can be utilized as described in Altschul et al. (Nucleic Acids Res. 25:3389-3402, 1997).
- the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.
- the substrate according to the invention is a primary alcohol with three (3) up to thirty (30) carbon atoms, including, but not limited to: structures with side chains, branched side chains, structures with one (1) or more double bonds, one (1) or more triple bonds, functional groups, structures with the addition of or substitution with the elements Hydrogen, Nitrogen, Oxygen, Fluorine, Silicon, Phosphorus, Sulphur, Chlorine, Selenium, Boron, Iodine, Lithium, Sodium or Potassium.
- Such a substrate may be summarised by the structure of formula 1 (structure drawing not reproduced here), wherein R 1 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, including, but not limited to methyl, ethyl, propyl, isopropyl, methoxy, ethoxy, hydroxyl, hydroxymethyl, hydroxyethyl, sulfhydryl, silyl; a group containing a reactive nonmetal; a group containing a metalloid; a group containing a halogen, comprising: fluoro-, chloro-, bromo-, or iodo- groups; a group containing oxygen, comprising: hydroxyl-, carbonyl-, aldehyde-, haloformyl-, carbonate ester-, carboxylate-, carboxyl-, carboalkoxy-, hydroperoxy
- R 2 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, including, but not limited to methyl, ethyl, propyl, isopropyl, methoxy, ethoxy, hydroxyl, hydroxymethyl, hydroxyethyl, sulfhydryl, silyl; a group containing a reactive nonmetal; a group containing a metalloid; a group containing a halogen, comprising: fluoro-, chloro-, bromo-, or iodo- groups; a group containing oxygen, comprising: hydroxyl-, carbonyl-, aldehyde-, haloformyl-, carbonate ester-, carboxylate-, carboxyl-, carboalkoxy-, hydroperoxy-, peroxy-, ether-, hemiacetal-, hemiketal-,
- R 3 is hydrogen, methyl, fluorine, chlorine, bromine, iodine, sulfhydryl, or hydroxyl.
- the primary alcohol of the invention is an alcohol comprising less than 5 carbon atoms, such as 4 carbon atoms, such as 3 carbon atoms.
- the primary alcohol of the invention is an alcohol comprising more than 5 carbon atoms, such as 6 carbon atoms, such as 7 carbon atoms, such as 8 carbon atoms, such as 9 carbon atoms.
- the primary alcohol of the invention is an alcohol comprising 5 carbon atoms.
- the substrate of the invention is any selected from the group formed by, but not limited to:
- the primary alcohol is 3-methylbut-2-en-1-ol, 4-fluoro-3-methylbut-2-en-1-ol, 3-methylpent-2-en-1-ol, 3,4-dimethylpent-2-en-1-ol, 3-ethylpent-2-en-1-ol, 3-methylhex-2-en-1-ol, 3-methylhexa-2,5-dien-1-ol, 3-methylbut-3-en-1-ol, 3-methylenepentan-1-ol, 2-methylprop-2-en-1-ol, 3-methyl-4-(methylthio)but-2-en-1-ol, or 5-chloro-3-methylpent-2-en-1-ol.
- the invention relates to the production of terpenes and terpenoid compounds in eukaryotic cells, said compounds being canonical or non-canonical terpenes or terpenoid compounds. Accordingly, a cell according to the invention is capable of production of a terpene or terpenoid selected from a group comprising, but not limited to:
- the terpene or terpenoid is selected from the group comprising hemiterpenes, hemiterpenoids, and monoterpenes.
- the terpene or terpenoid is selected from the group comprising acyclic monoterpenes, monocyclic monoterpenes, cyclopropane monoterpenes, cyclobutane monoterpenes, cyclopentane monoterpenes, cyclohexane monoterpenes, cymenes, bicyclic monoterpenes, pinanes, camphanes, fenchanes, monoterpene indole alkaloids, cannabinoids and sesquiterpenes.
- the terpene or terpenoid is selected from the group comprising farnesanes, monocyclic farnesane sesquiterpenes, cyclofarnesanes, bisabolanes, germacranes, elemanes, humulanes, polycyclic farnesane sesquiterpenes, caryophyllanes, eudesmanes, furanoeudesmanes, eremophilanes, furanoeremophilanes, valeranes, cadinanes, drimanes, guaianes, cycloguaianes, himachalanes, longipinanes, longifolanes, picrotoxanes, isodaucanes, daucanes, protoilludanes, illudanes, illudalanes, marasmanes, isolactaranes, lactaranes, sterpuranes, acoranes, chamigranes, cedra
- the terpene or terpenoid is selected from the group comprising phytanes, cyclophytanes, bicyclophytanes, labdanes, rearranged labdanes, tricyclophytanes, pimaranes, isopimaranes, cassanes, cleistanthanes, isocopalanes, abietanes, totaranes, tetracyclophytanes, beyeranes, kauranes, villanovanes, atisanes, gibberellanes, grayanatoxanes, cembranes, cyclocembranes, casbanes, lathyranes, jatrophanes, tiglianes, rhamnofolanes, daphnanes, eunicellanes, asbestinanes, biaranes, dolabellanes, dolastanes, fusicoccanes, verticillanes, taxanes, trinervitanes, kempanes,
- the terpene or terpenoid is a sesterterpene.
- the terpene or terpenoid is selected from the group comprising acyclic sesterterpenes, monocyclic sesterterpenes, bicyclic sesterterpenes, tricyclic sesterterpenes, tetracyclic sesterterpenes, pentacyclic sesterterpenes and polycyclic sesterterpenes.
- the terpene or terpenoid is a triterpene.
- the terpene or terpenoid is selected from the group comprising linear triterpenes, gonane type, tetracyclic triterpenes, protostanes, fusidanes, dammaranes, apotirucallanes, tirucallanes, euphanes, lanostanes, cycloartanes, cucurbitanes, baccharane type, pentacyclic triterpenes, baccharanes, lupanes, oleananes, taraxeranes, multifloranes, baueranes, glutinanes, friedelanes, pachysananes, taraxastanes, ursanes, pentacyclic triterpenes, hopane type, hopanes, neohopanes, fernanes, adiananes, filicanes, gammaceranes, stictanes, arboranes, onoceranes, serratanes
- the terpene or terpenoid is a tetraterpene.
- the terpene or terpenoid is selected from the group comprising carotenoids, apocarotenoids, diapocarotenoids, megastigmanes, polyterpenes and prenylquinones.
- Examples of the products of monoterpene synthases include, but are not limited to, the following compounds: tricyclene, alpha-thujene, alpha-pinene, alpha-fenchene, camphene, sabinene, beta-pinene, myrcene, delta-2-carene, alpha-phellandrene, 3-carene, 1,4-cineole, alpha-terpinene, beta-phellandrene, 1,8-cineole, limonene, (Z)-beta-ocimene, (E)-beta-ocimene, gamma-terpinene, terpinolene, linalool, perillene, allo-ocimene, cis-beta-terpineol, cis-terpine-1-ol, isoborneol, cf
- In addition to GPP, certain terpene synthases (or terpene synthase variants developed by protein engineering) have been reported to convert non-canonical prenyl diphosphate substrates, such as the 11-carbon substrate 2-methyl-GPP, to terpenes with non-canonical prenyl scaffolds (Ignea et al. 2018).
- enzymes that are able to convert non-canonical prenyl-diphosphate substrates with carbon lengths that differ from 10 into non-canonical terpenoids with 8, 9, 11, 12, 13 or 14 carbons and/or other non-hydrogen atoms, or that are in any way analogues of the canonical substrate (GPP), are also included in the definition of products of monoterpene synthases.
- sesquiterpene synthase products include, but are not limited to alpha-humulene, beta-caryophyllene, trans-alpha-bergamotene, cis-alpha-bergamotene, farnesene, alpha-santalene, santalol, beta-selinene, zingiberene, delta-cadinene, germacrene, etc. (see more comprehensive list of structures above).
- the engineered eukaryotic cell according to the present invention is capable of producing terpene scaffolds with 16, 17 or 31 carbon atoms.
- the engineered eukaryotic cell is capable of producing terpene scaffolds with 16, 17 or 31 carbon atoms when fed with an alcohol substrate selected from the group consisting of 3M2E, 3,4-DMP, 3-MP, 3E2E, prenol, and isoprenol.
- the engineered eukaryotic cell comprises nucleic acid sequences encoding AtFKI, AtIPK, Erg20p and CYC2 for the production of terpene with 16, 17 or 31 carbon atoms, wherein AtFKI comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto.
- the cell comprises a nucleic acid sequence encoding an AtIPK, wherein the nucleic acid sequence comprises or consists of SEQ ID NO: 17 or a homologue or variant thereof having at least 75% identity thereto.
- the engineered eukaryotic cell comprises nucleic acid sequences encoding AtFKI, AtIPK, and an enzyme selected from the group consisting of Erg20p(F96C), Synechococcus elongatus GGPPS, Erg20p, Taxus canadensis GGPPS, Lycopersicon esculentum NNPPS, Solanum habrochaites zFPPS for the production of sesquiterpene with 16 and 17 carbon atoms, wherein AtFKI comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto.
- the cell comprises a nucleic acid sequence encoding an AtIPK, wherein the nucleic acid sequence comprises or consists of SEQ ID NO: 17 or a homologue or variant thereof having at least 75% identity thereto.
- the engineered eukaryotic cell comprises nucleic acid sequences encoding AtFKI, AtIPK, Erg20p(F96C) and Salvia fruticosa trans-beta-caryophyllene synthase (Sf126) for the production of sesquiterpenes with 16 or 17 carbon atoms, wherein AtFKI comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto.
- the cell comprises a nucleic acid sequence encoding an AtIPK, wherein the nucleic acid sequence comprises or consists of SEQ ID NO: 17 or a homologue or variant thereof having at least 75% identity thereto.
- the engineered eukaryotic cell comprises nucleic acid sequences encoding AtFKI, AtIPK, and CPQ for the production of triterpene with 31 carbon atoms, wherein AtFKI comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto.
- the cell comprises a nucleic acid sequence encoding an AtIPK, wherein the nucleic acid sequence comprises or consists of SEQ ID NO: 17 or a homologue or variant thereof having at least 75% identity thereto.
- the engineered eukaryotic cell comprises nucleic acid sequences encoding AtFKI, AtIPK, and BmeTC(373C) for the production of triterpene with 31 carbon atoms, wherein AtFKI comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto.
- the cell comprises a nucleic acid sequence encoding an AtIPK, wherein the nucleic acid sequence comprises or consists of SEQ ID NO: 17 or a homologue or variant thereof having at least 75% identity thereto.
- the genetically engineered eukaryotic cell for the production of a terpene or a terpenoid or an isoprenoid comprising a first nucleic acid sequence encoding a first kinase that phosphorylates a primary alcohol to a mono- or pyrophosphate terpenoid precursor; and optionally a second nucleic acid sequence encoding a phosphokinase that phosphorylates a monophosphate precursor to a terpenoid pyrophosphate precursor; wherein the first kinase comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto, wherein the cell further comprises at least one further exogenous nucleic acid encoding an enzyme selected from the group consisting of Erg20p, CYC2, Erg20p(F96
- diterpene synthase products include but are not limited to: taxadiene, casbene, cembrene, copalyl diphosphate, copal-8-ol diphosphate, etc. (see more comprehensive list of structures above).
- triterpene synthase products include but are not limited to: friedelin, alpha-amyrin, beta-amyrin, lupeol, cucurbitadienol, etc. (see more comprehensive list of structures above).
- by “improved enzyme kinetics” as used herein is meant the result of any change to an enzyme, e.g., a kinase according to the invention or terpene synthase or prenyl transferase according to the invention, said change involving either a change to the nucleic acid sequence coding for said enzyme, or a change in the expressed peptide or protein, which result is measurable in terms of, e.g., the efficiency of said enzyme.
- Enzyme efficiency can be expressed in terms of kcat/Km, i.e., the specificity constant, wherein kcat is the turnover number and Km is the Michaelis constant, i.e. the affinity an enzyme has for its substrate.
- because the specificity constant reflects both affinity and catalytic ability, it is useful for comparing different enzymes against each other, or different variants of said enzyme according to the invention against each other, or the same enzyme with different substrates according to the invention.
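As an illustration of how such a comparison could be carried out, the following Python sketch computes kcat/Km for a set of enzyme variants and ranks them. The variant names and the kinetic values are purely hypothetical placeholders, not measurements from this disclosure; kcat is assumed to be given in s^-1 and Km in mol/L.

```python
def specificity_constant(kcat_per_s: float, km_molar: float) -> float:
    """Return kcat/Km in M^-1 s^-1 (kcat in s^-1, Km in mol/L)."""
    if kcat_per_s < 0:
        raise ValueError("kcat must be non-negative")
    if km_molar <= 0:
        raise ValueError("Km must be positive")
    return kcat_per_s / km_molar


# Hypothetical (kcat, Km) pairs for illustration only.
variants = {
    "wild_type": (10.0, 1.0e-3),
    "variant_A": (8.0, 2.0e-4),
    "variant_B": (12.0, 5.0e-3),
}

# Rank variants from highest to lowest specificity constant.
ranking = sorted(
    variants,
    key=lambda name: specificity_constant(*variants[name]),
    reverse=True,
)
for name in ranking:
    print(name, specificity_constant(*variants[name]))
```

In this made-up data set, variant_A ranks first despite its lower kcat, because its lower Km more than compensates, which is the kind of trade-off the specificity constant is meant to capture.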
- the term “improved enzyme kinetics” also encompasses the result of any change affecting the affinity of an enzyme according to the invention.
- the exponents m and n are called partial orders of reaction and are not generally equal to the stoichiometric coefficients a and b. Instead they depend on the reaction mechanism and it would be known how to determine them experimentally by the person skilled in the art.
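The rate expression to which the exponents m and n belong is not reproduced in this passage. Assuming the usual power-law form, rate = k[A]^m[B]^n, one common experimental approach is the method of initial rates: the concentration of one reactant is varied while the other is held constant, and the partial order follows from the ratio of the observed rates. The Python sketch below illustrates this under that assumption; the function name and the numerical values are hypothetical.

```python
import math


def partial_order(conc_1: float, rate_1: float,
                  conc_2: float, rate_2: float) -> float:
    """Estimate a partial order of reaction by the method of initial rates.

    Assumes rate = k[A]^m[B]^n with [B] held constant between the two
    experiments, so that m = ln(rate_2 / rate_1) / ln(conc_2 / conc_1).
    """
    if min(conc_1, conc_2, rate_1, rate_2) <= 0:
        raise ValueError("concentrations and rates must be positive")
    if conc_1 == conc_2:
        raise ValueError("the two concentrations must differ")
    return math.log(rate_2 / rate_1) / math.log(conc_2 / conc_1)


# Hypothetical data: doubling [A] quadruples the initial rate.
m = partial_order(conc_1=0.1, rate_1=2.0e-3, conc_2=0.2, rate_2=8.0e-3)
print(round(m, 3))
```

With real data, several concentration pairs would normally be measured and the order estimated from the slope of log(rate) against log(concentration) rather than from a single pair.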
- enzyme promiscuity refers to the ability of an enzyme to catalyse a fortuitous side reaction in addition to its main reaction. Promiscuous activities are usually slow relative to the main activity.
- the alcohol kinase of the invention may in some embodiments also be suited to catalyse a phosphate phosphorylation.
- Stability can be measured at a selected temperature for a selected time period.
- a formulation comprising the enzymes whose stability is to be compared can be kept at 40°C for 2 weeks to 1 month, at which time stability is measured.
- the extent of aggregation during storage can be used as an indicator of protein stability.
- whether said genetically engineered eukaryotic cell, such as a yeast cell, is capable of phosphorylating primary alcohols into mono- or pyrophosphate terpenoid precursors may be determined in different manners.
- it is determined by a method comprising the steps of:
- the level of the primary alcohol subsequent to said incubation is at least 1% lower, such as at least 2%, such as at least 3%, such as at least 4%, such as at least 5%, such as at least 6%, such as at least 7%, such as at least 8%, such as at least 9%, such as at least 10%, such as at least 20%, such as at least 30%, such as at least 40%, such as at least 50% lower than the starting level.
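The comparison against the starting level amounts to a simple percentage calculation, sketched below in Python. The function names, the default threshold of 1%, and the example concentrations are illustrative assumptions; the levels may be in any unit as long as both measurements use the same one.

```python
def percent_decrease(start_level: float, end_level: float) -> float:
    """Percentage by which end_level is lower than start_level."""
    if start_level <= 0:
        raise ValueError("starting level must be positive")
    return 100.0 * (start_level - end_level) / start_level


def meets_depletion_threshold(start_level: float, end_level: float,
                              threshold_percent: float = 1.0) -> bool:
    """True if the primary alcohol level dropped by at least the threshold."""
    return percent_decrease(start_level, end_level) >= threshold_percent


# Hypothetical measurements before and after incubation (same unit).
print(percent_decrease(10.0, 8.0))
print(meets_depletion_threshold(10.0, 8.0, threshold_percent=10.0))
```

A negative result from `percent_decrease` would indicate that the measured level rose during incubation, which in practice would point to a measurement or sampling issue rather than to phosphorylation.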
- whether said genetically engineered eukaryotic cell is capable of phosphorylating a primary alcohol present in a culture medium into a mono- or pyrophosphate terpenoid precursor is determined by the steps of:
- when the genetically engineered eukaryotic cell according to the invention is incubated in a culture medium containing a predefined level of a primary alcohol, the sum of the levels of the different larger size alcohols determined is at least 1%, such as at least 2%, such as at least 3%, such as at least 4%, such as at least 5%, such as at least 6%, such as at least 7%, such as at least 8%, such as at least 9%, such as at least 10%, such as at least 20%, such as at least 30%, such as at least 40%, such as at least 50%, such as at least 60%, such as at least 70%, such as at least 80%, such as at least 90% of the starting level of the predefined primary alcohol having a known size.
- downstream terpene, terpenoid or isoprenoid compounds can be used as a surrogate measure of the ability of the AtFKI to phosphorylate a primary alcohol into a mono- or pyrophosphate terpenoid precursor.
- downstream compounds such as linalool, nerolidol, geranyllinalool, or squalene are measured and used as a marker of mono- or pyrophosphate terpenoid precursor production.
- when the genetically engineered eukaryotic cell according to the invention is incubated in an aqueous solution containing a predefined level of linalool, nerolidol, geranyllinalool, or squalene, and a predefined level of a primary alcohol, the molar increase in linalool, nerolidol, geranyllinalool, or squalene level after incubation is at least 25%, such as at least 30%, such as at least 40%, such as at least 50%, such as at least 60%, such as at least 70%, such as at least 80%, such as at least 90%, such as at least 95% higher than the predefined molar level of linalool, nerolidol, geranyllinalool, or squalene at the starting level.
- the incubation in the aqueous solution may be performed in any suitable manner.
- the incubation is made under conditions allowing growth and/or metabolic activity of said eukaryotic cell, such as a yeast cell.
- the incubation is performed at a temperature in the range of 5 to 35°C, such as in the range of 20 to 32°C.
- the aqueous solution should, in addition to the primary alcohol, also comprise components promoting cellular growth, such as yeast strain growth, including a carbon source and a nitrogen source and optionally buffers and salts.
- the incubation may for example be done for 12 - 24 hours or 1 - 21 days.
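By way of a non-limiting illustration, the threshold criteria set out above can be sketched in Python. The function names and the default thresholds (1% for the decrease in the primary alcohol, 25% for the molar increase in the marker compound) are assumptions chosen for this sketch from the lowest values recited above; the levels passed in are hypothetical inputs and not measured data.

```python
def percent_decrease(start_level: float, end_level: float) -> float:
    """Percent by which a level fell relative to its starting level."""
    if start_level <= 0:
        raise ValueError("start_level must be positive")
    return (start_level - end_level) / start_level * 100.0


def percent_increase(start_level: float, end_level: float) -> float:
    """Percent by which a level rose relative to its starting level."""
    if start_level <= 0:
        raise ValueError("start_level must be positive")
    return (end_level - start_level) / start_level * 100.0


def alcohol_is_consumed(start: float, end: float, threshold_pct: float = 1.0) -> bool:
    """True if the primary alcohol level dropped by at least threshold_pct."""
    return percent_decrease(start, end) >= threshold_pct


def marker_is_increased(start: float, end: float, threshold_pct: float = 25.0) -> bool:
    """True if the marker (e.g. linalool) rose by at least threshold_pct."""
    return percent_increase(start, end) >= threshold_pct
```

Any consistent unit (for example a molar concentration) may be used for the levels, since only ratios enter the calculation.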
- SEQ ID NO: 1 Amino acid sequence of Farnesol kinase of Arabidopsis thaliana UniProt accession Q67ZM7.
- SEQ ID NO: 3 Amino acid sequence of Isopentenyl phosphate kinase of Arabidopsis thaliana UniProt accession Q8H1F7.
- SEQ ID NO: 4 Amino acid sequence of Isopentenyl phosphate kinase of Methanolobus tindarius strain DSM 2278 UniProt accession W9DTD1
- SEQ ID NO: 5 Amino acid sequence of Isopentenyl phosphate kinase of Thermoplasma acidophilum (strain ATCC 25905 / DSM 1728 / JCM 9062 / NBRC 15155 / AMRC-C165)
- SEQ ID NO: 7 Amino acid sequence of Erg20p or Geranyl diphosphate synthase of Saccharomyces cerevisiae strain JAY291 UniProt accession C7GRZ5
- SEQ ID NO:8 Amino acid sequence of Erg20p or Geranyl diphosphate synthase of
- SEQ ID NO: 9 Amino acid sequence of (+)-limonene synthase of Citrus limon and encoded by the CILimS gene GenBank accession AAM53944.1.
- SEQ ID NO: 10 Amino acid sequence of Beta-myrcene synthase of Ocimum basilicum and encoded by the ObMyrS (MYS gene). UniProt accession Q5SBP1.
- SEQ ID NO: 11 Amino acid sequence of the Sabinene synthase of Salvia pomifera and encoded by the SpSabS gene. UniProt accession A6XH06.
- SEQ ID NO: 12 Amino acid sequence of α-Pinene synthase of Pinus taeda and encoded by the PtPinS gene. UniProt accession Q84KL3.
- SEQ ID NO: 13 Amino acid sequence of geranyldiphosphate: olivetolate geranyltransferase of Cannabis sativa and encoded by the CsPT4 gene. Uniprot: A0A455ZJC3
- SEQ ID NO: 15 Nucleotide coding sequence of Δ65AtFKI of Arabidopsis thaliana
- SEQ ID NO: 18 Nucleotide coding sequence of Isopentenyl phosphate kinase of Methanolobus tindarius strain DSM 2278. Gene MettiDRAFT_2389 Uniprot: W9DTD1.
- SEQ ID NO: 19 Nucleotide coding sequence of Isopentenyl phosphate kinase of Thermoplasma acidophilum (strain ATCC 25905 / DSM 1728 / JCM 9062 / NBRC 15155 / AMRC-C165).
- SEQ ID NO: 20 Nucleotide coding sequence of Erg20p or farnesyl diphosphate synthase of Saccharomyces cerevisiae strain S288C. Uniprot: P08524.
- SEQ ID NO: 21 Nucleotide coding sequence of (+)-limonene synthase of Citrus limon CILimS gene GenBank accession AF514287.1.
- SEQ ID NO: 24 Nucleotide coding sequence of α-Pinene synthase of Pinus taeda GenBank accession AF543530.1
- SEQ ID NO: 25 Nucleotide coding sequence of geranyldiphosphate: olivetolate geranyltransferase of Cannabis sativa GenBank accession BK010648
- SEQ ID NO: 26 Amino acid sequence of Erg20p or farnesyl diphosphate synthase of Saccharomyces cerevisiae strain 288c. UniProt accession P08524 comprising amino acid change (F96C) and indicated as Erg20p F96C
- SEQ ID NO: 27 Amino acid sequence of terpentetriene synthase from Streptomyces griseolosporeus (CYC2). UniProt accession Q9AJE3.
- SEQ ID NO: 28 Amino acid sequence of GGPP synthase of Synechococcus elongatus (SynGGPPS). UniProt accession Q2JX96.
- SEQ ID NO: 29 Amino acid sequence of GGPP synthase of Taxus canadensis (TcaGGPPS). UniProt accession Q9ZPM3.
- SEQ ID NO: 30 Amino acid sequence of nerylneryl diphosphate synthase of Lycopersicon esculentum (LycNNPPS). UniProt accession K7WQ45.
- SEQ ID NO: 31 Amino acid sequence of z-FPP synthase of Solanum habrochaites (zFPPS). UniProt accession B8XA40.
- SEQ ID NO: 32 Amino acid sequence of Salvia fruticosa trans-β-caryophyllene synthase (Sf126).
- SEQ ID NO: 33 Amino acid sequence of D373C variant of terpene cyclase of Priestia megaterium or Bacillus megaterium (BmeTC(373C)). UniProt accession D5DR56 (wild type).
- SEQ ID NO: 34 Amino acid sequence of cucurbitadienol synthase of Cucumis sativus (CPQ). UniProt accession A0A097IYL3.
- SEQ ID NO: 35 Nucleotide coding sequence of variant Erg20p F96C of Saccharomyces cerevisiae strain 288c.
- SEQ ID NO: 36 Nucleotide coding sequence of terpentetriene synthase from Streptomyces griseolosporeus (CYC2).
- SEQ ID NO: 37 Nucleotide coding sequence of GGPP synthase of Synechococcus elongatus (SynGGPPS)
- SEQ ID NO: 38 Nucleotide coding sequence of GGPP synthase of Taxus canadensis (TcaGGPPS)
- SEQ ID NO: 40 Nucleotide coding sequence of z-FPP synthase of Solanum habrochaites (zFPPS).
- SEQ ID NO: 41 Nucleotide coding sequence of Salvia fruticosa trans-β-caryophyllene synthase (Sf126)
- SEQ ID NO: 42 Nucleotide coding sequence of D373C variant of terpene cyclase of Priestia megaterium or Bacillus megaterium (BmeTC(373C)).
- SEQ ID NO: 44 Amino acid sequence of (2Z,6Z)-farnesyl diphosphate synthase from Solanum lycopersicum (Lycopersicon esculentum). UniProt code K7W9N9.
- AAAAAG AAATT AG AG AG ATT CTT G AACGTTTT CCCT AAATT AGTAG AG GAATTGAACGCATCGCTTTTGGCTTACGGTATGCCTAAGGAAGCATGTGACTGGTATGCCC ACT CATT G AACT ACAACACT CCAGGCGGT AAGCT AAAT AG AGGTTT GT CCGTT GTGGACAC GT ATGCT ATT CT CT CCAACAAG ACCGTT G AACAATTGGGGCAAGAAG AAT ACG AAAAGGTT GCCATT CT AGGTTGGT GCATT GAGTT GTTGCAGGCTT ACTT CTT GGT CGCCGAT GAT AT GA TGGACAAGTCCATTACCAGAAGAGGCCAACCATGTTGGTACAAGGTTCCTGAAGTTGGGG AAATTGCCAT CAAT G ACGCATT CAT GTT AG AGGCT GCT AT CT ACAAGCTTTT G AAAT CT CAC TT CAG AAACG AAAAAT ACT ACAT AG AT AT CACCG AATT GTT C
- AAAGGCCGAAAGGT ACATT AT CG ATT CTGGTGGT ATTT CT AG A GCCCATTTTTT G ACT AG ATGG AT GTT GT CT GTT AACGGCTT GTAT G AAT GGCCAAAGTT GTT TT ACTTGCCCCT GT CTTT GTT GTTGGTT CCAACTT AT GTT CCCTT GAACTT CT ACG AATT GT C T ACTT ACG CT AG AAT CCACTT CGTTCCTATGATGGTTGCT G GTAACAAG AAGTT CT CTTT G A CTT CT AG ACAT ACG CCAT CCTT GTCT CATTT G GAT GTT AG AG AACAG AAG CAAG AAT CCG A AG AAACT ACCCAAG AAT CT AG AGCCT CT AT CTT CTTGGTT GAT CACTT G AAACAATTGGCCT CTTT G CCAT CCT ACATT CACAAATT G G GTT AT CAAG CT GCT GAG AG GTAT ATGTTG G AAAG A ATT G AAAA
- The yeast strains used in this application were based on the EGY48 Saccharomyces cerevisiae strain disclosed in Ignea et al. (2011), Thomas B.J. and R. Rothstein (1989) and Ellerstrom M. et al. (1992), and modified according to Table 2.
- Table 2 Strain genotype
- Plasmids were generated using standard genetic engineering methods known in the art. Detailed protocols for plasmid construction can be found in general handbooks on molecular cloning. Genes were amplified by PCR and placed under the control of the dual inducible promoters PGAL1 and PGAL10. Gene coding sequences were then ligated using USER cloning (Nour-Eldin et al. (2010)) into the backbone of the pESC-URA, pESC-LEU, pESC-TRP, and pESC-HIS vectors (Agilent Technologies) to construct the plasmids listed in Table 3.
- the yeast cells were first cultured on selective complete minimal media with glucose at 30°C overnight.
- Selective complete minimal media consisted of 0.13% w/v dropout powder, 0.67% w/v yeast nitrogen base without amino acids with ammonium sulphate (YNB+AS), 2% w/v glucose.
- Dropout powder was purchased to lack leucine, histidine, uracil and tryptophan. When required, these four nutrients were added at 0.01-0.02% w/v.
- Cells were then harvested by centrifugation to remove medium and resuspended in selective minimal production media. This media was used to induce galactose promoters, with additional raffinose as an alternative carbon source.
- Selective minimal production media composition: 0.13% w/v dropout powder, 0.64% w/v YNB+AS, 2% galactose, 1% w/v raffinose.
- the same four nutrients as above were added at 0.01-0.02% w/v.
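As a worked illustration of the recipe above, the following Python sketch converts the stated percentages into grams for a chosen batch volume, using the convention that 1% w/v corresponds to 1 g per 100 ml. The dictionary and function names are hypothetical, and treating the "2% galactose" figure as w/v is an assumption, since the text does not state the basis for that component.

```python
# Production medium as recited above, in % w/v (g per 100 ml).
# Assumption: the 2% galactose figure is taken as w/v.
PRODUCTION_MEDIUM_PCT_WV = {
    "dropout powder": 0.13,
    "YNB+AS": 0.64,
    "galactose": 2.0,
    "raffinose": 1.0,
}


def grams_needed(pct_wv: dict, volume_ml: float) -> dict:
    """Grams of each component for volume_ml of medium."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return {name: pct / 100.0 * volume_ml for name, pct in pct_wv.items()}


if __name__ == "__main__":
    for name, grams in grams_needed(PRODUCTION_MEDIUM_PCT_WV, 1000.0).items():
        print(f"{name}: {grams:.2f} g per litre")
```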
- the cultures were grown at 30°C, 150 rpm, for the indicated time, and analyzed using GC-FID and/or SPME sampling and GC-MS. Isopropylmyristate (IPM) was added as an overlay corresponding to 10% of the culture volume in samples analyzed with GC-FID.
- 500 µL of yeast culture was transferred to a 2 ml tube containing approx. 100 mg of 0.5 mm glass beads and 500 µL of ethyl acetate with 0.05% formic acid, and vortexed for 3 minutes, followed by a 1 minute centrifugation at 20,000 g. The procedure was repeated three times; each time, the resulting supernatant was transferred to separate 1.5 ml tubes and 500 µL of ethyl acetate with 0.05% formic acid was added to the original. The resulting solution was evaporated using vacuum centrifugation and the resulting pellet was suspended in 200 µL of HPLC-grade methanol. Insoluble compounds were removed using 0.22 µm Ultrafree centrifugal filters (Merck Millipore, Tullagreen, Ireland).
- AtFKI is a superior kinase
- Yeast cells were grown overnight in synthetic complete minimal glucose selective media (CM-GLU), washed twice and transferred to synthetic complete minimal galactose selective media (CM-GAL) to induce heterologous gene expression.
- An aliquot of 5 mL of the culture was transferred to a 35 mL vial, which was incubated for approximately 72 h, and the headspace above the culture was analyzed by SPME and GC-MS.
- CM-GLU synthetic complete minimal glucose selective media
- CM-GAL synthetic complete minimal galactose selective media
- a truncated form of AtFKI missing the first 65 amino acids is equally active as full-length AtFKI in yeast
- AtFKI is a chloroplastic enzyme and therefore contains an N-terminal chloroplast targeting signal. Therefore, the active form of AtFKI in the plant cells is a shorter form of the full-length enzyme.
- AtFKI with the first 65 amino acids removed and replaced with a start codon (Δ65AtFKI; SEQ ID NO: 2) was expressed to compare the activity of the full native and the truncated version of the enzyme.
- the full and truncated enzymes were each expressed from a corresponding galactose-inducible pUUS plasmid in combination with AtIPK.
- Yeast strain EGY48 transformed with these plasmids was grown and analyzed as described in Example 1. It was found that the production of linalool was similar whether the full enzyme or the truncated version was expressed. These results demonstrate that the first 65 amino acids of AtFKI are not necessary for full function, and that the region 66-end is a shorter, equally functional form of AtFKI. Strains used: NCTY 7 and NCTY 8.
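The truncation described in this example can be illustrated at the protein level with a short Python sketch. It assumes that the introduced start codon is translated as methionine and that the truncated protein therefore consists of "M" followed by residues 66 to the end. The function name and the placeholder sequence are hypothetical; the placeholder is not the sequence of AtFKI.

```python
def truncate_n_terminus(protein: str, n_removed: int = 65) -> str:
    """Drop the first n_removed residues and prepend a start methionine."""
    if n_removed < 0 or n_removed >= len(protein):
        raise ValueError("n_removed must be smaller than the protein length")
    return "M" + protein[n_removed:]


if __name__ == "__main__":
    # Placeholder only: 65 leading residues followed by a short tail.
    placeholder = "M" + "A" * 64 + "KLVDE"
    print(truncate_n_terminus(placeholder))
```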
- An alcohol phosphorylation pathway comprising AtFKI performs well in combination with various phosphokinases.
- AtFKI was expressed together with AtIPK, MtIPK, TaIPK or the TaIPK(204G) mutant and the headspace was analysed using GC-MS-SPME, as described in Examples 1 and 2. Strains used: NCTY 7, NCTY 9, NCTY 10, and NCTY 11.
- In Fig. 5, the production of limonene using AtFKI and AtIPK is compared to the production achieved in the absence of an alcohol. Further, an additional 80-fold increase in myrcene production was obtained using the alcohol feeding pathway, i.e. co-expression of AtFKI-AtIPK and ObMyrS in yeast. Media with 0.05% prenol and 0.05% isoprenol was used.
- Fig. 5 evidence is presented that different ratios and concentrations of prenol and isoprenol can be used to further enhance or control terpene production.
- the data demonstrate the concentration and ratio of prenol and isoprenol can be used to control the total production.
- the overall concentration of alcohols was 0.1%.
- Fig. 5a and 5b a similar increase in production of terpenes is demonstrated with AtFKI and AtIPK co-expressed with myrcene synthase or sabinene synthase, respectively.
- AtFKI boosts cannabinoid production from prenol and isoprenol.
- Cannabigerolic acid is a central intermediate in the synthesis of natural cannabinoids.
- Co-expression of the alcohol feeding pathway AtFKI-AtIPK and the CBGA-synthesizing fusion Erg20p(N127W)-CsPT4 in yeast cells (strain NCTY 19) resulted in 7.5-fold increased CBGA production (pink) compared to the production titer without feeding alcohol (black).
- Fig. 8 demonstrates that novel compounds are being produced when the prenol-like alcohols with additional methyl groups are converted with AtFKI + AtIPK + CILimS.
- Protein engineering of the terpene synthase facilitates the production of non-canonical terpenes.
- terpene synthases can indeed be mutated to utilize the novel non-canonical building block instead of the canonical building block, i.e., the mutants favour the larger non-canonical substrates over the canonical GPP.
- AtFKI can provide non-canonical precursors for the synthesis of cannabinoids with unprecedented structures.
- Fig. 12 demonstrates the capability of the system to convert a range of alcohols into the corresponding diphosphates, which are then attached to olivetolic acid to yield cannabigerolic acid (CBGA) analogues with novel non-canonical structures.
- CBGA cannabigerolic acid
- Yeast cells expressing Erg20p(127W)-CsPT4 alone (NCTY 20; blue) or AtFKI-AtIPK alone (NCTY 21b; purple) are shown as control 1 and control 2, respectively. a, 3-methylpent-2-en-1-ol was used as feed to produce compound 1 (2,4-dihydroxy-3-(3-methylpent-2-en-1-yl)-6-pentylbenzoic acid), compound 2 (3-((2E)-3,7-dimethylnona-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid) and compound 3 (2,4-dihydroxy-6-pentyl-3-(3,4,7-trimethylnona-2,6-dien-1-yl)benzoic acid). b, 3,4-dimethylpent-2-en-1-ol was used as feed to produce compound 4 (3-(3,4-dimethylpent-2-en-1-yl)-2,4
- FIG. 13 shows the production of non-canonical sesquiterpene scaffolds with 16 and 17 carbons by co-expression of the alcohol conversion pathway AtFKI and AtIPK, together with the farnesyl pyrophosphate (FPP) synthase Erg20p and the terpene synthase CYC2 in S. cerevisiae strain EGY48.
- the cells were grown according to the conditions described in Example 6.
- GGPP geranylgeranyl pyrophosphate
- Example 9 demonstrates that a yeast cell expressing AtFKI is capable of phosphorylating the core alcohol structures shown in Figures 13-17 (3-methylbut-2-en-1-ol (3M2E), 3,4-dimethylpent-2-en-1-ol (3,4-DMP), 3-methylpent-2-en-1-ol (3-MP), 3-ethylpent-2-en-1-ol (3E2E)) to produce C16 and C17 (non-canonical) terpenes.
- 3M2E 3-methylbut-2-en-1-ol
- 3,4-DMP 3,4-dimethylpent-2-en-1-ol
- 3-MP 3-methylpent-2-en-1-ol
- 3E2E 3-ethylpent-2-en-1-ol
- This example illustrates the biosynthesis of a non-canonical sesquiterpene by the action of the enzyme Salvia fruticosa caryophyllene synthase (Sf126) on C16 prenyl diphosphate substrates produced by the alcohol conversion pathway.
- the alcohol conversion pathway enzymes AtFKI and AtIPK were co-produced in yeast EGY48 cells with the FPP synthase Erg20p(F96C) and the terpene synthase Sf126. 3-methylpent-2-en-1-ol (3-MP) was supplied (Fig. 18).
- FIG. 19 shows the production of C31 (homo)squalene by co-expression of the alcohol conversion pathway AtFKI-AtIPK and the FPP synthase Erg20p(F96C) upon feeding 3-methylpent-2-en-1-ol (3-MP) and prenol.
- the yeast strain EGY48 was used and the cells were cultured under the conditions as described in Example 6.
- Example 12 illustrates the biosynthesis of non-canonical triterpenes.
- Fig. 20 shows the production of a non-canonical triterpenoid product by co-expression of the alcohol conversion pathway AtFKI-AtIPK and cucurbitadienol synthase CPQ.
- Fig. 21 shows the production of another non-canonical triterpenoid product by co-expression of the alcohol conversion pathway AtFKI-AtIPK and (+)-ambrein synthase BmeTC(373C).
- Examples 9-12 demonstrate that yeast cells expressing AtFKI are capable of phosphorylating a variety of primary alcohols and generating non-canonical triterpenoids.
- a genetically engineered eukaryotic cell for the production of a terpene or terpenoid or isoprenoid comprising a first nucleic acid sequence encoding a first kinase that phosphorylates a primary alcohol to a mono- or pyrophosphate terpenoid precursor; and optionally a second nucleic acid sequence encoding a phosphokinase that phosphorylates a monophosphate precursor to a terpenoid pyrophosphate precursor; wherein the first kinase comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto.
- Item 2 The cell according to item 1 wherein the first kinase is an alcohol kinase capable of phosphorylating a primary alcohol to a monophosphate terpenoid precursor.
- Item 3 The cell according to either item 1 or 2 wherein said first nucleic acid sequence encodes a kinase that is capable of phosphorylating a non-canonical, prenol-like primary alcohol to a non-canonical monophosphate terpenoid precursor.
- Item 4 The cell according to any of the preceding items, wherein said first nucleic acid encodes a kinase that also has phosphokinase activity thus being capable of catalyzing the conversion of a primary alcohol to a terpenoid pyrophosphate precursor.
- Item 5 The cell according to any one of the preceding items, wherein the alcohol kinase is Farnesol kinase of Arabidopsis thaliana or SEQ ID NO: 1 or a homolog or variant thereof having at least 75% identity thereto.
- Item 6 The cell according to any one of the preceding items, wherein the phosphokinase is a prenyl phosphate kinase, such as an isopentenyl phosphate kinase.
- Item 7 The cell according to item 6, wherein the phosphokinase is SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a homologue or variant thereof having at least 75% identity thereto.
- Item 8 The cell according to any one of the preceding items wherein the kinase or kinases are fused to one or more peptide or peptide analogues.
- Item 9 The cell according to item 8, wherein said one or more peptide or peptide analogues confer additional functionality to the kinase or kinases, such as an improved enzyme kinetics, or intracellular localisation functionality, or the functionality of increasing the stability, promiscuity and / or half-life of the kinase or kinases.
- Item 10 The cell according to item 9, wherein the peptide or peptide analogue is maltose-binding protein, green fluorescent protein, thioredoxin, glutathione S-transferase, yeast farnesyl diphosphate synthase (Erg20p), NusA, small ubiquitin related modifier Smt3, or a fragment thereof.
- the primary alcohol is an alcohol with the structure of formula 1:
- R1 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, a group containing a reactive nonmetal; a group containing a metalloid; a group containing a halogen, a group containing oxygen, a group containing nitrogen, a group containing sulfur, a group containing phosphorus and / or a group containing boron;
- R2 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, a group containing a reactive nonmetal; a group containing a metalloid; a group containing a halogen, a group containing oxygen, a group containing nitrogen, a group containing sulphur, a group containing phosphorus and / or a group containing boron; and
- R3 is hydrogen, methyl, fluorine, chlorine, bromine, iodine, sulfhydryl, or hydroxyl.
- Item 12 The cell according to item 11, wherein the primary alcohol is 3-methylbut-2-en-1-ol, 4-fluoro-3-methylbut-2-en-1-ol, 3-methylpent-2-en-1-ol, 3,4-dimethylpent-2-en-1-ol, 3-ethylpent-2-en-1-ol, 3-methylhex-2-en-1-ol, 3-methylhexa-2,5-dien-1-ol, 3-methylbut-3-en-1-ol, 3-methylenepentan-1-ol, 2-methylprop-2-en-1-ol, 3-methyl-4-(methylthio)but-2-en-1-ol, 5-chloro-3-methylpent-2-en-1-ol, 3,4-dimethylpent-2-en-1-ol, 4-methyl-3-methylenepentan-1-ol, 3,4-dimethylpent-3-en-1-ol, propan-1-ol, prop-2-en-1-ol, prop-2
- Item 13 The cell according to item 11, wherein the primary alcohol is prenol, isoprenol or a prenol-like alcohol.
- Item 14 The cell according to any one of the preceding items, further comprising a further exogenous nucleic acid sequence enabling increased expression of an enzyme capable of catalysing the production of canonical and / or non-canonical terpenes, terpenoids, isoprenoids or structures containing isoprenoid groups.
- the cell according to item 14 wherein the exogenous nucleic acid sequence enabling increased expression encodes a terpene synthase enzyme such as a monoterpene synthase, a sesquiterpene synthase, a diterpene synthase, a sesterterpene synthase or a triterpene synthase or a fragment thereof; or a prenyltransferase enzyme or a fragment thereof; or other enzymes or a fragment thereof capable of catalysing the production of canonical or non-canonical terpenes, terpenoids, isoprenoids or structures containing isoprenoid groups.
- Item 16 The cell according to item 15 wherein the terpene synthase or the prenyl transferase or other enzyme is capable of using non-canonical terpenoid or isoprenoid building blocks as substrate.
- Item 17 The cell according to either item 15 or 16 wherein the terpene synthase enzyme or fragment thereof or a prenyl transferase enzyme or fragment thereof or other enzymes or a fragment thereof capable of catalysing the production of canonical terpenes, terpenoids, isoprenoids or structures containing isoprenoid groups, comprises a change in the amino acid sequence that enables improved enzyme kinetics for utilisation of non-canonical terpenoid or isoprenoid building blocks.
- Item 18 The cell according to any one of the preceding items, said cell being capable of production of a terpene or terpenoid selected from the group comprising:
- Limonene, myrcene, alpha-pinene, sabinene, beta-pinene, 1,8-cineole, tricyclene, alpha-thujene, alpha-fenchene, camphene, delta-2-carene, alpha-phellandrene, 3-carene, 1,4-cineole, alpha-terpinene, beta-phellandrene, (Z)-beta-ocimene, (E)-beta-ocimene, gamma-terpinene, terpinolene, linalool, perillene, allo-ocimene, cis-beta-terpineol, cis-terpine-1-ol, isoborneol, delta-terpineol, borneol, chrysanthemol, lavandulol
- the eukaryotic cell is a yeast cell, such as a Saccharomyces spp., Saccharomyces cerevisiae, Schizosaccharomyces pombe, Scheffersomyces stipitis, Pichia pastoris, Hansenula polymorpha (syn. Ogataea parapolymorpha), Kluyveromyces marxianus, Yarrowia lipolytica, Kluyveromyces lactis, or Dekkera bruxellensis cell.
- Item 20 The cell according to any one of items 1 - 18 wherein the eukaryotic cell is a filamentous fungi cell, such as a cell derived from Aspergillus niger, Aspergillus oryzae, Aspergillus terreus, Neurospora crassa, or Trichoderma reesei.
- Item 21 The cell according to any one of items 1 - 18, wherein the eukaryotic cell is an algal cell, such as a microalgae cell such as Nannochloropsis gaditana, Nannochloropsis oceanica, Nannochloropsis salina, Chlamydomonas reinhardtii, Arthrospira, Chlorella vulgaris, Dunaliella salina, Haematococcus pluvialis, Phaeodactylum tricornutum, or Isochrysis galbana.
- Item 22 A method for production of a terpene, terpenoid or an isoprenoid, said method comprising the steps of:
- Item 23 The method according to item 22 wherein said cell further comprises an exogenous nucleic acid sequence coding for a phosphokinase.
- Item 24 The method according to either of items 22 or 23 wherein the exogenous DNA sequence coding for a primary alcohol kinase encodes an alcohol kinase comprising SEQ ID NO: 2 or a homolog or variant thereof having at least 75% identity thereto.
- Item 25 The method according to either of items 22 or 23, wherein the exogenous DNA sequence coding for a primary alcohol kinase encodes an alcohol kinase comprising SEQ ID NO: 1 or a homolog or variant thereof having at least 75% identity thereto.
- Item 26 The method according to any one of items 22 - 25 wherein the primary alcohol is an alcohol with a structure according to item 11.
- Item 27 The method according to item 26 wherein the primary alcohol is an alcohol of item 12.
- Item 28 The method according to any one of items 22 - 27 wherein the primary alcohol is at an initial concentration within a range of 0.01% to 1% v/v, such as within a range of 0.05% to 0.6% v/v, such as within a range of 0.1% to 0.3% v/v, such as 0.1% v/v.
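For orientation only, the initial concentration ranges recited in item 28 can be expressed as a small Python sketch that reports which of the nested ranges a chosen concentration falls within and how much alcohol that corresponds to for a given culture volume. The names used are hypothetical, and the 5 ml culture volume in the usage lines is merely an example value.

```python
# Nested ranges recited in item 28, in % v/v, widest first.
RECITED_RANGES_PCT_VV = [(0.01, 1.0), (0.05, 0.6), (0.1, 0.3)]


def alcohol_volume_ul(culture_volume_ml: float, pct_vv: float) -> float:
    """Microlitres of alcohol giving pct_vv (% v/v) in culture_volume_ml."""
    if culture_volume_ml <= 0 or pct_vv < 0:
        raise ValueError("volume must be positive and percentage non-negative")
    return culture_volume_ml * 1000.0 * pct_vv / 100.0


def ranges_containing(pct_vv: float) -> list:
    """Recited ranges (inclusive bounds) that contain pct_vv."""
    return [(lo, hi) for lo, hi in RECITED_RANGES_PCT_VV if lo <= pct_vv <= hi]


if __name__ == "__main__":
    print(alcohol_volume_ul(5.0, 0.1))  # alcohol volume for a 5 ml culture
    print(ranges_containing(0.1))
```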
Abstract
The invention concerns a genetically engineered eukaryotic cell and a method for the production of a terpene or terpenoid or isoprenoid, said cell comprising a first nucleic acid sequence encoding a first kinase such as AtFKI or a homologue or variant thereof having at least 75% identity thereto that phosphorylates a primary alcohol to a mono- or pyrophosphate terpenoid precursor; and optionally a second nucleic acid sequence encoding a phosphokinase that phosphorylates a monophosphate precursor to a terpenoid pyrophosphate precursor.
Description
CELLS AND METHOD FOR PRODUCING ISOPRENOID MOLECULES WITH CANONICAL AND NON-CANONICAL STRUCTURES
FIELD OF THE INVENTION
The present invention relates to the production of isoprenoids in eukaryotic cells.
BACKGROUND OF THE INVENTION
Isoprenoids are widely used as pharmaceuticals, cosmetics, nutraceuticals, flavours, fragrances, and pesticides, and have recently also found applications as drop-in jet fuels and biopolymers.
Extraction of these valuable compounds from natural sources can hardly meet the increasing demand, while organic chemical synthesis is often inefficient, particularly in the case of compounds having complex structures.
Eukaryotic microorganisms, such as yeasts, are considered good hosts for high-value compound production because of their capacity to synthesize complex chemical structures. However, production of isoprenoids in microorganisms currently depends on the feeding of sugar. This can frequently be a challenging and inefficient process, as microorganisms prefer to use the sugar for growth and other metabolic processes instead of producing the desired product.
Isoprenoids are synthesized by the successive addition of 5-carbon containing building blocks and, as a result, their structures mostly contain multiples of five carbon atoms. Canonically, the biosynthesis of isoprenoids and related compounds like cannabinoids, monoterpene indole alkaloids, and prenylated aromatic compounds, relies either on the MEP pathway and/or the MVA pathway, both leading to the formation of the 5-carbon precursors DMAPP and IPP that are condensed into diphosphate precursors with longer chains (i.e. GPP (C10), FPP (C15), GGPP (C20), GFPP (C25), etc.). These precursors are then converted into a wide array of compounds by terpene synthases (TS) or appended to other non-isoprenoid skeletons by specific prenyltransferases (as in the case of cannabinoids or prenylated flavonoids). These pathways suffer from inherent limitations due to their length, complex regulation, and extensive cofactor requirements.
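The carbon arithmetic behind this paragraph can be made explicit with a short Python sketch. The canonical chain lengths follow directly from the number of 5-carbon units. The non-canonical case is shown only as a hypothetical composition, assuming that one 6-carbon unit (such as one derived from 3-methylpent-2-en-1-ol, which has six carbons) takes the place of one 5-carbon unit; the function and variable names are illustrative.

```python
# Number of 5-carbon units in each canonical diphosphate precursor.
CANONICAL_C5_UNITS = {"GPP": 2, "FPP": 3, "GGPP": 4, "GFPP": 5}


def canonical_carbons(name: str) -> int:
    """Carbon count of a canonical precursor built only from C5 units."""
    return 5 * CANONICAL_C5_UNITS[name]


def chain_carbons(unit_sizes: list) -> int:
    """Carbon count of a chain assembled from units of the given sizes."""
    return sum(unit_sizes)


if __name__ == "__main__":
    for name in CANONICAL_C5_UNITS:
        print(name, canonical_carbons(name))
    # Hypothetical non-canonical chain: one C6 unit plus two C5 units.
    print("non-canonical example", chain_carbons([6, 5, 5]))
```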
The isopentenol utilization pathway (IUP) can produce isopentenyl diphosphate or dimethylallyl diphosphate, the main precursors to isoprenoid synthesis, through sequential phosphorylation of the isopentenol isomers isoprenol or prenol (Chatzivasileiou et al., PNAS, 2019). This alternative pathway converts primary alcohols into the corresponding pyrophosphates.
Conversion systems for the production of DMAPP- and IPP-derived compounds from prenol and isoprenol in bacteria have, however, only been demonstrated for obtaining canonical isoprenoids. Moreover, the enzymes used in these bacterial systems have not been demonstrated to be functional in yeast and have not been found to be efficient in yeast. One currently available system uses the combination of enzymes ScCKI (Cki1p) and AtIPK, which have been shown to convert prenol to isoprenoid precursors in E. coli. Lund et al., ACS Synth. Biol., 2019, disclose an artificial alcohol-dependent hemiterpene biosynthetic pathway coupled to several isoprenoid biosynthetic systems, affording lycopene and prenylated tryptophan in robust yields. These systems function very inefficiently (or not at all) when the key enzymes are expressed in yeast.
Another bioconversion system utilizes the enzymes SfPhoN and AtIPK. This system has also been shown to function in E. coli. However, the activity of this system is non-existent or very low when it is expressed in yeast (https://pubs.acs.org/doi/10.1021/acssynbio.8b00383).
Johnson et al., Angew Chem Int Ed Engl., 2020, present a use of a single non-canonical building block for isoprenoid synthesis in vitro; this system has not been shown to be functional in eukaryotic cells.
SUMMARY
Therefore, it is a first object of the invention to present a new system for the conversion of a primary alcohol to a mono- or pyrophosphate terpenoid precursor compound in a eukaryotic cell such as yeast. A further object of the invention is to present a new system for the two-step conversion of isoprenol, prenol and prenol-like alcohols to terpene precursor compounds in eukaryotic cells such as yeast. The activity of this new system is fulfilled by alcohol kinase enzymes such as Arabidopsis thaliana farnesol kinase (AtFKI) and optionally an isopentenyl phosphate kinase (AtIPK) or another prenyl phosphate kinase, its yield being considerably higher compared to the other systems described in the background section herein above.
Thereby, in a first aspect the invention relates to cells comprising nucleic acid sequences encoding kinases or parts thereof, e.g., polypeptides or polypeptide analogues with kinase activity as described herein. Specifically, the invention relates to a genetically engineered eukaryotic cell for production of an isoprenoid (or terpene or terpenoid) comprising a first nucleic acid sequence encoding a first kinase that converts a primary alcohol to a mono- or pyrophosphate isoprenoid precursor; and optionally a second nucleic acid sequence encoding a phosphokinase that converts a monophosphate precursor to an isoprenoid pyrophosphate precursor; wherein the first kinase comprises SEQ ID NO: 2 or a homolog or variant thereof having at least 75% identity thereto while exhibiting kinase activity.
Hence, the present invention provides a genetically engineered eukaryotic cell for the production of a terpene or terpenoid comprising a first nucleic acid sequence encoding a first kinase that phosphorylates a primary alcohol to a mono- or pyrophosphate terpenoid precursor; and optionally a second nucleic acid sequence encoding a phosphokinase that phosphorylates a monophosphate precursor to a terpenoid pyrophosphate precursor wherein the first kinase comprises SEQ ID NO: 2 or a homolog or variant thereof having at least 75% identity thereto while exhibiting kinase activity.
Also provided herein are vectors comprising the above nucleic acids, as well as host cells comprising said vectors and/or said nucleic acids or polypeptides.
Also provided is the use of above polypeptides, nucleic acids, vectors or host cells for the production of mono- or pyrophosphate isoprenoid precursors, such as terpenoid precursors.
The nucleic acids may be comprised in a vector, e.g., a plasmid, cosmid, virus, or another vector used, e.g., conventionally in genetic engineering. The vector may comprise further sequences such as marker sequences, which allow for the selection of the vector in a suitable host cell and under suitable conditions. Furthermore, the vector may comprise expression control elements allowing proper expression of the coding regions in suitable hosts. Such control elements are known to the person skilled in the art and may include a promoter, a splice cassette, and a translation initiation codon, amongst others. As an alternative or in addition to using a vector, the nucleic acids may be integrated in a certain chromosomal locus in the employed cell, in combination with the expression control elements described above.
Preferably, the nucleic acid of the invention is operatively attached to expression control elements allowing expression in eukaryotic cells. Control elements ensuring expression in eukaryotic cells are well known to those skilled in the art and are further explained herein below.
Methods for construction of nucleic acid molecules, for construction of vectors comprising nucleic acid molecules, for introduction of vectors into appropriately chosen host cells, for insertion of DNA fragments into genomic loci of said cells, or for causing or achieving expression of nucleic acid molecules are well-known in the art. Further detail and exemplary methods are detailed herein below.
The system advantageously enables the biocatalytic synthesis of isoprenoids in eukaryotic microorganisms by a method that by-passes the MEP pathway and/or the MVA pathway for the production of DMAPP and IPP.
In a first embodiment, said first nucleic acid sequence encodes a kinase that is capable of both alcohol phosphorylation and phosphate phosphorylation. Thereby, said nucleic acid sequence encodes a single kinase enzyme with bi-catalytic activity and capable of sequential phosphorylation of alcohol and monophosphate substrates.
In another embodiment, the first nucleic acid sequence encodes a kinase that is capable of phosphorylating a primary alcohol to a monophosphate terpenoid or isoprenoid precursor. Said kinase may comprise SEQ ID NO: 2 or a homolog or variant thereof having at least 75% identity thereto, such as at least 80% identity thereto, such as at least 85% identity thereto, such as at least 90% identity thereto, such as at least 95% identity thereto, such as at least 96% identity thereto, such as at least 97% identity thereto, such as at least 98% identity thereto, such as at least 99% identity thereto, while exhibiting kinase activity.
In a preferred embodiment, said nucleic acid sequence encodes an alcohol kinase that is capable of phosphorylating a non-canonical, prenol-like primary alcohol to a non-canonical monophosphate isoprenoid precursor.
In another embodiment, the kinase may comprise SEQ ID NO: 1 or a homolog or variant thereof having at least 75% identity thereto, such as at least 80% identity thereto, such as at least 85% identity thereto, such as at least 90% identity thereto, such as at least 95% identity thereto, such as at least 96% identity thereto, such as at least 97% identity thereto, such as at least 98% identity thereto, such as at least 99% identity thereto, while exhibiting kinase activity.
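By way of illustration only, the tiered identity thresholds recited above (at least 75% up to at least 99%) can be checked with a short calculation. The sketch below assumes that the two sequences have already been aligned by a separate tool and that percent identity is counted over all alignment columns in which at least one sequence has a residue; this denominator, the function names and the toy sequences are assumptions made for illustration and are not taken from the specification. Kinase activity of a variant still has to be established experimentally.

```python
# Identity tiers recited in the embodiments above (percent).
IDENTITY_TIERS = (75, 80, 85, 90, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: columns where both sequences have a gap are ignored;
    every other column counts towards the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def highest_tier_met(identity: float):
    """Return the highest recited tier satisfied, or None if below 75%."""
    met = [tier for tier in IDENTITY_TIERS if identity >= tier]
    return max(met) if met else None


if __name__ == "__main__":
    # Toy sequences, NOT SEQ ID NO: 1 or SEQ ID NO: 2.
    reference = "MKTAYIAKQR"
    variant = "MKTAYLAKQR"
    identity = percent_identity(reference, variant)
    print(identity, highest_tier_met(identity))
```

A variant falling below the lowest tier returns `None`, which corresponds to a sequence outside the recited identity range.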
In a further embodiment, the engineered cells can further include an exogenous nucleic acid sequence encoding a phosphate kinase, i.e., a phosphokinase, such as prenyl phosphate kinase (or isopentenyl phosphate kinase, IPK) that can phosphorylate a phosphate precursor, e.g., dimethylallyl phosphate (DMAP), to dimethylallyl pyrophosphate (DMAPP) or isopentenyl phosphate (IP) to isopentenyl pyrophosphate (IPP).
In a preferred embodiment, said phosphokinase comprises Arabidopsis thaliana IPK (AtIPK) SEQ ID NO: 3; or a homolog or variant thereof having at least 75% identity thereto, such as at least 80% identity thereto, such as at least 85% identity thereto, such as at least 90% identity thereto, such as at least 95% identity thereto, such as at least 96% identity thereto, such as at least 97% identity thereto, such as at least 98% identity thereto, such as at least 99% identity thereto, while exhibiting kinase activity.
In another embodiment, said phosphokinase comprises MtIPK, SEQ ID NO: 4; or a homolog or variant thereof having at least 75% identity thereto, such as at least 80% identity thereto, such as at least 85% identity thereto, such as at least 90% identity thereto, such as at least 95% identity thereto, such as at least 96% identity thereto, such as at least 97% identity thereto, such as at least 98% identity thereto, such as at least 99% identity thereto, while exhibiting kinase activity.
In another embodiment, said phosphokinase comprises TaIPK, SEQ ID NO: 5; or a homolog or variant thereof having at least 75% identity thereto, such as at least 80% identity thereto, such as at least 85% identity thereto, such as at least 90% identity thereto, such as at least 95% identity thereto, such as at least 96% identity thereto, such as at least 97% identity thereto, such as at least 98% identity thereto, such as at least 99% identity thereto, while exhibiting kinase activity.
In a preferred embodiment, said phosphokinase comprises TaIPK(204G), SEQ ID NO: 6.
In other embodiments, the phosphokinase comprises ScCK (ScCKI, Cki1p), ScMK, EcGK, EcHK, HvIPK, MjIPK, or TaIPK-3m.
In another embodiment, said phosphokinase is capable of phosphorylating a non-canonical monophosphate isoprenoid precursor, resulting in a non-canonical pyrophosphate isoprenoid precursor, such as 3-ethylpent-2-en-1-yl-diphosphate, 4-fluoro-3-methylbut-2-en-1-yl-diphosphate, 3-methyl-4-(methylthio)but-2-en-1-yl-diphosphate, 3-methylpent-2-en-1-yl-diphosphate, 3-methylhex-2-en-1-yl-diphosphate, or 3,4-dimethylpent-2-en-1-yl-diphosphate.
Contacting canonical or non-canonical monophosphate precursors with phosphokinases advantageously enables an alternative pathway for the production of canonical or non-canonical pyrophosphate isoprenoid precursors, which can be further converted to canonical and non-canonical isoprenoids by the action of, e.g., terpene synthases and / or prenyltransferases.
The combined action of one or more of these enzymes provides an isoprenoid biosynthetic pathway that allows de-coupling of isoprenoid biosynthesis from biomass production and enables channelling more substrate into product, thus providing a non-competitive system. Thereby, a 10- to 100-fold increase in production titers of canonical isoprenoid compounds can be achieved. Thus, it is a highly efficient method to avoid common bottlenecks in currently used methods for the production of isoprenoids in yeast and other eukaryotic cells.
In another embodiment, the kinase or kinases according to the invention are fused to one or more peptides or peptide analogues resulting in fusion proteins. Said fusion proteins may, depending on the functional characteristics of the said peptide or peptide analogue, advantageously confer additional functionality to the kinase or kinases of the invention. For example, such a peptide or peptide analogue may allow improved enzyme kinetics of the kinase domain or domains; an intracellular localisation peptide may increase the rate of localisation of the kinase or kinases according to the invention to sub-cellular organelles, such as chloroplasts, mitochondria or peroxisomes, via a chloroplastic, mitochondrial or peroxisomal targeting signal, respectively, thereby increasing the enzyme kinetics of the enzymes according to the invention. Likewise, stability-increasing and half-life-increasing peptides may contribute to a longer activity of the enzymes by reducing protein turnover, thus increasing the concentration of active enzymes and the total catalytic activity. Enzymatic promiscuity may be increased by the fusion of a kinase according to the invention to a peptide or peptide analogue comprising an additional domain, such as a kinase domain, such as a phosphokinase domain, and / or one or more peptide sequences improving enzyme kinetics. Moreover, fusion to specific domains or peptides can facilitate the correct folding of the kinase, or can improve the solubility of the kinase-containing polypeptide, resulting in higher overall intracellular activity.
In a preferred embodiment, the peptide or peptide analogue is maltose-binding protein, green fluorescent protein, thioredoxin, glutathione S-transferase, yeast farnesyl diphosphate synthase (Erg20p), ATP-synthase, CTP synthase, GTP synthase, UTP synthase, NusA, or small ubiquitin related modifier Smt3, or a fragment thereof.
In another embodiment, the peptide consists of naturally encoded amino acid residues, i.e., amino acids found in the genetic code.
In one embodiment, the primary alcohol is prenol, isoprenol or a prenol-like primary alcohol.
In another embodiment, the primary alcohol is an alcohol with the structure of formula 1:
wherein R1 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, a group containing a reactive nonmetal, a group containing a metalloid, a group containing a halogen, a group containing oxygen, a group containing nitrogen, a group containing sulphur, a group containing phosphorus and / or a group containing boron;
R2 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, a group containing a reactive nonmetal, a group containing a metalloid, a group containing a halogen, a group containing oxygen, a group containing nitrogen, a group containing sulphur, a group containing phosphorus and / or a group containing boron; and R3 is hydrogen, methyl, fluorine, chlorine, bromine, iodine, sulphydryl, or hydroxyl.
In a preferred embodiment, the primary alcohol is 3-methylbut-2-en-1-ol, 4-fluoro-3-methylbut-2-en-1-ol, 3-methylpent-2-en-1-ol, 3,4-dimethylpent-2-en-1-ol, 3-ethylpent-2-en-1-ol, 3-methylhex-2-en-1-ol, 3-methylhexa-2,5-dien-1-ol, 3-methylbut-3-en-1-ol, 3-methylenepentan-1-ol, 2-methylprop-2-en-1-ol, 3-methyl-4-(methylthio)but-2-en-1-ol, or 5-chloro-3-methylpent-2-en-1-ol.
In one embodiment, the primary alcohol is a prenol-like primary alcohol that is not geraniol.
In another embodiment, the primary alcohol is a prenol-like primary alcohol that is not farnesol.
In another embodiment, the primary alcohol is a prenol-like primary alcohol that is not geranylgeraniol.
By using non-canonical, prenol-like alcohols, which comprise an incremental number of carbons (other than a multiple of five) or different heteroatoms and are referred to herein as non-canonical (or prenyl-like) alcohols, in combination with enzymes according to the invention, the production of novel isoprenoid building blocks with a carbon number other than five or with alternative structures, i.e., non-canonical building blocks, is advantageously enabled. Thus, the number of potential isoprenoids obtained is expanded, many of which could have improved properties over canonical isoprenoids or could provide new functions. Further, a 10- to 100-fold increase in production titers of canonical isoprenoid compounds can be achieved when compared to previously disclosed systems in yeast.
In a further embodiment, the cell according to the invention comprises an exogenous nucleic acid sequence enabling expression or increased expression of an enzyme capable of catalysing the production of canonical and / or non-canonical terpenes, terpenoids, isoprenoids or structures containing isoprenoid groups.
In one embodiment, said nucleic acid sequence comprises an expression control sequence as described herein.
In another embodiment, said nucleic acid comprises a sequence coding for an enzyme. In one embodiment, such an enzyme is a terpene or terpenoid synthase (TS), such as a monoterpene synthase, a sesquiterpene synthase, a diterpene synthase, a sesterterpene synthase, or a triterpene synthase or a fragment thereof, which may convert the canonical or non-canonical pyrophosphate terpene precursors into any of a wide array of terpene and / or terpenoid compounds.
In another embodiment, the exogenous nucleic acid sequence enabling increased expression of an enzyme capable of catalysing the production of canonical and / or non-canonical terpenes, terpenoids, isoprenoids or structures containing isoprenoid groups encodes a prenyl transferase, whereby the terpene compounds are produced by appending precursors of the invention to other non-terpenoid skeletons (as in the case of cannabinoids and prenylated flavonoids).
Examples of preferred terpene synthases and prenyl transferases include but are not limited to limonene and limonene synthase, myrcene and myrcene synthase, geraniol and geraniol synthase, linalool and linalool synthase, taxadiene and taxadiene synthase, amorphadiene and amorphadiene synthase, valencene and valencene synthase, santalol and santalol synthase.
Other enzymes and fragments of the above-described enzymes capable of catalysing the production of canonical and / or non-canonical isoprenoids (or terpenes, or terpenoids) or structures containing isoprenoid groups are also contemplated.
In a preferred embodiment, the terpene synthase or the prenyl transferase is capable of using non-canonical isoprenoid building blocks as substrate. Importantly, by using novel building blocks with a carbon number other than five (or a multiple of five), or building blocks containing heteroatoms, as substrate, the number of potential isoprenoids produced by the system according to the invention is advantageously greatly expanded.
In another embodiment, the terpene synthase enzyme or isoprenoid synthase enzyme or other enzyme or a fragment thereof capable of catalysing the production of canonical isoprenoids (or terpenes, or terpenoids) or structures containing isoprenoid groups, comprises a change in the amino acid sequence that enables improved enzyme kinetics for utilisation of non-canonical isoprenoid building blocks. Such a change in the peptide sequence can, e.g., enhance the affinity of the enzyme for non-canonical substrates, or reduce the activation energy, which results in a greater reaction efficiency. Thus, the terpene and / or terpenoid yield is improved.
In a preferred embodiment, the host cell is a yeast cell. Any yeast species may be appropriate. In some embodiments, the genus of said yeast is selected from Saccharomyces, Pichia, Ogataea, Yarrowia, Kluyveromyces, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Schizosaccharomyces, Trichosporon and Lipomyces. In some preferred embodiments, the genus of said yeast is Saccharomyces, Pichia, Ogataea, Kluyveromyces or Yarrowia.
Preferred species include Saccharomyces spp., Saccharomyces cerevisiae, Schizosaccharomyces pombe, Scheffersomyces stipitis, Pichia pastoris, Ogataea polymorpha, Kluyveromyces marxianus, Kluyveromyces lactis, Yarrowia lipolytica, or Dekkera bruxellensis.
In another embodiment, the host cell of the invention is a filamentous fungus cell, such as a cell derived from the mycelia or germ bodies of a filamentous fungus. Preferred species include but are not limited to Aspergillus niger, Aspergillus oryzae, Aspergillus terreus, Neurospora crassa, or Trichoderma reesei.
In another embodiment, the host cell of the invention is an algal cell, such as a cell derived from a multicellular alga. In another embodiment, the algal cell is a microalga, such as a cell of the species Nannochloropsis gaditana, Nannochloropsis oceanica, Nannochloropsis salina, Chlamydomonas reinhardtii, Arthrospira, Chlorella vulgaris, Dunaliella salina, Haematococcus pluvialis, Phaeodactylum tricornutum, or Isochrysis galbana.
By the host cell of the invention being a eukaryotic cell, isoprenoid production titers and speed may be advantageously increased. Furthermore, eukaryotic systems comprising yeast incubators or microalgae photosynthetic production systems are adjusted for optimal industry-oriented production methods and are, in addition, readily up-scalable depending on demand. In addition, eukaryotic cells contain organelles and subcellular structures that may be beneficial for bioproduction, by facilitating metabolic channelling, avoiding metabolite crosstalk or inadvertent inhibition, and potentially improving the function and stability of enzymes by enabling membrane association or confinement in a cellular compartment/substructure.
A further advantage of the host cell of the invention being a eukaryotic cell is the ability to perform CYP-driven oxidations on the new structures, which have not been possible in bacterial cells. Indeed, eukaryotic cells provide, in general, a better environment for the functional expression of enzymes that belong to the group of cytochrome P450s (CYPs), particularly if these are originally found in another eukaryotic cell. This is due to the fact that said CYPs are membrane-bound enzymes and correct association with the membrane is essential for optimal activity. Eukaryotic cells, such as yeast, contain appropriate membrane structures (e.g. ER) for exogenous CYPs to function. Such membranes are lacking from bacteria and, therefore, functional expression of said CYPs in prokaryotic cells is far less optimal. Although there are kinases reported to phosphorylate prenol and isoprenol in bacterial cells, establishing a bacterial system that converts prenol and isoprenol into CYP-decorated isoprenoids would be challenging due to the limitation described above.
In another aspect, the invention relates to a method for the production of a terpene or an isoprenoid.
In a preferred embodiment, said method comprises the steps of providing an engineered eukaryotic cell comprising a DNA sequence coding for a primary alcohol kinase, and culturing said engineered cell in a medium containing a primary alcohol.
In another embodiment, the cell provided further comprises an exogenous nucleic acid sequence coding for a phosphokinase.
In another embodiment, the primary alcohol is at an initial concentration within a range of 0.01% to 1% v/v, such as within a range of 0.05% to 0.6% v/v, such as within a range of 0.1% to 0.3% v/v.
In a preferred embodiment, the primary alcohol is at an initial concentration of 0.1% v/v.
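By way of illustration only, the initial concentrations recited above can be converted into a dosing volume for a given culture. The sketch below assumes that % v/v is taken relative to the culture volume and that the small volume contributed by the alcohol itself is neglected; the function names and the 50 mL culture volume are hypothetical and not taken from the specification.

```python
# Nested concentration ranges recited above, narrowest first (% v/v).
CONCENTRATION_RANGES = (
    ("0.1% to 0.3%", 0.1, 0.3),
    ("0.05% to 0.6%", 0.05, 0.6),
    ("0.01% to 1%", 0.01, 1.0),
)


def alcohol_volume_ml(culture_volume_ml: float, percent_v_v: float) -> float:
    """Volume of primary alcohol (mL) for a target initial % v/v.

    Assumption: the alcohol volume is negligible relative to the culture.
    """
    if culture_volume_ml <= 0 or percent_v_v < 0:
        raise ValueError("volume must be positive and percentage non-negative")
    return culture_volume_ml * percent_v_v / 100.0


def narrowest_range(percent_v_v: float):
    """Label of the narrowest recited range containing the value, or None."""
    for label, low, high in CONCENTRATION_RANGES:
        if low <= percent_v_v <= high:
            return label
    return None


if __name__ == "__main__":
    # Hypothetical 50 mL culture at the preferred 0.1% v/v.
    print(alcohol_volume_ml(50.0, 0.1), narrowest_range(0.1))
```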
BRIEF DESCRIPTION OF THE DRAWINGS
In the following, example embodiments are described according to the invention:
Fig. 1 & Fig. 2: Identification and characterization of the efficient alcohol kinases. Ion count of the 93-ion from SPME-GC-MS of the headspace of yeast EGY48 co-expressing alcohol kinase candidates and AtIPK, to identify efficient conversion of prenol to isoprenoid building blocks.
Fig. 3: SPME-GC-MS: Comparison of linalool production with either AtFKI or A65AtFKI in presence of prenol.
Fig. 4: SPME-GC-MS: Comparison of AtFKI and A65AtFKI when co-expressed with the phosphokinase AtIPK in yeast in the presence of prenol. The production of linalool demonstrates that AtFKI performs well in combination with various phosphokinases.
Fig. 5a: Co-expression of the alcohol conversion pathway AtFKI-AtIPK and CIMyrS in yeast cells increases myrcene production (pink) compared to no alcohol feeding (black).
Fig. 5b: Co-expression of the alcohol conversion pathway AtFKI-AtIPK and CISabS in yeast cells increases sabinene production (pink) compared to no alcohol feeding (black).
Fig. 5c: Limonene titer when feeding different ratios of prenol:isoprenol in the strain containing the alcohol feeding pathway and limonene downstream building block.
Fig. 5d: Limonene yield when feeding different alcohol concentrations in the strain containing the alcohol feeding pathway and limonene downstream building block.
Fig. 6: Comparison of production of CBGA with and without conversion of prenol and isoprenol.
Fig. 7: Alcohols found to be converted with AtFKI and AtIPK and utilized by CILimS.
Fig. 8: SPME-GC-MS analysis of non-canonical terpenes produced when AtFKI, AtIPK and CILimS are co-expressed. Based on the alcohols that were converted and the mass spectra of the novel compounds, suggested structures are shown for the peaks.
Fig. 9: Novel isoprenoid compounds.
Fig. 10: Suggested core alcohol structure.
Fig. 11: Comparison of total production of non-canonical limonene variants using AtFKI-AtIPK alcohol conversion combined with the limonene synthase CILimS, or the CILimS mutants L505G, F483A and I284G, and the alcohols 3,4-dimethylpent-2-en-1-ol (3,4-DMP) and 3-ethylpent-2-en-1-ol (3E2E).
Fig. 12: Synthesis of unnatural cannabigerolic acid analogues by using non-canonical prenyl diphosphate precursors. Noncanonical CBGA, no alcohols. Yeast cells expressing the Erg20p(127W)-CsPT4 alone, AtFKI-AtIPK alone. a: 3-methylpent-2-en-1-ol. b: 3,4-dimethylpent-2-en-1-ol. c: 3-methylhex-2-en-1-ol. d: 3-ethylpent-2-en-1-ol.
Fig. 13: shows the production of C16 and C17 sesquiterpenes using the alcohol conversion pathway and 3M2E.
Fig. 14: evaluates the efficiency of different prenyltransferases in supporting the production of C16 and C17 sesquiterpenes when feeding 3M2E.
Fig. 15: evaluates the efficiency of different prenyltransferases in supporting the production of C16 and C17 sesquiterpenes when feeding 3,4-DMP.
Fig. 16: shows the efficiency of different prenyltransferases in supporting the production of C16 and C17 sesquiterpenes when feeding 3-MP.
Fig. 17: evaluates the efficiency of different prenyltransferases in supporting the production of C17 sesquiterpenes when feeding 3E2E.
Fig. 18: shows the production of a C16 sesquiterpenoid by Salvia fruticosa caryophyllene synthase (Sf126) and 3-MP.
Fig. 19: shows the production of non-canonical squalene using 3MP.
Fig. 20: shows the production of a non-canonical triterpenoid by cucurbitadienol synthase and 3MP.
Fig. 21: shows the production of a non-canonical triterpenoid by BmeTC(373C) and 3MP.
DETAILED DESCRIPTION
In the following the invention is described in detail through embodiments hereof that should not be thought of as limiting to the scope of the invention.
It is known that the isopentenol utilization pathway (IUP) can produce isopentenyl diphosphate or dimethylallyl diphosphate, the main precursors to isoprenoid synthesis, through sequential phosphorylation of the isopentenol isomers isoprenol or prenol. This non-canonical, alternative pathway converts primary alcohols into the corresponding pyrophosphates. Conversion systems for the production of DMAPP- and IPP-derived compounds from prenol and isoprenol in bacteria have, however, only been demonstrated for obtaining canonical isoprenoids. Moreover, the enzymes used are not efficient in yeast.
The inventors have surprisingly found that the ectopic expression of Arabidopsis thaliana farnesol kinase (FKI) and isopentenyl phosphate kinase leads to a high yield of canonical and non-canonical pyrophosphorylated isoprenoid precursors in yeast.
Definitions
Isoprenoids are a naturally occurring group of chemical molecules displaying a wide structural diversity of carbon skeletons made up from basic isoprene units (typically C5) and including compounds otherwise named as terpenes, terpenoids, or isoprenoids. Herein, the terms "terpene", "terpenoid", and "isoprenoid" are used interchangeably. According to the number of C5 units, terpenes are classified as: hemiterpenes, C5; monoterpenes, C10; sesquiterpenes, C15; diterpenes, C20; sesterterpenes, C25; triterpenes, C30; and tetraterpenes, C40.
Terpenes consist of compounds with the formula (C5H8)n. They are further classified by the number of carbons: hemiterpenes (C5), monoterpenes (C10), sesquiterpenes (C15), diterpenes (C20), etc. A well-known monoterpene is alpha-pinene, a major component of turpentine.
Terpenoids are modified terpenes, wherein methyl groups have been moved or removed, or oxygen atoms added, or contain other decorations or modifications. The term "terpene" may also be used more broadly, to include the terpenoids. Just like terpenes, the terpenoids can be classified according to the number of isoprene units that comprise the parent terpene.
Hemiterpenes consist of a single isoprene unit. Isoprene itself is the only hemiterpene, but oxygen-containing derivatives such as prenol and isovaleric acid are hemiterpenoids.
Monoterpenes (or monoterpenoids) are molecules comprising a 10-carbon isoprenoid structure. Monoterpenoids may, in addition to the 10-carbon isoprenoid structure, also comprise moieties not having isoprenoid structure. Frequently, the biosynthesis of monoterpenoids involves several additional steps following the initial conversion of GPP to the basic monoterpene skeleton. These additional steps may be oxidations (e.g. catalysed by a cytochrome P450 enzyme), reductions, isomerizations, acetylations, methylations, etc.
Examples of sesquiterpenes and sesquiterpenoids include humulene, amorphadiene, farnesenes, farnesol, valencene, etc. (The sesqui- prefix means one and a half).
Diterpenes are composed of four isoprene units and have the molecular formula C20H32. They derive from geranylgeranyl pyrophosphate. Examples of diterpenes and diterpenoids are casbene, abietadiene, miltiradiene, ginkgolides, cafestol, kahweol, cembrene and taxadiene (precursor of taxol). Diterpenes also form the basis for compounds such as retinol, retinal, and phytol.
Sesterterpenes, terpenes having 25 carbons and five isoprene units, are rare relative to the other sizes (The sester- prefix means two and a half). An example of a sesterterpenoid is geranylfarnesol.
Triterpenes consist of six isoprene units and have the molecular formula C30H48. The linear triterpene squalene, the major constituent of shark liver oil, is derived from the reductive coupling of two molecules of farnesyl pyrophosphate. Squalene is then processed biosynthetically to generate either lanosterol or cycloartenol, the structural precursors to the steroids.
Sesquarterpenes are composed of seven isoprene units and have the molecular formula C35H56. Sesquarterpenes are typically microbial in their origin. Examples of sesquarterpenoids are ferrugicadiol and tetraprenylcurcumene.
Tetraterpenes contain eight isoprene units and have the molecular formula C40H64. These include the acyclic lycopene, the monocyclic gamma-carotene, and the bicyclic alpha- and beta-carotenes.
Polyterpenes consist of long chains of many isoprene units. Natural rubber consists of polyisoprene in which the double bonds are cis. Some plants produce a polyisoprene with trans double bonds, known as gutta-percha.
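By way of illustration only, the size classes defined above and the general terpene formula (C5H8)n can be expressed as a short lookup. The sketch below assumes an unmodified terpene hydrocarbon skeleton; terpenoids carrying oxygen or other modifications, and non-canonical structures, fall outside this simple rule. The function and variable names are chosen for illustration only.

```python
# Terpene classes by number of C5 isoprene units, as defined above.
TERPENE_CLASSES = {
    1: "hemiterpene",
    2: "monoterpene",
    3: "sesquiterpene",
    4: "diterpene",
    5: "sesterterpene",
    6: "triterpene",
    7: "sesquarterpene",
    8: "tetraterpene",
}


def terpene_formula(n_units: int) -> str:
    """Molecular formula of an unmodified terpene, (C5H8)n."""
    if n_units < 1:
        raise ValueError("at least one isoprene unit is required")
    return f"C{5 * n_units}H{8 * n_units}"


def classify_by_carbons(carbons: int):
    """Class name for a carbon count, or None if not a listed multiple of 5."""
    units, remainder = divmod(carbons, 5)
    if remainder != 0:
        return None  # e.g. C16 or C17 skeletons are not canonical sizes
    return TERPENE_CLASSES.get(units)


if __name__ == "__main__":
    for n in (4, 6, 7, 8):
        print(TERPENE_CLASSES[n], terpene_formula(n))
```

The formulas produced for four, six, seven and eight units agree with those quoted in the definitions of diterpenes, triterpenes, sesquarterpenes and tetraterpenes.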
Norisoprenoids, such as the C13-norisoprenoid 3-oxo-α-ionol and 7,8-dihydroionone derivatives, such as megastigmane-3,9-diol and 3-oxo-7,8-dihydro-α-ionol, can be produced by fungal peroxidases or glycosidases. Many norisoprenoids are the product of oxidative cleavage of a larger isoprenoid molecule by specific enzymes.
Iridoids are a group of compounds found in plants and some animals, which are biosynthetically derived from 8-oxogeraniol.
Monoterpene indole alkaloids, as used herein, refer to a large and diverse group of plant chemical compounds derived from a unit of tryptamine and a 10-carbon or 9-carbon unit of terpenoid origin that is, in turn, derived from 8-oxo-geraniol.
Higher terpenes, as used herein, are intended to mean molecules comprising more than 10 carbon atoms of isoprenoid structure. Examples include sesquiterpenes, diterpenes and triterpenes. Higher terpenes may include moieties not having the isoprenoid structure in addition to the terpene structure.
Cannabinoids, as used herein, refer to a group of compounds, members of which were initially isolated from the plant Cannabis sativa. Many cannabinoids are biosynthesized by the addition of GPP to olivetolic acid.
Meroterpenoids, as used herein, refer to compounds that contain an isoprenoid moiety as part of a larger compound. Such compounds are, for example, the group of cannabinoids, the group of monoterpene indole alkaloids, or other prenylated aromatic compounds.
Canonical terpenes, as used herein, refer to terpenes synthesized using the canonical terpene precursors IPP, DMAPP, GPP, FPP, GGPP, etc, or their “cis-” counterparts, and which have a number of carbon atoms that is a multiple of 5, as their biosynthesis is based on 5-carbon precursors.
DMAPP and IPP: Dimethylallyl pyrophosphate (or dimethylallyl diphosphate; DMAPP) and isopentenyl pyrophosphate (or isopentenyl diphosphate; IPP) are 5-carbon precursors of isoprenoids.
GPP: Geranyl diphosphate (or geranyl pyrophosphate; GDP). GPP is formed by condensation of one DMAPP and one IPP molecule. GPP is a branch point molecule in isoprenoid synthesis, and it can, by addition of an IPP molecule, be converted into FPP, and thereby be directed into the biosynthesis of sesqui-, di- or tri-terpenes or sterol synthesis, or it can, by the action of a monoterpene synthase, be directed into the synthesis of monoterpenoids, iridoids, and monoterpene indole alkaloids. Other prenyltransferases can also direct GPP towards the production of cannabinoids, prenylated aromatic compounds, or meroterpenoids in general.
FPP: Farnesyl pyrophosphate (or farnesyl diphosphate; FDP) is formed by condensing GPP with an IPP molecule. FPP is the precursor for the synthesis of sesquiterpenes, diterpenes, triterpenes and sterols.
GGPP: Geranylgeranyl pyrophosphate (or geranylgeranyl diphosphate; GGDP). GGPP is formed by condensing an FPP with an IPP molecule. GGPP is the precursor for the synthesis of diterpenes.
GFPP: Geranylfarnesyl pyrophosphate (or geranylfarnesyl diphosphate; GFDP). GFPP is formed by condensing a GGPP with an IPP molecule. GFPP is the precursor for the synthesis of sesterterpenes.
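By way of illustration only, the stepwise formation of GPP, FPP, GGPP and GFPP described above can be followed by counting carbons: each condensation with IPP adds five carbons to the growing prenyl chain. The sketch below assumes, purely for illustration, that a non-canonical starter unit (for example a six-carbon analogue) is elongated in the same way as DMAPP; the function names and the `starter_carbons` parameter are hypothetical and not taken from the specification.

```python
# Canonical prenyl diphosphate precursors by carbon count, as defined above.
CANONICAL_PRECURSORS = {5: "DMAPP", 10: "GPP", 15: "FPP", 20: "GGPP", 25: "GFPP"}


def prenyl_chain_carbons(n_ipp_additions: int, starter_carbons: int = 5) -> int:
    """Carbon count after condensing a starter unit with n IPP molecules.

    The default starter is DMAPP (5 carbons); each IPP adds 5 carbons.
    """
    if n_ipp_additions < 0:
        raise ValueError("number of IPP additions cannot be negative")
    return starter_carbons + 5 * n_ipp_additions


def canonical_name(carbons: int):
    """Name of the canonical precursor with this carbon count, or None."""
    return CANONICAL_PRECURSORS.get(carbons)


if __name__ == "__main__":
    # DMAPP + 1 IPP -> GPP, + 2 -> FPP, + 3 -> GGPP, + 4 -> GFPP
    for n in range(5):
        carbons = prenyl_chain_carbons(n)
        print(n, carbons, canonical_name(carbons))
    # Hypothetical six-carbon starter plus two IPP units gives a C16 chain,
    # which has no canonical counterpart.
    print(prenyl_chain_carbons(2, starter_carbons=6))
```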
Structural analogues: Also referred to as chemical analogues, chemical analogs, analogues, or analogs. These are compounds that possess structural similarity to a specific compound but differ in one or more respects from said compound.
Analogues can be, but are not limited to, compounds with one or more atoms added or substituted, with functional groups added, removed or substituted, and with substructures changed, isomerized, or modified.
Non-canonical isoprenoids (terpenes or terpenoids) are chemical analogues of canonical isoprenoids produced by removal of diphosphate groups from non-canonical isoprenoid building blocks. When the diphosphate group is removed, several different reactions can occur, including cyclization of the molecule, rearrangement or formation of bonds, double bonds and triple bonds, and reaction with water or oxygen to form functional groups.
Non-canonical meroterpenoids are analogues of canonical meroterpenoids (i.e. meroterpenoids with a canonical isoprenoid moiety).
Non-canonical cannabinoids are analogues of the compounds belonging to the cannabinoid group of compounds. These analogues are defined by the utilization of a non-canonical isoprenoid building block instead of GPP.
Non-canonical monoterpenes, sesquiterpenes, diterpenes, sesterterpenes, triterpenes, tetraterpenes, polyterpenes, sterols: Chemical analogues of monoterpenes, sesquiterpenes, diterpenes, sesterterpenes, triterpenes, tetraterpenes, polyterpenes, sterols or molecules with structural resemblance or production means. They can be produced either by direct conversion of a non-canonical isoprenoid building block, or by condensation of a non-canonical building block with either another canonical or non-canonical building block. While not limited to this strict definition, they will usually contain between 7 and 100 carbons and / or other non-hydrogen atoms.
Diphosphate(s), also referred to as pyrophosphate(s), is any molecule with a diphosphate group. In this work, the term often refers to, but is not limited to, organic molecules with a diphosphate group and their analogues.
Prenol-like primary alcohol: is used here to describe an alcohol with a structure that is an analog of the alcohols prenol or isoprenol.
MEP pathway: The methylerythritol 4-phosphate (MEP) pathway forming IPP and DMAPP. The pathway is found e.g. in most bacteria, in algae and in the plastids of higher plants.
MVA pathway: The mevalonate pathway (MVA pathway) is an essential metabolic pathway present in eukaryotes and in some bacteria forming IPP and DMAPP starting from acetyl-CoA.
Alternative MVA pathway: The alternative MVA pathway is found in archaea and provides IPP and DMAPP, starting from acetyl-CoA but utilizing isopentenyl phosphate as intermediate.
Terpene synthases. The term includes any enzyme that is able to catalyse the rearrangement of DMAPP, IPP, GPP, FPP, GGPP, GFPP or other prenyl pyrophosphates or their non-canonical analogues. Terpene synthases typically synthesize multiple products, but the diversity of products varies among terpene synthases. Some terpene synthases have high product specificity, catalysing the synthesis of a limited number of products, and other terpene synthases have low product specificity, catalysing the synthesis of a large variety of different terpenes. Depending on the canonical substrate they primarily accept, terpene synthases are frequently classified as hemiterpene synthases (if they accept DMAPP or IPP), monoterpene synthases (if they accept GPP), sesquiterpene synthases (if they accept FPP), diterpene synthases (if they accept GGPP), sesterterpene synthases (if they accept GFPP), or triterpene synthases (if they accept oxidosqualene or squalene).
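By way of illustration only, the substrate-based classification of terpene synthases given above can be written as a simple lookup table. The sketch covers only the canonical substrates named in the definition; enzymes accepting non-canonical analogues are not captured, and the names used are chosen for illustration only.

```python
from typing import Optional

# Terpene synthase class by primary canonical substrate, as defined above.
SYNTHASE_CLASS_BY_SUBSTRATE = {
    "DMAPP": "hemiterpene synthase",
    "IPP": "hemiterpene synthase",
    "GPP": "monoterpene synthase",
    "FPP": "sesquiterpene synthase",
    "GGPP": "diterpene synthase",
    "GFPP": "sesterterpene synthase",
    "oxidosqualene": "triterpene synthase",
    "squalene": "triterpene synthase",
}


def synthase_class(substrate: str) -> Optional[str]:
    """Return the synthase class for a canonical substrate, or None."""
    return SYNTHASE_CLASS_BY_SUBSTRATE.get(substrate)


if __name__ == "__main__":
    for name in ("GPP", "FPP", "squalene"):
        print(name, "->", synthase_class(name))
```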
Prenyltransferases are enzymes that append a prenyl moiety to isoprenoid or non-isoprenoid skeletons. Many prenyltransferases that append a prenyl moiety to other isoprenoid chains are involved in the synthesis of the prenyl diphosphate precursors, such as GPP (GPP synthases), FPP (FPP synthases), GGPP (GGPP synthases) or geranylfarnesyl diphosphate synthases (GFPP synthases). These enzymes typically add IPP units to extend DMAPP to larger size prenyl- diphosphates in the trans- configuration. For this reason, they are also called trans-polyprenyl synthases ortrans-polyprenyltransferases. Several prenyltransferase enzymes exist that catalyse the cis- condensation and elongation of DMAPP with IPP. These enzymes are termed cis-
prenyltransferase, or cis-polyprenyl diphosphate synthase, or cis-polyprenyltransferases, are responsible for the synthesis of neryl diphosphate, cis.cis-farnesyl diphosphate, and nerylneryl diphosphate.
Furthermore, certain prenyltransferases have been reported to condense two DMAPP molecules to lavandulyl diphosphate or chrysanthemyl diphosphate.
Prenyltransferases that append a prenyl moiety to non-isoprenoid scaffolds add DMAPP, GPP, FPP or GGPP to non-isoprenoid compounds, including flavonoids, amino acid residues and peptides, aromatic compounds, and other chemical compounds in general. Such prenyltransferase enzymes are involved in the biosynthesis of many different natural products including, but not limited to, cannabinoids, prenylated flavonoids, or other meroterpenoids. In the case of cannabinoid synthesis, this enzyme is a geranyldiphosphate:olivetolate geranyltransferase.
The prenyltransferase may be part of separate polypeptides or fused into one polypeptide chain. The prenyltransferase may also be fused to another prenyltransferase (e.g. Erg20p; an FPP synthase), a terpene synthase, or another non-terpene synthesizing protein. The prenyltransferase may also be fused to an enzyme that naturally localizes to the peroxisome matrix or its membrane in yeasts or in another organism, or fused to a polypeptide chain that is itself fused to a peroxisomal targeting signal.
An aromatic prenyltransferase is selected among any enzyme with prenyltransferase activity, identified from any organism or engineered, that is able to transfer an isoprenoid moiety to another isoprenoid or non-isoprenoid compound.
Prenyl diphosphate synthase, as used herein, refers to any polypeptide with prenyl diphosphate synthesizing capacity that utilizes prenyl pyrophosphate compounds as substrate(s). In most cases, a prenyl diphosphate synthase is a prenyltransferase (see above).
The term “pyrophosphate” is used interchangeably herein with "diphosphate". Pyrophosphates are in this document an umbrella term for organic molecules that contain a pyrophosphate group.
Also, herein, the term "prenyl diphosphate" is used interchangeably with "prenyl pyrophosphate", “isoprenyl diphosphate”, or “isoprenyl pyrophosphate”, and includes monoprenyl diphosphates containing a single prenyl or isoprenyl group (such as DMAPP or IPP), and polyprenyl diphosphates with two or more prenyl/isoprenyl groups (such as GPP, FPP, GGPP, etc.).
Non-canonical pyrophosphates, as used herein, refers to structural analogues of compounds containing a single prenyl or isoprenyl group (such as DMAPP or IPP), as well as structural analogues of polyprenyl diphosphates with two or more prenyl/isoprenyl groups (such as GPP, FPP, GGPP, etc.).
Canonical isoprenoid building blocks, as used herein, refer to prenyl pyrophosphate compounds with a carbon number that is a multiple of 5, which serve as substrates in the biosynthesis of either larger prenyl pyrophosphate compounds or of canonical terpenes (terpenoids and/or isoprenoids) and meroterpenoids. While not limited to these processes, these building blocks can usually be utilized by prenyltransferases and functional analogues, or by enzymes capable of removing the diphosphate group to catalyse a reaction, and can be modified or folded by oxidoreductases. Squalene, dehydrosqualene, oxidosqualene, and phytoene are also considered herein as canonical isoprenoid building blocks, despite the fact that they do not carry a diphosphate group.
Non-canonical isoprenoid building blocks, as used herein, refers to pyrophosphate group-containing organic molecules that are structural analogues of the canonical isoprenoid building blocks and can serve as substrates in the biosynthesis of either larger pyrophosphate-containing compounds by the action of prenyltransferases or prenyl diphosphate synthase enzymes, also described as condensation and elongation in this work, or in the biosynthesis of non-canonical isoprenoids (terpenes or isoprenoids) by the action of terpene synthases, or non-canonical meroterpenoids by the action of corresponding prenyltransferases. Not limited to these processes, non-canonical isoprenoid building blocks can also be utilized by enzymes capable of removing the diphosphate group to catalyse a reaction, be modified or folded by oxidoreductases, and undergo other reactions.
The term “genetically engineered” as used herein refers to the genetic alteration of a cell resulting from the direct uptake of exogenous genetic material from its surroundings through the cell membrane(s), or by other means, such as viral transduction, whether or not said exogenous genetic material is incorporated into the cell’s genome, thus possibly leading to either stable or transient expression.
Cells comprising exogenous nucleic acids
In one embodiment, the host cell of the invention comprises at least a first nucleic acid comprising or consisting of a variant of SEQ ID NO: 15 encoding a first kinase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 15 and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment the host cell of the invention comprises at least a first nucleic acid comprising or consisting of a variant of SEQ ID NO: 15.
In another embodiment, the host cell of the invention comprises at least a first nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 2 (corresponding to a first kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 2 and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment the host cell of the invention comprises a polypeptide comprising or consisting of a variant of SEQ ID NO: 2.
In another embodiment, the host cell of the invention comprises a polypeptide with between 1 and 5 amino acid substitutions as compared to SEQ ID NO: 2.
In one embodiment, said polypeptide has between 1 and 3 amino acid substitutions as compared to SEQ ID NO: 2.
In another embodiment, the host cell of the invention comprises at least a first nucleic acid comprising or consisting of a variant of SEQ ID NO: 14 encoding a first kinase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 14 and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment the host cell of the invention comprises at least a first nucleic acid comprising or consisting of a variant of SEQ ID NO: 14.
In another embodiment, the host cell of the invention comprises at least a first nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 1 (corresponding to a first kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 1 and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment the host cell of the invention comprises a polypeptide comprising or consisting of a variant of SEQ ID NO: 1.
In another embodiment, the host cell of the invention comprises a polypeptide with between 1 and 5 amino acid substitutions as compared to SEQ ID NO: 1.
In one embodiment, said polypeptide has between 1 and 3 amino acid substitutions as compared to SEQ ID NO: 1.
In another embodiment, the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 17 encoding a second kinase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 17 and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 17.
In another embodiment, the host cell of the invention comprises a second nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 3 (i.e., encoding a second kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 3 and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment the host cell of the invention comprises a polypeptide comprising or consisting of a variant of SEQ ID NO: 3.
In another embodiment, the host cell of the invention comprises a polypeptide with between 1 and 5 amino acid substitutions as compared to SEQ ID NO: 3.
In one embodiment, said polypeptide has between 1 and 3 amino acid substitutions as compared to SEQ ID NO: 3.
In another embodiment, the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 18 encoding a second kinase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 18 and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 18.
In another embodiment, the host cell of the invention comprises a second nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 4 (i.e. encoding a second kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 4 and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment the host cell of the invention comprises a polypeptide comprising or consisting of a variant of SEQ ID NO: 4.
In another embodiment, the host cell of the invention comprises a polypeptide with between 1 and 5 amino acid substitutions as compared to SEQ ID NO: 4.
In one embodiment, said polypeptide has between 1 and 3 amino acid substitutions as compared to SEQ ID NO: 4.
In another embodiment, the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 19 encoding a second kinase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 19 and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 19.
In another embodiment, the host cell of the invention comprises a second nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 5 (i.e. encoding a second kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 5 and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment the host cell of the invention comprises a polypeptide comprising or consisting of a variant of SEQ ID NO: 5.
In another embodiment, the host cell of the invention comprises a polypeptide with between 1 and 5 amino acid substitutions as compared to SEQ ID NO: 5.
In one embodiment, said polypeptide has between 1 and 3 amino acid substitutions as compared to SEQ ID NO: 5.
In another embodiment, the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 19 encoding a second kinase polypeptide wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 19 and comprising the amino acid substitution 204G and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment the host cell of the invention comprises a second nucleic acid comprising or consisting of a variant of SEQ ID NO: 19 and comprising the amino acid substitution 204G.
In another embodiment, the host cell of the invention comprises a second nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 6 (i.e., encoding a second kinase polypeptide), wherein said variant polypeptide has at least 75%, but less than 100% sequence identity to SEQ ID NO: 6 and comprising the amino acid substitution 204G and wherein said polypeptide is capable of phosphorylation in said host cell.
In another embodiment, the host cell of the invention comprises a polypeptide according to SEQ ID NO: 6.
In another embodiment, the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 20 encoding a prenyl diphosphate synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 20.
In another embodiment the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 20.
In another embodiment, the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 7 encoding a prenyl diphosphate synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 7.
In another embodiment, the host cell of the invention comprises a polypeptide according to SEQ ID NO: 7.
In another embodiment, the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 8 encoding a prenyl diphosphate synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 8 and comprises amino acid change N127W.
In another embodiment, the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 8.
In another embodiment, the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 8 and comprises amino acid change N127W.
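The substitution labels used above (for example N127W, or 204G where only the position and new residue are given) can be read as simple edit instructions on a polypeptide sequence. The following is a minimal sketch of that reading, not a method described in this document; the function name `apply_substitution` and the toy sequence are hypothetical, and the actual SEQ ID sequences are not reproduced here.

```python
import re

# Accepts "N127W" (original residue, 1-based position, new residue)
# or "204G" (position and new residue only).
_SUBSTITUTION = re.compile(r"^([A-Z]?)(\d+)([A-Z])$")


def apply_substitution(sequence: str, spec: str) -> str:
    """Return a copy of `sequence` with one amino acid substitution applied.

    Positions are 1-based, as is usual for protein numbering.
    """
    match = _SUBSTITUTION.match(spec)
    if match is None:
        raise ValueError(f"unrecognised substitution label: {spec!r}")
    expected, position, new_residue = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    # If the label names the original residue, check it against the sequence.
    if expected and sequence[position - 1] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + new_residue + sequence[position:]


# Toy example only (hypothetical 4-residue sequence).
toy = "MANQ"
print(apply_substitution(toy, "N3W"))
print(apply_substitution(toy, "3G"))
```

Checking the named original residue guards against numbering mismatches, for instance when a reference sequence includes or omits a leading tag or signal peptide.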
In another embodiment, the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 21 encoding a (+)-limonene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 21.
In another embodiment the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 21.
In another embodiment, the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 9 encoding a (+)-limonene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 9.
In another embodiment, the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 9.
In another embodiment, the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 22 encoding a Beta-myrcene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 22.
In another embodiment the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 22.
In another embodiment, the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 10 encoding a Beta-myrcene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 10.
In another embodiment, the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 10.
In another embodiment, the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 23 encoding a Sabinene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 23.
In another embodiment the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 23.
In another embodiment, the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 11 encoding a Sabinene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 11.
In another embodiment, the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 11.
In another embodiment, the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 24 encoding an α-pinene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 24.
In another embodiment the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 24.
In another embodiment, the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 12 encoding an α-pinene synthase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 12.
In another embodiment, the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 12.
In another embodiment, the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 25 encoding a geranyldiphosphate : olivetolate geranyltransferase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 25.
In another embodiment the host cell of the invention comprises a nucleic acid comprising or consisting of a variant of SEQ ID NO: 25.
In another embodiment, the host cell of the invention comprises a nucleic acid encoding for a polypeptide comprising or consisting of a variant of SEQ ID NO: 13 encoding a geranyldiphosphate : olivetolate geranyltransferase polypeptide, wherein said variant has at least 75%, but less than 100% sequence identity to SEQ ID NO: 13.
In another embodiment, the host cell of the invention comprises a polypeptide comprising SEQ ID NO: 13.
In another embodiment, the host cell of the invention comprises a polypeptide consisting of naturally encoded amino acid residues, i.e., amino acids found in the genetic code.
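The embodiments above repeatedly define a variant by two criteria: at least 75% but less than 100% sequence identity to a reference, or a small number of amino acid substitutions. As an illustration only, the sketch below checks both criteria for two sequences that are assumed to be already aligned and of equal length (no gaps). This is a simplification: this passage does not specify an alignment method, and real identity calculations normally start from a pairwise alignment. All function names and the toy sequences are hypothetical.

```python
def percent_identity(reference: str, variant: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if len(reference) != len(variant) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(reference, variant))
    return 100.0 * matches / len(reference)


def list_substitutions(reference: str, variant: str) -> list[str]:
    """Describe each differing position as e.g. 'K8G' (1-based numbering)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be of equal length")
    return [
        f"{a}{i}{b}"
        for i, (a, b) in enumerate(zip(reference, variant), start=1)
        if a != b
    ]


def is_qualifying_variant(reference: str, variant: str, min_identity: float = 75.0) -> bool:
    """True if identity is at least `min_identity` percent but below 100 percent."""
    identity = percent_identity(reference, variant)
    return min_identity <= identity < 100.0


# Toy 10-residue sequences, not taken from any SEQ ID in this document.
ref = "MKTAYIAKQR"
var = "MKTAYIAGQR"
print(percent_identity(ref, var), list_substitutions(ref, var), is_qualifying_variant(ref, var))
```

The `min_identity` parameter can be raised to mirror the stepwise thresholds listed elsewhere in the text (for example 80%, 90% or 95%).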
Although all combinations of the first kinase (SEQ ID NO: 2) that phosphorylates a primary alcohol to a mono- or pyrophosphate terpenoid precursor and a phosphokinase that phosphorylates a monophosphate precursor to a terpenoid pyrophosphate precursor herein may be useful for providing a genetically engineered eukaryotic cell, such as a yeast cell, specific combinations of the first kinase and phosphokinase may be of particular interest in the context of the present invention.
In some embodiments, the first kinase and the phosphokinase are: i) AtFKI and AtIPK; ii) AtFKI and TaIPK; iii) AtFKI and TaIPK(204G); or functional variants thereof having at least 70% homology thereto, such as at least 71%, such as at least 72%, such as at least 73%, such as at least 74%, such as at least 75%, such as at least 76%, such as at least 77%, such as at least 78%, such as at least 79%, such as at least 80%, such as at least 81%, such as at least 82%, such as at least 83%, such as at least 84%, such as at least 85%, such as at least 86%, such as at least 87%, such as at least 88%, such as at least 89%, such as at least 90%, such as at least 91%, such as at least 92%, such as at least 93%, such as at least 94%, such as at least 95%, such as at least 96%, such as at least 97%, such as at least 98%, such as at least 99% homology thereto.
Titer
The combined action of one or more of these enzymes provides an isoprenoid biosynthetic pathway that allows de-coupling of isoprenoid biosynthesis from biomass production and enables channelling more substrate into product, thus providing a non-competitive system. Thereby, a 10- to 100-fold increase in production titers of canonical isoprenoid compounds can be achieved. Thus, it is a highly efficient method to avoid common bottlenecks in currently used methods for the production of isoprenoids in yeast and other eukaryotic cells.
The present methods allow production of isoprenoids with a total titer of at least 10 mg/L, such as at least 30 mg/L, such as at least 100 mg/L, such as at least 300 mg/L, such as at least 1 g/L, such as at least 3 g/L, such as at least 10 g/L, such as at least 30 g/L or more, wherein the total titer is the sum of the intracellular isoprenoids titer and the extracellular isoprenoids titer. Indeed, the produced isoprenoids may be secreted from the cell (extracellular isoprenoids) or they may be retained in the cell (intracellular isoprenoids).
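Since the total titer is defined above as the sum of the intracellular and extracellular titers, the arithmetic can be sketched as follows. The threshold list is taken from the levels named in the preceding paragraph, converted to mg/L; the function names and the example measurements are hypothetical and are not results from this document.

```python
# Titer levels named in the text, expressed in mg/L (1 g/L = 1000 mg/L).
TITER_THRESHOLDS_MG_PER_L = [10, 30, 100, 300, 1_000, 3_000, 10_000, 30_000]


def total_titer(intracellular_mg_per_l: float, extracellular_mg_per_l: float) -> float:
    """Total titer = intracellular titer + extracellular titer (both in mg/L)."""
    if intracellular_mg_per_l < 0 or extracellular_mg_per_l < 0:
        raise ValueError("titers cannot be negative")
    return intracellular_mg_per_l + extracellular_mg_per_l


def highest_threshold_met(total_mg_per_l: float):
    """Return the largest listed threshold that the total titer reaches, or None."""
    met = [t for t in TITER_THRESHOLDS_MG_PER_L if total_mg_per_l >= t]
    return max(met) if met else None


# Hypothetical measurements, for illustration only.
total = total_titer(intracellular_mg_per_l=120.0, extracellular_mg_per_l=200.0)
print(total, highest_threshold_met(total))
```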
The method may also comprise a step of recovering the produced isoprenoids. This may involve a heating step to precipitate cell material and to release intracellular isoprenoids, a centrifugation or filtration step to remove the cell debris and precipitated materials, pH-adjusting and chromatographic steps optionally involving solvents to vary the solubility of the isoprenoids and to purify them from other components. In some embodiments recovery of isoprenoids involves the addition of a non-miscible solvent overlay in the yeast culture. Said solvent may be hexane, dodecane, isopropyl myristate, or a vegetable oil. In some embodiments the recovered isoprenoids may be used as a nutritional supplement with their naive or processed host cells directly.
Kinases
Kinases of the invention meet the definition of an enzyme that catalyses the transfer of phosphate groups from high-energy, phosphate-donating molecules such as CTP, ATP, GTP, UTP, NTP, CDP, ADP, GDP, UDP, NDP, or diphosphate, triphosphate or polyphosphate to specific substrates. This process is known as phosphorylation, where the substrate gains a phosphate group, and the high-energy, e.g., NTP molecule donates a phosphate group. This transesterification produces a phosphorylated substrate and NDP, as illustrated in the below schematic:
[Schematic not reproduced; recoverable labels: ADP, phosphorylated substrate]
Kinases as referred to herein encompass both alcohol kinases and phosphate kinases, i.e., phosphokinases according to the invention.
Farnesol kinase
Arabidopsis thaliana FKI (or FOLK) is a farnesol kinase belonging to the phosphatidate cytidylyltransferase family of enzymes (Brenda EC 2.7.1.216, UniProt Accession Q67ZM7) that can phosphorylate farnesol using an NTP donor. It has also been shown to phosphorylate geraniol and geranylgeraniol. Phosphorylation of farnesol proceeds according to the following reaction:
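The reaction scheme itself is not reproduced in this text. Based only on the description above (an alcohol kinase transferring one phosphate group from an NTP donor to farnesol, releasing NDP), a general form of the reaction can be written as follows; the specific nucleotide donor is left open, as in the text.

```latex
\[
\mathrm{farnesol} + \mathrm{NTP}
\;\longrightarrow\;
\mathrm{farnesyl\ phosphate} + \mathrm{NDP}
\]
```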
Host cell
The present invention relates to a genetically engineered eukaryotic cell capable of producing mono- or pyrophosphate isoprenoid precursors, such as terpenoid precursors. The genetically engineered eukaryotic cell can be any appropriate cell.
In some embodiments, the genetically engineered eukaryotic cell is a yeast cell.
In some embodiments, the yeast cell is a cell from a GRAS (Generally Recognized As Safe) organism or a non-pathogenic organism or strain.
The cell according to the invention is a eukaryotic cell. Such a cell is described herein as a host cell insofar as it is the recipient of and/or comprises nucleic acids or polypeptides according to the invention. Such a eukaryotic cell may be a yeast cell. In a preferred embodiment, the host cell is a yeast cell. Any yeast species may be appropriate. In some embodiments, the genus of said yeast is selected from Saccharomyces, Pichia, Yarrowia, Kluyveromyces, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Schizosaccharomyces, Trichosporon and Lipomyces. In some preferred embodiments, the genus of said yeast is Saccharomyces, Pichia, Yarrowia, Ogataea or Kluyveromyces. The yeast cell may be selected from the group consisting of Saccharomyces spp., Saccharomyces cerevisiae, Schizosaccharomyces pombe, Scheffersomyces stipitis, Pichia pastoris, Hansenula polymorpha (syn. Ogataea parapolymorpha), Kluyveromyces marxianus, Yarrowia lipolytica, Kluyveromyces lactis, or Dekkera bruxellensis. It is understood that other cells of other genera and, in particular, other species or strains of the same genera are equally appropriate to use as host cells. The eukaryotic cell according to the invention may also be a cell derived from a filamentous fungus, such as a cell derived from the mycelia or germ bodies of a filamentous fungus. Non-limiting exemplary species of filamentous fungi according to the invention are Aspergillus niger, Aspergillus oryzae, Aspergillus terreus, Neurospora crassa, or Trichoderma reesei.
The eukaryotic cell according to the invention may also be an algal cell, such as a cell derived from a multicellular alga. The algal cell of the invention may also be a microalga, such as a cell of the species Nannochloropsis gaditana, Nannochloropsis oceanica, Nannochloropsis salina, Chlamydomonas reinhardtii, Arthrospira, Chlorella vulgaris, Dunaliella salina, Haematococcus pluvialis, Phaeodactylum tricornutum, or Isochrysis galbana. It is understood that other cells of other genera and, in particular, other species or strains of the same genera, are equally appropriate to use as host cells.
In some embodiments of the present invention, the genetically engineered eukaryotic cell is not a plant cell.
Methods of production of cells according to the invention
Cells of the invention can be produced, for example, by use of a combination of recombinant DNA techniques and gene transfection methods as are well known in the art (Morrison, S. (1985) Science 229: 1202).
For example, the nucleic acid sequence(s) of interest, e.g., a kinase coding sequence, can be amplified using the primers and ligated into expression vectors, such as a eukaryotic expression plasmid as used in the expression system disclosed in Examples 1 and 2, or other expression systems well known in the art. A purified plasmid with the cloned sequences can be introduced into yeast cells or other eukaryotic host cells such as filamentous fungi cells, algae cells, mammalian cells such as CHO cells, HEK293T cells or HeLa cells, or alternatively other eukaryotic cells like plant-derived cells. The method used to introduce these genes can be any method described in the art including, but not limited to, electroporation; chemical transformation, such as PEG/lithium acetate-mediated transformation, calcium-phosphate precipitation or DEAE-dextran transfection; transfection, such as lipofectamine transfection; transduction; ultrasound transformation; and the like. For exogenous expression, i.e., expression of exogenous genetic material, in yeast or other eukaryotic cells, genes can be expressed in the cytosol, or can be targeted to the mitochondrion, peroxisome, vacuole, or other organelles by the addition of a suitable targeting sequence such as a chloroplastic, mitochondrial or peroxisomal targeting signal suitable for the host cells. Thus, it is understood that appropriate modifications to a nucleic acid sequence to remove or include a targeting sequence can be incorporated into an exogenous nucleic acid sequence to impart desirable properties.
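To illustrate what including a targeting sequence in a coding sequence can look like at the sequence level, here is a minimal sketch that inserts a C-terminal signal immediately before the stop codon. It is an assumption-laden example: the text does not name a particular signal, so the sketch uses the widely described C-terminal tripeptide Ser-Lys-Leu (a type 1 peroxisomal targeting signal) with arbitrarily chosen codons, and the function name and toy coding sequence are hypothetical. N-terminal signals (for example mitochondrial presequences) would be added at the other end and are not covered.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Assumption for illustration: Ser-Lys-Leu encoded with arbitrarily chosen codons.
PTS1_SKL_CODONS = "TCTAAGTTG"


def append_c_terminal_signal(cds: str, signal_codons: str) -> str:
    """Insert `signal_codons` in frame, directly before the terminal stop codon."""
    cds = cds.upper()
    signal_codons = signal_codons.upper()
    if len(cds) % 3 != 0 or len(signal_codons) % 3 != 0:
        raise ValueError("coding sequence and signal must both be whole codons")
    stop = cds[-3:]
    if stop not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    return cds[:-3] + signal_codons + stop


# Toy coding sequence (Met-Ala-stop), not taken from this document.
print(append_c_terminal_signal("ATGGCTTAA", PTS1_SKL_CODONS))
```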
Furthermore, genes can be subjected to codon optimization with techniques well known in the art to achieve optimized expression of the proteins, i.e., it may be preferred to modify said nucleic acids for the sake of optimization of codon usage, in particular if said nucleic acids, optionally fused to heterologous nucleic acids such as nucleic acids derived from other organisms as described herein, are to be expressed in cells from an organism different from the cell of origin. For example, the nucleic acid sequences encoding alcohol or phosphate kinases originating from, e.g., Arabidopsis thaliana according to the invention can be modified to include one or more, preferably at least 1, 2, 3, 4, 5, 10, 15, 20 and preferably up to 10, 15, 20, 25, 30, 50, 70 or 100 or more nucleotide replacements resulting in an optimized codon usage in, e.g. a preferred yeast genus. Such nucleotide replacements preferably relate to replacements of nucleotides not resulting in a change in the encoded amino acid sequence. Preferably, the degree of identity between a specific nucleic acid sequence and a nucleic acid sequence, which is modified with respect to,
or which is a variant of said specific nucleic acid sequence, will be at least 70%, preferably at least 75%, more preferably at least 80%, even more preferably at least 90% or most preferably at least 95%, 96%, 97%, 98% or 99%.
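The idea of codon optimization described above, replacing nucleotides without changing the encoded amino acid sequence while keeping a high degree of nucleotide identity to the original, can be sketched as follows. The codon table is the standard genetic code; the `preferred` mapping and the toy coding sequence are hypothetical and do not represent real codon usage data for any yeast. This is a sketch of synonymous recoding in general, not the optimization procedure used by the applicants.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict[str, str]) -> str:
    """Swap each codon for the preferred synonymous codon of its amino acid, if one is given."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        recoded.append(new_codon)
    return "".join(recoded)


def nucleotide_identity(a: str, b: str) -> float:
    """Percent identical positions between two equal-length nucleotide sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical preference table and toy sequence, for illustration only.
preferred = {"K": "AAG", "L": "TTG", "G": "GGT"}
original = "ATGAAACTGGGCTAA"
optimized = recode(original, preferred)
assert translate(optimized) == translate(original)  # protein sequence is unchanged
print(optimized, nucleotide_identity(original, optimized))
```

The final assertion is the key safeguard: whatever preference table is used, the recoded sequence must translate to the same polypeptide as the original.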
Cells according to the invention may also be prepared using various site-directed mutagenesis methods, which for example can be designed based on the sequence of AtFKI, which is accessible under the UniProt entry Q67ZM7 and provided herein as SEQ ID NO: 2. In one embodiment, the cell of the invention is prepared using any one of CRISPR, a TALEN, a zinc finger, meganuclease, and a DNA-cutting antibiotic as described in WO 2017/138986. In one embodiment, the cell is prepared using the CRISPR/Cas9 technique, e.g. using RNA-guided Cas9 nuclease. This may be done as described in Lawrenson et al., Genome Biology (2015) 16:258; DOI: 10.1186/s13059-015-0826-7, except that the single guide RNA sequence is designed based on the gene sequences provided herein. In one embodiment, the host cell is prepared using a combination of both TALEN and CRISPR/Cas9 techniques, e.g., using RNA-guided Cas9 nuclease. This may be done as described in Holme et al., Plant Mol Biol (2017) 95:111-121; DOI: 10.1007/s11103-017-0640-6, except that the TALEN and single guide RNA sequence are designed based on the gene sequences provided herein.
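As a rough illustration of what designing a single guide RNA from a gene sequence involves, the sketch below lists candidate target sites on one strand. It rests on assumptions that are not stated in this document: a Cas9 that recognises an NGG motif immediately 3' of a 20-nucleotide target (as commonly described for Streptococcus pyogenes Cas9), forward strand only, and no scoring of off-target risk or efficiency. The function name and the toy sequence are hypothetical, and this is not the procedure of the cited references.

```python
import re


def find_candidate_targets(sequence: str, guide_length: int = 20) -> list[tuple[int, str]]:
    """Return (0-based start, target) for every forward-strand site followed by NGG.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    sequence = sequence.upper()
    pattern = re.compile(rf"(?=([ACGT]{{{guide_length}}})[ACGT]GG)")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


# Toy sequence only: 20 nucleotides followed by a TGG motif.
toy_gene = "A" * 20 + "TGG"
for start, target in find_candidate_targets(toy_gene):
    print(start, target)
```

A fuller design would also scan the reverse complement and filter candidates against the rest of the host genome.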
In one embodiment, the cell of the invention is prepared using homology directed repair, a combination of a DNA cutting nuclease and a donor DNA fragment. This may be done as described in Sun et al., Molecular Plant (2016) 9:628-631; DOI: https://doi.org/10.1016/j.molp.2016.01.001, except that the DNA cutting nuclease and donor DNA fragment are designed based on the gene sequences provided herein.
After introduction of these genes in the host cells, cells expressing the kinase(s) can be identified and selected as would be known to the person skilled in the art according to the marker(s) used. These cells can then be amplified for their expression level and upscaled to produce canonical and non-canonical isoprenoids by use of the canonical and non-canonical isoprenoid precursors.
Nucleic acids
The term "nucleic acid", as used herein, is intended to include deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) and refers to a polynucleotide comprising a polymer of nucleotides. Nucleic acids comprise according to the invention genomic DNA, cDNA, mRNA, recombinantly produced and chemically synthesized molecules. According to the invention, a nucleic acid may be present as a single-stranded or double-stranded and linear or covalently circularly closed
molecule. The nucleic acids may have been codon-optimized for expression in a yeast cell as is known in the art.
The nucleic acids described according to the invention have preferably been isolated. The term "isolated nucleic acid" means according to the invention that the nucleic acid was (i) amplified in vitro, for example by polymerase chain reaction (PCR), (ii) recombinantly produced by cloning, (iii) purified, for example by cleavage and gel-electrophoretic fractionation, or (iv) synthesized, for example by chemical synthesis. An isolated nucleic acid is a nucleic acid which is available for manipulation by recombinant DNA techniques.
Nucleic acids may, according to the invention, be present alone or in combination with other nucleic acids, which may be homologous or heterologous. In preferred embodiments, a nucleic acid is functionally linked to expression control sequences which may be homologous or heterologous with respect to said nucleic acid wherein the term "homologous" means that the nucleic acid is also functionally linked to the expression control sequence naturally and the term "heterologous" means that the nucleic acid is not functionally linked to the expression control sequence naturally. A nucleic acid, such as a nucleic acid expressing RNA and/or protein or peptide, and an expression control sequence are "functionally" linked to one another, if they are covalently linked to one another in such a way that expression or transcription of said nucleic acid is under the control or under the influence of said expression control sequence. Since the nucleic acid is to be translated into a functional protein, and where an expression control sequence is functionally linked to a coding sequence, induction of said expression control sequence results in transcription of said nucleic acid without causing a frame shift in the coding sequence or said coding sequence otherwise not being capable of being translated into the desired protein or peptide.
The term "expression control sequence" or "expression control element" comprises according to the invention promoters, ribosome binding sites, IRES, enhancers and other control elements which regulate transcription of a gene or translation of an mRNA. In particular embodiments of the invention, the expression control sequences can be regulated. The exact structure of expression control sequences may vary as a function of the species or cell type, but generally comprises 5'-untranscribed and 5'- and 3'-untranslated sequences which are involved in initiation of transcription and translation, respectively, such as TATA box, capping sequence, CAAT sequence, and the like. More specifically, 5'-untranscribed expression control sequences comprise a promoter region which includes a promoter sequence for transcriptional control of the functionally linked nucleic acid. Expression control sequences may also comprise enhancer sequences or upstream activator sequences.
According to the invention the term "promoter" or "promoter region" relates to a nucleic acid sequence which is located upstream (5') to the nucleic acid sequence being expressed and controls expression of the sequence by providing a recognition and binding site for RNA polymerase. The "promoter region" may include further recognition and binding sites for further factors which are involved in the regulation of transcription of a gene.
A promoter may be "inducible" by way of initiating transcription in response to an inducing agent or may be "constitutive" if transcription is not controlled by an inducing agent. A gene which is under the control of an inducible promoter is not expressed or only expressed to a small extent if an inducing agent is absent. In the presence of the inducing agent the gene is switched on or the level of transcription is increased. This is mediated, in general, by binding of a specific transcription factor.
Promoters which are preferred according to the invention include promoters useful for expression in a yeast host, including but not limited to promoters obtained from the genes for Saccharomyces cerevisiae enolase (ENO1), Saccharomyces cerevisiae galactokinase (GAL1), Saccharomyces cerevisiae UDP-glucose-4-epimerase (GAL10), Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase (TDH3), Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH1, ADH2/GAP), Saccharomyces cerevisiae triose phosphate isomerase (TPI), Saccharomyces cerevisiae metallothionein (CUP1), Saccharomyces cerevisiae 3-phosphoglycerate kinase (PGK), and Saccharomyces cerevisiae cell wall mannoprotein (CCW12). Other useful promoters for yeast host cells are described by Romanos et al., 1992, Yeast 8: 423-488.
Examples of suitable promoters for directing transcription of the nucleic acid constructs of the present invention in a filamentous fungal host cell are promoters obtained from the genes for Aspergillus nidulans acetamidase, Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Aspergillus oryzae TAKA amylase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Fusarium oxysporum trypsin-like protease (WO 96/00787), Fusarium venenatum amyloglucosidase (WO 00/56900), Fusarium venenatum Daria (WO 00/56900), Fusarium venenatum Quinn (WO 00/56900), Rhizomucor miehei lipase, Rhizomucor miehei aspartic proteinase, Trichoderma reesei beta-glucosidase, Trichoderma reesei cellobiohydrolase I, Trichoderma reesei cellobiohydrolase II, Trichoderma reesei endoglucanase
I, Trichoderma reesei endoglucanase II, Trichoderma reesei endoglucanase III, Trichoderma reesei endoglucanase V, Trichoderma reesei xylanase I, Trichoderma reesei xylanase II, Trichoderma reesei xylanase III, Trichoderma reesei beta-xylosidase, and Trichoderma reesei translation elongation factor, as well as the NA2-tpi promoter (a modified promoter from an Aspergillus neutral alpha-amylase gene in which the untranslated leader has been replaced by an untranslated leader from an Aspergillus triose phosphate isomerase gene; non-limiting examples include modified promoters from an Aspergillus niger neutral alpha-amylase gene in which the untranslated leader has been replaced by an untranslated leader from an Aspergillus nidulans or Aspergillus oryzae triose phosphate isomerase gene); and mutant, truncated, and hybrid promoters thereof. Other promoters are described in U.S. Pat. No. 6,011,147.
The control sequence may also be a transcription terminator, which is recognized by a host cell to terminate transcription. The terminator is operably linked to the 3'-terminus of the polynucleotide encoding the polypeptide. Any terminator that is functional in the host cell may be used in the present invention.
Preferred terminators for yeast host cells are obtained from the genes for Saccharomyces cerevisiae enolase, Saccharomyces cerevisiae cytochrome C (CYC1), Saccharomyces cerevisiae alcohol dehydrogenase 1 (ADH1), and Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase. Other useful terminators for yeast host cells are described by Romanos et al., 1992, supra.
Preferred terminators for filamentous fungal host cells are obtained from the genes for Aspergillus nidulans acetamidase, Aspergillus nidulans anthranilate synthase, Aspergillus niger glucoamylase, Aspergillus niger alpha-glucosidase, Aspergillus oryzae TAKA amylase, Fusarium oxysporum trypsin-like protease, Trichoderma reesei beta-glucosidase, Trichoderma reesei cellobiohydrolase I, Trichoderma reesei cellobiohydrolase II, Trichoderma reesei endoglucanase I, Trichoderma reesei endoglucanase II, Trichoderma reesei endoglucanase III, Trichoderma reesei endoglucanase V, Trichoderma reesei xylanase I, Trichoderma reesei xylanase II, Trichoderma reesei xylanase III, Trichoderma reesei beta-xylosidase, and Trichoderma reesei translation elongation factor.
The control sequence may also be an mRNA stabilizer region downstream of a promoter and upstream of the coding sequence of a gene which increases expression of the gene.
The control sequence may also be a leader, a non-translated region of an mRNA that is important for translation by the host cell. The leader is operably linked to the 5'-terminus of the polynucleotide encoding the polypeptide. Any leader that is functional in the host cell may be used.
Suitable leaders for yeast host cells are obtained from the genes for Saccharomyces cerevisiae enolase (ENO1), Saccharomyces cerevisiae 3-phosphoglycerate kinase, Saccharomyces cerevisiae alpha-factor, and Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase (ADH2/GAP).
Preferred leaders for filamentous fungal host cells are obtained from the genes for Aspergillus oryzae TAKA amylase and Aspergillus nidulans triose phosphate isomerase.
The control sequence may also be a polyadenylation sequence, a sequence operably linked to the 3'-terminus of the polynucleotide and, when transcribed, is recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation sequence that is functional in the host cell may be used.
Useful polyadenylation sequences for yeast host cells are described by Guo and Sherman, 1995, Mol. Cellular Biol. 15: 5983-5990.
Preferred polyadenylation sequences for filamentous fungal host cells are obtained from the genes for Aspergillus nidulans anthranilate synthase, Aspergillus niger glucoamylase, Aspergillus niger alpha-glucosidase, Aspergillus oryzae TAKA amylase, and Fusarium oxysporum trypsin-like protease.
It may also be desirable to add regulatory sequences that regulate expression of the polypeptide relative to the growth of the host cell. Examples of regulatory sequences are those that cause expression of the gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. In yeast, the ADH2 system or GAL1 system may be used. In filamentous fungi, the Aspergillus niger glucoamylase promoter, Aspergillus oryzae TAKA alpha-amylase promoter, Aspergillus oryzae glucoamylase promoter, Trichoderma reesei cellobiohydrolase I promoter, and Trichoderma reesei cellobiohydrolase promoter may be used. Other examples of regulatory sequences are those that allow for gene amplification. In eukaryotic systems, these regulatory sequences include the dihydrofolate reductase gene that is amplified in the presence of methotrexate, and the metallothionein genes that are amplified with heavy metals. In these cases, the polynucleotide encoding the polypeptide would be operably linked to the regulatory sequence.
According to the invention, the term "expression" is used in its most general meaning and comprises the production of RNA or of RNA and protein/peptide. It also comprises partial expression of nucleic acids. Furthermore, expression may be carried out transiently or stably. According to the invention, the term expression also includes an "increased expression" or "abnormal expression".
"Increased expression" or "abnormal expression" or “silenced” means according to the invention that expression is altered, increased or decreased, compared to a reference, preferably compared to the state in a normal cell in normal growing physical and chemical conditions, not undergoing above-normal respiration and / apoptosis, and in any case in the same conditions and state as the cell whose expression is being compared to. An increase in expression refers to an increase by at least 10%, in particular at least 20%, at least 50%, at least 100%, at least 200%, at least 500%, at least 1000%, at least 10000% or more. A decrease in expression or silencing refers to a decrease by at least 10%, in particular at least 20%, at least 50%, at least 100%, at least 200%, at least 500%, at least 1000%, at least 10000% or more.
In a preferred embodiment, a nucleic acid molecule is according to the invention present in a vector, where appropriate with a promoter, which controls expression of the nucleic acid.
The term "vector" is used here in its most general meaning and comprises any intermediary vehicle for a nucleic acid which enables said nucleic acid, for example, to be introduced into eukaryotic cells and preferably expressed and, where appropriate, to be replicated and/or integrated into a genome. Thus, the term “vector” as used herein generally relates to genetic material that is, at least at the time of introduction into the host cell, an extrachromosomal, usually circular, DNA duplex. A vector containing foreign DNA is termed recombinant DNA. The term vector therefore comprises, but is not limited to, plasmids, viral vectors, cosmids, and artificial chromosomes. Common to most engineered vectors is an origin of replication, one or multiple cloning sites, and one or multiple selectable markers.
A vector for expression of one or more kinases according to the invention may be of a type in which the first and the second kinases are present in different vectors, or of a type in which both are present in the same vector.
The teaching given herein with respect to specific nucleic acid and amino acid sequences, e.g. those shown in the sequence listing, is to be construed so as to also relate to modifications of said specific sequences resulting in sequences which are functionally equivalent to said specific sequences, e.g. nucleic acid sequences encoding amino acid sequences exhibiting properties identical or similar to those of the amino acid sequences encoded by the specific nucleic acid sequences.
The term “ectopic”, particularly in relation to “ectopic expression” as used herein, relates to the occurrence of gene expression in a cell in which it is normally not expressed. Such ectopic expression can be caused by the introduction and expression of a nucleic acid in a vector as defined herein or by juxtaposition of novel enhancer elements to a gene. Such techniques are known to the person skilled in the art.
“Growth medium” or “culture medium” as used herein refers to a solid, liquid, or semi-solid material designed to support the growth of a population of microorganisms or cells via the process of cell proliferation. Different types of media are used for growing different types of cells and are known to the person skilled in the art. Examples of media are YPD medium, YPG medium, YPAD medium, YPGal, synthetic minimal medium, synthetic complex medium, selective minimal medium and selective inducing minimal medium.
The term "peptide analogue" as used herein refers to a compound comprising a peptide, wherein the peptide may be modified with moieties that do not necessarily consist of proteinogenic amino acids and are thus non-proteinogenic amino acid residues. Non-proteinogenic amino acids are those not naturally encoded or found in the genetic code of any organism. These may be, e.g., intermediates in biosynthesis, or post-translationally formed in proteins.
As used herein the term “fusion protein” or “fusion” or “recombinant protein” refers to a single polypeptide chain having at least two polypeptide domains that are not normally present in a single, natural polypeptide. Such a fusion protein is typically obtained by the expression of recombinant DNA molecules. Recombinant DNA molecules are DNA molecules formed by genetic recombination (such as molecular cloning) that bring together genetic material from multiple sources, creating sequences that would not otherwise be found in the genome.
Sequence identity
The recitations "sequence identity" or, for example, comprising a "sequence 75% identical to," as used herein, refer to the extent that sequences are identical on a nucleotide-by-nucleotide basis or an amino acid-by-amino acid basis over a window of comparison. Thus, a "percentage of sequence identity" may be calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, I) or the identical amino acid residue (e.g., Ala, Pro, Ser, Thr, Gly, Val, Leu, Ile, Phe, Tyr, Trp, Lys, Arg, His, Asp, Glu, Asn, Gln, Cys and Met) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, i.e., the window size, and multiplying the result by 100 to yield the percentage of sequence identity.
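The calculation described above can be illustrated with a minimal sketch. It assumes the two sequences have already been optimally aligned and padded to the same length, that the whole aligned length is the window of comparison, and that a gap character never counts as a match. The function name `percent_identity` and the `-` gap symbol are illustrative choices, not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences over the whole window.

    Matched positions are divided by the window size (the aligned length)
    and multiplied by 100. Gap characters are never counted as matches
    (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical example: 6 of 8 aligned positions carry the same base.
print(percent_identity("ACGTACGT", "ACGAAC-T"))  # 75.0
```

Other conventions, such as dividing by the length of the shorter sequence or excluding gapped columns from the window, give different percentages, so the convention used should be stated alongside any reported value.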
Terms used to describe sequence relationships between two or more nucleic acid polymers or polypeptides include "reference sequence," "comparison window," "sequence identity," "percentage of sequence identity" and "substantial identity". A "reference sequence" is at least 12 but frequently 15 to 18 and often at least 25 monomer units, inclusive of nucleotides and amino acid residues, in length. Because two nucleic acid or polypeptide polymers may each comprise (1) a sequence (i.e., only a portion of a complete polymer) that is similar between the two polymers, and (2) a sequence that is divergent between the two polymers, sequence comparisons between two (or more) polymers are typically performed by comparing sequences of the two polymers over a "comparison window" to identify and compare local regions of sequence similarity. A "comparison window" refers to a conceptual segment of at least 6 contiguous positions, usually about 50 to about 100, more usually about 100 to about 150 in which a sequence is compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. The comparison window may comprise additions or deletions (i.e., gaps) of about 20% or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window may be conducted by computerized implementations of algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Drive, Madison, WI, USA) or by inspection and the best alignment (i.e., resulting in the highest percentage homology over the comparison window) generated by any of the various methods selected. Reference also may be made to the BLAST family of programs as for example disclosed by Altschul et al., 1997, Nucl. Acids Res. 25:3389.
A detailed discussion of sequence analysis can be found in Unit 19.3 of Ausubel et al., "Current Protocols in Molecular Biology," John Wiley & Sons Inc, 1994-1998, Chapter 15.
Calculations of sequence similarity or sequence identity between sequences (the terms are used interchangeably herein) can be performed as follows: To determine the percent identity of two nucleic acid sequences, or of two amino acid sequences, the sequences can be aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In certain embodiments of the present disclosure, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, 60%, and even more preferably at least 75%, 80%, 90%, 100% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position.
The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment of the present disclosure, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (1970, J. Mol. Biol. 48: 444-453) algorithm which has been incorporated into the GAP program in the GCG software package, using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment of the present disclosure, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package, using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used unless otherwise specified) is a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5. The percent identity between two amino acid or nucleotide sequences can also be determined using the algorithm of E. Meyers and W. Miller (1989, Cabios, 4: 11-17) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.
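For orientation, the global alignment idea behind the Needleman and Wunsch algorithm can be sketched as follows. This is a simplified illustration and not the GAP program: it assumes a flat match/mismatch score instead of a substitution matrix and a single linear gap penalty instead of separate gap and length weights. The function name and the default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> tuple[str, str, int]:
    """Global alignment by dynamic programming with a linear gap penalty.

    Returns the two aligned strings (gaps shown as '-') and the optimal score.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i: int, j: int) -> int:
        # Substitution score for a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Hypothetical example with one deletion relative to the first sequence.
print(needleman_wunsch("ACGT", "AGT"))  # ('ACGT', 'A-GT', 1)
```

The aligned strings returned here have equal length, so they can be passed to a percent identity calculation over the full alignment. Reproducing the parameter sets named in the text would additionally require a substitution matrix and an affine gap model (separate gap-opening and gap-extension costs).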
The nucleic acid and protein sequences described herein can be used as a "query sequence" to perform a search against public databases, for example, to identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990, J. Mol. Biol. 215: 403-10). BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12 to obtain nucleotide sequences homologous to nucleic acid molecules of the disclosure. BLAST protein searches can be performed with the XBLAST program, score = 50, wordlength = 3 to obtain amino acid sequences homologous to protein molecules of the disclosure. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (Nucleic Acids Res. 25:3389-3402, 1997). When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.
Alcohol substrates
The substrate according to the invention is a primary alcohol with three (3) up to thirty (30) carbon atoms, including, but not limited to: structures with side chains, branched side chains, structures with one (1) or more double bonds, one (1) or more triple bonds, functional groups, structures with the addition of or substitution with the elements Hydrogen, Nitrogen, Oxygen, Fluorine, Silicon, Phosphorus, Sulphur, Chlorine, Selenium, Boron, Iodine, Lithium, Sodium or Potassium.
Such a substrate may be summarised by the structure of formula 1, where formula 1 is: [structure drawing] or [structure drawing],
wherein R1 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, including, but not limited to methyl, ethyl, propyl, isopropyl, methoxy, ethoxy, hydroxyl, hydroxymethyl, hydroxyethyl, sulfhydryl, silyl; a group containing a reactive nonmetal; a group containing a metalloid; a group containing a halogen, comprising: fluoro-, chloro-, bromo-, or iodo- groups; a group containing oxygen, comprising: hydroxyl-, carbonyl-, aldehyde-, haloformyl-, carbonate ester-, carboxylate-, carboxyl-, carboalkoxy-, hydroperoxy-, peroxy-, ether-, hemiacetal-, hemiketal-, acetal-, ketal-, orthoester-, methylenedioxy-, orthocarbonate ester-, or carboxylic anhydride- groups; a group containing nitrogen, comprising: carboxamide-, primary amine-, secondary amine-, tertiary amine-, 4° ammonium ion-, primary ketimine-, secondary ketimine-, primary aldimine-, secondary aldimine-, imide-, azide-, azo-, cyanate-, isocyanate-, nitrate-, nitrile-, isonitrile-, nitrosooxy-, nitro-, nitroso-, oxime-, pyridyl-, or carbamate- groups; a group containing sulfur, comprising: sulfhydryl-, sulfide-, disulfide-, sulfinyl-, sulfonyl-, sulfino-, sulfo-, sulfonate ester-, thiocyanate-, isothiocyanate-, carbonothioyl-, thiocarboxylic acid-, carbothioic S-acid-, carbothioic O-acid-, thiolester-, thionoester-, carbodithioic acid-, or carbodithio- groups; a group containing phosphorus comprising: phosphino-, phosphono-, or phosphate- groups; and/or a group containing boron comprising: borono-, boronate-, borino-, or borinate- groups;
R2 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, including, but not limited to methyl, ethyl, propyl, isopropyl, methoxy, ethoxy, hydroxyl, hydroxymethyl, hydroxyethyl, sulfhydryl, silyl; a group containing a reactive nonmetal; a group containing a metalloid; a group containing a halogen, comprising: fluoro-, chloro-, bromo-, or iodo- groups; a group containing oxygen, comprising: hydroxyl-, carbonyl-, aldehyde-, haloformyl-, carbonate ester-, carboxylate-, carboxyl-, carboalkoxy-, hydroperoxy-, peroxy-, ether-, hemiacetal-, hemiketal-, acetal-, ketal-, orthoester-, methylenedioxy-, orthocarbonate ester-, or carboxylic anhydride- groups; a group containing nitrogen, comprising: carboxamide-, primary amine-, secondary amine-, tertiary amine-, 4° ammonium ion-, primary ketimine-, secondary ketimine-, primary aldimine-, secondary aldimine-, imide-, azide-, azo-, cyanate-, isocyanate-, nitrate-, nitrile-, isonitrile-, nitrosooxy-, nitro-, nitroso-, oxime-, pyridyl-, or carbamate- groups; a group containing sulfur, comprising: sulfhydryl-, sulfide-, disulfide-, sulfinyl-, sulfonyl-, sulfino-, sulfo-, sulfonate ester-, thiocyanate-, isothiocyanate-, carbonothioyl-, thiocarboxylic acid-, carbothioic S-acid-, carbothioic O-acid-, thiolester-, thionoester-, carbodithioic acid-, or carbodithio- groups; a group containing phosphorus comprising: phosphino-, phosphono-, or phosphate- groups; and/or a group containing boron comprising: borono-, boronate-, borino-, or borinate- groups; and
R3 is hydrogen, methyl, fluorine, chlorine, bromine, iodine, sulfhydryl; or hydroxyl.
In one embodiment, the primary alcohol of the invention is an alcohol comprising less than 5 carbon atoms, such as 4 carbon atoms, such as 3 carbon atoms.
In one embodiment, the primary alcohol of the invention is an alcohol comprising more than 5 carbon atoms, such as 6 carbon atoms, such as 7 carbon atoms, such as 8 carbon atoms, such as 9 carbon atoms.
In one embodiment, the primary alcohol of the invention is an alcohol comprising 5 carbon atoms.
Accordingly, the substrate of the invention is any selected from the group formed by, but not limited to:
3,4-dimethylpent-2-en-1-ol, 4-methyl-3-methylenepentan-1-ol, 3,4-dimethylpent-3-en-1-ol, propan-1-ol, prop-2-en-1-ol, prop-2-yn-1-ol, butan-1-ol, but-3-en-1-ol, but-2-en-1-ol, buta-2,3-dien-1-ol, but-3-yn-1-ol, 3-methylbut-3-en-1-ol, 3-methylbut-2-en-1-ol, 3-methylbutan-1-ol, but-2-yn-1-ol, 2-methylenebutan-1-ol, 2-methylbut-2-en-1-ol, 2-methylbut-3-en-1-ol, 2-methylbutan-1-ol, 3-ethylpent-4-en-1-ol, 3-methylpenta-2,4-dien-1-ol, 3-methylpentan-1-ol, 3-methylpent-2-en-1-ol, 3-methylenepentan-1-ol, 3-methylpent-3-en-1-ol, 3-ethylpentan-1-ol, 3-ethylpent-4-en-1-ol, 3-ethylpent-3-en-1-ol, 3-ethylpent-2-en-1-ol, 3-ethylpent-4-yn-1-ol, 3-methylenepent-4-en-1-ol, 3-methylpent-4-yn-1-ol, 3-methylenepent-4-yn-1-ol, 3-methylhexan-1-ol, 3-methylhex-2-en-1-ol, 3-methylenehexan-1-ol, 3-methylhex-3-en-1-ol, 3-methylhex-5-en-1-ol, 3-methylpent-2-en-4-yn-1-ol, 3-methylhex-4-en-1-ol, 3-methylhexa-2,4-dien-1-ol, 3-methylhexa-2,5-dien-1-ol, 3-methylheptan-1-ol, 3-methylhepta-2,4-dien-1-ol, 3-methyleneheptan-1-ol, 3-methylhept-3-en-1-ol, 3-methylhept-4-en-1-ol, 3-methylhept-5-en-1-ol, 3-methylhept-6-en-1-ol, 3-methylhept-2-en-1-ol, 3-methylenehept-4-en-1-ol, 3-methylhepta-3,4-dien-1-ol, 3-methylhept-4-en-1-ol, 3-methylhepta-5,6-dien-1-ol, 3-methylhepta-2,5-dien-1-ol, 3-methylhepta-2,6-dien-1-ol, 3-methylenehept-6-en-1-ol, 3-methylenehept-5-en-1-ol, 3-methylhepta-3,5-dien-1-ol, 3-methylhepta-4,5-dien-1-ol, 3-methylhepta-3,5,6-trien-1-ol, 3-methylhepta-4,6-dien-1-ol, 3-methylhepta-2,4,6-trien-1-ol, 3-methylhepta-4,6-dien-1-ol, 3-methyloctan-1-ol, 3-methyloct-2-en-1-ol, 3-methyleneoctan-1-ol, 3-methyloct-3-en-1-ol, 3-methyloct-4-en-1-ol, 3-methyloct-5-en-1-ol, 3-methyloct-7-en-1-ol, 3-methylocta-2,4-dien-1-ol, 3-methyleneocta-4,5-dien-1-ol, 3-methyleneoct-4-en-1-ol, 3-methylocta-3,4-dien-1-ol, 3-methyloct-6-en-1-ol, 3-methylocta-2,4,5-trien-1-ol, 3-methylocta-2,4,6-trien-1-ol, 3-methyleneocta-4,6-dien-1-ol, 3-methylocta-4,5-dien-1-ol, 3-methylocta-5,6-dien-1-ol, 3-methylocta-6,7-dien-1-ol, 3-methyleneocta-5,7-dien-1-ol, 3-methylocta-2,5,7-trien-1-ol, 3-methyleneocta-4,7-dien-1-ol, 3-methylocta-2,4,7-trien-1-ol, 3-methylocta-4,5,7-trien-1-ol, 3-methylocta-3,4,6-trien-1-ol, 3-methylocta-3,4,7-trien-1-ol, 3-fluorobut-2-en-1-ol, 3-chlorobut-2-en-1-ol, 3-bromobut-2-en-1-ol, 3-aminobut-2-en-1-ol, 3-phosphaneylbut-2-en-1-ol, 3-fluorobut-3-en-1-ol, 3-chlorobut-3-en-1-ol, 3-bromobut-3-en-1-ol, 3-aminobut-3-en-1-ol, 3-phosphaneylbut-3-en-1-ol, 4-chloro-3-methylbut-2-en-1-ol, 4-bromo-3-methylbut-2-en-1-ol, 4-hydroxy-2-methylbut-2-enal, 2-methylbut-2-ene-1,4-diol, 4-mercapto-3-methylbut-2-en-1-ol, 3-methylpent-2-en-1-ol, 4-amino-3-methylbut-2-en-1-ol, 3-(fluoromethyl)but-3-en-1-ol, 4-fluoro-3-methylbut-2-en-1-ol, 3-(chloromethyl)but-3-en-1-ol, 3-(bromomethyl)but-3-en-1-ol, 4-hydroxy-2-methylenebutanal, 2-methylenebutane-1,4-diol, 3-(mercaptomethyl)but-3-en-1-ol, 3-methylenepentan-1-ol, 3-(aminomethyl)but-3-en-1-ol, 3-(phosphaneylmethyl)but-3-en-1-ol, 3-methyl-4-phosphaneylbut-2-en-1-ol, 5-fluoro-3-methylpent-2-en-1-ol, 5-bromo-3-methylpent-2-en-1-ol, 5-chloro-3-methylpent-2-en-1-ol, 3-methylpent-2-ene-1,5-diol, 5-hydroxy-3-methylpent-3-enal, 5-iodo-3-methylpent-2-en-1-ol, 3-methyl-4-(methylthio)but-2-en-1-ol, 5-mercapto-3-methylpent-2-en-1-ol, 5-amino-3-methylpent-2-en-1-ol, 3-methyl-5-phosphaneylpent-2-en-1-ol, and analogues of these compounds including analogues with the elements Nitrogen, Oxygen, Fluorine, Silicon, Phosphorus, Sulfur, Chlorine, Selenium, Boron, Iodine, Lithium, Sodium or Potassium.
In a preferred embodiment, the primary alcohol is 3-methylbut-2-en-1-ol, 4-fluoro-3-methylbut-2-en-1-ol, 3-methylpent-2-en-1-ol, 3,4-dimethylpent-2-en-1-ol, 3-ethylpent-2-en-1-ol, 3-methylhex-2-en-1-ol, 3-methylhexa-2,5-dien-1-ol, 3-methylbut-3-en-1-ol, 3-methylenepentan-1-ol, 2-methylprop-2-en-1-ol, 3-methyl-4-(methylthio)but-2-en-1-ol, or 5-chloro-3-methylpent-2-en-1-ol.
Terpenes and terpenoid compounds
The invention relates to the production of terpenes and terpenoid compounds in eukaryotic cells, said compounds being canonical or non-canonical terpenes or terpenoid compounds. Accordingly, a cell according to the invention is capable of production of a terpene or terpenoid selected from a group comprising, but not limited to:
Limonene, myrcene, alpha-pinene, sabinene, beta-pinene, 1,8-cineole, tricyclene, alpha-thujene, alpha-fenchene, camphene, delta-2-carene, alpha-phellandrene, 3-carene, 1,4-cineole, alpha-terpinene, beta-phellandrene, (Z)-beta-ocimene, (E)-beta-ocimene, gamma-terpinene, terpinolene, linalool, linalool acetate, ethyl linalool acetate, perillene, allo-ocimene, cis-beta-terpineol, cis-terpine-1-ol, isoborneol, delta-terpineol, borneol, chrysanthemol, lavandulol, alpha-terpineol, nerol, geraniol, geranyl acetate, alpha-humulene, beta-caryophyllene, valencene, amorpha-4,11-diene, taxadiene, cannabigerolic acid, grifolic acid, daurichromenic acid, confluentin, rhododaurichromenic acids A and B, anthopogocyclolic acid, anthopogochromenic acid, cannabiorcichromenic acid, cannabiorcicyclolic acid, cis-perrottetinene, (-)-cis-perrottetinenic acid,
7-ethyl-3-methylenenona-1,6-diene,
7-methyl-3-methylenenona-1,6-diene,
7,8-dimethyl-3-methylenenona-1,6-diene,
7-methyl-3-methylenedeca-1,6-diene,
1-methyl-4-(3-methylbut-1-en-2-yl)cyclohex-1-ene,
4-(but-1-en-2-yl)-1-methylcyclohex-1-ene,
1-methyl-4-(pent-1-en-2-yl)cyclohex-1-ene,
1-methyl-4-(pent-2-en-3-yl)cyclohex-1-ene,
2,4-dihydroxy-3-(3-methylpent-2-en-1-yl)-6-pentylbenzoic acid,
2,4-dihydroxy-3-(3-methylhex-2-en-1-yl)-6-pentylbenzoic acid,
3-(4-fluoro-3-methylbut-2-en-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid,
3-(7-ethyl-3-methylnona-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid,
3-(3,7-dimethylnona-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid,
2,4-dihydroxy-6-pentyl-3-(3,4,7-trimethylocta-2,6-dien-1-yl)benzoic acid,
2,4-dihydroxy-6-pentyl-3-(3,4,7-trimethylnona-2,6-dien-1-yl)benzoic acid,
3-(3-ethylpent-2-en-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid,
2,4-dihydroxy-6-pentyl-3-(3,7,8-trimethylnona-2,6-dien-1-yl)benzoic acid,
3-(3,7-dimethyldeca-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid, and
3-(8-fluoro-3,7-dimethylocta-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid.
In one embodiment the terpene or terpenoid is selected from the group comprising hemiterpenes, hemiterpenoids, and monoterpenes.
In another embodiment the terpene or terpenoid is selected from the group comprising acyclic monoterpenes, monocyclic monoterpenes, cyclopropane monoterpenes, cyclobutane monoterpenes, cyclopentane monoterpenes, cyclohexane monoterpenes, cymenes, bicyclic monoterpenes, pinanes, camphanes, fenchanes, monoterpene indole alkaloids, cannabinoids and sesquiterpenes.
In another embodiment the terpene or terpenoid is selected from the group comprising farnesanes, monocyclic farnesane sesquiterpenes, cyclofarnesanes, bisabolanes, germacranes, elemanes, humulanes, polycyclic farnesane sesquiterpenes, caryophyllanes, eudesmanes, furanoeudesmanes, eremophilanes, furanoeremophilanes, valeranes, cadinanes, drimanes, guaianes, cycloguaianes, himachalanes, longipinanes, longifolanes, picrotoxanes, isodaucanes, daucanes, protoilludanes, illudanes, illudalanes, marasmanes, isolactaranes, lactaranes, sterpuranes, acoranes, chamigranes, cedranes, isocedranes, zizaanes, prezizaanes, campherenanes, santalanes, thujopsanes, hirsutanes, pinguisanes, presilphiperfolianes, silphiperfolianes, silphinanes, isocomanes, and diterpenes.
In another embodiment the terpene or terpenoid is selected from the group comprising phytanes, cyclophytanes, bicyclophytanes, labdanes, rearranged labdanes, tricyclophytanes, pimaranes, isopimaranes, cassanes, cleistanthanes, isocopalanes, abietanes, totaranes, tetracyclophytanes, beyeranes, kauranes, villanovanes, atisanes, gibberellanes, grayanatoxanes, cembranes, cyclocembranes, casbanes, lathyranes, jatrophanes, tiglianes, rhamnofolanes, daphnanes, eunicellanes, asbestinanes, briaranes, dolabellanes, dolastanes, fusicoccanes, verticillanes, taxanes, trinervitanes, kempanes, prenylsesquiterpenes, xenicanes, xeniaphyllanes, prenylgermacranes, lobanes, prenyleudesmanes, bifloranes, sacculatanes, prenyldrimanes, prenylguaianes, prenylaromadendranes, sphenolobanes, prenyldaucanes and ginkgolides.
In another embodiment the terpene or terpenoid is a sesterterpene.
In another embodiment the terpene or terpenoid is selected from the group comprising acyclic sesterterpenes, monocyclic sesterterpenes, bicyclic sesterterpenes, tricyclic sesterterpenes, tetracyclic sesterterpenes, pentacyclic sesterterpenes and polycyclic sesterterpenes.
In another embodiment the terpene or terpenoid is a triterpene.
In another embodiment the terpene or terpenoid is selected from the group comprising linear triterpenes, gonane type, tetracyclic triterpenes, protostanes, fusidanes, dammaranes, apotirucallanes, tirucallanes, euphanes, lanostanes, cycloartanes, cucurbitanes, baccharane type, pentacyclic triterpenes, baccharanes, lupanes, oleananes, taraxeranes, multifloranes, baueranes, glutinanes, friedelanes, pachysananes, taraxastanes, ursanes, pentacyclic triterpenes, hopane type, hopanes, neohopanes, fernanes, adiananes, filicanes, gammaceranes, stictanes, arboranes, onoceranes, serratanes, and iridals.
In yet another embodiment the terpene or terpenoid is a tetraterpene.
In another embodiment the terpene or terpenoid is selected from the group comprising carotenoids, apocarotenoids, diapocarotenoids, megastigmanes, polyterpenes and prenylquinones.
Examples of the products of monoterpene synthases include, but are not limited to, the following compounds: tricyclene, alpha-thujene, alpha-pinene, alpha-fenchene, camphene, sabinene, beta-pinene, myrcene, delta-2-carene, alpha-phellandrene, 3-carene, 1,4-cineole, alpha-terpinene, beta-phellandrene, 1,8-cineole, limonene, (Z)-beta-ocimene, (E)-beta-ocimene, gamma-terpinene, terpinolene, linalool, perillene, allo-ocimene, cis-beta-terpineol, cis-terpine-1-ol, isoborneol, delta-terpineol, borneol, chrysanthemol, lavandulol, alpha-terpineol, nerol, geraniol. In addition to GPP, certain terpene synthases (or terpene synthase variants developed by protein engineering) have been reported to convert non-canonical prenyl diphosphate substrates, such as the 11-carbon substrate 2-methyl-GPP, to terpenes with non-canonical prenyl scaffolds (Ignea et al. 2018). In the context of this disclosure, enzymes that are able to convert non-canonical prenyl-diphosphate substrates with carbon lengths that differ from 10 into non-canonical terpenoids with 8, 9, 11, 12, 13 or 14 carbons and/or other non-hydrogen atoms, or that are in any way analogues of the canonical substrate (GPP), are also included in the definition of products of monoterpene synthases.
Examples of sesquiterpene synthase products include, but are not limited to, alpha-humulene, beta-caryophyllene, trans-alpha-bergamotene, cis-alpha-bergamotene, farnesene, alpha-santalene, santalol, beta-selinene, zingiberene, delta-cadinene, germacrene, etc. (see more comprehensive list of structures above).
In some embodiments, the engineered eukaryotic cell according to the present invention is capable of producing terpene scaffolds with 16, 17 or 31 carbon atoms.
In some embodiments, the engineered eukaryotic cell is capable of producing terpene scaffolds with 16, 17 or 31 carbon atoms when fed with an alcohol substrate selected from the group consisting of 3M2E, 3,4-DMP, 3-MP, 3E2E, prenol, and isoprenol.
In some embodiments, the engineered eukaryotic cell comprises nucleic acid sequences encoding AtFKI, AtIPK, Erg20p and CYC2 for the production of terpenes with 16, 17 or 31 carbon atoms, wherein AtFKI comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto. In some embodiments, the cell comprises a nucleic acid sequence encoding an AtIPK, wherein the nucleic acid sequence comprises or consists of SEQ ID NO: 17 or a homologue or variant thereof having at least 75% identity thereto.
In some embodiments, the engineered eukaryotic cell comprises nucleic acid sequences encoding AtFKI, AtIPK, and an enzyme selected from the group consisting of Erg20p(F96C), Synechococcus elongatus GGPPS, Erg20p, Taxus canadensis GGPPS, Lycopersicon esculentum NNPPS, and Solanum habrochaites zFPPS for the production of sesquiterpenes with 16 and 17 carbon atoms, wherein AtFKI comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto. In some embodiments, the cell comprises a nucleic acid sequence encoding an AtIPK, wherein the nucleic acid sequence comprises or consists of SEQ ID NO: 17 or a homologue or variant thereof having at least 75% identity thereto.
In some embodiments, the engineered eukaryotic cell comprises nucleic acid sequences encoding AtFKI, AtIPK, Erg20p(F96C) and Salvia fruticosa trans-beta-caryophyllene synthase (Sf126) for the production of sesquiterpenes with 16 or 17 carbon atoms, wherein AtFKI comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto. In some embodiments, the cell comprises a nucleic acid sequence encoding an AtIPK, wherein the nucleic acid sequence comprises or consists of SEQ ID NO: 17 or a homologue or variant thereof having at least 75% identity thereto.
In some embodiments, the engineered eukaryotic cell comprises nucleic acid sequences encoding AtFKI, AtIPK, and CPQ for the production of triterpenes with 31 carbon atoms, wherein AtFKI comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto. In some embodiments, the cell comprises a nucleic acid sequence encoding an AtIPK, wherein the nucleic acid sequence comprises or consists of SEQ ID NO: 17 or a homologue or variant thereof having at least 75% identity thereto.
In some embodiments, the engineered eukaryotic cell comprises nucleic acid sequences encoding AtFKI, AtIPK, and BmeTC(373C) for the production of triterpenes with 31 carbon atoms, wherein AtFKI comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto. In some embodiments, the cell comprises a nucleic acid sequence encoding an AtIPK, wherein the nucleic acid sequence comprises or consists of SEQ ID NO: 17 or a homologue or variant thereof having at least 75% identity thereto.
In some embodiments, the genetically engineered eukaryotic cell for the production of a terpene or a terpenoid or an isoprenoid comprises a first nucleic acid sequence encoding a first kinase that phosphorylates a primary alcohol to a mono- or pyrophosphate terpenoid precursor; and optionally a second nucleic acid sequence encoding a phosphokinase that phosphorylates a monophosphate precursor to a terpenoid pyrophosphate precursor; wherein the first kinase comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto, wherein the cell further comprises at least one further exogenous nucleic acid encoding an enzyme selected from the group consisting of Erg20p, CYC2, Erg20p(F96C), SynGGPPS, Sf126, CPQ, and BmeTC(373C), wherein the enzyme is capable of catalysing the production of non-canonical isoprenoids or structures containing isoprenoid groups.
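The "at least 75% identity" criterion used in the embodiments above can be illustrated with a minimal sketch. This passage does not specify an alignment algorithm or which denominator is used, so the sketch assumes two sequences that have already been aligned (gaps written as "-") and divides matches by the number of aligned columns; the function names and the gap symbol are hypothetical choices for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: the denominator is the number of alignment columns in which
    at least one sequence has a residue (columns that are gaps in both
    sequences are skipped). Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1  # a residue facing a gap never counts as a match
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_pct: float = 75.0) -> bool:
    """True if the aligned pair reaches the stated identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct


if __name__ == "__main__":
    # Toy aligned fragments, not taken from the sequence listing.
    print(percent_identity("MVMFPENS", "MVMFPQNS"))
    print(meets_identity_threshold("MVMF-ENS", "MVMFPENS"))
```

In practice the aligned strings would come from an alignment tool; the sketch only covers the counting step.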
Examples of diterpene synthase products include but are not limited to: taxadiene, casbene, cembrene, copalyl diphosphate, copal-8-ol diphosphate, etc. (see more comprehensive list of structures above).
Examples of triterpene synthase products include but are not limited to: friedelin, alpha-amyrin, beta-amyrin, lupeol, cucurbitadienol, etc. (see more comprehensive list of structures above).
By “improved enzyme kinetics” as used herein it is meant the result of any change to an enzyme, e.g., a kinase according to the invention or terpene synthase or prenyl transferase according to the invention, said change involving either a change to the nucleic acid sequence coding for said enzyme, or a change in the expressed peptide or protein, which result is measurable in terms of, e.g., the efficiency of said enzyme. Enzyme efficiency can be expressed in terms of kcat/Km, i.e., the specificity constant, wherein kcat is the turnover number and Km is the Michaelis constant, which reflects the affinity an enzyme has for its substrate (a lower Km indicating a higher apparent affinity). Because the specificity constant reflects both affinity and catalytic ability, it is useful for comparing different enzymes against each other, or different variants of said enzyme according to the invention against each other, or the same enzyme with different substrates according to the invention. Thus, the term “improved enzyme kinetics” also encompasses the result of any change affecting the affinity of an enzyme according to the invention.
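As an illustration of how the specificity constant kcat/Km could be used to compare an enzyme variant against a reference, a minimal sketch follows. The function names and all numerical values are hypothetical and are not measurements from this disclosure; kcat is assumed to be in s^-1 and Km in mol/L.

```python
def specificity_constant(kcat_per_s: float, km_molar: float) -> float:
    """Return kcat/Km in M^-1 s^-1 (kcat in s^-1, Km in mol/L)."""
    if kcat_per_s < 0 or km_molar <= 0:
        raise ValueError("kcat must be non-negative and Km must be positive")
    return kcat_per_s / km_molar


def fold_change(variant_kcat: float, variant_km: float,
                reference_kcat: float, reference_km: float) -> float:
    """Ratio of the variant's kcat/Km to the reference's kcat/Km."""
    return (specificity_constant(variant_kcat, variant_km)
            / specificity_constant(reference_kcat, reference_km))


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    ratio = fold_change(variant_kcat=3.0, variant_km=2.0e-4,
                        reference_kcat=2.0, reference_km=4.0e-4)
    print(f"variant kcat/Km is {ratio:.1f}-fold that of the reference")
```

A ratio above 1 would indicate a higher specificity constant for the variant under the assumed conditions.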
By “improved enzyme kinetics” it is also meant the result of any change affecting reaction rate. Reaction rate is often found to have the form $r = k(T)[A]^{m}[B]^{n}$, where k(T) is the reaction rate constant that depends on temperature, and [A] and [B] are the molar concentrations of substances A and B in moles per unit volume of solution, assuming the reaction is taking place throughout the volume of the solution. (For a reaction taking place at a boundary, e.g., on a cell membrane, one would use instead moles of A or B per unit area.) The exponents m and n are called partial orders of reaction and are not generally equal to the stoichiometric coefficients a and b. Instead they depend on the reaction mechanism, and the person skilled in the art would know how to determine them experimentally.
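The rate expression above can be sketched numerically as follows. The second function estimates a partial order from two rate measurements in which only one concentration is varied (the method of initial rates); that procedure is an assumption chosen for illustration and is not prescribed by this disclosure. All names and numbers are hypothetical.

```python
import math


def reaction_rate(k: float, conc_a: float, conc_b: float,
                  m: float, n: float) -> float:
    """Evaluate r = k * [A]^m * [B]^n for a fixed temperature (fixed k)."""
    return k * conc_a ** m * conc_b ** n


def estimate_partial_order(rate_1: float, rate_2: float,
                           conc_1: float, conc_2: float) -> float:
    """Estimate one partial order from two rates measured at two
    concentrations of a single species, all other conditions held constant:
    order = ln(r2 / r1) / ln(c2 / c1)."""
    if min(rate_1, rate_2, conc_1, conc_2) <= 0 or conc_1 == conc_2:
        raise ValueError("rates and concentrations must be positive and distinct")
    return math.log(rate_2 / rate_1) / math.log(conc_2 / conc_1)


if __name__ == "__main__":
    # Hypothetical rate constant and orders, used only to generate test data.
    r_low = reaction_rate(k=2.0, conc_a=0.1, conc_b=0.05, m=2, n=1)
    r_high = reaction_rate(k=2.0, conc_a=0.2, conc_b=0.05, m=2, n=1)
    print(estimate_partial_order(r_low, r_high, 0.1, 0.2))
```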
“Enzyme promiscuity” as used herein refers to the ability of an enzyme to catalyse a fortuitous side reaction in addition to its main reaction. Promiscuous activities are usually slow relative to the main activity. For example, the alcohol kinase of the invention may in some embodiments also be suited to catalyse a phosphate phosphorylation.
Various analytical techniques for measuring protein stability are available in the art and are reviewed in Peptide and Protein Drug Delivery, 247-301, Vincent Lee Ed., Marcel Dekker, Inc., New York, N.Y., Pubs. (1991) and Jones, A. Adv. Drug Delivery Rev. 10: 29-90 (1993). Stability can be measured at a selected temperature for a selected time period. For rapid screening, a formulation comprising the enzymes whose stability is to be compared can be kept at 40°C for 2 weeks to 1 month, at which time stability is measured. For example, the extent of aggregation during storage can be used as an indicator of protein stability.
Whether said genetically engineered eukaryotic cell, such as a yeast cell, is capable of phosphorylating primary alcohols into mono- or pyrophosphate terpenoid precursors may be determined in different manners.
In one embodiment, it is determined by a method comprising the steps of:
• providing an aqueous solution containing a predefined level of a primary alcohol;
• incubating the genetically engineered eukaryotic cell to be tested with said aqueous solution; and
• determining the level of the primary alcohol in the aqueous solution subsequent to said incubation,
wherein the reduction in the primary alcohol level is considered a measure of phosphorylation of the primary alcohol into a mono- or pyrophosphate terpenoid precursor.
Accordingly it is preferred that when the genetically engineered eukaryotic cell according to the invention is incubated in said aqueous solution containing a predefined level of a primary alcohol, then the level of the primary alcohol subsequent to said incubation is at least 1% lower, such as at least 2%, such as at least 3%, such as at least 4%, such as at least 5%, such as at least 6%, such as at least 7%, such as at least 8%, such as at least 9%, such as at least 10%, such as at least 20%, such as at least 30%, such as at least 40%, such as at least 50% lower than the starting level.
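The readout of this first assay can be sketched as a simple calculation: the percentage by which the primary alcohol level has dropped relative to its predefined starting level, compared against the thresholds listed above. The function names, the units, and the example values are hypothetical; only the threshold values come from the text.

```python
# Thresholds (percent reduction) taken from the paragraph above.
REDUCTION_THRESHOLDS_PCT = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50)


def percent_reduction(start_level: float, end_level: float) -> float:
    """Percent decrease of the primary alcohol level relative to the start.

    Both levels must be in the same units (for example mM)."""
    if start_level <= 0:
        raise ValueError("starting level must be positive")
    return 100.0 * (start_level - end_level) / start_level


def highest_threshold_met(reduction_pct: float,
                          thresholds=REDUCTION_THRESHOLDS_PCT):
    """Return the largest listed threshold that is reached, or None."""
    met = [t for t in thresholds if reduction_pct >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical measurement: 10.0 units before, 8.5 units after incubation.
    reduction = percent_reduction(10.0, 8.5)
    print(round(reduction, 2), highest_threshold_met(reduction))
```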
In one embodiment, whether said genetically engineered eukaryotic cell is capable of phosphorylating a primary alcohol present in a culture medium into a mono- or pyrophosphate terpenoid precursor is determined by the steps of:
• providing a culture medium containing a predefined level of primary alcohol of a known size;
• acidifying the culture medium to pH 5;
• incubating the genetically engineered eukaryotic cell to be tested in said medium; and
• determining the level of isoprenoid alcohols in the culture medium subsequent to said incubation,
wherein the appearance of isoprenoid alcohols with larger size than the added primary alcohol is considered a measure of the phosphorylation of the primary alcohol into the mono- or pyrophosphate terpenoid precursor and its further incorporation into larger prenyl diphosphate precursors.
Accordingly it is preferred that when the genetically engineered eukaryotic cell according to the invention is incubated in a culture medium containing a predefined level of a primary alcohol, then the sum of the levels of the different larger size alcohols determined is at least 1%, such as at least 2%, such as at least 3%, such as at least 4%, such as at least 5%, such as at least 6%, such as at least 7%, such as at least 8%, such as at least 9%, such as at least 10%, such as at least 20%, such as at least 30%, such as at least 40%, such as at least 50%, such as at least 60%, such as at least 70%, such as at least 80%, such as at least 90% of the starting level of the predefined primary alcohol having a known size.
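For the second assay, the quantity compared against the thresholds is the summed level of the larger isoprenoid alcohols expressed as a percentage of the starting level of the added primary alcohol. A minimal sketch follows, assuming all levels are reported in the same (for example molar) units; the function name, the compound keys, and the numbers are hypothetical.

```python
def larger_alcohols_pct_of_start(start_level: float,
                                 larger_alcohol_levels: dict) -> float:
    """Sum of larger-alcohol levels as a percentage of the starting level
    of the added primary alcohol (all values in the same units)."""
    if start_level <= 0:
        raise ValueError("starting level must be positive")
    if any(level < 0 for level in larger_alcohol_levels.values()):
        raise ValueError("measured levels must be non-negative")
    return 100.0 * sum(larger_alcohol_levels.values()) / start_level


if __name__ == "__main__":
    # Hypothetical measurements after incubation.
    measured = {"C10 alcohol": 3.0, "C15 alcohol": 1.5}
    print(larger_alcohols_pct_of_start(100.0, measured))
```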
In some embodiments, downstream terpene, terpenoid or isoprenoid compounds can be used as a surrogate measure of the ability of the AtFKI to phosphorylate a primary alcohol into a mono- or pyrophosphate terpenoid precursor. In some embodiments, downstream compounds such as linalool, nerolidol, geranyllinalool, or squalene are measured and used as a marker of mono- or pyrophosphate terpenoid precursor production.
Accordingly it is preferred that when the genetically engineered eukaryotic cell according to the invention is incubated in an aqueous solution containing a predefined level of linalool, nerolidol, geranyllinalool, or squalene, and a predefined level of a primary alcohol, then the molar increase in linalool, nerolidol, geranyllinalool, or squalene level after incubation is at least 25%, such as at least 30%, such as at least 40%, such as at least 50%, such as at least 60%, such as at least 70%, such as at least 80%, such as at least 90%, such as at least 95% higher than the predefined molar level of linalool, nerolidol, geranyllinalool, or squalene at the start of the incubation.
Regardless of whether the method of determining whether said genetically engineered eukaryotic cell is capable of converting the primary alcohol present in an aqueous solution into a mono- or pyrophosphate terpenoid precursor involves determining the levels of the primary alcohol or of the larger isoprenoid alcohols, the incubation in the aqueous solution may be performed in any suitable manner. In general, the incubation is made under conditions allowing growth and/or metabolic activity of said eukaryotic cell, such as a yeast cell. Thus, the incubation is performed at a temperature in the range of 5 to 35°C, such as in the range of 20 to 32°C. In addition to the primary alcohol, the aqueous solution should also comprise components promoting cellular growth, such as yeast strain growth, including a carbon source and a nitrogen source and optionally buffers and salts. The incubation may for example be done for 12 - 24 hours or 1 - 21 days.
OVERVIEW OF SEQUENCE LISTING
SEQ ID NO: 1 Amino acid sequence of Farnesol kinase of Arabidopsis thaliana UniProt accession Q67ZM7.
SEQ ID NO: 2 Truncated amino acid sequence of Farnesol kinase of Arabidopsis thaliana UniProt accession Q67ZM7, missing 65 first aa, termed delta65AtFKI.
SEQ ID NO: 3 Amino acid sequence of Isopentenyl phosphate kinase of Arabidopsis thaliana UniProt accession Q8H1F7.
SEQ ID NO: 4 Amino acid sequence of Isopentenyl phosphate kinase of Methanolobus tindarius strain DSM 2278 UniProt accession W9DTD1
SEQ ID NO: 5 Amino acid sequence of Isopentenyl phosphate kinase of Thermoplasma acidophilum (strain ATCC 25905 / DSM 1728 / JCM 9062 / NBRC 15155 / AMRC-C165) UniProt accession Q9HLX1
SEQ ID NO: 6 Amino acid sequence of Isopentenyl phosphate kinase of Thermoplasma acidophilum (strain ATCC 25905 / DSM 1728 / JCM 9062 / NBRC 15155 / AMRC-C165) UniProt accession Q9HLX1 comprising amino acid change (204G)
SEQ ID NO: 7 Amino acid sequence of Erg20p or Geranyl diphosphate synthase of Saccharomyces cerevisiae strain JAY291 UniProt accession C7GRZ5
SEQ ID NO: 8 Amino acid sequence of Erg20p or Geranyl diphosphate synthase of Saccharomyces cerevisiae strain JAY291 UniProt accession C7GRZ5 comprising amino acid change (N127W) and indicated as Erg20pN127W
SEQ ID NO: 9 Amino acid sequence of (+)-limonene synthase of Citrus limon and encoded by the CILimS gene GenBank accession AAM53944.1.
SEQ ID NO: 10 Amino acid sequence of Beta-myrcene synthase of Ocimum basilicum and encoded by the ObMyrS (MYS gene). UniProt accession Q5SBP1
SEQ ID NO: 11 Amino acid sequence of the Sabinene synthase of Salvia pomifera and encoded by the SpSabS gene. UniProt accession A6XH06.
SEQ ID NO: 12 Amino acid sequence of alpha-Pinene synthase of Pinus taeda and encoded by the PtPinS gene. UniProt accession Q84KL3.
SEQ ID NO: 13 Amino acid sequence of geranyldiphosphate: olivetolate geranyltransferase of Cannabis sativa and encoded by the CsPT4 gene. Uniprot: A0A455ZJC3
SEQ ID NO: 14 Nucleotide coding sequence of the AtFKI gene of Arabidopsis thaliana NCBI RefSeq: NM_125242.4
SEQ ID NO: 15 Nucleotide coding sequence of delta65AtFKI of Arabidopsis thaliana
SEQ ID NO: 16 Nucleotide mRNA transcript of the AtFKI gene of Arabidopsis thaliana NCBI RefSeq: NM_125242.4
SEQ ID NO: 17 Nucleotide coding sequence of AtIPK of Arabidopsis thaliana NCBI RefSeq: NM_102426.6
SEQ ID NO: 18 Nucleotide coding sequence of Isopentenyl phosphate kinase of Methanolobus tindarius strain DSM 2278. Gene MettiDRAFT_2389 Uniprot: W9DTD1.
SEQ ID NO: 19 Nucleotide coding sequence of Isopentenyl phosphate kinase of Thermoplasma acidophilum (strain ATCC 25905 / DSM 1728 / JCM 9062 / NBRC 15155 / AMRC-C165). Uniprot: Q9HLX1.
SEQ ID NO: 20 Nucleotide coding sequence of Erg20p or farnesyl diphosphate synthase of Saccharomyces cerevisiae strain S288C. Uniprot: P08524.
SEQ ID NO: 21 Nucleotide coding sequence of (+)-limonene synthase of Citrus limon CILimS gene GenBank accession AF514287.1.
SEQ ID NO: 22 Nucleotide coding sequence of Beta-myrcene synthase of Ocimum basilicum GenBank accession AY693649.1
SEQ ID NO: 23 Nucleotide coding sequence of Sabinene synthase of Salvia pomifera GenBank accession DQ785794.1
SEQ ID NO: 24 Nucleotide coding sequence of alpha-Pinene synthase of Pinus taeda GenBank accession AF543530.1
SEQ ID NO: 25 Nucleotide coding sequence of geranyldiphosphate: olivetolate geranyltransferase of Cannabis sativa GenBank accession BK010648
SEQ ID NO: 26 Amino acid sequence of Erg20p or farnesyl diphosphate synthase of Saccharomyces cerevisiae strain S288C. UniProt accession P08524 comprising amino acid change (F96C) and indicated as Erg20pF96C
SEQ ID NO: 27 Amino acid sequence of terpentetriene synthase from Streptomyces griseolosporeus (CYC2). UniProt accession Q9AJE3.
SEQ ID NO: 28 Amino acid sequence of GGPP synthase of Synechococcus elongatus (SynGGPPS). UniProt accession Q2JX96.
SEQ ID NO: 29 Amino acid sequence of GGPP synthase of Taxus canadensis (TcaGGPPS). UniProt accession Q9ZPM3.
SEQ ID NO: 30 Amino acid sequence of nerylneryl diphosphate synthase of Lycopersicon esculentum (LycNNPPS). UniProt accession K7WQ45.
SEQ ID NO: 31 Amino acid sequence of z-FPP synthase of Solanum habrochaites (zFPPS). UniProt accession B8XA40.
SEQ ID NO: 32 Amino acid sequence of Salvia fruticosa trans-beta-caryophyllene synthase (Sf126).
SEQ ID NO: 33 Amino acid sequence of D373C variant of terpene cyclase of Priestia megaterium or Bacillus megaterium (BmeTC(373C)). UniProt accession D5DR56 (wild type).
SEQ ID NO: 34 Amino acid sequence of cucurbitadienol synthase of Cucumis sativus (CPQ). UniProt accession A0A097IYL3.
SEQ ID NO: 35 Nucleotide coding sequence of variant Erg20pF96C of Saccharomyces cerevisiae strain S288C.
SEQ ID NO: 36 Nucleotide coding sequence of terpentetriene synthase from Streptomyces griseolosporeus (CYC2).
SEQ ID NO: 37 Nucleotide coding sequence of GGPP synthase of Synechococcus elongatus (SynGGPPS)
SEQ ID NO: 38 Nucleotide coding sequence of GGPP synthase of Taxus canadensis (TcaGGPPS)
SEQ ID NO: 39 Nucleotide coding sequence of nerylneryl diphosphate synthase of Lycopersicon esculentum (LycNNPPS)
SEQ ID NO: 40 Nucleotide coding sequence of z-FPP synthase of Solanum habrochaites (zFPPS).
SEQ ID NO: 41 Nucleotide coding sequence of Salvia fruticosa trans-beta-caryophyllene synthase (Sf126)
SEQ ID NO: 42 Nucleotide coding sequence of D373C variant of terpene cyclase of Priestia megaterium or Bacillus megaterium (BmeTC(373C)).
SEQ ID NO: 43 Nucleotide coding sequence of cucurbitadienol synthase of Cucumis sativus (CPQ)
SEQ ID NO: 44 Amino acid sequence of (2Z,6Z)-farnesyl diphosphate synthase from Solanum lycopersicum (Lycopersicon esculentum). UniProt code K7W9N9.
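The relationship between SEQ ID NO: 1 and SEQ ID NO: 2 described in the overview (removal of the first 65 amino acids) can be sketched as a string operation. In the listing, SEQ ID NO: 2 begins with a methionine followed by residue 66 onward of SEQ ID NO: 1, so the sketch prepends a start methionine; the function name and the `add_start_met` parameter are hypothetical, and only the first 120 residues of SEQ ID NO: 1 are used to keep the example short.

```python
# First 120 residues of SEQ ID NO: 1, copied from the sequence listing.
FULL_LENGTH_PREFIX = (
    "MATTSTTTKLSVLCCSFISSPLVDSPPSLAFFSPIPRFLTVRIATSFRSSSRFPATKIRK"
    "SSLAAVMFPENSVLSDVCAFGVTSIVAFSCLGFWGEIGKRGIFDQKLIRKLVHINIGLVF"
)


def n_terminal_truncation(protein: str, n_removed: int,
                          add_start_met: bool = True) -> str:
    """Remove the first n_removed residues and optionally prepend 'M'."""
    if not 0 <= n_removed < len(protein):
        raise ValueError("n_removed must be within the sequence length")
    truncated = protein[n_removed:]
    return "M" + truncated if add_start_met else truncated


if __name__ == "__main__":
    delta65_prefix = n_terminal_truncation(FULL_LENGTH_PREFIX, 65)
    # Compare with the start of SEQ ID NO: 2 in the listing.
    print(delta65_prefix[:12])
```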
SEQUENCES
SEQ ID NO: 1
MATTSTTTKLSVLCCSFISSPLVDSPPSLAFFSPIPRFLTVRIATSFRSSSRFPATKIRK
SSLAAVMFPENSVLSDVCAFGVTSIVAFSCLGFWGEIGKRGIFDQKLIRKLVHINIGLVF
MLCWPLFSSGIQGALFASLVPGLNIVRMLLLGLGVYHDEGTIKSMSRHGDRRELLKGPLY
YVLSITSACIYYWKSSPIAIAVICNLCAGDGMADIVGRRFGTEKLPYNKNKSFAGSIGMA
TAGFLASVAYMYYFASFGYIEDSGGMILRFLVISIASALVESLPISTDIDDNLTISLTSA
LAGFLLF
SEQ ID NO: 2
MVMFPENSVLSDVCAFGVTSIVAFSCLGFWGEIGKRGIFDQKLIRKLVHINIGLVFMLCWPLFSSGIQGALFASLVPGLNIVRMLLLGLGVYHDEGTIKSMSRHGDRRELLKGPLYYVLSITSACIYYWKSSPIAIAVICNLCAGDGMADIVGRRFGTEKLPYNKNKSFAGSIGMATAGFLASVAYMYYFASFGYIEDSGGMILRFLVISIASALVESLPISTDIDDNLTISLTSALAGFLLF
SEQ ID NO: 3
MELNISESRSRSIRCIVKLGGAAITCKNELEKIHDENLEWACQLRQAMLEGSAPSKVIG
MDWSKRPGSSEISCDVDDIGDQKSSEFSKFWVHGAGSFGHFQASRSGVHKGGLEKPIVKAG
FVATRISVTNLNLEIVRALAREGIPTIGMSPFSCGWSTSKRDVASADLATVAKTIDSGFVPVLHG
DAVLDNILGCTILSGDVIIRHLADHLKPEYVVFLTDVLGVYDRPPSPSEPDAV
LLKEIAVGEDGSWKVVNPLLEHTDKKVDYSVAAHDTTGGMETKISEAAMIAKLGVDVYIV
KAATTHSQRALNGDLRDSVPEDWLGTIIRFSK
SEQ ID NO: 4
MDNNNITILKIGGSVITDKSADDGTARLSEIERIAAEISGFEGKLIIVHGAGSFGHPQVK
RFGLTGKFDHEGSIITHMSVRKLNTMVVETLNSAGINALPVHPMACAISSNSRIKSMFRE
QIEEMLANGFVPVLHGDMVMDTDLGTSVLSGDQIVPYLAIQMKASRIGIGSAEEGVLDDK
GGVIPLINNENFDEIKAYLSGSANTDVTGGMLGKVLELLELSEQSNSTSYIFNAGNTGNI
SDFLSGKNIGTAIGAGTI
SEQ ID NO: 5
MMILKIGGSVITDKSAYRTARTYAIRSIVKVLSGIEDLVCVVHGGGSFGHIKAMEFGLPG
PKNPRSSIGYSIVHRDMENLDLMVIDAMIEMGMRPISVPISALRYDGRFDYTPLIRYIDA
GFVPVSYGDVYIKDEHSYGIYSGDDIMADMAELLKPDVAVFLTDVDGIYSKDPKRNPDAV
LLRDIDTNITFDRVQNDVTGGIGKKFESMVKMKSSVKNGVYLINGNHPERIGDIGKESFI
GTVIR
SEQ ID NO: 6
MMILKIGGSVITDKSAYRTARTYAIRSIVKVLSGIEDLVCVVHGGGSFGHIKAMEFGLPG
PKNPRSSIGYSIVHRDMENLDLMVIDAMIEMGMRPISVPISALRYDGRFDYTPLIRYIDA
GFVPVSYGDVYIKDEHSYGIYSGDDIMADMAELLKPDVAVFLTDVDGIYSKDPKRNPDAV
LLRDIDTNITFDRVQNDVTGGIGGKFESMVKMKSSVKNGVYLINGNHPERIGDIGKESFIGTVIR
SEQ ID NO: 7
MASEKEIRRERFLNVFPKLVEELNASLLAYGMPKEACDWYAHSLNYNTPGGKLNRGLSWDTY
AILSNKTVEQLGQEEYEKVAILGWCIELLQAYFLVADDMMDKSITRRGQPCWYKVPEVGEIAIND
AFMLEAAIYKLLKSHFRNEKYYIDITELFHEVTFQTELGQLMDLITAPEDKVDLSKFSLKKHSFIVT
FKTAYYSFYLPVALAMYVAGITDEKDLKQARDVLIPLGEYFQIQD
DYLDCFGTPEQIGKIGTDIQDNKCSVWINKALELASAEQRKTLDENYGKKDSVAEAKCKK
IFNDLKIEQLYHEYEESIAKDLKAKISQVDESRGFKADVLTAFLNKVYKRSK
SEQ ID NO: 8
MASEKEIRRERFLNVFPKLVEELNASLLAYGMPKEACDWYAHSLNYNTPGGKLNRGLSWDTY
AILSNKTVEQLGQEEYEKVAILGWCIELLQAYFLVADDMMDKSITRRGQPCWYKVPEVGEIAIW
DAFMLEAAIYKLLKSHFRNEKYYIDITELFHEVTFQTELGQLMDLITAPEDKVDLSKFSLKKHSFIV
TFKTAYYSFYLPVALAMYVAGITDEKDLKQARDVLIPLGEYFQIQD
DYLDCFGTPEQIGKIGTDIQDNKCSVWINKALELASAEQRKTLDENYGKKDSVAEAKCKK
IFNDLKIEQLYHEYEESIAKDLKAKISQVDESRGFKADVLTAFLNKVYKRSK
SEQ ID NO: 9
MSSCINPSTLVTSVNAFKCLPLATNKAAIRIMAKYKPVQCLISAKYDNLTVDRRSANYQPSIWDH
DFLQSLNSNYTDEAYKRRAEELRGKVKIAIKDVIEPLDQLELIDNLQRLGLAHRFETEIRNILNNIY
NNNKDYNWRKENLYATSLEFRLLRQHGYPVSQEVFNGFKDDQGGFICDDFKGILSLHEASYYS
LEGESIMEEAWQFTSKHLKEVMISKNMEEDVFVAEQAKRALELPLHWKVPMLEARWFIHIYER
REDKNHLLLELAKMEFNTLQAIYQEELKEISGWWKDTGLGEKLSFARNRLVASFLWSMGIAFEP
QFAYCRRVLTISIALITVIDDIYDVYGTLDELEIFTDAVERWDINYALKHLPGYMKMCFLALYNFVN
EFAYYVLKQQDFDLLLSIKNAWLGLIQAYLVEAKWYHSKYTPKLEEYLENGLVSITGPLIITISYLS
GTNPIIKKELEFLESNPDIVHWSSKIFRLQDDLGTSSDEIQRGDVPKSIQCYMHETGASEEVARQ
HIKDMMRQMWKKVNAYTADKDSPLTGTTTEFLLNLVRMSHFMYLHGDGHGVQNQETIDVGFT
LLFQPIPLEDKHMAFTASPGTKG
SEQ ID NO: 10
MWSTISISMNVAILKKPLNFLHNSNNKASNPRCVSSTRRRPSCPLQLDVEPRRSGNYQPSAWD
FNYIQSLNNNHSKEERHLERKAKLIEEVKMLLEQEMAAVQQLELIEDLKNLGLSYLFQDEIKIILN
SIYNHHKCFHNNHEQCIHVNSDLYFVALGFRLFRQHGFKVSQEVFDCFKNEEGSDFSANLADD
TKGLLQLYEASYLVTEDEDTLEMARQFSTKILQKKVEEKMIEKENLLSWTLHSLELPLHWRIQRL
EAKWFLDAYASRPDMNPIIFELAKLEFNIAQALQQEELKDLSRVWVNDTGIAEKLPFARDRIVESH
YWAIGTLEPYQYRYQRSLIAKIIALTTWDDVYDVYGTLDELQLFTDAIRRWDIESINQLPSYMQL
CYLAIYNFVSELAYDIFRDKGFNSLPYLHKSWLDLVEAYFVEAKWFHDGYTPTLEEYLNNSKITII
CPAIVSEIYFAFANSIDKTEVESIYKYHDILYLSGMLARLPDDLGTSSFEMKRGDVAKAIQCYMKE
HNASEEEAREHIRFLMREAWKHMNTAAAADDCPFESDLWGAASLGRVANFVYVEGDGFGVQ
HSKIHQQMAELLFYPYQ
SEQ ID NO: 11
MPLNSLHNLERKPSKAWSTSCTAPAARLQASFSLQQEEPRQIRRSGDYQPSLWDFNYIQSLNT
PYKEQRYVNRQAELIMQVRMLLKVKMEAIQQLELIDDLQYLGLSYFFPDEIKQILSSIHNEHRYF
HNNDLYLTALGFRILRQHGFNVSEDVFDCFKTEKCSDFNANLAQDTKGMLQLYEASFLLREGE
DTLELARRFSTRSLREKLDEDGDEIDEDLSSWIRHSLDLPLHWRIQGLEARWFLDAYARRPDM
NPLIFKLAKLNFNIVQATYQEELKDVSRVWVNSSCLAEKLPFVRDRIVECFFWAIGAFEPHQYSY
QRKMAAIIITFVTIIDDVYDVYGTLEELELFTDMIRRWDNISISQLPYYMQVCYLALYNFVSERAYD
ILKDQHFNSIPYLQRSVWSLVEGYLKEAYWYYNGYKPSLEEYLNNAKISISAPTIISQLYFTLANS
TDETVIESLYEYHNILYLSGTILRLADDLGTSQHELERGDVPKAIQCYMKDTNASEREAVEHVKF
LIRETWKEMNTVTTASDCPFTDDLVAVATNLARAAQFIYLDGDGHGVQHSEIHQQMGGLLFQP
YV
SEQ ID NO: 12
MALVSAVPLNSKLCLRRTLFGFSHELKAIHSTVPNLGMCRGGKSIAPSMSMSSTTSVSNE
DGVPRRIAGHHSNLWDDDSIASLSTSYEAPSYRKRADKLIGEVKNIFDLMSVEDGVFTSP
LSDLHHRLWMVDSVERLGIDRHFKDEINSALDHVYSYWTEKGIGRGRESGVTDLNSTALG
LRTLRLHGYTVSSHVLDHFKNEKGQFTCSAIQTEGEIRDVLNLFRASLIAFPGEKIMEAA
EIFSTMYLKDALQKIPPSGLSQEIEYLLEFGWHTNLPRMETRMYIDVFGEDTTFETPYLI
REKLLELAKLEFNIFHSLVKRELQSLSRWWKDYGFPEITFSRHRHVEYYTLAACIANDPK
HSAFRLGFGKISHMITILDDIYDTFGTMEELKLLTAAFKRWDPSSIECLPDYMKGVYMAV
YDNINEMAREAQKIQGWDTVSYARKSWEAFIGAYIQEAKWISSGYLPTFDEYLENGKVSF
GSRITTLEPMLTLGFPLPPRILQEIDFPSKFNDLICAILRLKGDTQCYKADRARGEEASA
VSCYMKDHPGITEEDAVNQVNAMVDNLTKELNWELLRPDSGVPISYKKVAFDICRVFHYG
YKYRDGFSVASIEIKNLVTRTVVETVPL
SEQ ID NO: 13
MGLSLVCTFSFQTNYHTLLNPHNKNPKNSLLSYQHPKTPIIKSSYDNFPSKYCLTKNFHL
LGLNSHNRISSQSRSIRAGSDQIEGSPHHESDNSIATKILNFGHTCWKLQRPYVVKGMIS
IACGLFGRELFNNRHLFSWGLMWKAFFALVPILSFNFFAAIMNQIYDVDIDRINKPDLPL
VSGEMSIETAWILSIIVALTGLIVTIKLKSAPLFVFIYIFGIFAGFAYSVPPIRWKQYPF
TNFLITISSHVGLAFTSYSATTSALGLPFVWRPAFSFIIAFMTVMGMTIAFAKDISDIEG
DAKYGVSTVATKLGARNMTFWSGVLLLNYLVSISIGIIWPQVFKSNIMILSHAILAFCL
IFQTRELALANYASAPSRQFFEFIWLLYYAEYFVYVFI
SEQ ID NO: 14
AT GG CAACT ACT AGT ACTACT ACAAAG CTCTCCGTTCTCTG CTGCT CTTT CATTT CAT CTCC T CT CGTT G ACT CT COT CCTT CT CT CGCCTT CTT CT CT COG ATT CCACG ATT COT CACT GTCC GAATCGCGACTAGCTTTAGATCGAGCTCTAGGTTTCCGGCCACCAAAATCCGCAAGTCTTC ACT CGCCGCCGT GAT GTTT CCGGAAAATT CGGTTTTAT CAGAT GT CTGCGCGTTT GGAGT C ACTAGCAT CGTTGCGTT CT CGT GCCT CGGTTT CTGGGGAGAGATTGGCAAACGT GGCAT C TT CG ACCAG AAACT CAT CCG G AAG CTT GTG CAT AT AAAT ATTGG G CT AGTTTTT AT GCTTT G CTGGCCGCT GTT CAGTT CT GGAAT CCAAGGAGCACTTTT CGCAT CT CTT GT ACCTGGACT C AATATAGTAAGGATGCTATTGCTGGGGCTTGGAGTGTACCACGACGAAGGAACAATCAAGT CAAT GAGCAG ACAT GG AG AT CGCAGGGAACT ACTT AAGGGGCCGCTTT ACT AT GTACT GT C AAT CACAT CAGCCTGCAT CT ACT ATT GG AAAT CAT CCCCAAT CGCGATT GCGGT GAT ATGC AACCTTTGCGCAGGAGATGGTATGGCTGACATTGTGGGTCGGCGGTTTGGAACAGAGAAG CTT CCTT ACAACAAAAACAAAT CATTT GCTGGTAGCATT GGAAT GGCCACCGCCGGGTTT C TAGCATCTGTTGCGTATATGTACTACTTTGCTTCATTTGGTTACATCGAGGATAGCGGGGG AAT GATT CTT CGTTT CCT CGT CAT CT CT AT AGCAT CAGCT CTT GT GGAAT CACT CCCAAT AA GCACCG ACATT G ACG ACAAT CT CACCATTT CCTT AACCT CT GCCTTGGCCGG ATT CTT ACT CTTCTAA
SEQ ID NO: 15
ATGGTT AT GTT CCCAG AAAACT CT GTTTT GT CT GAT GTTT GTGCTTT CGGT GTT ACTT CT AT C GTT GCTTT CT CAT GTTT AGGTTT CTGGGGT GAAATT GGT AAGAG AGGTATTTT CG ACCAGAA GOT GATT AGAAAGTTGGT CCATATTAACAT CGGCCTGGTTTTTAT GTT GT GTT GGCCTTT GT TTT CCT CAGGTATT CAAGGT GCTTT GTT CGCTT CTTTGGTT CCAGGTTT GAAT AT CGT CAG A AT GTT GTTGTT AGGTTTGGGT GTTT ACCACG AT GAAGGTACT ATT AAGT COAT GT CAAG ACA CGGT G ACAGAAGAG AATT ATT GAAAGGT CCCTT GT ACT ACGT CTT GT CT ATT ACTT CT GCTT GOAT CT ACT ACTGGAAGT CAT CT CCAATT GOT ATT GCTGTT AT CT GTAACTT GTGTGCTGGT GATGGTAT GGCT GAT AT AGTT GGTAG AAG ATT CGGTACT G AAAAGCTGCCAT ACAACAAGA ACAAATCTTTCGCTGGTTCTATTGGTATGGCAACTGCTGGTTTTTTGGCTTCTGTTGCTTAT AT GT ACT ACTT CGCCT CTTT CGGTTACAT CGAAGATT CT GGTGGTAT GAT CTT GAGATT CCT GGTT ATTT CT ATTGCCT CCGCTTTGGTT GAAT CCTTGCCAATTT CT ACCG AT AT CGAT GAT A ACCT GACCAT CT CTTT GACAT CTGCTTTGGCAGGTTT CTT GOT GTT CTAA
SEQ ID NO: 16
CATGAACCATTCTAGCCGAATCCAAATAGAACGAAACTAAAATTGCTTTAATCCATATGTGA CCAG AT AAAGT AAAACACT CCTTTT GG G G AAAT AAAT AT CC ACTTTT CACCAT CTTTTT G GTA AACAATAAAACAAACCAAATTTT GT GTGCTT GAGCAACAACAACT ACCAACCAGTT ACT GT C AAAAAAAAT AAAT GAAAACGT AAT CG AAG AAAAGGT CGT GTTTT CT AGT GTT GCAG AAAAT G GCAACT ACT AGT ACT ACT ACAAAGCT CT CCGTT CT CTGCT GOT CTTT CATTT CAT CT COT CT CGTT GACT CT CCT CCTT CT CT CGCCTT CTT CT CT CCGATT CCACGATT CCT CACT GT CCGAA T CGCGACTAGCTTTAGAT CGAGCT CTAGGTTT CCGGCCACCAAAAT CCGCAAGT CTT CACT CGCCGCCGT GAT GTTT CCGGAAAATT CGGTTTTAT CAGAT GT CT GCGCGTTTGGAGT CACT AGCATCGTTGCGTTCTCGTGCCTCGGTTTCTGGGGAGAGATTGGCAAACGTGGCATCTTC GACCAGGTTTT CGAT CATT CAATT CACTT AATT AAATT AT ACACGCGT CGATTT GTT GAAT AA CGTTTTT G G AAT GTTGCT CG ATT CT G CAATTT CTG GTT AAAG CAACTTT G GATT C AG AGTAA TGGTAGTAGTAAGCTTTATCGAGTAATGGATGGCATTAATGGCGAGGATAATAAATCTTATT AGTTCAAGAAATTATAGAATGTGAATGGATACCTATTTTGGGGTGGAAGAACACTGCGATAA ACAATGGGAAAGTAACCCCATATCTGGTAAATTAGTGGGAAAATTGCTCAGAACCCAATTT GGGGAT G AAATTTTTTT GAAATT CGTGCGGT AT AAT CACAAT CT CAT CGTAACTT G AAGATT CAACCAT ATTT AT GAT AT AT AT G AATTTT GT AG GTTT AAAT GTTG CAC AATTT CTTT GOT AT AC TTT G AAGG ATT ATTT CTT CGTAT G AGT CAGT CT AT ACACAT ATTGG ACAT GGCTT GTTTTTTT TT CCTTT CT CTT ATGT GTTT CT CCGCAGT GACTT GTTTTT CTT CAAGTAATTTGGTT AGG AGT AG AGG AATT AT CT AT GATT AG CAAAAATT CT ATT AAAC ATTTTT G GTTT GG G AAT ACCTGT AT GGCTT CAGCT ACTT GATTTT GTTT CT CT G ACCTT AT CGT GATT AAG AT AT CAT CT CAG AACC CATT G AATTT AT AAG AT CCACAGGCT GT CT GAT GT AGGACT GACT CCACTTT CAT GGTTTT A TT GAAT G AGGCATTGGTTT AACAT CT GTGGTT G ACATTGGT GTT ATTT CTT ATTTTT CAGAAA CTCATCCGGAAGCTTGTGCATATAAATATTGGGCTAGTTTTTATGCTTTGCTGGCCGCTGTT CAGGTAAT AAT ACGG AGT G AAAGG AT CGATTTT GAT GT ACCAGCTTGCTTTT CTGCTT CACC AGTTT ATTT CGAT AT CAATT GTTT CAGTT CT GG AAT CCAAGG AGCACTTTT CGCAT CT CTT GT ACCTGGACTCAATATAGTAAGGATGCTATTGCTGGGGCTTGGAGTGTACCACGACGAAGG AACAAT CAAGT CAAT G AGCAG ACAT GG AGAT CGCAGGTAGCT GAAT CGAACAAGT GTCTGT CACAAT AT CCAACAAT 
CT CTT AGCATT CTT GGTATCTAT GAT CAAAATT CGGTTT GT CT GTTT ATTATTG CTTTT GT CACTT CACAAAGT GTCTGT AGT CG CTCTCTG CAT CAG GG AACT ACTT A AGGGGCCGCTTTACTATGTACTGTCAATCACATCAGCCTGCATCTACTATTGGAAATCATCC CCAATCGCGATTGCGGTGATATGCAACCTTTGCGCAGGAGATGGTAAGTACAGTTAGCTCT GGTTT GAT CCAAT AACATT AGCAACTTTT GT CTT AGG AATTT GAAT CCATT GAAT GAT GATT C TAATACACACACTTGCTGTTAAAAGGTATGGCTGACATTGTGGGTCGGCGGTTTGGAACAG AG AAGCTT CCTT AC AACAAAAAC AAAT CATTT GCTG GT AG CATT GG AAT G G CCACCG CCG G GTTTCT AG CAT CTGTTGCGTATATGT ACT ACTTT G CTT CATTT GGTT ACAT CG AG GAT AG CG GGGGAAT GATT CTT CGTTT CCT CGT CAT CT CT AT AGCAT CAGCT CTT GT GG AAT CACT COCA ATAAGCACCGACATT GACGACAAT CT CACCATTT CCTT AACCT CT GCCTTGGCCGGATT CTT ACT CTT CT AAT AAT ACCCT CT CGTT GTT AT GT AT CAT CAAAT AAAGGGT CGAGCTT G ATTGC T GAT AT GG AGGTAAAACTGCATT CATT GTT CCCAT CTT CTT CTGTAT GTACGT ATT AGT GAA AC ATCT CAT ATTGTTGTTGT CC AC AAAT CTTATTTTTCAGCTG CAATT GCAGTTGGGT AC AAT GTT GTAAT GTT CT AT CCATT AGT GAG ACAT AT GAT GACG AACAGT G ACGCTT CT ACAATTT G T AACAG AT ACT CTTTT GTAACAGAT ATT CAAT ACAT GTTT GTTT GTT ATTT GGCCT ATGGCTA TGTGG
SEQ ID NO: 17
AT GG AGCT GAAT ATTT CCG AG AGT CG AAGCAGAT CAATT CGTT GCATT GT GAAACTTGGAG GTGCGGCAATTACTTGCAAAAACGAGCTGGAGAAGATTCACGATGAAAATCTGGAGGTCGT GGCGTGTCAGTTACGTCAAGCTATGTTGGAGGGTTCAGCTCCAAGCAAGGTTATTGGCAT GGATT GGAGCAAGAGACCTGGAAGCT CT GAGATTT CTT GT GAT GTGGAT GACATAGGGGA TCAAAAGTCTTCTGAGTTTAGTAAATTTGTTGTGGTTCATGGCGCTGGTTCCTTTGGGCACT
TTCAGGCCAGTAGATCTGGGGTTCACAAAGGAGGACTTGAGAAACCTATTGTCAAAGCTGG TTT CGTTG CT ACT CGT ATATCTGT G ACAAAT CTT AAT CTT G AAATT GTACG AG CACT AG CCC GAGAGGGCATTCCTACGATAGGCATGTCTCCATTTTCATGTGGTTGGTCAACCTCCAAAAG AG AT GTGGCTT CTGCAGAT CT AGCAACCGT AGCT AAAACCAT AGACT CAGGATTT GT CCCT GTT CT CCAT GG AGATGCAGT GCT GG ACAAT AT ACT GGGCTGCACCAT ATT GAGT GGT GAT G TT AT CAT CCGT CAT CTTG CAG AT C ATTT G AAG CCAG AAT ATGTTGT CTTT CT CACAG AT GT A CTAGGT GT CTACGAT CGACCACCTT CACCTT CAGAGCCCGACGCT GTGCT CTT GAAAGAGA TCG CTGTT G GAG AAG AT G G AAG CT G G AAG GTT GT G AAT CCACT GTTG G AG C ACACAG AC A AGAAAGTTGACTACTCTGTTGCGGCGCACGATACAACCGGTGGAATGGAAACGAAGATAT CAGAAGCTGCT AT G ATTGCAAAACTTGGAGT CG AT GT CT ACATT GT G AAGGCT GCGACAAC T CATT CACAGAGAGCACTAAACGGT GATTT GAGAGATAGT GTT CCT GAAGATTGGCTT GGT ACT AT CAT CAG ATT CT C AAAGT AG
SEQ ID NO: 18
TGGATAACAACAACATCACCATCCTGAAAATTGGTGGTAGCGTGATTACCGATAAAAGCGC AGATGATGGCACCGCACGTCTGAGCGAAATTGAACGTATTGCAGCAGAAATTAGCGGCTTT GAAGGCAAACT GATT ATT GTT CAT GGTGCAGGTAGCTTT GGT CAT CCGCAGGTT AAACGTT TTGGTCTGACCGGTAAATTTGATCATGAAGGCAGCATTATTACCCATATGAGCGTTCGTAAA CT G AAT ACCATGGTT GTT GAAACCCT G AAT AGCGCAGGT ATT AAT GCACT GCCGGTT CAT C CG AT GGCAT GTGCAATT AGCAGCAAT AGCCGTAT CAAAAGCAT GTTT CGT G AGCAG ATT G A AGAAATGCTGGCCAATGGTTTTGTTCCGGTTCTGCATGGTGATATGGTTATGGATACCGAT CTGGGCACCAGCGTTCTGAGCGGTGATCAGATTGTTCCGTATCTGGCAATTCAGATGAAAG CAAGCCGTATTGGTATTGGTAGTGCCGAAGAGGGTGTTCTGGATGATAAAGGTGGTGTTAT T CCGCT GATT AACAACG AG AACTT CG AT GAG ATT AAAGCAT AT CT GAGTGGT AGCGCAAAT ACCGAT GTT ACCGGTGGT AT GCTGGGTAAAGTT CT GG AACT GCT GG AATT AAGCG AACAG A GCAATAGCACCAGCTATATCTTTAATGCAGGTAACACCGGCAACATCAGCGATTTTCTGTC AG GTAAAAACATT GG CACCG CCATT G GTG CAGG CACCATTT AA
SEQ ID NO: 19
AT GAT G ATTTT G AAG AT CGGTGGTTCCGTTAT CACT GAT AAGT CT G CTT AT AG AACT G CT AG AACCT ACGCCATT AG AT CCATT GT CAAAGTTTT GT CCGGTAT CGAAGATTTGGTTT GCGTT G TTCATGGTGGTGGTTCTTTTGGTCATATTAAGGCTATGGAATTTGGTCTGCCAGGTCCAAAA AAT CCAAGAT CAT CT ATTGGTT ACT CCAT CGTT CACAG AG ACATGGAAAATTT GG ATTT GAT GGTTAT CGACGCCAT GAT CG AAAT GGGTATGCGT CCAATTT CT GTT CCAATTT CAGCTTT G A GAT ACG ACGGTAG ATTT GATT ACACCCC ATT GATT AG GT ACATT G ATG CT G GTTTT GTTCCA GTTT CTT ACGGT GAT GTTT ACAT CAAGG AT GAACACT CTT ACGGTAT CT ACT CCGGT GAT GA T ATT ATGGCT GAT ATGGCCGAATT ATT G AAGCCAGAT GTTGCT GTTTT CTT GACCGAT GTTG AT GGCAT CT ATT CCAAAGAT CCAAAG AGAAAT CCAG ACGCCGTTTT GTT GAG AG AT ATT GAT ACCAACAT CACCTT CGACAGAGTT CAAAACG AT GTT ACTGGTGGT AT CGGTAAAAAGTT CG AAT CCAT GGTT AAGAT G AAGT CCT CT GTT AAGAACGGT GTCT ACTT GATT AACGGTAACCAT CCAGAAAGAAT CGGT GAT ATT GGTAAAG AGT CCTT CAT CGGTACT GT CAT CAG AT GA
SEQ ID NO: 20
TG GCTT CAG AAAAAG AAATT AG GAG AG AG AG ATT CTT G AACGTTTT CCCT AAATT AGTAG AG GAATTGAACGCATCGCTTTTGGCTTACGGTATGCCTAAGGAAGCATGTGACTGGTATGCCC ACT CATT G AACT ACAACACT CCAGGCGGT AAGCT AAAT AG AGGTTT GT CCGTT GTGGACAC GT ATGCT ATT CT CT CCAACAAG ACCGTT G AACAATTGGGGCAAGAAG AAT ACG AAAAGGTT GCCATT CT AGGTTGGT GCATT GAGTT GTTGCAGGCTT ACTT CTT GGT CGCCGAT GAT AT GA TGGACAAGTCCATTACCAGAAGAGGCCAACCATGTTGGTACAAGGTTCCTGAAGTTGGGG AAATTGCCAT CAAT G ACGCATT CAT GTT AG AGGCT GCT AT CT ACAAGCTTTT G AAAT CT CAC TT CAG AAACG AAAAAT ACT ACAT AG AT AT CACCG AATT GTT CCAT G AG GT CACCTT CCAAAC CG AATT G G GCC AATT GAT G G ACTT AAT CACT G CACCT G AAG ACAAAGT CG ACTT GAGT AAG
TT CT CCCT AAAGAAGCACT CCTT CAT AGTT ACTTT CAAG ACT GCTT ACT ATT CTTT CT ACTT G CCTGTCGCATTGGCCATGTACGTTGCCGGTATCACGGATGAAAAGGATTTGAAACAAGCCA GAG AT GT CTT GATT CCATT G G GT G AAT ACTT CC AAATT CAAG AT G ACT ACTT AG ACT G CTT C GGTACCCCAGAACAG AT CGGT AAG AT CGGT ACAG AT AT CCAAG AT AACAAAT GTT CTTGGG TAATCAACAAGGCATTGGAACTTGCTTCCGCAGAACAAAGAAAGACTTTAGACGAAAATTAC GGTAAGAAGGACT CAGT CGCAGAAGCCAAATGCAAAAAGATTTT CAAT GACTT GAAAATT G AACAG CTATACCACG AAT ATG AAG AGTCTATTG CCAAG GATTTG AAG G CCAAAATTTCTCAG GT CGAT G AGT CT CGTGGCTT CAAAGCT GAT GT CTT AACTGCGTT CTT GAACAAAGTTT ACAA GAG AAG CAAAT AG
SEQ ID NO: 21
AT GT CTT CTT GCATT AAT CCCT CAACCTTGGTT ACCT CT GTAAATGCTTT CAAAT GT CTT CCT CTTG C AACAAAT AAAG C AG C CAT C AG AAT CAT G G C CAAAT AT AAG CC AG T CC AAT G CCTT AT CAGCG CC AAAT AT GAT AATTT G ACAGTT GAT AG GAG AT CAG CAAACT ACC AACCTT CAATTT GGGACCACGATTTTTTGCAGTCATTGAATAGCAACTATACGGATGAAGCATACAAAAGACG AGCAGAAGAGCTGAGGGGAAAAGTGAAGATAGCGATTAAGGATGTAATCGAGCCTCTGGA T CAGTTGGAGCT GATT GAT AACTT GCAAAGACTT GG ATTGGCT CAT CGTTTT GAG ACT GAG ATT AG G AAC AT ATT G AAT AAT ATCT ACAAC AAT AAT AAAG ATT AT AATT G G AG AAAAG AAAAT CT GTATGCAACCT CCCTT GAATT CAGACT ACTT AG ACAACATGGCT AT CCT GTTT CT CAAGA GGTTTT CAATGGTTTTAAAGACGACCAGGGAGGCTT CATTT GT GAT GATTT CAAGGGAATA CT G AGCTT GC AT G AAG CTT CGTATT ACAGCTT AG AAG GAG AAAG CAT CAT G GAG GAG G CCT GG CAATTT ACT AGTAAACAT CTT AAAG AAGT GAT GAT CAG CAAG AACAT G G AAG AG GAT GT ATTT GTAGCAGAACAAGCGAAGCGT GCACT GGAGCT CCCT CT GCATT GGAAAGT GCCAAT GTTAGAGGCAAGGTGGTTCATACACATTTATGAGAGAAGAGAGGACAAGAACCACCTTTTA CTTG AGCTCG CTAAG ATG G AGTTTAACACTTTG CAGG C AATTTACC AG G AAG AACTAAAAG AAATTTCAGGGTGGTGGAAGGATACAGGTCTTGGAGAGAAATTGAGCTTTGCGAGGAACA GGTTGGTAGCGTCCTTCTTATGGAGCATGGGGATCGCGTTTGAGCCTCAATTCGCCTACTG CAGG AGAGTGCT CACAAT CT CGAT AGCCCT AATT ACAGT GATT GAT GACATTT AT GAT GTCT AT GG AACATT GG AT G AACTT GAG AT ATT CACT GAT GCT GTT GAG AGGT GGGACAT CAATT A TGCTTT GAAGCACCTT CCGGGCT AT AT GAAAAT GT GTTTT CTT GCGCTTT ACAACTTT GTTA AT GAATTTGCTT ATTACGTT CT CAAACAACAGGATTTT GATTTG CTT CT GAGCATAAAAAAT G CATGGCTTGGCTTAATACAAGCCTACTTGGTGGAGGCGAAATGGTACCATAGCAAGTACAC ACCG AAACT G G AAG AAT ACTT G GAAAAT G GATT G GTAT CAAT AACGG G CCCTTT AATT AT AA CG ATTT CAT AT CTTT CTGGTACAAAT CCAAT CATT AAGAAGGAACTGG AATTT CT AGAAAGT AAT CCAG AT AT AGTT CACT G GT CAT CCAAG ATTTT CCGTCTG CAAG AT GATTT G G G AACTT C AT CGGACG AGAT ACAG AG AGGGG AT GTT CCG AAAT CAAT CCAGT GTT ACATGCAT G AAACT GGTGCCT CAG AGG AAGTT GCT CGT CAACACAT CAAGG AT AT GAT G AGACAG AT GT GG AAG AAG GT G AAT G CAT AC AC AG CCG AT AAAG ACT CT CCCTT G ACT GG AACAACT ACT G AGTT CC 
TCTTGAATCTTGTGAGAATGTCCCATTTTATGTATCTACATGGAGATGGGCATGGTGTTCAAAACCAAGAGACTATCGATGTCGGTTTTACATTGCTTTTTCAGCCCATTCCCTTGGAGGACAAACACATGGCTTTCACAGCATCTCCTGGCACCAAAGGCTGA
SEQ ID NO: 22
ATGTGGTCT ACCATT AG CATT AG CAT GAAT GTG G CAAT CCT G AAG AAG CCACTT AACTT CCT CCACAACT CAAAC AAC AAAG CTT CAAACCCT CGGTGCGT GT CGT CT ACT CGCCGGCGCCC TTCTTGCCCCTTGCAGCTTGACGTTGAACCCCGACGCTCCGGAAACTACCAGCCTTCAGCT TGGG ATTT CAACT ACATT CAAT CT CT CAAT AAT AAT CACT CCAAGGAGGAG AGGCATTT GG A AAGGAAAGCTAAGCTGATTGAGGAAGTGAAGATGCTATTGGAGCAGGAAATGGCGGCAGT T CAACAGTTGGAGTT GATT G AAG ACTT G AAAAAT CTGGG ATT GT CAT ACTT ATTT CAAGAT G AG ATT AAAAT AATTTT GAATT CCAT AT ACAAT CACCACAAAT GCTT CCACAAT AAT CAT G AAC AATGCAT ACACGT AAATT CAG ATTT GTATTT CGT CGCTCT CGG ATT CAGACT CTT CCGGCAA
CAT GGTTTT AAAGT CT CT CAAGAAGT ATTT G ACT GTTTT AAG AACG AAGAGGGCAGT GATTT CAGTGCAAACCTTGCTGACGATACAAAGGGGCTGCTACAACTTTACGAAGCGTCATATCTG GT GACAG AAG AT G AAG AT ACACT GG AGATGGCGCG ACAATTTT CCACCAAAATT CT GCAG A AAAAAGTGG AAG AAAAAAT GATT G AG AAGG AG AATTT ATT AT CAT GG ACACTT CATT CTTT G GAGCTCCCACTTCATTGGCGGATTCAAAGGCTGGAGGCCAAATGGTTCTTAGATGCTTATG CTAGCAGACCAGAT AT GAAT CCCATT ATTTTT GAGTT GGCTAAATTGGAATT CAATATTGCT CAAGCATT ACAACAGG AAG AACT CAAAG AT CT CT C AAG GT G GTG G AAT GAT ACT G GT ATT G CCGAAAAACTCCCATTTGCGAGGGATCGAATAGTTGAATCCCACTATTGGGCAATTGGAAC CCTT GAGCCTT AT CAAT AT AGAT AT CAAAGAAGCCT CAT CGCCAAG ATT ATT GCCCT AACT A CAGTT GTT GAT GAT GTCT ACG AT GT GT ACGGCACATT GG AT G AACT CCAACT ATTT ACAG AC GCAATT CG AAG AT G G G AT ATT GAAT CAAT CAACCAACTT CCTAGTT AC AT GCAACT ATG CTA TTT AG CAAT CT ACAACTTT GTTT CT GAG CTG GCTTACG AT ATTTT CCG AG AC AAG G GTTTCA ACAGCCTCCCATATTTACACAAATCGTGGCTGGATTTGGTTGAAGCATATTTTGTTGAGGCA AAGT G GTT CCACG AT G GAT AT ACT CCAACT CT AG AAG AAT AT CT C AACAATT CG AAG AT AAC AAT AATTT GTCCTG CAAT AGT CT CAG AAATTT ACTT CG CATTT GCAAACT CC AT CG ACAAAA CAGAGGT CGAG AGCAT AT ACAAAT AT CAT G ACAT CCTTT ACCTTT CCGG AAT GCTTGCAAG GCTT CCCG AT GATTT AGG AACAT CAT CGTTT GAG AT GAAGAG AGGT G ACGT GGCG AAAGC AATTCAGTGTTACATGAAGGAGCATAACGCCTCAGAGGAGGAGGCACGTGAGCACATCAG ATTTCTTATGCGGGAGGCGTGGAAGCATATGAACACGGCGGCTGCGGCCGACGACTGTCC ATTT GAG AGT GATTT AGTT GTGGGTGCAGCT AGT CT CGG AAG AGTGGCT AATTTT GT GT AT GT GG AGGG AG AT GGTTTT GG AGT GCAACACT CAAAAAT ACAT CAACAAATGGCT G AATT AC T GTTTT ACCCAT AT CAGT G A
SEQ ID NO: 23
AT GCCACT GAATT CCCT CCACAACTTGGAG AGG AAACCTT CAAAAGCATGGT CT ACCT CTT GCACT GCACCCGCAGCT CGCCT CCAGGCAT CTTT CT CCTT ACAACAAGAAG AACCT CGT CA AAT CCG ACG CTCT G GG G ATT ACC AACCCT CT CTTT G G GATTT CAATT ACAT ACAGT CT CT CA ACACTCCGTATAAGGAGCAGAGATACGTTAATAGGCAAGCAGAGTTGATTATGCAAGTGAG GAT GTTGCTT AAG GT AAAG AT G G AG G CAATT CAACAGTT G GAGTT GATT GAG ACTT G CAAT ACCTGGG ACT GT CTT ATTT CTTT CCAGAT GAG ATT AAACAAAT CTT AAGTT CT AT ACACAAT G AGCACAG AT ATTT CCACAAT AAT GATTT GTATCT CACAGCT CTTGG ATT CAGAAT CCT CAGA CAACATGGTTTT AAT GTTT CCGAAG AT GT ATTT GATT GTTT CAAG ACT GAGAAGTGCAGT GA TTT CAAT G CAAACCTT G CT CAAG AT ACG AAG GG AAT GTT ACAACTTT AT G AAGCAT CTTT CC TTTT GAG AGAAGGT G AAG AT ACATTGGAGCT AG CAAG ACG ATTTT CCACCAG AT CT CT ACG AG AAAAACTT GAT G AAG AT GGT GAT G AAATT GAT G AAG AT CT AT CAT CGT GG ATT CGCCATT CCTTGG AT CTT CCT CTT CATT GG AGG AT CCAAGG ATT AGAGGCAAG ATGGTT CTT AGAT GC TT AT G CGAG GAG G CCGG ACAT GAAT CCACTT ATTTT CAAACT CGCC AAACT CAACTT CAAT A TT GTT CAGGCAACAT AT CAAG AAG AACT CAAAG AT GT CT CAAGGTGGT GG AAT AGTT CGT G CCTT GCT GAGAAACT CCCATTT GT GAGAGAT AGGATT GTGGAATGCTT CTTTT GGGCCAT C GGGGCTTTT G AGCCT CACCAAT AT AGTT AT CAGAG AAAAAT GGCCGCCATT ATT ATT ACTTT CGT AACAATT AT CG AT GAT GTTT AT GAT GT GTATGG AACATT AG AAG AACTGGAACT ATTT A CAG AT ATG ATT CG CAG AT G GG AT AAT AT AT CAAT AAGCCAACTT CCAT ATT AT AT G CAAGT G TG CTATTTG G CACTATACAACTTCGTTTCTG AG CG G GCTTACG AT ATTCTAAAAG ATCAACA TTT CAACAGCAT CCCATATTTACAGAGAT CGTGGGT AAGTTT GGTT GAAGGAT AT CTT AAGG AG G CATACTG GT ACT ACAATG G CT AT AAACCAAGCTT G G AAG AAT AT CT CAACAACG CCAA GATTT CAAT AT CGGCT CCT ACAAT CAT AT CCCAGCTTT ATTTT ACATT AGCAAACT CG ACT G A T G AAACAGTT AT CG AG AGCTT AT ACG AAT AT CAT AACAT ACTTT ACCTAT CAG G AACCAT ATT AAGGCTTGCT GACGAT CTTGGGACAT CACAACAT GAGCTGGAGAGAGGAGACGT CCCGAA AGCAATCCAGTGCTACATGAAGGACACAAATGCTTCGGAGAGAGAGGCGGTGGAACACGT GAAGTTTCTGATAAGGGAGACGTGGAAGGAGATGAACACGGTCACAACAGCCAGCGATTG 
TCCGTTTACGGATGATTTGGTTGCGGTCGCAACTAATCTTGCAAGGGCGGCTCAGTTTATA
TATCTCGACGGGGATGGGCATGGCGTGCAACACTCGGAAATACATCAACAGATGGGAGGCCTGCTATTCCAGCCTTATGTCTGA
SEQ ID NO: 24
AT GT CAT CT ACT ACAT CCGTTT CT AAT G AAG AT GGTGT CCCAAG AAGAATT GCTGGT CAT CA TT CT AATTT GT GGG AT GAT GATT CT AT CGCCT CTTT GT CT ACTT CTT AT G AAGCT CCAT CTT A CAG AAAGAG AGCCG AT AAGTT G ATTGGT G AAGT CAAG AACAT CTT CGACTT GAT GT CT GTT GAGGATGGT GTTTTT ACTT CT CCATT GTCT G ACTTGCAT CACAG ATT GT GG ATGGTT GATT C AGTT GAAAG ATTGGGT AT CGACAG ACATTT CAAGG ACG AAAT CAATT CCGCTTT GG AT CAC GTTT ATT CTT ACTGGACCG AAAAAGGTATT GGTAGGGGTAG AGAAT CTGGTGTT ACT G ATTT GAATT CT ACCGCTTT GGGTTT GAG AACCTT GAGATT GOAT GGTT ACACT GTTT CTT CCCACG TTTT GG AT CATTTT AAG AACG AAAAGGGT CAGTT CACCT GTT CT GCT ATT CAAACT G AAGGT GAAAT CAGGG AT GT CTT GAATTT GTT CAG AGCTT CCTT G ATTGCTTT CCCAGGT G AAAAG AT T ATGGAAGCT GCT G AAATTTT CT CCACCAT GTACTT GAAAGATGCCTT GCAAAAAATT CCAC CAT CCG GTTT GTCT CAAG AAAT CG AAT ACTT GTT G GAATT CG GTTG G CAT ACCAATTT G CCA AGAATGGAAACTAGAATGTACATCGACGTTTTCGGTGAAGATACCACTTTTGAAACCCCATA CTTGATCAGGGAAAAGTTGTTAGAATTGGCCAAGTTGGAGTTCAACATCTTCCATTCATTGG T CAAGAGGG AATT GCAGT CTTT AT CT AGGT GGTGG AAAGATT ACGGTTT CCCAG AAATT AC CTT CT CCAG ACAT AGACAT GT CGAGTATT AT ACTTT GGCT GCTT GCATTGCT AACGAT CCT A AACATT CT GCTTT CAG ATT AGGTTT CGGTAAG AT CT CCCAT AT GAT CACCATTTTGGAT GAT AT CT ACGACACCTT CGGTACT AT GG AAGAATT G AAGTT GTT GACT GCT GCTTT CAAAAGAT G GG AT CCAT CCT CT ATT G AATGCTT GCCAG ATT AT AT GAAGGGT GTTT ACAT GGCCGTTT ACG ACAACATTAACGAAATGGCTAGAGAAGCCCAAAAGATTCAAGGTTGGGATACAGTTTCTTA CGCT AG AAAAT CTT G G G AAG CTTT CATT GGTG CTT ACATT CAAG AG G CT AAGT GG ATTTCTT CTGGTTACTTGCCAACTTTCGATGAGTACTTGGAAAACGGTAAGGTTTCTTTCGGTTCTAGA ATT ACT ACCTT GG AACCT AT GTT G ACCTTGGGTTTT CCATT GCCACCAAG AAT ATTGCAAG A AATT G ACTT CCCCT CCAAATT CAACG ATTT G ATTTGCGCCATTTT G AGGTT GAAGGGT GAT A CT C AAT GTT AC AAAG CTGATAGAGCTAGAGGTGAAGAAGCTTCAGCTGTTTCTTGTT ACAT G AAGGAT CAT CCAGGTAT CACT GAAGAAGAT GCCGTTAAT CAAGTTAACGCCAT GGTT GATA ACCT G ACCAAAG AGTT GAATTGGG AATTGCT AAGACCAG ATT CAGGT GTT CCAAT CT CTT A CAAG AAG GTTG CTTT CG AT AT CTGCAGAGTTTTT CACT ACGGTT 
ACAAGTACAGAGATGGTTTCTCTGTTGCTTCCATCGAAATCAAGAACTTGGTTACTAGAACCGTTGTTGAAACCGTTCCATTGTGA
SEQ ID NO: 25
TGATCTTCGACGGCACAACCATGAGTATCGCCATTGGTTTGCTTAGCACCCTGGGAATAGG GGCAGAAGCG AAT CCAAG AGAAAATTT CTT GAAGT GTTTTT CT CAGT AT AT CCCGAAT AAT G CG ACG AACCTT AAGTT AGT AT ACACT CAGAACAACCCT CT AT AT AT G AGCGTT CT AAATT CT ACAATCCACAACCTAAGATTTACGTCCGACACGACTCCGAAACCCCTAGTTATAGTGACAC CGT CACAT GTT AGCCAT AT ACAGGGCACCAT ACT AT GTT CCAAAAAAGTT GGGTT ACAAAT A CGTACCCGTAGCGGGGGACACGACAGTGAGGGGATGAGTTATATTAGTCAGGTGCCTTTC GT CAT AGT GG ATTT AAGAAAT AT G AGGT CAATT AAAAT CGACGTT CACT CACAAACTGCCT G GGTTGAGGCGGGGGCCACATTGGGTGAAGTATATTACTGGGTCAATGAGAAGAACGAGAA TCTTTCACTAGCAGCCGGTTATTGTCCCACAGTCTGCGCCGGCGGTCACTTTGGCGGCGG CGGATACGGTCCCTTAATGAGAAATTACGGGCTTGCCGCAGACAATATCATAGATGCTCAC TT AGTT AAT GTT CAT G G AAAAGT GTT AG ACCGTAAAAG CAT G GG G GAG GAT CT GTTTT GGG CGCTTAGAGGGGGAGGGGCAGAATCATTTGGAATAATAGTGGCATGGAAAATCAGGCTTG TGGCT GTT CCAAAG AGT ACCAT GTT CT CAGT AAAG AAAAT AATGG AG AT CCAT G AGCT AGTT AAACTT GT G AAT AAAT GG CAAAACAT AGCCT AT AAAT AT GAT AAG G ACTT G CTG CTT AT G AC T CATTT CAT AACC AG AAACATT ACG GAT AACCAAG G G AAG AACAAAACAG CCATCCAT ACCT ACTTT AGCT CCGTTTT CTTGGGT GGTGT AG ACAGCTT AGTT G ACCT GAT G AACAAG AGTTTT CCG G AACT AGGTAT CAAG AAG ACAG ATT GT AG AC AACTTT CCT G G ATT GAT ACCAT AAT CTT
TT ACAGCGG AGT CGT CAATT AT G ACACT GACAACTT CAACAAGG AAATTTT ATT AGAT AGGA GT GCGGGT CAAAATGGGGCCTT CAAG AT CAAACT AG ACT ACGTT AAAAAACCCATT CCT GA AAGTGTTTTTGTTCAGATTCTGGAGAAGCTGTATGAAGAAGATATTGGCGCGGGGATGTAC GCT CTTT AT CCGTACGGCGGCATAATGGAT GAGATTAGT GAAAGCGCCAT CCCTTT CCCCC ACAGAGCTGGTATCCTGTACGAGTTGTGGTATATCTGCTCCTGGGAGAAACAGGAGGATAA CGAAAAGCACTTAAATTGGATTAGGAATATCTACAATTTCATGACGCCCTACGTTTCCAAGA ACCCCAGGTTGGCCTATTT GAACT ACAGGGAT CTT GATATTGGAAT CAACGACCCCAAAAA CCCAAACAACTACACCCAGGCAAGGATTTGGGGAGAGAAGTACTTCGGGAAGAACTTCGA CAGGCTAGTTAAGGTGAAAACGCTAGTTGATCCAAATAATTTTTTCAGAAACGAACAGAGTA T CCCT CCCTT ACCGCGT CAT AGGCACT AA
SEQ ID NO: 26
MASEKEIRRERFLNVFPKLVEELNASLLAYGMPKEACDWYAHSLNYNTPGGKLNRGLSVVDTY
AILSNKTVEQLGQEEYEKVAILGWCIELLQAYCLVADDMMDKSITRRGQPCWYKVPEVGEIAIN
DAFMLEAAIYKLLKSHFRNEKYYIDITELFHEVTFQTELGQLMDLITAPEDKVDLSKFSLKKHSFIV
TFKTAYYSFYLPVALAMYVAGITDEKDLKQARDVLIPLGEYFQIQDDYLDCFGTPEQIGKIGTDIQ
DNKCSWVINKALELASAEQRKTLDENYGKKDSVAEAKCKKIFNDLKIEQLYHEYEESIAKDLKAK
ISQVDESRGFKADVLTAFLNKVYKRSK
SEQ ID NO: 27
MPDAIEFEHEGRRNPNSAEAESAYSSIIAALDLQESDYAVISGHSRIVGAAALVYPDADAETLLA
ASLWTACLIVNDDRWDYVQEDGGRLAPGEWFDGVTEVVDTWRTAGPRLPDPFFELVRTTMS
RLDAALGAEAADEIGHEIKRAITAMKWEGVWNEYTKKTSLATYLSFRRGYCTMDVQVVLDKWI
NGGRSFAALRDDPVRRAIDDVVVRFGCLSNDYYSWGREKKAVDKSNAVRILMDHAGYDESTA
LAHVRDDCVQAITDLDCIEESIKRSGHLGSHAQELLDYLACHRPLIYAAATWPTETNRYR
SEQ ID NO: 28
MVAQTFNLDTYLSQRQQQVEEALSAALVPAYPERIYEAMRYSLLAGGKRLRPILCLAACELAGG
SVEQAMPTACALEMIHTMSLIHDDLPAMDNDDFRRGKPTNHKVFGEDIAILAGDALLAYAFEHIA
SQTRGVPPQLVLQVIARIGHAVAATGLVGGQVVDLESEGKAISLETLEYIHSHKTGALLEASVVS
GGILAGADEELLARLSHYARDIGLAFQIVDDILDVTATSEQLGKTAGKDQAAAKATYPSLLGLEA
SRQKAEELIQSAKEALRPYGSQAEPLLALADFITRRQH
SEQ ID NO: 29
MAYTAMAAGTQSLQLRTVASYQECNSMRSCFKLTPFKSFHGVNFNVPSLGAANCEIMGHLKL
GSLPYKQCSVSSKSTKTMAQLVDLAETEKAEGKDIEFDFNEYMKSKAVAVDAALDKAIPLEYPE
KIHESMRYSLLAGGKRVRPALCIAACELVGGSQDLAMPTACAMEMIHTMSLIHDDLPCMDNDD
FRRGKPTNHKVFGEDTAVLAGDALLSFAFEHIAVATSKTVPSDRTLRVISELGKTIGSQGLVGG
QVVDITSEGDANVDLKTLEWIHIHKTAVLLECSVVSGGILGGATEDEIARIRRYARCVGLLFQVV
DDILDVTKSSEELGKTAGKDLLTDKATYPKLMGLEKAKEFAAELATRAKEELSSFDQIKAAPLLG
LADYIAFRQN
SEQ ID NO: 30
MNSSIVSQHFFISLKSSLDLQCWKSSSPSSISMGEFKGIHDKLQILKLPLTMSDRGLSKISCSLSL
QTEKLRYDNDDNDDLELHEELIPKHIALIMDGNRRWAKAKGLEVYEGHKLIIPKLKEICDISSKLGI
QVITAFAFSTENWKRSKEEVDFLMQLFEEFFNEFLRFGVRVSVIGCKSNLPMTLQKCIALTEETT
KGNKGLHLVIALNYGGYYDILQATKSIVNKAMNGLLDVEDINKNLFEQELESKCPNPDLLIRTGG
EQRVSNFLLWQLAYTEFYFTNTLFPDFGEKDLKKAILNFQQRHRRFGGHTY
SEQ ID NO: 31
MSSLVLQCWKLSSPSLILQQNTSISMGAFKGIHKLQIPNSPLTVSARGLNKISCSLSLQTEKLCY
EDNDNDLDEELMPKHIALIMDGNRRWAKDKGLDVSEGHKHLFPKLKEICDISSKLGIQVITAFAF
STENWKRAKGEVDFLMQMFEELYDEFSRSGVRVSIIGCKTDLPMTLQKCIALTEETTKGNKGLH
LVIALNYGGYYDILQATKSIVNKAMNGLLDVEDINKNLFDQELESKCPNPDLLIRTGGDQRVSNF
LLWQLAYTEFYFTKTLFPDFGEEDLKEAIINFQQRHRRFGGHTY
SEQ ID NO: 32
MDIPVIVTSVSAENVVRRSVTYHPNIWGEFFLSYASQLTEITVAGKEEHERQKEEIRNLLLQSDS
TLKKLELVDSIQRLGVGYHFEKEIGETLRFIHDTNSTNNNDLHEVALCFRLLREKGLHVPCDVFS
KFVDEEGNFRESIRNDVEGILSLYEASNYAVHGEEIPEKAFEFCSSHLVSLITNINNSLSTRVKDA
LKIPIRKSLNRLGAKKFISMYEEDDSHNQKLLNFAKLDFNLVQKIHQKELSHLTRWWKELDFANK
LSFARDRLVECYFWIVGVYFEPSYGIARKLLTKVIYVASVLDDIYDVYGTLDELTLFTSIVQRWDI
SAIDQLPPYMRIYFKALFDVYVEMEDEMGKLGKSYAVEYGKAEMIRMAKMYFKEAEWSFKGYK
PTMEEYTTVALLSSGYMMMTINSLAVINDPISKEEFDWVLSEPPMLRASLIITRLMDDLAGYGSE
EKLSAVHYYMHQHGVSEEEAFVELQKQVKNAWKDLNKEFLEPREASMPILTCVDNFTRVIIVLY
SDEDTYGNSKTKTKDMIKSVLVDPFMLDC
SEQ ID NO: 33
MIILLKEVQLEIQRRIAYLRPTQKNDGSFRYCFETGVMPDAFLIMLLRTFDLDKEGLIKQLTERIVS
LQNEDGLWTLFDDEEHNLSATIQAYTALLYSGYYQKNDRILRKAERYIIDSGGISRAHFLTRWML
SVNGLYEWPKLFYLPLSLLLVPTYVPLNFYELSTYARIHFVPMMVAGNKKFSLTSRHTPSLSHLD
VREQNQESEETTQESRASIFLVDHLKQLASLPSYIHKLGYQAAERYMLERIEKDGTLYSYATSTF
FMIYGLLALGYKKDSFVIQKAIDGICSLLSTCSGHVHVENSTSTVWDTALLSYALQEAGVPQQDP
MIKGSTRYLKKRQHTKLGDWQFHNPNTAPGGWGFSDINTNNPDLDCTSAAIRALSRRAQTDT
DYLESWQRGINWLLSMQNKDGGFAAFEKNTDSILFTYLPLENAKDAATDPATADLTGRVLECL
GNFAGMNKSHPSIKAAVKWLFDHQLDNGSWYGRWGVCYIYGTWAAITGLRAVGVSASDPRIIK
AINWLKSIQQEDGGFGESCYSASLKKYVPLSFSTPSQTAWALDALMTICPLKDRAVEKGIKFLLN
PNLTEQQTHYPTGIGLPGQFYIQYHSYNDIFPLLALAHYAKKHSS
SEQ ID NO: 34
MWRLKVGKESVGEKEEKWIKSISNHLGRQVWEFCAENDDDDDDEAVIHVVANSSKHLLQQQR
RQSSFENARKQFRNNRFHRKQSSDLFLTIQYEKEIARNGAKNGGNTKVKEGEDVKKEAVNNTL
ERALSFYSAIQTSDGNWASDLGGPMFLLPGLVIALYVTGVLNSVLSKHHRQEMCRYIYNHQNE
DGGWGLHIEGSSTMFGSALNYVALRLLGEDANGGECGAMTKARSWILERGGATAITSWGKLW
LSVLGVYEWSGNNPLPPEFWLLPYSLPFHPGRMWCHCRMVYLPMSYLYGKRFVGPITHMVLS
LRKELYTIPYHEIDWNRSRNTCAQEDLYYPHPKMQDILWGSIYHVYEPLFNGWPGRRLREKAM
KIAMEHIHYEDENSRYIYLGPVNKVLNMLCCWVEDPYSDAFKFHLQRIPDYLWLAEDGMRMQG
YNGSQLWDTAFSIQAILSTKLIDTFGSTLRKAHHFVKHSQIQEDCPGDPNVWFRHIHKGAWPFS
TRDHGWLISDCTAEGLKASLMLSKLPSKIVGEPLEKNRLCDAVNVLLSLQNENGGFASYELTRS
YPWLELINPAETFGDIVIDYSYVECTSATMEALALFKKLHPGHRTKEIDAALAKAANFLENMQRT
DGSWYGCWGVCFTYAGWFGIKGLVAAGRTYNNCVAIRKACHFLLSKELPGGGWGESYLSCQ
NKVYTNLEGNRPHLVNTAWVLMALIEAGQGERDPAPLHRAARLLINSQLENGDFPQQEIMGVF
NKNCMITYAAYRNIFPIWALGEYSHRVLTE
SEQ ID NO: 35
ATGGCTTCAGAAAAAGAAATTAGGAGAGAGAGATTCTTGAACGTTTTCCCTAAATTAGTAGA GGAATTGAACGCATCGCTTTTGGCTTACGGTATGCCTAAGGAAGCATGTGACTGGTATGCC CACTCATTGAACTACAACACTCCAGGCGGTAAGCTAAATAGAGGTTTGTCCGTTGTGGACA CGTATGCTATTCTCTCCAACAAGACCGTTGAACAATTGGGGCAAGAAGAATACGAAAAGGT TGCCATTCTAGGTTGGTGCATTGAGTTGTTGCAGGCTTACTGTTTGGTCGCCGATGATATG AT GGACAAGT CCATTACCAGAAGAGGCCAACCAT GTTGGT ACAAGGTT COT GAAGTT GGG G AAATT G COAT CAAT G ACG CATT CAT GTT AG AGG CT G CT AT CT AC AAG CTTTT G AAAT CT CA CTT CAG AAACG AAAAAT ACT ACAT AG AT AT CACCG AATT GTT COAT G AGGT CACCTT CCAAA CCG AATTGGGCCAATT GATGGACTT AAT CACTGCACCT G AAG ACAAAGT CG ACTT G AGT AA GTTCTCCCT AAAG AAG C ACT CCTT CAT AGTT ACTTT CAAG ACT G CTT ACT ATT CTTT CT ACTT GCCTGTCGCATTGGCCATGTACGTTGCCGGTATCACGGATGAAAAGGATTTGAAACAAGC CAG AG AT GTCTTG ATT CCATT GG GT G AAT ACTT CCAAATT CAAG AT G ACT ACTT AG ACT G CT T CGGT ACCCCAGAACAG AT CGGTAAGAT CGGTACAG AT AT CCAAG AT AACAAAT GTT CTT G GGTAATCAACAAGGCATTAGAACTTGCTTCCGCAGAACAAAGAAAGACTTTAGACGAAAAT TACGGTAAGAAGGACTCAGTCGCAGAAGCCAAATGCAAAAAGATTTTCAATGACTTGAAAA TT G AACAGCT AT ACCACGAAT AT G AAG AGT CT ATT GCCAAGG ATTT G AAGGCT AAAATTT CT CAGGT CG AT GAGT CT CGT GGCTT CAAAGCT GAT GT CTT AACT GCGTTTTT G AACAAAGTTT A CAAG AG AT AA
SEQ ID NO: 36
AT GCCAGATGCT ATT G AATTT G AACAT G AAGGTAGAAGAAACCCAAACT CTGCT GAAGCT G AAT CTGCTT ACT CTT CT ATT ATT G CTG CTTT G G ACTT G CAAG AAT CCG ATT ACG CT GTT ATTT CTG GT CACT CT AG AAT AGTT G GTG CTGCTG CTTT AGTTT AT CCAG AT G CTG ATG CT G AAACT TT GTTGGCTGCTT CTTT GTGGACT GCTT GTTT GAT CGTTAAT GAT GAT AGATGGGACTACGT CCAAGAAGATGGTGGTAGATTGGCTCCAGGTGAATGGTTTGATGGTGTTACTGAAGTTGTT GATACTTG GAG AACT GCTGGT CCAAGATTGCCAGAT CCATTTTTT GAATT GGTT AGAACCAC CAT GT CCAGATTGG AT GOT GCATT GGGTGCT G AAGCAGCT G ACG AAATTGGT CACG AAAT C AAAAGAGCTATTACCGCTATGAAGTGGGAAGGTGTTTGGAATGAATACACCAAAAAGACAT CTTT GGCCACCT ACTT GT CTTTT AGAAG AGGTT ACT GTACCATGGAT GTT CAAGTT GTTTT G GACAAGTGG ATT AACGGTGGT AGAAGTTTT GOT GCCTT G AG AGAT GAT CCAGTT AGAAG AG CAATT GAT GAT GTTGTT GTT AG ATT CGGTTGCTT GT CCAACG ATT ATT ACT CTTGGGGTAGA GAAAAAAAGGCCGTT GAT AAGT CT AACGCCGT CAG AATTTT G ATGGAT CATGCT GGTTATG AT G AAT CT ACTG CTTT G G CT CAT GTT AG AG ATG ATTG CGTT CAAG CCATT ACT G ATTT GG AT T G CAT CG AAG AAT COAT CAAG AG AT CAGGT CATTT G G GTTCT CAT GCCC AAG AATT ATT G G ATT ACTT GGCTT GT CAT AGACCATT GAT AT ACGCT GCTGCT ACTTGGCCAACT G AAACT AAT AGAT ACAG AT AA
SEQ ID NO: 37
ATGGTTGCT CAAACTTT CAACTTGGAT ACCT ACTT GT CCCAAAG ACAACAACAAGTT G AAG A AGCTTT GTCTGCT GCTTTGGTT CCAGCTT AT CCAG AAAGAAT CT AT GAAGCT AT GAGGT ACT CTTT GTTGGCT GGTGGTAAAAGATT AAG ACCAATTTT GT GTTTGGCTGCTT GT G AATT AGCT GGTGGTTCT GTT GAACAAGCT ATGCCAACT GCTT GTGCTTT GG AAAT GATT CAT ACCAT GT C CTT GAT CCACG AT GATTTGCCAGCT AT GG AT AAT GAT G ATTT CAG AAG AGGTAAGCCT ACC AACCATAAGGTTTTCGGTGAAGATATTGCTATTTTGGCCGGTGATGCTTTGTTAGCTTATGC CTTT G AACAT ATT GCCT CT CAAACT AG AGGT GTT CCACCACAATT GGT GTTGCAAGTT ATT G
CTAGAATTGGTCATGCTGTTGCTGCTACTGGTTTGGTTGGTGGTCAAGTTGTTGATTTGGA ATCTGAAGGTAAGGCCATTTCTTTGGAAACCTTGGAGTATATCCATTCTCATAAGACTGGTG CTTT GTTGGAAGCTT CT GTT GTTT CTGGT GGTATT CT AGCT GGTGCT GAT G AAG AATT ATT G GCCAG ATT GTCT CATT ACGCCAG AGAT ATTGGTTTGGCTTT CCAAAT CGTT GAT GAT AT CTT GG AT GTT ACCGCT ACCT CT G AACAATTGGGTAAAACT GCT GGTAAAG AT CAAGCT GCTGCT AAAGCT ACTT ACCCAT CTTT GTT AGGTTTGGAAGCCT CT AG ACAAAAGGCCGAAG AATT GAT T CAAT CCG CT AAAG AAG CCTT AAG AC CAT ACG GTTCT CAAG CT G AACCATT ATT GG CTTT G GCAGATTT CATT ACCAGAAGGCAACATT AA
SEQ ID NO: 38
ATGGCCTACACCGCTATGGCTGCTGGCACTCAGTCTCTGCAGCTGAGGACCGTGGCTAGC
TACCAAGAGTGCAACAGCATGAGGTCCTGCTTCAAGCTGACCCCGTTCAAGAGCTTCCAC
GGCGTGAACTTCAACGTGCCATCTCTGGGCGCTGCCAACTGCGAGATCATGGGCCATCTT
AAGCTGGGCAGCCTGCCGTACAAGCAGTGCTCTGTGAGCAGCAAGAGCACCAAGACCATG
GCGCAGCTGGTGGACCTTGCCGAGACTGAGAAGGCTGAAGGCAAGGACATCGAGTTCGA
CTTCAACGAGTACATGAAGTCCAAGGCCGTGGCCGTGGACGCTGCTCTGGATAAGGCTAT
CCCACTCGAGTACCCAGAGAAGATCCACGAGTCCATGAGGTACAGCCTGCTGGCTGGTGG
TAAGAGGGTTCGCCCAGCTCTGTGCATTGCTGCCTGCGAGCTTGTTGGCGGCTCTCAGGA
TCTGGCTATGCCAACCGCTTGCGCTATGGAAATGATCCACACCATGAGCCTGATCCACGAC
GACCTGCCGTGCATGGACAACGACGATTTCAGAAGGGGCAAGCCGACCAACCACAAGGT
GTTCGGCGAGGATACTGCTGTGCTTGCTGGCGACGCTCTGCTGAGCTTCGCCTTCGAGCA
TATCGCTGTGGCCACCAGCAAGACCGTGCCATCAGATAGGACCCTGAGGGTGATCAGCGA
GCTGGGCAAGACCATTGGCTCACAGGGCCTTGTTGGAGGCCAGGTGGTGGACATTACTAG
CGAGGGCGACGCTAACGTGGACCTCAAGACCCTCGAGTGGATTCACATCCACAAGACCGC
CGTGCTGCTCGAGTGCTCAGTTGTTTCTGGCGGCATTCTTGGCGGCGCTACCGAGGATGA
GATCGCTAGGATTAGAAGGTACGCCCGCTGCGTGGGCCTGCTGTTCCAAGTGGTGGACGA
CATCCTGGACGTGACCAAGAGCAGCGAGGAACTCGGCAAGACCGCTGGCAAGGATCTGC
TGACCGACAAGGCCACCTATCCGAAGCTGATGGGCCTCGAGAAGGCCAAAGAGTTCGCC
GCTGAACTGGCGACCAGGGCGAAAGAGGAACTGAGCAGCTTCGACCAGATCAAGGCCGC
TCCACTGCTGGGCCTCGCTGACTACATTGCGTTCAGGCAGAACTGA
SEQ ID NO: 39
AT GT CT GAT AG AGGTTT GTCT AAGATTT CCTGCT CCTT GT CATTGCAAACCGAAAAGTT GAG AT ACGAT AACGAT GAT AACGACG ACTTGGAATTGCACGAAGAATT GATT CCAAAACAT ATCG CCTTGATCATGGACGGTAATAGAAGATGGGCTAAAGCTAAAGGTTTGGAAGTTTACGAAGG T CACAAGTT GATT AT CCCCAAGTT G AAAG AAAT CTGCG ACAT CT CTT CT AAGTTGGGTATT C AAGTTATTACCGCTTT CGCTTT CT CTACCGAAAATTGGAAGAGGT CT AAAGAAGAGGTT GAC TT CTT GATGCAGTT GTT CG AAGAATTTTT CAACGAGTT CTT GAG ATT CGGT GTT AG AGTTT C TGTTAT CGGTTGCAAGT CT AATTT GCCAAT G ACCTT GCAAAAGTGCATTGCTTT G ACT GAAG AAACTACCAAAGGTAACAAAGGCTTGCATTTGGTTATT GCCTT GAATT ACGGTGGTT ACTAC GAT ATCTTG CAAG CT ACT AAGT COAT CGTT AACAAGG CT AT G AAT G GTTT GTTG G ACGT CG AAG AT AT CAACAAGAATTT GTT CG AGCAAGAGTTGG AAAGCAAGT GT CCAAAT CCAG ATTT GTT GATT AG AACCGGT GGT G AACAAAGAGT CT CT AATTT CTT GTT GTGGCAATT GGCTT ACA CCG AATT CT ACTT CACT AACACTTT GTT CCCAG ACTT CGGT G AAAAGG ATTT GAAGAAGGCT AT CTT G AACTT CCAG CAG AG ACAT AG AAG ATTT G GTG GT CAT ACTT ACT AA
SEQ ID NO: 40
ATGTCTG CT AG AG GTTT G AACAAAATTT CCTG CT CCTT GT CCTT G CAAACCG AAAAATT GTG TTACG AG G ATAACG ATAACG ACTTG G ACG AAG AATTG ATG CCAAAACATATTG CCTTG ATC AT GG ACGGT AAT AGAAG AT GGGCT AAAG ACAAAGGTTTGGAT GTTT CT G AAGGT CACAAAC ACTT GTT CCCCAAGTT GAAAG AAAT CT GCGAT AT CT CTT CCAAGTTGGGTATT CAAGTT ATT ACCGCTTT CGCTTT CT CT ACCGAAAATTGGAAAAG AGCT AAAGGCGAAGTT G ACTT CTT GAT GCAAAT GTT CGAAGAGTT GTACG ACG AATT CT CT AG AT CTGGT GTT AGAGTTT CCATT ATT G GTTGCAAG ACT G ATTTGCCAAT GACCTT GCAAAAGT GTATTGCTTT G ACT G AAG AAACCAC CAAAGGT AACAAGGGT CT GCATTTGGTT AT CGCTTT G AATT ATGGTGGTT ACT ACG AT AT CT TGCAAGCCACT AAGT CT AT CGTT AACAAGGCT AT G AAT GGTTT GTTGGACGT CGAAG AT AT CAACAAGAACTT GTT CG ACCAAGAGTTGGAAT CT AAGT GT CCAAAT CCAG ACTT GTT GATT A GAACTGGT GGT GAT CAAAG AGT CT CCAATTTTTT GTT GTGGCAATTGGCTT ACACCG AGTT CT ACTTT ACT AAG ACTTT GTT CCCAG ACTT CG GT G AAG AAG ATTT GAAAG AAG CCAT CAT CA ACTT CCAGCAGAG ACAT AG AAGATT CGGTGGT CAT ACTT ACT AA
SEQ ID NO: 41
AT GG AT ATT CCT GT GATT GTT ACTT CCGTTT CGGCT GAG AAT GT CGT CCGT CGAT CT GTAAC TT ACCAT CCAAAT ATTT GG G G AG AATTTTTT CTTT CAT AT G CTT CACAACTT ACG G AAAT CAC TGTTG CTG G AAAGG AAG AG CAT GAAAG ACAAAAG G AAG AG ATT AGG AATTT G CTT CTT CAA AGT GATT CAACCCT AAAAAAGCTT G AACT CGTT G ACT CAAT CCAACGCCTTGGAGT GGGCT ACCATTT CGAG AAAG AAATTGGCGAAACATT ACGATT CATT CAT GACACCAAT AGCACCAAT AACAACGAT CTT CACG AAGTT GCT CTTT GCTTT CGT CT GCTT AG AG AAAAAGGT CTT CAT GT T CCAT GT GAT GTTTTT AGCAAGTT CGTAGAT G AAG AAGG AAATTT CAGGG AGT CGAT AAG A AACG AT GTT G AAG G GAT ATT GAG CTT AT ATG AG G CAT CAAATT AT GC AGT G CAT G GAG AG G AAATT CCT GAAAAAGCATT CG AATTTT GCT CCT CT CAT CTT GTCT CTTT AAT CACCAACAT CA ACAATT CCCTTT CAACACGAGTT AAGG ATGCTTT GAAGAT CCCAATT CGAAAGAGT CT AAAC AG ATT G G GAG CAAAAAAGTT CAT CTCTATGTAT G AAG AAG AT G ACT CACACAAT CAAAAATT ACT CAATTTT GCCAAATTGG ACTT CAACTT AGT GCAG AAGAT ACACCAG AAAG AGCT AAGC CAT CTT ACAAGGTGGTGGAAGGAGTTAGACTTTGCAAATAAGCTAT CTTTTGCGAGAGATA GACTT GTGGAATGCTACTTTTGGAT AGTGGGAGTTTACTTT GAGCCAAGCT ATGGAATT GC AAG AAAGCT ACT AACCAAAGT CATTT AT GTGGCTT CT GT CCTT GAT G ACAT CT ACGACGT CT AT GGAACCTTAGACGAACT AACCCT CTT CACCAGCATT GT CCAAAGGTGGGACATTAGTGC CAT CGAT CAATTGCCACCAT ACAT GAG AAT AT ACTT CAAAGCCCTTTT CGAT GT AT ATGTTG AAATGGAAGACGAAAT GGGAAAACTAGGCAAAT CAT ATGCAGT CGAATAT GGAAAAGCT GA GATGATAAGGTTGGCCAAGATGTACTTTAAAGAGGCTGAATGGTCTTTTAAGGGGTACAAG CCT ACAAT G G AG G AAT ACACAAC AGT G GC ACTTTT GTCTTCGGGCT ACAT G ATG AT G ACAA TT AATT CATT AG CT GTT AT AAAT G ACCCAATT AG CAAG G AAG AGTTT GATT G G GTTTTG AGT G AACCACCT AT G CT AAG GG CAT CTTT GAT CATT ACT AG ACT CAT G G AT GACCTT GCCG G AT AT GGGAGT G AAG AG AAGCT CT CCGCAGTGCATT ACT ACATGCAT CAACAT GGTGTAT CAGA AG AGG AAGCTTTT GT AG AGCTT CAAAAACAAGT G AAG AAT GCATGG AAGGAT CT CAACAAG G AATTT CTT G AG CCAAG AG AGG CAT CCAT G CCAATT CT CACAT GTGTT GAT AATTT CAC ACG AGTTATAATCGTGTTGTATAGTGATGAAGATACATATGGTAACTCCAAAACTAAGACCAAAG AT AT GAT CAAGT CGGTGCT AGTT GACCCCTT CATGCTT 
GATTGCTAA
SEQ ID NO: 42
AT GAT CAT CTT GTT GAAAG AAGT CCAGTTGG AAAT CC AAAG AAG AATTG CCT ACCT AAGGC CAACT CAAAAG AAT G ATG GTT CTTT CAG AT ACT G CTT CG AAACT G GTGTTAT G CCAG AT GCT TT CCT GATT AT GTT GTT G AG AACTTT CG ACTT GG ACAAAG AGGT GTT GATT AAGCAATT G AC CG AAAGG AT CGT GT CCTTGCAAAAT GAAGAT GGTTT GTGG ACTTT GTT CGAT GACG AAG AA CAT AACTT GTCCG CT ACT ATT CAAG CTT ACACT GCTTT GTTGT ACT CCGGTTACT ACCAAAA
GAACG ACAG AATTTT G AG AAAGGCCGAAAGGT ACATT AT CG ATT CTGGTGGT ATTT CT AG A GCCCATTTTTT G ACT AG ATGG AT GTT GT CT GTT AACGGCTT GTAT G AAT GGCCAAAGTT GTT TT ACTTGCCCCT GT CTTT GTT GTTGGTT CCAACTT AT GTT CCCTT GAACTT CT ACG AATT GT C T ACTT ACG CT AG AAT CCACTT CGTTCCTATGATGGTTGCT G GTAACAAG AAGTT CT CTTT G A CTT CT AG ACAT ACG CCAT CCTT GTCT CATTT G GAT GTT AG AG AACAG AAG CAAG AAT CCG A AG AAACT ACCCAAG AAT CT AG AGCCT CT AT CTT CTTGGTT GAT CACTT G AAACAATTGGCCT CTTT G CCAT CCT ACATT CACAAATT G G GTT AT CAAG CT GCT GAG AG GTAT ATGTTG G AAAG A ATT G AAAAAG ACG GC ACCTT GTACT CTTATG CT ACTT CT ACTTT CTT CAT G ATCTACG GTTT GTT GGCTTTGGGTTACAAGAAGGATT CCTT CGTTATT CAAAAGGCCATT GACGGTAT CTGC AGTTT GTT AT CT ACTT GCT CTGGT CAT GTT CACGTT GAAAATT CT ACAT CT ACT GTTT GGG AT ACCGCCTT GTTGT CTT ATGCTTT ACAAGAGGCT GGTGTT CCACAACAAGAT CCAAT GATT AA GG GTACT ACCAG GT ACTT G AAG AAG AG ACAAC AT ACCAAATT AG GT G ACT G G CAATTT CAT AACCCAAAT ACT G CT CCAG GT G GTT GG G GTTTTT CT GAT ATT AAC ACT AACAACCC AG ATTT GG ATT G CACCT CTG CTG CT ATT AG AG CTTT AT CT AG AAG G G CT CAAACT GAT ACCG ATT ACT TGGAATCTTGGCAGAGAGGTATTAACTGGTTATTGTCCATGCAAAACAAGGACGGTGGTTT TG CTG CTTTT GAAAAG AAC ACCG ATT CCAT CTTGTT CACCT ATTT G CCATT G G AAAAT G CTA AGGATGCT GCT ACT GAT CCAGCT ACTGCT GATTT GACTGGTAGAGTTTTGGAAT GTTT GGG TAATTTCGCTGGCATGAACAAATCTCACCCATCTATTAAGGCTGCTGTTAAGTGGTTGTTCG AT CACCAATT GG AT AAT GGTT CTTGGT AT GGTAGAT GGGGT GTTT GTT AT AT CT ATGGT ACT TGGGCTGCAATTACTGGTTTGAGAGCTGTTGGTGTTTCTGCTTCAGATCCAAGAATTATCAA GG CTAT CAACT GGTT G AAGT CCAT CCAACAAG AG GAT GGCGGTTTTGGT G AAT CTT GTT AT T CTGCTT CT CT GAAGAAGT ACGT CCCATT GT CTTTTT CTACT CCAT CT CAAACTGCTT GGGC TTT AG AT GCTTT GAT G ACAATTT GT CCATT G AAGG AT AGGT CCGTT G AAAAGGGT ATT AAGT TT CT GTT G AACCCAAACTT G ACCG AACAACAAACT CATT ACCCAACTGGTATTGGTTTGCCA GGT CAATTTT ACAT CCAAT ACCACT CCT ACAACG ACAT CTTT CCATT ATTGGCTTT AGCT CA CTACG CTAAG AAG CACTCTTCATT AG GT AG AG GTAG AAG GTCT AAGTTGTAA
SEQ ID NO: 43
AT GTGGCGTTT G AAAGTTGGT AAAGAAT CCGT CGGT GAAAAAG AAG AAAAGT GG AT CAAGT CCAT CT CCAACC ATTT G G GT AG ACAAGTTT GG G AATTTT G CG CT G AAAAT GAT GAT G ACG A T G ACG ACG AAG CT GTT ATT CAT GTT GTTGCCAACT CCT CCAAACATTT GTT ACAACAACAAA GGCGT CAGT CCT CATTT G AAAATGCT AGAAAGCAGTT CCGTAACAACAG ATT CCAT AGG AA GCAAT CCT CT GATTT GTT CTT G ACCAT CCAGTACGAAAAAG AAATTGCCAG AAATGGT GCTA AGAATGGTGGTAACACAAAGGTCAAAGAAGGTGAGGACGTTAAGAAAGAAGCCGTTAACA ATACTTTGGAAAGGGCCTTGTCTTTCTACTCTGCTATTCAAACTTCTGATGGTAACTGGGCT TCTGACTTAGGTGGTCCAATGTTTTTGTTGCCAGGTTTGGTTATTGCCTTGTACGTTACTGG T GTTTT G AACT CCGTTTT GT CT AAGCACCAT AG ACAAGAAAT GT GCAGGTACAT CT ACAACC ACCAAAATGAAGATGGTGGTTGGGGTTTACACATTGAAGGTTCTTCTACTATGTTCGGTTCT GCCTT G AATT AT GTT GCTTT GAG ATT GCT AGGT GAAGAT GCT AATGGT GGT GAAT GTGGTG CTATG ACAAAAG CT AG AT CAT G G ATTTT G G AAAG AG GT GG CG CT ACT G CT ATT ACTT CTT G GGGTAAATT GTGGTTGTCT GTTTTGGGT GTTT AT GAGT GGTCT GGTAACAAT CCATTGCCA CCAGAATTTT GGTTGCT GCCAT ATT CTTT GCCATTT CAT CCAGGT AGG AT GTGGT GT CATT G CAGAAT GGTTT ATTT GCCCAT GT CTT ACTT GTACGGT AAG AGATTT GTT GGT CCAAT CACT C ACAT GGT CTT GT CTTT GAG AAAAGAGTT GTACACCATT CCAT ACCACG AAATT G ATTGGAAC AG AT CCAG AAACACTT GTGCT CAAGAGG ACTT GTATT ACCCACAT CCAAAGAT GCAAGAT A TTTT GTGGGGTT CCAT CTACCAT GTTTACGAACCTTT GTTTAAT GGTT GGCCAGGT AGAAGA TT G AG AG AAAAG G CT AT G AAG ATT GCCAT G G AACAT ATT CATT ACG AG G ACG AAAATT CCC GTT ACAT CT ATTTGGGT CCAGTT AACAAGGT CTT G AACAT GTTGTGTT GTT GGGTT GAAGAT CCAT ACT CT GATGCTTTT AAGTT CCACTTGCAAAG AAT CCCAG ATT ACTT GTGGTT GGCCGA AGATGGTATGAGAATGCAAGGTTATAATGGTTCCCAATTGTGGGATACCGCTTTCTCTATTC
AAG CT ATTTT GT CCACCAAGTT G ATCG AT ACTTT CG GTTCT ACTTT G AGG AAG G CACAT CAT TT CGT CAAGCACT CT CAAAT CCAAGAGGATT GT CCAGGT GAT CCT AAT GTTTGGTTT AGACA T ATT CACAAAGGT GCCTGGCCATT CT CT ACT AG AG AT CAT GGTTGGTT G ATTT CT G ATTGCA CTGCT GAAGGTTT GAAGGCTT CTTT GAT GTT GT CTAAGTTGCCAT CTAAGAT CGTTGGT GAA CCATTGG AAAAGAACAGATT GTGT GATGCCGTT AACGT CTT GTTGT CCTTGCAAAACG AAA ACGGTGGTTTTGCTT CTTACGAATT GACTAGAT CATACCCCTGGTTGGAATT GATTAACCCA GCT G AAACTTT CGGT GAT AT CGTT AT CG ATT ACT CCT ACGTT G AAT GT ACTT CTGCT ACT AT GG AAG CTTTGG CTTTGTTCAAAAAATTG CATCCAG GTCACAGG ACCAAAG AAATAG ATG CT GCTTTGGCT AAAG CTGCT AACTTCTTG G AAAAC AT G C AAAG AACT GATGGTTCTTGGTACG GTTGTTGGGGTGTTTGTTTTACTTATGCTGGTTGGTTTGGTATCAAAGGTTTAGTTGCTGCT GGTAG AACCT ACAACAATTGCGTTGCT ATT AG AAAGGCCT GT CACTT CTT GCT GT CT AAAG A ATT ACCAG GT GGTGGATGGGGT G AAT CTT ATTT GT CTT GT CAAAACAAG GTTT ACACC AACT TGGAAGGTAACAGACCACATTTGGTT AATACTGCTT GGGTTTT GATGGCTTT GATT GAAGCT GGT CAAGGT GAAAGAG AT CCAGCT CCATT GCAT AG AGCT GCT AG ATT ATT GAT CAACT CCC AATT G G AAAAT G GT G ACTT CCCACAACAAG AAAT CAT GG GTGTTTT C AACAAG AACT G CAT GATT ACTTACGCTGCCTACAGAAACATTTT CCCT ATTTGGGCTTTGGGT GAAT ACT CCCATA GAGTTTT GACT G AGT GA
SEQ ID NO: 44
MNSLFVGRPIVKSSYNVYTLPSSICGGHFFKVSNSLSLYDDHRRTRIEIIRNSELIPKHVAIIMDGN
RRWAKARGLPVQEGHKFLAPNLKNICNISSKLGIQVITAFAFSTENWNRSSEEVDFLMRLFEEF
FEEFMRLGVRVSLIGGKSKLPTKLQQVIELTEEVTKSNEGLHLMMALNYGGQYDMLQATKNIAS
KVKDGLIKLEDIDYTLFEQELTTKCAKFPKPDLLIRTGGEQRISNFLLWQLAYSELYFTNTLFPDF
GEEALMDAIFSFQRRHRRFGGHTY
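The sequences in this listing are broken up by stray whitespace from text extraction. The following is a minimal sketch, not part of the original disclosure, of how such a sequence might be normalized and given a basic plausibility check before further use. The function names and the short test fragment are hypothetical; the fragment is not one of the listed sequences.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def normalize_sequence(raw: str) -> str:
    """Remove all whitespace and upper-case a sequence copied from a listing."""
    return re.sub(r"\s+", "", raw).upper()


def looks_like_orf(dna: str) -> bool:
    """Cheap plausibility check: ATG start, in-frame length, terminal stop codon."""
    return (
        len(dna) % 3 == 0
        and dna.startswith("ATG")
        and dna[-3:] in STOP_CODONS
    )


# Hypothetical fragment with extraction-style spacing (illustration only).
fragment = "AT GTGG CGT\nTAA"
clean = normalize_sequence(fragment)
```

A check of this kind only flags obvious extraction damage; it does not confirm that a sequence matches the filed sequence listing.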
EXAMPLES
The techniques and methods described herein are carried out in a manner known to the skilled person. Further details may be found, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. All methods including the use of kits and reagents are carried out according to the manufacturers' information unless specifically indicated.
Genes used:
Table 1 - List of genes used in the study. Accession numbers are for Uniprot.org. Where unavailable a reference is cited.
Gene | Origin | Function | Accession number/reference
ERG20N127W | S. cerevisiae | Erg20p dominant negative mutant that confers geranyl pyrophosphate synthase activity to the homo/heterodimeric partner subunit | (Ignea et al 2014)
CsTHCAs | Cannabis sativa | Tetrahydrocannabinolic acid synthase | Q8GTB6
ScCKI | S. cerevisiae | Choline kinase | P20485
ScEKI | S. cerevisiae | Ethanolamine kinase | Q03764
EcThiM | Escherichia coli | Hydroxyethylthiazole kinase | P76423
SfPhoN | Shigella flexneri | Acid phosphatase / Alcohol kinase | O50542
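Because the accession numbers in Table 1 are UniProt identifiers, their format can be checked mechanically. The sketch below uses the accession pattern published in the UniProt help pages; the helper name and the dictionary layout are assumptions made for illustration. A string that begins with the digit 0 instead of the letter O does not match the pattern, which makes this a convenient check for character-recognition errors.

```python
import re

# Accession pattern as given in the UniProt help pages.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_uniprot_accession(text: str) -> bool:
    """Return True if the whole string has the shape of a UniProt accession."""
    return UNIPROT_ACCESSION.fullmatch(text) is not None


# Accessions copied from Table 1 (gene name -> accession).
accessions = {
    "CsTHCAs": "Q8GTB6",
    "ScCKI": "P20485",
    "ScEKI": "Q03764",
    "EcThiM": "P76423",
}
invalid = [gene for gene, acc in accessions.items() if not is_uniprot_accession(acc)]
```

The pattern validates the format only; it does not confirm that an accession exists or belongs to the named protein.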
Yeast strains
The yeast strains used in this application were based on the Saccharomyces cerevisiae strain EGY48 disclosed in Ignea et al. (2011), Thomas B.J. and R. Rothstein (1989) and Ellerstrom M. et al. (1992), and were modified according to Table 2.
Table 2: Strain genotype
Construction of plasmids:
Plasmids were generated using standard genetic engineering methods known in the art. Detailed protocols for plasmid construction can be found in general handbooks on molecular cloning. Genes were amplified by PCR and placed under the control of the dual inducible promoters PGAL1 and PGAL10. Coding gene sequences were then ligated using USER cloning (Nour-Eldin et al (2010)) into the backbone of the pESC-URA, pESC-LEU, pESC-TRP, and pESC-HIS vectors (Agilent Technologies) to construct the plasmids listed in Table 3.
Yeast media and yeast cultivation conditions
The yeast cells were first cultured on selective complete minimal media with glucose at 30°C overnight. Selective complete minimal media consisted of 0.13% w/v dropout powder, 0.67% w/v yeast nitrogen base without amino acids with ammonium sulphate (YNB+AS), and 2% w/v glucose. The dropout powder was purchased lacking leucine, histidine, uracil and tryptophan. When required, these four nutrients were added at 0.01-0.02% w/v. Cells were then harvested by centrifugation to remove the medium and resuspended in selective minimal production media. This medium was used to induce the galactose promoters, with additional raffinose as an alternative carbon source. Selective minimal production media composition: 0.13% w/v dropout powder, 0.64% w/v YNB+AS, 2% galactose, 1% w/v raffinose. When appropriate, the same four nutrients as above were added at 0.01-0.02% w/v. The cultures were grown at 30°C, 150 rpm, for the indicated time, and analyzed using GC-FID and/or SPME sampling and GC-MS. Isopropyl myristate (IPM) was added as an overlay corresponding to 10% of the culture volume in samples analyzed with GC-FID.
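The percentages above translate directly into weighed amounts. The following sketch, under the assumption that all percentages (including the 2% galactose) are w/v, i.e. grams per 100 mL, computes a recipe for an arbitrary batch volume and the IPM overlay volume. The function and variable names are illustrative and not taken from the original text.

```python
def grams_needed(percent_w_v: float, volume_ml: float) -> float:
    """Mass in grams for a % w/v concentration (g per 100 mL) in a given volume."""
    return percent_w_v / 100.0 * volume_ml


# Selective minimal production medium; values taken from the text above,
# with galactose assumed to be % w/v.
PRODUCTION_MEDIUM = {
    "dropout powder": 0.13,
    "YNB+AS": 0.64,
    "galactose": 2.0,
    "raffinose": 1.0,
}


def recipe(volume_ml: float) -> dict:
    """Grams of each component for the requested batch volume."""
    return {
        name: round(grams_needed(pct, volume_ml), 3)
        for name, pct in PRODUCTION_MEDIUM.items()
    }


def ipm_overlay_ml(culture_ml: float) -> float:
    """Isopropyl myristate overlay at 10% of the culture volume."""
    return 0.10 * culture_ml
```

For example, `recipe(1000.0)` gives the amounts for one litre of production medium, and `ipm_overlay_ml(5.0)` gives the overlay for a 5 mL culture.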
Cannabinoid extraction and analysis
500 µL of yeast culture was transferred to a 2 mL tube containing approx. 100 mg of 0.5 mm glass beads and 500 µL ethyl acetate with 0.05% formic acid, vortexed for 3 minutes and then centrifuged for 1 minute at 20,000 g. The procedure was repeated three times; each time the resulting supernatant was transferred to a separate 1.5 mL tube and 500 µL of ethyl acetate with 0.05% formic acid was added to the original tube. The resulting solution was evaporated using vacuum centrifugation and the resulting pellet was suspended in 200 µL HPLC-grade methanol. Insoluble compounds were removed using 0.22 µm Ultrafree centrifugal filters (Merck Millipore, Tullagreen, Ireland).
Qualitative LC-ESI-MS analysis was performed on the Dionex UltiMate® 3000 Quaternary Rapid Separation UHPLC focused system (Thermo Fisher Scientific, Germering, Germany) equipped with a Phenomenex Kinetex XB-C18 column (100 mm × 2.1 mm i.d., 1.7 µm particle size, 100 Å pore size) (Phenomenex, Inc., Torrance, CA, USA). The samples were analyzed according to Hansen, N.L., Miettinen, K., Zhao, Y. et al. Integrating pathway elucidation with yeast engineering to produce polpunonic acid the precursor of the anti-obesity agent celastrol. Microb Cell Fact 19, 15 (2020). https://doi.org/10.1186/s12934-020-1284-9
Example 1:
AtFKI is a superior kinase
To evaluate the performance in yeast of the proposed alcohol phosphorylation pathway, consisting of the Saccharomyces cerevisiae Cki1p and the A. thaliana IPK enzymes (Chatzivassileiou et al 2019), we over-expressed these two genes from plasmid vectors pUUS-GAL1 and pHUS-GAL1, respectively, in S. cerevisiae strain EGY48 (a W303 derivative). As a control, cells expressing no kinase, i.e. transformed with the empty pUUS and pHUS vectors, were used. A concentration of 0.1% (v/v) of the alcohols prenol or isoprenol was used. As an additional control, Cki1p and AtIPK were expressed together in the absence of alcohol.
Yeast cells were grown overnight in synthetic complete minimal glucose selective media (CM-GLU), washed twice and transferred to synthetic complete minimal galactose selective media (CM-GAL) to induce heterologous gene expression. A 5 mL aliquot of the culture was transferred to a 35 mL vial, which was incubated for approximately 72 h, and the headspace above the culture was analyzed by SPME and GC-MS. We examined the yeast culture headspace for the presence of geraniol, linalool, farnesol or nerolidol, which would indicate a functional pathway. Only very low levels of linalool were detected, marginally higher than the linalool levels in the control cells. These results suggested that Cki1p or AtIPK could not provide sufficient flux through this pathway, and we searched for better performing enzymes.
We compared the performance of a set of candidate kinases for the first step, which included the enzymes S. flexneri PhoN (SfPhoN), S. cerevisiae Eki1p (ScEKI), S. cerevisiae Cki1p (ScCKI), E. coli ThiM (EcThiM) and A. thaliana FOLK (AtFKI; SEQ ID: 2). The corresponding genes were cloned in plasmid vector pUUS and co-expressed with AtIPK from vector pHUS-AtIPK. The strains used were: EGY48, NCTY 7, NCTY 22, NCTY 23, NCTY 24, NCTY 25.
Under the same conditions as described above, when AtFKI and AtIPK were co-expressed, the production of linalool was considerably higher when either prenol or isoprenol was used, indicating that the AtFKI kinase was far more effective than the remaining candidates in increasing the intracellular concentration of the terpene precursor GPP. Thus, linalool production identified co-expression of AtFKI and AtIPK in yeast as the most efficient combination; AtFKI is superior to other alcohol kinases in phosphorylating prenol and isoprenol in yeast.
Example 2:
A truncated form of AtFKI missing the first 65 amino acids is equally active as full-length AtFKI in yeast
AtFKI is a chloroplastic enzyme and therefore contains an N-terminal chloroplast targeting signal. Therefore, the active form of AtFKI in plant cells is a shorter form of the full-length enzyme. To evaluate the minimum domain required for AtFKI to be functional in yeast, we evaluated the activity of different truncations. AtFKI with the first 65 amino acids removed and replaced with a start codon (Δ65AtFKI; SEQ ID: 2) was expressed to compare the activity of the full native and the truncated version of the enzyme. The full and truncated enzymes were each expressed from a corresponding galactose-inducible pUUS plasmid in combination with AtIPK. Yeast strain EGY48 transformed with these plasmids was grown and analyzed as described in Example 1. Linalool production was similar whether the full enzyme or the truncated version was expressed. These results demonstrate that the first 65 amino acids of AtFKI are not necessary for full function, and that the region 66-end is a shorter, equally functional form of AtFKI. Strains used: NCTY 7 and NCTY 8.
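At the protein level, the truncation described above amounts to dropping the first 65 residues and prepending a methionine encoded by the new start codon. A minimal sketch of that operation follows; the function name is hypothetical and the input is a synthetic placeholder sequence, not AtFKI.

```python
def truncate_n_terminus(protein: str, n_removed: int = 65) -> str:
    """Drop the first n_removed residues and prepend Met (new start codon)."""
    if n_removed >= len(protein):
        raise ValueError("cannot remove more residues than the protein contains")
    return "M" + protein[n_removed:]


# Synthetic placeholder (illustration only): Met + 64 filler residues,
# followed by a made-up 50-residue "mature" region.
full_length = "M" + "A" * 64 + "SLEEK" * 10
truncated = truncate_n_terminus(full_length)
```

The result is 64 residues shorter than the input: 65 residues are removed and one methionine is added.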
Example 3:
An alcohol phosphorylation pathway comprising AtFKI performs well in combination with various phosphokinases.
To evaluate the effect of different phosphokinases to complement AtFKI in constructing a functional isoprenoid precursor supply pathway, AtFKI was expressed together with AtIPK, MtIPK, TaIPK or the TaIPK(204G) mutant, and the headspace was analysed using GC-MS-SPME, as described in Examples 1 and 2. Strains used: NCTY 7, NCTY 9, NCTY 10, and NCTY 11.
The results shown in Fig. 4 (production of linalool) demonstrate that AtFKI works well in combination with various phosphokinases.
Example 4:
To evaluate the effect of the AtFKI alcohol bioconversion pathway on terpenoid production, we analyzed the production of the terpenes limonene, myrcene, and sabinene by co-expression of the corresponding terpene synthase, limonene synthase, myrcene synthase, or sabinene synthase, respectively. The strains used were NCTY 12, NCTY 13 and NCTY 14.
In Fig. 5 the production of limonene using AtFKI and AtIPK is compared to the production achieved in the absence of an alcohol. Further, an additional 80-fold increase in myrcene production was obtained with the alcohol feeding pathway, i.e. co-expression of AtFKI-AtIPK and ObMyrS in yeast. Media with 0.05% prenol and 0.05% isoprenol was used.
Also, an additional 20-fold increase in sabinene production was obtained with the alcohol feeding pathway, i.e. co-expression of AtFKI-AtIPK and SpSabS in yeast cells. Media with 0.05% prenol and 0.05% isoprenol was used.
Furthermore, in Fig. 5, evidence is presented that different ratios and concentrations of prenol and isoprenol can be used to further enhance or control terpene production. The data demonstrate that the concentration and ratio of prenol and isoprenol can be used to control the total production. Prenol:isoprenol ratios of 3:0, 2:1, 1:1 and 1:2 gave similar improvements, an approximately 58-fold increase compared with no alcohol feeding (prenol:isoprenol = 0:0). The overall concentration of alcohols was 0.1%.
Further, limonene yield can be adjusted by feeding alcohols at different concentrations to the strain containing the alcohol feeding pathway and the limonene downstream building block. An alcohol concentration of 0.3% gave the highest limonene titer, approximately 40-fold higher than without alcohol feeding (alcohol concentration = 0%). The prenol:isoprenol ratio was 1:1.
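The feeding conditions above are defined by a total alcohol concentration and a prenol:isoprenol ratio. As a rough sketch, assuming the percentages are v/v relative to the culture volume and ignoring the small volume change caused by the addition itself, the volume of each alcohol to add can be computed as follows. The function name and the 5 mL culture volume in the usage note are illustrative.

```python
def alcohol_volumes_ul(
    culture_ml: float,
    total_percent_v_v: float,
    prenol_parts: float,
    isoprenol_parts: float,
) -> tuple:
    """Return (prenol_ul, isoprenol_ul) for a total % v/v and a mixing ratio."""
    parts = prenol_parts + isoprenol_parts
    if parts == 0:
        # Ratio 0:0 corresponds to the no-feeding control.
        return (0.0, 0.0)
    total_ul = culture_ml * 1000.0 * total_percent_v_v / 100.0
    return (
        total_ul * prenol_parts / parts,
        total_ul * isoprenol_parts / parts,
    )
```

For a 5 mL culture at 0.1% total alcohol and a 1:1 ratio, `alcohol_volumes_ul(5.0, 0.1, 1, 1)` gives about 2.5 µL of each alcohol; a 3:0 ratio puts the full 5 µL into prenol.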
In Fig. 5a and 5b, a similar increase in production of terpenes is demonstrated with AtFKI and AtIPK co-expressed with myrcene synthase or sabinene synthase, respectively.
When producing limonene, production is increased over 25-fold when prenol and isoprenol are used, compared to a strain lacking AtFKI (state-of-the-art production through isoprenoid precursors synthesized from the mevalonate pathway). Similarly, a 20-fold increase is found for sabinene.
Example 5:
AtFKI boosts cannabinoid production from prenol and isoprenol.
Cannabigerolic acid (CBGA) is a central intermediate in the synthesis of natural cannabinoids. Co-expression of the alcohol feeding pathway AtFKI-AtIPK and the CBGA-synthesizing fusion Erg20p(N127W)-CsPT4 in yeast cells (strain NCTY 19) resulted in 7.5-fold increased CBGA production (pink) compared to the production titer without feeding alcohol (black). Samples were analyzed in triplicate (n = 3 biologically independent samples). Olivetolic acid was added to the cultures at 0.1 mM.
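Fold increases such as the one reported above are ratios of mean titers across biological replicates. The sketch below shows one way such a figure might be computed from triplicate measurements. The numbers are invented purely for illustration and are not data from this application, and the function name is hypothetical.

```python
from statistics import mean, stdev


def fold_change(fed: list, unfed: list) -> float:
    """Ratio of the mean titer with alcohol feeding to the mean titer without."""
    baseline = mean(unfed)
    if baseline == 0:
        raise ValueError("baseline mean is zero; fold change is undefined")
    return mean(fed) / baseline


# Invented triplicate values (arbitrary units), for illustration only.
fed_titers = [6.0, 5.0, 7.0]
unfed_titers = [2.0, 2.0, 2.0]

result = fold_change(fed_titers, unfed_titers)
spread = stdev(fed_titers)  # sample standard deviation of the fed replicates
```

Reporting the replicate spread alongside the fold change helps a reader judge how robust the increase is.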
Example 6:
Synthesis of non-canonical terpenoids using AtFKI.
We tested the ability of AtFKI to phosphorylate additional alcohols, beyond prenol or isoprenol, by co-expressing AtFKI with AtIPK and ClLimS in EGY48 cells (giving rise to strain NCTY 14).
We supplemented the media with different alcohols and found that several of them were converted into non-canonical terpene-like structures.
The data of Fig. 8 demonstrate that novel compounds are produced when prenol-like alcohols with additional methyl groups are converted with AtFKI + AtIPK + ClLimS.
The structures suggested for the peaks are based on the alcohol that was converted and the mass spectra of the novel compounds.
In addition to the examples shown above, novel compounds have also been found when several other alcohols were tested with AtFKI and AtIPK (see Fig. 9).
Based on these observations, we propose that AtFKI can phosphorylate the core alcohol structure shown in Fig. 10.
Example 7:
Protein engineering of the terpene synthase facilitates the production of non-canonical terpenes.
Since several of the alcohols tested yielded only limited amounts of novel terpenoids, we examined whether this limitation in the production of non-canonical terpenoids is due to the inability of the terpene synthase, in this case limonene synthase, to accept the non-canonical substrate. To evaluate this, we subjected limonene synthase to site-directed mutagenesis aimed at specific residues whose substitution could expand the active site cavity and accommodate the larger substrate. The strains used in this study were: NCTY 26, NCTY 27 and NCTY 28.
The data in Fig. 11 demonstrate that terpene synthases can indeed be mutated to utilize the novel non-canonical building block instead of the canonical building block, i.e., the mutants favour the larger non-canonical substrates over the canonical GPP.
Example 8:
Synthesis of non-canonical cannabinoids.
We explored whether AtFKI can provide non-canonical precursors for the synthesis of cannabinoids with unprecedented structures. Fig. 12 demonstrates the capability of the system to convert a range of alcohols into the corresponding diphosphates, which are then attached to olivetolic acid to yield cannabigerolic acid (CBGA) analogues with novel non-canonical structures.
Four different alcohols were fed to the yeast strain (NCTY 21) containing the alcohol feeding pathway AtFKI-AtIPK and the CBGA downstream building block CsPT4-Erg20p(N127W), each producing a blend of non-canonical CBGA analogues (red), compared with no alcohol feeding (green). Yeast cells expressing Erg20p(N127W)-CsPT4 alone (NCTY 20; blue) and AtFKI-AtIPK alone (NCTY 21b; purple) are shown as control 1 and control 2, respectively.
a, 3-methylpent-2-en-1-ol was used as feed to produce compound 1 (2,4-dihydroxy-3-(3-methylpent-2-en-1-yl)-6-pentylbenzoic acid), compound 2 (3-((2E)-3,7-dimethylnona-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid) and compound 3 (2,4-dihydroxy-6-pentyl-3-(3,4,7-trimethylnona-2,6-dien-1-yl)benzoic acid).
b, 3,4-dimethylpent-2-en-1-ol was used as feed to produce compound 4 (3-(3,4-dimethylpent-2-en-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid) and compound 5 (2,4-dihydroxy-6-pentyl-3-((2E)-3,7,8-trimethylnona-2,6-dien-1-yl)benzoic acid).
c, 3-methylhex-2-en-1-ol was used as feed to produce compound 6 ((E)-2,4-dihydroxy-3-(3-methylhex-2-en-1-yl)-6-pentylbenzoic acid) and compound 7 (3-((2E,6E)-3,7-dimethyldeca-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid).
d, 3-ethylpent-2-en-1-ol was used as feed to produce compound 8 ((E)-3-(7-ethyl-3-methylnona-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid).
Example 9:
Production of non-canonical terpenes with 16 and 17 carbon atoms using AtFKI.
This example illustrates the biosynthesis of non-canonical sesquiterpene scaffolds with 16 and 17 carbon atoms. Fig. 13 shows the production of non-canonical sesquiterpene scaffolds with 16 and 17 carbons by co-expression of the alcohol conversion pathway AtFKI and AtIPK, together with the farnesyl pyrophosphate (FPP) synthase Erg20p and the terpene synthase CYC2 in S. cerevisiae strain EGY48. The cells were grown according to the conditions described in Example 6.
To improve the product yield of the non-canonical C16 and C17 sesquiterpene precursors, we selected and tested several wild-type and mutant FPP synthases and geranylgeranyl pyrophosphate (GGPP) synthases. Fig. 14 shows that Erg20p(F96C) is the most effective enzyme to increase the yield of C16 and C17 sesquiterpenes when feeding 3-methylbut-2-en-1-ol (3M2E). Similarly, Erg20p(F96C) is also the most efficient of the enzymes tested for improving the production of C16 and C17 sesquiterpenes when feeding 3,4-dimethylpent-2-en-1-ol (3,4-DMP) (Fig. 15) and 3-methylpent-2-en-1-ol (3-MP) (Fig. 16). However, SynGGPPS was found to be the most efficient enzyme tested to increase the yield of C17 sesquiterpenes when feeding 3-ethylpent-2-en-1-ol (3E2E) (Fig. 17).
The data in Example 9 demonstrate that a yeast cell expressing AtFKI is capable of phosphorylating the core alcohol structures shown in Figures 13-17 (3-methylbut-2-en-1-ol (3M2E), 3,4-dimethylpent-2-en-1-ol (3,4-DMP), 3-methylpent-2-en-1-ol (3-MP), 3-ethylpent-2-en-1-ol (3E2E)) and of producing C16 and C17 (non-canonical) terpenes.
Example 10:
This example illustrates the biosynthesis of a non-canonical sesquiterpene by the action of the enzyme Salvia fruticosa caryophyllene synthase (Sf126) on C16 prenyl diphosphate substrates produced by the alcohol conversion pathway. The alcohol conversion pathway enzymes AtFKI and AtIPK were co-produced in yeast EGY48 cells with the FPP synthase Erg20p(F96C) and the terpene synthase Sf126. 3-methylpent-2-en-1-ol (3-MP) was supplied (Fig. 18).
This example demonstrates that AtFKI can phosphorylate the core alcohol structure 3-methylpent-2-en-1-ol (3-MP) and produce a non-canonical C16 sesquiterpene.
Example 11:
This example illustrates the biosynthesis of non-canonical triterpene scaffolds. Fig. 19 shows the production of C31 (homo)squalene by co-expression of the alcohol conversion pathway AtFKI-AtIPK and the FPP synthase Erg20p(F96C) upon feeding 3-methylpent-2-en-1-ol (3-MP) and prenol. The yeast strain EGY48 was used and the cells were cultured under the conditions described in Example 6.
Example 12:
Example 12 illustrates the biosynthesis of non-canonical triterpenes. Fig. 20 shows the production of a non-canonical triterpenoid product by co-expression of the alcohol conversion pathway AtFKI-AtIPK and cucurbitadienol synthase CPQ. Fig. 21 shows the production of another non-canonical triterpenoid product by co-expression of the alcohol conversion pathway AtFKI-AtIPK and (+)-ambrein synthase BmeTC(373C).
In conclusion, Examples 9-12 demonstrate that yeast cells expressing AtFKI are capable of phosphorylating a variety of primary alcohols and of generating non-canonical sesquiterpenoids and triterpenoids.
ITEMS
Item 1. A genetically engineered eukaryotic cell for the production of a terpene or terpenoid or isoprenoid comprising a first nucleic acid sequence encoding a first kinase that phosphorylates a primary alcohol to a mono- or pyrophosphate terpenoid precursor; and optionally a second nucleic acid sequence encoding a phosphokinase that phosphorylates a monophosphate precursor to a terpenoid pyrophosphate precursor; wherein the first kinase comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto.
Item 2. The cell according to item 1 wherein the first kinase is an alcohol kinase capable of phosphorylating a primary alcohol to a monophosphate terpenoid precursor.
Item 3. The cell according to either item 1 or 2 wherein said first nucleic acid sequence encodes a kinase that is capable of phosphorylating a non-canonical, prenol-like primary alcohol to a non- canonical monophosphate terpenoid precursor.
Item 4. The cell according to any of the preceding items, wherein said first nucleic acid encodes a kinase that also has phosphokinase activity thus being capable of catalyzing the conversion of a primary alcohol to a terpenoid pyrophosphate precursor.
Item 5. The cell according to any one of the preceding items, wherein the alcohol kinase is Farnesol kinase of Arabidopsis thaliana or SEQ ID NO: 1 or a homolog or variant thereof having at least 75% identity thereto.
Item 6. The cell according to any one of the preceding items, wherein the phosphokinase is a prenyl phosphate kinase, such as an isopentenyl phosphate kinase.
Item 7. The cell according to item 6, wherein the phosphokinase is SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a homologue or variant thereof having at least 75% identity thereto.
Item 8. The cell according to any one of the preceding items wherein the kinase or kinases are fused to one or more peptide or peptide analogues.
Item 9. The cell according to item 8, wherein said one or more peptide or peptide analogues confer additional functionality to the kinase or kinases, such as an improved enzyme kinetics, or intracellular localisation functionality, or the functionality of increasing the stability, promiscuity and / or half-life of the kinase or kinases.
Item 10. The cell according to item 9, wherein the peptide or peptide analogue is maltose-binding protein, green fluorescent protein, thioredoxin, glutathione S-transferase, yeast farnesyl diphosphate synthase (Erg20p), NusA, small ubiquitin related modifier Smt3, or a fragment thereof.
Item 11. The cell according to any one of the preceding items, wherein the primary alcohol is an alcohol with the structure of formula 1:
Formula 1:
wherein R1 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, a group containing a reactive nonmetal; a group containing a metalloid; a group containing a halogen, a group containing oxygen, a group containing nitrogen, a group containing sulfur, a group containing phosphorus and / or a group containing boron;
R2 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, a group containing a reactive nonmetal; a group containing a metalloid; a group containing a halogen, a group containing oxygen, a group containing nitrogen, a group containing sulphur, a group containing phosphorus and / or a group containing boron; and
R3 is hydrogen, methyl, fluorine, chlorine, bromine, iodine, sulfhydryl; or hydroxyl.
Item 12. The cell according to item 11, wherein the primary alcohol is 3-methylbut-2-en-1-ol, 4- fluoro-3-methylbut-2-en-1 -ol, 3-methylpent-2-en-1 -ol, 3,4-dimethylpent-2-en-1 -ol, 3-ethylpent-2- en-1-ol, 3-methylhex-2-en-1-ol, 3-methylhexa-2,5-dien-1-ol, 3-methylbut-3-en-1-ol, 3- methylenepentan-1 -ol, 2-methylprop-2-en-1 -ol, 3-methyl-4-(methylthio)but-2-en-1 -ol, 5-chloro-3- methylpent-2-en-1-ol, 3,4-dimethylpent-2-en-1-ol, 4-methyl-3-methylenepentan-1-ol, 3,4- dimethylpent-3-en-1-ol, propan-1-ol, prop-2-en-1-ol, prop-2 -yn-1-ol, butan-1-ol, but-3-en-1-ol, but-2-en-1-ol, buta-2,3-dien-1-ol, but-3-yn-1-ol, 3-methylbut-3-en-1-ol, 3-methylbut-2-en-1-ol, 3- methylbutan-1-ol, but-2-yn-1-ol, 2-methylenebutan-1-ol, 2-methylbut-2-en-1-ol, 2-methylbut-3-en- 1 -ol, 2-methylbutan-1-ol, 3-ethylpent-4-en-1-ol, 3-methylpenta-2,4-dien-1-ol, 3-methylpentan-1- ol, 3-methylpent-2-en-1-ol, 3-methylenepentan-1-ol, 3-methylpent-3-en-1-ol, 3-ethylpentan-1-ol, 3-ethylpent-4-en-1-ol, 3-ethylpent-3-en-1-ol, 3-ethylpent-2-en-1-ol, 3-ethylpent-4-yn-1-ol, 3- methylenepent-4-en-1-ol, 3-methylpent-4-yn-1-ol, 3-methylenepent-4-yn-1-ol, 3-methylhexan-1- ol, 3-methylhex-2-en-1-ol, 3-methylenehexan-1-ol, 3-methylhex-3-en-1-ol, 3-methylhex-5-en-1- ol, 3-methylpent-2-en-4-yn-1-ol, 3-methylhex-4-en-1-ol, 3-methylhexa-2,4-dien-1-ol, 3- methylhexa-2,5-dien-1-ol, 3-methylheptan-1-ol, 3-methylhepta-2,4-dien-1-ol, 3- methyleneheptan-1-ol, 3-methylhept-3-en-1-ol, 3-methylhept-4-en-1-ol, 3-methylhept-5-en-1-ol, 3-methylhept-6-en-1-ol, 3-methylhept-2-en-1-ol, 3-methylenehept-4-en-1-ol, 3-methylhepta-3,4- dien-1-ol, 3-methylhept-4-en-1-ol, 3-methylhepta-5,6-dien-1-ol, 3-methylhepta-2,5-dien-1-ol, 3- methylhepta-2,6-dien-1-ol, 3-methylenehept-6-en-1-ol, 3-methylenehept-5-en-1-ol, 3- methylhepta-3,5-dien-1-ol, 3-methylhepta-4,5-dien-1-ol, 3-methylhepta-3,5,6-trien-1-ol, 3- methylhepta-4,6-dien-1-ol, 3-methylhepta-2,4,6-trien-1-ol, 3-methylhepta-4,6-dien-1-ol, 3- methyloctan-1-ol, 3-methyloct-2-en-1-ol, 
3-methyleneoctan-1-ol, 3-methyloct-3-en-1-ol, 3- methyloct-4-en-1-ol, 3-methyloct-5-en-1-ol, 3-methyloct-7-en-1-ol, 3-methylocta-2,4-dien-1-ol, 3- methyleneocta-4,5-dien-1 -ol, 3-methyleneoct-4-en-1 -ol, 3-methylocta-3,4-dien-1 -ol, 3-methyloct- 6-en-1-ol, 3-methylocta-2,4,5-trien-1-ol, 3-methylocta-2,4,6-trien-1-ol, 3-methyleneocta-4,6-dien- 1 -ol, 3-methylocta-4,5-dien-1-ol, 3-methylocta-5,6-dien-1-ol, 3-methylocta-6,7-dien-1-ol, 3- methyleneocta-5,7-dien-1-ol, 3-methylocta-2,5,7-trien-1-ol, 3-methyleneocta-4,7-dien-1-ol, 3- methylocta-2,4,7-trien-1-ol, 3-methylocta-4,5,7-trien-1-ol, 3-methylocta-3,4,6-trien-1-ol, 3-
83 methylocta-3,4,7-trien-1-ol, 3-fluorobut-2-en-1-ol, 3-chlorobut-2-en-1-ol, 3-bromobut-2-en-1-ol, 3- aminobut-2-en-1-ol, 3-phosphaneylbut-2-en-1-ol, 3-fluorobut-3-en-1-ol, 3-chlorobut-3-en-1-ol, 3- bromobut-3-en-1-ol, 3-aminobut-3-en-1-ol, 3-phosphaneylbut-3-en-1-ol, 4-chloro-3-methylbut-2- en-1-ol, 4-bromo-3-methylbut-2-en-1-ol, 4-hydroxy-2-methylbut-2-enal, 2-methylbut-2-ene-1 ,4- diol, 4-mercapto-3-methylbut-2-en-1-ol, 3-methylpent-2-en-1-ol, 4-amino-3-methylbut-2-en-1-ol, 3-(fluoromethyl)but-3-en-1-ol, 4-fluoro-3-methylbut-2-en-1-ol, 3-(chloromethyl)but-3-en-1-ol, 3- (bromomethyl)but-3-en-1-ol, 4-hydroxy-2-methylenebutanal, 2-methylenebutane-1 ,4-diol, 3- (mercaptomethyl)but-3-en-1-ol, 3-methylenepentan-1-ol, 3-(aminomethyl)but-3-en-1-ol, 3- (phosphaneylmethyl)but-3-en-1 -ol, 3-methyl-4-phosphaneylbut-2-en-1 -ol, 5-fluoro-3-methylpent- 2-en-1-ol, 5-bromo-3-methylpent-2-en-1-ol, 5-chloro-3-methylpent-2-en-1-ol, 3-methylpent-2- ene-1 ,5-diol, 5-hydroxy-3-methylpent-3-enal, 5-iodo-3-methylpent-2-en-1-ol, 3-methyl-4- (methylthio)but-2-en-1 -ol, 5-mercapto-3-methylpent-2-en-1 -ol, 5-amino-3-methylpent-2-en-1 -ol, or 3-methyl-5-phosphaneylpent-2-en-1-ol or an analogue of any of these compounds including analogues with the elements nitrogen, oxygen, fluorine, silicon, phosphorus, sulphur, chlorine, selenium, boron, iodine, lithium, sodium or potassium.
Item 13. The cell according to item 11, wherein the primary alcohol is prenol, isoprenol or a prenol-like alcohol.
Item 14. The cell according to any one of the preceding items, further comprising a further exogenous nucleic acid sequence enabling increased expression of an enzyme capable of catalysing the production of canonical and / or non-canonical terpenes, terpenoids, isoprenoids or structures containing isoprenoid groups.
Item 15. The cell according to item 14 wherein the exogenous nucleic acid sequence enabling increased expression encodes a terpene synthase enzyme such as a monoterpene synthase, a sesquiterpene synthase, a diterpene synthase, a sesterterpene synthase or a triterpene synthase or a fragment thereof; or a prenyltransferase enzyme or a fragment thereof; or other enzymes or a fragment thereof capable of catalysing the production of canonical or non-canonical terpenes, terpenoids, isoprenoids or structures containing isoprenoid groups.
Item 16. The cell according to item 15 wherein the terpene synthase or the prenyl transferase or other enzyme is capable of using non-canonical terpenoid or isoprenoid building blocks as substrate.
Item 17. The cell according to either item 15 or 16 wherein the terpene synthase enzyme or fragment thereof or a prenyl transferase enzyme or fragment thereof or other enzymes or a fragment thereof capable of catalysing the production of canonical terpenes, terpenoids, isoprenoids or structures containing isoprenoid groups, comprises a change in the amino acid sequence that enables improved enzyme kinetics for utilisation of non-canonical terpenoid or isoprenoid building blocks.
Item 18. The cell according to any one of the preceding items, said cell being capable of production of a terpene or terpenoid selected from the group comprising:
Limonene, myrcene, alpha-pinene, sabinene, beta-pinene, 1,8-cineole, tricyclene, alpha-thujene, alpha-fenchene, camphene, delta-2-carene, alpha-phellandrene, 3-carene, 1,4-cineole, alpha-terpinene, beta-phellandrene, (Z)-beta-ocimene, (E)-beta-ocimene, gamma-terpinene, terpinolene, linalool, perillene, allo-ocimene, cis-beta-terpineol, cis-terpine-1-ol, isoborneol, delta-terpineol, borneol, chrysanthemol, lavandulol, alpha-terpineol, nerol, geraniol, alpha-humulene, beta-caryophyllene, valencene, amorpha-4,11-diene, alpha-patchoulene, alpha-santalane, beta-santalene, valerenol, obtusadiene, laurencenone A, taxadiene, miltiradiene, sclareol, casbene, cannabigerolic acid, grifolic acid, daurichromenic acid, confluentin, rhododaurichromenic acids A and B, anthopogocyclolic acid, anthopogochromenic acid, cannabiorcichromenic acid, cannabiorcicyclolic acid, cis-perrottetinene, (-)-cis-perrottetinenic acid, 7-ethyl-3-methylenenona-1,6-diene, 7-methyl-3-methylenenona-1,6-diene, 7,8-dimethyl-3-methylenenona-1,6-diene, 7-methyl-3-methylenedeca-1,6-diene, 1-methyl-4-(3-methylbut-1-en-2-yl)cyclohex-1-ene, 4-(but-1-en-2-yl)-1-methylcyclohex-1-ene, 1-methyl-4-(pent-1-en-2-yl)cyclohex-1-ene, 1-methyl-4-(pent-2-en-3-yl)cyclohex-1-ene, 2,4-dihydroxy-3-(3-methylpent-2-en-1-yl)-6-pentylbenzoic acid, 2,4-dihydroxy-3-(3-methylhex-2-en-1-yl)-6-pentylbenzoic acid, 3-(4-fluoro-3-methylbut-2-en-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid, 3-(7-ethyl-3-methylnona-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid, 3-(3,7-dimethylnona-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid, 2,4-dihydroxy-6-pentyl-3-(3,4,7-trimethylocta-2,6-dien-1-yl)benzoic acid, 2,4-dihydroxy-6-pentyl-3-(3,4,7-trimethylnona-2,6-dien-1-yl)benzoic acid, 3-(3-ethylpent-2-en-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid, 2,4-dihydroxy-6-pentyl-3-(3,7,8-trimethylnona-2,6-dien-1-yl)benzoic acid, 3-(3,7-dimethyldeca-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid, and 3-(8-fluoro-3,7-dimethylocta-2,6-dien-1-yl)-2,4-dihydroxy-6-pentylbenzoic acid.
Item 19. The cell according to any one of the preceding items wherein the eukaryotic cell is a yeast cell, such as a Saccharomyces spp., Saccharomyces cerevisiae, Schizosaccharomyces pombe, Scheffersomyces stipitis, Pichia pastoris, Hansenula polymorpha (syn. Ogataea parapolymorpha), Kluyveromyces marxianus, Yarrowia lipolytica, Kluyveromyces lactis, or Dekkera bruxellensis cell.
Item 20. The cell according to any one of items 1 - 18 wherein the eukaryotic cell is a filamentous fungi cell, such as a cell derived from Aspergillus niger, Aspergillus oryzae, Aspergillus terreus, Neurospora crassa, or Trichoderma reesei.
Item 21. The cell according to any one of items 1 - 18, wherein the eukaryotic cell is an algal cell, such as a microalgae cell such as Nannochloropsis gaditana, Nannochloropsis oceanica, Nannochloropsis salina, Chlamydomonas reinhardtii, Arthrospira, Chlorella vulgaris, Dunaliella salina, Haematococcus pluvialis, Phaeodactylum tricornutum, or Isochrysis galbana.
Item 22. A method for production of a terpene, terpenoid or an isoprenoid, said method comprising the steps of:
- providing an engineered eukaryotic cell comprising an exogenous DNA sequence coding for a primary alcohol kinase, and
- culturing said engineered cell in a medium containing a primary alcohol.
Item 23. The method according to item 22 wherein said cell further comprises an exogenous nucleic acid sequence coding for a phosphokinase.
Item 24. The method according to either of items 22 or 23 wherein the exogenous DNA sequence coding for a primary alcohol kinase encodes an alcohol kinase comprising SEQ ID NO: 2 or a homolog or variant thereof having at least 75% identity thereto.
Item 25. The method according to either of items 22 or 23, wherein the exogenous DNA sequence coding for a primary alcohol kinase encodes an alcohol kinase comprising SEQ ID NO: 1 or a homolog or variant thereof having at least 75% identity thereto.
Item 26. The method according to any one of items 22 - 25 wherein the primary alcohol is an alcohol with a structure according to item 11.
Item 27. The method according to item 26 wherein the primary alcohol is an alcohol of item 12.
Item 28. The method according to any one of items 22 - 27 wherein the primary alcohol is at an initial concentration within a range of 0.01% to 1% v/v, such as within a range of 0.05% to 0.6% v/v, such as within a range of 0.1% to 0.3% v/v, such as 0.1% v/v.
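As an illustration of the concentration ranges in item 28, the following minimal sketch converts a target initial concentration in % v/v into the volume of primary alcohol to add to a culture. The function names and the example culture volume are hypothetical and are not taken from the application. The sketch also assumes that the added alcohol volume is negligible relative to the culture volume, which is a reasonable approximation at or below 1% v/v.

```python
# Broadest range stated in item 28 (0.01% to 1% v/v); narrower ranges are also listed there.
BROADEST_RANGE_PERCENT_VV = (0.01, 1.0)


def alcohol_volume_ul(culture_volume_ml: float, percent_vv: float) -> float:
    """Return the alcohol volume in microlitres for a target % v/v.

    Assumption: the final volume is approximated by the culture volume.
    """
    if culture_volume_ml <= 0:
        raise ValueError("culture volume must be positive")
    if not 0.0 <= percent_vv <= 100.0:
        raise ValueError("percent v/v must lie between 0 and 100")
    # 1 mL = 1000 uL and 1% = 1/100, so the combined factor is 10 uL per mL per percent.
    return culture_volume_ml * 10.0 * percent_vv


def within_broadest_range(percent_vv: float) -> bool:
    """Check a concentration against the broadest range of item 28."""
    low, high = BROADEST_RANGE_PERCENT_VV
    return low <= percent_vv <= high


if __name__ == "__main__":
    # Hypothetical 50 mL shake-flask culture at 0.1% v/v.
    print(alcohol_volume_ul(50.0, 0.1), within_broadest_range(0.1))
```

For larger additions the final-volume approximation would need to be replaced by an exact dilution calculation.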
Claims
1. A genetically engineered eukaryotic cell for the production of a terpene or a terpenoid or an isoprenoid comprising a first nucleic acid sequence encoding a first kinase that phosphorylates a primary alcohol to a mono- or pyrophosphate terpenoid precursor; wherein the first kinase comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto.
2. The cell according to claim 1, wherein the cell further comprises a second nucleic acid sequence encoding a phosphokinase that phosphorylates a monophosphate precursor to a terpenoid pyrophosphate precursor.
3. The cell according to any one of the preceding claims wherein the first kinase is an alcohol kinase capable of phosphorylating a primary alcohol to a monophosphate terpenoid precursor.
4. The cell according to any one of the preceding claims, wherein said first nucleic acid sequence encodes a kinase that is capable of phosphorylating a non-canonical, prenol-like primary alcohol to a non-canonical monophosphate terpenoid precursor.
5. The cell according to any one of the preceding claims, wherein said first nucleic acid encodes a kinase that also has phosphokinase activity thus being capable of catalyzing the conversion of a primary alcohol to a terpenoid pyrophosphate precursor.
6. The cell according to any one of the claims 2 to 5, wherein the phosphokinase is a prenyl phosphate kinase, such as an isopentenyl phosphate kinase.
7. The cell according to any one of claims 2 to 6, wherein the phosphokinase is SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, or SEQ ID NO: 6 or a homologue or variant thereof having at least 75% identity thereto.
8. The cell according to any one of the preceding claims, wherein the primary alcohol is an alcohol with the structure of formula 1:
Formula 1:
wherein R1 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, a group containing a reactive nonmetal, a group containing a metalloid, a group containing a halogen, a group containing oxygen, a group containing nitrogen, a group containing sulfur, a group containing phosphorus and/or a group containing boron;
R2 is hydrogen, an alkane-, an alkene-, an alkyne-, a benzene derivative-, a cyclic group, a branched group, a group containing a reactive nonmetal, a group containing a metalloid, a group containing a halogen, a group containing oxygen, a group containing nitrogen, a group containing sulfur, a group containing phosphorus and/or a group containing boron; and
R3 is hydrogen, methyl, fluorine, chlorine, bromine, iodine, sulfhydryl, or hydroxyl.
9. The cell according to any one of the preceding claims, wherein the primary alcohol is prenol, isoprenol or a prenol-like alcohol.
10. The cell according to any one of the preceding claims, wherein the eukaryotic cell is a yeast cell, such as a Saccharomyces spp., Saccharomyces cerevisiae, Schizosaccharomyces pombe, Scheffersomyces stipitis, Pichia pastoris, Hansenula polymorpha (syn. Ogataea parapolymorpha), Kluyveromyces marxianus, Yarrowia lipolytica, Kluyveromyces lactis, or Dekkera bruxellensis cell.
11. The cell according to any one of claims 1 - 9 wherein the eukaryotic cell is a filamentous fungi cell, such as a cell derived from Aspergillus niger, Aspergillus oryzae, Aspergillus terreus, Neurospora crassa, or Trichoderma reesei.
12. The cell according to any one of claims 1 - 9, wherein the eukaryotic cell is an algal cell, such as a microalgae cell such as Nannochloropsis gaditana, Nannochloropsis oceanica, Nannochloropsis salina, Chlamydomonas reinhardtii, Arthrospira, Chlorella vulgaris, Dunaliella salina, Haematococcus pluvialis, Phaeodactylum tricornutum, or Isochrysis galbana.
13. A method for production of a terpene, a terpenoid or an isoprenoid, said method comprising the steps of:
- providing an engineered eukaryotic cell comprising an exogenous DNA sequence coding for a first kinase, wherein the first kinase comprises SEQ ID NO: 2 or a homologue or variant thereof having at least 75% identity thereto; and
- culturing said engineered cell in a medium containing a primary alcohol.
14. The method according to claim 13 wherein said cell further comprises an exogenous nucleic acid sequence coding for a phosphokinase.
15. The method according to any one of claims 13 - 14, wherein the primary alcohol is an alcohol with a structure according to claim 8.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21188844.1 | 2021-07-30 | ||
EP21188844 | 2021-07-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023006699A1 (en) | 2023-02-02 |
Family
ID=77431114
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2022/070854 WO2023006699A1 (en) | 2021-07-30 | 2022-07-26 | Cells and method for producing isoprenoid molecules with canonical and non-canonical structures |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2023006699A1 (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1996000787A1 (en) | 1994-06-30 | 1996-01-11 | Novo Nordisk Biotech, Inc. | Non-toxic, non-toxigenic, non-pathogenic fusarium expression system and promoters and terminators for use therein |
US6011147A (en) | 1986-04-30 | 2000-01-04 | Rohm Enzyme Finland Oy | Fungal promoters active in the presence of glucose |
WO2000056900A2 (en) | 1999-03-22 | 2000-09-28 | Novo Nordisk Biotech, Inc. | Promoter sequences derived from fusarium venenatum and uses thereof |
WO2017138986A1 (en) | 2016-02-09 | 2017-08-17 | Cibus Us Llc | Methods and compositions for increasing efficiency of targeted gene modification using oligonucleotide-mediated gene repair |
US20180334692A1 (en) * | 2017-05-10 | 2018-11-22 | Baymedica, Inc. | Recombinant production systems for prenylated polyketides of the cannabinoid family |
WO2019232025A2 (en) * | 2018-05-29 | 2019-12-05 | Massachusetts Institute Of Technology | Microbial engineering for the production of isoprenoids |
WO2020150340A1 (en) * | 2019-01-15 | 2020-07-23 | North Carolina State University | Isoprenoids and methods of making thereof |
WO2020160289A1 (en) * | 2019-01-30 | 2020-08-06 | Genomatica, Inc. | Engineered cells for improved production of cannabinoids |
Non-Patent Citations (24)
Title |
---|
"GenBank", Database accession no. AF543530.1 |
"NCBI", Database accession no. NM_102426.6 |
"Peptide and Protein Drug Delivery", 1991, MARCEL DEKKER, INC., pages: 247 - 301 |
"UniProt", Database accession no. AOA0971YL3 |
"Uniprot", Database accession no. AOA455ZJC3 |
A. HEATHER FITZPATRICK ET AL: "Farnesol kinase is involved in farnesol metabolism, ABA signaling and flower development in Arabidopsis : Farnesol kinase in Arabidopsis", THE PLANT JOURNAL, vol. 66, no. 6, 12 April 2011 (2011-04-12), GB, pages 1078 - 1088, XP055716443, ISSN: 0960-7412, DOI: 10.1111/j.1365-313X.2011.04572.x * |
ALTSCHUL ET AL., J. MOL. BIOL, vol. 215, 1990, pages 403 - 10 |
ALTSCHUL ET AL., NUCL. ACIDS RES., vol. 25, 1997, pages 3389 |
ALTSCHUL ET AL., NUCLEIC ACIDS RES., vol. 25, 1997, pages 3389 - 3402 |
CHATZIVASILEIOU ET AL., PNAS, 2019 |
E. MEYERS, W. MILLER, CABIOS, vol. 4, 1989, pages 11 - 17 |
GUO, SHERMAN, MOL. CELLULAR BIOL., vol. 15, 1995, pages 5983 - 5990 |
HANSEN, N.L., MIETTINEN, K., ZHAO, Y. ET AL.: "Integrating pathway elucidation with yeast engineering to produce polpunonic acid the precursor of the anti-obesity agent celastrol", MICROB CELL FACT, vol. 19, 2020, pages 15, Retrieved from the Internet <URL:https://doi.org/10.1186/s12934-020-1284-9> |
HOLME ET AL., PLANT MOL BIOL, vol. 95, 2017, pages 111 - 121 |
JOHNSON ET AL., ANGEW CHEM INT ED ENGL., 2020 |
JONES, A. ADV. DRUG DELIVERY REV., vol. 10, 1993, pages 29 - 90 |
LAWRENSON ET AL., GENOME BIOLOGY, vol. 16, 2015, pages 258 |
LUND ET AL., ACS SYNTH. BIOL., 2019 |
LUO ZHENGSHAN ET AL: "Enhancing isoprenoid synthesis in Yarrowia lipolytica by expressing the isopentenol utilization pathway and modulating intracellular hydrophobicity", METABOLIC ENGINEERING, vol. 61, 1 September 2020 (2020-09-01), AMSTERDAM, NL, pages 344 - 351, XP055884330, ISSN: 1096-7176, DOI: 10.1016/j.ymben.2020.07.010 * |
MORRISON, S., SCIENCE, vol. 229, 1985, pages 1202 |
NEEDLEMAN, WUNSCH, J. MOL. BIOL., vol. 48, 1970, pages 1994 - 1998 |
ROMANOS ET AL., YEAST, vol. 8, 1992, pages 423 - 488 |
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS |
SUN ET AL., MOLECULAR PLANT, vol. 9, 2016, pages 628 - 631 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22754416; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |