CN115038786A - Optimized tetrahydrocannabinolic acid (THCA) synthase polypeptides - Google Patents

Publication number
CN115038786A
Authority
CN
China
Prior art keywords
seq
polypeptide
host cell
modified host
amino acid
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202080078001.1A
Other languages
Chinese (zh)
Inventor
A·霍维茨
J·王
D·普拉特
J·尤伯萨克斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Demetrix Inc
Original Assignee
Demetrix Inc
Application filed by Demetrix Inc

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/415 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/52 Genes encoding for enzymes or proenzymes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P17/00 Preparation of heterocyclic carbon compounds with only O, N, S, Se or Te as ring hetero atoms
    • C12P17/02 Oxygen as only ring hetero atoms
    • C12P17/06 Oxygen as only ring hetero atoms containing a six-membered hetero ring, e.g. fluorescein
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00 Preparation of oxygen-containing organic compounds
    • C12P7/40 Preparation of oxygen-containing organic compounds containing a carboxyl group including Peroxycarboxylic acids
    • C12P7/42 Hydroxy-carboxylic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y121/00 Oxidoreductases acting on X-H and Y-H to form an X-Y bond (1.21)
    • C12Y121/03 Oxidoreductases acting on X-H and Y-H to form an X-Y bond (1.21) with oxygen as acceptor (1.21.3)
    • C12Y121/03007 Tetrahydrocannabinolic acid synthase (1.21.3.7)

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Biomedical Technology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Botany (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Enzymes And Modification Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)

Abstract

The present disclosure provides engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides comprising an amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions, nucleic acids comprising nucleotide sequences encoding the engineered variants, methods of making modified host cells comprising the nucleic acids, modified host cells expressing the engineered variants, methods of producing cannabinoids or cannabinoid derivatives, and methods of screening for engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides.

Description

Optimized tetrahydrocannabinolic acid (THCA) synthase polypeptides
Cross Reference to Related Applications
This application claims priority to U.S. patent application Serial No. 62/902,300, filed on September 18, 2019, the disclosure of which is incorporated herein by reference in its entirety.
Sequence Listing Statement
A sequence listing text (.txt) file is filed concurrently under 37 CFR 1.821(c) and is hereby incorporated by reference in its entirety. The file details required under 37 CFR 1.52(e)(5) and 37 CFR 1.77(b)(5) are as follows: the file name is DEMT_007_01WO_SeqList_ST25.txt; the creation date is September 17, 2020; the size is 697176 bytes. The content of the sequence listing information recorded in computer-readable form is the same as the written sequence listing (if any), is the same as the sequence information provided in the application as originally filed, and contains no new matter. The information recorded in electronic form (if any) filed in connection with the present application is the same as the sequence listing contained in the application as filed, according to Rule 13ter.
Background
Cannabis has been used by humans for thousands of years due to its medicinal properties. In modern times, the bioactive effects of cannabis are attributed to a class of compounds known as "cannabinoids", of which there are hundreds of structural analogs, including Tetrahydrocannabinol (THC) and Cannabidiol (CBD). It has recently been discovered that molecules and formulations of these cannabinoids are useful as therapeutic agents for chronic pain, multiple sclerosis, cancer-related nausea and vomiting, weight loss, anorexia, spasticity, seizures and other conditions.
The physiological effects of certain cannabinoids are thought to be mediated by their interaction with two cellular receptors found in humans and other animals. Cannabinoid receptor type 1 (CB1) is common in the brain, reproductive system and eye. Cannabinoid receptor type 2 (CB2) is common in the immune system and mediates therapeutic effects associated with inflammation in animal models. The discovery of cannabinoid receptors and their interaction with plant-derived cannabinoids precedes the identification of endogenous ligands.
In addition to THC and CBD, hundreds of other cannabinoids have also been identified in cannabis. However, many of these compounds are present at low levels alongside more abundant cannabinoids, making it difficult to obtain pure samples from plants to investigate their therapeutic potential. Similarly, methods for chemically synthesizing these types of products are cumbersome and expensive, and tend to give inadequate yields. Thus, there is a need for additional processes for the preparation of pure cannabinoids or cannabinoid derivatives.
One possible approach is fermentation of engineered microorganisms such as yeast. By engineering the production of relevant plant enzymes in microorganisms, the conversion of various raw materials into a range of cannabinoids can be achieved, possibly at much lower cost and with much higher purity than is available from plants. A key challenge of this effort is the difficulty of expressing plant enzymes in microorganisms, particularly secreted enzymes such as cannabinoid synthases, which must successfully traverse the secretory pathway of the microorganism in order to fold and function properly. Engineered variants of cannabinoid synthases, modified host cells, and novel methods are needed to address these challenges.
Disclosure of Invention
The present disclosure provides engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides comprising an amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions, nucleic acids comprising nucleotide sequences encoding the engineered variants, methods of making modified host cells comprising the nucleic acids, modified host cells for producing cannabinoids or cannabinoid derivatives, methods of producing cannabinoids or cannabinoid derivatives, and methods of screening for engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides. Engineered variants of the present disclosure are useful for producing cannabinoids or cannabinoid derivatives (e.g., non-naturally occurring cannabinoids). The modified host cells of the present disclosure can be used to produce cannabinoids or cannabinoid derivatives (e.g., non-naturally occurring cannabinoids) and/or to express engineered variants of the present disclosure. The disclosure also provides modified host cells for expressing the engineered variants of the disclosure. In addition, the present disclosure provides for the preparation of engineered variants of the present disclosure.
One aspect of the disclosure relates to engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides comprising the amino acid sequence of SEQ ID NO:44 with one or more amino acid substitutions. In some embodiments, the engineered variant comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID No. 44. In some embodiments, the engineered variant comprises at least one amino acid substitution in the signal polypeptide, the Flavin Adenine Dinucleotide (FAD) binding domain, the Berberine Bridge Enzyme (BBE) domain, or a combination of the foregoing. In some embodiments, the engineered variant comprises a substitution of at least one surface exposed amino acid.
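By way of a non-limiting illustration, percent sequence identity between a candidate variant and a reference sequence can be computed from a pairwise alignment. The Python sketch below assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it assumes one common convention in which identical columns are divided by the total number of alignment columns; other conventions (for example, dividing by the length of the reference) can give different values. The function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_query: str, aligned_ref: str, gap: str = "-") -> float:
    """Percent identity over all alignment columns (one of several conventions)."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both residues agree and neither is a gap.
    matches = sum(
        1 for q, r in zip(aligned_query, aligned_ref)
        if q == r and q != gap
    )
    return 100.0 * matches / len(aligned_ref)


# Toy 20-residue alignment with three mismatches (hypothetical sequences).
ref = "MNCSAFSFWFVCKIIFFFLS"
var = "MNCSAFAFWFVCRIIFFFLT"
identity = percent_identity(var, ref)
meets_85 = identity >= 85.0  # compare against the lowest threshold recited above
```

In practice the alignment itself would come from an established alignment tool; only the counting step is sketched here.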
In some embodiments, the engineered variant comprises at least one amino acid substitution at an amino acid selected from the group consisting of: R31, P43, P49, K50, L51, Q55, H56, L59, M61, S62, L71, S100, V103, T109, Q124, V125, L132, S137, H143, V149, W161, K165, E167, N168, S170, F171, P172, Y175, G180, N196, H208, G235, A250, I257, K261, L269, G311, F317, L327, T379, K390, S429, N467, Y500, N528, P539, P542, H543, H544, and H545. In some embodiments, the engineered variant comprises at least one amino acid substitution selected from the group consisting of: R31, P43, P49, K50, L51, Q55, H56, L59, M61, S62, L71, S100, V103, T109, Q124, V125, L132, S137, H143, W161, K165, E167, N168, Y175, G180, N196, H208, A250, I257, K261, G311, F317, L327, T379, K390, Y500, N528, P542, H543, H544, and H545.
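As a hedged illustration of how position-based substitutions can be applied to a parent sequence, the Python sketch below parses labels of the form original residue, 1-based position, replacement residue (for example "R5Q") and checks that the parent actually carries the stated original residue before substituting. The text above identifies positions only, so the replacement residues, the toy parent sequence, and the function name are all hypothetical.

```python
import re

# Label format assumed here: original residue, 1-based position, new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, substitutions: list[str]) -> str:
    """Return a copy of `sequence` with each labeled substitution applied."""
    residues = list(sequence)
    for label in substitutions:
        match = SUBSTITUTION.match(label)
        if match is None:
            raise ValueError(f"unrecognized substitution: {label}")
        original, replacement = match.group(1), match.group(3)
        index = int(match.group(2)) - 1  # convert to 0-based index
        if not 0 <= index < len(residues):
            raise ValueError(f"position out of range: {label}")
        # Guard against numbering errors by checking the parent residue.
        if sequence[index] != original:
            raise ValueError(
                f"{label}: expected {original}, found {sequence[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


parent = "MKTARPLQH"  # toy parent sequence, not SEQ ID NO:44
variant = apply_substitutions(parent, ["R5Q", "H9K"])
```

Checking the parent residue before substituting is a deliberate design choice: it catches off-by-one numbering mistakes, which are easy to make when a signal sequence is present or absent.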
In some embodiments, the engineered variant comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, and 186.
In some embodiments, the engineered variant comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 50, 52, 54, 56, 58, 60, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 124, 126, 128, 130, 132, 134, 138, 140, 142, 144, 146, 148, 152, 156, 158, 160, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, and 186.
In some embodiments, the engineered variant comprises the amino acid sequence of SEQ ID No. 44 having at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 amino acid substitutions. In some embodiments, the engineered variant comprises the amino acid sequence of SEQ ID No. 44 having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acid substitutions.
In some embodiments, the engineered variant comprises at least one invariant amino acid in a Flavin Adenine Dinucleotide (FAD) binding domain, a Berberine Bridge Enzyme (BBE) domain, or a combination of the foregoing. In some embodiments, the engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, or at least 15 invariant amino acids in the FAD binding domain. In some embodiments, the engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, or at least 15 invariant amino acids in the BBE domain.
In some embodiments, the engineered variant comprises at least one invariant amino acid selected from the group consisting of: A28, F34, L35, C37, L64, N70, P87, I93, C99, R108, R110, G112, E117, G118, S120, P126, F127, D131, D141, W148, G152, A153, L155, G156, E157, Y159, Y160, N163, A173, G174, C176, P177, T178, V179, G182, G183, H184, F185, G187, G188, G189, Y190, G191, P192, L193, R195, A201, D202, I205, D206, V210, G214, G223, D225, L226, F227, W228, R231, G234, S237, F238, G239, G245, I246, L245, V251, V385, V312, V259, Q313, F312, S341, S354, [...]. In some embodiments, the engineered variant comprises at least one invariant amino acid selected from the group consisting of: C37, N70, I93, C99, E117, S120, F127, D131, G156, E157, Y159, G174, C176, G182, G183, F185, G187, G188, G189, Y190, G191, P192, R195, D202, D206, G214, W228, G234, F238, and L248.
In some embodiments, the engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, or at least 25 invariant amino acids.
In some embodiments, the amount of tetrahydrocannabinolic acid (THCA), in mg/L or mM, produced by the engineered variant from cannabigerolic acid (CBGA) is greater than the amount of THCA produced from CBGA by a tetrahydrocannabinolic acid synthase (THCAS) polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions. In some embodiments, the amount of THCA, in mg/L or mM, produced from CBGA by the engineered variant is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of THCA produced from CBGA by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions.
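The percentage comparison recited above can be sketched as a simple calculation. The Python example below assumes that both titers are expressed in the same unit (mg/L or mM) and were measured over the same length of time under similar conditions; the function name and the numeric titers are hypothetical and do not represent measured data.

```python
def percent_increase(variant_titer: float, reference_titer: float) -> float:
    """Percent by which the variant titer exceeds the reference titer."""
    if reference_titer <= 0:
        raise ValueError("reference titer must be positive")
    return 100.0 * (variant_titer - reference_titer) / reference_titer


# Thresholds recited in the paragraph above, in percent.
THRESHOLDS = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90,
              100, 150, 200, 500, 1000]

# Hypothetical THCA titers in mg/L (illustrative values only).
increase = percent_increase(variant_titer=30.0, reference_titer=20.0)
thresholds_met = [t for t in THRESHOLDS if increase >= t]
```

With these illustrative numbers the variant exceeds the reference by 50%, so every threshold up to and including "at least 50%" is satisfied.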
In some embodiments, the engineered variant produces THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) compared to the ratio produced by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions. In some embodiments, the engineered variant produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
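A product ratio of this kind can be expressed as the quotient of two measured amounts. The Python sketch below assumes that THCA and the other cannabinoid (here CBCA) are quantified in the same unit from the same sample; the function names and the amounts are hypothetical and are shown only to make the "N:1" notation concrete.

```python
def product_ratio(thca_amount: float, other_amount: float) -> float:
    """Ratio of THCA to another cannabinoid, both in the same unit."""
    if other_amount <= 0:
        raise ValueError("amount of the other cannabinoid must be positive")
    return thca_amount / other_amount


def format_ratio(ratio: float) -> str:
    """Render a ratio in the 'N:1' form used in the text."""
    return f"{ratio:g}:1"


# Hypothetical amounts in mg/L (illustrative values only).
ratio = product_ratio(thca_amount=24.0, other_amount=1.5)
label = format_ratio(ratio)
```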
In some embodiments, the engineered variant comprises a truncation at the N-terminus, C-terminus, or at both the N-terminus and C-terminus. In some embodiments, the truncated engineered variant comprises a signal polypeptide or a membrane anchor. In some embodiments, the engineered variant lacks a native signal polypeptide. In some embodiments, the engineered variant comprises a truncation of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 amino acids at the C-terminus. In some embodiments, the engineered variant comprises a truncation of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids at the C-terminus.
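For illustration only, a terminal truncation can be modeled as removal of a fixed number of residues from one or both ends of a sequence. The Python sketch below uses a hypothetical function name and a toy sequence; it does not attempt to identify a signal polypeptide and does not re-add an initiator methionine after an N-terminal truncation.

```python
def truncate(sequence: str, n_term: int = 0, c_term: int = 0) -> str:
    """Remove `n_term` residues from the start and `c_term` from the end."""
    if n_term < 0 or c_term < 0:
        raise ValueError("truncation lengths must be non-negative")
    if n_term + c_term >= len(sequence):
        raise ValueError("truncation would remove the entire sequence")
    return sequence[n_term:len(sequence) - c_term]


toy = "MKTARPLQHHH"  # toy sequence, not SEQ ID NO:44
c_truncated = truncate(toy, c_term=3)               # drop 3 C-terminal residues
both_truncated = truncate(toy, n_term=2, c_term=3)  # drop from both ends
```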
Another aspect of the present disclosure relates to nucleic acids comprising nucleotide sequences encoding the engineered variants of the disclosure. In some embodiments, a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 49, 51, 53, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 161, 163, 165, 167, 171, 173, 175, 177, 179, 181, 183, and 185. In some embodiments, a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 49, 51, 53, 73, 75, 77, 79, 81, 83, 85, 87, 91, 93, 95, 97, 99, 101, 103, 105, 107, 111, 113, 115, 117, 119, 123, 125, 127, 129, 131, 133, 137, 139, 141, 143, 145, 147, 151, 155, 157, 159, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, and 185. In some embodiments of the nucleic acids of the present disclosure, the nucleotide sequence is codon optimized.
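As a minimal sketch of the idea behind codon optimization, a protein sequence can be back-translated by choosing one codon per amino acid from a host-specific preference table. The Python example below is an assumption-laden simplification: the table assigns a single valid standard-genetic-code codon to each residue and is a placeholder rather than a validated codon-usage table for any host, and practical codon optimization typically also considers factors such as GC content and repeated motifs. All names are hypothetical.

```python
# Illustrative one-codon-per-residue table (standard genetic code).
# Assumption: placeholder choices, not a validated codon-usage table.
PREFERRED_CODON = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATT",
    "L": "TTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAC", "V": "GTT",
}
STOP_CODON = "TAA"


def back_translate(protein: str) -> str:
    """Encode a protein with one fixed codon per residue plus a stop codon."""
    try:
        coding = "".join(PREFERRED_CODON[aa] for aa in protein)
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None
    return coding + STOP_CODON


def translate(dna: str) -> str:
    """Round-trip check: decode DNA produced by back_translate."""
    if len(dna) % 3 != 0 or not dna.endswith(STOP_CODON):
        raise ValueError("expected whole codons ending in the stop codon")
    decode = {codon: aa for aa, codon in PREFERRED_CODON.items()}
    codons = [dna[i:i + 3] for i in range(0, len(dna) - 3, 3)]
    return "".join(decode[codon] for codon in codons)


protein = "MKTAR"  # toy peptide, not from the sequence listing
dna = back_translate(protein)
round_trip_ok = translate(dna) == protein
```

The round-trip check confirms that the encoded nucleotide sequence still specifies the same amino acid sequence, which is the property codon optimization is meant to preserve.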
One aspect of the disclosure relates to a method of making a modified host cell for producing a cannabinoid or cannabinoid derivative comprising introducing into a host cell one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure.
Another aspect of the present disclosure relates to vectors comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure.
One aspect of the present disclosure relates to a method of making a modified host cell for the production of a cannabinoid or a cannabinoid derivative, the method comprising introducing into a host cell one or more vectors comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure.
Another aspect of the disclosure relates to a modified host cell for the production of a cannabinoid or a cannabinoid derivative, wherein the modified host cell comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure.
In some embodiments of the disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a geranyl pyrophosphate:olivetolic acid geranyltransferase (GOT) polypeptide. In certain such embodiments, the GOT polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:17. In some embodiments, the modified host cell comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide.
In some embodiments of the disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide. In certain such embodiments, the NphB polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID No. 188.
In some embodiments of the disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tetraketide synthase (TKS) polypeptide and one or more heterologous nucleic acids comprising a nucleotide sequence encoding an olivetolic acid cyclase (OAC) polypeptide. In certain such embodiments, the TKS polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:19. In some embodiments, the modified host cell comprises three or more heterologous nucleic acids comprising nucleotide sequences encoding TKS polypeptides. In some embodiments, the OAC polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:21 or SEQ ID NO:48. In some embodiments, the modified host cell comprises three or more heterologous nucleic acids comprising nucleotide sequences encoding an OAC polypeptide.
In some embodiments of the disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an Acyl Activating Enzyme (AAE) polypeptide. In certain such embodiments, the AAE polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID No. 23. In some embodiments, the modified host cell comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide.
In some embodiments of the disclosure, the modified host cell comprises one or more of: a) one or more heterologous nucleic acids comprising a nucleotide sequence encoding an HMG-CoA synthase (HMGS) polypeptide; b) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a truncated 3-hydroxy-3-methyl-glutaryl-CoA reductase (tHMGR) polypeptide; c) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a mevalonate kinase (MK) polypeptide; d) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a phosphomevalonate kinase (PMK) polypeptide; e) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a mevalonate pyrophosphate decarboxylase (MVD1) polypeptide; or f) one or more heterologous nucleic acids comprising a nucleotide sequence encoding an isopentenyl diphosphate isomerase (IDI1) polypeptide. In some embodiments, the IDI1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:25. In some embodiments, the tHMGR polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:27. In some embodiments, the HMGS polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:29. In some embodiments, the MK polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:39. In some embodiments, the PMK polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:37. In some embodiments, the MVD1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:33.
In some embodiments of the disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide. In certain such embodiments, the acetoacetyl-CoA thiolase polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:31.
In some embodiments of the disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a Pyruvate Decarboxylase (PDC) polypeptide. In certain such embodiments, the PDC polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID No. 35.
In some embodiments of the disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a geranyl pyrophosphate synthase (GPPS) polypeptide. In certain such embodiments, the GPPS polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:41.
In some embodiments of the disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide. In certain such embodiments, the KAR2 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID No. 5. In some embodiments, the modified host cell comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
In some embodiments of the disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide. In certain such embodiments, the PDI1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID No. 9.
In some embodiments of the disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IRE1 polypeptide. In certain such embodiments, the IRE1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID No. 11 or SEQ ID No. 190.
In some embodiments of the present disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide. In certain such embodiments, the ERO1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID No. 7.
In some embodiments of the present disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a FAD1 polypeptide. In certain such embodiments, the FAD1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO: 192.
In some embodiments of the disclosure, the modified host cell comprises a deletion or down-regulation of one or more genes encoding a PEP4 polypeptide. In certain such embodiments, the PEP4 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:15.
In some embodiments of the disclosure, the modified host cell comprises a deletion or down-regulation of one or more genes encoding a ROT2 polypeptide. In certain such embodiments, the ROT2 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID No. 13.
In some embodiments of the disclosure, the modified host cell is a eukaryotic cell. In certain such embodiments, the eukaryotic cell is a yeast cell. In certain such embodiments, the yeast cell is Saccharomyces cerevisiae. In certain such embodiments, the Saccharomyces cerevisiae is a protease-deficient strain of Saccharomyces cerevisiae.
In some embodiments of the disclosure, at least one of the one or more nucleic acids is integrated into the chromosome of the modified host cell. In some embodiments of the disclosure, at least one of the one or more nucleic acids is maintained extrachromosomally. In some embodiments of the disclosure, at least one of the one or more nucleic acids is operably linked to an inducible promoter. In some embodiments of the disclosure, at least one of the one or more nucleic acids is operably linked to a constitutive promoter.
In some embodiments of the disclosure, the amount of cannabinoid or cannabinoid derivative produced by the modified host cell, in mg/L or mM, is greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown for the same length of time under similar culture conditions, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure.
In some embodiments of the disclosure, the amount of cannabinoid or cannabinoid derivative produced by the modified host cell, in mg/L or mM, is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown for the same length of time under similar culture conditions, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure.
In some embodiments of the present disclosure, the modified host cell has a faster growth rate and/or higher biomass yield than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown for the same length of time under similar culture conditions, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure.
In some embodiments of the present disclosure, the modified host cell has a growth rate and/or biomass yield that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% faster and/or higher than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown for the same length of time under similar culture conditions, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure.
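The percentage comparisons recited above (for titer, growth rate, or biomass yield) reduce to a relative-increase calculation against the SEQ ID NO:44 reference strain. The following is a minimal sketch of that arithmetic; the function names and the example titers are hypothetical, and both values are assumed to be measured in the same units after the same culture time under similar conditions.

```python
# Percent thresholds recited in the text above.
THRESHOLDS = (5, 10, 15, 20, 25, 30, 35, 40, 45, 50,
              60, 70, 80, 90, 100, 150, 200, 500, 1000)


def percent_increase(test_value: float, reference_value: float) -> float:
    """Relative increase of the test strain over the reference strain, in percent."""
    if reference_value <= 0:
        raise ValueError("reference value must be positive")
    return 100.0 * (test_value - reference_value) / reference_value


def highest_threshold_met(increase_percent: float):
    """Largest recited 'at least N%' threshold satisfied, or None if none is met."""
    met = [t for t in THRESHOLDS if increase_percent >= t]
    return max(met) if met else None


# Hypothetical titers in mg/L; both strains use the same units.
increase = percent_increase(test_value=150.0, reference_value=100.0)
```

Note that "at least 100% greater" corresponds to a doubling of the reference value, and "at least 1000% greater" to an eleven-fold value.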
In some embodiments of the disclosure, the modified host cell produces THCA from cannabigerolic acid (CBGA) at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to the ratio of THCA to the other cannabinoid produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 grown for the same length of time under similar culture conditions, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 is devoid of nucleic acids comprising nucleotide sequences encoding the engineered variants of the disclosure.
In some embodiments of the disclosure, the modified host cell produces THCA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
Another aspect of the present disclosure relates to a method of producing a cannabinoid or cannabinoid derivative, the method comprising: a) culturing the modified host cell of the present disclosure in a culture medium. In certain such embodiments, the method comprises: b) recovering the produced cannabinoid or cannabinoid derivative. In some embodiments, the medium comprises a carboxylic acid. In certain such embodiments, the carboxylic acid is an unsubstituted or substituted C3-C18 carboxylic acid. In certain such embodiments, the unsubstituted or substituted C3-C18 carboxylic acid is an unsubstituted or substituted hexanoic acid. In some embodiments, the medium comprises olivetolic acid or an olivetolic acid derivative. In some embodiments, the cannabinoid is cannabidiolic acid, cannabidiol, cannabidivarinic acid, or cannabidivarin. In some embodiments, the medium comprises fermentable sugars. In some embodiments, the culture medium comprises a pretreated cellulosic feedstock. In some embodiments, the medium comprises a non-fermentable carbon source. In certain such embodiments, the non-fermentable carbon source comprises ethanol. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100 mg/L of medium.
In some embodiments of the methods of the present disclosure, the amount of the cannabinoid or the cannabinoid derivative produced, in mg/L or mM, is greater than the amount of the cannabinoid or the cannabinoid derivative produced in an alternative method comprising culturing, in place of the modified host cell of the present disclosure, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the present disclosure, and wherein the modified host cell of the present disclosure and the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the present disclosure, are cultured under similar culture conditions for the same length of time.
In some embodiments of the methods of the present disclosure, the amount of the cannabinoid or the cannabinoid derivative produced, in mg/L or mM, is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of the cannabinoid or the cannabinoid derivative produced in an alternative method comprising culturing, in place of the modified host cell of the present disclosure, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the present disclosure, and wherein the modified host cell of the present disclosure and the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the present disclosure, are cultured under similar culture conditions for the same length of time.
In some embodiments of the methods of the present disclosure, the cannabinoid is tetrahydrocannabinolic acid (THCA), and the method produces THCA at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to the ratio of THCA to the other cannabinoid produced in an alternative method comprising culturing, in place of the modified host cell of the present disclosure, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown for the same length of time under similar culture conditions, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the present disclosure.
One aspect of the present disclosure relates to a method of producing a cannabinoid or cannabinoid derivative comprising using an engineered variant of the present disclosure. In certain such embodiments, the method comprises recovering the produced cannabinoid or cannabinoid derivative. In some embodiments of the methods of the present disclosure, the cannabinoid is tetrahydrocannabinolic acid or tetrahydrocannabinol.
In some embodiments of the methods of the present disclosure, the amount of the cannabinoid or the cannabinoid derivative produced is greater than the amount of the cannabinoid or the cannabinoid derivative produced in an alternative method comprising using a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, instead of the engineered variant of the present disclosure, in mg/L or mM, wherein the engineered variant of the present disclosure and the tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 are used under similar conditions for the same length of time.
In some embodiments of the methods of the present disclosure, the amount of the cannabinoid or the cannabinoid derivative produced, in mg/L or mM, is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of the cannabinoid or the cannabinoid derivative produced in an alternative method comprising using a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 instead of the engineered variant of the present disclosure, wherein the engineered variant of the present disclosure and the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 are used under similar conditions for the same length of time.
In some embodiments of the methods of the present disclosure, the cannabinoid is tetrahydrocannabinolic acid (THCA), and wherein the method produces THCA at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to a ratio of THCA to the other cannabinoid produced in an alternative method comprising using a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, rather than an engineered variant of the present disclosure, wherein the engineered variant of the present disclosure and the tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 are used under similar conditions for the same length of time.
In some embodiments of the methods of the present disclosure, the method produces THCA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
Another aspect of the disclosure relates to a method of screening for an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide comprising an amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions, the method comprising: a) dividing a population of host cells into a control population and a test population; b) co-expressing in the control population a THCAS polypeptide having the amino acid sequence of SEQ ID NO:44 and a comparative cannabinoid synthase polypeptide, wherein the THCAS polypeptide having the amino acid sequence of SEQ ID NO:44 can convert cannabigerolic acid (CBGA) to a first cannabinoid, tetrahydrocannabinolic acid (THCA), and the comparative cannabinoid synthase polypeptide can convert the same CBGA to a different second cannabinoid; c) co-expressing the engineered variant and the comparative cannabinoid synthase polypeptide in the test population, wherein the engineered variant can convert CBGA to the same first cannabinoid, tetrahydrocannabinolic acid (THCA), as the THCAS polypeptide having the amino acid sequence of SEQ ID NO:44, and wherein the comparative cannabinoid synthase polypeptide can convert the same CBGA to the second cannabinoid and is expressed at similar levels in the test population and the control population; d) measuring a ratio of the first cannabinoid, tetrahydrocannabinolic acid (THCA), to the second cannabinoid produced by both the test population and the control population; and e) measuring the amount of the first cannabinoid produced by both the test population and the control population in mg/L or mM.
In certain such embodiments, the test population is identified as comprising an engineered variant having improved in vivo performance as compared to a tetrahydrocannabinolic acid synthase (THCAS) polypeptide having the amino acid sequence of SEQ ID NO:44, wherein improved in vivo performance is evidenced by an increased ratio of the first cannabinoid to the second cannabinoid produced by the test population as compared to the ratio of the first cannabinoid to the second cannabinoid produced by the control population over the same length of time under similar culture conditions. In some embodiments of the methods of screening for engineered variants of a THCAS polypeptide, the test population is identified as comprising an engineered variant having improved in vivo performance as compared to a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:3, wherein improved in vivo performance is evidenced by the test population producing a greater amount of the first cannabinoid, in mg/L or mM, than the control population over the same length of time under similar culture conditions.
In some embodiments of the methods of screening for engineered variants of a THCAS polypeptide, the comparative cannabinoid synthase polypeptide is a cannabidiolic acid synthase polypeptide. In certain such embodiments, the cannabidiolic acid synthase (CBDAS) polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:3. In some embodiments of the methods of screening for engineered variants of a THCAS polypeptide, the second cannabinoid is cannabidiolic acid (CBDA).
In some embodiments of the methods of screening for engineered variants of a THCAS polypeptide, the engineered variants are engineered variants of the present disclosure.
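The comparison made in steps d) and e) of the screening method can be sketched as follows. This is a minimal illustration, not the disclosed assay: the function names and the titer values are hypothetical, and it assumes that titers for both populations are reported in the same units after the same culture time under similar conditions.

```python
def cannabinoid_ratio(first_titer: float, second_titer: float) -> float:
    """Ratio of the first cannabinoid (e.g., THCA) to the second (e.g., CBDA)."""
    if second_titer <= 0:
        raise ValueError("second cannabinoid titer must be positive")
    return first_titer / second_titer


def evaluate_variant(test_first: float, test_second: float,
                     control_first: float, control_second: float) -> dict:
    """Compare a test population (engineered variant) with the control (SEQ ID NO:44)."""
    return {
        # Step d): is the first-to-second cannabinoid ratio higher in the test population?
        "ratio_increased": cannabinoid_ratio(test_first, test_second)
        > cannabinoid_ratio(control_first, control_second),
        # Step e): is the titer of the first cannabinoid higher in the test population?
        "titer_increased": test_first > control_first,
    }


# Hypothetical titers in mg/L.
result = evaluate_variant(test_first=66.0, test_second=22.0,
                          control_first=40.0, control_second=20.0)
```

Keeping the two criteria separate reflects the text above, in which either an increased ratio or an increased amount can serve as evidence of improved in vivo performance.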
Drawings
FIGS. 1A, 1B, and 1C depict expression constructs used to produce the S29 strain. The expression constructs depicted in FIGS. 1A, 1B, and 1C were also used to generate the following strains: S61, S122, S171, S181, S220, S241, S270, S487, S951, S1000-S1059, S1072-S1079, and S1081. In all figures, the construct maps depict the regulatory, non-coding, and genomic cassette sequences described in Table 5, in addition to the designated coding sequences in Table 1. The construct maps also depict genes denoted with a preceding "m" (e.g., mERG13), which specifies a 200- to 250-base pair (bp) open reading frame with downstream regulatory (terminator) sequences from Table 1. Arrows in the construct maps indicate the directionality of certain DNA parts. A "!" preceding the name of a part is an output of the DNA design software used; it is redundant with the arrow directionality and can be ignored.
FIG. 2 depicts the expression constructs used to produce the S181 strain. The expression constructs depicted in FIG. 2 were also used to generate the following strains: S220, S241, S270, S487, S951, S1000-S1059, S1072-S1079, and S1081.
FIG. 3 depicts the expression constructs used to produce the S220 strain. The expression constructs depicted in FIG. 3 were also used to generate the following strains: S241, S270, S487, S951, S1000-S1059, S1072-S1079, and S1081.
FIG. 4 depicts the expression constructs used to produce the S241 strain. The expression constructs depicted in FIG. 4 were also used to generate the following strains: S270, S487, S951, S1000-S1059, S1072-S1079, and S1081.
FIG. 5 depicts the landing pad construct used to generate the S61 strain. The construct depicted in FIG. 5 was also used to generate the following strains: S122, S171, S181, S220, S241, S270, S487, S951, S1000-S1059, S1072-S1079, and S1081.
FIG. 6 depicts the expression constructs used to produce the S122 strain. The expression constructs depicted in FIG. 6 were also used to generate the following strains: S171, S181, S220, S241, S270, S487, S951, S1000-S1059, S1072-S1079, and S1081.
FIG. 7 depicts the expression constructs used to produce the S171 strain. The expression constructs depicted in FIG. 7 were also used to generate the following strains: S181, S220, S241, S270, S487, S951, S1000-S1059, S1072-S1079, and S1081.
FIG. 8 depicts the expression constructs used to produce the S270 strain. The expression constructs depicted in FIG. 8 were also used to generate the following strains: S487, S951, S1000-S1059, S1072-S1079, and S1081.
FIGS. 9A and 9B depict expression constructs used to produce the S487 strain. The expression constructs depicted in FIGS. 9A and 9B were also used to generate the following strains: S951, S1000-S1059, S1072-S1079, and S1081.
FIG. 10 depicts the expression constructs used to generate the S1042 strain.
FIG. 11 depicts expression constructs used to generate the following strains: S951, S1000-S1041, S1043-S1059, S1072-S1079, and S1081.
Detailed description of the preferred embodiments
Synthetic biology allows industrial host organisms, such as microorganisms, to be engineered to convert simple sugar feedstocks into pharmaceutical products. Such methods include identifying genes that produce target molecules and optimizing their activity in an industrial host. Microbial production can offer a significant cost advantage over agriculture and chemical synthesis, is less variable, and allows for customization of target molecules. However, reconstructing or creating a pathway to produce a target molecule in an industrial host organism may require significant engineering of both the pathway genes and the host. The present disclosure provides engineered variants of a THCAS polypeptide comprising an amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions, nucleic acids comprising nucleotide sequences encoding the engineered variants, methods of making modified host cells comprising the nucleic acids, modified host cells for producing cannabinoids or cannabinoid derivatives, methods of producing cannabinoids or cannabinoid derivatives, and methods of screening for engineered variants of THCAS polypeptides. Engineered variants of the present disclosure are useful for producing cannabinoids or cannabinoid derivatives (e.g., non-naturally occurring cannabinoids). The modified host cells of the disclosure can be used to produce cannabinoids or cannabinoid derivatives (e.g., non-naturally occurring cannabinoids) and/or to express engineered variants of the disclosure. The disclosure also provides modified host cells for expressing the engineered variants of the disclosure. In addition, the present disclosure provides for the preparation of engineered variants of the present disclosure.
Cannabinoid synthase polypeptides, such as tetrahydrocannabinolic acid synthase, cannabichromenic acid synthase, or cannabidiolic acid synthase polypeptides, play an important role in the biosynthesis of cannabinoids. However, reconstituting their activity in modified host cells has proven to be challenging, impeding the progress of the production of cannabinoids or cannabinoid derivatives. Cannabinoid synthases must successfully traverse the secretory pathway to fold and function properly. These secreted plant enzymes have not evolved to be expressed in yeast cells and are therefore poorly active, with limited conversion of the substrate cannabigerolic acid (CBGA) to cannabidiolic acid (CBDA) or tetrahydrocannabinolic acid (THCA). A simple way to increase the enzymatic activity is to increase its copy number (expression). However, expression of CBDAS and THCAS genes in yeast is detrimental (possibly due to protein misfolding), thwarting direct attempts to enhance activity by integrating multiple copies of the genes. Another problem exists with product characteristics. Although the main product of the native CBDAS enzyme is CBDA, the enzyme also produces a large amount of the undesirable by-product THCA, which would require expensive additional downstream purification steps to isolate in an industrial process.
For these reasons, CBDAS or THCAS enzymes are not optimal for industrial purposes and improved enzymes are needed. Parameters of interest include catalytic activity, product characteristics, enzyme stability, and optimal pH and temperature. Improvements in enzymes are often achieved by combining the generation of diversity (libraries of engineered variants) with screening or selection for properties of interest. DNA libraries encoding engineered variants can be generated in a variety of ways. For example, error-prone PCR can be used to generate a library using the wild-type gene sequence as a template. The resulting library can be quite large, consisting of genes with variable numbers of mutations at random positions. Error-prone PCR is inexpensive and convenient, but has several drawbacks. First, it yields a distribution of mutation numbers rather than an exact number of mutations per construct. This presents an unfortunate compromise. A distribution centered on low mutation numbers wastes a large share of screening capacity on zero-mutation wild-type constructs. A distribution centered on higher mutation numbers may result in constructs with accumulated loss-of-function mutations that would prevent the identification of the desired gain-of-function mutation. Second, error-prone PCR introduces mutation bias (an inherent property of the low-fidelity polymerase used), which means that certain types of mutations are under-represented in the library. An effective alternative to error-prone PCR is saturation mutagenesis, which involves synthesizing a library containing every possible amino acid at every position in the protein. Recent advances in DNA synthesis technology have significantly improved the quality of these libraries.
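The trade-off described for error-prone PCR, and the size of a saturation mutagenesis library, can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that the number of mutations per construct follows a Poisson distribution; that model, the function names, the mean mutation numbers, and the 500-residue protein length are assumptions and are not taken from the disclosure.

```python
import math


def fraction_with_k_mutations(mean_mutations: float, k: int) -> float:
    """Poisson probability that a construct carries exactly k mutations (assumed model)."""
    return math.exp(-mean_mutations) * mean_mutations ** k / math.factorial(k)


def single_site_saturation_size(protein_length: int, alphabet_size: int = 20) -> int:
    """Number of single-substitution variants: each non-wild-type residue at each position."""
    return protein_length * (alphabet_size - 1)


# Share of the library that is unmutated wild type at a low and a high mean mutation number.
wild_type_fraction_low = fraction_with_k_mutations(mean_mutations=1.0, k=0)
wild_type_fraction_high = fraction_with_k_mutations(mean_mutations=4.0, k=0)

# Library size for a hypothetical 500-residue protein.
library_size = single_site_saturation_size(500)
```

Under this assumed model, lowering the mean mutation number sharply raises the share of wild-type constructs, whereas a single-site saturation library has a fixed, enumerable size that grows linearly with protein length.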
Once a library encoding the engineered variants is generated, the engineered variants having the property of interest must be selected or screened. This can be achieved by using a protein production host to express and purify the engineered variants, followed by in vitro testing. Such methods allow for careful measurement of the kinetic parameters of the engineered variants and evaluation of performance under carefully controlled conditions. However, for use in engineered microbial strains, in vitro data can be highly misleading, as no in vitro system can accurately represent the cellular environment. In this case, the best option is to test the engineered variants in the precise context in which they must ultimately perform: inside the engineered production strain. In the case of cannabinoid synthases, such production strains are engineered to produce an excess of the substrate CBGA. One challenge with such in vivo systems is their high variability. When testing large libraries, this variability can make it difficult to distinguish clones with more subtle improvements in enzyme activity over the wild type. By calculating the ratio of the titer of the library enzyme product to the titer of the product of an invariant competing enzyme, the variability of the data can be significantly reduced. This is because biological variables tend to affect both enzymes in the same way, allowing the effect to be normalized. In contrast to kinetic parameters alone, competition ratios report both enzymatic parameters, such as Km and kcat, and the steady-state levels (expression and stability) of the functional engineered variants.
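The variance-reducing effect of the competition ratio can be illustrated with a toy calculation. The sketch assumes an idealized case in which each culture well scales the output of both enzymes by the same multiplicative factor; the well factors and activity values are invented, and real data would be expected to cancel only partially.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean, a unitless measure of spread."""
    return pstdev(values) / mean(values)


# Hypothetical well-to-well factors (growth, substrate supply, and so on)
# assumed to act identically on both co-expressed enzymes.
well_factors = [0.8, 1.0, 1.25, 0.9, 1.1]

library_activity = 120.0     # hypothetical library enzyme product titer at factor 1.0, mg/L
competitor_activity = 10.0   # hypothetical competing enzyme product titer at factor 1.0, mg/L

library_titers = [f * library_activity for f in well_factors]
competitor_titers = [f * competitor_activity for f in well_factors]

# The shared factor cancels when the two titers from the same well are divided.
ratios = [lib / comp for lib, comp in zip(library_titers, competitor_titers)]

cv_titer = coefficient_of_variation(library_titers)
cv_ratio = coefficient_of_variation(ratios)
```

In this idealized setting the raw titers vary by roughly 15% across wells while the ratio is essentially constant, which is the normalization effect the passage describes.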
By using the above methods, the present disclosure provides engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides. Here, various engineered variants were screened. In 58 different variants, THCA titers were increased (outside the standard deviation of wild type). These engineered variants of the disclosure are useful for producing cannabinoids or cannabinoid derivatives (e.g., non-naturally occurring cannabinoids). Engineered variants of the present disclosure can produce a greater amount of tetrahydrocannabinolic acid (THCA) from cannabigerolic acid (CBGA), in mg/L or mM, than the amount of THCA produced from CBGA by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions. In addition, engineered variants of the disclosure can produce THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to the ratio of THCA to the other cannabinoid produced by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions. Similar conditions may include the same temperature, pH, buffer, and/or fermentation conditions, as well as the same medium and/or reaction solvent.
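One way to read "outside the standard deviation of wild type" is that a variant counts as improved when its titer exceeds the wild-type mean by more than one standard deviation of the wild-type replicates. The sketch below encodes that reading as an assumption; the one-standard-deviation cutoff, the use of the sample standard deviation, the function name, and the replicate values are all hypothetical.

```python
from statistics import mean, stdev


def exceeds_wild_type(variant_titer: float, wild_type_titers: list) -> bool:
    """True if the variant titer is above the wild-type mean plus one sample standard deviation."""
    threshold = mean(wild_type_titers) + stdev(wild_type_titers)
    return variant_titer > threshold


# Hypothetical THCA titers (mg/L) for wild-type replicate cultures.
wild_type_replicates = [100.0, 110.0, 90.0, 105.0, 95.0]

strong_variant = exceeds_wild_type(120.0, wild_type_replicates)
marginal_variant = exceeds_wild_type(105.0, wild_type_replicates)
```

A stricter cutoff (for example, two standard deviations) would reduce false positives at the cost of missing subtler improvements.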
The methods of the disclosure can include the use of an engineered microorganism (e.g., a modified host cell) or an engineered variant of a THCAS polypeptide of the disclosure to produce naturally occurring and non-naturally occurring cannabinoids. Naturally occurring cannabinoids and non-naturally occurring cannabinoids (e.g., cannabinoid derivatives) are challenging to produce using chemical synthesis due to their complex structures. The methods of the present disclosure enable the construction of metabolic pathways inside living cells to produce customized cannabinoids or cannabinoid derivatives from simple precursors such as sugars and carboxylic acids. One or more nucleic acids disclosed herein (e.g., heterologous nucleic acids) comprising a nucleotide sequence encoding one or more polypeptides or engineered variants disclosed herein can be introduced into a host microorganism, allowing for the step-wise conversion of inexpensive starting materials such as sugars into the final product: a cannabinoid or a cannabinoid derivative. These products can be specified by selecting and constructing expression constructs or vectors comprising one or more of the nucleic acids disclosed herein (e.g., heterologous nucleic acids), thereby allowing efficient biological production of selected cannabinoids, such as THC and THCA, of the less common cannabinoid species found at low levels in Cannabis, or of cannabinoid derivatives. Bio-production also enables the synthesis of cannabinoids or cannabinoid derivatives with defined stereochemistry, which is challenging to achieve using chemical synthesis.
To produce a cannabinoid or cannabinoid derivative and create a biosynthetic pathway within a modified host cell, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of a THCAS polypeptide of the disclosure may express or overexpress a combination of heterologous nucleic acids comprising nucleotide sequences encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, the nucleotide sequence encoding a polypeptide involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA) is codon optimized.
The disclosure also provides for modification of the secretory pathway of a host cell modified with one or more nucleic acids (e.g., heterologous nucleic acids) comprising a nucleotide sequence encoding an engineered variant of a THCAS polypeptide of the disclosure. In some embodiments, the nucleotide sequence encoding the engineered variant of the THCAS polypeptide is codon optimized. Modification of the secretory pathway in a host cell can improve expression and solubilization of the engineered variants of the disclosure, as these variants are processed through the secretory pathway. Reconstituting the activity of a polypeptide processed through the secretory pathway (such as an engineered variant of the present disclosure) in a modified host cell, such as a modified yeast cell, can be challenging and unreliable. Often, the expressed engineered variants may be misfolded or mislocalized, resulting in low expression, lack of activity of the expressed engineered variants, aggregation of the engineered variants, reduced host cell viability, and/or cell death. In addition, accumulation of misfolded or mislocalized expressed engineered variants can induce metabolic stress in the modified host cell, damaging the modified host cell. The expressed engineered variants may lack the post-translational modifications necessary for folding and activity, such as disulfide bonds, glycosylation and trimming, as well as cofactors, resulting in inactive polypeptides or polypeptides with reduced enzymatic activity.
The modified host cells of the present disclosure may be modified yeast cells. Yeast cells can be cultured using known conditions, grow rapidly, and are generally regarded as safe. Yeast cells contain a secretory pathway common to all eukaryotes. As disclosed herein, manipulation of the secretory pathway in a yeast host cell modified with one or more nucleic acids (e.g., heterologous nucleic acids) comprising a nucleotide sequence encoding an engineered variant of the THCAS polypeptide of the disclosure can improve the expression, folding, and enzymatic activity of the engineered variant as well as the viability of the modified yeast host cell, such as a modified Saccharomyces cerevisiae. In addition, the use of codon-optimized nucleotide sequences encoding the engineered variants of the disclosure can improve the expression and activity of the engineered variants as well as the viability of modified yeast host cells (such as modified Saccharomyces cerevisiae).
In addition to allowing the production of the desired cannabinoid or cannabinoid derivative, the present disclosure provides a more reliable and economical process than agriculture-based production. Microbial fermentation can be completed within days, not months as is necessary for crops, is not affected by climate change or soil contamination (e.g., heavy metal contamination), and can produce high titer pure products.
The present disclosure also provides a platform for economically producing high value cannabinoids (including THC) and derivatives thereof. Also provided is the production of different cannabinoids or cannabinoid derivatives for which no viable production method currently exists. Cannabinoids and cannabinoid derivatives may be produced in amounts greater than 100 mg/liter of culture medium, greater than 1 g/liter of culture medium, greater than 10 g/liter of culture medium, or greater than 100 g/liter of culture medium using the engineered variants, methods, and modified host cells disclosed herein.
In addition, the disclosure provides engineered variants, methods, modified host cells and nucleic acids of the THCAS polypeptide to produce cannabinoids or cannabinoid derivatives from simple precursors in vivo or in vitro. The nucleic acids (e.g., heterologous nucleic acids) disclosed herein can be introduced into a microorganism (e.g., a modified host cell) resulting in the expression or overexpression of one or more polypeptides (such as engineered variants of the disclosure) which can then be used for the production of cannabinoids or cannabinoid derivatives in vitro or in vivo. In some embodiments, the in vitro method is cell-free.
Cannabinoid biosynthesis
In addition to one or more nucleic acids (e.g., heterologous nucleic acids) encoding engineered variants of the THCAS polypeptide, one or more nucleic acids (e.g., heterologous nucleic acids) encoding one or more polypeptides having at least one activity of a polypeptide present in a cannabinoid or cannabinoid precursor biosynthetic pathway may be used in the methods and modified host cells for synthesizing cannabinoids or cannabinoid derivatives. Cannabinoid precursors can include, for example, geranyl pyrophosphate (GPP), prenyl phosphates, olivetolic acid, or hexanoyl-CoA.
In cannabis, cannabinoids are produced from the common metabolite precursors geranyl pyrophosphate (GPP) and hexanoyl-CoA by the action of three polypeptides. Hexanoyl-CoA and malonyl-CoA are combined by a tetraketide synthase (TKS) polypeptide to provide a 12-carbon tetraketide intermediate. The tetraketide intermediate is then cyclized by an olivetolic acid cyclase (OAC) polypeptide to produce olivetolic acid. The olivetolic acid is then prenylated with the common isoprenoid precursor GPP by a geranyl pyrophosphate:olivetolic acid geranyltransferase (GOT) polypeptide (e.g., a CsPT4 polypeptide) to produce CBGA, a cannabinoid also known as the "parent cannabinoid". Engineered variants of the THCAS polypeptides of the present disclosure then convert CBGA into other cannabinoids, e.g., THCA. In the presence of heat or light, acidic cannabinoids may undergo decarboxylation, for example THCA to THC.
GPP and hexanoyl-CoA can be produced in several ways. One or more nucleic acids (e.g., heterologous nucleic acids) encoding one or more polypeptides having at least one activity of a polypeptide present in these pathways can be used in methods and modified host cells for synthesizing cannabinoids or cannabinoid derivatives.
A polypeptide that produces GPP or is part of a GPP-producing biosynthetic pathway can be one or more polypeptides having at least one activity of a polypeptide present in a mevalonate (MEV) pathway (e.g., one or more MEV pathway polypeptides). The term "mevalonate pathway" or "MEV pathway" as used herein may refer to a biosynthetic pathway which converts acetyl-CoA to isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP). The mevalonate pathway comprises polypeptides that catalyze the following steps: (a) condensing two acetyl-CoA molecules to produce acetoacetyl-CoA (e.g., by the action of an acetoacetyl-CoA thiolase polypeptide); (b) condensing acetoacetyl-CoA with acetyl-CoA to form hydroxymethylglutaryl-CoA (HMG-CoA) (e.g., by the action of an HMG-CoA synthase (HMGS) polypeptide); (c) converting HMG-CoA to mevalonate (e.g., by the action of an HMG-CoA reductase (HMGR) polypeptide); (d) phosphorylating mevalonate to mevalonate 5-phosphate (e.g., by the action of a mevalonate kinase (MK) polypeptide); (e) converting mevalonate 5-phosphate to mevalonate 5-pyrophosphate (e.g., by the action of a phosphomevalonate kinase (PMK) polypeptide); (f) converting mevalonate 5-pyrophosphate to isopentenyl pyrophosphate (e.g., by the action of a mevalonate pyrophosphate decarboxylase (MVD1) polypeptide); and (g) converting isopentenyl pyrophosphate (IPP) to dimethylallyl pyrophosphate (DMAPP) (e.g., by the action of an isopentenyl pyrophosphate isomerase (IDI1) polypeptide). Geranyl pyrophosphate synthase (GPPS) polypeptides then act on IPP and/or DMAPP to produce GPP.
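Steps (a) through (g) above can be laid out as an ordered table of substrate, product, and enzyme, which makes it easy to confirm that each step consumes the product of the preceding one. The following Python sketch is only a bookkeeping illustration; the names `MEV_STEPS` and `trace_pathway` are hypothetical and are not part of the disclosure, and cosubstrates (such as the second acetyl-CoA in step (b)) are deliberately left out.

```python
# Illustrative only: the mevalonate (MEV) pathway steps as listed in the text.
# MEV_STEPS and trace_pathway are hypothetical names introduced for this sketch.
# Each entry: (step label, main substrate, product, enzyme named in the text).
MEV_STEPS = [
    ("a", "acetyl-CoA", "acetoacetyl-CoA", "acetoacetyl-CoA thiolase"),
    ("b", "acetoacetyl-CoA", "HMG-CoA", "HMG-CoA synthase (HMGS)"),
    ("c", "HMG-CoA", "mevalonate", "HMG-CoA reductase (HMGR)"),
    ("d", "mevalonate", "mevalonate 5-phosphate", "mevalonate kinase (MK)"),
    ("e", "mevalonate 5-phosphate", "mevalonate 5-pyrophosphate",
     "phosphomevalonate kinase (PMK)"),
    ("f", "mevalonate 5-pyrophosphate", "IPP",
     "mevalonate pyrophosphate decarboxylase (MVD1)"),
    ("g", "IPP", "DMAPP", "isopentenyl pyrophosphate isomerase (IDI1)"),
]


def trace_pathway(steps):
    """Return the chain of metabolites, checking that each step's
    substrate is the product of the previous step."""
    chain = [steps[0][1]]
    for label, substrate, product, _enzyme in steps:
        if substrate != chain[-1]:
            raise ValueError(f"step ({label}) does not follow from {chain[-1]}")
        chain.append(product)
    return chain


if __name__ == "__main__":
    print(" -> ".join(trace_pathway(MEV_STEPS)))
```

The GPPS step is not included in the table because it draws on both IPP and DMAPP rather than continuing the linear chain.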
Hexanoyl-CoA producing polypeptides may include polypeptides that produce acyl-CoA compounds or acyl-CoA compound derivatives (e.g., acyl-activating enzyme polypeptides, fatty acyl-CoA synthase polypeptides, or fatty acyl-CoA ligase polypeptides). Hexanoyl-CoA derivatives, acyl-CoA compounds, or acyl-CoA compound derivatives may also be formed via such polypeptides.
GPP and hexanoyl-CoA can also be produced by a pathway comprising a polypeptide that condenses two acetyl-CoA molecules to produce acetoacetyl-CoA and a pyruvate decarboxylase polypeptide that produces acetyl-CoA from pyruvate via acetaldehyde. Hexanoyl-CoA derivatives, acyl-CoA compounds, or acyl-CoA compound derivatives may also be formed via such pathways.
General information
In certain aspects, the practice of the present disclosure will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are within the skill of the art. Such techniques are explained fully in the literature: "Molecular Cloning: A Laboratory Manual," second edition (Sambrook et al., 1989); "Oligonucleotide Synthesis" (edited by M.J. Gait, 1984); "Animal Cell Culture" (edited by R.I. Freshney, 1987); "Methods in Enzymology" (Academic Press, Inc.); "Current Protocols in Molecular Biology" (edited by F.M. Ausubel et al., 1987, and periodically updated); "PCR: The Polymerase Chain Reaction" (edited by Mullis et al., 1994). Singleton et al., Dictionary of Microbiology and Molecular Biology, 2nd edition, J. Wiley & Sons (New York, N.Y. 1994), and March, Advanced Organic Chemistry: Reactions, Mechanisms and Structure, 4th edition, John Wiley & Sons (New York, N.Y. 1992) provide those skilled in the art with general guidance for many of the terms used in this application.
As used herein, "cannabinoid" or "cannabinoid compound" may refer to a member of a unique class of meroterpenoids that to date have been found only in Cannabis sativa. Cannabinoids may include, but are not limited to, cannabichromene (CBC) type (e.g., cannabichromenic acid), cannabigerol (CBG) type (e.g., cannabigerolic acid), cannabidiol (CBD) type (e.g., cannabidiolic acid), Δ9-trans-tetrahydrocannabinol (Δ9-THC) type (e.g., Δ9-tetrahydrocannabinolic acid), Δ8-trans-tetrahydrocannabinol (Δ8-THC) type, cannabicyclol (CBL) type, cannabielsoin (CBE) type, cannabinol (CBN) type, cannabinodiol (CBND) type, cannabitriol (CBT) type, cannabigerolic acid (CBGA), cannabigerolic acid monomethyl ether (CBGAM), cannabigerol (CBG), cannabigerol monomethyl ether (CBGM), cannabigerovarinic acid (CBGVA), cannabigerovarin (CBGV), cannabichromenic acid (CBCA), cannabichromene (CBC), cannabichromevarinic acid (CBCVA), cannabichromevarin (CBCV), cannabidiolic acid (CBDA), cannabidiol (CBD), cannabidiol monomethyl ether (CBDM), cannabidiol-C4 (CBD-C4), cannabidivarinic acid (CBDVA), cannabidivarin (CBDV), cannabidiorcol (CBD-C1), Δ9-tetrahydrocannabinolic acid A (THCA-A), Δ9-tetrahydrocannabinolic acid B (THCA-B), Δ9-tetrahydrocannabinol (THC), Δ9-tetrahydrocannabinolic acid-C4 (THCA-C4), Δ9-tetrahydrocannabinol-C4 (THC-C4), Δ9-tetrahydrocannabivarinic acid (THCVA), Δ9-tetrahydrocannabivarin (THCV), Δ9-tetrahydrocannabiorcolic acid (THCA-C1), Δ9-tetrahydrocannabiorcol (THC-C1), Δ7-cis-iso-tetrahydrocannabivarin, Δ8-tetrahydrocannabinolic acid (Δ8-THCA), Δ8-tetrahydrocannabinol (Δ8-THC), cannabicyclolic acid (CBLA), cannabicyclol (CBL), cannabicyclovarin (CBLV), cannabielsoic acid A (CBEA-A), cannabielsoic acid B (CBEA-B), cannabielsoin (CBE), cannabielsoinic acid, cannabicitranic acid, cannabinolic acid (CBNA), cannabinol (CBN), cannabinol methyl ether (CBNM), cannabinol-C4 (CBN-C4), cannabivarin (CBV), cannabinol-C2 (CBN-C2), cannabiorcol (CBN-C1), cannabinodiol (CBND), cannabinodivarin (CBVD), cannabitriol (CBT), 10-ethoxy-9-hydroxy-delta-6a-tetrahydrocannabinol, 8,9-dihydroxy-delta-6a-tetrahydrocannabinol, cannabitriolvarin (CBTVE), dehydrocannabifuran (DCBF), cannabifuran (CBF), cannabichromanon (CBCN), cannabicitran (CBT), 10-oxo-delta-6a-tetrahydrocannabinol (OTHC), delta-9-cis-tetrahydrocannabinol (cis-THC), 3,4,5,6-tetrahydro-7-hydroxy-alpha-alpha-2-trimethyl-9-n-propyl-2,6-methano-2H-1-benzoxocin-5-methanol (OH-iso-HHCV), cannabiripsol (CBR), and trihydroxy-delta-9-tetrahydrocannabinol (triOH-THC).
Acyl-CoA compounds as detailed herein may include compounds having the structure:
[Chemical structure image: acyl-CoA compound bearing the side chain R]
wherein R may be an unsubstituted fatty acid side chain or a fatty acid side chain substituted with or comprising one or more functional and/or reactive groups disclosed herein (i.e., acyl-CoA compound derivatives).
As used herein, a hexanoyl-CoA derivative, an acyl-CoA compound derivative, a cannabinoid derivative, or an olivetolic acid derivative may refer to hexanoyl-CoA, an acyl-CoA compound, a cannabinoid, or olivetolic acid that is substituted with or comprises one or more functional and/or reactive groups. Functional groups can include, but are not limited to, azido, halogen (e.g., chlorine, bromine, iodine, fluorine), methyl, alkyl (including branched and straight chain alkyl), alkynyl, alkenyl, methoxy, alkoxy, acetyl, amino, carboxyl, carbonyl, oxo, ester, hydroxy, thio (e.g., thiol), cyano, aryl, heteroaryl, cycloalkyl, cycloalkenyl, cycloalkylalkenyl, cycloalkylalkynyl, cycloalkenylalkyl, cycloalkenylalkynyl, heterocyclylalkenyl, heterocyclylalkynyl, heteroarylalkenyl, heteroarylalkynyl, arylalkenyl, arylalkynyl, heterocyclyl, spiro, heterospiro, thioalkyl (or alkylthio), arylthio, heteroarylthio, sulfone, sulfonyl, sulfoxide, amido, alkylamino, dialkylamino, arylamino, alkylarylamino, diarylamino, N-oxide, imide, enamine, imine, oxime, hydrazone, nitrile, aralkyl, cycloalkylalkyl, haloalkyl, heterocyclylalkyl, heteroarylalkyl, nitro, thioketone, and the like. Suitable reactive groups may include, but are not necessarily limited to, azide, carboxyl, carbonyl, amine (e.g., alkylamine (e.g., lower alkylamine), arylamine), halide, ester (e.g., alkyl ester (e.g., lower alkyl ester, benzyl ester), aryl ester, substituted aryl ester), cyano, thioester, thioether, sulfonyl halide, alcohol, thiol, succinimidyl ester, isothiocyanate, iodoacetamide, maleimide, hydrazine, alkynyl, alkenyl, and the like. The reactive group may facilitate covalent attachment of a target molecule.
Suitable target molecules may include, but are not limited to, a detectable label; an imaging agent; a toxin (including a cytotoxin); a linker; a peptide; a drug (e.g., a small molecule drug); a member of a specific binding pair; an epitope tag; a ligand for binding to a target receptor; a tag to aid in purification; a molecule that increases solubility; a molecule that enhances bioavailability; a molecule that increases half-life in vivo; a molecule that targets a specific cell type; a molecule that targets a specific tissue; a molecule that provides for crossing the blood-brain barrier; a molecule that promotes selective attachment to a surface; and so on. The functional group and the reactive group may be unsubstituted or substituted with one or more functional groups or reactive groups.
A cannabinoid derivative or olivetolic acid derivative may also refer to a compound lacking one or more chemical moieties found in naturally occurring cannabinoids or olivetolic acid. Such chemical moieties may include, but are not limited to, methyl, alkyl, alkenyl, methoxy, alkoxy, acetyl, carboxyl, carbonyl, oxo, ester, hydroxy, aryl, heteroaryl, cycloalkyl, cycloalkenyl, cycloalkylalkenyl, cycloalkenylalkyl, cycloalkenylalkenyl, heterocyclylalkenyl, heteroarylalkenyl, arylalkenyl, heterocyclyl, aralkyl, cycloalkylalkyl, heterocyclylalkyl, heteroarylalkyl, and the like. In some embodiments, the cannabinoid derivative or the olivetolic acid derivative may also comprise one or more of any of the functional and/or reactive groups described herein. The functional group and the reactive group may be unsubstituted or substituted with one or more functional groups or reactive groups.
The term "nucleic acid" as used herein may refer to a polymeric form of nucleotides of any length (ribonucleotides or deoxynucleotides). Thus, the term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, genes, synthetic DNA or RNA, DNA-RNA hybrids, or polymers comprising purine and pyrimidine bases or other naturally occurring, chemically or biochemically modified, non-naturally occurring, or derivatized nucleotide bases.
The terms "peptide," "polypeptide," and "protein" are used interchangeably herein and may refer to polymeric forms of amino acids of any length, which may include coded and non-coded amino acids as well as chemically or biochemically modified or derivatized amino acids. The polypeptides disclosed herein may include full-length polypeptides, polypeptide fragments, truncated polypeptides, fusion polypeptides, or polypeptides having a modified peptide backbone. The polypeptides disclosed herein may also be variants that differ from a specifically recited "reference" polypeptide (e.g., a wild-type polypeptide) by amino acid insertions, deletions, mutations, and/or substitutions.
An "engineered variant of a tetrahydrocannabinolic acid synthase polypeptide" or "engineered variant of the present disclosure" can refer to a non-wild-type polypeptide having tetrahydrocannabinolic acid synthase activity. The person skilled in the art can measure the tetrahydrocannabinolic acid synthase activity of the engineered variants using known methods, for example, by GC-MS or LC-MS or as described in the examples provided herein. The engineered variant may have amino acid substitutions as compared to a wild-type tetrahydrocannabinolic acid synthase sequence, such as a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44. In addition to substitutions, the engineered variants may comprise truncations, additions and/or deletions and/or other mutations compared to a wild-type tetrahydrocannabinolic acid synthase sequence, such as a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44. The engineered variant may have substitutions as compared to a non-wild-type tetrahydrocannabinolic acid synthase sequence. In addition to substitutions, the engineered variants may comprise truncations, additions and/or deletions and/or other mutations compared to the non-wild-type tetrahydrocannabinolic acid synthase sequence. The engineered variants described herein contain at least one amino acid residue substitution as compared to a parent tetrahydrocannabinolic acid synthase polypeptide. In some embodiments, the parent tetrahydrocannabinolic acid synthase polypeptide is a wild-type sequence. In some embodiments, the parent tetrahydrocannabinolic acid synthase polypeptide is a non-wild-type sequence.
As used herein, the term "heterologous" may refer to a substance not normally found in nature. Thus, a heterologous nucleotide sequence may be: (a) foreign to its host cell (i.e., "exogenous" to the cell); (b) naturally found in the host cell (i.e., "endogenous") but present in the cell in a non-native amount (i.e., greater or less than the amount naturally found in the host cell); (c) naturally found in the host cell but positioned outside its native locus; or (d) naturally found in the host cell but with introns removed or added. The term "heterologous nucleotide sequence" or the term "heterologous nucleic acid" can refer to a nucleic acid or nucleotide sequence not normally found in a given cell in nature. Codon-optimized nucleotide sequences may be examples of heterologous nucleotide sequences. The term "heterologous enzyme" or "heterologous polypeptide" may refer to an enzyme or polypeptide not normally found in a given cell in nature. The term encompasses an enzyme or polypeptide that is: (a) exogenous to a given cell (i.e., encoded by a nucleic acid that does not naturally occur in the host cell or does not naturally occur in a given context in the host cell); or (b) naturally found in the host cell (e.g., the enzyme or polypeptide is encoded by a nucleic acid endogenous to the cell) but produced in a non-native amount (e.g., greater than or less than the amount naturally found) in the host cell. For example, a heterologous polypeptide can include a mutated version of a polypeptide naturally occurring in the host cell. A heterologous nucleic acid can be: (a) foreign to its host cell (i.e., "exogenous" to the cell); (b) naturally found in the host cell (i.e., "endogenous") but present in the cell in a non-native amount (i.e., greater or less than the amount naturally found in the host cell); or (c) naturally found in the host cell but positioned outside its native locus.
In some embodiments, the heterologous nucleic acid can comprise a codon optimized nucleotide sequence.
As used herein, the term "one or more heterologous nucleic acids" or "one or more heterologous nucleotide sequences" may refer to a heterologous nucleic acid comprising one or more nucleotide sequences encoding one or more polypeptides. In some embodiments, the one or more heterologous nucleic acids can comprise a nucleotide sequence encoding a polypeptide. In other embodiments, the one or more heterologous nucleic acids can comprise a nucleotide sequence encoding more than one polypeptide. In some embodiments, these one or more heterologous nucleic acids can comprise nucleotide sequences encoding multiple copies of the same polypeptide. In some embodiments, these one or more heterologous nucleic acids can comprise multiple copies of a nucleotide sequence encoding different polypeptides.
As used herein, "increased ratio" may refer to an increase in the molar ratio, mass (or weight) ratio, molar concentration ratio, or mass concentration (e.g., mg/L or mg/mL) ratio between two products produced by a polypeptide, engineered variant, method, and/or modified host cell disclosed herein, as compared to the same ratio between the same two products produced by another polypeptide, engineered variant, method, and/or modified host cell disclosed herein. For example, a 100:1 ratio of THCA to CBCA produced by one engineered variant disclosed herein is an increased ratio of THCA to CBCA as compared to an 11:1 ratio of THCA to CBCA produced by a different engineered variant disclosed herein.
As used herein, the ratio of products produced by the polypeptides, engineered variants, methods, and/or modified host cells disclosed herein, such as the ratio of THCA to CBCA, can refer to a molar ratio, a mass (or weight) ratio, a molar concentration ratio, or a mass concentration (e.g., mg/L or mg/mL) ratio. For example, if the modified host cell disclosed herein produces 4 mM THCA and 1 mM CBCA, the ratio of THCA to CBCA will be 4:1.
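The two definitions above reduce to simple arithmetic: divide the amount of one product by the amount of the other (both in the same unit), then compare two such ratios. The following Python sketch reproduces the 4:1 and the 100:1 versus 11:1 examples from the text; the function names `product_ratio` and `is_increased_ratio` are hypothetical and introduced only for illustration.

```python
from fractions import Fraction


def product_ratio(amount_a, amount_b):
    """Ratio of product A to product B as an exact fraction.

    Both amounts must be expressed in the same unit (e.g., mM or mg/L);
    the unit then cancels, as in the definition in the text.
    """
    if amount_b == 0:
        raise ValueError("amount of the second product must be non-zero")
    # str() keeps decimal inputs such as 0.5 exact (Fraction("0.5") == 1/2).
    return Fraction(str(amount_a)) / Fraction(str(amount_b))


def is_increased_ratio(ratio_new, ratio_reference):
    """True if ratio_new is larger than ratio_reference."""
    return ratio_new > ratio_reference


if __name__ == "__main__":
    # 4 mM THCA and 1 mM CBCA give a THCA:CBCA ratio of 4:1.
    print(product_ratio(4, 1))
    # A 100:1 ratio is an increased ratio relative to an 11:1 ratio.
    print(is_increased_ratio(product_ratio(100, 1), product_ratio(11, 1)))
```

Exact fractions are used here only to avoid floating-point surprises when comparing ratios; ordinary floats would serve equally well for measured titers.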
"operably linked" may refer to an arrangement of elements wherein the components so described are configured to perform their ordinary function. Thus, a control sequence operably linked to a coding sequence is capable of effecting the expression of the coding sequence. Control sequences need not be contiguous with the coding sequence, so long as they function to direct its expression. Thus, for example, an untranslated yet transcribed intervening sequence may be present between a promoter sequence and a coding sequence, and the promoter sequence may still be considered "operably linked" to the coding sequence.
"isolated" may refer to a polypeptide or nucleic acid that is substantially or essentially free of components that normally accompany it in its native state. An isolated polypeptide or nucleic acid may differ from the form or environment in which it is found in nature. Thus, isolated polypeptides and nucleic acids can be distinguished from polypeptides and nucleic acids that are present in a native cell. An isolated nucleic acid or polypeptide can be purified from one or more other components in admixture with the isolated nucleic acid or polypeptide, if such components are present.
A "modified host cell" (also referred to as a "recombinant host cell") can refer to a host cell into which a heterologous nucleic acid, such as an expression vector or construct, has been introduced. For example, a modified eukaryotic host cell can be produced by introducing a heterologous nucleic acid into a suitable eukaryotic host cell.
As used herein, "cell-free system" may refer to a cell lysate, cell extract, or other preparation in which substantially all of the cells in the preparation have been disrupted or otherwise processed such that all or selected cellular components, e.g., organelles, proteins, nucleic acids, cell membranes themselves (or fragments or components thereof), etc., are released from the cells or resuspended in an appropriate culture medium and/or purified from the cellular environment. Cell-free systems may include reaction mixtures prepared from purified and/or isolated polypeptides and suitable reagents and buffers.
In some embodiments, conservative substitutions may be made in the amino acid sequence of a polypeptide without disrupting the three-dimensional structure or function of the polypeptide. Conservative substitutions may be made by the skilled person by substituting amino acids with similar hydrophobicity, polarity and side chain (R group) length for each other. In addition, by comparing aligned sequences of homologous proteins from different species, conservative substitutions can be identified by locating amino acid residues that have been mutated between species without altering the basic function of the encoded protein. The term "conservative amino acid substitution" may refer to the interchangeability of amino acid residues in a protein that have similar side chains. For example, one class of amino acids with aliphatic side chains can consist of glycine, alanine, valine, leucine, and isoleucine; one class of amino acids with aliphatic hydroxyl side chains can consist of serine and threonine; one class of amino acids with amide-containing side chains may consist of asparagine and glutamine; one class of amino acids with aromatic side chains may consist of phenylalanine, tyrosine, and tryptophan; one class of amino acids with basic side chains may consist of lysine, arginine and histidine; one class of amino acids with acidic side chains may consist of glutamic acid and aspartic acid; and one class of amino acids with sulfur-containing side chains may consist of cysteine and methionine. Exemplary classes of conservative amino acid substitutions are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
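The side-chain classes listed above can be written as a small lookup table, with a substitution treated as a candidate conservative substitution when both residues fall in the same class. The following Python sketch encodes exactly the seven classes named in the text using one-letter amino acid codes; `SIDE_CHAIN_CLASSES` and `shared_class` are hypothetical names, and class membership is only a rough heuristic, not a guarantee that structure or function is preserved.

```python
# Side-chain classes as listed in the text, in one-letter amino acid codes.
# SIDE_CHAIN_CLASSES and shared_class are hypothetical names for this sketch.
SIDE_CHAIN_CLASSES = {
    "aliphatic": set("GAVLI"),          # Gly, Ala, Val, Leu, Ile
    "aliphatic hydroxyl": set("ST"),    # Ser, Thr
    "amide-containing": set("NQ"),      # Asn, Gln
    "aromatic": set("FYW"),             # Phe, Tyr, Trp
    "basic": set("KRH"),                # Lys, Arg, His
    "acidic": set("ED"),                # Glu, Asp
    "sulfur-containing": set("CM"),     # Cys, Met
}


def shared_class(residue_a, residue_b):
    """Return the name of the class containing both residues, or None."""
    a, b = residue_a.upper(), residue_b.upper()
    for name, members in SIDE_CHAIN_CLASSES.items():
        if a in members and b in members:
            return name
    return None


if __name__ == "__main__":
    print(shared_class("V", "L"))  # same class: aliphatic
    print(shared_class("K", "D"))  # basic vs. acidic: no shared class
```

Proline is absent from every class because the text does not assign it to one.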
A polynucleotide or polypeptide has a certain percentage of "sequence identity" to another polynucleotide or polypeptide, meaning that, when the two sequences are aligned and compared, that percentage of bases or amino acids is the same and in the same relative position. Sequence identity can be determined in a number of different ways. To determine sequence identity, the sequences may be aligned using various methods and computer programs (e.g., BLAST, T-COFFEE, MUSCLE, MAFFT, etc.) available on the World Wide Web at sites including ncbi.nlm.nih.gov/BLAST, ebi.ac.uk/Tools/msa/tcoffee/, ebi.ac.uk/Tools/msa/muscle/, and mafft.cbrc.jp/alignment/software/. See, e.g., Altschul et al. (1990), J. Mol. Biol. 215:403-10.
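Once two sequences have been aligned by one of the programs named above, percent identity is a count of matching columns divided by the number of compared columns. The following Python sketch assumes the alignment has already been produced and that gaps are written as "-"; the function name `percent_identity` and the choice of denominator (all columns except those where both sequences have a gap) are assumptions of this illustration, and alignment tools may report identity using a different denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    Columns where both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # uninformative column
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Toy sequences, not taken from SEQ ID NO:44: 6 of 8 columns match.
    print(percent_identity("MNCSAFSF", "MNCTAFSY"))
```

A result from such a function could then be compared against a threshold such as "at least 85% sequence identity".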
Before the present disclosure is further described, it is to be understood that this disclosure is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present disclosure will be limited only by the appended claims.
Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the scope of the disclosure. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the disclosure, subject to any specifically excluded limit in the stated range. Where a stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, the preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited.
It must be noted that, as used herein and in the appended claims, the singular forms "a," "an," and "the" may include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a cannabinoid compound" or "a cannabinoid" can include a plurality of such compounds, and reference to "a modified host cell" can include reference to one or more modified host cells and equivalents thereof known to those skilled in the art, and so forth. It is also noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for the use of such exclusive terminology as "solely," "only," and the like in connection with the recitation of claim elements, or for the use of a "negative" limitation.
It is appreciated that certain features of the disclosure, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the disclosure which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination. All combinations of embodiments related to the present disclosure are expressly included in the present disclosure and disclosed herein as if each combination were individually and expressly disclosed. In addition, all subcombinations of the various embodiments and elements thereof are also expressly encompassed by the present disclosure and are disclosed herein as if each such subcombination was individually and specifically disclosed herein.
Engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides
Disclosed herein are engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides comprising the amino acid sequence of SEQ ID NO:44 with one or more amino acid substitutions. The inventors have identified one or more amino acid positions in the THCAS polypeptide comprising the amino acid sequence of SEQ ID NO:44 that, when substituted, yield an engineered variant with improved properties. In one aspect of the disclosure, the substitution is at a position corresponding to a position in the THCAS polypeptide of SEQ ID NO:44 from cannabis. The THCAS polypeptide of SEQ ID NO:44 from cannabis comprises the following domains:
1. Signal polypeptide: amino acids 1-28.
2. FAD binding domain: amino acids 77-251.
3. BBE domain: amino acids 480-.
SEQ ID NO:44 also comprises the following surface-exposed amino acids: 28-33, 35, 36, 39-45, 47-50, 52, 55-59, 61, 62, 65, 66, 69, 71-77, 79, 80, 82, 88, 89, 90, 94, 98, 101, 102, 104, 109, 114, 115, 124, 125, 126, 133, 134, 136-, 402. 403, 405, 406, 408, 409, 410, 413, 422, 424, 430, 437, 438, 444, 446, 448, 450, 451, 454, 456, 457, 460, 463, 464, 467, 468, 470, 471, 472, 475, 478, 483, 484, 487, 488, 491, 493, 502, 504, 505, 508, 509, 513, 516, 517, 520, 524, 525, 527, 528, 530, 532, and 540, 545.
Residue positions in the engineered variants discussed herein are identified relative to a reference amino acid sequence, the THCAS polypeptide of SEQ ID NO:44 from cannabis (shown in Table 1; UniProtKB/Swiss-Prot: Q8GTB6). Thus, reference to the amino acid identified by "F317" is to the 317th amino acid from the N-terminus in the THCAS polypeptide of SEQ ID NO:44 from cannabis, wherein methionine is the first amino acid. In the THCAS polypeptide of SEQ ID NO:44 from cannabis, amino acid 317 is phenylalanine (F). Those skilled in the art understand that the F317 amino acid can have different positions in THCAS polypeptides from different species or in different isoforms. Engineered variants with substitutions at such corresponding positions are intended to be encompassed by the present disclosure.
Polypeptide sequence positions at which a particular amino acid or amino acid change ("residue difference") occurs are sometimes described herein as "Xn" or "position n," where n refers to the amino acid position relative to a reference sequence. Thus, reference to the amino acid identified by "X317" is to the 317th amino acid from the N-terminus in the THCAS polypeptide of SEQ ID NO:44 from cannabis.
Specific substitution mutations, which are substitutions of specific amino acids in a reference sequence by different specified residues, can be represented by the conventional notation "X(number)Y", where X is the one-letter identifier of the amino acid in the reference sequence, "number" is the amino acid position in the reference sequence, and Y is the one-letter identifier of the substituting amino acid in the engineered sequence. Thus, reference to a substitution identified by "F317Y" is to a substitution of the 317th amino acid from the N-terminus, phenylalanine, with tyrosine in the THCAS polypeptide of SEQ ID NO:44 from cannabis.
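The "X(number)Y" notation can be parsed mechanically into a reference residue, a 1-based position, and a replacement residue, and then applied to a sequence after checking that the reference residue really occupies that position. The following Python sketch illustrates this with a short toy sequence rather than SEQ ID NO:44; `parse_substitution` and `apply_substitution` are hypothetical helper names introduced for this illustration.

```python
import re

# One-letter reference residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split a code such as 'F317Y' into ('F', 317, 'Y')."""
    match = _SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitution(sequence, code):
    """Return `sequence` with the substitution applied.

    Positions are counted from the N-terminus starting at 1, as in the text.
    Raises ValueError if the reference residue does not match the sequence.
    """
    reference, position, replacement = parse_substitution(code)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    found = sequence[position - 1]
    if found != reference:
        raise ValueError(
            f"expected {reference} at position {position}, found {found}"
        )
    return sequence[: position - 1] + replacement + sequence[position:]


if __name__ == "__main__":
    print(parse_substitution("F317Y"))
    # Toy sequence for illustration only; position 3 is phenylalanine.
    print(apply_substitution("MAFK", "F3Y"))
```

Checking the reference residue before substituting catches off-by-one numbering errors, for example when positions are counted against a sequence that lacks the signal polypeptide.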
Cannabinoid synthase polypeptides are secreted polypeptides with structural features that can hinder expression in modified host cells, such as modified yeast cells. The cannabinoid synthase polypeptide comprises a disulfide bond, a number of glycosylation sites, including N-glycosylation sites, and a covalently attached flavin adenine dinucleotide (FAD) cofactor moiety. Thus, reconstituting the activity of or expressing a cannabinoid synthase polypeptide in a modified host cell, such as a modified yeast cell, can be challenging and unreliable. Often these secreted polypeptides are misfolded or mislocalized, resulting in low expression, lack of activity of the polypeptide, reduced host cell viability, and/or cell death. As disclosed herein, the engineered variants may have improved expression, folding, and enzymatic activity as compared to the THCAS polypeptide comprising the amino acid sequence of SEQ ID NO:44. In addition, expression of the engineered variants of the disclosure may enhance the viability of the modified host cells disclosed herein as compared to a modified host cell expressing a THCAS polypeptide comprising the amino acid sequence of SEQ ID NO:44.
The present disclosure provides engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides comprising the amino acid sequence of SEQ ID NO:44 with one or more amino acid substitutions. In certain such embodiments, the engineered variant comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID No. 44. In some embodiments, the engineered variant comprises an amino acid sequence having at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 44.
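As a non-limiting illustration of the sequence identity thresholds recited above, the following sketch computes percent identity between two sequences that have already been aligned. It rests on stated assumptions: the function name `percent_identity` is hypothetical, gaps are written as "-", and identity is taken as matching columns divided by all alignment columns, which is only one of several conventions in use. It does not perform the alignment itself.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment (gaps written as '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With the toy alignment `"MKFLA"` against `"MKYLA"`, four of five columns match, so `meets_threshold("MKFLA", "MKYLA", 75.0)` holds while a threshold of 85.0 is not met.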
The present disclosure provides engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides comprising an amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions, wherein the engineered variants comprise at least one amino acid substitution in a signal polypeptide, a Flavin Adenine Dinucleotide (FAD) binding domain, a Berberine Bridge Enzyme (BBE) domain, or a combination of the foregoing. In some embodiments, at least one amino acid substitution is present in the signal polypeptide. In certain such embodiments, the engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, or at least 15 amino acid substitutions in the signal polypeptide. In some embodiments, the engineered variant comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 amino acid substitutions in the signal polypeptide. In some embodiments, there is at least one amino acid substitution in the FAD binding domain. In certain such embodiments, the engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, or at least 15 amino acid substitutions in the FAD binding domain. In some embodiments, the engineered variant comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 amino acid substitutions in the FAD binding domain. In some embodiments wherein there is at least one amino acid substitution in the FAD domain, the engineered variant comprises at least one amino acid substitution at an amino acid selected from the group consisting of: x100, X103, X109, X124, X125, X132, X137, X143, X149, X161, X165, X167, X168, X170, X171, X172, X175, X180, X196, X208, X235, and X250. 
In some embodiments wherein there is at least one amino acid substitution in the FAD binding domain, the engineered variant comprises at least one amino acid substitution at an amino acid selected from the group consisting of: S100, V103, T109, Q124, V125, L132, S137, H143, V149, W161, K165, E167, N168, S170, F171, P172, Y175, G180, N196, H208, G235, and A250. In some embodiments wherein there is at least one amino acid substitution in the FAD binding domain, the engineered variant comprises at least one amino acid substitution selected from the group consisting of: L132M, S170T, F171I, N196T, N196Q and N196V. In some embodiments, there is at least one amino acid substitution in the BBE domain. In certain such embodiments, the engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, or at least 15 amino acid substitutions in the BBE domain. In some embodiments, the engineered variant comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 amino acid substitutions in the BBE domain. In some embodiments wherein there is at least one amino acid substitution in the BBE domain, the engineered variant comprises at least one amino acid substitution at an amino acid selected from the group consisting of: X500 and X528. In some embodiments wherein there is at least one amino acid substitution in the BBE domain, the engineered variant comprises at least one amino acid substitution at an amino acid selected from the group consisting of: Y500 and N528. In some embodiments wherein there is at least one amino acid substitution in the BBE domain, the engineered variant comprises at least one amino acid substitution selected from the group consisting of: Y500M, Y500V and N528E.
The present disclosure provides engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides comprising an amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions, wherein the engineered variants comprise a substitution of at least one surface exposed amino acid. In certain such embodiments, at least one surface exposed hydrophobic amino acid is substituted with a hydrophilic amino acid. In some embodiments, at least one surface exposed hydrophilic amino acid is substituted with a hydrophobic amino acid. In some embodiments, the engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, or at least 15 surface exposed amino acid substitutions. In some embodiments, the engineered variant comprises substitutions of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 surface exposed amino acids. In some embodiments wherein the engineered variant comprises at least one surface exposed amino acid substitution, the engineered variant comprises at least one amino acid substitution at an amino acid selected from the group consisting of: X132, X170, X171, X196, X261, X269, X317 and X539. In some embodiments wherein the engineered variant comprises at least one surface exposed amino acid substitution, the engineered variant comprises at least one amino acid substitution at an amino acid selected from the group consisting of: L132, S170, F171, N196, K261, L269, F317 and P539. In some embodiments wherein the engineered variant comprises at least one surface exposed amino acid substitution, the engineered variant comprises at least one amino acid substitution selected from the group consisting of: L132M, S170T, F171I, N196T, N196Q, N196V, K261C, L269I, F317Y and P539T.
Substitution of surface exposed hydrophobic amino acids with hydrophilic amino acids can increase the hydrophilicity of the solvent exposed amino acids, which can improve the solubility of the engineered variants of the disclosure in aqueous (non-trichome) environments.
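To illustrate how a change in hydrophilicity at a substituted position might be estimated, the following sketch compares the reference and substituting residues on the Kyte-Doolittle hydropathy scale. The choice of this scale is an assumption made for illustration; the disclosure does not specify how hydrophobicity is scored, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index; higher values are more hydrophobic.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def hydropathy_change(label: str) -> float:
    """Hydropathy of the new residue minus that of the reference residue.

    The label uses the substitution notation, e.g. 'F317Y'.
    """
    ref, alt = label[0], label[-1]
    return KYTE_DOOLITTLE[alt] - KYTE_DOOLITTLE[ref]


def direction(label: str, tolerance: float = 1e-9) -> str:
    """Classify a substitution by the sign of its hydropathy change."""
    delta = hydropathy_change(label)
    if delta < -tolerance:
        return "more hydrophilic"
    if delta > tolerance:
        return "more hydrophobic"
    return "unchanged"
```

On this scale, F317Y lowers the hydropathy index at position 317 (phenylalanine to tyrosine), whereas K261C raises it. A single-residue index ignores structural context, so the result is at most a first-pass indication.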
The present disclosure provides an engineered variant, wherein the engineered variant comprises at least one amino acid substitution at an amino acid selected from the group consisting of: X31, X43, X49, X50, X51, X55, X56, X59, X61, X62, X71, X100, X103, X109, X124, X125, X132, X137, X143, X149, X161, X165, X168, X167, X170, X171, X172, X175, X180, X196, X208, X235, X250, X257, X261, X269, X311, X317, X327, X390, X379, X429, X467, X500, X528, X539, X542, X543, X544 and X545. Such engineered variants may produce THCA from CBGA in an amount, in mg/L or mM, greater than the amount of THCA produced from CBGA by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions. In some embodiments, such engineered variants may produce THCA from CBGA in an amount, in mg/L or mM, greater than the amount of THCA produced from CBGA by the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions, and may produce THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced by the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions.
The present disclosure provides an engineered variant, wherein the engineered variant comprises at least one amino acid substitution at an amino acid selected from the group consisting of: R31, P43, P49, K50, L51, Q55, H56, L59, M61, S62, L71, S100, V103, T109, Q124, V125, L132, S137, H143, W161, K165, N168, E167, Y175, G180, N196, H208, A250, I257, K261, G311, F317, L327, K390, T379, D429, Y500, N528, P542, H543, H544 and H545. Such engineered variants may produce THCA from CBGA in an amount, in mg/L or mM, greater than the amount of THCA produced from CBGA by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions. In some embodiments, such engineered variants may produce THCA from CBGA in an amount, in mg/L or mM, greater than the amount of THCA produced from CBGA by the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions, and may produce THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced by the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions.
The present disclosure provides an engineered variant, wherein the engineered variant comprises at least one amino acid substitution selected from the group consisting of: R31, P43, P49, K50, L51, Q55, H56, L59, M61, S62, L71, S100, V103, T109, Q124, V125, L132, S137, H143, W161, K165, N168, E167, Y175, G180, N196, H208, A250, I257, K261, G311, F317, L327, K390, T379, Y500, N528, P542, H543, H544, and H545. Such engineered variants may produce THCA from CBGA in an amount, in mg/L or mM, greater than the amount of THCA produced from CBGA by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions. In some embodiments, such engineered variants may produce THCA from CBGA in an amount, in mg/L or mM, greater than the amount of THCA produced from CBGA by the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions, and may produce THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced by the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions.
The present disclosure provides an engineered variant, wherein the engineered variant comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 50, 52, 54, 56, 58, 60, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 124, 126, 128, 130, 132, 134, 138, 140, 142, 144, 146, 148, 152, 156, 158, 160, 164, 166, 168, 170, 172, 174, 176, 178, 182, 184 and 186. Such engineered variants may produce THCA from CBGA in an amount, in mg/L or mM, greater than the amount of THCA produced from CBGA by the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions. In some embodiments, such engineered variants may produce THCA from CBGA in an amount, in mg/L or mM, greater than the amount of THCA produced from CBGA by the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions, and may produce THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced by the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions.
The present disclosure provides engineered variants, wherein the engineered variants comprise an amino acid sequence of SEQ ID No. 44 having at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 amino acid substitutions. The present disclosure provides an engineered variant, wherein the engineered variant comprises an amino acid sequence of SEQ ID No. 44 having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acid substitutions. Combinations of amino acid substitutions as described herein can be generated and the resulting engineered variants screened for improved tetrahydrocannabinolic acid synthase (THCAS) properties. Engineered variants comprising combinations of all substitutions described herein are intended to be encompassed by the present disclosure. In some embodiments, an engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 amino acid substitutions described herein. In some embodiments, the engineered variant comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 of the amino acid substitutions described herein (e.g., 1-30 of the amino acid substitutions described herein). 
In some embodiments, the engineered variant comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 amino acid substitutions described herein (e.g., 1-15 amino acid substitutions described herein). In some embodiments, the engineered variant comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 of the amino acid substitutions described herein (e.g., 1-10 of the amino acid substitutions described herein). In some embodiments, the engineered variant comprises 1, 2, 3, 4, or 5 amino acid substitutions described herein (e.g., 1-5 amino acid substitutions described herein). In some embodiments, the engineered variant comprises 1, 2, 3, or 4 amino acid substitutions described herein (e.g., 1-4 amino acid substitutions described herein). In some embodiments, the engineered variant comprises 1, 2, or 3 amino acid substitutions described herein (e.g., 1-3 amino acid substitutions described herein). In some embodiments, the engineered variant comprises 1 or 2 amino acid substitutions described herein (e.g., 1-2 amino acid substitutions described herein). In some embodiments, the engineered variant comprises 1 amino acid substitution described herein. In some embodiments, the engineered variant comprises 2 amino acid substitutions described herein. In some embodiments, the engineered variant comprises 3 amino acid substitutions described herein. In some embodiments, the engineered variant comprises 4 amino acid substitutions described herein. In some embodiments, the engineered variant comprises 5 of the amino acid substitutions described herein.
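As a hedged illustration of how combinations of substitutions might be enumerated before screening, the sketch below builds every set of k substitutions from the surface exposed substitutions listed herein, skipping sets that would place two different residues at the same position (for example N196T together with N196Q). The function names are hypothetical, and no screening method is implied.

```python
from itertools import combinations
from typing import Iterator, Sequence, Tuple

# Surface exposed substitutions listed in the disclosure.
SURFACE_SUBSTITUTIONS = [
    "L132M", "S170T", "F171I", "N196T", "N196Q",
    "N196V", "K261C", "L269I", "F317Y", "P539T",
]


def position(label: str) -> int:
    """Extract the 1-based position from a label such as 'N196T'."""
    return int(label[1:-1])


def variant_sets(labels: Sequence[str], k: int) -> Iterator[Tuple[str, ...]]:
    """Yield every set of k substitutions with no position used twice."""
    for combo in combinations(labels, k):
        if len({position(label) for label in combo}) == k:
            yield combo
```

For the ten substitutions above, which fall at eight distinct positions, this yields 10 single, 42 double, and 98 triple variants. The count grows quickly with k, which is one reason a library might be restricted to a subset of positions.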
The present disclosure provides engineered variants, wherein the engineered variants comprise at least one invariant amino acid. The present disclosure provides engineered variants, wherein the engineered variants comprise at least one invariant amino acid in a Flavin Adenine Dinucleotide (FAD) binding domain, a Berberine Bridge Enzyme (BBE) domain, or a combination of the foregoing.
In some embodiments, the engineered variant comprises at least one invariant amino acid in the FAD binding domain. In certain such embodiments, the engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, or at least 15 invariant amino acids in the FAD binding domain. In some embodiments, the engineered variant comprises at least one invariant amino acid in the FAD binding domain, wherein the at least one invariant amino acid is selected from the group consisting of: X87, X93, X99, X108, X110, X112, X117, X118, X120, X126, X127, X131, X141, X148, X152, X153, X155, X156, X157, X159, X160, X163, X170, X171, X172, X173, X174, X176, X177, X178, X179, X182, X183, X184, X185, X187, X188, X189, X190, X191, X192, X193, X195, X201, X202, X205, X206, X210, X214, X223, X225, X226, X227, X228, X231, X234, X237, X238, X239, X246, X248, and X251. In some embodiments, mutation of one or more of these invariant amino acids reduces the titer of one or more cannabinoids. In some embodiments, mutation of one or more of amino acids X170, X171, and/or X172 reduces the titer of one or more cannabinoids. In some embodiments wherein the engineered variant comprises at least one invariant amino acid in the FAD binding domain, the at least one invariant amino acid is selected from the group consisting of: P87, I93, C99, R108, R110, G112, E117, G118, S120, P126, F127, D131, D141, W148, G152, A153, L155, G156, E157, Y159, Y160, N163, S170, F171, P172, G173, G174, C176, P177, T178, V179, G182, G183, H184, F185, G187, G188, G189, Y190, G191, A192, L193, R195, A201, D202, I205, D206, V210, G214, G223, D225, L226, F227, W228, R231, G234, N237, F239, G238, K245, I246, L248, and V251. In some embodiments, mutation of one or more of amino acids S170, F171, and/or P172 reduces the titer of one or more cannabinoids.
In some embodiments, the engineered variant comprises at least one invariant amino acid in the BBE domain. In certain such embodiments, the engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, or at least 15 invariant amino acids in the BBE domain. In some embodiments wherein the engineered variant comprises at least one invariant amino acid in the BBE domain, the at least one invariant amino acid is selected from the group consisting of: X485, X499, X503, X514, X515, X522, X529, X530, X534, X535, and X536. In some embodiments wherein the engineered variant comprises at least one invariant amino acid in the BBE domain, the at least one invariant amino acid is selected from the group consisting of: R485, N499, A503, N514, F515, K522, N529, F530, E534, Q535, and S536.
The present disclosure provides an engineered variant, wherein the engineered variant comprises at least one invariant amino acid selected from the group consisting of: X28, X34, X35, X37, X64, X70, X87, X93, X99, X108, X110, X112, X117, X118, X120, X126, X127, X131, X141, X148, X152, X153, X155, X156, X157, X159, X160, X163, X173, X174, X176, X177, X178, X179, X182, X183, X184, X185, X187, X188, X189, X190, X191, X192, X193, X195, X201, X202, X205, X206, X210, X214, X223, X225, X226, X227, X228, X231, X234, X237, X238, X239, X245, X246, X248, X251, X260, X313, X314, X420, X185, X420, X414, X420, X414, X420, X2, X414, X420, X414, X123, X414, X123, X414, X123, X414, X2, X414. In certain such embodiments, the engineered variant comprises at least one invariant amino acid selected from the group consisting of: X37, X70, X93, X99, X117, X120, X127, X131, X156, X157, X159, X174, X176, X182, X183, X185, X187, X188, X189, X190, X191, X192, X195, X202, X206, X214, X228, X234, X238, X248, X277, X314, X324, X355, X382, X384, X386, X420, X423, X436, X441, X444, X445, X472, X477, X514, X515, X529, and X535. In some embodiments, the engineered variant comprises at least one invariant amino acid selected from the group consisting of: A28, F34, L35, C37, L64, N70, P87, I93, C99, R108, R110, G112, E117, G118, S120, P126, F127, D131, D141, W148, G152, A153, L155, G156, E157, Y159, Y160, N163, A173, G174, C176, P177, T178, V179, G182, G183, H184, F185, G187, G188, G189, Y190, G191, P192, L193, R195, A201, D202, I205, D206, V210, G214, G223, D225, L226, F227, W228, R231, G234, S237, F238, G239, G245, I246, L245, L251, V260, Q313, F313, S314, F227, W228, R231, G234, S237, S248, F245, I520, N420, N520, P420, P185, N185, P444, N185, P18, N185, P2, P18, L2, P68, P2, L123, P2, L123, L2, L123, P2.
In certain such embodiments, the engineered variant comprises at least one invariant amino acid selected from the group consisting of: C37, N70, I93, C99, E117, S120, F127, D131, G156, E157, Y159, G174, C176, G182, G183, F185, G187, G188, G189, Y190, G191, P192, R195, D202, D206, G214, W228, G234, F238, L248, Q277, S314, L324, S355, K382, K384, D386, G420, M423, R436, Y441, W444, Y445, Y472, P477, N514, F515, N529, and Q535.
The present disclosure provides an engineered variant, wherein the engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, or at least 25 invariant amino acids, provided that the engineered variant has at least one amino acid substitution as compared to SEQ ID NO: 44. Engineered variants having a combination of invariant amino acids and substitutions as described herein can be generated and screened for improved tetrahydrocannabinolic acid synthase (THCAS) properties. Engineered variants comprising combinations of all substituted and invariant amino acids described herein are intended to be encompassed by the present disclosure.
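By way of example only, a proposed set of substitutions can be checked against a list of invariant amino acids before a variant is constructed. The sketch below uses the invariant amino acids of the BBE domain listed herein; the function names are hypothetical, and the label "N499A" in the usage note is an invented substitution used solely to show a conflict.

```python
from typing import Iterable, List, Set

# Invariant amino acids of the BBE domain as listed in the disclosure.
BBE_INVARIANTS = [
    "R485", "N499", "A503", "N514", "F515", "K522",
    "N529", "F530", "E534", "Q535", "S536",
]


def invariant_positions(labels: Iterable[str]) -> Set[int]:
    """Positions named by labels such as 'R485' (residue letter + number)."""
    return {int(label[1:]) for label in labels}


def conflicts(substitutions: Iterable[str], invariants: Iterable[str]) -> List[str]:
    """Return the substitutions (e.g. 'Y500M') that hit an invariant position."""
    protected = invariant_positions(invariants)
    return [sub for sub in substitutions if int(sub[1:-1]) in protected]
```

Here `conflicts(["Y500M", "N528E", "N499A"], BBE_INVARIANTS)` flags only the hypothetical "N499A", because positions 500 and 528 are not in the invariant list while position 499 is.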
The present disclosure provides engineered variants, wherein the engineered variants comprise at least one amino acid substitution at the C-terminus. In certain such embodiments, a hydrophilic amino acid is substituted with a hydrophobic amino acid. In some embodiments wherein the engineered variant comprises at least one amino acid substitution at the C-terminus, a hydrophobic amino acid is substituted with a hydrophilic amino acid. Such engineered variants may produce THCA from CBGA in an amount, in mg/L or mM, greater than the amount of THCA produced from CBGA by the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions.
The present disclosure provides engineered variants, wherein the engineered variants comprise truncations at the N-terminus, the C-terminus, or at both the N-terminus and the C-terminus. In some embodiments, the engineered variant comprises a truncation at the N-terminus. In some embodiments, the engineered variant comprises a truncation at the C-terminus. In some embodiments, the engineered variant comprises truncations at both the N-terminus and the C-terminus. In some embodiments, the engineered variant lacks the native signal polypeptide (i.e., amino acids 1-28 of SEQ ID NO: 44).
In some embodiments, the engineered variant comprises a truncation at the N-terminus, C-terminus, or both the N-terminus and C-terminus, and comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID No. 44. In some embodiments, the engineered variant comprises a truncation at the N-terminus, C-terminus, or both the N-terminus and C-terminus, and comprises an amino acid sequence having at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 44.
In some embodiments, the engineered variant comprises a truncation of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 amino acids at the C-terminus. In some embodiments, the engineered variant comprises a truncation of at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, or at least 20 amino acids at the C-terminus. In some embodiments, the engineered variant comprises a truncation of at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 amino acids at the C-terminus. In some embodiments, the engineered variant comprises a truncation of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids at the C-terminus (e.g., 1-10 amino acids at the C-terminus). In some embodiments, the engineered variant comprises a truncation of 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids at the C-terminus (e.g., 11-20 amino acids at the C-terminus). In some embodiments, the engineered variant comprises a truncation of 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acids at the C-terminus (e.g., 21-30 amino acids at the C-terminus).
In some embodiments, the engineered variant comprises a truncation of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 amino acids at the N-terminus. In some embodiments, the engineered variant comprises a truncation of at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, or at least 20 amino acids at the N-terminus. In some embodiments, the engineered variant comprises a truncation of at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 amino acids at the N-terminus. In some embodiments, the engineered variant comprises a truncation of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids at the N-terminus (e.g., 1-10 amino acids at the N-terminus). In some embodiments, the engineered variant comprises a truncation of 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acids at the N-terminus (e.g., 11-20 amino acids at the N-terminus). In some embodiments, the engineered variant comprises a truncation of 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acids at the N-terminus (e.g., 21-30 amino acids at the N-terminus).
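The N-terminal and C-terminal truncations described above can be sketched as simple sequence slicing. The example below is illustrative only: `truncate` is a hypothetical helper, the ten-residue sequence in the usage note is a toy stand-in, and the constant reflects the statement herein that the native signal polypeptide corresponds to amino acids 1-28 of SEQ ID NO:44.

```python
# Length of the native signal polypeptide (amino acids 1-28 of SEQ ID NO:44).
NATIVE_SIGNAL_LENGTH = 28


def truncate(sequence: str, n_term: int = 0, c_term: int = 0) -> str:
    """Remove n_term residues from the N-terminus and c_term from the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("truncation lengths must be non-negative")
    if n_term + c_term >= len(sequence):
        raise ValueError("truncation would remove the entire sequence")
    # Slice with an explicit end index so that c_term == 0 keeps the full tail.
    return sequence[n_term:len(sequence) - c_term]


def remove_native_signal(sequence: str) -> str:
    """Drop the native signal polypeptide from a full-length sequence."""
    return truncate(sequence, n_term=NATIVE_SIGNAL_LENGTH)
```

For the toy sequence `"MKTAYIAKQR"`, `truncate("MKTAYIAKQR", n_term=2, c_term=3)` removes two residues from the N-terminus and three from the C-terminus. Positions in a truncated variant are still conventionally numbered against the full-length reference, so any substitution labels should be applied before truncation or re-indexed afterwards.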
In some embodiments, the truncated engineered variants of the disclosure may comprise a signal polypeptide. In certain such embodiments, the truncated engineered variants lack the native signal polypeptide. In some embodiments, the signal polypeptide is a secretory signal polypeptide. In some embodiments, the secretion signal polypeptide is a native secretion signal polypeptide. In some embodiments, the secretion signal polypeptide is a synthetic secretion signal polypeptide. In some embodiments, the secretion signal polypeptide is an endoplasmic reticulum retention signal polypeptide. In certain such embodiments, the endoplasmic reticulum retention signal polypeptide is an HDEL polypeptide or a KDEL polypeptide. In some embodiments, the secretion signal polypeptide is a mitochondrial targeting signal polypeptide. In some embodiments, the secretion signal polypeptide is a Golgi targeting signal polypeptide. In some embodiments, the secretion signal polypeptide is a vacuolar localization signal polypeptide. In certain such embodiments, the vacuolar localization signal polypeptide is a PEP4t polypeptide or a PRC1t polypeptide. In certain such embodiments, the vacuolar localization signal polypeptide is a PEP4t polypeptide. In some embodiments, the secretion signal polypeptide is a plasma membrane localization signal polypeptide. In some embodiments, the secretion signal polypeptide is a peroxisome-targeting signal polypeptide. In some embodiments, the peroxisome-targeting signal polypeptide is a PEX8 polypeptide. In some embodiments, the secretion signal polypeptide is a mating factor secretion signal polypeptide (e.g., an MF polypeptide or an evolved MF polypeptide (MFev)). In some embodiments, the signal polypeptide is linked to the N-terminus of the engineered variant.
In some embodiments, the truncated engineered variants of the disclosure may comprise a membrane anchor. The membrane anchor may be a sequence that inserts into a membrane of the cell and anchors the attached polypeptide therein. The membrane anchor may reside in a membrane at the exterior of the cell (e.g., a GPI polypeptide) or in a membrane inside the cell (e.g., a tail anchor or an ER anchor). Examples of membrane anchors include, but are not limited to, glycosylphosphatidylinositol membrane anchors (GPI polypeptides, such as AGA1), CAAX box polypeptides (prenylated, such as RAS1), and tail anchor polypeptides with a hydrophobic C-terminus (e.g., phosphatidylinositol 4,5-bisphosphate 5-phosphatase (INP54), which has a hydrophobic tail anchor in the ER membrane, or synaptobrevin 2 (VAMP2), which has a hydrophobic poly-I tail anchor in the vesicle membrane).
The present disclosure provides engineered variants, wherein the engineered variants comprise additions and/or deletions of one or more amino acids.
Engineered variants of the THCAS polypeptide may be prepared and screened for improved properties, such as an amount of THCA produced from CBGA that is greater than the amount of THCA produced from CBGA by the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID No. 44, in mg/L or mM under similar conditions for the same length of time. In addition, engineered variants of THCAS polypeptides may be prepared and screened for improved properties, such as production of THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions. Similar conditions may refer to reaction conditions at the same temperature, pH, buffer and/or fermentation conditions, and in the same medium and/or reaction solvent.
In some embodiments of the disclosure, the engineered variant produces tetrahydrocannabinolic acid (THCA) from cannabigerolic acid (CBGA) in an amount, in mg/L or mM, that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of THCA produced from CBGA by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions.
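For illustration, the comparison of amounts described above reduces to a percent increase over the reference polypeptide. The sketch below is an assumption-laden example rather than a prescribed assay: the function names are hypothetical, the titers in the usage note are invented numbers, and the molar mass used for converting mg/L to mM is that of THCA (C22H30O4, about 358.48 g/mol).

```python
# Molar mass of THCA, C22H30O4, in g/mol.
THCA_MOLAR_MASS = 358.48


def mg_per_l_to_mm(titer_mg_per_l: float, molar_mass: float = THCA_MOLAR_MASS) -> float:
    """Convert a titer in mg/L to mM (mg/L divided by g/mol gives mmol/L)."""
    return titer_mg_per_l / molar_mass


def percent_increase(variant_titer: float, reference_titer: float) -> float:
    """Percent by which the variant titer exceeds the reference titer.

    Both titers must be in the same unit and measured over the same
    length of time under similar conditions.
    """
    if reference_titer <= 0:
        raise ValueError("reference titer must be positive")
    return 100.0 * (variant_titer - reference_titer) / reference_titer
```

With invented titers of 150 mg/L for a variant and 100 mg/L for the reference, `percent_increase(150.0, 100.0)` gives a 50% increase. Because the percentage is a ratio of two titers of the same compound, it is the same whether both are expressed in mg/L or in mM.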
In some embodiments of the disclosure, the engineered variant produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
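The product ratio described above can be computed directly from two measured titers. The following sketch is illustrative only; the function names are hypothetical and the titers in the usage note are invented. It assumes both titers are in the same unit.

```python
def product_ratio(thca_titer: float, other_titer: float) -> float:
    """Ratio of THCA to another cannabinoid (e.g., CBCA), as a single number."""
    if other_titer <= 0:
        raise ValueError("titer of the other cannabinoid must be positive")
    return thca_titer / other_titer


def format_ratio(ratio: float) -> str:
    """Render a ratio in the 'n:1' form used in the text."""
    return f"{ratio:g}:1"


def ratio_is_increased(variant_ratio: float, reference_ratio: float) -> bool:
    """True if the variant gives a higher THCA ratio than the reference."""
    return variant_ratio > reference_ratio
```

For invented titers of 23 mg/L THCA and 2 mg/L CBCA, `format_ratio(product_ratio(23.0, 2.0))` renders the ratio in the form used above. THCA and CBCA share the molecular formula C22H30O4, so for this pair a ratio of mass titers equals the ratio of molar titers.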
These improved properties can be assessed by the conversion of CBGA to THCA, or alternatively the conversion of another precursor to the desired cannabinoid or cannabinoid derivative, either in vitro using isolated and/or purified engineered variants of the disclosure or in vivo using modified host cells expressing the engineered variants. In some embodiments, the modified host cell expresses a polypeptide involved in the MEV pathway and/or a polypeptide involved in cannabinoid biosynthesis and/or comprises a modification of the secretory pathway. It is contemplated that engineered variants of the present disclosure having varying degrees of stability, solubility, activity and/or expression levels under one or more test conditions will be useful for the production of cannabinoids or cannabinoid derivatives in a variety of host cells.
In addition, engineered variants of the THCAS polypeptide may be prepared and screened for improved properties, such as, for example, production by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding the engineered variant of an amount of cannabinoids or cannabinoid derivatives, in mg/L or mM, that is greater than the amount produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking the nucleic acid comprising the nucleotide sequence encoding the engineered variant, when grown for the same length of time under similar culture conditions.
In addition, engineered variants of the THCAS polypeptide may be prepared and screened for improved properties, such as, for example, a faster growth rate and/or a higher biomass yield compared to the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown under similar culture conditions for the same length of time. In addition, engineered variants of the THCAS polypeptide may be prepared and screened for improved properties, such as, for example, production of THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, when grown for the same length of time under similar culture conditions. Similar culture conditions may refer to host cells grown in the same medium under the same temperature, pH, and/or fermentation conditions.
Moreover, engineered variants of the THCAS polypeptide can be prepared and screened for improved properties, such as, for example, no significant decrease in growth or viability of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding the engineered variant, as compared to a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, when grown under similar culture conditions for the same length of time. In addition, engineered variants of the THCAS polypeptide may be prepared and screened for improved properties, such as, for example, no significant reduction in growth or viability of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding the engineered variant as compared to an unmodified host cell.
Nucleic acids comprising nucleotide sequences encoding engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides, and expression vectors and constructs
The present disclosure provides nucleic acids comprising nucleotide sequences encoding engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides disclosed herein, as well as expression vectors and constructs comprising the same.
The present disclosure provides nucleic acids comprising nucleotide sequences encoding engineered variants of the disclosure. Some embodiments of the present disclosure relate to nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure comprising an amino acid sequence set forth in SEQ ID NO:50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 176, 178, 180, 182, 184, or 186. In some embodiments, the nucleotide sequence is codon optimized.
Some embodiments of the present disclosure relate to nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure comprising an amino acid sequence set forth in SEQ ID NO:50, 52, 54, 56, 58, 60, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 124, 126, 128, 130, 132, 134, 138, 140, 142, 144, 146, 148, 152, 156, 158, 160, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, or 186. In some embodiments, the nucleotide sequence is codon optimized.
The present disclosure also provides a nucleic acid comprising a nucleotide sequence encoding an engineered variant, wherein the nucleotide sequence is a nucleotide sequence set forth in SEQ ID NO:49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, or 185. In some embodiments, the nucleotide sequence is codon optimized.
The present disclosure provides nucleic acids comprising a nucleotide sequence encoding an engineered variant, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID NO:49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 151, 155, 157, 159, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, or 185, or a codon degenerate nucleotide sequence of any of the foregoing. In some embodiments, the nucleotide sequence is codon optimized.
The present disclosure provides a nucleic acid comprising a nucleotide sequence encoding an engineered variant, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID NO:49, 51, 53, 55, 57, 59, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 123, 125, 127, 129, 131, 133, 137, 139, 141, 143, 145, 147, 151, 155, 157, 159, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, or 185, or a codon degenerate nucleotide sequence of any of the foregoing. In some embodiments, the nucleotide sequence is codon optimized.
The present disclosure provides a nucleic acid comprising a nucleotide sequence encoding an engineered variant, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID NO:49, 51, 53, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 151, 155, 157, 159, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, or 185. In some embodiments, the nucleotide sequence is codon optimized.
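As an illustration of what a codon degenerate nucleotide sequence means, the following minimal Python sketch translates two nucleotide sequences with the standard genetic code and checks whether they encode the same polypeptide. The function names and the short example sequences are hypothetical and are not taken from the sequence listing; the sketch assumes the standard genetic code applies to the host in question.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate a coding sequence (DNA or RNA letters) into one-letter amino acids."""
    seq = nucleotides.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def encode_same_polypeptide(seq_a: str, seq_b: str) -> bool:
    """True when both nucleotide sequences translate to the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)


if __name__ == "__main__":
    # Hypothetical 3-codon fragments: they differ at two wobble positions only.
    original = "ATGGCTAAA"
    recoded = "ATGGCCAAG"
    print(translate(original), translate(recoded))
    print(encode_same_polypeptide(original, recoded))
```

A codon-optimized sequence is one particular codon degenerate sequence, chosen so that its codons match the usage preferences of the intended host cell.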
Also included are nucleic acids that hybridize to the nucleic acids disclosed herein. Hybridization conditions can be stringent in that hybridization will occur only if there is at least 90%, at least 95%, or at least 97% sequence identity to the nucleotide sequence present in the nucleic acid encoding a polypeptide disclosed herein. Stringent conditions may include those used for known Southern hybridization, for example, incubation at 42 °C overnight in a solution with 50% formamide, 5× SSC (where 1× SSC is 150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured, sheared salmon sperm DNA, followed by washing the hybridization support in 0.1× SSC at about 65 °C. Other suitable hybridization conditions are well known and described in Sambrook et al., Molecular Cloning: A Laboratory Manual, third edition, Cold Spring Harbor, N.Y. (2001).
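The salt concentrations implied by the SSC strengths named above can be worked out with simple scaling. The sketch below assumes the standard definition of 1× SSC as 150 mM NaCl and 15 mM trisodium citrate; the function name is hypothetical and the calculation is only a convenience for checking buffer recipes.

```python
# Composition of 1x SSC in millimolar (standard definition, assumed here).
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_concentrations(fold: float) -> dict:
    """Return the millimolar concentration of each SSC component at the given strength."""
    if fold <= 0:
        raise ValueError("fold strength must be positive")
    return {component: mm * fold for component, mm in SSC_1X_MM.items()}


if __name__ == "__main__":
    # Hybridization step (5x SSC) and stringent wash step (0.1x SSC).
    for strength in (5, 0.1):
        print(strength, ssc_concentrations(strength))
```

Under this definition, the 5× SSC hybridization solution contains 750 mM NaCl and 75 mM trisodium citrate, and the low-salt 0.1× SSC wash is what makes the wash step stringent.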
The length of the nucleic acids disclosed herein may depend on the intended use. For example, if the intended use is as a primer or probe, e.g., for PCR amplification or for screening libraries, the nucleic acid will be less than the full-length sequence, e.g., 15-50 nucleotides in length. In certain such embodiments, the primer or probe may be substantially identical to a highly conserved region of the nucleotide sequence, or may be substantially identical to the 5 'or 3' end of the nucleotide sequence. In some cases, these primers or probes may use universal bases at some positions so as to be "substantially identical," but still provide flexibility in sequence recognition. Notably, suitable primer and probe hybridization conditions are well known in the art.
Some embodiments of the present disclosure relate to vectors comprising one or more of the nucleic acids disclosed herein. Some embodiments of the present disclosure relate to expression constructs comprising one or more of the nucleic acids disclosed herein. Some embodiments of the present disclosure relate to nucleic acids comprising codon-optimized nucleotide sequences encoding engineered variants of the disclosure. In some embodiments, the nucleic acids disclosed herein are heterologous.
Methods of screening for engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides
The present disclosure provides a method of screening for engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides comprising the amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions. In certain such embodiments, the methods involve a competition assay in which an engineered variant of the disclosure is expressed in a modified host cell with a relevant enzyme.
Some embodiments of the present disclosure relate to a method of screening for an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide comprising an amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions, the method comprising:
a) dividing the population of host cells into a control population and a test population;
b) co-expressing in the control population a THCAS polypeptide having an amino acid sequence of SEQ ID NO:44 and a comparative cannabinoid synthase polypeptide, wherein the THCAS polypeptide having an amino acid sequence of SEQ ID NO:44 can convert CBGA to a first cannabinoid, THCA, and the comparative cannabinoid synthase polypeptide can convert the same CBGA to a different second cannabinoid;
c) co-expressing the engineered variant and the comparative cannabinoid synthase polypeptide in the test population, wherein the engineered variant can convert CBGA to the same first cannabinoid, THCA, as the THCAS polypeptide having the amino acid sequence of SEQ ID NO:44, and wherein the comparative cannabinoid synthase polypeptide can convert the same CBGA to the second cannabinoid and is expressed at similar levels in the test population and the control population;
d) measuring the ratio of the first cannabinoid (THCA) to the second cannabinoid produced by both the test population and the control population; and
e) measuring the amount of the first cannabinoid produced by both the test population and the control population in mg/L or mM. In certain such embodiments, the engineered variant is an engineered variant of the present disclosure.
In some embodiments, the test population is identified as comprising an engineered variant having improved in vivo performance as compared to a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, wherein improved in vivo performance is evidenced by the test population producing a greater amount of the first cannabinoid, in mg/L or mM, than the control population over the same length of time under similar culture conditions. In some embodiments, the test population is identified as comprising an engineered variant having improved in vivo performance as compared to a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, wherein improved in vivo performance is evidenced by an increased ratio of the first cannabinoid to the second cannabinoid produced by the test population as compared to the ratio of the first cannabinoid to the second cannabinoid produced by the control population over the same length of time under similar culture conditions.
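The comparison made in steps d) and e) can be sketched in a few lines of Python. This is a minimal illustration only: the class name, function name, and titer values are hypothetical, and the sketch assumes both populations were measured in the same units after the same culture time under similar conditions.

```python
from dataclasses import dataclass


@dataclass
class PopulationResult:
    """Measured titers for one population, both in the same units (mg/L or mM)."""
    first_cannabinoid: float   # THCA produced by the THCAS polypeptide or variant
    second_cannabinoid: float  # product of the comparative synthase, e.g. CBDA

    @property
    def ratio(self) -> float:
        """Ratio of first to second cannabinoid (step d)."""
        if self.second_cannabinoid <= 0:
            raise ValueError("second cannabinoid titer must be positive")
        return self.first_cannabinoid / self.second_cannabinoid


def evaluate_variant(test: PopulationResult, control: PopulationResult) -> dict:
    """Compare a test population (engineered variant) with the SEQ ID NO:44 control."""
    if control.first_cannabinoid <= 0:
        raise ValueError("control titer must be positive")
    increase = 100.0 * (test.first_cannabinoid - control.first_cannabinoid) / control.first_cannabinoid
    return {
        "titer_improved": test.first_cannabinoid > control.first_cannabinoid,
        "ratio_improved": test.ratio > control.ratio,
        "titer_percent_increase": increase,
    }


if __name__ == "__main__":
    # Hypothetical measurements in mg/L, not data from the disclosure.
    control = PopulationResult(first_cannabinoid=100.0, second_cannabinoid=50.0)
    test = PopulationResult(first_cannabinoid=150.0, second_cannabinoid=30.0)
    print(evaluate_variant(test, control))
```

Because the comparative synthase competes for the same CBGA pool and is expressed at similar levels in both populations, a higher ratio in the test population reflects the variant's performance rather than a difference in substrate supply.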
In some embodiments, the comparative cannabinoid synthase polypeptide is a cannabidiolic acid synthase (CBDAS) polypeptide. In certain such embodiments, the CBDAS polypeptide comprises an amino acid sequence that has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID NO:3. In some embodiments, the nucleotide sequence encoding a CBDAS polypeptide is the nucleotide sequence set forth in SEQ ID NO:1 or SEQ ID NO:2. In some embodiments, the nucleotide sequence encoding a CBDAS polypeptide is the nucleotide sequence set forth in SEQ ID NO:1 or SEQ ID NO:2, or a codon degenerate nucleotide sequence thereof. In some embodiments, the nucleotide sequence encoding a CBDAS polypeptide has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID NO:1 or SEQ ID NO:2. In some embodiments, the second cannabinoid is CBDA.
Modified host cells for expressing engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides and for producing cannabinoids and cannabinoid derivatives
The present disclosure provides modified host cells comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure. In certain such embodiments, the modified host cells of the present disclosure are used to express engineered variants and/or to produce cannabinoids or cannabinoid derivatives. In some embodiments, the nucleotide sequence encoding the engineered variant is codon optimized.
The present disclosure also provides nucleic acids (e.g., heterologous nucleic acids) that can be introduced into a microorganism (e.g., a modified host cell) to cause expression or overexpression of the engineered variants of the present disclosure, which can then be used for the production of cannabinoids or cannabinoid derivatives in vitro (e.g., cell-free) or in vivo. In some embodiments, these nucleic acids comprise codon-optimized nucleotide sequences encoding the engineered variants.
Cannabinoid synthase polypeptides, such as the engineered variants of the disclosure, are secreted polypeptides with structural features that can hinder expression in a modified host cell (such as a modified yeast cell). Cannabinoid synthase polypeptides, including the disclosed engineered variants, comprise a disulfide bond, a number of glycosylation sites, including N-glycosylation sites, and a covalently attached flavin adenine dinucleotide (FAD) cofactor moiety. Often these secreted polypeptides are misfolded or mislocalized, resulting in low expression, lack of activity of the polypeptide, reduced host cell viability, and/or cell death. As disclosed herein, manipulation of the secretory pathway in a host cell modified with one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure may improve the expression, folding, and enzymatic activity of the engineered variant of the disclosure, as well as the viability of the modified host cell. In certain such embodiments, the nucleotide sequence encoding the engineered variant is codon optimized.
To produce a cannabinoid or cannabinoid derivative and create a biosynthetic pathway within a modified host cell, a modified host cell comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the disclosure can express or overexpress a combination of heterologous nucleic acids comprising a nucleotide sequence encoding a polypeptide involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isoprene phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, the nucleotide sequence encoding a polypeptide involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA) is codon optimized. In some embodiments, a modified host cell of the present disclosure for producing a cannabinoid or cannabinoid derivative comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure and further comprises one or more modifications to modulate expression of one or more secretory pathway polypeptides. The one or more modifications that modulate the expression of one or more secretory pathway polypeptides may include introducing into the host cell one or more heterologous nucleic acids comprising a nucleotide sequence encoding the one or more secretory pathway polypeptides and/or a deletion or downregulation in the host cell of one or more genes encoding the one or more secretory pathway polypeptides.
In some embodiments, a modified host cell of the present disclosure for producing a cannabinoid or cannabinoid derivative, comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure, further comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more secretory pathway polypeptides, thereby causing expression or overexpression of the one or more secretory pathway polypeptides. In some embodiments, the nucleotide sequence encoding one or more secretory pathway polypeptides is codon optimized. In some embodiments, a modified host cell of the present disclosure for producing a cannabinoid or cannabinoid derivative, comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, further comprises a deletion or down-regulation of one or more genes encoding one or more secretory pathway polypeptides, thereby reducing or eliminating expression of the one or more secretory pathway polypeptides. In certain such embodiments, the modified host cell comprises a deletion of one or more genes encoding one or more secretory pathway polypeptides. In some embodiments, the modified host cell comprises down-regulation of one or more genes encoding one or more secretory pathway polypeptides.
In some embodiments, culturing a modified host cell for production of a cannabinoid or cannabinoid derivative in a culture medium provides for synthesis of the cannabinoid or cannabinoid derivative.
To express the engineered variants of the present disclosure, the modified host cell may express or overexpress one or more nucleic acids comprising a nucleotide sequence encoding the engineered variant. In some embodiments, the nucleotide sequence encoding the engineered variant is codon optimized. In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant for expressing the engineered variant of the present disclosure comprises one or more modifications to modulate the expression of one or more secretory pathway polypeptides. The one or more modifications that modulate the expression of one or more secretory pathway polypeptides may include introducing into the host cell one or more heterologous nucleic acids comprising a nucleotide sequence encoding the one or more secretory pathway polypeptides and/or a deletion or downregulation in the host cell of one or more genes encoding the one or more secretory pathway polypeptides. In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant for expressing an engineered variant of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more secretory pathway polypeptides, resulting in the expression or overexpression of the one or more secretory pathway polypeptides. In some embodiments, the nucleotide sequence encoding one or more secretory pathway polypeptides is codon optimized. 
In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant for expressing an engineered variant of the present disclosure comprises a deletion or down-regulation of one or more genes encoding one or more secretory pathway polypeptides, thereby reducing or eliminating expression of the one or more secretory pathway polypeptides. In certain such embodiments, the modified host cell comprises a deletion of one or more genes encoding one or more secretory pathway polypeptides. In some embodiments, the modified host cell comprises down-regulation of one or more genes encoding one or more secretory pathway polypeptides.
Secretory pathway modification
Secretory pathway polypeptides having modulated expression in a modified host cell of the present disclosure may include, but are not limited to: KAR2 polypeptide, ROT2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, PEP4 polypeptide and IRE1 polypeptide. Expression of the secretory pathway polypeptide may be modulated by introducing into the host cell one or more heterologous nucleic acids comprising a nucleotide sequence encoding the one or more secretory pathway polypeptides and/or by deletion or downregulation of one or more genes encoding the one or more secretory pathway polypeptides in the host cell. In some embodiments, the nucleotide sequence encoding one or more secretory pathway polypeptides is codon optimized.
In some embodiments, the modified host cells of the present disclosure comprise a deletion or down-regulation of one or more of the following genes: the ROT2 gene or the PEP4 gene. In some embodiments, the modified host cells of the present disclosure comprise a deletion of one or more of the following genes: the ROT2 gene or the PEP4 gene. In some embodiments, the modified host cells of the present disclosure comprise down-regulation of one or more of the following genes: the ROT2 gene or the PEP4 gene.
The secretory pathway polypeptide and the heterologous nucleic acid comprising a nucleotide sequence encoding one or more secretory pathway polypeptides may be from any suitable source, e.g., bacteria, yeast, fungi, algae, human, plant, or mouse. In some embodiments, the secretory pathway polypeptide and the heterologous nucleic acid comprising a nucleotide sequence encoding one or more secretory pathway polypeptides may be derived from Pichia pastoris (now known as Komagataella phaffii), Pichia finlandica, Pichia trehalophila, Pichia koclamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia guercuum, Yarrowia lipolytica, Kluyveromyces sp., Kluyveromyces lactis, Kluyveromyces marxianus, Schizosaccharomyces pombe, Scheffersomyces stipitis, Dekkera bruxellensis, Blastobotrys adeninivorans (formerly known as Arxula adeninivorans), Candida albicans, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Fusarium oxysporum, Trichoderma reesei, Fusarium sp., Neurospora crassa, etc. In some embodiments, the disclosure also encompasses orthologous genes encoding the secretory pathway polypeptides disclosed herein.
Exemplary secretory pathway polypeptides disclosed herein can also include full-length secretory pathway polypeptides, fragments of secretory pathway polypeptides, variants of secretory pathway polypeptides, truncated secretory pathway polypeptides, or fusion polypeptides having at least one activity of a secretory pathway polypeptide.
Exemplary KAR2 polypeptides disclosed herein can include full-length KAR2 polypeptides, fragments of KAR2 polypeptides, variants of KAR2 polypeptides, truncated KAR2 polypeptides, or fusion polypeptides having at least one activity of KAR2 polypeptides.
Exemplary ROT2 polypeptides disclosed herein may include a full-length ROT2 polypeptide, a fragment of ROT2 polypeptide, a variant of ROT2 polypeptide, a truncated ROT2 polypeptide, or a fusion polypeptide having at least one activity of ROT2 polypeptide.
Exemplary PDI1 polypeptides disclosed herein may include full-length PDI1 polypeptides, fragments of PDI1 polypeptides, variants of PDI1 polypeptides, truncated PDI1 polypeptides, or fusion polypeptides having at least one activity of a PDI1 polypeptide.
Exemplary ERO1 polypeptides disclosed herein may include full-length ERO1 polypeptides, fragments of ERO1 polypeptides, variants of ERO1 polypeptides, truncated ERO1 polypeptides, or fusion polypeptides having at least one activity of ERO1 polypeptides.
Exemplary FAD1 polypeptides disclosed herein can also include full-length FAD1 polypeptides, fragments of FAD1 polypeptides, variants of FAD1 polypeptides, truncated FAD1 polypeptides, or fusion polypeptides having at least one activity of FAD1 polypeptides.
Exemplary PEP4 polypeptides disclosed herein may include a full-length PEP4 polypeptide, a fragment of PEP4 polypeptide, a variant of PEP4 polypeptide, a truncated PEP4 polypeptide, or a fusion polypeptide having at least one activity of a PEP4 polypeptide.
Exemplary IRE1 polypeptides disclosed herein may include full-length IRE1 polypeptides, fragments of IRE1 polypeptides (e.g., missing the first 7 amino acids), variants of IRE1 polypeptides, truncated IRE1 polypeptides, or fusion polypeptides having at least one activity of an IRE1 polypeptide.
The modified host cells of the present disclosure may comprise one or more modifications to modulate the expression of one or more of KAR2 polypeptide, ROT2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, PEP4 polypeptide, or IRE1 polypeptide. The one or more modifications that modulate the expression of one or more of KAR2 polypeptide, ROT2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, PEP4 polypeptide, or IRE1 polypeptide may comprise introducing into the host cell one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, or IRE1 polypeptide, and/or deletion or downregulation of one or more genes encoding one or more of ROT2 polypeptide or PEP4 polypeptide in the host cell. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, or IRE1 polypeptide, thereby causing expression or overexpression of a KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, or IRE1 polypeptide. In some embodiments, the modified host cells of the present disclosure comprise a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, thereby reducing or eliminating expression of a ROT2 polypeptide or a PEP4 polypeptide.
In some embodiments, the one or more modifications that modulate the expression of one or more secretory pathway polypeptides may improve the viability of the modified host cell. Improving the viability of the modified host cell may improve the industrial fermentation process. The ERO1 polypeptide may serve as a partner for the PDI1 polypeptide, a protein disulfide isomerase polypeptide. Modulating expression of IRE1 polypeptide may prevent degradation of the expressed engineered variants of the disclosure.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, or IRE1 polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more secretory pathway polypeptides comprising the amino acid sequence set forth in SEQ ID NO:5(KAR2 polypeptide), SEQ ID NO:9(PDI1 polypeptide), SEQ ID NO:7(ERO1 polypeptide), SEQ ID NO:192(FAD1 polypeptide), SEQ ID NO:11(IRE1 polypeptide), or SEQ ID NO:190 (a fragment of IRE1 polypeptide).
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more secretory pathway polypeptides comprising the amino acid sequence set forth in SEQ ID NO:5(KAR2 polypeptide), SEQ ID NO:9(PDI1 polypeptide), SEQ ID NO:7(ERO1 polypeptide), SEQ ID NO:192(FAD1 polypeptide), SEQ ID NO:11(IRE1 polypeptide), or SEQ ID NO:190 (a fragment of IRE1 polypeptide), or a conservatively substituted amino acid sequence of any of the foregoing.
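As an illustration only, a conservatively substituted sequence can be recognized by checking that every position either matches the reference or swaps in a residue of the same physicochemical class. The sketch below is in Python; the residue groupings in `GROUPS`, the function name, and the toy sequences are assumptions made for this example and are not the definition of conservative substitution used in the disclosure.

```python
# Hypothetical residue classes; the disclosure's own definition of
# "conservative substitution" governs, not this grouping.
GROUPS = ("GAVLI", "FYW", "KRH", "DE", "ST", "NQ", "CM", "P")
GROUP_OF = {aa: index for index, group in enumerate(GROUPS) for aa in group}


def is_conservative_variant(reference: str, variant: str) -> bool:
    """Return True if `variant` differs from `reference` only by
    same-class substitutions (no insertions or deletions)."""
    if len(reference) != len(variant):
        return False
    for ref_aa, var_aa in zip(reference.upper(), variant.upper()):
        if ref_aa == var_aa:
            continue
        # Unknown letters never count as conservative.
        if ref_aa not in GROUP_OF or var_aa not in GROUP_OF:
            return False
        if GROUP_OF[ref_aa] != GROUP_OF[var_aa]:
            return False
    return True
```

With these assumed classes, the toy sequence `MREI` would count as a conservatively substituted form of `MKDL` (K to R, D to E, L to I), whereas `MKDP` would not.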
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more secretory pathway polypeptides comprising an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID NO:5 (KAR2 polypeptide), SEQ ID NO:9 (PDI1 polypeptide), SEQ ID NO:7 (ERO1 polypeptide), SEQ ID NO:192 (FAD1 polypeptide), SEQ ID NO:11 (IRE1 polypeptide), or SEQ ID NO:190 (a fragment of IRE1 polypeptide).
In some embodiments, the modified host cells of the present disclosure comprise a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise a deletion or down-regulation of one or more genes encoding one or more secretory pathway polypeptides comprising the amino acid sequence set forth in SEQ ID NO:13(ROT2 polypeptide) or SEQ ID NO:15(PEP4 polypeptide).
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding two or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising nucleotide sequences encoding three or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IRE1 polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, or FAD1 polypeptide. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding two or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or a FAD1 polypeptide. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding three or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or a FAD1 polypeptide. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding a FAD1 polypeptide. In some embodiments, the nucleotide sequence encoding one or more of KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, or IRE1 polypeptide is codon optimized.
In some embodiments, the modified host cells of the present disclosure comprise a deletion or down-regulation of one or more genes encoding one or more of ROT2 polypeptide or PEP4 polypeptide. In some embodiments, the modified host cells of the present disclosure comprise a deletion or down-regulation of genes encoding a ROT2 polypeptide and a PEP4 polypeptide.
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising a nucleotide sequence encoding a secretory pathway polypeptide, such as a full-length secretory pathway polypeptide, a fragment of a secretory pathway polypeptide, a variant of a secretory pathway polypeptide, a truncated secretory pathway polypeptide, or a fusion polypeptide having at least one activity of a secretory pathway polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, such as a full-length KAR2 polypeptide, a fragment of KAR2 polypeptide, a variant of KAR2 polypeptide, a truncated KAR2 polypeptide, or a fusion polypeptide having at least one activity of a KAR2 polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising a nucleotide sequence encoding a ROT2 polypeptide, such as a full-length ROT2 polypeptide, a fragment of a ROT2 polypeptide, a variant of a ROT2 polypeptide, a truncated ROT2 polypeptide, or a fusion polypeptide having at least one activity of a ROT2 polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, such as a full-length PDI1 polypeptide, a fragment of a PDI1 polypeptide, a variant of a PDI1 polypeptide, a truncated PDI1 polypeptide, or a fusion polypeptide having at least one activity of a PDI1 polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
Exemplary heterologous nucleic acids disclosed herein may include nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, such as a full-length ERO1 polypeptide, a fragment of an ERO1 polypeptide, a variant of an ERO1 polypeptide, a truncated ERO1 polypeptide, or a fusion polypeptide having at least one activity of an ERO1 polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising a nucleotide sequence encoding a FAD1 polypeptide, such as a full-length FAD1 polypeptide, a fragment of a FAD1 polypeptide, a variant of a FAD1 polypeptide, a truncated FAD1 polypeptide, or a fusion polypeptide having at least one activity of a FAD1 polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising a nucleotide sequence encoding a PEP4 polypeptide, such as a full-length PEP4 polypeptide, a fragment of a PEP4 polypeptide, a variant of a PEP4 polypeptide, a truncated PEP4 polypeptide, or a fusion polypeptide having at least one activity of a PEP4 polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
Exemplary heterologous nucleic acids disclosed herein may include nucleic acids comprising a nucleotide sequence encoding an IRE1 polypeptide, such as a full-length IRE1 polypeptide, a fragment of an IRE1 polypeptide (e.g., missing the first 7 amino acids), a variant of an IRE1 polypeptide, a truncated IRE1 polypeptide, or a fusion polypeptide having at least one activity of an IRE1 polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, one or more secretory pathway polypeptides, such as a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, are overexpressed in the modified host cell. Overexpression may be achieved by increasing the copy number of one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more secretory pathway polypeptides, such as KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, or IRE1 polypeptide, for example by using a high copy number expression vector (e.g., a plasmid present at 10-40 copies or about 100 copies per cell) and/or by operably linking a nucleotide sequence encoding one or more secretory pathway polypeptides, such as KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, or IRE1 polypeptide, to a strong promoter. In some embodiments, the modified host cell has one copy of a heterologous nucleic acid comprising a nucleotide sequence encoding a secretory pathway polypeptide, such as a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide. In some embodiments, the modified host cell has two copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a secretory pathway polypeptide, such as a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide. In some embodiments, the modified host cell has three copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a secretory pathway polypeptide, such as a KAR2 polypeptide, PDI1 polypeptide, FAD1 polypeptide, ERO1 polypeptide, or IRE1 polypeptide. In some embodiments, the modified host cell has four copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a secretory pathway polypeptide, such as a KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, or IRE1 polypeptide. 
In some embodiments, the modified host cell has five copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a secretory pathway polypeptide, such as a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide. In some embodiments, the modified host cell has five or more copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a secretory pathway polypeptide, such as a KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, or IRE1 polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more secretory pathway polypeptides selected from the group consisting of the nucleotide sequences set forth in SEQ ID NO:4 (encoding a KAR2 polypeptide), SEQ ID NO:8 (encoding a PDI1 polypeptide), SEQ ID NO:6 (encoding an ERO1 polypeptide), SEQ ID NO:191 (encoding a FAD1 polypeptide), SEQ ID NO:10 (encoding an IRE1 polypeptide), or SEQ ID NO:189 (encoding a fragment of an IRE1 polypeptide).
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more secretory pathway polypeptides selected from the group consisting of the nucleotide sequence set forth in SEQ ID NO:4 (encoding a KAR2 polypeptide), SEQ ID NO:8 (encoding a PDI1 polypeptide), SEQ ID NO:6 (encoding an ERO1 polypeptide), SEQ ID NO:191 (encoding a FAD1 polypeptide), SEQ ID NO:10 (encoding an IRE1 polypeptide), or SEQ ID NO:189 (encoding a fragment of an IRE1 polypeptide), or a codon degenerate nucleotide sequence of any of the foregoing.
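For illustration, two nucleotide sequences are codon degenerate with respect to one another when they differ in codon choice but translate to the same amino acid sequence. The Python sketch below checks this using the standard genetic code; the function names and the short toy sequences are assumptions for this example and are not sequences from the sequence listing. A host organism that uses a non-standard code would need a different table.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
# (first base slowest, third base fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate an in-frame coding sequence (DNA or RNA letters)."""
    seq = nucleotides.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_codon_degenerate(seq_a: str, seq_b: str) -> bool:
    """True if both coding sequences encode the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)
```

For example, the toy sequences `ATGGCTTAA` and `ATGGCCTGA` both translate to Met-Ala-stop, so the check would treat them as codon degenerate variants of one another.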
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more secretory pathway polypeptides, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID NO:4 (encoding a KAR2 polypeptide), SEQ ID NO:8 (encoding a PDI1 polypeptide), SEQ ID NO:6 (encoding an ERO1 polypeptide), SEQ ID NO:191 (encoding a FAD1 polypeptide), SEQ ID NO:10 (encoding an IRE1 polypeptide), or SEQ ID NO:189 (encoding a fragment of an IRE1 polypeptide).
In some embodiments, the modified host cells of the present disclosure comprise a deletion or down-regulation of one or more genes encoding one or more secretory pathway polypeptides encoded by a nucleotide sequence selected from the group consisting of the nucleotide sequences set forth in SEQ ID NO:12 (encoding the ROT2 polypeptide) and SEQ ID NO:14 (encoding the PEP4 polypeptide).
In some embodiments, the modified host cells of the present disclosure comprise a deletion or down-regulation of the ROT2 gene. In some embodiments, the modified host cells of the present disclosure comprise a deletion of the ROT2 gene. In some embodiments, the modified host cells of the present disclosure comprise down-regulation of the ROT2 gene.
In some embodiments, the modified host cells of the present disclosure comprise a deletion or down-regulation of the PEP4 gene. In some embodiments, the modified host cells of the present disclosure comprise a deletion of the PEP4 gene. In some embodiments, the modified host cells of the present disclosure comprise down-regulation of the PEP4 gene.
In some embodiments, the modified host cells of the present disclosure comprise a deletion or down-regulation of the PEP4 gene and the ROT2 gene. In some embodiments, the modified host cells of the present disclosure comprise a deletion of the PEP4 gene and the ROT2 gene. In some embodiments, the modified host cells of the present disclosure comprise down-regulation of the PEP4 gene and the ROT2 gene.
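The secretory pathway modifications described above fall into two groups: polypeptides whose coding sequences are introduced for expression or overexpression (KAR2, PDI1, ERO1, FAD1, IRE1) and genes that are deleted or down-regulated (ROT2, PEP4). The following Python sketch is one hypothetical way to record a strain design and check it against those two groups. The SEQ ID NO pairs are those recited in the text; the data layout, the function name `check_design`, and the messages are assumptions for illustration only (built-in generic annotations assume Python 3.9 or later).

```python
# (amino acid SEQ ID NO, nucleotide SEQ ID NO) as recited in the text.
SEQ_ID_NOS = {
    "KAR2": (5, 4),
    "ERO1": (7, 6),
    "PDI1": (9, 8),
    "IRE1": (11, 10),
    "ROT2": (13, 12),
    "PEP4": (15, 14),
    "FAD1": (192, 191),
}

# Targets for expression or overexpression versus deletion or down-regulation.
OVEREXPRESSION_TARGETS = frozenset({"KAR2", "PDI1", "ERO1", "FAD1", "IRE1"})
REDUCTION_TARGETS = frozenset({"ROT2", "PEP4"})


def check_design(overexpressed: dict[str, int], reduced: set[str]) -> list[str]:
    """Return a list of problems with a proposed design (empty if none).

    `overexpressed` maps a polypeptide name to the copy number of its
    heterologous nucleic acid; `reduced` names genes to delete or
    down-regulate.
    """
    problems = []
    for name, copies in overexpressed.items():
        if name not in OVEREXPRESSION_TARGETS:
            problems.append(f"{name}: not listed as an overexpression target")
        elif copies < 1:
            problems.append(f"{name}: copy number must be at least 1")
    for name in sorted(reduced):
        if name not in REDUCTION_TARGETS:
            problems.append(f"{name}: not listed as a deletion/down-regulation target")
    return problems
```

A design such as `check_design({"KAR2": 2, "PDI1": 1}, {"PEP4"})` returns an empty list, while listing ROT2 for overexpression would be flagged.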
Modification of the cannabinoid and cannabinoid precursor biosynthetic pathways
A modified host cell of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure may further comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In addition to the engineered variants of the present disclosure, such polypeptides may include, but are not limited to: geranyl pyrophosphate:olivetolic acid geranyltransferase (GOT) polypeptides, tetraketide synthase (TKS) polypeptides, olivetolic acid cyclase (OAC) polypeptides, one or more polypeptides having at least one activity of a polypeptide present in the mevalonate (MEV) pathway (e.g., one or more MEV pathway polypeptides), acyl-activating enzyme (AAE) polypeptides, GPP-producing polypeptides (e.g., geranyl pyrophosphate synthase (GPPS) polypeptides), polypeptides that condense two acetyl-CoA molecules to produce acetoacetyl-CoA (e.g., acetoacetyl-CoA thiolase polypeptides), and pyruvate decarboxylase polypeptides. In some embodiments, the nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA) is codon optimized.
The polypeptides involved in the biosynthesis of cannabinoids or cannabinoid precursors, as well as heterologous nucleic acids comprising nucleotide sequences encoding one or more polypeptides involved in the biosynthesis of cannabinoids or cannabinoid precursors, may be derived from any suitable source, such as bacteria, yeast, fungi, algae, humans, plants (e.g., cannabis) or mice. In some embodiments, the disclosure also encompasses orthologous genes encoding polypeptides disclosed herein that are involved in the biosynthesis of a cannabinoid or cannabinoid precursor.
Engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides
A modified host cell of the disclosure may comprise one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide disclosed herein. In certain such embodiments, the tetrahydrocannabinolic acid synthase polypeptide has the amino acid sequence of SEQ ID NO:44.
In some embodiments, a modified host cell of the present disclosure comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, wherein the engineered variant comprises an amino acid sequence set forth in SEQ ID NO: 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 176, 178, 180, 182, 184, or 186. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, a modified host cell of the present disclosure comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, wherein the engineered variant comprises an amino acid sequence set forth in SEQ ID NO: 50, 52, 54, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 124, 126, 128, 130, 132, 134, 138, 140, 142, 144, 146, 148, 152, 156, 158, 160, 164, 166, 168, 170, 172, 174, 176, 178, 182, 184, or 186. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, the engineered variants of the present disclosure are overexpressed in a modified host cell. Overexpression can be achieved by increasing the copy number of one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, for example, by using a high copy number expression vector (e.g., a plasmid present at 10-40 copies or about 100 copies per cell) and/or by operably linking a nucleotide sequence encoding an engineered variant of the disclosure to a strong promoter. In some embodiments, the modified host cell has one copy of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure. In some embodiments, the modified host cell has two copies of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure. In some embodiments, the modified host cell has three copies of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure. In some embodiments, the modified host cell has four copies of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure. In some embodiments, the modified host cell has five copies of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure. In some embodiments, the modified host cell has six copies of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure. In some embodiments, the modified host cell has seven copies of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure. In some embodiments, the modified host cell has eight copies of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure. In some embodiments, the modified host cell has eight or more copies of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of the disclosure.
In some embodiments, a modified host cell of the disclosure comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide disclosed herein, wherein the nucleotide sequence is a nucleotide sequence set forth in SEQ ID NO: 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 163, 171, 173, 175, 177, 179, 181, 183, or 185. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, a modified host cell of the disclosure comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide disclosed herein, wherein the nucleotide sequence is a nucleotide sequence set forth in SEQ ID NO: 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137, 139, 141, 143, 145, 147, 149, 151, 153, 155, 157, 161, 163, 165, 167, 171, 173, 175, 177, 179, 181, 183, or 185, or a codon degenerate nucleotide sequence of any of the foregoing. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, a modified host cell of the disclosure comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide disclosed herein, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID NO: 49, 51, 53, 55, 57, 59, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 123, 125, 127, 129, 131, 133, 137, 139, 141, 143, 145, 147, 151, 155, 157, 159, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, or 185. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, a modified host cell of the disclosure comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide disclosed herein, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID NO: 49, 51, 53, 55, 57, 59, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119, 123, 125, 127, 129, 131, 133, 137, 139, 141, 143, 145, 147, 151, 155, 157, 159, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, or 185, or a codon degenerate nucleotide sequence of any of the foregoing. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, at least one of the one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure is operably linked to an inducible promoter. In some embodiments, at least one of the one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure is operably linked to a constitutive promoter.
Geranyl pyrophosphate:olivetolic acid geranyltransferase (GOT) polypeptides
The modified host cells of the present disclosure may comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a geranyl pyrophosphate:olivetolic acid geranyltransferase (GOT) polypeptide.
Exemplary GOT polypeptides disclosed herein can include full-length GOT polypeptides, fragments of GOT polypeptides, variants of GOT polypeptides, truncated GOT polypeptides, or fusion polypeptides having at least one activity of a GOT polypeptide. In some embodiments, the GOT polypeptide has aromatic prenyltransferase (PT) activity. In some embodiments, the GOT polypeptide modifies a cannabinoid precursor or a derivative of a cannabinoid precursor. In certain such embodiments, the GOT polypeptide modifies olivetolic acid or an olivetolic acid derivative.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the GOT polypeptide comprises the amino acid sequence set forth in SEQ ID No. 17. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the GOT polypeptide comprises the amino acid sequence shown in SEQ ID No. 17 or conservatively substituted amino acid sequences thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the GOT polypeptide comprises an amino acid sequence having at least 65%, at least 70%, at least 75%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 17.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the GOT polypeptide comprises an amino acid sequence having at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID No. 17. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the GOT polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID No. 17. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the GOT polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 17.
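To make the identity thresholds above concrete, the following Python sketch computes percent identity for two sequences that have already been aligned (gaps written as `-`) and reports the highest recited threshold that the pair satisfies. It is a simplified illustration: the choice of denominator (all alignment columns), the function names, and the toy sequences are assumptions, and the disclosure's own definition of sequence identity and alignment method governs.

```python
# Thresholds recited for the GOT polypeptide relative to SEQ ID NO:17.
THRESHOLDS = (
    65, 70, 75, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93,
    94, 95, 96, 97, 98, 99, 99.5, 99.6, 99.7, 99.8, 99.9, 100,
)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must come from the same pairwise alignment, so they have
    equal length; a gap ('-') never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def highest_threshold_met(identity: float):
    """Return the largest recited threshold <= identity, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None
```

For instance, the toy alignment `MK-TA` against `MKQTA` has four matching columns out of five, which this sketch scores as 80% identity and therefore as meeting the "at least 80%" embodiment but not the "at least 81%" one.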
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, such as a full-length GOT polypeptide, a fragment of a GOT polypeptide, a variant of a GOT polypeptide, a truncated GOT polypeptide, or a fusion polypeptide having at least one activity of a GOT polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, the GOT polypeptide is overexpressed in a modified host cell. Overexpression can be achieved by increasing the copy number of one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, for example by using a high copy number expression vector (e.g., a plasmid present in 10-40 copies or about 100 copies per cell) and/or by operably linking the nucleotide sequence encoding the GOT polypeptide to a strong promoter. In some embodiments, the modified host cell has one copy of a heterologous nucleic acid comprising a nucleotide sequence encoding a GOT polypeptide. In some embodiments, the modified host cell has two copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GOT polypeptide. In some embodiments, the modified host cell has three copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GOT polypeptide. In some embodiments, the modified host cell has four copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GOT polypeptide. In some embodiments, the modified host cell has five copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GOT polypeptide. In some embodiments, the modified host cell has six copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GOT polypeptide. In some embodiments, the modified host cell has seven copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GOT polypeptide. In some embodiments, the modified host cell has eight copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GOT polypeptide. In some embodiments, the modified host cell has eight or more copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GOT polypeptide.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 16. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 16 or a codon degenerate nucleotide sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 16.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 16. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 16.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the nucleotide sequence has at least 80% sequence identity to SEQ ID No. 16. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the nucleotide sequence has at least 85% sequence identity to SEQ ID No. 16. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the nucleotide sequence has at least 90% sequence identity to SEQ ID No. 16. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GOT polypeptide, wherein the nucleotide sequence has at least 95% sequence identity to SEQ ID No. 16.
NphB polypeptides
In some embodiments, cannabigerolic acid is produced from GPP and olivetolic acid using an NphB polypeptide instead of a GOT polypeptide. The modified host cells of the present disclosure may comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide.
Exemplary NphB polypeptides disclosed herein may include a full-length NphB polypeptide, a fragment of an NphB polypeptide, a variant of an NphB polypeptide, a truncated NphB polypeptide, or a fusion polypeptide having at least one activity of an NphB polypeptide. In some embodiments, the NphB polypeptide has aromatic prenyltransferase (PT) activity. In some embodiments, the NphB polypeptide modifies a cannabinoid precursor or a derivative of a cannabinoid precursor. In certain such embodiments, the NphB polypeptide modifies olivetolic acid or an olivetolic acid derivative.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the NphB polypeptide comprises the amino acid sequence set forth in SEQ ID No. 188. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the NphB polypeptide comprises the amino acid sequence set forth in SEQ ID No. 188 or conservatively substituted amino acid sequences thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the NphB polypeptide comprises an amino acid sequence having at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID No. 188. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the NphB polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID No. 188. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the NphB polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 188.
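By way of illustration only, the percent-identity thresholds recited above can be checked computationally. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length by a separate alignment tool, with "-" marking gaps, and assuming that identity is counted over all alignment columns that are not gaps in both sequences. The function names and the toy sequences are hypothetical and are not taken from the sequence listing; other identity definitions (for example, normalizing by the length of the shorter sequence) would give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: the denominator is the number of alignment columns in
    which at least one sequence has a residue (double-gap columns are
    ignored). A column counts as a match only if both residues are
    present and equal.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the pair has at least `threshold_percent` identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy example (hypothetical sequences, one mismatch in eight positions).
reference = "MKTAYIAK"
candidate = "MKTAYIAR"
print(percent_identity(reference, candidate))              # 87.5
print(meets_identity_threshold(reference, candidate, 85))  # True
print(meets_identity_threshold(reference, candidate, 90))  # False
```

Under these assumptions, a candidate with 87.5% identity would fall within an "at least 85%" embodiment but outside an "at least 90%" embodiment.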
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising nucleotide sequences encoding NphB polypeptides, such as full-length NphB polypeptides, fragments of NphB polypeptides, variants of NphB polypeptides, truncated NphB polypeptides, or fusion polypeptides having at least one activity of the NphB polypeptides. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, the NphB polypeptide is overexpressed in the modified host cell. Overexpression may be achieved by increasing the copy number of one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, for example by using a high copy number expression vector (e.g., a plasmid present at 10-40 copies or about 100 copies per cell) and/or by operably linking a nucleotide sequence encoding an NphB polypeptide to a strong promoter. In some embodiments, the modified host cell has one copy of a heterologous nucleic acid comprising a nucleotide sequence encoding an NphB polypeptide. In some embodiments, the modified host cell has two copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an NphB polypeptide. In some embodiments, the modified host cell has three copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an NphB polypeptide. In some embodiments, the modified host cell has four copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an NphB polypeptide. In some embodiments, the modified host cell has five copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an NphB polypeptide. In some embodiments, the modified host cell has six copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an NphB polypeptide. In some embodiments, the modified host cell has seven copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an NphB polypeptide. In some embodiments, the modified host cell has eight copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an NphB polypeptide. In some embodiments, the modified host cell has eight or more copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an NphB polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the nucleotide sequence is that set forth in SEQ ID No. 187. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 187 or a codon degenerate nucleotide sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 187. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 187.
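For illustration, two nucleotide sequences may be regarded as codon degenerate with respect to each other when they encode the same amino acid sequence. The sketch below tests this by translating both sequences with the standard genetic code and comparing the results. It assumes in-frame coding sequences whose length is a multiple of three and that contain only the bases A, C, G, and T (or U); the function names and the short test sequences are hypothetical and do not correspond to any SEQ ID NO of the disclosure.

```python
from itertools import product

# Standard genetic code, written in the conventional compact form:
# codons are enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt_seq):
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    seq = nt_seq.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_codon_degenerate(seq_a, seq_b):
    """True if both nucleotide sequences encode the same polypeptide."""
    return translate(seq_a) == translate(seq_b)


# GCT and GCC both encode alanine; TAA and TAG are both stop codons.
print(is_codon_degenerate("ATGGCTTAA", "ATGGCCTAG"))  # True
# GCT (alanine) versus GAT (aspartate) changes the encoded polypeptide.
print(is_codon_degenerate("ATGGCT", "ATGGAT"))        # False
```

A codon-optimized sequence prepared for a particular host would be expected to pass this check against the unoptimized sequence, since codon optimization changes codon usage without changing the encoded polypeptide.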
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the nucleotide sequence has at least 80% sequence identity to SEQ ID No. 187. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the nucleotide sequence has at least 85% sequence identity to SEQ ID No. 187. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the nucleotide sequence has at least 90% sequence identity to SEQ ID No. 187. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide, wherein the nucleotide sequence has at least 95% sequence identity to SEQ ID No. 187.
Polypeptides that produce acyl-CoA compounds or acyl-CoA compound derivatives
The modified host cells of the present disclosure may comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a polypeptide that produces an acyl-CoA compound or an acyl-CoA compound derivative. Such polypeptides may include, but are not limited to, acyl-activating enzyme (AAE) polypeptides, fatty acyl-CoA synthase (FAA) polypeptides, or fatty acyl-CoA ligase polypeptides. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide.
AAE polypeptides, FAA polypeptides, and fatty acyl-CoA ligase polypeptides can convert carboxylic acids to their CoA forms and thereby produce acyl-CoA compounds or acyl-CoA compound derivatives. Promiscuous acyl-activating enzyme polypeptides (such as CsAAE1 and CsAAE3 polypeptides), FAA polypeptides, or fatty acyl-CoA ligase polypeptides can allow for the production of cannabinoid derivatives (e.g., cannabigerolic acid derivatives) as well as cannabinoids (e.g., cannabigerolic acid). In some embodiments, unsubstituted or substituted hexanoic acid, or a carboxylic acid other than unsubstituted or substituted hexanoic acid, is fed to a modified host cell (e.g., is present in the medium in which the cell is grown) that expresses an AAE polypeptide, a FAA polypeptide, or a fatty acyl-CoA ligase polypeptide to produce hexanoyl-CoA, an acyl-CoA compound, a derivative of hexanoyl-CoA, or a derivative of an acyl-CoA compound. The hexanoyl-CoA, acyl-CoA compound, derivative of hexanoyl-CoA, or derivative of an acyl-CoA compound is then further utilized by the modified host cell to produce a cannabinoid or a derivative of a cannabinoid. In certain such embodiments, the cell culture medium comprising the modified host cell comprises unsubstituted or substituted hexanoic acid. In some embodiments, the cell culture medium comprising the modified host cell comprises a carboxylic acid other than unsubstituted or substituted hexanoic acid.
Exemplary AAE, FAA, or fatty acyl-CoA ligase polypeptides disclosed herein can include a full-length AAE, FAA, or fatty acyl-CoA ligase polypeptide; a fragment of an AAE, FAA, or fatty acyl-CoA ligase polypeptide; a variant of an AAE, FAA, or fatty acyl-CoA ligase polypeptide; a truncated AAE, FAA, or fatty acyl-CoA ligase polypeptide; or a fusion polypeptide having at least one activity of an AAE, FAA, or fatty acyl-CoA ligase polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide, wherein the AAE polypeptide comprises the amino acid sequence set forth in SEQ ID No. 23. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide, wherein the AAE polypeptide comprises the amino acid sequence set forth in SEQ ID No. 23 or conservatively substituted amino acid sequences thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide, wherein the AAE polypeptide comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID No. 23. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide, wherein the AAE polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID No. 23. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide, wherein the AAE polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 23.
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide, such as a full-length AAE, FAA, or fatty acyl-CoA ligase polypeptide; a fragment of an AAE, FAA, or fatty acyl-CoA ligase polypeptide; a variant of an AAE, FAA, or fatty acyl-CoA ligase polypeptide; a truncated AAE, FAA, or fatty acyl-CoA ligase polypeptide; or a fusion polypeptide having at least one activity of an AAE, FAA, or fatty acyl-CoA ligase polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, one or more AAE, FAA, or fatty acyl-CoA ligase polypeptides are overexpressed in the modified host cell. Overexpression can be achieved by increasing the copy number of the one or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide, for example, by using a high copy number expression vector (e.g., a plasmid present at 10-40 copies or about 100 copies per cell) and/or by operably linking a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide to a strong promoter. In some embodiments, the modified host cell has one copy of a heterologous nucleic acid comprising a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide. In some embodiments, the modified host cell has two copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide. In some embodiments, the modified host cell has three copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide. In some embodiments, the modified host cell has four copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide. In some embodiments, the modified host cell has five copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide. In some embodiments, the modified host cell has six copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide. In some embodiments, the modified host cell has seven copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide. In some embodiments, the modified host cell has eight copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide. In some embodiments, the modified host cell has eight or more copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an AAE, FAA, or fatty acyl-CoA ligase polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 22. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 22 or a codon degenerate nucleotide sequence thereof. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 22. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 22.
Polypeptides for condensing an acyl-CoA compound or acyl-CoA compound derivative with malonyl-CoA to produce olivetolic acid or an olivetolic acid derivative
The modified host cells of the present disclosure may comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides that condense an acyl-CoA compound (such as hexanoyl-CoA) or an acyl-CoA compound derivative (such as a hexanoyl-CoA derivative) with malonyl-CoA to produce olivetolic acid or an olivetolic acid derivative. The polypeptides that react an acyl-CoA compound or acyl-CoA compound derivative with malonyl-CoA to produce olivetolic acid or an olivetolic acid derivative may include TKS and OAC polypeptides. TKS and OAC polypeptides have been found to have broad substrate specificity, enabling the production of cannabinoid derivatives as well as cannabinoids. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide.
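The sequence of enzymatic steps described in this and the preceding sections can be summarized, purely for illustration, as a small data structure. The sketch below is a simplified model and not a description of any particular strain: it lists only the named substrates (cofactors and stoichiometry are omitted), it uses hexanoic acid as the fed carboxylic acid, and it treats malonyl-CoA and GPP as already available in the cell. The class name, function name, and step list are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    enzymes: tuple      # polypeptide(s) carrying out the step
    substrates: tuple   # named inputs only; cofactors are omitted
    product: str


# Simplified pathway assembled from the steps named in the text.
PATHWAY = [
    Step(("AAE",), ("hexanoic acid",), "hexanoyl-CoA"),
    Step(("TKS", "OAC"), ("hexanoyl-CoA", "malonyl-CoA"), "olivetolic acid"),
    # An NphB polypeptide may be used in place of the GOT polypeptide.
    Step(("GOT",), ("olivetolic acid", "GPP"), "cannabigerolic acid"),
]


def products_reachable(pathway, available_compounds):
    """Return products formed, in order, from the available compounds."""
    available = set(available_compounds)
    produced = []
    progress = True
    while progress:
        progress = False
        for step in pathway:
            ready = all(s in available for s in step.substrates)
            if ready and step.product not in available:
                available.add(step.product)
                produced.append(step.product)
                progress = True
    return produced


print(products_reachable(PATHWAY, {"hexanoic acid", "malonyl-CoA", "GPP"}))
# ['hexanoyl-CoA', 'olivetolic acid', 'cannabigerolic acid']
```

In this model, omitting GPP from the available compounds stops the trace at olivetolic acid, which mirrors the requirement that the prenyltransferase step uses both GPP and olivetolic acid.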
Exemplary TKS or OAC polypeptides disclosed herein may include full-length TKS or OAC polypeptides, fragments of TKS or OAC polypeptides, variants of TKS or OAC polypeptides, truncated TKS or OAC polypeptides, or fusion polypeptides having at least one activity of a TKS or OAC polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a TKS polypeptide, wherein the TKS polypeptide comprises the amino acid sequence set forth in SEQ ID No. 19. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a TKS polypeptide, wherein the TKS polypeptide comprises the amino acid sequence set forth in SEQ ID No. 19 or conservatively substituted amino acid sequences thereof. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a TKS polypeptide, wherein the TKS polypeptide comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID No. 19. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a TKS polypeptide, wherein the TKS polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID No. 19. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a TKS polypeptide, wherein the TKS polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 19.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 21 or SEQ ID NO: 48. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 21 or SEQ ID NO: 48, or conservatively substituted amino acid sequences thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID NO: 21 or SEQ ID NO: 48. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID NO: 21 or SEQ ID NO: 48. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID NO: 21 or SEQ ID NO: 48.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 21. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 21 or conservatively substituted amino acid sequences thereof. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID NO: 21. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID NO: 21. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID NO: 21.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide is a variant OAC (Y27F variant) polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 48. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide is a variant OAC (Y27F variant) polypeptide comprising the amino acid sequence set forth in SEQ ID NO:48 or conservatively substituted amino acid sequences thereof. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide is a variant OAC (Y27F variant) polypeptide comprising an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID No. 48. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide is a variant OAC (Y27F variant) polypeptide comprising an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID No. 48. 
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide is a variant OAC (Y27F variant) polypeptide comprising an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 48.
Exemplary heterologous nucleic acids disclosed herein may include nucleic acids comprising nucleotide sequences encoding a TKS or OAC polypeptide, such as a full-length TKS or OAC polypeptide, a fragment of a TKS or OAC polypeptide, a variant of a TKS or OAC polypeptide, a truncated TKS or OAC polypeptide, or a fusion polypeptide having at least one activity of a TKS or OAC polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, the TKS polypeptide is overexpressed in the modified host cell. Overexpression may be achieved by increasing the copy number of one or more heterologous nucleic acids comprising the nucleotide sequence encoding the TKS polypeptide, for example by using a high copy number expression vector (e.g., a plasmid present in 10-40 copies or about 100 copies per cell) and/or by operably linking the nucleotide sequence encoding the TKS polypeptide to a strong promoter. In some embodiments, the modified host cell has one copy of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, the modified host cell has two copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, the modified host cell has three copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, the modified host cell has four copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, the modified host cell has five copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, the modified host cell has six copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, the modified host cell has seven copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, the modified host cell has eight copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, the modified host cell has nine copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. 
In some embodiments, the modified host cell has ten copies of the heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, the modified host cell has eleven copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, the modified host cell has twelve copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide. In some embodiments, the modified host cell has twelve or more copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a TKS polypeptide.
In some embodiments, the OAC polypeptide is overexpressed in the modified host cell. Overexpression may be achieved by increasing the copy number of one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, for example by using a high copy number expression vector (e.g., a plasmid present in 10-40 copies or about 100 copies per cell) and/or by operably linking a nucleotide sequence encoding an OAC polypeptide to a strong promoter. In some embodiments, the modified host cell has one copy of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. In some embodiments, the modified host cell has two copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. In some embodiments, the modified host cell has three copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. In some embodiments, the modified host cell has four copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. In some embodiments, the modified host cell has five copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. In some embodiments, the modified host cell has six copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. In some embodiments, the modified host cell has seven copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. In some embodiments, the modified host cell has eight copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. In some embodiments, the modified host cell has nine copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. 
In some embodiments, the modified host cell has ten copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. In some embodiments, the modified host cell has eleven copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. In some embodiments, the modified host cell has twelve copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide. In some embodiments, the modified host cell has twelve or more copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an OAC polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a TKS polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 18. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a TKS polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 18 or a codon degenerate nucleotide sequence thereof. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a TKS polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 18. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a TKS polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 18.
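The notion of a "codon degenerate nucleotide sequence" used above can be illustrated with a minimal sketch: two nucleotide sequences are codon degenerate with respect to each other if they translate to the same polypeptide. The sketch assumes the standard genetic code; the function names and the short toy sequences are hypothetical and are not taken from SEQ ID NO: 18 or any other sequence of the disclosure.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate a coding sequence (DNA or RNA) into one-letter amino acids."""
    seq = nucleotides.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_codon_degenerate(seq_a: str, seq_b: str) -> bool:
    """Return True if both sequences encode the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)


# Toy example (hypothetical sequences): GCT and GCC both encode alanine.
print(is_codon_degenerate("ATGGCTTAA", "ATGGCCTAA"))
print(is_codon_degenerate("ATGGCTTAA", "ATGGATTAA"))
```

Comparing translations rather than nucleotides is what lets a codon-optimized sequence differ substantially at the nucleotide level while still encoding the identical polypeptide.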
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the nucleotide sequence is that shown in SEQ ID NO. 20 or SEQ ID NO. 47. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 20 or SEQ ID No. 47 or a codon degenerate nucleotide sequence of any of the foregoing. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 20 or SEQ ID No. 47. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 20 or SEQ ID No. 47.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the nucleotide sequence is that shown in SEQ ID NO. 20. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 20 or a codon degenerate nucleotide sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 20. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 20.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide is a variant OAC (Y27F variant) polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 47. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide is a variant OAC (Y27F variant) polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 47 or a codon degenerate nucleotide sequence thereof. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide is a variant OAC (Y27F variant) polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID NO: 47. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an OAC polypeptide, wherein the OAC polypeptide is a variant OAC (Y27F variant) polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 47.
Geranyl pyrophosphate-producing polypeptides
The modified host cells of the present disclosure may comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPP-producing polypeptide. In some embodiments, the GPP-producing polypeptide is a geranyl pyrophosphate synthase (GPPS) polypeptide. In some embodiments, the GPPS polypeptide further has farnesyl diphosphate synthase (FPPS) polypeptide activity. In some embodiments, a GPPS polypeptide is modified to have reduced FPPS polypeptide activity (e.g., at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or more than 90% less FPPS polypeptide activity) as compared to a corresponding wild-type or parent GPPS polypeptide from which the modified GPPS polypeptide is derived. In some embodiments, the GPPS polypeptide is modified such that it has substantially no FPPS polypeptide activity. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide.
Exemplary GPPS polypeptides disclosed herein can include full-length GPPS polypeptides, fragments of GPPS polypeptides, variants of GPPS polypeptides, truncated GPPS polypeptides, or fusion polypeptides having at least one activity of a GPPS polypeptide.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide, wherein the GPPS polypeptide is a variant GPPS (ERG20mut, F96W, N127W) polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 41. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide, wherein the GPPS polypeptide is a variant GPPS (ERG20mut, F96W, N127W) polypeptide comprising the amino acid sequence set forth in SEQ ID NO:41 or conservatively substituted amino acid sequences thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide, wherein the GPPS polypeptide is a variant GPPS (ERG20mut, F96W, N127W) polypeptide comprising an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID NO: 41. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide, wherein the GPPS polypeptide is a variant GPPS (ERG20mut, F96W, N127W) polypeptide comprising an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID NO: 41. 
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide, wherein the GPPS polypeptide is a variant GPPS (ERG20mut, F96W, N127W) polypeptide comprising an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 41. Mutations in this amino acid sequence alter the ratio of GPP to farnesyl diphosphate (FPP), increasing the GPP yield required to produce CBDA or THCA.
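The substitution notation used for the variant polypeptides above (e.g., F96W and N127W for the GPPS variant, or Y27F for the OAC variant) names the original residue, its 1-based position, and the replacement residue. The following is a minimal sketch of how such notation can be applied to a sequence; the function name and the five-residue toy sequence are hypothetical, and the sequence is not SEQ ID NO: 41.

```python
import re
from typing import Iterable


def apply_substitutions(sequence: str, substitutions: Iterable[str]) -> str:
    """Apply substitutions written as <original><position><replacement>.

    Positions are 1-based. Each substitution is checked against the
    input sequence before it is applied.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        original, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3)
        )
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(
                f"expected {original} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Toy example (hypothetical sequence): F at position 3 and N at position 5.
print(apply_substitutions("MAFKN", ["F3W", "N5W"]))
```

Checking the original residue before substituting guards against numbering a substitution relative to the wrong parent sequence, for example a truncated or tagged form.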
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide, such as a full-length GPPS polypeptide, a fragment of a GPPS polypeptide, a variant of a GPPS polypeptide, a truncated GPPS polypeptide, or a fusion polypeptide having at least one activity of a GPPS polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, the GPPS polypeptide is overexpressed in the modified host cell. Overexpression can be achieved by increasing the copy number of one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide, for example by using a high copy number expression vector (e.g., a plasmid present in 10-40 copies or about 100 copies per cell) and/or by operably linking the nucleotide sequence encoding the GPPS polypeptide to a strong promoter. In some embodiments, the modified host cell has one copy of a heterologous nucleic acid comprising a nucleotide sequence encoding a GPPS polypeptide. In some embodiments, the modified host cell has two copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GPPS polypeptide. In some embodiments, the modified host cell has three copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GPPS polypeptide. In some embodiments, the modified host cell has four copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GPPS polypeptide. In some embodiments, the modified host cell has five copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GPPS polypeptide. In some embodiments, the modified host cell has six copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GPPS polypeptide. In some embodiments, the modified host cell has seven copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GPPS polypeptide. In some embodiments, the modified host cell has eight copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GPPS polypeptide. In some embodiments, the modified host cell has eight or more copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a GPPS polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide, wherein the GPPS polypeptide is a variant GPPS (ERG20mut, F96W, N127W) polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 40. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide, wherein the GPPS polypeptide is a variant GPPS (ERG20mut, F96W, N127W) polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID NO:40 or a codon degenerate nucleotide sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide, wherein the GPPS polypeptide is a variant GPPS (ERG20mut, F96W, N127W) polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID NO: 40. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a GPPS polypeptide, wherein the GPPS polypeptide is a variant GPPS (ERG20mut, F96W, N127W) polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 40.
Polypeptides for production of acetyl-CoA from pyruvate
The modified host cells of the present disclosure may comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a polypeptide that produces acetyl-CoA from pyruvate. The polypeptide that produces acetyl-CoA from pyruvate may include a pyruvate decarboxylase (PDC) polypeptide. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDC polypeptide.
Exemplary PDC polypeptides disclosed herein can include full-length PDC polypeptides, fragments of PDC polypeptides, variants of PDC polypeptides, truncated PDC polypeptides, or fusion polypeptides having at least one activity of a PDC polypeptide.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDC polypeptide, wherein the PDC polypeptide comprises the amino acid sequence set forth in SEQ ID No. 35. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDC polypeptide, wherein the PDC polypeptide comprises the amino acid sequence set forth in SEQ ID NO:35 or conservatively substituted amino acid sequences thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDC polypeptide, wherein the PDC polypeptide comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID No. 35. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDC polypeptide, wherein the PDC polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID No. 35. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDC polypeptide, wherein the PDC polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 35.
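The percent amino acid sequence identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and counts identical residues over the full alignment length; this is only one possible convention, and the function names and toy sequences are hypothetical rather than taken from SEQ ID NO: 35.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: both inputs have the same length and use "-" for gaps.
    Identity is the number of identical, non-gap columns divided by the
    alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, min_percent: float
) -> bool:
    """Return True if identity is at least min_percent (e.g., 85.0)."""
    return percent_identity(aligned_a, aligned_b) >= min_percent


# Toy example (hypothetical sequences): 9 of 10 aligned residues match.
print(percent_identity("MKTAYIAKQR", "MKTAYIAKQK"))
print(meets_identity_threshold("MKTAYIAKQR", "MKTAYIAKQK", 85.0))
```

In practice the alignment itself would come from an alignment program, and the reported value depends on the alignment parameters and on how gaps are counted.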
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising nucleotide sequences encoding PDC polypeptides, such as full-length PDC polypeptides, fragments of PDC polypeptides, variants of PDC polypeptides, truncated PDC polypeptides, or fusion polypeptides having at least one activity of PDC polypeptides. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, PDC polypeptides are overexpressed in a modified host cell. Overexpression may be achieved by increasing the copy number of one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDC polypeptide, for example by using a high copy number expression vector (e.g., a plasmid present in 10-40 copies or about 100 copies per cell) and/or by operably linking the nucleotide sequence encoding a PDC polypeptide to a strong promoter. In some embodiments, the modified host cell has one copy of a heterologous nucleic acid comprising a nucleotide sequence encoding a PDC polypeptide. In some embodiments, the modified host cell has two copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a PDC polypeptide. In some embodiments, the modified host cell has three copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a PDC polypeptide. In some embodiments, the modified host cell has four copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a PDC polypeptide. In some embodiments, the modified host cell has five copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a PDC polypeptide. In some embodiments, the modified host cell has five or more copies of a heterologous nucleic acid comprising a nucleotide sequence encoding a PDC polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDC polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 34. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDC polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID NO:34 or a codon degenerate nucleotide sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDC polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 34. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDC polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 34.
Polypeptides for condensing two acetyl-CoA molecules to produce acetoacetyl-CoA
The modified host cells of the present disclosure may comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a polypeptide that condenses two acetyl-CoA molecules to produce acetoacetyl-CoA. In some embodiments, the polypeptide that condenses two acetyl-CoA molecules to produce acetoacetyl-CoA is an acetoacetyl-CoA thiolase polypeptide. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide.
Exemplary acetoacetyl-CoA thiolase polypeptides disclosed herein may include full-length acetoacetyl-CoA thiolase polypeptides, fragments of acetoacetyl-CoA thiolase polypeptides, variants of acetoacetyl-CoA thiolase polypeptides, truncated acetoacetyl-CoA thiolase polypeptides, or fusion polypeptides having at least one activity of an acetoacetyl-CoA thiolase polypeptide.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide, wherein the acetoacetyl-CoA thiolase polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 31. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide, wherein the acetoacetyl-CoA thiolase polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 31 or a conservatively substituted amino acid sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide, wherein the acetoacetyl-CoA thiolase polypeptide comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID NO: 31. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide, wherein the acetoacetyl-CoA thiolase polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID NO: 31.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide, wherein the acetoacetyl-CoA thiolase polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID NO: 31.
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide, such as a full-length acetoacetyl-CoA thiolase polypeptide, a fragment of an acetoacetyl-CoA thiolase polypeptide, a variant of an acetoacetyl-CoA thiolase polypeptide, a truncated acetoacetyl-CoA thiolase polypeptide, or a fusion polypeptide having at least one activity of an acetoacetyl-CoA thiolase polypeptide. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, the acetoacetyl-CoA thiolase polypeptide is overexpressed in the modified host cell. Overexpression can be achieved by increasing the copy number of one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide, for example, by using a high copy number expression vector (e.g., a plasmid present at 10-40 copies or about 100 copies per cell) and/or by operably linking a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide to a strong promoter. In some embodiments, the modified host cell has one copy of a heterologous nucleic acid comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide. In some embodiments, the modified host cell has two copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide. In some embodiments, the modified host cell has three copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide. In some embodiments, the modified host cell has four copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide. In some embodiments, the modified host cell has five copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide. In some embodiments, the modified host cell has five or more copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID NO: 30. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID NO: 30 or a codon degenerate nucleotide sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID NO: 30. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID NO: 30.
Mevalonate pathway polypeptides
The modified host cells of the present disclosure may comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides having at least one activity of a polypeptide present in the Mevalonate (MEV) pathway. In certain such embodiments, the one or more polypeptides having at least one activity of a polypeptide present in a Mevalonate (MEV) pathway comprise one or more MEV pathway polypeptides.
In some embodiments, the one or more polypeptides that are part of a biosynthetic pathway to GPP may be one or more polypeptides having at least one activity of a polypeptide present in a mevalonate pathway. The mevalonate pathway may comprise polypeptides that catalyze the following steps: (a) condensing two acetyl-CoA molecules to produce acetoacetyl-CoA (e.g., by the action of an acetoacetyl-CoA thiolase polypeptide); (b) condensing acetoacetyl-CoA with acetyl-CoA to form hydroxymethylglutaryl-CoA (HMG-CoA) (e.g., by the action of an HMGS polypeptide); (c) converting HMG-CoA to mevalonate (e.g., by the action of an HMGR polypeptide); (d) phosphorylating mevalonate to mevalonate 5-phosphate (e.g., by the action of an MK polypeptide); (e) converting mevalonate 5-phosphate to mevalonate 5-pyrophosphate (e.g., by the action of a PMK polypeptide); (f) converting mevalonate 5-pyrophosphate to isopentenyl pyrophosphate (e.g., by the action of a mevalonate pyrophosphate decarboxylase (MPD or MVD1) polypeptide); and (g) converting isopentenyl pyrophosphate to dimethylallyl pyrophosphate (e.g., by the action of an isopentenyl pyrophosphate isomerase (IDI1) polypeptide).
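Steps (a) through (g) above can be summarized as an ordered table in which the product of each step is the substrate of the next. The following is a minimal sketch of that table; the data structure, the function names, and the simplification of each step to a single main substrate and product are illustrative assumptions, not part of the disclosure.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    label: str
    substrate: str
    product: str
    enzyme: str


# Steps (a)-(g) as listed in the text, reduced to one main substrate
# and product per step (cofactors and second substrates are omitted).
MEV_PATHWAY: List[Step] = [
    Step("a", "acetyl-CoA", "acetoacetyl-CoA", "acetoacetyl-CoA thiolase"),
    Step("b", "acetoacetyl-CoA", "HMG-CoA", "HMGS"),
    Step("c", "HMG-CoA", "mevalonate", "HMGR"),
    Step("d", "mevalonate", "mevalonate 5-phosphate", "MK"),
    Step("e", "mevalonate 5-phosphate", "mevalonate 5-pyrophosphate", "PMK"),
    Step("f", "mevalonate 5-pyrophosphate", "isopentenyl pyrophosphate", "MVD1"),
    Step("g", "isopentenyl pyrophosphate", "dimethylallyl pyrophosphate", "IDI1"),
]


def is_linear_chain(steps: List[Step]) -> bool:
    """Check that each step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(steps, steps[1:]))


def enzymes_between(start: str, end: str) -> List[str]:
    """List the enzymes needed to convert `start` into `end`."""
    enzymes: List[str] = []
    collecting = False
    for step in MEV_PATHWAY:
        if step.substrate == start:
            collecting = True
        if collecting:
            enzymes.append(step.enzyme)
            if step.product == end:
                return enzymes
    raise ValueError(f"no route from {start!r} to {end!r}")


print(is_linear_chain(MEV_PATHWAY))
print(enzymes_between("HMG-CoA", "isopentenyl pyrophosphate"))
```

A table of this kind makes it straightforward to see which polypeptides a host cell would need in order to carry a given intermediate through to isopentenyl pyrophosphate and dimethylallyl pyrophosphate.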
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MEV pathway polypeptide. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding more than one MEV pathway polypeptide. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising nucleotide sequences encoding more than two MEV pathway polypeptides. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising nucleotide sequences encoding more than three MEV pathway polypeptides. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising nucleotide sequences encoding more than four MEV pathway polypeptides. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding more than five MEV pathway polypeptides. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding more than six MEV pathway polypeptides. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding all MEV pathway polypeptides.
Exemplary MEV pathway polypeptides disclosed herein may also include full-length MEV pathway polypeptides, fragments of MEV pathway polypeptides, variants of MEV pathway polypeptides, truncated MEV pathway polypeptides, or fusion polypeptides having at least one activity of MEV pathway polypeptides. In some embodiments, the one or more MEV pathway polypeptides are selected from the group consisting of: acetoacetyl-CoA thiolase polypeptides, HMGS polypeptides, HMGR polypeptides, MK polypeptides, PMK polypeptides, MVD1 polypeptides, and IDI1 polypeptides.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an HMGS polypeptide, wherein the HMGS polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 29. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a HMGS polypeptide, wherein the HMGS polypeptide comprises the amino acid sequence set forth in SEQ ID No. 29, or a conservatively substituted amino acid sequence thereof. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a HMGS polypeptide, wherein the HMGS polypeptide comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID No. 29. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an HMGS polypeptide, wherein the HMGS polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID No. 29. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a HMGS polypeptide, wherein the HMGS polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 29.
In some embodiments, the HMGR polypeptide is a truncated HMGR (tHMGR) polypeptide. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tHMGR polypeptide, wherein the tHMGR polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 27. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tHMGR polypeptide, wherein the tHMGR polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 27 or a conservatively substituted amino acid sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tHMGR polypeptide, wherein the tHMGR polypeptide comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID NO: 27. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tHMGR polypeptide, wherein the tHMGR polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID NO: 27. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tHMGR polypeptide, wherein the tHMGR polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID NO: 27.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MK polypeptide, wherein the MK polypeptide comprises the amino acid sequence shown in SEQ ID NO. 39. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MK polypeptide, wherein the MK polypeptide comprises the amino acid sequence shown in SEQ ID NO. 39 or conservatively substituted amino acid sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MK polypeptide comprising an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID No. 39. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MK polypeptide comprising an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID No. 39. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MK polypeptide, wherein the MK polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 39.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PMK polypeptide, wherein the PMK polypeptide comprises the amino acid sequence set forth in SEQ ID No. 37. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PMK polypeptide, wherein the PMK polypeptide comprises the amino acid sequence set forth in SEQ ID No. 37 or conservatively substituted amino acid sequences thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PMK polypeptide, wherein the PMK polypeptide comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID No. 37. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PMK polypeptide, wherein the PMK polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID No. 37. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PMK polypeptide, wherein the PMK polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 37.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a MVD1 polypeptide, wherein the MVD1 polypeptide comprises the amino acid sequence set forth in SEQ ID NO: 33. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a MVD1 polypeptide, wherein the MVD1 polypeptide comprises the amino acid sequence set forth in SEQ ID No. 33 or conservatively substituted amino acid sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a MVD1 polypeptide, wherein the MVD1 polypeptide comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID No. 33. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a MVD1 polypeptide, wherein the MVD1 polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID No. 33. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a MVD1 polypeptide, wherein the MVD1 polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 33.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IDI1 polypeptide, wherein the IDI1 polypeptide comprises the amino acid sequence set forth in SEQ ID No. 25. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IDI1 polypeptide, wherein the IDI1 polypeptide comprises the amino acid sequence set forth in SEQ ID No. 25 or conservatively substituted amino acid sequences thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IDI1 polypeptide, wherein the IDI1 polypeptide comprises an amino acid sequence having at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, or at least 75% amino acid sequence identity to SEQ ID No. 25. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IDI1 polypeptide, wherein the IDI1 polypeptide comprises an amino acid sequence having at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% amino acid sequence identity to SEQ ID No. 25. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IDI1 polypeptide, wherein the IDI1 polypeptide comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% amino acid sequence identity to SEQ ID No. 25.
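The identity thresholds recited in the paragraphs above can be illustrated with a short calculation. The following is a minimal sketch only: it assumes the two sequences have already been aligned (gaps written as "-") and that identity is counted as identical residue columns divided by aligned columns, which is one common convention and not necessarily the one used elsewhere in this disclosure. The function names and the toy sequences are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption (not from the disclosure): identity is the number of columns
    with identical residues divided by the number of aligned columns;
    columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A gap opposite a residue counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair has at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment: five of six columns are identical.
reference = "MKTLAY"
candidate = "MKT-AY"
identity = percent_identity(reference, candidate)
print(f"{identity:.2f}% identity")
print("at least 80%:", meets_identity(reference, candidate, 80))
print("at least 85%:", meets_identity(reference, candidate, 85))
```

In practice the alignment itself would come from a dedicated alignment tool, and the choice of denominator (alignment length versus reference length) changes the reported percentage.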
Exemplary heterologous nucleic acids disclosed herein can include nucleic acids comprising a nucleotide sequence encoding an MEV pathway polypeptide, such as a full-length MEV pathway polypeptide, a fragment of an MEV pathway polypeptide, a variant of an MEV pathway polypeptide, a truncated MEV pathway polypeptide, or a fusion polypeptide having at least one activity of a polypeptide that is part of an MEV pathway. In some embodiments, the nucleotide sequence is codon optimized.
In some embodiments, one or more MEV pathway polypeptides are overexpressed in the modified host cell. Overexpression can be achieved by increasing the copy number of one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MEV pathway polypeptide, for example by using a high copy number expression vector (e.g., a plasmid present in 10-40 copies or about 100 copies per cell) and/or by operably linking a nucleotide sequence encoding an MEV pathway polypeptide to a strong promoter. In some embodiments, the modified host cell has one copy of a heterologous nucleic acid comprising a nucleotide sequence encoding an MEV pathway polypeptide. In some embodiments, the modified host cell has two copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an MEV pathway polypeptide. In some embodiments, the modified host cell has three copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an MEV pathway polypeptide. In some embodiments, the modified host cell has four copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an MEV pathway polypeptide. In some embodiments, the modified host cell has five copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an MEV pathway polypeptide. In some embodiments, the modified host cell has five or more copies of a heterologous nucleic acid comprising a nucleotide sequence encoding an MEV pathway polypeptide.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an HMGS polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 28. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an HMGS polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 28 or a codon degenerate nucleotide sequence thereof. In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a HMGS polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 28. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an HMGS polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 28.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tHMGR polypeptide, wherein the nucleotide sequence is that set forth in SEQ ID No. 26. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tHMGR polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 26 or a codon degenerate nucleotide sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tHMGR polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 26. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tHMGR polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 26.
In some embodiments, a modified host cell of the present disclosure comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a tHMGR polypeptide. In some embodiments, a modified host cell of the present disclosure comprises two heterologous nucleic acids comprising a nucleotide sequence encoding a tHMGR polypeptide.
In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MK polypeptide, wherein the nucleotide sequence is that shown in SEQ ID NO. 38. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MK polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID NO:38 or a codon degenerate nucleotide sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MK polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 38. In some embodiments, a modified host cell of the disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MK polypeptide, wherein said nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 38.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PMK polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 36. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PMK polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID NO:36 or a codon degenerate nucleotide sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PMK polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 36. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PMK polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 36.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MVD1 polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 32. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a MVD1 polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 32 or a codon degenerate nucleotide sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a MVD1 polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 32. In some embodiments, a modified host cell of the disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an MVD1 polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 32.
In some embodiments, the modified host cells of the present disclosure comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IDI1 polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 24. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IDI1 polypeptide, wherein the nucleotide sequence is the nucleotide sequence set forth in SEQ ID No. 24 or a codon degenerate nucleotide sequence thereof. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IDI1 polypeptide, wherein the nucleotide sequence has at least 80%, at least 81%, at least 82%, at least 83%, or at least 84% sequence identity to SEQ ID No. 24. In some embodiments, a modified host cell of the present disclosure comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IDI1 polypeptide, wherein the nucleotide sequence has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, at least 99.9%, or 100% sequence identity to SEQ ID No. 24.
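The phrase "codon degenerate nucleotide sequence" used in the paragraphs above can be made concrete with a small check: two coding sequences are codon degenerate with respect to each other when they translate to the same amino acid sequence. The sketch below assumes the standard genetic code and in-frame DNA sequences; the function names and the toy sequences are hypothetical and are not taken from any SEQ ID NO.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_codon_degenerate(seq_a: str, seq_b: str) -> bool:
    """True if both nucleotide sequences encode the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)


# Hypothetical toy sequences: GCT and GCC both encode alanine.
print(is_codon_degenerate("ATGGCTTAA", "ATGGCCTAA"))
# GCT (alanine) versus GAT (aspartate) changes the encoded polypeptide.
print(is_codon_degenerate("ATGGCTTAA", "ATGGATTAA"))
```

Codon optimization for a given host can be viewed as choosing, among all such degenerate sequences, one whose codons are favored by that host.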
Modified host cells that produce cannabinoids or cannabinoid derivatives and/or express engineered variants of the disclosure
The present disclosure provides modified host cells comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure. Modified host cells of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure are useful for producing cannabinoids or cannabinoid derivatives and/or for expressing the engineered variants of the present disclosure.
The present disclosure provides modified host cells for the production of cannabinoids or cannabinoid derivatives. To produce a cannabinoid or cannabinoid derivative, a modified host cell disclosed herein can be modified to express or overexpress one or more nucleic acids comprising a nucleotide sequence encoding one or more of the engineered variants of the present disclosure, a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, and/or one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). The modified host cell for the production of a cannabinoid or cannabinoid derivative may comprise a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide. In certain such embodiments, the modified host cell for production of a cannabinoid or cannabinoid derivative may comprise a deletion of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide. In some embodiments, the modified host cell for production of a cannabinoid or cannabinoid derivative may comprise down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide. In some embodiments, a modified host cell for production of a cannabinoid or cannabinoid derivative comprises a codon-optimized nucleotide sequence encoding an engineered variant of the disclosure. In some embodiments, the nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, and/or one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA) is codon optimized.
The present disclosure also provides modified host cells modified to express or overexpress one or more nucleic acids comprising nucleotide sequences encoding the engineered variants of the disclosure. In some embodiments of modified host cells for expressing engineered variants of the present disclosure, the modified host cells comprise one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide. In some embodiments of modified host cells for expressing the engineered variants of the present disclosure, the modified host cells comprise one or more nucleic acids comprising a nucleotide sequence encoding the engineered variants of the present disclosure and a deletion or down-regulation of one or more genes encoding one or more of the ROT2 polypeptide or PEP4 polypeptide. In certain such embodiments, the modified host cell may comprise a deletion of one or more genes encoding one or more of the ROT2 polypeptide or PEP4 polypeptide. In some embodiments, the modified host cell may comprise down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide. In some embodiments of the modified host cells used to express the engineered variants of the present disclosure, the nucleotide sequence encoding the engineered variants of the present disclosure is a codon optimized nucleotide sequence. In some embodiments, the nucleotide sequence encoding one or more of a KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, or IRE1 polypeptide is codon optimized.
For the production of cannabinoids or cannabinoid derivatives, expression or overexpression of one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure in a modified host cell may be combined with expression or overexpression by the modified host cell of one or more heterologous nucleic acids disclosed herein (e.g., one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, FAD1 polypeptide, or IRE1 polypeptide) and/or with deletion or downregulation of one or more genes encoding one or more of ROT2 polypeptide or PEP4 polypeptide. In some embodiments, the nucleotide sequence encoding the engineered variants of the disclosure is a codon optimized nucleotide sequence.
For expressing or overexpressing the engineered variants of the present disclosure, expression or overexpression of one or more nucleic acids comprising a nucleotide sequence encoding the engineered variants in a modified host cell can be combined with expression or overexpression by the modified host cell of one or more heterologous nucleic acids disclosed herein (e.g., one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide) and/or with deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide. In some embodiments, the nucleotide sequence encoding the engineered variant is a codon optimized nucleotide sequence.
In some embodiments, the amount of cannabinoid or cannabinoid derivative, in mg/L or mM, produced by a modified host cell of the disclosure for producing the cannabinoid or cannabinoid derivative is greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO: 44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, when grown under similar culture conditions for the same length of time. In some embodiments, the amount of cannabinoid or cannabinoid derivative, in mg/L or mM, produced by the modified host cell for producing the cannabinoid or cannabinoid derivative is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO: 44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, when grown under similar culture conditions for the same length of time.
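The comparison described above amounts to a relative increase in titer between two strains measured in the same units under matched conditions. The following is a minimal sketch of that arithmetic; the function names are hypothetical, the threshold list simply mirrors the percentages recited above, and the titer values are invented for illustration and are not experimental results.

```python
# Percentages recited in the paragraph above.
THRESHOLDS = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50,
              60, 70, 80, 90, 100, 150, 200, 500, 1000]


def percent_increase(variant_titer: float, reference_titer: float) -> float:
    """Relative increase of the variant strain over the reference strain.

    Both titers must be in the same units (e.g., mg/L or mM) and come from
    cultures grown under similar conditions for the same length of time.
    """
    if reference_titer <= 0:
        raise ValueError("reference titer must be positive")
    return 100.0 * (variant_titer - reference_titer) / reference_titer


def highest_threshold_met(increase: float):
    """Largest recited threshold satisfied by `increase`, or None if none is."""
    met = [t for t in THRESHOLDS if increase >= t]
    return max(met) if met else None


# Invented example values in mg/L (illustration only).
reference_titer = 10.0   # strain expressing the SEQ ID NO: 44 synthase
variant_titer = 30.0     # strain expressing an engineered variant
increase = percent_increase(variant_titer, reference_titer)
print(increase, highest_threshold_met(increase))
```

A variant strain producing three times the reference titer therefore corresponds to an increase of 200%, not 300%, because the comparison is expressed as the amount by which the reference is exceeded.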
In some embodiments, the amount of cannabinoid or cannabinoid derivative, in mg/L or mM, produced by a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure is greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO: 44, but lacking the nucleic acid comprising the nucleotide sequence encoding the engineered variant, when grown under similar culture conditions for the same length of time. In some embodiments, the amount of cannabinoid or cannabinoid derivative, in mg/L or mM, produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO: 44, but lacking the nucleic acid comprising the nucleotide sequence encoding the engineered variant, when grown under similar culture conditions for the same length of time.
In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO: 44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, the amount of cannabinoid or cannabinoid derivative, in mg/L or mM, produced by a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding one or more of the disclosed engineered variants and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide is greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO: 44 and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, when grown under similar culture conditions for the same length of time.
In some embodiments, the amount of cannabinoid or cannabinoid derivative, in mg/L or mM, produced by a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding one or more of the engineered variants of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time.
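The percentage comparison recited above is simple arithmetic on two titers measured in the same unit. The following is a minimal sketch of that arithmetic, not a method taken from the disclosure; the function names and the example titers are hypothetical, and it assumes both cultures were grown under similar conditions for the same length of time.

```python
# Illustrative sketch only: function names and example values are
# hypothetical and are not taken from the disclosure.

# The "at least N%" thresholds recited in the paragraph above.
THRESHOLDS_PCT = (5, 10, 15, 20, 25, 30, 35, 40, 45, 50,
                  60, 70, 80, 90, 100, 150, 200, 500, 1000)


def percent_increase(variant_titer: float, control_titer: float) -> float:
    """Percent by which the variant titer exceeds the control titer.

    Both titers must be expressed in the same unit (mg/L or mM).
    """
    if control_titer <= 0:
        raise ValueError("control titer must be positive")
    return 100.0 * (variant_titer - control_titer) / control_titer


def highest_threshold_met(increase_pct: float):
    """Return the largest recited threshold satisfied, or None if none is."""
    met = [t for t in THRESHOLDS_PCT if increase_pct >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical titers in mg/L for the variant strain and the
    # SEQ ID NO:44 control strain.
    inc = percent_increase(variant_titer=150.0, control_titer=100.0)
    print(inc, highest_threshold_met(inc))
```

The same two functions apply unchanged to any of the strain pairs compared in this section, provided the control differs from the test strain only in lacking the engineered variant.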
In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, a FAD1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), prenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, the amount of cannabinoid or cannabinoid derivative, in mg/L or mM, produced by a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide is greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time. In some embodiments, the amount of cannabinoid or cannabinoid derivative, in mg/L or mM, produced by a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, the amount of cannabinoid or cannabinoid derivative, in mg/L or mM, produced by a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding one or more of the engineered variants of the present disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide is greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments, the amount of cannabinoid or cannabinoid derivative, in mg/L or mM, produced by a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding one or more of the engineered variants of the present disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, the growth rate and/or biomass yield of a modified host cell of the present disclosure used to produce a cannabinoid or cannabinoid derivative is similar to or lower than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time. In some embodiments, the modified host cells of the present disclosure for production of cannabinoids or cannabinoid derivatives have a growth rate and/or biomass yield that is similar or lower and has an increased THCA titer compared to the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments, the modified host cells of the present disclosure for producing a cannabinoid or a cannabinoid derivative have a faster growth rate and/or a higher biomass yield as compared to the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time. In some embodiments, the growth rate and/or biomass yield of a modified host cell of the present disclosure for producing a cannabinoid or cannabinoid derivative is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% faster/higher than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time.
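The text does not state how growth rate is measured. As one possible reading, the sketch below assumes growth rate means the specific growth rate estimated from two optical-density readings taken during exponential growth, and then expresses the variant strain's rate as a percentage above the control strain's rate. The measurement approach, function names, and numbers are all assumptions made for illustration.

```python
import math

# Illustrative sketch only. Assumes (this is not stated in the text) that
# growth rate is the specific growth rate mu, in 1/h, estimated from two
# optical-density readings taken during exponential growth:
#     mu = ln(OD_end / OD_start) / (t_end - t_start)


def specific_growth_rate(od_start: float, od_end: float,
                         t_start_h: float, t_end_h: float) -> float:
    """Specific growth rate (1/h) between two exponential-phase readings."""
    if od_start <= 0 or od_end <= 0:
        raise ValueError("optical densities must be positive")
    if t_end_h <= t_start_h:
        raise ValueError("t_end_h must be later than t_start_h")
    return math.log(od_end / od_start) / (t_end_h - t_start_h)


def percent_faster(variant_rate: float, control_rate: float) -> float:
    """Percent by which the variant growth rate exceeds the control rate."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (variant_rate - control_rate) / control_rate


if __name__ == "__main__":
    # Hypothetical readings: both strains sampled at 0 h and 4 h.
    mu_variant = specific_growth_rate(0.10, 0.48, 0.0, 4.0)
    mu_control = specific_growth_rate(0.10, 0.40, 0.0, 4.0)
    print(round(percent_faster(mu_variant, mu_control), 1))
```

Biomass yield could be compared with the same `percent_faster` helper by passing, for example, final dry cell weights in place of rates.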
In some embodiments, the growth rate and/or biomass yield of a modified host cell of the present disclosure for expressing an engineered variant of the present disclosure grown under similar culture conditions for the same length of time is similar to or lower than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID No. 44, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant. In some embodiments, a modified host cell of the present disclosure expressing an engineered variant of the present disclosure has a growth rate and/or biomass yield similar or lower and has an increased THCA titer compared to the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID No. 44, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments, the modified host cells of the present disclosure used to express the engineered variants of the present disclosure have a faster growth rate and/or higher biomass yield as compared to the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time. In some embodiments, the growth rate and/or biomass yield of a modified host cell of the present disclosure for expressing an engineered variant of the present disclosure is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% faster/higher than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments, the growth rate and/or biomass yield of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure grown under similar culture conditions for the same length of time is similar to or lower than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID No. 44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant. In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure has a growth rate and/or biomass yield similar or lower and has an increased THCA titer compared to the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID No. 44, but lacking the nucleic acid comprising the nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure has a faster growth rate and/or higher biomass yield as compared to the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time. In some embodiments, the growth rate and/or biomass yield of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% faster/higher than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide has a faster growth rate and/or higher biomass yield as compared to the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time. In some embodiments, the growth rate and/or biomass yield of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding one or more of the engineered variants of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% faster/higher than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), prenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide has a faster growth rate and/or higher biomass yield as compared to the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time. In some embodiments, the growth rate and/or biomass yield of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% faster/higher than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding one or more of the engineered variants of the present disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide has a faster growth rate and/or higher biomass yield as compared to the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments, the growth rate and/or biomass yield of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% faster/higher than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, the modified host cells of the present disclosure used to produce a cannabinoid or cannabinoid derivative produce THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced by modified host cells comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking nucleic acids comprising nucleotide sequences encoding the engineered variant, grown for the same length of time under similar culture conditions. In some embodiments, the modified host cell for production of a cannabinoid or cannabinoid derivative produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
In some embodiments, the modified host cells of the present disclosure used to express the engineered variants of the present disclosure produce THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced by modified host cells comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking the nucleic acid comprising the nucleotide sequence encoding the engineered variant, grown for the same length of time under similar culture conditions. In some embodiments, a modified host cell for expressing an engineered variant of the disclosure produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure produces THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time. In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1. In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
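The THCA-to-other-cannabinoid ratios recited above can be computed from measured product amounts. The sketch below is illustrative only and is not part of the disclosure: the function names and the example titers are hypothetical, and the ratio is assumed to be a simple quotient of amounts expressed in the same units.

```python
"""Illustrative only: THCA : other-cannabinoid product ratio."""

# Ratio levels (N in "about N:1") recited in the text.
RATIO_LEVELS = (11, 11.5, 12, 12.5, 13, 13.5, 14, 14.5, 15, 15.5, 16, 16.5,
                17, 17.5, 18, 18.5, 19, 19.5, 20, 25, 30, 35, 40, 45, 50,
                60, 70, 80, 90, 100, 150, 200, 500)


def product_ratio(thca_amount: float, other_amount: float) -> float:
    """Return THCA amount divided by the amount of the other cannabinoid (e.g., CBCA)."""
    if other_amount <= 0:
        raise ValueError("other_amount must be positive")
    return thca_amount / other_amount


def format_ratio(ratio: float) -> str:
    """Render a ratio in the 'N:1' style used in the text."""
    return f"{ratio:g}:1"


def highest_level_met(ratio: float):
    """Return the largest recited level that the ratio reaches, or None."""
    met = [level for level in RATIO_LEVELS if ratio >= level]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical titers (mg/L) from one culture fed CBGA.
    ratio = product_ratio(thca_amount=450.0, other_amount=30.0)
    print(format_ratio(ratio), highest_level_met(ratio))
```

Comparing the ratio from a cell expressing the engineered variant with the ratio from the SEQ ID NO:44 comparator cell, both grown for the same time under similar conditions, gives the "increased ratio" comparison described in the text.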
In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide produces THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time. In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
In some embodiments of the modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), prenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, a modified host cell of the disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide produces THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time. In some embodiments, a modified host cell of the disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a THCAS polypeptide having the amino acid sequence of SEQ ID NO:44 and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, a modified host cell of the disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide produces THCA from CBGA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant, grown under similar culture conditions for the same length of time.
In some embodiments, a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1. In some embodiments of modified host cells of the disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, the modified host cells comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, the growth and/or viability of the modified host cells of the present disclosure used to produce cannabinoids or cannabinoid derivatives is not significantly reduced compared to the growth and/or viability of the unmodified host cells. In some embodiments, the cell density of a culture of a modified host cell of the present disclosure used to produce a cannabinoid or cannabinoid derivative is at least 25% or more, at least 30% or more, at least 35% or more, at least 40% or more, at least 45% or more, at least 50% or more, at least 55% or more, at least 60% or more, at least 65% or more, at least 70% or more, at least 75% or more, at least 80% or more, at least 85% or more, at least 90% or more, at least 95% or more, at least 100% or more, at least 110% or more, at least 120% or more, at least 130% or more, at least 140% or more, or at least 150% or more greater than the cell density of a culture of an unmodified control host cell grown in the same medium and under the same culture conditions for the same time.
In some embodiments, the growth and/or viability of the modified host cells of the present disclosure used to express the engineered variants of the present disclosure is not significantly reduced compared to the growth and/or viability of the unmodified host cells. In some embodiments, the cell density of a culture of a modified host cell of the present disclosure used to express an engineered variant of the present disclosure is at least 25% or more, at least 30% or more, at least 35% or more, at least 40% or more, at least 45% or more, at least 50% or more, at least 55% or more, at least 60% or more, at least 65% or more, at least 70% or more, at least 75% or more, at least 80% or more, at least 85% or more, at least 90% or more, at least 95% or more, at least 100% or more, at least 110% or more, at least 120% or more, at least 130% or more, at least 140% or more, or at least 150% or more greater than the cell density of a culture of an unmodified control host cell grown in the same medium and under the same culture conditions for the same time.
In some embodiments, the growth and/or viability of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure is not significantly reduced as compared to the growth and/or viability of an unmodified host cell. In some embodiments, the cell density of a culture of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure is at least 25% or more, at least 30% or more, at least 35% or more, at least 40% or more, at least 45% or more, at least 50% or more, at least 55% or more, at least 60% or more, at least 65% or more, at least 70% or more, at least 75% or more, at least 80% or more, at least 85% or more, at least 90% or more, at least 95% or more, at least 100% or more, at least 110% or more, at least 120% or more, at least 130% or more, at least 140% or more, or at least 150% or more greater than the cell density of a culture of an unmodified control host cell grown in the same medium and under the same culture conditions for the same time. In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, the growth and/or viability of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide is not significantly reduced as compared to the growth and/or viability of an unmodified host cell. In some embodiments, the cell density of a culture of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide is at least 25% or more, at least 30% or more, at least 35% or more, at least 40% or more, at least 45% or more, at least 50% or more, at least 55% or more, at least 60% or more, at least 65% or more, at least 70% or more, at least 75% or more, at least 80% or more, at least 85% or more, at least 90% or more, at least 95% or more, at least 100% or more, at least 110% or more, at least 120% or more, at least 130% or more, at least 140% or more, or at least 150% or more greater than the cell density of a culture of an unmodified control host cell grown in the same medium and under the same culture conditions for the same time.
In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, a FAD1 polypeptide, or an IRE1 polypeptide, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, the growth and/or viability of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide is not significantly reduced as compared to the growth and/or viability of an unmodified host cell. In some embodiments, the cell density of a culture of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide is at least 25% or more, at least 30% or more, at least 35% or more, at least 40% or more, at least 45% or more, at least 50% or more, at least 55% or more, at least 60% or more, at least 65% or more, at least 70% or more, at least 75% or more, at least 80% or more, at least 85% or more, at least 90% or more, at least 95% or more, at least 100% or more, at least 110% or more, at least 120% or more, at least 130% or more, at least 140% or more, or at least 150% or more greater than the cell density of a culture of an unmodified control host cell grown in the same medium and under the same culture conditions for the same time.
In some embodiments of modified host cells of the present disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the present disclosure and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
In some embodiments, the growth and/or viability of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide is not significantly reduced as compared to the growth and/or viability of an unmodified host cell. In some embodiments, the cell density of a culture of a modified host cell of the present disclosure comprising one or more nucleic acids comprising a nucleotide sequence encoding one or more of the engineered variants of the present disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide is at least 25% or more, at least 30% or more, at least 35% or more, at least 40% or more, at least 45% or more, at least 50% or more, at least 55% or more, at least 60% or more, at least 65% or more, at least 70% or more, at least 75% or more, at least 80% or more, at least 85% or more, at least 90% or more, at least 95% or more, at least 100% or more, at least 110% or more, at least 120% or more, at least 130% or more, at least 140% or more, or at least 150% or more greater than the cell density of a culture of an unmodified control host cell grown in the same medium and under the same culture conditions for the same time.
In some embodiments of modified host cells of the disclosure comprising one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, the modified host cells comprise one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA).
Suitable host cells
Parent host cells suitable for use in producing the modified host cells of the present disclosure may include eukaryotic cells. In some embodiments, the eukaryotic cell is a yeast cell.
Host cells (including parent host cells and modified host cells) are in some embodiments unicellular organisms, or are grown as single cells in culture. In some embodiments, the host cell is a eukaryotic cell. Suitable eukaryotic host cells can include, but are not limited to, yeast cells and fungal cells. Suitable eukaryotic host cells can include, but are not limited to, Pichia pastoris (now known as Komagataella phaffii), Pichia finlandica, Pichia trehalophila, Pichia koclamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia sp., Saccharomyces cerevisiae, Saccharomyces sp., Hansenula polymorpha (now known as Pichia angusta), Yarrowia lipolytica, Kluyveromyces lactis, Kluyveromyces marxianus, Schizosaccharomyces pombe, Scheffersomyces stipitis, Dekkera bruxellensis, Blastobotrys adeninivorans (previously known as Arxula adeninivorans), Candida albicans, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Trichoderma reesei, Chrysosporium lucknowense, Fusarium graminearum, Fusarium oxysporum, Neurospora crassa, and the like. In some embodiments, the modified host cells disclosed herein are cultured in vitro.
In some embodiments, the host cell of the present disclosure is a yeast cell. In some embodiments, the host cell is a protease-deficient strain of Saccharomyces cerevisiae. Protease-deficient yeast strains can be effective in reducing degradation of the expressed heterologous protein. Examples of proteases deleted in such strains may include one or more of the following: PEP4, PRB1, and KEX1.
In some embodiments, the host cell is Saccharomyces cerevisiae. In some embodiments, the host cell used to produce the modified host cells of the present disclosure may be selected based on ease of culture, rapid growth, availability of tools for modification (such as promoters and vectors), and the safety profile of the host cell. In some embodiments, the host cell used to produce the modified host cells of the present disclosure may be selected for its ability or inability to introduce certain post-translational modifications to the expressed polypeptide (such as the engineered variants of the present disclosure). For example, a modified Komagataella phaffii host cell can hyperglycosylate the engineered variants of the disclosure, and hyperglycosylation can alter the activity of the resulting expressed polypeptide.
Genetic modification of host cells of the disclosure and exemplary modified host cells
The present disclosure provides modified host cells comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, and methods of making the modified host cells. In some embodiments, the methods of making the modified host cells of the present disclosure comprise introducing into a host cell one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure. In some embodiments, the modified host cells of the present disclosure comprise one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure. In some embodiments, the nucleic acid comprises a codon optimized nucleotide sequence.
The present disclosure provides modified host cells and methods of making modified host cells for producing cannabinoids or cannabinoid derivatives comprising introducing into a host cell one or more nucleic acids (e.g., a heterologous nucleic acid) disclosed herein. In some embodiments, the nucleic acid comprises a codon optimized nucleotide sequence.
The present disclosure provides a method of making a modified host cell for producing a cannabinoid or a cannabinoid derivative, the method comprising a) introducing into a host cell one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure. In certain such embodiments, the method comprises b) introducing into the host cell one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide. In some embodiments, the method comprises b) introducing into the host cell one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or a FAD1 polypeptide. In some embodiments, a modified host cell for production of a cannabinoid or cannabinoid derivative comprises a codon-optimized nucleotide sequence encoding an engineered variant of the disclosure.
In some embodiments, the modified host cell for production of cannabinoids or cannabinoid derivatives comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide. In certain such embodiments, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IRE1 polypeptide. In some embodiments, a modified host cell for production of a cannabinoid or cannabinoid derivative comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
In some embodiments, a modified host cell for production of a cannabinoid or cannabinoid derivative comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, or FAD1 polypeptide. In certain such embodiments, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding a FAD1 polypeptide. In some embodiments, a modified host cell for production of a cannabinoid or cannabinoid derivative comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
In some embodiments, the modified host cell for production of a cannabinoid or cannabinoid derivative comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide. In certain such embodiments, the modified host cell comprises a deletion or down-regulation of one or more genes encoding a ROT2 polypeptide and a PEP4 polypeptide. The present disclosure provides a method of making a modified host cell for producing a cannabinoid or cannabinoid derivative, the method comprising introducing into a host cell: a) one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure and b) a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide.
In some embodiments, the modified host cell for production of cannabinoids or cannabinoid derivatives comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide. In certain such embodiments, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding a ROT2 polypeptide and a PEP4 polypeptide. In some embodiments, a modified host cell for the production of a cannabinoid or cannabinoid derivative comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
The present disclosure provides a method of preparing a modified host cell for the production of a cannabinoid or a cannabinoid derivative, the method comprising introducing into a host cell: a) one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, b) one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and c) deletion or downregulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide.
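The a), b), c) combination recited above can be sketched as a simple data structure, which may help when tracking which embodiment a given strain design corresponds to. This is only an illustrative sketch: the gene names come from the text, while the class `StrainDesign`, its field names, and the function `matches_abc` are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass, field

# Gene names are taken from the text; the groupings below are illustrative.
HELPER_GENES = frozenset({"KAR2", "PDI1", "ERO1", "IRE1"})
KNOCKDOWN_TARGETS = frozenset({"ROT2", "PEP4"})


@dataclass(frozen=True)
class StrainDesign:
    """Hypothetical record of the modifications carried by a host cell."""
    has_engineered_variant: bool
    heterologous: frozenset = field(default_factory=frozenset)
    deleted_or_downregulated: frozenset = field(default_factory=frozenset)


def matches_abc(design: StrainDesign) -> bool:
    """Return True when the design has a) the engineered variant,
    b) at least one of KAR2/PDI1/ERO1/IRE1 as a heterologous nucleic acid,
    and c) a deletion or down-regulation of at least one of ROT2/PEP4."""
    return (
        design.has_engineered_variant
        and bool(design.heterologous & HELPER_GENES)
        and bool(design.deleted_or_downregulated & KNOCKDOWN_TARGETS)
    )
```

For example, a design carrying the engineered variant, a heterologous KAR2 sequence, and a PEP4 deletion satisfies the check, whereas a design lacking any ROT2 or PEP4 modification does not.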
The present disclosure provides a method of preparing a modified host cell for the production of a cannabinoid or a cannabinoid derivative, the method comprising introducing into a host cell: a) one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, and b) one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or a FAD1 polypeptide.
In some embodiments, a modified host cell for production of a cannabinoid or cannabinoid derivative can comprise one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the disclosure and express or overexpress a combination of heterologous nucleic acids comprising nucleotide sequences encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, methods of making a modified host cell for production of a cannabinoid or a cannabinoid derivative include introducing into a host cell one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor.
In some embodiments, a modified host cell for production of a cannabinoid or cannabinoid derivative comprises one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In certain such embodiments, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IRE1 polypeptide, and a deletion or down-regulation of genes encoding a ROT2 polypeptide and a PEP4 polypeptide. In some embodiments, a modified host cell for production of a cannabinoid or cannabinoid derivative comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
In some embodiments, a modified host cell for production of a cannabinoid or a cannabinoid derivative comprises one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or a FAD1 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isoprene phosphate, olivetolic acid, or hexanoyl-CoA). In certain such embodiments, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding a FAD1 polypeptide. In some embodiments, a modified host cell for production of a cannabinoid or cannabinoid derivative comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
The present disclosure provides a method of making a modified host cell for expressing an engineered variant of the present disclosure, the method comprising introducing into a host cell one or more nucleic acids disclosed herein. The present disclosure provides a method of making a modified host cell for expressing an engineered variant of the present disclosure, the method comprising introducing into a host cell: a) one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, and b) one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide. The present disclosure provides a method of making a modified host cell for expressing an engineered variant of the present disclosure, the method comprising introducing into a host cell: a) one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, and b) one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or a FAD1 polypeptide. In some embodiments, the modified host cell used to express the engineered variants of the present disclosure comprises a codon-optimized nucleotide sequence encoding the engineered variants of the present disclosure.
In some embodiments, a modified host cell for expressing engineered variants of the disclosure comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure and comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide. In certain such embodiments, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IRE1 polypeptide. In some embodiments, a modified host cell for expressing engineered variants of the disclosure comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
In some embodiments, a modified host cell for expressing an engineered variant of the present disclosure comprises one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or a FAD1 polypeptide. In certain such embodiments, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding a FAD1 polypeptide. In some embodiments, a modified host cell for expressing an engineered variant of the present disclosure comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
In some embodiments, the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, for expressing the engineered variant of the disclosure, comprises a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide. In certain such embodiments, the modified host cell comprises a deletion or down-regulation of one or more genes encoding a ROT2 polypeptide and a PEP4 polypeptide. The present disclosure provides a method of making a modified host cell for expressing an engineered variant of the present disclosure, the method comprising introducing into a host cell: a) one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure and b) deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide.
In some embodiments, the modified host cells used to express the engineered variants of the present disclosure comprise one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the present disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide. In certain such embodiments, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IRE1 polypeptide, and a deletion or down-regulation of one or more genes encoding a ROT2 polypeptide and a PEP4 polypeptide. In some embodiments, a modified host cell for expressing an engineered variant of the present disclosure comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
The present disclosure provides a method of making a modified host cell for expressing an engineered variant of the present disclosure, the method comprising introducing into a host cell: a) one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, b) one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, PDI1 polypeptide, ERO1 polypeptide, or IRE1 polypeptide, and c) deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or PEP4 polypeptide.
The present disclosure provides a method of making a modified host cell for expressing an engineered variant of the present disclosure, the method comprising introducing into a host cell: a) one or more nucleic acids comprising a nucleotide sequence encoding an engineered variant of the disclosure, and b) one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or a FAD1 polypeptide.
In some embodiments, a modified host cell for expressing an engineered variant of the disclosure may comprise one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the disclosure and express or overexpress a combination of heterologous nucleic acids comprising nucleotide sequences encoding one or more polypeptides involved in the biosynthesis of cannabinoids or cannabinoid precursors (e.g., geranyl pyrophosphate (GPP), isoprene phosphate, olivetolic acid, or hexanoyl-CoA). In some embodiments, methods of making a modified host cell for expression of an engineered variant of the present disclosure comprise introducing into a host cell one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in cannabinoid or cannabinoid precursor biosynthesis.
In some embodiments, a modified host cell for expressing an engineered variant of the disclosure comprises one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or an IRE1 polypeptide, a deletion or down-regulation of one or more genes encoding one or more of a ROT2 polypeptide or a PEP4 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In certain such embodiments, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IRE1 polypeptide, and a deletion or down-regulation of genes encoding a ROT2 polypeptide and a PEP4 polypeptide. In some embodiments, a modified host cell for expressing an engineered variant of the present disclosure comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
In some embodiments, a modified host cell for expressing an engineered variant of the disclosure comprises one or more nucleic acids comprising a polynucleotide sequence encoding an engineered variant of the disclosure, one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more of a KAR2 polypeptide, a PDI1 polypeptide, an ERO1 polypeptide, or a FAD1 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of a cannabinoid or a cannabinoid precursor (e.g., geranyl pyrophosphate (GPP), isopentenyl phosphate, olivetolic acid, or hexanoyl-CoA). In certain such embodiments, the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide, one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide, and one or more heterologous nucleic acids comprising a nucleotide sequence encoding a FAD1 polypeptide. In some embodiments, a modified host cell for expressing an engineered variant of the present disclosure comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
To modify a parent host cell to produce a modified host cell of the present disclosure, one or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein can be stably or transiently introduced into the host cell using established techniques. Such techniques may include, but are not limited to, electroporation, calcium phosphate precipitation, DEAE-dextran mediated transfection, liposome-mediated transfection, lithium acetate methods, and the like. See Gietz, R.D., and R.A. Woods (2002) Transformation of yeast by the LiAc/SS carrier DNA/PEG method. For stable transformation, the nucleic acid (e.g., heterologous nucleic acid) will typically include a selectable marker, e.g., any of several well-known selectable markers such as neomycin resistance, ampicillin resistance, tetracycline resistance, chloramphenicol resistance, kanamycin resistance, and the like. In some embodiments, a parent host cell is modified with one or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein using the CRISPR/Cas9 system to produce a modified host cell of the present disclosure.
In some embodiments, the expression level of the engineered variant and/or the production of cannabinoids or cannabinoid derivatives in the modified host cell may be altered by altering gene copy number, promoter strength, and/or promoter regulation.
One or more of the nucleic acids disclosed herein (e.g., heterologous nucleic acids) can be present in an expression vector or construct. Suitable expression vectors can include, but are not limited to, plasmids, yeast artificial chromosomes, and any other vector specific to a particular target host, such as yeast. Thus, for example, one or more nucleic acids (e.g., heterologous nucleic acids) comprising a nucleotide sequence encoding a mevalonate pathway gene product are included in any of a variety of expression vectors for expressing a mevalonate pathway gene product. Such vectors may include chromosomal, nonchromosomal and synthetic DNA sequences.
The present disclosure provides a method of preparing a modified host cell for the production of a cannabinoid or cannabinoid derivative comprising introducing into a host cell one or more vectors disclosed herein. In certain such embodiments, the one or more vectors comprise one or more vectors comprising one or more nucleic acids (e.g., heterologous nucleic acids) comprising a nucleotide sequence encoding an engineered variant of the disclosure. In certain such embodiments, the one or more vectors comprise one or more vectors comprising one or more nucleic acids (e.g., heterologous nucleic acids) comprising a nucleotide sequence encoding one or more secretory pathway polypeptides. In some embodiments, the method comprises introducing into the host cell a deletion or down-regulation of one or more genes encoding one or more secretory pathway polypeptides. In some embodiments, the nucleotide sequence encoding one or more secretory pathway polypeptides is codon optimized. In some embodiments, the one or more vectors include one or more vectors comprising one or more nucleic acids (e.g., heterologous nucleic acids) comprising a nucleotide sequence encoding one or more polypeptides involved in the biosynthesis of cannabinoids or cannabinoid precursors. In some embodiments, the nucleotide sequence encoding the one or more polypeptides involved in cannabinoid or cannabinoid precursor biosynthesis is codon optimized.
The present disclosure provides a method of making a modified host cell for expression of a cannabinoid synthase polypeptide comprising introducing into a host cell one or more vectors disclosed herein. In certain such embodiments, the one or more vectors comprise one or more vectors comprising one or more nucleic acids (e.g., heterologous nucleic acids) comprising a nucleotide sequence encoding an engineered variant of the disclosure. In certain such embodiments, the one or more vectors comprise one or more vectors comprising one or more nucleic acids (e.g., heterologous nucleic acids) comprising a nucleotide sequence encoding one or more secretory pathway polypeptides. In some embodiments, the nucleotide sequence encoding one or more secretory pathway polypeptides is codon optimized. In some embodiments, the method comprises introducing into the host cell a deletion or down-regulation of one or more genes encoding one or more secretory pathway polypeptides.
Many other suitable expression vectors are known to those skilled in the art, and many are commercially available. The following vectors are provided by way of example; for yeast, low copy CEN ARS and high copy 2 micron plasmids are provided. However, any other plasmid or other vector may be used as long as it is compatible with the host cell.
In some embodiments, one or more nucleic acids disclosed herein (e.g., heterologous nucleic acids) are present in a single expression vector. In some embodiments, two or more nucleic acids disclosed herein (e.g., heterologous nucleic acids) are present in a single expression vector. In some embodiments, three or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a single expression vector. In some embodiments, four or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a single expression vector. In some embodiments, five or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a single expression vector. In some embodiments, six or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a single expression vector. In some embodiments, seven or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a single expression vector.
In some embodiments, two or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression vectors. In some embodiments, three or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression vectors. In some embodiments, four or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression vectors. In some embodiments, five or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression vectors. In some embodiments, six or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression vectors. In some embodiments, seven or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression vectors. In some embodiments, eight or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression vectors. In some embodiments, nine or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression vectors. In some embodiments, ten or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression vectors.
In some embodiments, one or more nucleic acids disclosed herein (e.g., heterologous nucleic acids) are present in a single expression construct. In some embodiments, two or more nucleic acids disclosed herein (e.g., heterologous nucleic acids) are present in a single expression construct. In some embodiments, three or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a single expression construct. In some embodiments, four or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a single expression construct. In some embodiments, five or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a single expression construct. In some embodiments, six or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a single expression construct. In some embodiments, seven or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a single expression construct.
In some embodiments, two or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression constructs. In some embodiments, three or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression constructs. In some embodiments, four or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression constructs. In some embodiments, five or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression constructs. In some embodiments, six or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression constructs. In some embodiments, seven or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression constructs. In some embodiments, eight or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression constructs. In some embodiments, nine or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression constructs. In some embodiments, ten or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are in separate expression constructs.
In some embodiments, one or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a high copy number plasmid, such as a plasmid that is present at about 10-50 copies per cell or more than 50 copies per cell. In some embodiments, one or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein are present in a low copy number plasmid. In some embodiments, one or more nucleic acids disclosed herein (e.g., heterologous nucleic acids) are present in a medium copy number plasmid. The copy number of the plasmid may be selected to reduce expression of one or more polypeptides disclosed herein (such as engineered variants of the disclosure). Reducing expression by limiting the copy number of the plasmid may prevent saturation of the secretory pathway leading to possible protein degradation and/or death of the modified host cell or loss of viability of the modified host cell.
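The high copy number ranges stated above can be expressed as a small helper. This is a minimal sketch under stated assumptions: the text gives only approximate figures ("about 10-50 copies per cell or more than 50 copies per cell") and does not define numeric ranges for low or medium copy number plasmids, so the function name `copy_number_band`, its hard boundaries, and its return labels are hypothetical.

```python
def copy_number_band(copies_per_cell: int) -> str:
    """Place a plasmid copy number in one of the high-copy ranges named
    in the text, or report that it falls below them.

    The boundaries 10 and 50 are treated as exact here for illustration,
    although the text describes them as approximate.
    """
    if copies_per_cell < 0:
        raise ValueError("copies_per_cell must be non-negative")
    if copies_per_cell > 50:
        return "high_over_50"
    if copies_per_cell >= 10:
        return "high_10_to_50"
    # The text does not give numeric ranges for low or medium copy number.
    return "below_high"
```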
In some embodiments, the modified host cell has one copy of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. In some embodiments, the modified host cell has two copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. In some embodiments, the modified host cell has three copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. In some embodiments, the modified host cell has four copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. In some embodiments, the modified host cell has five copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. In some embodiments, the modified host cell has six copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. In some embodiments, the modified host cell has seven copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. In some embodiments, the modified host cell has eight copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. In some embodiments, the modified host cell has nine copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. In some embodiments, the modified host cell has ten copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. 
In some embodiments, the modified host cell has eleven copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. In some embodiments, the modified host cell has twelve copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein. In some embodiments, the modified host cell has twelve or more copies of a nucleic acid (e.g., a heterologous nucleic acid) comprising a nucleotide sequence encoding a polypeptide disclosed herein.
Any of a number of suitable transcriptional and translational control elements may be used in the expression vector or construct, depending on the host/vector or host/construct system utilized, including constitutive and inducible promoters, transcriptional enhancer elements, transcriptional terminators, and the like (see, e.g., Bitter et al. (1987) Methods in Enzymology, 153:516-544).
In some embodiments, a nucleic acid disclosed herein (e.g., a heterologous nucleic acid) is operably linked to a promoter. In some embodiments, the promoter is a constitutive promoter. In some embodiments, the promoter is an inducible promoter. In some embodiments, the promoter is functional in a eukaryotic cell. In some embodiments, the promoter may be a strong expression driver. In some embodiments, the promoter may be a weak expression driver. In some embodiments, the promoter may be a moderate expression driver. The promoter may be selected to reduce expression of one or more polypeptides disclosed herein (such as the engineered variants of the disclosure). Reducing expression through promoter selection may prevent saturation of the secretory pathway, which may otherwise lead to protein degradation and/or death or loss of viability of the modified host cell. Examples of strong constitutive promoters include, but are not limited to: pTDH3 and pFBA1. Examples of medium constitutive promoters include, but are not limited to: pACT1 and pCYC1. Examples of weak constitutive promoters include, but are not limited to: pSLN1. Examples of strongly inducible promoters include, but are not limited to: pGAL1 and pGAL10. Examples of moderately inducible promoters include, but are not limited to: pGAL7. Examples of weakly inducible promoters include, but are not limited to: pGAL3.
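The promoter examples listed above can be collected into a lookup table, which may be convenient when choosing a promoter by strength and regulation. This is an illustrative sketch only: the promoter names and their classifications are taken from the text, while the `Promoter` type, the `find_promoters` function, and the use of the single label "medium" for both "medium constitutive" and "moderately inducible" are assumptions made for the example.

```python
from typing import List, NamedTuple, Optional


class Promoter(NamedTuple):
    name: str
    strength: str    # "strong", "medium", or "weak"
    regulation: str  # "constitutive" or "inducible"


# Classifications as listed in the text (non-limiting examples).
PROMOTERS = (
    Promoter("pTDH3", "strong", "constitutive"),
    Promoter("pFBA1", "strong", "constitutive"),
    Promoter("pACT1", "medium", "constitutive"),
    Promoter("pCYC1", "medium", "constitutive"),
    Promoter("pSLN1", "weak", "constitutive"),
    Promoter("pGAL1", "strong", "inducible"),
    Promoter("pGAL10", "strong", "inducible"),
    Promoter("pGAL7", "medium", "inducible"),
    Promoter("pGAL3", "weak", "inducible"),
)


def find_promoters(strength: Optional[str] = None,
                   regulation: Optional[str] = None) -> List[str]:
    """Return promoter names matching the requested strength and/or
    regulation; a None argument places no restriction on that field."""
    return [
        p.name
        for p in PROMOTERS
        if (strength is None or p.strength == strength)
        and (regulation is None or p.regulation == regulation)
    ]
```

For instance, `find_promoters(strength="weak")` returns the two weak promoters named in the text, which are the candidates if the aim is to reduce expression and avoid saturating the secretory pathway.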
Non-limiting examples of suitable eukaryotic promoters may include the CMV immediate early promoter, HSV thymidine kinase promoter, early and late SV40 promoters, LTR promoter from retroviruses, and mouse metallothionein-I promoter. Selection of appropriate vectors, constructs and promoters is well within the level of ordinary skill in the art. The expression vector or construct may also contain a ribosome binding site for translation initiation and a transcription terminator. The expression vector or construct may also include appropriate sequences for amplifying expression.
In yeast, a number of vectors or constructs containing constitutive or inducible promoters can be used. For a review see Current Protocols in Molecular Biology, Vol. 2, 1988, Ed. Ausubel et al., Greene Publish. Assoc. & Wiley Interscience, Ch. 13; Grant et al., 1987, Expression and Secretion Vectors for Yeast, in Methods in Enzymology, Eds. Wu & Grossman, 1987, Acad. Press, N.Y., Vol. 153, pp. 516-544; Glover, 1986, DNA Cloning, Vol. II, IRL Press, Wash., D.C., Ch. 3; Bitter, 1987, Heterologous Gene Expression in Yeast, Methods in Enzymology, Eds. Berger & Kimmel, Acad. Press, N.Y., Vol. 152, pp. 673-684; and The Molecular Biology of the Yeast Saccharomyces, 1982, Eds. Strathern et al., Cold Spring Harbor Press, Vols. I and II. Constitutive yeast promoters such as pADH, pTDH3, pFBA1, pACT1, pCYC1, and pSLN1, or inducible promoters such as pGAL1, pGAL10, pGAL7, and pGAL3, can be used (Cloning in Yeast, Ch. 3, R. Rothstein, in: DNA Cloning Vol. II, A Practical Approach, Ed. DM Glover, 1986, IRL Press, Wash., D.C.). Alternatively, vectors that facilitate integration of the exogenous DNA sequence into the yeast chromosome may be used.
Typically, the recombinant expression vector will include an origin of replication and a selectable marker that permits transformation of the host cell, e.g., the Saccharomyces cerevisiae TRP1 gene or a gene cassette encoding antibiotic resistance, or the like; and promoters derived from highly expressed genes to direct transcription of coding sequences. Such promoters may be derived from gene sequences encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), alpha-factor, acid phosphatase, or heat shock proteins, among others.
Inducible promoters are well known in the art. Suitable inducible promoters may include, but are not limited to, tetracycline-inducible promoters; estradiol-inducible promoters; sugar-inducible promoters (e.g., pGAL1 or pSUC2); amino acid-inducible promoters (e.g., pMET25); metal-inducible promoters (e.g., pCUP1); methanol-inducible promoters (e.g., pAOX1); and the like.
Furthermore, in many embodiments, the expression vector or construct will comprise one or more selectable marker genes to provide a phenotypic trait for selection of transformed host cells, such as dihydrofolate reductase or neomycin resistance in eukaryotic cell culture.
In some embodiments, one or more nucleic acids disclosed herein (e.g., a heterologous nucleic acid) is integrated into the genome of a modified host cell disclosed herein. In some embodiments, one or more nucleic acids disclosed herein (e.g., a heterologous nucleic acid) is integrated into the chromosome of a modified host cell disclosed herein. In some embodiments, one or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein remain episomal (i.e., not integrated into the genome or chromosome of the modified host cell). In some embodiments, at least one of the one or more nucleic acids (e.g., heterologous nucleic acids) disclosed herein is maintained extrachromosomally. The gene copy number of one or more genes encoding one or more polypeptides disclosed herein (such as an engineered variant of the disclosure) can be selected to reduce expression of the one or more polypeptides disclosed herein (such as an engineered variant of the disclosure). Reducing expression by limiting gene copy number may prevent saturation of the secretory pathway leading to possible protein degradation and/or death of the modified host cell or loss of viability of the modified host cell.
As will be appreciated by those skilled in the art, slight variations in nucleotide sequence do not necessarily alter the amino acid sequence of the encoded polypeptide. Those skilled in the art will appreciate that changes in nucleotide identity in a particular gene sequence that alter the amino acid sequence of the encoded polypeptide can result in reduced or enhanced effectiveness of the gene, and that in some applications (e.g., antisense, cosuppression or RNAi), partial sequences often function as effectively as full-length versions. The manner in which the nucleotide sequence is altered or shortened is well known to those skilled in the art, as are methods for testing the effectiveness of the altered gene. In certain embodiments, effectiveness can be readily tested by, for example, conventional gas chromatography. Accordingly, all such genetic variations are included in the present disclosure.
Genomic deletion of the open reading frame encoding a protein can eliminate all expression of the gene. Down-regulation of a gene can be achieved at the DNA, RNA, or protein level in several ways, with the result that the amount of active protein in the cell is reduced. Truncation of the open reading frame, or introduction of mutations that destabilize the protein or reduce its catalytic activity, achieves a similar goal, as does fusion of a "degron" polypeptide that destabilizes the protein. Engineering of gene regulatory regions can also be used to alter gene expression. One method is to alter the promoter sequence or to replace it with a different promoter. Truncation of the terminator, referred to as decreased abundance by mRNA perturbation (DAmP), is also known to reduce gene expression. Other methods of reducing mRNA stability include the use of cis- or trans-acting ribozymes (such as self-cleaving ribozymes), RNA elements that recruit exonucleases, or antisense DNA. RNAi can be used to silence genes in budding yeast strains by introducing the required protein factors from other species (e.g., Drosha or Dicer) (Drinnenberg et al. 2009). Gene expression can also be silenced in S. cerevisiae by recruiting native or heterologous silencing factors or repressors, which can be directed to any locus using the dCas9 CRISPR system (Qi et al. 2013). Protein levels can also be reduced by engineering the amino acid sequence of the target protein. A variety of degron sequences are available for targeting proteins for rapid degradation, including, but not limited to, ubiquitin fusions and N-end rule residues at the amino terminus. These methods may be implemented in a constitutive or conditional manner.
Induction system
To adapt to changing environments, microorganisms such as yeast have evolved a wide range of naturally inducible promoter systems. Any promoter regulated by small molecules or environmental changes (temperature, pH, oxygen levels, osmolarity, oxidative damage) can in principle be converted into an inducible system for expression of heterologous genes. The best known system in Saccharomyces cerevisiae is the galactose regulon, which is strongly repressed by glucose and activated by galactose. A heterologous genetic pathway under the control of galactose-inducible promoters is regulated in the same way, so the engineered strain can be grown in glucose medium to build biomass and then switched to galactose to induce pathway expression. Expression levels ranging from very strong (pGAL1) to relatively weak (pGAL3) can be achieved. However, galactose can be expensive and is a poor carbon source for S. cerevisiae. Thus, for industrial applications, it may be advantageous to re-engineer the regulon so that cells can be induced in non-galactose medium. The galactose regulon may be modified in a number of ways for this purpose, including:
Overexpression of GAL3, a negative regulator of GAL80, from an inducible promoter (e.g., pSUC2-GAL3), so that switching from glucose to sucrose reduces GAL80-mediated repression and activates the pathway.
Deletion of the repressor GAL80 and replacement of the native GAL4 cassette with a version under the control of a sucrose-inducible promoter (e.g., pSUC2-GAL4) to induce expression by switching from glucose to sucrose.
Replacement of the native GAL80 gene with an inducible version, e.g., pSUC2-GAL80, such that expression is induced by switching from sucrose to glucose.
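The carbon-source switches described for the native regulon and for the three modifications listed above can be summarized as a simple on/off table. The Python sketch below is a qualitative illustration only: it encodes the switch directions exactly as stated in the text, the design labels and the function name induction_switch are hypothetical, and it makes no claim about expression levels or kinetics.

```python
# True means the galactose-regulon-controlled pathway is expressed on that
# carbon source, as stated in the text. Carbon sources that the text does not
# mention for a given design are deliberately left out.
DESIGNS = {
    "native regulon": {"glucose": False, "galactose": True},
    "pSUC2-GAL3": {"glucose": False, "sucrose": True},
    "GAL80 deletion + pSUC2-GAL4": {"glucose": False, "sucrose": True},
    "pSUC2-GAL80": {"sucrose": False, "glucose": True},
}


def induction_switch(design):
    """Return (growth_carbon_source, induction_carbon_source) for a design."""
    states = DESIGNS[design]
    growth = next(source for source, on in states.items() if not on)
    induction = next(source for source, on in states.items() if on)
    return growth, induction


if __name__ == "__main__":
    for name in DESIGNS:
        growth, induction = induction_switch(name)
        print(f"{name}: grow on {growth}, induce on {induction}")
```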
These strategies often require fine-tuning of activator and repressor levels to achieve appropriate dynamics (little or no activator expressed in the off state, and the desired expression level reached in the on state). There are a variety of ways to fine-tune protein expression, including the use of protein stabilizing or degradation tags (e.g., degrons) or the use of temperature-sensitive mutants of activators or regulators. In the above examples, the pSUC2 promoter is used to induce the galactose regulon in sucrose medium. However, any inducible promoter can be used for this purpose, or to control a single gene outside the context of the galactose regulon. The following list provides some examples:
Phosphate-regulated promoters, e.g., pPHO5
Carbon source-regulated promoters, e.g. pADH2
Amino acid regulated promoters, e.g. pMET25
Metal ion inducible promoters, e.g. pCUP1
Temperature-regulated promoters, e.g., pHSP12, pHSP26
pH-regulated promoters, e.g., pHSP12, pHSP26
Oxygen level-regulated promoters, e.g., pDAN1
Oxidative stress-regulated promoters, e.g., promoters from the AHP1, TRR1, TRX2, TSA1, GPX2, GSH1, GSH2, GLR1, SOD1, or SOD2 genes.
ER stress-regulated promoters, such as unfolded protein response element promoters.
In addition to these natural examples, there are a variety of synthetic inducible promoter systems. These are generally based on rearranging native or exogenous transcription elements into a basic promoter scaffold and/or fusing an activation domain to a DNA binding domain to produce novel transcription factors. Two examples are provided below:
An estradiol-inducible system involving fusion of the estradiol receptor with a DNA binding domain and a transcriptional activation domain, paired with a synthetic or natural promoter containing the corresponding binding site.
Tet transactivator (tTA) or reverse tet transactivator (rtTA) system paired with a tetO-containing promoter.
In some embodiments, one of the inducible promoter systems described above is used in a modified host cell of the present disclosure. In some embodiments, the inducible promoter system is a native inducible promoter system. In some embodiments, the inducible promoter system is a synthetic inducible promoter system. In some embodiments, a suitable medium for culturing the modified host cells of the present disclosure comprises one or more of the induction factors disclosed herein. Possible induction factors include:
Phosphate-regulated promoters, e.g. pPHO5
o KH2PO4
Carbon source-regulated promoters, e.g. pADH2
o galactose (e.g., pGAL1)
o glucose (e.g., pADH2)
o sucrose (e.g., pSUC2, pGPH1, pMAL12)
o maltose (e.g., pMAL12, pMAL32)
Amino acid regulated promoters, e.g. pMET25
o methionine (e.g., pMET25)
o lysine (e.g., pLYS9)
o other amino acids
Promoters inducible by metal ions, e.g. pCUP1
o CuSO4
Temperature-regulated promoters, e.g., pHSP12, pHSP26
o temperature change, e.g., 30 °C to 37 °C
pH-regulated promoters, e.g., pHSP12, pHSP26
o pH change, e.g., pH 6 to pH 4
Oxygen level-regulated promoters, e.g., pDAN1
o change in oxygen level, e.g., dissolved oxygen level changing from 20% to 1%
Oxidative stress-regulated promoters, e.g. pSOD1
o addition of hydrogen peroxide or the peroxide-generating drug menadione
ER stress-regulated promoters, such as unfolded protein response element promoters.
o addition of tunicamycin, or expression of a protein prone to misfolding (e.g., a cannabinoid synthase)
An estradiol inducible system involving fusion of the estradiol receptor with a DNA binding and transcriptional activation domain paired with a synthetic or natural promoter with a binding site.
o estradiol
Tet transactivator (tTA) or reverse tet transactivator (rtTA) system paired with a tetO-containing promoter.
o doxycycline
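The pairing of induction factors with promoters in the list above can be expressed as a lookup that answers which listed promoters respond to a given medium component or condition. The Python sketch below is illustrative only: the dictionary restates the pairings given in the list, the condition labels (e.g., "temperature shift") and the function name promoters_responding_to are hypothetical, and the table records only that a factor regulates a promoter, not the direction or magnitude of the response.

```python
# Promoter -> regulating factors, restated from the list above.
# Labels such as "temperature shift" are shorthand introduced for this sketch.
REGULATING_FACTORS = {
    "pPHO5": ["KH2PO4"],
    "pGAL1": ["galactose"],
    "pADH2": ["glucose"],
    "pSUC2": ["sucrose"],
    "pGPH1": ["sucrose"],
    "pMAL12": ["sucrose", "maltose"],
    "pMAL32": ["maltose"],
    "pMET25": ["methionine"],
    "pLYS9": ["lysine"],
    "pCUP1": ["CuSO4"],
    "pHSP12": ["temperature shift", "pH shift"],
    "pHSP26": ["temperature shift", "pH shift"],
    "pDAN1": ["dissolved oxygen shift"],
    "pSOD1": ["hydrogen peroxide", "menadione"],
}


def promoters_responding_to(factor):
    """Return the listed promoters regulated by the given factor, sorted by name."""
    return sorted(
        promoter
        for promoter, factors in REGULATING_FACTORS.items()
        if factor in factors
    )


if __name__ == "__main__":
    print(promoters_responding_to("sucrose"))
```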
Codon usage
As is well known to those skilled in the art, expression of a heterologous nucleic acid in a host organism can be improved by replacing the nucleotide sequence encoding a particular amino acid (i.e., a codon) with another codon that is better expressed in the host organism (i.e., codon optimization). One reason for this effect is that different organisms show a preference for different codons. In some embodiments, the nucleic acids disclosed herein are modified or optimized such that the nucleotide sequence reflects the codon preference of a particular host cell. For example, in some embodiments, the nucleotide sequence will be modified or optimized for yeast codon bias. In some embodiments, the nucleotide sequences disclosed herein are codon optimized. See, e.g., Bennetzen and Hall (1982) J. Biol. Chem. 257(6):3026-3031.
Statistical methods have been developed to analyze codon usage bias in various organisms, and many computer algorithms have been developed to apply these statistical analyses in the design of codon-optimized gene sequences (Lithwick G, Margalit H (2003) Hierarchy of sequence-dependent features associated with prokaryotic translation. Genome Research 13:2665-73). Other modifications in codon usage that increase protein expression and that do not depend on codon bias have also been described (Welch et al. (2009)).
In some embodiments, the codon usage of the nucleotide sequence is modified or optimized such that the level of translation of the encoded mRNA is reduced. In some embodiments, codon-optimized nucleotide sequences can be optimized such that the level of translation of the encoded mRNA is reduced. Reducing the level of mRNA translation by modifying codon usage can be achieved by modifying the nucleotide sequence to include codons that are rare or uncommon for the host cell. Codon usage tables are available for a number of organisms, summarizing the percentage of time a particular organism uses a particular codon to encode an amino acid. Certain codons are used more frequently than other, "rare" codons. The use of "rare" codons in a nucleotide sequence generally reduces its translation rate. Thus, for example, the nucleotide sequence is modified by introducing one or more rare codons that affect the rate of translation rather than the amino acid sequence of the translated polypeptide. For example, there are six codons encoding arginine: CGT, CGC, CGA, CGG, AGA, and AGG. In E. coli, the codons CGT and CGC are used more frequently (encoding about 40% of the arginines in E. coli) than the codon AGG (encoding about 2% of the arginines in E. coli). Modification of CGT codons within a gene sequence to AGG codons does not alter the sequence of the polypeptide, but may reduce the translation rate of the gene.
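The arginine example above can be checked with a short script. The following Python sketch is illustrative only: it assumes the standard genetic code, operates on a made-up toy coding sequence, and uses hypothetical helper names (translate, swap_codon). It swaps every in-frame CGT codon for AGG and confirms that the encoded amino acid sequence is unchanged.

```python
from itertools import product

# Standard genetic code in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def split_codons(seq):
    """Split an in-frame DNA sequence into complete codons."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def translate(seq):
    """Translate an in-frame DNA coding sequence to one-letter amino acids."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(seq))


def swap_codon(seq, old, new):
    """Replace every in-frame occurrence of codon `old` with codon `new`."""
    return "".join(new if codon == old else codon for codon in split_codons(seq))


if __name__ == "__main__":
    toy_gene = "ATGCGTGCTCGTTAA"  # hypothetical sequence: Met-Arg-Ala-Arg-stop
    slowed = swap_codon(toy_gene, "CGT", "AGG")
    print(slowed)
    print(translate(toy_gene) == translate(slowed))
```

Because the swap is applied codon by codon in the reading frame, an out-of-frame occurrence of the triplet CGT is left untouched.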
In some embodiments, codon-optimized nucleotide sequences can be optimized for expression in yeast cells. In certain such embodiments, the yeast cell is saccharomyces cerevisiae.
Furthermore, it is understood that the present disclosure includes degeneracy of codon usage, as understood by one of ordinary skill in the art and as illustrated in the following tables.
Codon degeneracy
[Codon degeneracy table, provided as images in the original publication (Figures BDA0003634699900001141 and BDA0003634699900001151).]
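The degeneracy of codon usage referred to above can be derived directly from the standard genetic code. The Python sketch below is illustrative only and is not a reproduction of the table in the original publication: it assumes the standard genetic code written with DNA bases, and the function name codon_degeneracy is hypothetical. It groups the 64 codons by the amino acid (or stop signal, "*") that they encode.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_degeneracy():
    """Map each one-letter amino acid code (or '*') to its synonymous codons."""
    groups = defaultdict(list)
    for codon, aa in zip(product(BASES, repeat=3), AMINO):
        groups[aa].append("".join(codon))
    return dict(groups)


if __name__ == "__main__":
    for aa, codons in sorted(codon_degeneracy().items()):
        print(aa, len(codons), " ".join(codons))
```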
Methods of producing cannabinoids or cannabinoid derivatives, or of expressing and/or producing engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides
The present disclosure provides methods for expressing engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides of the present disclosure. In certain such embodiments, the method comprises culturing the modified host cell of the present disclosure in a culture medium. The present disclosure also provides methods for making engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides of the present disclosure. The present disclosure also provides methods of producing a cannabinoid or cannabinoid derivative comprising using the engineered variants of the present disclosure.
The present disclosure also provides methods of producing cannabinoids or cannabinoid derivatives. The methods of the present disclosure may involve the production of cannabinoids or cannabinoid derivatives using the engineered variants disclosed herein. The methods may involve culturing the modified host cells of the disclosure in a culture medium and recovering the cannabinoid or cannabinoid derivative produced. The methods may also involve cell-free production of cannabinoids or cannabinoid derivatives using one or more polypeptides disclosed herein (such as engineered variants of the disclosure) that are expressed or overexpressed by the modified host cells of the disclosure. The methods may also involve cell-free production of cannabinoids or cannabinoid derivatives using the engineered variants disclosed herein.
Cannabinoids or cannabinoid derivatives that may be produced using the engineered variants, methods, or modified host cells of the present disclosure may include, but are not limited to, cannabichromene (CBC) type (e.g., cannabichromenic acid), cannabidiol (CBD) type (e.g., cannabidiolic acid), Δ9-trans-tetrahydrocannabinol (Δ9-THC) type (e.g., Δ9-tetrahydrocannabinolic acid), Δ8-trans-tetrahydrocannabinol (Δ8-THC) type, cannabicyclol (CBL) type, cannabielsoin (CBE) type, cannabinol (CBN) type, cannabinodiol (CBND) type, cannabitriol (CBT) type, derivatives of any of the foregoing, and others listed in ElSohly M.A. and Slade D., Life Sci. 2005 Dec 22;78(5):539-48, Epub 2005 Sep 30. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100 mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50 mg/L of medium.
Cannabinoids or cannabinoid derivatives that may be produced using the engineered variants, methods, or modified host cells of the present disclosure may include, but are not limited to, cannabichromenic acid (CBCA), cannabichromene (CBC), cannabichromevarinic acid (CBCVA), cannabichromevarin (CBCV), cannabidiolic acid (CBDA), cannabidiol (CBD), cannabidiol monomethyl ether (CBDM), cannabidiol-C4 (CBD-C4), cannabidivarinic acid (CBDVA), cannabidivarin (CBDV), cannabidiorcol (CBD-C1), Δ9-tetrahydrocannabinolic acid A (THCA-A), Δ9-tetrahydrocannabinolic acid B (THCA-B), Δ9-tetrahydrocannabinol (THC), Δ9-tetrahydrocannabinolic acid-C4 (THCA-C4), Δ9-tetrahydrocannabinol-C4 (THC-C4), Δ9-tetrahydrocannabivarinic acid (THCVA), Δ9-tetrahydrocannabivarin (THCV), Δ9-tetrahydrocannabiorcolic acid (THCA-C1), Δ9-tetrahydrocannabiorcol (THC-C1), Δ7-cis-iso-tetrahydrocannabivarin, Δ8-tetrahydrocannabinolic acid (Δ8-THCA), Δ8-tetrahydrocannabinol (Δ8-THC), cannabicyclolic acid (CBLA), cannabicyclol (CBL), cannabicyclovarin (CBLV), cannabielsoic acid A (CBEA-A), cannabielsoic acid B (CBEA-B), cannabielsoin (CBE), cannabielsoinic acid, cannabicitranic acid, cannabinolic acid (CBNA), cannabinol (CBN), cannabinol methyl ether (CBNM), cannabinol-C4 (CBN-C4), cannabivarin (CBV), cannabinol-C2 (CBN-C2), cannabiorcol (CBN-C1), cannabinodiol (CBND), cannabinodivarin (CBVD), cannabitriol (CBT), 10-ethoxy-9-hydroxy-delta-6a-tetrahydrocannabinol, 8,9-dihydroxy-delta-6a-tetrahydrocannabinol, cannabitriolvarin (CBTVE), dehydrocannabifuran (DCBF), cannabifuran (CBF), cannabichromanon (CBCN), cannabicitran (CBT), 10-oxo-delta-6a-tetrahydrocannabinol (OTHC), delta-9-cis-tetrahydrocannabinol (cis-THC), 3,4,5,6-tetrahydro-7-hydroxy-alpha-alpha-2-trimethyl-9-n-propyl-2,6-methano-2H-1-benzoxocin-5-methanol (OH-iso-HHCV), cannabiripsol (CBR), trihydroxy-delta-9-tetrahydrocannabinol (triOH-THC), CBGA-hydrocinnamic acid (3-[(2E)-3,7-dimethylocta-2,6-dien-1-yl]-2,4-dihydroxy-6-(2-phenylethyl)benzoic acid), CBG-hydrocinnamic acid (2-[(2E)-3,7-dimethylocta-2,6-dien-1-yl]-5-(2-phenylethyl)benzene-1,3-diol), CBDA-hydrocinnamic acid (2,4-dihydroxy-3-[3-methyl-6-(prop-1-en-2-yl)cyclohex-2-en-1-yl]-6-(2-phenylethyl)benzoic acid), CBD-hydrocinnamic acid (2-[3-methyl-6-(prop-1-en-2-yl)cyclohex-2-en-1-yl]-5-(2-phenylethyl)benzene-1,3-diol), THCA-hydrocinnamic acid (1-hydroxy-6,6,9-trimethyl-3-(2-phenylethyl)-6H,6aH,7H,8H,10aH-benzo[c]isochromene-2-carboxylic acid), THC-hydrocinnamic acid (6,6,9-trimethyl-3-(2-phenylethyl)-6H,6aH,7H,8H,10aH-benzo[c]isochromen-1-ol, perrottetinene), and derivatives of any of the foregoing. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100 mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50 mg/L of medium.
In some embodiments, the cannabinoid produced with the engineered variant, method or modified host cell of the present disclosure is Δ9-tetrahydrocannabinolic acid, Δ9-tetrahydrocannabinol, Δ8-tetrahydrocannabinolic acid, Δ8-tetrahydrocannabinol, cannabidiolic acid, cannabidiol, cannabichromenic acid, cannabichromene, cannabinic acid, cannabinol, cannabinolic acid, cannabisubhnophol, cannabigerol, tetrahydrocannabinolic acid, tetrahydrocannabigerol, cannabichromene, cannabigerolic acid, cannabichromene, cannabiisolic acid, cannabidiolone, cannabidiole naphthenate or cannabidiopyran naphthene. In some embodiments, the cannabinoid is produced in an amount greater than 100 mg/L of medium. In some embodiments, the cannabinoid is produced in an amount greater than 50 mg/L of medium.
In some embodiments, the cannabinoid produced with the engineered variant, method or modified host cell of the present disclosure is tetrahydrocannabinolic acid, tetrahydrocannabinolic acid or tetrahydrocannabinol. In some embodiments, the cannabinoid is produced in an amount greater than 100mg/L of medium. In some embodiments, the cannabinoid is produced in an amount greater than 50mg/L of medium.
Additional cannabinoids or cannabinoid derivatives that may be produced using the engineered variants, methods or modified host cells of the disclosure may include, but are not limited to, CBDA, CBD, CBGA, THC, THCA, THCVA, CBDVA, (6aR,10aR)-1-hydroxy-6,6,9-trimethyl-3-butyl-6a,7,8,10a-tetrahydro-6H-dibenzo[b,d]pyran-2-carboxylic acid, (6aR,10aR)-1-hydroxy-6,6,9-trimethyl-3-(3-methylpentyl)-6a,7,8,10a-tetrahydro-6H-dibenzo[b,d]pyran-2-carboxylic acid, (6aR,10aR)-1-hydroxy-6,6,9-trimethyl-3-(4-pentenyl)-6a,7,8,10a-tetrahydro-6H-dibenzo[b,d]pyran-2-carboxylic acid, (6aR,10aR)-1-hydroxy-6,6,9-trimethyl-3-hexyl-6a,7,8,10a-tetrahydro-6H-dibenzo[b,d]pyran-2-carboxylic acid, (6aR,10aR)-1-hydroxy-6,6,9-trimethyl-3-(5-hexynyl)-6a,7,8,10a-tetrahydro-6H-dibenzo[b,d]pyran-2-carboxylic acid, and others listed in Bow, E.W. and Rimoldi, J.M., "The Structure-Function Relationships of Classical Cannabinoids: CB1/CB2 Modulation," Perspectives in Medicinal Chemistry 2016;8:17-39, doi:10.4137/PMC.S32171, which is incorporated herein by reference. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100 mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50 mg/L of medium.
Additional cannabinoids and cannabinoid derivatives that may be produced with the engineered variants, methods, or modified host cells of the present disclosure may also include, but are not limited to: (1' R,2' R) -4- (Hexane-2-yl) -5' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -4-hexyl-5 ' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4- (3-methylpentyl) -2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -4- (4-chlorobutyl) -5' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4- (4-methylpentyl) -2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4- (4- (methylthio) butyl) -2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4- ((E) -pent-1-en-1-yl) -2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4- ((E) -pent-3-en-1-yl) -2' - ((E) Prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4- ((E) -pent-2-en-1-yl) -2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -4- (but-3-yn-1-yl) -5' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -4- ((E) -but-1-en-1-yl) -5' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4- (pent-4-yn-1-yl) -2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1 '-biphenyl ] -2, 6-diol, (1' R,2'R) -5' -methyl-2 '- (prop-1-en-2-yl) -4-undecyl-1', 2',3',4 '-tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1'R,2' R) -4- (hex-5-yn-1-yl) -5 '-methyl-2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1 '-biphenyl ] -2, 6-diol, (1' R,2'R) -4- ((E) -hept-1-en-1-yl) -5' -methyl -2' - 
(prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4-octyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4- ((E) -oct-1-en-1-yl) -2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4-nonyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4- (3-phenylpropyl) -2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4- (4-phenylbutyl) -2' - (1' R,2' R) Prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1 '-biphenyl ] -2, 6-diol, (1' R,2'R) -5' -methyl-4- (5-phenylpentyl) -2'- (prop-1-en-2-yl) -1',2',3',4 '-tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1'R,2' R) -5 '-methyl-4- (6-phenylhexyl) -2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1 '-biphenyl ] -2, 6-diol, (1' R,2'R) -5' -methyl-4- (2-methylpentyl) -2'- (prop-1-en-2-yl) -1',2',3',4 '-tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1'R,2' R) -4-isopropyl-5 '-methyl-2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1 '-biphenyl ] -2, 6-diol, (1' R,2'R) -4-decyl-5' -methyl-2 '- (prop-1-en-2-yl) -1' 2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-2 ' - (prop-1-en-2-yl) -4-tridecyl-1 ',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (E) -3- ((1' R,2' R) -2, 6-dihydroxy-5 ' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -4-yl) acrylic acid, (Z) -3- ((1' R,2'R) -2, 6-dihydroxy-5' -methyl-2 '- (prop-1-en-2-yl) -1',2',3',4 '-tetrahydro- [1,1' -biphenyl ] -4-yl) acrylic acid, 7- ((1'R,2' R) -2, 6-dihydroxy-5 '-methyl-2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1 '-biphenyl ] -4-yl) heptanoic acid, 8- ((1' R,2'R) -2, 6-dihydroxy-5' -methyl-2 '- (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -4-yl) octanoic acid, 9- ((1' R,2' R) -2, 6-dihydroxy-5 ' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' 
-tetrahydro- [1,1' -biphenyl ] -4-yl) nonanoic acid, 11- ((1' R,2' R) -2, 6-dihydroxy-5 ' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -4-yl) undecanoic acid, (1' R,2'R) -3',5 '-dihydroxy-5' -methyl-2 '- (prop-1-en-2-yl) -1',2',3',4 '-tetrahydro- [1,1':4', 1' -terphenyl ] -2-carboxylic acid, (1'R,2' R) -3',5' -dihydroxy-5 '-methyl-2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1':4',1 '-terphenyl ] -3-carboxylic acid, (1' R,2'R) -3',5 '-dihydroxy-5' -methyl-2 '- (prop-1-en-2-yl) -1',2',3',4 '-tetrahydro- [1,1':4', 1' -terphenyl ] -4-carboxylic acid, (1'R,2' R) -3',5' -dihydroxy-5 '-methyl-2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1':4',1 '-terphenyl ] -3, 5-dicarboxylic acid, (1' R,2'R) -4- (4-hydroxybutyl) -5' -methyl-2 '- (prop-1-en-2-yl) -1',2',3',4 '-tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1'R,2' R) -4- (4-aminobutyl) -5 '-methyl-2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1 '-biphenyl ] -2, 6-diol, 5- ((1' R,2'R) -2, 6-dihydroxy-5' -methyl-2 '- (prop-1-en-2-yl) -1',2',3',4 '-tetrahydro- [1,1' -biphenyl ] -4-yl) valeronitrile, (1'R,2' R) -5 '-methyl-4- (3-methylhexan-2-yl) -2' - (prop-1-en-2-yl) valeronitrile -yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-2 ' - (prop-1-en-2-yl) -4-propyl-1 ',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -4-butyl-5 ' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1' R,2' R) -5' -methyl-4-pentyl-2 ' - (prop-ane) -1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1 '-biphenyl ] -2, 6-diol, (1' R,2'R) -4-heptyl-5' -methyl-2 '- (prop-1-en-2-yl) -1',2',3',4 '-tetrahydro- [1,1' -biphenyl ] -2, 6-diol, (1'R,2' R) -5 '-methyl-4- (pent-4-en-1-yl) -2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, 3- ((1' R,2' R) -2, 6-dihydroxy-5 ' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -4-yl) propanoic acid, (1' R,2' R) -4,5' -dimethyl-2 ' - 
(prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -2, 6-diol, 2- ((1' R,2' R) -2, 6-dihydroxy-5 ' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1 '-biphenyl ] -4-yl) acetic acid, 4- ((1' R,2'R) -2, 6-dihydroxy-5' -methyl-2 '- (prop-1-en-2-yl) -1',2',3',4 '-tetrahydro- [1,1' -biphenyl ] -4-yl) butyric acid, (1'R,2' R) -2, 6-dihydroxy-5 '-methyl-2' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1 '-biphenyl ] -4-carboxylic acid, 5- ((1' R,2' R) -2, 6-dihydroxy-5 ' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -4-yl) pentanoic acid and 6- ((1' R,2' R) -2, 6-dihydroxy-5 ' -methyl-2 ' - (prop-1-en-2-yl) -1',2',3',4' -tetrahydro- [1,1' -biphenyl ] -4-yl) hexanoic acid. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50mg/L of medium.
Cannabinoid derivatives may lack one or more of the chemical moieties found in naturally occurring cannabinoids. Such chemical moieties may include, but are not limited to, methyl, alkyl, alkenyl, methoxy, alkoxy, acetyl, carboxyl, carbonyl, oxo, ester, hydroxy, aryl, heteroaryl, cycloalkyl, cycloalkenyl, cycloalkylalkenyl, cycloalkenylalkyl, cycloalkenylalkenyl, heterocyclylalkenyl, heteroarylalkenyl, arylalkenyl, heterocyclo, arylalkyl, cycloalkylalkyl, heterocycloalkylalkyl, heteroarylalkyl, and the like. In some embodiments, cannabinoid derivatives, which refer to compounds lacking one or more chemical moieties found in naturally occurring cannabinoids, may also comprise one or more of any of the functional and/or reactive groups described herein. The functional group and the reactive group may be unsubstituted or substituted with one or more functional groups or reactive groups.
Cannabinoid derivatives may be cannabinoids substituted with or comprising one or more functional and/or reactive groups. Functional groups can include, but are not limited to, azido, halogen (e.g., chlorine, bromine, iodine, fluorine), methyl, alkyl, alkynyl, alkenyl, methoxy, alkoxy, acetyl, amino, carboxyl, carbonyl, oxo, ester, hydroxyl, thio (e.g., thiol), cyano, aryl, heteroaryl, cycloalkyl, cycloalkenyl, cycloalkylalkenyl, cycloalkylalkynyl, cycloalkenylalkyl, cycloalkenylalkenyl, cycloalkenylalkynyl, heterocyclylalkyl, heterocyclylalkynyl, heteroarylalkynyl, arylalkenyl, arylalkynyl, spiro, heterospiro, heterocyclyl, thioalkyl (or alkylthio), arylthio, heteroarylthio, sulfone, sulfonyl, sulfoxide, amido, alkylamino, dialkylamino, arylamino, alkylarylamino, diarylamino, N-oxide, imide, enamine, imine, oxime, hydrazone, nitrile, aralkyl, cycloalkylalkyl, haloalkyl, heterocyclylalkyl, heteroarylalkyl, nitro, thioketo, and the like. Suitable reactive groups can include, but are not necessarily limited to, azides, carboxyls, carbonyls, amines (e.g., alkylamines (e.g., lower alkylamines), arylamines), halides, esters (e.g., alkyl esters (e.g., lower alkyl esters, benzyl esters), aryl esters, substituted aryl esters), cyano, thioesters, thioethers, sulfonyl halides, alcohols, thiols, succinimidyl esters, isothiocyanates, iodoacetamides, maleimides, hydrazines, alkynyls, alkenyls, acetyls, and the like. In some embodiments, the reactive group is selected from the group consisting of carboxyl, carbonyl, amine, ester, thioester, thioether, sulfonyl halide, alcohol, thiol, alkyne, alkene, azide, succinimidyl ester, isothiocyanate, iodoacetamide, maleimide, and hydrazine. The functional group and the reactive group may be unsubstituted or substituted with one or more functional groups or reactive groups.
"Alkyl" may refer to a straight or branched chain saturated hydrocarbon. For example, a C1-C6 alkyl group contains 1 to 6 carbon atoms. Examples of C1-C6 alkyl groups include, but are not limited to, methyl, ethyl, propyl, butyl, pentyl, isopropyl, isobutyl, sec-butyl, tert-butyl, isopentyl, and neopentyl.
"Alkenyl" may include unbranched (i.e., straight-chain) or branched hydrocarbon chains containing 2 to 12 carbon atoms. An "alkenyl" group contains at least one double bond. The double bond of the alkenyl group may be unconjugated or conjugated to another unsaturated group. Examples of alkenyl groups may include, but are not limited to, ethylenyl, vinyl, allyl, butenyl, pentenyl, hexenyl, butadienyl, pentadienyl, hexadienyl, 2-ethylhexenyl, 2-propyl-2-butenyl, 4-(2-methyl-3-butene)-pentenyl, and the like.
The compounds disclosed herein, such as cannabinoids and cannabinoid derivatives, may be substituted with one or more substituents, such as those illustrated generally herein, or as exemplified by particular classes, subclasses and species of the disclosure. In general, the term "substituted" means that a hydrogen atom in a given structure is replaced with the indicated substituent. Combinations of substituents contemplated by the present disclosure are generally those that result in the formation of stable or chemically feasible compounds.
As used herein, the term "unsubstituted" can mean that the specified group does not carry substituents other than the recited moiety (e.g., wherein the valencies are satisfied by hydrogen).
The reactive group may facilitate covalent attachment of a target molecule. Suitable target molecules may include, but are not limited to, detectable labels; imaging agents; toxins (including cytotoxins); linkers; peptides; drugs (e.g., small molecule drugs); members of specific binding pairs; epitope tags; ligands for binding to a target receptor; tags to aid in purification; solubility-enhancing molecules; and the like. The linker may be a peptide linker or a non-peptide linker.
In some embodiments, an azide-substituted cannabinoid derivative can be reacted with a compound comprising an alkyne group via "click chemistry" (also known as azide-alkyne cycloaddition) to produce a product comprising a heterocycle. In some embodiments, an alkyne-substituted cannabinoid derivative can be reacted with a compound comprising an azide group via click chemistry to produce a product comprising a heterocycle.
Other target molecules that may be desirable to attach to cannabinoid derivatives may include, but are not necessarily limited to, detectable labels (e.g., spin labels, Fluorescence Resonance Energy Transfer (FRET) type dyes, e.g., for studying the structure of biomolecules in vivo); small molecule drugs; cytotoxic molecules (e.g., drugs); imaging agents; ligands for binding to a target receptor; tags that facilitate purification by, for example, affinity chromatography (e.g., attachment of a FLAG epitope); solubility-increasing molecules (e.g., poly(ethylene glycol)); molecules that enhance bioavailability; molecules that increase in vivo half-life; molecules that target a particular cell type (e.g., antibodies specific for an epitope on a target cell); molecules that target a particular tissue; molecules that provide for crossing the blood-brain barrier; molecules that facilitate selective attachment to a surface; and the like.
In some embodiments, the target molecule comprises an imaging agent. Suitable imaging agents may include positive contrast agents and negative contrast agents. Suitable positive contrast agents may include, but are not limited to, gadolinium-tetraazacyclododecane tetraacetic acid (Gd-DOTA); gadolinium-diethylenetriaminepentaacetic acid (Gd-DTPA); gadolinium-1,4,7-tris(carbonylmethyl)-10-(2'-hydroxypropyl)-1,4,7,10-tetraazacyclododecane (Gd-HP-DO3A); manganese(II)-dipyridoxal diphosphate (Mn-DPDP); Gd-diethylenetriaminepentaacetic acid-bis(methylamide) (Gd-DTPA-BMA); and the like. Suitable negative contrast agents may include, but are not limited to, superparamagnetic iron oxide (SPIO) imaging agents; and perfluorocarbons, wherein suitable perfluorocarbons may include, but are not limited to, fluoroheptane, fluorocycloheptane, fluoromethylcycloheptane, fluorohexane, fluorocyclohexane, fluoropentane, fluorocyclopentane, fluoromethylcyclopentane, fluorodimethylcyclopentane, fluoromethylcyclobutane, fluorodimethylcyclobutane, fluorotrimethylcyclobutane, fluorobutane, fluorocyclobutane, fluoropropane, fluoroether, fluoropolyether, fluorotriethylamine, perfluorohexane, perfluoropentane, perfluorobutane, perfluoropropane, sulfur hexafluoride, and the like.
Other cannabinoid derivatives that can be produced with the engineered variants, methods, or modified host cells of the present disclosure can include derivatives that have been modified by organic synthetic or enzymatic pathways to alter drug metabolism and pharmacokinetics (e.g., solubility, bioavailability, absorption, distribution, plasma half-life, and metabolic clearance). Examples of modifications may include, but are not limited to, halogenation, acetylation, and methylation.
The cannabinoids or cannabinoid derivatives as described herein also include all pharmaceutically acceptable isotopically labeled cannabinoids or cannabinoid derivatives. An "isotopically labeled" or "radiolabeled" compound is a compound in which one or more atoms are replaced or substituted by an atom having an atomic mass or mass number different from the atomic mass or mass number usually found in nature (i.e., naturally occurring). For example, in some embodiments, one or more hydrogen atoms in a cannabinoid or cannabinoid derivative described herein are replaced or substituted by deuterium or tritium. Certain isotopically labeled cannabinoids or cannabinoid derivatives of the present disclosure, such as those incorporating a radioisotope, are useful in drug and/or substrate tissue distribution studies. The radioisotopes tritium (i.e., ³H) and carbon-14 (i.e., ¹⁴C) are particularly useful for this purpose because of their ease of incorporation and ready means of detection. Substitution with heavier isotopes such as deuterium (i.e., ²H) may provide certain therapeutic advantages resulting from greater metabolic stability, such as increased in vivo half-life and reduced dosage requirements, and may therefore be preferred in some circumstances. Suitable isotopes that can be incorporated into the cannabinoids or cannabinoid derivatives described herein include, but are not limited to, ²H (also written as D for deuterium), ³H (also written as T for tritium), ¹¹C, ¹³C, ¹⁴C, ¹³N, ¹⁵N, ¹⁵O, ¹⁷O, ¹⁸O, ¹⁸F, ³⁵S, ³⁶Cl, ⁸²Br, ⁷⁵Br, ⁷⁶Br, ⁷⁷Br, ¹²³I, ¹²⁴I, ¹²⁵I, and ¹³¹I. Substitution with positron-emitting isotopes (such as ¹¹C, ¹⁸F, ¹⁵O, and ¹³N) can be useful in Positron Emission Tomography (PET) studies.
The biological production methods, modified host cells, and engineered variants disclosed herein enable the synthesis of cannabinoids or cannabinoid derivatives with defined stereochemistry, which is challenging to achieve by chemical synthesis. The cannabinoids or cannabinoid derivatives disclosed herein may be enantiomers or diastereomers. The term "enantiomers" may refer to a pair of stereoisomers that are non-superimposable mirror images of each other. In some embodiments, the cannabinoid or cannabinoid derivative may be the (S)-enantiomer. In some embodiments, the cannabinoid or cannabinoid derivative may be the (R)-enantiomer. In some embodiments, the cannabinoid or cannabinoid derivative may be the (+) or (-) enantiomer. The term "diastereomers" may refer to a set of stereoisomers that cannot be made superimposable by rotation about single bonds. For example, cis- and trans-double bonds, endo- and exo-substitution on bicyclic ring systems, and compounds containing multiple stereocenters with different relative configurations can be considered diastereomers. The term "diastereomer" may refer to any member of this set of compounds. The cannabinoids or cannabinoid derivatives disclosed herein may comprise double bonds or fused rings. In certain such embodiments, the double bonds or fused rings may be cis or trans unless the configuration is explicitly defined. Unless the configuration is explicitly defined, the substituents may be in the E or Z configuration if the cannabinoid or cannabinoid derivative contains a double bond.
In some embodiments, when the cannabinoid or cannabinoid derivative is recovered from the cell lysate; from the culture medium; from the modified host cell; from both the cell lysate and the culture medium; from both the modified host cell and the culture medium; from the cell lysate, the modified host cell, and the culture medium; or from a cell-free reaction mixture comprising one or more polypeptides and/or engineered variants disclosed herein, the recovered cannabinoid or cannabinoid derivative is in a salt form. In certain such embodiments, the salt is a pharmaceutically acceptable salt. In some embodiments, the salt of the recovered cannabinoid or cannabinoid derivative is then purified as disclosed herein.
The present disclosure includes pharmaceutically acceptable salts of the cannabinoids or cannabinoid derivatives described herein. "pharmaceutically acceptable salts" may refer to those salts that retain the biological effectiveness and properties of the free base and are not biologically or otherwise undesirable. Representative pharmaceutically acceptable salts include, but are not limited to, water-soluble and water-insoluble salts such as acetate, amsonate (4,4-diaminostilbene-2,2-disulfonate), benzenesulfonate, benzoate, bicarbonate, bisulfate, bitartrate, borate, bromide, butyrate, calcium, calcium edetate, camphorsulfonate, carbonate, chloride, citrate, clavulanate, dihydrochloride, edetate, edisylate, estolate, esylate, fumarate, glucoheptonate, gluconate, glutamate, glycollylarsanilate, hexafluorophosphate, hexylresorcinate, hydrabamine, hydrobromide, hydrochloride, hydroxynaphthoate, iodide, isethionate, lactate, lactobionate, laurate, magnesium, malate, maleate, mandelate, methanesulfonate, methylbromide, methylnitrate, methylsulfate, mucate, naphthalenesulfonate, nitrate, N-methylglucamine ammonium salt, 3-hydroxy-2-naphthoate, oleate, oxalate, palmitate, pamoate (1,1-methylene-bis-2-hydroxy-3-naphthoate, embonate), pantothenate, phosphate/diphosphate, picrate, polygalacturonate, propionate, p-toluenesulfonate, salicylate, stearate, subacetate, succinate, sulfate, sulfosalicylate, suramate, tannate, tartrate, teoclate, tosylate, triethiodide, and valerate salts.
"pharmaceutically acceptable salts" also include both acid addition salts and base addition salts. "pharmaceutically acceptable acid addition salts" may refer to those salts that retain the biological effectiveness and properties of the free base, are not biologically or otherwise undesirable, and are formed with inorganic acids such as, but not limited to, hydrochloric acid, hydrobromic acid, sulfuric acid, nitric acid, phosphoric acid, and the like, and organic acids such as, but not limited to, acetic acid, 2,2-dichloroacetic acid, adipic acid, alginic acid, ascorbic acid, aspartic acid, benzenesulfonic acid, benzoic acid, 4-acetamidobenzoic acid, camphoric acid, camphor-10-sulfonic acid, capric acid, hexanoic acid, octanoic acid, carbonic acid, cinnamic acid, citric acid, cyclamic acid, dodecylsulfuric acid, ethane-1,2-disulfonic acid, ethanesulfonic acid, 2-hydroxyethanesulfonic acid, formic acid, fumaric acid, galactaric acid, gentisic acid, glucoheptonic acid, gluconic acid, glucuronic acid, glutamic acid, glutaric acid, 2-oxo-glutaric acid, glycerophosphoric acid, glycolic acid, hippuric acid, isobutyric acid, lactic acid, lactobionic acid, lauric acid, maleic acid, malic acid, malonic acid, mandelic acid, methanesulfonic acid, mucic acid, naphthalene-1,5-disulfonic acid, naphthalene-2-sulfonic acid, 1-hydroxy-2-naphthoic acid, nicotinic acid, oleic acid, orotic acid, oxalic acid, palmitic acid, pamoic acid, propionic acid, pyroglutamic acid, pyruvic acid, salicylic acid, 4-aminosalicylic acid, sebacic acid, stearic acid, succinic acid, tartaric acid, thiocyanic acid, p-toluenesulfonic acid, trifluoroacetic acid, undecylenic acid, and the like.
"pharmaceutically acceptable base addition salts" may refer to those salts that retain the biological effectiveness and properties of the free acid and are not biologically or otherwise undesirable. These salts are prepared by adding an inorganic or organic base to the free acid. Salts derived from inorganic bases include, but are not limited to, sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper, manganese, and aluminum salts, and the like. For example, inorganic salts include, but are not limited to, ammonium, sodium, potassium, calcium, and magnesium salts. Salts derived from organic bases include, but are not limited to, salts of primary, secondary, and tertiary amines, substituted amines (including naturally occurring substituted amines), cyclic amines, and basic ion exchange resins, such as ammonia, isopropylamine, trimethylamine, diethylamine, triethylamine, tripropylamine, diethanolamine, ethanolamine, deanol, 2-dimethylaminoethanol, 2-diethylaminoethanol, dicyclohexylamine, lysine, arginine, histidine, caffeine, procaine, hydrabamine, choline, betaine, benethamine, benzathine, ethylenediamine, glucosamine, methylglucamine, theobromine, triethanolamine, tromethamine, purines, piperazine, piperidine, N-ethylpiperidine, polyamine resins, and the like.
The present disclosure provides a method of producing a cannabinoid or cannabinoid derivative comprising using an engineered variant of the present disclosure. In certain such embodiments, the amount of cannabinoid or cannabinoid derivative produced, in mg/L or mM, is greater than the amount of cannabinoid or cannabinoid derivative produced in a method comprising use of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide having the amino acid sequence of SEQ ID NO:44 rather than an engineered variant of the present disclosure. In certain such embodiments, the engineered variant of the present disclosure and the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 are used under similar conditions for the same length of time. In some embodiments of the methods of producing a cannabinoid or cannabinoid derivative of the disclosure, the amount of cannabinoid or cannabinoid derivative produced is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of cannabinoid or cannabinoid derivative produced in an alternative method that includes using a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 rather than an engineered variant of the disclosure. In certain such embodiments, the engineered variant of the present disclosure and the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 are used under similar conditions for the same length of time.
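The comparison above amounts to a relative-increase calculation on two titers measured under matched conditions. The following is a minimal illustrative sketch, not part of the disclosure: the function names and the example titers are hypothetical, and the THCA molar mass (about 358.48 g/mol for C22H30O4) is an assumed value used only to show the mg/L to mM conversion.

```python
# Illustrative only: names and example values are assumptions, not data from the disclosure.

THCA_MOLAR_MASS_G_PER_MOL = 358.48  # C22H30O4, assumed for the unit conversion

# Percentage thresholds recited in the paragraph above.
THRESHOLDS_PERCENT = (5, 10, 15, 20, 25, 30, 35, 40, 45, 50,
                      60, 70, 80, 90, 100, 150, 200, 500, 1000)


def mg_per_l_to_mm(titer_mg_per_l: float,
                   molar_mass_g_per_mol: float = THCA_MOLAR_MASS_G_PER_MOL) -> float:
    """Convert a titer from mg/L to mM: (mg/L) / (g/mol) = mmol/L."""
    return titer_mg_per_l / molar_mass_g_per_mol


def percent_increase(variant_titer: float, reference_titer: float) -> float:
    """Percent by which the variant titer exceeds the reference titer.

    Both titers must be in the same unit and measured under similar
    conditions for the same length of time.
    """
    if reference_titer <= 0:
        raise ValueError("reference titer must be positive")
    return (variant_titer - reference_titer) / reference_titer * 100.0


def highest_threshold_met(pct: float):
    """Return the largest recited 'at least N%' threshold satisfied, or None."""
    met = [t for t in THRESHOLDS_PERCENT if pct >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical titers: 150 mg/L with a variant vs 100 mg/L with the reference.
    pct = percent_increase(150.0, 100.0)
    print(pct, highest_threshold_met(pct), round(mg_per_l_to_mm(150.0), 3))
```

Because the increase is a ratio of two titers, it is the same whether both are expressed in mg/L or both in mM, provided the same product is compared.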
In some embodiments of the methods of producing a cannabinoid or cannabinoid derivative of the disclosure, the cannabinoid is THCA and the methods produce THCA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio of THCA to the other cannabinoid produced in methods that include use of a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 rather than an engineered variant of the disclosure. In certain such embodiments, the engineered variant of the present disclosure and the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 are used under similar conditions for the same length of time. In some embodiments of the methods of producing a cannabinoid or a cannabinoid derivative of the disclosure, the cannabinoid is CBDA and the method produces the cannabinoid at a ratio of the cannabinoid to another cannabinoid (e.g., THCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
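The product ratios discussed above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the function names are hypothetical and the titers are invented for illustration; they are not results from the disclosure.

```python
# Illustrative only: names and example values are assumptions, not data from the disclosure.


def product_ratio(target_titer: float, other_titer: float) -> float:
    """Ratio of the target cannabinoid to another cannabinoid (same units for both)."""
    if other_titer <= 0:
        raise ValueError("the other cannabinoid's titer must be positive")
    return target_titer / other_titer


def format_ratio(ratio: float) -> str:
    """Render a ratio in the 'N:1' style used in the text."""
    return f"{ratio:g}:1"


def ratio_is_increased(variant_ratio: float, reference_ratio: float) -> bool:
    """True when the variant gives a higher target:other ratio than the reference."""
    return variant_ratio > reference_ratio


if __name__ == "__main__":
    # Hypothetical titers in mg/L: (target cannabinoid, other cannabinoid).
    variant = product_ratio(250.0, 20.0)
    reference = product_ratio(200.0, 40.0)
    print(format_ratio(variant), format_ratio(reference),
          ratio_is_increased(variant, reference))
```

As with the titer comparison, the two ratios are only comparable when the variant and the reference polypeptide are used under similar conditions for the same length of time.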
Methods of producing cannabinoids or cannabinoid derivatives using host cells
The present disclosure provides methods of producing a cannabinoid or cannabinoid derivative (such as those described herein) comprising: culturing the modified host cell of the present disclosure in a culture medium. In certain such embodiments, the method comprises recovering the produced cannabinoid or cannabinoid derivative. In certain such embodiments, the resulting cannabinoid or cannabinoid derivative is then purified as disclosed herein.
In some embodiments, a modified host cell of the present disclosure is cultured in a culture medium to provide synthesis of a cannabinoid or a cannabinoid derivative (such as those described herein) in an increased amount as compared to an unmodified host cell cultured under similar conditions.
The present disclosure provides methods of producing a cannabinoid or cannabinoid derivative (such as those described herein) comprising: culturing the modified host cell of the present disclosure in a medium comprising a carboxylic acid. In certain such embodiments, the methods comprise recovering the produced cannabinoid or cannabinoid derivative. In certain such embodiments, the recovered cannabinoid or cannabinoid derivative is then purified as disclosed herein.
In some embodiments, the cannabinoid or cannabinoid derivative is recovered from the cell lysate; from the culture medium; from the modified host cell; from both the cell lysate and the culture medium; from both the modified host cell and the culture medium; or from the cell lysate, the modified host cell, and the culture medium. In certain such embodiments, the recovered cannabinoid or cannabinoid derivative is then purified as disclosed herein. In some embodiments, when the cannabinoid or cannabinoid derivative is recovered from the cell lysate; from the culture medium; from the modified host cell; from both the cell lysate and the culture medium; from both the modified host cell and the culture medium; from the cell lysate, the modified host cell, and the culture medium; or from a cell-free reaction mixture comprising one or more polypeptides disclosed herein, the recovered cannabinoid or cannabinoid derivative is in a salt form. In certain such embodiments, the salt is a pharmaceutically acceptable salt. In some embodiments, the salt of the recovered cannabinoid or cannabinoid derivative is then purified as disclosed herein.
In some embodiments, the modified host cells of the present disclosure are cultured in a medium comprising a carboxylic acid. In some embodiments, the carboxylic acid may be substituted with or comprise one or more functional and/or reactive groups. Functional groups can include, but are not limited to, azido, halogen (e.g., chlorine, bromine, iodine, fluorine), methyl, alkyl, alkynyl, alkenyl, methoxy, alkoxy, acetyl, amino, carboxyl, carbonyl, oxo, ester, hydroxyl, thio (e.g., thiol), cyano, aryl, heteroaryl, cycloalkyl, cycloalkenyl, cycloalkylalkenyl, cycloalkylalkynyl, cycloalkenylalkyl, cycloalkenylalkenyl, cycloalkenylalkynyl, heterocyclylalkenyl, heterocyclylalkynyl, heteroarylalkynyl, arylalkenyl, arylalkynyl, spiro, heterospiro, heterocyclyl, thioalkyl (or alkylthio), arylthio, heteroarylthio, sulfone, sulfonyl, sulfoxide, amido, alkylamino, dialkylamino, arylamino, alkylarylamino, diarylamino, N-oxide, imide, enamine, imine, oxime, hydrazones, nitriles, aralkyl, cycloalkylalkyl, haloalkyl, heterocyclylalkyl, heteroarylalkyl, nitro, thioketo, and the like. Suitable reactive groups may include, but are not necessarily limited to, azides, halogens, carboxyls, carbonyls, amines (e.g., alkylamines (e.g., lower alkylamines), arylamines), esters (e.g., alkyl esters (e.g., lower alkyl esters, benzyl esters), aryl esters, substituted aryl esters), cyano, thioesters, thioethers, sulfonyl halides, alcohols, thiols, succinimidyl esters, isothiocyanates, iodoacetamides, maleimides, hydrazines, alkynyls, alkenyls, and the like. In some embodiments, the reactive group is selected from the group consisting of carboxyl, carbonyl, amine, ester, thioester, thioether, sulfonyl halide, alcohol, thiol, succinimide ester, isothiocyanate, iodoacetamide, maleimide, azide, alkyne, alkene, and hydrazine. The functional group and the reactive group may be unsubstituted or substituted with one or more functional groups or reactive groups.
In some embodiments, the carboxylic acid is isotopically labeled or radiolabeled. In some embodiments, the carboxylic acid may be an enantiomer or a diastereomer. In some embodiments, the carboxylic acid may be the (S)-enantiomer. In some embodiments, the carboxylic acid may be the (R)-enantiomer. In some embodiments, the carboxylic acid may be the (+) or (-) enantiomer. In some embodiments, the carboxylic acid may include double bonds or fused rings. In certain such embodiments, the double bonds or fused rings may be cis or trans unless the configuration is explicitly defined. Unless the configuration is explicitly defined, if the carboxylic acid contains a double bond, the substituents may be in the E or Z configuration.
In some embodiments, the carboxylic acid comprises a C=C group. In some embodiments, the carboxylic acid comprises an alkyne group. In some embodiments, the carboxylic acid comprises an N₃ group. In some embodiments, the carboxylic acid comprises a halogen. In some embodiments, the carboxylic acid comprises a CN group. In some embodiments, the carboxylic acid comprises iodine. In some embodiments, the carboxylic acid comprises bromine. In some embodiments, the carboxylic acid comprises chlorine. In some embodiments, the carboxylic acid comprises fluorine. In some embodiments, the carboxylic acid comprises a carbonyl group. In some embodiments, the carboxylic acid comprises an acetyl group. In some embodiments, the carboxylic acid comprises an alkyl group. In some embodiments, the carboxylic acid comprises an aryl group.
The carboxylic acid may include, but is not limited to, unsubstituted or substituted C₃-C₁₈ fatty acids, C₃-C₁₈ carboxylic acids, C₁-C₁₈ carboxylic acids, butyric acid, isobutyric acid, valeric acid, caproic acid, enanthic acid, caprylic acid, pelargonic acid, capric acid, undecanoic acid, lauric acid, myristic acid, C₁₅-C₁₈ fatty acids, C₁₅-C₁₈ carboxylic acids, fumaric acid, itaconic acid, malic acid, succinic acid, maleic acid, malonic acid, glutaric acid, glucaric acid, oxalic acid, adipic acid, pimelic acid, suberic acid, azelaic acid, sebacic acid, dodecanedioic acid, glutaconic acid, phthalic acid, isophthalic acid, terephthalic acid, citric acid, isocitric acid, aconitic acid, tricarballylic acid, and trimesic acid. The carboxylic acid may include unsubstituted or substituted C₁-C₁₈ carboxylic acids. The carboxylic acid may include unsubstituted or substituted C₃-C₁₈ carboxylic acids. The carboxylic acid may include unsubstituted or substituted C₃-C₁₂ carboxylic acids. The carboxylic acid may include unsubstituted or substituted C₄-C₁₀ carboxylic acids. In some embodiments, the carboxylic acid is an unsubstituted or substituted C₄ carboxylic acid. In some embodiments, the carboxylic acid is an unsubstituted or substituted C₅ carboxylic acid. In some embodiments, the carboxylic acid is an unsubstituted or substituted C₆ carboxylic acid. In some embodiments, the carboxylic acid is an unsubstituted or substituted C₇ carboxylic acid. In some embodiments, the carboxylic acid is an unsubstituted or substituted C₈ carboxylic acid. In some embodiments, the carboxylic acid is an unsubstituted or substituted C₉ carboxylic acid. In some embodiments, the carboxylic acid is an unsubstituted or substituted C₁₀ carboxylic acid. In some embodiments, the carboxylic acid is unsubstituted or substituted butyric acid. In some embodiments, the carboxylic acid is unsubstituted or substituted pentanoic acid.
In some embodiments, the carboxylic acid is unsubstituted or substituted hexanoic acid. In some embodiments, the carboxylic acid is unsubstituted or substituted heptanoic acid. In some embodiments, the carboxylic acid is unsubstituted or substituted octanoic acid. In some embodiments, the carboxylic acid is unsubstituted or substituted nonanoic acid. In some embodiments, the carboxylic acid is unsubstituted or substituted decanoic acid.
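The carbon-number ranges above (for example C₄-C₁₀) can be checked against the named straight-chain acids with a simple lookup. This is an illustrative sketch only: the table covers just the unsubstituted straight-chain monocarboxylic acids named in the text, using their standard carbon counts, and the function names are hypothetical.

```python
# Illustrative only: a small lookup of total carbon counts for the
# unsubstituted straight-chain acids named in the text (common and
# systematic names). Function names are assumptions for this sketch.

CARBON_COUNT = {
    "butyric acid": 4,
    "valeric acid": 5, "pentanoic acid": 5,
    "caproic acid": 6, "hexanoic acid": 6,
    "enanthic acid": 7, "heptanoic acid": 7,
    "caprylic acid": 8, "octanoic acid": 8,
    "pelargonic acid": 9, "nonanoic acid": 9,
    "capric acid": 10, "decanoic acid": 10,
    "undecanoic acid": 11,
    "lauric acid": 12, "dodecanoic acid": 12,
    "myristic acid": 14,
}


def in_carbon_range(name: str, low: int, high: int) -> bool:
    """True if the named acid has between `low` and `high` carbons, inclusive."""
    return low <= CARBON_COUNT[name.lower()] <= high


def acids_in_range(low: int, high: int) -> list:
    """Alphabetical list of the named acids that fall within C_low to C_high."""
    return sorted(n for n, c in CARBON_COUNT.items() if low <= c <= high)


if __name__ == "__main__":
    print(in_carbon_range("hexanoic acid", 4, 10))   # within C4-C10
    print(in_carbon_range("myristic acid", 4, 10))   # outside C4-C10
    print(acids_in_range(4, 5))
```

Substituted or branched acids such as 2-methylhexanoic acid are deliberately left out of the table, since how their carbons are counted against a C₃-C₁₈ range is not specified in this passage.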
The carboxylic acid may include, but is not limited to, 2-methylhexanoic acid, 3-methylhexanoic acid, 4-methylhexanoic acid, 5-methylhexanoic acid, 2-hexenoic acid, 3-hexenoic acid, 4-hexenoic acid, 5-chloropentanoic acid, 5-aminopentanoic acid, 5-cyanovaleric acid, 5-(methylsulfanyl)pentanoic acid, 5-hydroxypentanoic acid, 5-phenylpentanoic acid, 2,3-dimethylhexanoic acid, d₃-hexanoic acid, 4-pentynoic acid, trans-2-pentenoic acid, 5-hexynoic acid, trans-2-hexenoic acid, 6-heptynoic acid, trans-2-octenoic acid, trans-2-nonenoic acid, 4-phenylbutyric acid, 6-phenylhexanoic acid, 7-phenylheptanoic acid, and the like. In some embodiments, the carboxylic acid is 2-methylhexanoic acid. In some embodiments, the carboxylic acid is 3-methylhexanoic acid. In some embodiments, the carboxylic acid is 4-methylhexanoic acid. In some embodiments, the carboxylic acid is 5-methylhexanoic acid. In some embodiments, the carboxylic acid is 2-hexenoic acid. In some embodiments, the carboxylic acid is 3-hexenoic acid. In some embodiments, the carboxylic acid is 4-hexenoic acid. In some embodiments, the carboxylic acid is 5-hexenoic acid. In some embodiments, the carboxylic acid is 5-chloropentanoic acid. In some embodiments, the carboxylic acid is 5-aminopentanoic acid. In some embodiments, the carboxylic acid is 5-cyanovaleric acid. In some embodiments, the carboxylic acid is 5-(methylsulfanyl)pentanoic acid. In some embodiments, the carboxylic acid is 5-hydroxypentanoic acid. In some embodiments, the carboxylic acid is 5-phenylpentanoic acid. In some embodiments, the carboxylic acid is 2,3-dimethylhexanoic acid. In some embodiments, the carboxylic acid is d₃-hexanoic acid. In some embodiments, the carboxylic acid is 4-pentynoic acid. In some embodiments, the carboxylic acid is trans-2-pentenoic acid. In some embodiments, the carboxylic acid is 5-hexynoic acid. In some embodiments, the carboxylic acid is trans-2-hexenoic acid.
In some embodiments, the carboxylic acid is 6-heptynoic acid. In some embodiments, the carboxylic acid is trans-2-octenoic acid. In some embodiments, the carboxylic acid is trans-2-nonenoic acid. In some embodiments, the carboxylic acid is 4-phenylbutyric acid. In some embodiments, the carboxylic acid is 6-phenylhexanoic acid. In some embodiments, the carboxylic acid is 7-phenylheptanoic acid.
In some embodiments in which the modified host cells of the present disclosure are cultured in a medium comprising a carboxylic acid, the carboxylic acid is an unsubstituted or substituted C₃-C₁₈ carboxylic acid. In certain such embodiments, the unsubstituted or substituted C₃-C₁₈ carboxylic acid is unsubstituted or substituted hexanoic acid. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100 mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50 mg/L of medium.
In some embodiments in which the modified host cells of the present disclosure are cultured in a medium comprising a carboxylic acid, the carboxylic acid is butyric acid, valeric acid, caproic acid, caprylic acid, 2-methylhexanoic acid, 3-methylhexanoic acid, 4-methylhexanoic acid, 5-methylhexanoic acid, 2-hexenoic acid, 3-hexenoic acid, 4-hexenoic acid, 5-hexenoic acid, heptanoic acid, 5-chloropentanoic acid, 5-(methylsulfanyl)pentanoic acid, 4-pentynoic acid, trans-2-pentenoic acid, 5-hexynoic acid, trans-2-hexenoic acid, 6-heptynoic acid, trans-2-octenoic acid, nonanoic acid, trans-2-nonenoic acid, decanoic acid, undecanoic acid, dodecanoic acid, myristic acid, 4-phenylbutyric acid, 5-phenylpentanoic acid, 6-phenylhexanoic acid, 7-phenylheptanoic acid, isobutyric acid, fumaric acid, itaconic acid, malic acid, succinic acid, maleic acid, malonic acid, glutaric acid, glucaric acid, oxalic acid, adipic acid, pimelic acid, suberic acid, azelaic acid, sebacic acid, dodecanedioic acid, glutaconic acid, phthalic acid, isophthalic acid, terephthalic acid, citric acid, isocitric acid, aconitic acid, tricarballylic acid, trimesic acid, 5-aminopentanoic acid, 5-cyanopentanoic acid, 5-hydroxypentanoic acid, or 2,3-dimethylhexanoic acid. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100 mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50 mg/L of medium.
In some embodiments in which the modified host cells of the present disclosure are cultured in a medium comprising a carboxylic acid, the carboxylic acid is butyric acid, valeric acid, caproic acid, caprylic acid, 2-methylhexanoic acid, 3-methylhexanoic acid, 4-methylhexanoic acid, 5-methylhexanoic acid, 2-hexenoic acid, 3-hexenoic acid, 4-hexenoic acid, 5-hexenoic acid, heptanoic acid, 5-chloropentanoic acid, 5-(methylsulfanyl)pentanoic acid, 4-pentynoic acid, trans-2-pentenoic acid, 5-hexynoic acid, trans-2-hexenoic acid, 6-heptynoic acid, trans-2-octenoic acid, nonanoic acid, trans-2-nonenoic acid, decanoic acid, undecanoic acid, dodecanoic acid, myristic acid, 4-phenylbutyric acid, 5-phenylpentanoic acid, 6-phenylhexanoic acid, 7-phenylheptanoic acid, isobutyric acid, fumaric acid, succinic acid, maleic acid, malonic acid, glutaric acid, oxalic acid, adipic acid, pimelic acid, suberic acid, azelaic acid, sebacic acid, dodecanedioic acid, phthalic acid, isophthalic acid, terephthalic acid, trimesic acid, 5-aminopentanoic acid, 5-cyanovaleric acid, 5-hydroxypentanoic acid, or 2,3-dimethylhexanoic acid. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100 mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50 mg/L of medium.
In some embodiments in which the modified host cells of the present disclosure are cultured in a medium comprising a carboxylic acid, the carboxylic acid is 2-methylhexanoic acid, 3-methylhexanoic acid, 4-methylhexanoic acid, 5-methylhexanoic acid, 2-hexenoic acid, 3-hexenoic acid, 4-hexenoic acid, heptanoic acid, 5-chloropentanoic acid, 5-(methylsulfanyl)pentanoic acid, 4-pentynoic acid, trans-2-pentenoic acid, 5-hexynoic acid, trans-2-hexenoic acid, 6-heptynoic acid, trans-2-octenoic acid, nonanoic acid, trans-2-nonenoic acid, decanoic acid, undecanoic acid, dodecanoic acid, myristic acid, 4-phenylbutyric acid, 5-phenylpentanoic acid, 6-phenylhexanoic acid, 7-phenylheptanoic acid, isobutyric acid, fumaric acid, itaconic acid, malic acid, maleic acid, glucaric acid, suberic acid, azelaic acid, sebacic acid, dodecanedioic acid, glutaconic acid, phthalic acid, isophthalic acid, terephthalic acid, citric acid, isocitric acid, aconitic acid, tricarballylic acid, trimesic acid, 5-aminopentanoic acid, 5-cyanovaleric acid, 5-hydroxypentanoic acid, or 2,3-dimethylhexanoic acid. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100 mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50 mg/L of medium.
In some embodiments in which the modified host cells of the present disclosure are cultured in a medium comprising a carboxylic acid, the carboxylic acid is 2-methylhexanoic acid, 3-methylhexanoic acid, 4-methylhexanoic acid, 5-methylhexanoic acid, 2-hexenoic acid, 3-hexenoic acid, 4-hexenoic acid, heptanoic acid, 5-chloropentanoic acid, 5-(methylsulfanyl)pentanoic acid, 4-pentynoic acid, trans-2-pentenoic acid, 5-hexynoic acid, trans-2-hexenoic acid, 6-heptynoic acid, trans-2-octenoic acid, nonanoic acid, trans-2-nonenoic acid, decanoic acid, undecanoic acid, dodecanoic acid, myristic acid, 4-phenylbutyric acid, 5-phenylpentanoic acid, 6-phenylhexanoic acid, 7-phenylheptanoic acid, isobutyric acid, fumaric acid, maleic acid, suberic acid, azelaic acid, sebacic acid, dodecanedioic acid, phthalic acid, isophthalic acid, terephthalic acid, trimesic acid, 5-aminopentanoic acid, 5-cyanovaleric acid, 5-hydroxypentanoic acid, or 2,3-dimethylhexanoic acid. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50mg/L of medium.
In some embodiments in which the modified host cells of the present disclosure are cultured in a medium comprising a carboxylic acid, the carboxylic acid is 4-pentynoic acid, trans-2-pentenoic acid, 5-hexynoic acid, trans-2-hexenoic acid, 6-heptynoic acid, trans-2-octenoic acid, nonanoic acid, trans-2-nonenoic acid, decanoic acid, undecanoic acid, dodecanoic acid, 4-phenylbutyric acid, 5-phenylpentanoic acid, 6-phenylhexanoic acid, or 7-phenylheptanoic acid. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50mg/L of medium.
In some embodiments in which the modified host cells of the present disclosure are cultured in a medium comprising a carboxylic acid, the carboxylic acid is 2-methylhexanoic acid, 4-methylhexanoic acid, 5-methylhexanoic acid, 2-hexenoic acid, 3-hexenoic acid, 4-hexenoic acid, heptanoic acid, 5-chloropentanoic acid, or 5-(methylsulfanyl)pentanoic acid. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50mg/L of medium.
The present disclosure also provides methods of producing a cannabinoid or cannabinoid derivative (such as those described herein) comprising: culturing the modified host cell of the present disclosure in a medium comprising olivetolic acid or an olivetolic acid derivative. In certain such embodiments, the methods comprise recovering the produced cannabinoid or cannabinoid derivative. In certain such embodiments, the recovered cannabinoid or cannabinoid derivative is then purified as disclosed herein.
The olivetolic acid derivatives used herein may be substituted with or comprise one or more reactive groups and/or functional groups disclosed herein. In some embodiments, the olivetolic acid derivative may lack one or more chemical moieties found in olivetolic acid. In some embodiments, when the medium comprises an olivetolic acid derivative, the olivetolic acid derivative is orcinol. In some embodiments, when the medium comprises an olivetolic acid derivative, the olivetolic acid derivative is divarinic acid. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 100mg/L of medium. In some embodiments, the cannabinoid or cannabinoid derivative is produced in an amount greater than 50mg/L of medium.
The present disclosure provides methods of producing cannabinoids or cannabinoid derivatives using the modified host cells of the disclosure. In some embodiments of methods of producing a cannabinoid or cannabinoid derivative using a modified host cell of the present disclosure, the amount of cannabinoid or cannabinoid derivative produced, in mg/L or mM, is greater than the amount of cannabinoid or cannabinoid derivative produced in an alternative method comprising culturing a modified host cell that comprises one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but that lacks a nucleic acid comprising a nucleotide sequence encoding an engineered variant. In certain such embodiments, the modified host cells of the present disclosure and the modified host cells comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, are cultured under similar culture conditions for the same length of time.
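Because the comparison above may be expressed in either mg/L or mM, the two units can be interconverted once a molar mass is fixed. The following is a minimal sketch, assuming the product is THCA with molecular formula C22H30O4; the function names and the atomic-mass table are illustrative and are not part of the disclosure.

```python
# Illustrative only: interconvert a titer between mg/L and mM.
# Assumption: the product is THCA, molecular formula C22H30O4.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}  # g/mol, standard atomic weights


def molar_mass(formula_counts):
    """Sum atomic masses for a composition such as {"C": 22, "H": 30, "O": 4}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula_counts.items())


THCA_G_PER_MOL = molar_mass({"C": 22, "H": 30, "O": 4})


def mg_per_l_to_mm(titer_mg_per_l, g_per_mol=THCA_G_PER_MOL):
    # (mg/L) / (g/mol) = mmol/L, i.e. mM
    return titer_mg_per_l / g_per_mol


def mm_to_mg_per_l(conc_mm, g_per_mol=THCA_G_PER_MOL):
    # (mmol/L) * (g/mol) = mg/L
    return conc_mm * g_per_mol
```

For a different cannabinoid or cannabinoid derivative, a different composition would be passed to `molar_mass`.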
In some embodiments of the methods of producing a cannabinoid or cannabinoid derivative using the modified host cell of the present disclosure, the amount of cannabinoid or cannabinoid derivative produced, in mg/L or mM, is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of cannabinoid or cannabinoid derivative produced in an alternative method comprising culturing a modified host cell that comprises one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but that lacks a nucleic acid comprising a nucleotide sequence encoding an engineered variant. In certain such embodiments, the modified host cells of the present disclosure and the modified host cells comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant, are cultured under similar culture conditions for the same length of time.
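The percentage comparison against the SEQ ID NO:44 reference strain can be illustrated with a short calculation. This is a sketch only: the function names are hypothetical, both titers are assumed to be measured in the same unit (mg/L or mM) under similar culture conditions, and the threshold list simply mirrors the "at least" values recited above.

```python
# Illustrative only: percent increase of an engineered-variant titer over a
# reference titer (strain expressing the SEQ ID NO:44 polypeptide).
THRESHOLDS_PERCENT = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50,
                      60, 70, 80, 90, 100, 150, 200, 500, 1000]


def percent_increase(engineered_titer, reference_titer):
    """Return how much larger the engineered titer is, in percent of the reference."""
    if reference_titer <= 0:
        raise ValueError("reference titer must be positive")
    return 100.0 * (engineered_titer - reference_titer) / reference_titer


def highest_threshold_met(increase_percent):
    """Return the largest recited 'at least N%' threshold satisfied, or None."""
    met = [t for t in THRESHOLDS_PERCENT if increase_percent >= t]
    return max(met) if met else None
```

For example, an engineered titer of 300 against a reference titer of 100 (same units) is a 200% increase and meets the "at least 200%" threshold.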
In some embodiments of methods of producing a cannabinoid or cannabinoid derivative using a modified host cell of the disclosure, the cannabinoid is THCA and the method produces THCA at an increased ratio of THCA to another cannabinoid (e.g., CBCA) as compared to the ratio produced in an alternative method comprising culturing a modified host cell that comprises one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but that lacks a nucleic acid comprising a nucleotide sequence encoding an engineered variant, grown under similar culture conditions for the same length of time. In some embodiments of methods of producing a cannabinoid or cannabinoid derivative using a modified host cell of the disclosure, the cannabinoid is THCA and the method produces THCA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
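A product ratio of the kind recited above can be computed directly from two measured titers. The sketch below is illustrative: the function names are hypothetical, and it assumes THCA and the second cannabinoid (e.g., CBCA) are quantified in the same unit from the same culture.

```python
# Illustrative only: THCA-to-CBCA product ratio and comparison with a reference strain.
def product_ratio(thca_titer, other_titer):
    """Return the THCA : other-cannabinoid ratio as a single number (x in 'x:1')."""
    if other_titer <= 0:
        raise ValueError("the second cannabinoid titer must be positive")
    return thca_titer / other_titer


def format_ratio(ratio):
    """Render a ratio in the 'x:1' notation used in the text."""
    return f"{ratio:g}:1"


def ratio_is_increased(engineered_ratio, reference_ratio):
    """True if the engineered strain gives a higher ratio than the reference strain."""
    return engineered_ratio > reference_ratio
```

For example, 220 mg/L THCA alongside 20 mg/L CBCA corresponds to a ratio of 11:1.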
Exemplary cell culture conditions
Suitable media for culturing the modified host cells of the present disclosure may include standard media (e.g., Luria-Bertani broth, optionally supplemented with one or more additional agents, such as an inducer where the nucleic acids disclosed herein are under the control of an inducible promoter; standard yeast media; and the like). In some embodiments, the medium may be supplemented with fermentable sugars (e.g., a hexose such as glucose, a pentose such as xylose, and the like). Sugars that can be fermented by yeast may include, but are not limited to, sucrose, dextrose, glucose, fructose, mannose, galactose, and maltose.
In some embodiments, the medium may be supplemented with unsubstituted or substituted hexanoic acid, a carboxylic acid other than unsubstituted or substituted hexanoic acid, olivetolic acid, or an olivetolic acid derivative. In some embodiments, the culture medium can be supplemented with a pretreated cellulosic feedstock (e.g., wheat straw, barley straw, sorghum, rice straw, sugar cane bagasse, switchgrass, corn stover, corn fiber, grain, or any combination thereof). In some embodiments, the medium may be supplemented with oleic acid. In some embodiments, the medium comprises a non-fermentable carbon source. In certain such embodiments, the non-fermentable carbon source comprises ethanol. In some embodiments, a suitable medium comprises an inducer. In certain such embodiments, the inducer comprises galactose. In some embodiments, the inducer comprises KH2PO4, galactose, glucose, sucrose, maltose, an amino acid (e.g., methionine, lysine), CuSO4, a temperature change (e.g., 30 ℃ to 37 ℃), a pH change (e.g., pH 6 to pH 4), a change in oxygen level (e.g., a dissolved oxygen level changing from 20% to 1%), addition of the hydrogen peroxide- or superoxide-generating drug menadione, tunicamycin, expression of a protein prone to misfolding (e.g., a cannabinoid synthase), estradiol, or doxycycline. Additional induction systems are detailed herein.
The carbon source in suitable media can vary significantly from simple sugars such as glucose to more complex hydrolysates of other biomass such as yeast extract. The addition of salts typically provides essential elements such as magnesium, nitrogen, phosphorus, and sulfur to allow the cell to synthesize polypeptides and nucleic acids. Suitable media may also be supplemented with selective agents, such as antibiotics, to select for maintenance of certain plasmids, and the like. For example, if a microorganism is resistant to an antibiotic (such as ampicillin or tetracycline), the antibiotic can be added to the medium to prevent growth of cells that lack resistance. Suitable media may be supplemented with other compounds as necessary to select for desired physiological or biochemical characteristics, such as particular amino acids and the like.
In some embodiments, the modified host cells disclosed herein are grown in minimal medium. As used herein, the term "minimal medium" can refer to a growth medium that contains the minimal nutrients possible for cell growth, typically but not always in the absence of one or more amino acids (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more amino acids). Minimal media typically comprise (1) a carbon source for growth of cells (e.g., bacteria or yeast); (2) various salts that can vary between cell (e.g., bacterial or yeast) species and growth conditions; and (3) water.
In some embodiments, the modified host cells disclosed herein are grown in rich media. In certain such embodiments, the one or more rich media comprise yeast extract peptone dextrose (YPD) medium comprising water, 10g/L yeast extract, 20g/L bacto peptone, and 20g/L dextrose (glucose). In some embodiments, the one or more rich media comprise YP +20g/L galactose and 1g/L glucose. In some embodiments, the one or more rich media comprise a carboxylic acid (e.g., 1mM olivetolic acid derivative, 2mM unsubstituted or substituted hexanoic acid, or 2mM carboxylic acid other than unsubstituted or substituted hexanoic acid). In some embodiments, the one or more rich media provide for faster cell growth compared to the one or more minimal media.
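The per-liter compositions given above scale linearly with batch volume. The following sketch shows that arithmetic; the dictionary and function names are hypothetical, and the YP base of the galactose medium is assumed (not stated in the text) to use the same yeast extract and peptone concentrations as YPD.

```python
# Illustrative only: scale the rich-medium recipes from g/L to a batch volume.
YPD_G_PER_L = {
    "yeast extract": 10.0,
    "bacto peptone": 20.0,
    "dextrose": 20.0,
}

# Assumption: "YP" here means the same yeast extract and peptone levels as YPD.
YP_GAL_G_PER_L = {
    "yeast extract": 10.0,
    "bacto peptone": 20.0,
    "galactose": 20.0,
    "glucose": 1.0,
}


def grams_needed(recipe_g_per_l, volume_l):
    """Return the mass of each component, in grams, for the given volume in liters."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: conc * volume_l for name, conc in recipe_g_per_l.items()}
```

For example, `grams_needed(YPD_G_PER_L, 0.5)` gives the masses for a 500 mL batch, made up to volume with water.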
Materials and methods suitable for maintenance and growth of the recombinant cells of the present disclosure are described herein, e.g., in the examples section. Other materials and methods suitable for cell (e.g., bacterial or yeast) culture maintenance and growth are well known in the art. Exemplary techniques can be found in International publication No. WO 2009/076676, U.S. patent application No. 12/335,071 (U.S. publication No. 2009/0203102), WO 2010/003007, U.S. publication No. 2010/0048964, WO 2009/132220, U.S. publication No. 2010/0003716, Manual of Methods for General Bacteriology, Gerhardt et al., American Society for Microbiology, Washington, D.C. (1994), or Brock, Biotechnology: A Textbook of Industrial Microbiology, second edition (1989) Sinauer Associates, Inc., Sunderland, MA.
Standard cell culture conditions can be used to culture the modified host cells disclosed herein (see, e.g., WO 2004/033646 and references cited therein). In some embodiments, the cells are grown and maintained at an appropriate temperature, gas mixture, and pH (such as at about 20 ℃ to about 37 ℃, at about 0.04% to about 84% CO2, at about 0% to about 100% dissolved oxygen, and at a pH of from about 2 to about 9). In some embodiments, the modified host cells disclosed herein are grown in a suitable cell culture medium at about 34 ℃. In some embodiments, the modified host cells disclosed herein are grown in a suitable cell culture medium at about 20 ℃ to about 37 ℃. Although about 30 ℃ is most suitable for growth of Saccharomyces cerevisiae, culturing cells at higher temperatures, such as 34 ℃, can be advantageous by reducing the cost of cooling industrial fermenters. In some embodiments, the modified host cells disclosed herein are grown in a suitable cell culture medium at about 20 ℃, about 21 ℃, about 22 ℃, about 23 ℃, about 24 ℃, about 25 ℃, about 26 ℃, about 27 ℃, about 28 ℃, about 29 ℃, about 30 ℃, about 31 ℃, about 32 ℃, about 33 ℃, about 34 ℃, about 35 ℃, about 36 ℃, or about 37 ℃. In some embodiments, the pH of the fermentation ranges from about pH 3.0 to about pH 9.0 (such as about pH 3.0, about pH 3.5, about pH 4.0, about pH 4.5, about pH 5.0, about pH 5.5, about pH 6.0, about pH 6.5, about pH 7.0, about pH 7.5, about pH 8.0, about pH 8.5, about pH 6.0 to about pH 8.0, or about pH 6.5 to about pH 7.0). In some embodiments, the pH of the fermentation ranges from about pH 4.5 to about pH 5.5. In some embodiments, the pH of the fermentation ranges from about pH 4.0 to about pH 6.0. In some embodiments, the pH of the fermentation ranges from about pH 3.0 to about pH 6.0. In some embodiments, the pH of the fermentation ranges from about pH 3.0 to about pH 5.5. In some embodiments, the pH of the fermentation ranges from about pH 3.0 to about pH 5.0.
In some embodiments, the dissolved oxygen is in the range of about 0% to about 10%, about 0% to about 20%, about 0% to about 30%, about 0% to about 40%, about 0% to about 50%, about 0% to about 60%, about 0% to about 70%, about 0% to about 80%, about 0% to about 90%, about 5% to about 10%, about 5% to about 20%, about 5% to about 30%, about 5% to about 40%, about 5% to about 50%, about 5% to about 60%, about 5% to about 70%, about 5% to about 80%, about 5% to about 90%, about 10% to about 20%, about 10% to about 30%, about 10% to about 40%, or about 10% to about 50%. In some embodiments, the CO2 level is in the range of about 0.04% to about 0.1% CO2, about 0.04% to about 1% CO2, about 0.04% to about 5% CO2, about 0.04% to about 10% CO2, about 0.04% to about 20% CO2, about 0.04% to about 30% CO2, about 0.04% to about 40% CO2, about 0.04% to about 50% CO2, about 0.04% to about 60% CO2, about 0.04% to about 70% CO2, about 0.1% to about 5% CO2, about 0.1% to about 10% CO2, about 0.1% to about 20% CO2, about 0.1% to about 30% CO2, about 0.1% to about 40% CO2, about 0.1% to about 50% CO2, about 1% to about 5% CO2, about 1% to about 10% CO2, about 1% to about 20% CO2, about 1% to about 30% CO2, about 1% to about 40% CO2, about 1% to about 50% CO2, about 5% to about 10% CO2, about 10% to about 20% CO2, about 10% to about 30% CO2, about 10% to about 40% CO2, about 10% to about 50% CO2, about 10% to about 60% CO2, about 10% to about 70% CO2, about 10% to about 80% CO2, about 50% to about 60% CO2, about 50% to about 70% CO2, or about 50% to about 80% CO2. The modified host cells disclosed herein can be grown under aerobic, anoxic, microaerobic, or anaerobic conditions based on the needs of the cell.
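A planned set of culture conditions can be checked against the broadest ranges recited in this section. The sketch below is illustrative only: the names are hypothetical, and because "about" has no numerical definition here, the range endpoints are treated as hard limits purely as a simplifying assumption.

```python
# Illustrative only: flag culture parameters outside the broadest recited ranges.
# Assumption: endpoints are applied as hard limits, ignoring the latitude of "about".
BROAD_RANGES = {
    "temperature_c": (20.0, 37.0),              # about 20 to about 37 degrees C
    "co2_percent": (0.04, 84.0),                # about 0.04% to about 84% CO2
    "dissolved_oxygen_percent": (0.0, 100.0),   # about 0% to about 100%
    "ph": (2.0, 9.0),                           # about pH 2 to about pH 9
}


def out_of_range(conditions, ranges=BROAD_RANGES):
    """Return the names of the supplied parameters that fall outside their range."""
    problems = []
    for name, value in conditions.items():
        low, high = ranges[name]
        if not (low <= value <= high):
            problems.append(name)
    return problems
```

A narrower embodiment, such as a fermentation pH of about 4.5 to about 5.5, could be checked by passing a modified copy of `BROAD_RANGES` as the `ranges` argument.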
Standard culture conditions and fermentation modes that may be used, such as batch, fed-batch, or continuous fermentation, are described in International publication No. WO 2009/076676, U.S. patent application No. 12/335,071 (U.S. publication No. 2009/0203102), WO 2010/003007, U.S. publication No. 2010/0048964, WO 2009/132220, and U.S. publication No. 2010/0003716, the contents of each of which are incorporated herein by reference in their entirety. Batch and fed-batch fermentations are common and well known in the art, and examples can be found in Brock, Biotechnology: A Textbook of Industrial Microbiology, second edition (1989) Sinauer Associates, Inc.
Production of cannabinoids or cannabinoid derivatives and recovery of the produced cannabinoids or cannabinoid derivatives
The present disclosure provides for the production of an amount of a cannabinoid or cannabinoid derivative. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives (such as those disclosed herein) from a modified host cell of the present disclosure in an amount of from about 1mg/L of media to about 1g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 1mg/L culture medium to about 500mg/L culture medium. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 1mg/L of media to about 100mg/L of media. For example, in some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 1mg/L medium to about 5mg/L medium, from about 5mg/L medium to about 10mg/L medium, from about 10mg/L medium to about 25mg/L medium, from about 25mg/L medium to about 50mg/L medium, from about 50mg/L medium to about 75mg/L medium, or from about 75mg/L medium to about 100mg/L medium. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount of from about 100mg/L medium to about 150mg/L medium, from about 150mg/L medium to about 200mg/L medium, from about 200mg/L medium to about 250mg/L medium, from about 250mg/L medium to about 500mg/L medium, from about 500mg/L medium to about 750mg/L medium, or from about 750mg/L medium to about 1g/L medium. 
In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount of from about 50mg/L medium to about 100mg/L medium, from about 50mg/L medium to about 150mg/L medium, from about 50mg/L medium to about 200mg/L medium, from about 50mg/L medium to about 250mg/L medium, from about 50mg/L medium to about 500mg/L medium, or from about 50mg/L medium to about 750mg/L medium.
In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives (such as those disclosed herein) in an amount of from about 50mg/L of media to about 100g/L of media or greater than 100g/L of media. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives (such as those disclosed herein) in an amount of from about 50mg/L of media to about 100mg/L of media or greater than 100mg/L of media. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives (such as those disclosed herein) in amounts greater than 50mg/L of medium. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives (such as those disclosed herein) in amounts greater than 100mg/L of medium.
In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 100mg/L of media to about 500mg/L of media or greater than 500mg/L of media.
In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 500mg/L of media to about 1g/L of media or greater than 1g/L of media. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 1g/L of media to about 10g/L of media or greater than 10g/L of media. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 10g/L of media to about 100g/L of media or greater than 100g/L of media. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 1g/L of media to about 20g/L of media or greater than 20g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 1g/L of media to about 30g/L of media or greater than 30g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 1g/L culture medium to about 40g/L culture medium or greater than 40g/L culture medium. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 1g/L of media to about 50g/L of media or greater than 50g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 1g/L culture medium to about 60g/L culture medium or greater than 60g/L culture medium. 
In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 1g/L culture medium to about 70g/L culture medium or greater than 70g/L culture medium. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 1g/L of media to about 80g/L of media or greater than 80g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 1g/L culture medium to about 90g/L culture medium or greater than 90g/L culture medium. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 10g/L of media to about 20g/L of media or greater than 20g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 10g/L of media to about 30g/L of media or greater than 30g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 10g/L of media to about 40g/L of media or greater than 40g/L of media. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 10g/L of media to about 50g/L of media or greater than 50g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 10g/L of media to about 60g/L of media or greater than 60g/L of media. 
In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 10g/L of media to about 70g/L of media or greater than 70g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 10g/L of media to about 80g/L of media or greater than 80g/L of media. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 10g/L of media to about 90g/L of media or greater than 90g/L of media. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 50g/L of media to about 100g/L of media or greater than 100g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 50g/L of media to about 60g/L of media or greater than 60g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 50g/L of media to about 70g/L of media or greater than 70g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 50g/L of media to about 80g/L of media or greater than 80g/L of media. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 50g/L of media to about 90g/L of media or greater than 90g/L of media. 
In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 20g/L of media to about 100g/L of media or greater than 100g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 20g/L of media to about 30g/L of media or greater than 30g/L of media. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 20g/L of media to about 40g/L of media or greater than 40g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 20g/L culture medium to about 50g/L culture medium or greater than 50g/L culture medium. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 20g/L of media to about 60g/L of media or greater than 60g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 20g/L of media to about 70g/L of media or greater than 70g/L of media. In some embodiments, the methods of the present disclosure provide for the production of a cannabinoid or a cannabinoid derivative in an amount from about 20g/L of media to about 80g/L of media or greater than 80g/L of media. In some embodiments, the methods of the present disclosure provide for the production of cannabinoids or cannabinoid derivatives in an amount of from about 20g/L of media to about 90g/L of media or greater than 90g/L of media.
In some embodiments, the modified host cells disclosed herein are cultured in a liquid medium comprising a carboxylic acid, olivetolic acid, or an olivetolic acid derivative.
In some embodiments, methods of producing a cannabinoid or cannabinoid derivative (such as those disclosed herein) can involve culturing a modified yeast cell of the disclosure under conditions conducive to production of the cannabinoid or cannabinoid derivative; wherein the cannabinoid or cannabinoid derivative is produced by the modified yeast cell and is present in a medium (e.g., a liquid medium) in which the modified yeast cell is cultured. In some embodiments, the medium in which the modified yeast cell is cultured comprises a cannabinoid or cannabinoid derivative in an amount of 1ng/L to 1g/L (e.g., 1ng/L to 50ng/L, 50ng/L to 100ng/L, 100ng/L to 500ng/L, 500ng/L to 1 μg/L, 1 μg/L to 50 μg/L, 50 μg/L to 100 μg/L, 100 μg/L to 500 μg/L, 500 μg/L to 1mg/L, 1mg/L to 50mg/L, 50mg/L to 100mg/L, 100mg/L to 500mg/L, or 500mg/L to 1g/L). In certain such embodiments, the modified yeast cell is a modified Saccharomyces cerevisiae. In some embodiments, the medium in which the modified yeast cell is cultured comprises a cannabinoid or a cannabinoid derivative in an amount from 50mg/L to 100mg/L. In certain such embodiments, the modified yeast cell is a modified Saccharomyces cerevisiae. In some embodiments, the medium in which the modified yeast cell is cultured comprises the cannabinoid or cannabinoid derivative in an amount from 100mg/L to 500mg/L. In certain such embodiments, the modified yeast cell is a modified Saccharomyces cerevisiae. In some embodiments, the medium in which the modified yeast cell is cultured comprises the cannabinoid or cannabinoid derivative in an amount from 500mg/L to 1g/L. In certain such embodiments, the modified yeast cell is a modified Saccharomyces cerevisiae. In some embodiments, the medium in which the modified yeast cell is cultured comprises a cannabinoid or a cannabinoid derivative in an amount greater than 1g/L. In certain such embodiments, the modified yeast cell is a modified Saccharomyces cerevisiae.
In some embodiments, methods of producing a cannabinoid or cannabinoid derivative (such as those disclosed herein) can involve culturing a modified yeast cell of the disclosure under conditions conducive to fermentation of a sugar and under conditions conducive to production of the cannabinoid or cannabinoid derivative; wherein the cannabinoid or cannabinoid derivative is produced by the modified yeast cell and is present in an alcoholic beverage produced by the modified yeast cell. The present disclosure provides alcoholic beverages produced by modified yeast cells, wherein the alcoholic beverages comprise cannabinoids or cannabinoid derivatives produced by the modified yeast cells. The alcoholic beverages may include beer, wine, and distilled alcoholic beverages. In some embodiments, an alcoholic beverage of the disclosure comprises a cannabinoid or cannabinoid derivative in an amount of 1ng/L to 1g/L (e.g., 1ng/L to 50ng/L, 50ng/L to 100ng/L, 100ng/L to 500ng/L, 500ng/L to 1 μg/L, 1 μg/L to 50 μg/L, 50 μg/L to 100 μg/L, 100 μg/L to 500 μg/L, 500 μg/L to 1mg/L, 1mg/L to 50mg/L, 50mg/L to 100mg/L, 100mg/L to 500mg/L, or 500mg/L to 1g/L). In some embodiments, alcoholic beverages of the present disclosure comprise a cannabinoid or cannabinoid derivative in an amount greater than 1g/L.
The present disclosure provides beverages produced by modified yeast cells, wherein the beverages comprise cannabinoids or cannabinoid derivatives (such as those disclosed herein) produced by modified yeast cells. In some embodiments, a beverage of the disclosure comprises a cannabinoid or cannabinoid derivative in an amount of 1ng/L to 1g/L (e.g., 1ng/L to 50ng/L, 50ng/L to 100ng/L, 100ng/L to 500ng/L, 500ng/L to 1 μ g/L, 1 μ g/L to 50 μ g/L, 50 μ g/L to 100 μ g/L, 100 μ g/L to 500 μ g/L, 500 μ g/L to 1mg/L, 1mg/L to 50mg/L, 50mg/L to 100mg/L, 100mg/L to 500mg/L, or 500mg/L to 1 g/L). In some embodiments, the beverages of the present disclosure comprise a cannabinoid or cannabinoid derivative in an amount greater than 1 g/L. In some embodiments, the beverages of the present disclosure are non-alcoholic.
In some embodiments, the methods of the present disclosure provide for increased production of cannabinoids or cannabinoid derivatives (such as those disclosed herein). In certain such embodiments, culturing the modified host cell disclosed herein in a culture medium provides increased synthesis of a cannabinoid or a cannabinoid derivative as compared to an unmodified host cell cultured under similar conditions. Cannabinoid or cannabinoid derivative production by the modified host cells disclosed herein can be increased by about 5% to about 1,000,000-fold, as compared to an unmodified host cell cultured under similar conditions. Production of a cannabinoid or cannabinoid derivative by a modified host cell disclosed herein can be increased by about 10% to about 1,000,000 fold (e.g., about 50% to about 1,000,000 fold, about 1 to about 500,000 fold, about 1 to about 50,000 fold, about 1 to about 5,000 fold, about 1 to about 1,000 fold, about 1 to about 500 fold, about 1 to about 100 fold, about 1 to about 50 fold, about 5 to about 100,000 fold, about 5 to about 10,000 fold, about 5 to about 1,000 fold, about 5 to about 500 fold, about 5 to about 100 fold, about 10 to about 50,000 fold, about 50 to about 10,000 fold, about 100 to about 5,000 fold, about 200 to about 1,000 fold, about 50 to about 500 fold, or about 50 to about 200 fold) as compared to production of a cannabinoid or cannabinoid derivative by an unmodified host cell cultured under similar conditions.
Production of a cannabinoid or cannabinoid derivative by a modified host cell disclosed herein can also be increased by at least about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 1-fold, 2-fold, 5-fold, 10-fold, 20-fold, 50-fold, 100-fold, 200-fold, 500-fold, 1000-fold, 2000-fold, 5000-fold, 10,000-fold, 20,000-fold, 50,000-fold, 100,000-fold, 200,000-fold, 500,000-fold, or 1,000,000-fold as compared to production of the cannabinoid or cannabinoid derivative by an unmodified host cell cultured under similar conditions.
In some embodiments, production of a cannabinoid or cannabinoid derivative (such as those disclosed herein) by a modified host cell of the present disclosure can also be increased by any of at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% as compared to production of the cannabinoid or cannabinoid derivative by an unmodified host cell cultured under similar conditions. In some embodiments, production of a cannabinoid or cannabinoid derivative by a modified host cell disclosed herein can also be increased by at least about any of 1-20%, 2-20%, 5-20%, 10-20%, 15-20%, 1-15%, 1-10%, 2-15%, 2-10%, 5-15%, 10-15%, 1-50%, 10-50%, 20-50%, 30-50%, 40-50%, 50-100%, 50-60%, 50-70%, 50-80%, or 50-90% as compared to production of the cannabinoid or cannabinoid derivative by an unmodified host cell cultured under similar conditions.
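The percentage and fold increases recited above can be related to measured titers with a short calculation. The following is a minimal sketch, not a method taken from the disclosure: the function name `production_increase` and the example titers are hypothetical, and it assumes that "increased by N-fold" means an increase of N times the unmodified titer, so that a 1-fold increase equals a 100% increase.

```python
def production_increase(modified_titer: float, unmodified_titer: float) -> dict:
    """Return the increase of modified over unmodified production.

    Assumption: "increased by N-fold" is read as an increase of N times the
    unmodified titer, so a 1-fold increase equals a 100% increase.
    Both titers must be in the same units (e.g., mg/L).
    """
    if unmodified_titer <= 0:
        raise ValueError("unmodified titer must be positive to form a ratio")
    fraction = (modified_titer - unmodified_titer) / unmodified_titer
    return {"percent": fraction * 100.0, "fold": fraction}


# Hypothetical titers in mg/L, for illustration only.
result = production_increase(modified_titer=30.0, unmodified_titer=10.0)
print(result)
```

Under a different reading of "fold" (for example, the plain ratio of modified to unmodified titer), the returned `fold` value would be one unit higher.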
In some embodiments, the modified host cells of the present disclosure are assayed for production of cannabinoids or cannabinoid derivatives by LC-MS analysis. In certain such embodiments, each cannabinoid or cannabinoid derivative is identified by its retention time, determined from authentic standards, and by its Multiple Reaction Monitoring (MRM) transition.
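Identification by retention time and MRM transition amounts to matching each observed peak against a table of authentic standards. The sketch below illustrates that matching logic only. The `Standard` class, the `identify` function, the tolerances, and every retention time and m/z value are hypothetical placeholders, not values from the disclosure or from any instrument method.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Standard:
    name: str
    retention_time_min: float  # retention time of the authentic standard
    precursor_mz: float        # precursor ion of the monitored transition
    product_mz: float          # product ion of the monitored transition


def identify(peak_rt: float, precursor_mz: float, product_mz: float,
             standards: Sequence[Standard],
             rt_tol: float = 0.1, mz_tol: float = 0.5) -> Optional[str]:
    """Return the name of the first standard matching the peak, else None."""
    for s in standards:
        if (abs(peak_rt - s.retention_time_min) <= rt_tol
                and abs(precursor_mz - s.precursor_mz) <= mz_tol
                and abs(product_mz - s.product_mz) <= mz_tol):
            return s.name
    return None


# Placeholder standards: values are illustrative assumptions only.
standards = [
    Standard("THCA", retention_time_min=4.00, precursor_mz=357.2, product_mz=313.2),
    Standard("CBCA", retention_time_min=4.60, precursor_mz=357.2, product_mz=313.2),
]

print(identify(4.02, 357.2, 313.2, standards))
print(identify(9.00, 357.2, 313.2, standards))
```

In this illustration the two placeholder standards share a transition, so the retention time is what separates them; this is why both criteria are checked together.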
In some embodiments, the modified host cell of the present disclosure is a yeast cell. In certain such embodiments, the modified host cells disclosed herein are cultured in a bioreactor. In some embodiments, the modified host cell is cultured in a medium supplemented with unsubstituted or substituted hexanoic acid, a carboxylic acid other than unsubstituted or substituted hexanoic acid, olivetolic acid, or an olivetolic acid derivative. In some embodiments, the modified yeast cell is a modified Saccharomyces cerevisiae.
In some embodiments, a cannabinoid or cannabinoid derivative (such as those disclosed herein) is recovered from a cell lysate, for example, by lysing a modified host cell disclosed herein and recovering the cannabinoid or cannabinoid derivative from the lysate. In other cases, the cannabinoid or cannabinoid derivative is recovered from the culture medium in which the modified host cell disclosed herein is cultured. In other cases, the cannabinoid or cannabinoid derivative is recovered from the cell lysate and culture medium. In other cases, the cannabinoid or cannabinoid derivative is recovered from the modified host cell. In other cases, the cannabinoid or cannabinoid derivative is recovered from the modified host cell and the culture medium. In other cases, the cannabinoid or cannabinoid derivative is recovered from the cell lysate, the modified host cell, and the culture medium. In some embodiments, when the cannabinoid or cannabinoid derivative is recovered from the cell lysate; from the culture medium; from the modified host cell; from both the cell lysate and the culture medium; from both the modified host cell and the culture medium; from the cell lysate, the modified host cell, and the culture medium; or from a cell-free reaction mixture comprising one or more polypeptides disclosed herein, the recovered cannabinoid or cannabinoid derivative is in a salt form. In certain such embodiments, the salt is a pharmaceutically acceptable salt. In some embodiments, the recovered salt of the cannabinoid or cannabinoid derivative is then purified as disclosed herein.
In some embodiments, the recovered cannabinoid or cannabinoid derivative, such as those disclosed herein, is then purified. In some embodiments, a whole cell broth from a culture comprising the modified host cells of the present disclosure can be extracted with a suitable organic solvent to provide a cannabinoid or a cannabinoid derivative. Suitable organic solvents include, but are not limited to, hexane, heptane, ethyl acetate, petroleum ether and diethyl ether, chloroform and ethyl acetate. In some embodiments, a suitable organic solvent includes hexane. In some embodiments, a suitable organic solvent may be added to a whole cell broth from a fermentation comprising the modified host cells of the present disclosure at a ratio of 10:1 (10 parts whole cell broth to 1 part organic solvent) and stirred for 30 minutes. In certain such embodiments, the organic fraction may be separated and extracted twice with equal volumes of acidic water (pH 2.5). The organic layer can then be separated and dried in a concentrator (a reduced pressure rotary evaporator or thin film evaporator) to obtain crude cannabinoid or cannabinoid derivative crystals. In certain such embodiments, the crude crystals can be heated or exposed to light to decarboxylate the crude cannabinoid or cannabinoid derivative. In certain such embodiments, the crude crystals can be heated to 105 ℃ for 15 minutes followed by heating to 145 ℃ for 55 minutes to decarboxylate the crude cannabinoid or cannabinoid derivative. In certain such embodiments, the crude crystalline product may be redissolved and recrystallized in a suitable solvent (e.g., n-pentane) and filtered to remove all insoluble material. In certain such embodiments, the solvent may then be removed, for example by rotary evaporation, to yield a pure crystalline product.
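The 10:1 extraction ratio and the decarboxylation step lend themselves to a short worked calculation. The sketch below is illustrative only: the function names are hypothetical, the molar mass of THCA is taken as approximately 358.5 g/mol and that of carbon dioxide as 44.01 g/mol, and the decarboxylation is assumed to go to completion with loss of one CO2 per molecule, which a real process may not achieve.

```python
CO2_MOLAR_MASS = 44.01    # g/mol
THCA_MOLAR_MASS = 358.5   # g/mol, approximate value assumed for illustration


def solvent_volume_l(broth_volume_l: float,
                     broth_parts: float = 10.0,
                     solvent_parts: float = 1.0) -> float:
    """Volume of organic solvent for a broth:solvent ratio (default 10:1)."""
    return broth_volume_l * solvent_parts / broth_parts


def decarboxylated_mass_g(acid_mass_g: float, acid_molar_mass: float) -> float:
    """Theoretical mass left after complete loss of one CO2 per molecule."""
    return acid_mass_g * (acid_molar_mass - CO2_MOLAR_MASS) / acid_molar_mass


# Heating schedule described in the text: (temperature in deg C, minutes).
schedule = [(105, 15), (145, 55)]
total_minutes = sum(minutes for _, minutes in schedule)

print(solvent_volume_l(100.0))                       # litres of solvent for 100 L broth
print(decarboxylated_mass_g(1.0, THCA_MOLAR_MASS))   # grams from 1 g of the acid form
print(total_minutes)                                 # total heating time in minutes
```

The same helper applies to the cell-free reaction mixture, since the text recites the same 10:1 ratio for it.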
In some embodiments, the cannabinoid or cannabinoid derivative is pure, e.g., at least about 40% pure, at least about 50% pure, at least about 60% pure, at least about 70% pure, at least about 80% pure, at least about 90% pure, at least about 95% pure, at least about 98% pure, or greater than 98% pure, wherein "pure" in the context of a cannabinoid or cannabinoid derivative may refer to a cannabinoid or cannabinoid derivative that is free of other cannabinoids or cannabinoid derivatives, macromolecules, contaminants, and/or the like.
Methods of making engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides
In one aspect, the disclosure provides methods for making engineered variants of tetrahydrocannabinolic acid synthase (THCAS) polypeptides. In certain such embodiments, the method may comprise culturing the modified host cell of the present disclosure in a culture medium. In some embodiments, the modified host cell of the present disclosure is of the genus Pichia. The methods may comprise isolating and/or purifying the expressed engineered variants as described herein.
In some embodiments, the method of making the engineered variant comprises the step of isolating or purifying the engineered variant. Engineered variants of the present disclosure may be expressed in a modified host cell as described herein and isolated from the modified host cell and/or culture medium using any one or more of the well-known techniques for protein purification, including lysozyme treatment, sonication, filtration, salting out, ultracentrifugation, and chromatography, among others. Chromatographic techniques for isolating engineered variants of the present disclosure may include, inter alia, reverse phase chromatography, high performance liquid chromatography, ion exchange chromatography, gel electrophoresis, and affinity chromatography. In some embodiments, affinity chromatography is used.
In some embodiments, the engineered variants of the present disclosure expressed in the modified host cells of the present disclosure can be prepared and used in various forms, including but not limited to, crude extracts (e.g., cell-free lysates), powders (e.g., shake flask powders), lyophilizates, frozen stocks prepared with glycerol or another cryoprotectant, and substantially pure preparations (e.g., DSP powders).
In some embodiments, engineered variants of the present disclosure expressed in modified host cells of the present disclosure can be prepared and used in purified form. In general, the conditions used to purify a particular engineered variant will depend in part on factors such as net charge, hydrophobicity, hydrophilicity, molecular weight, molecular shape, and the like, and will be apparent to those skilled in the art.
Cell-free methods of producing cannabinoids or cannabinoid derivatives
The methods of the present disclosure may involve cell-free production of cannabinoids or cannabinoid derivatives (such as those disclosed herein) using engineered variants disclosed herein expressed or overexpressed by modified host cells of the present disclosure. In some embodiments, the engineered variants disclosed herein are used in a cell-free system for producing a cannabinoid or a cannabinoid derivative. In certain such embodiments, the engineered variants of the disclosure are isolated and/or purified. In some embodiments, appropriate starting materials for the production of cannabinoids or cannabinoid derivatives may be mixed with the engineered variants disclosed herein in a suitable reaction vessel to effect the reaction. The engineered variants disclosed herein may be used in combination to achieve complete synthesis of cannabinoids or cannabinoid derivatives from appropriate starting materials. In some embodiments, the cannabinoid or cannabinoid derivative is recovered from a cell-free reaction mixture comprising the engineered variant disclosed herein.
In some embodiments, the recovered cannabinoid or cannabinoid derivative, such as those disclosed herein, is then purified. In certain such embodiments, the cell-free reaction mixture comprising the engineered variants disclosed herein can be extracted with a suitable organic solvent to provide a cannabinoid or cannabinoid derivative. Suitable organic solvents include, but are not limited to, hexane, heptane, ethyl acetate, petroleum ether and diethyl ether, chloroform and ethyl acetate. In some embodiments, a suitable organic solvent includes hexane. In some embodiments, a suitable organic solvent may be added to a cell-free reaction mixture comprising one or more polypeptides disclosed herein at a ratio of 10:1 (10 parts reaction mixture to 1 part organic solvent) and stirred for 30 minutes. In certain such embodiments, the organic fraction may be separated and extracted twice with equal volumes of acidic water (pH 2.5). The organic layer can then be separated and dried in a concentrator (a reduced pressure rotary evaporator or thin film evaporator) to obtain crude cannabinoid or cannabinoid derivative crystals. In certain such embodiments, the crude crystals can be heated or exposed to light to decarboxylate the crude cannabinoid or cannabinoid derivative. In certain such embodiments, the crude crystals can be heated to 105 ℃ for 15 minutes followed by heating to 145 ℃ for 55 minutes to decarboxylate the crude cannabinoid or cannabinoid derivative. In certain such embodiments, the crude crystalline product may be redissolved and recrystallized in a suitable solvent (e.g., n-pentane) and filtered to remove all insoluble material. In certain such embodiments, the solvent may then be removed, for example by rotary evaporation, to yield a pure crystalline product.
In some embodiments, when a cannabinoid or cannabinoid derivative is recovered from a cell-free reaction mixture comprising one or more engineered variants disclosed herein, the recovered cannabinoid or cannabinoid derivative is in a salt form. In certain such embodiments, the salt is a pharmaceutically acceptable salt. In some embodiments, the recovered cannabinoid or salt of a cannabinoid derivative is then purified as disclosed herein.
In some embodiments, cell-free production of cannabinoids or cannabinoid derivatives by the engineered variants disclosed herein is determined by LC-MS analysis. In certain such embodiments, each cannabinoid or cannabinoid derivative is identified by its retention time, determined from authentic standards, and by its Multiple Reaction Monitoring (MRM) transition.
Examples of non-limiting embodiments of the present disclosure
Embodiments of the inventive subject matter disclosed herein can be beneficial alone or in combination with one or more other embodiments. Without limiting the foregoing description, certain non-limiting embodiments of the present disclosure are provided below, numbered I-1 through I-121. As will be apparent to those of skill in the art upon reading this disclosure, each of the individually numbered embodiments may be used or combined with any of the preceding or following individually numbered embodiments. This is intended to provide support for all such combinations of embodiments and is not limited to the combinations of embodiments explicitly provided below.
Some embodiments of the present disclosure are of Embodiment I:
Embodiment I-1. an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide, said engineered variant comprising the amino acid sequence of SEQ ID NO:44 with one or more amino acid substitutions.
Embodiment I-2. the engineered variant of embodiment I-1, wherein the engineered variant comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID No. 44.
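Percent sequence identity thresholds such as those recited above are evaluated on aligned sequences. The following is a minimal sketch under stated assumptions: the two sequences are already aligned to equal length with `-` marking gaps, identity is defined as identical residues divided by alignment columns (other definitions exist and give different values), and the toy sequences are hypothetical and are not taken from SEQ ID NO:44.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = identical residues / alignment columns,
    ignoring columns where both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy 10-residue sequences differing at one position (illustration only).
identity = percent_identity("MKCSTFSFWF", "MKCSTFSFWA")
print(identity)
print(identity >= 85.0)  # check against the lowest recited threshold
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch covers only the counting step.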
Embodiment I-3. the engineered variant of embodiment I-1 or I-2, wherein the engineered variant comprises at least one amino acid substitution in a signal polypeptide, a Flavin Adenine Dinucleotide (FAD) -binding domain, a Berberine Bridge Enzyme (BBE) domain, or a combination of the foregoing.
Embodiment I-4. the engineered variant of embodiment I-3, wherein said engineered variant comprises at least one amino acid substitution in said signal polypeptide.
Embodiment I-5. the engineered variant of embodiment I-3 or I-4, wherein said engineered variant comprises at least one amino acid substitution in said FAD binding domain.
Embodiment I-6. the engineered variant of any one of embodiments I-3 to I-5, wherein the engineered variant comprises at least one amino acid substitution in the BBE domain.
Embodiment I-7. the engineered variant of any one of embodiments I-3 to I-6, wherein said engineered variant comprises at least one surface exposed amino acid substitution.
Embodiment I-8. the engineered variant of embodiment I-1 or I-2, wherein said engineered variant comprises at least one amino acid substitution at an amino acid selected from the group consisting of: R31, P43, P49, K50, L51, Q55, H56, L59, M61, S62, L71, S100, V103, T109, Q124, V125, L132, S137, H143, V149, W161, K165, N168, E167, S170, F171, P172, Y175, G180, N196, H208, G235, A250, I257, K261, L269, G311, F317, L327, K390, T379, S429, N467, Y500, N528, P539, P542, H543, and H545.
Embodiment I-9. the engineered variant of embodiment I-8, wherein said engineered variant comprises at least one amino acid substitution selected from the group consisting of: R31Q, P43E, P49E, P49K, P49Q, K50T, L51I, Q55E, Q55P, H56E, L59E, M61W, M61H, M61S, S62Q, L71A, S100A, V103F, T109V, Q124D, Q124E, Q124N, V125E, V59125 125Q, L132M, S137G, H143D, V149I, W161R, W161Y, W161K, K165A, N168S, E167P, S170P, F P, P172P, Y175P, G180, N P, N36208, H36208, G235, a 171, S261, P72, P P, N P, P P, N P, P3676, P P, P3676, P P, P3676, P P, P3676, P P, P3676, P P, P3676, P P, P3676, P P, P3676, P P, P3676, P P, P.
Embodiment I-10. the engineered variant of embodiment I-1 or I-2, wherein said engineered variant comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82, SEQ ID NO:84, SEQ ID NO:86, SEQ ID NO:88, SEQ ID NO:90, SEQ ID NO:92, SEQ ID NO:94, SEQ ID NO:96, SEQ ID NO:98, SEQ ID NO:100, SEQ ID NO:102, SEQ ID NO:104, SEQ ID NO:106, SEQ ID NO:108, SEQ ID NO:110, SEQ ID NO:112, SEQ ID NO:114, SEQ ID NO:116, SEQ ID NO:118, SEQ ID NO:120, SEQ ID NO:122, SEQ ID NO:124, SEQ ID NO:126, SEQ ID NO:128, SEQ ID NO:130, SEQ ID NO:132, SEQ ID NO:134, SEQ ID NO:136, SEQ ID NO:138, SEQ ID NO:140, SEQ ID NO:142, SEQ ID NO:144, SEQ ID NO:146, SEQ ID NO:148, SEQ ID NO:150, SEQ ID NO:152, SEQ ID NO:154, SEQ ID NO:156, SEQ ID NO:158, SEQ ID NO:160, SEQ ID NO:162, SEQ ID NO:164, SEQ ID NO:166, SEQ ID NO:168, SEQ ID NO:170, SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:178, SEQ ID NO:180, SEQ ID NO:182, SEQ ID NO:184, and SEQ ID NO:186.
Embodiment I-11 the engineered variant of any one of embodiments I-1 to I-9, wherein the engineered variant comprises an amino acid sequence of SEQ ID No. 44 having at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 amino acid substitutions.
Embodiment I-12 the engineered variant of any one of embodiments I-1 to I-9, wherein the engineered variant comprises the amino acid sequence of SEQ ID No. 44 having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acid substitutions.
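The substitution notation used in these embodiments (for example, R31Q: wild-type residue, 1-based position, replacement residue) and the substitution counts can be handled programmatically. The sketch below is an illustration under stated assumptions: `apply_substitutions` and `count_substitutions` are hypothetical helper names, and the five-residue parent sequence is a toy example, not SEQ ID NO:44.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(parent: str, substitutions: list[str]) -> str:
    """Apply substitutions such as 'R31Q' to a parent amino acid sequence."""
    residues = list(parent)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3))
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range in {sub!r}")
        if residues[position - 1] != wild_type:
            raise ValueError(f"parent residue mismatch for {sub!r}")
        residues[position - 1] = replacement
    return "".join(residues)


def count_substitutions(parent: str, variant: str) -> int:
    """Count positions that differ between equal-length sequences."""
    if len(parent) != len(variant):
        raise ValueError("sequences must have equal length")
    return sum(1 for a, b in zip(parent, variant) if a != b)


# Toy parent sequence (illustration only): position 3 is R, position 5 is K.
toy_parent = "MARQK"
toy_variant = apply_substitutions(toy_parent, ["R3Q", "K5A"])
print(toy_variant)
print(count_substitutions(toy_parent, toy_variant))
```

Checking the wild-type residue before substituting guards against numbering that is offset from the parent sequence, for example when a signal polypeptide has been truncated.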
Embodiment I-13. the engineered variant of any one of embodiments I-1 to I-12, wherein the engineered variant comprises at least one invariant amino acid in a Flavin Adenine Dinucleotide (FAD) -binding domain, a Berberine Bridge Enzyme (BBE) domain, or a combination of the foregoing.
Embodiment I-14. the engineered variant of embodiment I-13, wherein the engineered variant comprises at least one invariant amino acid in the FAD binding domain.
Embodiment I-15 the engineered variant of embodiment I-14, wherein said engineered variant comprises at least 1 invariant amino acid, at least 2 invariant amino acids, at least 3 invariant amino acids, at least 4 invariant amino acids, at least 5 invariant amino acids, at least 6 invariant amino acids, at least 7 invariant amino acids, at least 8 invariant amino acids, at least 9 invariant amino acids, at least 10 invariant amino acids, at least 11 invariant amino acids, at least 12 invariant amino acids, at least 13 invariant amino acids, at least 14 invariant amino acids, or at least 15 invariant amino acids in said FAD binding domain.
Embodiment I-16 the engineered variant of any one of embodiments I-13 to I-15, wherein the engineered variant comprises at least one invariant amino acid in the BBE domain.
Embodiment I-17. the engineered variant of embodiment I-16, wherein the engineered variant comprises at least 1 invariant amino acid, at least 2 invariant amino acids, at least 3 invariant amino acids, at least 4 invariant amino acids, at least 5 invariant amino acids, at least 6 invariant amino acids, at least 7 invariant amino acids, at least 8 invariant amino acids, at least 9 invariant amino acids, at least 10 invariant amino acids, at least 11 invariant amino acids, at least 12 invariant amino acids, at least 13 invariant amino acids, at least 14 invariant amino acids, or at least 15 invariant amino acids in the BBE domain.
Embodiment I-18. the engineered variant of any one of embodiments I-1 to I-17, wherein the engineered variant comprises at least one invariant amino acid selected from the group consisting of: a28, F34, L35, C37, L64, N70, P87, I93, C99, R108, R110, G112, E117, G118, S120, P126, F127, D131, D141, W148, G152, a153, L155, G156, E157, Y159, Y160, N163, a173, G174, C176, P177, T178, V179, G182, G183, H184, F185, G187, G188, G189, Y190, G191, P192, L193, R195, a201, D202, I205, D206, V210, G214, G223, D225, L226, F227, W228, R231, G234, S237, F238, G239, G245, I246, L245, V251, V385, V312, V259, Q313, F312, S341, S354, N185, N187, N185, G188, G189, G468, P192, L193, L202, L205, L195, L202, L205, V206, V210, G185, F2, F185, G185, F2, F185, F2, G234, F185, F2, F185, F2, G185, F2, F185, G185, F2, F185, F2, F185, F234, F185, F2, F185, G468, F185, F2, G468, F185, F2, G185, G468, F2, F185, F2, G468, F2, G185, F2, F185, G185, F2, G468, G185, F2, F185, G185, F2, G468, F2, G185, G468, F2, F123, G414, G468, F123, F234, F123, G414, F123, F.
Embodiment I-19 the engineered variant of embodiment I-18, wherein said engineered variant comprises at least one invariant amino acid selected from the group consisting of: c37, N70, I93, C99, E117, S120, F127, D131, G156, E157, Y159, G174, C176, G182, G183, F185, G187, G188, G189, Y190, G191, P192, R195, D202, D206, G214, W228, G234, F238, L248, Q277, S314, L324, S355, K382, K384, D386, G420, M423, R436, Y441, W444, Y445, Y472, P477, N514, F515, N529, and Q535.
Embodiment I-20 the engineered variant of any one of embodiments I-1 to I-19, wherein the engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, or at least 25 invariant amino acids.
Embodiment I-21 the engineered variant of any one of embodiments I-1 to I-20, wherein the engineered variant produces an amount of tetrahydrocannabinolic acid (THCA) from cannabigerolic acid (CBGA) that is greater than the amount of THCA produced from CBGA by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, in mg/L or mM, within the same length of time under similar conditions.
Embodiment I-22 the engineered variant of any one of embodiments I-1 to I-21, wherein the engineered variant produces an amount of tetrahydrocannabinolic acid (THCA) from cannabigerolic acid (CBGA) that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of THCA produced from CBGA by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, in mg/L or mM, over the same length of time under similar conditions.
Embodiment I-23. the engineered variant of any one of embodiments I-1 to I-22, wherein the engineered variant produces THCA from cannabigerolic acid (CBGA) at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to the ratio of THCA to the other cannabinoid produced by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions.
Embodiment I-24. the engineered variant of any one of embodiments I-1 to I-23, wherein the engineered variant produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
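A product ratio of this kind can be computed directly from two measured titers. The following is a minimal sketch: the function names and the example titers are hypothetical, and both titers are assumed to be measured in the same units over the same length of time.

```python
def thca_to_cbca_ratio(thca_titer: float, cbca_titer: float) -> float:
    """Ratio of THCA to CBCA, both titers in the same units (e.g., mg/L)."""
    if cbca_titer <= 0:
        raise ValueError("CBCA titer must be positive to form a ratio")
    return thca_titer / cbca_titer


def format_ratio(ratio: float) -> str:
    """Render a ratio in the 'N:1' form used in the text."""
    return f"{ratio:g}:1"


# Hypothetical titers in mg/L, for illustration only.
ratio = thca_to_cbca_ratio(thca_titer=33.0, cbca_titer=3.0)
print(format_ratio(ratio))
```

A titer of zero for the second cannabinoid is rejected rather than reported as an infinite ratio, since a non-detect is usually better reported against the assay's detection limit.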
Embodiment I-25. the engineered variant of any one of embodiments I-1 to I-24, wherein the engineered variant comprises a truncation at the N-terminus, the C-terminus, or at both the N-terminus and the C-terminus.
Embodiment I-26. the engineered variant of embodiment I-25, wherein said truncated engineered variant comprises a signal polypeptide or a membrane anchor.
Embodiment I-27. the engineered variant of embodiment I-25 or I-26, wherein said engineered variant lacks a native signal polypeptide.
Embodiment I-28. the engineered variant of any one of embodiments I-25 to I-27, wherein said engineered variant comprises a truncation of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 amino acids at the C-terminus.
Embodiment I-29. the engineered variant of any one of embodiments I-25 to I-27, wherein said engineered variant comprises a truncation of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids at the C-terminus.
Embodiment I-30. a nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of embodiments I-1 to I-29.
Embodiment I-31. a nucleic acid comprising a nucleotide sequence encoding an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide, the engineered variant comprising an amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions, wherein the nucleotide sequence is selected from the group consisting of: SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71, SEQ ID NO:73, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:79, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:87, SEQ ID NO:91, SEQ ID NO:93, SEQ ID NO:95, SEQ ID NO:97, SEQ ID NO:99, SEQ ID NO:101, SEQ ID NO:103, SEQ ID NO:105, SEQ ID NO:107, SEQ ID NO:109, SEQ ID NO:111, SEQ ID NO:113, SEQ ID NO:115, SEQ ID NO:117, SEQ ID NO:119, SEQ ID NO:121, SEQ ID NO:123, SEQ ID NO:125, SEQ ID NO:127, SEQ ID NO:129, SEQ ID NO:131, SEQ ID NO:133, SEQ ID NO:135, SEQ ID NO:137, SEQ ID NO:139, SEQ ID NO:141, SEQ ID NO:143, SEQ ID NO:145, SEQ ID NO:147, SEQ ID NO:149, SEQ ID NO:151, SEQ ID NO:153, SEQ ID NO:155, SEQ ID NO:157, SEQ ID NO:163, SEQ ID NO:171, SEQ ID NO:173, SEQ ID NO:175, SEQ ID NO:177, SEQ ID NO:179, SEQ ID NO:181, SEQ ID NO:183, and SEQ ID NO:185.
Embodiment I-32. the nucleic acid of embodiment I-30 or I-31, wherein the nucleotide sequence is codon optimized.
Embodiment I-33 a method of preparing a modified host cell for the production of a cannabinoid or cannabinoid derivative comprising introducing into a host cell one or more nucleic acids of any of embodiments I-30 to I-32.
Embodiment I-34. a vector comprising one or more nucleic acids according to any one of embodiments I-30 to I-32.
Embodiment I-35. a method of preparing a modified host cell for the production of a cannabinoid or a cannabinoid derivative comprising introducing one or more vectors as described in embodiment I-34 into a host cell.
Embodiment I-36 a modified host cell for the production of a cannabinoid or a cannabinoid derivative, wherein the modified host cell comprises one or more nucleic acids as described in any of embodiments I-30 to I-32.
Embodiment I-37. the modified host cell of embodiment I-36, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a geranyl pyrophosphate:olivetolic acid geranyltransferase (GOT) polypeptide.
Embodiment I-38. the modified host cell of embodiment I-37, wherein said GOT polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID No. 17.
Embodiment I-39. the modified host cell of embodiment I-37 or I-38, wherein said modified host cell comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding said GOT polypeptide.
Embodiment I-40. the modified host cell of embodiment I-36, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide.
Embodiment I-41. the modified host cell of embodiment I-40, wherein the NphB polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO: 188.
Embodiment I-42. the modified host cell of any one of embodiments I-36 to I-41, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tetraketide synthase (TKS) polypeptide and one or more heterologous nucleic acids comprising a nucleotide sequence encoding an olivetolic acid cyclase (OAC) polypeptide.
Embodiment I-43. the modified host cell of embodiment I-42, wherein the TKS polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:19.
Embodiment I-44. the modified host cell of embodiment I-42 or I-43, wherein the modified host cell comprises three or more heterologous nucleic acids comprising nucleotide sequences encoding TKS polypeptides.
Embodiment I-45 the modified host cell of any one of embodiments I-42 to I-44, wherein the OAC polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:21 or SEQ ID NO: 48.
Embodiment I-46. the modified host cell of any one of embodiments I-42 to I-45, wherein the modified host cell comprises three or more heterologous nucleic acids comprising nucleotide sequences encoding OAC polypeptides.
Embodiment I-47 the modified host cell of any one of embodiments I-36 to I-46, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an Acyl Activating Enzyme (AAE) polypeptide.
Embodiment I-48 the modified host cell of embodiment I-47, wherein the AAE polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO. 23.
Embodiment I-49. the modified host cell of embodiment I-47 or I-48, wherein the modified host cell comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding an AAE polypeptide.
Embodiment I-50. the modified host cell of any one of embodiments I-36 to I-49, wherein the modified host cell comprises one or more of: a) one or more heterologous nucleic acids comprising a nucleotide sequence encoding an HMG-CoA synthase (HMGS) polypeptide; b) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a truncated 3-hydroxy-3-methyl-glutaryl-CoA reductase (tHMGR) polypeptide; c) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a mevalonate kinase (MK) polypeptide; d) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a phosphomevalonate kinase (PMK) polypeptide; e) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a mevalonate pyrophosphate decarboxylase (MVD1) polypeptide; or f) one or more heterologous nucleic acids comprising a nucleotide sequence encoding an isopentenyl diphosphate isomerase (IDI1) polypeptide.
Embodiment I-51. the modified host cell of embodiment I-50, wherein said IDI1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO. 25.
Embodiment I-52. the modified host cell of embodiment I-50 or I-51, wherein the tHMGR polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:27.
Embodiment I-53 the modified host cell of any one of embodiments I-50 to I-52, wherein the HMGS polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO: 29.
Embodiment I-54. the modified host cell of any one of embodiments I-50 to I-53, wherein the MK polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:39.
Embodiment I-55. the modified host cell of any one of embodiments I-50 to I-54, wherein the PMK polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO: 37.
Embodiment I-56. the modified host cell of any one of embodiments I-50 to I-55, wherein the MVD1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 33.
Embodiment I-57. the modified host cell of any one of embodiments I-36 to I-56, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide.
Embodiment I-58. the modified host cell of embodiment I-57, wherein the acetoacetyl-CoA thiolase polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:31.
Embodiment I-59 the modified host cell of any one of embodiments I-36 to I-58, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a Pyruvate Decarboxylase (PDC) polypeptide.
Embodiment I-60. the modified host cell of embodiment I-59, wherein the PDC polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:35.
Embodiment I-61. the modified host cell of any one of embodiments I-36 to I-60, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a geranyl pyrophosphate synthase (GPPS) polypeptide.
Embodiment I-62. the modified host cell of embodiment I-61, wherein the GPPS polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:41.
Embodiment I-63. the modified host cell of any one of embodiments I-36 to I-62, wherein said modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
Embodiment I-64. the modified host cell of embodiment I-63, wherein the KAR2 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:5.
Embodiment I-65. the modified host cell of embodiment I-63 or I-64, wherein the modified host cell comprises two or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
Embodiment I-66. the modified host cell of any one of embodiments I-36 to I-65, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide.
Embodiment I-67. the modified host cell of embodiment I-66, wherein the PDI1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 9.
Embodiment I-68. the modified host cell of any one of embodiments I-36 to I-67, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IRE1 polypeptide.
Embodiment I-69 the modified host cell of embodiment I-68, wherein the IRE1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:11 or SEQ ID NO: 190.
Embodiment I-70 the modified host cell of any one of embodiments I-36 to I-69, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide.
Embodiment I-71. the modified host cell of embodiment I-70, wherein the ERO1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO. 7.
Embodiment I-72 the modified host cell of any one of embodiments I-36 to I-71, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a FAD1 polypeptide.
Embodiment I-73. the modified host cell of embodiment I-72, wherein the FAD1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:192.
Embodiment I-74. the modified host cell of any one of embodiments I-36 to I-73, wherein the modified host cell comprises a deletion or down-regulation of one or more genes encoding a PEP4 polypeptide.
Embodiment I-75. the modified host cell of embodiment I-74, wherein the PEP4 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:15.
Embodiment I-76 the modified host cell of any one of embodiments I-36 to I-75, wherein the modified host cell comprises a deletion or down-regulation of one or more genes encoding a ROT2 polypeptide.
Embodiment I-77 the modified host cell of embodiment I-76, wherein the ROT2 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO. 13.
Embodiment I-78 the modified host cell of any one of embodiments I-36 to I-77, wherein the modified host cell is a eukaryotic cell.
Embodiment I-79 the modified host cell of embodiment I-78, wherein the eukaryotic cell is a yeast cell.
Embodiment I-80. the modified host cell of embodiment I-79, wherein the yeast cell is Saccharomyces cerevisiae.
Embodiment I-81. the modified host cell of embodiment I-80, wherein said Saccharomyces cerevisiae is a protease deficient strain of Saccharomyces cerevisiae.
Embodiment I-82. the modified host cell of any one of embodiments I-36 to I-81, wherein at least one of the one or more nucleic acids is integrated into the chromosome of the modified host cell.
Embodiment I-83. the modified host cell of any one of embodiments I-36 to I-81, wherein at least one of the one or more nucleic acids is maintained extrachromosomally.
Embodiment I-84. the modified host cell of any one of embodiments I-36 to I-83, wherein at least one of the one or more nucleic acids is operably linked to an inducible promoter.
Embodiment I-85. the modified host cell of any one of embodiments I-36 to I-83, wherein at least one of the one or more nucleic acids is operably linked to a constitutive promoter.
Embodiment I-86. the modified host cell of any one of embodiments I-36 to I-85, wherein the amount of cannabinoid or cannabinoid derivative produced by the modified host cell is greater, in mg/L or mM, than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown for the same length of time under similar culture conditions, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 is devoid of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of any one of embodiments I-1 to I-29.
Embodiment I-87. the modified host cell of any one of embodiments I-36 to I-86, wherein the amount of cannabinoid or cannabinoid derivative produced by the modified host cell is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown for the same length of time under similar culture conditions, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 is devoid of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of any one of embodiments I-1 to I-29.
Embodiment I-88. the modified host cell of any one of embodiments I-36 to I-87, wherein the modified host cell has a faster growth rate and/or a higher biomass yield than a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 is devoid of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of any one of embodiments I-1 to I-29.
Embodiment I-89 the modified host cell of any one of embodiments I-36 to I-88, wherein the modified host cell has a growth rate and/or biomass yield that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% faster than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID No. 44, grown for the same length of time under similar culture conditions, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID No. 44 is devoid of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of any of embodiments I-1 to I-29.
Embodiment I-90. the modified host cell of any one of embodiments I-36 to I-89, wherein the modified host cell produces THCA from cannabigerolic acid (CBGA) at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to the ratio of THCA to the other cannabinoid produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown under similar culture conditions for the same length of time, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 is devoid of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of any one of embodiments I-1 to I-29.
Embodiment I-91. the modified host cell of any one of embodiments I-36 to I-90, wherein the modified host cell produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., a ratio of THCA to CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
Embodiment I-92. a method of producing a cannabinoid or a cannabinoid derivative, comprising: a) culturing the modified host cell of any one of embodiments I-36 to I-91 in a culture medium.
Embodiment I-93 the method of embodiment I-92, wherein said method comprises: b) recovering the produced cannabinoid or cannabinoid derivative.
Embodiment I-94 the method of embodiment I-92 or I-93, wherein said medium comprises a carboxylic acid.
Embodiment I-95. the method of embodiment I-94, wherein the carboxylic acid is an unsubstituted or substituted C3-C18 carboxylic acid.
Embodiment I-96. the method of embodiment I-95, wherein said unsubstituted or substituted C3-C18 carboxylic acid is unsubstituted or substituted hexanoic acid.
Embodiment I-97. the method of embodiment I-92 or I-93, wherein the medium comprises olivetolic acid or an olivetolic acid derivative.
Embodiment I-98 the method of embodiment I-92 or I-93, wherein the cannabinoid is tetrahydrocannabinolic acid, or tetrahydrocannabinol.
Embodiment I-99. the method of any one of embodiments I-92 to I-98, wherein the medium comprises a fermentable sugar.
Embodiment I-100 the method of any one of embodiments I-92 to I-98, wherein the culture medium comprises a pretreated cellulosic feedstock.
Embodiment I-101. the method of any one of embodiments I-92 to I-98, wherein the medium comprises a non-fermentable carbon source.
Embodiment I-102. the method of embodiment I-101, wherein the non-fermentable carbon source comprises ethanol.
Embodiment I-103 the method of any one of embodiments I-92 to I-102, wherein the cannabinoid or the cannabinoid derivative is produced in an amount greater than 100mg/L of medium.
Embodiment I-104. the method of any one of embodiments I-92 to I-103, wherein the amount of the cannabinoid or the cannabinoid derivative produced is greater, in mg/L or mM, than the amount of the cannabinoid or the cannabinoid derivative produced in an alternative method comprising culturing a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, instead of the modified host cell of any one of embodiments I-36 to I-91, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 is devoid of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of any one of embodiments I-1 to I-29, and wherein the modified host cell of any one of embodiments I-36 to I-91 and the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant of any one of embodiments I-1 to I-29, are cultured under similar culture conditions for the same length of time.
Embodiment I-105. the method of any one of embodiments I-92 to I-104, wherein the amount of the cannabinoid or the cannabinoid derivative produced is, in mg/L or mM, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of the cannabinoid or the cannabinoid derivative produced in an alternative method comprising culturing a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, instead of the modified host cell of any one of embodiments I-36 to I-91, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 is devoid of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of any one of embodiments I-1 to I-29, and wherein the modified host cell of any one of embodiments I-36 to I-91 and the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding an engineered variant of any one of embodiments I-1 to I-29, are cultured under similar culture conditions for the same length of time.
Embodiment I-106. the method of any one of embodiments I-92 to I-105, wherein the cannabinoid is tetrahydrocannabinolic acid (THCA), and wherein the method produces THCA at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to the ratio of THCA to the other cannabinoid produced in an alternative method comprising culturing a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, instead of the modified host cell of any one of embodiments I-36 to I-91, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 is devoid of a nucleic acid comprising a nucleotide sequence encoding an engineered variant of any one of embodiments I-1 to I-29 and is grown for the same length of time under similar culture conditions.
Embodiment I-107. the method of any one of embodiments I-92 to I-106, wherein the method produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., a ratio of THCA to CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
Embodiment I-108 a method of producing a cannabinoid or a cannabinoid derivative comprising using an engineered variant as described in any one of embodiments I-1 to I-29.
Embodiment I-109. the method of embodiment I-108, wherein the method comprises recovering the produced cannabinoid or cannabinoid derivative.
Embodiment I-110 the method of embodiment I-108 or I-109, wherein the cannabinoid is tetrahydrocannabinolic acid, or tetrahydrocannabinol.
Embodiment I-111 the method of any one of embodiments I-108 to I-110, wherein the amount of cannabinoid or the cannabinoid derivative produced is greater, in mg/L or mM, than the amount of cannabinoid or the cannabinoid derivative produced in an alternative method comprising using a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 instead of the engineered variant of any one of embodiments I-1 to I-29, wherein the engineered variant of any one of embodiments I-1 to I-29 and the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 are used for the same length of time under similar conditions.
Embodiment I-112. the method of any one of embodiments I-108 to I-111, wherein the amount of cannabinoid or the cannabinoid derivative produced is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater, in mg/L or mM, than the amount of cannabinoid or the cannabinoid derivative produced in an alternative method comprising using a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID No. 44, instead of the engineered variant of any one of embodiments I-1 to I-29, wherein the engineered variant of any one of embodiments I-1 to I-29 and the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 are used under similar conditions for the same length of time.
Embodiment I-113. the method of any one of embodiments I-108 to I-112, wherein the cannabinoid is tetrahydrocannabinolic acid (THCA), and wherein the method produces THCA at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to the ratio of THCA to the other cannabinoid produced in an alternative method comprising using a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 instead of the engineered variant of any one of embodiments I-1 to I-29, wherein the engineered variant of any one of embodiments I-1 to I-29 and the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 are used under similar conditions for the same length of time.
Embodiment I-114. the method of any one of embodiments I-108 to I-113, wherein the method produces THCA from CBGA at a ratio of THCA to another cannabinoid (e.g., a ratio of THCA to CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
Embodiment I-115. a method of screening for an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide, the engineered variant comprising an amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions, the method comprising: a) dividing a population of host cells into a control population and a test population; b) co-expressing in the control population a THCAS polypeptide having the amino acid sequence of SEQ ID NO:44 and a comparative cannabinoid synthase polypeptide, wherein the THCAS polypeptide having the amino acid sequence of SEQ ID NO:44 can convert cannabigerolic acid (CBGA) to a first cannabinoid, tetrahydrocannabinolic acid (THCA), and the comparative cannabinoid synthase polypeptide can convert the same CBGA to a different second cannabinoid; c) co-expressing the engineered variant and the comparative cannabinoid synthase polypeptide in the test population, wherein the engineered variant can convert CBGA to the same first cannabinoid, THCA, as the THCAS polypeptide having the amino acid sequence of SEQ ID NO:44, and wherein the comparative cannabinoid synthase polypeptide can convert the same CBGA to the second cannabinoid and is expressed at similar levels in the test population and the control population; d) measuring the ratio of the first cannabinoid, THCA, to the second cannabinoid produced by each of the test population and the control population; and e) measuring the amount, in mg/L or mM, of the first cannabinoid produced by each of the test population and the control population.
Embodiment I-116 the method of embodiment I-115, wherein the test population is identified as comprising an engineered variant having improved in vivo performance as compared to a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, wherein improved in vivo performance is evidenced by an increased ratio of the first cannabinoid to the second cannabinoid produced by the test population as compared to the ratio of the first cannabinoid to the second cannabinoid produced by the control population over the same length of time under similar culture conditions.
Embodiment I-117. the method of embodiment I-115 or I-116, wherein the test population is identified as comprising an engineered variant having improved in vivo performance as compared to a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 by producing a greater amount of the first cannabinoid from the test population as compared to the amount produced from the control population in mg/L or mM over the same length of time under similar culture conditions.
Embodiment I-118. the method of any one of embodiments I-115 to I-117, wherein the comparative cannabinoid synthase polypeptide is a cannabidiolic acid synthase polypeptide.
Embodiment I-119. the method of embodiment I-118, wherein the cannabidiolic acid synthase polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO:3.
Embodiment I-120 the method of any one of embodiments I-115 through I-119, wherein the second cannabinoid is cannabidiolic acid (CBDA).
Embodiment I-121. the method of any one of embodiments I-115 to I-119, wherein the engineered variant is the engineered variant of any one of embodiments I-1 to I-29.
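The comparison in embodiments I-115 to I-117 can be sketched in a few lines of Python. This is an illustrative sketch only: the class and function names are hypothetical, the titers in the usage lines are made-up numbers rather than data from this disclosure, and the code assumes both populations were cultured for the same length of time under similar conditions.

```python
from dataclasses import dataclass


@dataclass
class PopulationResult:
    """Titers measured for one population, both in the same units (mg/L or mM)."""
    first_cannabinoid: float   # product of the THCAS enzyme, e.g. THCA
    second_cannabinoid: float  # product of the comparative synthase, e.g. CBDA

    @property
    def ratio(self) -> float:
        # Step d): ratio of the first cannabinoid to the second cannabinoid.
        return self.first_cannabinoid / self.second_cannabinoid


def compare(test: PopulationResult, control: PopulationResult) -> dict:
    """Apply the two criteria of embodiments I-116 and I-117 (hypothetical helper)."""
    return {
        "ratio_increased": test.ratio > control.ratio,
        "titer_increased": test.first_cannabinoid > control.first_cannabinoid,
        "percent_titer_change": 100.0
        * (test.first_cannabinoid / control.first_cannabinoid - 1.0),
    }


# Hypothetical example values, not measurements from this disclosure.
control = PopulationResult(first_cannabinoid=100.0, second_cannabinoid=50.0)
test = PopulationResult(first_cannabinoid=150.0, second_cannabinoid=50.0)
result = compare(test, control)
```

Because the comparative synthase is expressed at similar levels in both populations, the second cannabinoid acts as an internal reference, so a higher ratio points to the variant itself rather than to differences in CBGA supply.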
The amino acid sequences and nucleotide sequences disclosed herein are provided in Table 1. Where a genus and/or species is indicated, the sequence should not be construed as being limited to the specified genus and/or species, but also includes other genera and/or species expressing the sequence. Orthologs of the sequences disclosed in Table 1 may also be encompassed by the present disclosure. The nucleotide sequences identified in Table 1 as codon-optimized are codon-optimized for expression in Saccharomyces cerevisiae. In Table 1, "+" used as the end of the sequence indicates a stop codon. With respect to OAC, "#" indicates the presence of a mutation in the sequence.
Table 1: amino acid sequences and nucleotide sequences of the disclosure
[Table 1 appears in the original publication as a series of page images (Figures BDA0003634699900001541 through BDA0003634699900002501); the sequences are not reproduced in this text.]
Examples
The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present disclosure. They are not intended to limit the scope of what the inventors regard as their disclosure, nor are they intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, temperature, etc.), but some experimental errors and deviations should be accounted for. Unless otherwise indicated, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees Celsius, and pressure is at or near atmospheric. Standard abbreviations may be used, such as bp, base pair(s); kb, kilobase(s); s or sec, second(s); min, minute(s); h or hr, hour(s); aa, amino acid(s); nt, nucleotide(s); and so on.
THCAS mutation, construction and transformation
The THCAS mutations described herein were constructed based on an earlier saturation mutagenesis of the CBDAS enzyme. See U.S. Application No. 62/851,560, filed May 22, 2019, the contents of which are incorporated by reference herein in their entirety for all purposes. Selected mutations that improved CBDAS were transplanted into THCAS constructs and tested in strains that produce CBGA from OA. Table 2 shows data representing the performance of multiple, discrete integrations of each enzyme (n > 10 for all genotypes). The control strain, D488, contained wild-type THCAS and performed at the expected level. The performance results are shown in Tables 2 and 3.
Competition assay strains
Some, but not all, of the initial 10 mutations in the set tested resulted in increased titers, indicating that there was no 1:1 relationship between mutations in CBDAS and THCAS. Some beneficial mutations in CBDAS confer similar benefits to THCAS enzymes. The magnitude of some increase in THCA titer was surprisingly large, exceeding that observed for CBDAS. For example, the F371Y mutation in THCAS increased titer by about 81%, while the structurally equivalent F316Y mutation in CBDAS increased titer by only 43%.
Table 2: THCAS mutants. All strains were tested at n = 3 or greater. NA, not applicable; ND, not determined.
[Table 2 appears in the original publication as page images (Figures BDA0003634699900002511 through BDA0003634699900002531); the data are not reproduced in this text.]
Yeast transformation method
In an optimized lithium acetate (LiAc) transformation, each DNA construct comprising one or more heterologous nucleic acids disclosed herein (e.g., the constructs detailed in Table 4) was integrated into Saccharomyces cerevisiae (CEN.PK2, strain S4) using standard molecular biology techniques. Briefly, cells were grown overnight in yeast extract peptone dextrose (YPD) medium at 30 °C with shaking (200 rpm), diluted in YPD to an OD600 of 0.175, and grown to an OD600 of 0.6-0.8. Transformation was performed in 96-well plates using 1.67 mL of culture per well. The total culture volume was harvested by centrifugation, washed in an equal volume of sterile water, centrifuged again, and washed in an equal volume of 100 mM LiAc. The cells were centrifuged again, the supernatant was removed, and the cells were resuspended in a transformation mixture consisting of 80 μL of 50% PEG, 12 μL of 1 M LiAc, 3.3 μL of boiled salmon sperm DNA, 5 μL of PCR-amplified library DNA, and 19.7 μL of water (scaled by the number of transformations). After heat shock at 42 °C for 40 minutes, cells were recovered overnight in YPD medium before plating on selective medium. DNA integration was confirmed by colony PCR on sampled colonies using integration-specific primers, to verify a high integration rate.
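The scaling of the transformation mixture by the number of transformations can be sketched as follows. This is a minimal illustration: the per-transformation volumes are those listed above, while the function name and the optional `overage` pipetting allowance are hypothetical and not part of the protocol. In practice the library DNA would usually be added per well rather than pooled; it is included here only so the total per-well volume is visible.

```python
# Per-transformation volumes in microliters, as listed in the protocol above.
PER_REACTION_UL = {
    "50% PEG": 80.0,
    "1 M LiAc": 12.0,
    "boiled salmon sperm DNA": 3.3,
    "PCR-amplified library DNA": 5.0,
    "water": 19.7,
}


def master_mix(n_transformations: int, overage: float = 0.0) -> dict:
    """Scale every component by the number of transformations.

    `overage` is a hypothetical pipetting allowance (0.1 means 10% extra);
    the source protocol does not specify one.
    """
    if n_transformations < 1:
        raise ValueError("need at least one transformation")
    factor = n_transformations * (1.0 + overage)
    return {name: round(vol * factor, 1) for name, vol in PER_REACTION_UL.items()}


# Total volume of transformation mixture per well, in microliters.
per_well_total = round(sum(PER_REACTION_UL.values()), 1)

# Volumes needed for a full 96-well plate with no overage.
plate = master_mix(96)
```

The listed components sum to 120 μL per transformation, so a full 96-well plate needs about 11.5 mL of mixture before any overage.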
Yeast culture conditions
Yeast colonies (modified host cells) containing the library construct nucleic acids disclosed herein were picked into 96-well microtiter plates containing 360 μL of YPD (10 g/L yeast extract, 20 g/L Bacto peptone, 20 g/L dextrose (glucose)) and sealed with a gas-permeable membrane seal. Cells were cultured in a high-capacity microtiter plate incubator at 30 ℃ for 2 days (referred to as the "preculture") with shaking at 1000 rpm and 80% humidity until the culture reached carbon depletion. The growth-saturated culture was subcultured into a fresh plate containing YPGAL supplemented with olivetolic acid or hexanoic acid, or with an olivetolic acid derivative or a carboxylic acid other than hexanoic acid (10 g/L yeast extract, 20 g/L Bacto peptone, 20 g/L or 40 g/L galactose, 1 g/L glucose, and 1 mM olivetolic acid or 2 mM hexanoic acid, or 1 mM olivetolic acid derivative or 2 mM carboxylic acid other than hexanoic acid) by taking 15 μL of the saturated culture and diluting it into 360 μL of fresh medium, and the plate was sealed with a gas-permeable membrane seal. Prior to extraction and analysis, the modified host cells in the production medium were incubated in a high-capacity microtiter plate shaker at 30 ℃ at 1000 rpm and 80% humidity for an additional 5 days. Upon completion, 25 μL of whole cell broth was diluted into 975 μL of methanol, sealed with a foil seal, and shaken at 1200 rpm for 60 seconds to extract the cannabinoid or cannabinoid derivative. After shaking, the plates were centrifuged at 1000 × g for 60 seconds to remove all solids. After centrifugation, the seals were removed and 10 μL of the supernatant was transferred to a fresh assay plate containing 240 μL of methanol, sealed with a foil seal, shaken at 900 rpm for 60 seconds, and analyzed by LC-MS.
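The two methanol dilutions described above (25 μL into 975 μL, then 10 μL into 240 μL) multiply to an overall 1000-fold dilution of the whole cell broth. The following Python sketch shows how a concentration measured in the assay plate could be converted back to a broth titer. It assumes additive volumes and complete extraction, neither of which is stated in the source, and the function names are illustrative.

```python
from fractions import Fraction

# (sample volume, diluent volume) in microliters for each step in the text.
EXTRACTION_STEPS = [(25, 975), (10, 240)]


def dilution_factor(sample_ul, diluent_ul):
    """Fold dilution when sample_ul is mixed into diluent_ul (additive volumes assumed)."""
    return Fraction(sample_ul + diluent_ul, sample_ul)


def total_dilution(steps):
    """Overall fold dilution across a series of dilution steps."""
    total = Fraction(1)
    for sample_ul, diluent_ul in steps:
        total *= dilution_factor(sample_ul, diluent_ul)
    return total


def broth_concentration(measured, steps=EXTRACTION_STEPS):
    """Back-calculate the broth concentration from the value measured after dilution."""
    return measured * total_dilution(steps)


if __name__ == "__main__":
    print("Overall dilution:", total_dilution(EXTRACTION_STEPS))
    # Hypothetical reading of 0.5 mg/L in the assay plate.
    print("Broth titer (mg/L):", float(broth_concentration(0.5)))
```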
Analytical method
The samples were analyzed by LC-MS (Agilent 6470 mass spectrometer) using a Phenomenex Kinetex phenyl-hexyl 2.1 × 30 mm, 2.6 μm analytical column with the following gradient (mobile phase A: LC-MS grade water with 0.1% formic acid; mobile phase B: LC-MS grade acetonitrile with 0.1% formic acid):
time (minutes) %B
0 40
0.1 40
0.55 58
1.45 58
1.46 40
1.75 40
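The gradient table can be read as a piecewise program for mobile phase B. The Python sketch below returns %B at an arbitrary time, assuming the pump ramps linearly between the listed time points; that is a common convention for LC gradient tables, but the source does not state it. The function name `percent_b` is illustrative.

```python
from bisect import bisect_right

# (time in minutes, %B) pairs copied from the gradient table above.
GRADIENT = [
    (0.0, 40.0),
    (0.1, 40.0),
    (0.55, 58.0),
    (1.45, 58.0),
    (1.46, 40.0),
    (1.75, 40.0),
]


def percent_b(t_min):
    """Return %B at time t_min, interpolating linearly between table rows (assumed)."""
    times = [t for t, _ in GRADIENT]
    if t_min < times[0] or t_min > times[-1]:
        raise ValueError("time is outside the gradient program")
    i = bisect_right(times, t_min)
    if i == len(times):
        # Exactly at the final time point.
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0.0, 0.325, 1.0, 1.75):
        print(f"t = {t:.3f} min -> {percent_b(t):.1f} %B")
```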
The mass spectrometer was operated in negative-ion multiple reaction monitoring (MRM) mode. Each cannabinoid or cannabinoid derivative was identified by retention time and by MRM transitions determined from authentic standards:
Compound name Q1 mass (Da) Q3 mass (Da) Collision energy (V)
CBGA 359.2 341.1 22
CBGA 359.2 315.2 22
CBDA 357.2 339.1 22
CBDA 357.2 245.1 30
CBCA 357.2 203.0 40
CBCA 357.2 191.0 30
THCA 357.0 245.0 35
THCA 357.0 191.0 35
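The transition table also shows why retention time is needed alongside the MRM transitions: CBDA, CBCA and THCA are isomers, and some of their transitions nearly coincide. The Python sketch below encodes the table and lists transitions from different compounds that agree within a mass tolerance. The 0.25 Da tolerance and the function names are illustrative assumptions, not instrument settings from the source.

```python
from collections import defaultdict

# (compound, Q1 mass in Da, Q3 mass in Da, collision energy in V), from the table above.
MRM_TRANSITIONS = [
    ("CBGA", 359.2, 341.1, 22),
    ("CBGA", 359.2, 315.2, 22),
    ("CBDA", 357.2, 339.1, 22),
    ("CBDA", 357.2, 245.1, 30),
    ("CBCA", 357.2, 203.0, 40),
    ("CBCA", 357.2, 191.0, 30),
    ("THCA", 357.0, 245.0, 35),
    ("THCA", 357.0, 191.0, 35),
]


def transitions_by_compound(rows):
    """Group the transition rows by compound name."""
    grouped = defaultdict(list)
    for name, q1, q3, ce in rows:
        grouped[name].append({"q1": q1, "q3": q3, "ce": ce})
    return dict(grouped)


def overlapping_transitions(rows, tol=0.25):
    """Find transition pairs from different compounds whose Q1 and Q3 both agree within tol Da.

    `tol` is a hypothetical tolerance chosen for illustration.
    """
    hits = []
    for i, (n1, q1a, q3a, _) in enumerate(rows):
        for n2, q1b, q3b, _ in rows[i + 1:]:
            if n1 != n2 and abs(q1a - q1b) <= tol and abs(q3a - q3b) <= tol:
                hits.append((n1, n2, q1a, q3a))
    return hits


if __name__ == "__main__":
    for first, second, q1, q3 in overlapping_transitions(MRM_TRANSITIONS):
        print(f"{first} and {second} share approximately {q1} -> {q3}")
```

Under this tolerance the near-coincident pairs are CBDA/THCA and CBCA/THCA, so those compounds can only be told apart by chromatographic retention time or by their other transition.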
Recovery and purification
Whole cell broth from a culture comprising the modified host cells of the present disclosure is extracted with a suitable organic solvent to provide the cannabinoid or cannabinoid derivative. Suitable organic solvents include, but are not limited to, hexane, heptane, ethyl acetate, petroleum ether, diethyl ether, and chloroform. A suitable organic solvent, such as hexane, is added to the whole cell broth from a fermentation comprising the modified host cells of the present disclosure at a ratio of 10:1 (10 parts whole cell broth to 1 part organic solvent) and stirred for 30 minutes. The organic fraction is separated and extracted twice with an equal volume of acidic water (pH 2.5). The organic layer is then separated and dried in a concentrator (a reduced-pressure rotary evaporator or a thin-film evaporator) to obtain crude cannabinoid or cannabinoid derivative crystals. The crude crystals may then be heated to 105 ℃ for 15 minutes and then to 145 ℃ for 55 minutes to decarboxylate the crude cannabinoid or cannabinoid derivative. The crude crystalline product is redissolved and recrystallized in a suitable solvent (e.g., n-pentane) and filtered through a 1 μm filter to remove any insoluble material. The solvent is then removed, for example by rotary evaporation, to yield the pure crystalline product.
In vitro enzymatic assay and cell-free production of cannabinoids or cannabinoid derivatives
In some embodiments, modified host cells, e.g., modified yeast cells, are cultured in 96-well microtiter plates containing 360 μL of YPD (10 g/L yeast extract, 20 g/L Bacto peptone, 20 g/L dextrose (glucose)) and sealed with a gas-permeable membrane seal. The cells are then incubated at 30 ℃ in a high-capacity microtiter plate incubator with shaking at 1000 rpm and 80% humidity for 3 days until the culture reaches carbon depletion. The growth-saturated culture is then subcultured into 200 mL of YPGAL medium to an OD600 of 0.2 and incubated at 30 ℃ for 20 hours with shaking. The cells are then harvested by centrifugation at 3000 × g for 5 minutes at 4 ℃. The harvested cells are resuspended in 50 mL of buffer (50 mM Tris-HCl, 1 mM EDTA, 0.1 M KCl, pH 7.4, 125 units Benzonase) and lysed (Emulsiflex C3, Avestin, Inc., 60 bar, 10 min). Cell debris is removed by centrifugation (10,000 × g, 10 min, 4 ℃). The supernatant is then subjected to ultracentrifugation (150,000 × g, 1 h, 4 ℃, Beckman Coulter L-90K, TI-70). The resulting membrane fraction is resuspended in 3.3 mL of buffer (10 mM Tris-HCl, MgCl2, pH 8.0, 10% glycerol) and solubilized with a tissue grinder. Then, 0.02% (v/v) of the corresponding membrane preparation is dissolved in reaction buffer (50 mM Tris-HCl, 10 mM MgCl2, pH 8.5) with substrate (500 μM olivetolic acid, 500 μM GPP) to a total volume of 50 μL and incubated at 30 ℃ for 1 hour. The assay is then extracted by adding two reaction volumes of ethyl acetate, followed by vortexing and centrifugation. The organic layer is evaporated for 30 min, resuspended in acetonitrile/H2O/formic acid (80:20:0.05%), and filtered through a
Figure BDA0003634699900002551
-MC column (0.22 μm pore size, PVDF membrane material). The cannabinoid or cannabinoid derivative is then detected via LC-MS and/or recovered and purified.
Yeast cultivation in a bioreactor
A single yeast colony containing the modified host cells disclosed herein was grown in 15 mL of Verduyn medium (originally described by Verduyn et al., Yeast 8(7):501-17) containing 50 mM succinate (pH 5.0) and 2% glucose in a 125 mL flask at 30 ℃ with shaking at 200 rpm to an OD600 between 4 and 9. Glycerol was then added to the culture to a concentration of 20%, and 1 mL vials of the modified host cell suspension were stored at -80 ℃. One to two vials of modified host cells were thawed and grown in Verduyn medium containing 50 mM succinate (pH 5.0) and 4% sucrose for 24 hours, then subcultured in the same medium to an OD600 reading of 0.1. After 24 hours of growth at 30 ℃ with shaking, 65 mL of the culture was used to inoculate a 1.3 liter fermentor (Eppendorf DASGIP bioreactor) containing 585 mL of Verduyn fermentation medium containing 20 g/L galactose supplemented with hexanoic acid (2 mM), a carboxylic acid other than hexanoic acid (2 mM), olivetolic acid (1 mM), or an olivetolic acid derivative (1 mM). A polyalphaolefin may be added to the fermentor as an extractant. The fermentor was maintained at 30 ℃ and at pH 5.0 by addition of NH4OH. In the initial batch phase, the fermentor was aerated at 0.5 volumes of air per volume of culture per minute (VVM), and agitation was ramped to maintain 30% dissolved oxygen. After the initial sugar was consumed, the rise in dissolved oxygen triggered feeding of galactose + hexanoic acid (800 g galactose/liter + 9.28 g hexanoic acid/liter) at 10 g galactose per liter per hour, in pulses of 10 g galactose per liter (alternatively, instead of being fed hexanoic acid, the modified host cells disclosed herein were fed olivetolic acid, an olivetolic acid derivative, or a carboxylic acid other than hexanoic acid).
Between pulses, the feed rate was reduced to 5 g galactose per liter per hour. When the dissolved oxygen rose by 10%, the feed rate was restored to 10 g galactose per liter per hour. As the modified host cell density increased, dissolved oxygen was allowed to reach 0% and the pulse dose was increased to 50 g galactose per liter. The oxygen transfer rate was maintained at a rate representative of full-scale conditions, 100 mM per liter per hour, by adjusting the agitation as the volume increased. The feed rate was adjusted dynamically to meet demand using an algorithm that alternates between a high feed rate and a low feed rate. During the low feed rate, the modified host cells should consume the galactose and hexanoic acid (or, alternatively, olivetolic acid, an olivetolic acid derivative, or a carboxylic acid other than hexanoic acid), as well as any overflow metabolites that accumulated during the high feed rate. A rise in dissolved oxygen triggers the resumption of the high feed rate. The length of time spent at the low feed rate reflects the degree to which the modified host cells were overfed or underfed in the previous high-feed-rate pulse; this information is monitored and used to adjust the high feed rate up or down, keeping the low feed rate period within a defined range.
Over time, the feed rate is matched to the modified host cells' demand for sugar and hexanoic acid (or, alternatively, olivetolic acid, an olivetolic acid derivative, or a carboxylic acid other than hexanoic acid). The algorithm ensures minimal net accumulation of fermentation products other than the cannabinoid or cannabinoid derivative, biomass, and CO2. In some embodiments, the process lasts for 5 to 14 days. In certain such embodiments, accumulated broth is removed daily and assayed for biomass and for cannabinoid or cannabinoid derivative concentration. Concentrated solutions of NH4H2PO4, trace metals, and vitamins are added periodically to maintain steady-state concentrations.
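The alternating feed scheme described above can be pictured as a two-state controller. The Python sketch below is one possible reading of it, not the algorithm actually used: the high rate, low rate, pulse dose and 10% dissolved-oxygen rise come from the text, while the target window for the low-rate period (`low_window_h`), the adjustment step (`adjust`) and all class and field names are hypothetical. It does not model the later increase of the pulse dose to 50 g galactose per liter.

```python
from dataclasses import dataclass


@dataclass
class PulseFeedController:
    high_rate: float = 10.0            # g galactose/L/h (from the text)
    low_rate: float = 5.0              # g galactose/L/h (from the text)
    pulse_dose: float = 10.0           # g galactose/L per pulse (from the text)
    do_rise: float = 10.0              # % dissolved-oxygen rise that ends the low phase (from the text)
    low_window_h: tuple = (0.25, 1.0)  # hypothetical target range for the low-phase duration, hours
    adjust: float = 0.1                # hypothetical fractional step applied to high_rate
    state: str = "high"
    dosed: float = 0.0                 # g/L delivered so far in the current pulse
    low_elapsed: float = 0.0           # hours spent in the current low phase
    do_floor: float = 100.0            # lowest dissolved oxygen seen in the current low phase

    def step(self, do_percent: float, dt_h: float) -> float:
        """Advance by dt_h hours and return the feed rate applied over that interval."""
        if self.state == "high":
            self.dosed += self.high_rate * dt_h
            if self.dosed >= self.pulse_dose:
                # Pulse delivered: drop to the low rate and start timing it.
                self.state = "low"
                self.low_elapsed = 0.0
                self.do_floor = do_percent
            return self.high_rate

        self.low_elapsed += dt_h
        self.do_floor = min(self.do_floor, do_percent)
        if do_percent - self.do_floor >= self.do_rise:
            shortest, longest = self.low_window_h
            if self.low_elapsed > longest:
                # Long recovery suggests overfeeding: lower the high rate.
                self.high_rate *= 1.0 - self.adjust
            elif self.low_elapsed < shortest:
                # Quick recovery suggests underfeeding: raise the high rate.
                self.high_rate *= 1.0 + self.adjust
            self.state = "high"
            self.dosed = 0.0
        return self.low_rate
```

A caller would invoke `step` once per sampling interval with the current dissolved-oxygen reading and apply the returned rate to the feed pump; in a real fermentor the signal would also need filtering, which is omitted here.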
Table 4: constructs and strains used in the examples
Figure BDA0003634699900002561
Figure BDA0003634699900002571
Figure BDA0003634699900002581
If the strain has a parent strain, it is a progeny strain. All constructs present in the parent strain are also present in the progeny strain in their entirety.
S4 is CEN.PK113-1A with genotype MATα; URA3; TRP1; LEU2; HIS3; MAL2-8C; SUC2.
S487 is the base strain used to test THCA synthase constructs. In this strain, the nucleotide sequence encoding the empty expression cassette pGAL1_tTDH1 was added, and the i33 locus was restored to its native sequence by deletion of the CBDAS construct therein.
Table 5: list of regulatory and other elements
Figure BDA0003634699900002591
Figure BDA0003634699900002601
Figure BDA0003634699900002611
Figure BDA0003634699900002621
Figure BDA0003634699900002631
Figure BDA0003634699900002641
Figure BDA0003634699900002651
Figure BDA0003634699900002661
Figure BDA0003634699900002671
Figure BDA0003634699900002681
Figure BDA0003634699900002691
Figure BDA0003634699900002701
Figure BDA0003634699900002711
Figure BDA0003634699900002721
Table 5 legend: the non-coding DNA regions (regulatory and other regions) referred to in the figures are listed in Table 5. Flanking homology regions direct recombination at specific loci. Upstream flanking homology sequences are denoted by "u" and downstream sequences by "d". "i" indicates an intergenic integration site, e.g., ui7 and di7 are the regions flanking intergenic region 7. Integrations that delete an open reading frame have flanking homology named for the deleted gene as shown, e.g., uPEP4 and dPEP4 are the regions flanking the PEP4 gene. Synthetic recombination sequences (SRS) direct the internal recombination of two DNA constructs targeted for integration at the same locus. Linkers are short sequences, located between the indicated parts, that are used to assemble DNA constructs. Linkers G1, G7, G10, RG1 and LTTDH1 contain the last 36 bp of the upstream DNA part; where these linkers are used, it is assumed that the linker reconstructs the sequence omitted from the upstream part to produce a seamless junction with the downstream part. Linkers D0 and D9 are terminal linkers that direct the DNA construct into a cloning vector and do not integrate into the genome. Where no linker is shown between parts, the junction is likewise seamless.
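The naming convention in the legend can be expressed as a small parser. The Python sketch below is inferred only from the examples given in the legend (ui7, di7, uPEP4, dPEP4); the actual part names in Table 5 may follow additional patterns, and the function name and returned field names are illustrative.

```python
import re

# Patterns inferred from the legend's examples; treat them as assumptions.
_INTERGENIC = re.compile(r"^([ud])i(\d+)$")
_GENE = re.compile(r"^([ud])([A-Z][A-Z0-9]*)$")
_SIDE = {"u": "upstream", "d": "downstream"}


def parse_flank(name):
    """Classify a flanking-homology part name such as 'ui7' or 'dPEP4'."""
    match = _INTERGENIC.match(name)
    if match:
        return {"side": _SIDE[match.group(1)],
                "kind": "intergenic",
                "locus": int(match.group(2))}
    match = _GENE.match(name)
    if match:
        return {"side": _SIDE[match.group(1)],
                "kind": "gene_deletion",
                "locus": match.group(2)}
    raise ValueError(f"unrecognized flanking region name: {name!r}")


if __name__ == "__main__":
    for part in ("ui7", "di7", "uPEP4", "dPEP4"):
        print(part, parse_flank(part))
```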
While the disclosure has been described with reference to specific embodiments thereof, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the true spirit and scope of the disclosure. In addition, many modifications may be made to adapt a particular situation, material, composition of matter, process step or steps, to the objective, spirit and scope of the present disclosure. All such modifications are intended to be within the scope of the appended claims.
Sequence listing
<110> Demetrix, Inc.
<120> optimized tetrahydrocannabinolic acid (THCA) synthase polypeptides
<130> 037050-8020CN01
<150> US 62/902,300
<151> 2019-09-18
<160> 251
<170> PatentIn version 3.5
<210> 1
<211> 1635
<212> DNA
<213> Artificial sequence
<220>
<223> cannabidiolic acid (CBDA) synthase codon optimized sequence 2 codon optimization
<400> 1
atgaaatgct ctaccttttc tttctggttc gtttgtaaga ttatcttctt cttcttctcc 60
ttcaacatcc aaacctctat cgctaaccct cgtgaaaact ttttgaaatg tttttcccaa 120
tacatcccaa ataacgctac taatttgaag ttggtttaca cccaaaacaa cccattgtat 180
atgtccgttt taaactctac tattcacaat ttgcgtttta cctctgatac tacccctaaa 240
ccattggtca ttgttacccc atcccatgtt tctcatatcc aaggtactat cttgtgttct 300
aaaaaggttg gtttgcaaat tagaactcgt tccggtggtc acgattctga aggtatgtct 360
tacatttctc aagttccttt cgtcattgtc gacttgagaa acatgagatc catcaaaatt 420
gatgttcact ctcaaactgc ttgggtcgaa gccggtgcca ctttaggtga ggtctactat 480
tgggttaacg agaagaacga aaacttgtct ttggctgccg gttactgtcc aactgtctgt 540
gctggtggtc attttggtgg tggtggttac ggtccattga tgagaaacta cggtttggct 600
gctgataaca ttattgatgc tcacttagtt aacgtccacg gtaaagtctt ggatagaaag 660
tccatgggtg aagacttgtt ctgggcttta agaggtggtg gtgctgaatc cttcggtatt 720
attgttgctt ggaaaatcag attggtcgct gttccaaaat ccaccatgtt ttctgtcaag 780
aaaatcatgg aaattcatga attagttaag ttggtcaaca aatggcaaaa cattgcctat 840
aaatacgaca aggatttgtt gttgatgact catttcatca ctcgtaacat cactgataat 900
caaggtaaga acaagactgc tatccatact tacttctctt ccgtcttctt gggtggtgtt 960
gactctttgg tcgatttgat gaacaaatcc tttccagagt taggtattaa gaagactgac 1020
tgtagacaat tatcttggat tgacactatt atcttctact ctggtgttgt caattacgat 1080
actgataact ttaacaagga aattttgttg gaccgttctg ctggtcaaaa cggtgccttc 1140
aagattaagt tagattacgt taagaagcca atcccagaat ctgtcttcgt ccaaattttg 1200
gagaaattgt atgaagagga cattggtgct ggtatgtacg ccttgtatcc ttacggtggt 1260
atcatggacg agatctccga atctgccatc ccttttcctc atcgtgctgg tatcttgtac 1320
gagttgtggt acatctgttc ctgggagaag caagaagata atgaaaagca cttgaactgg 1380
attagaaata tttataattt catgactcca tacgtttcta agaacccacg tttggcttac 1440
ttaaattaca gagatttgga tattggtatc aacgacccta agaaccctaa caactacact 1500
caagctagaa tttggggtga gaaatatttc ggtaagaact tcgatagatt ggtcaaggtt 1560
aaaactttag ttgatccaaa taactttttt agaaacgaac aatctattcc accattgcca 1620
agacacagac actag 1635
<210> 2
<211> 1635
<212> DNA
<213> Artificial sequence
<220>
<223> cannabidiolic acid (CBDA) synthase codon optimized sequence 5 codon optimization
<400> 2
atgaaatgct ccactttctc tttctggttc gtttgtaaga ttatcttctt cttcttttct 60
ttcaacatcc aaacttccat tgccaaccct cgtgagaact tcttgaaatg tttttctcaa 120
tatatcccaa ataacgctac taacttgaag ttagtctata ctcaaaacaa cccattatat 180
atgtctgtct taaactctac cattcacaac ttacgtttca cttctgatac tactccaaaa 240
cctttggtca tcgtcacccc atcccacgtt tctcacatcc aaggtaccat cttgtgttcc 300
aaaaaggttg gtttacaaat ccgtactaga tccggtggtc atgactccga aggtatgtct 360
tacatttccc aagtcccttt cgtcatcgtc gacttaagaa atatgcgttc catcaagatt 420
gatgtccatt cccaaactgc ttgggttgaa gccggtgcca ctttaggtga agtctattac 480
tgggttaacg agaagaatga gaacttatct ttggctgccg gttactgtcc aactgtttgt 540
gctggtggtc atttcggtgg tggtggttac ggtccattaa tgcgtaacta cggtttggct 600
gccgataaca tcattgatgc ccacttagtc aacgttcatg gtaaggtctt ggaccgtaag 660
tctatgggtg aggatttatt ctgggctttg agaggtggtg gtgctgaatc tttcggtatt 720
atcgtcgctt ggaagattag attagttgct gttccaaagt ctactatgtt ctctgttaag 780
aagatcatgg aaattcacga gttggttaaa ttagttaaca aatggcaaaa cattgcctac 840
aagtacgata aagatttgtt attaatgact cactttatca ctagaaacat tactgataac 900
caaggtaaga ataagactgc cattcacact tacttctctt ctgttttctt gggtggtgtt 960
gattccttgg tcgatttgat gaacaagtct tttccagaat taggtattaa gaagaccgat 1020
tgtcgtcaat tatcttggat tgataccatt attttttact ccggtgttgt caactacgac 1080
actgataatt ttaataagga gattttgtta gatagatctg ctggtcaaaa tggtgccttt 1140
aaaatcaaat tggactacgt taagaagcct attccagaat ccgtctttgt tcaaattttg 1200
gagaagttat acgaagaaga tattggtgct ggtatgtacg ccttgtatcc atatggtggt 1260
attatggatg aaatttctga atccgccatc cctttccctc atcgtgctgg tatcttatac 1320
gagttgtggt acatctgttc ttgggaaaag caagaagata atgaaaagca tttgaactgg 1380
atccgtaaca tctataactt catgactcca tacgtttcca aaaaccctag attggcttac 1440
ttaaattaca gagacttaga tattggtatt aacgacccta agaacccaaa caattacact 1500
caagctagaa tctggggtga aaagtacttc ggtaagaatt tcgacagatt agttaaggtc 1560
aagactttag ttgacccaaa taacttcttc agaaacgaac aatctatccc accattgcct 1620
agacatagac actag 1635
<210> 3
<211> 544
<212> PRT
<213> Cannabis sativa
<400> 3
Met Lys Cys Ser Thr Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Phe Ser Phe Asn Ile Gln Thr Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Gln Tyr Ile Pro Asn Asn Ala Thr Asn
35 40 45
Leu Lys Leu Val Tyr Thr Gln Asn Asn Pro Leu Tyr Met Ser Val Leu
50 55 60
Asn Ser Thr Ile His Asn Leu Arg Phe Thr Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser His Val Ser His Ile Gln Gly Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ser Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Ile Val Asp Leu Arg Asn Met Arg Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Val Asn Glu Lys Asn Glu Asn Leu Ser Leu Ala Ala Gly Tyr Cys
165 170 175
Pro Thr Val Cys Ala Gly Gly His Phe Gly Gly Gly Gly Tyr Gly Pro
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val His Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Leu Arg Gly Gly Gly Ala Glu Ser Phe Gly Ile
225 230 235 240
Ile Val Ala Trp Lys Ile Arg Leu Val Ala Val Pro Lys Ser Thr Met
245 250 255
Phe Ser Val Lys Lys Ile Met Glu Ile His Glu Leu Val Lys Leu Val
260 265 270
Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Leu Leu
275 280 285
Met Thr His Phe Ile Thr Arg Asn Ile Thr Asp Asn Gln Gly Lys Asn
290 295 300
Lys Thr Ala Ile His Thr Tyr Phe Ser Ser Val Phe Leu Gly Gly Val
305 310 315 320
Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly Ile
325 330 335
Lys Lys Thr Asp Cys Arg Gln Leu Ser Trp Ile Asp Thr Ile Ile Phe
340 345 350
Tyr Ser Gly Val Val Asn Tyr Asp Thr Asp Asn Phe Asn Lys Glu Ile
355 360 365
Leu Leu Asp Arg Ser Ala Gly Gln Asn Gly Ala Phe Lys Ile Lys Leu
370 375 380
Asp Tyr Val Lys Lys Pro Ile Pro Glu Ser Val Phe Val Gln Ile Leu
385 390 395 400
Glu Lys Leu Tyr Glu Glu Asp Ile Gly Ala Gly Met Tyr Ala Leu Tyr
405 410 415
Pro Tyr Gly Gly Ile Met Asp Glu Ile Ser Glu Ser Ala Ile Pro Phe
420 425 430
Pro His Arg Ala Gly Ile Leu Tyr Glu Leu Trp Tyr Ile Cys Ser Trp
435 440 445
Glu Lys Gln Glu Asp Asn Glu Lys His Leu Asn Trp Ile Arg Asn Ile
450 455 460
Tyr Asn Phe Met Thr Pro Tyr Val Ser Lys Asn Pro Arg Leu Ala Tyr
465 470 475 480
Leu Asn Tyr Arg Asp Leu Asp Ile Gly Ile Asn Asp Pro Lys Asn Pro
485 490 495
Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly Lys
500 505 510
Asn Phe Asp Arg Leu Val Lys Val Lys Thr Leu Val Asp Pro Asn Asn
515 520 525
Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Arg His Arg His
530 535 540
<210> 4
<211> 2049
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 4
atgtttttca acagactaag cgctggcaag ctgctggtac cactctccgt ggtcctgtac 60
gcccttttcg tggtaatatt acctttacag aattctttcc actcctccaa tgttttagtt 120
agaggtgccg atgatgtaga aaactacgga actgttatcg gtattgactt aggtactact 180
tattcctgtg ttgctgtgat gaaaaatggt aagactgaaa ttcttgctaa tgagcaaggt 240
aacagaatca ccccatctta cgtggcattc accgatgatg aaagattgat tggtgatgct 300
gcaaagaacc aagttgctgc caatcctcaa aacaccatct tcgacattaa gagattgatc 360
ggtttgaaat ataacgacag atctgttcag aaggatatca agcacttgcc atttaatgtg 420
gttaataaag atgggaagcc cgctgtagaa gtaagtgtca aaggagaaaa gaaggttttt 480
actccagaag aaatttctgg tatgatcttg ggtaagatga aacaaattgc cgaagattat 540
ttaggcacta aggttaccca tgctgtcgtt actgttcctg cttatttcaa tgacgcgcaa 600
agacaagcca ccaaggatgc tggtaccatc gctggtttga acgttttgag aattgttaat 660
gaaccaaccg cagccgccat tgcctacggt ttggataaat ctgataagga acatcaaatt 720
attgtttatg atttgggtgg tggtactttc gatgtctctc tattgtctat tgaaaacggt 780
gttttcgaag tccaagccac ttctggtgat actcatttag gtggtgaaga ttttgactat 840
aagatcgttc gtcaattgat aaaagctttc aagaagaagc atggtattga tgtgtctgac 900
aacaacaagg ccctagctaa attgaagaga gaagctgaaa aggctaaacg tgccttgtcc 960
agccaaatgt ccacccgtat tgaaattgac tccttcgttg atggtatcga cttaagtgaa 1020
accttgacca gagctaagtt tgaggaatta aacctagatc tattcaagaa gaccttgaag 1080
cctgtcgaga aggttttgca agattctggt ttggaaaaga aggatgttga tgatatcgtt 1140
ttggttggtg gttctactag aattccaaag gtccaacaat tgttagaatc atactttgat 1200
ggtaagaagg cctccaaggg tattaaccca gatgaagctg ttgcatacgg tgcagccgtt 1260
caagctggtg tcttatccgg tgaagaaggt gtcgaagata ttgttttatt ggatgtcaac 1320
gctttgactc ttggtattga aaccactggt ggtgtcatga ctccattaat taagagaaat 1380
actgctattc ctacaaagaa atcccaaatt ttctctactg ccgttgacaa ccaaccaacc 1440
gttatgatca aggtatacga gggtgaaaga gccatgtcta aggacaacaa tctattaggt 1500
aagtttgaat taaccggcat tccaccagca ccaagaggtg tacctcaaat tgaagtcaca 1560
tttgcacttg acgctaatgg tattctgaag gtgtctgcca cagataaggg aactggtaaa 1620
tccgaatcta tcaccatcac taacgataaa ggtagattaa cccaagaaga gattgataga 1680
atggttgaag aggctgaaaa attcgcttct gaagacgctt ctatcaaggc caaggttgaa 1740
tctagaaaca aattagaaaa ctacgctcac tctttgaaaa accaagttaa tggtgaccta 1800
ggtgaaaaat tggaagaaga agacaaggaa accttattag atgctgctaa cgatgtttta 1860
gaatggttag atgataactt tgaaaccgcc attgctgaag actttgatga aaagttcgaa 1920
tctttgtcca aggtcgctta tccaattact tctaagttgt acggaggtgc tgatggttct 1980
ggtgccgctg attatgacga cgaagatgaa gatgacgatg gtgattattt cgaacacgac 2040
gaattgtag 2049
<210> 5
<211> 682
<212> PRT
<213> unknown
<220>
<223> genus Saccharomyces
<400> 5
Met Phe Phe Asn Arg Leu Ser Ala Gly Lys Leu Leu Val Pro Leu Ser
1 5 10 15
Val Val Leu Tyr Ala Leu Phe Val Val Ile Leu Pro Leu Gln Asn Ser
20 25 30
Phe His Ser Ser Asn Val Leu Val Arg Gly Ala Asp Asp Val Glu Asn
35 40 45
Tyr Gly Thr Val Ile Gly Ile Asp Leu Gly Thr Thr Tyr Ser Cys Val
50 55 60
Ala Val Met Lys Asn Gly Lys Thr Glu Ile Leu Ala Asn Glu Gln Gly
65 70 75 80
Asn Arg Ile Thr Pro Ser Tyr Val Ala Phe Thr Asp Asp Glu Arg Leu
85 90 95
Ile Gly Asp Ala Ala Lys Asn Gln Val Ala Ala Asn Pro Gln Asn Thr
100 105 110
Ile Phe Asp Ile Lys Arg Leu Ile Gly Leu Lys Tyr Asn Asp Arg Ser
115 120 125
Val Gln Lys Asp Ile Lys His Leu Pro Phe Asn Val Val Asn Lys Asp
130 135 140
Gly Lys Pro Ala Val Glu Val Ser Val Lys Gly Glu Lys Lys Val Phe
145 150 155 160
Thr Pro Glu Glu Ile Ser Gly Met Ile Leu Gly Lys Met Lys Gln Ile
165 170 175
Ala Glu Asp Tyr Leu Gly Thr Lys Val Thr His Ala Val Val Thr Val
180 185 190
Pro Ala Tyr Phe Asn Asp Ala Gln Arg Gln Ala Thr Lys Asp Ala Gly
195 200 205
Thr Ile Ala Gly Leu Asn Val Leu Arg Ile Val Asn Glu Pro Thr Ala
210 215 220
Ala Ala Ile Ala Tyr Gly Leu Asp Lys Ser Asp Lys Glu His Gln Ile
225 230 235 240
Ile Val Tyr Asp Leu Gly Gly Gly Thr Phe Asp Val Ser Leu Leu Ser
245 250 255
Ile Glu Asn Gly Val Phe Glu Val Gln Ala Thr Ser Gly Asp Thr His
260 265 270
Leu Gly Gly Glu Asp Phe Asp Tyr Lys Ile Val Arg Gln Leu Ile Lys
275 280 285
Ala Phe Lys Lys Lys His Gly Ile Asp Val Ser Asp Asn Asn Lys Ala
290 295 300
Leu Ala Lys Leu Lys Arg Glu Ala Glu Lys Ala Lys Arg Ala Leu Ser
305 310 315 320
Ser Gln Met Ser Thr Arg Ile Glu Ile Asp Ser Phe Val Asp Gly Ile
325 330 335
Asp Leu Ser Glu Thr Leu Thr Arg Ala Lys Phe Glu Glu Leu Asn Leu
340 345 350
Asp Leu Phe Lys Lys Thr Leu Lys Pro Val Glu Lys Val Leu Gln Asp
355 360 365
Ser Gly Leu Glu Lys Lys Asp Val Asp Asp Ile Val Leu Val Gly Gly
370 375 380
Ser Thr Arg Ile Pro Lys Val Gln Gln Leu Leu Glu Ser Tyr Phe Asp
385 390 395 400
Gly Lys Lys Ala Ser Lys Gly Ile Asn Pro Asp Glu Ala Val Ala Tyr
405 410 415
Gly Ala Ala Val Gln Ala Gly Val Leu Ser Gly Glu Glu Gly Val Glu
420 425 430
Asp Ile Val Leu Leu Asp Val Asn Ala Leu Thr Leu Gly Ile Glu Thr
435 440 445
Thr Gly Gly Val Met Thr Pro Leu Ile Lys Arg Asn Thr Ala Ile Pro
450 455 460
Thr Lys Lys Ser Gln Ile Phe Ser Thr Ala Val Asp Asn Gln Pro Thr
465 470 475 480
Val Met Ile Lys Val Tyr Glu Gly Glu Arg Ala Met Ser Lys Asp Asn
485 490 495
Asn Leu Leu Gly Lys Phe Glu Leu Thr Gly Ile Pro Pro Ala Pro Arg
500 505 510
Gly Val Pro Gln Ile Glu Val Thr Phe Ala Leu Asp Ala Asn Gly Ile
515 520 525
Leu Lys Val Ser Ala Thr Asp Lys Gly Thr Gly Lys Ser Glu Ser Ile
530 535 540
Thr Ile Thr Asn Asp Lys Gly Arg Leu Thr Gln Glu Glu Ile Asp Arg
545 550 555 560
Met Val Glu Glu Ala Glu Lys Phe Ala Ser Glu Asp Ala Ser Ile Lys
565 570 575
Ala Lys Val Glu Ser Arg Asn Lys Leu Glu Asn Tyr Ala His Ser Leu
580 585 590
Lys Asn Gln Val Asn Gly Asp Leu Gly Glu Lys Leu Glu Glu Glu Asp
595 600 605
Lys Glu Thr Leu Leu Asp Ala Ala Asn Asp Val Leu Glu Trp Leu Asp
610 615 620
Asp Asn Phe Glu Thr Ala Ile Ala Glu Asp Phe Asp Glu Lys Phe Glu
625 630 635 640
Ser Leu Ser Lys Val Ala Tyr Pro Ile Thr Ser Lys Leu Tyr Gly Gly
645 650 655
Ala Asp Gly Ser Gly Ala Ala Asp Tyr Asp Asp Glu Asp Glu Asp Asp
660 665 670
Asp Gly Asp Tyr Phe Glu His Asp Glu Leu
675 680
<210> 6
<211> 1692
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 6
atgagattaa gaaccgccat tgccacactg tgcctcacgg cttttacatc tgcaacttca 60
aacaatagct acatcgccac cgaccaaaca caaaatgcct ttaatgacac tcacttttgt 120
aaggtcgaca ggaatgatca cgttagtccc agttgtaacg taacattcaa tgaattaaat 180
gccataaatg aaaacattag agatgatctt tcggcgttat taaaatctga tttcttcaaa 240
tactttcggc tggatttata caagcaatgt tcattttggg acgccaacga tggtctgtgc 300
ttaaaccgcg cttgctctgt tgatgtcgta gaggactggg atacactgcc tgagtactgg 360
cagcctgaga tcttgggtag tttcaataat gatacaatga aggaagcgga tgatagcgat 420
gacgaatgta agttcttaga tcaactatgt caaaccagta aaaaacctgt agatatcgaa 480
gacaccatca actactgtga tgtaaatgac tttaacggta aaaacgccgt tctgattgat 540
ttaacagcaa atccggaacg atttacaggt tatggtggta agcaagctgg tcaaatttgg 600
tctactatct accaagacaa ctgttttaca attggcgaaa ctggtgaatc attggccaaa 660
gatgcatttt atagacttgt atccggtttc catgcctcta tcggtactca cttatcaaag 720
gaatatttga acacgaaaac tggtaaatgg gagcccaatc tggatttgtt tatggcaaga 780
atcgggaact ttcctgatag agtgacaaac atgtatttca attatgctgt tgtagctaag 840
gctctctgga aaattcaacc atatttacca gaattttcat tctgtgatct agtcaataaa 900
gaaatcaaaa acaaaatgga taacgttatt tcccagctgg acacaaaaat ttttaacgaa 960
gacttagttt ttgccaacga cctaagtttg actttgaagg acgaattcag atctcgcttc 1020
aagaatgtca cgaagattat ggattgtgtg caatgtgata gatgtagatt gtggggcaaa 1080
attcaaacta ccggttacgc aactgccttg aaaattttgt ttgaaatcaa cgacgctgat 1140
gaattcacca aacaacatat tgttggtaag ttaaccaaat atgagttgat tgcactatta 1200
caaactttcg gtagattatc tgaatctatt gaatctgtta acatgttcga aaaaatgtac 1260
gggaaaaggt taaacggttc tgaaaacagg ttaagctcat tcttccaaaa taacttcttc 1320
aacattttga aggaggcagg caagtcgatt cgttacacca tagagaacat caattccact 1380
aaagaaggaa agaaaaagac taacaattct caatcacatg tatttgatga tttaaaaatg 1440
cccaaagcag aaatagttcc aaggccctct aacggtacag taaataaatg gaagaaagct 1500
tggaatactg aagttaacaa cgttttagaa gcattcagat ttatttatag aagctatttg 1560
gatttaccca ggaacatctg ggaattatct ttgatgaagg tatacaaatt ttggaataaa 1620
ttcatcggtg ttgctgatta cgttagtgag gagacacgag agcctatttc ctataagcta 1680
gatatacaat aa 1692
<210> 7
<211> 563
<212> PRT
<213> unknown
<220>
<223> genus Saccharomyces
<400> 7
Met Arg Leu Arg Thr Ala Ile Ala Thr Leu Cys Leu Thr Ala Phe Thr
1 5 10 15
Ser Ala Thr Ser Asn Asn Ser Tyr Ile Ala Thr Asp Gln Thr Gln Asn
20 25 30
Ala Phe Asn Asp Thr His Phe Cys Lys Val Asp Arg Asn Asp His Val
35 40 45
Ser Pro Ser Cys Asn Val Thr Phe Asn Glu Leu Asn Ala Ile Asn Glu
50 55 60
Asn Ile Arg Asp Asp Leu Ser Ala Leu Leu Lys Ser Asp Phe Phe Lys
65 70 75 80
Tyr Phe Arg Leu Asp Leu Tyr Lys Gln Cys Ser Phe Trp Asp Ala Asn
85 90 95
Asp Gly Leu Cys Leu Asn Arg Ala Cys Ser Val Asp Val Val Glu Asp
100 105 110
Trp Asp Thr Leu Pro Glu Tyr Trp Gln Pro Glu Ile Leu Gly Ser Phe
115 120 125
Asn Asn Asp Thr Met Lys Glu Ala Asp Asp Ser Asp Asp Glu Cys Lys
130 135 140
Phe Leu Asp Gln Leu Cys Gln Thr Ser Lys Lys Pro Val Asp Ile Glu
145 150 155 160
Asp Thr Ile Asn Tyr Cys Asp Val Asn Asp Phe Asn Gly Lys Asn Ala
165 170 175
Val Leu Ile Asp Leu Thr Ala Asn Pro Glu Arg Phe Thr Gly Tyr Gly
180 185 190
Gly Lys Gln Ala Gly Gln Ile Trp Ser Thr Ile Tyr Gln Asp Asn Cys
195 200 205
Phe Thr Ile Gly Glu Thr Gly Glu Ser Leu Ala Lys Asp Ala Phe Tyr
210 215 220
Arg Leu Val Ser Gly Phe His Ala Ser Ile Gly Thr His Leu Ser Lys
225 230 235 240
Glu Tyr Leu Asn Thr Lys Thr Gly Lys Trp Glu Pro Asn Leu Asp Leu
245 250 255
Phe Met Ala Arg Ile Gly Asn Phe Pro Asp Arg Val Thr Asn Met Tyr
260 265 270
Phe Asn Tyr Ala Val Val Ala Lys Ala Leu Trp Lys Ile Gln Pro Tyr
275 280 285
Leu Pro Glu Phe Ser Phe Cys Asp Leu Val Asn Lys Glu Ile Lys Asn
290 295 300
Lys Met Asp Asn Val Ile Ser Gln Leu Asp Thr Lys Ile Phe Asn Glu
305 310 315 320
Asp Leu Val Phe Ala Asn Asp Leu Ser Leu Thr Leu Lys Asp Glu Phe
325 330 335
Arg Ser Arg Phe Lys Asn Val Thr Lys Ile Met Asp Cys Val Gln Cys
340 345 350
Asp Arg Cys Arg Leu Trp Gly Lys Ile Gln Thr Thr Gly Tyr Ala Thr
355 360 365
Ala Leu Lys Ile Leu Phe Glu Ile Asn Asp Ala Asp Glu Phe Thr Lys
370 375 380
Gln His Ile Val Gly Lys Leu Thr Lys Tyr Glu Leu Ile Ala Leu Leu
385 390 395 400
Gln Thr Phe Gly Arg Leu Ser Glu Ser Ile Glu Ser Val Asn Met Phe
405 410 415
Glu Lys Met Tyr Gly Lys Arg Leu Asn Gly Ser Glu Asn Arg Leu Ser
420 425 430
Ser Phe Phe Gln Asn Asn Phe Phe Asn Ile Leu Lys Glu Ala Gly Lys
435 440 445
Ser Ile Arg Tyr Thr Ile Glu Asn Ile Asn Ser Thr Lys Glu Gly Lys
450 455 460
Lys Lys Thr Asn Asn Ser Gln Ser His Val Phe Asp Asp Leu Lys Met
465 470 475 480
Pro Lys Ala Glu Ile Val Pro Arg Pro Ser Asn Gly Thr Val Asn Lys
485 490 495
Trp Lys Lys Ala Trp Asn Thr Glu Val Asn Asn Val Leu Glu Ala Phe
500 505 510
Arg Phe Ile Tyr Arg Ser Tyr Leu Asp Leu Pro Arg Asn Ile Trp Glu
515 520 525
Leu Ser Leu Met Lys Val Tyr Lys Phe Trp Asn Lys Phe Ile Gly Val
530 535 540
Ala Asp Tyr Val Ser Glu Glu Thr Arg Glu Pro Ile Ser Tyr Lys Leu
545 550 555 560
Asp Ile Gln
<210> 8
<211> 1569
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 8
atgaagtttt ctgctggtgc cgtcctgtca tggtcctccc tgctgctcgc ctcctctgtt 60
ttcgcccaac aagaggctgt ggcccctgaa gactccgctg tcgttaagtt ggccaccgac 120
tccttcaatg agtacattca gtcgcacgac ttggtgcttg cggagttttt tgctccatgg 180
tgtggccact gtaagaacat ggctcctgaa tacgttaaag ccgccgagac tttagttgag 240
aaaaacatta ccttggccca gatcgactgt actgaaaacc aggatctgtg tatggaacac 300
aacattccag ggttcccaag cttgaagatt ttcaaaaaca gcgatgttaa caactcgatc 360
gattacgagg gacctagaac tgccgaggcc attgtccaat tcatgatcaa gcaaagccaa 420
ccggctgtcg ccgttgttgc tgatctacca gcttaccttg ctaacgagac ttttgtcact 480
ccagttatcg tccaatccgg taagattgac gccgacttca acgccacctt ttactccatg 540
gccaacaaac acttcaacga ctacgacttt gtctccgctg aaaacgcaga cgatgatttc 600
aagctttcta tttacttgcc ctccgccatg gacgagcctg tagtatacaa cggtaagaaa 660
gccgatatcg ctgacgctga tgtttttgaa aaatggttgc aagtggaagc cttgccctac 720
tttggtgaaa tcgacggttc cgttttcgcc caatacgtcg aaagcggttt gcctttgggt 780
tacttattct acaatgacga ggaagaattg gaagaataca agcctctctt taccgagttg 840
gccaaaaaga acagaggtct aatgaacttt gttagcatcg atgccagaaa attcggcaga 900
cacgccggca acttgaacat gaaggaacaa ttccctctat ttgccatcca cgacatgact 960
gaagacttga agtacggttt gcctcaactc tctgaagagg cgtttgacga attgagcgac 1020
aagatcgtgt tggagtctaa ggctattgaa tctttggtta aggacttctt gaaaggtgat 1080
gcctccccaa tcgtgaagtc ccaagagatc ttcgagaacc aagattcctc tgtcttccaa 1140
ttggtcggta agaaccatga cgaaatcgtc aacgacccaa agaaggacgt tcttgttttg 1200
tactatgccc catggtgtgg tcactgtaag agattggccc caacttacca agaactagct 1260
gatacctacg ccaacgccac atccgacgtt ttgattgcta aactagacca cactgaaaac 1320
gatgtcagag gcgtcgtaat tgaaggttac ccaacaatcg tcttataccc aggtggtaag 1380
aagtccgaat ctgttgtgta ccaaggttca agatccttgg actctttatt cgacttcatc 1440
aaggaaaacg gtcacttcga cgtcgacggt aaggccttgt acgaagaagc ccaggaaaaa 1500
gctgctgagg aagccgatgc tgacgctgaa ttggctgacg aagaagatgc cattcacgat 1560
gaattgtaa 1569
<210> 9
<211> 522
<212> PRT
<213> unknown
<220>
<223> genus Saccharomyces
<400> 9
Met Lys Phe Ser Ala Gly Ala Val Leu Ser Trp Ser Ser Leu Leu Leu
1 5 10 15
Ala Ser Ser Val Phe Ala Gln Gln Glu Ala Val Ala Pro Glu Asp Ser
20 25 30
Ala Val Val Lys Leu Ala Thr Asp Ser Phe Asn Glu Tyr Ile Gln Ser
35 40 45
His Asp Leu Val Leu Ala Glu Phe Phe Ala Pro Trp Cys Gly His Cys
50 55 60
Lys Asn Met Ala Pro Glu Tyr Val Lys Ala Ala Glu Thr Leu Val Glu
65 70 75 80
Lys Asn Ile Thr Leu Ala Gln Ile Asp Cys Thr Glu Asn Gln Asp Leu
85 90 95
Cys Met Glu His Asn Ile Pro Gly Phe Pro Ser Leu Lys Ile Phe Lys
100 105 110
Asn Ser Asp Val Asn Asn Ser Ile Asp Tyr Glu Gly Pro Arg Thr Ala
115 120 125
Glu Ala Ile Val Gln Phe Met Ile Lys Gln Ser Gln Pro Ala Val Ala
130 135 140
Val Val Ala Asp Leu Pro Ala Tyr Leu Ala Asn Glu Thr Phe Val Thr
145 150 155 160
Pro Val Ile Val Gln Ser Gly Lys Ile Asp Ala Asp Phe Asn Ala Thr
165 170 175
Phe Tyr Ser Met Ala Asn Lys His Phe Asn Asp Tyr Asp Phe Val Ser
180 185 190
Ala Glu Asn Ala Asp Asp Asp Phe Lys Leu Ser Ile Tyr Leu Pro Ser
195 200 205
Ala Met Asp Glu Pro Val Val Tyr Asn Gly Lys Lys Ala Asp Ile Ala
210 215 220
Asp Ala Asp Val Phe Glu Lys Trp Leu Gln Val Glu Ala Leu Pro Tyr
225 230 235 240
Phe Gly Glu Ile Asp Gly Ser Val Phe Ala Gln Tyr Val Glu Ser Gly
245 250 255
Leu Pro Leu Gly Tyr Leu Phe Tyr Asn Asp Glu Glu Glu Leu Glu Glu
260 265 270
Tyr Lys Pro Leu Phe Thr Glu Leu Ala Lys Lys Asn Arg Gly Leu Met
275 280 285
Asn Phe Val Ser Ile Asp Ala Arg Lys Phe Gly Arg His Ala Gly Asn
290 295 300
Leu Asn Met Lys Glu Gln Phe Pro Leu Phe Ala Ile His Asp Met Thr
305 310 315 320
Glu Asp Leu Lys Tyr Gly Leu Pro Gln Leu Ser Glu Glu Ala Phe Asp
325 330 335
Glu Leu Ser Asp Lys Ile Val Leu Glu Ser Lys Ala Ile Glu Ser Leu
340 345 350
Val Lys Asp Phe Leu Lys Gly Asp Ala Ser Pro Ile Val Lys Ser Gln
355 360 365
Glu Ile Phe Glu Asn Gln Asp Ser Ser Val Phe Gln Leu Val Gly Lys
370 375 380
Asn His Asp Glu Ile Val Asn Asp Pro Lys Lys Asp Val Leu Val Leu
385 390 395 400
Tyr Tyr Ala Pro Trp Cys Gly His Cys Lys Arg Leu Ala Pro Thr Tyr
405 410 415
Gln Glu Leu Ala Asp Thr Tyr Ala Asn Ala Thr Ser Asp Val Leu Ile
420 425 430
Ala Lys Leu Asp His Thr Glu Asn Asp Val Arg Gly Val Val Ile Glu
435 440 445
Gly Tyr Pro Thr Ile Val Leu Tyr Pro Gly Gly Lys Lys Ser Glu Ser
450 455 460
Val Val Tyr Gln Gly Ser Arg Ser Leu Asp Ser Leu Phe Asp Phe Ile
465 470 475 480
Lys Glu Asn Gly His Phe Asp Val Asp Gly Lys Ala Leu Tyr Glu Glu
485 490 495
Ala Gln Glu Lys Ala Ala Glu Glu Ala Asp Ala Asp Ala Glu Leu Ala
500 505 510
Asp Glu Glu Asp Ala Ile His Asp Glu Leu
515 520
<210> 10
<211> 3348
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 10
atgcgtctac ttcgaagaaa catgttagta ttgacactgc tcgtttgtgt gttttcatcc 60
atcatttcat gctcaatccc attgtcgtct cgcacctcaa ggcggcagat agtggaagat 120
gaagttgcct ccactaaaaa gctcaatttc aactatggtg tggataaaaa tataaactcg 180
cccattcctg ctccaagaac cactgaaggt ttaccaaata tgaaactcag ctcatatcca 240
actcctaact tattgaatac tgctgataat cgacgtgcta acaaaaaagg acgtagggct 300
gccaattcta taagtgtacc ctatttggag aatcgttcct tgaacgaact gagtttatca 360
gatatactaa tcgcagccga cgttgagggt ggacttcatg ctgtagatag aagaaatggt 420
catatcatat ggtcaatcga accagaaaat tttcaacctc tgatagaaat acaagaacct 480
tcgaggttag aaacatatga aacgttgatt atagaacctt tcggtgatgg gaacatttac 540
tactttaacg cccatcaagg gttacaaaaa ctgcctttat ccatacgaca acttgtatca 600
acttccccgc tgcacttgaa aacaaatatt gtggttaatg actctggaaa aattgttgaa 660
gatgaaaagg tctacactgg atcgatgaga actataatgt atactataaa catgttgaat 720
ggtgaaatta tatcagcgtt cggacctggt tcaaaaaacg ggtatttcgg gagccagagt 780
gtggattgct cacctgagga gaagataaaa cttcaggaat gtgaaaatat gattgtaata 840
ggcaaaacta tttttgagct gggaattcac tcttatgatg gagcaagcta caatgtcact 900
tactctacat ggcagcaaaa tgttttagat gttcccctag cgcttcagaa tacattttca 960
aaggacggca tgtgcatagc gcctttccgt gataaatcat tgctagcaag cgatttagat 1020
tttagaattg ctagatgggt ttctccgaca ttccccggaa ttattgttgg gcttttcgat 1080
gtgtttaatg atctccgcac caatgaaaat atactggtac cgcatccctt taatcctggt 1140
gatcatgaaa gtatatcgag taacaaagtt tacttggatc agacttcgaa cctctcctgg 1200
tttgcattat ctagtcagaa ttttccatct ttagtcgaat cagctcccat atcaagatac 1260
gcttccagtg accgttggag ggtgtcttca atttttgaag atgagacttt attcaagaac 1320
gcaatcatgg gtgttcatca gatatataat aatgaatatg atcaccttta tgaaaactat 1380
gaaaaaacga atagtttgga cactacgcac aaatatccac ctctgatgat tgattcgtcc 1440
gttgatacaa ccgatttaca tcagaataac gagatgaatt cactaaagga atacatgtca 1500
ccagaagacc ttgaggcata tagaaaaaag atacacgagc aaatatcgag agaattagat 1560
gaaaagaacc aaaattcttt gctactgaag tttggaagtc tagtatatcg aattatagag 1620
actggagtat ttctgttgtt atttctcatt ttttgtgcaa tactacaaag attcaaaatt 1680
ttgccgccac tatatgtatt attatccaaa attggattta tgcctgaaaa ggaaatcccc 1740
atagttgagt cgaaatcgct aaattgtccc tcttcatcgg aaaatgtaac caagccattc 1800
gatatgaaat cagggaagca agttgttttt gaaggtgctg tgaacgatgg aagtctaaaa 1860
tctgaaaaag ataacgatga tgctgatgaa gatgatgaaa aatcactaga tttaaccaca 1920
gaaaagaaga agaggaaaag aggttcgaga ggaggcaaaa agggccgaaa atcacgcatt 1980
gcaaatatac caaactttga gcaatcttta aaaaatttgg tagtatccga aaaaatttta 2040
ggttacggtt catcaggaac agtagttttt cagggaagtt ttcaaggaag acctgttgcg 2100
gtaaagagaa tgttaattga tttttgtgac atagctttaa tggaaataaa acttttgact 2160
gaaagcgatg atcaccctaa cgtcatacga tactactgtt cagaaacaac agacagattt 2220
ttgtatattg ctttagagct ctgcaatttg aaccttcaag atttggtgga gtctaagaat 2280
gtatcagatg aaaacctgaa attacagaaa gagtataatc caatttcgtt attgagacaa 2340
atagcgtccg gggtagcaca tttacattct ttaaagatta tccatcgaga tttaaagcct 2400
caaaatattc tcgtttctac ttcgagtagg tttactgccg atcagcaaac aggagcagaa 2460
aatcttcgaa ttttgatatc agactttggt ctttgcaaaa aactagactc tggtcagtct 2520
tcatttagaa caaatttgaa taacccttct ggcacaagtg gttggagggc cccagagctg 2580
cttgaagaat caaacaattt gcagtgccaa gtcgaaacgg aacactcttc tagtaggcat 2640
acagtagttt catctgattc tttttatgat ccgttcacca agaggaggct aacaagatct 2700
attgatattt tttctatggg atgtgtattc tattatatcc tatccaaagg gaagcatcca 2760
tttggagata aatattcacg tgaaagcaat atcataagag gaatattcag tcttgatgaa 2820
atgaaatgtc tacatgatag atccttaatt gcagaagcta cagatctgat ctcccaaatg 2880
attgatcacg atccgttaaa aagacctact gctatgaaag ttctaaggca tccgttgttt 2940
tggccaaagt cgaaaaaatt ggagttcctt ttaaaagtta gtgataggct tgaaattgaa 3000
aacagagacc ctccaagtgc cctgttaatg aaatttgacg ccggttctga ctttgtaata 3060
cccagtggag attggactgt caagtttgat aaaacattca tggacaacct tgaaaggtac 3120
agaaaatacc attcatcaaa gttaatggat ctattaagag cacttaggaa taaatatcat 3180
cattttatgg atttacctga agatatagca gaactaatgg ggccggtacc cgatggattt 3240
tacgattact tcaccaagcg ttttccaaac ctattaatag gtgtttatat gattgtcaag 3300
gaaaatttaa gtgacgatca aattttacgt gaatttttgt attcataa 3348
<210> 11
<211> 1115
<212> PRT
<213> unknown
<220>
<223> genus Saccharomyces
<400> 11
Met Arg Leu Leu Arg Arg Asn Met Leu Val Leu Thr Leu Leu Val Cys
1 5 10 15
Val Phe Ser Ser Ile Ile Ser Cys Ser Ile Pro Leu Ser Ser Arg Thr
20 25 30
Ser Arg Arg Gln Ile Val Glu Asp Glu Val Ala Ser Thr Lys Lys Leu
35 40 45
Asn Phe Asn Tyr Gly Val Asp Lys Asn Ile Asn Ser Pro Ile Pro Ala
50 55 60
Pro Arg Thr Thr Glu Gly Leu Pro Asn Met Lys Leu Ser Ser Tyr Pro
65 70 75 80
Thr Pro Asn Leu Leu Asn Thr Ala Asp Asn Arg Arg Ala Asn Lys Lys
85 90 95
Gly Arg Arg Ala Ala Asn Ser Ile Ser Val Pro Tyr Leu Glu Asn Arg
100 105 110
Ser Leu Asn Glu Leu Ser Leu Ser Asp Ile Leu Ile Ala Ala Asp Val
115 120 125
Glu Gly Gly Leu His Ala Val Asp Arg Arg Asn Gly His Ile Ile Trp
130 135 140
Ser Ile Glu Pro Glu Asn Phe Gln Pro Leu Ile Glu Ile Gln Glu Pro
145 150 155 160
Ser Arg Leu Glu Thr Tyr Glu Thr Leu Ile Ile Glu Pro Phe Gly Asp
165 170 175
Gly Asn Ile Tyr Tyr Phe Asn Ala His Gln Gly Leu Gln Lys Leu Pro
180 185 190
Leu Ser Ile Arg Gln Leu Val Ser Thr Ser Pro Leu His Leu Lys Thr
195 200 205
Asn Ile Val Val Asn Asp Ser Gly Lys Ile Val Glu Asp Glu Lys Val
210 215 220
Tyr Thr Gly Ser Met Arg Thr Ile Met Tyr Thr Ile Asn Met Leu Asn
225 230 235 240
Gly Glu Ile Ile Ser Ala Phe Gly Pro Gly Ser Lys Asn Gly Tyr Phe
245 250 255
Gly Ser Gln Ser Val Asp Cys Ser Pro Glu Glu Lys Ile Lys Leu Gln
260 265 270
Glu Cys Glu Asn Met Ile Val Ile Gly Lys Thr Ile Phe Glu Leu Gly
275 280 285
Ile His Ser Tyr Asp Gly Ala Ser Tyr Asn Val Thr Tyr Ser Thr Trp
290 295 300
Gln Gln Asn Val Leu Asp Val Pro Leu Ala Leu Gln Asn Thr Phe Ser
305 310 315 320
Lys Asp Gly Met Cys Ile Ala Pro Phe Arg Asp Lys Ser Leu Leu Ala
325 330 335
Ser Asp Leu Asp Phe Arg Ile Ala Arg Trp Val Ser Pro Thr Phe Pro
340 345 350
Gly Ile Ile Val Gly Leu Phe Asp Val Phe Asn Asp Leu Arg Thr Asn
355 360 365
Glu Asn Ile Leu Val Pro His Pro Phe Asn Pro Gly Asp His Glu Ser
370 375 380
Ile Ser Ser Asn Lys Val Tyr Leu Asp Gln Thr Ser Asn Leu Ser Trp
385 390 395 400
Phe Ala Leu Ser Ser Gln Asn Phe Pro Ser Leu Val Glu Ser Ala Pro
405 410 415
Ile Ser Arg Tyr Ala Ser Ser Asp Arg Trp Arg Val Ser Ser Ile Phe
420 425 430
Glu Asp Glu Thr Leu Phe Lys Asn Ala Ile Met Gly Val His Gln Ile
435 440 445
Tyr Asn Asn Glu Tyr Asp His Leu Tyr Glu Asn Tyr Glu Lys Thr Asn
450 455 460
Ser Leu Asp Thr Thr His Lys Tyr Pro Pro Leu Met Ile Asp Ser Ser
465 470 475 480
Val Asp Thr Thr Asp Leu His Gln Asn Asn Glu Met Asn Ser Leu Lys
485 490 495
Glu Tyr Met Ser Pro Glu Asp Leu Glu Ala Tyr Arg Lys Lys Ile His
500 505 510
Glu Gln Ile Ser Arg Glu Leu Asp Glu Lys Asn Gln Asn Ser Leu Leu
515 520 525
Leu Lys Phe Gly Ser Leu Val Tyr Arg Ile Ile Glu Thr Gly Val Phe
530 535 540
Leu Leu Leu Phe Leu Ile Phe Cys Ala Ile Leu Gln Arg Phe Lys Ile
545 550 555 560
Leu Pro Pro Leu Tyr Val Leu Leu Ser Lys Ile Gly Phe Met Pro Glu
565 570 575
Lys Glu Ile Pro Ile Val Glu Ser Lys Ser Leu Asn Cys Pro Ser Ser
580 585 590
Ser Glu Asn Val Thr Lys Pro Phe Asp Met Lys Ser Gly Lys Gln Val
595 600 605
Val Phe Glu Gly Ala Val Asn Asp Gly Ser Leu Lys Ser Glu Lys Asp
610 615 620
Asn Asp Asp Ala Asp Glu Asp Asp Glu Lys Ser Leu Asp Leu Thr Thr
625 630 635 640
Glu Lys Lys Lys Arg Lys Arg Gly Ser Arg Gly Gly Lys Lys Gly Arg
645 650 655
Lys Ser Arg Ile Ala Asn Ile Pro Asn Phe Glu Gln Ser Leu Lys Asn
660 665 670
Leu Val Val Ser Glu Lys Ile Leu Gly Tyr Gly Ser Ser Gly Thr Val
675 680 685
Val Phe Gln Gly Ser Phe Gln Gly Arg Pro Val Ala Val Lys Arg Met
690 695 700
Leu Ile Asp Phe Cys Asp Ile Ala Leu Met Glu Ile Lys Leu Leu Thr
705 710 715 720
Glu Ser Asp Asp His Pro Asn Val Ile Arg Tyr Tyr Cys Ser Glu Thr
725 730 735
Thr Asp Arg Phe Leu Tyr Ile Ala Leu Glu Leu Cys Asn Leu Asn Leu
740 745 750
Gln Asp Leu Val Glu Ser Lys Asn Val Ser Asp Glu Asn Leu Lys Leu
755 760 765
Gln Lys Glu Tyr Asn Pro Ile Ser Leu Leu Arg Gln Ile Ala Ser Gly
770 775 780
Val Ala His Leu His Ser Leu Lys Ile Ile His Arg Asp Leu Lys Pro
785 790 795 800
Gln Asn Ile Leu Val Ser Thr Ser Ser Arg Phe Thr Ala Asp Gln Gln
805 810 815
Thr Gly Ala Glu Asn Leu Arg Ile Leu Ile Ser Asp Phe Gly Leu Cys
820 825 830
Lys Lys Leu Asp Ser Gly Gln Ser Ser Phe Arg Thr Asn Leu Asn Asn
835 840 845
Pro Ser Gly Thr Ser Gly Trp Arg Ala Pro Glu Leu Leu Glu Glu Ser
850 855 860
Asn Asn Leu Gln Cys Gln Val Glu Thr Glu His Ser Ser Ser Arg His
865 870 875 880
Thr Val Val Ser Ser Asp Ser Phe Tyr Asp Pro Phe Thr Lys Arg Arg
885 890 895
Leu Thr Arg Ser Ile Asp Ile Phe Ser Met Gly Cys Val Phe Tyr Tyr
900 905 910
Ile Leu Ser Lys Gly Lys His Pro Phe Gly Asp Lys Tyr Ser Arg Glu
915 920 925
Ser Asn Ile Ile Arg Gly Ile Phe Ser Leu Asp Glu Met Lys Cys Leu
930 935 940
His Asp Arg Ser Leu Ile Ala Glu Ala Thr Asp Leu Ile Ser Gln Met
945 950 955 960
Ile Asp His Asp Pro Leu Lys Arg Pro Thr Ala Met Lys Val Leu Arg
965 970 975
His Pro Leu Phe Trp Pro Lys Ser Lys Lys Leu Glu Phe Leu Leu Lys
980 985 990
Val Ser Asp Arg Leu Glu Ile Glu Asn Arg Asp Pro Pro Ser Ala Leu
995 1000 1005
Leu Met Lys Phe Asp Ala Gly Ser Asp Phe Val Ile Pro Ser Gly
1010 1015 1020
Asp Trp Thr Val Lys Phe Asp Lys Thr Phe Met Asp Asn Leu Glu
1025 1030 1035
Arg Tyr Arg Lys Tyr His Ser Ser Lys Leu Met Asp Leu Leu Arg
1040 1045 1050
Ala Leu Arg Asn Lys Tyr His His Phe Met Asp Leu Pro Glu Asp
1055 1060 1065
Ile Ala Glu Leu Met Gly Pro Val Pro Asp Gly Phe Tyr Asp Tyr
1070 1075 1080
Phe Thr Lys Arg Phe Pro Asn Leu Leu Ile Gly Val Tyr Met Ile
1085 1090 1095
Val Lys Glu Asn Leu Ser Asp Asp Gln Ile Leu Arg Glu Phe Leu
1100 1105 1110
Tyr Ser
1115
<210> 12
<211> 2865
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 12
atggtccttt tgaaatggct cgtatgccaa ttggtcttct ttaccgcttt ttcgcatgcg 60
tttaccgact atctattaaa gaagtgtgcg caatctgggt tttgccatag aaacagggtt 120
tatgcagaaa atattgccaa atctcatcac tgctattaca aagtggacgc cgagtctatt 180
gcacacgatc ctttagagaa tgtgcttcat gctaccataa ttaaaactat accaagattg 240
gagggcgatg atatagccgt tcagttccca ttctctctct cttttttaca ggatcactca 300
gtaaggttca ctataaatga gaaagagaga atgccaacca acagcagcgg tttgttgatc 360
tcttcacaac ggttcaatga gacctggaag tacgcattcg acaagaaatt tcaagaggag 420
gcgaacagga ccagtattcc acaattccac ttccttaagc aaaaacaaac tgtgaactca 480
ttctggtcga aaatatcttc atttttgtca ctttcaaact ccactgcaga cacatttcat 540
cttcgaaacg gtgatgtatc cgtagaaatc tttgctgaac cttttcaatt gaaagtttac 600
tggcaaaatg cgctgaaact tattgtaaac gagcaaaatt tcctgaacat tgaacatcat 660
agaactaagc aggaaaactt cgcacacgtg ctgccagaag aaacaacttt caacatgttt 720
aaggacaatt tcttgtattc aaagcatgac tctatgcctt tggggcctga atcggttgcg 780
ctagatttct ctttcatggg ttctactaat gtctacggta taccggaaca tgcgacgtcg 840
ctaaggctga tggacacttc aggtggaaag gaaccctaca ggcttttcaa cgttgatgtc 900
tttgagtaca acatcggtac cagccaacca atgtacggtt cgatcccatt catgttttca 960
tcttcgtcca catctatctt ttgggtcaat gcagctgaca cttgggtaga cataaagtat 1020
gacaccagta aaaataaaac gatgactcat tggatctccg aaaatggtgt catagatgta 1080
gtcatgtccc tggggccaga tattccaact atcattgaca aatttaccga tttgactggt 1140
agaccctttt taccgcccat ttcctctata gggtaccatc aatgtagatg gaattataat 1200
gatgagatgg acgttctcac agtggactct cagatggatg ctcatatgat tccttacgat 1260
tttatttggt tggacttgga gtatacgaac gacaaaaaat attttacttg gaagcagcac 1320
tcctttccca atccaaaaag gctgttatcc aaattaaaaa agttgggtag aaatcttgtc 1380
gtactaatcg atcctcattt aaagaaagat tatgaaatca gtgacagggt aattaatgaa 1440
aatgtagcag tcaaggatca caatggaaat gactatgtag gtcattgctg gccaggtaat 1500
tctatatgga ttgataccat aagcaaatat ggccaaaaga tttggaagtc ctttttcgaa 1560
cggtttatgg atctgccggc tgatttaact aatttattca tttggaatga tatgaacgag 1620
ccttcgattt tcgatggccc agagaccaca gctccaaaag atttgattca cgacaattac 1680
attgaggaaa gatccgtcca taacatatat ggtctatcag tgcatgaagc tacttacgac 1740
gcaataaaat cgatttattc accatccgat aagcgtcctt tccttctaac aagggctttt 1800
tttgccggct ctcaacgtac tgctgccaca tggactggtg acaatgtggc caattgggat 1860
tacttaaaga tttccattcc tatggttctg tcaaacaaca ttgctggtat gccatttata 1920
ggagccgaca tagctggctt tgctgaggat cctacacctg aattgattgc acgttggtac 1980
caagcgggct tatggtaccc attttttaga gcacacgccc atatagacac caagagaaga 2040
gaaccatact tattcaatga acctttgaag tcgatagtac gtgatattat ccaattgaga 2100
tatttcctgc tacctacctt atacaccatg tttcataaat caagtgtcac tggatttccg 2160
ataatgaatc caatgtttat tgaacaccct gaatttgctg aattgtatca tatcgataac 2220
caattttact ggagtaattc aggtctatta gtcaaacctg tcacggagcc tggtcaatca 2280
gaaacggaaa tggttttccc acccggtata ttctatgaat tcgcatcttt acactctttt 2340
ataaacaatg gtactgattt gatagaaaag aatatttctg caccattgga taaaattcca 2400
ttatttattg aaggcggtca cattatcact atgaaagata agtatagaag atcttcaatg 2460
ttaatgaaaa acgatccata tgtaatagtt atagcccctg ataccgaggg acgagccgtt 2520
ggagatcttt atgttgatga tggagaaact tttggctacc aaagaggtga gtacgtagaa 2580
actcagttca ttttcgaaaa caatacctta aaaaatgttc gaagtcatat tcccgagaat 2640
ttgacaggca ttcaccacaa tactttgagg aataccaata ttgaaaaaat cattatcgca 2700
aagaataatt tacaacacaa cataacgttg aaagacagta ttaaagtcaa aaaaaatggc 2760
gaagaaagtt cattgccgac tagatcgtca tatgagaatg ataataagat caccattctt 2820
aacctatcgc ttgacataac tgaagattgg gaagttattt tttga 2865
<210> 13
<211> 954
<212> PRT
<213> unknown
<220>
<223> genus Saccharomyces
<400> 13
Met Val Leu Leu Lys Trp Leu Val Cys Gln Leu Val Phe Phe Thr Ala
1 5 10 15
Phe Ser His Ala Phe Thr Asp Tyr Leu Leu Lys Lys Cys Ala Gln Ser
20 25 30
Gly Phe Cys His Arg Asn Arg Val Tyr Ala Glu Asn Ile Ala Lys Ser
35 40 45
His His Cys Tyr Tyr Lys Val Asp Ala Glu Ser Ile Ala His Asp Pro
50 55 60
Leu Glu Asn Val Leu His Ala Thr Ile Ile Lys Thr Ile Pro Arg Leu
65 70 75 80
Glu Gly Asp Asp Ile Ala Val Gln Phe Pro Phe Ser Leu Ser Phe Leu
85 90 95
Gln Asp His Ser Val Arg Phe Thr Ile Asn Glu Lys Glu Arg Met Pro
100 105 110
Thr Asn Ser Ser Gly Leu Leu Ile Ser Ser Gln Arg Phe Asn Glu Thr
115 120 125
Trp Lys Tyr Ala Phe Asp Lys Lys Phe Gln Glu Glu Ala Asn Arg Thr
130 135 140
Ser Ile Pro Gln Phe His Phe Leu Lys Gln Lys Gln Thr Val Asn Ser
145 150 155 160
Phe Trp Ser Lys Ile Ser Ser Phe Leu Ser Leu Ser Asn Ser Thr Ala
165 170 175
Asp Thr Phe His Leu Arg Asn Gly Asp Val Ser Val Glu Ile Phe Ala
180 185 190
Glu Pro Phe Gln Leu Lys Val Tyr Trp Gln Asn Ala Leu Lys Leu Ile
195 200 205
Val Asn Glu Gln Asn Phe Leu Asn Ile Glu His His Arg Thr Lys Gln
210 215 220
Glu Asn Phe Ala His Val Leu Pro Glu Glu Thr Thr Phe Asn Met Phe
225 230 235 240
Lys Asp Asn Phe Leu Tyr Ser Lys His Asp Ser Met Pro Leu Gly Pro
245 250 255
Glu Ser Val Ala Leu Asp Phe Ser Phe Met Gly Ser Thr Asn Val Tyr
260 265 270
Gly Ile Pro Glu His Ala Thr Ser Leu Arg Leu Met Asp Thr Ser Gly
275 280 285
Gly Lys Glu Pro Tyr Arg Leu Phe Asn Val Asp Val Phe Glu Tyr Asn
290 295 300
Ile Gly Thr Ser Gln Pro Met Tyr Gly Ser Ile Pro Phe Met Phe Ser
305 310 315 320
Ser Ser Ser Thr Ser Ile Phe Trp Val Asn Ala Ala Asp Thr Trp Val
325 330 335
Asp Ile Lys Tyr Asp Thr Ser Lys Asn Lys Thr Met Thr His Trp Ile
340 345 350
Ser Glu Asn Gly Val Ile Asp Val Val Met Ser Leu Gly Pro Asp Ile
355 360 365
Pro Thr Ile Ile Asp Lys Phe Thr Asp Leu Thr Gly Arg Pro Phe Leu
370 375 380
Pro Pro Ile Ser Ser Ile Gly Tyr His Gln Cys Arg Trp Asn Tyr Asn
385 390 395 400
Asp Glu Met Asp Val Leu Thr Val Asp Ser Gln Met Asp Ala His Met
405 410 415
Ile Pro Tyr Asp Phe Ile Trp Leu Asp Leu Glu Tyr Thr Asn Asp Lys
420 425 430
Lys Tyr Phe Thr Trp Lys Gln His Ser Phe Pro Asn Pro Lys Arg Leu
435 440 445
Leu Ser Lys Leu Lys Lys Leu Gly Arg Asn Leu Val Val Leu Ile Asp
450 455 460
Pro His Leu Lys Lys Asp Tyr Glu Ile Ser Asp Arg Val Ile Asn Glu
465 470 475 480
Asn Val Ala Val Lys Asp His Asn Gly Asn Asp Tyr Val Gly His Cys
485 490 495
Trp Pro Gly Asn Ser Ile Trp Ile Asp Thr Ile Ser Lys Tyr Gly Gln
500 505 510
Lys Ile Trp Lys Ser Phe Phe Glu Arg Phe Met Asp Leu Pro Ala Asp
515 520 525
Leu Thr Asn Leu Phe Ile Trp Asn Asp Met Asn Glu Pro Ser Ile Phe
530 535 540
Asp Gly Pro Glu Thr Thr Ala Pro Lys Asp Leu Ile His Asp Asn Tyr
545 550 555 560
Ile Glu Glu Arg Ser Val His Asn Ile Tyr Gly Leu Ser Val His Glu
565 570 575
Ala Thr Tyr Asp Ala Ile Lys Ser Ile Tyr Ser Pro Ser Asp Lys Arg
580 585 590
Pro Phe Leu Leu Thr Arg Ala Phe Phe Ala Gly Ser Gln Arg Thr Ala
595 600 605
Ala Thr Trp Thr Gly Asp Asn Val Ala Asn Trp Asp Tyr Leu Lys Ile
610 615 620
Ser Ile Pro Met Val Leu Ser Asn Asn Ile Ala Gly Met Pro Phe Ile
625 630 635 640
Gly Ala Asp Ile Ala Gly Phe Ala Glu Asp Pro Thr Pro Glu Leu Ile
645 650 655
Ala Arg Trp Tyr Gln Ala Gly Leu Trp Tyr Pro Phe Phe Arg Ala His
660 665 670
Ala His Ile Asp Thr Lys Arg Arg Glu Pro Tyr Leu Phe Asn Glu Pro
675 680 685
Leu Lys Ser Ile Val Arg Asp Ile Ile Gln Leu Arg Tyr Phe Leu Leu
690 695 700
Pro Thr Leu Tyr Thr Met Phe His Lys Ser Ser Val Thr Gly Phe Pro
705 710 715 720
Ile Met Asn Pro Met Phe Ile Glu His Pro Glu Phe Ala Glu Leu Tyr
725 730 735
His Ile Asp Asn Gln Phe Tyr Trp Ser Asn Ser Gly Leu Leu Val Lys
740 745 750
Pro Val Thr Glu Pro Gly Gln Ser Glu Thr Glu Met Val Phe Pro Pro
755 760 765
Gly Ile Phe Tyr Glu Phe Ala Ser Leu His Ser Phe Ile Asn Asn Gly
770 775 780
Thr Asp Leu Ile Glu Lys Asn Ile Ser Ala Pro Leu Asp Lys Ile Pro
785 790 795 800
Leu Phe Ile Glu Gly Gly His Ile Ile Thr Met Lys Asp Lys Tyr Arg
805 810 815
Arg Ser Ser Met Leu Met Lys Asn Asp Pro Tyr Val Ile Val Ile Ala
820 825 830
Pro Asp Thr Glu Gly Arg Ala Val Gly Asp Leu Tyr Val Asp Asp Gly
835 840 845
Glu Thr Phe Gly Tyr Gln Arg Gly Glu Tyr Val Glu Thr Gln Phe Ile
850 855 860
Phe Glu Asn Asn Thr Leu Lys Asn Val Arg Ser His Ile Pro Glu Asn
865 870 875 880
Leu Thr Gly Ile His His Asn Thr Leu Arg Asn Thr Asn Ile Glu Lys
885 890 895
Ile Ile Ile Ala Lys Asn Asn Leu Gln His Asn Ile Thr Leu Lys Asp
900 905 910
Ser Ile Lys Val Lys Lys Asn Gly Glu Glu Ser Ser Leu Pro Thr Arg
915 920 925
Ser Ser Tyr Glu Asn Asp Asn Lys Ile Thr Ile Leu Asn Leu Ser Leu
930 935 940
Asp Ile Thr Glu Asp Trp Glu Val Ile Phe
945 950
<210> 14
<211> 1218
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 14
atgttcagct tgaaagcatt attgccattg gccttgttgt tggtcagcgc caaccaagtt 60
gctgcaaaag tccacaaggc taaaatttat aaacacgagt tgtccgatga gatgaaagaa 120
gtcactttcg agcaacattt agctcattta ggccaaaagt acttgactca atttgagaaa 180
gctaaccccg aagttgtttt ttctagggag catcctttct tcactgaagg tggtcacgat 240
gttccattga caaattactt gaacgcacaa tattacactg acattacttt gggtactcca 300
cctcaaaact tcaaggttat tttggatact ggttcttcaa acctttgggt tccaagtaac 360
gaatgtggtt ccttggcttg tttcctacat tctaaatacg atcatgaagc ttcatcaagc 420
tacaaagcta atggtactga atttgccatt caatatggta ctggttcttt ggaaggttac 480
atttctcaag acactttgtc catcggggat ttgaccattc caaaacaaga cttcgctgag 540
gctaccagcg agccgggctt aacatttgca tttggcaagt tcgatggtat tttgggtttg 600
ggttacgata ccatttctgt tgataaggtg gtccctccat tttacaacgc cattcaacaa 660
gatttgttgg acgaaaagag atttgccttt tatttgggag acacttcaaa ggatactgaa 720
aatggcggtg aagccacctt tggtggtatt gacgagtcta agttcaaggg cgatatcact 780
tggttacctg ttcgtcgtaa ggcttactgg gaagtcaagt ttgaaggtat cggtttaggc 840
gacgagtacg ccgaattgga gagccatggt gccgccatcg atactggtac ttctttgatt 900
accttgccat caggattagc tgaaatgatt aatgctgaaa ttggggccaa gaagggttgg 960
accggtcaat atactctaga ctgtaacacc agagacaatc tacctgatct aattttcaac 1020
ttcaatggct acaacttcac tattgggcca tacgattaca cgcttgaagt ttcaggctcc 1080
tgtatctctg caattacacc aatggatttc ccagaacctg ttggcccact ggccatcgtt 1140
ggtgatgcct tcttgcgtaa atactattct atttacgatt tgggcaacaa tgcggttggt 1200
ttggccaaag caatttga 1218
<210> 15
<211> 405
<212> PRT
<213> unknown
<220>
<223> genus Saccharomyces
<400> 15
Met Phe Ser Leu Lys Ala Leu Leu Pro Leu Ala Leu Leu Leu Val Ser
1 5 10 15
Ala Asn Gln Val Ala Ala Lys Val His Lys Ala Lys Ile Tyr Lys His
20 25 30
Glu Leu Ser Asp Glu Met Lys Glu Val Thr Phe Glu Gln His Leu Ala
35 40 45
His Leu Gly Gln Lys Tyr Leu Thr Gln Phe Glu Lys Ala Asn Pro Glu
50 55 60
Val Val Phe Ser Arg Glu His Pro Phe Phe Thr Glu Gly Gly His Asp
65 70 75 80
Val Pro Leu Thr Asn Tyr Leu Asn Ala Gln Tyr Tyr Thr Asp Ile Thr
85 90 95
Leu Gly Thr Pro Pro Gln Asn Phe Lys Val Ile Leu Asp Thr Gly Ser
100 105 110
Ser Asn Leu Trp Val Pro Ser Asn Glu Cys Gly Ser Leu Ala Cys Phe
115 120 125
Leu His Ser Lys Tyr Asp His Glu Ala Ser Ser Ser Tyr Lys Ala Asn
130 135 140
Gly Thr Glu Phe Ala Ile Gln Tyr Gly Thr Gly Ser Leu Glu Gly Tyr
145 150 155 160
Ile Ser Gln Asp Thr Leu Ser Ile Gly Asp Leu Thr Ile Pro Lys Gln
165 170 175
Asp Phe Ala Glu Ala Thr Ser Glu Pro Gly Leu Thr Phe Ala Phe Gly
180 185 190
Lys Phe Asp Gly Ile Leu Gly Leu Gly Tyr Asp Thr Ile Ser Val Asp
195 200 205
Lys Val Val Pro Pro Phe Tyr Asn Ala Ile Gln Gln Asp Leu Leu Asp
210 215 220
Glu Lys Arg Phe Ala Phe Tyr Leu Gly Asp Thr Ser Lys Asp Thr Glu
225 230 235 240
Asn Gly Gly Glu Ala Thr Phe Gly Gly Ile Asp Glu Ser Lys Phe Lys
245 250 255
Gly Asp Ile Thr Trp Leu Pro Val Arg Arg Lys Ala Tyr Trp Glu Val
260 265 270
Lys Phe Glu Gly Ile Gly Leu Gly Asp Glu Tyr Ala Glu Leu Glu Ser
275 280 285
His Gly Ala Ala Ile Asp Thr Gly Thr Ser Leu Ile Thr Leu Pro Ser
290 295 300
Gly Leu Ala Glu Met Ile Asn Ala Glu Ile Gly Ala Lys Lys Gly Trp
305 310 315 320
Thr Gly Gln Tyr Thr Leu Asp Cys Asn Thr Arg Asp Asn Leu Pro Asp
325 330 335
Leu Ile Phe Asn Phe Asn Gly Tyr Asn Phe Thr Ile Gly Pro Tyr Asp
340 345 350
Tyr Thr Leu Glu Val Ser Gly Ser Cys Ile Ser Ala Ile Thr Pro Met
355 360 365
Asp Phe Pro Glu Pro Val Gly Pro Leu Ala Ile Val Gly Asp Ala Phe
370 375 380
Leu Arg Lys Tyr Tyr Ser Ile Tyr Asp Leu Gly Asn Asn Ala Val Gly
385 390 395 400
Leu Ala Lys Ala Ile
405
<210> 16
<211> 1197
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized nucleotide sequence of geranyl pyrophosphate:olivetolic acid
geranyltransferase (GOT) CsPT4
<400> 16
atgggtttat ctttggtctg caccttctcc tttcaaacta actaccacac tttattgaat 60
ccacataata agaatcctaa gaactcttta ttgtcctacc aacacccaaa gactcctatt 120
atcaagtcct cttacgataa cttcccatct aagtactgtt tgactaagaa tttccatttg 180
ttgggtttga attctcacaa cagaatttcc tcccaatccc gttctattag agccggttct 240
gatcaaatcg aaggttcccc tcatcatgag tccgataact ccattgctac taaaatttta 300
aatttcggtc atacttgttg gaagttgcaa cgtccttacg ttgtcaaggg tatgatctct 360
attgcttgtg gtttgttcgg tagagaattg tttaacaaca gacacttgtt ctcttggggt 420
ttgatgtgga aagctttctt cgctttggtc ccaattttgt ctttcaattt cttcgccgcc 480
atcatgaacc aaatctacga tgttgatatc gaccgtatca acaagccaga cttaccttta 540
gtttccggtg aaatgtccat tgaaactgct tggatcttgt ctatcattgt tgccttgact 600
ggtttaattg ttactattaa gttgaagtcc gctccattgt ttgtcttcat ctacatcttc 660
ggtatcttcg ctggtttcgc ttactccgtc ccacctatta gatggaaaca atatcctttt 720
accaatttct tgatcactat ttcctctcat gttggtttgg ctttcacttc ttactctgcc 780
accacttctg ctttaggttt gcctttcgtt tggcgtcctg ccttctcttt cattattgct 840
ttcatgactg tcatgggtat gactattgcc tttgctaaag acatttctga tatcgaaggt 900
gatgctaagt acggtgtctc taccgttgct accaagttag gtgctagaaa tatgactttt 960
gttgtttctg gtgtcttatt gttgaactac ttggtttcta tctctattgg tatcatttgg 1020
ccacaagttt tcaagtctaa cattatgatc ttgtctcatg ctattttggc tttctgtttg 1080
atctttcaaa ctcgtgaatt agccttagcc aattatgcct ctgccccatc ccgtcaattt 1140
ttcgaattca tctggttgtt atactatgcc gaatacttcg tttacgtctt catttaa 1197
<210> 17
<211> 398
<212> PRT
<213> Cannabis sativa
<400> 17
Met Gly Leu Ser Leu Val Cys Thr Phe Ser Phe Gln Thr Asn Tyr His
1 5 10 15
Thr Leu Leu Asn Pro His Asn Lys Asn Pro Lys Asn Ser Leu Leu Ser
20 25 30
Tyr Gln His Pro Lys Thr Pro Ile Ile Lys Ser Ser Tyr Asp Asn Phe
35 40 45
Pro Ser Lys Tyr Cys Leu Thr Lys Asn Phe His Leu Leu Gly Leu Asn
50 55 60
Ser His Asn Arg Ile Ser Ser Gln Ser Arg Ser Ile Arg Ala Gly Ser
65 70 75 80
Asp Gln Ile Glu Gly Ser Pro His His Glu Ser Asp Asn Ser Ile Ala
85 90 95
Thr Lys Ile Leu Asn Phe Gly His Thr Cys Trp Lys Leu Gln Arg Pro
100 105 110
Tyr Val Val Lys Gly Met Ile Ser Ile Ala Cys Gly Leu Phe Gly Arg
115 120 125
Glu Leu Phe Asn Asn Arg His Leu Phe Ser Trp Gly Leu Met Trp Lys
130 135 140
Ala Phe Phe Ala Leu Val Pro Ile Leu Ser Phe Asn Phe Phe Ala Ala
145 150 155 160
Ile Met Asn Gln Ile Tyr Asp Val Asp Ile Asp Arg Ile Asn Lys Pro
165 170 175
Asp Leu Pro Leu Val Ser Gly Glu Met Ser Ile Glu Thr Ala Trp Ile
180 185 190
Leu Ser Ile Ile Val Ala Leu Thr Gly Leu Ile Val Thr Ile Lys Leu
195 200 205
Lys Ser Ala Pro Leu Phe Val Phe Ile Tyr Ile Phe Gly Ile Phe Ala
210 215 220
Gly Phe Ala Tyr Ser Val Pro Pro Ile Arg Trp Lys Gln Tyr Pro Phe
225 230 235 240
Thr Asn Phe Leu Ile Thr Ile Ser Ser His Val Gly Leu Ala Phe Thr
245 250 255
Ser Tyr Ser Ala Thr Thr Ser Ala Leu Gly Leu Pro Phe Val Trp Arg
260 265 270
Pro Ala Phe Ser Phe Ile Ile Ala Phe Met Thr Val Met Gly Met Thr
275 280 285
Ile Ala Phe Ala Lys Asp Ile Ser Asp Ile Glu Gly Asp Ala Lys Tyr
290 295 300
Gly Val Ser Thr Val Ala Thr Lys Leu Gly Ala Arg Asn Met Thr Phe
305 310 315 320
Val Val Ser Gly Val Leu Leu Leu Asn Tyr Leu Val Ser Ile Ser Ile
325 330 335
Gly Ile Ile Trp Pro Gln Val Phe Lys Ser Asn Ile Met Ile Leu Ser
340 345 350
His Ala Ile Leu Ala Phe Cys Leu Ile Phe Gln Thr Arg Glu Leu Ala
355 360 365
Leu Ala Asn Tyr Ala Ser Ala Pro Ser Arg Gln Phe Phe Glu Phe Ile
370 375 380
Trp Leu Leu Tyr Tyr Ala Glu Tyr Phe Val Tyr Val Phe Ile
385 390 395
<210> 18
<211> 1158
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized nucleotide sequence of tetraketide synthase (TKS)
<400> 18
atgaaccatt taagagctga gggtccagct tccgtcttgg ctatcggtac tgctaatcca 60
gagaacattt tattacaaga tgagtttcca gattactatt tccgtgttac taagtccgag 120
catatgaccc aattgaaaga aaagttccgt aaaatctgtg ataaatctat gattagaaaa 180
agaaactgct ttttaaacga agaacacttg aagcaaaacc caagattagt tgaacacgag 240
atgcaaacct tggacgctag acaagatatg ttggttgtcg aggttcctaa attgggtaaa 300
gacgcctgtg ctaaagctat caaagagtgg ggtcaaccta agtccaagat cactcactta 360
atcttcactt ccgcttccac cactgacatg cctggtgctg attaccactg tgccaagttg 420
ttgggtttgt ctccttctgt caagagagtt atgatgtacc aattaggttg ttacggtggt 480
ggtactgtct taagaattgc taaggacatc gctgaaaaca acaaaggtgc tagagtttta 540
gccgtttgtt gtgacatcat ggcttgttta tttcgtggtc catctgaatc tgacttggag 600
ttgttggttg gtcaagctat ttttggtgat ggtgccgctg ccgtcatcgt tggtgctgag 660
ccagatgaat ccgttggtga aagaccaatt ttcgaattag tctctactgg tcaaactatt 720
ttgccaaact ccgagggtac tatcggtggt catattcgtg aagccggttt aatctttgat 780
ttgcacaaag acgttccaat gttgatctct aacaacatcg aaaagtgttt aattgaggct 840
tttactccaa ttggtatctc tgactggaac tctatcttct ggatcactca tccaggtggt 900
aaggctatct tggacaaggt tgaagaaaaa ttacatttaa agtccgataa attcgtcgat 960
tctcgtcatg ttttgtctga acacggtaac atgtcttcct ccactgtctt gtttgttatg 1020
gatgaattac gtaagagatc tttggaggag ggtaagtcta ctactggtga tggtttcgaa 1080
tggggtgttt tgttcggttt cggtcctggt ttgactgttg aacgtgttgt tgttagatct 1140
gttccaatta agtactag 1158
<210> 19
<211> 385
<212> PRT
<213> Cannabis sativa
<400> 19
Met Asn His Leu Arg Ala Glu Gly Pro Ala Ser Val Leu Ala Ile Gly
1 5 10 15
Thr Ala Asn Pro Glu Asn Ile Leu Leu Gln Asp Glu Phe Pro Asp Tyr
20 25 30
Tyr Phe Arg Val Thr Lys Ser Glu His Met Thr Gln Leu Lys Glu Lys
35 40 45
Phe Arg Lys Ile Cys Asp Lys Ser Met Ile Arg Lys Arg Asn Cys Phe
50 55 60
Leu Asn Glu Glu His Leu Lys Gln Asn Pro Arg Leu Val Glu His Glu
65 70 75 80
Met Gln Thr Leu Asp Ala Arg Gln Asp Met Leu Val Val Glu Val Pro
85 90 95
Lys Leu Gly Lys Asp Ala Cys Ala Lys Ala Ile Lys Glu Trp Gly Gln
100 105 110
Pro Lys Ser Lys Ile Thr His Leu Ile Phe Thr Ser Ala Ser Thr Thr
115 120 125
Asp Met Pro Gly Ala Asp Tyr His Cys Ala Lys Leu Leu Gly Leu Ser
130 135 140
Pro Ser Val Lys Arg Val Met Met Tyr Gln Leu Gly Cys Tyr Gly Gly
145 150 155 160
Gly Thr Val Leu Arg Ile Ala Lys Asp Ile Ala Glu Asn Asn Lys Gly
165 170 175
Ala Arg Val Leu Ala Val Cys Cys Asp Ile Met Ala Cys Leu Phe Arg
180 185 190
Gly Pro Ser Glu Ser Asp Leu Glu Leu Leu Val Gly Gln Ala Ile Phe
195 200 205
Gly Asp Gly Ala Ala Ala Val Ile Val Gly Ala Glu Pro Asp Glu Ser
210 215 220
Val Gly Glu Arg Pro Ile Phe Glu Leu Val Ser Thr Gly Gln Thr Ile
225 230 235 240
Leu Pro Asn Ser Glu Gly Thr Ile Gly Gly His Ile Arg Glu Ala Gly
245 250 255
Leu Ile Phe Asp Leu His Lys Asp Val Pro Met Leu Ile Ser Asn Asn
260 265 270
Ile Glu Lys Cys Leu Ile Glu Ala Phe Thr Pro Ile Gly Ile Ser Asp
275 280 285
Trp Asn Ser Ile Phe Trp Ile Thr His Pro Gly Gly Lys Ala Ile Leu
290 295 300
Asp Lys Val Glu Glu Lys Leu His Leu Lys Ser Asp Lys Phe Val Asp
305 310 315 320
Ser Arg His Val Leu Ser Glu His Gly Asn Met Ser Ser Ser Thr Val
325 330 335
Leu Phe Val Met Asp Glu Leu Arg Lys Arg Ser Leu Glu Glu Gly Lys
340 345 350
Ser Thr Thr Gly Asp Gly Phe Glu Trp Gly Val Leu Phe Gly Phe Gly
355 360 365
Pro Gly Leu Thr Val Glu Arg Val Val Val Arg Ser Val Pro Ile Lys
370 375 380
Tyr
385
<210> 20
<211> 306
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized nucleotide sequence of olivetolic acid cyclase (OAC)
<400> 20
atggccgtca aacacttgat cgtcttaaaa ttcaaggatg aaattactga agctcaaaaa 60
gaagagttct tcaaaaccta tgtcaattta gtcaacatta ttcctgctat gaaggacgtt 120
tactggggta aggatgtcac ccaaaagaac aaggaagaag gttacactca cattgttgaa 180
gtcactttcg aatctgttga aactatccaa gattatatta tccacccagc tcatgtcggt 240
tttggtgatg tttacagatc tttttgggaa aaattgttga tctttgacta tactccaaga 300
aaataa 306
<210> 21
<211> 101
<212> PRT
<213> Cannabis sativa
<400> 21
Met Ala Val Lys His Leu Ile Val Leu Lys Phe Lys Asp Glu Ile Thr
1 5 10 15
Glu Ala Gln Lys Glu Glu Phe Phe Lys Thr Tyr Val Asn Leu Val Asn
20 25 30
Ile Ile Pro Ala Met Lys Asp Val Tyr Trp Gly Lys Asp Val Thr Gln
35 40 45
Lys Asn Lys Glu Glu Gly Tyr Thr His Ile Val Glu Val Thr Phe Glu
50 55 60
Ser Val Glu Thr Ile Gln Asp Tyr Ile Ile His Pro Ala His Val Gly
65 70 75 80
Phe Gly Asp Val Tyr Arg Ser Phe Trp Glu Lys Leu Leu Ile Phe Asp
85 90 95
Tyr Thr Pro Arg Lys
100
<210> 22
<211> 2163
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized nucleotide sequence of acyl-activating enzyme Cs_AAE1
<400> 22
atgggtaaga attacaagtc cttagactct gttgttgctt ctgactttat tgctttaggt 60
attacttccg aagttgctga aaccttacac ggtagattgg ctgaaattgt ttgcaactac 120
ggtgctgcta cccctcaaac ttggattaac attgctaatc atattttgtc tccagatttg 180
ccattttctt tacaccaaat gttgttctac ggttgttaca aggatttcgg tcctgctcct 240
ccagcttgga ttcctgatcc agaaaaagtc aaatctacta acttgggtgc tttgttggaa 300
aagagaggta aggagttttt gggtgttaag tacaaggacc caatttcttc tttctctcac 360
ttccaagaat tctctgttag aaaccctgaa gtttactgga gaactgtttt gatggatgag 420
atgaagattt ctttttctaa ggacccagag tgtatcttaa gaagagacga cattaacaat 480
ccaggtggtt ctgagtggtt accaggtggt tacttgaact ctgccaaaaa ttgcttgaac 540
gttaactcta acaagaaatt gaatgacact atgattgtct ggagagatga gggtaacgat 600
gatttgcctt tgaataaatt gactttggat caattgagaa aaagagtctg gttggttggt 660
tacgctttgg aagaaatggg tttagaaaaa ggttgtgcta tcgccatcga tatgcctatg 720
cacgttgatg ctgttgttat ttatttggct attgttttag ctggttatgt tgttgtttcc 780
atcgccgact ccttctctgc tccagaaatc tccaccagat tgagattgtc taaagccaaa 840
gccattttca cccaagacca catcattaga ggtaagaagc gtattccatt gtattctcgt 900
gttgttgaag ctaaatctcc tatggctatc gtcatcccat gctctggttc taacatcggt 960
gctgaattaa gagacggtga tatttcttgg gactactttt tagaaagagc taaagaattc 1020
aaaaactgcg agtttactgc tagagaacaa cctgtcgacg cttatactaa tattttattc 1080
tcttctggta ctactggtga acctaaggct attccatgga cccaagctac tcctttgaaa 1140
gccgctgctg atggttggtc ccatttagac atcagaaaag gtgatgtcat cgtctggcca 1200
actaacttag gttggatgat gggtccatgg ttagtctacg cttctttgtt gaatggtgcc 1260
tctatcgcct tatataatgg ttccccttta gtctctggtt ttgctaaatt cgttcaagat 1320
gctaaggtta ccatgttagg tgttgtccct tctatcgtta gatcttggaa atctactaac 1380
tgtgtttctg gttacgactg gtccactatt cgttgtttct cttcttctgg tgaagcttcc 1440
aatgtcgatg agtacttatg gttaatgggt cgtgctaact acaagccagt catcgaaatg 1500
tgcggtggta ctgaaattgg tggtgctttt tccgctggtt cttttttaca agcccaatcc 1560
ttgtcttcct tctcctctca atgtatgggt tgtactttat atatcttaga taagaatggt 1620
taccctatgc ctaaaaacaa gccaggtatt ggtgaattag ctttgggtcc tgttatgttt 1680
ggtgcttcta aaaccttgtt aaatggtaat catcacgacg tttacttcaa aggtatgcct 1740
actttgaacg gtgaggtttt gagacgtcat ggtgatattt tcgaattaac ttccaacggt 1800
tattatcacg ctcacggtag agctgatgat actatgaaca ttggtggtat taagatctct 1860
tccatcgaaa ttgagagagt ttgtaacgag gttgacgatc gtgttttcga aactactgct 1920
attggtgtcc ctcctttagg tggtggtcca gaacaattgg ttatcttttt cgtcttgaag 1980
gactccaacg acaccactat cgacttaaac caattaagat tgtctttcaa cttgggtttg 2040
caaaagaagt tgaatccatt atttaaggtt actcgtgtcg ttccattgtc ctccttgcca 2100
agaactgcta ccaacaagat tatgcgtaga gtcttgagac aacaattctc tcactttgag 2160
taa 2163
<210> 23
<211> 720
<212> PRT
<213> Cannabis sativa
<400> 23
Met Gly Lys Asn Tyr Lys Ser Leu Asp Ser Val Val Ala Ser Asp Phe
1 5 10 15
Ile Ala Leu Gly Ile Thr Ser Glu Val Ala Glu Thr Leu His Gly Arg
20 25 30
Leu Ala Glu Ile Val Cys Asn Tyr Gly Ala Ala Thr Pro Gln Thr Trp
35 40 45
Ile Asn Ile Ala Asn His Ile Leu Ser Pro Asp Leu Pro Phe Ser Leu
50 55 60
His Gln Met Leu Phe Tyr Gly Cys Tyr Lys Asp Phe Gly Pro Ala Pro
65 70 75 80
Pro Ala Trp Ile Pro Asp Pro Glu Lys Val Lys Ser Thr Asn Leu Gly
85 90 95
Ala Leu Leu Glu Lys Arg Gly Lys Glu Phe Leu Gly Val Lys Tyr Lys
100 105 110
Asp Pro Ile Ser Ser Phe Ser His Phe Gln Glu Phe Ser Val Arg Asn
115 120 125
Pro Glu Val Tyr Trp Arg Thr Val Leu Met Asp Glu Met Lys Ile Ser
130 135 140
Phe Ser Lys Asp Pro Glu Cys Ile Leu Arg Arg Asp Asp Ile Asn Asn
145 150 155 160
Pro Gly Gly Ser Glu Trp Leu Pro Gly Gly Tyr Leu Asn Ser Ala Lys
165 170 175
Asn Cys Leu Asn Val Asn Ser Asn Lys Lys Leu Asn Asp Thr Met Ile
180 185 190
Val Trp Arg Asp Glu Gly Asn Asp Asp Leu Pro Leu Asn Lys Leu Thr
195 200 205
Leu Asp Gln Leu Arg Lys Arg Val Trp Leu Val Gly Tyr Ala Leu Glu
210 215 220
Glu Met Gly Leu Glu Lys Gly Cys Ala Ile Ala Ile Asp Met Pro Met
225 230 235 240
His Val Asp Ala Val Val Ile Tyr Leu Ala Ile Val Leu Ala Gly Tyr
245 250 255
Val Val Val Ser Ile Ala Asp Ser Phe Ser Ala Pro Glu Ile Ser Thr
260 265 270
Arg Leu Arg Leu Ser Lys Ala Lys Ala Ile Phe Thr Gln Asp His Ile
275 280 285
Ile Arg Gly Lys Lys Arg Ile Pro Leu Tyr Ser Arg Val Val Glu Ala
290 295 300
Lys Ser Pro Met Ala Ile Val Ile Pro Cys Ser Gly Ser Asn Ile Gly
305 310 315 320
Ala Glu Leu Arg Asp Gly Asp Ile Ser Trp Asp Tyr Phe Leu Glu Arg
325 330 335
Ala Lys Glu Phe Lys Asn Cys Glu Phe Thr Ala Arg Glu Gln Pro Val
340 345 350
Asp Ala Tyr Thr Asn Ile Leu Phe Ser Ser Gly Thr Thr Gly Glu Pro
355 360 365
Lys Ala Ile Pro Trp Thr Gln Ala Thr Pro Leu Lys Ala Ala Ala Asp
370 375 380
Gly Trp Ser His Leu Asp Ile Arg Lys Gly Asp Val Ile Val Trp Pro
385 390 395 400
Thr Asn Leu Gly Trp Met Met Gly Pro Trp Leu Val Tyr Ala Ser Leu
405 410 415
Leu Asn Gly Ala Ser Ile Ala Leu Tyr Asn Gly Ser Pro Leu Val Ser
420 425 430
Gly Phe Ala Lys Phe Val Gln Asp Ala Lys Val Thr Met Leu Gly Val
435 440 445
Val Pro Ser Ile Val Arg Ser Trp Lys Ser Thr Asn Cys Val Ser Gly
450 455 460
Tyr Asp Trp Ser Thr Ile Arg Cys Phe Ser Ser Ser Gly Glu Ala Ser
465 470 475 480
Asn Val Asp Glu Tyr Leu Trp Leu Met Gly Arg Ala Asn Tyr Lys Pro
485 490 495
Val Ile Glu Met Cys Gly Gly Thr Glu Ile Gly Gly Ala Phe Ser Ala
500 505 510
Gly Ser Phe Leu Gln Ala Gln Ser Leu Ser Ser Phe Ser Ser Gln Cys
515 520 525
Met Gly Cys Thr Leu Tyr Ile Leu Asp Lys Asn Gly Tyr Pro Met Pro
530 535 540
Lys Asn Lys Pro Gly Ile Gly Glu Leu Ala Leu Gly Pro Val Met Phe
545 550 555 560
Gly Ala Ser Lys Thr Leu Leu Asn Gly Asn His His Asp Val Tyr Phe
565 570 575
Lys Gly Met Pro Thr Leu Asn Gly Glu Val Leu Arg Arg His Gly Asp
580 585 590
Ile Phe Glu Leu Thr Ser Asn Gly Tyr Tyr His Ala His Gly Arg Ala
595 600 605
Asp Asp Thr Met Asn Ile Gly Gly Ile Lys Ile Ser Ser Ile Glu Ile
610 615 620
Glu Arg Val Cys Asn Glu Val Asp Asp Arg Val Phe Glu Thr Thr Ala
625 630 635 640
Ile Gly Val Pro Pro Leu Gly Gly Gly Pro Glu Gln Leu Val Ile Phe
645 650 655
Phe Val Leu Lys Asp Ser Asn Asp Thr Thr Ile Asp Leu Asn Gln Leu
660 665 670
Arg Leu Ser Phe Asn Leu Gly Leu Gln Lys Lys Leu Asn Pro Leu Phe
675 680 685
Lys Val Thr Arg Val Val Pro Leu Ser Ser Leu Pro Arg Thr Ala Thr
690 695 700
Asn Lys Ile Met Arg Arg Val Leu Arg Gln Gln Phe Ser His Phe Glu
705 710 715 720
<210> 24
<211> 867
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 24
atgactgccg acaacaatag tatgccccat ggtgcagtat ctagttacgc caaattagtg 60
caaaaccaaa cacctgaaga cattttggaa gagtttcctg aaattattcc attacaacaa 120
agacctaata cccgatctag tgagacgtca aatgacgaaa gcggagaaac atgtttttct 180
ggtcatgatg aggagcaaat taagttaatg aatgaaaatt gtattgtttt ggattgggac 240
gataatgcta ttggtgccgg taccaagaaa gtttgtcatt taatggaaaa tattgaaaag 300
ggtttactac atcgtgcatt ctccgtcttt attttcaatg aacaaggtga attactttta 360
caacaaagag ccactgaaaa aataactttc cctgatcttt ggactaacac atgctgctct 420
catccactat gtattgatga cgaattaggt ttgaagggta agctagacga taagattaag 480
ggcgctatta ctgcggcggt gagaaaacta gatcatgaat taggtattcc agaagatgaa 540
actaagacaa ggggtaagtt tcacttttta aacagaatcc attacatggc accaagcaat 600
gaaccatggg gtgaacatga aattgattac atcctatttt ataagatcaa cgctaaagaa 660
aacttgactg tcaacccaaa cgtcaatgaa gttagagact tcaaatgggt ttcaccaaat 720
gatttgaaaa ctatgtttgc tgacccaagt tacaagttta cgccttggtt taagattatt 780
tgcgagaatt acttattcaa ctggtgggag caattagatg acctttctga agtggaaaat 840
gacaggcaaa ttcatagaat gctataa 867
<210> 25
<211> 288
<212> PRT
<213> unknown
<220>
<223> genus Saccharomyces
<400> 25
Met Thr Ala Asp Asn Asn Ser Met Pro His Gly Ala Val Ser Ser Tyr
1 5 10 15
Ala Lys Leu Val Gln Asn Gln Thr Pro Glu Asp Ile Leu Glu Glu Phe
20 25 30
Pro Glu Ile Ile Pro Leu Gln Gln Arg Pro Asn Thr Arg Ser Ser Glu
35 40 45
Thr Ser Asn Asp Glu Ser Gly Glu Thr Cys Phe Ser Gly His Asp Glu
50 55 60
Glu Gln Ile Lys Leu Met Asn Glu Asn Cys Ile Val Leu Asp Trp Asp
65 70 75 80
Asp Asn Ala Ile Gly Ala Gly Thr Lys Lys Val Cys His Leu Met Glu
85 90 95
Asn Ile Glu Lys Gly Leu Leu His Arg Ala Phe Ser Val Phe Ile Phe
100 105 110
Asn Glu Gln Gly Glu Leu Leu Leu Gln Gln Arg Ala Thr Glu Lys Ile
115 120 125
Thr Phe Pro Asp Leu Trp Thr Asn Thr Cys Cys Ser His Pro Leu Cys
130 135 140
Ile Asp Asp Glu Leu Gly Leu Lys Gly Lys Leu Asp Asp Lys Ile Lys
145 150 155 160
Gly Ala Ile Thr Ala Ala Val Arg Lys Leu Asp His Glu Leu Gly Ile
165 170 175
Pro Glu Asp Glu Thr Lys Thr Arg Gly Lys Phe His Phe Leu Asn Arg
180 185 190
Ile His Tyr Met Ala Pro Ser Asn Glu Pro Trp Gly Glu His Glu Ile
195 200 205
Asp Tyr Ile Leu Phe Tyr Lys Ile Asn Ala Lys Glu Asn Leu Thr Val
210 215 220
Asn Pro Asn Val Asn Glu Val Arg Asp Phe Lys Trp Val Ser Pro Asn
225 230 235 240
Asp Leu Lys Thr Met Phe Ala Asp Pro Ser Tyr Lys Phe Thr Pro Trp
245 250 255
Phe Lys Ile Ile Cys Glu Asn Tyr Leu Phe Asn Trp Trp Glu Gln Leu
260 265 270
Asp Asp Leu Ser Glu Val Glu Asn Asp Arg Gln Ile His Arg Met Leu
275 280 285
<210> 26
<211> 1575
<212> DNA
<213> Artificial sequence
<220>
<223> truncated 3-hydroxy-3-methyl-glutaryl-CoA reductase (tHMG1,
tHMGR)
<400> 26
atgcaattgg tgaagactga agtcaccaag aagtctttta ctgctcctgt acaaaaggct 60
tctacaccag ttttaaccaa taaaacagtc atttctggat cgaaagtcaa aagtttatca 120
tctgcgcaat cgagctcatc aggaccttca tcatctagtg aggaagatga ttcccgcgat 180
attgaaagct tggataagaa aatacgtcct ttagaagaat tagaagcatt attaagtagt 240
ggaaatacaa aacaattgaa gaacaaagag gtcgctgcct tggttattca cggtaagtta 300
cctttgtacg ctttggagaa aaaattaggt gatactacga gagcggttgc ggtacgtagg 360
aaggctcttt caattttggc agaagctcct gtattagcat ctgatcgttt accatataaa 420
aattatgact acgaccgcgt atttggcgct tgttgtgaaa atgttatagg ttacatgcct 480
ttgcccgttg gtgttatagg ccccttggtt atcgatggta catcttatca tataccaatg 540
gcaactacag agggttgttt ggtagcttct gccatgcgtg gctgtaaggc aatcaatgct 600
ggcggtggtg caacaactgt tttaactaag gatggtatga caagaggccc agtagtccgt 660
ttcccaactt tgaaaagatc tggtgcctgt aagatatggt tagactcaga agagggacaa 720
aacgcaatta aaaaagcttt taactctaca tcaagatttg cacgtctgca acatattcaa 780
acttgtctag caggagattt actcttcatg agatttagaa caactactgg tgacgcaatg 840
ggtatgaata tgatttctaa gggtgtcgaa tactcattaa agcaaatggt agaagagtat 900
ggctgggaag atatggaggt tgtctccgtt tctggtaact actgtaccga caaaaaacca 960
gctgccatca actggatcga aggtcgtggt aagagtgtcg tcgcagaagc tactattcct 1020
ggtgatgttg tcagaaaagt gttaaaaagt gatgtttccg cattggttga gttgaacatt 1080
gctaagaatt tggttggatc tgcaatggct gggtctgttg gtggatttaa cgcacatgca 1140
gctaatttag tgacagctgt tttcttggca ttaggacaag atcctgcaca aaatgtcgaa 1200
agttccaact gtataacatt gatgaaagaa gtggacggtg atttgagaat ttccgtatcc 1260
atgccatcca tcgaagtagg taccatcggt ggtggtactg ttctagaacc acaaggtgcc 1320
atgttggact tattaggtgt aagaggccca catgctaccg ctcctggtac caacgcacgt 1380
caattagcaa gaatagttgc ctgtgccgtc ttggcaggtg aattatcctt atgtgctgcc 1440
ctagcagccg gccatttggt tcaaagtcat atgacccaca acaggaaacc tgctgaacca 1500
acaaaaccta acaatttgga cgccactgat ataaatcgtt tgaaagatgg gtccgtcacc 1560
tgcattaaat cctaa 1575
<210> 27
<211> 524
<212> PRT
<213> Artificial sequence
<220>
<223> truncated 3-hydroxy-3-methyl-glutaryl-CoA reductase (tHMG1,
tHMGR)
<400> 27
Met Gln Leu Val Lys Thr Glu Val Thr Lys Lys Ser Phe Thr Ala Pro
1 5 10 15
Val Gln Lys Ala Ser Thr Pro Val Leu Thr Asn Lys Thr Val Ile Ser
20 25 30
Gly Ser Lys Val Lys Ser Leu Ser Ser Ala Gln Ser Ser Ser Ser Gly
35 40 45
Pro Ser Ser Ser Ser Glu Glu Asp Asp Ser Arg Asp Ile Glu Ser Leu
50 55 60
Asp Lys Lys Ile Arg Pro Leu Glu Glu Leu Glu Ala Leu Leu Ser Ser
65 70 75 80
Gly Asn Thr Lys Gln Leu Lys Asn Lys Glu Val Ala Ala Leu Val Ile
85 90 95
His Gly Lys Leu Pro Leu Tyr Ala Leu Glu Lys Lys Leu Gly Asp Thr
100 105 110
Thr Arg Ala Val Ala Val Arg Arg Lys Ala Leu Ser Ile Leu Ala Glu
115 120 125
Ala Pro Val Leu Ala Ser Asp Arg Leu Pro Tyr Lys Asn Tyr Asp Tyr
130 135 140
Asp Arg Val Phe Gly Ala Cys Cys Glu Asn Val Ile Gly Tyr Met Pro
145 150 155 160
Leu Pro Val Gly Val Ile Gly Pro Leu Val Ile Asp Gly Thr Ser Tyr
165 170 175
His Ile Pro Met Ala Thr Thr Glu Gly Cys Leu Val Ala Ser Ala Met
180 185 190
Arg Gly Cys Lys Ala Ile Asn Ala Gly Gly Gly Ala Thr Thr Val Leu
195 200 205
Thr Lys Asp Gly Met Thr Arg Gly Pro Val Val Arg Phe Pro Thr Leu
210 215 220
Lys Arg Ser Gly Ala Cys Lys Ile Trp Leu Asp Ser Glu Glu Gly Gln
225 230 235 240
Asn Ala Ile Lys Lys Ala Phe Asn Ser Thr Ser Arg Phe Ala Arg Leu
245 250 255
Gln His Ile Gln Thr Cys Leu Ala Gly Asp Leu Leu Phe Met Arg Phe
260 265 270
Arg Thr Thr Thr Gly Asp Ala Met Gly Met Asn Met Ile Ser Lys Gly
275 280 285
Val Glu Tyr Ser Leu Lys Gln Met Val Glu Glu Tyr Gly Trp Glu Asp
290 295 300
Met Glu Val Val Ser Val Ser Gly Asn Tyr Cys Thr Asp Lys Lys Pro
305 310 315 320
Ala Ala Ile Asn Trp Ile Glu Gly Arg Gly Lys Ser Val Val Ala Glu
325 330 335
Ala Thr Ile Pro Gly Asp Val Val Arg Lys Val Leu Lys Ser Asp Val
340 345 350
Ser Ala Leu Val Glu Leu Asn Ile Ala Lys Asn Leu Val Gly Ser Ala
355 360 365
Met Ala Gly Ser Val Gly Gly Phe Asn Ala His Ala Ala Asn Leu Val
370 375 380
Thr Ala Val Phe Leu Ala Leu Gly Gln Asp Pro Ala Gln Asn Val Glu
385 390 395 400
Ser Ser Asn Cys Ile Thr Leu Met Lys Glu Val Asp Gly Asp Leu Arg
405 410 415
Ile Ser Val Ser Met Pro Ser Ile Glu Val Gly Thr Ile Gly Gly Gly
420 425 430
Thr Val Leu Glu Pro Gln Gly Ala Met Leu Asp Leu Leu Gly Val Arg
435 440 445
Gly Pro His Ala Thr Ala Pro Gly Thr Asn Ala Arg Gln Leu Ala Arg
450 455 460
Ile Val Ala Cys Ala Val Leu Ala Gly Glu Leu Ser Leu Cys Ala Ala
465 470 475 480
Leu Ala Ala Gly His Leu Val Gln Ser His Met Thr His Asn Arg Lys
485 490 495
Pro Ala Glu Pro Thr Lys Pro Asn Asn Leu Asp Ala Thr Asp Ile Asn
500 505 510
Arg Leu Lys Asp Gly Ser Val Thr Cys Ile Lys Ser
515 520
<210> 28
<211> 1387
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 28
atgactgaac taaaaaaaca aaagaccgct gaacaaaaaa ccagacctca aaatgtcggt 60
attaaaggta tccaaattta catcccaact caatgtgtca accaatctga gctagagaaa 120
tttgatggcg tttctcaagg taaatacaca attggtctgg gccaaaccaa catgtctttt 180
gtcaatgaca gagaagatat ctactcgatg tccctaactg ttttgtctaa gttgatcaag 240
agttacaaca tcgacaccaa caaaattggt agattagaag tcggtactga aactctgatt 300
gacaagtcca agtctgtcaa gtctgtcttg atgcaattgt ttggtgaaaa cactgacgtc 360
gaaggtattg acacgcttaa tgcctgttac ggtggtacca acgcgttgtt caactctttg 420
aactggattg aatctaacgc atgggatggt agagacgcca ttgtagtttg cggtgatatt 480
gccatctacg ataagggtgc cgcaagacca accggtggtg ccggtactgt tgctatgtgg 540
atcggtcctg atgctccaat tgtatttgac tctgtaagag cttcttacat ggaacacgcc 600
tacgattttt acaagccaga tttcaccagc gaatatcctt acgtcgatgg tcatttttca 660
ttaacttgtt acgtcaaggc tcttgatcaa gtttacaaga gttattccaa gaaggctatt 720
tctaaagggt tggttagcga tcccgctggt tcggatgctt tgaacgtttt gaaatatttc 780
gactacaacg ttttccatgt tccaacctgt aaattggtca caaaatcata cggtagatta 840
ctatataacg atttcagagc caatcctcaa ttgttcccag aagttgacgc cgaattagct 900
actcgcgatt atgacgaatc tttaaccgat aagaacattg aaaaaacttt tgttaatgtt 960
gctaagccat tccacaaaga gagagttgcc caatctttga ttgttccaac aaacacaggt 1020
aacatgtaca ccgcatctgt ttatgccgcc tttgcatctc tattaaacta tgttggatct 1080
gacgacttac aaggcaagcg tgttggttta ttttcttacg gttccggttt agctgcatct 1140
ctatattctt gcaaaattgt tggtgacgtc caacatatta tcaaggaatt agatattact 1200
aacaaattag ccaagagaat caccgaaact ccaaaggatt acgaagctgc catcgaattg 1260
agagaaaatg cccatttgaa gaagaacttc aaacctcaag gttccattga gcatttgcaa 1320
agtggtgttt actacttgac caacatcgat gacaaattta gaagatctta cgatgttaaa 1380
aaataat 1387
<210> 29
<211> 461
<212> PRT
<213> unknown
<220>
<223> genus Saccharomyces
<400> 29
Met Thr Glu Leu Lys Lys Gln Lys Thr Ala Glu Gln Lys Thr Arg Pro
1 5 10 15
Gln Asn Val Gly Ile Lys Gly Ile Gln Ile Tyr Ile Pro Thr Gln Cys
20 25 30
Val Asn Gln Ser Glu Leu Glu Lys Phe Asp Gly Val Ser Gln Gly Lys
35 40 45
Tyr Thr Ile Gly Leu Gly Gln Thr Asn Met Ser Phe Val Asn Asp Arg
50 55 60
Glu Asp Ile Tyr Ser Met Ser Leu Thr Val Leu Ser Lys Leu Ile Lys
65 70 75 80
Ser Tyr Asn Ile Asp Thr Asn Lys Ile Gly Arg Leu Glu Val Gly Thr
85 90 95
Glu Thr Leu Ile Asp Lys Ser Lys Ser Val Lys Ser Val Leu Met Gln
100 105 110
Leu Phe Gly Glu Asn Thr Asp Val Glu Gly Ile Asp Thr Leu Asn Ala
115 120 125
Cys Tyr Gly Gly Thr Asn Ala Leu Phe Asn Ser Leu Asn Trp Ile Glu
130 135 140
Ser Asn Ala Trp Asp Gly Arg Asp Ala Ile Val Val Cys Gly Asp Ile
145 150 155 160
Ala Ile Tyr Asp Lys Gly Ala Ala Arg Pro Thr Gly Gly Ala Gly Thr
165 170 175
Val Ala Met Trp Ile Gly Pro Asp Ala Pro Ile Val Phe Asp Ser Val
180 185 190
Arg Ala Ser Tyr Met Glu His Ala Tyr Asp Phe Tyr Lys Pro Asp Phe
195 200 205
Thr Ser Glu Tyr Pro Tyr Val Asp Gly His Phe Ser Leu Thr Cys Tyr
210 215 220
Val Lys Ala Leu Asp Gln Val Tyr Lys Ser Tyr Ser Lys Lys Ala Ile
225 230 235 240
Ser Lys Gly Leu Val Ser Asp Pro Ala Gly Ser Asp Ala Leu Asn Val
245 250 255
Leu Lys Tyr Phe Asp Tyr Asn Val Phe His Val Pro Thr Cys Lys Leu
260 265 270
Val Thr Lys Ser Tyr Gly Arg Leu Leu Tyr Asn Asp Phe Arg Ala Asn
275 280 285
Pro Gln Leu Phe Pro Glu Val Asp Ala Glu Leu Ala Thr Arg Asp Tyr
290 295 300
Asp Glu Ser Leu Thr Asp Lys Asn Ile Glu Lys Thr Phe Val Asn Val
305 310 315 320
Ala Lys Pro Phe His Lys Glu Arg Val Ala Gln Ser Leu Ile Val Pro
325 330 335
Thr Asn Thr Gly Asn Met Tyr Thr Ala Ser Val Tyr Ala Ala Phe Ala
340 345 350
Ser Leu Leu Asn Tyr Val Gly Ser Asp Asp Leu Gln Gly Lys Arg Val
355 360 365
Gly Leu Phe Ser Tyr Gly Ser Gly Leu Ala Ala Ser Leu Tyr Ser Cys
370 375 380
Lys Ile Val Gly Asp Val Gln His Ile Ile Lys Glu Leu Asp Ile Thr
385 390 395 400
Asn Lys Leu Ala Lys Arg Ile Thr Glu Thr Pro Lys Asp Tyr Glu Ala
405 410 415
Ala Ile Glu Leu Arg Glu Asn Ala His Leu Lys Lys Asn Phe Lys Pro
420 425 430
Gln Gly Ser Ile Glu His Leu Gln Ser Gly Val Tyr Tyr Leu Thr Asn
435 440 445
Ile Asp Asp Lys Phe Arg Arg Ser Tyr Asp Val Lys Lys
450 455 460
<210> 30
<211> 1197
<212> DNA
<213> Saccharomyces cerevisiae
<400> 30
atgtctcaga acgtttacat tgtatcgact gccagaaccc caattggttc attccagggt 60
tctctatcct ccaagacagc agtggaattg ggtgctgttg ctttaaaagg cgccttggct 120
aaggttccag aattggatgc atccaaggat tttgacgaaa ttatttttgg taacgttctt 180
tctgccaatt tgggccaagc tccggccaga caagttgctt tggctgccgg tttgagtaat 240
catatcgttg caagcacagt taacaaggtc tgtgcatccg ctatgaaggc aatcattttg 300
ggtgctcaat ccatcaaatg tggtaatgct gatgttgtcg tagctggtgg ttgtgaatct 360
atgactaacg caccatacta catgccagca gcccgtgcgg gtgccaaatt tggccaaact 420
gttcttgttg atggtgtcga aagagatggg ttgaacgatg cgtacgatgg tctagccatg 480
ggtgtacacg cagaaaagtg tgcccgtgat tgggatatta ctagagaaca acaagacaat 540
tttgccatcg aatcctacca aaaatctcaa aaatctcaaa aggaaggtaa attcgacaat 600
gaaattgtac ctgttaccat taagggattt agaggtaagc ctgatactca agtcacgaag 660
gacgaggaac ctgctagatt acacgttgaa aaattgagat ctgcaaggac tgttttccaa 720
aaagaaaacg gtactgttac tgccgctaac gcttctccaa tcaacgatgg tgctgcagcc 780
gtcatcttgg tttccgaaaa agttttgaag gaaaagaatt tgaagccttt ggctattatc 840
aaaggttggg gtgaggccgc tcatcaacca gctgatttta catgggctcc atctcttgca 900
gttccaaagg ctttgaaaca tgctggcatc gaagacatca attctgttga ttactttgaa 960
ttcaatgaag ccttttcggt tgtcggtttg gtgaacacta agattttgaa gctagaccca 1020
tctaaggtta atgtatatgg tggtgctgtt gctctaggtc acccattggg ttgttctggt 1080
gctagagtgg ttgttacact gctatccatc ttacagcaag aaggaggtaa gatcggtgtt 1140
gccgccattt gtaatggtgg tggtggtgct tcctctattg tcattgaaaa gatatga 1197
<210> 31
<211> 398
<212> PRT
<213> Saccharomyces cerevisiae
<400> 31
Met Ser Gln Asn Val Tyr Ile Val Ser Thr Ala Arg Thr Pro Ile Gly
1 5 10 15
Ser Phe Gln Gly Ser Leu Ser Ser Lys Thr Ala Val Glu Leu Gly Ala
20 25 30
Val Ala Leu Lys Gly Ala Leu Ala Lys Val Pro Glu Leu Asp Ala Ser
35 40 45
Lys Asp Phe Asp Glu Ile Ile Phe Gly Asn Val Leu Ser Ala Asn Leu
50 55 60
Gly Gln Ala Pro Ala Arg Gln Val Ala Leu Ala Ala Gly Leu Ser Asn
65 70 75 80
His Ile Val Ala Ser Thr Val Asn Lys Val Cys Ala Ser Ala Met Lys
85 90 95
Ala Ile Ile Leu Gly Ala Gln Ser Ile Lys Cys Gly Asn Ala Asp Val
100 105 110
Val Val Ala Gly Gly Cys Glu Ser Met Thr Asn Ala Pro Tyr Tyr Met
115 120 125
Pro Ala Ala Arg Ala Gly Ala Lys Phe Gly Gln Thr Val Leu Val Asp
130 135 140
Gly Val Glu Arg Asp Gly Leu Asn Asp Ala Tyr Asp Gly Leu Ala Met
145 150 155 160
Gly Val His Ala Glu Lys Cys Ala Arg Asp Trp Asp Ile Thr Arg Glu
165 170 175
Gln Gln Asp Asn Phe Ala Ile Glu Ser Tyr Gln Lys Ser Gln Lys Ser
180 185 190
Gln Lys Glu Gly Lys Phe Asp Asn Glu Ile Val Pro Val Thr Ile Lys
195 200 205
Gly Phe Arg Gly Lys Pro Asp Thr Gln Val Thr Lys Asp Glu Glu Pro
210 215 220
Ala Arg Leu His Val Glu Lys Leu Arg Ser Ala Arg Thr Val Phe Gln
225 230 235 240
Lys Glu Asn Gly Thr Val Thr Ala Ala Asn Ala Ser Pro Ile Asn Asp
245 250 255
Gly Ala Ala Ala Val Ile Leu Val Ser Glu Lys Val Leu Lys Glu Lys
260 265 270
Asn Leu Lys Pro Leu Ala Ile Ile Lys Gly Trp Gly Glu Ala Ala His
275 280 285
Gln Pro Ala Asp Phe Thr Trp Ala Pro Ser Leu Ala Val Pro Lys Ala
290 295 300
Leu Lys His Ala Gly Ile Glu Asp Ile Asn Ser Val Asp Tyr Phe Glu
305 310 315 320
Phe Asn Glu Ala Phe Ser Val Val Gly Leu Val Asn Thr Lys Ile Leu
325 330 335
Lys Leu Asp Pro Ser Lys Val Asn Val Tyr Gly Gly Ala Val Ala Leu
340 345 350
Gly His Pro Leu Gly Cys Ser Gly Ala Arg Val Val Val Thr Leu Leu
355 360 365
Ser Ile Leu Gln Gln Glu Gly Gly Lys Ile Gly Val Ala Ala Ile Cys
370 375 380
Asn Gly Gly Gly Gly Ala Ser Ser Ile Val Ile Glu Lys Ile
385 390 395
<210> 32
<211> 1191
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 32
atgaccgttt acacagcatc cgttaccgca cccgtcaaca tcgcaaccct taagtattgg 60
gggaaaaggg acacgaagtt gaatctgccc accaattcgt ccatatcagt gactttatcg 120
caagatgacc tcagaacgtt gacctctgcg gctactgcac ctgagtttga acgcgacact 180
ttgtggttaa atggagaacc acacagcatc gacaatgaaa gaactcaaaa ttgtctgcgc 240
gacctacgcc aattaagaaa ggaaatggaa tcgaaggacg cctcattgcc cacattatct 300
caatggaaac tccacattgt ctccgaaaat aactttccta cagcagctgg tttagcttcc 360
tccgctgctg gctttgctgc attggtctct gcaattgcta agttatacca attaccacag 420
tcaacttcag aaatatctag aatagcaaga aaggggtctg gttcagcttg tagatcgttg 480
tttggcggat acgtggcctg ggaaatggga aaagctgaag atggtcatga ttccatggca 540
gtacaaatcg cagacagctc tgactggcct cagatgaaag cttgtgtcct agttgtcagc 600
gatattaaaa aggatgtgag ttccactcag ggtatgcaat tgaccgtggc aacctccgaa 660
ctatttaaag aaagaattga acatgtcgta ccaaagagat ttgaagtcat gcgtaaagcc 720
attgttgaaa aagatttcgc cacctttgca aaggaaacaa tgatggattc caactctttc 780
catgccacat gtttggactc tttccctcca atattctaca tgaatgacac ttccaagcgt 840
atcatcagtt ggtgccacac cattaatcag ttttacggag aaacaatcgt tgcatacacg 900
tttgatgcag gtccaaatgc tgtgttgtac tacttagctg aaaatgagtc gaaactcttt 960
gcatttatct ataaattgtt tggctctgtt cctggatggg acaagaaatt tactactgag 1020
cagcttgagg ctttcaacca tcaatttgaa tcatctaact ttactgcacg tgaattggat 1080
cttgagttgc aaaaggatgt tgccagagtg attttaactc aagtcggttc aggcccacaa 1140
gaaacaaacg aatctttgat tgacgcaaag actggtctac caaaggaata a 1191
<210> 33
<211> 396
<212> PRT
<213> unknown
<220>
<223> genus Saccharomyces
<400> 33
Met Thr Val Tyr Thr Ala Ser Val Thr Ala Pro Val Asn Ile Ala Thr
1 5 10 15
Leu Lys Tyr Trp Gly Lys Arg Asp Thr Lys Leu Asn Leu Pro Thr Asn
20 25 30
Ser Ser Ile Ser Val Thr Leu Ser Gln Asp Asp Leu Arg Thr Leu Thr
35 40 45
Ser Ala Ala Thr Ala Pro Glu Phe Glu Arg Asp Thr Leu Trp Leu Asn
50 55 60
Gly Glu Pro His Ser Ile Asp Asn Glu Arg Thr Gln Asn Cys Leu Arg
65 70 75 80
Asp Leu Arg Gln Leu Arg Lys Glu Met Glu Ser Lys Asp Ala Ser Leu
85 90 95
Pro Thr Leu Ser Gln Trp Lys Leu His Ile Val Ser Glu Asn Asn Phe
100 105 110
Pro Thr Ala Ala Gly Leu Ala Ser Ser Ala Ala Gly Phe Ala Ala Leu
115 120 125
Val Ser Ala Ile Ala Lys Leu Tyr Gln Leu Pro Gln Ser Thr Ser Glu
130 135 140
Ile Ser Arg Ile Ala Arg Lys Gly Ser Gly Ser Ala Cys Arg Ser Leu
145 150 155 160
Phe Gly Gly Tyr Val Ala Trp Glu Met Gly Lys Ala Glu Asp Gly His
165 170 175
Asp Ser Met Ala Val Gln Ile Ala Asp Ser Ser Asp Trp Pro Gln Met
180 185 190
Lys Ala Cys Val Leu Val Val Ser Asp Ile Lys Lys Asp Val Ser Ser
195 200 205
Thr Gln Gly Met Gln Leu Thr Val Ala Thr Ser Glu Leu Phe Lys Glu
210 215 220
Arg Ile Glu His Val Val Pro Lys Arg Phe Glu Val Met Arg Lys Ala
225 230 235 240
Ile Val Glu Lys Asp Phe Ala Thr Phe Ala Lys Glu Thr Met Met Asp
245 250 255
Ser Asn Ser Phe His Ala Thr Cys Leu Asp Ser Phe Pro Pro Ile Phe
260 265 270
Tyr Met Asn Asp Thr Ser Lys Arg Ile Ile Ser Trp Cys His Thr Ile
275 280 285
Asn Gln Phe Tyr Gly Glu Thr Ile Val Ala Tyr Thr Phe Asp Ala Gly
290 295 300
Pro Asn Ala Val Leu Tyr Tyr Leu Ala Glu Asn Glu Ser Lys Leu Phe
305 310 315 320
Ala Phe Ile Tyr Lys Leu Phe Gly Ser Val Pro Gly Trp Asp Lys Lys
325 330 335
Phe Thr Thr Glu Gln Leu Glu Ala Phe Asn His Gln Phe Glu Ser Ser
340 345 350
Asn Phe Thr Ala Arg Glu Leu Asp Leu Glu Leu Gln Lys Asp Val Ala
355 360 365
Arg Val Ile Leu Thr Gln Val Gly Ser Gly Pro Gln Glu Thr Asn Glu
370 375 380
Ser Leu Ile Asp Ala Lys Thr Gly Leu Pro Lys Glu
385 390 395
<210> 34
<211> 1707
<212> DNA
<213> Artificial sequence
<220>
<223> pyruvate decarboxylase (Zm_PDC), codon-optimized nucleotide sequence
<400> 34
atgtcctaca ccgttggtac ctacttagct gagcgtttgg tccaaatcgg tttgaagcac 60
catttcgccg ttgctggtga ttacaacttg gtcttgttag ataatttatt attgaacaag 120
aacatggaac aagtctactg ctgtaatgaa ttgaactgtg gtttctctgc tgaaggttat 180
gctagagcta aaggtgccgc tgccgctgtt gtcacttact ctgttggtgc tttgtctgcc 240
ttcgacgcta ttggtggtgc ttacgccgag aatttacctg ttattttaat ttctggtgcc 300
cctaacaata acgatcatgc tgctggtcat gttttacacc acgctttggg taaaactgac 360
taccattatc aattagagat ggccaaaaac atcaccgccg ctgccgaggc catttacact 420
ccagaagaag ccccagccaa aattgatcac gtcatcaaaa ccgccttgag agagaaaaaa 480
cctgtttact tggaaatcgc ctgtaatatc gcctctatgc cttgcgccgc tcctggtcct 540
gcttccgcct tattcaacga tgaggcttct gatgaagctt ccttaaacgc tgctgttgag 600
gagactttaa agttcatcgc taatagagat aaggtcgctg ttttagtcgg ttctaagttg 660
cgtgctgccg gtgccgagga agctgctgtt aaattcgccg atgctttagg tggtgctgtc 720
gccaccatgg ccgccgccaa atcctttttc cctgaagaaa acccacacta catcggtact 780
tcttggggtg aagtctctta cccaggtgtc gaaaagacta tgaaggaagc cgatgccgtc 840
atcgccttgg ccccagtttt taatgattat tccaccactg gttggactga tatcccagat 900
cctaaaaagt tagttttagc cgagcctaga tccgttgttg ttaacggtat tagattccct 960
tccgttcact tgaaggatta cttaactaga ttggctcaaa aggtttccaa gaagaccggt 1020
gctttggact ttttcaaatc tttgaacgcc ggtgagttaa agaaggccgc ccctgctgac 1080
ccatctgctc cattggttaa cgctgagatt gctagacaag tcgaagcttt attgacccca 1140
aacactaccg ttatcgccga aactggtgac tcttggttta atgctcaaag aatgaagtta 1200
ccaaatggtg ccagagttga gtacgaaatg caatggggtc atatcggttg gtctgtccca 1260
gctgcttttg gttatgctgt tggtgcccct gagagaagaa acatcttgat ggttggtgac 1320
ggttccttcc aattgactgc tcaagaagtc gctcaaatgg ttagattaaa attaccagtc 1380
atcatcttct tgatcaataa ctacggttac actatcgaag tcatgattca cgatggtcct 1440
tacaataata ttaagaactg ggactatgct ggtttgatgg aagtctttaa tggtaacggt 1500
ggttacgatt ccggtgctgg taagggttta aaggctaaga ctggtggtga attagctgaa 1560
gccattaagg ttgccttggc taacaccgac ggtcctactt taatcgaatg tttcattggt 1620
agagaggatt gtaccgaaga gttagttaag tggggtaaga gagttgccgc tgctaattcc 1680
cgtaagcctg tcaataaatt gttataa 1707
<210> 35
<211> 568
<212> PRT
<213> Zymomonas mobilis
<400> 35
Met Ser Tyr Thr Val Gly Thr Tyr Leu Ala Glu Arg Leu Val Gln Ile
1 5 10 15
Gly Leu Lys His His Phe Ala Val Ala Gly Asp Tyr Asn Leu Val Leu
20 25 30
Leu Asp Asn Leu Leu Leu Asn Lys Asn Met Glu Gln Val Tyr Cys Cys
35 40 45
Asn Glu Leu Asn Cys Gly Phe Ser Ala Glu Gly Tyr Ala Arg Ala Lys
50 55 60
Gly Ala Ala Ala Ala Val Val Thr Tyr Ser Val Gly Ala Leu Ser Ala
65 70 75 80
Phe Asp Ala Ile Gly Gly Ala Tyr Ala Glu Asn Leu Pro Val Ile Leu
85 90 95
Ile Ser Gly Ala Pro Asn Asn Asn Asp His Ala Ala Gly His Val Leu
100 105 110
His His Ala Leu Gly Lys Thr Asp Tyr His Tyr Gln Leu Glu Met Ala
115 120 125
Lys Asn Ile Thr Ala Ala Ala Glu Ala Ile Tyr Thr Pro Glu Glu Ala
130 135 140
Pro Ala Lys Ile Asp His Val Ile Lys Thr Ala Leu Arg Glu Lys Lys
145 150 155 160
Pro Val Tyr Leu Glu Ile Ala Cys Asn Ile Ala Ser Met Pro Cys Ala
165 170 175
Ala Pro Gly Pro Ala Ser Ala Leu Phe Asn Asp Glu Ala Ser Asp Glu
180 185 190
Ala Ser Leu Asn Ala Ala Val Glu Glu Thr Leu Lys Phe Ile Ala Asn
195 200 205
Arg Asp Lys Val Ala Val Leu Val Gly Ser Lys Leu Arg Ala Ala Gly
210 215 220
Ala Glu Glu Ala Ala Val Lys Phe Ala Asp Ala Leu Gly Gly Ala Val
225 230 235 240
Ala Thr Met Ala Ala Ala Lys Ser Phe Phe Pro Glu Glu Asn Pro His
245 250 255
Tyr Ile Gly Thr Ser Trp Gly Glu Val Ser Tyr Pro Gly Val Glu Lys
260 265 270
Thr Met Lys Glu Ala Asp Ala Val Ile Ala Leu Ala Pro Val Phe Asn
275 280 285
Asp Tyr Ser Thr Thr Gly Trp Thr Asp Ile Pro Asp Pro Lys Lys Leu
290 295 300
Val Leu Ala Glu Pro Arg Ser Val Val Val Asn Gly Ile Arg Phe Pro
305 310 315 320
Ser Val His Leu Lys Asp Tyr Leu Thr Arg Leu Ala Gln Lys Val Ser
325 330 335
Lys Lys Thr Gly Ala Leu Asp Phe Phe Lys Ser Leu Asn Ala Gly Glu
340 345 350
Leu Lys Lys Ala Ala Pro Ala Asp Pro Ser Ala Pro Leu Val Asn Ala
355 360 365
Glu Ile Ala Arg Gln Val Glu Ala Leu Leu Thr Pro Asn Thr Thr Val
370 375 380
Ile Ala Glu Thr Gly Asp Ser Trp Phe Asn Ala Gln Arg Met Lys Leu
385 390 395 400
Pro Asn Gly Ala Arg Val Glu Tyr Glu Met Gln Trp Gly His Ile Gly
405 410 415
Trp Ser Val Pro Ala Ala Phe Gly Tyr Ala Val Gly Ala Pro Glu Arg
420 425 430
Arg Asn Ile Leu Met Val Gly Asp Gly Ser Phe Gln Leu Thr Ala Gln
435 440 445
Glu Val Ala Gln Met Val Arg Leu Lys Leu Pro Val Ile Ile Phe Leu
450 455 460
Ile Asn Asn Tyr Gly Tyr Thr Ile Glu Val Met Ile His Asp Gly Pro
465 470 475 480
Tyr Asn Asn Ile Lys Asn Trp Asp Tyr Ala Gly Leu Met Glu Val Phe
485 490 495
Asn Gly Asn Gly Gly Tyr Asp Ser Gly Ala Gly Lys Gly Leu Lys Ala
500 505 510
Lys Thr Gly Gly Glu Leu Ala Glu Ala Ile Lys Val Ala Leu Ala Asn
515 520 525
Thr Asp Gly Pro Thr Leu Ile Glu Cys Phe Ile Gly Arg Glu Asp Cys
530 535 540
Thr Glu Glu Leu Val Lys Trp Gly Lys Arg Val Ala Ala Ala Asn Ser
545 550 555 560
Arg Lys Pro Val Asn Lys Leu Leu
565
<210> 36
<211> 1356
<212> DNA
<213> Saccharomyces cerevisiae
<400> 36
atgtcagagt tgagagcctt cagtgcccca gggaaagcgt tactagctgg tggatattta 60
gttttagatc cgaaatatga agcatttgta gtcggattat cggcaagaat gcatgctgta 120
gcccatcctt acggttcatt gcaagagtct gataagtttg aagtgcgtgt gaaaagtaaa 180
caatttaaag atggggagtg gctgtaccat ataagtccta aaactggctt cattcctgtt 240
tcgataggcg gatctaagaa ccctttcatt gaaaaagtta tcgctaacgt atttagctac 300
tttaagccta acatggacga ctactgcaat agaaacttgt tcgttattga tattttctct 360
gatgatgcct accattctca ggaggacagc gttaccgaac atcgtggcaa cagaagattg 420
agttttcatt cgcacagaat tgaagaagtt cccaaaacag ggctgggctc ctcggcaggt 480
ttagtcacag ttttaactac agctttggcc tccttttttg tatcggacct ggaaaataat 540
gtagacaaat atagagaagt tattcataat ttatcacaag ttgctcattg tcaagctcag 600
ggtaaaattg gaagcgggtt tgatgtagcg gcggcagcat atggatctat cagatataga 660
agattcccac ccgcattaat ctctaatttg ccagatattg gaagtgctac ttacggcagt 720
aaactggcgc atttggttaa tgaagaagac tggaatataa cgattaaaag taaccattta 780
ccttcgggat taactttatg gatgggcgat attaagaatg gttcagaaac agtaaaactg 840
gtccagaagg taaaaaattg gtatgattcg catatgccgg aaagcttgaa aatatataca 900
gaactcgatc atgcaaattc tagatttatg gatggactat ctaaactaga tcgcttacac 960
gagactcatg acgattacag cgatcagata tttgagtctc ttgagaggaa tgactgtacc 1020
tgtcaaaagt atcctgagat cacagaagtt agagatgcag ttgccacaat tagacgttcc 1080
tttagaaaaa taactaaaga atctggtgcc gatatcgaac ctcccgtaca aactagctta 1140
ttggatgatt gccagacctt aaaaggagtt cttacttgct taatacctgg tgctggtggt 1200
tatgacgcca ttgcagtgat tgctaagcaa gatgttgatc ttagggctca aaccgctgat 1260
gacaaaagat tttctaaggt tcaatggctg gatgtaactc aggctgactg gggtgttagg 1320
aaagaaaaag atccggaaac ttatcttgat aaataa 1356
<210> 37
<211> 451
<212> PRT
<213> Saccharomyces cerevisiae
<400> 37
Met Ser Glu Leu Arg Ala Phe Ser Ala Pro Gly Lys Ala Leu Leu Ala
1 5 10 15
Gly Gly Tyr Leu Val Leu Asp Pro Lys Tyr Glu Ala Phe Val Val Gly
20 25 30
Leu Ser Ala Arg Met His Ala Val Ala His Pro Tyr Gly Ser Leu Gln
35 40 45
Glu Ser Asp Lys Phe Glu Val Arg Val Lys Ser Lys Gln Phe Lys Asp
50 55 60
Gly Glu Trp Leu Tyr His Ile Ser Pro Lys Thr Gly Phe Ile Pro Val
65 70 75 80
Ser Ile Gly Gly Ser Lys Asn Pro Phe Ile Glu Lys Val Ile Ala Asn
85 90 95
Val Phe Ser Tyr Phe Lys Pro Asn Met Asp Asp Tyr Cys Asn Arg Asn
100 105 110
Leu Phe Val Ile Asp Ile Phe Ser Asp Asp Ala Tyr His Ser Gln Glu
115 120 125
Asp Ser Val Thr Glu His Arg Gly Asn Arg Arg Leu Ser Phe His Ser
130 135 140
His Arg Ile Glu Glu Val Pro Lys Thr Gly Leu Gly Ser Ser Ala Gly
145 150 155 160
Leu Val Thr Val Leu Thr Thr Ala Leu Ala Ser Phe Phe Val Ser Asp
165 170 175
Leu Glu Asn Asn Val Asp Lys Tyr Arg Glu Val Ile His Asn Leu Ser
180 185 190
Gln Val Ala His Cys Gln Ala Gln Gly Lys Ile Gly Ser Gly Phe Asp
195 200 205
Val Ala Ala Ala Ala Tyr Gly Ser Ile Arg Tyr Arg Arg Phe Pro Pro
210 215 220
Ala Leu Ile Ser Asn Leu Pro Asp Ile Gly Ser Ala Thr Tyr Gly Ser
225 230 235 240
Lys Leu Ala His Leu Val Asn Glu Glu Asp Trp Asn Ile Thr Ile Lys
245 250 255
Ser Asn His Leu Pro Ser Gly Leu Thr Leu Trp Met Gly Asp Ile Lys
260 265 270
Asn Gly Ser Glu Thr Val Lys Leu Val Gln Lys Val Lys Asn Trp Tyr
275 280 285
Asp Ser His Met Pro Glu Ser Leu Lys Ile Tyr Thr Glu Leu Asp His
290 295 300
Ala Asn Ser Arg Phe Met Asp Gly Leu Ser Lys Leu Asp Arg Leu His
305 310 315 320
Glu Thr His Asp Asp Tyr Ser Asp Gln Ile Phe Glu Ser Leu Glu Arg
325 330 335
Asn Asp Cys Thr Cys Gln Lys Tyr Pro Glu Ile Thr Glu Val Arg Asp
340 345 350
Ala Val Ala Thr Ile Arg Arg Ser Phe Arg Lys Ile Thr Lys Glu Ser
355 360 365
Gly Ala Asp Ile Glu Pro Pro Val Gln Thr Ser Leu Leu Asp Asp Cys
370 375 380
Gln Thr Leu Lys Gly Val Leu Thr Cys Leu Ile Pro Gly Ala Gly Gly
385 390 395 400
Tyr Asp Ala Ile Ala Val Ile Ala Lys Gln Asp Val Asp Leu Arg Ala
405 410 415
Gln Thr Ala Asp Asp Lys Arg Phe Ser Lys Val Gln Trp Leu Asp Val
420 425 430
Thr Gln Ala Asp Trp Gly Val Arg Lys Glu Lys Asp Pro Glu Thr Tyr
435 440 445
Leu Asp Lys
450
<210> 38
<211> 1332
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 38
atgtcattac cgttcttaac ttctgcaccg ggaaaggtta ttatttttgg tgaacactct 60
gctgtgtaca acaagcctgc cgtcgctgct agtgtgtctg cgttgagaac ctacctgcta 120
ataagcgagt catctgcacc agatactatt gaattggact tcccggacat tagctttaat 180
cataagtggt ccatcaatga tttcaatgcc atcaccgagg atcaagtaaa ctcccaaaaa 240
ttggccaagg ctcaacaagc caccgatggc ttgtctcagg aactcgttag tcttttggat 300
ccgttgttag ctcaactatc cgaatccttc cactaccatg cagcgttttg tttcctgtat 360
atgtttgttt gcctatgccc ccatgccaag aatattaagt tttctttaaa gtctacttta 420
cccatcggtg ctgggttggg ctcaagcgcc tctatttctg tatcactggc cttagctatg 480
gcctacttgg gggggttaat aggatctaat gacttggaaa agctgtcaga aaacgataag 540
catatagtga atcaatgggc cttcataggt gaaaagtgta ttcacggtac cccttcagga 600
atagataacg ctgtggccac ttatggtaat gccctgctat ttgaaaaaga ctcacataat 660
ggaacaataa acacaaacaa ttttaagttc ttagatgatt tcccagccat tccaatgatc 720
ctaacctata ctagaattcc aaggtctaca aaagatcttg ttgctcgcgt tcgtgtgttg 780
gtcaccgaga aatttcctga agttatgaag ccaattctag atgccatggg tgaatgtgcc 840
ctacaaggct tagagatcat gactaagtta agtaaatgta aaggcaccga tgacgaggct 900
gtagaaacta ataatgaact gtatgaacaa ctattggaat tgataagaat aaatcatgga 960
ctgcttgtct caatcggtgt ttctcatcct ggattagaac ttattaaaaa tctgagcgat 1020
gatttgagaa ttggctccac aaaacttacc ggtgctggtg gcggcggttg ctctttgact 1080
ttgttacgaa gagacattac tcaagagcaa attgacagtt tcaaaaagaa attgcaagat 1140
gattttagtt acgagacatt tgaaacagac ttgggtggga ctggctgctg tttgttaagc 1200
gcaaaaaatt tgaataaaga tcttaaaatc aaatccctag tattccaatt atttgaaaat 1260
aaaactacca caaagcaaca aattgacgat ctattattgc caggaaacac gaatttacca 1320
tggacttcat aa 1332
<210> 39
<211> 443
<212> PRT
<213> unknown
<220>
<223> genus Saccharomyces
<400> 39
Met Ser Leu Pro Phe Leu Thr Ser Ala Pro Gly Lys Val Ile Ile Phe
1 5 10 15
Gly Glu His Ser Ala Val Tyr Asn Lys Pro Ala Val Ala Ala Ser Val
20 25 30
Ser Ala Leu Arg Thr Tyr Leu Leu Ile Ser Glu Ser Ser Ala Pro Asp
35 40 45
Thr Ile Glu Leu Asp Phe Pro Asp Ile Ser Phe Asn His Lys Trp Ser
50 55 60
Ile Asn Asp Phe Asn Ala Ile Thr Glu Asp Gln Val Asn Ser Gln Lys
65 70 75 80
Leu Ala Lys Ala Gln Gln Ala Thr Asp Gly Leu Ser Gln Glu Leu Val
85 90 95
Ser Leu Leu Asp Pro Leu Leu Ala Gln Leu Ser Glu Ser Phe His Tyr
100 105 110
His Ala Ala Phe Cys Phe Leu Tyr Met Phe Val Cys Leu Cys Pro His
115 120 125
Ala Lys Asn Ile Lys Phe Ser Leu Lys Ser Thr Leu Pro Ile Gly Ala
130 135 140
Gly Leu Gly Ser Ser Ala Ser Ile Ser Val Ser Leu Ala Leu Ala Met
145 150 155 160
Ala Tyr Leu Gly Gly Leu Ile Gly Ser Asn Asp Leu Glu Lys Leu Ser
165 170 175
Glu Asn Asp Lys His Ile Val Asn Gln Trp Ala Phe Ile Gly Glu Lys
180 185 190
Cys Ile His Gly Thr Pro Ser Gly Ile Asp Asn Ala Val Ala Thr Tyr
195 200 205
Gly Asn Ala Leu Leu Phe Glu Lys Asp Ser His Asn Gly Thr Ile Asn
210 215 220
Thr Asn Asn Phe Lys Phe Leu Asp Asp Phe Pro Ala Ile Pro Met Ile
225 230 235 240
Leu Thr Tyr Thr Arg Ile Pro Arg Ser Thr Lys Asp Leu Val Ala Arg
245 250 255
Val Arg Val Leu Val Thr Glu Lys Phe Pro Glu Val Met Lys Pro Ile
260 265 270
Leu Asp Ala Met Gly Glu Cys Ala Leu Gln Gly Leu Glu Ile Met Thr
275 280 285
Lys Leu Ser Lys Cys Lys Gly Thr Asp Asp Glu Ala Val Glu Thr Asn
290 295 300
Asn Glu Leu Tyr Glu Gln Leu Leu Glu Leu Ile Arg Ile Asn His Gly
305 310 315 320
Leu Leu Val Ser Ile Gly Val Ser His Pro Gly Leu Glu Leu Ile Lys
325 330 335
Asn Leu Ser Asp Asp Leu Arg Ile Gly Ser Thr Lys Leu Thr Gly Ala
340 345 350
Gly Gly Gly Gly Cys Ser Leu Thr Leu Leu Arg Arg Asp Ile Thr Gln
355 360 365
Glu Gln Ile Asp Ser Phe Lys Lys Lys Leu Gln Asp Asp Phe Ser Tyr
370 375 380
Glu Thr Phe Glu Thr Asp Leu Gly Gly Thr Gly Cys Cys Leu Leu Ser
385 390 395 400
Ala Lys Asn Leu Asn Lys Asp Leu Lys Ile Lys Ser Leu Val Phe Gln
405 410 415
Leu Phe Glu Asn Lys Thr Thr Thr Lys Gln Gln Ile Asp Asp Leu Leu
420 425 430
Leu Pro Gly Asn Thr Asn Leu Pro Trp Thr Ser
435 440
<210> 40
<211> 1059
<212> DNA
<213> Artificial sequence
<220>
<223> variant farnesyl pyrophosphate synthase (ERG20mut, F96W, N127W;
GPPS), codon optimized
<400> 40
atggcttctg agaaggagat tcgtcgtgag agattcttga atgtttttcc taaattagtc 60
gaggaattga acgcttcttt gttggcttat ggtatgccta aggaagcttg tgattggtat 120
gctcactcct tgaattataa tactccaggt ggtaaattga accgtggttt gtctgttgtt 180
gacacttacg ctattttatc taacaagacc gtcgagcaat tgggtcaaga agagtatgaa 240
aaggtcgcta ttttaggttg gtgtattgaa ttgttgcaag cttactggtt ggttgccgat 300
gacatgatgg acaagtctat tactcgtcgt ggtcaacctt gctggtataa ggtcccagag 360
gttggtgaaa ttgctatctg ggacgctttc atgttggaag ctgctatcta taaattgttg 420
aaatcccact tcagaaacga gaaatactac attgacatca ccgagttgtt ccacgaagtc 480
actttccaaa ctgagttagg tcaattaatg gacttgatca ccgctccaga agacaaagtt 540
gacttgtcca agttttcctt gaaaaagcac tctttcatcg ttactttcaa gactgcttat 600
tactctttct acttaccagt tgccttggct atgtacgtcg ccggtatcac tgacgaaaag 660
gacttgaagc aagctcgtga cgttttgatt ccattaggtg aatatttcca aatccaagat 720
gactacttag actgttttgg tacccctgaa caaatcggta agatcggtac tgatattcaa 780
gataacaagt gctcttgggt tatcaacaag gctttagagt tagcctccgc cgaacaacgt 840
aaaactttag atgaaaacta cggtaaaaaa gactctgttg ctgaggccaa gtgtaagaag 900
atttttaacg atttaaaaat cgaacaattg tatcacgaat atgaagagtc cattgctaag 960
gatttgaagg ctaaaatttc tcaagttgac gaatcccgtg gtttcaaagc tgacgttttg 1020
actgcttttt taaacaaggt ttacaagcgt tccaaataa 1059
<210> 41
<211> 352
<212> PRT
<213> Artificial sequence
<220>
<223> variant farnesyl pyrophosphate synthase (ERG20mut, F96W, N127W;
GPPS)
<400> 41
Met Ala Ser Glu Lys Glu Ile Arg Arg Glu Arg Phe Leu Asn Val Phe
1 5 10 15
Pro Lys Leu Val Glu Glu Leu Asn Ala Ser Leu Leu Ala Tyr Gly Met
20 25 30
Pro Lys Glu Ala Cys Asp Trp Tyr Ala His Ser Leu Asn Tyr Asn Thr
35 40 45
Pro Gly Gly Lys Leu Asn Arg Gly Leu Ser Val Val Asp Thr Tyr Ala
50 55 60
Ile Leu Ser Asn Lys Thr Val Glu Gln Leu Gly Gln Glu Glu Tyr Glu
65 70 75 80
Lys Val Ala Ile Leu Gly Trp Cys Ile Glu Leu Leu Gln Ala Tyr Trp
85 90 95
Leu Val Ala Asp Asp Met Met Asp Lys Ser Ile Thr Arg Arg Gly Gln
100 105 110
Pro Cys Trp Tyr Lys Val Pro Glu Val Gly Glu Ile Ala Ile Trp Asp
115 120 125
Ala Phe Met Leu Glu Ala Ala Ile Tyr Lys Leu Leu Lys Ser His Phe
130 135 140
Arg Asn Glu Lys Tyr Tyr Ile Asp Ile Thr Glu Leu Phe His Glu Val
145 150 155 160
Thr Phe Gln Thr Glu Leu Gly Gln Leu Met Asp Leu Ile Thr Ala Pro
165 170 175
Glu Asp Lys Val Asp Leu Ser Lys Phe Ser Leu Lys Lys His Ser Phe
180 185 190
Ile Val Thr Phe Lys Thr Ala Tyr Tyr Ser Phe Tyr Leu Pro Val Ala
195 200 205
Leu Ala Met Tyr Val Ala Gly Ile Thr Asp Glu Lys Asp Leu Lys Gln
210 215 220
Ala Arg Asp Val Leu Ile Pro Leu Gly Glu Tyr Phe Gln Ile Gln Asp
225 230 235 240
Asp Tyr Leu Asp Cys Phe Gly Thr Pro Glu Gln Ile Gly Lys Ile Gly
245 250 255
Thr Asp Ile Gln Asp Asn Lys Cys Ser Trp Val Ile Asn Lys Ala Leu
260 265 270
Glu Leu Ala Ser Ala Glu Gln Arg Lys Thr Leu Asp Glu Asn Tyr Gly
275 280 285
Lys Lys Asp Ser Val Ala Glu Ala Lys Cys Lys Lys Ile Phe Asn Asp
290 295 300
Leu Lys Ile Glu Gln Leu Tyr His Glu Tyr Glu Glu Ser Ile Ala Lys
305 310 315 320
Asp Leu Lys Ala Lys Ile Ser Gln Val Asp Glu Ser Arg Gly Phe Lys
325 330 335
Ala Asp Val Leu Thr Ala Phe Leu Asn Lys Val Tyr Lys Arg Ser Lys
340 345 350
<210> 42
<211> 717
<212> DNA
<213> Artificial sequence
<220>
<223> GFP
<400> 42
atgtctaccg cactaacaga aggagctaaa ctattcgaaa aggagattcc ttacattaca 60
gaattagagg gtgatgtcga aggaatgaaa ttcattatca agggcgaggg tactggtgac 120
gctactaccg gtacgattaa agcaaagtac atctgtacaa caggtgacct tcctgttccg 180
tgggctactc tggtgagcac tttgtcttat ggagttcaat gttttgctaa atacccttcg 240
cacattaaag actttttcaa aagtgcaatg cctgagggct atactcagga gagaacaata 300
tctttcgaag gagatggtgt gtataagact agggctatgg tcacgtatga aagaggatcc 360
atctacaata gagtaacttt aactggtgaa aacttcaaaa aggacggtca catccttaga 420
aagaatgttg cctttcaatg cccaccatcc atcttgtaca ttttgccaga cacagttaac 480
aatggtatca gagttgagtt taaccaagct tatgacatag agggtgtcac cgaaaagttg 540
gttacaaaat gttcacagat gaatcgtccc ctggcaggat cagctgccgt ccatatccca 600
cgttaccatc atatcactta tcataccaag ctgtccaaag atcgtgatga gagaagggat 660
cacatgtgtt tggttgaagt ggtaaaggcc gtggatttgg atacttacca aggttga 717
<210> 43
<211> 238
<212> PRT
<213> Artificial sequence
<220>
<223> GFP
<400> 43
Met Ser Thr Ala Leu Thr Glu Gly Ala Lys Leu Phe Glu Lys Glu Ile
1 5 10 15
Pro Tyr Ile Thr Glu Leu Glu Gly Asp Val Glu Gly Met Lys Phe Ile
20 25 30
Ile Lys Gly Glu Gly Thr Gly Asp Ala Thr Thr Gly Thr Ile Lys Ala
35 40 45
Lys Tyr Ile Cys Thr Thr Gly Asp Leu Pro Val Pro Trp Ala Thr Leu
50 55 60
Val Ser Thr Leu Ser Tyr Gly Val Gln Cys Phe Ala Lys Tyr Pro Ser
65 70 75 80
His Ile Lys Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Thr Gln
85 90 95
Glu Arg Thr Ile Ser Phe Glu Gly Asp Gly Val Tyr Lys Thr Arg Ala
100 105 110
Met Val Thr Tyr Glu Arg Gly Ser Ile Tyr Asn Arg Val Thr Leu Thr
115 120 125
Gly Glu Asn Phe Lys Lys Asp Gly His Ile Leu Arg Lys Asn Val Ala
130 135 140
Phe Gln Cys Pro Pro Ser Ile Leu Tyr Ile Leu Pro Asp Thr Val Asn
145 150 155 160
Asn Gly Ile Arg Val Glu Phe Asn Gln Ala Tyr Asp Ile Glu Gly Val
165 170 175
Thr Glu Lys Leu Val Thr Lys Cys Ser Gln Met Asn Arg Pro Leu Ala
180 185 190
Gly Ser Ala Ala Val His Ile Pro Arg Tyr His His Ile Thr Tyr His
195 200 205
Thr Lys Leu Ser Lys Asp Arg Asp Glu Arg Arg Asp His Met Cys Leu
210 215 220
Val Glu Val Val Lys Ala Val Asp Leu Asp Thr Tyr Gln Gly
225 230 235
<210> 44
<211> 545
<212> PRT
<213> Cannabis sativa
<400> 44
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 45
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> THCA synthase codon optimized sequence 1
<400> 45
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 46
<211> 878
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 46
tgttgtggaa atgtaaagag ccccattatc ttagcctaaa aaaaccttct ctttggaact 60
ttcagtaata cgcttaactg ctcattgcta tattgaagta cggattagaa gccgccgagc 120
gggcgacagc cctccgacgg aagactctcc tccgtgcgtc ctggtcttca ccggtcgcgt 180
tcctgaaacg cagatgtgcc tcgcgccgca ctgctccgaa caataaagat tctacaatac 240
tagcttttat ggttatgaag aggaaaaatt ggcagtaacc tggccccaca aaccttcaaa 300
tcaacgaatc aaattaacaa ccataggata ataatgcgat tagtttttta gccttatttc 360
tggggtaatt aatcagcgaa gcgatgattt ttgatctatt aacagatata taaatgcaaa 420
agctgcataa ccactttaac taatactttc aacattttcg gtttgtatta cttcttattc 480
aaatgtcata aaagtatcaa caaaaaattg ttaatatacc tctatacttt aacgtcaagg 540
agaaaaaact atactcttat taccctatcc tatggataaa gcaatcttga tgaggataat 600
gatttttttt tgaatataca taaatactac cgtttttctg ctagattttg tgaagacgta 660
aataagtaca tattactttt taagccaaga caagattaag cattaacttt acccttttct 720
cttctaagtt tcaatactag ttatcactgt ttaaaagtta tggcgagaac gtcggcggtt 780
aaaatatatt accctgaacg tggtgaattg aagttctagg atggtttaaa gatttttcct 840
ttttgggaaa taagtaaaca atatattgct gcctttgc 878
<210> 47
<211> 306
<212> DNA
<213> Artificial sequence
<220>
<223> OAC Y27F variant (OAC), codon optimized
<400> 47
atggccgtca aacacttgat cgtcttaaaa ttcaaggatg aaattactga agctcaaaaa 60
gaagagttct tcaaaacctt cgtcaattta gtcaacatta ttcctgctat gaaggacgtt 120
tactggggta aggatgtcac ccaaaagaac aaggaagaag gttacactca cattgttgaa 180
gtcactttcg aatctgttga aactatccaa gattatatta tccacccagc tcatgtcggt 240
tttggtgatg tttacagatc tttttgggaa aaattgttga tctttgacta tactccaaga 300
aaataa 306
<210> 48
<211> 101
<212> PRT
<213> Artificial sequence
<220>
<223> OAC Y27F variant (OAC)
<400> 48
Met Ala Val Lys His Leu Ile Val Leu Lys Phe Lys Asp Glu Ile Thr
1 5 10 15
Glu Ala Gln Lys Glu Glu Phe Phe Lys Thr Phe Val Asn Leu Val Asn
20 25 30
Ile Ile Pro Ala Met Lys Asp Val Tyr Trp Gly Lys Asp Val Thr Gln
35 40 45
Lys Asn Lys Glu Glu Gly Tyr Thr His Ile Val Glu Val Thr Phe Glu
50 55 60
Ser Val Glu Thr Ile Gln Asp Tyr Ile Ile His Pro Ala His Val Gly
65 70 75 80
Phe Gly Asp Val Tyr Arg Ser Phe Trp Glu Lys Leu Leu Ile Phe Asp
85 90 95
Tyr Thr Pro Arg Lys
100
<210> 49
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> THCAS F317Y variant, codon optimized
<400> 49
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccattta tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 50
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS F317Y variant
<400> 50
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Tyr His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 51
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> THCAS N196T variant, codon optimized
<400> 51
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaccta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 52
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS N196T variant
<400> 52
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Thr Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 53
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> THCAS N196Q variant, codon optimized
<400> 53
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagacaata cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 54
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS N196Q variant
<400> 54
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Gln Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 55
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> THCAS K261C variant, codon optimized
<400> 55
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
tgcaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 56
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS K261C variant
<400> 56
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Cys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 57
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS N196V variant
<400> 57
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagagtcta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 58
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS N196V variant
<400> 58
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Val Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 59
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS L132M variant
<400> 59
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatatgagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 60
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS L132M variant
<400> 60
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Met Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 61
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS S170T variant
<400> 61
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttaacc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 62
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS S170T variant
<400> 62
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Thr Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 63
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS P539T variant
<400> 63
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaacttta 1620
ccaccacacc accattag 1638
<210> 64
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS P539T variant
<400> 64
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Thr Leu Pro Pro His His
530 535 540
His
545
<210> 65
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS L269I variant
<400> 65
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtatcgtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 66
<211> 546
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS L269I variant
<400> 66
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Ile Ile Val Lys
260 265 270
Leu Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu
275 280 285
Val Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly
290 295 300
Lys Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly
305 310 315 320
Gly Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu
325 330 335
Gly Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr
340 345 350
Ile Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys
355 360 365
Glu Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile
370 375 380
Lys Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys
385 390 395 400
Ile Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val
405 410 415
Leu Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile
420 425 430
Pro Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala
435 440 445
Ser Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg
450 455 460
Ser Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu
465 470 475 480
Ala Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala
485 490 495
Ser Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe
500 505 510
Gly Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro
515 520 525
Asn Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His
530 535 540
His His
545
<210> 67
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS F171I variant
<400> 67
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc atcccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 68
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS F171I variant
<400> 68
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Ile Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 69
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS R31Q variant
<400> 69
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct caagagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 70
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS R31Q variant
<400> 70
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Gln Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 71
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS P43E variant
<400> 71
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcgaga acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 72
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS P43E variant
<400> 72
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Glu Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 73
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS P49E variant
<400> 73
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taacgagaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 74
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS P49E variant
<400> 74
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Glu Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 75
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS P49K variant
<400> 75
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taacaaaaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 76
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS P49K variant
<400> 76
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Lys Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 77
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS P49Q variant
<400> 77
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccaaaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 78
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS P49Q variant
<400> 78
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Gln Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 79
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS K50T variant
<400> 79
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctacc ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 80
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS K50T variant
<400> 80
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Thr Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 81
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS L51I variant
<400> 81
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag atcgtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 82
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS L51I variant
<400> 82
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Ile Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 83
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS Q55E variant
<400> 83
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctgagcatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 84
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS Q55E variant
<400> 84
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Glu His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 85
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS Q55P variant
<400> 85
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcctcatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 86
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS Q55P variant
<400> 86
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Pro His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 87
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS H56E variant
<400> 87
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaagagga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 88
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS H56E variant
<400> 88
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln Glu Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 89
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS L59E variant
<400> 89
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaagagtat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 90
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS L59E variant
<400> 90
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Glu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 91
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS M61H variant
<400> 91
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
cattctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 92
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS M61H variant
<400> 92
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr His Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 93
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS M61S variant
<400> 93
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
tcctctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 94
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS M61S variant
<400> 94
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Ser Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 95
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS M61W variant
<400> 95
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
tggtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 96
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS M61W variant
<400> 96
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Trp Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 97
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS S62Q variant
<400> 97
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgcaaatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 98
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS S62Q variant
<400> 98
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Gln Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 99
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS L71A variant
<400> 99
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac gccagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 100
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS L71A variant
<400> 100
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Ala Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 101
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS S100A variant
<400> 101
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgcgcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 102
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS S100A variant
<400> 102
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ala Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 103
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS V103F variant
<400> 103
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaagtttg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 104
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS V103F variant
<400> 104
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Phe Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 105
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS T109V variant
<400> 105
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtgtcaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 106
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS T109V variant
<400> 106
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Val Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 107
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS Q124D variant
<400> 107
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccg atgtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 108
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS Q124D variant
<400> 108
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Asp Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 109
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS Q124E variant
<400> 109
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccg aggtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 110
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS Q124E variant
<400> 110
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Glu Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
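The relationship between the two preceding records can be checked by translating the region of SEQ ID NO: 109 that spans the substituted position. The sketch below is illustrative only: it assumes the standard genetic code, and the names `CODON_TABLE`, `translate`, and `segment` are hypothetical helpers rather than part of the listing. The segment is copied from nucleotides 361-420 of SEQ ID NO: 109, which begin at codon 121, so residue 124 is the fourth translated codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order
# (first base varies slowest). '*' marks a stop codon.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 361-420 of SEQ ID NO: 109 (codons 121-140), copied from the listing.
segment = "tacatttccg aggtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc"

peptide = translate(segment)
first_residue = 121                      # residue number of the first codon in the segment
residue_124 = peptide[124 - first_residue]

print(peptide)       # residues 121-140 in one-letter code
print(residue_124)   # residue at position 124
```

Under these assumptions the fourth codon, `gag`, translates to Glu (E), which agrees with the Glu shown at position 124 of SEQ ID NO: 110. The same check can be applied to the other DNA and protein pairs by substituting the corresponding 60-nucleotide line and start residue.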
<210> 111
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS Q124N variant
<400> 111
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttcca atgtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 112
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS Q124N variant
<400> 112
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Asn Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 113
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS V125E variant
<400> 113
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagagccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 114
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS V125E variant
<400> 114
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Glu Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 115
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS V125Q variant
<400> 115
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aacaaccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 116
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS V125Q variant
<400> 116
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Gln Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 117
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS S137G variant
<400> 117
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcacgg tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 118
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS S137G variant
<400> 118
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Gly Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 119
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS H143D variant
<400> 119
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttgatt ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 120
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS H143D variant
<400> 120
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val Asp Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 121
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS V149I variant
<400> 121
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttggatcgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 122
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS V149I variant
<400> 122
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Ile Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 123
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS W161K variant
<400> 123
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
aaaattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 124
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS W161K variant
<400> 124
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Lys Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 125
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS W161R variant
<400> 125
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
cgtattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 126
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS W161R variant
<400> 126
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Arg Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 127
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS W161Y variant
<400> 127
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tatattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 128
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS W161Y variant
<400> 128
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Tyr Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 129
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS K165A variant
<400> 129
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aagccaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 130
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS K165A variant
<400> 130
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Ala Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 131
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS E167P variant
<400> 131
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatcc taacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 132
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS E167P variant
<400> 132
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Pro Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 133
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS N168S variant
<400> 133
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga atccttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 134
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS N168S variant
<400> 134
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Ser Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 135
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS P172V variant
<400> 135
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttgtcggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 136
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS P172V variant
<400> 136
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Val Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 137
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS Y175F variant
<400> 137
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gtttttgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 138
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS Y175F variant
<400> 138
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Phe Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 139
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS G180A variant
<400> 139
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttgcc 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 140
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS G180A variant
<400> 140
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Ala Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 141
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS H208T variant
<400> 141
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc taccttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 142
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS H208T variant
<400> 142
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala Thr
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 143
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS G235P variant
<400> 143
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtcctgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 144
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS G235P variant
<400> 144
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Pro Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 145
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS A250T variant
<400> 145
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcacc gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 146
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS A250T variant
<400> 146
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Thr Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 147
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS I257V variant
<400> 147
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactgt cttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 148
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS I257V variant
<400> 148
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Val Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 149
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS K261W variant
<400> 149
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
tggaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 150
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS K261W variant
<400> 150
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Trp Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 151
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS G311A variant
<400> 151
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac gcctacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 152
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS G311A variant
<400> 152
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Ala Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 153
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS G311C variant
<400> 153
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac tgctacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 154
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS G311C variant
<400> 154
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Cys Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 155
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS L327I variant
<400> 155
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgatat catgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 156
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS L327I variant
<400> 156
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Ile Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 157
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS T379S variant
<400> 157
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagtccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 158
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS T379S variant
<400> 158
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Ser Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 159
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS K390E variant
<400> 159
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaagag ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 160
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS K390E variant
<400> 160
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Glu Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 161
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS S429L variant
<400> 161
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaattagct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 162
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS S429L variant
<400> 162
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Leu Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 163
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS N467D variant
<400> 163
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacga tttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 164
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS N467D variant
<400> 164
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asp Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 165
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS Q475S variant
<400> 165
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct cttccaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 166
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS Q475S variant
<400> 166
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Ser Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 167
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS Y500M variant
<400> 167
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaacatg 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 168
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS Y500M variant
<400> 168
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Met Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 169
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS Y500V variant
<400> 169
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaacgtc 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 170
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS Y500V variant
<400> 170
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Val Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 171
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS N528E variant
<400> 171
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc agagaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc accattag 1638
<210> 172
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS N528E variant
<400> 172
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Glu
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
His
545
<210> 173
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS P542E variant
<400> 173
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccagaacacc accattag 1638
<210> 174
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS P542E variant
<400> 174
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Glu His His
530 535 540
His
545
<210> 175
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> Codon-optimized THCAS P542V variant
<400> 175
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccagttcacc accattag 1638
<210> 176
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS P542V variant
<400> 176
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Val His His
530 535 540
His
545
<210> 177
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS H543V variant
<400> 177
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccagttc accattag 1638
<210> 178
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS H543V variant
<400> 178
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro Val His
530 535 540
His
545
<210> 179
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS H544A variant
<400> 179
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacg ctcattag 1638
<210> 180
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS H544A variant
<400> 180
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His Ala
530 535 540
His
545
<210> 181
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS H544E variant
<400> 181
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacg aacattag 1638
<210> 182
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS H544E variant
<400> 182
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His Glu
530 535 540
His
545
<210> 183
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS H545D variant
<400> 183
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc acgattag 1638
<210> 184
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS H545D variant
<400> 184
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
Asp
545
<210> 185
<211> 1638
<212> DNA
<213> Artificial sequence
<220>
<223> codon-optimized THCAS H545E variant
<400> 185
atgaattgtt ctgctttctc tttctggttc gtttgtaaga tcatcttttt cttcttatct 60
ttccatattc aaatctctat cgctaaccct cgtgagaact tcttgaaatg tttctccaaa 120
catatcccaa acaatgtcgc taaccctaag ttagtttaca ctcaacatga tcaattatat 180
atgtctatct tgaactctac catccaaaac ttgagattca tctccgatac caccccaaaa 240
ccattggtta ttgttacccc atccaacaat tctcatattc aagctaccat tttgtgctcc 300
aaaaaggtcg gtttgcaaat ccgtactaga tctggtggtc acgatgctga aggtatgtct 360
tacatttccc aagtcccatt cgttgttgtc gatttaagaa atatgcactc tatcaaaatc 420
gacgttcact ctcaaactgc ttgggttgaa gccggtgcca ctttaggtga ggtttactac 480
tggattaacg aaaagaatga aaacttatcc tttccaggtg gttactgtcc aactgttggt 540
gttggtggtc acttctctgg tggtggttat ggtgccttga tgagaaacta cggtttagct 600
gctgataata ttatcgacgc tcacttggtt aatgtcgacg gtaaggtttt ggacagaaaa 660
tccatgggtg aagatttatt ctgggccatt agaggtggtg gtggtgaaaa cttcggtatc 720
attgctgctt ggaaaattaa attggtcgct gtcccatcca agtctactat tttctccgtc 780
aagaaaaaca tggaaattca tggtttggtt aaattattca acaagtggca aaacattgct 840
tacaaatacg acaaagactt agttttgatg acccacttca ttactaaaaa cattaccgac 900
aaccatggta aaaataaaac tactgttcac ggttacttct cttccatttt tcatggtggt 960
gtcgactcct tggtcgattt aatgaacaaa tctttccctg agttgggtat caagaagacc 1020
gactgtaaag aattctcttg gatcgacact actattttct actctggtgt cgttaacttc 1080
aacaccgcta atttcaagaa ggaaatttta ttagatagat ccgctggtaa aaagaccgct 1140
ttctctatca aattagacta cgttaaaaaa ccaatcccag aaaccgctat ggtcaaaatc 1200
ttggaaaaat tatatgaaga agacgttggt gccggtatgt acgtcttata tccatatggt 1260
ggtattatgg aagagatctc tgaatccgct atcccttttc cacacagagc cggtattatg 1320
tacgaattat ggtacactgc ttcctgggag aaacaagaag ataatgaaaa gcacattaac 1380
tgggttagat ctgtttacaa cttcactact ccatacgtct ctcaaaaccc aagattagcc 1440
tacttaaact accgtgattt ggatttaggt aaaactaatc acgcttcccc aaacaactac 1500
acccaagcta gaatttgggg tgagaagtac tttggtaaga acttcaaccg tttagtcaag 1560
gtcaagacta aagttgatcc aaacaatttt ttcagaaacg aacaatctat cccaccttta 1620
ccaccacacc acgaatag 1638
<210> 186
<211> 545
<212> PRT
<213> Artificial sequence
<220>
<223> THCAS H545E variant
<400> 186
Met Asn Cys Ser Ala Phe Ser Phe Trp Phe Val Cys Lys Ile Ile Phe
1 5 10 15
Phe Phe Leu Ser Phe His Ile Gln Ile Ser Ile Ala Asn Pro Arg Glu
20 25 30
Asn Phe Leu Lys Cys Phe Ser Lys His Ile Pro Asn Asn Val Ala Asn
35 40 45
Pro Lys Leu Val Tyr Thr Gln His Asp Gln Leu Tyr Met Ser Ile Leu
50 55 60
Asn Ser Thr Ile Gln Asn Leu Arg Phe Ile Ser Asp Thr Thr Pro Lys
65 70 75 80
Pro Leu Val Ile Val Thr Pro Ser Asn Asn Ser His Ile Gln Ala Thr
85 90 95
Ile Leu Cys Ser Lys Lys Val Gly Leu Gln Ile Arg Thr Arg Ser Gly
100 105 110
Gly His Asp Ala Glu Gly Met Ser Tyr Ile Ser Gln Val Pro Phe Val
115 120 125
Val Val Asp Leu Arg Asn Met His Ser Ile Lys Ile Asp Val His Ser
130 135 140
Gln Thr Ala Trp Val Glu Ala Gly Ala Thr Leu Gly Glu Val Tyr Tyr
145 150 155 160
Trp Ile Asn Glu Lys Asn Glu Asn Leu Ser Phe Pro Gly Gly Tyr Cys
165 170 175
Pro Thr Val Gly Val Gly Gly His Phe Ser Gly Gly Gly Tyr Gly Ala
180 185 190
Leu Met Arg Asn Tyr Gly Leu Ala Ala Asp Asn Ile Ile Asp Ala His
195 200 205
Leu Val Asn Val Asp Gly Lys Val Leu Asp Arg Lys Ser Met Gly Glu
210 215 220
Asp Leu Phe Trp Ala Ile Arg Gly Gly Gly Gly Glu Asn Phe Gly Ile
225 230 235 240
Ile Ala Ala Trp Lys Ile Lys Leu Val Ala Val Pro Ser Lys Ser Thr
245 250 255
Ile Phe Ser Val Lys Lys Asn Met Glu Ile His Gly Leu Val Lys Leu
260 265 270
Phe Asn Lys Trp Gln Asn Ile Ala Tyr Lys Tyr Asp Lys Asp Leu Val
275 280 285
Leu Met Thr His Phe Ile Thr Lys Asn Ile Thr Asp Asn His Gly Lys
290 295 300
Asn Lys Thr Thr Val His Gly Tyr Phe Ser Ser Ile Phe His Gly Gly
305 310 315 320
Val Asp Ser Leu Val Asp Leu Met Asn Lys Ser Phe Pro Glu Leu Gly
325 330 335
Ile Lys Lys Thr Asp Cys Lys Glu Phe Ser Trp Ile Asp Thr Thr Ile
340 345 350
Phe Tyr Ser Gly Val Val Asn Phe Asn Thr Ala Asn Phe Lys Lys Glu
355 360 365
Ile Leu Leu Asp Arg Ser Ala Gly Lys Lys Thr Ala Phe Ser Ile Lys
370 375 380
Leu Asp Tyr Val Lys Lys Pro Ile Pro Glu Thr Ala Met Val Lys Ile
385 390 395 400
Leu Glu Lys Leu Tyr Glu Glu Asp Val Gly Ala Gly Met Tyr Val Leu
405 410 415
Tyr Pro Tyr Gly Gly Ile Met Glu Glu Ile Ser Glu Ser Ala Ile Pro
420 425 430
Phe Pro His Arg Ala Gly Ile Met Tyr Glu Leu Trp Tyr Thr Ala Ser
435 440 445
Trp Glu Lys Gln Glu Asp Asn Glu Lys His Ile Asn Trp Val Arg Ser
450 455 460
Val Tyr Asn Phe Thr Thr Pro Tyr Val Ser Gln Asn Pro Arg Leu Ala
465 470 475 480
Tyr Leu Asn Tyr Arg Asp Leu Asp Leu Gly Lys Thr Asn His Ala Ser
485 490 495
Pro Asn Asn Tyr Thr Gln Ala Arg Ile Trp Gly Glu Lys Tyr Phe Gly
500 505 510
Lys Asn Phe Asn Arg Leu Val Lys Val Lys Thr Lys Val Asp Pro Asn
515 520 525
Asn Phe Phe Arg Asn Glu Gln Ser Ile Pro Pro Leu Pro Pro His His
530 535 540
Glu
545
<210> 187
<211> 921
<212> DNA
<213> Artificial sequence
<220>
<223> nucleotide sequence of aromatic prenyltransferase (NphB-ScCO)
<400> 187
atgtctgagg cggcagacgt agagagagta tacgctgcta tggaggaagc ggctggatta 60
ttgggggtgg cttgtgccag agacaagata tatccgttac tgtctacttt ccaggacact 120
cttgtagaag gagggagtgt ggtggtgttt agtatggcat caggccgtca ttcaacagag 180
ctagatttca gtatatctgt gccaacaagt cacggtgatc catacgcaac cgtagtcgag 240
aagggtcttt tcccggcaac agggcatcct gtagatgatt tgcttgccga cacacagaag 300
cacctgcccg tctccatgtt cgcaatcgat ggtgaggtga ccggaggatt taaaaagact 360
tacgctttct tcccgactga caatatgcca ggagttgccg agttgagtgc aataccatcc 420
atgccgccag cagtcgcgga gaacgccgaa ttgttcgccc gttacggctt ggacaaagtc 480
caaatgacta gtatggacta taaaaagagg caggtgaatc tatatttcag cgaactttct 540
gcccaaacct tggaggcgga gagcgtttta gcccttgtta gggagttagg gctacacgtc 600
ccgaatgagt tgggtttgaa attttgtaag cgtagctttt cagtatatcc gacgctgaac 660
tgggaaactg gaaagattga caggctatgc tttgcagtga tttctaatga ccctacgctt 720
gtaccttcct cagacgaggg cgacatcgag aaattccaca actatgccac aaaagctccg 780
tatgcctacg tcggcgaaaa acgtactcta gtatacggtt tgactctgag tcccaaggaa 840
gagtattaca agctaggagc gtactatcat atcactgatg tgcaacgtgg cttgctgaaa 900
gccttcgact ccttagagga c 921
<210> 188
<211> 307
<212> PRT
<213> unknown
<220>
<223> Streptomyces
<400> 188
Met Ser Glu Ala Ala Asp Val Glu Arg Val Tyr Ala Ala Met Glu Glu
1 5 10 15
Ala Ala Gly Leu Leu Gly Val Ala Cys Ala Arg Asp Lys Ile Tyr Pro
20 25 30
Leu Leu Ser Thr Phe Gln Asp Thr Leu Val Glu Gly Gly Ser Val Val
35 40 45
Val Phe Ser Met Ala Ser Gly Arg His Ser Thr Glu Leu Asp Phe Ser
50 55 60
Ile Ser Val Pro Thr Ser His Gly Asp Pro Tyr Ala Thr Val Val Glu
65 70 75 80
Lys Gly Leu Phe Pro Ala Thr Gly His Pro Val Asp Asp Leu Leu Ala
85 90 95
Asp Thr Gln Lys His Leu Pro Val Ser Met Phe Ala Ile Asp Gly Glu
100 105 110
Val Thr Gly Gly Phe Lys Lys Thr Tyr Ala Phe Phe Pro Thr Asp Asn
115 120 125
Met Pro Gly Val Ala Glu Leu Ser Ala Ile Pro Ser Met Pro Pro Ala
130 135 140
Val Ala Glu Asn Ala Glu Leu Phe Ala Arg Tyr Gly Leu Asp Lys Val
145 150 155 160
Gln Met Thr Ser Met Asp Tyr Lys Lys Arg Gln Val Asn Leu Tyr Phe
165 170 175
Ser Glu Leu Ser Ala Gln Thr Leu Glu Ala Glu Ser Val Leu Ala Leu
180 185 190
Val Arg Glu Leu Gly Leu His Val Pro Asn Glu Leu Gly Leu Lys Phe
195 200 205
Cys Lys Arg Ser Phe Ser Val Tyr Pro Thr Leu Asn Trp Glu Thr Gly
210 215 220
Lys Ile Asp Arg Leu Cys Phe Ala Val Ile Ser Asn Asp Pro Thr Leu
225 230 235 240
Val Pro Ser Ser Asp Glu Gly Asp Ile Glu Lys Phe His Asn Tyr Ala
245 250 255
Thr Lys Ala Pro Tyr Ala Tyr Val Gly Glu Lys Arg Thr Leu Val Tyr
260 265 270
Gly Leu Thr Leu Ser Pro Lys Glu Glu Tyr Tyr Lys Leu Gly Ala Tyr
275 280 285
Tyr His Ile Thr Asp Val Gln Arg Gly Leu Leu Lys Ala Phe Asp Ser
290 295 300
Leu Glu Asp
305
<210> 189
<211> 3349
<212> DNA
<213> Artificial sequence
<220>
<223> IRE1 fragment
<400> 189
atgcggtcta cttcgaagaa acatgttagt attgacactg ctcgtttgtg tgttttcatc 60
catcatttca tgctcaatcc cattgtcgtc tcgcacctca aggcggcaga tagtggaaga 120
tgaagttgcc tccactaaaa agctcaattt caactatggt gtggataaaa atataaactc 180
gcccattcct gctccaagaa ccactgaagg tttaccaaat atgaaactca gctcatatcc 240
aactcctaac ttattgaata ctgctgataa tcgacgtgct aacaaaaaag gacgtagggc 300
tgccaattct ataagtgtac cctatttgga gaatcgttcc ttgaacgaac tgagtttatc 360
agatatacta atcgcagccg acgttgaggg tggacttcat gctgtagata gaagaaatgg 420
tcatatcata tggtcaatcg aaccagaaaa ttttcaacct ctgatagaaa tacaagaacc 480
ttcgaggtta gaaacatatg aaacgttgat tatagaacct ttcggtgatg ggaacattta 540
ctactttaac gcccatcaag ggttacaaaa actgccttta tccatacgac aacttgtatc 600
aacttccccg ctgcacttga aaacaaatat tgtggttaat gactctggaa aaattgttga 660
agatgaaaag gtctacactg gatcgatgag aactataatg tatactataa acatgttgaa 720
tggtgaaatt atatcagcgt tcggacctgg ttcaaaaaac gggtatttcg ggagccagag 780
tgtggattgc tcacctgagg agaagataaa acttcaggaa tgtgaaaata tgattgtaat 840
aggcaaaact atttttgagc tgggaattca ctcttatgat ggagcaagct acaatgtcac 900
ttactctaca tggcagcaaa atgttttaga tgttccccta gcgcttcaga atacattttc 960
aaaggacggc atgtgcatag cgcctttccg tgataaatca ttgctagcaa gcgatttaga 1020
ttttagaatt gctagatggg tttctccgac attccccgga attattgttg ggcttttcga 1080
tgtgtttaat gatctccgca ccaatgaaaa tatactggta ccgcatccct ttaatcctgg 1140
tgatcatgaa agtatatcga gtaacaaagt ttacttggat cagacttcga acctctcctg 1200
gtttgcatta tctagtcaga attttccatc tttagtcgaa tcagctccca tatcaagata 1260
cgcttccagt gaccgttgga gggtgtcttc aatttttgaa gatgagactt tattcaagaa 1320
cgcaatcatg ggtgttcatc agatatataa taatgaatat gatcaccttt atgaaaacta 1380
tgaaaaaacg aatagtttgg acactacgca caaatatcca cctctgatga ttgattcgtc 1440
cgttgataca accgatttac atcagaataa cgagatgaat tcactaaagg aatacatgtc 1500
accagaagac cttgaggcat atagaaaaaa gatacacgag caaatatcga gagaattaga 1560
tgaaaagaac caaaattctt tgctactgaa gtttggaagt ctagtatatc gaattataga 1620
gactggagta tttctgttgt tatttctcat tttttgtgca atactacaaa gattcaaaat 1680
tttgccgcca ctatatgtat tattatccaa aattggattt atgcctgaaa aggaaatccc 1740
catagttgag tcgaaatcgc taaattgtcc ctcttcatcg gaaaatgtaa ccaagccatt 1800
cgatatgaaa tcagggaagc aagttgtttt tgaaggtgct gtgaacgatg gaagtctaaa 1860
atctgaaaaa gataacgatg atgctgatga agatgatgaa aaatcactag atttaaccac 1920
agaaaagaag aagaggaaaa gaggttcgag aggaggcaaa aagggccgaa aatcacgcat 1980
tgcaaatata ccaaactttg agcaatcttt aaaaaatttg gtagtatccg aaaaaatttt 2040
aggttacggt tcatcaggaa cagtagtttt tcagggaagt tttcaaggaa gacctgttgc 2100
ggtaaagaga atgttaattg atttttgtga catagcttta atggaaataa aacttttgac 2160
tgaaagcgat gatcacccta acgtcatacg atactactgt tcagaaacaa cagacagatt 2220
tttgtatatt gctttagagc tctgcaattt gaaccttcaa gatttggtgg agtctaagaa 2280
tgtatcagat gaaaacctga aattacagaa agagtataat ccaatttcgt tattgagaca 2340
aatagcgtcc ggggtagcac atttacattc tttaaagatt atccatcgag atttaaagcc 2400
tcaaaatatt ctcgtttcta cttcgagtag gtttactgcc gatcagcaaa caggagcaga 2460
aaatcttcga attttgatat cagactttgg tctttgcaaa aaactagact ctggtcagtc 2520
ttcatttaga acaaatttga ataacccttc tggcacaagt ggttggaggg ccccagagct 2580
gcttgaagaa tcaaacaatt tgcagtgcca agtcgaaacg gaacactctt ctagtaggca 2640
tacagtagtt tcatctgatt ctttttatga tccgttcacc aagaggaggc taacaagatc 2700
tattgatatt ttttctatgg gatgtgtatt ctattatatc ctatccaaag ggaagcatcc 2760
atttggagat aaatattcac gtgaaagcaa tatcataaga ggaatattca gtcttgatga 2820
aatgaaatgt ctacatgata gatccttaat tgcagaagct acagatctga tctcccaaat 2880
gattgatcac gatccgttaa aaagacctac tgctatgaaa gttctaaggc atccgttgtt 2940
ttggccaaag tcgaaaaaat tggagttcct tttaaaagtt agtgataggc ttgaaattga 3000
aaacagagac cctccaagtg ccctgttaat gaaatttgac gccggttctg actttgtaat 3060
acccagtgga gattggactg tcaagtttga taaaacattc atggacaacc ttgaaaggta 3120
cagaaaatac cattcatcaa agttaatgga tctattaaga gcacttagga ataaatatca 3180
tcattttatg gatttacctg aagatatagc agaactaatg gggccggtac ccgatggatt 3240
ttacgattac ttcaccaagc gttttccaaa cctattaata ggtgtttata tgattgtcaa 3300
ggaaaattta agtgacgatc aaattttacg tgaatttttg tattcataa 3349
<210> 190
<211> 1108
<212> PRT
<213> Artificial sequence
<220>
<223> IRE1 fragment
<400> 190
Met Leu Val Leu Thr Leu Leu Val Cys Val Phe Ser Ser Ile Ile Ser
1 5 10 15
Cys Ser Ile Pro Leu Ser Ser Arg Thr Ser Arg Arg Gln Ile Val Glu
20 25 30
Asp Glu Val Ala Ser Thr Lys Lys Leu Asn Phe Asn Tyr Gly Val Asp
35 40 45
Lys Asn Ile Asn Ser Pro Ile Pro Ala Pro Arg Thr Thr Glu Gly Leu
50 55 60
Pro Asn Met Lys Leu Ser Ser Tyr Pro Thr Pro Asn Leu Leu Asn Thr
65 70 75 80
Ala Asp Asn Arg Arg Ala Asn Lys Lys Gly Arg Arg Ala Ala Asn Ser
85 90 95
Ile Ser Val Pro Tyr Leu Glu Asn Arg Ser Leu Asn Glu Leu Ser Leu
100 105 110
Ser Asp Ile Leu Ile Ala Ala Asp Val Glu Gly Gly Leu His Ala Val
115 120 125
Asp Arg Arg Asn Gly His Ile Ile Trp Ser Ile Glu Pro Glu Asn Phe
130 135 140
Gln Pro Leu Ile Glu Ile Gln Glu Pro Ser Arg Leu Glu Thr Tyr Glu
145 150 155 160
Thr Leu Ile Ile Glu Pro Phe Gly Asp Gly Asn Ile Tyr Tyr Phe Asn
165 170 175
Ala His Gln Gly Leu Gln Lys Leu Pro Leu Ser Ile Arg Gln Leu Val
180 185 190
Ser Thr Ser Pro Leu His Leu Lys Thr Asn Ile Val Val Asn Asp Ser
195 200 205
Gly Lys Ile Val Glu Asp Glu Lys Val Tyr Thr Gly Ser Met Arg Thr
210 215 220
Ile Met Tyr Thr Ile Asn Met Leu Asn Gly Glu Ile Ile Ser Ala Phe
225 230 235 240
Gly Pro Gly Ser Lys Asn Gly Tyr Phe Gly Ser Gln Ser Val Asp Cys
245 250 255
Ser Pro Glu Glu Lys Ile Lys Leu Gln Glu Cys Glu Asn Met Ile Val
260 265 270
Ile Gly Lys Thr Ile Phe Glu Leu Gly Ile His Ser Tyr Asp Gly Ala
275 280 285
Ser Tyr Asn Val Thr Tyr Ser Thr Trp Gln Gln Asn Val Leu Asp Val
290 295 300
Pro Leu Ala Leu Gln Asn Thr Phe Ser Lys Asp Gly Met Cys Ile Ala
305 310 315 320
Pro Phe Arg Asp Lys Ser Leu Leu Ala Ser Asp Leu Asp Phe Arg Ile
325 330 335
Ala Arg Trp Val Ser Pro Thr Phe Pro Gly Ile Ile Val Gly Leu Phe
340 345 350
Asp Val Phe Asn Asp Leu Arg Thr Asn Glu Asn Ile Leu Val Pro His
355 360 365
Pro Phe Asn Pro Gly Asp His Glu Ser Ile Ser Ser Asn Lys Val Tyr
370 375 380
Leu Asp Gln Thr Ser Asn Leu Ser Trp Phe Ala Leu Ser Ser Gln Asn
385 390 395 400
Phe Pro Ser Leu Val Glu Ser Ala Pro Ile Ser Arg Tyr Ala Ser Ser
405 410 415
Asp Arg Trp Arg Val Ser Ser Ile Phe Glu Asp Glu Thr Leu Phe Lys
420 425 430
Asn Ala Ile Met Gly Val His Gln Ile Tyr Asn Asn Glu Tyr Asp His
435 440 445
Leu Tyr Glu Asn Tyr Glu Lys Thr Asn Ser Leu Asp Thr Thr His Lys
450 455 460
Tyr Pro Pro Leu Met Ile Asp Ser Ser Val Asp Thr Thr Asp Leu His
465 470 475 480
Gln Asn Asn Glu Met Asn Ser Leu Lys Glu Tyr Met Ser Pro Glu Asp
485 490 495
Leu Glu Ala Tyr Arg Lys Lys Ile His Glu Gln Ile Ser Arg Glu Leu
500 505 510
Asp Glu Lys Asn Gln Asn Ser Leu Leu Leu Lys Phe Gly Ser Leu Val
515 520 525
Tyr Arg Ile Ile Glu Thr Gly Val Phe Leu Leu Leu Phe Leu Ile Phe
530 535 540
Cys Ala Ile Leu Gln Arg Phe Lys Ile Leu Pro Pro Leu Tyr Val Leu
545 550 555 560
Leu Ser Lys Ile Gly Phe Met Pro Glu Lys Glu Ile Pro Ile Val Glu
565 570 575
Ser Lys Ser Leu Asn Cys Pro Ser Ser Ser Glu Asn Val Thr Lys Pro
580 585 590
Phe Asp Met Lys Ser Gly Lys Gln Val Val Phe Glu Gly Ala Val Asn
595 600 605
Asp Gly Ser Leu Lys Ser Glu Lys Asp Asn Asp Asp Ala Asp Glu Asp
610 615 620
Asp Glu Lys Ser Leu Asp Leu Thr Thr Glu Lys Lys Lys Arg Lys Arg
625 630 635 640
Gly Ser Arg Gly Gly Lys Lys Gly Arg Lys Ser Arg Ile Ala Asn Ile
645 650 655
Pro Asn Phe Glu Gln Ser Leu Lys Asn Leu Val Val Ser Glu Lys Ile
660 665 670
Leu Gly Tyr Gly Ser Ser Gly Thr Val Val Phe Gln Gly Ser Phe Gln
675 680 685
Gly Arg Pro Val Ala Val Lys Arg Met Leu Ile Asp Phe Cys Asp Ile
690 695 700
Ala Leu Met Glu Ile Lys Leu Leu Thr Glu Ser Asp Asp His Pro Asn
705 710 715 720
Val Ile Arg Tyr Tyr Cys Ser Glu Thr Thr Asp Arg Phe Leu Tyr Ile
725 730 735
Ala Leu Glu Leu Cys Asn Leu Asn Leu Gln Asp Leu Val Glu Ser Lys
740 745 750
Asn Val Ser Asp Glu Asn Leu Lys Leu Gln Lys Glu Tyr Asn Pro Ile
755 760 765
Ser Leu Leu Arg Gln Ile Ala Ser Gly Val Ala His Leu His Ser Leu
770 775 780
Lys Ile Ile His Arg Asp Leu Lys Pro Gln Asn Ile Leu Val Ser Thr
785 790 795 800
Ser Ser Arg Phe Thr Ala Asp Gln Gln Thr Gly Ala Glu Asn Leu Arg
805 810 815
Ile Leu Ile Ser Asp Phe Gly Leu Cys Lys Lys Leu Asp Ser Gly Gln
820 825 830
Ser Ser Phe Arg Thr Asn Leu Asn Asn Pro Ser Gly Thr Ser Gly Trp
835 840 845
Arg Ala Pro Glu Leu Leu Glu Glu Ser Asn Asn Leu Gln Cys Gln Val
850 855 860
Glu Thr Glu His Ser Ser Ser Arg His Thr Val Val Ser Ser Asp Ser
865 870 875 880
Phe Tyr Asp Pro Phe Thr Lys Arg Arg Leu Thr Arg Ser Ile Asp Ile
885 890 895
Phe Ser Met Gly Cys Val Phe Tyr Tyr Ile Leu Ser Lys Gly Lys His
900 905 910
Pro Phe Gly Asp Lys Tyr Ser Arg Glu Ser Asn Ile Ile Arg Gly Ile
915 920 925
Phe Ser Leu Asp Glu Met Lys Cys Leu His Asp Arg Ser Leu Ile Ala
930 935 940
Glu Ala Thr Asp Leu Ile Ser Gln Met Ile Asp His Asp Pro Leu Lys
945 950 955 960
Arg Pro Thr Ala Met Lys Val Leu Arg His Pro Leu Phe Trp Pro Lys
965 970 975
Ser Lys Lys Leu Glu Phe Leu Leu Lys Val Ser Asp Arg Leu Glu Ile
980 985 990
Glu Asn Arg Asp Pro Pro Ser Ala Leu Leu Met Lys Phe Asp Ala Gly
995 1000 1005
Ser Asp Phe Val Ile Pro Ser Gly Asp Trp Thr Val Lys Phe Asp
1010 1015 1020
Lys Thr Phe Met Asp Asn Leu Glu Arg Tyr Arg Lys Tyr His Ser
1025 1030 1035
Ser Lys Leu Met Asp Leu Leu Arg Ala Leu Arg Asn Lys Tyr His
1040 1045 1050
His Phe Met Asp Leu Pro Glu Asp Ile Ala Glu Leu Met Gly Pro
1055 1060 1065
Val Pro Asp Gly Phe Tyr Asp Tyr Phe Thr Lys Arg Phe Pro Asn
1070 1075 1080
Leu Leu Ile Gly Val Tyr Met Ile Val Lys Glu Asn Leu Ser Asp
1085 1090 1095
Asp Gln Ile Leu Arg Glu Phe Leu Tyr Ser
1100 1105
<210> 191
<211> 921
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 191
atgcagttga gcaaggctgc tgagatgtgt tatgagataa caaactctta cttacacata 60
gaccagaaat ctcagataat agcaagtaca caagaagcga tacggttgac aagaaaatac 120
ttactaagtg aaatttttgt acgttggagt ccactgaatg gggaaatatc attctcgtac 180
aacggaggaa aagattgcca ggtattacta ctgttatatc tgagttgctt atgggaatat 240
ttcttcatta aggctcaaaa ttcccaattc gatttcgagt ttcaaagctt ccccatgcaa 300
agacttccaa ctgttttcat tgatcaagaa gaaactttcc ctacattaga gaattttgta 360
ctggaaacct cagagcgata ttgcctttcc ttatacgaat cacaaaggca atctggtgca 420
tcggtcaata tggcagacgc atttagagat tttataaaga tataccctga gaccgaagct 480
atagtgatag gtattagaca cacagaccca tttggtgaag cattaaagcc tattcaaaga 540
acagattcta actggcctga ttttatgagg ttgcaacctc tcttacactg ggacttaacc 600
aatatatgga gtttcttact gtattctaat gagccaattt gtggactata tggtaaaggt 660
ttcacatcaa tcggcggaat taacaactca ttgcctaacc cacacttgag aaaggactcc 720
aataatccag ccttgcattt tgaatgggaa atcattcatg catttggcaa ggacgcagaa 780
ggcgaacgta gttccgctat aaacacgtca cctatttccg tggtggataa ggaaagattc 840
agcaaatacc atgacaatta ctatcctggc tggtatttgg ttgatgacac tttagagaga 900
gcaggcagga tcaagaatta a 921
<210> 192
<211> 306
<212> PRT
<213> unknown
<220>
<223> genus Saccharomyces
<400> 192
Met Gln Leu Ser Lys Ala Ala Glu Met Cys Tyr Glu Ile Thr Asn Ser
1 5 10 15
Tyr Leu His Ile Asp Gln Lys Ser Gln Ile Ile Ala Ser Thr Gln Glu
20 25 30
Ala Ile Arg Leu Thr Arg Lys Tyr Leu Leu Ser Glu Ile Phe Val Arg
35 40 45
Trp Ser Pro Leu Asn Gly Glu Ile Ser Phe Ser Tyr Asn Gly Gly Lys
50 55 60
Asp Cys Gln Val Leu Leu Leu Leu Tyr Leu Ser Cys Leu Trp Glu Tyr
65 70 75 80
Phe Phe Ile Lys Ala Gln Asn Ser Gln Phe Asp Phe Glu Phe Gln Ser
85 90 95
Phe Pro Met Gln Arg Leu Pro Thr Val Phe Ile Asp Gln Glu Glu Thr
100 105 110
Phe Pro Thr Leu Glu Asn Phe Val Leu Glu Thr Ser Glu Arg Tyr Cys
115 120 125
Leu Ser Leu Tyr Glu Ser Gln Arg Gln Ser Gly Ala Ser Val Asn Met
130 135 140
Ala Asp Ala Phe Arg Asp Phe Ile Lys Ile Tyr Pro Glu Thr Glu Ala
145 150 155 160
Ile Val Ile Gly Ile Arg His Thr Asp Pro Phe Gly Glu Ala Leu Lys
165 170 175
Pro Ile Gln Arg Thr Asp Ser Asn Trp Pro Asp Phe Met Arg Leu Gln
180 185 190
Pro Leu Leu His Trp Asp Leu Thr Asn Ile Trp Ser Phe Leu Leu Tyr
195 200 205
Ser Asn Glu Pro Ile Cys Gly Leu Tyr Gly Lys Gly Phe Thr Ser Ile
210 215 220
Gly Gly Ile Asn Asn Ser Leu Pro Asn Pro His Leu Arg Lys Asp Ser
225 230 235 240
Asn Asn Pro Ala Leu His Phe Glu Trp Glu Ile Ile His Ala Phe Gly
245 250 255
Lys Asp Ala Glu Gly Glu Arg Ser Ser Ala Ile Asn Thr Ser Pro Ile
260 265 270
Ser Val Val Asp Lys Glu Arg Phe Ser Lys Tyr His Asp Asn Tyr Tyr
275 280 285
Pro Gly Trp Tyr Leu Val Asp Asp Thr Leu Glu Arg Ala Gly Arg Ile
290 295 300
Lys Asn
305
<210> 193
<211> 1030
<212> DNA
<213> unknown
<220>
<223> genus Saccharomyces
<400> 193
gttaaccatt ctggttcact tgccgtcgta tgttgcggac cacctatttt cgtcgacacc 60
gctagaaatc aaactgccaa agctgttatc agaaacccat caagaatgat tgaatacttg 120
gaggaatacc aagcctggtg aacaattttt catatttaag taaacactca atgtataata 180
tcctctaact gttgtaattt cattaacgta aatggtttgc gcctttttta ggggaccctt 240
gttgattcat tctaactact gaggcataag ttgtttcaaa taacactttt tcagaaaaat 300
aatcgtatta aaaagcagaa aaatcatacg taagatgaca gaagcttcat atttagtaac 360
tctgaattgt ataacacacc aattgccgat agaatatgaa ccaatcgatc ttcagcgttc 420
atgtacttaa tttaactacc tgtattttct tataaagata aaattggtgt ataatgtaag 480
ggccaagaga aaaaggaatc ccgcatccca agcaacttct agtggactat ttcttcaaaa 540
aaataactga ataaacacct atataatgtt cagaggttat actttagtgt tttagaatgc 600
agtaccaaaa gtaatatatt gaattaataa ctatatgatg tgtagctaag aattaaatag 660
taaacgtctt ctgaaacctt ttaagaggta attattggta ttccaaagtc atatgtggag 720
gtaagggaga cacaaaatta tctggaatga cagcgtgctg acacatataa agttccgtaa 780
cttcaaatgc cttcattatt caacatagga aaagtgaaat gtgtgcctct aaaatatacg 840
gaacatcgtc gaactaaaaa aatccattaa gcaaagttag aaacagcatg cactacaaga 900
catttggttc atcatgaaga atgctcaatt gaaccatcaa tcactttctc ttgttcgatg 960
ttagcattat cctcactatc agttgaatcc tcaatgcttt cggtttcagt cctcgcatct 1020
tcctgaactt 1030
<210> 194
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology ui1
<400> 194
gcttgtactg aaattaacga gaagattgct ttgtcgagac gtatcgaaaa gcattggcgt 60
ttgattggtt gggcaaccat taaaaagggt actacattgg aacccatcgc ttaaggaacc 120
aataaaacca ctgcaaagac aaaaatttca taattaatct gaaagaaagt gaagataaga 180
aacgggctag gaggaaggga aactgacact tctggttatt gcaatatgct catatacatt 240
gatgcgtaat gacattgatg atctttattc tctttttata acgttttctt tctttttttt 300
tccttcttac atagtattca actgtatatt taacatgttt tacgtatttt taagaaaaaa 360
ttactaaacg cgataatatt aagcaaatat ttatctcata gttctcgaac tcatttattt 420
cccattgatg ccatgaaaac ctctcaaacc tttatcgtct agttacacca gtagtcaata 480
aactgccttt ctttttttac 500
<210> 195
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology di1
<400> 195
acaattgcac aaagataatg aagctccaaa attattcagt atctattgag tatatataac 60
cttgaaaagg ttttatttta tataagttcg ccatcttagt atagtggtta gtacacatcg 120
ttgtggccga tgaaaccctg gttcgattct aggagatggc atatttattt tttatattct 180
taatatacaa agaatgtcgt gtgaagctgt aggcacaggt aattttgtaa ccatagtcag 240
atgtggtgat catgagagcg aattataatt ttataccagc tggcaagaat tgagtaatat 300
ttagaccagc atataaaagt agaataaaaa gttatatgta caaatttttt ttgacgccag 360
gcatgaacaa aaactactat ggctttggaa ttttcaagct cttcgaaatc attccacacc 420
catggataaa aaatactaga ataattggat gaaattccaa tatttggtct tctctaaaaa 480
tgccgaatgg gatgttatca 500
<210> 196
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology ui2
<400> 196
agttttgccg ctttgcttga tcactgttag tatttcggca aattatatgg tattggtcgc 60
cttgccaggt tctttaggaa gttcatcata aactaacgct ttcaagaatt tacggaaatg 120
atagggttta tagtttttat aacagtagtg gacctagaaa acaccgatag ctggcggcgg 180
ttatatctca ttactactat gaaaactgtg gctcctgagt agctactgaa ggatgtgccc 240
atcttaatcc agacctagtc aatacaatca atacaactgt ctagctgaaa actgacaaaa 300
aagtttgtcg atgtctctag tatattcact ataacactaa attttattgt aaattattac 360
acaagttttc aagaagaaaa acaatattaa acacaataac tagattatgc gaggcacggc 420
aaaaggagtg aagagggcaa aatacggaga agacaatata gaataaattt ctttttttga 480
ttagagaaga ttgtttgcca 500
<210> 197
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology di2
<400> 197
atagtcttac catgatgatc agtcggattc tgacgacgtt gggtatttta aaacacgcgt 60
aattgaaagg gtgatgttga gaatggacca cttcaagata tgctcgaaaa tgtagctata 120
tttcacggat gaataactcg taagaatgtg cagtagctga tggacctagg aacgatcaag 180
tcaacgttgt attttggttc ggcaaacaat tgatatgatg ttgacaagaa aaccatctgc 240
gctctaatct ctaagtacac gtgcatttgg acctatcatc aaaagagaat aagaggatac 300
tttcaagaga agttcaaaaa agaatcatta ttatgatcca atgacagtga caataagatc 360
aacataaaaa agaaaagtca gaagtataaa tctgggtctt tttctctaaa ataattatag 420
tctgttaatt tataaaactg cctaaaaaat atacttaaaa tatgtctaca gattatgcag 480
ctggaaaaaa tcaagcaaaa 500
<210> 198
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology ui3
<400> 198
ccagttatcc taggcaatta ctttatttga gtcttatatg acgtcactag aagctcagta 60
agagcaaccg agacctgaac atcctttttt ttttttgctt ctttatttgg cagcattttt 120
caaaaataat aaaatggaag ccgcgagtac gaacaatgat gtgttctggg aatacctcgt 180
caaaacaaga caatggtaag gattttcttt catcaggcag aaagatctgg atctgaatgg 240
catcattttg tgatgtgtaa aagcgggacc ttgttatttc gactttttgc atcatgttga 300
tgcaatttgc tacttttccg acggtgcgct ccaacggatg ggtatttcct taataacaag 360
gcatttctct ggaagttggc ttactgtttg aaatcacagc cggtcacaaa ataaagtaaa 420
aaaactatct ctctccacaa gaagtaatta caggttgtat actacgtgtg atcgtatttc 480
tttatgaaca ctaaggagtt 500
<210> 199
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology di3
<400> 199
cgtgtctgaa aaatcttgaa ttttcagaaa agaataagcc ccaaatgtca gtgatggtag 60
tagcagtact cccctacgat tttagatact ttagagagcc caccttcaga atcggaagga 120
ggataatttt gtaaagccct tctgtttttt ctcttgcata acttatattt ccacatcaaa 180
aagtagtgtg ctaagaaaaa ggagacgaga aaaaggatta cggcactctc tgcatctaga 240
catataccaa aagttgggtt tgctcacgaa aataccataa ttgtggtgtc aaaaaaatcc 300
tgcctcataa taccactgca gcaattgtgg atgactaaaa aataacttgc attccacgat 360
gttattttac tttataaagc acctgcaatt tttttttttg tattaactca tcgagtatgt 420
ctgatgtgta aactgaacca ggcttaatat cgtttctaat tcttgttgtg agaaaacttt 480
cctgcctaat gtatttcgtc 500
<210> 200
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology ui7
<400> 200
taaaatttta ctcatagtag ataatggcat aaatcagtgg taaaaaaaga atacgcatga 60
caaattttga aaaccgtacg tgacataaaa tacctataat aagatgaaga catcaattct 120
caagttccac ttttcgccgg ttgagtgtat cggtcatagt acaaaggcag aaatgaataa 180
tagtaatatt aatcgcagtt tcctgttgtt tagagaactg aaacgctttg tgggcatcag 240
gtttagatta actgctaccc ttcttctgta tttgtctggc catgcttctc aaataaaccg 300
ctgcggataa catttcaagt ggtttctcaa gggagaatca tagtttagct taacatacag 360
caaatcgtca ctatcttgac tgatgcccgg tgtatagaga atgggtagtt aatatcatct 420
agatggggtt tctttgaaaa caccagtttc tttgaggaca ccagtttcag tgcttctctc 480
taccccatca actattgcac 500
<210> 201
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology di7
<400> 201
cgtcttgctt ttttcggtag tttttcgttt cgataaaggc aacaatgctg tatattgtat 60
gcaggaagtt cttaaggaaa atacaggaat ctttagaaaa agaataaata gctttccatt 120
gtatcatgaa caagtcactc ttcatattat atgtcgtcgc ttcttatccc tttaagttat 180
gaattgttga gcttaggatt ttaccacaac tgaataacat tttctttatt ctaataatcg 240
atttttttta ataagattag acttgagtgc catcagcaaa gaaaaatact tttaatgtcc 300
tttttttaac tctacacaga atttagttgg ctgtttcatt gatttacaaa aatataaata 360
tataccgtta aaattatagc gataaactga gtatgtggtc tctcttttcc cgcagaatat 420
gaaagctttt cttttataaa tcttataata ttggtctctt tttggtacgt ttggcaaatt 480
ggcattcatt tatcatgaaa 500
<210> 202
<211> 503
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology ui10
<400> 202
aggaccactt catcaagttt cgaaagtgaa attaaatcca tttcagaaaa tttcaagaac 60
tctattccag aatcttccat actcttcagg atatcatata ataacaactc taataatacc 120
tctagtagcg agatcttcac acttttggta gaaaaagttt ggaattttga cgacttgata 180
atggcgatca attctaaaat ttcgaataca cataataaca acatttcacc aatcaccaag 240
atcaaatatc aggacgaaga tggggatttt gttgtgttag gtagcgatga agattggaat 300
gttgctaaag aaatgttggc ggaaaacaat gagaaattct tgaacattcg tctgtattga 360
taaataaaac tagtatacag caaatactaa ataattcaag aaaaaaacat tagatagaga 420
ggggcagatg ttcaagctat acccattata ttgatccaca cttagtatta agatacgtct 480
gtgaaggatg aaaaaaaatg tat 503
<210> 203
<211> 485
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology di10
<400> 203
ttgtgcgttt ttataatttt tttttttttg taattctatg caaatgtaat ataagtatat 60
ttaaagaaat aatgagtcct gtgaaaacaa aaagaaaaaa agatcattaa tgtatgttaa 120
cgtatttgct ttgcaaattt taatttattt gttgttaaat gcattttttt tttgtcgttt 180
cagcgagttt tcttgaggtt gctactatca ttaaaatcac aatccacaga ggaagttgat 240
ctctttttca gttgggtggg ggcagagcat gggtgagcag tggccatggg tctaacagga 300
aataatcttt ttgaacgcac agataaattt tgtaataatt ttctatttga cattagagat 360
ggggtggtgg gagttagtgg gcttggccaa aagatgcttg aattttgtgg gatgctcagt 420
gaccttttaa aagaattttg ggtagaagag aacgaacctg aatgtgaatg gtgtgatgca 480
gagtc 485
<210> 204
<211> 490
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology ui21
<400> 204
ccatctatcc ttcgcctctc cttcgctctg taattttttt tactcgcgcg cttccgactt 60
ttgaaagaag gagcaataaa gttaaataaa tgtaattaaa ttatgctttt ttaggcaagt 120
tcgggacttt gttgccacgt attgctcttc tatgcaagca cttcactcct tttctttcat 180
ctctgttttc ttccactggc tggaagcttg agggttgcct cttgattctt tatcgcctgc 240
aaccattgcc ttgttccgtc ctctcaaggc gttccttccg tgctttttaa atactagaat 300
cattcgagac gtatttatga gcatgttact tcttgatgtt tatctaagag ggttgtttag 360
gttatccgca ttatttttaa agttttaagg ttacatcatt tattcagacg cgttcggagg 420
agagtgcatt caccaagatg taaatttctt cagttttccg gattaggatt ggaaaaatga 480
agaaaaatag 490
<210> 205
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology di21
<400> 205
ctaaagtaat tgtagcagtt gttattaagg agttctttaa atcatattgc ttgcttgtat 60
cagaccattg gaaacttcaa tgtttaaact ctagaaaggt tgatctgctc aaatattttc 120
atatttacgg catgtcctaa cttgaacatt tgtagaagag agacatattt cttagtgtag 180
gcaagatatt tgaatgacat tgtctgccga aatatactcg acttgcagtg gaactgcaag 240
tcgaaaagga tatcgcttta gccaaacaaa aatttgttgt gctattcagt gagcatgcat 300
tggctataga ggccgcacct aaattgtatc ttttgattta tgtaactgcc acactttctc 360
tagcacagtc atgatacggc tttttttcat ttagccacca aatactgtaa atatcgtttt 420
agaacgttat gaaaaaatgc tcatccactt aaaaacctct ccgtattctg aaagttggta 480
taatcttgca ctttaagtgt 500
<210> 206
<211> 503
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology ui33
<400> 206
tctgttaacc attctggttc acttgccgtc gtatgttgcg gaccacctat tttcgtcgac 60
accgctagaa atcaaactgc caaagctgtt atcagaaacc catcaagaat gattgaatac 120
ttggaggaat accaagcctg gtgaacaatt tttcatattt aagtaaacac tcaatgtata 180
atatcctcta actgttgtaa tttcattaac gtaaatggtt tgcgcctttt ttaggggacc 240
cttgttgatt cattctaact actgaggcat aagttgtttc aaataacact ttttcagaaa 300
aataatcgta ttaaaaagca gaaaaatcat acgtaagatg acagaagctt catatttagt 360
aactctgaat tgtataacac accaattgcc gatagaatat gaaccaatcg atcttcagcg 420
ttcatgtact taatttaact acctgtattt tcttataaag ataaaattgg tgtataatgt 480
aagggccaag agaaaaagga atc 503
<210> 207
<211> 494
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology di33
<400> 207
atttcttcaa aaaaataact gaataaacac ctatataatg ttcagaggtt atactttagt 60
gttttagaat gcagtaccaa aagtaatata ttgaattaat aactatatga tgtgtagcta 120
agaattaaat agtaaacgtc ttctgaaacc ttttaagagg taattattgg tattccaaag 180
tcatatgtgg aggtaaggga gacacaaaat tatctggaat gacagcgtgc tgacacatat 240
aaagttccgt aacttcaaat gccttcatta ttcaacatag gaaaagtgaa atgtgtgcct 300
ctaaaatata cggaacatcg tcgaactaaa aaaatccatt aagcaaagtt agaaacagca 360
tgcactacaa gacatttggt tcatcatgaa gaatgctcaa ttgaaccatc aatcactttc 420
tcttgttcga tgttagcatt atcctcacta tcagttgaat cctcaatgct ttcggtttca 480
gtcctcgcat cttc 494
<210> 208
<211> 502
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology ui34
<400> 208
tgagcaacca atcacttctg aaaccgctat gaagaaggtt gaagatggta acattttggt 60
tttccaagtt tccatgaaag ctaacaaata ccaaatcaag aaggccgtca aggaattata 120
cgaagttgac gtattgaagg ttaacacttt ggttagacca aacggtacca agaaggctta 180
cgttagattg actgctgact acgatgcttt ggacattgct aacagaatcg gttacattta 240
atctaattgg tttaattaat aaatttaata ttatttttaa atttttcttt aaatatacaa 300
taaatctttc ataacatgtt aaattcatga ttaagcgtaa ataaagtgta gtggcagagt 360
gcacggggtt tcctgtgcct tacaaagtag gtaccaattt gcgtattgca gcgagggttc 420
cggttactat ttataattac gtgttagtgt actgtgattt tattgaggct ataacaagaa 480
aaggatctgt gaaggttttg ag 502
<210> 209
<211> 492
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology di34
<400> 209
cccttttctt ttcgcaagat gagagtaaag agttgtacat caggtaagaa tgttattatt 60
taaattcgaa gtgataaatt cttttcatga tgaatcactc gcttatatgg ggtagaatat 120
atatatatgt gtgtgtgtgt gtgtgtttgt gtatgtaggg atggtgcgcg tttgttgtgt 180
gacatttgct actcattctt ttccttttcc tacgactggc ttaacgggaa tattatcaat 240
ttgctgcatt cttatgcttc ggtccgatgc tcattaagat gatgcagatc tcgatgcaac 300
gaattccaag cccttatcga tatttttctt taactgggag acgcaaattg gcaacatttg 360
gttgcgttcc atgtcgttca tcctattaac gatgtcataa tccacatagg aaacaccctt 420
tgtagtaata gttaatggta tggcaaagta gtctgcaccg tccaccagag gcaataattg 480
atctgcccca gg 492
<210> 210
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology ui1001
<400> 210
atgagttaac gtagattact ttccgttagt gtaacggaca agatgacaca gtattgaaac 60
gctcctctat ctttgtgggt gttgagggag gagagagtta ttagtctaga cgctatatat 120
cacagagtcg agtgcccaat aatatagcag gtagacgcca acttaactac tggatgtgag 180
ttagagagga atatactgtt ttattaactc gtaccgtagt ggtttctggc gagaattcgc 240
cggcttaaaa tctattgact aactagaatt agcttaaagt ggaccttatt gaatctcacc 300
gggttatcgc gacatttata ttatcgaagg gtccagcttg gggtttgttt ggaaggttga 360
cttgttgttg tagttgtatc actaataatt acgattctca acgaccggcc caaagattgc 420
ctgggttaaa ttcagcgtct gatgtatact tagctcttcg caagtagtgt tctaattaaa 480
gatcacttca acttatcttc 500
<210> 211
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology di1001
<400> 211
aaggtaactt aggctgacga cgcaataatg cacgctcgcg tgtgatgaga ttacatcagt 60
aagaattcat atctcgatat gggataatta cgggctcaca ctctaggaat accaagaaca 120
gtaatgtttc cttattaata tgtagataag gatcttctta aagtttaatt gatggcgaat 180
tccaaatagt gacttaactt tcgcgcagag tatcgcagga caaggttagc ttttcgtaag 240
cctgatgtta tgccataact agccaccttg taatccccgc gctcagtaaa tctatctact 300
tatttgcttc acttcttcac ggagagtact gattttcttc tctcaatact actgactacc 360
ttagggcgac atactttagt tctgcataag cgtcaacctc ttgcctaagg gtatggatcc 420
ttgttaagcc ttattccata tagttgagct gaagaacaga tccctcaact cgagtgcgac 480
tagtagctag ttacgaacac 500
<210> 212
<211> 508
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology uPEP4
<400> 212
tgttaatccg ttttcaatat cttgagctcc tcaattgtat ttgctgaggt ctgattattt 60
ctataaccaa aagcggttat tgaatctatg gagaggctgt aacccgtctt atgccttccg 120
ggtactatat ttcatttgcg ggtgtcgatg gattaagggg cggaggcggc cctttttagg 180
atttatataa aaagccatac ttccgtactt cgtaacctct tatcaactgg ttaagggaac 240
agagtaaaga agtttgggta attcgctgct atttattcat tccaccttct tcttttttga 300
gcgaagcctt tataatcaaa ttttagtggt cttttctatt tttatttgag aagcctacca 360
cgtaagggaa gaataacaaa aagtatatct caccctactg tattcataaa aagttttttc 420
tattagaatt ctataagaaa agaaaaaaaa aaagcctagt gacctagtat ttaatccaaa 480
taaaattcaa acaaaaacca aaactaac 508
<210> 213
<211> 509
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology dPEP4
<400> 213
gctaaacttt tcttacttct ccgccctatc cttttctgcc atctagagag cttttataag 60
tagataacaa taaaaaaaac tatagtatat ttaaaaaaaa aaaacaagac aaaccatctt 120
gtcctcagtt ttagaatcca ttgttctatg ctgctgccca taatgtcatt atatgcgggt 180
agcccgatga tgcggctcga gaatttcctt gtttatcctt ttccaatagc ggaacaattg 240
ataataaagc aatgtaagca gaagcgaaaa ataaaaagaa ataggctgca gagattcaca 300
ggctgcgctc tagaaacatt tgaaatcaag gcaaacatag aacacttgat aaaattctta 360
ccataatacc accattgatg attcaaaaaa tgagcccaag cttaaggagg ccatcaacga 420
ggtctagttc tggttcaagt aatatcccac aatcgccctc tgtacgatca acttcatcgt 480
tttctaatct gacaagaaac tccatacgg 509
<210> 214
<211> 503
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology uROT2
<400> 214
gctgcattct tcagtggtat gttatttatg taacgggtat gcgaaccaca acgccagatt 60
cttgaagggg aaacctaact acacagtctt agcaacaacc gccggcgctc tgggtctttt 120
gacgctggac ggtataattt caaagaaata ctactccaga tacgacaaga aataataaca 180
tactatttaa taacatcctt ttcacacact cacacactca catactttat atacatatat 240
ttttataact attactgctg atttatttgt aaggaaacgt gctttcctct tcatcggtca 300
agtgataagt ttctataata taatagcttt tctgtctact atcattcttt tttcatttca 360
agtaccctta atttgtttta cccggaccac gaaattttct cactacggca cttgagagct 420
ataactcaat gaacatgttg cttggagtga tttgattgct ctgcgtatct taaaatagcg 480
gtctcgaatc aaccgtatgc aac 503
<210> 215
<211> 499
<212> DNA
<213> Artificial sequence
<220>
<223> flanking homology dROT2
<400> 215
gggaaaaaaa cgaaggggta tctttacatc tttttagctc tttcttgcaa attaaacgta 60
aaaatatccg taaatataac atataaccat tatctataga aaaaaagaac cgaaaattgt 120
gtcaggccct acttcccgtg agctaacttc attcttgtcg aaaacttgac tagggtcgtc 180
cagtcgcaaa acgtatcaca tttcggacat ttcccacctc gaggtatcaa gtttcttctc 240
cctactatca actgttcatc atctagaaag tacctgtgaa gacatttcaa gtggttaaca 300
gacccgcagt ctttattatt gcaaagcgct acaaacggct tcagattttg ctcttcagac 360
gtgtagtcaa tttctttctc acaaatttca catcgcacta cccctgtagt tagtttcttt 420
tcgaaagttt cgaatatgtt tctttcattt tcaatgactt tagtgtatac agcttctacc 480
acttttaaat tctcatcac 499
<210> 216
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker D0
<400> 216
atcgacgggc cggccagtgt ctctcgttta aacttg 36
<210> 217
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker D2
<400> 217
atgagtatgc tatactccca ctaatggcat cacgct 36
<210> 218
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker D3
<400> 218
taggcaagaa tagcggagca ctaggtttcg acttaa 36
<210> 219
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker D4
<400> 219
acgatccacg gcttctaaag actgacaatt gctttc 36
<210> 220
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker D9
<400> 220
caaagctagg ccggccctta gtactagttt aaaccg 36
<210> 221
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker D20
<400> 221
tcggagcaaa tgaaacgatt ccgataagtg ttgcaa 36
<210> 222
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker D21
<400> 222
ttgtgggtga aagaaggaga ggtacgtttc tatcgt 36
<210> 223
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker D22
<400> 223
agacagaccc gccttaatct acaagattcg tgacat 36
<210> 224
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker D23
<400> 224
cattatattt cctacggagt cggaagcagg gacgta 36
<210> 225
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker D30
<400> 225
ggaccaccag taacactcca attctgggtg atttac 36
<210> 226
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker DH7
<400> 226
ctcttattac cctatcctat ggtactttct cggcag 36
<210> 227
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker G1
<400> 227
acctctatac tttaacgtca aggagaaaaa actata 36
<210> 228
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker G7
<400> 228
catgataaaa aaaaacagtt gaatattccc tcaaaa 36
<210> 229
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker G10
<400> 229
aaaaaaaaag taagaatttt tgaaaattca atataa 36
<210> 230
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker RG1
<400> 230
tatagttttt tctccttgac gttaaagtat agaggt 36
<210> 231
<211> 36
<212> DNA
<213> Artificial sequence
<220>
<223> linker LTTDH1
<400> 231
ataaagcaat cttgatgagg ataatgattt tttttt 36
<210> 232
<211> 635
<212> DNA
<213> Artificial sequence
<220>
<223> promoter pGAL1
<400> 232
tttggatgga cgcaaagaag tttaataatc atattacatg gcaataccac catatacata 60
tccatatcta atcttactta tatgttgtgg aaatgtaaag agccccatta tcttagccta 120
aaaaaacctt ctctttggaa ctttcagtaa tacgcttaac tgctcattgc tatattgaag 180
tacggattag aagccgccga gcgggcgaca gccctccgac ggaagactct cctccgtgcg 240
tcctggtctt caccggtcgc gttcctgaaa cgcagatgtg cctcgcgccg cactgctccg 300
aacaataaag attctacaat actagctttt atggttatga agaggaaaaa ttggcagtaa 360
cctggcccca caaaccttca aatcaacgaa tcaaattaac aaccatagga taataatgcg 420
attagttttt tagccttatt tctggggtaa ttaatcagcg aagcgatgat ttttgatcta 480
ttaacagata tataaatgca aaagctgcat aaccacttta actaatactt tcaacatttt 540
cggtttgtat tacttcttat tcaaatgtca taaaagtatc aacaaaaaat tgttaatata 600
cctctatact ttaacgtcaa ggagaaaaaa ctata 635
<210> 233
<211> 668
<212> DNA
<213> Artificial sequence
<220>
<223> promoter pGAL1-10
<400> 233
tatagttttt tctccttgac gttaaagtat agaggtatat taacaatttt ttgttgatac 60
ttttatgaca tttgaataag aagtaataca aaccgaaaat gttgaaagta ttagttaaag 120
tggttatgca gcttttgcat ttatatatct gttaatagat caaaaatcat cgcttcgctg 180
attaattacc ccagaaataa ggctaaaaaa ctaatcgcat tattatccta tggttgttaa 240
tttgattcgt tgatttgaag gtttgtgggg ccaggttact gccaattttt cctcttcata 300
accataaaag ctagtattgt agaatcttta ttgttcggag cagtgcggcg cgaggcacat 360
ctgcgtttca ggaacgcgac cggtgaagac caggacgcac ggaggagagt cttccgtcgg 420
agggctgtcg cccgctcggc ggcttctaat ccgtacttca atatagcaat gagcagttaa 480
gcgtattact gaaagttcca aagagaaggt ttttttaggc taagataatg gggctcttta 540
catttccaca acatataagt aagattagat atggatatgt atatggtggt attgccatgt 600
aatatgatta ttaaacttct ttgcgtccat ccaaaaaaaa agtaagaatt tttgaaaatt 660
caatataa 668
<210> 234
<211> 465
<212> DNA
<213> Artificial sequence
<220>
<223> promoter pGAL7
<400> 234
ggacggtagc aacaagaata tagcacgagc cgcgaagttc atttcgttac ttttgatatc 60
gctcacaact attgcgaagc gcttcagtga aaaaatcata aggaaaagtt gtaaatatta 120
ttggtagtat tcgtttggta aagtagaggg ggtaattttt cccctttatt ttgttcatac 180
attcttaaat tgctttgcct ctccttttgg aaagctatac ttcggagcac tgttgagcga 240
aggctcatta gatatatttt ctgtcatttt ccttaaccca aaaataaggg aaagggtcca 300
aaaagcgctc ggacaactgt tgaccgtgat ccgaaggact ggctatacag tgttcacaaa 360
atagccaagc tgaaaataat gtgtagctat gttcagttag tttggctagc aaagatataa 420
aagcaggtcg gaaatattta tgggcattat tatgcagagc atcaa 465
<210> 235
<211> 600
<212> DNA
<213> Artificial sequence
<220>
<223> promoter pTDH3
<400> 235
ttagtcaaaa aattagcctt ttaattctgc tgtaacccgt acatgcccaa aatagggggc 60
gggttacaca gaatatataa catcgtaggt gtctgggtga acagtttatt cctggcatcc 120
actaaatata atggagcccg ctttttaagc tggcatccag aaaaaaaaag aatcccagca 180
ccaaaatatt gttttcttca ccaaccatca gttcataggt ccattctctt agcgcaacta 240
cagagaacag gggcacaaac aggcaaaaaa cgggcacaac ctcaatggag tgatgcaacc 300
tgcctggagt aaatgatgac acaaggcaat tgacccacgc atgtatctat ctcattttct 360
tacaccttct attaccttct gctctctctg atttggaaaa agctgaaaaa aaaggttgaa 420
accagttccc tgaaattatt cccctacttg actaataagt atataaagac ggtaggtatt 480
gattgtaatt ctgtaaatct atttcttaaa cttcttaaat tctactttta tagttagtct 540
tttttttagt tttaaaacac caagaactta gtttcgaata aacacacata aacaaacaaa 600
<210> 236
<211> 648
<212> DNA
<213> Artificial sequence
<220>
<223> promoter pTEF1
<400> 236
gacagcctag acatcaatag tcatacaaca gaaagcgacc acccaacttt ggctgataat 60
agcgtataaa caatgcatac tttgtacgtt caaaatacaa tgcagtagat atatttatgc 120
atattacata taatacatat cacataggaa gcaacaggcg cgttggactt ttaattttcg 180
aggaccgcga atccttacat cacacccaat cccccacaag tgatccccca cacaccatag 240
cttcaaaatg tttctactcc ttttttactc ttccagattt tctcggactc cgcgcatcgc 300
cgtaccactt caaaacaccc aagcacagca tactaaattt cccctctttc ttcctctagg 360
gtgtcgttaa ttacccgtac taaaggtttg gaaaagaaaa aagagaccgc ctcgtttctt 420
tttcttcgtc gaaaaaggca ataaaaattt ttatcacgtt tctttttctt gaaaattttt 480
ttttttgatt tttttctctt tcgatgacct cccattgata tttaagttaa taaacggtct 540
tcaatttctc aagtttcagt ttcatttttc ttgttctatt acaacttttt ttacttcttg 600
ctcattagaa agaaagcata gcaatctaat ctaagtttta attacaaa 648
<210> 237
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic recombination sequence SRS_A
<400> 237
gagatcccca aacagtttaa tctctgatta actcttgcgt cgtatagtcg ggcttaactc 60
tatactcaaa atcactaaca agacgtagac gcaagacgat aagaccgggc aggatactat 120
cactcaatac agttcagata tctccatcta actgtactct actcaaaaac gtcttactaa 180
agaagaacgt ctccccaaca catgtaggaa ggaagtgatt acttatcttt tggctaaatc 240
tacataatta gtcagcgaac attctagaca gagagtgaaa tctgacacgt gagattagtg 300
cttatactct gattaggctc agtgaaacta gctgtgcatg taccgtgatt attaggttga 360
cataggaatt tagtgcctaa tatcggctcg attataaata tcagtaattc gatatcgtca 420
tgtctttatg cttacaggta tagtaatttg gccttagtgg aacgatcaat cggcttctgt 480
aatattattc aaccttccct 500
<210> 238
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic recombination sequence SRS_B
<400> 238
atcaatttgt gattactttg ctaggtaacg ccactactgg tgttataatt gctgttactt 60
aagcgccttg taatcgatat aagtgaaaat acaaaacgca gcctactgtt caatcggaac 120
tttagtatat tcgctgagga ctagtacgtg gcagaatcca gttataatgt tttgaatcgc 180
ttttaaggta ctgaagtaat ccacaggacg ccaagctctt atagcacagt gggatatatt 240
tgcacgtgat tattgcaaaa gagctaaggg tcgtacttca cgtcttttaa ctgagtacaa 300
cccaaatttg gttcgcttcg caagttgata gcgtagcgac acgtacagtt ggtttgaaac 360
agtagtactt ccttttaatt cggaggctta ttgtacggaa agtgttctgt taattaaggt 420
agaccagcag aacaccgcgc accagggatt gcatatcttt agggtatttc gagattgcat 480
cccattgaaa tcgtaccttt 500
<210> 239
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic recombination sequence SRS_C
<400> 239
gaagactaac gaacgaggtg gtatatagtt accctatgta aaatcgttaa ttgtcattag 60
acaatgtttc aggagtagaa ttcagctagc tgttagccca ctggcacacg ctaggacgct 120
atcaggacgg tctttgaacc tataactcag tatatggttt tagacctata atctgcttct 180
aaggcacgcg ggggagtaac ttagatacta gtgcttccag gaaaatctgg cgcgataata 240
gcccgatctt ctaatagact atccctacca aaagttataa atcattgttc tcttgacgtt 300
aacatcactt gctgaaaatt agaatgtgaa gaaaaccata aacaaattag cctggcagac 360
agtgaatata ctctacgttg aacatataca aaaatagaag cgccggaaag aagatccttc 420
ccagtagagt ccgttaaact aatttcctaa ttgctaaata ctgtatctac ctgataatgg 480
ggtcggttac ttcagtttat 500
<210> 240
<211> 497
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic recombination sequence SRS_D
<400> 240
ctcgtaactt gatttaataa cggaggctat tgaacttaaa atcgcatctg gctgttaaat 60
tcaataaact tgctttcagg ttaacttcct ttgattaagt cgtgaattaa gatcttatta 120
caactctggt acgtcctaag ttagattaaa gtactgttgg aggaacggaa taatatttct 180
tgtgttacgc agctagtgaa ccagtgtcac aggaggtgta acaaccggtt gaaattctaa 240
cttttgaagc tttacctaag ggtactgtat aaggatcaca ttgttagtac aacgctagtt 300
cgtaggcgtt caaacttaat tgtttagtag tccgcacttg actaactgac gctccttggt 360
ctgctcttcg taattaggct ttcgaaaggt acgatggaat actaagtata ataacagttg 420
tctgacaact acggtacgta tttgatgttg aggcagtgag ctaactccac ttagtgtgta 480
accttacgta tcatata 497
<210> 241
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic recombination sequence SRS_G
<400> 241
atgagttaac gtagattact ttccgttagt gtaacggaca agatgacaca gtattgaaac 60
gctcctctat ctttgtgggt gttgagggag gagagagtta ttagtctaga cgctatatat 120
cacagagtcg agtgcccaat aatatagcag gtagacgcca acttaactac tggatgtgag 180
ttagagagga atatactgtt ttattaactc gtaccgtagt ggtttctggc gagaattcgc 240
cggcttaaaa tctattgact aactagaatt agcttaaagt ggaccttatt gaatctcacc 300
gggttatcgc gacatttata ttatcgaagg gtccagcttg gggtttgttt ggaaggttga 360
cttgttgttg tagttgtatc actaataatt acgattctca acgaccggcc caaagattgc 420
ctgggttaaa ttcagcgtct gatgtatact tagctcttcg caagtagtgt tctaattaaa 480
gatcacttca acttatcttc 500
<210> 242
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic recombination sequence SRS_H
<400> 242
aaggtaactt aggctgacga cgcaataatg cacgctcgcg tgtgatgaga ttacatcagt 60
aagaattcat atctcgatat gggataatta cgggctcaca ctctaggaat accaagaaca 120
gtaatgtttc cttattaata tgtagataag gatcttctta aagtttaatt gatggcgaat 180
tccaaatagt gacttaactt tcgcgcagag tatcgcagga caaggttagc ttttcgtaag 240
cctgatgtta tgccataact agccaccttg taatccccgc gctcagtaaa tctatctact 300
tatttgcttc acttcttcac ggagagtact gattttcttc tctcaatact actgactacc 360
ttagggcgac atactttagt tctgcataag cgtcaacctc ttgcctaagg gtatggatcc 420
ttgttaagcc ttattccata tagttgagct gaagaacaga tccctcaact cgagtgcgac 480
tagtagctag ttacgaacac 500
<210> 243
<211> 500
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic recombination sequence SRS_J
<400> 243
ggtcattgag gcggtaagaa tctgatttat ctagatctat agcaacgtca aataattcaa 60
atcccgtact tttcaagatt ctgagggtta aggtcttgat tgtgattcta aatacttgta 120
ggtaccgagt aatagacgcg cactcagatt tggtctaata cgattattta cccatagaga 180
gagtaatcgt ctatggcccg tagttagcaa ggttcaacgt gtattatgta ctgagtacgc 240
agatctgatt accctataat ttccaagata ttagtgattc taacggatat agtcaatacc 300
tcccaattcc ccacgcttcg attgtagtat ttgattcggc tgacaaacgc cgacaagatt 360
cgctgtaact ctttggctaa tagaaaagta aatcaacacg cgttcttaaa ttcttgacat 420
gtaagtactt ggaacaatct tacctgttat ccattatctg tttatcgatc ttacctaacc 480
atccagtttg cctgagtggg 500
<210> 244
<211> 214
<212> DNA
<213> Artificial sequence
<220>
<223> terminator tTDH1
<400> 244
gaatatacat aaatactacc gtttttctgc tagattttgt gaagacgtaa ataagtacat 60
attacttttt aagccaagac aagattaagc attaacttta cccttttctc ttctaagttt 120
caatactagt tatcactgtt taaaagttat ggcgagaacg tcggcggtta aaatatatta 180
ccctgaacgt ggtgaattga agttctagga tggt 214
<210> 245
<211> 301
<212> DNA
<213> Artificial sequence
<220>
<223> terminator tENO1
<400> 245
agcttttgat taagccttct agtccaaaaa acacgttttt ttgtcattta tttcattttc 60
ttagaatagt ttagtttatt cattttatag tcacgaatgt tttatgattc tatatagggt 120
tgcaaacaag catttttcat tttatgttaa aacaatttca ggtttacctt ttattctgct 180
tgtggtgacg cgtgtatccg cccgctcttt tggtcaccca tgtatttaat tgcataaata 240
attcttaaaa gtggagctag tctatttcta tttacatacc tctcatttct catttcctcc 300
t 301
<210> 246
<211> 504
<212> DNA
<213> Artificial sequence
<220>
<223> terminator tSSA1
<400> 246
gccaattggt gcggcaattg ataataacga aaatgtcttt taatgatctg ggtataatga 60
ggaattttcc gaacgttttt actttatata tatatataca tgtaacatat attctatacg 120
ctatagagaa aggaaatttt tcaattaaaa aaaaaataga gaaagagttt cacttcttga 180
ttatcgctaa cactaatggt tgaagtactg ctactttaat tttatagata ggcaaaaaaa 240
aattattcgg ggcgagctgg gaattgaacc cagggcctct cgcatgcttt gtcttcctgt 300
ttaatcagga agtcgcccaa agcgagaatc ataccactag accacacgcc cgtactaatt 360
gatgtcttcc ttttcggata gatgtatata tatacaaatt ggtcagattg cttttggctc 420
cctttcgtac gtaactcatt tagactacgg atcactagca ctatctcacc aagtttttaa 480
aagatccact gtgatcatta aaga 504
<210> 247
<211> 250
<212> DNA
<213> Artificial sequence
<220>
<223> terminator tADH1
<400> 247
gcgaatttct tatgatttat gatttttatt attaaataag ttataaaaaa aataagtgta 60
tacaaatttt aaagtgactc ttaggtttta aaacgaaaat tcttattctt gagtaactct 120
ttcctgtagg tcaggttgct ttctcaggta tagcatgagg tcgctcttat tgaccacacc 180
tctaccggca tgccgagcaa atgcctgcaa atcgctcccc atttcaccca attgtagata 240
tgctaactcc 250
<210> 248
<211> 249
<212> DNA
<213> Artificial sequence
<220>
<223> terminator tCYC1
<400> 248
atcatgtaat tagttatgtc acgcttacat tcacgccctc cccccacatc cgctctaacc 60
gaaaaggaag gagttagaca acctgaagtc taggtcccta tttatttttt tatagttatg 120
ttagtattaa gaacgttatt tatatttcaa atttttcttt tttttctgta cagacgcgtg 180
tacgcatgta acattatact gaaaaccttg cttgagaagg ttttgggacg ctcgaaggct 240
ttaatttgc 249
<210> 249
<211> 509
<212> DNA
<213> Artificial sequence
<220>
<223> terminator tHUG1
<400> 249
agtatgcttc tctttttttt tgtaggccag tgataggaaa gaacaataga atataaatac 60
gtcagaatat aatagatatg tttttatatt tagacctcgt acataggaat aattgacgtt 120
ttttttggcc aacatttgaa attttttttt gttacctcgc gctgagccca aacgggctcc 180
actacccgcc gcggtcgcca ttttgggaag tcatccgtcc caaaaaggaa atagccataa 240
catgtcgtta ctgttttgga acatcgcccg tttcgcccga ttccgcctca gcgggtataa 300
aaagagatct ttttttttcc tggctgtccc ttcccatttt taaatgtctt atctgctcct 360
ttgtgatctt acggtctcac taacctctct tcaactgctc aataatttcc cgctatgcaa 420
aattcccaag actactttta cgctcaaaat cgctcccaac aacaacaagc cccttccaca 480
ttgcgtaccg tgaccatggc ggaatttag 509
<210> 250
<211> 505
<212> DNA
<213> Artificial sequence
<220>
<223> terminator tSPG5
<400> 250
aaagacgttg tttcatcgcg ctattaccaa gaaggttact ttacttgttc ttgcacatgg 60
acgcacgttg tgtgttcata tatatatata tatatatata tatatatttg tgcttgtttt 120
cattgtctct atagttaata cattctattt ttatcgttat atttgcattc tcttcgcata 180
aaaacttcat gaaaattcgg cagaaaataa gccatatatg tactttatcc ataggcaaag 240
aaaagcactt aacgagaata tacaacaatt gcactagtac tgcatgtata tactcttatg 300
attatagcgg caagaaaaca aatataaaca cactaacaga tgaattcgaa tgaagatata 360
catgaagaac gcattgaagt tccacgaact ccccatcaaa cccagccaga gaaagactct 420
gatcgcatcg ctctcaggga tgaaatatca gtaccagaag gcgatgaaaa agcatattcg 480
gatgagaaag tagaaatggc aacca 505
<210> 251
<211> 522
<212> DNA
<213> Artificial sequence
<220>
<223> promoter pGAL10
<400> 251
atctgttaat agatcaaaaa tcatcgcttc gctgattaat taccccagaa ataaggctaa 60
aaaactaatc gcattattat cctatggttg ttaatttgat tcgttgattt gaaggtttgt 120
ggggccaggt tactgccaat ttttcctctt cataaccata aaagctagta ttgtagaatc 180
tttattgttc ggagcagtgc ggcgcgaggc acatctgcgt ttcaggaacg cgaccggtga 240
agaccaggac gcacggagga gagtcttccg tcggagggct gtcgcccgct cggcggcttc 300
taatccgtac ttcaatatag caatgagcag ttaagcgtat tactgaaagt tccaaagaga 360
aggttttttt aggctaagat aatggggctc tttacatttc cacaacatat aagtaagatt 420
agatatggat atgtatatgg tggtattgcc atgtaatatg attattaaac ttctttgcgt 480
ccatccaaaa aaaaagtaag aatttttgaa aattcaatat aa 522
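
Each DNA record in the listing above declares its length under `<211>` and repeats a running residue count at the end of every sequence line, so the declared lengths can be checked mechanically. The following is a minimal sketch of such a check, assuming the plain-text layout shown above (numeric tags `<210>`, `<211>`, `<212>`, `<223>`, `<400>`, and lowercase nucleotide groups followed by a count). The function names `parse_sequence_listing` and `length_mismatches` and the sample record are hypothetical and are not part of the application. The sketch handles DNA records only.

```python
import re


def parse_sequence_listing(text):
    """Parse ST.25-style DNA records into dicts (id, length, type, description, sequence)."""
    records = []
    current = None
    in_sequence = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tag = re.match(r"<(\d{3})>\s*(.*)", line)
        if tag:
            code, value = tag.group(1), tag.group(2).strip()
            if code == "210":  # a <210> tag opens a new record
                current = {"id": int(value), "length": None, "type": None,
                           "description": "", "sequence": ""}
                records.append(current)
                in_sequence = False
            elif current is None:
                continue
            elif code == "211":
                current["length"] = int(value)
            elif code == "212":
                current["type"] = value
            elif code == "223":
                current["description"] = value
            elif code == "400":  # residues follow on the next lines
                in_sequence = True
            continue
        if in_sequence and current is not None:
            # Keep letters only: drops the spaces and the running residue count.
            current["sequence"] += re.sub(r"[^a-zA-Z]", "", line)
    return records


def length_mismatches(records):
    """Return (id, declared length, actual length) for records whose lengths disagree."""
    return [(r["id"], r["length"], len(r["sequence"]))
            for r in records if r["length"] != len(r["sequence"])]


# Hypothetical 24-nucleotide record, used only to illustrate the layout.
sample = """<210> 1
<211> 24
<212> DNA
<213> Artificial sequence
<220>
<223> hypothetical example
<400> 1
acgtacgtac gtacgtacgt acgt 24
"""

print(length_mismatches(parse_sequence_listing(sample)))  # prints []
```

An empty result means every declared `<211>` length matches the number of residues actually listed. Any tuple returned points to a record that may have been truncated or corrupted during extraction.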

Claims (121)

1. An engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide, the engineered variant comprising an amino acid sequence of SEQ ID No. 44 with one or more amino acid substitutions.
2. The engineered variant of claim 1, wherein the engineered variant comprises an amino acid sequence having at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID NO 44.
3. The engineered variant of claim 1 or claim 2, wherein the engineered variant comprises at least one amino acid substitution in a signal polypeptide, a Flavin Adenine Dinucleotide (FAD) binding domain, a Berberine Bridge Enzyme (BBE) domain, or a combination of the foregoing.
4. The engineered variant of claim 3, wherein the engineered variant comprises at least one amino acid substitution in the signal polypeptide.
5. The engineered variant of claim 3 or claim 4, wherein said engineered variant comprises at least one amino acid substitution in said FAD binding domain.
6. The engineered variant of any one of claims 3-5, wherein the engineered variant comprises at least one amino acid substitution in the BBE domain.
7. The engineered variant of any one of claims 3-6, wherein said engineered variant comprises a substitution of at least one surface exposed amino acid.
8. The engineered variant of claim 1 or claim 2, wherein said engineered variant comprises at least one amino acid substitution at an amino acid selected from the group consisting of:
L132, S170, F171, N196, K261, L269, F317, P539, R31, P43, P49, K50, L51, Q55, H56, L59, M61, S62, L71, S100, V103, T109, Q124, V125, L132, S137, H143, V149, W161, K165, E167, N168, S170, F171, P172, Y175, G180, N196, H208, G235, A250, I257, K261, L269, G311, F317, L327, T379, K390, S429, N467, Y500, N528, P539, P542, H543, H544, and H545.
9. The engineered variant of claim 8, wherein said engineered variant comprises at least one amino acid substitution selected from the group consisting of: L132M, S170T, F171I, N196T, N196Q, N196V, K261C, L269I, F317Y, P539T, R31Q, P43E, P49E, P49K, P49Q, K50T, L51I, Q55E, Q55P, H56E, L59E, M61H, M61S, M61W, S62Q, L71A, S100A, V103F, T109V, Q124D, Q124E, Q124N, V125E, V125Q, S137G, H143D, V149I, W161K, W161R, W161Y, K165A, E167P, N168S, P172V, Y175F, G180A, H208T, G235P, A250T, I257V, K261W, G311A, G311C, L327I, T379S, K390E, S429L, N467D, Q475S, Y500M, Y500V, N V, P36542, P542V, H543V, H544V, H545V and H545V.
10. The engineered variant of claim 1 or claim 2, wherein said engineered variant comprises an amino acid sequence selected from the group consisting of: SEQ ID NO 50, SEQ ID NO 52, SEQ ID NO 54, SEQ ID NO 56, SEQ ID NO 58, SEQ ID NO 60, SEQ ID NO 62, SEQ ID NO 64, SEQ ID NO 66, SEQ ID NO 68, SEQ ID NO 70, SEQ ID NO 72, SEQ ID NO 74, SEQ ID NO 76, SEQ ID NO 78, SEQ ID NO 80, SEQ ID NO 82, SEQ ID NO 84, SEQ ID NO 86, SEQ ID NO 88, SEQ ID NO 90, SEQ ID NO 92, SEQ ID NO 94, SEQ ID NO 96, SEQ ID NO 98, SEQ ID NO 100, SEQ ID NO 102, SEQ ID NO 104, SEQ ID NO 106, SEQ ID NO 108, SEQ ID NO 110, SEQ ID NO 112, SEQ ID NO 114, SEQ ID NO 116, SEQ ID NO 118, SEQ ID NO 120, SEQ ID NO 122, SEQ ID NO 124, SEQ ID NO 126, SEQ ID NO 128, SEQ ID NO 130, SEQ ID NO 132, SEQ ID NO 134, SEQ ID NO 136, SEQ ID NO 138, SEQ ID NO 140, SEQ ID NO 142, SEQ ID NO 144, SEQ ID NO 146, SEQ ID NO 148, SEQ ID NO 150, SEQ ID NO 152, SEQ ID NO 154, SEQ ID NO 156, SEQ ID NO 158, SEQ ID NO 160, SEQ ID NO 162, SEQ ID NO 164, SEQ ID NO 166, SEQ ID NO 168, SEQ ID NO 170, SEQ ID NO 172, SEQ ID NO 176, SEQ ID NO 178, SEQ ID NO 180, SEQ ID NO 182, SEQ ID NO 184 and SEQ ID NO 186.
11. The engineered variant of any one of claims 1-9, wherein said engineered variant comprises an amino acid sequence of SEQ ID No. 44 having at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 amino acid substitutions.
12. The engineered variant of any one of claims 1-9, wherein said engineered variant comprises the amino acid sequence of SEQ ID No. 44 having 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acid substitutions.
13. The engineered variant of any one of claims 1-12, wherein the engineered variant comprises at least one invariant amino acid in a Flavin Adenine Dinucleotide (FAD) binding domain, a Berberine Bridge Enzyme (BBE) domain, or a combination of the foregoing.
14. The engineered variant of claim 13, wherein said engineered variant comprises at least one invariant amino acid in said FAD binding domain.
15. The engineered variant of claim 14, wherein the engineered variant comprises at least 1 invariant amino acid, at least 2 invariant amino acids, at least 3 invariant amino acids, at least 4 invariant amino acids, at least 5 invariant amino acids, at least 6 invariant amino acids, at least 7 invariant amino acids, at least 8 invariant amino acids, at least 9 invariant amino acids, at least 10 invariant amino acids, at least 11 invariant amino acids, at least 12 invariant amino acids, at least 13 invariant amino acids, at least 14 invariant amino acids, or at least 15 invariant amino acids in the FAD binding domain.
16. The engineered variant of any one of claims 13-15, wherein the engineered variant comprises at least one invariant amino acid in the BBE domain.
17. The engineered variant of claim 16, wherein the engineered variant comprises at least 1 invariant amino acid, at least 2 invariant amino acids, at least 3 invariant amino acids, at least 4 invariant amino acids, at least 5 invariant amino acids, at least 6 invariant amino acids, at least 7 invariant amino acids, at least 8 invariant amino acids, at least 9 invariant amino acids, at least 10 invariant amino acids, at least 11 invariant amino acids, at least 12 invariant amino acids, at least 13 invariant amino acids, at least 14 invariant amino acids, or at least 15 invariant amino acids in the BBE domain.
18. The engineered variant of any one of claims 1-9, wherein said engineered variant comprises at least one invariant amino acid selected from the group consisting of: A28, F34, L35, C37, L64, N70, P87, I93, C99, R108, R110, G112, E117, G118, S120, P126, F127, D131, D141, W148, G152, A153, L155, G156, E157, Y159, Y160, N163, A173, G174, C176, P177, T178, V179, G182, G183, H184, F185, G187, G188, G189, Y190, G191, P192, L193, R195, A201, D202, I205, D206, V210, G214, G223, D225, L226, F227, W228, R231, G234, S237, F238, G239, G245, I246, L245, L251, V260, Q313, F313, S314, F227, W228, R231, G234, S237, S248, F245, I520, N420, N520, P420, P185, N185, P444, N185, P18, N185, P2, P18, L2, P68, P2, L123, P2, L123, L2, L123, P2.
19. The engineered variant of claim 18, wherein said engineered variant comprises at least one invariant amino acid selected from the group consisting of: c37, N70, I93, C99, E117, S120, F127, D131, G156, E157, Y159, G174, C176, G182, G183, F185, G187, G188, G189, Y190, G191, P192, R195, D202, D206, G214, W228, G234, F238, L248, Q277, S314, L324, S355, K382, K384, D386, G420, M423, R436, Y441, W444, Y445, Y472, P477, N514, F515, N529, and Q535.
20. The engineered variant of any one of claims 1-19, wherein said engineered variant comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, or at least 25 invariant amino acids.
21. The engineered variant of any of claims 1-20, wherein the engineered variant produces an amount of tetrahydrocannabinolic acid (THCA) from cannabigerolic acid (CBGA) that is greater than the amount of THCA produced from CBGA by a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, in mg/L or mM, within the same length of time under similar conditions.
22. The engineered variant of any one of claims 1-21, wherein the amount of tetrahydrocannabinolic acid (THCA) produced by the engineered variant from cannabigerolic acid (CBGA) is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of THCA produced from CBGA by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, in mg/L or mM over the same length of time under similar conditions.
23. The engineered variant of any one of claims 1-22, wherein the engineered variant produces THCA from cannabigerolic acid (CBGA) at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to the ratio of THCA to the other cannabinoid produced by a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 over the same length of time under similar conditions.
24. The engineered variant of any one of claims 1-23, wherein the engineered variant produces THCA from cannabigerolic acid (CBGA) in a ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
25. The engineered variant of any one of claims 1-9 or claims 11-24, wherein the engineered variant comprises a truncation at the N-terminus, the C-terminus, or at both the N-terminus and the C-terminus.
26. The engineered variant of claim 25, wherein said truncated engineered variant comprises a signal polypeptide or a membrane anchor.
27. The engineered variant of claim 25 or claim 26, wherein the engineered variant lacks a native signal polypeptide.
28. The engineered variant of any one of claims 25-27, wherein said engineered variant comprises a truncation of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 amino acids at the C-terminus.
29. The engineered variant of any one of claims 25-27, wherein said engineered variant comprises a truncation of 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids at the C-terminus.
30. A nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of claims 1-29.
31. A nucleic acid comprising a nucleotide sequence encoding an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide, the engineered variant comprising an amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions, wherein the nucleotide sequence is selected from the group consisting of: SEQ ID NO 49, SEQ ID NO 51, SEQ ID NO 53, SEQ ID NO 55, SEQ ID NO 57, SEQ ID NO 59, SEQ ID NO 61, SEQ ID NO 63, SEQ ID NO 65, SEQ ID NO 67, SEQ ID NO 69, SEQ ID NO 71, SEQ ID NO 73, SEQ ID NO 75, SEQ ID NO 77, SEQ ID NO 79, SEQ ID NO 81, SEQ ID NO 83, SEQ ID NO 85, SEQ ID NO 87, SEQ ID NO 91, SEQ ID NO 93, SEQ ID NO 95, SEQ ID NO 97, SEQ ID NO 99, SEQ ID NO 101, SEQ ID NO 103, SEQ ID NO 105, SEQ ID NO 107, SEQ ID NO 109, SEQ ID NO 111, SEQ ID NO 113, SEQ ID NO 115, SEQ ID NO 117, SEQ ID NO 119, SEQ ID NO 121, SEQ ID NO 123, SEQ ID NO 125, SEQ ID NO 127, SEQ ID NO 129, SEQ ID NO 131, SEQ ID NO 133, SEQ ID NO 135, SEQ ID NO 137, SEQ ID NO 139, SEQ ID NO 141, SEQ ID NO 143, SEQ ID NO 145, SEQ ID NO 147, SEQ ID NO 149, SEQ ID NO 151, SEQ ID NO 153, SEQ ID NO 155, SEQ ID NO 157, SEQ ID NO 159, SEQ ID NO 161, SEQ ID NO 163, SEQ ID NO 165, SEQ ID NO 167, SEQ ID NO 169, SEQ ID NO 171, SEQ ID NO 173, SEQ ID NO 175, SEQ ID NO 177, SEQ ID NO 179, SEQ ID NO 181, SEQ ID NO 183 and SEQ ID NO 185.
32. The nucleic acid of claim 30 or claim 31, wherein the nucleotide sequence is codon optimized.
33. A method of preparing a modified host cell for the production of a cannabinoid or cannabinoid derivative, the method comprising introducing into a host cell one or more nucleic acids of any of claims 30-32.
34. A vector comprising one or more nucleic acids of any one of claims 30-32.
35. A method of making a modified host cell for the production of a cannabinoid or cannabinoid derivative comprising introducing into a host cell one or more vectors of claim 34.
36. A modified host cell for the production of a cannabinoid or cannabinoid derivative, wherein the modified host cell comprises one or more nucleic acids of any of claims 30-32.
37. The modified host cell of claim 36, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a geranyl pyrophosphate:olivetolic acid geranyltransferase (GOT) polypeptide.
38. The modified host cell of claim 37, wherein the GOT polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID No. 17.
39. The modified host cell of claim 37 or claim 38, wherein the modified host cell comprises two or more heterologous nucleic acids comprising nucleotide sequences encoding the GOT polypeptide.
40. The modified host cell of claim 36, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an NphB polypeptide.
41. The modified host cell of claim 40, wherein the NphB polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO: 294.
42. The modified host cell of any one of claims 36-41, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a tetraketide synthase (TKS) polypeptide and one or more heterologous nucleic acids comprising a nucleotide sequence encoding an olivetolic acid cyclase (OAC) polypeptide.
43. The modified host cell of claim 42, wherein the TKS polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 19.
44. The modified host cell of claim 42 or claim 43, wherein the modified host cell comprises three or more heterologous nucleic acids comprising nucleotide sequences encoding TKS polypeptides.
45. The modified host cell of any one of claims 42-44, wherein the OAC polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 21 or SEQ ID NO 48.
46. The modified host cell of any one of claims 42-45, wherein the modified host cell comprises three or more heterologous nucleic acids comprising nucleotide sequences encoding OAC polypeptides.
47. The modified host cell of any one of claims 36-46, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acyl-activating enzyme (AAE) polypeptide.
48. The modified host cell of claim 47, wherein the AAE polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 23.
49. The modified host cell of claim 47 or claim 48, wherein the modified host cell comprises two or more heterologous nucleic acids comprising nucleotide sequences encoding AAE polypeptides.
50. The modified host cell of any one of claims 36-49, wherein the modified host cell comprises one or more of: a) one or more heterologous nucleic acids comprising a nucleotide sequence encoding an HMG-CoA synthase (HMGS) polypeptide; b) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a truncated 3-hydroxy-3-methyl-glutaryl-CoA reductase (tHMGR) polypeptide; c) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a mevalonate kinase (MK) polypeptide; d) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a phosphomevalonate kinase (PMK) polypeptide; e) one or more heterologous nucleic acids comprising a nucleotide sequence encoding a mevalonate pyrophosphate decarboxylase (MVD1) polypeptide; or f) one or more heterologous nucleic acids comprising a nucleotide sequence encoding an isopentenyl diphosphate isomerase (IDI1) polypeptide.
51. The modified host cell of claim 50, wherein the IDI1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 25.
52. The modified host cell of claim 50 or claim 51, wherein the tHMGR polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO. 27.
53. The modified host cell of any one of claims 50-52, wherein the HMGS polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO: 29.
54. The modified host cell of any one of claims 50-53, wherein the MK polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO: 39.
55. The modified host cell of any one of claims 50-54, wherein the PMK polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID No. 37.
56. The modified host cell of any one of claims 50-55, wherein the MVD1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 33.
57. The modified host cell of any one of claims 36-56, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an acetoacetyl-CoA thiolase polypeptide.
58. The modified host cell of claim 57, wherein the acetoacetyl-CoA thiolase polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO: 31.
59. The modified host cell of any one of claims 36-58, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a Pyruvate Decarboxylase (PDC) polypeptide.
60. The modified host cell of claim 59, wherein the PDC polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 35.
61. The modified host cell of any one of claims 36-60, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a geranyl pyrophosphate synthase (GPPS) polypeptide.
62. The modified host cell of claim 61, wherein the GPPS polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 41.
63. The modified host cell of any one of claims 36-62, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a KAR2 polypeptide.
64. The modified host cell of claim 63, wherein the KAR2 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 5.
65. The modified host cell of claim 63 or claim 64, wherein the modified host cell comprises two or more heterologous nucleic acids comprising nucleotide sequences encoding KAR2 polypeptides.
66. The modified host cell of any one of claims 36-65, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a PDI1 polypeptide.
67. The modified host cell of claim 66, wherein the PDI1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 9.
68. The modified host cell of any one of claims 36-67, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an IRE1 polypeptide.
69. The modified host cell of claim 68, wherein the IRE1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO. 11 or SEQ ID NO. 190.
70. The modified host cell of any one of claims 36-69, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding an ERO1 polypeptide.
71. The modified host cell of claim 70, wherein the ERO1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 7.
72. The modified host cell of any one of claims 36-71, wherein the modified host cell comprises one or more heterologous nucleic acids comprising a nucleotide sequence encoding a FAD1 polypeptide.
73. The modified host cell of claim 72, wherein the FAD1 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO: 192.
74. The modified host cell of any one of claims 36-73, wherein the modified host cell comprises a deletion or down-regulation of one or more genes encoding a PEP4 polypeptide.
75. The modified host cell of claim 74, wherein the PEP4 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO. 15.
76. The modified host cell of any one of claims 36-75, wherein the modified host cell comprises a deletion or down-regulation of one or more genes encoding a ROT2 polypeptide.
77. The modified host cell of claim 76, wherein the ROT2 polypeptide comprises an amino acid sequence having at least 85% sequence identity to SEQ ID NO 13.
78. The modified host cell of any one of claims 36-77, wherein the modified host cell is a eukaryotic cell.
79. The modified host cell of claim 78, wherein the eukaryotic cell is a yeast cell.
80. The modified host cell of claim 79, wherein the yeast cell is Saccharomyces cerevisiae.
81. The modified host cell of claim 80, wherein the Saccharomyces cerevisiae is a protease deficient strain of Saccharomyces cerevisiae.
82. The modified host cell of any one of claims 36-81, wherein at least one of the one or more nucleic acids is integrated into the chromosome of the modified host cell.
83. The modified host cell of any one of claims 36-82, wherein at least one of the one or more nucleic acids is maintained extrachromosomally.
84. The modified host cell of any one of claims 36-83, wherein at least one of the one or more nucleic acids is operably linked to an inducible promoter.
85. The modified host cell of any one of claims 36-83, wherein at least one of the one or more nucleic acids is operably linked to a constitutive promoter.
86. The modified host cell of any one of claims 36-85, wherein the amount of cannabinoid or cannabinoid derivative produced by the modified host cell, in mg/L or mM, is greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown under similar culture conditions for the same length of time, wherein the modified host cell comprising the one or more nucleic acids comprising the nucleotide sequence encoding the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of claims 1-29.
87. The modified host cell of any one of claims 36-86, wherein the amount of cannabinoid or cannabinoid derivative produced by the modified host cell, in mg/L or mM, is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of cannabinoid or cannabinoid derivative produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown under similar culture conditions for the same length of time, wherein the modified host cell comprising the one or more nucleic acids comprising the nucleotide sequence encoding the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of claims 1-29.
88. The modified host cell of any one of claims 36-87, wherein the modified host cell has a faster growth rate and/or a higher biomass yield than a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, wherein the modified host cell comprising the one or more nucleic acids comprising the nucleotide sequence encoding the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of claims 1-29.
89. The modified host cell of any one of claims 36-88, wherein the modified host cell has a growth rate and/or biomass yield that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% faster and/or higher than the growth rate and/or biomass yield of a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44, grown under similar culture conditions for the same length of time, wherein the modified host cell comprising the one or more nucleic acids comprising the nucleotide sequence encoding the tetrahydrocannabinolic acid synthase polypeptide having the amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of claims 1-29.
90. The modified host cell of any one of claims 36-89, wherein the modified host cell produces tetrahydrocannabinolic acid (THCA) from cannabigerolic acid (CBGA) at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to the ratio of THCA to the other cannabinoid produced by a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, grown under similar culture conditions for the same length of time, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of claims 1-29.
91. The modified host cell of any one of claims 36-90, wherein the modified host cell produces THCA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or greater than about 500:1.
92. A method of producing a cannabinoid or cannabinoid derivative, the method comprising:
a) culturing the modified host cell of any one of claims 36-91 in a culture medium.
93. The method of claim 92, wherein the method comprises:
b) recovering the produced cannabinoid or cannabinoid derivative.
94. The method of claim 92 or claim 93, wherein the culture medium comprises a carboxylic acid.
95. The method of claim 94, wherein the carboxylic acid is an unsubstituted or substituted C3-C18 carboxylic acid.
96. The method of claim 95, wherein the unsubstituted or substituted C3-C18 carboxylic acid is unsubstituted or substituted hexanoic acid.
97. The method of claim 92 or claim 93, wherein the culture medium comprises olivetolic acid or an olivetolic acid derivative.
98. The method of claim 92 or claim 93, wherein the cannabinoid is tetrahydrocannabinolic acid or tetrahydrocannabinol.
99. The method of any one of claims 92-98, wherein the culture medium comprises a fermentable sugar.
100. The method of any one of claims 92-98, wherein the culture medium comprises a pretreated cellulosic feedstock.
101. The method of any one of claims 92-98, wherein the culture medium comprises a non-fermentable carbon source.
102. The method of claim 101, wherein the non-fermentable carbon source comprises ethanol.
103. The method of any one of claims 92-102, wherein the cannabinoid or cannabinoid derivative is produced in an amount greater than 100 mg/L of culture medium.
104. The method of any one of claims 92-103, wherein the amount of the cannabinoid or the cannabinoid derivative produced is greater than the amount of the cannabinoid or the cannabinoid derivative produced in an alternative method comprising culturing a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 instead of the modified host cell of any one of claims 36-91, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of claims 1-29, and wherein the modified host cell of any one of claims 36-91 and the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of claims 1-29, are cultured under similar culture conditions for the same length of time.
105. The method of any one of claims 92-104, wherein the amount of the cannabinoid or the cannabinoid derivative produced is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of the cannabinoid or the cannabinoid derivative produced in an alternative method comprising culturing a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 instead of the modified host cell of any one of claims 36-91, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of claims 1-29, and wherein the modified host cell of any one of claims 36-91 and the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, but lacking a nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of claims 1-29, are cultured under similar culture conditions for the same length of time.
106. The method of any one of claims 92-105, wherein the cannabinoid is tetrahydrocannabinolic acid (THCA), and wherein the method produces THCA at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to the ratio of THCA to the other cannabinoid produced in an alternative method comprising culturing a modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 instead of the modified host cell of any one of claims 36-91, wherein the modified host cell comprising one or more nucleic acids comprising a nucleotide sequence encoding a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 lacks a nucleic acid comprising a nucleotide sequence encoding the engineered variant of any one of claims 1-29 and is grown under similar culture conditions for the same length of time.
107. The method of any one of claims 92-106, wherein the method produces THCA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or more than about 500:1.
108. A method of producing a cannabinoid or cannabinoid derivative comprising using the engineered variant of any one of claims 1-29.
109. The method of claim 108, wherein the method comprises recovering the produced cannabinoid or cannabinoid derivative.
110. The method of claim 108 or claim 109, wherein the cannabinoid is tetrahydrocannabinolic acid or tetrahydrocannabinol.
111. The method of any one of claims 108-110, wherein the amount of the cannabinoid or the cannabinoid derivative produced, in mg/L or mM, is greater than the amount of the cannabinoid or the cannabinoid derivative produced in an alternative method comprising using a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 instead of the engineered variant of any one of claims 1-29, wherein the engineered variant of any one of claims 1-29 and the tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 are used under similar conditions for the same length of time.
112. The method of any one of claims 108-111, wherein the amount of the cannabinoid or the cannabinoid derivative produced, in mg/L or mM, is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 100%, at least 150%, at least 200%, at least 500%, or at least 1000% greater than the amount of the cannabinoid or the cannabinoid derivative produced in an alternative method comprising using a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 instead of the engineered variant of any one of claims 1-29, wherein the engineered variant of any one of claims 1-29 and the tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 are used under similar conditions for the same length of time.
113. The method of any one of claims 108-112, wherein the cannabinoid is tetrahydrocannabinolic acid (THCA), and wherein the method produces THCA at an increased ratio of THCA to another cannabinoid (e.g., cannabichromenic acid (CBCA)) as compared to the ratio of THCA to the other cannabinoid produced in an alternative method comprising using a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 instead of the engineered variant of any one of claims 1-29, wherein the engineered variant of any one of claims 1-29 and the tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44 are used under similar conditions for the same length of time.
114. The method of any one of claims 108-113, wherein the method produces THCA at a ratio of THCA to another cannabinoid (e.g., CBCA) of about 11:1, about 11.5:1, about 12:1, about 12.5:1, about 13:1, about 13.5:1, about 14:1, about 14.5:1, about 15:1, about 15.5:1, about 16:1, about 16.5:1, about 17:1, about 17.5:1, about 18:1, about 18.5:1, about 19:1, about 19.5:1, about 20:1, about 25:1, about 30:1, about 35:1, about 40:1, about 45:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, about 150:1, about 200:1, about 500:1, or more than about 500:1.
115. A method of screening for an engineered variant of a tetrahydrocannabinolic acid synthase (THCAS) polypeptide, the engineered variant comprising an amino acid sequence of SEQ ID NO:44 having one or more amino acid substitutions, the method comprising:
a) dividing the population of host cells into a control population and a test population;
b) co-expressing in the control population a THCAS polypeptide having an amino acid sequence of SEQ ID NO:44 and a comparative cannabinoid synthase polypeptide, wherein the THCAS polypeptide having an amino acid sequence of SEQ ID NO:44 can convert cannabigerolic acid (CBGA) to a first cannabinoid, tetrahydrocannabinolic acid (THCA), and the comparative cannabinoid synthase polypeptide can convert the same CBGA to a different second cannabinoid;
c) co-expressing the engineered variant and the comparative cannabinoid synthase polypeptide in the test population, wherein the engineered variant can convert CBGA to the same first cannabinoid, tetrahydrocannabinolic acid (THCA), as the THCAS polypeptide having the amino acid sequence of SEQ ID NO:44, and wherein the comparative cannabinoid synthase polypeptide can convert the same CBGA to the second cannabinoid and be expressed at similar levels in the test population and the control population;
d) measuring a ratio of the first cannabinoid, tetrahydrocannabinolic acid (THCA), to the second cannabinoid produced by both the test population and the control population; and
e) measuring the amount of the first cannabinoid produced by both the test population and the control population in mg/L or mM.
116. The method of claim 115, wherein the test population is identified as comprising an engineered variant having improved in vivo performance as compared to a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, wherein improved in vivo performance is evidenced by an increased ratio of the first cannabinoid to the second cannabinoid produced by the test population as compared to the ratio of the first cannabinoid to the second cannabinoid produced by the control population over the same length of time under similar culture conditions.
117. The method of claim 115 or claim 116, wherein the test population is identified as comprising an engineered variant having improved in vivo performance as compared to a tetrahydrocannabinolic acid synthase polypeptide having an amino acid sequence of SEQ ID NO:44, wherein improved in vivo performance is evidenced by the test population producing a greater amount of the first cannabinoid, in mg/L or mM, than the control population over the same length of time under similar culture conditions.
118. The method of any one of claims 115-117, wherein the comparative cannabinoid synthase polypeptide is a cannabidiolic acid synthase polypeptide.
119. The method of claim 118, wherein the cannabidiolic acid synthase polypeptide comprises an amino acid sequence with at least 85% sequence identity to SEQ ID No. 3.
120. The method of any one of claims 115-119, wherein the second cannabinoid is cannabidiolic acid (CBDA).
121. The method of any one of claims 115-120, wherein the engineered variant is the engineered variant of any one of claims 1-29.
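The comparison recited in claims 115-117 (ratio of first to second cannabinoid, and amount of the first cannabinoid, in a test population versus a control population) can be illustrated with a minimal Python sketch. This is not part of the claims: the names `PopulationTiters`, `percent_increase`, and `compare_populations`, and all numeric values, are hypothetical and chosen only for illustration. The sketch assumes both titers are reported in the same units (mg/L or mM) and that the populations were grown for the same length of time under similar culture conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PopulationTiters:
    """Hypothetical titers for one host-cell population, in the same units (mg/L or mM)."""
    first_cannabinoid: float   # e.g., THCA
    second_cannabinoid: float  # e.g., CBDA from the comparative synthase

    def ratio(self) -> float:
        """Ratio of first to second cannabinoid (step d of claim 115)."""
        if self.second_cannabinoid <= 0:
            raise ValueError("second cannabinoid titer must be positive")
        return self.first_cannabinoid / self.second_cannabinoid


def percent_increase(test_amount: float, control_amount: float) -> float:
    """Percent by which the test amount exceeds the control amount."""
    if control_amount <= 0:
        raise ValueError("control amount must be positive")
    return (test_amount - control_amount) / control_amount * 100.0


def compare_populations(test: PopulationTiters, control: PopulationTiters) -> dict:
    """Summarize the two readouts used to flag an improved variant."""
    return {
        # Claim 116: increased first-to-second cannabinoid ratio.
        "ratio_improved": test.ratio() > control.ratio(),
        # Claim 117: greater amount of the first cannabinoid.
        "titer_improved": test.first_cannabinoid > control.first_cannabinoid,
        "titer_percent_increase": percent_increase(
            test.first_cannabinoid, control.first_cannabinoid
        ),
    }


# Illustrative values only; these are not data from the application.
control = PopulationTiters(first_cannabinoid=50.0, second_cannabinoid=10.0)
test = PopulationTiters(first_cannabinoid=100.0, second_cannabinoid=8.0)
summary = compare_populations(test, control)
```

With these made-up inputs, the test population would be flagged on both readouts. Any cutoff from the percentage or ratio lists in the claims above could be applied to `titer_percent_increase` or to `test.ratio()` as a further filter.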
CN202080078001.1A 2019-09-18 2020-09-17 Optimized tetrahydrocannabinolic acid (THCA) synthase polypeptides Pending CN115038786A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201962902300P 2019-09-18 2019-09-18
US62/902,300 2019-09-18
PCT/US2020/051261 WO2021055597A1 (en) 2019-09-18 2020-09-17 Optimized tetrahydrocannabinolic acid (thca) synthase polypeptides

Publications (1)

Publication Number Publication Date
CN115038786A true CN115038786A (en) 2022-09-09

Family

ID=72744852

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080078001.1A Pending CN115038786A (en) 2019-09-18 2020-09-17 Optimized tetrahydrocannabinolic acid (THCA) synthase polypeptides

Country Status (9)

Country Link
EP (1) EP4031657A1 (en)
JP (1) JP2022548904A (en)
CN (1) CN115038786A (en)
AU (1) AU2020349513A1 (en)
BR (1) BR112022004797A2 (en)
CA (1) CA3152803A1 (en)
IL (1) IL291328A (en)
MX (1) MX2022003203A (en)
WO (1) WO2021055597A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3130763A1 (en) 2019-02-25 2020-09-03 Ginkgo Bioworks, Inc. Biosynthesis of cannabinoids and cannabinoid precursors
WO2023168266A2 (en) * 2022-03-02 2023-09-07 Genomatica, Inc. Flavin-dependent oxidases having cannabinoid synthase activity

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2003287028B2 (en) 2002-10-04 2008-09-04 E.I. Du Pont De Nemours And Company Process for the biological production of 1,3-propanediol with high yield
CA2709107A1 (en) 2007-12-13 2009-06-18 Danisco Us Inc. Compositions and methods for producing isoprene
MY153871A (en) 2008-04-23 2015-03-31 Danisco Us Inc Isoprene synthase variants for improved microbial production of isoprene
CN102791848B (en) 2008-07-02 2017-11-10 丹尼斯科美国公司 For producing the method and composition of the isoprene without C5 hydrocarbon in the case where removing coupling condition and/or safe operating range
JP7198555B2 (en) * 2017-04-27 2023-01-04 ザ リージェンツ オブ ザ ユニバーシティ オブ カリフォルニア Microorganisms and methods for producing cannabinoids and cannabinoid derivatives
EP3652327A4 (en) * 2017-07-12 2021-04-21 Biomedican, Inc. Production of cannabinoids in yeast
CA3123182A1 (en) * 2017-12-14 2019-06-20 Medicinal Genomics Corporation Methods and kits for classifying cannabinoid production in cannabis plants
WO2020069214A2 (en) * 2018-09-26 2020-04-02 Demetrix, Inc. Optimized expression systems for producing cannabinoid synthase polypeptides, cannabinoids, and cannabinoid derivatives

Also Published As

Publication number Publication date
BR112022004797A2 (en) 2022-06-21
CA3152803A1 (en) 2021-03-25
IL291328A (en) 2022-05-01
EP4031657A1 (en) 2022-07-27
JP2022548904A (en) 2022-11-22
AU2020349513A1 (en) 2022-05-05
MX2022003203A (en) 2022-06-08
WO2021055597A1 (en) 2021-03-25

Similar Documents

Publication Publication Date Title
CN110914416B (en) Microorganism and method for producing cannabinoids and cannabinoid derivatives
CN114729337A (en) Optimized cannabinoid synthase polypeptides
ES2386359T3 (en) Genetically modified host cells and use thereof to produce isoprenoid compounds
KR102114493B1 (en) Recombinant production of steviol glycosides
IL270214B1 (en) Anti-sortilin antibodies and methods of use thereof
US20210155966A1 (en) Production of steviol glycosides in recombinant hosts
KR20200070237A (en) Metabolic manipulation of E. coli for biosynthesis of cannabinoid products
CN115038786A (en) Optimized tetrahydrocannabinolic acid (THCA) synthase polypeptides
CN114207108A (en) Genetically modified host cells producing glycosylated cannabinoids
EP3535406A1 (en) Production of steviol glycosides in recombinant hosts
WO2017149147A2 (en) Production of gibberellins in recombinant hosts
US20230043569A1 (en) Production of gpp and cbga in a methylotrophic yeast strain
EP3645730A1 (en) Use of type iii polyketide synthases as phloroglucinol synthases
WO2021036901A1 (en) APPLICATION OF BRANCHED-CHAIN α-KETOACID DEHYDROGENASE COMPLEX IN PREPARATION OF MALONYL COENZYME A
CA3182860A1 (en) Method for producing olivetolic acid in an amoebozoa host species
CN113832090A (en) Recombinant bacillus subtilis natto for high yield of vitamin K2, preparation method and application
ES2425291A1 (en) Method for producing hydroxyectoine
WO2015120431A1 (en) Novel terpene cyclase variants, and methods using same

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20220909