WO2012097091A2 - Engineered microorganisms with enhanced fermentation activity - Google Patents

Engineered microorganisms with enhanced fermentation activity

Info

Publication number
WO2012097091A2
Authority
WO
WIPO (PCT)
Prior art keywords
activity
nucleic acid
phosphate
polynucleotide
yeast
Prior art date
Application number
PCT/US2012/020982
Other languages
French (fr)
Other versions
WO2012097091A3 (en)
Inventor
Kirsty Anne Lily SALMON
Jose Miguel Laplaza
Stephen Picataggio
Original Assignee
Verdezyne, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Verdezyne, Inc.
Publication of WO2012097091A2
Publication of WO2012097091A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/90 Isomerases (5.)
    • C12N9/92 Glucose isomerase (5.3.1.5; 5.3.1.9; 5.3.1.18)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00 Preparation of oxygen-containing organic compounds
    • C12P7/02 Preparation of oxygen-containing organic compounds containing a hydroxy group
    • C12P7/04 Preparation of oxygen-containing organic compounds containing a hydroxy group acyclic
    • C12P7/06 Ethanol, i.e. non-beverage
    • C12P7/08 Ethanol, i.e. non-beverage produced as by-product or from waste or cellulosic material substrate
    • C12P7/10 Ethanol, i.e. non-beverage produced as by-product or from waste or cellulosic material substrate substrate containing cellulosic material
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y503/00 Intramolecular oxidoreductases (5.3)
    • C12Y503/01 Intramolecular oxidoreductases (5.3) interconverting aldoses and ketoses (5.3.1)
    • C12Y503/01005 Xylose isomerase (5.3.1.5)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P2203/00 Fermentation products obtained from optionally pretreated or hydrolyzed cellulosic or lignocellulosic material as the carbon source
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E50/00 Technologies for the production of fuel of non-fossil origin
    • Y02E50/10 Biofuels, e.g. bio-diesel

Definitions

  • FERMENTATION ACTIVITY naming Stephen Picataggio, Kirsty Anne Lily Salmon, and Jose Miguel LaPlaza as inventors and designated by Attorney Docket No. VRD-1002-PV4.
  • This patent application is related to U.S. provisional patent application no. 61/224,430 filed on July 9, 2009, entitled ENGINEERED MICROORGANISMS WITH ENHANCED FERMENTATION ACTIVITY naming Stephen Picataggio as inventor and designated by Attorney Docket No. VRD-1002-PV.
  • This patent application also is related to U.S. provisional patent application no. 61/316,780 filed on March 23, 2010, entitled ENGINEERED MICROORGANISMS WITH ENHANCED
  • the technology relates in part to genetically modified microorganisms that have enhanced fermentation activity, and methods for making and using such microorganisms.
  • Microorganisms employ various enzyme-driven biological pathways to support their own metabolism and growth.
  • a cell synthesizes native proteins, including enzymes, in vivo from deoxyribonucleic acid (DNA).
  • DNA first is transcribed into a complementary ribonucleic acid (RNA) that comprises a ribonucleotide sequence encoding the protein.
  • RNA then directs translation of the encoded protein by interaction with various cellular components, such as ribosomes.
  • the resulting enzymes participate as biological catalysts in pathways involved in production of molecules utilized or secreted by the organism.
  • pathways can be exploited for the harvesting of the naturally produced products.
  • the pathways also can be altered to increase production or to produce different products that may be commercially valuable.
  • Advances in recombinant molecular biology methodology allow researchers to isolate DNA from one organism and insert it into another organism, thus altering the cellular synthesis of enzymes or other proteins.
  • Such genetic engineering can change the biological pathways within the host organism, causing it to produce a desired product.
  • Microorganic industrial production can minimize the use of caustic chemicals and production of toxic byproducts, thus providing a "clean" source for certain products.
  • microorganisms having enhanced fermentation activity.
  • such microorganisms are capable of generating a target product with enhanced fermentation efficiency by, for example, (i) preferentially utilizing a particular glycolysis pathway, which increases yield of a target product, upon a change in fermentation conditions; (ii) reducing cell division rates upon a change in fermentation conditions, thereby diverting nutrients towards production of a target product; (iii) having the ability to readily metabolize five-carbon sugars; and/or (iv) having the ability to readily metabolize carbon dioxide; and combinations of the foregoing.
  • a target product is ethanol or succinic acid.
  • engineered microorganisms that comprise: (a) a functional Embden-Meyerhoff glycolysis pathway that metabolizes six-carbon sugars under aerobic fermentation conditions, and (b) a genetic modification that reduces an Embden-Meyerhoff glycolysis pathway member activity upon exposure of the engineered microorganism to anaerobic fermentation conditions, whereby the engineered microorganisms preferentially metabolize six-carbon sugars by the Entner-Doudoroff pathway under the anaerobic fermentation conditions.
  • the genetic modification is insertion of a promoter into genomic DNA in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity.
  • the genetic modification is provision of a heterologous promoter polynucleotide in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity.
  • the genetic modification is a deletion or disruption of a polynucleotide that encodes, or regulates production of, the Embden-Meyerhoff glycolysis pathway member, and the microorganism comprises a heterologous nucleic acid that includes a polynucleotide encoding the Embden-Meyerhoff glycolysis pathway member operably linked to a polynucleotide that down-regulates production of the member under anaerobic fermentation conditions.
  • engineered microorganisms that comprise a genetic modification that inhibits cell division upon exposure to a change in fermentation conditions, where: the genetic modification comprises introduction of a heterologous promoter operably linked to a polynucleotide encoding a polypeptide that regulates the cell cycle of the microorganism; and the promoter activity is altered by the change in fermentation conditions.
  • engineered microorganisms that comprise a genetic modification that inhibits cell division and/or cell proliferation upon exposure of the microorganisms to a change in fermentation conditions.
  • the genetic modification inhibits cell division, inhibits cell proliferation, inhibits the cell cycle and/or induces cell cycle arrest.
  • the change in fermentation conditions is a switch to anaerobic fermentation conditions, and in certain embodiments, the change in fermentation conditions is a switch to an elevated temperature.
  • the polypeptide that regulates the cell cycle has thymidylate synthase activity.
  • the promoter activity is reduced by the change in fermentation conditions.
  • the genetic modification is a temperature sensitive mutation.
  • a target product produced by an engineered microorganism which comprise: (a) culturing an engineered microorganism described herein under aerobic conditions; and (b) culturing the engineered microorganism after (a) under anaerobic conditions, whereby the engineered microorganism produces the target product.
  • Also provided in some embodiments are methods for producing a target product by an engineered microorganism which comprise: (a) culturing an engineered microorganism described herein under a first set of fermentation conditions; and (b) culturing the engineered microorganism after (a) under a second set of fermentation conditions different than the first set of fermentation conditions, whereby the second set of fermentation conditions inhibits cell division and/or cell proliferation of the engineered microorganism.
  • the target product is ethanol or succinic acid.
  • the host microorganism from which the engineered microorganism is produced does not produce a detectable amount of the target product.
  • the culture conditions comprise fermentation conditions, comprise introduction of biomass, comprise introduction of a six-carbon sugar (e.g., glucose), and/or comprise introduction of a five-carbon sugar (e.g., xylulose, xylose); or combinations of the foregoing.
  • the target product is produced with a yield of greater than about 0.3 grams per gram of glucose added, and in certain embodiments, a method comprises purifying the target product from the cultured microorganisms. In some embodiments, a method comprises modifying the target product, thereby producing modified target product.
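The yield figure above (grams of target product per gram of glucose added) can be checked with simple arithmetic. The following is a minimal sketch, not part of the application: the function names and the example run are hypothetical, and the only outside facts used are the standard molar masses and the stoichiometry of glucose fermentation to ethanol (C6H12O6 -> 2 C2H5OH + 2 CO2), which caps ethanol yield at roughly 0.51 g per g of glucose.

```python
# Standard molar masses in g/mol.
GLUCOSE_G_PER_MOL = 180.16
ETHANOL_G_PER_MOL = 46.07

# Stoichiometric ceiling for ethanol from glucose:
# C6H12O6 -> 2 C2H5OH + 2 CO2
THEORETICAL_ETHANOL_YIELD = 2 * ETHANOL_G_PER_MOL / GLUCOSE_G_PER_MOL


def product_yield(product_g: float, glucose_g: float) -> float:
    """Grams of product formed per gram of glucose added."""
    if glucose_g <= 0:
        raise ValueError("glucose_g must be positive")
    return product_g / glucose_g


def exceeds_threshold(product_g: float, glucose_g: float,
                      threshold: float = 0.3) -> bool:
    """True when the yield is greater than the threshold (0.3 g/g in the text)."""
    return product_yield(product_g, glucose_g) > threshold


# Hypothetical run (illustrative numbers, not data from the application):
# 35 g of ethanol recovered from 100 g of glucose added.
example_yield = product_yield(35.0, 100.0)
fraction_of_theoretical = example_yield / THEORETICAL_ETHANOL_YIELD
```

Expressing a measured yield as a fraction of the stoichiometric ceiling is a common way to compare strains, but the text itself only states the absolute threshold.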
  • a method comprises placing the cultured microorganisms, the target product or the modified target product in a container, and in certain embodiments, a method comprises shipping the container.
  • the second set of fermentation conditions comprises an elevated temperature as compared to the temperature in the first set of fermentation conditions.
  • the genetic modification inhibits the cell cycle of the engineered microorganism upon exposure to the second set of fermentation conditions.
  • the genetic modification inhibits cell proliferation, inhibits cell division, inhibits the cell cycle and/or induces cell cycle arrest upon exposure to the second set of fermentation conditions.
  • the genetic modification inhibits thymidylate synthase activity upon exposure to the change in fermentation conditions, and sometimes the genetic modification comprises a temperature sensitive mutation.
  • microorganism which comprise: (a) introducing a genetic modification to a host microorganism that reduces an Embden-Meyerhoff glycolysis pathway member activity upon exposure of the engineered microorganism to anaerobic conditions; and (b) selecting for engineered
  • the genetic modification is insertion of a promoter into genomic DNA in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity.
  • the genetic modification sometimes is provision of a heterologous promoter polynucleotide in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity.
  • the genetic modification is a deletion or disruption of a polynucleotide that encodes, or regulates production of, the Embden-Meyerhoff glycolysis pathway member, and the microorganism comprises a heterologous nucleic acid that includes a
  • the Embden-Meyerhoff glycolysis pathway member activity is a phosphofructokinase activity, and in certain embodiments, the Embden-Meyerhoff glycolysis pathway member activity is a phosphoglucose isomerase activity. In some embodiments, the activity of one or more (e.g., 2, 3, 4, 5 or more) pathway members in an EM pathway is reduced or removed to undetectable levels.
  • an engineered microorganism which comprise: (a) introducing a genetic modification to a host microorganism that inhibits cell division upon exposure to a change in fermentation conditions, thereby producing engineered microorganisms; and (b) selecting for engineered microorganisms with inhibited cell division upon exposure of the engineered microorganisms to the change in fermentation conditions.
  • the change in fermentation conditions comprises a change to anaerobic fermentation conditions.
  • the change in fermentation conditions sometimes comprises a change to an elevated temperature.
  • the genetic modification inhibits the cell cycle of the engineered microorganism upon exposure to the change in fermentation conditions.
  • the genetic modification sometimes inhibits cell division, inhibits the cell cycle, inhibits cell proliferation and/or induces cell cycle arrest upon exposure to the change in fermentation conditions.
  • the genetic modification inhibits thymidylate synthase activity upon exposure to the change in fermentation conditions, and in certain embodiments, the genetic modification comprises a temperature sensitive mutation.
  • the microorganism comprises a genetic modification that adds or alters a five-carbon sugar metabolic activity.
  • the microorganism comprises a genetic alteration that adds or alters xylose isomerase activity.
  • the microorganism comprises a genetic alteration that adds or alters a xylose reductase (XR) activity and a xylitol dehydrogenase (XD) activity.
  • the microorganism comprises a xylulokinase (XK) activity.
  • the microorganism comprises a genetic alteration that adds or alters five-carbon sugar transporter activity, and sometimes the transporter activity is a transporter facilitator activity or an active transporter activity.
  • the microorganism comprises a genetic alteration that adds or alters carbon dioxide fixation activity, and sometimes the genetic alteration adds or alters phosphoenolpyruvate (PEP) carboxylase activity.
  • the microorganism comprises a genetic modification that reduces or removes an alcohol dehydrogenase 2 activity.
  • the microorganism comprises a genetic alteration that adds or alters a 6-phosphogluconate dehydrogenase (decarboxylating) activity.
  • the microorganism is an engineered yeast, such as a Saccharomyces yeast (e.g., S. cerevisiae), for example.
  • expression vectors comprising a polynucleotide that encodes a polypeptide possessing a xylose reductase activity and xylitol dehydrogenase activity. Also provided in some embodiments are expression vectors comprising a polynucleotide that encodes a polypeptide possessing a xylulokinase activity.
  • the polynucleotide sometimes includes one or more substituted codons, and in some embodiments, the one or more substituted codons are yeast codons (e.g., some or all codons are optimized with yeast codons (e.g., S. cerevisiae codons)).
  • the polynucleotide sometimes includes a nucleotide sequence of SEQ ID NO: 29, 30, 32 or 33, fragment thereof, or sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing, and in certain embodiments the polypeptide includes an amino acid sequence of SEQ ID NO: 31, fragment thereof, or sequence having 75% identity or greater (e.g., about 80, 85, 90, 95% identity or greater) to the foregoing.
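The identity thresholds above (50% or greater for nucleotide sequences, 75% or greater for amino acid sequences) presuppose some way of scoring identity. The sketch below is one simplified convention and an assumption on our part, since the text here does not say how identity is computed: it takes two sequences that have already been aligned to equal length (gaps written as "-"), skips columns where both are gaps, and counts matching columns. The function name and the example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns holding identical residues.

    Both inputs must already be aligned to the same length. Columns where
    both sequences have a gap are skipped; a gap opposite a residue counts
    as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy 10-nt strings (not SEQ ID NO sequences); they differ at one position.
identity = percent_identity("ATGGCTAAGG", "ATGGCAAAGG")
meets_nucleotide_threshold = identity >= 50.0
```

Other conventions divide by the length of the shorter sequence or exclude all gapped columns, and they can give different percentages for the same pair, so the convention should be stated whenever a figure is reported.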
  • a stretch of contiguous nucleotides of the polynucleotide is from another organism, and sometimes the stretch of contiguous nucleotides from the other organism is from a nucleotide sequence that encodes a polypeptide possessing a xylose isomerase activity.
  • the other organism sometimes is a fungus, such as a Piromyces fungus (e.g., Piromyces strain E2 or another Piromyces strain), for example, and at times the stretch of contiguous nucleotides from the other organism is from SEQ ID NO: 34, or sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing.
  • the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 35, or sequence having 75% identity or greater (e.g., about 80, 85, 90, 95% identity or greater) to the foregoing.
  • the stretch of contiguous nucleotides from the other organism sometimes is about 1% to about 30% (e.g., about 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25%) of the total number of nucleotides in the polynucleotide that encodes the polypeptide possessing xylose isomerase activity.
  • about 30 contiguous nucleotides from the polynucleotide from Ruminococcus are replaced by about 10 to about 20 nucleotides from the other organism.
  • the contiguous stretch of polynucleotides from the other organism is at the 5' end of the polynucleotide.
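The 5' replacement described above (about 30 recipient nucleotides swapped for about 10 to about 20 donor nucleotides, with the donor stretch making up about 1% to about 30% of the chimera) can be illustrated with plain string operations. This is a sketch under stated assumptions: the sequences are placeholders rather than the SEQ ID NO sequences, all function names are hypothetical, and the reading-frame check is our own sanity test, not a requirement stated in the text.

```python
def make_five_prime_chimera(recipient: str, donor_segment: str,
                            n_replaced: int) -> str:
    """Replace the first n_replaced nucleotides of recipient with donor_segment."""
    if not 0 <= n_replaced <= len(recipient):
        raise ValueError("n_replaced must lie within the recipient sequence")
    return donor_segment + recipient[n_replaced:]


def donor_percent(chimera: str, donor_segment: str) -> float:
    """Share of the chimera's length contributed by the donor segment."""
    return 100.0 * len(donor_segment) / len(chimera)


def keeps_reading_frame(donor_segment: str, n_replaced: int) -> bool:
    """True when the swap changes the length by a multiple of three."""
    return (len(donor_segment) - n_replaced) % 3 == 0


# Placeholder sequences only; these are NOT the sequences of the application.
recipient = "A" * 30 + "G" * 1290       # stand-in for a 1320-nt recipient gene
donor_segment = "ATGGCTAAGGAATATTTC"     # stand-in 18-nt donor 5' segment

chimera = make_five_prime_chimera(recipient, donor_segment, n_replaced=30)
share = donor_percent(chimera, donor_segment)
```

With these placeholder lengths the donor stretch is 18 of 1308 nucleotides, a little over 1% of the chimera, which sits at the low end of the stated range.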
  • the polynucleotide includes a nucleotide sequence of SEQ ID NO: 55, 56, 57, 59 or 61, fragment thereof, or sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing.
  • a subsequence from one donor may represent a majority of the chimeric xylose isomerase sequence (e.g., about 55% to about 99% of the chimeric xylose isomerase nucleotide or amino acid sequence (e.g., about 60, 65, 70, 75, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98%) or all but 30 or fewer nucleotides or amino acids of the chimeric sequence (e.g., all but about 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6,
  • a subsequence from one donor may represent a minority of the chimeric xylose isomerase sequence (e.g., about 1% to about 45% of the chimeric xylose isomerase nucleotide or amino acid sequence (e.g., about 40, 35, 30, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2%) or about 1 to about 30 nucleotides or amino acids of the chimeric sequence (e.g., about 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 nucleotides or amino acids of the chimeric molecule)).
  • one or more donor sequences for a chimeric xylose isomerase molecule are from a xylose isomerase described in the following Table:
  • nucleotide or amino acid modifications (e.g., substitutions, deletions or insertions)
  • the majority of a chimeric xylose isomerase molecule is from a Ruminococcus xylose isomerase described in the foregoing Table (e.g., about 80% or more of the nucleotides or amino acids of the chimeric molecule (e.g., about 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99% of the nucleotides or amino acids) or all but about 30 of the nucleotides or amino acids in the chimeric molecule (e.g., all but about 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 nucleotides or amino acids of the chimeric molecule)), and (b) a minority of the chimeric xylose isomerase is from a xylose
  • the minority of the chimeric xylose isomerase sometimes is from a xylose isomerase referenced in the foregoing Table, such as a xylose isomerase from the Piromyces strain, for example.
  • a donor sequence includes one or more nucleotide or amino acid mutations, examples of which are described herein.
  • nucleic acids including a polynucleotide that includes a first stretch of contiguous nucleic acids from a first organism and a second stretch of contiguous nucleic acids from a second organism, where the polynucleotide encodes a polypeptide possessing a xylose to xylulose xylose isomerase activity.
  • an expression vector comprising a polynucleotide that includes a first stretch of contiguous nucleotides from a first organism and a second stretch of contiguous nucleotides from a second organism, where the polynucleotide encodes a polypeptide possessing a xylose to xylulose, xylose isomerase activity.
  • the first organism and the second organism are the same, and in certain embodiments, the first organism and the second organism are different.
  • the first stretch of contiguous nucleotides and the second stretch of contiguous nucleotides independently are selected from nucleotide sequence that encodes a polypeptide having xylose isomerase activity.
  • the first organism is a bacterium.
  • the bacterium is a Ruminococcus bacterium, and in certain embodiments, the bacterium is a Ruminococcus flavefaciens bacterium (e.g., Ruminococcus flavefaciens strain 17, Ruminococcus flavefaciens strain Siijpesteijn 1948, Ruminococcus flavefaciens strain FD1, Ruminococcus flavefaciens strain 18P13).
  • the stretch of contiguous nucleotides is from SEQ ID NO: 29, 30, 32, 33, or a sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing.
  • the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 31, or a sequence having 75% identity or greater (e.g., about 80, 85, 90, 95% identity or greater) to the foregoing.
  • the second organism is a fungus.
  • the fungus is a Piromyces fungus, and in some embodiments, the fungus is a Piromyces strain E2 fungus.
  • the stretch of contiguous nucleotides is from SEQ ID NO: 34, or a sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing.
  • the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 35, or a sequence having 75% identity or greater (e.g., about 80, 85, 90, 95% identity or greater) to the foregoing.
  • the polynucleotide includes one or more substituted codons.
  • the one or more substituted codons are yeast codons.
  • the stretch of contiguous nucleotides from the first organism or second organism is about 1% to about 30% (e.g., about 2, 3,
  • the stretch of contiguous nucleotides from the second organism is about 1% to about 30% (e.g., about 2, 3, 4,
  • polynucleotide includes a nucleotide sequence of SEQ ID NO: 55, 56, 57, 59 or 61, fragment thereof, or sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing.
  • the polynucleotide encodes a polypeptide that includes an amino acid sequence of SEQ ID NO: 58, 60 or 62, fragment thereof, or sequence having 75% identity or greater (e.g., about 80, 85, 90, 95% identity or greater) to the foregoing.
  • the expression vector can include one or more point mutations.
  • the point mutation is at a position corresponding to position 179 of R. flavefaciens polypeptide having xylose isomerase activity. In some embodiments, the point mutation is a glycine 179 to alanine point mutation.
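A glycine-to-alanine substitution at position 179 can be expressed as a small, checked string edit. The sketch below is illustrative only: the protein string is a toy placeholder with a glycine placed at position 179 (it is not the R. flavefaciens sequence), the function name is hypothetical, and positions are treated as 1-based, as is conventional for protein numbering. Finding the "corresponding" position in a different xylose isomerase would additionally require a sequence alignment, which is not shown.

```python
def apply_point_mutation(protein: str, position: int,
                         expected: str, replacement: str) -> str:
    """Substitute one residue at a 1-based position, verifying the original."""
    index = position - 1
    if not 0 <= index < len(protein):
        raise ValueError("position is outside the protein sequence")
    if protein[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]


# Toy placeholder: M, then 177 serines, then G at position 179, then 20 lysines.
toy_protein = "M" + "S" * 177 + "G" + "K" * 20

# G179A: glycine (G) at position 179 replaced by alanine (A).
mutant = apply_point_mutation(toy_protein, 179, expected="G", replacement="A")
```

Verifying the expected residue before substituting guards against off-by-one numbering errors, which are easy to make when switching between 0-based code indices and 1-based protein positions.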
  • the microbes described herein can be used in fermentation methods.
  • a method includes contacting a microbe described herein with a feedstock comprising a five carbon molecule under conditions for generating ethanol.
  • the five carbon molecule includes xylose. In some embodiments, about 15 grams per liter of ethanol, or more, is generated within about 372 hours. In certain embodiments, about 2.0 grams per liter dry cell weight, or more, is generated within about 372 hours.
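The titers above can be converted to average volumetric rates by dividing by the elapsed time. This is a minimal arithmetic sketch with a hypothetical function name; because the text says "or more" and "within about 372 hours", the values computed here are best read as lower bounds on the average rate, not as measured productivities.

```python
def average_volumetric_rate(titer_g_per_l: float, hours: float) -> float:
    """Average accumulation rate in grams per liter per hour."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return titer_g_per_l / hours


# Figures taken from the text: 15 g/L ethanol and 2.0 g/L dry cell weight,
# each within about 372 hours.
ethanol_rate = average_volumetric_rate(15.0, 372.0)   # about 0.040 g/L/h
biomass_rate = average_volumetric_rate(2.0, 372.0)    # about 0.0054 g/L/h
```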
  • nucleic acids including a polynucleotide that includes a first stretch of contiguous nucleic acids from a first organism and a second stretch of contiguous nucleic acids from a second organism, where the polynucleotide encodes a polypeptide possessing a phosphogluconate dehydratase activity.
  • an expression vector comprising a polynucleotide that includes a first stretch of contiguous nucleotides from a first organism and a second stretch of contiguous nucleotides from a second organism, where the polynucleotide encodes a polypeptide possessing a phosphogluconate dehydratase activity.
  • the first organism and the second organism are the same, and in certain embodiments, the first organism and the second organism are different.
  • the first stretch of contiguous nucleotides and the second stretch of contiguous nucleotides independently are selected from nucleotide sequence that encodes a polypeptide having a phosphogluconate dehydratase activity.
  • nucleic acids including a polynucleotide that includes a first stretch of contiguous nucleic acids from a first organism and a second stretch of contiguous nucleic acids from a second organism, where the polynucleotide encodes a polypeptide possessing a 2-keto-3-deoxygluconate-6-phosphate aldolase activity.
  • an expression vector comprising a polynucleotide that includes a first stretch of contiguous nucleotides from a first organism and a second stretch of contiguous nucleotides from a second organism, where the polynucleotide encodes a polypeptide possessing a 2-keto-3-deoxygluconate-6-phosphate aldolase activity.
  • the first organism and the second organism are the same, and in certain embodiments, the first organism and the second organism are different.
  • the first stretch of contiguous nucleotides and the second stretch of contiguous nucleotides independently are selected from nucleotide sequence that encodes a polypeptide having a 2-keto-3-deoxygluconate-6-phosphate aldolase activity.
  • the expression vector includes a regulatory nucleotide sequence in operable linkage with the polynucleotide.
  • the regulatory nucleotide sequence comprises a promoter sequence.
  • the promoter sequence is an inducible promoter sequence.
  • the promoter sequence is a constitutively active promoter sequence.
  • a method for preparing an expression vector as described herein includes (i) providing a nucleic acid that contains a regulatory sequence, and (ii) inserting the polynucleotide into the nucleic acid in operable linkage with the regulatory sequence.
  • a microbe as described herein includes the nucleic acid of any one of the foregoing embodiments.
  • a microbe includes an expression vector of any one of the foregoing embodiments.
  • the microbe is a yeast.
  • the microbe is a Saccharomyces yeast, and in some embodiments, the microbe is a Saccharomyces cerevisiae yeast.
  • provided herein is a nucleic acid comprising polynucleotide
  • a phosphogluconate dehydratase enzyme (e.g., EDD)
  • a 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme (e.g., EDA)
  • a transaldolase enzyme (e.g., TAL1)
  • a transketolase enzyme (e.g., TKL1, TKL2, or TKL1 and TKL2)
  • a glucose-6-phosphate dehydrogenase enzyme (e.g., ZWF1)
  • a 6-phosphogluconolactonase enzyme (e.g., SOL3, SOL4, or SOL3 and SOL4)
  • a xylose isomerase enzyme or a xylose reductase (XR) enzyme and a xylitol dehydrogenase (XD) enzyme
  • a xylulokinase (XK) enzyme
  • polynucleotide subsequences encoding the phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme independently are from an Escherichia spp.
  • the polynucleotide encoding the phosphogluconate dehydratase enzyme and/or the 3-deoxygluconate-6-phosphate aldolase enzyme is a chimeric polynucleotide that includes part of such a sequence and part of another phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme sequence (e.g., from a different organism).
  • the polynucleotide subsequence that encodes the xylose isomerase enzyme is from a Ruminococcus spp. (e.g., Ruminococcus flavefaciens), and in some embodiments, is a chimeric polynucleotide that includes part of such a sequence and part of another xylose isomerase sequence (e.g., from a Piromyces spp.).
  • Non-limiting examples of xylose isomerase chimeric sequences are described herein.
  • a nucleic acid includes a polynucleotide subsequence that encodes a glucose-6-phosphate dehydrogenase enzyme (e.g., ZWF1) and/or a polynucleotide subsequence that encodes a 6-phosphogluconolactonase enzyme (e.g., SOL3/SOL4).
  • the polynucleotide subsequences that encode the glucose-6-phosphate dehydrogenase enzyme and the 6-phosphogluconolactonase enzyme are from a yeast, non-limiting examples of which are Saccharomyces spp. (e.g., Saccharomyces cerevisiae).
  • a nucleic acid includes a polynucleotide subsequence that encodes a glucose transporter (e.g., GAL2, GXS1, GXF1, HXT7).
  • the polynucleotide subsequence that encodes the glucose transporter is from a yeast, non-limiting examples of which are Saccharomyces spp. (e.g., Saccharomyces cerevisiae).
  • a nucleic acid includes a polynucleotide subsequence that alters the activity of 6-phosphogluconate dehydrogenase (decarboxylating) enzyme (e.g., GND1, GND2).
  • the polynucleotide subsequences that alter the activity of 6-phosphogluconate dehydrogenase (decarboxylating) enzyme are from a yeast.
  • a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a 6-phosphogluconate dehydrogenase (decarboxylating) enzyme.
  • a nucleic acid includes a polynucleotide subsequence that disrupts a phosphoglucose isomerase enzyme (e.g., PGI1).
  • a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a phosphoglucose isomerase enzyme.
  • a nucleic acid includes a polynucleotide subsequence that encodes a transaldolase enzyme (e.g., TAL1).
  • the polynucleotide subsequences that encode the transaldolase enzyme are from a yeast, non-limiting examples of which are Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida.
  • a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, transaldolase enzyme.
  • a nucleic acid includes a polynucleotide subsequence that encodes a transketolase enzyme (e.g., TKL1, TKL2, or TKL1 and TKL2).
  • the polynucleotide subsequences that encode the transketolase enzyme are from a yeast, non-limiting examples of which are Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida.
  • a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, transketolase enzyme.
  • a nucleic acid includes one or more promoters operable in a yeast (e.g., Saccharomyces spp. (e.g., Saccharomyces cerevisiae)), and in operable connection with one or more polynucleotide subsequences described above. Such promoters often are constitutively active and sometimes are operable under anaerobic and aerobic conditions.
  • Non-limiting examples of promoters include those that control glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1), phosphoglucokinase (PGK-1) and triose phosphate dehydrogenase (TDH-1).
  • GPD glucose phosphate dehydrogenase
  • TEF-1 translation elongation factor
  • PGK-1 phosphoglucokinase
  • TDH-1 triose phosphate dehydrogenase
  • a nucleic acid can be one or two nucleic acids in some embodiments, and each nucleic acid can include one or two or more of the polynucleotide subsequences and/or promoters described above.
  • a nucleic acid can be in circular (e.g., plasmid) or linear form, in some embodiments, and sometimes functions as an expression vector.
  • a nucleic acid functions as a tool for integrating the polynucleotide subsequences,
  • an engineered microbe comprising heterologous polynucleotide subsequences that encode a phosphogluconate dehydratase enzyme (e.g., EDD), a 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme (e.g., EDA), a xylose isomerase enzyme or a xylose reductase (XR) enzyme and a xylitol dehydrogenase (XD) enzyme, and a xylulokinase (XK) enzyme.
  • the microbe is a yeast, non-limiting examples of which are Saccharomyces spp.
  • polynucleotide subsequences encoding the phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme independently are from an Escherichia spp. (e.g., Escherichia coli) or Pseudomonas spp. (e.g., Pseudomonas aeruginosa), and in certain embodiments, the polynucleotide encoding the phosphogluconate dehydratase enzyme and/or the 3-deoxygluconate-6-phosphate aldolase enzyme is a chimeric polynucleotide that includes part of such a sequence and part of another phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme sequence (e.g., from a different organism).
  • the polynucleotide subsequence that encodes the xylose isomerase enzyme is from a Ruminococcus spp. (e.g., Ruminococcus flavefaciens).
  • xylose isomerase chimeric sequences are described herein.
  • the engineered microbe expresses a glucose-6-phosphate dehydrogenase enzyme (e.g., ZWF1) and/or a 6-phosphogluconolactonase enzyme (e.g., SOL3/SOL4).
  • the polynucleotide subsequences that encode the glucose-6-phosphate dehydrogenase enzyme and the 6-phosphogluconolactonase enzyme are from a yeast, non-limiting examples of which are Saccharomyces spp. (e.g.,
  • the polynucleotide subsequences that disrupt the 6-phosphogluconate dehydrogenase (decarboxylating) enzyme are from a yeast.
  • a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a 6-phosphogluconate dehydrogenase (decarboxylating) enzyme.
  • an engineered microbe sometimes expresses higher-than-normal levels (e.g., over-express) of an endogenous 6-phosphogluconolactonase enzyme and/or a glucose-6-phosphate
  • the engineered microbe includes a polynucleotide subsequence that encodes a glucose transporter (e.g., GAL2, GSX1, GXF1, HXT7).
  • the polynucleotide subsequence that encodes the glucose transporter is from a yeast, non-limiting examples of which are Saccharomyces spp. (e.g., Saccharomyces cerevisiae).
  • an engineered microbe sometimes expresses higher-than-normal levels (e.g., over-express) of one or more endogenous glucose transport enzymes (e.g., under control of a constitutive promoter, or multiple copies of the nucleotide subsequences that encode such enzymes are inserted in the microbe).
  • the engineered microbe includes a genetic alteration that reduces an endogenous phosphofructokinase enzyme activity.
  • a polynucleotide subsequence that encodes such an enzyme is altered such that enzyme activity is significantly reduced or not detectable in the engineered microbe.
  • a nucleic acid includes a polynucleotide subsequence that alters the activity of 6-phosphogluconate
  • a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a 6-phosphogluconate dehydrogenase (decarboxylating) enzyme.
  • a nucleic acid includes a polynucleotide subsequence that alters a phosphoglucose isomerase enzyme (e.g., PGI1) activity.
  • the polynucleotide subsequences that alter the phosphoglucose isomerase enzyme are from a yeast.
  • a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a 6-phosphogluconate dehydrogenase (decarboxylating) enzyme.
  • a nucleic acid includes a polynucleotide subsequence that alters a phosphoglucose isomerase enzyme (e.g., PGI1) activity.
  • a nucleic acid includes a polynucleotide subsequence that alters a transaldolase enzyme (e.g., TAL1).
  • the polynucleotide subsequences that alter the transaldolase enzyme activity increase the transaldolase activity, and in some embodiments the polynucleotide sequences are from a yeast, non-limiting examples of which are Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida.
  • a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a transaldolase enzyme.
  • a nucleic acid includes a polynucleotide subsequence that alters a transketolase enzyme (e.g., TKL1, TKL2, or TKL1 and TKL2).
  • the polynucleotide subsequences that alter the transketolase enzyme increase transketolase activity, and in some embodiments the polynucleotide sequences are from a yeast, non-limiting examples of which are Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida.
  • the polynucleotide subsequences that alter transketolase enzyme activity decrease the transketolase activity.
  • a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a transketolase enzyme.
  • An engineered microbe sometimes expresses higher-than-normal levels (e.g., over-express) of an endogenous activity and/or expresses a novel activity independently selected from the non-limiting group of activities including 6-phosphogluconate dehydrogenase (decarboxylating) (e.g.,
  • the higher than normal levels are achieved by (i) placing a polynucleotide subsequence encoding the activity in operable connection with a constitutive promoter or a strong inducible promoter, (ii) increasing the number of copies of a polynucleotide subsequence encoding the activity (e.g., integration of multiple copies in the genomic DNA, plasmids maintained in the organism at high copy number), and/or (iii) the like and combinations thereof.
  • the polynucleotide subsequence that encodes an activity described herein is from a yeast, non-limiting examples of which are Saccharomyces spp.
  • the polynucleotide subsequence that encodes an activity described herein is from a bacterium, non-limiting examples of which are Bacillus spp., Escherichia spp., Ruminococcus spp., Yarrowia spp., and other bacteria described herein.
  • the engineered microbe includes a genetic alteration that reduces or eliminates an endogenous activity independently selected from the non-limiting group of activities including 6-phosphogluconate dehydrogenase (decarboxylating) (e.g., GND1/GND2), glycerophosphate dehydrogenase (e.g., GPD1/GPD2), phosphofructokinase (e.g., PFK26/PFK27), membrane channel activity (e.g., FPS1), trehalose-6-phosphate synthase (e.g., TPS1), neutral trehalase (e.g., NTH1), alkaline phosphatase specific for p-nitrophenyl phosphate (e.g., PHO13), phosphoglucose isomerase (e.g., PGI1), transaldolase (e.g., TAL1), and/or transketolase (e.g., TKL
  • an activity is reduced or eliminated by disrupting and/or deleting nucleotide sequences or nucleotide subsequences that encode the activity.
  • the polynucleotide subsequences that are used to alter an activity are from a yeast, non-limiting examples of which are Kluyveromyces, Pichia,
  • polynucleotide subsequences that are used to alter an activity are from a bacterium, non-limiting examples of which are
  • a polynucleotide subsequence that encodes such an enzyme is altered such that enzyme activity is significantly reduced or not detectable in the engineered microbe.
  • An activity often is "eliminated" when the activity is not detectable in an engineered organism.
  • one or more activities addressed in this paragraph are reduced or eliminated and one or more activities addressed in this paragraph are increased in an engineered microorganism relative to a control microorganism not including the genetic alteration(s) causing such a reduction or increase.
  • all of the activities addressed in this paragraph are increased or reduced or eliminated.
  • one or more activities addressed in this paragraph are reduced or eliminated, one or more activities addressed in this paragraph are increased, and one or more activities in this paragraph are not altered in an engineered microorganism relative to a control microorganism not including the genetic alteration(s) causing such a reduction or increase.
  • the engineered microbe includes one or more promoters operable in a yeast (e.g., Saccharomyces spp. (e.g., Saccharomyces cerevisiae)), and in operable connection with one or more polynucleotide subsequences described above.
  • promoters often are constitutively active and sometimes are operable under anaerobic and aerobic conditions.
  • Non-limiting examples of promoters include those that control glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1), phosphoglucokinase (PGK-1) and triose phosphate dehydrogenase (TDH-1).
  • polynucleotide sequences and promoters described above sometimes are non-stably associated with the microbe (e.g., they are in a non-integrated nucleic acid (e.g., a plasmid)), and in some embodiments, are integrated in genomic DNA of the microbe.
  • the polynucleotide sequences are integrated in a transposition integration event, a homologous recombination integration event or a transposition integration event and a homologous recombination integration event.
  • a transposition integration event includes transposition of an operon comprising two or more of the polynucleotide
  • a homologous recombination integration event includes homologous recombination of an operon comprising two or more of the polynucleotide subsequences and/or promoters described above.
  • methods for producing xylulose and/or ethanol using an engineered microbe described above which comprise contacting the engineered microbe with a medium (e.g., feedstock) under conditions in which the microbe synthesizes xylulose and/or ethanol.
  • the engineered microbe synthesizes xylulose and/or ethanol to about 85% to about 99% of theoretical yield (e.g., about 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% of theoretical xylulose and/or ethanol yield).
  • the ethanol is separated and/or recovered from the engineered microorganism.
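The "percent of theoretical yield" figure above can be made concrete with a short calculation. The sketch below is illustrative only: it assumes the standard fermentation stoichiometries (1 glucose to 2 ethanol + 2 CO2; 3 xylose to 5 ethanol + 5 CO2) and standard molar masses, none of which are stated in this document, and the function names are hypothetical.

```python
# Illustrative sketch: percent of theoretical ethanol yield.
# Assumptions (not from the document): textbook stoichiometry and molar masses.

M_ETHANOL = 46.07   # g/mol
MOLAR_MASS = {"glucose": 180.16, "xylose": 150.13}  # g/mol

# Maximum mol ethanol per mol sugar:
#   glucose -> 2 ethanol + 2 CO2
#   3 xylose -> 5 ethanol + 5 CO2 (i.e., 5/3 ethanol per xylose)
MAX_MOL_ETHANOL = {"glucose": 2.0, "xylose": 5.0 / 3.0}


def theoretical_ethanol_g(sugar_consumed_g):
    """Return the maximum ethanol mass (g) for a dict of consumed sugar masses (g)."""
    total = 0.0
    for sugar, grams in sugar_consumed_g.items():
        moles_sugar = grams / MOLAR_MASS[sugar]
        total += moles_sugar * MAX_MOL_ETHANOL[sugar] * M_ETHANOL
    return total


def percent_of_theoretical(ethanol_g, sugar_consumed_g):
    """Return measured ethanol as a percentage of the theoretical maximum."""
    return 100.0 * ethanol_g / theoretical_ethanol_g(sugar_consumed_g)


if __name__ == "__main__":
    consumed = {"glucose": 80.0, "xylose": 20.0}  # hypothetical fermentation, grams
    print(round(theoretical_ethanol_g(consumed), 2))
    print(round(percent_of_theoretical(46.0, consumed), 1))
```

Both sugars give close to 0.51 g ethanol per g sugar at the theoretical maximum, so a fermentation that consumes 100 g of sugar and yields 46 g of ethanol sits at roughly 90% of theoretical yield, inside the 85% to 99% range recited above.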
  • polypeptide comprising: (i) the amino acid sequence of SEQ ID NO: 180, or (ii) an amino acid sequence that includes 1 to 10 amino acid substitutions, insertions or deletions with respect to (i), which polypeptide has a xylose isomerase activity.
  • the polypeptide is an isolated chimeric xylose isomerase enzyme.
  • an isolated polynucleotide selected from the group consisting of: (i) a polynucleotide comprising a nucleotide sequence that encodes a polypeptide sequence of SEQ ID NO: 180; and (ii) a polynucleotide comprising a nucleotide sequence that encodes a polypeptide sequence that includes 1 to 10 amino acid substitutions, insertions or deletions with respect to SEQ ID NO: 180, which polypeptide has a xylose isomerase activity.
  • the amino acid sequence e.g., polypeptide sequence
  • the amino acid sequence includes 1 to 10 conservative amino acid substitutions with respect to SEQ ID NO: 180.
  • the 1 to 10 amino acid substitutions, insertions or deletions do not substantially reduce or inhibit the activity of the polypeptide comprising SEQ ID NO: 180.
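One way to check whether a candidate sequence falls within "1 to 10 amino acid substitutions, insertions or deletions" of a reference is to count single-residue edits. The sketch below uses the Levenshtein edit distance as a stand-in for that count; this is an assumption for illustration, not a method described in the document. The function names are hypothetical and the sequences are toy strings, not SEQ ID NO: 180.

```python
# Illustrative sketch: count substitutions, insertions and deletions between
# two amino acid sequences using Levenshtein edit distance (an assumed proxy).

def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-residue substitutions, insertions or deletions."""
    prev = list(range(len(b) + 1))          # distance from "" to each prefix of b
    for i, res_a in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, res_b in enumerate(b, start=1):
            cost = 0 if res_a == res_b else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def within_variant_range(reference: str, candidate: str, max_changes: int = 10) -> bool:
    """True if candidate differs from reference by 1 to max_changes edits."""
    return 1 <= edit_distance(reference, candidate) <= max_changes


if __name__ == "__main__":
    ref = "MAKEYFPQIQ"   # toy reference, hypothetical
    var = "MAREYFPQ"     # one substitution and two deletions
    print(edit_distance(ref, var), within_variant_range(ref, var))
```

An identical sequence returns False here because it is covered by alternative (i) rather than by the 1 to 10 range; whether the variant retains xylose isomerase activity still has to be established experimentally.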
  • the nucleotide sequence is 90% or more identical to SEQ ID NO: 179 and encodes a polypeptide sequence of SEQ ID NO: 180 or a polypeptide sequence that includes 1 to 10 amino acid substitutions, insertions or deletions with respect to SEQ ID NO: 180.
  • an engineered yeast comprising a chimeric enzyme which enzyme comprises a polynucleotide comprising a nucleotide sequence of the foregoing.
  • a method for producing ethanol comprising: (a) providing an engineered yeast of the foregoing; and (b) contacting the engineered yeast with a 5 carbon sugar, a 6 carbon sugar or mixture comprising 5 carbon and 6 carbon sugars, under fermentation conditions whereby ethanol is produced by the engineered yeast.
  • the engineered yeast is a
  • the yeast is a Saccharomyces cerevisiae yeast.
  • the engineered yeast synthesizes ethanol to about 85% to about 99% of theoretical yield.
  • the method comprises recovering ethanol synthesized by the engineered yeast.
  • the engineered yeast comprises between about a 1-fold to about a 100-fold increase in ethanol production when compared to wild-type, parental, or partially engineered organisms of the same strain, under identical fermentation conditions.
  • an isolated nucleic acid including a polynucleotide that is 80% or more identical to SEQ ID NO: 179.
  • the polynucleotide encodes an amino acid sequence of SEQ ID NO: 180.
  • the nucleic acid includes a polynucleotide that is 85% or more identical to SEQ ID NO: 179.
  • the nucleic acid includes a polynucleotide that is 90% or more identical to SEQ ID NO: 179.
  • the nucleic acid includes a polynucleotide that is 95% or more identical to SEQ ID NO: 179.
  • the nucleic acid includes the polynucleotide of SEQ ID NO: 179, and in certain embodiments, the nucleic acid consists of SEQ ID NO: 179.
  • the nucleic acid is an expression vector.
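The identity thresholds recited above (80%, 85%, 90% and 95% or more identical to SEQ ID NO: 179) presuppose some way of scoring identity. The sketch below shows one common convention, assumed here for illustration: identical positions divided by the number of aligned columns, computed on two sequences that have already been aligned with "-" as the gap character. The document does not state which convention or alignment tool applies, and the function names are hypothetical.

```python
# Illustrative sketch: percent identity over a pre-computed pairwise alignment.
# Assumption: identity = matching columns / all aligned columns (gaps count as
# mismatches). Other conventions exist and give different numbers.

THRESHOLDS = (80, 85, 90, 95)  # the tiers recited in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two equal-length, gap-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns where both sequences have a gap; they carry no information.
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the highest recited threshold satisfied, or None if below 80%."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    a = "ATGGCTAAGG"   # toy sequences, hypothetical
    b = "ATGGCAAAGC"
    ident = percent_identity(a, b)
    print(ident, highest_threshold_met(ident))
```

Because the result depends on the alignment and on how gaps are counted, any reported identity should name the tool and parameters used.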
  • an engineered microorganism including a polynucleotide that is 80% or more identical to SEQ ID NO: 179.
  • the microorganism is a eukaryote.
  • the microorganism is a yeast.
  • the yeast is a Saccharomyces yeast, and in certain embodiments, the yeast is a Saccharomyces cerevisiae yeast.
  • a method for producing ethanol which includes contacting an engineered microorganism described herein with a 5 carbon sugar, a 6 carbon sugar or mixture including 5 carbon and 6 carbon sugars, under fermentation conditions whereby ethanol is produced by the engineered microorganism.
  • the engineered microorganism synthesizes ethanol to about 85% to about 99% of theoretical yield.
  • the method includes recovering ethanol synthesized by the engineered microorganism.
  • the engineered microorganism includes between about a 2-fold to about a 100-fold increase in ethanol production when compared to wild-type, parental, or partially engineered microorganisms of the same strain, under identical fermentation conditions. Additional embodiments can be found in Example 63: Examples of the embodiments. Certain embodiments are described further in the following description, examples, claims and drawings.
  • FIG. 1 depicts a metabolic pathway that produces ethanol as a by-product of cellular respiration.
  • the solid lines represent activities present in the Embden-Meyerhoff pathway (e.g., aerobic respiration). Dashed lines represent activities associated with the Entner-Doudoroff pathway (e.g., anaerobic respiration).
  • One or both pathways often can be operational in a microorganism.
  • the level of activity of each pathway can vary from organism to organism.
  • the arrow from FBP (e.g., fructose-1,6-bisphosphate, also referred to as F-1,6-BP)
  • G3P (e.g., glyceraldehyde-3-phosphate)
  • In FIGS. 2, 3 and 5, a smaller arrow from FBP to G3P is illustrated, indicating reduced or no conversion of FBP to G3P.
  • the reduction in conversion of FBP to G3P illustrated in FIGS. 2, 3 and 5 is a result of the reduction or elimination of the previous activity that converts fructose-6-phosphate (F6P) to FBP (e.g., the activity of PFK).
  • FIG. 2 depicts an engineered metabolic pathway that can be used to produce ethanol more efficiently in a host microorganism in which the pathway has been engineered.
  • the solid lines in FIGS. 2-5 represent the metabolic pathway naturally found in a host organism (e.g.,
  • FIGS. 2-5 represent a novel activity or pathway engineered into a microorganism to allow increased ethanol production efficiency.
  • In FIG. 2, the activity of an enzyme in the Embden-Meyerhoff pathway, phosphofructokinase (e.g., PFK), is permanently or temporarily reduced or eliminated.
  • the inactivation is shown as the "X" in FIG. 2.
  • Disruption of the activity of PFK serves to inactivate the Embden-Meyerhoff pathway (EM pathway).
  • EM pathway Embden-Meyerhoff pathway
  • ED pathway Entner-Doudoroff pathway
  • FIG. 3 depicts an engineered metabolic pathway that can be used to produce ethanol using xylose as a carbon source by introducing the activity into a microorganism.
  • the engineered microorganism can convert xylose to xylulose in a single reaction using the introduced xylose isomerase activity.
  • Xylose also can be metabolized by the combined activities of xylose reductase, and xylitol dehydrogenase, as depicted in FIG. 20.
  • Xylulose then can be fermented to ethanol by entering the EM pathway.
  • Engineered microorganisms also can use the increased efficiency of ethanol production associated with inactivation of the EM pathway and introduction of activities of the ED pathway, shown in FIG. 2 and discussed below.
  • the ability to utilize xylose efficiently (e.g., concurrently with six-carbon sugars or prior to the depletion of six-carbon sugars) can be provided by the introduction of the novel activities, xylose isomerase, or xylose reductase and xylitol dehydrogenase.
  • FIG. 4 depicts an engineered metabolic pathway that can be used to increase the efficiency of ethanol production (and other products) by introducing the ability to fix atmospheric carbon dioxide into a microorganism.
  • the engineered microorganism can incorporate or fix atmospheric carbon dioxide into organic molecules using the introduced phosphoenolpyruvate carboxylase activity. Carbon dioxide incorporated in this manner can be used as an additional carbon source that can increase production of many organic molecules, including ethanol.
  • Non-limiting examples of other products whose production can benefit from carbon fixation include pyruvate, oxaloacetate, glyceraldehyde-3-phosphate and the like.
  • FIG. 5 shows a combination of some engineered metabolic pathways described herein.
  • the combination of engineered metabolic pathways shown in FIG. 5 can provide significant increases in the production of ethanol (or other products) when compared to the wild type organism or organisms lacking one, two, three or more of the modifications.
  • Other combinations of engineered metabolic pathways not shown in FIG. 5 are possible, including but not limited to, combinations including increased alcohol tolerance, modified alcohol dehydrogenase 2 activity and/or modified thymidylate synthase activity, as described herein. Therefore, FIG. 5 also illustrates an
  • a method for generating an engineered microorganism with the ability to produce a greater amount of target product comprising expressing one or more genetically modified activities, described herein, in a host organism that produces the desired target (e.g., ethanol, pyruvate, oxaloacetate and the like, for example) via one or more metabolic pathways.
  • the combination of metabolic pathways includes those depicted in FIG. 5 in addition to combinations including one, two or three of the following activities: increased alcohol tolerance, modified alcohol dehydrogenase 2 activity and modified thymidylate synthase activity.
  • FIG. 6 shows DNA and amino acid sequence alignments for the nucleotide sequences of EDA (FIG. 6A, 6B) and EDD (FIG. 6C, 6D) genes from Zymomonas mobilis (native and optimized) and Escherichia coli.
  • FIG. 7 shows a representative western blot used to detect the presence of an enzyme associated with an activity described herein.
  • FIG. 8 illustrates schematic representations of native, modified and chimeric xylose isomerase genes.
  • FIG. 9 shows a representative Western blot used to detect gene products.
  • FIG. 10 graphically illustrates a comparison of specific activities of engineered mutant xylose isomerase enzymes. Results are presented as percent activity over wild type (WT) activity.
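A comparison of this kind normalizes each mutant's specific activity to the wild type value. The sketch below assumes that "percent activity over wild type" means the mutant value expressed as a percentage of the wild type value (so wild type is 100%), and that specific activity is enzyme units per milligram of protein; both are assumptions for illustration, and the numbers are hypothetical rather than data from FIG. 10.

```python
# Illustrative sketch: normalize specific activities to wild type (WT = 100%).
# Assumptions: specific activity = activity units / mg protein; values are made up.

def specific_activity(activity_units: float, protein_mg: float) -> float:
    """Activity units per mg of protein in the assayed extract."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return activity_units / protein_mg


def percent_of_wild_type(mutant_sa: float, wild_type_sa: float) -> float:
    """Mutant specific activity expressed as a percentage of wild type."""
    if wild_type_sa <= 0:
        raise ValueError("wild type specific activity must be positive")
    return 100.0 * mutant_sa / wild_type_sa


if __name__ == "__main__":
    wt = specific_activity(activity_units=2.0, protein_mg=1.0)       # hypothetical
    mutant = specific_activity(activity_units=4.5, protein_mg=1.5)   # hypothetical
    print(percent_of_wild_type(mutant, wt))
```

Normalizing to protein mass before comparing avoids attributing a difference in expression level or extract concentration to the enzyme itself.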
  • FIG. 11 illustrates comparative growth analysis results of yeast strains carrying vector only or a vector containing native Ruminococcus xylose isomerase, grown on media containing xylose. Experimental details and results of the growth assays are described in Example 13.
  • FIG. 12 illustrates comparative growth analysis results and measurement of ethanol production in yeast strains carrying vector only or a vector containing native Ruminococcus xylose isomerase. Growth of cells is shown by the lines connected by "diamonds" (vector with xylose isomerase) or "squares" (vector only). Ethanol production is shown by the lines connected by "x's" (vector with xylose isomerase) or "circles" (vector only). Experimental conditions and results are described in Example 13.
  • FIGS. 13A and 13B show representative Western blots used to detect levels of various exogenous EDD and EDA gene combinations expressed in a host organism. Experimental conditions and results are described in Example 17.
  • FIG. 14 graphically displays the relative activities of the various EDD/EDA combinations generated as described in Example 18.
  • FIG. 15 graphically represents the fermentation efficiency of engineered yeast strains carrying exogenous EDD/EDA gene combinations.
  • FIGS. 16A and 16B graphically illustrate fermentation data (e.g., cell growth, glucose usage and ethanol production) for engineered yeast strains generated as described herein.
  • FIG. 16A illustrates the fermentation data for engineered strain BF428 (BY4742 with vector controls).
  • FIG. 16B illustrates the fermentation data for engineered strain BF591 (BY4742 with EDD-PA01/EDA-PA01).
  • FIGS. 17A and 17B graphically illustrate fermentation data for engineered yeast strains described herein.
  • FIG. 17A illustrates the fermentation data for engineered strain BF738 (BY4742 tal1 with vector controls p426GPD and p425GPD).
  • FIG. 17B illustrates the fermentation data for engineered strain BF741 (BY4742 tal1 with plasmids pBF290 (EDD-PA01) and pBF292 (EDA-PA01)). Experimental conditions and results are described in Example 21.
  • FIGS. 18A and 18B graphically illustrate fermentation data for engineered yeast strains as described herein.
  • FIG. 18A illustrates the fermentation data for BF740 grown on 2% dextrose
  • FIG. 18B illustrates the fermentation data for BF743 grown on 2% dextrose.
  • Strain descriptions, experimental conditions and results are described in Example 22.
  • FIG. 19 graphically illustrates the results of coupled assay kinetics for single plasmid and two plasmid edd/eda expression vector systems. Vector construction and experimental conditions are described in Example 24.
  • FIG. 20 depicts an engineered metabolic pathway that can be used to produce ethanol using xylose as a carbon source by introducing the activity into a microorganism.
  • the engineered microorganism can convert xylose to xylulose by the activities of xylose reductase and xylitol dehydrogenase.
  • Xylose also can be metabolized by the combined activities of xylose reductase, and xylitol dehydrogenase, as depicted in FIG. 20.
  • Xylulose then can be fermented to ethanol by entering the EM pathway.
  • Engineered microorganisms also can use the increased efficiency of ethanol production associated with inactivation of the EM pathway and introduction of activities of the ED pathway, shown in FIG. 2 and discussed below.
  • the ability to utilize xylose efficiently can be provided by the introduction of the novel activities, xylose isomerase, or xylose reductase and xylitol dehydrogenase.
  • FIG. 21 graphically illustrates the results of xylose isomerase chimera generated with various 5' edge sequences. Experimental methods and results are described in Example 28.
  • FIG. 22 shows the results of western blots performed on xylose isomerase chimera generated with various 5' edge sequences. Experimental methods and results are described in Example 28.
  • FIG. 23 shows a western blot of E. coli crude extract illustrating the presence of the EDD protein at the expected size.
  • Lane 1 is a standard size ladder (Novex Sharp standard)
  • Lane 2 is 1 µg BF1055 cell lysate
  • Lane 3 is 10 µg BF1055 cell lysate
  • Lane 4 is 1.5 µg BF1706 cell lysate
  • Lane 5 is 15 µg BF1706 cell lysate.
  • FIG. 24 graphically illustrates the results of activity evaluations of EDA genes expressed in yeast.
  • FIG. 25 graphically illustrates the specific activity of various xylose isomerase candidate activities. Experimental methods and results are described in Example 41.
  • FIG. 26 presents a table summarizing the results of fermentations carried out, in various media, to determine differences in the ending parameters of fermentations comparing yCH1 and yCH24. Fermentation parameters summarized in the table are initial and final pH, initial and final OD600, and the amounts of ethanol and glycerol produced. Experimental conditions and results are explained in Example 45.
  • FIG. 27 graphically illustrates the results of larger scale (e.g., Multifor) fermentations for strains yCH153, yCH137 and yCH208 grown in YPD with 8% glucose. Experimental conditions and results are explained in Example 46.
  • FIG. 28 graphically illustrates the results of experiments performed to evaluate the effect of addition of magnesium (Mg) or manganese (Mn) to the EDA/EDD in vitro evaluation assay described herein. The results are presented in tabular form in Example 46.
  • FIG. 29 graphically shows the relative activities of native and codon optimized EDA proteins expressed in S. cerevisiae.
  • FIG. 30 graphically illustrates the results of experiments performed to identify the most active EDD genes when expressed in S. cerevisiae. Experimental conditions and results and explanation of abbreviations and symbols are described in Example 46.
  • FIG. 31 A-D diagrammatically illustrates nucleic acid constructs generated for engineering yeast strains described herein. The plasmid constructs are described in Example 48.
  • FIG. 32 illustrates the results of PCR analysis to confirm deletion of the PFK1 locus. Arrows indicate isolates that show the expected migration pattern of amplification products. Experimental conditions and results are described in Example 49.
  • FIG. 33 graphically illustrates the results of shake flask fermentations for strains yCH153, yCH137 and yCH208 grown in YPD with 8% glucose. The results represent the average of 8
  • FIG. 34 graphically illustrates the results of Multifor-based fermentations of strains yCH153 vs yCH208 in YPD with 8% glucose. Experimental conditions and results are described in Example 52.
  • FIG. 35 graphically illustrates the results of shake flask fermentations for strains yCH1 and yCH247 in UMM media.
  • FIG. 36 graphically illustrates the results of Multifor fermentations for strains yCH1 and yCH247 in UMM media.
  • FIG. 37 graphically illustrates the results of Multifor fermentations for strains yCH1 and yCH247 in UMM media.
  • FIG. 38 graphically illustrates the results of Multifor fermentations for strains yCH1 and yCH247 in YPD media. Experimental conditions and results are described in Example 49.
  • FIG. 39 shows the results of PCR analysis of yCH247 and descendants of yCH247.
  • FIG. 40 graphically illustrates the results of evaluation of EDD/EDA activity in yCH1, yCH247 and the 10 descendants from the stability study presented in FIG. 39. 75 µg of crude cell lysate was assayed.
  • FIG. 41 graphically illustrates the results of growth experiments performed on yCH1 derived strains with and without deletions of PFK1 activity under aerobic and anaerobic conditions. Experimental conditions and results are described in Example 49.
  • FIG. 42 graphically illustrates the results of growth experiments performed on BF903 derived strains with and without deletions of PFK1 activity under aerobic and anaerobic conditions.
  • FIG. 43 graphically illustrates the results of shake flask fermentations for strains BF903 and BF2100 in UMM media.
  • FIG. 44 graphically illustrates the results of shake flask fermentations for strains BF903 and BF2100 in YPD media. Experimental conditions and results are described in Example 50.
  • FIG. 45 diagrammatically illustrates an integration construct for inserting an alternative fungal derived xylose metabolism pathway into engineered yeast strains. The pathway and approach are described herein. Construct details are given in Example 56.
  • FIG. 46 graphically illustrates results of fermentations performed using strain BF3319. Glucose and xylose consumption and ethanol production are shown. Experimental details and results are given in Example 62.
  • FIGS. 47A and 47B are nucleotide sequence alignments of native and codon optimized Ruminococcus FD-1 xylose isomerase nucleotide sequences (e.g., labeled FD-1, 1, 2, 3 and 4, respectively) generated from the Ruminococcus FD-1 xylose isomerase amino acid sequence.
  • Ethanol is a two carbon, straight chain, primary alcohol that can be produced from fermentation (e.g., cellular respiration processes) or as a by-product of petroleum refining. Ethanol has widespread use in medicine, consumables, and in industrial processes where it often is used as an essential solvent and a precursor, or feedstock, for the synthesis of other products (e.g., ethyl halides, ethyl esters, diethyl ether, acetic acid, ethyl amines and to a lesser extent butadiene, for example).
  • the largest use of ethanol, worldwide, is as a motor fuel and fuel additive. Greater than 90% of the cars produced worldwide can run efficiently on hydrous ethanol (e.g., 95% ethanol and 5% water). Ethanol also is commonly used for production of heat and light.
  • Biomass produced in the paper pulping and wood milling industries contains both five-carbon and six-carbon sugars. Use of this wasted biomass could allow production of significant amounts of bio-fuels and products, while reducing the use of land that could be used for food production.
  • Predominant forms of sugars in the biomass produced in the paper pulping and wood milling industries are glucose and xylose.
  • Provided herein are methods for producing ethanol, ethanol derivatives and/or conjugates and other organic chemical intermediates (e.g., pyruvate, acetaldehyde, glyceraldehyde-3-phosphate, and the like) using biological systems. Such production systems may have significantly less environmental impact and could be economically competitive with current manufacturing systems.
  • microorganisms are engineered to contain at least one heterologous gene encoding an enzyme, where the enzyme is a member of a novel pathway engineered into the microorganism.
  • an organism may be selected for elevated activity of a native enzyme. Genetically engineered microorganisms described herein produce organic molecules for industrial uses.
  • the organisms are designed to be "feedstock flexible" in that they can use five-carbon sugars (e.g., pentose sugars such as xylose, for example), six-carbon sugars (e.g., hexose sugars such as glucose or fructose, for example) or both as carbon sources. Further, the organisms described herein have been designed to be highly efficient in their use of hexose sugars to produce desired organic molecules. To that end, the microorganisms described herein are path flexible such that the microorganisms are able to direct hexose sugars primarily to either (i) the traditional glycolysis pathway (the Embden-Meyerhoff pathway) thereby generating ATP energy for cell growth and division at certain times, or (ii) a separate glycolytic pathway (the Entner-Doudoroff pathway) thereby producing significant levels of pyruvic acid, a key 3-carbon intermediate for producing many desired industrial organic molecules.
  • Pathway selection in the microorganism can be directed via one or more environmental switches such as a temperature change, oxygen level change, addition or subtraction of a component of the culture medium, or combinations thereof.
  • the metabolic pathway flexibility of microorganisms described herein allows the microorganisms to efficiently use hexose sugars, which ultimately can lead to microorganisms capable of producing a greater amount of industrial chemical product per gram of feedstock as compared with conventional microorganisms (e.g., the organism from which the engineered organism was generated, for example).
  • the metabolic pathway flexibility of the engineered microorganisms described herein is generated by adding or increasing metabolic activities associated with the Entner-Doudoroff pathway.
  • the metabolic activities added are phosphogluconate dehydratase (e.g., EDD gene), 2-keto-3-deoxygluconate-6-phosphate aldolase (e.g., EDA gene) or both.
  • a number of industrially useful microorganisms (e.g., microorganisms used in fermentation processes, yeast for example) metabolize xylose inefficiently or are incapable of metabolizing xylose. Many organisms that can metabolize xylose do so only after all glucose and/or other six-carbon sugars have been depleted.
  • microorganisms described herein have been engineered to efficiently utilize five-carbon sugars (e.g., xylose, for example) as an alternative or additional source of carbon, concurrently with and/or prior to six-carbon sugar usage, by the incorporation of a heterologous nucleic acid (e.g., gene) encoding a xylose isomerase, in some embodiments, and in certain embodiments, by the incorporation of a heterologous nucleic acid encoding a xylose reductase and a xylitol dehydrogenase.
  • Xylose isomerase converts the five-carbon sugar xylose to xylulose, in some embodiments.
  • xylose reductase and xylitol dehydrogenase convert xylose to xylulose, in certain embodiments.
  • Xylulose can ultimately be converted to pyruvic acid or to ethanol through metabolism via the Embden-Meyerhoff or Entner-Doudoroff pathways.
  • microorganisms described herein are engineered to express enzymes such as phosphoenolpyruvate (PEP) carboxylase or ribulose 1,5-bis-phosphate carboxylase (Rubisco).
  • dehydrogenase activity (e.g., by an enzyme like alcohol dehydrogenase 1 or ADH1, for example).
  • ethanol can readily be converted back to acetaldehyde by the action of the enzyme alcohol dehydrogenase 2 (e.g., ADH2), thus lowering the yield of ethanol produced.
  • microorganisms described herein are modified to reduce or eliminate the activity of ADH2, to allow increased yields of ethanol.
  • microorganisms described herein also are modified to have a higher tolerance to alcohol, thus enabling even higher yields of alcohol as a fermentation product without inhibition of cellular processes due to increased levels of alcohol in the growth medium.
  • a microorganism selected often is suitable for genetic manipulation and often can be cultured at cell densities useful for industrial production of a target product.
  • a microorganism selected often can be maintained in a fermentation device.
  • engineered microorganism refers to a modified microorganism that includes one or more activities distinct from an activity present in a microorganism utilized as a starting point (hereafter a "host microorganism").
  • An engineered microorganism includes a heterologous polynucleotide in some embodiments, and in certain embodiments, an engineered organism has been subjected to selective conditions that alter an activity, or introduce an activity, relative to the host microorganism.
  • an engineered microorganism has been altered directly or indirectly by a human being.
  • a host microorganism sometimes is a native microorganism, and at times is a microorganism that has been engineered to a certain point.
  • an engineered microorganism is a single cell organism, often capable of dividing and proliferating.
  • a microorganism can include one or more of the following features: aerobe, anaerobe, filamentous, non-filamentous, monoploid, diploid, auxotrophic and/or non-auxotrophic.
  • an engineered microorganism is a prokaryotic microorganism.
  • an engineered microorganism is a non-prokaryotic microorganism.
  • an engineered microorganism is a eukaryotic microorganism (e.g., yeast, fungi, amoeba). Any suitable yeast may be selected as a host microorganism, engineered microorganism or source for a heterologous polynucleotide.
  • Yeast include, but are not limited to, Yarrowia yeast (e.g., Y. lipolytica (formerly classified as Candida lipolytica)), Candida yeast (e.g., C. revkaufi, C.
  • Rhodotorula yeast (e.g., R. glutinis, R. graminis)
  • Rhodosporidium yeast (e.g., R. toruloides)
  • Saccharomyces yeast (e.g., S. cerevisiae, S. bayanus, S. pastorianus, S. carlsbergensis)
  • Cryptococcus yeast, Trichosporon yeast (e.g., T. pullans, T. cutaneum)
  • Pichia yeast (e.g., P. pastoris)
  • Lipomyces yeast (e.g., L. starkeyii, L. lipoferus).
  • a yeast is a S. cerevisiae strain including, but not limited to,
  • a yeast is a Y. lipolytica strain that includes, but is not limited to, ATCC20362, ATCC8862, ATCC18944, ATCC20228, ATCC76982 and LGAM S(7)1 strains (Papanikolaou S., and Aggelis G., Bioresour. Technol. 82(1):43-9 (2002)).
  • a yeast is a C. tropicalis strain that includes, but is not limited to, ATCC20336, ATCC20913, SU-2 (ura3-/ura3-), ATCC20962, H5343 (beta oxidation blocked; US Patent No. 5648247) strains.
  • Any suitable fungus may be selected as a host microorganism, engineered microorganism or source for a heterologous polynucleotide.
  • suitable fungi include, but are not limited to, Aspergillus fungi (e.g., A. parasiticus, A. nidulans), Thraustochytrium fungi,
  • a fungus is an A. parasiticus strain that includes, but is not limited to, strain ATCC24690, and in certain embodiments, a fungus is an A. nidulans strain that includes, but is not limited to, strain ATCC38163.
  • Any suitable prokaryote may be selected as a host microorganism, engineered microorganism or source for a heterologous polynucleotide.
  • a Gram negative or Gram positive bacteria may be selected. Examples of bacteria include, but are not limited to, Bacillus bacteria (e.g., B. subtilis, B. megaterium, B. stearothermophilus), Bacteroides bacteria (e.g., Bacteroides uniformis, Bacteroides thetaiotaomicron), Clostridium bacteria (e.g., C. phytofermentans, C. thermohydrosulfuricum, C.
  • H10 cellulyticum
  • Acinetobacter bacteria, Nocardia bacteria
  • Lactobacillus bacteria (e.g., Lactobacillus pentosus)
  • Xanthobacter bacteria Escherichia bacteria
  • Escherichia bacteria e.g., E. coli (e.g., strains
  • Streptomyces bacteria e.g., Streptomyces rubiginosus, Streptomyces murinus
  • Erwinia bacteria Klebsiella bacteria
  • Serratia bacteria (e.g., S. marcescens)
  • Pseudomonas bacteria e.g., P. aeruginosa
  • Salmonella bacteria e.g., S. typhimurium, S.
  • Thermus bacteria (e.g., Thermus thermophilus)
  • Thermotoga bacteria (e.g., Thermotoga maritima, Thermotoga neapolitana)
  • Ruminococcus e.g., Ruminococcus environmental samples, Ruminococcus albus, Ruminococcus bromii, Ruminococcus callidus, Ruminococcus flavefaciens, Ruminococcus gaenteauii, Ruminococcus gnavus, Ruminococcus lactaris, Ruminococcus obeum, Ruminococcus sp., Ruminococcus sp. 14531, Ruminococcus sp.
  • Ruminococcus sp. 16442, Ruminococcus sp. 18P13, Ruminococcus sp. 25F6, Ruminococcus sp. 25F7, Ruminococcus sp. 25F8, Ruminococcus sp. 4_1_47FAA, Ruminococcus sp. 5, Ruminococcus sp. 5_1_39BFAA, Ruminococcus sp. 7L75, Ruminococcus sp. 8_1_37FAA, Ruminococcus sp. 9SE51,
  • Ruminococcus sp. C047, Ruminococcus sp. C07,
  • Ruminococcus sp. CS1, Ruminococcus sp. CS6, Ruminococcus sp. DJF_VR52, Ruminococcus sp. DJF_VR66, Ruminococcus sp. DJF_VR67, Ruminococcus sp. DJF_VR70k1, Ruminococcus sp. DJF_VR87, Ruminococcus sp. Eg2, Ruminococcus sp. Egf, Ruminococcus sp. END-1, Ruminococcus sp. FD1, Ruminococcus sp. GM2/1, Ruminococcus sp. ID1, Ruminococcus sp. ID8, Ruminococcus sp. K-1, Ruminococcus sp. KKA Seq234, Ruminococcus sp. M-1, Ruminococcus sp.
  • Bacteria also include, but are not limited to, photosynthetic bacteria (e.g., green non-sulfur bacteria (e.g., Chloroflexus bacteria (e.g., C. aurantiacus), Chloronema bacteria (e.g., C. giganteum)), green sulfur bacteria (e.g., Chlorobium bacteria (e.g., C. limicola), Pelodictyon bacteria (e.g., P. luteolum)), purple sulfur bacteria (e.g., Chromatium bacteria (e.g., C.
  • Rhodospirillum bacteria e.g., R. rubrum
  • Rhodobacter bacteria e.g., R. sphaeroides, R. capsulatus
  • Rhodomicrobium bacteria (e.g., R. vannielii)
  • Cells from non-microbial organisms can be utilized as a host microorganism, engineered microorganism or source for a heterologous polynucleotide.
  • Examples of such cells include, but are not limited to, insect cells (e.g., Drosophila (e.g., D. melanogaster), Spodoptera (e.g., S. frugiperda Sf9 or Sf21 cells) and Trichoplusia (e.g., High-Five cells)); nematode cells (e.g., C. elegans cells); avian cells; amphibian cells (e.g., Xenopus laevis cells); reptilian cells; and mammalian cells (e.g., NIH3T3, 293, CHO, COS, VERO, C127, BHK, Per-C6, Bowes melanoma and HeLa cells).
  • Microorganisms or cells used as host organisms or source for a heterologous polynucleotide are commercially available.
  • Host microorganisms and engineered microorganisms may be provided in any suitable form.
  • such microorganisms may be provided in liquid culture or solid culture (e.g., agar-based medium), which may be a primary culture or may have been passaged (e.g., diluted and cultured) one or more times.
  • Microorganisms also may be provided in frozen form or dry form (e.g., lyophilized). Microorganisms may be provided at any suitable concentration.
  • Embden-Meyerhoff pathway operates primarily under aerobic (e.g., oxygen rich) conditions.
  • the other pathway operates primarily under anaerobic (e.g., oxygen poor) conditions, producing pyruvate that can be converted to lactic acid. Lactic acid can be further metabolized upon a return to appropriate conditions.
  • the EM pathway produces two ATP for each six-carbon sugar metabolized, as compared to one ATP produced for each six-carbon sugar metabolized in the ED pathway.
  • the ED pathway yields ethanol more efficiently than the EM pathway with respect to a given amount of input carbon, as seen by the lower net energy yield.
  • yeast preferentially use the EM pathway for metabolism of six-carbon sugars, thereby preferentially using the pathway that yields more energy and less desired product.
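The energy comparison above can be made concrete with a short calculation. The following is a minimal sketch, assuming only the net ATP figures stated in the text (two ATP per six-carbon sugar via the EM pathway, one via the ED pathway). The function names, the dictionary and the idea of a mixed EM/ED split are illustrative assumptions, not part of the described methods.

```python
# Net ATP per six-carbon sugar, as stated in the text above.
NET_ATP_PER_HEXOSE = {"EM": 2, "ED": 1}


def net_atp(pathway: str, hexose_units: float) -> float:
    """Net ATP formed when `hexose_units` of sugar pass through `pathway`."""
    return NET_ATP_PER_HEXOSE[pathway] * hexose_units


def mixed_atp(hexose_units: float, ed_fraction: float) -> float:
    """Net ATP when `ed_fraction` of the sugar goes via ED and the rest via EM."""
    if not 0.0 <= ed_fraction <= 1.0:
        raise ValueError("ed_fraction must be between 0 and 1")
    via_ed = hexose_units * ed_fraction
    via_em = hexose_units * (1.0 - ed_fraction)
    return net_atp("ED", via_ed) + net_atp("EM", via_em)


if __name__ == "__main__":
    # Hypothetical ED fractions, chosen only for illustration.
    for fraction in (0.0, 0.6, 1.0):
        atp = mixed_atp(100, fraction)
        print(f"ED fraction {fraction:.0%}: {atp:.0f} net ATP per 100 hexose units")
```

Under these assumptions, shifting sugar from the EM pathway to the ED pathway lowers the net ATP yield per unit of sugar, which is the trade-off the text associates with more efficient ethanol production.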
  • the following steps and enzymatic activities metabolize six-carbon sugars via the EM pathway.
  • Six-carbon sugars (glucose, sucrose, fructose, hexose and the like) are converted to glucose-6-phosphate by hexokinase or glucokinase (e.g., HXK or GLK, respectively).
  • Glucose-6-phosphate can be converted to fructose-6-phosphate by phosphoglucoisomerase (e.g., PGI).
  • Fructose-6-phosphate can be converted to fructose-1,6-bisphosphate by phosphofructokinase (e.g., PFK).
  • Fructose-1,6-bisphosphate represents a key intermediate in the metabolism of six-carbon sugars, as the next enzymatic reaction converts the six-carbon sugar into two three-carbon sugars.
  • the reaction is catalyzed by fructose bisphosphate aldolase and yields a mixture of dihydroxyacetone phosphate (DHAP) and glyceraldehyde-3-phosphate (G-3-P).
  • the mixture of the two three-carbon sugars is preferentially converted to glyceraldehyde-3-phosphate by the action of triosephosphate isomerase.
  • G-3-P is converted to 1,3-diphosphoglycerate (1,3-DPG) by glyceraldehyde-3-phosphate dehydrogenase (GLD).
  • 1,3-DPG is converted to 3-phosphoglycerate (3-P-G) by phosphoglycerate kinase (PGK).
  • 3-P-G is converted to 2-phosphoglycerate (2-P-G) by phosphoglyceromutase (GPM).
  • 2-P-G is converted to phosphoenolpyruvate (PEP) by enolase (ENO).
  • PEP is converted to pyruvate (PYR) by pyruvate kinase (PYK).
  • PYR is converted to acetaldehyde by pyruvate decarboxylase (PDC).
  • Acetaldehyde is converted to ethanol by alcohol dehydrogenase 1 (ADH1).
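The sequence of EM pathway steps listed above can be sketched as an ordered table of (substrate, product, enzyme) entries. This is an illustrative simplification that treats the pathway as a single linear chain. The enzyme abbreviations follow the text where it gives them; the labels `FBA` and `TPI` (for fructose bisphosphate aldolase and triosephosphate isomerase), the metabolite label `F-1,6-BP`, and the function name `enzymes_from` are assumptions introduced here.

```python
# Each step: (substrate, product, enzyme abbreviation).
# FBA, TPI and F-1,6-BP are labels assumed for this sketch; the other
# abbreviations are the ones used in the text.
EM_STEPS = [
    ("six-carbon sugar", "G-6-P", "HXK/GLK"),
    ("G-6-P", "F-6-P", "PGI"),
    ("F-6-P", "F-1,6-BP", "PFK"),
    ("F-1,6-BP", "DHAP + G-3-P", "FBA"),
    ("DHAP + G-3-P", "G-3-P", "TPI"),
    ("G-3-P", "1,3-DPG", "GLD"),
    ("1,3-DPG", "3-P-G", "PGK"),
    ("3-P-G", "2-P-G", "GPM"),
    ("2-P-G", "PEP", "ENO"),
    ("PEP", "PYR", "PYK"),
    ("PYR", "acetaldehyde", "PDC"),
    ("acetaldehyde", "ethanol", "ADH1"),
]


def enzymes_from(substrate: str, product: str = "ethanol") -> list[str]:
    """List, in order, the enzymes between `substrate` and `product`."""
    enzymes = []
    collecting = False
    for src, dst, enzyme in EM_STEPS:
        if src == substrate:
            collecting = True
        if collecting:
            enzymes.append(enzyme)
            if dst == product:
                return enzymes
    raise ValueError(f"no route from {substrate!r} to {product!r}")


if __name__ == "__main__":
    print(enzymes_from("PEP"))
    print(enzymes_from("G-6-P", "F-1,6-BP"))
```

A table of this kind makes it easy to see which activities lie downstream of a given intermediate, for example that PFK sits between fructose-6-phosphate and every later step of the EM pathway.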
  • enzymes in the EM pathway are reversible.
  • the enzymes in the EM pathway that are not reversible, and that provide a useful activity with which to control six-carbon sugar metabolism via the EM pathway, include, but are not limited to, phosphofructokinase and alcohol dehydrogenase.
  • reducing or eliminating the activity of phosphofructokinase may inactivate the EM pathway.
  • Engineering microorganisms with modified activities in PFK and/or ADH may yield increased product output as compared to organisms with the wild type activities, in certain embodiments.
  • modifying a reverse activity may also yield an increase in product yield by reducing or eliminating the back conversion of products by the backwards reaction.
  • the activity which catalyzes the conversion of ethanol to acetaldehyde is alcohol dehydrogenase 2 (ADH2). Reducing or eliminating the activity of ADH2 can increase the yield of ethanol per unit of carbon input due to the inactivation of the conversion of ethanol to acetaldehyde, in certain embodiments.
  • certain reversible activities also can be used to control six-carbon sugar metabolism via the EM pathway, in some embodiments.
  • a non-limiting example of a reversible enzymatic activity that can be utilized to control six-carbon sugar metabolism includes phosphoglucose isomerase (PGI).
  • a microorganism may be engineered to include or regulate one or more activities in the Embden-Meyerhoff pathway, for example. In some embodiments, one or more of these activities may be altered such that the activity or activities can be increased or decreased according to a change in environmental conditions. In certain embodiments, one or more of the activities (e.g., PGI, PFK or ADH2) can be altered to allow regulated control and an alternative pathway for more efficient carbon metabolism can be provided (e.g., one or more activities from the ED pathway, for example).
  • An engineered organism with the EM pathway under regulatable control and a novel or enhanced ED pathway would be useful for producing significantly more ethanol or other end product from a given amount of input feedstock.
  • Ethanol (or other product) producing activity can be provided by any non-mammalian source in certain embodiments. Such sources include, without limitation, eukaryotes such as yeast and fungi and prokaryotes such as bacteria. In some embodiments, the activity of one or more (e.g., 2, 3, 4, 5 or more) pathway members in an EM pathway is reduced or removed to undetectable levels.
  • An engineered microorganism may, in some embodiments, preferentially metabolize six-carbon sugars via the ED pathway as opposed to the EM pathway under certain conditions.
  • Such engineered microorganisms may metabolize about 60% or more of the available six-carbon sugars via the ED pathway (e.g., about 62%, 64%, 66%, 68%, 70%, 72%, 74%, 76%, 78%, 80%, 82%, 84%, 86%, 88%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, or greater than any one of the foregoing), and such fraction of the available six-carbon sugars are not metabolized by the EM pathway, under certain conditions.
  • a microorganism may metabolize six-carbon sugars substantially via the ED pathway, and not the EM pathway, in certain embodiments (e.g., 99% or greater, or 100%, of the available six-carbon sugars are metabolized via the ED pathway).
  • a six-carbon sugar is deemed as being metabolized via a particular pathway when the sugar is converted to end metabolites of the pathway, and not intermediate metabolites only, of the particular pathway.
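The thresholds described above (about 60% or more for preferential ED metabolism, 99% or greater for metabolism substantially via the ED pathway) can be expressed as a simple classification. This is a hedged sketch: the function name, the returned labels and the treatment of "about 60%" as an exact cut-off are assumptions made for illustration.

```python
def classify_ed_usage(ed_units: float, total_units: float) -> str:
    """Classify pathway usage from the share of sugar metabolized via ED.

    `ed_units` should count only sugar converted to end metabolites of the
    ED pathway, not sugar that reached intermediate metabolites only.
    """
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    if not 0 <= ed_units <= total_units:
        raise ValueError("ed_units must lie between 0 and total_units")
    fraction = ed_units / total_units
    if fraction >= 0.99:
        return "substantially ED"
    if fraction >= 0.60:
        return "preferentially ED"
    return "not preferentially ED"


if __name__ == "__main__":
    # Hypothetical measurements, in arbitrary units of sugar consumed.
    for ed, total in ((59, 100), (72, 100), (99.5, 100)):
        print(ed, total, classify_ed_usage(ed, total))
```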
  • a microorganism may preferentially metabolize certain sugars under the ED pathway after a certain time after the microorganism is exposed to a certain set of conditions (e.g., there may be a time delay after a microorganism is exposed to a certain set of conditions before the microorganism preferentially metabolizes sugars by the ED pathway).
  • Certain novel activities involved in the metabolism of six-carbon sugars by the ED pathway can be engineered into a desired yeast strain to increase the efficiency of ethanol (or other products) production.
  • Yeast do not have an activity that converts 6-phosphogluconate to 2-keto-3-deoxy-6-p-gluconate or an activity that converts 2-keto-3-deoxy-6-p-gluconate to pyruvate.
  • Addition of these activities to engineered yeast can allow the engineered microorganisms to increase fermentation efficiency by allowing yeast to ferment ethanol under anaerobic conditions without having to use the EM pathway and expend additional energy.
  • the engineered microorganism can benefit by producing ethanol more efficiently, with respect to a given amount of input carbon, than by using the native EM pathway.
  • Bacteria often have enzymatic activities that confer the ability to anaerobically metabolize six-carbon sugars to ethanol. These activities are associated with the ED pathway and include, but are not limited to, phosphogluconate dehydratase (e.g., the EDD gene, for example), and 2-keto-3-deoxygluconate-6-phosphate aldolase (e.g., the EDA gene, for example).
  • Phosphogluconate dehydratase converts 6-phosphogluconate to 2-keto-3-deoxy-6-p-gluconate.
  • 2-keto-3-deoxygluconate-6-phosphate aldolase converts 2-keto-3-deoxy-6-p-gluconate to pyruvate.
  • these activities can be introduced into a host organism to generate an engineered microorganism which gains the ability to use the ED pathway to produce ethanol more efficiently than the non-engineered starting organism, by virtue of the lower net energy yield by the ED pathway.
  • a microorganism may be engineered to include or regulate one or more activities in the Entner-Doudoroff pathway. In some embodiments, one or more of these activities may be altered such that the activity or activities can be increased or decreased according to a change in environmental conditions.
  • Nucleic acid sequences encoding Embden-Meyerhoff pathway and Entner-Doudoroff pathway activities can be obtained from any suitable organism (e.g., plants, bacteria, and other microorganisms, for example) and any of these activities can be used herein with the proviso that the nucleic acid sequence is naturally active in the chosen microorganism when expressed, or can be altered or modified to be active.
  • Yeast also can have endogenous or heterologous enzymatic activities that enable the organism to anaerobically metabolize six carbon sugars.
  • Saccharomyces cerevisiae used in fermentation often convert glucose-6-phosphate (G-6-P) to fructose-6-phosphate (F-6-P) via phosphoglucose isomerase (EC 5.3.1.9); for example, up to 95% of G-6-P is converted to F-6-P in this manner. Only a minor proportion of G-6-P is converted to 6-phosphoglucono-lactone (6-PGL) by an alternative enzyme, glucose-6-phosphate dehydrogenase (EC 1.1.1.49).
  • Yeast engineered to carry both Entner-Doudoroff (ED) and Embden-Meyerhoff (EM) pathways often convert sugars to ethanol using the EM pathway preferentially. Inactivation of one or more activities in the EM pathway can result in conversion of sugars to ethanol using the ED pathway preferentially, in some embodiments.
  • Phosphoglucose isomerase (EC 5.3.1.9) catalyzes the reversible interconversion of glucose-6-phosphate and fructose-6-phosphate.
  • Phosphoglucose isomerase is encoded by the PGI1 gene in S. cerevisiae. The proposed mechanism for sugar isomerization involves several steps and is thought to occur via general acid/base catalysis. Since glucose 6-phosphate and fructose 6-phosphate exist predominantly in their cyclic forms, PGI is believed to catalyze first the opening of the hexose ring to yield the straight chain form of the substrates.
  • Glucose 6-phosphate and fructose 6-phosphate then undergo isomerization via formation of a cis-enediol intermediate with the double bond located between C-1 and C-2.
  • Phosphoglucose isomerase sometimes also is referred to as glucose-6-phosphate isomerase or phosphohexose isomerase.
  • PGI is involved in different pathways in different organisms. In some higher organisms PGI is involved in glycolysis, and in mammals PGI also is involved in gluconeogenesis. In plants PGI is involved in carbohydrate biosynthesis, and in some bacteria PGI provides a gateway for fructose into the Entner-Doudoroff pathway. PGI also is known as neuroleukin (a neurotrophic factor that mediates the differentiation of neurons), autocrine motility factor (a tumor-secreted cytokine that regulates cell motility), differentiation and maturation mediator and myofibril-bound serine proteinase inhibitor, and has different roles inside and outside the cell. In the cytoplasm, PGI catalyses the second step in glycolysis, while outside the cell it serves as a nerve growth factor and cytokine. PGI activity is involved in cell cycle progression and completion of the
  • phosphoglucose isomerase activity is altered in an engineered microorganism. In some embodiments phosphoglucose isomerase activity is decreased or disrupted in an engineered microorganism. In certain embodiments, decreasing or disrupting phosphoglucose isomerase activity may be desirable to decrease or eliminate the isomerization of glucose-6-phosphate to fructose-6-phosphate, thereby increasing the proportion of glucose-6-phosphate converted to gluconolactone-6-phosphate by the activity encoded by ZWF1 (e.g., glucose-6-phosphate dehydrogenase).
  • Increased levels of gluconolactone-6-phosphate can be further metabolized and thereby improve fermentation of sugar to ethanol via activities in the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
  • Decreased or disrupted phosphoglucose isomerase (EC 5.3.1.9) activity in yeast may be achieved by any suitable method, or as described herein.
  • Non-limiting examples of methods suitable for decreasing or disrupting the activity of phosphoglucose isomerase include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast, disruption of both copies of the gene in a diploid yeast, expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof.
  • a gene used to knockout one activity can also introduce or increase another activity.
  • PGM genes may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • Glucose-6-phosphate dehydrogenase (EC 1.1.1.49) catalyzes the first step of the pentose phosphate pathway, and is encoded by the S. cerevisiae gene ZWF1.
  • the reaction for the first step in the PPP pathway is:
  • D-glucose 6-phosphate + NADP+ → D-glucono-1,5-lactone 6-phosphate + NADPH + H+
  • the enzyme regenerates NADPH from NADP+ and is important both for maintaining cytosolic levels of NADPH and protecting yeast against oxidative stress.
  • Zwf1p expression in yeast is constitutive, and the activity is inhibited by NADPH such that processes that decrease the cytosolic levels of NADPH stimulate the oxidative branch of the pentose phosphate pathway.
  • Amplification of glucose-6-phosphate dehydrogenase activity in yeast may be desirable to increase the proportion of glucose-6-phosphate converted to 6-phosphoglucono-lactone and thereby improve fermentation of sugar to ethanol via the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
  • Glucose-6-phosphate dehydrogenase (EC 1.1.1.49) activity in yeast may be amplified by over-expression of the ZWF1 gene by any suitable method.
  • methods suitable to amplify or over express ZWF1 include amplifying the number of ZWF1 genes in yeast following transformation with a high-copy number plasmid (e.g., such as one containing a 2-micron origin of replication), integration of multiple copies of ZWF1 into the yeast genome, over-expression of the ZWF1 gene directed by a strong promoter, the like or combinations thereof.
  • the ZWF1 gene may be native to S. cerevisiae, or it may be obtained from a heterologous source.
  • 6-phosphogluconolactonase (EC 3.1.1.31) catalyzes the second step of the ED (e.g., pentose phosphate pathway), and is encoded by S. cerevisiae genes SOL3 and SOL4.
  • Amplification of 6-phosphogluconolactonase activity in yeast may be desirable to increase the proportion of 6-phospho-D-glucono-1,5-lactone converted to 6-phospho-D-gluconate and thereby improve fermentation of sugar to ethanol via the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
  • over expression of 6-phosphogluconolactonase activity in yeast may be desirable to increase the proportion of 6-phospho-D-glucono-1,5-lactone converted to 6-phospho-D-gluconate and thereby improve fermentation of sugar to ethanol via the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
  • SOL3 is known to increase the rate of carbon source utilization to result in faster growth on xylose than wild type.
  • the Saccharomyces cerevisiae SOL protein family includes Sol3p and Sol4p. Both localize predominantly in the cytosol, exhibit 6-phosphogluconolactonase activity and function in the pentose phosphate pathway. 6-phosphogluconolactonase (EC 3.1.1.31) activity in yeast may be amplified by over-expression of the SOL3 and/or SOL4 gene(s) by any suitable method.
  • Non-limiting examples of methods to amplify or over express SOL3 and SOL4 include increasing the number of SOL3 and/or SOL4 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of SOL3 and/or SOL4 gene(s) into the yeast genome, over-expression of the SOL3 and/or SOL4 gene(s) directed by a strong promoter, the like or combinations thereof.
  • the SOL3 and/or SOL4 gene(s) may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • Sol3p and Sol4p have similarity to each other, and to Candida albicans Sol1p, Schizosaccharomyces pombe Sol1p, human PGLS which is associated with 6-phosphogluconolactonase deficiency, and human H6PD which is associated with cortisone reductase deficiency.
  • Sol3p and Sol4p are also similar to the 6-phosphogluconolactonases in bacteria (Pseudomonas aeruginosa) and eukaryotes (Drosophila melanogaster, Arabidopsis thaliana, and Trypanosoma brucei), to the glucose-6-phosphate dehydrogenase enzymes from bacteria (Mycobacterium leprae) and eukaryotes (Plasmodium falciparum and rabbit liver microsomes), and have regions of similarity to proteins of the Nag family, including human GNPI and Escherichia coli NagB.
  • Phosphogluconate dehydrogenase (EC 1.1.1.44) catalyzes the second oxidative reduction of NADP+ to NADPH in the cytosolic oxidative branch of the pentose phosphate pathway, and is encoded by the S. cerevisiae genes GND1 and GND2.
  • GND1 encodes the major isoform of the enzyme accounting for up to 80% of phosphogluconate dehydrogenase activity, while GND2 encodes the minor isoform of the enzyme.
  • Phosphogluconate dehydrogenase sometimes also is referred to as phosphogluconic acid dehydrogenase, 6-phosphogluconic dehydrogenase, 6-phosphogluconic carboxylase, 6-phosphogluconate dehydrogenase (decarboxylating), and 6-phospho-D-gluconate dehydrogenase.
  • Phosphogluconate dehydrogenase belongs to the family of oxidoreductases, specifically those acting on the CH-OH group of donor with NAD+ or NADP+ as the acceptor. The reaction for the second oxidative reduction of NADP+ to NADPH in the cytosolic oxidative branch of the pentose phosphate pathway is;
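For reference, the standard textbook stoichiometry of the reaction catalyzed by 6-phosphogluconate dehydrogenase (decarboxylating), which is consistent with the conversion of 6-phospho-D-gluconate to D-ribulose 5-phosphate described in this section, can be written as follows. This form is supplied from general biochemistry as an assumption and is not quoted from the source.

```latex
\[
\text{6-phospho-D-gluconate} + \mathrm{NADP^{+}}
\longrightarrow
\text{D-ribulose 5-phosphate} + \mathrm{CO_{2}} + \mathrm{NADPH}
\]
```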
  • Decreasing the level of 6-phosphogluconate dehydrogenase activity in yeast may be desirable to decrease the proportion of 6-phospho-D-gluconate converted to D-ribulose 5-phosphate, thereby increasing the levels of the intermediate gluconate-6-phosphate available for conversion to 6-dehydro-3-deoxy-gluconate-6-phosphate, in some embodiments involving engineered microorganisms including increased EDA and EDD activities, thereby improving fermentation of sugar to ethanol via the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
  • Decreasing or disrupting 6-phosphogluconate dehydrogenase activity in yeast may be achieved by any suitable method, or as described herein.
  • methods suitable for decreasing the activity of 6-phosphogluconate dehydrogenase include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disrupting both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof.
  • a gene used to knockout one activity can also introduce or increase another activity.
  • GND1 and/or GND2 gene(s) may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • S. cerevisiae GND1 and GND2 have similarity to each other, and to the phosphogluconate dehydrogenase nucleotide sequences of Candida parapsilosis, Cryptococcus neoformans and humans.
  • Trehalose (e.g., also known as mycose or tremalose), is a natural alpha-linked disaccharide formed by an alpha, alpha-1,1-glucoside bond between two alpha-glucose units.
  • Trehalose biosynthesis is a two-step process in which glucose 6-phosphate and UDP-glucose are converted by trehalose-6-phosphate synthase, encoded by TPS1, into alpha, alpha-trehalose 6-phosphate, which is then converted with water into trehalose and phosphate by trehalose-6-phosphate phosphatase, encoded by TPS2.
  • the main function of trehalose is as a carbohydrate storage moiety.
  • Trehalose-6-phosphate synthase (e.g., TPS1; EC 2.4.1.15; also known as alpha, alpha-trehalose-phosphate synthase (UDP-forming)) catalyzes the chemical reaction UDP-glucose + D-glucose 6-phosphate → UDP + alpha, alpha-trehalose 6-phosphate, and is part of the alpha, alpha-trehalose-phosphate synthase complex (UDP-forming).
  • the two substrates of this enzyme activity are UDP-glucose and D-glucose 6-phosphate, whereas its two products are UDP and alpha, alpha-trehalose 6-phosphate.
  • decreasing the level of trehalose-6-phosphate synthase activity in yeast may be desirable to decrease the proportion of glucose converted into the storage carbohydrate trehalose, thereby increasing the levels of glucose ultimately available for conversion to ethanol, in some embodiments involving engineered microorganisms including increased EDA and EDD activities, thereby improving fermentation of sugar to ethanol via the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
  • Disruption of the TPS1 gene has been shown to eliminate TPS activity and measurable trehalose.
  • Decreasing or disrupting trehalose-6-phosphate synthase activity in yeast may be achieved by any suitable method, or as described herein.
  • Non-limiting examples of methods suitable for decreasing the activity of trehalose-6-phosphate synthase include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disruption of both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof.
  • a gene used to knockout one activity can also introduce or increase another activity.
  • An example of a polynucleotide subsequence and/or an amino acid sequence coding a trehalose-6-phosphate synthase activity is provided herein.
  • Trehalose-6-phosphate phosphatase (e.g., TPS2; EC 3.1.3.12; also known as alpha,alpha-trehalose-6-phosphate phosphohydrolase, trehalose 6-phosphatase and trehalose-6-phosphate phosphohydrolase) catalyzes the chemical reaction alpha,alpha-trehalose 6-phosphate + H2O <=> alpha,alpha-trehalose + phosphate, and is part of the alpha,alpha-trehalose-phosphate synthase complex (UDP-forming).
  • The two substrates of this enzyme activity are alpha,alpha-trehalose 6-phosphate and H2O, whereas its two products are alpha,alpha-trehalose and phosphate. Removal of the phosphate allows another enzyme activity, trehalase, to hydrolyze trehalose into 2 molecules of glucose.
  • Increasing trehalose-6-phosphate phosphatase activity in yeast may be desirable to decrease the proportion of alpha,alpha-trehalose 6-phosphate, by conversion into trehalose and phosphate.
  • The trehalose can be further metabolized by trehalase into two molecules of glucose, which ultimately can be converted to ethanol via native and/or engineered pathways in engineered microorganisms described herein.
  • Trehalose-6-phosphate phosphatase (EC 3.1.3.12) activity in yeast may be amplified by over-expression of the TPS2 gene by any suitable method.
  • Non-limiting examples of methods to amplify or over express TPS2 include increasing the number of TPS2 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the TPS2 gene into the yeast genome, over-expression of the TPS2 gene directed by a strong promoter, the like or combinations thereof.
  • the TPS2 gene may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • An example of a polynucleotide subsequence and/or an amino acid sequence coding a trehalose-6-phosphate phosphatase activity is provided herein.
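By way of illustration only, the two-step trehalose biosynthesis described above can be checked with a short stoichiometric sketch. The reaction coefficients are taken from the TPS1 and TPS2 reactions as stated in the text; the dictionary layout and the helper name `net_reaction` are hypothetical conveniences and are not part of the disclosure.

```python
from collections import Counter

# Each reaction maps species -> coefficient
# (negative = consumed, positive = produced).
# Coefficients follow the TPS1 and TPS2 reactions stated in the text.
TPS1 = {
    "UDP-glucose": -1,
    "glucose 6-phosphate": -1,
    "UDP": 1,
    "trehalose 6-phosphate": 1,
}
TPS2 = {
    "trehalose 6-phosphate": -1,
    "H2O": -1,
    "trehalose": 1,
    "phosphate": 1,
}


def net_reaction(*reactions):
    """Sum reactions and drop species whose coefficients cancel to zero."""
    total = Counter()
    for rxn in reactions:
        for species, coeff in rxn.items():
            total[species] += coeff
    return {s: c for s, c in total.items() if c != 0}


net = net_reaction(TPS1, TPS2)
for species, coeff in net.items():
    print(f"{coeff:+d} {species}")
```

The intermediate alpha,alpha-trehalose 6-phosphate cancels in the sum, which is consistent with the text's description of the phosphatase step drawing down that intermediate.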
  • Glyceraldehyde-3-phosphate dehydrogenase (e.g., TDH3; EC 1.2.1.12; also known as glyceraldehyde-3-phosphate dehydrogenase (phosphorylating), GAPDH, NAD-dependent glyceraldehyde-3-phosphate dehydrogenase and triosephosphate dehydrogenase) catalyzes the NAD-dependent oxidation and phosphorylation of glyceraldehyde 3-phosphate to 1,3-bisphosphoglycerate.
  • Increasing glyceraldehyde-3-phosphate dehydrogenase activity in yeast may be desirable to increase carbon flux through gluconeogenesis and glycolysis, such that glycerol and glycerol derivatives are converted to glucose and further metabolized into ethanol, via native and/or engineered pathways in engineered microorganisms described herein.
  • Glyceraldehyde-3-phosphate dehydrogenase (EC 1.2.1.12) activity in yeast may be amplified by over-expression of the TDH3 gene by any suitable method.
  • Non-limiting examples of methods to amplify or over express TDH3 include increasing the number of TDH3 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the TDH3 gene into the yeast genome, over-expression of the TDH3 gene directed by a strong promoter, the like or combinations thereof.
  • the TDH3 gene may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • An example of a polynucleotide subsequence and/or an amino acid sequence coding a glyceraldehyde-3-phosphate dehydrogenase activity is provided herein.
  • Glutamate synthase (e.g., GLT1; EC 1.4.1.14; also known as glutamate synthase (NADH), L-glutamate synthase, L-glutamate synthetase, NADH-dependent glutamate synthase, NADH-glutamate synthase and NADH:GOGAT) catalyzes the chemical reaction 2 L-glutamate + NAD(+) <=> L-glutamine + 2-oxoglutarate + NADH + H(+), participates in glutamate metabolism and nitrogen metabolism, and employs one cofactor, FMN.
  • Yeast cells contain 3 pathways for the synthesis of glutamate. Two pathways are mediated by two isoforms of glutamate dehydrogenase, encoded by GDH1 and GDH3. The third pathway involves the combined activities of glutamine synthetase (GLN1) and glutamate synthase (GLT1). Gln1p catalyzes amination of glutamate to form glutamine.
  • Glt1p then transfers the amide group of glutamine to 2-oxoglutarate, generating two molecules of glutamate.
  • Glutamate synthase, also referred to as GOGAT, is a trimer of three Glt1p subunits.
  • Expression of the GLT1 gene is modulated by glutamate-mediated repression and by Gln3p/Gcn4p-mediated activation, depending upon the availability of nitrogen and glutamate in the medium. In amino acid starvation conditions, GLT1 expression is activated to a moderate degree by Gcn4p.
  • Increasing glutamate synthase activity in yeast may be desirable to increase carbon flux through gluconeogenesis and glycolysis, such that glutamate and glutamate derivatives are ultimately converted to ethanol, via native and/or engineered pathways in engineered microorganisms described herein.
  • Glutamate synthase (EC 1.4.1.14) activity in yeast may be amplified by over-expression of the GLT1 gene by any suitable method.
  • Non-limiting examples of methods to amplify or over express GLT1 include increasing the number of GLT1 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the GLT1 gene into the yeast genome, over-expression of the GLT1 gene directed by a strong promoter, the like or combinations thereof.
  • the GLT1 gene may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • An example of a polynucleotide subsequence and/or an amino acid sequence coding a glutamate synthase activity is provided herein.
  • Alcohol dehydrogenase 1 (e.g., ADH1; EC 1.1.1.1) catalyzes the reduction of acetaldehyde to ethanol.
  • Adh1p, Adh3p, Adh4p, and Adh5p reduce acetaldehyde to ethanol, whereas Adh2p catalyzes the reverse reaction of oxidizing ethanol to acetaldehyde.
  • The cytosolic ADH1 gene product is the major enzyme responsible for converting acetaldehyde to ethanol, and functions as a tetramer of four identical subunits with each subunit containing two zinc ions.
  • Alcohol dehydrogenase (EC 1.1.1.1) activity in yeast may be amplified by over-expression of the ADH1 gene by any suitable method.
  • Non-limiting examples of methods to amplify or over express ADH1 include increasing the number of ADH1 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the ADH1 gene into the yeast genome, over-expression of the ADH1 gene directed by a strong promoter, the like or combinations thereof.
  • the ADH1 gene may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • An example of a polynucleotide subsequence and/or an amino acid sequence coding an alcohol dehydrogenase activity is provided herein.
  • Pyruvate decarboxylase (e.g., PDC1; EC 4.1.1.1; also known as pyruvic decarboxylase, alpha-ketoacid carboxylase, alpha-carboxylase and 2-oxo-acid carboxy-lyase) catalyzes the decarboxylation of pyruvate to acetaldehyde and carbon dioxide.
  • Pyruvate decarboxylase activity commits the end product of glycolysis, pyruvate, to ethanol production rather than to its other possible metabolic fates, such as the TCA cycle/aerobic respiration (pyruvate converted to acetyl-CoA by the action of pyruvate dehydrogenase).
  • Pyruvate decarboxylase activity also can decarboxylate other 2-oxo acids such as indolepyruvate and 2-keto-3-methyl-valerate.
  • the ability to decarboxylate other 2-oxo acids contributes to the catabolism of the amino acids isoleucine, phenylalanine, tryptophan, and valine, thereby providing additional opportunity to maximize carbon flux in the direction of ethanol production.
  • Pyruvate decarboxylase is conserved among yeast, bacteria and plants. The active enzyme is a homotetramer and requires thiamin diphosphate and magnesium cofactors.
  • Increasing pyruvate decarboxylase activity in yeast may be desirable to increase the carbon flux through the last step of fermentation, the reduction of acetaldehyde to ethanol, by increasing the conversion of pyruvic acid into acetaldehyde, via native and/or engineered pathways in engineered microorganisms described herein.
  • Pyruvate decarboxylase (EC 4.1.1.1) activity in yeast may be amplified by over-expression of the PDC1 gene by any suitable method.
  • Non-limiting examples of methods to amplify or over express PDC1 include increasing the number of PDC1 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the PDC1 gene into the yeast genome, over-expression of the PDC1 gene directed by a strong promoter, the like or combinations thereof.
  • the PDC1 gene may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • An example of a polynucleotide subsequence and/or an amino acid sequence coding a pyruvate decarboxylase activity is provided herein.
  • Pyruvate kinase (e.g., PYK1, CDC19; EC 2.7.1.40) catalyzes the conversion of phosphoenolpyruvate (PEP) and ADP to pyruvate and ATP.
  • PYK1 appears to be tightly regulated and activated by fructose-1,6-bisphosphate (FBP).
  • Increasing pyruvate kinase (EC 2.7.1.40) activity in yeast may be desirable to increase the carbon flux through the last step of fermentation, the reduction of acetaldehyde to ethanol, by increasing the conversion of phosphoenolpyruvate to pyruvate, which can be further metabolized into acetaldehyde and ultimately ethanol, via native and/or engineered pathways in engineered microorganisms described herein.
  • Pyruvate kinase (EC 2.7.1.40) activity in yeast may be amplified by over-expression of the PYK1 gene by any suitable method.
  • Non-limiting examples of methods to amplify or over express PYK1 include increasing the number of PYK1 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the PYK1 gene into the yeast genome, over-expression of the PYK1 gene directed by a strong promoter, the like or combinations thereof.
  • the PYK1 gene may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • An example of a polynucleotide subsequence and/or an amino acid sequence coding a pyruvate kinase activity is provided herein.
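As a non-authoritative sketch, the terminal fermentation steps described above (pyruvate kinase, pyruvate decarboxylase, alcohol dehydrogenase) can be laid out as a simple substrate-to-product map and traced. The gene-to-step assignments follow the text; the `STEPS` table and the `trace` helper are hypothetical names introduced only for illustration, and cofactors (ADP/ATP, NADH/NAD+, CO2) are deliberately omitted.

```python
# Hypothetical mini-map of the steps described in the text:
# substrate -> (gene whose product catalyzes the step, product).
STEPS = {
    "phosphoenolpyruvate": ("PYK1", "pyruvate"),
    "pyruvate": ("PDC1", "acetaldehyde"),
    "acetaldehyde": ("ADH1", "ethanol"),
}


def trace(start, steps=STEPS):
    """Follow the map from `start` and return the list of (gene, product) steps."""
    route = []
    current = start
    seen = {start}
    while current in steps:
        gene, product = steps[current]
        route.append((gene, product))
        if product in seen:  # stop if the map ever loops back on itself
            break
        seen.add(product)
        current = product
    return route


for gene, product in trace("phosphoenolpyruvate"):
    print(f"{gene}: -> {product}")
```

Tracing from a later intermediate (for example `trace("pyruvate")`) lists only the downstream activities, which is one way to see which amplified genes act on a given metabolite pool.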
  • PHO13 also has been shown to play a role in efficient xylose utilization. It has been demonstrated that cells overexpressing xylulokinase frequently grow poorly and exhibit decreased fitness. Cells overexpressing xylulokinase that also have a corresponding PHO13 deletion show improved growth and fitness.
  • Decreasing the level of PHO13 activity in engineered cells may benefit the production of ethanol by (i) activation of other activities involved in ethanol production, (ii) deactivation of activities that may inhibit ethanol production, (iii) altering the transport of carbon sources into and/or out of the cell, and/or (iv) improving xylose utilization in strains engineered to metabolize xylose, with or without over expression of xylulokinase.
  • Decreasing or disrupting the synthesis of alkaline phosphatase (EC 3.1.3.1) activity in yeast may be achieved by any suitable method, or as described herein.
  • Non-limiting examples of methods suitable for decreasing the synthesis of alkaline phosphatase (e.g., PHO13) activity include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disruption of both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof.
  • a gene used to knockout one activity can also introduce or increase another activity.
  • An example of a polynucleotide subsequence and/or an amino acid sequence coding an alkaline phosphatase activity specific for p-nitrophenyl phosphate is provided herein.
  • Trehalose is hydrolyzed to form 2 molecules of glucose by the activity of trehalase.
  • Certain yeast have two trehalase activities, an acid trehalase encoded by ATH1 and a neutral trehalase encoded by NTH1 (e.g., EC 3.2.1.28).
  • A third locus, NTH2, is 77% identical to NTH1, but does not appear to encode a trehalase activity, or be involved in trehalose catabolism, since an nth2 null mutant exhibits normal levels of neutral trehalase activity.
  • NTH1 is induced by various stresses including exposure to heat, hydrogen peroxide, or cycloheximide.
  • Nth1p normally is found as a cytoplasmic homodimer in its active state (e.g., the state in which the hydrolysis of intracellular trehalose occurs).
  • Decreased expression of NTH1 has been shown to be involved in strain stability. Without being limited by theory, decreasing the level of NTH1 activity in engineered cells may benefit the overall stability (e.g., health and fitness) of the cell, thereby allowing increased production of ethanol.
  • Deletion of NTH1 also may provide unexpected benefits due to the increased temperature sensitivity of nth1 strains.
  • alterations to reduce or eliminate NTH1 activity are not made in the same genetic background as alterations to reduce or eliminate TPS1 activity.
  • alterations to reduce or eliminate NTH1 activity are not made in the same genetic background as alterations to increase TPS2 activity.
  • alterations to reduce or eliminate NTH1 activity are not made in the same genetic background as alterations to reduce or eliminate TPS1 and increase TPS2 activity.
  • Decreasing or disrupting the synthesis of neutral trehalase (EC 3.2.1.28) activity in yeast may be achieved by any suitable method, or as described herein.
  • Non-limiting examples of methods suitable for decreasing the synthesis of neutral trehalase encoded by NTH1 include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disruption of both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof.
  • a gene used to knockout one activity can also introduce or increase another activity.
  • Glycerol-3-phosphate dehydrogenase (e.g., GPD1/GPD2; EC 1.1.1.8 / EC 1.1.5.3, respectively; also known as alpha-glycerol phosphate dehydrogenase (NAD); alpha-glycerophosphate dehydrogenase (NAD); glycerol 1-phosphate dehydrogenase; glycerol phosphate dehydrogenase (NAD); glycerophosphate dehydrogenase (NAD); hydroglycerophosphate dehydrogenase; L-alpha-glycerol phosphate dehydrogenase; L-alpha-glycerophosphate dehydrogenase; L-glycerol phosphate dehydrogenase; L-glycerophosphate dehydrogenase; NAD-alpha-glycerophosphate dehydrogenase; NAD-dependent glycerol phosphate dehydrogenase; NAD-dependent glycerol-3-phosphate dehydrogenase; NAD-L-glycerol-3-phosphate dehydrogenase; NAD-linked glycerol 3-phosphate dehydrogenase; NADH-dihydroxyacetone phosphate reductase; glycerol-3-phosphate dehydrogenase (NAD), FAD-dependent glycerol-3-phosphate dehydrogenase, flavin-linked g
  • Glycerol-3-phosphate dehydrogenase activities exist in many organisms: a cytoplasmic activity encoded by GPD1, and a mitochondrial activity encoded by GPD2.
  • the cytoplasmic enzyme uses NAD as a cofactor and yields NADH, while the mitochondrial enzyme uses quinone as a cofactor and yields quinol.
  • Glycerol-3-phosphate dehydrogenase also acts on propane-1,2-diol phosphate and glycerone sulfate, but with a lower affinity.
  • Glycerol-3-phosphate dehydrogenase is a key enzyme in glycerol synthesis and has been shown to be important to growth and survival under osmotic stress.
  • decreasing glycerol-3-phosphate dehydrogenase activity in engineered cells may benefit the production of ethanol by decreasing the proportion of glycerol that enters the gluconeogenic pathway, thereby allowing the glycerol to be used directly as a non- carbohydrate carbon source for the production of ethanol.
  • Decreasing or disrupting the synthesis of glycerol-3-phosphate dehydrogenase (EC 1.1.1.8 and/or EC 1.1.5.3) activity in yeast may be achieved by any suitable method, or as described herein.
  • Non-limiting examples of methods suitable for decreasing glycerol-3-phosphate dehydrogenase activity include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disruption of both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof.
  • a gene used to knockout one activity can also introduce or increase another activity.
  • An example of a polynucleotide subsequence and/or an amino acid sequence coding a glycerol-3-phosphate dehydrogenase activity is provided herein.
Five-Carbon Sugar Metabolism and Activities
  • Five-carbon sugars are the second most predominant form of sugars in lignocellulosic waste biomass produced in wood pulp and wood milling industries.
  • xylose is the second most abundant carbohydrate in nature.
  • Non-limiting examples of five-carbon sugars include arabinose, lyxose, ribose, xylose, ribulose, and xylulose.
  • Biomass and waste biomass contain both cellulose and hemicellulose.
  • Many industrially applicable organisms can metabolize five-carbon sugars (e.g., xylose, pentose and the like), but may do so at low efficiency, or may not begin metabolizing five-carbon sugars until all six-carbon sugars have been depleted from the growth medium.
  • Many yeasts and fungi grow slowly on xylose and other five-carbon sugars. Some yeast, such as S. cerevisiae, do not naturally use xylose, or do so only if there are no other carbon sources.
  • An engineered microorganism (e.g., yeast, for example) that could grow rapidly on xylose and provide ethanol and/or other products as a result of fermentation of xylose can be useful due to the ability to use a feedstock source that is currently underutilized while also reducing the need for petrochemicals.
  • The pentose phosphate pathway (PPP), which is a biochemical route for xylose metabolism, is found in virtually all cellular organisms where it provides D-ribose for nucleic acid biosynthesis, D-erythrose 4-phosphate for the synthesis of aromatic amino acids and NADPH for anabolic reactions.
  • The PPP is thought of as having two phases.
  • The oxidative phase converts the hexose, D-glucose 6P, into the pentose, D-ribulose 5P, plus CO2 and NADPH.
  • The non-oxidative phase converts D-ribulose 5P into D-ribose 5P, D-xylulose 5P, D-sedoheptulose 7P, D-erythrose 4P, D-fructose 6P and D-glyceraldehyde 3P.
  • D-Xylose and L-arabinose enter the PPP through D-xylulose.
  • Certain organisms require two or more activities to convert xylose to a usable form that can be metabolized in the pentose phosphate pathway.
  • The activities are a reduction and an oxidation carried out by xylose reductase (XR; XYL1, GRE3) and xylitol dehydrogenase (XD; XYL2, XDH1), respectively.
  • Xylose reductase converts D-xylose to xylitol.
  • Xylitol dehydrogenase converts xylitol to D-xylulose.
  • the use of these activities sometimes can inhibit cellular function due to cofactor and metabolite imbalances.
  • The xylose reductase activity and/or xylitol dehydrogenase activity selected for inclusion in an engineered organism can be chosen from an organism whose XR and/or XD activities utilize NADPH or NADH (e.g., co-factor flexible activities), thereby reducing or eliminating inhibition of cellular function due to cofactor and metabolite imbalances.
  • Non-limiting examples of organisms whose xylose reductase and/or xylitol dehydrogenase activities utilize NADP+/NADPH and/or NAD+/NADH include C. shehatae, C. parapsilosis, P. segobiensis, P. stipitis, and Pachysolen tannophilus.
  • xylose reductase and/or xylitol dehydrogenase activities can be engineered to alter cofactor preference and/or specificity. Some organisms (e.g., certain bacteria, for example) require only one activity, xylose isomerase (xylA). Xylose isomerase converts xylose directly to xylulose. In some embodiments, additional alterations in the strain can compensate for cofactor and metabolite imbalances caused by the use of certain xylose reductase and/or xylitol dehydrogenase activities.
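The cofactor imbalance mentioned above can be illustrated with a minimal bookkeeping sketch. It assumes, for illustration only, that xylose reductase consumes one NADPH or one NADH per xylose reduced and that xylitol dehydrogenase reduces one NAD+ to NADH per xylitol oxidized; protons are omitted, and the function names `xr_xd_route` and `xi_route` are hypothetical.

```python
from collections import Counter

# Oxidized partner of each reduced cofactor that xylose reductase may use.
OXIDIZED_FORM = {"NADPH": "NADP+", "NADH": "NAD+"}


def xr_xd_route(xr_cofactor):
    """Net cofactor change per xylose for xylose -> xylitol -> xylulose.

    xr_cofactor: reduced cofactor consumed by xylose reductase
    ("NADPH" or "NADH"). Xylitol dehydrogenase is ASSUMED to use NAD+.
    """
    net = Counter()
    # Xylose reductase: xylose + NAD(P)H -> xylitol + NAD(P)+
    net[xr_cofactor] -= 1
    net[OXIDIZED_FORM[xr_cofactor]] += 1
    # Xylitol dehydrogenase: xylitol + NAD+ -> xylulose + NADH
    net["NAD+"] -= 1
    net["NADH"] += 1
    return {k: v for k, v in net.items() if v != 0}


def xi_route():
    """Xylose isomerase converts xylose directly to xylulose; no redox cofactor."""
    return {}


print("XR(NADPH) + XD:", xr_xd_route("NADPH"))
print("XR(NADH)  + XD:", xr_xd_route("NADH"))
print("XI:            ", xi_route())
```

Under these assumptions, an NADPH-dependent reductase paired with an NAD+-dependent dehydrogenase leaves a net NADPH deficit and NADH surplus per xylose, whereas an NADH-utilizing reductase or the single-step isomerase route leaves no net cofactor change.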
  • Xylulose is converted to xylulose-5-phosphate by the activity of a xylulokinase enzyme (EC
  • Xylulose kinase (e.g., XYK3, XYL3, XKS1) catalyzes the chemical reaction ATP + D-xylulose <=> ADP + D-xylulose 5-phosphate.
  • Xylulokinase sometimes also is referred to as ATP:D-xylulose 5-phosphotransferase, xylulokinase (phosphorylating), and D-xylulokinase.
  • Increasing the activity of xylose isomerase or xylose reductase and xylitol dehydrogenase may cause an increase of xylulose in an engineered microorganism. Therefore, increasing xylulokinase activity levels in embodiments involving increased levels of XI or XR and XD may be desirable to allow increased flux through the respective metabolic pathways.
  • Xylulokinase activity levels can be increased using any suitable method.
  • Non-limiting examples of methods suitable for increasing xylulokinase activity include increasing the number of xylulokinase genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of xylulokinase genes into the yeast genome, over- expression of the xylulokinase gene directed by a strong promoter, the like or combinations thereof.
  • the xylulokinase gene may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • In some embodiments, strains engineered to over express xylulokinase are also engineered to delete PHO13 activity. As noted above, strains overexpressing xylulokinase activity often display deleterious phenotypic traits that can be partially or completely alleviated by a corresponding deletion of PHO13 activity.
  • Phosphorylation of xylulose by xylulokinase allows the five-carbon sugar to be further converted by transketolase (e.g., TKL1/TKL2) to enter the EM pathway for further metabolism at either fructose-6-phosphate or glyceraldehyde-3-phosphate.
  • five-carbon sugars enter the EM pathway and are further converted for use by the ED pathway.
  • engineering a microorganism with xylose isomerase activity or co-factor flexible xylose reductase activity and xylitol dehydrogenase activity, along with increased xylulokinase activity may allow rapid growth on xylose when compared to the non-engineered microorganism, while avoiding cofactor and metabolite imbalances, in some embodiments.
  • engineering a microorganism with co-factor flexible xylose reductase activity and xylitol dehydrogenase activity may allow rapid growth on xylose when compared to the non- engineered microorganism, while avoiding cofactor and metabolite imbalances.
  • The term "co-factor flexible" as used herein with respect to xylose reductase activity and xylitol dehydrogenase activity refers to the ability to use NADP+/NADPH and/or NAD+/NADH as a cofactor for electron transport.
  • A microorganism may be engineered to include or regulate one or more activities in a five-carbon sugar metabolism pathway (e.g., pentose phosphate pathway, for example).
  • an engineered microorganism can comprise a xylose isomerase activity.
  • the xylose isomerase activity may be altered such that the activity can be increased or decreased according to a change in environmental conditions.
  • Nucleic acid sequences encoding xylose isomerase activities can be obtained from any suitable organism (Piromyces, Orpinomyces, Bacteroides thetaiotaomicron, Clostridium phytofermentans, Thermus thermophilus and Ruminococcus (e.g., R. flavefaciens, R. flavefaciens strain FD1 and R. flavefaciens strain 18P13) are non-limiting examples) and any of these activities can be used herein with the proviso that the nucleic acid sequence is naturally active in the chosen microorganism when expressed, or can be altered or modified to be active.
  • an engineered microorganism can comprise a xylose reductase activity and a xylitol dehydrogenase activity.
  • an engineered microorganism can comprise a xylulokinase activity.
  • the xylose reductase activity, xylitol dehydrogenase activity and/or xylulokinase activity may be altered such that the activity can be increased or decreased according to a change in environmental conditions.
  • Nucleic acid sequences encoding xylose reductase activity, xylitol dehydrogenase activity and/or xylulokinase activities can be obtained from any suitable organism, and any of these activities can be used herein with the proviso that the nucleic acid sequence is naturally active in the chosen microorganism when expressed, or can be altered or modified to be active.
Activities Linking 5-Carbon and 6-Carbon Sugar Metabolic Pathways
  • an engineered microorganism includes one or more altered activities that function to link 5-carbon sugar and 6-carbon sugar metabolic pathways (e.g., provide intermediates that enter and/or are metabolized by the pentose phosphate pathway, the glycolytic pathway, or the pentose phosphate and glycolytic pathways).
  • the altered linking activity is added, increased or amplified, with respect to a host or starting organism.
  • the altered activity is decreased or disrupted, with respect to a host or starting organism.
  • Activities that function to reversibly link 5-carbon sugar and 6-carbon sugar metabolic pathways include transaldolase, transketolase, the like, or combinations thereof. Transketolase and transaldolase catalyze transfer of 2-carbon and 3-carbon molecular fragments, respectively, in each case from a ketose donor to an aldose acceptor.
  • Transaldolase (EC 2.2.1.2) catalyzes the reversible transfer of a three-carbon ketol unit from sedoheptulose 7-phosphate to glyceraldehyde 3-phosphate to form erythrose 4-phosphate and fructose 6-phosphate.
  • the cofactor-less enzyme acts through a Schiff base intermediate (e.g., bound dihydroxyacetone).
  • Transaldolase is encoded by the gene TAL1 in S. cerevisiae, and is an enzyme in the non-oxidative pentose phosphate pathway that provides a link between the pentose phosphate and the glycolytic pathways. Transaldolase activity is thought to be found in substantially all organisms, and includes 5 subfamilies.
  • Three transaldolase subfamilies have demonstrated transaldolase activity, one subfamily comprises an activity of undetermined function and the remaining subfamily includes a fructose 6-phosphate aldolase activity.
  • Transaldolase deficiency is well tolerated in many microorganisms, and without being limited by any theory, is thought to be involved in oxidative stress responses and apoptosis.
  • Transaldolase sometimes also is referred to as dihydroxyacetone transferase, glycerone transferase, dihydroxyacetonetransferase, or sedoheptulose-7-phosphate:D-glyceraldehyde-3-phosphate glyceronetransferase, and catalyzes the reaction: sedoheptulose 7-phosphate + glyceraldehyde 3-phosphate <=> erythrose 4-phosphate + fructose 6-phosphate.
  • Increasing or amplifying transaldolase activity in yeast may be desirable to increase the proportion of sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate converted to fructose-6-phosphate and erythrose-4-phosphate, thereby increasing levels of fructose-6-phosphate.
  • Increased levels of fructose-6-phosphate can be further metabolized and thereby improve fermentation of sugar to ethanol via activities in the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
  • Transaldolase (EC 2.2.1.2) activity in yeast may be amplified by over-expression of the TAL1 gene by any suitable method.
  • Decreasing or disrupting transaldolase activity may be desirable to decrease the proportion of sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate converted to fructose-6-phosphate and erythrose-4-phosphate, thereby increasing levels of glyceraldehyde-3-phosphate in the engineered microorganism. Increased levels of glyceraldehyde-3-phosphate can be further metabolized and thereby improve fermentation of sugar to ethanol via activities in the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
  • Decreased or disrupted transaldolase (EC 2.2.1.2) activity in yeast may be achieved by any suitable method, or as described herein.
  • Transketolase (EC 2.2.1.1) catalyzes the reversible transfer of a two-carbon ketol unit from a ketose (e.g., xylulose 5-phosphate, fructose 6-phosphate, sedoheptulose 7-phosphate) to an aldose acceptor (e.g., ribose 5-phosphate, erythrose 4-phosphate, glyceraldehyde 3-phosphate).
  • Transketolase is encoded by the TKL1 and TKL2 genes in S. cerevisiae.
  • TKL1 encodes the major isoform of the enzyme and TKL2 encodes a minor isoform.
  • Transketolase sometimes also is referred to as glycoaldehyde transferase, glycolaldehydetransferase, sedoheptulose-7-phosphate:D-glyceraldehyde-3-phosphate glycolaldehydetransferase, or fructose 6-phosphate:D-glyceraldehyde-3-phosphate glycolaldehydetransferase.
  • Transketolase double null mutants (e.g., tkl1/tkl2) are viable but are auxotrophic for aromatic amino acids, indicating the genes are involved in the synthesis of aromatic amino acids.
  • Transketolase activity also is thought to be involved in the efficient use of fermentable carbon sources, and has been shown to catalyze a one-substrate reaction utilizing only xylulose 5-phosphate to produce glyceraldehyde 3-phosphate and erythrulose. Transketolase activity requires thiamine pyrophosphate as a cofactor.
  • Tkl1p has similarity to S. cerevisiae Tkl2p, Escherichia coli transketolase, Rhodobacter sphaeroides transketolase, Streptococcus pneumoniae recP and Hansenula polymorpha dihydroxyacetone synthase.
  • Increasing or amplifying transketolase activity in yeast may be desirable to increase the proportion of xylulose 5-phosphate converted to glyceraldehyde 3-phosphate, thereby increasing levels of glyceraldehyde 3-phosphate available for entry into a 6-carbon sugar metabolic pathway directly and/or conversion to fructose-6-phosphate.
  • Increased levels of fructose-6-phosphate and/or glyceraldehyde 3-phosphate can be further metabolized and thereby improve fermentation of sugar to ethanol via activities in the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
  • Transketolase (EC 2.2.1.1) activity in yeast may be increased or amplified by over-expression of the TKL1 and/or TKL2 gene(s) by any suitable method.
  • Methods to amplify or over-express TKL1 and TKL2 include increasing the number of TKL1 and/or TKL2 gene(s) in yeast by transformation with a high-copy number plasmid, integration of multiple copies of TKL1 and/or TKL2 gene(s) into the yeast genome, over-expression of TKL1 and/or TKL2 gene(s) directed by a strong promoter, the like or combinations thereof.
  • Non-limiting examples of methods suitable for decreasing or disrupting the activity of transketolase include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast, disruption of both copies of the gene in a diploid yeast, expression of an anti-sense nucleic acid, expression of an siRNA, over-expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof.
  • A gene used to knock out one activity can also introduce or increase another activity.
  • TKL1 and/or TKL2 gene(s) may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • Sugar metabolized as a carbon source by organisms typically is transported from outside a cell into the cell for use as an energy source and/or a raw material for synthesis of cellular products.
  • Sugar can be transported into the cell using active or passive transport mechanisms. Active transport systems frequently utilize energy to transport the sugar across the cell membrane. Sugars often are modified by phosphorylation, once transported inside the cell or organism, to prevent diffusion out of the cell. Sugar transport activities are thought also to act as sugar sensors and have high affinity and low affinity transporters. The rate of glucose utilization in yeast often is dictated by the activity and concentration of glucose transporters in the plasma membrane.
  • In yeast, sugar transporters have been found to be part of a multi-gene family. Some sugar transport systems transport certain sugars preferentially and other non-preferred sugars at a lower rate. Certain sugar transport systems transport one or more structurally similar sugars at substantially similar rates.
  • Non-limiting examples of sugar transporters include high affinity glucose transporters (e.g., HXT (e.g., HXT1, HXT7)), glucose-xylose transporters (e.g., GXF1, GXS1), and high affinity galactose transporters (e.g., GAL2), the like and combinations thereof.
  • Galactose permease is a high affinity galactose transport enzyme activity that also can transport glucose.
  • Gal2p is an integral plasma membrane protein belonging to a super family of sugar transporters that are predicted to contain 12 transmembrane domains separated by charged residues. Structurally and functionally similar sugar transporters have been identified in bacteria, rat, and humans.
  • Glucose often is transported by high affinity glucose transporters.
  • High affinity glucose transporters sometimes also are referred to as hexose transporters.
  • Certain sugar transport systems include high and low affinity transport activities that act on more than one sugar.
  • A non-limiting example of such a sugar transport system includes the glucose/xylose transport system from Candida yeast.
  • Glucose and xylose are transported into certain Candida by a high affinity xylose-proton symporter (e.g., GXS1) and a low affinity diffusion facilitator (e.g., GXF1).
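The practical difference between a high affinity and a low affinity transporter can be sketched with a saturable (Michaelis-Menten style) uptake model. This is an illustrative simplification, not a description of the actual GXS1 or GXF1 mechanisms; the function name and every `vmax` and `km` value below are hypothetical placeholders rather than measured parameters.

```python
def uptake_rate(sugar_conc, vmax, km):
    """Saturable uptake rate: vmax * S / (km + S).

    A low km means the transporter reaches half of vmax at a low
    sugar concentration, i.e. it has high affinity.
    """
    return vmax * sugar_conc / (km + sugar_conc)


# Hypothetical parameter sets (arbitrary units), chosen only to contrast
# a high affinity / low capacity system with a low affinity / high capacity one.
HIGH_AFFINITY = {"vmax": 1.0, "km": 0.5}
LOW_AFFINITY = {"vmax": 5.0, "km": 50.0}

if __name__ == "__main__":
    for conc in (0.1, 1.0, 10.0, 100.0, 1000.0):
        high = uptake_rate(conc, **HIGH_AFFINITY)
        low = uptake_rate(conc, **LOW_AFFINITY)
        print(f"S={conc:>7}: high affinity={high:.3f}  low affinity={low:.3f}")
```

Under these assumed values the high affinity system dominates at low sugar concentration, while the low affinity system carries more flux once sugar is abundant.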
  • S. cerevisiae normally lacks an efficient transport system for xylose, although xylose can enter the cell at low efficiency via non-specific transport systems sometimes involving HXT activities. Addition of the Candida GXS1, GXF1, or GXS1 and GXF1 activities to S. cerevisiae engineered to metabolize xylose can further enhance the ability to ferment xylose to alcohol or other desired products.
  • An engineered microorganism includes one or more sugar transport activities that have been genetically added or altered.
  • The sugar transport activity is amplified or increased.
  • Sugar transport activities can be added, amplified by over-expression or increased by any suitable method.
  • Non-limiting methods of adding, amplifying or increasing the activity of sugar transport systems include increasing the number of sugar transport activity (e.g., GAL2, GXF1, GXS1, HXT7) gene(s) in yeast by transformation with a high-copy number plasmid, integration of multiple copies of sugar transport activity (e.g., GAL2, GXF1, GXS1, HXT7) gene(s) into the yeast genome, over-expression of sugar transport activity gene(s) directed by a strong promoter, the like or combinations thereof.
  • The sugar transport activity (e.g., GAL2, GXF1, GXS1, HXT7) gene(s) may be native to S. cerevisiae, or may be obtained from a heterologous source.
  • Plasma membrane channels may play a role in the efflux and/or intake of metabolites that can induce or repress expression of various activities.
  • One such plasma membrane channel is encoded by the FPS1 gene.
  • The FPS1-encoded plasma membrane channel has been shown to be involved in the efflux of glycerol from the cell and the uptake of acetic acid and the trivalent metalloids arsenite and antimonite into the cell.
  • Loss of FPS1 is important for the acquisition of resistance to acetic acid, as it eliminates the channel for passive diffusion of this acid into cells.
  • Reducing or eliminating the efflux of the reduced amounts of glycerol in gpd1, gpd2, or gpd1 and gpd2 engineered strains by co-engineering reduced or eliminated expression of FPS1 may aid overall cell health and growth in engineered strains.
  • Decreasing the number of plasma membrane channels encoded by FPS1 may benefit the production of ethanol by inhibiting the loss of glycerol through those channels, thereby improving the overall health of engineered strains and allowing increased levels of ethanol production. In some embodiments involving engineered microorganisms that include increased EDA and EDD activities, this improves fermentation of sugar to ethanol via the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
  • Decreasing or disrupting the synthesis of plasma membrane channels encoded by FPS1 in yeast may be achieved by any suitable method, or as described herein.
  • Methods suitable for decreasing the synthesis of plasma membrane channels encoded by FPS1 include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disruption of both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over-expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof.
  • A gene used to knock out one activity can also introduce or increase another activity.
  • An example of a polynucleotide subsequence and/or an amino acid sequence coding a FPS1 plasma membrane channel is provided herein.
  • Microorganisms grown in fermentors often are grown under anaerobic conditions, with limited or no gas exchange. Therefore the atmosphere inside fermentors sometimes is carbon dioxide rich. Unlike photosynthetic organisms, many microorganisms suitable for use in industrial fermentation processes do not incorporate atmospheric carbon (e.g., CO2) to any significant degree, or at all. Thus, to ensure that increasing levels of carbon dioxide do not inhibit cell growth and the fermentation process, methods to remove carbon dioxide from the interior of fermentors can be useful.
  • Photosynthetic organisms make use of atmospheric carbon by incorporating the carbon available in carbon dioxide into organic carbon compounds by a process known as carbon fixation.
  • The activities responsible for a photosynthetic organism's ability to fix carbon dioxide include phosphoenolpyruvate carboxylase (e.g., PEP carboxylase) or ribulose 1,5-bisphosphate carboxylase (e.g., Rubisco).
  • PEP carboxylase catalyzes the addition of carbon dioxide to phosphoenolpyruvate to generate the four-carbon compound oxaloacetate.
  • Oxaloacetate can be used in other cellular processes or be further converted to yield several industrially useful products (e.g., malate, succinate, citrate and the like).
  • Rubisco catalyzes the addition of carbon dioxide to ribulose-1,5-bisphosphate to generate 2 molecules of 3-phosphoglycerate. 3-phosphoglycerate can be further converted to ethanol via cellular fermentation or used to produce other commercially useful products.
  • Nucleic acid sequences encoding PEP carboxylase and Rubisco activities can be obtained from any suitable organism (e.g., plants, bacteria, and other microorganisms) and any of these activities can be used herein with the proviso that the nucleic acid sequence is either naturally active in the chosen microorganism when expressed, or can be altered or modified to be active.
Examples of Altered Activities
  • Engineered microorganisms can include modifications to one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20 or all) of the following activities: phosphofructokinase activity (PFK1 A subunit, PFK2 B subunit), phosphogluconate dehydratase activity (EDD), 2-keto-3-deoxygluconate-6-phosphate aldolase activity (EDA), xylose isomerase activity (xylA), xylose reductase activity (XYL1), xylitol dehydrogenase activity (XYL2), xylulokinase activity (XKS1, XYL3), phosphoenolpyruvate carboxylase activity (PEP carboxylase), alcohol dehydrogenase 2 activity (ADH2), thymidylate synthase activity, phosphoglucose isomerase activity (PGI1), transaldolase activity (TAL1
  • TSH3 glyceraldehyde-3-phosphate dehydrogenase
  • one or more activities in one or more metabolic pathways can be engineered to increase carbon flux through the engineered pathways to produce a desired product (e.g., ethanol).
  • the engineered activities can be selected for allowing increased production of metabolic intermediates that can be utilized in one or more other engineered pathways to achieve increased production of a desired product with respect to the unmodified host organism.
  • This "carbon flux management" can be optimized for any chosen feedstock, by engineering appropriate activities in appropriate pathways.
  • phosphofructokinase activity refers to conversion of fructose-6-phosphate to fructose-1,6-bisphosphate. Phosphofructokinase activity may be provided by an enzyme that includes one or two subunits (referred to hereafter as "subunit A" and/or "subunit B").
  • The term "inactivating the Embden-Meyerhoff pathway" refers to reducing or eliminating the activity of one or more activities in the Embden-Meyerhoff pathway, including but not limited to phosphofructokinase activity.
  • the phosphofructokinase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like, for example).
  • the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • the phosphofructokinase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below.
  • the genetic modification renders the activity responsive to changes in the environment.
  • the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein.
  • the genetic modification deletes the nucleic acid encoding the activity.
  • the genetic modification replaces the endogenous promoter and/or coding sequence with a heterologous promoter and/or coding sequence having lower relative expression and/or lower relative specific activity. Nucleic acid sequences that can be used to reduce or eliminate phosphofructokinase activity can have sequences partially or substantially complementary to sequences described herein. Presence, absence or amount of phosphofructokinase activity can be detected by any suitable method known in the art, including requiring a five-carbon sugar carbon source or a functional Entner-Doudoroff pathway for growth. Inactivation of the Embden-Meyerhoff pathway is described in further detail below.
  • substantially complementary refers to nucleotide sequences that will hybridize with each other.
  • The stringency of the hybridization conditions can be altered to tolerate varying amounts of sequence mismatch. Included are regions of counterpart, target and capture nucleotide sequences 55% or more, 56% or more, 57% or more, 58% or more, 59% or more, 60% or more, 61% or more, 62% or more, 63% or more, 64% or more, 65% or more, 66% or more, 67% or more, 68% or more, 69% or more, 70% or more, 71% or more, 72% or more, 73% or more, 74% or more, 75% or more, 76% or more, 77% or more, 78% or more, 79% or more, 80% or more, 81% or more, 82% or more, 83% or more, 84% or more, 85% or more, 86% or more, 87% or more, 88% or more, 89% or more, 90% or
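One way to put a number on "partially or substantially complementary" is to count base-pairing positions between two strands. The following is a minimal sketch under simplifying assumptions: equal-length DNA strands, antiparallel alignment, no gaps, and Watson-Crick pairs only. The function names and the example sequences are hypothetical and are not sequences from this disclosure.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(probe, target):
    """Percent of positions at which probe base-pairs with target.

    Both strands are read 5' to 3'; the probe is compared with the
    reverse complement of the target, position by position.
    """
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    perfect_partner = reverse_complement(target)
    matches = sum(p == q for p, q in zip(probe.upper(), perfect_partner))
    return 100.0 * matches / len(probe)


if __name__ == "__main__":
    target = "ATGGCCTTAA"              # hypothetical 10-nt target
    print(percent_complementarity("TTAAGGCCAT", target))  # fully complementary
    print(percent_complementarity("TTAAGGCCAA", target))  # one mismatch
```

A value computed this way can then be compared against whichever percentage threshold is chosen from the ranges listed above.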
  • phosphogluconate dehydratase activity refers to conversion of 6-phosphogluconate to 2-keto-3-deoxy-6-p-gluconate.
  • the phosphogluconate dehydratase activity can be provided by a polypeptide.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring phosphogluconate dehydratase activity can be obtained from a number of sources, including Zymomonas mobilis and Escherichia coli.
  • 2-keto-3-deoxygluconate-6-phosphate aldolase activity refers to conversion of 2-keto-3-deoxy-6-p-gluconate to pyruvate.
  • the 2-keto-3-deoxygluconate-6-phosphate aldolase activity can be provided by a polypeptide.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • Nucleic acid sequences conferring 2-keto-3-deoxygluconate-6-phosphate aldolase activity can be obtained from a number of sources, including Zymomonas mobilis and Escherichia coli. Examples of an amino acid sequence of a polypeptide having 2-keto-3-deoxygluconate-6-phosphate aldolase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of 2-keto-3-deoxygluconate-6-phosphate aldolase activity can be detected by any suitable method known in the art, including western blot analysis.
  • xylose isomerase activity refers to conversion of xylose to xylulose.
  • the xylose isomerase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • Nucleic acid sequences conferring xylose isomerase activity can be obtained from a number of sources, including Piromyces, Orpinomyces, Bacteroides (e.g., B. thetaiotaomicron, B. uniformis, B. stercoris), Clostridiales (e.g., Clostridiales BVAB3), Clostridium (e.g., C. phytofermentans, C. thermohydrosulfuricum, C. cellulyticum), Thermus thermophilus, Escherichia coli, Streptomyces (e.g., S. rubiginosus, S. murinus), Bacillus stearothermophilus, Lactobacillus pentosus, Thermotoga (e.g., T. maritima, T.
  • Ruminococcus (e.g., Ruminococcus environmental samples, Ruminococcus albus, Ruminococcus bromii, Ruminococcus callidus, Ruminococcus flavefaciens, Ruminococcus gauvreauii, Ruminococcus gnavus, Ruminococcus lactaris, Ruminococcus obeum, Ruminococcus sp., Ruminococcus sp. 14531, Ruminococcus sp. 15975, Ruminococcus sp. 16442, Ruminococcus sp. 18P13, Ruminococcus sp. 25F6,
  • Ruminococcus sp. C041, Ruminococcus sp. C047, Ruminococcus sp. C07, Ruminococcus sp. CS1, Ruminococcus sp. CS6, Ruminococcus sp. DJF_VR52, Ruminococcus sp. DJF_VR66, Ruminococcus sp. DJF_VR67, Ruminococcus sp. DJF_VR70k1, Ruminococcus sp. DJF_VR87, Ruminococcus sp. Eg2, Ruminococcus sp. Egf, Ruminococcus sp. END-1, Ruminococcus sp.
  • Ruminococcus sp. Pei041, Ruminococcus sp. SC101, Ruminococcus sp. SC103, Ruminococcus sp. Siijpesteijn 1948, Ruminococcus sp. WAL 17306, Ruminococcus sp. YE281, Ruminococcus sp. YE58, Ruminococcus sp. YE71, Ruminococcus sp. ZS2-15, Ruminococcus torques).
  • phosphoenolpyruvate carboxylase activity refers to the addition of carbon dioxide to phosphoenolpyruvate to generate the four-carbon compound oxaloacetate.
  • the phosphoenolpyruvate carboxylase activity can be provided by a polypeptide.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • Nucleic acid sequences conferring phosphoenolpyruvate carboxylase activity can be obtained from a number of sources, including Zymomonas mobilis. Examples of an amino acid sequence of a polypeptide having phosphoenolpyruvate carboxylase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables.
  • Presence, absence or amount of xylose isomerase activity can be detected by any suitable method known in the art.
  • The term "alcohol dehydrogenase 2 activity" as used herein refers to conversion of ethanol to acetaldehyde, which is the reverse of the forward action catalyzed by alcohol dehydrogenase 1.
  • The term "inactivation of the conversion of ethanol to acetaldehyde" refers to a reduction or elimination in the activity of alcohol dehydrogenase 2. Reducing or eliminating alcohol dehydrogenase 2 activity can lead to an increase in ethanol production.
  • the alcohol dehydrogenase 2 activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like, for example).
  • the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • the alcohol dehydrogenase 2 activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below.
  • the genetic modification renders the activity responsive to changes in the environment.
  • the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein.
  • Nucleic acid sequences that can be used to reduce or eliminate the activity of alcohol dehydrogenase 2 can have sequences partially or substantially complementary to nucleic acid sequences that encode alcohol dehydrogenase 2 activity. Presence, absence or amount of alcohol dehydrogenase 2 activity can be detected by any suitable method known in the art, including inability to grow in media with ethanol as the sole carbon source.
  • thymidylate synthase activity refers to a reductive methylation, where deoxyuridine monophosphate (dUMP) and N5,N10-methylene tetrahydrofolate are together used to generate thymidine monophosphate (dTMP), yielding dihydrofolate as a secondary product.
  • The term "temporarily inactivate thymidylate synthase activity" refers to a temporary reduction or elimination in the activity of thymidylate synthase when the modified organism is shifted to a non-permissive temperature. The activity can return to normal upon return to a permissive temperature.
  • Temporary inactivation of thymidylate synthase uncouples cell growth from cell division while under the non-permissive temperature. This inactivation in turn allows the cells to continue fermentation without producing biomass and dividing, thus increasing the yield of product produced during fermentation.
  • the thymidylate synthase activity can be temporarily reduced or eliminated by genetic modification, as described below.
  • the genetic modification renders the activity responsive to changes in the environment.
  • Nucleic acid sequences conferring temperature sensitive thymidylate synthase activity can be obtained from S. cerevisiae strain 172066 (accession number 208583).
  • The cdc21 allele in S. cerevisiae strain 172066 carries the point mutation G139S, numbered relative to the initiating methionine. Examples of nucleotide sequences used to PCR amplify the polynucleotide encoding the temperature sensitive polypeptide are presented below in tables.
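The substitution notation used above (wild-type residue, position counted from the initiating methionine, replacement residue) can be applied to a protein sequence programmatically. The following is a minimal sketch; the helper name `apply_substitution` is hypothetical, and the four-residue sequence in the example is a toy string for illustration only, not the thymidylate synthase sequence.

```python
import re


def apply_substitution(protein, mutation):
    """Apply a substitution such as 'G139S' to a one-letter protein sequence.

    Positions are 1-based, with the initiating methionine as residue 1.
    Raises ValueError if the notation is malformed, the position is out of
    range, or the stated wild-type residue does not match the sequence.
    """
    parsed = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if parsed is None:
        raise ValueError(f"unrecognized mutation notation: {mutation!r}")
    wild_type, position, replacement = (
        parsed.group(1), int(parsed.group(2)), parsed.group(3))
    if position < 1 or position > len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    index = position - 1  # convert to 0-based indexing
    if protein[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[index]}")
    return protein[:index] + replacement + protein[index + 1:]


if __name__ == "__main__":
    toy_protein = "MAGK"  # hypothetical toy sequence, glycine at position 3
    print(apply_substitution(toy_protein, "G3S"))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when a sequence record omits or includes the initiating methionine.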
  • Presence, absence or amount of thymidylate synthase activity can be detected by any suitable method known in the art, including growth arrest at the non-permissive temperature.
  • Thymidylate synthase is one of many polypeptides that regulate the cell cycle.
  • the cell cycle may be inhibited in engineered microorganisms under certain conditions (e.g., temperature shift, dissolved oxygen shift), which can result in inhibited or reduced cell proliferation, inhibited or reduced cell division, and sometimes cell cycle arrest (collectively "cell cycle inhibition").
  • a microorganism may display cell cycle inhibition after a certain time after the microorganism is exposed to the triggering conditions (e.g., there may be a time delay after a microorganism is exposed to a certain set of conditions before the microorganism displays cell cycle inhibition).
  • Cell proliferation rates may be reduced by about 50% or greater, for example (e.g., reduced by about 52%, 54%, 56%, 58%, 60%, 62%, 64%, 66%, 68%, 70%, 72%, 74%, 76%, 78%, 80%, 82%, 84%, 86%, 88%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, or greater than any one of the foregoing).
  • The rate of cell division may be reduced by about 50% or greater, for example (e.g., the number of cells undergoing division is reduced by about 52%, 54%, 56%, 58%, 60%, 62%, 64%, 66%, 68%, 70%, 72%, 74%, 76%, 78%, 80%, 82%, 84%, 86%, 88%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, or greater than any one of the foregoing).
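The percentage reductions listed above are relative to an uninhibited baseline. The following is a minimal sketch of that arithmetic; the function names, the 50% default threshold argument, and the example counts are hypothetical illustrations rather than measured values.

```python
def percent_reduction(baseline, observed):
    """Percent reduction of an observed rate relative to a baseline rate."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - observed) / baseline


def meets_reduction_threshold(baseline, observed, threshold_percent=50.0):
    """True if the rate fell by at least threshold_percent of the baseline."""
    return percent_reduction(baseline, observed) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical counts of dividing cells per field, before and after a
    # shift to the triggering condition.
    print(percent_reduction(200, 50))
    print(meets_reduction_threshold(200, 50))
    print(meets_reduction_threshold(200, 120))
```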
  • Cells may be arrested at any stage of the cell cycle (e.g., resting G0 phase, interphase (e.g., G1, S, G2 phases), mitosis (e.g., prophase, prometaphase, metaphase, anaphase, telophase)) and different percentages of cells in a population can be arrested at different stages of the cell cycle.
  • phosphoglucose isomerase activity refers to the conversion of glucose-6-phosphate to fructose-6-phosphate.
  • The term "inactivation of the conversion of glucose-6-phosphate to fructose-6-phosphate" refers to a reduction or elimination in the activity of phosphoglucose isomerase. Reducing or eliminating phosphoglucose isomerase activity can lead to an increase in ethanol production.
  • the phosphoglucose isomerase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like, for example).
  • the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • the phosphoglucose isomerase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below.
  • the genetic modification renders the activity responsive to changes in the environment.
  • the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein.
  • Nucleic acid sequences that can be used to reduce or eliminate the activity of phosphoglucose isomerase can have sequences partially or substantially complementary to nucleic acid sequences that encode phosphoglucose isomerase activity. Presence, absence or amount of phosphoglucose isomerase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
  • glucose-6-phosphate dehydrogenase activity refers to conversion of glucose-6-phosphate to gluconolactone-6-phosphate coupled with the generation of NADPH.
  • the glucose-6-phosphate dehydrogenase activity can be provided by a polypeptide.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring glucose-6-phosphate dehydrogenase activity can be obtained from a number of sources, including, but not limited to, S. cerevisiae. Examples of a nucleotide sequence of a polynucleotide that encodes the polypeptide are presented below in tables. Presence, absence or amount of glucose-6-phosphate dehydrogenase activity can be detected by any suitable method known in the art, including western blot analysis.
  • 6-phosphogluconolactonase activity refers to conversion of gluconolactone-6-phosphate to gluconate-6-phosphate.
  • the 6-phosphogluconolactonase activity can be provided by a polypeptide.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • Nucleic acid sequences conferring 6-phosphogluconolactonase activity can be obtained from a number of sources, including, but not limited to, S. cerevisiae. Examples of an amino acid sequence of a polypeptide having 6-phosphogluconolactonase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of 6-phosphogluconolactonase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
  • 6-phosphogluconate dehydrogenase (decarboxylating) activity refers to the conversion of gluconate-6-phosphate to ribulose-5-phosphate.
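The activity definitions in this section can be read as edges in a small reaction graph, which makes it easy to see which engineered activities connect glucose-6-phosphate to pyruvate. The sketch below is illustrative only. It treats "6-phosphogluconate" and "gluconate-6-phosphate" as the same metabolite, writes the 6-phosphogluconolactonase step as the standard lactone hydrolysis (an assumption here), lists only pyruvate as the aldolase product as the text does, and ignores cofactors and reversibility. The names `REACTIONS` and `find_route` are hypothetical.

```python
from collections import deque

# activity name -> (substrate, product), following the definitions in the text
REACTIONS = {
    "phosphoglucose isomerase": ("glucose-6-phosphate", "fructose-6-phosphate"),
    "phosphofructokinase": ("fructose-6-phosphate", "fructose-1,6-bisphosphate"),
    "glucose-6-phosphate dehydrogenase": (
        "glucose-6-phosphate", "gluconolactone-6-phosphate"),
    # Assumed standard reaction for this step.
    "6-phosphogluconolactonase": (
        "gluconolactone-6-phosphate", "gluconate-6-phosphate"),
    "phosphogluconate dehydratase": (
        "gluconate-6-phosphate", "2-keto-3-deoxy-6-p-gluconate"),
    "2-keto-3-deoxygluconate-6-phosphate aldolase": (
        "2-keto-3-deoxy-6-p-gluconate", "pyruvate"),
    "6-phosphogluconate dehydrogenase (decarboxylating)": (
        "gluconate-6-phosphate", "ribulose-5-phosphate"),
}


def find_route(start, goal, inactive=()):
    """Breadth-first search for the shortest chain of activities from start
    to goal, skipping any activity named in `inactive`. Returns a list of
    activity names, or None if no route exists in this simplified graph."""
    graph = {}
    for activity, (substrate, product) in REACTIONS.items():
        if activity not in inactive:
            graph.setdefault(substrate, []).append((activity, product))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        metabolite, route = queue.popleft()
        if metabolite == goal:
            return route
        for activity, product in graph.get(metabolite, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, route + [activity]))
    return None


if __name__ == "__main__":
    print(find_route("glucose-6-phosphate", "pyruvate"))
    print(find_route("glucose-6-phosphate", "pyruvate",
                     inactive={"phosphogluconate dehydratase"}))
```

In this reduced graph the only route to pyruvate runs through the dehydratase and aldolase steps, so removing either one leaves no route; a fuller model would also include the remaining Embden-Meyerhoff reactions.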
  • The term "inactivation of the conversion of gluconate-6-phosphate to ribulose-5-phosphate" refers to a reduction or elimination in the activity of 6-phosphogluconate dehydrogenase. Reducing or eliminating 6-phosphogluconate dehydrogenase (decarboxylating) activity can lead to an increase in ethanol production.
  • the 6-phosphogluconate dehydrogenase (decarboxylating) activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like, for example).
  • the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • the 6-phosphogluconate dehydrogenase (decarboxylating) activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below.
  • the genetic modification renders the activity responsive to changes in the environment.
  • the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein. Nucleic acid sequences that can be used to reduce or eliminate the activity of 6-phosphogluconate dehydrogenase can have sequences partially or substantially complementary to nucleic acid sequences that encode 6-phosphogluconate dehydrogenase (decarboxylating) activity. Presence, absence or amount of 6-phosphogluconate dehydrogenase
  • transketolase activity refers to conversion of xylulose-5-phosphate and ribose-5-phosphate to sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate.
  • the transketolase activity can be provided by a polypeptide.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring transketolase activity can be obtained from a number of sources, including, but not limited to S.
  • the transketolase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like, for example).
  • the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • the transketolase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below.
  • the genetic modification renders the activity responsive to changes in the environment.
  • the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein.
  • Nucleic acid sequences that can be used to reduce or eliminate the activity of transketolase can have sequences partially or substantially complementary to nucleic acid sequences that encode transketolase activity. Presence, absence or amount of transketolase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
  • transaldolase activity refers to conversion of sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate to erythrose 4-phosphate and fructose 6-phosphate.
  • the transaldolase activity can be provided by a polypeptide.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • Nucleic acid sequences conferring transaldolase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae, Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida. Examples of an amino acid sequence of a polypeptide having transaldolase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in the examples.
  • The term "inactivation of the conversion of sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate to erythrose 4-phosphate and fructose 6-phosphate" refers to a reduction or elimination in the activity of transaldolase. Reducing or eliminating transaldolase activity can lead to an increase in ethanol production.
  • the transaldolase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like, for example).
  • the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • the transaldolase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below.
  • the genetic modification renders the activity responsive to changes in the environment.
  • the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein.
  • Nucleic acid sequences that can be used to reduce or eliminate the activity of transaldolase can have sequences partially or substantially complementary to nucleic acid sequences that encode transaldolase activity. Presence, absence or amount of transaldolase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
  • galactose permease activity refers to the import of galactose into a cell or organism by an activity that transports galactose across cell membranes.
  • the galactose permease activity can be provided by a polypeptide.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring galactose permease activity can be obtained from a number of sources, including, but not limited to, S. cerevisiae, Candida albicans, Debaryomyces hansenii, Schizosaccharomyces pombe, Arabidopsis thaliana, and Colwellia psychrerythraea. Examples of an amino acid sequence of a polypeptide having galactose permease activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in the Examples. Presence, absence or amount of galactose permease activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
  • glucose/xylose transport activity refers to the import of glucose and/or xylose into a cell or organism by an activity that transports glucose and/or xylose across cell membranes.
  • the glucose/xylose transport activity can be provided by a polypeptide.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring glucose/xylose transport activity can be obtained from a number of sources, including, but not limited to Pichia yeast, S.
  • glucose/xylose transport activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
  • high affinity glucose transport activity and "hexose transport activity” as used herein refer to the import of glucose and other hexose sugars into a cell or organism by an activity that transports glucose and other hexose sugars across cell membranes.
  • the high affinity glucose transport activity or hexose transport activity can be provided by a polypeptide.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring high affinity glucose transport activity or hexose transport activity can be obtained from a number of sources, including, but not limited to S.
  • Presence, absence or amount of glucose/xylose transport activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
  • xylose reductase activity refers to the conversion of xylose to xylitol.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring xylose reductase activity can be obtained from a number of sources. Presence, absence or amount of xylose reductase activity can be detected by any suitable method known in the art, including activity assays, nucleic acid based analysis and western blot analysis.
  • xylitol dehydrogenase activity refers to the conversion of xylitol to xylulose.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring xylitol dehydrogenase activity can be obtained from a number of sources. Presence, absence or amount of xylitol dehydrogenase activity can be detected by any suitable method known in the art, including activity assays, nucleic acid based analysis and western blot analysis.
  • xylulokinase activity refers to the conversion of xylulose to xylulose-5-phosphate.
  • the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring xylulokinase activity can be obtained from a number of sources. Presence, absence or amount of xylulokinase activity can be detected by any suitable method known in the art, including activity assays, nucleic acid based analysis and western blot analysis.
  • trehalose-6-phosphate synthase activity refers to the conversion of 2 molecules of phosphorylated glucose (e.g., UDP-glucose and D-glucose-6-phosphate) into a molecule of trehalose.
  • decrease the proportion of glucose converted into the storage carbohydrate, trehalose refers to a reduction or elimination of trehalose-6-phosphate synthase (e.g., TPS1) activity. Reducing or eliminating the activity of trehalose-6-phosphate synthase can lead to an increase in ethanol production.
  • the trehalose-6-phosphate synthase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like, for example).
  • the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • the trehalose-6-phosphate synthase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below.
  • the genetic modification renders the activity responsive to changes in the environment.
  • the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein.
  • Nucleic acid sequences that can be used to reduce or eliminate the activity of trehalose-6-phosphate synthase can have sequences partially or substantially complementary to nucleic acid sequences that encode trehalose-6-phosphate synthase activity. Presence, absence or amount of trehalose-6-phosphate synthase activity can be detected by any suitable method known in the art, including the method described by Hottinger et al. (1987, J. Bacteriology 169: 5518-5522).
  • trehalose-6-phosphate phosphatase activity refers to the hydrolysis of alpha, alpha-trehalose-6-phosphate to remove the phosphate, a prerequisite for further metabolism by a trehalase activity.
  • the trehalose-6-phosphate phosphatase activity can be provided by a polypeptide.
  • the polypeptide is encoded by an endogenous nucleotide sequence.
  • the polynucleotide is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • Nucleic acid sequences conferring trehalose-6-phosphate phosphatase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae. Examples of an amino acid sequence of a polypeptide having trehalose-6-phosphate phosphatase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of trehalose-6-phosphate phosphatase activity can be detected by any suitable method known in the art, including nucleic acid based analysis, western blot analysis, and the method described by De Virgilio et al. (Eur. J. Biochem 212: 315-323).
  • FPS1 encoded plasma membrane channel refers to a polypeptide encoded by the FPS1 gene that functions to allow efflux of glycerol from the cell and influx of acetic acid into the cell.
  • decreasing the efflux of glycerol from the cell and "decreasing the number of plasma membrane channels encoded by FPS1" refer to a reduction in number or complete elimination of the plasma membrane channels encoded by FPS1 and present in the plasma membranes of engineered organisms described herein, including the fps1 phenotype.
  • the plasma membrane channels encoded by FPS1 can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like, for example).
  • the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • the plasma membrane channels encoded by FPS1 can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders expression of the gene encoding the membrane channels responsive to changes in the environment.
  • the genetic modification disrupts synthesis of a functional nucleic acid encoding the membrane channels or produces a nonfunctional polypeptide or protein.
  • Nucleic acid sequences that can be used to reduce or eliminate the plasma membrane channels encoded by FPS1 can have sequences partially or substantially complementary to nucleic acid sequences that encode FPS1.
  • FPS1 deleted cells show increased resistance to acetic acid when compared to cells native for FPS1 plasma membrane channels. Presence or absence of the plasma membrane channels encoded by FPS1 can be detected by any suitable method known in the art, including acetic acid resistance, and ability to grow on acetic acid.
  • glyceraldehyde-3-phosphate dehydrogenase activity refers to conversion of glyceraldehyde-3-phosphate to 1,3-bisphosphoglycerate.
  • the glyceraldehyde-3-phosphate dehydrogenase activity can be provided by a polypeptide.
  • the polypeptide is encoded by an endogenous nucleotide sequence.
  • the polynucleotide is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • Nucleic acid sequences conferring glyceraldehyde-3-phosphate dehydrogenase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae.
  • Examples of an amino acid sequence of a polypeptide having glyceraldehyde-3-phosphate dehydrogenase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of glyceraldehyde-3-phosphate dehydrogenase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
  • the term "glutamate synthase activity" as used herein refers to the reversible conversion of 2 molecules of L-glutamate into L-glutamine and 2-oxoglutarate. The glutamate synthase activity can be provided by a polypeptide.
  • the polypeptide is encoded by an endogenous nucleotide sequence.
  • the polynucleotide is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • Nucleic acid sequences conferring glutamate synthase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae. Examples of an amino acid sequence of a polypeptide having glutamate synthase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of glutamate synthase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
  • alcohol dehydrogenase 1 activity refers to the conversion of acetaldehyde to ethanol.
  • the alcohol dehydrogenase 1 activity can be provided by a polypeptide.
  • the polypeptide is encoded by an endogenous nucleotide sequence.
  • the polynucleotide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring alcohol dehydrogenase 1 activity can be obtained from a number of sources, including, but not limited to S. cerevisiae.
  • pyruvate decarboxylase activity refers to the decarboxylation of a 2-oxo acid into an aldehyde, where the 2-oxo acid generally is pyruvic acid and the aldehyde generally is acetaldehyde.
  • the pyruvate decarboxylase activity can be provided by a polypeptide.
  • the polypeptide is encoded by an endogenous nucleotide sequence.
  • the polynucleotide is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • Nucleic acid sequences conferring pyruvate decarboxylase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae. Examples of an amino acid sequence of a polypeptide having pyruvate decarboxylase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of pyruvate decarboxylase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
  • pyruvate kinase activity refers to the dephosphorylation of
  • the pyruvate kinase activity can be provided by a polypeptide.
  • the polypeptide is encoded by an endogenous nucleotide sequence.
  • the polynucleotide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring pyruvate kinase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae.
  • Examples of an amino acid sequence of a polypeptide having pyruvate kinase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of pyruvate kinase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
  • the terms "alkaline phosphatase activity" and "alkaline phosphatase activity specific for p-nitrophenyl phosphate" as used herein refer to the hydrolysis of a phosphate monoester into an alcohol and inorganic phosphate.
  • the alkaline phosphatase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like, for example).
  • the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • the alkaline phosphatase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below.
  • the genetic modification renders the activity responsive to changes in the environment.
  • the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein.
  • Nucleic acid sequences that can be used to reduce or eliminate the activity of alkaline phosphatase can have sequences partially or substantially complementary to nucleic acid sequences that encode alkaline phosphatase activity.
  • Presence, absence or amount of alkaline phosphatase activity can be detected by any suitable method known in the art, including alkaline phosphatase activity assays in which PHO13 activity is assayed using para-nitrophenylphosphate (pNPP), a chromogenic substrate for acid and alkaline phosphatases.
  • neutral trehalase activity refers to the hydrolysis of alpha,alpha-6-trehalose into 2 molecules of glucose.
  • decreasing the level of NTH1 activity refers to a reduction or elimination of neutral trehalase activity. Reducing or eliminating the activity of neutral trehalase can lead to an increase in ethanol production.
  • the neutral trehalase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like, for example).
  • the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
  • the neutral trehalase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below.
  • the genetic modification renders the activity responsive to changes in the environment.
  • the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein.
  • Nucleic acid sequences that can be used to reduce or eliminate the activity of neutral trehalase can have sequences partially or substantially complementary to nucleic acid sequences that encode a neutral trehalase activity. The rapid degradation of trehalose in intact cells when recovering from heat stress is not observed in mutant cells carrying a disrupted or a deleted gene of neutral trehalase (nth1Δ).
  • Presence, absence or amount of neutral trehalase activity can be detected by any suitable method known in the art, including the inability to grow on trehalose as the sole carbon source, or by the neutral enzyme overlay test described by Kopp et al. (JBC, 1993, 268: 4766-4774).
  • the term "glycerol-3-phosphate dehydrogenase activity" as used herein refers to the conversion of dihydroxyacetone phosphate into sn-glycerol-3-phosphate.
  • the term "decreasing glycerol-3-phosphate dehydrogenase activity" refers to a reduction or elimination of glycerol-3-phosphate dehydrogenase activity. Reducing or eliminating the activity of glycerol-3-phosphate dehydrogenase can lead to an increase in ethanol production.
  • the glycerol-3-phosphate dehydrogenase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like).
  • Nucleic acid sequences that can be used to reduce or eliminate the activity of glycerol-3-phosphate dehydrogenase can have sequences partially or substantially complementary to nucleic acid sequences that encode a glycerol-3-phosphate dehydrogenase activity. Mutants in GPD1 activity often produce less glycerol and show greater sensitivity to osmotic stress than strains native for GPD1. Mutants lacking GPD2 activity often show poor growth under anaerobic conditions.
  • Mutants lacking GPD1 and GPD2 activity generally do not produce detectable levels of glycerol and are highly osmosensitive. Presence, absence or amount of glycerol-3-phosphate dehydrogenase activity can be detected by any suitable method known in the art, including the inability to grow under anoxic conditions.
  • Activities described herein can be modified to generate microorganisms engineered to allow a method of independently regulating or controlling (e.g., ability to independently turn on or off, or increase or decrease, for example) six-carbon sugar metabolism, five-carbon sugar metabolism, atmospheric carbon metabolism (e.g., carbon dioxide fixation) or combinations thereof.
  • regulated control of a desired activity can be the result of a genetic modification.
  • the genetic modification can be modification of a promoter sequence.
  • the modification can increase or decrease an activity encoded by a gene operably linked to the promoter element.
  • the modification to the promoter element can add or remove a regulatory sequence.
  • the regulatory sequence can respond to a change in environmental or culture conditions.
  • Non-limiting examples of culture conditions that could be used to regulate an activity in this manner include temperature, light, oxygen, salt, metals and the like. Additional methods for altering an activity by modification of a promoter element are given below.
  • the genetic modification can be to an ORF.
  • the modification of the ORF can increase or decrease expression of the ORF.
  • modification of the ORF can alter the efficiency of translation of the ORF.
  • modification of the ORF can alter the activity of the polypeptide or protein encoded by the ORF. Additional methods for altering an activity by modification of an ORF are given below.
  • the genetic modification can be to an activity associated with cell division (e.g., cell division cycle or CDC activity, for example).
  • the cell division cycle activity can be thymidylate synthase activity.
  • regulated control of cell division can be the result of a genetic modification.
  • the genetic modification can be to a nucleic acid sequence that encodes thymidylate synthase.
  • the genetic modification can temporarily inactivate thymidylate synthase activity by rendering the activity temperature sensitive (e.g., heat resistant, heat sensitive, cold resistant, cold sensitive and the like).
  • the genetic modification can modify a promoter sequence operably linked to a gene encoding an activity involved in control of cell division. In some embodiments the modification can increase or decrease an activity encoded by a gene operably linked to the promoter element.
  • the modification to the promoter element can add or remove a regulatory sequence.
  • the regulatory sequence can respond to a change in environmental or culture conditions.
  • Non-limiting examples of culture conditions that could be used to regulate an activity in this manner include temperature, light, oxygen, salt, metals and the like.
  • an engineered microorganism comprising one or more activities described above or below can be used to produce ethanol by inhibiting cell growth and cell division by use of a temperature sensitive cell division control activity while allowing cellular fermentation to proceed, thereby producing a significant increase in ethanol yield when compared to the native organism.
  • a nucleic acid (e.g., also referred to herein as nucleic acid reagent, target nucleic acid, target nucleotide sequence, nucleic acid sequence of interest or nucleic acid region of interest) can be from any source or composition, such as DNA, cDNA, gDNA (genomic DNA), RNA, siRNA (short inhibitory RNA), RNAi, tRNA or mRNA, for example, and can be in any form (e.g., linear, circular, supercoiled, single-stranded, double-stranded, and the like).
  • a nucleic acid can also comprise DNA or RNA analogs (e.g., containing base analogs, sugar analogs and/or a non-native backbone and the like). It is understood that the term "nucleic acid" does not refer to or imply a specific length of the polynucleotide chain, thus polynucleotides and oligonucleotides are also included in the definition.
  • Deoxyribonucleotides include deoxyadenosine, deoxycytidine, deoxyguanosine and deoxythymidine.
  • the uracil base is uridine.
  • a nucleic acid sometimes is a plasmid, phage, autonomously replicating sequence (ARS), centromere, artificial chromosome, yeast artificial chromosome (e.g., YAC) or other nucleic acid able to replicate or be replicated in a host cell.
  • a nucleic acid can be from a library or can be obtained from enzymatically digested, sheared or sonicated genomic DNA (e.g., fragmented) from an organism of interest.
  • nucleic acid subjected to fragmentation or cleavage may have a nominal, average or mean length of about 5 to about 10,000 base pairs, about 100 to about 1,000 base pairs, about 100 to about 500 base pairs, or about 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000 or 10000 base pairs.
  • Fragments can be generated by any suitable method in the art, and the average, mean or nominal length of nucleic acid fragments can be controlled by selecting an appropriate fragment-generating procedure by the person of ordinary skill.
  • the fragmented DNA can be size selected to obtain nucleic acid fragments of a particular size range.
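The size-selection step described above can be sketched in Python as a simple filter on fragment lengths. This is a minimal illustration only: the fragment lengths and the 100 to 500 base pair window are hypothetical values chosen to fall within the ranges recited above, and the function name is an assumption.

```python
from statistics import mean


def size_select(fragment_lengths, min_bp, max_bp):
    """Keep fragment lengths (in base pairs) within [min_bp, max_bp]."""
    return [n for n in fragment_lengths if min_bp <= n <= max_bp]


lengths = [42, 180, 350, 475, 900, 2500]   # hypothetical fragment lengths
selected = size_select(lengths, 100, 500)  # retain about 100 to about 500 bp
mean_selected = mean(selected)             # mean length of retained fragments
```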
  • Nucleic acid can be fragmented by various methods known to the person of ordinary skill, which include without limitation, physical, chemical and enzymic processes. Examples of such processes are described in U.S. Patent Application Publication No. 20050112590 (published on May 26, 2005, entitled "Fragmentation-based methods and systems for sequence variation detection and discovery," naming Van Den Boom et al.). Certain processes can be selected by the person of ordinary skill to generate non-specifically cleaved fragments or specifically cleaved fragments.
  • Examples of processes that can generate non-specifically cleaved fragment sample nucleic acid include, without limitation, contacting sample nucleic acid with apparatus that expose nucleic acid to shearing force (e.g., passing nucleic acid through a syringe needle; use of a French press); exposing sample nucleic acid to irradiation (e.g., gamma, x-ray, UV irradiation; fragment sizes can be controlled by irradiation intensity); boiling nucleic acid in water (e.g., yields about 500 base pair fragments) and exposing nucleic acid to an acid and base hydrolysis process.
  • Nucleic acid may be specifically cleaved by contacting the nucleic acid with one or more specific cleavage agents.
  • specific cleavage agent refers to an agent, sometimes a chemical or an enzyme that can cleave a nucleic acid at one or more specific sites. Specific cleavage agents often will cleave specifically according to a particular nucleotide sequence at a particular site. Examples of enzymic specific cleavage agents include without limitation
  • endonucleases (e.g., DNase (e.g., DNase I, II); RNase (e.g., RNase E, F, H, P); Cleavase™ enzyme; Taq DNA polymerase; E. coli DNA polymerase I and eukaryotic structure-specific endonucleases; murine FEN-1 endonucleases; type I, II or III restriction endonucleases such as Acc I, Afl III, Alu I, Alw44 I, Apa I, Asn I, Ava I, Ava II, BamH I, Ban II, Bcl I, Bgl I);
  • exonucleases (e.g., exonuclease III); ribozymes; and DNAzymes.
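By way of illustration, sequence-specific cleavage by a restriction endonuclease can be modelled in Python as cutting a string at each occurrence of a recognition site. The sketch below uses BamH I, which recognizes GGATCC and cuts after the first G; the input sequence and the function name are hypothetical, only the top strand is modelled, and overhangs are ignored.

```python
def digest(dna, site, cut_offset):
    """Cut `dna` at every occurrence of `site`.

    `cut_offset` is the number of bases of the recognition site that stay
    on the upstream fragment (top strand only; overhangs are not modelled).
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


# BamH I: recognition site GGATCC, cut after the first G (G^GATCC).
fragments = digest("AAAGGATCCTTTTGGATCCAA", "GGATCC", 1)
```

Two recognition sites in the hypothetical input yield three fragments, and joining the fragments restores the input sequence.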
  • Sample nucleic acid may be treated with a chemical agent, or synthesized using modified nucleotides, and the modified nucleic acid may be cleaved.
  • Examples of chemical cleavage processes include without limitation alkylation (e.g., alkylation of phosphorothioate-modified nucleic acid); cleavage of acid lability of P3'-N5'-phosphoroamidate-containing nucleic acid; and osmium tetroxide and piperidine treatment of nucleic acid.
  • nucleic acids of interest may be treated with one or more specific cleavage agents (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more specific cleavage agents) in one or more reaction vessels (e.g., nucleic acid of interest is treated with each specific cleavage agent in a separate vessel).
  • a nucleic acid suitable for use in the embodiments described herein sometimes is amplified by any amplification process known in the art (e.g., PCR, RT-PCR and the like). Nucleic acid amplification may be particularly beneficial when using organisms that are typically difficult to culture (e.g., slow growing, require specialized culture conditions and the like).
  • the terms "amplify”, “amplification”, “amplification reaction”, or “amplifying” as used herein refer to any in vitro processes for multiplying the copies of a target sequence of nucleic acid. Amplification sometimes refers to an "exponential" increase in target nucleic acid.
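The "exponential" increase referred to above can be illustrated with the idealized relationship N = N0 × (1 + E)^n, where N0 is the starting copy number, n is the number of cycles and E is the per-cycle efficiency (E = 1 corresponds to perfect doubling). The Python sketch below is an assumption-laden illustration of that textbook relationship, not a description of any particular amplification protocol; the function and parameter names are hypothetical.

```python
def copies_after(n_cycles, initial_copies=1.0, efficiency=1.0):
    """Idealized exponential amplification: N = N0 * (1 + E) ** n."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** n_cycles


ten_cycles = copies_after(10)        # one template, perfect doubling
three_cycles = copies_after(3, 5.0)  # five templates, perfect doubling
```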
  • a nucleic acid reagent sometimes is stably integrated into the chromosome of the host organism, or a nucleic acid reagent can be a deletion of a portion of the host chromosome, in certain embodiments (e.g., genetically modified organisms, where alteration of the host genome confers the ability to selectively or preferentially maintain the desired organism carrying the genetic modification).
  • nucleic acid reagents e.g., nucleic acids or genetically modified organisms whose altered genome confers a selectable trait to the organism
  • nucleic acid reagent can be altered such that codons encode for (i) the same amino acid, using a different tRNA than that specified in the native sequence, or (ii) a different amino acid than is normal, including unconventional or unnatural amino acids (including detectably labeled amino acids).
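Alternative (i) above, in which codons are altered to encode the same amino acid, can be sketched as a synonymous recoding of an open reading frame. The following Python fragment is illustrative only: the codon table is a small excerpt of the standard genetic code, while the "preferred" codon choices and the example ORF are hypothetical.

```python
# Excerpt of the standard genetic code: only codons used in this example.
CODON_TO_AA = {
    "ATG": "M", "CTT": "L", "CTG": "L", "TTG": "L",
    "GAA": "E", "GAG": "E", "AAA": "K", "AAG": "K", "TAA": "*",
}
# Hypothetical preferred codon for each amino acid (an assumption).
PREFERRED = {"M": "ATG", "L": "TTG", "E": "GAA", "K": "AAG", "*": "TAA"}


def codons(orf):
    """Split an in-frame sequence into consecutive three-base codons."""
    return [orf[i:i + 3] for i in range(0, len(orf), 3)]


def translate(orf):
    """Translate an ORF into one-letter amino acids ('*' marks a stop)."""
    return "".join(CODON_TO_AA[c] for c in codons(orf))


def recode(orf):
    """Replace each codon with the preferred synonymous codon."""
    return "".join(PREFERRED[CODON_TO_AA[c]] for c in codons(orf))


native = "ATGCTTGAGAAATAA"   # hypothetical native ORF
recoded = recode(native)     # same protein, different codons
```

The recoded sequence differs from the native sequence at the nucleotide level while encoding the same amino acid sequence.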
  • native sequence refers to an unmodified nucleotide sequence as found in its natural setting (e.g., a nucleotide sequence as found in an organism).
  • a nucleic acid or nucleic acid reagent can comprise certain elements often selected according to the intended use of the nucleic acid. Any of the following elements can be included in or excluded from a nucleic acid reagent.
  • a nucleic acid reagent may include one or more or all of the following nucleotide elements: one or more promoter elements, one or more 5' untranslated regions (5'UTRs), one or more regions into which a target nucleotide sequence may be inserted (an "insertion element"), one or more target nucleotide sequences, one or more 3' untranslated regions (3'UTRs), and one or more selection elements.
  • a nucleic acid reagent comprises the following elements in the 5' to 3' direction: (1) promoter element, 5'UTR, and insertion element(s); (2) promoter element, 5'UTR, and target nucleotide sequence; (3) promoter element, 5'UTR, insertion element(s) and 3'UTR; and (4) promoter element, 5'UTR, target nucleotide sequence and 3'UTR.
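The four 5' to 3' arrangements listed above can be represented as ordered tuples of element names and assembled by concatenation. The Python sketch below is a minimal illustration; the element sequences are short hypothetical placeholders, not sequences disclosed herein, and the function name is an assumption.

```python
# The four 5'->3' arrangements recited above, keyed by their number.
ARRANGEMENTS = {
    1: ("promoter", "5'UTR", "insertion element"),
    2: ("promoter", "5'UTR", "target nucleotide sequence"),
    3: ("promoter", "5'UTR", "insertion element", "3'UTR"),
    4: ("promoter", "5'UTR", "target nucleotide sequence", "3'UTR"),
}

# Hypothetical placeholder sequences for each element.
PARTS = {
    "promoter": "TATAAA",
    "5'UTR": "GCCACC",
    "insertion element": "GAATTC",
    "target nucleotide sequence": "ATGGCTTAA",
    "3'UTR": "AATAAA",
}


def assemble(arrangement, parts=PARTS):
    """Concatenate element sequences in the 5'->3' order of an arrangement."""
    return "".join(parts[name] for name in ARRANGEMENTS[arrangement])


expression_construct = assemble(4)  # promoter, 5'UTR, target, 3'UTR
cloning_construct = assemble(1)     # promoter, 5'UTR, insertion element
```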
  • a promoter element typically is required for DNA synthesis and/or RNA synthesis.
  • a promoter element often comprises a region of DNA that can facilitate the transcription of a particular gene, by providing a start site for the synthesis of RNA corresponding to a gene. Promoters generally are located near the genes they regulate, are located upstream of the gene (e.g., 5' of the gene), and are on the same strand of DNA as the sense strand of the gene, in some embodiments.
  • a promoter often interacts with a RNA polymerase.
  • a polymerase is an enzyme that catalyses synthesis of nucleic acids using a preexisting nucleic acid reagent.
  • the template is a DNA template
  • an RNA molecule is transcribed before protein is synthesized.
  • Enzymes having polymerase activity suitable for use in the present methods include any polymerase that is active in the chosen system with the chosen template to synthesize protein.
  • a promoter e.g., a heterologous promoter
  • a promoter element can be operably linked to a nucleotide sequence or an open reading frame (ORF).
  • transcription from the promoter element can catalyze the synthesis of an RNA corresponding to the nucleotide sequence or ORF sequence operably linked to the promoter, which in turn leads to synthesis of a desired peptide, polypeptide or protein.
  • operably linked refers to a nucleic acid sequence (e.g., a coding sequence) present on the same nucleic acid molecule as a promoter element and whose expression is under the control of said promoter element.
  • Promoter elements sometimes exhibit responsiveness to regulatory control.
  • Promoter elements also sometimes can be regulated by a selective agent. That is, transcription from promoter elements sometimes can be turned on, turned off, up-regulated or down-regulated, in response to a change in environmental, nutritional or internal conditions or signals (e.g., heat inducible promoters, light regulated promoters, feedback regulated promoters, hormone influenced promoters, tissue specific promoters, oxygen and pH influenced promoters, promoters that are responsive to selective agents (e.g., kanamycin) and the like, for example).
  • Promoters influenced by environmental, nutritional or internal signals frequently are influenced by a signal (direct or indirect) that binds at or near the promoter and increases or decreases expression of the target sequence under certain conditions.
  • Non-limiting examples of selective or regulatory agents that can influence transcription from a promoter element used in embodiments described herein include, without limitation, (1) nucleic acid segments that encode products that provide resistance against otherwise toxic compounds (e.g., antibiotics); (2) nucleic acid segments that encode products that are otherwise lacking in the recipient cell (e.g., essential products, tRNA genes, auxotrophic markers); (3) nucleic acid segments that encode products that suppress the activity of a gene product; (4) nucleic acid segments that encode products that can be readily identified (e.g., phenotypic markers such as antibiotics (e.g., β-lactamase), β-galactosidase, green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP), cyan fluorescent protein (CFP), and cell surface proteins); (5) nucleic acid segments that bind products that are otherwise detrimental to cell survival and/or function; (6) nucleic acid segments that otherwise inhibit the activity of any of the nucleic acid segments described in Nos.
  • (7) nucleic acid segments that bind products that modify a substrate (e.g., restriction endonucleases);
  • (8) nucleic acid segments that can be used to isolate or identify a desired molecule (e.g., specific protein binding sites);
  • (9) nucleic acid segments that encode a specific nucleotide sequence that can be otherwise non-functional (e.g., for PCR amplification of subpopulations of molecules);
  • (10) nucleic acid segments that, when absent, directly or indirectly confer resistance or sensitivity to particular compounds; (11) nucleic acid segments that encode products that either are toxic or convert a relatively non-toxic compound to a toxic compound (e.g., Herpes simplex thymidine kinase, cytosine deaminase) in recipient cells; (12) nucleic acid segments that inhibit replication, partition or heritability of nucleic acid molecules that contain them; and/or (13) nucleic acid segments
  • regulation of a promoter element can be used to alter (e.g., increase, add, decrease or substantially eliminate) the activity of a peptide, polypeptide or protein (e.g., enzyme activity for example).
  • a microorganism can be engineered by genetic modification to express a nucleic acid reagent that can add a novel activity (e.g., an activity not normally found in the host organism) or increase the expression of an existing activity by increasing transcription from a homologous or heterologous promoter operably linked to a nucleotide sequence of interest (e.g., homologous or heterologous nucleotide sequence of interest), in certain embodiments.
  • a microorganism can be engineered by genetic modification to express a nucleic acid reagent that can decrease expression of an activity by decreasing or substantially eliminating transcription from a homologous or heterologous promoter operably linked to a nucleotide sequence of interest, in certain embodiments.
  • the activity can be altered using recombinant DNA and genetic techniques known to the artisan. Methods for engineering microorganisms are further described herein. Tables herein provide non-limiting lists of yeast promoters that are up-regulated by oxygen, yeast promoters that are down-regulated by oxygen, yeast transcriptional repressors and their associated genes, and DNA binding motifs as determined using the MEME sequence analysis software. Potential regulator binding motifs can be identified using the program MEME to search intergenic regions bound by regulators for overrepresented sequences. For each regulator, the sequences of intergenic regions bound with p-values less than 0.001 were extracted to use as input for motif discovery.
  • the MEME software was run using the following settings: a motif width ranging from 6 to 18 bases, the "zoops" distribution model, a 6th order Markov background model and a discovery limit of 20 motifs.
  • the discovered sequence motifs were scored for significance by two criteria: an E-value calculated by MEME and a specificity score. The motif with the best score using each metric is shown for each regulator. All motifs presented are derived from datasets generated in rich growth conditions with the exception of a previously published dataset for epitope-tagged Gal4 grown in galactose.
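  • The input-selection step described above (keeping only intergenic regions bound with p-values less than 0.001) can be sketched as follows. This is a minimal illustration only: the region identifiers, p-values, sequences and function names are hypothetical placeholders, and the resulting FASTA text would still have to be passed to MEME with the settings stated in the text.

```python
from typing import Dict, List


def select_bound_regions(binding_pvalues: Dict[str, float],
                         threshold: float = 0.001) -> List[str]:
    """Return IDs of intergenic regions whose binding p-value is below the threshold."""
    return sorted(rid for rid, p in binding_pvalues.items() if p < threshold)


def to_fasta(region_ids: List[str], sequences: Dict[str, str]) -> str:
    """Format the selected regions as FASTA text for a motif discovery tool."""
    return "".join(f">{rid}\n{sequences[rid]}\n" for rid in region_ids)


# Hypothetical example data for a single regulator (not taken from any dataset).
pvalues = {"region_A": 0.0002, "region_B": 0.0500, "region_C": 0.0009}
seqs = {"region_A": "ACGTTGCA", "region_B": "TTTTAAAA", "region_C": "CGGAGGAC"}

selected = select_bound_regions(pvalues)
fasta_text = to_fasta(selected, seqs)
```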
  • the altered activity can be found by screening the organism under conditions that select for the desired change in activity.
  • certain microorganisms can be adapted to increase or decrease an activity by selecting or screening the organism in question on media containing substances that are poorly metabolized or even toxic.
  • An increase in the ability of an organism to grow on a substance that is normally poorly metabolized would result in an increase in the growth rate on that substance, for example.
  • a decrease in the sensitivity to a toxic substance might be manifested by growth on higher concentrations of the toxic substance, for example.
  • Genetic modifications that are identified in this manner sometimes are referred to as naturally occurring mutations or the organisms that carry them can sometimes be referred to as naturally occurring mutants. Modifications obtained in this manner are not limited to alterations in promoter sequences.
  • screening microorganisms by selective pressure can yield genetic alterations that can occur in non-promoter sequences, and sometimes also can occur in sequences that are not in the nucleotide sequence of interest, but in related nucleotide sequences (e.g., a gene involved in a different step of the same pathway, a transport gene, and the like).
  • Naturally occurring mutants sometimes can be found by isolating naturally occurring variants from unique environments, in some embodiments.
  • a nucleic acid reagent may include a polynucleotide sequence 70% or more identical to the foregoing (or to the complementary sequences). That is, a nucleotide sequence that is 70% or more, 71% or more, 72% or more, 73% or more, 74% or more, 75% or more, 76% or more, 77% or more, 78% or more, 79% or more, 80% or more, 81% or more, 82% or more, 83% or more, 84% or more, 85% or more, 86% or more, 87% or more, 88% or more, 89% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more identical to a nucleotide sequence described herein can be utilized.
  • nucleotide sequences having substantially the same nucleotide sequence when compared to each other.
  • One test for determining whether two nucleotide sequences or amino acid sequences are substantially identical is to determine the percent of identical nucleotide sequences or amino acid sequences shared.
  • Calculations of sequence identity can be performed as follows. Sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes).
  • the length of a reference sequence aligned for comparison purposes is sometimes 30% or more, 40% or more, 50% or more, often 60% or more, and more often 70% or more, 80% or more, 90% or more, or 100% of the length of the reference sequence.
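  • the nucleotides or amino acids at corresponding nucleotide or polypeptide positions, respectively, are then compared between the two sequences.
  • when a position in the first sequence is occupied by the same nucleotide or amino acid as the corresponding position in the second sequence, the nucleotides or amino acids are deemed to be identical at that position.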
  • the percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, introduced for optimal alignment of the two sequences.
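  • The position-by-position comparison described above can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned, that gaps are written as "-", and that the denominator is the full alignment length (one of several conventions for accounting for gaps); the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Gap columns count toward the alignment length but never count as identical,
    so every introduced gap position lowers the reported identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Three identical columns out of four alignment columns (one gap).
example = percent_identity("ACGT", "A-GT")
```

  • A sequence pair could then be tested against a chosen threshold (e.g., 70% or more) by comparing the returned value to that threshold.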
  • Comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. Percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of Meyers & Miller, CABIOS 4: 11-17 (1989), which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. Also, percent identity between two amino acid sequences can be determined using the Needleman & Wunsch, J. Mol. Biol.
  • a set of parameters often used is a BLOSUM62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
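  • A global alignment of the Needleman & Wunsch type can be sketched as follows. This is a simplified illustration, not the parameter set stated above: it assumes a match/mismatch score in place of a substitution matrix and a single linear gap penalty in place of separate gap open and gap extend penalties, and the function name and default scores are hypothetical.

```python
from typing import Tuple


def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> Tuple[int, str, str]:
    """Globally align two sequences; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best_score, aln_a, aln_b = needleman_wunsch("ACGT", "AGT")
```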
  • Sequence identity can also be determined by hybridization assays conducted under stringent conditions.
  • the term "stringent conditions" as used herein refers to conditions for hybridization and washing. Stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y., 6.3.1-6.3.6 (1989). Aqueous and nonaqueous methods are described in that reference and either can be used.
  • An example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45 °C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 50 °C.
  • Another example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45 °C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 55 °C.
  • a further example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45 °C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 60 °C.
  • stringent hybridization conditions often are hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45 °C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 65 °C. More often, stringency conditions are 0.5M sodium phosphate, 7% SDS at 65 °C, followed by one or more washes at 0.2X SSC, 1% SDS at 65 °C.
  • nucleic acid reagents may also comprise one or more 5' UTRs, and one or more 3' UTRs.
  • a 5' UTR may comprise one or more elements endogenous to the nucleotide sequence from which it originates, and sometimes includes one or more exogenous elements.
  • a 5' UTR can originate from any suitable nucleic acid, such as genomic DNA, plasmid DNA, RNA or mRNA, for example, from any suitable organism (e.g., virus, bacterium, yeast, fungi, plant, insect or mammal). The artisan may select appropriate elements for the 5' UTR based upon the chosen expression system (e.g., expression in a chosen organism, or expression in a cell free system, for example).
  • a 5' UTR sometimes comprises one or more of the following elements known to the artisan:
  • enhancer sequences (e.g., transcriptional or translational)
  • transcription initiation site
  • transcription factor binding site
  • translation regulation site
  • translation factor binding site
  • accessory protein binding site (e.g., feedback regulation agent binding sites)
  • Pribnow box (e.g., TATA box), -35 element, E-box (helix-loop-helix binding element), ribosome binding site, replicon, internal ribosome entry site (IRES), silencer element and the like.
  • a promoter element may be isolated such that all 5' UTR elements necessary for proper conditional regulation are contained in the promoter element fragment, or within a functional subsequence of a promoter element fragment.
  • a 5' UTR in the nucleic acid reagent can comprise a translational enhancer nucleotide sequence.
  • a translational enhancer nucleotide sequence often is located between the promoter and the target nucleotide sequence in a nucleic acid reagent.
  • a translational enhancer sequence often binds to a ribosome, sometimes is an 18S rRNA-binding ribonucleotide sequence (i.e., a 40S ribosome binding sequence) and sometimes is an internal ribosome entry sequence (IRES).
  • An IRES generally forms an RNA scaffold with precisely placed RNA tertiary structures that contact a 40S ribosomal subunit via a number of specific intermolecular interactions.
  • ribosomal enhancer sequences are known and can be identified by the artisan (e.g., Mumblee et al., Nucleic Acids Research 33: D141-D146 (2005); Paulous et al., Nucleic Acids Research 31: 722-733 (2003); Akbergenov et al., Nucleic Acids Research 32: 239-247 (2004); Mignone et al., Genome Biology 3(3): reviews0004.1-0001.10 (2002); Gallie, Nucleic Acids Research 30: 3401-3411 (2002); Shaloiko et al., http address www.interscience.wiley.com, DOI: 10.1002/bit.20267; and Gallie et al., Nucleic Acids Research 15: 3257-3273 (1987)).
  • a translational enhancer sequence sometimes is a eukaryotic sequence, such as a Kozak consensus sequence or other sequence (e.g., hydroid polyp sequence, GenBank accession no. U07128).
  • a translational enhancer sequence sometimes is a prokaryotic sequence, such as a Shine-Dalgarno consensus sequence.
  • the translational enhancer sequence is a viral nucleotide sequence.
  • a translational enhancer sequence sometimes is from a 5' UTR of a plant virus, such as Tobacco Mosaic Virus (TMV), Alfalfa Mosaic Virus (AMV);
  • an omega sequence about 67 bases in length from TMV is included in the nucleic acid reagent as a translational enhancer sequence (e.g., devoid of guanosine nucleotides and includes a 25 nucleotide long poly (CAA) central region).
  • a 3' UTR may comprise one or more elements endogenous to the nucleotide sequence from which it originates and sometimes includes one or more exogenous elements.
  • a 3' UTR may originate from any suitable nucleic acid, such as genomic DNA, plasmid DNA, RNA or mRNA, for example, from any suitable organism (e.g., a virus, bacterium, yeast, fungi, plant, insect or mammal). The artisan can select appropriate elements for the 3' UTR based upon the chosen expression system (e.g., expression in a chosen organism, for example).
  • a 3' UTR sometimes comprises one or more of the following elements known to the artisan: transcription regulation site, transcription initiation site, transcription termination site, transcription factor binding site, translation regulation site, translation termination site, translation initiation site, translation factor binding site, ribosome binding site, replicon, enhancer element, silencer element and polyadenosine tail.
  • a 3' UTR often includes a polyadenosine tail and sometimes does not, and if a polyadenosine tail is present, one or more adenosine moieties may be added or deleted from it (e.g., about 5, about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45 or about 50 adenosine moieties may be added or subtracted).
  • modification of a 5' UTR and/or a 3' UTR can be used to alter (e.g., increase, add, decrease or substantially eliminate) the activity of a promoter.
  • Alteration of the promoter activity can in turn alter the activity of a peptide, polypeptide or protein (e.g., enzyme activity for example), by a change in transcription of the nucleotide sequence(s) of interest from an operably linked promoter element comprising the modified 5' or 3' UTR.
  • a microorganism can be engineered by genetic modification to express a nucleic acid reagent comprising a modified 5' or 3' UTR that can add a novel activity (e.g., an activity not normally found in the host organism) or increase the expression of an existing activity by increasing transcription from a homologous or heterologous promoter operably linked to a nucleotide sequence of interest (e.g., homologous or heterologous nucleotide sequence of interest), in certain embodiments.
  • a microorganism can be engineered by genetic modification to express a nucleic acid reagent comprising a modified 5' or 3' UTR that can decrease the expression of an activity by decreasing or substantially eliminating transcription from a homologous or heterologous promoter operably linked to a nucleotide sequence of interest, in certain embodiments.
  • a nucleotide reagent sometimes can comprise a target nucleotide sequence.
  • a "target nucleotide sequence” as used herein encodes a nucleic acid, peptide, polypeptide or protein of interest, and may be a ribonucleotide sequence or a deoxyribonucleotide sequence.
  • a target nucleic acid sometimes can comprise a chimeric nucleic acid (or chimeric nucleotide sequence), which can encode a chimeric protein (or chimeric amino acid sequence).
  • the term “chimeric” as used herein refers to a nucleic acid or nucleotide sequence, or encoded product thereof, containing sequences from two or more different sources.
  • Any suitable source can be selected, including, but not limited to, a sequence from a nucleic acid, nucleotide sequence, ribosomal nucleic acid, RNA, DNA, regulatory nucleotide sequence (e.g., promoter, UTR, enhancer, repressor and the like), coding nucleic acid, gene, nucleic acid linker, nucleic acid tag, amino acid sequence, peptide, polypeptide, protein, chromosome, and organism.
  • a chimeric molecule can include a sequence of contiguous nucleotides or amino acids from a source including, but not limited to, a virus, prokaryote, eukaryote, genus, species, homolog, ortholog, paralog and isozyme, nucleic acid linkers, nucleic acid tags, the like and combinations thereof.
  • a chimeric molecule can be generated by placing in juxtaposition fragments of related or unrelated nucleic acids, nucleotide sequences or DNA segments, in some embodiments.
  • the nucleic acids, nucleotide sequences or DNA segments can be native or wild type sequences, mutant sequences or engineered sequences (completely engineered or engineered to a point, for example).
  • a chimera includes about 1, 2, 3, 4 or 5 sequences (e.g., contiguous nucleotides, contiguous amino acids) from one organism and 1, 2, 3, 4 or 5 sequences (e.g., contiguous nucleotides, contiguous amino acids) from another organism.
  • the organisms sometimes are a microbe, such as a bacterium (e.g., gram positive, gram negative), yeast or fungus (e.g., aerobic fungus, anaerobic fungus), for example.
  • the organisms are bacteria, the organisms are yeast or the organisms are fungi (e.g., different species), and sometimes one organism is a bacterium or yeast and another is a fungus.
  • a chimeric molecule may contain up to about 99% of sequences from one organism (e.g., about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99%) and the balance percentage from one or more other organisms.
  • a chimeric molecule includes altered codons (in the case of a chimeric nucleic acid) and one or more mutations (e.g., point mutations, nucleotide substitutions, amino acid substitutions).
  • the chimera comprises a portion of a xylose isomerase from one bacterial species and a portion of a xylose isomerase from another bacterial species.
  • the chimera comprises a portion of a xylose isomerase from one species of fungus and another portion of a xylose isomerase from another species of fungus.
  • the chimera comprises one portion of a xylose isomerase from a plant, and another portion of a xylose isomerase from a non-plant (such as a bacterium or fungus).
  • the chimera comprises one portion of a xylose isomerase from a plant, another portion of a xylose isomerase from a bacterium, and yet another portion of a xylose isomerase from a fungus.
  • a gene encoding a xylose isomerase protein is chimeric, and includes a portion of a xylose isomerase encoding sequence from one organism (e.g. a fungus (e.g.,
  • a fungal sequence is located at the N-terminal portion of the encoded xylose isomerase polypeptide and the bacterial sequence is located at the C-terminal portion of the polypeptide.
  • one contiguous fungal xylose isomerase sequence is about 1% to about 30% of overall sequence (e.g., about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29%) and the remaining sequence is a contiguous bacterial xylose isomerase sequence.
  • a chimeric xylose isomerase includes one or more point mutations.
  • fragments used to generate a chimera can be juxtaposed as units (e.g., nucleic acids from the sources are combined end to end and not interspersed).
  • nucleotide sequence combinations can be noted as DNA source 1/DNA source 2 or DNA source 1/DNA source 2/DNA source 3, the like and combinations thereof, for example.
  • fragments used to generate a chimera can be juxtaposed such that one or more fragments from one or more sources can be interspersed with other fragments used to generate the chimera (e.g., DNA source 1/DNA source 2/DNA source 1/DNA source 3/DNA source 2/DNA source 1).
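  • The two juxtaposition patterns described above (fragments combined end to end as units, or fragments from one source interspersed with fragments from others) can be sketched as follows. This is a minimal illustration; the function names and the short placeholder sequences are hypothetical and do not correspond to any sequence in this document.

```python
from typing import List, Tuple

Fragment = Tuple[str, str]  # (source label, nucleotide sequence)


def assemble_chimera(fragments: List[Fragment]) -> Tuple[str, str]:
    """Join fragments in order; return (chimeric sequence, slash-separated source notation)."""
    sequence = "".join(seq for _, seq in fragments)
    notation = "/".join(source for source, _ in fragments)
    return sequence, notation


def is_interspersed(fragments: List[Fragment]) -> bool:
    """True if any source reappears after a fragment from a different source."""
    seen = set()
    previous = None
    for source, _ in fragments:
        if source != previous and source in seen:
            return True
        seen.add(source)
        previous = source
    return False


# Placeholder fragments: one end-to-end arrangement and one interspersed arrangement.
end_to_end = [("DNA source 1", "AAAA"), ("DNA source 2", "CCCC"), ("DNA source 3", "GGGG")]
interspersed = [("DNA source 1", "AAAA"), ("DNA source 2", "CCCC"), ("DNA source 1", "AATT")]

seq_units, note_units = assemble_chimera(end_to_end)
seq_mixed, note_mixed = assemble_chimera(interspersed)
```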
  • the nucleotide sequence length of the fragments used to generate a chimera can be in the range from about 5 base pairs to about 5,000 base pairs (e.g., about 5 base pairs (bp), about 10 bp, about 15 bp, about 20 bp, about 25 bp, about 30 bp, about 35 bp, about 40 bp, about 45 bp, about 50 bp, about 55 bp, about 60 bp, about 65 bp, about 70 bp, about 75 bp, about 80 bp, about 85 bp, about 90 bp, about 95 bp, about 100 bp, about 125 bp, about 150 bp, about 175 bp, about 200 bp, about 250 bp, about 300 bp, about 350 bp, about 400 bp, about 450 bp, about 500 bp, about 550 bp, about 600 bp, about 650 b
  • chimeric xylose isomerase sequences are generated by first aligning the sequences of donor and recipient xylose isomerases. In certain embodiments, the alignment is performed utilizing nucleotide sequences, and in some embodiments, the alignment is performed utilizing amino acid sequences. Aligning sequences from donors and recipients sometimes generates alignments with mismatched regions. In certain embodiments, a region of mismatch occurs in the N-terminus of the encoded polypeptides, and in some embodiments, the region of mismatch occurs in the C-terminus of the encoded polypeptides. In certain embodiments, chimeric polypeptides sometimes include 5 or more, 10 or more, 15 or more, or 20 or more amino acids from an organism designated as a "donor".
  • An organism often is designated as a donor when a minority of the final chimeric sequence (e.g., a smaller fragment or portion) is taken from the donor and combined with another sequence present in a majority of the final chimeric sequence (e.g., larger fragment or substantially the whole encoded activity).
  • An organism sometimes is designated as a recipient when the majority of the polypeptide sequence for a chimeric enzyme is obtained from a xylose isomerase from that organism.
  • a donor may contribute between about 1% to about 49% of the amino acids in a chimeric polypeptide.
  • the number of amino acids or nucleotides in a chimeric polypeptide donated by a donor is not equal to the number of amino acids or nucleotides removed from the recipient sequence. That is, a donor fragment may replace a larger or smaller number of amino acids or nucleotides than the number of amino acids or nucleotides removed from the recipient. In some embodiments, replacing a larger or smaller number of amino acids or nucleotides in the final chimeric sequence than was removed from the recipient is performed to maintain overall alignment and/or to maintain catalytic domain spacing, and sometimes results in a chimeric molecule having a substantially similar activity, but with a different length than the recipient xylose isomerase.
  • a donor replacement sequence may be between about 1 to about 10 amino acids more or less than the sequence removed in the recipient.
  • 8 amino acids (e.g., codon triplets representative of 8 amino acids) of a Ruminococcus flavefaciens xylose isomerase (e.g., amino acid sequence removed MEFFSNIG; nucleotide sequence removed atggaatttttcagcaatatcggt) are replaced with 10 amino acids from a Piromyces xylose isomerase (e.g., amino acid sequence added MAKEYFPQIQ; nucleotide sequence added atggcataaggaatatttcccacaaattcaa).
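  • The N-terminal replacement described above, in which the donor fragment is longer than the segment removed from the recipient, can be sketched at the amino acid level as follows. This is a minimal illustration: only the two short N-terminal sequences come from the text, while the remainder of the recipient is a run of "X" placeholders of arbitrary length, and the function names are hypothetical.

```python
def replace_n_terminus(recipient: str, removed: str, added: str) -> str:
    """Swap the N-terminal segment `removed` of the recipient for the donor segment `added`."""
    if not recipient.startswith(removed):
        raise ValueError("recipient does not start with the segment to remove")
    return added + recipient[len(removed):]


def donor_percent(chimera: str, added: str) -> float:
    """Percent of the chimeric polypeptide contributed by the donor segment."""
    return 100.0 * len(added) / len(chimera)


REMOVED = "MEFFSNIG"      # recipient N-terminus stated in the text (8 amino acids)
ADDED = "MAKEYFPQIQ"      # donor N-terminus stated in the text (10 amino acids)

# Placeholder recipient: the stated N-terminus followed by 30 unspecified residues.
recipient = REMOVED + "X" * 30
chimera = replace_n_terminus(recipient, REMOVED, ADDED)

length_change = len(chimera) - len(recipient)
donor_share = donor_percent(chimera, ADDED)
```

  • With a full-length recipient sequence in place of the placeholder, the donor share would be correspondingly smaller; the length difference of two residues does not depend on the placeholder.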
  • a chimeric nucleic acid or nucleotide sequence encodes the same activity as the activity encoded by the source nucleic acids or nucleotide sequences.
  • a chimeric nucleic acid or nucleotide sequence has a similar or the same activity, but the amount of the activity, or kinetics of the activity, are altered (e.g., increased, decreased).
  • a chimeric nucleic acid or nucleotide sequence encodes a different activity, and in some embodiments a chimeric nucleic acid or nucleotide sequence encodes a chimeric activity (e.g., a combination of two or more activities).
  • polynucleotide sequences described herein are codon optimized. In some embodiments, codon optimization alters the polynucleotide coding sequence for enhanced expression in a chosen host, while leaving the amino acid sequence unchanged.
  • Codon optimization can reduce transcriptional and/or translational pausing and/or other features which may decrease the expression of a polynucleotide in a host organism. Any suitable codon optimization scheme or method may be used to optimize polynucleotide sequences. In certain embodiments, codon optimization can be performed manually using a preferred codon table for the selected host organism, and in some embodiments, codon optimization can be performed using software (e.g., using a computer or an online software package). In certain embodiments, codon optimization can be performed using commercially available software and/or algorithms offered by manufacturers of custom or made to order synthetic polynucleotides (e.g., Integrated DNA Technologies (IDT), DNA 2.0, Genscript, EnCor Biotechnology, Blue Heron, and the like).
  • codon optimization can be performed using IDT's gene synthesis services.
  • an amino acid sequence is provided, a host organism is selected and IDT's codon optimization algorithm provides a codon optimized polynucleotide sequence based on the provided amino acid sequence and the preferred codon triplets for the selected host. Due to rounding decisions and other heuristics included in the algorithm, a codon optimized polynucleotide sequence generated for an amino acid sequence sometimes is about 90 percent or more identical to another codon optimized polynucleotide sequence generated for the same amino acid sequence.
  • two or more codon optimized polynucleotide sequences generated for an amino acid sequence sometimes are 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or more than 99% identical to each other.
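  • Manual codon optimization with a preferred codon table can be sketched as follows. This is a minimal illustration only: the table below assigns one valid codon per amino acid as a placeholder and is not a measured codon usage table for any particular host, and the function names are hypothetical. It is not the algorithm of any commercial provider.

```python
# Placeholder "preferred codon" table: one standard-genetic-code codon per amino acid.
PREFERRED_CODON = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATT",
    "L": "TTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAC", "V": "GTT",
}


def back_translate(protein: str, table=PREFERRED_CODON) -> str:
    """Encode an amino acid sequence using the single preferred codon for each residue."""
    try:
        return "".join(table[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"no codon defined for residue {err.args[0]!r}") from None


def translate_with_table(dna: str, table=PREFERRED_CODON) -> str:
    """Decode a sequence built from the table, to confirm the amino acids are unchanged."""
    codon_to_aa = {codon: aa for aa, codon in table.items()}
    return "".join(codon_to_aa[dna[i:i + 3]] for i in range(0, len(dna), 3))


optimized = back_translate("MEFFSNIG")
round_trip = translate_with_table(optimized)
```

  • Because a real scheme may choose among several synonymous codons per residue, two optimized sequences for the same protein can differ at the nucleotide level while encoding the same amino acid sequence.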
  • FIGS. 47A and 47B are nucleotide sequence alignments of codon optimized Ruminococcus FD-1 xylose isomerase nucleotide sequences (e.g., labeled 1, 2, 3 and 4) generated from the Ruminococcus FD-1 xylose isomerase amino acid sequence. Also presented in FIGS. 47A and 47B are the native Ruminococcus FD-1 xylose isomerase nucleotide sequence (e.g., top line labeled FD-1) and a consensus sequence (e.g., bottom line labeled consensus). The alignment shows that the four codon optimized sequences have a substantially high degree of identity.
  • an isolated nucleic acid comprises a chimeric nucleic acid which comprises a polynucleotide that is 80% or more identical to SEQ ID NO: 179 (e.g., 80% or more identical, 81% or more identical, 82% or more identical, 83% or more identical, 84% or more identical, 85% or more identical, 86% or more identical, 87% or more identical, 88% or more identical, 89% or more identical, 90% or more identical, 91% or more identical, 92% or more identical, 93% or more identical, 94% or more identical, 95% or more identical, 96% or more identical, 97% or more identical, 98% or more identical, or 99% or more identical).
  • a target nucleic acid sometimes is an untranslated ribonucleic acid and sometimes is a translated ribonucleic acid.
  • An untranslated ribonucleic acid may include, but is not limited to, a small interfering ribonucleic acid (siRNA), a short hairpin ribonucleic acid (shRNA), other ribonucleic acid capable of RNA interference (RNAi), an antisense ribonucleic acid, or a ribozyme.
  • a translatable target nucleotide sequence (e.g., a target ribonucleotide sequence) sometimes encodes a peptide, polypeptide or protein, which are sometimes referred to herein as "target peptides,” “target polypeptides” or “target proteins.”
  • Any peptides, polypeptides or proteins, or an activity catalyzed by one or more peptides, polypeptides or proteins may be encoded by a target nucleotide sequence and may be selected by a person of ordinary skill in the art.
  • Representative proteins include enzymes (e.g., phosphofructokinase activity, phosphogluconate dehydratase activity, 2-keto-3-deoxygluconate-6-phosphate aldolase activity, xylose isomerase activity, phosphoenolpyruvate carboxylase activity, alcohol dehydrogenase 2 activity and thymidylate synthase activity and the like, for example), antibodies, serum proteins (e.g., albumin), membrane bound proteins, hormones (e.g., growth hormone, erythropoietin, insulin, etc.), cytokines, etc., and include both naturally occurring and exogenously expressed polypeptides.
  • Representative activities include phosphofructokinase activity, phosphogluconate dehydratase activity, 2-keto-3-deoxygluconate-6-phosphate aldolase activity, xylose isomerase activity, phosphoenolpyruvate carboxylase activity, alcohol
  • the term "enzyme" as used herein refers to a protein which can act as a catalyst to induce a chemical change in other compounds, thereby producing one or more products from one or more substrates.
  • specific polypeptides e.g., enzymes
  • the term "protein" as used herein refers to a molecule having a sequence of amino acids linked by peptide bonds.
  • a protein or polypeptide sometimes is of intracellular origin (e.g., located in the nucleus, cytosol, or interstitial space of host cells in vivo) and sometimes is a cell membrane protein in vivo.
  • a genetic modification can result in a modification (e.g., increase, substantially increase, decrease or substantially decrease) of a target activity.
  • a translatable nucleotide sequence generally is located between a start codon (AUG in ribonucleic acids and ATG in deoxyribonucleic acids) and a stop codon (e.g., UAA (ochre), UAG (amber) or UGA (opal) in ribonucleic acids and TAA, TAG or TGA in deoxyribonucleic acids), and sometimes is referred to herein as an "open reading frame" (ORF).
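  • Locating a translatable sequence between a start codon and an in-frame stop codon can be sketched as follows. This is a minimal illustration that scans only the given strand, accepts either DNA or RNA letters, and returns the first ORF found; the function name and the short test sequences are hypothetical.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}  # ochre, amber, opal (DNA notation)


def find_first_orf(sequence: str) -> Optional[str]:
    """Return the first start-to-stop open reading frame (DNA notation), or None."""
    dna = sequence.upper().replace("U", "T")  # treat RNA input as its DNA equivalent
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this start codon.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)
    return None


orf_from_dna = find_first_orf("ccATGGAATTTTAAgg")
orf_from_rna = find_first_orf("AUGGAAUAG")
no_orf = find_first_orf("ATGGAA")
```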
  • a nucleic acid reagent sometimes comprises one or more ORFs.
  • An ORF may be from any suitable source, sometimes from genomic DNA, mRNA, reverse transcribed RNA or complementary DNA (cDNA) or a nucleic acid library comprising one or more of the foregoing, and is from any organism species that contains a nucleic acid sequence of interest, protein of interest, or activity of interest.
  • organisms from which an ORF can be obtained include bacteria, yeast, fungi, human, insect, nematode, bovine, equine, canine, feline, rat or mouse, for example.
  • a nucleic acid reagent sometimes comprises a nucleotide sequence adjacent to an ORF that is translated in conjunction with the ORF and encodes an amino acid tag.
  • the tag-encoding nucleotide sequence is located 3' and/or 5' of an ORF in the nucleic acid reagent, thereby encoding a tag at the C-terminus or N-terminus of the protein or peptide encoded by the ORF. Any tag that does not abrogate in vitro transcription and/or translation may be utilized and may be appropriately selected by the artisan. Tags may facilitate isolation and/or purification of the desired ORF product from culture or fermentation media.
  • a tag sometimes specifically binds a molecule or moiety of a solid phase or a detectable label, for example, thereby having utility for isolating, purifying and/or detecting a protein or peptide encoded by the ORF.
  • a tag comprises one or more of the following elements: FLAG (e.g., DYKDDDDKG), V5 (e.g., GKPIPNPLLGLDST), c-MYC (e.g., EQKLISEEDL), HSV (e.g., QPELAPEDPED), influenza hemagglutinin, HA (e.g., YPYDVPDYA), VSV-G (e.g., YTDIEMNRLGK), bacterial glutathione-S-transferase, maltose binding protein, a streptavidin- or avidin-binding tag (e.g., pcDNA™6 BioEase™ Gateway® Biotinylation System (Invitrogen)),
  • a cysteine-rich tag comprises the amino acid sequence CC-Xn-CC, wherein X is any amino acid and n is 1 to 3, and the cysteine-rich sequence sometimes is CCPGCC.
  • the tag comprises a cysteine-rich element and a polyhistidine element (e.g., CCPGCC and His6).
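  • The cysteine-rich motif CC-Xn-CC (X any amino acid, n from 1 to 3) and its combination with a polyhistidine element can be sketched as a pattern search over a one-letter amino acid sequence. This is a minimal illustration; the function names and the test polypeptide are hypothetical, and "any amino acid" is assumed to mean the 20 standard residues.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumption)

# CC, then 1 to 3 residues of any kind, then CC.
CYS_RICH = re.compile(rf"CC[{AMINO_ACIDS}]{{1,3}}CC")
HIS6 = re.compile(r"H{6}")


def find_cys_rich_tags(protein: str) -> list:
    """Return every non-overlapping CC-Xn-CC match (n = 1 to 3) in the sequence."""
    return [m.group(0) for m in CYS_RICH.finditer(protein.upper())]


def has_cys_and_his_elements(protein: str) -> bool:
    """True if the sequence contains both a cysteine-rich element and a His6 element."""
    seq = protein.upper()
    return bool(CYS_RICH.search(seq)) and bool(HIS6.search(seq))


# Hypothetical tagged polypeptide carrying CCPGCC and His6.
tagged = "MKCCPGCCLEHHHHHH"
matches = find_cys_rich_tags(tagged)
both = has_cys_and_his_elements(tagged)
```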
  • a tag often conveniently binds to a binding partner. For example, some tags bind to an antibody (e.g., FLAG) and sometimes specifically bind to a small molecule.
  • a polyhistidine tag specifically chelates a bivalent metal, such as copper, zinc and cobalt; a polylysine or polyarginine tag specifically binds to a zinc finger; a glutathione S-transferase tag binds to glutathione; and a cysteine-rich tag specifically binds to an arsenic-containing molecule.
  • Arsenic-containing molecules include LUMIO™ agents (Invitrogen, California), such as FlAsH™ (EDT2[4',5'-bis(1,3,2-dithioarsolan-2-yl)fluorescein-(1,2-ethanedithiol)2]) and ReAsH reagents (e.g., U.S. Patent 5,932,474 to Tsien et al., entitled “Target Sequences for Synthetic Molecules;” U.S. Patent 6,054,271 to Tsien et al., entitled “Methods of Using Synthetic Molecules and Target Sequences;” U.S. Patents 6,451,569 and 6,008,378; published U.S. Patent Application 2003/0083373, and published PCT Patent Application WO 99/21013, all to Tsien et al. and all entitled “Synthetic Molecules that Specifically React with Target Sequences”).
  • Such antibodies and small molecules sometimes are linked to a solid phase for convenient isolation of the target protein or target peptide.
  • a tag sometimes comprises a sequence that localizes a translated protein or peptide to a component in a system, which is referred to as a "signal sequence” or “localization signal sequence” herein.
  • a signal sequence often is incorporated at the N-terminus of a target protein or target peptide, and sometimes is incorporated at the C-terminus. Examples of signal sequences are known to the artisan, are readily incorporated into a nucleic acid reagent, and often are selected according to the organism in which expression of the nucleic acid reagent is performed.
  • a signal sequence in some embodiments localizes a translated protein or peptide to a cell membrane.
  • signal sequences include, but are not limited to, a nucleus targeting signal (e.g., steroid receptor sequence and N-terminal sequence of SV40 virus large T antigen); mitochondrial targeting signal (e.g., amino acid sequence that forms an amphipathic helix);
  • peroxisome targeting signal (e.g., C-terminal sequence in YFG from S. cerevisiae)
  • a secretion signal (e.g., N-terminal sequences from invertase, mating factor alpha, PHO5 and SUC2 in S. cerevisiae; multiple N-terminal sequences of B. subtilis proteins (e.g., Tjalsma et al.,
  • alpha amylase signal sequence (e.g., U.S. Patent No. 6,288,302)
  • pectate lyase signal sequence (e.g., U.S. Patent No. 5,846,8178)
  • precollagen signal sequence (e.g., U.S. Patent No. 5,712,114)
  • OmpA signal sequence (e.g., U.S. Patent No. 5,470,719)
  • lam beta signal sequence (e.g., U.S. Patent No. 5,389,529)
  • B. brevis signal sequence (e.g., U.S. Patent No. 5,232,841)
  • P. pastoris signal sequence e.g., U.S. Patent No.
  • a tag sometimes is directly adjacent to the amino acid sequence encoded by an ORF (i.e., there is no intervening sequence) and sometimes a tag is substantially adjacent to an ORF encoded amino acid sequence (e.g., an intervening sequence is present).
  • An intervening sequence sometimes includes a recognition site for a protease, which is useful for cleaving a tag from a target protein or peptide.
  • the intervening sequence is cleaved by Factor Xa (e.g., recognition site I(E/D)GR), thrombin (e.g., recognition site LVPRGS), enterokinase (e.g., recognition site DDDDK), TEV protease (e.g., recognition site ENLYFQG) or PreScission™ protease (e.g., recognition site LEVLFQGP), for example.
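The recognition sites listed above can be located in a candidate tag/linker/target sequence with a simple pattern search. This is an illustrative sketch only; the dictionary name, function name and example sequences are hypothetical, and the code reports where a site occurs without modeling the exact cleavage position.

```python
import re

# Recognition sites as given in the text; I(E/D)GR is written as a
# character class. One-letter amino acid codes are assumed.
PROTEASE_SITES = {
    "Factor Xa": r"I[ED]GR",
    "thrombin": r"LVPRGS",
    "enterokinase": r"DDDDK",
    "TEV protease": r"ENLYFQG",
    "PreScission protease": r"LEVLFQGP",
}


def find_protease_sites(protein):
    """Map each protease name to the 0-based start positions of its site."""
    hits = {}
    for name, pattern in PROTEASE_SITES.items():
        positions = [m.start() for m in re.finditer(pattern, protein.upper())]
        if positions:
            hits[name] = positions
    return hits


# Hypothetical fusion: His6 tag, GS linker, TEV site, start of a target protein.
fusion = "HHHHHH" + "GS" + "ENLYFQG" + "MKT"
print(find_protease_sites(fusion))
```

Running the same search on the FLAG example sequence from the tag list (DYKDDDDKG) reports an enterokinase site, since that sequence contains DDDDK.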
  • An intervening sequence sometimes is referred to herein as a "linker sequence," and may be of any suitable length selected by the artisan.
  • a linker sequence sometimes is about 1 to about 20 amino acids in length, and sometimes about 5 to about 10 amino acids in length.
  • linker length may be selected to substantially preserve target protein or peptide function (e.g., a tag may reduce target protein or peptide function unless separated by a linker), to enhance disassociation of a tag from a target protein or peptide when a protease cleavage site is present (e.g., cleavage may be enhanced when a linker is present), and to enhance interaction of a tag/target protein product with a solid phase.
  • a linker can be of any suitable amino acid content, and often comprises a higher proportion of amino acids having relatively short side chains (e.g., glycine, alanine, serine and threonine).
  • a nucleic acid reagent sometimes includes a stop codon between a tag element and an insertion element or ORF, which can be useful for translating an ORF with or without the tag.
  • Mutant tRNA molecules that recognize stop codons (described above) suppress translation termination and thereby are designated “suppressor tRNAs.” Suppressor tRNAs can result in the insertion of amino acids and continuation of translation past stop codons (e.g., U.S. Patent Application No. 60/587,583, filed July 1, 2004, entitled “Production of Fusion Proteins by Cell-Free Protein Synthesis”; Eggertsson, et al., (1988) Microbiological Review 52(3):354-374, and Engelberg-Kulka, et al.
  • suppressor tRNAs are known, including but not limited to, supE, supP, supD, supF and supZ suppressors, which suppress the termination of translation of the amber stop codon; supB, glT, supL, supN, supC and supM suppressors, which suppress the function of the ochre stop codon; and glyT, trpT and Su-9 suppressors, which suppress the function of the opal stop codon.
  • suppressor tRNAs contain one or more mutations in the anti-codon loop of the tRNA that allows the tRNA to base pair with a codon that ordinarily functions as a stop codon.
  • the mutant tRNA is charged with its cognate amino acid residue and the cognate amino acid residue is inserted into the translating polypeptide when the stop codon is encountered. Mutations that enhance the efficiency of termination suppressors (i.e., increase stop codon read-through) have been identified.
  • Such mutations include, but are not limited to, mutations in the uar gene (also known as the prfA gene), mutations in the ups gene, mutations in the sueA, sueB and sueC genes, mutations in the rpsD (ramA) and rpsE (spcA) genes and mutations in the rplL gene.
  • a nucleic acid reagent comprising a stop codon located between an ORF and a tag can yield a translated ORF alone when no suppressor tRNA is present in the translation system, and can yield a translated ORF-tag fusion when a suppressor tRNA is present in the system.
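The effect of a stop codon placed between an ORF and a tag can be sketched with a toy translation routine. The sketch below assumes the standard genetic code; the construct sequence, the function name and the choice of serine as the inserted residue are hypothetical illustrations, not details taken from the text.

```python
from itertools import product

# Standard genetic code, built in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, suppressor=None):
    """Translate from position 0 until the first unsuppressed stop codon.

    suppressor: optional dict mapping a stop codon (e.g., "TAG") to the
    amino acid inserted by a suppressor tRNA that reads that codon.
    """
    suppressor = suppressor or {}
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        aa = suppressor.get(codon, CODON_TABLE[codon])
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical construct: short ORF, amber stop (TAG), His6 tag, ochre stop (TAA).
construct = "ATGAAAACC" + "TAG" + "CATCACCATCACCATCAC" + "TAA"

print(translate(construct))                           # ORF alone
print(translate(construct, suppressor={"TAG": "S"}))  # ORF-tag fusion
```

Without a suppressor the amber codon terminates translation after the ORF; with an amber suppressor the amber codon is read through and translation continues into the tag until the unsuppressed ochre codon.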
  • Suppressor tRNA can be generated in cells transfected with a nucleic acid encoding the tRNA (e.g., a replication incompetent adenovirus containing the human tRNA-Ser suppressor gene can be transfected into cells, or a YAC containing a yeast or bacterial tRNA suppressor gene can be transfected into yeast cells, for example).
  • Vectors for synthesizing suppressor tRNA and for translating ORFs with or without a tag are available to the artisan (e.g., Tag-On-Demand™ kit (Invitrogen Corporation, California); Tag-On-Demand™ Suppressor Supernatant Instruction Manual, Version B, 6 June 2003, at http address www.invitrogen.com/content/sfs/
  • Any convenient cloning strategy known in the art may be utilized to incorporate an element, such as an ORF, into a nucleic acid reagent.
  • Known methods can be utilized to insert an element into the template independent of an insertion element, such as (1) cleaving the template at one or more existing restriction enzyme sites and ligating an element of interest and (2) adding restriction enzyme sites to the template by hybridizing oligonucleotide primers that include one or more suitable restriction enzyme sites and amplifying by polymerase chain reaction (described in greater detail herein).
  • Other cloning strategies take advantage of one or more insertion sites present or inserted into the nucleic acid reagent, such as an oligonucleotide primer hybridization site for PCR, for example, and others described hereafter.
  • a cloning strategy can be combined with genetic manipulation such as recombination (e.g., recombination of a nucleic acid reagent with a nucleic acid sequence of interest into the genome of the organism to be modified, as described further below).
  • the cloned ORF(s) can produce (directly or indirectly) a desired product, by engineering a microorganism with one or more ORFs of interest, which microorganism comprises one or more altered activities selected from the group consisting of phosphofructokinase activity, phosphogluconate dehydratase activity, 2-keto-3-deoxygluconate-6-phosphate aldolase activity, xylose isomerase activity, phosphoenolpyruvate carboxylase activity, alcohol dehydrogenase 2 activity, sugar transport activity, phosphoglucoisomerase activity, transaldolase activity, transketolase activity, glucose-6-phosphate dehydrogenase activity, 6-phosphogluconolactonase activity, 6-phosphogluconate dehydrogenase (decarboxylating) activity, xylose reductase activity, xylitol dehydrogenase activity, xylulokinase activity and thy
  • the nucleic acid reagent includes one or more recombinase insertion sites.
  • a recombinase insertion site is a recognition sequence on a nucleic acid molecule that participates in an integration/recombination reaction by recombination proteins.
  • an example of a recombination site is loxP, the recombination site for Cre recombinase, which is a 34 base pair sequence composed of two 13 base pair inverted repeats (serving as the recombinase binding sites) flanking an 8 base pair core sequence (e.g., Figure 1 of Sauer, B., Curr. Opin. Biotech. 5:521-527 (1994)).
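The 13 + 8 + 13 layout of loxP can be checked programmatically. The text does not give the loxP sequence; the sequence below is the commonly cited one and is included here as an assumption, and the helper names are hypothetical.

```python
# Commonly cited loxP sequence (an assumption; not stated in the text),
# written as left repeat + core + right repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def split_loxp(site):
    """Split a 34 bp site into (left 13 bp repeat, 8 bp core, right 13 bp repeat)."""
    if len(site) != 34:
        raise ValueError("expected a 34 base pair site")
    return site[:13], site[13:21], site[21:]


left, core, right = split_loxp(LOXP)
print(len(LOXP), len(left), len(core), len(right))
# Inverted repeats: the right arm is the reverse complement of the left arm.
print(right == reverse_complement(left))
# The core is not its own reverse complement, which gives the site a direction.
print(core == reverse_complement(core))
```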
  • Other examples of recombination sites include attB, attP, attL, and attR sequences, and mutants, fragments, variants and derivatives thereof, which are recognized by the recombination protein λ Int and by the auxiliary proteins integration host factor (IHF), FIS and excisionase (Xis) (e.g., U.S.
  • Examples of recombinase cloning nucleic acids are in Gateway® systems (Invitrogen, California), which include at least one recombination site for cloning desired nucleic acid molecules in vivo or in vitro.
  • the system utilizes vectors that contain at least two different site-specific recombination sites, often based on the bacteriophage lambda system (e.g., att1 and att2), and are mutated from the wild-type (att0) sites.
  • Each mutated site has a unique specificity for its cognate partner att site (i.e., its binding partner recombination site) of the same type (for example attB1 with attP1, or attL1 with attR1) and will not cross-react with recombination sites of the other mutant type or with the wild-type att0 site.
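The pairing rule stated above (attB with attP, attL with attR, and only within the same mutant type) can be written as a small predicate. This is a simplified sketch of the stated rule only; the site-name format and function name are assumptions made for illustration.

```python
import re

# Cognate partner classes as described in the text: B pairs with P, L with R.
PARTNER = {"B": "P", "P": "B", "L": "R", "R": "L"}
SITE_NAME = re.compile(r"att([BPLR])(\d)")


def can_recombine(site_a, site_b):
    """True if two named att sites are cognate partners of the same mutant type.

    Sites of type 0 (wild-type) are treated as non-reactive with mutant sites,
    and sites of different mutant types (e.g., 1 and 2) do not cross-react.
    """
    a = SITE_NAME.fullmatch(site_a)
    b = SITE_NAME.fullmatch(site_b)
    if not a or not b:
        return False
    same_type = a.group(2) == b.group(2) and a.group(2) != "0"
    cognate = PARTNER[a.group(1)] == b.group(1)
    return same_type and cognate


for pair in [("attB1", "attP1"), ("attL1", "attR1"), ("attB1", "attP2"), ("attB1", "attR1")]:
    print(pair, can_recombine(*pair))
```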
  • Nucleic acid fragments flanked by recombination sites are cloned and subcloned using the Gateway® system by replacing a selectable marker (for example, ccdB) flanked by att sites on the recipient plasmid molecule, sometimes termed the Destination Vector. Desired clones are then selected by transformation of a ccdB sensitive host strain and positive selection for a marker on the recipient molecule. Similar strategies for negative selection (e.g., use of toxic genes) can be used in other organisms, such as thymidine kinase (TK) in mammals and insects.
  • a recombination system useful for engineering yeast is outlined briefly.
  • the system makes use of the ura3 gene (e.g., for S. cerevisiae and C. albicans, for example) or ura4 and ura5 genes (e.g., for S. pombe, for example) and toxicity of the nucleotide analogue 5-Fluoroorotic acid (5-FOA).
  • the ura3 or ura4 and ura5 genes encode orotidine-5'-monophosphate (OMP) decarboxylase.
  • Yeast carrying a mutation in the appropriate gene(s) or having a knock out of the appropriate gene(s) can grow in the presence of 5-FOA, if the media is also supplemented with uracil.
  • a nucleic acid engineering construct can be made which may comprise the URA3 gene or cassette (for S. cerevisiae), flanked on either side by the same nucleotide sequence in the same orientation.
  • the ura3 cassette comprises a promoter, the ura3 gene and a functional transcription terminator.
  • Target sequences which direct the construct to a particular nucleic acid region of interest in the organism to be engineered are added such that the target sequences are adjacent to and abut the flanking sequences on either side of the ura3 cassette.
  • Yeast can be transformed with the engineering construct and plated on minimal media without uracil. Colonies can be screened by PCR to determine those transformants that have the engineering construct inserted in the proper location in the genome.
  • Checking insertion location prior to selecting for recombination of the ura3 cassette may reduce the number of incorrect clones carried through to later stages of the procedure. Correctly inserted transformants can then be replica plated on minimal media containing 5-FOA to select for recombination of the ura3 cassette out of the construct, leaving a disrupted gene and an identifiable footprint (e.g., nucleic acid sequence) that can be used to verify the presence of the disrupted gene.
  • the technique described is useful for disrupting or "knocking out" gene function, but also can be used to insert genes or constructs into a host organism's genome in a targeted, sequence specific manner. Further detail will be described below in the engineering section and in the example section.
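The two-step selection outlined above can be sketched as a toy model: the construct is a list of labeled segments, recombination between the identical flanking repeats removes the cassette and one repeat copy, and growth depends on the Ura phenotype and the medium. All names and segment labels below are hypothetical, and the growth rules are a deliberate simplification (Ura+ cells are treated as killed by 5-FOA; Ura- cells require uracil).

```python
def build_construct(target_left, repeat, cassette, target_right):
    """Engineering construct: targets abut identical direct repeats flanking the cassette."""
    return [target_left, repeat, cassette, repeat, target_right]


def pop_out(construct):
    """Recombine between the direct repeats, leaving a single repeat as the footprint."""
    target_left, repeat_a, _cassette, repeat_b, target_right = construct
    if repeat_a != repeat_b:
        raise ValueError("flanking repeats must be identical")
    return [target_left, repeat_a, target_right]


def grows(has_ura3, uracil, five_foa):
    """Simplified growth rule for the selection scheme."""
    if has_ura3 and five_foa:
        return False  # Ura+ cells are sensitive to 5-FOA
    if not has_ura3 and not uracil:
        return False  # Ura- cells need supplemented uracil
    return True


integrated = build_construct("TARGET_5", "REPEAT", "URA3", "TARGET_3")
# Step 1: transformants carrying the cassette grow on minimal medium without uracil.
print(grows(has_ura3="URA3" in integrated, uracil=False, five_foa=False))
# Step 2: cells that lose the cassette survive on 5-FOA plus uracil.
footprint = pop_out(integrated)
print(footprint, grows(has_ura3="URA3" in footprint, uracil=True, five_foa=True))
```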
  • a nucleic acid reagent includes one or more topoisomerase insertion sites.
  • a topoisomerase insertion site is a defined nucleotide sequence recognized and bound by a site-specific topoisomerase.
  • the nucleotide sequence 5'-(C/T)CCTT-3' is a topoisomerase recognition site bound specifically by most poxvirus topoisomerases, including vaccinia virus DNA topoisomerase I.
  • the topoisomerase cleaves the strand at the 3'-most thymidine of the recognition site to produce a nucleotide sequence comprising 5'-(C/T)CCTT-PO4-TOPO, a complex of the topoisomerase covalently bound to the 3' phosphate via a tyrosine in the topoisomerase (e.g., Shuman, J. Biol. Chem. 266:11372-11379, 1991; Sekiguchi and Shuman, Nucl. Acids Res. 22:5360-5365, 1994; U.S. Pat. No.
  • the nucleotide sequence 5'-GCAACTT-3' is a topoisomerase recognition site for type IA E. coli topoisomerase III.
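Locating 5'-(C/T)CCTT-3' sites on a strand, together with the position immediately after the 3'-most thymidine, can be sketched as follows. The function name and example strand are hypothetical, and only the given strand (read 5' to 3') is searched.

```python
import re

# Lookahead so that overlapping occurrences are also reported.
TOPO_SITE = re.compile(r"(?=[CT]CCTT)")


def find_topo_sites(strand):
    """Return (site_start, cut_index) pairs for each 5'-(C/T)CCTT-3' site.

    cut_index is the 0-based position immediately 3' of the last T of the
    site, so strand[:cut_index] ends with the recognition sequence.
    """
    return [(m.start(), m.start() + 5) for m in TOPO_SITE.finditer(strand.upper())]


strand = "GGACCCTTAAGG"  # hypothetical example strand
for start, cut in find_topo_sites(strand):
    print(start, strand[:cut], strand[cut:])
```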
  • An element to be inserted often is combined with topoisomerase-reacted template and thereby incorporated into the nucleic acid reagent (e.g., http address www.invitrogen.com/downloads/F-13512_Topo_Flyer.pdf; http address at world wide web uniform resource locator
  • a nucleic acid reagent sometimes contains one or more origin of replication (ORI) elements.
  • a template comprises two or more ORIs, where one functions efficiently in one organism (e.g., a bacterium) and another functions efficiently in another organism (e.g., a eukaryote, like yeast for example).
  • an ORI may function efficiently in one species (e.g., S. cerevisiae, for example) and another ORI may function efficiently in a different species (e.g., S. pombe, for example).
  • a nucleic acid reagent also sometimes includes one or more transcription regulation sites.
  • a nucleic acid reagent can include one or more selection elements (e.g., elements for selection of the presence of the nucleic acid reagent, and not for activation of a promoter element which can be selectively regulated). Selection elements often are utilized using known processes to determine whether a nucleic acid reagent is included in a cell.
  • a nucleic acid reagent includes two or more selection elements, where one functions efficiently in one organism and another functions efficiently in another organism.
  • selection elements include, but are not limited to, (1) nucleic acid segments that encode products that provide resistance against otherwise toxic compounds (e.g., antibiotics); (2) nucleic acid segments that encode products that are otherwise lacking in the recipient cell (e.g., essential products, tRNA genes, auxotrophic markers); (3) nucleic acid segments that encode products that suppress the activity of a gene product; (4) nucleic acid segments that encode products that can be readily identified (e.g., phenotypic markers such as antibiotics (e.g., β-lactamase), β-galactosidase, green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP), cyan fluorescent protein (CFP), and cell surface proteins); (5) nucleic acid segments that bind products that are otherwise detrimental to cell survival and/or function; (6) nucleic acid segments that otherwise inhibit the activity of any of the nucleic acid segments described in Nos.
  • (7) nucleic acid segments that bind products that modify a substrate (e.g., restriction endonucleases);
  • (8) nucleic acid segments that can be used to isolate or identify a desired molecule (e.g., specific protein binding sites);
  • (9) nucleic acid segments that encode a specific nucleotide sequence that can be otherwise non-functional (e.g., for PCR amplification of subpopulations of molecules);
  • (10) nucleic acid segments that, when absent, directly or indirectly confer resistance or sensitivity to particular compounds; (11) nucleic acid segments that encode products that either are toxic or convert a relatively non-toxic compound to a toxic compound (e.g., Herpes simplex thymidine kinase, cytosine deaminase) in recipient cells; (12) nucleic acid segments that inhibit replication, partition or heritability of nucleic acid molecules that contain them; and/or (13) nucleic acid segments
  • a nucleic acid reagent is of any form useful for in vivo transcription and/or translation.
  • a nucleic acid sometimes is a plasmid, such as a supercoiled plasmid, sometimes is a yeast artificial chromosome (e.g., YAC), sometimes is a linear nucleic acid (e.g., a linear nucleic acid produced by PCR or by restriction digest), sometimes is single-stranded and sometimes is double-stranded.
  • a nucleic acid reagent sometimes is prepared by an amplification process, such as a polymerase chain reaction (PCR) process or transcription-mediated amplification process (TMA).
  • in TMA, two enzymes are used in an isothermal reaction to produce amplification products detected by light emission (see, e.g., Biochemistry 1996 Jun 25;35(25):8429-38 and http address world wide web uniform resource locator devicelink.com/ivdt/archive/00/11/007.html).
  • Standard PCR processes are known (e.g., U. S. Patent Nos. 4,683,202; 4,683,195; 4,965,188; and 5,656,493), and generally are performed in cycles. Each cycle includes heat denaturation, in which hybrid nucleic acids dissociate; cooling, in which primer oligonucleotides hybridize; and extension of the oligonucleotides by a polymerase (i.e., Taq polymerase).
  • An example of a PCR cyclical process is treating the sample at 95°C for 5 minutes; repeating forty-five cycles of 95°C for 1 minute, 59°C for 1 minute, 10 seconds, and 72°C for 1 minute 30 seconds; and then treating the sample at 72°C for 5 minutes. Multiple cycles frequently are performed using a commercially available thermal cycler.
  • PCR amplification products sometimes are stored for a time at a lower temperature (e.g., at 4°C) and sometimes are frozen (e.g., at -20°C) before analysis.
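The example cycling program above can be written down as data and its nominal hold time summed. This is a minimal sketch; the variable and function names are hypothetical, and ramp times between temperatures, which depend on the thermal cycler, are ignored.

```python
# Each step is (temperature in degrees C, hold time in seconds),
# taken from the example PCR process in the text.
INITIAL_DENATURATION = (95, 5 * 60)
CYCLE_STEPS = [
    (95, 60),       # denaturation: 1 minute
    (59, 70),       # annealing: 1 minute, 10 seconds
    (72, 90),       # extension: 1 minute, 30 seconds
]
N_CYCLES = 45
FINAL_EXTENSION = (72, 5 * 60)


def total_hold_seconds(initial, cycle_steps, n_cycles, final):
    """Sum the programmed hold times, excluding temperature ramps."""
    per_cycle = sum(seconds for _temp, seconds in cycle_steps)
    return initial[1] + n_cycles * per_cycle + final[1]


seconds = total_hold_seconds(INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION)
print(f"{seconds} s = {seconds / 60:.0f} min")  # 10500 s = 175 min
```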
  • a nucleic acid reagent, protein reagent, protein fragment reagent or other reagent described herein is isolated or purified.
  • The term “isolated” as used herein refers to material removed from its original environment (e.g., the natural environment if it is naturally occurring, or a host cell if expressed exogenously), and thus is altered “by the hand of man” from its original environment.
  • The term “purified” as used herein with reference to molecules does not refer to absolute purity. Rather, “purified” refers to a substance in a composition that contains fewer substance species in the same class (e.g., nucleic acid or protein species) other than the substance of interest in comparison to the sample from which it originated.
  • “Purified,” with reference to a nucleic acid or protein for example, refers to a substance in a composition that contains fewer nucleic acid species or protein species other than the nucleic acid or protein of interest in comparison to the sample from which it originated.
  • a protein or nucleic acid is “substantially pure,” indicating that the protein or nucleic acid represents at least 50% of protein or nucleic acid on a mass basis of the composition.
  • a substantially pure protein or nucleic acid is at least 75% on a mass basis of the composition, and sometimes at least 95% on a mass basis of the composition.
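The mass-basis thresholds above (at least 50%, 75% and 95%) can be illustrated with a short calculation. This is a sketch under the assumption that both masses are measured in the same units; the function names and example masses are hypothetical.

```python
# Mass-fraction thresholds named in the text for "substantially pure".
THRESHOLDS = (0.50, 0.75, 0.95)


def mass_fraction(mass_of_interest, total_mass):
    """Fraction of the composition's protein (or nucleic acid) mass that is the species of interest."""
    if total_mass <= 0 or not 0 <= mass_of_interest <= total_mass:
        raise ValueError("masses must satisfy 0 <= mass_of_interest <= total_mass, total_mass > 0")
    return mass_of_interest / total_mass


def thresholds_met(fraction):
    """Return the thresholds from the text that the given mass fraction satisfies."""
    return [t for t in THRESHOLDS if fraction >= t]


# Hypothetical preparation: 8 mg of the target protein in 10 mg total protein.
fraction = mass_fraction(8.0, 10.0)
print(fraction, thresholds_met(fraction))
```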
  • The term “engineered microorganism” as used herein refers to a modified organism that includes one or more activities distinct from an activity present in a microorganism utilized as a starting point for modification (e.g., host microorganism or unmodified organism).
  • Engineered microorganisms typically arise as a result of a genetic modification, usually introduced or selected for, by one of skill in the art using readily available techniques.
  • Non-limiting examples of methods useful for generating an altered activity include introducing a heterologous polynucleotide (e.g., nucleic acid or gene integration, also referred to as “knock in”), removing an endogenous polynucleotide, altering the sequence of an existing endogenous nucleic acid sequence (e.g., site-directed mutagenesis), disruption of an existing endogenous nucleic acid sequence (e.g., knock outs and transposon or insertion element mediated mutagenesis), selection for an altered activity where the selection causes a change in a naturally occurring activity that can be stably inherited (e.g., causes a change in a nucleic acid sequence in the genome of the organism or in an epigenetic nucleic acid that is replicated and passed on to daughter cells), PCR-based mutagenesis, and the like.
  • The term “mutagenesis” as used herein refers to any modification to a nucleic acid (e.g., nucleic acid reagent, or host chromosome, for example) that is subsequently used to generate a product in a host or modified organism.
  • Non-limiting examples of mutagenesis include deletion, insertion, substitution, rearrangement, point mutations, suppressor mutations and the like. Mutagenesis methods are known in the art and are readily available to the artisan. Non-limiting examples of mutagenesis methods are described herein and can also be found in Maniatis, T., E. F. Fritsch and J. Sambrook (1982) Molecular Cloning: a Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
  • The term “genetic modification” as used herein refers to any suitable nucleic acid addition, removal or alteration that facilitates production of a target product (e.g., phosphogluconate dehydratase activity, 2-keto-3-deoxygluconate-6-phosphate aldolase activity, xylose isomerase activity, or phosphoenolpyruvate carboxylase activity, for example), in an engineered microorganism.
  • Genetic modifications include, without limitation, insertion of one or more nucleotides in a native nucleic acid of a host organism in one or more locations, deletion of one or more nucleotides in a native nucleic acid of a host organism in one or more locations, modification or substitution of one or more nucleotides in a native nucleic acid of a host organism in one or more locations, insertion of a non-native nucleic acid into a host organism (e.g., insertion of an autonomously replicating vector), and removal of a non-native nucleic acid in a host organism (e.g., removal of a vector).
  • The term “heterologous polynucleotide” as used herein refers to a nucleotide sequence not present in a host microorganism in some embodiments.
  • a heterologous polynucleotide is present in a different amount (e.g., different copy number) than in a host microorganism, which can be accomplished, for example, by introducing more copies of a particular nucleotide sequence to a host microorganism (e.g., the particular nucleotide sequence may be in a nucleic acid autonomous of the host chromosome or may be inserted into a chromosome).
  • a heterologous polynucleotide is from a different organism in some embodiments, and in certain embodiments, is from the same type of organism but from an outside source (e.g., a recombinant source).
  • The term “altered activity” as used herein refers to an activity in an engineered microorganism that is added or modified relative to the host microorganism (e.g., added, increased, reduced, inhibited or removed activity).
  • An activity can be altered by introducing a genetic modification to a host microorganism that yields an engineered microorganism having added, increased, reduced, inhibited or removed activity.
  • An added activity often is an activity not detectable in a host microorganism.
  • An increased activity generally is an activity detectable in a host microorganism that has been increased in an engineered microorganism.
  • An activity can be increased to any suitable level for production of a target product (e.g., ethanol), including but not limited to less than 2-fold (e.g., about 10% increase to about 99% increase; about 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% increase), 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or 10-fold increase, or greater than about 10-fold increase.
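The relationship between a percent increase and a fold increase used above can be made explicit: a 50% increase corresponds to 1.5-fold, and increases below 100% are "less than 2-fold". The sketch below assumes activity is measured in the same arbitrary units in the host and engineered microorganisms; the function names and values are hypothetical.

```python
def fold_change(host_activity, engineered_activity):
    """Ratio of engineered to host activity; undefined when the host has no detectable activity."""
    if host_activity <= 0:
        raise ValueError("host activity must be detectable (> 0) to compute a fold change")
    return engineered_activity / host_activity


def percent_increase(host_activity, engineered_activity):
    """Percent increase relative to the host (e.g., 1.5-fold corresponds to 50%)."""
    return (fold_change(host_activity, engineered_activity) - 1.0) * 100.0


def describe_increase(host_activity, engineered_activity):
    """Report increases below 2-fold as a percentage and others as a fold value."""
    fold = fold_change(host_activity, engineered_activity)
    if fold < 2.0:
        pct = percent_increase(host_activity, engineered_activity)
        return f"{pct:.0f}% increase (less than 2-fold)"
    return f"{fold:.1f}-fold increase"


print(describe_increase(10.0, 15.0))
print(describe_increase(10.0, 30.0))
```

An added activity, which is not detectable in the host, has no defined fold change under this convention, so the function raises an error in that case.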
  • a reduced or inhibited activity generally is an activity detectable in a host microorganism that has been reduced or inhibited in an engineered microorganism.
  • An activity can be reduced to undetectable levels in some embodiments, or detectable levels in certain embodiments.
  • An activity can be decreased to any suitable level for production of a target product (e.g., ethanol), including but not limited to less than 2-fold (e.g., about 10% decrease to about 99% decrease; about 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% decrease), 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or 10-fold decrease, or greater than about 10-fold decrease.
  • An altered activity sometimes is an activity not detectable in a host organism and is added to an engineered organism.
  • An altered activity also may be an activity detectable in a host organism and is increased in an engineered organism.
  • An activity may be added or increased by increasing the number of copies of a polynucleotide that encodes a polypeptide having a target activity, in some embodiments.
  • an activity can be added or increased by inserting into a host microorganism a heterologous polynucleotide that encodes a polypeptide having the added activity.
  • an activity can be added or increased by inserting into a host microorganism a heterologous polynucleotide that is (i) operably linked to another polynucleotide that encodes a polypeptide having the added activity, and (ii) up regulates production of the polynucleotide.
  • an activity can be added or increased by inserting or modifying a regulatory polynucleotide operably linked to another polynucleotide that encodes a polypeptide having the target activity.
  • an activity can be added or increased by subjecting a host microorganism to a selective environment and screening for microorganisms that have a detectable level of the target activity. Examples of a selective environment include, without limitation, a medium containing a substrate that a host organism can process and a medium lacking a substrate that a host organism can process.
  • An altered activity sometimes is an activity detectable in a host organism and is reduced, inhibited or removed (i.e., not detectable) in an engineered organism.
  • An activity may be reduced or removed by decreasing the number of copies of a polynucleotide that encodes a polypeptide having a target activity, in some embodiments.
  • an activity can be reduced or removed by (i) inserting a polynucleotide within a polynucleotide that encodes a polypeptide having the target activity (disruptive insertion), and/or (ii) removing a portion of or all of a polynucleotide that encodes a polypeptide having the target activity (deletion or knock out, respectively).
  • an activity can be reduced or removed by inserting into a host microorganism a heterologous polynucleotide that is (i) operably linked to another polynucleotide that encodes a polypeptide having the target activity, and (ii) down regulates production of the polynucleotide.
  • an activity can be reduced or removed by inserting or modifying a regulatory polynucleotide operably linked to another polynucleotide that encodes a polypeptide having the target activity.
  • An activity also can be reduced or removed by (i) inhibiting a polynucleotide that encodes a polypeptide having the activity or (ii) inhibiting a polynucleotide operably linked to another polynucleotide that encodes a polypeptide having the activity.
  • a polynucleotide can be inhibited by a suitable technique known in the art, such as by contacting an RNA encoded by the
  • an activity also can be reduced or removed by contacting a polypeptide having the activity with a molecule that specifically inhibits the activity (e.g., enzyme inhibitor, antibody).
  • an activity can be reduced or removed by subjecting a host microorganism to a selective environment and screening for microorganisms that have a reduced level or removal of the target activity.
  • an untranslated ribonucleic acid, or a cDNA can be used to reduce the expression of a particular activity or enzyme.
  • a microorganism can be engineered by genetic modification to express a nucleic acid reagent that reduces the expression of an activity by producing an RNA molecule that is partially or substantially homologous to a nucleic acid sequence of interest which encodes the activity of interest.
  • the RNA molecule can bind to the nucleic acid sequence of interest and inhibit the nucleic acid sequence from performing its natural function, in certain embodiments.
  • the RNA may alter the nucleic acid sequence of interest which encodes the activity of interest in a manner that the nucleic acid sequence of interest is no longer capable of performing its natural function (e.g., the action of a ribozyme for example).
  • nucleotide sequences sometimes are added to, modified or removed from one or more of the nucleic acid reagent elements, such as the promoter, 5'UTR, target sequence, or 3'UTR elements, to enhance, potentially enhance, reduce, or potentially reduce transcription and/or translation before or after such elements are incorporated in a nucleic acid reagent.
  • one or more of the following sequences may be modified or removed if they are present in a 5'UTR: a sequence that forms a stable secondary structure (e.g., quadruplex structure or stem loop stem structure (e.g., EMBL sequences X12949, AF274954, AF139980, AF152961, S95936, U194144, AF116649 or substantially identical sequences that form such stem loop stem structures)); a translation initiation codon upstream of the target nucleotide sequence start codon; a stop codon upstream of the target nucleotide sequence translation initiation codon; an ORF upstream of the target nucleotide sequence translation initiation codon; an iron responsive element (IRE) or like sequence; and a 5' terminal oligopyrimidine tract (TOP, e.g., consisting of 5-15 pyrimidines adjacent to the cap).
  • a translational enhancer sequence and/or an internal ribosome entry site sometimes is inserted into a 5'UTR (e.g., EMBL nucleotide sequences J04513, X87949, M95825, M12783, AF025841, AF013263, AF006822, M17169, M13440, M22427, D14838 and M17446 and substantially identical nucleotide sequences).
  • An AU-rich element (e.g., AUUUA repeats) and/or a splicing junction that follows a non-sense codon sometimes is removed from or modified in a 3'UTR.
  • a polyadenosine tail sometimes is inserted into a 3'UTR if none is present, sometimes is removed if it is present, and adenosine moieties sometimes are added to or removed from a polyadenosine tail present in a 3'UTR.
  • some embodiments are directed to a process comprising: determining whether any nucleotide sequences that increase, potentially increase, reduce or potentially reduce translation efficiency are present in the elements, and adding, removing or modifying one or more of such sequences if they are identified.
  • Certain embodiments are directed to a process comprising: determining whether any nucleotide sequences that increase or potentially increase translation efficiency are not present in the elements, and incorporating such sequences into the nucleic acid reagent.
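  • The "determining whether ... sequences ... are present" step described above can be illustrated with a small sequence scan. The following is a minimal sketch, not the method of the application: the function names are hypothetical, and it checks only three of the features listed above (upstream start codons and a 5' terminal oligopyrimidine tract in a 5'UTR, and AUUUA motifs in a 3'UTR) using plain pattern matching.

```python
import re


def scan_five_prime_utr(utr: str) -> dict:
    """Report simple sequence features in a 5'UTR (RNA or DNA alphabet)."""
    rna = utr.upper().replace("T", "U")
    # 0-based positions of every AUG found in the UTR, i.e. upstream of
    # the start codon of the target sequence (overlaps allowed).
    upstream_starts = [m.start() for m in re.finditer(r"(?=AUG)", rna)]
    # 5' terminal oligopyrimidine tract: a run of 5 to 15 pyrimidines
    # beginning at the first nucleotide (adjacent to the cap).
    top = re.match(r"[CU]{5,15}", rna)
    return {
        "upstream_start_codons": upstream_starts,
        "top_tract": top.group(0) if top else None,
    }


def count_au_rich_motifs(utr: str) -> int:
    """Count AUUUA motifs in a 3'UTR, including overlapping copies."""
    rna = utr.upper().replace("T", "U")
    return len(re.findall(r"(?=AUUUA)", rna))
```

  • A hit from either function only flags a candidate for modification or removal; whether a given element actually affects translation would still need to be tested.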
  • an activity can be altered by modifying the nucleotide sequence of an ORF.
  • An ORF sometimes is mutated or modified (for example, by point mutation, deletion mutation, insertion mutation, PCR based mutagenesis and the like) to alter, enhance or increase, reduce, substantially reduce or eliminate the activity of the encoded protein or peptide.
  • the protein or peptide encoded by a modified ORF sometimes is produced in a lower amount or may not be produced at detectable levels, and in other embodiments, the product or protein encoded by the modified ORF is produced at a higher level (e.g., codons sometimes are modified so they are compatible with tRNAs preferentially used in the host organism or engineered organism).
  • an ORF nucleotide sequence sometimes is mutated or modified to alter the triplet nucleotide sequences used to encode amino acids (e.g., amino acid codon triplets, for example). Modification of the nucleotide sequence of an ORF to alter codon triplets sometimes is used to change the codon found in the original sequence to better match the preferred codon usage of the organism in which the ORF or nucleic acid reagent will be expressed. For example, the codon usage, and therefore the codon triplets encoded by a nucleic acid sequence from bacteria may be different from the preferred codon usage in eukaryotes like yeast or plants.
  • an ORF nucleotide sequence sometimes is modified to eliminate codon pairs and/or eliminate mRNA secondary structures that can cause pauses during translation of the mRNA encoded by the ORF nucleotide sequence.
  • Translational pausing sometimes occurs when nucleic acid secondary structures exist in an mRNA, and sometimes occurs due to the presence of codon pairs that slow the rate of translation by causing ribosomes to pause.
  • the use of lower abundance codon triplets can reduce translational pausing due to a decrease in the pause time needed to load a charged tRNA into the ribosome translation machinery.
  • the nucleotide sequence of interest can be altered to better suit the transcription and/or translational machinery of the host and/or genetically modified microorganism.
  • slowing the rate of translation by the use of lower abundance codons, which slow or pause the ribosome can lead to higher yields of the desired product due to an increase in correctly folded proteins and a reduction in the formation of inclusion bodies.
  • Codons can be altered and optimized according to the preferred usage by a given organism by determining the codon distribution of the nucleotide sequence donor organism and comparing the distribution of codons to the distribution of codons in the recipient or host organism. Techniques described herein (e.g., site directed mutagenesis and the like) can then be used to alter the codons accordingly. Comparisons of codon usage can be done by hand, or using nucleic acid analysis software commercially available to the artisan.
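  • The codon distribution comparison described above can be sketched in a few lines. This is an illustrative example under stated assumptions: the function names are hypothetical, frequencies are computed over all codons in an ORF rather than per amino acid, and in practice the donor and host distributions would come from many genes or a published codon usage table rather than a single ORF.

```python
from collections import Counter


def codon_frequencies(orf: str) -> dict:
    """Return the relative frequency of each codon in an in-frame ORF."""
    seq = orf.upper()
    if len(seq) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    counts = Counter(codons)
    total = len(codons)
    return {codon: n / total for codon, n in counts.items()}


def usage_differences(donor: dict, host: dict) -> dict:
    """Donor frequency minus host frequency for every codon seen in either."""
    return {
        codon: donor.get(codon, 0.0) - host.get(codon, 0.0)
        for codon in sorted(set(donor) | set(host))
    }
```

  • Codons with a large positive difference are used more in the donor than in the host and are candidates for replacement with a synonymous codon the host prefers.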
  • Modification of the nucleotide sequence of an ORF also can be used to correct codon triplet sequences that have diverged in different organisms.
  • certain yeast e.g., C.
  • the amino acid triplet CUG (e.g., CTG in the DNA sequence) to encode serine.
  • CUG typically encodes leucine in most organisms.
  • In order to maintain the correct amino acid in the resultant polypeptide or protein, the CUG codon must be altered to reflect the organism in which the nucleic acid reagent will be expressed.
  • the heterologous nucleotide sequence must first be altered or modified to the appropriate leucine codon. Therefore, in some embodiments, the nucleotide sequence of an ORF sometimes is altered or modified to correct for differences that have occurred in the evolution of the amino acid codon triplets between different organisms.
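  • The CUG correction described above amounts to replacing each in-frame CTG codon of the heterologous DNA sequence with a different leucine codon before expression in a host that reads CUG as serine. A minimal sketch follows; the function name is hypothetical, and the default replacement codon (TTG) is an arbitrary assumption, since the best leucine codon depends on the codon usage of the host.

```python
def recode_ctg(orf: str, replacement: str = "TTG") -> str:
    """Replace every in-frame CTG codon with another leucine codon."""
    # Leucine codons of the standard genetic code, excluding CTG itself.
    other_leucine_codons = {"TTA", "TTG", "CTT", "CTC", "CTA"}
    if replacement not in other_leucine_codons:
        raise ValueError("replacement must be a non-CTG leucine codon")
    seq = orf.upper()
    if len(seq) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Only whole codons are compared, so a CTG that spans two codons
    # (out of frame) is left untouched.
    return "".join(replacement if c == "CTG" else c for c in codons)
```

  • For example, `recode_ctg("ATGCTGGCTGAATAA")` changes the second codon to TTG but leaves the out-of-frame CTG inside `GCT GAA` unchanged.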
  • the nucleotide sequence can be left unchanged at a particular amino acid codon, if the amino acid encoded is a conservative or neutral change in amino acid when compared to the originally encoded amino acid.
  • an activity can be altered by modifying translational regulation signals, like a stop codon for example.
  • a stop codon at the end of an ORF sometimes is modified to another stop codon, such as an amber stop codon described above.
  • a stop codon is introduced within an ORF, sometimes by insertion or mutation of an existing codon.
  • An ORF comprising a modified terminal stop codon and/or internal stop codon often is translated in a system comprising a suppressor tRNA that recognizes the stop codon.
  • An ORF comprising a stop codon sometimes is translated in a system comprising a suppressor tRNA that incorporates an unnatural amino acid during translation of the target protein or target peptide.
  • Methods for incorporating unnatural amino acids into a target protein or peptide include, for example, processes utilizing a heterologous tRNA/synthetase pair, where the tRNA recognizes an amber stop codon and is loaded with an unnatural amino acid (e.g., World Wide Web URL iupac.org/news/prize/2003/wang.pdf).
  • nucleic acid reagent e.g., Promoter, 5' or 3' UTR, ORI, ORF, and the like
  • the modifications described above can alter a given activity by (i) increasing or decreasing feedback inhibition mechanisms, (ii) increasing or decreasing promoter initiation, (iii) increasing or decreasing translation initiation, (iv) increasing or decreasing translational efficiency, (v) modifying localization of peptides or products expressed from nucleic acid reagents described herein, (vi) increasing or decreasing the copy number of a nucleotide sequence of interest, or (vii) expression of an anti-sense RNA, RNAi, siRNA, ribozyme and the like.
  • alteration of a nucleic acid reagent or nucleotide sequence can alter a region involved in feedback inhibition (e.g., 5' UTR, promoter and the like).
  • a modification sometimes is made that can add or enhance binding of a feedback regulator and sometimes a modification is made that can reduce, inhibit or eliminate binding of a feedback regulator.
  • alteration of a nucleic acid reagent or nucleotide sequence can alter sequences involved in transcription initiation (e.g., promoters, 5' UTR, and the like).
  • a modification sometimes can be made that can enhance or increase initiation from an endogenous or heterologous promoter element.
  • a modification sometimes can be made that removes or disrupts sequences that increase or enhance transcription initiation, resulting in a decrease or elimination of transcription from an endogenous or heterologous promoter element.
  • alteration of a nucleic acid reagent or nucleotide sequence can alter sequences involved in translational initiation or translational efficiency (e.g., 5' UTR, 3' UTR, codon triplets of higher or lower abundance, translational terminator sequences and the like, for example).
  • a modification sometimes can be made that can increase or decrease translational initiation, modifying a ribosome binding site for example.
  • a modification sometimes can be made that can increase or decrease translational efficiency.
  • Removing or adding sequences that form hairpins and changing codon triplets to a more or less preferred codon are non-limiting examples of genetic modifications that can be made to alter translation initiation and translation efficiency.
  • alteration of a nucleic acid reagent or nucleotide sequence can alter sequences involved in localization of peptides, proteins or other desired products (e.g., adipic acid, for example).
  • a modification sometimes can be made that can alter, add or remove sequences responsible for targeting a polypeptide, protein or product to an intracellular organelle, the periplasm, cellular membranes, or extracellularly. Transport of a heterologous product to a different intracellular space or extracellularly sometimes can reduce or eliminate the formation of inclusion bodies (e.g., insoluble aggregates of the desired product).
  • Non-limiting examples of alterations that can increase the number of copies of a sequence of interest include, adding copies of the sequence of interest by duplication of regions in the genome (e.g., adding additional copies by recombination or by causing gene amplification of the host genome, for example), cloning additional copies of a sequence onto a nucleic acid reagent, or altering an ORI to increase the number of copies of an epigenetic nucleic acid reagent.
  • Non-limiting examples of alterations that can decrease the number of copies of a sequence of interest include, removing copies of the sequence of interest by deletion or disruption of regions in the genome, removing additional copies of the sequence from epigenetic nucleic acid reagents, or altering an ORI to decrease the number of copies of an epigenetic nucleic acid reagent.
  • Engineered microorganisms can be prepared by altering, introducing or removing nucleotide sequences in the host genome or in stably maintained epigenetic nucleic acid reagents, as noted above.
  • the nucleic acid reagents used to alter, introduce or remove nucleotide sequences in the host genome or epigenetic nucleic acids can be prepared using the methods described herein or available to the artisan.
  • the nucleic acid of interest may be extracted, isolated, purified or amplified from a sample (e.g., from an organism of interest or culture containing a plurality of organisms of interest, like yeast or bacteria for example).
  • isolated refers to nucleic acid removed from its original environment (e.g., the natural environment if it is naturally occurring, or a host cell if expressed exogenously), and thus is altered "by the hand of man" from its original environment.
  • An isolated nucleic acid generally is provided with fewer non-nucleic acid
  • a composition comprising isolated sample nucleic acid can be substantially isolated (e.g., about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater than 99% free of non-nucleic acid components).
  • purified refers to sample nucleic acid provided that contains fewer nucleic acid species than in the sample source from which the sample nucleic acid is derived.
  • a composition comprising sample nucleic acid may be substantially purified (e.g., about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater than 99% free of other nucleic acid species).
  • the term "amplified” as used herein refers to subjecting nucleic acid of a cell, organism or sample to a process that linearly or exponentially generates amplicon nucleic acids having the same or substantially the same nucleotide sequence as the nucleotide sequence of the nucleic acid in the sample, or portion thereof.
  • the nucleic acids used to prepare nucleic acid reagents as described herein can be subjected to fragmentation or cleavage.
  • PCR polymerase chain reaction
  • LCR ligase chain reaction
  • Non-limiting examples of PCR amplification methods include standard PCR, AFLP-PCR, Allele-specific PCR, Alu-PCR, Asymmetric PCR, Colony PCR, Hot start PCR, Inverse PCR (IPCR), In situ PCR (ISH), Intersequence-specific PCR (ISSR-PCR), Long PCR, Multiplex PCR, Nested PCR, Quantitative PCR, Reverse Transcriptase PCR (RT-PCR), Real Time PCR, Single cell PCR, Solid phase PCR, combinations thereof, and the like.
  • Reagents and hardware for conducting PCR are commercially available. Protocols for conducting the various types of PCR listed above are readily available to the artisan.
  • PCR conditions can be dependent upon primer sequences, target abundance, and the desired amount of amplification, and therefore, one of skill in the art may choose from a number of PCR protocols available (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202; and PCR Protocols: A Guide to Methods and Applications, Innis et al., eds, 1990).
  • PCR often is carried out as an automated process with a thermostable enzyme. In this process, the temperature of the reaction mixture is cycled through a denaturing region, a primer-annealing region, and an extension reaction region automatically. Machines specifically adapted for this purpose are commercially available.
  • a non-limiting example of a PCR protocol that may be suitable for embodiments described herein is treating the sample at 95 °C for 5 minutes; repeating forty-five cycles of 95 °C for 1 minute, 59 °C for 1 minute 10 seconds, and 72 °C for 1 minute 30 seconds; and then treating the sample at 72 °C for 5 minutes. Additional PCR protocols are described in the example section. Multiple cycles frequently are performed using a commercially available thermal cycler. Suitable isothermal amplification processes known and selected by the person of ordinary skill in the art also may be applied, in certain embodiments.
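  • The example protocol above can be written down as data, which makes the total programmed hold time easy to check. This is only an illustrative sketch: the variable and function names are hypothetical, and the calculation ignores the time the thermal cycler spends ramping between temperatures.

```python
# Each step is (label, temperature in degrees C, hold time in seconds).
INITIAL = ("initial denaturation", 95, 5 * 60)
CYCLE = [
    ("denature", 95, 60),   # 1 minute
    ("anneal", 59, 70),     # 1 minute 10 seconds
    ("extend", 72, 90),     # 1 minute 30 seconds
]
FINAL = ("final extension", 72, 5 * 60)
N_CYCLES = 45


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum the programmed hold times, excluding temperature ramping."""
    per_cycle = sum(step[2] for step in cycle)
    return initial[2] + n_cycles * per_cycle + final[2]


if __name__ == "__main__":
    seconds = total_hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
    print(f"{seconds} s ({seconds / 60:.0f} min)")
```

  • With the values above, each cycle holds for 220 seconds, so the programmed hold time is 10500 seconds (175 minutes).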
  • nucleic acids encoding polypeptides with a desired activity can be isolated by amplifying the desired sequence from an organism having the desired activity using oligonucleotides or primers designed based on sequences described herein. Amplified, isolated and/or purified nucleic acids can be cloned into the recombinant DNA vectors described in Figures herein or into suitable commercially available recombinant DNA vectors.
  • nucleic acid sequences prepared by isolation or amplification can be used, without any further modification, to add an activity to a microorganism and thereby generate a genetically modified or engineered microorganism.
  • nucleic acid sequences prepared by isolation or amplification can be genetically modified to alter (e.g., increase or decrease, for example) a desired activity.
  • nucleic acids used to add an activity to an organism sometimes are genetically modified to optimize the heterologous polynucleotide sequence encoding the desired activity (e.g., polypeptide or protein, for example).
  • optimize as used herein can refer to alteration to increase or enhance expression by preferred codon usage.
  • optimize can also refer to modifications to the amino acid sequence to increase the activity of a polypeptide or protein, such that the activity exhibits a higher catalytic activity as compared to the "natural" version of the polypeptide or protein.
  • Nucleic acid sequences of interest can be genetically modified using methods known in the art. Mutagenesis techniques are particularly useful for small scale (e.g., 1, 2, 5, 10 or more nucleotides) or large scale (e.g., 50, 100, 150, 200, 500, or more nucleotides) genetic modification.
  • Mutagenesis allows the artisan to alter the genetic information of an organism in a stable manner, either naturally (e.g., isolation using selection and screening) or experimentally by the use of chemicals, radiation or inaccurate DNA replication (e.g., PCR mutagenesis).
  • genetic modification can be performed by whole-scale synthesis of nucleic acids, using a native nucleotide sequence as the reference sequence, and modifying nucleotides that can result in the desired alteration of activity.
  • Mutagenesis methods sometimes are specific or targeted to specific regions or nucleotides (e.g., site-directed mutagenesis, PCR-based site-directed mutagenesis, and in vitro mutagenesis techniques such as transplacement and in vivo oligonucleotide site-directed mutagenesis, for example).
  • Mutagenesis methods sometimes are non-specific or random with respect to the placement of genetic modifications (e.g., chemical mutagenesis, insertion element (e.g., insertion or transposon elements) and inaccurate PCR based methods, for example).
  • Site directed mutagenesis is a procedure in which a specific nucleotide or specific nucleotides in a DNA molecule are mutated or altered. Site directed mutagenesis typically is performed using a nucleic acid sequence of interest cloned into a circular plasmid vector. Site-directed mutagenesis requires that the wild type sequence be known and used as a platform for the genetic alteration. Site-directed mutagenesis sometimes is referred to as oligonucleotide-directed mutagenesis because the technique can be performed using oligonucleotides which have the desired genetic
  • the wild type sequence and the altered nucleotide are allowed to hybridize and the hybridized nucleic acids are extended and replicated using a DNA polymerase.
  • the double stranded nucleic acids are introduced into a host (e.g., E. coli, for example) and further rounds of replication are carried out in vivo.
  • the transformed cells carrying the mutated nucleic acid sequence are then selected and/or screened for those cells carrying the correctly mutagenized sequence.
  • Cassette mutagenesis and PCR-based site-directed mutagenesis are further modifications of the site-directed mutagenesis technique.
  • Site-directed mutagenesis can also be performed in vivo (e.g., transplacement "pop-in pop-out", in vivo site-directed mutagenesis with synthetic oligonucleotides and the like, for example).
  • PCR-based mutagenesis can be performed using PCR with oligonucleotide primers that contain the desired mutation or mutations.
  • the technique functions in a manner similar to standard site-directed mutagenesis, with the exception that a thermocycler and PCR conditions are used to replace replication and selection of the clones in a microorganism host.
  • Because PCR-based mutagenesis also uses a circular plasmid vector, the amplified fragment (e.g., linear nucleic acid molecule) containing the incorporated genetic modifications can be separated from the plasmid containing the template sequence after a sufficient number of rounds of thermocycler amplification, using standard electrophoretic procedures.
  • a modification of this method uses linear amplification methods and a pair of mutagenic primers that amplify the entire plasmid.
  • the procedure takes advantage of the E. coli Dam methylase system which causes DNA replicated in vivo to be sensitive to the restriction endonuclease DpnI.
  • PCR synthesized DNA is not methylated and is therefore resistant to DpnI.
  • This approach allows the template plasmid to be digested, leaving the genetically modified, PCR synthesized plasmids to be isolated and transformed into a host bacteria for DNA repair and replication, thereby facilitating subsequent cloning and identification steps.
  • a certain amount of randomness can be added to PCR-based site-directed mutagenesis by using partially degenerate primers.
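  • To see how much randomness a partially degenerate primer introduces, the primer can be expanded into every concrete sequence it represents. The sketch below uses the standard IUPAC nucleotide ambiguity codes; the function name and the example primers in the note are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

  • A primer with one `N` position, such as `GCNA`, expands to four sequences, and each additional degenerate position multiplies the number of variants, so the size of the resulting mutant pool grows quickly with the number of degenerate positions.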
  • Recombination sometimes can be used as a tool for mutagenesis. Homologous recombination allows the artisan to specifically target regions of known sequence for insertion of heterologous nucleotide sequences using the host organism's natural DNA replication and repair enzymes.
  • Homologous recombination methods sometimes are referred to as "pop in pop out” mutagenesis, transplacement, knock out mutagenesis or knock in mutagenesis.
  • Integration of a nucleic acid sequence into a host genome is a single cross over event, which inserts the entire nucleic acid reagent (e.g., pop in).
  • a second cross over event excises all but a portion of the nucleic acid reagent, leaving behind a heterologous sequence, often referred to as a "footprint" (e.g., pop out).
  • Mutagenesis by insertion e.g., knock in
  • a disrupting heterologous nucleic acid e.g., knock out
  • Using nucleic acid reagents designed to provide the appropriate nucleic acid target sequences, the artisan can target a selectable nucleic acid reagent to a specific region, and then select for recombination events that "pop out" a portion of the inserted (e.g., "pop in") nucleic acid reagent.
  • Such methods take advantage of nucleic acid reagents that have been specifically designed with known target nucleic acid sequences at or near a nucleic acid or genomic region of interest.
  • Popping out typically leaves a "foot print" of left over sequences that remain after the
  • the method can be used to insert sequences upstream or downstream of genes, which can result in an enhancement or reduction in expression of the gene.
  • new genes can be introduced into the genome of a host organism using similar recombination or "pop in” methods.
  • An example of a yeast recombination system using the ura3 gene and 5-FOA was described briefly above and further detail is presented herein.
  • a method for modification is described in Alani et al., "A method for gene disruption that allows repeated use of URA3 selection in the construction of multiply disrupted yeast strains", Genetics 116(4):541-545 August 1987.
  • the original method uses a URA3 cassette with 1000 base pairs (bp) of the same nucleotide sequence cloned in the same orientation on either side of the URA3 cassette.
  • Targeting sequences of about 50 bp are added to each side of the construct.
  • the double stranded targeting sequences are complementary to sequences in the genome of the host organism.
  • the targeting sequences allow site-specific recombination in a region of interest.
  • the modification of the original technique replaces the two 1000 bp sequence direct repeats with two 200 bp direct repeats.
  • the modified method also uses 50 bp targeting sequences.
  • the modification reduces or eliminates recombination of a second knock out into the 1000 bp repeat left behind in a first mutagenesis, therefore allowing multiply knocked out yeast.
  • the 200 bp sequences used herein are uniquely designed, self-assembling sequences that leave behind identifiable footprints.
  • the technique used to design the sequences incorporates design features such as low identity to the yeast genome, and low identity to each other. Therefore a library of the self-assembling sequences can be generated to allow multiple knockouts in the same organism, while reducing or eliminating the potential for integration into a previous knockout.
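  • The "low identity to each other" criterion can be illustrated with a simple filter. The sketch below is not the design technique of the application: it generates random 200 bp candidates and keeps those whose ungapped, position-by-position identity to every sequence already in the library stays under a threshold. The function names, the 40% threshold, and the use of random sequences are all assumptions for illustration, and the comparison against the yeast genome is omitted because it would require genome data and an alignment tool.

```python
import random


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def build_library(n, length=200, max_identity=0.4, seed=0, max_tries=10000):
    """Collect up to n random sequences with low pairwise identity."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    library = []
    tries = 0
    while len(library) < n and tries < max_tries:
        tries += 1
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        # Keep the candidate only if it is dissimilar to every member so far.
        if all(identity(candidate, s) <= max_identity for s in library):
            library.append(candidate)
    return library
```

  • Ungapped identity is a crude stand-in for a real alignment score; a production design would more likely screen candidates with a local alignment search against both the library and the host genome.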
  • the URA3 cassette makes use of the toxicity of 5-FOA in yeast carrying a functional URA3 gene.
  • Uracil synthesis deficient yeast are transformed with the modified URA3 cassette, using standard yeast transformation protocols, and the transformed cells are plated on minimal media minus uracil.
  • PCR can be used to verify correct insertion into the region of interest in the host genome, and in certain embodiments the PCR step can be omitted. Inclusion of the PCR step can reduce the number of transformants that need to be counter selected to "pop out" the URA3 cassette.
  • the transformants (e.g., all or the ones determined to be correct by PCR, for example) can then be counter-selected on media containing 5-FOA, which will select for recombination out (e.g., popping out) of the URA3 cassette, thus rendering the yeast ura3 deficient again, and resistant to 5-FOA toxicity.
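  • The net effect of the URA3 selection and 5-FOA counter-selection steps above can be followed with a toy string model. This is a deliberately simplified sketch: the function names and placeholder "sequences" are hypothetical, integration is modeled as replacing whatever lies between the two targeting sequences with repeat-URA3-repeat, and pop out is modeled as collapsing that cassette to a single repeat (the footprint). Recombination frequencies and selection are not modeled.

```python
def pop_in(genome: str, left: str, right: str, repeat: str, marker: str) -> str:
    """Replace the region between the targeting sequences with the cassette."""
    i = genome.find(left)
    j = genome.find(right, i + len(left)) if i != -1 else -1
    if i == -1 or j == -1:
        raise ValueError("targeting sequences not found in the expected order")
    cassette = repeat + marker + repeat  # direct repeats flank the marker
    return genome[:i + len(left)] + cassette + genome[j:]


def pop_out(genome: str, repeat: str, marker: str) -> str:
    """Recombine the direct repeats, leaving one repeat as the footprint."""
    cassette = repeat + marker + repeat
    if cassette not in genome:
        raise ValueError("cassette not present in genome")
    return genome.replace(cassette, repeat, 1)
```

  • Starting from `"aaaLEFTGENERIGHTttt"`, `pop_in` with repeat `"rep"` and marker `"URA3"` gives `"aaaLEFTrepURA3repRIGHTttt"`, and `pop_out` then gives `"aaaLEFTrepRIGHTttt"`: the target gene is gone, the marker is gone, and a single repeat remains as the footprint.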
  • Targeting sequences used to direct recombination events to specific regions are presented herein.
  • a modification of the method described above can be used to integrate genes into the chromosome, where after recombination a functional gene is left in the chromosome next to the 200 bp footprint.
  • auxotrophic or dominant selection markers can be used in place of URA3 (e.g., an auxotrophic selectable marker), with the appropriate change in selection media and selection agents.
  • auxotrophic selectable markers are used in strains deficient for synthesis of a required biological molecule (e.g., amino acid or nucleoside, for example).
  • additional auxotrophic markers include HIS3, TRP1, LEU2, LEU2-d, and LYS2.
  • Certain auxotrophic markers (e.g., URA3 and LYS2) allow counter selection to select for the second recombination event that pops out all but one of the direct repeats of the recombination construct.
  • HIS3 encodes an activity involved in histidine synthesis.
  • TRP1 encodes an activity involved in tryptophan synthesis.
  • LEU2 encodes an activity involved in leucine synthesis.
  • LEU2-d is a low expression version of LEU2 that selects for increased copy number (e.g., gene or plasmid copy number, for example) to allow survival on minimal media without leucine.
  • LYS2 encodes an activity involved in lysine synthesis, and allows counter selection for recombination out of the LYS2 gene using alpha-amino adipate (α-amino adipate).
  • Dominant selectable markers are useful because they also allow industrial and/or prototrophic strains to be used for genetic manipulations. Additionally, dominant selectable markers provide the advantage that rich medium can be used for plating and culture growth, and thus growth rates are markedly increased.
  • Non-limiting examples of dominant selectable markers include Tn903 kanʳ, Cmʳ, Hygʳ, CUP1, and DHFR.
  • Tn903 kanʳ encodes an activity involved in kanamycin antibiotic resistance (e.g., typically neomycin phosphotransferase II or NPTII, for example).
  • Cmʳ encodes an activity involved in chloramphenicol antibiotic resistance (e.g., typically chloramphenicol acetyl transferase or CAT, for example).
  • Hygʳ encodes an activity involved in hygromycin resistance by phosphorylation of hygromycin B (e.g., hygromycin phosphotransferase, or HPT).
  • CUP1 encodes an activity involved in resistance to heavy metal (e.g., copper, for example) toxicity.
  • DHFR encodes a dihydrofolate reductase activity which confers resistance to methotrexate and sulfanilamide compounds.
  • random mutagenesis does not require any sequence information and can be accomplished by a number of widely different methods. Random mutagenesis often is used to generate mutant libraries that can be used to screen for the desired genotype or phenotype.
  • Non-limiting examples of random mutagenesis include chemical mutagenesis, UV-induced mutagenesis, insertion element or transposon-mediated mutagenesis, DNA shuffling, error-prone PCR mutagenesis, and the like.
  • Chemical mutagenesis often involves chemicals like ethyl methanesulfonate (EMS), nitrous acid, mitomycin C, N-methyl-N-nitrosourea (MNU), diepoxybutane (DEB), 1,2,7,8-diepoxyoctane (DEO), methyl methanesulfonate (MMS), N-methyl-N'-nitro-N-nitrosoguanidine (MNNG), 4-nitroquinoline 1-oxide (4-NQO), 2-methyloxy-6-chloro-9(3-[ethyl-
  • mutagenesis can be carried out in vivo. Sometimes the mutagenic process involves the use of the host organism's DNA replication and repair mechanisms to incorporate and replicate the mutagenized base or bases.
  • Base analog mutagenesis introduces a small amount of non-randomness to random mutagenesis, because specific base analogs can be chosen which can be incorporated at certain nucleotides in the starting sequence. Correction of the mispairing typically yields a known substitution.
  • Bromo-deoxyuridine (BrdU) can be incorporated into DNA and replaces T in the sequence. The host DNA repair and replication machinery can sometimes correct the defect, but sometimes will mispair the BrdU with a G.
  • UV induced mutagenesis is caused by the formation of thymidine dimers when UV light irradiates chemical bonds between two adjacent thymine residues.
  • Excision repair mechanisms of the host organism correct the lesion in the DNA, but occasionally the lesion is incorrectly repaired, typically resulting in a C to T transition.
  • Insertion element or transposon-mediated mutagenesis makes use of naturally occurring or modified naturally occurring mobile genetic elements.
  • Transposons often encode accessory activities in addition to the activities necessary for transposition (e.g., movement using a transposase activity, for example).
  • transposon accessory activities are antibiotic resistance markers (e.g., see Tn903 kanʳ described above, for example).
  • Insertion elements typically only encode the activities necessary for movement of the nucleic acid sequence. Insertion element and transposon mediated mutagenesis often can occur randomly, however specific target sequences are known for some transposons.
  • Mobile genetic elements like IS elements or Transposons (Tn) often have inverted repeats, direct repeats or both inverted and direct repeats flanking the region coding for the transposition genes.
  • Recombination events catalyzed by the transposase cause the element to remove itself from the genome and move to a new location, leaving behind a portion of an inverted or direct repeat.
  • Classic examples of transposons are the "mobile genetic elements” discovered in maize.
  • Transposon mutagenesis kits are commercially available which are designed to leave behind a 5 codon insert (e.g., Mutation Generation System kit, Finnzymes, World Wide Web URL finnzymes.us, for example). This allows the artisan to identify the insertion site, without fully disrupting the function of most genes.
  • DNA shuffling is a method which uses DNA fragments from members of a mutant library and reshuffles the fragments randomly to generate new mutant sequence combinations.
  • the fragments are typically generated using DNase I, followed by random annealing and re-joining using self priming PCR.
  • the DNA overhanging ends, from annealing of random fragments, provide "primer" sequences for the PCR process.
  • Shuffling can be applied to libraries generated by any of the above mutagenesis methods. Error prone PCR and its derivative rolling circle error prone PCR use increased magnesium and manganese concentrations in conjunction with limiting amounts of one or two nucleotides to reduce the fidelity of the Taq polymerase.
  • the error rate can be as high as 2% under appropriate conditions, when the resultant mutant sequence is compared to the wild type starting sequence.
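  • A quick calculation shows what an error rate of this size means for a library. The sketch below treats the 2% figure as an independent per-nucleotide mutation probability over the whole amplification, which is a simplifying assumption made here for illustration; the function names are hypothetical.

```python
def expected_mutations(length_bp: int, error_rate: float = 0.02) -> float:
    """Expected number of mutated positions in a sequence of given length."""
    return length_bp * error_rate


def fraction_unmutated(length_bp: int, error_rate: float = 0.02) -> float:
    """Probability that a product carries no mutation at any position."""
    return (1.0 - error_rate) ** length_bp
```

  • Under this assumption a 1000 bp coding sequence would carry about 20 mutations on average, and only about 13% of 100 bp products would be free of mutations, which is why lower error rates are often chosen when single substitutions per clone are wanted.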
  • the library of mutant coding sequences must be cloned into a suitable plasmid.
  • Although point mutations are the most common type of mutation in error-prone PCR, deletions and frameshift mutations are also possible.
  • Rolling circle error-prone PCR is a variant of error-prone PCR in which the wild-type sequence is first cloned into a plasmid and the whole plasmid is then amplified under error-prone conditions.
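  • As a rough illustration of what an error rate of up to 2% implies for a mutant library, the following minimal sketch estimates the mean number of mutations per copy and the fraction of copies left unmutated. It assumes each base mutates independently at the stated rate; the 1000 bp gene length and the function names are hypothetical and are not taken from the text.

```python
def expected_mutations(gene_length_bp: int, error_rate: float) -> float:
    """Mean number of mutations per copy: length times per-base error rate."""
    if gene_length_bp < 0 or not 0.0 <= error_rate <= 1.0:
        raise ValueError("length must be >= 0 and error_rate within [0, 1]")
    return gene_length_bp * error_rate


def fraction_unmutated(gene_length_bp: int, error_rate: float) -> float:
    """Probability that a copy carries no mutation (independent bases assumed)."""
    if gene_length_bp < 0 or not 0.0 <= error_rate <= 1.0:
        raise ValueError("length must be >= 0 and error_rate within [0, 1]")
    return (1.0 - error_rate) ** gene_length_bp


if __name__ == "__main__":
    length = 1000  # hypothetical coding sequence length in base pairs
    for rate in (0.001, 0.005, 0.02):
        mean = expected_mutations(length, rate)
        wild_type = fraction_unmutated(length, rate)
        print(f"rate={rate:.3f} mean mutations={mean:.1f} unmutated={wild_type:.2e}")
```

  • At the upper end of the range almost no copy remains wild type, which is one reason lower error rates are often chosen when single substitutions are of interest.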
  • organisms with altered activities can also be isolated using genetic selection and screening of organisms challenged on selective media or by identifying naturally occurring variants from unique environments.
  • 2-Deoxy-D-glucose is a toxic glucose analog. Growth of yeast on this substance yields mutants that are glucose-deregulated. A number of mutants have been isolated using 2-Deoxy-D-glucose including transport mutants, and mutants that ferment glucose and galactose simultaneously instead of glucose first then galactose when glucose is depleted. Similar techniques have been used to isolate mutant microorganisms that can metabolize plastics (e.g., from landfills), petrochemicals (e.g., from oil spills), and the like, either in a laboratory setting or from unique environments.
  • Similar methods can be used to isolate naturally occurring mutations in a desired activity when the activity exists at a relatively low or nearly undetectable level in the organism of choice, in some embodiments.
  • the method generally consists of growing the organism to a specific density in liquid culture, concentrating the cells, and plating the cells on various concentrations of the substance to which an increase in metabolic activity is desired.
  • the cells are incubated at a moderate growth temperature, for 5 to 10 days.
  • the plates can be stored for another 5 to 10 days at a low temperature.
  • the low temperature sometimes can allow strains that have gained or increased an activity to continue growing while other strains are inhibited for growth at the low temperature.
  • the plates can be replica plated on higher or lower concentrations of the selection substance to further select for the desired activity.
  • a native, heterologous or mutagenized polynucleotide can be introduced into a nucleic acid reagent for introduction into a host organism, thereby generating an engineered microorganism.
  • Standard recombinant DNA techniques can be used by the artisan to combine the mutagenized nucleic acid of interest into a suitable nucleic acid reagent capable of (i) being stably maintained by selection in the host organism, or (ii) being integrated into the genome of the host organism.
  • nucleic acid reagents comprise two replication origins to allow the same nucleic acid reagent to be manipulated in bacteria before introduction of the final product into the host organism (e.g., yeast or fungus).
  • Standard molecular biology and recombinant DNA methods available to one of skill in the art can be found in Maniatis, T., E. F. Fritsch and J. Sambrook (1982) Molecular Cloning: a Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
  • Nucleic acid reagents can be introduced into microorganisms using various techniques.
  • methods used to introduce heterologous nucleic acids into various organisms include transformation, transfection, transduction, electroporation, ultrasound-mediated transformation, particle bombardment and the like.
  • carrier molecules (e.g., bis-benzimidazolyl compounds; see US Patent 5595899)
  • Conventional methods of transformation are readily available to the artisan and can be found in Maniatis, T., E. F. Fritsch and J. Sambrook (1982) Molecular Cloning: a Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
  • Culture, Production and Process Methods
  • Engineered microorganisms often are cultured under conditions that optimize yield of a target molecule.
  • a non-limiting example of such a target molecule is ethanol.
  • Culture conditions often can alter (e.g., add, optimize, reduce or eliminate) activity of one or more of the following activities: phosphofructokinase activity, phosphogluconate dehydratase activity, 2-keto-3-deoxygluconate-6-phosphate aldolase activity, xylose isomerase activity, phosphoenolpyruvate carboxylase activity, alcohol dehydrogenase 2 activity, thymidylate synthase activities, 6-phosphogluconate dehydrogenase (decarboxylating), glucose-6-phosphate dehydrogenase, pyruvate decarboxylase, alcohol dehydrogenase 1, 6-phosphogluconolactonase, glutamate synthase, trehalose-6-phosphate syntha
  • Conditions also may be optimized to maximize carbon flux through engineered pathways.
  • the engineered activities increase the production of ethanol through the increased activities in a number of pathways (e.g., Entner-Doudoroff, Embden-Meyerhof, glycolysis, gluconeogenesis, pentose phosphate pathway, glutamate synthesis pathway).
  • the pathways are engineered to bias carbon flux through a pathway in the direction of the desired product.
  • the pathways utilized in the non-limiting examples presented herein were selected to maximize production of ethanol by regenerating or utilizing metabolic byproducts to internally generate additional carbon sources that can be further metabolized to produce ethanol.
  • Fermentation conditions can include several parameters, including without limitation, temperature, oxygen content, nutrient content (e.g., glucose content), pH, agitation level (e.g., revolutions per minute), gas flow rate (e.g., air, oxygen, nitrogen gas), redox potential, cell density (e.g., optical density), cell viability and the like.
  • a change in fermentation conditions (e.g., switching fermentation conditions) is an alteration, modification or shift of one or more fermentation parameters.
  • increasing or decreasing pH (e.g., adding or removing an acid, a base or carbon dioxide)
  • increasing or decreasing oxygen content (e.g., introducing air, oxygen, carbon dioxide, nitrogen)
  • adding or removing a nutrient (e.g., one or more sugars or sources of sugar, biomass, vitamin and the like)
  • Aerobic conditions often comprise greater than about 50% dissolved oxygen (e.g., about 52%, 54%, 56%, 58%, 60%, 62%, 64%, 66%, 68%, 70%, 72%, 74%, 76%, 78%, 80%, 82%, 84%, 86%, 88%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, or greater than any one of the foregoing).
  • Anaerobic conditions often comprise less than about 50% dissolved oxygen (e.g., about 1%, 2%, 4%, 6%, 8%, 10%, 12%, 14%, 16%, 18%, 20%, 22%, 24%, 26%, 28%, 30%, 32%, 34%, 36%, 38%, 40%, 42%, 44%, 46%, 48%, or less than any one of the foregoing).
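  • The dissolved oxygen definitions above can be sketched as a simple classifier. This is an illustrative sketch only: the 50% threshold is taken from the text, while the function names, the "boundary" label and the sample readings are hypothetical.

```python
from typing import Optional, Sequence


def classify_oxygen(dissolved_oxygen_pct: float, threshold_pct: float = 50.0) -> str:
    """Label a dissolved oxygen reading using the approximate 50% cut-off."""
    if not 0.0 <= dissolved_oxygen_pct <= 100.0:
        raise ValueError("dissolved oxygen must be a percentage from 0 to 100")
    if dissolved_oxygen_pct > threshold_pct:
        return "aerobic"
    if dissolved_oxygen_pct < threshold_pct:
        return "anaerobic"
    return "boundary"  # exactly at the threshold; the text only says "about 50%"


def stage_switch_index(readings: Sequence[float], threshold_pct: float = 50.0) -> Optional[int]:
    """Index of the first reading classified as anaerobic, or None if there is none."""
    for index, value in enumerate(readings):
        if classify_oxygen(value, threshold_pct) == "anaerobic":
            return index
    return None


if __name__ == "__main__":
    trace = [90.0, 70.0, 40.0, 5.0]  # hypothetical dissolved oxygen readings (%)
    print([classify_oxygen(v) for v in trace])
    print("first anaerobic reading at index", stage_switch_index(trace))
```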
  • Culture media generally contain a suitable carbon source.
  • Carbon sources may include, but are not limited to, monosaccharides (e.g., glucose, fructose, xylose), disaccharides (e.g., lactose, sucrose), oligosaccharides, polysaccharides (e.g., starch, cellulose, hemicellulose, other lignocellulosic materials or mixtures thereof), sugar alcohols (e.g., glycerol), and renewable feedstocks (e.g., cheese whey permeate, cornsteep liquor, sugar beet molasses, barley malt).
  • Carbon sources also can be selected from one or more of the following non-limiting examples: linear or branched alkanes (e.g., hexane), linear or branched alcohols (e.g., hexanol), fatty acids (e.g., about 10 carbons to about 22 carbons), esters of fatty acids, monoglycerides, diglycerides, triglycerides, phospholipids and various commercial sources of fatty acids including vegetable oils (e.g., soybean oil) and animal fats.
  • a carbon source may include one-carbon sources (e.g., carbon dioxide, methanol, formaldehyde, formate and carbon-containing amines) from which metabolic conversion into key biochemical intermediates can occur.
  • Nitrogen may be supplied from an inorganic (e.g., (NH4)2SO4) or organic source (e.g., urea or glutamate).
  • culture media also can contain suitable minerals, salts, cofactors, buffers, vitamins, metal ions (e.g., Mn2+, Co2+, Zn2+, Mg2+) and other components suitable for culture of microorganisms.
  • Engineered microorganisms sometimes are cultured in complex media (e.g., yeast extract-peptone-dextrose broth (YPD)).
  • engineered microorganisms are cultured in a defined minimal media that lacks a component necessary for growth and thereby forces selection of a desired expression cassette (e.g., Yeast Nitrogen Base (DIFCO Laboratories, Detroit, Mich.)).
  • Culture media in some embodiments are common commercially prepared media, such as Yeast Nitrogen Base (DIFCO Laboratories, Detroit, Mich.).
  • Other defined or synthetic growth media may also be used and the appropriate medium for growth of the particular microorganism is known.
  • yeast are cultured in YPD media (10 g/L Bacto Yeast Extract, 20 g/L Bacto Peptone, and 20 g/L Dextrose).
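  • The YPD composition above is given per liter, so scaling it to a working volume is simple arithmetic. A minimal sketch follows; the concentrations are those stated in the text, while the function name and the 0.5 L batch size are hypothetical.

```python
from typing import Dict

# Grams per liter, as listed for YPD in the text.
YPD_G_PER_L: Dict[str, float] = {
    "Bacto Yeast Extract": 10.0,
    "Bacto Peptone": 20.0,
    "Dextrose": 20.0,
}


def grams_needed(recipe_g_per_l: Dict[str, float], volume_l: float) -> Dict[str, float]:
    """Scale a per-liter recipe to the requested batch volume in liters."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: conc * volume_l for name, conc in recipe_g_per_l.items()}


if __name__ == "__main__":
    for component, grams in grams_needed(YPD_G_PER_L, 0.5).items():
        print(f"{component}: {grams:.1f} g")
```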
  • Filamentous fungi are grown in CM (Complete Medium) containing 10 g/L Dextrose, 2 g/L Bacto Peptone, 1 g/L Bacto Yeast Extract, 1 g/L Casamino acids, 50 mL/L 20X Nitrate Salts (120 g/L NaNO3, 10.4 g/L KCl, 10.4 g/L MgSO4·7H2O), 1 mL/L 1000X Trace Elements (22 g/L ZnSO4·7H2O, 11 g/L H3BO3, 5 g/L MnCl2·7H2O, 5 g/L FeSO4·7H2O, 1.7 g/L CoCl2·6H2O, 1.6 g/L CuSO4·5H2O, 1.5 g/L Na2MoO4·2H2O, and 50 g/L Na4EDTA), and 1 mL/L Vitamin Solution (100 mg each of Biotin
  • a suitable pH range for the fermentation often is between about pH 4.0 and about pH 8.0, where a pH in the range of about pH 5.5 to about pH 7.0 sometimes is utilized for initial culture conditions. Culturing may be conducted under aerobic or anaerobic conditions, where microaerobic conditions sometimes are maintained.
  • a two-stage process may be utilized, where one stage promotes microorganism proliferation and another stage promotes production of the target molecule. In a two-stage process, the first stage may be conducted under aerobic conditions (e.g., introduction of air and/or oxygen) and the second stage may be conducted under anaerobic conditions (e.g., air or oxygen are not introduced to the culture conditions).
  • a variety of fermentation processes may be applied for commercial biological production of a target product.
  • commercial production of a target product from a recombinant microbial host is conducted using a batch, fed-batch or continuous fermentation process, for example.
  • a batch fermentation process often is a closed system where the media composition is fixed at the beginning of the process and not subject to further additions beyond those required for
  • the media is inoculated with the desired organism and growth or metabolic activity is permitted to occur without adding additional sources (i.e., carbon and nitrogen sources) to the medium.
  • the metabolite and biomass compositions of the system change constantly up to the time the culture is terminated.
  • cells proceed through a static lag phase to a high-growth log phase and finally to a stationary phase, wherein the growth rate is diminished or halted. Left untreated, cells in the stationary phase will eventually die.
  • a variation of the standard batch process is the fed-batch process, where the carbon source is continually added to the fermentor over the course of the fermentation process.
  • Fed-batch processes are useful when catabolite repression is apt to inhibit the metabolism of the cells or where it is desirable to have limited amounts of carbon source in the media at any one time.
  • The carbon source concentration in fed-batch systems may be estimated on the basis of changes in measurable factors such as pH, dissolved oxygen and the partial pressure of waste gases (e.g., CO2).
  • Continuous cultures generally maintain cells in the log phase of growth at a constant cell density.
  • Continuous or semi-continuous culture methods permit the modulation of one factor or any number of factors that affect cell growth or end product concentration. For example, an approach may limit the carbon source and allow all other parameters to moderate metabolism. In some systems, a number of factors affecting growth may be altered continuously while the cell concentration, measured by media turbidity, is kept constant. Continuous systems often maintain steady state growth and thus the cell growth rate often is balanced against cell loss due to media being drawn off the culture. Methods of modulating nutrients and growth factors for continuous culture processes, as well as techniques for maximizing the rate of product formation, are known and a variety of methods are detailed by Brock, supra.
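  • The balance between cell growth and cell loss in a continuous culture can be sketched with the standard chemostat biomass balance dX/dt = (mu - D) X, where the dilution rate D is the medium flow divided by the culture volume. This model is a textbook assumption rather than something stated in the text, and the constant specific growth rate, parameter values and function names below are hypothetical.

```python
def dilution_rate(flow_l_per_h: float, volume_l: float) -> float:
    """Dilution rate D (per hour) = medium flow rate / culture volume."""
    if volume_l <= 0:
        raise ValueError("culture volume must be positive")
    return flow_l_per_h / volume_l


def simulate_biomass(x0: float, mu: float, d: float, hours: float, dt: float = 0.01) -> float:
    """Euler integration of dX/dt = (mu - D) * X with constant mu (a simplification)."""
    x = x0
    for _ in range(int(round(hours / dt))):
        x += (mu - d) * x * dt
    return x


if __name__ == "__main__":
    d = dilution_rate(flow_l_per_h=0.2, volume_l=1.0)
    # Growth equal to dilution holds the cell density steady; slower growth washes cells out.
    print("mu == D:", simulate_biomass(5.0, mu=0.2, d=d, hours=10.0))
    print("mu <  D:", simulate_biomass(5.0, mu=0.1, d=d, hours=10.0))
```

  • In a real culture the specific growth rate depends on the limiting nutrient, so the constant value used here only illustrates the steady-state condition.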
  • ethanol may be purified from the culture media or extracted from the engineered microorganisms.
  • Culture media may be tested for ethanol concentration and drawn off when the concentration reaches a predetermined level.
  • Detection methods are known in the art, including but not limited to the use of a hydrometer and infrared measurement of vibrational frequency of dissolved ethanol using the CH band at 2900 cm⁻¹.
  • Ethanol may be present at a range of levels as described herein.
  • a target product sometimes is retained within an engineered microorganism after a culture process is completed, and in certain embodiments, the target product is secreted out of the microorganism into the culture medium.
  • (i) culture media may be drawn from the culture system and fresh medium may be supplemented, and/or (ii) target product may be extracted from the culture media during or after the culture process is completed.
  • Engineered microorganisms may be cultured on or in solid, semi-solid or liquid media. In some embodiments media is drained from cells adhering to a plate. In certain embodiments, a liquid-cell mixture is centrifuged at a speed sufficient to pellet the cells but not disrupt the cells and allow extraction of the media, as known in the art. The cells may then be resuspended in fresh media.
  • Target product may be purified from culture media according to methods known in the art.
  • target product is extracted from the cultured engineered microorganisms.
  • the microorganism cells may be concentrated through centrifugation at speed sufficient to shear the cell membranes.
  • the cells may be physically disrupted (e.g., shear force, sonication) or chemically disrupted (e.g., contacted with detergent or other lysing agent).
  • the phases may be separated by centrifugation or other method known in the art and target product may be isolated according to known methods.
  • target product sometimes is provided in substantially pure form (e.g., 90% pure or greater, 95% pure or greater, 99% pure or greater or 99.5% pure or greater).
  • target product may be modified into any one of a number of downstream products.
  • ethanol may be derivatized or further processed to produce ethyl halides, ethyl esters, diethyl ether, acetic acid, ethyl amines, butadiene, solvents, food flavorings, distilled spirits and the like.
  • Target product may be provided within cultured microbes containing target product, and cultured microbes may be supplied fresh or frozen in a liquid media or dried. Fresh or frozen microbes may be contained in appropriate moisture-proof containers that may also be temperature controlled as necessary. Target product sometimes is provided in culture medium that is substantially cell-free. In some embodiments target product or modified target product purified from microbes is provided, and target product sometimes is provided in substantially pure form. In certain embodiments, ethanol can be provided in anhydrous or hydrous forms. Ethanol may be transported in a variety of containers including pints, quarts, liters, gallons, drums (e.g., 10 gallon or 55 gallon, for example) and the like.
  • a target product e.g., ethanol, succinic acid
  • a target product is produced with a yield of about 0.30 grams of target product, or greater, per gram of glucose added during a fermentation process (e.g., about 0.31 grams of target product per gram of glucose added, or greater; about 0.32 grams of target product per gram of glucose added, or greater; about 0.33 grams of target product per gram of glucose added, or greater; about 0.34 grams of target product per gram of glucose added, or greater; about 0.35 grams of target product per gram of glucose added, or greater; about 0.36 grams of target product per gram of glucose added, or greater; about 0.37 grams of target product per gram of glucose added, or greater; about 0.38 grams of target product per gram of glucose added, or greater; about 0.39 grams of target product per gram of glucose added, or greater; about 0.40 grams of target product per gram of glucose added, or greater; about 0.41 grams of target product per gram of glucose added, or greater; 0.42 grams of target product per gram
  • the maximum theoretical yield of ethanol from glucose is about 0.51 grams of ethanol produced per gram of glucose consumed (e.g., about 6% g/g). In certain embodiments, the maximum theoretical yield of ethanol from xylose is about 0.51 grams of ethanol produced per gram of xylose consumed (e.g., about 6% g/g). In some embodiments, the theoretical maximum mole to mole ratio of ethanol produced to glucose consumed is about 2 moles of ethanol produced per mole of glucose consumed. In some embodiments, the theoretical maximum mole to mole ratio of ethanol produced to pentose consumed is about 2 moles of ethanol produced per mole of pentose sugar consumed.
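  • The figure of about 0.51 grams of ethanol per gram of glucose follows from the stoichiometry of 2 moles of ethanol per mole of glucose. A minimal sketch of that arithmetic, and of expressing an observed yield as a percentage of the theoretical maximum, is given below. The molar masses are standard values not stated in the text, and the function names and the 0.40 g/g observed yield are hypothetical.

```python
M_GLUCOSE = 180.16  # g/mol, C6H12O6 (standard value, assumed)
M_ETHANOL = 46.07   # g/mol, C2H5OH (standard value, assumed)


def theoretical_mass_yield(mol_product_per_mol_substrate: float,
                           product_molar_mass: float,
                           substrate_molar_mass: float) -> float:
    """Grams of product per gram of substrate implied by a molar ratio."""
    return mol_product_per_mol_substrate * product_molar_mass / substrate_molar_mass


def percent_of_theoretical(observed_g_per_g: float, theoretical_g_per_g: float) -> float:
    """Observed yield expressed as a percentage of the theoretical maximum."""
    if theoretical_g_per_g <= 0:
        raise ValueError("theoretical yield must be positive")
    return 100.0 * observed_g_per_g / theoretical_g_per_g


if __name__ == "__main__":
    y_max = theoretical_mass_yield(2.0, M_ETHANOL, M_GLUCOSE)
    print(f"theoretical yield on glucose: {y_max:.3f} g/g")
    print(f"0.40 g/g is {percent_of_theoretical(0.40, y_max):.1f}% of theoretical")
```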
  • engineered strains described herein can produce between about 1% g/g and about 6% g/g of ethanol, dependent on feedstock and genetic background (e.g., about 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8 or 5.9% g/g).
  • engineered strains described herein produce between about 60% and 100% of theoretical maximum yield (e.g., about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95% or about 100%). In certain embodiments, engineered strains described herein yield between about a 1% and about a 10% increase in ethanol yield when compared to parental controls (e.g., about a 1% increase, about a 1.5% increase, about a 2% increase, about a 2.5% increase, about a 3% increase, about a 3.5% increase, about a 4% increase, about a 4.5% increase, about a 5% increase, about a 5.5% increase, about a 6% increase, about a 6.5% increase, about a 7% increase, about a 7.5% increase, about a 8% increase, about a 8.5% increase, about a 9% increase, about a 9.5% increase, and about a 10% increase).
  • engineered yeast strains described herein show between about a 1 fold (e.g., 1X) and about a 100 fold (e.g., 100X) increase in ethanol production when compared to a parental control under identical fermentation conditions (e.g., about 1X, 1.1X, 1.2X, 1.3X, 1.4X, 1.5X, 1.6X, 1.7X, 1.8X, 1.9X, 2.0X, 2.1X, 2.2X, 2.3X, 2.4X, 2.5X, 2.6X, 2.7X, 2.8X, 2.9X, 3.0X, 3.1X, 3.2X, 3.3X, 3.4X, 3.5X, 3.6X, 3.7X, 3.8X, 3.9X, 4.0X, 4.5X, 5.0X, 5.5X, 6.0X, 6.5X, 7.0X, 7.5X, 8.0X, 8.5X, 9.0X, 9.5X, 10X, 1
  • Genomic DNA from Zymomonas mobilis was obtained from the American Type Culture Collection (ATCC accession number 31821D-5).
  • the genes encoding phosphogluconate dehydratase EC 4.2.1.12 (referred to as "edd") and 2-keto-3-deoxygluconate-6-phosphate aldolase EC 4.1.2.14 (referred to as "eda") were isolated from the ZM4 genomic DNA using the following oligonucleotides:
  • The ZM4 eda gene:
  • The ZM4 edd gene:
  • E. coli genomic DNA was prepared using the Qiagen DNeasy blood and tissue kit according to the manufacturer's protocol.
  • the E. coli edd and eda constructs were isolated from E. coli genomic DNA using the following oligonucleotides:
  • The E. coli eda gene:
  • oligonucleotides set forth above were purchased from Integrated DNA Technologies ("IDT", Coralville, IA). These oligonucleotides were designed to incorporate a SpeI restriction
  • Cloning the edd and eda genes from ZM4 and E. coli genomic DNA was accomplished using the following procedure: About 100 ng of ZM4 or E. coli genomic DNA, 1 µM of the oligonucleotide primer set listed above, 2.5 U of PfuUltra High-Fidelity DNA polymerase (Stratagene), 300 µM dNTPs (Roche), and 1X PfuUltra reaction buffer were mixed in a final reaction volume of 50 µl.
  • a BIORAD DNA Engine Tetrad 2 Peltier thermal cycler was used for the PCR reactions and the following cycle conditions were used: 5 min denaturation step at 95 °C, followed by 30 cycles of 20 sec at 95 °C, 20 sec at 55 °C, and 1 min at 72 °C, and a final step of 5 min at 72 °C.
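  • The cycle conditions above can be written down as a small data structure, which makes it easy to estimate the programmed hold time. This is a sketch under stated assumptions: temperatures are read as degrees Celsius, ramp times between steps are ignored, and the class and variable names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    temp_c: float  # hold temperature in degrees Celsius
    seconds: int   # hold time in seconds


@dataclass(frozen=True)
class Stage:
    steps: Tuple[Step, ...]
    cycles: int = 1


def total_seconds(program: Tuple[Stage, ...]) -> int:
    """Sum of programmed hold times; ramping between temperatures is not included."""
    return sum(stage.cycles * sum(step.seconds for step in stage.steps) for stage in program)


# Program described for amplifying the edd and eda genes.
EDD_EDA_PROGRAM = (
    Stage((Step(95, 300),)),                                      # 5 min denaturation
    Stage((Step(95, 20), Step(55, 20), Step(72, 60)), cycles=30),  # 30 amplification cycles
    Stage((Step(72, 300),)),                                      # 5 min final step
)

if __name__ == "__main__":
    print(f"programmed hold time: {total_seconds(EDD_EDA_PROGRAM) / 60:.0f} min")
```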
  • the first approach was to remove translational pauses from the polynucleotide sequence by designing the gene to incorporate only codons that are preferred in yeast. This optimization is referred to as the "hot rod" optimization.
  • In the second approach, translational pauses that are present in the native organism gene sequence are matched in the heterologous expression host organism by substituting the codon usage pattern of that host organism. This optimization is referred to as the "matched" optimization.
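  • The idea of using only host-preferred codons can be sketched as a synonymous recoding of each codon. The sketch below is illustrative and is not the method of Larsen et al.: the preferred-codon table is a placeholder chosen for demonstration and would need to be replaced by a real yeast codon usage table, and the function names and test sequence are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR" "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TO_AA = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Placeholder "preferred" codon per amino acid (assumption, for illustration only).
PREFERRED = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGT", "Q": "CAA",
    "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC", "L": "TTG", "K": "AAG",
    "M": "ATG", "F": "TTC", "P": "CCA", "S": "TCT", "T": "ACT", "W": "TGG",
    "Y": "TAC", "V": "GTT", "*": "TAA",
}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def hot_rod(cds: str) -> str:
    """Replace every codon with the preferred synonymous codon from PREFERRED."""
    return "".join(PREFERRED[aa] for aa in translate(cds))


if __name__ == "__main__":
    native = "ATGCTATCGAAATAG"  # hypothetical 5-codon sequence: M L S K stop
    recoded = hot_rod(native)
    print(native, "->", recoded)
    print("protein unchanged:", translate(native) == translate(recoded))
```

  • A "matched" design would additionally need codon usage frequencies for both the native and the host organism, which is beyond this sketch.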
  • coli native are shown in Figure 6. Certain sequences in Figure 6 are presented at the end of this Example 1.
  • the matched version of ZM4 edd and ZM4 eda genes were synthesized by IDT, and the hot rod version was constructed using methods described in Larsen et al. (Int. J. Bioinform. Res. Appl; 2008:4[3]; 324-336).
  • each version of each edd and eda gene was inserted into the yeast expression vector p426GPD (GPD promoter, 2 micron, URA3) (ATCC accession number 87361) between the SpeI and XhoI cloning sites.
  • Each version of the eda gene was also inserted into the SpeI and XhoI sites of the yeast expression vector p425GPD (GPD promoter, 2 micron, LEU2) (ATCC accession number 87359).
  • This strain has a deletion of the his3 gene, an imidazoleglycerol-phosphate dehydratase which catalyzes the sixth step in histidine biosynthesis; a deletion of the leu2 gene, a beta-isopropylmalate dehydrogenase which catalyzes the third step in the leucine biosynthesis pathway; a deletion of the lys2 gene, an alpha aminoadipate reductase which catalyzes the fifth step in biosynthesis of lysine; and a deletion of the ura3 gene, an orotidine-5'-phosphate decarboxylase which catalyzes the sixth enzymatic step in the de novo biosynthesis of pyrimidines.
  • the genotype of BY4742 makes it an auxotroph for histidine, leucine, lysine and uracil.
  • Transformation of the p426GPD plasmids containing an edd or an eda variant gene into yeast strain BY4742 was accomplished using the Zymo Research frozen-EZ yeast transformation II kit according to the manufacturer's protocol.
  • the transformed BY4742 cells were selected by growth on a synthetic dextrose medium (SD) (0.67% yeast nitrogen base-2% dextrose) containing complete amino acids minus uracil (Krackeler Scientific Inc). Plates were incubated at about 30 °C for about 48 hours. Transformant colonies for each edd and eda variant were inoculated into 5 ml of SD minus uracil medium and cells were grown at about 30 °C and shaken at about 250 rpm for about 24 hours.
  • Transformation of combined edd (in p426GPD) and eda (in p425GPD) constructs was accomplished with the Zymo Research frozen-EZ yeast transformation II kit based on the manufacturer's protocol.
  • p425GPD and p426GPD vectors were also transformed into BY4742.
  • Transformants (16 different combinations total including the variant edd and eda combinations plus vector controls) were selected on synthetic dextrose medium (SD) (0.67% yeast nitrogen base-2% dextrose) containing complete amino acids minus uracil and leucine.
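  • The selection logic used here (an auxotrophic host rescued by plasmid-borne markers) can be sketched as a set comparison. The BY4742 requirements are those listed in the text; the mapping of p426GPD to uracil prototrophy and p425GPD to leucine prototrophy reflects the selections described above, and the function and variable names are hypothetical.

```python
from typing import Iterable

# Nutrients BY4742 cannot synthesize, per the genotype described in the text.
BY4742_AUXOTROPHIES = frozenset({"histidine", "leucine", "lysine", "uracil"})

# Nutrient requirement relieved by each vector's selectable marker.
PLASMID_COMPLEMENTS = {"p426GPD": "uracil", "p425GPD": "leucine"}


def can_grow(plasmids: Iterable[str], omitted_nutrients: Iterable[str]) -> bool:
    """True if no nutrient still required by the transformant is missing from the medium."""
    rescued = {PLASMID_COMPLEMENTS[name] for name in plasmids}
    still_required = BY4742_AUXOTROPHIES - rescued
    return still_required.isdisjoint(set(omitted_nutrients))


if __name__ == "__main__":
    dropout = {"uracil", "leucine"}  # SD minus uracil and leucine
    print("both plasmids:", can_grow({"p426GPD", "p425GPD"}, dropout))
    print("p426GPD only :", can_grow({"p426GPD"}, dropout))
    print("no plasmid   :", can_grow(set(), dropout))
```

  • Only cells carrying both vectors are expected to grow on medium lacking both uracil and leucine, which is why that medium selects for the edd and eda combinations.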
  • Transformants of edd and eda variant combinations were inoculated into 5 ml of SD minus uracil and leucine and cells were grown at about 30 °C in shaker flasks at about 250 rpm for about 24 hours. Fresh overnight culture was used to inoculate about 100 ml of SD media minus uracil and leucine (containing about 0.01 g ergosterol/L and about 400 µl of Tween80) to an initial inoculum OD600nm of about 0.1 and grown anaerobically at about 30 °C for approximately 14 hours until cells reached an OD600nm of 3-4. The cells were centrifuged at about 3000 g for about 10 minutes.
  • the cells were then washed with 25 ml deionized H2O and centrifuged at 3000 g for 10 min. The cells were resuspended (at about 2 ml/g of cell pellet) in lysis buffer (50 mM Tris-Cl pH 7, 10 mM MgCl2, 1X Calbiochem protease inhibitor cocktail set III). Approximately 900 µl of glass beads were added and cells were lysed by vortexing at maximum speed for 4 x 30 seconds. Cell lysate was removed from the glass beads, placed into fresh tubes and spun at about 10,000 g for about 10 minutes at about 4 °C. The supernatant containing whole cell extract (WCE) was transferred to a fresh tube.
  • WCE protein concentrations were measured using the Coomassie Plus Protein Assay (Thermo Scientific) according to the manufacturer's directions. A total of about 750 µg of WCE was used for the edd and eda coupled assay. For this assay, about 750 µg of WCE was mixed with about 2 mM 6-phosphogluconate and about 4.5 U lactate dehydrogenase in a final volume of about 400 µl. A total of about 100 µl of NADH was added to this reaction to a final molarity of about 0.3 mM, and NADH oxidation was monitored for about 10 minutes at about 340 nm using a DU800 spectrophotometer.
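  • In this coupled assay, pyruvate released by the combined edd and eda activities is reduced by lactate dehydrogenase, so the rate of NADH oxidation reports on pathway activity. A minimal sketch of converting an absorbance slope into a specific activity is shown below. It assumes the commonly used NADH extinction coefficient of 6.22 per mM per cm at 340 nm and a 1 cm path length, neither of which is stated in the text; the function names and the example slope are hypothetical.

```python
NADH_EXTINCTION_PER_MM_CM = 6.22  # assumed literature value at 340 nm


def nadh_rate_mm_per_min(a340_decrease_per_min: float, path_cm: float = 1.0) -> float:
    """Convert the rate of absorbance decrease into mM NADH oxidized per minute."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return a340_decrease_per_min / (NADH_EXTINCTION_PER_MM_CM * path_cm)


def specific_activity(a340_decrease_per_min: float, reaction_volume_ml: float,
                      protein_mg: float, path_cm: float = 1.0) -> float:
    """Micromoles of NADH oxidized per minute per mg of protein."""
    if reaction_volume_ml <= 0 or protein_mg <= 0:
        raise ValueError("volume and protein amount must be positive")
    # 1 mM equals 1 micromole per ml, so mM/min times ml gives micromoles/min.
    umol_per_min = nadh_rate_mm_per_min(a340_decrease_per_min, path_cm) * reaction_volume_ml
    return umol_per_min / protein_mg


if __name__ == "__main__":
    # Hypothetical slope; 0.4 ml reaction and 0.75 mg extract echo the amounts in the text.
    activity = specific_activity(0.0622, reaction_volume_ml=0.4, protein_mg=0.75)
    print(f"specific activity: {activity:.4f} umol/min/mg")
```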
  • Example 2 Inactivation of the Embden-Meyerhof Pathway in Yeast
  • Saccharomyces cerevisiae strain YGR240CBY4742 was obtained from the ATCC (accession number 4015893). This strain is genetically identical to S. cerevisiae strain BY4742, except that YGR240C, the gene encoding the PFK1 enzyme, which is the alpha subunit of heterooctameric phosphofructokinase, has been deleted. A DNA construct designed to delete the gene encoding the PFK2 enzyme via homologous recombination was prepared. This construct substituted the gene encoding HIS3 (imidazoleglycerol-phosphate dehydratase, an enzyme required for synthesis of histidine) for the PFK2 gene.
  • the DNA construct comprised, in the 5' to 3' direction, 100 bases of the 5' end of the open reading frame of PFK2, followed by the HIS3 promoter, HIS3 open reading frame and HIS3 terminator, and 100 bases of the 3' end of the PFK2 open reading frame.
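  • The layout of this deletion construct (100 bases of target homology on each side of a marker cassette) can be sketched with simple string operations. The sequences below are synthetic placeholders rather than real PFK2 or HIS3 sequences, and the function name is hypothetical.

```python
def build_deletion_cassette(target_orf: str, marker_cassette: str, homology_bp: int = 100) -> str:
    """Return 5' homology arm + marker cassette + 3' homology arm of the target ORF."""
    if homology_bp <= 0:
        raise ValueError("homology length must be positive")
    if len(target_orf) < 2 * homology_bp:
        raise ValueError("target ORF is too short for two non-overlapping homology arms")
    five_prime_arm = target_orf[:homology_bp]
    three_prime_arm = target_orf[-homology_bp:]
    return five_prime_arm + marker_cassette + three_prime_arm


if __name__ == "__main__":
    # Placeholder sequences: each letter run stands in for a real region.
    orf = "A" * 100 + "C" * 500 + "G" * 100   # stand-in for the target open reading frame
    marker = "T" * 50                          # stand-in for promoter + ORF + terminator
    cassette = build_deletion_cassette(orf, marker)
    print(len(cassette), cassette[:5], cassette[100:105], cassette[-5:])
```

  • In the two-round PCR described in the text, the homology arms are supplied by long primers rather than by cutting the target sequence, so this sketch only shows the final arrangement.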
  • This construct was prepared by two rounds of PCR. In the first round, about 100 ng of BY4742 genomic DNA was used as a template. The genomic DNA was prepared from cells using the Zymo Research Yeastar kit according to the manufacturer's instructions. PCR was performed using the following primers:
  • the PCR reaction conditions were the same as those set forth in Example 1 for preparing the edd and eda genes.
  • the second round of PCR reaction was performed with the following primer set: 5'-atgactgttactactcctttlgtgaatggtacttcttattgtaccgtcactgcatatlccgttcaatcttataaa-3' (SEQ ID NO:11) and 5'-ttaatcaactctctttcttccaaccaaatggtcagcaatgagtctggtagcttgccagtgaatgacctttggcat-3' (SEQ ID NO:12)
  • PCR conditions for this reaction were the same as for the first reaction immediately above.
  • the final PCR product was separated by agarose gel electrophoresis, excised, and purified using MP Biomedicals Geneclean II kit according to the manufacturer's instructions.
  • Approximately 2 of the purified DNA was used for transformation of the yeast strain YGR240CBY4742 by the lithium acetate procedure as described by Schiestl and Gietz with an additional recovery step added after the heat shock step.
  • YP-Ethanol (1% yeast extract-2% peptone-2% ethanol) and incubated at 30 °C for 2 hours prior to plating on selective media containing SC-Ethanol (0.67% yeast nitrogen base-2% ethanol) containing complete amino acids minus histidine.
  • the YGR240CBY4742ΔPFK2 strain was used for transformation of the combination of edd-p426 GPD (edd variants in p426 GPD) and eda-p425 GPD (eda variants in p425 GPD) variant constructs.
  • a total of 16 combinations of edd-p426 GPD and eda-p425 GPD variant constructs were tested.
  • Each combination was transformed into YGR240CBY4742ΔPFK2.
  • µg of edd-p426 GPD and µg of eda-p425 GPD was used. All transformants from each edd-p426 GPD and eda-p425 GPD construct combination were selected on SC-Ethanol
  • a complementation test for growth of the YGR240CBY4742ΔPFK2 strain on YPD (1% yeast extract-2% peptone-2% dextrose) and YPGluconate (1% yeast extract-2% peptone-2% gluconate) was performed.
  • Viable colonies of edd-p426 GPD and eda-p425 GPD variant construct combinations grown on SC-Ethanol minus uracil and leucine were patched to plates containing SC-Ethanol minus uracil and leucine and incubated at 30 °C for 48 hrs. These patches were used to inoculate 5 ml of YPD media to an initial inoculum OD600nm of 0.1 and the cells were grown anaerobically at 30 °C for 3 to 7 days.
  • Total genomic DNA from Zymomonas mobilis was obtained from ATCC (ATCC Number 31821).
  • the Z. mobilis gene encoding the enzyme phosphoenolpyruvate carboxylase ("PEP carboxylase") was isolated from this genomic DNA and cloned using PCR amplification.
  • PCR was performed in a total volume of about 50 micro-liters in the presence of about 20 nanograms of Z. mobilis genomic DNA, about 0.2 mM of 5' forward primer, about 0.2 mM of 3' reverse primer, about 0.2 mM of dNTP, about 1 micro-liter of PfuUltra II DNA polymerase (Stratagene, La Jolla, CA), and 1X PCR buffer (Stratagene, La Jolla, CA).
  • PCR was carried out in a thermocycler using the following program: Step One "95°C for 10 minutes" for 1 cycle, followed by Step Two "95°C for 20 seconds, 65°C for 30 seconds, and 72°C for 45 seconds" for 35 cycles, followed by Step Three "72°C for 5 minutes" for 1 cycle, and then Step Four "4°C Hold" to stop the reaction.
  • the primers for the PCR reaction were:
  • the DNA sequence of native Z. mobilis PEP carboxylase is set forth as SEQ ID NO:20.
  • the cloned gene was inserted into the vector pGPD426 (ATCC Number: 87361) in between the SpeI and XhoI sites.
  • the final plasmid containing the PEP carboxylase gene was named pGPD426 PEPC.
  • pGPD426 N-his PEPC was constructed to insert a six-histidine tag at the N-terminus of the PEPC sequence for protein expression verification in yeast.
  • This plasmid was constructed using two rounds of PCR to extend the 5' end of the PEPC gene to incorporate a six-histidine tag at the N-terminus of the PEPC protein.
  • the two 5' forward primers used sequentially were:
  • the same 3' primer was used as described above.
  • the PCR was performed in a total volume of about 50 micro-liters in the presence of about 20 nanograms of Z. mobilis PEP carboxylase polynucleotide, about 0.2 mM of 5' forward primer, about 0.2 mM of 3' reverse primer, about 0.2 mM of dNTP, about 1 micro-liter of Pfu Ultra II DNA polymerase (Stratagene, La Jolla, CA), and 1X PCR buffer (Stratagene, La Jolla, CA).
  • the PCR was carried out in a thermocycler using the following program: Step One "95°C for 10 minutes" for 1 cycle, followed by Step Two "95°C for 20 seconds, 65°C for 30 seconds, and 72°C for 45 seconds" for 35 cycles, followed by Step Three "72°C for 5 minutes" for 1 cycle, and then Step Four "4°C Hold" to stop the reaction.
  • the PEPC coding sequence was optimized to incorporate frequently used codons obtained from yeast glycolytic genes.
  • the resulting PEP carboxylase amino acid sequence remains identical to the wild type.
  • the codon optimized PEP carboxylase DNA sequence was ordered from IDT and was inserted into the vector pGPD426 at the SpeI and XhoI sites.
  • the final plasmid containing the codon optimized PEP carboxylase gene was named pGPD426 PEPC_opti.
  • a similar plasmid, named pGPD426 N-his PEPC_opti was constructed to insert a six-histidine tag at the N-terminus of the optimized PEPC gene for protein expression verification in yeast.
  • the 3' reverse primer sequence used for both PCR reactions was:
  • PCR reactions were carried out in a thermocycler using the following program: Step One "95°C for 10 minutes" for 1 cycle, followed by Step Two "95°C for 20 seconds, 65°C for 30 seconds, and 72°C for 45 seconds" for 35 cycles, followed by Step Three "72°C for 5 minutes" for 1 cycle, and then Step Four "4°C Hold" to stop the reaction.
  • Saccharomyces cerevisiae strain BY4742 was cultured in YPD medium to an OD of about 1.0, and then prepared for transformation using the Frozen-EZ Yeast Transformation II kit (Zymo Research, Orange, CA) and following the manufacturer's instructions.
  • yeast mutant strains YKR097W (ATCC Number 4016013, ΔPCK, in which the phosphoenolpyruvate carboxykinase gene is deleted), YGL062W (ATCC Number 4014429, ΔPYC1, in which the pyruvate carboxylase 1 gene is deleted), and YBR218C (ATCC Number 4013358, ΔPYC2, in which the pyruvate carboxylase 2 gene is deleted).
  • the transformed yeast cells were grown aerobically in a shake flask in synthetic complete medium minus uracil (see Example IV) containing 1% glucose to mid-log phase (an OD of 2.0).
  • Example 4 Production of Pentose Sugar Utilizing Yeast Cells
  • the full length gene encoding the enzyme xylose isomerase from Ruminococcus flavefaciens strain 17 also known as Ruminococcus flavefaciens strain Siijpesteijn 1948
  • a substitution at position 513 in which cytidine was replaced by guanosine
  • IDT Integrated DNA Technologies, Inc.
  • the sequence of this gene is set forth below as SEQ ID NO:22.
  • PCR was used to engineer a unique SpeI restriction site into the 5' end of each of the xylose isomerase genes, and to engineer a unique XhoI restriction site at the 3' end.
  • a version of each gene was created that contained a 6-HIS tag at the 3' end of each gene to enable detection of the proteins using Western analysis.
  • PCR amplifications were performed in about 50 µl reactions containing 1X PfuUltra II reaction buffer (Stratagene, San Diego, CA), 0.2 mM dNTPs, 0.2 µM specific 5' and 3' primers, and 1 U PfuUltra II polymerase (Stratagene, San Diego, CA). The reactions were cycled at 95°C for 10 minutes, followed by 30 rounds of amplification (95°C for 30 seconds, 62°C for 30 seconds, 72°C for 30 seconds) and a final extension incubation at 72°C for 5 minutes. Amplified PCR products were cloned into pCR Blunt II TOPO (Life Sciences, Carlsbad, CA) and confirmed by sequencing (GeneWiz, La Jolla, CA). The PCR primers for these reactions were:
  • the xylose isomerase gene from Piromyces, strain E2 was synthesized by IDT.
  • the sequence of this gene is set forth below as SEQ ID NO: 24.
  • Two hot rod (“HR") versions of the Piromyces xylose isomerase gene were prepared using the method of Larsen et al., supra.
  • One version contained DNA sequence encoding a 6-histidine tag at the 5' terminus and the other did not.
  • the annealing temperature for the self-assembling oligonucleotides was about 48 degrees Celsius. The sequence of this gene is set forth below as
  • a unique SpeI restriction site was engineered at the 5' end of each of the XI genes, and a unique XhoI restriction site was engineered at the 3' end.
  • a 6-HIS tag was engineered at the 3' end of each gene sequence to enable detection of the proteins using Western analysis.
  • the primers are listed in Table X. PCR amplifications were performed in 50 µl reactions containing 1X PfuUltra II reaction buffer (Stratagene, San Diego, CA), 0.2 mM dNTPs, 0.2 µM specific 5' and 3' primers, and 1 U PfuUltra II polymerase (Stratagene, San Diego, CA).
  • the reactions were cycled at 95°C for 10 minutes, followed by 30 rounds of amplification (95°C for 30 seconds, 62°C for 30 seconds, 72°C for 30 seconds) and a final extension incubation at 72°C for 5 minutes.
  • Amplified PCR products were cloned into pCR Blunt II TOPO (Life Sciences, Carlsbad, CA) and confirmed by sequencing (GeneWiz, La Jolla, CA).
  • the primers used for PCR were:
  • Saccharomyces cerevisiae strain BY4742 cells (ATCC catalog number 201389) were cultured in YPD media (10 g Yeast Extract, 20 g Bacto-Peptone, 20 g Glucose, 1 L total) at about 30°C.
  • transformed cells containing the various xylose isomerase constructs were selected from the cultures and grown in about 100 ml of SC-Dextrose (minus uracil) to an OD600 of about 4.0.
  • SC-Dextrose minus uracil
  • the S. cerevisiae cultures that were transformed with the various xylose isomerase-histidine constructs were then lysed using YPER-Plus reagent (Thermo
  • the membrane was washed in 1X PBS (EMD, San Diego, CA), 0.05% Tween-20 (Fisher Scientific, Fairlawn, NJ) for 2-5 minutes with gentle shaking.
  • the membrane was blocked in 3% BSA dissolved in 1X PBS and 0.05% Tween-20 at room temperature for about 2 hours with gentle shaking.
  • the membrane was washed once in 1X PBS and 0.05% Tween-20 for about 5 minutes with gentle shaking.
  • the membrane was then incubated at room temperature with the 1:5000 dilution of primary antibody (Ms mAB to 6x His Tag, AbCam, Cambridge, MA) in 0.3% BSA (Fraction V, EMD, San Diego, CA) dissolved in 1X PBS and 0.05% Tween-20 with gentle shaking. Incubation was allowed to proceed for about 1 hour with gentle shaking. The membrane was then washed three times for 5 minutes each with 1X PBS and 0.05% Tween-20 with gentle shaking. The secondary antibody [Dnk pAb to Ms IgG (HRP), AbCam, Cambridge, MA] was used at
  • Example 5 Preparation of Selective Growth Yeast
  • the yeast gene cdc21 encodes thymidylate synthase, which is required for de novo synthesis of pyrimidine deoxyribonucleotides.
  • a cdc21 mutant, strain 17206 (ATCC accession number 208583), has a point mutation G139S relative to the initiating methionine.
  • the restrictive temperature of this temperature sensitive mutant is 37°C, which arrests cell division at S phase, so that little or no cell growth and division occurs at or above this temperature.
  • Saccharomyces cerevisiae strain YGR420CBY4742ΔPFK2 was used as the starting cell line to create the cdc21 growth sensitive mutant.
  • a construct for homologous recombination was prepared to replace the wild-type thymidylate synthase of YGR420CBY4742ΔPFK2 with the cdc21 mutant. This construct was made in several steps. First, the cdc21 mutant region from Saccharomyces cerevisiae strain 17206 was PCR amplified using the following primers:
  • the genomic DNA of BR214-4a (ATCC accession number 208600) was extracted using the Zymo Research YeaStar Genomic DNA kit according to instructions.
  • the lys2 gene with promoter and terminator regions was PCR amplified from BR214-4a genomic DNA using the following primers:
  • Lys2Fwd 5'-tgctaatgacccgggaattccacttgcaattacataaaaattccggcgg-3'
  • Lys2Rev 5'-atgatcattgagctcagcttcgcaagtattcattttagacccatggtgg-3'.
  • the PCR cycle was identical to that just described above but with genomic DNA of BR214-4a instead.
  • XmaI and SacI restriction sites were designed to flank this DNA construct to clone it into the XmaI and SacI sites of the PUC19-cdc21 vector according to standard cloning procedures described by Maniatis in Molecular Cloning.
  • the new construct with the cdc21 mutation with a lys2 directly downstream of that will be referred to as PUC19-cdc21-lys2.
  • the final step involved the cloning of the downstream region of thymidylate synthase into the PUC19-cdc21-lys2 vector immediately downstream of the lys2 gene.
  • the downstream region of the thymidylate synthase was amplified from BY4742 genomic DNA (ATCC accession number 201389D-5) using the following primers: ThymidylateSynthase_DownFwd: 5'-tgctaatgagagctctcattttttggtgcgatatgttttttggttgatg-3' and
  • ThymidylateSynthase_DownRev 5'-aatgatcatgagctcgtcaacaagaactaaaaaattgttcaaaatgc-3'
  • This final construct is referred to as PUC19-cdc21-lys2-ThymidylateSynthase_down.
  • the sequence is set forth in the tables.
  • a final PCR amplification reaction of this construct was performed using the following PCR primers:
  • the PCR reaction was identical to that described above but using 100 ng of the PUC19-cdc21-lys2-ThymidylateSynthase_down construct as a template.
  • the final PCR product was separated by agarose gel electrophoresis, excised, and purified using MP Biomedicals Geneclean II kit as recommended. Homologous recombination of
  • YGR420CBY4742ΔPFK2 to replace the wt thymidylate synthase with the cdc21 mutant was accomplished using 10 µg of the purified PCR product to transform the YGR420CBY4742ΔPFK2 strain using the same transformation protocol described above. Transformants were selected by culturing the cells on selective media containing SC-Ethanol (0.67% yeast nitrogen base-2% ethanol) containing complete amino acids minus lysine.
  • This final engineered strain contains the mutated cdc21 gene, and has both the PFK1 and PFK2 genes deleted.
  • This final engineered strain will be transformed with the best combination of edd-p426 GPD and eda-p425 GPD variant constructs. Ethanol and glucose measurements will be monitored during aerobic and anaerobic growth conditions using Roche ethanol and glucose kits according to instructions.
  • Example 6 Examples of Polynucleotide Regulators. Provided in the tables hereafter are non-limiting examples of regulator polynucleotides that can be utilized in embodiments herein. Such polynucleotides may be utilized in native form or may be modified for use herein. Examples of regulatory polynucleotides include those that are regulated by oxygen levels in a system (e.g., up-regulated or down-regulated by relatively high oxygen levels or relatively low oxygen levels). Regulated Yeast Promoters - Up-regulated by oxygen
  • Adr1 A-AG-GAGAGAG-GGCAG YTSTYSTT-TTGYTWTT
  • Gal4 (Gal) YCTTTTTTTTYTTYYKG CGGM— CW-Y-CCCG
  • Gcr1 GGAAGCTGAAACGYMWRR GGAAGCTGAAACGYMWRR
  • Gcr2 GGAGAGGCATGATGGGGG AGGTGATGGAGTGCTCAG
  • YHP1 One of two homeobox transcriptional repressors (see also Yox1p), that bind to Mcm1p and to early cell cycle box (ECB) elements of cell cycle regulated genes, thereby restricting ECB-mediated transcription to the M/G1 interval
  • HOS4 Subunit of the Set3 complex which is a meiotic-specific repressor of sporulation specific genes that contains deacetylase activity
  • SAP1 Putative ATPase of the AAA family interacts with the Sin1p
  • SET3 Defining member of the SET3 histone deacetylase complex which is a meiosis-specific repressor of sporulation genes; necessary for efficient transcription by RNAPII; one of two yeast proteins that contains both SET and PHD domains
  • YMR181C Protein of unknown function; mRNA transcribed as part of a bicistronic transcript with a predicted transcriptional repressor RGM1/YMR182C; mRNA is destroyed by nonsense-mediated decay (NMD); YMR181C is not an essential gene
  • YLR345W Similar to 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase enzymes responsible for the metabolism of fructose-2,6-bisphosphate; mRNA expression is repressed by the Rfx1p-Tup1p-Ssn6p repressor complex; YLR345W is not an essential gene
  • MCM1 Transcription factor involved in cell-type-specific transcription and pheromone response; plays a central role in the formation of both repressor and activator complexes
  • RGT1 Glucose-responsive transcription factor that regulates expression of several glucose transporter (HXT) genes in response to glucose; binds to promoters and acts both as a transcriptional activator and repressor
  • RNA polymerase II mediator complex associates with core polymerase subunits to form the RNA polymerase II holoenzyme; essential for transcriptional regulation; target of the global repressor Tup1p
  • GAL11 Subunit of the RNA polymerase II mediator complex; associates with core polymerase subunits to form the RNA polymerase II holoenzyme; affects transcription by acting as target of activators and repressors Transcriptional activators
  • GCR2 functions with the DNA-binding protein Gcr1 p
  • Transcriptional activator of genes involved in nitrogen catabolite repression contains a GATA-1-type zinc finger DNA-binding motif; activity and localization regulated by nitrogen limitation and Ure2p
  • NCR nitrogen catabolite repression
  • Transcriptional activator of proline utilization genes constitutively binds PUT1 and PUT2 promoter sequences and undergoes a conformational change to form the active state; has a Zn(2)-Cys(6) binuclear cluster domain
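The Example 5 cloning steps above state that XmaI and SacI restriction sites were designed to flank the lys2 fragment. A minimal sketch of how one might confirm that the listed Lys2Fwd and Lys2Rev primers carry those sites is shown here. The recognition sequences (CCCGGG for XmaI, GAGCTC for SacI) are the standard ones for these enzymes; the function and variable names are hypothetical and not part of the original text.

```python
# Standard recognition sequences for the two enzymes named in the cloning step.
RECOGNITION_SITES = {
    "XmaI": "CCCGGG",
    "SacI": "GAGCTC",
}

# Primer sequences copied from the text above (written 5' to 3').
PRIMERS = {
    "Lys2Fwd": "tgctaatgacccgggaattccacttgcaattacataaaaattccggcgg",
    "Lys2Rev": "atgatcattgagctcagcttcgcaagtattcattttagacccatggtgg",
}


def find_sites(primer: str) -> dict:
    """Return {enzyme: [0-based start positions]} for each site present in the primer."""
    seq = primer.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = [
            i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)
        ]
        if positions:
            hits[enzyme] = positions
    return hits


# Hypothetical summary structure: which engineered site each primer carries.
SITE_REPORT = {name: find_sites(seq) for name, seq in PRIMERS.items()}

if __name__ == "__main__":
    for name, hits in SITE_REPORT.items():
        print(name, hits)
```

Both sites are palindromic, so scanning only the strand as written is sufficient for this check. A non-palindromic site would also require scanning the reverse complement.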


Abstract

Provided herein are genetically modified microorganisms that have enhanced fermentation activity, and methods for making and using such microorganisms.

Description

ENGINEERED MICROORGANISMS WITH ENHANCED FERMENTATION ACTIVITY
Related Patent Application(s)
This patent application claims the benefit of U.S. provisional patent application no. 61/432,104 filed on January 12, 2011, entitled ENGINEERED MICROORGANISMS WITH ENHANCED FERMENTATION ACTIVITY, naming Stephen Picataggio, Kirsty Anne Lily Salmon, and Jose Miguel LaPlaza as inventors and designated by Attorney Docket No. VRD-1002-PV4. This patent application is related to U.S. provisional patent application no. 61/224,430 filed on July 9, 2009, entitled ENGINEERED MICROORGANISMS WITH ENHANCED FERMENTATION ACTIVITY, naming Stephen Picataggio as inventor and designated by Attorney Docket No. VRD-1002-PV. This patent application also is related to U.S. provisional patent application no. 61/316,780 filed on March 23, 2010, entitled ENGINEERED MICROORGANISMS WITH ENHANCED FERMENTATION ACTIVITY, naming Stephen Picataggio as inventor and designated by Attorney Docket No. VRD-1002-PV2. This patent application also is related to U.S. provisional patent application no. 61/334,097 filed on May 12, 2010, entitled ENGINEERED MICROORGANISMS WITH ENHANCED FERMENTATION ACTIVITY, naming Stephen Picataggio as inventor and designated by Attorney Docket No. VRD-1002-PV3. This patent application also is related to international patent application no. PCT/US2010/041618 filed on July 9, 2010, entitled ENGINEERED MICROORGANISMS WITH ENHANCED FERMENTATION ACTIVITY, naming Stephen Picataggio, Kirsty Salmon and Jose Laplaza as inventors and designated by Attorney Docket No. VRD-1002-PC. This patent application also is related to international patent application no. PCT/US2010/041607 filed on July 9, 2010, entitled ENGINEERED MICROORGANISMS WITH ENHANCED FERMENTATION ACTIVITY, naming Stephen Picataggio, Kirsty Salmon and Jose LaPlaza as inventors and designated by Attorney Docket No. VRD-1002-PC2. This patent application also is related to U.S. patent application no. 13/045,829 filed on March 11, 2011, entitled ENGINEERED MICROORGANISMS WITH ENHANCED FERMENTATION ACTIVITY, naming Stephen Picataggio, Kirsty Anne Lily Salmon, and Jose Miguel LaPlaza as inventors and designated by Attorney Docket No. VRD-1002-CT. This patent application also is related to U.S. patent application no. 13/045,841 filed on March 11, 2011, now issued, entitled ENGINEERED MICROORGANISMS WITH ENHANCED FERMENTATION ACTIVITY, naming Stephen Picataggio, Kirsty Anne Lily Salmon, and Jose Miguel LaPlaza as inventors and designated by Attorney Docket No. VRD-1002-CT2. This patent application also is related to U.S. patent application no. 13/045,847 filed on March 11, 2011, entitled ENGINEERED MICROORGANISMS WITH ENHANCED FERMENTATION ACTIVITY, naming Stephen Picataggio, Kirsty Anne Lily Salmon, and Jose Miguel LaPlaza as inventors and designated by Attorney Docket No. VRD-1002-CT3. The entire contents of the foregoing patent applications are incorporated herein by reference, including, without limitation, all text, tables and drawings.
Field
The technology relates in part to genetically modified microorganisms that have enhanced fermentation activity, and methods for making and using such microorganisms.
Background
Microorganisms employ various enzyme-driven biological pathways to support their own metabolism and growth. A cell synthesizes native proteins, including enzymes, in vivo from deoxyribonucleic acid (DNA). DNA first is transcribed into a complementary ribonucleic acid (RNA) that comprises a ribonucleotide sequence encoding the protein. RNA then directs translation of the encoded protein by interaction with various cellular components, such as ribosomes. The resulting enzymes participate as biological catalysts in pathways involved in production of molecules utilized or secreted by the organism.
These pathways can be exploited for the harvesting of the naturally produced products. The pathways also can be altered to increase production or to produce different products that may be commercially valuable. Advances in recombinant molecular biology methodology allow researchers to isolate DNA from one organism and insert it into another organism, thus altering the cellular synthesis of enzymes or other proteins. Such genetic engineering can change the biological pathways within the host organism, causing it to produce a desired product.
Microorganic industrial production can minimize the use of caustic chemicals and production of toxic byproducts, thus providing a "clean" source for certain products.
Summary
Provided herein are engineered microorganisms having enhanced fermentation activity. In certain non-limiting embodiments, such microorganisms are capable of generating a target product with enhanced fermentation efficiency by, for example, (i) preferentially utilizing a particular glycolysis pathway, which increases yield of a target product, upon a change in fermentation conditions; (ii) reducing cell division rates upon a change in fermentation conditions, thereby diverting nutrients towards production of a target product; (iii) having the ability to readily metabolize five-carbon sugars; and/or (iv) having the ability to readily metabolize carbon dioxide; and combinations of the foregoing. In some embodiments, a target product is ethanol or succinic acid.
Thus, provided in certain embodiments are engineered microorganisms that comprise: (a) a functional Embden-Meyerhoff glycolysis pathway that metabolizes six-carbon sugars under aerobic fermentation conditions, and (b) a genetic modification that reduces an Embden-Meyerhoff glycolysis pathway member activity upon exposure of the engineered microorganism to anaerobic fermentation conditions, whereby the engineered microorganisms preferentially metabolize six-carbon sugars by the Entner-Doudoroff pathway under the anaerobic fermentation conditions. In some embodiments, the genetic modification is insertion of a promoter into genomic DNA in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity. In certain embodiments, the genetic modification is provision of a heterologous promoter polynucleotide in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity. In some embodiments, the genetic modification is a deletion or disruption of a polynucleotide that encodes, or regulates production of, the Embden-Meyerhoff glycolysis pathway member, and the microorganism comprises a heterologous nucleic acid that includes a polynucleotide encoding the Embden-Meyerhoff glycolysis pathway member operably linked to a polynucleotide that down-regulates production of the member under anaerobic fermentation conditions. In certain embodiments, the Embden-Meyerhoff glycolysis pathway member activity is a phosphofructokinase activity, a phosphoglucose isomerase activity, or a phosphofructokinase activity and a phosphoglucose isomerase activity. In some embodiments, the activity of one or more (e.g., 2, 3, 4, 5 or more) pathway members in an EM pathway is reduced or removed to undetectable levels. In certain embodiments, one or more activities in an EM pathway, not mentioned herein, also can be modified to further enhance production of a desired product (e.g., alcohol).
Also provided in some embodiments are engineered microorganisms that comprise a genetic modification that inhibits cell division upon exposure to a change in fermentation conditions, where: the genetic modification comprises introduction of a heterologous promoter operably linked to a polynucleotide encoding a polypeptide that regulates the cell cycle of the microorganism; and the promoter activity is altered by the change in fermentation conditions. Provided also in certain embodiments are engineered microorganisms that comprise a genetic modification that inhibits cell division and/or cell proliferation upon exposure of the microorganisms to a change in fermentation conditions. In certain embodiments, the genetic modification inhibits cell division, inhibits cell proliferation, inhibits the cell cycle and/or induces cell cycle arrest. In some embodiments, the change in fermentation conditions is a switch to anaerobic fermentation conditions, and in certain embodiments, the change in fermentation conditions is a switch to an elevated temperature. In some embodiments, the polypeptide that regulates the cell cycle has thymidylate synthase activity. In certain embodiments, the promoter activity is reduced by the change in fermentation conditions. In some embodiments, the genetic modification is a temperature sensitive mutation.
Provided also in some embodiments are methods for manufacturing a target product produced by an engineered microorganism, which comprise: (a) culturing an engineered microorganism described herein under aerobic conditions; and (b) culturing the engineered microorganism after (a) under anaerobic conditions, whereby the engineered microorganism produces the target product. Also provided in some embodiments are methods for producing a target product by an engineered microorganism, which comprise: (a) culturing an engineered microorganism described herein under a first set of fermentation conditions; and (b) culturing the engineered microorganism after (a) under a second set of fermentation conditions different than the first set of fermentation conditions, whereby the second set of fermentation conditions inhibits cell division and/or cell proliferation of the engineered microorganism. In certain embodiments, the target product is ethanol or succinic acid. In some embodiments, the host microorganism from which the engineered microorganism is produced does not produce a detectable amount of the target product. In certain embodiments, the culture conditions comprise fermentation conditions, comprise introduction of biomass, comprise introduction of a six-carbon sugar (e.g., glucose), and/or comprise introduction of a five-carbon sugar (e.g., xylulose, xylose); or combinations of the foregoing. In some embodiments, the target product is produced with a yield of greater than about 0.3 grams per gram of glucose added, and in certain embodiments, a method comprises purifying the target product from the cultured microorganisms. In some embodiments, a method comprises modifying the target product, thereby producing modified target product. In certain embodiments, a method comprises placing the cultured microorganisms, the target product or the modified target product in a container, and in certain embodiments, a method comprises shipping the container. 
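The yield figure above (greater than about 0.3 grams of target product per gram of glucose added) can be put in context with a short calculation. The sketch below assumes the target product is ethanol and uses the standard fermentation stoichiometry of one glucose to two ethanol plus two carbon dioxide; the function names and the sample quantities in the usage lines are hypothetical illustrations, not measured values from the text.

```python
GLUCOSE_MOLAR_MASS = 180.156  # g/mol, C6H12O6
ETHANOL_MOLAR_MASS = 46.069   # g/mol, C2H5OH

# Stoichiometric ceiling for C6H12O6 -> 2 C2H5OH + 2 CO2, in grams ethanol per gram glucose.
THEORETICAL_MAX_YIELD = 2 * ETHANOL_MOLAR_MASS / GLUCOSE_MOLAR_MASS


def product_yield(product_grams: float, glucose_grams: float) -> float:
    """Grams of product obtained per gram of glucose added."""
    if glucose_grams <= 0:
        raise ValueError("glucose_grams must be positive")
    return product_grams / glucose_grams


def exceeds_threshold(product_grams: float, glucose_grams: float,
                      threshold: float = 0.3) -> bool:
    """True when the yield is greater than the stated threshold (default 0.3 g/g)."""
    return product_yield(product_grams, glucose_grams) > threshold


if __name__ == "__main__":
    # Hypothetical run: 35 g ethanol recovered from 100 g glucose added.
    y = product_yield(35.0, 100.0)
    print(f"yield = {y:.3f} g/g")
    print(f"fraction of stoichiometric maximum = {y / THEORETICAL_MAX_YIELD:.2f}")
    print("exceeds 0.3 g/g:", exceeds_threshold(35.0, 100.0))
```

For succinic acid, the other target product named above, the stoichiometric ceiling differs and would need its own constant.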
In some embodiments, the second set of fermentation conditions comprises an elevated temperature as compared to the temperature in the first set of fermentation conditions. In certain embodiments, the genetic modification inhibits the cell cycle of the engineered microorganism upon exposure to the second set of fermentation conditions. In some embodiments, the genetic modification inhibits cell proliferation, inhibits cell division, inhibits the cell cycle and/or induces cell cycle arrest upon exposure to the second set of fermentation conditions. In certain embodiments, the genetic modification inhibits thymidylate synthase activity upon exposure to the change in fermentation conditions, and sometimes the genetic modification comprises a temperature sensitive mutation.
Also provided in certain embodiments are methods for manufacturing an engineered microorganism, which comprise: (a) introducing a genetic modification to a host microorganism that reduces an Embden-Meyerhoff glycolysis pathway member activity upon exposure of the engineered microorganism to anaerobic conditions; and (b) selecting for engineered microorganisms that (i) metabolize six-carbon sugars by the Embden-Meyerhoff glycolysis pathway under aerobic fermentation conditions, and (ii) preferentially metabolize six-carbon sugars by the Entner-Doudoroff pathway under the anaerobic fermentation conditions. In some embodiments, the genetic modification is insertion of a promoter into genomic DNA in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity. The genetic modification sometimes is provision of a heterologous promoter polynucleotide in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity. In certain embodiments, the genetic modification is a deletion or disruption of a polynucleotide that encodes, or regulates production of, the Embden-Meyerhoff glycolysis pathway member, and the microorganism comprises a heterologous nucleic acid that includes a polynucleotide encoding the Embden-Meyerhoff glycolysis pathway member operably linked to a polynucleotide that down-regulates production of the member under anaerobic fermentation conditions. In some embodiments, the Embden-Meyerhoff glycolysis pathway member activity is a phosphofructokinase activity, and in certain embodiments, the Embden-Meyerhoff glycolysis pathway member activity is a phosphoglucose isomerase activity. In some embodiments, the activity of one or more (e.g., 2, 3, 4, 5 or more) pathway members in an EM pathway is reduced or removed to undetectable levels.
Provided also in some embodiments are methods for manufacturing an engineered microorganism, which comprise: (a) introducing a genetic modification to a host microorganism that inhibits cell division upon exposure to a change in fermentation conditions, thereby producing engineered microorganisms; and (b) selecting for engineered microorganisms with inhibited cell division upon exposure of the engineered microorganisms to the change in fermentation conditions. In certain embodiments, the change in fermentation conditions comprises a change to anaerobic fermentation conditions. The change in fermentation conditions sometimes comprises a change to an elevated temperature. In some embodiments, the genetic modification inhibits the cell cycle of the engineered microorganism upon exposure to the change in fermentation conditions. The genetic modification sometimes inhibits cell division, inhibits the cell cycle, inhibits cell proliferation and/or induces cell cycle arrest upon exposure to the change in fermentation conditions. In some embodiments, the genetic modification inhibits thymidylate synthase activity upon exposure to the change in fermentation conditions, and in certain embodiments, the genetic modification comprises a temperature sensitive mutation. In certain embodiments pertaining to engineered microorganisms, and methods of making or using such microorganisms, the microorganism comprises a genetic modification that adds or alters a five-carbon sugar metabolic activity. In some embodiments, the microorganism comprises a genetic alteration that adds or alters xylose isomerase activity. In certain embodiments, the microorganism comprises a genetic alteration that adds or alters a xylose reductase (XR) activity and a xylitol dehydrogenase (XD) activity. In some embodiments, the microorganism comprises a xylulokinase (XK) activity. 
In certain embodiments, the microorganism comprises a genetic alteration that adds or alters five-carbon sugar transporter activity, and sometimes the transporter activity is a transporter facilitator activity or an active transporter activity. In some embodiments, the microorganism comprises a genetic alteration that adds or alters carbon dioxide fixation activity, and sometimes the genetic alteration adds or alters phosphoenolpyruvate (PEP) carboxylase activity. In certain embodiments, the microorganism comprises a genetic modification that reduces or removes an alcohol dehydrogenase 2 activity. In certain embodiments, the microorganism comprises a genetic alteration that adds or alters a 6-phosphogluconate dehydrogenase (decarboxylating) activity. In some embodiments the microorganism is an engineered yeast, such as a Saccharomyces yeast (e.g., S. cerevisiae), for example.
In some embodiments, provided are nucleic acids, comprising a polynucleotide that encodes a polypeptide from Ruminococcus possessing a xylose to xylulose xylose isomerase activity, or a polypeptide possessing xylose reductase activity and xylitol dehydrogenase activity. In certain embodiments, provided are nucleic acids, comprising a polynucleotide that encodes a polypeptide possessing xylulokinase activity. Also provided in certain embodiments are expression vectors comprising a polynucleotide that encodes a polypeptide from Ruminococcus possessing a xylose isomerase activity. Also provided in some embodiments are expression vectors comprising a polynucleotide that encodes a polypeptide possessing a xylose reductase activity and xylitol dehydrogenase activity. Also provided in some embodiments are expression vectors comprising a polynucleotide that encodes a polypeptide possessing a xylulokinase activity. The polynucleotide sometimes includes one or more substituted codons, and in some embodiments, the one or more substituted codons are yeast codons (e.g., some or all codons are optimized with yeast codons (e.g., S. cerevisiae codons)).
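Codon substitution of the kind mentioned above changes the nucleotide sequence while leaving the encoded amino acid sequence unchanged. The following is a minimal sketch of that idea using the standard genetic code. The `PREFERRED_CODON` map is an illustrative assumption (one commonly favored yeast codon per amino acid), not a table taken from this document, and the function names and test sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Illustrative assumption: one preferred codon per amino acid ("*" is stop).
PREFERRED_CODON = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "TTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAC", "V": "GTT",
    "*": "TAA",
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def substitute_codons(dna: str) -> str:
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED_CODON[aa] for aa in translate(dna))


if __name__ == "__main__":
    original = "ATGGCGCGTCTGAGCTGA"  # hypothetical test sequence, not from the text
    optimized = substitute_codons(original)
    print(original, "->", optimized)
    print("protein unchanged:", translate(original) == translate(optimized))
```

Practical codon optimization usually weighs codon frequencies, avoids unwanted restriction sites, and considers mRNA structure; this sketch only demonstrates the synonymous-substitution constraint.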
The polynucleotide sometimes includes a nucleotide sequence of SEQ ID NO: 29, 30, 32 or 33, a fragment thereof, or a sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing, and in certain embodiments the polypeptide includes an amino acid sequence of SEQ ID NO: 31, a fragment thereof, or a sequence having 75% identity or greater (e.g., about 80, 85, 90, 95% identity or greater) to the foregoing. In certain embodiments, a stretch of contiguous nucleotides of the polynucleotide is from another organism, and sometimes the stretch of contiguous nucleotides from the other organism is from a nucleotide sequence that encodes a polypeptide possessing a xylose isomerase activity. The other organism sometimes is a fungus, such as a Piromyces fungus (e.g., Piromyces strain E2 or another Piromyces strain), and at times the stretch of contiguous nucleotides from the other organism is from SEQ ID NO: 34, or a sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing. In some embodiments, the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 35, or a sequence having 75% identity or greater (e.g., about 80, 85, 90, 95% identity or greater) to the foregoing. The stretch of contiguous nucleotides from the other organism sometimes is about 1% to about 30% (e.g., about 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25%) of the total number of nucleotides in the polynucleotide that encodes the polypeptide possessing xylose isomerase activity. In some embodiments, about 30 contiguous nucleotides from the polynucleotide from Ruminococcus are replaced by about 10 to about 20 nucleotides from the other organism.
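The identity thresholds above (e.g., 50% or greater for nucleotide sequences, 75% or greater for amino acid sequences) can be checked with a short Python sketch. The document does not state how identity is computed, so this sketch assumes one common convention: identical columns divided by the total number of aligned columns, over two sequences that have already been aligned. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Both inputs must have equal length; '-' marks a gap. Columns in which
    both sequences carry a gap are ignored. A gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True when the percent identity is at or above the threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values for the same pair, so the chosen denominator should be stated alongside any reported identity.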
Sometimes, the contiguous stretch of nucleotides from the other organism is at the 5' end of the polynucleotide. In some embodiments, the polynucleotide includes a nucleotide sequence of SEQ ID NO: 55, 56, 57, 59 or 61, a fragment thereof, or a sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing. The polynucleotide sometimes encodes a polypeptide that includes an amino acid sequence of SEQ ID NO: 58, 60 or 62, a fragment thereof, or a sequence having 75% identity or greater (e.g., about 80, 85, 90, 95% identity or greater) to the foregoing. In some embodiments, the polynucleotide comprises one or more point mutations, and sometimes the point mutation is at a position corresponding to position 179 of the R. flavefaciens strain Siijpesteijn 1948 polypeptide having xylose isomerase activity (e.g., the point mutation is a glycine 179 to alanine point mutation). In certain embodiments, an expression vector includes a regulatory nucleotide sequence in operable linkage with the polynucleotide. A regulatory nucleotide sequence sometimes includes a promoter sequence (e.g., an inducible promoter sequence or a constitutively active promoter sequence). In certain embodiments, provided are methods for preparing an expression vector of any one of embodiments H1 to H24, comprising: (i) providing a nucleic acid that contains a regulatory sequence, and (ii) inserting the polynucleotide into the nucleic acid in operable linkage with the regulatory sequence.
Thus, in some embodiments, provided are chimeric xylose isomerase enzymes, and polynucleotides that encode them, which include subsequences from two or more xylose isomerase donor sequences. The xylose isomerase donor sequences sometimes are naturally occurring native sequences from an organism, and sometimes are modified sequences. In certain embodiments, a subsequence from one donor may represent a majority of the chimeric xylose isomerase sequence (e.g., about 55% to about 99% of the chimeric xylose isomerase nucleotide or amino acid sequence (e.g., about 60, 65, 70, 75, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98%) or all but 30 or fewer nucleotides or amino acids of the chimeric sequence (e.g., all but about 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 nucleotides or amino acids of the chimeric molecule)). In some embodiments, a subsequence from one donor may represent a minority of the chimeric xylose isomerase sequence (e.g., about 1% to about 45% of the chimeric xylose isomerase nucleotide or amino acid sequence (e.g., about 40, 35, 30, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2%) or about 1 to about 30 nucleotides or amino acids of the chimeric sequence (e.g., about 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 nucleotides or amino acids of the chimeric molecule)). In some embodiments, one or more donor sequences for a chimeric xylose isomerase molecule are from a xylose isomerase described in the following Table:
[Initial rows of the Table are in image imgf000009_0001, not reproduced here]
Organism                              SEQ ID NO              Accession No.
Clostridium cellulolyticum            101                    YP_002507697.1
Bacillus uniformis                    100                    ZP_02069286.1
Bacillus stearothermophilus           99                     ABI49954.1
Bacteroides thetaiotaomicron          98                     NP_809706.1
Clostridium thermohydrosulfuricum     97                     P22842.1
Orpinomyces sp. ukkl                  96                     ACA65427.1
Clostridium phytofermentans           95                     YP_001558336.1
Escherichia coli                      94
Piromyces strain E2                   24 and 34 and 93/35    AJ249909 / CAB76571.1
or a nucleotide sequence or amino acid sequence that is (a) about 80% or more identical to one of the foregoing sequences referenced in the Table (e.g., about 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99% identical), and/or (b) has about 1 to about 20 nucleotide or amino acid modifications (e.g., substitutions, deletions or insertions) relative to one of the foregoing sequences referenced in the Table (e.g., about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 nucleotide or amino acid modifications). In certain embodiments, (a) the majority of a chimeric xylose isomerase molecule is from a Ruminococcus xylose isomerase described in the foregoing Table (e.g., about 80% or more of the nucleotides or amino acids of the chimeric molecule (e.g., about 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99% of the nucleotides or amino acids) or all but about 30 of the nucleotides or amino acids in the chimeric molecule (e.g., all but about 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 nucleotides or amino acids of the chimeric molecule)), and (b) a minority of the chimeric xylose isomerase is from a xylose isomerase of another organism (e.g., about 20% or fewer of the nucleotides or amino acids of the chimeric molecule (e.g., about 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1% of the nucleotides or amino acids) or about 30 of the nucleotides or amino acids in the chimeric molecule (e.g., about 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 nucleotides or amino acids of the chimeric molecule)). In the latter embodiments, the minority of the chimeric xylose isomerase sometimes is from a xylose isomerase referenced in the foregoing Table, such as a xylose isomerase from the Piromyces strain, for example.
In some embodiments, a donor sequence includes one or more nucleotide or amino acid mutations, examples of which are described herein.
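The majority/minority composition of a chimeric sequence described above can be illustrated with a small Python sketch. It assumes the case in which a stretch at the 5' end of the majority donor is replaced by a stretch from the 5' end of a second donor. The function names are hypothetical, and the requirement that both stretch lengths be multiples of three (to keep the reading frame) is an assumption of this sketch, not a statement from the document. Any sequences passed in are the caller's; none of the SEQ ID NO sequences are reproduced here.

```python
def build_chimera(majority: str, donor: str, replaced_len: int, donor_len: int) -> str:
    """Swap the 5' end of `majority` for the 5' end of `donor`.

    The first `replaced_len` nucleotides of `majority` are removed and the
    first `donor_len` nucleotides of `donor` are put in their place.
    """
    if replaced_len > len(majority) or donor_len > len(donor):
        raise ValueError("stretch length exceeds sequence length")
    # Assumption of this sketch: whole codons are exchanged so the
    # downstream reading frame of the majority donor is preserved.
    if replaced_len % 3 != 0 or donor_len % 3 != 0:
        raise ValueError("stretch lengths must be multiples of 3")
    return donor[:donor_len] + majority[replaced_len:]


def donor_percent(chimera: str, donor_len: int) -> float:
    """Share of the chimera, in percent, contributed by the minority donor."""
    if not chimera:
        raise ValueError("chimera is empty")
    return 100.0 * donor_len / len(chimera)
```

For example, replacing 30 nucleotides of a 90-nucleotide placeholder with 15 donor nucleotides gives a 75-nucleotide chimera in which the minority donor contributes 20%.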
In some embodiments, provided are nucleic acids including a polynucleotide that includes a first stretch of contiguous nucleotides from a first organism and a second stretch of contiguous nucleotides from a second organism, where the polynucleotide encodes a polypeptide possessing a xylose-to-xylulose xylose isomerase activity. In certain embodiments, provided is an expression vector comprising a polynucleotide that includes a first stretch of contiguous nucleotides from a first organism and a second stretch of contiguous nucleotides from a second organism, where the polynucleotide encodes a polypeptide possessing a xylose-to-xylulose xylose isomerase activity. In some embodiments, the first organism and the second organism are the same, and in certain embodiments, the first organism and the second organism are different. In some embodiments, the first stretch of contiguous nucleotides and the second stretch of contiguous nucleotides independently are selected from a nucleotide sequence that encodes a polypeptide having xylose isomerase activity. In certain embodiments, the first organism is a bacterium. In some embodiments, the bacterium is a Ruminococcus bacterium, and in certain embodiments, the bacterium is a Ruminococcus flavefaciens bacterium (e.g., Ruminococcus flavefaciens strain 17, Ruminococcus flavefaciens strain Siijpesteijn 1948, Ruminococcus flavefaciens strain FD1, Ruminococcus flavefaciens strain 18P13). In some embodiments, the stretch of contiguous nucleotides is from SEQ ID NO: 29, 30, 32, 33, or a sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing. In certain embodiments, the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 31, or a sequence having 75% identity or greater (e.g., about 80, 85, 90, 95% identity or greater) to the foregoing. In certain embodiments, the second organism is a fungus.
In some embodiments, the fungus is a Piromyces fungus, and in some embodiments, the fungus is a Piromyces strain E2 fungus. In certain embodiments, the stretch of contiguous nucleotides is from SEQ ID NO: 34, or a sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing. In some embodiments, the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 35, or a sequence having 75% identity or greater (e.g., about 80, 85, 90, 95% identity or greater) to the foregoing. In certain embodiments, the polynucleotide includes one or more substituted codons. In some embodiments, the one or more substituted codons are yeast codons. In certain embodiments, the stretch of contiguous nucleotides from the first organism or second organism is about 1% to about 30% (e.g., about 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25%) of the total number of nucleotides in the polynucleotide that encodes the polypeptide possessing xylose isomerase activity. In some embodiments, the stretch of contiguous nucleotides from the second organism is about 1% to about 30% (e.g., about 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25%) of the total number of nucleotides in the polynucleotide. In some embodiments, the polynucleotide includes a nucleotide sequence of SEQ ID NO: 55, 56, 57, 59 or 61, a fragment thereof, or a sequence having 50% identity or greater (e.g., about 55, 60, 65, 70, 75, 80, 85, 90, 95% identity or greater) to the foregoing. In certain embodiments, the polynucleotide encodes a polypeptide that includes an amino acid sequence of SEQ ID NO: 58, 60 or 62, a fragment thereof, or a sequence having 75% identity or greater (e.g., about 80, 85, 90, 95% identity or greater) to the foregoing. In some embodiments, the expression vector can include one or more point mutations. In certain embodiments, the point mutation is at a position corresponding to position 179 of R. flavefaciens polypeptide having xylose isomerase activity. In some embodiments, the point mutation is a glycine 179 to alanine point mutation.
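A point mutation such as the glycine 179 to alanine substitution can be expressed in Python as below. This is a sketch under stated assumptions: positions are 1-based, the function name is hypothetical, and the residue is addressed directly by its position in the polypeptide supplied by the caller. Locating the position "corresponding to" position 179 in a different xylose isomerase would first require an alignment, which is not shown.

```python
def apply_point_mutation(protein: str, position: int, expected: str, replacement: str) -> str:
    """Return a copy of `protein` with one residue substituted.

    `position` is 1-based. The residue found there must equal `expected`
    (e.g., "G"); this guards against an off-by-one or a wrong reference
    sequence before the substitution (e.g., to "A") is made.
    """
    index = position - 1
    if not 0 <= index < len(protein):
        raise ValueError("position lies outside the sequence")
    if protein[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]
```

Called as `apply_point_mutation(seq, 179, "G", "A")`, the function returns the G179A variant of `seq` and raises an error if residue 179 of `seq` is not glycine.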
In certain embodiments, the microbes described herein can be used in fermentation methods. In some embodiments, a method includes contacting a microbe described herein with a feedstock comprising a five-carbon molecule under conditions for generating ethanol. In certain embodiments, the five-carbon molecule includes xylose. In some embodiments, about 15 grams per liter of ethanol, or more, is generated within about 372 hours. In certain embodiments, about 2.0 grams per liter dry cell weight, or more, is generated within about 372 hours.
In some embodiments, provided are nucleic acids including a polynucleotide that includes a first stretch of contiguous nucleotides from a first organism and a second stretch of contiguous nucleotides from a second organism, where the polynucleotide encodes a polypeptide possessing a phosphogluconate dehydratase activity. In certain embodiments, provided is an expression vector comprising a polynucleotide that includes a first stretch of contiguous nucleotides from a first organism and a second stretch of contiguous nucleotides from a second organism, where the polynucleotide encodes a polypeptide possessing a phosphogluconate dehydratase activity. In some embodiments, the first organism and the second organism are the same, and in certain embodiments, the first organism and the second organism are different. In some embodiments, the first stretch of contiguous nucleotides and the second stretch of contiguous nucleotides independently are selected from a nucleotide sequence that encodes a polypeptide having a phosphogluconate dehydratase activity.
In some embodiments, provided are nucleic acids including a polynucleotide that includes a first stretch of contiguous nucleotides from a first organism and a second stretch of contiguous nucleotides from a second organism, where the polynucleotide encodes a polypeptide possessing a 2-keto-3-deoxygluconate-6-phosphate aldolase activity. In certain embodiments, provided is an expression vector comprising a polynucleotide that includes a first stretch of contiguous nucleotides from a first organism and a second stretch of contiguous nucleotides from a second organism, where the polynucleotide encodes a polypeptide possessing a 2-keto-3-deoxygluconate-6-phosphate aldolase activity. In some embodiments, the first organism and the second organism are the same, and in certain embodiments, the first organism and the second organism are different. In some embodiments, the first stretch of contiguous nucleotides and the second stretch of contiguous nucleotides independently are selected from a nucleotide sequence that encodes a polypeptide having a 2-keto-3-deoxygluconate-6-phosphate aldolase activity.
In certain embodiments, the expression vector includes a regulatory nucleotide sequence in operable linkage with the polynucleotide. In some embodiments, the regulatory nucleotide sequence comprises a promoter sequence. In certain embodiments, the promoter sequence is an inducible promoter sequence. In some embodiments, the promoter sequence is a constitutively active promoter sequence. In certain embodiments, a method for preparing an expression vector as described herein includes (i) providing a nucleic acid that contains a regulatory sequence, and (ii) inserting the polynucleotide into the nucleic acid in operable linkage with the regulatory sequence. In some embodiments, a microbe as described herein includes the nucleic acid of any one of the foregoing embodiments. In certain embodiments, a microbe includes an expression vector of any one of the foregoing embodiments. In some embodiments, the microbe is a yeast. In certain embodiments, the microbe is a Saccharomyces yeast, and in some embodiments, the microbe is a Saccharomyces cerevisiae yeast.
In various embodiments, provided herein is a nucleic acid comprising polynucleotide subsequences that encode a phosphogluconate dehydratase enzyme (e.g., EDD), a 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme (e.g., EDA), a transaldolase enzyme (e.g., TAL1), a transketolase enzyme (e.g., TKL1, TKL2, or TKL1 and TKL2), a glucose-6-phosphate dehydrogenase enzyme (e.g., ZWF1), a 6-phosphogluconolactonase enzyme (e.g., SOL3, SOL4, or SOL3 and SOL4) and a xylose isomerase enzyme or a xylose reductase (XR) enzyme and a xylitol dehydrogenase (XD) enzyme, and a xylulokinase (XK) enzyme. In some embodiments, polynucleotide subsequences encoding the phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme independently are from an Escherichia spp. (e.g., Escherichia coli) or Pseudomonas spp. (e.g., Pseudomonas aeruginosa), and in certain embodiments, the polynucleotide encoding the phosphogluconate dehydratase enzyme and/or the 3-deoxygluconate-6-phosphate aldolase enzyme is a chimeric polynucleotide that includes part of such a sequence and part of another phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme sequence (e.g., from a different organism). In certain embodiments, the polynucleotide subsequence that encodes the xylose isomerase enzyme is from a Ruminococcus spp. (e.g., Ruminococcus flavefaciens), and in some embodiments, is a chimeric polynucleotide that includes part of such a sequence and part of another xylose isomerase sequence (e.g., from a Piromyces spp.). Non-limiting examples of xylose isomerase chimeric sequences are described herein. In some embodiments, a nucleic acid includes a polynucleotide subsequence that encodes a glucose-6-phosphate dehydrogenase enzyme (e.g., ZWF1) and/or a polynucleotide subsequence that encodes a 6-phosphogluconolactonase enzyme (e.g., SOL3/SOL4). In certain embodiments, the polynucleotide subsequences that encode the glucose-6-phosphate dehydrogenase enzyme and the 6-phosphogluconolactonase enzyme are from a yeast, non-limiting examples of which are Saccharomyces spp. (e.g., Saccharomyces cerevisiae). In some embodiments, a nucleic acid includes a polynucleotide subsequence that encodes a glucose transporter (e.g., GAL2, GXS1, GXF1, HXT7).
In certain embodiments, the polynucleotide subsequence that encodes the glucose transporter is from a yeast, non-limiting examples of which are Saccharomyces spp. (e.g., Saccharomyces cerevisiae). In some embodiments, a nucleic acid includes a polynucleotide subsequence that alters the activity of 6-phosphogluconate dehydrogenase (decarboxylating) enzyme (e.g., GND1, GND2). In certain embodiments, the polynucleotide subsequences that alter the activity of 6-phosphogluconate dehydrogenase (decarboxylating) enzyme are from a yeast. In some embodiments, a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a 6-phosphogluconate dehydrogenase (decarboxylating) enzyme. In some embodiments, a nucleic acid includes a polynucleotide subsequence that disrupts a phosphoglucose isomerase enzyme (e.g., PGI1). In some embodiments, a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a phosphoglucose isomerase enzyme. In some embodiments, a nucleic acid includes a polynucleotide subsequence that encodes a transaldolase enzyme (e.g., TAL1). In certain embodiments, the polynucleotide subsequences that encode the transaldolase enzyme are from a yeast, non-limiting examples of which are Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida. In some embodiments, a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, transaldolase enzyme. In some embodiments, a nucleic acid includes a polynucleotide subsequence that encodes a transketolase enzyme (e.g., TKL1, TKL2, or TKL1 and TKL2). In certain embodiments, the polynucleotide subsequences that encode the transketolase enzyme are from a yeast, non-limiting examples of which are Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida. In some embodiments, a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, transketolase enzyme. In some embodiments, a nucleic acid includes one or more promoters operable in a yeast (e.g., Saccharomyces spp. (e.g., Saccharomyces cerevisiae)), and in operable connection with one or more polynucleotide subsequences described above. Such promoters often are constitutively active and sometimes are operable under anaerobic and aerobic conditions. Non-limiting examples of promoters include those that control glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1), phosphoglucokinase (PGK-1) and triose phosphate dehydrogenase (TDH-1). A nucleic acid can be one or two nucleic acids in some embodiments, and each nucleic acid can include one or two or more of the polynucleotide subsequences and/or promoters described above. A nucleic acid can be in circular (e.g., plasmid) or linear form, in some embodiments, and sometimes functions as an expression vector. In some embodiments, a nucleic acid functions as a tool for integrating the polynucleotide subsequences, and optionally promoter sequences, included in the nucleic acid, into genomic DNA of a host organism.
In some embodiments, provided herein is an engineered microbe comprising heterologous polynucleotide subsequences that encode a phosphogluconate dehydratase enzyme (e.g., EDD), a 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme (e.g., EDA), a xylose isomerase enzyme or a xylose reductase (XR) enzyme and a xylitol dehydrogenase (XD) enzyme, and a xylulokinase (XK) enzyme. In certain embodiments, the microbe is a yeast, non-limiting examples of which are Saccharomyces spp. (e.g., Saccharomyces cerevisiae). In some embodiments, polynucleotide subsequences encoding the phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme independently are from an Escherichia spp. (e.g., Escherichia coli) or Pseudomonas spp. (e.g., Pseudomonas aeruginosa), and in certain embodiments, the polynucleotide encoding the phosphogluconate dehydratase enzyme and/or the 3-deoxygluconate-6-phosphate aldolase enzyme is a chimeric polynucleotide that includes part of such a sequence and part of another phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme sequence (e.g., from a different organism). In certain embodiments, the polynucleotide subsequence that encodes the xylose isomerase enzyme is from a Ruminococcus spp. (e.g., Ruminococcus flavefaciens), and in some embodiments, is a chimeric polynucleotide that includes part of such a sequence and part of another xylose isomerase sequence (e.g., from a Piromyces spp.). Non-limiting examples of xylose isomerase chimeric sequences are described herein. In some embodiments, the engineered microbe expresses a glucose-6-phosphate dehydrogenase enzyme (e.g., ZWF1) and/or a 6-phosphogluconolactonase enzyme (e.g., SOL3/SOL4). In certain embodiments, the polynucleotide subsequences that encode the glucose-6-phosphate dehydrogenase enzyme and the 6-phosphogluconolactonase enzyme are from a yeast, non-limiting examples of which are Saccharomyces spp. (e.g., Saccharomyces cerevisiae). In certain embodiments, the polynucleotide subsequences that disrupt the 6-phosphogluconate dehydrogenase (decarboxylating) enzyme are from a yeast. In some embodiments, a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a 6-phosphogluconate dehydrogenase (decarboxylating) enzyme.
Thus, an engineered microbe sometimes expresses higher-than-normal levels (e.g., over-expresses) of an endogenous 6-phosphogluconolactonase enzyme and/or a glucose-6-phosphate dehydrogenase enzyme (e.g., under control of a constitutive promoter, or multiple copies of the nucleotide subsequences that encode such enzymes are inserted in the microbe). In some embodiments, the engineered microbe includes a polynucleotide subsequence that encodes a glucose transporter (e.g., GAL2, GXS1, GXF1, HXT7). In certain embodiments, the polynucleotide subsequence that encodes the glucose transporter is from a yeast, non-limiting examples of which are Saccharomyces spp. (e.g., Saccharomyces cerevisiae). Thus, an engineered microbe sometimes expresses higher-than-normal levels (e.g., over-expresses) of one or more endogenous glucose transport enzymes (e.g., under control of a constitutive promoter, or multiple copies of the nucleotide subsequences that encode such enzymes are inserted in the microbe). In some embodiments, the engineered microbe includes a genetic alteration that reduces an endogenous phosphofructokinase enzyme activity. In certain embodiments, a polynucleotide subsequence that encodes such an enzyme is altered such that enzyme activity is significantly reduced or not detectable in the engineered microbe. In some embodiments, a nucleic acid includes a polynucleotide subsequence that alters the activity of 6-phosphogluconate dehydrogenase (decarboxylating) enzyme (e.g., GND1, GND2). In certain embodiments, the polynucleotide subsequences that alter the activity of 6-phosphogluconate dehydrogenase (decarboxylating) enzyme are from a yeast. In some embodiments, a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a 6-phosphogluconate dehydrogenase (decarboxylating) enzyme. In some embodiments, a nucleic acid includes a polynucleotide subsequence that alters a phosphoglucose isomerase enzyme (e.g., PGI1) activity. In certain embodiments, the polynucleotide subsequences that alter the phosphoglucose isomerase enzyme are from a yeast. In some embodiments, a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, a phosphoglucose isomerase enzyme. In some embodiments, a nucleic acid includes a polynucleotide subsequence that alters a transaldolase enzyme (e.g., TAL1). In certain embodiments, the polynucleotide subsequences that alter the transaldolase enzyme activity increase the transaldolase activity, and in some embodiments the polynucleotide sequences are from a yeast, non-limiting examples of which are Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida. In certain embodiments, the polynucleotide subsequences that alter transaldolase enzyme activity decrease the transaldolase activity. In some embodiments, a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, transaldolase enzyme. In some embodiments, a nucleic acid includes a polynucleotide subsequence that alters a transketolase enzyme (e.g., TKL1, TKL2, or TKL1 and TKL2). In certain embodiments, the polynucleotide subsequences that alter the transketolase enzyme increase transketolase activity, and in some embodiments, the polynucleotide sequences are from a yeast, non-limiting examples of which are Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida. In certain embodiments, the polynucleotide subsequences that alter transketolase enzyme activity decrease the transketolase activity. In some embodiments, a nucleic acid includes a polynucleotide subsequence that decreases expression of, or disrupts, transketolase enzyme.
An engineered microbe sometimes expresses higher-than-normal levels (e.g., over-expresses) of an endogenous activity and/or expresses a novel activity independently selected from the non-limiting group of activities including 6-phosphogluconate dehydrogenase (decarboxylating) (e.g., GND1/GND2), phosphofructokinase (e.g., PFK1/PFK2), glucose-6-phosphate dehydrogenase (e.g., ZWF1), pyruvate decarboxylase (e.g., PDC1), alcohol dehydrogenase (e.g., ADH1), 6-phosphogluconolactonase enzyme (e.g., SOL3/SOL4), glutamate synthase (e.g., GLT1), trehalose-6-phosphate phosphatase (e.g., TPS2), glyceraldehyde-3-phosphate dehydrogenase (e.g., TDH3), pyruvate kinase (e.g., PYK1, also known as CDC19), glucose transporter (e.g., GAL2, GXS1, GXF1, HXT7), phosphogluconate dehydratase (e.g., EDD), 2-keto-3-deoxygluconate-6-phosphate aldolase (e.g., EDA), xylose isomerase (e.g., XI), xylose reductase (e.g., XR), xylitol dehydrogenase (e.g., XD), xylulokinase (e.g., XK), other activities described herein, or combinations thereof. In some embodiments, the higher-than-normal levels are achieved by (i) placing a polynucleotide subsequence encoding the activity in operable connection with a constitutive promoter or a strong inducible promoter, (ii) increasing the number of copies of a polynucleotide subsequence encoding the activity (e.g., integration of multiple copies in the genomic DNA, plasmids maintained in the organism at high copy number), and/or (iii) the like and combinations thereof. In certain embodiments, the polynucleotide subsequence that encodes an activity described herein is from a yeast, non-limiting examples of which are Saccharomyces spp. (e.g., Saccharomyces cerevisiae). In certain embodiments, the polynucleotide subsequence that encodes an activity described herein is from a bacterium, non-limiting examples of which are Bacillus spp., Escherichia spp., Ruminococcus spp., Yarrowia spp., and other bacteria described herein.
In some embodiments, the engineered microbe includes a genetic alteration that reduces or eliminates an endogenous activity independently selected from the non-limiting group of activities including 6-phosphogluconate dehydrogenase (decarboxylating) (e.g., GND1/GND2), glycerophosphate dehydrogenase (e.g., GPD1/GPD2), phosphofructokinase (e.g., PFK26/PFK27), membrane channel activity (e.g., FPS1), trehalose-6-phosphate synthase (e.g., TPS1), neutral trehalase (e.g., NTH1), alkaline phosphatase specific for p-nitrophenyl phosphate (e.g., PHO13), phosphoglucose isomerase (e.g., PGI1), transaldolase (e.g., TAL1), and/or transketolase (e.g., TKL1, TKL2, or TKL1 and TKL2). In some embodiments, an activity is reduced or eliminated by disrupting and/or deleting nucleotide sequences or nucleotide subsequences that encode the activity. In certain embodiments, the polynucleotide subsequences that are used to alter an activity are from a yeast, non-limiting examples of which are Kluyveromyces, Pichia, Schizosaccharomyces, and Candida. In some embodiments, the polynucleotide subsequences that are used to alter an activity are from a bacterium, non-limiting examples of which are Escherichia, Ruminococcus, Bacillus, Yarrowia, and other bacteria described herein. In certain embodiments, a polynucleotide subsequence that encodes such an enzyme is altered such that enzyme activity is significantly reduced or not detectable in the engineered microbe. An activity often is "eliminated" when the activity is not detectable in an engineered organism. In some embodiments, one or more activities addressed in this paragraph are reduced or eliminated and one or more activities addressed in this paragraph are increased in an engineered microorganism relative to a control microorganism not including the genetic alteration(s) causing such a reduction or increase. In certain embodiments, all of the activities addressed in this paragraph are increased or reduced or eliminated. In some embodiments, one or more activities addressed in this paragraph are reduced or eliminated, one or more activities addressed in this paragraph are increased, and one or more activities in this paragraph are not altered in an engineered microorganism relative to a control microorganism not including the genetic alteration(s) causing such a reduction or increase.
In some embodiments, the engineered microbe includes one or more promoters operable in a yeast (e.g., Saccharomyces spp. (e.g., Saccharomyces cerevisiae), and in operable connection with one or more polynucleotide subsequences described above. Such promoters often are constitutively active and sometimes are operable under anaerobic and aerobic conditions. Non- limiting examples of promoters include those that control glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1 ), phosphoglucokinase (PGK-1 ) and t ose phosphate dehydrogenase (TDH-1 ). The polynucleotide sequences and promoters described above sometimes are non-stably associated with the microbe (e.g., they are in a non-integrated nucleic acid (e.g., a plasmid), and in some embodiments, are integrated in genomic DNA of the microbe. In some embodiments, the polynucleotide sequences are integrated in a transposition integration event, a homologous recombination integration event or a transposition integration event and a homologous recombination integration event. In some embodiments, a transposition integration event includes transposition of an operon comprising two or more of the polynucleotide
subsequences and/or promoters described above. In certain embodiments, a homologous recombination integration event includes homologous recombination of an operon comprising two or more of the polynucleotide subsequences and/or promoters described above. In certain embodiments, provided are methods for producing xylulose and/or ethanol using an engineered microbe described above, which comprise contacting the engineered microbe with a medium (e.g., feedstock) under conditions in which the microbe synthesizes xylulose and/or ethanol. In some embodiments, the engineered microbe synthesizes xylulose and/or ethanol to about 85% to about 99% of theoretical yield (e.g., about 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% of theoretical xylulose and/or ethanol yield). In some embodiments, the medium (e.g., feedstock) contains a six-carbon sugar (e.g., hexose, glucose) and/or a five-carbon sugar (e.g., pentose, xylose). In certain embodiments, the ethanol is separated and/or recovered from the engineered microorganism.
Also provided herein is a polypeptide comprising: (i) the amino acid sequence of SEQ ID NO: 180, or (ii) an amino acid sequence that includes 1 to 10 amino acid substitutions, insertions or deletions with respect to (i), which polypeptide has a xylose isomerase activity. In some embodiments, the polypeptide is an isolated chimeric xylose isomerase enzyme. Provided also herein is an isolated polynucleotide selected from the group consisting of: (i) a polynucleotide comprising a nucleotide sequence that encodes a polypeptide sequence of SEQ ID NO: 180; and (ii) a polynucleotide comprising a nucleotide sequence that encodes a polypeptide sequence that includes 1 to 10 amino acid substitutions, insertions or deletions with respect to SEQ ID NO: 180, which
polypeptide has a xylose isomerase activity. In certain embodiments, the amino acid sequence (e.g., polypeptide sequence) includes 1 to 10 conservative amino acid substitutions with respect to SEQ ID NO: 180. In some embodiments, the 1 to 10 amino acid substitutions, insertions or deletions do not substantially reduce or inhibit the activity of the polypeptide comprising SEQ ID NO: 180. In certain embodiments, the nucleotide sequence is 90% or more identical to SEQ ID NO: 179 and encodes a polypeptide sequence of SEQ ID NO: 180 or a polypeptide sequence that includes 1 to 10 amino acid substitutions, insertions or deletions with respect to SEQ ID NO: 180. Also provided herein is an engineered yeast comprising a chimeric enzyme which enzyme comprises a polynucleotide comprising a nucleotide sequence of the foregoing. Provided also herein is a method for producing ethanol, comprising: (a) providing an engineered yeast of the foregoing; and (b) contacting the engineered yeast with a 5 carbon sugar, a 6 carbon sugar or mixture comprising 5 carbon and 6 carbon sugars, under fermentation conditions whereby ethanol is produced by the engineered yeast. In some embodiments, the engineered yeast is a
Saccharomyces yeast. In certain embodiments, the yeast is a Saccharomyces cerevisiae yeast. In certain embodiments, the engineered yeast synthesizes ethanol to about 85% to about 99% of theoretical yield. In some embodiments, the method comprises recovering ethanol synthesized by the engineered yeast. In certain embodiments the engineered yeast comprises between about a 1-fold to about a 100-fold increase in ethanol production when compared to wild-type, parental, or partially engineered organisms of the same strain, under identical fermentation conditions.
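As a non-limiting illustration of how a percent-of-theoretical-yield figure can be computed, the following Python sketch assumes the standard fermentation stoichiometry in which one mole of glucose yields at most two moles of ethanol (about 0.51 g ethanol per g glucose). The function names, the 20 g glucose input and the 9.2 g ethanol measurement are hypothetical and chosen only for illustration; they are not data from the Examples herein.

```python
# Standard molar masses in g/mol.
GLUCOSE_G_PER_MOL = 180.16
ETHANOL_G_PER_MOL = 46.07

# Assumption: glucose -> 2 ethanol + 2 CO2 (maximum of 2 mol ethanol per mol glucose).
ETHANOL_MOL_PER_GLUCOSE_MOL = 2.0


def theoretical_ethanol_g(glucose_g):
    """Maximum ethanol mass (g) obtainable from glucose_g grams of glucose."""
    glucose_mol = glucose_g / GLUCOSE_G_PER_MOL
    return glucose_mol * ETHANOL_MOL_PER_GLUCOSE_MOL * ETHANOL_G_PER_MOL


def percent_of_theoretical(ethanol_g, glucose_g):
    """Measured ethanol expressed as a percentage of the theoretical maximum."""
    return 100.0 * ethanol_g / theoretical_ethanol_g(glucose_g)


if __name__ == "__main__":
    glucose_consumed_g = 20.0   # hypothetical: 1 L of medium containing 2% dextrose
    ethanol_measured_g = 9.2    # hypothetical measurement
    print(round(theoretical_ethanol_g(glucose_consumed_g), 2))
    print(round(percent_of_theoretical(ethanol_measured_g, glucose_consumed_g), 1))
```

Under these assumptions, a hypothetical fermentation of this kind would fall within the range of about 85% to about 99% of theoretical yield referred to above. A five-carbon sugar such as xylose would require its own molar mass and stoichiometric factor.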
Provided also herein is an isolated nucleic acid including a polynucleotide that is 80% or more identical to SEQ ID NO: 179. In some embodiments, the polynucleotide encodes an amino acid sequence of SEQ ID NO: 180. In certain embodiments, the nucleic acid includes a polynucleotide that is 85% or more identical to SEQ ID NO: 179. In some embodiments, the nucleic acid includes a polynucleotide that is 90% or more identical to SEQ ID NO: 179. In certain embodiments, the nucleic acid includes a polynucleotide that is 95% or more identical to SEQ ID NO: 179. In some embodiments, the nucleic acid includes the polynucleotide of SEQ ID NO: 179, and in certain embodiments, the nucleic acid consists of SEQ ID NO: 179. In some embodiments, the nucleic acid is an expression vector. Also provided herein is an engineered microorganism including a polynucleotide that is 80% or more identical to SEQ ID NO: 179. In some embodiments, the microorganism is a eukaryote. In certain embodiments, the microorganism is a yeast. In some embodiments, the yeast is a Saccharomyces yeast, and in certain embodiments, the yeast is a Saccharomyces cerevisiae yeast. Provided also herein is a method for producing ethanol, which includes contacting an engineered microorganism described herein with a 5 carbon sugar, a 6 carbon sugar or mixture including 5 carbon and 6 carbon sugars, under fermentation conditions whereby ethanol is produced by the engineered microorganism. In certain embodiments, the engineered microorganism synthesizes ethanol to about 85% to about 99% of theoretical yield. In some embodiments, the method includes recovering ethanol synthesized by the engineered microorganism. In certain embodiments the engineered microorganism includes between about a 2-fold to about a 100-fold increase in ethanol production when compared to wild-type, parental, or partially engineered microorganisms of the same strain, under identical fermentation conditions.
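By way of a non-limiting sketch, percent identity between a candidate polynucleotide and a reference sequence can be estimated as follows once the two sequences have been aligned. The sketch assumes that the sequences are already aligned to equal length with "-" marking gaps, and that identity is counted as matching positions divided by alignment columns; other conventions exist, and this is not necessarily the convention applied to SEQ ID NO: 179 herein. The function name and the short sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = matching non-gap positions / alignment columns,
    where columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    reference = "ATGCATGCAT"   # hypothetical reference fragment
    candidate = "ATGCATGCAA"   # hypothetical candidate with one mismatch
    print(percent_identity(reference, candidate))
```

A candidate scored this way could then be compared against a threshold such as 80%, 85%, 90% or 95%.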
Additional embodiments can be found in Example 63: Examples of the embodiments. Certain embodiments are described further in the following description, examples, claims and drawings.
Brief Description of the Drawings
The drawings illustrate embodiments of the technology and are not limiting. For clarity and ease of illustration, the drawings are not made to scale and, in some instances, various aspects may be shown exaggerated or enlarged to facilitate an understanding of particular embodiments.
FIG. 1 depicts a metabolic pathway that produces ethanol as a by-product of cellular respiration. The solid lines represent activities present in the Embden-Meyerhoff pathway (e.g., aerobic respiration). Dashed lines represent activities associated with the Entner-Doudoroff pathway (e.g., anaerobic respiration). One or both pathways often can be operational in a microorganism. The level of activity of each pathway can vary from organism to organism. The arrow from FBP (e.g., Fructose-1,6-bisphosphate, also referred to as F-1,6-BP) to G3P (e.g., glyceraldehyde-3-phosphate) illustrates wild type levels of conversion of FBP to two molecules of G3P. In the embodiments shown in FIGS. 2, 3 and 5 a smaller arrow from FBP to G3P is illustrated, indicating reduced or no conversion of FBP to G3P. The reduction in conversion of FBP to G3P illustrated in FIGS. 2, 3 and 5 is a result of the reduction or elimination of the previous activity that converts fructose-6-phosphate (F6P) to FBP (e.g., the activity of PFK).
FIG. 2 depicts an engineered metabolic pathway that can be used to produce ethanol more efficiently in a host microorganism in which the pathway has been engineered. The solid lines in FIGS. 2-5 represent the metabolic pathway naturally found in a host organism (e.g.,
Saccharomyces cerevisiae, for example). The dashed lines in FIGS. 2-5 represent a novel activity or pathway engineered into a microorganism to allow increased ethanol production efficiency. In FIG. 2 the activity of an enzyme in the Embden-Meyerhoff pathway, phosphofructokinase (e.g., PFK) is permanently or temporarily reduced or eliminated. The inactivation is shown as the "X" in FIG. 2. Disruption of the activity of PFK serves to inactivate the Embden-Meyerhoff pathway (EM pathway). To allow cells to survive with a non-functional PFK, two activities from the Entner-Doudoroff pathway (ED pathway) have been introduced into a host organism engineered with the reduced or non-functional EM pathway. The introduced activities allow survival with an inactivated EM pathway in addition to increased efficiency of ethanol production.
FIG. 3 depicts an engineered metabolic pathway that can be used to produce ethanol using xylose as a carbon source by introducing the activity into a microorganism. The engineered
microorganism can convert xylose to xylulose in a single reaction using the introduced xylose isomerase activity. Xylose also can be metabolized by the combined activities of xylose reductase, and xylitol dehydrogenase, as depicted in FIG. 20. Xylulose then can be fermented to ethanol by entering the EM pathway. Engineered microorganisms also can use the increased efficiency of ethanol production associated with inactivation of the EM pathway and introduction of activities of the ED pathway, shown in FIG. 2 and discussed below. The ability to utilize xylose efficiently (e.g., concurrently with six-carbon sugars or prior to the depletion of six-carbon sugars) can be provided by the introduction of the novel activities, xylose isomerase, or xylose reductase and xylitol dehydrogenase.
FIG. 4 depicts an engineered metabolic pathway that can be used to increase the efficiency of ethanol production (and other products) by introducing the ability to fix atmospheric carbon dioxide into a microorganism. The engineered microorganism can incorporate or fix atmospheric carbon dioxide into organic molecules using the introduced phosphoenolpyruvate carboxylase activity. Carbon dioxide incorporated in this manner can be used as an additional carbon source that can increase production of many organic molecules, including ethanol. Non-limiting examples of other products whose production can benefit from carbon fixation include pyruvate, oxaloacetate, glyceraldehyde-3-phosphate and the like. The pathway depicted in FIG. 4 illustrates the introduction of the novel carbon dioxide fixation activity in the background of a fully functional EM pathway, and an introduced ED pathway. It is understood that the introduction of the carbon fixation activity can benefit microorganisms that have no other modifications to any metabolic pathways. It also is understood that microorganisms modified in one, or multiple, other metabolic pathways can benefit from the introduction of a carbon fixation activity. FIG. 5 shows a combination of some engineered metabolic pathways described herein. The combination of engineered metabolic pathways shown in FIG. 5 can provide significant increases in the production of ethanol (or other products) when compared to the wild type organism or organisms lacking one, two, three or more of the modifications. Other combinations of engineered metabolic pathways not shown in FIG. 5 are possible, including but not limited to, combinations including increased alcohol tolerance, modified alcohol dehydrogenase 2 activity and/or modified thymidylate synthase activity, as described herein. Therefore, FIG. 5 also illustrates an
embodiment of a method for generating an engineered microorganism with the ability to produce a greater amount of target product comprising expressing one or more genetically modified activities, described herein, in a host organism that produces the desired target (e.g., ethanol, pyruvate, oxaloacetate and the like, for example) via one or more metabolic pathways. In some
embodiments, the combination of metabolic pathways includes those depicted in FIG. 5 in addition to combinations including one, two or three of the following activities: increased alcohol tolerance, modified alcohol dehydrogenase 2 activity and modified thymidylate synthase activity.
FIG. 6 shows DNA and amino acid sequence alignments for the nucleotide sequences of EDA (FIG. 6A, 6B) and EDD (FIG. 6C, 6D) genes from Zymomonas mobilis (native and optimized) and Escherichia coli. FIG. 7 shows a representative western blot used to detect the presence of an enzyme associated with an activity described herein.
FIG. 8 illustrates schematic representations of native, modified and chimeric xylose isomerase genes. FIG. 9 shows a representative Western blot used to detect gene products.
FIG. 10 graphically illustrates a comparison of specific activities of engineered mutant xylose isomerase enzymes. Results are presented as percent activity over wild type (WT) activity.
Experimental details and results of the kinetic assays are presented in Example 12.
FIG. 11 illustrates comparative growth analysis results of yeast strains carrying vector only or a vector containing native Ruminococcus xylose isomerase, grown on media containing xylose. Experimental details and results of the growth assays are described in Example 13.
FIG. 12 illustrates comparative growth analysis results and measurement of ethanol production in yeast strains carrying vector only or a vector containing native Ruminococcus xylose isomerase. Growth of cells is shown by the lines connected by "diamonds" (vector with xylose isomerase) or "squares" (vector only). Ethanol production is shown by the lines connected by "x's" (vector with xylose isomerase) or "circles" (vector only). Experimental conditions and results are described in Example 13. FIGS. 13A and 13B show representative Western blots used to detect levels of various exogenous EDD and EDA gene combinations expressed in a host organism. Experimental conditions and results are described in Example 17. FIG. 14 graphically displays the relative activities of the various EDD/EDA combinations generated as described in Example 18.
FIG. 15 graphically represents the fermentation efficiency of engineered yeast strains carrying exogenous EDD/EDA gene combinations. Vector = p426GPD/p425GPD; EE = EDD-E.coli/EDA-E.coli; EP = EDD-E.coli/EDA-PA01; PE = EDD-PA01/EDA-E.coli; PP = EDD-PA01/EDA-PA01. Experimental conditions and results are described in Example 19. FIGS. 16A and 16B graphically illustrate fermentation data (e.g., cell growth, glucose usage and ethanol production) for engineered yeast strains generated as described herein. FIG. 16A illustrates the fermentation data for engineered strain BF428 (BY4742 with vector controls), and FIG. 16B illustrates the fermentation data for engineered strain BF591 (BY4742 with EDD-PA01/EDA-PA01).
Experimental conditions and results are described in Example 20.
FIGS. 17A and 17B graphically illustrate fermentation data for engineered yeast strains described herein. FIG. 17A illustrates the fermentation data for engineered strain BF738 (BY4742 tall with vector controls p426GPD and p425GPD). FIG. 17B illustrates the fermentation data for engineered strain BF741 (BY4742 tall with plasmids pBF290 (EDD-PA01) and pBF292 (EDA-PA01)). Experimental conditions and results are described in Example 21.
FIGS. 18A and 18B graphically illustrate fermentation data for engineered yeast strains as described herein. FIG. 18A illustrates the fermentation data for BF740 grown on 2% dextrose, and FIG. 18B illustrates the fermentation data for BF743 grown on 2% dextrose. Strain descriptions, experimental conditions and results are described in Example 22. FIG. 19 graphically illustrates the results of coupled assay kinetics for single plasmid and two plasmid edd/eda expression vector systems. Vector construction and experimental conditions are described in Example 24.
FIG. 20 depicts an engineered metabolic pathway that can be used to produce ethanol using xylose as a carbon source by introducing the activity into a microorganism. The engineered microorganism can convert xylose to xylulose by the activities of xylose reductase and xylitol dehydrogenase. Xylose also can be metabolized by the combined activities of xylose reductase, and xylitol dehydrogenase, as depicted in FIG. 20. Xylulose then can be fermented to ethanol by entering the EM pathway. Engineered microorganisms also can use the increased efficiency of ethanol production associated with inactivation of the EM pathway and introduction of activities of the ED pathway, shown in FIG. 2 and discussed below. The ability to utilize xylose efficiently (e.g., concurrently with six-carbon sugars or prior to the depletion of six-carbon sugars) can be provided by the introduction of the novel activities, xylose isomerase, or xylose reductase and xylitol dehydrogenase.
FIG. 21 graphically illustrates the results of xylose isomerase chimera generated with various 5' edge sequences. Experimental methods and results are described in Example 28. FIG. 22 shows the results of western blots performed on xylose isomerase chimera generated with various 5' edge sequences. Experimental methods and results are described in Example 28.
FIG. 23 shows a western blot of E. coli crude extract illustrating the presence of the EDD protein at the expected size. Lane 1 is a standard size ladder (Novex Sharp standard), Lane 2 is 1 μg BF1055 cell lysate, Lane 3 is 10 μg BF1055 cell lysate, Lane 4 is 1.5 μg BF1706 cell lysate, Lane 5 is 15 μg BF1706 cell lysate. Experimental methods and results are described in Example 34. FIG. 24 graphically illustrates the results of activity evaluations of EDA genes expressed in yeast.
Experimental methods and results are described in Example 34. FIG. 25 graphically illustrates the specific activity of various xylose isomerase candidate activities. Experimental methods and results are described in Example 41.
FIG. 26 presents a table summarizing the results of fermentations carried out, in various media, to determine differences in the ending parameters of fermentations comparing yCH1 and yCH24. Fermentation parameters summarized in the table are initial and final pH, initial and final OD600, amount of ethanol and glycerol produced. Experimental conditions and results are explained in Example 45.
FIG. 27 graphically illustrates the results of larger scale (e.g., Multifor) fermentations for strains yCH153, yCH137 and yCH208 grown in YPD with 8% glucose. Experimental conditions and results are explained in Example 46.
FIG. 28 graphically illustrates the results of experiments performed to evaluate the effect of addition of magnesium (Mg) or manganese (Mn) to the EDA/EDD in vitro evaluation assay described herein. The results are presented in tabular form in Example 46. FIG. 29 graphically shows the relative activities of native and codon optimized EDA proteins expressed in S. cerevisiae. FIG. 30 graphically illustrates the results of experiments performed to identify the most active EDD genes when expressed in S. cerevisiae. Experimental conditions and results and explanation of abbreviations and symbols are described in Example 46. FIG. 31 A-D diagrammatically illustrates nucleic acid constructs generated for engineering yeast strains described herein. The plasmid constructs are described in Example 48.
FIG. 32 illustrates the results of PCR analysis to confirm deletion of the PFK1 locus. Arrows indicate isolates that show the expected migration pattern of amplification products. Experimental conditions and results are described in Example 49.
FIG. 33 graphically illustrates the results of shake flask fermentations for strains yCH153, yCH137 and yCH208 grown in YPD with 8% glucose. The results represent the average of 8 samples/strain. FIG. 34 graphically illustrates the results of Multifor-based fermentations of strains yCH153 vs yCH208 in YPD with 8% glucose. Experimental conditions and results are described in Example 52.
FIG. 35 graphically illustrates the results of shake flask fermentations for strains yCH1 and yCH247 in UMM media. FIG. 36 graphically illustrates the results of Multifor fermentations for strains yCH1 and yCH247 in UMM media. FIG. 37 graphically illustrates the results of Multifor fermentations for strains yCH1 and yCH247 in UMM media. FIG. 38 graphically illustrates the results of Multifor fermentations for strains yCH1 and yCH247 in YPD media. Experimental conditions and results are described in Example 49. FIG. 39 shows the results of PCR analysis of yCH247 and descendants of yCH247. FIG. 40 graphically illustrates the results of evaluation of EDD/EDA activity in yCH1, yCH247 and the 10 descendants from the stability study presented in FIG. 39. 75 μg of crude cell lysate was assayed. FIG. 41 graphically illustrates the results of growth experiments performed on yCH1 derived strains with and without deletions of PFK1 activity under aerobic and anaerobic conditions. Experimental conditions and results are described in Example 49.
FIG. 42 graphically illustrates the results of growth experiments performed on BF903 derived strains with and without deletions of PFK1 activity under aerobic and anaerobic conditions. FIG. 43 graphically illustrates the results of shake flask fermentations for strains BF903 and BF2100 in UMM media. FIG. 44 graphically illustrates the results of shake flask fermentations for strains BF903 and BF2100 in YPD media. Experimental conditions and results are described in Example 50. FIG. 45 diagrammatically illustrates an integration construct for inserting an alternative fungal derived xylose metabolism pathway into engineered yeast strains. The pathway and approach are described herein. Construct details are given in Example 56.
FIG. 46 graphically illustrates results of fermentations performed using strain BF3319. Glucose and xylose consumption and ethanol production are shown. Experimental details and results are given in Example 62.
FIGS. 47A and 47B are nucleotide sequence alignments of native and codon optimized
Ruminococcus FD-1 xylose isomerase nucleotide sequences (e.g., labeled FD-1 , 1 , 2, 3 and 4, respectively) generated from the Ruminococcus FD-1 xylose isomerase amino acid sequence.
Detailed Description
Ethanol is a two-carbon, straight-chain, primary alcohol that can be produced from fermentation (e.g., cellular respiration processes) or as a by-product of petroleum refining. Ethanol has widespread use in medicine, consumables, and in industrial processes where it often is used as an essential solvent and a precursor, or feedstock, for the synthesis of other products (e.g., ethyl halides, ethyl esters, diethyl ether, acetic acid, ethyl amines and to a lesser extent butadiene, for example). The largest use of ethanol, worldwide, is as a motor fuel and fuel additive. Greater than 90% of the cars produced worldwide can run efficiently on hydrous ethanol (e.g., 95% ethanol and 5% water). Ethanol also is commonly used for production of heat and light.
World production of ethanol exceeds 50 gigaliters (e.g., 1.3 × 10^10 US gallons), with 69% of the world supply coming from Brazil and the United States. The United States fuel ethanol industry is based largely on corn biomass. The use of corn biomass for ethanol production may not yield a positive net energy gain, and further has the potential of diverting land that could be used for food production into ethanol production. It is possible that cellulosic crops may displace corn as the main fuel crop for producing bio-ethanol. Non-limiting examples of cellulosic crops and waste materials include switchgrass and wood pulp waste from paper production and wood milling industries.
Biomass produced in the paper pulping and wood milling industries contains both five-carbon and six-carbon sugars. Use of this wasted biomass could allow production of significant amounts of bio-fuels and products, while reducing the use of land that could be used for food production. Predominant forms of sugars in the biomass produced in wood and paper pulping and wood milling industries are glucose and xylose. Provided herein are methods for producing ethanol, ethanol derivatives and/or conjugates and other organic chemical intermediates (e.g., pyruvate, acetaldehyde, glyceraldehyde-3-phosphate, and the like) using biological systems. Such production systems may have significantly less environmental impact and could be economically competitive with current manufacturing systems. Thus, provided herein are methods for manufacturing ethanol and other organic chemical intermediates by engineered microorganisms. In some embodiments microorganisms are engineered to contain at least one heterologous gene encoding an enzyme, where the enzyme is a member of a novel pathway engineered into the microorganism. In certain embodiments, an organism may be selected for elevated activity of a native enzyme. Genetically engineered microorganisms described herein produce organic molecules for industrial uses. The organisms are designed to be "feedstock flexible" in that they can use five-carbon sugars (e.g., pentose sugars such as xylose, for example), six-carbon sugars (e.g., hexose sugars such as glucose or fructose, for example) or both as carbon sources. Further, the organisms described herein have been designed to be highly efficient in their use of hexose sugars to produce desired organic molecules. To that end, the microorganisms described herein are
"pathway flexible" such that the microorganisms are able to direct hexose sugars primarily to either (i) the traditional glycolysis pathway (the Embden-Meyerhoff pathway) thereby generating ATP energy for cell growth and division at certain times, or (ii) a separate glycolytic pathway (the Entner-Doudoroff pathway) thereby producing significant levels of pyruvic acid, a key 3-carbon intermediate for producing many desired industrial organic molecules.
Pathway selection in the microorganism can be directed via one or more environmental switches such as a temperature change, oxygen level change, addition or subtraction of a component of the culture medium, or combinations thereof. The metabolic pathway flexibility of microorganisms described herein allows the microorganisms to efficiently use hexose sugars, which ultimately can lead to microorganisms capable of producing a greater amount of industrial chemical product per gram of feedstock as compared with conventional microorganisms (e.g., the organism from which the engineered organism was generated, for example). In some embodiments, the metabolic pathway flexibility of the engineered microorganisms described herein is generated by adding or increasing metabolic activities associated with the Entner-Doudoroff pathway. In certain embodiments the metabolic activities added are phosphogluconate dehydratase (e.g., EDD gene), 2-keto-3-deoxygluconate-6-phosphate aldolase (e.g., EDA gene) or both. A number of industrially useful microorganisms (e.g., microorganisms used in fermentation processes, yeast for example) metabolize xylose inefficiently or are incapable of metabolizing xylose. Many organisms that can metabolize xylose do so only after all glucose and/or other six-carbon sugars have been depleted. The microorganisms described herein have been engineered to efficiently utilize five-carbon sugars (e.g., xylose, for example) as an alternative or additional source of carbon, concurrently with and/or prior to six-carbon sugar usage, by the incorporation of a heterologous nucleic acid (e.g., gene) encoding a xylose isomerase, in some embodiments, and in certain embodiments, by the incorporation of a heterologous nucleic acid encoding a xylose reductase and a xylitol dehydrogenase. Xylose isomerase converts the five-carbon sugar xylose to xylulose, in some embodiments. In certain embodiments, xylose reductase and xylitol
dehydrogenase convert xylose to xylulose. Xylulose can ultimately be converted to pyruvic acid or to ethanol through metabolism via the Embden-Meyerhoff or Entner-Doudoroff pathways.
Many non-photosynthetic organisms are not capable of incorporating inorganic atmospheric carbon into organic carbon compounds, via carbon fixation pathways, to any appreciable degree, or at all. Often, microorganisms used in industrial fermentation processes also are incapable of significant carbon fixation. The ability to incorporate atmospheric carbon dioxide, or carbon dioxide waste from respiration in fermentation processes, can increase the amount of industrial chemical product produced per gram of feedstock, in certain embodiments. Thus, the microorganisms described herein also can be modified to add or increase the ability to incorporate carbon from carbon dioxide into industrial chemical products, in some embodiments. In certain embodiments, the
microorganisms described herein are engineered to express enzymes such as
phosphoenolpyruvate carboxylase ("PEP" carboxylase) and/or ribulose 1,5-bis-phosphate carboxylase ("Rubisco"), thus allowing the use of carbon dioxide as an additional source of carbon. A particularly useful industrial chemical product produced by fermentation is ethanol. Ethanol is an end product of cellular respiration and is produced from acetaldehyde by an alcohol
dehydrogenase activity (e.g., by an enzyme like alcohol dehydrogenase 1 or ADH1, for example). However, ethanol can readily be converted back to acetaldehyde by the action of the enzyme alcohol dehydrogenase 2 (e.g., ADH2), thus lowering the yield of ethanol produced. In some embodiments, microorganisms described herein are modified to reduce or eliminate the activity of ADH2, to allow increased yields of ethanol. In certain embodiments, the engineered
microorganisms described herein also are modified to have a higher tolerance to alcohol, thus enabling even higher yields of alcohol as a fermentation product without inhibition of cellular processes due to increased levels of alcohol in the growth medium.
Microorganisms
A microorganism selected often is suitable for genetic manipulation and often can be cultured at cell densities useful for industrial production of a target product. A microorganism selected often can be maintained in a fermentation device.
The term "engineered microorganism" as used herein refers to a modified microorganism that includes one or more activities distinct from an activity present in a microorganism utilized as a starting point (hereafter a "host microorganism"). An engineered microorganism includes a heterologous polynucleotide in some embodiments, and in certain embodiments, an engineered organism has been subjected to selective conditions that alter an activity, or introduce an activity, relative to the host microorganism. Thus, an engineered microorganism has been altered directly or indirectly by a human being. A host microorganism sometimes is a native microorganism, and at times is a microorganism that has been engineered to a certain point.
In some embodiments an engineered microorganism is a single cell organism, often capable of dividing and proliferating. A microorganism can include one or more of the following features: aerobe, anaerobe, filamentous, non-filamentous, monoploid, diploid, auxotrophic and/or non-auxotrophic. In certain embodiments, an engineered microorganism is a prokaryotic
microorganism (e.g., bacterium), and in certain embodiments, an engineered microorganism is a non-prokaryotic microorganism. In some embodiments, an engineered microorganism is a eukaryotic microorganism (e.g., yeast, fungi, amoeba). Any suitable yeast may be selected as a host microorganism, engineered microorganism or source for a heterologous polynucleotide. Yeast include, but are not limited to, Yarrowia yeast (e.g., Y. lipolytica (formerly classified as Candida lipolytica)), Candida yeast (e.g., C. revkaufi, C.
pulcherrima, C. tropicalis, C. utilis), Rhodotorula yeast (e.g., R. glutinus, R. graminis),
Rhodosporidium yeast (e.g., R. toruloides), Saccharomyces yeast (e.g., S. cerevisiae, S. bayanus, S. pastorianus, S. carlsbergensis), Cryptococcus yeast, Trichosporon yeast (e.g., T. pullans, T. cutaneum), Pichia yeast (e.g., P. pastoris) and Lipomyces yeast (e.g., L. starkeyii, L. lipoferus). In some embodiments, a yeast is a S. cerevisiae strain including, but not limited to,
YGR240CBY4742 (ATCC accession number 4015893) and BY4742 (ATCC accession number 201389). In some embodiments, a yeast is a Y. lipolytica strain that includes, but is not limited to, ATCC20362, ATCC8862, ATCC18944, ATCC20228, ATCC76982 and LGAM S(7)1 strains (Papanikolaou S., and Aggelis G., Bioresour. Technol. 82(1):43-9 (2002)). In certain
embodiments, a yeast is a C. tropicalis strain that includes, but is not limited to, ATCC20336, ATCC20913, SU-2 (ura3-/ura3-), ATCC20962, H5343 (beta oxidation blocked; US Patent No. 5648247) strains.
Any suitable fungus may be selected as a host microorganism, engineered microorganism or source for a heterologous polynucleotide. Non-limiting examples of fungi include Aspergillus fungi (e.g., A. parasiticus, A. nidulans), Thraustochytrium fungi, Schizochytrium fungi, Rhizopus fungi (e.g., R. arrhizus, R. oryzae, R. nigricans), Orpinomyces fungi and Piromyces fungi. In some embodiments, a fungus is an A. parasiticus strain that includes, but is not limited to, strain ATCC24690, and in certain embodiments, a fungus is an A. nidulans strain that includes, but is not limited to, strain ATCC38163.
Any suitable prokaryote may be selected as a host microorganism, engineered microorganism or source for a heterologous polynucleotide. Gram negative or Gram positive bacteria may be selected. Examples of bacteria include, but are not limited to, Bacillus bacteria (e.g., B. subtilis, B. megaterium, B. stearothermophilus), Bacteroides bacteria (e.g., Bacteroides uniformis, Bacteroides thetaiotaomicron), Clostridium bacteria (e.g., C. phytofermentans, C. thermohydrosulfuricum, C. cellulyticum (H10)), Acinetobacter bacteria, Nocardia bacteria, Lactobacillus bacteria (e.g., Lactobacillus pentosus), Xanthobacter bacteria, Escherichia bacteria (e.g., E. coli (e.g., strains DH10B, Stbl2, DH5-alpha, DB3, DB3.1), DB4, DB5, JDP682 and ccdA-over (e.g., U.S. Application No. 09/518,188))), Streptomyces bacteria (e.g., Streptomyces rubiginosus, Streptomyces murinus), Erwinia bacteria, Klebsiella bacteria, Serratia bacteria (e.g., S. marcescens), Pseudomonas bacteria (e.g., P. aeruginosa), Salmonella bacteria (e.g., S. typhimurium, S. typhi), Thermus bacteria (e.g., Thermus thermophilus), Thermotoga bacteria (e.g., Thermotoga maritima, Thermotoga neapolitana), and Ruminococcus bacteria (e.g., Ruminococcus environmental samples, Ruminococcus albus, Ruminococcus bromii, Ruminococcus callidus, Ruminococcus flavefaciens, Ruminococcus gauvreauii, Ruminococcus gnavus, Ruminococcus lactaris, Ruminococcus obeum, Ruminococcus sp., Ruminococcus sp. 14531, Ruminococcus sp. 15975, Ruminococcus sp. 16442, Ruminococcus sp. 18P13, Ruminococcus sp. 25F6, Ruminococcus sp. 25F7, Ruminococcus sp. 25F8, Ruminococcus sp. 4_1_47FAA, Ruminococcus sp. 5, Ruminococcus sp. 5_1_39BFAA, Ruminococcus sp. 7L75, Ruminococcus sp. 8_1_37FAA, Ruminococcus sp. 9SE51, Ruminococcus sp. C36, Ruminococcus sp. CB10, Ruminococcus sp. CB3, Ruminococcus sp. CCUG 37327 A, Ruminococcus sp. CE2, Ruminococcus sp. CJ60, Ruminococcus sp. CJ63, Ruminococcus sp. C01, Ruminococcus sp. C012, Ruminococcus sp. C022, Ruminococcus sp. C027, Ruminococcus sp. C028, Ruminococcus sp. C034, Ruminococcus sp. C041, Ruminococcus sp. C047, Ruminococcus sp. C07, Ruminococcus sp. CS1, Ruminococcus sp. CS6, Ruminococcus sp. DJF_VR52, Ruminococcus sp. DJF_VR66, Ruminococcus sp. DJF_VR67, Ruminococcus sp. DJF_VR70k1, Ruminococcus sp. DJF_VR87, Ruminococcus sp. Eg2, Ruminococcus sp. Egf, Ruminococcus sp. END-1, Ruminococcus sp. FD1, Ruminococcus sp. GM2/1, Ruminococcus sp. ID1, Ruminococcus sp. ID8, Ruminococcus sp. K-1, Ruminococcus sp. KKA Seq234, Ruminococcus sp. M-1, Ruminococcus sp. M10, Ruminococcus sp. 22, Ruminococcus sp. M23, Ruminococcus sp. M6, Ruminococcus sp. M73, Ruminococcus sp. M76, Ruminococcus sp. MLG080-3, Ruminococcus sp. NML 00-0124, Ruminococcus sp. Pei041, Ruminococcus sp. SC101, Ruminococcus sp. SC103, Ruminococcus sp. Siijpesteijn 1948, Ruminococcus sp. WAL 17306, Ruminococcus sp. YE281, Ruminococcus sp. YE58, Ruminococcus sp. YE71, Ruminococcus sp. ZS2-15, Ruminococcus torques). Bacteria also include, but are not limited to, photosynthetic bacteria (e.g., green non-sulfur bacteria (e.g., Chloroflexus bacteria (e.g., C. aurantiacus), Chloronema bacteria (e.g., C. giganteum)), green sulfur bacteria (e.g., Chlorobium bacteria (e.g., C. limicola), Pelodictyon bacteria (e.g., P. luteolum)), purple sulfur bacteria (e.g., Chromatium bacteria (e.g., C. okenii)), and purple non-sulfur bacteria (e.g., Rhodospirillum bacteria (e.g., R. rubrum), Rhodobacter bacteria (e.g., R. sphaeroides, R. capsulatus), and Rhodomicrobium bacteria (e.g., R. vannielii))).
Cells from non-microbial organisms can be utilized as a host microorganism, engineered microorganism or source for a heterologous polynucleotide. Examples of such cells include, but are not limited to, insect cells (e.g., Drosophila (e.g., D. melanogaster), Spodoptera (e.g., S. frugiperda Sf9 or Sf21 cells) and Trichoplusia (e.g., High-Five cells)); nematode cells (e.g., C. elegans cells); avian cells; amphibian cells (e.g., Xenopus laevis cells); reptilian cells; and mammalian cells (e.g., NIH3T3, 293, CHO, COS, VERO, C127, BHK, Per-C6, Bowes melanoma and HeLa cells). Microorganisms or cells used as host organisms or source for a heterologous polynucleotide are commercially available. Microorganisms and cells described herein, and other suitable microorganisms and cells are available, for example, from Invitrogen Corporation (Carlsbad, CA), American Type Culture Collection (Manassas, Virginia), and Agricultural Research Culture Collection (NRRL; Peoria, Illinois).
Host microorganisms and engineered microorganisms may be provided in any suitable form. For example, such microorganisms may be provided in liquid culture or solid culture (e.g., agar-based medium), which may be a primary culture or may have been passaged (e.g., diluted and cultured) one or more times. Microorganisms also may be provided in frozen form or dry form (e.g., lyophilized). Microorganisms may be provided at any suitable concentration.
Six-Carbon Sugar Metabolism and Activities
Six-carbon or hexose sugars can be metabolized using one of two pathways in many organisms. One pathway, the Embden-Meyerhoff pathway (EM pathway), operates primarily under aerobic (e.g., oxygen rich) conditions. The other pathway, the Entner-Doudoroff pathway (ED pathway), operates primarily under anaerobic (e.g., oxygen poor) conditions, producing pyruvate that can be converted to lactic acid. Lactic acid can be further metabolized upon a return to appropriate conditions. The EM pathway produces two ATP for each six-carbon sugar metabolized, as compared to one ATP produced for each six-carbon sugar metabolized in the ED pathway. Thus the ED pathway yields ethanol more efficiently than the EM pathway with respect to a given amount of input carbon, as seen by the lower net energy yield. However, yeast preferentially use the EM pathway for metabolism of six-carbon sugars, thereby preferentially using the pathway that yields more energy and less desired product.
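As a rough illustration of the energy comparison above, the following Python sketch tabulates net ATP per six-carbon sugar using only the two figures stated in the text (two ATP via the EM pathway, one ATP via the ED pathway). The names `ATP_PER_HEXOSE`, `net_atp` and `atp_difference` are hypothetical and are introduced here only for illustration.

```python
# Net ATP formed per six-carbon sugar, as stated in the text above.
ATP_PER_HEXOSE = {"EM": 2, "ED": 1}


def net_atp(pathway: str, hexose_units: float) -> float:
    """Return net ATP formed when `hexose_units` of sugar pass through `pathway`."""
    return ATP_PER_HEXOSE[pathway] * hexose_units


def atp_difference(hexose_units: float) -> float:
    """Return how much less ATP the ED pathway yields than the EM pathway."""
    return net_atp("EM", hexose_units) - net_atp("ED", hexose_units)


if __name__ == "__main__":
    for name in ("EM", "ED"):
        print(name, net_atp(name, 100))
    print("difference", atp_difference(100))
```

Under these figures, routing the same amount of sugar through the ED pathway leaves half as much net ATP to the cell as the EM pathway does, which is the lower net energy yield the text refers to.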
The following steps and enzymatic activities metabolize six-carbon sugars via the EM pathway. Six-carbon sugars (glucose, sucrose, fructose, hexose and the like) are converted to glucose-6-phosphate by hexokinase or glucokinase (e.g., HXK or GLK, respectively). Glucose-6-phosphate can be converted to fructose-6-phosphate by phosphoglucoisomerase (e.g., PGI). Fructose-6-phosphate can be converted to fructose-1,6-bisphosphate by phosphofructokinase (e.g., PFK). Fructose-1,6-bisphosphate (F1,6BP) represents a key intermediate in the metabolism of six-carbon sugars, as the next enzymatic reaction converts the six-carbon sugar into two three-carbon sugars. The reaction is catalyzed by fructose bisphosphate aldolase and yields a mixture of dihydroxyacetone phosphate (DHAP) and glyceraldehyde-3-phosphate (G-3-P). The mixture of the two three-carbon sugars is preferentially converted to glyceraldehyde-3-phosphate by the action of triosephosphate isomerase. G-3-P is converted to 1,3-diphosphoglycerate (1,3-DPG) by glyceraldehyde-3-phosphate dehydrogenase (GLD). 1,3-DPG is converted to 3-phosphoglycerate (3-P-G) by phosphoglycerate kinase (PGK). 3-P-G is converted to 2-phosphoglycerate (2-P-G) by phosphoglycerate mutase (GPM). 2-P-G is converted to phosphoenolpyruvate (PEP) by enolase (ENO). PEP is converted to pyruvate (PYR) by pyruvate kinase (PYK). PYR is converted to acetaldehyde by pyruvate decarboxylase (PDC). Acetaldehyde is converted to ethanol by alcohol dehydrogenase 1 (ADH1).
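The sequence of EM pathway steps listed above can be sketched as an ordered table of conversions. The Python example below is only an illustrative data structure, not part of the described methods: the names `EM_STEPS` and `trace` are hypothetical, glucose is used as the representative starting sugar, and the aldolase and triosephosphate isomerase reactions are collapsed into a single step for simplicity.

```python
# (substrate, product, enzyme) for each EM pathway step named in the text.
# The aldolase/triosephosphate isomerase reactions are merged (a simplification).
EM_STEPS = [
    ("glucose", "glucose-6-phosphate", "HXK/GLK"),
    ("glucose-6-phosphate", "fructose-6-phosphate", "PGI"),
    ("fructose-6-phosphate", "fructose-1,6-bisphosphate", "PFK"),
    ("fructose-1,6-bisphosphate", "glyceraldehyde-3-phosphate", "aldolase/TPI"),
    ("glyceraldehyde-3-phosphate", "1,3-diphosphoglycerate", "GLD"),
    ("1,3-diphosphoglycerate", "3-phosphoglycerate", "PGK"),
    ("3-phosphoglycerate", "2-phosphoglycerate", "GPM"),
    ("2-phosphoglycerate", "phosphoenolpyruvate", "ENO"),
    ("phosphoenolpyruvate", "pyruvate", "PYK"),
    ("pyruvate", "acetaldehyde", "PDC"),
    ("acetaldehyde", "ethanol", "ADH1"),
]


def trace(start, steps, disabled=frozenset()):
    """Follow the chain from `start`, stopping at the first disabled enzyme.

    Returns the list of metabolites reached, in order.
    """
    next_step = {substrate: (product, enzyme) for substrate, product, enzyme in steps}
    path = [start]
    while path[-1] in next_step:
        product, enzyme = next_step[path[-1]]
        if enzyme in disabled:
            break  # the pathway is blocked at this activity
        path.append(product)
    return path


if __name__ == "__main__":
    print(trace("glucose", EM_STEPS))
    print(trace("glucose", EM_STEPS, disabled={"PFK"}))
```

With no activity disabled the trace ends at ethanol; disabling the hypothetical "PFK" entry stops the trace at fructose-6-phosphate, which mirrors the idea of controlling the pathway at a single activity.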
Many enzymes in the EM pathway are reversible. The enzymes in the EM pathway that are not reversible, and provide a useful activity with which to control six-carbon sugar metabolism, via the EM pathway, include, but are not limited to phosphofructokinase and alcohol dehydrogenase. In some embodiments, reducing or eliminating the activity of phosphofructokinase may inactivate the EM pathway. Engineering microorganisms with modified activities in PFK and/or ADH may yield increased product output as compared to organisms with the wild type activities, in certain embodiments. In some embodiments, modifying a reverse activity (e.g., the enzyme responsible for catalyzing the reverse activity of ADH, for example) may also yield an increase in product yield by reducing or eliminating the back conversion of products by the backwards reaction. The activity which catalyzes the conversion of ethanol to acetaldehyde is alcohol dehydrogenase 2 (ADH2). Reducing or eliminating the activity of ADH2 can increase the yield of ethanol per unit of carbon input due to the inactivation of the conversion of ethanol to acetaldehyde, in certain embodiments. In addition to enzyme activities that are not reversible, certain reversible activities also can be used to control six-carbon sugar metabolism via the EM pathway, in some embodiments. A non-limiting example of a reversible enzymatic activity that can be utilized to control six-carbon sugar metabolism includes phosphoglucose isomerase (PGI).
A microorganism may be engineered to include or regulate one or more activities in the Embden-Meyerhoff pathway, for example. In some embodiments, one or more of these activities may be altered such that the activity or activities can be increased or decreased according to a change in environmental conditions. In certain embodiments, one or more of the activities (e.g., PGI, PFK or ADH2) can be altered to allow regulated control and an alternative pathway for more efficient carbon metabolism can be provided (e.g., one or more activities from the ED pathway). An engineered organism with the EM pathway under regulatable control and a novel or enhanced ED pathway would be useful for producing significantly more ethanol or other end product from a given amount of input feedstock. The term "activity" as used herein refers to the functioning of a microorganism's natural or engineered biological pathways to yield various products including ethanol and its precursors. Ethanol (or other product) producing activity can be provided by any non-mammalian source in certain embodiments. Such sources include, without limitation, eukaryotes such as yeast and fungi and prokaryotes such as bacteria. In some embodiments, the activity of one or more (e.g., 2, 3, 4, 5 or more) pathway members in an EM pathway is reduced or removed to undetectable levels. An engineered microorganism may, in some embodiments, preferentially metabolize six-carbon sugars via the ED pathway as opposed to the EM pathway under certain conditions. Such engineered microorganisms may metabolize about 60% or more of the available six-carbon sugars via the ED pathway (e.g., about 62%, 64%, 66%, 68%, 70%, 72%, 74%, 76%, 78%, 80%, 82%, 84%, 86%, 88%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, or greater than any one of the foregoing), and such fraction of the available six-carbon sugars is not metabolized by the EM pathway, under certain conditions.
A microorganism may metabolize six-carbon sugars substantially via the ED pathway, and not the EM pathway, in certain embodiments (e.g., 99% or greater, or 100%, of the available six-carbon sugars are metabolized via the ED pathway). A six-carbon sugar is deemed as being metabolized via a particular pathway when the sugar is converted to end metabolites of the pathway, and not intermediate metabolites only, of the particular pathway. A microorganism may preferentially metabolize certain sugars under the ED pathway after a certain time after the microorganism is exposed to a certain set of conditions (e.g., there may be a time delay after a microorganism is exposed to a certain set of conditions before the microorganism preferentially metabolizes sugars by the ED pathway).
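The percentage thresholds given above (about 60% or more for preferential use of the ED pathway, and 99% or greater for substantial use) can be expressed as a small classification helper. The Python sketch below is hypothetical: the function name `classify_ed_usage`, its argument names and the returned labels are assumptions made for illustration, and it assumes the amount of sugar converted to ED pathway end metabolites has already been measured by some means not specified here.

```python
def classify_ed_usage(ed_end_metabolite_units: float, available_sugar_units: float) -> str:
    """Label ED pathway usage from the fraction of sugar converted to ED end metabolites.

    Thresholds follow the text: >= 99% is "substantially ED", >= 60% is
    "preferentially ED". The labels themselves are illustrative only.
    """
    if available_sugar_units <= 0:
        raise ValueError("available_sugar_units must be positive")
    fraction = ed_end_metabolite_units / available_sugar_units
    if fraction >= 0.99:
        return "substantially ED"
    if fraction >= 0.60:
        return "preferentially ED"
    return "not preferentially ED"


if __name__ == "__main__":
    for units in (100, 75, 40):
        print(units, classify_ed_usage(units, 100))
```

Only sugar converted to end metabolites is counted, consistent with the statement that a sugar is deemed metabolized via a pathway when it reaches the end metabolites and not intermediates only.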
Certain novel activities involved in the metabolism of six-carbon sugars by the ED pathway can be engineered into a desired yeast strain to increase the efficiency of ethanol (or other product) production. Yeast do not have an activity that converts 6-phosphogluconate to 2-keto-3-deoxy-6-p-gluconate or an activity that converts 2-keto-3-deoxy-6-p-gluconate to pyruvate. Addition of these activities to engineered yeast can allow the engineered microorganisms to increase fermentation efficiency by allowing yeast to ferment ethanol under anaerobic conditions without having to use the EM pathway and expend additional energy. Therefore, by providing novel activities associated with converting 6-phosphogluconate to 2-keto-3-deoxy-6-p-gluconate and 2-keto-3-deoxy-6-p-gluconate to pyruvate, the engineered microorganism can benefit by producing ethanol more efficiently, with respect to a given amount of input carbon, than by using the native EM pathway.
Bacteria often have enzymatic activities that confer the ability to anaerobically metabolize six-carbon sugars to ethanol. These activities are associated with the ED pathway and include, but are not limited to, phosphogluconate dehydratase (e.g., encoded by an EDD gene) and 2-keto-3-deoxygluconate-6-phosphate aldolase (e.g., encoded by an EDA gene). Phosphogluconate dehydratase converts 6-phosphogluconate to 2-keto-3-deoxy-6-p-gluconate. 2-keto-3-deoxygluconate-6-phosphate aldolase converts 2-keto-3-deoxy-6-p-gluconate to pyruvate. In some embodiments, these activities can be introduced into a host organism to generate an engineered microorganism which gains the ability to use the ED pathway to produce ethanol more efficiently than the non-engineered starting organism, by virtue of the lower net energy yield by the ED pathway. A microorganism may be engineered to include or regulate one or more activities in the Entner-Doudoroff pathway. In some embodiments, one or more of these activities may be altered such that the activity or activities can be increased or decreased according to a change in environmental conditions. Nucleic acid sequences encoding Embden-Meyerhoff pathway and Entner-Doudoroff pathway activities can be obtained from any suitable organism (e.g., plants, bacteria, and other microorganisms) and any of these activities can be used herein with the proviso that the nucleic acid sequence is naturally active in the chosen microorganism when expressed, or can be altered or modified to be active.
Yeast also can have endogenous or heterologous enzymatic activities that enable the organism to anaerobically metabolize six-carbon sugars. Saccharomyces cerevisiae used in fermentation often converts glucose-6-phosphate (G-6-P) to fructose-6-phosphate (F-6-P) via phosphoglucose isomerase (EC 5.3.1.9); for example, up to 95% of G-6-P is converted to F-6-P in this manner. Only a minor proportion of G-6-P is converted to 6-phosphoglucono-lactone (6-PGL) by an alternative enzyme, glucose-6-phosphate dehydrogenase (EC 1.1.1.49). Yeast engineered to carry both Entner-Doudoroff (ED) and Embden-Meyerhoff (EM) pathways often convert sugars to ethanol using the EM pathway preferentially. Inactivation of one or more activities in the EM pathway can result in conversion of sugars to ethanol using the ED pathway preferentially, in some embodiments.
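The branch point at glucose-6-phosphate described above can be illustrated with a simple split calculation. The Python sketch below is a toy model under stated assumptions: it treats phosphoglucose isomerase and glucose-6-phosphate dehydrogenase as the only two fates of G-6-P, uses the 95% figure from the text as the default fraction taken by phosphoglucose isomerase, and the function name `g6p_split` and its parameters are hypothetical.

```python
def g6p_split(g6p_units: float, pgi_fraction: float = 0.95):
    """Split a pool of G-6-P between two branches (toy two-branch model).

    Returns (units converted to F-6-P by phosphoglucose isomerase,
             units converted to 6-PGL by glucose-6-phosphate dehydrogenase).
    """
    if not 0.0 <= pgi_fraction <= 1.0:
        raise ValueError("pgi_fraction must be between 0 and 1")
    to_f6p = g6p_units * pgi_fraction
    to_6pgl = g6p_units - to_f6p
    return to_f6p, to_6pgl


if __name__ == "__main__":
    # Default: up to 95% of G-6-P goes to F-6-P, as stated in the text.
    print(g6p_split(100.0))
    # Hypothetical strain with phosphoglucose isomerase activity fully disrupted.
    print(g6p_split(100.0, pgi_fraction=0.0))
```

In this simplified picture, lowering the fraction handled by phosphoglucose isomerase directly raises the amount of G-6-P available as 6-PGL for the ED pathway; real flux distributions depend on regulation that this sketch does not model.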
Phosphoglucose isomerase (EC 5.3.1.9) catalyzes the reversible interconversion of glucose-6-phosphate and fructose-6-phosphate. Phosphoglucose isomerase is encoded by the PGI1 gene in S. cerevisiae. The proposed mechanism for sugar isomerization involves several steps and is thought to occur via general acid/base catalysis. Since glucose 6-phosphate and fructose 6-phosphate exist predominantly in their cyclic forms, PGI is believed to catalyze first the opening of the hexose ring to yield the straight chain form of the substrates. Glucose 6-phosphate and fructose 6-phosphate then undergo isomerization via formation of a cis-enediol intermediate with the double bond located between C-1 and C-2. Phosphoglucose isomerase sometimes also is referred to as glucose-6-phosphate isomerase or phosphohexose isomerase.
PGI is involved in different pathways in different organisms. In some higher organisms PGI is involved in glycolysis, and in mammals PGI also is involved in gluconeogenesis. In plants PGI is involved in carbohydrate biosynthesis, and in some bacteria PGI provides a gateway for fructose into the Entner-Doudoroff pathway. PGI also is known as neuroleukin (a neurotrophic factor that mediates the differentiation of neurons), autocrine motility factor (a tumor-secreted cytokine that regulates cell motility), differentiation and maturation mediator and myofibril-bound serine proteinase inhibitor, and has different roles inside and outside the cell. In the cytoplasm, PGI catalyses the second step in glycolysis, while outside the cell it serves as a nerve growth factor and cytokine. PGI activity is involved in cell cycle progression and completion of the gluconeogenic events of sporulation in S. cerevisiae.
In certain embodiments, phosphoglucose isomerase activity is altered in an engineered microorganism. In some embodiments phosphoglucose isomerase activity is decreased or disrupted in an engineered microorganism. In certain embodiments, decreasing or disrupting phosphoglucose isomerase activity may be desirable to decrease or eliminate the isomerization of glucose-6-phosphate to fructose-6-phosphate, thereby increasing the proportion of glucose-6-phosphate converted to gluconolactone-6-phosphate by the activity encoded by ZWF1 (e.g., glucose-6-phosphate dehydrogenase). Increased levels of gluconolactone-6-phosphate can be further metabolized and thereby improve fermentation of sugar to ethanol via activities in the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway. Decreased or disrupted phosphoglucose isomerase (EC 5.3.1.9) activity in yeast may be achieved by any suitable method, or as described herein. Non-limiting examples of methods suitable for decreasing or disrupting the activity of phosphoglucose isomerase include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast, disruption of both copies of the gene in a diploid yeast, expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof. In some embodiments, a gene used to knockout one activity can also introduce or increase another activity. PGI genes may be native to S. cerevisiae, or may be obtained from a heterologous source.
Glucose-6-phosphate dehydrogenase (EC 1.1.1.49) catalyzes the first step of the pentose phosphate pathway (PPP), and is encoded by the S. cerevisiae gene ZWF1. The reaction for the first step in the pentose phosphate pathway is:
D-glucose 6-phosphate + NADP+ = D-glucono-1,5-lactone 6-phosphate + NADPH + H+
This reaction is irreversible and rate-limiting for efficient fermentation of sugar via the Entner-Doudoroff pathway. The enzyme regenerates NADPH from NADP+ and is important both for maintaining cytosolic levels of NADPH and protecting yeast against oxidative stress. Zwf1p expression in yeast is constitutive, and the activity is inhibited by NADPH such that processes that decrease the cytosolic levels of NADPH stimulate the oxidative branch of the pentose phosphate pathway. Amplification of glucose-6-phosphate dehydrogenase activity in yeast may be desirable to increase the proportion of glucose-6-phosphate converted to 6-phosphoglucono-lactone and thereby improve fermentation of sugar to ethanol via the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
Glucose-6-phosphate dehydrogenase (EC 1.1.1.49) activity in yeast may be amplified by over-expression of the ZWF1 gene by any suitable method. Non-limiting examples of methods suitable to amplify or over express ZWF1 include amplifying the number of ZWF1 genes in yeast following transformation with a high-copy number plasmid (e.g., one containing a 2 micron origin of replication), integration of multiple copies of ZWF1 into the yeast genome, over-expression of the ZWF1 gene directed by a strong promoter, the like or combinations thereof. The ZWF1 gene may be native to S. cerevisiae, or it may be obtained from a heterologous source.
6-phosphogluconolactonase (EC 3.1.1.31) catalyzes the second step of the ED pathway (e.g., of the pentose phosphate pathway), and is encoded by S. cerevisiae genes SOL3 and SOL4. The reaction for the second step of the pentose phosphate pathway is:
6-phospho-D-glucono-1,5-lactone + H2O = 6-phospho-D-gluconate
Amplification of 6-phosphogluconolactonase activity in yeast may be desirable to increase the proportion of 6-phospho-D-glucono-1,5-lactone converted to 6-phospho-D-gluconate and thereby improve fermentation of sugar to ethanol via the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway. For example, over expression of SOL3 is known to increase the rate of carbon source utilization to result in faster growth on xylose than wild type.
The Saccharomyces cerevisiae SOL protein family includes Sol3p and Sol4p. Both localize predominantly in the cytosol, exhibit 6-phosphogluconolactonase activity and function in the pentose phosphate pathway. 6-phosphogluconolactonase (EC 3.1.1.31) activity in yeast may be amplified by over-expression of the SOL3 and/or SOL4 gene(s) by any suitable method. Non-limiting examples of methods to amplify or over express SOL3 and SOL4 include increasing the number of SOL3 and/or SOL4 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of SOL3 and/or SOL4 gene(s) into the yeast genome, over-expression of the SOL3 and/or SOL4 gene(s) directed by a strong promoter, the like or combinations thereof. The SOL3 and/or SOL4 gene(s) may be native to S. cerevisiae, or may be obtained from a heterologous source. For example, Sol3p and Sol4p have similarity to each other, and to Candida albicans Sol1p, Schizosaccharomyces pombe Sol1p, human PGLS which is associated with 6-phosphogluconolactonase deficiency, and human H6PD which is associated with cortisone reductase deficiency. Sol3p and Sol4p are also similar to the 6-phosphogluconolactonases in bacteria (Pseudomonas aeruginosa) and eukaryotes (Drosophila melanogaster, Arabidopsis thaliana, and Trypanosoma brucei), to the glucose-6-phosphate dehydrogenase enzymes from bacteria (Mycobacterium leprae) and eukaryotes (Plasmodium falciparum and rabbit liver microsomes), and have regions of similarity to proteins of the Nag family, including human GNPI and Escherichia coli NagB.
Phosphogluconate dehydrogenase (EC 1.1.1.44) catalyzes the second oxidative reduction of NADP+ to NADPH in the cytosolic oxidative branch of the pentose phosphate pathway, and is encoded by the S. cerevisiae genes GND1 and GND2. GND1 encodes the major isoform of the enzyme accounting for up to 80% of phosphogluconate dehydrogenase activity, while GND2 encodes the minor isoform of the enzyme. Phosphogluconate dehydrogenase sometimes also is referred to as phosphogluconic acid dehydrogenase, 6-phosphogluconic dehydrogenase, 6-phosphogluconic carboxylase, 6-phosphogluconate dehydrogenase (decarboxylating), and 6-phospho-D-gluconate dehydrogenase. Phosphogluconate dehydrogenase belongs to the family of oxidoreductases, specifically those acting on the CH-OH group of donor with NAD+ or NADP+ as the acceptor. The reaction for the second oxidative reduction of NADP+ to NADPH in the cytosolic oxidative branch of the pentose phosphate pathway is:
6-phospho-D-gluconate + NADP+ = D-ribulose 5-phosphate + CO2 + NADPH
Decreasing the level of 6-phosphogluconate dehydrogenase activity in yeast may be desirable to decrease the proportion of 6-phospho-D-gluconate converted to D-ribulose 5-phosphate, thereby increasing the levels of the intermediate gluconate-6-phosphate available for conversion to 6-dehydro-3-deoxy-gluconate-6-phosphate, in some embodiments involving engineered microorganisms including increased EDA and EDD activities, thereby improving fermentation of sugar to ethanol via the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
Decreasing or disrupting 6-phosphogluconate dehydrogenase activity in yeast may be achieved by any suitable method, or as described herein. Non-limiting examples of methods suitable for decreasing the activity of 6-phosphogluconate dehydrogenase include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disruption of both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof. In some embodiments, a gene used to knockout one activity can also introduce or increase another activity. GND1 and/or GND2 gene(s) may be native to S. cerevisiae, or may be obtained from a heterologous source. For example, S. cerevisiae GND1 and GND2 have similarity to each other, and to the phosphogluconate dehydrogenase nucleotide sequences of Candida parapsilosis, Cryptococcus neoformans and humans.
Trehalose (also known as mycose or tremalose) is a natural alpha-linked disaccharide formed by an alpha,alpha-1,1-glucoside bond between two alpha-glucose units. Trehalose biosynthesis is a two-step process in which glucose 6-phosphate and UDP-glucose are converted by trehalose-6-phosphate synthase, encoded by TPS1, into alpha,alpha-trehalose 6-phosphate, which is then converted with water into trehalose and phosphate by trehalose-6-phosphate phosphatase, encoded by TPS2. The main function of trehalose is as a carbohydrate storage moiety.
Trehalose-6-phosphate synthase (e.g., TPS1; EC 2.4.1.15; also known as alpha,alpha-trehalose-phosphate synthase (UDP-forming)) catalyzes the chemical reaction UDP-glucose + D-glucose 6-phosphate = UDP + alpha,alpha-trehalose 6-phosphate, and is part of the alpha,alpha-trehalose-phosphate synthase complex (UDP-forming). Thus, the two substrates of this enzyme activity are UDP-glucose and D-glucose 6-phosphate, whereas its two products are UDP and alpha,alpha-trehalose 6-phosphate.
Without being limited by theory, decreasing the level of trehalose-6-phosphate synthase activity in yeast may be desirable to decrease the proportion of glucose converted into the storage carbohydrate trehalose, thereby increasing the levels of glucose ultimately available for conversion to ethanol, in some embodiments involving engineered microorganisms including increased EDA and EDD activities, thereby improving fermentation of sugar to ethanol via the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
Deletion of the TPS1 gene has been shown to eliminate TPS activity and measurable trehalose.
Decreasing or disrupting trehalose-6-phosphate synthase activity in yeast may be achieved by any suitable method, or as described herein. Non-limiting examples of methods suitable for decreasing the activity of trehalose-6-phosphate synthase include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disruption of both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof. In some embodiments, a gene used to knockout one activity can also introduce or increase another activity. An example of a polynucleotide subsequence and/or an amino acid sequence coding a trehalose-6-phosphate synthase activity is provided herein.
Trehalose-6-phosphate phosphatase (e.g., TPS2; EC 3.1.3.12; also known as alpha,alpha-trehalose-6-phosphate phosphohydrolase, trehalose 6-phosphatase and trehalose-6-phosphate phosphohydrolase) catalyzes the chemical reaction alpha,alpha-trehalose 6-phosphate + H2O = alpha,alpha-trehalose + phosphate, and is part of the alpha,alpha-trehalose-phosphate synthase complex (UDP-forming). Thus, the two substrates of this enzyme activity are alpha,alpha-trehalose 6-phosphate and H2O, whereas its two products are alpha,alpha-trehalose and phosphate. Removal of the phosphate allows another enzyme activity, trehalase, to hydrolyze trehalose into 2 molecules of glucose.
Without being limited by theory, increasing the level of trehalose-6-phosphate phosphatase activity in yeast may be desirable to decrease the proportion of alpha,alpha-trehalose 6-phosphate, by conversion into trehalose and phosphate. The trehalose can be further metabolized by trehalase into two molecules of glucose, which ultimately can be converted to ethanol via native and/or engineered pathways in engineered microorganisms described herein. Trehalose-6-phosphate phosphatase (EC 3.1.3.12) activity in yeast may be amplified by over-expression of the TPS2 gene by any suitable method. Non-limiting examples of methods to amplify or over express TPS2 include increasing the number of TPS2 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the TPS2 gene into the yeast genome, over-expression of the TPS2 gene directed by a strong promoter, the like or combinations thereof. The TPS2 gene may be native to S. cerevisiae, or may be obtained from a heterologous source. An example of a polynucleotide subsequence and/or an amino acid sequence coding a trehalose-6-phosphate phosphatase activity is provided herein.
Glyceraldehyde-3-phosphate dehydrogenase (e.g., TDH3; EC 1.2.1.12; also known as glyceraldehyde-3-phosphate dehydrogenase (phosphorylating), GAPDH, NAD-dependent glyceraldehyde-3-phosphate dehydrogenase and triosephosphate dehydrogenase) catalyzes the chemical reaction glyceraldehyde-3-phosphate + phosphate + NAD(+) <=> 1,3-bisphosphoglycerate + NADH, and is generally found in the cytoplasm and cell wall as a tetramer. Three unlinked genes, TDH1, TDH2, and TDH3, encode related but not identical polypeptides that form catalytically active homotetramers with different specific activities. Tdh2p and Tdh3p are detected in exponentially growing cells whereas Tdh1p is primarily detected during stationary phase. Glyceraldehyde-3-phosphate dehydrogenase activity is part of the gluconeogenesis and glycolysis pathways, whereby glucose is synthesized from non-carbohydrate precursors (e.g., ethanol, glycerol, or peptone), and then metabolized to produce energy and precursors for other cellular processes, respectively.
Without being limited by theory, increasing the level of glyceraldehyde-3-phosphate dehydrogenase activity in yeast may be desirable to increase carbon flux through gluconeogenesis and glycolysis, such that glycerol and glycerol derivatives are converted to glucose and further metabolized into ethanol, via native and/or engineered pathways in engineered microorganisms described herein. Glyceraldehyde-3-phosphate dehydrogenase (EC 1.2.1.12) activity in yeast may be amplified by over-expression of the TDH3 gene by any suitable method. Non-limiting examples of methods to amplify or over express TDH3 include increasing the number of TDH3 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the TDH3 gene into the yeast genome, over-expression of the TDH3 gene directed by a strong promoter, the like or combinations thereof. The TDH3 gene may be native to S. cerevisiae, or may be obtained from a heterologous source. An example of a polynucleotide subsequence and/or an amino acid sequence coding a glyceraldehyde-3-phosphate dehydrogenase activity is provided herein.
Glutamate synthase (e.g., GLT1; EC 1.4.1.14; also known as glutamate synthase (NADH), L-glutamate synthase, L-glutamate synthetase, NADH-dependent glutamate synthase, NADH-glutamate synthase, NADH:GOGAT) catalyzes the reversible chemical reaction, 2 L-glutamate + NAD(+) <=> L-glutamine + 2-oxoglutarate + NADH, participates in glutamate metabolism and nitrogen metabolism, and employs one cofactor, FMN. Yeast cells contain 3 pathways for the synthesis of glutamate. Two pathways are mediated by two isoforms of glutamate dehydrogenase, encoded by GDH1 and GDH3. The third pathway involves the combined activities of glutamine synthetase (GLN1) and glutamate synthase (GLT1). Gln1p catalyzes amination of glutamate to form glutamine. Glt1p then transfers the amide group of glutamine to 2-oxoglutarate, generating two molecules of glutamate. Glutamate synthase, also referred to as GOGAT, is a trimer of three Glt1p subunits. Expression of the GLT1 gene is modulated by glutamate-mediated repression and by Gln3p/Gcn4p-mediated activation, depending upon the availability of nitrogen and glutamate in the medium. In amino acid starvation conditions, GLT1 expression is activated to a moderate degree by Gcn4p.
Without being limited by theory, increasing the level of glutamate synthase activity in yeast may be desirable to increase carbon flux through gluconeogenesis and glycolysis, such that glutamate and glutamate derivatives are ultimately converted to ethanol, via native and/or engineered pathways in engineered microorganisms described herein. Glutamate synthase (EC 1.4.1.14) activity in yeast may be amplified by over-expression of the GLT1 gene by any suitable method. Non-limiting examples of methods to amplify or over express GLT1 include increasing the number of GLT1 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the GLT1 gene into the yeast genome, over-expression of the GLT1 gene directed by a strong promoter, the like or combinations thereof. The GLT1 gene may be native to S. cerevisiae, or may be obtained from a heterologous source. An example of a polynucleotide subsequence and/or an amino acid sequence coding a glutamate synthase activity is provided herein.
Alcohol dehydrogenase 1 (e.g., ADH1; EC 1.1.1.1) catalyzes the reduction of acetaldehyde to ethanol. In S. cerevisiae, there are five genes that encode alcohol dehydrogenases involved in ethanol metabolism, ADH1 to ADH5. Four enzyme activities, Adh1p, Adh3p, Adh4p, and Adh5p, reduce acetaldehyde to ethanol during glucose fermentation. Adh2p catalyzes the reverse reaction of oxidizing ethanol to acetaldehyde. The cytosolic ADH1 gene product is the major enzyme responsible for converting acetaldehyde to ethanol, and functions as a tetramer of four identical subunits with each subunit containing two zinc ions.
Without being limited by theory, increasing the level of alcohol dehydrogenase activity in yeast may be desirable to increase the carbon flux through the last step of fermentation, the reduction of acetaldehyde to ethanol, via native and/or engineered pathways in engineered microorganisms described herein. Alcohol dehydrogenase (EC 1.1.1.1) activity in yeast may be amplified by over-expression of the ADH1 gene by any suitable method. Non-limiting examples of methods to amplify or over express ADH1 include increasing the number of ADH1 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the ADH1 gene into the yeast genome, over-expression of the ADH1 gene directed by a strong promoter, the like or combinations thereof. The ADH1 gene may be native to S. cerevisiae, or may be obtained from a heterologous source. An example of a polynucleotide subsequence and/or an amino acid sequence coding an alcohol dehydrogenase activity is provided herein.
Pyruvate decarboxylase (e.g., PDC1; EC 4.1.1.1; also known as pyruvic decarboxylase, alpha-ketoacid carboxylase, alpha-carboxylase, 2-oxo-acid carboxy-lyase) catalyzes the general reaction, 2-oxo acid <=> an aldehyde + CO(2), where the 2-oxo acid generally is pyruvic acid, and the aldehyde generally is acetaldehyde. In anaerobic conditions, pyruvate decarboxylase activity commits the end product of glycolysis, pyruvate, to ethanol production rather than to its other possible metabolic fates: the TCA cycle/aerobic respiration (pyruvate converted to acetyl-CoA by the action of pyruvate dehydrogenase activity) or gluconeogenesis (pyruvate converted to oxaloacetate, by pyruvate carboxylase activity). Pyruvate decarboxylase activity also can decarboxylate other 2-oxo acids such as indolepyruvate and 2-keto-3-methyl-valerate. The ability to decarboxylate other 2-oxo acids contributes to the catabolism of the amino acids isoleucine, phenylalanine, tryptophan, and valine, thereby providing additional opportunity to maximize carbon flux in the direction of ethanol production. Pyruvate decarboxylase is conserved among yeast, bacteria and plants. The active enzyme is a homotetramer and requires thiamin diphosphate and magnesium cofactors.
Without being limited by theory, increasing the level of pyruvate decarboxylase activity in yeast may be desirable to increase the carbon flux through the last step of fermentation, the reduction of acetaldehyde to ethanol, by increasing the conversion of pyruvic acid into acetaldehyde, via native and/or engineered pathways in engineered microorganisms described herein. Pyruvate decarboxylase (EC 4.1.1.1) activity in yeast may be amplified by over-expression of the PDC1 gene by any suitable method. Non-limiting examples of methods to amplify or over express PDC1 include increasing the number of PDC1 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the PDC1 gene into the yeast genome, over-expression of the PDC1 gene directed by a strong promoter, the like or combinations thereof. The PDC1 gene may be native to S. cerevisiae, or may be obtained from a heterologous source. An example of a polynucleotide subsequence and/or an amino acid sequence coding a pyruvate decarboxylase activity is provided herein.
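The final steps of fermentation described above (pyruvate to acetaldehyde by pyruvate decarboxylase, then acetaldehyde to ethanol by alcohol dehydrogenase) set an upper bound on ethanol yield. The following is a minimal sketch of that mass balance, assuming the textbook overall stoichiometry of one glucose giving two ethanol and two CO2 and approximate molar masses. These numbers, and the function names, are illustrative assumptions and are not taken from this specification.

```python
# Approximate molar masses in g/mol (textbook values, assumed here).
MOLAR_MASS = {"glucose": 180.16, "ethanol": 46.07, "CO2": 44.01}

# Assumed overall stoichiometry per molecule of glucose:
#   glycolysis:             glucose        -> 2 pyruvate
#   pyruvate decarboxylase: 2 pyruvate     -> 2 acetaldehyde + 2 CO2
#   alcohol dehydrogenase:  2 acetaldehyde -> 2 ethanol
ETHANOL_PER_GLUCOSE = 2
CO2_PER_GLUCOSE = 2


def theoretical_ethanol_yield():
    """Return the maximum grams of ethanol per gram of glucose."""
    ethanol_mass = ETHANOL_PER_GLUCOSE * MOLAR_MASS["ethanol"]
    return ethanol_mass / MOLAR_MASS["glucose"]


def mass_balance_error():
    """Return product mass minus substrate mass (g per mol of glucose)."""
    products = (ETHANOL_PER_GLUCOSE * MOLAR_MASS["ethanol"]
                + CO2_PER_GLUCOSE * MOLAR_MASS["CO2"])
    return products - MOLAR_MASS["glucose"]


if __name__ == "__main__":
    print(f"theoretical yield: {theoretical_ethanol_yield():.3f} g ethanol / g glucose")
    print(f"mass balance error: {mass_balance_error():.4f} g/mol")
```

Because the CO2 released at the pyruvate decarboxylase step carries away part of the substrate mass, roughly half of the glucose mass is the most that can be recovered as ethanol under these assumptions; measured yields in any given strain would be expected to be lower.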
Pyruvate kinase (e.g., PYK1, CDC19; EC 2.7.1.40) catalyzes the conversion of phosphoenolpyruvate (PEP) to pyruvate, the final step in glycolysis. As part of the conversion of PEP to pyruvate, a phosphate group is transferred from PEP to ADP, yielding one molecule of pyruvate and one molecule of ATP. PYK1 appears to be tightly regulated and activated by fructose-1,6-bisphosphate (FBP). PYK1 is believed to be the main pyruvate kinase in the glycolytic pathway. Genes encoding pyruvate kinase have been identified in several other species, including human (PKLR/PK1) and mouse.
Without being limited by theory, increasing the level of pyruvate kinase (EC 2.7.1.40) activity in yeast may be desirable to increase the carbon flux through the last step of fermentation, the reduction of acetaldehyde to ethanol, by increasing the conversion of phosphoenolpyruvate to pyruvate, which can be further metabolized into acetaldehyde and ultimately ethanol, via native and/or engineered pathways in engineered microorganisms described herein. Pyruvate kinase (EC 2.7.1.40) activity in yeast may be amplified by over-expression of the PYK1 gene by any suitable method. Non-limiting examples of methods to amplify or over express PYK1 include increasing the number of PYK1 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of the PYK1 gene into the yeast genome, over-expression of the PYK1 gene directed by a strong promoter, the like or combinations thereof. The PYK1 gene may be native to S. cerevisiae, or may be obtained from a heterologous source. An example of a polynucleotide subsequence and/or an amino acid sequence coding a pyruvate kinase activity is provided herein.
Alkaline phosphatase specific for p-nitrophenyl phosphate (e.g., PHO13; EC 3.1.3.1; also known as alkaline phosphomonoesterase, glycerophosphatase, phosphomonoesterase, alkaline phosphohydrolase, alkaline phenyl phosphatase, orthophosphoric-monoester phosphohydrolase (alkaline optimum), phosphate-monoester phosphohydrolase (alkaline optimum)) catalyzes the reaction, phosphate monoester + H(2)O <=> an alcohol + phosphate, and is known to be responsible for removing phosphate groups from many types of molecules, including nucleotides, proteins, and alkaloids. The process of removing the phosphate group is called dephosphorylation. As the name suggests, alkaline phosphatases are most effective in an alkaline environment. Alkaline phosphatase also is believed to be involved in activation/inactivation of many cellular activities. PHO13 also has been shown to play a role in efficient xylose utilization. It has been demonstrated that cells overexpressing xylulokinase frequently grow poorly and exhibit decreased fitness. Cells overexpressing xylulokinase that also have a corresponding PHO13 deletion show improved growth and fitness. Without being limited by theory, decreasing the level of PHO13 activity in engineered cells may benefit the production of ethanol by (i) activation of other activities involved in ethanol production, (ii) deactivation of activities that may inhibit ethanol production, (iii) altering the transport of carbon sources into and/or out of the cell, and/or (iv) improving xylose utilization in strains engineered to metabolize xylose, with or without over expression of xylulokinase.
Decreasing or disrupting the synthesis of alkaline phosphatase (EC 3.1.3.1) activity in yeast may be achieved by any suitable method, or as described herein. Non-limiting examples of methods suitable for decreasing the synthesis of alkaline phosphatase (e.g., PHO13) activity include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disruption of both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof. In some embodiments, a gene used to knockout one activity can also introduce or increase another activity. An example of a polynucleotide subsequence and/or an amino acid sequence coding an alkaline phosphatase activity specific for p-nitrophenyl phosphate is provided herein.
As noted previously, trehalose is hydrolyzed to form 2 molecules of glucose by the activity of trehalase. Certain yeast have two trehalase activities, an acid trehalase encoded by ATH1 and a neutral trehalase encoded by NTH1 (e.g., EC 3.2.1.28). A third locus, NTH2, is 77% identical to NTH1, but does not appear to encode a trehalase activity, or be involved in trehalose catabolism, since an nth2 null mutant exhibits normal levels of neutral trehalase activity. Trehalase catalyzes the reaction, alpha,alpha-trehalose + H(2)O <=> 2 D-glucose. NTH1 is induced by various stresses including exposure to heat, hydrogen peroxide, or cycloheximide. Nth1p normally is found as a cytoplasmic homodimer in its active state (e.g., the state in which the hydrolysis of intracellular trehalose occurs). Decreased expression of NTH1 has been shown to be involved in strain stability. Without being limited by theory, decreasing the level of NTH1 activity in engineered cells may benefit the overall stability (e.g., health and fitness) of the cell, thereby allowing increased production of ethanol. In addition to the overall fitness of the cell in NTH1 deleted strains, deletion of NTH1 also may provide unexpected benefits due to the increased temperature sensitivity of nth1 strains. Engineered strains described herein have demonstrated increased ethanol production under certain stress conditions, including temperature stress for example. Without being limited by any theory, nth1 deleted strains may prove to be more stable than wild type counterparts, and also may show stress induced increases in ethanol production without having to use high temperatures to induce the stress. In some embodiments, alterations to reduce or eliminate NTH1 activity are not made in the same genetic background as alterations to reduce or eliminate TPS1 activity.
In certain embodiments, alterations to reduce or eliminate NTH1 activity are not made in the same genetic background as alterations to increase TPS2 activity. In some embodiments, alterations to reduce or eliminate NTH1 activity are not made in the same genetic background as alterations to reduce or eliminate TPS1 and increase TPS2 activity. Decreasing or disrupting the synthesis of neutral trehalase (EC 3.2.1.28) activity in yeast may be achieved by any suitable method, or as described herein. Non-limiting examples of methods suitable for decreasing the synthesis of the neutral trehalase activity encoded by NTH1 include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disruption of both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof. In some embodiments, a gene used to knockout one activity can also introduce or increase another activity. An example of a polynucleotide subsequence and/or an amino acid sequence coding a neutral trehalase activity is provided herein.
Glycerol-3-phosphate dehydrogenase (e.g., GPD1/GPD2; EC 1.1.1.8 / EC 1.1.5.3, respectively; also known as α-glycerol phosphate dehydrogenase (NAD); α-glycerophosphate dehydrogenase (NAD); glycerol 1-phosphate dehydrogenase; glycerol phosphate dehydrogenase (NAD); glycerophosphate dehydrogenase (NAD); hydroglycerophosphate dehydrogenase; L-α-glycerol phosphate dehydrogenase; L-α-glycerophosphate dehydrogenase; L-glycerol phosphate dehydrogenase; L-glycerophosphate dehydrogenase; NAD-α-glycerophosphate dehydrogenase; NAD-dependent glycerol phosphate dehydrogenase; NAD-dependent glycerol-3-phosphate dehydrogenase; NAD-L-glycerol-3-phosphate dehydrogenase; NAD-linked glycerol 3-phosphate dehydrogenase; NADH-dihydroxyacetone phosphate reductase; glycerol-3-phosphate dehydrogenase (NAD); FAD-dependent glycerol-3-phosphate dehydrogenase; flavin-linked glycerol-3-phosphate dehydrogenase; glycerol-3-phosphate CoQ reductase; glycerophosphate dehydrogenase; sn-glycerol-3-phosphate dehydrogenase) catalyzes the general reaction, dihydroxyacetone phosphate <=> sn-glycerol 3-phosphate.
Two glycerol-3-phosphate dehydrogenase activities exist in many organisms: a cytoplasmic activity encoded by GPD1, and a mitochondrial activity encoded by GPD2. The cytoplasmic enzyme uses NAD as a cofactor and yields NADH, while the mitochondrial enzyme uses quinone as a cofactor and yields quinol. Glycerol-3-phosphate dehydrogenase also acts on propane-1,2-diol phosphate and glycerone sulfate, but with a lower affinity. Glycerol-3-phosphate dehydrogenase is a key enzyme in glycerol synthesis and has been shown to be important to growth and survival under osmotic stress. Without being limited by theory, decreasing glycerol-3-phosphate dehydrogenase activity in engineered cells may benefit the production of ethanol by decreasing the proportion of glycerol that enters the gluconeogenic pathway, thereby allowing the glycerol to be used directly as a non-carbohydrate carbon source for the production of ethanol. Decreasing or disrupting the synthesis of glycerol-3-phosphate dehydrogenase (EC 1.1.1.8 and/or EC 1.1.5.3) activity in yeast may be achieved by any suitable method, or as described herein. Non-limiting examples of methods suitable for decreasing the glycerol-3-phosphate dehydrogenase (e.g., GPD1, GPD2 or GPD1 and GPD2) activity include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disruption of both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof. In some embodiments, a gene used to knockout one activity can also introduce or increase another activity. An example of a polynucleotide subsequence and/or an amino acid sequence coding a glycerol-3-phosphate dehydrogenase activity is provided herein.
Five-Carbon Sugar Metabolism and Activities
As noted above, five-carbon sugars are the second most predominant form of sugars in lignocellulosic waste biomass produced in wood pulp and wood milling industries. Furthermore, xylose is the second most abundant carbohydrate in nature. Non-limiting examples of five-carbon sugars include arabinose, lyxose, ribose, xylose, ribulose, and xylulose.
The conversion of biomass to energy (e.g., ethanol) has not proven economically attractive because many organisms cannot metabolize hemicellulose. Biomass and waste biomass contain both cellulose and hemicellulose. Many industrially applicable organisms can metabolize five-carbon sugars (e.g., xylose, pentose and the like), but may do so at low efficiency, or may not begin metabolizing five-carbon sugars until all six-carbon sugars have been depleted from the growth medium. Many yeasts and fungi grow slowly on xylose and other five-carbon sugars. Some yeast, such as S. cerevisiae, do not naturally use xylose, or do so only if there are no other carbon sources. An engineered microorganism (e.g., yeast) that could grow rapidly on xylose and provide ethanol and/or other products as a result of fermentation of xylose can be useful due to the ability to use a feedstock source that is currently underutilized while also reducing the need for petrochemicals.
The pentose phosphate pathway (PPP), which is a biochemical route for xylose metabolism, is found in virtually all cellular organisms where it provides D-ribose for nucleic acid biosynthesis, D-erythrose 4-phosphate for the synthesis of aromatic amino acids and NADPH for anabolic reactions. The PPP is thought of as having two phases. The oxidative phase converts the hexose, D-glucose 6P, into the pentose, D-ribulose 5P, plus CO2 and NADPH. The non-oxidative phase converts D-ribulose 5P into D-ribose 5P, D-xylulose 5P, D-sedoheptulose 7P, D-erythrose 4P, D-fructose 6P and D-glyceraldehyde 3P. D-Xylose and L-arabinose enter the PPP through D-xylulose.
Certain organisms (e.g., yeast, filamentous fungus and other eukaryotes) require two or more activities to convert xylose to a usable form that can be metabolized in the pentose phosphate pathway. The activities are a reduction and an oxidation carried out by xylose reductase (XR; XYL1, GRE3) and xylitol dehydrogenase (XD; XYL2, XDH1), respectively. Xylose reductase converts D-xylose to xylitol. Xylitol dehydrogenase converts xylitol to D-xylulose. The use of these activities sometimes can inhibit cellular function due to cofactor and metabolite imbalances.
In some embodiments, the xylose reductase activity and/or xylitol dehydrogenase activity selected for inclusion in an engineered organism can be chosen from an organism whose XR and/or XD activities utilize NADPH or NADH (e.g., co-factor flexible activities), thereby reducing or eliminating inhibition of cellular function due to cofactor and metabolite imbalances. Non-limiting examples of yeast whose xylose reductase enzyme and/or xylitol dehydrogenase enzyme can use NADP+/NADPH and/or NAD+/NADH include C. shehatae, C. parapsilosis, P. segobiensis, P. stipitis, and Pachysolen tannophilus. In certain embodiments, xylose reductase and/or xylitol dehydrogenase activities can be engineered to alter cofactor preference and/or specificity. Some organisms (e.g., certain bacteria) require only one activity, xylose isomerase (xylA). Xylose isomerase converts xylose directly to xylulose. In some embodiments, additional alterations in the strain can compensate for cofactor and metabolite imbalances caused by the use of certain xylose reductase and/or xylitol dehydrogenase activities.
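The cofactor imbalance discussed above can be made concrete with a small bookkeeping sketch. It tallies the net cofactor change over the two-step conversion of D-xylose to D-xylulose, assuming for illustration that the xylitol dehydrogenase is NAD+-linked and comparing a strictly NADPH-dependent xylose reductase with one that can also accept NADH. The cofactor assignments and all names in the code are assumptions for illustration, not data from this specification.

```python
def net_cofactors(steps):
    """Sum the cofactor changes over a list of reaction steps.

    Each step maps a cofactor name to its change per reaction
    (negative = consumed, positive = produced).
    """
    totals = {}
    for step in steps:
        for cofactor, delta in step.items():
            totals[cofactor] = totals.get(cofactor, 0) + delta
    # Keep only cofactors that do not cancel out exactly.
    return {name: delta for name, delta in totals.items() if delta != 0}


# Xylose reductase: D-xylose + NAD(P)H -> xylitol + NAD(P)+
XR_WITH_NADPH = {"NADPH": -1, "NADP+": +1}
XR_WITH_NADH = {"NADH": -1, "NAD+": +1}

# Xylitol dehydrogenase (assumed NAD+-linked): xylitol + NAD+ -> D-xylulose + NADH
XD_WITH_NAD = {"NAD+": -1, "NADH": +1}

# Strictly NADPH-dependent reductase paired with the NAD+-linked dehydrogenase.
strict = net_cofactors([XR_WITH_NADPH, XD_WITH_NAD])

# Co-factor flexible reductase using NADH, paired with the same dehydrogenase.
flexible = net_cofactors([XR_WITH_NADH, XD_WITH_NAD])

if __name__ == "__main__":
    print("strict XR:  ", strict)
    print("flexible XR:", flexible)
```

Under these assumptions the strict pairing consumes NADPH and accumulates NADH for every xylose converted, while the flexible pairing regenerates the cofactor it consumes and leaves no net change. Xylose isomerase, which converts xylose directly to xylulose, would appear in this bookkeeping as a single step with no cofactor entries.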
Xylulose is converted to xylulose-5-phosphate by the activity of a xylulokinase enzyme (EC 2.7.1.17). Xylulose kinase (e.g., XYK3, XYL3, XKS1) catalyzes the chemical reaction, ATP + D-xylulose <=> ADP + D-xylulose 5-phosphate.
Xylulokinase sometimes also is referred to as ATP:D-xylulose 5-phosphotransferase, xylulokinase (phosphorylating), and D-xylulokinase. Increasing the activity of xylose isomerase or xylose reductase and xylitol dehydrogenase may cause an increase of xylulose in an engineered microorganism. Therefore, increasing xylulokinase activity levels in embodiments involving increased levels of XI or XR and XD may be desirable to allow increased flux through the respective metabolic pathways. Xylulokinase activity levels can be increased using any suitable method. Non-limiting examples of methods suitable for increasing xylulokinase activity include increasing the number of xylulokinase genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of xylulokinase genes into the yeast genome, over-expression of the xylulokinase gene directed by a strong promoter, the like or combinations thereof. The xylulokinase gene may be native to S. cerevisiae, or may be obtained from a heterologous source. In some embodiments, strains engineered to over express xylulokinase are also engineered to delete PHO13 activity. As noted above, strains overexpressing xylulokinase activity often display deleterious phenotypic traits that can be partially or completely alleviated by a corresponding deletion of PHO13 activity.
Phosphorylation of xylulose by xylulokinase allows the five-carbon sugar to be further converted by transketolase (e.g., TKL1/TKL2) to enter the EM pathway for further metabolism at either fructose-6-phosphate or glyceraldehyde-3-phosphate. In some embodiments, where the EM pathway is inactivated, five-carbon sugars enter the EM pathway and are further converted for use by the ED pathway. Therefore, engineering a microorganism with xylose isomerase activity or co-factor flexible xylose reductase activity and xylitol dehydrogenase activity, along with increased xylulokinase activity may allow rapid growth on xylose when compared to the non-engineered microorganism, while avoiding cofactor and metabolite imbalances, in some embodiments. In certain embodiments, engineering a microorganism with co-factor flexible xylose reductase activity and xylitol dehydrogenase activity, may allow rapid growth on xylose when compared to the non-engineered microorganism, while avoiding cofactor and metabolite imbalances. The term "co-factor flexible" as used herein with respect to xylose reductase activity and xylose isomerase activity refers to the ability to use NADP+/NADPH and/or NAD+/NADH as a cofactor for electron transport.
A microorganism may be engineered to include or regulate one or more activities in a five-carbon sugar metabolism pathway (e.g., pentose phosphate pathway). In some embodiments, an engineered microorganism can comprise a xylose isomerase activity. In some embodiments, the xylose isomerase activity may be altered such that the activity can be increased or decreased according to a change in environmental conditions. Nucleic acid sequences encoding xylose isomerase activities can be obtained from any suitable organism (e.g., Piromyces, Orpinomyces, Bacteroides thetaiotaomicron, Clostridium phytofermentans, Thermus thermophilus and Ruminococcus (e.g., R. flavefaciens, R. flavefaciens strain FD1, R. flavefaciens strain 18P13) are non-limiting examples) and any of these activities can be used herein with the proviso that the nucleic acid sequence is naturally active in the chosen microorganism when expressed, or can be altered or modified to be active. In some embodiments, an engineered microorganism can comprise a xylose reductase activity and a xylitol dehydrogenase activity. In certain embodiments, an engineered microorganism can comprise a xylulokinase activity. In some embodiments, the xylose reductase activity, xylitol dehydrogenase activity and/or xylulokinase activity may be altered such that the activity can be increased or decreased according to a change in environmental conditions. Nucleic acid sequences encoding xylose reductase activity, xylitol dehydrogenase activity and/or xylulokinase activities can be obtained from any suitable organism, and any of these activities can be used herein with the proviso that the nucleic acid sequence is naturally active in the chosen microorganism when expressed, or can be altered or modified to be active.
Activities Linking 5-Carbon and 6-Carbon Sugar Metabolic Pathways
In some embodiments, an engineered microorganism includes one or more altered activities that function to link 5-carbon sugar and 6-carbon sugar metabolic pathways (e.g., provide intermediates that enter and/or are metabolized by the pentose phosphate pathway, the glycolytic pathway, or the pentose phosphate and glycolytic pathways). In certain embodiments, the altered linking activity is added, increased or amplified, with respect to a host or starting organism. In some embodiments, the altered activity is decreased or disrupted, with respect to a host or starting organism. Non-limiting examples of activities that function to reversibly link 5-carbon sugar and 6-carbon sugar metabolic pathways include transaldolase, transketolase, the like, or combinations thereof. Transketolase and transaldolase catalyze transfer of 2 carbon and 3 carbon molecular fragments respectively, in each case from a ketose donor to an aldose acceptor.
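The fragment transfers described above can be checked with simple carbon bookkeeping. The sketch below is illustrative only: the abbreviations and the `transfer` helper are hypothetical names introduced here, and the transketolase example (xylulose 5-phosphate donating to ribose 5-phosphate) is a standard textbook pairing chosen as an assumption rather than a reaction stated at this point in the text.

```python
# Carbon counts of the sugar phosphates involved (number of carbon atoms).
CARBONS = {
    "X5P": 5,  # xylulose 5-phosphate (ketose)
    "R5P": 5,  # ribose 5-phosphate (aldose)
    "S7P": 7,  # sedoheptulose 7-phosphate (ketose)
    "G3P": 3,  # glyceraldehyde 3-phosphate (aldose)
    "E4P": 4,  # erythrose 4-phosphate (aldose)
    "F6P": 6,  # fructose 6-phosphate (ketose)
}


def transfer(donor, acceptor, fragment_size):
    """Return carbon counts of (shortened donor, extended acceptor)."""
    shortened = CARBONS[donor] - fragment_size
    extended = CARBONS[acceptor] + fragment_size
    return shortened, extended


# Transketolase moves a 2-carbon fragment from a ketose to an aldose.
# Assumed example: X5P + R5P -> G3P + S7P
transketolase_products = transfer("X5P", "R5P", 2)

# Transaldolase moves a 3-carbon fragment from a ketose to an aldose.
# Example: S7P + G3P -> E4P + F6P
transaldolase_products = transfer("S7P", "G3P", 3)

if __name__ == "__main__":
    print("transketolase product carbons:", transketolase_products)
    print("transaldolase product carbons:", transaldolase_products)
```

In both cases the total carbon count is unchanged (ten carbons in, ten out), and the products of the two transfers are the three-carbon and six-carbon intermediates through which five-carbon sugars can reach six-carbon sugar metabolism.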
Transaldolase (EC:2.2.1.2) catalyzes the reversible transfer of a three-carbon ketol unit from sedoheptulose 7-phosphate to glyceraldehyde 3-phosphate to form erythrose 4-phosphate and fructose 6-phosphate. The cofactor-less enzyme acts through a Schiff base intermediate (e.g., bound dihydroxyacetone). Transaldolase is encoded by the gene TAL1 in S. cerevisiae, and is an enzyme in the non-oxidative pentose phosphate pathway that provides a link between the pentose phosphate and the glycolytic pathways. Transaldolase activity is thought to be found in substantially all organisms, and includes 5 subfamilies. Three transaldolase subfamilies have demonstrated transaldolase activity, one subfamily comprises an activity of undetermined function and the remaining subfamily includes a fructose 6-phosphate aldolase activity. Transaldolase deficiency is well tolerated in many microorganisms, and without being limited by any theory, is thought to be involved in oxidative stress responses and apoptosis. Transaldolase sometimes also is referred to as dihydroxyacetone transferase, dihydroxyacetonetransferase, glycerone transferase, or sedoheptulose-7-phosphate:D-glyceraldehyde-3-phosphate glyceronetransferase, and catalyzes the reaction: sedoheptulose 7-phosphate + glyceraldehyde 3-phosphate <=> erythrose 4-phosphate + fructose 6-phosphate.
In some embodiments, increasing or amplifying transaldolase activity in yeast may be desirable to increase the proportion of sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate converted to fructose-6-phosphate and erythrose-4-phosphate, thereby increasing levels of fructose-6-phosphate. Increased levels of fructose-6-phosphate can be further metabolized and thereby improve fermentation of sugar to ethanol via activities in the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway. Transaldolase (EC:2.2.1.2) activity in yeast may be amplified by over-expression of the TAL1 gene by any suitable method. Non-limiting examples of methods to amplify or over express TAL1 include increasing the number of TAL1 genes in yeast by transformation with a high-copy number plasmid, integration of multiple copies of TAL1 genes into the yeast genome, over-expression of TAL1 genes directed by a strong promoter, the like or combinations thereof. The TAL1 genes may be native to S. cerevisiae, or may be obtained from a heterologous source.
In certain embodiments, decreasing or disrupting transaldolase activity may be desirable to decrease the proportion of sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate converted to fructose-6-phosphate and erythrose-4-phosphate, thereby increasing levels of glyceraldehyde-3-phosphate in the engineered microorganism. Increased levels of glyceraldehyde-3-phosphate can be further metabolized and thereby improve fermentation of sugar to ethanol via activities in the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway. Decreased or disrupted transaldolase (EC:2.2.1.2) activity in yeast may be achieved by any suitable method, or as described herein. Non-limiting examples of methods suitable for decreasing or disrupting the activity of transaldolase include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast, disruption of both copies of the gene in a diploid yeast, expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof. In some embodiments, a gene used to knockout one activity can also introduce or increase another activity.
Transketolase (EC:2.2.1.1) catalyzes the reversible transfer of a two-carbon ketol unit from a ketose (e.g., xylulose 5-phosphate, fructose 6-phosphate, sedoheptulose 7-phosphate) to an aldose acceptor (e.g., ribose 5-phosphate, erythrose 4-phosphate, glyceraldehyde 3-phosphate). Transketolase is encoded by the TKL1 and TKL2 genes in S. cerevisiae. TKL1 encodes the major isoform of the enzyme and TKL2 encodes a minor isoform. Transketolase sometimes also is referred to as glycolaldehyde transferase, glycolaldehydetransferase, sedoheptulose-7-phosphate:D-glyceraldehyde-3-phosphate glycolaldehydetransferase, or fructose 6-phosphate:D-glyceraldehyde-3-phosphate glycolaldehydetransferase. Transketolase double null mutants (e.g., tkl1/tkl2) are viable but are auxotrophic for aromatic amino acids, indicating the genes are involved in the synthesis of aromatic amino acids.
Transketolase activity also is thought to be involved in the efficient use of fermentable carbon sources, and has been shown to catalyze a one-substrate reaction utilizing only xylulose 5-phosphate to produce glyceraldehyde 3-phosphate and erythrulose. Transketolase requires thiamine pyrophosphate as a cofactor, and has been purified from S. cerevisiae as a homodimer of approximately 70 kilodalton subunits. Sequences from a variety of eukaryotic and prokaryotic sources indicate transketolase enzymes have been evolutionarily conserved. Tkl1p has similarity to S. cerevisiae Tkl2p, Escherichia coli transketolase, Rhodobacter sphaeroides transketolase, Streptococcus pneumoniae recP, Hansenula polymorpha dihydroxyacetone synthase, Kluyveromyces lactis TKL1, Pichia stipitis TKT, rabbit liver transketolase, rat TKT, mouse TKT, and human TKT. Tkl1p is also related to E. coli pyruvate dehydrogenase E1 subunit, which is another vitamin B1-dependent enzyme.
In some embodiments, increasing or amplifying transketolase activity in yeast may be desirable to increase the proportion of xylulose 5-phosphate converted to glyceraldehyde 3-phosphate, thereby increasing levels of glyceraldehyde 3-phosphate available for entry into a 6-carbon sugar metabolic pathway directly and/or conversion to fructose-6-phosphate. Increased levels of fructose-6-phosphate and/or glyceraldehyde 3-phosphate can be further metabolized and thereby improve fermentation of sugar to ethanol via activities in the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway. Transketolase (EC 2.2.1.1) activity in yeast may be increased or amplified by over-expression of the TKL1 and/or TKL2 gene(s) by any suitable method. Non-limiting examples of methods to amplify or over express TKL1 and TKL2 include increasing the number of TKL1 and/or TKL2 gene(s) in yeast by transformation with a high-copy number plasmid, integration of multiple copies of TKL1 and/or TKL2 gene(s) into the yeast genome, over-expression of TKL1 and/or TKL2 gene(s) directed by a strong promoter, the like or combinations thereof.
In certain embodiments, decreasing or disrupting transketolase activity may be desirable to decrease the proportion of xylulose 5-phosphate converted to glyceraldehyde 3-phosphate, thereby increasing levels of xylulose 5-phosphate in the engineered microorganism. Increased levels of xylulose 5-phosphate can be further metabolized and thereby improve fermentation of sugar to ethanol via activities in the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway. Decreased or disrupted transketolase (EC 2.2.1.1) activity in yeast may be achieved by any suitable method, or as described herein. Non-limiting examples of methods suitable for decreasing or disrupting the activity of transketolase include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast, disruption of both copies of the gene in a diploid yeast, expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof. In some embodiments, a gene used to knockout one activity can also introduce or increase another activity. TKL1 and/or TKL2 gene(s) may be native to S. cerevisiae, or may be obtained from a heterologous source.
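As a non-limiting illustration of the carbon bookkeeping behind the transaldolase and transketolase reactions described above, the following Python sketch checks that each reaction conserves carbon atoms. The dictionary and function names are hypothetical and are introduced only for this illustration. The third entry (xylulose 5-phosphate plus erythrose 4-phosphate) is the standard second transketolase reaction of the pentose phosphate pathway and is included here as an assumption, because the text above names these compounds only as example donor and acceptor substrates.

```python
# Number of carbon atoms in each sugar phosphate named in the text.
CARBONS = {
    "xylulose-5-phosphate": 5,
    "ribose-5-phosphate": 5,
    "sedoheptulose-7-phosphate": 7,
    "glyceraldehyde-3-phosphate": 3,
    "fructose-6-phosphate": 6,
    "erythrose-4-phosphate": 4,
}

# Each reaction is written as (substrates, products).
# Reaction labels are illustrative only.
REACTIONS = {
    "transketolase (X5P + R5P)": (
        ["xylulose-5-phosphate", "ribose-5-phosphate"],
        ["sedoheptulose-7-phosphate", "glyceraldehyde-3-phosphate"],
    ),
    "transaldolase (S7P + G3P)": (
        ["sedoheptulose-7-phosphate", "glyceraldehyde-3-phosphate"],
        ["fructose-6-phosphate", "erythrose-4-phosphate"],
    ),
    "transketolase (X5P + E4P)": (
        ["xylulose-5-phosphate", "erythrose-4-phosphate"],
        ["fructose-6-phosphate", "glyceraldehyde-3-phosphate"],
    ),
}


def carbon_total(metabolites):
    """Sum the carbon atoms over a list of metabolite names."""
    return sum(CARBONS[name] for name in metabolites)


def is_carbon_balanced(substrates, products):
    """Return True when substrates and products carry the same number of carbons."""
    return carbon_total(substrates) == carbon_total(products)


if __name__ == "__main__":
    for label, (substrates, products) in REACTIONS.items():
        print(label, carbon_total(substrates), carbon_total(products),
              is_carbon_balanced(substrates, products))
```

Such a check says nothing about reaction rates or flux. It only confirms that a two-carbon (transketolase) or three-carbon (transaldolase) unit is moved between the listed sugars without net loss of carbon.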
Sugar Transport Activities
Sugar metabolized as a carbon source by organisms typically is transported from outside a cell into the cell for use as an energy source and/or a raw material for synthesis of cellular products. Sugar can be transported into the cell using active or passive transport mechanisms. Active transport systems frequently utilize energy to transport the sugar across the cell membrane. Sugars often are modified by phosphorylation, once transported inside the cell or organism, to prevent diffusion out of the cell. Sugar transport activities are thought also to act as sugar sensors and have high affinity and low affinity transporters. The rate of glucose utilization in yeast often is dictated by the activity and concentration of glucose transporters in the plasma membrane.
In yeast, sugar transporters have been found to be part of a multi-gene family. Some sugar transport systems transport certain sugars preferentially and other non-preferred sugars at a lower rate. Certain sugar transport systems transport one or more structurally similar sugars at substantially similar rates. Non-limiting examples of sugar transporters include high affinity glucose transporters (e.g., HXT (e.g., HXT1, HXT7)), glucose-xylose transporters (e.g., GXF1, GXS1), and high affinity galactose transporters (e.g., GAL2), the like and combinations thereof. Galactose permease is a high affinity galactose transport enzyme activity that also can transport glucose. Galactose permease is encoded by the GAL2 gene, and sometimes also is referred to as a galactose/glucose (methylgalactoside) porter. Gal2p is an integral plasma membrane protein belonging to a super family of sugar transporters that are predicted to contain 12 transmembrane domains separated by charged residues. Structurally and functionally similar sugar transporters have been identified in bacteria, rat, and humans.
Glucose often is transported by high affinity glucose transporters. High affinity glucose transporters (e.g., HXT) are members of the major facilitator gene super family, and include the genes HXT6 (Hxt6p) and HXT7 (Hxt7p). HXT6 and HXT7 are substantially similar activities, and are expressed at high basal levels relative to other high affinity glucose transporters. Approximately 20 HXT genes have been identified. High affinity glucose transporters sometimes also are referred to as hexose transporters.
Certain sugar transport systems include high and low affinity transport activities that act on more than one sugar. A non-limiting example of such a sugar transport system includes the glucose/xylose transport system from Candida yeast. Glucose and xylose are transported into certain Candida by a high affinity xylose-proton symporter (e.g., GXS1) and a low affinity diffusion facilitator (e.g., GXF1). S. cerevisiae normally lacks an efficient transport system for xylose, although xylose can enter the cell at low efficiency via non-specific transport systems sometimes involving HXT activities. Addition of the Candida GXS1, GXF1, or GXS1 and GXF1 activities to S. cerevisiae engineered to metabolize xylose can further enhance the ability to ferment xylose to alcohol or other desired products.
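As a non-limiting sketch of how a high affinity and a low affinity transport activity can complement each other, the following Python example models each transporter with Michaelis-Menten kinetics and sums their contributions. The use of Michaelis-Menten kinetics is an assumption made for illustration, and the Vmax and Km values are hypothetical placeholders chosen only to show the shape of the behavior. They are not measured parameters for GXS1, GXF1 or any other transporter.

```python
def michaelis_menten(substrate_mM, vmax, km_mM):
    """Uptake rate of a single transporter under Michaelis-Menten kinetics."""
    if substrate_mM < 0:
        raise ValueError("substrate concentration must be non-negative")
    if km_mM <= 0:
        raise ValueError("Km must be positive")
    return vmax * substrate_mM / (km_mM + substrate_mM)


# Hypothetical parameters (arbitrary rate units), for illustration only.
HIGH_AFFINITY = {"vmax": 1.0, "km_mM": 0.5}    # low Km, low capacity
LOW_AFFINITY = {"vmax": 10.0, "km_mM": 50.0}   # high Km, high capacity


def total_uptake(substrate_mM):
    """Combined uptake rate when both transport activities are present."""
    high = michaelis_menten(substrate_mM, **HIGH_AFFINITY)
    low = michaelis_menten(substrate_mM, **LOW_AFFINITY)
    return high + low


if __name__ == "__main__":
    for concentration in (0.1, 1.0, 10.0, 100.0, 500.0):
        high = michaelis_menten(concentration, **HIGH_AFFINITY)
        low = michaelis_menten(concentration, **LOW_AFFINITY)
        print(f"{concentration:7.1f} mM  high={high:.3f}  low={low:.3f}  "
              f"total={high + low:.3f}")
```

With these placeholder values the high affinity activity dominates uptake at low sugar concentrations and the low affinity activity dominates at high concentrations, which is the qualitative behavior the paragraph above describes.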
In some embodiments, an engineered microorganism includes one or more sugar transport activities that have been genetically added or altered. In certain embodiments, the sugar transport activity is amplified or increased. Sugar transport activities can be added, amplified by over expression or increased by any suitable method. Non-limiting methods of adding, amplifying or increasing the activity of sugar transport systems include increasing the number of sugar transport activity (e.g., GAL2, GXF1, GXS1, HXT7) gene(s) in yeast by transformation with a high-copy number plasmid, integration of multiple copies of sugar transport activity (e.g., GAL2, GXF1, GXS1, HXT7) gene(s) into the yeast genome, over-expression of sugar transport activity gene(s) directed by a strong promoter, the like or combinations thereof. The sugar transport activity (e.g., GAL2, GXF1, GXS1, HXT7) gene(s) may be native to S. cerevisiae, or may be obtained from a heterologous source.
Plasma membrane channels may play a role in the efflux and/or intake of metabolites that can induce or repress expression of various activities. One such plasma membrane channel is encoded by the FPS1 gene. The FPS1 encoded plasma membrane channel has been shown to be involved in the efflux of glycerol from the cell and the uptake of acetic acid and the trivalent metalloids arsenite and antimonite into the cell. In low-pH yeast cultures, loss of FPS1 is important for the acquisition of resistance to acetic acid, as it eliminates the channel for passive diffusion of this acid into cells. Without being limited by theory, decreasing the efflux of glycerol from the cell may be advantageous for increased ethanol production, when combined with other alterations that minimize glycerol production (e.g., reduction or elimination of glycerol-3-phosphate dehydrogenase activity (e.g., GPD1, GPD2, or GPD1 and GPD2)). In order to reduce the effect of limiting glycerol production, keeping any glycerol that is produced in the cell may be advantageous to the overall growth and health of the cell. Reducing or eliminating the efflux of the reduced amounts of glycerol in gpd1, gpd2, or gpd1 and gpd2 engineered strains by co-engineering reduced or eliminated expression of FPS1 may aid overall cell health and growth in engineered strains. Decreasing the number of plasma membrane channels encoded by FPS1 may benefit the production of ethanol by inhibiting the loss of glycerol through FPS1 encoded plasma membrane channels, thereby increasing the overall health of engineered strains and allowing increased levels of ethanol production. In some embodiments involving engineered microorganisms that include increased EDA and EDD activities, this may improve fermentation of sugar to ethanol via the Entner-Doudoroff pathway, even in the presence of the enzymes comprising the Embden-Meyerhoff pathway.
Decreasing or disrupting the synthesis of plasma membrane channels encoded by FPS1 in yeast may be achieved by any suitable method, or as described herein. Non-limiting examples of methods suitable for decreasing the synthesis of plasma membrane channels encoded by FPS1 include use of a regulated promoter, use of a weak constitutive promoter, disruption of one of the two copies of the gene in a diploid yeast (e.g., partial gene knockout), disruption of both copies of the gene in a diploid yeast (e.g., complete gene knockout), expression of an anti-sense nucleic acid, expression of an siRNA, over expression of a negative regulator of the endogenous promoter, alteration of the activity of an endogenous or heterologous gene, use of a heterologous gene with lower specific activity, the like or combinations thereof. In some embodiments, a gene used to knockout one activity can also introduce or increase another activity. An example of a polynucleotide subsequence and/or an amino acid sequence coding a FPS1 plasma membrane channel is provided herein.
Carbon Dioxide Metabolism and Activities
Microorganisms grown in fermentors often are grown under anaerobic conditions, with limited or no gas exchange. Therefore the atmosphere inside fermentors sometimes is carbon dioxide rich. Unlike photosynthetic organisms, many microorganisms suitable for use in industrial fermentation processes do not incorporate atmospheric carbon (e.g., CO2) to any significant degree, or at all. Thus, to ensure that increasing levels of carbon dioxide do not inhibit cell growth and the fermentation process, methods to remove carbon dioxide from the interior of fermentors can be useful.
Photosynthetic organisms make use of atmospheric carbon by incorporating the carbon available in carbon dioxide into organic carbon compounds by a process known as carbon fixation. The activities responsible for a photosynthetic organism's ability to fix carbon dioxide include phosphoenolpyruvate carboxylase (e.g., PEP carboxylase) or ribulose 1,5-bisphosphate carboxylase (e.g., Rubisco). PEP carboxylase catalyzes the addition of carbon dioxide to phosphoenolpyruvate to generate the four-carbon compound oxaloacetate. Oxaloacetate can be used in other cellular processes or be further converted to yield several industrially useful products (e.g., malate, succinate, citrate and the like). Rubisco catalyzes the addition of carbon dioxide to ribulose-1,5-bisphosphate to generate 2 molecules of 3-phosphoglycerate. 3-phosphoglycerate can be further converted to ethanol via cellular fermentation or used to produce other commercially useful products. Nucleic acid sequences encoding PEP carboxylase and Rubisco activities can be obtained from any suitable organism (e.g., plants, bacteria, and other microorganisms) and any of these activities can be used herein with the proviso that the nucleic acid sequence is either naturally active in the chosen microorganism when expressed, or can be altered or modified to be active.
Examples of Altered Activities
In some embodiments, engineered microorganisms can include modifications to one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20 or all) of the following activities: phosphofructokinase activity (PFK1 A subunit, PFK2 B subunit), phosphogluconate dehydratase activity (EDD), 2-keto-3-deoxygluconate-6-phosphate aldolase activity (EDA), xylose isomerase activity (xylA), xylose reductase activity (XYL1), xylitol dehydrogenase activity (XYL2), xylulokinase activity (XKS1, XYL3), phosphoenolpyruvate carboxylase activity (PEP carboxylase), alcohol dehydrogenase 2 activity (ADH2), thymidylate synthase activity, phosphoglucose isomerase activity (PGI1), transaldolase activity (TAL1), transketolase activity (TKL1, TKL2), 6-phosphogluconolactonase activity (SOL3, SOL4), glucose-6-phosphate dehydrogenase activity (ZWF1), 6-phosphogluconate dehydrogenase (decarboxylating) activity (GND1, GND2), galactose permease activity (GAL2), high affinity glucose transport activity (HXT7), glucose/xylose transport activity (GXS1, GXF1), glycerol-3-phosphate dehydrogenase (GPD1, GPD2), plasma membrane channel (FPS1), pyruvate decarboxylase (PDC1), pyruvate kinase (PYK1), alcohol dehydrogenase 1 (ADH1), glutamate synthase (GLT1), alkaline phosphatase (PHO13), neutral trehalase (NTH1), glyceraldehyde-3-phosphate dehydrogenase (TDH3), trehalose-6-phosphate synthase/phosphatase complex (TPS1, TPS2), and combinations of the foregoing. In certain embodiments, one or more activities in one or more metabolic pathways can be engineered to increase carbon flux through the engineered pathways to produce a desired product (e.g., ethanol). The engineered activities can be selected for allowing increased production of metabolic intermediates that can be utilized in one or more other engineered pathways to achieve increased production of a desired product with respect to the unmodified host organism. This "carbon flux management" can be optimized for any chosen feedstock, by engineering appropriate activities in appropriate pathways.
The term "phosphofructokinase activity" as used herein refers to conversion of fructose-6-phosphate to fructose-1,6-bisphosphate. Phosphofructokinase activity may be provided by an enzyme that includes one or two subunits (referred to hereafter as "subunit A" and/or "subunit B"). The term "inactivating the Embden-Meyerhoff pathway" as used herein refers to reducing or eliminating the activity of one or more activities in the Embden-Meyerhoff pathway, including but not limited to phosphofructokinase activity. In some embodiments, the phosphofructokinase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like). In certain embodiments, the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
In some embodiments, the phosphofructokinase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders the activity responsive to changes in the environment. In some embodiments, the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein. In certain embodiments, the genetic modification deletes the nucleic acid encoding the activity. In some embodiments, the genetic modification replaces the endogenous promoter and/or coding sequence with a heterologous promoter and/or coding sequence having lower relative expression and/or lower relative specific activity. Nucleic acid sequences that can be used to reduce or eliminate phosphofructokinase activity can have sequences partially or substantially complementary to sequences described herein. Presence, absence or amount of phosphofructokinase activity can be detected by any suitable method known in the art, including requiring a five-carbon sugar carbon source or a functional Entner-Doudoroff pathway for growth. Inactivation of the Embden-Meyerhoff pathway is described in further detail below.
As referred to herein, "substantially complementary" with respect to sequences refers to nucleotide sequences that will hybridize with each other. The stringency of the hybridization conditions can be altered to tolerate varying amounts of sequence mismatch. Included are regions of counterpart, target and capture nucleotide sequences 55% or more, 56% or more, 57% or more, 58% or more, 59% or more, 60% or more, 61% or more, 62% or more, 63% or more, 64% or more, 65% or more, 66% or more, 67% or more, 68% or more, 69% or more, 70% or more, 71% or more, 72% or more, 73% or more, 74% or more, 75% or more, 76% or more, 77% or more, 78% or more, 79% or more, 80% or more, 81% or more, 82% or more, 83% or more, 84% or more, 85% or more, 86% or more, 87% or more, 88% or more, 89% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more or 99% or more complementary to each other.
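By way of non-limiting illustration, the percent complementarity between two nucleotide sequences of equal length can be estimated as sketched below in Python. The function names are hypothetical, and the calculation assumes an ungapped, antiparallel, end-to-end comparison of DNA sequences using only Watson-Crick pairs. Whether two sequences actually hybridize also depends on the stringency conditions noted above, which this sketch does not model.

```python
# Watson-Crick pairing partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(seq_a, seq_b):
    """Percent of positions at which seq_a pairs with seq_b.

    Both sequences are written 5' to 3'. seq_b is reverse-complemented so
    that a perfectly complementary pair compares as identical strings.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length for this sketch")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    target = reverse_complement(seq_b)
    paired = sum(1 for x, y in zip(seq_a.upper(), target) if x == y)
    return 100.0 * paired / len(seq_a)


if __name__ == "__main__":
    probe = "ATGCGT"
    perfect = reverse_complement(probe)   # fully complementary strand
    one_mismatch = "ACGCAA"               # differs from `perfect` at one base
    print(percent_complementarity(probe, perfect))
    print(percent_complementarity(probe, one_mismatch))
```

A value computed this way can be compared against the thresholds listed above (e.g., 55% or more through 99% or more).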
The term "phosphogluconate dehydratase activity" as used herein refers to conversion of 6-phosphogluconate to 2-keto-3-deoxy-6-p-gluconate. The phosphogluconate dehydratase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring phosphogluconate dehydratase activity can be obtained from a number of sources, including Zymomonas mobilis and Escherichia coli. Examples of an amino acid sequence of a polypeptide having phosphogluconate dehydratase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of phosphogluconate dehydratase activity can be detected by any suitable method known in the art, including western blot analysis.
The term "2-keto-3-deoxygluconate-6-phosphate aldolase activity" as used herein refers to conversion of 2-keto-3-deoxy-6-p-gluconate to pyruvate. The 2-keto-3-deoxygluconate-6-phosphate aldolase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring 2-keto-3-deoxygluconate-6-phosphate aldolase activity can be obtained from a number of sources, including Zymomonas mobilis and Escherichia coli. Examples of an amino acid sequence of a polypeptide having 2-keto-3-deoxygluconate-6-phosphate aldolase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of 2-keto-3-deoxygluconate-6-phosphate aldolase activity can be detected by any suitable method known in the art, including western blot analysis.
The term "xylose isomerase activity" as used herein refers to conversion of xylose to xylulose. The xylose isomerase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring xylose isomerase activity can be obtained from a number of sources, including Piromyces, Orpinomyces, Bacteroides (e.g., B. thetaiotaomicron, B. uniformis, B. stercoris), Clostridiales (e.g., Clostridiales BVAB3), Clostridium (e.g., C. phytofermentans, C. thermohydrosulfuricum, C. cellulyticum), Thermus thermophilus, Escherichia coli, Streptomyces (e.g., S. rubiginosus, S. murinus), Bacillus stearothermophilus, Lactobacillus pentosus, Thermotoga (e.g., T. maritima, T. neapolitana) and Ruminococcus (e.g., Ruminococcus environmental samples, Ruminococcus albus, Ruminococcus bromii, Ruminococcus callidus, Ruminococcus flavefaciens, Ruminococcus gauvreauii, Ruminococcus gnavus, Ruminococcus lactaris, Ruminococcus obeum, Ruminococcus sp., Ruminococcus sp. 14531, Ruminococcus sp. 15975, Ruminococcus sp. 16442, Ruminococcus sp. 18P13, Ruminococcus sp. 25F6, Ruminococcus sp. 25F7, Ruminococcus sp. 25F8, Ruminococcus sp. 4_1_47FAA, Ruminococcus sp. 5, Ruminococcus sp. 5_1_39BFAA, Ruminococcus sp. 7L75, Ruminococcus sp. 8_1_37FAA, Ruminococcus sp. 9SE51, Ruminococcus sp. C36, Ruminococcus sp. CB10, Ruminococcus sp. CB3, Ruminococcus sp. CCUG 37327 A, Ruminococcus sp. CE2, Ruminococcus sp. CJ60, Ruminococcus sp. CJ63, Ruminococcus sp. CO1, Ruminococcus sp. CO12, Ruminococcus sp. CO22, Ruminococcus sp. CO27, Ruminococcus sp. CO28, Ruminococcus sp. CO34, Ruminococcus sp. CO41, Ruminococcus sp. CO47, Ruminococcus sp. CO7, Ruminococcus sp. CS1, Ruminococcus sp. CS6, Ruminococcus sp. DJF_VR52, Ruminococcus sp. DJF_VR66, Ruminococcus sp. DJF_VR67, Ruminococcus sp. DJF_VR70k1, Ruminococcus sp. DJF_VR87, Ruminococcus sp. Eg2, Ruminococcus sp. Egf, Ruminococcus sp. END-1, Ruminococcus sp. FD1, Ruminococcus sp. GM2/1, Ruminococcus sp. ID1, Ruminococcus sp. ID8, Ruminococcus sp. K-1, Ruminococcus sp. KKA Seq234, Ruminococcus sp. M-1, Ruminococcus sp. M10, Ruminococcus sp. M22, Ruminococcus sp. M23, Ruminococcus sp. M6, Ruminococcus sp. M73, Ruminococcus sp. M76, Ruminococcus sp. MLG080-3, Ruminococcus sp. NML 00-0124, Ruminococcus sp. Pei041, Ruminococcus sp. SC101, Ruminococcus sp. SC103, Ruminococcus sp. Siijpesteijn 1948, Ruminococcus sp. WAL 17306, Ruminococcus sp. YE281, Ruminococcus sp. YE58, Ruminococcus sp. YE71, Ruminococcus sp. ZS2-15, Ruminococcus torques). Examples of an amino acid sequence of a polypeptide having xylose isomerase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of xylose isomerase activity can be detected by any suitable method known in the art, including western blot analysis.
The term "phosphoenolpyruvate carboxylase activity" as used herein refers to the addition of carbon dioxide to phosphoenolpyruvate to generate the four-carbon compound oxaloacetate. The phosphoenolpyruvate carboxylase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring phosphoenolpyruvate carboxylase activity can be obtained from a number of sources, including Zymomonas mobilis. Examples of an amino acid sequence of a polypeptide having phosphoenolpyruvate carboxylase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of phosphoenolpyruvate carboxylase activity can be detected by any suitable method known in the art.
The term "alcohol dehydrogenase 2 activity" as used herein refers to conversion of ethanol to acetaldehyde, which is the reverse of the forward reaction catalyzed by alcohol dehydrogenase 1. The term "inactivation of the conversion of ethanol to acetaldehyde" refers to a reduction or elimination in the activity of alcohol dehydrogenase 2. Reducing or eliminating alcohol dehydrogenase 2 activity can lead to an increase in ethanol production. In some embodiments, the alcohol dehydrogenase 2 activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like). In certain embodiments, the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
In some embodiments, the alcohol dehydrogenase 2 activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders the activity responsive to changes in the environment. In some embodiments, the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein. Nucleic acid sequences that can be used to reduce or eliminate the activity of alcohol dehydrogenase 2 can have sequences partially or substantially complementary to nucleic acid sequences that encode alcohol dehydrogenase 2 activity. Presence, absence or amount of alcohol dehydrogenase 2 activity can be detected by any suitable method known in the art, including inability to grow in media with ethanol as the sole carbon source.
The term "thymidylate synthase activity" as used herein refers to a reductive methylation, where deoxyuridine monophosphate (dUMP) and N5,N10-methylene tetrahydrofolate are together used to generate thymidine monophosphate (dTMP), yielding dihydrofolate as a secondary product. The term "temporarily inactivate thymidylate synthase activity" refers to a temporary reduction or elimination in the activity of thymidylate synthase when the modified organism is shifted to a non-permissive temperature. The activity can return to normal upon return to a permissive temperature. Temporarily inactivating thymidylate synthase uncouples cell growth from cell division while under the non-permissive temperature. This inactivation in turn allows the cells to continue fermentation without producing biomass and dividing, thus increasing the yield of product produced during fermentation.
In some embodiments, the thymidylate synthase activity can be temporarily reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders the activity responsive to changes in the environment. Nucleic acid sequences conferring temperature sensitive thymidylate synthase activity can be obtained from S. cerevisiae strain 172066 (accession number 208583). The cdc21 allele in S. cerevisiae strain 172066 carries a G139S point mutation, numbered relative to the initiating methionine. Examples of nucleotide sequences used to PCR amplify the polynucleotide encoding the temperature sensitive polypeptide are presented below in tables. Presence, absence or amount of thymidylate synthase activity can be detected by any suitable method known in the art, including growth arrest at the non-permissive temperature.
Thymidylate synthase is one of many polypeptides that regulate the cell cycle. The cell cycle may be inhibited in engineered microorganisms under certain conditions (e.g., temperature shift, dissolved oxygen shift), which can result in inhibited or reduced cell proliferation, inhibited or reduced cell division, and sometimes cell cycle arrest (collectively "cell cycle inhibition"). Upon exposure to triggering conditions, a microorganism may display cell cycle inhibition only after a certain time (e.g., there may be a time delay after a microorganism is exposed to a certain set of conditions before the microorganism displays cell cycle inhibition). Where cell cycle inhibition results in reduced cell proliferation, cell proliferation rates may be reduced by about 50% or greater, for example (e.g., reduced by about 52%, 54%, 56%, 58%, 60%, 62%, 64%, 66%, 68%, 70%, 72%, 74%, 76%, 78%, 80%, 82%, 84%, 86%, 88%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, or greater than any one of the foregoing).
Where cell cycle inhibition results in a reduced number of cells undergoing cell division, the rate of cell division may be reduced by about 50% or greater, for example (e.g., the number of cells undergoing division is reduced by about 52%, 54%, 56%, 58%, 60%, 62%, 64%, 66%, 68%, 70%, 72%, 74%, 76%, 78%, 80%, 82%, 84%, 86%, 88%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, or greater than any one of the foregoing). Where cell cycle inhibition results in cell cycle arrest, cells may be arrested at any stage of the cell cycle (e.g., resting G0 phase, interphase (e.g., G1, S, G2 phases), mitosis (e.g., prophase, prometaphase, metaphase, anaphase, telophase)) and different percentages of cells in a population can be arrested at different stages of the cell cycle.
The term "phosphoglucose isomerase activity" as used herein refers to the conversion of glucose-6-phosphate to fructose-6-phosphate. The term "inactivation of the conversion of glucose-6-phosphate to fructose-6-phosphate" refers to a reduction or elimination in the activity of phosphoglucose isomerase. Reducing or eliminating phosphoglucose isomerase activity can lead to an increase in ethanol production. In some embodiments, the phosphoglucose isomerase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like). In certain embodiments, the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism. In some embodiments, the phosphoglucose isomerase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders the activity responsive to changes in the environment. In some embodiments, the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein. Nucleic acid sequences that can be used to reduce or eliminate the activity of phosphoglucose isomerase can have sequences partially or substantially complementary to nucleic acid sequences that encode phosphoglucose isomerase activity. Presence, absence or amount of phosphoglucose isomerase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The term "glucose-6-phosphate dehydrogenase activity" as used herein refers to conversion of glucose-6-phosphate to gluconolactone-6-phosphate coupled with the generation of NADPH. The glucose-6-phosphate dehydrogenase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring glucose-6-phosphate dehydrogenase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae. Examples of a nucleotide sequence of a polynucleotide that encodes the polypeptide are presented below in tables. Presence, absence or amount of glucose-6-phosphate dehydrogenase activity can be detected by any suitable method known in the art, including western blot analysis.
The term "6-phosphogluconolactonase activity" as used herein refers to conversion of gluconolactone-6-phosphate to gluconate-6-phosphate. The 6-phosphogluconolactonase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring 6-phosphogluconolactonase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae. Examples of an amino acid sequence of a polypeptide having 6-phosphogluconolactonase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of 6-phosphogluconolactonase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The term "6-phosphogluconate dehydrogenase (decarboxylating) activity" as used herein refers to the conversion of gluconate-6-phosphate to ribulose-5-phosphate. The term "inactivation of the conversion of gluconate-6-phosphate to ribulose-5-phosphate" refers to a reduction or elimination in the activity of 6-phosphogluconate dehydrogenase. Reducing or eliminating 6-phosphogluconate dehydrogenase (decarboxylating) activity can lead to an increase in ethanol production. In some embodiments, the 6-phosphogluconate dehydrogenase (decarboxylating) activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like). In certain embodiments, the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
In some embodiments, the 6-phosphogluconate dehydrogenase (decarboxylating) activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders the activity responsive to changes in the environment. In some embodiments, the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein. Nucleic acid sequences that can be used to reduce or eliminate the activity of 6-phosphogluconate dehydrogenase (decarboxylating) can have sequences partially or substantially complementary to nucleic acid sequences that encode 6-phosphogluconate dehydrogenase (decarboxylating) activity. Presence, absence or amount of 6-phosphogluconate dehydrogenase (decarboxylating) activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The term "transketolase activity" as used herein refers to conversion of xylulose-5-phosphate and ribose-5-phosphate to sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. The transketolase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring transketolase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae, Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida. Examples of an amino acid sequence of a polypeptide having transketolase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in the examples. The term "inactivation of the conversion of xylulose-5-phosphate and ribose-5-phosphate to sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate" refers to a reduction or elimination in the activity of transketolase. Reducing or eliminating transketolase activity can lead to an increase in ethanol production. In some embodiments, the transketolase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like). In certain embodiments, the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
In some embodiments, the transketolase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders the activity responsive to changes in the environment. In some embodiments, the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein. Nucleic acid sequences that can be used to reduce or eliminate the activity of transketolase can have sequences partially or substantially complementary to nucleic acid sequences that encode transketolase activity. Presence, absence or amount of transketolase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The term "transaldolase activity" as used herein refers to conversion of sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate to erythrose 4-phosphate and fructose 6-phosphate. The transaldolase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring transaldolase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae, Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida. Examples of an amino acid sequence of a polypeptide having transaldolase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in the examples. The term "inactivation of the conversion of sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate to erythrose 4-phosphate and fructose 6-phosphate" refers to a reduction or elimination in the activity of transaldolase. Reducing or eliminating transaldolase activity can lead to an increase in ethanol production. In some embodiments, the transaldolase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like). In certain embodiments, the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
In some embodiments, the transaldolase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders the activity responsive to changes in the environment. In some embodiments, the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein. Nucleic acid sequences that can be used to reduce or eliminate the activity of transaldolase can have sequences partially or substantially complementary to nucleic acid sequences that encode transaldolase activity. Presence, absence or amount of transaldolase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The term "galactose permease activity" as used herein refers to the import of galactose into a cell or organism by an activity that transports galactose across cell membranes. The galactose permease activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring galactose permease activity can be obtained from a number of sources, including, but not limited to S. cerevisiae, Candida albicans, Debaryomyces hansenii, Schizosaccharomyces pombe, Arabidopsis thaliana, and Colwellia psychrerythraea. Examples of an amino acid sequence of a polypeptide having galactose permease activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in the Examples. Presence, absence or amount of galactose permease activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The term "glucose/xylose transport activity" as used herein refers to the import of glucose and/or xylose into a cell or organism by an activity that transports glucose and/or xylose across cell membranes. The glucose/xylose transport activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring glucose/xylose transport activity can be obtained from a number of sources, including, but not limited to Pichia yeast, S. cerevisiae, Candida albicans, Debaryomyces hansenii, Schizosaccharomyces pombe, Arabidopsis thaliana, and Colwellia psychrerythraea. Examples of an amino acid sequence of a polypeptide having glucose/xylose transport activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in the Examples. Presence, absence or amount of glucose/xylose transport activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The terms "high affinity glucose transport activity" and "hexose transport activity" as used herein refer to the import of glucose and other hexose sugars into a cell or organism by an activity that transports glucose and other hexose sugars across cell membranes. The high affinity glucose transport activity or hexose transport activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring high affinity glucose transport activity or hexose transport activity can be obtained from a number of sources, including, but not limited to S. cerevisiae, Candida albicans, Debaryomyces hansenii, Schizosaccharomyces pombe, Arabidopsis thaliana, and Colwellia psychrerythraea. Presence, absence or amount of glucose/xylose transport activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The term "xylose reductase activity" as used herein refers to the conversion of xylose to xylitol. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring xylose reductase activity can be obtained from a number of sources. Presence, absence or amount of xylose reductase activity can be detected by any suitable method known in the art, including activity assays, nucleic acid based analysis and western blot analysis.
The term "xylitol dehydrogenase activity" as used herein refers to the conversion of xylitol to xylulose. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring xylitol dehydrogenase activity can be obtained from a number of sources. Presence, absence or amount of xylitol dehydrogenase activity can be detected by any suitable method known in the art, including activity assays, nucleic acid based analysis and western blot analysis.
The term "xylulokinase activity" as used herein refers to the conversion of xylulose to xylulose-5-phosphate. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring xylulokinase activity can be obtained from a number of sources. Presence, absence or amount of xylulokinase activity can be detected by any suitable method known in the art, including activity assays, nucleic acid based analysis and western blot analysis.
The term "trehalose-6-phosphate synthase activity" as used herein refers to the conversion of 2 molecules of phosphorylated glucose (e.g., UDP-glucose and D-glucose-6-phosphate) into a molecule of trehalose. The term "decrease the proportion of glucose converted into the storage carbohydrate, trehalose" refers to a reduction or elimination of trehalose-6-phosphate synthase (e.g., TPS1) activity. Reducing or eliminating the activity of trehalose-6-phosphate synthase can lead to an increase in ethanol production. In some embodiments, the trehalose-6-phosphate synthase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like). In certain embodiments, the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
In some embodiments, the trehalose-6-phosphate synthase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders the activity responsive to changes in the environment. In some embodiments, the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein. Nucleic acid sequences that can be used to reduce or eliminate the activity of trehalose-6-phosphate synthase can have sequences partially or substantially complementary to nucleic acid sequences that encode trehalose-6-phosphate synthase activity. Presence, absence or amount of trehalose-6-phosphate synthase activity can be detected by any suitable method known in the art, including the method described by Hottinger et al. (1987, J. Bacteriology 169: 5518-5522).
The term "trehalose-6-phosphate phosphatase activity" as used herein refers to the hydrolysis of alpha, alpha-trehalose-6-phosphate to remove the phosphate, a prerequisite for further metabolism by a trehalase activity. The trehalose-6-phosphate phosphatase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by an endogenous nucleotide sequence. In certain embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring trehalose-6-phosphate phosphatase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae. Examples of an amino acid sequence of a polypeptide having trehalose-6-phosphate phosphatase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of trehalose-6-phosphate phosphatase activity can be detected by any suitable method known in the art, including nucleic acid based analysis, western blot analysis, and the method described by De Virgilio et al. (Eur. J. Biochem. 212: 315-323).
The term "FPS1 encoded plasma membrane channel" as used herein refers to a polypeptide encoded by the FPS1 gene that functions to allow efflux of glycerol from the cell and influx of acetic acid into the cell. The terms "decreasing the efflux of glycerol from the cell" and "decreasing the number of plasma membrane channels encoded by FPS1" refer to a reduction in number or complete elimination of the plasma membrane channels encoded by FPS1 and present in the plasma membranes of engineered organisms described herein, including the fps1 phenotype. Reducing the number of, or completely eliminating, the plasma membrane channels encoded by FPS1 can lead to an increase in ethanol production. In some embodiments, the plasma membrane channels encoded by FPS1 can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like). In certain embodiments, the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism. In some embodiments, the plasma membrane channels encoded by FPS1 can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders expression of the gene encoding the membrane channels responsive to changes in the environment. In some embodiments, the genetic modification disrupts synthesis of a functional nucleic acid encoding the membrane channels or produces a nonfunctional polypeptide or protein. Nucleic acid sequences that can be used to reduce or eliminate the plasma membrane channels encoded by FPS1 can have sequences partially or substantially complementary to nucleic acid sequences that encode FPS1. FPS1 deleted cells show increased resistance to acetic acid when compared to cells native for FPS1 plasma membrane channels. Presence or absence of the plasma membrane channels encoded by FPS1 can be detected by any suitable method known in the art, including acetic acid resistance, and ability to grow on acetic acid.
The term "glyceraldehyde-3-phosphate dehydrogenase activity" as used herein refers to conversion of glyceraldehyde-3-phosphate to 1,3-bisphosphoglycerate. The glyceraldehyde-3-phosphate dehydrogenase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by an endogenous nucleotide sequence. In certain embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring glyceraldehyde-3-phosphate dehydrogenase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae. Examples of an amino acid sequence of a polypeptide having glyceraldehyde-3-phosphate dehydrogenase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of glyceraldehyde-3-phosphate dehydrogenase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The term "glutamate synthase activity" as used herein refers to the reversible conversion of 2 molecules of L-glutamate into L-glutamine and 2-oxoglutarate. The glutamate synthase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by an endogenous nucleotide sequence. In certain embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring glutamate synthase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae. Examples of an amino acid sequence of a polypeptide having glutamate synthase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of glutamate synthase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The term "alcohol dehydrogenase 1 activity" as used herein refers to the conversion of acetaldehyde to ethanol. The alcohol dehydrogenase 1 activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by an endogenous nucleotide sequence. In certain embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring alcohol dehydrogenase 1 activity can be obtained from a number of sources, including, but not limited to S. cerevisiae. Examples of an amino acid sequence of a polypeptide having alcohol dehydrogenase 1 activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of alcohol dehydrogenase 1 activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The term "pyruvate decarboxylase activity" as used herein refers to the decarboxylation of a 2-oxo acid into an aldehyde, where the 2-oxo acid generally is pyruvic acid and the aldehyde generally is acetaldehyde. The pyruvate decarboxylase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by an endogenous nucleotide sequence. In certain embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring pyruvate decarboxylase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae. Examples of an amino acid sequence of a polypeptide having pyruvate decarboxylase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of pyruvate decarboxylase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
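Because pyruvate decarboxylase and alcohol dehydrogenase together carry pyruvate to ethanol, an "increase in ethanol production" is commonly judged against the theoretical maximum yield. The following Python sketch is illustrative only and is not taken from the disclosure. It assumes the textbook overall stoichiometries (1 glucose to 2 ethanol + 2 CO2; 3 xylose to 5 ethanol + 5 CO2) and rounded molecular weights, and the function names are hypothetical.

```python
# Rounded molecular weights in g/mol (assumed textbook values).
GLUCOSE_MW = 180.16
XYLOSE_MW = 150.13
ETHANOL_MW = 46.07


def theoretical_ethanol_yield(sugar_mw: float, ethanol_per_sugar: float) -> float:
    """Maximum grams of ethanol per gram of sugar for a given molar ratio."""
    return ethanol_per_sugar * ETHANOL_MW / sugar_mw


def percent_of_theoretical(ethanol_g: float, sugar_g: float, max_yield: float) -> float:
    """Observed yield expressed as a percentage of the theoretical maximum."""
    if sugar_g <= 0:
        raise ValueError("sugar consumed must be positive")
    return 100.0 * (ethanol_g / sugar_g) / max_yield


# Assumed stoichiometry: 2 mol ethanol per mol glucose,
# 5 mol ethanol per 3 mol xylose.
GLUCOSE_MAX = theoretical_ethanol_yield(GLUCOSE_MW, 2.0)
XYLOSE_MAX = theoretical_ethanol_yield(XYLOSE_MW, 5.0 / 3.0)

if __name__ == "__main__":
    print(f"glucose: {GLUCOSE_MAX:.3f} g/g, xylose: {XYLOSE_MAX:.3f} g/g")
    # Hypothetical run: 46 g ethanol from 100 g glucose consumed.
    print(f"{percent_of_theoretical(46.0, 100.0, GLUCOSE_MAX):.1f}% of theoretical")
```

Both sugars give a theoretical maximum of roughly 0.51 g ethanol per g sugar under these assumptions, so an engineered strain and its native counterpart can be compared on the same scale.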
The term "pyruvate kinase activity" as used herein refers to the dephosphorylation of phosphoenolpyruvate to yield pyruvate and ATP. The pyruvate kinase activity can be provided by a polypeptide. In some embodiments, the polypeptide is encoded by an endogenous nucleotide sequence. In certain embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring pyruvate kinase activity can be obtained from a number of sources, including, but not limited to S. cerevisiae. Examples of an amino acid sequence of a polypeptide having pyruvate kinase activity, and a nucleotide sequence of a polynucleotide that encodes the polypeptide, are presented below in tables. Presence, absence or amount of pyruvate kinase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
The terms "alkaline phosphatase activity" and "alkaline phosphatase activity specific for p-nitrophenyl phosphate" as used herein refer to the hydrolysis of a phosphate monoester into an alcohol and inorganic phosphate. The term "decreasing the level of PHO13 activity" refers to a reduction or elimination of alkaline phosphatase activity. Reducing or eliminating the activity of alkaline phosphatase can lead to an increase in ethanol production. In some embodiments, the alkaline phosphatase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like). In certain embodiments, the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism. In some embodiments, the alkaline phosphatase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders the activity responsive to changes in the environment. In some embodiments, the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein. Nucleic acid sequences that can be used to reduce or eliminate the activity of alkaline phosphatase can have sequences partially or substantially complementary to nucleic acid sequences that encode alkaline phosphatase activity. Presence, absence or amount of alkaline phosphatase activity can be detected by any suitable method known in the art, including alkaline phosphatase activity assays in which PHO13 activity is assayed using para-nitrophenylphosphate (pNPP), a chromogenic substrate for acid and alkaline phosphatase.
The term "neutral trehalase activity" as used herein refers to the hydrolysis of alpha, alpha-6-trehalose into 2 molecules of glucose. The term "decreasing the level of NTH1 activity" refers to a reduction or elimination of neutral trehalase activity. Reducing or eliminating the activity of neutral trehalase can lead to an increase in ethanol production. In some embodiments, the neutral trehalase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like). In certain embodiments, the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism.
In some embodiments, the neutral trehalase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders the activity responsive to changes in the environment. In some embodiments, the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein. Nucleic acid sequences that can be used to reduce or eliminate the activity of neutral trehalase can have sequences partially or substantially complementary to nucleic acid sequences that encode a neutral trehalase activity. The rapid degradation of trehalose in intact cells when recovering from heat stress is not observed in mutant cells carrying a disrupted or a deleted gene of neutral trehalase (nth1Δ). Presence, absence or amount of neutral trehalase activity can be detected by any suitable method known in the art, including the inability to grow on trehalose as the sole carbon source, or by the neutral enzyme overlay test described by Kopp et al. (JBC, 1993, 268: 4766-4774).
The term "glycerol-3-phosphate dehydrogenase activity" as used herein refers to the conversion of dihydroxyacetone phosphate into sn-glycerol-3-phosphate. The term "decreasing glycerol-3-phosphate dehydrogenase activity" refers to a reduction or elimination of glycerol-3-phosphate dehydrogenase activity. Reducing or eliminating the activity of glycerol-3-phosphate dehydrogenase can lead to an increase in ethanol production. In some embodiments, the glycerol-3-phosphate dehydrogenase activity can be reduced or eliminated by introduction of an untranslated RNA molecule (e.g., antisense RNA, RNAi, and the like). In certain embodiments, the untranslated RNA is encoded by a heterologous nucleotide sequence introduced to a host microorganism. In some embodiments, the glycerol-3-phosphate dehydrogenase activity can be temporarily or permanently reduced or eliminated by genetic modification, as described below. In certain embodiments, the genetic modification renders the activity responsive to changes in the environment. In some embodiments, the genetic modification disrupts synthesis of a functional nucleic acid encoding the activity or produces a nonfunctional polypeptide or protein. Nucleic acid sequences that can be used to reduce or eliminate the activity of glycerol-3-phosphate dehydrogenase can have sequences partially or substantially complementary to nucleic acid sequences that encode a glycerol-3-phosphate dehydrogenase activity. Mutants in GPD1 activity often produce less glycerol and are more sensitive to osmotic stress than strains native for GPD1. Mutants lacking GPD2 activity often show poor growth under anaerobic conditions. Mutants lacking GPD1 and GPD2 activity generally do not produce detectable levels of glycerol and are highly osmosensitive. Presence, absence or amount of glycerol-3-phosphate dehydrogenase activity can be detected by any suitable method known in the art, including the inability to grow under anoxic conditions.
Activities described herein can be modified to generate microorganisms engineered to allow a method of independently regulating or controlling (e.g., ability to independently turn on or off, or increase or decrease) six-carbon sugar metabolism, five-carbon sugar metabolism, atmospheric carbon metabolism (e.g., carbon dioxide fixation) or combinations thereof. In some embodiments, regulated control of a desired activity can be the result of a genetic modification. In certain embodiments, the genetic modification can be modification of a promoter sequence. In some embodiments the modification can increase or decrease an activity encoded by a gene operably linked to the promoter element. In certain embodiments, the modification to the promoter element can add or remove a regulatory sequence. In some embodiments the regulatory sequence can respond to a change in environmental or culture conditions. Non-limiting examples of culture conditions that could be used to regulate an activity in this manner include temperature, light, oxygen, salt, metals and the like. Additional methods for altering an activity by modification of a promoter element are given below. In some embodiments, the genetic modification can be to an ORF. In certain embodiments, the modification of the ORF can increase or decrease expression of the ORF. In some embodiments modification of the ORF can alter the efficiency of translation of the ORF. In certain embodiments, modification of the ORF can alter the activity of the polypeptide or protein encoded by the ORF. Additional methods for altering an activity by modification of an ORF are given below. In some embodiments, the genetic modification can be to an activity associated with cell division (e.g., cell division cycle or CDC activity). In certain embodiments the cell division cycle activity can be thymidylate synthase activity.
In certain embodiments, regulated control of cell division can be the result of a genetic modification. In some embodiments, the genetic modification can be to a nucleic acid sequence that encodes thymidylate synthase. In certain embodiments, the genetic modification can temporarily inactivate thymidylate synthase activity by rendering the activity temperature sensitive (e.g., heat resistant, heat sensitive, cold resistant, cold sensitive and the like). In some embodiments, the genetic modification can modify a promoter sequence operably linked to a gene encoding an activity involved in control of cell division. In some embodiments the modification can increase or decrease an activity encoded by a gene operably linked to the promoter element. In certain embodiments, the modification to the promoter element can add or remove a regulatory sequence. In some embodiments the regulatory sequence can respond to a change in environmental or culture conditions. Non-limiting examples of culture conditions that could be used to regulate an activity in this manner include temperature, light, oxygen, salt, metals and the like. In some embodiments, an engineered microorganism comprising one or more activities described above or below can be used to produce ethanol by inhibiting cell growth and cell division by use of a temperature sensitive cell division control activity while allowing cellular fermentation to proceed, thereby producing a significant increase in ethanol yield when compared to the native organism.
Polynucleotides and Polypeptides
A nucleic acid (e.g., also referred to herein as nucleic acid reagent, target nucleic acid, target nucleotide sequence, nucleic acid sequence of interest or nucleic acid region of interest) can be from any source or composition, such as DNA, cDNA, gDNA (genomic DNA), RNA, siRNA (short inhibitory RNA), RNAi, tRNA or mRNA, for example, and can be in any form (e.g., linear, circular, supercoiled, single-stranded, double-stranded, and the like). A nucleic acid can also comprise DNA or RNA analogs (e.g., containing base analogs, sugar analogs and/or a non-native backbone and the like). It is understood that the term "nucleic acid" does not refer to or imply a specific length of the polynucleotide chain, thus polynucleotides and oligonucleotides are also included in the definition. Deoxyribonucleotides include deoxyadenosine, deoxycytidine, deoxyguanosine and deoxythymidine. For RNA, the uracil base is uridine. A nucleic acid sometimes is a plasmid, phage, autonomously replicating sequence (ARS), centromere, artificial chromosome, yeast artificial chromosome (e.g., YAC) or other nucleic acid able to replicate or be replicated in a host cell. In certain embodiments a nucleic acid can be from a library or can be obtained from enzymatically digested, sheared or sonicated genomic DNA (e.g., fragmented) from an organism of interest. In some embodiments, nucleic acid subjected to fragmentation or cleavage may have a nominal, average or mean length of about 5 to about 10,000 base pairs, about 100 to about 1,000 base pairs, about 100 to about 500 base pairs, or about 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000 or 10000 base pairs.
Fragments can be generated by any suitable method in the art, and the average, mean or nominal length of nucleic acid fragments can be controlled by the person of ordinary skill by selecting an appropriate fragment-generating procedure. In some embodiments, the fragmented DNA can be size selected to obtain nucleic acid fragments of a particular size range. Nucleic acid can be fragmented by various methods known to the person of ordinary skill, which include without limitation, physical, chemical and enzymic processes. Examples of such processes are described in U.S. Patent Application Publication No. 20050112590 (published on May 26, 2005, entitled "Fragmentation-based methods and systems for sequence variation detection and discovery," naming Van Den Boom et al.). Certain processes can be selected by the person of ordinary skill to generate non-specifically cleaved fragments or specifically cleaved fragments. Examples of processes that can generate non-specifically cleaved fragments of sample nucleic acid include, without limitation, contacting sample nucleic acid with an apparatus that exposes nucleic acid to shearing force (e.g., passing nucleic acid through a syringe needle; use of a French press); exposing sample nucleic acid to irradiation (e.g., gamma, x-ray, UV irradiation; fragment sizes can be controlled by irradiation intensity); boiling nucleic acid in water (e.g., yields about 500 base pair fragments) and exposing nucleic acid to an acid and base hydrolysis process.
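Non-specific fragmentation followed by size selection can be pictured with a toy simulation. The following Python sketch is illustrative only: it assumes uniformly random break positions along a linear molecule (a simplification of shearing, not a model from the disclosure), and the function names, seed and size window are hypothetical.

```python
import random
from statistics import mean


def random_shear(sequence_length: int, n_breaks: int, seed: int = 0) -> list[int]:
    """Simulate non-specific cleavage of a linear molecule.

    Picks n_breaks distinct break positions uniformly at random and returns
    the resulting fragment lengths in base pairs.
    """
    if not 0 <= n_breaks < sequence_length:
        raise ValueError("n_breaks must be between 0 and sequence_length - 1")
    rng = random.Random(seed)
    breaks = sorted(rng.sample(range(1, sequence_length), n_breaks))
    edges = [0, *breaks, sequence_length]
    return [right - left for left, right in zip(edges, edges[1:])]


def size_select(fragment_lengths: list[int], min_bp: int, max_bp: int) -> list[int]:
    """Keep only fragments whose length falls within [min_bp, max_bp]."""
    return [n for n in fragment_lengths if min_bp <= n <= max_bp]


if __name__ == "__main__":
    fragments = random_shear(sequence_length=100_000, n_breaks=300, seed=1)
    selected = size_select(fragments, min_bp=100, max_bp=500)
    print(f"{len(fragments)} fragments, mean length {mean(fragments):.0f} bp")
    print(f"{len(selected)} fragments retained in the 100-500 bp window")
```

Raising `n_breaks` in this sketch shortens the mean fragment length, which parallels controlling fragment size by the choice or intensity of the fragmentation procedure.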
Nucleic acid may be specifically cleaved by contacting the nucleic acid with one or more specific cleavage agents. The term "specific cleavage agent" as used herein refers to an agent, sometimes a chemical or an enzyme, that can cleave a nucleic acid at one or more specific sites. Specific cleavage agents often will cleave specifically according to a particular nucleotide sequence at a particular site. Examples of enzymic specific cleavage agents include without limitation endonucleases (e.g., DNase (e.g., DNase I, II); RNase (e.g., RNase E, F, H, P); Cleavase™ enzyme; Taq DNA polymerase; E. coli DNA polymerase I and eukaryotic structure-specific endonucleases; murine FEN-1 endonucleases; type I, II or III restriction endonucleases such as Acc I, Afl III, Alu I, Alw44 I, Apa I, Asn I, Ava I, Ava II, BamH I, Ban II, Bcl I, Bgl I, Bgl II, Bin I, Bsm I, BssH II, BstE II, Cfo I, Cla I, Dde I, Dpn I, Dra I, EclX I, EcoR I, EcoR II, EcoR V, Hae II, Hind II, Hind III, Hpa I, Hpa II, Kpn I, Ksp I, Mlu I, MluN I, Msp I, Nci I, Nco I, Nde I, Nde II, Nhe I, Not I, Nru I, Nsi I, Pst I, Pvu I, Pvu II, Rsa I, Sac I, Sal I, Sau3A I, Sca I, ScrF I, Sfi I, Sma I, Spe I, Sph I, Ssp I, Stu I, Sty I, Swa I, Taq I, Xba I, Xho I); glycosylases (e.g., uracil-DNA glycosylase (UDG), 3-methyladenine DNA glycosylase, 3-methyladenine DNA glycosylase II, pyrimidine hydrate-DNA glycosylase, FaPy-DNA glycosylase, thymine mismatch-DNA glycosylase, hypoxanthine-DNA glycosylase, 5-Hydroxymethyluracil DNA glycosylase (HmUDG), 5-Hydroxymethylcytosine DNA glycosylase, or 1,N6-etheno-adenine DNA glycosylase); exonucleases (e.g., exonuclease III); ribozymes, and DNAzymes. Sample nucleic acid may be treated with a chemical agent, or synthesized using modified nucleotides, and the modified nucleic acid may be cleaved. In non-limiting examples, sample nucleic acid may be treated with (i) alkylating agents such as methylnitrosourea that generate several alkylated bases, including N3-methyladenine and N3-methylguanine, which are recognized and cleaved by alkyl purine DNA-glycosylase; (ii) sodium bisulfite, which causes deamination of cytosine residues in DNA to form uracil residues that can be cleaved by uracil N-glycosylase; and (iii) a chemical agent that converts guanine to its oxidized form, 8-hydroxyguanine, which can be cleaved by formamidopyrimidine DNA N-glycosylase. Examples of chemical cleavage processes include without limitation alkylation (e.g., alkylation of phosphorothioate-modified nucleic acid); cleavage based on the acid lability of P3'-N5'-phosphoroamidate-containing nucleic acid; and osmium tetroxide and piperidine treatment of nucleic acid.
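As an illustration of sequence-specific cleavage, the following Python sketch performs a simplified in silico digest of a linear sequence. It is not part of the described methods: the function name `digest`, the toy input sequence and the single-strand treatment are assumptions made for the example. The default site and cut position follow the well-known EcoR I recognition sequence GAATTC, cut after the first G.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear sequence at every occurrence of a recognition site.

    Simplifications (assumptions for illustration only): the sequence is
    treated as a single top strand, the cut is placed `cut_offset` bases
    into each site, and sticky ends are not modeled.
    """
    sequence = sequence.upper()
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    # Whatever remains after the last cut is the final fragment.
    fragments.append(sequence[start:])
    return fragments


toy_sequence = "AAAGAATTCCCCGAATTCTT"  # hypothetical input with two sites
fragments = digest(toy_sequence)
print(fragments)
print([len(f) for f in fragments])
```

Concatenating the fragments in order reproduces the input, which is a quick sanity check when a different site or offset is supplied.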
As used herein, the term "complementary cleavage reactions" refers to cleavage reactions that are carried out on the same nucleic acid using different cleavage reagents or by altering the cleavage specificity of the same cleavage reagent such that alternate cleavage patterns of the same target or reference nucleic acid or protein are generated. In certain embodiments, nucleic acids of interest may be treated with one or more specific cleavage agents (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more specific cleavage agents) in one or more reaction vessels (e.g., nucleic acid of interest is treated with each specific cleavage agent in a separate vessel).
A nucleic acid suitable for use in the embodiments described herein sometimes is amplified by any amplification process known in the art (e.g., PCR, RT-PCR and the like). Nucleic acid amplification may be particularly beneficial when using organisms that are typically difficult to culture (e.g., slow growing, require specialized culture conditions and the like). The terms "amplify", "amplification", "amplification reaction", or "amplifying" as used herein refer to any in vitro processes for multiplying the copies of a target sequence of nucleic acid. Amplification sometimes refers to an "exponential" increase in target nucleic acid. However, "amplifying" as used herein can also refer to linear increases in the numbers of a select target sequence of nucleic acid, but differs from a one-time, single primer extension step. In some embodiments, a limited amplification reaction, also known as pre-amplification, can be performed. Pre-amplification is a method in which a limited amount of amplification occurs due to a small number of cycles, for example 10 cycles, being performed. Pre-amplification can allow some amplification, but stops amplification prior to the exponential phase, and typically produces about 500 copies of the desired nucleotide sequence(s). Use of pre-amplification may also limit inaccuracies associated with depleted reactants in standard PCR reactions.
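The relationship between cycle number and copy number can be sketched with the standard idealized amplification model, in which each cycle multiplies the copy number by (1 + efficiency). The Python example below is an illustration only: the function name `copies_after` and the efficiency value of 0.86 are assumptions chosen for the example and are not taken from the text. Under that assumed efficiency, 10 cycles starting from one template yield a figure in the neighborhood of the roughly 500 copies mentioned above, whereas perfect doubling would give 1024.

```python
def copies_after(cycles: int, efficiency: float = 1.0, start_copies: float = 1.0) -> float:
    """Idealized copy number after a number of amplification cycles.

    Model (an assumption, not a measured result): every cycle multiplies
    the copy number by (1 + efficiency), with efficiency between 0 and 1.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


# Perfect doubling for 10 cycles from a single template.
print(copies_after(10))
# Same cycle count with a hypothetical per-cycle efficiency of 0.86.
print(round(copies_after(10, efficiency=0.86)))
```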
In some embodiments, a nucleic acid reagent sometimes is stably integrated into the chromosome of the host organism, or a nucleic acid reagent can be a deletion of a portion of the host chromosome, in certain embodiments (e.g., genetically modified organisms, where alteration of the host genome confers the ability to selectively or preferentially maintain the desired organism carrying the genetic modification). Such nucleic acid reagents (e.g., nucleic acids or genetically modified organisms whose altered genome confers a selectable trait to the organism) can be selected for their ability to guide production of a desired protein or nucleic acid molecule. When desired, the nucleic acid reagent can be altered such that codons encode for (i) the same amino acid, using a different tRNA than that specified in the native sequence, or (ii) a different amino acid than is normal, including unconventional or unnatural amino acids (including detectably labeled amino acids). As described herein, the term "native sequence" refers to an unmodified nucleotide sequence as found in its natural setting (e.g., a nucleotide sequence as found in an organism).
A nucleic acid or nucleic acid reagent can comprise certain elements often selected according to the intended use of the nucleic acid. Any of the following elements can be included in or excluded from a nucleic acid reagent. A nucleic acid reagent, for example, may include one or more or all of the following nucleotide elements: one or more promoter elements, one or more 5' untranslated regions (5'UTRs), one or more regions into which a target nucleotide sequence may be inserted (an "insertion element"), one or more target nucleotide sequences, one or more 3' untranslated regions (3'UTRs), and one or more selection elements. A nucleic acid reagent can be provided with one or more of such elements and other elements may be inserted into the nucleic acid before the nucleic acid is introduced into the desired organism. In some embodiments, a provided nucleic acid reagent comprises a promoter, 5'UTR, optional 3'UTR and insertion element(s) by which a target nucleotide sequence is inserted (i.e., cloned) into the nucleic acid reagent. In certain embodiments, a provided nucleic acid reagent comprises a promoter, insertion element(s) and optional 3'UTR, and a 5'UTR/target nucleotide sequence is inserted with an optional 3'UTR. The elements can be arranged in any order suitable for expression in the chosen expression system (e.g., expression in a chosen organism, or expression in a cell free system, for example), and in some embodiments a nucleic acid reagent comprises the following elements in the 5' to 3' direction: (1) promoter element, 5'UTR, and insertion element(s); (2) promoter element, 5'UTR, and target nucleotide sequence; (3) promoter element, 5'UTR, insertion element(s) and 3'UTR; and (4) promoter element, 5'UTR, target nucleotide sequence and 3'UTR.
A promoter element typically is required for DNA synthesis and/or RNA synthesis. A promoter element often comprises a region of DNA that can facilitate the transcription of a particular gene, by providing a start site for the synthesis of RNA corresponding to a gene. Promoters generally are located near the genes they regulate, are located upstream of the gene (e.g., 5' of the gene), and are on the same strand of DNA as the sense strand of the gene, in some embodiments.
A promoter often interacts with a RNA polymerase. A polymerase is an enzyme that catalyses synthesis of nucleic acids using a preexisting nucleic acid reagent. When the template is a DNA template, an RNA molecule is transcribed before protein is synthesized. Enzymes having polymerase activity suitable for use in the present methods include any polymerase that is active in the chosen system with the chosen template to synthesize protein. In some embodiments, a promoter (e.g., a heterologous promoter) also referred to herein as a promoter element, can be operably linked to a nucleotide sequence or an open reading frame (ORF). Transcription from the promoter element can catalyze the synthesis of an RNA corresponding to the nucleotide sequence or ORF sequence operably linked to the promoter, which in turn leads to synthesis of a desired peptide, polypeptide or protein. The term "operably linked" as used herein with respect to promoters refers to a nucleic acid sequence (e.g., a coding sequence) present on the same nucleic acid molecule as a promoter element and whose expression is under the control of said promoter element.
Promoter elements sometimes exhibit responsiveness to regulatory control. Promoter elements also sometimes can be regulated by a selective agent. That is, transcription from promoter elements sometimes can be turned on, turned off, up-regulated or down-regulated, in response to a change in environmental, nutritional or internal conditions or signals (e.g., heat inducible promoters, light regulated promoters, feedback regulated promoters, hormone influenced promoters, tissue specific promoters, oxygen and pH influenced promoters, promoters that are responsive to selective agents (e.g., kanamycin) and the like, for example). Promoters influenced by environmental, nutritional or internal signals frequently are influenced by a signal (direct or indirect) that binds at or near the promoter and increases or decreases expression of the target sequence under certain conditions. Non-limiting examples of selective or regulatory agents that can influence transcription from a promoter element used in embodiments described herein include (1) nucleic acid segments that encode products that provide resistance against otherwise toxic compounds (e.g., antibiotics); (2) nucleic acid segments that encode products that are otherwise lacking in the recipient cell (e.g., essential products, tRNA genes, auxotrophic markers); (3) nucleic acid segments that encode products that suppress the activity of a gene product; (4) nucleic acid segments that encode products that can be readily identified (e.g., phenotypic markers such as antibiotics (e.g., β-lactamase), β-galactosidase, green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP), cyan fluorescent protein (CFP), and cell surface proteins); (5) nucleic acid segments that bind products that are otherwise detrimental to cell survival and/or function; (6) nucleic acid segments that otherwise inhibit the activity of any of the nucleic acid segments described in Nos. 1-5 above (e.g., antisense oligonucleotides); (7) nucleic acid segments that bind products that modify a substrate (e.g., restriction endonucleases); (8) nucleic acid segments that can be used to isolate or identify a desired molecule (e.g., specific protein binding sites); (9) nucleic acid segments that encode a specific nucleotide sequence that can be otherwise non-functional (e.g., for PCR amplification of subpopulations of molecules); (10) nucleic acid segments that, when absent, directly or indirectly confer resistance or sensitivity to particular compounds; (11) nucleic acid segments that encode products that either are toxic or convert a relatively non-toxic compound to a toxic compound (e.g., Herpes simplex thymidine kinase, cytosine deaminase) in recipient cells; (12) nucleic acid segments that inhibit replication, partition or heritability of nucleic acid molecules that contain them; and/or (13) nucleic acid segments that encode conditional replication functions, e.g., replication in certain hosts or host cell strains or under certain environmental conditions (e.g., temperature, nutritional conditions, and the like). In some embodiments, the regulatory or selective agent can be added to change the existing growth conditions to which the organism is subjected (e.g., growth in liquid culture, growth in a fermentor, growth on solid nutrient plates and the like for example).
In some embodiments, regulation of a promoter element can be used to alter (e.g., increase, add, decrease or substantially eliminate) the activity of a peptide, polypeptide or protein (e.g., enzyme activity for example). For example, a microorganism can be engineered by genetic modification to express a nucleic acid reagent that can add a novel activity (e.g., an activity not normally found in the host organism) or increase the expression of an existing activity by increasing transcription from a homologous or heterologous promoter operably linked to a nucleotide sequence of interest (e.g., homologous or heterologous nucleotide sequence of interest), in certain embodiments. In some embodiments, a microorganism can be engineered by genetic modification to express a nucleic acid reagent that can decrease expression of an activity by decreasing or substantially eliminating transcription from a homologous or heterologous promoter operably linked to a nucleotide sequence of interest, in certain embodiments.
In some embodiments the activity can be altered using recombinant DNA and genetic techniques known to the artisan. Methods for engineering microorganisms are further described herein. Tables herein provide non-limiting lists of yeast promoters that are up-regulated by oxygen, yeast promoters that are down-regulated by oxygen, yeast transcriptional repressors and their associated genes, and DNA binding motifs as determined using the MEME sequence analysis software. Potential regulator binding motifs can be identified using the program MEME to search intergenic regions bound by regulators for overrepresented sequences. For each regulator, the sequences of intergenic regions bound with p-values less than 0.001 were extracted to use as input for motif discovery. The MEME software was run using the following settings: a motif width ranging from 6 to 18 bases, the "zoops" distribution model, a 6th order Markov background model and a discovery limit of 20 motifs. The discovered sequence motifs were scored for significance by two criteria: an E-value calculated by MEME and a specificity score. The motif with the best score using each metric is shown for each regulator. All motifs presented are derived from datasets generated in rich growth conditions with the exception of a previously published dataset for epitope-tagged Gal4 grown in galactose.
In some embodiments, the altered activity can be found by screening the organism under conditions that select for the desired change in activity. For example, certain microorganisms can be adapted to increase or decrease an activity by selecting or screening the organism in question on a media containing substances that are poorly metabolized or even toxic. An increase in the ability of an organism to grow on a substance that is normally poorly metabolized would result in an increase in the growth rate on that substance, for example. A decrease in the sensitivity to a toxic substance might be manifested by growth on higher concentrations of the toxic substance, for example. Genetic modifications that are identified in this manner sometimes are referred to as naturally occurring mutations or the organisms that carry them can sometimes be referred to as naturally occurring mutants. Modifications obtained in this manner are not limited to alterations in promoter sequences. That is, screening microorganisms by selective pressure, as described above, can yield genetic alterations that can occur in non-promoter sequences, and sometimes also can occur in sequences that are not in the nucleotide sequence of interest, but in a related nucleotide sequence (e.g., a gene involved in a different step of the same pathway, a transport gene, and the like). Naturally occurring mutants sometimes can be found by isolating naturally occurring variants from unique environments, in some embodiments.
In addition to the regulated promoter sequences, regulatory sequences, and coding polynucleotides provided herein, a nucleic acid reagent may include a polynucleotide sequence 70% or more identical to the foregoing (or to the complementary sequences). That is, a nucleotide sequence that is at least 70% or more, 71% or more, 72% or more, 73% or more, 74% or more, 75% or more, 76% or more, 77% or more, 78% or more, 79% or more, 80% or more, 81% or more, 82% or more, 83% or more, 84% or more, 85% or more, 86% or more, 87% or more, 88% or more, 89% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or 99% or more identical to a nucleotide sequence described herein can be utilized. The term "identical" as used herein refers to two or more nucleotide sequences having substantially the same nucleotide sequence when compared to each other. One test for determining whether two nucleotide sequences or amino acid sequences are substantially identical is to determine the percent of identical nucleotide sequences or amino acid sequences shared.
Calculations of sequence identity can be performed as follows. Sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). The length of a reference sequence aligned for comparison purposes is sometimes 30% or more, 40% or more, 50% or more, often 60% or more, and more often 70% or more, 80% or more, 90% or more, or 100% of the length of the reference sequence. The nucleotides or amino acids at corresponding nucleotide or polypeptide positions, respectively, are then compared among the two sequences. When a position in the first sequence is occupied by the same nucleotide or amino acid as the corresponding position in the second sequence, the nucleotides or amino acids are deemed to be identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, introduced for optimal alignment of the two sequences.
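The position-by-position comparison described above can be sketched in Python as follows. This is a minimal illustration, not the calculation performed by any particular alignment program: the function name `percent_identity` is hypothetical, the inputs are assumed to be already aligned (equal length, with "-" marking gaps), and the convention of counting every gapped column in the denominator is one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Assumed convention: identical, non-gap positions divided by the number
    of alignment columns, so each gap position lowers the score. Columns
    that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy alignment: 4 identical positions over 6 columns (one mismatch, one gap).
score = percent_identity("ACGT-A", "ACCTGA")
print(f"{score:.1f}% identical; meets a 70% threshold: {score >= 70.0}")
```

A threshold test such as `score >= 70.0` corresponds to the "70% or more identical" language used for sequence variants, subject to the convention chosen for gaps.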
Comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. Percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of Meyers & Miller, CABIOS 4: 11-17 (1989), which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4. Also, percent identity between two amino acid sequences can be determined using the Needleman & Wunsch, J. Mol. Biol. 48: 444-453 (1970) algorithm which has been incorporated into the GAP program in the GCG software package (available at the http address www.gcg.com), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. Percent identity between two nucleotide sequences can be determined using the GAP program in the GCG software package (available at http address www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A set of parameters often used is a BLOSUM62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
Sequence identity can also be determined by hybridization assays conducted under stringent conditions. As used herein, the term "stringent conditions" refers to conditions for hybridization and washing. Stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y., 6.3.1-6.3.6 (1989). Aqueous and nonaqueous methods are described in that reference and either can be used. An example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 50°C. Another example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 55°C. A further example of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 60°C. Often, stringent hybridization conditions are hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 65°C. More often, stringency conditions are 0.5M sodium phosphate, 7% SDS at 65°C, followed by one or more washes at 0.2X SSC, 1% SDS at 65°C.
As noted above, nucleic acid reagents may also comprise one or more 5' UTRs, and one or more 3' UTRs. A 5' UTR may comprise one or more elements endogenous to the nucleotide sequence from which it originates, and sometimes includes one or more exogenous elements. A 5' UTR can originate from any suitable nucleic acid, such as genomic DNA, plasmid DNA, RNA or mRNA, for example, from any suitable organism (e.g., virus, bacterium, yeast, fungi, plant, insect or mammal). The artisan may select appropriate elements for the 5' UTR based upon the chosen expression system (e.g., expression in a chosen organism, or expression in a cell free system, for example). A 5' UTR sometimes comprises one or more of the following elements known to the artisan: enhancer sequences (e.g., transcriptional or translational), transcription initiation site, transcription factor binding site, translation regulation site, translation initiation site, translation factor binding site, accessory protein binding site, feedback regulation agent binding sites, Pribnow box, TATA box, -35 element, E-box (helix-loop-helix binding element), ribosome binding site, replicon, internal ribosome entry site (IRES), silencer element and the like. In some embodiments, a promoter element may be isolated such that all 5' UTR elements necessary for proper conditional regulation are contained in the promoter element fragment, or within a functional subsequence of a promoter element fragment.
A 5' UTR in the nucleic acid reagent can comprise a translational enhancer nucleotide sequence. A translational enhancer nucleotide sequence often is located between the promoter and the target nucleotide sequence in a nucleic acid reagent. A translational enhancer sequence often binds to a ribosome, sometimes is an 18S rRNA-binding ribonucleotide sequence (i.e., a 40S ribosome binding sequence) and sometimes is an internal ribosome entry sequence (IRES). An IRES generally forms an RNA scaffold with precisely placed RNA tertiary structures that contact a 40S ribosomal subunit via a number of specific intermolecular interactions. Examples of ribosomal enhancer sequences are known and can be identified by the artisan (e.g., Mignone et al., Nucleic Acids Research 33: D141-D146 (2005); Paulous et al., Nucleic Acids Research 31: 722-733 (2003); Akbergenov et al., Nucleic Acids Research 32: 239-247 (2004); Mignone et al., Genome Biology 3(3): reviews0004.1-0001.10 (2002); Gallie, Nucleic Acids Research 30: 3401-3411 (2002); Shaloiko et al., http address www.interscience.wiley.com, DOI: 10.1002/bit.20267; and Gallie et al., Nucleic Acids Research 15: 3257-3273 (1987)).
A translational enhancer sequence sometimes is a eukaryotic sequence, such as a Kozak consensus sequence or other sequence (e.g., hydroid polyp sequence, GenBank accession no. U07128). A translational enhancer sequence sometimes is a prokaryotic sequence, such as a Shine-Dalgarno consensus sequence. In certain embodiments, the translational enhancer sequence is a viral nucleotide sequence. A translational enhancer sequence sometimes is from a 5' UTR of a plant virus, such as Tobacco Mosaic Virus (TMV), Alfalfa Mosaic Virus (AMV), Tobacco Etch Virus (TEV), Potato Virus Y (PVY), Turnip Mosaic (poty) Virus and Pea Seed Borne Mosaic Virus, for example. In certain embodiments, an omega sequence about 67 bases in length from TMV is included in the nucleic acid reagent as a translational enhancer sequence (e.g., devoid of guanosine nucleotides and includes a 25 nucleotide long poly (CAA) central region).
A 3' UTR may comprise one or more elements endogenous to the nucleotide sequence from which it originates and sometimes includes one or more exogenous elements. A 3' UTR may originate from any suitable nucleic acid, such as genomic DNA, plasmid DNA, RNA or mRNA, for example, from any suitable organism (e.g., a virus, bacterium, yeast, fungi, plant, insect or mammal). The artisan can select appropriate elements for the 3' UTR based upon the chosen expression system (e.g., expression in a chosen organism, for example). A 3' UTR sometimes comprises one or more of the following elements known to the artisan: transcription regulation site, transcription initiation site, transcription termination site, transcription factor binding site, translation regulation site, translation termination site, translation initiation site, translation factor binding site, ribosome binding site, replicon, enhancer element, silencer element and polyadenosine tail. A 3' UTR often includes a polyadenosine tail and sometimes does not, and if a polyadenosine tail is present, one or more adenosine moieties may be added or deleted from it (e.g., about 5, about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45 or about 50 adenosine moieties may be added or subtracted). In some embodiments, modification of a 5' UTR and/or a 3' UTR can be used to alter (e.g., increase, add, decrease or substantially eliminate) the activity of a promoter.
Alteration of the promoter activity can in turn alter the activity of a peptide, polypeptide or protein (e.g., enzyme activity for example), by a change in transcription of the nucleotide sequence(s) of interest from an operably linked promoter element comprising the modified 5' or 3' UTR. For example, a microorganism can be engineered by genetic modification to express a nucleic acid reagent comprising a modified 5' or 3' UTR that can add a novel activity (e.g., an activity not normally found in the host organism) or increase the expression of an existing activity by increasing transcription from a homologous or heterologous promoter operably linked to a nucleotide sequence of interest (e.g., homologous or heterologous nucleotide sequence of interest), in certain embodiments. In some embodiments, a microorganism can be engineered by genetic modification to express a nucleic acid reagent comprising a modified 5' or 3' UTR that can decrease the expression of an activity by decreasing or substantially eliminating transcription from a homologous or heterologous promoter operably linked to a nucleotide sequence of interest, in certain embodiments.
A nucleotide reagent sometimes can comprise a target nucleotide sequence. A "target nucleotide sequence" as used herein encodes a nucleic acid, peptide, polypeptide or protein of interest, and may be a ribonucleotide sequence or a deoxyribonucleotide sequence. A target nucleic acid sometimes can comprise a chimeric nucleic acid (or chimeric nucleotide sequence), which can encode a chimeric protein (or chimeric amino acid sequence). The term "chimeric" as used herein refers to a nucleic acid or nucleotide sequence, or encoded product thereof, containing sequences from two or more different sources. Any suitable source can be selected, including, but not limited to, a sequence from a nucleic acid, nucleotide sequence, ribosomal nucleic acid, RNA, DNA, regulatory nucleotide sequence (e.g., promoter, UTR, enhancer, repressor and the like), coding nucleic acid, gene, nucleic acid linker, nucleic acid tag, amino acid sequence, peptide, polypeptide, protein, chromosome, and organism. A chimeric molecule can include a sequence of contiguous nucleotides or amino acids from a source including, but not limited to, a virus, prokaryote, eukaryote, genus, species, homolog, ortholog, paralog and isozyme, nucleic acid linkers, nucleic acid tags, the like and combinations thereof. A chimeric molecule can be generated by placing in juxtaposition fragments of related or unrelated nucleic acids, nucleotide sequences or DNA segments, in some embodiments. In certain embodiments the nucleic acids, nucleotide sequences or DNA segments can be native or wild type sequences, mutant sequences or engineered sequences (completely engineered or engineered to a point, for example).
In some embodiments, a chimera includes about 1, 2, 3, 4 or 5 sequences (e.g., contiguous nucleotides, contiguous amino acids) from one organism and 1, 2, 3, 4 or 5 sequences (e.g., contiguous nucleotides, contiguous amino acids) from another organism. The organisms sometimes are a microbe, such as a bacterium (e.g., gram positive, gram negative), yeast or fungus (e.g., aerobic fungus, anaerobic fungus), for example. In some embodiments, the organisms are bacteria, the organisms are yeast or the organisms are fungi (e.g., different species), and sometimes one organism is a bacterium or yeast and another is a fungus. A chimeric molecule may contain up to about 99% of sequences from one organism (e.g., about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99%) and the balance percentage from one or more other organisms. In certain embodiments, a chimeric molecule includes altered codons (in the case of a chimeric nucleic acid) and one or more mutations (e.g., point mutations, nucleotide substitutions, amino acid substitutions). In some embodiments, the chimera comprises a portion of a xylose isomerase from one bacterial species and a portion of a xylose isomerase from another bacterial species. In still other embodiments, the chimera comprises a portion of a xylose isomerase from one species of fungus and another portion of a xylose isomerase from another species of fungus. In still other embodiments, the chimera comprises one portion of a xylose isomerase from a plant, and another portion of a xylose isomerase from a non-plant (such as a bacterium or fungus).
In other embodiments, the chimera comprises one portion of a xylose isomerase from a plant, another portion of a xylose isomerase from a bacterium, and yet another portion of a xylose isomerase from a fungus.
In specific embodiments, a gene encoding a xylose isomerase protein is chimeric, and includes a portion of a xylose isomerase encoding sequence from one organism (e.g., a fungus (e.g., Piromyces, Orpinomyces, Neocallimastix, Caecomyces, Ruminomyces, and the like)) and a portion of a xylose isomerase encoding sequence from another organism (e.g., bacterium (e.g., Ruminococcus, Thermotoga, Clostridium)). Sometimes a fungal sequence is located at the N-terminal portion of the encoded xylose isomerase polypeptide and the bacterial sequence is located at the C-terminal portion of the polypeptide. In some embodiments one contiguous fungal xylose isomerase sequence is about 1% to about 30% of overall sequence (e.g., about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29%) and the remaining sequence is a contiguous bacterial xylose isomerase sequence. In certain embodiments, a chimeric xylose isomerase includes one or more point mutations.
A chimera sometimes is the result of recombination between two or more nucleic acids, nucleotide sequences or genes, and sometimes is the result of genetic manipulation (e.g., designed and/or generated by the hand of a human being). Any suitable nucleic acid or nucleotide sequence and method for combining nucleic acids or nucleotide sequences can be used to generate a chimeric nucleic acid or nucleotide sequence. Non-limiting examples of nucleic acid and nucleotide sequence sources and methods for generating chimeric nucleic acids and nucleotide sequences are presented herein.
In some embodiments, fragments used to generate a chimera can be juxtaposed as units (e.g., nucleic acid from the sources are combined end to end and not interspersed). In embodiments where a chimera includes one stretch of contiguous nucleotides for each organism, nucleotide sequence combinations can be noted as DNA source 1/DNA source 2 or DNA source 1/DNA source 2/DNA source 3, the like and combinations thereof, for example. In certain embodiments, fragments used to generate a chimera can be juxtaposed such that one or more fragments from one or more sources can be interspersed with other fragments used to generate the chimera (e.g., DNA source 1/DNA source 2/DNA source 1/DNA source 3/DNA source 2/DNA source 1). In some embodiments, the nucleotide sequence length of the fragments used to generate a chimera can be in the range from about 5 base pairs to about 5,000 base pairs (e.g., about 5 base pairs (bp), about 10 bp, about 15 bp, about 20 bp, about 25 bp, about 30 bp, about 35 bp, about 40 bp, about 45 bp, about 50 bp, about 55 bp, about 60 bp, about 65 bp, about 70 bp, about 75 bp, about 80 bp, about 85 bp, about 90 bp, about 95 bp, about 100 bp, about 125 bp, about 150 bp, about 175 bp, about 200 bp, about 250 bp, about 300 bp, about 350 bp, about 400 bp, about 450 bp, about 500 bp, about 550 bp, about 600 bp, about 650 bp, about 700 bp, about 750 bp, about 800 bp, about 850 bp, about 900 bp, about 950 bp, about 1000 bp, about 1500 bp, about 2000 bp, about 2500 bp, about 3000 bp, about 3500 bp, about 4000 bp, about 4500 bp, or about 5000 bp). In some embodiments, chimeric xylose isomerase sequences are generated by first aligning the sequences of donor and recipient xylose isomerases. In certain embodiments, the alignment is performed utilizing nucleotide sequences, and in some embodiments, the alignment is performed utilizing amino acid sequences.
Aligning sequences from donors and recipients sometimes generates alignments with mismatched regions. In certain embodiments, a region of mismatch occurs in the N-terminus of the encoded polypeptides, and in some embodiments, the region of mismatch occurs in the C-terminus of the encoded polypeptides. In certain embodiments chimeric polypeptides sometimes include 5 or more, 10 or more, 15 or more, or 20 or more amino acids from an organism designated as a "donor". An organism often is designated as a donor when a minority of the final chimeric sequence (e.g., a smaller fragment or portion) is taken from the donor and combined with another sequence present in a majority of the final chimeric sequence (e.g., larger fragment or substantially the whole encoded activity). An organism sometimes is designated as a recipient when the majority of the polypeptide sequence for a chimeric enzyme is obtained from a xylose isomerase from that organism. For example, a donor may contribute between about 1% to about 49% of the amino acids in a chimeric polypeptide. In some embodiments, the number of amino acids or nucleotides in a chimeric polypeptide donated by a donor is not equal to the number of amino acids or nucleotides removed from the recipient sequence. That is, a donor fragment may replace a larger or smaller number of amino acids or nucleotides than the number of amino acids or nucleotides removed from the recipient. In some embodiments, replacing a larger or smaller number of amino acids or nucleotides in the final chimeric sequence than was removed from the recipient is performed to maintain overall alignment and/or to maintain catalytic domain spacing, and sometimes results in a chimeric molecule having a substantially similar activity, but with a different length than the recipient xylose isomerase. For example, a donor replacement sequence may be between about 1 to about 10 amino acids more or less than the sequence removed in the recipient.
In some embodiments described herein, 8 amino acids (e.g., codon triplets representative of 8 amino acids) are removed from a Ruminococcus flavefaciens xylose isomerase (e.g., amino acid sequence removed MEFFSNIG; nucleotide sequence removed atggaatttttcagcaatatcggt) and are replaced with 10 amino acids from a Piromyces xylose isomerase (e.g., amino acid sequence added MAKEYFPQIQ; nucleotide sequence added atggcataaggaatatttcccacaaattcaa).
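The N-terminal replacement described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names (translate, swap_n_terminus) and the placeholder remainder of the recipient polypeptide (a run of glycines of arbitrary length) are assumptions introduced here and are not sequences from this disclosure. The sketch translates the removed Ruminococcus nucleotide sequence with the standard genetic code and then exchanges the 8 removed residues for the 10 donor residues.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence (length must be a multiple of 3)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def swap_n_terminus(recipient, removed, donor):
    """Replace the N-terminal 'removed' residues of 'recipient' with 'donor'."""
    if not recipient.startswith(removed):
        raise ValueError("recipient does not begin with the removed sequence")
    return donor + recipient[len(removed):]


# Sequences quoted in the text above.
removed_nt = "atggaatttttcagcaatatcggt"
removed_aa = translate(removed_nt)
donor_aa = "MAKEYFPQIQ"

# Hypothetical placeholder for the rest of the recipient polypeptide.
placeholder_rest = "G" * 50
recipient = removed_aa + placeholder_rest
chimera = swap_n_terminus(recipient, removed_aa, donor_aa)

print(removed_aa)                     # amino acids encoded by the removed codons
print(len(chimera) - len(recipient))  # net change in polypeptide length
```

In this sketch the chimera is two residues longer than the recipient, which mirrors the point made above that a donor fragment need not be the same length as the fragment it replaces.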
In certain embodiments, a chimeric nucleic acid or nucleotide sequence encodes the same activity as the activity encoded by the source nucleic acids or nucleotide sequences. In some embodiments, a chimeric nucleic acid or nucleotide sequence has a similar or the same activity, but the amount of the activity, or kinetics of the activity, are altered (e.g., increased, decreased). In certain embodiments, a chimeric nucleic acid or nucleotide sequence encodes a different activity, and in some embodiments a chimeric nucleic acid or nucleotide sequence encodes a chimeric activity (e.g., a combination of two or more activities).
In certain embodiments, polynucleotide sequences described herein are codon optimized. In some embodiments, codon optimization alters the polynucleotide coding sequence for enhanced expression in a chosen host, while leaving the amino acid sequence unchanged. Codon optimization can reduce transcriptional and/or translational pausing and/or other features which may decrease the expression of a polynucleotide in a host organism. Any suitable codon optimization scheme or method may be used to optimize polynucleotide sequences. In certain embodiments, codon optimization can be performed manually using a preferred codon table for the selected host organism, and in some embodiments, codon optimization can be performed using software (e.g., using a computer or an online software package). In certain embodiments, codon optimization can be performed using commercially available software and/or algorithms offered by manufacturers of custom or made to order synthetic polynucleotides (e.g., Integrated DNA Technologies (IDT), DNA 2.0, Genscript, EnCor Biotechnology, Blue Heron, and the like). In some embodiments, codon optimization can be performed using IDT's gene synthesis services. In some codon optimization embodiments, an amino acid sequence is provided, a host organism is selected and IDT's codon optimization algorithm provides a codon optimized polynucleotide sequence based on the provided amino acid sequence and the preferred codon triplets for the selected host.
Due to rounding decisions and other heuristics included in the algorithm, a codon optimized polynucleotide sequence generated for an amino acid sequence sometimes is about 90 percent or more identical to another codon optimized polynucleotide sequence generated for the same amino acid sequence. That is, two or more codon optimized polynucleotide sequences generated for an amino acid sequence sometimes are 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, or more than 99% identical to each other. Presented in FIGS. 47A and 47B are nucleotide sequence alignments of codon optimized Ruminococcus FD-1 xylose isomerase nucleotide sequences (e.g., labeled 1, 2, 3 and 4) generated from the Ruminococcus FD-1 xylose isomerase amino acid sequence. Also presented in FIGS. 47A and 47B are the native Ruminococcus FD-1 xylose isomerase nucleotide sequence (e.g., top line labeled FD-1) and a consensus sequence (e.g., bottom line labeled consensus). The alignment shows that the four codon optimized sequences have a substantially high degree of identity.
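The relationship between manual codon optimization and the percent identity of the resulting sequences can be illustrated with a short Python sketch. The two codon tables below (TABLE_A, TABLE_B) are hypothetical and chosen only for illustration; they are not the preferred codons of any particular host and do not reproduce any vendor's algorithm. The sketch back-translates one short peptide with each table, confirms that both products encode the same amino acids, and computes their position-by-position identity.

```python
from itertools import product

# Standard genetic code, used to confirm the amino acid sequence is unchanged.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical "preferred codon" tables (illustrative only, one codon per residue).
TABLE_A = {"M": "ATG", "E": "GAA", "F": "TTT", "S": "TCT",
           "N": "AAT", "I": "ATT", "G": "GGT"}
TABLE_B = {"M": "ATG", "E": "GAA", "F": "TTC", "S": "TCT",
           "N": "AAC", "I": "ATT", "G": "GGT"}


def back_translate(protein, table):
    """Pick one codon per amino acid from the supplied table."""
    return "".join(table[aa] for aa in protein)


def translate(dna):
    """Translate a DNA coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(seq1, seq2):
    """Percent of identical positions between two equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be the same length")
    matches = sum(a == b for a, b in zip(seq1, seq2))
    return 100.0 * matches / len(seq1)


peptide = "MEFFSNIG"
opt_a = back_translate(peptide, TABLE_A)
opt_b = back_translate(peptide, TABLE_B)

print(opt_a)
print(opt_b)
print(percent_identity(opt_a, opt_b))
```

Both sequences translate to the same peptide even though they differ at several wobble positions, which is why independently optimized sequences for one amino acid sequence can be highly, but not completely, identical.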
In some embodiments, an isolated nucleic acid comprises a chimeric nucleic acid which comprises a polynucleotide that is 80% or more identical to SEQ ID NO: 179 (e.g., 80% or more identical, 81 % or more identical, 82% or more identical, 83% or more identical, 84% or more identical, 85% or more identical, 86% or more identical, 87% or more identical, 88% or more identical, 89% or more identical, 90% or more identical, 91 % or more identical, 92% or more identical, 93% or more identical, 94% or more identical, 95% or more identical, 96% or more identical, 97% or more identical, 98% or more identical, or 99% or more identical). A target nucleic acid sometimes is an untranslated ribonucleic acid and sometimes is a translated ribonucleic acid. An untranslated ribonucleic acid may include, but is not limited to, a small interfering ribonucleic acid (siRNA), a short hairpin ribonucleic acid (shRNA), other ribonucleic acid capable of RNA interference (RNAi), an antisense ribonucleic acid, or a ribozyme. A translatable target nucleotide sequence (e.g., a target ribonucleotide sequence) sometimes encodes a peptide, polypeptide or protein, which are sometimes referred to herein as "target peptides," "target polypeptides" or "target proteins."
Any peptides, polypeptides or proteins, or an activity catalyzed by one or more peptides, polypeptides or proteins may be encoded by a target nucleotide sequence and may be selected by a person of ordinary skill in the art. Representative proteins include enzymes (e.g., phosphofructokinase activity, phosphogluconate dehydratase activity, 2-keto-3-deoxygluconate-6-phosphate aldolase activity, xylose isomerase activity, phosphoenolpyruvate carboxylase activity, alcohol dehydrogenase 2 activity and thymidylate synthase activity and the like, for example), antibodies, serum proteins (e.g., albumin), membrane bound proteins, hormones (e.g., growth hormone, erythropoietin, insulin, etc.), cytokines, etc., and include both naturally occurring and exogenously expressed polypeptides. Representative activities (e.g., enzymes or combinations of enzymes which are functionally associated to provide an activity) include phosphofructokinase activity, phosphogluconate dehydratase activity, 2-keto-3-deoxygluconate-6-phosphate aldolase activity, xylose isomerase activity, phosphoenolpyruvate carboxylase activity, alcohol
dehydrogenase 2 activity and thymidylate synthase activity and the like for example. The term "enzyme" as used herein refers to a protein which can act as a catalyst to induce a chemical change in other compounds, thereby producing one or more products from one or more substrates. Specific polypeptides (e.g., enzymes) useful for embodiments described herein are listed hereafter. The term "protein" as used herein refers to a molecule having a sequence of amino acids linked by peptide bonds. This term includes fusion proteins, oligopeptides, peptides, cyclic peptides, polypeptides and polypeptide derivatives, whether native or recombinant, and also includes fragments, derivatives, homologs, and variants thereof. A protein or polypeptide sometimes is of intracellular origin (e.g., located in the nucleus, cytosol, or interstitial space of host cells in vivo) and sometimes is a cell membrane protein in vivo. In some embodiments (described above, and in further detail below in Engineering and Alteration Methods), a genetic modification can result in a modification (e.g., increase, substantially increase, decrease or substantially decrease) of a target activity. A translatable nucleotide sequence generally is located between a start codon (AUG in ribonucleic acids and ATG in deoxyribonucleic acids) and a stop codon (e.g., UAA (ochre), UAG (amber) or UGA (opal) in ribonucleic acids and TAA, TAG or TGA in deoxyribonucleic acids), and sometimes is referred to herein as an "open reading frame" (ORF). A nucleic acid reagent sometimes comprises one or more ORFs. An ORF may be from any suitable source, sometimes from genomic DNA, mRNA, reverse transcribed RNA or complementary DNA (cDNA) or a nucleic acid library comprising one or more of the foregoing, and is from any organism species that contains a nucleic acid sequence of interest, protein of interest, or activity of interest. 
Non-limiting examples of organisms from which an ORF can be obtained include bacteria, yeast, fungi, human, insect, nematode, bovine, equine, canine, feline, rat or mouse, for example.
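The definition of an open reading frame given above (a translatable sequence located between a start codon and a stop codon) can be illustrated with a minimal Python sketch. The function name find_orfs and the example sequence are assumptions introduced for illustration; the sketch scans only the forward strand of a DNA sequence and reports each ATG together with the first in-frame TAA, TAG or TGA that follows it.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna):
    """Return (start, end) index pairs for forward-strand ORFs.

    'start' is the index of the A in ATG and 'end' is one past the last
    base of the stop codon, so dna[start:end] is the complete ORF.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                # Walk codon by codon until the first in-frame stop codon.
                for j in range(i, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOP_CODONS:
                        orfs.append((i, j + 3))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return sorted(orfs)


# Hypothetical example sequence with one short ORF embedded in flanking bases.
example = "ccATGGAATTTTAAgg"
for start, end in find_orfs(example):
    print(start, end, example[start:end])
```

A fuller implementation would also scan the reverse complement and apply a minimum length, both of which are omitted here for brevity.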
A nucleic acid reagent sometimes comprises a nucleotide sequence adjacent to an ORF that is translated in conjunction with the ORF and encodes an amino acid tag. The tag-encoding nucleotide sequence is located 3' and/or 5' of an ORF in the nucleic acid reagent, thereby encoding a tag at the C-terminus or N-terminus of the protein or peptide encoded by the ORF. Any tag that does not abrogate in vitro transcription and/or translation may be utilized and may be appropriately selected by the artisan. Tags may facilitate isolation and/or purification of the desired ORF product from culture or fermentation media.
A tag sometimes specifically binds a molecule or moiety of a solid phase or a detectable label, for example, thereby having utility for isolating, purifying and/or detecting a protein or peptide encoded by the ORF. In some embodiments, a tag comprises one or more of the following elements: FLAG (e.g., DYKDDDDKG), V5 (e.g., GKPIPNPLLGLDST), c-MYC (e.g., EQKLISEEDL), HSV (e.g., QPELAPEDPED), influenza hemagglutinin, HA (e.g., YPYDVPDYA), VSV-G (e.g., YTDIEMNRLGK), bacterial glutathione-S-transferase, maltose binding protein, a streptavidin- or avidin-binding tag (e.g., pcDNA™6 BioEase™ Gateway® Biotinylation System (Invitrogen)), thioredoxin, β-galactosidase, VSV-glycoprotein, a fluorescent protein (e.g., green fluorescent protein or one of its many color variants (e.g., yellow, red, blue)), a polylysine or polyarginine sequence, a polyhistidine sequence (e.g., His6) or other sequence that chelates a metal (e.g., cobalt, zinc, copper), and/or a cysteine-rich sequence that binds to an arsenic-containing molecule. In certain embodiments, a cysteine-rich tag comprises the amino acid sequence CC-Xn-CC, wherein X is any amino acid and n is 1 to 3, and the cysteine-rich sequence sometimes is CCPGCC. In certain embodiments, the tag comprises a cysteine-rich element and a polyhistidine element (e.g., CCPGCC and His6). A tag often conveniently binds to a binding partner. For example, some tags bind to an antibody (e.g., FLAG) and sometimes specifically bind to a small molecule. For example, a polyhistidine tag specifically chelates a bivalent metal, such as copper, zinc and cobalt; a polylysine or polyarginine tag specifically binds to a zinc finger; a glutathione S-transferase tag binds to glutathione; and a cysteine-rich tag specifically binds to an arsenic-containing molecule.
Arsenic-containing molecules include LUMIO™ agents (Invitrogen, California), such as FlAsH™ (EDT2[4',5'-bis(1,3,2-dithioarsolan-2-yl)fluorescein-(1,2-ethanedithiol)2]) and ReAsH reagents (e.g., U.S. Patent 5,932,474 to Tsien et al., entitled "Target Sequences for Synthetic Molecules;" U.S. Patent 6,054,271 to Tsien et al., entitled "Methods of Using Synthetic Molecules and Target Sequences;" U.S. Patents 6,451,569 and 6,008,378; published U.S. Patent Application 2003/0083373, and published PCT Patent Application WO 99/21013, all to Tsien et al. and all entitled "Synthetic Molecules that Specifically React with Target Sequences"). Such antibodies and small molecules sometimes are linked to a solid phase for convenient isolation of the target protein or target peptide.
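The cysteine-rich motif CC-Xn-CC (X being any amino acid, n being 1 to 3) and a His6 element lend themselves to a simple pattern search. The Python sketch below is illustrative only: the function name find_tag_elements and the example polypeptide are hypothetical, and the patterns cover just these two tag elements rather than every tag listed above.

```python
import re

# One-letter codes for the 20 standard amino acids.
ANY_AA = "[ACDEFGHIKLMNPQRSTVWY]"

TAG_PATTERNS = {
    # CC-Xn-CC with n = 1 to 3 (CCPGCC is the n = 2 case named in the text).
    "cysteine-rich": re.compile("CC" + ANY_AA + "{1,3}CC"),
    # Six consecutive histidines (His6).
    "polyhistidine": re.compile("H{6}"),
}


def find_tag_elements(protein):
    """Return {element name: (start index, matched text)} for elements found."""
    found = {}
    for name, pattern in TAG_PATTERNS.items():
        match = pattern.search(protein.upper())
        if match:
            found[name] = (match.start(), match.group())
    return found


# Hypothetical tagged polypeptide: cysteine-rich element, short linker,
# His6 element, then the first residues of a target protein.
example = "MCCPGCCGSHHHHHHGSMEFFSNIG"
print(find_tag_elements(example))
```

The same approach extends to the other short epitope tags by adding their literal sequences to the pattern table.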
A tag sometimes comprises a sequence that localizes a translated protein or peptide to a component in a system, which is referred to as a "signal sequence" or "localization signal sequence" herein. A signal sequence often is incorporated at the N-terminus of a target protein or target peptide, and sometimes is incorporated at the C-terminus. Examples of signal sequences are known to the artisan, are readily incorporated into a nucleic acid reagent, and often are selected according to the organism in which expression of the nucleic acid reagent is performed. A signal sequence in some embodiments localizes a translated protein or peptide to a cell membrane. Examples of signal sequences include, but are not limited to, a nucleus targeting signal (e.g., steroid receptor sequence and N-terminal sequence of SV40 virus large T antigen); mitochondrial targeting signal (e.g., amino acid sequence that forms an amphipathic helix); peroxisome targeting signal (e.g., C-terminal sequence in YFG from S. cerevisiae); and a secretion signal (e.g., N-terminal sequences from invertase, mating factor alpha, PHO5 and SUC2 in S. cerevisiae; multiple N-terminal sequences of B. subtilis proteins (e.g., Tjalsma et al., Microbiol. Molec. Biol. Rev. 64: 515-547 (2000)); alpha amylase signal sequence (e.g., U.S. Patent No. 6,288,302); pectate lyase signal sequence (e.g., U.S. Patent No. 5,846,818); precollagen signal sequence (e.g., U.S. Patent No. 5,712,114); OmpA signal sequence (e.g., U.S. Patent No. 5,470,719); lam beta signal sequence (e.g., U.S. Patent No. 5,389,529); B. brevis signal sequence (e.g., U.S. Patent No. 5,232,841); and P. pastoris signal sequence (e.g., U.S. Patent No. 5,268,273)).
A tag sometimes is directly adjacent to the amino acid sequence encoded by an ORF (i.e., there is no intervening sequence) and sometimes a tag is substantially adjacent to an ORF encoded amino acid sequence (e.g., an intervening sequence is present). An intervening sequence sometimes includes a recognition site for a protease, which is useful for cleaving a tag from a target protein or peptide. In some embodiments, the intervening sequence is cleaved by Factor Xa (e.g., recognition site I(E/D)GR), thrombin (e.g., recognition site LVPRGS), enterokinase (e.g., recognition site DDDDK), TEV protease (e.g., recognition site ENLYFQG) or PreScission™ protease (e.g., recognition site LEVLFQGP), for example. An intervening sequence sometimes is referred to herein as a "linker sequence," and may be of any suitable length selected by the artisan. A linker sequence sometimes is about 1 to about 20 amino acids in length, and sometimes about 5 to about 10 amino acids in length. The artisan may select the linker length to substantially preserve target protein or peptide function (e.g., a tag may reduce target protein or peptide function unless separated by a linker), to enhance disassociation of a tag from a target protein or peptide when a protease cleavage site is present (e.g., cleavage may be enhanced when a linker is present), and to enhance interaction of a tag/target protein product with a solid phase. A linker can be of any suitable amino acid content, and often comprises a higher proportion of amino acids having relatively short side chains (e.g., glycine, alanine, serine and threonine).
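The protease recognition sites listed above can be located in a tag/linker/target fusion with a short Python sketch. The site patterns are taken from the text; the function name find_protease_sites and the example fusion polypeptide are hypothetical and are introduced only for illustration. The sketch reports where each recognition site begins and does not attempt to model the exact position of cleavage.

```python
import re

# Recognition sites as listed in the text; (E/D) is written as a character class.
PROTEASE_SITES = {
    "Factor Xa": "I[ED]GR",
    "thrombin": "LVPRGS",
    "enterokinase": "DDDDK",
    "TEV protease": "ENLYFQG",
    "PreScission protease": "LEVLFQGP",
}


def find_protease_sites(protein):
    """Return a sorted list of (start index, protease name) for each site found."""
    hits = []
    for name, pattern in PROTEASE_SITES.items():
        for match in re.finditer(pattern, protein.upper()):
            hits.append((match.start(), name))
    return sorted(hits)


# Hypothetical fusion: His6 tag, two-residue linker, TEV site, target residues.
fusion = "HHHHHH" + "GS" + "ENLYFQG" + "S" + "MEFFSNIG"
print(find_protease_sites(fusion))
```

A check of this kind may also be run against the target protein alone, since an internal recognition site would be expected to be cut along with the linker.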
A nucleic acid reagent sometimes includes a stop codon between a tag element and an insertion element or ORF, which can be useful for translating an ORF with or without the tag. Mutant tRNA molecules that recognize stop codons (described above) suppress translation termination and thereby are designated "suppressor tRNAs." Suppressor tRNAs can result in the insertion of amino acids and continuation of translation past stop codons (e.g., U.S. Patent Application No. 60/587,583, filed July 1, 2004, entitled "Production of Fusion Proteins by Cell-Free Protein Synthesis,"; Eggertsson, et al., (1988) Microbiological Reviews 52(3):354-374, and Engelberg-Kulka, et al. (1996) in Escherichia coli and Salmonella Cellular and Molecular Biology, Chapter 60, pp. 909-921, Neidhardt, et al. eds., ASM Press, Washington, DC). A number of suppressor tRNAs are known, including but not limited to, supE, supP, supD, supF and supZ suppressors, which suppress the termination of translation of the amber stop codon; supB, glT, supL, supN, supC and supM suppressors, which suppress the function of the ochre stop codon; and glyT, trpT and Su-9 suppressors, which suppress the function of the opal stop codon. In general, suppressor tRNAs contain one or more mutations in the anti-codon loop of the tRNA that allow the tRNA to base pair with a codon that ordinarily functions as a stop codon. The mutant tRNA is charged with its cognate amino acid residue and the cognate amino acid residue is inserted into the translating polypeptide when the stop codon is encountered. Mutations that enhance the efficiency of termination suppressors (i.e., increase stop codon read-through) have been identified. These include, but are not limited to, mutations in the uar gene (also known as the prfA gene), mutations in the ups gene, mutations in the sueA, sueB and sueC genes, mutations in the rpsD (ramA) and rpsE (spcA) genes and mutations in the rplL gene.
Thus, a nucleic acid reagent comprising a stop codon located between an ORF and a tag can yield a translated ORF alone when no suppressor tRNA is present in the translation system, and can yield a translated ORF-tag fusion when a suppressor tRNA is present in the system. Suppressor tRNA can be generated in cells transfected with a nucleic acid encoding the tRNA (e.g., a replication incompetent adenovirus containing the human tRNA-Ser suppressor gene can be transfected into cells, or a YAC containing a yeast or bacterial tRNA suppressor gene can be transfected into yeast cells, for example). Vectors for synthesizing suppressor tRNA and for translating ORFs with or without a tag are available to the artisan (e.g., Tag-On-Demand™ kit (Invitrogen Corporation, California); Tag-On-Demand™ Suppressor Supernatant Instruction Manual, Version B, 6 June 2003, at http address www.invitrogen.com/content/sfs/manuals/tagondemand_supernatant_man.pdf; Tag-On-Demand™ Gateway® Vector Instruction Manual, Version B, 20 June, 2003 at http address www.invitrogen.com/content/sfs/manuals/tagondemand_vectors_man.pdf; and Capone et al., Amber, ochre and opal suppressor tRNA genes derived from a human serine tRNA gene. EMBO J. 4:213, 1985).
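The effect of a stop codon placed between an ORF and a tag can be modeled with a short Python sketch. The construct below (a three-codon ORF, an amber codon, a His6 tag and an ochre terminator) is hypothetical, as is the function name translate_construct. The sketch assumes, for illustration, an amber suppressor tRNA that inserts serine, in keeping with the tRNA-Ser suppressor mentioned above; it does not model suppression efficiency.

```python
from itertools import product

# Standard genetic code; "*" marks the three stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_construct(dna, suppressors=None):
    """Translate until the first unsuppressed stop codon.

    'suppressors' maps a stop codon (e.g., "TAG") to the amino acid that a
    suppressor tRNA inserts at that codon; an empty mapping means no
    suppressor tRNA is present in the system.
    """
    suppressors = suppressors or {}
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3].upper()
        residue = CODON_TABLE[codon]
        if residue == "*":
            if codon in suppressors:
                protein.append(suppressors[codon])  # read through the stop
                continue
            break  # normal termination
        protein.append(residue)
    return "".join(protein)


# Hypothetical construct: ORF (MEF) + amber stop + His6 tag + ochre stop.
construct = "ATGGAATTT" + "TAG" + "CATCACCATCACCATCAC" + "TAA"

print(translate_construct(construct))                # no suppressor tRNA
print(translate_construct(construct, {"TAG": "S"}))  # amber suppressor present
```

Without a suppressor the sketch yields the ORF product alone, and with the amber suppressor it yields the ORF-tag fusion, which terminates at the unsuppressed ochre codon.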
Any convenient cloning strategy known in the art may be utilized to incorporate an element, such as an ORF, into a nucleic acid reagent. Known methods can be utilized to insert an element into the template independent of an insertion element, such as (1) cleaving the template at one or more existing restriction enzyme sites and ligating an element of interest and (2) adding restriction enzyme sites to the template by hybridizing oligonucleotide primers that include one or more suitable restriction enzyme sites and amplifying by polymerase chain reaction (described in greater detail herein). Other cloning strategies take advantage of one or more insertion sites present or inserted into the nucleic acid reagent, such as an oligonucleotide primer hybridization site for PCR, for example, and others described hereafter. In some embodiments, a cloning strategy can be combined with genetic manipulation such as recombination (e.g., recombination of a nucleic acid reagent with a nucleic acid sequence of interest into the genome of the organism to be modified, as described further below). In some embodiments, the cloned ORF(s) can produce (directly or indirectly) a desired product, by engineering a microorganism with one or more ORFs of interest, which microorganism comprises one or more altered activities selected from the group consisting of phosphofructokinase activity, phosphogluconate dehydratase activity, 2-keto-3-deoxygluconate-6-phosphate aldolase activity, xylose isomerase activity, phosphoenolpyruvate carboxylase activity, alcohol dehydrogenase 2 activity, sugar transport activity, phosphoglucoisomerase activity, transaldolase activity, transketolase activity, glucose-6-phosphate dehydrogenase activity, 6-phosphogluconolactonase activity, 6-phosphogluconate dehydrogenase (decarboxylating) activity, xylose reductase activity, xylitol dehydrogenase activity, xylulokinase activity and thymidylate synthase activity.
In some embodiments, the nucleic acid reagent includes one or more recombinase insertion sites. A recombinase insertion site is a recognition sequence on a nucleic acid molecule that participates in an integration/recombination reaction by recombination proteins. For example, the recombination site for Cre recombinase is loxP, which is a 34 base pair sequence comprised of two 13 base pair inverted repeats (serving as the recombinase binding sites) flanking an 8 base pair core sequence (e.g., Figure 1 of Sauer, B., Curr. Opin. Biotech. 5:521-527 (1994)). Other examples of recombination sites include attB, attP, attL, and attR sequences, and mutants, fragments, variants and derivatives thereof, which are recognized by the recombination protein λ Int and by the auxiliary proteins integration host factor (IHF), FIS and excisionase (Xis) (e.g., U.S. Patent Nos. 5,888,732; 6,143,557; 6,171,861; 6,270,969; 6,277,608; and 6,720,140; U.S. Patent Appln. Nos. 09/517,466, filed March 2, 2000, and 09/732,914, filed August 14, 2003, and in U.S. patent publication no. 2002-0007051-A1; Landy, Curr. Opin. Biotech. 3:699-707 (1993)). Examples of recombinase cloning nucleic acids are in Gateway® systems (Invitrogen, California), which include at least one recombination site for cloning desired nucleic acid molecules in vivo or in vitro. In some embodiments, the system utilizes vectors that contain at least two different site-specific recombination sites, often based on the bacteriophage lambda system (e.g., att1 and att2), and are mutated from the wild-type (att0) sites. Each mutated site has a unique specificity for its cognate partner att site (i.e., its binding partner recombination site) of the same type (for example attB1 with attP1, or attL1 with attR1) and will not cross-react with recombination sites of the other mutant type or with the wild-type att0 site. Different site specificities allow directional cloning or linkage of desired molecules thus providing desired orientation of the cloned molecules.
Nucleic acid fragments flanked by recombination sites are cloned and subcloned using the Gateway® system by replacing a selectable marker (for example, ccdB) flanked by att sites on the recipient plasmid molecule, sometimes termed the Destination Vector. Desired clones are then selected by transformation of a ccdB sensitive host strain and positive selection for a marker on the recipient molecule. Similar strategies for negative selection (e.g., use of toxic genes) can be used in other organisms such as thymidine kinase (TK) in mammals and insects.
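The loxP architecture described above (two 13 base pair inverted repeats flanking an 8 base pair core, 34 base pairs in total) can be checked with a few lines of Python. The sequence used below is the commonly cited wild-type loxP sequence, which is not reproduced in this disclosure and is included here as an assumption for illustration; the helper name reverse_complement is likewise introduced only for this sketch.

```python
# Commonly cited wild-type loxP site (assumed here; not taken from this text).
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


# Split the site into its three parts: 13 bp repeat, 8 bp core, 13 bp repeat.
left_repeat = LOXP[:13]
core = LOXP[13:21]
right_repeat = LOXP[21:]

print(len(LOXP), len(left_repeat), len(core), len(right_repeat))
# The flanking repeats are inverted copies of one another.
print(reverse_complement(left_repeat) == right_repeat)
# The core is not its own reverse complement, which gives the site a direction.
print(reverse_complement(core) == core)
```

The asymmetric core is what gives a loxP site an orientation, and the relative orientation of two sites determines the outcome of recombination between them.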
A recombination system useful for engineering yeast is outlined briefly. The system makes use of the ura3 gene (e.g., for S. cerevisiae and C. albicans, for example) or ura4 and ura5 genes (e.g., for S. pombe, for example) and toxicity of the nucleotide analogue 5-fluoroorotic acid (5-FOA). The ura3 or ura4 and ura5 genes encode orotidine-5'-monophosphate (OMP) decarboxylase. Yeast with an active ura3 or ura4 and ura5 gene (phenotypically Ura+) convert 5-FOA to fluorodeoxyuridine, which is toxic to yeast cells. Yeast carrying a mutation in the appropriate gene(s) or having a knock out of the appropriate gene(s) can grow in the presence of 5-FOA, if the media is also supplemented with uracil.
A nucleic acid engineering construct can be made which may comprise the URA3 gene or cassette (for S. cerevisiae), flanked on either side by the same nucleotide sequence in the same orientation. The ura3 cassette comprises a promoter, the ura3 gene and a functional transcription terminator. Target sequences which direct the construct to a particular nucleic acid region of interest in the organism to be engineered are added such that the target sequences are adjacent to and abut the flanking sequences on either side of the ura3 cassette. Yeast can be transformed with the engineering construct and plated on minimal media without uracil. Colonies can be screened by PCR to determine those transformants that have the engineering construct inserted in the proper location in the genome. Checking insertion location prior to selecting for recombination of the ura3 cassette may reduce the number of incorrect clones carried through to later stages of the procedure. Correctly inserted transformants can then be replica plated on minimal media containing 5-FOA to select for recombination of the ura3 cassette out of the construct, leaving a disrupted gene and an identifiable footprint (e.g., nucleic acid sequence) that can be used to verify the presence of the disrupted gene. The technique described is useful for disrupting or "knocking out" gene function, but also can be used to insert genes or constructs into a host organism's genome in a targeted, sequence specific manner. Further detail will be described below in the engineering section and in the example section.
In certain embodiments, a nucleic acid reagent includes one or more topoisomerase insertion sites. A topoisomerase insertion site is a defined nucleotide sequence recognized and bound by a site-specific topoisomerase. For example, the nucleotide sequence 5'-(C/T)CCTT-3' is a topoisomerase recognition site bound specifically by most poxvirus topoisomerases, including vaccinia virus DNA topoisomerase I. After binding to the recognition sequence, the topoisomerase cleaves the strand at the 3'-most thymidine of the recognition site to produce a nucleotide sequence comprising 5'-(C/T)CCTT-PO4-TOPO, a complex of the topoisomerase covalently bound to the 3' phosphate via a tyrosine in the topoisomerase (e.g., Shuman, J. Biol. Chem. 266:11372-11379, 1991; Sekiguchi and Shuman, Nucl. Acids Res. 22:5360-5365, 1994; U.S. Pat. No. 5,766,891; PCT/US95/16099; and PCT/US98/12372). In comparison, the nucleotide sequence 5'-GCAACTT-3' is a topoisomerase recognition site for type IA E. coli topoisomerase III. An element to be inserted often is combined with topoisomerase-reacted template and thereby incorporated into the nucleic acid reagent (e.g., http address www.invitrogen.com/downloads/F-13512_Topo_Flyer.pdf; http address at world wide web uniform resource locator invitrogen.com/content/sfs/brochures/710_021849%20_B_TOPOCIoning_bro.pdf; TOPO TA Cloning® Kit and Zero Blunt® TOPO® Cloning Kit product information).
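The 5'-(C/T)CCTT-3' recognition sequence described above can be located in a nucleotide sequence with a simple pattern search. In the Python sketch below, the function name find_topo_sites and the example sequence are hypothetical. The sketch searches only the strand supplied and reports, for each site, the index immediately after the 3'-most thymidine, which is the position at which the text describes strand cleavage.

```python
import re

# 5'-(C/T)CCTT-3': the first position may be C or T.
TOPO_SITE = re.compile("[CT]CCTT")


def find_topo_sites(dna):
    """Return (site start, cleavage index) pairs on the supplied strand.

    The cleavage index is the position just after the 3'-most T of the
    recognition site, so dna[:cleavage index] ends in ...(C/T)CCTT.
    """
    return [(m.start(), m.end()) for m in TOPO_SITE.finditer(dna.upper())]


# Hypothetical sequence containing one CCCTT site and one TCCTT site.
example = "GGCCCTTAATTCCTTGG"
for start, cut in find_topo_sites(example):
    print(start, cut, example[:cut])
```

To cover both strands of a double-stranded template, the same search may be repeated on the reverse complement of the sequence.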
A nucleic acid reagent sometimes contains one or more origin of replication (ORI) elements. In some embodiments, a template comprises two or more ORIs, where one functions efficiently in one organism (e.g., a bacterium) and another functions efficiently in another organism (e.g., a eukaryote, like yeast for example). In some embodiments, an ORI may function efficiently in one species (e.g., S. cerevisiae, for example) and another ORI may function efficiently in a different species (e.g., S. pombe, for example). A nucleic acid reagent also sometimes includes one or more transcription regulation sites.
A nucleic acid reagent can include one or more selection elements (e.g., elements for selection of the presence of the nucleic acid reagent, and not for activation of a promoter element which can be selectively regulated). Selection elements often are utilized using known processes to determine whether a nucleic acid reagent is included in a cell. In some embodiments, a nucleic acid reagent includes two or more selection elements, where one functions efficiently in one organism and another functions efficiently in another organism. Examples of selection elements include, but are not limited to, (1) nucleic acid segments that encode products that provide resistance against otherwise toxic compounds (e.g., antibiotics); (2) nucleic acid segments that encode products that are otherwise lacking in the recipient cell (e.g., essential products, tRNA genes, auxotrophic markers); (3) nucleic acid segments that encode products that suppress the activity of a gene product; (4) nucleic acid segments that encode products that can be readily identified (e.g., phenotypic markers such as antibiotics (e.g., β-lactamase), β-galactosidase, green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP), cyan fluorescent protein (CFP), and cell surface proteins); (5) nucleic acid segments that bind products that are otherwise detrimental to cell survival and/or function; (6) nucleic acid segments that otherwise inhibit the activity of any of the nucleic acid segments described in Nos. 1-5 above (e.g., antisense oligonucleotides); (7) nucleic acid segments that bind products that modify a substrate (e.g., restriction endonucleases); (8) nucleic acid segments that can be used to isolate or identify a desired molecule (e.g., specific protein binding sites); (9) nucleic acid segments that encode a specific nucleotide sequence that can be otherwise non-functional (e.g., for PCR amplification of subpopulations of molecules); (10) nucleic acid segments that, when absent, directly or indirectly confer resistance or sensitivity to particular compounds; (11) nucleic acid segments that encode products that either are toxic or convert a relatively non-toxic compound to a toxic compound (e.g., Herpes simplex thymidine kinase, cytosine deaminase) in recipient cells; (12) nucleic acid segments that inhibit replication, partition or heritability of nucleic acid molecules that contain them; and/or (13) nucleic acid segments that encode conditional replication functions, e.g., replication in certain hosts or host cell strains or under certain environmental conditions (e.g., temperature, nutritional conditions, and the like).
A nucleic acid reagent is of any form useful for in vivo transcription and/or translation. A nucleic acid sometimes is a plasmid, such as a supercoiled plasmid, sometimes is a yeast artificial chromosome (e.g., YAC), sometimes is a linear nucleic acid (e.g., a linear nucleic acid produced by PCR or by restriction digest), sometimes is single-stranded and sometimes is double-stranded. A nucleic acid reagent sometimes is prepared by an amplification process, such as a polymerase chain reaction (PCR) process or transcription-mediated amplification process (TMA). In TMA, two enzymes are used in an isothermal reaction to produce amplification products detected by light emission (see, e.g., Biochemistry 1996 Jun 25;35(25):8429-38 and http address world wide web uniform resource locator devicelink.com/ivdt/archive/00/11/007.html). Standard PCR processes are known (e.g., U.S. Patent Nos. 4,683,202; 4,683,195; 4,965,188; and 5,656,493), and generally are performed in cycles. Each cycle includes heat denaturation, in which hybrid nucleic acids dissociate; cooling, in which primer oligonucleotides hybridize; and extension of the oligonucleotides by a polymerase (e.g., Taq polymerase). An example of a PCR cyclical process is treating the sample at 95°C for 5 minutes; repeating forty-five cycles of 95°C for 1 minute, 59°C for 1 minute, 10 seconds, and 72°C for 1 minute 30 seconds; and then treating the sample at 72°C for 5 minutes. Multiple cycles frequently are performed using a commercially available thermal cycler. PCR amplification products sometimes are stored for a time at a lower temperature (e.g., at 4°C) and sometimes are frozen (e.g., at -20°C) before analysis.
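The example cycling program above can be written down as data, which makes it straightforward to check the total programmed hold time. The following is a minimal sketch only: the names `Step`, `INITIAL`, `CYCLE`, `FINAL` and `total_seconds` are illustrative assumptions, and ramp times between temperatures (which depend on the thermal cycler) are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One temperature hold in a thermal cycling program."""
    temp_c: float   # hold temperature in degrees Celsius
    seconds: int    # hold duration in seconds


# Example program from the text: initial hold, 45 three-step cycles, final hold.
INITIAL = Step(95, 5 * 60)
CYCLE = (
    Step(95, 60),   # denaturation, 1 minute
    Step(59, 70),   # primer annealing, 1 minute 10 seconds
    Step(72, 90),   # extension, 1 minute 30 seconds
)
N_CYCLES = 45
FINAL = Step(72, 5 * 60)


def total_seconds(initial, cycle, n_cycles, final):
    """Sum of all programmed hold times, excluding ramp times."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


if __name__ == "__main__":
    total = total_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
    print(f"{total} s ({total / 60:.0f} min) of programmed hold time")
```

Under these assumptions each cycle holds for 220 seconds, so the forty-five cycles account for most of the run time.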
In some embodiments, a nucleic acid reagent, protein reagent, protein fragment reagent or other reagent described herein is isolated or purified. The term "isolated" as used herein refers to material removed from its original environment (e.g., the natural environment if it is naturally occurring, or a host cell if expressed exogenously), and thus is altered "by the hand of man" from its original environment. The term "purified" as used herein with reference to molecules does not refer to absolute purity. Rather, "purified" refers to a substance in a composition that contains fewer substance species in the same class (e.g., nucleic acid or protein species) other than the substance of interest in comparison to the sample from which it originated. "Purified," if a nucleic acid or protein for example, refers to a substance in a composition that contains fewer nucleic acid species or protein species other than the nucleic acid or protein of interest in comparison to the sample from which it originated. Sometimes, a protein or nucleic acid is "substantially pure," indicating that the protein or nucleic acid represents at least 50% of protein or nucleic acid on a mass basis of the composition. Often, a substantially pure protein or nucleic acid is at least 75% on a mass basis of the composition, and sometimes at least 95% on a mass basis of the composition.
Engineering and Alteration Methods
Methods and compositions (e.g., nucleic acid reagents) described herein can be used to generate engineered microorganisms. As noted above, the term "engineered microorganism" as used herein refers to a modified organism that includes one or more activities distinct from an activity present in a microorganism utilized as a starting point for modification (e.g., host microorganism or unmodified organism). Engineered microorganisms typically arise as a result of a genetic modification, usually introduced or selected for, by one of skill in the art using readily available techniques. Non-limiting examples of methods useful for generating an altered activity include, introducing a heterologous polynucleotide (e.g., nucleic acid or gene integration, also referred to as "knock in"), removing an endogenous polynucleotide, altering the sequence of an existing endogenous nucleic acid sequence (e.g., site-directed mutagenesis), disruption of an existing endogenous nucleic acid sequence (e.g., knock outs and transposon or insertion element mediated mutagenesis), selection for an altered activity where the selection causes a change in a naturally occurring activity that can be stably inherited (e.g., causes a change in a nucleic acid sequence in the genome of the organism or in an epigenetic nucleic acid that is replicated and passed on to daughter cells), PCR-based mutagenesis, and the like. The term "mutagenesis" as used herein refers to any modification to a nucleic acid (e.g., nucleic acid reagent, or host chromosome, for example) that is subsequently used to generate a product in a host or modified organism. Non-limiting examples of mutagenesis include, deletion, insertion, substitution, rearrangement, point mutations, suppressor mutations and the like. Mutagenesis methods are known in the art and are readily available to the artisan. Non-limiting examples of mutagenesis methods are described herein and can also be found in Maniatis, T., E. F. Fritsch and J. Sambrook (1982) Molecular Cloning: a Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
The term "genetic modification" as used herein refers to any suitable nucleic acid addition, removal or alteration that facilitates production of a target product (e.g., phosphogluconate dehydratase activity, 2-keto-3-deoxygluconate-6-phosphate aldolase activity, xylose isomerase activity, or phosphoenolpyruvate carboxylase activity, for example), in an engineered microorganism.
Genetic modifications include, without limitation, insertion of one or more nucleotides in a native nucleic acid of a host organism in one or more locations, deletion of one or more nucleotides in a native nucleic acid of a host organism in one or more locations, modification or substitution of one or more nucleotides in a native nucleic acid of a host organism in one or more locations, insertion of a non-native nucleic acid into a host organism (e.g., insertion of an autonomously replicating vector), and removal of a non-native nucleic acid in a host organism (e.g., removal of a vector).
The term "heterologous polynucleotide" as used herein refers to a nucleotide sequence not present in a host microorganism in some embodiments. In certain embodiments, a heterologous polynucleotide is present in a different amount (e.g., different copy number) than in a host microorganism, which can be accomplished, for example, by introducing more copies of a particular nucleotide sequence to a host microorganism (e.g., the particular nucleotide sequence may be in a nucleic acid autonomous of the host chromosome or may be inserted into a chromosome). A heterologous polynucleotide is from a different organism in some embodiments, and in certain embodiments, is from the same type of organism but from an outside source (e.g., a recombinant source). The term "altered activity" as used herein refers to an activity in an engineered microorganism that is added or modified relative to the host microorganism (e.g., added, increased, reduced, inhibited or removed activity). An activity can be altered by introducing a genetic modification to a host microorganism that yields an engineered microorganism having added, increased, reduced, inhibited or removed activity.
An added activity often is an activity not detectable in a host microorganism. An increased activity generally is an activity detectable in a host microorganism that has been increased in an engineered microorganism. An activity can be increased to any suitable level for production of a target product (e.g., ethanol), including but not limited to less than 2-fold (e.g., about 10% increase to about 99% increase; about 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% increase), 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or 10-fold increase, or greater than about 10-fold increase. A reduced or inhibited activity generally is an activity detectable in a host microorganism that has been reduced or inhibited in an engineered microorganism. An activity can be reduced to undetectable levels in some embodiments, or detectable levels in certain embodiments. An activity can be decreased to any suitable level for production of a target product (e.g., ethanol), including but not limited to less than 2-fold (e.g., about 10% decrease to about 99% decrease; about 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% decrease), 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or 10-fold decrease, or greater than about 10-fold decrease.
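An increase in activity can be stated either as a fold change or as a percentage, and the two are related by simple arithmetic (a 2-fold increase corresponds to a 100% increase). The sketch below assumes that host and engineered activities are measured in the same arbitrary units under the same assay conditions; the function names are illustrative and do not come from the text.

```python
def fold_increase(host_activity, engineered_activity):
    """Ratio of engineered to host activity (2.0 means a 2-fold increase)."""
    if host_activity <= 0:
        # An activity that is undetectable in the host is an "added" activity,
        # for which a fold change is not defined.
        raise ValueError("host activity must be positive to compute a fold change")
    return engineered_activity / host_activity


def percent_increase(host_activity, engineered_activity):
    """Percentage increase of engineered over host activity."""
    return (fold_increase(host_activity, engineered_activity) - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical assay readings in arbitrary units.
    print(fold_increase(10.0, 30.0))     # fold change relative to host
    print(percent_increase(10.0, 15.0))  # percentage change relative to host
```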
An altered activity sometimes is an activity not detectable in a host organism and is added to an engineered organism. An altered activity also may be an activity detectable in a host organism and is increased in an engineered organism. An activity may be added or increased by increasing the number of copies of a polynucleotide that encodes a polypeptide having a target activity, in some embodiments. In certain embodiments an activity can be added or increased by inserting into a host microorganism a heterologous polynucleotide that encodes a polypeptide having the added activity. In certain embodiments, an activity can be added or increased by inserting into a host microorganism a heterologous polynucleotide that is (i) operably linked to another polynucleotide that encodes a polypeptide having the added activity, and (ii) up regulates production of the polynucleotide. Thus, an activity can be added or increased by inserting or modifying a regulatory polynucleotide operably linked to another polynucleotide that encodes a polypeptide having the target activity. In certain embodiments, an activity can be added or increased by subjecting a host microorganism to a selective environment and screening for microorganisms that have a detectable level of the target activity. Examples of a selective environment include, without limitation, a medium containing a substrate that a host organism can process and a medium lacking a substrate that a host organism can process.
An altered activity sometimes is an activity detectable in a host organism and is reduced, inhibited or removed (i.e., not detectable) in an engineered organism. An activity may be reduced or removed by decreasing the number of copies of a polynucleotide that encodes a polypeptide having a target activity, in some embodiments. In some embodiments, an activity can be reduced or removed by (i) inserting a polynucleotide within a polynucleotide that encodes a polypeptide having the target activity (disruptive insertion), and/or (ii) removing a portion of or all of a polynucleotide that encodes a polypeptide having the target activity (deletion or knock out, respectively). In certain embodiments, an activity can be reduced or removed by inserting into a host microorganism a heterologous polynucleotide that is (i) operably linked to another polynucleotide that encodes a polypeptide having the target activity, and (ii) down regulates production of the polynucleotide. Thus, an activity can be reduced or removed by inserting or modifying a regulatory polynucleotide operably linked to another polynucleotide that encodes a polypeptide having the target activity.
An activity also can be reduced or removed by (i) inhibiting a polynucleotide that encodes a polypeptide having the activity or (ii) inhibiting a polynucleotide operably linked to another polynucleotide that encodes a polypeptide having the activity. A polynucleotide can be inhibited by a suitable technique known in the art, such as by contacting an RNA encoded by the polynucleotide with a specific inhibitory RNA (e.g., RNAi, siRNA, ribozyme). An activity also can be reduced or removed by contacting a polypeptide having the activity with a molecule that specifically inhibits the activity (e.g., enzyme inhibitor, antibody). In certain embodiments, an activity can be reduced or removed by subjecting a host microorganism to a selective environment and screening for microorganisms that have a reduced level or removal of the target activity.
In some embodiments, an untranslated ribonucleic acid, or a cDNA can be used to reduce the expression of a particular activity or enzyme. For example, a microorganism can be engineered by genetic modification to express a nucleic acid reagent that reduces the expression of an activity by producing an RNA molecule that is partially or substantially homologous to a nucleic acid sequence of interest which encodes the activity of interest. The RNA molecule can bind to the nucleic acid sequence of interest and inhibit the nucleic acid sequence from performing its natural function, in certain embodiments. In some embodiments, the RNA may alter the nucleic acid sequence of interest which encodes the activity of interest in a manner that the nucleic acid sequence of interest is no longer capable of performing its natural function (e.g., the action of a ribozyme for example). In certain embodiments, nucleotide sequences sometimes are added to, modified or removed from one or more of the nucleic acid reagent elements, such as the promoter, 5'UTR, target sequence, or 3'UTR elements, to enhance, potentially enhance, reduce, or potentially reduce transcription and/or translation before or after such elements are incorporated in a nucleic acid reagent. 
In some embodiments, one or more of the following sequences may be modified or removed if they are present in a 5'UTR: a sequence that forms a stable secondary structure (e.g., quadruplex structure or stem loop stem structure (e.g., EMBL sequences X12949, AF274954, AF139980, AF152961, S95936, U194144, AF116649 or substantially identical sequences that form such stem loop stem structures)); a translation initiation codon upstream of the target nucleotide sequence start codon; a stop codon upstream of the target nucleotide sequence translation initiation codon; an ORF upstream of the target nucleotide sequence translation initiation codon; an iron responsive element (IRE) or like sequence; and a 5' terminal oligopyrimidine tract (TOP, e.g., consisting of 5-15 pyrimidines adjacent to the cap). A translational enhancer sequence and/or an internal ribosome entry site (IRES) sometimes is inserted into a 5'UTR (e.g., EMBL nucleotide sequences J04513, X87949, M95825, M12783, AF025841, AF013263, AF006822, M17169, M13440, M22427, D14838 and M17446 and substantially identical nucleotide sequences).
An AU-rich element (ARE, e.g., AUUUA repeats) and/or splicing junction that follows a non-sense codon sometimes is removed from or modified in a 3'UTR. A polyadenosine tail sometimes is inserted into a 3'UTR if none is present, sometimes is removed if it is present, and adenosine moieties sometimes are added to or removed from a polyadenosine tail present in a 3'UTR. Thus, some embodiments are directed to a process comprising: determining whether any nucleotide sequences that increase, potentially increase, reduce or potentially reduce translation efficiency are present in the elements, and adding, removing or modifying one or more of such sequences if they are identified. Certain embodiments are directed to a process comprising: determining whether any nucleotide sequences that increase or potentially increase translation efficiency are not present in the elements, and incorporating such sequences into the nucleic acid reagent.
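The "determining" step described above can be partly approximated with simple sequence scans. The following sketch is illustrative only: it assumes the UTR sequences are supplied as plain strings written 5' to 3', and it reports just two of the features named in the text (upstream AUG initiation codons in a 5'UTR and AUUUA motifs in a 3'UTR). It does not attempt to detect secondary structure, iron responsive elements or oligopyrimidine tracts, and the function names are hypothetical.

```python
import re


def _as_rna(sequence):
    """Normalize a DNA or RNA string to upper-case RNA letters."""
    return sequence.upper().replace("T", "U")


def upstream_start_codons(utr5):
    """Return 0-based positions of every AUG in a 5'UTR, in any reading frame."""
    # A lookahead is used so that overlapping matches are all reported.
    return [m.start() for m in re.finditer(r"(?=AUG)", _as_rna(utr5))]


def au_rich_motifs(utr3):
    """Return 0-based positions of every AUUUA pentamer in a 3'UTR."""
    return [m.start() for m in re.finditer(r"(?=AUUUA)", _as_rna(utr3))]


if __name__ == "__main__":
    # Made-up sequences used purely to exercise the functions.
    print(upstream_start_codons("GGAUGCCAUGG"))
    print(au_rich_motifs("CCAUUUAUUUAGG"))
```

Positions returned by such a scan would only flag candidate sequences; whether a given element actually affects translation efficiency would still need to be assessed experimentally.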
In some embodiments, an activity can be altered by modifying the nucleotide sequence of an ORF. An ORF sometimes is mutated or modified (for example, by point mutation, deletion mutation, insertion mutation, PCR based mutagenesis and the like) to alter, enhance or increase, reduce, substantially reduce or eliminate the activity of the encoded protein or peptide. The protein or peptide encoded by a modified ORF sometimes is produced in a lower amount or may not be produced at detectable levels, and in other embodiments, the product or protein encoded by the modified ORF is produced at a higher level (e.g., codons sometimes are modified so they are compatible with tRNA's preferentially used in the host organism or engineered organism). To determine the relative activity, the activity from the product of the mutated ORF (or cell containing it) can be compared to the activity of the product or protein encoded by the unmodified ORF (or cell containing it). In some embodiments, an ORF nucleotide sequence sometimes is mutated or modified to alter the triplet nucleotide sequences used to encode amino acids (e.g., amino acid codon triplets, for example). Modification of the nucleotide sequence of an ORF to alter codon triplets sometimes is used to change the codon found in the original sequence to better match the preferred codon usage of the organism in which the ORF or nucleic acid reagent will be expressed. For example, the codon usage, and therefore the codon triplets encoded by a nucleic acid sequence from bacteria may be different from the preferred codon usage in eukaryotes like yeast or plants.
Preferred codon usage also may be different between bacterial species. In certain embodiments, an ORF nucleotide sequence sometimes is modified to eliminate codon pairs and/or eliminate mRNA secondary structures that can cause pauses during translation of the mRNA encoded by the ORF nucleotide sequence. Translational pausing sometimes occurs when nucleic acid secondary structures exist in an mRNA, and sometimes occurs due to the presence of codon pairs that slow the rate of translation by causing ribosomes to pause. In some embodiments, the use of higher abundance codon triplets can reduce translational pausing due to a decrease in the pause time needed to load a charged tRNA into the ribosome translation machinery. Therefore, to increase transcriptional and translational efficiency in bacteria (e.g., where transcription and translation are concurrent, for example) or to increase translational efficiency in eukaryotes (e.g., where transcription and translation are functionally separated), a nucleotide sequence of interest can be altered to better suit the transcription and/or translational machinery of the host and/or genetically modified microorganism. In certain embodiments, slowing the rate of translation by the use of lower abundance codons, which slow or pause the ribosome, can lead to higher yields of the desired product due to an increase in correctly folded proteins and a reduction in the formation of inclusion bodies. Codons can be altered and optimized according to the preferred usage by a given organism by determining the codon distribution of the nucleotide sequence donor organism and comparing the distribution of codons to the distribution of codons in the recipient or host organism. Techniques described herein (e.g., site directed mutagenesis and the like) can then be used to alter the codons accordingly. Comparisons of codon usage can be done by hand, or using nucleic acid analysis software commercially available to the artisan.
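A codon usage comparison of the kind described above can be sketched in a few lines. The example below is a simplified illustration, not a substitute for dedicated sequence analysis software: it counts in-frame codons in an ORF and lists those whose frequency in a host usage table falls below a chosen threshold. The function names, the threshold and the host table in the usage example are assumptions made for illustration; the table values are made-up placeholders, not measured codon frequencies for any organism.

```python
from collections import Counter


def codon_counts(orf):
    """Count in-frame codon triplets in an ORF given as a DNA string."""
    seq = orf.upper()
    if len(seq) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return Counter(seq[i:i + 3] for i in range(0, len(seq), 3))


def codon_frequencies(orf):
    """Fraction of the ORF's codons accounted for by each triplet."""
    counts = codon_counts(orf)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def rare_in_host(orf, host_freq, threshold):
    """Codons used in the ORF whose host frequency is below the threshold.

    Codons missing from the host table are treated as having frequency 0.
    """
    return sorted(c for c in codon_counts(orf) if host_freq.get(c, 0.0) < threshold)


if __name__ == "__main__":
    donor_orf = "ATGCTGCTGTAA"  # toy ORF
    # Placeholder host table (hypothetical values, for illustration only).
    host_table = {"ATG": 0.02, "CTG": 0.001, "TAA": 0.01}
    print(codon_frequencies(donor_orf))
    print(rare_in_host(donor_orf, host_table, threshold=0.005))
```

Codons flagged in this way would be candidates for replacement with synonymous codons that are used more frequently in the host.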
Modification of the nucleotide sequence of an ORF also can be used to correct codon triplet sequences that have diverged in different organisms. For example, certain yeast (e.g., C. tropicalis and C. maltosa) use the codon triplet CUG (e.g., CTG in the DNA sequence) to encode serine. CUG typically encodes leucine in most organisms. In order to maintain the correct amino acid in the resultant polypeptide or protein, the CUG codon must be altered to reflect the organism in which the nucleic acid reagent will be expressed. Thus, if an ORF from a bacterial donor is to be expressed in either Candida yeast strain mentioned above, each CUG codon in the heterologous nucleotide sequence must first be altered or modified to another appropriate leucine codon. Therefore, in some embodiments, the nucleotide sequence of an ORF sometimes is altered or modified to correct for differences that have occurred in the evolution of the amino acid codon triplets between different organisms. In some embodiments, the nucleotide sequence can be left unchanged at a particular amino acid codon, if the amino acid encoded is a conservative or neutral change in amino acid when compared to the originally encoded amino acid.
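The CUG correction described above amounts to replacing every in-frame CTG codon in the donor ORF with a different leucine codon before expression in a yeast that reads CUG as serine. The sketch below illustrates the idea under stated assumptions: the function name `recode_ctg` is hypothetical, and TTG is used as the default replacement simply because it encodes leucine in both the standard code and the alternative yeast nuclear code. In practice the replacement would likely be chosen with the host's preferred codon usage in mind.

```python
def recode_ctg(orf, replacement="TTG"):
    """Replace in-frame CTG codons in a DNA ORF with another leucine codon.

    Only triplets in the reading frame starting at position 0 are changed;
    out-of-frame occurrences of the letters CTG are left untouched.
    """
    seq = orf.upper()
    if len(seq) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if len(replacement) != 3 or replacement.upper() == "CTG":
        raise ValueError("replacement must be a triplet other than CTG")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join(replacement.upper() if c == "CTG" else c for c in codons)


if __name__ == "__main__":
    # Toy ORF: ATG CTG GCT GAA TAA. The letters CTG also occur out of frame
    # (spanning GCT and GAA) and are deliberately not modified.
    print(recode_ctg("ATGCTGGCTGAATAA"))
```

Working codon by codon rather than with a plain string substitution avoids altering out-of-frame CTG occurrences, which would change neighboring amino acids.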
In some embodiments, an activity can be altered by modifying translational regulation signals, like a stop codon for example. A stop codon at the end of an ORF sometimes is modified to another stop codon, such as an amber stop codon described above. In some embodiments, a stop codon is introduced within an ORF, sometimes by insertion or mutation of an existing codon. An ORF comprising a modified terminal stop codon and/or internal stop codon often is translated in a system comprising a suppressor tRNA that recognizes the stop codon. An ORF comprising a stop codon sometimes is translated in a system comprising a suppressor tRNA that incorporates an unnatural amino acid during translation of the target protein or target peptide. Methods for incorporating unnatural amino acids into a target protein or peptide are known, which include, for example, processes utilizing a heterologous tRNA/synthetase pair, where the tRNA recognizes an amber stop codon and is loaded with an unnatural amino acid (e.g., World Wide Web URL iupac.org/news/prize/2003/wang.pdf). Depending on the portion of a nucleic acid reagent (e.g., promoter, 5' or 3' UTR, ORI, ORF, and the like) chosen for alteration (e.g., by mutagenesis, introduction or deletion, for example) the modifications described above can alter a given activity by (i) increasing or decreasing feedback inhibition mechanisms, (ii) increasing or decreasing promoter initiation, (iii) increasing or decreasing translation initiation, (iv) increasing or decreasing translational efficiency, (v) modifying localization of peptides or products expressed from nucleic acid reagents described herein, (vi) increasing or decreasing the copy number of a nucleotide sequence of interest, or (vii) expression of an anti-sense RNA, RNAi, siRNA, ribozyme and the like. In some embodiments, alteration of a nucleic acid reagent or nucleotide sequence can alter a region involved in feedback inhibition (e.g., 5' UTR, promoter and the like). A modification sometimes is made that can add or enhance binding of a feedback regulator and sometimes a modification is made that can reduce, inhibit or eliminate binding of a feedback regulator.
In certain embodiments, alteration of a nucleic acid reagent or nucleotide sequence can alter sequences involved in transcription initiation (e.g., promoters, 5' UTR, and the like). A modification sometimes can be made that can enhance or increase initiation from an endogenous or heterologous promoter element. A modification sometimes can be made that removes or disrupts sequences that increase or enhance transcription initiation, resulting in a decrease or elimination of transcription from an endogenous or heterologous promoter element.
In some embodiments, alteration of a nucleic acid reagent or nucleotide sequence can alter sequences involved in translational initiation or translational efficiency (e.g., 5' UTR, 3' UTR, codon triplets of higher or lower abundance, translational terminator sequences and the like, for example). A modification sometimes can be made that can increase or decrease translational initiation, modifying a ribosome binding site for example. A modification sometimes can be made that can increase or decrease translational efficiency. Removing or adding sequences that form hairpins and changing codon triplets to a more or less preferred codon are non-limiting examples of genetic modifications that can be made to alter translation initiation and translation efficiency. In certain embodiments, alteration of a nucleic acid reagent or nucleotide sequence can alter sequences involved in localization of peptides, proteins or other desired products (e.g., adipic acid, for example). A modification sometimes can be made that can alter, add or remove sequences responsible for targeting a polypeptide, protein or product to an intracellular organelle, the periplasm, cellular membranes, or extracellularly. Transport of a heterologous product to a different intracellular space or extracellularly sometimes can reduce or eliminate the formation of inclusion bodies (e.g., insoluble aggregates of the desired product).
In some embodiments, alteration of a nucleic acid reagent or nucleotide sequence can alter sequences involved in increasing or decreasing the copy number of a nucleotide sequence of interest. A modification sometimes can be made that increases or decreases the number of copies of an ORF stably integrated into the genome of an organism or on an epigenetic nucleic acid reagent. Non-limiting examples of alterations that can increase the number of copies of a sequence of interest include, adding copies of the sequence of interest by duplication of regions in the genome (e.g., adding additional copies by recombination or by causing gene amplification of the host genome, for example), cloning additional copies of a sequence onto a nucleic acid reagent, or altering an ORI to increase the number of copies of an epigenetic nucleic acid reagent. Non-limiting examples of alterations that can decrease the number of copies of a sequence of interest include, removing copies of the sequence of interest by deletion or disruption of regions in the genome, removing additional copies of the sequence from epigenetic nucleic acid reagents, or altering an ORI to decrease the number of copies of an epigenetic nucleic acid reagent.
In certain embodiments, increasing or decreasing the expression of a nucleotide sequence of interest can also be accomplished by altering, adding or removing sequences involved in the expression of an anti-sense RNA, RNAi, siRNA, ribozyme and the like. The methods described above can be used to modify expression of anti-sense RNA, RNAi, siRNA, ribozyme and the like.
Engineered microorganisms can be prepared by altering, introducing or removing nucleotide sequences in the host genome or in stably maintained epigenetic nucleic acid reagents, as noted above. The nucleic acid reagents used to alter, introduce or remove nucleotide sequences in the host genome or epigenetic nucleic acids can be prepared using the methods described herein or available to the artisan.
Nucleic acid sequences having a desired activity can be isolated from cells of a suitable organism using lysis and nucleic acid purification procedures available in Maniatis, T., E. F. Fritsch and J. Sambrook (1982) Molecular Cloning: a Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. or with commercially available cell lysis and DNA purification reagents and kits. In some embodiments, nucleic acids used to engineer microorganisms can be provided for conducting methods described herein after processing of the organism containing the nucleic acid. For example, the nucleic acid of interest may be extracted, isolated, purified or amplified from a sample (e.g., from an organism of interest or culture containing a plurality of organisms of interest, like yeast or bacteria for example). The term "isolated" as used herein refers to nucleic acid removed from its original environment (e.g., the natural environment if it is naturally occurring, or a host cell if expressed exogenously), and thus is altered "by the hand of man" from its original environment. An isolated nucleic acid generally is provided with fewer non-nucleic acid components (e.g., protein, lipid) than the amount of components present in a source sample. A composition comprising isolated sample nucleic acid can be substantially isolated (e.g., about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater than 99% free of non-nucleic acid components). The term "purified" as used herein refers to sample nucleic acid provided that contains fewer nucleic acid species than in the sample source from which the sample nucleic acid is derived. A composition comprising sample nucleic acid may be substantially purified (e.g., about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater than 99% free of other nucleic acid species). The term "amplified" as used herein refers to subjecting nucleic acid of a cell, organism or sample to a process that linearly or exponentially generates amplicon nucleic acids having the same or substantially the same nucleotide sequence as the nucleotide sequence of the nucleic acid in the sample, or portion thereof. As noted above, the nucleic acids used to prepare nucleic acid reagents as described herein can be subjected to fragmentation or cleavage. Amplification of nucleic acids is sometimes necessary when dealing with organisms that are difficult to culture. Where amplification may be desired, any suitable amplification technique can be utilized. Non-limiting examples of methods for amplification of polynucleotides include, polymerase chain reaction (PCR); ligation amplification (or ligase chain reaction (LCR)); amplification methods based on the use of Q-beta replicase or template-dependent polymerase (see US Patent Publication Number US20050287592); helicase-dependent isothermal amplification (Vincent et al., "Helicase-dependent isothermal DNA amplification". EMBO reports 5 (8): 795-800 (2004)); strand displacement amplification (SDA); thermophilic SDA; nucleic acid sequence based amplification (3SR or NASBA); and transcription-associated amplification (TAA). Non-limiting examples of PCR amplification methods include standard PCR, AFLP-PCR, Allele-specific PCR, Alu-PCR, Asymmetric PCR, Colony PCR, Hot start PCR, Inverse PCR (IPCR), In situ PCR (ISH), Intersequence-specific PCR (ISSR-PCR), Long PCR, Multiplex PCR, Nested PCR, Quantitative PCR, Reverse Transcriptase PCR (RT-PCR), Real Time PCR, Single cell PCR, Solid phase PCR, combinations thereof, and the like. Reagents and hardware for conducting PCR are commercially available. Protocols for conducting the various types of PCR listed above are readily available to the artisan. PCR conditions can be dependent upon primer sequences, target abundance, and the desired amount of amplification, and therefore, one of skill in the art may choose from a number of PCR protocols available (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202; and PCR Protocols: A Guide to Methods and Applications, Innis et al., eds, 1990). PCR often is carried out as an automated process with a thermostable enzyme. In this process, the temperature of the reaction mixture is cycled through a denaturing region, a primer-annealing region, and an extension reaction region automatically. Machines specifically adapted for this purpose are commercially available. A non-limiting example of a PCR protocol that may be suitable for embodiments described herein is, treating the sample at 95°C for 5 minutes; repeating forty-five cycles of 95°C for 1 minute, 59°C for 1 minute, 10 seconds, and 72°C for 1 minute 30 seconds; and then treating the sample at 72°C for 5 minutes. Additional PCR protocols are described in the example section. Multiple cycles frequently are performed using a commercially available thermal cycler.
Suitable isothermal amplification processes known and selected by the person of ordinary skill in the art also may be applied, in certain embodiments. In some embodiments, nucleic acids encoding polypeptides with a desired activity can be isolated by amplifying the desired sequence from an organism having the desired activity using oligonucleotides or primers designed based on sequences described herein.
Amplified, isolated and/or purified nucleic acids can be cloned into the recombinant DNA vectors described in Figures herein or into suitable commercially available recombinant DNA vectors. Cloning of nucleic acid sequences of interest into recombinant DNA vectors can facilitate further manipulations of the nucleic acids for preparation of nucleic acid reagents (e.g., alteration of nucleotide sequences by mutagenesis, homologous recombination, amplification and the like, for example). Standard cloning procedures (e.g., enzymic digestion, ligation, and the like) are readily available to the artisan and can be found in Maniatis, T., E. F. Fritsch and J. Sambrook (1982) Molecular Cloning: a Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
In some embodiments, nucleic acid sequences prepared by isolation or amplification can be used, without any further modification, to add an activity to a microorganism and thereby generate a genetically modified or engineered microorganism. In certain embodiments, nucleic acid sequences prepared by isolation or amplification can be genetically modified to alter (e.g., increase or decrease, for example) a desired activity. In some embodiments, nucleic acids, used to add an activity to an organism, sometimes are genetically modified to optimize the heterologous polynucleotide sequence encoding the desired activity (e.g., polypeptide or protein, for example). The term "optimize" as used herein can refer to alteration to increase or enhance expression by preferred codon usage. The term optimize can also refer to modifications to the amino acid sequence to increase the activity of a polypeptide or protein, such that the activity exhibits a higher catalytic activity as compared to the "natural" version of the polypeptide or protein.
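Optimization by preferred codon usage, as described above, can be illustrated with a short sketch. The standard genetic code table is used to look up the amino acid for each codon; the `PREFERRED` table is a HYPOTHETICAL placeholder chosen only for illustration, and a real table would be derived from codon-usage data for the chosen host organism. Codons for amino acids absent from the table are left unchanged.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' marks a stop).
GENETIC_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# HYPOTHETICAL preferred codons, for illustration only (not host data).
PREFERRED = {"L": "TTG", "S": "TCT", "R": "AGA", "K": "AAG", "A": "GCT", "G": "GGT"}


def split_codons(dna):
    """Split an in-frame coding sequence into codons."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate an in-frame coding sequence with the standard code."""
    return "".join(GENETIC_CODE[codon] for codon in split_codons(dna))


def optimize(dna, preferred=PREFERRED):
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    return "".join(
        preferred.get(GENETIC_CODE[codon], codon) for codon in split_codons(dna)
    )


if __name__ == "__main__":
    native = "ATGCTCAGCCGGAAATAA"
    recoded = optimize(native)
    print(recoded, translate(native) == translate(recoded))
```

The recoded sequence differs at the nucleotide level but encodes the same polypeptide, which is the property that distinguishes codon optimization from amino acid sequence modification.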
Nucleic acid sequences of interest can be genetically modified using methods known in the art. Mutagenesis techniques are particularly useful for small scale (e.g., 1, 2, 5, 10 or more nucleotides) or large scale (e.g., 50, 100, 150, 200, 500, or more nucleotides) genetic modification. Mutagenesis allows the artisan to alter the genetic information of an organism in a stable manner, either naturally (e.g., isolation using selection and screening) or experimentally by the use of chemicals, radiation or inaccurate DNA replication (e.g., PCR mutagenesis). In some embodiments, genetic modification can be performed by whole-scale synthesis of nucleic acids, using a native nucleotide sequence as the reference sequence, and modifying nucleotides that can result in the desired alteration of activity. Mutagenesis methods sometimes are specific or targeted to specific regions or nucleotides (e.g., site-directed mutagenesis, PCR-based site-directed mutagenesis, and in vitro mutagenesis techniques such as transplacement and in vivo oligonucleotide site-directed mutagenesis, for example). Mutagenesis methods sometimes are non-specific or random with respect to the placement of genetic modifications (e.g., chemical mutagenesis, insertion element mutagenesis (e.g., insertion or transposon elements) and inaccurate PCR-based methods, for example).
Site-directed mutagenesis is a procedure in which a specific nucleotide or specific nucleotides in a DNA molecule are mutated or altered. Site-directed mutagenesis typically is performed using a nucleic acid sequence of interest cloned into a circular plasmid vector. Site-directed mutagenesis requires that the wild type sequence be known and used as a platform for the genetic alteration. Site-directed mutagenesis sometimes is referred to as oligonucleotide-directed mutagenesis because the technique can be performed using oligonucleotides which have the desired genetic modification incorporated into the complement of a nucleotide sequence of interest. The wild type sequence and the altered nucleotide are allowed to hybridize and the hybridized nucleic acids are extended and replicated using a DNA polymerase. The double stranded nucleic acids are introduced into a host (e.g., E. coli, for example) and further rounds of replication are carried out in vivo. The transformed cells carrying the mutated nucleic acid sequence are then selected and/or screened for those cells carrying the correctly mutagenized sequence. Cassette mutagenesis and PCR-based site-directed mutagenesis are further modifications of the site-directed mutagenesis technique. Site-directed mutagenesis can also be performed in vivo (e.g., transplacement "pop-in pop-out", in vivo site-directed mutagenesis with synthetic oligonucleotides and the like, for example).
PCR-based mutagenesis can be performed using PCR with oligonucleotide primers that contain the desired mutation or mutations. The technique functions in a manner similar to standard site-directed mutagenesis, with the exception that a thermocycler and PCR conditions are used to replace replication and selection of the clones in a microorganism host. As PCR-based mutagenesis also uses a circular plasmid vector, the amplified fragment (e.g., linear nucleic acid molecule) containing the incorporated genetic modifications can be separated from the plasmid containing the template sequence after a sufficient number of rounds of thermocycler amplification, using standard electrophoretic procedures. A modification of this method uses linear amplification methods and a pair of mutagenic primers that amplify the entire plasmid. The procedure takes advantage of the E. coli Dam methylase system, which causes DNA replicated in vivo to be sensitive to the restriction endonuclease DpnI. PCR synthesized DNA is not methylated and is therefore resistant to DpnI. This approach allows the template plasmid to be digested, leaving the genetically modified, PCR synthesized plasmids to be isolated and transformed into a host bacterium for DNA repair and replication, thereby facilitating subsequent cloning and identification steps. A certain amount of randomness can be added to PCR-based site-directed mutagenesis by using partially degenerate primers.
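The pair of complementary mutagenic primers used for whole-plasmid amplification can be sketched as follows. This is a simplified illustration that handles only a single-base change with a fixed flank length; the function names, the default flank of 12 nucleotides, and the absence of melting-temperature checks are assumptions made for brevity, not parameters taken from this description.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mutagenic_primers(template, position, new_base, flank=12):
    """Design a complementary primer pair carrying one base substitution.

    position is a 0-based index into template; flank is the number of
    unchanged template bases kept on each side of the mutated base.
    """
    if not (flank <= position < len(template) - flank):
        raise ValueError("not enough flanking sequence around the position")
    if template[position] == new_base:
        raise ValueError("new base is identical to the template base")
    forward = (
        template[position - flank:position]
        + new_base
        + template[position + 1:position + 1 + flank]
    )
    return forward, reverse_complement(forward)


if __name__ == "__main__":
    # Placeholder template, not a sequence from this document.
    fwd, rev = mutagenic_primers("AAAAACCCCCGGGGGTTTTT", 10, "A", flank=5)
    print(fwd, rev)
```

Partial degeneracy could be modeled by passing an ambiguity code in place of `new_base`, in which case the complement table would need to be extended accordingly.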
Recombination sometimes can be used as a tool for mutagenesis. Homologous recombination allows the artisan to specifically target regions of known sequence for insertion of heterologous nucleotide sequences using the host organism's natural DNA replication and repair enzymes. Homologous recombination methods sometimes are referred to as "pop in pop out" mutagenesis, transplacement, knock out mutagenesis or knock in mutagenesis. Integration of a nucleic acid sequence into a host genome is a single cross over event, which inserts the entire nucleic acid reagent (e.g., pop in). A second cross over event excises all but a portion of the nucleic acid reagent, leaving behind a heterologous sequence, often referred to as a "footprint" (e.g., pop out). Mutagenesis by insertion (e.g., knock in) or by double recombination leaving behind a disrupting heterologous nucleic acid (e.g., knock out) both serve to disrupt or "knock out" the function of the gene or nucleic acid sequence in which insertion occurs. By combining selectable markers and/or auxotrophic markers with nucleic acid reagents designed to provide the appropriate nucleic acid target sequences, the artisan can target a selectable nucleic acid reagent to a specific region, and then select for recombination events that "pop out" a portion of the inserted (e.g., "pop in") nucleic acid reagent. Such methods take advantage of nucleic acid reagents that have been specifically designed with known target nucleic acid sequences at or near a nucleic acid or genomic region of interest.
Popping out typically leaves a "footprint" of leftover sequences that remain after the recombination event. The leftover sequence can disrupt a gene and thereby reduce or eliminate expression of that gene. In some embodiments, the method can be used to insert sequences upstream or downstream of genes that can result in an enhancement or reduction in expression of the gene. In certain embodiments, new genes can be introduced into the genome of a host organism using similar recombination or "pop in" methods. An example of a yeast recombination system using the ura3 gene and 5-FOA was described briefly above and further detail is presented herein.
A method for modification is described in Alani et al., "A method for gene disruption that allows repeated use of URA3 selection in the construction of multiply disrupted yeast strains", Genetics 116(4):541-545, August 1987. The original method uses a URA3 cassette with 1000 base pairs (bp) of the same nucleotide sequence cloned in the same orientation on either side of the URA3 cassette. Targeting sequences of about 50 bp are added to each side of the construct. The double stranded targeting sequences are complementary to sequences in the genome of the host organism. The targeting sequences allow site-specific recombination in a region of interest. The modification of the original technique replaces the two 1000 bp direct repeats with two 200 bp direct repeats. The modified method also uses 50 bp targeting sequences. The modification reduces or eliminates recombination of a second knock out into the 1000 bp repeat left behind in a first mutagenesis, therefore allowing multiply knocked out yeast. Additionally, the 200 bp sequences used herein are uniquely designed, self-assembling sequences that leave behind identifiable footprints. The technique used to design the sequences incorporates design features such as low identity to the yeast genome and low identity to each other. Therefore a library of the self-assembling sequences can be generated to allow multiple knockouts in the same organism, while reducing or eliminating the potential for integration into a previous knockout.
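The layout of the modified cassette (targeting sequence, direct repeat, marker, direct repeat, targeting sequence) and the effect of the pop-out recombination can be modeled on strings. The sketch below uses random placeholder sequences and an arbitrary 1100 nt stand-in for the URA3 marker; none of these are sequences from this description, and the function names are illustrative.

```python
import random


def random_dna(length, rng):
    """Generate a placeholder DNA sequence (not a designed repeat)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def build_cassette(up_target, repeat, marker, down_target):
    """Assemble targeting / repeat / marker / repeat / targeting."""
    return up_target + repeat + marker + repeat + down_target


def pop_out(integrated, repeat):
    """Model recombination between the two direct repeats.

    The marker and one repeat copy are excised; a single repeat copy
    (the footprint) remains between the targeting sequences.
    """
    first = integrated.find(repeat)
    second = integrated.find(repeat, first + len(repeat))
    if first < 0 or second < 0:
        raise ValueError("two direct repeat copies are required")
    return integrated[:first] + integrated[second:]


if __name__ == "__main__":
    rng = random.Random(0)
    up, down = random_dna(50, rng), random_dna(50, rng)
    repeat = random_dna(200, rng)
    marker = random_dna(1100, rng)  # placeholder length for the marker
    cassette = build_cassette(up, repeat, marker, down)
    footprint = pop_out(cassette, repeat)
    print(len(cassette), len(footprint))
```

With 50 bp targeting sequences and 200 bp repeats, the sequence remaining after pop-out is 300 bp in this model: the two targeting sequences flanking a single repeat copy.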
As noted above, the URA3 cassette makes use of the toxicity of 5-FOA in yeast carrying a functional URA3 gene. Uracil synthesis deficient yeast are transformed with the modified URA3 cassette, using standard yeast transformation protocols, and the transformed cells are plated on minimal media minus uracil. In some embodiments, PCR can be used to verify correct insertion into the region of interest in the host genome, and in certain embodiments the PCR step can be omitted. Inclusion of the PCR step can reduce the number of transformants that need to be counter selected to "pop out" the URA3 cassette. The transformants (e.g., all or the ones determined to be correct by PCR, for example) can then be counter-selected on media containing 5-FOA, which will select for recombination out (e.g., popping out) of the URA3 cassette, thus rendering the yeast ura3 deficient again, and resistant to 5-FOA toxicity. Targeting sequences used to direct recombination events to specific regions are presented herein. A modification of the method described above can be used to integrate genes into the chromosome, where after recombination a functional gene is left in the chromosome next to the 200 bp footprint.
In some embodiments, other auxotrophic or dominant selection markers can be used in place of URA3 (e.g., an auxotrophic selectable marker), with the appropriate change in selection media and selection agents. Auxotrophic selectable markers are used in strains deficient for synthesis of a required biological molecule (e.g., amino acid or nucleoside, for example). Non-limiting examples of additional auxotrophic markers include HIS3, TRP1, LEU2, LEU2-d, and LYS2. Certain auxotrophic markers (e.g., URA3 and LYS2) allow counter selection to select for the second recombination event that pops out all but one of the direct repeats of the recombination construct. HIS3 encodes an activity involved in histidine synthesis. TRP1 encodes an activity involved in tryptophan synthesis. LEU2 encodes an activity involved in leucine synthesis. LEU2-d is a low expression version of LEU2 that selects for increased copy number (e.g., gene or plasmid copy number, for example) to allow survival on minimal media without leucine. LYS2 encodes an activity involved in lysine synthesis, and allows counter selection for recombination out of the LYS2 gene using alpha-amino adipate (α-amino adipate).
Dominant selectable markers are useful because they also allow industrial and/or prototrophic strains to be used for genetic manipulations. Additionally, dominant selectable markers provide the advantage that rich medium can be used for plating and culture growth, and thus growth rates are markedly increased. Non-limiting examples of dominant selectable markers include Tn903 kanR, CmR, HygR, CUP1, and DHFR. Tn903 kanR encodes an activity involved in kanamycin antibiotic resistance (e.g., typically neomycin phosphotransferase II or NPTII, for example). CmR encodes an activity involved in chloramphenicol antibiotic resistance (e.g., typically chloramphenicol acetyl transferase or CAT, for example). HygR encodes an activity involved in hygromycin resistance by phosphorylation of hygromycin B (e.g., hygromycin phosphotransferase, or HPT). CUP1 encodes an activity involved in resistance to heavy metal (e.g., copper, for example) toxicity. DHFR encodes a dihydrofolate reductase activity which confers resistance to methotrexate and sulfanilamide compounds.
In contrast to site-directed or specific mutagenesis, random mutagenesis does not require any sequence information and can be accomplished by a number of widely different methods. Random mutagenesis often is used to generate mutant libraries that can be used to screen for the desired genotype or phenotype. Non-limiting examples of random mutagenesis include; chemical mutagenesis, UV-induced mutagenesis, insertion element or transposon-mediated mutagenesis, DNA shuffling, error-prone PCR mutagenesis, and the like.
Chemical mutagenesis often involves chemicals like ethyl methanesulfonate (EMS), nitrous acid, mitomycin C, N-methyl-N-nitrosourea (MNU), diepoxybutane (DEB), 1,2,7,8-diepoxyoctane (DEO), methyl methane sulfonate (MMS), N-methyl-N'-nitro-N-nitrosoguanidine (MNNG), 4-nitroquinoline 1-oxide (4-NQO), 2-methyloxy-6-chloro-9(3-[ethyl-2-chloroethyl]-aminopropylamino)-acridinedihydrochloride (ICR-170), 2-amino purine (2AP), and hydroxylamine (HA), provided herein as non-limiting examples. These chemicals can cause base-pair substitutions, frameshift mutations, deletions, transversion mutations, transition mutations, incorrect replication, and the like. In some embodiments, the mutagenesis can be carried out in vivo. Sometimes the mutagenic process involves the use of the host organism's DNA replication and repair mechanisms to incorporate and replicate the mutagenized base or bases.
Another type of chemical mutagenesis involves the use of base analogs. The use of base analogs causes incorrect base pairing which in the following round of replication is corrected to a mismatched nucleotide when compared to the starting sequence. Base analog mutagenesis introduces a small amount of non-randomness to random mutagenesis, because specific base analogs can be chosen which can be incorporated at certain nucleotides in the starting sequence. Correction of the mispairing typically yields a known substitution. For example, bromo-deoxyuridine (BrdU) can be incorporated into DNA and replaces T in the sequence. The host DNA repair and replication machinery can sometimes correct the defect, but sometimes will mispair the BrdU with a G. The next round of replication then yields a G-C pair in place of the original A-T pair in the native sequence, which is a transition.
Ultraviolet (UV) induced mutagenesis is caused by the formation of thymine dimers when UV light irradiates chemical bonds between two adjacent thymine residues. Excision repair mechanisms of the host organism correct the lesion in the DNA, but occasionally the lesion is incorrectly repaired, typically resulting in a C to T transition.
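Base substitutions such as those produced by base analogs or UV damage are conventionally classified as transitions (purine to purine, or pyrimidine to pyrimidine) or transversions (purine to pyrimidine, or the reverse). A minimal sketch of that classification, with an illustrative function name:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(original, mutant):
    """Classify a single-base substitution on one DNA strand."""
    if original == mutant:
        raise ValueError("no substitution: bases are identical")
    pair = {original, mutant}
    if not pair <= (PURINES | PYRIMIDINES):
        raise ValueError("bases must be one of A, C, G, T")
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    # An A-T pair replaced by a G-C pair is A->G on one strand, T->C on the other.
    print(classify_substitution("A", "G"), classify_substitution("T", "C"))
    print(classify_substitution("C", "T"))
    print(classify_substitution("A", "T"))
```

Because both strands of a changed base pair fall in the same class, a substitution can be classified from either strand.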
Insertion element or transposon-mediated mutagenesis makes use of naturally occurring or modified naturally occurring mobile genetic elements. Transposons often encode accessory activities in addition to the activities necessary for transposition (e.g., movement using a transposase activity, for example). In many examples, transposon accessory activities are antibiotic resistance markers (e.g., see Tn903 kanR described above, for example). Insertion elements typically only encode the activities necessary for movement of the nucleic acid sequence. Insertion element and transposon mediated mutagenesis often can occur randomly; however, specific target sequences are known for some transposons. Mobile genetic elements like IS elements or transposons (Tn) often have inverted repeats, direct repeats or both inverted and direct repeats flanking the region coding for the transposition genes. Recombination events catalyzed by the transposase cause the element to remove itself from the genome and move to a new location, leaving behind a portion of an inverted or direct repeat. Classic examples of transposons are the "mobile genetic elements" discovered in maize. Transposon mutagenesis kits are commercially available which are designed to leave behind a 5 codon insert (e.g., Mutation Generation System kit, Finnzymes, World Wide Web URL finnzymes.us, for example). This allows the artisan to identify the insertion site, without fully disrupting the function of most genes.
DNA shuffling is a method which uses DNA fragments from members of a mutant library and reshuffles the fragments randomly to generate new mutant sequence combinations. The fragments are typically generated using DNase I, followed by random annealing and re-joining using self-priming PCR. The DNA overhanging ends, from annealing of random fragments, provide "primer" sequences for the PCR process. Shuffling can be applied to libraries generated by any of the above mutagenesis methods.
Error-prone PCR and its derivative, rolling circle error-prone PCR, use increased magnesium and manganese concentrations in conjunction with limiting amounts of one or two nucleotides to reduce the fidelity of the Taq polymerase. The error rate can be as high as 2% under appropriate conditions, when the resultant mutant sequence is compared to the wild type starting sequence. After amplification, the library of mutant coding sequences must be cloned into a suitable plasmid. Although point mutations are the most common types of mutation in error-prone PCR, deletions and frameshift mutations are also possible. There are a number of commercial error-prone PCR kits available, including those from Stratagene and Clontech (e.g., World Wide Web URL strategene.com and World Wide Web URL clontech.com, respectively, for example). Rolling circle error-prone PCR is a variant of error-prone PCR in which the wild-type sequence is first cloned into a plasmid, and the whole plasmid is then amplified under error-prone conditions.
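To give a sense of scale for the error rate quoted above, the following sketch estimates the mutation load of an error-prone PCR library. It assumes, as a simplification not stated in this description, that errors occur independently at each nucleotide with the same probability; the function names are illustrative.

```python
def expected_mutations(length_nt, error_rate):
    """Expected number of mutated positions in a sequence of length_nt."""
    return length_nt * error_rate


def fraction_unmutated(length_nt, error_rate):
    """Probability that a sequence carries no mutation at all,
    assuming independent errors at each position."""
    return (1.0 - error_rate) ** length_nt


if __name__ == "__main__":
    rate = 0.02  # upper end of the error rate mentioned above
    for length in (100, 1000):
        print(
            length,
            expected_mutations(length, rate),
            round(fraction_unmutated(length, rate), 4),
        )
```

Under this assumption a 1000 nucleotide coding sequence at a 2% error rate would carry about 20 changes on average, which is why lower error rates are often chosen when single or few substitutions per clone are desired.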
As noted above, organisms with altered activities can also be isolated using genetic selection and screening of organisms challenged on selective media or by identifying naturally occurring variants from unique environments. For example, 2-Deoxy-D-glucose is a toxic glucose analog. Growth of yeast on this substance yields mutants that are glucose-deregulated. A number of mutants have been isolated using 2-Deoxy-D-glucose including transport mutants, and mutants that ferment glucose and galactose simultaneously instead of glucose first then galactose when glucose is depleted. Similar techniques have been used to isolate mutant microorganisms that can metabolize plastics (e.g., from landfills), petrochemicals (e.g., from oil spills), and the like, either in a laboratory setting or from unique environments.
Similar methods can be used to isolate naturally occurring mutations in a desired activity when the activity exists at a relatively low or nearly undetectable level in the organism of choice, in some embodiments. The method generally consists of growing the organism to a specific density in liquid culture, concentrating the cells, and plating the cells on various concentrations of the substance to which an increase in metabolic activity is desired. The cells are incubated at a moderate growth temperature for 5 to 10 days. To enhance the selection process, the plates can be stored for another 5 to 10 days at a low temperature. The low temperature sometimes can allow strains that have gained or increased an activity to continue growing while other strains are inhibited for growth at the low temperature. Following the initial selection and secondary growth at low temperature, the plates can be replica plated on higher or lower concentrations of the selection substance to further select for the desired activity.
A native, heterologous or mutagenized polynucleotide can be introduced into a nucleic acid reagent for introduction into a host organism, thereby generating an engineered microorganism. Standard recombinant DNA techniques (restriction enzyme digests, ligation, and the like) can be used by the artisan to combine the mutagenized nucleic acid of interest into a suitable nucleic acid reagent capable of (i) being stably maintained by selection in the host organism, or (ii) being integrated into the genome of the host organism. As noted above, sometimes nucleic acid reagents comprise two replication origins to allow the same nucleic acid reagent to be manipulated in bacteria before introduction of the final product into the host organism (e.g., yeast or fungus for example). Standard molecular biology and recombinant DNA methods available to one of skill in the art can be found in Maniatis, T., E. F. Fritsch and J. Sambrook (1982) Molecular Cloning: a Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
Nucleic acid reagents can be introduced into microorganisms using various techniques. Non-limiting examples of methods used to introduce heterologous nucleic acids into various organisms include transformation, transfection, transduction, electroporation, ultrasound-mediated transformation, particle bombardment and the like. In some instances the addition of carrier molecules (e.g., bis-benzimidazolyl compounds, for example, see US Patent 5595899) can increase the uptake of DNA in cells typically thought to be difficult to transform by conventional methods. Conventional methods of transformation are readily available to the artisan and can be found in Maniatis, T., E. F. Fritsch and J. Sambrook (1982) Molecular Cloning: a Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
Culture, Production and Process Methods
Engineered microorganisms often are cultured under conditions that optimize yield of a target molecule. A non-limiting example of such a target molecule is ethanol. Culture conditions often can alter (e.g., add, optimize, reduce or eliminate, for example) activity of one or more of the following activities: phosphofructokinase activity, phosphogluconate dehydratase activity, 2-keto-3-deoxygluconate-6-phosphate aldolase activity, xylose isomerase activity, phosphoenolpyruvate carboxylase activity, alcohol dehydrogenase 2 activity, thymidylate synthase activity, 6-phosphogluconate dehydrogenase (decarboxylating), glucose-6-phosphate dehydrogenase, pyruvate decarboxylase, alcohol dehydrogenase 1, 6-phosphogluconolactonase, glutamate synthase, trehalose-6-phosphate synthase, trehalose-6-phosphate phosphatase, glyceraldehyde-3-phosphate dehydrogenase, pyruvate kinase, glucose transporter, xylose reductase, xylitol dehydrogenase, xylulokinase, glycerol-3-phosphate dehydrogenase, FPS1 membrane channel, neutral trehalase, alkaline phosphatase, phosphoglucose isomerase, transaldolase, and/or transketolase. In general, conditions that may be optimized include the type and amount of carbon source, the type and amount of nitrogen source, the carbon-to-nitrogen ratio, the oxygen level, growth temperature, pH, length of the biomass production phase, length of target product accumulation phase, and time of cell harvest.
Conditions also may be optimized to maximize carbon flux through engineered pathways. The engineered activities increase the production of ethanol through the increased activities in a number of pathways (e.g., Entner-Doudoroff, Embden-Meyerhof, glycolysis, gluconeogenesis, pentose phosphate pathway, glutamate synthesis pathway). In some embodiments, the pathways are engineered to bias carbon flux through a pathway in the direction of the desired product. The pathways utilized in the non-limiting examples presented herein were selected to maximize production of ethanol by regenerating or utilizing metabolic byproducts to internally generate additional carbon sources that can be further metabolized to produce ethanol. The process of "carbon flux management" through engineered pathways produces ethanol at a level and rate closer to the calculated maximum theoretical yield for any given feedstock, in certain embodiments. The terms "theoretical yield" or "maximum theoretical yield" as used herein refer to the yield of product of a chemical or biological reaction that would be formed if the reaction went to completion. Theoretical yield is based on the stoichiometry of the reaction and ideal conditions in which starting material is completely consumed, undesired side reactions do not occur, the reverse reaction does not occur, and there are no losses in the work-up procedure.
The term "fermentation conditions" as used herein refers to any culture conditions suitable for maintaining a microorganism (e.g., in a static or proliferative state). Fermentation conditions can include several parameters, including without limitation, temperature, oxygen content, nutrient content (e.g., glucose content), pH, agitation level (e.g., revolutions per minute), gas flow rate (e.g., air, oxygen, nitrogen gas), redox potential, cell density (e.g., optical density), cell viability and the like.
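As a worked illustration of theoretical yield based on stoichiometry, the sketch below computes the maximum mass yield of ethanol from glucose and from xylose. It assumes the commonly used overall conversions C6H12O6 → 2 C2H5OH + 2 CO2 and 3 C5H10O5 → 5 C2H5OH + 5 CO2, with no carbon diverted to biomass or byproducts; these equations and the function names are supplied for illustration and are not taken from this description.

```python
# Molar masses in g/mol.
MOLAR_MASS = {"glucose": 180.156, "xylose": 150.130, "ethanol": 46.069}

# Assumed overall stoichiometry: (moles of sugar consumed, moles of ethanol formed).
STOICHIOMETRY = {"glucose": (1, 2), "xylose": (3, 5)}


def theoretical_ethanol_yield(sugar):
    """Maximum grams of ethanol per gram of sugar for the assumed stoichiometry."""
    n_sugar, n_ethanol = STOICHIOMETRY[sugar]
    return (n_ethanol * MOLAR_MASS["ethanol"]) / (n_sugar * MOLAR_MASS[sugar])


def percent_of_theoretical(sugar, grams_sugar_consumed, grams_ethanol_produced):
    """Observed yield expressed as a percentage of the theoretical maximum."""
    if grams_sugar_consumed <= 0:
        raise ValueError("sugar consumed must be positive")
    observed = grams_ethanol_produced / grams_sugar_consumed
    return 100.0 * observed / theoretical_ethanol_yield(sugar)


if __name__ == "__main__":
    for sugar in ("glucose", "xylose"):
        print(sugar, round(theoretical_ethanol_yield(sugar), 3))
```

Both sugars give a theoretical maximum of about 0.51 g ethanol per g sugar under these assumptions, which is the reference against which a measured fermentation yield would be compared.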
A change in fermentation conditions (e.g., switching fermentation conditions) is an alteration, modification or shift of one or more fermentation parameters. For example, one can change fermentation conditions by increasing or decreasing temperature, increasing or decreasing pH (e.g., adding or removing an acid, a base or carbon dioxide), increasing or decreasing oxygen content (e.g., introducing air, oxygen, carbon dioxide, nitrogen) and/or adding or removing a nutrient (e.g., one or more sugars or sources of sugar, biomass, vitamin and the like), or combinations of the foregoing. Examples of fermentation conditions are described herein. Aerobic conditions often comprise greater than about 50% dissolved oxygen (e.g., about 52%, 54%, 56%, 58%, 60%, 62%, 64%, 66%, 68%, 70%, 72%, 74%, 76%, 78%, 80%, 82%, 84%, 86%, 88%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%, or greater than any one of the foregoing). Anaerobic conditions often comprise less than about 50% dissolved oxygen (e.g., about 1 %, 2%, 4%, 6%, 8%, 10%, 12%, 14%, 16%, 18%, 20%, 22%, 24%, 26%, 28%, 30%, 32%, 34%, 36%, 38%, 40%, 42%, 44%, 46%, 48%, or less than any one of the foregoing). Culture media generally contain a suitable carbon source. Carbon sources may include, but are not limited to, monosaccharides (e.g., glucose, fructose, xylose), disaccharides (e.g., lactose, sucrose), oligosaccharides, polysaccharides (e.g., starch, cellulose, hemicellulose, other lignocellulosic materials or mixtures thereof), sugar alcohols (e.g., glycerol), and renewable feedstocks (e.g., cheese whey permeate, cornsteep liquor, sugar beet molasses, barley malt). 
Carbon sources also can be selected from one or more of the following non-limiting examples: linear or branched alkanes (e.g., hexane), linear or branched alcohols (e.g., hexanol), fatty acids (e.g., about 10 carbons to about 22 carbons), esters of fatty acids, monoglycerides, diglycerides, triglycerides, phospholipids and various commercial sources of fatty acids including vegetable oils (e.g., soybean oil) and animal fats. A carbon source may include one-carbon sources (e.g., carbon dioxide, methanol, formaldehyde, formate and carbon-containing amines) from which metabolic conversion into key biochemical intermediates can occur. It is expected that the source of carbon utilized may encompass a wide variety of carbon-containing sources and will only be limited by the choice of the engineered microorganism(s).
Nitrogen may be supplied from an inorganic (e.g., (NH4)2SO4) or organic source (e.g., urea or glutamate). In addition to appropriate carbon and nitrogen sources, culture media also can contain suitable minerals, salts, cofactors, buffers, vitamins, metal ions (e.g., Mn2+, Co2+, Zn2+, Mg2+) and other components suitable for culture of microorganisms. Engineered microorganisms sometimes are cultured in complex media (e.g., yeast extract-peptone-dextrose broth (YPD)). In some embodiments, engineered microorganisms are cultured in a defined minimal medium that lacks a component necessary for growth and thereby forces selection of a desired expression cassette (e.g., Yeast Nitrogen Base (DIFCO Laboratories, Detroit, Mich.)). Culture media in some embodiments are common commercially prepared media, such as Yeast Nitrogen Base (DIFCO Laboratories, Detroit, Mich.). Other defined or synthetic growth media may also be used, and the appropriate medium for growth of the particular microorganism is known.
A variety of host organisms can be selected for the production of engineered microorganisms. Non-limiting examples include yeast and fungi. In specific embodiments, yeast are cultured in YPD media (10 g/L Bacto Yeast Extract, 20 g/L Bacto Peptone, and 20 g/L Dextrose). Filamentous fungi, in particular embodiments, are grown in CM (Complete Medium) containing 10 g/L Dextrose, 2 g/L Bacto Peptone, 1 g/L Bacto Yeast Extract, 1 g/L Casamino acids, 50 mL/L 20X Nitrate Salts (120 g/L NaNO3, 10.4 g/L KCl, 10.4 g/L MgSO4·7H2O), 1 mL/L 1000X Trace Elements (22 g/L ZnSO4·7H2O, 11 g/L H3BO3, 5 g/L MnCl2·7H2O, 5 g/L FeSO4·7H2O, 1.7 g/L CoCl2·6H2O, 1.6 g/L CuSO4·5H2O, 1.5 g/L Na2MoO4·2H2O, and 50 g/L Na4EDTA), and 1 mL/L Vitamin Solution (100 mg each of Biotin, pyridoxine, thiamine, riboflavin, p-aminobenzoic acid, and nicotinic acid in 100 mL water).
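The per-liter YPD recipe above scales linearly with batch volume. A small sketch of that arithmetic follows; the dictionary and function names are illustrative.

```python
# YPD composition in grams per liter, as listed above.
YPD_G_PER_L = {
    "Bacto Yeast Extract": 10.0,
    "Bacto Peptone": 20.0,
    "Dextrose": 20.0,
}


def scale_recipe(recipe_g_per_l, volume_l):
    """Return grams of each component needed for volume_l liters of medium."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: conc * volume_l for name, conc in recipe_g_per_l.items()}


if __name__ == "__main__":
    for name, grams in scale_recipe(YPD_G_PER_L, 0.5).items():
        print(f"{name}: {grams:g} g")
```

The same helper could be applied to the Complete Medium components that are given in g/L; stock solutions specified in mL/L would need a separate volume-based table.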
A suitable pH range for the fermentation often is from about pH 4.0 to about pH 8.0, where a pH in the range of about pH 5.5 to about pH 7.0 sometimes is utilized for initial culture conditions. Culturing may be conducted under aerobic or anaerobic conditions, where microaerobic conditions sometimes are maintained. A two-stage process may be utilized, where one stage promotes microorganism proliferation and another stage promotes production of target molecule. In a two-stage process, the first stage may be conducted under aerobic conditions (e.g., introduction of air and/or oxygen) and the second stage may be conducted under anaerobic conditions (e.g., air or oxygen are not introduced to the culture conditions).
A variety of fermentation processes may be applied for commercial biological production of a target product. In some embodiments, commercial production of a target product from a recombinant microbial host is conducted using a batch, fed-batch or continuous fermentation process, for example.
A batch fermentation process often is a closed system where the media composition is fixed at the beginning of the process and not subject to further additions beyond those required for maintenance of pH and oxygen level during the process. At the beginning of the culturing process the media is inoculated with the desired organism and growth or metabolic activity is permitted to occur without adding additional sources (i.e., carbon and nitrogen sources) to the medium. In batch processes the metabolite and biomass compositions of the system change constantly up to the time the culture is terminated. In a typical batch process, cells proceed through a static lag phase to a high-growth log phase and finally to a stationary phase, wherein the growth rate is diminished or halted. Left untreated, cells in the stationary phase will eventually die.
A variation of the standard batch process is the fed-batch process, where the carbon source is continually added to the fermentor over the course of the fermentation process. Fed-batch processes are useful when catabolite repression is apt to inhibit the metabolism of the cells or where it is desirable to have limited amounts of carbon source in the media at any one time.
The carbon source concentration in fed-batch systems may be estimated on the basis of changes in measurable factors such as pH, dissolved oxygen and the partial pressure of waste gases (e.g., CO2).
Batch and fed-batch culturing methods are known in the art. Examples of such methods may be found in Thomas D. Brock, Biotechnology: A Textbook of Industrial Microbiology, 2nd ed. (1989), Sinauer Associates, Sunderland, Mass., and Deshpande, Mukund V., Appl. Biochem. Biotechnol., 36:227 (1992).
In a continuous fermentation process, a defined medium often is continuously added to a bioreactor while an equal amount of culture volume is removed simultaneously for product recovery.
Continuous cultures generally maintain cells in the log phase of growth at a constant cell density. Continuous or semi-continuous culture methods permit the modulation of one factor or any number of factors that affect cell growth or end product concentration. For example, an approach may limit the carbon source and allow all other parameters to moderate metabolism. In some systems, a number of factors affecting growth may be altered continuously while the cell concentration, measured by media turbidity, is kept constant. Continuous systems often maintain steady state growth and thus the cell growth rate often is balanced against cell loss due to media being drawn off the culture. Methods of modulating nutrients and growth factors for continuous culture processes, as well as techniques for maximizing the rate of product formation, are known and a variety of methods are detailed by Brock, supra.
In various embodiments ethanol may be purified from the culture media or extracted from the engineered microorganisms. Culture media may be tested for ethanol concentration and drawn off when the concentration reaches a predetermined level. Detection methods are known in the art, including but not limited to the use of a hydrometer and infrared measurement of vibrational frequency of dissolved ethanol using the CH band at 2900 cm⁻¹. Ethanol may be present at a range of levels as described herein.
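The steady-state balance described above for continuous culture, in which growth is offset by cells lost as media is drawn off, can be sketched with the textbook model for an ideal, well-mixed vessel. This is an illustrative assumption and not a model stated in this document; all names and numbers are hypothetical.

```python
def dilution_rate(feed_rate_l_per_h, culture_volume_l):
    """Dilution rate D (1/h): culture volumes replaced per hour."""
    if culture_volume_l <= 0:
        raise ValueError("culture volume must be positive")
    return feed_rate_l_per_h / culture_volume_l


def net_biomass_change(growth_rate_per_h, dilution_rate_per_h, biomass_g_per_l):
    """dX/dt = (mu - D) * X for an ideal, well-mixed continuous culture."""
    return (growth_rate_per_h - dilution_rate_per_h) * biomass_g_per_l


if __name__ == "__main__":
    # Hypothetical vessel: 2 L working volume fed at 0.5 L/h.
    d = dilution_rate(0.5, 2.0)
    # Cell density is constant only when the growth rate equals D.
    print(d, net_biomass_change(d, d, 3.0))
```

In this simplified picture, cell density rises when the growth rate exceeds the dilution rate and washes out when it falls below it.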
A target product sometimes is retained within an engineered microorganism after a culture process is completed, and in certain embodiments, the target product is secreted out of the microorganism into the culture medium. For the latter embodiments, (i) culture media may be drawn from the culture system and fresh medium may be supplemented, and/or (ii) target product may be extracted from the culture media during or after the culture process is completed. Engineered microorganisms may be cultured on or in solid, semi-solid or liquid media. In some embodiments media is drained from cells adhering to a plate. In certain embodiments, a liquid-cell mixture is centrifuged at a speed sufficient to pellet the cells but not disrupt the cells and allow extraction of the media, as known in the art. The cells may then be resuspended in fresh media. Target product may be purified from culture media according to methods known in the art.
In certain embodiments, target product is extracted from the cultured engineered microorganisms. The microorganism cells may be concentrated through centrifugation at a speed sufficient to shear the cell membranes. In some embodiments, the cells may be physically disrupted (e.g., shear force, sonication) or chemically disrupted (e.g., contacted with detergent or other lysing agent). The phases may be separated by centrifugation or another method known in the art, and target product may be isolated according to known methods.
Commercial grade target product sometimes is provided in substantially pure form (e.g., 90% pure or greater, 95% pure or greater, 99% pure or greater or 99.5% pure or greater). In some embodiments, target product may be modified into any one of a number of downstream products. For example, ethanol may be derivatized or further processed to produce ethyl halides, ethyl esters, diethyl ether, acetic acid, ethyl amines, butadiene, solvents, food flavorings, distilled spirits and the like.
Target product may be provided within cultured microbes containing target product, and cultured microbes may be supplied fresh or frozen in a liquid media or dried. Fresh or frozen microbes may be contained in appropriate moisture-proof containers that may also be temperature controlled as necessary. Target product sometimes is provided in culture medium that is substantially cell-free. In some embodiments target product or modified target product purified from microbes is provided, and target product sometimes is provided in substantially pure form. In certain embodiments, ethanol can be provided in anhydrous or hydrous forms. Ethanol may be transported in a variety of containers including pints, quarts, liters, gallons, drums (e.g., 10 gallon or 55 gallon, for example) and the like.
In certain embodiments, a target product (e.g., ethanol, succinic acid) is produced with a yield of about 0.30 grams of target product, or greater, per gram of glucose added during a fermentation process (e.g., about 0.31 grams of target product per gram of glucose added, or greater; about 0.32 grams of target product per gram of glucose added, or greater; about 0.33 grams of target product per gram of glucose added, or greater; about 0.34 grams of target product per gram of glucose added, or greater; about 0.35 grams of target product per gram of glucose added, or greater; about 0.36 grams of target product per gram of glucose added, or greater; about 0.37 grams of target product per gram of glucose added, or greater; about 0.38 grams of target product per gram of glucose added, or greater; about 0.39 grams of target product per gram of glucose added, or greater; about 0.40 grams of target product per gram of glucose added, or greater; about 0.41 grams of target product per gram of glucose added, or greater; 0.42 grams of target product per gram of glucose added, or greater; 0.43 grams of target product per gram of glucose added, or greater; 0.44 grams of target product per gram of glucose added, or greater; 0.45 grams of target product per gram of glucose added, or greater; 0.46 grams of target product per gram of glucose added, or greater; 0.47 grams of target product per gram of glucose added, or greater; 0.48 grams of target product per gram of glucose added, or greater; 0.49 grams of target product per gram of glucose added, or greater; 0.50 grams of target product per gram of glucose added, or greater; 0.51 grams of target product per gram of glucose added, or greater; 0.52 grams of target product per gram of glucose added, or greater; 0.53 grams of target product per gram of glucose added, or greater; 0.54 grams of target product per gram of glucose added, or greater; 0.55 grams of target product per gram of glucose added, or greater; 0.56 grams of target 
product per gram of glucose added, or greater; 0.57 grams of target product per gram of glucose added, or greater; 0.58 grams of target product per gram of glucose added, or greater; 0.59 grams of target product per gram of glucose added, or greater; 0.60 grams of target product per gram of glucose added, or greater; 0.61 grams of target product per gram of glucose added, or greater; 0.62 grams of target product per gram of glucose added, or greater; 0.63 grams of target product per gram of glucose added, or greater; 0.64 grams of target product per gram of glucose added, or greater; 0.65 grams of target product per gram of glucose added, or greater; 0.66 grams of target product per gram of glucose added, or greater; 0.67 grams of target product per gram of glucose added, or greater; 0.68 grams of target product per gram of glucose added, or greater; 0.69 or 0.70 grams of target product per gram of glucose added or greater). In some embodiments, 0.45 grams of target product per gram of glucose added, or greater, is produced during the fermentation process.
In some embodiments, the maximum theoretical yield of ethanol from glucose is about 0.51 grams of ethanol produced per gram of glucose consumed (e.g., about 6% g/g). In certain embodiments, the maximum theoretical yield of ethanol from xylose is about 0.51 grams of ethanol produced per gram of xylose consumed (e.g., about 6% g/g). In some embodiments, the theoretical maximum mole to mole ratio of ethanol produced to glucose consumed is about 2 moles of ethanol produced per mole of glucose consumed. In some embodiments, the theoretical maximum mole to mole ratio of ethanol produced to pentose consumed is about 2 moles of ethanol produced per mole of pentose sugar consumed. In certain embodiments, engineered strains described herein can produce between about 1% g/g of ethanol and about 6% g/g of ethanol, dependent on feedstock and genetic background (e.g., about 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8 or 5.9% g/g). In some embodiments, engineered strains described herein produce between about 60% and 100% of theoretical maximum yield (e.g., about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95% or about 100%). In certain embodiments, engineered strains described herein yield between about a 1% and about a 10% increase in ethanol yield when compared to parental controls (e.g., about a 1% increase, about a 1.5% increase, about a 2% increase, about a 2.5% increase, about a 3% increase, about a 3.5% increase, about a 4% increase, about a 4.5% increase, about a 5% increase, about a 5.5% increase, about a 6% increase, about a 6.5% increase, about a 7% increase, about a 7.5% increase, about an 8% increase, about an 8.5% increase, about a 9% increase, about a 9.5% increase, and about a 10% increase).
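The figure of about 0.51 grams of ethanol per gram of glucose follows from the 2:1 mole ratio stated above. The sketch below works through that arithmetic and expresses an observed yield as a percentage of the theoretical maximum. The molar masses are standard values that are not given in this document, and the function names are hypothetical.

```python
# Molar masses in g/mol (standard values; not stated in the text).
M_GLUCOSE = 180.156  # C6H12O6
M_ETHANOL = 46.069   # C2H6O


def theoretical_yield_g_per_g(mol_product_per_mol_sugar, m_product, m_sugar):
    """Mass yield (g product per g sugar) implied by a molar ratio."""
    return mol_product_per_mol_sugar * m_product / m_sugar


def percent_of_theoretical(observed_g_per_g, theoretical_g_per_g):
    """Observed mass yield as a percentage of the theoretical maximum."""
    return 100.0 * observed_g_per_g / theoretical_g_per_g


if __name__ == "__main__":
    # 2 mol ethanol per mol glucose, as stated in the text.
    y_max = theoretical_yield_g_per_g(2, M_ETHANOL, M_GLUCOSE)
    print(f"theoretical maximum: {y_max:.3f} g ethanol per g glucose")
    # 0.45 g/g is one of the yields recited in the text.
    print(f"0.45 g/g is {percent_of_theoretical(0.45, y_max):.0f}% of theoretical")
```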
In some embodiments, engineered yeast strains described herein show between about a 1 fold (e.g., 1X) and about a 100 fold (e.g., 100X) increase in ethanol production when compared to a parental control under identical fermentation conditions (e.g., about 1X, 1.1X, 1.2X, 1.3X, 1.4X, 1.5X, 1.6X, 1.7X, 1.8X, 1.9X, 2.0X, 2.1X, 2.2X, 2.3X, 2.4X, 2.5X, 2.6X, 2.7X, 2.8X, 2.9X, 3.0X, 3.1X, 3.2X, 3.3X, 3.4X, 3.5X, 3.6X, 3.7X, 3.8X, 3.9X, 4.0X, 4.5X, 5.0X, 5.5X, 6.0X, 6.5X, 7.0X, 7.5X, 8.0X, 8.5X, 9.0X, 9.5X, 10X, 11X, 12X, 13X, 14X, 15X, 16X, 17X, 18X, 19X, 20X, 25X, 30X, 35X, 40X, 45X, 50X, 60X, 65X, 70X, 75X, 80X, 85X, 90X, 95X, or about 100X).
Examples
The examples set forth below illustrate certain embodiments and do not limit the technology. Certain examples set forth below utilize standard recombinant DNA and other biotechnology protocols known in the art. Many such techniques are described in detail in Maniatis, T., E. F. Fritsch and J. Sambrook (1982) Molecular Cloning: a Laboratory Manual; Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. DNA mutagenesis can be accomplished using the Stratagene (San Diego, CA) "QuickChange" kit according to the manufacturer's instructions, or by one of the other types of mutagenesis described above.
Example 1: Activation of the Entner-Doudoroff Pathway in Yeast Cells
Genomic DNA from Zymomonas mobilis (ZM4) was obtained from the American Type Culture Collection (ATCC accession number 31821D-5). The genes encoding phosphogluconate dehydratase EC 4.2.1.12 (referred to as "edd") and 2-keto-3-deoxygluconate-6-phosphate aldolase EC 4.1.2.14 (referred to as "eda") were isolated from the ZM4 genomic DNA using the following oligonucleotides:
The ZM4 eda gene:
5'-aactgactagtaaaaaaatgcgtgatatcgattcc-3' (SEQ ID NO:1)
5'-agtaactcgagctactaggcaacagcagcgcgcttg-3' (SEQ ID NO:2)
The ZM4 edd gene:
5'-aactgactagtaaaaaaatgactgatctgcattcaacg-3' (SEQ ID NO:3)
5'-agtaactcgagctactagataccggcacctgcatatattgc-3' (SEQ ID NO:4)
E. coli genomic DNA was prepared using the Qiagen DNeasy blood and tissue kit according to the manufacturer's protocol. The E. coli edd and eda constructs were isolated from E. coli genomic DNA using the following oligonucleotides:
The E. coli eda gene:
5'-aactgactagtaaaaaaatgaaaaactggaaaacaagtgcagaatc-3' (SEQ ID NO:5)
5'-agtaactcgagctactacagcttagcgccttctacagcttcacg-3' (SEQ ID NO:6)
The E. coli edd gene:
5'-aactgactagtaaaaaaatgaatccacaattgttacgcgtaacaaatcg-3' (SEQ ID NO:7)
5'-agtaactcgagctactaaaaagtgatacaggttgcgccctgttcggcac-3' (SEQ ID NO:8)
All oligonucleotides set forth above were purchased from Integrated DNA Technologies ("IDT", Coralville, IA). These oligonucleotides were designed to incorporate a SpeI restriction endonuclease cleavage site upstream and an XhoI restriction endonuclease cleavage site downstream of the edd and eda gene constructs such that these sites could be used to clone these genes into yeast expression vectors p426GPD (ATCC accession number 87361) and p425GPD (ATCC accession number 87359). In addition to incorporating restriction endonuclease cleavage sites, the forward oligonucleotides were designed to incorporate six consecutive A nucleotides (AAAAAA) immediately upstream of the ATG initiation codon. This ensured that there was a conserved Kozak sequence important for efficient translation initiation in yeast. Cloning the edd and eda genes from ZM4 and E. coli genomic DNA was accomplished using the following procedure: About 100 ng of ZM4 or E. coli genomic DNA, 1 µM of the oligonucleotide primer set listed above, 2.5 U of PfuUltra High-Fidelity DNA polymerase (Stratagene), 300 µM dNTPs (Roche), and 1X PfuUltra reaction buffer were mixed in a final reaction volume of 50 µl. A BIORAD DNA Engine Tetrad 2 Peltier thermal cycler was used for the PCR reactions and the following cycle conditions were used: 5 min denaturation step at 95 °C, followed by 30 cycles of 20 sec at 95 °C, 20 sec at 55 °C, and 1 min at 72 °C, and a final step of 5 min at 72 °C.
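The primer design rules described above (a SpeI site followed by six A nucleotides and the ATG codon in each forward primer, and an XhoI site in each reverse primer) can be checked mechanically. The following is a minimal sketch: the primer strings are transcribed from SEQ ID NOs 1 to 8 above, the recognition sequences are the standard ones for SpeI (ACTAGT) and XhoI (CTCGAG), and the function and dictionary names are hypothetical.

```python
SPEI_SITE = "actagt"  # SpeI recognition sequence
XHOI_SITE = "ctcgag"  # XhoI recognition sequence

# (forward, reverse) primer pairs transcribed from SEQ ID NOs 1-8.
PRIMERS = {
    "ZM4 eda": ("aactgactagtaaaaaaatgcgtgatatcgattcc",
                "agtaactcgagctactaggcaacagcagcgcgcttg"),
    "ZM4 edd": ("aactgactagtaaaaaaatgactgatctgcattcaacg",
                "agtaactcgagctactagataccggcacctgcatatattgc"),
    "E. coli eda": ("aactgactagtaaaaaaatgaaaaactggaaaacaagtgcagaatc",
                    "agtaactcgagctactacagcttagcgccttctacagcttcacg"),
    "E. coli edd": ("aactgactagtaaaaaaatgaatccacaattgttacgcgtaacaaatcg",
                    "agtaactcgagctactaaaaagtgatacaggttgcgccctgttcggcac"),
}


def check_pair(forward, reverse):
    """Return (forward_ok, reverse_ok) for the design rules in the text."""
    # Forward primer: SpeI site, then AAAAAA, then the ATG start codon.
    forward_ok = (SPEI_SITE + "aaaaaa" + "atg") in forward.lower()
    # Reverse primer: XhoI site present.
    reverse_ok = XHOI_SITE in reverse.lower()
    return forward_ok, reverse_ok


if __name__ == "__main__":
    for gene, (fwd, rev) in PRIMERS.items():
        print(gene, check_pair(fwd, rev))
```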
In an attempt to maximize expression of the ZM4 edd and eda genes in yeast, two different approaches were undertaken to optimize the ZM4 edd and eda genes. The first approach was to remove translational pauses from the polynucleotide sequence by designing the gene to incorporate only codons that are preferred in yeast. This optimization is referred to as the "hot rod" optimization. In the second approach, translational pauses which are present in the native organism gene sequence are matched in the heterologous expression host organism by substituting the codon usage pattern of that host organism. This optimization is referred to as the "matched" optimization. The final gene and protein sequences for edd and eda from the ZM4 native, hot rod (HR) and matched versions, as well as the E. coli native, are shown in Figure 6. Certain sequences in Figure 6 are presented at the end of this Example 1. The matched versions of the ZM4 edd and ZM4 eda genes were synthesized by IDT, and the hot rod version was constructed using methods described in Larsen et al. (Int. J. Bioinform. Res. Appl.; 2008:4[3]; 324-336).
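Both optimizations change codon choice without changing the encoded protein. As a small illustration, the sketch below translates the first 54 nucleotides of the ZM4 HR EDA GENE and ZM4 MATCHED EDA GENE listings given at the end of this Example and compares the results. It uses the standard genetic code; the helper names are hypothetical and the excerpts are transcriptions, not the full sequences.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def codon_differences(seq_a, seq_b):
    """Count codon positions at which two equal-length sequences differ."""
    pairs = zip(range(0, len(seq_a), 3), range(0, len(seq_b), 3))
    return sum(seq_a[i:i + 3] != seq_b[j:j + 3] for i, j in pairs)


# First 54 nt of each listing (transcribed from the sequences in this Example).
HR_EDA_START = "ATGAGAGACATTGATTCTGTTATGAGATTGGCTCCAGTTATGCCAGTCTTGGTT"
MATCHED_EDA_START = "ATGAGGGATATTGATAGTGTGATGAGGTTAGCCCCTGTTATGCCTGTTCTCGTT"

if __name__ == "__main__":
    print(translate(HR_EDA_START))
    print(translate(MATCHED_EDA_START))
    print(codon_differences(HR_EDA_START, MATCHED_EDA_START))
```

Over this excerpt the two versions encode the same 18 residues while differing at many codon positions.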
Each version of each edd and eda gene was inserted into the yeast expression vector p426GPD (GPD promoter, 2 micron, URA3) (ATCC accession number 87361) between the SpeI and XhoI cloning sites. Each version of the eda gene was also inserted into the SpeI and XhoI sites of the yeast expression vector p425GPD (GPD promoter, 2 micron, LEU2) (ATCC accession number 87359). For each edd and eda version, 3' His-tagged and non-tagged p426GPD constructs were made. Please refer to Table 1 for all oligonucleotides used for PCR amplification of edd and eda constructs for cloning into the p425GPD and p426GPD vectors. All cloning procedures were conducted according to standard cloning procedures described by Maniatis et al. Each edd and eda p426GPD construct was transformed into Saccharomyces cerevisiae strain BY4742 (MATalpha his3delta1 leu2delta0 lys2delta0 ura3delta0) (ATCC accession number 201389). This strain has a deletion of the his3 gene, which encodes an imidazoleglycerol-phosphate dehydratase that catalyzes the sixth step in histidine biosynthesis; a deletion of the leu2 gene, which encodes a beta-isopropylmalate dehydrogenase that catalyzes the third step in the leucine biosynthesis pathway; a deletion of the lys2 gene, which encodes an alpha aminoadipate reductase that catalyzes the fifth step in biosynthesis of lysine; and a deletion of the ura3 gene, which encodes an orotidine-5'-phosphate decarboxylase that catalyzes the sixth enzymatic step in the de novo biosynthesis of pyrimidines. The genotype of BY4742 makes it an auxotroph for histidine, leucine, lysine and uracil.
Transformation of the p426GPD plasmids containing an edd or an eda variant gene into yeast strain BY4742 was accomplished using the Zymo Research frozen-EZ yeast transformation II kit according to the manufacturer's protocol. The transformed BY4742 cells were selected by growth on a synthetic dextrose medium (SD) (0.67% yeast nitrogen base-2% dextrose) containing complete amino acids minus uracil (Krackeler Scientific Inc). Plates were incubated at about 30 °C for about 48 hours. Transformant colonies for each edd and eda variant were inoculated into 5 ml of SD minus uracil medium and cells were grown at about 30 °C and shaken at about 250 rpm for about 24 hours. Cells were harvested by centrifugation at 1000 x g for about 5 minutes, after which protein crude extract was prepared with Y-PER Plus (Thermo Scientific) according to the manufacturer's instructions. Whole cell extract protein concentrations were determined using the Coomassie Plus Protein Assay (Thermo Scientific) according to the manufacturer's directions. For each edd and eda variant His-tagged construct, about 10 µg of soluble and insoluble fractions were loaded on 4-12% NuPAGE Novex Bis-Tris protein gels (Invitrogen) and proteins were analyzed by western blot using anti-(His)6 mouse monoclonal antibody (Abcam) and HRP-conjugated secondary antibody (Abcam). Supersignal West Pico Chemiluminescent substrate (Thermo Scientific) was used for western detection according to the manufacturer's instructions. All edd variants showed expression in both soluble and insoluble fractions whereas only the E. coli eda variant showed expression in the soluble fraction. In order to confirm that edd and eda variants were functional in yeast, the combined edd and eda activities were assayed by the formation of pyruvate, coupled to the NADH-dependent activity of lactate dehydrogenase.
Transformation of combined edd (in p426GPD) and eda (in p425GPD) constructs was accomplished with the Zymo Research frozen-EZ yeast transformation II kit according to the manufacturer's protocol. As a negative control, p425GPD and p426GPD vectors were also transformed into BY4742. Transformants (16 different combinations total including the variant edd and eda combinations plus vector controls) were selected on synthetic dextrose medium (SD) (0.67% yeast nitrogen base-2% dextrose) containing complete amino acids minus uracil and leucine. Transformants of edd and eda variant combinations were inoculated into 5 ml of SD minus uracil and leucine and cells were grown at about 30 °C in shaker flasks at about 250 rpm for about 24 hours. Fresh overnight culture was used to inoculate about 100 ml of SD media minus uracil and leucine (containing about 0.01 g/L ergosterol and about 400 µl of Tween 80) to an initial inoculum OD600nm of about 0.1 and grown anaerobically at about 30 °C for approximately 14 hours until cells reached an OD600nm of 3-4. The cells were centrifuged at about 3000 x g for about 10 minutes. The cells were then washed with 25 ml deionized H2O and centrifuged at 3000 x g for 10 min. The cells were resuspended (at about 2 ml/g of cell pellet) in lysis buffer (50 mM Tris-Cl pH 7, 10 mM MgCl2, 1X Calbiochem protease inhibitor cocktail set III). Approximately 900 µl of glass beads were added and cells were lysed by vortexing at maximum speed for 4 x 30 seconds. Cell lysate was removed from the glass beads, placed into fresh tubes and spun at about 10,000 x g for about 10 minutes at about 4 °C. The supernatant containing whole cell extract (WCE) was transferred to a fresh tube. WCE protein concentrations were measured using the Coomassie Plus Protein Assay (Thermo Scientific) according to the manufacturer's directions. A total of about 750 µg of WCE was used for the edd and eda coupled assay.
For this assay, about 750 µg of WCE was mixed with about 2 mM 6-phosphogluconate and about 4.5 U lactate dehydrogenase in a final volume of about 400 µl. A total of about 100 µl of NADH was added to this reaction to a final concentration of about 0.3 mM, and NADH oxidation was monitored for about 10 minutes at about 340 nm using a DU800 spectrophotometer.
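In a coupled assay of this kind, the rate of absorbance decrease at 340 nm is commonly converted to an enzyme activity with the Beer-Lambert law. The sketch below shows one way to do that conversion. It is not a calculation given in this document: the extinction coefficient is the standard literature value for NADH, the 1 cm path length and the 0.5 ml total volume (400 µl plus 100 µl) are assumptions, the slope is an invented example value, and the function name is hypothetical.

```python
# Molar extinction coefficient of NADH at 340 nm (standard literature value).
NADH_EXTINCTION_MM_CM = 6.22  # mM^-1 cm^-1


def specific_activity(delta_a340_per_min, reaction_volume_ml, protein_mg,
                      path_length_cm=1.0):
    """Return umol NADH oxidized per minute per mg of extract protein."""
    if protein_mg <= 0 or path_length_cm <= 0 or reaction_volume_ml <= 0:
        raise ValueError("protein, path length and volume must be positive")
    # Beer-Lambert: rate of concentration change in mM per minute.
    rate_mm_per_min = abs(delta_a340_per_min) / (
        NADH_EXTINCTION_MM_CM * path_length_cm)
    # mM multiplied by mL gives umol.
    umol_per_min = rate_mm_per_min * reaction_volume_ml
    return umol_per_min / protein_mg


if __name__ == "__main__":
    # Hypothetical slope of -0.0622 A340/min; 0.75 mg WCE as in the text.
    print(specific_activity(-0.0622, 0.5, 0.75))
```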
ZM4 HR EDA GENE
ATGAGAGACATTGATTCTGTTATGAGATTGGCTCCAGTTATGCCAGTCTTGGTTAT
AGAAGATATAGCTGATGCTAAGCCAATTGCTGAGGCTTTGGTTGCTGGTGGTTTAA
ATGTTTTGGAAGTTACATTGAGAACTCCATGTGCTTTGGAAGCTATTAAAATTATG
AAGGAAGTTCCAGGTGCTGTTGTTGGTGCTGGTACTGTTTTAAACGCTAAAATGTT
GGATCAAGCTCAAGAAGCTGGTTGTGAGTTCTTTGTATCACCAGGTTTGACTGCTG
ATTTGGGAAAACATGCTGTTGCTCAAAAAGCGGCTCTTCTACCAGGGGTTGCTAAT
GCTGCTGATGTTATGTTGGGATTGGATTTGGGTTTGGATAGATTTAAATTCTTCCC
AGCTGAAAATATAGGTGGTTTGCCAGCTTTAAAATCTATGGCTTCTGTTTTTAGAC
AAGTTAGATTTTGTCCAACTGGAGGAATTACTCCGACTTCTGCTCCAAAATATTTG
GAAAATCCATCTATTTTGTGTGTTGGTGGTTCTTGGGTTGTTCCAGCGGGTAAACC
AGATGTTGCGAAAATTACTGCTTTGGCTAAAGAGGCTTCAGCTTTTAAAAGAGCTG
CTGTGGCGTAG
ZM4 HR EDD GENE
ATGACGGATTTGCATTCAACTGTTGAGAAAGTAACTGCTAGAGTAATTGAAAGATC AAGGGAAACTAGAAAGGCTTATTTGGATTTGATACAATATGAGAGGGAAAAAGGTG TTGATAGACCAAATTTGTCTTGTTCTAATTTGGCTCATGGTTTTGCTGCTATGAAT GGTGATAAACCAGCTTTGAGAGATTTTAATAGAATGAATATAGGTGTAGTTACTTC TTATAATGATATGTTGTCTGCTCATGAACCATATTATAGATATCCAGAACAAATGA AGGTTTTTGCTCGTGAAGTTGGTGCTACAGTTCAAGTTGCTGGTGGTGTTCCTGCA ATGTGTGATGGTGTTACTCAAGGTCAACCAGGTATGGAAGAATCTTTGTTTTCCAG AGATGTAATTGCTTTGGCTACATCTGTTTCATTGTCTCACGGAATGTTTGAAGGTG CTGCATTGTTGGGAATTTGTGATAAAATTGTTCCAGGTTTGTTGATGGGTGCTTTG AGGTTCGGTCATTTGCCAACTATTTTGGTTCCATCTGGTCCAATGACTACTGGAAT CCCAAATAAAGAAAAGATTAGAATTAGACAATTGTATGCTCAAGGAAAAATTGGTC AAAAGGAATTGTTGGATATGGAAGCTGCCTGTTATCATGCTGAAGGTACTTGTACT TTTTATGGTACTGCTAACACTAATCAGATGGTTATGGAAGTTTTGGGTTTGCACAT GCCAGGTAGTGCATTCGTTACTCCAGGTACTCCACTGAGACAGGCTTTGACTAGAG CTGCTGTTCATAGAGTTGCAGAGTTGGGTTGGAAAGGTGATGATTATAGACCTTTG GGTAAAATTATTGATGAGAAATCTATTGTTAATGCTATTGTTGGTTTGTTAGCTAC AGGTGGTTCTACAAATCATACAATGCATATTCCGGCCATAGCTAGAGCAGCAGGGG TTATAGTTAATTGGAATGATTTTCATGATTTGTCTGAAGTTGTTCCATTGATTGCT AGAATTTATCCAAATGGTCCTAGAGATATAAATGAATTTCAAAATGCAGGAGGAAT GGCTTATGTAATTAAAGAATTGTTGAGTGCGAATTTGTTAAATAGAGATGTTACTA CTATTGCTAAAGGAGGGATAGAAGAATATGCTAAAGCTCCAGCTCTGAACGATGCG GGTGAATTGGTGTGGAAACCGGCTGGCGAACCTGGGGACGACACAATTTTGAGACC AGTATCTAATCCATTTGCTAAAGATGGTGGTTTGCGTCTCTTGGAAGGTAATTTGG GTAGAGCAATGTATAAGGCTTCTGCTGTAGATCCAAAATTCTGGACTATTGAAGCT CCCGTTAGAGTTTTCTCTGATCAAGATGATGTTCAAAAGGCTTTTAAAGCAGGCGA GTTAAATAAAGATGTTATAGTTGTTGTTAGATTTCAAGGTCCTCGTGCTAATGGTA TGCCTGAATTGCATAAGTTGACTCCTGCGCTAGGCGTATTGCAAGATAATGGTTAT AAGGTTGCTTTAGTTACTGATGGTAGAATGTCTGGTGCAACTGGTAAAGTACCGGT GGCTCTGCATGTTTCACCAGAGGCTTTAGGAGGTGGGGCGATTGGCAAGTTGAGAG ATGGCGATATAGTTAGAATTTCTGTTGAAGAAGGTAAATTAGAGGCTCTTGTCCCC GCCGACGAGTGGAATGCTAGACCACATGCTGAGAAGCCCGCTTTTAGACCTGGTAC TGGGAGAGAATTGTTTGACATTTTTAGACAAAACGCTGCTAAGGCTGAGGATGGTG CAGTTGCAATTTATGCTGGGGCAGGGATCTAG
ZM4 MATCHED EDA GENE
ATGAGGGATATTGATAGTGTGATGAGGTTAGCCCCTGTTATGCCTGTTCTCGTTAT TGAAGATATTGCAGATGCCAAACCTATTGCCGAAGCACTCGTTGCAGGTGGTCTAA ACGTTCTAGAAGTGACACTAAGGACTCCTTGTGCACTAGAAGCTATTAAGATTATG AAGGAAGTTCCTGGTGCTGTTGTTGGTGCTGGTACAGTTCTAAACGCCAAAATGCT CGACCAGGCACAAGAAGCAGGTTGCGAATTTTTCGTTTCACCTGGTCTAACTGCCG ACCTCGGAAAGCACGCAGTTGCTCAAAAAGCCGCATTACTACCCGGTGTTGCAAAT GCAGCAGATGTGATGCTAGGTCTAGACCTAGGTCTAGATAGGTTCAAGTTCTTCCC TGCCGAAAACATTGGTGGTCTACCTGCTCTAAAGAGTATGGCATCAGTTTTCAGGC AAGTTAGGTTCTGCCCTACTGGAGGTATAACTCCTACAAGTGCACCTAAATATCTA GAAAACCCTAGTATTCTATGCGTTGGTGGTTCATGGGTTGTTCCTGCCGGAAAACC CGATGTTGCCAAAATTACAGCCCTCGCAAAAGAAGCAAGTGCATTCAAGAGGGCAG CAGTTGCTTAG
ZM4 MATCHED EDD GENE
ATGACGGATCTACATAGTACAGTGGAGAAGGTTACTGCCAGGGTTATTGAAAGGAG TAGGGAAACTAGGAAGGCATATCTAGATTTAATTCAATATGAGAGGGAAAAAGGAG TGGACAGGCCCAACCTAAGTTGTAGCAACCTAGCACATGGATTCGCCGCAATGAAT GGTGACAAGCCCGCATTAAGGGACTTCAACAGGATGAATATTGGAGTTGTGACGAG TTACAACGATATGTTAAGTGCACATGAACCCTATTATAGGTATCCTGAGCAAATGA AGGTGTTTGCAAGGGAAGTTGGAGCCACAGTTCAAGTTGCTGGTGGAGTGCCTGCA ATGTGCGATGGTGTGACTCAGGGTCAACCTGGAATGGAAGAATCCCTATTTTCAAG GGATGTTATTGCATTAGCAACTTCAGTTTCATTATCACATGGTATGTTTGAAGGGG CAGCTCTACTCGGTATATGTGACAAGATTGTTCCTGGTCTACTAATGGGAGCACTA AGGTTTGGTCACCTACCTACTATTCTAGTTCCCAGTGGACCTATGACAACGGGTAT ACCTAACAAAGAAAAAATTAGGATTAGGCAACTCTATGCACAAGGTAAAATTGGAC AAAAAGAACTACTAGATATGGAAGCCGCATGCTACCATGCAGAAGGTACTTGCACT TTCTATGGTACAGCCAACACTAACCAGATGGTTATGGAAGTTCTCGGTCTACATAT GCCCGGTAGTGCCTTTGTTACTCCTGGTACTCCTCTCAGGCAAGCACTAACTAGGG CAGCAGTGCATAGGGTTGCAGAATTAGGTTGGAAGGGAGACGATTATAGGCCTCTA GGTAAAATTATTGACGAAAAAAGTATTGTTAATGCAATTGTTGGTCTATTAGCCAC TGGTGGTAGTACTAACCATACGATGCATATTCCTGCTATTGCAAGGGCAGCAGGTG TTATTGTTAACTGGAATGACTTCCATGATCTATCAGAAGTTGTTCCTTTAATTGCT AGGATTTACCCTAATGGACCTAGGGACATTAACGAATTTCAAAATGCCGGAGGAAT GGCATATGTTATTAAGGAACTACTATCAGCAAATCTACTAAACAGGGATGTTACAA CTATTGCTAAGGGAGGTATAGAAGAATACGCTAAGGCACCTGCCCTAAATGATGCA GGAGAATTAGTTTGGAAGCCCGCAGGAGAACCTGGTGATGACACTATTCTAAGGCC TGTTTCAAATCCTTTCGCCAAAGATGGAGGTCTAAGGCTCTTAGAAGGTAACCTAG GAAGGGCCATGTACAAGGCTAGCGCCGTTGATCCTAAATTCTGGACTATTGAAGCC CCTGTTAGGGTTTTCTCAGACCAGGACGATGTTCAAAAAGCCTTCAAGGCAGGAGA ACTAAACAAAGACGTTATTGTTGTTGTTAGGTTCCAAGGACCTAGGGCCAACGGTA TGCCTGAATTACATAAGCTAACTCCTGCATTAGGTGTTCTACAAGATAATGGATAC AAAGTTGCATTAGTGACGGATGGTAGGATGAGTGGTGCAACTGGTAAAGTTCCTGT TGCATTACATGTTTCACCCGAAGCACTAGGAGGTGGTGCTATTGGTAAACTTAGGG ATGGAGATATTGTTAGGATTAGTGTTGAAGAAGGAAAACTTGAAGCACTCGTTCCC GCAGATGAGTGGAATGCAAGGCCTCATGCAGAAAAACCTGCATTCAGGCCTGGGAC TGGGAGGGAATTATTTGATATTTTCAGGCAAAATGCAGCAAAAGCAGAAGACGGTG CCGTTGCCATCTATGCCGGTGCTGGTATATAG
Example 2: Inactivation of the Embden-Meyerhof Pathway in Yeast
Saccharomyces cerevisiae strain YGR240CBY4742 was obtained from the ATCC (accession number 4015893). This strain is genetically identical to S. cerevisiae strain BY4742, except that YGR240C, the gene encoding the PFK1 enzyme, which is the alpha subunit of heterooctameric phosphofructokinase, has been deleted. A DNA construct designed to delete the gene encoding the PFK2 enzyme via homologous recombination was prepared. This construct substituted the gene encoding HIS3 (imidazoleglycerol-phosphate dehydratase, an enzyme required for synthesis of histidine) for the PFK2 gene. The DNA construct comprised, in the 5' to 3' direction, 100 bp of the 5' end of the open reading frame of PFK2, followed by the HIS3 promoter, HIS3 open reading frame, and HIS3 terminator, and 100 bp of the 3' end of the PFK2 open reading frame.
This construct was prepared by two rounds of PCR. In the first round, about 100ng of BY4742 genomic DNA was used as a template. The genomic DNA was prepared from cells using the Zymo Research Yeastar kit according to the manufacturer's instructions. PCR was performed using the following primers:
5'-tgcatattccgttcaatcttataaagctgccatagatttttacaccaagtcgttttaagagcttggtgagcgcta-3' (SEQ ID NO:9)
5'-cttgccagtgaatgacctttggcattctcatggaaacttcagtttcatagtcgagttcaagagaaaaaaaaagaa-3' (SEQ ID NO:10)
The PCR reaction conditions were the same as those set forth in Example 1 for preparing the edd and eda genes.
For the second round of PCR, approximately 1 µl of the first PCR product was used as a template. The second round of PCR was performed with the following primer set:
5'-atgactgttactactccttttgtgaatggtacttcttattgtaccgtcactgcatattccgttcaatcttataaa-3' (SEQ ID NO:11)
5'-ttaatcaactctctttcttccaaccaaatggtcagcaatgagtctggtagcttgccagtgaatgacctttggcat-3' (SEQ ID NO:12)
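The two rounds of PCR extend the construct because the 3' end of each second-round primer matches the 5' end of the corresponding first-round primer. The sketch below measures that overlap for SEQ ID NOs 9 to 12. The sequences are transcribed on the assumption that every base is a, c, g or t, and the function and constant names are hypothetical.

```python
def suffix_prefix_overlap(outer, inner):
    """Length of the longest suffix of `outer` that is a prefix of `inner`."""
    for k in range(min(len(outer), len(inner)), 0, -1):
        if outer.endswith(inner[:k]):
            return k
    return 0


# First-round primers (SEQ ID NOs 9 and 10).
ROUND1_FWD = ("tgcatattccgttcaatcttataaagctgccatagatttttacaccaagtcg"
              "ttttaagagcttggtgagcgcta")
ROUND1_REV = ("cttgccagtgaatgacctttggcattctcatggaaacttcagtttcatagtc"
              "gagttcaagagaaaaaaaaagaa")
# Second-round primers (SEQ ID NOs 11 and 12).
ROUND2_FWD = ("atgactgttactactccttttgtgaatggtacttcttattgtaccgtcac"
              "tgcatattccgttcaatcttataaa")
ROUND2_REV = ("ttaatcaactctctttcttccaaccaaatggtcagcaatgagtctggtag"
              "cttgccagtgaatgacctttggcat")

if __name__ == "__main__":
    print(suffix_prefix_overlap(ROUND2_FWD, ROUND1_FWD))
    print(suffix_prefix_overlap(ROUND2_REV, ROUND1_REV))
```

An overlap of a couple of dozen bases on each side is what allows the second-round primers to anneal to the first-round product and add the remaining PFK2 homology.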
PCR conditions for this reaction were the same as for the first reaction immediately above. The final PCR product was separated by agarose gel electrophoresis, excised, and purified using the MP Biomedicals Geneclean II kit according to the manufacturer's instructions. Approximately 2 µg of the purified DNA was used for transformation of the yeast strain YGR240CBY4742 by the lithium acetate procedure as described by Schiestl and Gietz, with an additional recovery step added after the heat shock step. Essentially, after heat shock, cells were centrifuged at 500 x g for 2 min, resuspended in 1 ml of YP-Ethanol (1% yeast extract-2% peptone-2% ethanol) and incubated at 30 °C for 2 hours prior to plating on selective media containing SC-Ethanol (0.67% yeast nitrogen base-2% ethanol) containing complete amino acids minus histidine. The engineered transformant strain, referred to as YGR240CBY4742ΔPFK2, has the PFK1 and PFK2 genes deleted and is an auxotroph for leucine, uracil and lysine. The YGR240CBY4742ΔPFK2 strain was used for transformation of the combination of edd-p426 GPD (edd variants in p426 GPD) and eda-p425 GPD (eda variants in p425 GPD) variant constructs. A total of 16 combinations of edd-p426 GPD and eda-p425 GPD variant constructs were tested. Each combination was transformed into YGR240CBY4742ΔPFK2. For all transformations, 1 µg of edd-p426 GPD and 1 µg of eda-p425 GPD were used. All transformants from each edd-p426 GPD and eda-p425 GPD construct combination were selected on SC-Ethanol (0.67% yeast nitrogen base-2% ethanol) containing complete amino acids minus uracil and leucine.
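The selection logic used in these Examples can be summarized as set arithmetic: each marker gene relieves one nutritional requirement of the host, and the selective medium omits exactly the nutrients supplied by the markers being selected for. The sketch below is illustrative only. The names are hypothetical, and the leucine marker carried by p425GPD is labeled LEU2 here as an assumption.

```python
# Auxotrophies of BY4742 as stated in Example 1.
BY4742_AUXOTROPHIES = {"histidine", "leucine", "lysine", "uracil"}

# Nutrient requirement relieved by each marker gene (labels are assumptions).
MARKER_COMPLEMENTS = {
    "HIS3": "histidine",
    "LEU2": "leucine",
    "URA3": "uracil",
}


def remaining_auxotrophies(host_auxotrophies, markers):
    """Requirements still present after the given markers are introduced."""
    relieved = {MARKER_COMPLEMENTS[m] for m in markers}
    return set(host_auxotrophies) - relieved


def dropout_for_selection(markers):
    """Nutrients to omit so that only cells carrying all markers can grow."""
    return {MARKER_COMPLEMENTS[m] for m in markers}


if __name__ == "__main__":
    # HIS3 replaces PFK2 in Example 2.
    print(sorted(remaining_auxotrophies(BY4742_AUXOTROPHIES, ["HIS3"])))
    # Co-selection of a URA3 plasmid and a leucine-marker plasmid.
    print(sorted(dropout_for_selection(["URA3", "LEU2"])))
```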
To confirm that the edd and eda variants are functional in yeast, a complementation test for growth of the YGR240CBY4742ΔPFK2 strain on YPD (1% yeast extract-2% peptone-2% dextrose) and YPGluconate (1% yeast extract-2% peptone-2% gluconate) was performed. Viable colonies of edd-p426 GPD and eda-p425 GPD variant construct combinations grown on SC-Ethanol minus uracil and leucine were patched to plates containing SC-Ethanol minus uracil and leucine and incubated at 30 °C for 48 hrs. These patches were used to inoculate 5 ml of YPD media to an initial inoculum OD600nm of 0.1 and the cells were grown anaerobically at 30 °C for 3 to 7 days.
Example 3: Preparation of Carbon Dioxide Fixing Yeast Cells
Total genomic DNA from Zymomonas mobilis was obtained from ATCC (ATCC Number 31821). The Z. mobilis gene encoding the enzyme phosphoenolpyruvate carboxylase ("PEP carboxylase") was isolated from this genomic DNA and cloned using PCR amplification. PCR was performed in a total volume of about 50 micro-liters in the presence of about 20 nanograms of Z. mobilis genomic DNA, about 0.2 mM of 5' forward primer, about 0.2 mM of 3' reverse primer, about 0.2 mM of dNTP, about 1 micro-liter of PfuUltra II DNA polymerase (Stratagene, La Jolla, CA), and 1X PCR buffer (Stratagene, La Jolla, CA). PCR was carried out in a thermocycler using the following program: Step One "95°C for 10 minutes" for 1 cycle, followed by Step Two "95°C for 20 seconds, 65°C for 30 seconds, and 72°C for 45 seconds" for 35 cycles, followed by Step Three "72°C for 5 minutes" for 1 cycle, and then Step Four "4°C Hold" to stop the reaction. The primers for the PCR reaction were:
5'-GACTAACTGAACTAGTAAAAAAATGACCAAGCCGCGCACAATTAATCAG-3' (SEQ ID NO:13)
5'-AAGTGAGTAACTCGAGTTATTAACCGCTGTTGCGAAGTGCCGTCGC-3' (SEQ ID NO:14)
The DNA sequence of native Z. mobilis PEP carboxylase is set forth as SEQ ID NO:20.
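The thermocycler program described above for PEP carboxylase amplification can be written as data, which makes the programmed hold time easy to total. This is a minimal sketch with hypothetical names. It counts only the programmed holds and ignores ramp times, which depend on the instrument.

```python
# (label, number of cycles, hold times in seconds), from the program above.
PEPC_PROGRAM = [
    ("Step One: initial denaturation", 1, [10 * 60]),
    ("Step Two: denature / anneal / extend", 35, [20, 30, 45]),
    ("Step Three: final extension", 1, [5 * 60]),
]


def total_hold_seconds(program):
    """Sum of programmed hold times, excluding ramping and the final 4 C hold."""
    return sum(cycles * sum(holds) for _label, cycles, holds in program)


if __name__ == "__main__":
    seconds = total_hold_seconds(PEPC_PROGRAM)
    print(f"{seconds} s ({seconds / 60:.1f} min)")
```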
The cloned gene was inserted into the vector pGPD426 (ATCC Number: 87361) in between the SpeI and XhoI sites. The final plasmid containing the PEP carboxylase gene was named pGPD426 PEPC. Separately, a similar plasmid, referred to as pGPD426 N-his PEPC, was constructed to insert a six-histidine tag at the N-terminus of the PEPC sequence for protein expression verification in yeast. This plasmid was constructed using two rounds of PCR to extend the 5' end of the PEPC gene to incorporate a six-histidine tag at the N-terminus of the PEPC protein. The two 5' forward primers used sequentially were:
5'ATGTCTCATCATCATCATCATCATACCAAGCCGCGCACAATTAATCAGAAC-3' (SEQ ID NO:15)
and
5'GACTAACTGAACTAGTAAAAAAATGTCTCATCATCATCATCATCATACCAAG-3' (SEQ ID NO:16)
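As a quick check on the tag design, the sketch below reads each forward primer in frame from its first ATG and counts the longest run of consecutive histidine codons. The primer sequences are copied from SEQ ID NO:15 and SEQ ID NO:16; the function name `longest_his_run` is a hypothetical helper introduced only for this illustration.

```python
# Forward primers copied from the text.
SEQ_ID_15 = "ATGTCTCATCATCATCATCATCATACCAAGCCGCGCACAATTAATCAGAAC"
SEQ_ID_16 = "GACTAACTGAACTAGTAAAAAAATGTCTCATCATCATCATCATCATACCAAG"

HIS_CODONS = {"CAT", "CAC"}  # both codons encode histidine


def longest_his_run(primer: str) -> int:
    """Longest run of in-frame histidine codons after the first ATG."""
    start = primer.find("ATG")
    if start < 0:
        return 0
    best = run = 0
    # Step through complete codons only, starting at the ATG.
    for i in range(start, len(primer) - 2, 3):
        codon = primer[i:i + 3]
        run = run + 1 if codon in HIS_CODONS else 0
        best = max(best, run)
    return best


for name, primer in (("SEQ ID NO:15", SEQ_ID_15), ("SEQ ID NO:16", SEQ_ID_16)):
    print(name, "encodes", longest_his_run(primer), "consecutive histidines")
```

Both primers carry the same ATG-TCT-(CAT)x6 block, so the second-round primer extends the first-round product upstream without changing the tag.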
The same 3' primer was used as described above. The PCR was performed in a total volume of about 50 micro-liters in the presence of about 20 nanograms of Z. mobilis PEP carboxylase polynucleotide, about 0.2 mM of 5' forward primer, about 0.2 mM of 3' reverse primer, about 0.2 mM of dNTP, about 1 micro-liter of PfuUltra II DNA polymerase (Stratagene, La Jolla, CA), and 1X PCR buffer (Stratagene, La Jolla, CA). The PCR was carried out in a thermocycler using the following program: Step One "95°C for 10 minutes" for 1 cycle, followed by Step Two "95°C for 20 seconds, 65°C for 30 seconds, and 72°C for 45 seconds" for 35 cycles, followed by Step Three "72°C for 5 minutes" for 1 cycle, and then Step Four "4°C Hold" to stop the reaction.
To increase the protein expression level of Z. mobilis PEP carboxylase in yeast, the PEPC coding sequence was optimized to incorporate frequently used codons obtained from yeast glycolytic genes. The resulting PEP carboxylase amino acid sequence remains identical to the wild type. The codon optimized PEP carboxylase DNA sequence was ordered from IDT and was inserted into the vector pGPD426 at the SpeI and XhoI sites. The final plasmid containing the codon optimized PEP carboxylase gene was named pGPD426 PEPC_opti. A similar plasmid, named pGPD426 N-his PEPC_opti, was constructed to insert a six-histidine tag at the N-terminus of the optimized PEPC gene for protein expression verification in yeast.
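The statement that codon optimization leaves the amino acid sequence unchanged can be illustrated on the first ten codons of the two coding sequences, taken from SEQ ID NO:20 and SEQ ID NO:21. This is a minimal sketch using the standard genetic code; the helper names are illustrative assumptions, and only a short excerpt of each gene is compared.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(dna: str) -> list:
    """Split a DNA string into complete in-frame codons."""
    return [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]


def translate(dna: str) -> str:
    """Translate an uppercase DNA string with the standard code."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


# First 30 coding nucleotides of each gene, starting at the ATG.
NATIVE_5PRIME = "ATGACCAAGCCGCGCACAATTAATCAGAAC"      # from SEQ ID NO:20
OPTIMIZED_5PRIME = "ATGACCAAGCCAAGAACTATTAACCAAAAC"   # from SEQ ID NO:21

changed = sum(
    n != o for n, o in zip(codons(NATIVE_5PRIME), codons(OPTIMIZED_5PRIME))
)

print("native   :", translate(NATIVE_5PRIME))
print("optimized:", translate(OPTIMIZED_5PRIME))
print("codons changed:", changed)
```

In this excerpt half of the codons are replaced by synonymous ones while the encoded peptide stays the same.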
To construct pGPD426 N-his PEPC_opti, two rounds of PCR were performed to extend the 5' end of the codon optimized PEPC gene to incorporate the six-histidine tag at the N-terminus of the PEPC protein. Two 5' forward primers used in sequential order were:
5'ATGTCTCATCATCATCATCATCATATGACCAAGCCAAGAACTATTAACCAAAACCC-3' (SEQ ID NO:17)
and
5'GACTAACTGAACTAGTAAAAAAATGTCTCATCATCATCATCATCATATGACCAAGCCAAG-3' (SEQ ID NO:18)
The 3' reverse primer sequence used for both PCR reactions was:
5'AAGTGAGTAACTCGAGTTATTAACCGGAGTTTCTCAAAGCAGTAGCGATAG-3' (SEQ ID NO:19)
Both PCR reactions were performed in a total volume of about 50 micro-liters in the presence of about 20 nanograms of the codon optimized PEP carboxylase polynucleotide, about 0.2 mM of 5' forward primer, about 0.2 mM of 3' reverse primer, about 0.2 mM of dNTP, about 1 micro-liter of PfuUltra II DNA polymerase (Stratagene, La Jolla, CA), and 1X PCR buffer (Stratagene, La Jolla, CA). PCR reactions were carried out in a thermocycler using the following program: Step One "95°C for 10 minutes" for 1 cycle, followed by Step Two "95°C for 20 seconds, 65°C for 30 seconds, and 72°C for 45 seconds" for 35 cycles, followed by Step Three "72°C for 5 minutes" for 1 cycle, and then Step Four "4°C Hold" to stop the reaction.
Saccharomyces cerevisiae strain BY4742 was cultured in YPD medium to an OD of about 1.0, and then prepared for transformation using the Frozen-EZ Yeast Transformation II kit (Zymo Research, Orange, CA) and following the manufacturer's instructions. Approximately 500 micrograms of each plasmid was added to the cells, and transformation was accomplished by addition of PEG solution ("Solution 3" in the Frozen-EZ Yeast Transformation II kit) and incubation at about 30°C for an hour. After transformation, the cells were plated on synthetic complete medium minus uracil (sc-ura; described in Example IV below), grown for about 48 hours at about 30°C, and transformants were selected based on auxotrophic complementation.
Following a similar procedure, the same plasmids were individually transformed into the following yeast mutant strains: YKR097W (ATCC Number 4016013, ΔPCK, in which the phosphoenolpyruvate carboxykinase gene is deleted), YGL062W (ATCC Number 4014429, ΔPYC1, in which the pyruvate carboxylase 1 gene is deleted), and YBR218C (ATCC Number 4013358, ΔPYC2, in which the pyruvate carboxylase 2 gene is deleted). The transformed yeast cells were grown aerobically in a shake flask in synthetic complete medium minus uracil (see Example IV) containing 1% glucose to mid-log phase (an OD of 2.0). The mid-log phase cultures were then used to inoculate a fresh culture (in sc-ura medium with 1% glucose) to an initial OD of 0.1, after which the cultures were grown anaerobically in a serum bottle. Culture samples were drawn periodically to monitor the level of glucose consumption and ethanol production.
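The inoculation step is a simple dilution, and the volume of starter culture needed follows from conservation of cell density (OD x volume). The sketch below assumes that OD scales linearly with cell density over this range and uses a final culture volume of 50 ml, which is a hypothetical value because the text does not state the serum bottle volume.

```python
def inoculum_volume_ml(od_starter: float, od_target: float,
                       final_volume_ml: float) -> float:
    """Volume of starter culture that gives od_target in final_volume_ml.

    Uses OD_starter * V_starter = OD_target * V_final, which assumes OD is
    proportional to cell density.
    """
    if od_starter <= 0:
        raise ValueError("starter OD must be positive")
    return od_target * final_volume_ml / od_starter


# OD values from the text; the 50 ml final volume is an assumed example.
starter_ml = inoculum_volume_ml(od_starter=2.0, od_target=0.1,
                                final_volume_ml=50.0)
print(f"Add {starter_ml:.2f} ml of mid-log culture and bring to 50 ml")
```

With these numbers the starter culture is diluted 20-fold.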
DNA sequence of the native Z. mobilis PEP carboxylase gene (SEQ ID NO:20):
ACJAiGjAAAAAAATGACCAAGCCGCGCACAATTAATCAGAACCCAGACCTTCGCTATTTTGGT AACCTGCTCGGTCAGGTTATTAAGGAACAAGGCGGAGAGTCTTTATTCAACCAGATCGAGCAA ATTCGCTCTGCCGCGATTAGACGCCATCGGGGTATTGTTGACAGCACCGAGCTAAGTTCTCG CTTAGCCGATCTCGACCTTAATGACATGTTCTCTTTTGCACATGCC I I I I I GCTGTTTTCAATG CTGGCCAATTTGGCTGATGATCGTCAGGGAGATGCCCTTGATCCTGATGCCAATATGGCAAGT GCCCTTAAGGACATAAAAGCCAAAGGCGTCAGTCAGCAGGCGATCATTGATATGATCGACAAA GCCTGCATTGTGCCTGTTCTGACAGCACATCCGACCGAAGTCCGTCGGAAAAGTATGCTTGA CCATTATAATCGCATTGCAGGTTTAATGCGGTTAAAAGATGCTGGACAAACGGTGACCGAAGA TGGTCTTCCGATCGAAGATGCGTTAATCCAGCAAATCACGATATTATGGCAGACTCGTCCGCT CATGCTGCAAAAGCTGACCGTGGCTGATGAAATCGAAACTGCCCTGTCTTTCTTAAGAGAAAC TTTTCTGCCTGTTCTGCCCCAGATTTATGCAGAATGGGAAAAATTGCTTGGTAGTTCTATTCCA AGCTTTATCAGACCTGGTAATTGGATTGGTGGTGACCGTGACGGTAACCCCAATGTCAATGCC GATACGATCATGCTGTCTTTGAAGCGCAGCTCGGAAACGGTATTGACGGATTATCTCAACCGT CTTGATAAACTGCTTTCCAACCTTTCGGTCTCAACCGATATGGTTTCGGTATCCGATGATATTC TACGTCTAGCCGATAAAAGTGGTGACGATGCTGCGATCCGTGCGGATGAACCTTATCGTCGT GCCTTAAATGGTATTTATGACCGTTTAGCCGCTACCTATCGTCAGATCGCCGGTCGCAACCCT TCGCGCCCAGCCTTGCGTTCTGCAGAAGCCTATAAACGGCCTCAAGAATTGCTGGCTGATTT GAAGACCTTGGCCGAAGGCTTGGGTAAATTGGCAGAAGGTAGTTTTAAGGCATTGATCCGTTC GGTTGAAACCTTTGGTTTCCATTTGGCCACCCTCGATCTGCGTCAGAATTCGCAGGTTCATGA AAGAGTTGTCAATGAACTGCTACGGACAGCCACCGTTGAAGCCGATTATTTATCTCTATCGGA AGAAG ATCGCGTTAAGCTGTTAAGACGGG AATTGTCGCAGCCGCGGACTCTATTCGTTCCGC GCGCCGATTATTCCGAAGAAACGCGTTCTGAACTTGATATTATTCAGGCAGCAGCCCGCGCC CATGAAA I I I I I GGCCCTGAATCCATTACGACTTATTTGATTTCGAATGGCGAAAGCATTTCCG ATATTCTGGAAGTCTATTTGCTTTTGAAAGAAGCAGGGCTGTATCAAGGGGGTGCTAAGCCAA AAGCGGCGATTGAAGCTGCGCCTTTATTCGAGACGGTGGCCGATCTTGAAAATGCGCCAAAG GTCATGGAGGAATGGTTCAAGCTGCCTGAAGCGCAAGCCATTGCAAAGGCACATGGCGTTCA GGAAGTGATGGTTGGCTATTCTGACTCCAATAAGGACGGCGGATATCTGACCTCGGTTTGGG GTCTTTATAAGGCTTGCCTCGCTTTGGTGCCGA I I I I I GAGAAAGCCGGTGTACCGATCCAGT TTTTCCATGGACGGGGTGGTTCCGTTGGTCGCGGTGGTGGTTCCAACTTTAATGCCATTCTGT CGCAGCCAGCCGGAGCCGTCAAAGGGCGTATCCGTTATACAGAACAGGGTGAAGTCGTGGC GGCCAAATATGGCACCCATGAAAGCGCTATTGCCCATCTGGATGAGGCCGTAGCGGCGACTT 
TGATTACGTCTTTGGAAGCACCGACCATTGTCGAGCCAGAGTTTAGTCGTTACCGTAAGGCCT TGGATCAGATCTCAGATTCAGCTTTCCAGGCCTATCGCCAATTGGTCTATGGAACGAAGGGCT TCCGTAAATTCTTTAGTGAATTTACGCCTTTGCCGGAAATTGCCCTGTTAAAGATCGGGTCACG CCCACCTAGCCGCAAAAAATCCGACCGGATTGAAGATCTACGCGCTATTCCTTGGGTGTTTAG CTGGTCTCAAGTTCGAGTCATGTTACCCGGTTGGTTCGGTTTCGGTCAGGCTTTATATGACTT TGAAGATACCGAGCTGTTACAGGAAATGGCAAGCCGTTGGCCGTTTTTCCGCACGACTATTCG GAATATGGAACAGGTGATGGCACGTTCCGATATGACGATCGCCAAGCATTATCTGGCCTTGGT TGAGGATCAGACAAATGGTGAGGCTATCTATGATTCTATCGCGGATGGCTGGAATAAAGGTTG TGAAGGTCTGTTAAAGGCAACCCAGCAGAATTGGCTGTTGGAACGCTTTCCGGCGGTTGATA ATTCGGTGCAGATGCGTCGGCCTTATCTGGAACCGCTTAATTACTTACAGGTCGAATTGCTGA AGAAATGGCGGGGAGGTGATACCAACCCGCATATCCTCGAATCTATTCAGCTGACAATCAATG CCATTGCGACGGCACTTCGCAACAGCGGTTAATAACTCGAG
DNA sequence of the codon optimized PEP carboxylase gene (SEQ ID NO:21):
ACTAGTAAAAAAATGACCAAGCCAAGAACTATTAACCAAAACCCAGACTTGAGATACTTCGGTA ACTTGTTGGGTCAAGTTATCAAGGAACAAGGTGGTGAATCTTTGTTCAACCAAAT GAACAAAT CAGATCCGCTGCTATTAGAAGACACAGAGGTATCGTCGACTCTACCGAATTGTCCTCTAGATT GGCTGACTTGGACTTGAACGACATGTTCTCCTTCGCTCACGCTTTCTTGTTGTTCTCTATGTTG GCTAACTTGGCTGACGACAGACAAGGTGACGCTTTGGACCCAGACGCTAACATGGCTTCCGC TTTGAAGGACATTAAGGCTAAGGGTGTTTCTCAACAAGCTATCATTGACATGATCGACAAGGC TTGTATTGTCCCAGTTTTGACTGCTCACCCAACCGAAGTCAGAAGAAAGTCCATGTTGGACCA CTACAACAGAATCGCTGGTTTGATGAGATTGAAGGACGCTGGTCAAACTGTTACCGAAGACG GTTTGCCAATTGAAGACGCTTTGATCCAACAAATTACTATCTTGTGGCAAACCAGACCATTGAT GTTGCAAAAGTTGACTGTCGCTGACGAAATTGAAACCGCTTTGTCTTTCTTGAGAGAAACTTTC TTGCCAGTTTTGCCACAAATCTACGCTGAATGGGAAAAGTTGTTGGGTTCCTCTATTCCATCCT TCATCAGACCAGGTAACTGGATTGGTGGTGACAGAGACGGTAACCCAAACGTCAACGCTGAC ACCATCATGTTGTCTTTGAAGAGATCCTCTGAAACTGTTTTGACCGACTACTTGAACAGATTGG ACAAGTTGTTGTCCAACTTGTCTGTCTCCACTGACATGGTTTCTGTCTCCGACGACATTTTGAG ATTGGCTGACAAGTCTGGTGACGACGCTGCTATCAGAGCTGACGAACCATACAGAAGAGCTT TGAACGGTATTTACGACAGATTGGCTGCTACCTACAGACAAATCGCTGGTAGAAACCCATCCA GACCAGCTTTGAGATCTGCTGAAGCTTACAAGAGACCACAAGAATTGTTGGCTGACTTGAAGA CTTTGGCTGAAGGTTTGGGTAAGTTGGCTGAAGGTTCCTTCAAGGCTTTGATTAGATCTGTTG AAACCTTCGGTTTCCACTTGGCTACTTTGGACTTGAGACAAAACTCCCAAGTCCACGAAAGAG TTGTCAACGAATTGTTGAGAACCGCTACTGTTGAAGCTGACTACTTGTCTTTGTCCGAAGAAG ACAGAGTCAAGTTGTTGAGAAGAGAATTGTCTCAACCAAGAACCTTGTTCGTTCCAAGAGCTG ACTACTCCGAAGAAACTAGATCTGAATTGGACATCATTCAAGCTGCTGCTAGAGCTCACGAAA TCTTCGGTCCAGAATCCATTACCACTTACTTGATCTCTAACGGTGAATCCATTTCTGACATCTT GGAAGTCTACTTGTTGTTGAAGGAAGCTGGTTTGTACCAAGGTGGTGCTAAGCCAAAGGCTG CTATTGAAGCTGCTCCATTGTTCGAAACCGTTGCTGACTTGGAAAACGCTCCAAAGGTCATGG AAGAATGGTTCAAGTTGCCAGAAGCTCAAGCTATCGCTAAGGCTCACGGTGTTCAAGAAGTCA TGGTTGGTTACTCCGACTCTAACAAGGACGGTGGTTACTTGACTTCCGTCTGGGGTTTGTACA AGGCTTGTTTGGCTTTGGTTCCAATTTTCGAAAAGGCTGGTGTCCCAATCCAATTCTTCCACG GTAGAGGTGGTTCTGTTGGTAGAGGTGGTGGTTCCAACTTCAACGCTATTTTGTCTCAACCAG CTGGTGCTGTCAAGGGTAGAATCAGATACACCGAACAAGGTGAAGTTGTCGCTGCTAAGTAC GGTACTCACGAATCCGCTATTGCTCACTTGGACGAAGCTGTTGCTGCTACCTTGATCACTTCT 
TTGGAAGCTCCAACCATTGTCGAACCAGAATTCTCCAGATACAGAAAGGCTTTGGACCAAATC TCTGACTCCGCTTTCCAAGCTTACAGACAATTGGTTTACGGTACTAAGGGTTTCAGAAAGTTCT TCTCTGAATTCACCCCATTGCCAGAAATTGCTTTGTTGAAGATCGGTTCCAGACCACCATCTAG AAAGAAGTCCGACAGAATTGAAGACTTGAGAGCTATCCCATGGGTCTTCTCTTGGTCCCAAG.T TAGAGTCATGTTGCCAGGTTGGTTCGGTTTCGGTCAAGCTTTGTACGACTTCGAAGACACTGA ATTGTTGCAAGAAATGGCTTCTAGATGGCCATTCTTCAGAACCACTATTAGAAACATGGAACAA GTTATGGCTAGATCCGACATGACCATCGCTAAGCACTACTTGGCTTTGGTCGAAGACCAAACT AACGGTGAAGCTATTTACGACTCTATCGCTGACGGTTGGAACAAGGGTTGTGAAGGTTTGTTG AAGGCTACCCAACAAAACTGGTTGTTGGAAAGATTCCCAGCTGTTGACAACTCCGTCCAAATG AGAAGACCATACTTGGAACCATTGAACTACTTGCAAGTTGAATTGTTGAAGAAGTGGAGAGGT GGTGACACTAACCCACACATTTTGGAATCTATCCAATTGACCATTAACGCTATCGCTACTGCTT TGAGAAACTCCGGTTAATAACTCGAG
Example 4: Production of Pentose Sugar Utilizing Yeast Cells
The full length gene encoding the enzyme xylose isomerase from Ruminococcus flavefaciens strain 17 (also known as Ruminococcus flavefaciens strain Sijpesteijn 1948) with a substitution at position 513 (in which cytidine was replaced by guanosine) was synthesized by Integrated DNA Technologies, Inc. ("IDT", Coralville, IA; www.idtdna.com). The sequence of this gene is set forth below as SEQ ID NO:22.
SEQ ID NO: 22 atggaatttttcagcaatatcggtaaaattcagtatcagggaccaaaaagtactgatcctctctcatttaagtactataaccctgaagaagtca tcaacggaaagacaatgcgcgagcatctgaagttcgctctttcatggtggcacacaatgggcggcgacggaacagatatgttcggctgc ggcacaacagacaagacctggggacagtccgatcccgctgcaagagcaaaggctaaggttgacgcagcattcgagatcatggataa gctctccattgactactattgtttccacgatcgcgatctttctcccgagtatggcagcctcaaggctaccaacgatcagcttgacatagttacag actatatcaaggagaagcagggcgacaagttcaagtgcctctggggtacagcaaagtgcttcgatcatccaagattcatgcacggtgca ggtacatctccttctgctgatgtattcgctttctcagctgctcagatcaagaaggctctGgagtcaacagtaaagctcggcggtaacggttac gttttctggggcggacgtgaaggctatgagacacttcttaatacaaatatgggactcgaactcgacaatatggctcgtcttatgaagatggct gttgagtatggacgttcgatcggcttcaagggcgacttctatatcgagcccaagcccaaggagcccacaaagcatcagtacgatttcgata cagctactgttctgggattcctcagaaagtacggtctcgataaggatttcaagatgaatatcgaagctaaccacgctacacttgctcagcata cattccagcatgagctccgtgttgcaagagacaatggtgtgttcggttctatcgacgcaaaccagggcgacgttcttcttggatgggataca gaccagttccccacaaatatctacgatacaacaatgtgtatgtatgaagttatcaaggcaggcggctlcacaaacggcggtctcaacttcg acgctaaggcacgcagagggagcttcactcccgaggatatcttctacagctatatcgcaggtatggatgcatttgctctgggcttcagagct gctctcaagcttatcgaagacggacgtatcgacaagttcgttgctgacagatacgcttcatggaataccggtatcggtgcagacataatcgc aggtaaggcagatttcgcatctcttgaaaagtatgctcttgaaaagggcgaggttacagcttcactctcaagcggcagacaggaaatgctg gagtctatcgtaaataacgttcttttcagtctgtaa
Separately, PCR was conducted to add a DNA sequence encoding 6 histidines to the 3' terminus of this gene.
Two variants designed to remove the translational pauses in the gene were prepared using the DNA self-assembly method of Larsen et al., supra. One variant contained DNA sequence encoding a 6-histidine tag at the 5' terminus, and the other variant did not. The annealing temperature for the self-assembly reactions was about 48 degrees Celsius. This gene variant is referred to as a "Hot Rod" or "HR" gene variant. The sequence of this HR gene is set forth below as SEQ ID NO: 23:
SEQ ID NO: 23
ATGGAGTTCTTTTCTAATATAGGTAAAATTCAGTATCAAGGTCCAAAATC TACAGATCCATTGTCTTTTAAATATTATAATCCAGAAGAAGTTATAAATG GTAAAACTATGAGAGAACATTTAAAATTTGCTTTGTCTTGGTGGCATACT ATGGGTGGTGATGGTACTGATATGTTCGGTTGTGGTACTACTGATAAAAC TTGGGGTCAATCTGATCCAGCTGCTAGAGCAAAAGCCAAAGTAGATGCAG CCTTTGAAATTATGGATAAATTGTCTATTGATTATTATTGTTTTCATGAT AGAGATTTGTCTCCTGAATATGGTTCTTTAAAAGCAACTAATGATCAATT GGACATTGTTACGGATTATATTAAAGAAAAACAAGGTGATAAATTTAAAT GTTTGTGGGGCACTGCGAAATGTTTTGATCATCCACGTTTTATGCATGGT GCGGGGACGAGTCCTTCTGCTGATGTTTTTGCTTTTTCTGCCGCTCAAAT TAAGAAGGCATTGGAATCAACTGTTAAATTAGGTGGGAACGGGTATGTAT TCTGGGGAGGAAGGGAAGGTTATGAAACATTATTAAACACTAATATGGGT TTGGAATTGGATAATATGGCTAGATTGATGAAAATGGCTGTAGAATACGG AAGGTCTATTGGTTTTAAGGGTGACTTTTATATTGAACCAAAACCTAAAG AGCCTACTAAACATCAATATGATTTTGATACTGCTACAGTTTTGGGATTC TTGAGAAAATATGGTCTGGATAAAGATTTTAAAATGAATATAGAAGCTAA TCATGCAACACTCGCACAACATACTTTTCAACATGAATTGAGAGTTGCCA GAGATAACGGAGTTTTTGGATCTATCGATGCAAACCAGGGAGACGTTTTG CTAGGATGGGATACTGATCAATTTCCAACTAACATTTATGATACTACTAT GTGTATGTATGAAGTAATTAAGGCAGGAGGCTTTACTAATGGCGGATTAA ACTTTGATGCGAAGGCTAGGCGTGGTAGTTTCACTCCAGAGGATATATTC TATTCTTATATTGCTGGAATGGATGCTTTCGCGTTAGGTTTCAGGGCAGC ACTAAAATTGATTGAAGATGGTAGAATTGATAAGTTTGTAGCTGATAGAT ATGCTTCTTGGAATACTGGAATAGGAGCAGATATAATCGCTGGGAAAGCC GACTTCGCCAGTCTGGAAAAATATGCGCTTGAAAAAGGAGAAGTTACTGC CAGCTTAAGTTCCGGTCGTCAAGAAATGTTGGAATCTATTGTAAACAATG
TTTTATTTTCTCTG
For cloning purposes, PCR was used to engineer a unique SpeI restriction site into the 5' end of each of the xylose isomerase genes, and to engineer a unique XhoI restriction site at the 3' end. In addition, a version of each gene was created that contained a 6-HIS tag at the 3' end of each gene to enable detection of the proteins using Western analysis.
PCR amplifications were performed in about 50 μl reactions containing 1X PfuUltra II reaction buffer (Stratagene, San Diego, CA), 0.2 mM dNTPs, 0.2 μM specific 5' and 3' primers, and 1 U PfuUltra II polymerase (Stratagene, San Diego, CA). The reactions were cycled at 95°C for 10 minutes, followed by 30 rounds of amplification (95°C for 30 seconds, 62°C for 30 seconds, 72°C for 30 seconds) and a final extension incubation at 72°C for 5 minutes. Amplified PCR products were cloned into pCR Blunt II TOPO (Life Sciences, Carlsbad, CA) and confirmed by sequencing (GeneWiz, La Jolla, CA). The PCR primers for these reactions were:
5'ACTTGACTACTAGTATGGAGTTCTTTTCTAATATAGGTAAAATT (SEQ ID NO:26)
3' (without the His tag):
AGTCAAGTCTCGAGCAGAGAAAATAAAACATTGTTTACAATAGA (SEQ ID NO:27)
3' (with the His tag):
AGTCAAGTCTCGAGCTAATGATGATGATGATGATGCAGAGAAAATAAAACATTGTTTAC (SEQ ID NO:28)
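The cycling programs given in these examples can be compared by adding up their programmed hold times. The sketch below does this for the xylose isomerase program above and for the PEP carboxylase program of Example 3. It counts hold times only; ramp times between temperatures depend on the instrument and are not included, and the function name is an illustrative assumption.

```python
def program_hold_seconds(initial_s: int, cycle_steps_s: list,
                         cycles: int, final_s: int) -> int:
    """Total programmed hold time of a three-stage PCR program, in seconds."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


# Xylose isomerase program: 95C 10 min; 30 x (30 s, 30 s, 30 s); 72C 5 min.
xi_seconds = program_hold_seconds(600, [30, 30, 30], 30, 300)

# PEP carboxylase program: 95C 10 min; 35 x (20 s, 30 s, 45 s); 72C 5 min.
pepc_seconds = program_hold_seconds(600, [20, 30, 45], 35, 300)

print(f"Xylose isomerase program: {xi_seconds / 60:.1f} min of hold time")
print(f"PEP carboxylase program:  {pepc_seconds / 60:.1f} min of hold time")
```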
Separately, the xylose isomerase gene from Piromyces, strain E2 (Harhangi et al., Arch. Microbiol., 180(2): 134-141 (2003)) was synthesized by IDT. The sequence of this gene is set forth below as SEQ ID NO: 24.
1 atggctaagg aatatttccc acaaattcaa aagattaagt tcgaaggtaa ggactctaag
61 aatccattag ccttccacta ctacgatgct gaaaaggaag tcatgggtaa gaaaatgaag
121 gattggttac gtttcgccat ggcctggtgg cacactcttt gcgccgaagg tgctgaccaa
181 ttcggtggag gtacaaagtc tttcccatgg aacgaaggta ctgatgctat tgaaattgcc
241 aagcaaaagg ttgatgctgg tttcgaaatc atgcaaaagc ttggtattcc atactactgt
301 ttccacgatg ttgatcttgt ttccgaaggt aactcCattg aagaatacga atccaacctt
361 aaggctgtcg ttgcctacct caaggaaaag caaaaggaaa ccggtattaa gcttctctgg
421 agtactgcta acgtcttcgg tcacaagcgt tacatgaacg gtgcctccac taacccagac
481 tttgatgttg tcgcccgtgc tattgttcaa attaagaacg ccatagacgc cggtattgaa
541 cttggtgctg aaaactacgt cttctggggt ggtcgtgaag gttacatgag tctccttaac
601 actgaccaaa agcgtgaaaa ggaacacatg gccactatgc ttaccatggc tcgtgactac
661 gctcgttcca agggattcaa gggtactttc ctcattgaac caaagccaat ggaaccaacc
721 aagcaccaat acgatgttga cactgaaacc gctattggtt tccttaaggc ccacaactta
781 gacaaggact tcaaggtcaa cattgaagtt aaccacgcta ctcttgctgg tcacactttc
841 gaacacgaac ttgcctgtgc tgttgatgct ggtatgctcg gttccattga tgctaaccgt
901 ggtgactacc aaaacggttg ggatactgat caattcccaa ttgatcaata cgaactcgtc
961 caagcttgga tggaaatcat ccgtggtggt ggtttcgtta ctggtggtac caacttcgat
1021 gccaagactc gtcgtaactc tactgacctc gaagacatca tcattgccca cgtttctggt
1081 atggatgcta tggctcgtgc tcttgaaaac gctgccaagc tcctccaaga atctccatac
1141 accaagatga agaaggaacg ttacgcttcc ttcgacagtg gtattggtaa ggactttgaa
1201 gatggtaagc tcaccctcga acaagtttac gaatacggta agaagaacgg tgaaccaaag
1261 caaacttctg gtaagcaaga actctacgaa gctattgttg ccatgtacca ataa
Two hot rod ("HR") versions of the Piromyces xylose isomerase gene were prepared using the method of Larsen et al., supra. One version contained DNA sequence encoding a 6-histidine tag at the 5' terminus and the other did not. The annealing temperature for the self-assembling oligonucleotides was about 48 degrees Celsius. The sequence of this gene is set forth below as SEQ ID NO: 25.
ATGGCTAAAGAATATTTTCCACAAATTCAGAAAATTAAATTTGAAGGTAAAGATTCTAAAAATCCATTGGCTTTCCATTA TTATGATGCTGAAAAAGAAGTTATGGGTAAAAAGATGAAAGATTGGTTGAGATTCGCTATGGCTTGGTGGCATACTCTAT GTGCTGAAGGAGCTGATCAATTTGGAGGAGGTACTAAATCTTTTCCTTGGAATGAAGGTACTGACGCTATTGAAATTGCT AAGCAGAAAGTAGACGCGGGTTTTGAAATTATGCAAAAATTGGGAATACCATATTATTGTTTTCATGATGTTGATTTGGT ATCTGAGGGTAATTCTATTGAAGAATATGAATCTAATTTAAAAGCTGTTGTTGCTTACTTAAAAGAAAAACAAAAAGAAA CTGGAATTAAATTGTTGTGGTCTACAGCTAATGTTTTCGGTCATAAAAGATATATGAATGGTGCTTCTACAAATCCAGAT TTTGATGTTGTAGCTAGAGCTATTGTTCAAATTAAAAATGCTATAGATGCAGGAATTGAATTAGGTGCCGAAAATTATGT TTTCTGGGGAGGTAGAGAAGGTTATATGTCTTTGTTAAATACTGATCAAAAACGTGAAAAGGAACACATGGCAACTATGT TGACAATGGCTAGGGATTATGCTAGATCTAAAGGTTTTAAAGGTACTTTCTTGATTGAGCCAAAACCTATGGAACCAACT AAACATCAATATGACGTTGACACTGAAACTGCTATTGGTTTCTTAAAAGCTCATAATTTGGATAAAGATTTTAAGGTTAA TATAGAAGTTAATCATGCTACACTAGCTGGTCATACTTTTGAACATGAATTAGCTTGTGCAGTTGATGCCGGTATGTTAG GTTCTATCGACGCAAATAGAGGTGATTATCAAAATGGTTGGGACACAGATCAATTTCCAATAGATCAATATGAATTGGTT CAAGCATGGATGGAAATTATTAGGGGTGGAGGCTTCGTTACAGGTGGAACTAATTTTGATGCTAAAACTAGGAGAAATTC TACAGATCTTGAAGATATAATTATTGCTCATGTATCTGGTATGGATGCGATGGCCCGTGCTTTGGAAAATGCAGCTAAAT TACTTCAAGAATCTCCTTATACTAAAATGAAAAAGGAAAGATATGCTTCTTTTGATTCTGGAATAGGTAAGGATTTTGAA GATGGTAAATTGACATTGGAACAAGTTTATGAATATGGTAAGAAGAATGGAGAACCAAAACAAACTTCTGGTAAACAAGA ATTATATGAGGCTATAGTAGCTATGTATCAAt aa
For cloning purposes, a unique SpeI restriction site was engineered at the 5' end of each of the XI genes, and a unique XhoI restriction site was engineered at the 3' end. When needed, a 6-HIS tag was engineered at the 3' end of each gene sequence to enable detection of the proteins using Western analysis. The primers are listed in Table X. PCR amplifications were performed in 50 μl reactions containing 1X PfuUltra II reaction buffer (Stratagene, San Diego, CA), 0.2 mM dNTPs, 0.2 μM specific 5' and 3' primers, and 1 U PfuUltra II polymerase (Stratagene, San Diego, CA). The reactions were cycled at 95°C for 10 minutes, followed by 30 rounds of amplification (95°C for 30 seconds, 62°C for 30 seconds, 72°C for 30 seconds) and a final extension incubation at 72°C for 5 minutes. Amplified PCR products were cloned into pCR Blunt II TOPO (Life Sciences, Carlsbad, CA) and confirmed by sequencing (GeneWiz, La Jolla, CA). The primers used for PCR were:
5' (native gene) ACTAGTATGGCTAAGGAATATTTCCCACAAATTCAAAAG
3' (native gene) CTCGAGCTACTATTGGTACATGGCAACAATAGC
3' (native gene plus His tag)
CTCGAGCTACTAATGATGATGATGATGATGTTGGTACATGGCAACAATAGCTTCG
5' (hot rod gene) ACTAGTATGGCTAAAGAATATTTTCCACAAATTCAG
3' (hot rod gene) CTCGAGTTATTGATACATAGCTACTATAGCCTC
3' (hot rod gene plus His tag)
CTCGAGTTAATGATGATGATGATGATGTTGATACATAGCTACTATAGCCTCATTGTTTAC
The genes encoding the native and HR versions of xylose isomerase were separately inserted into the vector p426GPD (ATCC catalog number 87361). Saccharomyces cerevisiae strain BY4742 cells (ATCC catalog number 201389) were cultured in YPD media (10 g Yeast Extract, 20 g Bacto-Peptone, 20 g Glucose, 1 L total) at about 30°C.
Separate aliquots of the cells were transformed with the plasmid constructs containing the various xylose isomerase constructs or with the vector alone. Transformation was accomplished using the Zymo kit (Catalog number T2001 ; Zymo Research Corp., Orange, CA 92867) using about ^ μg plasmid DNA and cultured on SC media (set forth below) containing glucose but no uracil (20g glucose; 2.21 g SC dry mix, 6.7g Yeast Nitrogen Base, 1 L total) for 2-3 days at about 30 °C. Synthetic Complete Medium mix (minus uracil) contained:
Adenine hemisulfate
3.5g Arginine
1g Glutamic Acid
0.433g Histidine
0.4g Myo-Inositol
5.2g Isoleucine
2.63g Leucine
0.9g Lysine
1.5g Methionine
0.8g Phenylalanine
1.1g Serine
1.2g Threonine
0.8g Tryptophan
0.2g Tyrosine
1.2g Valine
For expression and activity analysis, transformed cells containing the various xylose isomerase constructs were selected from the cultures and grown in about 100 ml of SC-Dextrose (minus uracil) to an OD600 of about 4.0. The S. cerevisiae cultures that were transformed with the various xylose isomerase-histidine constructs were then lysed using YPER-Plus reagent (Thermo Scientific, catalog number 78999) according to the manufacturer's directions. Protein quantitation of the lysates was performed using the Coomassie-Plus kit (Thermo Scientific, catalog number 23236) as directed by the manufacturer. Denaturing and native Western blot analyses were then conducted to detect his-tagged xylose isomerase polypeptides.
Gels were transferred onto a nitrocellulose membrane (0.45 micron, Thermo Scientific, San Diego, CA) using Western blotting filter paper (Thermo Scientific) and a Bio-Rad Mini Trans-Blot Cell (BioRad, Hercules, CA) system for approximately 90 minutes at 40V. Following transfer, the membrane was washed in 1X PBS (EMD, San Diego, CA), 0.05% Tween-20 (Fisher Scientific, Fairlawn, NJ) for 2-5 minutes with gentle shaking. The membrane was blocked in 3% BSA dissolved in 1X PBS and 0.05% Tween-20 at room temperature for about 2 hours with gentle shaking. The membrane was washed once in 1X PBS and 0.05% Tween-20 for about 5 minutes with gentle shaking. The membrane was then incubated at room temperature for about 1 hour, with gentle shaking, with a 1:5000 dilution of primary antibody (Ms mAB to 6x His Tag, AbCam, Cambridge, MA) in 0.3% BSA (Fraction V, EMD, San Diego, CA) dissolved in 1X PBS and 0.05% Tween-20. The membrane was then washed three times for 5 minutes each with 1X PBS and 0.05% Tween-20 with gentle shaking. The secondary antibody [Dnk pAb to Ms IgG (HRP), AbCam, Cambridge, MA] was used at 1:15000 dilution in 0.3% BSA and allowed to incubate for about 90 minutes at room temperature with gentle shaking. The membrane was washed three times for about 5 minutes using 1X PBS and 0.05% Tween-20 with gentle shaking. The membrane was then incubated with 5 ml of Supersignal West Pico Chemiluminescent substrate (Thermo Scientific, San Diego, CA) for 1 minute and then was exposed to a phosphorimager (Bio-Rad Universal Hood II, Bio-Rad, Hercules, CA) for about 10-100 seconds.
The results are shown in Figure 7. As can be seen, both Piromyces ("P" in Figure 7) and Ruminococcus ("R" in Figure 7) xylose isomerases are expressed in both the soluble and insoluble fractions of the yeast cells.
To measure activity of the various xylose isomerase constructs, assays were performed according to Kuyper et al. (FEMS Yeast Res., 4:69 [2003]). About 20 μg of soluble whole cell extract was incubated in the presence of 100 mM Tris, pH 7.5, 10 mM MgCl2, 0.15 mM NADH (Sigma, St. Louis, MO), and about 2 U sorbitol dehydrogenase (Roche) at about 30°C. To start the reaction, about 100 μl of xylose was added at various final concentrations of 40-500 mM. A Beckman DU-800 was utilized with an Enzyme Mechanism software package (Beckman Coulter, Inc.), and the change in the A340 was monitored for 2-3 minutes.
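In a coupled assay of this kind, the xylulose formed by xylose isomerase is reduced by sorbitol dehydrogenase with consumption of NADH, so the rate of decrease in A340 can be converted to an enzyme rate with the Beer-Lambert law. The sketch below shows one way such a conversion might be done. It assumes the commonly used NADH extinction coefficient of 6.22 mM^-1 cm^-1 at 340 nm, a 1 cm path length, and one NADH oxidized per xylulose formed. The slope and the 1.0 ml reaction volume in the example are hypothetical values that are not given in the text; only the 20 μg of extract comes from the protocol.

```python
NADH_EXTINCTION_MM_CM = 6.22  # mM^-1 cm^-1 at 340 nm (assumed literature value)


def specific_activity(delta_a340_per_min: float, reaction_volume_ml: float,
                      protein_mg: float, path_cm: float = 1.0) -> float:
    """Specific activity in umol NADH oxidized per minute per mg protein.

    delta_a340_per_min is the magnitude of the absorbance decrease per minute.
    mM multiplied by ml gives umol, so no further unit conversion is needed.
    """
    nadh_mm_per_min = delta_a340_per_min / (NADH_EXTINCTION_MM_CM * path_cm)
    nadh_umol_per_min = nadh_mm_per_min * reaction_volume_ml
    return nadh_umol_per_min / protein_mg


# Hypothetical slope and volume; 0.020 mg corresponds to 20 ug of extract.
activity = specific_activity(delta_a340_per_min=0.0622,
                             reaction_volume_ml=1.0,
                             protein_mg=0.020)
print(f"Specific activity: {activity:.2f} umol/min/mg")
```

Repeating the calculation at each xylose concentration between 40 and 500 mM would give the rate-versus-substrate data needed for a kinetic fit.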
Example 5: Preparation of Selective Growth Yeast
The yeast gene cdc21 encodes thymidylate synthase, which is required for de novo synthesis of pyrimidine deoxyribonucleotides. A cdc21 mutant, strain 17206 (ATCC accession number 208583), has a point mutation G139S relative to the initiating methionine. The restrictive temperature of this temperature sensitive mutant is 37°C, at which cell division arrests at S phase, so that little or no cell growth and division occurs at or above this temperature.
Saccharomyces cerevisiae strain YGR420CBY4742ΔPFK2 was used as the starting cell line to create the cdc21 growth sensitive mutant. A construct for homologous recombination was prepared to replace the wild type thymidylate synthase gene in YGR420CBY4742ΔPFK2 with the cdc21 mutant. This construct was made in several steps. First, the cdc21 mutant region from Saccharomyces cerevisiae strain 17206 was PCR amplified using the following primers:
CDC21_fwd: 5'- aatcgatcaaagcttctaaatacaagacgtgcgatgacgactatactggac -3'
CDC21_rev: 5'- taccgtactacccgggtatatagtctttttgccctggtgttccttaataatttc -3'
For this PCR amplification reaction, Saccharomyces cerevisiae 17206 genomic DNA was used. The genomic DNA was extracted using the Zymo Research YeaStar Genomic DNA kit according to instructions. In the PCR amplification reaction, 100 ng of 17206 genomic DNA, 1 μM of the oligonucleotide primer set listed above, 2.5 U of PfuUltra High-Fidelity DNA polymerase (Stratagene), 300 μM dNTPs (Roche), and 1X PfuUltra reaction buffer were mixed in a final reaction volume of 50 μl. Using a BIORAD DNA Engine Tetrad 2 Peltier thermal cycler, the following cycle conditions were used: 5 min denaturation step at 95°C, followed by 30 cycles of 20 sec at 95°C, 20 sec at 50°C, and 1 min at 72°C, and a final step of 5 min at 72°C. This PCR product was digested with HindIII and XmaI restriction endonucleases and cloned in the HindIII and XmaI sites of PUC19 (NEB) according to standard cloning procedures described by Maniatis in Molecular Cloning.
The genomic DNA of BR214-4a (ATCC accession number 208600) was extracted using the Zymo Research YeaStar Genomic DNA kit according to instructions. The lys2 gene with promoter and terminator regions was PCR amplified from BR214-4a genomic DNA using the following primers:
Lys2Fwd: 5'-tgctaatgacccgggaattccacttgcaattacataaaaaattccggcgg-3'
Lys2Rev: 5'-atgatcattgagctcagcttcgcaagtattcattttagacccatggtgg-3'.
The PCR cycle was identical to that just described above, but with genomic DNA of BR214-4a as the template. XmaI and SacI restriction sites were designed to flank this DNA construct so that it could be cloned into the XmaI and SacI sites of the PUC19-cdc21 vector according to standard cloning procedures described by Maniatis in Molecular Cloning. The new construct, carrying the cdc21 mutation with lys2 directly downstream, will be referred to as PUC19-cdc21-lys2.
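The restriction sites built into the 5' tails of the cdc21 and lys2 primers can be confirmed with a simple substring search, since HindIII, XmaI, and SacI all recognize palindromic six-base sequences. The sketch below uses the four primer sequences given above; the dictionary layout and the function name `sites_in` are illustrative assumptions.

```python
# Recognition sequences of the enzymes named in the text.
SITES = {"HindIII": "AAGCTT", "XmaI": "CCCGGG", "SacI": "GAGCTC"}

# Primer sequences copied from the text (written 5' to 3').
PRIMERS = {
    "CDC21_fwd": "aatcgatcaaagcttctaaatacaagacgtgcgatgacgactatactggac",
    "CDC21_rev": "taccgtactacccgggtatatagtctttttgccctggtgttccttaataatttc",
    "Lys2Fwd": "tgctaatgacccgggaattccacttgcaattacataaaaaattccggcgg",
    "Lys2Rev": "atgatcattgagctcagcttcgcaagtattcattttagacccatggtgg",
}


def sites_in(primer: str) -> list:
    """Names of the enzymes whose recognition sequence occurs in the primer."""
    upper = primer.upper()
    return sorted(name for name, site in SITES.items() if site in upper)


for name, seq in PRIMERS.items():
    print(f"{name}: {', '.join(sites_in(seq)) or 'no listed site'}")
```

The result matches the cloning scheme: the cdc21 fragment is flanked by HindIII and XmaI, and the lys2 fragment by XmaI and SacI, so the two fragments join at the shared XmaI site.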
The final step involved the cloning of the downstream region of thymidylate synthase into the PUC19-cdc21-lys2 vector immediately downstream of the lys2 gene. The downstream region of the thymidylate synthase was amplified from BY4742 genomic DNA (ATCC accession number 201389D-5) using the following primers:
ThymidylateSynthase_DownFwd: 5'-tgctaatgagagctctcattttttggtgcgatatgtttttggttgatg-3' and
ThymidylateSynthase_DownRev: 5'- aatgatcatgagctcgtcaacaagaactaaaaaattgttcaaaaatgc-3'
This final construct is referred to as PUC19-cdc21-lys2-ThymidylateSynthase_down. The sequence is set forth in the tables. A final PCR amplification reaction of this construct was performed using the following PCR primers:
ThymidylateSynthase::cdc21 fwd: 5'- ctaaatacaagacgtgcgatgacgactatactgg-3' and
ThymidylateSynthase::cdc21 rev: 5'- gtcaacaagaactaaaaaattgttcaaaaatgcaattgtc-3'.
The PCR reaction was identical to that described above but used 100 ng of the PUC19-cdc21-lys2-ThymidylateSynthase_down construct as a template.
The final PCR product was separated by agarose gel electrophoresis, excised, and purified using the MP Biomedicals Geneclean II kit as recommended. Homologous recombination in YGR420CBY4742ΔPFK2 to replace the wt thymidylate synthase with the cdc21 mutant was accomplished by transforming the YGR420CBY4742ΔPFK2 strain with 10 μg of the purified PCR product, using the same transformation protocol described above. Transformants were selected by culturing the cells on selective media containing SC-Ethanol (0.67% yeast nitrogen base-2% ethanol) containing complete amino acids minus lysine.
The genome of this final engineered strain contains the mutated cdc21 gene and has both the PFK1 and PFK2 genes deleted. This final engineered strain will be transformed with the best combination of edd-p426 GPD and eda-p425 GPD variant constructs. Ethanol and glucose levels will be monitored during aerobic and anaerobic growth conditions using Roche ethanol and glucose kits according to instructions.
Example 6: Examples of Polynucleotide Regulators
Provided in the tables hereafter are non-limiting examples of regulator polynucleotides that can be utilized in embodiments herein. Such polynucleotides may be utilized in native form or may be modified for use herein. Examples of regulatory polynucleotides include those that are regulated by oxygen levels in a system (e.g., up-regulated or down-regulated by relatively high oxygen levels or relatively low oxygen levels).
Regulated Yeast Promoters - Up-regulated by oxygen
ORF name   Gene name   Relative mRNA level (Aerobic)   Relative mRNA level (Anaerobic)   Ratio
YDR536W STL1 55 30 2.7
YNL150W 78 30 2.6
YHR212C 149 30 2.4
YJL108C 106 30 2.4
YGR069W 49 30 2.4
YDR106W 60 30 2.3
YNR034W SOL1 197 30 2.2
YEL073C 104 30 2.1
YOL141W 81 30 1.8
Regulated Yeast Promoters - Down-regulated by oxygen
ORF name   Gene name   Relative mRNA level (Aerobic)   Relative mRNA level (Anaerobic)   Ratio
YJR047C ANB1 30 4901 231.1
YMR319C FET4 30 1159 58
YPR194C 30 982 49.1
YIR019C STA1 30 981 22.8
YHL042W 30 608 12
YHR210C 30 552 27.6
YHR079B SAE3 30 401 2.7
YGL162W ST01 30 371 9.6
YHL044W 30 334 16.7
YOL015W 30 320 6.1
YCLX07W 30 292 4.2
YIL013C PDR11 30 266 10.6
YDR046C 30 263 13.2
YBR040W FIG1 30 257 12.8
YLR040C 30 234 2.9
YOR255W 30 231 11.6
YOL014W 30 229 11.4
YAR028W 30 212 7.5
YER089C 30 201 6.2
YFL012W 30 193 9.7
YDR539W 30 187 3.4
YHL043W 30 179 8.9
YJR162C 30 173 6
YMR165C SMP2 30 147 3.5
YER106W 30 145 7.3
YDR541C 30 140 7
YCRX07W 30 138 3.3
YHR048W 30 137 6.9
YCL021W 30 136 6.8
YOL160W 30 136 6.8
YCRX08W 30 132 6.6
YMR057C 30 109 5.5
YDR540C 30 83 4.2
YOR378W 30 78 3.9
YBR085W AAC3 45 1281 28.3
YER188W 47 746 15.8
YLL065W GIN11 50 175 3.5
YDL241W 58 645 11.1
YBR238C 59 274 4.6
YCR048W ARE1 60 527 8.7
YOL165C 60 306 5.1
YNR075W 60 251 4.2
YJL213W 60 250 4.2
YPL265W DIP5 61 772 12.7
YDL093W PMT5 62 353 5.7
YKR034W DAL80 63 345 5.4
YKR053C 66 1268 19.3
YJR147W 68 281 4.1
Known and putative DNA binding motifs
Ace2 RRRAARARAA-A-RARAA GTGTGTGTGTGTGTG
Adr1 A-AG-GAGAGAG-GGCAG YTSTYSTT-TTGYTWTT
Arg80 T-CCW-TTTKTTTC GCATGACCATCCACG
Arg81 AAAAARARAAAARMA GSGAYARMGGAMAAAAA
Aro80 YKYTYTTYTT--KY TRCCGAGRYW-SSSGCGS
Ash1 CGTCCGGCGC CGTCCGGCGC
Azf1 GAAAAAGMAAAAAAA AARWTSGARG-A-CSAA
Bas1 TTTTYYTTYTTKY-TY-T CS-CCAATGK--CS
Cadi OA I KY I 1 1 1 I I KY I Y GCT-ACTAAT
Cbf1 CACGTGACYA CACGTGACYA
Cha4 CA— ACACASA-A CAYAMRTGY-C
Cin5 none none
Crz1 GG-A-A-AR-ARGGC- TSGYGRGASA
Cup9 TTTKYTKTTY-YTTTKTY K-C-C-SCGCTACKGC
Dal81 WTTKTTTTTYTTTTT-T SR-GGCMCGGC-SSG
Dal82 TTKTTTTYTTC TACYACA-CACAWGA
Dig1 AAA--RAA-GARRAA-AR CCYTG-AYTTCW-CTTC
Dot6 GTGMAK-MGRA-G-G GTGMAK-MGRA-G-G
Fhl1 -TTWACAYCCRTACAY-Y -TTWACAYCCRTACAY-Y
Fkh1 TTT-CTTTKYTT-YTTTT AAW-RTAAAYARG
Fkh2 AAARA-RAAA-AAAR-AA GG-AAWA-GTAAACAA
Fzf1 CACACACACACACACAC SASTKCWCTCKTCGT
Gal4 TTGCTTGAACGSATGCCA TTGCTTGAACGSATGCCA
Gal4 (Gal) YCTTTTTTTTYTTYYKG CGGM— CW-Y-CCCG
Gat1 none none
Gat3 RRSCCGMCGMGRCGCGCS RGARGTSACGCAKRTTCT
Gcn4 AAA-ARAR-RAAAARRAR TGAGTCAY
Gcr1 GGAAGCTGAAACGYMWRR GGAAGCTGAAACGYMWRR
Gcr2 GGAGAGGCATGATGGGGG AGGTGATGGAGTGCTCAG
Gln3 CT-CCTTTCT GKCTRR-RGGAGA-GM
Grf10 GAAARRAAAAAAMRMARA -GGGSG-T-SYGT-CGA
Gts1 G-GCCRS--TM AG-AWGTTTTTGWCAAMA
Haa1 none none
Hal9 TTTTTTYTTTTY-KTTTT KCKSGCAGGCWTTKYTCT
Hap2 YTTCTTTTYT-Y-C-KT- G-CCSART-GC
Hap3 T-SYKCTTTTCYTTY SGCGMGGG-CC-GACCG
Hap4 STT-YTTTY-TTYTYYYY YCT-ATTSG-C-GS
Hap5 YK-TTTWYYTC T-TTSMTT-YTTTCCK-C
Hir1 AAAA-A-AARAR-AG CCACKTKSGSCCT-S
Hir2 WAAAAAAGAAAA-AAAAR CRSGCYWGKGC
Hms1 AAA-GG-ARAM -AARAAGC-GGGCAC-C
Hsf1 TYTTCYAGAA-TTCY TYTTCYAGAA-TTCY
Ime4 CACACACACACACACACA CACACACACACACACACA
Ino2 TTTYCACATGC SCKKCGCKSTSSTTYAA
Ino4 G--GCATGTGAAAA G-GCATGTGAAAA
Ixr1 GAAAA-AAAAAAAARA-A CTTTTTTTYYTSGCC
Leu3 GAAAAARAARAA-AA GCCGGTMMCGSYC-
Mac1 YTTKT-TTTTTYTYTTT A-TTTTTYTTKYGC
Mal13 GCAG-GCAGG AAAC-TTTATA-ATACA
Mal33 none none
Mata1 GCCC-C CAAT-TCT-CK
Mbp1 TTTYTYKTTT-YYTTTTT G-RR-A-ACGCGT-R
Mcm1 TTTCC-AAW-RGGAAA TTTCC-AAW-RGGAAA
Met31 YTTYYTTYTTTTYTYTTC
Figure imgf000154_0001
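The consensus motifs in the table above are written with IUPAC nucleotide ambiguity codes (for example, R for A or G and Y for C or T). As a minimal sketch of how such a motif could be matched against a promoter sequence, the following Python converts a motif into a regular expression and reports match positions. The function names and the example promoter sequence are hypothetical, and treating "-" as "any base" is an assumption about the table's notation rather than something stated in the source.

```python
import re

# Standard IUPAC nucleotide ambiguity codes used by the motifs in the table.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
    "-": "[ACGT]",  # assumption: "-" marks an unconstrained position
}


def motif_to_pattern(motif: str) -> str:
    """Translate an IUPAC motif into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_sites(motif: str, sequence: str) -> list[int]:
    """Return 0-based start positions of all matches, including overlapping ones."""
    lookahead = re.compile("(?=" + motif_to_pattern(motif) + ")")
    return [match.start() for match in lookahead.finditer(sequence.upper())]


# Hypothetical promoter fragment containing two Gcn4-like sites (TGAGTCAY).
promoter = "AATGAGTCACGGTGAGTCATT"
print(motif_to_pattern("CACGTGACYA"))   # Cbf1 motif from the table
print(find_sites("TGAGTCAY", promoter))
```

The lookahead form is used so that overlapping sites are all reported; a plain `re.finditer` on the pattern alone would skip any site that starts inside a previous match.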
Transcriptional repressors
Figure imgf000155_0001
Associated Gene(s) Description(s)
RGM1 Putative transcriptional repressor with proline-rich zinc fingers; overproduction impairs cell growth
YHP1 One of two homeobox transcriptional repressors (see also Yox1p) that bind to Mcm1p and to early cell cycle box (ECB) elements of cell cycle regulated genes, thereby restricting ECB-mediated transcription to the M/G1 interval
HOS4 Subunit of the Set3 complex, which is a meiotic-specific repressor of sporulation specific genes that contains deacetylase activity; potential Cdc28p substrate
CAF20 Phosphoprotein of the mRNA cap-binding complex involved in translational control, repressor of cap-dependent translation initiation, competes with eIF4G for binding to eIF4E
SAP1 Putative ATPase of the AAA family, interacts with the Sin1p transcriptional repressor in the two-hybrid system
SET3 Defining member of the SET3 histone deacetylase complex which is a meiosis-specific repressor of sporulation genes; necessary for efficient transcription by RNAPII; one of two yeast proteins that contains both SET and PHD domains
RPH1 JmjC domain-containing histone demethylase which can specifically demethylate H3K36 tri- and dimethyl modification states; transcriptional repressor of PHR1; Rph1p phosphorylation during DNA damage is under control of the MEC1-RAD53 pathway
YMR181C Protein of unknown function; mRNA transcribed as part of a bicistronic transcript with a predicted transcriptional repressor RGM1/YMR182C; mRNA is destroyed by nonsense-mediated decay (NMD); YMR181C is not an essential gene
YLR345W Similar to 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase enzymes responsible for the metabolism of fructose-2,6-bisphosphate; mRNA expression is repressed by the Rfx1p-Tup1p-Ssn6p repressor complex; YLR345W is not an essential gene
MCM1 Transcription factor involved in cell-type-specific transcription and pheromone response; plays a central role in the formation of both repressor and activator complexes
PHR1 DNA photolyase involved in photoreactivation, repairs pyrimidine dimers in the presence of visible light; induced by DNA damage; regulated by transcriptional repressor Rph1p
HOS2 Histone deacetylase required for gene activation via specific deacetylation of lysines in H3 and H4 histone tails; subunit of the Set3 complex, a meiotic-specific repressor of sporulation specific genes that contains deacetylase activity
RGT1 Glucose-responsive transcription factor that regulates expression of several glucose transporter (HXT) genes in response to glucose; binds to promoters and acts both as a transcriptional activator and repressor
SRB7 Subunit of the RNA polymerase II mediator complex; associates with core polymerase subunits to form the RNA polymerase II holoenzyme; essential for transcriptional regulation; target of the global repressor Tup1p
GAL11 Subunit of the RNA polymerase II mediator complex; associates with core polymerase subunits to form the RNA polymerase II holoenzyme; affects transcription by acting as target of activators and repressors
Transcriptional activators
Figure imgf000157_0001
Associated Gene(s) Description(s)
GCR1 Transcriptional activator of genes involved in glycolysis; DNA-binding protein that interacts and functions with the transcriptional activator Gcr2p
GCR2 Transcriptional activator of genes involved in glycolysis; interacts and functions with the DNA-binding protein Gcr1p
GAT1 Transcriptional activator of genes involved in nitrogen catabolite repression; contains a GATA-1-type zinc finger DNA-binding motif; activity and localization regulated by nitrogen limitation and Ure2p
GLN3 Transcriptional activator of genes regulated by nitrogen catabolite repression (NCR), localization and activity regulated by quality of nitrogen source
PUT3 Transcriptional activator of proline utilization genes, constitutively binds PUT1 and PUT2 promoter sequences and undergoes a conformational change to form the active state; has a Zn(2)-Cys(6) binuclear cluster domain
ARR1 Transcriptional activator of the basic leucine zipper (bZIP) family, required for transcription of genes involved in resistance to arsenic compounds
PDR3 Transcriptional activator of the pleiotropic drug resistance network, regulates expression of ATP-binding cassette (ABC) transporters through binding to cis-acting sites known as PDREs (PDR responsive elements)
MSN4 Transcriptional activator related to Msn2p; activated in stress conditions, which results in translocation from the cytoplasm to the nucleus; binds DNA at stress response elements of responsive genes, inducing gene expression
MSN2 Transcriptional activator related to Msn4p; activated in stress conditions, which results in translocation from the cytoplasm to the nucleus; binds DNA at stress response elements of responsive genes, inducing gene expression
PHD1 Transcriptional activator that enhances pseudohyphal growth; regulates expression of FLO11, an adhesin required for pseudohyphal filament formation; similar to StuA, an A. nidulans developmental regulator; potential Cdc28p substrate
FHL1 Transcriptional activator with similarity to DNA-binding domain of Drosophila forkhead but unable to bind DNA in vitro; required for rRNA processing; isolated as a suppressor of splicing factor prp4
VHR1 Transcriptional activator, required for the vitamin H-responsive element (VHRE) mediated induction of VHT1 (Vitamin H transporter) and BIO5 (biotin biosynthesis intermediate transporter) in response to low biotin concentrations
CDC20 Cell-cycle regulated activator of anaphase-promoting complex/cyclosome (APC/C), which is required for metaphase/anaphase transition; directs ubiquitination of mitotic cyclins, Pds1p, and other anaphase inhibitors; potential Cdc28p substrate
CDH1 Cell-cycle regulated activator of the anaphase-promoting complex/cyclosome (APC/C), which directs ubiquitination of cyclins resulting in mitotic exit; targets the APC/C to specific substrates including Cdc20p, Ase1p, Cin8p and Fin1p
AFT2 Iron-regulated transcriptional activator; activates genes involved in intracellular iron use and required for iron homeostasis and resistance to oxidative stress; similar to Aft1p
MET4 Leucine-zipper transcriptional activator, responsible for the regulation of the sulfur amino acid pathway, requires different combinations of the auxiliary factors Cbf1p, Met28p, Met31p and Met32p
CBS2 Mitochondrial translational activator of the COB mRNA; interacts with translating ribosomes, acts on the COB mRNA 5'-untranslated leader
CBS1 Mitochondrial translational activator of the COB mRNA; membrane protein that interacts with translating ribosomes, acts on the COB mRNA 5'-untranslated leader
CBP6 Mitochondrial translational activator of the COB mRNA; phosphorylated
PET111 Mitochondrial translational activator specific for the COX2 mRNA; located in the mitochondrial inner membrane
PET494 Mitochondrial translational activator specific for the COX3 mRNA, acts together with Pet54p and Pet122p; located in the mitochondrial inner membrane
PET122 Mitochondrial translational activator specific for the COX3 mRNA, acts together with Pet54p and Pet494p; located in the mitochondrial inner membrane
RRD1 Peptidyl-prolyl cis/trans-isomerase, activator of the phosphotyrosyl phosphatase activity of PP2A; involved in G1 phase progression, microtubule dynamics, bud morphogenesis and DNA repair; subunit of the Tap42p-Sit4p-Rrd1p complex
YPR196W Putative maltose activator
POG1 Putative transcriptional activator that promotes recovery from pheromone induced arrest; inhibits both alpha-factor induced G1 arrest and repression of CLN1 and CLN2 via SCB/MCB promoter elements; potential Cdc28p substrate; SBF regulated
MSA2 Putative transcriptional activator, that interacts with G1-specific transcription factor, MBF and G1-specific promoters; ortholog of Msa2p, an MBF and SBF activator that regulates G1-specific transcription and cell cycle initiation
PET309 Specific translational activator for the COX1 mRNA, also influences stability of intron-containing COX1 primary transcripts; localizes to the mitochondrial inner membrane; contains seven pentatricopeptide repeats (PPRs)
TEA1 Ty1 enhancer activator required for full levels of Ty enhancer-mediated transcription; C6 zinc cluster DNA-binding protein
PIP2 Autoregulatory oleate-specific transcriptional activator of peroxisome proliferation, contains Zn(2)-Cys(6) cluster domain, forms heterodimer with Oaf1p, binds oleate response elements (OREs), activates beta-oxidation genes
CHA4 DNA binding transcriptional activator, mediates serine/threonine activation of the catabolic L-serine (L-threonine) deaminase (CHA1); Zinc-finger protein with Zn[2]-Cys[6] fungal-type binuclear cluster domain
SFL1 Transcriptional repressor and activator; involved in repression of flocculation-related genes, and activation of stress responsive genes; negatively regulated by cAMP-dependent protein kinase A subunit Tpk2p
RDS2 Zinc cluster transcriptional activator involved in conferring resistance to ketoconazole
CAT8 Zinc cluster transcriptional activator necessary for derepression of a variety of genes under non-fermentative growth conditions, active after diauxic shift, binds carbon source responsive elements
ARO80 Zinc finger transcriptional activator of the Zn2Cys6 family; activates transcription of aromatic amino acid catabolic genes in the presence of aromatic amino acids
SIP4 C6 zinc cluster transcriptional activator that binds to the carbon source-responsive element (CSRE) of gluconeogenic genes; involved in the positive regulation of gluconeogenesis; regulated by Snf1p protein kinase; localized to the nucleus
SPT10 Putative histone acetylase, sequence-specific activator of histone genes, binds specifically and highly cooperatively to pairs of UAS elements in core histone promoters, functions at or near the TATA box
MET28 Basic leucine zipper (bZIP) transcriptional activator in the Cbf1p-Met4p-Met28p complex, participates in the regulation of sulfur metabolism
GCN4 Basic leucine zipper (bZIP) transcriptional activator of amino acid biosynthetic genes in response to amino acid starvation; expression is tightly regulated at both the transcriptional and translational levels
CAD1 AP-1-like basic leucine zipper (bZIP) transcriptional activator involved in stress responses, iron metabolism, and pleiotropic drug resistance; controls a set of genes involved in stabilizing proteins; binds consensus sequence TTACTAA
INO2 Component of the heteromeric Ino2p/Ino4p basic helix-loop-helix transcription activator that binds inositol/choline-responsive elements (ICREs), required for derepression of phospholipid biosynthetic genes in response to inositol depletion
THI2 Zinc finger protein of the Zn(II)2Cys6 type, probable transcriptional activator of thiamine biosynthetic genes
SWI4 DNA binding component of the SBF complex (Swi4p-Swi6p), a transcriptional activator that in concert with MBF (Mbp1-Swi6p) regulates late G1-specific transcription of targets including cyclins and genes required for DNA synthesis and repair
HAP5 Subunit of the heme-activated, glucose-repressed Hap2/3/4/5 CCAAT-binding complex, a transcriptional activator and global regulator of respiratory gene expression; required for assembly and DNA binding activity of the complex
HAP3 Subunit of the heme-activated, glucose-repressed Hap2p/3p/4p/5p CCAAT-binding complex, a transcriptional activator and global regulator of respiratory gene expression; contains sequences contributing to both complex assembly and DNA binding
HAP2 Subunit of the heme-activated, glucose-repressed Hap2p/3p/4p/5p CCAAT-binding complex, a transcriptional activator and global regulator of respiratory gene expression; contains sequences sufficient for both complex assembly and DNA binding
HAP4 Subunit of the heme-activated, glucose-repressed Hap2p/3p/4p/5p CCAAT-binding complex, a transcriptional activator and global regulator of respiratory gene expression; provides the principal activation function of the complex
YML037C Putative protein of unknown function with some characteristics of a transcriptional activator; may be a target of Dbf2p-Mob1p kinase; GFP-fusion protein co-localizes with clathrin-coated vesicles; YML037C is not an essential gene
TRA1 Subunit of SAGA and NuA4 histone acetyltransferase complexes; interacts with acidic activators (e.g., Gal4p) which leads to transcription activation; similar to human TRRAP, which is a cofactor for c-Myc mediated oncogenic transformation
YLL054C Putative protein of unknown function with similarity to Pip2p, an oleate-specific transcriptional activator of peroxisome proliferation; YLL054C is not an essential gene
RTG2 Sensor of mitochondrial dysfunction; regulates the subcellular location of Rtg1p and Rtg3p, transcriptional activators of the retrograde (RTG) and TOR pathways; Rtg2p is inhibited by the phosphorylated form of Mks1p
YBR012C Dubious open reading frame, unlikely to encode a functional protein; expression induced by iron-regulated transcriptional activator Aft2p
JEN1 Lactate transporter, required for uptake of lactate and pyruvate; phosphorylated; expression is derepressed by transcriptional activator Cat8p during respiratory growth, and repressed in the presence of glucose, fructose, and mannose
MRP1 Mitochondrial ribosomal protein of the small subunit; MRP1 exhibits genetic interactions with PET122, encoding a COX3-specific translational activator, and with PET123, encoding a small subunit mitochondrial ribosomal protein
MRP17 Mitochondrial ribosomal protein of the small subunit; MRP17 exhibits genetic interactions with PET122, encoding a COX3-specific translational activator
TPI1 Triose phosphate isomerase, abundant glycolytic enzyme; mRNA half-life is regulated by iron availability; transcription is controlled by activators Reb1p, Gcr1p, and Rap1p through binding sites in the 5' non-coding region
PKH3 Protein kinase with similarity to mammalian phosphoinositide-dependent kinase 1 (PDK1) and yeast Pkh1p and Pkh2p, two redundant upstream activators of Pkc1p; identified as a multicopy suppressor of a pkh1 pkh2 double mutant
YGL079W Putative protein of unknown function; green fluorescent protein (GFP)-fusion protein localizes to the endosome; identified as a transcriptional activator in a high-throughput yeast one-hybrid assay
TFB1 Subunit of TFIIH and nucleotide excision repair factor 3 complexes, required for nucleotide excision repair, target for transcriptional activators
PET123 Mitochondrial ribosomal protein of the small subunit; PET123 exhibits genetic interactions with PET122, which encodes a COX3 mRNA-specific translational activator
MHR1 Protein involved in homologous recombination in mitochondria and in transcription regulation in nucleus; binds to activation domains of acidic activators; required for recombination-dependent mtDNA partitioning
MCM1 Transcription factor involved in cell-type-specific transcription and pheromone response; plays a central role in the formation of both repressor and activator complexes
EGD1 Subunit beta1 of the nascent polypeptide-associated complex (NAC) involved in protein targeting, associated with cytoplasmic ribosomes; enhances DNA binding of the Gal4p activator; homolog of human BTF3b
STE5 Pheromone-response scaffold protein; binds Ste11p, Ste7p, and Fus3p kinases, forming a MAPK cascade complex that interacts with the plasma membrane and Ste4p-Ste18p; allosteric activator of Fus3p that facilitates Ste7p-mediated activation
RGT1 Glucose-responsive transcription factor that regulates expression of several glucose transporter (HXT) genes in response to glucose; binds to promoters and acts both as a transcriptional activator and repressor
TYE7 Serine-rich protein that contains a basic-helix-loop-helix (bHLH) DNA binding motif; binds E-boxes of glycolytic genes and contributes to their activation; may function as a transcriptional activator in Ty1-mediated gene expression
VMA13 Subunit H of the eight-subunit V1 peripheral membrane domain of the vacuolar H+-ATPase (V-ATPase), an electrogenic proton pump found throughout the endomembrane system; serves as an activator or a structural stabilizer of the V-ATPase
GAL11 Subunit of the RNA polymerase II mediator complex; associates with core polymerase subunits to form the RNA polymerase II holoenzyme; affects transcription by acting as target of activators and repressors
VAC14 Protein involved in regulated synthesis of PtdIns(3,5)P(2), in control of trafficking of some proteins to the vacuole lumen via the MVB, and in maintenance of vacuole size and acidity; interacts with Fig4p; activator of Fab1p
Example 7: Heterologous Xylose Isomerase expression in yeast
Provided hereafter are non-limiting examples of certain organisms from which nucleic acids that encode a polypeptide having xylose isomerase activity can be obtained. Certain nucleic acid encoded polypeptides having active xylose isomerase activity can be expressed in an engineered yeast (S. cerevisiae).
Figure imgf000163_0001
Example 8: Examples of nucleic acid and amino acid sequences
Provided hereafter are non-limiting examples of certain nucleic acid sequences.
Figure imgf000164_0001
Figure imgf000165_0001
Figure imgf000166_0001
Figure imgf000167_0001
Figure imgf000168_0001
Figure imgf000169_0001
Figure imgf000170_0001
Provided hereafter are non-limiting examples of certain amino acid sequences.
Figure imgf000170_0002
Example 9: Preparation and Expression of Xylose Isomerase Genes
A full length native gene encoding a xylose isomerase from Ruminococcus flavefaciens was synthesized by IDT DNA, Inc. (Coralville, Iowa), with a single silent point mutation (a "C" to a "G") at position 513. The sequence of this gene is set forth as SEQ ID NO: 29; the point mutation is indicated as the larger bold capital letter "G".
SEQ ID NO: 29 ATGGAATTTTTCAGCAATATCGGTAAAATTCAGTATCAGGGACCAAAAAGTACTGATCCTCTC TCATTTAAGTACTATAACCCTGAAGAAGTCATCAACGGAAAGACAATGCGCGAGCATCTGAA GTTCGCTCTTTCATGGTGGCACACAATGGGCGGCGACGGAACAGATATGTTCGGCTGCGGC ACAACAGACAAGACCTGGGGACAGTCCGATCCCGCTGCAAGAGCAAAGGCTAAGGTTGACG CAGCATTCGAGATCATGGATAAGCTCTCCATTGACTACTATTGTTTCCACGATCGCGATCTTT CTCCCGAGTATGGCAGCCTCAAGGCTACCAACGATCAGCTTGACATAGTTACAGACTATATC AAGGAGAAGCAGGGCGACAAGTTCAAGTGCCTCTGGGGTACAGCAAAGTGCTTCGATCATC CAAGATTCATGCACGGTGCAGGTACATCTCCTTCTGCTGATGTATTCGCTTTCTCAGCTGCT CAGATCAAGAAGGCTCTGGAGTCAACAGTAAAGCTCGGCGGTAACGGTTACGTTTTCTGGG GCGGACGTGAAGGCTATGAGACACTTCTTAATACAAATATGGGACTCGAACTCGACAATATG GCTCGTCTTATGAAGATGGCTGTTGAGTATGGACGTTCGATCGGCTTCAAGGGCGACTTCTA TATCGAGCCCAAGCCCAAGGAGCCCACAAAGCATCAGTACGATTTCGATACAGCTACTGTTC TGGGATTCCTCAGAAAGTACGGTCTCGATAAGGATTTCAAGATGAATATCGAAGCTAACCAC GCTACACTTGCTCAGCATACATTCCAGCATGAGCTCCGTGTTGCAAGAGACAATGGTGTGTT CGGTTCTATCGACGCAAACCAGGGCGACGTTCTTCTTGGATGGGATACAGACCAGTTCCCC ACAAATATCTACGATACAACAATGTGTATGTATGAAGTTATCAAGGCAGGCGGCTTCACAAAC GGCGGTCTCAACTTCGACGCTAAGGCACGCAGAGGGAGCTTCACTCCCGAGGATATCTTCT ACAGCTATATCGCAGGTATGGATGCATTTGCTCTGGGCTTCAGAGCTGCTCTCAAGCTTATC GAAGACGGACGTATCGACAAGTTCGTTGCTGACAGATACGCTTCATGGAATACCGGTATCG GTGCAGACATAATCGCAGGTAAGGCAGATTTCGCATCTCTTGAAAAGTATGCTCTTGAAAAG GGCGAGGTTACAGCTTCACTCTCAAGCGGCAGACAGGAAATGCTGGAGTCTATCGTAAATA ACGTTCTTTTCAGTCTGTAA
The nucleotide sequence of the native gene is set forth as SEQ ID NO: 30 and is available in GenBank under accession number AJ132472 (CAB51938.1).
SEQ ID NO: 30
ATGGAATTTTTCAGCAATATCGGTAAAATTCAGTATCAGGGACCAAAAAGTACTGATCCTCTC TCATTTAAGTACTATAACCCTGAAGAAGTCATCAACGGAAAGACAATGCGCGAGCATCTGAA GTTCGCTCTTTCATGGTGGCACACAATGGGCGGCGACGGAACAGATATGTTCGGCTGCGGC ACAACAGACAAGACCTGGGGACAGTCCGATCCCGCTGCAAGAGCAAAGGCTAAGGTTGACG CAGCATTCGAGATCATGGATAAGCTCTCCATTGACTACTATTGTTTCCACGATCGCGATCTTT CTCCCGAGTATGGCAGCCTCAAGGCTACCAACGATCAGCTTGACATAGTTACAGACTATATC AAGGAGAAGCAGGGCGACAAGTTCAAGTGCCTCTGGGGTACAGCAAAGTGCTTCGATCATC CAAGATTCATGCACGGTGCAGGTACATCTCCTTCTGCTGATGTATTCGCTTTCTCAGCTGCT CAGATCAAGAAGGCTCTCGAGTCAACAGTAAAGCTCGGCGGTAACGGTTACGTTTTCTGGG GCGGACGTGAAGGCTATGAGACACTTCTTAATACAAATATGGGACTCGAACTCGACAATATG GCTCGTCTTATGAAGATGGCTGTTGAGTATGGACGTTCGATCGGCTTCAAGGGCGACTTCTA TATCGAGCCCAAGCCCAAGGAGCCCACAAAGCATCAGTACGATTTCGATACAGCTACTGTTC TGGGATTCCTCAGAAAGTACGGTCTCGATAAGGATTTCAAGATGAATATCGAAGCTAACCAC GCTACACTTGCTCAGCATACATTCCAGCATGAGCTCCGTGTTGCAAGAGACAATGGTGTGTT CGGTTCTATCGACGCAAACCAGGGCGACGTTCTTCTTGGATGGGATACAGACCAGTTCCCC ACAAATATCTACGATACAACAATGTGTATGTATGAAGTTATCAAGGCAGGCGGCTTCACAAAC GGCGGTCTCAACTTCGACGCTAAGGCACGCAGAGGGAGCTTCACTCCCGAGGATATCTTCT ACAGCTATATCGCAGGTATGGATGCATTTGCTCTGGGCTTCAGAGCTGCTCTCAAGCTTATC GAAGACGGACGTATCGACAAGTTCGTTGCTGACAGATACGCTTCATGGAATACCGGTATCG GTGCAGACATAATCGCAGGTAAGGCAGATTTCGCATCTCTTGAAAAGTATGCTCTTGAAAAG GGCGAGGTTACAGCTTCACTCTCAAGCGGCAGACAGGAAATGCTGGAGTCTATCGTAAATA ACGTTCTTTTCAGTCTGTAA
The corresponding amino acid sequence of the native Ruminococcus flavefaciens xylose isomerase is set forth in SEQ ID NO: 31.
SEQ ID NO: 31 MEFFSNIGKIQYQGPKSTDPLSFKYYNPEEVINGKTMREHLKFALSWWHTMGGDGTD MFGCGTTDKTWGQSDPAARAKAKVDAAFEIMDKLSIDYYCFHDRDLSPEYGSLKATND QLDIVTDYIKEKQGDKFKCLWGTAKCFDHPRFMHGAGTSPSADVFAFSAAQIKKALEST VKLGGNGYVFWGGREGYETLLNTNMGLELDNMARLMKMAVEYGRSIGFKGDFYIEPK PKEPTKHQYDFDTATVLGFLRKYGLDKDFKMNI EANHATLAQHTFQHELRVARDNGVF GSIDANQGDVLLGWDTDQFPTNIYDTTMCMYEVIKAGGFTNGGLNFDAKARRGSFTPE DIFYSYIAGMDAFALGFRAALKLIEDGRIDKFVADRYASWNTGIGADIIAGKADFASLEKY ALEKGEVTASLSSGRQEMLESIVNNVLFSL
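The "silent" nature of the C-to-G substitution described above can be checked by translating both coding sequences and comparing the results. The following Python is a minimal sketch of that check, not the procedure used in this example. The function names are hypothetical, and the two 15-nucleotide inputs are in-frame excerpts spanning the mutated codon, copied from SEQ ID NO: 30 (native) and SEQ ID NO: 29 (variant).

```python
from itertools import product

# Standard genetic code, built in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; trailing partial codons are ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def point_differences(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in a, base in b) for every mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def is_silent(native: str, variant: str) -> bool:
    """True when both sequences encode the same amino acid sequence."""
    return translate(native) == translate(variant)


native_excerpt = "AAGGCTCTCGAGTCA"   # from SEQ ID NO: 30
variant_excerpt = "AAGGCTCTGGAGTCA"  # from SEQ ID NO: 29
print(point_differences(native_excerpt, variant_excerpt))
print(translate(native_excerpt), translate(variant_excerpt))
print(is_silent(native_excerpt, variant_excerpt))
```

Positions reported by `point_differences` are relative to the excerpt, not to the full-length gene; running the same functions on the complete sequences would report the position within the gene.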
An additional nucleic acid variant of the native Ruminococcus xylose isomerase gene, referred to as the "hot rod" variant, was designed to eliminate over-represented codon pairs, improve codon preferences, and reduce mRNA secondary structures. The amino acid sequence encoded by the hot rod xylose isomerase gene is substantially identical to that of the wild type. This sequence variant is set forth in SEQ ID NO: 32.
SEQ ID NO: 32
ATGGAGTTCTTTTCTAATATAGGTAAAATTCAGTATCAAGGTCCAAAATCTACAGATCCATTGT
CTTTTAAATATTATAATCCAGAAGAAGTTATAAATGGTAAAACTATGAGAGAACATTTAAAATT
TGCTTTGTCTTGGTGGCATACTATGGGTGGTGATGGTACTGATATGTTCGGTTGTGGTACTA CTGATAAAACTTGGGGTCAATCTGATCCAGCTGCTAGAGCAAAAGCCAAAGTAGATGCAGCC TTTGAAA1 ATGGATAAATTGTCTATTGATTATTATTGTTTTCATGATAGAGATTTGTCTCCTGA ATATGGTTCTTTAAAAGCAACTAATGATCAATTGGACATTGTTACGGATTATATTAAAGAAAAA CAAGGTGATAAATTTAAATGTTTGTGGGGCACTGCGAAATGTTTTGATCATCCACGTTTTATG CATGGTGCGGGGACGAGTCCTTCTGCTGATGTTTTTGCTTTTTCTGCCGCTCAAATTAAGAA GGCATTGGAATCAACTGTTAAATTAGGTGGGAACGGGTATGTATTCTGGGGAGGAAGGGAA GGTTATGAAACATTATTAAACACTAATATGGGTTTGGAATTGGATAATATGGCTAGATTGATG AAAATGGCTGTAGAATACGGAAGGTCTATTGGTTTTAAGGGTGACTTTTATATTGAACCAAAA CCTAAAGAGCCTACTAAACATCAATATGATTTTGATACTGCTACAGTTTTGGGATTCTTGAGA AAATATGGTCTGGATAAAGATTTTAAAATGAATATAGAAGCTAATCATGCAACACTCGCACAA CATACTTTTCAACATGAATTGAGAGTTGCCAGAGATAACGGAGTTTTTGGATCTATCGATGCA AACCAGGGAGACGTTTTGCTAGGATGGGATACTGATCAATTTCCAACTAACATTTATGATACT ACTATGTGTATGTATGAAGTAATTAAGGCAGGAGGCTTTACTAATGGCGGATTAAACTTTGAT GCGAAGGCTAGGCGTGGTAGTTTCACTCCAGAGGATATATTCTATTCTTATATTGCTGGAAT GGATGCTTTCGCGTTAGGTTTCAGGGCAGCACTAAAATTGATTGAAGATGGTAGAATTGATA AGTTTGTAGCTGATAGATATGCTTCTTGGAATACTGGAATAGGAGCAGATATAATCGCTGGG AAAGCCGACTTCGCCAGTCTGGAAAAATATGCGCTTGAAAAAGGAGAAGTTACTGCCAGCTT AAGTTCCGGTCGTCAAGAAATGTTGGAATCTATTGTAAACAATGTTTTATTTTCTCTGTAA
This gene was synthesized by assembling the oligonucleotides set forth below first into seven separate "primary fragments" (also referred to as "PFs"). The PFs were then assembled into three "secondary fragments" ("SFs"), which in turn were assembled into the full length gene. All oligonucleotides were obtained from IDT. All of the oligonucleotides used for gene construction are set forth in the table below.
Figure imgf000173_0001
4329
On 14 rev AGAACCATATTCAGGAGACAAATCTCTATCATGAAAACAATAATA 45-mer
4329
On 15 fwd I I I I C I CO I GAA I A I G I I (J I I I AAAAGCAACTAATGATCAA 43-mer
4329
On 16 rev AATATAATCCGTAACAATGTCCAATTGATCATTAGTTGC I I I I AA 45-mer
4329
On 17 fwd TTGGACATTGTTACGGATTATATTAAAGAAAAACAAGGTGATAAA 45-mer
4329
On 18 rev CGCAGTGCCCCACAAACATTTAAATTTATCACCTT I I I I I C I I I 45-mer
4329
On 19 fwd I I I AAA I I I I G I GUGGCAC I CGAAA I I I I I GATCATCCACGT 45-mer
4329
On20 rev ACTCGTCCCCGCACCATGCATAAAACGTGGATGATCAAAACA I I I 45-mer
4329
On21 fwd I I I A I CA I GG I GCGGGGACGAGTCCT I C I C I A I I I I I I GCT 45-mer
4329
On22 rev CTTCTTAA I I I GAGCGGCAGAAAAAGCAAAAACATCAGCAGAAGG 45-mer
4329
On23 fwd I I I I CTGCCGCTCAAATTAAGAAGGCATTGGAATCAACTGTTAAA 45-mer
4329
On24 rev GAATACATACCCGTTCCCACCTAA I I I AACAGTTGATTCCAATGC 45-mer
4329
On25 fwd TTAGGTGGGAACGGGTATGTATTCTGGGGAGGAAGGGAAGGTTAT 45-mer
4329
On26 rev CATATTAGTG I I I AA I AA I G I I I CATAACCTTCCCTTCCTCCCCA 45-mer
4329
On27 fwd GAAACATTATTAAACACTAATATGGG I I I GGAATTGGATAATATG 45-mer
4329
On28 rev TACAGCCA I I I I CATCAATCTAGCCATATTATCCAATTCCAAACC 45-mer
4329
On29 fwd GCTAGATTGATGAAAATGGCTGTAGAATACGGAAGGTCTATTGGT 45-mer
4329
On30 rev TTCAATATAAAAGTCACCCTTAAAACCAATAGACCTTCCGTATTC 45-mer
4329
On31 fwd TTTAAGGGTGACTTTTATATTGAACCAAAACCTAAAGAGCCTACT 45-mer
4329
On32 rev AGTATCAAAATCATATTGATG'I I I AG I AGGC I C I I I AG I I I I GG 45-mer
4329
On33 fwd AAACATCAATATGATTT I GA'I AC I GC I AUAG I I I I GGGATTCTTG 45-mer
4329
On34 rev ATCTTTATCCAGACCATATTTTCTCAAGAATCCCAAAACTGTAGC 45-mer
4329
On35 fwd AGAAAATATGGTCTGGATAAAGA I I I I AAAATGAATATAGAAGCT 45-mer
4329
On36 rev ATGTTGTGCGAGTGTTGCATGATTAGCTTCTATATTCA I I I I AAA 45-mer
4329
On37 fwd AATCATGCAACACTCGCACAACATACTTTTCAACATGAATTGAGA 45-mer
4329
On38 rev AAAAACTCCGTTATCTCTGGCAACTCTCAATTCATGTTGAAAAGT 45-mer
4329
On39 fwd GTTGCCAGAGATAACGGAGTTTTTGGATCTATCGATGCAAACCAG 45-mer
4329
On40 rev ATCCCATCCTAGCAAAACGTCTCCCTGG I I I GCATCGATAGATCC 45-mer
4329
On41 fwd GGAGACG I I I I'GCTAGGA I GGGA I AC I UA I CAA I I I CCAACTAAC 45-mer
4329
On42 rev CATACACATAGTAGTATCATAAATGTTAGTTGGAAATTGATCAGT 45-mer
4329
On43 fwd ATTTATGATACTACTATGTGTATGTATGAAGTAATTAAGGCAGGA 45-mer
4329
On44 rev G i l l AATCCGCCATTAGTAAAGCCTCCTGCCTTAATTACTTCATA 45-mer
4329
On45 fwd GGC I I I AC I AA I GCUGA I I AAAC I I I GATGCGAAGGCTAGGCGT 45-mer
4329
On46 rev TATATCCTCTGGAGTGAAACTACCACGCCTAGCCTTCGCATCAAA 45-mer
4329
On47 fwd GGTAG I I I CACTCCAGAGGATATATTCTATTCTTATATTGCTGGA 45-mer
4329
On48 rev GAAACCTAACGCGAAAGCATCCATTCCAGCAATATAAGAATAGAA 45-mer
4329
On49 fwd ATGGATGCTTTCGCGTTAGG I I I CAGGGCAGCACTAAAATTGATT 45-mer
4329
On50 rev CTTATCAATTCTACCATCTTCAATCAA I I I I AGTGCTGCCCT 42-mer
4329
On51 fwd GAAGATGGTAGAATTGATAAG I I I GTAGCTGATAGATATGCTTCT 45-mer
4329
On52 rev TGCTCCTATTCCAGTATTCCAAGAAGCATATCTATCAGCTACAAA 45-mer
4329
On53 fwd TGGAATACTGGAATAGGAGCAGATATAATCGCTGGGAAAGCCGAC 45-mer
4329
On54 rev ATA I I I I I CCAGACTGGCGAAGTCGGCTTTCCCAGCGATTATATC 45-mer
4329
On55 fwd TTCGCCAGTCTGGAAAAATATGCGCTTGAAAAAGGAGAAGTTACT 45-mer
4329
On56 rev ACGACCGGAACTTAAGCTGGCAGTAACTTCTCC I I I I I CAAGCGC 45-mer
4329
On57 fwd GCCAGCTTAAGTTCCGGTCGTCAAGAAATGTTGGAATCTAT 41 -mer
4329
On58 rev CAGAGAAAATAAAACATTG I I I ACAATAGATTCCAACATTTCTTG 45-mer
4329
On59 fwd ACTTGACTAACTGAAGCTTCATATGATGGAGTTCTTTTCTAATATAGGTAAAATT 55-mer
4329
On60 fwd ACTTGACTACTAGTATGGAGTTCTTTTCTAATATAGGTAAAATT 44-mer
4329
On61 fwd ACTTGACTAACTGAAGCTTCATATGTTGGACATTGTTACGGATTATATTAAAGAA 55-mer
4329
On62 fwd ACTTGACTAACTGAAGCTTCATATGAAACATCAATATGATTTTGATACTGCTACA 55-mer
4329
On63 rev AGTTAAGTGAGTAAACTAGTGAATTCCAGAGAAAATAAAACATTGTTTACAATAGA 56-mer
4329
On64 rev AGTCAAGTCTCGAGCTACAGAGAAAATAAAACATTGTTTACAATAGA 44-mer
4329
On65 rev AGTTAAGTGAGTAAACTAGTGAATTCCATATTAGTGTTTAATAATGTTTCATAACC 56-mer
4329
On66 rev AGTTAAGTGAGTAAACTAGTGAATTCCATACACATAGTAGTATCATAAATGTTAGT 56-mer
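The oligonucleotides listed above alternate between forward and reverse strands, with each reverse oligonucleotide bridging two neighboring forward oligonucleotides. The following Python is a minimal sketch of that tiling pattern, not the design procedure actually used here. The function names, the `oligoN` labels, and the default offset are assumptions for illustration; the 90-nucleotide input is a contiguous excerpt of SEQ ID NO: 32.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_oligos(gene: str, length: int = 45, offset: int = 22):
    """Split a gene into alternating forward and reverse oligonucleotides.

    Forward oligos are consecutive windows of `length` bases on the top strand.
    Each reverse oligo is the reverse complement of a window shifted by
    `offset` bases, so it overlaps the end of one forward oligo and the
    start of the next.
    """
    oligos = []
    number = 1
    for start in range(0, len(gene), length):
        oligos.append((f"oligo{number}", "fwd", gene[start:start + length]))
        number += 1
        bridge = gene[start + offset:start + offset + length]
        if bridge:  # skip when the shifted window falls past the gene end
            oligos.append((f"oligo{number}", "rev", reverse_complement(bridge)))
            number += 1
    return oligos


# 90-nt excerpt of SEQ ID NO: 32 (two consecutive 45-nt windows).
excerpt = (
    "TTGGACATTGTTACGGATTATATTAAAGAAAAACAAGGTGATAAA"
    "TTTAAATGTTTGTGGGGCACTGCGAAATGTTTTGATCATCCACGT"
)
for name, strand, seq in tile_oligos(excerpt, offset=24):
    print(name, strand, seq, f"{len(seq)}-mer")
```

A fixed offset is a simplification: the overlaps in the table above are not necessarily uniform, and the end primers carry additional 5' sequence that this sketch does not model.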
The 7 primary fragments ("PFs") were first separately assembled using a polymerase chain reaction (PCR) mixture containing about 1X Pfu Ultra II reaction buffer (Agilent, La Jolla, CA), about 0.2 mM dNTPs, about 0.04 μmol of assembly primers (see table below), about 0.2 μmol of end primers (see table below), and about 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA). The reaction conditions were 95°C for 10 minutes, 30 cycles of 95°C for 20 seconds, 44°C for 30 seconds, and 72°C for 15 seconds, and a final extension of 5 minutes at 72°C.
Primary Fragment    Assembly Primers    5' and 3' End Primers
PF1                 4329 On1 fwd        4329 On1 fwd
                    4329 On2 rev        4329 On10 rev
                    4329 On3 fwd
                    4329 On4 rev
                    4329 On5 fwd
                    4329 On6 rev
                    4329 On7 fwd
                    4329 On8 rev
                    4329 On9 fwd
                    4329 On10 rev
PF2                 4329 On9 fwd        4329 On9 fwd
                    4329 On10 rev       4329 On18 rev
                    4329 On11 fwd
                    4329 On12 rev
                    4329 On13 fwd
                    4329 On14 rev
                    4329 On15 fwd
                    4329 On16 rev
                    4329 On17 fwd
                    4329 On18 rev
PF3                 4329 On17 fwd       4329 On17 fwd
                    4329 On18 rev       4329 On26 rev
                    4329 On19 fwd
                    4329 On20 rev
                    4329 On21 fwd
                    4329 On22 rev
                    4329 On23 fwd
                    4329 On24 rev
                    4329 On25 fwd
                    4329 On26 rev
PF4                 4329 On25 fwd       4329 On25 fwd
                    4329 On26 rev       4329 On34 rev
                    4329 On27 fwd
                    4329 On28 rev
                    4329 On29 fwd
                    4329 On30 rev
                    4329 On31 fwd
                    4329 On32 rev
                    4329 On33 fwd
                    4329 On34 rev
PF5                 4329 On33 fwd       4329 On33 fwd
                    4329 On34 rev       4329 On42 rev
                    4329 On35 fwd
                    4329 On36 rev
                    4329 On37 fwd
                    4329 On38 rev
                    4329 On39 fwd
                    4329 On40 rev
                    4329 On41 fwd
                    4329 On42 rev
PF6                 4329 On41 fwd       4329 On41 fwd
                    4329 On42 rev       4329 On50 rev
                    4329 On43 fwd
                    4329 On44 rev
                    4329 On45 fwd
                    4329 On46 rev
                    4329 On47 fwd
                    4329 On48 rev
                    4329 On49 fwd
                    4329 On50 rev
PF7                 4329 On49 fwd       4329 On49 fwd
                    4329 On50 rev       4329 On58 rev
                    4329 On51 fwd
                    4329 On52 rev
                    4329 On53 fwd
                    4329 On54 rev
                    4329 On55 fwd
                    4329 On56 rev
                    4329 On57 fwd
                    4329 On58 rev
Each assembled primary fragment was separately PCR purified using a Qiagen PCR purification kit (Qiagen, Valencia, CA) according to the manufacturer's directions and then reassembled into 3 secondary fragments ("SFs") in a PCR reaction containing about 1X Pfu Ultra II reaction buffer (Agilent, La Jolla, CA), about 0.2 mM dNTPs, about 0.1 pmol of each primary fragment (SF1 = PF1 + PF2 + PF3; SF2 = PF3 + PF4 + PF5; SF3 = PF5 + PF6 + PF7), about 0.2 μmol of end primers (see table below), and about 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA). The reaction conditions were 95°C for 10 minutes, 30 cycles of 95°C for 20 seconds, 62°C for 30 seconds, and 72°C for 15 seconds, and a final extension of 5 minutes at 72°C.
Secondary Fragment    Primary Fragments    5' and 3' End Primers
SF1                   PF1                  4329 On59 fwd (Tf1-5P1 Sf1-5P1)
                      PF2                  4329 On65 rev (Sf1-3P1)
                      PF3
SF2                   PF3                  4329 On61 fwd (Sf2-5P1)
                      PF4                  4329 On66 rev (Sf2-3P1)
                      PF5
SF3                   PF5                  4329 On62 fwd (Sf3-5P1)
                      PF6                  4329 On63 rev (Tf1-3P1 Sf3-3P1)
                      PF7
Each secondary fragment was PCR purified using a Qiagen PCR purification kit (Qiagen, Valencia, CA) according to the manufacturer's directions, and the final, full length gene was assembled in a PCR reaction containing 1 X Pfu Ultra II reaction buffer (Agilent, La Jolla, CA), 0.2mM dNTPs, 0.1 pmol of each secondary fragment (SF1+SF2+SF3), 0.2 μmol of end primers (4329 On60 fwd and 4329 On64 rev), and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA). The reaction conditions were 95°C for 10 minutes, 30 cycles of 95°C for 20 seconds, 62°C for 30 seconds, and 72°C for 30 seconds, and a final extension of 5 minutes at 72°C. The final product was PCR purified using a Qiagen PCR purification kit (Qiagen, Valencia, CA) according to the manufacturer's directions and then cloned into pCR Blunt II-TOPO (Invitrogen, Carlsbad, CA) according to the manufacturer's directions. Sequence confirmation of the final construct was performed at GeneWiz (La Jolla, CA).
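The nested assembly described above can be sketched as a small data structure. This is only an illustration: the dictionary and function names are hypothetical, and the reading that the shared primary fragments (PF3 and PF5) supply the overlap between neighboring secondary fragments is an assumption drawn from the fragment compositions, not a statement made in the text.

```python
# Assembly plan as listed in the text: each secondary fragment (SF)
# is built from three primary fragments (PFs).
ASSEMBLY_PLAN = {
    "SF1": ["PF1", "PF2", "PF3"],
    "SF2": ["PF3", "PF4", "PF5"],
    "SF3": ["PF5", "PF6", "PF7"],
}


def shared_fragments(plan):
    """Return the PFs that each pair of consecutive SFs has in common."""
    names = list(plan)
    shared = {}
    for left, right in zip(names, names[1:]):
        shared[(left, right)] = [pf for pf in plan[left] if pf in plan[right]]
    return shared


overlaps = shared_fragments(ASSEMBLY_PLAN)
for pair, common in overlaps.items():
    print(pair, "share", common)
```

Under this reading, every pair of neighboring secondary fragments has exactly one primary fragment in common, which would let the three secondary fragments anneal to one another in the final full length assembly reaction.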
An additional variant of the native R. flavefaciens xylose isomerase gene (XI-R-COOP) was prepared in which all of the codons were optimized for expression in Saccharomyces cerevisiae. This variant gene was synthesized by IDT DNA Inc. and the sequence is set forth below as SEQ ID NO: 33.
SEQ ID NO: 33
ATGGAATTCTTCTCTAACATTGGTAAGATCCAATACCAAGGTCCAAAGTCCACCGACCCATTG TCTTTCAAGTACTACAACCCAGAAGAAGTTATTAACGGTAAGACTATGAGAGAACACTTGAAG TTCGCTTTGTCCTGGTGGCACACCATGGGTGGTGACGGTACTGACATGTTCGGTTGTGGTA CCACTGACAAGACCTGGGGTCAATCTGACCCAGCTGCTAGAGCTAAGGCTAAGGTCGACGC TGCTTTCGAAATCATGGACAAGTTGTCCATTGACTACTACTGTTTCCACGACAGAGACTTGTC TCCAGAATACGGTTCCTTGAAGGCTACTAACGACCAATTGGACATCGTTACCGACTACATTA AGGAAAAGCAAGGTGACAAGTTCAAGTGTTTGTGGGGTACTGCTAAGTGTTTCGACCACCCA AGATTCATGCACGGTGCTGGTACCTCTCCATCCGCTGACGTCTTCGCTTTCTCTGCTGCTCA AATCAAGAAGGCTTTGGAATCCACTGTTAAGTTGGGTGGTAACGGTTACGTCTTCTGGGGTG GTAGAGAAGGTTACGAAACCTTGTTGAACACTAACATGGGTTTGGAATTGGACAACATGGCT AGATTGATGAAGATGGCTGTTGAATACGGTAGATCTATTGGTTTCAAGGGTGACTTCTACATC GAACCAAAGCCAAAGGAACCAACCAAGCACCAATACGACTTCGACACTGCTACCGTCTTGG GTTTCTTGAGAAAGTACGGTTTGGACAAGGACTTCAAGATGAACATTGAAGCTAACCACGCT ACTTTGGCTCAACACACCTTCCAACACGAATTGAGAGTTGCTAGAGACAACGGTGTCTTCGG TTCCATCGACGCTAACCAAGGTGACGTTTTGTTGGGTTGGGACACTGACCAATTCCCAACCA ACATTTACGACACTACCATGTGTATGTACGAAGTCATCAAGGCTGGTGGTTTCACTAACGGT GGTTTGAACTTCGACGCTAAGGCTAGAAGAGGTTCTTTCACCCCAGAAGACATTTTCTACTC CTACATCGCTGGTATGGACGCTTTCGCTTTGGGTTTCAGAGCTGCTTTGAAGTTGATTGAAG ACGGTAGAATCGACAAGTTCGTTGCTGACAGATACGCTTCTTGGAACACTGGTATTGGTGCT GACATCATTGCTGGTAAGGCTGACTTCGCTTCCTTGGAAAAGTACGCTTTGGAAAAGGGTGA AGTCACCGCTTCTTTGTCCTCTGGTAGACAAGAAATGTTGGAATCCATCGTTAACAACGTCTT GTTCTCTTTGTAA
Separately, the gene encoding xylose isomerase from Piromyces strain E2 was synthesized by IDT DNA, Inc. The sequence of this gene is set forth as SEQ ID NO: 34.
SEQ ID NO: 34
ACTAGTAAAAAAATGGCTAAGGAATATTTCCCACAAATTCAAAAGATTAAGTTCGAAGGTAAG GATTCTAAGAATCCATTAGCCTTCCACTACTACGATGCTGAAAAGGAAGTCATGGGTAAGAA AATGAAGGATTGGTTACGTTTCGCCATGGCCTGGTGGCACACTCTTTGCGCCGAAGGTGCT GACCAATTCGGTGGAGGTACAAAGTCTTTCCCATGGAACGAAGGTACTGATGCTATTGAAAT TGCCAAGCAAAAGGTTGATGCTGGTTTCGAAATCATGCAAAAGCTTGGTATTCCATACTACT GTTTCCACGATGTTGATCTTGTTTCCGAAGGTAACTCTATTGAAGAATACGAATCCAACCTTA AGGCTGTCGTTGCTTACCTCAAGGAAAAGCAAAAGGAAACCGGTATTAAGCTTCTCTGGAGT ACTGCTAACGTCTTCGGTCACAAGCGTTACATGAACGGTGCCTCCACTAACCCAGACTTTGA TGTTGTCGCCCGTGCTATTGTTCAAATTAAGAACGCCATAGACGCCGGTATTGAACTTGGTG CTGAAAACTACGTCTTCTGGGGTGGTCGTGAAGGTTACATGAGTCTCCTTAACACTGACCAA AAGCGTGAAAAGGAACACATGGCCACTATGCTTACCATGGCTCGTGACTACGCTCGTTCCAA GGGATTCAAGGGTACTTTCCTCATTGAACCAAAGCCAATGGAACCAACCAAGCACCAATACG ATGTTGACACTGAAACCGCTATTGGTTTCCTTAAGGCCCACAACTTAGACAAGGACTTCAAG GTCAACATTGAAGTTAACCACGCTACTCTTGCTGGTCACACTTTCGAACACGAACTTGCCTG TGCTGTTGATGCTGGTATGCTCGGTTCCATTGATGCTAACCGTGGTGACTACCAAAACGGTT GGGATACTGATCAATTCCCAATTGATCAATACGAACTCGTCCAAGCTTGGATGGAAATCATC CGTGGTGGTGGTTTCGTTACTGGTGGTACCAACTTCGATGCCAAGACTCGTCGTAACTCTAC TGACCTCGAAGACATCATCATTGCCCACGTTTCTGGTATGGATGCTATGGCTCGTGCTCTTG AAAACGCTGCCAAGCTCCTCCAAGAATCTCCATACACCAAGATGAAGAAGGAACGTTACGCT TCCTTCGACAGTGGTATTGGTAAGGACTTTGAAGATGGTAAGCTCACCCTCGAACAAGTTTA CGAATACGGTAAGAAGAACGGTGAACCAAAGCAAACTTCTGGTAAGCAAGAACTCTACGAAG CTATTGTTGCCATGTACCAATAGTAGCTCGAG
The amino acid sequence of the xylose isomerase from Piromyces strain E2 is set forth as SEQ ID NO. 35.
SEQ ID NO. 35
MAKEYFPQIQKIKFEGKDSKNPLAFHYYDAEKEVMGKKMKDWLRFAMAWWHTLCAEG ADQFGGGTKSFPWNEGTDAIEIAKQKVDAGFEIMQKLGIPYYCFHDVDLVSEGNSIEEY ESNLKAVVAYLKEKQKETGIKLLWSTANVFGHKRYMNGASTNPDFDVVARAIVQIKNAI DAGIELGAENYVFWGGREGYMSLLNTDQKREKEHMATMLTMARDYARSKGFKGTFLI EPKPMEPTKHQYDVDTETAIGFLKAHNLDKDFKVNIEVNHATLAGHTFEHELACAVDAG MLGSIDANRGDYQNGWDTDQFPIDQYELVQAWMEIIRGGGFVTGGTNFDAKTRRNST DLEDIIIAHVSGMDAMARALENAAKLLQESPYTKMKKERYASFDSGIGKDFEDGKLTLE QVYEYGKKNGEPKQTSGKQELYEAIVAMYQ
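As an illustrative consistency check between the two listings, the start of the coding region in SEQ ID NO: 34 can be translated and compared with the start of SEQ ID NO. 35. The sketch below assumes the standard genetic code and assumes that the open reading frame begins at the first ATG after the ACTAGT (SpeI) site and the six adenosines; the function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 60 nucleotides of the coding region of SEQ ID NO: 34 (from the ATG).
XI_P_5PRIME = "ATGGCTAAGGAATATTTCCCACAAATTCAAAAGATTAAGTTCGAAGGTAAGGATTCTAAG"
# First 20 residues of SEQ ID NO. 35.
XI_P_NTERM = "MAKEYFPQIQKIKFEGKDSK"

translated = translate(XI_P_5PRIME)
print(translated, translated == XI_P_NTERM)
```

The same function could be applied to the full coding regions to confirm that a codon-optimized gene and its native counterpart encode the same polypeptide.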
For detection purposes, each gene was PCR amplified using a 3' oligonucleotide that added a 6-HIS tag onto the C-terminal end of each xylose isomerase gene. The oligonucleotides used for this purpose are set forth below.
Gene | Primer Name | Primer Sequence
XI-R-hotrod | 4329 On63 for (SEQ ID NO. 36) | agttaagtgagtaaactagtgaattccagagaaaataaaacattgtttacaataga
XI-R-hotrod | 4329-HIS REV (SEQ ID NO. 37) | agtcaagtctcgagtcaatggtgatggtggtgatgcagagaaaataaaacattgtttac
XI-R-native | KAS/5-XI-RF-NATIVE (SEQ ID NO. 38) | actagtatggaatttttcagcaatatcggtaaaattc
XI-R-native | KAS/3-XI-RF-NATIVE-HIS (SEQ ID NO. 39) | ctcgagrtacagactgaaaagaacgttatttacg
XI-R-coop | KAS/5-XI-RF-COOP (SEQ ID NO. 40) | actagtatggaattcttctctaacattgg
XI-R-coop | KAS/3-XI-RF-COOP-HIS (SEQ ID NO. 41) | ctcgagttacaaagagaacaagacgttgttaacgatgg
XI-P | XI-P_Native FL 5' (SEQ ID NO. 42) | actagtaaaaaaatggctaaggaatatttcccacaaattcaaaag
XI-P | XI-P_Native FL 3'His-tag (SEQ ID NO. 43) | atgactcgagctactaatgatgatgatgatgatgttggtacatggcaacaatagcttcg
Each xylose isomerase gene described above (with or without the HIS tag) was cloned into the yeast expression vector p426GPD (Mumberg et al., 1995, Gene 156:119-122; obtained from ATCC #87361; PubMed: 7737504) using the SpeI and XhoI sites located at the 5' and 3' ends of each gene. Each of the bacterial vectors containing a xylose isomerase gene (with or without the 6-HIS C-terminal tag) and the p426GPD yeast expression vector were digested with SpeI and XhoI. The generated fragments were gel extracted using a Qiagen gel purification kit (Qiagen, Valencia, CA), and the p426GPD vector reaction was cleaned up using a Qiagen PCR purification kit. About 30ng of each fragment was ligated to 50ng of the p426GPD vector using T4 DNA ligase (Fermentas, Glen Burnie, MD) in a 10 μl reaction volume overnight at 16°C, transformed into NEB-5α competent cells (NEB, Ipswich, MA), and plated onto LB media with ampicillin (100 μg/ml). Constructs were confirmed by sequence analysis (GeneWiz, La Jolla, CA).
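How a 3' primer adds the tag can be illustrated by reverse-complementing the 4329-HIS REV primer (SEQ ID NO. 37) and looking for six histidine codons followed by a stop codon. This is a sketch only: the helper names are hypothetical, and the pattern search is not frame-aware, so it shows that the motif is present rather than proving the reading frame.

```python
import re

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, in lowercase."""
    return seq.lower().translate(COMPLEMENT)[::-1]


# 4329-HIS REV (SEQ ID NO. 37), written 5' to 3' as listed in the table.
HIS_REV_PRIMER = "agtcaagtctcgagtcaatggtgatggtggtgatgcagagaaaataaaacattgtttac"

coding_strand = reverse_complement(HIS_REV_PRIMER)

# Histidine is encoded by CAT or CAC; look for six in a row, then a stop codon.
his_tag = re.search(r"(?:ca[tc]){6}(?:taa|tag|tga)", coding_strand)

print(coding_strand)
print(his_tag.group(0) if his_tag else "no His6-stop motif found")
```

Read on the coding strand, the primer contributes gene-specific sequence, then six histidine codons and a stop codon, followed by the XhoI recognition sequence (ctcgag) used for cloning.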
A haploid Saccharomyces cerevisiae strain (BY4742; ATCC catalog number 201389) was cultured in YPD media (10g Yeast Extract, 20g Bacto-Peptone, 20g Glucose, 1 L total) at about 30°C. Separate aliquots of these cultured cells were transformed with a plasmid construct containing a xylose isomerase gene or with vector alone.
Transformation was accomplished using the Zymo frozen yeast transformation kit (Catalog number T2001; Zymo Research Corp., Orange, CA 92867). To about 50 μl of cells was added approximately 0.5-1 μg plasmid DNA and the cells were cultured on SC drop out media with glucose minus uracil (about 20g glucose; about 2.21g SC drop-out mix [described below], about 6.7g yeast nitrogen base, all in about 1 L of water); this mixture was cultured for 2-3 days at about 30°C. SC drop-out mix contained the following ingredients (Sigma); all indicated weights are approximate:
0.4g Adenine hemisulfate
3.5g Arginine
1g Glutamic Acid
0.433g Histidine
0.4g Myo-Inositol
5.2g Isoleucine
2.63g Leucine
0.9g Lysine
1.5g Methionine
0.8g Phenylalanine
1.1g Serine
1.2g Threonine
0.8g Tryptophan
0.2g Tyrosine
1.2g Valine
For expression and activity analysis, cultures expressing the various xylose isomerase wild type and variant gene constructs were grown in about 100ml SC-Dextrose (2%) at about 30°C to an OD600 of about 4.0.
S. cerevisiae cultures were lysed using YPER-Plus reagent (Thermo Scientific, San Diego, CA; catalog number 78999) according to the manufacturer's instructions.
Quantification of the lysates was performed using the Coomassie-Plus kit (Thermo Scientific, San Diego, CA; Catalog number 23236) as directed by the manufacturer. About 5-10 μg of total cell extract was used for SDS-gel (NuPage 4-12% Bis-Tris gels, Life Technologies, Carlsbad, CA) and native gel electrophoresis and for native Western blot analyses.
SDS-PAGE gels were run according to the manufacturer's recommendation using NuPage MES-SDS Running Buffer at 1 X concentration with the addition of NuPage antioxidant into the cathode chamber at a 1 X concentration. Novex Sharp Protein Standards (Life Technologies, Carlsbad, CA) were used as standards. For Western analysis, gels were transferred onto a nitrocellulose membrane (0.45 micron, Thermo Scientific, San Diego, CA) using Western blotting filter paper (Thermo Scientific) and a Bio-Rad Mini Trans-Blot Cell (BioRad, Hercules, CA) system for approximately 90 minutes at 40V. Following transfer, the membrane was washed in 1 X PBS (EMD, San Diego, CA), 0.05% Tween-20 (Fisher Scientific, Fairlawn, NJ) for 2-5 minutes with gentle shaking. The membrane was blocked in 3% BSA dissolved in 1 X PBS and 0.05% Tween-20 at room temperature for about 2 hours with gentle shaking. The membrane was washed once in 1 X PBS and 0.05% Tween-20 for about 5 minutes with gentle shaking. The membrane was then incubated at room temperature for about 1 hour with gentle shaking with a 1:5000 dilution of primary antibody (Ms mAB to 6x His Tag, AbCam, Cambridge, MA) in 0.3% BSA (Fraction V, EMD, San Diego, CA) dissolved in 1 X PBS and 0.05% Tween-20. The membrane was then washed three times for 5 minutes each with 1 X PBS and 0.05% Tween-20 with gentle shaking. The secondary antibody [Dnk pAb to Ms IgG (HRP), AbCam, Cambridge, MA] was used at 1:15000 dilution in 0.3% BSA and allowed to incubate for about 90 minutes at room temperature with gentle shaking. The membrane was washed three times for about 5 minutes using 1 X PBS and 0.05% Tween-20 with gentle shaking. The membrane was then incubated with 5ml of Supersignal West Pico Chemiluminescent substrate (Thermo Scientific, San Diego, CA) for 1 minute and then was exposed to a phosphorimager (Bio-Rad Universal Hood II, Bio-Rad, Hercules, CA) for about 10-100 seconds.
The results are shown in FIG. 7. As can be seen, the wild type R. flavefaciens xylose isomerase and the wild type Piromyces xylose isomerase are both expressed in the soluble fraction of the cells. The expected size of the R. flavefaciens xylose isomerase polypeptide is approximately 49.8kDa.
Example 10: In vitro Xylose Isomerase Activity Assays
Enzyme assays of the various xylose isomerase variants were performed according to Kuyper et al. (FEMS Yeast Res., 4:69 [2003]) with a few modifications. Approximately 20 μg of soluble whole cell extract from each transformed cell line, prepared using Y-PER plus reagent as described above, was incubated in a solution containing about 100mM Tris, pH 7.5, 10mM MgCl2, 0.15mM NADH (Sigma, St. Louis, MO), and 2U sorbitol dehydrogenase (SDH) (Roche, Indianapolis, IN) at about 30°C. To start the reaction, about 100 μl of xylose was added at various final concentrations of about 40 to about 500mM. A Beckman DU-800 spectrophotometer was utilized with an Enzyme Mechanism software package (Beckman Coulter, Inc, Brea, CA), and the change in the A340 was monitored for 2-3 minutes. Assays were repeated as described above in the presence of about 0 to about 50mM xylitol, an inhibitor of xylose isomerase, in order to determine the Ki. Regular assays (no xylitol) were done independently about 5 to 10 times over the entire range of xylose concentrations and 2 times in the presence of the entire range of xylitol concentrations. The results are set forth in the table below.
Figure imgf000184_0001
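The arithmetic behind a coupled assay of this kind can be sketched as follows. The sketch assumes one NADH is oxidized per xylulose formed, uses the commonly cited NADH extinction coefficient at 340 nm (6.22 per mM per cm), and assumes simple competitive inhibition by xylitol; the reaction volume, path length, function names and all numerical inputs are hypothetical and are not taken from the assay described above.

```python
NADH_EXTINCTION_MM = 6.22  # absorbance per mM per cm at 340 nm


def specific_activity(delta_a340_per_min, reaction_volume_ml, protein_mg,
                      path_cm=1.0):
    """Convert an A340 slope into umol NADH oxidized per minute per mg protein."""
    rate_mm_per_min = abs(delta_a340_per_min) / (NADH_EXTINCTION_MM * path_cm)
    umol_per_min = rate_mm_per_min * reaction_volume_ml  # 1 mM equals 1 umol/ml
    return umol_per_min / protein_mg


def competitive_rate(s_mm, i_mm, vmax, km_mm, ki_mm):
    """Michaelis-Menten rate in the presence of a competitive inhibitor."""
    return vmax * s_mm / (km_mm * (1.0 + i_mm / ki_mm) + s_mm)


def ki_from_apparent_km(km_mm, km_apparent_mm, i_mm):
    """Ki from the shift in apparent Km: Km_app = Km * (1 + [I] / Ki)."""
    return i_mm / (km_apparent_mm / km_mm - 1.0)


# Hypothetical inputs: slope of -0.0622 A340/min, 1 ml reaction, 0.020 mg extract.
activity = specific_activity(-0.0622, 1.0, 0.020)
# Hypothetical inputs: Km doubles from 50 mM to 100 mM at 10 mM inhibitor.
ki = ki_from_apparent_km(50.0, 100.0, 10.0)
print(round(activity, 3), round(ki, 3))
```

If the inhibition were not competitive, the relationship between the apparent Km and Ki used here would not hold and a different model would be needed.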
Example 11: Construction of Ruminococcus Xylose Isomerase Chimeric Variants
Several xylose isomerase gene variants were designed and constructed in which 6 adenosine bases were added to each variant directly 5' of the ATG "start" codon. Additionally, 15, 30, 45 or 60 base pairs of the 5' end of the Ruminococcus xylose isomerase gene were replaced with various portions of the 5' end of the Piromyces xylose isomerase gene to create "chimeric" or "hybrid" xylose isomerase genes.
Diagrams representative of non-limiting xylose isomerase chimeric variant gene embodiments are shown in FIG. 8.
PCR amplification was used to generate novel chimeric constructs. For all PCR reactions, approximately 0.2 μmol of each oligonucleotide (5' and 3') was added to 25-30ng of the appropriate purified DNA template with 0.2mM dNTPs, 1 X Pfu Ultra II buffer and 1 unit (U) Pfu Ultra II polymerase (Agilent). The PCR reactions were thermocycled as follows: 95°C for 10 minutes, followed by 30 cycles of 95°C for 10 seconds, 58°C for 30 seconds, and 72°C for 30 seconds. A 5 minute extension at 72°C completed the amplification.
5' Oligonucleotides:
Figure imgf000185_0001
3'-Oligonucleotides:
KAS/3-XI-RF-NATIVE (SEQ ID NO. 53) ctcgagttacagactgaaaagaacgttatttacg
KAS/3-XI-RF-NATIVE-HIS (SEQ ID NO. 54) ctcgagttacagactgaaaagaacgttatttacg
The novel XI-R constructs were generated using PCR with the relevant primers and template gene. The 5' primer KAS/XI-R-6A and either 3' primer KAS/3-XI-RF-NATIVE or 3' primer KAS/3-XI-RF-NATIVE-HIS were used in combination with the full length native Ruminococcus xylose isomerase (XI-R) gene to generate the constructs referred to as "XI-Rf-6A" and "XI-Rf-6AHis".
To generate the chimeric XI-Rp5 gene, the 5' primer KAS/XI-R-P1-5 and either 3' primer KAS/3-XI-RF-NATIVE or 3' primer KAS/3-XI-RF-NATIVE-HIS were used in combination with the full length native Ruminococcus xylose isomerase gene. The chimeric XI-Rp5 gene includes the first 5 amino acids of the Piromyces xylose isomerase (XI-P) polypeptide followed by amino acids 6 to 438 of the native Ruminococcus xylose isomerase.
To generate the chimeric XI-Rp10 gene, the 5' primer KAS/XI-R-P6-10 and 3' primer KAS/3-XI-RF-NATIVE were first used to add nucleotides 16-30 from the XI-P gene to the 5' end of the XI-R gene, keeping the remainder of the XI-R gene in-frame. The chimeric XI-Rp10 gene includes the first 10 amino acids of the Piromyces xylose isomerase followed by amino acids 11 to 438 of the Ruminococcus xylose isomerase. Following PCR purification of the resulting XI-Rp6-10 amplified product, 5' primer KAS/XI-R-P1-10 and either the 3' primer KAS/3-XI-RF-NATIVE or the 3' primer KAS/3-XI-RF-NATIVE-HIS oligonucleotides were used to add additional sequences.
To generate the chimeric XI-Rp15 gene, the 5' primer KAS/XI-R-P11-15 and 3' primer KAS/3-XI-RF-NATIVE were first used on the XI-Rp10 construct to add nucleotides 16 to 45 from the XI-P gene to the 5' end of the XI-R native gene. The chimeric XI-Rp15 gene includes the first 15 amino acids of the Piromyces xylose isomerase followed by amino acids 16 to 438 of the Ruminococcus xylose isomerase. Following PCR purification of the resulting XI-Rp6-15 amplified product, 5' primer KAS/XI-R-P1-15 and either 3' primer KAS/3-XI-RF-NATIVE or 3' primer KAS/3-XI-RF-NATIVE-HIS oligonucleotides were used to add additional sequences.
To generate the chimeric XI-Rp20 gene, the 5' primer KAS/XI-R-P10-20 and 3' primer KAS/3-XI-RF-NATIVE were used on the XI-Rp15 construct to add nucleotides 30-60 to the 5' end of the XI-R native gene. The chimeric XI-Rp20 gene includes the first 20 amino acids of the Piromyces xylose isomerase followed by amino acids 21 to 438 of the Ruminococcus xylose isomerase. Following PCR purification of the resulting XI-Rp10-20 amplified product, 5' primer KAS/XI-R-P1-15 and either 3' primer KAS/3-XI-RF-NATIVE or 3' primer KAS/3-XI-RF-NATIVE-HIS oligonucleotides were used to add additional sequences.
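The extent of the Piromyces-derived N-terminus in each chimera can be checked against the polypeptide sequences listed in this example. The sketch below compares the first 25 residues of SEQ ID NO. 35 (XI-P) with the first 25 residues of SEQ ID NOs. 56, 58, 60 and 62; the function and variable names are hypothetical.

```python
# N-terminal residues copied from the sequence listings in this document.
XI_P_NTERM = "MAKEYFPQIQKIKFEGKDSKNPLAF"  # SEQ ID NO. 35

CHIMERA_NTERMS = {
    "XI-Rp5": "MAKEYFSNIGKIQYQGPKSTDPLSF",   # SEQ ID NO. 56
    "XI-Rp10": "MAKEYFPQIQQYQGPKSTDPLSFKY",  # SEQ ID NO. 58
    "XI-Rp15": "MAKEYFPQIQKIKFEKSTDPLSFKY",  # SEQ ID NO. 60
    "XI-Rp20": "MAKEYFPQIQKIKFEGKDSKLSFKY",  # SEQ ID NO. 62
}


def shared_prefix_length(a, b):
    """Count identical residues from the N-terminus up to the first mismatch."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


prefix_lengths = {
    name: shared_prefix_length(XI_P_NTERM, seq)
    for name, seq in CHIMERA_NTERMS.items()
}
for name, length in prefix_lengths.items():
    print(name, length)
```

For XI-Rp10, XI-Rp15 and XI-Rp20 the shared prefix is 10, 15 and 20 residues. For XI-Rp5 it is 6 rather than 5, because the residue that follows the five Piromyces residues in the listed sequence (F) happens to be identical to residue 6 of the Piromyces polypeptide.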
Each of the novel chimeric xylose isomerase genes (with and without the C-terminal 6-HIS tag) was cloned into the bacterial cloning vector pCR Blunt II TOPO (Life Technologies, Carlsbad, CA) according to the manufacturer's recommendations.
Following sequence verification (GeneWiz, La Jolla, CA), the approximate 1330bp SpeI-XhoI fragment from each construct was subcloned into the yeast expression vector p426GPD by first extracting each fragment from a gel slice using a gel purification kit (Qiagen, Valencia, CA), and then preparing the p426GPD vector for ligation by purifying it using a PCR purification kit (Qiagen, Valencia, CA), according to the manufacturer's instructions. About 30ng of each of the chimeric genes was separately ligated to about 50ng of the p426GPD vector using T4 DNA ligase (Fermentas, Glen Burnie, MD) in a 10 μl reaction volume overnight at about 16°C, followed by transformation using standard protocols into NEB-5α competent cells (NEB, Ipswich, MA). The transformed cell culture was plated onto LB media with ampicillin (100 μg/ml). The constructs containing the chimeric genes were confirmed by sequence analysis (GeneWiz, La Jolla, CA).
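For orientation, the insert to vector molar ratio implied by these masses can be estimated as below. The insert length (about 1330bp) and the masses (about 30ng and 50ng) come from the text; the vector length of 6600bp is an assumption made purely for illustration and is not stated in the document, and the function name is hypothetical.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_bp, vector_ng, vector_bp):
    """Molar ratio of insert to vector for double-stranded DNA.

    Moles scale with mass divided by length, so the per-base-pair
    molecular weight cancels out of the ratio.
    """
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


ASSUMED_VECTOR_BP = 6600  # assumption: approximate size of the digested vector

ratio = insert_to_vector_molar_ratio(30, 1330, 50, ASSUMED_VECTOR_BP)
print(f"insert:vector is about {ratio:.1f}:1")
```

Under that assumed vector size, the stated masses correspond to roughly a threefold molar excess of insert over vector.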
A haploid Saccharomyces cerevisiae strain (BY4742; ATCC catalog number 201389) was cultured in YPD media (10g Yeast Extract, 20g Bacto-Peptone, 20g Glucose, 1 L total) at about 30°C. Separate aliquots of these cultured cells were transformed with plasmid constructs containing the novel xylose isomerase chimeric genes as well as with the Piromyces and Ruminococcus native gene constructs described herein.
Transformation was performed using the Zymo frozen yeast transformation kit (Catalog number T2001; Zymo Research Corp., Orange, CA 92867). About 0.5 μg to about 1 μg plasmid DNA was added to about 50 μl of cells, and the transformed cells were cultured on SC drop out media with glucose minus uracil (e.g., about 20g glucose, about 2.21g SC drop-out mix, and about 6.7g yeast nitrogen base per 1 L of water) for 2-3 days at about 30°C.
S. cerevisiae cultures were lysed using YPER-Plus reagent (Thermo Scientific, San Diego, CA; catalog number 78999) according to the manufacturer's instructions.
Quantification of the lysates was performed using the Coomassie-Plus kit (Thermo Scientific, San Diego, CA, catalog number 23236) as directed by the manufacturer. About 5 to 10 μg of total cell extract was used for SDS-gel (NuPage 4-12% Bis-Tris gels, Life Technologies, Carlsbad, CA) and native gel electrophoresis and Western blot analyses. SDS-PAGE gels were run according to the manufacturer's recommendation using NuPage MES-SDS Running Buffer (Life Technologies, Carlsbad, CA) at 1 X concentration with the addition of NuPage antioxidant to the cathode chamber at a 1 X concentration. Novex Sharp Protein Standards (Life Technologies, Carlsbad, CA) were used as standards.
For Western analysis, gels were transferred onto a nitrocellulose membrane (0.45 micron, Thermo Scientific, San Diego, CA) using Western blotting filter paper (Thermo Scientific) and a Bio-Rad Mini Trans-Blot Cell (BioRad, Hercules, CA) system for approximately 90 minutes at 40V. Following transfer, the membrane was washed in 1 X PBS (EMD, San Diego, CA), 0.05% Tween-20 (Fisher Scientific, Fairlawn, NJ) for 2-5 minutes with gentle shaking. The membrane was blocked in about 3% BSA dissolved in 1 X PBS and 0.05% Tween-20 at room temperature for about 2 hours with gentle shaking. The membrane was washed once in 1 X PBS and 0.05% Tween-20 for about 5 minutes with gentle shaking. The membrane was then incubated at room temperature for about 1 hour with gentle shaking with an approximately 1:5000 dilution of primary antibody (anti-His mouse monoclonal antibody, AbCam, Cambridge, MA) in about 0.3% BSA (Fraction V, EMD, San Diego, CA) dissolved in 1 X PBS and 0.05% Tween-20. The membrane was then washed three times for about 5 minutes each with 1 X PBS and 0.05% Tween-20 with gentle shaking. The secondary antibody (donkey anti-mouse IgG polyclonal antibody linked to horseradish peroxidase, AbCam, Cambridge, MA) was used at about a 1:15000 dilution in 0.3% BSA and allowed to incubate for about 90 minutes at room temperature with gentle shaking. The membrane was washed three times for about 5 minutes each time using 1 X PBS and 0.05% Tween-20 with gentle shaking. The membrane was then incubated with 5ml of Supersignal West Pico Chemiluminescent substrate (Thermo Scientific, San Diego, CA) for 1 minute and then was exposed to a phosphorimager (Bio-Rad Universal Hood II, Bio-Rad, Hercules, CA) for about 10-100 seconds. The expected size of the chimeric xylose isomerase constructs is approximately 49.8kDa. The expected size of the xylose isomerase protein is approximately 50.2kDa. The results of Western blot analysis are shown in FIG. 9. For each protein 2 lanes were run: T = total protein from whole cell extract, S = soluble portion of the whole cell extract.
As can be seen in FIG. 9, adding 6 adenosine bases directly upstream of the 5' end of the xylose isomerase Ruminococcus wild type gene did not improve expression of the polypeptide. However, replacing the 5' end of the Ruminococcus wild type xylose isomerase gene with 15, 30 or 45 of the 5' base pairs of the Piromyces wild type xylose isomerase gene improved expression of the enzyme.
Enzyme assays of the various novel xylose isomerase chimeric polypeptides were performed according to Kuyper et al. (FEMS Yeast Res., 4:69 [2003]) with a few modifications as described above. Approximately 2 μg of soluble whole cell extract from each transformed cell line, prepared using Y-PER plus reagent as described above, was incubated in a solution containing about 100mM Tris, pH 7.5, 10mM MgCl2, 0.15mM NADH (Sigma, St. Louis, MO), and 2U sorbitol dehydrogenase (SDH) (Roche, Indianapolis, IN) at about 30°C. To start the reaction, about 100 μl of xylose was added at various final concentrations of 40-500mM. A Beckman DU-800 spectrophotometer was utilized with an Enzyme Mechanism software package (Beckman Coulter, Inc, Brea, CA), and the change in the A340 was monitored for 2-3 minutes. The results of the assays are set forth in the table below.
Figure imgf000189_0001
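One simple way to reduce rate measurements taken over a range of xylose concentrations to kinetic constants is a Hanes-Woolf linearization, sketched below. This is not the method of the Enzyme Mechanism software package, whose fitting procedure is not described here; the data are synthetic values generated from hypothetical constants, and the function name is illustrative.

```python
def hanes_woolf(substrate_mm, rates):
    """Estimate (Km, Vmax) from a least-squares line through S/v versus S.

    Hanes-Woolf form of Michaelis-Menten: S/v = S/Vmax + Km/Vmax.
    """
    ys = [s / v for s, v in zip(substrate_mm, rates)]
    n = len(substrate_mm)
    mean_x = sum(substrate_mm) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(substrate_mm, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Hypothetical constants, used only to generate noise-free synthetic rates.
TRUE_KM_MM, TRUE_VMAX = 60.0, 1.2
xylose_mm = [40.0, 80.0, 160.0, 320.0, 500.0]  # within the assayed 40-500mM range
rates = [TRUE_VMAX * s / (TRUE_KM_MM + s) for s in xylose_mm]

km_est, vmax_est = hanes_woolf(xylose_mm, rates)
print(round(km_est, 3), round(vmax_est, 3))
```

With real, noisy absorbance data a nonlinear fit of the Michaelis-Menten equation is usually preferred, since linearizations weight the measurement errors unevenly.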
The results indicate that substituting 30 base pairs of DNA from the 5' end of the Ruminococcus xylose isomerase gene with the first 15 base pairs of the Piromyces wild type xylose isomerase gene increased both the specific activity and the expression level to a level comparable to that of the wild type Piromyces xylose isomerase. The DNA and amino acid sequences for each chimeric gene are set forth below as SEQ ID NOs. 55 to 62. Small, bold "a" nucleotides indicate the 6 added adenosines. Large capital bold "A, T, G or C" nucleotides indicate the portion of the chimeric sequences donated by Piromyces and combined with the Ruminococcus sequence (shown as non-bold nucleotides).
SEQ ID NO. 55: Xl-Rp5 DNA aaaaaaATGGCTAAGGAATATTTCAGCAATATCGGTAAAATTCAGTATCAGGGACCAAAAA GTACTGATCCTCTCTCATTTAAGTACTATAACCCTGAAGAAGTCATCAACGGAAAGACAATGC GCGAGCATCTGAAGTTCGCTCTTTCATGGTGGCACACAATGGGCGGCGACGGAACAGATAT GTTCGGCTGCGGCACAACAGACAAGACCTGGGGACAGTCCGATCCCGCTGCAAGAGCAAA GGCTAAGGTTGACGCAGCATTCGAGATCATGGATAAGCTCTCCATTGACTACTATTGTTTCC ACGATCGCGATCTTTCTCCCGAGTATGGCAGCCTCAAGGCTACCAACGATCAGCTTGACATA GTTACAGACTATATCAAGGAGAAGCAGGGCGACAAGTTCAAGTGCCTCTGGGGTACAGCAA AGTGCTTCGATCATCCAAGATTCATGCACGGTGCAGGTACATCTCCTTCTGCTGATGTATTC GCTTTCTCAGCTGCTCAGATCAAGAAGGCTCTGGAGTCAACAGTAAAGCTCGGCGGTAACG GTTACGTTTTCTGGGGCGGACGTGAAGGCTATGAGACACTTCTTAATACAAATATGGGACTC GAACTCGACAATATGGCTCGTCTTATGAAGATGGCTGTTGAGTATGGACGTTCGATCGGCTT CAAGGGCGACTTCTATATCGAGCCCAAGCCCAAGGAGCCCACAAAGCATCAGTACGATTTC GATACAGCTACTGTTCTGGGATTCCTCAGAAAGTACGGTCTCGATAAGGATTTCAAGATGAA TATCGAAGCTAACCACGCTACACTTGCTCAGCATACATTCCAGCATGAGCTCCGTGTTGCAA GAGACAATGGTGTGTTCGGTTCTATCGACGCAAACCAGGGCGACGTTCTTCTTGGATGGGA TACAGACCAGTTCCCCACAAATATCTACGATACAACAATGTGTATGTATGAAGTTATCAAGGC AGGCGGCTTCACAAACGGCGGTCTCAACTTCGACGCTAAGGCACGCAGAGGGAGCTTCACT CCCGAGGATATCTTCTACAGCTATATCGCAGGTATGGATGCATTTGCTCTGGGCTTCAGAGC TGCTCTCAAGCTTATCGAAGACGGACGTATCGACAAGTTCGTTGCTGACAGATACGCTTCAT GGAATACCGGTATCGGTGCAGACATAATCGCAGGTAAGGCAGATTTCGCATCTCTTGAAAAG TATGCTCTTGAAAAGGGCGAGGTTACAGCTTCACTCTCAAGCGGCAGACAGGAAATGCTGG AGTCTATCGTAAATAACGTTCTTTTCAGTCTGTAA
SEQ ID NO. 56: XI-Rp5 Polypeptide
MAKEYFSNIGKIQYQGPKSTDPLSFKYYNPEEVINGKTMREHLKFALSWWHTMGGDGT DMFGCGTTDKTWGQSDPAARAKAKVDAAFEIMDKLSIDYYCFHDRDLSPEYGSLKATN DQLDIVTDYIKEKQGDKFKCLWGTAKCFDHPRFMHGAGTSPSADVFAFSAAQIKKALES TVKLGGNGYVFWGGREGYETLLNTNMGLELDNMARLMKMAVEYGRSIGFKGDFYIEP KPKEPTKHQYDFDTATVLGFLRKYGLDKDFKMNIEANHATLAQHTFQHELRVARDNGV FGSIDANQGDVLLGWDTDQFPTNIYDTTMCMYEVIKAGGFTNGGLNFDAKARRGSFTP EDIFYSYIAGMDAFALGFRAALKLIEDGRIDKFVADRYASWNTGIGADIIAGKADFASLEK YALEKGEVTASLSSGRQEMLESIVNNVLFSL
SEQ ID NO. 57: XI-Rp10 DNA
aaaaaaaTGGCTAAGGAATATTTCCCACAAATTCAACAGTATCAGGGACCAAAAAGTACT
GATCCTCTCTCATTTAAGTACTATAACCCTGAAGAAGTCATCAACGGAAAGACAATGCGCGA GCATCTGAAGTTCGCTCTTTCATGGTGGCACACAATGGGCGGCGACGGAACAGATATGTTC GGCTGCGGCACAACAGACAAGACCTGGGGACAGTCCGATCCCGCTGCAAGAGCAAAGGCT AAGGTTGACGCAGCATTCGAGATCATGGATAAGCTCTCCATTGACTACTATTGTTTCCACGAT CGCGATCTTTCTCCCGAGTATGGCAGCCTCAAGGCTACCAACGATCAGCTTGACATAGTTAC AGACTATATCAAGGAGAAGCAGGGCGACAAGTTCAAGTGCCTCTGGGGTACAGCAAAGTGC TTCGATCATCCAAGATTCATGCACGGTGCAGGTACATCTCCTTCTGCTGATGTATTCGCTTTC TCAGCTGCTCAGATCAAGAAGGCTCTGGAGTCAACAGTAAAGCTCGGCGGTAACGGTTACG TTTTCTGGGGCGGACGTGAAGGCTATGAGACACTTCTTAATACAAATATGGGACTCGAACTC GACAATATGGCTCGTCTTATGAAGATGGCTGTTGAGTATGGACGTTCGATCGGCTTCAAGGG GGACTTCTATATCGAGCCCAAGCCCAAGGAGCCCACAAAGCATCAGTACGATTTCGATACAG CTACTGTTCTGGGATTCCTCAGAAAGTACGGTCTCGATAAGGATTTCAAGATGAATATCGAA GCTAACCACGCTACACTTGCTCAGCATACATTCCAGCATGAGCTCCGTGTTGCAAGAGACAA TGGTGTGTTCGGTTCTATCGACGCAAACCAGGGCGACGTTCTTCTTGGATGGGATACAGAC CAGTTCCCCACAAATATCTACGATACAACAATGTGTATGTATGAAGTTATCAAGGCAGGCGG CTTCACAAACGGCGGTCTCAACTTCGACGCTAAGGCACGCAGAGGGAGCTTCACTCCCGAG GATATCTTCTACAGCTATATCGCAGGTATGGATGCATTTGCTCTGGGCTTCAGAGCTGCTCT CAAGCTTATCGAAGACGGACGTATCGACAAGTTCGTTGCTGACAGATACGCTTCATGGAATA CCGGTATCGGTGCAGACATAATCGCAGGTAAGGCAGATTTCGCATCTCTTGAAAAGTATGCT CTTGAAAAGGGCGAGGTTACAGCTTCACTCTCAAGCGGCAGACAGGAAATGCTGGAGTCTA TCGTAAATAACGTTCTTTTCAGTCTGTAA
SEQ ID NO. 58: XI-Rp10 Polypeptide
MAKEYFPQIQQYQGPKSTDPLSFKYYNPEEVINGKTMREHLKFALSWWHTMGGDGTD MFGCGTTDKTWGQSDPAARAKAKVDAAFEIMDKLSIDYYCFHDRDLSPEYGSLKATND QLDIVTDYIKEKQGDKFKCLWGTAKCFDHPRFMHGAGTSPSADVFAFSAAQIKKALEST VKLGGNGYVFWGGREGYETLLNTNMGLELDNMARLMKMAVEYGRSIGFKGDFYIEPK PKEPTKHQYDFDTATVLGFLRKYGLDKDFKMNIEANHATLAQHTFQHELRVARDNGVF GSIDANQGDVLLGWDTDQFPTNIYDTTMCMYEVIKAGGFTNGGLNFDAKARRGSFTPE DIFYSYIAGMDAFALGFRAALKLIEDGRIDKFVADRYASWNTGIGADIIAGKADFASLEKY ALEKGEVTASLSSGRQEMLESIVNNVLFSL
SEQ ID NO. 59: XI-Rp15 DNA
aaaaaaaTGGCTAAGGAATATTTCCCACAAATTCAAAAGATTAAGTTCGAAAAAAGTA
CTGATCCTCTCTCATTTAAGTACTATAACCCTGAAGAAGTCATCAACGGAAAGACAATGCGC GAGCATCTGAAGTTCGCTCTTTCATGGTGGCACACAATGGGCGGCGACGGAACAGATATGT TCGGCTGCGGCACAACAGACAAGACCTGGGGACAGTCCGATCCCGCTGCAAGAGCAAAGG CTAAGGTTGACGCAGCATTCGAGATCATGGATAAGCTCTCCATTGACTACTATTGTTTCCAC GATCGCGATCTTTCTCCCGAGTATGGCAGCCTCAAGGCTACCAACGATCAGCTTGACATAGT TACAGACTATATCAAGGAGAAGCAGGGCGACAAGTTCAAGTGCCTCTGGGGTACAGCAAAG TGCTTCGATCATCCAAGATTCATGCACGGTGCAGGTACATCTCCTTCTGCTGATGTATTCGC TTTCTCAGCTGCTCAGATCAAGAAGGCTCTGGAGTCAACAGTAAAGCTCGGCGGTAACGGTT ACGTTTTCTGGGGCGGACGTGAAGGCTATGAGACACTTCTTAATACAAATATGGGACTCGAA CTCGACAATATGGCTCGTCTTATGAAGATGGCTGTTGAGTATGGACGTTCGATCGGCTTCAA GGGCGACTTCTATATCGAGCCCAAGCCCAAGGAGCCCACAAAGCATCAGTACGATTTCGAT ACAGCTACTGTTCTGGGATTCGTCAGAAAGTACGGTCTCGATAAGGATTTCAAGATGAATAT CGAAGCTAACCACGCTACACTTGCTCAGCATACATTCCAGCATGAGCTCCGTGTTGCAAGAG ACAATGGTGTGTTCGGTTCTATCGACGCAAACCAGGGCGACGTTCTTCTTGGATGGGATACA GACCAGTTCCCCACAAATATCTACGATACAACAATGTGTATGTATGAAGTTATCAAGGCAGG CGGCTTCACAAACGGCGGTCTCAACTTCGACGCTAAGGCACGCAGAGGGAGCTTCACTCCC GAGGATATCTTCTACAGCTATATCGCAGGTATGGATGCATTTGCTCTGGGCTTCAGAGCTGC TCTCAAGCTTATCGAAGACGGACGTATCGACAAGTTCGTTGCTGACAGATACGCTTCATGGA ATACCGGTATCGGTGCAGACATAATCGCAGGTAAGGCAGATTTCGCATCTCTTGAAAAGTAT GCTCTTGAAAAGGGCGAGGTTACAGCTTCACTCTCAAGCGGCAGACAGGAAATGCTGGAGT CTATCGTAAATAACGTTCTTTTCAGTCTGTAA
SEQ ID NO. 60: XI-Rp15 Polypeptide
MAKEYFPQIQKIKFEKSTDPLSFKYYNPEEVINGKTMREHLKFALSWWHTMGGDGTDM FGCGTTDKTWGQSDPAARAKAKVDAAFEIMDKLSIDYYCFHDRDLSPEYGSLKATNDQ LDIVTDYIKEKQGDKFKCLWGTAKCFDHPRFMHGAGTSPSADVFAFSAAQIKKALESTV KLGGNGYVFWGGREGYETLLNTNMGLELDNMARLMKMAVEYGRSIGFKGDFYIEPKP KEPTKHQYDFDTATVLGFLRKYGLDKDFKMNIEANHATLAQHTFQHELRVARDNGVFG SIDANQGDVLLGWDTDQFPTNIYDTTMCMYEVIKAGGFTNGGLNFDAKARRGSFTPED IFYSYIAGMDAFALGFRAALKLIEDGRIDKFVADRYASWNTGIGADIIAGKADFASLEKYA LEKGEVTASLSSGRQEMLESIVNNVLFSL
SEQ ID NO. 61: XI-Rp20 DNA
aaaaaaTGGCTAAGGAATATTTCCCACAAATTCAAAAGATTAAGTTCGAAGGTAAG
GATTCTAAGCTCTCATTTAAGTACTATAACCCTGAAGAAGTCATCAACGGAAAGACAATGC GCGAGCATCTGAAGTTCGCTCTTTCATGGTGGCACACAATGGGCGGCGACGGAACAGATAT GTTCGGCTGCGGCACAACAGACAAGACCTGGGGACAGTCCGATCCCGCTGCAAGAGCAAA GGCTAAGGTTGACGCAGCATTCGAGATCATGGATAAGCTCTCCATTGACTACTATTGTTTCC ACGATCGCGATCTTTCTCCCGAGTATGGCAGCCTCAAGGCTACCAACGATCAGCTTGACATA GTTACAGACTATATCAAGGAGAAGCAGGGCGACAAGTTCAAGTGCCTCTGGGGTACAGCAA AGTGCTTCGATCATCCAAGATTCATGCACGGTGCAGGTACATCTCCTTCTGCTGATGTATTC GCTTTCTCAGCTGCTCAGATCAAGAAGGCTCTGGAGTCAACAGTAAAGCTCGGCGGTAACG GTTACGTTTTCTGGGGCGGACGTGAAGGCTATGAGACACTTCTTAATACAAATATGGGACTC GAACTCGACAATATGGCTCGTCTTATGAAGATGGCTGTTGAGTATGGACGTTCGATCGGCTT CAAGGGCGACTTCTATATCGAGCCCAAGCCCAAGGAGCCCACAAAGCATCAGTACGATTTC GATACAGCTACTGTTCTGGGATTCCTCAGAAAGTACGGTCTCGATAAGGATTTCAAGATGAA TATCGAAGCTAACCACGCTACACTTGCTCAGCATACATTCCAGCATGAGCTCCGTGTTGCAA GAGACAATGGTGTGTTCGGTTCTATCGACGCAAACCAGGGCGACGTTCTTCTTGGATGGGA TACAGACCAGTTCCCCACAAATATCTACGATACAACAATGTGTATGTATGAAGTTATCAAGGC AGGCGGCTTCACAAACGGCGGTCTCAACTTCGACGCTAAGGCACGCAGAGGGAGCTTCACT CCCGAGGATATCTTCTACAGCTATATCGCAGGTATGGATGCATTTGCTCTGGGCTTCAGAGC TGCTCTCAAGCTTATCGAAGACGGACGTATCGACAAGTTCGTTGCTGACAGATACGCTTCAT GGAATACCGGTATCGGTGCAGACATAATCGCAGGTAAGGCAGATTTCGCATCTCTTGAAAAG TATGCTCTTGAAAAGGGCGAGGTTACAGCTTCACTCTCAAGCGGCAGACAGGAAATGCTGG AGTCTATCGTAAATAACGTTCTTTTCAGTCTGTAA
SEQ ID NO. 62: XI-Rp20 Polypeptide
MAKEYFPQIQKIKFEGKDSKLSFKYYNPEEVINGKTMREHLKFALSWWHTMGGDGTDM FGCGTTDKTWGQSDPAARAKAKVDAAFEIMDKLSIDYYCFHDRDLSPEYGSLKATNDQ LDIVTDYIKEKQGDKFKCLWGTAKCFDHPRFMHGAGTSPSADVFAFSAAQIKKALESTV KLGGNGYVFWGGREGYETLLNTNMGLELDNMARLMKMAVEYGRSIGFKGDFYIEPKP KEPTKHQYDFDTATVLGFLRKYGLDKDFKMNIEANHATLAQHTFQHELRVARDNGVFG SIDANQGDVLLGWDTDQFPTNIYDTTMCMYEVIKAGGFTNGGLNFDAKARRGSFTPED IFYSYIAGMDAFALGFRAALKLIEDGRIDKFVADRYASWNTGIGADIIAGKADFASLEKYA LEKGEVTASLSSGRQEMLESIVNNVLFSL
Example 12: Production of Additional Xylose Isomerase Variants
A series of specific point mutations was made to the "hot rod" Ruminococcus xylose isomerase gene using site-directed mutagenesis. The particular point mutations that were generated are set forth in the table below.
W136F
F184S
G179A
G179A F184S
W136F G179A F184S
W136I G179A F184S
W136S G179A F184S
F87L W136F G179A F184S
F87M W136F G179A F184S
F87L W136F G179A F184S V214G
G179S F184A
F85S W136F G179A F184A V214G Q273T
F85S W136F G179A F184A V214 D257A
W136F G179A F184A
W136F G179A F184S D257E
Site directed mutagenesis was performed as follows: About 50ng of template DNA was added to 1 X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol of the relevant mutagenesis primers depending on the mutant being constructed and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. The "hot rod" Ruminococcus xylose isomerase gene was used as the template DNA for constructing single point mutation variants. Previously engineered mutants sometimes were used as "template" DNA to generate other mutants. The sequence of the oligonucleotides used to prepare each mutant is indicated in the table below. Each reaction was PCR cycled as follows: 95°C for 10 minutes followed by 30 rounds of 95°C for 20 seconds, 55°C for 30 seconds, and 72°C for 5 minutes. A final 5 minute extension reaction at 72°C was also included. Following PCR, 1.5 μl of DpnI (NEB, Ipswich, MA) was added and allowed to digest the reaction mixture for 1 to 1.5 hours at 37°C. 5 μl of this mixture was then used to transform NEB-5α cells (NEB, Ipswich, MA), which were plated onto LB media with ampicillin (100 μg/ml).
GGCGACAAGTTCAAGTGCCTCTTCGGTACAGCAAAG W136F_Forward
CTTTGCTGTACCGAAGAGGCACTTGAACTTGTCGCC W136F_Reverse
TCGGCGGTAACGGTTACGTTAGCTGGGGCGGAC F184S_Forward
GTCCGCCCCAGCTAACGTAACCGTTACCGCCGA F184S_Reverse
AACAGTAAAGCTCGGCGCTAACGGTTACGTTTTCT G179A_Forward
AGAAAACGTAACCGTTAGCGCCGAGCTTTACTGTT G179A_Reverse
AACAGTAAAGCTCGGCGCTAACGGTTACGTTAGCTGGGGCGGAC G179A-F184S_Forward
GTCCGCCCCAGCTAACGTAACCGTTAGCGCCGA G179A-F184S_Reverse
GCTAAGGTTGACGCAGCAATGGAGATCATGGATAAGCTC F85M_Forward
GAGCTTATCCATGATCTCCATTGCTGCGTCAACCTTAGC F85M_Reverse
CTAAGGTTGACGCAGCATTAGAGATCATGGATAAGCTC F85L_Forward
GAGCTTATCCATGATCTCTAATGCTGCGTCAACCTTAG F85L_Reverse
AGCTAACCACGCTACACTTGCTACGCATACATTCCAGCATG
CATGCTGGAATGTATGCGTAGCAAGTGTAGCGTGGTTAGCT
GCGACAAGTTCAAGTGCCTCATAGGTACAGCAAAGTGCTTCGA
TCGAAGCACTTTGCTGTACCTATGAGGCACTTGAACTTGTCGC
GCGACAAGTTCAAGTGCCTCTCGGGTACAGCAA
TTGCTGTACCCGAGAGGCACTTGAACTTGTCGC
GCTAACCACGCTACACTTGCTGGTCATACATTCCAGCAT
ATGCTGGAATGTATGACCAGCAAGTGTAGCGTGGTTAGC
CCTCAGAAAGTACGGTCTCGCTAAGGATTTCAAGATGAATA
TATTCATCTTGAAATCCTTAGCGAGACCGTACTTTCTGAGG
CCTCAGAAAGTACGGTCTCGAGAAGGATTTCAAGATGAATATC
GATATTCATCTTGAAATCCTTCTCGAGACCGTACTTTCTGAGG
GTCTTATGAAGATGGCTGGTGAGTATGGACGTTCGAT
ATCGAACGTCCATACTCACCAGCCATCTTCATAAGAC
GTCAACAGTAAAGCTCGGCAGTAACGGTTACGTTAGCTGG
CCAGCTAACGTAACCGTTACTGCCGAGCTTTACTGTTGAC
Following sequence verification (GeneWiz, La Jolla, CA), the approximately 1330 bp SpeI-XhoI fragment from each construct was subcloned into the yeast expression vector p426GPD by first gel extracting each fragment using a Qiagen gel purification kit (Qiagen, Valencia, CA), and then preparing the p426GPD vector for ligation by purifying it using a Qiagen PCR purification kit according to the manufacturer's instructions. About 30 ng of each of the chimeric genes was separately ligated to about 50 ng of the p426GPD vector using T4 DNA ligase, followed by transformation using known protocols into NEB-5α competent cells (NEB, Ipswich, MA). The transformed cells were plated onto LB media with ampicillin (100 μg/ml). Constructs containing the chimeric genes were confirmed by sequence analysis (GeneWiz, La Jolla, CA).
A haploid Saccharomyces cerevisiae strain (BY4742; ATCC catalog number 201389) was cultured in YPD media (10 g Yeast Extract, 20 g Bacto-Peptone, 20 g Glucose, 1 L total) at about 30 °C. Separate aliquots of these cultured cells were transformed with plasmid constructs containing the novel xylose isomerase chimeric genes as well as with the Piromyces and Ruminococcus native gene constructs made above. Transformation was accomplished using the Zymo frozen yeast transformation kit (Catalog number T2001; Zymo Research Corp., Orange, CA 92867). To about 50 μl of cells was added approximately 0.5-1 μg plasmid DNA, and the cells were cultured on SC drop-out media with glucose minus uracil (about 20 g glucose; about 2.21 g SC drop-out mix, about 6.7 g yeast nitrogen base, all in about 1 L of water); this mixture was cultured for 2-3 days at about 30 °C. Assays of the various novel xylose isomerase point mutation polypeptides were performed according to Kuyper et al. (FEMS Yeast Res., 4:69 [2003]) with a few modifications as described above. Approximately 20 μg of soluble whole cell extract from each transformed cell line, prepared using Y-PER Plus reagent as described above, was incubated in a solution containing about 100 mM Tris, pH 7.5, 10 mM MgCl2, 0.15 mM NADH (Sigma, St. Louis, MO), and 2 U sorbitol dehydrogenase (SDH) (Roche, Indianapolis, IN) at about 30 °C. To start the reaction, about 100 μl of xylose was added at various final concentrations of 40-500 mM. A Beckman DU-800 spectrophotometer was utilized with an Enzyme Mechanism software package (Beckman Coulter, Inc., Brea, CA), and the change in the A340 was monitored for 2-3 minutes. The results of the assays are shown in FIG. 10.
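The coupled assay above follows NADH consumption at 340 nm, so a measured slope can be converted into a specific activity with the Beer-Lambert law. The following is a minimal sketch only: the 1 ml reaction volume, the 1 cm path length and the example slope are assumptions chosen for illustration and are not values from this example; the NADH molar absorptivity (6220 M^-1 cm^-1) is the standard literature value, and 20 μg is the approximate protein amount stated above.

```python
# Sketch: convert an A340 slope from the NADH-coupled assay into specific activity.
# Assumed (not from the source): 1 ml reaction volume, 1 cm path length, example slope.

NADH_EXT_COEFF = 6220.0  # M^-1 cm^-1 at 340 nm (standard value for NADH)


def specific_activity(delta_a340_per_min, protein_mg, volume_ml=1.0, path_cm=1.0):
    """Return micromol NADH oxidized per minute per mg protein."""
    # Beer-Lambert: concentration change (mol/L/min) = dA / (epsilon * path length)
    molar_rate = abs(delta_a340_per_min) / (NADH_EXT_COEFF * path_cm)
    # mol/L/min times litres gives mol/min; convert to micromol/min
    umol_per_min = molar_rate * (volume_ml / 1000.0) * 1e6
    return umol_per_min / protein_mg


# Hypothetical slope of -0.0622 A340 units per minute with 20 micrograms of extract
example = specific_activity(delta_a340_per_min=-0.0622, protein_mg=0.020)
print(f"{example:.3f} micromol/min/mg")
```

Under these assumed inputs the slope corresponds to about 0.5 μmol/min/mg; real values depend on the actual cuvette volume and path length used.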
Mutant G179A had the highest activity relative to the Ruminococcus wild type xylose isomerase, as shown in FIG. 10. The kinetics of the G179A mutant were further analyzed using the kinetic assays described herein, adding xylose at concentrations ranging from about 40 to about 500 mM. The results of the kinetic assays for the G179A mutant, shown below, illustrate that the mutant xylose isomerase has a higher specific activity than the Piromyces xylose isomerase.
Figure imgf000196_0001
Example 13: In vivo Evaluation of Xylose Isomerase Constructs
The yeast strain BY4742 was specifically engineered to more readily utilize xylose as a carbon source. The engineered strain was designed to include the following genetic modifications: the native Pho13 gene (alkaline phosphatase specific for p-nitrophenyl phosphate) was disrupted by inserting a construct containing the native TKL1 gene (Transketolase-1); the native aldose reductase gene (Gre3) was disrupted by inserting a construct containing the native high-affinity glucose transporter-7 gene (HXT7); the native glucose-repressible alcohol dehydrogenase II gene (adh2) was disrupted by inserting a construct containing the native xylulokinase gene (XYLK); and the native orotidine-5' phosphate decarboxylase gene (ura3) was disrupted by inserting a construct containing the native transaldolase 1 gene (TAL1). The resulting strain had the following genotype: pho13::TKL1, gre3::HXT7, adh2::XYLK, ura3::TAL1. The final strain is referred to as the "C5" strain and was used for in vivo evaluation of the xylose isomerase variants.
The C5 strain was transformed using standard protocols with either p426GPD (as a control) or the chimeric variants XI-R, XI-Rp5, XI-Rp10, or XI-Rp15. The transformed cells were grown on SC-glucose minus uracil initially and then passaged onto SC-xylose minus uracil. Cultures of each of the above constructs were made in SC-xylose minus uracil and grown for one week. The cultures were grown aerobically at 30 °C with 250 rpm agitation, 1 vvm sparge of process air, 21% O2. The pH was controlled at about 5.0 with 1 N NaOH. Ethanol, glucose and xylose concentrations in the fermentation broth were monitored by a YSI 2700 BioAnalyzer during aerobic fermentation. At 24 hours elapsed fermentation time the fermentation was switched to anaerobic conditions.
Before changing to anaerobic conditions, samples were taken to measure ethanol, glucose and xylose concentrations, and biomass was measured by OD600nm and dry cell weight. At the start of anaerobic fermentation, 4 ml/L of 2.5 g/L ergosterol in EtOH, 0.4 ml/L Tween 80, and 0.01% AF-204 were added to each fermentor. Oxygen was purged with 100% N2 sparge at 1 vvm until percent O2 was below 1%. Aeration was then set at 0.25 vvm 100% N2.
Samples were taken every 24 to 48 hours and measured for ethanol concentration, glucose concentration, xylose concentration, and cell density (OD600nm). The fermentation was harvested when xylose concentration was below 4 g/L in the XI-R strains, at 372 hours after commencing fermentation. Biomass in the final sample was also measured by dry cell weight. The results are presented in the table below.
Strain   OD600nm   Dry Cell Wt (g/L)   Ethanol (g/L)   Glucose (g/L)   Xylose (g/L)
XI-R     7.72      2.20                15.65           0               3.14
Vector   7.34      1.78                6.705           0               23.21
XI-R     7.32      2.03                15.5            0               3.85
Vector   7.96      1.87                9.91            0               23.11
The data presented indicate that the Ruminococcus xylose isomerase containing yeast cells were able to utilize xylose as a carbon source, and the cells containing vector only (e.g., vector with no xylose isomerase gene) utilized very little xylose.
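The comparison drawn from the table can be restated numerically by averaging the duplicate fermentations. The sketch below is illustrative only; it uses the final-sample ethanol and xylose values from the table, and the dictionary layout and function name are arbitrary choices rather than anything from the source.

```python
# Sketch: average the duplicate fermentation results reported in the table above.
from statistics import mean

# Final-sample values copied from the table: (ethanol g/L, residual xylose g/L)
results = {
    "XI-R": [(15.65, 3.14), (15.5, 3.85)],
    "Vector": [(6.705, 23.21), (9.91, 23.11)],
}


def summarize(rows):
    """Return (mean ethanol, mean residual xylose) for a list of replicates."""
    ethanol = mean(r[0] for r in rows)
    xylose = mean(r[1] for r in rows)
    return ethanol, xylose


summary = {strain: summarize(rows) for strain, rows in results.items()}
for strain, (ethanol, xylose) in summary.items():
    print(f"{strain}: mean ethanol {ethanol:.2f} g/L, mean residual xylose {xylose:.2f} g/L")
```

With only two replicates per strain, the means are descriptive and do not support a statistical test.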
Industrial Yeast Strain Evaluation
To evaluate the activity of the various native, modified and engineered (e.g., mutant and/or chimeric) Ruminococcus xylose isomerases in a commercial yeast strain, the Ruminococcus wild type gene or Ruminococcus Rp10 and Rp15 chimeric constructs were inserted into a yeast vector containing a 2μ origin and a KANMX4 (G418R) cassette (cloned from vector HO-poly-KANMX4-HO; ATCC Cat. No. 87804; Voth et al., 2001 NAR 29(12): e59, DDBJ/EMBL/GenBank accession nos. AF324723-9). A commercially available industrial diploid strain of Saccharomyces cerevisiae (strain BF903; "Stillspirits" triple distilled yeast, Brewcraft, Albany, New Zealand; available at Hydrobrew, Oceanside, CA) was obtained and was made competent for transformation using known yeast cell transformation procedures. The transformed cells containing either vector alone or the various Ruminococcus xylose isomerase gene constructs were passaged in YPD medium containing about 100 μg/ml G418 (EMD, San Diego, CA) and about 2% glucose. Transformed yeast containing each construct were grown overnight aerobically in a 15 ml culture tube on YPD media containing 2% glucose. After about 24 hours, about 25 ml of YP media containing 2% glucose and 100 μg/ml G418 was seeded with the cells at an initial OD600 of 0.5 in a 250 ml Erlenmeyer flask and grown aerobically at 30 °C. The cultures were then passaged once every 7 days into fresh media, also at an initial OD600 of 0.5. The fresh media contained increasing amounts of xylose and decreasing amounts of glucose as set forth below.
Week   Glucose   Xylose
1      1%        1%
2      0.50%     1.50%
3      0.25%     1.75%
4      0.10%     1.90%
The cell optical density (OD600) was measured to assess cell density, and the cells were plated onto YPD with 100 μg/ml G418 to ensure that the plasmid was stable. Glucose, xylose and ethanol in the culture media were assayed using YSI 2700 Bioanalyzer instruments (World Wide Web uniform resource locator ysi.com), according to the manufacturer's recommendations. The strains were then grown overnight in YPD (with 100 μg/ml G418) and used to inoculate about 50 ml YP-xylose (with 100 μg/ml G418) in disposable 250 ml Erlenmeyer flasks with vented caps at an initial OD600 of about 1. The cultures were allowed to grow aerobically at 30 °C at 200 rpm for 7 days. The results are shown in FIG. 11. The results shown in FIG. 11 indicate that the commercial yeast strain expressing the Ruminococcus xylose isomerase is more efficient at consuming xylose than the strain carrying the vector control only.
To evaluate ethanol production, the transformed cells containing either the vector control or the gene encoding native Ruminococcus xylose isomerase were grown overnight in YP glucose (with 100 μg/ml G418) and then used to inoculate serum bottles containing 50 ml YP plus xylose (with 100 μg/ml G418) at an initial OD600 of about 1. The serum bottles were sealed with a butyl rubber stopper to prevent air exchange. As a result, the cultures became anaerobic once the available oxygen in the serum bottle was utilized. In general, anaerobiosis (e.g., the onset of anaerobic conditions) occurred a few hours after the culture was inoculated. Xylose utilization, ethanol production and cell growth were measured every twenty four hours. The results are shown in FIG. 12. The results suggest that yeast strains containing the Ruminococcus xylose isomerase grew to a higher cell density and produced more ethanol using xylose as a carbon source than did cells carrying the vector only control (e.g., see FIG. 12).
Example 14: High Diversity Library of Xylose Isomerase Variants
To generate additional Ruminococcus xylose isomerase variants, a high diversity library of mutants was generated using known molecular biology procedures. The library contained the combinations and permutations of substitutions listed in the table below. The Ruminococcus xylose isomerase variants listed below and highlighted in boldface type have been transformed into yeast strains, and are evaluated for growth and ethanol production on xylose media utilizing protocols described above. Yeast transformation of the variants listed below not highlighted in boldface is conducted and resulting variants are tested. In the table below "position" refers to the amino acid position in the Ruminococcus xylose isomerase amino acid sequence, "AA1" refers to the first of the considered amino acid substitutions for that position, "CODON1" refers to the nucleotide sequence selected for the amino acid chosen in "AA1", "AA2" refers to the second of the considered amino acid substitutions for that position, "CODON2" refers to the nucleotide sequence selected for the amino acid chosen in "AA2", "AA3" refers to the third of the considered amino acid substitutions for that position, "CODON3" refers to the nucleotide sequence selected for the amino acid chosen in "AA3", "AA4" refers to the fourth of the considered amino acid substitutions for that position and "CODON4" refers to the nucleotide sequence selected for the amino acid chosen in "AA4".
High Diversity Library of Xylose Isomerase Variants
Position  AA1  CODON1  AA2  CODON2  AA3  CODON3  AA4  CODON4  Substitutions  LIB1-A  LIB1-B
3         F    TTT     Y    TAT     -    -       -    -       2              1       1
45        L    TTG     M    ATG     -    -       -    -       2              1       1
46        S    TCT     A    GCT     -    -       -    -       2              1       1
51        M    ATG     L    TTG     -    -       -    -       2              1       1
52        G    GGT     C    TGT     -    -       -    -       2              1       1
53        G    GGT     A    GCT     -    -       -    -       2              1       1
58        M    ATG     Q    CAA     -    -       -    -       2              1       1
85        F    TTT     M    ATG     L    TTG     -    -       3              1       3
101       R    AGA     V    GTT     -    -       -    -       2              1       1
107       Y    TAT     G    GGT     -    -       -    -       2              2       2
121       T    ACT     V    GTT     -    -       -    -       2              2       2
131       K    AAA     G    GGT     -    -       -    -       2              2       2
136       W    TGG     F    TTT     -    -       -    -       2              2       1
140       K    AAA     N    AAT     -    -       -    -       2              2       2
147       F    TTT     Y    TAT     -    -       -    -       2              2       2
179       G    GGT     R    AGA     A    GCT     -    -       3              1       3
184       F    TTT     S    TCT     -    -       -    -       2              1       1
204       D    GAT     E    GAA     -    -       -    -       2              2       2
214       V    GTT     R    AGA     G    GGT     -    -       3              1       1
257       D    GAT     A    GCT     E    GAA     -    -       3              1       1
273       Q    CAA     F    TTT     T    ACT     G    GGT     4              1       4
292       I    ATT     V    GTT     -    -       -    -       2              2       2
296       Q    CAA     R    AGA     -    -       -    -       2              1       1
345       T    ACT     D    GAT     E    GAA     -    -       3              3       3
373       D    GAT     E    GAA     -    -       -    -       2              2       2
Total combinations                                            509607936      1536    27648
Number of possible variants: 5.096e+08
Expected completeness: 0.95
Required library size: 1.527e+09
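The library figures above can be reproduced from the per-position substitution counts in the table. The sketch below multiplies the counts to obtain the number of possible variants and then estimates the library size needed for 95% completeness using L = -V ln(1 - F). That formula is a common estimate that assumes all variants are equally likely; the source does not state which formula it used, so it is an assumption here, although it does reproduce the stated value.

```python
# Sketch: reproduce the library-size figures from the substitution counts.
# The completeness formula L = -V * ln(1 - F) is an assumption (equiprobable variants).
import math

# "Substitutions" column of the table, in row order (positions 3 through 373)
substitutions = [2, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 2, 2,
                 2, 2, 3, 2, 2, 3, 3, 4, 2, 2, 3, 2]

variants = math.prod(substitutions)  # total number of sequence combinations
completeness = 0.95                  # fraction of variants expected to be sampled
required = -variants * math.log(1 - completeness)

print(f"Number of possible variants: {variants:.3e}")
print(f"Required library size: {required:.3e}")
```

The same product applied to the LIB1-A and LIB1-B columns gives the 1536 and 27648 totals shown in the last row of the table.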
The xylose isomerase mutants listed above are generated using oligonucleotides listed below and a 3 step PCR method, described in further detail below.
KAS/XI-LIB-for-1 ctagaactagtaaaaaaatggctaagqaatattattctaatataggtaaaattcagtat
KAS/XI-LIB-for-2 Actatgagagaacatttaaaatttgctatgtcttqqtggcatactwt
KAS/XI-LIB-for-3 Agagaacatttaaaatttgctttggcttggtqqcatactwtgkgtg
KAS/XI-LIB-for-4 Actatgagaqaacatttaaaatttgctatgqcttqgtggcatactwtgkgtg
KAS/XI-LIB-for-5 Ttgctttgtcttqqtqqcatactttqkgtgstgatg
KAS/XI-LIB-for-6 Cttggtggcatactatqtqtgstgatggtactqats
KAS/XI-LIB-for-7 Gtggcatactatgggtgctgatqqtactgatatgt
KAS/XI-LIB-for-8 Gtggcatactwtgkgtgctgatggtactqatcaat
KAS/XI-LIB-for-9 Gcatactwtqkqtgstgatggtactgalcaattcggttgtggtact
KAS/XI-LIB-for-10 gcaaaagccaaagtagatgcagccwtqqaaattatggataaattgtctattg
KAS/XI-LIB-for-11 rtatqqataaatlqtctattqattartattqttttcatqatgttgatttgtctcctgaatatggttctttaaaag
KAS/XI-LIB-for-12 ttatqqataaartqtctattgattattattgtttlcatqatgttgattlgtctcctgaaggtggttctttaaaag
KAS/XI-LIB-for-13 qtttlcatgatagagatltqtctcctgaaggtqqttctttaaaagcaactaatg
KAS/XI-LIB-for-14 qttctttaaaaqcaactaatqatcaattqgacattgttgtlgattatattaaagaaaaacaaggtgataaatttaaatg
KAS/XI-LIB-for-15 gttcrttaaaagcaactaatgatcaatlggacattgttgttgattatartaaagaaaaacaaggtgatggttttaaatg
KAS/XI-LIB-for-16 cggattatattaaaqaaaaacaaggtgatgqttttaaatgtttgtkkggcactgcgaawt
KAS/XI-LIB-for-17 ttatattaaagaaaaacaaqqtgataaatttaaatgtltgtttggcactgcgaawtgttttgat
KAS/XI-LIB-for-18 Gttlgtggggcactgcgaatlqtttlgatcatcc
KAS/XI-LIB-for-19 Tttlgatcatccacqttatatgcatggtgcgggga
KAS/XI-LIB-for-20 Gqaatcaactgttaaattaqgtagaaacgggtatgtattctgggga
KAS/XI-LIB-for-21 Gaatcaactgttaaattaggtgctaacgggtatgtattctggggag
KAS/XI-LIB-for-22 Ggaatcaactgttaaattaggtagaaacgggtatgtatcttgggga
KAS/XI-LIB-for-23 Gaatcaactgttaaattaggtgctaacgggtatgtatcttggggag
KAS/XI-LIB-for-24 Aaattaggtgggaacgggtatgtatcttggggaggaaggg
KAS/XI-LIB-for-25 Cactaatatgggtttggaattggaaaatatggctagattgatgaaaatg
KAS/XI-LIB-for-26 ggataatatggctagattgatgaaaatggctaqagaatacggaaggtcta
KAS/XI-LIB-for-27 Gctagattgatgaaaatggctggtgaatacgqaaggtctattggtt
KAS/XI-LIB-for-28 cagttttgggattcttgagaaaatatggtttggctaaagattttaaaatgaatatagaagcta
KAS/XI-LIB-for-29 agtttlgqqatlcttgagaaaatatggtttggaaaaagattttaaaatgaatatagaagctaa
KAS/XI-LIB-for-30 atagaagctaatcatgcaacactcgcatttcatacttttcaacatgaattgagagtt
KAS/XI-LIB-for-31 atagaagctaatcatgcaacactcgcaactcatacttttcaacatgaattgagagtt
KAS/XI-LIB-for-32 agaagctaatcatgcaacactcgcaggtcatactttlcaacatgaattgagag
KAS/XI-LIB-for-33 Taacggaqttttlggatctgrtgatgcaaaccaqqqaqacq
KAS/XI-LIB-for-34 Taacggaqttttlggatctgttgatgcaaacagagqagacg
KAS/XI-LIB-for-35 Ttttqgatctatcgatgcaaacagaggagacgttttgctaggatggg
KAS/XI-LIB-for-36 Aaqgctaqqcgtggtagtttcgatccagaqqatatattctattc
KAS/XI-LIB-for-37 Cgaaggctaggcgtggtagttlcgaaccagagqatatatlctattctta
KAS/XI-LIB-for-38 Caqgqcaqcactaaaattgattgaaqaaqqtaqaattgataaqtttq
KAS/XI-LIB-for-39 Tgggaaagccgacttcgccagtttggaaaaatatg
KAS/XI-LIB-rev-1 atactgaattttacctatattagaataatattccttagccatttttttactagttctag
KAS/XI-LIB-rev-2 Awagtatgccaccaagacatagcaaattttaaatqttctctcataqt
KAS/XI-LIB-rev-3 Cacmcawaqtatgccaccaagccaaagcaaatttlaaatgttctct
KAS/XI-LIB-rev-4 cacmcawagtatgccaccaagccataqcaaattttaaatgttctctcatagt
KAS/XI-LIB-rev-5 Catcascacmcaaagtatgccaccaagacaaaqcaa
KAS/XI-LIB-rev-6 Satcagtaccatcascacacatagtatgccaccaag
KAS/XI-LIB-rev-7 Acatatcagtaccatcaqcacccatagtatgccac
KAS/XI-LIB-rev-8 Attqatcaqtaccatcaqcacmcawaqtatqccac
KAS/XI-LIB-rev-9 Aqtaccacaaccqaattqatcaqtaccatcascacmcawaqtatqc
KAS/XI-LIB-rev-10 Caataqacaatttatccataatttccawqqctqcatctactttqgcttttqc
KAS/XI-LIB-rev-11 cttttaaaqaaccatatlcaggagacaaatcaacatcatgaaaacaataataatcaatagacaattlatccataa
KAS/XI-LIB-rev-12 crrttaaagaaccaccrtcaggagacaaatcaacatcatgaaaacaataataatcaatagacaatttatccataa
KAS/XI-LIB-rev-13 cattagrtgcttttaaagaaccaccrtcagqaqacaaatctctatcatgaaaac
KAS/XI-LIB-rev-14 catttaaatttatcaccttqtttttcttlaatataatcaacaacaatqtccaattgatcattagttgcttttaaagaac
KAS/XI-LIB-rev-15 cattlaaaaccatcaccttgrrrrtcrtlaatataatcaacaacaatgtccaattgatcattagttgcttttaaagaac
KAS/XI-LIB-rev-16 awtlcqcagtqccmmacaaacatttaaaaccatcaccttqtttttctttaatataatccg
KAS/XI-LIB-rev-17 atcaaaacawttcqcaqtqccaaacaaacatttaaatttatcaccttqtttttctttaatataa
KAS/XI-LIB-rev-18 Gqatqatcaaaacaattcgcaqtgccccacaaac
KAS/XI-LIB-rev-19 Tccccgcaccatgcatataacgtggatgatcaaaa
KAS/XI-LIB-rev-20 Tccccagaatacatacccgtttctacctaatttaacagttgattcc
KAS/XI-LIB-rev-21 Ctccccagaatacatacccgttagcacctaatttaacagttgattc
KAS/XI-LIB-rev-22 Tccccaaqatacatacccgttlctacctaatttaacaqttgatlcc
KAS/XI-LIB-rev-23 Ctccccaagatacatacccgttaqcacctaatttaacagttgattc
KAS/XI-LIB-rev-24 Cccttcctccccaagatacatacccgttcccacctaattt
KAS/XI-LIB-rev-25 Cattttcatcaatctagccatattttccaattccaaacccatatlagtg
KAS/XI-LIB-rev-26 Tagaccttccgtattctctagccattttcatcaatctagccatattatcc
KAS/XI-LIB-rev-27 Aaccaatagacctlccgtattcaccagccattttcatcaatctagc
KAS/XI-LIB-rev-28 tagcttctatattcatttlaaaatcttlagccaaaccatattltctcaagaatcccaaaactg
KAS/XI-LIB-rev-29 ttagcttctatattcattttaaaatcttttlccaaaccatattttctcaagaatcccaaaact
KAS/XI-LIB-rev-30 aactctcaattcatgttgaaaagtatgaaatgcgagtgttgcatgattagcttctat
KAS/XI-LIB-rev-31 aactctcaattcatgttgaaaagtatgagttgcgagtgttgcatgatlagcttctat
KAS/XI-LIB-rev-32 Ctctcaattcatgttgaaaagtatgacctgcgagtgttgcatgattagcrtct
KAS/XI-LIB-rev-33 Cgtctccctggtttgcatcaacagatccaaaaactccgtta
KAS/XI-LIB-rev-34 Cgtctcctctgtttgcatcaacagatccaaaaactccgtta
KAS/XI-LIB-rev-35 Cccatcctagcaaaacgtctcctctgttlgcatcgatagatccaaaa
KAS/XI-LIB-rev-36 Gaatagaatatatcctctggatcgaaactaccacgcctagcctt
KAS/XI-LIB-rev-37 Taagaatagaatatatcctctggttcgaaactaccacgcctagccttcg
KAS/XI-LIB-rev-38 Caaacrtatcaattctaccttcttcaatcaattttagtqctgccctg
KAS/XI-LIB-rev-39 Catatttttccaaactggcgaagtcggctttccca
KAS/FOR-XI-LIB Cctgaaattattcccctacttgact
KAS/REV-XI-LIB Ccttctcaagcaaggttttcagtat
The nucleotide sequences of the oligonucleotides above include IUPAC nucleotide symbol designations, in some embodiments. The IUPAC nucleotide symbol designations used in the listing above and the nucleotides they represent are: m (A or C nucleotides), k (G or T nucleotides), s (C or G nucleotides), and w (A or T nucleotides).
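Because each degenerate symbol stands for two nucleotides, a single oligonucleotide in the listing encodes several concrete sequences. The following sketch shows one way to enumerate them and to derive the reverse complement of a degenerate oligonucleotide. The short fragment used in the demonstration is a hypothetical illustration, not one of the listed primers, and the function names are arbitrary.

```python
# Sketch: expand the degenerate symbols (m, k, s, w) and reverse-complement an oligo.
from itertools import product

# Symbols used in the listing, lower case as in the listing
IUPAC = {"a": "a", "c": "c", "g": "g", "t": "t",
         "m": "ac", "k": "gt", "s": "cg", "w": "at"}
# Complement of each symbol; m (A/C) pairs with k (T/G), while s and w are self-complementary
COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a",
              "m": "k", "k": "m", "s": "s", "w": "w"}


def expand(oligo):
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in oligo.lower()))]


def reverse_complement(oligo):
    """Reverse complement that keeps degenerate symbols degenerate."""
    return "".join(COMPLEMENT[b] for b in reversed(oligo.lower()))


demo = "actwtgkgt"  # hypothetical short fragment containing w and k
expanded = expand(demo)
print(expanded)
print(reverse_complement(demo))
```

Each additional two-fold degenerate symbol doubles the number of encoded sequences, which is how a modest number of oligonucleotides covers the combinations in the library table.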
Oligonucleotides are prepared as 100 micromolar stocks to be diluted as needed for use as PCR and/or primer extension primers. Step 1 of the 3 step PCR protocol included initial primer extension reactions performed four times, each with a different concentration of mutant oligonucleotide (e.g., about 7.5 nanomolar, about 37.5 nanomolar, about 75 nanomolar, and about 150 nanomolar). An appropriate amount (e.g., dependent on the reaction) of each of the desired primers is contacted with Ruminococcus xylose isomerase nucleotide sequences under PCR/primer extension conditions to generate the xylose isomerase mutant variants. The forward and reverse primers listed above are designated with "-for-" or "-rev-" as part of the primer name. A non-limiting example of the PCR/primer extension conditions utilized for generating the xylose isomerase mutant variants listed above includes 200 micromolar of each deoxyribonucleotide (dNTP), 1X Pfu ultra II buffer, and 1 unit Pfu ultra II polymerase, and a thermocycle profile of: (a) an initial 10 minute denaturation at 94 °C, (b) 40 cycles of (i) 94 °C for 20 seconds, (ii) 56 °C for 30 seconds, and (iii) 72 °C for 45 seconds, and (c) a final extension at 72 °C for 5 minutes. The initial extension products are analyzed by gel electrophoresis on 1.2% Tris-acetate agarose gels. The reactions are column purified and the resultant purified nucleic acids are used for subsequent steps in the 3 step PCR protocol.
The second step of the 3 step protocol includes contacting the purified nucleic acids from the first step with Ruminococcus xylose isomerase gene primers (e.g., KAS/FOR-XI-LIB and KAS/REV-XI-LIB, as listed in the table above) under substantially similar PCR/primer extension conditions, with the modification of 5 units of Pfu ultra II polymerase instead of 1 unit. The PCR reactions also are performed four times, each with a differing amount of gene primers (e.g., about 20 nanomolar, about 100 nanomolar, about 2 micromolar and about 5 micromolar). The reaction products are analyzed by gel electrophoresis as described above and column purified, in preparation for the final step of the 3 step PCR/primer extension protocol.
The final step of the 3 step protocol generated full length nucleic acids of xylose isomerase mutants. The column purified nucleic acid of the second step was contacted with about 200 nanomolar of each gene primer under extension conditions as described for the second step. The protocol described herein was used to generate a wide range of mutant xylose isomerase variants, each with between about 1 and about 9 mutations per gene.
Example 15: Construction of mutant G179A with P10 or P15 extensions
Site-directed mutagenesis was performed as follows: 50 ng of the vector pBF348 (pCR Blunt II/XI-R-P10), pBF370 (pCR Blunt II/XI-R-P10-HIS), pBF349 (pCR Blunt II/XI-R-P15), or pBF370 (pCR Blunt II/XI-R-P10-HIS) was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol mutagenesis primers [JML75 (aacagtaaagctcggcgctaacggttacgttttct) and JMU6 (agaaaacgtaaccgttagcgccgagctttactgtt)], and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. This reaction mixture was cycled as follows: (a) 95 °C for 10 minutes, followed by (b) 30 rounds of (i) 95 °C for 20 seconds, (ii) 55 °C for 30 seconds, and (iii) 72 °C for 5 minutes. A final 5 minute extension reaction at 72 °C was also included. Following cycling, 1.5 μl of DpnI (NEB, Ipswich, MA) was added and allowed to digest the reaction mixture for 1 to 1.5 hours at 37 °C. 5 μl of this mixture was then used to transform NEB-5α cells (NEB, Ipswich, MA), which were plated onto LB media with kanamycin (35 μg/ml). The following plasmids were generated using the procedure described herein: pBF613 (XI-R-P15 with G179A mutation), pBF614 (XI-R-P15-HIS with G179A mutation), pBF615 (XI-R-P10 with G179A mutation), and pBF616 (XI-R-P10-HIS with G179A mutation), where XI = xylose isomerase, R = Ruminococcus, P = Piromyces, HIS = Histidine Tag, and the number that follows P indicates how many amino acids of the Piromyces xylose isomerase were fused to the 5' end of the Ruminococcus xylose isomerase gene.
Following sequence verification, the approximately 1330 base pair SpeI-XhoI fragment from each construct was subcloned into the yeast expression vector p426GPD. The generated xylose isomerase fragments were first gel extracted using a Qiagen gel purification kit (Qiagen, Valencia, CA), and the p426GPD vector reaction was cleaned up using a Qiagen PCR purification kit. 30 ng of the XI fragments was ligated to 50 ng of the p426GPD vector using T4 DNA ligase (Fermentas, Glen Burnie, MD) in a 10 μl reaction overnight at 16 °C, transformed into NEB-5α competent cells (NEB, Ipswich, MA), and plated onto LB media with ampicillin (100 μg/ml).
Constructs were confirmed by sequence analysis. The following plasmids were generated using the procedure described herein: pBF677 (p426GPD/XI-R-P15_G179A), pBF678 (p426GPD/XI-R-P15-HIS_G179A), pBF679 (p426GPD/XI-R-P10_G179A), pBF680 (p426GPD/XI-R-P10-HIS_G179A).
Example 16: Additional Xylose Isomerase High Diversity Variants
An additional library of high diversity mutants containing changes not listed above also is generated. The table below lists positions in the Ruminococcus xylose isomerase gene at which one of two further amino acid codons is substituted to generate additional high diversity xylose isomerase variants. In the table below "XI-R position" refers to the amino acid position in the Ruminococcus xylose isomerase amino acid sequence, "AA1" refers to the first of two considered amino acid substitutions for that position, "CODON1" refers to the nucleotide sequence selected for the amino acid chosen in "AA1", "AA2" refers to the second of two considered amino acid substitutions for that position and "CODON2" refers to the nucleotide sequence selected for the amino acid chosen in "AA2". The nucleotide sequences for each codon are chosen using sequence and codon optimization methods described herein.
XI-R position AA1 CODON1 AA2 CODON2
5 S TCT P CCA
6 N AAT Q CAA
42 K AAA R AGA
54 D GAT E GAA
56 T ACT A GCT
84 A GCT G GGT
137 G GGT S TCT
141 C TGT V GTT
180 N AAT E GAA
181 G GGT N AAT
203 L TTG K AAA
205 N AAT H CAT
208 R AGA T ACT
209 L TTG M ATG
210 M ATG L TTG
211 K AAA T ACT
215 E GAA D GAT
252 R AGA K AAA
253 K AAA A GCT
254 Y TAT H CAT
255 G GGT N AAT
277 Q CAA E GAA
299 V GTT Y TAT
300 L TTG Q CAA
301 L TTG N AAT
344 F TTT T ACT
346 P CCA L TTG
372 E GAA Q CAA
374 G GGT S TCT
375 R AGA P CCA
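Each row of the table pairs an amino acid with the codon selected to encode it, which can be cross-checked against the standard genetic code. The sketch below builds the standard codon table and tests a few rows copied from the table above; the choice of rows and the variable names are illustrative only.

```python
# Sketch: check that listed codons encode the stated amino acids (standard genetic code).
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (first base varies slowest); "*" marks stop codons
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# (position, amino acid, codon) entries copied from a few rows of the table above
entries = [
    (5, "S", "TCT"), (5, "P", "CCA"),
    (141, "C", "TGT"), (141, "V", "GTT"),
    (209, "L", "TTG"), (209, "M", "ATG"),
    (375, "R", "AGA"), (375, "P", "CCA"),
]

mismatches = [(pos, aa, codon) for pos, aa, codon in entries
              if CODON_TABLE[codon] != aa]
print("mismatches:", mismatches)
```

An empty mismatch list indicates that the checked codons agree with the amino acids listed for them.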
Example 17: Activation of the Entner-Doudoroff Pathway in Yeast Cells using EDD and EDA genes from Pseudomonas aeruginosa strain PAO1
Pseudomonas aeruginosa strain PAO1 DNA was prepared using the Qiagen DNeasy Blood and Tissue kit (Qiagen, Valencia, CA) according to the manufacturer's instructions. The P. aeruginosa edd and eda constructs were isolated from P. aeruginosa genomic DNA using the following oligonucleotides:
The P. aeruginosa edd gene:
5'-aactgaactgactagtaaaaaaatgcaccctcgtgtgctcgaagt-3' (SEQ ID NO:63)
5'-agtaaagtaaaagcttctactagcgccagccgttgaggctct-3' (SEQ ID NO:64)
The P. aeruginosa edd gene with 6-HIS c-terminal tag:
5'-aactgaactgactagtaaaaaaatgcaccctcgtgtgctcgaagt-3' (SEQ ID NO:63)
5'-agtaaagtaaaagcttctactaatgatgatgatgatgatggcgccagccgttgaggctc-3' (SEQ ID NO:65)
The P. aeruginosa eda gene:
5'-aactgaactgactagtaaaaaaatgcacaaccttgaacagaagacc-3' (SEQ ID NO:66)
5'-agtaaagtaactcgagctattagtgtctgcggtgctcggcgaa-3' (SEQ ID NO:67)
The P. aeruginosa eda gene with 6-HIS c-terminal tag:
5'-aactgaactgactagtaaaaaaatgcacaaccttgaacagaagacc-3' (SEQ ID NO:66)
5'-taaagtaactcgagctactaatgatgatgatgatgatggtgtctgcggtgctcggcgaa-3' (SEQ ID NO:68)
All oligonucleotides set forth above were purchased from Integrated DNA Technologies ("IDT", Coralville, IA). These oligonucleotides were designed to incorporate a SpeI restriction endonuclease cleavage site upstream, and either a HindIII (edd) or an XhoI (eda) restriction endonuclease cleavage site downstream, of the edd and eda gene constructs. These restriction endonuclease sites could be used to clone the edd and eda genes into yeast expression vectors p426GPD (ATCC accession number 87361) and p425GPD (ATCC accession number 87359). In addition to incorporating restriction endonuclease cleavage sites, the forward oligonucleotides also incorporate six consecutive A nucleotides (e.g., AAAAAA) immediately upstream of the ATG initiation codon. The six consecutive A nucleotides ensured that there was a conserved ribosome binding sequence for efficient translation initiation in yeast.
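The primer design described above can be checked directly against the untagged primer sequences (SEQ ID NOS: 63, 64, 66 and 67). The sketch below searches each primer for the SpeI, HindIII and XhoI recognition sequences and for the six-A tract ahead of the ATG. The recognition sequences are the standard ones for these enzymes; the dictionary keys and function names are arbitrary labels introduced for illustration.

```python
# Sketch: confirm the restriction sites and A-tract in the edd/eda cloning primers.

# Standard recognition sequences; all three are palindromic, so one strand suffices
SITES = {"SpeI": "actagt", "HindIII": "aagctt", "XhoI": "ctcgag"}

# Primer sequences copied from the listing above
primers = {
    "edd_forward": "aactgaactgactagtaaaaaaatgcaccctcgtgtgctcgaagt",   # SEQ ID NO:63
    "edd_reverse": "agtaaagtaaaagcttctactagcgccagccgttgaggctct",      # SEQ ID NO:64
    "eda_forward": "aactgaactgactagtaaaaaaatgcacaaccttgaacagaagacc",  # SEQ ID NO:66
    "eda_reverse": "agtaaagtaactcgagctattagtgtctgcggtgctcggcgaa",     # SEQ ID NO:67
}


def sites_in(seq):
    """Return the names of the restriction sites found in a primer."""
    return [name for name, site in SITES.items() if site in seq.lower()]


def has_a_tract(seq):
    """True if six consecutive A nucleotides immediately precede an ATG."""
    return "a" * 6 + "atg" in seq.lower()


report = {name: sites_in(seq) for name, seq in primers.items()}
for name, found in report.items():
    print(name, found, "A6-ATG" if has_a_tract(primers[name]) else "")
```

The forward primers carry the SpeI site and the A-tract, while the reverse primers carry HindIII (edd) or XhoI (eda), consistent with the fragments cloned later in this example.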
PCR amplification of the genes was performed as follows: about 100 ng of the genomic P. aeruginosa PAO1 DNA was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers (SEQ ID NOS: 63-68, and combinations as indicated), and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. This was cycled as follows: 95 °C for 10 minutes, followed by 30 rounds of 95 °C for 20 seconds, 50 °C (eda amplifications) or 53 °C (edd amplifications) for 30 seconds, and 72 °C for 15 seconds (eda amplifications) or 30 seconds (edd amplifications). A final 5 minute extension reaction at 72 °C also was included. The about 670 bp (eda) or 1830 bp (edd) product was TOPO cloned into the pCR Blunt II TOPO vector (Life Technologies, Carlsbad, CA) according to the manufacturer's recommendations.
The nucleotide and amino acid sequences of the P. aeruginosa edd and eda genes are given below as SEQ ID NOS. 69 - 72.
P. aeruginosa edd nucleotide sequence: SEQ ID NO:69
ATGCACCCTCGTGTGCTCGAAGTCACCCGCCGCATCCAGGCCCGTAGCGCGGCCACTCGCC AGCGCTACCTCGAGATGGTCCGGGCTGCGGCCAGCAAGGGGCCGCACCGCGGCACCCTGC CGTGCGGCAACCTCGCCCACGGGGTCGCGGCCTGTGGCGAAAGCGACAAGCAGACCCTGC GGCTGATGAACCAGGCCAACGTGGCCATCGTTTCCGCCTACAACGACATGCTCTCGGCGCAC CAGCCGTTCGAGCGCTTTCCGGGGCTGATCAAGCAGGCGCTGCACGAGATCGGTTCGGTCG GCCAGTTCGCCGGCGGCGTGCCGGCCATGTGCGACGGGGTGACCCAGGGCGAGCCGGGCA TGGAACTGTCGCTGGCCAGCCGCGACGTGATCGCCATGTCCACCGCCATCGCGCTGTCTCA CAACATGTTCGATGCAGCGCTGTGCCTGGGTGTTTGCGACAAGATCGTGCCGGGCCTGCTGA TCGGCTCGCTGCGCTTCGGCCACCTGCCCACCGTGTTCGTCCCGGCCGGGCCGATGCCGAC CGGCATCTCCAACAAGGAAAAGGCCGCGGTGCGCCAACTGTTCGCCGAAGGCAAGGCCACT CGCGAAGAGCTGCTGGCCTCGGAAATGGCCTCCTACCATGCACCCGGCACCTGCACCTTCTA TGGCACCGCCAATACCAACCAGTTGCTGGTGGAGGTGATGGGCCTGCACTTGCCCGGTGCC TCCTTCGTCAACCCGAACACCCCCCTGCGCGACGAACTCACCCGCGAAGCGGCACGCCAGG CCAGCCGGCTGACCCCCGAGAACGGCAACTACGTGCCGATGGCGGAGATCGTCGACGAGAA GGCCATCGTCAACTCGGTGGTGGCGCTGCTCGCCACCGGCGGCTCGACCAACCACACCCTG CACCTGCTGGCGATCGCCCAGGCGGCGGGCATCCAGTTGACCTGGCAGGACATGTCCGAGC TGTCCCATGTGGTGCCGACCCTGGCGCGCATCTATCCGAACGGCCAGGCCGACATCAACCA CTTCCAGGCGGCCGGCGGCATGTCCTTCCTGATCCGCCAACTGCTCGACGGCGGGCTGCTT CACGAGGACGTACAGACCGTCGCCGGCCCCGGCCTGCGCCGCTACACCCGCGAGCCGTTC CTCGAGGATGGCCGGCTGGTCTGGCGCGAAGGGCCGGAACGGAGTCTCGACGAAGCCATC CTGCGTCCGCTGGACAAGCCGTTCTCCGCCGAAGGCGGCTTGCGCCTGATGGAGGGCAACC TCGGTCGCGGCGTGATGAAGGTCTCGGCGGTGGCGCCGGAACACCAGGTGGTCGAGGCGC CGGTACGGATCTTCCACGACCAGGCCAGCCTGGCCGCGGCCTTCAAGGCCGGCGAGCTGGA GCGCGACCTGGTCGCCGTGGTGCGTTTCCAGGGCCCGCGGGCGAACGGCATGCCGGAGCT GCACAAGCTCACGCCGTTCCTCGGGGTCCTGCAGGATCGTGGCTTCAAGGTGGCGCTGGTC ACCGACGGGCGCATGTCCGGGGCGTCGGGCAAGGTGCCCGCGGCCATCCATGTGAGTCCG GAAGCCATCGCCGGCGGTCCGCTGGCGCGCCTGCGCGACGGCGACCGGGTGCGGGTGGAT GGGGTGAACGGCGAGTTGCGGGTGCTGGTCGACGACGCCGAATGGCAGGCGCGCAGCCTG GAGCCGGCGCCGCAGGACGGCAATCTCGGTTGCGGCCGCGAGCTGTTCGCCTTCATGCGCA ACGCCATGAGCAGCGCGGAAGAGGGCGCCTGCAGCTTTACCGAGAGCCTCAACGGCTGGCG CTAGTAG
P. aeruginosa edd amino acid sequence: SEQ ID NO: 70
MHPRVLEVTRRIQARSAATRQRYLEMVRAAASKGPHRGTLPCGNLAHGVAACGESDKQTLRLMN QANVAIVSAYNDMLSAHQPFERFPGLIKQALHEIGSVGQFAGGVPAMCDGVTQGEPGMELSLASR DVIAMSTAIALSHNMFDAALCLGVCDKIVPGLLIGSLRFGHLPTVFVPAGPMPTGISNKEKAAVRQL FAEGKATREELLASEMASYHAPGTCTFYGTANTNQLLVEVMGLHLPGASFVNPNTPLRDELTREA ARQASRLTPENGNYVPMAEIVDEKAIVNSVVALLATGGSTNHTLHLLAIAQAAGIQLTWQDMSELS HVVPTLARIYPNGQADINHFQAAGGMSFLIRQLLDGGLLHEDVQTVAGPGLRRYTREPFLEDGRLV WREGPERSLDEAILRPLDKPFSAEGGLRLMEGNLGRGVMKVSAVAPEHQVVEAPVRIFHDQASLA AAFKAGELERDLVAVVRFQGPRANGMPELHKLTPFLGVLQDRGFKVALVTDGRMSGASGKVPAAI HVSPEAIAGGPLARLRDGDRVRVDGVNGELRVLVDDAEWQARSLEPAPQDGNLGCGRELFAFM RNAMSSAEEGACSFTESLNGWR
P. aeruginosa eda nucleotide sequence: SEQ ID NO: 71
ATGCACAACCTTGAACAGAAGACCGCCCGCATCGACACGCTGTGCCGGGAGGCGCGCATCC TCCCGGTGATCACCATCGACCGCGAGGCGGACATCCTGCCGATGGCCGATGCCCTCGCCGC CGGCGGCCTGACCGCCCTGGAGATCACCCTGCGCACGGCGCACGGGCTGACCGCCATCCG GCGCCTCAGCGAGGAGCGCCCGCACCTGCGCATCGGCGCCGGCACCGTGCTCGACCCGCG GACCTTCGCCGCCGCGGAAAAGGCCGGGGCGAGCTTCGTGGTCACCCCGGGTTGCACCGA CGAGTTGCTGCGCTTCGCCCTGGACAGCGAAGTCCCGCTGTTGCCCGGCGTGGCCAGCGCT TCCGAGATCATGCTCGCCTACCGCCATGGCTACCGCCGCTTCAAGCTGTTTCCCGCCGAAGT CAGCGGCGGCCCGGCGGCGCTGAAGGCGTTCTCGGGACCATTCCCCGATATCCGCTTCTGC CCCACCGGAGGCGTCAGCCTGAACAATCTCGCCGACTACCTGGCGGTACCCAACGTGATGT GCGTCGGCGGCACCTGGATGCTGCCCAAGGCCGTGGTCGACCGCGGCGACTGGGCCCAGG TCGAGCGCCTCAGCCGCGAAGCCCTGGAGCGCTTCGCCGAGCACCGCAGACACTAATAG
P. aeruginosa eda amino acid sequence: SEQ ID NO: 72
MHNLEQKTARIDTLCREARILPVITIDREADILPMADALAAGGLTALEITLRTAHGLTAIRRLSEERPH LRIGAGTVLDPRTFAAAEKAGASFVVTPGCTDELLRFALDSEVPLLPGVASASEIMLAYRHGYRRF KLFPAEVSGGPAALKAFSGPFPDIRFCPTGGVSLNNLADYLAVPNVMCVGGTWMLPKAVVDRGD WAQVERLSREALERFAEHRRH
Cloning of PAO1 edd and eda genes into yeast expression vectors
Following sequence confirmation (GeneWiz), the about 670 bp SpeI-XhoI eda and about 1830 bp SpeI-HindIII edd fragments were cloned into the corresponding restriction sites in plasmids p425GPD and p426GPD (Mumberg et al., 1995, Gene 156:119-122; obtained from ATCC #87361; PubMed: 7737504), respectively. Briefly, about 50 ng of SpeI-XhoI-digested p425GPD vector was ligated to about 50 ng of SpeI/XhoI-restricted eda fragment in a 10 μl reaction with 1X T4 DNA ligase buffer and 1 U T4 DNA ligase (Fermentas) overnight at 16 °C. About 3 μl of this reaction was used to transform DH5α competent cells (Zymo Research), which were plated onto LB agar media containing 100 μg/ml ampicillin. Similarly, about 50 ng of SpeI-HindIII-digested p426GPD vector was ligated to about 42 ng of SpeI/HindIII-restricted edd fragment in a 10 μl reaction with 1X T4 DNA ligase buffer and 1 U T4 DNA ligase (Fermentas) overnight at 16 °C. About 3 μl of this reaction was used to transform DH5α competent cells (Zymo Research), which were plated onto LB agar media containing 100 μg/ml ampicillin.
A haploid Saccharomyces cerevisiae strain (BY4742; ATCC catalog number 201389) was cultured in YPD media (10g Yeast Extract, 20g Bacto-Peptone, 20g Glucose, 1 L total) at about 30 °C.
Separate aliquots of these cultured cells were transformed with plasmid construct(s) containing the eda gene alone, the eda and edd genes, or vector alone. Transformation was accomplished using the Zymo frozen yeast transformation kit (Catalog number T2001; Zymo Research Corp., Orange, CA). To 50μl of cells was added approximately 0.5-1μg plasmid DNA, and the cells were cultured on SC drop-out media with glucose minus leucine (eda), or minus uracil and minus leucine (eda and edd) (about 20g glucose; about 2.21g SC drop-out mix [described below]; about 6.7g yeast nitrogen base, all in about 1 L of water); this mixture was cultured for 2-3 days at about 30 °C. SC drop-out mix contained the following ingredients (Sigma); all indicated weights are approximate:
0.4g Adenine hemisulfate
3.5g Arginine
1g Glutamic Acid
0.433g Histidine
0.4g Myo-Inositol
5.2g Isoleucine
2.63g Leucine
0.9g Lysine
1.5g Methionine
0.8g Phenylalanine
1.1g Serine
1.2g Threonine
0.8g Tryptophan
0.2g Tyrosine
0.2g Uracil
1.2g Valine
Activity and Western Analyses
Cell lysates of the various EDD and EDA expressing strains were prepared as follows. About 50 to 100ml of SCD-ura-leu media containing 10mM MnCl2 was used to culture strains containing the desired plasmid constructs. When cultured aerobically, strains were grown in a 250ml baffled shaker flask. When grown anaerobically, 400μl/L Tween-80 (British Drug Houses, Ltd., West Chester, PA) plus 0.01g/L Ergosterol (Alfa Aesar, Ward Hill, MA) were added and the culture was grown in a 250ml serum bottle outfitted with a butyl rubber stopper with an aluminum crimp cap. Each strain was inoculated at an initial OD600 of about 0.2 and grown to an OD600 of about 3-4. Cells were grown at 30 °C at 200rpm.
Yeast cells were harvested by centrifugation at 1046 x g (e.g., approximately 3000 rpm) for 5 minutes at 4°C. The supernatant was discarded and the cells were resuspended in 25 mL cold sterile water. This wash step was repeated once. Washed cell pellets were resuspended in 1 mL sterile water, transferred to a 1.5 mL screw cap tube, and centrifuged at 16,100 x g (e.g., approximately 13,200 rpm) for 3 minutes at 4 °C. Cell pellets were resuspended in about 800-1000μl of freshly prepared lysis buffer (50 mM Tris-Cl pH 7.0, 10 mM MgCl2, 1X protease inhibitor cocktail EDTA-free (Thermo Scientific, Waltham, MA)) and the tube was filled with zirconia beads to avoid any headspace in the tube. The tubes were placed in a Mini BeadBeater (Bio Spec Products, Inc., Bartlesville, OK) and vortexed twice for 30 seconds at room temperature. The supernatant was transferred to a new 1.5 mL microcentrifuge tube and centrifuged twice to remove cell debris at 16,100 x g (e.g., approximately 13,200 rpm) for 10 minutes at 4 °C. Quantification of the lysates was performed using the Coomassie-Plus kit (Thermo Scientific, San Diego, CA) as directed by the manufacturer.
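The centrifugation steps above are stated both as relative centrifugal force (x g) and as approximate rotor speed (rpm). The two are related through the rotor radius by the standard formula RCF = 1.118 x 10^-5 x r(cm) x rpm^2. The following is a minimal sketch, not part of the original disclosure; the rotor radii used are assumptions back-calculated from the stated values, since the rotors are not identified in the text.

```python
RCF_CONSTANT = 1.118e-5  # standard conversion factor for radius in centimetres


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return (rcf / (RCF_CONSTANT * radius_cm)) ** 0.5


# Assumed radii (not given in the text): about 10.4 cm for the harvest
# centrifuge and about 8.26 cm for the microcentrifuge.
print(round(rcf_from_rpm(3000, 10.4)))   # close to the stated 1046 x g
print(round(rpm_from_rcf(16100, 8.26)))  # close to the stated 13,200 rpm
```

When a different rotor is used, matching the x g value rather than the rpm value keeps the pelleting conditions comparable.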
Strain   EDD                        EDA
BF428    p426GPD (vector control)   p425GPD (vector control)
BF604    E. coli native             E. coli native
BF460    E. coli native with 6-HIS  E. coli native with 6-HIS
BF591    PAO1 native                PAO1 native
BF568    PAO1 native with 6-HIS     PAO1 native with 6-HIS
BF592    PAO1 native                E. coli native
BF603    E. coli native             PAO1 native
About 5-10 μg of total cell extract was used for SDS-gel [NuPage 4-12% Bis-Tris gels (Life Technologies, Carlsbad, CA)] electrophoresis and Western blot analyses.
SDS-PAGE gels were run according to the manufacturer's recommendation using NuPage MES-SDS Running Buffer at 1X concentration with the addition of NuPage antioxidant into the cathode chamber at a 1X concentration. Novex Sharp Protein Standards (Life Technologies, Carlsbad, CA) were used as standards. For Western analysis, gels were transferred onto a nitrocellulose membrane (0.45 micron, Thermo Scientific, San Diego, CA) using Western blotting filter paper (Thermo Scientific) and a Bio-Rad Mini Trans-Blot Cell (Bio-Rad, Hercules, CA) system for approximately 90 minutes at 40V. Following transfer, the membrane was washed in 1X PBS (EMD, San Diego, CA), 0.05% Tween-20 (Fisher Scientific, Fairlawn, NJ) for 2-5 minutes with gentle shaking. The membrane was blocked in 3% BSA dissolved in 1X PBS and 0.05% Tween-20 at room temperature for about 2 hours with gentle shaking. The membrane was washed once in 1X PBS and 0.05% Tween-20 for about 5 minutes with gentle shaking. The membrane was then incubated at room temperature with a 1:5000 dilution of primary antibody (Ms mAB to 6x His Tag, AbCam, Cambridge, MA) in 0.3% BSA (Fraction V, EMD, San Diego, CA) dissolved in 1X PBS and 0.05% Tween-20. Incubation was allowed to proceed for about 1 hour with gentle shaking. The membrane was then washed three times for 5 minutes each with 1X PBS and 0.05% Tween-20 with gentle shaking. The secondary antibody [Dnk pAb to Ms IgG (HRP), AbCam, Cambridge, MA] was used at a 1:15000 dilution in 0.3% BSA and allowed to incubate for about 90 minutes at room temperature with gentle shaking. The membrane was washed three times for about 5 minutes each using 1X PBS and 0.05% Tween-20 with gentle shaking. The membrane was incubated with 5ml of Supersignal West Pico Chemiluminescent substrate (Thermo Scientific, San Diego, CA) for 1 minute and was then exposed to a phosphorimager (Bio-Rad Universal Hood II, Bio-Rad, Hercules, CA) for about 10-100 seconds.
The results of the Western blots are shown in FIGS. 13A and 13B. Included in the expression data are engineered and/or optimized versions of certain eda and edd genes. The genes were modified to include a C-terminal HIS tag to facilitate purification. The two letters refer to the EDD and EDA source, respectively. P is from P. aeruginosa PAO1, E is from E. coli, Z is from Zymomonas mobilis ZM4, hot rod is the optimized version of Zymomonas mobilis, Harmonized is the codon harmonized version of Zymomonas mobilis, and V refers to the vector(s). Both total crude extract and the solubilized extract are shown. The results presented in FIGS. 13A and 13B indicate that the PAO1 EDD protein is expressed and soluble in S. cerevisiae. The results also demonstrate that the E. coli EDA protein is expressed and soluble. It was not clear from these experiments whether the PAO1 EDA was soluble in yeast.
Example 18: EDD and EDA activity assays
Cell lysates of the various EDD and EDA expressing strains were prepared as follows. About 50 to 100ml of SCD-ura-leu media containing 10mM MnCl2 was used. When cultured aerobically, strains were grown in a 250ml baffled shake flask. When grown anaerobically, 400μl/L Tween-80 (British Drug Houses, Ltd., West Chester, PA) plus 0.01g/L Ergosterol (Alfa Aesar, Ward Hill, MA) were added and the culture was grown in a 250ml serum bottle outfitted with a butyl rubber stopper with an aluminum crimp cap. Each strain was inoculated at an initial OD600 of about 0.2 and grown to an OD600 of about 3-4. Cells were grown at 30 °C at 200rpm. Yeast cells were harvested by centrifugation at 1046 x g (3000 rpm) for 5 minutes at 4°C. The supernatant was discarded and the cells were resuspended in 25 mL cold sterile water. This wash step was repeated once. Washed cell pellets were resuspended in 1 mL sterile water, transferred to a 1.5 mL screw cap tube, and centrifuged at 16,100 x g (13,200 rpm) for 3 minutes at 4 °C. Cell pellets were resuspended in about 800-1000μl of freshly prepared lysis buffer (50 mM Tris-Cl pH 7.0, 10 mM MgCl2, 1X protease inhibitor cocktail EDTA-free (Thermo Scientific, Waltham, MA)) and the tube was filled with zirconia beads to avoid any headspace in the tube. The tubes were placed in a Mini BeadBeater (Bio Spec Products, Inc., Bartlesville, OK) and vortexed twice for 30 seconds at room temperature. The supernatant was transferred to a new 1.5 mL microcentrifuge tube and centrifuged twice to remove cell debris at 16,100 x g (13,200 rpm) for 10 minutes at 4 °C.
Quantification of the lysates was performed using the Coomassie-Plus kit (Thermo Scientific, San Diego, CA) as directed by the manufacturer.
About 750 μg of crude extract, 1X assay buffer (50 mM Tris-Cl pH 7.0, 10 mM MgCl2), 3U lactate dehydrogenase (5 μg/μL in 50 mM Tris-Cl pH 7.0), and 10μl of 1 mM 6-phosphogluconate dissolved in 50 mM Tris-Cl pH 7.0 were mixed in a reaction of about 400μl. This reaction mix was transferred to a 1 ml quartz cuvette and allowed to incubate for about 5 minutes at 30 °C. To this reaction, 100μl of 1.5mM NADH (prepared in 50mM Tris-Cl pH 7.0) was added, and the change in Abs340nm over the course of 5 minutes at 30 °C was monitored in a Beckman DU-800 spectrophotometer using the Enzyme Mechanism software package (Beckman Coulter, Inc, Brea, CA).
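In a lactate dehydrogenase-coupled assay of this kind, the rate of NADH oxidation is read as a decrease in absorbance at 340 nm and can be converted to a specific activity with the Beer-Lambert law. The following is a minimal sketch, not part of the original disclosure. It assumes the commonly used NADH extinction coefficient of 6.22 mM^-1 cm^-1 at 340 nm, a 1 cm path length, a final reaction volume of 0.5 ml (about 400μl reaction plus 100μl NADH) and 0.75 mg of extract protein; the absorbance slope in the example is hypothetical, and the function name `specific_activity` is illustrative only.

```python
NADH_EXT_COEFF_MM = 6.22  # mM^-1 cm^-1 at 340 nm (assumed literature value)


def specific_activity(delta_a340_per_min, reaction_volume_ml, protein_mg,
                      path_cm=1.0):
    """Return specific activity in micromol NADH oxidised per minute per mg.

    delta_a340_per_min: absorbance slope (negative when NADH is consumed).
    """
    # Beer-Lambert: concentration change (mM/min) = dA/min / (epsilon * path)
    rate_mm_per_min = abs(delta_a340_per_min) / (NADH_EXT_COEFF_MM * path_cm)
    # 1 mM equals 1 micromol per ml, so mM/min * ml gives micromol/min
    umol_per_min = rate_mm_per_min * reaction_volume_ml
    return umol_per_min / protein_mg


# Hypothetical slope of -0.0622 A340 per minute in the 0.5 ml reaction
print(specific_activity(-0.0622, 0.5, 0.75))
```

A background slope measured without 6-phosphogluconate would normally be subtracted before this conversion; that correction is omitted here for brevity.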
The table below presents the relative specific activities for BY4742 strains expressing EDD and EDA from either P. aeruginosa (PAO1) or E. coli sources. The results presented in the table below indicate that each of the listed combinations of EDD and EDA genes, when expressed in S. cerevisiae strain BY4742, confers activity.
Figure imgf000214_0001
EDD-E/EDA-E 0.839 x 10^-3 0.16270 0.2169
The data presented above are also presented graphically in FIG. 14. FIG. 14 graphically displays the relative activities of the various EDD/EDA combinations presented in the table above, as measured in assays using 750 micrograms of crude extract. From the height of the PE bar in FIG. 14, and the data presented in the table above, it is evident that the combinations conferring the highest level of activity were the EDD-P/EDA-E (e.g., PE) and EDD-P/EDA-P (e.g., PP) combinations.
Example 19: Improved ethanol yield from yeast strains expressing EDD and EDA constructs
Strains BF428 (vector control), BF591 (EDD-PAO1/EDA-PAO1), BF592 (EDD-PAO1/EDA-E. coli), BF603 (EDD-E. coli/EDA-PAO1) and BF604 (EDD-E. coli/EDA-E. coli) were inoculated at an initial OD600 of 0.5 into 15ml SCD-ura-leu media containing 400μl/L Tween-80 (British Drug Houses, Ltd., West Chester, PA) plus 0.01g/L Ergosterol (EMD, San Diego, CA) in 20ml Hungate tubes outfitted with a butyl rubber stopper and sealed with an aluminum crimped cap to prevent oxygen from entering the culture, and were grown for about 20 hours. Glucose and ethanol in the culture media were assayed using YSI 2700 BioAnalyzer instruments (world wide web uniform resource locator ysi.com), according to the manufacturer's recommendations, at 0 and 20 hours post inoculation. The results of the fermentation of glucose to ethanol are shown graphically in FIG. 15. The results presented in FIG. 15 indicate that the presence of the EDD/EDA combinations in S. cerevisiae increases the yield of ethanol produced, when compared to a vector-only control. The EDD/EDA combinations that showed the greatest fermentation efficiency in yeast were EDD-P/EDA-E (e.g., PE) and EDD-E/EDA-P (e.g., EP).
Example 20: Improved ethanol yield from yeast strains expressing EDD and EDA from PAO1 in fermentors
A fermentation test of strain BF591 [BY4742 with plasmids pBF290 (p426GPD-EDD_PAO1) and pBF292 (p425GPD-EDA_PAO1)] was conducted against the BF428 (BY4742 p426GPD/p425GPD) control strain in 700ml w.v. Multifors multiplexed fermentors. The fermentation medium was SC-Ura-Leu with about 2% glucose. Vessels were inoculated with about a 6.25% inoculum from overnight cultures grown in about 50 ml SC-Ura-Leu with about 2% glucose.
The cultures were grown aerobically at about 30 °C with about 250 rpm agitation and a 1 vvm sparge of process air (21% O2). The pH was controlled at around 5.0 with 0.25 N NaOH. Once glucose concentrations dropped below 0.5 g/L, the fermentation was switched to anaerobic conditions. Before changing to anaerobic conditions, samples were taken to measure glucose concentrations and biomass by OD600 as reported in Table B. Ethanol and glucose concentrations in the fermentation broth were monitored using YSI 2700 BioAnalyzer instruments.
The table below presents the elapsed fermentation time (EFT), the biomass and glucose at the start of anaerobic fermentation in a 400 ml fermentor. The edd and eda combinations carried by the strains are described above.
Figure imgf000216_0001
At the beginning of the anaerobic portion of the fermentation, a bolus of 20 g/L glucose plus 3.35 g/L of yeast nitrogen base without amino acids was added to the fermentors. In addition, 4 ml/L of 2.5 g/L ergosterol in ethanol, 0.4 ml/L Tween 80, and 0.01% AF-204 were added to each fermentor. Oxygen was purged with 100% N2 sparged at about 1 vvm until pO2 was below 1%.
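The supplement doses above are given as a volume of stock per litre of broth, which can be converted to a final concentration for comparison with the tube-scale experiments. The following is a minimal sketch, not part of the original disclosure; the function name is hypothetical, and the small volume increase caused by the addition itself is ignored.

```python
def final_conc_g_per_l(stock_g_per_l, dose_ml_per_l):
    """Final concentration (g/L) after adding dose_ml_per_l of a stock.

    Assumes the added volume is negligible relative to the broth volume.
    """
    return stock_g_per_l * dose_ml_per_l / 1000.0


# 4 ml/L of a 2.5 g/L ergosterol stock
print(final_conc_g_per_l(2.5, 4))  # 0.01
```

The result, 0.01 g/L ergosterol, matches the ergosterol concentration used for the anaerobic serum bottle and Hungate tube cultures described earlier.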
Samples were taken every 2 to 7 hours and measured for ethanol and glucose concentrations and OD600. The fermentation was harvested when the glucose concentration was below 0.05 g/L, at 50 hours elapsed fermentation time (EFT). Ethanol and glucose concentrations and OD600 of the final sample are reported in the table below.
Figure imgf000216_0002
The data presented in the table above are also presented graphically in FIGS. 16A and 16B. FIG. 16A presents the fermentation data from strain BF428 (BY4742 with vector controls) and FIG. 16B presents the fermentation data from strain BF591 (BY4742 with EDD-PAO1/EDA-PAO1).
Fermentation profiles for strains BF428 and BF591, grown on 2% dextrose, were calculated and are presented in the table below.
Strain Yx/s Yp/s Yp/x Qp qp
BF428 0.24 0.40 7.19 0.02 0.05
BF591 0.23 0.43 7A4 0.02 0.07
Yx/s=OD/g glucose
Yp/s=g ethanol/g glucose
Yp/x=g ethanol/OD
Qp=g ethanol/(L·h)
qp=g ethanol/(OD·h)
The results from the fermentation show that BF591 has a higher ethanol yield (triangles, compare FIG. 16A and FIG. 16B) than the control BF428 strain. The calculated yield of ethanol was also determined to be higher in the engineered BF591 strain (0.43g ethanol/g glucose) than in the BF428 control strain (0.40g ethanol/g glucose).
Example 21: Improved ethanol yield in a tal1 strain of S. cerevisiae expressing EDD and EDA from PAO1
To generate BY4741 and BY4742 tal1 mutant strains, the following procedure was used:
Oligonucleotides
#350 - 5'-TAAAACGACGGCCAGTGAAT-3'
#351 - 5'-TGCAGGTCGACTCTAGAGGAT-3'
#352 - 5'-GTGTGCGTGTATGTGTACACCTGTATTTAATTTCCTTACTCGCGGGTTTTTCTAAAACGACGGCCAGTGAAT-3'
#353 - 5'-TGTACCAGTCTAGAATTCTACCAACAAATGGGGAAATCAAAGTAACTTGGGCTGCAGGTCGACTCTAGAGGA-3'
All oligonucleotides set forth above were purchased from Integrated DNA Technologies ("IDT", Coralville, IA). PCR amplification was performed as follows: about 50ng of the pBFU-719 DNA (e.g., plasmid with unique 200-mer sequence) was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers (#350/#351 in the first round), and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50μl reaction mix. The reaction mixture was cycled as follows: 95°C for 10 minutes followed by 30 rounds of 95°C for 20 seconds, 60°C for 30 seconds, and 72°C for 45 seconds. A final 5 minute extension reaction at 72°C was also included. A second round of PCR amplification was done using 50ng of the first round PCR amplification product with 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers (#352/#353 in the second round), and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50μl reaction mix. The second reaction mixture was cycled as follows: 95°C for 10 minutes followed by 30 rounds of 95°C for 20 seconds, 60°C for 30 seconds, and 72°C for 45 seconds. A final 5 minute extension reaction at 72°C was also included. The final PCR product was purified using the Zymo Research DNA Clean & Concentrator-25 kit (Zymo Research, Orange, CA).
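The two-round amplification above uses second-round primers (#352/#353) whose 3' ends repeat the first-round primers (#350/#351), so that the second round re-amplifies the first-round product while adding longer 5' extensions. The following is a minimal sketch, not part of the original disclosure, that checks this relationship from the listed sequences; the helper name `overlap_len` is hypothetical, and the reading of the 5' extensions as targeting sequence for integration is an interpretation rather than something the text states.

```python
P350 = "TAAAACGACGGCCAGTGAAT"
P351 = "TGCAGGTCGACTCTAGAGGAT"
P352 = ("GTGTGCGTGTATGTGTACACCTGTATTTAATTTCCTTACTCGCGGGTTTTTCT"
        "AAAACGACGGCCAGTGAAT")
P353 = ("TGTACCAGTCTAGAATTCTACCAACAAATGGGGAAATCAAAGTAACTTGGGCTG"
        "CAGGTCGACTCTAGAGGA")


def overlap_len(outer, inner):
    """Length of the longest 3'-terminal stretch of `outer` found in `inner`."""
    best = 0
    for k in range(1, min(len(outer), len(inner)) + 1):
        if outer[-k:] in inner:
            best = k
        else:
            break
    return best


for outer, inner, label in ((P352, P350, "#352 vs #350"),
                            (P353, P351, "#353 vs #351")):
    k = overlap_len(outer, inner)
    print(label, "shared 3' bases:", k, "5' extension:", len(outer) - k)
```

With the sequences as listed, each second-round primer shares 20 bases at its 3' end with the corresponding first-round primer.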
Transformation was accomplished by a high-efficiency competency method. A 5ml culture of the BY4742 or BY4741 strain was grown overnight at about 30 °C with shaking at about 200rpm. A suitable amount of this overnight culture was added to 60ml of YPD media to obtain an initial OD600 of about 0.2 (approximately 2 x 10^6 cells/ml). The cells were allowed to grow at 30°C with agitation (about 200rpm) until the OD600 was about 1. The cells were then centrifuged at 3000rpm for 5 min, washed with 10ml sterile water and re-centrifuged. The cell pellet was resuspended in 1 ml sterile water, transferred to a 1.5ml sterile microcentrifuge tube and spun down at 4000 x g for about 5 minutes. This cell pellet was resuspended in 1 ml sterile 1X TE/LiOAc solution (10mM Tris-HCl, 1mM EDTA, 100mM LiOAc, pH7.5) and re-centrifuged at about 4000 x g for about 5 minutes. The cell pellet was resuspended in 0.25ml 1X TE/LiOAc solution. For the transformation, 50μl of these cells were aliquoted to a 1.5ml microcentrifuge tube, and about 1 μg purified PCR product and 5μl of salmon sperm DNA that had been previously boiled for about 5 minutes and placed on ice were added. 300μl of a sterile PEG solution was then added (40% PEG 3500, 10mM Tris-HCl, 1mM EDTA, 100mM LiOAc, pH7.5). This mixture was allowed to incubate at 30°C for about one hour with gentle mixing every 15 minutes. About 40μl DMSO (Sigma, St. Louis, MO) was added to the incubating mixture, and the mixture was heat shocked at about 42°C for about 15 minutes. The cells were pelleted in a microcentrifuge at 13000rpm for about 30 seconds and the supernatant removed. The cells were resuspended in 1 ml 1X TE (10mM Tris-HCl, 1mM EDTA, pH 7.5), centrifuged at 13000rpm for about 30 seconds and resuspended in 1 ml 1X TE. About 100-200μl of cells were plated onto SCD-URA media, as described above, and allowed to grow at about 30 °C for about 3 days.
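The "suitable amount" of overnight culture needed to reach the starting OD600 can be estimated from a simple dilution balance. The following is a minimal sketch, not part of the original disclosure; it assumes OD600 scales linearly with cell density over the range used, and the overnight OD600 in the example is a hypothetical value.

```python
def inoculum_volume_ml(overnight_od, target_od, medium_ml):
    """Volume of overnight culture to add to medium_ml of fresh medium.

    Solves overnight_od * v / (medium_ml + v) = target_od for v.
    """
    if overnight_od <= target_od:
        raise ValueError("overnight culture must be denser than the target")
    return target_od * medium_ml / (overnight_od - target_od)


# Hypothetical overnight culture at OD600 6.2, diluted into 60 ml of YPD
print(round(inoculum_volume_ml(6.2, 0.2, 60), 2))  # 2.0
```

Dense cultures are usually diluted before reading OD600 so that the measurement stays in the linear range of the spectrophotometer.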
After 3 days, transformed colonies were streaked for single colonies on SCD-URA plates and allowed to grow at about 30 °C for about 3 days. From these plates, single colonies were streaked onto SCD agar plates (20g/L agar in SCD media) containing 1 g/L 5-FOA (Research Products International Corp, Mt. Prospect, IL), and also inoculated into YPD liquid broth. The plates were allowed to grow at about 30°C for about 4 days and the liquid culture was grown overnight at about 30 °C with agitation of about 200rpm.
To confirm that integration of the construct was correct, genomic DNA was prepared from the YPD overnight cultures. Briefly, the yeast cells were pelleted by centrifugation at room temperature for 5 minutes at approximately 3000rpm. The cell pellet was resuspended in 200μl of breaking buffer (2% Triton X-100, 1% SDS, 100mM NaCl, 10mM Tris pH8, 1mM EDTA) and placed into a 1.5ml microcentrifuge tube containing about 200μl glass beads and about 200μl of phenol:chloroform:isoamyl alcohol (Ambion, Austin, Texas). The mixture was vortexed for about 2 to 5 minutes at room temperature. About 200μl of sterile water was then added and the mixture vortexed again. The mixture was centrifuged for about 10 minutes at about 13000rpm and the aqueous layer transferred to a new microcentrifuge tube. About 1/10th of the aqueous layer's volume of 3M NaOAc (British Drug Houses, Ltd., West Chester, PA) was added to the aqueous layer, and ethanol at 2.5X the total volume of the mixture was added and mixed well. The genomic DNA was then precipitated by placing the tubes at -80 °C for at least one hour (or in a dry ice/ethanol bath for about 30 minutes). The tubes were then centrifuged at about 13000rpm for 5 minutes at about 4°C to pellet the DNA. The DNA pellet was then washed two or more times with about 200μl of 70% ethanol and re-centrifuged. The DNA pellet was dried using vacuum assisted air drying and resuspended in about 50 to 200μl 1X TE.
The genomic DNA isolated as described above was used in a PCR amplification reaction in which about 50ng of the genomic DNA was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers (#276/#277), and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50μl reaction mix. The reaction mix was cycled as follows: 95°C for 10 minutes followed by 30 rounds of 95°C for 20 seconds, 60°C for 30 seconds, and 72°C for 45 seconds. A final 5 minute extension reaction at 72°C was also included. A second round of PCR amplification was done using 50ng of the first round PCR amplification product with 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers (#352/#353 in the second round), and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50μl reaction mix. The second mixture was cycled as follows: 95°C for 10 minutes followed by 30 rounds of 95°C for 20 seconds, 55°C for 30 seconds, and 72°C for about 30 seconds. A final 5 minute extension reaction at 72°C was also included.
Positive colonies from the screen in YPD that had a PCR product of about 1600bp, indicating the insertion of the integration construct in the TAL1 locus, and that grew on the plates containing 5-FOA were grown overnight in YPD at about 30 °C with agitation of about 200rpm. Genomic DNA was prepared as above and checked by PCR amplification using primers #276 and #277 (described below). Positive clones were identified which had a PCR product of 359bp, indicating the deletion of the tal1 locus and the remaining portion of the 200-mer tag. The strain carrying the correct traits was labeled as BF716. The BY4741 version was labeled as BF717.
Oligonucleotides
#276 - 5'-GTCGACTGGAAATCTGGAAGGTTGGT-3'
#277 - 5'-GTCGACGCTTTGCTGCAAGGATTCAT-3'
The BY4742 tal1 strain was then made competent using the high-efficiency competency method described above. About 500ng of plasmids pBF290 and pBF292, or of plasmids p426GPD and p425GPD, was used to transform the BY4742 tal1 strain. The final transformation mixture was plated onto SCD-ura-leu plates and grown at about 30 °C for about 3 days. Strain BF716 (BY4742 tal1) with p426GPD/p425GPD was labeled as BF738. Strain BF716 with pBF290/pBF292 was labeled as BF741.
A fermentation test of the BF738 was conducted against BF741 in a 400ml multiplexed fermentor. The fermentation medium utilized was SC -Ura -Leu with 2% glucose. Cultures were grown overnight in 50 ml SC -Ura -Leu 2% glucose and used to inoculate the fermentors at 4 to 5% inoculum. OD600 readings of the inoculum are shown in the table below.
Figure imgf000220_0001
The cultures were grown aerobically at about 30 °C with about 250 rpm agitation and a 0.5 vvm sparge of process air (21% O2). The pH was controlled at 5.0 with 1 N NaOH. Glucose concentrations in the fermentation broth were monitored by YSI 2700 BioAnalyzers during aerobic fermentation. Once glucose was depleted, the fermentation was switched to anaerobic conditions. Before changing to anaerobic conditions, samples were taken to measure glucose usage. Biomass was measured by monitoring the optical density of the growth medium at 600 nanometers (e.g., OD600). EFT at glucose depletion, glucose concentrations and OD600 are shown in the table below. The table below reports the amount of biomass in the fermentor and the amount of ethanol produced in grams per liter, after the specified amount of time (EFT), by the respective strains.
Figure imgf000221_0001
At the beginning of anaerobic fermentation, about 19 g/L glucose, 3.7 g/L YNB, 4 ml/L of 2.5 g/L ergosterol (in ethanol), 0.4 ml/L Tween 80, and 0.01% AF-204 were added to each fermentor. Oxygen was purged with 100% N2 sparged at 0.25 vvm for the remainder of the fermentation. Samples were taken every 4 to 12 hours and analyzed for ethanol production and glucose utilization using the YSI BioAnalyzers, and for amount of biomass by OD600. The fermentations were harvested when the glucose bolus was depleted. Anaerobic ethanol produced, anaerobic glucose consumption and OD600 of the final sample are shown in the table below.
Figure imgf000221_0002
The results are also presented graphically in FIGS. 17A and 17B. FIG. 17A illustrates the fermentation data for strain BF738 (BY4742 tal1 with vector controls p426GPD and p425GPD) and FIG. 17B illustrates the fermentation data for strain BF741 (BY4742 tal1 with plasmids pBF290 (EDD-PAO1) and pBF292 (EDA-PAO1)). The results presented above and in FIGS. 17A and 17B indicate that strain BF741, which expresses the activities encoded by the eda and edd genes, yields more ethanol than control strain BF738. Strain BF741 produced about 0.43g ethanol per gram of glucose consumed whereas strain BF738 produced only 0.36g ethanol per gram of glucose consumed. Fermentation profiles were calculated for strains BF738 and BF741 and are presented below.
Strain Yx/s Yp/s Yp/x Qp qp
BF738 0.198 0.358 3.76 0.371 0.103
BF741 0.203 0.439 2.16 0.439 0.131
Yx/s=OD/g glucose, Yp/s=g ethanol/g glucose, Yp/x=g ethanol/OD
Qp=g ethanol/(L·h), qp=g ethanol/(OD·h)
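The fermentation profile parameters defined in the legend above can be computed from end-point measurements. The following is a minimal sketch, not part of the original disclosure; the class and function names are hypothetical, the example run uses invented numbers rather than data from the tables, and the exact basis the authors used (for example total grams versus grams per liter) is not stated and is assumed here. The theoretical maximum of about 0.511 g ethanol per g glucose follows from two ethanol (46.07 g/mol) per glucose (180.16 g/mol).

```python
from dataclasses import dataclass

THEORETICAL_YPS = 0.511  # g ethanol per g glucose at 2 mol ethanol per mol glucose


@dataclass
class FermentationRun:
    glucose_consumed_g: float
    ethanol_produced_g: float
    biomass_od: float   # OD600 of the final sample
    volume_l: float
    hours: float        # elapsed time over which ethanol was produced


def profile(run):
    """Return the yield and productivity terms named in the legend."""
    return {
        "Yx/s": run.biomass_od / run.glucose_consumed_g,
        "Yp/s": run.ethanol_produced_g / run.glucose_consumed_g,
        "Yp/x": run.ethanol_produced_g / run.biomass_od,
        "Qp": run.ethanol_produced_g / (run.volume_l * run.hours),
        "qp": run.ethanol_produced_g / (run.biomass_od * run.hours),
    }


# Hypothetical run: 20 g glucose consumed, 8.6 g ethanol, OD600 4.0, 1 L, 40 h
example = profile(FermentationRun(20.0, 8.6, 4.0, 1.0, 40.0))
for name, value in example.items():
    print(name, round(value, 4))
print("fraction of theoretical:", round(example["Yp/s"] / THEORETICAL_YPS, 3))
```

On this basis, a Yp/s of 0.43 corresponds to roughly 84% of the theoretical maximum and a Yp/s of 0.36 to roughly 70%.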
Example 22: Complementation and improved ethanol yield in a pfk1 strain of S. cerevisiae expressing the EDA and EDD genes from P. aeruginosa
Strain BF205 (YGR240C/BY4742, ATCC Cat. No. 4015893; PubMed: 10436161) was transformed with plasmids p426GPD and p425GPD or with plasmids pBF290 (p426GPD/EDD-PAO1) and pBF292 (p425GPD/EDA-PAO1), generating strains BF740 (vector controls) and BF743, respectively. Transformation was accomplished by a high-efficiency competency method using 500ng of plasmids p426GPD and p425GPD or plasmids pBF290 and pBF292. Transformants were plated onto SCD-ura-leu agar plates and grown at about 30 °C for about 3 days. The final strains were named BF740 (BY4742 pfk1 with plasmids p426GPD and p425GPD) and BF743 (BY4742-pfk1, pBF290/pBF292).
A fermentation test of the control strain BF740 (BY4742 pfk1 with plasmids p426GPD and p425GPD) was conducted against BF743 (BY4742-pfk1, pBF290/pBF292) in 400ml w.v. Multifors multiplexed fermentors. The fermentation medium was SC-Ura-Leu with 2% glucose. Vessels were inoculated with about a 10% inoculum from overnight cultures grown in about 50 ml SC-Ura-Leu with about 2% glucose and normalized to 0.5 OD600. The actual inoculated ODs for the fermentations are shown in the table below.
Figure imgf000222_0001
The cultures were grown aerobically at about 30 °C with about 250 rpm agitation and a 1 vvm sparge of process air (21% O2). The pH was controlled at around 5.0 with 0.25 N NaOH. Once glucose concentrations dropped below 0.5 g/L, the fermentation was switched to anaerobic conditions. Before changing to anaerobic conditions, samples were taken to measure glucose concentrations and biomass by OD600 as shown in the table below. The table below shows the beginning cell biomass and glucose concentration (in grams per liter of nutrient broth). Ethanol and glucose concentrations in the fermentation broth were monitored using a YSI 2700 BioAnalyzer.
Strain   OD600nm   Ethanol (g/L)   Glucose (g/L)
BF740    5.94      5.67            0.033
BF743    5.82      5.82            0.034
At the beginning of the anaerobic portion of the fermentation, a bolus of about 18 g/L glucose plus about 4 ml/L of 2.5 g/L ergosterol in ethanol, 0.4 ml/L Tween 80, and 0.01% AF-204 were added to each fermentor. Oxygen was purged with 100% N2 sparged at about 1 vvm until pO2 was below 1%. Samples were taken every 4 to 8 hours and measured for ethanol and glucose concentrations and biomass (OD600). The fermentation was harvested when the glucose concentration was below 0.05 g/L, at about 42 hours elapsed fermentation time (EFT). Ethanol and glucose concentrations and OD600 of the final sample are shown in the table below.
Figure imgf000223_0001
The results are also presented graphically in FIGS. 18A and 18B. FIG. 18A illustrates the fermentation data for strain BF740 grown on 2% dextrose and FIG. 18B illustrates the fermentation data for strain BF743 grown on 2% dextrose. The results indicate that the BY4742 pfk1 mutant strain, BF740, can neither utilize glucose nor produce ethanol under anaerobic conditions. However, the engineered strain BF743 is capable of both utilizing glucose and producing ethanol under anaerobic conditions. Strain BF743 has a yield of about 0.39g ethanol per gram of glucose consumed versus no yield in the control strain BF740. The fermentation profiles for strains BF740 and BF743 are presented in the table below.
Strain Yx/s Yp/s Yp/x Qp qp
BF740 2.133 -0.700 -0.328 -0.022 -0.003
BF743 0.264 0.390 1.483 0.178 0.035
Yx/s=OD/g glucose, Yp/s=g ethanol/g glucose, Yp/x=g ethanol/OD
Qp=g ethanol/(L·h), qp=g ethanol/(OD·h)
Example 23: EDD and EDA activities from other sources
The EDD and EDA genes have also been isolated from additional sources and tested for the ability to direct fermentation in yeast. The additional EDD and EDA genes have been isolated from Shewanella oneidensis, Gluconobacter oxydans, and Ruminococcus flavefaciens. Genomic DNA was purchased from ATCC for both S. oneidensis (Cat. No. 700550D) and G. oxydans (621HD-5). R. flavefaciens, strain C94 (NCDO 2213), was also purchased from ATCC (Cat. No. 19208). To prepare genomic DNA, R. flavefaciens was grown in cooked meat media (Becton Dickinson, Franklin Lakes, NJ USA) overnight at 37°C and genomic DNA was isolated using a Qiagen DNeasy Blood and Tissue kit according to the manufacturer's protocol. The eda and edd genes were PCR amplified from the corresponding genomic DNA using the following sets of PCR oligonucleotides. The nucleotide and amino acid sequences of the eda and edd genes amplified with these primers are also given below.
The S. oneidensis edd gene:
5'- GTTCACTGCactagtaaaaaaATGCACTCAGTCGTTCAATCTG-3' (SEQ ID NO: 73)
5'- CTTCGAGATCTCGAGTTAGTAAAGTTCATCGATGGC-3' (SEQ ID NO: 74)
The S. oneidensis eda gene:
5'- GTTCACTGCactagtaaaaaaATGCTTGAGAATAACTGGTC-3' (SEQ ID NO: 75)
5'- CTTCGAGATCTCGAGTTAAAGTCCGCCAATCGCCTC-3' (SEQ ID NO: 76)
The G. oxydans edd gene:
5'- GTTCACTGCactagtaaaaaaATGTCTCTGAATCCCGTCGTC-3' (SEQ ID NO: 77)
5'- CTTCGAGATCTCGAGTTAGTGAATGTCGTCGCCAAC-3' (SEQ ID NO: 78)
The G. oxydans eda gene:
5'- GTTCACTGCactagtaaaaaaATGATCGATACTGCCAAACTC-3' (SEQ ID NO: 79)
5'- CTTCGAGATCTCGAGTCAGACCGTGAAGAGTGCCGC-3' (SEQ ID NO: 80)
The R. flavefaciens edd gene:
5'- GTTCACTGCactagtaaaaaaATGAGCGATAATTTTTTCTGCG-3' (SEQ ID NO: 81)
5'- CTTCGAGATCTCGAGCTATTTCCTGTTGATGATAGC-3' (SEQ ID NO: 82)
S. oneidensis 6-phosphogluconate dehydratase (edd) (SEQ ID NO: 83)
ATGCACTCAGTCGTTCAATCTGTTACTGACAGAATTATTGCCCGTAGCAAAGCATCTCGTGAA GCATACCTTGCTGCGTTAAACGATGCCCGTAACCATGGTGTACACCGAAGTTCCTTAAGTTGC GGTAACTTAGCCCACGGTTTTGCGGCTTGTAATCCCGATGACAAAAATGCATTGCGTCAATTG ACGAAGGCCAATATTGGGATTATCACCGCATTCAACGATATGTTATCTGCACACCAACCCTAT GAAACCTATCCTGATTTGCTGAAAAAAGCCTGTCAGGAAGTCGGTAGTGTTGCGCAGGTGGC TGGCGGTGTTCCCGCCATGTGTGACGGCGTGACTCAAGGTCAGGCCGGTATGGAATTGAGCT TACTGAGCCGTGAAGTGATTGCGATGGCAACCGCGGTTGGCTTATCACACAATATGTTTGATG GAGCCTTACTCCTCGGTATTTGCGATAAAATTGTACCGGGTTTACTGATTGGTGCCTTAAGTTT TGGCCATTTACCTATGTTGTTTGTGCCCGCAGGCCCAATGAAATCGGGTATTCCTAATAAGGA AAAAGCTCGCATTCGTCAGCAATTTGCTCAAGGTAAGGTCGATAGAGCACAACTGCTCGAAGC GGAAGCCCAGTCTTACCACAGTGCGGGTACTTGTACCTTCTATGGTACCGCTAACTCGAACCA ACTGATGCTCGAAGTGATGGGGCTGCAATTGCCGGGTTCATCTTTTGTGAATCCAGACGATCC ACTGCGCGAAGCCTTAAACAAAATGGCGGCCAAGCAGGTTTGTCGTTTAACTGAACTAGGCA CTCAATACAGTCCGATTGGTGAAGTCGTTAACGAAAAATCGATAGTGAATGGTATTGTTGCATT GCTCGCGACGGGTGGTTCAACAAACTTAACCATGCACATTGTGGCGGCGGCCCGTGCTGCA GGTATTATCGTCAACTGGGATGACTTTTCGGAATTATCCGATGCGGTGCCTTTGCTGGCACGT GTTTATCCAAACGGTCATGCGGATATTAACCATTTCCACGCTGCGGGTGGTATGGCTTTCCTT ATCAAAGAATTACTCGATGCAGGTTTGCTGCATGAGGATGTCAATACTGTCGCGGGTTATGGT CTGCGCCGTTACACCCAAGAGCCTAAACTGCTTGATGGCGAGCTGCGCTGGGTCGATGGCC CAACAGTGAGTTTAGATACCGAAGTATTAACCTCTGTGGCAACACCATTCCAAAACAACGGTG GTTTAAAGCTGCTGAAGGGTAACTTAGGCCGCGCTGTGATTAAAGTGTCTGCCGTTCAGCCAC AGCACCGTGTGGTGGAAGCGCCCGCAGTGGTGATTGACGATCAAAACAAACTCGATGCGTTA TTTAAATCCGGCGCATTAGACAGGGATTGTGTGGTGGTGGTGAAAGGCCAAGGGCCGAAAGC CAACGGTATGCCAGAGCTGCATAAACTAACGCCGCTGTTAGGTTCATTGCAGGACAAAGGCTT TAAAGTGGCACTGATGACTGATGGTCGTATGTCGGGCGCATCGGGCAAAGTACCTGCGGCGA TTCATTTAACCCCTGAAGCGATTGATGGCGGGTTAATTGCAAAGGTACAAGACGGCGATTTAA TCCGAGTTGATGCACTGACCGGCGAGCTGAGTTTATTAGTCTCTGACACCGAGCTTGCCACC AGAACTGCCACTGAAATTGATTTACGCCATTCTCGTTATGGCATGGGGCGTGAGTTATTTGGA GTACTGCGTTCAAACTTAAGCAGTCCTGAAACCGGTGCGCGTAGTACTAGCGCCATCGATGA ACTTTACTAA S. oneidensis 6-phosphogluconate dehydratase (edd)-Amino Acid sequence (SEQ ID NO: 84)
MHSVVQSVTDRIIARSKASREAYLAALNDARNHGVHRSSLSCGNLAHGFAACNPDDKNALRQLTK ANIGIITAFNDMLSAHQPYETYPDLLKKACQEVGSVAQVAGGVPAMCDGVTQGQPGMELSLLSRE VIAMATAVGLSHNMFDGALLLGICDKIVPGLLIGALSFGHLPMLFVPAGPMKSGIPNKEKARIRQQF AQGKVDRAQLLEAEAQSYHSAGTCTFYGTANSNQLMLEVMGLQLPGSSFVNPDDPLREALNKMA AKQVCRLTELGTQYSPIGEVVNEKSIVNGIVALLATGGSTNLTMHIVAAARAAGIIVNWDDFSELSD AVPLLARVYPNGHADINHFHAAGGMAFLIKELLDAGLLHEDVNTVAGYGLRRYTQEPKLLDGELR WVDGPTVSLDTEVLTSVATPFQNNGGLKLLKGNLGRAVIKVSAVQPQHRVVEAPAVVIDDQNKLD ALFKSGALDRDCVVVVKGQGPKANGMPELHKLTPLLGSLQDKGFKVALMTDGRMSGASGKVPAA IHLTPEAIDGGLIAKVQDGDLIRVDALTGELSLLVSDTELATRTATEIDLRHSRYGMGRELFGVLRSN LSSPETGARSTSAIDELY
G. oxydans 6-phosphogluconate dehydratase (edd) (SEQ ID NO:85)
ATGTCTCTGAATCCCGTCGTCGAGAGCGTGACTGCCCGTATCATCGAGCGTTCGAAAGTCTC CCGTCGCCGGTATCTCGCCCTGATGGAGCGCAACCGCGCCAAGGGTGTGCTCCGGCCCAAG CTGGCCTGCGGTAATCTGGCGCATGCCATCGCAGCGTCCAGCCCCGACAAGCCGGATCTGA TGCGTCCCACCGGGACCAATATCGGCGTGATCACGACCTATAACGACATGCTCTCGGCGCAT CAGCCGTATGGCCGCTATCCCGAGCAGATCAAGCTGTTCGCCCGTGAAGTCGGTGCGACGG CCCAGGTTGCAGGCGGCGCACCAGCAATGTGTGATGGTGTGACGCAGGGGCAGGAGGGCAT GGAACTCTCCCTGTTCTCCCGTGACGTGATCGCCATGTCCACGGCGGTCGGGCTGAGCCAC GGCATGTTTGAGGGCGTGGCGCTGCTGGGCATCTGTGACAAGATTGTGCCGGGCCTTCTGAT GGGCGCGCTGCGCTTCGGTCATCTCCCGGCCATGCTGATCCCGGCAGGGCCAATGCCGTCC GGTCTTCCAAACAAGGAAAAGCAGCGCATCCGCCAGCTCTATGTGCAGGGCAAGGTCGGGC AGGACGAGCTGATGGAAGCGGAAAACGCCTCCTATCACAGCCCGGGCACCTGCACGTTCTAT GGCACGGCCAATACGAACCAGATGATGGTCGAAATCATGGGTCTGATGATGCCGGACTCGGC TTTCATCAATCCCAACACGAAGCTGCGTCAGGCAATGACCCGCTCGGGTATTCACCGTCTGG CCGAAATCGGCCTGAACGGCGAGGATGTGCGCCCGCTCGCTCATTGCGTAGACGAAAAGGC CATCGTGAATGCGGCGGTCGGGTTGCTGGCGACGGGTGGTTCGACCAACCATTCGATCCATC TTCCTGCTATCGCCCGTGCCGCTGGTATCCTGATCGACTGGGAAGACATCAGCCGCCTGTCG TCCGCGGTTCCGCTGATCACCCGTGTTTATCCGAGCGGTTCCGAGGACGTGAACGCGTTCAA CCGCGTGGGTGGTATGCCGACCGTGATCGCCGAACTGACGCGCGCCGGGATGCTGCACAAG GACATTCTGACGGTCTCTCGTGGCGGTTTCTCCGATTATGCCCGTCGCGCATCGCTGGAAGG CGATGAGATCGTCTACACCCACGCGAAGCCGTCCACGGACACCGATATCCTGCGCGATGTGG CTACGCCTTTCCGGCCCGATGGCGGTATGCGCCTGATGACTGGTAATCTGGGCCGCGCGAT CTACAAGAGCAGCGCTATTGCGCCCGAGCACCTGACCGTTGAAGCGCCGGCACGGGTCTTC CAGGACCAGCATGACGTCCTCACGGCCTATCAGAATGGTGAGCTTGAGCGTGATGTTGTCGT GGTCGTCCGGTTCCAGGGACCGGAAGCCAACGGCATGCCGGAGCTTCACAAGCTGACCCCG ACTCTGGGCGTGCTTCAGGATCGCGGCTTCAAGGTGGCCCTGCTGACGGATGGACGCATGT CCGGTGCGAGCGGCAAGGTGCCGGCCGCCATTCATGTCGGTCCCGAAGCGCAGGTTGGCG GTCCGATCGCCCGCGTGCGGGACGGCGACATGATCCGTGTCTGCGCGGTGACGGGACAGAT CGAGGCTCTGGTGGATGCCGCCGAGTGGGAGAGCCGCAAGCCGGTCCCGCCGCCGCTCCC GGCATTGGGAACGGGCCGCGAACTGTTCGCGCTGATGCGTTCGGTGCATGATCCGGCCGAG GCTGGCGGATCCGCGATGCTGGCCCAGATGGATCGCGTGATCGAAGCCGTTGGCGACGACA TTCACTAA
G. oxydans 6-phosphogluconate dehydratase (edd)-Amino Acid sequence (SEQ ID NO: 86)
MSLNPVVESVTARIIERSKVSRRRYLALMERNRAKGVLRPKLACGNLAHAIAASSPDKPDLMRPTG TNIGVITTYNDMLSAHQPYGRYPEQIKLFAREVGATAQVAGGAPAMCDGVTQGQEGMELSLFSRD VIAMSTAVGLSHGMFEGVALLGICDKIVPGLLMGALRFGHLPAMLIPAGPMPSGLPNKEKQRIRQL YVQGKVGQDELMEAENASYHSPGTCTFYGTANTNQMMVEIMGLMMPDSAFINPNTKLRQAMTR SGIHRLAEIGLNGEDVRPLAHCVDEKAIVNAAVGLLATGGSTNHSIHLPAIARAAGILIDWEDISRLSS AVPLITRVYPSGSEDVNAFNRVGGMPTVIAELTRAGMLHKDILTVSRGGFSDYARRASLEGDEIVY THAKPSTDTDILRDVATPFRPDGGMRLMTGNLGRAIYKSSAIAPEHLTVEAPARVFQDQHDVLTAY QNGELERDVVVVVRFQGPEANGMPELHKLTPTLGVLQDRGFKVALLTDGRMSGASGKVPAAIHV GPEAQVGGPIARVRDGDMIRVCAVTGQIEALVDAAEWESRKPVPPPLPALGTGRELFALMRSVHD PAEAGGSAMLAQMDRVIEAVGDDIH
R. flavefaciens phosphogluconate dehydratase/DHAD (SEQ ID NO: 87)
ATGAGCGATAATTTTTTCTGCGAGGGTGCGGATAAAGCCCCTCAGCGTTCACTTTTCAATGCA CTGGGCATGACTAAAGAGGAAATGAAGCGTCCCCTCGTTGGTATCGTTTCTTCCTACAATGAG ATCGTTCCCGGCCATATGAACATCGACAAGCTGGTCGAAGCCGTTAAGCTGGGTGTAGCTAT GGGCGGCGGCACTCCTGTTGTTTTCCCTGCTATCGCTGTATGCGACGGTATCGCTATGGGTC ACACAGGCATGAAGTACAGCCTTGTTACCCGTGACCTTATTGCCGATTCTACAGAGTGTATGG CTCTTGCTCATCACTTCGACGCACTGGTAATGATACCTAACTGCGACAAGAACGTTCCCGGCC TGCTTATGGCGGCTGCACGTATCAATGTTCCTACTGTATTCGTAAGCGGCGGCCCTATGCTTG CAGGCCATGTAAAGGGTAAGAAGACCTCTCTTTCATCCATGTTCGAGGCTGTAGGCGCTTACA CAGCAGGCAAGATAGACGAGGCTGAACTTGACGAATTCGAGAACAAGACCTGCCCTACCTGC GGTTCATGTTCGGGTATGTATACCGCTAACTCCATGAACTGCCTCACTGAGGTACTGGGTATG GGTCTCAGAGGCAACGGCACTATCCCTGCTGTTTACTCCGAGCGTATCAAGCTTGCAAAGCA GGCAGGTATGCAGGTTATGGAACTCTACAGAAAGAATATCCGCCCTCTCGATATCATGACAGA GAAGGCTTTCCAGAACGCTCTCACAGCTGATATGGCTCTTGGATGTTCCACAAACAGTATGCT CCATCTCCCTGCTATCGCCAACGAATGCGGCATAAATATCAACCTTGACATGGCTAACGAGAT AAGCGCCAAGACTCCTAACCTCTGCCATCTTGCACCGGCAGGCCACACCTACATGGAAGACC TCAACGAAGCAGGCGGAGTTTATGCAGTTCTCAACGAGCTGAGCAAAAAGGGACTTATCAACA CCGACTGCATGACTGTTACAGGCAAGACCGTAGGCGAGAATATCAAGGGCTGCATCAACCGT GACCCTGAGACTATCCGTCCTATCGACAACCCATACAGTGAAACAGGCGGAATCGCCGTACT CAAGGGCAATCTTGCTCCCGACAGATGTGTTGTGAAGAGAAGCGCAGTTGCTCCCGAAATGC TGGTACACAAAGGCCCTGCAAGAGTATTCGACAGCGAGGAAGAAGCTATCAAGGTCATCTAT GAGGGCGGTATCAAGGCAGGCGACGTTGTTGTTATCCGTTACGAAGGCCCTGCAGGCGGCC CCGGCATGAGAGAAATGCTCTCTCCTACATCAGCTATACAGGGTGCAGGTCTCGGCTCAACT GTTGCTCTAATCACTGACGGACGTTTCAGCGGCGCTACCCGTGGTGCGGCTATCGGACACGT ATCCCCCGAAGCTGTAAACGGCGGTACTATCGCATATGTCAAGGACGGCGATATTATCTCCAT CGACATACCGAATTACTCCATCACTCTTGAAGTATCCGACGAGGAGCTTGCAGAGCGCAAAAA GGCAATGCCTATCAAGCGCAAGGAGAACATCACAGGCTATCTGAAGCGCTATGCACAGCAGG TATCATCCGCAGACAAGGGCGCTATCATCAACAGGAAATAG
R. flavefaciens phosphogluconate dehydratase/DHAD-Amino Acid sequence (SEQ ID NO: 88)
MSDNFFCEGADKAPQRSLFNALGMTKEEMKRPLVGIVSSYNEIVPGHMNIDKLVEAVKLGVAMGG GTPVVFPAIAVCDGIAMGHTGMKYSLVTRDLIADSTECMALAHHFDALVMIPNCDKNVPGLLMAAA RINVPTVFVSGGPMLAGHVKGKKTSLSSMFEAVGAYTAGKIDEAELDEFENKTCPTCGSCSGMYT ANSMNCLTEVLGMGLRGNGTIPAVYSERIKLAKQAGMQVMELYRKNIRPLDIMTEKAFQNALTAD MALGCSTNSMLHLPAIANECGININLDMANEISAKTPNLCHLAPAGHTYMEDLNEAGGVYAVLNEL SKKGLINTDCMTVTGKTVGENIKGCINRDPETIRPIDNPYSETGGIAVLKGNLAPDRCVVKRSAVAP EMLVHKGPARVFDSEEEAIKVIYEGGIKAGDVVVIRYEGPAGGPGMREMLSPTSAIQGAGLGSTVA LITDGRFSGATRGAAIGHVSPEAVNGGTIAYVKDGDIISIDIPNYSITLEVSDEELAERKKAMPIKRKE NITGYLKRYAQQVSSADKGAIINRK
Pairwise homology comparisons for various edd proteins are presented in the table below. The comparisons were made using ClustalW software (ClustalW and ClustalX version 2; Larkin M.A., Blackshields G., Brown N.P., Chenna R., McGettigan P.A., McWilliam H., Valentin F., Wallace I.M., Wilm A., Lopez R., Thompson J.D., Gibson T.J. and Higgins D.G., Bioinformatics 2007 23(21): 2947-2948). ClustalW is a free alignment tool available at the European Bioinformatics Institute website (e.g., world wide web uniform resource locator ebi.ac.uk, specific ClustalW location is ebi.ac.uk/Tools/clustalw2/index.html). PA01 = Pseudomonas aeruginosa PA01, E.C. = Escherichia coli, S.O. = S. oneidensis, G.O. = G. oxydans, R.F. = Ruminococcus flavefaciens.
Figure imgf000229_0001
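As a rough illustration of what a pairwise comparison of this kind measures, the following Python sketch computes percent identity over the ungapped columns of two already-aligned sequences. It is a simplified stand-in and not necessarily the exact score ClustalW reports; the function name `percent_identity` and the short hand-made alignment of the first residues of the S. oneidensis and G. oxydans edd proteins are assumptions for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hand-made (hypothetical) alignment of the N-terminal residues of two edd proteins.
SO_EDD_NTERM = "M-HS-VVQSVTDRII"  # S. oneidensis, gaps inserted by hand
GO_EDD_NTERM = "MSLNPVVESVTARII"  # G. oxydans

identity = percent_identity(SO_EDD_NTERM, GO_EDD_NTERM)
print(f"{identity:.1f}% identity over ungapped columns")
```

A real comparison would align the full-length sequences with an alignment tool first and then score the resulting alignment.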
S. oneidensis keto-hydroxyglutarate-aldolase/keto-deoxy-phosphogluconate aldolase (eda) (SEQ ID NO: 89)
ATGCTTGAGAATAACTGGTCATTACAACCACAAGATATTTTTAAACGCAGCCCTATTGTTCCTG TTATGGTGATTAACAAGATTGAACATGCGGTGCCCTTAGCTAAAGCGCTGGTTGCCGGAGGG ATAAGCGTGTTGGAAGTGACATTACGCACGCCATGCGCCCTTGAAGCTATCACCAAAATCGCC AAGGAAGTGCCTGAGGCGCTGGTTGGCGCGGGGACTATTTTAAATGAAGCCCAGCTTGGACA GGCTATCGCCGCTGGTGCGCAATTTATTATCACTCCAGGTGCGACAGTTGAGCTGCTCAAAG CGGGCATGCAAGGACCGGTGCCGTTAATTCCGGGCGTTGCCAGTATTTCCGAGGTGATGACG GGCATGGCGCTGGGCTACACTCACTTTAAATTCTTCCCTGCTGAAGCGTCAGGTGGCGTTGA TGCGCTTAAGGCTTTCTCTGGGCCGTTAGCAGATATCCGCTTCTGCCCAACAGGTGGAATTAC CCCGAGCAGCTATAAAGATTACTTAGCGCTGAAGAATGTCGATTGTATTGGTGGCAGCTGGAT TGCTCCTACCGATGCGATGGAGCAGGGCGATTGGGATCGTATCACTCAGCTGTGTAAAGAGG CGATTGGCGGACTTTAA S. oneidensis keto-hydroxyglutarate-aldolase/keto-deoxy-phosphogluconate aldolase (eda)-Amino Acid sequence (SEQ ID NO: 90)
MLENNWSLQPQDIFKRSPIVPVMVINKIEHAVPLAKALVAGGISVLEVTLRTPCALEAITKIAKEVPEA LVGAGTILNEAQLGQAIAAGAQFIITPGATVELLKAGMQGPVPLIPGVASISEVMTGMALGYTHFKF FPAEASGGVDALKAFSGPLADIRFCPTGGITPSSYKDYLALKNVDCIGGSWIAPTDAMEQGDWDRI TQLCKEAIGGL
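Each nucleotide listing above is paired with its amino acid sequence, and the pairing can be spot-checked by translating the DNA with the standard genetic code. The Python sketch below is a minimal illustration, assuming the standard codon table applies; the function name `translate` is hypothetical, and only the first 30 and last 18 nucleotides of the S. oneidensis eda listing (SEQ ID NO: 89) are used.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop the spaces used to wrap the listing
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nt and last 18 nt of the S. oneidensis eda coding sequence.
EDA_START = "ATGCTTGAGAATAACTGGTCATTACAACCA"
EDA_END = "GCGATTGGCGGACTTTAA"

print(translate(EDA_START))  # N-terminal residues
print(translate(EDA_END))    # C-terminal residues
```

The two fragments translate to the first ten and last five residues of SEQ ID NO: 90, which is the kind of consistency check that can also expose extraction damage in a sequence listing.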
G. oxydans keto-hydroxyglutarate-aldolase/keto-deoxy-phosphogluconate aldolase (eda) (SEQ ID NO: 91)
ATGATCGATACTGCCAAACTCGACGCCGTCATGAGCCGTTGTCCGGTCATGCCGGTGCTGGT GGTCAATGATGTGGCTCTGGCCCGCCCGATGGCCGAGGCTCTGGTGGCGGGTGGACTGTCC ACGCTGGAAGTCACGCTGCGCACGCCCTGCGCCCTTGAAGCTATTGAGGAAATGTCGAAAGT ACCAGGCGCGCTGGTCGGTGCCGGTACGGTGCTGAATCCGTCCGACATGGACCGTGCCGTG AAGGCGGGTGCGCGCTTCATCGTCAGCCCCGGCCTGACCGAGGCGCTGGCAAAGGCGTCG GTTGAGCATGACGTCCCCTTCCTGCCAGGCGTTGCCAATGCGGGTGACATCATGCGGGGTCT GGATCTGGGTCTGTCACGCTTCAAGTTCTTCCCGGCTGTGACGAATGGCGGCATTCCCGCGC TCAAGAGCTTGGCCAGTGTTTTTGGCAGCAATGTCCGTTTCTGCCCCACGGGCGGCATTACG GAAGAGAGCGCACCGGACTGGCTGGCGCTTCCCTCCGTGGCCTGCGTCGGCGGATCCTGG GTGACGGCCGGCACGTTCGATGCGGACAAGGTCCGTCAGCGCGCCACGGCTGCGGCACTCT TCACGGTCTGA
G. oxydans keto-hydroxyglutarate-aldolase/keto-deoxy-phosphogluconate aldolase (eda)-Amino Acid sequence (SEQ ID NO: 92)
MIDTAKLDAVMSRCPVMPVLVVNDVALARPMAEALVAGGLSTLEVTLRTPCALEAIEEMSKVPGAL VGAGTVLNPSDMDRAVKAGARFIVSPGLTEALAKASVEHDVPFLPGVANAGDIMRGLDLGLSRFK FFPAVTNGGIPALKSLASVFGSNVRFCPTGGITEESAPDWLALPSVACVGGSWVTAGTFDADKVR QRATAAALFTV
Pairwise homology comparisons for various eda proteins are presented in the table below. The comparisons were made using ClustalW software (ClustalW and ClustalX version 2; Larkin M.A., Blackshields G., Brown N.P., Chenna R., McGettigan P.A., McWilliam H., Valentin F., Wallace I.M., Wilm A., Lopez R., Thompson J.D., Gibson T.J. and Higgins D.G., Bioinformatics 2007 23(21): 2947-2948). PA01 = Pseudomonas aeruginosa PA01, E.C. = Escherichia coli, S.O. = S. oneidensis, G.O. = G. oxydans, R.F. = Ruminococcus flavefaciens.
Figure imgf000231_0001
All oligonucleotides set forth above were purchased from Integrated DNA Technologies ("IDT", Coralville, IA). These oligonucleotides were designed to incorporate a SpeI restriction endonuclease cleavage site upstream and an XhoI restriction endonuclease cleavage site downstream of the edd and eda gene constructs, such that the sites could be used to clone the genes into yeast expression vectors p426GPD (ATCC accession number 87361) and p425GPD (ATCC accession number 87359). In addition to incorporating restriction endonuclease cleavage sites, the forward oligonucleotides were designed to incorporate six consecutive A nucleotides immediately upstream of the ATG initiation codon. PCR amplification of the genes was performed as follows: about 100 ng of the genomic DNA was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. The reaction mixture was cycled as follows: 95°C for 10 minutes, followed by 30 rounds of 95°C for 20 seconds, 50°C (eda amplifications) or 53°C (edd amplifications) for 30 seconds, and 72°C for 15 seconds (eda amplifications) or 30 seconds (edd amplifications). A final 5 minute extension reaction at 72°C was also included. Each amplified product was TOPO cloned into the pCR Blunt II TOPO vector (Life Technologies, Carlsbad, CA) according to the manufacturer's recommendations and the sequences were verified (GeneWiz, La Jolla, CA).
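The primer layout described above (restriction site, six A nucleotides, then the start of the coding sequence for the forward primer; restriction site plus reverse complement of the 3' end for the reverse primer) can be sketched in Python as follows. This is a minimal illustration and not the procedure used to design the actual oligonucleotides: the function name `design_primers`, the annealing length of 18 nt, and the truncated stand-in for the S. oneidensis eda coding sequence are all assumptions, and real primers may carry extra 5' bases ahead of the restriction site.

```python
SPEI_SITE = "ACTAGT"  # SpeI recognition sequence
XHOI_SITE = "CTCGAG"  # XhoI recognition sequence
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(orf: str, anneal_len: int = 18) -> tuple[str, str]:
    """Build forward/reverse primers for an ORF that starts with ATG.

    anneal_len is a hypothetical choice; in practice it would be tuned
    to the annealing temperature of each gene.
    """
    orf = orf.upper()
    if not orf.startswith("ATG"):
        raise ValueError("ORF must begin with the ATG initiation codon")
    forward = SPEI_SITE + "A" * 6 + orf[:anneal_len]
    reverse = XHOI_SITE + reverse_complement(orf[-anneal_len:])
    return forward, reverse


# Truncated stand-in: first 30 nt + last 18 nt of the S. oneidensis eda listing.
EDA_STAND_IN = "ATGCTTGAGAATAACTGGTCATTACAACCA" + "GCGATTGGCGGACTTTAA"

fwd, rev = design_primers(EDA_STAND_IN)
print("forward:", fwd)
print("reverse:", rev)
```

Placing the A-rich stretch directly before the ATG means the same leader is carried into either expression vector with the SpeI-XhoI fragment.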
Cloning of new edd and eda genes into yeast expression vectors
Each of the sequence-verified eda and edd fragments was subcloned into the corresponding restriction sites in vectors p425GPD and p426GPD (ATCC #87361; PubMed: 7737504). Briefly, about 50 ng of SpeI-XhoI-digested p425GPD vector was ligated to about 50 ng of SpeI/XhoI-restricted eda or edd fragment in a 10 μl reaction with 1X T4 DNA ligase buffer and 1 U T4 DNA ligase (Fermentas) overnight at 16°C. About 3 μl of this reaction was used to transform DH5α competent cells (Zymo Research), which were plated onto LB agar media containing 100 μg/ml ampicillin. Final constructs were confirmed by restriction endonuclease digests and sequence verification (GeneWiz, La Jolla, CA).
In vivo assay to determine optimal EDD/EDA combination
To determine the optimal EDD/EDA gene combinations, a yeast strain was developed to enable in vivo gene combination evaluation. Growth on glucose was impaired in this strain by disrupting both copies of phosphofructokinase (PFK); however, the strain could grow normally on galactose due to the presence of a single plasmid copy of the PFK2 gene under the control of a GAL1 promoter. The strain can grow on glucose only if a functional EDD/EDA is present in the cell. The strain was generated using strain BF205 (YGR240C/BY4742, ATCC Cat. No. 4015893; Winzeler EA, et al. Science 285: 901-906, 1999, PubMed: 10436161) as the starting strain.
PFK2 expressing plasmid
The plasmid expressing the PFK2 gene under the control of the GAL1 promoter, for use in the in vivo edd/eda gene combination evaluations, was constructed by first isolating the PFK2 gene.
Primers JML/89 and JML/95 were used to amplify the PFK2 gene from BY4742 in a PCR reaction containing about 100 ng of the genomic DNA, 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers, and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. The reactions were cycled as follows: 95°C for 10 minutes, followed by 10 rounds of 95°C for 20 seconds, 55°C for 20 seconds, and 72°C for 90 seconds, and 25 rounds of 95°C for 20 seconds, 62°C for 20 seconds, and 72°C for 90 seconds. A final 5 minute extension reaction at 72°C was also included. The amplified product was TOPO cloned into the pCR Blunt II TOPO vector (Life Technologies, Carlsbad, CA) according to the manufacturer's recommendations and sequence verified (GeneWiz, San Diego, CA). The sequences of JML/89 and JML/95 are given below.
JML/89 ACTAGTATGACTGTTACTACTCCTTTTGTGAATGGTAC
JML/95 CTCGAGTTAATCAACTCTCTTTCTTCCAACCAAATGGTC
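The two primer sequences above can be checked against the PFK2 coding sequence listed in this example (SEQ ID NO: 121). The Python sketch below is illustrative only: it strips the 5' restriction site from each primer and confirms that the remainder matches the start of the gene (forward primer) or the reverse complement of its end (reverse primer). The helper names and the use of short excerpts from the start and end of the gene, rather than the full listing, are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def strip_site(primer: str, site: str) -> str:
    """Remove a 5' restriction site from a primer, failing loudly if absent."""
    if not primer.startswith(site):
        raise ValueError(f"primer does not start with {site}")
    return primer[len(site):]


JML89 = "ACTAGTATGACTGTTACTACTCCTTTTGTGAATGGTAC"    # SpeI site + gene start
JML95 = "CTCGAGTTAATCAACTCTCTTTCTTCCAACCAAATGGTC"  # XhoI site + gene end (antisense)

# Excerpts from the start and end of the PFK2 coding sequence.
PFK2_START = "ATGACTGTTACTACTCCTTTTGTGAATGGTACTTCTTATTG"
PFK2_END = "CTCATTGCTGACCATTTGGTTGGAAGAAAGAGAGTTGATTAA"

forward_ok = PFK2_START.startswith(strip_site(JML89, "ACTAGT"))
reverse_ok = PFK2_END.endswith(reverse_complement(strip_site(JML95, "CTCGAG")))
print("forward primer matches gene start:", forward_ok)
print("reverse primer matches gene end:", reverse_ok)
```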
The primers used were designed to include a unique SpeI restriction site at the 5' end of the gene and a unique XhoI restriction site at the 3' end of the gene. This SpeI-XhoI fragment (approximately 2900 bp) was cloned into the SpeI-XhoI sites of the yeast vector p416GAL (ATCC Cat. No. 87332; Mumberg D, et al., Nucleic Acids Res. 22: 5767-5768, 1994. PubMed: 7838736) in a 10 μl ligation reaction containing about 50 ng of the p416GAL plasmid and about 100 ng of the PFK2 fragment with 1X ligation buffer and 1 U T4 DNA ligase (Fermentas). This ligation reaction was incubated at room temperature for about one hour, transformed into competent DH5α (Zymo Research, Orange, CA) and plated onto LB plates containing 100 μg/ml ampicillin. The final plasmid was verified by restriction digests and sequence confirmed (GeneWiz, San Diego, CA) and was called pBF744. Plasmid pBF744 was transformed into yeast strain BF205 (BY4742 pfk1) using the procedure outlined below. The resulting strain was called BF1477.
1. Inoculate 5 mL YPD with a single yeast colony. Grow overnight at 30°C.
2. Next day: add 50 μl culture to 450 μl fresh YPD, check A660. Add a suitable amount of cells to 60 mL fresh YPD to give an A660 = 0.2 (2 x 10^6 cells/mL). Grow to A660 = 1.0 (2 x 10^7 cells/mL), approximately 5 hours.
3. Boil a solution of 10 mg/ml salmon sperm DNA for 5 min, then quick chill on ice.
4. Spin down 50 mL cells at 3000 rpm for 5 min, wash in 10 mL sterile water, recentrifuge.
5. Resuspend in 1 mL sterile water. Transfer to a 1.5 mL sterile microfuge tube, spin down.
6. Resuspend in 1 mL sterile TE/LiOAc solution. Spin down, resuspend in 0.25 mL TE/LiOAc (4 x 10^9 cells).
7. In a 1.5 mL microfuge tube, mix 50 μl yeast cells with 1-5 μg transforming DNA and 5 μl single stranded carrier DNA (boiled salmon sperm DNA).
8. Add 300 μl sterile PEG solution. Mix thoroughly. Incubate at 30°C for 60 min with gentle mixing every 15 min.
9. Add 40 μl DMSO, mix thoroughly. Heat shock at 42°C for 15 min.
10. Microfuge cells at 13000 rpm for 30 seconds, remove supernatant. Resuspend in 1 mL 1X TE, microfuge 30 sec. Resuspend in 1 mL 1X TE. Plate 100-200 μl on selective media (SCD-ura).
pfk2 knockout cassette
A knockout cassette for the PFK2 gene was constructed by first PCR amplifying about 300 bp of the 5' and 3' flanking regions of the PFK2 gene from S. cerevisiae strain BY4742 using primers JML/85 and JML/87 and primers JML/86 and JML/88, respectively. These flanking regions were designed such that the 5' flanking region had a HindIII site at its 5' edge and a BamHI site at its 3' edge. The 3' flanking region had a BamHI site at its 5' edge and an EcoRI site at its 3' edge. The nucleotide sequence of the PFK2 gene and the primers used to amplify the PFK2 flanking regions are given below.
S. cerevisiae PFK2 (from genomic sequence) SEQ ID NO: 121
ATGACTGTTACTACTCCTTTTGTGAATGGTACTTCTTATTGTACCGTCACTGCATATTCCGTTCA ATCTTATAAAGCTGCCATAGATTTTTACACCAAGTTTTTGTCATTAGAAAACCGCTCTTCTCCAG ATGAAAACTCCACTTTATTGTCTAACGATTCCATCTCTTTGAAGATCCTTCTACGTCCTGATGAA AAAATCAATAAAAATGTTGAGGCTCATTTGAAGGAATTGAACAGTATTACCAAGACTCAAGACT GGAGATCACATGCCACCCAATCCTTGGTATTTAACACTTCCGACATCTTGGCAGTCAAGGACA CTCTAAATGCTATGAACGCTCCTCTTCAAGGCTACCCAACAGAACTATTTCCAATGCAGTTGTA CACTTTGGACCCATTAGGTAACGTTGTTGGTGTTACTTCTACTAAGAACGCAGTTTCAACCAAG CCAACTCCACCACCAGCACCAGAAGCTTCTGCTGAGTCTGGTCTTTCCTCTAAAGTTCACTCT TACACTGATTTGGCTTACCGTATGAAAACCACCGACACCTATCCATCTCTGCCAAAGCCATTG AACAGGCCTCAAAAGGCAATTGCCGTCATGACTTCCGGTGGTGATGCTCCAGGTATGAACTCT AACGTTAGAGCCATCGTGCGTTCCGCTATCTTCAAAGGTTGTCGTGCCTTTGTTGTCATGGAA GGTTATGAAGGTTTGGTTCGTGGTGGTCCAGAATACATCAAGGAATTCCACTGGGAAGACGTC CGTGGTTGGTCTGCTGAAGGTGGTACCAACATTGGTACTGCCCGTTGTATGGAATTCAAGAAG CGCGAAGGTAGATTATTGGGTGCCCAACATTTGATTGAGGCCGGTGTCGATGCTTTGATCGTT TGTGGTGGTGACGGTTCTTTGACTGGTGCTGATCTGTTTAGATCAGAATGGCCTTCTTTGATC GAGGAATTGTTGAAAACAAACAGAATTTCCAACGAACAATACGAAAGAATGAAGCATTTGAATA TTTGCGGTACTGTCGGTTCTATTGATAACGATATGTCCACCACGGATGCTACTATTGGTGCTTA CTCTGCCTTGGACAGAATCTGTAAGGCCATCGATTACGTTGAAGCCACTGCCAACTCTCACTC AAGAGCTTTCGTTGTTGAAGTTATGGGTAGAAACTGTGGTTGGTTAGCTTTATTAGCTGGTATC GCCACTTCCGCTGACTATATCTTTATTCCAGAGAAGCCAGCCACTTCCAGCGAATGGCAAGAT CAAATGTGTGACATTGTCTCCAAGCACAGATCAAGGGGTAAGAGAACCACCATTGTTGTTGTT GCAGAAGGTGCTATCGCTGCTGACTTGACCCCAATTTCTCCAAGCGACGTCCACAAAGTTCTA GTTGACAGATTAGGTTTGGATACAAGAATTACTACCTTAGGTCACGTTCAAAGAGGTGGTACT GCTGTTGCTTACGACCGTATCTTGGCTACTTTACAAGGTCTTGAGGCCGTTAATGCCGTTTTG GAATCCACTCCAGACACCCCATCACCATTGATTGCTGTTAACGAAAACAAAATTGTTCGTAAAC CATTAATGGAATCCGTCAAGTTGACCAAAGCAGTTGCAGAAGCCATTCAAGCTAAGGATTTCA AGAGAGCTATGTCTTTAAGAGACACTGAGTTCATTGAACATTTAAACAATTTCATGGCTATCAA CTCTGCTGACCACAACGAACCAAAGCTACCAAAGGACAAGAGACTGAAGATTGCCATTGTTAA TGTCGGTGCTCCAGCTGGTGGTATCAACTCTGCCGTCTACTCGATGGCTACTTACTGTATGTC CCAAGGTCACAGACCATACGCTATCTACAATGGTTGGTCTGGTTTGGCAAGACATGAAAGTGT
TCGTTCTTTGAACTGGAAGGATATGTTGGGTTGGCAATCCCGTGGTGGTTCTGAAATCGGTAC TAACAGAGTCACTCCAGAAGAAGCAGATCTAGGTATGATTGCTTACTATTTCCAAAAGTACGAA TTTGATGGTTTGATCATCGTTGGTGGTTTCGAAGCTTTTGAATCTTTACATCAATTAGAGAGAG CAAGAGAAAGTTATCCAGCTTTCAGAATCCCAATGGTCTTGATACCAGCTACTTTGTCTAACAA TGTTCCAGGTACTGAATACTCTTTGGGTTCTGATACCGCTTTGAATGCTCTAATGGAATACTGT GATGTTGTTAAACAATCCGCTTCTTCAACCAGAGGTAGAGCCTTCGTTGTCGATTGTCAAGGT GGTAACTCAGGCTATTTGGCCACTTACGCTTCTTTGGCTGTTGGTGCTCAAGTCTCTTATGTC CCAGAAGAAGGTATTTCTTTGGAGCAATTGTCCGAGGATATTGAATACTTAGCTCAATCTTTTG AAAAGGCAGAAGGTAGAGGTAGATTTGGTAAATTGATTTTGAAGAGTACAAACGCTTCTAAGG CTTTATCAGCCACTAAATTGGCTGAAGTTATTACTGCTGAAGCCGATGGCAGATTTGACGCTA AGCCAGCTTATCCAGGTCATGTACAACAAGGTGGTTTGCCATCTCCAATTGATAGAACAAGAG CCACTAGAATGGCCATTAAAGCTGTCGGCTTCATCAAAGACAACCAAGCTGCCATTGCTGAAG CTCGTGCTGCCGAAGAAAACTTCAACGCTGATGACAAGACCATTTCTGACACTGCTGCTGTCG TTGGTGTTAAGGGTTCACATGTCGTTTACAACTCCATTAGACAATTGTATGACTATGAAACTGA AGTTTCCATGAGAATGCCAAAGGTCATTCACTGGCAAGCTACCAGACTCATTGCTGACCATTT GGTTGGAAGAAAGAGAGTTGATTAA
JML/85 AAGCTTTTAATTAATATAACGCTATGACGGTAGTTGAATGTTAAAAAC
JML/86 GAATTCTTAATTAAAGAGAACAAAGTATTTAACGCACATGTATAAATATTG
JML/87 GGATCCGCATGCGGCCGGCCAGCTTTTAATCAAGGAAGTAATAAATAAAGGAC
JML/88 GGATCCGAGCTCGCGGCCGCAGCTTTTGAACAATGAATTTTTTGTTCCTTTC
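The 5' tails of the four flanking-region primers carry the restriction sites used in the later assembly steps. The Python sketch below scans each primer for a small set of recognition sequences so that the site layout can be read off directly. It is a minimal illustration: the `SITES` table lists only standard recognition sequences for the enzymes of interest, the function name `sites_in` is hypothetical, and JML/88 is deliberately truncated before the poorly extracted 3' portion.

```python
# Recognition sequences (all palindromic, so one strand is enough to search).
SITES = {
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "PacI": "TTAATTAA",
    "NotI": "GCGGCCGC",
    "FseI": "GGCCGGCC",
    "SphI": "GCATGC",
    "SacI": "GAGCTC",
}

PRIMERS = {
    "JML/85": "AAGCTTTTAATTAATATAACGCTATGACGGTAGTTGAATGTTAAAAAC",
    "JML/86": "GAATTCTTAATTAAAGAGAACAAAGTATTTAACGCACATGTATAAATATTG",
    "JML/87": "GGATCCGCATGCGGCCGGCCAGCTTTTAATCAAGGAAGTAATAAATAAAGGAC",
    # Truncated on purpose: only the 5' tail and the first annealing bases.
    "JML/88": "GGATCCGAGCTCGCGGCCGCAGCTTTTGAACAATGAA",
}


def sites_in(primer: str) -> list[str]:
    """Return the names of the sites found in a primer, ordered 5' to 3'."""
    hits = []
    for name, site in SITES.items():
        pos = primer.find(site)
        if pos != -1:
            hits.append((pos, name))
    return [name for _, name in sorted(hits)]


for primer_name, seq in PRIMERS.items():
    print(primer_name, sites_in(seq))
```

Read this way, the outer primers (JML/85, JML/86) each place a PacI site just inside the HindIII or EcoRI cloning site, while the inner primers (JML/87, JML/88) place FseI and NotI sites next to the shared BamHI junction.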
The nucleic acid fragments were amplified using the following conditions: about 100 ng of the BY4742 genomic DNA was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers, and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. The reaction was cycled at 95°C for 10 minutes, followed by 30 rounds of 95°C for 20 seconds, 58°C for 30 seconds, and 72°C for 20 seconds. A final 5 minute extension reaction at 72°C was also included. Each amplified product was TOPO cloned into the pCR Blunt II TOPO vector (Life Technologies, Carlsbad, CA) according to the manufacturer's recommendations and the sequence of the construct was verified (GeneWiz, San Diego, CA). The resulting plasmids were named pBF648 (5' flanking region) and pBF649 (3' flanking region). A three fragment ligation was performed using about 100 ng of the 5' flanking region HindIII-BamHI fragment, about 100 ng of the 3' flanking region BamHI-EcoRI fragment and about 50 ng of pUC19 digested with HindIII and EcoRI in a 5 μl ligation reaction containing 1X ligation buffer and 1 U T4 DNA ligase (Fermentas). This reaction was incubated at room temperature for about one hour. About 2 μl of this reaction mix was used to transform competent DH5α cells (Zymo Research, Orange, CA), which were plated onto LB agar media containing 100 μg/ml ampicillin. The final construct was confirmed by restriction endonuclease digests and sequence verification (GeneWiz, San Diego, CA), resulting in plasmid pBF653.
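The mass amounts quoted for the three fragment ligation can be converted to approximate molar amounts to see the insert-to-vector ratio they imply. The Python sketch below is a rough estimate under stated assumptions: an average mass of 650 g/mol per base pair of double-stranded DNA, flanking fragments of about 300 bp as described above, and the full 2686 bp length of pUC19 (the digested backbone is slightly shorter). The function name `ng_to_pmol` is hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA (common approximation)


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a dsDNA mass in ng to pmol of molecules for a given length."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


# (mass in ng, approximate length in bp) for each ligation component.
FRAGMENTS = {
    "5' flanking region": (100, 300),
    "3' flanking region": (100, 300),
    "pUC19 backbone": (50, 2686),  # assumption: full-length pUC19
}

pmol = {name: ng_to_pmol(ng, bp) for name, (ng, bp) in FRAGMENTS.items()}
insert_to_vector = pmol["5' flanking region"] / pmol["pUC19 backbone"]

for name, amount in pmol.items():
    print(f"{name}: {amount:.3f} pmol")
print(f"each flank : vector molar ratio is about {insert_to_vector:.0f}:1")
```

Under these assumptions each flanking fragment is present in large molar excess over the vector, which is typical when small inserts are ligated by mass.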
Lys2 gene cloning
The Lys2 gene was isolated by PCR amplification from pRS317 (ATCC Cat. No. 77157; Sikorski RS, Boeke JD. Methods Enzymol. 194: 302-318, 1991. PubMed: 2005795) using primers JML/93 and JML/94. PCR amplification was performed as follows: about 25 ng of the pRS317 plasmid DNA was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers, and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. The reactions were cycled at 95°C for 10 minutes, followed by 10 rounds of 95°C for 20 seconds, 55°C for 30 seconds, and 72°C for 2 minutes, followed by 25 more rounds of 95°C for 20 seconds, 62°C for 30 seconds, and 72°C for 2 minutes. A final 5 minute extension reaction at 72°C was also included. The amplified product was TOPO cloned into the pCR Blunt II TOPO vector as described herein, resulting in plasmid pBF656. The nucleotide sequence of the Lys2 gene and the primers used for amplification of the Lys2 gene are given below.
JML/93 GCGGCCGCAGCTTCGCAAGTATTCATTTTAGACCCATG
JML/94 GGCCGGCCGGTACCAATTCCACTTGCAATTACATAAAAAATTCC
Lys2 (from genomic sequence database), SEQ ID NO: 122
ATGACTAACGAAAAGGTCTGGATAGAGAAGTTGGATAATCCAACTCTTTCAGTGTTACCACAT GACTTTTTACGCCCACAACAAGAACCTTATACGAAACAAGCTACATATTCGTTACAGCTACCTC AGCTCGATGTGCCTCATGATAG I I I I I CTAACAAATACGCTGTCGCTTTGAGTGTATGGGCTG CATTGATATATAGAGTAACCGGTGACGATGATATTGTTCTTTATATTGCGAATAACAAAATCTTA AGATTCAATATTCAACCAACGTGGTCATTTAATGAGCTGTATTCTACAATTAACAATGAGTTGAA CAAGCTCAATTCTATTGAGGCCAATTTTTCCTTTGACGAGCTAGCTGAAAAAATTCAAAGTTGC CAAGATCTGGAAAGGACCCCTCAGTTGTTCCGTTTGGCCTTTTTGGAAAACCAAGATTTCAAAT TAGACGAGTTCAAGCATCATTTAGTGGACTTTGCTTTGAATTTGGATACCAGTAATAATGCGCA TGTTTTGAACTTAATTTATAACAGCTTACTGTATTCGAATGAAAGAGTAACCATTGTTGCGGAC CAATTTACTCAATATTTGACTGCTGCGCTAAGCGATCCATCCAATTGCATAACTAAAATCTCTC TGATCACCGCATCATCCAAGGATAGTTTACCTGATCCAACTAAGAACTTGGGCTGGTGCGATT TCGTGGGGTGTATTCACGACATTTTCCAGGACAATGCTGAAGCCTTCCCAGAGAGAACCTGTG TTGTGGAGACTCCAACACTAAATTCCGACAAGTCCCGTTCTTTCACTTATCGCGACATCAACC GCACTTCTAACATAGTTGCCCATTATTTGATTAAAACAGGTATCAAAAGAGGTGATGTAGTGAT GATCTATTCTTCTAGGGGTGTGGATTTGATGGTATGTGTGATGGGTGTCTTGAAAGCCGGCGC AACCTTTTCAGTTATCGACCCTGCATATCCCCCAGCCAGACAAACCATTTACTTAGGTGTTGCT AAACCACGTGGGTTGATTGTTATTAGAGCTGCTGGACAATTGGATCAACTAGTAGAAGATTAC ATCAATGATGAATTGGAGATTGTTTCAAGAATCAATTCCATCGCTATTCAAGAAAATGGTACCA TTGAAGGTGGCAAATTGGACAATGGCGAGGATGTTTTGGCTCCATATGATCACTACAAAGACA CCAGAACAGGTGTTGTAGTTGGACCAGATTCCAACCCAACCCTATCTTTCACATCTGGTTCCG AAGGTATTCCTAAGGGTGTTCTTGGTAGACA I I I I I CCTTGGCTTATTATTTCAATTGGATGTC CAAAAGGTTCAACTTAACAGAAAATGATAAATTCACAATGCTGAGCGGTATTGCACATGATCCA ATTCAAAGAGATATGTTTACACCATTA I I I I I AGGTGCCCAATTGTATGTCCCTACTCAAGATGA TATTGGTACACCGGGCCGTTTAGCGGAATGGATGAGTAAGTATGGTTGCACAGTTACCCATTT AACACCTGCCATGGGTCAATTACTTACTGCCCAAGCTACTACACCATTCCCTAAGTTACATCAT GCGTTCTTTGTGGGTGACATTTTAACAAAACGTGATTGTCTGAGGTTACAAACCTTGGCAGAA AATTGCCGTATTGTT TATGTACGGTACCACTGAAACACAGCGTGCAGTTTCTTATTTCGAAG TTAAATCAAAAAATGACGATCCAAACTTTTTGAAAAAATTGAAAGATGTCATGCCTGCTGGTAA AGGTATGTTGAACGTTCAGCTACTAGTTGTTAACAGGAACGATCGTACTCAAATATGTGGTATT GGCGAAATAGGTGAGATTTATGTTCGTGCAGGTGGTTTGGCCGAAGGTTATAGAGGATTACCA 
GAATTGAATAAAGAAAAATTTGTGAACAACTGGTTTGTTGAAAAAGATCACTGGAATTATTTGG ATAAGGATAATGGTGAACCTTGGAGACAATTCTGGTTAGGTCCAAGAGATAGATTGTACAGAA CGGGTGATTTAGGTCGTTATCTACCAAACGGTGACTGTGAATGTTGCGGTAGGGCTGATGATC AAGTTAAAATTCGTGGGTTCAGAATCGAATTAGGAGAAATAGATACGCACATTTCCCAACATCC ATTGGTAAGAGAAAACATTACTTTAGTTCGCAAAAATGCCGACAATGAGCCAACATTGATCACA TTTATGGTCCCAAGATTTGACAAGCCAGATGACTTGTCTAAGTTCCAAAGTGATGTTCCAAAGG AGGTTGAAACTGACCCTATAGTTAAGGGCTTAATCGGTTACCATCTTTTATCCAAGGACATCAG GACTTTCTTAAAGAAAAGATTGGCTAGCTATGCTATGCCTTCCTTGATTGTGGTTATGGATAAA CTACCATTGAATCCAAATGGTAAAGTTGATAAGCCTAAACTTCAATTCCCAACTCCCAAGCAAT TAAATTTGGTAGCTGAAAATACAGTTTCTGAAACTGACGACTCTCAGTTTACCAATGTTGAGCG CGAGGTTAGAGACTTATGGTTAAGTATATTACCTACCAAGCCAGCATCTGTATCACCAGATGAT TCG I I I I I CGATTTAGGTGGTCATTCTATCTTGGCTACCAAAATGATTTTTACCTTAAAGAAAAA GCTGCAAGTTGATTTACCATTGGGCACAATTTTCAAGTATCCAACGATAAAGGCCTTTGCCGC GGAAATTGACAGAATTAAATCATCGGGTGGATCATCTCAAGGTGAGGTCGTCGAAAATGTCAC TGCAAATTATGCGGAAGACGCCAAGAAATTGGTTGAGACGCTACCAAGTTCGTACCCCTCTCG AGAATATTTTGTTGAACCTAATAGTGCCGAAGGAAAAACAACAATTAATGTGTTTGTTACCGGT GTCACAGGATTTCTGGGCTCCTACATCCTTGCAGATTTGTTAGGACGTTCTCCAAAGAACTAC AGTTTCAAAGTGTTTGCCCACGTCAGGGCCAAGGATGAAGAAGCTGCATTTGCAAGATTACAA AAGGCAGGTATCACCTATGGTACTTGGAACGAAAAATTTGCCTCAAATATTAAAGTTGTATTAG GCGATTTATCTAAAAGCCAATTTGGTCTTTCAGATGAGAAGTGGATGGATTTGGCAAACACAG TTGATATAATTATCCATAATGGTGCGTTAGTTCACTGGGTTTATCCATATGCCAAATTGAGGGA TCCAAATGTTATTTCAACTATCAATGTTATGAGCTTAGCCGCCGTCGGCAAGCCAAAGTTCTTT GACTTTGTTTCCTCCACTTCTACTCTTGACACTGAATACTACTTTAATTTGTCAGATAAACTTGT TAGCGAAGGGAAGCCAGGCATTTTAGAATCAGACGATTTAATGAACTCTGCAAGCGGGCTCA CTGGTGGATATGGTCAGTCCAAATGGGCTGCTGAGTACATCATTAGACGTGCAGGTGAAAGG GGCCTACGTGGGTGTATTGTCAGACCAGGTTACGTAACAGGTGCCTCTGCCAATGGTTCTTCA AACACAGATGATTTCTTATTGAGA I I I I I GAAAGGTTCAGTCCAATTAGGTAAGATTCCAGATAT CGAAAATTCCGTGAATATGGTTCCAGTAGATCATGTTGCTCGTGTTGTTGTTGCTACGTCTTTG AATCCTCCCAAAGAAAATGAATTGGCCGTTGCTCAAGTAACGGGTCACCCAAGAATATTATTC AAAGACTACTTGTATACTTTACACGATTATGGTTACGATGTCGAAATCGAAAGCTATTCTAAAT 
GGAAGAAATCATTGGAGGCGTCTGTTATTGACAGGAATGAAGAAAATGCGTTGTATCCTTTGC TACACATGGTCTTAGACAACTTACCTGAAAGTACCAAAGCTCCGGAACTAGACGATAGGAACG CCGTGGCATCTTTAAAGAAAGACACCGCATGGACAGGTGTTGATTGGTCTAATGGAATAGGTG TTACTCCAGAAGAGGTTGGTATATATATTGCATTTTTAAACAAGGTTGGATTTTTACCTCCACCA ACTCATAATGACAAACTTCCACTGCCAAGTATAGAACTAACTCAAGCGCAAATAAGTCTAGTTG CTTCAGGTGCTGGTGCTCGTGGAAGCTCCGCAGCAGCTTAA
The knockout cassette was fully assembled by cloning the NotI-FseI LYS2 fragment from plasmid pBF656 into the NotI-FseI sites located between the 5' and 3' flanking PFK2 regions in plasmid pBF653. About 50 ng of plasmid pBF653 digested with NotI and FseI was ligated to about 100 ng of the NotI-FseI LYS2 fragment from plasmid pBF656 in a 5 μl reaction containing 1X ligation buffer and 1 U T4 DNA ligase (Fermentas) for about 1 hour at room temperature. About 2 μl of this reaction was used to transform competent DH5α (Zymo Research, Orange, CA), which were plated on media containing 100 μg/ml ampicillin. The structure of the final plasmid, pBF745, was confirmed by restriction enzyme digests. The approximately 5 kbp PacI fragment containing the LYS2 cassette and PFK2 flanking regions was gel extracted using the Zymoclean Gel DNA Recovery Kit (Zymo Research, Orange, CA) according to the manufacturer's conditions. Strain BF1477 was transformed with the approximately 5 kbp PacI fragment using the method described above (LiOAc/PEG method), generating strain BF1411.
Strain BF1411 has the ability to grow on galactose as a carbon source, but cannot grow on glucose. Various combinations of the EDD and EDA constructs can be expressed in this strain and monitored for growth on glucose. Strains which show growth on glucose (or the highest growth rate on glucose) can be further characterized to determine which combination of EDD and EDA genes is present. Using the strain and method described herein, libraries of EDD and EDA genes can be screened for improved activities and activity combinations in a host organism.
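The screening idea described above, expressing each EDD/EDA pairing in the glucose-negative strain and keeping the pairings that restore growth, can be sketched in Python as follows. This is only an illustration of the bookkeeping: the source lists are assumptions drawn from the abbreviations used in this example, the function names are hypothetical, and any growth rates passed in would have to come from actual measurements.

```python
from itertools import product

# Assumed gene sources, using the abbreviations from the homology tables.
EDD_SOURCES = ["PA01", "E.C.", "S.O.", "G.O.", "R.F."]
EDA_SOURCES = ["PA01", "E.C.", "S.O.", "G.O."]


def all_combinations() -> list[tuple[str, str]]:
    """Every (edd source, eda source) pairing that could be tested."""
    return list(product(EDD_SOURCES, EDA_SOURCES))


def rank_by_growth(growth_rates: dict[tuple[str, str], float]) -> list[tuple[tuple[str, str], float]]:
    """Keep pairings that grow on glucose (rate > 0), fastest first.

    growth_rates maps (edd, eda) to a measured growth rate on glucose;
    the units are whatever the experimenter records.
    """
    growing = [(combo, rate) for combo, rate in growth_rates.items() if rate > 0]
    return sorted(growing, key=lambda item: item[1], reverse=True)


print(len(all_combinations()), "EDD/EDA pairings to test")
```

Because the host cannot grow on glucose without a functional EDD/EDA pair, a zero growth rate acts as a simple pass/fail filter before pairings are ranked.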
Example 24: Single plasmid system for industrial yeast
A single plasmid system expressing EDD and EDA for industrial yeast was constructed as follows: The approximately 2800 bp fragment containing the GPD1 promoter, EDD-PA01 gene and CYC1 terminator from plasmid pBF291 (p426GPD with EDD-PA01) was PCR amplified using primers KAS/5'-BamHI-Pgpd and KAS/3'-Ndel-CYCt, described below. About 25 ng of the plasmid DNA was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers, and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. The reaction was cycled at 95°C for 10 minutes, followed by 30 rounds of 95°C for 20 seconds, 55°C for 30 seconds, and 72°C for 45 seconds. A final 5 minute extension reaction at 72°C was also included. The amplified product was TOPO cloned into the pCR Blunt II TOPO vector, as described herein, and the final plasmid was sequence verified and designated pBF475.
KAS/5'-BamHI-Pgpd GGATCCgtttatcattatcaatactcgccatttcaaag
KAS/3'-Ndel-CYCt CATATGttgggtaccggccgcaaattaaagccttcgagcg
An approximately 1500 bp KANMX4 cassette was PCR amplified from plasmid pBF413 HO-poly-KanMX4-HO (ATCC Cat. No. 87804) using primers KAS/5'-Bam_Ndel-KANMX4 and KAS/3'-Sal_Nhel-KANMX4, described below.
KAS/5'-Bam_Ndel-KANMX4 GGATTCagtcagatCATATGggtacccccgggttaattaaggcgcgccagatctg
KAS/3'-Sal_Nhel-KANMX4 GTCGACaggcctactgtacgGCTAGCgaattcgagctcgttttcgacactggatggcggc
About 25 ng of plasmid pBF413 HO-poly-KanMX4-HO DNA was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. The reaction was cycled at 95°C for 10 minutes, followed by 30 rounds of 95°C for 20 seconds, 55°C for 30 seconds, and 72°C for 30 seconds. A final 5 minute extension reaction at 72°C was also included. The amplified product was TOPO cloned into the pCR Blunt II TOPO vector, as described herein. The resulting plasmid was sequence verified and designated pBF465.
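The cycling programs in this example all share one shape: an initial denaturation, one or more blocks of repeated three-step cycles, and a final extension. The Python sketch below sums the programmed hold times for such a program. It is a minimal illustration; the function name `program_seconds` is hypothetical, and ramp times between temperatures are ignored, so real run times are longer.

```python
def program_seconds(initial_s: int, stages: list[tuple[int, tuple[int, int, int]]], final_s: int) -> int:
    """Total programmed hold time in seconds.

    stages is a list of (cycle count, (denature_s, anneal_s, extend_s)).
    Ramp times between steps are not included.
    """
    total = initial_s
    for cycles, steps in stages:
        total += cycles * sum(steps)
    return total + final_s


# KANMX4 cassette: 95 C 10 min; 30 x (20 s, 30 s, 30 s); 72 C 5 min.
kanmx4_s = program_seconds(600, [(30, (20, 30, 30))], 300)

# Lys2 gene: 95 C 10 min; 10 x (20 s, 30 s, 2 min); 25 x (20 s, 30 s, 2 min); 72 C 5 min.
lys2_s = program_seconds(600, [(10, (20, 30, 120)), (25, (20, 30, 120))], 300)

print(f"KANMX4 program: {kanmx4_s / 60:.0f} min of hold time")
print(f"Lys2 program: {lys2_s / 60:.0f} min of hold time")
```

Most of the difference between the two programs comes from the 2 minute extension used for the longer Lys2 product.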
An approximately 225 bp ADH1 terminator was PCR amplified from the genome of BY4742 using primers KAS/5'-Xba-Xhol-ADHt and KAS/3'-Stul-ADH5. The sequences of primers KAS/5'-Xba-Xhol-ADHt and KAS/3'-Stul-ADH5 are given below.
KAS/5'-Xba-Xhol-ADHt tctagaCTCGAGtaataagcgaatttcttatgatttatg
KAS/3'-Stul-ADH5 aagcttAGGCCTggagcgatttgcaggcatttgc
About 100 ng of genomic DNA from BY4742 was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. The reaction was cycled at 95°C for 10 minutes, followed by 30 rounds of 95°C for 20 seconds, 55°C for 30 seconds, and 72°C for 15 seconds. A final 5 minute extension reaction at 72°C was also included. The amplified product was TOPO cloned into the pCR Blunt II TOPO vector according to the manufacturer's recommendations and sequence verified. The resulting plasmid was designated pBF437.
The TEF2 promoter was PCR amplified from the genome of BY4742 using primers KAS/5'-Bam-Nhel-Ptef and KAS/3'-Xbal-Spel-Ptef, described below.
KAS/5'-Bam-Nhel-Ptef GGATCCgctagcACCGCGAATCCTTACATCACACCC
KAS/3'-Xbal-Spel-Ptef tctagaCTCGAGtaataagcgaatttcttatgatttatg
About 100 ng of genomic DNA from BY4742 was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers, and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. This was cycled at 95°C for 10 minutes, followed by 30 rounds of 95°C for 20 seconds, 55°C for 30 seconds, and 72°C for 15 seconds. A final 5 minute extension reaction at 72°C was also included. The amplified product was TOPO cloned into the pCR Blunt II TOPO vector (Life Technologies, Carlsbad, CA) according to the manufacturer's recommendations and sequence verified (GeneWiz, San Diego, CA). The resulting plasmid was called pBF440.
The EDA gene cassettes were constructed as follows: First, the TEF2 promoter was excised from plasmid pBF440 with BamHI and XbaI and cloned into the BamHI and XbaI sites of pUC19, creating plasmid pBF480. Plasmid pBF480 was then digested with XbaI and HindIII and ligated to the XbaI-HindIII fragment from plasmid pBF437 containing the ADH1 terminator, creating plasmid pBF521. Plasmid pBF521 was then digested with SpeI and XhoI and ligated to a SpeI-XhoI fragment containing either the PA01 eda gene from plasmid pBF292 or the E. coli eda gene from plasmid pBF268. The two plasmids generated, depending on the eda gene chosen, were designated pBF523 (e.g., containing the PA01-eda) and pBF568 (e.g., containing the E. coli-eda), respectively. The approximately 1386 bp TEF-EDA-ADHt cassette from either plasmid pBF523 or pBF568 was then excised using the NheI-StuI sites and gel extracted.
The final vector was generated by first altering the NdeI site in pUC19 using the mutagenesis primers described below.
KAS/SDM-NdeI-pUC18-5 gattgtactgagagtgcacaatatgcggtgtgaaatacc
KAS/SDM-NdeI-pUC18-3 ggtatttcacaccgcatattgtgcactctcagtacaatc
About 50ng of pUC19 plasmid DNA was added to 1 X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol SDM-specific primers and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50μl reaction mix. The reaction was cycled at 95°C for 10 minutes, followed by 15 rounds of 95°C for 15 seconds, 55°C for 40 seconds, and 72°C for 3 minutes. A final 10 minute extension reaction at 72°C was also included. The PCR reaction mixture was then digested with 30U of DpnI for about 2 hours and 5μl of the digested PCR reaction mixture was used to transform competent DH5α (Zymo Research, Orange, CA) and plated onto LB plates containing 100μg/ml ampicillin. The structure of the final plasmid, pBF421, was confirmed by restriction digests.
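Two properties of the mutagenesis primer pair can be verified directly from the printed sequences: the primers should be exact reverse complements of one another, and neither should still contain the NdeI recognition sequence (CATATG). The sketch below checks both; the variable and function names are illustrative only.

```python
# Complement table covering both lower- and upper-case bases.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Mutagenesis primers as printed in the text.
SDM_5 = "gattgtactgagagtgcacaatatgcggtgtgaaatacc"
SDM_3 = "ggtatttcacaccgcatattgtgcactctcagtacaatc"
NDEI_SITE = "catatg"  # palindromic, so one orientation is enough

primers_are_complementary = reverse_complement(SDM_5) == SDM_3
site_absent = NDEI_SITE not in SDM_5 and NDEI_SITE not in SDM_3

print("complementary pair:", primers_are_complementary)
print("NdeI site absent from primers:", site_absent)
```

Both checks pass for the sequences given, which is consistent with a primer pair designed to remove the NdeI site.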
An approximately 1359 bp EcoRI fragment containing the 2μ yeast origin cassette was cloned into the EcoRI site of plasmid pBF421 in a 10μl ligation reaction mixture containing 1 X ligation buffer, 50ng of EcoRI-digested pBF421, 80ng of EcoRI-digested 2μ cassette, and 1 U T4 DNA ligase (Fermentas). The reaction was incubated at room temperature for about 2 hours and 3μl of this was used to transform competent DH5α (Zymo Research, Orange, CA). The structure of the resultant plasmid, pBF429, was confirmed by restriction enzyme digests. Plasmid pBF429 was then digested with BamHI and SalI and ligated to the BamHI-SalI KANMX4 cassette described above. The resultant plasmid, designated pBF515, was digested with BamHI and NdeI and ligated to the BamHI-NdeI fragment containing the 2802bp GPD-EDD-CYCt fragment from pBF475. The resulting plasmid, designated pBF522, was digested with NheI-StuI and was ligated to the 1386bp NheI-StuI TEF-EDA-ADHt fragment from plasmids pBF523 or pBF568, creating final plasmids pBF524 and pBF612.
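The insert-to-vector molar ratio implied by the first ligation (80ng of the 1359 bp 2μ cassette with 50ng of EcoRI-digested pBF421) can be estimated as follows. This is a rough sketch under one stated assumption: pBF421 is taken to have the same length as pUC19 (2686 bp), since it was derived from pUC19 by a primer-directed base change; the text does not state the plasmid size.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_bp, vector_ng, vector_bp):
    """Molar ratio of insert to vector.

    For double-stranded DNA, moles are proportional to mass / length,
    so the per-base-pair mass constant cancels out of the ratio.
    """
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


# Assumption: pBF421 has the same length as its parent, pUC19 (2686 bp).
ASSUMED_PBF421_BP = 2686

ratio = insert_to_vector_molar_ratio(
    insert_ng=80, insert_bp=1359, vector_ng=50, vector_bp=ASSUMED_PBF421_BP
)
print(f"insert:vector molar ratio ~ {ratio:.2f}:1")
```

Under that assumption the reaction contains roughly a threefold molar excess of insert over vector.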
Expression levels of each of the single plasmid eda/edd expression system vectors were assayed and compared against the original eda/edd two plasmid expression system vectors. The results, presented in FIG. 19, graphically illustrate edd/eda coupled assay kinetics for the single and two plasmid systems. The kinetics graphs for both expression systems show substantially similar enzyme kinetics over the majority of the time course.
Example 25: Chimeric Xylose Isomerase Activities
Chimeric xylose isomerase nucleotide sequences and functional activities were generated that included an N-terminal portion of xylose isomerase from one donor organism and a C-terminal portion of xylose isomerase from a different donor organism. In some embodiments the second donor organism was a Ruminococcus bacterium. Given below are oligonucleotides utilized to isolate and modify a nucleotide sequence encoding a xylose isomerase activity. Also given below are non-limiting examples of native and chimeric nucleotide and amino acid sequences encoding xylose isomerase activities. The native Ruminococcus flavefaciens nucleotide sequence utilized to generate chimeric xylose isomerase activities is given below.
ATGGAATTTTTCAGCAATATCGGTAAAATTCAGTATCAGGGACCAAAAAGTACTGATCCTCTCT CATTTAAGTACTATAACCCTGAAGAAGTCATCAACGGAAAGACAATGCGCGAGCATCTGAAGT TCGCTCTTTCATGGTGGCACACAATGGGCGGCGACGGAACAGATATGTTCGGCTGCGGCACA ACAGACAAGACCTGGGGACAGTCCGATCCCGCTGCAAGAGCAAAGGCTAAGGTTGACGCAG CATTCGAGATCATGGATAAGCTCTCCATTGACTACTATTGTTTCCACGATCGCGATCTTTCTCC CGAGTATGGCAGCCTCAAGGCTACCAACGATCAGCTTGACATAGTTACAGACTATATCAAGGA GAAGCAGGGCGACAAGTTCAAGTGCCTCTGGGGTACAGCAAAGTGCTTCGATCATCCAAGAT TCATGCACGGTGCAGGTACATCTCCTTCTGCTGATGTATTCGCTTTCTCAGCTGCTCAGATCA AGAAGGCTCTGGAGTCAACAGTAAAGCTCGGCGGTAACGGTTACGTTTTCTGGGGCGGACGT GAAGGCTATGAGACACTTCTTAATACAAATATGGGACTCGAACTCGACAATATGGCTCGTCTT ATGAAGATGGCTGTTGAGTATGGACGTTCGATCGGCTTCAAGGGCGACTTCTATATCGAGCC CAAGCCCAAGGAGCCCACAAAGCATCAGTACGATTTCGATACAGCTACTGTTCTGGGATTCCT CAGAAAGTACGGTCTCGATAAGGATTTCAAGATGAATATCGAAGCTAACCACGCTACACTTGC TCAGCATACATTCCAGCATGAGCTCCGTGTTGCAAGAGACAATGGTGTGTTCGGTTCTATCGA CGCAAACCAGGGCGACGTTCTTCTTGGATGGGATACAGACCAGTTCCCCACAAATATCTACGA TACAACAATGTGTATGTATGAAGTTATCAAGGCAGGCGGCTTCACAAACGGCGGTCTCAACTT CGACGCTAAGGCACGCAGAGGGAGCTTCACTCCCGAGGATATCTTCTACAGCTATATCGCAG GTATGGATGCATTTGCTCTGGGCTTCAGAGCTGCTCTCAAGCTTATCGAAGACGGACGTATCG ACAAGTTCGTTGCTGACAGATACGCTTCATGGAATACCGGTATCGGTGCAGACATAATCGCAG GTAAGGCAGATTTCGCATCTCTTGAAAAGTATGCTCTTGAAAAGGGCGAGGTTACAGCTTCAC TCTCAAGCGGCAGACAGGAAATGCTGGAGTCTATCGTAAATAACGTTCTTTTCAGTCTGTAA
The first 10 amino acids are underlined, and amino acids 11-15 are in bold font. The native sequence was originally cloned into a pUC57 vector, called pBF202, which was utilized as the PCR template for the 5' chimera constructs. The oligonucleotides used to generate the 5' replacement nucleotide sequences (e.g., oligonucleotides used to replace the first 10 amino acids of the Ruminococcus xylose isomerase protein) are given in the table below. In some embodiments, greater or fewer than 10 amino acids were replaced to maintain proper amino acid alignment between xylose isomerase activities.
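Where the underline and bold formatting is not available, the residues referred to (amino acids 1-10 and 11-15 of the native protein) can be recovered by translating the first 45 nucleotides of the native Ruminococcus flavefaciens sequence given above. The following is a minimal sketch using the standard genetic code; the helper name `translate` is illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA sequence in frame 1; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 45 nucleotides of the native sequence given above.
NATIVE_5PRIME = "ATGGAATTTTTCAGCAATATCGGTAAAATTCAGTATCAGGGACCA"

protein = translate(NATIVE_5PRIME)
first_ten, next_five = protein[:10], protein[10:15]
print("amino acids 1-10: ", first_ten)
print("amino acids 11-15:", next_five)
```

For this sequence the first ten residues are MEFFSNIGKI and residues 11-15 are QYQGP.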
Figure imgf000243_0001
KAS/3-XI-RF-Native-HISb ctcgagttagtgatggtggtggtgatgcagactgaaaagaacgttatttacg
In the table above the following abbreviations are used: Cp-Clostridium phytofermentans; O-Orpinomyces; Cth-Clostridium thermohydrosulfuricum; Bth-Bacteroides thetaiotaomicron; Bst-Bacillus stearothermophilus; Bun-Bacillus uniformis; Cce-Clostridium cellulolyticum; RF-Ruminococcus flavefaciens FD1; 18P10-Ruminococcus 18P13; BV10-Clostridiales genomosp BVAB3 str UPII9-5; Re-E. coli.
All oligonucleotides set forth above were purchased from Integrated DNA Technologies ("IDT", Coralville, IA). The oligonucleotides were designed to incorporate a SpeI restriction endonuclease cleavage site upstream and an XhoI restriction endonuclease cleavage site downstream of the new XI gene constructs, to allow cloning into the yeast expression vector p426GPD (ATCC accession number 87361), as described herein. In addition to incorporating restriction endonuclease cleavage sites, the forward oligonucleotides were designed to incorporate six consecutive A nucleotides immediately upstream of the ATG initiation codon.
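The primer layout described above can be illustrated by rebuilding the His-tagged reverse oligonucleotide from its parts: the 3' end of the native coding sequence, six histidine codons, a stop codon, and the XhoI site, all reverse-complemented. This is a sketch for illustration only. The forward primer built here is hypothetical (the actual forward oligonucleotides are listed in the table and are not reproduced in the code), and the function names are not from the source.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper case)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


SPEI = "ACTAGT"
XHOI = "CTCGAG"
STOP = "TAA"
HIS6 = "CATCACCACCACCATCAC"  # six histidine codons
# Last 25 nucleotides of the native coding sequence, before the stop codon.
GENE_3PRIME = "CGTAAATAACGTTCTTTTCAGTCTG"


def reverse_primer(gene_3prime, tag="", stop=STOP, site=XHOI):
    """Reverse primer: site + stop + tag + gene end, read on the antisense strand."""
    sense = gene_3prime + tag + stop + site
    return reverse_complement(sense)


def forward_primer(gene_5prime, site=SPEI, spacer="AAAAAA"):
    """Hypothetical forward primer: SpeI site, six A nucleotides, then the ATG."""
    return site + spacer + gene_5prime.upper()


print(reverse_primer(GENE_3PRIME, tag=HIS6).lower())
print(forward_primer("ATGGAATTTTTCAGCAATATCGG"))
```

The reverse primer assembled this way matches the KAS/3-XI-RF-Native-HISb sequence as printed.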
PCR reactions to amplify the xylose isomerase genes were performed using about 40ng of the pBF202 plasmid DNA (containing the native XI-R gene in pUC57). The reactions were performed as described previously herein, using the oligonucleotide primers shown in the table above. Gene-specific primers for the first and second rounds of PCR amplification were added at a final concentration of 0.3 μmol. The approximately 1350bp products were TOPO cloned into the pCR Blunt II TOPO vector (Life Technologies, Carlsbad, CA) according to the manufacturer's recommendations and sequence confirmed (GeneWiz, La Jolla, CA). For the 5' E. coli 10 amino acid extension, the PCR reactions were also performed in two steps, with the following exceptions. In the first reaction, the nucleotides corresponding to amino acids 6-10 from the E. coli XI were added using the 5' oligonucleotide KAS/XI-Re6-10 (see table above) and the 3' oligonucleotide KAS/3-XI-RF-NATIVE (see table above). Once the PCR product was confirmed by agarose gel electrophoresis, the nucleic acid was purified using the Zymo Research DNA Clean & Concentrator-25 kit (Zymo Research, Orange, CA). In the second PCR reaction, about 40ng of this cleaned PCR product was used as template as outlined above, this time with the 5' oligonucleotide KAS/XI-Re1-10 and either KAS/3-XI-RF-NATIVE or KAS/3-XI-RF-Native-HISb, which generated the XI-R with 5' XI-E. coli extensions. These products were also TOPO cloned as detailed above and the sequence confirmed by sequence analysis. Following sequence confirmation, the approximately 1350bp SpeI-XhoI fragments were cloned into the corresponding restriction sites in the p425GPD vectors, as described above.
Chimeric Xylose Isomerase activities with a 5' 150 amino acid replacement
Chimeric xylose isomerase proteins were also generated that included greater than 10 or 15 5' amino acid replacements, in some embodiments. Described herein are non-limiting examples of chimeric xylose isomerase activities with a replacement of approximately 150 5' amino acids from a different donor organism. The first 450 nucleotides of the native xylose isomerase sequence given above can be replaced with any of the sequences given in the table below to create chimeric xylose isomerase activities with approximately the 150 5' amino acids donated by a different organism than Ruminococcus flavefaciens.
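The 5' replacement described above amounts to swapping the first 450 nucleotides of the native coding sequence for a donor fragment while keeping the reading frame. The sketch below shows the bookkeeping on short placeholder sequences; the function name `make_chimera` and the toy sequences are hypothetical and are not sequences from this document.

```python
def make_chimera(donor_5prime, native_cds, replaced_nt=450):
    """Join a donor 5' fragment to the native sequence downstream of replaced_nt.

    Both the donor fragment and the replaced region must be whole codons,
    otherwise the native 3' portion would be read out of frame.
    """
    donor_5prime = donor_5prime.upper()
    native_cds = native_cds.upper()
    if replaced_nt % 3 != 0 or len(donor_5prime) % 3 != 0:
        raise ValueError("donor fragment and replaced region must be whole codons")
    if not donor_5prime.startswith("ATG"):
        raise ValueError("donor fragment should begin with an ATG start codon")
    if replaced_nt > len(native_cds):
        raise ValueError("replaced region is longer than the native sequence")
    return donor_5prime + native_cds[replaced_nt:]


# Toy placeholders only: a 24 nt "native" gene and a 15 nt "donor" 5' fragment.
toy_native = "ATG" + "GGG" * 6 + "TAA"
toy_donor = "ATG" + "AAA" * 4

chimera = make_chimera(toy_donor, toy_native, replaced_nt=12)
print(chimera, len(chimera))
```

With real sequences, `replaced_nt` would be left at 450 and the donor fragment would be one of the approximately 450-nucleotide sequences listed in the table; donor fragments need not be exactly 450 nucleotides long as long as they contain whole codons.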
Figure imgf000245_0001
Clostridium phytofermentans (SEQ ID NO: 95)
ATGAAAAATTACTTTCCAAATGTTCCAGAAGTAAAATACGAAGG
CCCAAATTCAACGAATCCATTTGCTTTTAAATATTATGACGCAA
ATAAAGTTGTAGCGGGTAAAACAATGAAAGAGCACTGTCGTTT
TGCATTATCTTGGTGGCATACTCTTTGTGCAGGTGGTGCTGAT
CCATTCGGTGTAACAACTATGGATAGAACCTACGGAAATATCA
CAGATCCAATGGAACTTGCTAAGGCAAAAGTTGACGCTGGTTT
CGAATTAATGACTAAATTAGGAATTGAATTCTTCTGTTTCCATG
ACGCAGATATTGCTCCAGAAGGTGATACTTTTGAAGAGTCAAA
GAAGAATCTTTTTGAAATCGTTGATTACATCAAAGAGAAGATGG
ATCAGACTGGTATCAAGTTATTATGGGGTACTGCTAATAACTTT
AGTCATCCAAGATTTATGCAT
Orpinomyces (SEQ ID NO: 96)
ATGACTAAGGAATATTTCCCAACTATCGGCAAGATTAGATTCGA
AGGTAAGGATTCTAAGAATCCAATGGCCTTCCACTACTATGAT
GCTGAAAAGGAAGTCATGGGTAAGAAAATGAAGGATTGGTTAC
GTTTCGCCATGGCCTGGTGGCACACTCTTTGCGCCGATGGTG
CTGACCAATTCGGTGTTGGTACTAAGTCTTTCCCATGGAATGA
AGGTACTGACCCAATTGCTATTGCCAAGCAAAAGGTTGATGCT
GGTTTTGAAATCATGACCAAGCTTGGTATTGAACACTACTGTTT
CCACGATGTTGATCTTGTTTCTGAAGGTAACTCTATTGAAGAAT
ACGAATCCAACCTCAAGCAAGTTGTTGCTTACCTTAAGCAAAA
GCAACAAGAAACTGGTATTAAGCTTCTCTGGAGTACTGCCAAT
GTTTTCGGTAACCCACGTTACATGAAC
Clostridium thermohydrosulfuricum (SEQ ID NO: 97)
ATGGAATACTTCAAAAATGTACCACAAATAAAATATGAAGGACC
AAAATCAAACAATCCATATGCATTTAAATTTTACAATCCAGATGA
AATAATAGACGGAAAACCTTTAAAAGAACACTTGCGTTTTTCAG
TAGCGTACTGGCACACATTTACAGCCAATGGGACAGATCCATT
TGGAGCACCCACAATGCAAAGGCCATGGGACCATTTTACTGAC
CCTATGGATATTGCCAAAGCGAGAGTAGAAGCAGCCTTTGAAC
TATTTGAAAAACTCGACGTACCATTTTTCTGTTTCCATGACAGA
GATATAGCTCCGGAAGGAGAGACATTAAGGGAGACGAACAAA
AATTTAGATACAATAGTTGCAATGATAAAAGACTACTTAAAGAC
GAGCAAAACAAAAGTATTATGGGGCACAGCGAACCTTTTTTCA
AATCCGAGATTTGTACAT
Bacteroides thetaiotaomicron (SEQ ID NO: 98)
ATGGCAACAAAAGAATTTTTTCCGGGAATTGAAAAGATTAAATT
TGAAGGTAAAGATAGTAAGAACCCGATGGCATTCCGTTATTAC
GATGCAGAGAAGGTGATTAATGGTAAAAAGATGAAGGATTGGC
TGAGATTCGCTATGGCATGGTGGCACACATTGTGCGCTGAAG
GTGGTGATCAGTTCGGTGGCGGAACAAAGCAATTCCCATGGA
ATGGTAATGCAGATGCTATACAGGCAGCAAAAGATAAGATGGA
TGCAGGATTTGAATTCATGCAGAAGATGGGTATCGAATACTATT
GCTTCCATGACGTAGACTTGGTTTCGGAAGGTGCCAGTGTAGA
AGAATACGAAGCTAACCTGAAAGAAATCGTAGCTTATGCAAAA
CAGAAACAGGCAGAAACCGGTATCAAACTACTGTGGGGTACT
GCTAATGTATTCGGTCACGCCCGCTATATGAAC
Figure imgf000247_0001
TAAAAACCTTGATACAATAGTTTCAGTAATTAAAGATAGAATGA
AATCCAGTCCGGTAAAGTTATTATGGGGAACTACAAATGCTTTC
GGAAACCCAAGATTTATGCAT
Ruminococcus flavefaciens FD1 (SEQ ID NO: 102)
ATGGAATTTTTCAAGAACATAAGCAAGATCCCTTACGAGGGCA
AGGACAGCACAAATCCTCTCGCATTCAAGTACTACAATCCTGA
TGAGGTAATTGACGGCAAGAAGATGCGTGACATTATGAAGTTT
GCTCTCTCATGGTGGCATACAATGGGCGGCGACGGAACAGAT
ATGTTCGGCTGCGGTACAGCTGACAAGACATGGGGCGAAAAT
GATCCTGCTGCAAGAGCTAAGGCTAAGGTTGACGCAGCTTTC
GAGATCATGCAGAAGCTCTCTATCGATTACTTCTGTTTCCACGA
CCGTGATCTTTCTCCTGAGTACGGCTCACTGAAGGACACAAAC
GCTCAGCTGGACATCGTTACAGATTACATCAAGGCTAAGCAGG
CTGAGACAGGTCTCAAGTGCCTCTGGGGTACAGCTAAGTGCTT
CGATCACCCAAGATTCATGCAC
Ruminococcus 18P13 (SEQ ID NO: 103)
ATGAGCGAATTTTTTACAGGCATTTCAAAGATCCCCTTTGAGG
GAAAGGCATCCAACAATCCCATGGCGTTCAAGTACTACAACCC
GGATGAGGTCGTAGGCGGCAAGACCATGCGGGAGCAGCTGA
AGTTTGCGCTGTCCTGGTGGCATACTATGGGGGGAGACGGTA
CGGACATGTTTGGTGTGGGTACCACCAACAAGAAGTTCGGCG
GAACCGATCCCATGGACATTGCTAAGAGAAAGGTAAACGCTGC
GTTTGAGCTGATGGACAAGCTGTCCATCGATTATTTCTGTTTCC
ACGACCGGGATCTGGCGCCGGAGGCTGATAATCTGAAGGAAA
CCAACCAGCGTCTGGATGAAATCACCGAGTATATTGCACAGAT
GATGCAGCTGAACCCGGACAAGAAGGTTCTGTGGGGTACTGC
AAATTGCTTCGGCAATCCCCGGTATATGCAT
Clostridiales genomosp BVAB3 str UPII9-5 (SEQ ID NO: 104)
ATGAAATTTTTTGAAAATGTCCCTAAGGTAAAATATGAGGGAAG
CAAGTCTACCAACCCGTTTGCATTTAAGTATTACAATCCTGAAG
CGGTGATTGCCGGTAAAAAAATGAAGGATCACCTGAAATTCGC
GATGTCCTGGTGGCACACCATGACGGCGACCGGGCAAGACCA
GTTCGGTTCGGGGACGATGAGCCGAATATATGACGGGCAAAC
TGAACCGCTGGCCTTGGCCAAAGCCCGAGTGGATGCGGCTTT
CGATTTCATGGAAAAATTAAATATCGAATATTTTTGTTTTCATGA
TGCCGACTTGGCTCCAGAAGGTAACAGTTTGCAGGAACGCAA
CGAAAATTTGCAGGAAATGGTGTCTTACCTGAAACAAAAGATG
GCCGGAACTTCGATTAAGCTTTTATGGGGAACCTCGAATTGTT
TCAGCAACCCTCGTTTTATGCAC
Bacillus stercoris (SEQ ID NO: 105)
ATGGCAACAAAAGAGTATTTTCCCGGAATAGGAAAAATCAAATT
CGAAGGCAAAGAAAGTAAGAATCCTATGGCATTCCGCTACTAC
GATGCGGAAAAAGTAATCATGGGCAAGAAGATGAAAGATTGGT
TGAAGTTCTCTATGGCATGGTGGCATACACTCTGTGCAGAGGG
TGGTGACCAGTTCGGCGGCGGAACGAAACATTTCCCCTGGAA
CGGTGATGCCGATAAACTGCAGGCTGCCAAGAACAAAATGGA
TGCTGGTTTCGAGTTCATGCAGAAAATGGGCATCGAATATTAC
TGCTTCCACGATGTTGACCTTTGCGACGAGGCCGATACAATCG
AAGAGTACGAAGCAAACCTGAAAGCCATCGTTGCATACGCCAA
GCAAAAGCAGGAGGAAACAGGTATCAAACTGTTGTGGGGTAC
TGCCAACGTATTCGGTCATGCACGTTACATGAACG
Thermus thermophilus (SEQ ID NO: 106)
ATGTACGAGCCCAAACCGGAGCACAGGTTTACCTTTGGCCTTT
GGACTGTGGGCAATGTGGGCCGTGATCCCTTCGGGGACGCG
GTTCGGGAGAGGCTGGACCCGGTTTACGTGGTTCATAAGCTG
GCGGAGCTTGGGGCCTACGGGGTAAACCTTCACGACGAGGAC
CTGATCCCGCGGGGCACGCCTCCTCAGGAGCGGGACCAGAT
CGTGAGGCGCTTCAAGAAGGCTCTCGATGAAACCGGCCTCAA
GGTCCCCATGGTCACCGCCAACCTCTTCTCCGACCCTGCTTTC
AAGGAC
Xylose isomerase genes from additional bacteria were also utilized as the C-terminal portion of chimeric xylose isomerase activities. In some embodiments, the bacteria used as xylose isomerase nucleotide sequence donors were additional Ruminococcus bacteria. In certain embodiments, the bacteria used as xylose isomerase nucleotide sequence donors were Clostridiales bacteria. The native nucleotide and amino acid sequences of the additional xylose isomerase genes utilized to create chimeric xylose isomerase activities are given below. The 5' approximately 150 amino acids of the sequences below can be replaced as described above, using the sequences above, to create novel chimeric xylose isomerase activities.
NUCLEOTIDE SEQUENCES:
Ruminococcus_FD1 Xylose Isomerase (ZP_06143883.1, SEQ ID NO: 107)
ATGGAATTTTTCAAGAACATAAGCAAGATCCCTTACGAGGGCAAGGACAGCACAAATCCTCTC
GCATTCAAGTACTACAATCCTGATGAGGTAATTGACGGCAAGAAGATGCGTGACATTATGAAG TTTGCTCTCTCATGGTGGCATACAATGGGCGGCGACGGAACAGATATGTTCGGCTGCGGTAC AGCTGACAAGACATGGGGCGAAAATGATCCTGCTGCAAGAGCTAAGGCTAAGGTTGACGCAG CTTTCGAGATCATGCAGAAGCTCTCTATCGATTACTTCTGTTTCCACGACCGTGATCTTTCTCC TGAGTACGGCTCACTGAAGGACACAAACGCTCAGCTGGACATCGTTACAGATTACATCAAGG CTAAGCAGGCTGAGACAGGTCTCAAGTGCCTCTGGGGTACAGCTAAGTGCTTCGATCACCCA AGATTCATGCACGGTGCAGGTACTTCACCATCCGCAGACGTATTCGCTTTCTCAGCTGCACAG ATCAAGAAGGCTCTCGAGTCTACTGTAAAGCTCGGCGGTACAGGCTACGTATTCTGGGGCGG ACGTGAGGGTTATGAGACTCTCCTCAACACAAACATGGGCCTTGAGCTTGACAACATGGCTC GTCTCATGAAGATGGCTGTTGAGTACGGACGTTCTATCGGCTTCAAGGGCGATTTCTACATCG AGCCTAAGCCAAAGGAGCCAACAAAGCACCAGTACGATTTCGATACTGCTACTGTTCTCGGCT TCCTCAGAAAGTACGGTCTCGACAAGGATTTCAAGATGAACATCGAAGCTAACCACGCTACAC TGGCTCAGCACACATTCCAGCACGAGCTCTGCGTAGCAAGAACAAACGGTGCTTTCGGTTCA ATCGACGCAAACCAGGGCGATCCTCTCCTCGGATGGGATACAGACCAGTTCCCGACAAATAT CTATGACACAACAATGTGTATGTACGAAGTTATCAAGGCTGGCGGCTTCACAAACGGCGGTCT CAACTTCGATGCAAAGGCAAGACGTGGAAGCTTCACACCTGAGGATATCTTCTACAGCTACAT TGCAGGTATGGATGCATTCGCTCTCGGCTACAAGGCTGCAAGCAAGCTCATCGCTGACGGAC GTATCGACAGCTTCATTTCCGACCGCTACGCTTCATGGAGCGAGGGAATCGGTCTCGACATC ATCTCAGGCAAGGCTGATATGGCTGCTCTTGAGAAGTATGCTCTCGAAAAGGGCGAGGTTAC AGACTCTATTTCCAGCGGCAGACAGGAACTCCTCGAGTCTATCGTAAACAACGTTATATTCAAT CTTTGA
Ruminococcus_18P13 Xylose Isomerase (CBL17278.1, SEQ ID NO: 108)
ATGAGCGAATTTTTTACAGGCATTTCAAAGATCCCCTTTGAGGGAAAGGCATCCAACAATCCC ATGGCGTTCAAGTACTACAACCCGGATGAGGTCGTAGGCGGCAAGACCATGCGGGAGCAGC TGAAGTTTGCGCTGTCGTGGTGGCATACTATGGGGGGAGACGGTACGGACATGTTTGGTGTG GGTACCACCAACAAGAAGTTCGGCGGAACCGATCCCATGGACATTGCTAAGAGAAAGGTAAA CGCTGCGTTTGAGCTGATGGACAAGCTGTCCATCGATTATTTCTGTTTCCACGACCGGGATCT GGCGCCGGAGGCTGATAATCTGAAGGAAACCAACCAGCGTCTGGATGAAATCACCGAGTATA TTGCACAGATGATGCAGCTGAACCCGGACAAGAAGGTTCTGTGGGGTACTGCAAATTGCTTC GGCAATCCCCGGTATATGCATGGTGCCGGCACTGCGCCCAATGCGGACGTGTTTGCATTTGC AGCTGCGCAGATCAAAAAGGCAATTGAGATCACCGTAAAGCTGGGTGGCAAGGGCTATGTAT TCTGGGGCGGCAGAGAGGGCTACGAAACGCTGCTGAACACCAATATGGGTCTGGAACTGGA TAATATGGCACGGCTGCTGCATATGGCAGTGGACTATGCAAGAAGCATCGGCTTTACCGGCG ACTTCTACATCGAGCCCAAGCCCAAGGAGCCTACCAAGCATCAGTATGATTTTGATACCGCAA CCGTGATCGGCTTCCTGCGCAAGTATAATCTGGACAAGGACTTCAAGATGAACATCGAAGCCA ACCACGCAACCCTTGCACAGCACACCTTCCAGCATGAACTGCGGGTAGCACGGGAGAACGG CTTCTTTGGCTCCATCGATGCTAACCAGGGTGACACCCTGCTGGGCTGGGATACGGATCAGT TCCCCACTAATACCTATGACGCAGCACTGTGTATGTACGAGGTACTCAAGGCTGGCGGTTTTA CCAATGGCGGTCTGAACTTTGACTCCAAGGCACGGCGTGGATCCTTTGAGATGGAGGATATC TTCCACAGCTACATTGCCGGTATGGACACCTTTGCACTGGGTCTGAAGATTGCGCAGAAGATG ATCGATGACGGACGGATCGACCAGTTCGTGGCTGATCGGTATGCAAGCTGGAACACCGGCAT CGGTGCGGATATCATTTCCGGCAAGGCAACCATGGCAGATTTGGAGGCTTACGCACTGAGCA AGGGCGATGTGACCGCATCCCTCAAGAGCGGTCGTCAGGAATTGCTGGAAAGCATCCTGAAC AATATTATGTTCAATCTTTAA
Clostridiales_genomosp_BVAB3_UPII9-5 Xylose Isomerase (YP_003474614.1, SEQ ID NO: 109)
ATGAAATTTTTTGAAAATGTCCCTAAGGTAAAATATGAGGGAAGCAAGTCTACCAACCCGTTTG CATTTAAGTATTACAATCCTGAAGCGGTGATTGCCGGTAAAAAAATGAAGGATCACCTGAAATT CGCGATGTCCTGGTGGCACACCATGACGGCGACCGGGCAAGACCAGTTCGGTTCGGGGACG ATGAGCCGAATATATGACGGGCAAACTGAACCGCTGGCCTTGGCCAAAGCCCGAGTGGATGC GGCTTTCGATTTCATGGAAAAATTAAATATCGAATATTTTTGTTTTCATGATGCCGACTTGGCTC CAGAAGGTAACAGTTTGCAGGAACGCAACGAAAATTTGCAGGAAATGGTGTCTTACCTGAAAC AAAAGATGGCCGGAACTTCGATTAAGCTTTTATGGGGAACCTCGAATTGTTTCAGCAACCCTC GTTTTATGCACGGGGCAGCCACATCTTGCGAAGCGGATGTGTTTGCTTGGACCGCCACTCAG TTGAAAAATGCCATCGATGCTACCATCGCGCTTGGCGGTAAAGGCTATGTTTTCTGGGGCGG CCGGGAAGGCTATGAAACCTTGCTGAACACTGATGTCGGCCTGGAGATGGATAATTATGCAA GAATGCTGAAAATGGCGGTTGCATATGCGCATTCTAAAGGTTATACGGGTGACTTTTATATTGA ACCTAAGCCAAAAGAACCCACTAAACATCAATATGATTTCGATGTCGCCACTTGCGTTGCTTTC CTTGAAAAATACGATTTGATGCGTGATTTTAAAGTAAACATTGAGGCTAATCACGCTACTTTGG CCGGTCATACTTTCCAACATGAGTTACGCATGGCGCGTACCTTCGGGGTATTCGGCTCGGTT GATGCCAATCAGGGCGACAGCAATCTGGGCTGGGATACCGATCAGTTCCCGGGCAATATTTA TGATACGACTTTGGCCATGTATGAGATTTTGAAGGCCGGTGGATTTACCAACGGAGGCTTGAA CTTTGATGCTAAAGTGCGTCGTCCGTCATTTACCCCGGAAGATATTGCTTATGCTTATATTTTG GGCATGGATACGTTTGCCTTAGGCTTGATTAAGGCGCAACAGCTGATTGAGGATGGCAGAATT GATCGTTTCGTAGCGGAAAAATATGCTAGTTATAAGTCGGGCATCGGTGCTGAAATCTTGAGT GGTAAAACCGGTTTGCCGGAATTGGAGGCTTACGCATTGAAGAAAGGCGAGCCTAAGTTGTA TAGTGGGCGGCAGGAATATCTTGAAAGTGTCGTTAATAACGTAATTTTCAACGGAAATCTTTGA
AMINO ACID SEQUENCES:
Ruminococcus_FD1 Xylose Isomerase (SEQ ID NO: 110)
MEFFKNISKIPYEGKDSTNPLAFKYYNPDEVIDGKKMRDIMKFALSWWHTMGGDGTDMFGCGTA DKTWGENDPAARAKAKVDAAFEIMQKLSIDYFCFHDRDLSPEYGSLKDTNAQLDIVTDYIKAKQAE TGLKCLWGTAKCFDHPRFMHGAGTSPSADVFAFSAAQIKKALESTVKLGGTGYVFWGGREGYET LLNTNMGLELDNMARLMKMAVEYGRSIGFKGDFYIEPKPKEPTKHQYDFDTATVLGFLRKYGLDK DFKMNIEANHATLAQHTFQHELCVARTNGAFGSIDANQGDPLLGWDTDQFPTNIYDTTMCMYEVI KAGGFTNGGLNFDAKARRGSFTPEDIFYSYIAGMDAFALGYKAASKLIADGRIDSFISDRYASWSE GIGLDIISGKADMAALEKYALEKGEVTDSISSGRQELLESIVNNVIFNL
Ruminococcus_18P13 Xylose Isomerase (SEQ ID NO: 111)
MSEFFTGISKIPFEGKASNNPMAFKYYNPDEVVGGKTMREQLKFALSWWHTMGGDGTDMFGVG TTNKKFGGTDPMDIAKRKVNAAFELMDKLSIDYFCFHDRDLAPEADNLKETNQRLDEITEYIAQMM QLNPDKKVLWGTANCFGNPRYMHGAGTAPNADVFAFAAAQIKKAIEITVKLGGKGYVFWGGREG YETLLNTNMGLELDNMARLLHMAVDYARSIGFTGDFYIEPKPKEPTKHQYDFDTATVIGFLRKYNL DKDFKMNIEANHATLAQHTFQHELRVARENGFFGSIDANQGDTLLGWDTDQFPTNTYDAALCMY EVLKAGGFTNGGLNFDSKARRGSFEMEDIFHSYIAGMDTFALGLKIAQKMIDDGRIDQFVADRYAS WNTGIGADIISGKATMADLEAYALSKGDVTASLKSGRQELLESILNNIMFN
Clostridiales_genomosp. BVAB3 str UPII9-5 Xylose Isomerase (SEQ ID NO: 112)
MKFFENVPKVKYEGSKSTNPFAFKYYNPEAVIAGKKMKDHLKFAMSWWHTMTATGQDQFGSGT MSRIYDGQTEPLALAKARVDAAFDFMEKLNIEYFCFHDADLAPEGNSLQERNENLQEMVSYLKQK MAGTSIKLLWGTSNCFSNPRFMHGAATSCEADVFAWTATQLKNAIDATIALGGKGYVFWGGREG YETLLNTDVGLEMDNYARMLKMAVAYAHSKGYTGDFYIEPKPKEPTKHQYDFDVATCVAFLEKYD LMRDFKVNIEANHATLAGHTFQHELRMARTFGVFGSVDANQGDSNLGWDTDQFPGNIYDTTLAM YEILKAGGFTNGGLNFDAKVRRPSFTPEDIAYAYILGMDTFALGLIKAQQLIEDGRIDRFVAEKYASY KSGIGAEILSGKTGLPELEAYALKKGEPKLYSGRQEYLESVVNNVIFNGNL
Amino acid similarity comparisons were performed on the various xylose isomerase proteins whose sequences were analyzed to generate the chimeric xylose isomerase activity nucleotide sequences. The results of the amino acid similarity comparison are presented in the table below.
Figure imgf000253_0001
Example 26: Nucleotide and amino acid sequences of over expressed activities useful for increasing sugar transport and/or sugar metabolism
As noted herein, increased or over expression of certain activities can result in increased ethanol production due to an increase in the utilization of the fermentation substrate, sometimes due to an increase in transport and/or metabolism of a desired sugar. Non-limiting examples of activities that can be over expressed to increase ethanol production by increasing sugar transport and/or metabolism include activities encoded by the genes gxf1, gxs1, hxt7, zwf1, gal2, sol3, sol4, the like, homologs thereof (e.g., Candida albicans Sol1p, Schizosaccharomyces pombe Sol1p, human PGLS and human H6PD) that can be expressed in a desired host organism, and combinations thereof. Nucleotide and amino acid sequences for some of these additional activities are given below. In some embodiments, 1, 2, 3, 4, 5, 6 or more of the non-limiting additional activities can be increased in expression or over expressed in an engineered host, thereby increasing transport and/or metabolism of a desired carbon source, wherein increased transport and/or metabolism of a desired carbon source results in increased ethanol production.
Nucleotide Sequences
Debaryomyces hansenii gxf1 (SEQ ID NO: 113)
ATGTCTCAAGAAGAATATAGTTCTGGGGTACAAACCCCAGTTTCTAACCATTCTGGTTTAGAGA AAGAAGAGCAACACAAGTTAGACGGTTTAGATGAGGATGAAATTGTCGATCAATTACCTTCTTT ACCAGAAAAATCAGCTAAGGATTATTTATTAATTTCTTTCTTCTGTGTATTAGTTGCATTTGGTG GTTTTGTTTTCGGTTTCGATACTGGTACTATCTCAGGTTTCGTTAACATGAGTGATTACTTGGA AAGATTCGGTGAGCTTAATGCAGATGGTGAATATTTCTTATCTAATGTTAGAACTGGTTTGATT GTTGCTATTTTTAATGTTGGTTGTGCTGTCGGTGGTATTTTCTTATCTAAGATTGCTGATGTTTA TGGTAGAAGAATTGGTCTTATGTTTTCCATGATTATTTATGTGATTGGTATAATTGTTCAAATCT CAGCTTCTGACAAGTGGTATCAAATCGTTGTTGGTAGAGCTATTGCAGGTTTAGCTGTTGGTA CCGTTTCTGTCTTATCCCCATTATTCATTGGTGAATCAGCACCTAAAACCTTAAGAGGTACTTT AGTGTGTTGTTTCCAATTATGTATTACCTTAGGTATCTTCTTAGGTTACTGTACTACATATGGTA CTAAAACCTACACCGACTCTAGACAATGGAGAATTCCATTAGGTTTATGTTTTGTTTGGGCTAT CATGTTGGTTATTGGTATGGTTTGCATGCCAGAATCACCAAGATACTTAGTTGTCAAGAACAAG ATTGAAGAAGCTAAGAAATCGATTGGTAGATCCAACAAGGTTTCACCAGAAGATCCTGCTGTT TACACCGAAGTCCAATTGATTCAAGCAGGTATTGAAAGAGAAAGTTTAGCTGGTTCTGCCTCTT GGACCGAATTGGTTACTGGTAAGCCAAGAATCTTTCGTAGAGTCATTATGGGTATTATGTTACA ATCTTTACAACAATTGACTGGTGACAACTATTTCTTCTACTATGGTACTACTATTTTCCAAGCTG TCGGTATGACTGATTCCTTCCAAACATCTATTGTTTTAGGTGTTGTTAACTTTGCATCTACATTT CTCGGTATCTACACAATTGAAAGATTCGGTAGAAGATTATGTTTGTTAACTGGTTCTGTCTGTA TGTTCGTTTGTTTCATCATTTACTCCATTTTGGGTGTTACAAACTTATATATTGATGGCTACGAT GGTCCAACTTCGGTTCCAACCGGTGATGCGATGATTTTCATTACTACCTTATACATTTTCTTCT TCGCATCCACCTGGGCTGGTGGTGTCTACTGTATCGTTTCCGAAACATACCCATTGAGAATTA GATCTAAGGCCATGTCCGTTGCCACCGCTGCTAACTGGATTTGGGGTTTCTTGATCTCTTTCT TCACTCCATTCATCACCTCGGCTATCCACTTCTACTACGGTTTCGTTTTCACAGGATGTTTGTT ATTCTCGTTCTTTTACGTTTACTTCTTTGTTGTTGAAACTAAGGGATTAACTTTAGAAGAAGTTG ATGAATTGTATGCCCAAGGTGTTGCCCCATGGAAGTCATCGAAATGGGTTCCACCAACCAAGG AAGAAATGGCCCATTCTTCAGGATATGCTGCTGAAGCCAAACCTCACGATCAACAAGTATAA
Saccharomyces cerevisiae gal2 (SEQ ID NO: 114)
ATGGCAGTTGAGGAGAACAATATGCCTGTTGTTTCACAGCAACCCCAAGCTGGTGAAGAC GTGATCTCTTCACTCAGTAAAGATTCCCATTTAAGCGCACAATCTCAAAAGTATTCTAAT GATGAATTGAAAGCCGGTGAGTCAGGGTCTGAAGGCTCCCAAAGTGTTCCTATAGAGATA CCCAAGAAGCCCATGTCTGAATATGTTACCGTTTCCTTGCTTTGTTTGTGTGTTGCCTTC GGCGGCTTCATGTTTGGCTGGGATACCGGTACTATTTCTGGGTTTGTTGTCCAAACAGAC TTTTTGAGAAGGTTTGGTATGAAACATAAGGATGGTACCCACTATTTGTCAAACGTCAGA ACAGGTTTAATCGTCGCCATTTTCAATATTGGCTGTGCCTTTGGTGGTATTATACTTTCC AAAGGTGGAGATATGTATGGCCGTAAAAAGGGTCTTTCGATTGTCGTCTCGGTTTATATA GTTGGTATTATCATTCAAATTGCCTCTATCAACAAGTGGTACCAATATTTCATTGGTAGA ATCATATCTGGTTTGGGTGTCGGCGGCATCGCCGTCTTATGTCCTATGTTGATCTCTGAA ATTGCTCCAAAGCACTTGAGAGGCACACTAGTTTCTTGTTATCAGCTGATGATTACTGCA GGTATCTTTTTGGGCTACTGTACTAATTACGGTACAAAGAGCTATTCGAACTCAGTTCAA TGGAGAGTTCCATTAGGGCTATGTTTCGCTTGGTCATTATTTATGATTGGCGCTTTGACG TTAGTTCCTGAATCCCCACGTTATTTATGTGAGGTGAATAAGGTAGAAGACGCCAAGCGT TCCATTGCTAAGTCTAACAAGGTGTCACCAGAGGATCCTGCCGTCCAGGCAGAGTTAGAT CTGATCATGGCCGGTATAGAAGCTGAAAAACTGGCTGGCAATGCGTCCTGGGGGGAATTA TTTTCCACCAAGACCAAAGTATTTCAACGTTTGTTGATGGGTGTGTTTGTTCAAATGTTC CAACAATTAACCGGTAACAATTATTTTTTCTACTACGGTACCGTTATTTTCAAGTCAGTT GGCCTGGATGATTCCTTTGAAACATCCATTGTCATTGGTGTAGTCAACTTTGCCTCCACT TTCTTTAGTTTGTGGACTGTCGAAAACTTGGGACATCGTAAATGTTTACTTTTGGGCGCT GCCACTATGATGGCTTGTATGGTCATCTACGCCTCTGTTGGTGTTACTAGATTATATCCT CACGGTAAAAGCCAGCCATCTTCTAAAGGTGCCGGTAACTGTATGATTGTCTTTACCTGT TTTTATATTTTCTGTTATGCCACAACCTGGGCGCCAGTTGCCTGGGTCATCACAGCAGAA TCATTCCCACTGAGAGTCAAGTCGAAATGTATGGCGTTGGCCTCTGCTTCCAATTGGGTA TGGGGGTTCTTGATTGCATTTTTCACCCCATTCATCACATCTGCCATTAACTTCTACTAC GGTTATGTCTTCATGGGCTGTTTGGTTGCCATGTTTTTTTATGTCTTTTTCTTTGTTCCA GAAACTAAAGGCCTATCGTTAGAAGAAATTCAAGAATTATGGGAAGAAGGTGTTTTACCT TGGAAATCTGAAGGCTGGATTCCTTCATCCAGAAGAGGTAATAATTACGATTTAGAGGAT TTACAACATGACGACAAACCGTGGTACAAGGCCATGCTAGAATAA
Saccharomyces cerevisiae sol3 (SEQ ID NO: 115)
ATGGTGACAGTCGGTGTGTTTTCTGAGAGGGCTAGTTTGACCCATCAATTGGGGGAATTC ATCGTCAAGAAACAAGATGAGGCGCTGCAAAAGAAGTCAGACTTTAAAGTTTCCGTTAGC GGTGGCTCTTTGATCGATGCTCTGTATGAAAGTTTAGTAGCGGACGAATCACTATCTTCT CGAGTGCAATGGTCTAAATGGCAAATCTACTTCTCTGATGAAAGAATTGTGCCACTGACG GACGCTGACAGCAATTATGGTGCCTTCAAGAGAGCTGTTCTAGATAAATTACCCTCGACT AGTCAGCCAAACGTTTATCCCATGGACGAGTCCTTGATTGGCAGCGATGCTGAATCTAAC AACAAAATTGCTGCAGAGTACGAGCGTATCGTACCTCAAGTGCTTGATTTGGTACTGTTG GGCTGTGGTCCTGATGGACACACTTGTTCCTTATTCCCTGGAGAAACACATAGGTACTTG CTGAACGAAACAACCAAAAGAGTTGCTTGGTGCCACGATTCTCCCAAGCCTCCAAGTGAC AGAATCACCTTCACTCTGCCTGTGTTGAAAGACGCCAAAGCCCTGTGTTTTGTGGCTGAG GGCAGTTCCAAACAAAATATAATGCATGAGATCTTTGACTTGAAAAACGATCAATTGCCA ACCGCATTGGTTAACAAATTATTTGGTGAAAAAACATCCTGGTTCGTTAATGAGGAAGCT TTTGGAAAAGTTCAAACGAAAACTTTTTAG
Saccharomyces cerevisiae zwf1 (SEQ ID NO: 116)
ATGAGTGAAGGCCCCGTCAAATTCGAAAAAAATACCGTCATATCTGTCTTTGGTGCGTCA GGTGATCTGGCAAAGAAGAAGACTTTTCCCGCCTTATTTGGGCTTTTCAGAGAAGGTTAC CTTGATCCATCTACCAAGATCTTCGGTTATGCCCGGTCCAAATTGTCCATGGAGGAGGAC CTGAAGTCCCGTGTCCTACCCCACTTGAAAAAACCTCACGGTGAAGCCGATGACTCTAAG GTCGAACAGTTCTTCAAGATGGTCAGCTACATTTCGGGAAATTACGACACAGATGAAGGC TTCGACGAATTAAGAACGCAGATCGAGAAATTCGAGAAAAGTGCCAACGTCGATGTCCCA CACCGTCTCTTCTATCTGGCCTTGCCGCCAAGCGTTTTTTTGACGGTGGCCAAGCAGATC AAGAGTCGTGTGTACGCAGAGAATGGCATCACCCGTGTAATCGTAGAGAAACCTTTCGGC CACGACCTGGCCTCTGGCAGGGAGCTGCAAAAAAACCTGGGGCCCCTCTTTAAAGAAGAA GAGTTGTACAGAATTGACCATTACTTGGGTAAAGAGTTGGTCAAGAATCTTTTAGTCTTG AGGTTCGGTAACCAGTTTTTGAATGCCTCGTGGAATAGAGACAACATTCAAAGCGTTCAG ATTTCGTTTAAAGAGAGGTTCGGCACCGAAGGCCGTGGCGGCTATTTCGACTCTATAGGC ATAATCAGAGACGTGATGCAGAACCATCTGTTACAAATCATGACTCTCTTGACTATGGAA AGACCGGTGTCTTTTGACCCGGAATCTATTCGTGACGAAAAGGTTAAGGTTCTAAAGGCC GTGGCCCCCATCGACACGGACGACGTCCTCTTGGGCCAGTACGGTAAATCTGAGGACGGG TCTAAGCCCGCCTACGTGGATGATGACACTGTAGACAAGGACTCTAAATGTGTCACTTTT GCAGCAATGACTTTCAACATCGAAAACGAGCGTTGGGAGGGCGTCCCCATCATGATGCGT GCCGGTAAGGCTTTGAATGAGTCCAAGGTGGAGATCAGACTGCAGTACAAAGCGGTCGCA TCGGGTGTCTTCAAAGACATTCCAAATAACGAACTGGTCATCAGAGTGCAGCCCGATGCC GCTGTGTACCTAAAGTTTAATGCTAAGACCCCTGGTCTGTCAAATGCTACCCAAGTCACA GATCTGAATCTAACTTACGCAAGCAGGTACCAAGACTTTTGGATTCCAGAGGCTTACGAG GTGTTGATAAGAGACGCCCTACTGGGTGACCATTCCAACTTTGTCAGAGATGACGAATTG GATATCAGTTGGGGCATATTCACCCCATTACTGAAGCACATAGAGCGTCCGGACGGTCCA ACACCGGAAATTTACCCCTACGGATCAAGAGGTCCAAAGGGATTGAAGGAATATATGCAA AAACACAAGTATGTTATGCCCGAAAAGCACCCTTACGCTTGGCCCGTGACTAAGCCAGAA GATACGAAGGATAATTAG
Amino Acid Sequences
Debaryomyces hansenii gxf1 (SEQ ID NO: 117)
1 MSQEEYSSGV QTPVSNHSGL EKEEQHKLDG LDEDEIVDQL PSLPEKSAKD YLLISFFCVL
61 VAFGGFVFGF DTGTISGFVN MSDYLERFGE LNADGEYFLS NVRTGLIVAI FNVGCAVGGI
121 FLSKIADVYG RRIGLMFSMI IYVIGIIVQI SASDKWYQIV VGRAIAGLAV GTVSVLSPLF
181 IGESAPKTLR GTLVCCFQLC ITLGIFLGYC TTYGTKTYTD SRQWRIPLGL CFVWAIMLVI
241 GMVCMPESPR YLVVKNKIEE AKKSIGRSNK VSPEDPAVYT EVQLIQAGIE RESLAGSASW
301 TELVTGKPRI FRRVIMGIML QSLQQLTGDN YFFYYGTTIF QAVGMTDSFQ TSIVLGVVNF
361 ASTFLGIYTI ERFGRRLCLL TGSVCMFVCF IIYSILGVTN LYIDGYDGPT SVPTGDAMIF
421 ITTLYIFFFA STWAGGVYCI VSETYPLRIR SKAMSVATAA NWIWGFLISF FTPFITSAIH
481 FYYGFVFTGC LLFSFFYVYF FVVETKGLTL EEVDELYAQG VAPWKSSKWV PPTKEEMAHS
541 SGYAAEAKPH DQQV
Saccharomyces cerevisiae gal2 (SEQ ID NO: 118)
1 MAVEENNMPV VSQQPQAGED VISSLSKDSH LSAQSQKYSN DELKAGESGS
51 EGSQSVPIEI PKKPMSEYVT VSLLCLCVAF GGFMFGWDTG TISGFVVQTD
101 FLRRFGMKHK DGTHYLSNVR TGLIVAIFNI GCAFGGIILS KGGDMYGRKK
151 GLSIVVSVYI VGIIIQIASI NKWYQYFIGR IISGLGVGGI AVLCPMLISE
201 IAPKHLRGTL VSCYQLMITA GIFLGYCTNY GTKSYSNSVQ WRVPLGLCFA
251 WSLFMIGALT LVPESPRYLC EVNKVEDAKR SIAKSNKVSP EDPAVQAELD
301 LIMAGIEAEK LAGNASWGEL FSTKTKVFQR LLMGVFVQMF QQLTGNNYFF
351 YYGTVIFKSV GLDDSFETSI VIGVVNFAST FFSLWTVENL GHRKCLLLGA
401 ATMMACMVIY ASVGVTRLYP HGKSQPSSKG AGNCMIVFTC FYIFCYATTW
451 APVAWVITAE SFPLRVKSKC MALASASNWV WGFLIAFFTP FITSAINFYY
501 GYVFMGCLVA MFFYVFFFVP ETKGLSLEEI QELWEEGVLP WKSEGWIPSS
551 RRGNNYDLED LQHDDKPWYK AMLE
Saccharomyces cerevisiae zwf1 (SEQ ID NO: 119)
1 MSEGPVKFEK NTVISVFGAS GDLAKKKTFP ALFGLFREGY LDPSTKIFGY
51 ARSKLSMEED LKSRVLPHLK KPHGEADDSK VEQFFKMVSY ISGNYDTDEG
101 FDELRTQIEK FEKSANVDVP HRLFYLALPP SVFLTVAKQI KSRVYAENGI
151 TRVIVEKPFG HDLASARELQ KNLGPLFKEE ELYRIDHYLG KELVKNLLVL
201 RFGNQFLNAS WNRDNIQSVQ ISFKERFGTE GRGGYFDSIG IIRDVMQNHL
251 LQIMTLLTME RPVSFDPESI RDEKVKVLKA VAPIDTDDVL LGQYGKSEDG
301 SKPAYVDDDT VDKDSKCVTF AAMTFNIENE RWEGVPIMMR AGKALNESKV
351 EIRLQYKAVA SGVFKDIPNN ELVIRVQPDA AVYLKFNAKT PGLSNATQVT
401 DLNLTYASRY QDFWIPEAYE VLIRDALLGD HSNFVRDDEL DISWGIFTPL
451 LKHIERPDGP TPEIYPYGSR GPKGLKEYMQ KHKYVMPEKH PYAWPVTKPE
501 DTKDN
Saccharomyces cerevisiae sol3 (SEQ ID NO: 120)
1 MVTVGVFSER ASLTHQLGEF IVKKQDEALQ KKSDFKVSVS GGSLIDALYE
51 SLVADESLSS RVQWSKWQIY FSDERIVPLT DADSNYGAFK RAVLDKLPST
101 SQPNVYPMDE SLIGSDAESN NKIAAEYERI VPQVLDLVLL GCGPDGHTCS
151 LFPGETHRYL LNETTKRVAW CHDSPKPPSD RITFTLPVLK DAKALCFVAE
201 GSSKQNIMHE IFDLKNDQLP TALVNKLFGE KTSWFVNEEA FGKVQTKTF
Example 27: Cloning of additional ZWF1 candidate genes
A variety of ZWF1 genes were cloned from S. cerevisiae, Zymomonas mobilis, Pseudomonas fluorescens (zwf1 and zwf2), and P. aeruginosa strain PA01. The sequences of these additional ZWF1 genes are given below.
zwf1 from P. fluorescens
Amino Acid Sequence (SEQ ID NO: 123)
MTTTRKKSKALPAPPTTLFLFGARGDLVKRLLMPALYNLSRDGLLDEGLRIVGVDHNAVSDAEFAT LLEDFLRDEVLNKQGQGAAVDAAVWARLTRGINYVQGDFLDDSTYAELAARIAASGTGNAVFYLA TAPRFFSEVVRRLGSAGLLEEGPQAFRRVVIEKPFGSDLQTAEALNGCLLKVMSEKQIYRIDHYLG KETVQNILVSRFSNSLFEAFWNNHYIDHVQITAAETVGVETRGSFYEHTGALRDMVPNHLFQLLAM VAMEPPAAFGADAVRGEKAKVVGAIRPWSVEEARANSVRGQYSAGEVAGKALAGYREEANVAP DSSTETYVALKVMIDNWRWVGVPFYLRTGKRMSVRDTEIVICFKPAPYAQFRDTEVERLLPTYLRI QIQPNEGMWFDLLAKKPGPSLDMANIELGFAYRDFFEMQPSTGYETLIYDCLIGDQTLFQRADNIE NGWRAVQPFLDAWQQDASLQNYPAGVDGPAAGDELLARDGRVWRPLG
Nucleotide Sequence (SEQ ID NO: 124)
ATGACCACCACGCGAAAGAAGTCCAAGGCGTTGCCGGCGCCGCCGACCACGCTGTTCCTGT TCGGCGCCCGCGGTGATCTGGTCAAGCGCCTGCTGATGCCGGCGCTGTACAACCTCAGCCG CGACGGTTTGCTGGATGAGGGGCTGCGGATTGTCGGCGTCGACCACAACGCGGTGAGCGAC GCCGAGTTCGCCACGCTGCTGGAAGACTTCCTTCGCGATGAAGTGCTCAACAAGCAAGGCCA GGGGGCGGCGGTGGATGCCGCCGTCTGGGCCCGCCTGACCCGGGGCATCAACTATGTCCA GGGCGATTTTCTCGACGACTCCACCTATGCCGAACTGGCGGCGCGGATTGCCGCCAGCGGC ACCGGCAACGCGGTGTTCTACCTGGCCACCGCACCGCGCTTCTTCAGTGAAGTGGTGCGCC GCCTGGGCAGCGCCGGGTTGCTGGAGGAGGGGCCGCAGGCTTTTCGCCGGGTGGTGATCG AAAAACCCTTCGGCTCCGACCTGCAGACCGCCGAAGCCCTCAACGGCTGCCTGCTCAAGGTC ATGAGCGAGAAGCAGATCTATCGCATCGACCATTACCTGGGCAAGGAAACGGTCCAGAACAT CCTGGTCAGCCGTTTTTCCAACAGCCTGTTCGAGGCATTCTGGAACAACCATTACATCGACCA CGTGCAGATCACCGCGGCGGAAACCGTCGGCGTGGAAACCCGTGGCAGCTTTTATGAACAC ACCGGTGCCCTGCGGGACATGGTGCCCAACCACCTGTTCCAGTTGCTGGCGATGGTGGCCA TGGAGCCGCCCGCTGCCTTTGGCGCCGATGCGGTACGTGGCGAAAAGGCCAAGGTGGTGG GGGCTATCCGCCCCTGGTCCGTGGAAGAGGCCCGGGCCAACTCGGTGCGCGGCCAGTACA GCGCCGGTGAAGTGGCCGGCAAGGCCCTGGCGGGCTACCGCGAGGAAGCCAACGTGGCGC CGGACAGCAGCACCGAAACCTACGTTGCGCTGAAGGTGATGATCGACAACTGGCGCTGGGT CGGGGTGCCGTTCTACCTGCGCACCGGCAAGCGCATGAGTGTGCGCGACACCGAGATCGTC ATCTGCTTCAAGCCGGCGCCCTATGCACAGTTCCGCGATACCGAGGTCGAGCGCCTGTTGCC GACCTACCTGCGGATCCAGATCCAGCCCAACGAAGGCATGTGGTTCGACCTGCTGGCGAAAA AGCCCGGGCCGAGCCTGGACATGGCCAACATCGAACTGGGTTTTGCCTACCGCGACTTTTTC GAGATGCAGCCCTCCACCGGCTACGAAACCCTGATCTACGACTGCCTGATCGGCGACCAGAC CCTGTTCCAGCGCGCCGACAACATCGAGAACGGCTGGCGCGCGGTGCAACCCTTCCTCGAT GCCTGGCAACAGGACGCCAGCTTGCAGAACTACCCGGCGGGCGTGGATGGCCCGGCAGCC GGGGATGAACTGCTGGCCCGGGATGGCCGCGTATGGCGACCCCTGGGGTGA
zwf2 from P. fluorescens
Amino Acid Sequence (SEQ ID NO: 125)
MPSITVEPCTFALFGALGDLALRKLFPALYQLDAAGLLHDDTRILALAREPGSEQEHLANIETELHKY VGDKDIDSQVLQRFLVRLSYLHVDFLKAEDYVALAERVGSEQRLIAYFATPAAVYGAICENLSRVGL NQHTRVVLEKPIGSDLDSSRKVNDAVAQFFPETRIYRIDHYLGKETVQNLIALRFANSLFETQWNQ NYISHVEITVAEKVGIEGRWGYFDKAGQLRDMIQNHLLQLLCLIA DPPADLSADSIRDEKVKVLKA LAPISPEGLTTQVVRGQYIAGHSEGQSVPGYLEEENSNTQSDTETFVALRADIRNWRWAGVPFYL RTGKRMPQKLSQIVIHFKEPSHYIFAPEQRLQISNKLIIRLQPDEGISLRVMTKEQGLDKGMQLRSG PLQLNFSDTYRSARIPDAYERLLLEVMRGNQNLFVRKDEIEAAWKWCDQLIAGWKKSGDAPKPYA AGSWGPMSSIAUTRDGRSWYGDI Nucleotide Sequence (SEQ ID NO: 126)
ATGCCTTCGATAACGGTTGAACCCTGCACCTTTGCCTTGTTTGGCGCGCTGGGCGATCTGGC GCTGCGTAAGCTGTTTCCTGCCCTGTACCAACTCGATGCCGCCGGTTTGCTGCATGACGACA CGCGCATCCTGGCCCTGGCCCGCGAGCCTGGCAGCGAGCAGGAACACCTGGCGAATATCGA AACCGAGCTGCACAAGTATGTCGGCGACAAGGATATCGATAGCCAGGTCCTGCAGCGTTTTC TCGTCCGCCTGAGCTACCTGCATGTGGACTTCCTCAAGGCCGAGGACTACGTCGCCCTGGCC GAACGTGTCGGCAGCGAGCAGCGCCTGATTGCCTACTTCGCCACGCCGGCGGCGGTGTATG GCGCGATCTGCGAAAACCTCTCCCGGGTCGGGCTCAACCAGCACACCCGTGTGGTCCTGGA AAAACCCATCGGCTCGGACCTGGATTCATCACGCAAGGTCAACGACGCGGTGGCGCAGTTCT TCCCGGAAACCCGCATCTACCGGATCGACCACTACCTGGGCAAGGAAACGGTGCAGAACCTG ATTGCCCTGCGTTTCGCCAACAGCCTGTTCGAAACCCAGTGGAACCAGAACTACATCTCCCAC GTGGAAATCACCGTGGCCGAGAAGGTCGGCATCGAAGGTCGCTGGGGCTATTTCGACAAGG CCGGCCAACTGCGGGACATGATCCAGAACCACTTGCTGCAACTGCTCTGCCTGATCGCGATG GACCCGCCGGCCG ACCTTTCGGCCG ACAGCATCCGCGACGAGAAGGTCAAGGTGCTCAAGG CCCTGGCGCCCATCAGCCCGGAAGGCCTGACCACCCAGGTGGTGCGCGGCCAGTACATCGC CGGCCACAGCGAAGGCCAGTCGGTGCCGGGCTACCTGGAGGAAGAAAACTCCAACACCCAG AGCGACACCGAGACCTTCGTCGCCCTGCGCGCCGATATCCGCAACTGGCGCTGGGCCGGTG TGCCTTTCTACCTGCGCACCGGCAAGCGCATGCCACAGAAGCTGTCGCAGATCGTCATCCAC TTCAAGGAACCCTCGCACTACATCTTCGCCCCCGAGCAGCGCCTGCAGATCAGCAACAAGCT GATCATCCGCCTGCAGCCGGACGAAGGTATCTCGTTGCGGGTGATGACCAAGGAGCAGGGC CTGGACAAGGGCATGCAACTGCGCAGCGGTCCGTTGCAGCTGAA I I I I I CCGATACCTATCG CAGTGCACGGATCCCCGATGCCTACGAGCGGTTGTTGCTGGAAGTGATGCGCGGCAATCAG AACCTGTTTGTGCGCAAAGATGAAATCGAAGCCGCGTGGAAGTGGTGTGACCAGTTGATTGC CGGGTGGAAGAAATCCGGCGATGCGCCCAAGCCGTACGCGGCCGGGTCCTGGGGGCCGAT GAGCTCCATTGCACTGATCACGCGGGATGGGAGGTCTTGGTATGGCGATATCTaA zwf l from P. aeruginosa, PA01 Amino Acid Sequence (SEQ ID NO: 127)
MPDVRVLPCTLALFGALGDLALRKLFPALYQLDRENLLHRDTRVLALARDEGAPAEHLATLEQRLR LAVPAKEWDDVVWQRFRERLDYLSMDFLDPQAYVGLREAVDDELPLVAYFATPASVFGGICENLA AAGLAERTRVVLEKPIGHDLESSREVNEAVARFFPESRIYRIDHYLGKETVQNLIALRFANSLFETQ WNQNHISHVEITVAEKVGIEGRWGYFDQAGQLRDMVQNHLLQLLCLIAMDPPSDLSADSIRDEKV KVLRALEPIPAEQLASRVVRGQYTAGFSDGKAVPGYLEEEHANRDSDAETFVALRVDIRNWRWS GVPFYLRTGKRMPQKLSQIVIHFKEPPHYIFAPEQRSLISNRLIIRLQPDEGISLQVMTKDQGLGKG MQLRTGPLQLSFSETYHAARIPDAYERLLLEVTQGNQYLFVRKDEVEFAWKWCDQLIAGWERLSE APKPYPAGSWGPVASVALVARDGRSWYGDF Nucleotide Sequence (SEQ ID NO: 128)
ATGCCTGATGTCCGCGTTCTGCCTTGCACGTTAGCGCTGTTCGGTGCGCTGGGCGATCTCGC CTTGCGCAAGCTGTTCCCGGCGCTCTACCAACTCGATCGTGAGAACCTGCTGCACCGCGATA CCCGCGTCCTGGCCCTGGCCCGTGACGAAGGCGCTCCCGCCGAACACCTGGCGACGCTGG AGCAGCGCCTGCGCCTGGCAGTGCCGGCGAAGGAGTGGGACGACGTGGTCTGGCAGCGTT TCCGCGAACGCCTCGACTACCTGAGCATGGACTTCCTCGACCCGCAGGCCTATGTCGGCTTG CGCGAGGCGGTGGATGACGAACTGCCGCTGGTCGCCTACTTCGCCACGCCGGCCTCGGTGT TCGGCGGCATCTGCGAGAACCTCGCCGCCGCCGGTCTCGCCGAGCGCACCCGGGTGGTGC TGGAGAAGCCCATCGGTCATGACCTGGAGTCGTCCCGCGAGGTCAACGAGGCAGTCGCCCG GTTCTTCCCGGAAAGCCGCATCTACCGGATCGACCATTACCTGGGCAAGGAGACGGTGCAGA ACCTGATCGCCCTGCGCTTCGCCAACAGCCTCTTCGAGACCCAGTGGAACCAGAACCACATC TCCCACGTGGAGATCACCGTGGCCGAGAAGGTCGGCATCGAAGGCCGCTGGGGCTACTTCG ACCAGGCCGGGCAACTGCGCGACATGGTGCAGAACCACCTGCTGCAACTGCTCTGCCTGAT CGCCATGGATCCGCCCAGCGACCTTTCGGCGGACAGCATTCGCGACGAGAAGGTCAAGGTC CTCCGCGCCCTCGAGCCGATTCCCGCAGAACAACTGGCTTCGCGCGTGGTGCGTGGGCAGT ACACCGCCGGTTTCAGCGACGGCAAGGCAGTGCCGGGCTACCTGGAGGAGGAACATGCGAA TCGCGACAGCGACGCGGAAACCTTCGTCGCCCTGCGCGTGGACATCCGCAACTGGCGCTGG TCGGGCGTGCCGTTCTACCTGCGCACCGGCAAGCGCATGCCGCAGAAGCTGTCGCAGATCG TCATCCACTTCAAGGAGCCGCCGCACTACATCTTCGCTCCCGAGCAGCGTTCGCTGATCAGC AACCGGCTGATCATCCGCCTGCAGCCGGACGAAGGTATCTCCCTGCAAGTGATGACCAAGGA CCAGGGCCTGGGCAAGGGCATGCAATTGCGTACCGGCCCGCTGCAACTGAG I I I I I CCGAG ACCTACCACGCGGCGCGGATTCCCGATGCCTACGAGCGTCTGCTGCTGGAGGTCACCCAGG GCAACCAGTACCTGTTCGTGCGCAAGGACGAGGTGGAGTTCGCCTGGAAGTGGTGCGACCA GCTGATCGCTGGCTGGGAACGCCTGAGCGAAGCGCCCAAGCCGTATCCGGCGGGGAGTTG GGGGCCGGTGGCCTCGGTGGCCCTGGTGGCCCGCGATGGGAGGAGTTGGTATGGCGATTT CTGA zwt1 from Z. mobilis
Amino Acid Sequence (SEQ ID NO: 129)
MTNTVSTMILFGSTGDLSQRMLLPSLYGLDADGLLADDLRIVCTSRSEYDTDGFRDFAEKALDRFV ASDRLNDDAKAKFLNKLFYATVDITDPTQFGKLADLCGPVEKGIAIYLSTAPSLFEGAIAGLKQAGLA GPTSRLALEKPLGQDLASSDHINDAVLKVFSEKQVYRIDHYLGKETVQNLLTLRFGNALFEPLWNS KG!DHVQISVAETVGLEGRIGYFDGSGSLRDMVQSHILQLVALVAMEPPAHMEANAVRDEKVKVF RALRPINNDTVFTHTVTGQYGAGVSGGKEVAGYIDELGQPSDTETFVAIKAHVDNWRWQGVPFYI RTGKRLPARRSEIVVQFKPVPHSIFSSSGG!LQPNKLRIVLQPDETIQISMMVKEPGLDRNGAHMRE VWLDLSLTDVFKDRKRRIAYERLMLDLIEGDATLFVRRDEVEAQWVWIDGIREGWKANSMKPKTY VSGTWGPSTAIALAERDGVTWYD
Nucleotide Sequence (SEQ ID NO: 130) ATG ACAAATACCGTTTCG ACG ATG ATATTGTTTGGCTCG ACTGGCG ACCTTTCACAGCGTATG CTGTTGCCGTCGCTTTATGGTCTTGATGCCGATGGTTTGCTTGCAGATGATCTGCGTATCGTC TGCACCTCTCGTAGCGAATACGACACAGATGGTTTCCGTGATTTTGCAGAAAAAGCTTTAGAT CGCTTTGTCGCTTCTGACCGGTTAAATGATGACGCTAAAGCTAAATTCCTTAACAAGCTTTTCT ACGCGACGGTCGATATTACGGATCCGACCCAATTCGGAAAATTAGCTGACCTTTGTGGCCCG GTCGAAAAAGGTATCGCCATTTATCTTTCGACTGCGCCTTCTTTGTTTGAAGGGGCAATCGCT GGCCTGAAACAGGCTGGTCTGGCTGGTCCAACTTCTCGCCTGGCGCTTGAAAAACCTTTAGG TCAAGATCTTGCTTCTTCCGATCATATTAATGATGCGGTTTTGAAAGTTTTCTCTGAAAAGCAA GTTTATCGTATTGACCATTATCTGGGTAAAG ACGGTTCAGAATCTTCTGACCCTGCGTTTTG GTAATGCTTTGTTTGAACCGCTTTGGAATTCAAAAGGCATTGACCACGTTCAGATCAGCGTTG CTGAAACGGTTGGTCTTGAAGGTCGTATCGGTTATTTCGACGGTTCTGGCAGCTTGCGCGATA TGGTTCAAAGCCATATCCTTCAGTTGGTCGCTTTGGTTGCAATGGAACCACCGGCTCATATGG AAGCCAACGCTGTTCGTGACGAAAAGGTAAAAGTTTTCCGCGCTCTGCGTCCGATCAATAACG ACACCGTCTTTACGCATACCGTTACCGGTCAATATGGTGCCGGTGTTTCTGGTGGTAAAGAAG TTGCCGGTTACATTGACGAACTGGGTCAGCCTTCCGATACCGAAACCTTTGTTGCTATCAAAG CGCATGTTGATAACTGGCGTTGGCAGGGTGTTCCGTTCTATATCCGCACTGGTAAGCGTTTAC CTGCACGTCGTTCTGAAATCGTGGTTCAGTTTAAACCTGTTCCGCATTCGATTTTCTCTTCTTC AGGTGGTATCTTGCAGCCGAACAAGCTGCGTATTGTCTTACAGCCTGATGAAACCATCCAGAT TTCTATGATGGTGAAAGAACCGGGTCTTGACCGTAACGGTGCGCATATGCGTGAAGTTTGGCT GGATCTTTCCCTCACGGATGTGTTTAAAGACCGTAAACGTCGTATCGCTTATGAACGCCTGAT GCTTGATCTTATCGAAGGCGATGCTACTTTATTTGTGCGTCGTGACGAAGTTGAGGCGCAGTG GGTTTGGATTGACGGAATTCGTGAAGGCTGGAAAGCCAACAGTATGAAGCCAAAAACCTATGT CTCTGGTACATGGGGGCCTTCAACTGCTATAGCTCTGGCCGAACGTGATGGAGTAACTTGGT ATGACTGA All the above genes were PCR amplified from their genomic DNA sources with and without c- terminal 6-HIS tags and cloned into the yeast expression vector p426GPD for testing.
Assays of candidate ZWF1 genes
Strain BY4742 zwf1 (ATCC Cat. No. 4011971; Winzeler EA, et al. Science 285: 901-906, 1999. PubMed: 10436161) was used as the base strain for all ZWF1 assays. The assays were performed as follows: A 5 ml overnight culture of the strain expressing the ZWF1 gene was grown in SCD-ura. A 50 ml culture of the strain was then grown for about 18 hours from an initial OD600 of about 0.2 until it had reached an OD600 of about 4. The cells were centrifuged at 1046 x g, washed twice with 25 ml cold sterile water, and resuspended in 2 ml/g Y-PER Plus (Thermo Scientific) plus 1X protease inhibitors (EDTA-free). The cells were allowed to lyse at room temperature for about 30 minutes with constant rotation of the tubes. The lysate was centrifuged at 16,100 x g for 10 minutes at 4°C and the supernatants were transferred to a new 1.5 ml microcentrifuge tube.
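The culture arithmetic implied by the protocol above can be sketched as follows. This is an illustrative calculation only: the function names, the overnight culture density and the pellet mass are assumptions and do not come from the specification.

```python
import math


def doublings(od_start: float, od_end: float) -> float:
    """Number of population doublings between two OD600 readings."""
    return math.log2(od_end / od_start)


def doubling_time_h(od_start: float, od_end: float, hours: float) -> float:
    """Average doubling time, assuming exponential growth throughout."""
    return hours / doublings(od_start, od_end)


def inoculum_volume_ml(od_overnight: float, od_target: float, culture_ml: float) -> float:
    """Volume of overnight culture needed to seed a culture at od_target."""
    return od_target * culture_ml / od_overnight


def lysis_buffer_ml(pellet_g: float, ml_per_g: float = 2.0) -> float:
    """Lysis reagent volume at the 2 ml per gram ratio stated in the text."""
    return pellet_g * ml_per_g


if __name__ == "__main__":
    # OD600 0.2 -> 4 in about 18 hours, as described in the text.
    print(round(doublings(0.2, 4.0), 2), "doublings")
    print(round(doubling_time_h(0.2, 4.0, 18.0), 2), "h per doubling")
    # Hypothetical overnight density (OD600 5.0) and pellet mass (1.5 g).
    print(inoculum_volume_ml(5.0, 0.2, 50.0), "ml inoculum")
    print(lysis_buffer_ml(1.5), "ml lysis reagent")
```

The doubling-time figure is an average over the whole growth period and ignores any lag phase.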
Quantification of the lysates was performed using the Coomassie-Plus kit (Thermo Scientific, San Diego, CA) as directed by the manufacturer.
Each kinetic assay was done using approximately 50 to 60 μg of crude extract in a reaction mixture containing 50 mM Tris-HCl, pH 8.9, and 1 mM NADP+ or NAD+. The reaction was started with 20 mM glucose-6-phosphate and was monitored at A340. The specific activity was measured as μmol substrate/min/mg protein. The results of the assays are presented in the table below.
Figure imgf000264_0001
His NADP+ 0.0139 0.9653 0.2739
NAD+ ND ND ND
P. fluorescens 2
NADP+ NA NA NA
P. fluorescens 2 + NAD+ NA NA NA
His NADP+ ND ND ND
NAD+ NA NA NA
PA01
NADP+ 0.0104 0.6466 0. 564
NAD+ 0.0074 0.0071 0.1098
PA01 + His
NADP+ 0.0123 3.9050 0.1823
NA = cannot be calculated (substrate not used by enzyme)
ND = was not determined (either not enough crude extract available or cells did not grow)
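A minimal sketch of how a specific activity in μmol/min/mg can be derived from an A340 trace is given here. It relies on the standard molar absorption coefficient of NADPH (and NADH) at 340 nm, 6.22 mM^-1 cm^-1. The reaction volume, the 1 cm path length and the example slope are assumptions, since the specification does not state them.

```python
NAD_P_H_EXT_COEFF = 6.22  # mM^-1 cm^-1 at 340 nm for NADPH and NADH


def specific_activity(delta_a340_per_min: float,
                      reaction_volume_ml: float,
                      protein_mg: float,
                      path_length_cm: float = 1.0) -> float:
    """Return specific activity in umol cofactor reduced per min per mg protein."""
    # Beer-Lambert law: concentration change (mM/min) = (dA/min) / (epsilon * l)
    rate_mm_per_min = delta_a340_per_min / (NAD_P_H_EXT_COEFF * path_length_cm)
    # mM multiplied by ml gives umol
    rate_umol_per_min = rate_mm_per_min * reaction_volume_ml
    return rate_umol_per_min / protein_mg


if __name__ == "__main__":
    # Hypothetical reading: 0.0622 A340 units/min, 1 ml reaction, 50 ug extract.
    print(specific_activity(0.0622, 1.0, 0.050))
```

One mole of NAD(P)H is formed per mole of glucose-6-phosphate oxidized, so the cofactor rate equals the substrate rate in this assay.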
Altering cofactor preference of S. cerevisiae ZWF1
ZWF1 from S. cerevisiae is an NADP+-only utilizing enzyme. Site-directed mutagenesis was used to alter ZWF1 so that the altered ZWF1 could also utilize NAD+, thereby improving the redox balance within the cell. Site-directed mutagenesis reactions were performed in the same manner for all mutations, and for mutants which include more than one mutation, each mutation was introduced sequentially. About 50 ng of plasmid DNA was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol site-directed mutagenesis specific primers, and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. The reaction was cycled at 95°C for 10 minutes, followed by 15 rounds of 95°C for 15 seconds, 55°C for 40 seconds, and 72°C for 3 minutes. A final 10 minute extension reaction at 72°C was also included. The PCR reaction mixture was then digested with 30 U of DpnI for about 2 hours and 5 μl of the digested PCR reaction mixture was used to transform competent DH5α (Zymo Research, Orange, CA) and plated onto LB plates containing the appropriate antibiotics. The table below lists mutants generated in a first round of mutagenesis.
Mutant #  zwf1sc                Codon changes
1         A24G                  GCA -> GGT
2         A24G/T28G             GCA -> GGT, ACT -> GGT
3         A51N                  GCC -> AAT
4         A51D                  GCC -> GAT
5         T28F                  ACT -> TTT
6         K46R                  AAG -> AGA
7         Y40L                  TAC -> TTG
8         F33Y                  TTT -> TAC
9         T28L                  ACT -> TTG
10        V16L                  GTC -> TTG
11        V13T                  GTC -> ACT
12        L66E                  CTA -> GAA
13        A24G/A51D             GCA -> GGT, GCC -> GAT
14        A24G/T28G/A51D        GCA -> GGT, ACT -> GGT, GCC -> GAT
15        R52D                  CGG -> GAT
16        A51D/R52A             GCC -> GAT, CGG -> GCT
17        A24G/A51D/R52A        GCA -> GGT, GCC -> GAT, CGG -> GCT
18        A24G/T28G/A51D/R52A   GCA -> GGT, ACT -> GGT, GCC -> GAT, CGG -> GCT
19        A51D/R52H             GCC -> GAT, CGG -> CAT
20        R52H                  CGG -> CAT
21        D22R                  GAT -> AGA
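The codon changes listed in the table can be checked against the standard genetic code. The sketch below is an illustrative consistency check, not part of the described method; the helper names are hypothetical, and the wild-type codon for F33Y is read as TTT (the only phenylalanine codon compatible with the three-character entry in the listing).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def check_substitution(name: str, old_codon: str, new_codon: str) -> bool:
    """True if a name such as 'A24G' agrees with the codon change GCA -> GGT."""
    wild_type, mutant = name[0], name[-1]
    return CODON_TABLE[old_codon] == wild_type and CODON_TABLE[new_codon] == mutant


# Single substitutions taken from the table above.
SUBSTITUTIONS = [
    ("A24G", "GCA", "GGT"), ("T28G", "ACT", "GGT"), ("A51N", "GCC", "AAT"),
    ("A51D", "GCC", "GAT"), ("T28F", "ACT", "TTT"), ("K46R", "AAG", "AGA"),
    ("Y40L", "TAC", "TTG"), ("F33Y", "TTT", "TAC"), ("T28L", "ACT", "TTG"),
    ("V16L", "GTC", "TTG"), ("V13T", "GTC", "ACT"), ("L66E", "CTA", "GAA"),
    ("R52D", "CGG", "GAT"), ("R52A", "CGG", "GCT"), ("R52H", "CGG", "CAT"),
    ("D22R", "GAT", "AGA"),
]

if __name__ == "__main__":
    for name, old, new in SUBSTITUTIONS:
        status = "ok" if check_substitution(name, old, new) else "MISMATCH"
        print(f"{name}: {old} -> {new} {status}")
```

The check only confirms that each codon pair encodes the named residues; it does not confirm residue numbering against the ZWF1 sequence.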
The oligonucleotides utilized to generate the mutants listed in the table above are listed in the table below. All oligonucleotides were purchased from Integrated DNA Technologies (IDT).
Figure imgf000266_0001
pBF300  ka/zwf1sc_Y39Lrev   tcttggtagatggatcaagcaaaccttctctgaaaagccc
pBF300  ka/zwf1sc_F33Yfor   gaagaagacttttcccgccttatacgggcttttcagagaag
pBF300  ka/zwf1sc_F33Yrev   cttctctgaaaagcccgtataaggcgggaaaagtcttcttc
pBF300  ka/zwf1sc_T28Lfor   gtcaggtgatctggcaaagaagaagttgtttcccgccttatttgg
pBF300  ka/zwf1sc_T28Lrev   ccaaataaggcgggaaacaacttcttctttgccagatcacctgac
pBF300  ka/zwf1sc_V16Lfor   cgaaaaaaataccgtcatatctttgtttggtgcgtcaggtgatctg
pBF300  ka/zwf1sc_V16Lrev   cagatcacctgacgcaccaaacaaagatatgacggtatttttttcg
pBF300  ka/zwf1sc_L66Efor   gacctgaagtcccgtgtcgaaccccacttgaaaaaacc
pBF300  ka/zwf1sc_L66Erev   ggttttttcaagtggggttcgacacgggacttcaggtc
pBF374  ka/zwf1sc_A24Gfor   gtgcgtcaggtgatctgggtaagaagaagacttttccc
pBF374  ka/zwf1sc_A24Grev   gggaaaagtcttcttcttacccagatcacctgacgcac
pBF374  ka/zwf1sc_A24Gfor   gtgcgtcaggtgatctgggtaagaagaagacttttccc
pBF374  ka/zwf1sc_A24Grev   gggaaaagtcttcttcttacccagatcacctgacgcac
pBF300  KA/zwf1mut15for     accaagatcttcggttatgccgattccaaattgtccatggaggag
pBF300  KA/zwf1mut15rev     ctcctccatggacaatttggaatcggcataaccgaagatcttggt
pBF374  KA/zwf1mut16for     tccatctaccaagatcttcggttatgatgcttccaaattgtccatggaggaggac
pBF374  KA/zwf1mut16rev     gtcctcctccatggacaatttggaagcatcataaccgaagatcttggtagatgga
pBF441  KA/zwf1mut16for     tccatctaccaagatcttcggttatgatgcttccaaattgtccatggaggaggac
pBF441  KA/zwf1mut16rev     gtcctcctccatggacaatttggaagcatcataaccgaagatcttggtagatgga
pBF442  KA/zwf1mut16for     tccatctaccaagatcttcggttatgatgcttccaaattgtccatggaggaggac
pBF442  KA/zwf1mut16rev     gtcctcctccatggacaatttggaagcatcataaccgaagatcttggtagatgga
pBF374  KA/zwf1sc_mut19for  aagatcttcggttatgatcattccaaattgtccatggagg
pBF374  KA/zwf1sc_mut19rev  cctccatggacaatttggaatgatcataaccgaagatctt
pBF300  KA/zwf1sc_mut20for  aagatcttcggttatgcccattccaaattgtccatggagg
pBF300  KA/zwf1sc_mut20rev  cctccatggacaatttggaatgggcataaccgaagatctt
Initial kinetic screening of the ZWF1 mutants, generated as described above, identified the following altered ZWF1 genes and preliminary cofactor phenotypes.
Figure imgf000268_0001
ND = not determined
Mutants 4 (A51D) and 13 (A24G/A51D) were identified as mutants which enabled NAD+ utilization with concomitant loss of NADP+ utilization.
Cloning of SOL3
The SOL3 gene from S. cerevisiae was cloned as follows. The approximately 750 bp SOL3 gene was PCR amplified from the BY4742 genome using primers KAS/5-SOL3-NheI and KAS/3'-SOL3-SalI, shown below.
KAS/5-SOL3-NheI gctagcatggtgacagtcggtgtgttttctgag
KAS/3'-SOL3-SalI gtcgacctaaaaagttttcgtttgaacttttcc
About 100 ng of genomic DNA from S. cerevisiae strain BY4742 was added to 1X Pfu Ultra II buffer, 0.3 mM dNTPs, 0.3 μmol gene-specific primers, and 1 U Pfu Ultra II polymerase (Agilent, La Jolla, CA) in a 50 μl reaction mix. The reaction was cycled at 95°C for 10 minutes, followed by 30 rounds of 95°C for 20 seconds, 55°C for 30 seconds, and 72°C for 15 seconds. A final 5 minute extension reaction at 72°C was also included. The amplified product was TOPO cloned into the pCR Blunt II TOPO vector (Life Technologies, Carlsbad, CA) according to the manufacturer's recommendations and sequence verified (GeneWiz, San Diego, CA). The resultant plasmid was designated pBF301. The sequence of the S. cerevisiae SOL3 gene is given below.
S. cerevisiae SOL3 (SEQ ID NO: 131 ) ATGGTGACAGTCGGTGTGTTTTCTGAGAGGGCTAGTTTGACCCATCAATTGGGGGAATTCATCGTCAAGAAAC AAGATGAGGCGCTGCAAAAGAAGTCAGACTTTAAAGTTTCCGTTAGCGGTGGCTCTTTGATCGATGCTCTGTA TGAAAGTTTAGTAGCGGACGAATCACTATCTTCTCGAGTGCAATGGTCTAAATGGCAAATCTACTTCTCTGAT GAAAGAATTGTGCCACTGACGGACGCTGACAGCAATTATGGTGCCTTCAAGAGAGCTGTTCTAGATAAATTAC CCTCGACTAGTCAGCCAAACGTTTATCCCATGGACGAGTCCTTGATTGGCAGCGATGCTGAATCTAACAACAA AATTGCTGCAGAGTACGAGCGTATCGTACCTCAAGTGCTTGATTTGGTACTGTTGGGCTGTGGTCCTGATGGA CACACTTGTTCCTTATTCCCTGGAGAAACACATAGGTACTTGCTGAACGAAACAACCAAAAGAGTTGCTTGGT GCCACGATTCTCCCAAGCCTCCAAGTGACAGAATCACCTTCACTCTGCCTGTGTTGAAAGACGCCAAAGCCCT GTGTTTTGTGGCTGAGGGCAGTTCCAAACAAAATATAATGCATGAGATCTTTGACTTGAAAAACGATCAATTG CCAACCGCATTGGTTAACAAATTATTTGGTGAAAAAACATCCTGGTTCGTTAATGAGGAAGCTTTTGGAAAAG TTCAAACGAAAACTTTTTAG
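As an illustrative check (not part of the described method), the two SOL3 primers can be split into a restriction-site tail and a gene-specific portion, and the gene-specific portions compared with the ends of the SOL3 listing. The recognition sequences for NheI (GCTAGC) and SalI (GTCGAC) are standard; the helper names and the choice to compare only the terminal 27 nucleotides are assumptions made for this sketch.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def strip_site(primer: str, site: str) -> str:
    """Remove a 5' restriction site from a primer, returning the annealing part."""
    primer = primer.upper()
    if not primer.startswith(site):
        raise ValueError(f"primer does not start with site {site}")
    return primer[len(site):]


SITES = {"NheI": "GCTAGC", "SalI": "GTCGAC"}

FWD_PRIMER = "gctagcatggtgacagtcggtgtgttttctgag"   # KAS/5-SOL3-NheI
REV_PRIMER = "gtcgacctaaaaagttttcgtttgaacttttcc"   # KAS/3'-SOL3-SalI

# First and last 27 nt of the SOL3 listing (SEQ ID NO: 131).
SOL3_START = "ATGGTGACAGTCGGTGTGTTTTCTGAG"
SOL3_END = "GGAAAAGTTCAAACGAAAACTTTTTAG"

if __name__ == "__main__":
    fwd_anneal = strip_site(FWD_PRIMER, SITES["NheI"])
    rev_anneal = strip_site(REV_PRIMER, SITES["SalI"])
    print("forward matches 5' end:", fwd_anneal == SOL3_START)
    print("reverse matches 3' end:", reverse_complement(rev_anneal) == SOL3_END)
```

Both primers carry their restriction site at the 5' end, so the amplified fragment is flanked by NheI and SalI sites.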
The NheI-SalI SOL3 gene fragment from plasmid pBF301 will be cloned into the SpeI-XhoI site in plasmids p413GPD and p423GPD (HIS3 marker-based plasmids; ATCC 87354 and ATCC 87355).
Testing of ZWF1/SOL3 combinations in BY4742
A URA blaster cassette was digested with NotI and ligated into the MET17 integration cassette plasmid pBF691 to generate the MET17 knockout plasmid pBF772. Plasmid pBF772 was digested with PacI and linear fragments were purified with a Zymo PCR purification kit (Zymo Research, Orange, CA) and concentrated in 10 μl ddH2O. LiCl2 high efficiency transformation was performed as described. About 1 μg of the linear MET17 knockout fragment was transformed into 50 μl of freshly made BY4742 competent cells and the cells were plated onto SCD-Ura plates and incubated at 30°C for about 2-3 days. A single URA+ colony was streaked out on a SCD-Ura plate and grown at 30°C for about 2-3 days. A single colony was inoculated in YPD medium and grown overnight at 30°C. 50 μl of the overnight culture was then plated onto SCD complete-5FOA plates and incubated at 30°C for about 3 days.
A single colony which grew on SCD complete-5FOA plates was then picked and inoculated in YPD medium and grown at 30°C overnight. Yeast genomic DNA was extracted with the YeaStar genomic extraction kit (Zymo Research, Orange, CA) and the strain was confirmed by PCR using primers JML/237 and JML/238, shown below.
JML/237: CCAACACTAAGAAATAATTTCGCCATTTCTTG
JML/238: GCCAACAATTAAATCCAAGTTCACCTATTCTG
The PCR amplification was performed as follows: 10 ng of yeast genomic DNA with 0.1 μmol gene-specific primers, 1X Pfu Ultra II buffer, 0.2 mmol dNTPs, and 0.2 U Taq DNA polymerase. The PCR mixture was cycled at 95°C for 2 minutes, followed by 30 cycles of 95°C for 20 seconds, 55°C for 30 seconds and 72°C for 45 seconds. A final step of 72°C for 5 minutes was also included. The resultant strain was designated BF1618.
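The total hold time of a cycling program such as the one above follows directly from its steps. The sketch below is a simple arithmetic illustration; the function name is hypothetical, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
def program_hold_seconds(initial_s: int, cycles: int, step_seconds, final_s: int) -> int:
    """Sum of programmed hold times: initial step + cycles * (steps) + final step."""
    return initial_s + cycles * sum(step_seconds) + final_s


# Confirmation PCR from the text: 95 C 2 min; 30 x (20 s, 30 s, 45 s); 72 C 5 min.
CONFIRMATION_PCR_S = program_hold_seconds(2 * 60, 30, (20, 30, 45), 5 * 60)

if __name__ == "__main__":
    print(CONFIRMATION_PCR_S, "s =", CONFIRMATION_PCR_S / 60, "min")
```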
Strain BF1618 is undergoing transformation with the following plasmid combinations. Additionally, the effect of the ZWF1 mutant constructs will also be evaluated with and without SOL3 constructs. The table below shows the plasmid combinations being transformed into strain BF1618.
Figure imgf000270_0001
Strains with improved ethanol production may benefit from two or more copies of the ZWF1 gene due to increased flux of the carbon towards the alternative pathway. A strain embodiment currently under construction has the phenotype: pfk1, ZWF1, SOL3, tal1, EDD-PAO1*, EDA-E. coli*, where the "*" represents additional copies of the gene. It is believed that multiple copies of the EDD and EDA genes may provide additional increases in ethanol production.
Example 28: Identification of Additional Xylose Isomerase 5' ends that can increase expression levels of Ruminococcus Xylose Isomerase in Yeast.
To determine if the 5' end of other xylose isomerase genes could also be used to increase the expression of the Ruminococcus xylose isomerase in yeast, as demonstrated herein for the 5' end of Piromyces xylose isomerase, additional chimeric molecules were generated as described herein, using approximately 10 amino acids from the xylose isomerase genes described in Example 25. The alternate xylose isomerase gene 5' ends were selected from xylose isomerase sequences previously shown to be expressed and active in yeast. The xylose isomerase gene donors and the 5' end of the nucleotide sequence from each are presented in the table below.
Figure imgf000271_0001
Bacillus stearothermophilus ATGGCTTATTTTCCGAATATCGGCAAGATTGCGTATGAAGGGCCGGAGTCGCGCA
ATCCGTTGGCGTTTAAGTTTTATAATCCAGAAGAAAAAGTCGGCGACAAAACAATG
GAGGAGCATTTGCGCTTTTCAGTGGCCTATTGGCATACGTTTACGGGGGATGGGT
CGGATCCGTTTGGCGTCGGCAATATGATTCGTCCATGGAATAAGTACAGCGGCAT
GGATCTGGCGAAGGCGCGCGTCGAGGCGGCGTTTGAGCTGTTTGAAAAGCTGAA
CGTTCCGTTTTTCTGCTTCCATGACGTCGACATCGCGCCGGAAGGGGAAACGCTC
AGCGAGACGTACAAAAATTTGGATGAAATTGTCGATATGATTGAAGAATACATGAA
AACAAGCAAAACGAAGCTGCTTTGGAATACGGCGAACTTGTTCAGCCATCCGCGC
TTCGTTCAC
Bacillus uniformis ATGGCTACCAAGGAATACTTCCCAGGTATTGGTAAGATCAAATTCGAAGGTAAGGA
ATCCAAGAACCCAATGGCCTTCAGATACTACGATGCTGACAAGGTTATCATGGGTA
AGAAGATGTCTGAATGGTTAAAGTTCGCTATGGCTTGGTGGCATACCTTGTGTGCT
GAAGGTGGTGACCAATTCGGTGGTGGTACCAAGAAATTCCCATGGAACGGTGAAG
CTGACAAGGTCCAAGCTGCTAAGAACAAGATGGACGCTGGTTTCGAATTTATGCAA
AAGATGGGTATTGAATACTACTGTTTCCACGATGTTGACTTGTGTGAAGAAGCTGA
AACCATCGAAGAATACGAAGCTAACTTGAAGGAAATTGTTGCTTACGCTAAGCAAA
AGCAAGCTGAAACTGGTATCAAGCTATTATGGGGTACTGCTAACGTCTTTGGTCAT
GCCAGATACATGAAC
Clostridium cellulolyticum ATGTCAGAAGTATTTAGCGGTATTTCAAACATTAAATTTGAAGGAAGCGGGTCAGA
TAATCCATTAGCTTTTAAGTACTATGACCCTAAGGCAGTTATCGGCGGAAAGACAA
TGGAAGAACATCTGAGATTCGCAGTTGCCTACTGGCATACTTTTGCAGCACCAGGT
GCTGACATGTTCGGTGCAGGATCATATGTAAGACCTTGGAATACAATGTCCGATCC
TCTGGAAATTGCAAAATACAAAGTTGAAGCAAACTTTGAATTCATTGAAAAGCTGG
GAGCACCTTTCTTCGCTTTCCATGACAGGGATATTGCTCCTGAAGGCGACACACTC
GCTGAAACAAATAAAAACCTTGATACAATAGTTTCAGTAATTAAAGATAGAATGAAA
TCCAGTCCGGTAAAGTTATTATGGGGAACTACAAATGCTTTCGGAAACCCAAGATT
TATGCAT
Ruminococcus flavefaciens FD 1 ATGGAATTTTTCAAGAACATAAGCAAGATCCCTTACGAGGGCAAGGACAGCACAAA
TCCTCTCGCATTCAAGTACTACAATCCTGATGAGGTAATTGACGGCAAGAAGATGC
GTGACATTATGAAGTTTGCTCTCTCATGGTGGCATACAATGGGCGGCGACGGAAC
AGATATGTTCGGCTGCGGTACAGCTGACAAGACATGGGGCGAAAATGATCCTGCT
GCAAGAGCTAAGGCTAAGGTTGACGCAGCTTTCGAGATCATGCAGAAGCTCTCTA
TCGATTACTTCTGTTTCCACGACCGTGATCTTTCTCCTGAGTACGGCTCACTGAAG
GACACAAACGCTCAGCTGGACATCGTTACAGATTACATCAAGGCTAAGCAGGCTG
AGACAGGTCTCAAGTGCCTCTGGGGTACAGCTAAGTGCTTCGATCACCCAAGATT
CATGCAC
Ruminococcus 18P13 ATGAGCGAATTTTTTACAGGCATTTCAAAGATCCCCTTTGAGGGAAAGGCATCCAA
CAATCCCATGGCGTTCAAGTACTACAACCCGGATGAGGTCGTAGGCGGCAAGACC
ATGCGGGAGCAGCTGAAGTTTGCGCTGTCCTGGTGGCATACTATGGGGGGAGAC
GGTACGGACATGTTTGGTGTGGGTACCACCAACAAGAAGTTCGGCGGAACCGATC
CCATGGACATTGCTAAGAGAAAGGTAAACGCTGCGTTTGAGCTGATGGACAAGCT
GTCCATCGATTATTTCTGTTTCCACGACCGGGATCTGGCGCCGGAGGCTGATAAT
CTGAAGGAAACCAACCAGCGTCTGGATGAAATCACCGAGTATATTGCACAGATGA
TGCAGCTGAACCCGGACAAGAAGGTTCTGTGGGGTACTGCAAATTGCTTCGGCAA
TCCCCGGTATATGCAT
Clostridiales genomosp. BVAB3 str. UPII9-5 ATGAAATTTTTTGAAAATGTCCCTAAGGTAAAATATGAGGGAAGCAAGTCTACCAA
CCCGTTTGCATTTAAGTATTACAATCCTGAAGCGGTGATTGCCGGTAAAAAAATGA
AGGATCACCTGAAATTCGCGATGTCCTGGTGGCACACCATGACGGCGACCGGGC
AAGACCAGTTCGGTTCGGGGACGATGAGCCGAATATATGACGGGCAAACTGAACC
GCTGGCCTTGGCCAAAGCCCGAGTGGATGCGGCTTTCGATTTCATGGAAAAATTA
AATATCGAATATTTTTGTTTTCATGATGCCGACTTGGCTCCAGAAGGTAACAGTTTG
CAGGAACGCAACGAAAATTTGCAGGAAATGGTGTCTTACCTGAAACAAAAGATGG
CCGGAACTTCGATTAAGCTTTTATGGGGAACCTCGAATTGTTTCAGCAACCCTCGT
TTTATGCAC
The first 10 amino acids (30 bp) of XI-R were replaced with the 5' edge from the xylose isomerase genes presented in the table above using a single oligonucleotide in a PCR reaction, as described herein. The oligonucleotides used for the PCR reactions are shown in the table below. The last 2 oligonucleotides were used as 3' oligonucleotides to amplify each resulting chimeric molecule with or without a C-terminal 6-HIS tag.
Figure imgf000273_0001
Each new PCR product was TOPO cloned using a pCR Blunt II vector (Invitrogen), verified by sequencing and subcloned into p426GPD, also as described herein. The resulting plasmids were transformed into BY4742 (S. cerevisiae) and selected on SCD-ura medium. Assays to detect levels of expressed xylose isomerase were performed as described herein.
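The 5' edge exchange described in this example amounts to replacing the first 30 nucleotides (10 codons) of the acceptor gene with the first 30 nucleotides of a donor gene. The following in-silico sketch illustrates that operation only; the function name is hypothetical, the donor string is the first 30 nucleotides of the Bacillus uniformis entry in the table above, and the acceptor is a made-up placeholder rather than the real XI-R sequence.

```python
def swap_five_prime_edge(donor: str, acceptor: str, codons: int = 10) -> str:
    """Return acceptor with its first `codons` codons replaced by the donor's."""
    n = 3 * codons
    donor, acceptor = donor.upper(), acceptor.upper()
    if len(donor) < n or len(acceptor) < n:
        raise ValueError("both sequences must contain at least the swapped codons")
    if not donor.startswith("ATG"):
        raise ValueError("donor 5' edge must begin with a start codon")
    return donor[:n] + acceptor[n:]


# First 30 nt of the B. uniformis xylose isomerase 5' end from the table above.
B_UNIFORMIS_EDGE = "ATGGCTACCAAGGAATACTTCCCAGGTATT"

# Hypothetical placeholder acceptor (NOT the real XI-R gene): ATG, 12 GCA codons, stop.
PLACEHOLDER_ACCEPTOR = "ATG" + "GCA" * 12 + "TAA"

if __name__ == "__main__":
    chimera = swap_five_prime_edge(B_UNIFORMIS_EDGE, PLACEHOLDER_ACCEPTOR)
    print(chimera)
```

Because whole codons are exchanged, the reading frame of the acceptor is preserved and the chimera has the same length as the acceptor.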
Results
Each of the new chimeric genes was evaluated for expression against the native Ruminococcus xylose isomerase gene. Each chimeric variant (e.g., 5' end of an alternate XI donor gene attached to the Ruminococcus acceptor gene) was evaluated under saturating xylose conditions (e.g., 500 mM), using 20 μg crude extract. The assays were repeated several times and the results are presented graphically in FIG. 21. As shown in FIG. 21, the 5' edge of 4 of the 10 genes tested showed increased expression in yeast, with respect to the native Ruminococcus xylose isomerase control. The 5' edges (e.g., 5' ends) that showed increased expression or improved the activity of the Ruminococcus xylose isomerase gene were from Orpinomyces, Bacteroides thetaiotaomicron, Bacillus stearothermophilus, and B. uniformis. These results suggest that exchanging the 5' edge of a xylose isomerase gene with low expression and/or activity for the 5' edge of a different xylose isomerase gene can be used as a method to improve the activity and/or expression of xylose isomerase genes when expressed in eukaryotes such as yeast. 5' edge nucleic acid sequences that can improve activity or expression are not necessarily associated with native xylose isomerase genes that themselves show high levels of expression. Therefore, these results also suggest that an "ideal" chimera can be created for organism specific expression of xylose isomerase, using the method described herein with routine levels of experimentation to determine the best 5' edges and best acceptor gene combinations for a specific organism.
The top 4 chimeric variants (e.g., new 5' edges combined with the Ruminococcus xylose isomerase acceptor gene) were further analyzed using a full kinetic assay using varying xylose concentrations ranging from about 40 mM to about 500 mM. The results are presented in the table below.
Figure imgf000274_0001
These results demonstrate that each of these 5' edge replacements confers increased activity to the native XI-R enzyme, with the XI-R-Bun10 enzyme being the most active. The results of western blots are presented in FIG. 22. The western blot analysis presented in FIG. 22 shows the levels of expression of each chimeric construct in total crude extract and the soluble portion of the crude extract. The results of the western blot analysis are in good agreement with the results of the kinetic assays. G179A mutations, similar to those generated in the Ruminococcus native and chimeric genes described herein, are being generated for the 4 alternate 5' edge chimeras described in this example.
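The specification does not state how kinetic parameters were derived from the full kinetic assay. One common option, shown here purely as an assumed illustration, is a Lineweaver-Burk (double reciprocal) fit of initial rates measured across the 40 mM to 500 mM xylose range. The data in the sketch are synthetic, generated from chosen Vmax and Km values, and are not results from the specification.

```python
def michaelis_menten(s_mm: float, vmax: float, km_mm: float) -> float:
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s_mm / (km_mm + s_mm)


def fit_lineweaver_burk(substrate_mm, rates):
    """Least-squares line through (1/S, 1/v); returns (vmax, km).

    1/v = (Km/Vmax) * (1/S) + 1/Vmax
    """
    xs = [1.0 / s for s in substrate_mm]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Synthetic, noise-free example (assumed Vmax = 1.0, Km = 100 mM).
XYLOSE_MM = [40.0, 80.0, 160.0, 320.0, 500.0]
SYNTHETIC_RATES = [michaelis_menten(s, 1.0, 100.0) for s in XYLOSE_MM]

if __name__ == "__main__":
    vmax, km = fit_lineweaver_burk(XYLOSE_MM, SYNTHETIC_RATES)
    print(f"Vmax = {vmax:.3f}, Km = {km:.1f} mM")
```

With real, noisy measurements a nonlinear fit of the Michaelis-Menten equation is often preferred, because the double reciprocal transform gives disproportionate weight to the low-substrate points.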
Example 29: Increased Expression of Ribulose-5-phosphate ketol-isomerase and Ribulose-5-phosphate-3-epimerase.
Ribulose-5-phosphate ketol-isomerase (RKI1) and ribulose-5-phosphate-3-epimerase (RPE1) catalyze reactions in the non-oxidative portion of the pentose phosphate pathway. Ribulose-5-phosphate ketol-isomerase catalyzes the interconversion of ribulose-5-phosphate and ribose-5-phosphate. Ribulose-5-phosphate-3-epimerase catalyzes the interconversion of ribulose-5-phosphate and xylulose-5-phosphate. Increasing the activity of one or both of these enzymes can lead to increased ethanol production. Ribulose-5-phosphate ketol-isomerase activity and ribulose-5-phosphate-3-epimerase activity each can be independently provided by a polypeptide. In some embodiments, the polypeptide is encoded by a heterologous nucleotide sequence introduced to a host microorganism. Nucleic acid sequences conferring ribulose-5-phosphate ketol-isomerase activity and ribulose-5-phosphate-3-epimerase activity can be obtained from a number of sources, including, but not limited to, S. cerevisiae, Kluyveromyces, Pichia, Escherichia, Bacillus, Ruminococcus, Schizosaccharomyces, and Candida.
Examples of an amino acid sequence of a polypeptide having ribulose-5-phosphate ketol-isomerase activity or ribulose-5-phosphate-3-epimerase activity, and a nucleotide sequence of a polynucleotide that encodes the respective polypeptide, are presented below. Increased activity of ribulose-5-phosphate ketol-isomerase and ribulose-5-phosphate-3-epimerase can be achieved using any suitable method. Non-limiting examples of methods suitable for adding, amplifying or over expressing ribulose-5-phosphate ketol-isomerase activity, ribulose-5-phosphate-3-epimerase activity, or both activities include amplifying the number of RKI1 and/or RPE1 gene(s) in yeast following transformation with a high-copy number plasmid (e.g., one containing a 2 μm origin of replication), integration of multiple copies of RKI1 and/or RPE1 gene(s) into the yeast genome, over-expression of the RKI1 and/or RPE1 gene(s) directed by a strong promoter, the like or combinations thereof. Presence, absence or amount of 6-phosphogluconolactonase activity can be detected by any suitable method known in the art, including nucleic acid based analysis and western blot analysis.
RKI1 nucleotide sequence
ATGGCTGCCGGTGTCCCAAAAATTGATGCGTTAGAATCTTTGGGCAATCCTTTGGAGGAT GCCAAGAGAGCTGCAGCATACAGAGCAGTTGATGAAAATTTAAAATTTGATGATCACAAA ATTATTGGAATTGGTAGTGGTAGCACAGTGGTTTATGTTGCCGAAAGAATTGGACAATAT TTGCATGACCCTAAATTTTATGAAGTAGCGTCTAAATTCATTTGCATTCCAACAGGATTC CAATCAAGAAACTTGATTTTGGATAACAAGTTGCAATTAGGCTCCATTGAACAGTATCCT CGCATTGATATAGCGTTTGACGGTGCTGATGAAGTGGATGAGAATTTACAATTAATTAAA GGTGGTGGTGCTTGTCTATTTCAAGAAAAATTGGTTAGTACTAGTGCTAAAACCTTCATT GTCGTTGCTGATTCAAGAAAAAAGTCACCAAAACATTTAGGTAAGAACTGGAGGCAAGGT GTTCCCATTGAAATTGTACCTTCCTCATACGTGAGGGTCAAGAATGATCTATTAGAACAA TTGCATGCTGAAAAAGTTGACATCAGACAAGGAGGTTCTGCTAAAGCAGGTCCTGTTGTA ACTGACAATAATAACTTCATTATCGATGCGGATTTCGGTGAAATTTCCGATCCAAGAAAA TTGCATAGAGAAATCAAACTGTTAGTGGGCGTGGTGGAAACAGGTTTATTCATCGACAAC GCTTCAAAAGCCTACTTCGGTAATTCTGACGGTAGTGTTGAAGTTACCGAAAAGTGA
RKI1 amino acid sequence
MAAGVPKI DALESLGNPLEDAKRAAAYRAVDE LKFDDHKI IGIGSGSTWYVAERIGQY LHDPKFYEVASKFICIPTGFQSRNLILDNKLQLGSIEQYPRIDIAFDGADEVDENLQLIK GGGACLFQEKLVSTSAKTFIWADSRKKSPKHLG NWRQGVPIEIVPSSYVRVKNDLLEQ LHAE VDIRQGGSAKAGPWTDNNNFI IDADFGEISDPR LHREIKLLVGWETGLFIDN AS AYFGNSDGSVEVTEK
RPE1 nucleotide sequence
ATGGTCAAACCAATTATAGCTCCCAGTATCCTTGCTTCTGACTTCGCCAACTTGGGTTGC GAATGTCATAAGGTCATCAACGCCGGCGCAGATTGGTTACATATCGATGTCATGGACGGC CATTTTGTTCCAAACATTACTCTGGGCCAACCAATTGTTACCTCCCTACGTCGTTCTGTG CCACGCCCTGGCGATGCTAGCAACACAGAAAAGAAGCCCACTGCGTTCTTCGATTGTCAC ATGATGGTTGAAAATCCTGAAAAATGGGTCGACGATTTTGCTAAATGTGGTGCTGACCAA TTTACGTTCCACTACGAGGCCACACAAGACCCTTTGCATTTAGTTAAGTTGATTAAGTCT AAGGGCATCAAAGCTGCATGCGCCATCAAACCTGGTACTTCTGTTGACGTTTTATTTGAA CTAGCTCCTCATTTGGATATGGCTCTTGTTATGACTGTGGAACCTGGGTTTGGAGGCCAA AAATTCATGGAAGACATGATGCCAAAAGTGGAAACTTTGAGAGCCAAGTTCCCCCATTTG AATATCCAAGTCGATGGTGGTTTGGGCAAGGAGACCATCCCGAAAGCCGCCAAAGCCGGT GCCAACGTTATTGTCGCTGGTACCAGTGTTTTCACTGCAGCTGACCCGCACGATGTTATC TCCTTCATGAAAGAAGAAGTCTCGAAGGAATTGCGTTCTAGAGATTTGCTAGATTAG
RPE1 amino acid sequence
MVKPI IAPSI LASDFANLGCECHKVI AGADWLHI DVMDGHFVPNITLGQPIVTSLRRSV PRPGDASNTEKKPTAFFDCH MVENPEKWVDDFAKCGADQFTFHYEATQDPLHLVKLI S GIKAACAI PGTSVDVLFELAPHLDMALV TVEPGFGGQKF EDM PKVETLRAKFPHL IQVDGGLGKETI P AAKAGANVIVAGTSVFTAADPHDVI SFMKEEVSKELRSRDLLD
Example 30: Xylulokinase over expression
As described herein, metabolism of xylose as a carbon source, either by xylose isomerase or the combination of xylose reductase and xylitol dehydrogenase, produces xylulose, which must be phosphorylated to enter the pentose phosphate pathway. Increased ethanol fermentation via the over expression of xylose isomerase or xylose reductase and xylitol dehydrogenase also may be further enhanced by the over expression of xylulokinase, in some embodiments. Presented herein are the nucleotide and amino acid sequence of the S. cerevisiae xylulokinase (XKS1 ) gene. The activity of xylulokinase was increased using methods described herein (e.g., strong promoter, multiple copies, the like and combinations thereof). The XKS1 gene of S. cerevisiae is functionally similar to the XYL3 gene of Pichia stipitis.
XKS1 ATGTTGTGTTCAGTAATTCAGAGACAGACAAGAGAGGTTTCCAACACAATGTCTTTAGAC TCATACTATCTTGGGTTTGATCTTTCGACCCAACAACTGAAATGTCTCGCCATTAACCAG GACCTAAAAATTGTCCATTCAGAAACAGTGGAATTTGAAAAGGATCTTCCGCATTATCAC ACAAAGAAGGGTGTCTATATACACGGCGACACTATCGAATGTCCCGTAGCCATGTGGTTA GAGGCTCTAGATCTGGTTCTCTCGAAATATCGCGAGGCTAAATTTCCATTGAACAAAGTT ATGGCCGTCTCAGGGTCCTGCCAGCAGCACGGGTCTGTCTACTGGTCCTCCCAAGCCGAA TCTCTGTTAGAGCAATTGAATAAGAAACCGGAAAAAGATTTATTGCACTACGTGAGCTCT GTAGCATTTGCAAGGCAAACCGCCCCCAATTGGCAAGACCACAGTACTGCAAAGCAATGT CAAGAGTTTGAAGAGTGCATAGGTGGGCCTGAAAAAATGGCTCAATTAACAGGGTCCAGA GCCCATTTTAGATTTACTGGTCCTCAAATTCTGAAAATTGCACAATTAGAACCAGAAGCT TACGAAAAAACAAAGACCATTTCTTTAGTGTCTAATTTTTTGACTTCTATCTTAGTGGGC CATCTTGTTGAATTAGAGGAGGCAGATGCCTGTGGTATGAACCTTTATGATATACGTGAA AGAAAATTCAGTGATGAGCTACTACATCTAATTGATAGTTCTTCTAAGGATAAAACTATC AGACAAAAATTAATGAGAGCACCCATGAAAAATTTGATAGCGGGTACCATCTGTAAATAT TTTATTGAGAAGTACGGTTTCAATACAAACTGCAAGGTCTCTCCCATGACTGGGGATAAT TTAGCCACTATATGTTCTTTACCCCTGCGGAAGAATGACGTTCTCGTTTCCCTAGGAACA AGTACTACAGTTCTTCTGGTCACCGATAAGTATCACCCCTCTCCGAACTATCATCTTTTC ATTCATCCAACTCTGCCAAACCATTATATGGGTATGATTTGTTATTGTAATGGTTCTTTG GCAAGGGAGAGGATAAGAGACGAGTTAAACAAAGAACGGGAAAATAATTATGAGAAGACT AACGATTGGACTCTTTTTAATCAAGCTGTGCTAGATGACTCAGAAAGTAGTGAAAATGAA TTAGGTGTATATTTTCCTCTGGGGGAGATCGTTCCTAGCGTAAAAGCCATAAACAAAAGG GTTATCTTCAATCCAAAAACGGGTATGATTGAAAGAGAGGTGGCCAAGTTCAAAGACAAG AGGCACGATGCCAAAAATATTGTAGAATCACAGGCTTTAAGTTGCAGGGTAAGAATATCT CCCCTGCTTTCGGATTCAAACGCAAGCTCACAACAGAGACTGAACGAAGATACAATCGTG AAGTTTGATTACGATGAATCTCCGCTGCGGGACTACCTAAATAAAAGGCCAGAAAGGACT TTTTTTGTAGGTGGGGCTTCTAAAAACGATGCTATTGTGAAGAAGTTTGCTCAAGTCATT GGTGCTACAAAGGGTAATTTTAGGCTAGAAACACCAAACTCATGTGCCCTTGGTGGTTGT TATAAGGCCATGTGGTCATTGTTATATGACTCTAATAAAATTGCAGTTCCTTTTGATAAA TTTCTGAATGACAATTTTCCATGGCATGTAATGGAAAGCATATCCGATGTGGATAATGAA AATTGGGATCGCTATAATTCCAAGATTGTCCCCTTAAGCGAACTGGAAAAGACTCTCATC TAA
XKS1 amino acid sequence
MLCSVIQRQTREVSNTMSLDSYYLGFDLSTQQL CLAINQDLKIVHSETVEFEKDLPHYH TKKGVYIHGDTIECPVAMWLEALDLVLSKYREAKFPLN VMAVSGSCQQHGSVYWSSQAE SLLEQLNKKPEKDLLHYVSSVAFARQTAPNWQDHSTAKQCQEFEECIGGPEKMAQLTGSR AHFRFTGPQILKIAQLEPEAYEKTKTISLVSNFLTSILVGHLVELEEADACGMNLYDIRE
RKFSDELLHLIDSSSKDKTIRQKLMRAPMKNLIAGTICKYFIEKYGFNTNCKVSPMTGDN LATICSLPLRKNDVLVSLGTSTTVLLVTDKYHPSPNYHLFIHPTLPNHYMGMICYCNGSL ARERIRDELNKERENNYEKTND TLFNQAVLDDSESSENELGVYFPLGEIVPSVKAINKR VIFNPKTGMIEREVAKFKDKRHDAKNIVESQALSCRVRISPLLSDSNASSQQRLNEDTIV KFDYDESPLRDYLNKRPERTFFVGGASKNDAIVKKFAQVIGATKGNFRLETPNSCALGGC YKAM SLLYDSNKIAVPFDKFLNDNFPWHVMESISDVDNENWDRYNSKIVPLSELEKTLI
Example 31: Construction of the KanMX-ATO1-L75Q cassette
A unique disruption cassette suitable for use when auxotrophic markers are unavailable, such as in diploid industrial strains or haploids derived from such strains, was constructed to allow
homologous recombination or integration of sequences in the absence of traditional auxotrophic
marker selection. The primers used for amplification of nucleic acids utilized to generate the
disruption cassette are described in the table below.
JML/51 ACTAGTATGTCTGACAAGGAACAAACGAGC 5' ScAto1 SpeI
JML/52 CTCGAGTTAAAAGATTACCCTTTCAGTAGATGGTAATG 3' ScAto1 XhoI
JML/55 caagcctttggtggtacccagaatccagggttagctcc ScATO(L75Q)_For
JML/56 ggagctaaccctggattctgggtaccaccaaaggcttg ScATO(L75Q)_Rev
JML/57 ggtacaacgcatatgcagatgttgctacaaagcagaa ScATO1G259D_For
JML/58 ttctgctttgtagcaacatctgcatatgcgttgtacc ScATO1G259D_Rev
JML/59 GACGACGTCTAGAAAAGAATACTGGAGAAATGAAAAGAAAAC ReplacesJML/30
JML/63 GCATGCTTAATTAATGCGAGGCATATTTATGGTGAAGG F'of5'FlankingRegion
JML/64 GGCCGGCCAGATCTGCGGCCGCGGCCAGCAAAACTAAAAAACTGTATTATAAG F'of3'FlankingRegion
JML/65 GCGGCCGCAGATCTGGCCGGCCGATTTATCTTCGTTTCCTGCAGGTTTTTG R'of5'FlankingRegion
JML/66 GAATTCTTAATTAACTTTTGTTCCACTACTTTTTGGAACTCTTG R'of3'FlankingRegion
JML/67 GCATGCGCGGCCGCACGTCGGCAGGCCCG F'200mer-R
JML/68 CGAAGGACGCGCGACCAAGTTTATCATTATCAATACTCGCCATTTC F'200mer-R-pGPD-ATC
JML/69 GAAATGGCGAGTATTGATAATGATAAACTTGGTCGCGCGTCCTTCG R'pGPD-ATO1-CYC-200
JML/70 GTCGACCCGCAAATTAAAGCCTTCGAGC R-pGPD-ATO1-CYC
JML/71 GTCGACGTACCCCCGGGTTAATTAAGGCG F-KanMX
JML/72 GTCGAAAACGAGCTCGAATTCGACGTCGGCAGGCCCG F-KanMX-200mer-R
JML/73 CGGGCCTGCCGACGTCGAATTCGAGCTCGTTTTCGAC R-200mer-R-KanMX
JML/74 GGATCCGCGGCCGCTGGTCGCGCGTCCTTCG R-200mer-R
ScATO1 was amplified from genomic DNA (gDNA) isolated from BY4742 with primers oJML51 and oJML52 and cloned into pCR Blunt II-TOPO (Invitrogen, Carlsbad, CA). Site-directed mutagenesis (SDM) was performed on that plasmid with oJML55 and oJML56, as described herein. The mutagenized clone was re-amplified with primers oJML51 and oJML52, cloned into pCR Blunt II-TOPO (Invitrogen, Carlsbad, CA), and designated ATO1-L75Q. ATO1-L75Q was subcloned into p416GPD using SpeI/XhoI restriction enzyme sites. The resulting plasmid was designated pJLV048.
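Mutagenic primer pairs for site-directed mutagenesis are commonly designed as exact reverse complements of one another. The following is a minimal sketch, not part of the original disclosure, that checks this property for the L75Q and G259D primer pairs listed in the primer table above. The function names are illustrative only.

```python
# Complement table for DNA bases (upper- and lower-case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_complementary_pair(forward: str, reverse: str) -> bool:
    """True if `reverse` is the exact reverse complement of `forward`."""
    return reverse_complement(forward).upper() == reverse.upper()


# Primer sequences copied from the primer table of Example 31.
OJML55 = "caagcctttggtggtacccagaatccagggttagctcc"  # ScATO(L75Q)_For
OJML56 = "ggagctaaccctggattctgggtaccaccaaaggcttg"  # ScATO(L75Q)_Rev
OJML57 = "ggtacaacgcatatgcagatgttgctacaaagcagaa"   # G259D forward primer
OJML58 = "ttctgctttgtagcaacatctgcatatgcgttgtacc"   # G259D reverse primer

if __name__ == "__main__":
    print("L75Q pair complementary:", is_complementary_pair(OJML55, OJML56))
    print("G259D pair complementary:", is_complementary_pair(OJML57, OJML58))
```

Both pairs are fully complementary over their entire length, which is consistent with their use as a matched mutagenesis pair.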
The 5' and 3' flanking regions of URA3 were amplified by PCR: the 5' region with primers oJML63 and oJML65, and the 3' region with primers oJML64 and oJML66. The amplified nucleic acids were annealed and re-amplified with oligonucleotides oJML63 and oJML66. The template used was TURBO gDNA. The PCR product was TOPO cloned into pCR-Blunt II. The desired sequence was moved as an EcoRI-SphI fragment into vector pUC19 and designated pJLV63.
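Annealing two PCR products and re-amplifying with the outer primers requires that the inner primers carry complementary 5' tails. The sketch below, which is illustrative and not part of the original disclosure, looks for such a shared tail between oJML65 and oJML64. The restriction site assignments in the comments (NotI, BglII, FseI) are standard recognition sequences; reading the shared tail as the annealing junction is an inference from the primer sequences, and the helper names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tail_overlap(reverse_primer: str, forward_primer: str) -> str:
    """Longest 5' tail of `forward_primer` that equals the reverse
    complement of the equally long 5' tail of `reverse_primer`."""
    best = ""
    for k in range(1, min(len(reverse_primer), len(forward_primer)) + 1):
        if reverse_complement(reverse_primer[:k]) == forward_primer[:k].upper():
            best = forward_primer[:k].upper()
    return best


# Inner primers copied from the primer table of Example 31.
OJML64 = "GGCCGGCCAGATCTGCGGCCGCGGCCAGCAAAACTAAAAAACTGTATTATAAG"
OJML65 = "GCGGCCGCAGATCTGGCCGGCCGATTTATCTTCGTTTCCTGCAGGTTTTTG"

if __name__ == "__main__":
    junction = tail_overlap(OJML65, OJML64)
    # The 22-nt junction reads FseI (GGCCGGCC), BglII (AGATCT), NotI (GCGGCCGC).
    print(len(junction), junction)
```

The 22-nucleotide junction found this way contains a NotI site, which fits the later use of the NotI site of pJLV63 for inserting the cassette.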
The R-KanMX fragment was made as follows: The KanMX fragment was first amplified from pBF524 with primers oJML71 and oJML73. The R-200-mer from plasmid pBF32 was then amplified using primers oJML72 and oJML74. The two fragments were annealed together, PCR amplified using primers oJML67 and oJML70, and TOPO cloned using pCR-Blunt II. The final plasmid construct was designated pJLV062. The R-PTDH3-ATO1-L75Q construct was generated by amplifying a mixture of PCR oJML67-oJML69 (pBF32) + PCR oJML68-oJML70 (pJLV048). The resulting plasmid was designated pJLV065. The R-PTDH3-ATO1-L75Q (SalI/SphI) fragment from pJLV065 was ligated in a 3 piece ligation to the SalI/BamHI (R-KanMX) fragment from pJLV063 into the
Figure imgf000279_0001
fragment was ligated as a NotI piece into the NotI site of pJLV63 and designated pJLV74. The letter "R", with reference to nucleic acid fragments, primers, plasmids and unique 200-mer sequence tags, refers to a unique 200-mer tag identification number. The unique sequence tags are described in Example 40. A table describing the intermediate and final plasmids is presented below.
Figure imgf000279_0002
pJLV0074 pBF654 pUC19-5'URA3-200m448-ProGDP-ScATO1 L75Q-KanMX-200m448-3'URA3 NotI(pJLV070) + NotI(pJLV063)
Example 32: Construction of the ura3 Disruptions in each Haploid
Haploid yeast strains were transformed with 2 to 3 μg of a PvuII, SphI digested ura3::R-KanMX-ATO1-L75Q-R disruption cassette using the high-efficiency Li-PEG procedure with a heat shock time of 8 minutes. Transformants were plated on YPD plus G418 (200 μg/ml) plates. Colonies were re-streaked onto ScD FOA plates. Single colonies were replica plated on ScD-ura, ScD + FOA, YPD, and YPD G418 (200 μg/ml) plates. Ura- FOAR G418R colonies were grown overnight in YPD. Genomic DNA was extracted and the presence of the KanMX-ATO1-L75Q gene in the URA3 loci was verified by PCR. 50 μl of each overnight culture was plated on ScD Acetate (2 g/L), pH 4.0, plates. Colonies were restreaked on ScD Acetate plates and single colonies grown overnight in YPD. Disruptions of the URA3 loci were verified by PCR with primers complementary to a region outside of the flanking region used for the disruption. The presence of the unique 200-mer sequence was verified by PCR with primers complementary to the 200-mer in combination with primers complementary to a region outside of the flanking region used for the disruption. The absence of the URA3 loci was verified by PCR that amplifies a 500 bp region of the Actin gene open reading frame and a 300 bp region of the URA3 open reading frame. The primers utilized for amplification and verification are presented, respectively, in the tables below.
Primers used for amplification of URA3 and Actin
JML/211 GAGGGCACAGTTAAGCCGCTAAAGG URA3
JML/212 GTCAACAGTACCCTTAGTATATTCTCCAGTAGCTAGGGAG URA3
JML/213 CGTTACCCAATTGAACACGGTATTGTCAC ACT1
JML/21 GAAGATTGAGCAGCGGTTTGCATTTC ACT1
Primers used to verify the presence or absence of URA3
JML/67 GCATGCgcggccgcACGTCGGCAGGCCCG F'200mer-R
JML/74 GGATCCgcggccgcTGGTCGCGCGTCCTTCG R-200mer-R
JML/102 gagtcaaacgacgttgaaattgaggctactgc PCR to verify disruption of URA3
JML/103 GATTACTGCTGCTGTTCCAGCCCATATCCAAC PCR to verify disruption of URA3
Example 33: EDA Gene Integration Method and Constructs
Plasmid DNA was digested with PacI according to the manufacturer's suggestions. The digestions were purified using the GeneJET™ Gel Extraction Kit I (Fermentas). Each column was eluted with 20 μl of Elution buffer and multiple digests were combined. S. cerevisiae was transformed using the high-efficiency Li-PEG procedure with 2 to 3 μg of DNA and transformants were selected on ScD-ura solid media. Correct integrations were confirmed by PCR analysis with primers outside the flanking regions used as the disruption cassette and primers complementary to either the open reading frame of EDA or the 200-mer repeat. Oligonucleotide primers utilized for verification are described in the tables below.
Primers - Outside
YBR110.5 5' GGCAATCAAATTGGGAACGAACAATG JML/187
3' CTCAAGGTATCCTCATGGCCAAGCAATAC JML/188
YDL075.5 5' GGGTCTACAAACTGTTGTTGTCGAAGAAGATG JML/189
3' CATTCAGTTCCAATGATTTATTGACAGTGCAC JML/190
Primers - Repeat and EDA going out
JML/276 CCTACCCGCCTCGGATCCCAGCTACC R-repeat
JML/277 GGTAGCTGGGATCCGAGGCGGGTAGG R-repeat
JML/278 CCTCCCGGCACAGCGTGTCGATGC R at the 5' EDA PaEDA going out and similar primers for EcEDA
JML/297 CGAAGCCCTGGAGCGCTTCGC PCR for PaEDA going out at the 3' of the ORF
JML/298 GTGGTCAGGATTGATTCTGCACTTGTTTTCCAG PCR for EcEDA Reverse at the 5' end
JML/299 CGCGTGAAGCTGTAGAGGCGCTAAG PCR for EcEDA Forward at the 3' end
The PCR reactions were performed in a final reaction volume of 25 μl using the following amplification profile: 1 cycle at 94 degrees C for 2 minutes, followed by 35 cycles of 94 degrees C for 30 seconds, 52 degrees C for 30 seconds and 72 degrees C for 2 minutes.
Construction of EDA disruption cassettes
PTDH3-PaEDA was amplified from pBF292 using primers oJML225 and oJML226, shown in the table below, and TOPO cloned in pCR Blunt II to make pJLV95.
JML/225 GAGCTCGGCCGCAAATTAAAGCCTTCGAG 3' cycTERMINATOR
JML/226 GGCCGGCCGTTTATCATTATCAATACTCGCCATTTCAAAGAATACG 5' PROMOTERgpd
The desired fragment was moved as an FseI-SacI piece into pBF730 or pBF731 (the integration cassette of either YBR110.5 or YDL075.5, respectively) to make plasmids pJLV114 and pJLV115, respectively. YBR110.5 is located in between loci YBR110 and YBR111, and YDL075.5 is located in between loci YDL075 and YDL076. The R-URA3-R sequence was moved into these plasmids as a NotI fragment to make pJLV119 and pJLV120. The resultant plasmids are described in the table below.
pJLV0095 pBF777 pCR-Topo BluntII - PaEDA PCR oJML225-oJML226 (pBF292)
pJLV0114 pBF862 pUC19-5'-YBR110.5-PGDP1-PaEDA-TCYC-3'YBR110.5 FseI-SacI(pBF730) + FseI-SacI(pJLV95)
pJLV0115 pBF863 pUC19-5'-YDL075.5-PGDP1-PaEDA-TCYC-3'YDL075.5 FseI-SacI(pBF731) + FseI-SacI(pJLV95)
pJLV0119 pBF867 pUC19-5'-YBR110.5-PGDP1-PaEDA-TCYC-R-URA3-R-3'YBR110.5 NotI(pBF742) + NotI(pJLV114)
pJLV0120 pBF868 pUC19-5'-YDL075.5-PGDP1-PaEDA-TCYC-R-URA3-R-3'YDL075.5 NotI(pBF742) + NotI(pJLV115)
Example 34: Isolation and Evaluation of Additional EDA Genes
EDA genes isolated from a variety of sources were expressed in yeast and evaluated independently of EDD activity, to identify EDA activities suitable for inclusion in an engineered yeast strain. The EDA activities were independently assessed by adding saturating amounts of overexpressed E. coli EDD extracts to S. cerevisiae EDA extracts lacking EDD (Cheriyan et al., Protein Science 16:2368-2377, 2007). The relative activities of EDAs expressed in S. cerevisiae were compared and ranked in this way. The activities of integrated EDAs in Thermosacc-Gold haploids were also evaluated in this manner. The table below describes oligonucleotide primers used to isolate the various EDA genes.
Name Description Sequence
KA/EDA-SoFor Cloning primer for Shewanella oneidensis EDA GTTCACTGCACTAGTAAAAAAATGCTTGAGAATAACTGGTC
KA/EDA-SoRev Cloning primer for Shewanella oneidensis EDA CTTCGAGATCTCGAGTTAAAGTCCGCCAATCGCCTC
KA/EDA-GoFor Cloning primer for Gluconobacter oxydans EDA GTTCACTGCACTAGTAAAAAAATGATCGATACTGCCAAACTC
KA/EDA-GoRev Cloning primer for Gluconobacter oxydans EDA CTTCGAGATCTCGAGTCAGACCGTGAAGAGTGCCGC
KA/EDA-BLFor Cloning primer for Bacillus licheniformis EDA GTTCACTGCACTAGTAAAAAAATGGTATTGTCACACATCGAAG
KA/EDA-BLRev Cloning primer for Bacillus licheniformis EDA CTTCGAGATCTCGAGTTACTGTTTTGCTGCTTCAACAAATTG
KA/EDA-BsFor Cloning primer for Bacillus subtilis EDA GTTCACTGCACTAGTAAAAAAATGGAGTCCAAAGTCGTTGAAAACC
KA/EDA-BsRev Cloning primer for Bacillus subtilis EDA CTTCGAGATCTCGAGTTACACTTGGAAAACAGCCTGCAAATCC
KA/EDA-PfFor Cloning primer for Pseudomonas fluorescens EDA GTTCACTGCACTAGTAAAAAAATGACAAACCTCGCCCCGACC
KA/EDA-PfRev Cloning primer for Pseudomonas fluorescens EDA CTTCGAGATCTCGAGTCAGTCCAGCAGGGCCAGG
KA/EDA-PsFor Cloning primer for Pseudomonas syringae EDA GTTCACTGCACTAGTAAAAAAATGACACAGAACGAAAATAATCAGCCGC
KA/EDA-PsRev Cloning primer for Pseudomonas syringae EDA CTTCGAGATCTCGAGTCAGTCAAACAGCGCCAGCGC
KA/EDA-SdFor Cloning primer for Saccharophagus degradans EDA GTTCACTGCACTAGTAAAAAAATGGCTATTACAAAAGAATTTTTAGCTCCAG
KA/EDA-SdRev Cloning primer for Saccharophagus degradans EDA CTTCGAGATCTCGAGTTAGCTAGAAATTTTAGCGGTAGTTGCC
KA/EDA-XaFor Cloning primer for Xanthomonas axonopodis EDA GTTCACTGCACTAGTAAAAAAATGACGATTGCCCAGACCCAG
KA/EDA-XaRev Cloning primer for Xanthomonas axonopodis EDA CTTCGAGATCTCGAGTCAGCCCGCCCGCACC
KA/NdeIEDDfor Cloning primer for E. coli EDD GTTCACTGCCATATGAATCCACAATTGTTACGCGTAACAAATCGAATCATTG
KA/XhoIEDDrev Cloning primer for E. coli EDD CTTCGAGATCTCGAGTTAAAAAGTGATACAGGTTGCGCCCTGTTCGGC
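The cloning primers in the table above appear to carry restriction sites in their 5' tails for directional cloning. The following sketch, offered as an illustration rather than as the method used in this Example, scans a few of the listed primers for the standard recognition sequences of SpeI (ACTAGT), XhoI (CTCGAG) and NdeI (CATATG). The choice of these three enzymes is an assumption based on the primer names and on the SpeI/XhoI subcloning described in Example 31, and the function name is hypothetical.

```python
# Standard recognition sequences; the enzyme selection is an assumption.
SITES = {"SpeI": "ACTAGT", "XhoI": "CTCGAG", "NdeI": "CATATG"}

# A subset of the cloning primers, copied from the table in Example 34.
PRIMERS = {
    "KA/EDA-SoFor": "GTTCACTGCACTAGTAAAAAAATGCTTGAGAATAACTGGTC",
    "KA/EDA-SoRev": "CTTCGAGATCTCGAGTTAAAGTCCGCCAATCGCCTC",
    "KA/NdeIEDDfor": "GTTCACTGCCATATGAATCCACAATTGTTACGCGTAACAAATCGAATCATTG",
    "KA/XhoIEDDrev": "CTTCGAGATCTCGAGTTAAAAAGTGATACAGGTTGCGCCCTGTTCGGC",
}


def find_sites(seq: str) -> dict:
    """Map enzyme name to the 0-based index of its first site in `seq`.
    Enzymes without a site in the sequence are omitted."""
    upper = seq.upper()
    return {name: upper.find(site) for name, site in SITES.items() if site in upper}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

In this subset each forward EDA primer carries a single SpeI site, each reverse primer a single XhoI site, and the EDD forward primer an NdeI site, all starting nine bases from the 5' end.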
Listed below are the amino acid sequences, nucleotide sequences and accession numbers of the EDA genes evaluated as described in this Example.
Accession Number Species Strain Number Nucleotide Sequence Amino Acid Sequence
YP_526856.1 Saccharophagus 2-40 ATGGCTATTACAAAAGAATTTTTAGCTCCAGTTGGCGTAATGCCTGT MAITKEFLAPVGVMPVV
degradans TGTGGTTGTGGATCGTGTAGAAGATGCGGTGCCTATTACAAACGCAT VVDRVEDAVPITNALKA
TAAAAGCCGGCGGTATTAAAGCAGTTGAGATTACTTTACGTACTCCT GGIKAVEITLRTPAALD GCGGCACTGGATGCTATTCGCGCTATTAAAGCTGAGTGTGAAGACAT AIRAIKAECEDILVGVG CCTGGTGGGGGTAGGTACGGTTATTAACCATCAAAACCTTAAAGATA TVINHQNLKDIAAIGVD TTGCTGCAATTGGTGTTGATTTCGCCGTATCTCCTGGTTACACCCCA FAVSPGYTPTLLKQAQD ACATTGCTGAAGCAAGCGCAAGATTTGGGCGTAGAAATGTTGCCTGG LGVEMLPGVTSPSEVML TGTAACTTCGCCTTCTGAAGTTATGCTTGGTATGGAGCTAGGTTTGT GMELGLSCFKLFPAVAV CTTGCTTCAAGCTATTCCCTGCGGTTGCAGTAGGTGGTTTGCCATTA GGLPLLKSIGGPLPQVS CTTAAGTCTATTGGTGGCCCATTACCACAGGTTTCCTTCTGTCCAAC FCPTGGLTIDTFTDFLA AGGCGGTTTGACTATCGATACTTTCACCGACTTCTTGGCATTGCCTA LPNVACVGGTWLVPADA ACGTTGCTTGTGTGGGTGGTACTTGGTTGGTGCCTGCAGATGCTGTT VAAKNWQAITDIAAATT GCAGCTAAAAACTGGCAAGCTATTACTGATATTGCGGCGGCAACTAC AKISS CGCTAAAATTTCTAGCTAA
Xanthomonas ATCC ATGACGATTGCCCAGACCCAGAACACCGCCGAACAGTTGCTGCGCGA MTIAQTQNTAEQLLRDA axonopodis pv. 13902 TGCCGGCATCTTGCCCGTGGTCACCGTGGACACGCTGGATCAGGCGC GILPVVTVDTLDQARRV Vasculorum GCCGCGTCGCCGATGCGTTGCTCGAAGGCGGCCTGCCCGCGATCGAG ADALLEGGLPAIELTLR
CTGACCCTTCGCACGCCAGTGGCGATCGACGCGCTGGCGATGCTCAA TPVAIDALAMLKRELPN GCGCGAGCTTCCTAACATCTTGATCGGTGCCGGCACCGTGCTGAGCG ILIGAGTVLSELQLRQS AATTGCAGCTGCGTCAGTCGGTGGATGCCGGTGCAGACTTCCTGGTG VDAGADFLVTPGTPAPL ACCCCGGGCACGCCGGCGCCGCTGGCGCGCCTGCTGGCGGATGCGCC ARLLADAPIPAVPGAAT GATCCCGGCCGTTCCCGGCGCGGCCACTCCGACCGAGCTGCTGACCT PTELLTLMGLGFRVCKL TGATGGGTCTTGGCTTTCGCGTCTGCAAGCTGTTCCCGGCCACCGCC FPATAVGGLQMLRGLAG GTGGGCGGTCTGCAGATGCTCAGGGGCCTGGCCGGCCCGCTGTCCGA PLSELKLCPTGGISEAN GCTCAAGCTGTGCCCCACCGGCGGCATCAGCGAGGCCAACGCCGCCG AAEFLSQPNVLCIGGSW AGTTCCTGTCGCAGCCGAACGTGCTGTGCATCGGCGGTTCGTGGATG MVPKDWLAHGQWDKVKE GTCCCCAAGGATTGGCTGGCGCACGGCCAATGGGACAAGGTCAAGGA SSAKAAAIVRQVRAG AAGCTCGGCCAAGGCGGCGGCGATCGTGCGGCAGGTGCGGGCGGGCT GA
AA055695.1 Pseudomonas Pv. ATGACACAGAACGAAAATAATCAGCCGCTCACCAGCATGGCGAACAA MTQNENNQPLTSMANKI
syringiae Toma to GATTGCCCGGATCGACGAACTCTGCGCCAAGGCAAAGATTCTGCCGG ARIDELCAKAKILPVIT
str TCATCACCATTGCCCGTGATCAGGACGTATTGCCACTGGCCGACGCG IARDQDVLPLADALAAG
DC3000 CTGGCCGCTGGTGGCATGACGGCTCTGGAAATCACCCTGCGCTCGGC GMTALEITLRSAFGLSA
GTTCGGACTGAGTGCGATCCGCATTTTGCGCGAGCAGCGCCCAGAGC IRILREQRPELCTGAGT TGTGCACTGGCGCCGGGACCATTCTGGACCGCAAGATGCTGGCCGAC I LDRKMLADAEAAGSQF GCCGAGGCGGCGGGCTCGCAATTCATTGTGACCCCCGGCAGCACGCA IVTPGSTQELLQAALDS GGAACTGTTGCAGGCGGCGCTCGACAGCCCGTTGCCCCTGTTGCCAG PLPLLPGVSSASEIMIG GCGTCAGCAGCGCGTCGGAAATCATGATCGGCTATGCCTTGGGTTAT YALGYRRFKLFPAEISG
Figure imgf000285_0001
Figure imgf000286_0001
Figure imgf000287_0001
Figure imgf000288_0001
EDA extracts were prepared using the following protocol.
Day 1
Grow 5 ml LB-Kan preps of BF1055 (BL21/DE3 with pET26b empty vector) and BF1706 (BL21/DE3 with pET26b + E. coli EDD).
Grow 5 ml preps of each EDA construct expressed in S. cerevisiae in appropriate selective media (e.g. ScD-leu).
Day 2
Grow 50 ml LB-Kan prep of BF1055, 2% (v/v) inoculate.
Grow 50 ml prep of BF1706 using Novagen's Overnight Express (46.45 ml LB-Kan, 1 ml solution 1, 2.5 ml solution 2, 50 μl solution 3, 5 μl of 1 M MnCl2, 50 μl of 0.5 M FeCl2), 2% (v/v) inoculate.
Grow 50 ml prep of each EDA construct expressed in S. cerevisiae in appropriate selective media + 10 mM MnCl2. Inoculate to OD600 of 0.2.
Day 3
EDD extractions (adapted from Cheriyan et al, Protein Science 16:2368-2377, 2007):
1) Pellet cells in 50 ml conical tubes, 4 °C, 3,000 rpm, 10 mins; discard supernatant.
2) Resuspend in 2 ml degassed PDGH buffer (20 mM MES pH 6.5, 30 mM NaCl, 5 mM MnCl2, 0.5 mM FeCl2, 10 mM 2-mercaptoethanol, 10 mM cysteine, sparged with nitrogen gas). Move to Hungate tube.
3) Add 0.1% Triton X-100, 10 ng/ml DNase, 10 μg/ml PMSF, 10 μg/ml TAME (Nα-(p-toluenesulfonyl)-L-arginine methyl ester), 100 μg/ml lysozyme.
4) Sparge Hungate tube with nitrogen gas, cap and seal. Incubate 2 hours at 37 °C, swirl occasionally.
5) Clarify by centrifugation in 2-ml tube, 4 °C, 10 mins, 14,000 rpm. Keep supernatant.
6) Treat with 150 mM pyruvate and 10 mM sodium cyanoborohydride (work in hood) to inactivate aldolase activity. Incubate 30 mins at room temperature.
7) During incubation, pre-equilibrate PD-10 column from GE.
a. Remove top cap, pour off storage buffer.
b. Cut off bottom tip, fit in 50 ml conical with adapter.
c. Pour 5 ml of 20 mM MES buffer, pH 6.5 (total of 5 times). Discard flow-through.
8) Run sample through column, then add MES buffer to a total of 2.5 ml volume added. Discard flow-through.
9) Run 3.5 ml 20 mM MES pH 6.5 buffer to elute protein. Discard column in appropriate waste receptacle.
10) Perform Bradford assay (1:10 or 1:20 dilution).
EDA extractions:
1) Spin down in 50 ml conicals, 4 °C, 3,400 rpm, 5 mins. Wash 2x with 25 ml water.
2) Resuspend in 1 ml lysis buffer (50 mM Tris-HCl, pH 7, 10 mM MgCl2, 1x protease inhibitor).
3) Add 1 cap of zirconia beads, vortex 4-6 times, 15 sec bursts, ice in between.
4) Spin down cell debris, 4 °C, 14,000 rpm, 10 mins. Save supernatant.
5) Perform Bradford assay (1:2 dilution).
Activity assays:
Each reaction contains 50 mM Tris-HCl, pH 7, 10 mM MgCl2, 0.15 mM NADH, 15 μg LDH, saturating amounts of EDD determined empirically (usually ~100 μg), 1-50 μg EDA (depending on level of activity), and 1 mM 6-phosphogluconate. Reactions are started by the addition of 6-phosphogluconate and monitored for 5 mins at 30 °C.
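In an LDH-coupled assay of this kind, each pyruvate released by EDA is expected to consume one NADH, so the rate of NADH loss reports on EDA activity. The sketch below shows one way a measured slope could be converted to a specific activity. It assumes the reaction is followed by absorbance at 340 nm with the commonly used NADH extinction coefficient of 6.22 mM^-1 cm^-1. The wavelength, reaction volume and path length are not stated in this Example and are assumptions, as is the function name.

```python
# Millimolar extinction coefficient of NADH at 340 nm (mM^-1 cm^-1).
# Monitoring at 340 nm is an assumption; the Example does not state it.
NADH_EXT_COEFF_MM = 6.22


def specific_activity(slope_a340_per_min: float,
                      protein_ug: float,
                      reaction_volume_ml: float = 1.0,
                      path_length_cm: float = 1.0) -> float:
    """Return micromoles of NADH oxidized per minute per mg of protein.

    `reaction_volume_ml` and `path_length_cm` default to assumed values.
    """
    if protein_ug <= 0:
        raise ValueError("protein_ug must be positive")
    # Beer-Lambert law: concentration change (mM/min) = |dA/dt| / (eps * l).
    rate_mm_per_min = abs(slope_a340_per_min) / (NADH_EXT_COEFF_MM * path_length_cm)
    # 1 mM equals 1 micromole per ml, so multiply by the volume in ml.
    umol_per_min = rate_mm_per_min * reaction_volume_ml
    return umol_per_min / (protein_ug / 1000.0)


if __name__ == "__main__":
    # Hypothetical slope of -0.0622 A340/min with 25 micrograms of lysate protein.
    print(round(specific_activity(-0.0622, 25.0), 3))
```

The slope value in the usage line is hypothetical and is not taken from the results of this Example.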
Results
The S. cerevisiae strains tested for EDA activity are described in the table below. yCH strains are Thermosacc-based (Lallemand). BF strains are based on BY4742.
Figure imgf000290_0001
BF1729 pBF729 Gluconobacter oxydans EDA
BF1730 pBF727 Shewanella oneidensis EDA
BF1775 pBF87 p425GPD (empty vector)
BF1776 pBF928 PAO1 EDA codon optimized for S. cerevisiae
E. coli expressed EDD was prepared and confirmed by western blot analysis as shown in FIG. 23. The expected size of EDD is approximately 66 kilodaltons (kDa). A band of approximately that size (e.g., as determined by the nearest sized protein standard of approximately 60 kDa) was identified by western blot. The E. coli expressed EDD was used with S. cerevisiae expressed EDAs to evaluate the EDA activities. The results of EDA kinetic assays are presented in the table below.
Figure imgf000291_0001
In the results presented above, the slope of the E. coli (EC) EDA is outside the linear range for accurate detection, and is therefore underestimated. For the other EDAs, when compared to the E. coli EDA, the calculated percentage of maximum activity (e.g., %max) is overestimated; however, the slopes are accurate. The results of this experiment indicate that the E. coli EDA has higher activity than the other EDA activities evaluated herein, and is approximately 16-fold more active than the EDA from P. aeruginosa. EDAs from X. axonopodis and a chimera between E. coli EDA and P. aeruginosa (e.g., PE15) show less activity than the vector control.
Codon-optimized EDA from P. aeruginosa showed a slight improvement over the native sequence; however, chimeric versions (e.g., PE5, PE10, PE15) showed less activity than native. The experiments were repeated using 100 μg of EDD and 25 μg of EDA cell lysates in each reaction (unless otherwise noted, such as 5 μg of E. coli EDA). The reactions in the repeated experiment all were in the linear range of detection and the results of these additional kinetic assays are shown graphically in FIG. 24, and in the table below. E. coli EDA was again found to be the most active of those EDAs tested.
Figure imgf000292_0001
Example 35: Nucleotide and Amino Acid Sequence of S. cerevisiae Phosphoglucose Isomerase.
Phosphoglucose isomerase (PGI1) activity was decreased or disrupted, in some embodiments, to favor the conversion of glucose-6-phosphate to gluconolactone-6-phosphate by the activity of ZWF1 (e.g., glucose-6-phosphate dehydrogenase). The nucleotide sequence of the S. cerevisiae PGI1 gene altered to decrease or disrupt phosphoglucose isomerase activity is shown below.
PGI1 nucleotide sequence
ATGTCCAATAACTCATTCACTAACTTCAAACTGGCCACTGAATTGCCAGCCTGGTCTAAG TTGCAAAAAATTTATGAATCTCAAGGTAAGACTTTGTCTGTCAAGCAAGAATTCCAAAAA GATGCCAAGCGTTTTGAAAAATTGAACAAGACTTTCACCAACTATGATGGTTCCAAAATC TTGTTCGACTACTCAAAGAACTTGGTCAACGATGAAATCATTGCTGCATTGATTGAACTG GCCAAGGAGGCTAACGTCACCGGTTTGAGAGATGCTATGTTCAAAGGTGAACACATCAAC TCCACTGAAGATCGTGCTGTCTACCACGTCGCATTGAGAAACAGAGCTAACAAGCCAATG TACGTTGATGGTGTCAACGTTGCTCCAGAAGTCGACTCTGTCTTGAAGCACATGAAGGAG TTCTCTGAACAAGTTCGTTCTGGTGAATGGAAGGGTTATACCGGTAAGAAGATCACCGAT GTTGTTAACATCGGTATTGGTGGTTCCGATTTGGGTCCAGTCATGGTCACTGAGGCTTTG AAGCACTACGCTGGTGTCTTGGATGTCCACTTCGTTTCCAACATTGACGGTACTCACATT GCTGAAACCTTGAAGGTTGTTGACCCAGAAACTACTTTGTTTTTGATTGCTTCCAAGACT TTCACTACCGCTGAAACTATCACTAACGCTAACACTGCCAAGAACTGGTTCTTGTCGAAG ACAGGTAATGATCCATCTCACATTGCTAAGCATTTCGCTGCTTTGTCCACTAACGAAACC GAAGTTGCCAAGTTCGGTATTGACACCAAAAACATGTTTGGTTTCGAAAGTTGGGTCGGT GGTCGTTACTCTGTCTGGTCGGCTATTGGTTTGTCTGTTGCCTTGTACATTGGCTATGAC AACTTTGAGGCTTTCTTGAAGGGTGCTGAAGCCGTCGACAACCACTTCACCCAAACCCCA TTGGAAGACAACATTCCATTGTTGGGTGGTTTGTTGTCTGTCTGGTACAACAACTTCT T GGTGCTCAAACCCATTTGGTTGCTCCATTCGACCAATACTTGCACAGATTCCCAGCCTAC TTGCAACAATTGTCAATGGAATCTAACGGTAAGTCTGTTACCAGAGGTAACGTGTTTACT GACTACTCTACTGGTTCTATCTTGTTTGGTGAACCAGCTACCAACGCTCAACACTCTTTC TTCCAATTGGTTCACCAAGGTACCAAGTTGATTCCATCTGATTTCATCTTAGCTGCTCAA TCTCATAACCCAATTGAGAACAAATTACATCAAAAGATGTTGGCTTCAAACTTCTTTGCT CAAGCTGAAGCTTTAATGGTTGGTAAGGATGAAGAACAAGTTAAGGCTGAAGGTGCCACT GGTGGTTTGGTCCCACACAAGGTCTTCTCAGGTAACAGACCAACTACCTCTATCTTGGCT
CAAAAGATTACTCCAGCTACTTTGGGTGCTTTGATTGCCTACTACGAACATGTTACTTTC ACTGAAGGTGCCATTTGGAATATCAACTCTTTCGACCAATGGGGTGTTGAATTGGGTAAA GTCTTGGCTAAAGTCATCGGCAAGGAATTGGACAACTCCTCCACCATTTCTACCCACGAT GCTTCTACCAACGGTTTAATCAATCAATTCAAGGAATGGATGTGA
Example 36: Nucleotide and Amino Acid Sequence of S. cerevisiae 6-phosphogluconate dehydrogenase (decarboxylating)
6-phosphogluconate dehydrogenase (decarboxylating) (GND1) activity was decreased or disrupted, in some embodiments, to minimize or eliminate the conversion of gluconate-6-phosphate to ribulose-5-phosphate. The nucleotide sequences of the S. cerevisiae GND1 and GND2 genes altered to decrease or disrupt 6-phosphogluconate dehydrogenase (decarboxylating) activity are shown below.
GND1/YHR183W
ATGTCTGCTGATTTCGGTTTGATTGGTTTGGCCGTCATGGGTCAAAATTTGATCTTGAAC GCTGCTGACCACGGTTTCACTGTTTGTGCTTACAACAGAACTCAATCCAAGGTCGACCAT TTCTTGGCCAATGAAGCTAAGGGCAAATCTATCATCGGTGCTACTTCCATTGAAGATTTC ATCTCCAAATTGAAGAGACCTAGAAAGGTCATGCTTTTGGTTAAAGCTGGTGCTCCAGTT GACGCTTTGATCAACCAAATCGTCCCACTTTTGGAAAAGGGTGATATTATCATCGATGGT GGTAACTCTCACTTCCCAGATTCTAATAGACGTTACGAAGAATTGAAGAAGAAGGGTATT CTTTTCGTTGGTTCTGGTGTCTCCGGTGGTGAGGAAGGTGCCCGTTACGGTCCATCTTTG ATGCCAGGTGGTTCTGAAGAAGCTTGGCCACATATTAAGAACATCTTCCAATCCATCTCT GCTAAATCCGACGGTGAACCATGTTGCGAATGGGTTGGCCCAGCCGGTGCTGGTCACTAC GTCAAGATGGTTCACAACGGTATTGAATACGGTGATATGCAATTGATTTGTGAAGCTTAT GACATCATGAAGAGATTGGGTGGGTTTACCGATAAGGAAATCAGTGACGTTTTTGCCAAA TGGAACAATGGTGTCTTGGATTCCTTCTTGGTCGAAATTACCAGAGATATTTTGAAATTC GACGACGTCGACGGTAAGCCATTAGTTGAAAAAATCATGGATACTGCTGGTCAAAAGGGT ACTGGTAAGTGGACTGCCATCAACGCCTTGGATTTGGGTATGCCAGTTACTTTGATTGGT GAAGCTGTCTTTGCCCGTTGTCTATCTGCTTTGAAGAACGAGAGAATTAGAGCCTCCAAG GTCTTACCAGGCCCAGAAGTTCCAAAAGACGCCGTCAAGGACAGAGAACAATTTGTCGAT GATTTGGAACAAGCTTTGTATGCTTCCAAGATTATTTCTTACGCTCAAGGTTTCATGTTG ATCCGTGAAGCTGCTGCTACTTATGGCTGGAAACTAAACAACCCTGCCATCGCTTTGATG TGGAGAGGTGGTTGTATCATTAGATCTGTTTTCTTGGGTCAAATCACAAAGGCCTACAGA GAAGAACCAGATTTGGAAAACTTGTTGTTCAACAAGTTCTTCGCTGATGCCGTCACCAAG GCTCAATCTGGTTGGAGAAAGTCAATTGCGTTGGCTACCACCTACGGTATCCCAACACCA GCCTTTTCCACCGCTTTGTCTTTCTACGATGGGTACAGATCTGAAAGATTGCCAGCCAAC TTACTACAAGCTCAACGTGACTACTTTGGTGCTCACACTTTCAGAGTGTTGCCAGAATGT GCTTCTGACAACTTGCCAGTAGACAAGGATATCCATATCAACTGGACTGGCCACGGTGGT AATGTTTCTTCCTCTACATACCAAGCTTAA
GND2/YGR256W
ATGTCAAAGGCAGTAGGTGATTTAGGCTTAGTTGGTTTAGCCGTGATGGGTCAAAATTTG ATCTTAAACGCAGCGGATCACGGATTTACCGTGGTTGCTTATAATAGGACGCAATCAAAG GTAGATAGGTTTCTAGCTAATGAGGCAAAAGGAAAATCAATAATTGGTGCAACTTCAATT GAGGACTTGGTTGCGAAACTAAAGAAACCTAGAAAGATTATGCTTTTAATCAAAGCCGGT GCTCCGGTCGACACTTTAATAAAGGAACTTGTACCACATCTTGATAAAGGCGACATTATT ATCGACGGTGGTAACTCACATTTCCCGGACACTAACAGACGCTACGAAGAGCTAACAAAG CAAGGAATTCTTTTTGTGGGCTCTGGTGTCTCAGGCGGTGAAGATGGTGCACGTTTTGGT CCATCTTTAATGCCTGGTGGGTCAGCAGAAGCATGGCCGCACATCAAGAACATCTTTCAA TCTATTGCCGCCAAATCAAACGGTGAGCCATGCTGCGAATGGGTGGGGCCTGCCGGTTCT GGTCACTATGTGAAGATGGTACACAACGGTATCGAGTACGGTGATATGCAGTTGATTTGC GAGGCTTACGATATCATGAAACGAATTGGCCGGTTTACGGATAAAGAGATCAGTGAAGTA TTTGACAAGTGGAACACTGGAGTTTTGGATTCTTTCTTGATTGAAATCACGAGGGACATT TTAAAATTCGATGACGTCGACGGTAAGCCATTGGTGGAAAAAATTATGGATACTGCCGGT CAAAAGGGTACTGGTAAATGGACTGCAATCAACGCCTTGGATTTAGGAATGCCAGTCACT TTAATTGGGGAGGCTGTTTTCGCTCGTTGTTTGTCAGCCATAAAGGACGAACGTAAAAGA GCTTCGAAACTTCTGGCAGGACCAACAGTACCAAAGGATGCAATACATGATAGAGAACAA TTTGTGTATGATTTGGAACAAGCATTATACGCTTCAAAGATTATTTCATATGCTCAAGGT TTCATGCTGATCCGCGAAGCTGCCAGATCATACGGCTGGAAATTAAACAACCCAGCTATT GCTCTAATGTGGAGAGGTGGCTGTATAATCAGATCTGTGTTCTTAGCTGAGATTACGAAG GCTTATAGGGACGATCCAGATTTGGAAAATTTATTATTCAACGAGTTCTTCGCTTCTGCA GTTACTAAGGCCCAATCCGGTTGGAGAAGAACTATTGCCCTTGCTGCTACTTACGGTATT CCAACTCCAGCTTTCTCTACTGCTTTAGCGTTTTACGACGGCTATAGATCTGAGAGGCTA CCAGCAAACTTGTTACAAGCGCAACGTGATTATTTTGGCGCTCATACATTTAGAATTTTA CCTGAATGTGCTTCTGCCCATTTGCCAGTAGACAAGGATATTCATATCAATTGGACTGGG CACGGAGGTAATATATCTTCCTCAACCTACCAAGCTTAA
Example 37: Nucleotide and Amino Acid Sequence of S. cerevisiae Transaldolase
Transaldolase (TAL1) activity was increased in some embodiments, and in certain embodiments transaldolase activity was decreased or disrupted. Transaldolase converts sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate to erythrose 4-phosphate and fructose 6-phosphate. The rationale for increasing or decreasing transaldolase activity is described herein with respect to various embodiments. The nucleotide sequence of the S. cerevisiae TAL1 gene altered to increase or decrease transaldolase activity, and the encoded amino acid sequence, are shown below.
TAL1 nucleotide sequence
ATGTCTGAACCAGCTCAAAAGAAACAAAAGGTTGCTAACAACTCTCTAGAACAATTGAAA GCCTCCGGCACTGTCGTTGTTGCCGACACTGGTGATTTCGGCTCTATTGCCAAGTTTCAA CCTCAAGACTCCACAACTAACCCATCATTGATCTTGGCTGCTGCCAAGCAACCAACTTAC GCCAAGTTGATCGATGTTGCCGTGGAATACGGTAAGAAGCATGGTAAGACCACCGAAGAA
CAAGTCGAAAATGCTGTGGACAGATTGTTAGTCGAATTCGGTAAGGAGATCTTAAAGATT GTTCCAGGCAGAGTCTCCACCGAAGTTGATGCTAGATTGTCTTTTGACACTCAAGCTACC ATTGAAAAGGCTAGACATATCATTAAATTGTTTGAACAAGAAGGTGTCTCCAAGGAAAGA GTCCTTATTAAAATTGCTTCCACTTGGGAAGGTATTCAAGCTGCCAAAGAATTGGAAGAA AAGGACGGTATCCACTGTAATTTGACTCTATTATTCTCCTTCGTTCAAGCAGTTGCCTGT GCCGAGGCCCAAGTTACTTTGATTTCCCCATTTGTTGGTAGAATTCTAGACTGGTACAAA TCCAGCACTGGTAAAGATTACAAGGGTGAAGCCGACCCAGGTGTTATTTCCGTCAAGAAA ATCTACAACTACTACAAGAAGTACGGTTACAAGACTATTGTTATGGGTGCTTCTTTCAGA AGCACTGACGAAATCAAAAACTTGGCTGGTGTTGACTATCTAACAATTTCTCCAGCTTTA TTGGACAAGTTGATGAACAGTACTGAACCTTTCCCAAGAGTTTTGGACCCTGTCTCCGCT AAGAAGGAAGCCGGCGACAAGATTTCTTACATCAGCGACGAATCTAAATTCAGATTCGAC TTGAATGAAGACGCTATGGCCACTGAAAAATTGTCCGAAGGTATCAGAAAATTCTCTGCC GATATTGTTACTCTATTCGACTTGATTGAAAAGAAAGTTACCGCTTAA
TAL1 amino acid sequence
SEPAQKKQKVANNS LEQLKASGTV ADTGDFGS IAKFQPQDSTTNPS L I LAAAKQPTY AKL I DVAVEYGK HGKTTEEQVENAVDRLLVEFGKE I LK IVPGRVSTEVDARLS FDTQAT I EKARH I I KLFEQEGVSKERVL I K IASTWEG I QAA E LEEKDG I HCNLTLLFSFVQAVAC AEAQVTL I SPFVGR I LDWYKS STGKDYKGEADPGV I SVKK I Y YYKK YG YKT I VMGASFR STDE I KNLAGVDYLT I SPALLDKLMNSTEPFPRVLDPVSAKKEAGDK I SY I S DES FRF D LNE DAMATEKLSEG I RKFSAD I VTLFDL I EKKVT
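The relationship between a listed nucleotide sequence and its encoded amino acid sequence can be checked by translating the open reading frame with the standard genetic code. The sketch below is illustrative only and is not part of the original disclosure. It builds the standard codon table and translates the first ten codons of the TAL1 nucleotide sequence shown above. The function and constant names are hypothetical.

```python
# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(nucleotides: str, stop_at_stop: bool = True) -> str:
    """Translate a DNA sequence in frame 1, ignoring whitespace.

    Unrecognized codons are reported as 'X'. Translation ends at the
    first stop codon unless `stop_at_stop` is False, in which case
    stop codons are written as '*'.
    """
    seq = "".join(nucleotides.split()).upper()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if amino_acid == "*" and stop_at_stop:
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # First ten codons of the TAL1 nucleotide sequence listed above.
    print(translate("ATGTCTGAACCAGCTCAAAAGAAACAAAAG"))  # prints MSEPAQKKQK
```

Because whitespace is ignored, a sequence can be pasted in with the 60-base block spacing used in the listings.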
Example 38: Nucleotide and Amino Acid Sequence of S. cerevisiae Transketolase
Transketolase (TKL1 and TKL2) activity was increased in some embodiments, and in certain embodiments transketolase activity was decreased or disrupted. Transketolase converts xylulose-5-phosphate and ribose-5-phosphate to sedoheptulose-7-phosphate and glyceraldehyde-3-phosphate. The rationale for increasing or decreasing transketolase activity is described herein with respect to various embodiments. The nucleotide sequence of the S. cerevisiae TKL1 gene altered to increase or decrease transketolase activity, and the encoded amino acid sequence, are shown below.
TKL1 nucleotide sequence
ATGACTCAATTCACTGACATTGATAAGCTAGCCGTCTCCACCATAAGAATTTTGGCTGTG GACACCGTATCCAAGGCCAACTCAGGTCACCCAGGTGCTCCATTGGGTATGGCACCAGCT GCACACGTTCTATGGAGTCAAATGCGCATGAACCCAACCAACCCAGACTGGATCAACAGA GATAGATTTGTCTTGTCTAACGGTCACGCGGTCGCTTTGTTGTATTCTATGCTACATTTG ACTGGTTACGATCTGTCTATTGAAGACTTGAAACAGTTCAGACAGTTGGGTTCCAGAACA CCAGGTCATCCTGAATTTGAGTTGCCAGGTGTTGAAGTTACTACCGGTCCATTAGGTCAA GGTATCTCCAACGCTGTTGGTATGGCCATGGCTCAAGCTAACCTGGCTGCCACTTACAAC AAGCCGGGCTTTACCTTGTCTGACAACTACACCTATGTTTTCTTGGGTGACGGTTGTTTG CAAGAAGGTATTTCTTCAGAAGCTTCCTCCTTGGCTGGTCATTTGAAATTGGGTAACTTG ATTGCCATCTACGATGACAACAAGATCACTATCGATGGTGCTACCAGTATCTCATTCGAT GAAGATGTTGCTAAGAGATACGAAGCCTACGGTTGGGAAGTTTTGTACGTAGAAAATGGT AACGAAGATCTAGCCGGTATTGCCAAGGCTATTGCTCAAGCTAAGTTATCCAAGGACAAA CCAACTTTGATCAAAATGACCACAACCATTGGTTACGGTTCCTTGCATGCCGGCTCTCAC TCTGTGCACGGTGCCCCATTGAAAGCAGATGATGTTAAACAACTAAAGAGCAAATTCGGT TTCAACCCAGACAAGTCCTTTGTTGTTCCACAAGAAGTTTACGACCACTACCAAAAGACA ATTTTAAAGCCAGGTGTCGAAGCCAACAACAAGTGGAACAAGTTGTTCAGCGAATACCAA AAGAAATTCCCAGAATTAGGTGCTGAATTGGCTAGAAGATTGAGCGGCCAACTACCCGCA AATTGGGAATCTAAGTTGCCAACTTACACCGCCAAGGACTCTGCCGTGGCCACTAGAAAA TTATCAGAAACTGTTCTTGAGGATGTTTACAATCAATTGCCAGAGTTGATTGGTGGTTCT GCCGATTTAACACCTTCTAACTTGACCAGATGGAAGGAAGCCCTTGACTTCCAACCTCCT TCTTCCGGTTCAGGTAACTACTCTGGTAGATACATTAGGTACGGTATTAGAGAACACGCT ATGGGTGCCATAATGAACGGTATTTCAGCTTTCGGTGCCAACTACAAACCATACGGTGGT ACTTTCTTGAACTTCGTTTCTTATGCTGCTGGTGCCGTTAGATTGTCCGCTTTGTCTGGC CACCCAGTTATTTGGGTTGCTACACATGACTCTATCGGTGTCGGTGAAGATGGTCCAACA CATCAACCTATTGAAACTTTAGCACACTTCAGATCCCTACCAAACATTCAAGTTTGGAGA CCAGCTGATGGTAACGAAGTTTCTGCCGCCTACAAGAACTCTTTAGAATCCAAGCATACT CCAAGTATCATTGCTTTGTCCAGACAAAACTTGCCACAATTGGAAGGTAGCTCTATTGAA AGCGCTTCTAAGGGTGGTTACGTACTACAAGATGTTGCTAACCCAGATATTATTTTAGTG GCTACTGGTTCCGAAGTGTCTTTGAGTGTTGAAGCTGCTAAGACTTTGGCCGCAAAGAAC ATCAAGGCTCGTGTTGTTTCTCTACCAGATTTCTTCACTTTTGACAAACAACCCCTAGAA TACAGACTATCAGTCTTACCAGACAACGTTCCAATCATGTCTGTTGAAGTTTTGGCTACC ACATGTTGGGGCAAATACGCTCATCAATCCTTCGGTATTGACAGATTTGGTGCCTCCGGT 
AAGGCACCAGAAGTCTTCAAGTTCTTCGGTTTCACCCCAGAAGGTGTTGCTGAAAGAGCT CAAAAGACCATTGCATTCTATAAGGGTGACAAGCTAATTTCTCCTTTGAAAAAAGCTTTC TAA
TKL1 amino acid sequence
MTQFTDI D LAVSTI RI LAVDTVSKANSGHPGAPLGMAPAAHVLWSQMRMNPTNPDWI R DRFVLSNGHAVALLYSMLHLTGYDLSIEDLKQFRQLGSRTPGHPEFELPGVEVTTGPLGQ GI SNAVGMAMAQANLAATYNKPGFTLSD YTYVFLGDGCLQEGI SSEASSLAGHLKLGNL IAI YDDNKITI DGATS ISFDEDVAKRYEAYGWEVLYVENGNEDLAGIAKAIAQAKLSKDK PTLIKMTTTIGYGSLHAGSHSVHGAPLKADDVKQL SKFGFNPDKSFWPQEVYDHYQKT I LKPGVEANN WNKLFSEYQKKFPELGAELARRLSGQLPANWESKLPTYTAKDSAVATRK LSETVLEDVYNQLPELIGGSADLTPSNLTRWKEALDFQPPSSGSGNYSGRYIRYGIREHA MGAI NGISAFGANYKPYGGTFLNFVSYAAGAVRLSALSGHPVIWVATHDSIGVGEDGPT HQPI ETLAHFRSLP IQVWRPADGNEVSAAYKNSLESKHTPS I IALSRQNLPQLEGSSIE SAS GGYVLQDVANPDI I LVATGSEVSLSVEAA TLAAK IKARVVSLPDFFTFDKQPLE YRLSVLPDNVPI SVEVLATTCWG YAHQSFGI DRFGASG APEVF FFGFTPEGVAERA QKTIAFY GD LISPL KAF
Example 39: Nucleotide and Amino Acid Sequences of Additional EDD genes evaluated for activity
Figure imgf000297_0001
Figure imgf000298_0001
Figure imgf000299_0001
YP_261706.1 Pseudomonas fluorescens Pf-5 ATGCATCCCCGCGTTCTTGAGGTCACCGAACGGCTTATCGCCCGTAGTC MHPRVLEVTERLIARSR GCGCCACTCGCCAGGCCTATCTCGCGCTGATCCGCGATGCCGCCAGCG ATRQAYLALIRDAASD
ACGGCCCGCAGCGGGGCAAGCTGCAATGTGCGAACTTCGCCCACGGC GPQRG LQCANFAHGV
GTGGCCGGTTGCGGCACCGACGACAAGCACAACCTGCGGATGATGAA AGCGTDD HNLRMM
TGCGGCCAACGTGGCAATTGTTTCGTCATATAACGACATGTTGTCGGC AANVAIVSSYNDMLSA
GCACCAGCCTTACGAGGTGTTCCCCGAGCAGATCAAGCGCGCCCTGCG HQPYEVFPEQI RALRE
CGAGATCGGCTCGGTGGGCCAGTTCGCCGGCGGCACCCCGGCCATGTG IGSVGQFAGGTPAMCD
CGATGGCGTGACCCAGGGCGAGGCCGGTATGGAACTGAGCCTGCCGA GVTQGEAGMELSLPSR
GCCGTGAAGTGATCGCCCTGTCTACGGCGGTGGCCCTCTCTCACAACA EVIALSTAVALSHN FD
TGTTCGATGCCGCGCTGATGCTGGGGATCTGCGACAAGATTGTCCCGG AALMLGICD IVPGLM
GGTTGATGATGGGCGCTCTGCGCTTCGGTCACCTGCCGACCATCTTCGT MGALRFGHLPTIFVPGG
TCCGGGCGGGCCCATGGTCTCGGGCATTTCCAACAAGCAGAAAGCCGA PMVSGISNKQKADVRQ
CGTGCGCCAGCGTTACGCCGAAGGCAAGGCCAGCCGCGAGGAACTGC RYAEGKASREELLESE
TGGAGTCGGAAATGAAGTCCTACCACAGCCCCGGCACCTGCACTTTCT M KS Y H S PGTCTF YGT A
ACGGCACCGCCAACACCAACCAGTTGCTGATGGAAGTGATGGGCCTGC NTNQLLMEV GLHLPG
ACCTGCCGGGCGCCTCTTTCGTCAACCCCAATACGCCGCTGCGCGACG ASFVNP TPLRDALTHE
CCCTGACCCATGAGGCGGCGCAGCAGGTCACGCGCCTGACCAAGCAG AAQQVTRLTKQSGAFM
AGCGGGGCCTTCATGCCGATTGGCGAGATCGTCGACGAGCGCGTGCTG PIGEIVDERVLVNSIVAL
GTCAACTCCATCGTTGCCCTGCACGCCACGGGCGGCTCCACCAACCAC HATGGSTNHTLHMPAI
ACCCTGCACATGCCGGCCATCGCCCAGGCGGCGGGCATCCAGCTGACC AQAAGIQLTWQD AD
TGGCAGGACATGGCCGACCTCTCCGAGGTGGTGCCGACCCTGTCCCAC LSEVVPTLSHVYPNG
GTCTATCCAAACGGCAAGGCCGATATCAACCACTTCCAGGCGGCGGGC ADINHFQAAGG SFLIR
GGCATGTCTTTCCTGATCCGCGAGCTGCTGGAAGCCGGCCTGCTCCAC ELLEAGLLHEDVNTVA
GAAGACGTCAATACCGTGGCCGGCCGCGGCCTGAGCCGCTATACCCAG GRGLSRYTQEPFLDNG
GAACCCTTCCTGGACAACGGCAAGCTGGTGTGGCGCGACGGCCCGATT KLVWRDGPIESLDENIL
GAAAGCCTGGACGAAAACATCCTGCGCCCGGTGGCCCGGGCGTTCTCT RPVARAFSAEGGLRVM
GCGGAGGGCGGCTTGCGGGTCATGGAAGGCAACCTCGGTCGCGGCGT EGNLGRGV KVSAVAP
GATGAAGGTTTCCGCCGTGGCCCCGGAGCACCAGATCGTCGAGGCCCC EHQIVEAPAVVFQDQQ
GGCCGTGGTGTTCCAGGACCAGCAGGACCTGGCCGATGCCTTCAAGGC DLADAF AGLLE DFV
CGGCCTGCTGGAGAAGGACTTCGTCGCGGTGATGCGCTTCCAGGGCCC AVMRFQGPRSNGMPEL
GCGCTCCAACGGCATGCCCGAGCTGCACAAGATGACCCCCTTCCTCGG H MTPFLGVLQDRGF
GGTGCTGCAGGACCGCGGCTTCAAGGTGGCGCTGGTCACCGACGGGCG VALVTDGRMSGASGKJ
CATGTCCGGCGCTTCGGGCAAGATTCCGGCAGCGATCCATGTCAGCCC VALVTDGRMSGASGKI
CGAAGCCCAGGTGGGTGGCGCGCTGGCCCGGGTGCTGGACGGCGATA ARVLDGDIIRVDGVKG
TCATCCGAGTGGATGGCGTCAAGGGCACCCTGGAGCTTAAGGTAGACG TLEL VDAAEFAAREP
CCGCAGAATTCGCCGCCCGGGAGCCGGCCAAGGGCCTGCTGGGCAAC A GLLGNNVGTGRELF
AACGTTGGCACCGGCCGCGAACTCTTCGCCTTCATGCGCATGGCCTTC AFMR AFSSAEQGASA
AGCTCGGCAGAGCAGGGCGCCAGCGCCTTTACCTCTGCCCTGGAGACG FTSALETL
CTCAAGTGA
Figure imgf000301_0001
Figure imgf000302_0001
Figure imgf000303_0001
Figure imgf000304_0001
ATGTATACCGCTAACTCCATGAACTGCCTCACTGAGGTACTGGGTATG TANSMNCLTEVLGMGL
GGTCTCAGAGGCAACGGCACTATCCCTGCTGTTTACTCCGAGCGTATC RGNGTIPAVYSERI LA
AAGCTTGCAAAGCAGGCAGGTATGCAGGTTATGGAACTCTACAGAAA QAGMQV ELYR NI
GAATATCCGCCCTCTCGATATCATGACAGAGAAGGCTTTCCAGAACGC RPLDI TE AFQNALTA
TCTCACAGCTGATATGGCTCTTGGATGTTCCACAAACAGTATGCTCCAT DMALGCSTNSMLHLPA
CTCCCTGCTATCGCCAACGAATGCGGCATAAATATCAACCTTGACATG IANECGININLDMANEIS
GCTAACGAGATAAGCGCCAAGACTCCTAACCTCTGCCATCTTGCACCG A TPNLCHLAPAGHTY
GCAGGCCACACCTACATGGAAGACCTCAACGAAGCAGGCGGAGTTTA MEDLNEAGGVYAVLN
TGCAGTTCTCAACGAGCTGAGCAAAAAGGGACTTATCAACACCGACTG ELS GLI NTDC TVT
CATGACTGTTACAGGCAAGACCGTAGGCGAGAATATCAAGGGCTGCAT G TVGENI GCINRDPE
CAACCGTGACCCTGAGACTATCCGTCCTATCGACAACCCATACAGTGA TIRPIDNPYSETGGIAVL
AACAGGCGGAATCGCCGTACTCAAGGGCAATCTTGCTCCCGACAGATG KGNLAPDRCVV RSAV
TGTTGTGAAGAGAAGCGCAGTTGCTCCCGAAATGCTGGTACACAAAGG APEMLVHKGPARVFDS
CCCTGCAAGAGTATTCGACAGCGAGGAAGAAGCTATCAAGGTCATCTA EEEAIKVIYEGGIKAGD
TGAGGGCGGTATCAAGGCAGGCGACGTTGTTGTTATCCGTTACGAAGG VVVIRYEGPAGGPGMR
CCCTGCAGGCGGCCCCGGCATGAGAGAAATGCTCTCTCCTACATCAGC EMLSPTSAIQGAGLGST
TATACAGGGTGCAGGTCTCGGCTCAACTGTTGCTCTAATCACTGACGG VALITDGRFSGATRGAA
ACGTTTCAGCGGCGCTACCCGTGGTGCGGCTATCGGACACGTATCCCC IGHVSPEAVNGGTIAYV
CGAAGCTGTAAACGGCGGTACTATCGCATATGTCAAGGACGGCGATAT KDGDIISIDIPNYSITLEV
TATCTCCATCGACATACCGAATTACTCCATCACTCTTGAAGTATCCGAC SDEELAERK AMPI R
GAGGAGCTTGCAGAGCGCAAAAAGGCAATGCCTATCAAGCGCAAGGA ENITGYL RYAQQVS
GAACATCACAGGCTATCTGAAGCGCTATGCACAGCAGGTATCATCCGC SADKGAIINRK
AGACAAGGGCGCTATCATCAACAGGAAATAG
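The paired nucleotide and amino acid columns above can be cross-checked by translating a coding sequence and comparing the result with the listed protein sequence. The following is a minimal sketch, assuming the standard genetic code applies to these coding sequences; the function name `translate` is illustrative only and does not come from this document.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Final ten codons of the last coding sequence listed above.
print(translate("GACAAGGGCGCTATCATCAACAGGAAATAG"))  # -> DKGAIINRK
```

A mismatch between the translated and listed residues would most likely point to a transcription error in one of the two columns rather than to a real sequence difference.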
Example 40: Unique 200-mer nucleotide sequences used for integration constructs.
200-mer
Number Sequence
CACGCACGGACCGACCGTCACCGGACCGTTTCGCGCGACGTGCGCGAGGCTCCGACACGAAA GACGGGCCCCCTATTGCGCTCATGTCGGCCGCACCCCTGCGTAAAGTCAGATACGTGCGCCA CCCGAGCCGGGACCGCCCTGAGCGCATGGTCCGGGCGGCGTGGCAAGCGCAGGAGGGCGTGC
30 CCCGTTCGCTAGGCA
ACGTATGTCGGCTGATCGTACACGCCGACCAGCGCAGTCGGCGTACTCAGGCGTTCCGAGTA GCTCACATCTGTGGGCCCCGGCGTACCTTCGGCAGGGTTATGCGACGGGGCGGCAGGCTTGC GCTGGCGTCGGGAATCACCGCGAACTTGACCCGCGCCGGTTCCGTATCGGTCCGCTGCGGCC
44 GTGCTCCGCAGTCGA
TGCAGTCCGCCCAGCCGGCCGTGTAGCACGGCCGACTGCAGGTGCGACGTGCTAGGGGCCAG CACGCGAGCGGCCCTACCACGGGTCGTGTGGGGCGCATGACCGCCGGCCGGGTCTCGGCACG GGGCGACGCGGTGCTCCTAGGCTAGCAGGGCCTCACCGGGTGATCCCCCGTGTAGCGCCGCA
4 5 CAACACCCCCTGCGA
TGCCCGCATACCGCCCGCCCACTGGGGATCCTCCGGCGCTGTCGCGCTATGCGCGTCCATCC TGGTCGGACGGGCTCGGGCCCCGGACCAAACCGCAGCGGCCCCTGGCAGCGACTAAGGGCGC CGTCTCACCCTAGACTTCTTAATCGGGGTGTCCCGGTAGGCCGGGAGTAGCCTCGGCGGGCT
4 9 AGCCGCGTGACTATA
GCGGGTTAGTCCCCGTCGGACGTCATGCATACAGTCGGGGCTGGCGAGACAGGAGGCTACAG GGGGCGCCCGGAGGAACACACGTGGGACTAAGACGTCGGTCCGTGTGCCCCCGAACCGGCGT GCTCATCGTAGGACTGGGAAGTCCGTACCGCGTGGCTCGTACCTCGCGGTCTGAGTCCGACA
78 CCCGCTGACGCCGGA
CTGAGACGACTCCCGCACTACGGATCGCGAGCGTAGACTCAGCCCGGACTCTCACGCGACCT CGGACGCGGCCTAATGTCTCGACTCGCGGTCCGCTGAAGGTCTCGGGGCACGCGAGACGCGG GGTCAGGCCGGGGGGATCCCCGCACACACTCAGTCGCGGCGAACGGAGTCCCGTGGCCTGGC
1 3 TAGGGATCGTGGGTA
GGGGCGTCCACTCTGGCTCGGTAGAGCGCTGGGCTCCGCGCGACTGCGCGCACCCATCGGTT TGGCGCGACGCACCGTGGACTCCTGGGCTAGAGGGCGGGTCCCCGCCATACCCCGTTCTCGT GCCGGCTGGGTAGGACCGGAGTGACGGCTGTGGCCGGCGACTCGGGCGCGCACTGTAGTCGG
60 ATCTGGGCGGGCAGA
GTCGGGCGCGCGTCAGTCCACGCGTTAAACACTGGCCGACGACACGACGGGATCCGGGCACG CCCCGAGAGCGCGTGTTCGCGCGAGTCGATCGGGAGGCCGCAGCGTGTCGAGCCCAGACCCC GCTCTAGCGTGGCCATCGCGGTGCTAAGTGGGGCGGCCGGGTCCTATACACGCTTACCGATA
92 GTCAAGTTTGCGTGA
GTCTTAGGGCCCAGGGACCGCACGGGTCGACCGCGCGACTGGTCGGAGCTTGCGCGTCTACG CCACTCGGCGGCCCCGACGGGGGATGCCGCGGAATGTCCGCCGGCGTATGCGGCTCAAGCCG GACCGTCGGACTGCGAAGCGCCGTGAGCACCCCTCGACCTGACCGGACGCGGCGCACCCGTC
1 27 CGAGTATCGTCGCGA
TCGGGTCTCGCCCGGCGCTAGTCCAGCCGTAGCGCTCTCCGGCGATCACCCCGGAGCACTCT GGAGCCGAGCGGTCGGGTCTGTTGGGCGCGCCGCGGCTACGGACGGCTCGACTCAGTGGCGC TCGACCCCGTATCCCCCGTCTCGGACGACGCACCGTTGCGCGGGAACGATCGGCGGCGCTCA
1 53 CACGCACGATCGGAA
CTTAAGGCTGGCGCACCATGAGGGCCGCGCCACGTCCGACCCGCAGCCCGCGCGTAGTAGCC TAGCCGGGCGGGGTTCCTCCCGTGCGTCACCTAGCACGGGGCCTGGCACCGAACGCGAGCCC GTCCGGTCACCGCGGCGGGTCTGCGGACGTCCCCGGTCGCTCGGCTCGGAGTCCCCGCTGGG
299 GATCGCGTCGGGACA
CGACGGCGTAGCACTCGCGGACCTAGGGCGCGCGAGTCGGGGGAGCCCGCGGTGCGACGCTC GGGGAGGAGCTCGCATGCCCAAGGCACGATCTAGGGGGGGGTACGGGGGGCGTCCGTCCGAG CGCCGGGACTGCGATCCGGGGCCACATGCTAACCGGCGGAAGGGGGGACCTAACCGGTGTGG
31 7 ACTCCGGGTAATCCA
CGGGGGGCTGACACGTCTCGGATCGCCCCGTCAGTCAGCCCCCTAGTCCCGGACAGGACGTC GGAGGTCGAGTCCGCACTGTCGGGCCTGCTCGTGGGCACGGCAGGACGCGTCCCCATGGTCA
31 9 GCCGCCGTGCGATACCTCGCCACGACTCTGAGCCGGGCGCGAGCGTGAGAGCCCGAGCCGCG GTACACGGGGCGTCA
GCGAGCTCGCTCTCGACTCCGGGCTCCCGTGCTGACACGGGGTGCGACCCCGCGGCGATTGT CCGCACGCCTGTCGGACGACGTCGGCCCGTCGTAGTGCCGGTCAGAGGCAGGGGGGCTGCTC GCGCTGGCCGCCTCGTCGCGCGTGGACCCTATGGGGGATCACGCGTGGGGTCGGGATCGGGG
529 ACCGCGCGACTTGGA
CGCGCCCCGTAACGGACGCGGTGAGTCGAGCTTACGCGGCTAGGGCCGAGTCGTGTTAGCGT CTCGCGTAAGCGAATGCCACGTCCCCCGCCGCCCGTCGCGCAGCTGGCTACGCAACGCCTCC GCGGCCTCCGTAGCGAGTGCGTGGGACGCTGGCCGTCCGCGTGTTCCGGGACCTGGATGCGG
6 51 GAGGGACCTAAGGCA
AGAACGTGCGGTCGTCCCCACGCACGGGATGACGGACGGGGTAGACGGGCGTCGTGCGCGCG GGTAGCGTAACCGGTTACAGTCCCCGCAACGCTCTAGCTCCGGCCCTCGCTTAGGAGTTCGC GGCCGAGACATGAGGTGGTCCGGACGGCAGGGGGTCGCGGAGACCGTGGAGCCGATTCTGCC
6 77 GGACGCCACGTCCCA
CGGGACGCCCCGTACCGTGTACGAAGCCCCGGTCGGTCGGCGGATCGTAGATCCCGGAGCCG ACGCCTTGAACCCGGCTTTCCCAGCGACTCGCGCCCCCACTGGGTCCCTCGGGACCCCGCTC CCCCCAGACGCATACAGCCCGCAAGCGGGGGCAGTCTCGGACCGCCCGGACACTGGCCTTAG
708 GCACCGTGGGCTCGA
GTGTCCGGGGCGCATCGGAGCTGTCCGACCGAGTTCCGGGGACGGCGCACGTTGTGCCGGCC TCAGACGGAGCCTGTAGCCCCCGGACAGTGTGTGCCCGCCCACTACGGGTTAGGCACGGGGT TGGTCGGCACGCGTCCTCCGCGTGTCACGGACCGATGCAGACCGCTGGCCGGGAGGTCGCCC
71 7 CCCCAGGGGTGCACA
CGCGCAGCACGCACGTCCGGGGCACGCGCGGCTCGGAGGGTCCGGGCTGGGACGGGAGGTTT GGAGTCGCGTGCGCGTAGCAGCGCACCCGCCTGGTCGCCGGGTCTAGTAGGGCTGGGTTACG GAGGACGTGCAGGCGACCCCAACCGTTGACGACGGGTCCGACCACGCCTTTAGCCGTGGCGT
71 9 GTCCGTCGCGAGCCA
Example 41: Evaluation of additional Xylose Isomerase genes.
As noted above, additional xylose isomerase genes were identified and isolated, and chimeric versions were generated in certain embodiments. Presented below are the results of activity assays of three candidate xylose isomerase genes, from R. flavefaciens FD-1, Ruminococcus 18P13, and Clostridiales genomosp., when expressed in S. cerevisiae. The candidate xylose isomerase enzymes (XIs) were assayed as total soluble crude extracts (prepared as described herein in YPER-PLUS reagent and quantified with the Coomassie Plus kit). 100 μg of each candidate XI extract was compared alongside the original XI-R (e.g., Ruminococcus xylose isomerase) native construct. The Clostridiales enzyme was further characterized at 1974 μg to confirm the presence of activity. The results in this experiment are presented as the slope of the activity at saturating xylose concentrations (500 mM). The results are shown in FIG. 25. The R. flavefaciens FD-1 XI shows the highest activity of the candidates tested, and shows higher activity than the XI-R native construct used as a control.
The two candidate genes from Ruminococcus strains were further characterized in an enzyme assay using xylose concentrations between 40 and 500 mM. Michaelis-Menten plots were generated for each candidate in order to determine Km and specific activity levels as compared to the native XI-R enzyme. The results are presented in the table below. The R. flavefaciens FD-1 XI shows a specific activity more than 2-fold greater than that of the native XI control.
Enzyme Source           Km          Specific Activity (μmol min⁻¹ mg⁻¹)
XI-R native             42.57 mM    0.9605
R. flavefaciens FD-1    71.91 mM    2.3045
Ruminococcus 18P13      65.11 mM    0.20448
Example 42: Nucleotide Sequence Alignment of New Xylose Isomerase Activities
Three additional XI candidates were identified from a BLAST database analysis of our XI-R gene against the GenBank database. Each of these new candidates was compared at both the nucleotide and amino acid level to the XIs previously shown to function in S. cerevisiae. The results of the alignment are presented in the table below. The closest homolog was identified as Clostridium phytofermentans with between 64.4% and 68.3% identity to our 3 newly identified XI candidates.
Figure imgf000309_0001
The activities of certain xylose isomerase genes are presented in Example 41.
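For the kinetic comparison reported in Example 41, the Michaelis-Menten relationship and the fold change in specific activity can be sketched as follows. This is only an illustration: the Km and specific activity values are taken from the Example 41 table, the substrate level of 500 is the highest xylose concentration assayed, Km and substrate are assumed to share one concentration unit, and the function name is hypothetical.

```python
def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Rate v = Vmax * [S] / (Km + [S]); s and km must share one unit."""
    return vmax * s / (km + s)


# (Km, specific activity) pairs from the table in Example 41.
enzymes = {
    "XI-R native": (42.57, 0.9605),
    "R. flavefaciens FD-1": (71.91, 2.3045),
}

native_km, native_sa = enzymes["XI-R native"]
fd1_km, fd1_sa = enzymes["R. flavefaciens FD-1"]

# Ratio of specific activities, FD-1 over the native control.
fold = fd1_sa / native_sa
print(f"FD-1 vs native specific activity: {fold:.2f}-fold")

# Fraction of Vmax reached at the highest xylose level assayed (500).
for name, (km, _sa) in enzymes.items():
    frac = michaelis_menten(500.0, 1.0, km)
    print(f"{name}: {frac:.0%} of Vmax at [S] = 500")
```

The ratio of the two tabulated specific activities is about 2.4, in line with the statement that the FD-1 enzyme has more than twice the specific activity of the native control.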
Example 43: Additional Nucleotide and Amino Acid Sequences of Activities Altered in Engineered Organisms Described Herein
Loss of function mutants (e.g., gnd1/gnd2, gpd1/gpd2, tps1, nth1, fps1, pho13, pfk1/pfk2, and others described herein) were generated using knockout cassettes that disrupted or completely deleted the various genomic coding sequences. In some embodiments, a knockout cassette also inserts coding sequences for one or more endogenous or exogenous activities, in the course of disrupting or completely deleting a genomic coding sequence. Non-limiting examples of coding sequences that can be utilized in knockout cassettes as expressible insertion sequences include: a ura' gene; a xylose isomerase (XI) gene (e.g., native, codon optimized or chimeric); a xylose reductase (XR) gene (e.g., native, codon optimized or chimeric); a xylitol dehydrogenase (XD) gene (e.g., native, codon optimized or chimeric); a xylulokinase (XK) gene (e.g., native, codon optimized or chimeric); a P. aeruginosa EDD gene (e.g., native or codon optimized); an E. coli EDA gene (e.g., native or codon optimized); a S. cerevisiae PDC gene; a S. cerevisiae ADH1 gene; a S. cerevisiae ZWF1 gene; a P. aeruginosa ZWF1 gene (e.g., native or codon optimized); a S. cerevisiae SOL3 gene; a S. cerevisiae SOL4 gene; or combinations thereof. Gain of function mutants (e.g., introduction of a novel activity (e.g., EDD activity, EDA activity, exogenous PFK1 activity) or increasing the level of a native activity) were generated using substantially similar insertion vectors and/or using constitutive or other strong promoters. Nucleotide and/or amino acid sequences for altered activities are presented in the table below.
In the table of sequences included below, coding sequences are presented in upper case font, while upstream and/or downstream sequences (e.g., non-coding sequences, plasmid sequences, restriction enzyme recognition sequences, targeting sequences, genetic tags, affinity tags, linkers, the like, and combinations thereof) are shown in lower case font with underlining, unless otherwise explained in the table's description column. In some coding sequences, the first codon of the coding sequence is presented in boldface font. For amino acid sequences, the entire sequence is presented in upper case font. The coding sequences presented for GND1 and GND2 also were presented in Example 36. The GND1 and GND2 sequences presented in this example include approximately 300 nucleotides of upstream and downstream sequences, as shown by the lower case font and underlining. Use of the sequences presented in the table below is further described herein.
Figure imgf000311_0001
Figure imgf000312_0001
Figure imgf000313_0001
Figure imgf000314_0001
Figure imgf000315_0001
Figure imgf000316_0001
Figure imgf000317_0001
Figure imgf000318_0001
Figure imgf000319_0001
Figure imgf000320_0001
Figure imgf000321_0001
Figure imgf000322_0001
Figure imgf000323_0001
Figure imgf000324_0001
CATTCATCTCAATGTGATCGTTTATTATTGAAATCTCCTATTTTGCATTGGAATGAGTTC
CAAGCTTTGAAAAACATTGAAGCTGCTTACCCATCATGGTCTGTAGCAGAAATTGATATC
ACATTCGACAAGAGTGAGGGTCTATTGGGCTATACCGACACAATTGATAAAATCACTAAG
TTAGCGAGCGAAGCAATTGATGATGGTAAAAAGATCTTAATAATTACTGACAGGAAAATG
GGTGCCAACCGTGTTTCCATCTCCTCTTTGATTGCAATTTCATGTATTCATCATCACCTA
ATCAGAAACAAGCAGCGTTCCCAAGTTGCTTTGATTTTGGAAACAGGTGAAGCCAGAGAA
ATTCACCATTTCTGTGTCCTACTAGGTTATGGTTGTGATGGTGTTTATCCATACTTAGCC
ATGGAAACTTTGGTCAGAATGAATAGAGAAGGTCTACTTCGTAATGTCAACAATGACAAT
GATACACTTGAGGAAGGGCAAATACTAGAAAATTACAAGCACGCTATTGATGCAGGTATC
TTGAAGGTTATGTCTAAAATGGGTATCTCCACTCTAGCATCCTACAAAGGTGCTCAAATT
TTTGAAGCCCTAGGTTTAGATAACTCTATTGTTGATTTGTGTTTCACAGGTACTTCTTCC
AGAATTAGAGGTGTAACTTTCGAGTATTTGGCTCAAGATGCCTTTTCTTTACATGAGCGT
GGTTATCCATCCAGACAAACCATTAGTAAATCTGTTAACTTACCAGAAAGTGGTGAATAC
CACTTTAGGGATGGTGGTTACAAACACGTCAACGAACCAACCGCAATTGCTTCGTTACAA
GATACTGTCAGAAACAAAAATGATGTCTCTTGGCAATTATATGTAAAGAAGGAAATGGAA
GCAATTAGAGACTGTACACTAAGAGGACTGTTAGAATTAGATTTTGAAAATTCTGTCAGT
ATCCCTCTAGAACAAGTTGAACCATGGACTGAAATTGCCAGAAGATTTGCGTCAGGTGCA
ATGTCTTATGGTTCTATTTCTATGGAAGCTCACTCTACATTGGCTATTGCCATGAATCGT
TTAGGGGCCAAATCCAATTGTGGTGAAGGTGGTGAAGACGCAGAACGTTCTGCTGTTCAA
GAAAACGGTGATACTATGAGATCTGCTATCAAACAAGTTGCTTCCGCTAGATTCGGTGTA
ACTTCATACTACTTGTCAGATGCTGATGAAATCCAAATTAAGATTGCTCAGGGTGCTAAG
CCGGGTGAAGGTGGTGAACTACCAGCCCACAAAGTGTCTAAGGATATCGCAAAAACCAGG
CACTCCACCCCTAATGTTGGGTTAATCTCTCCTCCTCCTCATCACGATATTTATTCCATT
GAAGATTTGAAACAACTGATTTATGATTTGAAATGTGCTAATCCAAGAGCGGGAATTTCT
GTAAAGTTGGTTTCCGAAGTTGGTGTTGGTATTGTTGCCTCTGGTGTAGCTAAGGCTAAA
GCCGATCATATCTTAGTTTCTGGTCATGATGGTGGTACAGGTGCTGCAAGATGGACGAGT
GTCAAATATGCGGGTTTGCCATGGGAATTAGGTCTAGCTGAAACTCACCAGACTTTAGTC
TTGAATGATTTAAGACGTAATGTTGTTGTCCAAACCGATGGTCAATTGAGAACTGGGTTT
GATATTGCTGTTGCAGTTTTATTAGGGGCAGAATCTTTTACCTTGGCAACAGTTCCATTA
ATTGCTATGGGTTGTGTTATGTTAAGAAGATGTCACTTGAACTCTTGTGCTGTTGGTATT
GCCACACAAGATCCATATTTGAGAAGTAAGTTTAAGGGTCAGCCCGAACATGTTATCAAC
TTCTTCTATTACTTGATCCAAGATTTAAGACAAATCATGGCCAAGTTAGGATTCCGTACC
ATTGACGAAATGGTGGGTCATTCTGAAAAATTAAAGAAAAGGGACGACGTAAATGCCAAA
GCCATAAATATCGATTTATCTCCTATTTTGACCCCAGCACATGTTATTCGTCCAGGTGTT
CCAACCAAGTTCACTAAGAAACAAGACCACAAACTCCACACCCGTCTAGATAATAAGTTA
ATCGATGAGGCTGAAGTTACTTTGGATCGTGGCTTACCAGTGAATATTGACGCCTCTATA
ATCAATACTGATCGTGCACTCGGTTCTACTTTATCTTACAGAGTCTCGAAGAAATTTGGT
GAAGATGGTTTGCCAAAGGACACCGTTGTCGTTAACATAGAAGGTTCAGCGGGTCAATCT
TTTGGTGCTTTCCTAGCTTCTGGTATCACTTTTATCTTGAATGGTGATGCTAATGATTAT
GTTGGTAAAGGTTTATCCGGTGGTATTATTGTCATTAAACCACCAAAGGATTCTAAATTC
AAGAGTGATGAAAATGTAATTGTTGGTAACACTTGTTTCTATGGTGCTACTTCTGGTACT
GCATTCATTTCAGGTAGTGCCGGTGAGCGTTTCGGTGTCAGAAACTCTGGTGCCACCATC
GTTGTTGAGAGAATTAAGGGTAACAATGCCTTTGAGTATATGACTGGTGGTCGTGCCATT
GTCTTATCACAAATGGAATCCCTAAACGCCTTCTCTGGTGCTACTGGTGGTATTGCATAC
TGTTTAACTTCCGATTACGACGATTTTGTTGGAAAGATTAACAAAGATACTGTTGAGTTA
GAATCATTATGTGACCCGGTCGAGATTGCGTTTGTTAAGAATTTGATCCAGGAGCATTGG
AACTACACACAATCTGATCTAGCAGCCAGGATTCTCGGTAATTTCAACCATTATTTGAAA
GATTTCGTTAAAGTCATTCCAACTGATTATAAGAAAGTTTTGTTGAAGGAGAAAGCAGAA
GCTGCCAAGGCAAAGGCTAAGGCAACTTCAGAATACTTAAAGAAGTTTAGATCGAACCAA
GAAGTTGATGACGAAGTCAATACTCTATTGATTGCTAATCAAAAAGCTAAAGAGCAAGAA
AAAAAGAAGAGTATTACTATTTCAAATAAGGCCACTTTGAAGGAGCCTAAGGTTGTTGAT
TTAGAAGATGCAGTTCCAGATTCCAAACAGCTAGAGAAGAATAGCGAAAGGATTGAAAAA
ACACGTGGTTTTATGATCCACAAACGTCGTCATGAGACACACAGAGATCCAAGAACCAGA
GTTAATGACTGGAAAGAATTTACTAACCCTATTACCAAGAAGGATGCCAAATATCAAACT
GCGAGATGTATGGATTGTGGTACACCATTCTGTTTATCTGATACCGGTTGTCCCCTATCT
AACATTATCCCCAAGTTTAATGAATTGTTATTCAAGAACCAATGGAAGTTGGCACTGGAC
AAATTGCTAGAGACAAACAATTTCCCAGAATTCACTGGAAGAGTATGTCCAGCACCCTGT
GAGGGAGCTTGTACACTAGGTATTATTGAAGACCCAGTCGGCATAAAATCGGTTGAAAGA
ATTATCATTGACAATGCTTTCAAGGAAGGATGGATTAAGCCTTGTCCACCAAGTACACGC
ACTGGCTTTACAGTGGGTGTCATTGGTTCTGGTCCAGCAGGTTTAGCGTGTGCTGATATG
TTGAACCGTGCCGGACATACGGTCACTGTTTATGAAAGATCCGACCGTTGTGGTGGGTTA
TTGATGTATGGTATTCCAAACATGAAGTTGGATAAGGCTATAGTGCAACGTCGTATTGAT
CTATTGAGTGCCGAAGGTATTGACTTTGTTACCAACACCGAAATTGGTAAAACCATAAGC
ATGGATGAGCTAAAGAACAAGCACAATGCAGTAGTGTATGCTATCGGTTCTACCATTCCA
CGTGACTTACCTATTAAGGGTCGTGAATTGAAGAATATTGATTTTGCCATGCAGTTGTTG
GAATCTAACACAAAAGCTTTATTGAACAAAGATCTGGAAATCATTCGTGAAAAGATCCAA
GGTAAGAAAGTAATTGTTGTCGGTGGTGGTGACACAGGTAACGATTGTTTAGGTACATCT
GTAAGACACGGTGCAGCATCAGTTTTGAATTTCGAATTGTTGCCTGAGCCACCAGTGGAA
CGTGCCAAAGACAATCCATGGCCTCAATGGCCGCGTGTCATGAGAGTGGACTACGGTCAT
GCTGAAGTGAAAGAGCATTATGGTAGAGACCCTCGTGAATACTGCATCTTGTCCAAGGAA
TTTATCGGTAACGATGAGGGTGAAGTCACTGCCATCAGAACTGTGCGCGTAGAATGGAAG
AAGTCACAAAGTGGCGTATGGCAAATGGTAGAAATTCCCAACAGTGAAGAGATCTTTGAA
GCCGATATCATTTTGTTGTCTATGGGTTTCGTGGGTCCTGAATTGATCAATGGCAACGAT
AACGAAGTTAAGAAGACAAGACGTGGTACGATTGCCACACTCGACGACTCCTCATACTCT
ATTGATGGAGGAAAGACTTTTGCATGTGGTGACTGTAGAAGAGGGCAATCTTTGATTGTC
TGGGCCATCCAAGAAGGTAGAAAATGTGCTGCCTCTGTCGATAAGTTCCTAATGGACGGC
ACTACGTATCTACCAAGTAATGGTGGTATCGTTCAACGTGATTACAAACTATTGAAAGAA
TTAGCTAGTCAAGTCTAA
Figure imgf000327_0001
coding sequence_CYC1 terminator. TPS coding sequence in upper case font)
ggagtaaatgatgacacaaggcaattgacccacgcatgtatctatctcattttcttacaccttctattaccttctgctctctctgatttggaaaaagct
gaaaaaaaaggttgaaaccagttccctgaaattattcccctacttgactaataagtatataaagacggtaggtattgattgtaattctgtaaatct
atttcttaaacttcttaaattctacttttatagttagtcttttttttagttttaaaacaccagaacttagtttcgacggattATGACCACCACTGCC
CAAGACAATTCTCCAAAGAAGAGACAGCGTATCATCAATTGTGTCACGCAGCTGCCCTACAAAA
TCCAATTGGGAGAAAGCAACGATGACTGGAAAATATCTGCTACTACAGGTAACAGCGCATTATTT
TCCTCTCTAGAATACCTTCAATTTGATTCTACCGAGTACGAGCAACACGTTGTTGGTTGGACCGG
CGAAATAACAAGAACCGAACGCAACCTGTTTACTAGAGAAGCGAAAGAAAAACCACAGGATCTG
GACGATGACCCACTATATTTAACAAAAGAGCAGATCAATGGGTTGACTACTACTCTACAAGATCA
TATGAAATCTGATAAAGAGGCAAAGACCGATACTACTCAAACAGCTCCCGTTACCAATAACGTTC
ATCCCGTTTGGCTACTTAGAAAAAACCAGAGTAGATGGAGAAATTACGCGGAAAAAGTAATTTG
GCCAACCTTCCACTACATCTTGAATCCTTCAAATGAAGGTGAACAAGAAAAAAACTGGTGGTAC
GACTACGTCAAGTTTAACGAAGCTTATGCACAAAAAATCGGGGAAGTTTACAGGAAGGGTGACA
TCATCTGGATCCATGACTACTACCTACTGTTATTGCCTCAACTACTGAGAATGAAATTTAACGAC
GAATCTATCATTATTGGTTATTTCCATCATGCCCCATGGCCTAGTAATGAATATTTTCGTTGTTTG
CCACGTAGAAAACAAATCTTAGATGGTCTTGTTGGGGCCAATAGAATTTGTTTCCAAAATGAATC
TTTCTCCCGTCATTTTGTATCGAGTTGTAAAAGATTACTCGACGCAACCGCCAAAAAATCTAAAA
ACTCTTCCAATAGTGATCAATATCAAGTCTCTGTGTACGGTGGTGACGTACTCGTAGATTCTTTG
CCTATAGGTGTTAACACAACTCAAATACTAAAAGATGCTTTCACGAAGGATATAGATTCCAAGGT
TCTTTCCATCAAGCAAGCTTATCAAAACAAAAAAATTATTATTGGTAGAGATCGTCTGGATTCCGT
CAGAGGCGTCGTTCAAAAATTAAGAGCTTTCGAAACTTTCTTGGCCATGTATCCAGAATGGCGA
GATCAAGTGGTATTGATCCAAGTCAGCAGTCCTACTGCCAACAGAAATTCCCCCCAAACTATCA
GATTGGAACAACAAGTCAACGAGTTGGTTAACTCCATAAATTCTGAATACGGTAATTTGAATTTTT
CTCCCGTCCAGCATTACTATATGAGAATCCCTAAAGATGTATACTTGTCCTTACTAAGAGTTGCA
GACTTATGTTTAATCACAAGTGTTAGAGACGGTATGAATACCACTGCTTTGGAATACGTCACTGT
CAAATCGCACATGTCGAACTTTTTATGCTACGGAAATCCATTGATCTTAAGTGAGTTTTCTGGCT
CTAGTAACGTATTGAAAGATGCCATTGTGGTTAACCCATGGGATTCGGTGGCCGTGGCTAAATC
TATTAACATGGCTTTGAAATTGGACAAGGAAGAAAAGTCCAATTTAGAATCAAAATTATGGAAAG
AAGTTCCTACAATTCAAGATTGGACTAATAAGTTTTTGAGTTCATTAAAGGAACAGGCGTCATCT
AATGATGATATGGAAAGGAAAATGACTCCAGCACTTAATAGACCTGTTCTTTTAGAAAATTACAA
GCAGGCTAAGCGTAGATTGTTCCTTTTTGATTACGATGGTACTTTGACCCCAATTGTCAAAGACC
CAGCTGCAGCTATTCCATCGGCAAGACTTTATACAATTCTACAAAAATTATGTGCTGATCCTCAT
AATCAAATCTGGATTATTTCTGGTCGTGACCAGAAGTTTTTGAACAAGTGGTTAGGCGGTAAACT
TCCTCAACTGGGTCTAAGTGCGGAGCATGGATGTTTCATGAAAGATGTTTCTTGCCAAGATTGG
GTCAATTTGACCGAAAAAGTTGATATGTCTTGGCAAGTACGCGTCAATGAAGTGATGGAAGAATT
TACCACAAGGACCCCAGGTTCATTCATCGAAAGAAAGAAAGTCGCTCTAACTTGGCATTATAGA
CGTACCGTTCCAGAATTGGGTGAATTCCACGCCAAAGAACTGAAAGAAAAATTGTTATCATTTAC
TGATGACTTTGATTTAGAGGTCATGGATGGTAAAGCAAACATTGAAGTTCGTCCAAGATTCGTCA
ACAAAGGTGAAATAGTCAAGAGACTAGTCTGGCATCAACATGGCAAACCACAGGACATGTTGAA
GGGAATCAGTGAAAAACTACCTAAGGATGAAATGCCTGATTTTGTATTATGTTTGGGTGATGACT
TCACTGACGAAGACATGTTTAGACAGTTGAATACCATTGAAACTTGTTGGAAAGAAAAATATCCT
Figure imgf000329_0001
Figure imgf000330_0001
Figure imgf000331_0001
GPPVAADRVLASLQGVEAIDAILSLTPETPSPMIALNENKITRKPLVESVALTKKVADAIGNKDFAEAM
RLRNPEFVEQLQGFLLTNSADKDRPQEPAKDPLRVAIVCTGAPAGG NAAIRSAVLYGLARGHQMF
AIHNGWSGLVKNGDDAVRELTWLEVEPLCQKGGCEIGTNRSLPECDLG IAYHFQRQRFDGLIVIG
GFEAFRALNQLDDARHAYPALRIP VGIPATISNNVPGTDYSLGADTCLNSLVQYCDVLKTSASATRL
RLFVVEVQGGNSGYIATVAGLITGAYVVYTPESGINLRLLQHDISYLKDTFAHQADVNRTGKLLLRNE
RSSNVFTTDVITGIINEEAKGSFDARTAIPGHVQQGGHPSPTDRVRAQRFAIKAVQFIEEHHGSKNNA
DHCVILGVRGSKFKYTSVSHLYAHKTEHGARRPKHSYWHAIGDIANMLVGRKAPPLPETLNDEIEKNI
AKEQGIIDPC
Example 44: Additional Materials and Methods Utilized for Generation of Engineered Yeast Strains
Mating of yeast haploid cells
To make a diploid strain from two haploid strains, the two haploid strains must be of opposite mating types (a or α). The mating type of each strain can be identified by PCR (see Huxley et al. (1990). Trends Genet 6(8): 236).
Identify any auxotrophies in each of the mating strains (e.g., ura⁻, leu⁻). If the two strains have different auxotrophies, the diploids can be selected on media lacking both nutrients. If the mating strains do not have different auxotrophies, a different selection scheme will need to be employed.
To mate the haploid strains, patch both strains onto a YPD plate. Mix together a small amount of both cells in a clean part of the YPD plate. Incubate the plate at 30°C for 2 hours.
If the two haploid strains each have different auxotrophies, take a small amount of the mixed haploid cells on YPD and plate them on media lacking both nutrients. This will select for diploids that have arisen due to mating.
If the two haploid strains do not have different auxotrophies, take a YPD plate and divide it into three parts. In two of the three parts, streak out each of the haploid strains to individual colonies. In the third part, streak out the mixture of the haploid strains to individual colonies. Incubate the plate at 30°C and check the plate regularly within 48 hr of plating the cells.
Diploid cells will grow faster and form larger colonies compared to haploid cells within this time window. To confirm whether cells are diploid or haploid, pick several colonies and identify their mating type by colony PCR. Diploid cells will be positive for both MATa and MATα, unlike the haploid cells, which will be either MATa or MATα.
Mating type identification by PCR
Spheroplast Buffer
50 mM KCl
10 mM Tris, pH 8.0
5% glycerol
1. Transfer a colony of yeast from an agar plate to 50 μl spheroplast buffer (containing 2 μl zymolyase) in a PCR tube and mix.
2. Set thermocycler for:
a. 37°C, 30 min
b. 95°C, 10 min
c. 4°C, forever
3. Spin down cell debris by pulsing in centrifuge briefly.
4. Use 1 μl in a 20 μl PCR reaction.
PCR reaction mix                       μl
Template
  Spheroplasted colony solution        1
Primers (each at 10 pmol/μl)
  AGTCACATCAAGATCGTTTATGG              0.33
  GCACGGAATATGGGACTACTTCG              0.33
  ACTCCACTTCAAGTAAGAGTTTG              0.33
10x PCR buffer                         2
50x dNTPs (10 mM each)                 0.4
Taq                                    0.25
ddH2O                                  15.35
TOTAL                                  20
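As a quick consistency check on the reaction mix above, the component volumes can be summed and scaled for several colonies. This is a minimal sketch: the volumes come from the table, while the `scale_mix` helper and its 10% pipetting overage are illustrative assumptions, not part of the protocol. The components sum to 19.99 μl, which the table reports as a total of 20.

```python
# Volumes (in microliters) from the PCR reaction mix table above.
pcr_mix_ul = {
    "spheroplasted colony solution": 1.0,
    "primer AGTCACATCAAGATCGTTTATGG": 0.33,
    "primer GCACGGAATATGGGACTACTTCG": 0.33,
    "primer ACTCCACTTCAAGTAAGAGTTTG": 0.33,
    "10x PCR buffer": 2.0,
    "50x dNTPs": 0.4,
    "Taq": 0.25,
    "ddH2O": 15.35,
}


def scale_mix(mix, n_reactions, overage=0.1):
    """Scale every component for n reactions plus a fractional overage.

    The overage default is a hypothetical convenience, not a protocol value.
    """
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in mix.items()}


total = sum(pcr_mix_ul.values())
print(f"Single reaction total: {total:.2f} ul")
print(scale_mix(pcr_mix_ul, n_reactions=8))
```

In practice the colony template would be added to each tube individually, so a real master mix would leave that component out of the scaled volumes.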
Cycling conditions
Figure imgf000334_0001
Cells that are MATa will give a product that is approximately 369 bp in size while cells that are MATa will give a product that is approximately 492 bp in size. Diploid cells will yield both products.
SPORULATION AND DISSECTION OF DIPLOID YEAST
(Protocol has been modified and was originally described in "Yeast Protocols: Methods in Cell and Molecular Biology (1996). Methods in Molecular Biology, 53: 51-68.")
SPO Media
10 g potassium acetate
20 g agar
ddH2O to 1 liter, mix well and autoclave
Zymolyase dissection solution (ZDS)
Zymolyase-100T, 1 mg/ml, in 50% glycerol
Sporulation
Day 1
1. Plate the diploid strain to be dissected on YPD. Spread the cells over the YPD plate and grow them at 30°C for 24 to 36 hours.
Day 2
1. Patch the cells from the YPD plate onto SPO plates. Spread the cells as thinly as possible onto the SPO plates to ensure they are rapidly depleted of nitrogen to induce sporulation (D-
2. Incubate the plates at room temperature for 3 - 5 days before dissection.
3. Take some YPD plates and leave them on your bench so that they are dry when you are ready to dissect.
Dissection
Day 5 - 7
1. Pipet 5 μl of water onto a microscope slide. Take a small amount of cells from the SPO plate and place them into the water on the microscope slide. Cover the cells in water with a coverslip and observe them under the microscope (use 40X magnification). Look for asci, which indicate that cells have sporulated. Asci look like this:
Figure imgf000335_0001
The individual ascospores are smaller than diploid cells, which can cluster together to look like an ascus.
2. If you see asci, pipet 80 μl of ZYMOLYASE DISSECTION SOLUTION (ZDS) into a microfuge tube. Pick up some of the cells from the SPO plate and place them in the ZDS in the microfuge tube.
3. Flick the bottom of the microfuge tube with your finger to mix the cells well in the ZDS. Incubate the cells in ZDS for 5 minutes at room temperature.
4. Take 12 μl of the cells in ZDS and pipet them onto a dry YPD plate. Pipet the cells onto the top left corner of the plate and angle the plate so that the cells in ZDS flow down towards the bottom left corner to form a streak. This will spread the cells out evenly to make it easier to find ascospores.
5. To dissect the ascospores, use a dissecting microscope. Use a micromanipulator with an attached needle to pick up an individual ascus. Be sure to clean the area surrounding the ascus and then clean the needle to make sure that only the ascus to be dissected is on the needle.
6. Pick up the ascus and move it to a clean area of the plate. Use the numbers on the microscope platform (x-axis: spacing of each spore; y-axis: spacing of each tetrad) as a guideline for where to start dissecting. Make sure you use the same interval to space where you place each spore of each tetrad to be dissected. Ascospores are normally placed at intervals of 5 along the x-axis.
7. To dissect the spores, place the ascus on the YPD plate surface. With the needle on the ascus, tap the bench so that the needle vibrates on the plate surface. This will rub the ascus against the plate and cause the ascus cell wall to break apart and release the spores. If the spores do not come apart, the cell wall may be underdigested. Leave the plate for half an hour and come back to try and dissect again.
8. Once the spores are released from the ascus, leave one spore at the starting point and move the remaining three spores to the next spot (an interval of 5 on the x-axis). Leave a spore here and take the remaining two to the next spot. Do this again until you have placed the fourth and final spore at its location.
9. Obtain another ascus and repeat steps 5 - 8. Dissect 8 - 10 asci per plate.
SHAKE FLASK FERMENTATION PROTOCOL
Three or four days before Time 0
1. Streak the strains you are going to test on the appropriate plates.
Two days before Time 0
1. Label the light gray rubber stoppers and the serum bottles with matching numbers. Weigh the bottles with their respective stoppers. Place the plastic stoppers in aluminum foil and autoclave. Close the serum bottles with aluminum foil and autoclave. Be sure not to place any new autoclave tape on the bottle itself, since that will change its weight. Dry in the incubator.
2. Obtain the number of needles (23-gauge) that you will need. Be sure that the weight on record is for the same kind of needle (record the average weight).
3. Obtain the number of aluminum caps that you will need. Be sure that they are around the same weight (record the weight).
4. Autoclave a big flask to mix the media together.
5. Be sure you have all the media components that you need.
6. Label the right amount of 250 ml autoclaved baffled flasks (4 per strain to be tested is recommended).
One day before Time 0
1. Add 35 ml of media to the 250 ml baffled flask and inoculate with individual colonies from the plates. If the strain grows slowly, you can start early in the morning.
2. Label the corresponding number of aluminum trays, add a 0.45 μm cellulose acetate filter disk (Whatman 10 404 006) to each and weigh them together using the analytical balance (record the weight). You will need two tray-filters per fermentation bottle.
Time 0
Prepare media
1. Mix all of the components of the media, leaving 10% of the volume for inoculum. You will need 120 ml of media per shake flask (including inoculum), plus some for OD determination blanks and dilutions.
2. Add 108 ml of media to each shake flask.
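The volume split in the two steps above (120 ml per flask, with 10% reserved for inoculum) can be written out as a short calculation. This is a sketch only; the function names, the flask count, and the extra allowance for blanks and dilutions are hypothetical.

```python
def flask_volumes(total_ml: float, inoculum_fraction: float):
    """Split a flask's working volume into media and inoculum portions."""
    inoculum_ml = total_ml * inoculum_fraction
    return total_ml - inoculum_ml, inoculum_ml


def media_to_prepare_ml(n_flasks: int, media_per_flask_ml: float, extra_ml: float = 0.0):
    """Total media to mix for n flasks plus an optional allowance."""
    return n_flasks * media_per_flask_ml + extra_ml


media_ml, inoculum_ml = flask_volumes(total_ml=120, inoculum_fraction=0.10)
print(f"Media per flask: {media_ml:.0f} ml, inoculum per flask: {inoculum_ml:.0f} ml")

# Hypothetical run: 12 flasks plus 50 ml for OD blanks and dilutions.
print(f"Media to prepare: {media_to_prepare_ml(12, media_ml, extra_ml=50):.0f} ml")
```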
Washing of cells
1. Spin down cells in 50 ml Falcon tubes (3000 rpm for 5 min).
2. Resuspend cells with 25 ml of sterile ddH2O. Re-spin cells.
3. Resuspend cells in 25 ml of sterile ddH2O.
4. To a culture tube with 9.9 ml of sterile ddH2O add 100 μl of the resuspended cells (this is a 100x dilution). Determine OD600.
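The dilution arithmetic in this step, and the adjustment of the washed cells to a target OD600 in the step that follows, can be sketched as below. The measured reading is a hypothetical value, the function names are illustrative, and the calculation assumes OD600 scales linearly with cell density over the range used.

```python
def dilution_factor(sample_ul: float, diluent_ul: float) -> float:
    """Fold dilution when a sample volume is added to a diluent volume."""
    return (sample_ul + diluent_ul) / sample_ul


def water_to_add_ml(stock_od: float, stock_ml: float, target_od: float) -> float:
    """Water needed to dilute a suspension to target OD (C1*V1 = C2*V2)."""
    if target_od > stock_od:
        raise ValueError("stock is already more dilute than the target")
    return stock_ml * (stock_od / target_od - 1)


factor = dilution_factor(100, 9900)      # 100 ul into 9.9 ml
measured_od = 0.35                       # hypothetical spectrophotometer reading
stock_od = measured_od * factor          # OD600 of the undiluted suspension

print(f"Dilution factor: {factor:.0f}x")
print(f"Undiluted OD600: {stock_od:.1f}")
print(f"Water to add to 25 ml for OD600 = 10: {water_to_add_ml(stock_od, 25, 10):.1f} ml")
```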
Preparation of inoculum and inoculation
1. Dilute cells with water to a final OD600 of 10.
2. Add 12 ml of inoculum (OD600 = 10) to each bottle. Mix.
3. Take 6 ml of sample and place in culture tube on ice.
4. Close the bottle with its specific rubber stopper, clamp the metal cap and put the needle on top. Weigh the bottle (record the weight).
5. Place the bottle at the appropriate temperature and 150 rpm.
Dry cell weight
1. Using a pre-weighed membrane filter, filter 6 ml of the inoculum (OD600 = 10).
2. Wash 2x with 12 ml of ddH2O.
3. Let dry overnight in the vacuum oven. After an overnight in the oven, weigh the membrane filters/trays using the analytical balance. (Record the weight).
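The dry cell weight follows from the recorded tray and filter weights before and after drying. A minimal sketch, in which the function name and the example weights are hypothetical and only the 6 ml filtered volume comes from the protocol:

```python
def dry_cell_weight_g_per_l(tare_g: float, dried_g: float, filtered_ml: float) -> float:
    """Dry cell mass per liter from pre- and post-drying filter weights."""
    if filtered_ml <= 0:
        raise ValueError("filtered volume must be positive")
    return (dried_g - tare_g) / (filtered_ml / 1000.0)


# Hypothetical weights for one tray plus filter; 6 ml filtered per the protocol.
dcw = dry_cell_weight_g_per_l(tare_g=1.2500, dried_g=1.2710, filtered_ml=6)
print(f"Dry cell weight: {dcw:.2f} g/L")
```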
HPLC analysis
1. Filter 1 ml of initial fermentation media with a 0.2 μm filter.
2. Place the HPLC vial in the analytical balance and tare it.
3. Add 1 ml of mobile phase. (Record the weight).
4. Add 250 μl of sample (Record the weight).
5. Close vial and place at 4°C until submission to the analytical department.
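Because the mobile phase and sample weights are recorded, the dilution applied to each HPLC sample can be computed gravimetrically instead of relying on nominal pipetted volumes. The sketch below assumes the mass ratio approximates the volume ratio (similar densities); the function names and example values are hypothetical.

```python
def gravimetric_dilution(mobile_phase_g: float, sample_g: float) -> float:
    """Fold dilution from the recorded masses of mobile phase and sample."""
    return (mobile_phase_g + sample_g) / sample_g


def undiluted_concentration(measured_g_per_l: float, mobile_phase_g: float, sample_g: float) -> float:
    """Back-calculate the concentration in the original sample."""
    return measured_g_per_l * gravimetric_dilution(mobile_phase_g, sample_g)


# Hypothetical recorded weights close to the nominal 1 ml + 250 ul.
factor = gravimetric_dilution(mobile_phase_g=1.000, sample_g=0.250)
print(f"Dilution factor: {factor:.2f}x")
print(f"Original concentration: {undiluted_concentration(16.0, 1.000, 0.250):.1f} g/L")
```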
OD determination
1. To 1 ml of media add 250 μl of the initial fermentation media (5x dilution). Use media as a blank and measure OD600. Record the OD600.
pH meter
1. Use the pH meter to determine the pH of each sample. Be sure the media is at room temperature before taking the pH. Record the pH.
Visual inspection
1. Check under the microscope for contamination and/or unusual phenotype of the cells. Record any unusual observations.
Time Final
1. Use weight loss of the shake bottles to determine progression or completion of the fermentation. With an initial dextrose concentration of 80 g/L you should expect a drop of around 4 g.
2. Once the fermentation is completed, weigh the bottle. Record the weight.
3. Open the shake bottles with the appropriate tool. Shake well and remove 12 ml of media for analysis.
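The expected weight loss can be estimated from the stoichiometry of ethanol fermentation (one glucose yields two ethanol and two CO2), assuming the loss is entirely CO2 leaving through the needle. This is a rough sketch: the 120 ml working volume is taken from the media preparation steps, evaporation is ignored, and the function names and example bottle weights are hypothetical.

```python
GLUCOSE_G_PER_MOL = 180.16
CO2_G_PER_MOL = 44.01


def max_co2_loss_g(dextrose_g_per_l: float, volume_ml: float) -> float:
    """Theoretical CO2 mass if all dextrose is fermented (2 CO2 per glucose)."""
    dextrose_g = dextrose_g_per_l * volume_ml / 1000.0
    return dextrose_g / GLUCOSE_G_PER_MOL * 2 * CO2_G_PER_MOL


def fraction_of_max(start_g: float, current_g: float, dextrose_g_per_l: float, volume_ml: float) -> float:
    """Observed weight loss as a fraction of the theoretical maximum."""
    return (start_g - current_g) / max_co2_loss_g(dextrose_g_per_l, volume_ml)


limit = max_co2_loss_g(dextrose_g_per_l=80, volume_ml=120)
print(f"Theoretical maximum CO2 loss: {limit:.2f} g")

# Hypothetical bottle weights showing a 4 g drop.
print(f"Fraction of maximum: {fraction_of_max(250.0, 246.0, 80, 120):.0%}")
```

The theoretical ceiling comes out somewhat above the roughly 4 g drop given in the protocol, which is what would be expected when part of the dextrose goes to biomass and glycerol rather than ethanol and CO2.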
Dry cell weight
1. Using a pre-weighed membrane filter, filter 6 ml of the final fermentation medium.
2. Wash 2x with 12 ml of ddH2O.
3. Let it dry overnight in the vacuum oven. After an overnight in the oven, weigh the membrane filters/trays using the analytical balance. (Record the weight).
HPLC analysis
1. Filter 1 ml of final fermentation media with a 0.2 μm filter.
2. Place the HPLC vial in the analytical balance and tare it.
3. Add 1 ml of mobile phase. (Record the weight).
4. Add 250 μl of sample (Record the weight).
5. Close vial and place at 4°C until submission to the analytical department.
OD determination
1. To 9.9 ml of media add 100 μl of the final fermentation media (100x dilution). Use media as a blank and measure OD600. Record the OD600.
pH meter
1. Use the pH meter to determine the pH of each sample. Be sure the media is at room temperature before measuring the pH. Record the pH.
Visual inspection
1. Check under the microscope for contamination and/or unusual phenotype of the cells. Record any unusual observations.
EDD/EDA assay protocol
EDD/EDA extractions:
1) Spin down cells in 50 ml conical tubes at 4°C and 3,400 rpm for 5 minutes. Wash 2 times with 25 ml water.
2) Resuspend in about 1 ml lysis buffer (50 mM Tris-HCl, pH 7, 25 mM reduced glutathione, 10 mM MgCl2).
3) Add 1 cap of zirconia beads and use the bead beater for 1.5 minutes, icing in between.
4) Spin down cell debris at 4°C and 14,000 rpm for 15 minutes. Save supernatant.
5) Perform Bradford assay.
Activity assays:
Each reaction contains 50 mM Tris-HCl, pH 7, 10 mM MgCl2, 0.3 mM NADH, 15 μg LDH, about 125 μg cell lysate (depending on level of activity), and 1 mM 6-phosphogluconate. Reactions are started by the addition of 6-phosphogluconate and monitored for 10 minutes at 30°C.
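Since the reaction couples product formation to NADH consumption through LDH, a specific activity can be derived from the rate of NADH disappearance. The sketch below rests on several assumptions that the protocol does not state: that the reaction is followed by absorbance at 340 nm, that the NADH extinction coefficient is 6220 M^-1 cm^-1, that the path length is 1 cm, and that the reaction volume is 1 ml. The function name and example slope are hypothetical.

```python
NADH_EXT_COEFF = 6220.0  # M^-1 cm^-1 at 340 nm (assumed readout wavelength)


def specific_activity(delta_a340_per_min: float, reaction_ml: float,
                      lysate_ug: float, path_cm: float = 1.0) -> float:
    """Micromoles of NADH oxidized per minute per mg of lysate protein."""
    # Beer-Lambert: concentration change (mol/L per min) from absorbance change.
    molar_per_min = abs(delta_a340_per_min) / (NADH_EXT_COEFF * path_cm)
    umol_per_min = molar_per_min * (reaction_ml / 1000.0) * 1e6
    return umol_per_min / (lysate_ug / 1000.0)


# Hypothetical slope with the ~125 ug lysate load described above.
activity = specific_activity(delta_a340_per_min=-0.0622, reaction_ml=1.0, lysate_ug=125)
print(f"Specific activity: {activity:.3f} umol/min/mg")
```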
Example 45: Comparison of Diploid and Haploid Strains in Ethanol Fermentation
As a baseline test, differences in the ethanol fermentation capabilities of the diploid and haploid starting strains were analyzed. The fermentation tests were done in three different types of media, AMM, UMM and YPD. The recipes for preparing each of the media are known in the art, and each is available commercially. UMM is similar to AMM, with urea and potassium phosphate replacing the ammonium sulfate and ammonium phosphate used in AMM. YPD is a complete, rich media. Each fermentation was done in triplicate. The results are presented in FIG. 26.
In all media, the yCH24 ethanol yield was lower than that of yCH1, although the difference generally was not statistically significant. Significant decreases in glycerol production were seen in two of the media types (AMM and YPD). Cell mass, as measured by optical density at OD600, consistently showed an increase in all the media tested. Microscopic observations revealed that yCH24 was more oval shaped and elongated than yCH1. Additional growth and fermentation analysis is presented in the table below.
Yield (C mol/C mol dextrose)
          Ethanol          Glycerol         Biomass
Strain    Ave     Stdev    Ave     Stdev    Ave     Stdev
yCH1      0.559   0.006    0.075   0.001    0.061   0.002
yCH24     0.542   0.010    0.079   0.001    0.068   0.004
The results presented above represent the average of 6 replicates of each fermentation performed in UMM media with an initial dextrose concentration of 80 g/L. A drop of 3% in ethanol yield, an increase in glycerol yield by 5%, and an increase in biomass yield by 11% in yCH24 was observed when compared to yCH1.
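The percentage changes quoted above follow directly from the average yields in the table. A short sketch of the arithmetic, using only the tabulated averages (the function name is illustrative):

```python
def percent_change(reference, value):
    """Relative change of value versus reference, in percent."""
    return (value - reference) / reference * 100.0

# Average yields (C mol/C mol dextrose) for yCH1 and yCH24 from the table.
yields = {
    "ethanol":  (0.559, 0.542),
    "glycerol": (0.075, 0.079),
    "biomass":  (0.061, 0.068),
}

for product, (ych1, ych24) in yields.items():
    print(f"{product}: {percent_change(ych1, ych24):+.1f}%")
```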
Example 46: Comparison of Growth, Ethanol Production and Glycerol Production in Strains With and Without TAL1 Deletions
EDA/EDD cassettes were engineered into yeast strains with substantially identical genetic backgrounds, with or without a deletion of the TAL1 coding sequence. The cassettes were integrated into the genomic DNA as described herein. In some embodiments an insertion cassette also included an expression cassette that provided EDA activity, EDD activity or EDA and EDD activity. Engineered strains (e.g., yCH137:TAL1, yCH208:tal1) were compared in side-by-side shake flask fermentation studies using the protocol described herein. The parent strain (e.g., yCH153:TAL1), which had been engineered as ura3 and then repaired to URA3, also was generated and analyzed with yCH137 and yCH208. Serum bottles containing 120 ml of YPD media with 8% glucose were inoculated at an initial OD600 of about 10 and sealed using butyl rubber stoppers. A gas/pressure outlet was provided in the form of a 25-gauge needle in the butyl stopper.
The results of this experiment demonstrated a drop in yield of about 0.36% (e.g., not significant as determined by Student t-test) when the EcCoEDA and PaCoEDD were expressed alone (compare yCH153 versus yCH137). However, an increase in ethanol yield of approximately 2.5% along with a corresponding drop in glycerol formation of 12.5% (e.g., both statistically significant) was observed when the EcCoEDA and PaCoEDD were expressed in a Δtal1 background (yCH153 versus yCH208). These experiments were repeated a second time and an overall yield increase of 1.5% was observed with a concomitant decrease of 9% in glycerol formation when strains yCH153 and yCH208 were compared. These results strongly suggest that the combination of the tal1 disruption and expression of the codon optimized E. coli EDA and P. aeruginosa EDD genes can generate a greater than 1% yield improvement in ethanol production due to a decrease in glycerol production. The results of the individual flasks and the average of each experiment are presented in the table below.
Figure imgf000342_0001
To determine if the results of Multifor fermentation were consistent with the results obtained in shake flasks, the same strains were analyzed under larger scale fermentation conditions. The results are presented in FIG. 27, which graphically illustrates the results of larger scale fermentation using yeast strains yCH137, yCH153 and yCH208. Shown in the graph are the ethanol and glycerol yields as well as the biomass for each strain. The results of this experiment show an overall increase in ethanol yield of about 1.75% in strain yCH208 over the wild type yCH153 strain with a concurrent decrease in glycerol production of 20.2%. The results of the larger scale fermentation were in agreement with those seen in the shake-flask fermentations. The fermentation is currently being repeated to verify the results.
Example 47: EDA and EDD Gene Engineering and Analysis Utilizing an in vitro Assay System
Example 34 presents the results of in vitro assays utilized to evaluate the activity of EDA and EDD independently of each other and without the necessity of co-expression in the same extract. The table below presents the results of additional EDA activity analysis performed with various native, chimeric or codon optimized EDA genes expressed in various engineered strains (designated with a yCH name in the Sample column). Included in the table below are the experimental conditions, the measured slope value, any multiplicative factors, the calculated slope (e.g., C. slope) and a % maximum activity value.
Figure imgf000343_0001
To further optimize the reaction conditions of the in vitro EDA/EDD assay, the effect of addition of magnesium or manganese to the reactions was evaluated. The results are presented in the table below and shown graphically in FIG. 28. Magnesium and manganese both seem to activate the EDA/EDD reaction equally well.
Figure imgf000343_0002
Codon optimized versions of the E. coli and P. aeruginosa EDA genes were prepared and introduced on plasmids into various yeast strains. A comparison of the native and codon optimized EDA activities from E. coli and P. aeruginosa is presented in the table below and in FIG. 29. For the table below and FIG. 29: Ec = native E. coli EDA; EcCO = codon optimized E. coli EDA; Pa = native P. aeruginosa EDA; and PaCO = codon optimized P. aeruginosa EDA.
Figure imgf000344_0001
EDA from E. coli (native or codon optimized) shows significantly more activity than the native or codon optimized EDA activity from P. aeruginosa.
After optimizing EDA conditions and identifying the most active EDA activity when expressed in yeast, EDD candidates were similarly evaluated to identify the best EDD activity for use in generating yeast strains engineered for increased ethanol production. The table below lists various yeast strains generated to evaluate the activity of various EDD genes expressed in yeast using the in vitro assay system described herein.
Figure imgf000345_0001
EDA/EDD assays were performed as described herein (see Example 43). The slopes of EcEDA with various EDD candidates were compared as shown in FIG. 30. FIG. 30 graphically illustrates the results of a comparison of EDD candidates using EcEDA as the EDA source for all. The most active EDD identified was the codon optimized EDD from P. aeruginosa (PaCoEDD). Analysis of the iron requirement when cultured in YPD media was also performed. Supplementation of iron in rich media showed minimal if any improvement in assay activity (data not shown).
Example 48: Construction of Integration Cassettes with Codon Optimized EDA and EDD Coding Sequences
After identifying the most active combination of EDA and EDD genes, an integration cassette was constructed. Each gene is expressed from a different promoter (PTEF1 for EcCoEDA and PTDH3 for PaCoEDD), and each gene has a different terminator (TADH1 for EcCoEDA and TCYC1 for PaCoEDD) to avoid the possibility that the cassette could loop out (e.g., recombine out by interaction with other sequences in the integration cassette). The following is a list of the final integration cassettes with edges for integration in either the YBR110.5 intergenic region (pBF1105/1106), or for disruption of TAL1 (pBF1107/1108).
YBR110.5 5'-PTEF1-EcCoEDA-TADH1-PTDH3-PaCoEDD-TCYC1-R25-URA3-R25-YBR110.5 3'
YBR110.5 5'-PTEF1-EcCoEDA-TADH1-PTDH3-PaCoEDD-TCYC1-R127-URA3-R127-YBR110.5 3'
5'TAL1-PTEF1-EcCoEDA-TADH1-PTDH3-PaCoEDD-TCYC1-R25-URA3-R25-3'TAL1
5'TAL1-PTEF1-EcCoEDA-TADH1-PTDH3-PaCoEDD-TCYC1-R127-URA3-R127-3'TAL1
Example 49: Antibiotic Cassette Engineering
Additional antibiotic resistance cassettes were engineered to allow the use of different selectable markers for plasmid maintenance and integration. Resistance cassettes were obtained for 3 new antibiotic resistance markers: PATMX4 conferring resistance to glufosinate, NATMX4 conferring resistance to nourseothricin, and SHBLE conferring resistance to zeocin. The new antibiotic resistance markers were tested with each of the yeast parent strains used to construct engineered strains described herein (data not shown). The results indicated that the parental and engineered strains described herein were resistant to glufosinate; however, the strains were susceptible to 100 micrograms per milliliter or less of nourseothricin and zeocin. Antibiotic cassettes were constructed that included the ATO1 coding sequence, which enabled positive selection (respective antibiotic) as well as counter-selection (resistance to acetate).
Construction of a URA3 disruption cassette containing either ZeoR, NAT or PAT and with ATO1 L75Q in between direct repeats of a unique 200-mer sequence was performed as follows. The ShBLE (Zeo), PAT and NAT open reading frames were amplified with primers oJML356-oJML357, oJML358-oJML359, and oJML360-oJML361, respectively, from either pTEF-ZEO, codon-optimized PAT, or codon-optimized NAT, respectively. The TEF1 promoter region from pBF1034 was amplified using primers oJML352 and oJML354. The TEF terminator was amplified with primers oJML353-oJML355. The promoter, open reading frame (ShBLE, PAT, or NAT) and terminator PCR products were combined and reamplified with primers oJML352 and oJML353 and TOPO cloned to form pBF1082, pBF1083 and pBF1087, respectively. The entire fragment was then moved as a XhoI-SacI piece into pBF1034, replacing the KanMX gene with either the Zeo, PATMX4, or NATMX4 to form pBF1090, 1091, and 1092, respectively. The sequences of the primers used for amplifying and combining the various fragments are described in the table below. FIGS. 31A-D provide plasmid maps of the resulting constructs.
oJML352 AAGCTTCATTCCCATTACCATCTACTGAAAGG
oJML353 CGCAACGGTGCGTCG
oJML354 CATGGTTGTTTATGTTCGGATGTG
oJML355 TGATAATCAGTACTGACAATAAAAAGATTCTTGTTTTC
oJML356 CACATCCGAACATAAACAACCATGGCCAAGTTGACCAGTGC
oJML357 CAAGAATCTTTTTATTGTCAGTACTGATTATCAGTCCTGCTCCTCGGCCAC
oJML358 CACATCCGAACATAAACAACCATGTCCCCAGAAAGAAGACCAGTC
oJML359 CAAGAATCTTTTTATTGTCAGTACTGATTATCATCAGATTTGAGTAACTGGTCTAACTGG
oJML360 CACATCCGAACATAAACAACCATGGGTACTACCTTGGACGATACCG
oJML361 CAAGAATCTTTTTATTGTCAGTACTGATTATCATGGACATGGCATAGACATGTAC
Example 50: yCH1 Strain Engineering, Deletion of Phosphofructokinase Activity and Strain Genotype Summary
Additional engineered strains were generated using parental diploids of different genetic background, using methods for strain engineering described herein. One such wild type isolate, designated yCH1, was engineered to generate the strains described in the summary table presented at the end of this example.
pfk1::R25/pfk1::R25 Strain Construction
Generating the phosphofructokinase disrupted strain proved challenging. Of 48 strains generated and selected on SCD-ura plates (i.e., selecting for integration of the pfk1::URA3-R25 cassette), only 3 strains could be confirmed by PCR analysis, as shown in FIG. 32. Without being limited by any theory, when the pfk1::R25/PFK1 strain is transformed with the pfk1::URA3-R25 cassette, four possibilities can be envisaged: 1) the correct integration at the second PFK1 locus as desired, generating the pfk1::R25/pfk1::URA3-R25 genotype, 2) a repeat integration within the first knockout pfk1::R25 location, resulting in the pfk1::URA3-R25/PFK1 genotype, 3) integration of the URA3 locus at an unidentified/random location with no loss of the original pfk1::R25 disruption, or 4) integration of the URA3 locus at an unidentified/random location with loss of the original pfk1::R25 disruption. In this experiment the most common genotype recovered was one in which the cells were predominantly URA3+, but which still retained at least one copy of the PFK1 allele. Out of 48 colonies screened, only 3 generated the desired pfk1::R25/pfk1::URA3-R25 genotype. These colonies were then plated onto SCD plates containing 5-FOA to screen for the looping out of the URA3 from the integration cassette. All colonies recovered on the 5-FOA-containing plates were found to have the desired pfk1::R25/pfk1::R25 genotype. This strain was named yCH232.
Strain yCH232 was transformed with various integration cassettes described herein, including: 1) 110.5::EcCoEDA/PaCoEDD-URA3-R25 and the URA3 cassette used to restore at least one of the original URA3 loci such that the strain returns to a URA3+ genotype/phenotype. These strains will be used to test the effect of a single integrated EcCoEDA/PaCoEDD expression cassette in a strain deleted for phosphofructokinase activity (e.g., pfk1 strain), as well as to confirm that the pfk1 total deletion strain behaves as expected in aerobic versus anaerobic conditions.
tal1::PTEF1-EcCo-EDA-PTDH3-PaCo-EDD-R25/tal1::PTEF1-EcCo-EDA-PTDH3-PaCo-EDD-R25 Strain Construction
A cassette was designed to disrupt the TAL1 locus by integrating a copy of the expression cassette containing the EcCo-EDA and PaCo-EDD genes expressed from the PTEF1 and PTDH3 promoters, respectively. Using this disruption cassette, a first TAL1 locus was disrupted and the URA3 marker successfully recycled from the strain. A colony found to have the desired genotype (tal1::EcCo-EDA/PaCo-EDD-R25) was named yCH234. The second copy of TAL1 was disrupted in a similar manner and the URA3 recycled to generate strain yCH243. Strain yCH243 was transformed with a URA3 cassette designed to repair the strain's original URA3 locus, generating strain yCH247. This method generated the URA3/ura3::R448 genotype, and imparts the ability to grow in the absence of exogenous uracil. yCH247 has been confirmed by nucleic acid analysis methods and is being tested in shake flask fermentations and in the Multifors fermentation system. All strains containing the EcCoEDA/PaCoEDD constructs have shown the expected activity when assayed (data not shown).
110.5::PTEF1-EcCo-EDA-PTDH3-PaCo-EDD/110.5::PTEF1-EcCo-EDA-PTDH3-PaCo-EDD Strain Construction
A similar approach was taken to integrate the PTEF1-EcCo-EDA-PTDH3-PaCo-EDD cassette at the intergenic region YBR110.5. Two copies were integrated, generating strain yCH241, and the URA3 repaired to generate strain yCH245. yCH245 will be compared to yCH1 and to yCH247 in shake flask fermentations as well as in the Multifors fermentation system. All strains containing the PTEF1-EcCo-EDA-PTDH3-PaCo-EDD constructs have shown the expected activity when assayed (data not shown).
Additional engineered strains
A number of additional activities have been altered in strains derived from the original diploid yCH1 lineage. Non-limiting examples of activities that have been, or are currently being, altered (e.g., activity increased, activity decreased, activity eliminated) and/or added include ZWF1 activity, GND1/GND2 activity, TAL1 activity, SOL3 activity, PDC1 activity, ADH1 activity, GLT1 activity, PFK1/PFK2 activity, GPD1/GPD2 activity, FPS1 activity, TPS1 activity, NTH1 activity, PHO13 activity, TDH3 activity, PYK activity, chimeric activities, the like, and combinations thereof.
Additionally, yCH3, the MATa mating type haploid derived from yCH1, also has been engineered using the nucleic acid tools (e.g., integration cassettes, expression cassettes, antibiotic resistance cassettes, plasmids, the like and combinations thereof) described herein. A listing of additional strains created during the course of engineering yCH1 is presented in the table at the end of this Example.
Shake flask fermentation evaluations
yCH1 and yCH247 were compared in shake flask fermentations carried out in UMM containing 8% glucose as the carbon source. A total of ten shake flasks per strain were evaluated (n=10), over the course of two experiments. The results are presented graphically in FIG. M35/L26. yCH1 produced 0.445 g (+/-0.004) ethanol/g glucose and yCH247 produced 0.451 g (+/-0.002) ethanol/g glucose, providing a 1.3% increase in ethanol yield for the engineered strain. This difference in yield between strains was determined by a Student T-test to be statistically significant at the 99% confidence level.
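The significance statement for the UMM shake flask ethanol yields can be checked approximately from the reported summary statistics. The sketch below computes a pooled two-sample Student t statistic; it assumes the +/- values are sample standard deviations and that n=10 applies to each strain, and it compares against the two-sided 99% critical value for 18 degrees of freedom (2.878). The function name is illustrative.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample Student t statistic (equal-variance form) from summary data."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean2 - mean1) / se, df

# Reported ethanol yields (g ethanol/g glucose): yCH1 vs yCH247, n=10 each.
t_stat, df = pooled_t(0.445, 0.004, 10, 0.451, 0.002, 10)
T_CRIT_99_DF18 = 2.878  # two-sided 99% critical value for 18 degrees of freedom

print(f"t = {t_stat:.2f}, df = {df}, significant at 99%: {abs(t_stat) > T_CRIT_99_DF18}")
```

Under these assumptions the t statistic is well above the critical value, which is consistent with the reported significance.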
Concomitant with the increased ethanol yield, a decrease in the formation of glycerol was also seen. yCH1 generated on average 0.078 g (+/-0.002) glycerol/g glucose whereas yCH247 generated on average only 0.073 g (+/-0.001) glycerol/g glucose. The difference in glycerol produced as a byproduct of the fermentation process was found to be statistically significant at a 99% confidence level, using the Student T-test. The observed reduction in glycerol byproduct formation is consistent with the stoichiometry of the alternative pathway predicted to generate excess NAD+, thus obviating the need to produce glycerol to regenerate this required cofactor. There was no statistically significant difference in biomass yield between the two strains. Shake flask experiments using YPD with 8% glucose as the carbon source did not generate significant differences in ethanol yield, as shown in the table below. The amount of glycerol produced when the cells are grown on YPD is approximately half that seen when the cells are grown in UMM. This suggests that the engineered pathway may only provide a benefit when the cells are exposed to some type of stress which results in glycerol production.
Figure imgf000350_0001
Evaluation of glucose concentration on fermentation results
Glycerol production was observed to be decreased in the engineered tal1::EDA/EDD strains. To determine if the reduction in glycerol was due to a change in glycerol production during biomass increase or ethanol production, shake flask fermentations were carried out in the presence of 4% or 8% glucose. A yield increase of 1.1% was observed in the initial 40 g/L dextrose fermentation and a yield increase of 1.8% was observed in the initial 80 g/L dextrose fermentation. The results suggest that the yield improvement is more likely due to lowered glycerol production in the ethanol production phase, as shown in the table below. These results are currently being verified using larger scale fermentors.
                          Ethanol                   Glycerol
Medium         Strain     Value   Ave     Stdev     Value   Ave     Stdev
UMM (40 g/L)   yCH1       0.392                     0.078
UMM (40 g/L)   yCH1       0.394                     0.078
UMM (40 g/L)   yCH1       0.377   0.387   0.009     0.076   0.078   0.001
UMM (40 g/L)   yCH247     0.381                     0.075
UMM (40 g/L)   yCH247     0.389                     0.075
UMM (40 g/L)   yCH247     0.405   0.391   0.012     0.078   0.076   0.002
UMM (80 g/L)   yCH1       0.405                     0.076
UMM (80 g/L)   yCH1       0.404                     0.075
UMM (80 g/L)   yCH1       0.387   0.399   0.010     0.073   0.075   0.002
UMM (80 g/L)   yCH247     0.396                     0.069
UMM (80 g/L)   yCH247     0.400                     0.070
UMM (80 g/L)   yCH247     0.422   0.406   0.014     0.074   0.071   0.002
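The averages, standard deviations and relative yield increases can be recomputed from the individual ethanol values in the table. The sketch below uses the sample standard deviation; because the tabulated values are rounded to three decimals, the recomputed figures can differ slightly in the last digit from those reported (for example, the 40 g/L increase recomputes to about 1.0% rather than 1.1%).

```python
from statistics import mean, stdev

# Individual ethanol values transcribed from the UMM table above.
ethanol = {
    ("40 g/L", "yCH1"):   [0.392, 0.394, 0.377],
    ("40 g/L", "yCH247"): [0.381, 0.389, 0.405],
    ("80 g/L", "yCH1"):   [0.405, 0.404, 0.387],
    ("80 g/L", "yCH247"): [0.396, 0.400, 0.422],
}

def relative_increase_pct(dextrose):
    """Percent increase of the yCH247 mean over the yCH1 mean."""
    ref = mean(ethanol[(dextrose, "yCH1")])
    eng = mean(ethanol[(dextrose, "yCH247")])
    return (eng - ref) / ref * 100.0

for key, values in ethanol.items():
    print(key, f"ave={mean(values):.3f}", f"stdev={stdev(values):.3f}")
for dextrose in ("40 g/L", "80 g/L"):
    print(dextrose, f"increase={relative_increase_pct(dextrose):.1f}%")
```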
Fermentation in Multifors
Duplicate fermentation experiments were performed comparing yCH1 versus yCH247 in UMM media. Triplicates of each strain were performed in each experiment, for a total of six fermentations for each strain. In the first experiment, yCH1 had an ethanol yield of 0.420 g (+/- 0.006) of ethanol/g glucose consumed, compared to a yield of 0.417 g/g (+/- 0.005) for yCH247. These values were determined to not be significantly different (e.g., p value = 0.562). The results are presented graphically in FIG. 37. Glycerol yields were significantly different (e.g., yCH1 = 0.074 g/g [+/- 0.002], yCH247 = 0.068 g/g [+/- 0.003]). No significant biomass difference was observed between yCH1 (0.046 g/g [+/- 0.002]) and yCH247 (0.045 g/g [+/- 0.002]). In the second experiment, a 2.1% increase in ethanol yield was observed (statistically significant at 95% confidence). The results are presented in FIG. 38. yCH1 had an ethanol yield of 0.417 g (+/- 0.002) of ethanol/g glucose consumed, compared to a yield of 0.426 g/g (+/- 0.005) for yCH247. These values were determined to be significantly different at a 95% confidence level. Glycerol yields were significantly different (e.g., yCH1 = 0.077 g/g [+/- 0.001], yCH247 = 0.071 g/g [+/- 0.000]). No significant biomass difference was observed between yCH1 (0.049 g/g [+/- 0.001]) and yCH247 (0.046 g/g [+/- 0.003]). Thus, the glycerol production results seen in the shake flasks were confirmed in larger scale fermentations. That is, an increase in ethanol production was detected with a corresponding decrease in glycerol production.
When YPD was used as the growth media with 8% glucose as the sole carbon source, no significant improvement in ethanol yield was observed, also confirming the results obtained in the shake flask fermentations (data not shown).
Stability Studies
The genetic modifications to strain yCH247 were determined to be stable for at least 68 generations, as shown in FIG. 39A-E. Starting from a single cell, yCH247 was cultured by sequential transfer in YPD media for over 68 generations and 10 individual descendants were analyzed for retention of all genetic modifications by both PCR and enzymatic assays. FIG. 39A shows the PCR analysis of the TAL1 gene along with an actin (ACT1) control. Strain yCH1 generated the two expected PCR fragments corresponding to both the actin control (upper band) as well as the TAL1 genes (lower band). Strain yCH247 and its 10 descendants generated only the actin band, confirming the stable deletion of the TAL1 loci in these strains. FIG. 39B shows the PCR analysis of the PaCoEDD allele. Only strain yCH247 and its 10 descendants generated the lower band reflecting the presence and stability of the EDD gene in these strains. FIG. 39C demonstrates the presence and stability of the EcCoEDA gene in yCH247 and its descendants. FIG. 39D illustrates the presence of the R448 unique 200mer genetic fingerprint in yCH247 as well as 9 of the 10 descendants. One descendant was observed to have lost the R448 fingerprint, suggesting that the heterozygous ura3::R448/URA3 locus in this strain has recombined to form the desired URA3/URA3 homozygous diploid. FIG. 39E demonstrates the stable presence of the R25 unique 200mer genetic fingerprint in yCH247 and all 10 of its descendants. Enzyme assays on yCH1, yCH247 and its 10 descendants also were conducted, and are presented in FIG. 40. The results presented in FIG. 40 show that only yCH247 and its descendants demonstrated EDA/EDD activity.
pfk1 Mutant Phenotypes
To test the phenotype of strains completely deleted for PFK1 activity (e.g., pfk1/pfk1 mutant), an experiment was performed to test the growth in both aerobic and anaerobic conditions. Strains yCH1 (wild type) and yCH249 (pfk1/pfk1) were inoculated in 5 ml of YPD and cultured overnight at 30°C. The next morning, 100 ml of YPD in 250 ml baffled shake flasks or 250 ml serum bottles with a 2-3 mm mineral oil overlay and outfitted with a 25-gauge needle were inoculated to an initial OD600 of approximately 0.025. The strains were allowed to grow at 30°C for approximately 26 hours. The OD600 was determined after 26 hours of growth. The results of the growth experiments in aerobic and anaerobic conditions are presented in FIG. 41. In aerobic conditions, yCH1 and the pfk1 mutant, yCH249, grow similarly. In contrast, in anaerobic conditions, yCH249 is severely restricted for growth after approximately 5 hours, likely when the oxygen has been depleted from the serum bottle. These results demonstrate that the pfk1 mutation in yCH249 confers the expected phenotype.
The summary table of engineered strains of yCH1 lineage is presented below.
Figure imgf000354_0001
Listed in the table below are additional engineered strains, derived from yCH1, having altered activities believed to be beneficial for increased ethanol production.
Figure imgf000355_0001
Example 51: BF903 Strain Engineering, Deletion of Phosphofructokinase Activity and Strain Genotype Summary
Additional engineered strains were generated using parental diploids of different genetic background, using methods for strain engineering described herein. One such diploid strain, designated BF903, was engineered to generate the strains described in the summary tables presented at the end of this Example.
Strain BF903 is being engineered concurrently with strain yCH1, using substantially similar nucleic acid cassettes for engineering, integration, expression, and antibiotic resistance. All strains containing the EcCoEDA/PaCoEDD constructs have shown the expected activity when assayed (data not shown). The strains also are being evaluated for their use as a strain in which to study the effect of further genetic modification to increase ethanol production, including but not limited to alteration of and/or addition of ZWF1 activity, GND1/GND2 activity, TAL1 activity, SOL3 activity, PDC1 activity, ADH1 activity, GLT1 activity, PFK1/PFK2 activity, GPD1/GPD2 activity, FPS1 activity, TPS1 activity, NTH1 activity, PHO13 activity, TDH3 activity, PYK activity, chimeric activities, the like, and combinations thereof. Integration constructs have been generated for the next steps of engineering the BF903 strain, using nucleic acid tools described herein.
Evaluation of the pfkl phenotype
Strain BF903 has been engineered to delete the PFK1 activity (e.g., a pfk1/pfk1 mutant), and designated BF2095. To test the phenotype of the pfk1/pfk1 mutant, an experiment was performed to test the growth in both aerobic and anaerobic conditions. The growth of the cells was performed as described in Example 50, using strains BF903 and BF2095. After 26 hours of growth, the OD600 was determined. The results of the growth experiments in aerobic and anaerobic conditions are presented in FIG. 42. In aerobic conditions, both BF903 and the pfk1 mutant, BF2095, grow similarly. In contrast, in anaerobic conditions, BF2095 is severely restricted for growth after approximately 5 hours, likely when the oxygen has been depleted from the serum bottle. These results demonstrate that the pfk1 mutation in BF2095 confers the expected phenotype.
Shake Flask Fermentation Results
Shake flask fermentations to compare BF903 and BF2100 (tal1::EcCO-PaCO-R127/tal1::EcCO-PaCO-R127, URA3/URA3) were conducted in UMM containing 8% glucose as the carbon source. One experiment was conducted in which a total of 6 shake flasks per strain were evaluated (n=6). The results are presented in FIG. 43. BF903 produced 0.403 g (+/-0.004) ethanol/g glucose and BF2100 produced 0.412 g (+/-0.003) ethanol/g glucose, providing a 2.23% increase in ethanol yield for the engineered strain. The difference in yield between strains was determined to be statistically significant at the 99% confidence level by a Student T-test.
As seen with yCH1 derived strains, the increased ethanol yield shown by BF903 derived strains was accompanied by a decrease in the formation of glycerol as a metabolic byproduct. BF903 generated on average 0.061 g (+/-0.001) glycerol/g glucose whereas BF2100 generated on average only 0.059 g (+/-0.001) glycerol/g glucose. The difference in glycerol production was not shown to be statistically significant at the 99% confidence level using a Student T-test. However, the observed reduction in glycerol byproduct formation is consistent with the stoichiometry of the alternative pathway predicted to generate excess NAD+, thus obviating the need to produce glycerol to regenerate this required cofactor. Thus it is believed that the reduction in glycerol is a function of the engineered pathways. There was no statistically significant difference in biomass yield between the two strains. A similar experiment was performed using the same strains in YPD media and 8% glucose. The results are presented in FIG. 44. The results were substantially similar to those seen with the yCH1 derived strains. No significant difference was seen between the two strains in terms of ethanol yield in rich media, suggesting the effect of the EDA/EDD pathway is enhanced under stress conditions.
Figure imgf000358_0001
Listed in the table below are additional engineered strains, derived from BF903, having altered activities believed to be beneficial for increased ethanol production.
Figure imgf000359_0001
Example 52: BF2 Strain Engineering, Deletion of Phosphofructokinase Activity and Strain Genotype Summary
Additional engineered strains were generated using parental diploids of different genetic background, using methods for strain engineering described herein. One such wildtype strain, designated BF2, was engineered to generate the strains described in the summary table presented at the end of this Example.
Strain BF2 was engineered concurrently with strains yCH1 and BF903. BF2 derived strains listed in the table below are currently (i) undergoing final confirmation, and/or (ii) evaluation in shake flask fermentations. The strains also are being evaluated for their use as a strain in which to study the effect of further genetic modification to increase ethanol production, including but not limited to alteration of and/or addition of ZWF1 activity, GND1/GND2 activity, TAL1 activity, SOL3 activity, PDC1 activity, ADH1 activity, GLT1 activity, PFK1/PFK2 activity, GPD1/GPD2 activity, FPS1 activity, TPS1 activity, NTH1 activity, PHO13 activity, TDH3 activity, PYK activity, chimeric activities, the like, and combinations thereof. Integration constructs have been generated for the next steps of engineering the BY4742 strain (e.g., BF2), using nucleic acid tools described herein.
Figure imgf000360_0001
Example 53: Engineering and Evaluation of yCH3 derived strains
Shake flask fermentation
Shake flask fermentation experiments were performed to evaluate various engineered strains generated from the yCH3 parent strain. Strains yCH153 ("wild type"), yCH137 (110.5::EcCoEDA-PaCoEDD) and yCH208 (tal1::EcCoEDA-PaCoEDD) were grown in YPD with 8% glucose as the sole carbon source. The experiments were conducted in duplicate. The results demonstrated that the presence of the EDD/EDA within the genome decreased the yield of ethanol. The first experiment gave the following yields: yCH153=0.427 g ethanol/g glucose and yCH137=0.425 g/g (data not shown). The second experiment gave the following yields: yCH153=0.420 g ethanol/g glucose and yCH137=0.417 g/g (data not shown). In contrast, the addition of the tal1 mutation in yCH208 statistically increased the yield to 0.437 g/g, giving rise to a yield increase of 2.48% in the first experiment and an increase of 1.46% in the second experiment. Glycerol was also reduced by 13% or 7% in strain yCH208 compared to yCH153, in experiment one or two respectively. The results are shown in FIG. 33, which graphically illustrates the results of shake flask fermentations for strains yCH153, yCH137 and yCH208 grown in YPD with 8% glucose. The results represent the average of 8 samples/strain.
Analysis of CO2 production as a product of ethanol production
The production of CO2 was also measured in parental and engineered strains to determine how much of the CO2 produced was due to ethanol production from the engineered pathways. The data, presented in the table below, suggest that in the engineered yCH208 strain more of the CO2 produced could be attributed to ethanol production. Therefore, the TAL1 alteration (e.g., deletion of TAL1) seems to provide additional carbon flux through the engineered pathways, which should lead to an increase in ethanol production of at least 1% in strains with the tal1 phenotype, compared to strains that are native for TAL1 activity.

Strain   Replicate values                                           Average  StanDev
yCH137   97.74  97.92  98.17  98.21  97.31  99.00  97.70  120.73    98.01    0.53
yCH153   98.34  98.44  98.74  96.31  97.25  97.04  98.08  98.02     97.78    0.83
yCH208   99.20  99.46  99.51  99.76  98.93  98.75  98.53            99.16    0.44

Condition
Media                  YPD
Temperature            34.00

Comparison             yCH153   yCH208
Strain                 153      208
Number of replicates   8        7
Average                97.78    99.16
StanDev                0.83     0.44

F-test                 0.1490
Distribution           0.99
Type                   3.00
T-test                 0.00
Distribution           1.00
Different              Yes
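The summary statistics in the table above can be rechecked from the listed replicate values. The following is a minimal sketch, not the analysis actually used to prepare the table: it assumes that the "Type 3.00" entry denotes a two-sample, unequal-variance (Welch) t-test, and the function name welch_t is hypothetical. It also illustrates that the reported yCH137 average of 98.01 is reproduced only when the 120.73 value is left out, which suggests (but does not establish) that this value was treated as an outlier.

```python
from math import sqrt
from statistics import mean, stdev

# Replicate values copied from the table above.
ych137 = [97.74, 97.92, 98.17, 98.21, 97.31, 99.00, 97.70, 120.73]
ych153 = [98.34, 98.44, 98.74, 96.31, 97.25, 97.04, 98.08, 98.02]
ych208 = [99.20, 99.46, 99.51, 99.76, 98.93, 98.75, 98.53]


def welch_t(a, b):
    """Return Welch's t statistic (b versus a) and the
    Welch-Satterthwaite degrees of freedom.

    Assumption: unequal variances, as a 'Type 3' test would imply.
    """
    va = stdev(a) ** 2 / len(a)  # squared standard error of mean(a)
    vb = stdev(b) ** 2 / len(b)  # squared standard error of mean(b)
    t = (mean(b) - mean(a)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t_stat, dof = welch_t(ych153, ych208)
print(f"yCH153: mean={mean(ych153):.2f} sd={stdev(ych153):.2f}")
print(f"yCH208: mean={mean(ych208):.2f} sd={stdev(ych208):.2f}")
print(f"Welch t={t_stat:.2f}, df={dof:.1f}")

# The reported yCH137 average matches the first seven values only.
print(f"yCH137 without 120.73: mean={mean(ych137[:7]):.2f}")
```

Converting the t statistic to a p-value requires a t-distribution function, which the Python standard library does not provide; a statistics package would be needed for that step.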
Fermentation in Multifors
yCH153 and yCH208 were compared a second time in larger-scale fermentation reactions. A higher ethanol yield was obtained for yCH208 (0.436 g ethanol/g glucose) than for yCH153 (0.422 g ethanol/g glucose), as shown in FIG. 34. As expected, a drop in glycerol production by yCH247 (0.089 g glycerol/g glucose) was also observed (compare to yCH1 at 0.103 g glycerol/g glucose). Biomass differed slightly between the two strains: yCH153 generated 0.015 versus 0.012 for yCH208. Statistical analysis of these data was not possible due to the limited number of data points.
Example 54: Evaluation of Stress on Ethanol Production: Temperature Stress
As noted previously, it was observed that the amount of ethanol produced in the engineered strains was higher under stress conditions. Experiments were conducted to determine how stress affects ethanol fermentation in strains BF903 (Turbo) and BF2100. One of the stress factors investigated was the effect of higher temperatures on ethanol production. The results are presented in the tables below. The shake flask fermentations were performed at 30°C, 37°C and 39°C. The results demonstrate that as the engineered strain is stressed by temperature, ethanol production is increased and glycerol production is decreased, suggesting the presence of the engineered pathways (e.g., added EDA and EDD genes with other pathway modifications) is beneficial and can lead to increased ethanol production under stress conditions.
          Ethanol (g/g)     Glycerol (g/g)    Biomass (g/g)
          Ave     Stdev     Ave     Stdev     Ave     Stdev
BF903     0.436   0.008     0.035   0.001     0.078   0.002
BF2100    0.435   0.009     0.030   0.001     0.065   0.003
Figure imgf000363_0001
39 °C
Figure imgf000363_0002
Example 55: Evaluation of Stress on Ethanol Production: Amino Acid Stress
The effect of additional amino acids in the media on ethanol and glycerol production was also investigated. The experiments were performed using UMM and SCD media supplemented with amino acids. In each medium tested, lower glycerol yields were observed along with increased ethanol production in the engineered strain (e.g., BF2100). The results demonstrate that the stress caused by supplementing the media with additional amino acids produced a response similar to that seen with temperature stress: increased ethanol production in conjunction with decreased glycerol production. This suggests that expression of the engineered pathways, in their current form, is enhanced by stress. Additional stress conditions are currently being investigated. Additionally, experiments are being undertaken to determine whether the response is strain dependent.
UMM       Ethanol (g/g)     Glycerol (g/g)    Biomass (g/g)
          Ave     Stdev     Ave     Stdev     Ave     Stdev
BF903     0.403   0.004     0.061   0.001     0.060   0.002
BF2100    0.412   0.003     0.059   0.002     0.057   0.002
Figure imgf000364_0001
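The size of the strain-to-strain differences in the UMM table can be expressed as a percent change relative to BF903. The sketch below is illustrative only: the function name relative_change is hypothetical, the input values are those of the UMM table as extracted (with decimal points restored where the extraction dropped them), and the uncertainty uses first-order error propagation under the assumption that the two strain measurements are independent.

```python
from math import sqrt


def relative_change(ref, ref_sd, new, new_sd):
    """Percent change of `new` versus `ref`, with a first-order
    propagated standard deviation (independent errors assumed)."""
    ratio = new / ref
    ratio_sd = ratio * sqrt((ref_sd / ref) ** 2 + (new_sd / new) ** 2)
    return 100 * (ratio - 1), 100 * ratio_sd


# UMM values (g/g): BF903 is the reference, BF2100 the engineered strain.
ethanol = relative_change(0.403, 0.004, 0.412, 0.003)
glycerol = relative_change(0.061, 0.001, 0.059, 0.002)

print(f"ethanol:  {ethanol[0]:+.2f}% (sd {ethanol[1]:.2f}%)")
print(f"glycerol: {glycerol[0]:+.2f}% (sd {glycerol[1]:.2f}%)")
```

Reporting the propagated spread alongside the percent change gives a rough indication of how each difference compares with the replicate variability; it is not a substitute for a formal statistical test.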
Example 56: Alternative Xylose Metabolic Pathway Engineering
As noted herein, yeast strains that can metabolize xylose using both the xylose isomerase (XI) activity and the fungal xylose reductase (XR) and xylitol dehydrogenase (XD) activities are currently being engineered. The noted redox imbalance can be offset by additional alterations engineered in strains described herein. An integration construct useful for generating strains that contain the XI system and the alternative fungal XR/XD system has been constructed, and is diagrammatically illustrated in FIG. 45.
The integration construct is configured to enable integration of the Candida tenuis XR gene (CtXYL1) and an XDH allele (XYL2; e.g., XDH activities from C. shehatae and D. hansenii are currently being evaluated). After successful integration, the construct will also provide a stronger, non-glucose-upregulated promoter upstream of the XKS1 allele present in the S. cerevisiae genome. The cassette is designed to be integrated into BF903 derived strains that are either wild type or expressing the alternative C6 pathway (e.g., minus the TAL1 deletion). Strains are also being engineered to alter GND1/GND2 activity (e.g., gnd1/gnd2 mutation) and to upregulate the pentose phosphate pathway (PPP) to optimize improvements in ethanol production. Fermentation conditions are also being optimized to ensure proper expression of the XKS1 allele.
Example 57: Additional Native, Codon Optimized and Chimeric Nucleotide Sequences of Xylose Isomerase Activities Altered in Engineered Organisms Described Herein
As noted above, additional xylose isomerase genes were identified and isolated, and chimeric versions were generated in certain embodiments. Presented below are native and codon optimized nucleotide sequences of candidate xylose isomerase genes from R. flavefaciens FD-1, Ruminococcus 18P13, and Clostridiales genomosp., and nucleotide sequences of native and codon optimized segments utilized to generate certain chimeric xylose isomerase sequence embodiments.
Figure imgf000366_0001
Figure imgf000367_0001
Figure imgf000368_0001
Figure imgf000369_0001
Figure imgf000370_0001
Figure imgf000371_0001
Figure imgf000372_0001
Example 58: Construction of a URA3 disruption cassette for use in an industrial strain
pBF1034 was constructed using standard molecular biology techniques. pBF1034 contains about 300 bp of the 5' flanking region of URA3 (e.g., around position 6148 to 6450) and about 300 bp of the 3' flanking region of the URA3 gene (e.g., around position 2061 to 2360). pBF1034 also contains the KanMX resistance gene (e.g., around position 3302 to 4150), the TDH3 promoter (e.g., around position 2635 to 3284), the ScATO1 L75Q gene (e.g., around position 2635 to 3284) and the CYC terminator (e.g., around position 4155 to 4407). There also are two R153 repeats (e.g., around position 2380 to 2582 and around position 5924 to 6122). The entire 5' URA3-R153-PTDH3-ATO1 L75Q-TCYC1-R153-3' URA3 is flanked by PacI restriction sites. The sequence of pBF1034 is given below as SEQ ID No: 181.
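The coordinates quoted for pBF1034 can be cross-checked against the numbered sequence listing. The following is a minimal sketch under stated assumptions: the helper names parse_rows and find_sites are hypothetical, only two rows of SEQ ID No: 181 are reproduced, and the second row (printed in the listing with a truncated label) is assumed to begin at position 6421 on the basis of the 60-base row spacing. The sketch locates PacI recognition sites (5'-TTAATTAA-3') and reports their 1-based start positions.

```python
PACI_SITE = "ttaattaa"  # PacI recognition sequence, 5'-TTAATTAA-3'

# Two rows copied from SEQ ID No: 181. The start of the second row
# (6421) is an assumption; the listing prints a truncated label.
ROWS = """
2041 gccagtgaat tcttaattaa cttttgttcc actacttttt ggaactcttg ttgttctttg
6421 attcttcacc ataaatatgc ctcgcattaa ttaagcatgc aagcttggcg taatcatggt
"""


def parse_rows(text):
    """Parse rows of the form '<start> <10-mer> <10-mer> ...' into a
    dict mapping the 1-based start position to the joined bases."""
    rows = {}
    for line in text.strip().splitlines():
        fields = line.split()
        rows[int(fields[0])] = "".join(fields[1:])
    return rows


def find_sites(rows, site=PACI_SITE):
    """Return 1-based start positions of `site` inside each row.
    Matches spanning two separate rows are not detected."""
    hits = []
    for start, bases in sorted(rows.items()):
        idx = bases.find(site)
        while idx != -1:
            hits.append(start + idx)
            idx = bases.find(site, idx + 1)
    return hits


sites = find_sites(parse_rows(ROWS))
print(sites)  # [2053, 6447]
```

Under these assumptions the first site ends at position 2060, immediately before the 3' URA3 flank quoted as starting around position 2061, and the second site begins at position 6447, at the end of the 5' URA3 flank quoted as ending around position 6450. A full check would parse the whole listing into one string so that sites spanning row boundaries are also found.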
SEQ ID No: 181 - Sequence of pBF1034
1 ccttcgggaa gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag
61 gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc
121 ttatccggta actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca
181 gcagccactg gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg
241 aagtggtggc ctaactacgg ctacactaga agaacagtat ttggtatctg cgctctgctg
301 aagccagtta ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct
361 ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa
421 gaagatcctt tgatcttttc tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa
481 gggattttgg tcatgagatt atcaaaaagg atcttcacct agatcctttt aaattaaaaa
541 tgaagtttta aatcaatcta aagtatatat gagtaaactt ggtctgacag ttaccaatgc
601 ttaatcagtg aggcacctat ctcagcgatc tgtctatttc gttcatccat agttgcctga
661 ctccccgtcg tgtagataac tacgatacgg gagggcttac catctggccc cagtgctgca
721 atgataccgc gagacccacg ctcaccggct ccagatttat cagcaataaa ccagccagcc
781 ggaagggccg agcgcagaag tggtcctgca actttatccg cctccatcca gtctattaat
841 tgttgccggg aagctagagt aagtagttcg ccagttaata gtttgcgcaa cgttgttgcc
901 attgctacag gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt cagctccggt
961 tcccaacgat caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc ggttagctcc
1021 ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag tgttatcact catggttatg
1081 gcagcactgc ataattctct tactgtcatg ccatccgtaa gatgcttttc tgtgactggt
1141 gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg ctcttgcccg
1201 gcgtcaatac gggataatac cgcgccacat agcagaactt taaaagtgct catcattgga
1261 aaacgttctt cggggcgaaa actctcaagg atcttaccgc tgttgagatc cagttcgatg
1321 taacccactc gtgcacccaa ctgatcttca gcatctttta ctttcaccag cgtttctggg
1381 tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa taagggcgac acggaaatgt
1441 tgaatactca tactcttcct ttttcaatat tattgaagca tttatcaggg ttattgtctc
1501 atgagcggat acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca
1561 tttccccgaa aagtgccacc tgacgtctaa gaaaccatta ttatcatgac attaacctat
1621 aaaaataggc gtatcacgag gccctttcgt ctcgcgcgtt tcggtgatga cggtgaaaac
1681 ctctgacaca tgcagctccc ggagacggtc acagcttgtc tgtaagcgga tgccgggagc
1741 agacaagccc gtcagggcgc gtcagcgggt gttggcgggt gtcggggctg gcttaactat
1801 gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga
1861 tgcgtaagga gaaaataccg catcaggcgc cattcgccat tcaggctgcg caactgttgg
1921 gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg gggatgtgct
1981 gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg taaaacgacg
2041 gccagtgaat tcttaattaa cttttgttcc actacttttt ggaactcttg ttgttctttg
2101 gagttcaatg cgtccatctt tacagtcctg tcttattgtt cttgatttgt gccccgtaaa
2161 atactgttac ttggttctgg cgaggtattg gatggttcct ttttataaag gccatgaagc
2221 tttttctttc caattttttt tttttcgtca ttataaaaat cattacgacc gagattcccg
2281 ggtaataact gatataatta aattgaagct ctaatttgtg agtttagtat acatgcattt
2341 acttataata cagtttttta gttttgctgg ccgcggccgc tcgggtctcg cccggcgcta
2401 gtccagccgt agcgctctcc ggcgatcacc ccggagcact ctggagccga gcggtcgggt
2461 ctgttgggcg cgccgcggct acggacggct cgactcactg gcgctcgacc ccgtatcccc
2521 cgtctcggac gacgcaccgt tgcgcgggaa cgatcggcgg cgctcacacg cacgatcgga
2581 aggatccact agtaacggcc gccagtgtgc tgtggaattc gcccttggat tccaagttta
2641 tcattatcaa tactcgccat ttcaaagaat acgtaaataa ttaatagtag tgattttcct
2701 aactttattt agtcaaaaaa ttagcctttt aattctgctg taacccgtac atgcccaaaa
2761 tagggggcgg gttacacaga atatataaca tcgtaggtgt ctgggtgaac agtttattcc
2821 tggcatccac taaatataat ggagcccgct ttttaagctg gcatccagaa aaaaaaagaa
2881 tcccagcacc aaaatattgt tttcttcacc aaccatcagt tcataggtcc attctcttag
2941 cgcaactaca gagaacaggg gcacaaacag gcaaaaaacg ggcacaacct caatggagtg
3001 atgcaacctg cctggagtaa atgatgacac aaggcaattg acccacgcat gtatctatct
3061 cattttctta caccttctat taccttctgc tctctctgat ttggaaaaag ctgaaaaaaa
3121 aggttgaaac cagttccctg aaattattcc cctacttgac taataagtat ataaagacgg
3181 taggtattga ttgtaattct gtaaatctat ttcttaaact tcttaaattc tacttttata
3241 gttagtcttt tttttagttt taaaacacca gaacttagtt tcgacggatt ctagaactag
3301 tatgtctgac aaggaacaaa cgagcggaaa cacagatttg gagaatgcac cagcaggata
3361 ctatagttcc catgataacg acgttaatgg cgttgcagaa gatgaacgtc catctcatga
3421 ttcgttgggc aagatttaca ctggaggtga taacaatgaa tatatctata ttgggcgtca
3481 aaagtttttg aagagcgact tataccaagc ctttggtggt acccagaatc cagggttagc
3541 tcctgctcca gtgcacaaat ttgctaatcc tgcgccctta ggtctttcag ccttcgcgtt
3601 gacgacattt gtgctgtcca tgttcaatgc gagagcgcaa gggatcactg ttcctaatgt
3661 tgtcgtcggt tgtgctatgt tttatggtgg tttggtgcaa ttgattgctg gtatttggga
3721 gatagctttg gaaaatactt ttggtggtac cgcattatgt tcttacggtg ggttttggtt
3781 gagtttcgct gcaatttaca ttccttggtt tggtatcttg gaagcttacg aagacaatga
3841 atctgatttg aataatgctt taggatttta tttgttgggg tgggccatct ttacgtttgg
3901 tttaaccgtt tgtaccatga aatccactgt tatgttcttt ttgttgttct tcttactagc
3961 attaactttc ctactgttgt ctattggtca ctttgctaat agacttggtg tcacaagagc
4021 tggtggtgtc ctgggagttg ttgttgcttt cattgcttgg tacaacgcat atgcaggtgt
4081 tgctacaaag cagaattcat atgtactggc tcgtccattc ccattaccat ctactgaaag
4141 ggtaatcttt taactcgagt catgtaatta gttatgtcac gcttacattc acgccctccc
4201 cccacatccg ctctaaccga aaaggaagga gttagacaac ctgaagtcta ggtccctatt
4261 tattttttta tagttatgtt agtattaaga acgttattta tatttcaaat ttttcttttt
4321 tttctgtaca gacgcgtgta cgcatgtaac attatactga aaaccttgct tgagaaggtt
4381 ttgggacgct cgaaggcttt aatttgcggg tcgacgtacc cccgggataa ttaaggcgcg
4441 ccagatctgt ttagcttgcc tcgtccccgc cgggtcaccc ggccagcgac atggaggccc
4501 agaataccct ccttgacagt cttgacgtgc gcagctcagg ggcatgatgt gactgtcgcc
4561 cgtacattta gcccatacat ccccatgtat aatcatttgc atccatacat tttgatggcc
4621 gcacggcgcg aagcaaaaat tacggctcct cgctgcagac ctgcgagcag ggaaacgctc
4681 ccctcacaga cgcgttgaat tgtccccacg ccgcgcccct gtagagaaat ataaaaggtt
4741 aggatttgcc actgaggttc ttctttcata tacttccttt taaaatcttg ctaggataca
4801 gttctcacat cacatccgaa cataaacaac catgggtaag gaaaagactc acgtttcgag
4861 gccgcgatta aattccaaca tggatgctga tttatatggg tataaatggg ctcgcgataa
4921 tgtcgggcaa tcaggtgcga caatctatcg attgtatggg aagcccgatg cgccagagtt
4981 gtttctgaaa catggcaaag gtagcgttgc caatgatgtt acagatgaga tggtcagact
5041 aaactggctg acggaattta tgcctcttcc gaccatcaag cattttatcc gtactcctga
5101 tgatgcatgg ttactcacca ctgcgatccc cggcaaaaca gcattccagg tattagaaga
5161 atatcctgat tcaggtgaaa atattgttga tgcgctggca gtgttcctgc gccggttgca
5221 ttcgattcct gtttgtaatt gtccttttaa cagcgatcgc gtatttcgtc tcgctcaggc
5281 gcaatcacga atgaataacg gtttggttga tgcgagtgat tttgatgacg agcgtaatgg
5341 ctggcctgtt gaacaagtct ggaaagaaat gcataagctt ttgccattct caccggattc
5401 agtcgtcact catggtgatt tctcacttga taaccttatt tttgacgagg ggaaattaat
5461 aggttgtatt gatgttggac gagtcggaat cgcagaccga taccaggatc ttgccatcct
5521 atggaactgc ctcggtgagt tttctccttc attacagaaa cggctttttc aaaaatatgg
5581 tattgataat cctgatatga ataaattgca gtttcatttg atgctcgatg agtttttcta
5641 atcagtactg acaataaaaa gattcttgtt ttcaagaact tgtcatttgt atagtttttt
5701 tatattgtag ttgttctatt ttaatcaaat gttagcgtga tttatatttt ttttcgcctc
5761 gacatcatct gcccagatgc gaagttaagt gcgcagaaag taatatcatg cgtcaatcgt
5821 atgtgaatgc tggtcgctat actgctgtcg attcgatact aacgccgcca tccagtgtcg
5881 aaaacgagct cgaattgaat ccaagggcga attctgcaga tatctcgggt ctcgcccggc
5941 gctagtccag ccgtagcgct ctccggcgat caccccggag cactctggag ccgagcggtc
6001 gggtctgttg ggcgcgccgc ggctacggac ggctcgactc actggcgctc gaccccgtat
6061 cccccgtctc ggacgacgca ccgttgcgcg ggaacgatcg gcggcgctca cacgcacgat
6121 cggaagcggc cgcagatctg gccggccgat ttatcttcgt ttcctgcagg tttttgttct
6181 gtgcagttgg gttaagaata ctgggcaatt tcatgtttct tcaacactac atatgcgtat
6241 atataccaat ctaagtctgt gctccttcct tcgttcttcc ttctgttcgg agattaccga
6301 atcaaaaaaa tttcaaggaa accggaatca aaaaaaagaa taaaaaaaaa atgatgaatt
6361 gaaaagcttt atggaccctg aaaccacagc cacattaacc ttctttgatg gtcaaaactt
6421 attcttcacc ataaatatgc ctcgcattaa ttaagcatgc aagcttggcg taatcatggt
6481 catagctgtt tcctgtgtga aattgttatc cgctcacaat tccacacaac atacgagccg
6541 gaagcataaa gtgtaaagcc tggggtgcct aatgagtgag ctaactcaca ttaattgcgt
6601 tgcgctcact gcccgctttc cagtcgggaa acctgtcgtg ccagctgcat taatgaatcg
6661 gccaacgcgc ggggagaggc ggtttgcgta ttgggcgctc ttccgcttcc tcgctcactg
6721 actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa
6781 tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc
6841 aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc
6901 ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat
6961 aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc
7021 cgcttaccgg atacctgtcc gcctttctc
Example 59: Plasmid Constructs Utilized for Genetic Manipulation of Strain BF2513 and descendants of BF2513
Plasmid pBF1221 integrates the Saccharomyces cerevisiae (Sc) RKE1 gene (e.g., ScRKE1; under control of the TDH3 promoter) and ScTAL1 (e.g., under control of the TEF1 promoter) at the PHO13 gene locus, replacing the PHO13 open reading frame. pBF1221 was constructed using standard molecular biology techniques. The plasmid contains about 300 bp of the 3' flanking region of PHO13 (e.g., around position 714 to 1014) and about 300 bp of the 5' flanking region of PHO13 (e.g., around position 5980 to 6280). pBF1221 also contains the ScTEF1 promoter (e.g., located at about position 5502 to 5971), the ScTAL1 open reading frame (e.g., located at about position 4485 to 5489), the ScADH1 terminator (e.g., located at about position 4248 to 4471), the TDH3 promoter (e.g., located at about position 3572 to 4244), the ScRKE1 open reading frame (e.g., located at about position 2846 to 3559), and the CYC1 terminator (e.g., located at about position 2577 to 2833). In addition, pBF1221 contains the ScURA3 gene (e.g., located at about position 1233 to 2353) flanked by two R127 repeats (e.g., one at about position 2363 to 2563 and another at about position 1029 to 1229). The whole 5' PHO13-PTEF1-ScTAL1-PTDH3-ScRKE1-R127-ScURA3-R127-3' PHO13 cassette is flanked by PacI restriction enzyme sites. The sequence of pBF1221 is given below as SEQ ID No: 182.
SEQ ID No: 182 - Sequence of pBF1221
1 ttcagcatct tttactttca ccagcgtttc tgggtgagca aaaacaggaa ggcaaaatgc
61 cgcaaaaaag ggaataaggg cgacacggaa atgttgaata ctcatactct tcctttttca
121 atattattga agcatttatc agggttattg tctcatgagc ggatacatat ttgaatgtat
181 ttagaaaaat aaacaaatag gggttccgcg cacatttccc cgaaaagtgc cacctgacgt
241 ctaagaaacc attattatca tgacattaac ctataaaaat aggcgtatca cgaggccctt
301 tcgtctcgcg cgtttcggtg atgacggtga aaacctctga cacatgcagc tcccggagac
361 ggtcacagct tgtctgtaag cggatgccgg gagcagacaa gcccgtcagg gcgcgtcagc
421 gggtgttggc gggtgtcggg gctggcttaa ctatgcggca tcagagcaga ttgtactgag
481 agtgcaccat atgcggtgtg aaataccgca cagatgcgta aggagaaaat accgcatcag
541 gcgccattcg ccattcaggc tgcgcaactg ttgggaaggg cgatcggtgc gggcctcttc
601 gctattacgc cagctggcga aagggggatg tgctgcaagg cgattaagtt gggtaacgcc
661 agggttttcc cagtcacgac gttgtaaaac gacggccagt gaattcttaa ttaacaaacc
721 tgaatatttt tccttttcaa aaagtaattc tacccctaga ttttgcattg ctcctctata
781 actcattatt ggttaaggtg tagatgtcac caagtttatc aatgtaaaat ttaggtcttg
841 gataatcgtg cgaaatcttc aaggctctct cttcggtttc aataccactc aaaacgagta
901 gtgtgccacc taacccacct tcaacaccga atttcatatc ggtgtttaat ctgtcaccaa
961 ccatacagca ctttgatcta tccaggttga atgccgatat aatgctgttt agcagcggcc
1021 gccccggggt cttagggccc agggaccgca cgggtcgacc gcgcgactgg tcggagcttg
1081 cgcgtctacg ccactcggcg gccccgacgg gggatgccgc ggaatgtccg ccggcgtatg
1141 cggctcaagc cggaccgtcg gactgcgaag cgccgtgagc acccctcgac ctgaccggac
1201 gcggcgcacc cgtccgagta tcgtcgcgag gtacccaggg tccataaagc ttttcaattc
1261 atcatttttt ttttattctt ttttttgatt tcggtttcct tgaaattttt ttgattcggt
1321 aatctccgaa cagaaggaag aacgaaggaa ggagcacaga cttagattgg tatatatacg
1381 catatgtagt gttgaagaaa catgaaattg cccagtattc ttaacccaac tgcacagaac
1441 aaaaacctgc aggaaacgaa gataaatcat gtcgaaagct acatataagg aacgtgctgc
1501 tactcatcct agtcctgttg ctgccaagct atttaatatc atgcacgaaa agcaaacaaa
1561 cttgtgtgct tcattggatg ttcgtaccac caaggaatta ctggagttag ttgaagcatt
1621 aggtcccaaa atttgtttac taaaaacaca tgtggatatc ttgactgatt tttccatgga
1681 gggcacagtt aagccgctaa aggcattatc cgccaagtac aattttttac tcttcgaaga
1741 cagaaaattt gctgacattg gtaatacagt caaattgcag tactctgcgg gtgtatacag
1801 aatagcagaa tgggcagaca ttacgaatgc acacggtgtg gtgggcccag gtattgttag
1861 cggtttgaag caggcggcag aagaagtaac aaaggaacct agaggccttt tgatgttagc
1921 agaattgtca tgcaagggct ccctatctac tggagaatat actaagggta ctgttgacat
1981 tgcgaagagc gacaaagatt ttgttatcgg ctttattgct caaagagaca tgggtggaag
2041 agatgaaggt tacganggt tgattatgac acccggtgtg ggtttagatg acaagggaga
2101 cgcattgggt caacagtata gaaccgtgga tgatgtggtc tctacaggat ctgacattat
2161 tattgttgga agaggactat ttgcaaaggg aagggatgct aaggtagagg gtgaacgtta
2221 cagaaaagca ggctgggaag catatttgag aagatgcggc cagcaaaact aaaaaactgt
2281 attataagta aatgcatgta tactaaactc acaaattaga gcttcaattt aattatatca
2341 gttattaccc gcggccggcc gcgtcttagg gcccagggac cgcacgggtc gaccgcgcga
2401 ctggtcggag cttgcgcgtc tacgccactc ggcggccccg acgggggatg ccgcggaatg
2461 tccgccggcg tatgcggctc aagccggacc gtcggactgc gaagcgccgt gagcacccct
2521 cgacctgacc ggacgcggcg cacccgtccg agtatcgtcg cgacccgggt taatgcggcc
2581 gcggccgaaa ttaaagcctt cgagcgtccc aaaaccttct caagcaaggt tttcagtata
2641 atgttacatg cgtacacgcg tctgtacaga aaaaaaagaa aaatttgaaa tataaataac
2701 gttcttaata ctaacataac tataaaaaaa taaataggga cctagacttc aggttgtcta
2761 actccttcct tttcggttag agcggatgtg gggggagggc gtgaatgtaa gcgtgacata
2821 actaattaca tgactcgagc tactaatcta gcaaatctct agaacgcaat tccttcgaga
2881 cttcttcttt catgaaggag ataacatcgt gcgggtcagc tgcagtgaaa acactggtac
2941 cagcgacaat aacgttggca ccggctttgg cggctttcgg gatggtctcc ttgcccaaac
3001 caccatcgac ttggatattc aaatggggga acttggctct caaagtttcc acttttggca
3061 tcatgtcttc catgaatttt tggcctccaa acccaggttc cacagtcata acaagagcca
3121 tatccaaatg aggagctagt tcaaataaaa cgtcaacaga agtaccaggt ttgatggcgc
3181 atgcagcttt gatgccctta gacttaatca acttaactaa atgcaaaggg tcttgtgtgg
3241 cctcgtagtg gaacgtaaat tggtcagcac cacatttagc aaaatcgtcg acccattttt
3301 caggattttc aaccatcatg tgacaatcga agaacgcagt gggcttcttt tctgtgttgc
3361 tagcatcgcc agggcgtggc acagaacgac gtagggaggt aacaattggt tggcccagag
3421 taatgtttgg aacaaaatgg ccgtccatga catcgatatg taaccaatct gcgccggcgt
3481 tgatgacctt atgacattcg caacccaagt tggcgaagtc agaagcaagg atactgggag
3541 ctataattgg tttgaccatt tttttactag ttctagaatc cgtcgaaact aagttctggt
3601 gttttaaaac taaaaaaaag actaactata aaagtagaat ttaagaagtt taagaaatag
3661 atttacagaa ttacaatcaa tacctaccgt ctttatatac ttattagtca agtaggggaa
3721 taatttcagg gaactggttt caaccttttt tttcagcttt ttccaaatca gagagagcag
3781 aaggtaatag aaggtgtaag aaaatgagat agatacatgc gtgggtcaat tgccttgtgt
3841 catcatttac tccaggcagg ttgcatcact ccattgaggt tgtgcccgtt ttttgcctgt
3901 ttgtgcccct gttctctgta gttgcgctaa gagaatggac ctatgaactg atggttggtg
3961 aagaaaacaa tattttggtg ctgggattct ttttttttct ggatgccagc ttaaaaagcg
4021 ggctccatta tatttagtgg atgccaggaa taaactgttc acccagacac ctacgatgtt
4081 atatattctg tgtaacccgc cccctatttt gggcatgtac gggttacagc agaattaaaa
4141 ggctaatttt ttgactaaat aaagttagga aaatcactac tattaattat ttacgtattc
4201 tttgaaatgg cgagtattga taatgataaa cggccggccg agctcctggg agcgatttgc
4261 aggcatttgc tcggcatgcc ggtagaggtg tggtcaataa gagcgacctc atgctatacc
4321 tgagaaagca acctgaccta caggaaagag ttactcaaga ataagaattt tcgttttaaa
4381 acctaagagt cactttaaaa tttgtataca cttatttttt ttataactta tttaataata
4441 aaaatcataa atcataagaa attcgcttat tactcgagct attaagcggt aactttcttt
4501 tcaatcaagt cgaatagagt aacaatatcg gcagagaatt ttctgatacc ttcggacaat
4561 ttttcagtgg ccatagcgtc ttcattcaag tcgaatctga atttagattc gtcgctgatg
4621 taagaaatct tgtcgccggc ttccttctta gcggagacag ggtccaaaac tcttgggaaa
4681 ggttcagtac tgttcatcaa cttgtccaat aaagctggag aaattgttag atagtcaaca
4741 ccagccaagt ttttgatttc gtcagtgctt ctgaaagaag cacccataac aatagtcttg
4801 taaccgtact tcttgtagta gttgtagatt ttcttgacgg aaataacacc tgggtcggct
4861 tcacccttgt aatctttacc agtgctggat ttgtaccagt ctagaattct accaacaaat
4921 ggggaaatca aagtaacttg ggcctcggca caggcaactg cttgaacgaa ggagaataat
4981 agagtcaaat tacagtggat accgtccttt tcttccaatt ctttggcagc ttgaatacct
5041 tcccaagtgg aagcaatttt aataaggact ctttccttgg agacaccttc ttgttcaaac
5101 aatttaatga tatgtctagc cttttcaatg gtagcttgag tgtcaaaaga caatctagca
5161 tcaacttcgg tggagactct gcctggaaca atctttaaga tctccttacc gaattcgact
5221 aacaatctgt ccacagcatt ttcgacttgt tcttcggtgg tcttaccatg cttcttaccg
5281 tattccacgg caacatcgat caacttggcg taagttggtt gcttggcagc agccaagatc
5341 aatgatgggt tagttgtgga gtcttgaggt tgaaacttgg caatagagcc gaaatcacca
5401 gtgtcggcaa caacgacagt gccggaggct ttcaattgtt ctagagagtt gttagcaacc
5461 ttttgtttct tttgagctgg ttcagacatt tttttactag ttttgtaatt aaaacttaga
5521 ttagattgct atgctttctt tctaatgagc aagaagtaaa aaaagttgta atagaacaag
5581 aaaaatgaaa ctgaaacttg agaaattgaa gaccgtttat taacttaaat atcaatggga
5641 ggtcatcgaa agagaaaaaa atcaaaaaaa aaaattttca agaaaaagaa acgtgataaa
5701 aatttttatt gcctttttcg acgaagaaaa agaaacgagg cggtctcttt tttcttttcc
5761 aaacctttag tacgggtaat taacgacacc ctagaggaag aaagagggga aatttagtat
5821 gctgtgcttg ggtgttttga agtggtacgg cgatgcgcgg agtccgagaa aatctggaag
5881 agtaaaaaag gagtagaaac attttgaagc tatggtgtgt gggggatcac ttgtggggga
5941 ttgggtgtga tgtaaggatt cgcggtgcta gggccggcct ttcccgagtt gtatattctt
6001 tgtcagggca agctataagg ctttttttgt gatttggctc gagcttgata tttatgtctg
6061 cttgttataa accgattgcg tcaattacaa ttgaaaaatt cactaaaacg gaaagaacat
6121 tcaccgaaaa aaagaaaaac cggaaagtta atagataagg gacgtcatta atatacgatt
6181 aagtaaatac tgaaaacgtg ctggagaata gtaaagatgt cacatcacca tcttcaacgt
6241 cttcctctac aaaacttgag ttgataaaga agcatagtat taattaaaag cttggcgtaa
6301 tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc tcacaattcc acacaacata
6361 cgagccggaa gcataaagtg taaagcctgg ggtgcctaat gagtgagcta actcacatta
6421 attgcgttgc gctcactgcc cgctttccag tcgggaaacc tgtcgtgcca gctgcattaa
6481 tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg
6541 ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag
6601 gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat gtgagcaaaa
6661 ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc
6721 cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg aaacccgaca
6781 ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg
6841 accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt ggcgctttct
6901 catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt
6961 gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag
7021 tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa caggattagc
7081 agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa ctacggctac
7141 actagaagaa cagtatttgg tatctgcgct ctgctgaagc cagttacctt cggaaaaaga
7201 gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc
7261 aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat cttttctacg
7321 gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat gagattatca
7381 aaaaggatct tcacctagat ccttttaaat taaaaatgaa gttttaaatc aatctaaagt
7441 atatatgagt aaacttggtc tgacagttac caatgcttaa tcagtgaggc acctatctca
7501 gcgatctgtc tatttcgttc atccatagtt gcctgactcc ccgtcgtgta gataactacg
7561 atacgggagg gcttaccatc tggccccagt gctgcaatga taccgcgaga cccacgctca
7621 ccggctccag atttatcagc aataaaccag ccagccggaa gggccgagcg cagaagtggt
7681 cctgcaactt tatccgcctc catccagtct attaattgtt gccgggaagc tagagtaagt
7741 agttcgccag ttaatagttt gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca
7801 cgctcgtcgt ttggtatggc ttcattcagc tccggttccc aacgatcaag gcgagttaca
7861 tgatccccca tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat cgttgtcaga
7921 agtaagttgg ccgcagtgtt atcactcatg gttatggcag cactgcataa ttctcttact
7981 gtcatgccat ccgtaagatg cttttctgtg actggtgagt actcaaccaa gtcattctga
8041 gaatagtgta tgcggcgacc gagttgctct tgcccggcgt caatacggga taataccgcg
8101 ccacatagca gaactttaaa agtgctcatc attggaaaac gttcttcggg gcgaaaactc
8161 tcaaggatct taccgctgtt gagatccagt tcgatgtaac ccactcgtgc acccaactga
8221 tc
Plasmid pBF1292 integrates ScTKL1 under the control of the TDH3 promoter and ScXKS1 under the control of the TEF1 promoter at the ADH2 gene, replacing its open reading frame. pBF1292 was constructed using standard molecular biology techniques. pBF1292 contains about 300 bp of the 3' flanking region of ScADH2 (e.g., located at about position 9022 to 9322) and 300 bp of the 5' flanking region of ADH2 (e.g., located at about position 1638 to 1938). pBF1292 also contains the ScTEF1 promoter (e.g., located at about position 1947 to 2416), the ScXKS1 open reading frame (e.g., located at about position 2429 to 4228), the ScADH1 terminator (e.g., located at about position 4239 to 4462), the TDH3 promoter (e.g., located at about position 4479 to 5114), the ScTKL1 open reading frame (e.g., located at about position 5151 to 7190), and the CYC1 terminator (e.g., located at about position 7194 to 7444). In addition, pBF1292 contains the ScURA3 gene (e.g., located at about position 7683 to 8803) flanked by two R127 repeats (e.g., one located at about position 7473 to 7673 and another located at about position 8807 to 9007). The whole 5' ADH2-PTEF1-ScXKS1-PTDH3-ScTKL1-R127-ScURA3-R127-3' ADH2 cassette is flanked by PacI restriction enzyme sites. The sequence of pBF1292 is presented below as SEQ ID No: 183.
SEQ ID No: 183 - Sequence of pBF1292
1 caacgatcgg aggaccgaag gagctaaccg cttttttgca caacatgggg gatcatgtaa
61 ctcgccttga tcgttgggaa ccggagctga atgaagccat accaaacgac gagcgtgaca
121 ccacgatgcc tgtagcaatg gcaacaacgt tgcgcaaact attaactggc gaactactta
181 ctctagcttc ccggcaacaa ttaatagact ggatggaggc ggataaagtt gcaggaccac
241 ttctgcgctc ggcccttccg gctggctggt ttattgctga taaatctgga gccggtgagc
301 gtgggtctcg cggtatcatt gcagcactgg ggccagatgg taagccctcc cgtatcgtag
361 ttatctacac gacggggagt caggcaacta tggatgaacg aaatagacag atcgctgaga
421 taggtgcctc actgattaag cattggtaac tgtcagacca agtttactca tatatacttt
481 agattgattt aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata
541 atctcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag
601 aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa
661 caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt
721 ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgttctt ctagtgtagc
781 cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa
841 tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa
901 gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc
961 ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa
1021 gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa
1081 caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg
1141 ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc
1201 tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg
1261 ctcacatgtt ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgcctttg
1321 agtgagctga taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg
1381 aagcggaaga gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat
1441 gcagctggca cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg
1501 tgagttagct cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg ccaagctttt aattaaaccg ggcatctcca acttataagt tggagaaata agagaatttc agattgagag aatgaaaaaa aaaaaaaaaa aaaaaaggca gaggagagca tagaaatggg gttcactttt tggtaaagct atagcatgcc tatcacatat aaatagagtg ccagtagcga cttttttcac actcgaaata ctcttactac tgctctcttg ttgtttttat cacttcttgt
ttcttcttgg taaatagaat atcaagctac aaaaagcata caatcaacta tcaactatta actatatcgt aatacacagg ccggccctag caccgcgaat ccttacatca cacccaatcc cccacaagtg atcccccaca caccatagct tcaaaatgtt tctactcctt ttttactctt ccagattttc tcggactccg cgcatcgccg taccacttca aaacacccaa gcacagcata ctaaatttcc cctctttctt cctctagggt gtcgttaatt acccgtacta aaggtttgga
aaagaaaaaa gagaccgcct cgtttctttt tcttcgtcga aaaaggcaat aaaaattttt atcacgtttc tttttcttga aaattttttt ttttgatttt tttctctttc gatgacctcc
cattgatatt taagttaata aacggtcttc aatttctcaa gtttcagttt catttttctt
gttctattac aacttttttt acttcttgct cattagaaag aaagcatagc aatctaatct
aagttttaat tacaaaacta gtaaaaaaat gttgtgttca gtaattcaga gacagacaag agaggtttcc aacacaatgt ctttagactc atactatctt gggtttgatc tttcgaccca acaactgaaa tgtctcgcca ttaaccagga cctaaaaatt gtccattcag aaacagtgga atttgaaaag gatcttccgc attatcacac aaagaagggt gtctatatac acggcgacac tatcgaatgt cccgtagcca tgtggttaga ggctctagat ctggttctct cgaaatatcg cgaggctaaa tttccattga acaaagttat ggccgtctca gggtcctgcc agcagcacgg gtctgtctac tggtcctccc aagccgaatc tctgttagag caattgaata agaaaccgga aaaagattta ttgcactacg tgagttctgt agcatttgca aggcaaaccg cccccaattg gcaagaccac agtactgcaa agcaatgtca agagtttgaa gagtgcatag gtgggcctga aaaaatggct caattaacag ggtccagagc ccattttaga tttactggtc ctcaaattct gaaaattgca caattagaac cagaagctta cgaaaaaaca aagaccattt ctttagtgtc taattttttg acttctatct tagtgggcca tcttgttgaa ttagaggagg cagatgcctg
tggtatgaac ctttatgata tacgtgaaag aaaattcagt gatgagctac tacatctaat tgatagttct tctaaggata aaactatcag acaaaaatta atgagagcac ccatgaaaaa tttgatagcg ggtaccatct gtaaatattt tattgagaag tacggtttca atacaaactg caaggtctct cccatgactg gggataattt agccactata tgttctttac ccctgcggaa gaatgacgtt ctcgtttccc taggaacaag tactacagtt cttctggtca ccgataagta tcacccctct ccgaactatc atcttttcat tcatccaact ctgccaaacc attatatggg tatgatttgt tattgtaatg gttctttggc aagggagagg ataagagacg agttaaacaa agaacgggaa aataattatg agaagactaa cgattggact ctttttaatc aagctgtgct agatgactca gaaagtagtg aaaatgaatt aggtgtatat tttcctctgg gggagatcgt tcctagcgta aaagccataa acaaaagggt tatcttcaat ccaaaaacgg gtatgattga aagagaggtg gccaagttca aagacaagag gcacgatgcc aaaaatattg tagaatcaca ggctttaagt tgcagggtaa gaatatctcc cctgctttcg gattcaaacg caagctcaca acagagactg aacgaagata caatcgtgaa gtttgattac gatgaatctc cgctgcggga ctacctaaat aaaaggccag aaaggacttt ttttgtaggt ggggcttcta aaaacgatgc tattgtgaag aagtttgctc aagtcattgg tgctacaaag ggtaatttta ggctagaaac accaaactca tgtgcccttg gtggttgtta taaggccatg tggtcattgt tatatgactc taataaaatt gcagttcctt ttgataaatt tctgaatgac aattttccat ggcatgtaat
ggaaagcata tccgatgtgg ataatgaaaa ttgggatcgc tataattcca agattgtccc cttaagcgaa ctggaaaaga ctctcatcta actcgagtaa taagcgaatt tcttatgatt tatgattttt attattaaat aagttataaa aaaaataagt gtatacaaat tttaaagtga ctcttaggtt ttaaaacgaa aattcttatt cttgagtaac tctttcctgt aggtcaggtt
gctttctcag gtatagcatg aggtcgctct tattgaccac acctctaccg gcatgccgag caaatgcctg caaatcgctc ccaggagctc ggccggccgt ttatcattat caatactcgc catttcaaag aatacgtaaa taattaatag tagtgatttt cctaacttta tttagtcaaa aaattagcct tttaattctg ctgtaacccg tacatgccca aaataggggg cgggttacac 4621 agaatatata acatcgtagg tgtctgggtg aacagtttat tcctggcatc cactaaatat 4681 aatggagccc gctttttaag ctggcatcca gaaaaaaaaa gaatcccagc accaaaatat 4741 tgttttcttc accaaccatc agttcatagg tccattctct tagcgcaact acagagaaca 4801 ggggcacaaa caggcaaaaa acgggcacaa cctcaatgga gtgatgcaac ctgcctggag 4861 taaatgatga cacaaggcaa ttgacccacg catgtatcta tctcattttc ttacaccttc 4921 tattaccttc tgctctctct gatttggaaa aagctgaaaa aaaaggttga aaccagttcc 4981 ctgaaattat tcccctactt gactaataag tatataaaga cggtaggtat tgattgtaat 5041 tctgtaaatc tatttcttaa acttcttaaa ttctactttt atagttagtc ttttttttag
5101 ttttaaaaca ccagaactta gtttcgacgg attctagaac tagtaaaaaa atgactcaat
5161 tcactgacat tgataagcta gccgtctcca ccataagaat tttggctgtg gacaccgtat
5221 ccaaggccaa ctcaggtcac ccaggtgctc cattgggtat ggcaccagct gcacacgttc
5281 tatggagtca aatgcgcatg aacccaacca acccagactg gatcaacaga gatagatttg
5341 tcttgtctaa cggtcacgcg gtcgctttgt tgtattctat gctacatttg actggttacg
5401 atctgtctat tgaagacttg aaacagttca gacagttggg ttccagaaca ccaggtcatc
5461 ctgaatttga gttgccaggt gttgaagtta ctaccggtcc attaggtcaa ggtatctcca
5521 acgctgttgg tatggccatg gctcaagcta acctggctgc cacttacaac aagccgggct
5581 ttaccttgtc tgacaactac acctatgttt tcttgggtga cggttgtttg caagaaggta
5641 tttcttcaga agcttcctcc ttggctggtc atttgaaatt gggtaacttg attgccatct
5701 acgatgacaa caagatcact atcgatggtg ctaccagtat ctcattcgat gaagatgttg 5761 ctaagagata cgaagcctac ggttgggaag ttttgtacgt agaaaatggt aacgaagatc 5821 tagccggtat tgccaaggct attgctcaag ctaagttatc caaggacaaa ccaactttga 5881 tcaaaatgac cacaaccatt ggttacggtt ccttgcatgc cggctctcac tctgtgcacg 5941 gtgccccatt gaaagcagat gatgttaaac aactaaagag caaattcggt ttcaacccag 6001 acaagtcctt tgttgttcca caagaagttt acgaccacta ccaaaagaca attttaaagc 6061 caggtgtcga agccaacaac aagtggaaca agttgttcag cgaataccaa aagaaattcc 6121 cagaattagg tgctgaattg gctagaagat tgagcggcca actacccgca aattgggaat 6181 ctaagttgcc aacttacacc gccaaggact ctgccgtggc cactagaaaa ttatcagaaa 6241 ctgttcttga ggatgtttac aatcaattgc cagagttgat tggtggttct gccgatttaa
6301 caccttctaa cttgaccaga tggaaggaag cccttgactt ccaacctcct tcttccggtt
6361 caggtaacta ctctggtaga tacattaggt acggtattag agaacacgct atgggtgcca
6421 taatgaacgg tatttcagct ttcggtgcca actacaaacc atacggtggt actttcttga
6481 acttcgtttc ttatgctgct ggtgccgtta gattgtccgc tttgtctggc cacccagtta
6541 tttgggttgc tacacatgac tctatcggtg tcggtgaaga tggtccaaca catcaaccta 6601 ttgaaacttt agcacacttc agatccctac caaacattca agtttggaga ccagctgatg 6661 gtaacgaagt ttctgccgcc tacaagaact ctttagaatc caagcatact ccaagtatca 6721 ttgctttgtc cagacaaaac ttgccacaat tggaaggtag ctctattgaa agcgcttcta 6781 agggtggtta cgtactacaa gatgttgcta acccagatat tattttagtg gctactggtt 6841 ccgaagtgtc tttgagtgtt gaagctgcta agactttggc cgcaaagaac atcaaggctc 6901 gtgttgtttc tctaccagat ttcttcactt ttgacaaaca acccctagaa tacagactat 6961 cagtcttacc agacaacgtt ccaatcatgt ctgttgaagt tttggctacc acatgttggg 7021 gcaaatacgc tcatcaatcc ttcggtattg acagatttgg tgcctccggt aaggcaccag 7081 aagtcttcaa gttcttcggt ttcaccccag aaggtgttgc tgaaagagca caaaagacca 7141 ttgcattcta taagggtgac aagctaattt ctcctttgaa aaaagctttc taatagctcg
7201 agtcatgtaa ttagttatgt cacgcttaca ttcacgccct ccccccacat ccgctctaac
7261 cgaaaaggaa ggagttagac aacctgaagt ctaggtccct atttattttt ttatagttat
7321 gttagtatta agaacgttat ttatatttca aatttttctt ttttttctgt acagacgcgt
7381 gtacgcatgt aacattatac tgaaaacctt gcttgagaag gttttgggac gctcgaaggc 7441 tttaatttcg gccgcggccg cattaacccg ggtcgcgacg atactcggac gggtgcgccg 7501 cgtccggtca ggtcgagggg tgctcacggc gcttcgcagt ccgacggtcc ggcttgagcc 7561 gcatacgccg gcggacattc cgcggcatcc cccgtcgggg ccgccgagtg gcgtagacgc 7621 gcaagctccg accagtcgcg cggtcgaccc gtgcggtccc tgggccctaa gacgcggccg 7681 gccgcgggta ataactgata taattaaatt gaagctctaa tttgtgagtt tagtatacat
7741 gcatttactt ataatacagt tttttagttt tgctggccgc atcttctcaa atatgcttcc
7801 cagcctgctt ttctgtaacg ttcaccctct accttagcat cccttccctt tgcaaatagt
7861 cctcttccaa caataataat gtcagatcct gtagagacca catcatccac ggttctatac
7921 tgttgaccca atgcgtctcc cttgtcatct aaacccacac cgggtgtcat aatcaaccaa
7981 tcgtaacctt catctcttcc acccatgtct ctttgagcaa taaagccgat aacaaaatct
8041 ttgtcgctct tcgcaatgtc aacagtaccc ttagtatatt ctccagtaga tagggagccc
8101 ttgcatgaca attctgctaa catcaaaagg cctctaggtt cctttgttac ttcttctgcc
8161 gcctgcttca aaccgctaac aatacctggg cccaccacac cgtgtgcatt cgtaatgtct
8221 gcccattctg ctattctgta tacacccgca gagtactgca atttgactgt attaccaatg
8281 tcagcaaatt ttctgtcttc gaagagtaaa aaattgtact tggcggataa tgcctttagc
8341 ggcttaactg tgccctccat ggaaaaatca gtcaagatat ccacatgtgt ttttagtaaa
8401 caaattttgg gacctaatgc ttcaactaac tccagtaatt ccttggtggt acgaacatcc
8461 aatgaagcac acaagtttgt ttgcttttcg tgcatgatat taaatagctt ggcagcaaca
8521 ggactaggat gagtagcagc acgttcctta tatgtagctt tcgacatgat ttatcttcgt
8581 ttcctgcagg tttttgttct gtgcagttgg gttaagaata ctgggcaatt tcatgtttct
8641 tcaacactac atatgcgtat atataccaat ctaagtctgt gctccttcct tcgttcttcc
8701 ttctgttcgg agattaccga atcaaaaaaa tttcaaggaa accgaaatca aaaaaaagaa
8761 taaaaaaaaa atgatgaatt gaaaagcttt atggaccctg ggtacctcgc gacgatactc
8821 ggacgggtgc gccgcgtccg gtcaggtcga ggggtgctca cggcgcttcg cagtccgacg
8881 gtccggcttg agccgcatac gccggcggac attccgcggc atcccccgtc ggggccgccg
8941 agtggcgtag acgcgcaagc tccgaccagt cgcgcggtcg acccgtgcgg tccctgggcc
9001 ctaagacccc ggggcggccg cgcggatctc ttatgtcttt acgatttata gttttcatta
9061 tcaagtatgc ctatattagt atatagcatc tttagatgac agtgttcgaa gtttcacgaa
9121 taaaagataa tattctactt tttgctccca ccgcgtttgc tagcacgagt gaacaccatc
9181 cctcgcctgt gagttgtacc cattcctcta aactgtagac atggtagctt cagcagtgtt
9241 cgttatgtac ggcatcctcc aacaaacagt cggttatagt ttgtcctgct cctctgaatc
9301 gtctccctcg atatttctca tttaattaag aattcactgg ccgtcgtttt acaacgtcgt
9361 gactgggaaa accctggcgt tacccaactt aatcgccttg cagcacatcc ccctttcgcc
9421 agctggcgta atagcgaaga ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg
9481 aatggcgaat ggcgcctgat gcggtatttt ctccttacgc atctgtgcgg tatttcacac
9541 cgcatatggt gcactctcag tacaatctgc tctgatgccg catagttaag ccagccccga
9601 cacccgccaa cacccgctga cgcgccctga cgggcttgtc tgctcccggc atccgcttac
9661 agacaagctg tgaccgtctc cgggagctgc atgtgtcaga ggttttcacc gtcatcaccg
9721 aaacgcgcga gacgaaaggg cctcgtgata cgcctatttt tataggttaa tgtcatgata
9781 ataatggttt cttagacgtc aggtggcact tttcggggaa atgtgcgcgg aacccctatt
9841 tgtttatttt tctaaataca ttcaaatatg tatccgctca tgagacaata accctgataa
9901 atgcttcaat aatattgaaa aaggaagagt atgagtattc aacatttccg tgtcgccctt
9961 attccctttt ttgcggcatt ttgccttcct gtttttgctc acccagaaac gctggtgaaa
10021 gtaaaagatg ctgaagatca gttgggtgca cgagtgggtt acatcgaact ggatctcaac
10081 agcggtaaga tccttgagag ttttcgcccc gaagaacgtt ttccaatgat gagcactttt
10141 aaagttctgc tatgtggcgc ggtattatcc cgtattgacg ccgggcaaga gcaactcggt
10201 cgccgcatac actattctca gaatgacttg gttgagtact caccagtcac agaaaagcat
10261 cttacggatg gcatgacagt aagagaatta tgcagtgctg ccataaccat gagtgataac
10321 actgcggcca acttacttct ga
Plasmid pBF1377 integrates ScGAL2 under the control of the TEF1 promoter at the intergenic region between BSC2 and PMP3. pBF1377 contains about 300 bp of intergenic region (e.g., located at about position 874 to 1172) and about 300 bp of intergenic region (e.g., located at about position 5149 to 5451). pBF1377 also contains the ScTEF1 promoter (e.g., located at about position 2746 to 3150), the ScGAL2 open reading frame (e.g., located at about position 3154 to 4875) and the ScCYC1 terminator (e.g., located at about position 4885 to 5123). In addition, pBF1377 contains the ScURA3 gene (e.g., located at about positions 1391 to 2511) flanked by two R127 repeats (e.g., one located at about positions 1187 to 1387 and another located at about position 2521 to 2721). The whole 5' YDR275-PTEF1-ScGAL2-R127-ScURA3-R127-3' YDR275 cassette is flanked by PacI restriction enzyme sites. The sequence of pBF1377 is presented below as SEQ ID No: 184.
SEQ ID No: 184 - pBF1377 sequence
1 gttgctcttg cccggcgtca atacgggata ataccgcgcc acatagcaga actttaaaag
61 tgctcatcat tggaaaacgt tcttcggggc gaaaactctc aaggatctta ccgctgttga
121 gatccagttc gatgtaaccc actcgtgcac ccaactgatc ttcagcatct tttactttca
181 ccagcgtttc tgggtgagca aaaacaggaa ggcaaaatgc cgcaaaaaag ggaataaggg
241 cgacacggaa atgttgaata ctcatactct tcctttttca atattattga agcatttatc
301 agggttattg tctcatgagc ggatacatat ttgaatgtat ttagaaaaat aaacaaatag
361 gggttccgcg cacatttccc cgaaaagtgc cacctgacgt ctaagaaacc attattatca
421 tgacattaac ctataaaaat aggcgtatca cgaggccctt tcgtctcgcg cgtttcggtg
481 atgacggtga aaacctctga cacatgcagc tcccggagac ggtcacagct tgtctgtaag
541 cggatgccgg gagcagacaa gcccgtcagg gcgcgtcagc gggtgttggc gggtgtcggg
601 gctggcttaa ctatgcggca tcagagcaga ttgtactgag agtgcaccat atgcggtgtg
661 aaataccgca cagatgcgta aggagaaaat accgcatcag gcgccattcg ccattcaggc
721 tgcgcaactg ttgggaaggg cgatcggtgc gggcctcttc gctattacgc cagctggcga
781 aagggggatg tgctgcaagg cgattaagtt gggtaacgcc agggttttcc cagtcacgac
841 gttgtaaaac gacggccagt gaattcttaa ttaattttgg cttggttccc aggtatgcta
901 tatgccttgt acattgtcct acaagattaa gctcattcaa taaacgaaag taactatcaa
961 atgaaataag gaagcagtat taggaaacgg tatcaacatg ttcgggtcac cctcattttt
1021 atttgtttcc tttctttttt tttgtttctc tttctcttaa ggtttcctcc cccgctgtta
1081 attttaacta cgtgtgtatt ttttaataat taattttatc acaaaaaaaa cttctcatct
1141 tttgcctttt atcatttttg tacttttttc ttgcggccgc cccggggtct tagggcccag
1201 ggaccgcacg ggtcgaccgc gcgactggtc ggagcttgcg cgtctacgcc actcggcggc
1261 cccgacgggg gatgccgcgg aatgtccgcc ggcgtatgcg gctcaagccg gaccgtcgga
1321 ctgcgaagcg ccgtgagcac ccctcgacct gaccggacgc ggcgcacccg tccgagtatc
1381 gtcgcgaggt acccagggtc cataaagctt ttcaattcat catttttttt ttattctttt
1441 ttttgatttc ggtttccttg aaattttttt gattcggtaa tctccgaaca gaaggaagaa
1501 cgaaggaagg agcacagact tagattggta tatatacgca tatgtagtgt tgaagaaaca
1561 tgaaattgcc cagtattctt aacccaactg cacagaacaa aaacctgcag gaaacgaaga
1621 taaatcatgt cgaaagctac atataaggaa cgtgctgcta ctcatcctag tcctgttgct
1681 gccaagctat ttaatatcat gcacgaaaag caaacaaact tgtgtgcttc attggatgtt
1741 cgtaccacca aggaattact ggagttagtt gaagcattag gtcccaaaat ttgtttacta
1801 aaaacacatg tggatatctt gactgatttt tccatggagg gcacagttaa gccgctaaag
1861 gcattatccg ccaagtacaa ttttttactc ttcgaagaca gaaaatttgc tgacattggt
1921 aatacagtca aattgcagta ctctgcgggt gtatacagaa tagcagaatg ggcagacatt
1981 acgaatgcac acggtgtggt gggcccaggt attgttagcg gtttgaagca ggcggcagaa
2041 gaagtaacaa aggaacctag aggccttttg atgttagcag aattgtcatg caagggctcc
2101 ctatctactg gagaatatac taagggtact gttgacattg cgaagagcga caaagatttt
2161 gttatcggct ttattgctca aagagacatg ggtggaagag atgaaggtta cgattggttg
2221 attatgacac ccggtgtggg tttagatgac aagggagacg cattgggtca acagtataga
2281 accgtggatg atgtggtctc tacaggatct gacattatta ttgttggaag aggactattt
2341 gcaaagggaa gggatgctaa ggtagagggt gaacgttaca gaaaagcagg ctgggaagca
2401 tatttgagaa gatgcggcca gcaaaactaa aaaactgtat tataagtaaa tgcatgtata
2461 ctaaactcac aaattagagc ttcaatttaa ttatatcagt tattacccgc ggccggccgc
2521 gtcttagggc ccagggaccg cacgggtcga ccgcgcgact ggtcggagct tgcgcgtcta
2581 cgccactcgg cggccccgac gggggatgcc gcggaatgtc cgccggcgta tgcggctcaa
2641 gccggaccgt cggactgcga agcgccgtga gcacccctcg acctgaccgg acgcggcgca
2701 cccgtccgag tatcgtcgcg acccgggtta atgcggccgc gagctcatag cttcaaaatg
2761 tttctactcc ttttttactc ttccagattt tctcggactc cgcgcatcgc cgtaccactt
2821 caaaacaccc aagcacagca tactaaattt cccctctttc ttcctctagg gtgtcgttaa
2881 ttacccgtac taaaggtttg gaaaagaaaa aagagaccgc ctcgtttctt tttcttcgtc
2941 gaaaaaggca ataaaaattt ttatcacgtt tctttttctt gaaaattttt tttttgattt
3001 ttttctcttt cgatgacctc ccattgatat ttaagttaat aaacggtctt caatttctca
3061 agtttcagtt tcatttttct tgttctatta caactttttt tacttcttgc tcattagaaa
3121 gaaagcatag caatctaatc taagttttct agaatggcag ttgaggagaa caatatgcct
3181 gttgtttcac agcaacccca agctggtgaa gacgtgatct cttcactcag taaagattcc
3241 catttaagcg cacaatctca aaagtattct aatgatgaat tgaaagccgg tgagtcaggg
3301 tctgaaggct cccaaagtgt tcctatagag atacccaaga agcccatgtc tgaatatgtt
3361 accgtttcct tgctttgttt gtgtgttgcc ttcggcggct tcatgtttgg ctgggatacc
3421 ggtactattt ctgggtttgt tgtccaaaca gactttttga gaaggtttgg tatgaaacat
3481 aaggatggta cccactattt gtcaaacgtc agaacaggtt taatcgtcgc cattttcaat
3541 attggctgtg cctttggtgg tattatactt tccaaaggtg gagatatgta tggccgtaaa
3601 aagggtcttt cgattgtcgt ctcggtttat atagttggta ttatcattca aattgcctct
3661 atcaacaagt ggtaccaata tttcattggt agaatcatat ctggtttggg tgtcggcggc
3721 atcgccgtct tatgtcctat gttgatctct gaaattgctc caaagcactt gagaggcaca
3781 ctagtttctt gttatcagct gatgattact gcaggtatct ttttgggcta ctgtactaat
3841 tacggtacaa agagctattc gaactcagtt caatggagag ttccattagg gctatgtttc
3901 gcttggtcat tatttatgat tggcgctttg acgttagttc ctgaatcccc acgttattta
3961 tgtgaggtga ataaggtaga agacgccaag cgttccattg ctaagtctaa caaggtgtca
4021 ccagaggatc ctgccgtcca ggcagagtta gatctgatca tggccggtat agaagctgaa
4081 aaactggctg gcaatgcgtc ctggggggaa ttattttcca ccaagaccaa agtatttcaa
4141 cgtttgttga tgggtgtgtt tgttcaaatg ttccaacaat taaccggtaa caattatttt
4201 ttctactacg gtaccgttat tttcaagtca gttggcctgg atgattcctt tgaaacatcc
4261 attgtcattg gtgtagtcaa ctttgcctcc actttcttta gtttgtggac tgtcgaaaac
4321 ttggggcgtc gtaaatgttt acttttgggc gctgccacta tgatggcttg tatggtcatc
4381 tacgcctctg ttggtgttac tagattatat cctcacggta aaagccagcc atcttctaaa
4441 ggtgccggta actgtatgat tgtctttacc tgtttttata ttttctgtta tgccacaacc
4501 tgggcgccag ttgcctgggt catcacagca gaatcattcc cactgagagt caagtcgaaa
4561 tgtatggcgt tggcctctgc ttccaattgg gtatgggggt tcttgattgc atttttcacc
4621 ccattcatca catctgccat taacttctac tacggttatg tcttcatggg ctgtttggtt
4681 gccatgtttt tttatgtctt tttctttgtt ccagaaacta aaggcctatc gttagaagaa
4741 attcaagaat tatgggaaga aggtgtttta ccttggaaat ctgaaggctg gattccttca
4801 tccagaagag gtaataatta cgatttagag gatttacaac atgacgacaa accgtggtac
4861 aaggccatgc tagaataact cgagtcatgt aattagttat gtcacgctta cattcacgcc
4921 ctccccccac atccgctcta accgaaaagg aaggagttag acaacctgaa gtctaggtcc
4981 ctatttattt ttttatagtt atgttagtat taagaacgtt atttatattt caaatttttc
5041 ttttttttct gtacagacgc gtgtacgcat gtaacattat actgaaaacc ttgcttgaga
5101 aggttttggg acgctcgaag gctcgaaggc tttaatttgc ggccggcctt aaaatacaca
5161 tgatatagac gcttaactat ctgtcctgca agtttgttta tctatataaa tatagccagt
5221 atcatccgtt tacctgtgtt gtagagttcc aaaagataga acagcctttt atcagatggg
5281 tggcagaccc ttacccatat tatctctctt attttgtttt tatttctctc tagccttata
5341 attattttta gtttcctttc tgtagcaata agtgttctta gcagtggcgt tgcgttgcct
5401 tgaccttgat tatcaattta agcagctttc tgctcacaag ccacttcatt aattaaaagc
5461 ttggcgtaat catggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca
5521 cacaacatac gagccggaag cataaagtgt aaagcctggg gtgcctaatg agtgagctaa
5581 ctcacattaa ttgcgttgcg ctcactgccc gctttccagt cgggaaacct gtcgtgccag
5641 ctgcattaat gaatcggcca acgcgcgggg agaggcggtt tgcgtattgg gcgctcttcc
5701 gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct
5761 cactcaaagg cggtaatacg gttatccaca gaatcagggg ataacgcagg aaagaacatg
5821 tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc
5881 cataggctcc gcccccctga cgagcatcac aaaaatcgac gctcaagtca gaggtggcga
5941 aacccgacag gactataaag ataccaggcg tttccccctg gaagctccct cgtgcgctct
6001 cctgttccga ccctgccgct taccggatac ctgtccgcct ttctcccttc gggaagcgtg
6061 gcgctttctc atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag
6121 ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gcgccttatc cggtaactat
6181 cgtcttgagt ccaacccggt aagacacgac ttatcgccac tggcagcagc cactggtaac
6241 aggattagca gagcgaggta tgtaggcggt gctacagagt tcttgaagtg gtggcctaac
6301 tacggctaca ctagaaggac agtatttggt atctgcgctc tgctgaagcc agttaccttc
6361 ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag cggtggtttt
6421 tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc
6481 ttttctacgg ggtctgacgc tcagtggaac gaaaactcac gttaagggat tttggtcatg
6541 agattatcaa aaaggatctt cacctagatc cttttaaatt aaaaatgaag ttttaaatca
6601 atctaaagta tatatgagta aacttggtct gacagttacc aatgcttaat cagtgaggca
6661 cctatctcag cgatctgtct atttcgttca tccatagttg cctgactccc cgtcgtgtag
6721 ataactacga tacgggaggg cttaccatct ggccccagtg ctgcaatgat accgcgagac
6781 ccacgctcac cggctccaga tttatcagca ataaaccagc cagccggaag ggccgagcgc
6841 agaagtggtc ctgcaacttt atccgcctcc atccagtcta ttaattgttg ccgggaagct
6901 agagtaagta gttcgccagt taatagtttg cgcaacgttg ttgccattgc tacaggcatc
6961 gtggtgtcac gctcgtcgtt tggtatggct tcattcagct ccggttccca acgatcaagg
7021 cgagttacat gatcccccat gttgtgcaaa aaagcggtta gctccttcgg tcctccgatc
7081 gttgtcagaa gtaagttggc cgcagtgtta tcactcatgg ttatggcagc actgcataat
7141 tctcttactg tcatgccatc cgtaagatgc ttttctgtga ctggtgagta ctcaaccaag
7201 tcattctgag aatagtgtat gcggcgaccg a
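The feature coordinates quoted for pBF1377 are 1-based positions into the numbered listing above. The following is a minimal sketch, assuming the listing is available as plain text in the layout used here (a position label followed by blocks of ten bases), of how the blocks might be joined and a restriction site located. The helper names `parse_listing`, `find_sites` and `feature` are hypothetical and are not part of the disclosure; TTAATTAA is the standard PacI recognition sequence.

```python
import re

def parse_listing(text):
    """Join numbered listing lines into (first_position, sequence).

    Digit tokens are treated as position labels; only the first one is
    kept as the start coordinate. Tokens that are not pure a/c/g/t/n
    are ignored.
    """
    start = None
    blocks = []
    for token in text.split():
        if token.isdigit():
            if start is None:
                start = int(token)
            continue
        if re.fullmatch(r"[acgtn]+", token):
            blocks.append(token)
    return (start if start is not None else 1), "".join(blocks)

def find_sites(seq, site, start=1):
    """Return the 1-based position of every occurrence of `site`."""
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(start + i)
        i = seq.find(site, i + 1)
    return positions

def feature(seq, first, last, start=1):
    """Return the bases from `first` to `last`, inclusive and 1-based."""
    return seq[first - start:last - start + 1]

# One line copied from the pBF1377 listing (label 841).
line = "841 gttgtaaaac gacggccagt gaattcttaa ttaattttgg cttggttccc aggtatgcta"
start, seq = parse_listing(line)
pac_sites = find_sites(seq, "ttaattaa", start)
```

On this single line the PacI site is found a few bases upstream of the position quoted for the start of the first intergenic region (about 874), which is consistent with the statement that the cassette is flanked by PacI sites.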
Plasmid pBF1386 integrates ScXKS1 under the control of the TDH3 promoter and ScRKI1 under the control of the TEF1 promoter at the GRE3 gene. pBF1386 was constructed using standard molecular biology techniques. pBF1386 contains about 500 bp of the 5' region of ScGRE3 (e.g., located at about position 1669 to 2154) and about 600 bp of the 3' flanking region of GRE3 (e.g., located at about position 7969 to 8573). pBF1386 also contains the ScTEF1 promoter (e.g., located at about position 2160 to 2629), the ScRKI1 open reading frame (e.g., located at about position 2642 to 3415), the ScADH1 terminator (e.g., located at about position 3249 to 3652), the TDH3 promoter (e.g., located at about position 3669 to 4301), the ScXKS1 open reading frame (e.g., located at about position 4341 to 6140), and the CYC1 terminator (e.g., located at about position 6150 to 6395). In addition, pBF1386 contains the ScURA3 gene (e.g., located at about positions 6630 to 7750) flanked by two R127 repeats (e.g., one located at about positions 6420 to 6620 and another located at about position 7754 to 7954). The whole 5' GRE3-PTEF1-ScRKI1-PTDH3-ScXKS1-R127-ScURA3-R127-3' GRE3 cassette is flanked by PacI restriction enzyme sites. The sequence of pBF1386 is presented below as SEQ ID No: 185.
SEQ ID No: 185 - pBF1386 sequence
1 tgataacact gcggccaact tacttctgac aacgatcgga ggaccgaagg agctaaccgc
61 ttttttgcac aacatggggg atcatgtaac tcgccttgat cgttgggaac cggagctgaa
121 tgaagccata ccaaacgacg agcgtgacac cacgatgcct gtagcaatgg caacaacgtt
181 gcgcaaacta ttaactggcg aactacttac tctagcttcc cggcaacaat taatagactg
241 gatggaggcg gataaagttg caggaccact tctgcgctcg gcccttccgg ctggctggtt
301 tattgctgat aaatctggag ccggtgagcg tgggtctcgc ggtatcattg cagcactggg
361 gccagatggt aagccctccc gtatcgtagt tatctacacg acggggagtc aggcaactat
421 ggatgaacga aatagacaga tcgctgagat aggtgcctca ctgattaagc attggtaact
481 gtcagaccaa gtttactcat atatacttta gattgattta aaacttcatt tttaatttaa
541 aaggatctag gtgaagatcc tttttgataa tctcatgacc aaaatccctt aacgtgagtt
601 ttcgttccac tgagcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt
661 ttttctgcgc gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg
721 tttgccggat caagagctac caactctttt tccgaaggta actggcttca gcagagcgca
781 gataccaaat actgttcttc tagtgtagcc gtagttaggc caccacttca agaactctgt
841 agcaccgcct acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga
901 taagtcgtgt cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc
961 gggctgaacg gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact
1021 gagataccta cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga
1081 caggtatccg gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg
1141 aaacgcctgg tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt
1201 tttgtgatgc tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt
1261 acggttcctg gccttttgct ggccttttgc tcacatgttc tttcctgcgt tatcccctga
1321 ttctgtggat aaccgtatta ccgcctttga gtgagctgat accgctcgcc gcagccgaac
1381 gaccgagcgc agcgagtcag tgagcgagga agcggaagag cgcccaatac gcaaaccgcc
1441 tctccccgcg cgttggccga ttcattaatg cagctggcac gacaggtttc ccgactggaa
1501 agcgggcagt gagcgcaacg caattaatgt gagttagctc actcattagg caccccaggc
1561 tttacacttt atgcttccgg ctcgtatgtt gtgtggaatt gtgagcggat aacaatttca
1621 cacaggaaac agctatgacc atgattacgc caagcttcga ttaattaacc cctagtcggc
1681 ttagggtgct ggaaaattga caaaaaagtc tgtgcgaatc aaatttatga agctatcaaa
1741 ttaggctacc gtttattcga tggtgcttgc gactacggca acgaaaagga agttggtgaa
1801 ggtatcagga aagccatctc cgaaggtctt gtttctagaa aggatatatt tgttgtttca
1861 aagttatgga acaattttca ccatcctgat catgtaaaat tagctttaaa gaagacctta
1921 agcgatatgg gacttgatta tttagacctg tattatattc acttcccaat cgccttcaaa
1981 tatgttccat ttgaagagaa ataccctcca ggattctata cgggcgcaga tgacgagaag
2041 aaaggtcaca tcaccgaagc acatgtacca atcatagata cgtaccgggc tctggaagaa
2101 tgtgttgatg aaggcttgat taagtctatt ggtgtttcca actttcaggg aggccggccc
2161 tagcaccgcg aatccttaca tcacacccaa tcccccacaa gtgatccccc acacaccata
2221 gcttcaaaat gtttctactc cttttttact cttccagatt ttctcggact ccgcgcatcg
2281 ccgtaccact tcaaaacacc caagcacagc atactaaatt tcccctcttt cttcctctag
2341 ggtgtcgtta attacccgta ctaaaggttt ggaaaagaaa aaagagaccg cctcgtttct
2401 ttttcttcgt cgaaaaaggc aataaaaatt tttatcacgt ttctttttct tgaaaatttt
2461 tttttttgat ttttttctct ttcgatgacc tcccattgat atttaagtta ataaacggtc ttcaatttct caagtttcag tttcattttt cttgttctat tacaactttt tttacttctt
gctcattaga aagaaagcat agcaatctaa tctaagtttt aattacaaaa ctagtaaaaa aatggctgcc ggtgtcccaa aaattgatgc gttagaatct ttgggcaatc ctttggagga tgccaagaga gctgcagcat acagagcagt tgatgaaaat ttaaaatttg atgatcacaa aattattgga attggtagtg gtagcacagt ggtttatgtt gccgaaagaa ttggacaata tttgcatgac cctaaatttt atgaagtagc gtctaaattc atttgcattc caacaggatt ccaatcaaga aacttgattt tggataacaa gttgcaatta ggctccattg aacagtatcc tcgcattgat atagcgtttg acggtgctga tgaagtggat gagaatttac aattgattaa aggtggtggt gcttgtctat ttcaagaaaa attggttagt actagcgcta aaaccttcat tgtcgttgct gattcaagaa aaaagtcacc aaaacattta ggtaagaact ggaggcaagg tgttcccatt gaaattgtac cttcctcata cgtgagggtc aagaatgatc tattagaaca attgcatgct gaaaaagttg acatcagaca aggaggttct gctaaagcag gtcctgttgt aactgacaat aataacttca ttatcgatgc ggatttcggt gaaatttccg atccaagaaa attgcataga gaaatcaaac tgttagtggg cgtggtggaa acaggtttat tcatcgacaa cgcttcaaaa gcctacttcg gtaattctga cggtagtgtt gaagttaccg aaaagtgata gctcgagtaa taagcgaatt tcttatgatt tatgattttt attattaaat aagttataaa aaaaataagt gtatacaaat tttaaagtga ctcttaggtt ttaaaacgaa aattcttatt cttgagtaac tctttcctgt aggtcaggtt gctttctcag gtatagcatg aggtcgctct tattgaccac acctctaccg gcatgccgag caaatgcctg caaatcgctc ccaggagctc ggccggccgt ttatcattat caatactcgc catttcaaag aatacgtaaa taattaatag tagtgatttt cctaacttta tttagtcaaa aaattagcct tttaattctg ctgtaacccg tacatgccca aaataggggg cgggttacac agaatatata acatcgtagg tgtctgggtg aacagtttat tcctggcatc cactaaatat aatggagccc gctttttaag ctggcatcca gaaaaaaaaa gaatcccagc accaaaatat tgttttcttc accaaccatc agttcatagg tccattctct tagcgcaact acagagaaca ggggcacaaa caggcaaaaa acgggcacaa cctcaatgga gtgatgcaac ctgcctggag taaatgatga cacaaggcaa ttgacccacg catgtatcta tctcattttc ttacaccttc tattaccttc tgctctctct gatttggaaa
aagctgaaaa aaaaggttga aaccagttcc ctgaaattat tcccctactt gactaataag tatataaaga cggtaggtat tgattgtaat tctgtaaatc tatttcttaa acttcttaaa ttctactttt atagttagtc ttttttttag ttttaaaaca ccagaactta gtttcgacgg
attctagaac tagtaaaaaa atgttgtgtt cagtaattca gagacagaca agagaggttt ccaacacaat gtctttagac tcatactatc ttgggtttga tctttcgacc caacaactga aatgtctcgc cattaaccag gacctaaaaa ttgtccattc agaaacagtg gaatttgaaa aggatcttcc gcattatcac acaaagaagg gtgtctatat acacggcgac actatcgaat gtcccgtagc catgtggtta gaggctctag atctggttct ctcgaaatat cgcgaggcta aatttccatt gaacaaagtt atggccgtct cagggtcctg ccagcagcac gggtctgtct actggtcctc ccaagccgaa tctctgttag agcaattgaa taagaaaccg gaaaaagatt tattgcacta cgtgagttct gtagcatttg caaggcaaac cgcccccaat tggcaagacc acagtactgc aaagcaatgt caagagtttg aagagtgcat aggtgggcct gaaaaaatgg ctcaattaac agggtccaga gcccatttta gatttactgg tcctcaaatt ctgaaaattg cacaattaga accagaagct tacgaaaaaa caaagaccat ttctttagtg tctaattttt tgacttctat cttagtgggc catcttgttg aattagagga ggcagatgcc tgtggtatga acctttatga tatacgtgaa agaaaattca gtgatgagct actacatcta attgatagtt cttctaagga taaaactatc agacaaaaat taatgagagc acccatgaaa aatttgatag cgggtaccat ctgtaaatat tttattgaga agtacggttt caatacaaac tgcaaggtct ctcccatgac tggggataat ttagccacta tatgttcttt acccctgcgg aagaatgacg ttctcgtttc cctaggaaca agtactacag ttcttctggt caccgataag tatcacccct ctccgaacta tcatcttttc attcatccaa ctctgccaaa ccattatatg ggtatgattt gttattgtaa tggttctttg gcaagggaga ggataagaga cgagttaaac aaagaacggg aaaataatta tgagaagact aacgattgga ctctttttaa tcaagctgtg ctagatgact cagaaagtag tgaaaatgaa ttaggtgtat attttcctct gggggagatc gttcctagcg 5581 taaaagccat aaacaaaagg gttatcttca atccaaaaac gggtatgatt gaaagagagg 5641 tggccaagtt caaagacaag aggcacgatg ccaaaaatat tgtagaatca caggctttaa 5701 gttgcagggt aagaatatct cccctgcttt cggattcaaa cgcaagctca caacagagac 5761 tgaacgaaga tacaatcgtg aagtttgatt acgatgaatc tccgctgcgg gactacctaa 5821 ataaaaggcc agaaaggact ttttttgtag gtggggcttc taaaaacgat gctattgtga 5881 agaagtttgc tcaagtcatt ggtgctacaa agggtaattt taggctagaa acaccaaact 5941 catgtgccct tggtggttgt tataaggcca tgtggtcatt gttatatgac tctaataaaa 6001 ttgcagttcc ttttgataaa tttctgaatg acaattttcc atggcatgta atggaaagca 6061 tatccgatgt ggataatgaa aattgggatc 
gctataattc caagattgtc cccttaagcg 6121 aactggaaaa gactctcatc taactcgagt catgtaatta gttatgtcac gcttacattc 6181 acgccctccc cccacatccg ctctaaccga aaaggaagga gttagacaac ctgaagtcta 6241 ggtccctatt tattttttta tagttatgtt agtattaaga acgttattta tatttcaaat
6301 ttttcttttt tttctgtaca gacgcgtgta cgcatgtaac attatactga aaaccttgct
6361 tgagaaggtt ttgggacgct cgaaggcttt aatttcggcc gcggccgcat taacccgggt 6421 cgcgacgata ctcggacggg tgcgccgcgt ccggtcaggt cgaggggtgc tcacggcgct 6481 tcgcagtccg acggtccggc ttgagccgca tacgccggcg gacattccgc ggcatccccc 6541 gtcggggccg ccgagtggcg tagacgcgca agctccgacc agtcgcgcgg tcgacccgtg 6601 cggtccctgg gccctaagac gcggccggcc gcgggtaata actgatataa ttaaattgaa 6661 gctctaattt gtgagtttag tatacatgca tttacttata atacagtttt ttagttttgc
6721 tggccgcatc ttctcaaata tgcttcccag cctgcttttc tgtaacgttc accctctacc 6781 ttagcatccc ttccctttgc aaatagtcct cttccaacaa taataatgtc agatcctgta 6841 gagaccacat catccacggt tctatactgt tgacccaatg cgtctccctt gtcatctaaa 6901 cccacaccgg gtgtcataat caaccaatcg taaccttcat ctcttccacc catgtctctt 6961 tgagcaataa agccgataac aaaatctttg tcgctcttcg caatgtcaac agtaccctta 7021 gtatattctc cagtagatag ggagcccttg catgacaatt ctgctaacat caaaaggcct 7081 ctaggttcct ttgttacttc ttctgccgcc tgcttcaaac cgctaacaat acctgggccc 7141 accacaccgt gtgcattcgt aatgtctgcc cattctgcta ttctgtatac acccgcagag 7201 tactgcaatt tgactgtatt accaatgtca gcaaattttc tgtcttcgaa gagtaaaaaa 7261 ttgtacttgg cggataatgc ctttagcggc ttaactgtgc cctccatgga aaaatcagtc 7321 aagatatcca catgtgtttt tagtaaacaa attttgggac ctaatgcttc aactaactcc 7381 agtaattcct tggtggtacg aacatccaat gaagcacaca agtttgtttg cttttcgtgc 7441 atgatattaa atagcttggc agcaacagga ctaggatgag tagcagcacg ttccttatat 7501 gtagctttcg acatgattta tcttcgtttc ctgcaggttt ttgttctgtg cagttgggtt
7561 aagaatactg ggcaatttca tgtttcttca acactacata tgcgtatata taccaatcta 7621 agtctgtgct ccttccttcg ttcttccttc tgttcggaga ttaccgaatc aaaaaaattt
7681 caaggaaacc gaaatcaaaa aaaagaataa aaaaaaaatg atgaattgaa aagctttatg 7741 gaccctgggt acctcgcgac gatactcgga cgggtgcgcc gcgtccggtc aggtcgaggg 7801 gtgctcacgg cgcttcgcag tccgacggtc cggcttgagc cgcatacgcc ggcggacatt 7861 ccgcggcatc ccccgtcggg gccgccgagt ggcgtagacg cgcaagctcc gaccagtcgc 7921 gcggtcgacc cgtgcggtcc ctgggcccta agaccccggg gcggccgcgg atggtaaatt 7981 ccccactttt gcctgatcca gccagtaaaa tccatactca acgacgatat gaacaaattt 8041 ccctcattcc gatgctgtat atgtgtataa atttttacat gctcttctgt ttagacacag
8101 aacagcttta aataaaatgt tggatatact ttttctgcct gtggtgtcat ccacgctttt
8161 aattcatctc ttgtatggtt gacaatttgg ctatttttta acagaaccca acggtaattg
8221 aaattaaaag ggaaacgagt gggggcgatg agtgagtgat actaaaatag acaccaagag
8281 agcaaagcgg tcccaaaatc atttgagtaa ccggatatct atcgggatat taatagcagc
8341 ttccatttca actaaaacaa cagcaagata tgagcgacaa gatatccttt ctacctcccg
8401 aacccatcca actacttgac gaagactcca cggagcctga actcgacatt gactcacaac
8461 aagaaaatga gggacccatc agtgcgtcaa acagcaatga tagcactagc catagtaatg
8521 attgcggtgc cacaattacc agaacaagac ctagacgaag cagttctatc attaattaac
8581 gagaattcac tggccgtcgt tttacaacgt cgtgactggg aaaaccctgg cgttacccaa
8641 cttaatcgcc ttgcagcaca tccccctttc gccagctggc gtaatagcga agaggcccgc
8701 accgatcgcc cttcccaaca gttgcgcagc ctgaatggcg aatggcgcct gatgcggtat
8761 tttctcctta cgcatctgtg cggtatttca caccgcatat ggtgcactct cagtacaatc
8821 tgctctgatg ccgcatagtt aagccagccc cgacacccgc caacacccgc tgacgcgccc
8881 tgacgggctt gtctgctccc ggcatccgct tacagacaag ctgtgaccgt ctccgggagc
8941 tgcatgtgtc agaggttttc accgtcatca ccgaaacgcg cgagacgaaa gggcctcgtg
9001 atacgcctat ttttataggt taatgtcatg ataataatgg tttcttagac gtcaggtggc
9061 acttttcggg gaaatgtgcg cggaacccct atttgtttat ttttctaaat acattcaaat
9121 atgtatccgc tcatgagaca ataaccctga taaatgcttc aataatattg aaaaaggaag
9181 agtatgagta ttcaacattt ccgtgtcgcc cttattccct tttttgcggc attttgcctt
9241 cctgtttttg ctcacccaga aacgctggtg aaagtaaaag atgctgaaga tcagttgggt
9301 gcacgagtgg gttacatcga actggatctc aacagcggta agatccttga gagttttcgc
9361 cccgaagaac gttttccaat gatgagcact tttaaagttc tgctatgtgg cgcggtatta
9421 tcccgtattg acgccgggca agagcaactc ggtcgccgca tacactattc tcagaatgac
9481 ttggttgagt actcaccagt cacagaaaag catcttacgg atggcatgac agtaagagaa
9541 ttatgcagtg ctgccataac catgag
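The same elements can appear in opposite orientations in different listings, so comparing them requires taking the reverse complement of one strand. The sketch below is illustrative only: the function name `reverse_complement` is a hypothetical helper, and the two 20-base fragments are copied from the listings above (positions 6601 to 6620 of SEQ ID No: 185 and positions 1187 to 1206 of SEQ ID No: 184), both of which fall at the edge of an R127 repeat as annotated in the text.

```python
# Complement table for lower-case DNA as used in the listings.
_COMPLEMENT = str.maketrans("acgtn", "tgcan")

def reverse_complement(seq):
    """Return the reverse complement of a lower-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

# End of the first R127 repeat in pBF1386 (positions 6601 to 6620).
frag_1386 = "cggtccctgggccctaagac"
# Start of the first R127 repeat in pBF1377 (positions 1187 to 1206).
frag_1377 = "gtcttagggcccagggaccg"

same_element = reverse_complement(frag_1386) == frag_1377
```

A match on such a short fragment only suggests that the two repeats are the same element in opposite orientation; a full comparison would use the complete annotated spans.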
Plasmid pBF1373 expresses RfCo* XI under control of the TDH3 promoter in a 2μ URA3 plasmid based on p426-GPD. pBF1373 contains the TDH3 promoter (e.g., located at about position 3622 to 4276) and the RfCo* XI open reading frame (e.g., located at about position 4284 to 5582). In addition, pBF1373 contains a URA3 gene (e.g., located at about position 6587 to 7624) and the 2μ origin of replication (e.g., located at about position 55 to 1411). The sequence of pBF1373 is presented below as SEQ ID No: 186.
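The approximate coordinates given for pBF1373 can be sanity-checked with simple arithmetic: an inclusive span from position a to position b covers b - a + 1 bases, and an open reading frame should span a whole number of codons. The sketch below uses only the positions quoted in the paragraph above; the dictionary keys and the helper names `span_length` and `codon_count` are hypothetical labels chosen for illustration.

```python
# Inclusive, 1-based coordinates as quoted for pBF1373.
# The origin end coordinate is read here as 1411.
FEATURES = {
    "TDH3 promoter": (3622, 4276),
    "XI open reading frame": (4284, 5582),
    "URA3 gene": (6587, 7624),
    "2 micron origin": (55, 1411),
}

def span_length(first, last):
    """Number of bases in an inclusive 1-based span."""
    return last - first + 1

def codon_count(first, last):
    """Number of codons in a span, including any stop codon."""
    length = span_length(first, last)
    if length % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return length // 3

orf_first, orf_last = FEATURES["XI open reading frame"]
orf_bases = span_length(orf_first, orf_last)
orf_codons = codon_count(orf_first, orf_last)
```

The same check applies to the open reading frames quoted for pBF1377 (3154 to 4875) and pBF1386 (2642 to 3415 and 4341 to 6140), each of which also spans a multiple of three bases.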
SEQ ID No: 186 - pBF1373 sequence
1 gacgaaaggg cctcgtgata cgcctatttt tataggttaa tgtcatgata ataatggttt
61 cttagtatga tccaatatca aaggaaatga tagcattgaa ggatgagact aatccaattg
121 aggagtggca gcatatagaa cagctaaagg gtagtgctga aggaagcata cgataccccg
181 catggaatgg gataatatca caggaggtac tagactacct ttcatcctac ataaatagac
241 gcatataagt acgcatttaa gcataaacac gcactatgcc gttcttctca tgtatatata
301 tatacaggca acacgcagat ataggtgcga cgtgaacagt gagctgtatg tgcgcagctc
361 gcgttgcatt ttcggaagcg ctcgttttcg gaaacgcttt gaagttccta ttccgaagtt
421 cctattctct agaaagtata ggaacttcag agcgcttttg aaaaccaaaa gcgctctgaa
481 gacgcacttt caaaaaacca aaaacgcacc ggactgtaac gagctactaa aatattgcga
541 ataccgcttc cacaaacatt gctcaaaagt atctctttgc tatatatctc tgtgctatat
601 ccctatataa cctacccatc cacctttcgc tccttgaact tgcatctaaa ctcgacctct
661 acatttttta tgtttatctc tagtattact ctttagacaa aaaaattgta gtaagaacta
721 ttcatagagt gaatcgaaaa caatacgaaa atgtaaacat ttcctatacg tagtatatag
781 agacaaaata gaagaaaccg ttcataattt tctgaccaat gaagaatcat caacgctatc
841 actttctgtt cacaaagtat gcgcaatcca catcggtata gaatataatc ggggatgcct
901 ttatcttgaa aaaatgcacc cgcagcttcg ctagtaatca gtaaacgcgg gaagtggagt
961 caggcttttt ttatggaaga gaaaatagac accaaagtag ccttcttcta accttaacgg
1021 acctacagtg caaaaagtta tcaagagact gcattataga gcgcacaaag gagaaaaaaa
1081 gtaatctaag atgctttgtt agaaaaatag cgctctcggg atgcattttt gtagaacaaa
1141 aaagaagtat agattctttg ttggtaaaat agcgctctcg cgttgcattt ctgttctgta
1201 aaaatgcagc tcagattctt tgtttgaaaa attagcgctc tcgcgttgca tttttgtttt
1261 acaaaaatga agcacagatt cttcgttggt aaaatagcgc tttcgcgttg catttctgtt
1321 ctgtaaaaat gcagctcaga ttctttgttt gaaaaattag cgctctcgcg ttgcattttt
1381 gttctacaaa atgaagcaca gatgcttcgt tcaggtggca cttttcgggg aaatgtgcgc
1441 ggaaccccta tttgtttatt tttctaaata cattcaaata tgtatccgct catgagacaa
1501 taaccctgat aaatgcttca ataatattga aaaaggaaga gtatgagtat tcaacatttc
1561 cgtgtcgccc ttattccctt ttttgcggca ttttgccttc ctgtttttgc tcacccagaa
1621 acgctggtga aagtaaaaga tgctgaagat cagttgggtg cacgagtggg ttacatcgaa 1681 ctggatctca acagcggtaa gatccttgag agttttcgcc ccgaagaacg ttttccaatg 1741 atgagcactt ttaaagttct gctatgtggc gcggtattat cccgtattga cgccgggcaa 1801 gagcaactcg gtcgccgcat acactattct cagaatgact tggttgagta ctcaccagtc 1861 acagaaaagc atcttacgga tggcatgaca gtaagagaat tatgcagtgc tgccataacc 1921 atgagtgata acactgcggc caacttactt ctgacaacga tcggaggacc gaaggagcta 1981 accgcttttt tgcacaacat gggggatcat gtaactcgcc ttgatcgttg ggaaccggag 2041 ctgaatgaag ccataccaaa cgacgagcgt gacaccacga tgcctgtagc aatggcaaca 2101 acgttgcgca aactattaac tggcgaacta cttactctag cttcccggca acaattaata 2161 gactggatgg aggcggataa agttgcagga ccacttctgc gctcggccct tccggctggc 2221 tggtttattg ctgataaatc tggagccggt gagcgtgggt ctcgcggtat cattgcagca 2281 ctggggccag atggtaagcc ctcccgtatc gtagttatct acacgacggg gagtcaggca 2341 actatggatg aacgaaatag acagatcgct gagataggtg cctcactgat taagcattgg 2401 taactgtcag accaagttta ctcatatata ctttagattg atttaaaact tcatttttaa
2461 tttaaaagga tctaggtgaa gatccttttt gataatctca tgaccaaaat cccttaacgt
2521 gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 2581 cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg 2641 gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga 2701 gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac 2761 tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt 2821 ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag 2881 cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc 2941 gaactgagat acctacagcg tgagctatga gaaagcgcca cgcttcccga agggagaaag 3001 gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca 3061 gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt 3121 cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc 3181 tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc
3241 cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 3301 cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa 3361 ccgcctctcc ccgcgcgttg gccgattcat taatgcagct ggcacgacag gtttcccgac 3421 tggaaagcgg gcagtgagcg caacgcaatt aatgtgagtt acctcactca ttaggcaccc 3481 caggctttac actttatgct tccggctcct atgttgtgtg gaattgtgag cggataacaa
3541 tttcacacag gaaacagcta tgaccatgat tacgccaagc gcgcaattaa ccctcactaa 3601 agggaacaaa agctggagct cagtttatca ttatcaatac tcgccatttc aaagaatacg 3661 taaataatta atagtagtga ttttcctaac tttatttagt caaaaaatta gccttttaat
3721 tctgctgtaa cccgtacatg cccaaaatag ggggcgggtt acacagaata tataacatcg 3781 taggtgtctg ggtgaacagt ttattcctgg catccactaa atataatgga gcccgctttt
3841 taagctggca tccagaaaaa aaaagaatcc cagcaccaaa atattgtttt cttcaccaac 3901 catcagttca taggtccatt ctcttagcgc aactacagag aacaggggca caaacaggca 3961 aaaaacgggc acaacctcaa tggagtgatg caacctgcct ggagtaaatg atgacacaag 4021 gcaattgacc cacgcatgta tctatctcat tttcttacac cttctattac cttctgctct
4081 ctctgatttg gaaaaagctg aaaaaaaagg ttgaaaccag ttccctgaaa ttattcccct 4141 acttgactaa taagtatata aagacggtag gtattgattg taattctgta aatctatttc
4201 ttaaacttct taaattctac ttttatagtt agtctttttt ttagttttaa aacaccagaa
4261 cttagtttcg acggattcta gaactagtaa aaaaatggct aaggaatact tcccacaaat 4321 ccaaaaaatt caataccaag gtccaaaatc taccgaccca ttatccttca aatactacaa 4381 cccagaagaa gttatcaacg gtaagaccat gagagaacac ttgaagttcg ccttgtcctg 4441 gtggcacacc atgggtggtg atggtactga catgttcggt tgtggtacta ccgacaagac 4501 ctggggtcaa tccgacccag ctgctagagc taaggctaag gttgacgctg ctttcgaaat 4561 tatggacaag ttgtctatcg actactactg tttccacgac agagacttgt ctccagaata 4621 cggttctttg aaggctacca acgatcaatt ggacatcgtc accgactaca tcaaggaaaa 4681 gcaaggtgac aagttcaagt gtttgtgggg tactgctaag tgtttcgacc atccaagatt 4741 catgcacggt gctggtacct ctccatctgc tgatgtcttc gctttctccg ctgctcaaat 4801 caagaaggct ttggaatcca ctgtcaagtt gggtgctaac ggttacgtct tctggggtgg 4861 tagagaaggt tacgaaactt tgttaaacac taacatgggt ttggaattgg ataacatggc 4921 tagattgatg aagatggccg tcgaatacgg tagatctatc ggttttaagg gtgacttcta 4981 catcgaacca aaaccaaagg aaccaactaa gcaccaatac gatttcgaca ctgccactgt 5041 cttgggtttc ttgagaaaat acggtttgga caaggacttc aagatgaaca ttgaagctaa 5101 ccacgccacc ttggcccaac atacttttca acacgaattg agagtcgcca gagataacgg 5161 tgtcttcggt tctatcgacg ctaaccaagg tgatgtcttg ttaggttggg acaccgacca 5221 attcccaacc aacatctacg ataccaccat gtgtatgtac gaagtcatca aagctggtgg 5281 ttttaccaac ggtggtttga acttcgatgc taaggctaga agaggttcct tcaccccaga 5341 agacattttc tactcctaca ttgctggtat ggatgctttc gctttgggtt tcagagctgc 5401 tttgaagtta atcgaagatg gtagaattga taagttcgtt gctgacagat acgcttcttg 5461 gaacactggt atcggtgctg atatcattgc tggtaaggct gacttcgctt ccttggaaaa 5521 gtacgctttg gaaaagggtg aagttaccgc ttctttgtcc tctggtagac aagaaatgtt 5581 ggaatctatc gtcaacaacg tcttgttctc tttgtaactc gagtcatgta attagttatg 5641 tcacgcttac attcacgccc tccccccaca tccgctctaa ccgaaaagga aggagttaga 5701 caacctgaag tctaggtccc tatttatttt tttatagtta tgttagtatt aagaacgtta
5761 tttatatttc aaatttttct tttttttctg tacagacgcg tgtacgcatg taacattata
5821 ctgaaaacct tgcttgagaa ggttttggga cgctcgaagg ctttaatttg cggccggtac 5881 ccaattcgcc ctatagtgag tcgtattacg cgcgctcact ggccgtcgtt ttacaacgtc 5941 gtgactggga aaaccctggc gttacccaac ttaatcgcct tgcagcacat ccccctttcg 6001 ccagctggcg taatagcgaa gaggcccgca ccgatcgccc ttcccaacag ttgcgcagcc 6061 tgaatggcga atggcgcgac gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg 6121 ttacgcgcag cgtgaccgct acacttgcca gcgccctagc gcccgctcct ttcgctttct 6181 tcccttcctt tctcgccacg ttcgccggct ttccccgtca agctctaaat cgggggctcc 6241 ctttagggtt ccgatttagt gctttacggc acctcgaccc caaaaaactt gattagggtg 6301 atggttcacg tagtgggcca tcgccctgat agacggtttt tcgccctttg acgttggagt 6361 ccacgttctt taatagtgga ctcttgttcc aaactggaac aacactcaac cctatctcgg 6421 tctattcttt tgatttataa gggattttgc cgatttcggc ctattggtta aaaaatgagc
6481 tgatttaaca aaaatttaac gcgaatttta acaaaatatt aacgtttaca atttcctgat 6541 gcggtatttt ctccttacgc atctgtgcgg tatttcacac cgcatagggt aataactgat 6601 ataattaaat tgaagctcta atttgtgagt ttagtataca tgcatttact tataatacag 6661 ttttttagtt ttgctggccg catcttctca aatatgcttc ccagcctgct tttctgtaac
6721 gttcaccctc taccttagca tcccttccct ttgcaaatag tcctcttcca acaataataa
6781 tgtcagatcc tgtagagacc acatcatcca cggttctata ctgttgaccc aatgcgtctc
6841 ccttgtcatc taaacccaca ccgggtgtca taatcaacca atcgtaacct tcatctcttc
6901 cacccatgtc tctttgagca ataaagccga taacaaaatc tttgtcgctc ttcgcaatgt
6961 caacagtacc cttagtatat tctccagtag atagggagcc cttgcatgac aattctgcta
7021 acatcaaaag gcctctaggt tcctttgtta cttcttctgc cgcctgcttc aaaccgctaa
7081 caatacctgg gcccaccaca ccgtgtgcat tcgtaatgtc tgcccattct gctattctgt
7141 atacacccgc agagtactgc aatttgactg tattaccaat gtcagcaaat tttctgtctt
7201 cgaagagtaa aaaattgtac ttggcggata atgcctttag cggcttaact gtgccctcca
7261 tggaaaaatc agtcaagata tccacatgtg tttttagtaa acaaattttg ggacctaatg
7321 cttcaactaa ctccagtaat tccttggtgg tacgaacatc caatgaagca cacaagtttg
7381 tttgcttttc gtgcatgata ttaaatagct tggcagcaac aggactagga tgagtagcag
7441 cacgttcctt atatgtagct ttcgacatga tttatcttcg tttcctgcag gtttttgttc
7501 tgtgcagttg ggttaagaat actgggcaat ttcatgtttc ttcaacacta catatgcgta
7561 tatataccaa tctaagtctg tgctccttcc ttcgttcttc cttctgttcg gagattaccg
7621 aatcaaaaaa atttcaaaga aaccgaaatc aaaaaaaaga ataaaaaaaa aatgatgaat
7681 tgaattgaaa agctgtggta tggtgcactc tcagtacaat ctgctctgat gccgcatagt
7741 taagccagcc ccgacacccg ccaacacccg ctgacgcgcc ctgacgggct tgtctgctcc
7801 cggcatccgc ttacagacaa gctgtgaccg tctccgggag ctgcatgtgt cagaggtttt
7861 caccgtcatc accgaaacgc gcga
Plasmid pBF1676 integrates RfCo' XI under the control of the TEF1 promoter into the rDNA genes. pBF1676 was constructed using standard molecular biology techniques. pBF1676 contains about 600 bp of a segment of an rDNA gene (e.g., located at about position 1818 to 2435) and about 1100 bp of another segment of an rDNA gene (e.g., located at about position 4488 to 5625). pBF1676 also contains the ScTEF1 promoter (e.g., located at about position 2455 to 2868), the RfCo' XI open reading frame (e.g., located at about position 2880 to 4199) and the ScADH1 terminator (e.g., located at about position 4208 to 4430). The whole 5' rDNA-PTEF1-RfCo' XI-3' rDNA cassette is flanked by PacI restriction enzyme sites. The sequence of pBF1676 is presented below as SEQ ID No: 187.
SEQ ID No: 187 - pBF1676 sequence
1 gtattatccc gtattgacgc cgggcaagag caactcggtc gccgcataca ctattctcag
61 aatgacttgg ttgagtactc accagtcaca gaaaagcatc ttacggatgg catgacagta
121 agagaattat gcagtgctgc cataaccatg agtgataaca ctgcggccaa cttacttctg
181 acaacgatcg gaggaccgaa ggagctaacc gcttttttgc acaacatggg ggatcatgta
241 actcgccttg atcgttggga accggagctg aatgaagcca taccaaacga cgagcgtgac
301 accacgatgc ctgtagcaat ggcaacaacg ttgcgcaaac tattaactgg cgaactactt
361 actctagctt cccggcaaca attaatagac tggatggagg cggataaagt tgcaggacca
421 cttctgcgct cggcccttcc ggctggctgg tttattgctg ataaatctgg agccggtgag
481 cgtgggtctc gcggtatcat tgcagcactg gggccagatg gtaagccctc ccgtatcgta
541 gttatctaca cgacggggag tcaggcaact atggatgaac gaaatagaca gatcgctgag
601 ataggtgcct cactgattaa gcattggtaa ctgtcagacc aagtttactc atatatactt
661 tagattgatt taaaacttca tttttaattt aaaaggatct aggtgaagat cctttttgat
721 aatctcatga ccaaaatccc ttaacgtgag ttttcgttcc actgagcgtc agaccccgta
781 gaaaagatca aaggatcttc ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa
841 acaaaaaaac caccgctacc agcggtggtt tgtttgccgg atcaagagct accaactctt
901 tttccgaagg taactggctt cagcagagcg cagataccaa atactgtcct tctagtgtag
961 ccgtagttag gccaccactt caagaactct gtagcaccgc ctacatacct cgctctgcta
1021 atcctgttac cagtggctgc tgccagtggc gataagtcgt gtcttaccgg gttggactca
1081 agacgatagt taccggataa ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag
1141 cccagcttgg agcgaacgac ctacaccgaa ctgagatacc tacagcgtga gctatgagaa
1201 agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga
1261 acaggagagc gcacgaggga gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc
1321 gggtttcgcc acctctgact tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc
1381 ctatggaaaa acgccagcaa cgcggccttt ttacggttcc tggccttttg ctggcctttt
1441 gctcacatgt tctttcctgc gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg ataccgctcg ccgcagccga acgaccgagc gcagcgagtc agtgagcgag gaagcggaag agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa tgcagctggc acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc tcactcatta ggcaccccag gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa ttgtgagcgg ataacaattt cacacaggaa acagctatga ccatgattac gccaagcttt taattaaccc ggggcacctg tcactttgga aaaaaaatat acgctaagat ttttggagaa tagcttaaat tgaagttttt ctcggcgaga aatacgtagt taaggcagag cgacagagag ggcaaaagaa aataaaagta agattttagt ttgtaatggg agggggggtt tagtcatgga gtacaagtgt gaggaaaagt agttgggagg tacttcatgc gaaagcagtt gaagacaagt tcgaaaagag tttggaaacg aattcgagta ggctlgtcgt tcgttatgtt tttgtaaatg gcctcgtcaa acggtggaga gagtcgctag gtgatcgtca gatctgccta gtctctatac agcgtgttta attgacatgg gttgatgcgt attgagagat acaatttggg aagaaattcc cagagtgtgt ttcttttgcg tttaacctga acagtctcat cgtgggcatc ttgcgattcc attggtgagc agcgaaggat ttggtggatt actagctaat agcaatctat ttcaaagaat tcaaacttgg gggaatgcct tgttgaatag ccggtcgcaa gactgtgatt cttcaagtgt aacctcctct caaatcagcg atatcggccg gccgcatgcg gatccatagc ttcaaaatgt ttctactcct tttttactct tccagatttt ctcggactcc gcgcatcgcc gtaccacttc aaaacaccca agcacagcat actaaatttc ccctctttct tcctctaggg tgtcgttaat tacccgtact aaaggtttgg aaaagaaaaa agagaccgcc tcgtttcttt ttcttcgtcg aaaaaggcaa taaaaatttt tatcacgttt ctttttcttg aaaatttttt tttttgattt ttttctcttt cgatgacctc ccattgatat ttaagttaat aaacggtctt
caatttctca agtttcagtt tcatttttct tgttctatta caactttttt tacttcttgc
tcattagaaa gaaagcatag caatctaatc taagttttaa ttacaaaact agtaaaaaaa tggctaagga atacttccca caaatccaaa aaattcaata ccaaggtcca aaatctaccg acccattatc cttcaaatac tacaacccag aagaagttat caacggtaag accatgagag aacacttgaa gttcgccttg tcctggtggc acaccatggg tggtgatggt actgacatgt tcggttgtgg tactaccgac aagacctggg gtcaatccga cccagctgct agagctaagg ctaaggttga cgctgctttc gaaattatgg acaagttgtc tatcgactac tactgtttcc acgacagaga cttgtctcca gaatacggtt ctttgaaggc taccaacgat caattggaca tcgtcaccga ctacatcaag gaaaagcaag gtgacaagtt caagtgtttg tggggtactg ctaagtgttt cgaccatcca agattcatgc acggtgctgg tacctctcca tctgctgatg tcttcgcttt ctccgctgct caaatcaaga aggctttgga atccactgtc aagttgggtg ctaacggtta cgtcttctgg ggtggtagag aaggttacga aactttgtta aacactaaca tgggtttgga attggataac atggctagat tgatgaagat ggccgtcgaa tacggtagat ctatcggttt taagggtgac ttctacatcg aaccaaaacc aaaggaacca actaagcacc aatacgattt cgacactgcc actgtcttgg gtttcttgag aaaatacggt ttggacaagg acttcaagat gaacattgaa gctaaccacg ccaccttggc ccaacatact tttcaacacg aattgagagt cgccagagat aacggtgtct tcggttctat cgacgctaac caaggtgatg tcttgttagg ttgggacacc gaccaattcc caaccaacat ctacgatacc accatgtgta tgtacgaagt catcaaagct ggtggtttta ccaacggtgg tttgaacttc gatgctaagg ctagaagagg ttccttcacc ccagaagaca ttttctactc ctacattgct ggtatggatg ctttcgcttt gggtttcaga gctgctttga agttaatcga agatggtaga attgataagt tcgttgctga cagatacgct tcttggaaca ctggtatcgg tgctgatatc attgctggta aggctgactt cgcttccttg gaaaagtacg ctttggaaaa gggtgaagtt accgcttctt tgtcctctgg tagacaagaa atgttggaat ctatcgtcaa caacgtcttg ttctctttgt aactcgagta ataagcgaat ttcttatgat ttatgatttt tattattaaa taagttataa aaaaaataag tgtatacaaa ttttaaagtg actcttaggt tttaaaacga aaattcttat tcttgagtaa ctctttcctg taggtcaggt tgctttctca ggtatagcat gaggtcgctc ttattgacca cacctctacc ggcatgccga gcaaatgcct gcaaatcgct cccgcggccg aagggcgaat tccagcacac tggcggccgt tactagtgga tccgagctcg cggccgcagg ccttgggtgc ttgctggcga attgcaatgt cattttgcgt ggggataaat catttgtata 4561 cgacttagat gtacaacggg gtattgtaag cagtagagta gccttgttgt tacgatctgc
4621 tgagattaag cctttgttgt ctgatttgtt ttttatttct ttctaagtgg gtactggcag
4681 gagccggggc ctagtttaga gagaagtaga ctgaacaagt ctctataaat tttatttgtc
4741 ttaagaattc gcatgcagag gtagtttcaa ggtgacaggt tatgaagata tggtgcaaaa
4801 gacaaatgga tggtggcagg catagtaaaa tgatggtgtg gaagacatag atggtatttg
4861 ttttgcattt acggcaccgg atgcgggcga taatgacggg aagagattta gtatgtggga
4921 cagaatgtcg gcggcagtat tgagaccatg agagtagcaa acgtaagtct aaaggttgtt
4981 ttatagtagt taggatgtag aaaatgtatt ccgataggcc attttacatt tggagggacg
5041 gttgaaagtg gacagaggaa aaggtgcgga aatggctgat tttgattgtt tatgttttgt
5101 gtgatgattt tacatttttg catagtatta ggtagtcaga tgaaagatga atagacatag
5161 gagtaagaaa acatagaata gttaccgtta ttggtaggag tgtggtgggg tggtatagtc
5221 cgcattggga tgttactttc ctgttatggc atggatttcc ctttagggtc tctgaagcgt
5281 atttccgtca ccgaaaaagg cagaaaaagg gaaactgaag ggaggatagt agtaaagttt
5341 gaatggtggt agtgtaatgt atgatatccg ttggttttgg tttcggttgt gaaaagtttt
5401 ttggtatgat attttgcaag tagcatatat ttcttgtgtg agaaaggtat attttgtatg
5461 ttttgtatgt tcccgcgcgt ttccgtattt tccgcttccg cttccgcagt aaaaaatagt
5521 gaggaactgg gttacccggg gcacctgtca ctttggaaaa aaaatatacg ctaagatttt
5581 tggagaatag cttaaattga agtttttctc ggcgagaaat acgtattaat taacatatgg
5641 tgcactctca gtacaatctg ctctgatgcc gcatagttaa gccagccccg acacccgcca
5701 acacccgctg acgcgccctg acgggcttgt ctgctcccgg catccgctta cagacaagct
5761 gtgaccgtct ccgggagctg catgtgtcag aggttttcac cgtcatcacc gaaacgcgcg
5821 agacgaaagg gcctcgtgat acgcctattt ttataggtta atgtcatgat aataatggtt
5881 tcttagacgt caggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttattt
5941 ttctaaatac attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa
6001 taatattgaa aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt
6061 tttgcggcat tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat
6121 gctgaagatc agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag
6181 atccttgaga gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg
6241 ctatgtggcg cg
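The approximate feature coordinates given for pBF1676 can be cross-checked against the stated segment sizes with a short calculation. The following Python sketch is illustrative only: the feature labels, the assignment of the two rDNA segments to the 5' and 3' ends of the cassette, and the reading of the second rDNA segment as the range 4488 to 5625 are assumptions made for this example, and the coordinates themselves are described in the text as approximate.

```python
# Approximate pBF1676 feature coordinates taken from the description
# (1-based, inclusive). Labels are illustrative; the 5'/3' assignment of the
# two rDNA segments is an assumption based on their order in the cassette.
FEATURES = {
    "5' rDNA segment": (1818, 2435),
    "TEF1 promoter": (2455, 2868),
    "XI open reading frame": (2880, 4199),
    "ADH1 terminator": (4208, 4430),
    "3' rDNA segment": (4488, 5625),
}


def feature_length(start: int, end: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def summarize(features: dict) -> dict:
    """Map each feature name to its length in bp."""
    return {name: feature_length(s, e) for name, (s, e) in features.items()}


lengths = summarize(FEATURES)
for name, bp in lengths.items():
    print(f"{name}: {bp} bp")

# An open reading frame should span a whole number of codons.
orf_bp = lengths["XI open reading frame"]
print("ORF length divisible by 3:", orf_bp % 3 == 0)
print("Codons (including stop):", orf_bp // 3)
```

Under these assumptions the two rDNA segments come out at 618 bp and 1138 bp, in line with the "about 600 bp" and "about 1100 bp" stated in the description.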
Example 60: Generating Xylose Fermenting Yeast: Expression of Xylose Isomerase Using Plasmid-Based or Genome Integrated Systems
Strain BF2513 was transformed with PacI-digested pBF1221. Transformants were selected on ScD-ura plates. Transformants were streaked on ScD-ura plates and single colonies were grown overnight in YPD, as described herein. Genomic DNA was extracted from the overnight culture. Loss of the PHO13 locus and presence of the TAL1 and RPE1 open reading frames in the new locus were verified by PCR. A strain meeting the criteria was identified and designated BF2573. The URA3 marker was recycled by plating the strain on 5-FOA and loss of URA3 was verified by PCR, resulting in a strain designated BF2575.
Strain BF2575 was transformed with PacI-digested pBF1292. Transformants were selected on ScD-ura plates. Transformants were streaked on ScD-ura plates and single colonies were grown overnight in YPD, as described herein. Genomic DNA was extracted from the overnight culture. Loss of the ADH2 locus and presence of the XKS1 and TKL1 open reading frames in the new locus were verified by PCR. A strain meeting the criteria was identified and designated BF2625. The URA3 marker was recycled by plating the strain on 5-FOA and loss of URA3 was verified by PCR, resulting in a strain designated BF2659.
Strain BF2659 was transformed with PacI-digested pBF1377. Transformants were selected on ScD-ura plates. Transformants were streaked on ScD-ura plates and single colonies were grown overnight in YPD. Genomic DNA was extracted from the overnight culture. Loss of the YDR275.5 junction and presence of the GAL2 open reading frame in the new locus were verified by PCR. A strain meeting the criteria was selected and the URA3 marker was recycled by plating on 5-FOA. Loss of URA3 was verified by PCR, resulting in a strain designated BF2712.
Strain BF2712 was transformed with PacI-digested pBF1386. Transformants were selected on ScD-ura plates. Transformants were streaked on ScD-ura plates and single colonies were grown overnight in YPD, as described herein. Genomic DNA was extracted from the overnight culture. Loss of the GRE3 junction and presence of the XKS1 and RKL 1 open reading frames in the new locus were verified by PCR. A strain meeting the criteria was identified and designated BF22733. The URA3 marker was recycled by plating the strain on 5-FOA and loss of URA3 was verified by PCR, resulting in a strain designated BF2735.
Strain BF2735 was transformed with pBF1375. Transformants were selected on ScD-ura plates and streaked for single colonies. A strain meeting the desired criteria was identified and designated BF2744. Strain BF2744 was adapted (e.g., evolved) in YP with 1.9% xylose and 0.1% glucose by sequential transfer for 3 weeks. The evolved strain was designated BF2807.
Strain BF2807 was cured of pBF1375 by plating on ScD+FOA plates. The cured strain was designated BF2808. BF2808 was transformed with PacI-digested pBF1676 and xylose-consuming strains were selected by growth in YPX, where X indicates xylose as the carbon source instead of dextrose (e.g., YPD). A strain that grew in xylose was selected and designated BF3319.
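The strain construction series of Example 60 can be recorded as a simple parent-to-child lineage, which makes it easy to trace any designated strain back to the starting strain. The Python sketch below is a hypothetical bookkeeping aid, not part of the described method: the `Step` record and `ancestry` helper are names introduced here for illustration, the operation text is a brief paraphrase, and strain designations are copied as printed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One construction step: parent strain, resulting strain, what was done."""
    parent: str
    child: str
    operation: str


# Steps paraphrased from Example 60; designations are as printed in the text.
LINEAGE = [
    Step("BF2513", "BF2573", "integrate pBF1221, verify by PCR"),
    Step("BF2573", "BF2575", "recycle URA3 on 5-FOA"),
    Step("BF2575", "BF2625", "integrate pBF1292, verify by PCR"),
    Step("BF2625", "BF2659", "recycle URA3 on 5-FOA"),
    Step("BF2659", "BF2712", "integrate pBF1377, recycle URA3 on 5-FOA"),
    Step("BF2712", "BF22733", "integrate pBF1386, verify by PCR"),
    Step("BF22733", "BF2735", "recycle URA3 on 5-FOA"),
    Step("BF2735", "BF2744", "transform with pBF1375"),
    Step("BF2744", "BF2807", "adapt in YP with xylose and glucose for 3 weeks"),
    Step("BF2807", "BF2808", "cure pBF1375 on FOA"),
    Step("BF2808", "BF3319", "integrate pBF1676, select on YPX"),
]


def ancestry(strain: str, steps: list) -> list:
    """Return the chain of strains from the earliest ancestor to `strain`."""
    made_by = {s.child: s for s in steps}
    path = [strain]
    while path[-1] in made_by:
        path.append(made_by[path[-1]].parent)
    return list(reversed(path))


print(" -> ".join(ancestry("BF3319", LINEAGE)))
```

Calling `ancestry("BF2807", LINEAGE)` in the same way lists the strains leading to the evolved, plasmid-bearing strain evaluated in Example 61.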
Example 61: Growth Evaluation of Strain BF2807 in Fermentors.
Strain BF2807 was grown for 24 hours in Sc-ura DX (5 g/L dextrose, 60 g/L xylose) and used to inoculate 350 ml of Sc-ura DX in Multifors to an initial OD600 of 1. Fermentation was performed for 78 hours at 300 rpm, with no aeration, at a temperature of 30°C. pH was maintained at 4.8 with 1 N KOH. A glucose feed of 0.2 grams per liter per hour (e.g., g/L-hr) was started after the dextrose concentration fell below 1 g/L. After 78 hours of fermentation the xylose concentration was less than 1 g/L. Final fermentation yield was 0.43 g ethanol per g of sugar consumed. The xylitol concentration was determined to be about 1 g/L.
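For context, the reported yield can be expressed as a fraction of the stoichiometric maximum. The Python sketch below assumes the standard fermentation stoichiometries (one glucose to two ethanol and two CO2; three xylose to five ethanol and five CO2) and rounded molar masses; these constants and the function names are supplied for illustration and are not taken from the example itself.

```python
# Rounded molar masses in g/mol (assumed values for this illustration).
ETHANOL_MW = 46.07
GLUCOSE_MW = 180.16
XYLOSE_MW = 150.13


def theoretical_yield_glucose() -> float:
    """Maximum g ethanol per g glucose: C6H12O6 -> 2 C2H5OH + 2 CO2."""
    return 2 * ETHANOL_MW / GLUCOSE_MW


def theoretical_yield_xylose() -> float:
    """Maximum g ethanol per g xylose: 3 C5H10O5 -> 5 C2H5OH + 5 CO2."""
    return 5 * ETHANOL_MW / (3 * XYLOSE_MW)


def percent_of_theoretical(observed: float, theoretical: float) -> float:
    """Observed yield as a percentage of the theoretical maximum."""
    if theoretical <= 0:
        raise ValueError("theoretical yield must be positive")
    return 100.0 * observed / theoretical


observed_yield = 0.43  # g ethanol per g sugar consumed, as reported for BF2807
max_glc = theoretical_yield_glucose()
max_xyl = theoretical_yield_xylose()
print(f"Theoretical maximum (glucose): {max_glc:.3f} g/g")
print(f"Theoretical maximum (xylose):  {max_xyl:.3f} g/g")
print(f"BF2807: {percent_of_theoretical(observed_yield, max_glc):.1f}% of theoretical")
```

Because both sugars have essentially the same mass-based maximum (about 0.51 g/g), the comparison does not depend on the exact ratio of dextrose to xylose consumed.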
Example 62: Growth Evaluation of Strain BF3319 in alcohol fermentation monitors (AFM) and Fermentors.
A seed culture was inoculated from a frozen stock and grown for four days in YPX. The seed culture was used to inoculate about 400 ml of YPDX (e.g., about 80 g/L dextrose and about 40 g/L xylose) in a Multifor at an initial OD600 of around 1.0. A pH of 4.8 was maintained with 1 N KOH and the fermentation temperature was controlled at 30°C. Cultures were agitated at about 300 rpm with no aeration. Three different fermentation tanks were inoculated. Samples were taken at 0, 6, 12, 24, 30 and 48 hours. All xylose and glucose were consumed by 48 hr, as shown in FIG. 46. Yield was 0.41 g ethanol / g of sugar consumed. Additional analysis indicated that BF3319 produced less than 0.5 g/L xylitol during the fermentation process.
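The figures reported for the Multifor run allow a rough estimate of ethanol titer and average volumetric rates. The Python sketch below is an approximation under stated assumptions: it treats the nominal starting sugars (about 80 g/L dextrose and about 40 g/L xylose) as fully consumed, and it averages over the full 48 hours even though the sugars may have been exhausted earlier, so the rates are best read as lower bounds on the average. The variable and function names are illustrative.

```python
def average_rate(amount_g_per_l: float, hours: float) -> float:
    """Average volumetric rate in g/L/h over the given time span."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return amount_g_per_l / hours


# Nominal values from the Multifor run of BF3319 (approximate in the source).
dextrose_g_per_l = 80.0
xylose_g_per_l = 40.0
run_hours = 48.0
ethanol_yield = 0.41  # g ethanol per g sugar consumed

sugar_consumed = dextrose_g_per_l + xylose_g_per_l
ethanol_titer = ethanol_yield * sugar_consumed  # estimated, not measured
sugar_rate = average_rate(sugar_consumed, run_hours)
ethanol_rate = average_rate(ethanol_titer, run_hours)

print(f"Estimated ethanol titer: {ethanol_titer:.1f} g/L")
print(f"Average sugar uptake: {sugar_rate:.2f} g/L/h")
print(f"Average ethanol productivity: {ethanol_rate:.2f} g/L/h")
```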
A seed culture was grown in YPX from a xylose plate for about three days and used to inoculate around 400 ml of YPDX (e.g., about 70 g/L dextrose and about 40 g/L xylose) or ScDX (e.g., about 70 g/L dextrose and about 38 g/L xylose) in an AFM at 30°C. Three bottles were inoculated per medium. CO2 production was used to identify when all the sugars were consumed. After 68 hours the CO2 production was close to zero and samples were taken to calculate yield. Glucose and xylose concentrations were measured in all bottles and were less than 1 g/L after 68 hours. The overall yields were as follows: YP media, 0.46 g ethanol / g consumed sugars; SC media, 0.44 g ethanol / g consumed sugars.
A seed culture was grown in YPX from a xylose plate for about three days and used to inoculate around 400 ml of YPDX (e.g., about 80 g/L dextrose and about 42 g/L xylose) or ScDX (e.g., about 80 g/L dextrose and about 43 g/L xylose) in an AFM at 30°C. Three bottles were inoculated per medium. CO2 production was used to identify when all the sugars were consumed. After 51 hours the CO2 production was close to zero and samples were taken to calculate yield. Glucose and xylose concentrations were measured in all bottles and were less than 1 g/L after 51 hours. The overall yields were as follows: YP media, 0.46 g ethanol / g consumed sugars; SC media, 0.44 g ethanol / g consumed sugars.
Example 63: Examples of Embodiments
Provided hereafter are certain non-limiting embodiments of the technology.
A1. An engineered microorganism that comprises:
(a) a functional Embden-Meyerhoff glycolysis pathway that metabolizes six-carbon sugars under aerobic fermentation conditions, and
(b) a genetic modification that reduces an Embden-Meyerhoff glycolysis pathway member activity upon exposure of the engineered microorganism to anaerobic fermentation conditions, whereby the engineered microorganism preferentially metabolizes six-carbon sugars by the Entner-Doudoroff pathway under the anaerobic fermentation conditions.
A2. The engineered microorganism of embodiment A1, wherein the genetic modification is insertion of a promoter into genomic DNA in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity.
A3. The engineered microorganism of embodiment A1, wherein the genetic modification is provision of a heterologous promoter polynucleotide in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity.
A4. The engineered microorganism of embodiment A1, wherein the genetic modification is a deletion or disruption of a polynucleotide that encodes, or regulates production of, the Embden-Meyerhoff glycolysis pathway member, and the microorganism comprises a heterologous nucleic acid that includes a polynucleotide encoding the Embden-Meyerhoff glycolysis pathway member operably linked to a polynucleotide that down-regulates production of the member under anaerobic fermentation conditions.
A5. The engineered microorganism of any one of embodiments A1-A4, wherein the Embden-Meyerhoff glycolysis pathway member activity is a phosphofructokinase activity.
A6. The engineered microorganism of any one of embodiments A1-A5, which microorganism comprises an added or altered five-carbon sugar metabolic activity.
A7. The engineered microorganism of embodiment A6, wherein the microorganism comprises an added or altered xylose isomerase activity.
A8. The engineered microorganism of any one of embodiments A1-A7, wherein the microorganism comprises an added or altered five-carbon sugar transporter activity.
A9. The engineered microorganism of embodiment A8, wherein the transporter activity is a transporter facilitator activity.
A10. The engineered microorganism of embodiment A8, wherein the transporter activity is an active transporter activity.
A11. The engineered microorganism of any one of embodiments A1-A10, wherein the microorganism comprises an added or altered carbon dioxide fixation activity.
A12. The engineered microorganism of embodiment A11, wherein the microorganism comprises an added or altered phosphoenolpyruvate (PEP) carboxylase activity.
A13. The engineered microorganism of any one of embodiments A1-A12, wherein the microorganism comprises a genetic modification that reduces or removes an alcohol dehydrogenase 2 activity.
A14. The engineered microorganism of any one of embodiments A1-A13, wherein the microorganism comprises a genetic modification described in any one of embodiments B1-B208.
B1. An engineered microorganism that comprises a genetic modification that inhibits cell division upon exposure to a change in fermentation conditions, wherein:
the genetic modification comprises introduction of a heterologous promoter operably linked to a polynucleotide encoding a polypeptide that regulates the cell cycle of the microorganism; and the promoter activity is altered by the change in fermentation conditions.
B2. The engineered microorganism of embodiment B1, wherein the genetic modification induces cell cycle arrest.
B3. The engineered microorganism of embodiment B1 or B2, wherein the change in fermentation conditions is a switch to anaerobic fermentation conditions.
B4. The engineered microorganism of embodiment B1 or B2, wherein the change in fermentation conditions is a switch to an elevated temperature.
B5. The engineered microorganism of any one of embodiments B1-B4, wherein the polypeptide that regulates the cell cycle has thymidylate synthase activity.
B6. The engineered microorganism of any one of embodiments B1-B5, wherein the promoter activity is reduced by the change in fermentation conditions.
B100. An engineered microorganism that comprises a genetic modification that inhibits cell division and/or cell proliferation upon exposure of the microorganism to a change in fermentation conditions.
B101. The engineered microorganism of embodiment B100, wherein the change in fermentation conditions is a switch to anaerobic fermentation conditions.
B102. The engineered microorganism of embodiment B100, wherein the change in fermentation conditions is a switch to an elevated temperature.
B103. The engineered microorganism of any one of embodiments B100-B102, wherein the genetic modification induces cell cycle arrest upon exposure to the change in fermentation conditions.
B104. The engineered microorganism of any one of embodiments B100-B103, wherein the genetic modification reduces thymidylate synthase activity upon exposure to the change in fermentation conditions.
B200. The engineered microorganism of any one of embodiments B1-B104, wherein the genetic modification is a temperature sensitive mutation.
B201. The engineered microorganism of any one of embodiments B1-B200, wherein the microorganism comprises an added or altered five-carbon sugar metabolic activity.
B202. The engineered microorganism of embodiment B201, wherein the microorganism comprises an added or altered xylose isomerase activity.
B203. The engineered microorganism of any one of embodiments B1-B202, wherein the microorganism comprises an added or altered five-carbon sugar transporter activity.
B204. The engineered microorganism of embodiment B203, wherein the transporter activity is a transporter facilitator activity.
B205. The engineered microorganism of embodiment B203, wherein the transporter activity is an active transporter activity.
B206. The engineered microorganism of any one of embodiments B1-B205, wherein the microorganism comprises an added or altered carbon dioxide fixation activity.
B207. The engineered microorganism of embodiment B206, wherein the microorganism comprises an added or altered phosphoenolpyruvate (PEP) carboxylase activity.
B208. The engineered microorganism of any one of embodiments B1-B207, wherein the microorganism comprises a genetic modification that reduces or removes an alcohol dehydrogenase 2 activity.
B300. The engineered microorganism of any one of embodiments A1-B208, wherein the microorganism is an engineered yeast.
B301. The engineered microorganism of embodiment B300, wherein the yeast is a Saccharomyces yeast.
B302. The engineered microorganism of embodiment B301, wherein the Saccharomyces yeast is S. cerevisiae.
C1. A method for manufacturing a target product produced by an engineered microorganism, which comprises:
(a) culturing an engineered microorganism of any one of embodiments A1-B302 under aerobic conditions; and
(b) culturing the engineered microorganism after (a) under anaerobic conditions, whereby the engineered microorganism produces the target product.
C2. The method of embodiment C1, wherein the target product is ethanol.
C3. The method of embodiment C1, wherein the target product is succinic acid.
C4. The method of any one of embodiments C1-C3, wherein the host microorganism from which the engineered microorganism is produced does not produce a detectable amount of the target product.
C5. The method of any one of embodiments C1-C4, wherein the culture conditions comprise fermentation conditions.
C6. The method of any one of embodiments C1-C5, wherein the culture conditions comprise introduction of biomass.
C7. The method of any one of embodiments C1-C6, wherein the culture conditions comprise introduction of a six-carbon sugar.
C8. The method of embodiment C7, wherein the sugar is glucose.
C9. The method of any one of embodiments C1-C8, wherein the culture conditions comprise introduction of a five-carbon sugar.
C10. The method of embodiment C9, wherein the sugar is xylose.
C11. The method of any one of embodiments C1-C10, wherein the target product is produced with a yield of greater than about 0.3 grams per gram of glucose added.
C12. The method of any one of embodiments C1-C11, which comprises purifying the target product from the cultured microorganisms.
C13. The method of embodiment C12, which comprises modifying the target product, thereby producing modified target product.
C14. The method of any one of embodiments C1-C13, which comprises placing the cultured microorganisms, the target product or the modified target product in a container.
C15. The method of embodiment C14, which comprises shipping the container.
D1. A method for producing a target product by an engineered microorganism, which comprises:
(a) culturing an engineered microorganism of any one of embodiments A1-B302 under a first set of fermentation conditions; and
(b) culturing the engineered microorganism after (a) under a second set of fermentation conditions different than the first set of fermentation conditions, whereby the second set of fermentation conditions inhibits cell division and/or cell proliferation of the engineered microorganism.
D2. The method of embodiment D1, wherein the second set of fermentation conditions comprises anaerobic fermentation conditions and the first set of fermentation conditions comprises aerobic fermentation conditions.
D3. The method of embodiment D1, wherein the second set of fermentation conditions comprises an elevated temperature as compared to the temperature in the first set of fermentation conditions.
D4. The method of any one of embodiments D1-D3, wherein the genetic modification inhibits the cell cycle of the engineered microorganism upon exposure to the second set of fermentation conditions.
D5. The method of any one of embodiments D1-D4, wherein the genetic modification induces cell cycle arrest upon exposure to the second set of fermentation conditions.
D6. The method of any one of embodiments D1-D5, wherein the genetic modification inhibits thymidylate synthase activity upon exposure to the change in fermentation conditions.
D7. The method of embodiment D6, wherein the genetic modification comprises a temperature sensitive mutation.
D8. The method of any one of embodiments D1-D7, wherein the microorganism comprises an added or altered five-carbon sugar metabolic activity.
D9. The method of embodiment D8, wherein the microorganism comprises an added or altered xylose isomerase activity.
D10. The method of any one of embodiments D1-D9, wherein the microorganism comprises an added or altered five-carbon sugar transporter activity.
D11. The method of embodiment D10, wherein the transporter activity is a transporter facilitator activity.
D12. The method of embodiment D10, wherein the transporter activity is an active transporter activity.
D13. The method of any one of embodiments D1-D12, wherein the microorganism comprises an added or altered carbon dioxide fixation activity.
D14. The method of embodiment D13, wherein the microorganism comprises an added or altered phosphoenolpyruvate (PEP) carboxylase activity.
D15. The method of any one of embodiments D1-D14, wherein the microorganism comprises a genetic modification that reduces or removes an alcohol dehydrogenase 2 activity.
D16. The method of any one of embodiments D1-D15, wherein the microorganism is an engineered yeast.
D17. The method of embodiment D16, wherein the yeast is a Saccharomyces yeast.
D18. The method of embodiment D17, wherein the Saccharomyces yeast is S. cerevisiae.
D19. The method of any one of embodiments D1-D18, wherein the target product is ethanol.
D20. The method of any one of embodiments D1-D18, wherein the target product is succinic acid.
E1. A method for manufacturing an engineered microorganism, which comprises:
(a) introducing a genetic modification to a host microorganism that reduces an Embden-Meyerhoff glycolysis pathway member activity upon exposure of the engineered microorganism to anaerobic conditions; and
(b) selecting for engineered microorganisms that (i) metabolize six-carbon sugars by the Embden-Meyerhoff glycolysis pathway under aerobic fermentation conditions, and (ii) preferentially metabolize six-carbon sugars by the Entner-Doudoroff pathway under the anaerobic fermentation conditions.
E2. The method of embodiment E1, wherein the genetic modification is insertion of a promoter into genomic DNA in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity.
E3. The method of embodiment E1, wherein the genetic modification is provision of a heterologous promoter polynucleotide in operable linkage with a polynucleotide that encodes the Embden-Meyerhoff glycolysis pathway member activity.
E4. The method of embodiment E1, wherein the genetic modification is a deletion or disruption of a polynucleotide that encodes, or regulates production of, the Embden-Meyerhoff glycolysis pathway member, and the microorganism comprises a heterologous nucleic acid that includes a polynucleotide encoding the Embden-Meyerhoff glycolysis pathway member operably linked to a polynucleotide that down-regulates production of the member under anaerobic fermentation conditions.
E5. The method of any one of embodiments E1-E4, wherein the Embden-Meyerhoff glycolysis pathway member activity is a phosphofructokinase activity.
E6. The method of any one of embodiments E1-E5, which comprises introducing a genetic alteration that adds or alters a five-carbon sugar metabolic activity.
E7. The method of embodiment E6, wherein the genetic alteration adds or alters a xylose isomerase activity.
E8. The method of any one of embodiments E1-E7, which comprises introducing a genetic modification that adds or alters a five-carbon sugar transporter activity.
E9. The method of embodiment E8, wherein the transporter activity is a transporter facilitator activity.
E10. The method of embodiment E8, wherein the transporter activity is an active transporter activity.
E11. The method of any one of embodiments E1-E10, which comprises introducing a genetic modification that adds or alters a carbon dioxide fixation activity.
E12. The method of embodiment E11, which comprises introducing a genetic modification that adds or alters a phosphoenolpyruvate (PEP) carboxylase activity.
E13. The method of any one of embodiments E1-E12, which comprises introducing a genetic modification that reduces or removes an alcohol dehydrogenase 2 activity.
E14. The method of any one of embodiments E1-E13, which comprises introducing a genetic modification described in any one of embodiments B1-B208.
F1. A method for manufacturing an engineered microorganism, which comprises:
(a) introducing a genetic modification to a host microorganism that inhibits cell division upon exposure to a change in fermentation conditions, thereby producing engineered microorganisms; and
(b) selecting for engineered microorganisms with inhibited cell division upon exposure of the engineered microorganisms to the change in fermentation conditions.
F2. The method of embodiment F1, wherein the change in fermentation conditions comprises a change to anaerobic fermentation conditions.
F3. The method of embodiment F1, wherein the change in fermentation conditions comprises a change to an elevated temperature.
F4. The method of any one of embodiments F1-F3, wherein the genetic modification inhibits the cell cycle of the engineered microorganism upon exposure to the change in fermentation conditions.
F5. The method of any one of embodiments F1-F4, wherein the genetic modification induces cell cycle arrest upon exposure to the second set of fermentation conditions.
F6. The method of any one of embodiments F1-F5, wherein the genetic modification inhibits thymidylate synthase activity upon exposure to the change in fermentation conditions.
F7. The method of embodiment F6, wherein the genetic modification comprises a temperature sensitive mutation.
F8. The method of any one of embodiments F1-F7, wherein the genetic modification adds or alters a five-carbon sugar metabolic activity.
F9. The method of embodiment F8, wherein the genetic modification adds or alters a xylose isomerase activity.
F10. The method of any one of embodiments F1-F9, wherein the genetic modification adds or alters a five-carbon sugar transporter activity.
F11. The method of embodiment F10, wherein the transporter activity is a transporter facilitator activity.
F12. The method of embodiment F10, wherein the transporter activity is an active transporter activity.
F13. The method of any one of embodiments F1-F12, wherein the genetic modification adds or alters a carbon dioxide fixation activity.
F14. The method of embodiment F13, wherein the genetic modification adds or alters a phosphoenolpyruvate (PEP) carboxylase activity.
F15. The method of any one of embodiments F1-F14, wherein the genetic modification reduces or removes an alcohol dehydrogenase 2 activity.
F16. The method of any one of embodiments E1-E14 and F1-F15, wherein the microorganism is an engineered yeast.
F17. The method of embodiment F16, wherein the yeast is a Saccharomyces yeast.
F18. The method of embodiment F17, wherein the Saccharomyces yeast is S. cerevisiae.
G1. A nucleic acid, comprising a polynucleotide that encodes a polypeptide from Ruminococcus flavefaciens possessing a xylose to xylulose xylose isomerase activity.
G2. The nucleic acid of embodiment G1, wherein the polynucleotide includes one or more substituted codons.
G3. The nucleic acid of embodiment G2, wherein the one or more substituted codons are yeast codons.
G4. The nucleic acid of any one of embodiments G1 to G3, wherein the polynucleotide includes a nucleotide sequence of SEQ ID NO: 29, 30, 32 or 33, fragment thereof, or sequence having 50% identity or greater to the foregoing.
G5. The nucleic acid of any one of embodiments G1 to G4, wherein the polypeptide includes an amino acid sequence of SEQ ID NO: 31, fragment thereof, or sequence having 75% identity or greater to the foregoing.
G6. The nucleic acid of any one of embodiments G1 to G5, wherein a stretch of contiguous nucleotides of the polynucleotide is from another organism.
G7. The nucleic acid of embodiment G6, wherein the stretch of contiguous nucleotides from the other organism is from a nucleotide sequence that encodes a polypeptide possessing a xylose isomerase activity.
G8. The nucleic acid of embodiment G5 or G6, wherein the other organism is a fungus.
G9. The nucleic acid of embodiment G8, wherein the fungus is a Piromyces fungus.
G10. The nucleic acid of embodiment G9, wherein the fungus is a Piromyces strain E2.
G11. The nucleic acid of embodiment G10, wherein the stretch of contiguous nucleotides from the other organism is from SEQ ID NO: 34, or sequence having 50% identity or greater to the foregoing.
G12. The nucleic acid of embodiment G10, wherein the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 35, or sequence having 75% identity or greater to the foregoing.
G13. The nucleic acid of any one of embodiments G6 to G12, wherein the stretch of contiguous nucleotides from the other organism is about 1% to about 30% of the total number of nucleotides in the polynucleotide that encodes the polypeptide possessing xylose isomerase activity.
G14. The nucleic acid of embodiment G13, wherein about 30 contiguous nucleotides from the polynucleotide from R. flavefaciens are replaced by about 10 to about 20 nucleotides from the other organism.
G15. The nucleic acid of embodiment G13 or G14, wherein the contiguous stretch of polynucleotides from the other organism are at the 5' end of the polynucleotide.
G16. The nucleic acid of any one of embodiments G6 to G15, wherein the polynucleotide includes a nucleotide sequence of SEQ ID NO: 55, 56, 57, 59 or 61, fragment thereof, or sequence having 50% identity or greater to the foregoing.
G17. The nucleic acid of any one of embodiments G6 to G15, wherein the polynucleotide encodes a polypeptide that includes an amino acid sequence of SEQ ID NO: 58, 60 or 62, fragment thereof, or sequence having 75% identity or greater to the foregoing.
G18. The nucleic acid of any one of embodiments G1 to G17, which comprises one or more point mutations.
G19. The nucleic acid of embodiment G18, wherein the point mutation is at a position corresponding to position 179 of the R. flavefaciens polypeptide having xylose isomerase activity.
G20. The nucleic acid of embodiment G19, wherein the point mutation is a glycine 179 to alanine point mutation.
H1. An expression vector comprising a polynucleotide that encodes a polypeptide from Ruminococcus flavefaciens possessing a xylose to xylulose xylose isomerase activity.
H2. The expression vector of embodiment H1, wherein the polynucleotide includes one or more substituted codons.
H3. The expression vector of embodiment H2, wherein the one or more substituted codons are yeast codons.
H4. The expression vector of any one of embodiments H1 to H3, wherein the polynucleotide includes a nucleotide sequence of SEQ ID NO: 29, 30, 32 or 33, fragment thereof, or sequence having 50% identity or greater to the foregoing.
H5. The expression vector of any one of embodiments H1 to H4, wherein the polypeptide includes an amino acid sequence of SEQ ID NO: 31, fragment thereof, or sequence having 75% identity or greater to the foregoing.
H6. The expression vector of any one of embodiments H1 to H5, wherein a stretch of contiguous nucleotides of the polynucleotide is from another organism.
H7. The expression vector of embodiment H6, wherein the stretch of contiguous nucleotides from the other organism is from a nucleotide sequence that encodes a polypeptide possessing a xylose isomerase activity.
H8. The expression vector of embodiment H5 or H6, wherein the other organism is a fungus.
H9. The expression vector of embodiment H8, wherein the fungus is a Piromyces fungus.
H10. The expression vector of embodiment H9, wherein the fungus is a Piromyces strain E2.
H11. The expression vector of embodiment H10, wherein the stretch of contiguous nucleotides from the other organism is from SEQ ID NO: 34, or sequence having 50% identity or greater to the foregoing.
H12. The expression vector of embodiment H10, wherein the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 35, or sequence having 75% identity or greater to the foregoing.
H13. The expression vector of any one of embodiments H6 to H12, wherein the stretch of contiguous nucleotides from the other organism is about 1% to about 30% of the total number of nucleotides in the polynucleotide that encodes the polypeptide possessing xylose isomerase activity.
H14. The expression vector of embodiment H13, wherein about 30 contiguous nucleotides from the polynucleotide from R. flavefaciens are replaced by about 10 to about 20 nucleotides from the other organism.
H15. The expression vector of embodiment H13 or H14, wherein the contiguous stretch of polynucleotides from the other organism are at the 5' end of the polynucleotide.
H16. The expression vector of any one of embodiments H6 to H15, wherein the polynucleotide includes a nucleotide sequence of SEQ ID NO: 55, 56, 57, 59 or 61, fragment thereof, or sequence having 50% identity or greater to the foregoing.
H17. The expression vector of any one of embodiments H6 to H15, wherein the polynucleotide encodes a polypeptide that includes an amino acid sequence of SEQ ID NO: 58, 60 or 62, fragment thereof, or sequence having 75% identity or greater to the foregoing.
H18. The expression vector of any one of embodiments H1 to H17, which comprises one or more point mutations.
H19. The expression vector of embodiment H18, wherein the point mutation is at a position corresponding to position 179 of the R. flavefaciens polypeptide having xylose isomerase activity.
H20. The expression vector of embodiment H19, wherein the point mutation is a glycine 179 to alanine point mutation.
H21. The expression vector of any one of embodiments H1 to H20, comprising a regulatory nucleotide sequence in operable linkage with the polynucleotide.
H22. The expression vector of embodiment H21, wherein the regulatory nucleotide sequence comprises a promoter sequence.
H23. The expression vector of embodiment H22, wherein the promoter sequence is an inducible promoter sequence.
H24. The expression vector of embodiment H22, wherein the promoter sequence is a constitutively active promoter sequence.
H25. A method for preparing an expression vector of any one of embodiments H1 to H24, comprising: (i) providing a nucleic acid that contains a regulatory sequence, and (ii) inserting the polynucleotide into the nucleic acid in operable linkage with the regulatory sequence.
I1. A nucleic acid, comprising a polynucleotide that includes a first stretch of contiguous nucleic acids from a first organism and a second stretch of contiguous nucleic acids from a second organism, wherein the polynucleotide encodes a polypeptide possessing a xylose to xylulose xylose isomerase activity.
I2. The nucleic acid of embodiment I1, wherein the first organism and the second organism are the same species.
I3. The nucleic acid of embodiment I1, wherein the first organism and the second organism are different species.
I4. The nucleic acid of any one of embodiments I1 to I3, wherein the first stretch of contiguous nucleotides and the second stretch of contiguous nucleotides independently are selected from nucleotide sequence that encodes a polypeptide having xylose isomerase activity.
I5. The nucleic acid of any one of embodiments I1 to I4, wherein the first organism is a bacterium.
I6. The nucleic acid of embodiment I5, wherein the bacterium is a Ruminococcus bacterium.
I7. The nucleic acid of embodiment I6, wherein the bacterium is a Ruminococcus flavefaciens bacterium.
I8. The nucleic acid of any one of embodiments I5 to I7, wherein the stretch of contiguous nucleotides is from SEQ ID NO: 29, 30, 32, 33, or a sequence having 50% identity or greater to the foregoing.
I9. The nucleic acid of embodiment I8, wherein the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 31, or a sequence having 75% identity or greater to the foregoing.
I10. The nucleic acid of any one of embodiments I1 to I9, wherein the second organism is a fungus.
I11. The nucleic acid of embodiment I10, wherein the fungus is a Piromyces fungus.
I12. The nucleic acid of embodiment I11, wherein the fungus is a Piromyces strain E2 fungus.
I13. The nucleic acid of any one of embodiments I10 to I12, wherein the stretch of contiguous nucleotides is from SEQ ID NO: 34, or a sequence having 50% identity or greater to the foregoing.
I14. The nucleic acid of embodiment I13, wherein the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 35, or a sequence having 75% identity or greater to the foregoing.
I15. The nucleic acid of any one of embodiments I1 to I14, wherein the polynucleotide includes one or more substituted codons.
I16. The nucleic acid of embodiment I15, wherein the one or more substituted codons are yeast codons.
I17. The nucleic acid of any one of embodiments I1 to I16, wherein the stretch of contiguous nucleotides from the first organism or second organism is about 1% to about 30% of the total number of nucleotides in the polynucleotide that encodes the polypeptide possessing xylose isomerase activity.
I18. The nucleic acid of embodiment I17, wherein the stretch of contiguous nucleotides from the second organism is about 1% to about 30% of the total number of nucleotides in the polynucleotide.
I19. The nucleic acid of embodiment I18, wherein the contiguous stretch of polynucleotides from the second organism are at the 5' end of the polynucleotide.
I20. The nucleic acid of any one of embodiments I1 to I19, wherein the polynucleotide includes a nucleotide sequence of SEQ ID NO: 55, 56, 57, 59 or 61, fragment thereof, or sequence having 50% identity or greater to the foregoing.
I21. The nucleic acid of any one of embodiments I1 to I20, wherein the polynucleotide encodes a polypeptide that includes an amino acid sequence of SEQ ID NO: 58, 60 or 62, fragment thereof, or sequence having 75% identity or greater to the foregoing.
I22. The nucleic acid of any one of embodiments I1 to I21, which comprises one or more point mutations.
I23. The nucleic acid of embodiment I22, wherein the point mutation is at a position corresponding to position 179 of the R. flavefaciens polypeptide having xylose isomerase activity.
I24. The nucleic acid of embodiment I23, wherein the point mutation is a glycine 179 to alanine point mutation.
J1. An expression vector, comprising a polynucleotide that includes a first stretch of contiguous nucleotides from a first organism and a second stretch of contiguous nucleotides from a second organism, wherein the polynucleotide encodes a polypeptide possessing a xylose to xylulose xylose isomerase activity.
J2. The expression vector of embodiment J1, wherein the first organism and the second organism are the same.
J3. The expression vector of embodiment J1, wherein the first organism and the second organism are different.
J4. The expression vector of any one of embodiments J1 to J3, wherein the first stretch of contiguous nucleotides and the second stretch of contiguous nucleotides independently are selected from nucleotide sequence that encodes a polypeptide having xylose isomerase activity.
J5. The expression vector of any one of embodiments J1 to J4, wherein the first organism is a bacterium.
J6. The expression vector of embodiment J5, wherein the bacterium is a Ruminococcus bacterium.
J7. The expression vector of embodiment J6, wherein the bacterium is a Ruminococcus flavefaciens bacterium.
J8. The expression vector of any one of embodiments J5 to J7, wherein the stretch of contiguous nucleotides is from SEQ ID NO: 29, 30, 32, 33, or a sequence having 50% identity or greater to the foregoing.
J9. The expression vector of embodiment J8, wherein the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 31, or a sequence having 75% identity or greater to the foregoing.
J10. The expression vector of any one of embodiments J1 to J9, wherein the second organism is a fungus.
J11. The expression vector of embodiment J10, wherein the fungus is a Piromyces fungus.
J12. The expression vector of embodiment J11, wherein the fungus is a Piromyces strain E2 fungus.
J13. The expression vector of any one of embodiments J10 to J12, wherein the stretch of contiguous nucleotides is from SEQ ID NO: 34, or a sequence having 50% identity or greater to the foregoing.
J14. The expression vector of embodiment J13, wherein the stretch of contiguous nucleotides from the other organism encodes an amino acid sequence from SEQ ID NO: 35, or a sequence having 75% identity or greater to the foregoing.
J15. The expression vector of any one of embodiments J1 to J14, wherein the polynucleotide includes one or more substituted codons.
J16. The expression vector of embodiment J15, wherein the one or more substituted codons are yeast codons.
J17. The expression vector of any one of embodiments J1 to J16, wherein the stretch of contiguous nucleotides from the first organism or second organism is about 1% to about 30% of the total number of nucleotides in the polynucleotide that encodes the polypeptide possessing xylose isomerase activity.
J18. The expression vector of embodiment J17, wherein the stretch of contiguous nucleotides from the second organism is about 1% to about 30% of the total number of nucleotides in the polynucleotide.
J19. The expression vector of embodiment J18, wherein the contiguous stretch of polynucleotides from the second organism are at the 5' end of the polynucleotide.
J20. The expression vector of any one of embodiments J1 to J19, wherein the polynucleotide includes a nucleotide sequence of SEQ ID NO: 55, 56, 57, 59 or 61, fragment thereof, or sequence having 50% identity or greater to the foregoing.
J21. The expression vector of any one of embodiments J1 to J20, wherein the polynucleotide encodes a polypeptide that includes an amino acid sequence of SEQ ID NO: 58, 60 or 62, fragment thereof, or sequence having 75% identity or greater to the foregoing.
J22. The expression vector of any one of embodiments J1 to J21, which comprises one or more point mutations.
J23. The expression vector of embodiment J22, wherein the point mutation is at a position corresponding to position 179 of the R. flavefaciens polypeptide having xylose isomerase activity.
J24. The expression vector of embodiment J23, wherein the point mutation is a glycine 179 to alanine point mutation.
J25. The expression vector of any one of embodiments J1 to J24, comprising a regulatory nucleotide sequence in operable linkage with the polynucleotide.
J26. The expression vector of embodiment J25, wherein the regulatory nucleotide sequence comprises a promoter sequence.
J27. The expression vector of embodiment J26, wherein the promoter sequence is an inducible promoter sequence.
J28. The expression vector of embodiment J26, wherein the promoter sequence is a constitutively active promoter sequence.
J29. A method for preparing an expression vector of any one of embodiments J1 to J28, comprising: (i) providing a nucleic acid that contains a regulatory sequence, and (ii) inserting the polynucleotide into the nucleic acid in operable linkage with the regulatory sequence.
K1. A microbe comprising a polynucleotide of the nucleic acid of any one of embodiments G1 to G20 or any one of embodiments I1 to I24.
K2. A microbe comprising an expression vector of any one of embodiments H1 to H20 or any one of embodiments J1 to J24.
K3. The microbe of embodiment K1 or K2, which is a yeast.
K4. The microbe of embodiment K3, which is a Saccharomyces yeast.
K5. The microbe of embodiment K4, which is a Saccharomyces cerevisiae yeast.
L1. A method, comprising contacting a microbe of any one of embodiments K1 to K5 with a feedstock comprising a five carbon molecule under conditions for generating ethanol.
L2. The method of embodiment L1, wherein the five carbon molecule comprises xylose.
L3. The method of embodiment L1 or L2, wherein about 15 grams per liter of ethanol or more is generated within about 372 hours.
L4. The method of any one of embodiments L1 to L3, wherein about 2.0 grams per liter dry cell weight is generated within about 372 hours.
M1. A composition comprising an engineered yeast comprising heterologous polynucleotide subsequences that encode a phosphogluconate dehydratase enzyme, a 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme and a xylose isomerase enzyme.
M2. The composition of embodiment M1, wherein the yeast is a Saccharomyces spp. yeast.
M3. The composition of embodiment M2, wherein the yeast is a Saccharomyces cerevisiae yeast strain.
M3.1. The composition of any one of embodiments M1 to M3, wherein the polynucleotide subsequences encoding the phosphogluconate dehydratase enzyme and the 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme independently are from an Escherichia spp. microbe or Pseudomonas spp. microbe.
M4. The composition of embodiment M3, wherein the Escherichia spp. microbe is an Escherichia coli strain.
M5. The composition of embodiment M3 or M4, wherein the Pseudomonas spp. microbe is a Pseudomonas aeruginosa strain.
M6. The composition of any one of embodiments M1 to M5, wherein the polynucleotide subsequence that encodes the phosphogluconate dehydratase enzyme is an EDD gene.
M7. The composition of any one of embodiments M1 to M5, wherein the polynucleotide subsequence that encodes the 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme is an EDA gene.
M8. The composition of any one of embodiments M1 to M7, wherein the xylose isomerase enzyme is a chimeric enzyme.
M8.1. The composition of embodiment M8, wherein a first portion of the polynucleotide subsequence that encodes the chimeric xylose isomerase enzyme is from a first microbe and a second portion of the polynucleotide subsequence that encodes the chimeric xylose isomerase enzyme is from a second microbe.
M8.2. The composition of embodiment M8 or M8.1, wherein the first microbe, the second microbe, or the first microbe and the second microbe independently are selected from one or more of the group consisting of Clostridiales spp., Ruminococcus spp., Thermus spp., Bacillus spp., Clostridium spp., Orpinomyces spp., Escherichia spp. and Piromyces spp. microbes.
M8.3. The composition of embodiment M8.2, wherein the first microbe, the second microbe, or the first microbe and the second microbe independently are selected from one or more of the group consisting of Clostridiales_genomosp. BVAB3 str UPII9-5, Ruminococcus flavefaciens, Ruminococcus_FD1, Ruminococcus_18P13, Thermus thermophilus, Bacillus stercoris, Clostridium cellulolyticum, Bacillus uniformis, Bacillus stearothermophilus, Bacteroides thetaiotaomicron, Clostridium thermohydrosulfuricum, Orpinomyces, Clostridium phytofermentans, Escherichia coli and Piromyces strain E2.
M8.4. The composition of any one of embodiments M1 to M7, wherein 80% or more of the polynucleotide subsequence that encodes the xylose isomerase enzyme is from a Ruminococcus spp. microbe xylose isomerase-encoding sequence.
M9. The composition of embodiment M8.4, wherein all of the polynucleotide subsequence that encodes the xylose isomerase enzyme is from a Ruminococcus spp. microbe xylose isomerase-encoding sequence.
M10. The composition of any one of embodiments M8.4 to M9, wherein the Ruminococcus spp. microbe is a Ruminococcus flavefaciens strain.
M11. The composition of any one of embodiments M8.4 to M10, wherein the polynucleotide subsequence that encodes the xylose isomerase enzyme is chimeric and includes a sequence that encodes a xylose isomerase from another microbe.
M12. The composition of embodiment M11, wherein the other microbe is a fungus.
M13. The composition of embodiment M12, wherein the fungus is an anaerobic fungus.
M14. The composition of embodiment M12, wherein the fungus is a Piromyces spp. fungus.
M15. The composition of embodiment M14, wherein the Piromyces spp. fungus is a Piromyces strain E2.
M16. The composition of any one of embodiments M1 to M15, wherein the yeast expresses a glucose-6-phosphate dehydrogenase enzyme, a 6-phosphogluconolactonase enzyme, or a glucose-6-phosphate dehydrogenase enzyme and a 6-phosphogluconolactonase enzyme.
M17. The composition of embodiment M16, wherein the polynucleotide subsequences that encode the glucose-6-phosphate dehydrogenase enzyme, the 6-phosphogluconolactonase enzyme, or the glucose-6-phosphate dehydrogenase enzyme and the 6-phosphogluconolactonase enzyme are from a yeast.
M18. The composition of embodiment M17, wherein the yeast from which the polynucleotide subsequence or subsequences are derived is a Saccharomyces spp. yeast.
M19. The composition of embodiment M18, wherein the yeast is a Saccharomyces cerevisiae strain.
M20. The composition of any one of embodiments M16 to M19, wherein the yeast over-expresses an endogenous glucose-6-phosphate dehydrogenase enzyme, an endogenous 6-phosphogluconolactonase enzyme, or an endogenous glucose-6-phosphate dehydrogenase enzyme and an endogenous 6-phosphogluconolactonase enzyme.
M21. The composition of any one of embodiments M16 to M20, wherein the glucose-6-phosphate dehydrogenase enzyme is expressed from a ZWF gene.
M22. The composition of embodiment M21, wherein the ZWF gene is a ZWF1 gene.
M23. The composition of any one of embodiments M16 to M22, wherein the 6-phosphogluconolactonase enzyme is expressed from a SOL gene.
M24. The composition of embodiment M23, wherein the SOL gene is a SOL3 gene.
M25. The composition of any one of embodiments M1 to M24, wherein the yeast includes a polynucleotide subsequence that encodes a glucose transporter.
M26. The composition of embodiment M25, wherein the polynucleotide subsequence that encodes the glucose transporter is from a yeast.
M27. The composition of embodiment M25 or M26, wherein the yeast over-expresses one or more endogenous glucose transport enzymes.
M28. The composition of any one of embodiments M25 to M27, wherein the glucose transporter is encoded by one or more of a GAL2, GSX1 and GXF1 gene.
M29. The composition of any one of embodiments M1 to M28, wherein the yeast includes a genetic alteration that reduces an endogenous phosphofructokinase (PFK) enzyme activity.
M29.1. The composition of embodiment M29, wherein the PFK enzyme is a PFK-2 enzyme.
M30. The composition of any one of embodiments M1 to M29, wherein the yeast includes one or more extra copies of an endogenous promoter, or a heterologous promoter operable in a yeast, wherein the promoter is in operable connection with one or more of the polynucleotide subsequences.
M31. The composition of embodiment M30, wherein the promoter is selected from promoters that regulate glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1), phosphoglucokinase (PGK-1) and triose phosphate dehydrogenase (TDH-1).
M32. The composition of any one of embodiments M1 to M31, wherein the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters are not integrated in the yeast nucleic acid.
M33. The composition of embodiment M32, wherein the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters are in one or more plasmids.
M34. The composition of any one of embodiments M1 to M31, wherein the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters are integrated in genomic DNA of the yeast.
M35. The composition of embodiment M34, wherein the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters are integrated in a transposition integration event, in a homologous recombination integration event, or in a transposition integration event and a homologous recombination integration event.
M36. The composition of embodiment M35, wherein the transposition integration event includes transposition of an operon comprising two or more of the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters.
M37. The composition of embodiment M34, wherein the homologous recombination integration event includes homologous recombination of an operon comprising two or more of the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters.
N1. A method, comprising contacting an engineered yeast of any one of embodiments M1 to M37 with a feedstock that contains one or more hexose sugars and one or more pentose sugars under conditions in which the microbe synthesizes ethanol.
N2. The method of embodiment N1, wherein the engineered yeast synthesizes ethanol to about 85% to about 99% of theoretical yield.
N3. The method of embodiment N1 or N2, comprising recovering ethanol synthesized by the engineered yeast.
N4. The method of any one of embodiments N1 to N3, wherein the conditions are fermentation conditions.
O1. A composition comprising a nucleic acid comprising heterologous polynucleotides that encode a phosphogluconate dehydratase enzyme, a 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme and a xylose isomerase enzyme.
O2. The composition of embodiment O1, wherein the yeast is a Saccharomyces spp. yeast.
O3. The composition of embodiment O2, wherein the yeast is a Saccharomyces cerevisiae yeast strain.
O3.1. The composition of any one of embodiments O1 to O3, wherein the polynucleotides encoding the phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme independently are from an Escherichia spp. microbe or Pseudomonas spp. microbe.
O4. The composition of embodiment O3, wherein the Escherichia spp. microbe is an Escherichia coli strain.
O5. The composition of embodiment O3 or O4, wherein the Pseudomonas spp. microbe is a Pseudomonas aeruginosa strain.
O6. The composition of any one of embodiments O1 to O5, wherein the polynucleotide that encodes the phosphogluconate dehydratase enzyme is an EDD gene.
O7. The composition of any one of embodiments O1 to O5, wherein the polynucleotide that encodes the 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme is an EDA gene.
O8. The composition of any one of embodiments O1 to O7, wherein the xylose isomerase enzyme is a chimeric enzyme.
O8.1. The composition of embodiment O8, wherein a first portion of the polynucleotide that encodes the chimeric xylose isomerase enzyme is from a first microbe and a second portion of the polynucleotide that encodes the chimeric xylose isomerase enzyme is from a second microbe.
O8.2. The composition of embodiment O8 or O8.1, wherein the first microbe, the second microbe, or the first microbe and the second microbe independently are selected from one or more of the group consisting of Clostridiales spp., Ruminococcus spp., Thermus spp., Bacillus spp., Clostridium spp., Orpinomyces spp., Escherichia spp. and Piromyces spp. microbes.
O8.3. The composition of embodiment O8.2, wherein the first microbe, the second microbe, or the first microbe and the second microbe independently are selected from one or more of the group consisting of Clostridiales_genomosp. BVAB3 str UPII9-5, Ruminococcus flavefaciens, Ruminococcus_FD1, Ruminococcus_18P13, Thermus thermophilus, Bacillus stercoris, Clostridium cellulolyticum, Bacillus uniformis, Bacillus stearothermophilus, Bacteroides thetaiotaomicron, Clostridium thermohydrosulfuricum, Orpinomyces, Clostridium phytofermentans, Escherichia coli and Piromyces strain E2.
O8.4. The composition of any one of embodiments O1 to O7, wherein 80% or more of the polynucleotide that encodes the xylose isomerase enzyme is from a Ruminococcus spp. microbe xylose isomerase-encoding sequence.
O9. The composition of embodiment O8.4, wherein all or a portion of the polynucleotide that encodes the xylose isomerase enzyme is from a Ruminococcus spp. microbe xylose isomerase-encoding sequence.
O10. The composition of any one of embodiments O8.4 to O9, wherein the Ruminococcus spp. microbe is a Ruminococcus flavefaciens strain.
O11. The composition of any one of embodiments O8.4 to O10, wherein the polynucleotide that encodes the xylose isomerase enzyme is chimeric and includes a sequence that encodes a xylose isomerase from another microbe.
O11.1. The composition of any one of embodiments O8.4 to O11, wherein the portion of the polynucleotide from the Ruminococcus spp. microbe xylose isomerase is 3' with respect to the portion of the polynucleotide from another microbe.
O12. The composition of embodiment O11 or O11.1, wherein the other microbe is a fungus.
O13. The composition of embodiment O12, wherein the fungus is an anaerobic fungus.
O14. The composition of embodiment O12, wherein the fungus is a Piromyces spp. fungus.
O15. The composition of embodiment O14, wherein the Piromyces spp. fungus is a Piromyces strain E2.
O16. The composition of any one of embodiments O1 to O15, wherein the nucleic acid includes one or more polynucleotides that encode a glucose-6-phosphate dehydrogenase enzyme, a 6-phosphogluconolactonase enzyme, or a glucose-6-phosphate dehydrogenase enzyme and a 6-phosphogluconolactonase enzyme.
O17. The composition of embodiment O16, wherein the one or more polynucleotides that encode the glucose-6-phosphate dehydrogenase enzyme, the 6-phosphogluconolactonase enzyme, or the glucose-6-phosphate dehydrogenase enzyme and the 6-phosphogluconolactonase enzyme are from a yeast.
O18. The composition of embodiment O17, wherein the yeast from which the polynucleotide or polynucleotides are derived is a Saccharomyces spp. yeast.
O19. The composition of embodiment O18, wherein the yeast is a Saccharomyces cerevisiae strain.
O20. The composition of any one of embodiments O16 to O19, wherein the nucleic acid includes one or more polynucleotides that encode an endogenous glucose-6-phosphate dehydrogenase enzyme, an endogenous 6-phosphogluconolactonase enzyme, or an endogenous glucose-6-phosphate dehydrogenase enzyme and an endogenous 6-phosphogluconolactonase enzyme.
O21. The composition of any one of embodiments O16 to O20, wherein the glucose-6-phosphate dehydrogenase enzyme is expressed from a ZWF gene.
O22. The composition of embodiment O21, wherein the ZWF gene is a ZWF1 gene.
O23. The composition of any one of embodiments O16 to O22, wherein the 6-phosphogluconolactonase enzyme is expressed from a SOL gene.
O24. The composition of embodiment O23, wherein the SOL gene is a SOL3 gene.
O25. The composition of any one of embodiments O1 to O24, wherein the nucleic acid includes one or more polynucleotides that encode one or more glucose transporters.
O26. The composition of embodiment O25, wherein the polynucleotide that encodes the one or more glucose transporters is from a yeast.
O27. The composition of embodiment O25 or O26, wherein the one or more glucose transporters is encoded by one or more of a GAL2, GSX1 and GXF1 gene.
O28. The composition of any one of embodiments O1 to O27, wherein the nucleic acid includes one or more polynucleotides that encode a transketolase enzyme, transaldolase enzyme, or a transketolase enzyme and transaldolase enzyme.
O29. The composition of embodiment O28, wherein the transketolase enzyme is encoded by a TKL1 coding sequence or a TKL2 coding sequence.
O30. The composition of embodiment O28, wherein the transaldolase is encoded by a TAL coding sequence.
O31. The composition of any one of embodiments O28 to O30, wherein the transketolase enzyme or the transaldolase enzyme is from a yeast.
O32. The composition of any one of embodiments O1 to O31, wherein the nucleic acid includes one or more promoters operable in a yeast, wherein the promoter is in operable connection with one or more of the polynucleotides.
O33. The composition of embodiment O32, wherein the promoter is selected from promoters that regulate glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1), phosphoglucokinase (PGK-1) and triose phosphate dehydrogenase (TDH-1).
O34. The composition of any one of embodiments O1 to O33, wherein the nucleic acid includes one or more polynucleotides that homologously combine in a gene of a host that encodes a phosphofructokinase (PFK) enzyme, phosphoglucoisomerase (PGI) enzyme or 6-phosphogluconate dehydrogenase (decarboxylating) enzyme.
O35. The composition of embodiment O34, wherein the phosphofructokinase (PFK) enzyme is a PFK-2 enzyme or PFK-1 enzyme.
O35.1. The composition of embodiment O34, wherein the 6-phosphogluconate dehydrogenase (decarboxylating) enzyme is encoded by a GND-1 gene or a GND-2 gene.
O36. The composition of embodiment O34, wherein the PGI is encoded by a PGI-1 gene.
O37. The composition of any one of embodiments O1 to O36, wherein the nucleic acid is one or two separate nucleic acid molecules.
O38. The composition of embodiment O37, wherein each nucleic acid molecule includes one or two or more of the polynucleotide subsequences, one or two or more of the promoters, or one or two or more of the polynucleotide subsequences and one or two or more of the promoters.
O39. The composition of embodiment O37 or O38, wherein each of the one or two nucleic acid molecules is in circular form.
O40. The composition of embodiment O37 or O38, wherein each of the one or two nucleic acid molecules is in linear form.
O41. The composition of any one of embodiments O37 to O40, wherein each of the one or two nucleic acid molecules functions as an expression vector.
O42. The composition of any one of embodiments O37 to O41, wherein each of the one or two nucleic acid molecules includes flanking sequences for integrating the polynucleotides, the promoter sequences, or the polynucleotides and the promoter sequences in the nucleic acid into genomic DNA of a host organism.
P1. A composition comprising an engineered yeast that includes an alteration that adds or increases a phosphogluconate dehydratase activity, a 2-keto-3-deoxygluconate-6-phosphate aldolase activity and a xylose isomerase activity.
P2. The composition of embodiment P1 , wherein the yeast is a Saccharomyces spp. yeast.
P3. The composition of embodiment P2, wherein the yeast is a Saccharomyces cerevisiae yeast strain.
P4. The composition of any one of embodiments P1 to P3 that includes heterologous polynucleotides that encode a phosphogluconate dehydratase enzyme, a 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme and a xylose isomerase enzyme.
P5. The composition of embodiment P4, wherein the polynucleotides encoding the phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme independently are from an Escherichia spp. microbe or Pseudomonas spp. microbe.
P6. The composition of embodiment P5, wherein the Escherichia spp. microbe is an Escherichia coli strain.
P7. The composition of embodiment P5, wherein the Pseudomonas spp. microbe is a Pseudomonas aeruginosa strain.
P8. The composition of any one of embodiments P4 to P7, wherein the polynucleotide that encodes the phosphogluconate dehydratase enzyme is an EDD gene.
P9. The composition of any one of embodiments P4 to P7, wherein the polynucleotide that encodes the 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme is an EDA gene.
P10. The composition of any one of embodiments P1 to P9, wherein the xylose isomerase enzyme is a chimeric enzyme.
P11. The composition of embodiment P10, wherein a first portion of the polynucleotide that encodes the chimeric xylose isomerase enzyme is from a first microbe and a second portion of the polynucleotide that encodes the chimeric xylose isomerase enzyme is from a second microbe.
P12. The composition of embodiment P10 or P11, wherein the first microbe, the second microbe, or the first microbe and the second microbe independently are selected from one or more of the group consisting of Clostridiales spp., Ruminococcus spp., Thermus spp., Bacillus spp., Clostridium spp., Orpinomyces spp., Escherichia spp. and Piromyces spp. microbes.
P13. The composition of embodiment P12, wherein the first microbe, the second microbe, or the first microbe and the second microbe independently are selected from one or more of the group consisting of Clostridiales_genomosp. BVAB3 str UPII9-5, Ruminococcus flavefaciens, Ruminococcus_FD1, Ruminococcus_18P13, Thermus thermophilus, Bacillus stercoris, Clostridium cellulolyticum, Bacillus uniformis, Bacillus stearothermophilus, Bacteroides thetaiotaomicron, Clostridium thermohydrosulfuricum, Orpinomyces, Clostridium phytofermentans, Escherichia coli and Piromyces strain E2.
P14. The composition of any one of embodiments P10 to P13, wherein 80% or more of the polynucleotide that encodes the xylose isomerase enzyme is from a Ruminococcus spp. microbe xylose isomerase-encoding sequence.
P15. The composition of embodiment P14, wherein all or a portion of the polynucleotide that encodes the xylose isomerase enzyme is from a Ruminococcus spp. microbe xylose isomerase-encoding sequence.
P16. The composition of embodiment P15, wherein the Ruminococcus spp. microbe is a Ruminococcus flavefaciens strain.
P17. The composition of any one of embodiments P10 to P16, wherein the polynucleotide that encodes the xylose isomerase enzyme is chimeric and includes a sequence that encodes a xylose isomerase from another microbe.
P18. The composition of any one of embodiments P10 to P17, wherein the portion of the polynucleotide from the Ruminococcus spp. microbe xylose isomerase is 3' with respect to the portion of the polynucleotide from another microbe.
P19. The composition of embodiment P17 or P18, wherein the other microbe is a fungus.
P20. The composition of embodiment P19, wherein the fungus is an anaerobic fungus.
P21 . The composition of embodiment P20, wherein the fungus is a Piromyces spp. fungus.
P22. The composition of embodiment P21, wherein the Piromyces spp. fungus is a Piromyces strain E2.
P23. The composition of any one of embodiments P1 to P22, wherein one or more of the following activities are added or increased: a glucose-6-phosphate dehydrogenase activity, a 6-phosphogluconolactonase activity, or a glucose-6-phosphate dehydrogenase activity and a 6-phosphogluconolactonase activity.
P24. The composition of embodiment P23, wherein the yeast comprises one or more heterologous polynucleotides that encode one or more of the following enzymes, or wherein the yeast comprises multiple copies of endogenous polynucleotides that encode one or more of the following enzymes: glucose-6-phosphate dehydrogenase enzyme, 6-phosphogluconolactonase enzyme, or glucose-6-phosphate dehydrogenase enzyme and 6-phosphogluconolactonase enzyme.
P25. The composition of embodiment P24, wherein the one or more polynucleotides that encode the glucose-6-phosphate dehydrogenase enzyme, the 6-phosphogluconolactonase enzyme, or the glucose-6-phosphate dehydrogenase enzyme and the 6-phosphogluconolactonase enzyme are from a yeast.
P26. The composition of embodiment P25, wherein the yeast is a Saccharomyces spp. yeast. P27. The composition of embodiment P26, wherein the yeast is a Saccharomyces cerevisiae strain.
P28. The composition of any one of embodiments P24 to P27, wherein the glucose-6-phosphate dehydrogenase enzyme is expressed from a ZWF gene.
P29. The composition of embodiment P28, wherein the ZWF gene is a ZWF1 gene.
P30. The composition of any one of embodiments P24 to P29, wherein the 6-phosphogluconolactonase enzyme is expressed from a SOL gene.
P31. The composition of embodiment P30, wherein the SOL gene is a SOL3 gene.
P32. The composition of any one of embodiments P1 to P31 , wherein the nucleic acid includes one or more polynucleotides that encode one or more glucose transporters. P33. The composition of embodiment P32, wherein the polynucleotide that encodes the one or more glucose transporters is from a yeast.
P34. The composition of embodiment P32 or P33, wherein the one or more glucose transporters is encoded by one or more of a GAL2, GSX1 and GXF1 gene.
P35. The composition of any one of embodiments P1 to P34, wherein the yeast includes one or more added activities or increased activities selected from the group consisting of transketolase activity, transaldolase activity, or a transketolase activity and transaldolase activity.
P36. The composition of embodiment P35, wherein the yeast includes one or more heterologous polynucleotides that encode one or more of the following enzymes, or includes multiple copies of polynucleotides that encode one or more of the following enzymes: transketolase enzyme, transaldolase enzyme, or a transketolase enzyme and transaldolase enzyme.
P37. The composition of embodiment P36, wherein the transketolase enzyme is encoded by a TKL1 coding sequence or a TKL2 coding sequence.
P38. The composition of embodiment P36, wherein the transaldolase is encoded by a TAL coding sequence.
P39. The composition of any one of embodiments P36 to P38, wherein the transketolase enzyme or the transaldolase enzyme is from a yeast. P40. The composition of any one of embodiments P1 to P39, wherein the nucleic acid includes one or more promoters operable in a yeast, wherein the promoter is in operable connection with one or more of the polynucleotides.
P41 . The composition of embodiment P40, wherein the promoter is selected from promoters that regulate glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1 ), phosphoglucokinase (PGK-1 ) and triose phosphate dehydrogenase (TDH-1 ).
P42. The composition of any one of embodiments P1 to P41 , wherein the yeast includes a reduction in one or more of the following activities: phosphofructokinase (PFK) activity, phosphoglucoisomerase (PGI) activity, 6-phosphogluconate dehydrogenase (decarboxylating) activity or combination thereof.
P43. The composition of embodiment P42, wherein the yeast includes an alteration in one or more polynucleotides that inhibits production of one or more enzymes selected from the group consisting of phosphofructokinase (PFK) enzyme, phosphoglucoisomerase (PGI) enzyme, 6-phosphogluconate dehydrogenase (decarboxylating) enzyme or combination thereof.
P44. The composition of embodiment P43, wherein the phosphofructokinase (PFK) enzyme is a PFK-2 enzyme or PFK-1 enzyme.
P44.1 . The composition of embodiment P43, wherein the 6-phosphogluconate dehydrogenase (decarboxylating) enzyme is encoded by a GND-1 gene or GND-2 gene. P45. The composition of embodiment P43, wherein the PGI is encoded by a PGI-1 gene.
P46. The composition of any one of embodiments P1 to P45, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are not integrated in the yeast nucleic acid. P47. The composition of embodiment P46, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are in one or more plasmids.
P48. The composition of any one of embodiments P1 to P47, wherein the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters are integrated in genomic DNA of the yeast.
P49. The composition of embodiment P48, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are integrated in a transposition integration event, in a homologous recombination integration event, or in a transposition integration event and a homologous recombination integration event.
P50. The composition of embodiment P49, wherein the transposition integration event includes transposition of an operon comprising two or more of the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters.
P51. The composition of embodiment P49, wherein the homologous recombination integration event includes homologous recombination of an operon comprising two or more of the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters.
Q1 . A method, comprising contacting an engineered yeast of any one of embodiments P1 to P51 with a feedstock that contains one or more hexose sugars and one or more pentose sugars under conditions in which the microbe synthesizes ethanol. Q2. The method of embodiment Q1 , wherein the engineered yeast synthesizes ethanol to about 85% to about 99% of theoretical yield.
Q3. The method of embodiment Q1 or Q2, comprising recovering ethanol synthesized by the engineered yeast.
Q4. The method of any one of embodiments Q1 to Q3, wherein the conditions are fermentation conditions.
R1 . A composition comprising a nucleic acid comprising heterologous polynucleotides that encode a phosphogluconate dehydratase enzyme, a 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme and a 6-phosphogluconolactonase enzyme.
R2. The composition of embodiment R1 , wherein the yeast is a Saccharomyces spp. yeast. R3. The composition of embodiment R2, wherein the yeast is a Saccharomyces cerevisiae yeast strain.
R3.1 . The composition of any one of embodiments R1 to R3, wherein the polynucleotides encoding the phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme independently are from an Escherichia spp. microbe or Pseudomonas spp. microbe.
R4. The composition of embodiment R3, wherein the Escherichia spp. microbe is an Escherichia coli strain. R5. The composition of embodiment R3 or R4, wherein the Pseudomonas spp. microbe is a Pseudomonas aeruginosa strain.
R6. The composition of any one of embodiments R1 to R5, wherein the polynucleotide that encodes the phosphogluconate dehydratase enzyme is an EDD gene.
R7. The composition of any one of embodiments R1 to R5, wherein the polynucleotide that encodes the 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme is an EDA gene.
R8. The composition of any one of embodiments R1 to R7, wherein the 6-phosphogluconolactonase enzyme is expressed from a SOL gene.
R9. The composition of embodiment R8, wherein the SOL gene is a SOL3 gene. R10. The composition of any one of embodiments R1 to R9, wherein the nucleic acid includes a polynucleotide that encodes a glucose-6-phosphate dehydrogenase enzyme.
R11. The composition of embodiment R10, wherein the polynucleotide that encodes the glucose-6-phosphate dehydrogenase enzyme is from a yeast.
R12. The composition of embodiment R11, wherein the yeast is a Saccharomyces spp. yeast.
R13. The composition of embodiment R12, wherein the yeast is a Saccharomyces cerevisiae strain.
R14. The composition of any one of embodiments R10 to R13, wherein the nucleic acid includes a polynucleotide that encodes an endogenous glucose-6-phosphate dehydrogenase enzyme.
R15. The composition of any one of embodiments R10 to R14, wherein the glucose-6-phosphate dehydrogenase enzyme is expressed from a ZWF gene.
R16. The composition of embodiment R15, wherein the ZWF gene is a ZWF1 gene. R17. The composition of any one of embodiments R1 to R16, wherein the nucleic acid includes one or more promoters operable in a yeast, wherein the promoter is in operable connection with one or more of the polynucleotides. R18. The composition of embodiment R17, wherein the promoter is selected from promoters that regulate glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1 ), phosphoglucokinase (PGK-1 ) and triose phosphate dehydrogenase (TDH-1 ).
R19. The composition of any one of embodiments R1 to R18, wherein the nucleic acid includes one or more polynucleotides that homologously combine in a gene of a host that encodes a phosphofructokinase (PFK) enzyme, phosphoglucoisomerase (PGI) enzyme, 6-phosphogluconate dehydrogenase (decarboxylating) enzyme, transketolase enzyme, transaldolase enzyme, or combination thereof. R20. The composition of embodiment R19, wherein the transketolase enzyme is encoded by a TKL-1 coding sequence or a TKL-2 coding sequence.
R21 . The composition of embodiment R19, wherein the transaldolase is encoded by a TAL-1 coding sequence.
R22. The composition of embodiment R19, wherein the phosphofructokinase (PFK) enzyme is a PFK-2 enzyme or PFK-1 enzyme.
R23. The composition of embodiment R19, wherein the 6-phosphogluconate dehydrogenase (decarboxylating) enzyme is encoded by a GND-1 gene or a GND-2 gene.
R24. The composition of embodiment R19, wherein the PGI is encoded by a PGI-1 gene.
R25. The composition of any one of embodiments R1 to R24, wherein the nucleic acid is one or two separate nucleic acid molecules.
R26. The composition of embodiment R25, wherein each nucleic acid molecule includes one or two or more of the polynucleotide subsequences, one or two or more of the promoters, or one or two or more of the polynucleotide subsequences and one or two or more of the promoters.
R27. The composition of embodiment R25 or R26, wherein each of the one or two nucleic acid molecules is in circular form.
R28. The composition of embodiment R25 or R26, wherein each of the one or two nucleic acid molecules is in linear form.
R29. The composition of any one of embodiments R25 to R28, wherein each of the one or two nucleic acid molecules functions as an expression vector.
R30. The composition of any one of embodiments R25 to R29, wherein each of the one or two nucleic acid molecules includes flanking sequences for integrating the polynucleotides, the promoter sequences, or the polynucleotides and the promoter sequences in the nucleic acid into genomic DNA of a host organism.
S1. A composition comprising an engineered yeast that includes an alteration that adds or increases a phosphogluconate dehydratase activity, a 2-keto-3-deoxygluconate-6-phosphate aldolase activity and a 6-phosphogluconolactonase activity.
S2. The composition of embodiment S1, wherein the yeast is a Saccharomyces spp. yeast.
S3. The composition of embodiment S2, wherein the yeast is a Saccharomyces cerevisiae yeast strain.
S4. The composition of any one of embodiments S1 to S3 that includes heterologous polynucleotides that encode a phosphogluconate dehydratase enzyme, a 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme and a 6-phosphogluconolactonase enzyme.
S5. The composition of embodiment S4, wherein the polynucleotides encoding the phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme independently are from an Escherichia spp. microbe or Pseudomonas spp. microbe.
S6. The composition of embodiment S5, wherein the Escherichia spp. microbe is an Escherichia coli strain.
S7. The composition of embodiment S5, wherein the Pseudomonas spp. microbe is a Pseudomonas aeruginosa strain.
S8. The composition of any one of embodiments S4 to S7, wherein the polynucleotide that encodes the phosphogluconate dehydratase enzyme is an EDD gene.
S9. The composition of any one of embodiments S4 to S7, wherein the polynucleotide that encodes the 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme is an EDA gene.
S10. The composition of any one of embodiments S4 to S9, wherein the 6-phosphogluconolactonase enzyme is expressed from a SOL gene.
S11. The composition of embodiment S10, wherein the SOL gene is a SOL3 gene.
S12. The composition of any one of embodiments S4 to S11, wherein a glucose-6-phosphate dehydrogenase activity is added or increased.
S13. The composition of embodiment S12, wherein the yeast comprises a heterologous polynucleotide that encodes a glucose-6-phosphate dehydrogenase enzyme, or wherein the yeast comprises multiple copies of an endogenous polynucleotide that encodes a glucose-6-phosphate dehydrogenase enzyme.
S14. The composition of embodiment S13, wherein the polynucleotide that encodes the glucose-6-phosphate dehydrogenase enzyme is from a yeast.
S15. The composition of embodiment S14, wherein the yeast is a Saccharomyces spp. yeast.
S16. The composition of embodiment S15, wherein the yeast is a Saccharomyces cerevisiae strain.
S17. The composition of any one of embodiments S13 to S16, wherein the glucose-6-phosphate dehydrogenase enzyme is expressed from a ZWF gene.
S18. The composition of embodiment S17, wherein the ZWF gene is a ZWF1 gene.
S19. The composition of any one of embodiments S1 to S18, wherein the nucleic acid includes one or more promoters operable in a yeast, wherein the promoter is in operable connection with one or more of the polynucleotides.
S20. The composition of embodiment S19, wherein the promoter is selected from promoters that regulate glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1), phosphoglucokinase (PGK-1) and triose phosphate dehydrogenase (TDH-1).
S21. The composition of any one of embodiments S1 to S20, wherein the yeast includes a reduction in one or more of the following activities: phosphofructokinase (PFK) activity, phosphoglucoisomerase (PGI) activity, 6-phosphogluconate dehydrogenase (decarboxylating) activity, transketolase activity, transaldolase activity, or combination thereof.
S22. The composition of embodiment S21, wherein the yeast includes an alteration in one or more polynucleotides that inhibits production of one or more enzymes selected from the group consisting of phosphofructokinase (PFK) enzyme, phosphoglucoisomerase (PGI) enzyme, 6-phosphogluconate dehydrogenase (decarboxylating) enzyme, transketolase enzyme, transaldolase enzyme, or combination thereof.
S23. The composition of embodiment S22, wherein the transketolase enzyme is encoded by a TKL-1 coding sequence or a TKL-2 coding sequence.
S24. The composition of embodiment S22, wherein the transaldolase is encoded by a TAL-1 coding sequence.
S25. The composition of embodiment S22, wherein the phosphofructokinase (PFK) enzyme is a PFK-2 enzyme or PFK-1 enzyme.
S26. The composition of embodiment S22, wherein the 6-phosphogluconate dehydrogenase (decarboxylating) enzyme is encoded by a GND-1 gene or GND-2 gene.
S27. The composition of embodiment S22, wherein the PGI is encoded by a PGI-1 gene.
S28. The composition of any one of embodiments S1 to S27, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are not integrated in the yeast nucleic acid.
S29. The composition of embodiment S28, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are in one or more plasmids.
S30. The composition of any one of embodiments S1 to S29, wherein the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters are integrated in genomic DNA of the yeast.
S31. The composition of embodiment S30, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are integrated in a transposition integration event, in a homologous recombination integration event, or in a transposition integration event and a homologous recombination integration event.
S32. The composition of embodiment S31, wherein the transposition integration event includes transposition of an operon comprising two or more of the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters.
S33. The composition of embodiment S31, wherein the homologous recombination integration event includes homologous recombination of an operon comprising two or more of the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters.
T1. A method, comprising contacting an engineered yeast of any one of embodiments S1 to S33 with a feedstock that contains one or more hexose sugars under conditions in which the microbe synthesizes ethanol.
T2. The method of embodiment T1 , wherein the engineered yeast synthesizes ethanol to about 85% to about 99% of theoretical yield.
T3. The method of embodiment T1 or T2, comprising recovering ethanol synthesized by the engineered yeast. T4. The method of any one of embodiments T1 to T3, wherein the conditions are fermentation conditions.
U1 . A composition comprising an engineered yeast that includes an alteration that adds or increases a xylose isomerase activity and a glucose transporter activity.
U2. The composition of embodiment U1 , wherein the yeast is a Saccharomyces spp. yeast.
U3. The composition of embodiment U2, wherein the yeast is a Saccharomyces cerevisiae yeast strain.
U4. The composition of any one of embodiments U1 to U3 that includes heterologous polynucleotides that encode a xylose isomerase enzyme and a glucose transport enzyme.
U5. The composition of any one of embodiments U1 to U4, wherein the xylose isomerase enzyme is a chimeric enzyme.
U6. The composition of embodiment U5, wherein a first portion of the polynucleotide that encodes the chimeric xylose isomerase enzyme is from a first microbe and a second portion of the polynucleotide that encodes the chimeric xylose isomerase enzyme is from a second microbe.
U7. The composition of embodiment U5 or U6, wherein the first microbe, the second microbe, or the first microbe and the second microbe independently are selected from one or more of the group consisting of Clostridiales spp., Ruminococcus spp., Thermus spp., Bacillus spp., Clostridium spp., Orpinomyces spp., Escherichia spp. and Piromyces spp. microbes.
U8. The composition of embodiment U7, wherein the first microbe, the second microbe, or the first microbe and the second microbe independently are selected from one or more of the group consisting of Clostridiales_genomosp. BVAB3 str UPII9-5, Ruminococcus flavefaciens, Ruminococcus_FD1, Ruminococcus_18P13, Thermus thermophilus, Bacillus stercoris, Clostridium cellulolyticum, Bacillus uniformis, Bacillus stearothermophilus, Bacteroides thetaiotaomicron, Clostridium thermohydrosulfuricum, Orpinomyces, Clostridium phytofermentans, Escherichia coli and Piromyces strain E2.
U9. The composition of any one of embodiments U5 to U8, wherein 80% or more of the polynucleotide that encodes the xylose isomerase enzyme is from a Ruminococcus spp. microbe xylose isomerase-encoding sequence.
U10. The composition of embodiment U9, wherein all or a portion of the polynucleotide that encodes the xylose isomerase enzyme is from a Ruminococcus spp. microbe xylose isomerase-encoding sequence.
U11. The composition of embodiment U10, wherein the Ruminococcus spp. microbe is a Ruminococcus flavefaciens strain.
U12. The composition of any one of embodiments U5 to U12, wherein the polynucleotide that encodes the xylose isomerase enzyme is chimeric and includes a sequence that encodes a xylose isomerase from another microbe.
U13. The composition of any one of embodiments U5 to U12, wherein the portion of the polynucleotide from the Ruminococcus spp. microbe xylose isomerase is 3' with respect to the portion of the polynucleotide from another microbe.
U14. The composition of embodiment U12 or U13, wherein the other microbe is a fungus.
U15. The composition of embodiment U14, wherein the fungus is an anaerobic fungus.
U16. The composition of embodiment U15, wherein the fungus is a Piromyces spp. fungus.
U17. The composition of embodiment U16, wherein the Piromyces spp. fungus is a Piromyces strain E2.
U18. The composition of any one of embodiments U4 to U17, wherein the polynucleotide that encodes the one or more glucose transporters is from a yeast.
U19. The composition of embodiment U18, wherein the one or more glucose transporters is encoded by one or more of a GAL2, GSX1 and GXF1 gene.
U20. The composition of any one of embodiments U1 to U19, wherein the yeast includes one or more added activities or increased activities selected from the group consisting of transketolase activity, transaldolase activity, or a transketolase activity and transaldolase activity.
U21. The composition of embodiment U20, wherein the yeast includes one or more heterologous polynucleotides that encode one or more of the following enzymes, or includes multiple copies of polynucleotides that encode one or more of the following enzymes: transketolase enzyme, transaldolase enzyme, or a transketolase enzyme and transaldolase enzyme.
U22. The composition of embodiment U21, wherein the transketolase enzyme is encoded by a TKL1 coding sequence or a TKL2 coding sequence.
U23. The composition of embodiment U21 , wherein the transaldolase is encoded by a TAL coding sequence.
U24. The composition of any one of embodiments U21 to U23, wherein the transketolase enzyme or the transaldolase enzyme is from a yeast.
U25. The composition of any one of embodiments U1 to U24, wherein the nucleic acid includes one or more promoters operable in a yeast, wherein the promoter is in operable connection with one or more of the polynucleotides.
U26. The composition of embodiment U25, wherein the promoter is selected from promoters that regulate glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1), phosphoglucokinase (PGK-1) and triose phosphate dehydrogenase (TDH-1).
U27. The composition of any one of embodiments U1 to U26, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are not integrated in the yeast nucleic acid.
U28. The composition of embodiment U27, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are in one or more plasmids.
U29. The composition of any one of embodiments U1 to U28, wherein the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters are integrated in genomic DNA of the yeast.
U30. The composition of embodiment U29, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are integrated in a transposition integration event, in a homologous recombination integration event, or in a transposition integration event and a homologous recombination integration event.
U31. The composition of embodiment U30, wherein the transposition integration event includes transposition of an operon comprising two or more of the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters.
U32. The composition of embodiment U30, wherein the homologous recombination integration event includes homologous recombination of an operon comprising two or more of the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters.
V1 . A method, comprising contacting an engineered yeast of any one of embodiments U1 to U32 with a feedstock that contains one or more pentose sugars under conditions in which the microbe synthesizes ethanol.
V2. The method of embodiment V1 , wherein the engineered yeast synthesizes ethanol to about 85% to about 99% of theoretical yield.
V3. The method of embodiment V1 or V2, comprising recovering ethanol synthesized by the engineered yeast.
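The "percent of theoretical yield" recited in embodiments T2 and V2 can be illustrated with a short calculation. The sketch below is illustrative only and is not part of the embodiments. It assumes the standard fermentation stoichiometries (2 mol ethanol per mol glucose; 5 mol ethanol per 3 mol xylose) and standard molar masses. The function names are hypothetical, and treating "about 85% to about 99%" as exact bounds is a simplifying assumption.

```python
# Standard molar masses in g/mol.
M_ETHANOL = 46.07
M_GLUCOSE = 180.16   # a hexose, C6H12O6
M_XYLOSE = 150.13    # a pentose, C5H10O5


def theoretical_yield(sugar_molar_mass, mol_ethanol_per_mol_sugar):
    """Maximum grams of ethanol per gram of sugar for a given stoichiometry."""
    return mol_ethanol_per_mol_sugar * M_ETHANOL / sugar_molar_mass


def percent_of_theoretical(ethanol_g, sugar_consumed_g, max_yield):
    """Observed ethanol as a percentage of the stoichiometric maximum."""
    return 100.0 * ethanol_g / (sugar_consumed_g * max_yield)


def within_range(percent, low=85.0, high=99.0):
    """Assumption: 'about 85% to about 99%' is treated as exact bounds."""
    return low <= percent <= high


# C6H12O6 -> 2 C2H5OH + 2 CO2
GLUCOSE_MAX = theoretical_yield(M_GLUCOSE, 2.0)
# 3 C5H10O5 -> 5 C2H5OH + 5 CO2
XYLOSE_MAX = theoretical_yield(M_XYLOSE, 5.0 / 3.0)

if __name__ == "__main__":
    # Hypothetical run: 46.0 g ethanol recovered from 100.0 g glucose consumed.
    pct = percent_of_theoretical(46.0, 100.0, GLUCOSE_MAX)
    print(f"glucose max yield: {GLUCOSE_MAX:.3f} g/g")
    print(f"xylose max yield:  {XYLOSE_MAX:.3f} g/g")
    print(f"percent of theoretical: {pct:.1f}%, in range: {within_range(pct)}")
```

Both sugars give nearly the same mass-based maximum (roughly 0.51 g ethanol per g sugar), so the same percentage range can be read against either a hexose or a pentose feedstock.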
W1 . A composition comprising a synthetic nucleic acid that includes a polynucleotide selected from the group of twenty (20) polynucleotides consisting of:
CACGCACGGACCGACCGTCACCGGACCGTTTCGCGCGACGTGCGCGAGGCTCCGACACGAA AGACGGGCCCCCTATTGCGCTCATGTCGGCCGCACCCCTGCGTAAAGTCAGATACGTGCGC CACCCGAGCCGGGACCGCCCTGAGCGCATGGTCCGGGCGGCGTGGCAAGCGCAGGAGGGCG TGCCCCGTTCGCTAGGCA
ACGTATGTCGGCTGATCGTACACGCCGACCAGCGCAGTCGGCGTACTCAGGCGTTCCGAGT AGCTCACATCTGTGGGCCCCGGCGTACCTTCGGCAGGGTTATGCGACGGGGCGGCAGGCTT GCGCTGGCGTCGGGAATCACCGCGAACTTGACCCGCGCCGGTTCCGTATCGGTCCGCTGCG GCCGTGCTCCGCAGTCGA
TGCAGTCCGCCCAGCCGGCCGTGTAGCACGGCCGACTGCAGGTGCGACGTGCTAGGGGCCA GCACGCGAGCGGCCCTACCACGGGTCGTGTGGGGCGCATGACCGCCGGCCGGGTCTCGGCA CGGGGCGACGCGGTGCTCCTAGGCTAGCAGGGCCTCACCGGGTGATCCCCCGTGTAGCGCC GCACAACACCCCCTGCGA
TGCCCGCATACCGCCCGCCCACTGGGGATCCTCCGGCGCTGTCGCGCTATGCGCGTCCATC CTGGTCGGACGGGCTCGGGCCCCGGACCAAACCGCAGCGGCCCCTGGCAGCGACTAAGGGC GCCGTCTCACCCTAGACTTCTTAATCGGGGTGTCCCGGTAGGCCGGGAGTAGCCTCGGCGG GCTAGCCGCGTGACTATA
GCGGGTTAGTCCCCGTCGGACGTCATGCATACAGTCGGGGCTGGCGAGACAGGAGGCTACA GGGGGCGCCCGGAGGAACACACGTGGGACTAAGACGTCGGTCCGTGTGCCCCCGAACCGGC GTGCTCATCGTAGGACTGGGAAGTCCGTACCGCGTGGCTCGTACCTCGCGGTCTGAGTCCG ACACCCGCTGACGCCGGA
CTGAGACGACTCCCGCACTACGGATCGCGAGCGTAGACTCAGCCCGGACTCTCACGCGACC TCGGACGCGGCCTAATGTCTCGACTCGCGGTCCGCTGAAGGTCTCGGGGCACGCGAGACGC GGGGTCAGGCCGGGGGGATCCCCGCACACACTCAGTCGCGGCGAACGGAGTCCCGTGGCCT GGCTAGGGATCGTGGGTA
GGGGCGTCCACTCTGGCTCGGTAGAGCGCTGGGCTCCGCGCGACTGCGCGCACCCATCGGT TTGGCGCGACGCACCGTGGACTCCTGGGCTAGAGGGCGGGTCCCCGCCATACCCCGTTCTC GTGCCGGCTGGGTAGGACCGGAGTGACGGCTGTGGCCGGCGACTCGGGCGCGCACTGTAGT CGGATCTGGGCGGGCAGA
GTCGGGCGCGCGTCAGTCCACGCGTTAAACACTGGCCGACGACACGACGGGATCCGGGCAC GCCCCGAGAGCGCGTGTTCGCGCGAGTCGATCGGGAGGCCGCAGCGTGTCGAGCCCAGACC CCGCTCTAGCGTGGCCATCGCGGTGCTAAGTGGGGCGGCCGGGTCCTATACACGCTTACCG ATAGTCAAGTTTGCGTGA
GTCTTAGGGCCCAGGGACCGCACGGGTCGACCGCGCGACTGGTCGGAGCTTGCGCGTCTAC GCCACTCGGCGGCCCCGACGGGGGATGCCGCGGAATGTCCGCCGGCGTATGCGGCTCAAGC CGGACCGTCGGACTGCGAAGCGCCGTGAGCACCCCTCGACCTGACCGGACGCGGCGCACCC GTCCGAGTATCGTCGCGA
TCGGGTCTCGCCCGGCGCTAGTCCAGCCGTAGCGCTCTCCGGCGATCACCCCGGAGCACTC TGGAGCCGAGCGGTCGGGTCTGTTGGGCGCGCCGCGGCTACGGACGGCTCGACTCACTGGC GCTCGACCCCGTATCCCCCGTCTCGGACGACGCACCGTTGCGCGGGAACGATCGGCGGCGC TCACACGCACGATCGGAA
CTTAAGGCTGGCGCACCATGAGGGCCGCGCCACGTCCGACCCGCAGCCCGCGCGTAGTAGC CTAGCCGGGCGGGGTTCCTCCCGTGCGTCACCTAGCACGGGGCCTGGCACCGAACGCGAGC CCGTCCGGTCACCGCGGCGGGTCTGCGGACGTCCCCGGTCGCTCGGCTCGGAGTCCCCGCT GGGGATCGCGTCGGGACA
CGACGGCGTAGCACTCGCGGACCTAGGGCGCGCGAGTCGGGGGAGCCCGCGGTGCGACGCT CGGGGAGGAGCTCGCATGCCCAAGGCACGATCTAGGGGGGGGTACGGGGGGCGTCCGTCCG AGCGCCGGGACTGCGATCCGGGGCCACATGCTAACCGGCGGAAGGGGGGACCTAACCGGTG TGGACTCCGGGTAATCCA
CGGGGGGCTGACACGTCTCGGATCGCCCCGTCAGTCAGCCCCCTAGTCCCGGACAGGACGT CGGAGGTCGAGTCCGCACTGTCGGGCCTGCTCGTGGGCACGGCAGGACGCGTCCCCATGGT CAGCCGCCGTGCGATACCTCGCCACGACTCTGAGCCGGGCGCGAGCGTGAGAGCCCGAGCC GCGGTACACGGGGCGTCA
GCGAGCTCGCTCTCGACTCCGGGCTCCCGTGCTGACACGGGGTGCGACCCCGCGGCGATTG TCCGCACGCCTGTCGGACGACGTCGGCCCGTCGTAGTGCCGGTCAGAGGCAGGGGGGCTGC TCGCGCTGGCCGCCTCGTCGCGCGTGGACCCTATGGGGGATCACGCGTGGGGTCGGGATCG GGGACCGCGCGACTTGGA
CGCGCCCCGTAACGGACGCGGTGAGTCGAGCTTACGCGGCTAGGGCCGAGTCGTGTTAGCG TCTCGCGTAAGCGAATGCCACGTCCCCCGCCGCCCGTCGCGCAGCTGGCTACGCAACGCCT CCGCGGCCTCCGTAGCGAGTGCGTGGGACGCTGGCCGTCCGCGTGTTCCGGGACCTGGATG CGGGAGGGACCTAAGGCA
AGAACGTGCGGTCGTCCCCACGCACGGGATGACGGACGGGGTAGACGGGCGTCGTGCGCGC GGGTAGCGTAACCGGTTACAGTCCCCGCAACGCTCTAGCTCCGGCCCTCGCTTAGGAGTTC GCGGCCGAGACATGAGGTGGTCCGGACGGCAGGGGGTCGCGGAGACCGTGGAGCCGATTCT GCCGGACGCCACGTCCCA
CGGGACGCCCCGTACCGTGTACGAAGCCCCGGTCGGTCGGCGGATCGTAGATCCCGGAGCC GACGCCTTGAACCCGGCTTTCCCAGCGACTCGCGCCCCCACTGGGTCCCTCGGGACCCCGC TCCCCCCAGACGCATACAGCCCGCAAGCGGGGGCAGTCTCGGACCGCCCGGACACTGGCCT TAGGCACCGTGGGCTCGA
GTGTCCGGGGCGCATCGGAGCTGTCCGACCGAGTTCCGGGGACGGCGCACGTTGTGCCGGC CTCAGACGGAGCCTGTAGCCCCCGGACAGTGTGTGCCCGCCCACTACGGGTTAGGCACGGG GTTGGTCGGCACGCGTCCTCCGCGTGTCACGGACCGATGCAGACCGCTGGCCGGGAGGTCG CCCCCCCAGGGGTGCACA
CGCGCAGCACGCACGTCCGGGGCACGCGCGGCTCGGAGGGTCCGGGCTGGGACGGGAGGTT TGGAGTCGCGTGCGCGTAGCAGCGCACCCGCCTGGTCGCCGGGTCTAGTAGGGCTGGGTTA CGGAGGACGTGCAGGCGACCCCAACCGTTGACGACGGGTCCGACCACGCCTTTAGCCGTGG CGTGTCCGTCGCGAGCCA
W2. A microorganism comprising a polynucleotide that includes a sequence selected from the group of twenty (20) sequences consisting of
CACGCACGGACCGACCGTCACCGGACCGTTTCGCGCGACGTGCGCGAGGCTCCGACACGAA AGACGGGCCCCCTATTGCGCTCATGTCGGCCGCACCCCTGCGTAAAGTCAGATACGTGCGC CACCCGAGCCGGGACCGCCCTGAGCGCATGGTCCGGGCGGCGTGGCAAGCGCAGGAGGGCG TGCCCCGTTCGCTAGGCA
ACGTATGTCGGCTGATCGTACACGCCGACCAGCGCAGTCGGCGTACTCAGGCGTTCCGAGT AGCTCACATCTGTGGGCCCCGGCGTACCTTCGGCAGGGTTATGCGACGGGGCGGCAGGCTT GCGCTGGCGTCGGGAATCACCGCGAACTTGACCCGCGCCGGTTCCGTATCGGTCCGCTGCG GCCGTGCTCCGCAGTCGA
TGCAGTCCGCCCAGCCGGCCGTGTAGCACGGCCGACTGCAGGTGCGACGTGCTAGGGGCCA GCACGCGAGCGGCCCTACCACGGGTCGTGTGGGGCGCATGACCGCCGGCCGGGTCTCGGCA CGGGGCGACGCGGTGCTCCTAGGCTAGCAGGGCCTCACCGGGTGATCCCCCGTGTAGCGCC GCACAACACCCCCTGCGA
TGCCCGCATACCGCCCGCCCACTGGGGATCCTCCGGCGCTGTCGCGCTATGCGCGTCCATC CTGGTCGGACGGGCTCGGGCCCCGGACCAAACCGCAGCGGCCCCTGGCAGCGACTAAGGGC GCCGTCTCACCCTAGACTTCTTAATCGGGGTGTCCCGGTAGGCCGGGAGTAGCCTCGGCGG GCTAGCCGCGTGACTATA
GCGGGTTAGTCCCCGTCGGACGTCATGCATACAGTCGGGGCTGGCGAGACAGGAGGCTACA GGGGGCGCCCGGAGGAACACACGTGGGACTAAGACGTCGGTCCGTGTGCCCCCGAACCGGC GTGCTCATCGTAGGACTGGGAAGTCCGTACCGCGTGGCTCGTACCTCGCGGTCTGAGTCCG ACACCCGCTGACGCCGGA
CTGAGACGACTCCCGCACTACGGATCGCGAGCGTAGACTCAGCCCGGACTCTCACGCGACC TCGGACGCGGCCTAATGTCTCGACTCGCGGTCCGCTGAAGGTCTCGGGGCACGCGAGACGC GGGGTCAGGCCGGGGGGATCCCCGCACACACTCAGTCGCGGCGAACGGAGTCCCGTGGCCT GGCTAGGGATCGTGGGTA
GGGGCGTCCACTCTGGCTCGGTAGAGCGCTGGGCTCCGCGCGACTGCGCGCACCCATCGGT TTGGCGCGACGCACCGTGGACTCCTGGGCTAGAGGGCGGGTCCCCGCCATACCCCGTTCTC GTGCCGGCTGGGTAGGACCGGAGTGACGGCTGTGGCCGGCGACTCGGGCGCGCACTGTAGT CGGATCTGGGCGGGCAGA
GTCGGGCGCGCGTCAGTCCACGCGTTAAACACTGGCCGACGACACGACGGGATCCGGGCAC GCCCCGAGAGCGCGTGTTCGCGCGAGTCGATCGGGAGGCCGCAGCGTGTCGAGCCCAGACC CCGCTCTAGCGTGGCCATCGCGGTGCTAAGTGGGGCGGCCGGGTCCTATACACGCTTACCG ATAGTCAAGTTTGCGTGA
GTCTTAGGGCCCAGGGACCGCACGGGTCGACCGCGCGACTGGTCGGAGCTTGCGCGTCTAC GCCACTCGGCGGCCCCGACGGGGGATGCCGCGGAATGTCCGCCGGCGTATGCGGCTCAAGC CGGACCGTCGGACTGCGAAGCGCCGTGAGCACCCCTCGACCTGACCGGACGCGGCGCACCC GTCCGAGTATCGTCGCGA
TCGGGTCTCGCCCGGCGCTAGTCCAGCCGTAGCGCTCTCCGGCGATCACCCCGGAGCACTC TGGAGCCGAGCGGTCGGGTCTGTTGGGCGCGCCGCGGCTACGGACGGCTCGACTCACTGGC GCTCGACCCCGTATCCCCCGTCTCGGACGACGCACCGTTGCGCGGGAACGATCGGCGGCGC TCACACGCACGATCGGAA
CTTAAGGCTGGCGCACCATGAGGGCCGCGCCACGTCCGACCCGCAGCCCGCGCGTAGTAGC CTAGCCGGGCGGGGTTCCTCCCGTGCGTCACCTAGCACGGGGCCTGGCACCGAACGCGAGC CCGTCCGGTCACCGCGGCGGGTCTGCGGACGTCCCCGGTCGCTCGGCTCGGAGTCCCCGCT GGGGATCGCGTCGGGACA
CGACGGCGTAGCACTCGCGGACCTAGGGCGCGCGAGTCGGGGGAGCCCGCGGTGCGACGCT CGGGGAGGAGCTCGCATGCCCAAGGCACGATCTAGGGGGGGGTACGGGGGGCGTCCGTCCG AGCGCCGGGACTGCGATCCGGGGCCACATGCTAACCGGCGGAAGGGGGGACCTAACCGGTG TGGACTCCGGGTAATCCA
CGGGGGGCTGACACGTCTCGGATCGCCCCGTCAGTCAGCCCCCTAGTCCCGGACAGGACGT CGGAGGTCGAGTCCGCACTGTCGGGCCTGCTCGTGGGCACGGCAGGACGCGTCCCCATGGT CAGCCGCCGTGCGATACCTCGCCACGACTCTGAGCCGGGCGCGAGCGTGAGAGCCCGAGCC GCGGTACACGGGGCGTCA
GCGAGCTCGCTCTCGACTCCGGGCTCCCGTGCTGACACGGGGTGCGACCCCGCGGCGATTG TCCGCACGCCTGTCGGACGACGTCGGCCCGTCGTAGTGCCGGTCAGAGGCAGGGGGGCTGC TCGCGCTGGCCGCCTCGTCGCGCGTGGACCCTATGGGGGATCACGCGTGGGGTCGGGATCG GGGACCGCGCGACTTGGA
CGCGCCCCGTAACGGACGCGGTGAGTCGAGCTTACGCGGCTAGGGCCGAGTCGTGTTAGCG TCTCGCGTAAGCGAATGCCACGTCCCCCGCCGCCCGTCGCGCAGCTGGCTACGCAACGCCT CCGCGGCCTCCGTAGCGAGTGCGTGGGACGCTGGCCGTCCGCGTGTTCCGGGACCTGGATG CGGGAGGGACCTAAGGCA
AGAACGTGCGGTCGTCCCCACGCACGGGATGACGGACGGGGTAGACGGGCGTCGTGCGCGC GGGTAGCGTAACCGGTTACAGTCCCCGCAACGCTCTAGCTCCGGCCCTCGCTTAGGAGTTC GCGGCCGAGACATGAGGTGGTCCGGACGGCAGGGGGTCGCGGAGACCGTGGAGCCGATTCT GCCGGACGCCACGTCCCA
CGGGACGCCCCGTACCGTGTACGAAGCCCCGGTCGGTCGGCGGATCGTAGATCCCGGAGCC GACGCCTTGAACCCGGCTTTCCCAGCGACTCGCGCCCCCACTGGGTCCCTCGGGACCCCGC TCCCCCCAGACGCATACAGCCCGCAAGCGGGGGCAGTCTCGGACCGCCCGGACACTGGCCT TAGGCACCGTGGGCTCGA
GTGTCCGGGGCGCATCGGAGCTGTCCGACCGAGTTCCGGGGACGGCGCACGTTGTGCCGGC CTCAGACGGAGCCTGTAGCCCCCGGACAGTGTGTGCCCGCCCACTACGGGTTAGGCACGGG GTTGGTCGGCACGCGTCCTCCGCGTGTCACGGACCGATGCAGACCGCTGGCCGGGAGGTCG CCCCCCCAGGGGTGCACA
CGCGCAGCACGCACGTCCGGGGCACGCGCGGCTCGGAGGGTCCGGGCTGGGACGGGAGGTT TGGAGTCGCGTGCGCGTAGCAGCGCACCCGCCTGGTCGCCGGGTCTAGTAGGGCTGGGTTA CGGAGGACGTGCAGGCGACCCCAACCGTTGACGACGGGTCCGACCACGCCTTTAGCCGTGG CGTGTCCGTCGCGAGCCA
W3. A method comprising detecting the presence or absence of a nucleotide sequence identification tag in a microorganism, wherein the nucleotide sequence is selected from the group of twenty (20) nucleotide sequences consisting of
CACGCACGGACCGACCGTCACCGGACCGTTTCGCGCGACGTGCGCGAGGCTCCGACACGAA AGACGGGCCCCCTATTGCGCTCATGTCGGCCGCACCCCTGCGTAAAGTCAGATACGTGCGC CACCCGAGCCGGGACCGCCCTGAGCGCATGGTCCGGGCGGCGTGGCAAGCGCAGGAGGGCG TGCCCCGTTCGCTAGGCA
ACGTATGTCGGCTGATCGTACACGCCGACCAGCGCAGTCGGCGTACTCAGGCGTTCCGAGT AGCTCACATCTGTGGGCCCCGGCGTACCTTCGGCAGGGTTATGCGACGGGGCGGCAGGCTT GCGCTGGCGTCGGGAATCACCGCGAACTTGACCCGCGCCGGTTCCGTATCGGTCCGCTGCG GCCGTGCTCCGCAGTCGA
TGCAGTCCGCCCAGCCGGCCGTGTAGCACGGCCGACTGCAGGTGCGACGTGCTAGGGGCCA GCACGCGAGCGGCCCTACCACGGGTCGTGTGGGGCGCATGACCGCCGGCCGGGTCTCGGCA CGGGGCGACGCGGTGCTCCTAGGCTAGCAGGGCCTCACCGGGTGATCCCCCGTGTAGCGCC GCACAACACCCCCTGCGA
TGCCCGCATACCGCCCGCCCACTGGGGATCCTCCGGCGCTGTCGCGCTATGCGCGTCCATC CTGGTCGGACGGGCTCGGGCCCCGGACCAAACCGCAGCGGCCCCTGGCAGCGACTAAGGGC GCCGTCTCACCCTAGACTTCTTAATCGGGGTGTCCCGGTAGGCCGGGAGTAGCCTCGGCGG GCTAGCCGCGTGACTATA
GCGGGTTAGTCCCCGTCGGACGTCATGCATACAGTCGGGGCTGGCGAGACAGGAGGCTACA GGGGGCGCCCGGAGGAACACACGTGGGACTAAGACGTCGGTCCGTGTGCCCCCGAACCGGC GTGCTCATCGTAGGACTGGGAAGTCCGTACCGCGTGGCTCGTACCTCGCGGTCTGAGTCCG ACACCCGCTGACGCCGGA
CTGAGACGACTCCCGCACTACGGATCGCGAGCGTAGACTCAGCCCGGACTCTCACGCGACC TCGGACGCGGCCTAATGTCTCGACTCGCGGTCCGCTGAAGGTCTCGGGGCACGCGAGACGC GGGGTCAGGCCGGGGGGATCCCCGCACACACTCAGTCGCGGCGAACGGAGTCCCGTGGCCT GGCTAGGGATCGTGGGTA
GGGGCGTCCACTCTGGCTCGGTAGAGCGCTGGGCTCCGCGCGACTGCGCGCACCCATCGGT TTGGCGCGACGCACCGTGGACTCCTGGGCTAGAGGGCGGGTCCCCGCCATACCCCGTTCTC GTGCCGGCTGGGTAGGACCGGAGTGACGGCTGTGGCCGGCGACTCGGGCGCGCACTGTAGT CGGATCTGGGCGGGCAGA
GTCGGGCGCGCGTCAGTCCACGCGTTAAACACTGGCCGACGACACGACGGGATCCGGGCAC GCCCCGAGAGCGCGTGTTCGCGCGAGTCGATCGGGAGGCCGCAGCGTGTCGAGCCCAGACC CCGCTCTAGCGTGGCCATCGCGGTGCTAAGTGGGGCGGCCGGGTCCTATACACGCTTACCG ATAGTCAAGTTTGCGTGA
GTCTTAGGGCCCAGGGACCGCACGGGTCGACCGCGCGACTGGTCGGAGCTTGCGCGTCTAC GCCACTCGGCGGCCCCGACGGGGGATGCCGCGGAATGTCCGCCGGCGTATGCGGCTCAAGC CGGACCGTCGGACTGCGAAGCGCCGTGAGCACCCCTCGACCTGACCGGACGCGGCGCACCC GTCCGAGTATCGTCGCGA
TCGGGTCTCGCCCGGCGCTAGTCCAGCCGTAGCGCTCTCCGGCGATCACCCCGGAGCACTC TGGAGCCGAGCGGTCGGGTCTGTTGGGCGCGCCGCGGCTACGGACGGCTCGACTCACTGGC GCTCGACCCCGTATCCCCCGTCTCGGACGACGCACCGTTGCGCGGGAACGATCGGCGGCGC TCACACGCACGATCGGAA
CTTAAGGCTGGCGCACCATGAGGGCCGCGCCACGTCCGACCCGCAGCCCGCGCGTAGTAGC CTAGCCGGGCGGGGTTCCTCCCGTGCGTCACCTAGCACGGGGCCTGGCACCGAACGCGAGC CCGTCCGGTCACCGCGGCGGGTCTGCGGACGTCCCCGGTCGCTCGGCTCGGAGTCCCCGCT GGGGATCGCGTCGGGACA
CGACGGCGTAGCACTCGCGGACCTAGGGCGCGCGAGTCGGGGGAGCCCGCGGTGCGACGCT CGGGGAGGAGCTCGCATGCCCAAGGCACGATCTAGGGGGGGGTACGGGGGGCGTCCGTCCG AGCGCCGGGACTGCGATCCGGGGCCACATGCTAACCGGCGGAAGGGGGGACCTAACCGGTG TGGACTCCGGGTAATCCA
CGGGGGGCTGACACGTCTCGGATCGCCCCGTCAGTCAGCCCCCTAGTCCCGGACAGGACGT CGGAGGTCGAGTCCGCACTGTCGGGCCTGCTCGTGGGCACGGCAGGACGCGTCCCCATGGT CAGCCGCCGTGCGATACCTCGCCACGACTCTGAGCCGGGCGCGAGCGTGAGAGCCCGAGCC GCGGTACACGGGGCGTCA
GCGAGCTCGCTCTCGACTCCGGGCTCCCGTGCTGACACGGGGTGCGACCCCGCGGCGATTG TCCGCACGCCTGTCGGACGACGTCGGCCCGTCGTAGTGCCGGTCAGAGGCAGGGGGGCTGC TCGCGCTGGCCGCCTCGTCGCGCGTGGACCCTATGGGGGATCACGCGTGGGGTCGGGATCG GGGACCGCGCGACTTGGA
CGCGCCCCGTAACGGACGCGGTGAGTCGAGCTTACGCGGCTAGGGCCGAGTCGTGTTAGCG TCTCGCGTAAGCGAATGCCACGTCCCCCGCCGCCCGTCGCGCAGCTGGCTACGCAACGCCT CCGCGGCCTCCGTAGCGAGTGCGTGGGACGCTGGCCGTCCGCGTGTTCCGGGACCTGGATG CGGGAGGGACCTAAGGCA
AGAACGTGCGGTCGTCCCCACGCACGGGATGACGGACGGGGTAGACGGGCGTCGTGCGCGC GGGTAGCGTAACCGGTTACAGTCCCCGCAACGCTCTAGCTCCGGCCCTCGCTTAGGAGTTC GCGGCCGAGACATGAGGTGGTCCGGACGGCAGGGGGTCGCGGAGACCGTGGAGCCGATTCT GCCGGACGCCACGTCCCA
CGGGACGCCCCGTACCGTGTACGAAGCCCCGGTCGGTCGGCGGATCGTAGATCCCGGAGCC GACGCCTTGAACCCGGCTTTCCCAGCGACTCGCGCCCCCACTGGGTCCCTCGGGACCCCGC TCCCCCCAGACGCATACAGCCCGCAAGCGGGGGCAGTCTCGGACCGCCCGGACACTGGCCT TAGGCACCGTGGGCTCGA
GTGTCCGGGGCGCATCGGAGCTGTCCGACCGAGTTCCGGGGACGGCGCACGTTGTGCCGGC CTCAGACGGAGCCTGTAGCCCCCGGACAGTGTGTGCCCGCCCACTACGGGTTAGGCACGGG GTTGGTCGGCACGCGTCCTCCGCGTGTCACGGACCGATGCAGACCGCTGGCCGGGAGGTCG CCCCCCCAGGGGTGCACA
CGCGCAGCACGCACGTCCGGGGCACGCGCGGCTCGGAGGGTCCGGGCTGGGACGGGAGGTT TGGAGTCGCGTGCGCGTAGCAGCGCACCCGCCTGGTCGCCGGGTCTAGTAGGGCTGGGTTA CGGAGGACGTGCAGGCGACCCCAACCGTTGACGACGGGTCCGACCACGCCTTTAGCCGTGG CGTGTCCGTCGCGAGCCA
W4. The method of embodiment W3, wherein the microorganism includes two or more different identification tags.
W5. The method of embodiment W3, wherein the microorganism includes multiple copies of one or more of the identification tags.
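As an illustration only, the presence-or-absence determination of embodiment W3 can be sketched in silico as a string search over assembled sequence data. The sketch below is not the detection method of the embodiments, which may equally be carried out by amplification, hybridization or sequencing. The function names and the short example tags are hypothetical stand-ins for the full-length identification tags listed above. Whitespace is stripped first because the tags are printed here in space-separated blocks, and both strands are checked.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq):
    """Remove whitespace (the tags are printed in blocks) and upper-case."""
    return "".join(seq.split()).upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return normalize(seq).translate(COMPLEMENT)[::-1]


def detect_tags(genome, tags):
    """Map each tag name to True if the tag occurs on either strand."""
    g = normalize(genome)
    found = {}
    for name, tag in tags.items():
        t = normalize(tag)
        found[name] = (t in g) or (reverse_complement(t) in g)
    return found


if __name__ == "__main__":
    # Hypothetical short tags standing in for the full-length sequences.
    example_tags = {
        "tag_A": "CACG CACGGACC",   # printed with a block space
        "tag_B": "ACGTATGTCGGC",
        "tag_C": "TTTTTTTTTTGG",
    }
    # Hypothetical genome fragment: tag_A on the forward strand,
    # tag_B on the reverse strand, tag_C absent.
    fragment = (
        "GGAT" + "CACGCACGGACC" + "AATT"
        + reverse_complement("ACGTATGTCGGC") + "CCGA"
    )
    for name, present in detect_tags(fragment, example_tags).items():
        print(name, "present" if present else "absent")
```

An exact-match search of this kind reports only presence or absence. Counting copies (embodiment W5) or tolerating sequencing errors would need a different approach, such as an alignment tool.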
X1 . A composition comprising a nucleic acid comprising heterologous polynucleotides that encode a phosphogluconate dehydratase enzyme and a 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme, and one or more polynucleotides that homologously combine in a gene of a host that encodes a 6-phosphogluconate dehydrogenase (decarboxylating) enzyme.
X2. The composition of embodiment X1 , wherein the yeast is a Saccharomyces spp. yeast.
X3. The composition of embodiment X2, wherein the yeast is a Saccharomyces cerevisiae yeast strain.
X3.1. The composition of any one of embodiments X1 to X3, wherein the polynucleotides encoding the phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme independently are from an Escherichia spp. microbe or Pseudomonas spp. microbe.
X4. The composition of embodiment X3, wherein the Escherichia spp. microbe is an Escherichia coli strain.
X5. The composition of embodiment X3 or X4, wherein the Pseudomonas spp. microbe is a Pseudomonas aeruginosa strain.
X6. The composition of any one of embodiments X1 to X5, wherein the polynucleotide that encodes the phosphogluconate dehydratase enzyme is an EDD gene.
X7. The composition of any one of embodiments X1 to X5, wherein the polynucleotide that encodes the 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme is an EDA gene.
X8. The composition of any one of embodiments X1 to X7, wherein the nucleic acid includes a polynucleotide that encodes a 6-phosphogluconolactonase enzyme.
X8.1. The composition of embodiment X8, wherein the polynucleotide that encodes the 6-phosphogluconolactonase enzyme is from a yeast.
X8.2. The composition of embodiment X8.1 , wherein the yeast is a Saccharomyces spp. yeast.
X8.3. The composition of embodiment X8.2, wherein the yeast is a Saccharomyces cerevisiae strain.
X8.4. The composition of any one of embodiments X8 to X8.3, wherein the 6-phosphogluconolactonase enzyme is expressed from a SOL gene.
X9. The composition of embodiment X8.4, wherein the SOL gene is a SOL3 gene.
X10. The composition of any one of embodiments X1 to X9, wherein the nucleic acid includes a polynucleotide that encodes a glucose-6-phosphate dehydrogenase enzyme.
X11. The composition of embodiment X10, wherein the polynucleotide that encodes the glucose-6-phosphate dehydrogenase enzyme is from a yeast.
X12. The composition of embodiment X11, wherein the yeast is a Saccharomyces spp. yeast.
X13. The composition of embodiment X12, wherein the yeast is a Saccharomyces cerevisiae strain.
X14. The composition of any one of embodiments X10 to X13, wherein the nucleic acid includes a polynucleotide that encodes an endogenous glucose-6-phosphate dehydrogenase enzyme.
X15. The composition of any one of embodiments X10 to X14, wherein the glucose-6-phosphate dehydrogenase enzyme is expressed from a ZWF gene.
X16. The composition of embodiment X15, wherein the ZWF gene is a ZWF1 gene.
X17. The composition of any one of embodiments X1 to X16, wherein the nucleic acid includes one or more promoters operable in a yeast, wherein the promoter is in operable connection with one or more of the polynucleotides.
X18. The composition of embodiment X17, wherein the promoter is selected from promoters that regulate glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1), phosphoglucokinase (PGK-1) and triose phosphate dehydrogenase (TDH-1).
X19. The composition of any one of embodiments X1 to X18, wherein the nucleic acid includes one or more polynucleotides that homologously combine in a gene of a host that encodes a phosphofructokinase (PFK) enzyme, phosphoglucoisomerase (PGI) enzyme, transketolase enzyme, transaldolase enzyme, trehalose-6-phosphate synthase (TPS1), plasma membrane channel (FPS1), glycerol-3-phosphate dehydrogenase (GPD1/GPD2), neutral trehalase (NTH1), alkaline phosphatase (PHO13) or combination thereof.
X20. The composition of embodiment X19, wherein the transketolase enzyme is encoded by a TKL-1 coding sequence or a TKL-2 coding sequence.
X21 . The composition of embodiment X19, wherein the transaldolase is encoded by a TAL-1 coding sequence.
X22. The composition of embodiment X19, wherein the phosphofructokinase (PFK) enzyme is a PFK-2 enzyme or PFK-1 enzyme.
X23. The composition of any one of embodiments X1 to X22, wherein the 6-phosphogluconate dehydrogenase (decarboxylating) enzyme is encoded by a GND-1 gene or a GND-2 gene.
X24. The composition of embodiment X19, wherein the PGI is encoded by a PGI-1 gene.
X24.1 The composition of embodiment X19, wherein the trehalose-6-phosphate synthase is encoded by a TPS1 coding sequence.
X24.2 The composition of embodiment X19, wherein the plasma membrane channel is encoded by a FPS1 coding sequence.
X24.3 The composition of embodiment X19, wherein the glycerol-3-phosphate dehydrogenase enzyme is encoded by a GPD1 gene or a GPD2 gene.
X24.4 The composition of embodiment X19, wherein the neutral trehalase enzyme is encoded by a NTH1 coding sequence.
X24.5 The composition of embodiment X19, wherein the alkaline phosphatase enzyme is encoded by a PHO13 coding sequence.
X25. The composition of any one of embodiments X1 to X24.5, wherein the nucleic acid is one or two separate nucleic acid molecules.
X25.01 The composition of any one of embodiments X1 to X25, wherein the nucleic acid includes one or more polynucleotides that homologously combine in a gene of a host and increase the production of one or more enzymes or polypeptides selected from the group consisting of pyruvate decarboxylase (PDC1), alcohol dehydrogenase (ADH1), glutamate synthase (e.g., GLT1), trehalose-6-phosphate phosphatase (TPS2), glyceraldehyde-3-phosphate dehydrogenase (TDH3), pyruvate kinase (PYK1, CDC19), glucose transporter (GAL2, GSX1, GXF1, HXT7), phosphogluconate dehydratase (EDD), 2-keto-3-deoxygluconate-6-phosphate aldolase (e.g., EDA), xylose isomerase (XI), xylose reductase (XR), xylitol dehydrogenase (XD), or xylulokinase (XK).
X25.02 The composition of embodiment X25.01 , wherein the pyruvate decarboxylase enzyme is encoded by a PDC1 coding sequence.
X25.03 The composition of embodiment X25.01 , wherein the alcohol dehydrogenase 1 enzyme is encoded by an ADH1 coding sequence.
X25.04 The composition of embodiment X25.01 , wherein the glutamate synthase enzyme is encoded by a GLT1 coding sequence.
X25.05 The composition of embodiment X25.01, wherein the trehalose-6-phosphate phosphatase enzyme is encoded by a TPS2 coding sequence.
X25.06 The composition of embodiment X25.01, wherein the glyceraldehyde-3-phosphate dehydrogenase enzyme is encoded by a TDH3 coding sequence.
X25.07 The composition of embodiment X25.01, wherein the pyruvate kinase enzyme is encoded by a PYK1 coding sequence.
X25.08 The composition of embodiment X25.01, wherein the pyruvate kinase enzyme is encoded by a CDC19 coding sequence.
X25.09 The composition of embodiment X25.01, wherein the glucose transporter is encoded by a GAL2 coding sequence.
X25.10 The composition of embodiment X25.01, wherein the glucose transporter is encoded by a GSX1 coding sequence.
X25.11 The composition of embodiment X25.01, wherein the glucose transporter is encoded by a GXF1 coding sequence.
X25.12 The composition of embodiment X25.01, wherein the glucose transporter is encoded by a HXT7 coding sequence.
X25.13 The composition of embodiment X25.01, wherein the phosphogluconate dehydratase enzyme is encoded by a EDD coding sequence.
X25.14 The composition of embodiment X25.01, wherein the 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme is encoded by a EDA coding sequence.
X25.15 The composition of embodiment X25.01, wherein the xylose reductase enzyme is encoded by a XR coding sequence.
X25.16 The composition of embodiment X25.01, wherein the xylose isomerase enzyme is encoded by a XI coding sequence.
X25.17 The composition of embodiment X25.01, wherein the xylitol dehydrogenase enzyme is encoded by a XD coding sequence.
X25.18 The composition of embodiment X25.01 , wherein the xylulokinase enzyme is encoded by a XK coding sequence.
X26. The composition of any one of embodiments X25.01 to X25.18, wherein each nucleic acid molecule includes one or two or more of the polynucleotide subsequences, one or two or more of the promoters, or one or two or more of the polynucleotide subsequences and one or two or more of the promoters.
X27. The composition of any one of embodiments X25.01 to X26, wherein each of the one or two nucleic acid molecules is in circular form.
X28. The composition of any one of embodiments X25.01 to X26, wherein each of the one or two nucleic acid molecules is in linear form.
X29. The composition of any one of embodiments X25.01 to X28, wherein each of the one or two nucleic acid molecules functions as an expression vector.
X30. The composition of any one of embodiments X25.01 to X29, wherein each of the one or two nucleic acid molecules includes flanking sequences for integrating the polynucleotides, the promoter sequences, or the polynucleotides and the promoter sequences in the nucleic acid into genomic DNA of a host organism.
Y1. A composition comprising an engineered yeast that includes an alteration that adds or increases a phosphogluconate dehydratase activity and a 2-keto-3-deoxygluconate-6-phosphate aldolase activity, and an alteration that reduces a 6-phosphogluconate dehydrogenase (decarboxylating) activity.
Y2. The composition of embodiment Y1 , wherein the yeast is a Saccharomyces spp. yeast.
Y3. The composition of embodiment Y2, wherein the yeast is a Saccharomyces cerevisiae yeast strain.
Y4. The composition of any one of embodiments Y1 to Y3, wherein the yeast includes an altered gene that encodes a 6-phosphogluconate dehydrogenase (decarboxylating) enzyme.
Y4.1. The composition of any one of embodiments Y1 to Y4, wherein the yeast includes heterologous polynucleotides, or multiple copies of endogenous polynucleotides, that encode a phosphogluconate dehydratase enzyme and a 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme.
Y5. The composition of embodiment Y4, wherein the polynucleotides encoding the phosphogluconate dehydratase enzyme and the 3-deoxygluconate-6-phosphate aldolase enzyme independently are from an Escherichia spp. microbe or Pseudomonas spp. microbe.
Y6. The composition of embodiment Y5, wherein the Escherichia spp. microbe is an Escherichia coli strain.
Y7. The composition of embodiment Y5, wherein the Pseudomonas spp. microbe is a Pseudomonas aeruginosa strain.
Y8. The composition of any one of embodiments Y4 to Y7, wherein the polynucleotide that encodes the phosphogluconate dehydratase enzyme is an EDD gene.
Y9. The composition of any one of embodiments Y4 to Y7, wherein the polynucleotide that encodes the 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme is an EDA gene.
Y10. The composition of any one of embodiments Y1 to Y1, wherein a glucose-6-phosphate dehydrogenase activity is added or increased.
Y10.1. The composition of embodiment Y10, wherein the yeast comprises a heterologous polynucleotide that encodes a 6-phosphogluconolactonase enzyme, or wherein the yeast comprises multiple copies of an endogenous polynucleotide that encodes a 6-phosphogluconolactonase enzyme.
Y10.2. The composition of embodiment Y10.1, wherein the polynucleotide that encodes the 6-phosphogluconolactonase enzyme is from a yeast.
Y10.3. The composition of embodiment Y10.2, wherein the yeast is a Saccharomyces spp. yeast.
Y10.4. The composition of embodiment Y10.3, wherein the yeast is a Saccharomyces cerevisiae strain.
Y10.5. The composition of any one of embodiments Y10 to Y10.4, wherein the 6-phosphogluconolactonase enzyme is expressed from a SOL gene.
Y11. The composition of embodiment Y10.4, wherein the SOL gene is a SOL3 gene.
Y12. The composition of any one of embodiments Y4 to Y11, wherein a glucose-6-phosphate dehydrogenase activity is added or increased.
Y13. The composition of embodiment Y12, wherein the yeast comprises a heterologous polynucleotide that encodes a glucose-6-phosphate dehydrogenase enzyme, or wherein the yeast comprises multiple copies of an endogenous polynucleotide that encodes a glucose-6-phosphate dehydrogenase enzyme.
Y14. The composition of embodiment Y13, wherein the polynucleotide that encodes the glucose-6-phosphate dehydrogenase enzyme is from a yeast.
Y15. The composition of embodiment Y14, wherein the yeast is a Saccharomyces spp. yeast.
Y16. The composition of embodiment Y15, wherein the yeast is a Saccharomyces cerevisiae strain.
Y17. The composition of any one of embodiments Y13 to Y17, wherein the glucose-6-phosphate dehydrogenase enzyme is expressed from a ZWF gene.
Y18. The composition of embodiment Y17, wherein the ZWF gene is a ZWF1 gene.
Y19. The composition of any one of embodiments Y1 to Y18, wherein the nucleic acid includes one or more promoters operable in a yeast, wherein the promoter is in operable connection with one or more of the polynucleotides.
Y20. The composition of embodiment Y19, wherein the promoter is selected from promoters that regulate glucose phosphate dehydrogenase (GPD), translation elongation factor (TEF-1), phosphoglucokinase (PGK-1) and triose phosphate dehydrogenase (TDH-1).
Y21. The composition of any one of embodiments Y1 to Y20, wherein the yeast includes a reduction in one or more of the following activities: phosphofructokinase (PFK) activity, phosphoglucoisomerase (PGI) activity, transketolase activity, transaldolase activity, trehalose-6-phosphate synthase (TPS1), plasma membrane channel (FPS1), glycerol-3-phosphate dehydrogenase (GPD1/GPD2), neutral trehalase (NTH1), alkaline phosphatase (PHO13) or combination thereof.
Y22. The composition of embodiment Y21, wherein the yeast includes an alteration in one or more polynucleotides that inhibits production of one or more enzymes or polypeptides selected from the group consisting of phosphofructokinase (PFK) enzyme, phosphoglucoisomerase (PGI) enzyme, 6-phosphogluconate dehydrogenase (decarboxylating) enzyme, transketolase enzyme, transaldolase enzyme, trehalose-6-phosphate synthase (TPS1), plasma membrane channel (FPS1), glycerol-3-phosphate dehydrogenase (GPD1/GPD2), neutral trehalase (NTH1), alkaline phosphatase (PHO13) or combination thereof.
Y23. The composition of embodiment Y22, wherein the transketolase enzyme is encoded by a TKL-1 coding sequence or a TKL-2 coding sequence.
Y24. The composition of embodiment Y22, wherein the transaldolase is encoded by a TAL-1 coding sequence.
Y25. The composition of embodiment Y22, wherein the phosphofructokinase (PFK) enzyme is a PFK-2 enzyme or PFK-1 enzyme.
Y26. The composition of any one of embodiments Y4 to Y25, wherein the 6-phosphogluconate dehydrogenase (decarboxylating) enzyme is encoded by a GND-1 gene or GND-2 gene.
Y27. The composition of embodiment Y22, wherein the PGI is encoded by a PGI-1 gene.
Y27.1 The composition of embodiment Y22, wherein the trehalose-6-phosphate synthase is encoded by a TPS1 coding sequence.
Y27.2 The composition of embodiment Y22, wherein the plasma membrane channel is encoded by an FPS1 coding sequence.
Y27.3 The composition of embodiment Y22, wherein the glycerol-3-phosphate dehydrogenase enzyme is encoded by a GPD1 gene or a GPD2 gene.
Y27.4 The composition of embodiment Y22, wherein the neutral trehalase enzyme is encoded by an NTH1 coding sequence.
Y27.5 The composition of embodiment Y22, wherein the alkaline phosphatase enzyme is encoded by a PHO13 coding sequence.
Y28. The composition of any one of embodiments Y1 to Y27.5, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are not integrated in the yeast nucleic acid.
Y29. The composition of embodiment Y28, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are in one or more plasmids.
Y30. The composition of any one of embodiments Y1 to Y29, wherein the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters are integrated in genomic DNA of the yeast.
Y30.01 The composition of any one of embodiments Y1 to Y30, wherein the yeast includes an alteration in one or more polynucleotides that increases production of one or more enzymes or polypeptides selected from the group consisting of pyruvate decarboxylase (PDC1), alcohol dehydrogenase (ADH1), glutamate synthase (e.g., GLT1), trehalose-6-phosphate phosphatase (TPS2), glyceraldehyde-3-phosphate dehydrogenase (TDH3), pyruvate kinase (PYK1, CDC19), glucose transporter (GAL2, GSX1, GXF1, HXT7), phosphogluconate dehydratase (EDD), 2-keto-3-deoxygluconate-6-phosphate aldolase (e.g., EDA), xylose isomerase (XI), xylose reductase (XR), xylitol dehydrogenase (XD), or xylulokinase (XK).
Y30.02 The composition of embodiment Y30.01, wherein the pyruvate decarboxylase enzyme is encoded by a PDC1 coding sequence.
Y30.03 The composition of embodiment Y30.01, wherein the alcohol dehydrogenase 1 enzyme is encoded by an ADH1 coding sequence.
Y30.04 The composition of embodiment Y30.01, wherein the glutamate synthase enzyme is encoded by a GLT1 coding sequence.
Y30.05 The composition of embodiment Y30.01, wherein the trehalose-6-phosphate phosphatase enzyme is encoded by a TPS2 coding sequence.
Y30.06 The composition of embodiment Y30.01, wherein the glyceraldehyde-3-phosphate dehydrogenase enzyme is encoded by a TDH3 coding sequence.
Y30.07 The composition of embodiment Y30.01, wherein the pyruvate kinase enzyme is encoded by a PYK1 coding sequence.
Y30.08 The composition of embodiment Y30.01, wherein the pyruvate kinase enzyme is encoded by a CDC19 coding sequence.
Y30.09 The composition of embodiment Y30.01, wherein the glucose transporter is encoded by a GAL2 coding sequence.
Y30.10 The composition of embodiment Y30.01, wherein the glucose transporter is encoded by a GSX1 coding sequence.
Y30.11 The composition of embodiment Y30.01, wherein the glucose transporter is encoded by a GXF1 coding sequence.
Y30.12 The composition of embodiment Y30.01, wherein the glucose transporter is encoded by an HXT7 coding sequence.
Y30.13 The composition of embodiment Y30.01, wherein the phosphogluconate dehydratase enzyme is encoded by an EDD coding sequence.
Y30.14 The composition of embodiment Y30.01, wherein the 2-keto-3-deoxygluconate-6-phosphate aldolase enzyme is encoded by an EDA coding sequence.
Y30.15 The composition of embodiment Y30.01, wherein the xylose reductase enzyme is encoded by an XR coding sequence.
Y30.16 The composition of embodiment Y30.01, wherein the xylose isomerase enzyme is encoded by an XI coding sequence.
Y30.17 The composition of embodiment Y30.01, wherein the xylitol dehydrogenase enzyme is encoded by an XD coding sequence.
Y30.18 The composition of embodiment Y30.01, wherein the xylulokinase enzyme is encoded by an XK coding sequence.
Y31. The composition of any one of embodiments Y30 to Y30.18, wherein the polynucleotides, the promoters, or the polynucleotides and the promoters are integrated in a transposition integration event, in a homologous recombination integration event, or in a transposition integration event and a homologous recombination integration event.
Y32. The composition of embodiment Y31, wherein the transposition integration event includes transposition of an operon comprising two or more of the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters.
Y33. The composition of embodiment Y31, wherein the homologous recombination integration event includes homologous recombination of an operon comprising two or more of the polynucleotide subsequences, the promoters, or the polynucleotide subsequences and the promoters.
Z1. A method, comprising contacting an engineered yeast of any one of embodiments Y1 to Y33 with a feedstock that contains one or more hexose sugars under conditions in which the microbe synthesizes ethanol.
Z2. The method of embodiment Z1, wherein the engineered yeast synthesizes ethanol to about 85% to about 99% of theoretical yield.
Z3. The method of embodiment Z1 or Z2, comprising recovering ethanol synthesized by the engineered yeast.
Z4. The method of any one of embodiments Z1 to Z3, wherein the conditions are fermentation conditions.
Z5. The method of any one of embodiments Z1 to Z4, wherein the engineered yeast synthesizes ethanol with a yield greater than 0.4 grams of ethanol per gram of feedstock.
Z6. The method of any one of embodiments Z1 to Z5, wherein the engineered yeast comprises between about a 1-fold to about a 100-fold increase in ethanol production when compared to wild-type, parental or partially engineered organisms of the same strain, under identical fermentation conditions.
AA1. An isolated nucleic acid comprising a polynucleotide that is 80% or more identical to SEQ ID NO: 179.
AA2. The nucleic acid of embodiment AA1, which comprises a polynucleotide that is 85% or more identical to SEQ ID NO: 179.
AA3. The nucleic acid of embodiment AA1, which comprises a polynucleotide that is 90% or more identical to SEQ ID NO: 179.
AA4. The nucleic acid of embodiment AA1, which comprises a polynucleotide that is 95% or more identical to SEQ ID NO: 179.
AA5. The nucleic acid of embodiment AA1, which comprises a polynucleotide that comprises SEQ ID NO: 179.
AA6. The nucleic acid of embodiment AA1, which comprises a polynucleotide consisting of SEQ ID NO: 179.
AA7. The nucleic acid of any one of embodiments AA1 to AA6, wherein the polynucleotide encodes an amino acid sequence of SEQ ID NO: 180.
AA7.1. The nucleic acid of embodiment AA7, wherein the polynucleotide is codon optimized.
AA8. The nucleic acid of any one of embodiments AA1 to AA7.1, which is an expression vector.
AB1. An engineered microorganism comprising a polynucleotide that is 80% or more identical to SEQ ID NO: 179.
AB1.1. The engineered microorganism of embodiment AB1, which comprises a polynucleotide that is 85% or more identical to SEQ ID NO: 179.
AB1.2. The engineered microorganism of embodiment AB1, which comprises a polynucleotide that is 90% or more identical to SEQ ID NO: 179.
AB1.3. The engineered microorganism of embodiment AB1, which comprises a polynucleotide that is 95% or more identical to SEQ ID NO: 179.
AB1.4. The engineered microorganism of embodiment AB1, which comprises a polynucleotide that comprises SEQ ID NO: 179.
AB1.5. The engineered microorganism of embodiment AB1, which comprises a polynucleotide consisting of SEQ ID NO: 179.
AB1.6. The engineered microorganism of any one of embodiments AB1 to AB1.5, wherein the polynucleotide encodes an amino acid sequence of SEQ ID NO: 180.
AB1.7. The engineered microorganism of embodiment AB1.6, wherein the polynucleotide is codon optimized for the microorganism.
AB2. The engineered microorganism of any one of embodiments AB1 to AB1.7, which is a eukaryote.
AB3. The engineered microorganism of embodiment AB2, which is a yeast.
AB4. The engineered microorganism of embodiment AB3, wherein the yeast is a Saccharomyces yeast.
AB5. The engineered microorganism of embodiment AB3, wherein the yeast is a Saccharomyces cerevisiae yeast.
AC1. A method for producing ethanol, which comprises contacting the engineered microorganism of any one of embodiments AB1 to AB5 with a 5 carbon sugar, a 6 carbon sugar, or mixture comprising 5 carbon and 6 carbon sugars, under fermentation conditions, whereby ethanol is produced by the engineered microorganism.
AC2. The method of embodiment AC1, wherein the engineered microorganism synthesizes ethanol to about 85% to 99% of theoretical yield.
AC3. The method of embodiment AC1 or AC2, which comprises recovering ethanol synthesized by the engineered microorganism.
AC4. The method of any one of embodiments AC1 to AC3, wherein the amount of ethanol produced by the engineered microorganism is about 2-fold to about 100-fold increased compared to the amount of ethanol produced by wild-type, parental, or partially engineered microorganisms of the same strain, under identical fermentation conditions.
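As a non-limiting illustration of the yield figures recited in embodiments Z2, Z5, Z6, AC2 and AC4, the following sketch shows one way a measured ethanol yield might be expressed as a percentage of theoretical yield and as a fold increase over a reference strain. It assumes the conventional stoichiometry for hexose fermentation (one mole of glucose giving two moles of ethanol and two moles of carbon dioxide), which corresponds to roughly 0.51 grams of ethanol per gram of glucose. The function names and the example figures are hypothetical and are not taken from this application.

```python
# Standard molar masses in g/mol.
GLUCOSE_G_PER_MOL = 180.16
ETHANOL_G_PER_MOL = 46.07

# Assumed stoichiometry: C6H12O6 -> 2 C2H5OH + 2 CO2.
# Two moles of ethanol per mole of hexose gives about 0.511 g ethanol per g glucose.
THEORETICAL_YIELD = 2 * ETHANOL_G_PER_MOL / GLUCOSE_G_PER_MOL


def percent_of_theoretical(ethanol_g: float, hexose_consumed_g: float) -> float:
    """Return the ethanol yield as a percentage of the stoichiometric maximum."""
    if hexose_consumed_g <= 0:
        raise ValueError("hexose_consumed_g must be positive")
    return 100.0 * (ethanol_g / hexose_consumed_g) / THEORETICAL_YIELD


def fold_increase(engineered_g: float, reference_g: float) -> float:
    """Return the ratio of engineered-strain to reference-strain ethanol production."""
    if reference_g <= 0:
        raise ValueError("reference_g must be positive")
    return engineered_g / reference_g


if __name__ == "__main__":
    # Hypothetical measurement: 46 g of ethanol from 100 g of glucose consumed.
    pct = percent_of_theoretical(46.0, 100.0)
    print(f"{pct:.1f}% of theoretical yield")
    # Hypothetical comparison: 20 g from the engineered strain, 10 g from the parent.
    print(f"{fold_increase(20.0, 10.0):.1f}-fold increase")
```

Under these assumptions, a yield of 0.46 grams of ethanol per gram of glucose falls within the range of about 85% to about 99% of theoretical yield and exceeds 0.4 grams of ethanol per gram of feedstock.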
The entirety of each patent, patent application, publication and document referenced herein hereby is incorporated by reference. Citation of the above patents, patent applications, publications and documents is not an admission that any of the foregoing is pertinent prior art, nor does it constitute any admission as to the contents or date of these publications or documents.
Modifications may be made to the foregoing without departing from the basic aspects of the technology. Although the technology has been described in substantial detail with reference to one or more specific embodiments, those of ordinary skill in the art will recognize that changes may be made to the embodiments specifically disclosed in this application, yet these modifications and improvements are within the scope and spirit of the technology. The technology illustratively described herein suitably may be practiced in the absence of any element(s) not specifically disclosed herein. Thus, for example, in each instance herein any of the terms "comprising," "consisting essentially of," and "consisting of" may be replaced with either of the other two terms. The terms and expressions which have been employed are used as terms of description and not of limitation, and use of such terms and expressions does not exclude any equivalents of the features shown and described or portions thereof, and various modifications are possible within the scope of the claimed technology. The term "a" or "an" can refer to one of or a plurality of the elements it modifies (e.g., "a reagent" can mean one or more reagents) unless it is contextually clear either one of the elements or more than one of the elements is described. The term "about" as used herein refers to a value within 10% of the underlying parameter (i.e., plus or minus 10%), and use of the term "about" at the beginning of a string of values modifies each of the values (i.e., "about 1, 2 and 3" refers to about 1, about 2 and about 3). For example, a weight of "about 100 grams" can include weights between 90 grams and 110 grams. Further, when a listing of values is described herein (e.g., about 50%, 60%, 70%, 80%, 85% or 86%) the listing includes all intermediate and fractional values thereof (e.g., 54%, 85.4%).
Thus, it should be understood that although the present technology has been specifically disclosed by representative embodiments and optional features, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and such modifications and variations are considered within the scope of this technology. Certain embodiments of the technology are set forth in the claim(s) that follow(s).
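By way of illustration only, the definition of "about" given above (a value within plus or minus 10% of the underlying parameter) can be sketched as a simple range check. The helper names below are hypothetical, and treating the endpoints as included in the range is an assumption, since the text states only that "about 100 grams" can include weights between 90 grams and 110 grams.

```python
from typing import Tuple


def about_range(value: float, tolerance: float = 0.10) -> Tuple[float, float]:
    """Return the (low, high) bounds covered by 'about value' at the given tolerance."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def is_about(observed: float, value: float, tolerance: float = 0.10) -> bool:
    """Return True if observed lies within the 'about value' range.

    Endpoints are treated as inside the range (an assumption of this sketch).
    """
    low, high = about_range(value, tolerance)
    return low <= observed <= high


if __name__ == "__main__":
    # "about 100 grams" spans roughly 90 grams to 110 grams.
    print(about_range(100.0))
    # "about 1, 2 and 3" applies the same tolerance to each listed value.
    for v in (1.0, 2.0, 3.0):
        print(v, about_range(v))
```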

Claims

What is claimed is:
1. An isolated nucleic acid comprising a polynucleotide that is 80% or more identical to SEQ ID NO: 179.
2. The nucleic acid of claim 1, wherein the polynucleotide encodes an amino acid sequence of SEQ ID NO: 180.
3. The nucleic acid of claim 1, which comprises a polynucleotide that is 85% or more identical to SEQ ID NO: 179.
4. The nucleic acid of claim 1, which comprises a polynucleotide that is 90% or more identical to SEQ ID NO: 179.
5. The nucleic acid of claim 1, which comprises a polynucleotide that is 95% or more identical to SEQ ID NO: 179.
6. The nucleic acid of claim 1, which comprises the polynucleotide of SEQ ID NO: 179.
7. The nucleic acid of claim 1, consisting of the polynucleotide of SEQ ID NO: 179.
8. The nucleic acid of any one of claims 1 to 7, which is an expression vector.
9. An engineered microorganism comprising a polynucleotide that is 80% or more identical to SEQ ID NO: 179.
10. The engineered microorganism of claim 9, which is a eukaryote.
11. The engineered microorganism of claim 10, which is a yeast.
12. The engineered microorganism of claim 11, wherein the yeast is a Saccharomyces yeast.
13. The engineered microorganism of claim 11, wherein the yeast is a Saccharomyces cerevisiae yeast.
14. A method for producing ethanol, which comprises contacting the engineered microorganism of any one of claims 9 to 13 with a 5 carbon sugar, a 6 carbon sugar, or mixture comprising 5 carbon and 6 carbon sugars, under fermentation conditions, whereby ethanol is produced by the engineered microorganism.
15. The method of claim 14, wherein the engineered microorganism synthesizes ethanol to about 85% to 99% of theoretical yield.
16. The method of claim 14 or 15, which comprises recovering ethanol synthesized by the engineered microorganism.
17. The method of any one of claims 14 to 16, wherein the amount of ethanol produced by the engineered microorganism is about 2-fold to about 100-fold increased compared to the amount of ethanol produced by wild-type, parental, or partially engineered microorganisms of the same strain, under identical fermentation conditions.
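
For illustration only, and without limiting the claims, the percent-identity thresholds recited in claims 1, 3, 4, 5 and 9 can be sketched as a comparison over two sequences that have already been aligned. The sketch assumes one simple convention (identical positions divided by alignment length, with gap characters written as "-" and never counted as matches); other conventions exist, and the convention governing this application may differ. The function names and the short toy sequences are hypothetical and are not SEQ ID NO: 179.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity between two pre-aligned sequences of equal length.

    Assumed convention: matches / alignment length * 100, gaps ("-") never match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """Return True if the pair is 'threshold % or more identical' under this convention."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy ten-nucleotide sequences differing at one position.
    reference = "ATGGCCAAGT"
    candidate = "ATGGCTAAGT"
    print(percent_identity(reference, candidate))
    for threshold in (80.0, 85.0, 90.0, 95.0):
        print(threshold, meets_threshold(reference, candidate, threshold))
```

In practice the alignment itself would be produced by an established alignment tool before any such comparison is made; this sketch covers only the final counting step.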
PCT/US2012/020982 2011-01-12 2012-01-11 Engineered microorganisms with enhanced fermentation activity WO2012097091A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201161432104P 2011-01-12 2011-01-12
US61/432,104 2011-01-12

Publications (2)

Publication Number Publication Date
WO2012097091A2 (en) 2012-07-19
WO2012097091A3 (en) 2012-10-26

Family

ID=46507653

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2012/020982 WO2012097091A2 (en) 2011-01-12 2012-01-11 Engineered microorganisms with enhanced fermentation activity

Country Status (1)

Country Link
WO (1) WO2012097091A2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010039692A2 (en) * 2008-09-30 2010-04-08 The United States Of America, As Represented By The Secretary Of Agriculture Transformed saccharomyces cerevisiae engineered for xylose utilization

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DATABASE NCBI 10 August 2010 'RecName: Full=Xylose isomerase' Database accession no. Q9S306 *
DATABASE NCBI 15 April 2005 'Ruminococcus flavefaciens xylan utilization operon' Database accession no. AJ132472 *
KUYPER, M. ET AL.: 'Evolutionary engineering of mixed-sugar utilization by a xylose-fermenting Saccharomyces cerevisiae strain' FEMS YEAST RESEARCH. vol. 5, no. 10, July 2005, pages 925 - 934 *
KUYPER, M. ET AL.: 'High-level functional expression of a fungal xylose isomerase: the key to efficient ethanolic fermentation of xylose by Saccharomyces cerevisiae?' FEMS YEAST RESEARCH. vol. 4, no. 1, October 2003, pages 69 - 78 *
MADHAVAN, A. ET AL.: 'Alcoholic fermentation of xylose and mixed sugars using recombinant Saccharomyces cerevisiae engineered for xylose utilization' APPLIED MICROBIOLOGY AND BIOTECHNOLOGY. vol. 82, no. 6, 06 June 2009, pages 1037 - 1047 *

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011150313A1 (en) 2010-05-28 2011-12-01 Codexis, Inc. Pentose fermentation by a recombinant microorganism
US9752171B2 (en) 2010-05-28 2017-09-05 Codexis, Inc. Pentose fermentation by a recombinant microorganism
EP2576804A1 (en) * 2010-05-28 2013-04-10 Codexis, Inc. Pentose fermentation by a recombinant microorganism
EP2576804A4 (en) * 2010-05-28 2013-11-27 Codexis Inc Pentose fermentation by a recombinant microorganism
US9926347B2 (en) 2013-11-05 2018-03-27 Board Of Regents, The University Of Texas System Methods for engineering sugar transporter preferences
WO2015069796A1 (en) * 2013-11-05 2015-05-14 Board Of Regents, The University Of Texas System Methods for engineering sugar transporter preferences
WO2016016805A1 (en) 2014-08-01 2016-02-04 Versalis S.P.A. Gene construct for the transformation of yeast strains
WO2016024218A1 (en) * 2014-08-11 2016-02-18 Lallemand Hungary Liquidity Management Llc Chimeric polypeptides having xylose isomerase activity
CN106795508A (en) * 2014-08-11 2017-05-31 拉勒曼德匈牙利流动管理有限责任公司 Chimeric polyeptides with xylose isomerase activity
EP3180428A1 (en) * 2014-08-11 2017-06-21 Lallemand Hungary Liquidity Management LLC Chimeric polypeptides having xylose isomerase activity
US11306304B2 (en) 2014-08-11 2022-04-19 Lallemand Hungary Liquidity Management Llc Mutations in iron-sulfur cluster proteins that improve xylose utilization
US10619147B2 (en) 2014-08-11 2020-04-14 Lallemand Hungary Liquidity Management Llc Chimeric polypeptides having xylose isomerase activity
US11034949B2 (en) 2014-08-11 2021-06-15 Lallemand Hungary Liquidity Management Llc Chimeric polypeptides having xylose isomerase activity
US20170335358A1 (en) * 2016-05-18 2017-11-23 Korea University Research And Business Foundation Microorganism having improved ability to produce n-acetylglucosamine as a result of modulating glycolytic flux
US10724015B2 (en) * 2016-05-18 2020-07-28 Korea University Research And Business Foundation Microorganism having improved ability to produce N-acetylglucosamine as a result of modulating glycolytic flux
WO2018220116A1 (en) * 2017-05-31 2018-12-06 Novozymes A/S Xylose fermenting yeast strains and processes thereof for ethanol production
US11091753B2 (en) 2017-05-31 2021-08-17 Novozymes A/S Xylose fermenting yeast strains and processes thereof for ethanol production
WO2019011948A1 (en) * 2017-07-11 2019-01-17 Adisseo France S.A.S. Enhanced metabolite-producing yeast
CN110869488A (en) * 2017-07-11 2020-03-06 安迪苏法国联合股份有限公司 Enhanced metabolite production yeast
US11319564B2 (en) 2017-07-11 2022-05-03 Adisseo France S.A.S. Enhanced metabolite-producing yeast
CN110869488B (en) * 2017-07-11 2024-04-05 安迪苏法国联合股份有限公司 Enhanced metabolite producing yeast
CN114107357A (en) * 2020-08-27 2022-03-01 中国科学院天津工业生物技术研究所 Recombinant filamentous fungus for producing ethanol by phosphofructokinase2 mutant, construction thereof and application of recombinant filamentous fungus for producing ethanol
CN114107357B (en) * 2020-08-27 2024-04-05 中国科学院天津工业生物技术研究所 Recombinant filamentous fungus for producing ethanol by phosphofructokinase2 mutant, construction thereof and application of recombinant filamentous fungus in ethanol production
CN114592023A (en) * 2022-03-31 2022-06-07 杭州优玛达生物科技有限公司 Cell lysis self-assembly polypeptide compound, self-assembly method, self-assembly polypeptide preparation and application

Also Published As

Publication number Publication date
WO2012097091A3 (en) 2012-10-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 12734491; Country of ref document: EP; Kind code of ref document: A2)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 12734491; Country of ref document: EP; Kind code of ref document: A2)