CN117897490A - Recombinant yeast cells - Google Patents

Recombinant yeast cells

Publication number
CN117897490A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280059116.5A
Other languages
Chinese (zh)
Inventor
S·L·罗塞尔-阿拉贡特
M·L·A·詹森
I·M·武格特-范卢茨
J·P·J·施密茨
E·T·范里奇
R·M·德容
H·M·C·J·德布吕伊恩
P·E·布雷曼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Danisco US Inc
Original Assignee
Danisco US Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Danisco US Inc
Priority claimed from PCT/EP2022/068917 (WO2023285280A1)
Publication of CN117897490A

Landscapes

  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

Disclosed is a recombinant yeast cell that functionally expresses:
-a nucleic acid sequence encoding a native protein having transketolase activity (EC 2.2.1.1); and
-a nucleic acid sequence encoding a heterologous protein having transketolase activity (EC 2.2.1.1).

Description

Recombinant yeast cells
Technical Field
The present invention relates to a recombinant yeast cell and a method for producing ethanol, wherein the recombinant yeast cell is used.
Background
Microbial fermentation processes are suitable for the industrial production of a wide and rapidly expanding range of chemical compounds from renewable carbohydrate feedstocks. In anaerobic fermentation processes in particular, the redox balance of the cofactor pair NADH/NAD+ may impose significant limitations on product yield. An example of this challenge is the formation of glycerol as a major byproduct in the industrial production of, for example, fuel ethanol by Saccharomyces cerevisiae, which is a direct consequence of the need to reoxidize NADH formed in biosynthetic reactions.
Ethanol production by Saccharomyces cerevisiae is currently, by volume, the largest single fermentation process in industrial biotechnology. Various methods have been proposed to improve the fermentation properties of organisms used in industrial biotechnology by genetic modification. A significant challenge associated with the stoichiometry of yeast-based ethanol production is that large amounts of NADH-dependent byproducts (such as glycerol) are typically formed, especially under anaerobic and oxygen-limited conditions or under conditions where respiration is otherwise limited or absent. It is estimated that in a typical industrial ethanol process, up to about 4 wt.% of the sugar feedstock is converted to glycerol (Nissen et al., "Anaerobic and aerobic batch cultivations of Saccharomyces cerevisiae mutants impaired in glycerol synthesis", (2000), Yeast, volume 16, pages 463-474). Under conditions ideal for anaerobic growth, the conversion to glycerol may be even higher, up to about 10%.
Glycerol production under anaerobic conditions is mainly linked to redox metabolism. During anaerobic growth of Saccharomyces cerevisiae (S. cerevisiae), sugar dissimilation occurs via alcoholic fermentation. In this process, the NADH formed in the glycolytic glyceraldehyde-3-phosphate dehydrogenase reaction is reoxidized by converting acetaldehyde, formed by decarboxylation of pyruvate, into ethanol via NAD+-dependent alcohol dehydrogenase. The fixed stoichiometry of this redox-neutral dissimilatory pathway causes problems when a net reduction of NAD+ to NADH occurs elsewhere in metabolism. Under anaerobic conditions, NADH reoxidation in Saccharomyces cerevisiae is strictly dependent on the reduction of sugar to glycerol. Glycerol formation is initiated by the reduction of the glycolytic intermediate dihydroxyacetone phosphate (DHAP) to glycerol 3-phosphate (glycerol-3P), a reaction catalyzed by NAD+-dependent glycerol 3-phosphate dehydrogenase. Subsequently, the glycerol 3-phosphate formed in this reaction is hydrolyzed by glycerol-3-phosphatase to produce glycerol and inorganic phosphate. Glycerol is therefore a major byproduct in the anaerobic production of ethanol by Saccharomyces cerevisiae, which is undesirable because it reduces the overall conversion of sugar to ethanol. Furthermore, the presence of glycerol in the effluent of an ethanol production plant may increase the cost of wastewater treatment.
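The redox steps described above can be summarized as the following reaction scheme. It is a textbook-level sketch added for orientation only, not taken from the application; P_i denotes inorganic phosphate. The first three reactions form the redox-neutral route to ethanol, and the last two are the glycerol route that reoxidizes surplus NADH.

```latex
\begin{align*}
\text{glyceraldehyde-3P} + \text{NAD}^+ + \text{P}_i
  &\rightarrow \text{1,3-bisphosphoglycerate} + \text{NADH} + \text{H}^+ \\
\text{pyruvate}
  &\rightarrow \text{acetaldehyde} + \text{CO}_2 \\
\text{acetaldehyde} + \text{NADH} + \text{H}^+
  &\rightarrow \text{ethanol} + \text{NAD}^+ \\[6pt]
\text{DHAP} + \text{NADH} + \text{H}^+
  &\rightarrow \text{glycerol-3P} + \text{NAD}^+ \\
\text{glycerol-3P} + \text{H}_2\text{O}
  &\rightarrow \text{glycerol} + \text{P}_i
\end{align*}
```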
In the literature, several different methods have been reported that can help reduce the amount of by-product glycerol and shift the carbon to ethanol, resulting in an increase in ethanol yield per gram of fermented carbohydrate.
In WO 2011/010923, WO 2014/081803, WO 2014/129898, WO 2015/107496, WO 2015/148272, WO 2018/172328 and WO 2018/228836, several alternative reduction pathways for reoxidizing the NADH formed, also referred to as alternative "redox sinks", have been proposed and implemented in (recombinant) yeast cells.
For example, in WO 2011/010923, the formation of the NADH-related byproduct glycerol in a process for producing ethanol from a carbohydrate-containing feedstock is addressed by providing a recombinant yeast cell comprising one or more recombinant nucleic acid sequences encoding NAD+-dependent acetylating acetaldehyde dehydrogenase (EC 1.2.1.10) activity. The cell may, for example, lack the enzymatic activity required for NADH-dependent glycerol synthesis, or the cell may have reduced enzymatic activity in NADH-dependent glycerol synthesis compared to its corresponding wild-type yeast cell.
WO 2014/129898 describes recombinant cells functionally expressing heterologous nucleic acid sequences encoding ribulose-1,5-bisphosphate carboxylase/oxygenase (EC 4.1.1.39; abbreviated herein as "Rubisco"), optionally molecular chaperones for Rubisco, and phosphoribulokinase (EC 2.7.1.19; abbreviated herein as "PRK"). Furthermore, the use of carbon dioxide as an electron acceptor in recombinant autotrophic microorganisms is mentioned.
Although the described methods and yeast cells are advantageous, continued improvement is desirable. In an industrial setting, the above reductions in glycerol production by recombinant yeast cells can potentially affect their tolerance to hypertonicity and their stress response to the external environment. Especially under challenging process conditions, for example when a fermentation medium with a high dry solids content and/or a high fermentation temperature is applied, this may lead to a decrease of the cell population and/or the cell activity at the end of the fermentation period. It would be an advance in the art to provide such methods and yeast cells for use therein, wherein the yeast cells have improved robustness under high dry solids/high dry matter conditions and/or high temperatures. Furthermore, it would be an advance in the art to provide yeast cells having reduced glucose accumulation and/or total sugar content within the yeast cells. That is, it would be an advance in the art to achieve sustained performance of yeast cells and/or low concentrations of residual glucose at the end of fermentation, even in the presence of high concentrations of glucose at the beginning of fermentation and/or throughout the fermentation process.
Disclosure of Invention
The inventors have now unexpectedly found that the methods and yeast cells of the prior art can be improved by using specific combinations of transketolase proteins.
Accordingly, the present invention provides a recombinant yeast cell that functionally expresses:
-a nucleic acid sequence encoding a native protein having transketolase activity; and
-a nucleic acid sequence encoding a heterologous protein having transketolase activity.
In addition, the present invention provides a method for producing ethanol, comprising converting a carbon source (such as a carbohydrate or another organic carbon source) using the above recombinant yeast cell, thereby suitably forming ethanol.
Advantageously, the use of the above recombinant yeast cells and/or the above methods results in improved robustness. This is particularly advantageous when media with a high dry solids content are used and/or if high fermentation temperatures are used.
The process of producing ethanol from a carbon source, such as a carbohydrate, may advantageously be performed in the presence of a glycosylase, such as a glucoamylase, to convert polysaccharides and/or oligosaccharides to glucose. When the process is performed in a medium with a high dry matter content, for example when the process is started with a high concentration of corn mash, the concentration of glucose in the medium may become very high. Without wishing to be bound by any type of theory, it is believed that high concentrations of glucose may cause osmotic stress in the yeast cells, causing the yeast cells to stop performing or even to die.
Without wishing to be bound by any type of theory, it is believed that the above recombinant yeast cells allow for a reduced accumulation of glucose and/or other sugars within the yeast cells, thereby suitably allowing for improved robustness, compared to yeast cells that do not comprise a combination of the claimed transketolase proteins.
The advantages are illustrated by way of example. In the examples, the fermentation is carried out at a high dry matter content of 36% w/w. As demonstrated by the examples, the recombinant yeast cells according to the invention and the methods according to the invention allow for continuous performance of the yeast cells and/or continuous conversion of glucose. Recombinant yeast cells survived after 66 hours and converted carbohydrates to ethanol even in media containing glucose at concentrations up to 36% w/w and/or at temperatures up to 32 ℃. Thus, even in the case where a high concentration of glucose is present at the beginning of the fermentation and/or throughout the fermentation, a low concentration of residual glucose can be obtained at the end of the fermentation.
Description of sequence Listing
The present application contains a sequence listing in computer readable form, which is incorporated herein by reference. Table 1 below provides an overview.
Table 1: overview of the sequence listing:
In the context of the present patent application, each of the above protein/amino acid sequences is preferably encoded by a DNA/nucleic acid sequence optimized for expression in yeast, more preferably for expression in Saccharomyces cerevisiae.
Detailed Description
Definitions
Unless defined otherwise or clearly indicated by context, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art.
Throughout this specification and the claims which follow, the words "comprise" and "include" and variations such as "comprises", "comprising", "includes" and "including" are to be interpreted as being inclusive. That is, where the context permits, these words are intended to convey that other elements or integers not specifically enumerated may be included.
The articles "a" and "an" are used herein to refer to one or to more than one (i.e., to one or at least one) of the grammatical object of the article. For example, "an element" may mean one element or more than one element. When referring to a noun (e.g., a compound, an additive, etc.) in the singular, the plural is intended to be included. Thus, when referring to a particular moiety (e.g., "a gene"), unless otherwise specified, this means "at least one" of that moiety, e.g., "at least one gene".
When referring to a compound in which several isomers (e.g., D and L enantiomers) are present, the compound in principle includes all enantiomers, diastereomers and cis/trans isomers of the compound that may be used in certain aspects of the invention; in particular when referring to such a compound, it includes one or more of the natural isomers.
The various embodiments of the invention described herein may be cross-combined unless explicitly indicated otherwise.
The term "carbon source" refers to a source of carbon, preferably a compound or molecule comprising carbon. Preferably, the carbon source is a carbohydrate. Carbohydrates are understood herein as organic compounds consisting of carbon, oxygen and hydrogen. Suitably, the carbon source may be selected from the group consisting of: monosaccharides, disaccharides and/or polysaccharides, acids and acid salts. More preferably, the carbon source is a compound selected from the group consisting of: glucose, arabinose, xylose, galactose, mannose, rhamnose, fructose, glycerol and acetic acid or salts thereof.
The terms "dry matter" and "dry solids" (abbreviated as "DM" and "DS", respectively) are used interchangeably herein and refer to the material remaining after removal of water. Thus, the dry matter content may be determined by any method known to a person skilled in the art.
The term "fermentation" and variants thereof, such as "ferment", "fermenting" and/or "fermented", are used herein in the classical sense, i.e., to indicate that a process is or has been performed under anaerobic conditions. Anaerobic fermentation is defined herein as fermentation performed under anaerobic conditions. Anaerobic conditions are defined herein as conditions without any oxygen, or conditions in which the yeast cells consume essentially no oxygen. Conditions in which essentially no oxygen is consumed suitably correspond to an oxygen consumption of less than 5 mmol/L/h, in particular an oxygen consumption of less than 2.5 mmol/L/h or less than 1 mmol/L/h. More preferably, 0 mmol/L/h is consumed (i.e., oxygen consumption is undetectable). This suitably corresponds to a dissolved oxygen concentration in the culture broth of less than 5% of air saturation, more suitably less than 1% of air saturation or less than 0.2% of air saturation.
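As an illustration only, the oxygen-consumption limits in the definition above can be expressed as a small check. The function name and the idea of reporting the strictest limit met are assumptions made for this sketch; only the numerical limits (5, 2.5 and 1 mmol/L/h) come from the text.

```python
from typing import Optional

# Limits taken from the definition above, in mmol of O2 per litre per hour,
# ordered from strictest to most lenient.
OXYGEN_LIMITS_MMOL_PER_L_H = (1.0, 2.5, 5.0)


def strictest_limit_met(oxygen_uptake: float) -> Optional[float]:
    """Return the strictest limit that the measured uptake stays below.

    Returns None when the uptake is 5 mmol/L/h or more, i.e. when the
    culture does not meet the definition of 'essentially no oxygen consumed'.
    (Hypothetical helper; not part of the application.)
    """
    if oxygen_uptake < 0:
        raise ValueError("oxygen uptake cannot be negative")
    for limit in OXYGEN_LIMITS_MMOL_PER_L_H:
        if oxygen_uptake < limit:
            return limit
    return None


if __name__ == "__main__":
    for rate in (0.0, 0.5, 3.0, 6.0):
        print(rate, "->", strictest_limit_met(rate))
```

An uptake of exactly 0 mmol/L/h falls under the strictest limit and corresponds to the "more preferably" case of undetectable oxygen consumption.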
The term "fermentation process" refers to a process for preparing or producing a fermentation product.
The term "cell" refers to a eukaryotic organism or a prokaryotic organism, preferably present as a single cell. In the present invention, the cells are recombinant yeast cells. That is, the recombinant cell is selected from the group of genera consisting of yeasts.
The terms "yeast" and "yeast cell" are used interchangeably herein and refer to a phylogenetically diverse group of single-celled fungi, most of which belong to the phyla Ascomycota and Basidiomycota. Budding yeasts ("true yeasts") are classified in the order Saccharomycetales. The yeast cell according to the invention is preferably a yeast cell derived from the genus Saccharomyces. More preferably, the yeast cell is a yeast cell of the species Saccharomyces cerevisiae.
As used herein, the term "recombinant" (e.g., in "recombinant yeast", "recombinant cell", "recombinant microorganism" and/or "recombinant strain") refers to a yeast, cell, microorganism or strain, respectively, that contains nucleic acid resulting from one or more genetic modifications. Put simply, the yeast, cell, microorganism or strain contains a different combination of nucleic acid than (either of) its parent(s). To construct a recombinant yeast, cell, microorganism or strain, one or more recombinant DNA techniques and/or one or more other mutagenesis techniques may be used. For example, a recombinant yeast and/or recombinant yeast cell may comprise a nucleic acid that is not present in the corresponding wild-type yeast and/or cell, which nucleic acid has been introduced into that yeast and/or cell using recombinant DNA techniques (i.e., a transgenic yeast and/or cell); or which nucleic acid is not present in the wild-type yeast and/or cell as a result of one or more mutations (e.g., made using recombinant DNA techniques or another mutagenesis technique such as UV irradiation) in a nucleic acid sequence (such as a gene encoding a wild-type polypeptide) present in the wild-type yeast and/or yeast cell; or wherein the nucleic acid sequence of a gene has been modified to target the polypeptide product that it encodes to another cellular compartment. Furthermore, the term "recombinant" may suitably relate to, for example, a yeast, cell, microorganism or strain from which nucleic acid sequences have been removed using recombinant DNA techniques.
A recombinant yeast comprising or having a certain activity is understood herein to mean that the recombinant yeast may comprise one or more nucleic acid sequences encoding a protein having such activity, thereby allowing the recombinant yeast to functionally express such a protein or enzyme.
The term "functionally express" means that there is functional transcription of the relevant nucleic acid sequence, allowing the nucleic acid sequence to be actually transcribed, for example resulting in the synthesis of a protein.
As used herein, the term "transgenic" (e.g., in "transgenic yeast" and/or "transgenic cell") refers to a yeast and/or cell, respectively, that contains nucleic acid that does not naturally occur in that yeast and/or cell and that has been introduced into it using, for example, recombinant DNA techniques, i.e., a recombinant yeast and/or cell.
The term "mutated", as used herein with respect to a protein or polypeptide, means that, compared to the wild-type or naturally occurring protein or polypeptide sequence, at least one amino acid has been replaced by a different amino acid, inserted into the sequence, or deleted from the sequence. Amino acid substitutions, insertions or deletions may be made, for example, by mutagenesis of the nucleic acid encoding the amino acids. Mutagenesis is a method well known in the art and includes site-directed mutagenesis, e.g., by means of PCR or via oligonucleotide-mediated mutagenesis, as described in Sambrook et al., Molecular Cloning - A Laboratory Manual, 2nd edition, volumes 1-3 (1989), published by Cold Spring Harbor Publishing.
The term "mutated", as used herein with respect to a gene, means that, compared to the wild-type or naturally occurring nucleic acid sequence, at least one nucleotide in the nucleic acid sequence of the gene or of its regulatory sequence has been replaced by a different nucleotide, inserted into the sequence, or deleted from the sequence. Such substitutions, insertions or deletions may be effected, for example, via mutagenesis, resulting in, for example, the expression of a protein sequence with qualitatively or quantitatively altered function, or in a knockout of the gene. In the context of the present invention, an "altered gene" has the same meaning as a mutated gene.
As used herein, the term "gene" or "genes" refers to a nucleic acid sequence that can be transcribed into mRNA, which is then translated into a protein. A gene encoding a protein refers to one or more nucleic acid sequences encoding such a protein.
As used herein, the term "nucleic acid" refers to a deoxyribonucleotide or ribonucleotide polymer (i.e., a polynucleotide) in either single- or double-stranded form and, unless otherwise limited, encompasses known analogs having the essential properties of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide nucleic acids). For example, an enzyme defined by a nucleotide sequence encoding the enzyme includes (unless otherwise limited) a nucleotide sequence that hybridizes to a reference nucleotide sequence encoding the enzyme. A polynucleotide may be the full length or a subsequence of a native or heterologous structural or regulatory gene. Unless otherwise indicated, the term includes reference to a specified sequence and its complement. Thus, DNA or RNA with a backbone modified for stability or for other reasons is a "polynucleotide" as that term is intended herein. In addition, DNA or RNA comprising unusual bases (such as inosine) or modified bases (such as tritylated bases), to name just two examples, is a polynucleotide as the term is used herein. It will be appreciated that a wide variety of modifications have been made to DNA and RNA that serve many useful purposes known to those skilled in the art. The term polynucleotide as used herein includes such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as the chemical forms of DNA and RNA that are characteristic of viruses and cells (including, inter alia, simple and complex cells).
The terms "nucleotide sequence" and "nucleic acid sequence" are used interchangeably herein. An example of a nucleic acid sequence is a DNA sequence.
The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues, for example, as displayed by an amino acid sequence. These terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogues of the corresponding naturally occurring amino acid, as well as naturally occurring amino acid polymers. An essential attribute of such analogs of naturally occurring amino acids is that when incorporated into a protein, the protein is specifically reactive to antibodies raised by proteins that are identical but consist entirely of naturally occurring amino acids. The terms "polypeptide", "peptide" and "protein" also include modifications including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
The term "enzyme" refers herein to a protein having a catalytic function. The terms "protein" and "enzyme" may be used interchangeably herein in the context of a protein catalyzing a certain biological reaction. When an enzyme is mentioned with reference to an enzyme class (EC), the enzyme class is a class in which the enzyme is classified or may be classified according to the enzyme nomenclature provided by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB), which nomenclature may be found at http://www.chem.qmul.ac.uk/IUBMB/enzyme. Other suitable enzymes that have not (yet) been classified in a given class but may be so classified are intended to be included.
If a protein or nucleic acid sequence (such as a gene) is referred to herein by reference to an accession number, this number is used specifically to refer to a protein or nucleic acid sequence (gene) having a sequence that can be found via www.ncbi.nlm.nih.gov/(available 10/1/2020), unless otherwise specified.
Each nucleic acid sequence herein that encodes a polypeptide also includes any conservatively modified variant thereof; by reference to the genetic code, it thus also describes every possible silent variation of the nucleic acid. The term "conservatively modified variants" applies to both amino acid and nucleic acid sequences. With respect to specific nucleic acid sequences, conservatively modified variants refers to those nucleic acids which, due to the degeneracy of the genetic code, encode identical amino acid sequences or conservatively modified variants of the amino acid sequences. The term "degeneracy of the genetic code" refers to the fact that a large number of functionally identical nucleic acids encode any given protein. For example, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at each position where an alanine is specified by a codon, the codon can be changed to any of the corresponding codons described without changing the encoded polypeptide. Such nucleic acid variations are "silent variations" and represent one type of conservatively modified variation.
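The notion of silent variation can be made concrete with a short sketch. The codon table below is deliberately a small excerpt of the standard genetic code (methionine, alanine, lysine and tryptophan only), and the function names are hypothetical; the example enumerates every RNA sequence that encodes the same peptide as a given coding sequence.

```python
from itertools import product

# Excerpt of the standard genetic code (RNA codons); illustrative only.
CODONS = {
    "Met": ["AUG"],
    "Ala": ["GCA", "GCC", "GCG", "GCU"],
    "Lys": ["AAA", "AAG"],
    "Trp": ["UGG"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def translate(rna):
    """Translate an RNA coding sequence into a list of amino acids.

    The length must be a multiple of three and every codon must be in
    the excerpt above, otherwise a KeyError is raised.
    """
    return [CODON_TO_AA[rna[i:i + 3]] for i in range(0, len(rna), 3)]


def silent_variants(rna):
    """Return every coding sequence that encodes the same peptide as `rna`."""
    peptide = translate(rna)
    choices = (CODONS[aa] for aa in peptide)
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    original = "AUGGCAAAA"  # Met-Ala-Lys
    for variant in silent_variants(original):
        print(variant, translate(variant))
```

For Met-Ala-Lys the number of variants is the product of the codon counts per residue (1 x 4 x 2), which is why even short peptides correspond to many functionally identical nucleic acids.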
As used herein, the term "functional homolog" (or simply "homolog") of a polypeptide and/or amino acid sequence having a particular sequence (e.g., "SEQ ID NO: X") refers to a polypeptide and/or amino acid sequence comprising said particular sequence, provided that one or more amino acids are mutated, substituted, deleted, added and/or inserted, and that the polypeptide has (qualitatively) the same enzymatic function for substrate conversion.
As used herein, the term "functional homolog" (or simply "homolog") of a polynucleotide and/or nucleic acid sequence having a particular sequence (e.g., "SEQ ID NO: X") refers to a polynucleotide and/or nucleic acid sequence comprising said particular sequence, provided that one or more nucleic acids are mutated, substituted, deleted, added and/or inserted, and that the polynucleotide encodes a polypeptide sequence having (qualitatively) the same enzymatic function for substrate conversion. With respect to nucleic acid sequences, the term functional homolog is intended to include nucleic acid sequences that differ from another nucleic acid sequence due to the degeneracy of the genetic code and that encode the same polypeptide sequence.
Sequence identity is defined herein as the relationship between two or more amino acid (polypeptide or protein) sequences or two or more nucleic acid (polynucleotide) sequences, as determined by comparing the sequences. Typically, sequence identity or similarity is compared over the entire length of the sequences being compared. "identity" also means in the art the degree of sequence relatedness between amino acid or nucleic acid sequences, as the case may be, as determined by the match between strings of such sequences.
Amino acid or nucleotide sequences are said to be homologous when they exhibit a certain level of similarity. Two sequences being homologous indicates a common evolutionary origin. Whether two homologous sequences are closely related or more distantly related is indicated by a "percent identity" or "percent similarity" that is high or low, respectively. Although disputed, "percent identity" or "percent similarity", "level of homology" and "percent homology" are often used interchangeably. Comparison of sequences and determination of the percent identity between two sequences may be accomplished using a mathematical algorithm. The skilled artisan will be aware that several different computer programs are available for aligning two sequences and determining the homology between them (Kruskal et al., "An overview of sequence comparison: time warps, string edits, and macromolecules", (1983), Society for Industrial and Applied Mathematics (SIAM), volume 25, issue 2, pages 201-237, and the handbook edited by D. Sankoff and J. B. Kruskal, "Time warps, string edits and macromolecules: the theory and practice of sequence comparison", (1983), pages 1-44, published by Addison-Wesley Publishing Company, Massachusetts, USA).
The percent identity between two amino acid sequences can be determined by aligning the two sequences using the Needleman and Wunsch algorithm (Needleman et al., "A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins", (1970), J. Mol. Biol., volume 48, pages 443-453). The algorithm aligns amino acid sequences as well as nucleotide sequences. The Needleman-Wunsch algorithm has been implemented in the computer program NEEDLE. For the purposes of the present invention, the NEEDLE program from the EMBOSS package is used (version 2.8.0 or higher; see Rice et al., "EMBOSS: the European Molecular Biology Open Software Suite", (2000), Trends in Genetics, volume 16, issue 6, pages 276-277, http://emboss.bioinformatics.nl/). For protein sequences, EBLOSUM62 is used as the substitution matrix. For nucleotide sequences, EDNAFULL is used. Other matrices may be specified. The optional parameters used for the alignment of amino acid sequences are a gap-open penalty of 10 and a gap-extension penalty of 0.5. The skilled person will appreciate that all of these different parameters will produce slightly different results, but that the overall percent identity of two sequences does not change significantly when different algorithms are used.
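To show what a global alignment of the Needleman-Wunsch type does, here is a minimal sketch. It is not the NEEDLE implementation: instead of the EBLOSUM62 matrix and the affine gap penalties (open 10, extend 0.5) named above, it assumes a simplified scoring of +1 for a match, -1 for a mismatch and -1 per gap position, and the function name is hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return (score, aligned_a, aligned_b).

    Simplified scoring (assumption for this sketch): fixed match/mismatch
    scores and a linear gap penalty, rather than a substitution matrix
    with affine gap penalties.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for aligning residue a[i-1] against residue b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Because the alignment is global, both sequences are aligned end to end and the two returned strings always have the same length, which is what the identity calculations in this section operate on.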
Homology or identity is the percentage of identical matches between two complete sequences over the total alignment region including any gaps or extensions. Homology or identity between two aligned sequences is calculated as follows: the number of corresponding positions showing the same amino acid in both sequences in the alignment is divided by the total length of the alignment including gaps. Identity as defined herein can be obtained from NEEDLE and is labeled "IDENTITY" in the output of the program.
Alternatively, homology or identity between two aligned sequences may be calculated as follows: the number of corresponding positions showing the same amino acid in both sequences in the alignment is divided by the total length of the alignment after subtracting the total number of gaps in the alignment. Identity as defined in this way may be obtained from NEEDLE by using the NOBRIEF option and is labeled "longest-identity" in the output of the program.
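The two ways of calculating identity described above differ only in the denominator. The following sketch, with hypothetical function and variable names, computes both from a pair of already aligned sequences in which "-" marks a gap; it illustrates the arithmetic and does not reproduce the NEEDLE output format.

```python
def identity_metrics(aligned_a, aligned_b, gap_char="-"):
    """Return (identity_pct, longest_identity_pct) for two aligned sequences.

    identity_pct:         matches / total alignment length (gaps included)
    longest_identity_pct: matches / (alignment length - gapped positions)
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")

    pairs = list(zip(aligned_a, aligned_b))
    # A position counts as a gap when either sequence has the gap character.
    gaps = sum(1 for x, y in pairs if x == gap_char or y == gap_char)
    # A position counts as a match when both residues are identical non-gaps.
    matches = sum(1 for x, y in pairs if x == y and x != gap_char)

    length = len(pairs)
    identity_pct = 100.0 * matches / length
    ungapped = length - gaps
    longest_identity_pct = 100.0 * matches / ungapped if ungapped else 0.0
    return identity_pct, longest_identity_pct


if __name__ == "__main__":
    # 6 aligned positions, 1 gap, 4 identical residues.
    print(identity_metrics("ACD-FG", "ACDEFA"))
```

In the usage example, four identical positions over six aligned positions give about 66.7% under the first definition, while excluding the single gapped position raises the figure to 80% under the second.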
Variants of a nucleotide or amino acid sequence disclosed herein may also be defined as having one or more substitutions, insertions, and/or deletions compared to the nucleotide or amino acid sequence specifically disclosed herein (e.g., in the sequence listing).
Optionally, the skilled artisan may also consider so-called "conservative" amino acid substitutions in determining the degree of amino acid similarity, as will be clear to the skilled artisan. Conservative amino acid substitutions refer to the interchangeability of residues having similar side chains. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains are serine and threonine; a group of amino acids having amide-containing side chains are asparagine and glutamine; a group of amino acids having aromatic side chains are phenylalanine, tyrosine and tryptophan; a group of amino acids with basic side chains are lysine, arginine and histidine; and a group of amino acids having sulfur-containing side chains are cysteine and methionine. In one embodiment, the conservative amino acid substitution sets are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine and asparagine-glutamine. A substitution variant of an amino acid sequence disclosed herein is a variant in which at least one residue in the disclosed sequence has been removed and a different residue inserted at its position. Preferably, the amino acid changes are conservative. In one embodiment, conservative substitutions for each naturally occurring amino acid are as follows: ala to Ser; arg to Lys; asn to gin or His; asp to Glu; cys to Ser or Ala; gln to Asn; glu to Asp; gly to Pro; his to Asn or Gln; ile to Leu or Val; leu to Ile or Val; lys to Arg; gln or Glu; met to Leu or Ile; phe to Met, leu, or Tyr; ser to Thr; thr to Ser; trp to Tyr; tyr to Trp or Phe; and Val to Ile or Leu.
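The per-residue substitution list of this embodiment can be represented as a lookup table, sketched below. The dictionary and function names are hypothetical; the entries follow the list above, reading the lysine entry as "Lys to Arg, Gln or Glu". The list is directional (for example, Ala to Ser is listed but Ser to Ala is not) and gives no entry for proline, and the sketch preserves both properties rather than filling them in.

```python
# Conservative substitutions per the embodiment above (three-letter codes).
# Keys are the original residue; values are the allowed replacements.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser", "Ala"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """Return True if replacing `original` by `replacement` is in the list.

    The lookup is directional and returns False for residues without an
    entry (such as Pro) and for unchanged residues.
    """
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())


if __name__ == "__main__":
    print(is_conservative("Ala", "Ser"))  # listed
    print(is_conservative("Ser", "Ala"))  # not listed in this direction
```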
The nucleotide sequences of the present invention may also be defined by their ability to hybridize under moderate hybridization conditions or, preferably, under stringent hybridization conditions, respectively, to portions of the specific nucleotide sequences disclosed herein. Stringent hybridization conditions are defined herein as conditions that allow nucleic acid sequences of at least about 25 nucleotides, preferably about 50, 75 or 100 nucleotides, most preferably about 200 or more nucleotides, to hybridize at a temperature of about 65 °C in a solution comprising about 1 M salt (preferably 6x SSC or any other solution having comparable ionic strength), and to be washed at 65 °C in a solution comprising about 0.1 M or less salt (preferably 0.2x SSC or any other solution having comparable ionic strength). Preferably, hybridization is performed overnight, i.e., for at least 10 hours; and preferably the washing is carried out for at least one hour, wherein the washing solution is replaced at least twice. These conditions will typically allow specific hybridization of sequences having about 90% or greater sequence identity. Moderate conditions are defined herein as conditions that allow nucleic acid sequences of at least 50 nucleotides, preferably about 200 or more nucleotides, to hybridize at a temperature of about 45 °C in a solution comprising about 1 M salt (preferably 6x SSC or any other solution having comparable ionic strength), and to be washed at room temperature in a solution comprising about 1 M salt (preferably 6x SSC or any other solution having comparable ionic strength). Preferably, hybridization is performed overnight, i.e., for at least 10 hours; and preferably the washing is carried out for at least one hour, wherein the washing solution is replaced at least twice. These conditions will typically allow specific hybridization of sequences with up to 50% sequence identity.
Those skilled in the art will be able to modify these hybridization conditions in order to specifically identify sequences that vary in identity between 50% and 90%.
"expression" refers to the transcription of a gene into structural RNA (rRNA, tRNA) or messenger RNA (mRNA) followed by translation into a protein.
By "overexpression" is meant that the expression of a gene (and correspondingly nucleic acid sequence) by a recombinant cell exceeds its expression in a corresponding wild-type cell. Such overexpression may be arranged, for example, by: increasing the frequency of transcription of one or more nucleic acid sequences, for example, by operably linking the nucleic acid sequences to a promoter functional in a recombinant cell; and/or by increasing the copy number of a nucleic acid sequence.
The terms "upregulate", "upregulated" and "upregulation" refer to the process by which a cell increases the amount of a cellular component, such as RNA or a protein. Such upregulation may be in response to, or caused by, a genetic modification.
The term "pathway" or "metabolic pathway" is understood herein as a series of chemical reactions in a cell that build and break down molecules.
The nucleic acid sequence (i.e., polynucleotide) or protein (i.e., polypeptide) may be native or heterologous to the genome of the host cell.
"Native", "homologous" or "endogenous" with respect to a host cell means that the nucleic acid sequence naturally occurs in the genome of the host cell, or that the protein is naturally produced by the cell. The terms "native", "homologous" and "endogenous" are used interchangeably herein.
As used herein, "heterologous" may refer to a nucleic acid sequence or a protein. For example, with respect to a host cell, "heterologous" may refer to a polynucleotide that does not naturally occur in that form in the genome of the host cell, or to a polypeptide or protein that is not naturally produced in that form by the cell. A heterologous nucleic acid sequence is a nucleic acid that originates from a foreign species or, if from the same species, has been substantially modified in composition and/or genomic locus relative to its native form by deliberate human intervention. For example, a promoter operably linked to a native structural gene may be from a species different from that from which the structural gene was derived or, if from the same species, one or both are substantially modified relative to their original form. A heterologous protein may originate from a foreign species or, if from the same species, is substantially modified relative to its original form by deliberate human intervention. That is, heterologous protein expression involves the expression of a protein that is not naturally expressed in that form in the host cell. The term "heterologous expression" refers to the expression of a heterologous nucleic acid in a host cell. The expression of heterologous proteins in eukaryotic host cell systems, such as yeast, is well known to those skilled in the art. A polynucleotide comprising a nucleic acid sequence of a gene encoding a protein or enzyme having a particular activity may be expressed in such a eukaryotic system. In some embodiments, transformed/transfected cells may be used as an expression system for expressing enzymes. Sherman, F. et al., Methods in Yeast Genetics, Cold Spring Harbor Laboratory (1986), is a well-known work describing a variety of methods that can be used to express proteins in yeast.
Two widely used yeasts are Saccharomyces cerevisiae and Pichia pastoris. Vectors, strains and protocols for expression in Saccharomyces and Pichia are known in the art and available from commercial suppliers such as, for example, Invitrogen. Suitable vectors typically have expression control sequences such as promoters (including 3-phosphoglycerate kinase or alcohol oxidase promoters), origins of replication, termination sequences, and the like, as desired.
As used herein, a "promoter" refers to a DNA sequence that directs transcription of a (structural) gene or other (partial) nucleic acid sequence. Suitably, the promoter is located in the 5' region of the gene, close to the transcription start site of the (structural) gene. The promoter sequence may be constitutive, inducible or repressible. In one embodiment, no (external) inducer is required.
As used herein, the term "vector" includes reference to an autosomal expression vector and an integration vector for integration into a chromosome.
The term "expression vector" refers to a linear or circular DNA molecule comprising a segment encoding a polypeptide of interest under the control of (i.e., operably linked to) an additional nucleic acid segment that provides for its transcription. Such additional segments may include promoter and terminator sequences, and may optionally include one or more origins of replication, one or more selectable markers, an enhancer, a polyadenylation signal, and the like. Expression vectors are typically derived from plasmid or viral DNA, or may contain elements of both. In particular, the expression vector comprises a nucleic acid sequence comprising and operably linked in the 5 'to 3' direction: (a) a yeast-recognized transcription and translation initiation region, (b) a coding sequence for a polypeptide of interest, and (c) a yeast-recognized transcription and translation termination region.
"plasmid" refers to autonomously replicating extra-chromosomal DNA that does not integrate into the genome of a microorganism and is typically circular in nature.
An "integrative vector" refers to a linear or circular DNA molecule that can be incorporated into the genome of a microorganism and provide stable inheritance of a gene encoding a polypeptide of interest. An integrative vector typically comprises one or more segments containing a gene sequence encoding the polypeptide of interest under the control of (i.e., operably linked to) an additional nucleic acid segment that provides for its transcription. Such additional segments may include promoter and terminator sequences, as well as one or more segments that drive the incorporation of the gene of interest into the genome of the target cell (typically by methods of homologous recombination). Typically, an integrative vector will be a vector that can be transferred into a target cell but has a replicon that is not functional in the organism. If appropriate markers are included in the segment, integration of the segment comprising the gene of interest may be selected.
A "host cell" is herein understood to be a cell, such as a yeast cell, which is transformed with one or more nucleic acid sequences encoding one or more heterologous proteins to construct a transformed cell (also referred to as a recombinant cell). For example, the transformed cells may contain a vector and may support replication and/or expression of the vector.
As used herein, "transformation" and "transforming" refer to the insertion of an exogenous polynucleotide (i.e., an exogenous nucleic acid sequence) into a host cell, regardless of the method used for insertion, such as direct uptake, transduction, f-mating, or electroporation. The exogenous polynucleotide may be maintained as a non-integrating vector (e.g., a plasmid), or alternatively may be integrated into the host cell genome.
"Constitutive expression" and "constitutively expressing" are understood herein to mean that there is continuous transcription of the nucleic acid sequence. That is, the nucleic acid sequence is transcribed in a sustained manner. Constitutively expressed genes are always "on".
"Anaerobic constitutive expression" is understood herein to mean that the nucleic acid sequence is constitutively expressed in the organism under anaerobic conditions. That is, under anaerobic conditions, the nucleic acid sequence is transcribed in a sustained manner, i.e., under such anaerobic conditions, the gene is always "on".
"Disruption" is understood herein to mean any disruption of activity, including but not limited to deletion, mutation, reduction of the affinity of the disrupted gene, and expression of RNAs complementary to such disrupted genes. It includes all nucleic acid modifications such as nucleotide deletions or substitutions, gene knockouts and other actions affecting translation or transcription of the corresponding polypeptide and/or affecting the (specific) activity of the enzyme, its substrate specificity and/or stability. It also includes modifications that can be targeted at the coding sequence or the promoter of the gene. A gene disruptant is a cell that has one or more disruptions of the corresponding gene. "Naturally occurring in yeast" is understood herein to mean that the gene is present in the yeast cell prior to disruption.
The term "encoding" has the same meaning as "coding for". Thus, for example, "one or more genes encoding a protein having activity X" has the same meaning as "one or more genes coding for a protein having activity X".
In the case of a gene or nucleic acid sequence encoding a protein or enzyme, the phrase "nucleic acid sequence encoding X" (correspondingly "one or more nucleic acid sequences encoding X") (where X denotes a certain protein or (enzyme) activity) has the same meaning as "nucleic acid sequence encoding a protein with X activity" (correspondingly "one or more nucleic acid sequences encoding a protein with X activity"). Thus, for example, a "nucleic acid sequence or sequences encoding a transketolase" has the same meaning as "nucleic acid sequence or sequences encoding a protein having transketolase activity". As indicated above, the article "a" means "one or more".
By "redox sink" is herein understood a metabolic pathway that generally consumes or oxidizes NADH to NAD+ and/or prevents or reduces NAD+ consumption or reduction to NADH. The non-native metabolic pathway is a metabolic pathway that does not occur in the corresponding wild-type cell. Thus, the non-native metabolic pathway forming the redox sink is preferably a non-native metabolic pathway that increases NADH consumption and/or decreases NAD+ consumption compared to a corresponding wild-type yeast cell. By increasing NADH consumption and/or reducing NAD+ consumption, an (additional) unnatural redox sink can advantageously be produced within the cell.
The abbreviation "NADH" refers to the reduced, hydrogenated form of nicotinamide adenine dinucleotide. The abbreviation "NAD+" refers to the oxidized form of nicotinamide adenine dinucleotide. Nicotinamide adenine dinucleotide can act as a so-called cofactor, assisting biochemical reactions and/or transformations in cells.
"NADH-dependent" or "NAD+-dependent" is herein equivalent to NADH-specific, and "NADH dependency" or "NAD+ dependency" is herein equivalent to NADH specificity.
An "NADH-dependent" or "NAD+-dependent" enzyme is herein understood to be an enzyme that, compared to other types of cofactors, depends only on NADH/NAD+ as cofactor or primarily on NADH/NAD+ as cofactor. An "only NADH/NAD+-dependent" enzyme is herein understood to be an enzyme that has an absolute requirement for NADH/NAD+ relative to NADPH/NADP+. That is, it is active only when NADH/NAD+ is used as a cofactor. A "primarily NADH/NAD+-dependent" enzyme is herein understood to be an enzyme having a higher specificity and/or a higher catalytic efficiency for NADH/NAD+ as cofactor than for NADPH/NADP+ as cofactor.
The specificity of an enzyme can be described by the following formula:
$$1 < \frac{K_m^{\mathrm{NADP^+}}}{K_m^{\mathrm{NAD^+}}} < \infty$$
where $K_m$ is the so-called Michaelis constant.
For a primarily NADH-dependent enzyme, $K_m^{\mathrm{NADP^+}}/K_m^{\mathrm{NAD^+}}$ is preferably between 1 and 1000, between 1 and 500, between 1 and 200, between 1 and 100, between 1 and 50, between 1 and 10, between 5 and 100, between 5 and 50, between 5 and 20, or between 5 and 10.
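As a worked illustration of the Michaelis constant ratio described above, the following sketch computes the ratio of the Michaelis constant for NADP+ to that for NAD+ and checks it against one of the preferred ranges. The function names and the numerical values are hypothetical and chosen only for illustration; treating the range boundaries as inclusive is likewise an assumption.

```python
def km_ratio(km_nadp: float, km_nad: float) -> float:
    """Ratio of the Michaelis constant for NADP+ to that for NAD+ (same units for both)."""
    if km_nadp <= 0 or km_nad <= 0:
        raise ValueError("Michaelis constants must be positive")
    return km_nadp / km_nad


def ratio_in_range(ratio: float, low: float, high: float) -> bool:
    """Check a ratio against a preferred range; inclusive bounds are an assumption."""
    return low <= ratio <= high


# Hypothetical values in mM: a higher Km for NADP+ than for NAD+ gives a ratio above 1.
ratio = km_ratio(km_nadp=5.0, km_nad=0.25)
print(ratio)                         # 20.0
print(ratio > 1)                     # satisfies the lower bound of the formula above
print(ratio_in_range(ratio, 5, 50))  # falls in the "between 5 and 50" range
```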
The $K_m$ values of the enzymes herein can be determined, as the enzyme-specific values for NAD+ and NADP+ respectively, using known analytical techniques, calculations and protocols. These are described, for example, in Lodish et al., Molecular Cell Biology, 6th edition, Freeman, pages 80 and 81, e.g. Figure 3-22. For a primarily NADH-dependent enzyme, the ratio of the catalytic efficiency for NADPH/NADP+ as cofactor, $(k_{cat}/K_m)^{\mathrm{NADP^+}}$, to the catalytic efficiency for NADH/NAD+ as cofactor, $(k_{cat}/K_m)^{\mathrm{NAD^+}}$, i.e., the catalytic efficiency ratio $(k_{cat}/K_m)^{\mathrm{NADP^+}} : (k_{cat}/K_m)^{\mathrm{NAD^+}}$, is preferably greater than 1:1, more preferably equal to or greater than 2:1, still more preferably equal to or greater than 5:1, even more preferably equal to or greater than 10:1, yet even more preferably equal to or greater than 20:1, even more preferably equal to or greater than 100:1, and most preferably equal to or greater than 1000:1. There is no upper limit, but for practical reasons the catalytic efficiency ratio $(k_{cat}/K_m)^{\mathrm{NADP^+}} : (k_{cat}/K_m)^{\mathrm{NAD^+}}$ may be equal to or less than 1,000,000,000:1 (i.e., $1 \cdot 10^{9}$:1).
Yeast cells
The recombinant yeast cell is preferably a yeast cell, or is derived from a yeast cell, from the genus Saccharomyces or the genus Schizosaccharomyces. That is, preferably, the host cell from which the recombinant yeast cell is derived is a yeast cell from the genus Saccharomyces or Schizosaccharomyces.
Examples of suitable yeast cells include Saccharomyces, such as Saccharomyces cerevisiae, Saccharomyces eubayanus, Saccharomyces jurei, Saccharomyces pastorianus, Saccharomyces beticus, Saccharomyces fermentati, Saccharomyces paradoxus, Saccharomyces uvarum and Saccharomyces bayanus.
Examples of suitable yeast cells further include Schizosaccharomyces, such as Schizosaccharomyces pombe, Schizosaccharomyces japonicus, Schizosaccharomyces octosporus and Schizosaccharomyces cryophilus.
Other exemplary yeasts include Torulaspora, such as Torulaspora delbrueckii; Kluyveromyces, such as Kluyveromyces marxianus; Pichia, such as Pichia stipitis, Pichia pastoris or Pichia angusta; Zygosaccharomyces, such as Zygosaccharomyces bailii; Brettanomyces, such as Brettanomyces intermedius, Brettanomyces bruxellensis, Brettanomyces anomalus, Brettanomyces custersianus, Brettanomyces naardenensis, Brettanomyces nanus, Dekkera bruxellensis and Dekkera anomala; Metschnikowia; Issatchenkia, such as Issatchenkia orientalis; Kloeckera, such as Kloeckera apiculata; and Aureobasidium, such as Aureobasidium pullulans.
The yeast cell is preferably a yeast cell of the genus Schizosaccharomyces (also referred to herein as a Schizosaccharomyces yeast cell), or a yeast cell of the genus Saccharomyces (also referred to herein as a Saccharomyces yeast cell). More preferably, the yeast cell is a yeast cell of the species Saccharomyces cerevisiae (also referred to herein as a Saccharomyces cerevisiae cell). That is, preferably, the host cell from which the recombinant yeast cell is derived is a yeast cell of the species Saccharomyces cerevisiae.
Preferably, the yeast cell is an industrial yeast cell. The living environment of yeast cells in industrial processes differs significantly from that in the laboratory. Industrial yeast cells must be able to perform well under multiple environmental conditions, which may vary during the process. Such variations include changes in nutrient sources, pH, ethanol concentration, temperature, oxygen concentration, etc., which together have a potential impact on the cellular growth and ethanol production of the yeast cell. An industrial yeast cell can be understood to refer to a yeast cell that has more robust properties than its laboratory counterpart. That is, an industrial yeast cell shows less variation in performance than a laboratory counterpart when one or more environmental conditions selected from the group of nutrient source, pH, ethanol concentration, temperature and oxygen concentration vary during fermentation. Preferably, the yeast cell is constructed on the basis of an industrial yeast cell as host, wherein the construction is performed as described below. Examples of industrial yeast cells are Ethanol Red (Fermentis) and commercial strains available from DSM and Lallemand.
The recombinant yeast cells described herein can be derived from any host cell capable of producing a fermentation product. Preferably, the host cell is a yeast cell, more preferably an industrial yeast cell as described above. Preferably, the yeast cells described herein are derived from host cells having the ability to produce ethanol.
Thus, the yeast cells described herein may be derived from host cells by any technique known to be suitable to those skilled in the art. Such techniques may include any one or more of mutagenesis, recombinant DNA technology (including but not limited to CRISPR-Cas techniques), selection and/or adaptive evolution, mating, cell fusion, and/or cytoduction between yeast strains. Suitably, one or more desired genes are incorporated into the yeast cell by a combination of one or more of the above techniques.
The recombinant yeast cells according to the invention are preferably inhibitor tolerant, i.e., they can withstand common inhibitors at the levels typically reached under common pretreatment and hydrolysis conditions, so that the recombinant yeast cells can be used in a wide range of applications, i.e., they are highly adaptable to different feedstocks, different pretreatment methods and different hydrolysis conditions. In one embodiment, the recombinant yeast cell is inhibitor tolerant. Inhibitor tolerance is resistance to an inhibitory compound. The presence and level of inhibitory compounds in lignocellulose can vary widely with the feedstock, the pretreatment process and the hydrolysis process. Examples of inhibitor classes are carboxylic acids, furans and/or phenolic compounds. Examples of carboxylic acids are lactic acid, acetic acid and formic acid. Examples of furans are furfural and hydroxy-methylfurfural. Examples of phenolic compounds are vanillin, syringic acid, ferulic acid and coumaric acid. Typical amounts of inhibitors are, for carboxylic acids: up to 20 grams per liter or more; for furans: several hundreds of milligrams per liter up to several grams per liter; and for phenolic compounds: tens of milligrams per liter up to a gram per liter, in each case depending on the feedstock, the pretreatment and the hydrolysis conditions.
In one embodiment, the recombinant yeast cell is a cell that is naturally capable of alcoholic fermentation, preferably anaerobic alcoholic fermentation. Recombinant yeast cells preferably have high tolerance to ethanol, low pH (i.e., capable of growing at a pH of less than about 5, about 4, about 3, or about 2.5), and organics, and/or high tolerance to elevated temperatures.
Native transketolase
Recombinant yeast cells suitably express functionally:
-a nucleic acid sequence encoding a native protein having transketolase activity; and
-a nucleic acid sequence encoding a heterologous protein having transketolase activity.
Proteins having transketolase activity are also referred to herein as "transketolase proteins", "transketolase enzymes" or simply as "transketolases". "Transketolase" is abbreviated herein as "TKL".
Similarly, a native protein having transketolase activity is also referred to herein as a "native transketolase protein", a "native transketolase enzyme" or simply as a "native transketolase".
Transketolase is an enzyme active in the pentose phosphate pathway of yeast cells. Genes encoding enzymes of the pentose phosphate pathway are also referred to herein as "PPP" genes. Preferably, references to the pentose phosphate pathway in this specification are to be understood as references to the non-oxidative part of the pentose phosphate pathway. Enzymes active in the pentose phosphate pathway include ribulose-5-phosphate isomerase (RKI), ribulose-5-phosphate epimerase (RPE), transketolase (TKL) and transaldolase (TAL).
"Transketolase" (belonging to the enzyme class EC 2.2.1.1) is defined herein as an enzyme that catalyzes the following reaction:
D-ribose 5-phosphate + D-xylulose 5-phosphate ⇌ sedoheptulose 7-phosphate + D-glyceraldehyde 3-phosphate,
and vice versa.
This enzyme is also known as glycolaldehydetransferase or sedoheptulose-7-phosphate:D-glyceraldehyde-3-phosphate glycolaldehydetransferase. A transketolase may be further defined by its amino acid sequence. Likewise, a transketolase may be further defined by a nucleotide sequence encoding the transketolase. As explained in detail under the definitions above, a transketolase defined by a nucleotide sequence encoding the enzyme includes (unless otherwise limited) a nucleotide sequence that hybridizes to such a nucleotide sequence encoding the transketolase.
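The transketolase reaction defined above converts D-ribose 5-phosphate and D-xylulose 5-phosphate into sedoheptulose 7-phosphate and D-glyceraldehyde 3-phosphate, i.e., a two-carbon unit is transferred from the ketose donor to the aldose acceptor. The following minimal sketch merely checks the carbon balance of that reaction; the dictionary and function names are hypothetical and chosen for this illustration.

```python
# Number of carbon atoms per metabolite in the transketolase reaction.
CARBONS = {
    "D-ribose 5-phosphate": 5,
    "D-xylulose 5-phosphate": 5,
    "sedoheptulose 7-phosphate": 7,
    "D-glyceraldehyde 3-phosphate": 3,
}

SUBSTRATES = ["D-ribose 5-phosphate", "D-xylulose 5-phosphate"]
PRODUCTS = ["sedoheptulose 7-phosphate", "D-glyceraldehyde 3-phosphate"]


def carbon_count(metabolites: list[str]) -> int:
    """Total number of carbon atoms in a list of metabolites."""
    return sum(CARBONS[name] for name in metabolites)


# Both sides of the reaction carry ten carbon atoms (5 + 5 = 7 + 3).
print(carbon_count(SUBSTRATES), carbon_count(PRODUCTS))
```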
The nucleic acid sequence encoding a native protein having transketolase activity may itself be a homologous nucleic acid sequence, a heterologous nucleic acid sequence, or a mixture of homologous and heterologous nucleic acid sequences, provided that the protein encoded by such nucleic acid sequence is a transketolase protein that is native (i.e., endogenous) to the host cell. Saccharomyces cerevisiae is a preferred host cell. Thus, preferably, the recombinant yeast cell is a recombinant S. cerevisiae cell functionally expressing a nucleic acid sequence encoding a protein having transketolase activity that is native to S. cerevisiae.
More preferably, the nucleic acid sequence encoding a native protein having transketolase activity is native to the host cell.
Thus, the recombinant yeast cell preferably expresses functionally:
-a native nucleic acid sequence encoding a native protein having transketolase activity; and
-a heterologous nucleic acid sequence encoding a heterologous protein having transketolase activity.
A wild-type yeast may comprise one or two native transketolase genes. In addition to the first transketolase gene "TKL1", some yeasts (such as, for example, Saccharomyces cerevisiae) also comprise a paralogous gene "TKL2" (the second transketolase gene).
Suitably, the recombinant yeast cell according to the invention may comprise the native TKL1 gene and/or the native TKL2 gene.
That is, suitably, the recombinant yeast cell may comprise:
-a nucleic acid sequence encoding a native TKL1 (e.g., the gene "TKL1"); or
-a nucleic acid sequence encoding a native TKL2 (e.g., the gene "TKL2"); or
-both a nucleic acid sequence encoding a native TKL1 (e.g., the gene "TKL1") and a nucleic acid sequence encoding a native TKL2 (e.g., the gene "TKL2").
Preferably, the recombinant yeast cell comprises a native nucleotide sequence encoding the native transketolase TKL1. That is, preferably, the recombinant yeast cell comprises a native TKL1 gene.
The recombinant yeast cell may comprise one or more copies (suitably in the range of from equal to or greater than 1 to equal to or less than 30 copies, preferably in the range of from equal to or greater than 1 to equal to or less than 20 copies) of the native gene encoding the transketolase. More preferably, the recombinant yeast cell comprises one, two, three, four, five, six, seven, eight, nine, ten, eleven or twelve copies of the native gene encoding the transketolase.
The recombinant yeast cell may be one in which the native nucleic acid sequence encoding the native protein having transketolase activity is under the control of a TKL promoter, as described in detail below.
As indicated above, host cells of the species Saccharomyces cerevisiae are preferred. The amino acid sequence of the native transketolase 1 of Saccharomyces cerevisiae is shown by SEQ ID NO. 1. The native nucleic acid sequence encoding transketolase 1 in Saccharomyces cerevisiae is shown by SEQ ID NO. 2.
Preferably, the native protein having transketolase activity comprises or consists of:
-the amino acid sequence of SEQ ID NO. 1; or
-a functional homolog of SEQ ID NO. 1, suitably native to the host cell, having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity with the amino acid sequence of SEQ ID NO. 1.
Thus, preferably, the (preferably native) nucleic acid sequence encoding the native transketolase protein comprises or consists of:
-the nucleic acid sequence of SEQ ID NO. 2; or
-a functional homolog of SEQ ID NO. 2 encoding a native transketolase, having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the nucleic acid sequence of SEQ ID NO. 2.
More preferably, the recombinant yeast cell is a recombinant S. cerevisiae cell comprising, and correspondingly functionally expressing, a native protein having transketolase activity, which protein comprises or consists of the amino acid sequence of SEQ ID NO. 1. Most preferably, the recombinant yeast cell is a recombinant Saccharomyces cerevisiae cell comprising, and correspondingly functionally expressing, a nucleic acid sequence comprising or consisting of SEQ ID NO. 2.
Heterologous transketolase
As indicated above, the recombinant yeast cells suitably functionally express:
-a nucleic acid sequence encoding a native protein having transketolase activity; and
-a nucleic acid sequence encoding a heterologous protein having transketolase activity.
Heterologous proteins having transketolase activity are also referred to herein as "heterologous transketolase proteins", "heterologous transketolase enzymes" or simply as "heterologous transketolases".
The recombinant yeast cell suitably functionally expresses a heterologous nucleic acid sequence encoding a heterologous protein having transketolase activity.
Such a heterologous nucleic acid sequence encoding a heterologous protein having transketolase activity (i.e., a heterologous transketolase) is suitably present in addition to the, preferably native, nucleic acid sequence encoding a native protein having transketolase activity (i.e., a native transketolase).
Preferably, the recombinant yeast cell comprises a heterologous nucleic acid sequence encoding a heterologous transketolase in addition to the native nucleic acid sequence encoding the native transketolase.
Such heterologous nucleic acid sequences encoding a transketolase are preferably under the control of a TKL promoter, as detailed below.
The recombinant yeast cell can comprise (and correspondingly functionally express) one or more heterologous nucleic acid sequences encoding one or more heterologous transketolase proteins. That is, the recombinant yeast cell can comprise one or more heterologous transketolase enzymes.
Preferably, the one or more heterologous transketolase comprises or consists of:
-an amino acid sequence of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 9, SEQ ID NO. 10, SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13, SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17 or SEQ ID NO. 19; or
-a functional homolog of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 9, SEQ ID NO. 10, SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13, SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17 or SEQ ID NO. 19, having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85% or at least 95% sequence identity to the amino acid sequence of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 9, SEQ ID NO. 10, SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13, SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17 or SEQ ID NO. 19; or
-a functional homolog of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 9, SEQ ID NO. 10, SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13, SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17 or SEQ ID NO. 19, comprising an amino acid sequence having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 9, SEQ ID NO. 10, SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13, SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17 or SEQ ID NO. 19, wherein more preferably the amino acid sequence of any such functional homolog has no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10, or no more than 5 amino acid mutations, substitutions, insertions, and/or deletions when compared to the amino acid sequence of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 9, SEQ ID NO. 10, SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13, SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17 or SEQ ID NO. 19.
Preferably, the recombinant yeast cell comprises (respectively, functionally expresses):
-one or more heterologous nucleic acid sequences encoding one or more amino acid sequences selected from the group consisting of: SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 9, SEQ ID NO. 10, SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13, SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17 or SEQ ID NO. 19; and/or
-functional homologs thereof comprising a nucleic acid sequence having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to any of those nucleic acid sequences; and/or
-functional homologs thereof comprising a nucleic acid sequence having one or more mutations, substitutions, insertions and/or deletions when compared to any of those nucleic acid sequences.
More preferably, the nucleic acid sequence of any such functional homolog has no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10, or no more than 5 nucleic acid mutations, substitutions, insertions, and/or deletions as compared to such nucleic acid sequence.
More preferably, the heterologous transketolase is derived from Komagataella phaffii (a yeast species also known as "Pichia pastoris"). For example, preferably, the heterologous transketolase comprises or consists of: the amino acid sequence of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 16 or SEQ ID NO. 17; or a functional homolog thereof comprising an amino acid sequence that has at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 16 or SEQ ID NO. 17; or a functional homolog thereof comprising an amino acid sequence having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 16 or SEQ ID NO. 17, wherein more preferably the amino acid sequence of any such functional homolog has no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 16 or SEQ ID NO. 17.
In order to allow good expression of the heterologous transketolase in the host cell, it may be advantageous to use a heterologous transketolase that may have an amino acid sequence with equal to or greater than 30%, equal to or greater than 35%, equal to or greater than 40%, equal to or greater than 45%, equal to or greater than 50%, equal to or greater than 55%, equal to or greater than 60%, equal to or greater than 65%, equal to or greater than 70%, equal to or greater than 75%, equal to or greater than 80%, equal to or greater than 85%, equal to or greater than 90%, equal to or greater than 95%, equal to or greater than 98% or equal to or greater than 99% sequence identity to the amino acid sequence of the native transketolase of the host cell.
However, the heterologous transketolase may also preferably be a heterologous transketolase that is not regulated by a natural (i.e., endogenous) regulatory factor of the host cell. That is, preferably, the heterologous transketolase is one whose activity cannot be increased or decreased by a molecule naturally produced by the host cell. To avoid native regulatory factors, it may be advantageous to use a heterologous transketolase in the host cell, which heterologous transketolase may have an amino acid sequence having equal to or less than 99%, equal to or less than 98%, equal to or less than 95%, equal to or less than 90%, equal to or less than 85%, equal to or less than 80%, equal to or less than 75%, equal to or less than 70% or equal to or less than 65% sequence identity to the amino acid sequence of the native transketolase of the host cell.
Thus, more preferably, the heterologous transketolase comprises or consists of an amino acid sequence having a percentage of identity with the amino acid sequence of the native transketolase of the host cell within the following ranges: in the range of equal to or greater than 30% to equal to or less than 80%, more preferably in the range of equal to or greater than 30% or equal to or greater than 35% to equal to or less than 75%, and most preferably in the range of equal to or greater than 35% to equal to or less than 70% or even equal to or less than 65%.
More preferably, any heterologous nucleic acid sequence encoding a heterologous transketolase comprises or consists of a nucleic acid sequence having a percentage of identity with a nucleic acid sequence encoding a native transketolase of a host cell within the following ranges: in the range of equal to or greater than 30% to equal to or less than 80%, more preferably in the range of equal to or greater than 30% or equal to or greater than 35% to equal to or less than 75%, and most preferably in the range of equal to or greater than 35% to equal to or less than 70% or even equal to or less than 65%.
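The identity windows described above can be illustrated with a minimal Python sketch. It assumes that the two sequences have already been aligned pairwise (gaps written as "-") and that percent identity is counted over all alignment columns; the alignment program and the exact identity definition are assumptions of this sketch, not something fixed by the passage above. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment (gaps written as '-').

    Assumption: identity = identical residue columns / all alignment columns,
    ignoring columns in which both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def in_identity_window(identity: float, lower: float = 35.0,
                       upper: float = 70.0) -> bool:
    """True if the identity lies in the window [lower, upper] (in percent).

    The defaults mirror the 'most preferably' range of 35% to 70%.
    """
    return lower <= identity <= upper


# Toy example with made-up four-residue sequences (not real transketolases)
identity = percent_identity("MKLV", "MKAV")   # 3 of 4 columns identical
broad_ok = in_identity_window(identity, lower=30.0, upper=80.0)
narrow_ok = in_identity_window(identity)
```

With the toy input the identity is 75%, which falls inside the broad 30% to 80% window but outside the narrower 35% to 70% window.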
Host cells from Saccharomyces cerevisiae species are preferred. As indicated above, the amino acid sequence of the native transketolase 1 of Saccharomyces cerevisiae is shown by SEQ ID NO. 1 and the native nucleic acid sequence encoding transketolase 1 in Saccharomyces cerevisiae is shown by SEQ ID NO. 2.
Thus, the recombinant yeast cell is preferably a recombinant S. cerevisiae cell functionally expressing:
-a nucleic acid sequence encoding a native transketolase, wherein the native transketolase preferably comprises or consists of the amino acid sequence of SEQ ID NO. 1; and
-a nucleic acid sequence encoding a heterologous transketolase, wherein the heterologous transketolase preferably comprises or consists of an amino acid sequence having a sequence identity to the amino acid sequence of SEQ ID NO. 1 in the range of from equal to or more than 30% to equal to or less than 80%, more preferably in the range of from equal to or more than 30% or equal to or more than 35% to equal to or less than 75%, most preferably in the range of from equal to or more than 35% to equal to or less than 70% or even equal to or less than 65%.
Similarly, the recombinant yeast cell is preferably a recombinant S. cerevisiae cell functionally expressing:
-a native nucleic acid sequence encoding a native transketolase, wherein the native nucleic acid sequence preferably comprises or consists of the nucleic acid sequence of SEQ ID NO. 2; and
-a heterologous nucleic acid sequence encoding a heterologous transketolase, wherein the heterologous nucleic acid sequence preferably comprises or consists of a nucleic acid sequence having a sequence identity to the nucleic acid sequence of SEQ ID NO. 2 in the range of from equal to or more than 30% to equal to or less than 80%, more preferably in the range of from equal to or more than 30% or equal to or more than 35% to equal to or less than 75%, most preferably in the range of from equal to or more than 35% to equal to or less than 70% or even equal to or less than 65%.
Most preferably, the recombinant yeast cell comprises a suitably heterologous nucleic acid sequence encoding a heterologous protein having transketolase activity, wherein the nucleic acid sequence comprises or consists of:
-the nucleic acid sequence of SEQ ID NO. 18 or SEQ ID NO. 20; or alternatively
-a functional homolog of SEQ ID No. 18 or SEQ ID No. 20 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity with the nucleic acid sequence of SEQ ID No. 18 or SEQ ID No. 20; or alternatively
-a functional homolog of SEQ ID NO. 18 or SEQ ID NO. 20 which has one or more mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 18 or SEQ ID NO. 20, more preferably no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 nucleic acid mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 18 or SEQ ID NO. 20.
The recombinant yeast cell can comprise one, two or more copies of a heterologous nucleic acid sequence (e.g., a heterologous gene) encoding a heterologous transketolase and/or one, two or more copies of a native nucleic acid sequence (e.g., a native gene) encoding a native transketolase.
Most preferably, the recombinant yeast cell can comprise one, two, three, four, five, six, seven, eight, nine, ten, eleven or twelve copies of a heterologous nucleic acid sequence (e.g., a heterologous gene) encoding a heterologous transketolase and/or one, two, three, four, five, six, seven, eight, nine, ten, eleven or twelve copies of a native nucleic acid sequence (e.g., a native gene) encoding a native transketolase.
Preferably, the recombinant yeast cell is a recombinant yeast cell comprising (respectively, functionally expressing):
-one, two or more copies of a nucleic acid sequence comprising or consisting of a nucleic acid sequence having a sequence identity in the range of from equal to or more than 30% to equal to or less than 80%, more preferably in the range of from equal to or more than 35% to equal to or less than 75%, most preferably in the range of from equal to or more than 35% to equal to or less than 70% or even equal to or less than 65% to a nucleic acid sequence encoding a native transketolase; and/or
-one, two or more copies of a nucleic acid sequence encoding any of the heterologous transketolase described above; and/or
-one, two or more copies of the nucleic acid sequence of SEQ ID NO. 18 and/or SEQ ID NO. 20; and/or
-one, two or more copies of a nucleic acid sequence having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity with the nucleic acid sequence of SEQ ID No. 18 and/or SEQ ID No. 20; and/or
One, two or more copies of a nucleic acid sequence having one or more mutations, substitutions, insertions and/or deletions compared to the nucleic acid sequence of SEQ ID NO. 18 and/or SEQ ID NO. 20, respectively, wherein more preferably the nucleic acid sequence has NO more than 300, NO more than 250, NO more than 200, NO more than 150, NO more than 100, NO more than 75, NO more than 50, NO more than 40, NO more than 30, NO more than 20, NO more than 10 or NO more than 5 nucleic acid mutations, substitutions, insertions and/or deletions compared to the nucleic acid sequence of SEQ ID NO. 18 and/or SEQ ID NO. 20, respectively.
Most preferably, the recombinant yeast cell is a recombinant Saccharomyces cerevisiae cell comprising (respectively, functionally expressing) the following:
i) One, two or more copies of the nucleic acid sequence of SEQ ID NO. 10; and
ii) one, two or more copies of the nucleic acid sequence of SEQ ID NO. 18 and/or SEQ ID NO. 20; and/or one, two or more copies of a nucleic acid sequence having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the nucleic acid sequence of SEQ ID NO. 18 and/or SEQ ID NO. 20; and/or one, two or more copies of a nucleic acid sequence having one or more mutations, substitutions, insertions and/or deletions compared to the nucleic acid sequence of SEQ ID NO. 18 and/or SEQ ID NO. 20, respectively, wherein more preferably the nucleic acid sequence has NO more than 300, NO more than 250, NO more than 200, NO more than 150, NO more than 100, NO more than 75, NO more than 50, NO more than 40, NO more than 30, NO more than 20, NO more than 10 or NO more than 5 nucleic acid mutations, substitutions, insertions and/or deletions compared to the nucleic acid sequence of SEQ ID NO. 18 and/or SEQ ID NO. 20, respectively.
Optional overexpression of one or more other enzymes of the PPP pathway
The recombinant yeast cell may further optionally comprise one or more genetic modifications in other PPP genes (i.e., RKI, RPE, and TAL) that increase the flux of the pentose phosphate pathway. Advantageously, such genetic modification(s) may allow for a further increase in flux through the non-oxidative part of the pentose phosphate pathway.
Thus, the recombinant yeast cell may optionally comprise one or more additional genetic modifications to overexpress one or more other enzymes of (the non-oxidative part of) the pentose phosphate pathway. For example, a recombinant yeast cell can comprise one or more nucleic acid sequences to overexpress one or more enzymes selected from the group consisting of: ribulose-5-phosphate isomerase, ribulose-5-phosphate epimerase and transaldolase.
"Ribulose 5-phosphate epimerase" (EC 5.1.3.1) is defined herein as an enzyme that catalyzes the epimerization of D-xylulose 5-phosphate to D-ribulose 5-phosphate (and vice versa). This enzyme is also known as ribulose phosphate epimerase; erythrose-4-phosphate isomerase; pentose phosphate 3-epimerase; xylulose phosphate 3-epimerase; pentose phosphate epimerase; ribulose 5-phosphate 3-epimerase; D-ribulose phosphate-3-epimerase; D-ribulose 5-phosphate epimerase; D-ribulose-5-P 3-epimerase; D-xylulose-5-phosphate 3-epimerase; pentose-5-phosphate 3-epimerase; or D-ribulose-5-phosphate 3-epimerase. Ribulose 5-phosphate epimerase can be further defined by its amino acid sequence. Likewise, a ribulose 5-phosphate epimerase can be defined by a nucleotide sequence encoding the enzyme and a nucleotide sequence hybridizing to a reference nucleotide sequence encoding the ribulose 5-phosphate epimerase. The nucleotide sequence encoding ribulose 5-phosphate epimerase is referred to herein as RPE or RPE1.
"Ribulose 5-phosphate isomerase" (EC 5.3.1.6) is defined herein as an enzyme that catalyzes the direct isomerization of D-ribose 5-phosphate to D-ribulose 5-phosphate (and vice versa). This enzyme is also known as pentose phosphate isomerase; phosphoribosyl isomerase; ribose phosphate isomerase; 5-phosphoribosyl isomerase; D-ribose 5-phosphate isomerase; D-ribose-5-phosphate ketol-isomerase; or D-ribose-5-phosphate aldose-ketose-isomerase. Ribulose 5-phosphate isomerase may be further defined by its amino acid sequence. Likewise, a ribulose 5-phosphate isomerase may be defined by a nucleotide sequence encoding the enzyme and a nucleotide sequence hybridizing to a reference nucleotide sequence encoding the ribulose 5-phosphate isomerase. The nucleotide sequence encoding ribulose 5-phosphate isomerase is referred to herein as RKI or RKI1.
"Transaldolase" (EC 2.2.1.2) is defined herein as an enzyme that catalyzes the reaction: sedoheptulose 7-phosphate + D-glyceraldehyde 3-phosphate <-> D-erythrose 4-phosphate + D-fructose 6-phosphate, and vice versa. This enzyme is also known as dihydroxyacetone transferase; dihydroxyacetone synthase; formaldehyde transketolase; or sedoheptulose-7-phosphate:D-glyceraldehyde-3-phosphate glyceronetransferase. Transaldolase may be further defined by its amino acid sequence. Likewise, a transaldolase may be defined by a nucleotide sequence encoding the enzyme and a nucleotide sequence hybridizing to a reference nucleotide sequence encoding the transaldolase. The nucleotide sequence encoding a transaldolase is referred to herein as TAL or TAL1.
TKL promoter
The recombinant yeast cell is preferably a recombinant yeast cell in which the nucleic acid sequence encoding the native protein having transketolase activity and/or the nucleic acid sequence encoding the heterologous protein having transketolase activity is under the control of a promoter ("TKL promoter") having an anaerobic/aerobic expression ratio of 2 or more for transketolase. This suitably means that the expression of the heterologous and/or native transketolase ("TKL") under anaerobic conditions is at least 2-fold higher than under aerobic conditions. Put differently, the recombinant yeast cell functionally expresses a nucleic acid sequence encoding a native transketolase (respectively, a heterologous transketolase) under the control of a promoter ("TKL promoter") whose TKL expression ratio (anaerobic/aerobic) is 2 or higher.
The TKL promoter may suitably be operably linked to a nucleic acid sequence encoding a protein having transketolase activity. Preferably, the TKL promoter is located in the 5' region of the TKL gene; more preferably, it is located close to the transcription initiation site of the TKL gene.
Preferably, the TKL promoter is ROX1-repressed. ROX1 is herein a heme-dependent repressor of one or more hypoxic genes. It mediates aerobic transcriptional repression of hypoxia-induced genes such as COX5b and CYC7; its repressor function is regulated through decreased promoter occupancy in response to oxidative stress; it contains an HMG domain responsible for DNA binding activity; and it is involved in hyperosmotic stress resistance. ROX1 is regulated by oxygen.
Without wishing to be bound by any type of theory, it is believed that the regulation of ROX1 may function as follows. According to Kwast et al., "Genomic Analysis of Anaerobically induced genes in Saccharomyces cerevisiae: functional roles of ROX1 and other factors in mediating the anoxic response" (2002), Journal of Bacteriology, volume 184, issue 1, pages 250-265, incorporated herein by reference: "Although Rox1 functions in an O2-independent manner, its expression is oxygen (heme)-dependent, activated by the heme-dependent transcription factor Hap1 [19]. Thus, as oxygen levels fall to those that limit heme biosynthesis [20], ROX1 is no longer transcribed [21], its protein levels fall [22], and the genes it regulates are de-repressed."
Further details and suitable motifs are provided by: Keng, T. (1992), "HAP1 and ROX1 form a regulatory pathway in the repression of HEM13 transcription in Saccharomyces cerevisiae", Mol. Cell. Biol. 12:2616-2623, and Ter Linde and Steensma, "A microarray-assisted screen for potential Hap1 and Rox1 target genes in Saccharomyces cerevisiae" (2002), Yeast 19:825-840, both incorporated herein by reference.
Preferably, the TKL promoter comprises a ROX1 binding motif. The TKL promoter may suitably comprise one or more ROX1 binding motifs.
More preferably, the TKL promoter may comprise one or more copies of the motif NNNATTGTTNNN in its nucleic acid sequence. In this context, "N" represents a nucleotide base selected from the group consisting of: adenine (A), guanine (G), cytosine (C) and thymine (T). This motif is shown by SEQ ID NO. 21.
More preferably, the TKL promoter comprises or consists of: a nucleic acid sequence identical to the nucleic acid sequence of a preferably natural promoter of a gene selected from the list consisting of: FET4, ANB1, YHR048W, DAN1, AAC3, TIR2, DIP5, HEM13, YNR014W, YAR028W, FUN, COX5B, OYE2, SUR2, FRDS1, PIS1, LAC1, YGR035C, YAL028W, EUG1, HEM14, ISU2, ERG26, YMR252C and SML1, more preferably FET4, ANB1, YHR048W, DAN1, AAC3, TIR2, DIP5 and HEM13, or functional homologs thereof comprising a nucleic acid sequence having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the above. Reference herein to a native promoter refers to a promoter that is native to the host cell.
Preferably, the recombinant yeast cell is a recombinant Saccharomyces cerevisiae cell; and preferably the TKL promoter is a natural promoter of a Saccharomyces cerevisiae gene selected from the list consisting of: FET4, ANB1, YHR048W, DAN1, AAC3, TIR2, DIP5, HEM13, YNR014W, YAR028W, FUN, COX5B, OYE2, SUR2, FRDS1, PIS1, LAC1, YGR035C, YAL028W, EUG1, HEM14, ISU2, ERG26, YMR252C and SML1.
Additionally or in the alternative, the TKL promoter preferably comprises one or more copies of the following motifs in its nucleic acid sequence: TCGTTYAG and/or AAAAATTGTTGA. Herein, "Y" represents C or T. The AAAAATTGTTGA motif is shown by SEQ ID NO. 22.
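As an illustration of how the presence of the motifs named above could be checked in a candidate promoter, the following minimal Python sketch translates the motifs NNNATTGTTNNN, TCGTTYAG and AAAAATTGTTGA into regular expressions ("N" = A, C, G or T; "Y" = C or T) and reports the start positions of all hits. The function names and the toy sequence are hypothetical, and for simplicity the sketch scans only the strand it is given; scanning the reverse complement as well would be an extension.

```python
import re

# Degenerate symbols used by the motifs in the text
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "Y": "[CT]"}

MOTIFS = ("NNNATTGTTNNN", "TCGTTYAG", "AAAAATTGTTGA")


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Build a regex for a motif; the lookahead lets overlapping hits be found."""
    body = "".join(IUPAC[symbol] for symbol in motif)
    return re.compile("(?=(" + body + "))")


def find_motifs(promoter: str) -> dict:
    """Return 0-based start positions of each motif on the given strand."""
    sequence = promoter.upper()
    return {motif: [m.start() for m in motif_to_regex(motif).finditer(sequence)]
            for motif in MOTIFS}


# Made-up toy sequence (not a real promoter) containing all three motifs
toy_promoter = "GGGAAAAATTGTTGACCCTCGTTCAGTT"
hits = find_motifs(toy_promoter)
```

For the toy sequence, `hits` maps NNNATTGTTNNN to position 4, AAAAATTGTTGA to position 3 and TCGTTYAG to position 18.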
The TKL promoter may also comprise or consist of a nucleic acid sequence which is identical to the nucleic acid sequence of a preferably natural promoter of a DAN, TIR or PAU gene. For example, the TKL promoter may suitably comprise or consist of: a nucleic acid sequence of a preferably natural promoter of a gene selected from the list consisting of: TIR2, DAN1, TIR4, TIR3, PAU7, PAU5, YLL064C, YGR294W, DAN3, YIL176C, YGL261C, YOL161C, PAU1, PAU6, DAN2, YDR542W, YIR041W, YKL224C, PAU3, YLL025W, YOR394W, YHL046C, YMR325W, YAL068C, YPL282C, PAU, and PAU4, or functional homologs thereof, comprising a nucleic acid sequence having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% sequence identity to the above. Reference herein to a native promoter refers to a promoter that is native to the host cell.
Preferably, the recombinant yeast cell is a recombinant Saccharomyces cerevisiae cell; and preferably the TKL promoter is a natural promoter of a Saccharomyces cerevisiae gene selected from the list consisting of: TIR2, DAN1, TIR4, TIR3, PAU7, PAU5, YLL064C, YGR294W, DAN3, YIL176C, YGL261C, YOL161C, PAU1, PAU6, DAN2, YDR542W, YIR041W, YKL224C, PAU3, YLL025W, YOR394W, YHL046C, YMR325W, YAL068C, YPL282C, PAU and PAU4.
More preferably, the TKL promoter may comprise or consist of: a sequence identical to the nucleic acid sequence of a preferably natural promoter of a gene selected from the list consisting of: TIR2, DAN1, TIR4, TIR3, PAU7, PAU5, YLL064C, YGR294W, DAN3, YIL176C, YGL261C, YOL161C, PAU1, PAU6, DAN2, YDR542W, YIR041W, YKL224C, PAU3, and YLL025W, or a functional homolog thereof comprising a nucleic acid sequence having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% sequence identity to the above.
The nucleic acid sequence of the Saccharomyces cerevisiae ANB1 promoter is shown in SEQ ID NO. 23. The nucleic acid sequence of the Saccharomyces cerevisiae DAN1 promoter is shown in SEQ ID NO. 24.
Thus, a preferred TKL promoter may comprise or consist of:
-the nucleic acid sequence of SEQ ID NO. 23 or SEQ ID NO. 24; or alternatively
-a functional homolog of the nucleic acid sequence of SEQ ID No. 23 or SEQ ID No. 24, which functional homolog has at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity with the nucleic acid sequence of SEQ ID No. 23 or SEQ ID No. 24; or alternatively
A functional homolog of the nucleic acid sequence of SEQ ID NO. 23 or SEQ ID NO. 24, which has one or more mutations, substitutions, insertions and/or deletions compared to the nucleic acid sequence of SEQ ID NO. 23 or SEQ ID NO. 24, wherein more preferably the nucleic acid sequence has NO more than 300, NO more than 250, NO more than 200, NO more than 150, NO more than 100, NO more than 75, NO more than 50, NO more than 40, NO more than 30, NO more than 20, NO more than 10 or NO more than 5 nucleic acid mutations, substitutions, insertions and/or deletions compared to the nucleic acid sequence of SEQ ID NO. 23 or SEQ ID NO. 24.
The TKL promoter may also be a synthetic oligonucleotide. That is, the TKL promoter may be a product of artificial oligonucleotide synthesis. Artificial oligonucleotide synthesis is a method in synthetic biology for the production of artificial oligonucleotides (such as genes) in the laboratory. Commercial gene synthesis services are now available from many companies around the world, some of which have established their business models around this task. Current methods of gene synthesis are most often based on a combination of organic chemistry and molecular biology techniques, and can synthesize the entire gene "de novo" without the need for a precursor template DNA.
The TKL expression ratio (anaerobic/aerobic) of the TKL promoter is 2 or higher, preferably 3 or higher, 4 or higher, 5 or higher, 6 or higher, 7 or higher, 8 or higher, 9 or higher, 10 or higher, 20 or higher, or 50 or higher. A TKL expression ratio (anaerobic/aerobic) of 2 or higher suitably means that, under otherwise identical expression conditions, the expression of the transketolase ("TKL") under anaerobic conditions is at least 2 times higher than under aerobic conditions.
There is no upper limit, and the TKL promoter may be a TKL promoter that allows promotion of expression of the transketolase gene only under anaerobic conditions, not under aerobic conditions.
For practical reasons, one may consider a TKL expression ratio (anaerobic/aerobic) in the range of from equal to or greater than 2 to equal to or less than 10^10, or to equal to or less than 10^4.
As indicated above, "expression" herein refers to transcription of a gene into structural RNA (rRNA, tRNA) or messenger RNA (mRNA) followed by translation into a protein.
The TKL expression ratio can be determined, for example, by measuring the amount of transketolase (TKL) protein in cells grown under aerobic and anaerobic conditions. The amount of TKL protein may be determined by proteomics or by any other method known for quantifying the amount of a protein.
The level or ratio of Transketolase (TKL) expression can also be determined by measuring the Transketolase (TKL) activity of cells grown under aerobic and anaerobic conditions (e.g., in cell-free extracts).
Additionally or alternatively to the above, the level or TKL expression ratio may be determined by measuring the transcript level of the TKL gene (e.g., as an amount of mRNA) in cells grown under aerobic and anaerobic conditions. The skilled artisan knows how to determine transcript levels using methods generally known in the art (e.g., Q-PCR, real-time PCR, northern blotting, RNA-seq).
The TKL promoter advantageously enables higher expression of the transketolase under anaerobic conditions than under aerobic conditions. In the method according to the invention, the recombinant yeast cell preferably expresses a transketolase, wherein the amount of transketolase expressed under anaerobic conditions is a multiple of the amount of transketolase expressed under aerobic conditions, and wherein the multiple is preferably 2 or more, more preferably 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 20 or more or 50 or more.
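The ratio described above can be computed from any pair of matched measurements (protein amount, enzyme activity or transcript level) taken under anaerobic and aerobic conditions. The following minimal Python sketch shows the arithmetic and reports the highest of the listed multiples (2, 3, ... 10, 20, 50) that a measured ratio reaches. The function names and the numbers in the example are hypothetical, and the sketch assumes both measurements are in the same units and were obtained under otherwise identical conditions.

```python
import math

# Multiples named in the text, in ascending order
THRESHOLDS = (2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50)


def tkl_expression_ratio(anaerobic: float, aerobic: float) -> float:
    """Anaerobic/aerobic expression ratio from two same-unit measurements."""
    if anaerobic < 0 or aerobic < 0:
        raise ValueError("measurements must be non-negative")
    if aerobic == 0:
        # No aerobic expression at all: the ratio has no upper limit
        return math.inf if anaerobic > 0 else math.nan
    return anaerobic / aerobic


def highest_threshold_met(ratio: float):
    """Largest listed multiple that the ratio reaches, or None if below 2."""
    met = [t for t in THRESHOLDS if ratio >= t]
    return max(met) if met else None


# Illustrative numbers only (arbitrary units, not measured data)
ratio = tkl_expression_ratio(anaerobic=12.0, aerobic=3.0)
level = highest_threshold_met(ratio)
```

A promoter that is silent under aerobic conditions yields an infinite ratio in this sketch, which corresponds to the case in which no upper limit applies.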
Increased flux
One or more genetic modifications to the PPP genes (i.e., with respect to TKL1 and optionally RKI, RPE and TAL) that increase the flux of the non-oxidative part of the pentose phosphate pathway are preferably understood herein to mean modifications that increase the flux by at least about 1.1-fold, about 1.2-fold, about 1.5-fold, about 2-fold, about 5-fold, about 10-fold or about 20-fold as compared to the flux in a strain that is genetically identical except for the genetic modification that increases the flux. The flux of the non-oxidative part of the pentose phosphate pathway can be measured by growing the modified host on xylose as sole carbon source, determining the specific xylose consumption rate and, if any xylitol is produced, subtracting the specific xylitol production rate from the specific xylose consumption rate. However, the flux of the non-oxidative part of the pentose phosphate pathway is proportional to the growth rate on xylose as sole carbon source, preferably to the anaerobic growth rate on xylose as sole carbon source. There is a linear relationship between the growth rate on xylose as sole carbon source (μ_max) and the flux of the non-oxidative part of the pentose phosphate pathway. The specific xylose consumption rate (Q_s) is equal to the growth rate (μ) divided by the biomass yield on sugar (Y_xs), because the biomass yield on sugar is constant under a given set of conditions (anaerobic, growth medium, pH, genetic background of the strain, etc.); i.e., Q_s = μ/Y_xs. Thus, an increase in flux of the non-oxidative part of the pentose phosphate pathway can be deduced from an increase in maximum growth rate under these conditions, unless transport (uptake) is limiting.
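The relationship between growth rate, biomass yield and specific xylose consumption rate can be made concrete with a short Python sketch. It implements the stated relation (specific consumption rate = growth rate divided by biomass yield), the xylitol correction, and the fold increase relative to a reference strain. The function names, the units (growth rate in 1/h, yield in g biomass per g xylose) and all numbers are assumptions chosen for illustration and are not data from this description.

```python
def specific_xylose_consumption_rate(mu: float, y_xs: float) -> float:
    """Q_s = mu / Y_xs (g xylose per g biomass per hour under the assumed units)."""
    if y_xs <= 0:
        raise ValueError("biomass yield must be positive")
    return mu / y_xs


def non_oxidative_ppp_flux(q_xylose: float, q_xylitol: float = 0.0) -> float:
    """Flux estimate: specific xylose consumption minus specific xylitol production."""
    return q_xylose - q_xylitol


def fold_increase(modified: float, reference: float) -> float:
    """Fold change of the modified strain relative to the reference strain."""
    return modified / reference


# Illustrative values: same biomass yield, doubled growth rate
y_xs = 0.125                      # g biomass / g xylose (assumed constant)
q_reference = specific_xylose_consumption_rate(mu=0.125, y_xs=y_xs)
q_modified = specific_xylose_consumption_rate(mu=0.25, y_xs=y_xs)
fold = fold_increase(non_oxidative_ppp_flux(q_modified),
                     non_oxidative_ppp_flux(q_reference))
```

Because the yield is held constant, the fold increase in flux equals the fold increase in growth rate (2-fold in this example), which is the reasoning that allows the flux increase to be inferred from the growth rate.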
One or more genetic modifications that increase the flux of the pentose phosphate pathway can be introduced into a host cell in a variety of ways. These include, for example, achieving higher steady-state activity levels of xylulose kinase and/or of one or more enzymes of the non-oxidative part of the pentose phosphate pathway, and/or a reduced steady-state level of non-specific aldose reductase activity. These changes in steady-state activity levels can be achieved by selection of mutants (spontaneous or induced by chemicals or radiation) and/or by recombinant DNA techniques (e.g., by overexpression or inactivation, respectively, of genes encoding the enzymes or of factors regulating these genes).
In preferred host cells, the genetic modification comprises overexpression of at least one enzyme of (the non-oxidative part of) the pentose phosphate pathway. Preferably, the enzyme is selected from the group consisting of: ribulose-5-phosphate isomerase, ribulose-5-phosphate epimerase, transketolase and transaldolase. Various combinations of enzymes of (the non-oxidative part of) the pentose phosphate pathway can be overexpressed. For example, the overexpressed enzymes may be at least a ribulose-5-phosphate isomerase and a ribulose-5-phosphate epimerase; or at least a ribulose-5-phosphate isomerase and a transketolase; or at least a ribulose-5-phosphate isomerase and a transaldolase; or at least a ribulose-5-phosphate epimerase and a transketolase; or at least a ribulose-5-phosphate epimerase and a transaldolase; or at least a transketolase and a transaldolase; or at least a ribulose-5-phosphate epimerase, a transketolase and a transaldolase; or at least a ribulose-5-phosphate isomerase, a transketolase and a transaldolase; or at least a ribulose-5-phosphate isomerase, a ribulose-5-phosphate epimerase and a transaldolase; or at least a ribulose-5-phosphate isomerase, a ribulose-5-phosphate epimerase and a transketolase. In one embodiment of the invention, each of the ribulose-5-phosphate isomerase, ribulose-5-phosphate epimerase, transketolase and transaldolase is overexpressed in a host cell. More preferred are host cells wherein the genetic modification comprises at least overexpression of both a transketolase and a transaldolase, as such host cells are already capable of anaerobic growth on xylose. In fact, under some conditions, host cells that overexpress only transketolase and transaldolase already have the same anaerobic growth rate on xylose as host cells that overexpress all four enzymes (i.e., ribulose-5-phosphate isomerase, ribulose-5-phosphate epimerase, transketolase and transaldolase).
Furthermore, host cells that overexpress both ribulose-5-phosphate isomerase and ribulose-5-phosphate epimerase are preferred over host cells that overexpress only isomerase or only epimerase, as overexpression of only one of these enzymes may create a metabolic imbalance.
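The combinations of overexpressed enzymes listed above can be enumerated mechanically. The following is a minimal Python sketch for illustration only; the function name `overexpression_sets` and the constant `PPP_ENZYMES` are hypothetical and not part of the disclosure. It lists every combination of at least two of the four named enzymes (six pairs, four triples and the full set of four).

```python
from itertools import combinations

# The four enzymes of the non-oxidative part of the pentose phosphate
# pathway named in the text.
PPP_ENZYMES = (
    "ribulose-5-phosphate isomerase",
    "ribulose-5-phosphate epimerase",
    "transketolase",
    "transaldolase",
)


def overexpression_sets(enzymes=PPP_ENZYMES, min_size=2):
    """Return every subset of `enzymes` containing at least `min_size` members."""
    subsets = []
    for size in range(min_size, len(enzymes) + 1):
        subsets.extend(combinations(enzymes, size))
    return subsets


# Print each combination on its own line.
for subset in overexpression_sets():
    print(" + ".join(subset))
```

The transketolase plus transaldolase pair, described above as more preferred, is one of the enumerated combinations.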
Redox sink
Preferably, the recombinant yeast cell may further comprise one or more genetic modifications for functionally expressing proteins that play a role in metabolic pathways that form a non-native redox sink.
For example, the one or more genetic modifications may be one or more genetic modifications for functional expression of one or more, optionally heterologous, nucleic acid sequences encoding one or more NAD+/NADH-dependent proteins, which one or more proteins play a role in a metabolic pathway that converts NADH to NAD+. There are several examples of such metabolic pathways, as further shown below.
For example, "one or more genetic modifications for functionally expressing a protein that functions in a metabolic pathway that forms a non-native redox sink" may be selected from the group consisting of:
a) Comprising or consisting of one or more of the following genetic modifications:
-a preferably heterologous nucleic acid sequence encoding a protein comprising phosphoketolase (PKL) activity (EC 4.1.2.9 or EC 4.1.2.22); and/or
-a preferably heterologous nucleic acid sequence encoding a protein having phosphotransacetylase (PTA) activity (EC 2.3.1.8); and/or
-a preferably heterologous nucleic acid sequence encoding a protein having acetate kinase (ACK) activity (EC 2.7.2.12);
and/or
b) Comprising or consisting of one or more of the following genetic modifications:
-a preferably heterologous nucleic acid sequence encoding a protein having ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) activity; and/or
-a preferably heterologous nucleic acid sequence encoding a protein having phosphoribulokinase (PRK) activity; and
-optionally, a preferably heterologous nucleic acid sequence encoding one or more chaperones for a protein having ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) activity;
and/or
c) Comprising or consisting of one or more of the following genetic modifications:
-a preferably heterologous nucleic acid sequence encoding a protein comprising NADH-dependent acetylating acetaldehyde dehydrogenase activity.
For example, WO 2014/081803 (incorporated herein by reference) describes recombinant microorganisms expressing a heterologous phosphoketolase, phosphotransacetylase or acetate kinase and a bifunctional acetaldehyde-alcohol dehydrogenase; and WO 2015/148272 (incorporated herein by reference) describes recombinant Saccharomyces cerevisiae strains expressing a heterologous phosphoketolase, phosphotransacetylase and acetylating acetaldehyde dehydrogenase. Furthermore, WO 2018172328A1 describes recombinant cells that may comprise one or more (heterologous) genes encoding enzymes having phosphoketolase activity. The phosphoketolase (PKL) pathway described in WO 2014/081803, WO 2015/148272 and WO 2018172328A1 (all incorporated herein by reference) provides a preferred metabolic pathway for converting NADH to NAD+, and the NADH-dependent phosphoketolase described therein is a preferred NADH-dependent protein for use in the present invention.
Rubisco
As indicated above, the recombinant yeast cell may advantageously functionally express a nucleic acid sequence encoding a ribulose-1,5-bisphosphate carboxylase/oxygenase (EC 4.1.1.39; Rubisco) and optionally one or more, preferably heterologous, chaperones for Rubisco.
More preferably, the recombinant yeast cell functionally expresses:
-a heterologous nucleic acid sequence encoding a protein having ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) activity; and/or
-a heterologous nucleic acid sequence encoding a protein having phosphoribulokinase (PRK) activity; and/or
-optionally one or more heterologous nucleic acid sequences encoding one or more chaperones for a protein having ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) activity.
Proteins having ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) activity are also referred to herein as "ribulose-1,5-bisphosphate carboxylase oxygenase", "ribulose-1,5-bisphosphate carboxylase oxygenase protein", "ribulose-1,5-bisphosphate carboxylase oxygenase enzyme", "Rubisco enzyme", "Rubisco protein" or simply "Rubisco". The ribulose-1,5-bisphosphate carboxylase oxygenase can be further defined by its amino acid sequence. Likewise, the ribulose-1,5-bisphosphate carboxylase oxygenase may be further defined by a nucleotide sequence encoding it. As explained in detail above under the definitions, a certain ribulose-1,5-bisphosphate carboxylase oxygenase defined by a nucleotide sequence encoding the enzyme includes (unless otherwise limited) a nucleotide sequence that hybridizes to such a nucleotide sequence encoding a ribulose-1,5-bisphosphate carboxylase oxygenase. Preferred Rubisco proteins and nucleic acid sequences encoding such Rubisco proteins are as described in WO 2014/129898 (incorporated herein by reference).
The Rubisco protein may suitably be selected from the group of eukaryotic and prokaryotic Rubisco proteins. The Rubisco protein is preferably from a non-phototrophic organism. For example, the Rubisco protein may be from a chemolithoautotrophic microorganism. Good results have been obtained with bacterial Rubisco proteins. Preferably, the Rubisco protein is from a chemolithoautotrophic Thiobacillus, in particular Thiobacillus denitrificans.
The Rubisco protein may be a single-subunit Rubisco protein or a Rubisco protein having more than one subunit. Preferably, the Rubisco protein is a single-subunit Rubisco protein. Good results have been obtained with a Rubisco protein that is a so-called form II Rubisco protein. Particularly good results were obtained with the Rubisco protein encoded by the cbbM gene.
A preferred Rubisco protein is the Rubisco protein encoded by the cbbM gene from Thiobacillus denitrificans. SEQ ID NO. 25 shows the amino acid sequence of a suitable Rubisco protein encoded by the cbbM gene from Thiobacillus denitrificans. SEQ ID NO. 26 shows the nucleic acid sequence of the cbbM gene from Thiobacillus denitrificans, codon optimized for Saccharomyces cerevisiae.
Thus, preferably, the protein having ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) activity comprises or consists of:
-the amino acid sequence of SEQ ID NO. 25; or
-a functional homolog of SEQ ID NO. 25 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence of SEQ ID NO. 25; or
-a functional homolog of SEQ ID NO. 25 which has one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 25, more preferably a functional homolog having no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 25.
Preferably, the nucleic acid sequence encoding a protein having ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) activity comprises or consists of:
-the nucleic acid sequence of SEQ ID NO. 26; or
-a functional homolog of SEQ ID NO. 26 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the nucleic acid sequence of SEQ ID NO. 26; or
-a functional homolog of SEQ ID NO. 26 which has one or more mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 26, more preferably a functional homolog having no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 nucleic acid mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 26.
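By way of illustration, the sequence identity thresholds recited above can be checked with a short Python sketch. It assumes that the two sequences have already been aligned (gaps written as "-") and that identity is counted as identical positions divided by alignment columns; the definitions section of this specification governs how identity is actually determined, so this counting convention is an assumption. The names `percent_identity` and `highest_threshold_met` and the toy sequences are hypothetical.

```python
# Thresholds recited in the text, in percent.
IDENTITY_THRESHOLDS = (40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences (gaps written as '-').

    Assumption: identity = identical residues / alignment columns * 100,
    ignoring columns in which both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("empty alignment")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the highest recited threshold that `identity` reaches, or None."""
    met = [t for t in IDENTITY_THRESHOLDS if identity >= t]
    return max(met) if met else None


# Toy example with made-up peptides, not SEQ ID NO. 25 or 26.
identity = percent_identity("MKTA-IAK", "MKTAYIGK")
print(identity, highest_threshold_met(identity))
```

In practice the alignment itself would come from an established alignment tool; the sketch only covers the counting step.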
Examples of other suitable Rubisco polypeptides and sources thereof are given in Table 1 of WO 2014/129898 (incorporated herein by reference) and in Table 2 below, with reference to sequence identity to the amino acid sequence of SEQ ID NO. 25.
The nucleic acid sequence (e.g. gene) encoding a ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) protein can be suitably incorporated into the genome of a recombinant yeast cell, for example as described in the examples of WO 2014/129898 and in the following article: Guadalupe-Medina et al., "Carbon dioxide fixation by Calvin-Cycle enzymes improves ethanol yield in yeast" (2013), Biotechnology for Biofuels, volume 6, page 125, both incorporated herein by reference.
Table 2: Natural Rubisco polypeptides suitable for expression
As indicated above, the Rubisco protein is suitably functionally expressed in the recombinant yeast cell, at least during use in a fermentation process.
The nucleic acid sequence encoding the Rubisco protein may be present in one, two or more copies in a recombinant yeast cell. Without wishing to be bound by any type of theory, it is believed that the robustness of recombinant yeast cells is best when the nucleic acid sequence (e.g. gene) encoding the Rubisco protein is present in the recombinant yeast cell in fewer than 12 copies, more preferably fewer than 8 copies. Thus, preferably, the recombinant yeast cell comprises the nucleic acid sequence (e.g. gene) encoding the Rubisco protein in a range from equal to or more than 1 copy, more preferably equal to or more than 2 copies, to equal to or fewer than 7 copies, more preferably equal to or fewer than 6 copies. The recombinant yeast cell can, for example, comprise one, two, three, four, five, six or seven copies of a nucleic acid sequence encoding a ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco).
To increase the likelihood that the Rubisco protein is expressed at a sufficient level and in an active form in the transformed (recombinant) host cell of the invention, the nucleic acid sequences encoding the Rubisco protein and the other proteins described herein (see below) are preferably adapted to optimize their codon usage to that of the host cell in question. The adaptiveness of a nucleic acid sequence encoding an enzyme to the codon usage of a host cell may be expressed as the codon adaptation index (CAI). The codon adaptation index is defined herein as a measure of the relative adaptiveness of the codon usage of a gene towards the codon usage of highly expressed genes in a particular host cell or organism. The relative adaptiveness (w) of each codon is the ratio of the usage of that codon to the usage of the most abundant codon for the same amino acid. The CAI is defined as the geometric mean of these relative adaptiveness values. Non-synonymous codons and termination codons (dependent on the genetic code) are excluded. CAI values range from 0 to 1, with higher values indicating a higher proportion of the most abundant codons (see Sharp and Li, "The codon adaptation index - a measure of directional synonymous codon usage bias, and its potential applications" (1987), Nucleic Acids Research, volume 15, pages 1281-1295; see also Jansen et al., "Revisiting the codon adaptation index from a whole-genome perspective: analyzing the relationship between gene expression and codon occurrence in yeast using a variety of models" (2003), Nucleic Acids Research, volume 31(8), pages 2242-2251). The CAI of an adapted nucleic acid sequence is preferably at least 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 or 0.9.
Preferably, the sequence has been codon optimized for expression in the fungal host cell in question, e.g. a S. cerevisiae cell.
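The CAI definition above (relative adaptiveness w per codon, then the geometric mean of w over the codons of a gene) can be sketched in Python as follows. The codon counts are hypothetical toy values for two amino acids only, not a real yeast reference set, and the function names are illustrative. As a simplification, codons with a zero count in the reference set are skipped; published CAI implementations handle that case differently.

```python
import math
from collections import defaultdict


def relative_adaptiveness(reference_counts, codon_to_aa):
    """w per codon: its count divided by the count of the most used
    synonymous codon for the same amino acid in the reference set."""
    max_per_aa = defaultdict(int)
    for codon, count in reference_counts.items():
        aa = codon_to_aa[codon]
        max_per_aa[aa] = max(max_per_aa[aa], count)
    return {
        codon: count / max_per_aa[codon_to_aa[codon]]
        for codon, count in reference_counts.items()
        if max_per_aa[codon_to_aa[codon]] > 0
    }


def cai(sequence, weights):
    """Geometric mean of w over the codons of `sequence`.

    Codons absent from `weights` (here: ATG, TGG, stop codons and any
    codon not in the toy table) are excluded from the mean.
    """
    usable = len(sequence) - len(sequence) % 3
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons
            if c in weights and weights[c] > 0]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical reference counts (illustrative only).
CODON_TO_AA = {"AAA": "K", "AAG": "K", "GAA": "E", "GAG": "E"}
REFERENCE_COUNTS = {"AAA": 30, "AAG": 70, "GAA": 80, "GAG": 20}

weights = relative_adaptiveness(REFERENCE_COUNTS, CODON_TO_AA)
print(cai("ATGAAGGAATAA", weights))  # uses only the most abundant codons
print(cai("AAAGAG", weights))        # uses only the less abundant codons
```

A sequence built only from the most abundant codons scores at the upper end of the 0 to 1 range, which is the behaviour the preferred CAI values above rely on.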
Preferably, the functionally expressed Rubisco protein has an activity, defined by the rate of ribulose-1,5-bisphosphate-dependent ¹⁴C-bicarbonate incorporation by cell extracts, of at least 1 nmol·min⁻¹·(mg protein)⁻¹, in particular an activity of at least 2 nmol·min⁻¹·(mg protein)⁻¹, more particularly an activity of at least 4 nmol·min⁻¹·(mg protein)⁻¹. The upper limit of activity is not critical. In practice, the activity may be about 200 nmol·min⁻¹·(mg protein)⁻¹ or lower, in particular 25 nmol·min⁻¹·(mg protein)⁻¹ or lower, more particularly 15 nmol·min⁻¹·(mg protein)⁻¹ or lower, e.g. about 10 nmol·min⁻¹·(mg protein)⁻¹ or lower. The conditions of the assay for determining this Rubisco activity are as described in the examples of WO 2014/129898 (incorporated herein by reference), e.g. example 4.
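The activity unit used above, nmol per minute per mg protein, is a specific activity. As a hedged illustration of the arithmetic only (the assay itself is described in WO 2014/129898), the following Python sketch converts hypothetical assay readings into that unit and reports which of the preferred minimum activities are met. The function names and input values are assumptions made for this example.

```python
# Preferred minimum activities recited in the text, in nmol/min/(mg protein).
PREFERRED_MINIMA = (1.0, 2.0, 4.0)


def specific_activity(nmol_incorporated: float, minutes: float,
                      mg_protein: float) -> float:
    """Specific activity in nmol per minute per mg protein."""
    if minutes <= 0 or mg_protein <= 0:
        raise ValueError("time and protein amount must be positive")
    return nmol_incorporated / (minutes * mg_protein)


def minima_met(activity: float):
    """Return the recited minimum activities that `activity` reaches."""
    return [m for m in PREFERRED_MINIMA if activity >= m]


# Hypothetical reading: 20 nmol bicarbonate incorporated in 10 min
# by an extract containing 0.5 mg protein.
activity = specific_activity(20.0, 10.0, 0.5)
print(activity, minima_met(activity))
```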
Phosphoribulokinase
Preferably, the recombinant yeast cell also functionally expresses a heterologous nucleic acid sequence encoding a protein having phosphoribulokinase (PRK) activity (EC 2.7.1.19).
Proteins having phosphoribulokinase (PRK) activity are also referred to herein as "phosphoribulokinase protein", "phosphoribulokinase enzyme", "phosphoribulokinase", "PRK enzyme", "PRK protein" or simply "PRK". Preferred PRK proteins and nucleic acid sequences encoding such PRK proteins are as described in WO 2014/129898 (incorporated herein by reference).
A functionally expressed phosphoribulokinase (PRK; EC 2.7.1.19) according to the invention is capable of catalyzing the following chemical reaction:
ATP + D-ribulose 5-phosphate ⇌ ADP + D-ribulose 1,5-bisphosphate
Thus, the two substrates of the enzyme are ATP and D-ribulose 5-phosphate; the two products are ADP and D-ribulose 1,5-bisphosphate.
PRK proteins belong to the family of transferases, specifically those transferring phosphorus-containing groups (phosphotransferases) with an alcohol group as acceptor. The systematic name of this enzyme class is ATP:D-ribulose-5-phosphate 1-phosphotransferase. Other names in common use include phosphopentokinase, ribulose-5-phosphate kinase, phosphoribulokinase (phosphorylating), 5-phosphoribulose kinase, ribulose phosphate kinase, PKK, PRuK and PRK. PRK enzymes participate in carbon fixation. The phosphoribulokinase (PRK) protein may be further defined by its amino acid sequence. Likewise, a phosphoribulokinase (PRK) protein may be further defined by a nucleotide sequence encoding the phosphoribulokinase (PRK). As explained in detail above under the definitions, a certain phosphoribulokinase (PRK) defined by a nucleotide sequence encoding the enzyme includes (unless otherwise limited) a nucleotide sequence that hybridizes to such a nucleotide sequence encoding a phosphoribulokinase (PRK).
The PRK may be from a prokaryote or a eukaryote. Good results have been obtained with PRKs derived from eukaryotic organisms. Preferably, the PRK protein is derived from a plant selected from the order Caryophyllales, in particular from the family Amaranthaceae, more in particular from the genus Spinacia.
A preferred PRK protein is the PRK protein from Spinacia. SEQ ID NO. 27 shows the amino acid sequence of this PRK protein from Spinacia. SEQ ID NO. 28 shows the nucleic acid sequence of the prk gene from Spinacia, codon optimized for Saccharomyces cerevisiae.
Thus, preferably, the protein having Phosphoribulokinase (PRK) activity comprises or consists of:
-the amino acid sequence of SEQ ID NO. 27; or
-a functional homolog of SEQ ID NO. 27 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence of SEQ ID NO. 27; or
-a functional homolog of SEQ ID NO. 27 which has one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 27, more preferably a functional homolog having no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 27.
Preferably, the nucleic acid sequence encoding a protein having Phosphoribulokinase (PRK) activity comprises or consists of:
-the nucleic acid sequence of SEQ ID NO. 28; or
-a functional homolog of SEQ ID NO. 28 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the nucleic acid sequence of SEQ ID NO. 28; or
-a functional homolog of SEQ ID NO. 28 which has one or more mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 28, more preferably a functional homolog having no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 nucleic acid mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 28.
The nucleic acid sequence (e.g. gene) encoding a protein having phosphoribulokinase (PRK) activity may be suitably incorporated into the genome of a recombinant yeast cell, for example as described in the examples of WO 2014/129898 (incorporated herein by reference).
Examples of suitable PRK polypeptides and sources thereof are given in Table 2 of WO 2014/129898 (incorporated herein by reference) and in Table 3 below, with reference to sequence identity to the amino acid sequence of SEQ ID NO. 27.
Table 3: Natural PRK polypeptides suitable for expression, with identity to the PRK from Spinacia
The nucleic acid sequence encoding the PRK protein may be under the control of a promoter (the "PRK promoter") that enables higher expression under anaerobic conditions than under aerobic conditions. Examples of such promoters are described in WO 2017/216136A1 and WO 2018/228836 (both incorporated herein by reference). More preferably, such a promoter has an anaerobic/aerobic PRK expression ratio of 2 or higher, 3 or higher, 4 or higher, 5 or higher, 6 or higher, 7 or higher, 8 or higher, 9 or higher, 10 or higher, 20 or higher or 50 or higher. Further preferences are as described in WO 2018/228836 (incorporated herein by reference).
Rubisco chaperones
Optionally, the recombinant yeast cell further comprises one or more, preferably heterologous, nucleic acid sequences encoding one or more molecular chaperones for a protein having ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) activity.
Such molecular chaperones are also referred to herein simply as "chaperones". Preferred chaperones and nucleic acid sequences encoding such chaperones are as described in WO 2014/129898 (incorporated herein by reference).
Preferably, the recombinant yeast cell comprises one or more heterologous nucleic acid sequences encoding one or more chaperones for a protein having ribulose-1,5-bisphosphate carboxylase oxygenase (Rubisco) activity.
Chaperonins are proteins that provide favorable conditions for the correct folding of other proteins, thereby preventing aggregation. Newly produced proteins typically must fold from a linear chain of amino acids into a three-dimensional form. Chaperonins belong to a large class of molecules that assist in protein folding, called molecular chaperones. The energy to fold proteins is supplied by adenosine triphosphate (ATP). A review article about chaperones that is useful herein is: Yébenes et al., "Chaperonins: two rings for folding" (2011), Trends in Biochemical Sciences, volume 36, issue 8, pages 424-432, incorporated herein by reference.
The one or more chaperones may be prokaryotic chaperones or eukaryotic chaperones. Furthermore, the chaperones may be homologous or heterologous. For example, the recombinant yeast cell may comprise one or more nucleic acid sequences encoding one or more homologous or heterologous prokaryotic or eukaryotic chaperones that, when expressed, are capable of functionally interacting with enzymes in the recombinant yeast cell, in particular with at least one of Rubisco and PRK.
Suitably, the one or more molecular chaperones are derived from bacteria, more preferably from Escherichia, in particular E. coli. Preferred molecular chaperones are GroEL and GroES from E. coli. Other preferred chaperones are chaperones from the genus Saccharomyces, in particular Saccharomyces cerevisiae Hsp10 and Hsp60.
If the chaperones are naturally expressed in an organelle such as the mitochondrion (examples are Hsp60 and Hsp10 of Saccharomyces cerevisiae), relocation to the cytosol may be achieved, for example, by modifying the native signal sequence of the chaperones. In eukaryotes, the proteins Hsp60 and Hsp10 are almost identical in structure and function to GroEL and GroES, respectively. Thus, it is contemplated that Hsp60 and Hsp10 from any recombinant yeast cell may be used as chaperones for Rubisco. This is described, for example, in the following documents: Zeilstra-Ryalls et al., "The universally conserved GroE (Hsp60) chaperonins" (1991), Annual Review of Microbiology, volume 45, pages 301-325; Horwich et al., "Two Families of Chaperonin: Physiology and Mechanism" (2007), Annual Review of Cell and Developmental Biology, volume 23, pages 115-145, both of which are incorporated herein by reference.
Good results have been obtained with recombinant yeast cells comprising both heterologous chaperones GroEL and GroES.
As an alternative to GroES, there may be functional homologs of GroES, in particular functional homologs comprising an amino acid sequence having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity with the amino acid sequence of GroES (corresponding to the amino acid sequence of SEQ ID NO: 31).
SEQ ID NO. 31 provides a preferred translated protein sequence for GroES based on E. coli. SEQ ID NO. 32 provides a synthetic nucleic acid sequence based on GroES from E. coli, codon optimized for expression in Saccharomyces cerevisiae.
Examples of suitable natural chaperone polypeptides homologous to GroES are given in Table 4.
Table 4: Natural chaperone polypeptides homologous to GroES suitable for expression
As an alternative to GroEL, there may be functional homologs of GroEL, in particular functional homologs comprising an amino acid sequence having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity with the amino acid sequence of GroEL (corresponding to the amino acid sequence of SEQ ID NO: 29).
SEQ ID NO. 29 provides a preferred translated protein sequence for GroEL based on E. coli. SEQ ID NO. 30 provides a synthetic nucleic acid sequence based on GroEL from E. coli, codon optimized for expression in Saccharomyces cerevisiae.
Examples of suitable natural chaperone polypeptides homologous to GroEL are given in Table 5.
Table 5: Natural chaperone polypeptides homologous to GroEL suitable for expression
The recombinant yeast cell preferably comprises (respectively functionally expresses) both a GroES chaperone and a GroEL chaperone. Preferably, a 10 kDa chaperone ("GroES") from Table 4 is combined with a matching 60 kDa chaperone ("GroEL") from Table 5 from the same organism genus or species for expression in the recombinant yeast cell.
For example, the 71-168 10 kDa chaperone [Pyrenophora tritici-repentis] is expressed together with the matching >gi|189189366|ref|XP_00193155.1| heat shock protein 60, mitochondrial precursor [Pyrenophora tritici-repentis Pt-1C-BFP]. All other combinations from Tables 4 and 5 that are similarly drawn from the same source organism are also available to the skilled person for expression. Furthermore, a chaperone from Table 4 from one organism may be combined with a chaperone from Table 5 from another organism, or GroES may be combined with a chaperone from Table 5, or GroEL may be combined with a chaperone from Table 4.
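The pairing logic described above (a 10 kDa chaperone matched with a 60 kDa chaperone from the same organism, with cross-organism combinations also being possible) can be illustrated with a short Python sketch. The two dictionaries are hypothetical stand-ins for Tables 4 and 5 and contain only the E. coli and Saccharomyces cerevisiae chaperones named in the text; the function names are likewise illustrative.

```python
from itertools import product

# Hypothetical stand-ins for Table 4 (10 kDa) and Table 5 (60 kDa),
# keyed by source organism.
SMALL_CHAPERONES = {
    "Escherichia coli": "GroES",
    "Saccharomyces cerevisiae": "Hsp10",
}
LARGE_CHAPERONES = {
    "Escherichia coli": "GroEL",
    "Saccharomyces cerevisiae": "Hsp60",
}


def matched_pairs(small, large):
    """10 kDa / 60 kDa pairs taken from the same organism."""
    shared = sorted(small.keys() & large.keys())
    return [(small[org], large[org], org) for org in shared]


def mixed_pairs(small, large):
    """10 kDa / 60 kDa pairs taken from two different organisms."""
    return [
        (small[a], large[b])
        for a, b in product(sorted(small), sorted(large))
        if a != b
    ]


print(matched_pairs(SMALL_CHAPERONES, LARGE_CHAPERONES))
print(mixed_pairs(SMALL_CHAPERONES, LARGE_CHAPERONES))
```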
Thus, preferably, the one or more chaperones comprise or consist of:
-the amino acid sequence of SEQ ID NO. 29 and/or SEQ ID NO. 31; or
-one or more functional homologs of SEQ ID NO. 29 and/or SEQ ID NO. 31 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity with the amino acid sequence of SEQ ID NO. 29 and/or SEQ ID NO. 31, respectively; or
-one or more functional homologs of SEQ ID NO. 29 and/or SEQ ID NO. 31 which have one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 29 and/or SEQ ID NO. 31, respectively, more preferably one or more functional homologs having no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 29 and/or SEQ ID NO. 31, respectively.
Preferably, the one or more nucleic acid sequences encoding a chaperone comprise or consist of:
-the nucleic acid sequence of SEQ ID NO. 30 and/or SEQ ID NO. 32; or
-one or more functional homologs of SEQ ID NO. 30 and/or SEQ ID NO. 32 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity with the nucleic acid sequence of SEQ ID NO. 30 and/or SEQ ID NO. 32, respectively; or
-one or more functional homologs of SEQ ID NO. 30 and/or SEQ ID NO. 32 which have one or more mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 30 and/or SEQ ID NO. 32, respectively, more preferably one or more functional homologs having no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 nucleic acid mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 30 and/or SEQ ID NO. 32, respectively.
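The limits of "no more than" a given number of substitutions, insertions and/or deletions recited above can be illustrated with an edit-distance count. The Python sketch below uses the Levenshtein distance, i.e. the minimum number of single-residue substitutions, insertions and deletions that turns one sequence into another. This is an assumption made for illustration; the specification does not prescribe this counting method, and the minimum edit count is only a lower bound on the changes that were actually introduced. The function names and toy sequences are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two sequences (dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, residue_a in enumerate(a, start=1):
        current = [i]
        for j, residue_b in enumerate(b, start=1):
            substitution_cost = 0 if residue_a == residue_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion from a
                current[j - 1] + 1,                   # insertion into a
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def within_mutation_limit(reference: str, candidate: str, limit: int) -> bool:
    """True if `candidate` differs from `reference` by at most `limit` edits."""
    return edit_distance(reference, candidate) <= limit


# Toy peptides, not the sequences of the sequence listing.
print(edit_distance("MKTAYIAK", "MKTAYIGK"))             # one substitution
print(within_mutation_limit("MKTAYIAK", "MKTYIAK", 5))   # one deletion
```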
One or more nucleic acid sequences encoding a chaperone may be suitably incorporated into the genome of a recombinant yeast cell, for example as described in the examples of WO 2014/129898 (incorporated herein by reference).
Phosphoketolase
As indicated above, the recombinant yeast cell may advantageously comprise a preferably heterologous nucleic acid sequence encoding a protein (EC 4.1.2.9 or EC 4.1.2.22) comprising Phosphoketolase (PKL) activity and/or a preferably heterologous nucleic acid sequence encoding a protein (EC 2.3.1.8) having Phosphotransacetylase (PTA) activity and/or a preferably heterologous nucleic acid sequence encoding a protein (EC 2.7.2.12) having acetate kinase (ACK) activity.
The recombinant cells may comprise one or more heterologous genes encoding a protein having phosphoketolase activity. Such proteins having phosphoketolase activity are also referred to herein as "phosphoketolase proteins", "phosphoketolase (phosphoketolase enzyme)" or simply as "phosphoketolase (phosphoketolase)". Phosphoketolase is further abbreviated herein as "PKL" or "XFP".
As used herein, phosphoketolase catalyzes at least the conversion of D-xylulose 5-phosphate to D-glyceraldehyde 3-phosphate and acetyl phosphate. Phosphoketolase participates in at least one of the following reactions:
EC 4.1.2.9: D-xylulose 5-phosphate + phosphate ⇌ acetyl phosphate + D-glyceraldehyde 3-phosphate + H2O
EC 4.1.2.22: D-fructose 6-phosphate + phosphate ⇌ acetyl phosphate + D-erythrose 4-phosphate + H2O
Suitable enzyme assays for measuring phosphoketolase activity are described, for example, in the following document: Sonderegger et al., "Metabolic Engineering of a Phosphoketolase Pathway for Pentose Catabolism in Saccharomyces cerevisiae" (2004), Applied and Environmental Microbiology, volume 70(5), pages 2892-2897, incorporated herein by reference.
Preferably, the protein having Phosphoketolase (PKL) activity comprises or consists of:
-the amino acid sequence of SEQ ID NO. 33, SEQ ID NO. 34, SEQ ID NO. 35 or SEQ ID NO. 36; or
-a functional homolog of SEQ ID NO. 33, SEQ ID NO. 34, SEQ ID NO. 35 or SEQ ID NO. 36 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence of SEQ ID NO. 33, SEQ ID NO. 34, SEQ ID NO. 35 or SEQ ID NO. 36; or
-a functional homolog of SEQ ID NO. 33, SEQ ID NO. 34, SEQ ID NO. 35 or SEQ ID NO. 36 having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 33, SEQ ID NO. 34, SEQ ID NO. 35 or SEQ ID NO. 36, more preferably no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 33, SEQ ID NO. 34, SEQ ID NO. 35 or SEQ ID NO. 36.
Suitable nucleic acid sequences encoding phosphoketolase proteins can be found in organisms selected from the group consisting of: Aspergillus niger, Neurospora crassa, Lactobacillus casei (L. casei), Lactobacillus plantarum (L. plantarum), Bifidobacterium adolescentis (B. adolescentis), Bifidobacterium bifidum (B. bifidum), Bifidobacterium gallicum (B. gallicum), Bifidobacterium animalis (B. animalis), Bifidobacterium lactis (B. lactis), Lactobacillus pentosus (L. pentosus), Lactobacillus acidophilus (L. acidophilus), Penicillium chrysogenum (P. chrysogenum), Aspergillus nidulans (A. nidulans), Aspergillus clavatus (A. clavatus), Leuconostoc mesenteroides (L. mesenteroides) and Oenococcus oeni (O. oeni).
The nucleic acid sequence (e.g., gene) encoding a protein having Phosphoketolase (PKL) activity may be suitably incorporated into the genome of a recombinant yeast cell.
The recombinant cells may comprise one or more (heterologous) genes encoding enzymes having phosphoketolase activity.
Phosphotransacetylase
As indicated above, the recombinant yeast cell may advantageously comprise a preferably heterologous nucleic acid sequence encoding a protein (EC 4.1.2.9 or EC 4.1.2.22) comprising Phosphoketolase (PKL) activity and/or a preferably heterologous nucleic acid sequence encoding a protein (EC 2.3.1.8) having Phosphotransacetylase (PTA) activity and/or a preferably heterologous nucleic acid sequence encoding a protein (EC 2.7.2.12) having acetate kinase (ACK) activity.
As used herein, phosphotransacetylase catalyzes at least the conversion of acetyl phosphate to acetyl-CoA.
The recombinant cells may comprise one or more heterologous genes encoding a protein having phosphotransacetylase activity. Such a protein having phosphotransacetylase activity is also referred to herein as "phosphotransacetylase protein", "phosphotransacetylase (phosphotransacetylase enzyme)" or simply "phosphotransacetylase". Phosphotransacetylase is further abbreviated herein as "PTA".
Preferably, the protein having Phosphotransacetylase (PTA) activity comprises or consists of:
-the amino acid sequence of SEQ ID NO. 37, SEQ ID NO. 38, SEQ ID NO. 39 or SEQ ID NO. 40; or alternatively
-a functional homolog of SEQ ID No. 37, SEQ ID No. 38, SEQ ID No. 39 or SEQ ID No. 40 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence of SEQ ID No. 37, SEQ ID No. 38, SEQ ID No. 39 or SEQ ID No. 40; or alternatively
-a functional homolog of SEQ ID No. 37, SEQ ID No. 38, SEQ ID No. 39 or SEQ ID No. 40 having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID No. 37, SEQ ID No. 38, SEQ ID No. 39 or SEQ ID No. 40, more preferably no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID No. 37, SEQ ID No. 38, SEQ ID No. 39 or SEQ ID No. 40.
Suitable nucleic acid sequences encoding enzymes having phosphotransacetylase activity can be found in organisms selected from the group consisting of: Bifidobacterium adolescentis, Bacillus subtilis, Clostridium cellulolyticum, Clostridium phytofermentans, Bifidobacterium bifidum, Bifidobacterium animalis, Leuconostoc mesenteroides, Lactobacillus plantarum, Myceliophthora thermophila (M. thermophila) and Oenococcus oeni.
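The sequence identity thresholds recited in the list above can be illustrated with a minimal sketch. It assumes that the two amino acid sequences have already been aligned (gaps written as "-") and that identity is counted over all aligned columns; the function names, the toy sequences and the choice of denominator are illustrative assumptions, not the identity definition used in this specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: identity = identical residue columns / aligned columns,
    where columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column in which at least one sequence has a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A match is a column with the same residue in both sequences.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 40.0) -> bool:
    """True if the pair reaches the given identity threshold (in percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from any SEQ ID NO.
    reference = "MKT-LV"
    candidate = "MKTALI"
    print(round(percent_identity(reference, candidate), 1))
    print(meets_identity_threshold(reference, candidate, 40.0))
```

In practice the alignment itself would come from an alignment program; the sketch only covers the final comparison against a threshold such as "at least 40%".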
The nucleic acid sequence (e.g., gene) encoding a protein having Phosphotransacetylase (PTA) activity may be suitably incorporated into the genome of a recombinant yeast cell.
Acetate kinase
As indicated above, the recombinant yeast cell may comprise a preferably heterologous nucleic acid sequence encoding a protein (EC 4.1.2.9 or EC 4.1.2.22) comprising Phosphoketolase (PKL) activity and/or a preferably heterologous nucleic acid sequence encoding a protein (EC 2.3.1.8) having Phosphotransacetylase (PTA) activity and/or a preferably heterologous nucleic acid sequence encoding a protein (EC 2.7.2.12) having acetate kinase (ACK) activity.
As used herein, acetate kinase catalyzes at least the conversion of acetic acid to acetyl phosphate.
The recombinant cell may comprise one or more preferably heterologous genes encoding a protein having acetate kinase activity (EC 2.7.2.12). Such proteins having acetate kinase activity are also referred to herein as "acetate kinase proteins", "acetate kinase (acetate kinase enzyme)" or simply "acetate kinase". Acetate kinase is further abbreviated herein as "ACK".
Preferably, the protein having acetate kinase (ACK) activity comprises or consists of:
-the amino acid sequence of SEQ ID NO. 41 or SEQ ID NO. 42; or alternatively
-a functional homolog of SEQ ID No. 41 or SEQ ID No. 42 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity with the amino acid sequence of SEQ ID No. 41 or SEQ ID No. 42; or alternatively
-a functional homolog of SEQ ID No. 41 or SEQ ID No. 42 having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID No. 41 or SEQ ID No. 42, more preferably no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID No. 41 or SEQ ID No. 42.
The nucleic acid sequence (e.g., gene) encoding a protein having acetate kinase (ACK) activity may be suitably incorporated into the genome of a recombinant yeast cell.
Acetylating acetaldehyde dehydrogenase
As indicated above, the recombinant yeast cell may advantageously comprise and functionally express a preferably heterologous nucleic acid sequence encoding a protein having NAD+-dependent acetylating acetaldehyde dehydrogenase activity (EC 1.2.1.10).
More preferably, if an acetylating acetaldehyde dehydrogenase is present, the recombinant yeast cell functionally expresses:
-a preferably heterologous nucleic acid sequence encoding a protein having NAD+-dependent acetylating acetaldehyde dehydrogenase activity (EC 1.2.1.10); and
-a suitably endogenous or heterologous nucleic acid sequence encoding a protein having NAD+-dependent alcohol dehydrogenase activity (EC 1.1.1.1 or EC 1.1.1.2); and
-a suitably endogenous or heterologous nucleic acid sequence encoding a protein having acetyl-CoA synthetase activity (EC 6.2.1.1).
The acetylating acetaldehyde dehydrogenase is an enzyme that catalyzes the conversion of acetyl-CoA to acetaldehyde (EC 1.2.1.10). This conversion can be represented by the following equilibrium equation:
acetyl-CoA + NADH + H+ <-> acetaldehyde + NAD+ + coenzyme A
The protein having an acetylating acetaldehyde dehydrogenase activity is also referred to herein as "acetylating acetaldehyde dehydrogenase protein", "acetylating acetaldehyde dehydrogenase (acetylating acetaldehyde dehydrogenase enzyme)", or simply "acetylating acetaldehyde dehydrogenase (acetylating acetaldehyde dehydrogenase)". Preferred acetylating acetaldehyde dehydrogenases and nucleic acid sequences encoding such acetylating acetaldehyde dehydrogenases are described in WO 2011/010923 and WO 2019/0635507 (incorporated herein by reference).
The nucleic acid sequence encoding the protein having NAD+-dependent acetylating acetaldehyde dehydrogenase activity (EC 1.2.1.10) is preferably a heterologous nucleic acid sequence. Thus, the encoded NAD+-dependent acetylating acetaldehyde dehydrogenase may preferably be a heterologous NAD+-dependent acetylating acetaldehyde dehydrogenase.
Proteins having acetylating acetaldehyde dehydrogenase activity may be monofunctional or bifunctional.
The nucleic acid sequence encoding the NAD+-dependent acetylating acetaldehyde dehydrogenase may in principle originate from any organism comprising a nucleic acid sequence encoding said dehydrogenase. Known acetylating acetaldehyde dehydrogenases that catalyze the NADH-dependent reduction of acetyl-CoA to acetaldehyde can generally be divided into the following three types of NAD+-dependent acetylating acetaldehyde dehydrogenase functional homologs:
1) Bifunctional proteins that catalyze the reversible conversion of acetyl-CoA to acetaldehyde, followed by the reversible conversion of acetaldehyde to ethanol. These types of proteins advantageously have both acetylating acetaldehyde dehydrogenase activity and alcohol dehydrogenase activity. An example of this type of protein is the AdhE protein in E. coli (GenBank accession number: NP_415757). AdhE appears to be the evolutionary product of a gene fusion. The NH2-terminal region of the AdhE protein is highly homologous to aldehyde:NAD+ oxidoreductases, while the COOH-terminal region is homologous to a family of Fe2+-dependent ethanol:NAD+ oxidoreductases (see Membrillo-Hernandez et al., "Evolution of the adhE Gene Product of Escherichia coli from a Functional Reductase to a Dehydrogenase" (2000) J. Biol. Chem. 275, pages 33869-33875, incorporated herein by reference). The E. coli AdhE is subject to metal-catalyzed oxidation and is therefore sensitive to oxygen (see Tamarit et al., "Identification of the Major Oxidatively Damaged Proteins in Escherichia coli Cells Exposed to Oxidative Stress" (1998) J. Biol. Chem. 273, pages 3027-3032, incorporated herein by reference).
2) Proteins that catalyze the reversible conversion of acetyl-CoA to acetaldehyde in strictly or facultatively anaerobic microorganisms, but do not possess alcohol dehydrogenase activity. An example of this type of protein has been reported in Clostridium kluyveri (see Smith et al., "Purification, Properties, and Kinetic Mechanism of Coenzyme A-Linked Aldehyde Dehydrogenase from Clostridium kluyveri" (1980) Arch. Biochem. Biophys. 203, pages 663-675, incorporated herein by reference). An acetylating acetaldehyde dehydrogenase (GenBank accession number: EDK33116) has been annotated in the genome of Clostridium kluyveri DSM 555. A homologous protein, AcdH (GenBank accession number: NP_784141), was identified in the genome of Lactobacillus plantarum. Another example of this type of protein is the gene product described in Clostridium beijerinckii NRRL B593 (see Toth et al., "The ald Gene, Encoding a Coenzyme A-Acylating Aldehyde Dehydrogenase, Distinguishes Clostridium beijerinckii and Two Other Solvent-Producing Clostridia from Clostridium acetobutylicum" (1999) Appl. Environ. Microbiol. 65, pages 4973-4980; GenBank accession number: AAD31841, incorporated herein by reference).
3) Proteins that are part of a bifunctional aldolase-dehydrogenase complex involved in the catabolism of 4-hydroxy-2-oxopentanoate. Such bifunctional enzymes catalyze the last two steps of the meta-cleavage pathway of catechol, an intermediate in the degradation of phenol, benzoate, naphthalene, biphenyl and other aromatic compounds in many bacterial species (Powlowski and Shingler, "Genetics and biochemistry of phenol degradation by Pseudomonas sp. CF600" (1994) Biodegradation 5, pages 219-236, incorporated herein by reference). 4-Hydroxy-2-oxopentanoate is first converted into pyruvate and acetaldehyde by 4-hydroxy-2-oxopentanoate aldolase, after which the acetaldehyde is converted into acetyl-CoA by the acetylating acetaldehyde dehydrogenase. An example of this type of acetylating acetaldehyde dehydrogenase is the DmpF protein in Pseudomonas sp. CF600 (GenBank accession number: CAA43226) (Shingler et al., "Nucleotide Sequence and Functional Analysis of the Complete Phenol/3,4-Dimethylphenol Catabolic Pathway of Pseudomonas sp. Strain CF600" (1992) J. Bacteriol. 174, pages 711-724, incorporated herein by reference). The E. coli MhpF protein is homologous to the DmpF protein in Pseudomonas sp. CF600 (Ferrandez et al., "Genetic Characterization and Expression in Heterologous Hosts of the 3-(3-Hydroxyphenyl)Propionate Catabolic Pathway of Escherichia coli K-12" (1997) J. Bacteriol. 179, pages 2573-2581; GenBank accession number: NP_414885, incorporated herein by reference).
In a preferred embodiment, the protein having acetylating acetaldehyde dehydrogenase activity is bifunctional and comprises both NAD+-dependent acetylating acetaldehyde dehydrogenase (EC 1.2.1.10) activity and NAD+-dependent alcohol dehydrogenase (EC 1.1.1.1 or EC 1.1.1.2) activity.
Suitable nucleic acid sequences can be found in particular in organisms selected from the group of: Escherichia, in particular E. coli; Mycobacterium, in particular Mycobacterium marinum, Mycobacterium ulcerans and Mycobacterium tuberculosis; Carboxydothermus, in particular Carboxydothermus hydrogenoformans; Entamoeba, in particular Entamoeba histolytica; Shigella, in particular Shigella sonnei; Burkholderia, in particular Burkholderia pseudomallei; Klebsiella, in particular Klebsiella pneumoniae; Azotobacter, in particular Azotobacter vinelandii; Azoarcus species; Cupriavidus, in particular Cupriavidus taiwanensis; Pseudomonas, in particular Pseudomonas sp. CF600; and Pelotomaculum, in particular Pelotomaculum thermopropionicum. Preferably, the nucleic acid sequence encoding the NAD+-dependent acetylating acetaldehyde dehydrogenase originates from Escherichia, more preferably from E. coli.
Particularly suitable is the mhpF gene from E. coli, or a functional homolog thereof. This gene is described in Ferrandez et al., "Genetic Characterization and Expression in Heterologous Hosts of the 3-(3-Hydroxyphenyl)Propionate Catabolic Pathway of Escherichia coli K-12" (1997) J. Bacteriol. 179, pages 2573-2581. Good results have been obtained with Saccharomyces cerevisiae in which the mhpF gene from E. coli has been incorporated. In a further advantageous embodiment, the nucleic acid sequence encoding an (acetylating) acetaldehyde dehydrogenase is derived from Pseudomonas, in particular dmpF, for example from Pseudomonas sp. CF600.
Furthermore, the acetylating acetaldehyde dehydrogenase (or the nucleic acid sequence encoding such an activity) may, for example, be selected from the group of: Escherichia coli adhE, Entamoeba histolytica adh2, Staphylococcus aureus adhE, Piromyces sp. E2 adhE, Clostridium kluyveri EDK33116, Lactobacillus plantarum acdH, Escherichia coli eutE, Listeria innocua acdH and Pseudomonas putida YP_001268189.
Preferably, the protein having NAD+-dependent acetylating acetaldehyde dehydrogenase activity comprises or consists of:
-the amino acid sequence of SEQ ID NO. 43, SEQ ID NO. 44, SEQ ID NO. 45, SEQ ID NO. 46, SEQ ID NO. 47 or SEQ ID NO. 48; or alternatively
-a functional homolog of SEQ ID No. 43, SEQ ID No. 44, SEQ ID No. 45, SEQ ID No. 46, SEQ ID No. 47 or SEQ ID No. 48 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence of SEQ ID No. 43, SEQ ID No. 44, SEQ ID No. 45, SEQ ID No. 46, SEQ ID No. 47 or SEQ ID No. 48; or alternatively
-a functional homolog of SEQ ID No. 43, SEQ ID No. 44, SEQ ID No. 45, SEQ ID No. 46, SEQ ID No. 47 or SEQ ID No. 48 having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID No. 43, SEQ ID No. 44, SEQ ID No. 45, SEQ ID No. 46, SEQ ID No. 47 or SEQ ID No. 48, more preferably no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID No. 43, SEQ ID No. 44, SEQ ID No. 45, SEQ ID No. 46, SEQ ID No. 47 or SEQ ID No. 48.
Most preferably, the acetylating acetaldehyde dehydrogenase protein is a bifunctional protein having both acetylating acetaldehyde dehydrogenase activity and alcohol dehydrogenase activity.
The nucleic acid sequence (e.g., gene) encoding a protein having acetylating acetaldehyde dehydrogenase activity may be suitably incorporated into the genome of a recombinant yeast cell.
Further examples of suitable enzymes, identified by BLAST searches using the listed enzymes as queries, are shown in Tables 6(a) to 6(e) below, which give suitable alternative alcohol/acetaldehyde dehydrogenases.
Table 6(a): BLAST query - adhE from E. coli
Table 6(b): BLAST query - acdH from Lactobacillus plantarum
Table 6(c): BLAST query - eutE from E. coli
Table 6(d): BLAST query - lin1129 from Listeria innocua
Table 6(e): BLAST query - adhE from Staphylococcus aureus
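As an illustration of how BLAST hits for a query enzyme might be screened against an identity threshold, the following minimal sketch parses BLAST tabular output (the standard 12-column "outfmt 6" layout, in which the second column is the subject identifier and the third is percent identity). The query and hit identifiers and the values shown are hypothetical and are not taken from Tables 6(a) to 6(e).

```python
from typing import Iterable, List, Tuple


def filter_hits(lines: Iterable[str],
                min_identity: float = 40.0) -> List[Tuple[str, float]]:
    """Return (subject_id, percent_identity) pairs at or above min_identity.

    Assumes tab-separated BLAST tabular output: qseqid, sseqid, pident,
    length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    hits = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        subject_id = fields[1]
        pident = float(fields[2])
        if pident >= min_identity:
            hits.append((subject_id, pident))
    # Most similar hits first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical example rows; identifiers and numbers are made up.
    example = [
        "adhE_query\thypothetical_hit_1\t99.1\t891\t8\t0\t1\t891\t1\t891\t0.0\t1800",
        "adhE_query\thypothetical_hit_2\t35.2\t850\t520\t12\t5\t880\t3\t870\t1e-120\t410",
        "adhE_query\thypothetical_hit_3\t61.0\t889\t340\t4\t1\t889\t1\t887\t0.0\t1100",
    ]
    for subject, identity in filter_hits(example, min_identity=40.0):
        print(subject, identity)
```

The 40% default mirrors the lowest identity threshold recited for functional homologs; any of the other recited thresholds could be passed instead.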
Acetyl-CoA synthetase
If the recombinant yeast cell functionally expresses a protein having acetylating acetaldehyde dehydrogenase activity, then preferably the recombinant yeast cell further functionally expresses:
-a nucleic acid sequence encoding a protein having NAD+-dependent alcohol dehydrogenase activity (EC 1.1.1.1 or EC 1.1.1.2); and/or
-a nucleic acid sequence encoding a protein having acetyl-CoA synthetase activity (EC 6.2.1.1).
Proteins having acetyl-CoA synthetase activity may also be referred to herein as "acetyl-CoA synthetase protein", "acetyl-CoA synthetase enzyme", or simply "acetyl-Coenzyme A synthetase" or "acetyl-CoA synthetase". This protein is further abbreviated herein as "ACS".
Acetyl-CoA synthetase (also known as acetate-CoA ligase or acetyl-activating enzyme) catalyzes the formation of acetyl-CoA from acetate, coenzyme A (CoA) and ATP as follows:
ATP + acetate + CoA = AMP + diphosphate + acetyl-CoA
It will be appreciated that the recombinant yeast cell may naturally contain an endogenous gene encoding an acetyl-CoA synthetase protein. Alternatively or in addition thereto, the recombinant yeast cell may comprise a heterologous nucleic acid sequence encoding a protein having acetyl-CoA synthetase activity (EC 6.2.1.1).
For example, a recombinant yeast cell according to the invention may comprise an acetyl-CoA synthetase that is naturally present in the wild-type cell. This is the case, for example, for Saccharomyces cerevisiae, which contains two acetyl-CoA synthetase isozymes, encoded by the ACS1 gene (amino acid sequence shown as SEQ ID NO: 49) and the ACS2 gene (amino acid sequence shown as SEQ ID NO: 50) (van den Berg et al. (1996) J. Biol. Chem. 271, pages 28953-28959). Alternatively, the host cell may be provided with one or more heterologous genes encoding this activity; for example, the ACS1 and/or ACS2 gene of Saccharomyces cerevisiae, or a functional homolog thereof, may be incorporated into a cell lacking acetyl-CoA synthetase isozyme activity.
Preferably, the protein having acetyl-CoA synthetase activity comprises or consists of:
-the amino acid sequence of SEQ ID NO. 49 or SEQ ID NO. 50; or alternatively
-a functional homolog of SEQ ID No. 49 or SEQ ID No. 50 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity with the amino acid sequence of SEQ ID No. 49 or SEQ ID No. 50; or alternatively
-a functional homolog of SEQ ID NO. 49 or SEQ ID NO. 50 having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 49 or SEQ ID NO. 50, more preferably no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 49 or SEQ ID NO. 50.
Preferably, the recombinant yeast cell is one in which the endogenous or heterologous acetyl-CoA synthetase protein is overexpressed, most preferably by using a suitable promoter as described, for example, in WO 2011/010923 (incorporated herein by reference). Any heterologous nucleic acid sequence (e.g., a gene) encoding a protein having acetyl-CoA synthetase activity may be suitably incorporated into the genome of a recombinant yeast cell.
Examples of suitable proteins having acetyl-CoA synthetase activity are listed in Table 7. ACS2, which was used in the examples and as the BLAST query, is listed at the top of Table 7.
Table 7: BLAST query - ACS2 from Saccharomyces cerevisiae
Alcohol dehydrogenase
If the recombinant yeast cell functionally expresses a protein having acetylating acetaldehyde dehydrogenase activity, then preferably the recombinant yeast cell further functionally expresses:
-a nucleic acid sequence encoding a protein having NAD+-dependent alcohol dehydrogenase activity (EC 1.1.1.1 or EC 1.1.1.2); and/or
-a nucleic acid sequence encoding a protein having acetyl-CoA synthetase activity (EC 6.2.1.1).
The protein having alcohol dehydrogenase activity is also referred to herein as "alcohol dehydrogenase protein", "alcohol dehydrogenase (alcohol dehydrogenase enzyme)" or simply "alcohol dehydrogenase (alcohol dehydrogenase)". This protein is further abbreviated herein as "ADH".
Alcohol dehydrogenase catalyzes the conversion of acetaldehyde to ethanol.
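The three conversions described so far (acetyl-CoA synthetase, acetylating acetaldehyde dehydrogenase and alcohol dehydrogenase) can be combined into a net reaction. The following minimal sketch sums their stoichiometries. The first two reactions are written as given in the text above; the cofactor stoichiometry used for the alcohol dehydrogenase step (acetaldehyde + NADH + H+ -> ethanol + NAD+) is the standard textbook form and is an assumption here, since the text only states that acetaldehyde is converted to ethanol. The dictionary and function names are illustrative.

```python
from typing import Dict

# Stoichiometric coefficients: negative = consumed, positive = produced.
ACS = {"ATP": -1, "acetate": -1, "CoA": -1,
       "AMP": 1, "diphosphate": 1, "acetyl-CoA": 1}
ACDH = {"acetyl-CoA": -1, "NADH": -1, "H+": -1,
        "acetaldehyde": 1, "NAD+": 1, "CoA": 1}
# Assumed textbook stoichiometry for the NAD+-dependent alcohol dehydrogenase.
ADH = {"acetaldehyde": -1, "NADH": -1, "H+": -1,
       "ethanol": 1, "NAD+": 1}


def net_reaction(*reactions: Dict[str, int]) -> Dict[str, int]:
    """Sum the reactions and drop metabolites whose net coefficient is zero."""
    total: Dict[str, int] = {}
    for reaction in reactions:
        for metabolite, coefficient in reaction.items():
            total[metabolite] = total.get(metabolite, 0) + coefficient
    return {m: c for m, c in total.items() if c != 0}


if __name__ == "__main__":
    for metabolite, coefficient in sorted(net_reaction(ACS, ACDH, ADH).items()):
        print(f"{metabolite}: {coefficient:+d}")
```

Under these assumptions the intermediates acetyl-CoA, acetaldehyde and CoA cancel, and each acetate converted to ethanol reoxidizes two NADH to NAD+.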
It will be appreciated that the recombinant yeast cell may naturally comprise an endogenous nucleic acid sequence encoding an alcohol dehydrogenase protein. Alternatively or additionally, the recombinant yeast cell may comprise a heterologous nucleic acid sequence encoding a protein having alcohol dehydrogenase activity.
For example, the recombinant yeast cell may naturally comprise a gene encoding an alcohol dehydrogenase, as is the case for Saccharomyces cerevisiae (the amino acid sequences of the native Saccharomyces cerevisiae alcohol dehydrogenases ADH1, ADH2, ADH3, ADH4 and ADH5 are shown as SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 54 and SEQ ID NO: 55, respectively); see Lutstorf and Megnet, "Multiple Forms of Alcohol Dehydrogenase in Saccharomyces cerevisiae" (1968) Arch. Biochem. Biophys. 126, pages 933-944; or Ciriacy, "Genetics of Alcohol Dehydrogenase in Saccharomyces cerevisiae I. Isolation and genetic analysis of adh mutants" (1975) Mutat. Res. 29, pages 315-326, incorporated herein by reference.
Preferably, however, the recombinant yeast cell comprises the alcohol dehydrogenase activity within a suitably heterologous bifunctional enzyme having both acetylating acetaldehyde dehydrogenase activity and alcohol dehydrogenase activity, as described above. That is, most preferably, the alcohol dehydrogenase protein is a bifunctional protein having both acetylating acetaldehyde dehydrogenase activity and alcohol dehydrogenase activity. When the recombinant yeast cell comprises a heterologous nucleic acid sequence encoding a bifunctional protein having both acetylating acetaldehyde dehydrogenase activity and alcohol dehydrogenase activity, any native nucleic acid sequence encoding a native protein having alcohol dehydrogenase activity may or may not be disrupted and/or deleted.
Thus, the recombinant yeast cell may advantageously be a recombinant yeast cell functionally expressing:
-one or more heterologous nucleic acid sequences encoding a bifunctional protein having both NAD+-dependent acetylating acetaldehyde dehydrogenase activity (EC 1.2.1.10) and NAD+-dependent alcohol dehydrogenase activity (EC 1.1.1.1 or EC 1.1.1.2); and
-one or more native or heterologous nucleic acid sequences encoding a protein having acetyl-CoA synthetase activity (EC 6.2.1.1),
wherein optionally one or more native nucleic acid sequences encoding a protein having NAD+-dependent alcohol dehydrogenase activity (EC 1.1.1.1 or EC 1.1.1.2) are disrupted or deleted.
Alternatively, the recombinant yeast cell may advantageously be a recombinant yeast cell functionally expressing:
-one or more native or heterologous nucleic acid sequences encoding a monofunctional protein having NAD+-dependent acetylating acetaldehyde dehydrogenase activity (EC 1.2.1.10); and
-one or more native or heterologous nucleic acid sequences encoding a protein having acetyl-CoA synthetase activity (EC 6.2.1.1); and
-one or more native or heterologous nucleic acid sequences encoding a protein having NAD+-dependent alcohol dehydrogenase activity (EC 1.1.1.1 or EC 1.1.1.2).
Preferences for the bifunctional protein are as provided above for the acetylating acetaldehyde dehydrogenase protein. If the protein is not bifunctional, the protein having NAD+-dependent alcohol dehydrogenase activity preferably comprises or consists of:
-the amino acid sequence of SEQ ID NO. 51, SEQ ID NO. 52, SEQ ID NO. 53, SEQ ID NO. 54 or SEQ ID NO. 55; or alternatively
-a functional homolog of SEQ ID No. 51, SEQ ID No. 52, SEQ ID No. 53, SEQ ID No. 54 or SEQ ID No. 55 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence of SEQ ID No. 51, SEQ ID No. 52, SEQ ID No. 53, SEQ ID No. 54 or SEQ ID No. 55; or alternatively
-a functional homolog of SEQ ID No. 51, SEQ ID No. 52, SEQ ID No. 53, SEQ ID No. 54 or SEQ ID No. 55 having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID No. 51, SEQ ID No. 52, SEQ ID No. 53, SEQ ID No. 54 or SEQ ID No. 55, more preferably no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID No. 51, SEQ ID No. 52, SEQ ID No. 53, SEQ ID No. 54 or SEQ ID No. 55.
Any heterologous nucleic acid sequence (e.g., gene) encoding a protein having NAD+-dependent alcohol dehydrogenase activity may be suitably incorporated into the genome of a recombinant yeast cell.
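The limits of the form "no more than N amino acid mutations, substitutions, insertions and/or deletions" can be illustrated by counting the minimum number of single-residue changes between two sequences. The sketch below uses the Levenshtein edit distance for that count; this choice of metric, the function names and the toy sequences are assumptions made for illustration and are not a counting method prescribed by this specification.

```python
def edit_distance(reference: str, candidate: str) -> int:
    """Minimum number of single-residue substitutions, insertions and
    deletions needed to turn `reference` into `candidate` (Levenshtein)."""
    # previous[j] = distance between the processed prefix of the reference
    # and the first j residues of the candidate.
    previous = list(range(len(candidate) + 1))
    for i, ref_residue in enumerate(reference, start=1):
        current = [i]
        for j, cand_residue in enumerate(candidate, start=1):
            substitution_cost = 0 if ref_residue == cand_residue else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def within_change_limit(reference: str, candidate: str,
                        max_changes: int = 5) -> bool:
    """True if the candidate differs by no more than max_changes residues."""
    return edit_distance(reference, candidate) <= max_changes


if __name__ == "__main__":
    # Hypothetical toy peptides, not taken from any SEQ ID NO.
    print(edit_distance("MKTLV", "MKALV"))
    print(within_change_limit("MKTLV", "MTLI", max_changes=1))
```

For full-length proteins the same comparison applies unchanged; only the sequences and the chosen limit (for example 300 down to 5) differ.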
Deletion or disruption of glycerol 3-phosphate phosphohydrolase and/or glycerol 3-phosphate dehydrogenase
The recombinant yeast cell may or may not further comprise a deletion or disruption of one or more endogenous nucleotide sequences encoding a glycerol 3-phosphate phosphohydrolase and/or encoding a glycerol 3-phosphate dehydrogenase.
Preferably, the enzymatic activity required for NADH-dependent glycerol synthesis in the yeast cell is reduced or deleted. The reduction or deletion of the enzymatic activity of glycerol 3-phosphate phosphohydrolase and/or glycerol 3-phosphate dehydrogenase may be achieved by modifying one or more genes encoding NAD-dependent glycerol 3-phosphate dehydrogenase (GPD) and/or one or more genes encoding glycerol phosphate phosphatase (GPP), such that the enzyme is expressed at a substantially lower level than in the wild type or such that the gene encodes a polypeptide with reduced activity. Such modifications may be made using commonly known biotechnological techniques, and may in particular include one or more knockout mutations or site-directed mutagenesis of the promoter region or coding region of the structural genes encoding GPD and/or GPP. Alternatively, yeast strains deficient in glycerol production may be obtained by random mutagenesis followed by selection of strains with reduced or absent GPD and/or GPP activity. The Saccharomyces cerevisiae GPD1, GPD2, GPP1 and GPP2 genes are shown in WO 2011/010923 and are disclosed in SEQ ID NOs: 24-27 of said application.
Preferably, the recombinant yeast is a recombinant yeast further comprising a deletion or disruption of the glycerol-3-phosphate dehydrogenase (GPD) gene. One or more of the glycerophosphate phosphatase (GPP) genes may or may not be deleted or disrupted.
More preferably, the recombinant yeast is a recombinant yeast comprising a deletion or disruption of the glycerol-3-phosphate dehydrogenase 1 (GPD 1) gene. The glycerol-3-phosphate dehydrogenase 2 (GPD 2) gene may or may not be deleted or disrupted.
Most preferably, the recombinant yeast is a recombinant yeast comprising a deletion or disruption of the glycerol-3-phosphate dehydrogenase 1 (GPD1) gene, while the glycerol-3-phosphate dehydrogenase 2 (GPD2) gene remains active and/or intact. Thus, preferably only one of the Saccharomyces cerevisiae GPD1, GPD2, GPP1 and GPP2 genes is disrupted or deleted, and most preferably, of these four genes, only GPD1 is disrupted or deleted.
Without wishing to be bound by any type of theory, it is believed that a recombinant yeast according to the present invention wherein the GPD1 gene is not deleted or disrupted may be advantageous when applied in a fermentation process in which the glucose concentration at the start of or during fermentation is preferably equal to or greater than 80 g/L, more preferably equal to or greater than 90 g/L, even more preferably equal to or greater than 100 g/L, still more preferably equal to or greater than 110 g/L, yet even more preferably equal to or greater than 120 g/L, equal to or greater than 130 g/L, equal to or greater than 140 g/L, equal to or greater than 150 g/L, equal to or greater than 160 g/L, equal to or greater than 170 g/L, or equal to or greater than 180 g/L.
Preferably, at least one gene encoding GPD and/or at least one gene encoding GPP is deleted entirely, or at least the part of the gene that encodes a part of the enzyme essential for its activity is deleted. Good results can be obtained with S. cerevisiae cells in which the open reading frames of the GPD1 gene and/or the GPD2 gene have been inactivated. Inactivation of a structural gene (target gene) can be accomplished by one of skill in the art by synthesizing or otherwise constructing a DNA fragment consisting of a selectable marker gene flanked by DNA sequences that are identical to the sequences flanking the region of the host cell genome that is to be deleted. Subsequently, the DNA fragment is transformed into a host cell. Transformed cells expressing the dominant marker gene are then checked, for example by diagnostic polymerase chain reaction or DNA hybridization, for correct replacement of the region designed to be deleted. Suitably, good results have been obtained by inactivating the GPD1 and GPD2 genes in Saccharomyces cerevisiae by integration of the marker genes kanMX and hphMX4.
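The construction of a deletion fragment described above (a selectable marker flanked by sequences identical to those flanking the genomic region to be deleted) can be sketched on plain strings. The sketch below is a toy model only: the function names, the 50-nucleotide default flank length and the example sequences are hypothetical, and real cassette design involves considerations (promoters, terminators, strand orientation) that are not modeled here.

```python
def build_deletion_cassette(genome: str, target_start: int, target_end: int,
                            marker: str, flank_len: int = 50) -> str:
    """Return marker DNA flanked by the sequences that border the target.

    The target region is genome[target_start:target_end] (0-based, end
    exclusive). flank_len is an assumed, illustrative homology-arm length.
    """
    if not (0 <= target_start < target_end <= len(genome)):
        raise ValueError("target region lies outside the genome")
    if target_start < flank_len or len(genome) - target_end < flank_len:
        raise ValueError("not enough sequence available for the flanks")
    upstream = genome[target_start - flank_len:target_start]
    downstream = genome[target_end:target_end + flank_len]
    return upstream + marker + downstream


def simulate_replacement(genome: str, target_start: int, target_end: int,
                         marker: str) -> str:
    """Genome expected after the target region is replaced by the marker."""
    return genome[:target_start] + marker + genome[target_end:]


if __name__ == "__main__":
    # Toy sequences standing in for a locus and a marker gene.
    toy_genome = "A" * 10 + "GGGG" + "C" * 10
    cassette = build_deletion_cassette(toy_genome, 10, 14, "TT", flank_len=5)
    edited = simulate_replacement(toy_genome, 10, 14, "TT")
    print(cassette)
    print(cassette in edited)
```

The final check mirrors the diagnostic step in the text: after correct replacement, the cassette sequence is present in the edited locus and the target region is absent.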
Thus, in the recombinant yeast cells of the invention, glycerol 3-phosphate phosphatase activity in the cells and/or glycerol 3-phosphate dehydrogenase activity in the cells may be advantageously reduced.
Glucoamylase enzyme
Preferably, the recombinant yeast cell further functionally expresses a nucleic acid sequence encoding a glucoamylase (EC 3.2.1.20 or 3.2.1.3).
A protein having glucoamylase activity is also referred to herein as a "glucoamylase enzyme", a "glucoamylase protein", or simply a "glucoamylase". Glucoamylase is abbreviated herein as "GA".
Glucoamylases (also known as amyloglucosidase, alpha-glucosidase, glucan 1,4-alpha-glucosidase, maltase glucoamylase and maltase-glucoamylase) catalyze at least the hydrolysis of terminal 1,4-linked alpha-D-glucose residues from the non-reducing ends of amylose chains to release free D-glucose. A glucoamylase may be further defined by its amino acid sequence. Likewise, a glucoamylase may be further defined by a nucleotide sequence encoding the glucoamylase. As explained in detail in the definitions above, a certain glucoamylase defined by a nucleotide sequence encoding the enzyme includes (unless otherwise limited) a nucleotide sequence that hybridizes to such a nucleotide sequence encoding a glucoamylase.
Preferably, the protein having glucoamylase activity comprises or consists of:
-the amino acid sequence of SEQ ID NO. 56, SEQ ID NO. 57 or SEQ ID NO. 58; or alternatively
-a functional homolog of SEQ ID NO. 56, SEQ ID NO. 57 or SEQ ID NO. 58 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence of SEQ ID NO. 56, SEQ ID NO. 57 or SEQ ID NO. 58; or alternatively
-a functional homolog of SEQ ID NO. 56, SEQ ID NO. 57 or SEQ ID NO. 58 having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 56, SEQ ID NO. 57 or SEQ ID NO. 58, more preferably no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 56, SEQ ID NO. 57 or SEQ ID NO. 58.
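The sequence identity and mutation-count criteria in the list above can be evaluated on a pairwise alignment. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length with "-" marking gaps, and assuming identity is computed as identical columns divided by alignment length; the definition of sequence identity given in the definitions of this description takes precedence. The function names and toy sequences are hypothetical, and whether a homolog is functional cannot be judged from sequence alone.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percentage of alignment columns with the same residue in both sequences."""
    if len(aligned_ref) != len(aligned_query) or not aligned_ref:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for r, q in zip(aligned_ref, aligned_query)
                  if r == q and r != "-")
    return 100.0 * matches / len(aligned_ref)


def count_differences(aligned_ref: str, aligned_query: str) -> int:
    """Number of columns holding a substitution, insertion or deletion."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for r, q in zip(aligned_ref, aligned_query) if r != q)


# Toy aligned sequences (made up; not SEQ ID NO. 56, 57 or 58).
ref = "MKTAYIAKQR"
query = "MKTAYLAK-R"
identity = percent_identity(ref, query)   # one substitution and one gap in 10 columns
changes = count_differences(ref, query)
```

The two functions are kept separate because the description presents the identity threshold and the mutation-count threshold as alternatives rather than as joint requirements.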
SEQ ID NO. 56 shows the polypeptide of a "mature glucoamylase", which refers to the enzyme in its final form after translation and any post-translational modifications such as N-terminal processing, C-terminal truncation, glycosylation, phosphorylation, etc.
In one embodiment, the nucleotide sequence encodes a polypeptide having the amino acid sequence of SEQ ID NO. 57 or a variant thereof having at least 50%, preferably at least 60%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence identity with the amino acid sequence of SEQ ID NO. 57. Amino acids 1-17 of SEQ ID NO. 57 may constitute a native signal sequence.
In another embodiment, the nucleotide sequence that allows expression of the glucoamylase encodes a polypeptide having the amino acid sequence of SEQ ID NO. 58 or a variant thereof having at least 50%, preferably at least 60%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence identity with the amino acid sequence of SEQ ID NO. 58. Amino acids 1-19 of SEQ ID NO. 58 may constitute a signal sequence.
A signal sequence (also known as a signal peptide, targeting signal, localization sequence, transit peptide, leader sequence or leader peptide) may be present at the N-terminus of the polypeptide (here, the glucoamylase), where it signals that the polypeptide is to be secreted (e.g., out of the cell and into the medium).
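Separating a putative N-terminal signal sequence from the remainder of a precursor polypeptide, as suggested for amino acids 1-17 of SEQ ID NO. 57 and 1-19 of SEQ ID NO. 58, can be illustrated by a simple split. This is a sketch only: the function name and the toy precursor are hypothetical, and the actual cleavage site in a cell may differ from the annotated range.

```python
from typing import Tuple


def split_signal_peptide(precursor: str, signal_length: int) -> Tuple[str, str]:
    """Split a precursor into (signal sequence, remaining polypeptide).

    signal_length is the number of N-terminal residues assumed to form the
    signal sequence, e.g. 17 for residues 1-17.
    """
    if not 0 < signal_length < len(precursor):
        raise ValueError("signal_length must be between 1 and len(precursor) - 1")
    return precursor[:signal_length], precursor[signal_length:]


# Made-up precursor: 17 hypothetical signal residues followed by 8 further residues.
toy_precursor = "MKLLSSIALAAVASAQA" + "GTPDWRSQ"
signal, mature = split_signal_peptide(toy_precursor, 17)
```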
The nucleic acid sequence (e.g., gene) encoding a protein having glucoamylase activity may be suitably incorporated into the genome of a recombinant yeast cell.
Glycerol reuptake
The recombinant yeast cell may or may not further comprise one or more additional nucleic acid sequences as part of the glycerol reuptake pathway. That is, the recombinant yeast cell may or may not further comprise:
-one or more heterologous nucleic acid sequences encoding glycerol dehydrogenase; and/or
-one or more homologous or heterologous nucleic acid sequences encoding dihydroxyacetone kinase; and/or
-one or more heterologous nucleic acid sequences encoding glycerol transporters.
Thus, in a preferred embodiment, the recombinant yeast cell is a recombinant yeast cell that functionally expresses:
-one or more heterologous nucleic acid sequences encoding ribulose-1,5-bisphosphate carboxylase/oxygenase (EC 4.1.1.39; Rubisco) and optionally one or more nucleic acid sequences encoding a chaperone of Rubisco;
-one or more heterologous nucleic acid sequences encoding a phosphoribulokinase (EC 2.7.1.19; PRK);
-one or more nucleic acid sequences encoding a transketolase (EC 2.2.1.1), wherein the transketolase is under the control of a promoter ("TKL promoter") that has an anaerobic/aerobic TKL expression ratio of 2 or higher;
-one or more heterologous nucleic acid sequences encoding glycerol dehydrogenase;
-one or more homologous or heterologous nucleic acid sequences encoding dihydroxyacetone kinase; and
-optionally one or more heterologous nucleic acid sequences encoding glycerol transporters.
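The promoter criterion in the list above (an anaerobic/aerobic TKL expression ratio of 2 or higher) can be written as a simple check. This is a minimal sketch, assuming expression levels measured in the same arbitrary units under both conditions; how expression is quantified is defined elsewhere in this description, and the function names and example numbers are hypothetical.

```python
def tkl_expression_ratio(anaerobic_level: float, aerobic_level: float) -> float:
    """Ratio of expression under anaerobic conditions to expression under aerobic conditions."""
    if aerobic_level <= 0:
        raise ValueError("aerobic expression level must be positive")
    return anaerobic_level / aerobic_level


def qualifies_as_tkl_promoter(anaerobic_level: float, aerobic_level: float,
                              min_ratio: float = 2.0) -> bool:
    """True if the anaerobic/aerobic expression ratio is min_ratio or higher."""
    return tkl_expression_ratio(anaerobic_level, aerobic_level) >= min_ratio


# Made-up expression levels in arbitrary units.
strong_anaerobic = qualifies_as_tkl_promoter(300.0, 100.0)
weak_anaerobic = qualifies_as_tkl_promoter(150.0, 100.0)
```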
Without wishing to be bound by any type of theory, it is believed that recombinant yeast cells further comprising a combination of glycerol dehydrogenase, dihydroxyacetone kinase, and optionally glycerol transporter have improved overall performance in the form of higher ethanol yields.
In an alternative preferred embodiment, the recombinant yeast cell is a recombinant yeast cell that does not functionally express:
-one or more heterologous nucleic acid sequences encoding glycerol dehydrogenase; and/or
-one or more heterologous nucleic acid sequences encoding dihydroxyacetone kinase; and/or
-one or more heterologous nucleic acid sequences encoding glycerol transporters.
Without wishing to be bound by any type of theory, it is believed that in the absence of one or more of these features of the glycerol reuptake pathway, the resulting recombinant yeast cells show very low accumulation of glucose and/or other sugars and improved robustness when applied in media containing large amounts of sugar. Therefore, the use of recombinant yeast cells that do not comprise one or more of: a heterologous and/or homologous glycerol dehydrogenase; a heterologous and/or homologous dihydroxyacetone kinase; and/or a heterologous and/or homologous glycerol transporter may be advantageous when applied in a fermentation process in which the glucose concentration at the beginning of or during fermentation is preferably equal to or greater than 80 g/L, more preferably equal to or greater than 90 g/L, even more preferably equal to or greater than 100 g/L, still more preferably equal to or greater than 110 g/L, yet even more preferably equal to or greater than 120 g/L, equal to or greater than 130 g/L, equal to or greater than 140 g/L, equal to or greater than 150 g/L, equal to or greater than 160 g/L, equal to or greater than 170 g/L, or equal to or greater than 180 g/L.
Thus, most preferably, the recombinant yeast is one that functionally expresses:
-one or more heterologous nucleic acid sequences encoding ribulose-1,5-bisphosphate carboxylase/oxygenase (EC 4.1.1.39; Rubisco) and optionally one or more nucleic acid sequences encoding a chaperone of Rubisco;
-one or more heterologous nucleic acid sequences encoding a phosphoribulokinase (EC 2.7.1.19; PRK);
-one or more nucleic acid sequences encoding a transketolase (EC 2.2.1.1), wherein the transketolase is under the control of a promoter ("TKL promoter") that has an anaerobic/aerobic TKL expression ratio of 2 or higher;
wherein the recombinant yeast cell does not functionally express:
-one or more heterologous nucleic acid sequences encoding glycerol dehydrogenase; and/or
-one or more heterologous nucleic acid sequences encoding dihydroxyacetone kinase; and/or
-one or more heterologous nucleic acid sequences encoding glycerol transporters.
Glycerol dehydrogenase
As indicated above, the recombinant yeast cell may or may not functionally express:
-a nucleic acid sequence encoding a protein having glycerol dehydrogenase activity (EC 1.1.1.6);
-a nucleic acid sequence encoding a protein having dihydroxyacetone kinase activity (EC 2.7.1.28 or EC 2.7.1.29); and
-optionally a nucleic acid sequence encoding a protein having glycerol transporter activity.
Thus, the recombinant yeast cell may or may not functionally express one or more, preferably heterologous, nucleic acid sequences encoding glycerol dehydrogenase.
If glycerol dehydrogenase is present, the recombinant yeast cell can comprise an NAD+-linked glycerol dehydrogenase (EC 1.1.1.6) and/or an NADP+-linked glycerol dehydrogenase (EC 1.1.1.72). That is, the recombinant yeast cell may or may not comprise a nucleic acid sequence encoding a protein having NAD+-dependent glycerol dehydrogenase activity (EC 1.1.1.6) and/or a nucleic acid sequence encoding a protein having NADP+-dependent glycerol dehydrogenase activity (EC 1.1.1.72).
In one embodiment, the protein having glycerol dehydrogenase activity is preferably a protein having NAD+-dependent glycerol dehydrogenase activity (EC 1.1.1.6); and preferably, the recombinant yeast cell functionally expresses a nucleic acid sequence encoding a protein having NAD+-dependent glycerol dehydrogenase activity (EC 1.1.1.6). Such a protein may be derived from a bacterial source or, for example, from a fungal source. One example is gldA from Escherichia coli (E. coli).
In an alternative or additional embodiment, an NADP+-dependent glycerol dehydrogenase (EC 1.1.1.72) can be present.
If glycerol dehydrogenase is present, an NAD+-linked glycerol dehydrogenase is preferred.
A protein having glycerol dehydrogenase activity is also referred to herein as a "glycerol dehydrogenase protein", a "glycerol dehydrogenase enzyme" or simply a "glycerol dehydrogenase". Similarly, a protein having NAD+-dependent glycerol dehydrogenase activity is also referred to herein as an "NAD+-dependent glycerol dehydrogenase protein", an "NAD+-dependent glycerol dehydrogenase enzyme", or simply an "NAD+-dependent glycerol dehydrogenase". Glycerol dehydrogenase is abbreviated herein as GLD.
Preferred glycerol dehydrogenases and nucleic acid sequences encoding such glycerol dehydrogenases are described in WO 2015028582 (incorporated herein by reference).
NAD+-dependent glycerol dehydrogenase (EC 1.1.1.6) is an enzyme that catalyzes the following chemical reaction:
glycerol + NAD+ ⇌ glycerone + NADH + H+
Thus, the two substrates of the enzyme are glycerol and NAD+, and the three products are glycerone, NADH and H+. Glycerone and dihydroxyacetone are synonymous herein.
Glycerol dehydrogenases belong to the family of oxidoreductases, specifically those acting on the CH-OH group of the donor with NAD+ or NADP+ as acceptor. The systematic name of this enzyme class is glycerol:NAD+ 2-oxidoreductase. Other names in common use include glycerin dehydrogenase and NAD+-linked glycerol dehydrogenase. The enzyme is involved in glycerolipid metabolism. The glycerol dehydrogenase protein may be further defined by its amino acid sequence. Likewise, the glycerol dehydrogenase protein may be further defined by a nucleotide sequence encoding the glycerol dehydrogenase protein. As explained in detail in the definitions above, a certain glycerol dehydrogenase protein defined by a nucleotide sequence encoding the enzyme includes (unless otherwise limited) a nucleotide sequence that hybridizes to such a nucleotide sequence encoding a glycerol dehydrogenase protein.
The nucleic acid sequence encoding a protein having glycerol dehydrogenase activity may be a heterologous nucleic acid sequence. The protein having glycerol dehydrogenase activity may be a heterologous protein having NAD+-dependent glycerol dehydrogenase activity.
If the recombinant yeast cell comprises one or more heterologous nucleic acid sequences encoding a glycerol dehydrogenase, the recombinant yeast cell preferably further comprises a suitable cofactor to enhance the activity of the glycerol dehydrogenase. For example, recombinant yeast cells can comprise zinc, zinc ions, or zinc salts and/or one or more pathways that include these in the cell.
Suitable examples of heterologous proteins having glycerol dehydrogenase activity include the glycerol dehydrogenase proteins of Klebsiella pneumoniae, Enterococcus aerogenes, Yersinia arvensis and Escherichia coli. The amino acid sequences of these proteins are shown in SEQ ID NO 59, SEQ ID NO 60, SEQ ID NO 61 and SEQ ID NO 62, respectively.
Thus, the recombinant yeast cell may or may not comprise one or more suitably heterologous glycerol dehydrogenase proteins having the amino acid sequences of SEQ ID NO 59, SEQ ID NO 60, SEQ ID NO 61 and/or SEQ ID NO 62; and/or functional homologs thereof comprising an amino acid sequence that has at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence of SEQ ID NO 59, SEQ ID NO 60, SEQ ID NO 61 and/or SEQ ID NO 62; and/or functional homologs thereof comprising an amino acid sequence having one or more mutations, substitutions, insertions and/or deletions compared to the amino acid sequence of SEQ ID NO 59, SEQ ID NO 60, SEQ ID NO 61 and/or SEQ ID NO 62, wherein more preferably the amino acid sequence of such functional homologs has no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions compared to the amino acid sequence of SEQ ID NO 59, SEQ ID NO 60, SEQ ID NO 61 and/or SEQ ID NO 62.
A preferred glycerol dehydrogenase protein is the one encoded by the gldA gene from E. coli. SEQ ID NO. 62 shows the amino acid sequence of this preferred NAD+-dependent glycerol dehydrogenase protein encoded by the gldA gene from E. coli. The nucleic acid sequence of the gldA gene of E. coli is shown in SEQ ID NO. 63.
If the recombinant yeast cell comprises one or more heterologous nucleic acid sequences encoding glycerol dehydrogenase, the recombinant yeast cell therefore most preferably comprises a heterologous nucleotide sequence derived from E. coli that encodes a protein having NAD+-dependent glycerol dehydrogenase activity (EC 1.1.1.6) and is optionally codon optimized for the host cell, as exemplified by the nucleic acid sequence shown in SEQ ID NO: 63.
Thus, preferably, the nucleic acid sequence encoding a protein having glycerol dehydrogenase activity comprises or consists of:
-the nucleic acid sequence of SEQ ID NO. 63; or alternatively
-a functional homolog of SEQ ID NO. 63 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the nucleic acid sequence of SEQ ID NO. 63; or alternatively
-a functional homolog of SEQ ID NO. 63 having one or more mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 63, more preferably a functional homolog having no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 nucleic acid mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 63.
If the recombinant yeast cell comprises one or more heterologous nucleic acid sequences encoding a glycerol dehydrogenase, the recombinant yeast cell therefore most preferably comprises one or more nucleotide sequences encoding a glycerol dehydrogenase (e.c. 1.1.1.6) derived from e.coli, optionally codon optimized for the host cell. Such a nucleic acid sequence (e.g. a gene) encoding a glycerol dehydrogenase protein may be suitably incorporated into the genome of a recombinant yeast cell, for example as described in the examples of WO 2015/028583 (incorporated herein by reference).
Other examples of suitable glycerol dehydrogenases are listed in Tables 8(a) to 8(d). The gldA sequence used as the BLAST query is given at the top of each table.
Table 8(a): BLAST query - gldA from E. coli
Table 8(b): BLAST query - gldA from Klebsiella pneumoniae
Table 8(c): BLAST query - gldA from Enterococcus aerogenes
Table 8(d): BLAST query - gldA from Yersinia arvensis
Dihydroxyacetone kinase
As indicated above, the recombinant yeast cell may or may not functionally express:
-a nucleic acid sequence encoding a protein having glycerol dehydrogenase activity (EC 1.1.1.6);
-a nucleic acid sequence encoding a protein having dihydroxyacetone kinase activity (EC 2.7.1.28 or EC 2.7.1.29); and
-optionally a nucleic acid sequence encoding a protein having glycerol transporter activity.
That is, the recombinant yeast cell may or may not functionally express one or more homologous or heterologous nucleic acid sequences encoding dihydroxyacetone kinase (e.c. 2.7.1.28 or e.c. 2.7.1.29).
A protein having dihydroxyacetone kinase activity is also referred to herein as a "dihydroxyacetone kinase protein", a "dihydroxyacetone kinase enzyme" or simply a "dihydroxyacetone kinase". Dihydroxyacetone kinase is abbreviated herein as DAK.
Preferred dihydroxyacetone kinases and nucleic acid sequences encoding such dihydroxyacetone kinases are as described in WO 2015028582 (incorporated herein by reference).
Proteins having dihydroxyacetone kinase activity may suitably belong to the enzyme classes EC 2.7.1.28 and/or EC 2.7.1.29. Thus, the recombinant yeast cell suitably functionally expresses a nucleic acid sequence encoding a protein having dihydroxyacetone kinase activity (EC 2.7.1.28 and/or EC 2.7.1.29).
Dihydroxyacetone kinase is herein preferably understood to be an enzyme (EC 2.7.1.29) that catalyzes the following chemical reaction:
ATP + glycerone ⇌ ADP + glycerone phosphate
and/or an enzyme (EC 2.7.1.28) that catalyzes the following chemical reaction:
ATP + D-glyceraldehyde ⇌ ADP + D-glyceraldehyde 3-phosphate.
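Combining the glycerol dehydrogenase reaction (glycerol and NAD+ as substrates; glycerone, NADH and H+ as products) with the EC 2.7.1.29 reaction above gives the net conversion of the glycerol reuptake route. The following is a minimal sketch of that bookkeeping; representing reactions as dictionaries of stoichiometric coefficients and the function name `net_reaction` are illustrative assumptions.

```python
# Stoichiometric coefficients: negative for substrates, positive for products.
GLYCEROL_DEHYDROGENASE = {
    "glycerol": -1, "NAD+": -1,
    "glycerone": 1, "NADH": 1, "H+": 1,
}
DIHYDROXYACETONE_KINASE = {
    "ATP": -1, "glycerone": -1,
    "ADP": 1, "glycerone phosphate": 1,
}


def net_reaction(*reactions: dict) -> dict:
    """Sum the coefficients of several reactions and drop metabolites that cancel."""
    total: dict = {}
    for reaction in reactions:
        for metabolite, coefficient in reaction.items():
            total[metabolite] = total.get(metabolite, 0) + coefficient
    return {m: c for m, c in total.items() if c != 0}


net = net_reaction(GLYCEROL_DEHYDROGENASE, DIHYDROXYACETONE_KINASE)
```

Glycerone cancels as an intermediate, so the net conversion reads glycerol + NAD+ + ATP to glycerone phosphate + NADH + H+ + ADP.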
Other names commonly used for dihydroxyacetone kinase include glycerone kinase, ATP:glycerone phosphotransferase and acetol kinase (phosphorylating). It is further understood that glycerone and dihydroxyacetone are the same molecule. A dihydroxyacetone kinase protein may be further defined by its amino acid sequence. Likewise, a dihydroxyacetone kinase protein may be further defined by a nucleotide sequence encoding the dihydroxyacetone kinase protein. As explained in detail in the definitions above, a certain dihydroxyacetone kinase protein defined by a nucleotide sequence encoding the enzyme includes (unless otherwise limited) a nucleotide sequence that hybridizes to such a nucleotide sequence encoding a dihydroxyacetone kinase protein.
If dihydroxyacetone kinase is present, the recombinant yeast cell preferably functionally expresses a nucleic acid sequence encoding a native protein having dihydroxyacetone kinase activity. More preferably, the nucleic acid sequence encoding a protein having dihydroxyacetone kinase activity is a native nucleic acid sequence.
Yeast contains two native isoenzymes of dihydroxyacetone kinase (DAK1 and DAK2). According to the invention, these native dihydroxyacetone kinases are preferred. Preferably, the host cell is a Saccharomyces cerevisiae cell; and preferably, the above native dihydroxyacetone kinase is a native dihydroxyacetone kinase of the S. cerevisiae cell. The amino acid sequences of the native dihydroxyacetone kinase proteins DAK1 and DAK2 of Saccharomyces cerevisiae are shown in SEQ ID NO. 64 and SEQ ID NO. 65, respectively. The nucleic acid sequences encoding these native dihydroxyacetone kinase proteins DAK1 and DAK2 are shown in SEQ ID NO 69 and SEQ ID NO 70, respectively.
Recombinant yeast cells can also functionally express a nucleic acid sequence encoding a protein having dihydroxyacetone kinase activity, wherein the nucleic acid sequence is a heterologous nucleic acid sequence and, correspondingly, the protein is a heterologous protein. In one embodiment, the recombinant yeast cell comprises a heterologous gene encoding dihydroxyacetone kinase. Suitable heterologous genes include genes encoding dihydroxyacetone kinases from the following: Kluyveromyces kudrii, Kluyveromyces bailii, Kluyveromyces lactis, Candida glabrata, Yarrowia lipolytica, Klebsiella pneumoniae, Enterobacter aerogenes, Escherichia coli, Schizosaccharomyces pombe, Botrytis cinerea (Botryotinia fuckeliana) and Exophiala dermatitidis. Preferred heterologous proteins having dihydroxyacetone kinase activity include those derived from Klebsiella pneumoniae, Yarrowia lipolytica and Schizosaccharomyces pombe, as shown in SEQ ID NO:66, SEQ ID NO:67 and SEQ ID NO:68, respectively.
The recombinant yeast cell may or may not contain a genetic modification that causes dihydroxyacetone kinase to be overexpressed (e.g., by overexpressing a nucleic acid sequence encoding a protein having dihydroxyacetone kinase activity). The nucleotide sequence encoding dihydroxyacetone kinase may be native or heterologous to the cell. Nucleic acid sequences that can be used for the overexpression of dihydroxyacetone kinase in the cells of the invention are, for example, the dihydroxyacetone kinase genes DAK1 and DAK2 from Saccharomyces cerevisiae, as described, for example, by Molin et al. (2003), "Dihydroxyacetone kinases in Saccharomyces cerevisiae are involved in detoxification of dihydroxyacetone", J. Biol. Chem., vol. 278, pages 1415-1423, incorporated herein by reference. In a preferred embodiment, a codon-optimized (see above) nucleotide sequence encoding a dihydroxyacetone kinase is overexpressed, such as, for example, a codon-optimized nucleotide sequence encoding the dihydroxyacetone kinase of SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67 or SEQ ID NO:68.
As indicated above, the native nucleic acid sequences encoding dihydroxyacetone kinase proteins DAK1 and DAK2 in Saccharomyces cerevisiae have been shown by SEQ ID NO:69 and SEQ ID NO:70, respectively.
Preferably, the recombinant yeast cell comprises a genetic modification that increases the specific activity of any dihydroxyacetone kinase in the cell. For example, the recombinant yeast cell can comprise one or more overexpressed native and/or heterologous nucleic acid sequences encoding one or more native and/or heterologous dihydroxyacetone kinase proteins (such as DAK1 and/or DAK2). Native dihydroxyacetone kinases (such as DAK1 and/or DAK2) may be overexpressed, for example, via one or more genetic modifications such that the gene encoding the dihydroxyacetone kinase is present in more copies than in a non-genetically modified cell, and/or by employing a non-native promoter.
Preferably, the recombinant yeast cell is a recombinant yeast cell in which expression of the nucleic acid sequence encoding the protein having dihydroxyacetone kinase activity is under the control of a promoter. The promoter may, for example, be a promoter native to another gene in the host cell.
In order to overexpress a nucleotide sequence encoding dihydroxyacetone kinase, the nucleotide sequence (to be overexpressed) may be placed in an expression construct, wherein it is operably linked to suitable expression control regions/sequences to ensure overexpression of dihydroxyacetone kinase after transformation of the expression construct into a host cell of the invention (see above). Suitable promoters for (over)expression of nucleotide sequences encoding enzymes having dihydroxyacetone kinase activity include promoters that are preferably insensitive to catabolite (glucose) repression, are active under anaerobic conditions and/or preferably do not require xylose or arabinose for induction. Examples of such promoters are given above. The dihydroxyacetone kinase that is overexpressed is preferably overexpressed at least 1.1, 1.2, 1.5, 2, 5, 10 or 20-fold compared to a strain that is genetically identical except for the genetic modification causing the overexpression. Preferably, dihydroxyacetone kinase is overexpressed at least 1.1, 1.2, 1.5, 2, 5, 10 or 20-fold under anaerobic conditions compared to a strain that is genetically identical except for the genetic modification causing the overexpression. It will be appreciated that these levels of overexpression may apply to the steady-state level of enzyme activity (specific activity in the cell), the steady-state level of enzyme protein, and the steady-state level of the transcript encoding the enzyme in the cell. Overexpression of the nucleotide sequence in the host cell results in a specific dihydroxyacetone kinase activity of at least 0.002, 0.005, 0.01, 0.02 or 0.05 U min⁻¹ (mg protein)⁻¹, as determined in cell extracts of transformed host cells at 30 °C, as described for example in the examples of WO 2013/081456.
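The fold-overexpression levels and specific-activity thresholds stated above can be evaluated as in the following sketch. It assumes that the same quantity (enzyme activity, protein level or transcript level) has been measured in the modified strain and in the otherwise genetically identical reference strain; the function names and example measurements are hypothetical, and the threshold values are copied from the preceding paragraph.

```python
from typing import Optional, Sequence

# Threshold values taken from the paragraph above.
FOLD_THRESHOLDS = (1.1, 1.2, 1.5, 2, 5, 10, 20)
# Specific activity thresholds, in U min^-1 (mg protein)^-1 as stated in the text.
ACTIVITY_THRESHOLDS = (0.002, 0.005, 0.01, 0.02, 0.05)


def fold_overexpression(modified: float, reference: float) -> float:
    """Level in the modified strain divided by the level in the reference strain."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return modified / reference


def highest_threshold_met(value: float,
                          thresholds: Sequence[float]) -> Optional[float]:
    """Largest threshold that the value reaches or exceeds, or None if none is met."""
    met = [t for t in thresholds if value >= t]
    return max(met) if met else None


# Made-up measurements in arbitrary units.
fold = fold_overexpression(6.0, 2.0)
fold_level = highest_threshold_met(fold, FOLD_THRESHOLDS)
activity_level = highest_threshold_met(0.03, ACTIVITY_THRESHOLDS)
```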
The most preferred dihydroxyacetone kinase protein is the dihydroxyacetone kinase protein encoded by the Dak1 gene from Saccharomyces cerevisiae. SEQ ID NO. 64 shows the amino acid sequence of a suitable dihydroxyacetone kinase protein encoded by the Dak1 gene from Saccharomyces cerevisiae. SEQ ID NO. 69 shows the nucleic acid sequence of the Dak1 gene itself.
If the recombinant yeast cell comprises one or more overexpressed nucleic acid sequences encoding dihydroxyacetone kinase, the recombinant yeast cell thus most preferably comprises one or more overexpressed nucleotide sequences encoding dihydroxyacetone kinase derived from Saccharomyces cerevisiae, as exemplified by the nucleic acid sequence shown in SEQ ID NO: 69.
Thus, preferably, the protein having dihydroxyacetone kinase activity comprises or consists of:
-the amino acid sequence of SEQ ID NO. 64, SEQ ID NO. 65, SEQ ID NO. 66, SEQ ID NO. 67 or SEQ ID NO. 68; or alternatively
-a functional homolog of SEQ ID NO. 64, SEQ ID NO. 65, SEQ ID NO. 66, SEQ ID NO. 67 or SEQ ID NO. 68 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence of SEQ ID NO. 64, SEQ ID NO. 65, SEQ ID NO. 66, SEQ ID NO. 67 or SEQ ID NO. 68; or alternatively
-a functional homolog of SEQ ID NO. 64, SEQ ID NO. 65, SEQ ID NO. 66, SEQ ID NO. 67 or SEQ ID NO. 68 having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 64, SEQ ID NO. 65, SEQ ID NO. 66, SEQ ID NO. 67 or SEQ ID NO. 68, more preferably no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 64, SEQ ID NO. 65, SEQ ID NO. 66, SEQ ID NO. 67 or SEQ ID NO. 68.
Proteins having the amino acid sequence of SEQ ID NO. 64 and functional homologs thereof are most preferred.
Preferably, the nucleic acid sequence encoding a protein having dihydroxyacetone kinase activity comprises or consists of:
-the nucleic acid sequence of SEQ ID NO. 69 or SEQ ID NO. 70; or alternatively
-a functional homolog of SEQ ID NO. 69 or SEQ ID NO. 70 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the nucleic acid sequence of SEQ ID NO. 69 or SEQ ID NO. 70; or alternatively
-a functional homolog of SEQ ID NO. 69 or SEQ ID NO. 70 having one or more mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 69 or SEQ ID NO. 70, more preferably no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 nucleic acid mutations, substitutions, insertions and/or deletions when compared to the nucleic acid sequence of SEQ ID NO. 69 or SEQ ID NO. 70.
The nucleic acid sequence (e.g., gene) encoding dihydroxyacetone kinase protein can be suitably incorporated into the genome of a recombinant yeast cell.
Examples of suitable dihydroxyacetone kinases are listed in Tables 9(a) to 9(d). The DAK sequence used in the examples as the BLAST query is given at the top of each table.
Table 9(a): BLAST query - DAK1 from Saccharomyces cerevisiae
Table 9(b): BLAST query - dhaK from Klebsiella pneumoniae
Table 9(c): BLAST query - DAK1 from Yarrowia lipolytica
Table 9(d): BLAST query - DAK1 from Schizosaccharomyces pombe
Glycerol transporter
The recombinant yeast cell may optionally comprise (i.e., may or may not comprise) a nucleotide sequence encoding a glycerol transporter. Such a glycerol transporter may allow any glycerol that is externally available in the medium (e.g., from the backset in corn mash), or that is secreted after internal cellular synthesis, to be transported into the cell and converted to ethanol.
If glycerol transporters are present, the recombinant yeast preferably comprises one or more nucleic acid sequences encoding a heterologous glycerol transporter represented by the amino acid sequence SEQ ID NO:71, SEQ ID NO:72 or a functional homologue thereof having at least 50%, preferably at least 60%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence identity with the amino acid sequence of SEQ ID NO:71 and/or SEQ ID NO: 72.
In one embodiment, the recombinant yeast may further comprise a deletion or disruption of one or more endogenous nucleotide sequences encoding a glycerol export protein (e.g., FPS1).
Nitrate reductase
The recombinant yeast cell may also advantageously comprise (respectively functionally express) a nucleic acid sequence encoding an enzyme having NADH-dependent nitrate reductase activity and/or a nucleic acid sequence encoding an enzyme having NADH-dependent nitrite reductase activity. Details of expressing this alternative redox sink are described in the non-prepublished U.S. patent application US 63087642, filed with the U.S. Patent Office on 5 October 2020, the contents of which are incorporated herein by reference.
Nitrate reductase (NR) catalyzes the reduction of nitrate (NO₃⁻) to nitrite (NO₂⁻). Nitrite reductase catalyzes the reduction of nitrite to ammonia (NH₃). Nitrate reductase and/or nitrite reductase may be part of the so-called nitrogen assimilation pathway in certain cells. Cells comprising nitrate reductase activity and/or nitrite reductase activity include certain plant cells and bacterial cells and a few yeast cells. As indicated by Linder, the ability to assimilate inorganic nitrogen sources other than ammonia is considered rare in budding yeast. The few fungi that are naturally capable of assimilating nitrate or nitrite include Blastobotrys adeninivorans (Trichomonascaceae), Candida boidinii (Pichiaceae), Cyberlindnera jadinii (Phaffomycetaceae) and Ogataea polymorpha (Pichiaceae).
Preferably, the recombinant yeast cell as described herein comprises at least one or more genes encoding an NADH-dependent nitrate reductase.
NADH-dependent nitrate reductase is understood herein to be a nitrate reductase which depends solely on NADH as cofactor or primarily on NADH as cofactor. Preferably, the ratio of the catalytic efficiency of the NADH-dependent nitrate reductase with NADPH/NADP+ as cofactor, $(k_{cat}/K_m)^{NADP^+}$, to its catalytic efficiency with NADH/NAD+ as cofactor, $(k_{cat}/K_m)^{NAD^+}$ (i.e., the catalytic efficiency ratio $(k_{cat}/K_m)^{NADP^+} : (k_{cat}/K_m)^{NAD^+}$), is greater than 1:1, more preferably equal to or greater than 2:1, still more preferably equal to or greater than 5:1, even more preferably equal to or greater than 10:1, yet even more preferably equal to or greater than 20:1, even more preferably equal to or greater than 100:1, and most preferably equal to or greater than 1000:1. There is no upper limit, but for practical reasons the catalytic efficiency ratio $(k_{cat}/K_m)^{NADP^+} : (k_{cat}/K_m)^{NAD^+}$ may be equal to or less than 1,000,000,000:1 (i.e., $1 \cdot 10^{9}$:1). Most preferably, the NADH-dependent nitrate reductase is dependent only on NADH/NAD+ as cofactor. That is, most preferably, the NADH-dependent nitrate reductase absolutely requires NADH/NAD+ as a cofactor, not NADPH/NADP+.
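The tiered thresholds above can be checked with a few lines of arithmetic. The following is a minimal sketch only: the function names and the kinetic constants in the usage note are hypothetical placeholders, not values from the specification, and the ratio is simply taken in the order in which the two efficiencies are passed in.

```python
# Illustrative sketch: helper names are hypothetical and not part of the specification.

# Ratio thresholds named in the text, from the most to the least demanding.
PREFERENCE_TIERS = (1000, 100, 20, 10, 5, 2)


def catalytic_efficiency(k_cat: float, k_m: float) -> float:
    """Return the catalytic efficiency k_cat / K_m for one cofactor."""
    if k_m <= 0:
        raise ValueError("K_m must be positive")
    return k_cat / k_m


def efficiency_ratio(eff_first: float, eff_second: float) -> float:
    """Return the ratio eff_first : eff_second as a single number."""
    if eff_second <= 0:
        raise ValueError("the second efficiency must be positive")
    return eff_first / eff_second


def highest_tier_met(ratio: float):
    """Return the largest listed threshold N with ratio >= N:1.

    Returns 1 if the ratio only exceeds 1:1, and None if it does not exceed 1:1.
    """
    for tier in PREFERENCE_TIERS:
        if ratio >= tier:
            return tier
    return 1 if ratio > 1 else None
```

For example, with made-up constants `catalytic_efficiency(100.0, 0.5)` and `catalytic_efficiency(10.0, 2.0)`, the ratio is 40:1, which satisfies the "equal to or greater than 20:1" tier but not the 100:1 tier.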
Preferably, the NADH-dependent nitrate reductase is an NADH-dependent nitrate reductase of enzyme class EC 1.7.1.1 (i.e., having EC number EC 1.7.1.1) or enzyme class EC 1.6.6.1 (i.e., having EC number EC 1.6.6.1). Suitably, the NADH-dependent nitrate reductase (also referred to as NADH-dependent nitrate oxidoreductase) is an enzyme that catalyzes at least the following chemical reaction:
$$\text{nitrate} + \text{NADH} + \text{H}^+ \rightarrow \text{nitrite} + \text{NAD}^+ + \text{H}_2\text{O}$$
Suitable NADH-dependent nitrate reductases may include, for example, one or more NADH-dependent nitrate reductases obtained or derived from: Agrostemma githago, Amaranthus hybridus, Amaranthus tricolor, Ankistrodesmus braunii, Arabidopsis thaliana, Aspergillus niger, Aspergillus nidulans, Auxenochlorella pyrenoidosa, Bradyrhizobium species, Bradyrhizobium species 750, Brassica juncea, Brassica oleracea, Camellia sinensis, Candida boidinii, Candida utilis, Capsicum frutescens, Chenopodium album, Chlamydomonas reinhardtii, Chlorella fusca, Chlorella species, Chlorella species Berlin, Chlorella vulgaris, Conticribra weissflogii, Cucumis sativus, melon, zucchini, Dunaliella salina, Fusarium oxysporum, Fusarium oxysporum JCM 11502, soybean, upland cotton, Gracilaria chilensis, Gracilaria tenuistipitata, sunflower, barley, lettuce, duckweed, Lupinus albus, Mycobacterium tuberculosis, Nicotiana plumbaginifolia, Nicotiana tabacum, Ogataea angusta, Ogataea polymorpha (Hansenula polymorpha), Oryza sativa, Phaeocystis antarctica, Phragmites australis, Physcomitrella patens, Pisum arvense, Polytrichum commune, Pyropia yezoensis, Raphanus sativus, Rhodobacter capsulatus, Rhodobacter capsulatus E1F1, Ricinus communis, Selaginella kraussiana, Sinapis alba, Skeletonema costatum, Skeletonema tropicum, Solanum lycopersicum, spinach, Suaeda maritima, Thalassia testudinum, Thalassiosira antarctica, Thalassiosira pseudonana, Ulva species and maize; and/or functional homologs of such NADH-dependent nitrate reductases comprising an amino acid sequence that has at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or at least 99% amino acid sequence identity to one or more of the above NADH-dependent nitrate reductases; and/or functional homologs of such NADH-dependent nitrate reductases comprising an amino acid sequence having one or several substitutions, insertions and/or deletions compared to the amino acid sequence of one or more of the above NADH-dependent nitrate reductases, wherein preferably the amino acid sequence of any of the above functional homologs has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to the above NADH-dependent nitrate reductases.
Preferred NADH-dependent nitrate reductases include NADH-dependent nitrate reductases obtained or derived from: Candida boidinii (a nitrate reductase capable of utilizing both NADH and NADPH as electron donors), Candida utilis (a nitrate reductase capable of utilizing both NADH and NADPH as electron donors), Fusarium oxysporum (as described by Fujii et al. in the article entitled "Denitrification by the Fungus Fusarium oxysporum Involves NADH-Nitrate Reductase", published in Biosci. Biotechnol. Biochem., 72 (2), pages 412-420, 2008, incorporated herein by reference), spinach and maize.
Thus, preferred NADH-dependent nitrate reductases include: NADH-dependent nitrate reductases comprising a polypeptide having the amino acid sequence of SEQ ID NO: 89 and/or SEQ ID NO: 90 as described herein; and/or functional homologs of SEQ ID NO: 89 and/or SEQ ID NO: 90 comprising an amino acid sequence having at least 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or at least 99% amino acid sequence identity to SEQ ID NO: 89 and/or SEQ ID NO: 90, respectively; and/or functional homologs of SEQ ID NO: 89 and/or SEQ ID NO: 90 comprising an amino acid sequence having one or several substitutions, insertions and/or deletions compared to the amino acid sequence of SEQ ID NO: 89 and/or SEQ ID NO: 90, respectively. Preferably, the amino acid sequence of any of the above functional homologs has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to SEQ ID NO: 89 and/or SEQ ID NO: 90, respectively.
Preferably, the recombinant yeast cell comprises an exogenous gene encoding an enzyme having NADH-dependent nitrate reductase activity. More preferably, the recombinant yeast cell comprises an exogenous gene encoding an enzyme having NADH-dependent nitrate reductase activity selected from the group consisting of NADH-dependent nitrate reductases obtained or derived from: Agrostemma githago, Amaranthus hybridus, Amaranthus tricolor, Ankistrodesmus braunii, Arabidopsis thaliana, Aspergillus niger, Aspergillus nidulans, Auxenochlorella pyrenoidosa, Bradyrhizobium species 750, Brassica juncea, Brassica oleracea, Camellia sinensis, Candida boidinii, Candida utilis, Capsicum frutescens, Chenopodium album, jieskioshima, Chlamydomonas reinhardtii, Chlorella fusca, Chlorella species, Chlorella vulgaris, Conticribra weissflogii, cucumber, melon, zucchini, pumpkin species, Dunaliella salina, round-grained algae, trichosanthes kirilowii, trichosanthes acum, Fusarium oxysporum JCM 11502, rhizoma et radix Pogostemonis, soybean, upland cotton, Gracilaria chilensis, Gracilaria tenuistipitata, sunflower, barley, lettuce, duckweed, Lupinus albus, Mycobacterium tuberculosis, Nicotiana plumbaginifolia, Nicotiana tabacum, Ogataea polymorpha (Hansenula polymorpha), rice, Phaeocystis antarctica, Phragmites australis, Physcomitrella patens, Pisum arvense, Polytrichum commune, Pyropia yezoensis, Raphanus sativus, Rhodobacter capsulatus E1F1, Ricinus communis, Selaginella kraussiana, Sinapis alba, Skeletonema costatum, Skeletonema tropicum, tomato, spinach, Suaeda maritima, Thalassia testudinum, Thalassiosira antarctica, Thalassiosira pseudonana, wheat, leucopia species, Ulva and maize; and functional homologs of such NADH-dependent nitrate reductases comprising an amino acid sequence that has at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or at least 99% amino acid sequence identity to one or more of the above NADH-dependent nitrate reductases; and functional homologs of such NADH-dependent nitrate reductases comprising an amino acid sequence having one or more substitutions, insertions and/or deletions compared to the amino acid sequence of one or more of the above NADH-dependent nitrate reductases, wherein preferably the amino acid sequence of any of the above functional homologs has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to the above NADH-dependent nitrate reductases.
Suitably, the recombinant yeast cell may comprise a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 89 and/or SEQ ID NO: 90, or encoding an amino acid sequence having one or several substitutions, insertions and/or deletions compared to the amino acid sequence of SEQ ID NO: 89 and/or SEQ ID NO: 90. Preferably, the amino acid sequence has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to SEQ ID NO: 89 and/or SEQ ID NO: 90, respectively.
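The two homolog criteria used throughout this section (a minimum percentage of amino acid sequence identity, and a maximum number of substitutions, insertions and/or deletions) can be illustrated with a short calculation. This is a sketch under stated assumptions, not the method of the specification: it assumes the two sequences have already been aligned by some external tool, it defines identity as identical columns divided by all aligned columns, and it counts every differing column (including each gap position) as one change. The function names and the toy sequences are hypothetical.

```python
# Illustrative sketch: assumes pre-aligned sequences of equal length with '-' as the gap symbol.

def identity_and_differences(aligned_a: str, aligned_b: str, gap: str = "-"):
    """Return (percent identity, number of differing columns) for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    differences = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # a column with gaps in both sequences carries no information
        if res_a == res_b:
            matches += 1
        else:
            differences += 1  # substitution, insertion or deletion at this column
    columns = matches + differences
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns, differences


def meets_homolog_criteria(aligned_a: str, aligned_b: str,
                           min_identity: float = 50.0,
                           max_differences: int = 300) -> bool:
    """Check both criteria; the defaults are the loosest values named for SEQ ID NO: 89/90."""
    identity, differences = identity_and_differences(aligned_a, aligned_b)
    return identity >= min_identity and differences <= max_differences
```

For the toy pair `"MKTAYIAKQR"` and `"MKTAHIA-QR"`, eight of ten columns are identical, so the identity under this definition is 80% with two differing columns. Other identity definitions (for example, dividing by the length of the shorter sequence) give different numbers for the same alignment.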
The recombinant yeast cell can combine one or more genes encoding the above NADH-dependent nitrate reductase with one or more genes encoding NADPH-dependent nitrite reductase. Preferably, however, the recombinant yeast cell combines one or more genes encoding the above NADH-dependent nitrate reductase with one or more genes encoding NADH-dependent nitrite reductase.
Examples of suitable NADH-dependent nitrate reductases, their UniProt database accession numbers (as found on the UniProt website, www.uniprot.org/, on October 4, 2020), their descriptions, the organisms from which they can be derived and their amino acid sequence identity with SEQ ID NO: 89 are listed in Table 10 below.
Table 10: Examples of suitable NADH-dependent nitrate reductases, their UniProt database accession numbers (as found on the UniProt website, www.uniprot.org/, on October 4, 2020), their descriptions, the organisms from which they can be derived and their amino acid sequence identity with SEQ ID NO: 89.
Nitrite reductase
As indicated above, nitrite reductase catalyzes the reduction of nitrite to ammonia ($\mathrm{NH_3}$).
Preferably, the recombinant yeast cell as described herein comprises at least one or more genes encoding an NADH-dependent nitrite reductase.
NADH-dependent nitrite reductase is understood herein to be a nitrite reductase which depends solely on NADH as cofactor or primarily on NADH as cofactor. Preferably, the ratio of the catalytic efficiency of the NADH-dependent nitrite reductase with NADPH/NADP+ as cofactor, $(k_{cat}/K_m)^{NADP^+}$, to its catalytic efficiency with NADH/NAD+ as cofactor, $(k_{cat}/K_m)^{NAD^+}$ (i.e., the catalytic efficiency ratio $(k_{cat}/K_m)^{NADP^+} : (k_{cat}/K_m)^{NAD^+}$), is greater than 1:1, more preferably equal to or greater than 2:1, still more preferably equal to or greater than 5:1, even more preferably equal to or greater than 10:1, yet even more preferably equal to or greater than 20:1, even more preferably equal to or greater than 100:1, and most preferably equal to or greater than 1000:1. There is no upper limit, but for practical reasons the catalytic efficiency ratio $(k_{cat}/K_m)^{NADP^+} : (k_{cat}/K_m)^{NAD^+}$ may be equal to or less than 1,000,000,000:1 (i.e., $1 \cdot 10^{9}$:1). Most preferably, the NADH-dependent nitrite reductase is dependent only on NADH/NAD+ as cofactor. That is, most preferably, the NADH-dependent nitrite reductase absolutely requires NADH/NAD+ as a cofactor, not NADPH/NADP+.
Preferably, the NADH-dependent nitrite reductase is an NADH-dependent nitrite reductase of enzyme class EC 1.7.1.15 (i.e., having EC number EC 1.7.1.15). Suitably, the NADH-dependent nitrite reductase (also referred to as NADH-dependent nitrite oxidoreductase) is an enzyme which catalyzes at least the following chemical reaction:
$$\text{nitrite} + 3\,\text{NADH} + 5\,\text{H}^+ \rightarrow \text{ammonia} + 3\,\text{NAD}^+ + 2\,\text{H}_2\text{O}$$
Those skilled in the art will appreciate that ammonia may also be present as, and/or be referred to as, ammonium hydroxide ($\mathrm{NH_4OH}$).
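Taken together with the nitrate reductase reaction given earlier (one NADH per nitrate reduced to nitrite), the reaction above implies that one mole of nitrate fully reduced to ammonia reoxidizes four moles of NADH, and one mole of nitrite reoxidizes three. The sketch below only tallies this stoichiometry; the function name is hypothetical, and it assumes complete conversion to ammonia with NADH as the sole electron donor, which a real cell need not achieve.

```python
# Illustrative sketch: stoichiometric coefficients are taken from the two reactions in the text.

NADH_PER_NITRATE_TO_NITRITE = 1   # nitrate + NADH + H+ -> nitrite + NAD+ + H2O
NADH_PER_NITRITE_TO_AMMONIA = 3   # nitrite + 3 NADH + 5 H+ -> ammonia + 3 NAD+ + 2 H2O


def nadh_reoxidized(nitrate_mol: float = 0.0, nitrite_mol: float = 0.0) -> float:
    """Moles of NADH oxidized if the given nitrate and nitrite are fully reduced to ammonia.

    Assumes complete conversion and exclusive use of NADH as electron donor.
    """
    if nitrate_mol < 0 or nitrite_mol < 0:
        raise ValueError("amounts must be non-negative")
    per_nitrate = NADH_PER_NITRATE_TO_NITRITE + NADH_PER_NITRITE_TO_AMMONIA
    return nitrate_mol * per_nitrate + nitrite_mol * NADH_PER_NITRITE_TO_AMMONIA
```

Under these assumptions, `nadh_reoxidized(nitrate_mol=1.0)` gives 4.0 and `nadh_reoxidized(nitrite_mol=2.0)` gives 6.0, which is why the pathway can act as a redox sink for surplus NADH.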
Suitable NADH-dependent nitrite reductases may include, for example, NADH-dependent nitrite reductases derived from one or more of the following: Aspergillus nidulans (also known as Emericella nidulans), Arcobacter ellisii, Arcobacter pacificus, Bacillus subtilis JH642, Cupriavidus species, Escherichia coli, Ralstonia taiwanensis, Ralstonia syzygii, Ralstonia solanacearum, Rhodobacter capsulatus and Paraburkholderia ribeironis; and/or functional homologs of such NADH-dependent nitrite reductases comprising an amino acid sequence that has at least 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or at least 99% amino acid sequence identity to one or more of the above NADH-dependent nitrite reductases; and/or functional homologs of such NADH-dependent nitrite reductases comprising an amino acid sequence having one or several substitutions, insertions and/or deletions compared to the amino acid sequence of one or more of the above NADH-dependent nitrite reductases, wherein preferably the amino acid sequence of any of the above functional homologs has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to the above NADH-dependent nitrite reductases.
E. coli utilizes several different enzymes in its nitrite assimilation pathway. The nirD gene encodes the small subunit of the NADH-dependent nitrite reductase, while the nirB gene encodes its large subunit.
Preferred NADH-dependent nitrite reductases include NADH-dependent nitrite reductases derived from Aspergillus nidulans (also known as Emericella nidulans), which is a nitrite reductase capable of utilizing both NADH and NADPH as electron donors, and/or from E. coli. The nitrite reductase encoded by the nirB gene of E. coli is particularly preferred at high nitrate and/or nitrite concentrations.
Thus, preferred NADH-dependent nitrite reductases include: NADH-dependent nitrite reductases comprising a polypeptide having the amino acid sequence of SEQ ID NO: 91 (E. coli nitrite reductase small subunit encoded by nirD) and/or SEQ ID NO: 92 (E. coli nitrite reductase large subunit encoded by nirB) and/or SEQ ID NO: 93 (Emericella nidulans nitrite reductase encoded by niiA) as described herein; and/or functional homologs of SEQ ID NO: 91 and/or SEQ ID NO: 92 and/or SEQ ID NO: 93 comprising an amino acid sequence having at least 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or at least 99% amino acid sequence identity to SEQ ID NO: 91 and/or SEQ ID NO: 92 and/or SEQ ID NO: 93, respectively; and/or functional homologs of SEQ ID NO: 91 and/or SEQ ID NO: 92 and/or SEQ ID NO: 93 comprising an amino acid sequence having one or several substitutions, insertions and/or deletions compared to the amino acid sequence of SEQ ID NO: 91 and/or SEQ ID NO: 92 and/or SEQ ID NO: 93, respectively. Preferably, the amino acid sequence of any of the above functional homologs has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to SEQ ID NO: 91 and/or SEQ ID NO: 92 and/or SEQ ID NO: 93, respectively.
Preferably, the recombinant yeast cell comprises an exogenous gene encoding an enzyme having NADH-dependent nitrite reductase activity. More preferably, the recombinant yeast cell comprises an exogenous gene encoding an enzyme having NADH-dependent nitrite reductase activity selected from the group consisting of NADH-dependent nitrite reductases derived from: Aspergillus nidulans (also known as Emericella nidulans), Arcobacter ellisii, Arcobacter pacificus, Bacillus subtilis JH642, Cupriavidus species, Escherichia coli, Ralstonia taiwanensis, Ralstonia syzygii, Ralstonia solanacearum, Rhodobacter capsulatus and Paraburkholderia ribeironis; and/or functional homologs of such NADH-dependent nitrite reductases comprising an amino acid sequence that has at least 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or at least 99% amino acid sequence identity to one or more of the above NADH-dependent nitrite reductases; and/or functional homologs of such NADH-dependent nitrite reductases comprising an amino acid sequence having one or several substitutions, insertions and/or deletions compared to the amino acid sequence of one or more of the above NADH-dependent nitrite reductases, wherein preferably the amino acid sequence of any of the above functional homologs has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to the above NADH-dependent nitrite reductases.
Suitably, the recombinant yeast cell may comprise a nucleotide sequence encoding the amino acid sequence of any of SEQ ID NO: 91 (E. coli nitrite reductase small subunit encoded by nirD) and/or SEQ ID NO: 92 (E. coli nitrite reductase large subunit encoded by nirB) and/or SEQ ID NO: 93 (Emericella nidulans nitrite reductase encoded by niiA), or encoding an amino acid sequence having one or several substitutions, insertions and/or deletions compared to the amino acid sequence of any of SEQ ID NO: 91 and/or SEQ ID NO: 92 and/or SEQ ID NO: 93. Preferably, the amino acid sequence has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to SEQ ID NO: 91 and/or SEQ ID NO: 92 and/or SEQ ID NO: 93, respectively.
The recombinant yeast cell can combine one or more genes encoding one or more of the above NADH-dependent nitrite reductase with one or more genes encoding NADPH-dependent nitrate reductase. Preferably, however, the recombinant yeast cell combines one or more genes encoding one or more of the above NADH-dependent nitrite reductase with one or more genes encoding NADH-dependent nitrate reductase.
Examples of suitable NADH-dependent nitrite reductases, their UniProt database accession numbers (as found on the UniProt website, www.uniprot.org/, on October 4, 2020), their descriptions, the organisms from which they can be derived and their amino acid sequence identity with SEQ ID NO: 91 (the small subunit encoded by nirD) are listed in Table 11 below.
Examples of suitable NADH-dependent nitrite reductases, their UniProt database accession numbers (as found on the UniProt website, www.uniprot.org/, on October 4, 2020), their descriptions, the organisms from which they can be derived and their amino acid sequence identity with SEQ ID NO: 92 (the large subunit encoded by nirB) are listed in Table 12 below.
Table 11: Examples of suitable NADH-dependent nitrite reductases, their UniProt database accession numbers (as found on the UniProt website, www.uniprot.org/, on October 4, 2020), their descriptions, the organisms from which they can be derived and their amino acid sequence identity with SEQ ID NO: 91 (the small subunit encoded by nirD).
Table 12: Examples of suitable NADH-dependent nitrite reductases, their UniProt database accession numbers (as found on the UniProt website, www.uniprot.org/, on October 4, 2020), their descriptions, the organisms from which they can be derived and their amino acid sequence identity with SEQ ID NO: 92 (the large subunit encoded by nirB).
Nitrate/nitrite transporter
Preferably, the recombinant yeast cell further comprises one or more genetic modifications that result in increased transport of an oxidized nitrogen source (such as nitrate or nitrite) into the yeast cell. More preferably, the recombinant yeast cell further comprises one or more genes encoding nitrate and/or nitrite transport proteins.
Suitable transporters may include the sulfite transporters Ssu1 and Ssu2 (as described by Cabrera et al. in the article entitled "Molecular Components of Nitrate and Nitrite Efflux in Yeast", published in Eukaryotic Cell, volume 13, issue 2, pages 267-278, 2014, incorporated herein by reference); and the nitrate/nitrite transporter YNT1 derived from Pichia angusta (also known as Hansenula polymorpha); and/or a functional homolog of one or more of such nitrate/nitrite transporters comprising an amino acid sequence that has at least 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or at least 99% amino acid sequence identity to one or more of the above nitrate/nitrite transporters; and/or functional homologs of one or more of such nitrate/nitrite transporters comprising an amino acid sequence having one or more substitutions, insertions and/or deletions compared to the amino acid sequence of one or more of the above nitrate/nitrite transporters, wherein preferably the amino acid sequence of any of the above functional homologs has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to the above nitrate/nitrite transporters.
Preferably, the recombinant yeast cell comprises a nucleic acid sequence encoding: the nitrate/nitrite transporter YNT1 derived from Pichia angusta; and/or a functional homolog of the nitrate/nitrite transporter YNT1 comprising an amino acid sequence that has at least 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or at least 99% amino acid sequence identity to the nitrate/nitrite transporter YNT1; and/or a functional homolog of the nitrate/nitrite transporter YNT1 comprising an amino acid sequence having one or more substitutions, insertions and/or deletions compared to the amino acid sequence of the nitrate/nitrite transporter YNT1, wherein preferably the amino acid sequence of any of the above functional homologs has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to the nitrate/nitrite transporter YNT1.
Thus, preferred nitrate/nitrite transporters include: nitrate/nitrite transporters comprising a polypeptide having the amino acid sequence of SEQ ID NO: 94 as described herein; and/or functional homologs of SEQ ID NO: 94 comprising an amino acid sequence having at least 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or at least 99% amino acid sequence identity with SEQ ID NO: 94; and/or functional homologs of SEQ ID NO: 94 comprising an amino acid sequence having one or more substitutions, insertions and/or deletions compared to the amino acid sequence of SEQ ID NO: 94. Preferably, the amino acid sequence of any of the above functional homologs has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to SEQ ID NO: 94.
Suitably, the recombinant yeast cell may comprise a nucleotide sequence encoding the amino acid sequence of SEQ ID NO: 94 or an amino acid sequence having one or several substitutions, insertions and/or deletions compared to the amino acid sequence of SEQ ID NO: 94. Preferably, the amino acid sequence has no more than 300, 250, 200, 150, 100, 75, 50, 40, 30, 20, 10 or 5 amino acid substitutions, insertions and/or deletions compared to SEQ ID NO: 94.
Examples of suitable nitrite/nitrate transporters, their UniProt database accession numbers (as found on the UniProt website, www.uniprot.org/, on October 4, 2020), their descriptions, the organisms from which they can be derived and their amino acid sequence identity with SEQ ID NO: 94 are listed in Table 13 below.
Table 13: Examples of suitable nitrite/nitrate transporters, their UniProt database accession numbers (as found on the UniProt website, www.uniprot.org/, on October 4, 2020), their descriptions, the organisms from which they can be derived and their amino acid sequence identity with SEQ ID NO: 94.
Cofactor(s)
Preferably, the recombinant yeast cell further comprises a suitable cofactor to enhance the activity of the above-mentioned NADH-dependent nitrate reductase and/or NADH-dependent nitrite reductase. Preferred cofactors include flavin adenine dinucleotide (FAD), heme prosthetic groups and/or molybdenum cofactor (MoCo). Thus, preferably, the recombinant yeast cell may further comprise one or more genes encoding enzymes for synthesizing one or more of flavin adenine dinucleotide (FAD), heme prosthetic groups and/or molybdenum cofactor (MoCo). For example, a recombinant yeast cell can comprise one or more genes encoding an enzyme having FAD synthase activity. Preferred cofactors are as exemplified in the non-prepublished U.S. patent application US 63087642, filed with the U.S. Patent Office on October 5, 2020, the contents of which are incorporated herein by reference.
Recombinant expression
The recombinant yeast cell is a recombinant cell. That is, the recombinant yeast cell comprises a nucleotide sequence that does not naturally occur in the cell in question, or is transformed or genetically modified with such a nucleotide sequence. Techniques for recombinantly expressing enzymes in cells and for making additional genetic modifications to recombinant yeast cells are well known to those skilled in the art. Typically, such techniques involve transforming cells with a nucleic acid construct comprising the relevant sequence. Such methods are known, for example, from standard handbooks such as Sambrook and Russell (2001), "Molecular Cloning: A Laboratory Manual" (3rd edition), Cold Spring Harbor Laboratory Press, or F. Ausubel et al. (eds.), "Current Protocols in Molecular Biology", Green Publishing and Wiley Interscience, New York (1987). Methods for transforming and genetically modifying fungal host cells are known, for example, from EP-A-0635574, WO 98/46772, WO 99/60102, WO 00/37671, WO 90/14423, EP-A-0481008, EP-A-0635554 and US 6265186.
Fermentation process
The invention further provides a method for producing ethanol, comprising converting a carbon source, preferably a carbohydrate or another organic carbon source, using a recombinant yeast cell as described in the specification, thereby forming ethanol.
The feed for the fermentation process suitably comprises one or more fermentable carbon sources. The fermentable carbon source preferably comprises or consists of one or more fermentable carbohydrates. More preferably, the fermentable carbon source comprises one or more monosaccharides, disaccharides and/or polysaccharides. For example, the fermentable carbon source may comprise one or more carbohydrates selected from the group consisting of: glucose, fructose, sucrose, maltose, xylose, arabinose, galactose, mannose and trehalose. The fermentable carbon source preferably comprising or consisting of one or more carbohydrates may suitably be obtained from starch, cellulose, hemicellulose, lignocellulose and/or pectin. Suitably, the fermentable carbon source may be in the form of a slurry, suspension or liquid, preferably aqueous.
The concentration of fermentable carbohydrate (such as, for example, glucose) during fermentation is preferably equal to or greater than 80 g/L. That is, the initial concentration of glucose at the beginning of fermentation is preferably 80 g/L or more, more preferably 90 g/L or more, even more preferably 100 g/L or more, still more preferably 110 g/L or more, and still even more preferably 120 g/L or more, 130 g/L or more, 140 g/L or more, 150 g/L or more, 160 g/L or more, 170 g/L or more, or 180 g/L or more. The beginning of fermentation may be the moment at which the fermentable carbohydrate is brought into contact with the recombinant cells of the invention.
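To give a feel for what these starting concentrations mean, the sketch below checks an initial glucose concentration against the 80 g/L threshold and computes the stoichiometric upper bound on ethanol, assuming the standard conversion of one glucose into two ethanol and two carbon dioxide. The function names are hypothetical, the molar masses are rounded textbook values, and the bound ignores biomass, glycerol and other by-products as well as volume changes, so real titres will be lower.

```python
# Illustrative sketch: a stoichiometric upper bound, not a prediction of fermentation performance.

GLUCOSE_G_PER_MOL = 180.16  # C6H12O6, rounded
ETHANOL_G_PER_MOL = 46.07   # C2H5OH, rounded


def meets_preferred_initial_glucose(glucose_g_per_l: float,
                                    minimum_g_per_l: float = 80.0) -> bool:
    """True if the initial glucose concentration is at or above the given threshold."""
    return glucose_g_per_l >= minimum_g_per_l


def theoretical_max_ethanol_g_per_l(glucose_g_per_l: float) -> float:
    """Upper bound on ethanol (g/L) assuming glucose -> 2 ethanol + 2 CO2 with no losses."""
    if glucose_g_per_l < 0:
        raise ValueError("concentration must be non-negative")
    mol_glucose_per_l = glucose_g_per_l / GLUCOSE_G_PER_MOL
    return 2 * mol_glucose_per_l * ETHANOL_G_PER_MOL
```

Under these assumptions, 80 g/L glucose corresponds to at most about 41 g/L ethanol (roughly 0.51 g ethanol per g glucose).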
The fermentable carbon source may be prepared by contacting starch, lignocellulose and/or pectin with an enzyme composition wherein one or more mono-, di-, and/or polysaccharides are produced and wherein the produced mono-, di-, and/or polysaccharides are subsequently fermented to obtain a fermentation product.
The lignocellulosic material may be pretreated prior to the enzymatic treatment. Pretreatment may include exposing the lignocellulosic material to an acid, base, solvent, heat, peroxide, ozone, mechanical comminution, grinding, milling or rapid depressurization, or a combination of any two or more thereof. Such chemical pretreatment is typically combined with thermal pretreatment (e.g., between 150 ℃ and 220 ℃ for 1 to 30 minutes). The pretreated material may then be subjected to enzymatic hydrolysis to release sugars that may be fermented according to the invention. This can be done in a conventional manner, for example, by contacting with a cellulase (e.g., one or more cellobiohydrolases, one or more endoglucanases, one or more beta-glucosidase enzymes, and optionally other enzymes). The conversion with cellulase enzymes may be carried out at ambient temperature or higher for a reaction time that releases a sufficient amount of one or more sugars. The result of enzymatic hydrolysis is a hydrolysate comprising C5/C6 sugars, referred to herein as a sugar composition.
In one embodiment, the fermentable carbohydrate is, or is comprised in, a biomass hydrolysate, such as a corn stover or corn fiber hydrolysate. Such a biomass hydrolysate may, in turn, comprise or be derived from corn stover and/or corn fiber.
By "hydrolysate" is herein understood a material comprising polysaccharides (such as corn stover, corn starch, corn fiber or lignocellulose material) which have been hydrolyzed by the addition of water to form mono-and oligosaccharides. The hydrolysate can be produced by enzymatic or acid hydrolysis of the polysaccharide containing material.
The biomass hydrolysate may be a lignocellulosic biomass hydrolysate. Lignocellulose herein includes hemicellulose and the hemicellulosic fractions of biomass, as well as the lignocellulosic fractions of biomass. Suitable lignocellulosic materials can be found in the following list: orchard primings, chaparral, mill waste, municipal wood waste, municipal waste, logging waste, forest thinnings, short-rotation woody crops, industrial waste, wheat straw, oat straw, rice straw, barley straw, rye straw, flax straw, soybean hulls, rice straw, corn gluten feed, oat hulls, sugarcane, corn stover, corn cobs, corn husks, switchgrass, miscanthus, sweet sorghum, canola stems, soybean stems, prairie grass, gamagrass, foxtail; sugar beet pulp, citrus fruit pulp, seed hulls, cellulosic animal waste, lawn clippings, cotton, seaweed, algae (including macroalgae and microalgae), trees, softwood, hardwood, poplar, pine, shrubs, grasses, wheat straw, bagasse, corn husks, corn cobs, corn kernels, fibers from grain, products and by-products from wet or dry milling of grain, municipal solid waste, waste paper, yard waste, herbaceous material, agricultural residues, forestry residues, municipal solid waste, waste paper, pulp, paper mill residues, branches, bushes, sugarcane, corn husks, energy crops, forests, fruits, flowers, grains, grasses, herbaceous crops, leaves, bark, needles, logs, roots, saplings, shrubs, switchgrass, trees, vegetables, fruit peel, vines, sugar beet pulp, wheat bran, oat hulls, hardwood or softwood, organic waste material generated by an agricultural process, forestry wood waste, or a combination of any two or more thereof. Algae (such as macroalgae and microalgae) have the advantage that they may contain considerable amounts of sugar alcohols (such as sorbitol and/or mannitol).
Lignocellulose, which can be considered a potentially renewable raw material, generally comprises the polysaccharides cellulose (glucans) and hemicellulose (xylans, heteroxylans and xyloglucans). In addition, some hemicellulose may be present as glucomannans in, for example, wood-derived raw materials. These polysaccharides are enzymatically hydrolyzed to soluble sugars (including both monomers and multimers, such as glucose, cellobiose, xylose, arabinose, galactose, fructose, mannose, rhamnose, ribose, galacturonic acid, glucuronic acid, and other hexoses and pentoses) by different enzymes acting in concert. In addition, pectins and other pectic substances (such as arabinans) may account for a substantial proportion of the dry mass of typical cell walls from non-woody plant tissue (about one-fourth to one-half of the dry mass may be pectin). The lignocellulosic material may be pretreated. Pretreatment may include exposing the lignocellulosic material to an acid, base, solvent, heat, peroxide, ozone, mechanical comminution, grinding, milling or rapid depressurization, or a combination of any two or more thereof. Chemical pretreatment is typically combined with thermal pretreatment (e.g., between 150 ℃ and 220 ℃ for 1 to 30 minutes).
The method for producing ethanol may include an aerobic proliferation step and an anaerobic fermentation step. More preferably, the method according to the invention is a method comprising the steps of: an aerobic proliferation step in which the population of recombinant yeast cells is increased; and an anaerobic fermentation step in which the carbon source is converted into ethanol by using a recombinant yeast cell population.
Proliferation is understood herein as a process of recombinant yeast cell growth that leads to an increase of the initial recombinant yeast cell population. The main purpose of proliferation is to increase the population of recombinant yeast cells by using the natural reproductive capacity of the recombinant yeast cells as living organisms. That is, proliferation is for biomass production, not for ethanol production. Proliferation conditions may include appropriate carbon sources, aeration, temperature and nutrient addition. Proliferation is an aerobic process, so the proliferation vessel must be properly aerated to maintain a certain level of dissolved oxygen. Proper aeration is typically achieved by air inductors mounted on the piping into the propagation tank, which draw air into the propagation mixture as the tank fills and during recirculation. The capacity of the propagation mixture to retain dissolved oxygen depends on the amount of air added and the consistency of the mixture, which is why water is typically added at a mash to water ratio of between 50:50 and 90:10. "Viscous" propagation mixtures (mash to water ratios of 80:20 and higher) typically require the addition of compressed air to compensate for the reduced capacity to retain dissolved oxygen. The amount of dissolved oxygen in the propagation mixture also depends on bubble size, so some ethanol plants add air through spargers, which produce smaller bubbles than air inductors. Proper aeration and a lower glucose concentration are important to promote aerobic respiration during proliferation, so that the environment during proliferation differs from the anaerobic environment during fermentation.
Anaerobic fermentation process is understood herein to be a fermentation step operating under anaerobic conditions.
Anaerobic fermentation is preferably run at a temperature optimal for the cells. Thus, for most recombinant yeast cells, the fermentation process is conducted at a temperature of less than about 50 ℃, less than about 42 ℃, or less than about 38 ℃. For recombinant yeast cells or filamentous fungal host cells, the fermentation process is preferably conducted at a temperature of less than about 35 ℃, about 33 ℃, about 30 ℃, or about 28 ℃ and at a temperature of greater than about 20 ℃, about 22 ℃, or about 25 ℃.
In the process according to the invention, the ethanol yield based on xylose and/or glucose is preferably at least about 50%, about 60%, about 70%, about 80%, about 90%, about 95% or about 98%. Ethanol yield is defined herein as a percentage of the theoretical maximum yield.
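As a worked illustration of this definition, the following minimal sketch expresses an ethanol yield as a percentage of the theoretical maximum. It assumes the stoichiometric maximum of roughly 0.51 g ethanol per g glucose (2 mol ethanol per mol glucose), a figure that is not stated in this passage; the function name and the example values are hypothetical.

```python
# Assumed stoichiometry (not stated in the text): C6H12O6 -> 2 C2H5OH + 2 CO2.
# 2 * 46.07 g/mol ethanol per 180.16 g/mol glucose, i.e. about 0.511 g/g.
THEORETICAL_MAX_G_PER_G = 2 * 46.07 / 180.16


def ethanol_yield_percent(ethanol_g: float, sugar_consumed_g: float) -> float:
    """Return the ethanol yield as a percentage of the theoretical maximum."""
    if sugar_consumed_g <= 0:
        raise ValueError("sugar_consumed_g must be positive")
    actual_yield = ethanol_g / sugar_consumed_g  # g ethanol per g sugar
    return 100.0 * actual_yield / THEORETICAL_MAX_G_PER_G


# Hypothetical example: 46 g ethanol formed from 100 g glucose consumed.
example = ethanol_yield_percent(46.0, 100.0)
print(f"{example:.1f}% of the theoretical maximum")
```

Under these assumptions, a process meeting the "at least about 90%" threshold would form at least about 0.46 g ethanol per g sugar consumed.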
The process according to the invention, and the propagation step and/or fermentation step suitably included therein, may be carried out in batch, fed-batch or continuous mode. A separate hydrolysis and fermentation (SHF) process or a simultaneous saccharification and fermentation (SSF) process may also be applied.
The recombinant yeasts and methods according to the invention advantageously allow a more robust method. Advantageously, the process or any anaerobic fermentation during the process may be carried out in the presence of a high concentration of carbon source. Thus, the process (and correspondingly any anaerobic fermentation step therein) is preferably carried out in the presence of glucose at the following concentrations: 25g/L or higher, 30g/L or higher, 35g/L or higher, 40g/L or higher, 45g/L or higher, 50g/L or higher, 55g/L or higher, 60g/L or higher, 65g/L or higher, 70g/L or higher, 75g/L or higher, 80g/L or higher, 85g/L or higher, 90g/L or higher, 95g/L or higher, 100g/L or higher, 110g/L or higher, 120g/L or higher, or may be, for example, in the range of 25g/L-250g/L, 30g/L-200g/L, 40g/L-200g/L, 50g/L-200g/L, 60g/L-200g/L, 70g/L-200g/L, 80g/L-200g/L, or 90g/L-200 g/L.
For recovery of the fermentation product, existing technologies are used. Different recovery methods are appropriate for different fermentation products. Existing processes for recovering ethanol from aqueous mixtures typically use fractionation and adsorption techniques. For example, a beer still can be used to process a fermentation product containing ethanol in an aqueous mixture to produce an ethanol-enriched mixture, which is then fractionated (e.g., by fractional distillation or other similar techniques). Next, the fraction containing the highest concentration of ethanol may be passed through an adsorbent to remove most, if not all, of the remaining water from the ethanol. In one embodiment, in addition to recovering the fermentation product, yeast may be recovered.
All patents and references cited in this specification are incorporated herein by reference in their entirety.
The following examples are provided for illustrative purposes only and are not intended to limit the scope of the present invention in any way.
Examples
General molecular biology techniques
Unless indicated otherwise, the methods used are standard biochemical techniques. Examples of suitable general method textbooks include Sambrook et al., Molecular Cloning, a Laboratory Manual (1989) and Ausubel et al., Current Protocols in Molecular Biology (1995), John Wiley & Sons, Inc.
HPLC analysis
HPLC analysis is typically performed as described in the following document: "Determination of Sugars, Byproducts, and Degradation Products in Liquid Fraction Process Samples"; Laboratory Analytical Procedure (LAP), issue date: 12/08/2006; A. Sluiter, B. Hames, R. Ruiz, C. Scarlata, J. Sluiter and D. Templeton; Technical Report NREL/TP-510-42623; January 2008; National Renewable Energy Laboratory.
After fermentation, the sample for HPLC analysis was separated from the yeast biomass and insoluble components (corn mash) by centrifugation, and the clarified supernatant was then passed through a filter with a pore size of 0.2 μm.
Initial strain
Ethanol Red was used as the starting strain. Ethanol Red is a commercial strain of Saccharomyces cerevisiae available from Lesaffre.
The strain construction methods that can be followed are described in WO 2013/144257A1 and WO 2015/028582 (incorporated herein by reference).
Expression cassettes from a variety of genes of interest can be recombined in vivo into a pathway at a specific locus after transformation of the yeast (US 9738890 B2). The promoter, ORF and terminator sequences were assembled into expression cassettes using Golden Gate technology, as described by Engler et al. (2011), and ligated into BsaI-digested backbone vectors that provided the expression cassettes with linkers for the in vivo recombination step. The expression cassettes, including the linkers, were amplified by PCR. In addition, 5' and 3' DNA fragments of the upstream and downstream portions of the integration locus were amplified by PCR and provided with linker sequences. After transformation of yeast cells with these DNA fragments, in vivo recombination and integration into the genome took place at the desired location. CRISPR-Cas9 technology was used to create a unique double-strand break at the integration locus in order to target the pathway to this particular locus (DiCarlo et al., 2013, Nucleic Acids Res. 41:4336-4343, as well as WO 16110512 and US 2019309268). The gRNA was expressed from a multicopy yeast shuttle vector containing a natMX marker, which confers resistance to the antibiotic nourseothricin (NTC) on the yeast cells. The backbone of this plasmid is based on pRS305 (Sikorski and Hieter, Genetics 1989, vol. 122, pages 19-27) and comprises a functional 2 μm ORI sequence. Streptococcus pyogenes CRISPR-associated protein 9 (Cas9) was expressed from a pRS414 plasmid (Sikorski and Hieter, 1989) with a kanMX marker, which confers resistance to the antibiotic geneticin (G418) on the yeast cells. The guide RNA and protospacer sequences were designed using a gRNA design tool (https://www.atum.bio/eCommerce/cas9/input).
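Because Golden Gate assembly with BsaI cuts at every BsaI recognition site, parts are commonly screened for internal sites before assembly. The following is a minimal sketch of such a screen and is not part of the described method; the function names and the short example sequences are hypothetical, and only the BsaI recognition sequence (GGTCTC) is taken as given.

```python
BSAI_SITE = "GGTCTC"  # BsaI recognition sequence, forward strand

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def internal_bsai_sites(seq: str) -> list[int]:
    """Return sorted 0-based start positions of BsaI sites on either strand."""
    seq = seq.upper()
    hits = []
    # A site on the reverse strand appears as GAGACC on the forward strand.
    for site in (BSAI_SITE, reverse_complement(BSAI_SITE)):
        start = seq.find(site)
        while start != -1:
            hits.append(start)
            start = seq.find(site, start + 1)
    return sorted(hits)


# Hypothetical toy parts; real promoter/ORF/terminator sequences would be
# taken from the sequence listing.
parts = {
    "promoter": "ATGCGGTCTCAAT",
    "orf": "ATGAAAGAGACCTTT",
    "terminator": "TTTAAACCC",
}
for name, sequence in parts.items():
    print(name, internal_bsai_sites(sequence))
```

A part that reports any position would need to be recoded or assembled with a different type IIS enzyme under this simplified check.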
Example 1: Construction of the "Rubisco" strain (intermediate strain IX1)
In this example, the starting strain was transformed with the cbbM gene from Thiobacillus denitrificans, encoding the single-subunit ribulose-1,5-bisphosphate carboxylase (RuBisCO), and with the genes from Escherichia coli encoding the chaperones GroEL and GroES (to aid correct folding of the RuBisCO protein in the cytosol of Saccharomyces cerevisiae).
An expression cassette with the Saccharomyces cerevisiae TDH3 promoter, the cbbM gene and the Saccharomyces cerevisiae CYC1 terminator was obtained from plasmid pBTWW002; an expression cassette with the Saccharomyces cerevisiae TEF1 promoter, the groEL gene and the Saccharomyces cerevisiae ACT1 terminator was obtained from plasmid pUD232; and an expression cassette with the Saccharomyces cerevisiae TPI1 promoter, the groES gene and the Saccharomyces cerevisiae PGI1 terminator was obtained from plasmid pUD233, as described in Guadalupe-Medina et al., Biotechnol. Biofuels, 2013, vol. 6, p. 125 and US2019309268.
The three cassettes were integrated, using CRISPR-Cas9 and a gRNA expression cassette, into the INT1 locus located between NTR1 (YOR071c) and GYP1 (YOR070c) on chromosome XV of Saccharomyces cerevisiae (as described in DiCarlo et al., Nucleic Acids Res. 2013; pages 1-8). The gRNA expression cassette was ordered as a synthetic DNA cassette (gBLOCK) from Integrated DNA Technologies (Belgium) as the INT1 gBLOCK. Integration took place by homologous recombination with a PCR fragment (5'-INT1) generated with primers BoZ-783 and DBC-18463 and a PCR fragment (3'-INT1) generated with primers DBC-18464 and BoZ-788, both using genomic DNA of strain CEN.PK113-7D as template, as described in WO 2018/114762.
Diagnostic PCR was performed to confirm proper assembly and integration of the GroES-cbbM-GroEL cassette at the INT1 locus. The Cas9 and gRNA plasmids were removed by two overnight incubations in liquid YEPhD; dilutions were then plated on YEPhD plates and single colonies were picked. Cells from single colonies on the YEPhD plates were streaked and re-streaked on YEPhD, YEPhD + G418 (400 μg/mL) and YEPhD + NTC (200 μg/mL). Plasmid-free colonies, which did not grow on the antibiotic plates, were selected.
The above example gives an intermediate strain IX1 which contains two copies of GroES, cbbM and GroEL (see Table 14 for detailed genotypes).
Example 2: Construction of the reference "PRK-Rubisco" strain (reference strain RX2)
Intermediate strain IX1 obtained as in Example 1 was transformed with phosphoribulokinase (prk) from Spinacia oleracea (as described in Guadalupe-Medina et al., Biotechnol. Biofuels, 2013, vol. 6, p. 125) and ribulose-1,5-bisphosphate carboxylase (RuBisCO) from Thiobacillus denitrificans, encoded by the cbbM gene. The prk gene was expressed under control of the Saccharomyces cerevisiae ANB1 promoter and the Saccharomyces cerevisiae EBO1 terminator (Sc_EBO1t, shown in SEQ ID NO: 79), as described in US 10689670. The relevant sequences are shown by SEQ ID NO: 26 (cbbM), SEQ ID NO: 28 (prk) and SEQ ID NO: 23 (ANB1).
The cbbM gene was cloned between the Saccharomyces cerevisiae ENO2 promoter (shown by SEQ ID NO: 80) and the Saccharomyces cerevisiae YOX1 terminator (shown by SEQ ID NO: 81).
The cassettes were assembled into the "prk-cbbM" pathway and integrated into the INT14.02 locus, a non-coding region between ORF YNL179c and RPS3 (YNL178W) on chromosome XIV of Saccharomyces cerevisiae, using CRISPR-Cas9 with the INT14.02 protospacer (shown by SEQ ID NO: 82) and the following two sequences for homologous integration: Sc_INt14.02_flanking 5 (shown by SEQ ID NO: 83) and Sc_INt14.02_flanking 3 (shown by SEQ ID NO: 84).
Diagnostic PCR was performed to confirm proper assembly and integration of the prk and cbbM cassettes at the INT14.02 locus, and the gRNA and Cas9 plasmids were removed as described above. This gave a reference strain RX2 which contained two copies of prk, groEL and groES and 4 copies of cbbM (see Table 14 for detailed genotypes).
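The copy numbers stated for IX1 and RX2 can be reproduced with a small tally. The sketch below assumes that each integration is present on both homologous chromosomes of a diploid host (so every integrated cassette counts twice); that assumption, and the helper name `integrate`, are illustrative and not stated in the text.

```python
from collections import Counter

# Assumption: every locus is modified on both homologous chromosomes.
ALLELES = 2


def integrate(genotype: Counter, cassette_genes: list[str]) -> Counter:
    """Return a new gene-copy tally after integrating one cassette per gene."""
    updated = Counter(genotype)
    for gene in cassette_genes:
        updated[gene] += ALLELES
    return updated


# Example 1: GroES, cbbM and GroEL cassettes at the INT1 locus.
ix1 = integrate(Counter(), ["groES", "cbbM", "groEL"])
# Example 2: prk and cbbM cassettes at the INT14.02 locus.
rx2 = integrate(ix1, ["prk", "cbbM"])

print(dict(ix1))
print(dict(rx2))
```

Under this assumption the second cbbM cassette at INT14.02 accounts for RX2 carrying four copies of cbbM while prk, groEL and groES remain at two copies each.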
Example 3: construction of novel strains NX3 and NX4
New strains NX3 and NX4 were constructed by transforming the reference strain RX2 obtained in example 2 as follows:
A DNA fragment was assembled using Golden Gate cloning (as described, for example, in Engler et al., "Generation of Families of Construct Variants Using Golden Gate Shuffling" (2011), published in Chaofu Lu et al. (eds.), cDNA Libraries: Methods and Applications, Methods in Molecular Biology, vol. 729, chapter 11, pages 167-180, incorporated herein by reference). It contained the Saccharomyces cerevisiae ANB1 promoter (shown by SEQ ID NO: 23), the Pichia pastoris TKL1 gene (shown by SEQ ID NO: 18) and the Saccharomyces cerevisiae TDH1 terminator. This DNA fragment was designated "fragment A" (shown by SEQ ID NO: 73). Fragment A was integrated into the INT95 locus, located between SOD1 (YJR104C) and ADO1 (YJR105W) on chromosome X of Saccharomyces cerevisiae reference strain RX2, using CRISPR-Cas9, the INT95 protospacer (shown by SEQ ID NO: 85) and the following two sequences for homologous integration: Sc_INT95B_flanking 5 (shown by SEQ ID NO: 86) and Sc_INT95B_flanking 3 (shown by SEQ ID NO: 87).
Diagnostic PCR was performed to confirm proper assembly and integration of the Pichia pastoris TKL1 expression cassette at the INT95 locus. Plasmid-free colonies were selected, which resulted in new strains NX3 and NX4 containing two copies of the Pichia pastoris TKL1 expression cassette (see Table 14 for detailed genotypes).
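To illustrate what a protospacer is in this context, the sketch below lists candidate 20-nt protospacers that are immediately followed by an NGG PAM, the motif recognized by Streptococcus pyogenes Cas9. It is a simplified illustration only: it scans the forward strand, does no off-target scoring, and is not the design tool used for the INT95 protospacer; the function name and test sequence are hypothetical.

```python
import re


def find_protospacers(seq: str, length: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for each NGG PAM on the given strand."""
    seq = seq.upper()
    # The lookahead lets overlapping candidates be reported as well.
    pattern = re.compile(rf"(?=([ACGT]{{{length}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical toy sequence: one 20-nt stretch followed by the PAM "AGG".
toy_locus = "ACGT" * 5 + "AGG" + "CC"
for start, protospacer, pam in find_protospacers(toy_locus):
    print(start, protospacer, pam)
```

A complete search would also scan the reverse complement and check candidates for uniqueness in the genome.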
Example 4: construction of novel strains NX5 and NX6
New strains NX5 and NX6 were constructed by transforming the reference strain RX2 obtained in example 2 with the following four expression cassettes:
The first cassette contains a DNA fragment named "fragment A" (shown by SEQ ID NO: 73) comprising the Saccharomyces cerevisiae ANB1 promoter, the Pichia pastoris TKL1 ORF and the Saccharomyces cerevisiae TDH1 terminator, as mentioned in Example 3.
The second cassette contains a DNA fragment named "fragment B" comprising the Saccharomyces cerevisiae MYO4 promoter (Sc_MYO4.pro), the Saccharomyces cerevisiae DAK1 ORF (Sc_DAK1.orf) and the Saccharomyces cerevisiae GPM1 terminator (Sc_GPM1.ter). The nucleic acid sequence of this DNA fragment is shown in SEQ ID NO: 74.
The third cassette contains a DNA fragment designated "fragment C" comprising the Saccharomyces cerevisiae HHF2 promoter (Sc_HH2.pro), the Escherichia coli gldA ORF (Ec_gldA.orf) and the Saccharomyces cerevisiae EFM1 terminator (Sc_EFM1.ter). The nucleic acid sequence of this DNA fragment is shown in SEQ ID NO: 75.
The fourth cassette contains a DNA fragment named "fragment D" comprising the Saccharomyces cerevisiae ANB1 promoter (Sc_ANB1.pro_0001), the Zygosaccharomyces rouxii ORF (Zrou_T5.orf) encoding the glycerol transporter GLYT (ZYRO0E01210) and the Saccharomyces cerevisiae TEF1 terminator (Sc_TEF1.ter_0001). The nucleic acid sequence of this DNA fragment is shown in SEQ ID NO: 76.
These four cassettes were integrated into the INT95 locus, located between SOD1 (YJR104C) and ADO1 (YJR105W) on chromosome X of Saccharomyces cerevisiae reference strain RX2, using CRISPR-Cas9 and the sequences for homologous integration described above, i.e., Sc_INT95B_flanking 5 (shown by SEQ ID NO: 86) and Sc_INT95B_flanking 3 (shown by SEQ ID NO: 87).
Diagnostic PCR was performed to confirm that the four expression cassettes were assembled and integrated correctly at the INT95 locus. Plasmid-free colonies were selected, which resulted in new strains NX5 and NX6 (see table 14 for detailed genotypes).
Example 5: construction of novel strains NX7 and NX8
New strains NX7 and NX8 were constructed by transforming the reference strain RX2 obtained in Example 2 with a cassette containing the DNA fragment "fragment A" (shown by SEQ ID NO: 73), which comprises the Saccharomyces cerevisiae ANB1 promoter, the Pichia pastoris TKL1 ORF and the Saccharomyces cerevisiae TDH1 terminator, as mentioned in Example 3. The cassette was integrated into the INT95 locus using CRISPR-Cas9, with Sc_INT95B_flanking 5.Acc (SEQ ID NO: 86) and Sc_INT95B_flanking 3.Acc (SEQ ID NO: 87), also shown in Example 3, for homologous integration.
GPD2 was deleted using CRISPR-Cas9 and GPD2 proto-spacers (shown by SEQ ID NO: 88) and repair DNA fragments for homologous recombination and deletion of GPD2 (shown by SEQ ID NO: 78). INT95 integration and GPD2 loss were checked by diagnostic PCR. Plasmid-free colonies were selected, which resulted in new strains NX7 and NX8 (see table 14 for detailed genotypes).
Example 6: construction of novel strains NX9 and NX10
New strains NX9 and NX10 were constructed by transforming reference strain RX2 with the following three expression cassettes:
The first cassette contains a DNA fragment named "fragment A" (shown by SEQ ID NO: 73) comprising the Saccharomyces cerevisiae ANB1 promoter, the Pichia pastoris TKL1 ORF and the Saccharomyces cerevisiae TDH1 terminator, as mentioned in Example 3.
The second cassette contains a DNA fragment designated "fragment E" comprising the Saccharomyces cerevisiae PFY1 promoter (Sc_PFY1.pro), the Escherichia coli gldA ORF (Ec_gldA.orf) and the Saccharomyces cerevisiae EFM1 terminator (Sc_EFM1.ter). The nucleic acid sequence of this DNA fragment is shown in SEQ ID NO: 77.
The third cassette contains a DNA fragment named "fragment D" comprising the Saccharomyces cerevisiae ANB1 promoter (Sc_ANB1.pro_0001), the Zygosaccharomyces rouxii ORF (Zrou_T5.orf) encoding the glycerol transporter GLYT (ZYRO0E01210) and the Saccharomyces cerevisiae TEF1 terminator (Sc_TEF1.ter_0001). The nucleic acid sequence of this DNA fragment is shown in SEQ ID NO: 76.
The three cassettes were integrated into the INT95 locus using CRISPR-Cas9, with INT95B_flanking 5 (SEQ ID NO: 86) and INT95B_flanking 3 (SEQ ID NO: 87) for homologous integration. Diagnostic PCR was performed to confirm that the three expression cassettes were assembled and integrated correctly at the INT95 locus. Plasmid-free colonies were selected, which resulted in strains NX9 and NX10 (see Table 14 for detailed genotypes).
Table 14: Saccharomyces cerevisiae strains used in this study
Example 7: fermentation
Precultures of the above new "NX3-NX10" strains were prepared as follows: glycerol stock (-80 ℃) was thawed at room temperature and used to inoculate 0.2 L of mineral medium supplemented with 2% (w/v) glucose at pH 6.0 (adjusted with 2 M H2SO4/4 N KOH) in a non-baffled 0.5 L shake flask (as described in Luttik, M.L.H. et al. (2000) "The Saccharomyces cerevisiae ICL2 Gene Encodes a Mitochondrial 2-Methylisocitrate Lyase Involved in Propionyl-Coenzyme A Metabolism", J. Bacteriol. 182:7007-13). The preculture was incubated at 32 ℃ for 18 hours with shaking at 200 RPM. After the CDW was estimated by OD600 measurement (using an existing calibration curve of yeast cell dry weight (CDW) versus OD600), an amount of preculture corresponding to the 0.5 g CDW/L inoculum concentration required for propagation was centrifuged (3 min, 530 x g), washed once with one sample volume of sterile demineralized water, centrifuged once more, and resuspended in propagation medium.
Propagation of the above "NX3-NX10" strains was performed as follows: the propagation step was performed in 500 mL shake flasks using 100 mL of filtered and diluted corn mash (70% v/v corn mash: 30% v/v water) supplemented with 1.25 g/L urea and the following antibiotics: neomycin and penicillin G at final concentrations of 50 μg/mL and 100 μg/mL, respectively. After all additions, the pH was adjusted to 5.0 using 2 M H2SO4/4 N KOH. Glucoamylase (Novozymes) was dosed at a concentration of 0.1 mL/L at the beginning of propagation. All strains were propagated for 6 h at 32 ℃ with shaking at 200 RPM.
The main fermentation of the above "NX3-NX10" strains was performed as follows: the main fermentation step was performed using 200 mL of medium in a 500 mL Schott bottle equipped with a pressure recording/release cap (Ankom Technology, Macedon, New York, USA) while shaking at 140 rpm and 32 ℃. The pH was not controlled during fermentation. Fermentation was performed with corn mash with an increased dry solids content of 36% w/w DS. The corn mash was supplemented with 1.0 g/L urea, the antibiotics neomycin and penicillin G at final concentrations of 50 μg/mL and 100 μg/mL, respectively, and antifoam (Basildon, approximately 0.5 mL/L). After all additions, the pH was adjusted to 5.0 using 2 M H2SO4/4 N KOH. Glucoamylase (Novozymes) was dosed at a concentration of 0.24 mL/L at the beginning of fermentation. The amount of yeast added (pitch) from propagation to fermentation was 1.5% of the fermentation volume. All strains were tested at high solids (i.e., 36% w/w DS).
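The inoculum calculation in this step can be sketched as follows. The example assumes a linear OD600-to-CDW calibration through the origin and a 100 mL propagation volume (the volume given for the propagation step); the calibration factor and OD600 reading are hypothetical placeholders, since the actual calibration curve is not given.

```python
def preculture_volume_ml(
    od600: float,
    cdw_per_od_g_per_l: float,
    target_g_per_l: float = 0.5,
    propagation_volume_ml: float = 100.0,
) -> float:
    """Return the preculture volume (mL) that supplies the target inoculum.

    Assumes CDW (g/L) = od600 * cdw_per_od_g_per_l, i.e. a linear calibration.
    """
    preculture_cdw_g_per_l = od600 * cdw_per_od_g_per_l
    if preculture_cdw_g_per_l <= 0:
        raise ValueError("preculture biomass concentration must be positive")
    required_cdw_g = target_g_per_l * propagation_volume_ml / 1000.0
    return 1000.0 * required_cdw_g / preculture_cdw_g_per_l


# Hypothetical reading: OD600 of 10 with 0.25 g CDW/L per OD unit.
volume = preculture_volume_ml(od600=10.0, cdw_per_od_g_per_l=0.25)
print(f"Centrifuge {volume:.1f} mL of preculture")
```

With these placeholder numbers the preculture holds 2.5 g CDW/L, so 20 mL supplies the 0.05 g CDW needed for 100 mL at 0.5 g CDW/L.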
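The per-flask quantities implied by these concentrations can be worked out with simple arithmetic. The sketch below is an illustration under the stated 100 mL volume and 70:30 mash to water ratio; the function and key names are hypothetical, and the neomycin and penicillin G defaults are the final concentrations given in the text.

```python
def propagation_additions(
    volume_ml: float,
    mash_fraction: float,
    urea_g_per_l: float,
    glucoamylase_ml_per_l: float,
    neomycin_ug_per_ml: float = 50.0,
    penicillin_g_ug_per_ml: float = 100.0,
) -> dict[str, float]:
    """Convert concentrations into absolute amounts for one vessel."""
    return {
        "corn_mash_ml": mash_fraction * volume_ml,
        "water_ml": (1.0 - mash_fraction) * volume_ml,
        "urea_g": urea_g_per_l * volume_ml / 1000.0,
        # mL/L multiplied by mL gives microlitres.
        "glucoamylase_ul": glucoamylase_ml_per_l * volume_ml,
        "neomycin_mg": neomycin_ug_per_ml * volume_ml / 1000.0,
        "penicillin_g_mg": penicillin_g_ug_per_ml * volume_ml / 1000.0,
    }


flask = propagation_additions(
    volume_ml=100.0, mash_fraction=0.70, urea_g_per_l=1.25, glucoamylase_ml_per_l=0.1
)
for name, amount in flask.items():
    print(f"{name}: {amount:.3f}")
```

For a 100 mL flask this gives 70 mL mash, 30 mL water, 0.125 g urea, 10 μL glucoamylase, 5 mg neomycin and 10 mg penicillin G.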
Sampling of the fermentation was performed as follows: samples were taken from the primary fermentation only. Samples for HPLC analysis were collected at 18, 24, 42, 48 and 66 hours. Table 15 summarizes the ethanol production (g/l) at each time point. Table 16 summarizes the remaining glucose concentrations (g/l) at each time point.
The conclusion is as follows: the remaining glucose concentration is an indicator of the robustness of the yeast strain. Glucose is continuously produced due to the presence of glucoamylase. Without wishing to be bound by any type of theory, it is believed that less robust strains become more inhibited near the end of fermentation, so that higher concentrations of unconverted glucose are found in the sample. More robust strains become less inhibited near the end of the fermentation, so that lower concentrations of unconverted glucose are found in the sample.
As shown in Table 16, the concentration of unconverted glucose obtained after 48 hours and 66 hours of fermentation with yeasts NX3, NX4, NX5, NX6, NX7, NX8, NX9 and NX10 according to the present invention was lower than that obtained with the reference strain.
Example 8: Construction of reference strain FGG1-pPATH1 (i.e., reference strain RX11) expressing the phosphoketolase pathway
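The comparison described here can be expressed as a simple check of residual glucose at the late time points. The sketch below is illustrative only: the strain labels and glucose values are placeholders and are NOT taken from Table 16, and the function name is hypothetical.

```python
def lower_residual_glucose(
    residual_glucose: dict[str, dict[int, float]],
    reference: str,
    timepoints: tuple[int, ...] = (48, 66),
) -> dict[str, bool]:
    """For each strain, report whether residual glucose (g/L) is below the
    reference strain at every listed time point (hours)."""
    ref_series = residual_glucose[reference]
    return {
        strain: all(series[t] < ref_series[t] for t in timepoints)
        for strain, series in residual_glucose.items()
        if strain != reference
    }


# Placeholder values for illustration only (not measured data).
placeholder = {
    "reference": {48: 10.0, 66: 5.0},
    "strain_a": {48: 8.0, 66: 3.0},
    "strain_b": {48: 12.0, 66: 4.0},
}
print(lower_residual_glucose(placeholder, reference="reference"))
```

With real data, the inner dictionaries would hold the HPLC glucose readings per sampling time for each strain.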
WO 2018/172328 describes the construction of several strains of saccharomyces cerevisiae (including FGG1-pPATH1 strains) that express the phosphoketolase pathway. The strain FGG1-pPATH1 has a related genotype comprising PKL, PTA and AADH. A summary of relevant strains for the following examples is provided in table 17 below. The strain may be constructed as described in WO 2018/172328 (incorporated herein by reference).
As explained in WO 2018/172328, in an industrial environment, the tolerance to hypertonicity and stress response to the external environment of strains such as FGG1-pPATH1 strains may be affected.
Table 17: Saccharomyces cerevisiae strains expressing the phosphoketolase pathway
Example 9: Construction of new strain NX12 (prophetic, according to the invention)
New strain NX12 can be constructed by transforming reference strain RX11 (FGG1-pPATH1 as described in WO 2018/172328) as follows:
A DNA fragment can be assembled comprising the Saccharomyces cerevisiae ANB1 promoter (shown by SEQ ID NO: 23), the Pichia pastoris TKL1 gene (shown by SEQ ID NO: 18) and the Saccharomyces cerevisiae TDH1 terminator. This DNA fragment is designated "fragment A" (shown by SEQ ID NO: 73). DNA fragment A can be assembled using Golden Gate cloning (as described, for example, in Engler et al., "Generation of Families of Construct Variants Using Golden Gate Shuffling" (2011), published in Chaofu Lu et al. (eds.), cDNA Libraries: Methods and Applications, Methods in Molecular Biology, vol. 729, chapter 11, pages 167-180, incorporated herein by reference). Using CRISPR-Cas9, the INT95 protospacer (shown by SEQ ID NO: 85) and the following two sequences for homologous integration, the expression cassette can be integrated into the INT95 locus located between SOD1 (YJR104C) and ADO1 (YJR105W) on chromosome X of Saccharomyces cerevisiae reference strain RX11: Sc_INT95B_flanking 5 (shown by SEQ ID NO: 86) and Sc_INT95B_flanking 3 (shown by SEQ ID NO: 87).
Diagnostic PCR can be performed to confirm proper assembly and integration of the Pichia pastoris TKL1 expression cassette at the INT95 locus. Plasmid-free colonies can then be selected, resulting in a new strain NX12 containing two copies of the Pichia pastoris TKL1 expression cassette (see Table 17 for detailed genotypes).
Example 10: fermentation (prophetic)
Precultures of the above new "NX12" strain were prepared as follows: glycerol stock (-80 ℃) was thawed at room temperature and used to inoculate 0.2 L of mineral medium supplemented with 2% (w/v) glucose at pH 6.0 (adjusted with 2 M H2SO4/4 N KOH) in a non-baffled 0.5 L shake flask (as described in Luttik, M.L.H. et al. (2000) "The Saccharomyces cerevisiae ICL2 Gene Encodes a Mitochondrial 2-Methylisocitrate Lyase Involved in Propionyl-Coenzyme A Metabolism", J. Bacteriol. 182:7007-13). The preculture was incubated at 32 ℃ for 18 hours with shaking at 200 RPM. After the CDW was estimated by OD600 measurement (using an existing calibration curve of yeast cell dry weight (CDW) versus OD600), an amount of preculture corresponding to the 0.5 g CDW/L inoculum concentration required for propagation was centrifuged (3 min, 530 x g), washed once with one sample volume of sterile demineralized water, centrifuged once more, and resuspended in propagation medium.
Propagation of the above "NX12" strain can be performed as follows: the propagation step was performed in 500 mL shake flasks using 100 mL of filtered and diluted corn mash (70% v/v corn mash: 30% v/v water) supplemented with 1.25 g/L urea and the following antibiotics: neomycin and penicillin G at final concentrations of 50 μg/mL and 100 μg/mL, respectively. After all additions, the pH was adjusted to 5.0 using 2 M H2SO4/4 N KOH. Glucoamylase (Novozymes) was dosed at a concentration of 0.1 mL/L at the beginning of propagation. All strains were propagated at 32 ℃ for 6 hours with shaking at 200 RPM.
The main fermentation of the above "NX12" strain can be performed as follows: the main fermentation step was performed using 200 mL of medium in a 500 mL Schott bottle equipped with a pressure recording/release cap (Ankom Technology, Macedon, New York, USA) while shaking at 140 rpm and 32 ℃. The pH was not controlled during fermentation. Fermentation was performed with corn mash with an increased dry solids content of 36% w/w DS. The corn mash was supplemented with 1.0 g/L urea, the antibiotics neomycin and penicillin G at final concentrations of 50 μg/mL and 100 μg/mL, respectively, and antifoam (Basildon, approximately 0.5 mL/L). After all additions, the pH was adjusted to 5.0 using 2 M H2SO4/4 N KOH. Glucoamylase (Novozymes) was dosed at a concentration of 0.24 mL/L at the beginning of fermentation. The amount of yeast added from propagation to fermentation was 1.5% of the fermentation volume. All strains were tested at high solids (i.e., 36% w/w DS).
Sampling of the fermentation can be performed as follows: samples were taken from the primary fermentation only. Samples for HPLC analysis were collected at 18, 24, 42, 48 and 66 hours. The ethanol yield (g/l) at each time point and the remaining glucose concentration (g/l) at each time point can be analyzed.
The conclusion may be as follows: the remaining glucose concentration is an indicator of the robustness of the yeast strain. Glucose is continuously produced due to the presence of glucoamylase. Without wishing to be bound by any type of theory, it is believed that less robust strains (such as reference strain RX11) become more inhibited near the end of the fermentation, so that higher concentrations of unconverted glucose are found in the sample. More robust strains (such as NX12) become less inhibited near the end of the fermentation, so that lower concentrations of unconverted glucose are found in the sample.
Example 11: Construction of reference strain IMZ132 (i.e., reference strain RX13) expressing an acetylating acetaldehyde dehydrogenase
WO 2011/010923 describes a strain IMZ132 expressing an acetylating acetaldehyde dehydrogenase, further referred to herein as reference strain RX13. Strain IMZ132 can be constructed as described in WO 2011/010923 (incorporated herein by reference). In addition, strain IMZ132 was deposited at the Centraalbureau voor Schimmelcultures (the Netherlands collection of microorganisms and cell cultures) on July 16, 2009 under accession number CBS125049.
Table 18: Saccharomyces cerevisiae strains expressing acetylating acetaldehyde dehydrogenase
Example 12: Construction of new strain NX14 (prophetic, according to the invention)
The new strain NX14 can be constructed by transforming the reference strain RX13 (IMZ132, as described in WO 2011/010923) as follows:
A DNA fragment was compiled comprising the Saccharomyces cerevisiae ANB1 promoter (shown by SEQ ID NO: 23), the Pichia pastoris TKL1 gene (shown by SEQ ID NO: 18) and the Saccharomyces cerevisiae TDH1 terminator. This DNA fragment was designated "fragment A" (shown by SEQ ID NO: 73). DNA fragment A was assembled using Golden Gate cloning (as described, for example, in Engler et al., "Generation of Families of Construct Variants Using Golden Gate Shuffling" (2011), published in Chaofu Lu et al. (eds.), cDNA Libraries: Methods and Applications, Methods in Molecular Biology, volume 729, chapter 11, pages 167-180, incorporated herein by reference). Using CRISPR-Cas9, the INT95 protospacer (shown by SEQ ID NO: 85) and the following two flanking sequences for homologous integration, the expression cassette can be integrated into the INT95 locus, located between SOD1 (YJR104C) and ADO1 (YJR105W) on chromosome X of Saccharomyces cerevisiae reference strain RX13: Sc_INT95B_flanking 5 (shown by SEQ ID NO: 86) and Sc_INT95B_flanking 3 (shown by SEQ ID NO: 87).
Diagnostic PCR can be performed to confirm proper assembly and integration of the Pichia pastoris TKL1 expression cassette at the INT95 locus. Plasmid-free colonies are then selected, which results in the new strain NX14 containing two copies of the Pichia pastoris TKL1 expression cassette (see Table 18 for the detailed genotype).
Example 13: Fermentation (prophetic)
Precultures of the new strain NX14 described above were prepared as follows: a glycerol stock (-80 °C) was thawed at room temperature and used to inoculate 0.2 L of mineral medium supplemented with 2% (w/v) glucose at pH 6.0 (adjusted with 2 M H2SO4/4 N KOH) in a non-baffled 0.5 L shake flask (medium as described in Luttik, M.L.H. et al. (2000) "The Saccharomyces cerevisiae ICL2 Gene Encodes a Mitochondrial 2-Methylisocitrate Lyase Involved in Propionyl-Coenzyme A Metabolism", J. Bacteriol. 182:7007-13). The preculture was incubated at 32 °C for 18 hours with shaking at 200 rpm. After estimating the cell dry weight (CDW) by OD600 measurement (using an existing calibration curve of yeast CDW versus OD600), a volume of preculture corresponding to the 0.5 g CDW/L inoculum concentration required for propagation was centrifuged (3 min, 530 x g), washed once with one sample volume of sterile demineralized water, centrifuged once more, and resuspended in propagation medium.
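The inoculum calculation described above can be sketched as follows. This is a minimal illustration, not the method of the example: it assumes a linear CDW-versus-OD600 calibration through the origin, and the slope value, the OD600 reading and both function names are hypothetical placeholders. Only the target of 0.5 g CDW/L is taken from the text; the 100 mL volume is the propagation volume stated in the propagation step.

```python
def cdw_from_od600(od600: float, slope_g_per_l_per_od: float) -> float:
    """Estimate cell dry weight (g/L) from OD600.

    Assumption: the calibration curve is linear and passes through the
    origin; a real calibration curve may need an intercept or a
    non-linear fit at high optical densities.
    """
    return od600 * slope_g_per_l_per_od


def preculture_volume_ml(od600: float,
                         slope_g_per_l_per_od: float,
                         target_cdw_g_per_l: float,
                         propagation_volume_ml: float) -> float:
    """Volume of preculture (mL) to harvest so that the propagation
    culture starts at the target CDW concentration."""
    cdw = cdw_from_od600(od600, slope_g_per_l_per_od)
    if cdw <= 0:
        raise ValueError("estimated CDW must be positive")
    # Mass balance: harvested volume * CDW = target CDW * final volume
    return target_cdw_g_per_l * propagation_volume_ml / cdw


# Hypothetical reading: OD600 = 10 with a slope of 0.25 g CDW/L per OD unit.
volume = preculture_volume_ml(10.0, 0.25, 0.5, 100.0)
print(f"harvest {volume:.1f} mL of preculture")
```

With these placeholder inputs the estimated preculture density is 2.5 g CDW/L, so 20 mL of preculture would be harvested for a 100 mL propagation culture.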
Propagation of the NX14 strain described above can be performed as follows: the propagation step was performed in 500 mL shake flasks using 100 mL of filtered and diluted corn mash (70% v/v corn mash : 30% v/v water) supplemented with 1.25 g/L urea and the following antibiotics: neomycin and penicillin G at final concentrations of 50 µg/mL and 100 µg/mL, respectively. After all additions, the pH was adjusted to 5.0 using 2 M H2SO4/4 N KOH. Glucoamylase (…T, Novozymes) was dosed at a concentration of 0.1 mL/L at the start of propagation. All strains were propagated at 32 °C for 6 hours with shaking at 200 rpm.
The main fermentation of the NX14 strain described above can be performed as follows: the main fermentation step was performed using 200 mL of medium in 500 mL Schott bottles equipped with pressure recording/release caps (ANKOM Technology, Macedon, New York, USA), with shaking at 140 rpm at a temperature of 32 °C. The pH was not controlled during fermentation. Fermentation was performed with corn mash with an increased dry solids content of 36% w/w DS. The corn mash was supplemented with 1.0 g/L urea and the following antibiotics: neomycin and penicillin G at final concentrations of 50 µg/mL and 100 µg/mL, respectively; and antifoam (Basildon, approximately 0.5 mL/L). After all additions, the pH was adjusted to 5.0 using 2 M H2SO4/4 N KOH. Glucoamylase (…T, Novozymes) was dosed at a concentration of 0.24 mL/L at the start of fermentation. The volume of yeast culture transferred from propagation to fermentation was 1.5% of the fermentation volume. All strains were tested under high solids (i.e. 36% w/w DS).
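The per-vessel additions implied by the propagation and main fermentation recipes can be worked out with a short sketch. The concentrations and volumes are those stated in the text; the `Recipe` class, the `additions` and `yeast_transfer_ml` functions, and the choice of output units are illustrative assumptions, not part of the described method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    """Target final concentrations for one process step."""
    urea_g_per_l: float
    neomycin_ug_per_ml: float
    penicillin_g_ug_per_ml: float
    glucoamylase_ml_per_l: float


def additions(recipe: Recipe, volume_ml: float) -> dict:
    """Amount of each supplement needed for a vessel of the given volume."""
    litres = volume_ml / 1000.0
    return {
        "urea_g": recipe.urea_g_per_l * litres,
        # ug/mL * mL gives ug; divide by 1000 to report mg
        "neomycin_mg": recipe.neomycin_ug_per_ml * volume_ml / 1000.0,
        "penicillin_g_mg": recipe.penicillin_g_ug_per_ml * volume_ml / 1000.0,
        # mL/L * L gives mL; multiply by 1000 to report uL
        "glucoamylase_ul": recipe.glucoamylase_ml_per_l * litres * 1000.0,
    }


def yeast_transfer_ml(fermentation_volume_ml: float, fraction: float = 0.015) -> float:
    """Volume of propagated culture transferred (1.5% of the fermentation volume)."""
    return fermentation_volume_ml * fraction


PROPAGATION = Recipe(1.25, 50.0, 100.0, 0.1)       # 100 mL per shake flask
MAIN_FERMENTATION = Recipe(1.0, 50.0, 100.0, 0.24)  # 200 mL per bottle

for name, recipe, volume in (("propagation", PROPAGATION, 100.0),
                             ("main fermentation", MAIN_FERMENTATION, 200.0)):
    amounts = additions(recipe, volume)
    print(name, {key: round(value, 3) for key, value in amounts.items()})
print("yeast transfer (mL):", round(yeast_transfer_ml(200.0), 3))
```

For the 200 mL main fermentation this gives 0.2 g urea, 10 mg neomycin, 20 mg penicillin G, 48 µL glucoamylase and a 3 mL yeast transfer; for the 100 mL propagation it gives 0.125 g urea, 5 mg neomycin, 10 mg penicillin G and 10 µL glucoamylase.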
Sampling of the fermentation can be performed as follows: samples were taken from the main fermentation only. Samples for HPLC analysis were collected at 18, 24, 42, 48 and 66 hours. The ethanol titer (g/L) and the residual glucose concentration (g/L) can be analyzed at each time point.
The conclusion may be as follows: the residual glucose concentration is an indicator of the robustness of the yeast strain. Glucose is released continuously because glucoamylase is present. Without wishing to be bound by any theory, it is believed that a less robust strain (such as reference strain RX13) becomes more inhibited towards the end of the fermentation, so that higher concentrations of unconverted glucose are expected in its samples. A more robust strain (such as NX14) becomes less inhibited towards the end of the fermentation, so that lower concentrations of unconverted glucose are expected in its samples.
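The comparison outlined above could be automated along the following lines. This is a sketch under stated assumptions: the sampling times come from the text, but the function names, the neutral strain labels and all glucose values are invented placeholders used only to exercise the code. They are not measurements and say nothing about how RX13 or NX14 would perform.

```python
SAMPLE_HOURS = (18, 24, 42, 48, 66)  # HPLC sampling times given in the example


def final_residual_glucose(series: dict) -> float:
    """Residual glucose (g/L) at the last sampling time.

    `series` maps sampling hour -> glucose concentration (g/L).
    """
    missing = [hour for hour in SAMPLE_HOURS if hour not in series]
    if missing:
        raise ValueError(f"missing time points: {missing}")
    return series[SAMPLE_HOURS[-1]]


def lowest_residual_glucose(strains: dict) -> str:
    """Name of the strain with the lowest residual glucose at the final
    time point, used here as the robustness indicator."""
    return min(strains, key=lambda name: final_residual_glucose(strains[name]))


# PLACEHOLDER values, invented for illustration only (not experimental data).
placeholder = {
    "strain_A": {18: 60.0, 24: 45.0, 42: 20.0, 48: 15.0, 66: 10.0},
    "strain_B": {18: 58.0, 24: 40.0, 42: 12.0, 48: 6.0, 66: 1.0},
}
print(lowest_residual_glucose(placeholder))
```

Comparing only the final time point is a simplification; the full glucose and ethanol time courses would normally be inspected as well.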
References
Entian KD, Kötter P. Yeast genetic strain and plasmid collections. Methods Microbiol. 2007;629-66.
Nijkamp JF, van den Broek M, Datema E, de Kok S, Bosman L, Luttik MA, Daran-Lapujade P, Vongsangnak W, Nielsen J, Heijne WHM, Klaassen P, Paddon CJ, Platt D, Kötter P, van Ham RC, Reinders MJT, Pronk JT, de Ridder D, Daran J-M. De novo sequencing, assembly and analysis of the genome of the laboratory strain Saccharomyces cerevisiae CEN.PK113-7D, a model for modern industrial biotechnology. Microb Cell Fact. 2012;11:36.
Verduyn C, Postma E, Scheffers WA, van Dijken JP. Effect of benzoic acid on metabolic fluxes in yeasts: A continuous-culture study on the regulation of respiration and alcoholic fermentation. Yeast. 1992;8:501-17.
Mans R, van Rossum HM, Wijsman M, Backx A, Kuijpers NG, van den Broek M, Daran-Lapujade P, Pronk JT, van Maris AJA, Daran J-M. CRISPR/Cas9: a molecular Swiss army knife for simultaneous introduction of multiple genetic modifications in Saccharomyces cerevisiae. FEMS Yeast Res. 2015;15:fov004.
DiCarlo JE, Norville JE, Mali P, Rios X, Aach J, Church GM. Genome engineering in Saccharomyces cerevisiae using CRISPR-Cas systems. Nucleic Acids Res. 2013;1-8.
Mikkelsen MD, Buron LD, Salomonsen B, Olsen CE, Hansen BG, Mortensen UH, Halkier BA. Microbial production of indolylglucosinolate through engineering of a multi-gene pathway in a versatile yeast expression platform. Metab Eng. 2012;14:104-11.
Knijnenburg TA, Daran JM, van den Broek MA, Daran-Lapujade PA, de Winde JH, Pronk JT, Reinders MJ, Wessels LF. Combinatorial effects of environmental parameters on transcriptional regulation in Saccharomyces cerevisiae: A quantitative analysis of a compendium of chemostat-based transcriptome data. BMC Genomics. 2009;10:53.
Mumberg D, Müller R, Funk M. Yeast vectors for the controlled expression of heterologous proteins in different genetic backgrounds. Gene. 1995;156:119-22.
Gueldener U, Heinisch J, Koehler GJ, Voss D, Hegemann JH. A second set of loxP marker cassettes for Cre-mediated multiple gene knockouts in budding yeast. Nucleic Acids Res. 2002;30:e23.
Guadalupe-Medina V, Wisselink H, Luttik M, de Hulster E, Daran J-M, Pronk JT, van Maris AJA. Carbon dioxide fixation by Calvin-Cycle enzymes improves ethanol yield in yeast. Biotechnol Biofuels. 2013;6:125.
Daniel Gietz R, Woods RA. Transformation of yeast by lithium acetate/single-stranded carrier DNA/polyethylene glycol method. Methods Enzymol. 2002:87-96.
Solis-Escalante D, Kuijpers NGA, Bongaerts N, Bolat I, Bosman L, Pronk JT, Daran J-M, Daran-Lapujade P. amdSYM, a new dominant recyclable marker cassette for Saccharomyces cerevisiae. FEMS Yeast Res. 2013;13:126-39.
Guadalupe-Medina V, Almering MJH, van Maris AJA, Pronk JT. Elimination of glycerol production in anaerobic cultures of a Saccharomyces cerevisiae strain engineered to use acetic acid as an electron acceptor. Appl Environ Microb. 2010;76:190-5.
Papapetridis I, van Dijk M, Dobbe AP, Metz B, Pronk JT, van Maris AJA. Improving ethanol yield in acetate-reducing Saccharomyces cerevisiae by cofactor engineering of 6-phosphogluconate dehydrogenase and deletion of ALD6. Microb Cell Fact. 2016;15:1-16.
Heijnen JJ, van Dijken JP. In search of a thermodynamic description of biomass yields for the chemotrophic growth of microorganisms. Biotechnol Bioeng. 1992;39:833-58.
Postma E, Verduyn C, Scheffers WA, van Dijken JP. Enzymic analysis of the crabtree effect in glucose-limited chemostat cultures of Saccharomyces cerevisiae. Appl Environ Microbiol. 1989;55:468-77.
Verduyn C, Postma E, Scheffers WA, van Dijken JP. Physiology of Saccharomyces cerevisiae in anaerobic glucose-limited chemostat cultures. J Gen Microbiol. 1990;136:395-403.
Kwast et al. Genomic Analysis of Anaerobically induced genes in Saccharomyces cerevisiae: Functional roles of ROX1 and other factors in mediating the anoxic response, 2002, Journal of Bacteriology vol. 184, no. 1, pp. 250-265.
Keng, T. 1992. HAP1 and ROX1 form a regulatory pathway in the repression of HEM13 transcription in Saccharomyces cerevisiae. Mol. Cell. Biol. 12:2616-2623.
Labbe-Bois, R., and P. Labbe. 1990. Tetrapyrrole and heme biosynthesis in the yeast Saccharomyces cerevisiae, p. 235-285. In H.A. Dailey (ed.), Biosynthesis of heme and chlorophylls. McGraw-Hill, New York, N.Y.
Zitomer, R.S., and C.V. Lowry. 1992. Regulation of gene expression by oxygen in Saccharomyces cerevisiae. Microbiol. Rev. 56:1-11.
Zitomer, R.S., P. Carrico, and J. Deckert. 1997. Regulation of hypoxic gene expression in yeast. Kidney Int. 51:507-513.
Cohen et al., Induction and repression of DAN1 and the family of anaerobic mannoprotein genes in Saccharomyces cerevisiae occurs through a complex array of regulatory sites. Nucleic Acids Research, 2001, Vol. 29, No. 3, 799-808.
Ter Linde and Steensma, A microarray-assisted screen for potential Hap1 and Rox1 target genes in Saccharomyces cerevisiae, 2002, Yeast 19:825-840.
Sertil et al. The DAN1 gene of S. cerevisiae is regulated in parallel with the hypoxic gene, but by a different mechanism, 1997, Gene Vol. 192, pages 199-205.
Nissen et al., "Anaerobic and aerobic batch cultivations of Saccharomyces cerevisiae mutants impaired in glycerol synthesis", (2000), Yeast, vol. 16, pages 463-474.
Sambrook et al., Molecular Cloning-A Laboratory Manual, 2nd ed., Vol. 1-3 (1989), published by Cold Spring Harbor Publishing.
Kruskal et al., "An overview of sequence comparison: Time warps, string edits, and macromolecules", (1983), Society for Industrial and Applied Mathematics (SIAM), Vol. 25, No. 2, pages 201-237.
D. Sankoff and J.B. Kruskal (eds.), Time warps, string edits and macromolecules: the theory and practice of sequence comparison, pp. 1-44, Addison-Wesley Publishing Company.
Needleman et al"A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins"(1970)J.Mol.Biol.Vol.48,pages 443-453.
Sherman,F.,et al.,Methods in Yeast Genetics,Cold Spring Harbor Laboratory(1986)
Rice et al., "EMBOSS: The European Molecular Biology Open Software Suite" (2000), Trends in Genetics vol. 16, (6) pages 276-277, http://emboss.bioinformatics.nl/.
Neves et al.,"Yeast orthologues associated with glycerol transport and metabolism",(2004),FEMS Yeast Res.Vol.5,pages 51-62.
Neves et al"New insights on glycerol transport in Saccharomyces cerevisiae",(2004),FEBS Letters 565(2004)160-162.
Kwast et al.,"Genomic Analysis of Anaerobically induced genes in Saccharomyces cerevisiae:Functional roles of ROX1 and other factors in mediating the anoxic response",(2002),Journal of bacteriology vol 184,no1 pages 250-265.
Molin et al(2003)"Dihydroxy-acetone kinases in Saccharomyces cerevisiae are involved in detoxification of dihydroxyacetone"(2003),J.Biol.Chem.,vol.278:pages 1415-1423.
Guadalupe-Medina et al.,"Carbon dioxide fixation by Calvin-Cycle enzymes improves ethanol yield in yeast",published in Biotechnol,Biofuels,2013,vol.6,p.125 onwards.
Yébenes et al.,“Chaperonins:two rings for folding”(2011),Trends in Biochemical Sciences,Vol.36,No.8,pages 424-432.
Zeilstra-Ryalls et al., "The universally conserved GroE (Hsp60) chaperonins", published in Annu Rev Microbiol. (1991) vol. 45, pages 301-25.
Horwich et al.,"Two Families of Chaperonin:Physiology and Mechanism",(2007),Annu.Rev.Cell.Dev.Biol.Vol.23,pages 115-45.
Sonderegger et al., "Metabolic Engineering of a Phosphoketolase Pathway for Pentose Catabolism in Saccharomyces cerevisiae", (2004), Applied & Environmental Microbiology, vol. 70(5), pages 2892-2897.
Membrillo-Hernandez et al., "Evolution of the adhE Gene Product of Escherichia coli from a Functional Reductase to a Dehydrogenase", (2000) J. Biol. Chem. 275: pages 33869-33875.
Tamarit et al. "Identification of the Major Oxidatively Damaged Proteins in Escherichia coli Cells Exposed to Oxidative Stress" (1998) J. Biol. Chem. 273: pages 3027-3032.
Smith et al. "Purification, Properties, and Kinetic Mechanism of Coenzyme A-Linked Aldehyde Dehydrogenase from Clostridium kluyveri" (1980) Arch. Biochem. Biophys. 203: pages 663-675.
Toth et al. "The ald Gene, Encoding a Coenzyme A-Acylating Aldehyde Dehydrogenase, Distinguishes Clostridium beijerinckii and Two Other Solvent-Producing Clostridia from Clostridium acetobutylicum", (1999), Appl. Environ. Microbiol. 65: pages 4973-4980.
Powlowski and Shingler "Genetics and biochemistry of phenol degradation by Pseudomonas sp. CF600", (1994), Biodegradation vol. 5, pages 219-236.
Shingler et al., "Nucleotide Sequence and Functional Analysis of the Complete Phenol/3,4-Dimethylphenol Catabolic Pathway of Pseudomonas sp. Strain CF600", (1992), J. Bacteriol., Vol. 174, pages 711-724.
Ferrandez et al., "Genetic Characterization and Expression in Heterologous Hosts of the 3-(3-Hydroxyphenyl)Propionate Catabolic Pathway of Escherichia coli K-12" (1997) J. Bacteriol. 179: pages 2573-2581.
Lutstorf and Megnet, "Multiple Forms of Alcohol Dehydrogenase in Saccharomyces Cerevisiae", (1968), Arch. Biochem. Biophys., vol. 126, pages 933-944.
Ciriacy, "Genetics of Alcohol Dehydrogenase in Saccharomyces cerevisiae I. Isolation and genetic analysis of adh mutants", (1975), Mutat. Res. 29, pages 315-326.
Engler et al., "Generation of Families of Construct Variants Using Golden Gate Shuffling", (2011), published in chapter 11 of Chaofu Lu et al. (eds.), cDNA Libraries: Methods and Applications, Methods in Molecular Biology, vol. 729, pages 167-180.
DiCarlo et al., "Genome engineering in Saccharomyces cerevisiae using CRISPR-Cas systems", (2013), Nucleic Acids Res Vol 41, pages 4336-4343.
Sikorski and Hieter, "A System of Shuttle Vectors and Yeast Host Strains Designed for Efficient Manipulation of DNA in Saccharomyces cerevisiae", (1989), Genetics, vol. 122, pages 19-27.

Claims (19)

1. A recombinant yeast cell that functionally expresses:
-a nucleic acid sequence encoding a native protein having transketolase activity (EC 2.2.1.1); and
-a nucleic acid sequence encoding a heterologous protein (EC 2.2.1.1) having transketolase activity.
2. The recombinant yeast cell of claim 1, wherein the heterologous protein having transketolase activity comprises or consists of an amino acid sequence having a percent identity with the amino acid sequence of the native protein having transketolase activity within the following ranges: in the range of equal to or greater than 30% to equal to or less than 80%, more preferably in the range of equal to or greater than 30% or equal to or greater than 35% to equal to or less than 75%, and most preferably in the range of equal to or greater than 35% to equal to or less than 70% or even equal to or less than 65%.
3. The recombinant yeast cell of claim 1 or claim 2,
wherein the heterologous protein having transketolase activity comprises or consists of:
-an amino acid sequence of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 9, SEQ ID NO. 10, SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13, SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17 or SEQ ID NO. 19; or alternatively
-a functional homolog of SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 5, SEQ ID No. 6, SEQ ID No. 7, SEQ ID No. 8, SEQ ID No. 9, SEQ ID No. 10, SEQ ID No. 11, SEQ ID No. 12, SEQ ID No. 13, SEQ ID No. 14, SEQ ID No. 15, SEQ ID No. 16, SEQ ID No. 17 or SEQ ID No. 19 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 75%, at least 80%, at least 85%, at least 95% identity to the amino acid sequence of SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 5, SEQ ID No. 6, SEQ ID No. 7, SEQ ID No. 8, SEQ ID No. 9, SEQ ID No. 10, SEQ ID No. 11, SEQ ID No. 12, SEQ ID No. 13, SEQ ID No. 14, SEQ ID No. 15, SEQ ID No. 16, SEQ ID No. 17 or SEQ ID No. 19; or alternatively
-a functional homolog of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 9, SEQ ID NO. 10, SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13, SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17 or SEQ ID NO. 19 having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 3, SEQ ID NO. 4, SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NO. 9, SEQ ID NO. 10, SEQ ID NO. 11, SEQ ID NO. 12, SEQ ID NO. 13, SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NO. 16, SEQ ID NO. 17 or SEQ ID NO. 19, more preferably a functional homolog having no more than […] amino acid mutations, substitutions, insertions and/or deletions when compared to that amino acid sequence.
4. The recombinant yeast cell according to any one of claims 1 to 3,
wherein the native protein having transketolase activity comprises or consists of:
-the amino acid sequence of SEQ ID No. 1; or alternatively
-a functional homolog of SEQ ID No. 1 having at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98% or at least 99% sequence identity to the amino acid sequence of SEQ ID No. 1; or alternatively
-a functional homolog of SEQ ID NO. 1 having one or more mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 1, more preferably a functional homolog having no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, no more than 75, no more than 50, no more than 40, no more than 30, no more than 20, no more than 10 or no more than 5 amino acid mutations, substitutions, insertions and/or deletions when compared to the amino acid sequence of SEQ ID NO. 1.
5. The recombinant yeast cell according to any one of claims 1 to 4,
wherein expression of the nucleic acid sequence encoding the heterologous protein having transketolase activity is under the control of a promoter ("TKL promoter") having an anaerobic/aerobic expression ratio of 2 or more for transketolase.
6. The recombinant yeast cell according to any one of claims 1 to 5,
wherein expression of the nucleic acid sequence encoding the native protein having transketolase activity is under the control of a promoter ("TKL promoter") having an anaerobic/aerobic expression ratio of 2 or more for transketolase.
7. The recombinant yeast cell of claim 5 or claim 6,
wherein the TKL promoter is a promoter of ANB1 and/or DAN1.
8. The recombinant yeast cell according to any one of claims 1 to 7,
wherein the recombinant yeast cell comprises one or more genetic modifications for functionally expressing a protein that plays a role in a metabolic pathway that forms a non-native redox sink.
9. The recombinant yeast cell according to any one of claims 1 to 8,
wherein the recombinant yeast cell functionally expresses:
-a nucleic acid sequence encoding a protein having ribulose-1, 5-bisphosphate carboxylase oxygenase (Rubisco) activity; and/or
-a nucleic acid sequence encoding a protein having Phosphoribulokinase (PRK) activity; and/or
-optionally a nucleic acid sequence encoding one or more chaperones of said protein having ribulose-1, 5-bisphosphate carboxylase oxygenase (Rubisco) activity.
10. The recombinant yeast cell according to any one of claims 1 to 8,
wherein the recombinant yeast cell functionally expresses:
-a nucleic acid sequence encoding a protein comprising phosphoketolase (PKL) activity (EC 4.1.2.9 or EC 4.1.2.22); and/or
-a nucleic acid sequence encoding a protein having phosphotransacetylase (PTA) activity (EC 2.3.1.8); and/or
-a nucleic acid sequence encoding a protein having acetate kinase (ACK) activity (EC 2.7.2.12).
11. The recombinant yeast cell according to any one of claims 1 to 8,
wherein the recombinant yeast cell functionally expresses a nucleic acid sequence encoding a protein comprising NAD+-dependent acetylating acetaldehyde dehydrogenase activity (EC 1.2.1.10).
12. The recombinant yeast cell according to any one of claims 1 to 8,
wherein the recombinant yeast cell functionally expresses a nucleic acid sequence encoding an enzyme having NADH-dependent nitrate reductase activity and/or a nucleic acid sequence encoding an enzyme having NADH-dependent nitrite reductase activity.
13. The recombinant yeast cell of claim 12, wherein the recombinant yeast cell further functionally expresses a nucleic acid sequence encoding an enzyme having nitrate and/or nitrite transporter activity.
14. The recombinant yeast cell according to any one of claims 1 to 13,
wherein the recombinant yeast cell further comprises a deletion or disruption of a nucleic acid sequence encoding a protein having glycerol-3-phosphate dehydrogenase (GPD) activity and/or a nucleic acid sequence encoding a protein having glycerol-3-phosphate phosphatase (GPP) activity.
15. The recombinant yeast cell according to any one of claims 1 to 14,
wherein the recombinant yeast cell further functionally expresses:
-a nucleic acid sequence encoding a protein having glycerol dehydrogenase activity (EC 1.1.1.6);
-a nucleic acid sequence encoding a protein having dihydroxyacetone kinase activity (EC 2.7.1.28 or EC 2.7.1.29); and/or
-optionally a nucleic acid sequence encoding a protein having glycerol transporter activity.
16. The recombinant yeast cell according to any one of claims 1 to 15,
wherein the recombinant yeast cell further functionally expresses a nucleic acid sequence encoding a protein having glucoamylase activity (EC 3.2.1.20 or 3.2.1.3).
17. A method for producing ethanol, the method comprising converting a carbon source, preferably a carbohydrate, using a recombinant yeast cell according to any one of claims 1 to 16.
18. The method of claim 17, wherein the method is performed at least in part in a medium comprising glucose at the following glucose concentrations: 25g/L or higher, 30g/L or higher, 35g/L or higher, 40g/L or higher, 45g/L or higher, 50g/L or higher, 55g/L or higher, 60g/L or higher, 65g/L or higher, 70g/L or higher, 75g/L or higher, 80g/L or higher, 85g/L or higher, 90g/L or higher, 95g/L or higher, 100g/L or higher, 110g/L or higher, or 120g/L or higher.
19. The method of claim 17 or claim 18, wherein the method is performed at least in part in the presence of a glycosylase such as a glucoamylase.
CN202280059116.5A 2021-07-12 2022-07-07 Recombinant yeast cells Pending CN117897490A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163220896P 2021-07-12 2021-07-12
EP21185148.0 2021-07-12
US63/220,896 2021-07-12
PCT/EP2022/068917 WO2023285280A1 (en) 2021-07-12 2022-07-07 Recombinant yeast cell

Publications (1)

Publication Number Publication Date
CN117897490A true CN117897490A (en) 2024-04-16

Family

ID=90643281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280059116.5A Pending CN117897490A (en) 2021-07-12 2022-07-07 Recombinant yeast cells

Country Status (1)

Country Link
CN (1) CN117897490A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination