WO2012078741A2 - Novel fungal esterases - Google Patents

Publication number
WO2012078741A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
acid sequence
protein
enzyme
nucleic acid
Prior art date
Application number
PCT/US2011/063716
Other languages
French (fr)
Other versions
WO2012078741A3 (en)
Inventor
Johannes Visser
Sandra Hinz
Jan Werij
Jacob Visser
Vivi Joosten
Martijn Koetsier
Mark Emalfarb
Original Assignee
Dyadic International (Usa) Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dyadic International (Usa) Inc.
Publication of WO2012078741A2
Publication of WO2012078741A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12Y ENZYMES
    • C12Y301/00 Hydrolases acting on ester bonds (3.1)
    • C12Y301/01 Carboxylic ester hydrolases (3.1.1)
    • C12Y301/01011 Pectinesterase (3.1.1.11)
    • C12Y301/01072 Acetylxylan esterase (3.1.1.72)
    • C12Y301/01073 Feruloyl esterase (3.1.1.73)
    • C12Y301/01074 Cutinase (3.1.1.74)
    • C12Y301/01086 Rhamnogalacturonan acetylesterase (3.1.1.86)

Definitions

  • This invention relates to novel enzymes and novel methods for producing the same.
  • this invention relates to enzymes produced by fungi. More specifically, this invention relates to enzymes of fungal origin classified as esterases.
  • Esterases represent a category of various enzymes including but not limited to lipases, cutinases, phospholipases, phytases, acetylesterases like xylan acetylesterase, feruloylesterase, glucuronyl esterase, rhamnogalacturonan acetylesterase, pectin acetylesterase and pectin methylesterase.
  • the invention also relates to a method to degrade lignocellulosic or cellulosic material and to novel combinations of enzymes, including those that provide a combined or synergistic release of sugars from plant biomass.
  • the invention also relates to a method to modify specific plant cell wall components such as pectin, arabino(glucurono)xylan, acetylgalacto(gluco)mannan altering their physico-chemical properties and a method to provide a more complete release of monomeric and oligomeric constituents of such polysaccharides.
  • the invention also relates to a method to release cellular contents by effecting degradation of the cell walls.
  • the invention also relates to methods to use the novel enzymes and compositions of such enzymes in a variety of other processes, such as washing of clothing or fabrics, detergent processes, processes in the animal feed, food and beverage industries; biorefining, deinking and biobleaching of paper and pulp; and treatment of waste streams.
  • Esterases are hydrolytic enzymes that split esters into an acid and an alcohol in a chemical reaction with water. Esterases represent a category of various enzymes including lipases, phospholipases and phytases that catalyze the hydrolysis and synthesis of ester bonds in compounds.
  • Lipases are water-soluble enzymes that catalyze the hydrolysis of ester chemical bonds in water-insoluble lipid substrates; phospholipases hydrolyze phospholipids into fatty acids and other lipophilic substances; and phytases break down the undigestible phytic acid (phytate) part found in grains and oil seeds and thus release digestible phosphorus, calcium and other nutrients.
  • Esterases are useful for the hydrolysis of complex carbohydrates. This also makes them useful in a variety of industrial textile applications, as well as industrial paper and pulp applications. Esterases are also employed in the food, feed, and beverage industry (e.g., to improve flavors, release minerals, degum vegetable oils, and ripen cheese). Esterases are also used in the pressing of agricultural material to increase yields (e.g., production of fruit juice) and in the treatment of waste streams. Esterases, such as lipases, are also used in detergent compositions, for the purpose of enhancing the cleaning ability of the composition together with cellulases or to act as a softening agent.
  • plant biomass mainly consists of cellulose, various hemicelluloses, pectins, lignin and some cutin waxes which form a protective layer on the outer surface of the intact plant cell wall.
  • Large amounts of carbohydrates in plant biomass provide a plentiful source of potential energy in the form of sugars (both five carbon and six carbon sugars) that can be utilized for numerous industrial and agricultural processes.
  • Esterases can help hemicellulases and pectinases to degrade this material.
  • DDG Distillers' dried grains
  • Milled whole corn kernels are treated with amylases to liquefy the starch within the kernels and hydrolyze it to glucose.
  • the glucose so produced is then fermented in a second step to ethanol.
  • the residual solids after the ethanol fermentation and distillation are centrifuged and dried, and the resulting product is DDG, which is used as an animal feed stock.
  • Although DDG composition can vary, a typical composition for DDG is about 32% hemicellulose, 22% cellulose, 30% protein, 10% lipids, 4% residual starch, and 4% inorganics.
  • the cellulose and hemicellulose fractions, comprising about 54% of the weight of the DDG, can be efficiently hydrolyzed to fermentable sugars by enzymes; however, it has been found that the carbohydrates comprising lignocellulosic materials in DDG are more difficult to digest due to the recalcitrant nature of these substrates.
  • the cost of producing the requisite enzymes is higher than the cost of producing amylases for starch hydrolysis.
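The arithmetic behind the "about 54%" figure can be checked with a minimal Python sketch. The dictionary simply restates the typical DDG composition quoted above; the function names and the 1000 kg batch size are illustrative assumptions, not part of the disclosure. Note that the quoted percentages are approximate and add up to 102%.

```python
# Typical DDG composition as quoted in the text (approximate percent by weight).
DDG_COMPOSITION = {
    "hemicellulose": 32,
    "cellulose": 22,
    "protein": 30,
    "lipids": 10,
    "residual starch": 4,
    "inorganics": 4,
}


def structural_carbohydrate_percent(composition):
    """Return the combined cellulose + hemicellulose share in percent."""
    return composition["cellulose"] + composition["hemicellulose"]


def structural_carbohydrate_mass(ddg_mass_kg, composition=DDG_COMPOSITION):
    """Return the mass (kg) of cellulose + hemicellulose in a DDG batch.

    The batch mass is a hypothetical input chosen by the caller.
    """
    return ddg_mass_kg * structural_carbohydrate_percent(composition) / 100


if __name__ == "__main__":
    print(structural_carbohydrate_percent(DDG_COMPOSITION))  # 54
    print(structural_carbohydrate_mass(1000))                # 540.0
    print(sum(DDG_COMPOSITION.values()))                     # 102
```

Under these figures, a hypothetical 1000 kg batch of DDG would contain roughly 540 kg of cellulose and hemicellulose available for enzymatic hydrolysis.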
  • Major polysaccharides comprising lignocellulosic materials include cellulose and hemicelluloses.
  • the enzymatic hydrolysis of these polysaccharides to soluble sugars (and finally to monomers such as glucose, xylose and other hexoses and pentoses) is catalyzed by several enzymes acting in concert.
  • endo-1,4-β-glucanases (EGs) and exo-cellobiohydrolases (CBHs) catalyze the hydrolysis of insoluble cellulose to cellooligosaccharides (with cellobiose the main product), while β-glucosidases (BGLs) convert the oligosaccharides to glucose.
  • EGs endo-1,4-β-glucanases
  • CBHs exo-cellobiohydrolases
  • BGLs β-glucosidases
  • xylanases and β-xylosidases act together with other enzymes that remove side chains and substituents, such as alpha-L-arabinofuranosidases, ferulic acid esterases and acetylxylan esterases.
  • Glucuronylesterases catalyze the hydrolysis of hemicelluloses containing glucuronyl residues and thereby improve the hydrolysis of hemicellulose and thereby cellulose.
  • galacto(gluco)mannan, often partially acetylated, can be hydrolyzed by the concerted action of endo-mannanase, beta-mannosidase, alpha-galactosidase, and acetylesterase.
  • Pectin is another complex polysaccharide which mainly consists of homogalacturonan and rhamnogalacturonan backbone structures.
  • the alpha-1,4-linked galacturonic acids which occur in the backbone of these polysaccharide structures are often methylated at the C6 carboxyl moiety and can in addition be acetylated.
  • Esterases such as pectin methylesterase and rhamnogalacturonan esterase modify the functionality of such polysaccharides.
  • In concert with endo- and exo-polygalacturonases, rhamnogalacturonan hydrolases, pectin and pectate lyases, rhamnosidases and arabinofuranosidases to remove side chains from the backbone, they can break down these polysaccharides.
  • Filamentous fungi are a rich source of polysaccharide degrading enzymes as well as a variety of other enzymes such as, but not limited to, acetylxylan esterases, feruloyl esterases, glucuronyl esterases and pectinesterases, which are all useful in the enzymatic hydrolysis or modification of major polysaccharides. It is desirable to produce inexpensive enzymes and enzyme mixtures that efficiently degrade or modify such polysaccharides for use in a variety of agricultural and industrial applications.
  • the present invention relates generally to proteins that play a role in the degradation and/or modification of hemicelluloses, pectins, waxes and other lipid substances and nucleic acids encoding the same.
  • the present invention relates to enzymes isolated from a filamentous fungal strain denoted herein as C1 (Accession No. VKM F-3500-D), nucleic acids encoding the enzymes, and methods of producing and using the enzymes.
  • the invention also provides compositions that include at least one of the enzymes described herein for uses including, but not limited to, the hydrolysis of lignocellulose.
  • the invention stems, in part, from the discovery of a variety of novel esterases produced by the C1 fungus that exhibit activity toward hemicelluloses, pectins and other components of biomass such as phytate, lipids, waxes and the like.
  • the present invention also provides methods and compositions for the conversion of plant biomass to fermentable sugars that can in turn be converted to useful products.
  • Such products may include, without limitation, metabolites, bioplastics, biopolymers and biofuels.
  • the methods include methods for degrading lignocellulosic material using enzyme mixtures to liberate sugars.
  • the compositions of the invention include enzyme combinations that break down lignocellulose.
  • "biomass" or "lignocellulosic material" includes materials containing cellulose and/or hemicellulose. Generally, these materials also contain pectin, lignin, protein, carbohydrates (such as starch and sugar), lipids, and ash. Lignocellulose is generally found, for example, in the stems, leaves, hulls, husks, and cobs of plants or leaves, branches, and wood of trees.
  • Fermentable sugars refers to simple sugars, such as glucose, xylose, arabinose, galactose, mannose, rhamnose, sucrose and fructose.
  • Biomass can include virgin biomass and/or non-virgin biomass such as agricultural biomass, commercial organics, construction and demolition debris, municipal solid waste, waste paper and yard waste.
  • examples of biomass include trees, shrubs and grasses, wheat, wheat straw, sugar cane bagasse, sugar beet, soybean, corn, corn husks, corn kernel including fiber from kernels, products and by-products from milling of grains such as corn, tobacco, wheat and barley (including wet milling and dry milling).
  • the biomass can also be, but is not limited to, herbaceous material, agricultural residues, forestry residues, and pulp and paper mill residues.
  • Agricultural biomass includes branches, bushes, canes, corn and corn husks, energy crops, algae, fruits, flowers, grains, grasses, herbaceous crops, leaves, bark, needles, logs, roots, saplings, short rotation woody crops, shrubs, switch grasses, trees, vegetables, fruit peels, vines, sugar beet pulp, wheat middlings, oat hulls, peat moss, mushroom compost and hard and soft woods (not including woods with deleterious materials).
  • agricultural biomass includes organic waste materials generated from agricultural processes including farming and forestry activities, specifically including forestry wood waste. Agricultural biomass may be any of the aforestated singularly or in any combination or mixture thereof.
  • Energy crops are fast-growing crops that are grown for the specific purpose of producing energy, including without limitation, biofuels, from all or part of the plant.
  • Energy crops can include crops that are grown (or are designed to grow) for their increased cellulose, xylose and sugar contents. Examples of such plants include, without limitation, switchgrass, willow and poplar.
  • Energy crops may also include algae, for example, designer algae that are genetically engineered for enhanced production of hydrogen, alcohols, and oils, which can be further processed into diesel and jet fuels, as well as other bio-based products.
  • biomass high in starch, sugar, or protein such as corn, grains, fruits and vegetables is usually consumed as food.
  • biomass high in cellulose, hemicellulose and lignin is not readily digestible and is primarily utilized for wood and paper products, animal feed, or fuel, or is typically disposed of.
  • the substrate is of high lignocellulose content, including distillers' dried grains, corn stover, corn cobs, rice straw, wheat straw, hay, sugarcane bagasse, sugar cane pulp, citrus peels and other agricultural biomass, switchgrass, forestry wastes, poplar wood chips, pine wood chips, sawdust, yard waste, and the like, including any combination thereof.
  • the lignocellulosic material is distillers' dried grains (DDG).
  • DDG also known as dried distiller's grain, or distiller's spent grain
  • the lignocellulosic material can also be distiller's dried grain with soluble material recycled back (DDGS). While reference will be made herein to DDG for convenience and simplicity, it should be understood that both DDG and DDGS are contemplated as desired lignocellulosic materials. These are largely considered to be waste products and can be obtained after the fermentation of the starch derived from any of a number of grains, including corn, wheat, barley, oats, rice and rye. In one embodiment the DDG is derived from corn.
  • distiller's grains do not necessarily have to be dried.
  • although the grains are currently normally dried, water and enzymes are added to the DDG substrate in the present invention. If the saccharification were done on site, the drying step could be eliminated and enzymes could be added to the distiller's grains without drying.
  • Due in part to the many components that comprise biomass and lignocellulosic materials, enzymes or a mixture of enzymes capable of degrading cellulose and hemicelluloses such as xylan are needed to achieve saccharification.
  • the present invention includes enzymes with esterase activities or compositions thereof with, for example, cellobiohydrolase, endoglucanase, xylanase, β-glucosidase, β-xylosidase and other hemicellulase activities.
  • Fermentable sugars can be converted to useful value-added fermentation products, non-limiting examples of which include amino acids, vitamins, pharmaceuticals, animal feed supplements, specialty chemicals, chemical feedstocks, plastics, solvents, fuels, or other organic polymers, lactic acid, and ethanol, including fuel ethanol.
  • Specific value-added products that may be produced by the methods of the invention include, but are not limited to, biofuels (including ethanol and butanol); lactic acid; plastics; specialty chemicals; organic acids, including citric acid, succinic acid, itaconic acid and maleic acid; solvents; animal feed supplements; pharmaceuticals; vitamins; amino acids, such as lysine, methionine, tryptophan, threonine, and aspartic acid; industrial enzymes, such as proteases, cellulases, amylases, glucanases, lactases, xylanases, arabinanases, lipases, lyases, oxidoreductases, and transferases; and chemical feedstocks.
  • the enzymes of the present invention may also be used for stone washing cellulosic fabrics such as cotton (e.g., denim), linen, hemp, ramie, cupro, lyocell, newcell, rayon and the like. See, for example, U.S. Patent No. 6,015,707. Enzymes and compositions of the present invention may also be used in the treatment of paper pulp (e.g., for improving the drainage or for de-inking of recycled paper) or for the treatment of wastewater streams (e.g., to improve hydrolysis of waste material containing cellulose, hemicellulose and pectins).
  • the enzymes of the present invention may also be used to release the contents of a cell.
  • contacting or mixing the cells with the enzymes and/or compositions of the present invention will degrade the cell walls, preferentially those of plant origin, resulting in cell lysis and release of the cellular contents.
  • the middle lamella will be degraded leading to separation of cells, a process called liquefaction.
  • Alcohols and oils released from plants can be further processed to produce diesel, jet fuels, as well as other economically important bio-products such as flavourants or fragrances.
  • the enzymes of the present invention may be used alone, or in combination with other enzymes, chemicals or biological materials.
  • the enzymes of the present invention may be used for in vitro applications in which the enzymes or mixtures thereof are added to or mixed with the appropriate substrates to catalyze the desired reactions. Additionally, the enzymes of the present invention may be used for in vivo applications in which nucleic acid molecules encoding the enzymes are introduced into cells and are expressed therein to produce the enzymes and catalyze the desired reactions within the cells.
  • the present invention includes proteins isolated from, or derived from the knowledge of enzymes from, a fungus such as Myceliophthora thermophila or a mutant or other derivative thereof, and more particularly, from the fungal strain denoted herein as C1 (Accession No. VKM F-3500-D).
  • M. thermophila has previously appeared in patent applications and in the literature as Chrysosporium lucknowense or Sporotrichum thermophile.
  • the proteins of the invention possess enzymatic activity.
  • According to U.S. Patent No. 6,015,707 or U.S. Patent No. 6,573,086, a strain called C1 (Accession No. VKM F-3500-D) was isolated from samples of forest alkaline soil from Sola Lake, Far East of the Russian Federation. This strain was deposited at the All-Russian Collection of Microorganisms of the Russian Academy of Sciences (VKM), Bakhrushina St. 8, Moscow, Russia, 113184, under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure on August 29, 1996, as Chrysosporium lucknowense Garg 27K, VKM-F 3500 D. Various mutant strains of M. thermophila C1 (previously C.
  • strain NG7C-19 accession No. VKM F-3633 D
  • strain UV18-25 accession No. VKM F-3631 D
  • strain W1L accession No. CBS122189
  • strain W1L#100L accession No. CBS 122190
  • Strain C1 was initially classified as Chrysosporium lucknowense based on morphological and growth characteristics of the microorganism, as discussed in detail in U.S. Patent No. 6,015,707, U.S. Patent No. 6,573,086 and patent PCT/NL2010/000045. The C1 strain was subsequently reclassified as Myceliophthora thermophila based on genetic tests. C. lucknowense has also appeared in the literature as Sporotrichum thermophile.
  • a protein of the invention comprises, consists essentially of, or consists of an amino acid sequence selected from SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66
  • the present invention also includes homologues or variants of any of the above sequences, including fragments and sequences having a given identity to any of the above sequences, wherein the homologue, variant, or fragment has at least one biological activity of the wild-type protein, as described herein.
  • In general, the proteins disclosed herein have hydrolytic enzymatic activity and are classified as esterases, (phospho)lipases, sulphatases and phosphatases. Of particular interest are esterases with the ability to hydrolyze esters in carbohydrate-containing materials. A review of enzymes involved in the degradation of polysaccharides can be found in de Vries et al., Microbiol. Mol. Biol. Rev. 65:497-522 (2001).
  • esterase refers to any protein that possesses acetylxylan esterase, ferulic acid esterase, coumaryl esterase, pectin methyl esterase, glucuronyl esterase, rhamnogalacturonan acetyl esterase or acetyl(gluco)mannan esterase activity.
  • Other proteins of interest may possess lipase, phospholipase, cutinase, phytase or sulphatase activity.
  • carbohydrase refers to any protein that catalyzes the hydrolysis of carbohydrates. Endoglucanases, cellobiohydrolases, β-glucosidases, α-glucosidases, xylanases, β-xylosidases, galactanases, α-galactosidases, β-galactosidases, α-amylases, glucoamylases, endo-arabinases, arabinofuranosidases, mannanases, β-mannosidases, pectinases, acetyl xylan esterases, acetyl mannan esterases, ferulic acid esterases, coumaric acid esterases, pectin methyl esterases, and chitosanases are examples of glycosidases.
  • Hemicellulase refers to a protein that catalyzes the hydrolysis of hemicellulose, such as that found in lignocellulosic materials.
  • Hemicelluloses are complex polymers, and their composition often varies widely from organism to organism, and from one tissue type to another.
  • Hemicelluloses include a variety of compounds, such as xylans, arabinoxylans, xyloglucans, mannans, glucomannans, and galactomannans.
  • Hemicellulose can also contain glucan, which is a general term for beta-linked glucose residues.
  • a main component of hemicellulose is beta-1,4-linked xylose, a five carbon sugar.
  • this xylose is often branched as beta-1,3 linkages or beta-1,2 linkages, and can be substituted with linkages to arabinose, galactose, mannose, glucuronic acid, or by esterification to acetic acid.
  • the composition, nature of substitution, and degree of branching of hemicellulose is very different in dicotyledonous plants (dicots, i.e., plants whose seeds have two cotyledons or seed leaves such as lima beans, peanuts, almonds, peas, kidney beans) as compared to monocotyledonous plants (monocots; i.e., plants having a single cotyledon or seed leaf such as corn, wheat, rice, grasses, barley).
  • in dicots, hemicellulose is comprised mainly of xyloglucans that are 1,4-beta-linked glucose chains with 1,6-alpha-linked xylosyl side chains.
  • in monocots, heteroxylans are primarily comprised of 1,4-beta-linked xylose backbone polymers with 1,2- or 1,3-beta linkages to arabinose, galactose and mannose as well as xylose modified by ester-linked acetic acids.
  • branched beta glucans comprised of 1,3- and 1,4-beta-linked glucosyl chains.
  • Hemicellulolytic enzymes, i.e. hemicellulases, include both endo-acting and exo-acting enzymes, such as xylanases and β-xylosidases.
  • Hemicellulases also include the accessory enzymes, such as alpha-glucuronidases, acetylesterases, glucuronyl esterases, ferulic acid esterases, and coumaric acid esterases.
  • xylanases and acetyl xylan esterases cleave the xylan and acetyl side chains of xylan and the remaining xylo-oligomers are unsubstituted and can thus be hydrolysed with β-xylosidase only.
  • xylanases, acetylesterases and β-xylosidases are examples of hemicellulases.
  • the other accessory enzymes mentioned remove glucuronic acid, ferulic acid and coumaric acid which also form obstacles for complete degradation of the hemicellulose structure.
  • Xylanase specifically refers to an enzyme that hydrolyzes the β-1,4 bond in the xylan backbone, producing short xylooligosaccharides.
  • β-Mannanase or "endo-1,4-β-mannosidase" refers to a protein that hydrolyzes mannan-based hemicelluloses (mannan, glucomannan, galactomannan) and produces short β-1,4-mannooligosaccharides.
  • Mannan endo-1,6-α-mannosidase refers to a protein that hydrolyzes 1,6-α-mannosidic linkages in unbranched 1,6-mannans.
  • β-Mannosidase (β-1,4-mannoside mannohydrolase; EC 3.2.1.25) refers to a protein that catalyzes the removal of β-D-mannose residues from the nonreducing ends of oligosaccharides.
  • Galactanase refers to proteins that catalyze the hydrolysis of endo-1,6-β-D-galactosidic or endo-1,4-β-D-galactosidic linkages in arabinogalactans.
  • α-Arabinofuranosidase refers to a protein that hydrolyzes arabinofuranosyl-containing hemicelluloses or pectins. Some of these enzymes remove arabinofuranoside residues from O-2 or O-3 single substituted xylose residues, as well as from O-2 and/or O-3 double substituted xylose residues. Some of these enzymes remove arabinose residues from arabinan oligomers.
  • Endo-arabinase refers to a protein that catalyzes the hydrolysis of 1,5-α-arabinofuranosidic linkages in 1,5-arabinans.
  • Exo-arabinase refers to a protein that catalyzes the hydrolysis of 1,5-α-linkages in 1,5-arabinans or 1,5-α-L-arabino-oligosaccharides, releasing mainly arabinobiose, although a small amount of arabinotriose can also be liberated.
  • β-xylosidase refers to a protein that hydrolyzes short 1,4-β-D-xylooligomers into xylose.
  • α-Glucuronidase refers to a protein that hydrolyzes the 1,2-α-glucuronic acid linkages in hemicelluloses.
  • Acetyl xylan esterase refers to a protein that catalyzes the removal of the acetyl groups from xylose residues.
  • Acetyl mannan esterase refers to a protein that catalyzes the removal of the acetyl groups from mannose residues.
  • ferulic esterase or "ferulic acid esterase" refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and ferulic acid.
  • Coumaric acid esterase refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and coumaric acid.
  • Glucuronyl esterase refers to a protein that hydrolyzes the ester bond between glucuronic acid and lignin.
  • Acetyl xylan esterases, glucuronyl esterases, ferulic acid esterases and coumaric acid esterases are examples of carbohydrate esterases.
  • Pectin refers to polysaccharides which are composed of homogalacturonan and rhamnogalacturonan.
  • Homogalacturonan is composed of alpha-1,4-linked galacturonic acid residues which may be methyl esterified at the C6 carboxylate function and/or acetylated at the C2 or C3 position.
  • Rhamnogalacturonan is composed of alternating α-1,4-rhamnose and α-1,2-linked galacturonic acid, with side chains linked 1,4 to rhamnose.
  • the side chains include Type I galactan, which is β-1,4-linked galactose with α-1,3-linked arabinose substituents; Type II galactan, which is β-1,3-1,6-linked galactoses (very branched) with arabinose substituents; and arabinan, which is α-1,5-linked arabinose with α-1,3-linked arabinose branches.
  • the galacturonic acid substituents may be acetylated and/or methylated.
  • Pectinolytic enzymes include both endo-acting and exo-acting enzymes, such as polygalacturonases, pectin and pectate lyases, arabinofuranosidases, rhamnosidases and several esterases like pectin methyl esterases. These enzymes, and some others such as ferulic acid esterases, are suitable for use in multi-enzyme compositions to degrade pectin materials.
  • Pectin methyl esterase refers to a protein that catalyzes the removal of the methyl groups ester-linked to the carboxylic acid residues in galacturonic acid.
  • Rhamnogalacturonan acetylesterase refers to a protein that catalyzes the removal of the acetyl groups ester-linked to the highly branched rhamnogalacturonan (hairy) regions of pectin.
  • Pectin acetyl esterase refers to a protein that catalyzes the removal of the acetyl groups ester-linked to the homogalacturonan (smooth) regions of pectin.
  • Esterases active on pectin are further examples of carbohydrate esterases.
  • Polygalacturonase refers to a protein that catalyzes the hydrolysis of alpha-1,4-linked galacturonic acid residues from homogalacturonan, thus converting polygalacturonides to galacturonic acid or galacturonic acid oligosaccharides.
  • Rhamnogalacturonan hydrolase refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin to galacturonic acid or rhamnogalacturonan oligosaccharides.
  • Pectate lyase refers to a protein that catalyzes the cleavage of 1,4-α-D-galacturonan by beta-elimination acting on polymeric and/or oligosaccharide substrates.
  • Pectin lyase refers to a protein that catalyzes the cleavage of 1,4-α-D-galacturonan by beta-elimination acting on polymeric and/or oligosaccharide substrates. The action of the enzyme is not hindered by acetyl esters.
  • Rhamnogalacturonan lyase refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin via a β-elimination mechanism (see, e.g., Pages et al., J. Bacteriol. 185:4727-4733 (2003)).
  • Glycosidases glycoside hydrolases; GH
  • GH glycoside hydrolases
  • Glycosidases such as the proteins of the present invention may be assigned to families on the basis of sequence similarities, and there are now over 100 different such families defined (see the CAZy (Carbohydrate Active EnZymes database) website, maintained by the Architecture et Fonction des Macromolécules Biologiques of the Centre National de la Recherche Scientifique, which describes the families of structurally-related catalytic and carbohydrate-binding modules (or functional domains) of enzymes that degrade, modify, or create glycosidic bonds; Coutinho, P.M. & Henrissat, B. (1999) Carbohydrate-active enzymes: an integrated database approach. In "Recent Advances in Carbohydrate Bioengineering", H.J. Gilbert, G. Davies, B.
  • sequence homology may be used to identify particular domains within proteins, such as cellulose binding modules (CBMs; also known as cellulose binding domains (CBDs)).
  • CBMs cellulose binding modules
  • CBDs cellulose binding domains
  • An enzyme assigned to a particular CAZy family may exhibit one or more of the enzymatic activities or substrate specificities associated with the CAZy family. In other embodiments, the enzymes of the present invention may exhibit one or more of the enzyme activities discussed above.
  • Proteins of the present invention may also include homologues, variants and fragments of the proteins disclosed herein.
  • the protein fragments include, but are not limited to, fragments comprising a catalytic domain (CD) and/or a cellulose-binding domain (also known as a cellulose or carbohydrate binding module (CBM); both are referred to herein as CBM).
  • CD catalytic domain
  • CBM carbohydrate binding module
  • the identity and location of domains within proteins of the present invention are disclosed in detail below.
  • the present invention encompasses all combinations of the disclosed domains.
  • a protein fragment may comprise a CD of a protein but not a CBM of the protein or a CBM of a protein but not a CD.
  • domains from different proteins may be combined.
  • Protein fragments comprising a CD, CBM or combinations thereof for each protein disclosed herein can be readily produced using standard techniques known in the art.
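The domain combinations described above (a CD without a CBM, a CBM without a CD, or a CD paired with a CBM, possibly taken from different proteins) can be enumerated with a short sketch. The protein names and domain labels below are hypothetical placeholders, not sequences from this disclosure, and the function only lists combinations; it says nothing about whether a given fragment would be active.

```python
from itertools import product

def domain_combinations(domains_by_protein):
    """List fragment designs as (CD, CBM) pairs; None marks an absent domain.

    domains_by_protein maps a protein name to a dict with optional
    "CD" and "CBM" entries (hypothetical labels used for illustration only).
    """
    cds = [(name, d["CD"]) for name, d in domains_by_protein.items() if d.get("CD")]
    cbms = [(name, d["CBM"]) for name, d in domains_by_protein.items() if d.get("CBM")]

    combos = [(cd, None) for cd in cds]        # CD without a CBM
    combos += [(None, cbm) for cbm in cbms]    # CBM without a CD
    combos += list(product(cds, cbms))         # CD plus CBM, same or different protein
    return combos

# Hypothetical input: protein_A has both domains, protein_B has only a CD.
example = {
    "protein_A": {"CD": "CD_A", "CBM": "CBM_A"},
    "protein_B": {"CD": "CD_B", "CBM": None},
}
combinations = domain_combinations(example)
for combo in combinations:
    print(combo)
```

Pairing every CD with every CBM, regardless of source protein, reflects the statement that domains from different proteins may be combined.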
  • a protein fragment comprises a domain of a protein that has at least one biological activity of the full-length protein. Homologues or variants of proteins of the invention that have at least one biological activity of the full-length protein are described in detail below.
  • biological activity of a protein refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vitro or in vivo.
  • a protein fragment comprises a domain of a protein that has the catalytic activity of the full-length enzyme. Specific biological activities of the proteins of the invention, and structures within the proteins that are responsible for the activities, are described below.
  • Esterases represent a category of various enzymes including acetylxylan esterases, ferulic acid esterases, coumaryl esterases, pectin methyl esterases, glucuronyl esterases, rhamnogalacturonan acetyl esterases, acetyl(gluco)mannan esterases, lipases, phospholipases, cutinases, phytases or sulphatases that catalyze the hydrolysis and synthesis of ester bonds in compounds.
  • sequence ES 1, 193_g is encoded by the nucleotides of SEQ ID No: 1, which encodes the amino acid sequence of SEQ ID No: 2. This enzyme is believed to have feruloyl esterase activity.
  • sequence ES 2, 1176_g is encoded by the nucleotides of SEQ ID No: 3, which encodes the amino acid sequence of SEQ ID No: 4. This enzyme is believed to have carboxylesterase activity.
  • sequence ES 3, 1838_g is encoded by the nucleotides of SEQ ID No: 5, which encodes the amino acid sequence of SEQ ID No: 6. This enzyme is believed to have pectinesterase activity.
  • sequence ES 4, 2778_g is encoded by the nucleotides of SEQ ID No: 7, which encodes the amino acid sequence of SEQ ID No: 8. This enzyme is believed to have phospholipase A2 activity.
  • sequence ES 5, 2819_g is encoded by the nucleotides of SEQ ID No: 9, which encodes the amino acid sequence of SEQ ID No: 10. This enzyme is believed to have epoxide hydrolase activity.
  • sequence ES 6, 3130_g is encoded by the nucleotides of SEQ ID No: 11, which encodes the amino acid sequence of SEQ ID No: 12. This enzyme is believed to have acetylxylan esterase activity.
  • sequence ES 7, 3190_g is encoded by the nucleotides of SEQ ID No: 13, which encodes the amino acid sequence of SEQ ID No: 14. This enzyme is believed to have pectinesterase activity.
  • sequence ES 8, 3597_g is encoded by the nucleotides of SEQ ID No: 15, which encodes the amino acid sequence of SEQ ID No: 16. This enzyme is believed to have sterol esterase activity.
  • sequence ES 9, 3629_g is encoded by the nucleotides of SEQ ID No: 17, which encodes the amino acid sequence of SEQ ID No: 18. This enzyme is believed to have phospholipase A2 activity.
  • sequence ES 10, 3696_g is encoded by the nucleotides of SEQ ID No: 19, which encodes the amino acid sequence of SEQ ID No: 20. This enzyme is believed to have carboxylesterase activity.
  • sequence ES 11, 5193_g is encoded by the nucleotides of SEQ ID No: 21, which encodes the amino acid sequence of SEQ ID No: 22. This enzyme is believed to have carboxylesterase activity.
  • sequence ES 12, 6264_g is encoded by the nucleotides of SEQ ID No: 23, which encodes the amino acid sequence of SEQ ID No: 24. This enzyme is believed to have acetylxylan esterase activity.
  • sequence ES 13, 6416_g is encoded by the nucleotides of SEQ ID No: 25, which encodes the amino acid sequence of SEQ ID No: 26. This enzyme is believed to have acetylxylan esterase activity.
  • sequence ES 14, 7039_g is encoded by the nucleotides of SEQ ID No: 27, which encodes the amino acid sequence of SEQ ID No: 28. This enzyme is believed to have feruloyl esterase activity.
  • sequence ES 15, 8052_g is encoded by the nucleotides of SEQ ID No: 29, which encodes the amino acid sequence of SEQ ID No: 30. This enzyme is believed to have carboxylesterase activity.
  • sequence ES 16, 8917_g is encoded by the nucleotides of SEQ ID No: 31, which encodes the amino acid sequence of SEQ ID No: 32. This enzyme is believed to have cutinase activity.
  • sequence ES 17, 373_g is encoded by the nucleotides of SEQ ID No: 33, which encodes the amino acid sequence of SEQ ID No: 34. This enzyme is believed to have phospholipase activity.
  • sequence ES 18, 611_g is encoded by the nucleotides of SEQ ID No: 35, which encodes the amino acid sequence of SEQ ID No: 36. This enzyme is believed to have lipase activity.
  • sequence ES 19, 1024_g is encoded by the nucleotides of SEQ ID No: 37, which encodes the amino acid sequence of SEQ ID No: 38. This enzyme is believed to have lipase activity.
  • sequence ES 20, 1257_g is encoded by the nucleotides of SEQ ID No: 39, which encodes the amino acid sequence of SEQ ID No: 40. This enzyme is believed to have lipase activity.
  • sequence ES 21, 1406_g is encoded by the nucleotides of SEQ ID No: 41, which encodes the amino acid sequence of SEQ ID No: 42. This enzyme is believed to have phospholipase C activity.
  • sequence ES 22, 1407_g is encoded by the nucleotides of SEQ ID No: 43, which encodes the amino acid sequence of SEQ ID No: 44. This enzyme is believed to have phospholipase C activity.
  • sequence ES 23, 1416_g is encoded by the nucleotides of SEQ ID No: 45, which encodes the amino acid sequence of SEQ ID No: 46. This enzyme is believed to have lipase activity.
  • sequence ES 24, 341 _g is encoded by the nucleotides of SEQ ID No: 47, which encodes the amino acid sequence of SEQ ID No: 48. This enzyme is believed to have phospholipase C activity.
  • sequence ES 25, 4299_g is encoded by the nucleotides of SEQ ID No: 49, which encodes the amino acid sequence of SEQ ID No: 50. This enzyme is believed to have phospholipase activity.
  • sequence ES 26, 4588_g is encoded by the nucleotides of SEQ ID No: 51, which encodes the amino acid sequence of SEQ ID No: 52. This enzyme is believed to have lipase activity.
  • sequence ES 27, 5125_g is encoded by the nucleotides of SEQ ID No: 53, which encodes the amino acid sequence of SEQ ID No: 54. This enzyme is believed to have phospholipase C activity.
  • sequence ES 28, 5310_g is encoded by the nucleotides of SEQ ID No: 55, which encodes the amino acid sequence of SEQ ID No: 56. This enzyme is believed to have lipase activity.
  • sequence ES 29, 5865_g is encoded by the nucleotides of SEQ ID No: 57, which encodes the amino acid sequence of SEQ ID No: 58. This enzyme is believed to have phospholipase activity.
  • sequence ES 30, 5916_g is encoded by the nucleotides of SEQ ID No: 59, which encodes the amino acid sequence of SEQ ID No: 60. This enzyme is believed to have lysophospholipase activity.
  • sequence ES 31, 6098_g is encoded by the nucleotides of SEQ ID No: 61, which encodes the amino acid sequence of SEQ ID No: 62. This enzyme is believed to have phospholipase D activity.
  • sequence ES 32, 7468_g is encoded by the nucleotides of SEQ ID No: 63, which encodes the amino acid sequence of SEQ ID No: 64. This enzyme is believed to have lipase activity.
  • sequence ES 33, scaffold00131.pathl.gene867 is encoded by the nucleotides of SEQ ID No: 65, which encodes the amino acid sequence of SEQ ID No: 66. This enzyme is believed to have lipase activity.
  • sequence ES 34, 8020_g is encoded by the nucleotides of SEQ ID No: 67, which encodes the amino acid sequence of SEQ ID No: 68. This enzyme is believed to have phospholipase C activity.
  • sequence ES 35, 8606_g is encoded by the nucleotides of SEQ ID No: 69, which encodes the amino acid sequence of SEQ ID No: 70. This enzyme is believed to have lipase activity.
  • sequence ES 36, 8708_g is encoded by the nucleotides of SEQ ID No: 71, which encodes the amino acid sequence of SEQ ID No: 72. This enzyme is believed to have lipase activity.
  • sequence ES 37, 9212_g is encoded by the nucleotides of SEQ ID No: 73, which encodes the amino acid sequence of SEQ ID No: 74. This enzyme is believed to have lipase activity.
  • sequence ES 38, scaffold00016.G6S4 is encoded by the nucleotides of SEQ ID No: 75, which encodes the amino acid sequence of SEQ ID No: 76. This enzyme is believed to have acetylxylan esterase activity.
  • sequence ES 39, scaffold00016.G247 is encoded by the nucleotides of SEQ ID No: 77, which encodes the amino acid sequence of SEQ ID No: 78. This enzyme is believed to have carboxylesterase activity.
  • sequence ES 40, scaffold00050.G653 is encoded by the nucleotides of SEQ ID No: 79, which encodes the amino acid sequence of SEQ ID No: 80. This enzyme is believed to have feruloyl esterase activity.
  • sequence ES 41, scaffold00071.G701 is encoded by the nucleotides of SEQ ID No: 81, which encodes the amino acid sequence of SEQ ID No: 82. This enzyme is believed to have feruloyl esterase activity.
  • sequence ES 42, scaffold00050.G259 is encoded by the nucleotides of SEQ ID No: 83, which encodes the amino acid sequence of SEQ ID No: 84. This enzyme is believed to have lipase activity.
  • sequence ES 43, scaffold00092.G669 is encoded by the nucleotides of SEQ ID No: 85, which encodes the amino acid sequence of SEQ ID No: 86. This enzyme is believed to have lipase activity.
  • sequence ES 44, scaffold00016.pathl.gene410 is encoded by the nucleotides of SEQ ID No: 87, which encodes the amino acid sequence of SEQ ID No: 88. This enzyme is believed to have phospholipase C activity.
  • sequence ES 45, scaffold00050.pathl.gene302 is encoded by the nucleotides of SEQ ID No: 89, which encodes the amino acid sequence of SEQ ID No: 90. This enzyme is believed to have feruloyl esterase activity.
  • sequence ES 46, scaffold00050.pathl.gene704 is encoded by the nucleotides of SEQ ID No: 91, which encodes the amino acid sequence of SEQ ID No: 92. This enzyme is believed to have feruloyl esterase activity.
  • sequence ES 47, scaffold00031.pathl.gene803 is encoded by the nucleotides of SEQ ID No: 93, which encodes the amino acid sequence of SEQ ID No: 94. This enzyme is believed to have esterase activity.
  • sequence ES 48, scaffold00031.pathl.gene9166 is encoded by the nucleotides of SEQ ID No: 95, which encodes the amino acid sequence of SEQ ID No: 96. This enzyme is believed to have phospholipase D activity.
  • sequence ES 49, 696_g is encoded by the nucleotides of SEQ ID No: 97, which encodes the amino acid sequence of SEQ ID No: 98. This enzyme is believed to have Lipase activity.
  • sequence ES 50, 1442_g is encoded by the nucleotides of SEQ ID No: 99, which encodes the amino acid sequence of SEQ ID No: 100. This enzyme is believed to have esterase activity.
  • sequence ES 51, 1456_g is encoded by the nucleotides of SEQ ID No: 101, which encodes the amino acid sequence of SEQ ID No: 102. This enzyme is believed to have Lipase activity.
  • sequence ES 52, 1940_g is encoded by the nucleotides of SEQ ID No: 103, which encodes the amino acid sequence of SEQ ID No: 104. This enzyme is believed to have Lipase activity.
  • sequence ES 53, 3461_g is encoded by the nucleotides of SEQ ID No: 105, which encodes the amino acid sequence of SEQ ID No: 106. This enzyme is believed to have sulfuric ester hydrolase activity.
  • sequence ES 54, 3637_g is encoded by the nucleotides of SEQ ID No: 107, which encodes the amino acid sequence of SEQ ID No: 108. This enzyme is believed to have Lipase activity.
  • sequence ES 55, 3638_g is encoded by the nucleotides of SEQ ID No: 109, which encodes the amino acid sequence of SEQ ID No: 110. This enzyme is believed to have Lipase activity.
  • sequence ES 56, 4267_g is encoded by the nucleotides of SEQ ID No: 111, which encodes the amino acid sequence of SEQ ID No: 112. This enzyme is believed to have Lipase activity.
  • sequence ES 57, 4984_g is encoded by the nucleotides of SEQ ID No: 113, which encodes the amino acid sequence of SEQ ID No: 114. This enzyme is believed to have Lipase activity.
  • sequence ES 58, 5066_g is encoded by the nucleotides of SEQ ID No: 115, which encodes the amino acid sequence of SEQ ID No: 116. This enzyme is believed to have Lipase activity.
  • sequence ES 59, 5391_g is encoded by the nucleotides of SEQ ID No: 117, which encodes the amino acid sequence of SEQ ID No: 118. This enzyme is believed to have Lipase activity.
  • sequence ES 60, 5406_g is encoded by the nucleotides of SEQ ID No: 119, which encodes the amino acid sequence of SEQ ID No: 120. This enzyme is believed to have Lipase activity.
  • sequence ES 61, 5765_g is encoded by the nucleotides of SEQ ID No: 121, which encodes the amino acid sequence of SEQ ID No: 122. This enzyme is believed to have Lipase activity.
  • sequence ES 62, 7449_g is encoded by the nucleotides of SEQ ID No: 123, which encodes the amino acid sequence of SEQ ID No: 124. This enzyme is believed to have esterase activity.
  • sequence ES 63, 7692_g is encoded by the nucleotides of SEQ ID No: 125, which encodes the amino acid sequence of SEQ ID No: 126. This enzyme is believed to have Lipase activity.
  • sequence ES 64, 8351_g is encoded by the nucleotides of SEQ ID No: 127, which encodes the amino acid sequence of SEQ ID No: 128. This enzyme is believed to have Lipase activity.
  • sequence ES 65, 8598_g is encoded by the nucleotides of SEQ ID No: 129, which encodes the amino acid sequence of SEQ ID No: 130. This enzyme is believed to have esterase activity.
  • sequence ES 66, 8626_g is encoded by the nucleotides of SEQ ID No: 131, which encodes the amino acid sequence of SEQ ID No: 132. This enzyme is believed to have Lipase activity.
  • sequence ES 67, 8665_g is encoded by the nucleotides of SEQ ID No: 133, which encodes the amino acid sequence of SEQ ID No: 134. This enzyme is believed to have Lipase activity.
  • sequence ES 68, 8742_g is encoded by the nucleotides of SEQ ID No: 135, which encodes the amino acid sequence of SEQ ID No: 136. This enzyme is believed to have Lipase activity.
  • sequence ES 69, 9705_g is encoded by the nucleotides of SEQ ID No: 137, which encodes the amino acid sequence of SEQ ID No: 138. This enzyme is believed to have Lipase activity.
  • sequence ES 70, 1839_g is encoded by the nucleotides of SEQ ID No: 139, which encodes the amino acid sequence of SEQ ID No: 140. This enzyme is believed to have Lipase activity.
  • sequence ES 71, 937_g is encoded by the nucleotides of SEQ ID No: 141, which encodes the amino acid sequence of SEQ ID No: 142. This enzyme is believed to have esterase/lipase activity.
  • sequence ES 72, 1237_g is encoded by the nucleotides of SEQ ID No: 143, which encodes the amino acid sequence of SEQ ID No: 144. This enzyme is believed to have lipase activity.
  • sequence ES 73, 2253_g is encoded by the nucleotides of SEQ ID No: 145, which encodes the amino acid sequence of SEQ ID No: 146. This enzyme is believed to have esterase/lipase activity.
  • sequence ES 74, 3232_g is encoded by the nucleotides of SEQ ID No: 147, which encodes the amino acid sequence of SEQ ID No: 148. This enzyme is believed to have esterase activity.
  • sequence ES 75, 4735_g is encoded by the nucleotides of SEQ ID No: 149, which encodes the amino acid sequence of SEQ ID No: 150. This enzyme is believed to have esterase/lipase activity.
  • sequence ES 76, 4736_g is encoded by the nucleotides of SEQ ID No: 151, which encodes the amino acid sequence of SEQ ID No: 152. This enzyme is believed to have esterase/lipase activity.
  • sequence ES 77, 5568_g is encoded by the nucleotides of SEQ ID No: 153, which encodes the amino acid sequence of SEQ ID No: 154. This enzyme is believed to have phospholipase D activity.
  • sequence ES 78, 6656_g is encoded by the nucleotides of SEQ ID No: 155, which encodes the amino acid sequence of SEQ ID No: 156. This enzyme is believed to have esterase activity.
  • sequence ES 79, 6881_g is encoded by the nucleotides of SEQ ID No: 157, which encodes the amino acid sequence of SEQ ID No: 158. This enzyme is believed to have esterase activity.
  • sequence ES 80, 7451_g is encoded by the nucleotides of SEQ ID No: 159, which encodes the amino acid sequence of SEQ ID No: 160. This enzyme is believed to have esterase activity.
  • sequence ES 81, 7776_g is encoded by the nucleotides of SEQ ID No: 161, which encodes the amino acid sequence of SEQ ID No: 162. This enzyme is believed to have esterase activity.
  • sequence ES 82, 8080_g is encoded by the nucleotides of SEQ ID No: 163, which encodes the amino acid sequence of SEQ ID No: 164. This enzyme is believed to have esterase activity.
  • sequence ES 83, 8408_g is encoded by the nucleotides of SEQ ID No: 165, which encodes the amino acid sequence of SEQ ID No: 166. This enzyme is believed to have esterase activity.
  • sequence ES 84, 9687_g is encoded by the nucleotides of SEQ ID No: 167, which encodes the amino acid sequence of SEQ ID No: 168. This enzyme is believed to have esterase/lipase activity.
  • sequence ES 85, 9709_g is encoded by the nucleotides of SEQ ID No: 169, which encodes the amino acid sequence of SEQ ID No: 170. This enzyme is believed to have esterase/lipase activity.
  • sequence ES 86, 8079_g is encoded by the nucleotides of SEQ ID No: 171, which encodes the amino acid sequence of SEQ ID No: 172. This enzyme is believed to have phospholipase activity.
  • sequence ES 87, scaffold00071.G611 is encoded by the nucleotides of SEQ ID No: 173, which encodes the amino acid sequence of SEQ ID No: 174. This enzyme is believed to have lipase activity.
  • sequence ES 88, scaffold00071.G3443 is encoded by the nucleotides of SEQ ID No: 175, which encodes the amino acid sequence of SEQ ID No: 176. This enzyme is believed to have phospholipase activity.
  • sequence ES 89, scaffold00031.pathl.gene288 is encoded by the nucleotides of SEQ ID No: 177, which encodes the amino acid sequence of SEQ ID No: 178. This enzyme is believed to have lipase activity.
  • sequence ES 90, scaffold00092.pathl.gene71 is encoded by the nucleotides of SEQ ID No: 179, which encodes the amino acid sequence of SEQ ID No: 180. This enzyme is believed to have phospholipase activity.
  • sequence ES 91, scaffold00050.pathl.gene362 is encoded by the nucleotides of SEQ ID No: 181, which encodes the amino acid sequence of SEQ ID No: 182. This enzyme is believed to have esterase activity.
  • sequence ES 92, scaffold00227.pathl.gene278, is encoded by the nucleotides of SEQ ID No: 183, which encodes the amino acid sequence of SEQ ID No: 184. This enzyme is believed to have carboxylesterase activity.
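The listing above follows a regular numbering pattern: sequence ES n is paired with an odd-numbered nucleotide SEQ ID (2n - 1) and the following even-numbered amino acid SEQ ID (2n). A minimal sketch of that lookup, assuming the pattern holds for all 92 entries as it does in the listing; the function name is illustrative and not part of the disclosure:

```python
def seq_ids_for_es(es_number: int) -> tuple[int, int]:
    """Return (nucleotide SEQ ID No, amino acid SEQ ID No) for sequence ES n."""
    if not 1 <= es_number <= 92:
        raise ValueError("ES numbers in this listing run from 1 to 92")
    nucleotide_id = 2 * es_number - 1   # odd-numbered SEQ ID: coding nucleotides
    amino_acid_id = 2 * es_number       # even-numbered SEQ ID: encoded protein
    return nucleotide_id, amino_acid_id

print(seq_ids_for_es(1))
print(seq_ids_for_es(92))
```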
  • the methods may be performed one or more times in whole or in part. That is, one may perform one or more pretreatments, followed by one or more reactions with a protein of the present invention, composition or product of the present invention and/or accessory enzyme.
  • the enzymes may be added in a single dose, or may be added in a series of small doses. Further, the entire process may be repeated one or more times as necessary. Therefore, one or more additional treatments with heat and enzymes are contemplated.
  • Proteins of the present invention, compositions comprising such protein(s) of the present invention, and multi-enzyme compositions (examples of which are described above) may be used in any method where it is desirable to hydrolyze glycosidic linkages in lignocellulosic material, or any other method wherein enzymes of the same or similar function are useful.
  • the present invention includes the use of at least one protein of the present invention, compositions comprising at least one protein of the present invention, or multi-enzyme compositions in methods for hydrolyzing lignocellulose and the generation of fermentable sugars therefrom.
  • the method comprises contacting the lignocellulosic material with an effective amount of one or more proteins of the present invention, composition comprising at least one protein of the present invention, or a multi-enzyme composition, whereby at least one fermentable sugar is produced (liberated).
  • the lignocellulosic material may be partially or completely degraded to fermentable sugars. Economical levels of degradation at commercially viable costs are contemplated.
  • the amount of enzyme or enzyme composition contacted with the lignocellulose will depend upon the amount of glucan and hemicellulose present in the lignocellulose. In some embodiments, the amount of enzyme or enzyme composition contacted with the lignocellulose may be from about 0.1 to about 200 mg enzyme or enzyme composition per gram of biomass dry weight; in other embodiments, from about 3 to about 20 mg enzyme or enzyme composition per gram of biomass dry weight.
  • the invention encompasses the use of any suitable or sufficient amount of enzyme or enzyme composition between about 0.1 mg and about 200 mg enzyme per gram biomass dry weight, in increments of 0.05 mg (i.e., 0.1 mg, 0.15 mg, 0.2 mg... 199.9 mg, 199.95 mg, 200 mg).
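The dosing range and increments above can be illustrated with a small calculation. The sketch below enumerates the stated 0.05 mg steps between 0.1 mg and 200 mg per gram of dry biomass and scales a chosen dose to a batch. The function names and the example batch size are assumptions made for illustration, and the text's "about" is treated here as a strict bound purely for simplicity.

```python
from decimal import Decimal

LOW = Decimal("0.1")     # mg enzyme per g biomass dry weight (lower end of range)
HIGH = Decimal("200")    # mg enzyme per g biomass dry weight (upper end of range)
STEP = Decimal("0.05")   # increment stated in the text

def contemplated_doses():
    """Enumerate every dose from LOW to HIGH in STEP increments (exact decimals)."""
    n_steps = int((HIGH - LOW) / STEP)
    return [LOW + STEP * i for i in range(n_steps + 1)]

def enzyme_mass_mg(dose_mg_per_g, biomass_dry_g):
    """Total enzyme (mg) needed for a batch at a given dose per gram dry weight."""
    dose = Decimal(str(dose_mg_per_g))
    if not LOW <= dose <= HIGH:
        raise ValueError("dose is outside the 0.1 to 200 mg/g range")
    return dose * Decimal(str(biomass_dry_g))

doses = contemplated_doses()
print(len(doses), doses[0], doses[1], doses[-1])
# Hypothetical batch: 500 g dry biomass dosed at 10 mg enzyme per g
print(enzyme_mass_mg(10, 500))
```

Decimal arithmetic is used instead of floats so that the 0.05 mg increments land exactly on the listed values.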
  • the invention provides a method for degrading DDG, preferably, but not limited to, DDG derived from corn, to sugars.
  • the method comprises contacting the DDG with a protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition. In certain embodiments, at least 10% of fermentable sugars are liberated.
  • At least 15% of the sugars are liberated, or at least 20% of the sugars are liberated, or at least 23% of the sugars are liberated, or at least 24% of the sugars are liberated, or at least 25% of the sugars are liberated, or at least 26% of the sugars are liberated, or at least 27% of the sugars are liberated, or at least 28% of the sugars are liberated.
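The liberation levels listed above (at least 10%, 15%, 20%, 23% and so on up to 28%) can be checked against a measured result with a short sketch. It assumes that the percentage is computed as sugar released divided by total sugar available in the DDG, which the text does not spell out; the function names are illustrative.

```python
# Thresholds named in the text, in percent of sugars liberated
THRESHOLDS_PERCENT = (10, 15, 20, 23, 24, 25, 26, 27, 28)

def percent_liberated(sugar_released_g, sugar_available_g):
    """Percent of available sugar released (assumed definition, see lead-in)."""
    if sugar_available_g <= 0:
        raise ValueError("available sugar must be positive")
    return 100.0 * sugar_released_g / sugar_available_g

def highest_threshold_met(percent):
    """Return the largest listed threshold that the result reaches, or None."""
    met = [t for t in THRESHOLDS_PERCENT if percent >= t]
    return max(met) if met else None

# Hypothetical measurement: 26 g released out of 100 g available
result = percent_liberated(26, 100)
print(result, highest_threshold_met(result))
```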
  • the invention provides a method for producing fermentable sugars comprising cultivating a genetically modified microorganism of the present invention in a nutrient medium comprising a lignocellulosic material, whereby fermentable sugars are produced.
  • Such enzymes have been described elsewhere herein.
  • the accessory enzyme or enzymes may be added at the same time, prior to, or following the addition of a protein of the present invention or a composition comprising at least one protein of the present invention, or a multi-enzyme composition, or can be expressed (endogenously or overexpressed) in a genetically modified microorganism used in a method of the invention.
  • the protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition will be compatible with the enzymes selected.
  • the conditions such as temperature and pH
  • the accessory enzyme may also be present in the lignocellulosic material itself as a result of genetically modifying the plant.
  • the nutrient medium used in a fermentation can also comprise one or more accessory enzymes.
  • the method comprises a pretreatment process.
  • a pretreatment process will result in components of the lignocellulose being more accessible for downstream applications, or more digestible by enzymes following treatment in the absence of hydrolysis.
  • the pretreatment can be a chemical, physical or biological pretreatment.
  • the lignocellulose may have been previously treated to release some or all of the sugars, as in the case of DDG. Physical treatments, such as grinding, boiling, freezing, milling, vacuum infiltration, and the like may also be used with the methods of the invention.
  • the heat treatment comprises heating the lignocellulosic material to 121°C for 15 minutes.
  • a physical treatment such as milling can allow a higher concentration of lignocellulose to be used in the methods of the invention.
  • a higher concentration refers to about 20%, up to about 25%, up to about 30%, up to about 35%, up to about 40%, up to about 45%, or up to about 50% lignocellulose.
  • the lignocellulose may also be contacted with a metal ion, ultraviolet light, ozone, and the like.
  • Additional pretreatment processes are known to those skilled in the art, and can include, for example, organosolv treatment, steam explosion treatment, lime impregnation with steam explosion treatment, hydrogen peroxide treatment, hydrogen peroxide/ozone (peroxone) treatment, acid treatment, dilute acid treatment, and base treatment, including ammonia fiber explosion (AFEX) technology.
  • AFEX ammonia fiber explosion
  • the method comprises detoxifying the lignocellulosic material.
  • Detoxification may be desirable in the event that inhibitors are present in the lignocellulosic material. Such inhibitors can be generated by a pretreatment process, can derive from sugar degradation, or can be directly released from the lignocellulose polymer.
  • Detoxifying can include the reduction of their formation by adjusting sugar extraction conditions; the use of inhibitor-tolerant or inhibitor-degrading strains of microorganisms. Detoxifying can also be accomplished by the addition of ion exchange resins, active charcoal, enzymatic detoxification using, e.g., laccase, and the like.
  • the proteins, compositions or products of the present invention further comprise detoxifying agents.
  • the invention comprises, but is not limited to, methods for using esterases in the food industry, such as degumming vegetable oils; improving the production of bread (e.g., in situ production of emulsifiers); producing crackers, noodles, and pasta; enhancing flavor development of cheese, butter, and margarine; ripening cheese; removing wax; trans-esterification of flavors and cocoa butter substitutes; synthesizing structured lipids for infant formula and nutraceuticals; improving the polyunsaturated fatty acid content in fish oil; and aiding in digestion and releasing minerals in food processing.
  • the invention comprises, but is not limited to, methods for using esterases in the household industry, such as use in laundry and detergent (e.g., removal of stains); cleaning agents; and hydrolysis of tallow for laundry detergent.
  • the invention comprises, but is not limited to, methods for using esterases in the publishing and printing industry, such as the removal of triglycerides, steryl esters, resin acids, free fatty acids, and sterols (e.g., lipophilic wood extractives).
  • the invention comprises, but is not limited to, methods for using esterases in the bioenergy industry, such as the production of biodiesel and other biofuels.
  • the invention comprises, but is not limited to, applications of esterases in the feed industry, such as reducing the amount of phosphate in feed.
  • the invention comprises methods for using esterases in other industries, such as the use as a biocatalyst; sewage treatment; cleaning up oil pollution; the synthesis of esters; the synthesis of fragrances; enantio-specific catalysis of fine chemicals (e.g., esters for chemical and drug intermediates); the production of isopropyl myristate, isopropyl palmitate and 2-ethylpalmitate for use as emollient in personal care products; saving of energy and minimization of thermal degradation in oleochemical industry; use as a feed additive; and enhancing the recovery of oil (e.g., during drilling).
  • the invention comprises, but is not limited to, methods for using esterases in the food industry to prepare gelling agents, in processes to clarify beverages, to prepare dietary oligosaccharides, and the like.
  • the invention comprises but is not limited to a method to release ferulic acid from biomass in order to synthesize vanillin.
  • the present invention provides methods for improving the nutritional quality of food (or animal feed) comprising adding to the food (or the animal feed) at least one protein of the present invention. In some embodiments, the present invention provides methods for improving the nutritional quality of the food (or animal feed) comprising pretreating the food (or the animal feed) with at least one isolated protein of the present invention. Improving the nutritional quality can mean making the food (or the animal feed) more digestible and/or less allergenic, and encompasses changes in the caloric value, taste and/or texture of the food.
  • the proteins of the present invention may be used as part of nutritional supplements. In some embodiments, the proteins of the present invention may be used as part of digestive aids.
  • an isolated protein or polypeptide in the present invention includes full-length proteins and their glycosylated or otherwise modified forms, fusion proteins, or any fragment or homologue or variant of such a protein. More specifically, an isolated protein, such as an enzyme according to the present invention, is a protein (including a polypeptide or peptide) that has been removed from its natural milieu (i.e., that has been subject to human manipulation) and can include purified proteins, partially purified proteins, recombinantly produced proteins, synthetically produced proteins, proteins complexed with lipids, soluble proteins, and isolated proteins associated with other proteins, for example.
  • a "C. lucknowense protein” or “C. lucknowense enzyme” refers to a protein (generally including a homologue or variant of a naturally occurring protein) from M. thermophila or to a protein that has been otherwise produced from the knowledge of the structure (e.g., sequence) and perhaps the function of a naturally occurring protein from M. thermophila.
  • a C. lucknowense protein includes any protein that has substantially similar structure and function of a naturally occurring C. lucknowense protein or that is a biologically active (i.e., has biological activity) homologue or variant of a naturally occurring protein from C. lucknowense as described in detail herein.
  • a C. lucknowense protein can include purified, partially purified, recombinant, mutated/modified and synthetic proteins.
  • an isolated protein can be isolated from its natural source, produced recombinantly, or produced synthetically.
  • the terms "modification", "mutation", and "variant" can be used interchangeably, particularly with regard to the modifications/mutations/variants to the amino acid sequence of a protein or peptide (e.g., C. lucknowense protein) (or nucleic acid sequences) described herein.
  • the term "modification" can also be used to describe post-translational modifications to a protein or peptide including, but not limited to, methylation, farnesylation, carboxymethylation, geranyl geranylation, glycosylation, phosphorylation, acetylation, myristoylation, prenylation, palmitation, and/or amidation.
  • the terms "homologue" or "variant" are used to refer to a protein or peptide which differs from a naturally occurring protein or peptide (i.e., the "prototype" or "wild-type" protein) by minor modifications to the naturally occurring protein or peptide, but which maintains the basic protein and side chain structure of the naturally occurring form.
  • Such changes include, but are not limited to: changes in one or a few amino acid side chains; changes in one or a few amino acids, including deletions (e.g., a truncated version of the protein or peptide), insertions and/or substitutions; changes in stereochemistry of one or a few atoms; and/or minor derivatizations, including but not limited to, for example: methylation, glycosylation, phosphorylation, acetylation, myristoylation, prenylation, palmitation, amidation and/or addition of glycosylphosphatidyl inositol.
  • a homologue or variant can have either enhanced, decreased, or substantially similar properties as compared to the naturally occurring protein or peptide.
  • Homologues or variants can be the result of natural allelic variation or natural mutation.
  • a naturally occurring allelic variant of a nucleic acid encoding a protein is a gene that occurs at essentially the same locus (or loci) in the genome as the gene which encodes such protein, but which, due to natural variations caused by, for example, mutation or recombination, has a similar but not identical sequence.
  • Homologues can also be the result of a gene duplication and rearrangement, resulting in a different location.
  • Allelic variants typically encode proteins having similar activity to that of the protein encoded by the gene to which they are being compared.
  • One class of allelic variants can encode the same protein but have different nucleic acid sequences due to the degeneracy of the genetic code.
  • Allelic variants can also comprise alterations in the 5' or 3' untranslated regions of the gene (e.g., in regulatory control regions). Allelic variants are well known to those skilled in the art.
  • Homologues or variants can be produced using techniques known in the art for the production of proteins including, but not limited to, direct modifications to the isolated, naturally occurring protein, direct protein synthesis, or modifications to the nucleic acid sequence encoding the protein using, for example, classic or recombinant DNA techniques to effect random or targeted mutagenesis.
  • Modifications of a protein may result in proteins having the same biological activity as the naturally occurring protein, or in proteins having decreased or increased biological activity as compared to the naturally occurring protein. Modifications which result in a decrease in protein expression or a decrease in the activity of the protein, can be referred to as inactivation (complete or partial), down-regulation, or decreased action of a protein. Similarly, modifications which result in an increase in protein expression or an increase in the activity of the protein, can be referred to as amplification, overproduction, activation, enhancement, up-regulation or increased action of a protein.
  • an isolated protein including a biologically active homologue, variant, or fragment thereof, has at least one characteristic of biological activity of a wild-type, or naturally occurring, protein.
  • the biological activity or biological action of a protein refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vivo (i.e., in the natural physiological environment of the protein) or in vitro (i.e., under laboratory conditions).
  • the biological activity of a protein of the present invention can include an enzyme activity (catalytic activity and/or substrate binding activity), such as cellulase activity, hemicellulase activity, β-glucanase activity, xylanase activity, or any other activity disclosed herein.
  • Specific biological activities of the proteins disclosed herein are described in detail above and in the Examples.
  • Methods of detecting and measuring the biological activity of a protein of the invention include, but are not limited to, the assays described in the Examples section below.
  • Such assays include, but are not limited to, measurement of enzyme activity (e.g., catalytic activity), measurement of substrate binding, and the like.
  • an isolated protein of the present invention (including homologues or variants) is not required to have a biological activity such as catalytic activity.
  • a protein can be a truncated, mutated or inactive protein, or lack at least one activity of the wild-type enzyme, for example.
  • Inactive proteins may be useful in some screening assays, for example, or for other purposes such as antibody production.
  • Methods to measure protein expression levels of a protein according to the invention include, but are not limited to: western blotting, immunocytochemistry, flow cytometry or other immunologic-based assays; assays based on a property of the protein including but not limited to, ligand binding or interaction with other protein partners.
  • Homologues or variants of a protein encompassed by the present invention can comprise, consist essentially of, or consist of, in one embodiment, an amino acid sequence that is at least about 35% identical, and more preferably at least about 40% identical, and more preferably at least about 45% identical, and more preferably at least about 50% identical, and more preferably at least about 55% identical, and more preferably at least about 60% identical, and more preferably at least about 65% identical, and more preferably at least about 70% identical, and more preferably at least about 75% identical, and more preferably at least about 80% identical, and more preferably at least about 85% identical, and more preferably at least about 90% identical, and more preferably at least about 95% identical, and more preferably at least about 96% identical, and more preferably at least about 97% identical, and more preferably at least about 98% identical, and more preferably at least about 99% identical, or any percent identity between 35% and 99%, in whole integers (i.e., 36%, 37%, etc.), to an amino acid sequence disclosed
  • the amino acid sequence of the homologue or variant has a biological activity of the wild-type or reference protein or of a biologically active domain thereof (e.g., a catalytic domain).
  • the amino acid position of the wild-type is typically used.
  • the wild-type can also be referred to as the "parent." Additionally, any generation before the variant at issue can be a parent.
  • a protein of the present invention comprises, consists essentially of, or consists of an amino acid sequence that, alone or in combination with other characteristics of such proteins disclosed herein, is less than 100% identical to an amino acid sequence selected from SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60,
  • a protein of the present invention can be less than 100% identical, in combination with being at least about 35% identical, to a given disclosed sequence.
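  • The identity window described above (at least about 35% but less than 100% identical) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned, with "-" marking gaps, and it counts identity over aligned columns; it does not reproduce the BLAST-based evaluation that this document uses to define percent identity. The function names and the treatment of "about 35%" as exactly 35.0 are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Assumption: both strings come from the same pairwise alignment, use '-'
    for gaps, and therefore have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both rows are gaps; they carry no information.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def in_identity_window(identity: float,
                       lower: float = 35.0,
                       upper: float = 100.0) -> bool:
    """True if lower <= identity < upper (hypothetical hard cut-offs)."""
    return lower <= identity < upper


if __name__ == "__main__":
    pid = percent_identity("ACDE", "ACDF")
    print(pid, in_identity_window(pid))
```

  • The half-open interval is a design choice made for the sketch: it excludes a sequence that is exactly identical to the reference while admitting everything from the lower bound upward.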
  • a homologue or variant according to the present invention has an amino acid sequence that is less than about 99% identical to any of such amino acid sequences, and in another embodiment, is less than about 98% identical to any of such amino acid sequences, and in another embodiment, is less than about 97% identical to any of such amino acid sequences, and in another embodiment, is less than about 96% identical to any of such amino acid sequences, and in another embodiment, is less than about 95% identical to any of such amino acid sequences, and in another embodiment, is less than about 94% identical to any of such amino acid sequences, and in another embodiment, is less than about 93% identical to any of such amino acid sequences, and in another embodiment, is less than about 92% identical to any of such amino acid sequences, and in another embodiment, is less than about 91% identical to any of such amino acid sequences, and in another embodiment, is less than about 90% identical to
  • reference to a percent (%) identity refers to an evaluation of homology which is performed using: (1) a BLAST 2.0 Basic BLAST homology search using blastp for amino acid searches and blastn for nucleic acid searches with standard default parameters, wherein the query sequence is filtered for low complexity regions by default (described in Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res.
  • PSI-BLAST provides an automated, easy-to-use version of a "profile" search, which is a sensitive way to look for sequence homologues or variants.
  • the program first performs a gapped BLAST database search.
  • the PSI-BLAST program uses the information from any significant alignments returned to construct a position-specific score matrix, which replaces the query sequence for the next round of database searching. Therefore, it is to be understood that percent identity can be determined by using any one of these programs.
  • BLAST 2 sequence alignment is performed in blastp or blastn using the BLAST 2.0 algorithm to perform a Gapped BLAST search (BLAST 2.0) between the two sequences allowing for the introduction of gaps (deletions and insertions) in the resulting alignment.
  • a BLAST 2 sequence alignment is performed using the standard default parameters as follows.
  • a protein of the present invention can also include proteins having an amino acid sequence comprising at least 10 contiguous amino acid residues of any of the sequences described herein (i.e., 10 contiguous amino acid residues having 100% identity with 10 contiguous amino acids of the amino acid sequences of SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID
  • a homologue or variant of a protein amino acid sequence includes amino acid sequences comprising at least 20, or at least 30, or at least 40, or at least 50, or at least 75, or at least 100, or at least 125, or at least 150, or at least 175, or at least 200, or at least 250, or at least 300, or at least 350 contiguous amino acid residues of any of the amino acid sequences disclosed herein.
  • Even small fragments of proteins without biological activity are useful in the present invention, for example, in the preparation of antibodies against the full-length protein or in a screening assay (e.g., a binding assay).
  • Fragments can also be used to construct fusion proteins, for example, where the fusion protein comprises functional domains from two or more different proteins (e.g., a CBM from one protein linked to a CD from another protein).
  • a homologue or variant has a measurable or detectable biological activity associated with the wild-type protein (e.g., enzymatic activity).
  • the term "contiguous" or "consecutive", with regard to nucleic acid or amino acid sequences described herein, means to be connected in an unbroken sequence.
  • for a first sequence to comprise 30 contiguous (or consecutive) amino acids of a second sequence means that the first sequence includes an unbroken sequence of 30 amino acid residues that is 100% identical to an unbroken sequence of 30 amino acid residues in the second sequence.
  • for a first sequence to have "100% identity" with a second sequence means that the first sequence exactly matches the second sequence with no gaps between nucleotides or amino acids.
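  • The definition of "contiguous" given above lends itself to a direct check: a first sequence comprises n contiguous residues of a second sequence when some unbroken window of length n occurs, letter for letter, in both. The following is a minimal sketch under that reading; the function name and the example peptides are hypothetical and are not taken from the sequence listing.

```python
def shares_contiguous_stretch(first: str, second: str, n: int) -> bool:
    """True if `first` contains an unbroken run of n residues that is
    100% identical to an unbroken run of n residues in `second`."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if len(first) < n or len(second) < n:
        return False
    # Collect every length-n window of the second sequence once ...
    windows = {second[i:i + n] for i in range(len(second) - n + 1)}
    # ... then slide a window of the same length along the first sequence.
    return any(first[i:i + n] in windows for i in range(len(first) - n + 1))


if __name__ == "__main__":
    # Hypothetical peptides sharing the six-residue run "TAYIAK".
    print(shares_contiguous_stretch("MKTAYIAKQR", "GGTAYIAKGG", 6))
    print(shares_contiguous_stretch("MKTAYIAKQR", "GGTAYIAKGG", 7))
```

  • Exact string comparison is used deliberately: under the definition above, a stretch interrupted by a gap or a single substitution does not count as contiguous.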
  • a protein of the present invention includes a protein having an amino acid sequence that is sufficiently similar to a natural amino acid sequence that a nucleic acid sequence encoding the homologue or variant is capable of hybridizing under moderate, high or very high stringency conditions (described below) to (i.e., with) a nucleic acid molecule encoding the natural protein (i.e., to the complement of the nucleic acid strand encoding the natural amino acid sequence).
  • a homologue or variant of a protein of the present invention is encoded by a nucleic acid molecule comprising a nucleic acid sequence that hybridizes under low, moderate, or high stringency conditions to the complement of a nucleic acid sequence that encodes a protein comprising, consisting essentially of, or consisting of, an amino acid sequence represented by any of SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID NO: 2,
  • a nucleic acid sequence complement of nucleic acid sequence encoding a protein of the present invention refers to the nucleic acid sequence of the nucleic acid strand that is complementary to the strand which encodes the protein. It will be appreciated that a double stranded DNA which encodes a given amino acid sequence comprises a single strand DNA and its complementary strand having a sequence that is a complement to the single strand DNA.
  • nucleic acid molecules of the present invention can be either double-stranded or single-stranded, and include those nucleic acid molecules that form stable hybrids under stringent hybridization conditions with a nucleic acid sequence that encodes an amino acid sequence such as the amino acid sequences of SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ
  • hybridization conditions refers to standard hybridization conditions under which nucleic acid molecules are used to identify similar nucleic acid molecules. Such standard conditions are disclosed, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press, 1989. Sambrook et al., ibid., (see specifically, pages 9.31-9.62). In addition, formulae to calculate the appropriate hybridization and wash conditions to achieve hybridization permitting varying degrees of mismatch of nucleotides are disclosed, for example, in Meinkoth et al., 1984, Anal. Biochem. 138, 267-284; Meinkoth et al., ibid.
  • moderate stringency hybridization and washing conditions refer to conditions which permit isolation of nucleic acid molecules having at least about 70% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 30% or less mismatch of nucleotides).
  • High stringency hybridization and washing conditions refer to conditions which permit isolation of nucleic acid molecules having at least about 80% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 20% or less mismatch of nucleotides).
  • Very high stringency hybridization and washing conditions refer to conditions which permit isolation of nucleic acid molecules having at least about 90% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 10% or less mismatch of nucleotides).
  • one of skill in the art can use the formulae in Meinkoth et al., ibid., to calculate the appropriate hybridization and wash conditions to achieve these particular levels of nucleotide mismatch. Such conditions will vary, depending on whether DNA:RNA or DNA:DNA hybrids are being formed. Calculated melting temperatures for DNA:DNA hybrids are 10°C less than for DNA:RNA hybrids.
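  • The three stringency levels defined above pair a minimum nucleic acid sequence identity with a maximum nucleotide mismatch (about 70%/30%, about 80%/20%, and about 90%/10%). The sketch below encodes that mapping as a lookup table. The names are hypothetical, and the "about" figures are treated as exact thresholds purely for illustration.

```python
# Minimum percent identity associated with each stringency level in the text.
STRINGENCY_MIN_IDENTITY = {
    "moderate": 70.0,
    "high": 80.0,
    "very high": 90.0,
}


def max_mismatch_percent(level: str) -> float:
    """Largest nucleotide mismatch tolerated at the given stringency level."""
    return 100.0 - STRINGENCY_MIN_IDENTITY[level]


def strictest_level_met(identity: float):
    """Most stringent level whose identity floor is reached, or None."""
    met = [name for name, floor in STRINGENCY_MIN_IDENTITY.items()
           if identity >= floor]
    return max(met, key=STRINGENCY_MIN_IDENTITY.get) if met else None


if __name__ == "__main__":
    print(max_mismatch_percent("high"))
    print(strictest_level_met(85.0))
```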
  • stringent hybridization conditions for DNA:DNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 20°C and about 35°C (lower stringency), more preferably, between about 28°C and about 40°C (more stringent), and even more preferably, between about 35°C and about 45°C (even more stringent), with appropriate wash conditions.
  • stringent hybridization conditions for DNA:RNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 30°C and about 45°C, more preferably, between about 38°C and about 50°C, and even more preferably, between about 45°C and about 55°C, with similarly stringent wash conditions.
  • Tm can be calculated empirically as set forth in Sambrook et al., supra, pages 9.31 to 9.62.
  • the wash conditions should be as stringent as possible, and should be appropriate for the chosen hybridization conditions.
  • hybridization conditions can include a combination of salt and temperature conditions that are approximately 20-25°C below the calculated Tm of a particular hybrid.
  • wash conditions typically include a combination of salt and temperature conditions that are approximately 12-20°C below the calculated Tm of the particular hybrid.
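  • The offsets stated above (hybridization approximately 20-25°C below the calculated melting temperature, washes approximately 12-20°C below it) translate into temperature windows once a melting temperature is known. A minimal sketch follows. The function name and the 65°C example value are hypothetical, and the melting temperature itself must be obtained separately (for example as set forth in Sambrook et al.).

```python
def temperature_windows(tm_celsius: float) -> dict:
    """Hybridization and wash temperature windows derived from a calculated Tm.

    Uses the offsets given in the text: hybridization 20-25 degrees C below Tm,
    wash 12-20 degrees C below Tm. Each window is (lowest, highest) in degrees C.
    """
    return {
        "hybridization": (tm_celsius - 25.0, tm_celsius - 20.0),
        "wash": (tm_celsius - 20.0, tm_celsius - 12.0),
    }


if __name__ == "__main__":
    # Hypothetical hybrid with a calculated Tm of 65 degrees C.
    for step, (low, high) in temperature_windows(65.0).items():
        print(f"{step}: {low:.0f} to {high:.0f} degrees C")
```

  • Because the wash window sits closer to the melting temperature than the hybridization window, the washes are the more stringent step, consistent with the statement that wash conditions should be as stringent as possible.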
  • hybridization conditions suitable for use with DNA:DNA hybrids include a 2-24 hour hybridization in 6X SSC (50% formamide) at about 42°C, followed by washing steps that include one or more washes at room temperature in about 2X SSC, followed by additional washes at higher temperatures and lower ionic strength (e.g., at least one wash at about 37°C in about 0.1X-0.5X SSC, followed by at least one wash at about 68°C in about 0.1X-0.5X SSC).
  • the minimum size of a protein and/or homologue or variant of the present invention is a size sufficient to have biological activity or, when the protein is not required to have such activity, sufficient to be useful for another purpose associated with a protein of the present invention, such as for the production of antibodies that bind to a naturally occurring protein.
  • the protein of the present invention is at least 20 amino acids in length, or at least about 25 amino acids in length, or at least about 30 amino acids in length, or at least about 40 amino acids in length, or at least about 50 amino acids in length, or at least about 60 amino acids in length, or at least about 70 amino acids in length, or at least about 80 amino acids in length, or at least about 90 amino acids in length, or at least about 100 amino acids in length, or at least about 125 amino acids in length, or at least about 150 amino acids in length, or at least about 175 amino acids in length, or at least about 200 amino acids in length, or at least about 250 amino acids in length, and so on up to a full length of each protein, and including any size in between in increments of one whole integer (one amino acid).
  • the protein can include a portion of a protein or a full-length protein, plus additional sequence (e.g., a fusion protein sequence), if desired.
  • the present invention also includes a fusion protein that includes a domain of a protein of the present invention (including a homologue or variant) attached to one or more fusion segments, which are typically heterologous in sequence to the protein sequence (i.e., different from the protein sequence).
  • Suitable fusion segments for use with the present invention include, but are not limited to, segments that can: enhance a protein's stability; provide other desirable biological activity; and/or assist with the purification of the protein (e.g., by affinity chromatography).
  • a suitable fusion segment can be a domain of any size that has the desired function (e.g., imparts increased stability, solubility, action or biological activity; and/or simplifies purification of a protein).
  • Fusion segments can be joined to amino and/or carboxyl termini of the domain of a protein of the present invention and can be susceptible to cleavage in order to enable straight-forward recovery of the protein.
  • Fusion proteins are preferably produced by culturing a recombinant cell transfected with a fusion nucleic acid molecule that encodes a protein including the fusion segment attached to either the carboxyl and/or amino terminal end of a domain of a protein of the present invention.
  • proteins of the present invention also include expression products of gene fusions (for example, used to overexpress soluble, active forms of the recombinant protein), of mutagenized genes (such as genes having codon modifications to enhance gene transcription and translation), and of truncated genes (such as genes having membrane binding modules removed to generate soluble forms of a membrane protein, or genes having signal sequences removed which are poorly tolerated in a particular recombinant host).
  • any of the amino acid sequences described herein can be produced with from at least one, and up to about 20, additional heterologous amino acids flanking each of the C- and/or N-terminal ends of the specified amino acid sequence.
  • the resulting protein or polypeptide can be referred to as "consisting essentially of" the specified amino acid sequence.
  • the heterologous amino acids are a sequence of amino acids that are not naturally found (i.e., not found in nature, in vivo) flanking the specified amino acid sequence, or that are not related to the function of the specified amino acid sequence, or that would not be encoded by the nucleotides that flank the naturally occurring nucleic acid sequence encoding the specified amino acid sequence as it occurs in the gene, if such nucleotides in the naturally occurring sequence were translated using standard codon usage for the organism from which the given amino acid sequence is derived.
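  • The length limit described above (at least one, and up to about 20, additional amino acids flanking the C- and/or N-terminal ends of the specified sequence) can be checked mechanically. The sketch below tests only the length criterion. It cannot decide whether the flanking residues are heterologous in the sense defined above, it looks only at the first occurrence of the core sequence, and the function names and example sequences are hypothetical.

```python
def flank_lengths(full: str, core: str):
    """Return (n_terminal_extra, c_terminal_extra) around the first occurrence
    of `core` in `full`, or None if `core` is absent."""
    start = full.find(core)
    if start < 0:
        return None
    return start, len(full) - start - len(core)


def within_flank_limit(full: str, core: str, max_flank: int = 20) -> bool:
    """True if `full` is `core` plus at least one added residue overall and no
    more than `max_flank` added residues at either terminus.

    The cut-off of exactly 20 is an assumption; the text says "about 20".
    """
    flanks = flank_lengths(full, core)
    if flanks is None:
        return False
    n_extra, c_extra = flanks
    return (n_extra + c_extra) >= 1 and n_extra <= max_flank and c_extra <= max_flank


if __name__ == "__main__":
    core = "ACDEFGHIK"                 # hypothetical specified sequence
    tagged = "MHHHHHH" + core          # hypothetical N-terminal extension
    print(flank_lengths(tagged, core), within_flank_limit(tagged, core))
```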
  • the present invention also provides enzyme combinations that break down lignocellulose material.
  • Such enzyme combinations or mixtures can include a multi-enzyme composition that contains at least one protein of the present invention in combination with one or more additional proteins of the present invention or one or more enzymes or other proteins from other microorganisms, plants, or similar organisms.
  • Synergistic enzyme combinations and related methods are contemplated.
  • the invention includes methods to identify the optimum ratios and compositions of enzymes with which to degrade each lignocellulosic material. These methods entail tests to identify the optimum enzyme composition and ratios for efficient conversion of any lignocellulosic substrate to its constituent sugars.
  • the Examples below include assays that may be used to identify optimum ratios and compositions of enzymes with which to degrade lignocellulosic and other plant cell derived materials.
  • any combination of the proteins disclosed herein is suitable for use in the multi-enzyme compositions of the present invention. Due to the complex nature of most biomass sources, which can contain cellulose, hemicellulose, pectin, lignin, protein, lipids, waxes and ash, among other components, preferred enzyme combinations may contain enzymes with a range of substrate specificities that work together to degrade biomass into fermentable sugars in the most efficient manner.
  • a multi-enzyme complex for lignocellulose saccharification is a mixture of cellobiohydrolase(s), xylanase(s), endoglucanase(s), β-glucosidase(s), β-xylosidase(s), and accessory enzymes.
  • any of the enzymes described specifically herein can be combined with any one or more of the enzymes described herein or with any other available and suitable enzymes, to produce a multi-enzyme composition.
  • the invention is not restricted or limited to the specific exemplary combinations listed below.
  • the cellobiohydrolase(s) comprise between about 30% and about 90% or between about 40% and about 70% of the enzymes in the composition, and more preferably, between about 55% and 65%, and more preferably, about 60% of the enzymes in the composition (including any percentage between 40% and 70% in 0.5% increments (e.g., 40%, 40.5%, 41%, etc.)).
  • the xylanase(s) comprise between about 10% and about 30% of the enzymes in the composition, and more preferably, between about 15% and about 25%, and more preferably, about 20% of the enzymes in the composition (including any percentage between 10% and 30% in 0.5% increments).
  • the endoglucanase(s) comprise between about 5% and about 15% of the enzymes in the composition, and more preferably, between about 7% and about 13%, and more preferably, about 10% of the enzymes in the composition (including any percentage between 5% and 15% in 0.5% increments).
  • the β-glucosidase(s) comprise between about 1% and about 15% of the enzymes in the composition, and preferably between about 2% and 10%, and more preferably, about 3% of the enzymes in the composition (including any percentage between 1% and 15% in 0.5% increments).
  • the β-xylosidase(s) comprise between about 1% and about 3% of the enzymes in the composition, and preferably, between about 1.5% and about 2.5%, and more preferably, about 2% of the enzymes in the composition (including any percentage between 1% and 3% in 0.5% increments).
  • the accessory enzymes comprise between about 2% and about 8% of the enzymes in the composition, and preferably, between about 3% and about 7%, and more preferably, about 5% of the enzymes in the composition (including any percentage between 2% and 8% in 0.5% increments).
  • One particularly preferred example of a multi-enzyme complex for lignocellulose saccharification is a mixture of about 60% cellobiohydrolase(s), about 20% xylanase(s), about 10% endoglucanase(s), about 3% β-glucosidase(s), about 2% β-xylosidase(s) and about 5% accessory enzyme(s).
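  • The particularly preferred saccharification mixture above (about 60/20/10/3/2/5%) sums to 100% and falls inside the broad range stated for each enzyme class. The sketch below records both sets of figures, checks a candidate mixture against the ranges, and splits a total enzyme dose by percentage. The dictionary keys, function names, and the 500 mg example dose are hypothetical, and the "about" values are treated as exact for the arithmetic.

```python
import math

# Particularly preferred mixture (percent of enzymes in the composition).
PREFERRED_MIX = {
    "cellobiohydrolase": 60.0,
    "xylanase": 20.0,
    "endoglucanase": 10.0,
    "beta-glucosidase": 3.0,
    "beta-xylosidase": 2.0,
    "accessory": 5.0,
}

# Broadest range stated for each enzyme class (percent).
BROAD_RANGES = {
    "cellobiohydrolase": (30.0, 90.0),
    "xylanase": (10.0, 30.0),
    "endoglucanase": (5.0, 15.0),
    "beta-glucosidase": (1.0, 15.0),
    "beta-xylosidase": (1.0, 3.0),
    "accessory": (2.0, 8.0),
}


def out_of_range(mix: dict, ranges: dict = BROAD_RANGES) -> list:
    """Names of enzyme classes whose percentage lies outside its range."""
    return [name for name, pct in mix.items()
            if not (ranges[name][0] <= pct <= ranges[name][1])]


def split_dose(total_mg: float, mix: dict) -> dict:
    """Split a total enzyme dose (mg) across classes according to `mix`."""
    if not math.isclose(sum(mix.values()), 100.0):
        raise ValueError("mixture percentages must sum to 100")
    return {name: total_mg * pct / 100.0 for name, pct in mix.items()}


if __name__ == "__main__":
    print(out_of_range(PREFERRED_MIX))
    print(split_dose(500.0, PREFERRED_MIX))
```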
  • Enzymes and multi-enzyme compositions of the present invention may also be used to break down arabinoxylan or arabinoxylan-containing substrates.
  • Arabinoxylan is a polysaccharide composed of xylose and arabinose, wherein α-L-arabinofuranose residues are attached as branch-points to a β-(1,4)-linked xylose polymeric backbone.
  • the xylose residues may be mono-substituted at the C2 or C3 position, or di-substituted at both positions.
  • Ferulic acid or coumaric acid may also be ester-linked to the C5 position of arabinosyl residues. Further details on the hydrolysis of arabinoxylan can be found in International Publication No. WO 2006/114095.
  • the substitutions on the xylan backbone can inhibit the enzymatic activity of xylanases, and the complete hydrolysis of arabinoxylan typically requires the action of several different enzymes.
  • a multi-enzyme complex for arabinoxylan hydrolysis is a mixture of endoxylanase(s), β-xylosidase(s), and arabinofuranosidase(s), including those with specificity towards single and double substituted xylose residues.
  • the multi-enzyme complex may further comprise one or more carbohydrate esterases, such as acetyl xylan esterases, ferulic acid esterases, or coumaric acid esterases. Any combination of two or more of the above-mentioned enzymes is suitable for use in the multi-enzyme complexes. However, it is to be understood that the invention is not restricted or limited to the specific exemplary combinations listed herein.
  • the endoxylanase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, or at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g., 5.0%, 5.5%, 6.0%, etc.)).
  • Endoxylanase(s), either alone or as part of a multi-enzyme complex, may be used in amounts of 0.001 to 2.0 g/kg, 0.005 to 1.0 g/kg, or 0.05 to 0.2 g/kg of substrate.
  • the β-xylosidase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, or at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g., 5.0%, 5.5%, 6.0%, etc.)).
  • β-xylosidase(s), either alone or as part of a multi-enzyme complex, may be used in amounts of 0.001 to 2.0 g/kg, 0.005 to 1.0 g/kg, or 0.05 to 0.2 g/kg of substrate.
  • the arabinofuranosidase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, or at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g., 5.0%, 5.5%, 6.0%, etc.)).
  • the total percentage of arabinofuranosidase(s) present in the composition may include arabinofuranosidase(s) with specificity towards single substituted xylose residues, arabinofuranosidase(s) with specificity towards double substituted xylose residues, or any combination thereof.
  • Arabinofuranosidase(s), either alone or as part of a multi-enzyme complex, may be used in amounts of 0.001 to 2.0 g/kg, 0.005 to 1.0 g/kg, or 0.05 to 0.2 g/kg of substrate.
  • the acetylxylan esterases, ferulic acid esterases, the coumaryl esterases, the glucuronyl esterases and/ or the glucuronidases alone or in combination comprise at least about 1%, at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50% of the enzymes in the composition (including any percentage between 1% and 50% in 0.5% increments (e.g., 1.0%, 1.5%, 2.0%, etc.).
  • Acetylxylan esterases, ferulic acid esterases, coumaric acid esterases, glucuronyl esterases, glucuronidases may be used in amounts of 0.0001 to 2.0 g/kg, 0.0005 to 1.0 g/kg or 0.005 to 0.2 g kg of substrate.
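The dosage ranges above are stated per kilogram of substrate, so the enzyme mass for a given batch follows by simple scaling. The following is a minimal sketch of that arithmetic; the dictionary keys, the function name, and the `tier` index (0 for the broadest range, 2 for the narrowest) are illustrative assumptions and are not terms used in this disclosure.

```python
# Illustrative sketch only: names and the "tier" indexing are assumptions.
# Ranges are (low, high) in g enzyme per kg substrate, as listed above.
XYLAN_ENZYME_RANGES = [(0.001, 2.0), (0.005, 1.0), (0.05, 0.2)]
ESTERASE_RANGES = [(0.0001, 2.0), (0.0005, 1.0), (0.005, 0.2)]

DOSE_RANGES_G_PER_KG = {
    "endoxylanase": XYLAN_ENZYME_RANGES,
    "beta_xylosidase": XYLAN_ENZYME_RANGES,
    "arabinofuranosidase": XYLAN_ENZYME_RANGES,
    "esterases_and_glucuronidases": ESTERASE_RANGES,
}


def enzyme_mass_range(enzyme, substrate_kg, tier=0):
    """Return (low_g, high_g) of enzyme for the given mass of substrate."""
    if substrate_kg < 0:
        raise ValueError("substrate mass must be non-negative")
    low, high = DOSE_RANGES_G_PER_KG[enzyme][tier]
    return low * substrate_kg, high * substrate_kg


if __name__ == "__main__":
    low_g, high_g = enzyme_mass_range("endoxylanase", substrate_kg=100, tier=2)
    print(f"endoxylanase for 100 kg substrate: {low_g:.2f} to {high_g:.2f} g")
```

For example, at the narrowest endoxylanase range (0.05 to 0.2 g/kg), 100 kg of substrate corresponds to roughly 5 to 20 g of enzyme.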
  • One or more components of a multi-enzyme composition can be obtained from or derived from a microbial, plant, or other source or combination thereof, and will contain enzymes capable of degrading lignocellulosic material or other biomass components.
  • Examples of enzymes included in the multi-enzyme compositions of the invention include cellulases, hemicellulases (such as xylanases, including endoxylanases, exoxylanases, and β-xylosidases; mannanases, including endomannanases, exomannanases, and β-mannosidases), pectinases, ligninases, amylases, glucuronidases, proteases, esterases, lipases, glucosidases (such as β-glucosidase), and xyloglucanases.
  • Although the multi-enzyme composition may contain many types of enzymes, mixtures comprising enzymes that increase or enhance sugar release from biomass are preferred, including hemicellulases.
  • the hemicellulase is selected from a xylanase, an arabinofuranosidase, an acetyl xylan esterase, a glucuronidase, an endo-galactanase, a mannanase, an endo-arabinase, an exo-arabinase, an exo-galactanase, a ferulic acid esterase, a galactomannanase, a xyloglucanase, or mixtures of any of these.
  • the enzymes can include glucoamylase, β-xylosidase and/or β-glucosidase.
  • mixtures comprising enzymes that are capable of degrading cell walls and releasing
  • the enzymes of the multi-enzyme composition can be provided by a variety of sources.
  • the enzymes can be produced by growing organisms such as bacteria, algae, fungi, and plants which produce the enzymes naturally or by virtue of being genetically modified to express the enzyme or enzymes.
  • at least one enzyme of the multi-enzyme composition is a commercially available enzyme.
  • the multi-enzyme compositions comprise an accessory enzyme.
  • An accessory enzyme is any additional enzyme capable of hydrolyzing lignocellulose or enhancing or promoting the hydrolysis of lignocellulose, wherein the accessory enzyme is typically provided in addition to a core enzyme or core set of enzymes.
  • An accessory enzyme can have the same or similar function or a different function as an enzyme or enzymes in the core set of enzymes. These enzymes have been described elsewhere herein.
  • the enzymes may also include cellulases, xylanases, ligninases, amylases, lipases, cutinases or glucuronidases, for example.
  • Accessory enzymes can include enzymes that when contacted with biomass in a reaction, allow for an increase in the activity of enzymes (e.g., hemicellulases) in the multi-enzyme composition.
  • An accessory enzyme or enzyme mix may be composed of enzymes from (1) commercial suppliers; (2) cloned genes expressing enzymes; (3) complex broth (such as that resulting from growth of a microbial strain in media, wherein the strains secrete proteins and enzymes into the media); (4) cell lysates of strains grown as in (3); and, (5) plant material expressing enzymes capable of degrading lignocellulose or other plant derived biomass constituents (e.g., pectins).
  • a ligninase is an enzyme that can hydrolyze or break down the structure of lignin polymers, including lignin peroxidases, manganese peroxidases, laccases, and other enzymes described in the art known to depolymerize or otherwise break lignin polymers. Also included are enzymes capable of hydrolyzing bonds formed between hemicellulosic sugars (notably arabinose) and lignin.
  • the multi-enzyme compositions comprise a biomass comprising microorganisms or a crude fermentation product of microorganisms.
  • a crude fermentation product refers to the fermentation broth which has been separated from the microorganism biomass (by filtration, for example).
  • the microorganisms are grown in fermentors, optionally centrifuged or filtered to remove biomass, and optionally concentrated, formulated, and dried to produce an enzyme(s) or a multi-enzyme composition that is a crude fermentation product.
  • enzyme(s) or multi-enzyme compositions produced by the microorganism are subjected to one or more purification steps, such as ammonium sulfate precipitation, chromatography, and/or ultrafiltration, which result in a partially purified or purified enzyme(s).
  • the enzyme(s) will include recombinant enzymes.
  • the enzyme(s) may include both naturally occurring and recombinant enzymes.
  • compositions comprising at least about 500 ng, and preferably at least about 1 μg, and more preferably at least about 5 μg, and more preferably at least about 10 μg, and more preferably at least about 25 μg, and more preferably at least about 50 μg, and more preferably at least about 75 μg, and more preferably at least about 100 μg, and more preferably at least about 250 μg, and more preferably at least about 500 μg, and more preferably at least about 750 μg, and more preferably at least about 1 mg, and more preferably at least about 5 mg, of an isolated protein comprising any of the proteins or homologues, variants, or fragments thereof discussed herein.
  • composition of the present invention may include any carrier with which the protein is associated by virtue of the protein preparation method, a protein purification method, or a preparation of the protein for use in any method according to the present invention.
  • a carrier can include any suitable buffer, extract, or medium that is suitable for combining with the protein of the present invention so that the protein can be used in any method described herein according to the present invention.
  • an immobilized enzyme includes immobilized isolated enzymes, immobilized microbial cells which contain one or more enzymes of the invention, other stabilized intact cells that produce one or more enzymes of the invention, and stabilized cell/membrane homogenates.
  • Stabilized intact cells and stabilized cell/membrane homogenates include cells and homogenates from naturally occurring microorganisms expressing the enzymes of the invention and preferably, from genetically modified microorganisms as disclosed elsewhere herein.
  • nucleic acid molecules that encode a protein of the present invention, as well as homologues, variants, or fragments of such nucleic acid molecules.
  • a nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence encoding any of the isolated proteins disclosed herein, including a fragment or a homologue or variant of such proteins, described above.
  • Nucleic acid molecules can include a nucleic acid sequence that encodes a fragment of a protein that does not have biological activity, and can also include portions of a gene or polynucleotide encoding the protein that are not part of the coding region for the protein (e.g., introns or regulatory regions of a gene encoding the protein). Nucleic acid molecules can include a nucleic acid sequence that is useful as a probe or primer (oligonucleotide sequences).
  • a nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence represented in SEQ ID No: 1, SEQ ID No: 3, SEQ ID No: 5, SEQ ID No: 7, SEQ ID No: 9, SEQ ID No: 11, SEQ ID No: 13, SEQ ID No: 15, SEQ ID No: 17, SEQ ID No: 19, SEQ ID No: 21, SEQ ID No: 23, SEQ ID No: 25, SEQ ID No: 27, SEQ ID No: 29, SEQ ID No: 31, SEQ ID No: 33, SEQ ID No: 35, SEQ ID No: 37, SEQ ID No: 39, SEQ ID No: 41, SEQ ID No: 43, SEQ ID No: 45, SEQ ID No: 47, SEQ ID No: 49, SEQ ID No: 51, SEQ ID No: 53, SEQ ID No: 55, SEQ ID No: 56, SEQ ID No: 57, SEQ ID No
  • a nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence encoding an amino acid sequence represented in SEQ ID No: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No:
  • nucleic acid molecules include isolated nucleic acid molecules that hybridize under moderate stringency conditions, and more preferably under high stringency conditions, and even more preferably under very high stringency conditions, as described above, with the complement of a nucleic acid sequence encoding a protein of the present invention (i.e., including naturally occurring allelic variants encoding a protein of the present invention).
  • an isolated nucleic acid molecule encoding a protein of the present invention comprises a nucleic acid sequence that hybridizes under moderate, high, or very high stringency conditions to the complement of a nucleic acid sequence that encodes a protein comprising an amino acid sequence represented in SEQ ID No: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56
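The hybridization language above is defined relative to the complement of a coding sequence. As a hedged illustration of what "the complement" means at the sequence level, the sketch below computes the reverse complement of a short DNA string; the function name and the example sequence are hypothetical and are not taken from the sequence listing.

```python
# Illustrative sketch: the example sequence is made up, not a SEQ ID of this disclosure.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    coding = "ATGGCC"
    # A probe with this sequence would base-pair with the coding strand.
    print(reverse_complement(coding))
```

This shows only base-pairing complementarity; it does not model stringency, which depends on temperature, salt concentration, and formamide concentration.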
  • an isolated nucleic acid molecule is a nucleic acid molecule (polynucleotide) that has been removed from its natural milieu (i.e., that has been subject to human manipulation) and can include DNA, RNA, or derivatives of either DNA or RNA, including cDNA.
  • isolated does not reflect the extent to which the nucleic acid molecule has been purified.
  • nucleic acid molecule primarily refers to the physical nucleic acid molecule
  • nucleic acid sequence primarily refers to the sequence of nucleotides on the nucleic acid molecule
  • the two phrases can be used interchangeably, especially with respect to a nucleic acid molecule, or a nucleic acid sequence, being capable of encoding a protein.
  • An isolated nucleic acid molecule of the present invention can be isolated from its natural source or produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning) or chemical synthesis.
  • Isolated nucleic acid molecules can include, for example, genes, natural allelic variants of genes, coding regions or portions thereof, and coding and/or regulatory regions modified by nucleotide insertions, deletions, substitutions, and/or inversions in a manner such that the modifications do not substantially interfere with the nucleic acid molecule's ability to encode a protein of the present invention or to form stable hybrids under stringent conditions with natural gene isolates.
  • An isolated nucleic acid molecule can include degeneracies.
  • nucleotide degeneracy refers to the phenomenon that one amino acid can be encoded by different nucleotide codons.
  • nucleic acid sequence of a nucleic acid molecule that encodes a protein of the present invention can vary due to degeneracies. It is noted that a nucleic acid molecule of the present invention is not required to encode a protein having protein activity. A nucleic acid molecule can encode a truncated, mutated or inactive protein, for example. In addition, nucleic acid molecules of the invention are useful as probes and primers for the identification, isolation and/or purification of other nucleic acid molecules.
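Nucleotide degeneracy can be illustrated with two different DNA sequences that encode the same peptide. The sketch below uses a small subset of the standard genetic code; the table is deliberately partial, and the sequences and function name are hypothetical examples, not sequences from this disclosure.

```python
# Illustrative sketch: partial standard codon table, hypothetical sequences.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",  # stop codons
}


def translate(dna):
    """Translate a DNA coding sequence until the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    seq_a = "ATGGCTCTTAAATAA"
    seq_b = "ATGGCGTTGAAGTGA"  # synonymous codons at every position after ATG
    print(translate(seq_a), translate(seq_b), seq_a != seq_b)
```

Both sequences translate to the same four-residue peptide even though they differ at the nucleotide level, which is the sense in which a coding sequence "can vary due to degeneracies."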
  • the nucleic acid molecule is an oligonucleotide, such as a probe or primer
  • the oligonucleotide preferably ranges from about 5 to about 50 or about 500 nucleotides, more preferably from about 10 to about 40 nucleotides, and most preferably from about 15 to about 40 nucleotides in length.
  • a gene includes all nucleic acid sequences related to a natural (i.e. wild-type) gene, such as regulatory regions that control production of the protein encoded by that gene (such as, but not limited to, transcription, translation or post-translation control regions) as well as the coding region itself.
  • a gene can be a naturally occurring allelic variant that includes a similar but not identical sequence to the nucleic acid sequence encoding a given protein. Allelic variants have been previously described above. Genes can include or exclude one or more introns or any portions thereof or any other sequences which are not included in the cDNA for that protein.
  • the phrases "nucleic acid molecule" and “gene” can be used interchangeably when the nucleic acid molecule comprises a gene as described above.
  • an isolated nucleic acid molecule of the present invention is produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning, etc.) or chemical synthesis.
  • Isolated nucleic acid molecules include any nucleic acid molecules and homologues or variants thereof that are part of a gene described herein and/or that encode a protein described herein, including, but not limited to, natural allelic variants and modified nucleic acid molecules (homologues or variants) in which nucleotides have been inserted, deleted, substituted, and/or inverted in such a manner that such modifications provide the desired effect on protein biological activity or on the activity of the nucleic acid molecule.
  • Allelic variants and protein homologues or variants e.g., proteins encoded by nucleic acid homologues or variants
  • a nucleic acid molecule homologue or variant i.e., encoding a homologue or variant of a protein of the present invention
  • a nucleic acid molecule homologue or variant can be produced using a number of methods known to those skilled in the art (see, for example, Sambrook et al.).
  • nucleic acid molecules can be modified using a variety of techniques including, but not limited to, by classic mutagenesis and recombinant DNA techniques (e.g., site-directed mutagenesis, chemical treatment, restriction enzyme cleavage, ligation of nucleic acid fragments and/or PCR amplification), or synthesis of oligonucleotide mixtures and ligation of mixture groups to "build" a mixture of nucleic acid molecules and combinations thereof.
  • Another method for modifying a recombinant nucleic acid molecule encoding a protein is gene shuffling (i.e., molecular breeding) (See, for example, U.S. Patent No.
  • Nucleic acid molecule homologues or variants can be selected by hybridization with a gene or polynucleotide, or by screening for the function of a protein encoded by a nucleic acid molecule (i.e., biological activity).
  • the minimum size of a nucleic acid molecule of the present invention is a size sufficient to encode a protein (including a fragment, homologue, or variant of a full-length protein) having biological activity, sufficient to encode a protein comprising at least one epitope which binds to an antibody, or sufficient to form a probe or oligonucleotide primer that is capable of forming a stable hybrid with the complementary sequence of a nucleic acid molecule encoding a natural protein (e.g., under moderate, high, or very high stringency conditions).
  • the size of the nucleic acid molecule encoding such a protein can be dependent on nucleic acid composition and percent homology or identity between the nucleic acid molecule and complementary sequence as well as upon hybridization conditions per se (e.g., temperature, salt concentration, and formamide concentration).
  • the minimal size of a nucleic acid molecule that is used as an oligonucleotide primer or as a probe is typically at least about 12 to about 15 nucleotides in length if the nucleic acid molecules are GC-rich and at least about 15 to about 18 bases in length if they are AT-rich.
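The length guidance for primers and probes can be sketched as a simple check on GC content. The 50% cutoff used here to call a sequence "GC-rich", the use of the lower bound of each stated range (12 and 15 nucleotides), and all function names are assumptions made for illustration; the text above does not define these.

```python
# Illustrative sketch: the 0.5 GC cutoff and the choice of lower bounds are assumptions.
def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty nucleotide sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_minimum_probe_length(seq, gc_rich_cutoff=0.5):
    """Apply about 12 nt minimum for GC-rich and about 15 nt for AT-rich sequences."""
    min_len = 12 if gc_fraction(seq) > gc_rich_cutoff else 15
    return len(seq) >= min_len


if __name__ == "__main__":
    for probe in ("GCGCGCGCGCGC", "ATATATATATAT", "ATATATATATATATA"):
        print(probe, meets_minimum_probe_length(probe))
```

The check reflects only the typical minimum lengths stated above; actual hybrid stability also depends on the hybridization conditions.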
  • nucleic acid molecule of the present invention can include a portion of a protein encoding sequence, a nucleic acid sequence encoding a full-length protein (including a gene), including any length fragment between about 20 nucleotides and the number of nucleotides that make up the full length cDNA encoding a protein, in whole integers (e.g., 20, 21, 22, 23, 24, 25 nucleotides), or multiple genes, or portions thereof.
  • the heterologous nucleotides are not naturally found (i.e., not found in nature, in vivo) flanking the nucleic acid sequence encoding the specified amino acid sequence as it occurs in the natural gene or do not encode a protein that imparts any additional function to the protein or changes the function of the protein having the specified amino acid sequence.
  • the polynucleotide probes or primers of the invention are conjugated to detectable markers.
  • Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.
  • Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads.
  • the polynucleotide probes are immobilized on a substrate such as: artificial membranes, organic supports, biopolymer supports and inorganic supports.
  • One embodiment of the present invention relates to a recombinant nucleic acid molecule which comprises the isolated nucleic acid molecule described above which is operatively linked to at least one expression control sequence. More particularly, according to the present invention, a recombinant nucleic acid molecule typically comprises a recombinant vector and any one or more of the isolated nucleic acid molecules as described herein. According to the present invention, a recombinant vector is an engineered (i.e., artificially produced) nucleic acid molecule that is used as a tool for manipulating a nucleic acid sequence of choice and/or for introducing such a nucleic acid sequence into a host cell.
  • the recombinant vector is therefore suitable for use in cloning, sequencing, and/or otherwise manipulating the nucleic acid sequence of choice, such as by expressing and/or delivering the nucleic acid sequence of choice into a host cell to form a recombinant cell.
  • a vector typically contains nucleic acid sequences that are not naturally found adjacent to nucleic acid sequence to be cloned or delivered, although the vector can also contain regulatory nucleic acid sequences (e.g., promoters, untranslated regions) which are naturally found adjacent to nucleic acid sequences of the present invention or which are useful for expression of the nucleic acid molecules of the present invention (discussed in detail below).
  • the vector can be either RNA or DNA, either prokaryotic or eukaryotic, and typically is a plasmid.
  • the vector can be maintained as an extrachromosomal element (e.g., a plasmid) or it can be integrated into the chromosome of a recombinant host cell, although it is preferred if the vector remains separate from the genome for most applications of the invention.
  • the entire vector can remain in place within a host cell, or under certain conditions, the plasmid DNA can be deleted, leaving behind the nucleic acid molecule of the present invention.
  • An integrated nucleic acid molecule can be under chromosomal promoter control, under native or plasmid promoter control, or under a combination of several promoter controls. Single or multiple copies of the nucleic acid molecule can be integrated into the chromosome.
  • a recombinant vector of the present invention can contain at least one selectable marker.
  • a recombinant vector used in a recombinant nucleic acid molecule of the present invention is an expression vector.
  • expression vector is used to refer to a vector that is suitable for production of an encoded product (e.g., a protein of interest, such as an enzyme of the present invention).
  • a nucleic acid sequence encoding the product to be produced (e.g., the protein or homologue or variant thereof) is inserted into the recombinant vector to produce a recombinant nucleic acid molecule.
  • the nucleic acid sequence encoding the protein to be produced is inserted into the vector in a manner that operatively links the nucleic acid sequence to regulatory sequences in the vector which enable the transcription and translation of the nucleic acid sequence within the recombinant host cell.
  • a recombinant nucleic acid molecule includes at least one nucleic acid molecule of the present invention operatively linked to one or more expression control sequences (e.g., transcription control sequences or translation control sequences).
  • the phrase "recombinant molecule” or “recombinant nucleic acid molecule” primarily refers to a nucleic acid molecule or nucleic acid sequence operatively linked to a transcription control sequence, but can be used interchangeably with the phrase “nucleic acid molecule", when such nucleic acid molecule is a recombinant molecule as discussed herein.
  • the phrase "operatively linked” refers to linking a nucleic acid molecule to an expression control sequence in a manner such that the molecule is able to be expressed when transfected (i.e., transformed, transduced, transfected, conjugated or conducted) into a host cell.
  • Transcription control sequences are sequences which control the initiation, elongation, or termination of transcription. Particularly important transcription control sequences are those which control transcription initiation, such as promoter, enhancer, operator and repressor sequences. Suitable transcription control sequences include any transcription control sequence that can function in a host cell or organism into which the recombinant nucleic acid molecule is to be introduced. Transcription control sequences may also include any combination of one or more of any of the foregoing.
  • Recombinant nucleic acid molecules of the present invention can also contain additional regulatory sequences, such as translation regulatory sequences, origins of replication, and other regulatory sequences that are compatible with the recombinant cell.
  • a recombinant molecule of the present invention including those which are integrated into the host cell chromosome, also contains secretory signals (i.e., signal segment nucleic acid sequences) to enable an expressed protein to be secreted from the cell that produces the protein.
  • Suitable signal segments include a signal segment that is naturally associated with the protein to be expressed or any heterologous signal segment capable of directing the secretion of the protein according to the present invention.
  • a recombinant molecule of the present invention comprises a leader sequence to enable an expressed protein to be delivered to and inserted into the membrane of a host cell.
  • Suitable leader sequences include a leader sequence that is naturally associated with the protein, or any heterologous leader sequence capable of directing the delivery and insertion of the protein to the membrane of a cell.
  • the term "transfection” is generally used to refer to any method by which an exogenous nucleic acid molecule (i.e., a recombinant nucleic acid molecule) can be inserted into a cell.
  • transformation can be used interchangeably with the term “transfection” when such term is used to refer to the introduction of nucleic acid molecules into microbial cells or plants and describes an inherited change due to the acquisition of exogenous nucleic acids by the microorganism that is essentially synonymous with the term “transfection.”
  • Transfection techniques include, but are not limited to, transformation, particle bombardment, electroporation, microinjection, lipofection, adsorption, infection and protoplast fusion.
  • One or more recombinant molecules of the present invention can be used to produce an encoded product (e.g., a protein) of the present invention.
  • an encoded product is produced by expressing a nucleic acid molecule as described herein under conditions effective to produce the protein.
  • a preferred method to produce an encoded protein is by transfecting a host cell with one or more recombinant molecules to form a recombinant cell. Suitable host cells to transfect include, but are not limited to, any bacterial, fungal (e.g., filamentous fungi or yeast or mushrooms), algal, plant, insect, or animal cell that can be transfected. Host cells can be either untransfected cells or cells that are already transfected with at least one other recombinant nucleic acid molecule.
  • Suitable cells may include any microorganism (e.g., a bacterium, a protist, an alga, a fungus, or other microbe), and is preferably a bacterium, a yeast or a filamentous fungus.
  • Suitable bacterial genera include, but are not limited to, Escherichia, Bacillus, Lactobacillus, Pseudomonas and Streptomyces.
  • Suitable bacterial species include, but are not limited to, Escherichia coli, Bacillus subtilis, Bacillus licheniformis, Bacillus stearothermophilus, Lactobacillus brevis, Pseudomonas aeruginosa and Streptomyces lividans.
  • Suitable genera of yeast include, but are not limited to, Saccharomyces, Schizosaccharomyces, Candida, Hansenula, Pichia, Kluyveromyces, and Phaffia.
  • Suitable yeast species include, but are not limited to, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida albicans, Hansenula polymorpha, Pichia pastoris, P. canadensis, Kluyveromyces marxianus and Phaffia rhodozyma.
  • Suitable fungal genera include, but are not limited to, Chrysosporium, Thielavia, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium, Humicola, and Trichoderma, and anamorphs and teleomorphs thereof.
  • Suitable fungal species include, but are not limited to, Aspergillus niger, Aspergillus oryzae, Aspergillus nidulans, Aspergillus japonicus, Absidia coerulea, Rhizopus oryzae, Myceliophthora thermophila, Neurospora crassa, Neurospora intermedia, Trichoderma reesei, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, Talaromyces emersonii and Talaromyces flavus.
  • the host cell is a fungal cell of the species M. thermophila.
  • a white (low cellulose) strain is used.
  • the host cell is a fungal cell of Strain C1 (VKM F-3500-D) or a mutant strain derived therefrom (e.g., UV13-6 (Accession No. VKM F-3632 D); NG7C-19 (Accession No. VKM F-3633 D); UV18-25 (VKM F-3631D), W1L (CBS122189), or W1L#100L (CBS122190)).
  • Host cells can be either untransfected cells or cells that are already transfected with at least one other recombinant nucleic acid molecule. Additional embodiments of the present invention include any of the genetically modified cells described herein.
  • suitable host cells include insect cells (most particularly Drosophila melanogaster cells, Spodoptera frugiperda Sf9 and Sf21 cells and Trichoplusia High-Five cells), nematode cells (particularly C. elegans cells), avian cells, amphibian cells (particularly Xenopus laevis cells), reptilian cells, and mammalian cells (most particularly human, simian, canine, rodent, bovine, or sheep cells, e.g., NIH3T3, CHO (Chinese hamster ovary cell), COS, VERO, BHK, HEK, and other rodent or human cells).
  • one or more protein(s) expressed by an isolated nucleic acid molecule of the present invention are produced by culturing a cell that expresses the protein (i.e., a recombinant cell or recombinant host cell) under conditions effective to produce the protein.
  • the protein may be recovered, and in others, the cell may be harvested in whole, either of which can be used in a composition.
  • Microorganisms used in the present invention are cultured in an appropriate fermentation medium.
  • An appropriate, or effective, fermentation medium refers to any medium in which a cell of the present invention, including a genetically modified microorganism (described below), when cultured, is capable of expressing enzymes useful in the present invention and/or of catalyzing the production of sugars from lignocellulosic biomass.
  • a medium is typically an aqueous medium comprising assimilable carbon, nitrogen and phosphate sources.
  • Such a medium can also include appropriate salts, minerals, metals and other nutrients.
  • Microorganisms and other cells of the present invention can be cultured in conventional fermentation bioreactors.
  • the microorganisms can be cultured by any fermentation process which includes, but is not limited to, batch, fed-batch, cell recycle, and continuous fermentation.
  • the fermentation of microorganisms such as fungi may be carried out in any appropriate reactor, using methods known to those skilled in the art.
  • the fermentation may be carried out for a period of 1 to 14 days, or more preferably between about 3 and 10 days.
  • the temperature of the medium is typically maintained between about 25 and 50°C, and more preferably between 28 and 40°C.
  • the pH of the fermentation medium is regulated to a pH suitable for growth and protein production of the particular organism.
  • the fermentor can be aerated in order to supply the oxygen necessary for fermentation and to avoid the excessive accumulation of carbon dioxide produced by fermentation.
  • the aeration helps to control the temperature and the moisture of the culture medium.
  • the fungal strains are grown in fermentors, optionally centrifuged or filtered to remove biomass, and optionally concentrated, formulated, and dried to produce an enzyme(s) or a multi-enzyme composition that is a crude fermentation product.
  • Particularly suitable conditions for culturing filamentous fungi are described, for example, in U.S. Patent No. 6,015,707 and U.S. Patent No. 6,573,086, supra.
  • resultant proteins of the present invention may either remain within the recombinant cell; be secreted into the culture medium; be secreted into a space between two cellular membranes; or be retained on the outer surface of a cell membrane.
  • the phrase "recovering the protein” refers to collecting the whole culture medium containing the protein and need not imply additional steps of separation or purification.
  • Proteins produced according to the present invention can be purified using a variety of standard protein purification techniques, such as, but not limited to, affinity chromatography, ion exchange chromatography, filtration, electrophoresis, hydrophobic interaction chromatography, gel filtration chromatography, reverse phase chromatography, concanavalin A chromatography, chromatofocusing and differential precipitation or solubilization.
  • Proteins of the present invention are preferably retrieved, obtained, and/or used in "substantially pure” form.
  • substantially pure refers to a purity that allows for the effective use of the protein in any method according to the present invention.
  • a protein to be useful in any of the methods described herein or in any method utilizing enzymes of the types described herein according to the present invention it is substantially free of contaminants, other proteins and/or chemicals that might interfere or that would interfere with its use in a method disclosed by the present invention (e.g., that might interfere with enzyme activity), or that at least would be undesirable for inclusion with a protein of the present invention (including homologues and variants) when it is used in a method disclosed by the present invention (described in detail below).
  • a "substantially pure" protein is a protein that can be produced by any method (i.e., by direct purification from a natural source, recombinantly, or synthetically), and that has been purified from other protein components such that the protein comprises at least about 80% weight/weight of the total protein in a given composition (e.g., the protein of interest is about 80% of the protein in a solution/composition/buffer), and more preferably, at least about 85%, and more preferably at least about 90%, and more preferably at least about 91%, and more preferably at least about 92%, and more preferably at least about 93%, and more preferably at least about 94%, and more preferably at least about 95%, and more preferably at least about 96%, and more preferably at least about 97%, and more preferably at least about 98%, and more preferably at least about 99%, weight/weight of the total protein in a given composition.
  • Recombinant techniques useful for controlling the expression of nucleic acid molecules include, but are not limited to, integration of the nucleic acid molecules into one or more host cell chromosomes, addition of vector stability sequences to plasmids, substitutions or modifications of transcription control signals (e.g., promoters, operators, enhancers), substitutions or modifications of translational control signals (e.g., ribosome binding sites), modification of nucleic acid molecules to correspond to the codon usage of the host cell, and deletion of sequences that destabilize transcripts.
  • a genetically modified microorganism that has been transfected with one or more nucleic acid molecules of the present invention.
  • a genetically modified microorganism can include a genetically modified bacterium, alga, yeast, filamentous fungus, or other microbe.
  • Such a genetically modified microorganism has a genome which is modified (i.e., mutated or changed) from its normal (i.e., wild-type or naturally occurring) form such that the desired result is achieved (i.e., increased or modified activity and/or production of at least one enzyme or a multi-enzyme composition for the conversion of lignocellulosic material to fermentable sugars).
  • Genetic modification of a microorganism can be accomplished using classical strain development and/or molecular genetic techniques. Such techniques are known in the art and are generally disclosed for microorganisms, for example, in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press or Molecular Cloning: A Laboratory Manual, third edition (Sambrook and Russell, 2001), (jointly referred to herein as "Sambrook").
  • a genetically modified microorganism can include a microorganism in which nucleic acid molecules have been inserted, deleted or modified (i.e., mutated; e.g., by insertion, deletion, substitution, and/or inversion of nucleotides), in such a manner that such modifications provide the desired effect within the microorganism.
  • a genetically modified microorganism can endogenously contain and express an enzyme or a multi-enzyme composition for the conversion/degradation of lignocellulosic material, and the genetic modification can be a genetic modification of one or more of such endogenous enzymes, whereby the modification has some effect on the ability of the microorganism to convert/degrade lignocellulosic material (e.g., increased expression of the protein by introduction of promoters or other expression control sequences, or modification of the coding region by homologous recombination to increase the activity of the encoded protein).
  • a genetically modified microorganism can endogenously contain and express an enzyme for the conversion/degradation of lignocellulosic material, and the genetic modification can be an introduction of at least one exogenous nucleic acid sequence (e.g., a recombinant nucleic acid molecule), wherein the exogenous nucleic acid sequence encodes at least one additional enzyme useful for the conversion/degradation of lignocellulosic material and/or a protein that improves the efficiency of the enzyme for the conversion/degradation of lignocellulosic material.
  • the microorganism can also have at least one modification to a gene or genes comprising its endogenous enzyme(s) for the conversion/degradation of lignocellulosic material.
  • the genetically modified microorganism does not necessarily endogenously (naturally) contain an enzyme for the conversion/degradation of lignocellulosic material, but is genetically modified to introduce at least one recombinant nucleic acid molecule encoding at least one enzyme or a multiplicity of enzymes for the conversion/degradation of lignocellulosic material.
  • microorganism can be used in a method of the invention, or as a production microorganism for crude fermentation products, partially purified recombinant enzymes, and/or purified recombinant enzymes, any of which can then be used in a method of the present invention.
  • a cell extract that contains the activity to test can be generated. For example, a lysate from the host cell is produced, and the supernatant containing the activity is harvested and/or the activity can be isolated from the lysate. In the case of cells that secrete enzymes into the culture medium, the culture medium containing them can be harvested, and/or the activity can be purified from the culture medium.
  • the extracts/activities prepared in this way can be tested using assays known in the art. Accordingly, methods to identify multi-enzyme compositions capable of degrading lignocellulosic biomass are provided.
  • DDG dinitrosalicylic acid assay
  • the present invention is not limited to fungi and also contemplates genetically modified organisms such as algae, bacteria, and plants transformed with one or more nucleic acid molecules of the invention.
  • the plants may be used for production of the enzymes, and/or as the lignocellulosic material used as a substrate in the methods of the invention.
  • Methods to generate recombinant plants are known in the art. For instance, numerous methods for plant transformation have been developed, including biological and physical transformation protocols. See, for example, Miki et al., "Procedures for Introducing Foreign DNA into Plants” in Methods in Plant Molecular Biology and Biotechnology, Glick, B.R. and Thompson, J.E. Eds.
  • Another method for physical delivery of DNA to plants is sonication of target cells.
  • Some embodiments of the present invention include genetically modified organisms comprising at least one nucleic acid molecule encoding at least one enzyme of the present invention, in which the activity of the enzyme is downregulated.
  • the downregulation may be achieved, for example, by introduction of inhibitors (chemical or biological) of the enzyme activity, by manipulating the efficiency with which those nucleic acid molecules are transcribed, the efficiency with which the resultant transcripts are translated, and the efficiency of post-translational modifications, or by "knocking out” the endogenous copy of the gene.
  • a “knock out” of a gene refers to a molecular biological technique by which the gene in the organism is made inoperative, so that the expression of the gene is substantially reduced or eliminated.
  • the activity of the enzyme may be upregulated.
  • the present invention also contemplates downregulating activity of one or more enzymes while simultaneously upregulating activity of one or more enzymes to achieve the desired outcome.
  • This example illustrates an assay to measure acetyl xylan esterase activity towards arabinoxylan oligosaccharides from Eucalyptus wood. This assay measures the release of acetate by the action of the acetyl xylan esterases on the arabinoxylan oligosaccharides.
  • Phosphate buffer (0.05 M, pH 7.0) is prepared as follows. 13.8 g of NaH2PO4·H2O is dissolved in 1 L of Millipore water. 26.8 g of Na2HPO4·7H2O is dissolved in Millipore water. 195 mL of the NaH2PO4 solution is mixed with 305 mL of the Na2HPO4 solution and adjusted to 1000 mL with Millipore water. The pH of the resulting solution is equal to 7.0.
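The stated quantities can be checked with a short calculation. The sketch below is illustrative only: the molar masses are computed from standard atomic weights, and it assumes that the dibasic stock, like the monobasic one, is made up to 1 L (the text gives a volume only for the first stock).

```python
# Molar masses in g/mol, computed from standard atomic weights.
MW_NAH2PO4_H2O = 137.99    # sodium dihydrogen phosphate monohydrate
MW_NA2HPO4_7H2O = 268.07   # disodium hydrogen phosphate heptahydrate


def molarity(mass_g: float, molar_mass: float, volume_l: float) -> float:
    """Return the concentration in mol/L of a solute made up to volume_l."""
    return mass_g / molar_mass / volume_l


# Assumption: each stock is made up to 1 L (stated only for the first stock).
stock_monobasic = molarity(13.8, MW_NAH2PO4_H2O, 1.0)
stock_dibasic = molarity(26.8, MW_NA2HPO4_7H2O, 1.0)

# 195 mL of monobasic stock plus 305 mL of dibasic stock, made up to 1000 mL.
final_phosphate = (stock_monobasic * 0.195 + stock_dibasic * 0.305) / 1.0

print(f"monobasic stock: {stock_monobasic:.3f} M")
print(f"dibasic stock:   {stock_dibasic:.3f} M")
print(f"final buffer:    {final_phosphate:.3f} M total phosphate")
```

Under that assumption both stocks are about 0.1 M, and combining 500 mL of stock in a final volume of 1000 mL gives the stated 0.05 M.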
  • Acetylated, 4-O-MeGlcA substituted xylo-oligosaccharides with 2-10 xylose residues from Eucalyptus globulus wood (EW-XOS), Eucalyptus globulus wood ATS and Eucalyptus globulus xylan polymer are obtained using the method described in Lucas et al. 2002.
  • CE-LIF Capillary Electrophoresis-Laser induced fluorescence detector
  • CE-LIF is performed using ProteomeLab PA800 Protein Characterization System (Beckman Coulter), controlled by 32 Karat Software.
  • the capillary column used is a polyvinyl alcohol coated capillary (N-CHO capillary, Beckman Coulter), having 50 µm ID, 50.2 cm length and 40 cm to detector window. 25 mM sodium acetate buffer pH 4.75 containing 0.4% polyethyleneoxide (Carbohydrate separation buffer, Beckman Coulter) is used as running buffer.
  • the sample (ca. 3.5 nL) is injected to the capillary by a pressure of 0.5 psi for 3 seconds. The separation is done for 20 minutes at 30 kV separating voltage, with reversed polarity.
  • the samples are stored at 10°C.
  • the labeled EW-XOS are detected using LIF detector at 488 nm excitation and 520 nm emission wavelengths.
  • Esterase activity can be examined by spectrophotometry (Davies et al., 2000) with p-nitrophenyl butyrate (p-NPB) as a substrate. Cutinase activity can also be measured using 3H-labelled apple cutin as substrate by an adaptation of the methodology presented in Köller et al. (1982) and Davies et al. (2000).
  • This assay measures the release of p-nitrophenol by the action of ferulic acid esterase on p-nitrophenylbutyrate (PNPBu).
  • One ferulic acid esterase unit of activity is the amount of enzyme that liberates 1 micromole of p-nitrophenol in one minute at 37°C and pH 7.2.
  • Phosphate buffer (0.01 M, pH 7.2) is prepared as follows: 0.124 g of NaH2PO4·H2O and 0.178 g of Na2HPO4 are dissolved in distilled water so that the final volume of the solution is 500 ml and the pH of the resulting solution is equal to 7.2.
  • PNPBu (Sigma, USA, cat # N9876-5G) is used as the assay substrate. 10 µl of PNPBu is mixed with 25 ml of 0.01 M phosphate buffer using a magnetic stirrer to obtain a 2 mM stock solution. The solution is stable for 2 days with storage at 4°C.
  • the stop reagent (0.25 M Tris-HCl, pH 8.5) is prepared as follows: 30.29 g of Tris is dissolved in 900 ml of distilled water (Solution A). The final 0.25 M Tris-HCl pH 8.5 is prepared by mixing solution A with 37% HCl until the pH of the resulting solution is equal to 8.5. The solution volume is adjusted to 1000 ml. This reagent is used to terminate the enzymatic reaction. Using the above reagents, the assay is performed as detailed below.
  • ΔA405 = AS (enzyme sample) - ASB (substrate blank)
  • DF is the enzyme dilution factor
  • 21 is the dilution of 10 µl enzyme solution in 210 µl reaction volume
  • 1.33 is the conversion factor of microtiter plates to cuvettes
  • 13.700 is the extinction coefficient (13700 M⁻¹ cm⁻¹) of p-nitrophenol released, corrected from mol/L to µmol/mL
  • 10 minutes is the reaction time.
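The terms defined above can be combined into an activity value. The formula itself does not appear in this excerpt, so the arrangement below is an assumption: the absorbance difference is multiplied by the dilution and plate-to-cuvette factors, then divided by the extinction coefficient and the reaction time. The function name and the sample readings are hypothetical.

```python
def esterase_activity(a_sample: float, a_blank: float, dilution_factor: float,
                      volume_factor: float = 21.0,   # 10 µl enzyme in 210 µl
                      plate_factor: float = 1.33,    # microtiter plate to cuvette
                      extinction: float = 13.700,    # mL µmol^-1 cm^-1
                      minutes: float = 10.0) -> float:
    """Return esterase activity in µmol p-nitrophenol per minute per mL.

    Assumed arrangement of the listed factors (not given in the source):
        activity = dA405 * DF * 21 * 1.33 / (13.700 * 10)
    """
    delta_a405 = a_sample - a_blank
    return (delta_a405 * dilution_factor * volume_factor * plate_factor
            / (extinction * minutes))


# Hypothetical readings: sample 0.620, substrate blank 0.120, 100-fold dilution.
activity = esterase_activity(a_sample=0.620, a_blank=0.120, dilution_factor=100)
print(f"{activity:.2f} U/mL")
```

Keeping each factor as a named default makes it easy to adapt the calculation if a different reaction volume or reaction time is used.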
  • the following assay is used to measure the enzymatic activity of a ferulic acid esterase towards wheat bran (WB) oligosaccharides by measuring the release of ferulic acid.
  • Wheat bran oligosaccharides are prepared by degradation of wheat bran (obtained from Nedalco, The Netherlands) by endo-xylanase HI from A. niger (enzyme collection Laboratory of Food Chemistry, Wageningen University, The Netherlands). 50 mg of WB is dissolved in 10 ml of 0.05 M acetate buffer pH 5.0 using a magnetic stirrer. 1.0 ml of WB stock solution is mixed with 0.0075 mg of the enzyme and incubated at 35°C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100°C. The residual material is removed by centrifugation (15 minutes at 14000 rpm), and the supernatant is used as the substrate in the assay detailed below.
  • 1.0 ml of wheat bran oligosaccharides stock solution is mixed with 0.005 mg of 0.05 M acetate buffer, pH 5.0, and incubated at 35°C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100°C. The release of ferulic acid is analyzed by measuring the absorbance at 335 nm.
  • the acetyl esterase activity assay measures the release of p-nitrophenol by the action of acetyl esterase on p-nitrophenyl acetate (PNPAc).
  • One acetyl esterase unit of activity is the amount of enzyme that liberates 1 micromole of p-nitrophenol in one minute at 37 °C and pH 5.
  • Sodium acetate buffer (0.05 M, pH 5.0) is prepared as follows. 4.1 g of anhydrous sodium acetate or 6.8 g of sodium acetate·3H2O is dissolved in distilled water so that the final volume of the solution is 1000 mL (Solution A). In a separate flask, 3.0 g (2.86 mL) of glacial acetic acid is mixed with distilled water to make a total volume of 1000 mL (Solution B). The final 0.05 M sodium acetate buffer, pH 5.0, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is equal to 5.0.
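Because Solutions A and B are equimolar, the Henderson-Hasselbalch equation gives a rough starting estimate of the mixing ratio. The sketch below is only an estimate: the pKa of 4.76 is an assumed value for acetic acid near 25 °C, and the procedure in the text relies on mixing until the measured pH reaches 5.0.

```python
PKA_ACETIC_ACID = 4.76  # assumed pKa of acetic acid near 25 °C


def acetate_fraction(target_ph: float, pka: float = PKA_ACETIC_ACID) -> float:
    """Return the volume fraction of Solution A (sodium acetate) needed when
    it is mixed with an equimolar Solution B (acetic acid)."""
    ratio = 10 ** (target_ph - pka)   # [acetate] / [acetic acid]
    return ratio / (1.0 + ratio)


fraction_a = acetate_fraction(5.0)
print(f"Solution A: {fraction_a * 1000:.0f} mL per litre of buffer")
print(f"Solution B: {(1 - fraction_a) * 1000:.0f} mL per litre of buffer")
```

The estimate comes to roughly 635 mL of Solution A and 365 mL of Solution B per litre; the final adjustment is still made against a pH meter.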
  • PNPAc from Fluka (Switzerland, cat. # 46021) is used as the assay substrate. 3.6 mg of PNPAc is dissolved in 10 mL of 0.05 M sodium acetate buffer using a magnetic stirrer to obtain a 2 mM stock solution. The solution is stable for 2 days on storage at 4 °C.
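The stock concentration follows directly from the weighed mass. The sketch below is a quick check, using a molar mass for p-nitrophenyl acetate (C8H7NO4) computed from standard atomic weights; the function name is hypothetical.

```python
MW_PNPAC = 181.15  # g/mol, p-nitrophenyl acetate (C8H7NO4)


def stock_concentration_mM(mass_mg: float, molar_mass: float,
                           volume_ml: float) -> float:
    """Return the concentration in mmol/L of mass_mg dissolved in volume_ml."""
    millimoles = mass_mg / molar_mass        # mg / (g/mol) = mmol
    return millimoles / volume_ml * 1000.0   # mmol/mL -> mmol/L


pnpac_stock = stock_concentration_mM(3.6, MW_PNPAC, 10.0)
print(f"PNPAc stock: {pnpac_stock:.2f} mM")
```

The result is just under 2 mM, consistent with the nominal value given in the procedure.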
  • the stop reagent (0.25 M Tris-HCl, pH 8.8) is prepared as follows. 30.29 g of Tris is dissolved in 900 mL of distilled water (Solution A). The final 0.25 M Tris-HCl pH 8.8 is prepared by mixing Solution A with 37% HCl until the pH of the resulting solution is equal to 8.8. The solution volume is adjusted to 1000 mL. This reagent is used to terminate the enzymatic reaction.
  • the assay is performed as detailed below.
  • ΔA405 = AS (enzyme sample) - ASB (substrate blank)
  • DF is the enzyme dilution factor
  • 21 is the dilution of 10 µl enzyme solution in 210 µl reaction volume
  • 1.33 is the conversion factor of microtiter plates to cuvettes
  • 13.700 is the extinction coefficient (13700 M⁻¹ cm⁻¹) of p-nitrophenol released, corrected from mol/L to µmol/mL
  • 10 minutes is the reaction time.
  • the following assay is used to measure glucuronyl esterase activity. This assay measures the release of 4-O-methyl-glucuronic acid by the action of the glucuronyl esterases on methyl-4-O-methyl-glucuronic acid.
  • Sodium acetate buffer (0.1 M, pH 5.0) is prepared as follows. 8.2 g of anhydrous sodium acetate or 13.6 g of sodium acetate·3H2O is dissolved in distilled water so that the final volume of the solution is 1000 mL (Solution A). In a separate flask, 6.0 g (5.72 mL) of glacial acetic acid is mixed with distilled water to make a total volume of 1000 mL (Solution B). The final 0.1 M sodium acetate buffer, pH 5.0, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is equal to 5.0.
  • the following assay was used to measure pectin and/or rhamnogalacturonan acetyl esterase activity. This assay measures the release of acetic acid by the action of the pectin acetyl esterase or rhamnogalacturonan acetyl esterase on sugar beet pectin.
  • pectin acetyl esterases or the rhamnogalacturonan acetyl esterase were incubated with sugar beet pectin at 50°C in 10 mM phosphate buffer pH 7.0 during 16 hours of incubation.
  • the E/S ratio was 0.5% (5 µg enzyme/mg substrate).
  • the total volume of the reaction was ⁇ ⁇ ,.
  • the released acetic acid was analyzed with the acetic acid assay kit according to the instructions of the supplier. Enzymes with known pectin acetyl esterase activity or rhamnogalacturonan acetyl esterase activity were used as a reference.

Abstract

This invention relates to novel enzymes and novel methods for producing the same. More specifically this invention relates to a variety of fungal enzymes. Nucleic acid molecules encoding such enzymes, compositions, recombinant and genetically modified host cells, and methods of use are described. The invention also relates to a method to convert lignocellulosic biomass to fermentable sugars with enzymes that degrade the lignocellulosic material and novel combinations of enzymes, including those that provide a synergistic release of sugars from plant biomass. The invention also relates to a method to release cellular content by degradation of cell walls. The invention also relates to methods to use the novel enzymes and compositions of such enzymes to improve the digestability of animal feed and in a variety of other processes, including food and beverage processing, baking, washing of clothing, detergent processes, biorefining, deinking and biobleaching of paper and pulp, and treatment of waste streams.

Description

Novel Fungal Esterases
[001] RELATED APPLICATIONS
[002] This application claims benefit of priority under 35 U.S.C. § 119(e) of United States Provisional Application number 61/420,531 filed on December 7, 2010.
[003] INCORPORATION BY REFERENCE
[004] The content of all patents, patent applications, publications, articles, or literature cited herein are expressly incorporated by reference.
[005] FIELD OF THE INVENTION
[006] This invention relates to novel enzymes and novel methods for producing the same.
Specifically this invention relates to enzymes produced by fungi. More specifically this invention relates to enzymes of fungal origin classified as esterases and produced by fungi. Esterases represent a category of various enzymes including but not limited to lipases, cutinases, phospholipases, phytases, acetylesterases like xylan acetylesterase, feruloylesterase, glucuronyl esterase, rhamnogalacturonan acetylesterase, pectin acetylesterase and pectin methylesterase. The invention also relates to a method to degrade lignocellulosic or cellulosic material and to novel combinations of enzymes, including those that provide a combined or synergistic release of sugars from plant biomass. The invention also relates to a method to modify specific plant cell wall components such as pectin, arabino(glucurono)xylan, acetylgalacto(gluco)mannan altering their physico-chemical properties and a method to provide a more complete release of monomeric and oligomeric constituents of such polysaccharides. The invention also relates to a method to release cellular contents by effecting degradation of the cell walls. The invention also relates to methods to use the novel enzymes and compositions of such enzymes in a variety of other processes, such as washing of clothing or fabrics, detergent processes, processes in the animal feed, food and beverage industries; biorefining, deinking and biobleaching of paper and pulp; and treatment of waste streams.
[007] BACKGROUND OF THE INVENTION
[008] Esterases are hydrolytic enzymes that split esters into an acid and an alcohol in a chemical reaction with water. Esterases represent a category of various enzymes including lipases, phospholipases and phytases that catalyze the hydrolysis and synthesis of ester bonds in compounds. Lipases are water-soluble enzymes that catalyze the hydrolysis of ester chemical bonds in water-insoluble lipid substrates; phospholipases hydrolyze phospholipids into fatty acids and other lipophilic substances; and phytases break down the undigestible phytic acid (phytate) part found in grains and oil seeds and thus release digestible phosphorus, calcium and other nutrients.
[009] Esterases are useful for the hydrolysis of complex carbohydrates. This also makes them useful in a variety of industrial textile applications, as well as industrial paper and pulp applications. Esterases are also employed in the food, feed, and beverage industry (e.g., to improve flavors, release minerals, degum vegetable oils, and ripen cheese). Esterases are also used in the pressing of agricultural material to increase yields (e.g., production of fruit juice) and in the treatment of waste streams. Esterases, such as lipases, are also used in detergent compositions, for the purpose of enhancing the cleaning ability of the composition together with cellulases or to act as a softening agent.
[010] For example, plant biomass mainly consists of cellulose, various hemicelluloses, pectins, lignin and some cutin waxes which form a protective layer on the outer surface of the intact plant cell wall. Large amounts of carbohydrates in plant biomass provide a plentiful source of potential energy in the form of sugars (both five carbon and six carbon sugars) that can be utilized for numerous industrial and agricultural processes. However, the enormous energy potential of these carbohydrates is currently under-utilized because the sugars are locked in complex polymers, and hence are not readily accessible for fermentation. Esterases can help hemicellulases and pectinases to degrade this material.
[011] Distillers' dried grains (DDG) are lignocellulosic byproducts of the corn dry milling process. Milled whole corn kernels are treated with amylases to liquefy the starch within the kernels and hydrolyze it to glucose. The glucose so produced is then fermented in a second step to ethanol. The residual solids after the ethanol fermentation and distillation are centrifuged and dried, and the resulting product is DDG, which is used as an animal feed stock. Although DDG composition can vary, a typical composition for DDG is: about 32% hemicellulose, 22% cellulose, 30% protein, 10% lipids, 4% residual starch, and 4% inorganics. In theory, the cellulose and hemicellulose fractions, comprising about 54% of the weight of the DDG, can be efficiently hydrolyzed to fermentable sugars by enzymes; however, it has been found that the carbohydrates comprising lignocellulosic materials in DDG are more difficult to digest due to the recalcitrant nature of these substrates. Currently the cost of producing the requisite enzymes is higher than the cost of producing amylases for starch hydrolysis.
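The composition figures above can be turned into a rough upper bound on recoverable sugar. The sketch below is illustrative: it uses the approximate percentages quoted in the paragraph, and the glucose calculation assumes complete hydrolysis of cellulose, with each anhydroglucose unit (162.14 g/mol) gaining one water molecule to give glucose (180.16 g/mol). The 1000 kg batch size is hypothetical.

```python
# Approximate DDG composition quoted in the text, in weight percent.
ddg_percent = {
    "hemicellulose": 32,
    "cellulose": 22,
    "protein": 30,
    "lipids": 10,
    "residual starch": 4,
    "inorganics": 4,
}

MW_GLUCOSE = 180.16         # g/mol
MW_ANHYDROGLUCOSE = 162.14  # g/mol, glucose unit within cellulose

# Structural polysaccharide fraction (cellulose plus hemicellulose).
polysaccharide_pct = ddg_percent["hemicellulose"] + ddg_percent["cellulose"]


def max_glucose_from_cellulose(ddg_kg: float) -> float:
    """Theoretical glucose yield (kg) if all cellulose were hydrolyzed."""
    cellulose_kg = ddg_kg * ddg_percent["cellulose"] / 100.0
    return cellulose_kg * MW_GLUCOSE / MW_ANHYDROGLUCOSE


glucose_kg = max_glucose_from_cellulose(1000.0)  # hypothetical 1000 kg batch
print(f"cellulose + hemicellulose: {polysaccharide_pct}% of DDG")
print(f"theoretical glucose from cellulose: {glucose_kg:.0f} kg per 1000 kg DDG")
```

Actual yields are lower because, as the paragraph notes, these substrates are recalcitrant to enzymatic digestion.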
[012] Major polysaccharides comprising lignocellulosic materials include cellulose and hemicelluloses. The enzymatic hydrolysis of these polysaccharides to soluble sugars (and finally to monomers such as glucose, xylose and other hexoses and pentoses) is catalyzed by several enzymes acting in concert. For example, endo-1,4-β-glucanases (EGs) and exo-cellobiohydrolases (CBHs) catalyze the hydrolysis of insoluble cellulose to cellooligosaccharides (with cellobiose the main product), while β-glucosidases (BGLs) convert the oligosaccharides to glucose. Similarly, xylanases and β-xylosidases act together with other enzymes that remove side chains and substituents, such as alpha-L-arabinofuranosidases, ferulic acid esterases and acetylxylan esterases. Glucuronylesterases catalyze the hydrolysis of hemicelluloses containing glucuronyl residues and thereby improve the hydrolysis of hemicellulose and thereby cellulose. Similarly, galacto(gluco)mannan, often partially acetylated, can be hydrolyzed by the concerted action of endo-mannanase, beta-mannosidase, alpha-galactosidase, and acetylesterase.
[013] Pectin is another complex polysaccharide which mainly consists of homogalacturonan and rhamnogalacturonan backbone structures. The alpha-1,4-linked galacturonic acids which occur in the backbone of these polysaccharide structures are often methylated at the C6 carboxyl moiety and can in addition be acetylated. Esterases such as pectin methylesterase and rhamnogalacturonan esterase modify the functionality of such polysaccharides. In concert with endo- and exo-polygalacturonases, rhamnogalacturonan hydrolases, pectin and pectate lyases, and with rhamnosidases and arabinofuranosidases that remove side chains from the backbone, they can break down these polysaccharides.
[014] Regardless of the type of feedstock, the cost and hydrolytic efficiency of enzymes are major factors that restrict the widespread use of biomass bioconversion processes. The production costs of microbially produced enzymes are tightly connected with the productivity of the enzyme-producing strain and the final activity yield in the fermentation broth. The hydrolytic efficiency of a multi-enzyme complex in, for example, the process of lignocellulose saccharification depends on the properties of the individual enzymes, the synergies between them, and their ratio in the multi-enzyme cocktail.
[015] As another example, biodiesel is commonly produced by the transesterification of vegetable oil or animal fat feedstock wherein the complex alcohol, glycerol, is substituted with a simple alcohol, such as ethanol or methanol. There are several methods for carrying out this transesterification reaction including the common batch process, supercritical processes, ultrasonic methods, and even microwave methods. Biodiesel producers are increasingly interested in exploring more cost effective and environmentally greener ways to make biodiesel so that they can (1) avoid the labor-intensive glycerol removal process, (2) utilize ethanol instead of methanol, and (3) sidestep the hazardous use of sodium hydroxide or other strong alkali. Esterases can be utilized to make the biodiesel process more cost effective and environmentally friendly, and there exists the need to develop cheaper, more effective esterases to aid in the production of biodiesel.
[016] Furthermore, bioremediation for waste disposal is a new development of esterase biotechnology. Lipases are useful for the treatment of oil spills and lipid-tinged waste from factories and restaurants. Given the importance of this from a global perspective, it is important to engineer lipases that function in different environments to combat such pollution.
[017] Filamentous fungi are a rich source of polysaccharide degrading enzymes as well as a variety of other enzymes such as, but not limited to, acetylxylan esterases, feruloyl esterases, glucuronyl esterases, pectinesterases which are all useful in the enzymatic hydrolysis or modification of major polysaccharides. It is desirable to produce inexpensive enzymes and enzyme mixtures that efficiently degrade or modify such polysaccharides for use in a variety of agricultural and industrial applications.
[018] In spite of the continued research of the last few decades to understand enzymatic biomass degradation, it remains desirable to discover or to engineer new highly active hydrolases including esterases and cutinases. It would also be highly desirable to construct highly efficient enzyme compositions capable of performing rapid and efficient biodegradation of lignocellulosic materials.
[019] DETAILED DESCRIPTION OF THE INVENTION
[020] The present invention relates generally to proteins that play a role in the degradation and/or modification of hemicelluloses, pectins, waxes and other lipid substances and nucleic acids encoding the same. In particular, the present invention relates to enzymes isolated from a filamentous fungal strain denoted herein as C1 (Accession No. VKM F-3500-D), nucleic acids encoding the enzymes, and methods of producing and using the enzymes. The invention also provides compositions that include at least one of the enzymes described herein for uses including, but not limited to, the hydrolysis of lignocellulose. The invention stems, in part, from the discovery of a variety of novel esterases produced by the C1 fungus that exhibit activity toward hemicelluloses, pectins and other components of biomass such as phytate, lipids, waxes and the like.
[021] The present invention also provides methods and compositions for the conversion of plant biomass to fermentable sugars that can in turn be converted to useful products. Such products may include, without limitation, metabolites, bioplastics, biopolymers and biofuels. The methods include methods for degrading lignocellulosic material using enzyme mixtures to liberate sugars. The compositions of the invention include enzyme combinations that break down lignocellulose.
[022] As used herein, the terms "biomass" and "lignocellulosic material" include materials containing cellulose and/or hemicellulose. Generally, these materials also contain pectin, lignin, protein, carbohydrates (such as starch and sugar), lipids, and ash. Lignocellulose is generally found, for example, in the stems, leaves, hulls, husks, and cobs of plants or leaves, branches, and wood of trees.
[023] The process of converting less or more complex carbohydrates (such as starch or hemicellulose) into fermentable sugars is also referred to herein as "saccharification."
[024] Fermentable sugars, as used herein, refers to simple sugars, such as glucose, xylose, arabinose, galactose, mannose, rhamnose, sucrose and fructose.
[025] Biomass can include virgin biomass and/or non-virgin biomass such as agricultural biomass, commercial organics, construction and demolition debris, municipal solid waste, waste paper and yard waste. Common forms of biomass include trees, shrubs and grasses, wheat, wheat straw, sugar cane bagasse, sugar beet, soybean, corn, corn husks, corn kernel including fiber from kernels, products and by-products from milling of grains such as corn, tobacco, wheat and barley (including wet milling and dry milling). The biomass can also be, but is not limited to, herbaceous material, agricultural residues, forestry residues, and pulp and paper mill residues. "Agricultural biomass" includes branches, bushes, canes, corn and corn husks, energy crops, algae, fruits, flowers, grains, grasses, herbaceous crops, leaves, bark, needles, logs, roots, saplings, short rotation woody crops, shrubs, switch grasses, trees, vegetables, fruit peels, vines, sugar beet pulp, wheat middlings, oat hulls, peat moss, mushroom compost and hard and soft woods (not including woods with deleterious materials). In addition, agricultural biomass includes organic waste materials generated from agricultural processes including farming and forestry activities, specifically including forestry wood waste. Agricultural biomass may be any of the aforestated singularly or in any combination or mixture thereof.
[026] Energy crops are fast-growing crops that are grown for the specific purpose of producing energy, including without limitation, biofuels, from all or part of the plant. Energy crops can include crops that are grown (or are designed to grow) for their increased cellulose, xylose and sugar contents. Examples of such plants include, without limitation, switchgrass, willow and poplar. Energy crops may also include algae, for example, designer algae that are genetically engineered for enhanced production of hydrogen, alcohols, and oils, which can be further processed into diesel and jet fuels, as well as other bio-based products.
[027] Biomass high in starch, sugar, or protein such as corn, grains, fruits and vegetables is usually consumed as food. Conversely, biomass high in cellulose, hemicellulose and lignin is not readily digestible and is primarily utilized for wood and paper products, animal feed, or fuel, or is typically disposed of. Generally, the substrate is of high lignocellulose content, including distillers' dried grains, corn stover, corn cobs, rice straw, wheat straw, hay, sugarcane bagasse, sugar cane pulp, citrus peels and other agricultural biomass, switchgrass, forestry wastes, poplar wood chips, pine wood chips, sawdust, yard waste, and the like, including any combination thereof.
[028] In one embodiment, the lignocellulosic material is distillers' dried grains (DDG). DDG (also known as dried distiller's grain, or distiller's spent grain) is spent, dried grains recovered after alcohol fermentation. The lignocellulosic material can also be distiller's dried grain with soluble material recycled back (DDGS). While reference will be made herein to DDG for convenience and simplicity, it should be understood that both DDG and DDGS are contemplated as desired lignocellulosic materials. These are largely considered to be waste products and can be obtained after the fermentation of the starch derived from any of a number of grains, including corn, wheat, barley, oats, rice and rye. In one embodiment the DDG is derived from corn.
[029] It should be noted that the distiller's grains do not necessarily have to be dried. Although the grains are currently normally dried, water and enzymes are added to the DDG substrate in the present invention. If the saccharification were done on site, the drying step could be eliminated and enzymes could be added to the distiller's grains without drying.
[030] Due in part to the many components that comprise biomass and lignocellulosic materials, enzymes or a mixture of enzymes capable of degrading hemicellulose (such as xylan) and cellulose are needed to achieve saccharification. The present invention includes enzymes with esterase activities or compositions thereof with, for example, cellobiohydrolase, endoglucanase, xylanase, β-glucosidase, β-xylosidase and other hemicellulase activities.
[031] Fermentable sugars can be converted to useful value-added fermentation products, non-limiting examples of which include amino acids, vitamins, pharmaceuticals, animal feed supplements, specialty chemicals, chemical feedstocks, plastics, solvents, fuels, or other organic polymers, lactic acid, and ethanol, including fuel ethanol. Specific value-added products that may be produced by the methods of the invention include, but are not limited to, biofuels (including ethanol and butanol); lactic acid; plastics; specialty chemicals; organic acids, including citric acid, succinic acid, itaconic acid and maleic acid; solvents; animal feed supplements; pharmaceuticals; vitamins; amino acids, such as lysine, methionine, tryptophan, threonine, and aspartic acid; industrial enzymes, such as proteases, cellulases, amylases, glucanases, lactases, xylanases, arabinanases, lipases, lyases, oxidoreductases, transferases, and chemical feedstocks.
[032] The enzymes of the present invention may also be used for stone washing cellulosic fabrics such as cotton (e.g., denim), linen, hemp, ramie, cupro, lyocell, newcell, rayon and the like. See, for example, U.S. Patent No. 6,015,707. Enzymes and compositions of the present invention may also be used in the treatment of paper pulp (e.g., for improving the drainage or for de-inking of recycled paper) or for the treatment of wastewater streams (e.g., to improve hydrolysis of waste material containing cellulose, hemicellulose and pectins).
[033] The enzymes of the present invention may also be used to release the contents of a cell. In some embodiments, contacting or mixing the cells with the enzymes and/or compositions of the present invention will degrade the cell walls, preferentially those of plant origin, resulting in cell lysis and release of the cellular contents. In another embodiment the middle lamella will be degraded leading to separation of cells, a process called liquefaction. Alcohols and oils released from plants can be further processed to produce diesel, jet fuels, as well as other economically important bio-products such as flavourants or fragrances. The enzymes of the present invention may be used alone, or in combination with other enzymes, chemicals or biological materials. The enzymes of the present invention may be used for in vitro applications in which the enzymes or mixtures thereof are added to or mixed with the appropriate substrates to catalyze the desired reactions. Additionally, the enzymes of the present invention may be used for in vivo applications in which nucleic acid molecules encoding the enzymes are introduced into cells and are expressed therein to produce the enzymes and catalyze the desired reactions within the cells.
[034] In one aspect, the present invention includes proteins isolated from, or derived from the knowledge of enzymes from, a fungus such as Myceliophthora thermophila or a mutant or other derivative thereof, and more particularly, from the fungal strain denoted herein as C1 (Accession No. VKM F-3500-D). M. thermophila has previously appeared in patent applications and in the literature as Chrysosporium lucknowense or Sporotrichum thermophile. Preferably, the proteins of the invention possess enzymatic activity. As described in U.S. Patent No. 6,015,707 or U.S. Patent No. 6,573,086, a strain called C1 (Accession No. VKM F-3500-D) was isolated from samples of forest alkaline soil from Sola Lake, Far East of the Russian Federation. This strain was deposited at the All-Russian Collection of Microorganisms of the Russian Academy of Sciences (VKM), Bakhurhina St. 8, Moscow, Russia, 113184, under the terms of the Budapest Treaty on the International Regulation of the Deposit of Microorganisms for the Purposes of Patent Procedure on August 29, 1996, as Chrysosporium lucknowense Garg 27K, VKM-F 3500 D. Various mutant strains of M. thermophila C1 (previously C. lucknowense C1) have been constructed and these strains have also been deposited at the All-Russian Collection of Microorganisms of the Russian Academy of Sciences (VKM), Bakhurhina St. 8, Moscow, Russia, 113184, under the terms of the Budapest Treaty on the International Regulation of the Deposit of Microorganisms for the Purposes of Patent Procedure on September 2, 1998 or at the Centraal Bureau voor Schimmelcultures (CBS), Uppsalalaan 8, 3584 CT Utrecht, The Netherlands for the purposes of Patent Procedure on December 5, 2007. For example, Strain C1 was mutagenised by subjecting it to ultraviolet light to generate strain UV13-6 (Accession No. VKM F-3632 D). This strain was subsequently further mutated with N-methyl-N'-nitro-N-nitrosoguanidine to generate strain NG7C-19 (Accession No. VKM F-3633 D). This latter strain in turn was subjected to mutation by ultraviolet light, resulting in strain UV18-25 (Accession No. VKM F-3631 D). This strain in turn was again subjected to mutation by ultraviolet light, resulting in strain W1L (Accession No. CBS122189), which was subsequently subjected to mutation by ultraviolet light, resulting in strain W1L#100L (Accession No. CBS 122190). Strain C1 was initially classified as a Chrysosporium lucknowense based on morphological and growth characteristics of the microorganism, as discussed in detail in U.S. Patent No. 6,015,707, U.S. Patent No. 6,573,086 and patent PCT/NL2010/000045. The C1 strain was subsequently reclassified as Myceliophthora thermophila based on genetic tests. C. lucknowense has also appeared in the literature as Sporotrichum thermophile.
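The mutagenesis steps recited in paragraph [034] form a simple linear lineage, which can be sketched as a small table. This is only an illustrative summary of the text, not part of the disclosure: the function name `ancestry` is hypothetical, and the strain labels C1 and W1L#100L are the readings assumed here for the parent strain and for the last mutant (Accession No. CBS 122190).

```python
# Lineage as recited in paragraph [034]; each tuple is
# (parent strain, mutagen, resulting strain, accession number of the result).
# "C1" and "W1L#100L" are assumed readings of the strain labels.
LINEAGE = [
    ("C1", "ultraviolet light", "UV13-6", "VKM F-3632 D"),
    ("UV13-6", "N-methyl-N'-nitro-N-nitrosoguanidine", "NG7C-19", "VKM F-3633 D"),
    ("NG7C-19", "ultraviolet light", "UV18-25", "VKM F-3631 D"),
    ("UV18-25", "ultraviolet light", "W1L", "CBS122189"),
    ("W1L", "ultraviolet light", "W1L#100L", "CBS122190"),
]


def ancestry(strain):
    """Return the chain of strains from the original parent down to `strain`."""
    parent_of = {child: parent for parent, _, child, _ in LINEAGE}
    chain = [strain]
    # Walk upward until a strain with no recorded parent is reached.
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    return list(reversed(chain))


if __name__ == "__main__":
    print(" -> ".join(ancestry("W1L#100L")))
```

A strain that does not appear in the table is returned on its own, since no parent is recorded for it.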
[035] In certain embodiments of the present invention, a protein of the invention comprises, consists essentially of, or consists of an amino acid sequence selected from SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66, SEQ ID No: 68, SEQ ID No: 70, SEQ ID No: 72, SEQ ID No: 74, SEQ ID No: 76, SEQ ID No: 78, SEQ ID No: 80, SEQ ID No: 82, SEQ ID No: 84, SEQ ID No: 86, SEQ ID No: 88, SEQ ID No: 90, SEQ ID No: 92, SEQ ID No: 94, SEQ ID No: 96, SEQ ID No: 98, SEQ ID No: 100, SEQ ID No: 102, SEQ ID No: 104, SEQ ID No: 106, SEQ ID No: 108, SEQ ID No: 110, SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 116, SEQ ID No: 118, SEQ ID No: 120, SEQ ID No: 122, SEQ ID No: 124, SEQ ID No: 126, SEQ ID No: 128, SEQ ID No: 130, SEQ ID No: 132, SEQ ID No: 134, SEQ ID No: 136, SEQ ID No: 138, SEQ ID No: 140, SEQ ID No: 142, SEQ ID No: 144, SEQ ID No: 146, SEQ ID No: 148, SEQ ID No: 150, SEQ ID No: 152, SEQ ID No: 154, SEQ ID No: 156, SEQ ID No: 158, SEQ ID No: 160, SEQ ID No: 162, SEQ ID No: 164, SEQ ID No: 166, SEQ ID No: 168, SEQ ID No: 170, SEQ ID No: 172, SEQ ID No: 174, SEQ ID No: 176, SEQ ID No: 178, SEQ ID No: 180, SEQ ID No: 182, SEQ ID No: 184. The present invention also includes homologues or variants of any of the above sequences, including fragments and sequences having a given identity to any of the above sequences, wherein the homologue, variant, or fragment has at least one biological activity of the wild-type protein, as described herein.
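To make the notion of "sequences having a given identity" concrete, the following is a minimal sketch of one common way to compute percent identity between two sequences that have already been aligned. The convention used (counting only columns where neither sequence has a gap) is an assumption for illustration; the disclosure may define identity differently, and the toy sequences are hypothetical, not taken from any SEQ ID.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence has a gap ('-').

    Both inputs must be rows of the same pairwise alignment (equal length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 7 ungapped columns, 6 of them identical.
    print(round(percent_identity("MKTAYIAK", "MKTA-IAR"), 1))
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give different values for the same alignment, so any identity threshold should be read together with the definition it was computed under.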
[036] In general, the proteins disclosed herein have hydrolytic enzymatic activity and are classified as esterases, (phospho)lipases, sulphatases and phosphatases. Of particular interest are esterases with the ability to hydrolyze esters in carbohydrate-containing materials. A review of enzymes involved in the degradation of polysaccharides can be found in de Vries et al., Microbiol. Mol. Biol. Rev. 65:497-522 (2001). As used herein, "esterase" refers to any protein that possesses acetylxylan esterase, ferulic acid esterase, coumaryl esterase, pectin methyl esterase, glucuronyl esterase, rhamnogalacturonan acetyl esterase or acetyl(gluco)mannan esterase activity. Other proteins of interest may possess lipase, phospholipase, cutinase, phytase or sulphatase activity.
[037] As used herein, "carbohydrase" refers to any protein that catalyzes the hydrolysis of carbohydrates. Endoglucanases, cellobiohydrolases, β-glucosidases, α-glucosidases, xylanases, β-xylosidases, galactanases, α-galactosidases, β-galactosidases, α-amylases, glucoamylases, endo-arabinases, arabinofuranosidases, mannanases, β-mannosidases, pectinases, acetyl xylan esterases, acetyl mannan esterases, ferulic acid esterases, coumaric acid esterases, pectin methyl esterases, and chitosanases are examples of glycosidases.
[038] "Hemicellulase" refers to a protein that catalyzes the hydrolysis of hemicellulose, such as that found in lignocellulosic materials. Hemicelluloses are complex polymers, and their composition often varies widely from organism to organism, and from one tissue type to another. Hemicelluloses include a variety of compounds, such as xylans, arabinoxylans, xyloglucans, mannans, glucomannans, and galactomannans. Hemicellulose can also contain glucan, which is a general term for beta-linked glucose residues. In general, a main component of hemicellulose is beta-1,4-linked xylose, a five carbon sugar. However, this xylose is often branched as beta-1,3 linkages or beta-1,2 linkages, and can be substituted with linkages to arabinose, galactose, mannose, glucuronic acid, or by esterification to acetic acid. The composition, nature of substitution, and degree of branching of hemicellulose are very different in dicotyledonous plants (dicots, i.e., plants whose seeds have two cotyledons or seed leaves, such as lima beans, peanuts, almonds, peas, kidney beans) as compared to monocotyledonous plants (monocots, i.e., plants having a single cotyledon or seed leaf, such as corn, wheat, rice, grasses, barley). In dicots, hemicellulose is comprised mainly of xyloglucans that are 1,4-beta-linked glucose chains with 1,6-alpha-linked xylosyl side chains. In monocots, including most grain crops, the principal components of hemicellulose are heteroxylans. These are primarily comprised of 1,4-beta-linked xylose backbone polymers with 1,2- or 1,3-beta linkages to arabinose, galactose and mannose as well as xylose modified by ester-linked acetic acids. Also present are branched beta glucans comprised of 1,3- and 1,4-beta-linked glucosyl chains. In monocots, cellulose, heteroxylans and beta glucans are present in roughly equal amounts, each comprising about 15-25% of the dry matter of cell walls. Hemicellulolytic enzymes, i.e. hemicellulases, include both endo-acting and exo-acting enzymes, such as xylanases, β-xylosidases, galactanases, α-galactosidases, β-galactosidases, endo-arabinases, arabinofuranosidases, mannanases, and β-mannosidases. Hemicellulases also include the accessory enzymes, such as alpha-glucuronidases, acetylesterases, glucuronyl esterases, ferulic acid esterases, and coumaric acid esterases. Among these, xylanases and acetyl xylan esterases cleave the xylan and acetyl side chains of xylan and the remaining xylo-oligomers are unsubstituted and can thus be hydrolysed with β-xylosidase only. In addition, several less known side activities have been found in enzyme preparations which hydrolyze hemicellulose. Accordingly, xylanases, acetylesterases and β-xylosidases are examples of hemicellulases. Similarly the other accessory enzymes mentioned remove glucuronic acid, ferulic acid and coumaric acid which also form obstacles for complete degradation of the hemicellulose structure.
[039] "Xylanase" specifically refers to an enzyme that hydrolyzes the β-1,4 bond in the xylan backbone, producing short xylooligosaccharides.
[040] "β-Mannanase" or "endo-1,4-β-mannosidase" refers to a protein that hydrolyzes mannan-based hemicelluloses (mannan, glucomannan, galactomannan) and produces short β-1,4-mannooligosaccharides.
[041] "Mannan endo-1,6-α-mannosidase" refers to a protein that hydrolyzes 1,6-α-mannosidic linkages in unbranched 1,6-mannans.
[042] "β-Mannosidase" (β-1,4-mannoside mannohydrolase; EC 3.2.1.25) refers to a protein that catalyzes the removal of β-D-mannose residues from the nonreducing ends of oligosaccharides.
[043] "Galactanase", "endo-β-1,6-galactanase" or "arabinogalactan endo-1,4-β-galactosidase" refers to proteins that catalyze the hydrolysis of endo-1,6-β-D-galactosidic or endo-1,4-β-D-galactosidic linkages in arabinogalactans.
[044] "α-L-arabinofuranosidase", "α-N-arabinofuranosidase", "α-arabinofuranosidase", "arabinosidase" or "arabinofuranosidase" refers to a protein that hydrolyzes arabinofuranosyl-containing hemicelluloses or pectins. Some of these enzymes remove arabinofuranoside residues from O-2 or O-3 single substituted xylose residues, as well as from O-2 and/or O-3 double substituted xylose residues. Some of these enzymes remove arabinose residues from arabinan oligomers.
[045] "Endo-arabinase" refers to a protein that catalyzes the hydrolysis of 1,5-α-arabinofuranosidic linkages in 1,5-arabinans.
[046] "Exo-arabinase" refers to a protein that catalyzes the hydrolysis of 1,5-α-linkages in 1,5-arabinans or 1,5-α-L arabino-oligosaccharides, releasing mainly arabinobiose, although a small amount of arabinotriose can also be liberated.
[047] "β-xylosidase" refers to a protein that hydrolyzes short 1,4-β-D-xylooligomers into xylose.
[048] "α-Glucuronidase" refers to a protein that hydrolyzes the 1,2-α-glucuronic acid linkages in hemicelluloses.
[049] "Acetyl xylan esterase" refers to a protein that catalyzes the removal of the acetyl groups from xylose residues. "Acetyl mannan esterase" refers to a protein that catalyzes the removal of the acetyl groups from mannose residues. "Ferulic esterase" or "ferulic acid esterase" refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and ferulic acid. "Coumaric acid esterase" refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and coumaric acid. "Glucuronyl esterase" refers to a protein that hydrolyzes the ester bond between glucuronic acid and lignin. Acetyl xylan esterases, glucuronyl esterases, ferulic acid esterases and coumaric acid esterases are examples of carbohydrate esterases.
[050] Pectin refers to polysaccharides which are composed of homogalacturonan and rhamnogalacturonan. Homogalacturonan is composed of alpha-1,4-linked galacturonic acid residues which may be methyl esterified at the C6 carboxylate function and/or acetylated at the C2 or C3 position.
[051] Rhamnogalacturonan is composed of alternating α-1,4-rhamnose and α-1,2-linked galacturonic acid, with side chains linked 1,4 to rhamnose. The side chains include Type I galactan, which is β-1,4-linked galactose with α-1,3-linked arabinose substituents; Type II galactan, which is β-1,3-1,6-linked galactoses (very branched) with arabinose substituents; and arabinan, which is α-1,5-linked arabinose with α-1,3-linked arabinose branches. The galacturonic acid substituents may be acetylated and/or methylated.
[052] Pectinolytic enzymes include both endo-acting and exo-acting enzymes, such as polygalacturonases, pectin and pectate lyases, arabinofuranosidases, rhamnosidases and several esterases like pectin methyl esterases. These and some other enzymes, such as ferulic acid esterases, are suitable for use in multi-enzyme compositions to degrade pectin materials.
[053] "Pectin methyl esterase" refers to a protein that catalyzes the removal of the methyl groups ester-linked to the carboxylic acid residues in galacturonic acid.
[054] "Rhamnogalacturonan acetylesterase" refers to a protein that catalyzes the removal of the acetyl groups ester-linked to the highly branched rhamnogalacturonan (hairy) regions of pectin.
[055] "Pectin acetyl esterase" refers to a protein that catalyzes the removal of the acetyl groups ester-linked to the homogalacturonan (smooth) regions of pectin.
[056] Esterases active on pectin are further examples of carbohydrate esterases.
[057] "Polygalacturonase" refers to a protein that catalyzes the hydrolysis of alpha-1,4-linked galacturonic acid residues from homogalacturonan, thus converting polygalacturonides to galacturonic acid or galacturonic acid oligosaccharides.
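The carbohydrate esterase definitions in paragraphs [049] to [055] can be condensed into a lookup table. The sketch below is only a reading aid under that assumption: the dictionary name `ESTERASE_TARGETS` and the helper `esterases_mentioning` are hypothetical, and the target descriptions are shortened paraphrases of the definitions above rather than additional disclosure.

```python
# Enzyme name -> what the definitions above say it removes or hydrolyzes.
ESTERASE_TARGETS = {
    "acetyl xylan esterase": "acetyl groups on xylose residues",
    "acetyl mannan esterase": "acetyl groups on mannose residues",
    "ferulic acid esterase": "ester bond between the arabinose substituent group and ferulic acid",
    "coumaric acid esterase": "ester bond between the arabinose substituent group and coumaric acid",
    "glucuronyl esterase": "ester bond between glucuronic acid and lignin",
    "pectin methyl esterase": "methyl groups ester-linked to the carboxylic acid residues in galacturonic acid",
    "rhamnogalacturonan acetylesterase": "acetyl groups ester-linked to the rhamnogalacturonan (hairy) regions of pectin",
    "pectin acetyl esterase": "acetyl groups ester-linked to the homogalacturonan (smooth) regions of pectin",
}


def esterases_mentioning(keyword):
    """Return, sorted by name, the esterases whose target description contains `keyword`."""
    key = keyword.lower()
    return sorted(
        name for name, target in ESTERASE_TARGETS.items() if key in target.lower()
    )


if __name__ == "__main__":
    for name in esterases_mentioning("acetyl"):
        print(name, "->", ESTERASE_TARGETS[name])
```

The search is a plain substring match on the paraphrased descriptions, so it is useful for browsing the definitions but should not be read as a substrate-specificity prediction.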
[058] "Rhamnogalacturonan hydrolase" refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin to galacturonic acid or rhamnogalacturonan oligosaccharides.
[059] "Pectate lyase" refers to a protein that catalyzes the cleavage of 1,4-α-D-galacturonan by beta-elimination acting on polymeric and/or oligosaccharide substrates.
[060] "Pectin lyase" refers to a protein that catalyzes the cleavage of 1,4-α-D-galacturonan by beta-elimination acting on polymeric and/or oligosaccharide substrates. The action of the enzyme is not hindered by acetyl esters.
[061] "Rhamnogalacturonan lyase" refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin via a β-elimination mechanism (see, e.g., Pages et al., J. Bacteriol. 185:4727-4733 (2003)).
[062] Glycosidases (glycoside hydrolases; GH), a large family of enzymes that includes cellulases and hemicellulases, catalyze the hydrolysis of glycosidic linkages, predominantly in carbohydrates. Glycosidases such as the proteins of the present invention may be assigned to families on the basis of sequence similarities, and there are now over 100 different such families defined (see the CAZy (Carbohydrate Active EnZymes database) website, maintained by the Architecture et Fonction des Macromolécules Biologiques of the Centre National de la Recherche Scientifique, which describes the families of structurally-related catalytic and carbohydrate-binding modules (or functional domains) of enzymes that degrade, modify, or create glycosidic bonds; Coutinho, P.M. & Henrissat, B. (1999) Carbohydrate-active enzymes: an integrated database approach. In "Recent Advances in Carbohydrate Bioengineering", H.J. Gilbert, G. Davies, B. Henrissat and B. Svensson eds., The Royal Society of Chemistry, Cambridge, pp. 3-12). Because there is a direct relationship between the amino acid sequence of a protein and its folding similarities, such a classification reflects the structural features of these enzymes and their substrate specificity. Such a classification system can help to reveal the evolutionary relationships between these enzymes and provide a convenient tool to determine information such as an enzyme's activity and function. Thus, enzymes assigned to a particular family based on sequence homology with other members of the family are expected to have similar enzymatic activities and related substrate specificities. CAZy family classifications also exist for glycosyltransferases (GT), polysaccharide lyases (PL), and carbohydrate esterases (CE). Likewise, sequence homology may be used to identify particular domains within proteins, such as cellulose binding modules (CBMs; also known as cellulose binding domains (CBDs)). An enzyme assigned to a particular CAZy family may exhibit one or more of the enzymatic activities or substrate specificities associated with the CAZy family. In other embodiments, the enzymes of the present invention may exhibit one or more of the enzyme activities discussed above.
[063] Proteins of the present invention may also include homologues, variants and fragments of the proteins disclosed herein. The protein fragments include, but are not limited to, fragments comprising a catalytic domain (CD) and/or a cellulose-binding domain (also known as a cellulose or carbohydrate binding module (CBM); both are referred to herein as CBM). The identity and location of domains within proteins of the present invention are disclosed in detail below. The present invention encompasses all combinations of the disclosed domains. For example, a protein fragment may comprise a CD of a protein but not a CBM of the protein or a CBM of a protein but not a CD. Similarly, domains from different proteins may be combined. Protein fragments comprising a CD, CBM or combinations thereof for each protein disclosed herein can be readily produced using standard techniques known in the art. In some embodiments, a protein fragment comprises a domain of a protein that has at least one biological activity of the full-length protein. Homologues or variants of proteins of the invention that have at least one biological activity of the full-length protein are described in detail below. As used herein, the phrase "biological activity" of a protein refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vitro or in vivo. In certain embodiments, a protein fragment comprises a domain of a protein that has the catalytic activity of the full-length enzyme. Specific biological activities of the proteins of the invention, and structures within the proteins that are responsible for the activities, are described below.
[064] Descriptions of the enzymes of the present invention are provided below, along with activities and homologies. Although each enzyme is expected to exhibit the activity exemplified below, enzymes of the present invention may also exhibit any of the enzyme activities or substrate specificities discussed throughout this disclosure.
[065] Esterases of the Present Invention
[066] Esterases represent a category of various enzymes including acetylxylan esterases, ferulic acid esterases, coumaryl esterases, pectin methyl esterases, glucuronyl esterases, rhamnogalacturonan acetyl esterases, acetyl(gluco)mannan esterases, lipases, phospholipases, cutinases, phytases or sulphatases that catalyze the hydrolysis and synthesis of ester bonds in compounds.
[067] The following is a listing of nucleic acid and amino acid sequences of the esterases of the present invention along with their expected activity.
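In the entries listed below, each enzyme ES n is paired with an odd-numbered nucleotide sequence and the next even-numbered amino acid sequence (ES 1 with SEQ ID Nos. 1 and 2, ES 2 with SEQ ID Nos. 3 and 4, and so on). That numbering rule is inferred from the listing rather than stated by the document, so the sketch below should be treated as a convenience for navigating the listing; the function names are hypothetical.

```python
def seq_ids_for(es_number):
    """Return (nucleotide SEQ ID No, amino acid SEQ ID No) for enzyme ES n.

    Assumes the pairing observed in the listing: nucleotide = 2n - 1, protein = 2n.
    """
    if es_number < 1:
        raise ValueError("ES numbers start at 1")
    return 2 * es_number - 1, 2 * es_number


def es_number_for(seq_id):
    """Return the ES number whose nucleotide or amino acid sequence has this SEQ ID No."""
    if seq_id < 1:
        raise ValueError("SEQ ID numbers start at 1")
    return (seq_id + 1) // 2


if __name__ == "__main__":
    nucleotide_id, protein_id = seq_ids_for(47)
    print(f"ES 47: nucleotide SEQ ID No: {nucleotide_id}, amino acid SEQ ID No: {protein_id}")
```

Under the same assumption, an odd SEQ ID No always denotes a nucleotide sequence and an even one an amino acid sequence, which matches the even-numbered list of protein sequences in paragraph [035].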
[068] The enzyme identified as sequence ES 1, 193_g is encoded by the nucleotides of SEQ ID No: 1, which encodes the amino acid sequence of SEQ ID NO: 2. This enzyme is believed to have feruloyl esterase activity.
[069] The enzyme identified as sequence ES 2, 1176_g is encoded by the nucleotides of SEQ ID No: 3, which encodes the amino acid sequence of SEQ ID No: 4. This enzyme is believed to have carboxylesterase activity.
[070] The enzyme identified as sequence ES 3, 1838_g is encoded by the nucleotides of SEQ ID No: 5, which encodes the amino acid sequence of SEQ ID No: 6. This enzyme is believed to have pectinesterase activity.
[071] The enzyme identified as sequence ES 4, 2778_g is encoded by the nucleotides of SEQ ID No: 7, which encodes the amino acid sequence of SEQ ID No: 8. This enzyme is believed to have phospholipase A2 activity.
[072] The enzyme identified as sequence ES 5, 2819_g is encoded by the nucleotides of SEQ ID No: 9, which encodes the amino acid sequence of SEQ ID No: 10. This enzyme is believed to have epoxide hydrolase activity.
[073] The enzyme identified as sequence ES 6, 3130_g is encoded by the nucleotides of SEQ ID No: 11, which encodes the amino acid sequence of SEQ ID No: 12. This enzyme is believed to have acetylxylan esterase activity.
[074] The enzyme identified as sequence ES 7, 3190_g is encoded by the nucleotides of SEQ ID No: 13, which encodes the amino acid sequence of SEQ ID No: 14. This enzyme is believed to have pectinesterase activity.
[075] The enzyme identified as sequence ES 8, 3597_g is encoded by the nucleotides of SEQ ID No: 15, which encodes the amino acid sequence of SEQ ID No: 16. This enzyme is believed to have sterol esterase activity.
[076] The enzyme identified as sequence ES 9, 3629_g is encoded by the nucleotides of SEQ ID No: 17, which encodes the amino acid sequence of SEQ ID No: 18. This enzyme is believed to have phospholipase A2 activity.
[077] The enzyme identified as sequence ES 10, 3696_g is encoded by the nucleotides of SEQ ID No: 19, which encodes the amino acid sequence of SEQ ID No: 20. This enzyme is believed to have carboxylesterase activity.
[078] The enzyme identified as sequence ES 11, 5193_g is encoded by the nucleotides of SEQ ID No: 21, which encodes the amino acid sequence of SEQ ID No: 22. This enzyme is believed to have carboxylesterase activity.
[079] The enzyme identified as sequence ES 12, 6264_g is encoded by the nucleotides of SEQ ID No: 23, which encodes the amino acid sequence of SEQ ID No: 24. This enzyme is believed to have acetylxylan esterase activity.
[080] The enzyme identified as sequence ES 13, 6416_g is encoded by the nucleotides of SEQ ID No: 25, which encodes the amino acid sequence of SEQ ID No: 26. This enzyme is believed to have acetylxylan esterase activity.
[081] The enzyme identified as sequence ES 14, 7039_g is encoded by the nucleotides of SEQ ID No: 27, which encodes the amino acid sequence of SEQ ID No: 28. This enzyme is believed to have feruloyl esterase activity.
[082] The enzyme identified as sequence ES 15, 8052_g is encoded by the nucleotides of SEQ ID No: 29, which encodes the amino acid sequence of SEQ ID No: 30. This enzyme is believed to have carboxylesterase activity.
[083] The enzyme identified as sequence ES 16, 8917_g is encoded by the nucleotides of SEQ ID No: 31, which encodes the amino acid sequence of SEQ ID No: 32. This enzyme is believed to have cutinase activity.
[084] The enzyme identified as sequence ES 17, 373_g is encoded by the nucleotides of SEQ ID No: 33, which encodes the amino acid sequence of SEQ ID No: 34. This enzyme is believed to have phospholipase activity.
[085] The enzyme identified as sequence ES 18, 611_g is encoded by the nucleotides of SEQ ID No: 35, which encodes the amino acid sequence of SEQ ID No: 36. This enzyme is believed to have lipase activity.
[086] The enzyme identified as sequence ES 19, 1024_g is encoded by the nucleotides of SEQ ID No: 37, which encodes the amino acid sequence of SEQ ID No: 38. This enzyme is believed to have lipase activity.
[087] The enzyme identified as sequence ES 20, 1257_g is encoded by the nucleotides of SEQ ID No: 39, which encodes the amino acid sequence of SEQ ID No: 40. This enzyme is believed to have lipase activity.
[088] The enzyme identified as sequence ES 21, 1406_g is encoded by the nucleotides of SEQ ID No: 41, which encodes the amino acid sequence of SEQ ID No: 42. This enzyme is believed to have phospholipase C activity.
[089] The enzyme identified as sequence ES 22, 1407_g is encoded by the nucleotides of SEQ ID No: 43, which encodes the amino acid sequence of SEQ ID No: 44. This enzyme is believed to have phospholipase C activity.
[090] The enzyme identified as sequence ES 23, 1416_g is encoded by the nucleotides of SEQ ID No: 45, which encodes the amino acid sequence of SEQ ID No: 46. This enzyme is believed to have lipase activity.
[091] The enzyme identified as sequence ES 24, 341 _g is encoded by the nucleotides of SEQ ID No: 47, which encodes the amino acid sequence of SEQ ID No: 48. This enzyme is believed to have phospholipase C activity.
[092] The enzyme identified as sequence ES 25, 4299_g is encoded by the nucleotides of SEQ ID No: 49, which encodes the amino acid sequence of SEQ ID No: 50. This enzyme is believed to have phospholipase activity.
[093] The enzyme identified as sequence ES 26, 4588_g is encoded by the nucleotides of SEQ ID No: 51, which encodes the amino acid sequence of SEQ ID No: 52. This enzyme is believed to have lipase activity.
[094] The enzyme identified as sequence ES 27, 5125_g is encoded by the nucleotides of SEQ ID No: 53, which encodes the amino acid sequence of SEQ ID No: 54. This enzyme is believed to have phospholipase C activity.
[095] The enzyme identified as sequence ES 28, 5310_g is encoded by the nucleotides of SEQ ID No: 55, which encodes the amino acid sequence of SEQ ID No: 56. This enzyme is believed to have lipase activity.
[096] The enzyme identified as sequence ES 29, 5865_g is encoded by the nucleotides of SEQ ID No: 57, which encodes the amino acid sequence of SEQ ID No: 58. This enzyme is believed to have phospholipase activity.
[097] The enzyme identified as sequence ES 30, 5916_g is encoded by the nucleotides of SEQ ID No: 59, which encodes the amino acid sequence of SEQ ID No: 60. This enzyme is believed to have lysophospholipase activity.
[098] The enzyme identified as sequence ES 31, 6098_g is encoded by the nucleotides of SEQ ID No: 61, which encodes the amino acid sequence of SEQ ID No: 62. This enzyme is believed to have phospholipase D activity.
[099] The enzyme identified as sequence ES 32, 7468_g is encoded by the nucleotides of SEQ ID No: 63, which encodes the amino acid sequence of SEQ ID No: 64. This enzyme is believed to have lipase activity.
[0100] The enzyme identified as sequence ES 33, scaffold00131.path1.gene867, is encoded by the nucleotides of SEQ ID No: 65, which encodes the amino acid sequence of SEQ ID No: 66. This enzyme is believed to have lipase activity.
[0101] The enzyme identified as sequence ES 34, 8020_g is encoded by the nucleotides of SEQ ID No: 67, which encodes the amino acid sequence of SEQ ID No: 68. This enzyme is believed to have phospholipase C activity.
[0102] The enzyme identified as sequence ES 35, 8606_g is encoded by the nucleotides of SEQ ID No: 69, which encodes the amino acid sequence of SEQ ID No: 70. This enzyme is believed to have lipase activity.
[0103] The enzyme identified as sequence ES 36, 8708_g is encoded by the nucleotides of SEQ ID No: 71, which encodes the amino acid sequence of SEQ ID No: 72. This enzyme is believed to have lipase activity.
[0104] The enzyme identified as sequence ES 37, 9212_g is encoded by the nucleotides of SEQ ID No: 73, which encodes the amino acid sequence of SEQ ID No: 74. This enzyme is believed to have lipase activity.
[0105] The enzyme identified as sequence ES 38, scaffold00016.G654, is encoded by the nucleotides of SEQ ID No: 75, which encodes the amino acid sequence of SEQ ID No: 76. This enzyme is believed to have acetylxylan esterase activity.
[0106] The enzyme identified as sequence ES 39, scaffold00016.G247, is encoded by the nucleotides of SEQ ID No: 77, which encodes the amino acid sequence of SEQ ID No: 78. This enzyme is believed to have carboxylesterase activity.
[0107] The enzyme identified as sequence ES 40, scaffold00050.G653 is encoded by the nucleotides of SEQ ID No: 79, which encodes the amino acid sequence of SEQ ID No: 80. This enzyme is believed to have feruloyl esterase activity.
[0108] The enzyme identified as sequence ES 41, scaffold00071.G701 is encoded by the nucleotides of SEQ ID No: 81, which encodes the amino acid sequence of SEQ ID No: 82. This enzyme is believed to have feruloyl esterase activity.
[0109] The enzyme identified as sequence ES 42, scaffold00050.G259 is encoded by the nucleotides of SEQ ID No: 83, which encodes the amino acid sequence of SEQ ID No: 84. This enzyme is believed to have lipase activity.
[0110] The enzyme identified as sequence ES 43, scaffold00092.G669, is encoded by the nucleotides of SEQ ID No: 85, which encodes the amino acid sequence of SEQ ID No: 86. This enzyme is believed to have lipase activity.
[0111] The enzyme identified as sequence ES 44, scaffold00016.pathl.gene410, is encoded by the nucleotides of SEQ ID No: 87, which encodes the amino acid sequence of SEQ ID No: 88. This enzyme is believed to have phospholipase C activity.
[0112] The enzyme identified as sequence ES 45, scaffold00050.pathl.gene302, is encoded by the nucleotides of SEQ ID No: 89, which encodes the amino acid sequence of SEQ ID No: 90. This enzyme is believed to have feruloyl esterase activity.
[0113] The enzyme identified as sequence ES 46, scaffold00050.pathl.gene704, is encoded by the nucleotides of SEQ ID No: 91, which encodes the amino acid sequence of SEQ ID No: 92. This enzyme is believed to have feruloyl esterase activity.
[0114] The enzyme identified as sequence ES 47, scaffold00031.pathl.gene803 is encoded by the nucleotides of SEQ ID No: 93, which encodes the amino acid sequence of SEQ ID No: 94. This enzyme is believed to have esterase activity.
[0115] The enzyme identified as sequence ES 48, scaffold00031.pathl.gene916, is encoded by the nucleotides of SEQ ID No: 95, which encodes the amino acid sequence of SEQ ID No: 96. This enzyme is believed to have phospholipase D activity.
[0116] The enzyme identified as sequence ES 49, 696_g is encoded by the nucleotides of SEQ ID No: 97, which encodes the amino acid sequence of SEQ ID No: 98. This enzyme is believed to have Lipase activity.
[0117] The enzyme identified as sequence ES 50, 1442_g is encoded by the nucleotides of SEQ ID No: 99, which encodes the amino acid sequence of SEQ ID No: 100. This enzyme is believed to have esterase activity.
[0118] The enzyme identified as sequence ES 51, 1456_g is encoded by the nucleotides of SEQ ID No: 101, which encodes the amino acid sequence of SEQ ID No: 102. This enzyme is believed to have Lipase activity.
[0119] The enzyme identified as sequence ES 52, 1940_g is encoded by the nucleotides of SEQ ID No: 103, which encodes the amino acid sequence of SEQ ID No: 104. This enzyme is believed to have Lipase activity.
[0120] The enzyme identified as sequence ES 53, 3461_g is encoded by the nucleotides of SEQ ID No: 105, which encodes the amino acid sequence of SEQ ID No: 106. This enzyme is believed to have sulfuric ester hydrolase activity.
[0121] The enzyme identified as sequence ES 54, 3637_g is encoded by the nucleotides of SEQ ID No: 107, which encodes the amino acid sequence of SEQ ID No: 108. This enzyme is believed to have Lipase activity.
[0122] The enzyme identified as sequence ES 55, 3638_g is encoded by the nucleotides of SEQ ID No: 109, which encodes the amino acid sequence of SEQ ID No: 110. This enzyme is believed to have Lipase activity.
[0123] The enzyme identified as sequence ES 56, 4267_g is encoded by the nucleotides of SEQ ID No: 111, which encodes the amino acid sequence of SEQ ID No: 112. This enzyme is believed to have Lipase activity.
[0124] The enzyme identified as sequence ES 57, 4984_g is encoded by the nucleotides of SEQ ID No: 113, which encodes the amino acid sequence of SEQ ID No: 114. This enzyme is believed to have Lipase activity.
[0125] The enzyme identified as sequence ES 58, 5066_g is encoded by the nucleotides of SEQ ID No: 115, which encodes the amino acid sequence of SEQ ID No: 116. This enzyme is believed to have Lipase activity.
[0126] The enzyme identified as sequence ES 59, 5391_g is encoded by the nucleotides of SEQ ID No: 117, which encodes the amino acid sequence of SEQ ID No: 118. This enzyme is believed to have Lipase activity.
[0127] The enzyme identified as sequence ES 60, 5406_g is encoded by the nucleotides of SEQ ID No: 119, which encodes the amino acid sequence of SEQ ID No: 120. This enzyme is believed to have Lipase activity.
[0128] The enzyme identified as sequence ES 61, 5765_g is encoded by the nucleotides of SEQ ID No: 121, which encodes the amino acid sequence of SEQ ID No: 122. This enzyme is believed to have Lipase activity.
[0129] The enzyme identified as sequence ES 62, 7449_g is encoded by the nucleotides of SEQ ID No: 123, which encodes the amino acid sequence of SEQ ID No: 124. This enzyme is believed to have esterase activity.
[0130] The enzyme identified as sequence ES 63, 7692_g is encoded by the nucleotides of SEQ ID No: 125, which encodes the amino acid sequence of SEQ ID No: 126. This enzyme is believed to have Lipase activity.
[0131] The enzyme identified as sequence ES 64, 8351_g is encoded by the nucleotides of SEQ ID No: 127, which encodes the amino acid sequence of SEQ ID No: 128. This enzyme is believed to have Lipase activity.
[0132] The enzyme identified as sequence ES 65, 8598_g is encoded by the nucleotides of SEQ ID No: 129, which encodes the amino acid sequence of SEQ ID No: 130. This enzyme is believed to have esterase activity.
[0133] The enzyme identified as sequence ES 66, 8626_g is encoded by the nucleotides of SEQ ID No: 131, which encodes the amino acid sequence of SEQ ID No: 132. This enzyme is believed to have Lipase activity.
[0134] The enzyme identified as sequence ES 67, 8665_g is encoded by the nucleotides of SEQ ID No: 133, which encodes the amino acid sequence of SEQ ID No: 134. This enzyme is believed to have Lipase activity.
[0135] The enzyme identified as sequence ES 68, 8742_g is encoded by the nucleotides of SEQ ID No: 135, which encodes the amino acid sequence of SEQ ID No: 136. This enzyme is believed to have Lipase activity.
[0136] The enzyme identified as sequence ES 69, 9705_g is encoded by the nucleotides of SEQ ID No: 137, which encodes the amino acid sequence of SEQ ID No: 138. This enzyme is believed to have Lipase activity.
[0137] The enzyme identified as sequence ES 70, 1839_g is encoded by the nucleotides of SEQ ID No: 139, which encodes the amino acid sequence of SEQ ID No: 140. This enzyme is believed to have Lipase activity.
[0138] The enzyme identified as sequence ES 71, 937_g is encoded by the nucleotides of SEQ ID No: 141, which encodes the amino acid sequence of SEQ ID No: 142. This enzyme is believed to have esterase/lipase activity.
[0139] The enzyme identified as sequence ES 72, 1237_g is encoded by the nucleotides of SEQ ID No: 143, which encodes the amino acid sequence of SEQ ID No: 144. This enzyme is believed to have lipase activity.
[0140] The enzyme identified as sequence ES 73, 2253_g is encoded by the nucleotides of SEQ ID No: 145, which encodes the amino acid sequence of SEQ ID No: 146. This enzyme is believed to have esterase/lipase activity.
[0141] The enzyme identified as sequence ES 74, 3232_g is encoded by the nucleotides of SEQ ID No: 147, which encodes the amino acid sequence of SEQ ID No: 148. This enzyme is believed to have esterase activity.
[0142] The enzyme identified as sequence ES 75, 4735_g is encoded by the nucleotides of SEQ ID No: 149, which encodes the amino acid sequence of SEQ ID No: 150. This enzyme is believed to have esterase/lipase activity.
[0143] The enzyme identified as sequence ES 76, 4736_g is encoded by the nucleotides of SEQ ID No: 151, which encodes the amino acid sequence of SEQ ID No: 152. This enzyme is believed to have esterase/lipase activity.
[0144] The enzyme identified as sequence ES 77, 5568_g is encoded by the nucleotides of SEQ ID No: 153, which encodes the amino acid sequence of SEQ ID No: 154. This enzyme is believed to have phospholipase D activity.
[0145] The enzyme identified as sequence ES 78, 6656_g is encoded by the nucleotides of SEQ ID No: 155, which encodes the amino acid sequence of SEQ ID No: 156. This enzyme is believed to have esterase activity.
[0146] The enzyme identified as sequence ES 79, 6881_g is encoded by the nucleotides of SEQ ID No: 157, which encodes the amino acid sequence of SEQ ID No: 158. This enzyme is believed to have esterase activity.
[0147] The enzyme identified as sequence ES 80, 7451_g is encoded by the nucleotides of SEQ ID No: 159, which encodes the amino acid sequence of SEQ ID No: 160. This enzyme is believed to have esterase activity.
[0148] The enzyme identified as sequence ES 81, 7776_g is encoded by the nucleotides of SEQ ID No: 161, which encodes the amino acid sequence of SEQ ID No: 162. This enzyme is believed to have esterase activity.
[0149] The enzyme identified as sequence ES 82, 8080_g is encoded by the nucleotides of SEQ ID No: 163, which encodes the amino acid sequence of SEQ ID No: 164. This enzyme is believed to have esterase activity.
[0150] The enzyme identified as sequence ES 83, 8408_g is encoded by the nucleotides of SEQ ID No: 165, which encodes the amino acid sequence of SEQ ID No: 166. This enzyme is believed to have esterase activity.
[0151] The enzyme identified as sequence ES 84, 9687_g is encoded by the nucleotides of SEQ ID No: 167, which encodes the amino acid sequence of SEQ ID No: 168. This enzyme is believed to have esterase/lipase activity.
[0152] The enzyme identified as sequence ES 85, 9709_g is encoded by the nucleotides of SEQ ID No: 169, which encodes the amino acid sequence of SEQ ID No: 170. This enzyme is believed to have esterase/lipase activity.
[0153] The enzyme identified as sequence ES 86, 8079_g is encoded by the nucleotides of SEQ ID No: 171, which encodes the amino acid sequence of SEQ ID No: 172. This enzyme is believed to have phospholipase activity.
[0154] The enzyme identified as sequence ES 87, scaffold00071.G611, is encoded by the nucleotides of SEQ ID No: 173, which encodes the amino acid sequence of SEQ ID No: 174. This enzyme is believed to have lipase activity.
[0155] The enzyme identified as sequence ES 88, scaffold00071.G3443, is encoded by the nucleotides of SEQ ID No: 175, which encodes the amino acid sequence of SEQ ID No: 176. This enzyme is believed to have phospholipase activity.
[0156] The enzyme identified as sequence ES 89, scaffold00031.pathl.gene288, is encoded by the nucleotides of SEQ ID No: 177, which encodes the amino acid sequence of SEQ ID No: 178. This enzyme is believed to have lipase activity.
[0157] The enzyme identified as sequence ES 90, scaffold00092.pathl.gene71, is encoded by the nucleotides of SEQ ID No: 179, which encodes the amino acid sequence of SEQ ID No: 180. This enzyme is believed to have phospholipase activity.
[0158] The enzyme identified as sequence ES 91, scaffold00050.pathl.gene362, is encoded by the nucleotides of SEQ ID No: 181, which encodes the amino acid sequence of SEQ ID No: 182. This enzyme is believed to have esterase activity.
[0159] The enzyme identified as sequence ES 92, scaffold00227.pathl.gene278, is encoded by the nucleotides of SEQ ID No: 183, which encodes the amino acid sequence of SEQ ID No: 184. This enzyme is believed to have carboxylesterase activity.
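The paragraphs above follow a regular numbering pattern: for each enzyme ES n listed here (ES 29 through ES 92), the nucleotide sequence is SEQ ID No: 2n-1 and the amino acid sequence is SEQ ID No: 2n. The following minimal Python sketch illustrates that pattern only. The function name `seq_ids_for_es` is hypothetical, and applying the pattern outside the range ES 29 to ES 92 is an assumption not confirmed by this passage.

```python
from typing import Tuple


def seq_ids_for_es(es_number: int) -> Tuple[int, int]:
    """Return (nucleotide SEQ ID No, amino acid SEQ ID No) for enzyme ES n.

    Hypothetical helper: it encodes the pattern observed in the listing
    (ES 29 -> SEQ ID Nos: 57 and 58, ES 92 -> SEQ ID Nos: 183 and 184).
    """
    if es_number < 1:
        raise ValueError("ES number must be a positive integer")
    nucleotide_id = 2 * es_number - 1  # odd-numbered SEQ ID: nucleic acid
    amino_acid_id = 2 * es_number      # even-numbered SEQ ID: protein
    return nucleotide_id, amino_acid_id


if __name__ == "__main__":
    for n in (29, 53, 92):
        nt, aa = seq_ids_for_es(n)
        print(f"ES {n}: nucleotide SEQ ID No: {nt}, amino acid SEQ ID No: {aa}")
```

Such a lookup may be convenient when cross-referencing an ES identifier against the sequence listing, but the sequence listing itself remains the authoritative source.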
[0160] In some embodiments, the methods may be performed one or more times in whole or in part. That is, one may perform one or more pretreatments, followed by one or more reactions with a protein of the present invention, composition or product of the present invention and/or accessory enzyme. The enzymes may be added in a single dose, or may be added in a series of small doses. Further, the entire process may be repeated one or more times as necessary. Therefore, one or more additional treatments with heat and enzymes are contemplated.
[0161] Proteins of the present invention, at least one protein of the present invention, compositions comprising such protein(s) of the present invention, and multi-enzyme compositions (examples of which are described above) may be used in any method where it is desirable to hydrolyze glycosidic linkages in lignocellulosic material, or any other method wherein enzymes of the same or similar function are useful.
[0162] In one embodiment, the present invention includes the use of at least one protein of the present invention, compositions comprising at least one protein of the present invention, or multi-enzyme compositions in methods for hydrolyzing lignocellulose and the generation of fermentable sugars therefrom. In one embodiment, the method comprises contacting the lignocellulosic material with an effective amount of one or more proteins of the present invention, composition comprising at least one protein of the present invention, or a multi-enzyme composition, whereby at least one fermentable sugar is produced (liberated). The lignocellulosic material may be partially or completely degraded to fermentable sugars. Economical levels of degradation at commercially viable costs are contemplated.
[0163] Typically, the amount of enzyme or enzyme composition contacted with the lignocellulose will depend upon the amount of glucan and hemicellulose present in the lignocellulose. In some embodiments, the amount of enzyme or enzyme composition contacted with the lignocellulose may be from about 0.1 to about 200 mg enzyme or enzyme composition per gram of biomass dry weight; in other embodiments, from about 3 to about 20 mg enzyme or enzyme composition per gram of biomass dry weight. The invention encompasses the use of any suitable or sufficient amount of enzyme or enzyme composition between about 0.1 mg and about 200 mg enzyme per gram biomass dry weight, in increments of 0.05 mg (i.e., 0.1 mg, 0.15 mg, 0.2 mg... 199.9 mg, 199.95 mg, 200 mg).
[0164] In a further embodiment, the invention provides a method for degrading DDG, preferably, but not limited to, DDG derived from corn, to sugars. The method comprises contacting the DDG with a protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition. In certain embodiments, at least 10% of fermentable sugars are liberated. In other embodiments, at least 15% of the sugars are liberated, or at least 20% of the sugars are liberated, or at least 23% of the sugars are liberated, or at least 24% of the sugars are liberated, or at least 25% of the sugars are liberated, or at least 26% of the sugars are liberated, or at least 27% of the sugars are liberated, or at least 28% of the sugars are liberated.
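By way of illustration only, the dosing arithmetic described above (milligrams of enzyme or enzyme composition per gram of biomass dry weight) may be sketched in Python as follows. The function name `enzyme_mass_mg` and the 500 g batch size are hypothetical, and treating "about 0.1" and "about 200" as hard limits is a simplifying assumption of this sketch rather than a limitation of the described range.

```python
# Broad range given in the text, in mg enzyme per g biomass dry weight.
# Treating the "about" values as hard bounds is an assumption of this sketch.
MIN_DOSE_MG_PER_G = 0.1
MAX_DOSE_MG_PER_G = 200.0


def enzyme_mass_mg(biomass_dry_g: float, dose_mg_per_g: float) -> float:
    """Return the total enzyme mass (mg) for a batch of dry biomass."""
    if biomass_dry_g <= 0:
        raise ValueError("biomass dry weight must be positive")
    if not MIN_DOSE_MG_PER_G <= dose_mg_per_g <= MAX_DOSE_MG_PER_G:
        raise ValueError("dose is outside the 0.1 to 200 mg/g range")
    return biomass_dry_g * dose_mg_per_g


if __name__ == "__main__":
    batch_g = 500.0  # hypothetical batch: 500 g biomass dry weight
    for dose in (3.0, 20.0):  # narrower range mentioned in the text
        total = enzyme_mass_mg(batch_g, dose)
        print(f"{dose} mg/g on {batch_g} g dry biomass -> {total} mg enzyme")
```

For the hypothetical 500 g batch, the narrower range of about 3 to about 20 mg per gram corresponds to 1,500 mg to 10,000 mg of enzyme or enzyme composition.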
[0165] In another embodiment, the invention provides a method for producing fermentable sugars comprising cultivating a genetically modified microorganism of the present invention in a nutrient medium comprising a lignocellulosic material, whereby fermentable sugars are produced.
[0166] Also provided are methods that comprise further contacting the lignocellulosic material with at least one other enzyme. Such enzymes have been described elsewhere herein. The accessory enzyme or enzymes may be added at the same time, prior to, or following the addition of a protein of the present invention or a composition comprising at least one protein of the present invention, or a multi-enzyme composition, or can be expressed (endogenously or overexpressed) in a genetically modified microorganism used in a method of the invention. When added simultaneously, the protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition will be compatible with the enzymes selected. When the enzymes are added following the treatment with the protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition, the conditions (such as temperature and pH) may be altered to those optimal for the enzyme before, during, or after addition of the enzyme of the present invention. Multiple rounds of enzyme addition are also encompassed. The accessory enzyme may also be present in the lignocellulosic material itself as a result of genetically modifying the plant. The nutrient medium used in a fermentation can also comprise one or more accessory enzymes.
[0167] In some embodiments, the method comprises a pretreatment process. In general, a pretreatment process will result in components of the lignocellulose being more accessible for downstream applications or more digestible by enzymes following treatment in the absence of hydrolysis. The pretreatment can be a chemical, physical or biological pretreatment. The lignocellulose may have been previously treated to release some or all of the sugars, as in the case of DDG. Physical treatments, such as grinding, boiling, freezing, milling, vacuum infiltration, and the like may also be used with the methods of the invention. In one embodiment, the heat treatment comprises heating the lignocellulosic material to 121°C for 15 minutes. A physical treatment such as milling can allow a higher concentration of lignocellulose to be used in the methods of the invention. A higher concentration refers to about 20%, up to about 25%, up to about 30%, up to about 35%, up to about 40%, up to about 45%, or up to about 50% lignocellulose. The lignocellulose may also be contacted with a metal ion, ultraviolet light, ozone, and the like. Additional pretreatment processes are known to those skilled in the art, and can include, for example, organosolv treatment, steam explosion treatment, lime impregnation with steam explosion treatment, hydrogen peroxide treatment, hydrogen peroxide/ozone (peroxone) treatment, acid treatment, dilute acid treatment, and base treatment, including ammonia fiber explosion (AFEX) technology. Details on pretreatment technologies and processes can be found in Wyman et al., Bioresource Tech 96:1959 (2005); Wyman et al., Bioresource Tech 96:2026 (2005); Hsu, "Pretreatment of biomass" In Handbook on Bioethanol: Production and Utilization, Wyman, Taylor and Francis Eds., p. 179-212 (1996); and Mosier et al., Bioresource Tech 96:673 (2005).
[0168] In an additional embodiment, the method comprises detoxifying the lignocellulosic material. Detoxification may be desirable in the event that inhibitors are present in the lignocellulosic material. Such inhibitors can be generated by a pretreatment process, deriving from sugar degradation or being directly released from the lignocellulose polymer. Detoxifying can include the reduction of their formation by adjusting sugar extraction conditions, or the use of inhibitor-tolerant or inhibitor-degrading strains of microorganisms. Detoxifying can also be accomplished by the addition of ion exchange resins, active charcoal, enzymatic detoxification using, e.g., laccase, and the like. In some embodiments, the proteins, compositions or products of the present invention further comprise detoxifying agents.
4. In some embodiments, the invention comprises, but is not limited to, methods for using esterases in the food industry, such as degumming vegetable oils; improving the production of bread (e.g., in situ production of emulsifiers); producing crackers, noodles, and pasta; enhancing flavor development of cheese, butter, and margarine; ripening cheese; removing wax; trans-esterification of flavors and cocoa butter substitutes; synthesizing structured lipids for infant formula and nutraceuticals; improving the polyunsaturated fatty acid content in fish oil; and aiding in digestion and releasing minerals in food processing.
[0168] In some embodiments, the invention comprises, but is not limited to, methods for using esterases in the household industry, such as use in laundry and detergent (e.g., removal of stains); cleaning agents; and hydrolysis of tallow for laundry detergent.
[0170] In some embodiments, the invention comprises, but is not limited to, methods for using esterases in the publishing and printing industry, such as the removal of triglycerides, steryl esters, resin acids, free fatty acids, and sterols (e.g., lipophilic wood extractives).
[0171] In some embodiments, the invention comprises, but is not limited to, methods for using esterases in the bioenergy industry, such as the production of biodiesel and other biofuels.
[0172] In some embodiments, the invention comprises, but is not limited to, applications of esterases in the feed industry, such as reducing the amount of phosphate in feed.
[0173] In other embodiments, the invention comprises methods for using esterases in other industries, such as the use as a biocatalyst; sewage treatment; cleaning up oil pollution; the synthesis of esters; the synthesis of fragrances; enantio-specific catalysis of fine chemicals (e.g., esters for chemical and drug intermediates); the production of isopropyl myristate, isopropyl palmitate and 2-ethylpalmitate for use as emollient in personal care products; saving of energy and minimization of thermal degradation in oleochemical industry; use as a feed additive; and enhancing the recovery of oil (e.g., during drilling).
[0174] In other embodiments, the invention comprises, but is not limited to, methods for using esterases in the food industry to prepare gelling agents, in processes to clarify beverages, to prepare dietary oligosaccharides, and the like.
[0175] In another embodiment the invention comprises but is not limited to a method to release ferulic acid from biomass in order to synthesize vanillin.
[0176] In some embodiments, the present invention provides methods for improving the nutritional quality of food (or animal feed) comprising adding to the food (or the animal feed) at least one protein of the present invention. In some embodiments, the present invention provides methods for improving the nutritional quality of the food (or animal feed) comprising pretreating the food (or the animal feed) with at least one isolated protein of the present invention. Improving the nutritional quality can mean making the food (or the animal feed) more digestible and/or less allergenic, and encompasses changes in the caloric value, taste and/or texture of the food.
[0177] In some embodiments, the proteins of the present invention may be used as part of nutritional supplements. In some embodiments, the proteins of the present invention may be used as part of digestive aids.
[0178] As used herein, reference to an isolated protein or polypeptide in the present invention, including any of the enzymes disclosed herein, includes full-length proteins and their glycosylated or otherwise modified forms, fusion proteins, or any fragment or homologue or variant of such a protein. More specifically, an isolated protein, such as an enzyme according to the present invention, is a protein (including a polypeptide or peptide) that has been removed from its natural milieu (i.e., that has been subject to human manipulation) and can include purified proteins, partially purified proteins, recombinantly produced proteins, synthetically produced proteins, proteins complexed with lipids, soluble proteins, and isolated proteins associated with other proteins, for example. As such, "isolated" does not reflect the extent to which the protein has been purified. Preferably, an isolated protein of the present invention is produced recombinantly. In addition, and by way of example, a "C. lucknowense protein" or "C. lucknowense enzyme" refers to a protein (generally including a homologue or variant of a naturally occurring protein) from M. thermophila or to a protein that has been otherwise produced from the knowledge of the structure (e.g., sequence) and perhaps the function of a naturally occurring protein from M. thermophila. In other words, a C. lucknowense protein includes any protein that has substantially similar structure and function of a naturally occurring C. lucknowense protein or that is a biologically active (i.e., has biological activity) homologue or variant of a naturally occurring protein from C. lucknowense as described in detail herein. As such, a C. lucknowense protein can include purified, partially purified, recombinant, mutated/modified and synthetic proteins.
[0179] According to the present invention, an isolated protein can be isolated from its natural source, produced recombinantly, or produced synthetically.
[0180] According to the present invention, the terms "modification," "mutation," and "variant" can be used interchangeably, particularly with regard to the modifications/mutations/variants to the amino acid sequence of a protein or peptide (e.g., C. lucknowense protein) (or nucleic acid sequences) described herein. The term "modification" can also be used to describe post-translational modifications to a protein or peptide including, but not limited to, methylation, farnesylation, carboxymethylation, geranyl geranylation, glycosylation, phosphorylation, acetylation, myristoylation, prenylation, palmitation, and/or amidation.
[0181] As used herein, the terms "homologue" or "variants" are used to refer to a protein or peptide which differs from a naturally occurring protein or peptide (i.e., the "prototype" or "wild-type" protein) by minor modifications to the naturally occurring protein or peptide, but which maintains the basic protein and side chain structure of the naturally occurring form. Such changes include, but are not limited to: changes in one or a few amino acid side chains; changes in one or a few amino acids, including deletions (e.g., a truncated version of the protein or peptide), insertions and/or substitutions; changes in stereochemistry of one or a few atoms; and/or minor derivatizations, including but not limited to for example: methylation, glycosylation, phosphorylation, acetylation, myristoylation, prenylation, palmitation, amidation and/or addition of glycosylphosphatidyl inositol. A homologue or variant can have either enhanced, decreased, or substantially similar properties as compared to the naturally occurring protein or peptide. Homologues or variants can be the result of natural allelic variation or natural mutation. A naturally occurring allelic variant of a nucleic acid encoding a protein is a gene that occurs at essentially the same locus (or loci) in the genome as the gene which encodes such protein, but which, due to natural variations caused by, for example, mutation or recombination, has a similar but not identical sequence. Homologues can also be the result of a gene duplication and rearrangement, resulting in a different location. Allelic variants typically encode proteins having similar activity to that of the protein encoded by the gene to which they are being compared. One class of allelic variants can encode the same protein but have different nucleic acid sequences due to the degeneracy of the genetic code. Allelic variants can also comprise alterations in the 5' or 3' untranslated regions of the gene (e.g., in regulatory control regions). Allelic variants are well known to those skilled in the art.
[0182] Homologues or variants can be produced using techniques known in the art for the production of proteins including, but not limited to, direct modifications to the isolated, naturally occurring protein, direct protein synthesis, or modifications to the nucleic acid sequence encoding the protein using, for example, classic or recombinant DNA techniques to effect random or targeted mutagenesis.
[0183] Modifications of a protein, such as in a homologue or variant, may result in proteins having the same biological activity as the naturally occurring protein, or in proteins having decreased or increased biological activity as compared to the naturally occurring protein. Modifications which result in a decrease in protein expression or a decrease in the activity of the protein, can be referred to as inactivation (complete or partial), down-regulation, or decreased action of a protein. Similarly, modifications which result in an increase in protein expression or an increase in the activity of the protein, can be referred to as amplification, overproduction, activation, enhancement, up-regulation or increased action of a protein.
[0184] According to the present invention, an isolated protein, including a biologically active homologue, variant, or fragment thereof, has at least one characteristic of biological activity of a wild-type, or naturally occurring, protein. As discussed above, in general, the biological activity or biological action of a protein refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vivo (i.e., in the natural physiological environment of the protein) or in vitro (i.e., under laboratory conditions). The biological activity of a protein of the present invention can include an enzyme activity (catalytic activity and/or substrate binding activity), such as cellulase activity, hemicellulase activity, β-glucanase activity, xylanase activity, or any other activity disclosed herein. Specific biological activities of the proteins disclosed herein are described in detail above and in the Examples. Methods of detecting and measuring the biological activity of a protein of the invention include, but are not limited to, the assays described in the Examples section below. Such assays include, but are not limited to, measurement of enzyme activity (e.g., catalytic activity), measurement of substrate binding, and the like. It is noted that an isolated protein of the present invention (including homologues or variants) is not required to have a biological activity such as catalytic activity. A protein can be a truncated, mutated or inactive protein, or lack at least one activity of the wild-type enzyme, for example. Inactive proteins may be useful in some screening assays, for example, or for other purposes such as antibody production.
[0185] Methods to measure protein expression levels of a protein according to the invention include, but are not limited to: western blotting, immunocytochemistry, flow cytometry or other immunologic-based assays; assays based on a property of the protein including but not limited to, ligand binding or interaction with other protein partners.
[0186] Many of the enzymes and proteins of the present invention may be desirable targets for modification and use in the processes described herein. These proteins have been described in terms of function and amino acid sequence (and nucleic acid sequence encoding the same) of representative wild-type proteins. In one embodiment of the invention, homologues or variants of a given protein (which can include related proteins from other organisms or modified forms of the given protein) are encompassed for use in the invention. Homologues or variants of a protein encompassed by the present invention can comprise, consist essentially of, or consist of, in one embodiment, an amino acid sequence that is at least about 35% identical, and more preferably at least about 40% identical, and more preferably at least about 45% identical, and more preferably at least about 50% identical, and more preferably at least about 55% identical, and more preferably at least about 60% identical, and more preferably at least about 65% identical, and more preferably at least about 70% identical, and more preferably at least about 75% identical, and more preferably at least about 80% identical, and more preferably at least about 85% identical, and more preferably at least about 90% identical, and more preferably at least about 95% identical, and more preferably at least about 96% identical, and more preferably at least about 97% identical, and more preferably at least about 98% identical, and more preferably at least about 99% identical, or any percent identity between 35% and 99%, in whole integers (i.e., 36%, 37%, etc.), to an amino acid sequence disclosed herein that represents the amino acid sequence of an enzyme or protein according to the invention (including a biologically active domain of a full-length protein).
Preferably, the amino acid sequence of the homologue or variant has a biological activity of the wild-type or reference protein or of a biologically active domain thereof (e.g., a catalytic domain). When denoting mutation positions, the amino acid position of the wild-type is typically used. The wild-type can also be referred to as the "parent." Additionally, any generation before the variant at issue can be a parent.
[0187] In one embodiment, a protein of the present invention comprises, consists essentially of, or consists of an amino acid sequence that, alone or in combination with other characteristics of such proteins disclosed herein, is less than 100% identical to an amino acid sequence selected from SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66, SEQ ID No: 68, SEQ ID No: 70, SEQ ID No: 72, SEQ ID No: 74, SEQ ID No: 76, SEQ ID No: 78, SEQ ID No: 80, SEQ ID No: 82, SEQ ID No: 84, SEQ ID No: 86, SEQ ID No: 88, SEQ ID No: 90, SEQ ID No: 92, SEQ ID No: 94, SEQ ID No: 96, SEQ ID No: 98, SEQ ID No: 100, SEQ ID No: 102, SEQ ID No: 104, SEQ ID No: 106, SEQ ID No: 108, SEQ ID No: 110, SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 116, SEQ ID No: 118, SEQ ID No: 120, SEQ ID No: 122, SEQ ID No: 124, SEQ ID No: 126, SEQ ID No: 128, SEQ ID No: 130, SEQ ID No: 132, SEQ ID No: 134, SEQ ID No: 136, SEQ ID No: 138, SEQ ID No: 140, SEQ ID No: 142, SEQ ID No: 144, SEQ ID No: 146, SEQ ID No: 148, SEQ ID No: 150, SEQ ID No: 152, SEQ ID No: 154, SEQ ID No: 156, SEQ ID No: 158, SEQ ID No: 160, SEQ ID No: 162, SEQ ID No: 164, SEQ ID No: 166, SEQ ID No: 168, SEQ ID No: 170, SEQ ID No: 172, SEQ ID No: 174, SEQ ID No: 176, SEQ ID No: 178, SEQ ID No: 180, SEQ ID No: 182, SEQ ID No: 184 (i.e., a homologue or variant). For example, a protein of the present invention can be less than 100% identical, in combination with being at least about 35% identical, to a given disclosed sequence. 
In another aspect of the invention, a homologue or variant according to the present invention has an amino acid sequence that is less than about 99% identical to any of such amino acid sequences, and in another embodiment, is less than about 98% identical to any of such amino acid sequences, and in another embodiment, is less than about 97% identical to any of such amino acid sequences, and in another embodiment, is less than about 96% identical to any of such amino acid sequences, and in another embodiment, is less than about 95% identical to any of such amino acid sequences, and in another embodiment, is less than about 94% identical to any of such amino acid sequences, and in another embodiment, is less than about 93% identical to any of such amino acid sequences, and in another embodiment, is less than about 92% identical to any of such amino acid sequences, and in another embodiment, is less than about 91% identical to any of such amino acid sequences, and in another embodiment, is less than about 90% identical to any of such amino acid sequences, and so on, in increments of whole integers.
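As an illustrative aid only, the identity window described above (at least about 35% identical, but less than 100% identical, to a disclosed sequence) can be expressed as a simple numeric check. The following Python sketch is not part of the disclosed methods: the function name, the default bounds, and the strict treatment of "about" are assumptions chosen for illustration, and the percent identity value is assumed to have been obtained separately (for example, from a BLAST alignment).

```python
def is_within_identity_window(percent_identity, lower=35.0, upper=100.0):
    """Return True if lower <= percent_identity < upper.

    lower: hypothetical default modelling the "at least about 35%" floor.
    upper: exclusive ceiling; 100.0 models "less than 100% identical".
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    return lower <= percent_identity < upper


# Whole-integer floors named in the text: 35%, 36%, ..., 99%.
WHOLE_INTEGER_FLOORS = list(range(35, 100))
```

Passing a lower `upper` value (for example, `upper=98.0`) models the embodiments in which the variant is less than about 98% identical to the disclosed sequence.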
[0188] As used herein, unless otherwise specified, reference to a percent (%) identity refers to an evaluation of homology which is performed using: (1) a BLAST 2.0 Basic BLAST homology search using blastp for amino acid searches and blastn for nucleic acid searches with standard default parameters, wherein the query sequence is filtered for low complexity regions by default (described in Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res. 25:3389-3402); (2) a BLAST 2 alignment (using the parameters described below); (3) PSI-BLAST with the standard default parameters (Position-Specific Iterated BLAST); and/or (4) CAZy homology determined using standard default parameters from the Carbohydrate Active EnZymes database (Coutinho, P.M. & Henrissat, B. (1999) Carbohydrate-active enzymes: an integrated database approach. In "Recent Advances in Carbohydrate Bioengineering", H.J. Gilbert, G. Davies, B. Henrissat and B. Svensson eds., The Royal Society of Chemistry, Cambridge, pp. 3-12).
[0189] It is noted that due to some differences in the standard parameters between BLAST 2.0 Basic BLAST and BLAST 2, two specific sequences might be recognized as having significant homology using the BLAST 2 program, whereas a search performed in BLAST 2.0 Basic BLAST using one of the sequences as the query sequence may not identify the second sequence in the top matches. In addition, PSI-BLAST provides an automated, easy-to-use version of a "profile" search, which is a sensitive way to look for sequence homologues or variants. The program first performs a gapped BLAST database search. The PSI-BLAST program uses the information from any significant alignments returned to construct a position-specific score matrix, which replaces the query sequence for the next round of database searching. Therefore, it is to be understood that percent identity can be determined by using any one of these programs.
[0190] Two specific sequences can be aligned to one another using BLAST 2 sequence as described in Tatusova and Madden, (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250. BLAST 2 sequence alignment is performed in blastp or blastn using the BLAST 2.0 algorithm to perform a Gapped BLAST search (BLAST 2.0) between the two sequences allowing for the introduction of gaps (deletions and insertions) in the resulting alignment. For purposes of clarity herein, a BLAST 2 sequence alignment is performed using the standard default parameters as follows.
For blastn, using 0 BLOSUM62 matrix:
Reward for match = 1
Penalty for mismatch = -2
Open gap (5) and extension gap (2) penalties
gap x_dropoff (50) expect (10) word size (11) filter (on)
For blastp, using 0 BLOSUM62 matrix:
Open gap (11) and extension gap (1) penalties
gap x_dropoff (50) expect (10) word size (3) filter (on).
[0191] A protein of the present invention can also include proteins having an amino acid sequence comprising at least 10 contiguous amino acid residues of any of the sequences described herein (i.e., 10 contiguous amino acid residues having 100% identity with 10 contiguous amino acids of the amino acid sequences of SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66, SEQ ID No: 68, SEQ ID No: 70, SEQ ID No: 72, SEQ ID No: 74, SEQ ID No: 76, SEQ ID No: 78, SEQ ID No: 80, SEQ ID No: 82, SEQ ID No: 84, SEQ ID No: 86, SEQ ID No: 88, SEQ ID No: 90, SEQ ID No: 92, SEQ ID No: 94, SEQ ID No: 96, SEQ ID No: 98, SEQ ID No: 100, SEQ ID No: 102, SEQ ID No: 104, SEQ ID No: 106, SEQ ID No: 108, SEQ ID No: 110, SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 116, SEQ ID No: 118, SEQ ID No: 120, SEQ ID No: 122, SEQ ID No: 124, SEQ ID No: 126, SEQ ID No: 128, SEQ ID No: 130, SEQ ID No: 132, SEQ ID No: 134, SEQ ID No: 136, SEQ ID No: 138, SEQ ID No: 140, SEQ ID No: 142, SEQ ID No: 144, SEQ ID No: 146, SEQ ID No: 148, SEQ ID No: 150, SEQ ID No: 152, SEQ ID No: 154, SEQ ID No: 156, SEQ ID No: 158, SEQ ID No: 160, SEQ ID No: 162, SEQ ID No: 164, SEQ ID No: 166, SEQ ID No: 168, SEQ ID No: 170, SEQ ID No: 172, SEQ ID No: 174, SEQ ID No: 176, SEQ ID No: 178, SEQ ID No: 180, SEQ ID No: 182, SEQ ID No: 184). 
In other embodiments, a homologue or variant of a protein amino acid sequence includes amino acid sequences comprising at least 20, or at least 30, or at least 40, or at least 50, or at least 75, or at least 100, or at least 125, or at least 150, or at least 175, or at least 200, or at least 250, or at least 300, or at least 350 contiguous amino acid residues of any of the amino acid sequences disclosed herein. Even small fragments of proteins without biological activity are useful in the present invention, for example, in the preparation of antibodies against the full-length protein or in a screening assay (e.g., a binding assay). Fragments can also be used to construct fusion proteins, for example, where the fusion protein comprises functional domains from two or more different proteins (e.g., a CBM from one protein linked to a CD from another protein). In one embodiment, a homologue or variant has a measurable or detectable biological activity associated with the wild-type protein (e.g., enzymatic activity).
[0192] According to the present invention, the term "contiguous" or "consecutive", with regard to nucleic acid or amino acid sequences described herein, means to be connected in an unbroken sequence. For example, for a first sequence to comprise 30 contiguous (or consecutive) amino acids of a second sequence, means that the first sequence includes an unbroken sequence of 30 amino acid residues that is 100% identical to an unbroken sequence of 30 amino acid residues in the second sequence. Similarly, for a first sequence to have "100% identity" with a second sequence means that the first sequence exactly matches the second sequence with no gaps between nucleotides or amino acids.
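The definition of "contiguous" given above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method disclosed herein: the function name is hypothetical, sequences are assumed to be plain one-letter-code strings, and matching is exact and case-sensitive.

```python
def shares_contiguous_run(first, second, length):
    """Return True if `first` contains an unbroken run of `length` residues
    that exactly matches an unbroken run of `length` residues in `second`."""
    if length <= 0:
        raise ValueError("length must be a positive integer")
    if length > len(first) or length > len(second):
        return False
    # Collect every window of the requested length from the second sequence.
    windows = {second[i:i + length] for i in range(len(second) - length + 1)}
    # Slide the same window along the first sequence and look for an exact hit.
    return any(first[i:i + length] in windows
               for i in range(len(first) - length + 1))
```

For example, with two made-up strings `reference = "MKTAYIAKQRQISFVKSHFSRQ"` and `candidate = "GGGAYIAKQRQISFGGG"`, the call `shares_contiguous_run(candidate, reference, 10)` is satisfied by the shared unbroken stretch `AYIAKQRQISF`, whereas a requested length of 12 is not.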
[0193] In another embodiment, a protein of the present invention, including a homologue or variant, includes a protein having an amino acid sequence that is sufficiently similar to a natural amino acid sequence that a nucleic acid sequence encoding the homologue or variant is capable of hybridizing under moderate, high or very high stringency conditions (described below) to (i.e., with) a nucleic acid molecule encoding the natural protein (i.e., to the complement of the nucleic acid strand encoding the natural amino acid sequence). Preferably, a homologue or variant of a protein of the present invention is encoded by a nucleic acid molecule comprising a nucleic acid sequence that hybridizes under low, moderate, or high stringency conditions to the complement of a nucleic acid sequence that encodes a protein comprising, consisting essentially of, or consisting of, an amino acid sequence represented by any of SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66, SEQ ID No: 68, SEQ ID No: 70, SEQ ID No: 72, SEQ ID No: 74, SEQ ID No: 76, SEQ ID No: 78, SEQ ID No: 80, SEQ ID No: 82, SEQ ID No: 84, SEQ ID No: 86, SEQ ID No: 88, SEQ ID No: 90, SEQ ID No: 92, SEQ ID No: 94, SEQ ID No: 96, SEQ ID No: 98, SEQ ID No: 100, SEQ ID No: 102, SEQ ID No: 104, SEQ ID No: 106, SEQ ID No: 108, SEQ ID No: 110, SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 116, SEQ ID No: 118, SEQ ID No: 120, SEQ ID No: 122, SEQ ID No: 124, SEQ ID No: 126, SEQ ID No: 128, SEQ ID No: 130, SEQ ID No: 132, SEQ ID No: 134, SEQ ID No: 136, SEQ ID No: 138, SEQ ID No: 140, SEQ ID No: 142, SEQ ID No: 144, SEQ ID No: 146, SEQ ID No: 148, SEQ ID No: 150, SEQ ID No: 152, SEQ ID No: 154, SEQ ID No: 156, SEQ ID No: 158, SEQ ID No: 160, SEQ ID No: 162, SEQ ID No: 164, SEQ ID No: 166, SEQ ID No: 168, SEQ ID No: 170, SEQ ID No: 172, SEQ ID No: 174, SEQ ID No: 176, SEQ ID No: 178, SEQ ID No: 180, SEQ ID No: 182, SEQ ID No: 184. Such hybridization conditions are described in detail below.
[0194] A nucleic acid sequence complement of a nucleic acid sequence encoding a protein of the present invention refers to the nucleic acid sequence of the nucleic acid strand that is complementary to the strand which encodes the protein. It will be appreciated that a double stranded DNA which encodes a given amino acid sequence comprises a single strand DNA and its complementary strand having a sequence that is a complement to the single strand DNA. As such, nucleic acid molecules of the present invention can be either double-stranded or single-stranded, and include those nucleic acid molecules that form stable hybrids under stringent hybridization conditions with a nucleic acid sequence that encodes an amino acid sequence such as the amino acid sequences of SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66, SEQ ID No: 68, SEQ ID No: 70, SEQ ID No: 72, SEQ ID No: 74, SEQ ID No: 76, SEQ ID No: 78, SEQ ID No: 80, SEQ ID No: 82, SEQ ID No: 84, SEQ ID No: 86, SEQ ID No: 88, SEQ ID No: 90, SEQ ID No: 92, SEQ ID No: 94, SEQ ID No: 96, SEQ ID No: 98, SEQ ID No: 100, SEQ ID No: 102, SEQ ID No: 104, SEQ ID No: 106, SEQ ID No: 108, SEQ ID No: 110, SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 116, SEQ ID No: 118, SEQ ID No: 120, SEQ ID No: 122, SEQ ID No: 124, SEQ ID No: 126, SEQ ID No: 128, SEQ ID No: 130, SEQ ID No: 132, SEQ ID No: 134, SEQ ID No: 136, SEQ ID No: 138, SEQ ID No: 140, SEQ ID No: 142, SEQ ID No: 144, SEQ ID No: 146, SEQ ID No: 148, SEQ ID No: 150, SEQ ID No: 152, SEQ ID No: 154, SEQ ID No: 156, SEQ ID No: 158, SEQ ID No: 160, SEQ ID No: 162, SEQ ID No: 164, SEQ ID No: 166, SEQ ID No: 168, SEQ ID No: 170, SEQ ID No: 172, SEQ ID No: 174, SEQ ID No: 176, SEQ ID No: 178, SEQ ID No: 180, SEQ ID No: 182, SEQ ID No: 184, and/or with the complement of the nucleic acid sequence that encodes an amino acid sequence such as the amino acid sequences of SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66, SEQ ID No: 68, SEQ ID No: 70, SEQ ID No: 72, SEQ ID No: 74, SEQ ID No: 76, SEQ ID No: 78, SEQ ID No: 80, SEQ ID No: 82, SEQ ID No: 84, SEQ ID No: 86, SEQ ID No: 88, SEQ ID No: 90, SEQ ID No: 92, SEQ ID No: 94, SEQ ID No: 96, SEQ ID No: 98, SEQ ID No: 100, SEQ ID No: 102, SEQ ID No: 104, SEQ ID No: 106, SEQ ID No: 108, SEQ ID No: 110, SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 116, SEQ ID No: 118, SEQ ID No: 120, SEQ ID No: 122, SEQ ID No: 124, SEQ ID No: 126, SEQ ID No: 128, SEQ ID No: 130, SEQ ID No: 132, SEQ ID No: 134, SEQ ID No: 136, SEQ ID No: 138, SEQ ID No: 140, SEQ ID No: 142, SEQ ID No: 144, SEQ ID No: 146, SEQ ID No: 148, SEQ ID No: 150, SEQ ID No: 152, SEQ ID No: 154, SEQ ID No: 156, SEQ ID No: 158, SEQ ID No: 160, SEQ ID No: 162, SEQ ID No: 164, SEQ ID No: 166, SEQ ID No: 168, SEQ ID No: 170, SEQ ID No: 172, SEQ ID No: 174, SEQ ID No: 176, SEQ ID No: 178, SEQ ID No: 180, SEQ ID No: 182, SEQ ID No: 184. Methods to deduce a complementary sequence are known to those skilled in the art. 
It should be noted that since nucleic acid sequencing technologies are not entirely error-free, the sequences presented herein, at best, represent apparent sequences of the proteins of the present invention.
[0195] As used herein, reference to hybridization conditions refers to standard hybridization conditions under which nucleic acid molecules are used to identify similar nucleic acid molecules. Such standard conditions are disclosed, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press, 1989. Sambrook et al., ibid., (see specifically, pages 9.31-9.62). In addition, formulae to calculate the appropriate hybridization and wash conditions to achieve hybridization permitting varying degrees of mismatch of nucleotides are disclosed, for example, in Meinkoth et al., 1984, Anal. Biochem. 138, 267-284; Meinkoth et al., ibid.
[0196] More particularly, moderate stringency hybridization and washing conditions, as referred to herein, refer to conditions which permit isolation of nucleic acid molecules having at least about 70% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 30% or less mismatch of nucleotides). High stringency hybridization and washing conditions, as referred to herein, refer to conditions which permit isolation of nucleic acid molecules having at least about 80% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 20% or less mismatch of nucleotides). Very high stringency hybridization and washing conditions, as referred to herein, refer to conditions which permit isolation of nucleic acid molecules having at least about 90% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 10% or less mismatch of nucleotides). As discussed above, one of skill in the art can use the formulae in Meinkoth et al., ibid, to calculate the appropriate hybridization and wash conditions to achieve these particular levels of nucleotide mismatch. 
Such conditions will vary, depending on whether DNA:RNA or DNA:DNA hybrids are being formed. Calculated melting temperatures for DNA:DNA hybrids are 10°C less than for DNA:RNA hybrids. In particular embodiments, stringent hybridization conditions for DNA:DNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 20°C and about 35°C (lower stringency), more preferably, between about 28°C and about 40°C (more stringent), and even more preferably, between about 35°C and about 45°C (even more stringent), with appropriate wash conditions. In particular embodiments, stringent hybridization conditions for DNA:RNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 30°C and about 45°C, more preferably, between about 38°C and about 50°C, and even more preferably, between about 45°C and about 55°C, with similarly stringent wash conditions. These values are based on calculations of a melting temperature for molecules larger than about 100 nucleotides, 0% formamide and a G + C content of about 40%. Alternatively, Tm can be calculated empirically as set forth in Sambrook et al., supra, pages 9.31 to 9.62. In general, the wash conditions should be as stringent as possible, and should be appropriate for the chosen hybridization conditions. For example, hybridization conditions can include a combination of salt and temperature conditions that are approximately 20-25°C below the calculated Tm of a particular hybrid, and wash conditions typically include a combination of salt and temperature conditions that are approximately 12-20°C below the calculated Tm of the particular hybrid. 
One example of hybridization conditions suitable for use with DNA:DNA hybrids includes a 2-24 hour hybridization in 6X SSC (50% formamide) at about 42°C, followed by washing steps that include one or more washes at room temperature in about 2X SSC, followed by additional washes at higher temperatures and lower ionic strength (e.g., at least one wash at about 37°C in about 0.1X-0.5X SSC, followed by at least one wash at about 68°C in about 0.1X-0.5X SSC).
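The relationship between a calculated Tm and the hybridization and wash windows described above can be sketched as follows. The text refers to the formulae of Meinkoth et al. without reproducing them, so the equation used here is an assumption: it is the commonly cited approximation for DNA:DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%G+C) - 0.61 (% formamide) - 500/L, with L the hybrid length in nucleotides. The function names are hypothetical, and the sketch does not adjust Tm for nucleotide mismatch.

```python
import math


def estimate_tm_dna_dna(na_molar, gc_percent, formamide_percent, length_nt):
    """Estimate Tm (degrees C) of a DNA:DNA hybrid.

    Assumed formula (commonly cited approximation, not quoted in the text):
    81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    if na_molar <= 0 or length_nt <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_nt)


def condition_windows(tm_celsius):
    """Return (low, high) temperature windows in degrees C relative to Tm.

    Hybridization: approximately 20-25 degrees C below Tm.
    Wash: approximately 12-20 degrees C below Tm.
    """
    return {
        "hybridization": (tm_celsius - 25.0, tm_celsius - 20.0),
        "wash": (tm_celsius - 20.0, tm_celsius - 12.0),
    }
```

A call such as `condition_windows(estimate_tm_dna_dna(0.9, 40.0, 0.0, 500))` uses the salt concentration (0.9 M Na+), formamide content (0%) and G + C content (about 40%) named in the text; the 500-nucleotide length is an arbitrary illustrative value.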
[0197] The minimum size of a protein and/or homologue or variant of the present invention is a size sufficient to have biological activity or, when the protein is not required to have such activity, sufficient to be useful for another purpose associated with a protein of the present invention, such as for the production of antibodies that bind to a naturally occurring protein. In one embodiment, the protein of the present invention is at least 20 amino acids in length, or at least about 25 amino acids in length, or at least about 30 amino acids in length, or at least about 40 amino acids in length, or at least about 50 amino acids in length, or at least about 60 amino acids in length, or at least about 70 amino acids in length, or at least about 80 amino acids in length, or at least about 90 amino acids in length, or at least about 100 amino acids in length, or at least about 125 amino acids in length, or at least about 150 amino acids in length, or at least about 175 amino acids in length, or at least about 200 amino acids in length, or at least about 250 amino acids in length, and so on up to a full length of each protein, and including any size in between in increments of one whole integer (one amino acid). There is no limit, other than a practical limit, on the maximum size of such a protein in that the protein can include a portion of a protein or a full-length protein, plus additional sequence (e.g., a fusion protein sequence), if desired.
[0198] The present invention also includes a fusion protein that includes a domain of a protein of the present invention (including a homologue or variant) attached to one or more fusion segments, which are typically heterologous in sequence to the protein sequence (i.e., different than protein sequence). Suitable fusion segments for use with the present invention include, but are not limited to, segments that can: enhance a protein's stability; provide other desirable biological activity; and/or assist with the purification of the protein (e.g., by affinity chromatography). A suitable fusion segment can be a domain of any size that has the desired function (e.g., imparts increased stability, solubility, action or biological activity; and/or simplifies purification of a protein). Fusion segments can be joined to amino and/or carboxyl termini of the domain of a protein of the present invention and can be susceptible to cleavage in order to enable straight-forward recovery of the protein. Fusion proteins are preferably produced by culturing a recombinant cell transfected with a fusion nucleic acid molecule that encodes a protein including the fusion segment attached to either the carboxyl and/or amino terminal end of a domain of a protein of the present invention. Accordingly, proteins of the present invention also include expression products of gene fusions (for example, used to overexpress soluble, active forms of the recombinant protein), of mutagenized genes (such as genes having codon modifications to enhance gene transcription and translation), and of truncated genes (such as genes having membrane binding modules removed to generate soluble forms of a membrane protein, or genes having signal sequences removed which are poorly tolerated in a particular recombinant host).
[0199] In one embodiment of the present invention, any of the amino acid sequences described herein can be produced with from at least one, and up to about 20, additional heterologous amino acids flanking each of the C- and/or N-terminal ends of the specified amino acid sequence. The resulting protein or polypeptide can be referred to as "consisting essentially of" the specified amino acid sequence. According to the present invention, the heterologous amino acids are a sequence of amino acids that are not naturally found (i.e., not found in nature, in vivo) flanking the specified amino acid sequence, or that are not related to the function of the specified amino acid sequence, or that would not be encoded by the nucleotides that flank the naturally occurring nucleic acid sequence encoding the specified amino acid sequence as it occurs in the gene, if such nucleotides in the naturally occurring sequence were translated using standard codon usage for the organism from which the given amino acid sequence is derived.
[0200] The present invention also provides enzyme combinations that break down lignocellulose material. Such enzyme combinations or mixtures can include a multi-enzyme composition that contains at least one protein of the present invention in combination with one or more additional proteins of the present invention or one or more enzymes or other proteins from other microorganisms, plants, or similar organisms. Synergistic enzyme combinations and related methods are contemplated. The invention includes methods to identify the optimum ratios and compositions of enzymes with which to degrade each lignocellulosic material. These methods entail tests to identify the optimum enzyme composition and ratios for efficient conversion of any lignocellulosic substrate to its constituent sugars. The Examples below include assays that may be used to identify optimum ratios and compositions of enzymes with which to degrade lignocellulosic and other plant cell derived materials.
[0201] Any combination of the proteins disclosed herein is suitable for use in the multi-enzyme compositions of the present invention. Due to the complex nature of most biomass sources, which can contain cellulose, hemicellulose, pectin, lignin, protein, lipids, waxes and ash, among other components, preferred enzyme combinations may contain enzymes with a range of substrate specificities that work together to degrade biomass into fermentable sugars in the most efficient manner. One example of a multi-enzyme complex for lignocellulose saccharification is a mixture of cellobiohydrolase(s), xylanase(s), endoglucanase(s), β-glucosidase(s), β-xylosidase(s), and accessory enzymes. However, it is to be understood that any of the enzymes described specifically herein can be combined with any one or more of the enzymes described herein or with any other available and suitable enzymes, to produce a multi-enzyme composition. The invention is not restricted or limited to the specific exemplary combinations listed below.
[0202] In one embodiment, the cellobiohydrolase(s) comprise between about 30% and about 90% or between about 40% and about 70% of the enzymes in the composition, and more preferably, between about 55% and 65%, and more preferably, about 60% of the enzymes in the composition (including any percentage between 40% and 70% in 0.5% increments (e.g., 40%, 40.5%, 41%, etc.)).
[0203] In one embodiment, the xylanase(s) comprise between about 10% and about 30% of the enzymes in the composition, and more preferably, between about 15% and about 25%, and more preferably, about 20% of the enzymes in the composition (including any percentage between 10% and 30% in 0.5% increments).
[0204] In one embodiment, the endoglucanase(s) comprise between about 5% and about 15% of the enzymes in the composition, and more preferably, between about 7% and about 13%, and more preferably, about 10% of the enzymes in the composition (including any percentage between 5% and 15% in 0.5% increments).
[0205] In one embodiment, the β-glucosidase(s) comprise between about 1% and about 15% of the enzymes in the composition, and preferably between about 2% and 10%, and more preferably, about 3% of the enzymes in the composition (including any percentage between 1% and 15% in 0.5% increments).
[0206] In one embodiment, the β-xylosidase(s) comprise between about 1% and about 3% of the enzymes in the composition, and preferably, between about 1.5% and about 2.5%, and more preferably, about 2% of the enzymes in the composition (including any percentage between 1% and 3% in 0.5% increments).
[0207] In one embodiment, the accessory enzymes comprise between about 2% and about 8% of the enzymes in the composition, and preferably, between about 3% and about 7%, and more preferably, about 5% of the enzymes in the composition (including any percentage between 2% and 8% in 0.5% increments).
[0208] One particularly preferred example of a multi-enzyme complex for lignocellulose saccharification is a mixture of about 60% cellobiohydrolase(s), about 20% xylanase(s), about 10% endoglucanase(s), about 3% β-glucosidase(s), about 2% β-xylosidase(s) and about 5% accessory enzyme(s).
[0209] Enzymes and multi-enzyme compositions of the present invention may also be used to break down arabinoxylan or arabinoxylan-containing substrates. Arabinoxylan is a polysaccharide composed of xylose and arabinose, wherein α-L-arabinofuranose residues are attached as branch-points to a β-(1,4)-linked xylose polymeric backbone. The xylose residues may be mono-substituted at the C2 or C3 position, or di-substituted at both positions. Ferulic acid or coumaric acid may also be ester-linked to the C5 position of arabinosyl residues. Further details on the hydrolysis of arabinoxylan can be found in International Publication No. WO 2006/114095.
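The percentage ranges given in the preceding embodiments can be collected into a small consistency check. The Python sketch below is illustrative only: the dictionary keys and function name are hypothetical, the "about" qualifiers are treated as strict bounds for simplicity, and the requirement that the shares total 100% is an assumption drawn from the fact that the particularly preferred mixture sums to 100%.

```python
# Ranges (percent of enzymes in the composition) from the embodiments above.
COMPONENT_RANGES = {
    "cellobiohydrolase": (30.0, 90.0),
    "xylanase": (10.0, 30.0),
    "endoglucanase": (5.0, 15.0),
    "beta_glucosidase": (1.0, 15.0),
    "beta_xylosidase": (1.0, 3.0),
    "accessory": (2.0, 8.0),
}

# The particularly preferred mixture described in the text.
PREFERRED_MIX = {
    "cellobiohydrolase": 60.0,
    "xylanase": 20.0,
    "endoglucanase": 10.0,
    "beta_glucosidase": 3.0,
    "beta_xylosidase": 2.0,
    "accessory": 5.0,
}


def check_composition(mix, ranges=COMPONENT_RANGES, tolerance=1e-6):
    """Return a list of problems; an empty list means the mix totals 100%
    and every component lies inside its stated range."""
    problems = []
    total = sum(mix.values())
    if abs(total - 100.0) > tolerance:
        problems.append(f"total is {total}%, expected 100%")
    for name, (low, high) in ranges.items():
        share = mix.get(name, 0.0)
        if not low <= share <= high:
            problems.append(f"{name} at {share}% is outside {low}-{high}%")
    return problems
```

Calling `check_composition(PREFERRED_MIX)` reports no problems, while raising the β-xylosidase share to 5% is flagged both for leaving its range and for pushing the total above 100%.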
[0210] The substitutions on the xylan backbone can inhibit the enzymatic activity of xylanases, and the complete hydrolysis of arabinoxylan typically requires the action of several different enzymes. One example of a multi-enzyme complex for arabinoxylan hydrolysis is a mixture of endoxylanase(s), β-xylosidase(s), and arabinofuranosidase(s), including those with specificity towards single and double substituted xylose residues. In some embodiments, the multi-enzyme complex may further comprise one or more carbohydrate esterases, such as acetyl xylan esterases, ferulic acid esterases, or coumaric acid esterases. Any combination of two or more of the above-mentioned enzymes is suitable for use in the multi-enzyme complexes. However, it is to be understood that the invention is not restricted or limited to the specific exemplary combinations listed herein.
[0211] In one embodiment, the endoxylanase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g., 5.0%, 5.5%, 6.0%, etc.). Endoxylanase(s), either alone or as part of a multi- enzyme complex, may be used in amounts of 0.001 to 2.0 g/kg, 0.005 to 1.0 g kg, or 0.05 to 0.2 g kg of substrate.
[0212] In one embodiment, the β-xylosidase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g., 5.0%, 5.5%, 6.0%, etc.). β-xylosidase(s), either alone or as part of a multi- enzyme complex, may be used in amounts of 0.001 to 2.0 g kg, 0.005 to 1.0 g kg, or 0.05 to 0.2 g/kg of substrate.
[0213] In one embodiment, the arabinofuranosidase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g, 5.0%, 5.5%, 6.0%, etc.). The total percentage of arabinofuranosidase(s) present in the composition may include arabinofuranosidase(s) with specificity towards single substituted xylose residues, arabinofuranosidase(s) with specificity towards double substituted xylose residues, or any combination thereof. Arabinofuranosidase(s), either alone or as part of a multi-enzyme complex, may be used in amounts of 0.001 to 2.0 g kg, 0.005 to 1.0 g/kg, or 0.05 to 0.2 g/kg of substrate.
[0214] In one embodiment, the acetylxylan esterases, ferulic acid esterases, the coumaryl esterases, the glucuronyl esterases and/or the glucuronidases, alone or in combination, comprise at least about 1%, at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, or at least about 50% of the enzymes in the composition (including any percentage between 1% and 50% in 0.5% increments, e.g., 1.0%, 1.5%, 2.0%, etc.). Acetylxylan esterases, ferulic acid esterases, coumaric acid esterases, glucuronyl esterases, and glucuronidases, either alone or as part of a multi-enzyme complex, may be used in amounts of 0.0001 to 2.0 g/kg, 0.0005 to 1.0 g/kg, or 0.005 to 0.2 g/kg of substrate.
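As a worked illustration of the ranges recited in paragraphs [0211] to [0214], the following sketch enumerates a percentage range in 0.5% increments and converts a g/kg dosage range into an absolute enzyme mass for a given quantity of substrate. The function names, the constant name, and the 250 kg batch size are hypothetical and do not appear in the disclosure; only the numeric ranges are taken from the text above.

```python
# Illustrative sketch only: all names and the example batch size are assumptions.
# The ranges (g enzyme per kg substrate) are those recited for endoxylanase(s).
ENDOXYLANASE_RANGES_G_PER_KG = [(0.001, 2.0), (0.005, 1.0), (0.05, 0.2)]


def percent_steps(low, high):
    """List every percentage from low to high inclusive in 0.5% increments."""
    n_steps = int((high - low) * 2)
    return [low + 0.5 * i for i in range(n_steps + 1)]


def dose_range_grams(substrate_kg, g_per_kg_range):
    """Return (min, max) enzyme mass in grams for a substrate mass in kg."""
    if substrate_kg < 0:
        raise ValueError("substrate mass must be non-negative")
    low, high = g_per_kg_range
    return (low * substrate_kg, high * substrate_kg)


if __name__ == "__main__":
    # First few values of the 5% to 70% range recited for endoxylanase content.
    print(percent_steps(5, 70)[:4])
    # Enzyme mass for a hypothetical 250 kg batch of substrate.
    for g_per_kg in ENDOXYLANASE_RANGES_G_PER_KG:
        print(g_per_kg, dose_range_grams(250.0, g_per_kg))
```

The same helper applies unchanged to the ranges recited for β-xylosidase(s), arabinofuranosidase(s), and the esterases, since each is expressed in the same g/kg units.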
[0215] One or more components of a multi-enzyme composition (other than proteins of the present invention) can be obtained from or derived from a microbial, plant, or other source or combination thereof, and will contain enzymes capable of degrading lignocellulosic material or other biomass components. Examples of enzymes included in the multi-enzyme compositions of the invention include cellulases, hemicellulases (such as xylanases, including endoxylanases, exoxylanases, and β-xylosidases; mannanases, including endomannanases, exomannanases, and β-mannosidases), pectinases, ligninases, amylases, glucuronidases, proteases, esterases, lipases, glucosidases (such as β-glucosidase), and xyloglucanases.
[0216] While the multi-enzyme composition may contain many types of enzymes, mixtures comprising enzymes that increase or enhance sugar release from biomass are preferred, including hemicellulases. In one embodiment, the hemicellulase is selected from a xylanase, an arabinofuranosidase, an acetyl xylan esterase, a glucuronidase, an endo-galactanase, a mannanase, an endo-arabinase, an exo-arabinase, an exo-galactanase, a ferulic acid esterase, a galactomannanase, a xyloglucanase, or mixtures of any of these. In particular, the enzymes can include glucoamylase, β-xylosidase and/or β-glucosidase. Also preferred are mixtures comprising enzymes that are capable of degrading cell walls and releasing cellular contents.
[0217] The enzymes of the multi-enzyme composition can be provided by a variety of sources. In one embodiment, the enzymes can be produced by growing organisms such as bacteria, algae, fungi, and plants which produce the enzymes naturally or by virtue of being genetically modified to express the enzyme or enzymes. In another embodiment, at least one enzyme of the multi-enzyme composition is a commercially available enzyme.
[0218] In some embodiments, the multi-enzyme compositions comprise an accessory enzyme. An accessory enzyme is any additional enzyme capable of hydrolyzing lignocellulose or enhancing or promoting the hydrolysis of lignocellulose, wherein the accessory enzyme is typically provided in addition to a core enzyme or core set of enzymes. An accessory enzyme can have the same or similar function or a different function as an enzyme or enzymes in the core set of enzymes. These enzymes have been described elsewhere herein. The enzymes may also include cellulases, xylanases, ligninases, amylases, lipases, cutinases or glucuronidases, for example. Accessory enzymes can include enzymes that, when contacted with biomass in a reaction, allow for an increase in the activity of enzymes (e.g., hemicellulases) in the multi-enzyme composition. An accessory enzyme or enzyme mix may be composed of enzymes from (1) commercial suppliers; (2) cloned genes expressing enzymes; (3) complex broth (such as that resulting from growth of a microbial strain in media, wherein the strains secrete proteins and enzymes into the media); (4) cell lysates of strains grown as in (3); and (5) plant material expressing enzymes capable of degrading lignocellulose or other plant derived biomass constituents (e.g., pectins).
[0219] As used herein, a ligninase is an enzyme that can hydrolyze or break down the structure of lignin polymers, including lignin peroxidases, manganese peroxidases, laccases, and other enzymes described in the art known to depolymerize or otherwise break lignin polymers. Also included are enzymes capable of hydrolyzing bonds formed between hemicellulosic sugars (notably arabinose) and lignin.
[0220] The multi-enzyme compositions, in some embodiments, comprise a biomass comprising microorganisms or a crude fermentation product of microorganisms. A crude fermentation product refers to the fermentation broth which has been separated from the microorganism biomass (by filtration, for example). In general, the microorganisms are grown in fermentors, optionally centrifuged or filtered to remove biomass, and optionally concentrated, formulated, and dried to produce an enzyme(s) or a multi-enzyme composition that is a crude fermentation product. In other embodiments, enzyme(s) or multi-enzyme compositions produced by the microorganism (including a genetically modified microorganism as described below) are subjected to one or more purification steps, such as ammonium sulfate precipitation, chromatography, and/or ultrafiltration, which result in a partially purified or purified enzyme(s). If the microorganism has been genetically modified to express the enzyme(s), the enzyme(s) will include recombinant enzymes. If the genetically modified microorganism also naturally expresses the enzyme(s) or other enzymes useful for lignocellulosic saccharification or any other useful application mentioned herein, the enzyme(s) may include both naturally occurring and recombinant enzymes.
[0221] Another embodiment of the present invention relates to a composition comprising at least about 500 ng, and preferably at least about 1 μg, and more preferably at least about 5 μg, and more preferably at least about 10 μg, and more preferably at least about 25 μg, and more preferably at least about 50 μg, and more preferably at least about 75 μg, and more preferably at least about 100 μg, and more preferably at least about 250 μg, and more preferably at least about 500 μg, and more preferably at least about 750 μg, and more preferably at least about 1 mg, and more preferably at least about 5 mg, of an isolated protein comprising any of the proteins or homologues, variants, or fragments thereof discussed herein. Such a composition of the present invention may include any carrier with which the protein is associated by virtue of the protein preparation method, a protein purification method, or a preparation of the protein for use in any method according to the present invention. For example, such a carrier can include any suitable buffer, extract, or medium that is suitable for combining with the protein of the present invention so that the protein can be used in any method described herein according to the present invention.
[0222] In one embodiment of the invention, one or more enzymes of the invention is bound to a solid support, i.e., an immobilized enzyme. As used herein, an immobilized enzyme includes immobilized isolated enzymes, immobilized microbial cells which contain one or more enzymes of the invention, other stabilized intact cells that produce one or more enzymes of the invention, and stabilized cell/membrane homogenates. Stabilized intact cells and stabilized cell/membrane homogenates include cells and homogenates from naturally occurring microorganisms expressing the enzymes of the invention and preferably, from genetically modified microorganisms as disclosed elsewhere herein. Thus, although methods for immobilizing enzymes are discussed below, it will be appreciated that such methods are equally applicable to immobilizing microbial cells and in such an embodiment, the cells can be lysed, if desired.
[0223] A variety of methods for immobilizing an enzyme are disclosed in Industrial Enzymology 2nd Ed., Godfrey, T. and West, S. Eds., Stockton Press, New York, N.Y., 1996, pp. 267-272; Immobilized Enzymes, Chibata, I. Ed., Halsted Press, New York, N.Y., 1978; Enzymes and Immobilized Cells in Biotechnology, Laskin, A. Ed., Benjamin/Cummings Publishing Co., Inc., Menlo Park, California, 1985; and Applied Biochemistry and Bioengineering, Vol. 4, Chibata, I. and Wingard, Jr., L. Eds., Academic Press, New York, N.Y., 1983.
[0224] Further embodiments of the present invention include nucleic acid molecules that encode a protein of the present invention, as well as homologues, variants, or fragments of such nucleic acid molecules. A nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence encoding any of the isolated proteins disclosed herein, including a fragment or a homologue or variant of such proteins, described above. Nucleic acid molecules can include a nucleic acid sequence that encodes a fragment of a protein that does not have biological activity, and can also include portions of a gene or polynucleotide encoding the protein that are not part of the coding region for the protein (e.g., introns or regulatory regions of a gene encoding the protein). Nucleic acid molecules can include a nucleic acid sequence that is useful as a probe or primer (oligonucleotide sequences).
[0225] In one embodiment, a nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence represented in SEQ ID No: 1, SEQ ID No: 3, SEQ ID No: 5, SEQ ID No: 7, SEQ ID No: 9, SEQ ID No: 11, SEQ ID No: 13, SEQ ID No: 15, SEQ ID No: 17, SEQ ID No: 19, SEQ ID No: 21, SEQ ID No: 23, SEQ ID No: 25, SEQ ID No: 27, SEQ ID No: 29, SEQ ID No: 31, SEQ ID No: 33, SEQ ID No: 35, SEQ ID No: 37, SEQ ID No: 39, SEQ ID No: 41, SEQ ID No: 43, SEQ ID No: 45, SEQ ID No: 47, SEQ ID No: 49, SEQ ID No: 51, SEQ ID No: 53, SEQ ID No: 55, SEQ ID No: 56, SEQ ID No: 57, SEQ ID No: 59, SEQ ID No: 61, SEQ ID No: 63, SEQ ID No: 65, SEQ ID No: 67, SEQ ID No: 69, SEQ ID No: 71, SEQ ID No: 73, SEQ ID No: 75, SEQ ID No: 77, SEQ ID No: 79, SEQ ID No: 81, SEQ ID No: 83, SEQ ID No: 85, SEQ ID No: 87, SEQ ID No: 89, SEQ ID No: 91, SEQ ID No: 93, SEQ ID No: 95, SEQ ID No: 97, SEQ ID No: 99, SEQ ID No: 101, SEQ ID No: 103, SEQ ID No: 105, SEQ ID No: 107, SEQ ID No: 109, SEQ ID No: 111, SEQ ID No: 113, SEQ ID No: 115, SEQ ID No: 117, SEQ ID No: 119, SEQ ID No: 121, SEQ ID No: 123, SEQ ID No: 125, SEQ ID No: 127, SEQ ID No: 129, SEQ ID No: 131, SEQ ID No: 133, SEQ ID No: 135, SEQ ID No: 137, SEQ ID No: 139, SEQ ID No: 141, SEQ ID No: 143, SEQ ID No: 145, SEQ ID No: 147, SEQ ID No: 149, SEQ ID No: 151, SEQ ID No: 153, SEQ ID No: 155, SEQ ID No: 157, SEQ ID No: 159, SEQ ID No: 161, SEQ ID No: 163, SEQ ID No: 165, SEQ ID No: 167, SEQ ID No: 169, SEQ ID No: 171, SEQ ID No: 173, SEQ ID No: 175, SEQ ID No: 177, SEQ ID No: 179, SEQ ID No: 181, SEQ ID No: 183 or fragments or homologues or variants thereof. Preferably, the nucleic acid sequence encodes a protein (including fragments and homologues or variants thereof) useful in the invention, or encompasses useful oligonucleotides or complementary nucleic acid sequences.
[0226] In one embodiment, a nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence encoding an amino acid sequence represented in SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66, SEQ ID No: 68, SEQ ID No: 70, SEQ ID No: 72, SEQ ID No: 74, SEQ ID No: 76, SEQ ID No: 78, SEQ ID No: 80, SEQ ID No: 82, SEQ ID No: 84, SEQ ID No: 86, SEQ ID No: 88, SEQ ID No: 90, SEQ ID No: 92, SEQ ID No: 94, SEQ ID No: 96, SEQ ID No: 98, SEQ ID No: 100, SEQ ID No: 102, SEQ ID No: 104, SEQ ID No: 106, SEQ ID No: 108, SEQ ID No: 110, SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 116, SEQ ID No: 118, SEQ ID No: 120, SEQ ID No: 122, SEQ ID No: 124, SEQ ID No: 126, SEQ ID No: 128, SEQ ID No: 130, SEQ ID No: 132, SEQ ID No: 134, SEQ ID No: 136, SEQ ID No: 138, SEQ ID No: 140, SEQ ID No: 142, SEQ ID No: 144, SEQ ID No: 146, SEQ ID No: 148, SEQ ID No: 150, SEQ ID No: 152, SEQ ID No: 154, SEQ ID No: 156, SEQ ID No: 158, SEQ ID No: 160, SEQ ID No: 162, SEQ ID No: 164, SEQ ID No: 166, SEQ ID No: 168, SEQ ID No: 170, SEQ ID No: 172, SEQ ID No: 174, SEQ ID No: 176, SEQ ID No: 178, SEQ ID No: 180, SEQ ID No: 182, SEQ ID No: 184 or fragments or homologues or variants thereof. Preferably, the nucleic acid sequence encodes a protein (including fragments and homologues or variants thereof) useful in the invention, or encompasses useful oligonucleotides or complementary nucleic acid sequences.
[0227] In one embodiment, such nucleic acid molecules include isolated nucleic acid molecules that hybridize under moderate stringency conditions, and more preferably under high stringency conditions, and even more preferably under very high stringency conditions, as described above, with the complement of a nucleic acid sequence encoding a protein of the present invention (i.e., including naturally occurring allelic variants encoding a protein of the present invention). Preferably, an isolated nucleic acid molecule encoding a protein of the present invention comprises a nucleic acid sequence that hybridizes under moderate, high, or very high stringency conditions to the complement of a nucleic acid sequence that encodes a protein comprising an amino acid sequence represented in SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66, SEQ ID No: 68, SEQ ID No: 70, SEQ ID No: 72, SEQ ID No: 74, SEQ ID No: 76, SEQ ID No: 78, SEQ ID No: 80, SEQ ID No: 82, SEQ ID No: 84, SEQ ID No: 86, SEQ ID No: 88, SEQ ID No: 90, SEQ ID No: 92, SEQ ID No: 94, SEQ ID No: 96, SEQ ID No: 98, SEQ ID No: 100, SEQ ID No: 102, SEQ ID No: 104, SEQ ID No: 106, SEQ ID No: 108, SEQ ID No: 110, SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 116, SEQ ID No: 118, SEQ ID No: 120, SEQ ID No: 122, SEQ ID No: 124, SEQ ID No: 126, SEQ ID No: 128, SEQ ID No: 130, SEQ ID No: 132, SEQ ID No: 134, SEQ ID No: 136, SEQ ID No: 138, SEQ ID No: 140, SEQ ID No: 142, SEQ ID No: 144, SEQ ID No: 146, SEQ ID No: 148, SEQ ID No: 150, SEQ ID No: 152, SEQ ID No: 154, SEQ ID No: 156, SEQ ID No: 158, SEQ ID No: 160, SEQ ID No: 162, SEQ ID No: 164, SEQ ID No: 166, SEQ ID No: 168, SEQ ID No: 170, SEQ ID No: 172, SEQ ID No: 174, SEQ ID No: 176, SEQ ID No: 178, SEQ ID No: 180, SEQ ID No: 182, SEQ ID No: 184.
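Paragraph [0227] defines hybridization with respect to the complement of a coding sequence. A minimal sketch of how that complement strand is derived from a DNA sequence is given below; the function name and the example sequence are hypothetical and are not taken from the sequence listing.

```python
# Illustrative sketch only: the function name and example input are assumptions.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    if set(dna.upper()) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    coding = "ATGGCCAAGTAA"  # hypothetical 12-nucleotide coding sequence
    print(reverse_complement(coding))
```

A probe that is perfectly complementary to the coding strand has the sequence returned by this function; the stringency conditions referred to in the paragraph govern how many mismatches a hybrid can tolerate.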
[0228] In accordance with the present invention, an isolated nucleic acid molecule is a nucleic acid molecule (polynucleotide) that has been removed from its natural milieu (i.e., that has been subject to human manipulation) and can include DNA, RNA, or derivatives of either DNA or RNA, including cDNA. As such, "isolated" does not reflect the extent to which the nucleic acid molecule has been purified. Although the phrase "nucleic acid molecule" primarily refers to the physical nucleic acid molecule, and the phrase "nucleic acid sequence" primarily refers to the sequence of nucleotides on the nucleic acid molecule, the two phrases can be used interchangeably, especially with respect to a nucleic acid molecule, or a nucleic acid sequence, being capable of encoding a protein. An isolated nucleic acid molecule of the present invention can be isolated from its natural source or produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning) or chemical synthesis. Isolated nucleic acid molecules can include, for example, genes, natural allelic variants of genes, coding regions or portions thereof, and coding and/or regulatory regions modified by nucleotide insertions, deletions, substitutions, and/or inversions in a manner such that the modifications do not substantially interfere with the nucleic acid molecule's ability to encode a protein of the present invention or to form stable hybrids under stringent conditions with natural gene isolates. An isolated nucleic acid molecule can include degeneracies. As used herein, nucleotide degeneracy refers to the phenomenon that one amino acid can be encoded by different nucleotide codons. Thus, the nucleic acid sequence of a nucleic acid molecule that encodes a protein of the present invention can vary due to degeneracies. It is noted that a nucleic acid molecule of the present invention is not required to encode a protein having protein activity. A nucleic acid molecule can encode a truncated, mutated or inactive protein, for example. In addition, nucleic acid molecules of the invention are useful as probes and primers for the identification, isolation and/or purification of other nucleic acid molecules. If the nucleic acid molecule is an oligonucleotide, such as a probe or primer, the oligonucleotide preferably ranges from about 5 to about 50 or about 500 nucleotides, more preferably from about 10 to about 40 nucleotides, and most preferably from about 15 to about 40 nucleotides in length.
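The nucleotide degeneracy described in paragraph [0228] can be illustrated with a short sketch in which two different nucleic acid sequences encode the same amino acid sequence. The codon assignments below are a small subset of the standard genetic code; the table name, function name, and both example sequences are hypothetical and are not drawn from the sequence listing.

```python
# Illustrative sketch only: names and example sequences are assumptions.
# Subset of the standard genetic code, limited to the codons used below.
CODON_TABLE = {
    "ATG": "M",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",  # stop codons
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon using CODON_TABLE."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    seq_a = "ATGCTTAAATAA"  # Met-Leu-Lys-stop
    seq_b = "ATGTTGAAGTGA"  # same protein, different codon choices
    print(seq_a != seq_b, translate(seq_a) == translate(seq_b))
```

Because leucine alone is specified by six codons in the standard code, even a short coding region admits many distinct nucleic acid sequences that encode an identical protein.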
[0229] According to the present invention, reference to a gene includes all nucleic acid sequences related to a natural (i.e., wild-type) gene, such as regulatory regions that control production of the protein encoded by that gene (such as, but not limited to, transcription, translation or post-translation control regions) as well as the coding region itself. In another embodiment, a gene can be a naturally occurring allelic variant that includes a similar but not identical sequence to the nucleic acid sequence encoding a given protein. Allelic variants have been described above. Genes can include or exclude one or more introns or any portions thereof or any other sequences which are not included in the cDNA for that protein. The phrases "nucleic acid molecule" and "gene" can be used interchangeably when the nucleic acid molecule comprises a gene as described above.
[0230] Preferably, an isolated nucleic acid molecule of the present invention is produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning, etc.) or chemical synthesis. Isolated nucleic acid molecules include any nucleic acid molecules and homologues or variants thereof that are part of a gene described herein and/or that encode a protein described herein, including, but not limited to, natural allelic variants and modified nucleic acid molecules (homologues or variants) in which nucleotides have been inserted, deleted, substituted, and/or inverted in such a manner that such modifications provide the desired effect on protein biological activity or on the activity of the nucleic acid molecule. Allelic variants and protein homologues or variants (e.g., proteins encoded by nucleic acid homologues or variants) have been discussed in detail above.
[0231] A nucleic acid molecule homologue or variant (i.e., encoding a homologue or variant of a protein of the present invention) can be produced using a number of methods known to those skilled in the art (see, for example, Sambrook et al.). For example, nucleic acid molecules can be modified using a variety of techniques including, but not limited to, classic mutagenesis and recombinant DNA techniques (e.g., site-directed mutagenesis, chemical treatment, restriction enzyme cleavage, ligation of nucleic acid fragments and/or PCR amplification), or synthesis of oligonucleotide mixtures and ligation of mixture groups to "build" a mixture of nucleic acid molecules and combinations thereof. Another method for modifying a recombinant nucleic acid molecule encoding a protein is gene shuffling (i.e., molecular breeding) (see, for example, U.S. Patent No. 5,605,793 to Stemmer; Minshull and Stemmer, 1999, Curr. Opin. Chem. Biol. 3:284-290; Stemmer, 1994, P.N.A.S. USA 91:10747-10751). This technique can be used to efficiently introduce multiple simultaneous changes in the protein. Nucleic acid molecule homologues or variants can be selected by hybridization with a gene or polynucleotide, or by screening for the function of a protein encoded by a nucleic acid molecule (i.e., biological activity).
[0232] The minimum size of a nucleic acid molecule of the present invention is a size sufficient to encode a protein (including a fragment, homologue, or variant of a full-length protein) having biological activity, sufficient to encode a protein comprising at least one epitope which binds to an antibody, or sufficient to form a probe or oligonucleotide primer that is capable of forming a stable hybrid with the complementary sequence of a nucleic acid molecule encoding a natural protein (e.g., under moderate, high, or very high stringency conditions). As such, the size of the nucleic acid molecule encoding such a protein can be dependent on nucleic acid composition and percent homology or identity between the nucleic acid molecule and complementary sequence as well as upon hybridization conditions per se (e.g., temperature, salt concentration, and formamide concentration). The minimal size of a nucleic acid molecule that is used as an oligonucleotide primer or as a probe is typically at least about 12 to about 15 nucleotides in length if the nucleic acid molecules are GC-rich and at least about 15 to about 18 bases in length if they are AT-rich. There is no limit, other than a practical limit, on the maximal size of a nucleic acid molecule of the present invention, in that the nucleic acid molecule can include a portion of a protein encoding sequence, a nucleic acid sequence encoding a full-length protein (including a gene), including any length fragment between about 20 nucleotides and the number of nucleotides that make up the full length cDNA encoding a protein, in whole integers (e.g., 20, 21, 22, 23, 24, 25 nucleotides), or multiple genes, or portions thereof.
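The dependence of minimum probe length on base composition noted in paragraph [0232] can be illustrated numerically. The sketch below computes the GC fraction of an oligonucleotide and a rough melting temperature using the Wallace rule, Tm ≈ 2(A+T) + 4(G+C) in °C, a common rule of thumb for short oligonucleotides. The Wallace rule is not recited in the disclosure and is used here only as an assumed approximation; the function names and both example oligonucleotides are hypothetical.

```python
# Illustrative sketch only: the Wallace rule, the function names, and the
# example oligonucleotides are assumptions, not part of the disclosure.


def _validated(oligo):
    """Upper-case the sequence and reject empty or non-ACGT input."""
    oligo = oligo.upper()
    if not oligo or set(oligo) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    return oligo


def gc_fraction(oligo):
    """Fraction of bases in the oligonucleotide that are G or C."""
    oligo = _validated(oligo)
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def wallace_tm_celsius(oligo):
    """Rule-of-thumb melting temperature: 2 per A/T base plus 4 per G/C base."""
    oligo = _validated(oligo)
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    gc_rich_12mer = "GCGCCGGCGCGC"
    at_rich_18mer = "ATTATAGATTAACATTAT"
    for name, oligo in (("GC-rich", gc_rich_12mer), ("AT-rich", at_rich_18mer)):
        print(name, len(oligo), gc_fraction(oligo), wallace_tm_celsius(oligo))
```

Under this approximation each G or C contributes twice as much to duplex stability as each A or T, which is consistent with the shorter minimum length stated for GC-rich probes.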
[0233] The phrase "consisting essentially of", when used with reference to a nucleic acid sequence herein, refers to a nucleic acid sequence encoding a specified amino acid sequence that can be flanked by from at least one, and up to as many as about 60, additional heterologous nucleotides at each of the 5' and/or the 3' end of the nucleic acid sequence encoding the specified amino acid sequence. The heterologous nucleotides are not naturally found (i.e., not found in nature, in vivo) flanking the nucleic acid sequence encoding the specified amino acid sequence as it occurs in the natural gene or do not encode a protein that imparts any additional function to the protein or changes the function of the protein having the specified amino acid sequence.
[0234] In one embodiment, the polynucleotide probes or primers of the invention are conjugated to detectable markers. Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., 3H, 125I, 35S, 14C, or 32P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Preferably, the polynucleotide probes are immobilized on a substrate such as artificial membranes, organic supports, biopolymer supports and inorganic supports.
[0235] One embodiment of the present invention relates to a recombinant nucleic acid molecule which comprises the isolated nucleic acid molecule described above which is operatively linked to at least one expression control sequence. More particularly, according to the present invention, a recombinant nucleic acid molecule typically comprises a recombinant vector and any one or more of the isolated nucleic acid molecules as described herein. According to the present invention, a recombinant vector is an engineered (i.e., artificially produced) nucleic acid molecule that is used as a tool for manipulating a nucleic acid sequence of choice and/or for introducing such a nucleic acid sequence into a host cell. The recombinant vector is therefore suitable for use in cloning, sequencing, and/or otherwise manipulating the nucleic acid sequence of choice, such as by expressing and/or delivering the nucleic acid sequence of choice into a host cell to form a recombinant cell. Such a vector typically contains nucleic acid sequences that are not naturally found adjacent to the nucleic acid sequence to be cloned or delivered, although the vector can also contain regulatory nucleic acid sequences (e.g., promoters, untranslated regions) which are naturally found adjacent to nucleic acid sequences of the present invention or which are useful for expression of the nucleic acid molecules of the present invention (discussed in detail below). The vector can be either RNA or DNA, either prokaryotic or eukaryotic, and typically is a plasmid. The vector can be maintained as an extrachromosomal element (e.g., a plasmid) or it can be integrated into the chromosome of a recombinant host cell, although it is preferred if the vector remains separate from the genome for most applications of the invention. The entire vector can remain in place within a host cell, or under certain conditions, the plasmid DNA can be deleted, leaving behind the nucleic acid molecule of the present invention. An integrated nucleic acid molecule can be under chromosomal promoter control, under native or plasmid promoter control, or under a combination of several promoter controls. Single or multiple copies of the nucleic acid molecule can be integrated into the chromosome. A recombinant vector of the present invention can contain at least one selectable marker.
[0236] In one embodiment a recombinant vector used in a recombinant nucleic acid molecule of the present invention is an expression vector. As used herein, the phrase "expression vector" is used to refer to a vector that is suitable for production of an encoded product (e.g., a protein of interest, such as an enzyme of the present invention). In this embodiment, a nucleic acid sequence encoding the product to be produced (e.g., the protein or homologue or variant thereof) is inserted into the recombinant vector to produce a recombinant nucleic acid molecule. The nucleic acid sequence encoding the protein to be produced is inserted into the vector in a manner that operatively links the nucleic acid sequence to regulatory sequences in the vector which enable the transcription and translation of the nucleic acid sequence within the recombinant host cell.
[0237] Typically, a recombinant nucleic acid molecule includes at least one nucleic acid molecule of the present invention operatively linked to one or more expression control sequences (e.g., transcription control sequences or translation control sequences). As used herein, the phrase "recombinant molecule" or "recombinant nucleic acid molecule" primarily refers to a nucleic acid molecule or nucleic acid sequence operatively linked to a transcription control sequence, but can be used interchangeably with the phrase "nucleic acid molecule", when such nucleic acid molecule is a recombinant molecule as discussed herein. According to the present invention, the phrase "operatively linked" refers to linking a nucleic acid molecule to an expression control sequence in a manner such that the molecule is able to be expressed when transfected (i.e., transformed, transduced, transfected, conjugated or conduced) into a host cell. Transcription control sequences are sequences which control the initiation, elongation, or termination of transcription. Particularly important transcription control sequences are those which control transcription initiation, such as promoter, enhancer, operator and repressor sequences. Suitable transcription control sequences include any transcription control sequence that can function in a host cell or organism into which the recombinant nucleic acid molecule is to be introduced. Transcription control sequences may also include any combination of one or more of any of the foregoing.
[0238] Recombinant nucleic acid molecules of the present invention can also contain additional regulatory sequences, such as translation regulatory sequences, origins of replication, and other regulatory sequences that are compatible with the recombinant cell. In one embodiment, a recombinant molecule of the present invention, including those which are integrated into the host cell chromosome, also contains secretory signals (i.e., signal segment nucleic acid sequences) to enable an expressed protein to be secreted from the cell that produces the protein. Suitable signal segments include a signal segment that is naturally associated with the protein to be expressed or any heterologous signal segment capable of directing the secretion of the protein according to the present invention. In another embodiment, a recombinant molecule of the present invention comprises a leader sequence to enable an expressed protein to be delivered to and inserted into the membrane of a host cell. Suitable leader sequences include a leader sequence that is naturally associated with the protein, or any heterologous leader sequence capable of directing the delivery and insertion of the protein to the membrane of a cell.
[0239] According to the present invention, the term "transfection" is generally used to refer to any method by which an exogenous nucleic acid molecule (i.e., a recombinant nucleic acid molecule) can be inserted into a cell. The term "transformation" can be used interchangeably with the term "transfection" when such term is used to refer to the introduction of nucleic acid molecules into microbial cells or plants and describes an inherited change due to the acquisition of exogenous nucleic acids by the microorganism that is essentially synonymous with the term "transfection." Transfection techniques include, but are not limited to, transformation, particle bombardment, electroporation, microinjection, lipofection, adsorption, infection and protoplast fusion.
[0240] One or more recombinant molecules of the present invention can be used to produce an encoded product (e.g., a protein) of the present invention. In one embodiment, an encoded product is produced by expressing a nucleic acid molecule as described herein under conditions effective to produce the protein. A preferred method to produce an encoded protein is by transfecting a host cell with one or more recombinant molecules to form a recombinant cell. Suitable host cells to transfect include, but are not limited to, any bacterial, fungal (e.g., filamentous fungi or yeast or mushrooms), algal, plant, insect, or animal cell that can be transfected. Host cells can be either untransfected cells or cells that are already transfected with at least one other recombinant nucleic acid molecule.
[0241] Suitable cells (e.g., a host cell or production organism) may include any microorganism (e.g., a bacterium, a protist, an alga, a fungus, or other microbe), and is preferably a bacterium, a yeast or a filamentous fungus. Suitable bacterial genera include, but are not limited to, Escherichia, Bacillus, Lactobacillus, Pseudomonas and Streptomyces. Suitable bacterial species include, but are not limited to, Escherichia coli, Bacillus subtilis, Bacillus licheniformis, Bacillus stearothermophilus, Lactobacillus brevis, Pseudomonas aeruginosa and Streptomyces lividans. Suitable genera of yeast include, but are not limited to, Saccharomyces, Schizosaccharomyces, Candida, Hansenula, Pichia, Kluyveromyces, and Phaffia. Suitable yeast species include, but are not limited to, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida albicans, Hansenula polymorpha, Pichia pastoris, P. canadensis, Kluyveromyces marxianus and Phaffia rhodozyma.
[0242] Suitable fungal genera include, but are not limited to, Chrysosporium, Thielavia, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium, Humicola, and Trichoderma, and anamorphs and teleomorphs thereof. Suitable fungal species include, but are not limited to, Aspergillus niger, Aspergillus oryzae, Aspergillus nidulans, Aspergillus japonicus, Absidia coerulea, Rhizopus oryzae, Myceliophthora thermophila, Neurospora crassa, Neurospora intermedia, Trichoderma reesei, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, Talaromyces emersonii and Talaromyces flavus. In one embodiment, the host cell is a fungal cell of the species M. thermophila. In another embodiment, a white (low cellulose) strain is used. In one embodiment, the host cell is a fungal cell of Strain C1 (VKM F-3500-D) or a mutant strain derived therefrom (e.g., UV13-6 (Accession No. VKM F-3632 D); NG7C-19 (Accession No. VKM F-3633 D); UV18-25 (VKM F-3631D), W1L (CBS122189), or W1L#100L (CBS122190)). Host cells can be either untransfected cells or cells that are already transfected with at least one other recombinant nucleic acid molecule. Additional embodiments of the present invention include any of the genetically modified cells described herein.
[0243] In another embodiment, suitable host cells include insect cells (most particularly Drosophila melanogaster cells, Spodoptera frugiperda Sf9 and Sf21 cells and Trichoplusia High-Five cells), nematode cells (particularly C. elegans cells), avian cells, amphibian cells (particularly Xenopus laevis cells), reptilian cells, and mammalian cells (most particularly human, simian, canine, rodent, bovine, or sheep cells, e.g. NIH3T3, CHO (Chinese hamster ovary cell), COS, VERO, BHK, HEK, and other rodent or human cells).
[0244] In one embodiment, one or more protein(s) expressed by an isolated nucleic acid molecule of the present invention are produced by culturing a cell that expresses the protein (i.e., a recombinant cell or recombinant host cell) under conditions effective to produce the protein. In some instances, the protein may be recovered, and in others, the cell may be harvested in whole, either of which can be used in a composition.
[0245] Microorganisms used in the present invention (including recombinant host cells or genetically modified microorganisms) are cultured in an appropriate fermentation medium. An appropriate, or effective, fermentation medium refers to any medium in which a cell of the present invention, including a genetically modified microorganism (described below), when cultured, is capable of expressing enzymes useful in the present invention and/or of catalyzing the production of sugars from lignocellulosic biomass. Such a medium is typically an aqueous medium comprising assimilable carbon, nitrogen and phosphate sources. Such a medium can also include appropriate salts, minerals, metals and other nutrients. Microorganisms and other cells of the present invention can be cultured in conventional fermentation bioreactors. The microorganisms can be cultured by any fermentation process which includes, but is not limited to, batch, fed-batch, cell recycle, and continuous fermentation. The fermentation of microorganisms such as fungi may be carried out in any appropriate reactor, using methods known to those skilled in the art. For example, the fermentation may be carried out for a period of 1 to 14 days, or more preferably between about 3 and 10 days. The temperature of the medium is typically maintained between about 25 and 50°C, and more preferably between 28 and 40°C. The pH of the fermentation medium is regulated to a pH suitable for growth and protein production of the particular organism. The fermentor can be aerated in order to supply the oxygen necessary for fermentation and to avoid the excessive accumulation of carbon dioxide produced by fermentation. In addition, the aeration helps to control the temperature and the moisture of the culture medium. 
In general, the fungal strains are grown in fermentors, optionally centrifuged or filtered to remove biomass, and optionally concentrated, formulated, and dried to produce an enzyme(s) or a multi-enzyme composition that is a crude fermentation product. Particularly suitable conditions for culturing filamentous fungi are described, for example, in U.S. Patent No. 6,015,707 and U.S. Patent No. 6,573,086, supra.
[0246] Depending on the vector and host system used for production, resultant proteins of the present invention may either remain within the recombinant cell; be secreted into the culture medium; be secreted into a space between two cellular membranes; or be retained on the outer surface of a cell membrane. The phrase "recovering the protein" refers to collecting the whole culture medium containing the protein and need not imply additional steps of separation or purification. Proteins produced according to the present invention can be purified using a variety of standard protein purification techniques, such as, but not limited to, affinity chromatography, ion exchange chromatography, filtration, electrophoresis, hydrophobic interaction chromatography, gel filtration chromatography, reverse phase chromatography, concanavalin A chromatography, chromatofocusing and differential precipitation or solubilization.
[0247] Proteins of the present invention are preferably retrieved, obtained, and/or used in "substantially pure" form. As used herein, "substantially pure" refers to a purity that allows for the effective use of the protein in any method according to the present invention. For a protein to be useful in any of the methods described herein or in any method utilizing enzymes of the types described herein according to the present invention, it is substantially free of contaminants, other proteins and/or chemicals that might interfere or that would interfere with its use in a method disclosed by the present invention (e.g., that might interfere with enzyme activity), or that at least would be undesirable for inclusion with a protein of the present invention (including homologues and variants) when it is used in a method disclosed by the present invention (described in detail below). Preferably, a "substantially pure" protein, as referenced herein, is a protein that can be produced by any method (i.e., by direct purification from a natural source, recombinantly, or synthetically), and that has been purified from other protein components such that the protein comprises at least about 80% weight/weight of the total protein in a given composition (e.g., the protein of interest is about 80% of the protein in a solution/composition/buffer), and more preferably, at least about 85%, and more preferably at least about 90%, and more preferably at least about 91%, and more preferably at least about 92%, and more preferably at least about 93%, and more preferably at least about 94%, and more preferably at least about 95%, and more preferably at least about 96%, and more preferably at least about 97%, and more preferably at least about 98%, and more preferably at least about 99%, weight/weight of the total protein in a given composition.
[0248] It will be appreciated by one skilled in the art that use of recombinant DNA technologies can improve control of expression of transfected nucleic acid molecules by manipulating, for example, the number of copies of the nucleic acid molecules within the host cell, the efficiency with which those nucleic acid molecules are transcribed, the efficiency with which the resultant transcripts are translated, and the efficiency of post-translational modifications. Additionally, the promoter sequence might be genetically engineered to improve the level of expression as compared to the native promoter. Recombinant techniques useful for controlling the expression of nucleic acid molecules include, but are not limited to, integration of the nucleic acid molecules into one or more host cell chromosomes, addition of vector stability sequences to plasmids, substitutions or modifications of transcription control signals (e.g., promoters, operators, enhancers), substitutions or modifications of translational control signals (e.g., ribosome binding sites), modification of nucleic acid molecules to correspond to the codon usage of the host cell, and deletion of sequences that destabilize transcripts.
[0249] Another aspect of the present invention relates to a genetically modified microorganism that has been transfected with one or more nucleic acid molecules of the present invention. As used herein, a genetically modified microorganism can include a genetically modified bacterium, alga, yeast, filamentous fungus, or other microbe. Such a genetically modified microorganism has a genome which is modified (i.e., mutated or changed) from its normal (i.e., wild-type or naturally occurring) form such that the desired result is achieved (i.e., increased or modified activity and/or production of at least one enzyme or a multi-enzyme composition for the conversion of lignocellulosic material to fermentable sugars). Genetic modification of a microorganism can be accomplished using classical strain development and/or molecular genetic techniques. Such techniques are known in the art and are generally disclosed for microorganisms, for example, in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press or Molecular Cloning: A Laboratory Manual, third edition (Sambrook and Russell, 2001), (jointly referred to herein as "Sambrook"). A genetically modified microorganism can include a microorganism in which nucleic acid molecules have been inserted, deleted or modified (i.e., mutated; e.g., by insertion, deletion, substitution, and/or inversion of nucleotides), in such a manner that such modifications provide the desired effect within the microorganism.
[0250] In one embodiment, a genetically modified microorganism can endogenously contain and express an enzyme or a multi-enzyme composition for the conversion/degradation of lignocellulosic material, and the genetic modification can be a genetic modification of one or more of such endogenous enzymes, whereby the modification has some effect on the ability of the microorganism to convert/degrade lignocellulosic material (e.g., increased expression of the protein by introduction of promoters or other expression control sequences, or modification of the coding region by homologous recombination to increase the activity of the encoded protein).
[0251] In another embodiment, a genetically modified microorganism can endogenously contain and express an enzyme for the conversion/degradation of lignocellulosic material, and the genetic modification can be an introduction of at least one exogenous nucleic acid sequence (e.g., a recombinant nucleic acid molecule), wherein the exogenous nucleic acid sequence encodes at least one additional enzyme useful for the conversion/degradation of lignocellulosic material and/or a protein that improves the efficiency of the enzyme for the conversion/degradation of lignocellulosic material. In this aspect of the invention, the microorganism can also have at least one modification to a gene or genes comprising its endogenous enzyme(s) for the conversion/degradation of lignocellulosic material.
[0252] In yet another embodiment, the genetically modified microorganism does not necessarily endogenously (naturally) contain an enzyme for the conversion/degradation of lignocellulosic material, but is genetically modified to introduce at least one recombinant nucleic acid molecule encoding at least one enzyme or a multiplicity of enzymes for the conversion/degradation of lignocellulosic material. Such a microorganism can be used in a method of the invention, or as a production microorganism for crude fermentation products, partially purified recombinant enzymes, and/or purified recombinant enzymes, any of which can then be used in a method of the present invention.
[0253] Once the proteins (enzymes) are expressed in a host cell, a cell extract that contains the activity to test can be generated. For example, a lysate from the host cell is produced, and the supernatant containing the activity is harvested and/or the activity can be isolated from the lysate. In the case of cells that secrete enzymes into the culture medium, the culture medium containing them can be harvested, and/or the activity can be purified from the culture medium. The extracts/activities prepared in this way can be tested using assays known in the art. Accordingly, methods to identify multi-enzyme compositions capable of degrading lignocellulosic biomass are provided.
[0254] Artificial substrates, or complex mixtures of polymeric carbohydrates and lignin, or actual lignocellulose can be used in such tests. One assay that may be used to measure the release of sugars and oligosaccharides from these complex substrates is the dinitrosalicylic acid assay (DNS). In this assay, the lignocellulosic material such as DDG is incubated with enzyme(s) for various times and reducing sugars are measured.
[0255] The present invention is not limited to fungi and also contemplates genetically modified organisms such as algae, bacteria, and plants transformed with one or more nucleic acid molecules of the invention. The plants may be used for production of the enzymes, and/or as the lignocellulosic material used as a substrate in the methods of the invention. Methods to generate recombinant plants are known in the art. For instance, numerous methods for plant transformation have been developed, including biological and physical transformation protocols. See, for example, Miki et al., "Procedures for Introducing Foreign DNA into Plants" in Methods in Plant Molecular Biology and Biotechnology, Glick, B.R. and Thompson, J.E. Eds. (CRC Press, Inc., Boca Raton, 1993) pp. 67-88. In addition, vectors and in vitro culture methods for plant cell or tissue transformation and regeneration of plants are available. See, for example, Gruber et al., "Vectors for Plant Transformation" in Methods in Plant Molecular Biology and Biotechnology, Glick, B.R. and Thompson, J.E. Eds. (CRC Press, Inc., Boca Raton, 1993) pp. 89-119.
[0256] The most widely utilized method for introducing an expression vector into plants is based on the natural transformation system of Agrobacterium. See, for example, Horsch et al., Science 227:1229 (1985). Descriptions of Agrobacterium vector systems and methods for Agrobacterium-mediated gene transfer are provided by numerous references, including Gruber et al., supra, Miki et al., supra, Moloney et al., Plant Cell Reports 8:238 (1989), and U.S. Patent Nos. 4,940,838 and 5,464,763.
[0257] Another generally applicable method of plant transformation is microprojectile-mediated transformation; see, e.g., Sanford et al., Part. Sci. Technol. 5:27 (1987), Sanford, J.C., Trends Biotech. 6:299 (1988), Sanford, J.C., Physiol. Plant 79:206 (1990), Klein et al., Biotechnology 10:268 (1992).
[0258] Another method for physical delivery of DNA to plants is sonication of target cells. Zhang et al., Bio/Technology 9:996 (1991). Alternatively, liposome or spheroplast fusion has been used to introduce expression vectors into plants. Deshayes et al., EMBO J., 4:2731 (1985), Christou et al., Proc. Natl. Acad. Sci. USA 84:3962 (1987). Direct uptake of DNA into protoplasts using CaCl2 precipitation, polyvinyl alcohol or poly-L-ornithine has also been reported. Hain et al., Mol. Gen. Genet. 199:161 (1985) and Draper et al., Plant Cell Physiol. 23:451 (1982). Electroporation of protoplasts and whole cells and tissues has also been described. Donn et al., In Abstracts of VIIth International Congress on Plant Cell and Tissue Culture IAPTC, A2-38, p. 53 (1990); D'Halluin et al., Plant Cell 4:1495-1505 (1992) and Spencer et al., Plant Mol. Biol. 24:51-61 (1994).
[0259] Some embodiments of the present invention include genetically modified organisms comprising at least one nucleic acid molecule encoding at least one enzyme of the present invention, in which the activity of the enzyme is downregulated. The downregulation may be achieved, for example, by introduction of inhibitors (chemical or biological) of the enzyme activity, by manipulating the efficiency with which those nucleic acid molecules are transcribed, the efficiency with which the resultant transcripts are translated, and the efficiency of post-translational modifications, or by "knocking out" the endogenous copy of the gene. A "knock out" of a gene refers to a molecular biological technique by which the gene in the organism is made inoperative, so that the expression of the gene is substantially reduced or eliminated. Alternatively, in some embodiments the activity of the enzyme may be upregulated. The present invention also contemplates downregulating activity of one or more enzymes while simultaneously upregulating activity of one or more enzymes to achieve the desired outcome.
[0260] Exemplary methods according to the invention are presented below. Examples of the methods described above may also be found in the following references: Trichoderma & Gliocladium, Volume 2, Enzymes, biological control and commercial applications, Editors: Gary E. Harman, Christian P. Kubicek, Taylor & Francis Ltd. 1998, 393 (in particular, chapters 14, 15 and 16); Helmut Uhlig, Industrial enzymes and their applications, Translated and updated by Elfriede M. Linsmaier-Bednar, John Wiley & Sons, Inc. 1998, p. 454 (in particular, chapters 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.9, 5.10, 5.11, and 5.13). For saccharification applications: Hahn-Hagerdal, B., Galbe, M., Gorwa-Grauslund, M.F., Liden, G., Zacchi, G. Bio-ethanol - the fuel of tomorrow from the residues of today, Trends in Biotechnology, 2006, 24 (12), 549-556; Mielenz, J.R. Ethanol production from biomass: technology and commercialization status, Current Opinion in Microbiology, 2001, 4, 324-329; Himmel, M.E., Ruth, M.F., Wyman, C.E., Cellulase for commodity products from cellulosic biomass, Current Opinion in Biotechnology, 1999, 10, 358-364; Sheehan, J., Himmel, M. Enzymes, energy, and the environment: a strategic perspective on the U.S. Department of Energy's Research and Development Activities for Bioethanol, Biotechnology Progress, 1999, 15, 817-827. For textile processing applications: Galante, Y.M., Formantici, C., Enzyme applications in detergency and in manufacturing industries, Current Organic Chemistry, 2003, 7, 1399-1422. For pulp and paper applications: Bajpai, P., Bajpai, P.K. Deinking with enzymes: a review. TAPPI Journal, 1998, 81(12), 111-117; Viikari, L., Pere, J., Suurnäkki, A., Oksanen, T., Buchert, J. Use of cellulases in pulp and paper applications. In: "Carbohydrates from Trichoderma reesei and other microorganisms. Structure, Biochemistry, Genetics and Applications." Editors: Mark Claessens, Wim Nerinckx, and Kathleen Piens, The Royal Society of Chemistry 1998, 245-254. For food and beverage applications: Roller, S., Dea, I.C.M. Biotechnology in the production and modification of biopolymers for foods, Critical Reviews in Biotechnology, 1992, 12(3), 261-277.
[0261] Additional assays and methods for examining the activity of the enzymes are found in U.S. Patent Applications 60/806,876, 60/970,876, 11/487,547, 11/775,777, 11/833,133, and 12/205,694 and incorporated herein by reference.
[0262] The following examples are provided for the purpose of illustration and are not intended to limit the scope of the present invention.
[0263] EXAMPLES
[0264] Example 1
[0265] Acetylxylan Esterase Assay
[0266] This example illustrates an assay to measure acetyl xylan esterase activity towards arabinoxylan oligosaccharides from Eucalyptus wood. This assay measures the release of acetate by the action of the acetyl xylan esterases on the arabinoxylan oligosaccharides.
[0267] Reagents
[0268] Phosphate buffer (0.05 M, pH 7.0) is prepared as follows. 13.8 g of NaH2PO4 * H2O is dissolved in 1 L of Millipore water. 26.8 g of Na2HPO4 * 7H2O is dissolved in Millipore water. 195 mL of the NaH2PO4 solution is mixed with 305 mL of the Na2HPO4 solution and adjusted to 1000 mL with Millipore water. The pH of the resulting solution is equal to 7.0.
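The quantities in this recipe can be checked with a short calculation. The following is a minimal sketch, not part of the described method: the molar masses are standard reference values rather than values from the text, and the 1 L volume for the Na2HPO4 stock is an assumption, since the text states a volume only for the NaH2PO4 stock.

```python
# Sanity check of the phosphate buffer recipe (illustrative sketch only).
# Molar masses (g/mol) are standard reference values, not taken from the text.
M_NAH2PO4_H2O = 137.99
M_NA2HPO4_7H2O = 268.07


def molarity(mass_g, molar_mass, volume_l):
    """Concentration in mol/L of mass_g of solute made up to volume_l."""
    return mass_g / molar_mass / volume_l


# Stock solutions. ASSUMPTION: the dibasic stock is also made up to 1 L.
c_mono = molarity(13.8, M_NAH2PO4_H2O, 1.0)
c_di = molarity(26.8, M_NA2HPO4_7H2O, 1.0)

# 195 mL monobasic + 305 mL dibasic, adjusted to 1000 mL.
total_phosphate = (c_mono * 0.195 + c_di * 0.305) / 1.0

print(f"monobasic stock: {c_mono:.3f} M")
print(f"dibasic stock:   {c_di:.3f} M")
print(f"final buffer:    {total_phosphate:.3f} M total phosphate")
```

Under these assumptions both stocks come out at about 0.1 M, and the 500 mL of combined stock diluted to 1000 mL gives about 0.05 M total phosphate, in line with the stated buffer strength.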
[0269] Acetylated, 4-O-MeGlcA substituted xylo-oligosaccharides with 2-10 xylose residues from Eucalyptus globulus wood (EW-XOS), Eucalyptus globulus wood ATS and Eucalyptus globulus xylan polymer are obtained using the method described in Kabel et al. 2002.
[0270] Enzyme Sample
[0271] 5 mL of substrate solution, containing 1 mg EW-XOS in Millipore water, is mixed with enzyme at a 0.5% (w/w) enzyme/substrate ratio and incubated at 40 °C and pH 7 for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100 °C. The release of acetic acid and formation of new (arabino)xylan oligosaccharides are analyzed by Matrix-Assisted Laser Desorption/Ionisation Time-Of-Flight Mass Spectrometry and Capillary Electrophoresis.
[0272] Substrate Blank
[0273] 5 mL of substrate solution, containing 1 mg EW-XOS in Millipore water, is mixed with buffer to the same volume as the enzyme sample and incubated at 40 °C and pH 7 for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100 °C. The release of acetic acid and formation of new (arabino)xylan oligosaccharides are analyzed by Matrix-Assisted Laser Desorption/Ionisation Time-Of-Flight Mass Spectrometry and Capillary Electrophoresis.
[0274] Matrix-Assisted Laser Desorption/Ionisation Time-Of-Flight Mass Spectrometry (MALDI-TOF MS)
[0275] For MALDI-TOF MS an Ultraflex workstation (Bruker Daltonics GmbH, Germany) is used with a nitrogen laser at 337 nm. The mass spectrometer is calibrated with a mixture of malto-dextrins (mass range 365-2309). The samples are mixed with a matrix solution (1 μL each). The matrix solution is prepared by dissolving 10 mg of 2,5-dihydroxybenzoic acid in a 1 mL mixture of Millipore water in order to prepare a saturated solution. After thorough mixing, the solution is centrifuged to remove undissolved material. 1 μL of the prepared sample and 1 μL of matrix solution are put on a gold plate and dried with warm air.
[0276] Capillary Electrophoresis-Laser induced fluorescence detector (CE-LIF)
[0277] Samples containing about 0.4 mg of EW-XOS are substituted with 5 nmol of maltose as an internal standard. The samples are dried using a centrifugal vacuum evaporator (Speedvac). 5 mg of APTS labeling dye (Beckman Coulter) is dissolved in 48 μL of 15% acetic acid (Beckman Coulter). The dried samples are mixed with 2 μL of the labeling dye solution and 2 μL of 1 M sodium cyanoborohydride (THF, Sigma-Aldrich). The samples are incubated overnight in the dark to allow the labeling reaction to be completed. After overnight incubation, the labeled samples are diluted 100 times with Millipore water before analysis by CE-LIF.
[0278] CE-LIF is performed using ProteomeLab PA800 Protein Characterization System (Beckman Coulter), controlled by 32 Karat Software. The capillary column used is polyvinyl alcohol coated capillary (N-CHO capillary, Beckman Coulter), having 50 μm ID, 50.2 cm length and 40 cm to detector window. 25 mM sodium acetate buffer pH 4.75 containing 0.4% polyethyleneoxide (Carbohydrate separation buffer, Beckman Coulter) is used as running buffer. The sample (ca. 3.5 nL) is injected to the capillary by a pressure of 0.5 psi for 3 seconds. The separation is done for 20 minutes at 30 kV separating voltage, with reversed polarity. During analysis, the samples are stored at 10°C. The labeled EW-XOS are detected using LIF detector at 488 nm excitation and 520 nm emission wavelengths.
[0279] Example 2
[0280] Cutinase Assay
[0281] Esterase activity can be examined by spectrophotometry (Davies et al., 2000) with p-nitrophenyl butyrate (p-NPB) as a substrate. Cutinase activity can also be measured using 3H-labelled apple cutin as substrate by an adaptation of the methodology presented in Köller et al. (1982) and Davies et al. (2000).
[0282] Davies, K.A., De Lorono, I., Foster, S.J., Li, D., Johnstone, K., Ashby, A.M. (2000) Evidence for a role of cutinase in pathogenicity of Pyrenopeziza brassicae on brassicas. Physiol. Mol. Plant Pathol. 57:63-75.
[0283] Köller, W., Allan, C.R., Kolattukudy, P.E. (1982) Role of cutinase and cell wall degrading enzymes in infection of Pisum sativum by Fusarium solani f. sp. pisi. Physiol. Plant Pathol. 20:47-60.
[0284] Example 3
[0285] Ferulic Acid Esterase (Feruloyl Esterase) Assay
[0286] The following example illustrates the assay that measures ferulic acid esterase enzymatic activity.
[0287] This assay measures the release of p-nitrophenol by the action of ferulic acid esterase on p-nitrophenyl butyrate (PNPBu). One ferulic acid esterase unit of activity is the amount of enzyme that liberates 1 micromole of p-nitrophenol in one minute at 37 °C and pH 7.2.
[0288] Phosphate buffer (0.01 M, pH 7.2) is prepared as follows: 0.124 g of NaH2PO4 * H2O and 0.178 g of Na2HPO4 are dissolved in distilled water so that the final volume of the solution is 500 ml and the pH of the resulting solution is equal to 7.2.
[0289] PNPBu (Sigma, USA, cat # N9876-5G) is used as the assay substrate. 10 μl of PNPBu is mixed with 25 ml of 0.01 M phosphate buffer using a magnetic stirrer to obtain a 2 mM stock solution. The solution is stable for 2 days with storage at 4°C.
[0290] The stop reagent (0.25 M Tris-HCl, pH 8.5) is prepared as follows: 30.29 g of Tris is dissolved in 900 ml of distilled water (Solution A). The final 0.25 M Tris-HCl pH 8.5 is prepared by mixing solution A with 37% HCl until the pH of the resulting solution is equal to 8.5. The solution volume is adjusted to 1000 ml. This reagent is used to terminate the enzymatic reaction. Using the above reagents, the assay is performed as detailed below.
[0291] For the enzyme sample, 0.10 mL of 2 mM PNPBu stock solution is mixed with 0.01 mL of the enzyme sample and incubated at 37 °C for 10 minutes. After 10 minutes of incubation, 0.10 mL of 0.25 M Tris-HCl solution pH 8.8 is added and the absorbance at 405 nm (A405) is then measured in microtiter plates as As.
[0292] For the substrate blank, 0.10 mL of 2 mM PNPBu stock solution is mixed with 0.01 mL of 0.01 M phosphate buffer, pH 7.2. 0.10 mL of 0.25 M Tris-HCl solution pH 8.8 is added and the absorbance at 405 nm (A405) is measured in microtiter plates as ASB.
Activity is calculated as follows:
$$\text{Activity (units/mL)} = \frac{\Delta A_{405} \times DF \times 21 \times 1.33}{13.700 \times 10}$$
where ΔA405 = As - ASB, DF is the enzyme dilution factor, 21 is the dilution of 10 μL enzyme solution in 210 μL reaction volume, 1.33 is the conversion factor of microtiter plates to cuvettes, 13.700 is the extinction coefficient (13700 M-1 cm-1) of p-nitrophenol released, corrected for mol/L to μmol/mL, and 10 minutes is the reaction time.
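The terms defined above (the blank-corrected absorbance, the dilution factors, the plate-to-cuvette factor, the extinction coefficient and the reaction time) can be combined in a small helper. This is a minimal sketch under one assumption: the dilution and conversion factors multiply the absorbance difference, while the extinction coefficient and reaction time divide it, which is the reading that yields micromoles per minute per mL. The function name and the example absorbance values are hypothetical.

```python
def esterase_activity(a_sample, a_blank, dilution_factor=1.0,
                      reaction_dilution=21.0, plate_to_cuvette=1.33,
                      extinction=13.700, minutes=10.0):
    """Estimate esterase activity in units/mL from A405 readings.

    a_sample, a_blank -- A405 of enzyme sample (As) and substrate blank (ASB)
    dilution_factor   -- DF, dilution of the enzyme before the assay
    reaction_dilution -- 10 uL enzyme in 210 uL total reaction volume (21)
    plate_to_cuvette  -- microtiter-plate to cuvette conversion (1.33)
    extinction        -- p-nitrophenol coefficient, 13700 M^-1 cm^-1
                         expressed per umol/mL (13.700)
    minutes           -- reaction time
    ASSUMPTION: the terms combine as numerator / (extinction * minutes).
    """
    delta_a405 = a_sample - a_blank
    numerator = delta_a405 * dilution_factor * reaction_dilution * plate_to_cuvette
    return numerator / (extinction * minutes)


# Hypothetical readings, for illustration only.
activity = esterase_activity(a_sample=0.500, a_blank=0.100, dilution_factor=10)
print(f"{activity:.3f} units/mL")
```

Keeping each factor as a named parameter makes it straightforward to reuse the helper with a different reaction time or enzyme volume, provided the corresponding factors are recomputed.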
[0293] Example 4
[0294] Ferulic Acid Esterase Activity
[0295] The following assay is used to measure the enzymatic activity of a ferulic acid esterase towards wheat bran (WB) oligosaccharides by measuring the release of ferulic acid.
[0296] Wheat bran oligosaccharides are prepared by degradation of wheat bran (obtained from Nedalco, The Netherlands) by endo-xylanase III from A. niger (enzyme collection Laboratory of Food Chemistry, Wageningen University, The Netherlands). 50 mg of WB is dissolved in 10 ml of 0.05 M acetate buffer pH 5.0 using a magnetic stirrer. 1.0 ml of WB stock solution is mixed with 0.0075 mg of the enzyme and incubated at 35 °C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100 °C. The residual material is removed by centrifugation (15 minutes at 14000 rpm), and the supernatant is used as the substrate in the assay detailed below.
[0297] For the enzyme sample, 1.0 ml of wheat bran oligosaccharides stock solution is mixed with 0.005 mg of the enzyme sample and incubated at 35°C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100°C. The release of ferulic acid is analyzed by measuring the absorbance at 335 nm.
[0298] For the substrate blank, 1.0 ml of wheat bran oligosaccharides stock solution is mixed with 0.005 mg of 0.05 M acetate buffer, pH 5.0, and incubated at 35°C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100°C. The release of ferulic acid is analyzed by measuring the absorbance at 335 nm.
[0299] Example 5
[0300] Acetyl Esterase Activity
[0301] The following assay is used to measure acetyl esterase activity. This assay measures the release of p-nitrophenol by the action of acetyl esterase on p-nitrophenyl acetate (PNPAc). One acetyl esterase unit of activity is the amount of enzyme that liberates 1 micromole of p-nitrophenol in one minute at 37 °C and pH 5.
[0302] Reagents
[0303] Sodium acetate buffer (0.05 M, pH 5.0) is prepared as follows. 4.1 g of anhydrous sodium acetate or 6.8 g of sodium acetate * 3H2O is dissolved in distilled water so that the final volume of the solution is 1000 mL (Solution A). In a separate flask, 3.0 g (2.86 mL) of glacial acetic acid is mixed with distilled water to make a total volume of 1000 mL (Solution B). The final 0.05 M sodium acetate buffer, pH 5.0, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is equal to 5.0.
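Because Solution A and Solution B are both about 0.05 M, the approximate mixing ratio needed to reach pH 5.0 can be estimated before titrating. The following is a minimal sketch using the Henderson-Hasselbalch relation; the pKa of 4.76 and the molar masses are standard reference values assumed here, not values from the text, and the estimate ignores ionic-strength and temperature effects, so the final pH should still be set with a pH meter as described.

```python
# Estimate how much Solution A (acetate) and Solution B (acetic acid)
# to mix for pH 5.0. Illustrative only; constants are assumed values.
PKA_ACETIC = 4.76          # assumed pKa of acetic acid at about 25 C
M_NAOAC = 82.03            # g/mol, anhydrous sodium acetate
M_HOAC = 60.05             # g/mol, acetic acid

conc_a = 4.1 / M_NAOAC     # mol/L in Solution A (1000 mL)
conc_b = 3.0 / M_HOAC      # mol/L in Solution B (1000 mL)


def base_to_acid_ratio(ph, pka=PKA_ACETIC):
    """Henderson-Hasselbalch: [acetate]/[acetic acid] at the target pH."""
    return 10 ** (ph - pka)


mole_ratio = base_to_acid_ratio(5.0)
# Convert the mole ratio to a volume ratio using the stock concentrations.
volume_ratio = mole_ratio * conc_b / conc_a
fraction_a = volume_ratio / (1.0 + volume_ratio)

print(f"Solution A: {conc_a:.4f} M, Solution B: {conc_b:.4f} M")
print(f"approx. {fraction_a * 1000:.0f} mL A + {(1 - fraction_a) * 1000:.0f} mL B per litre")
```

Under these assumptions the estimate is a little under two parts Solution A to one part Solution B, which serves only as a starting point for the pH adjustment.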
[0304] PNPAc from Fluka (Switzerland, cat. # 46021) is used as the assay substrate. 3.6 mg of PNPAc is dissolved in 10 mL of 0.05 M sodium acetate buffer using magnetic stirrer to obtain 2 mM stock solution. The solution is stable for 2 days on storage at 4 °C.
[0305] The stop reagent (0.25 M Tris-HCl, pH 8.8) is prepared as follows. 30.29 g of Tris is dissolved in 900 mL of distilled water (Solution A). The final 0.25 M Tris-HCl pH 8.8 is prepared by mixing solution A with 37% HCl until the pH of the resulting solution is equal to 8.8. The solution volume is adjusted to 1000 mL. This reagent is used to terminate the enzymatic reaction.
[0306] Using the above reagents, the assay is performed as detailed below.
[0307] Enzyme Sample
[0308] 0.10 mL of 2 mM PNPAc stock solution is mixed with 0.01 mL of the enzyme sample and incubated at 37 °C for 10 minutes. After exactly 10 minutes of incubation, 0.1 mL of 0.25 M Tris-HCl solution is added and then the absorbance at 405 nm (A405) is measured in microtiter plates as As (enzyme sample).
[0309] Substrate Blank
[0310] 0.10 mL of 2 mM PNPAc stock solution is mixed with 0.01 mL of 0.05 M sodium acetate buffer, pH 5.0. Then, 0.1 mL of 0.25 M Tris-HCl solution is added and the absorbance at 405 nm (A405) is measured in microtiter plates as ASB (substrate blank).
[0311] Calculation of Activity
[0312] Activity is calculated as follows:
$$\text{Activity (units/mL)} = \frac{\Delta A_{405} \times DF \times 21 \times 1.33}{13.700 \times 10}$$
where ΔA405 = As (enzyme sample) - ASB (substrate blank), DF is the enzyme dilution factor, 21 is the dilution of 10 μL enzyme solution in 210 μL reaction volume, 1.33 is the conversion factor of microtiter plates to cuvettes, 13.700 is the extinction coefficient (13700 M-1 cm-1) of p-nitrophenol released, corrected for mol/L to μmol/mL, and 10 minutes is the reaction time.
[0313] Example 6
[0314] Glucuronyl esterase activity
[0315] The following assay is used to measure glucuronyl esterase activity. This assay measures the release of 4-O-methyl-glucuronic acid by the action of the glucuronyl esterases on methyl-4-O-methyl-glucuronic acid.
[0316] Reagents
[0317] Sodium acetate buffer (0.1 M, pH 5.0) is prepared as follows. 8.2 g of anhydrous sodium acetate or 13.6 g of sodium acetate * 3H2O is dissolved in distilled water so that the final volume of the solution is 1000 mL (Solution A). In a separate flask, 6.0 g (5.72 mL) of glacial acetic acid is mixed with distilled water to make a total volume of 1000 mL (Solution B). The final 0.1 M sodium acetate buffer, pH 5.0, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is equal to 5.0.
[0318] Methyl-4-O-methyl-glucuronic acid was kindly provided by Prof. Peter Biely (Spanikova and Biely, 2006).
[0319] Enzyme Sample
[0320] 200 µL of methyl-4-O-methyl-glucuronic acid stock solution (0.5 mg/mL) is mixed with 10 µL of the enzyme sample and incubated at 30 °C for 4 hours. The reaction is stopped by heating the samples for 15 minutes at 99 °C. The release of 4-O-methyl-glucuronic acid is analyzed by UPLC-MS.
[0321] Substrate Blank
[0322] 200 µL of methyl-4-O-methyl-glucuronic acid stock solution (0.5 mg/mL) is mixed with 10 µL of buffer and incubated at 30 °C for 4 hours. The reaction is stopped by heating the samples for 15 minutes at 99 °C. The release of 4-O-methyl-glucuronic acid is analyzed by UPLC-MS.
[0323] Results
[0324] It was found that after 4 hours of incubation methyl-4-O-methyl-D-GlcA had been degraded to 4-O-methyl-D-GlcA by Gue1 (CL10365, seq. 98, enzyme produced in 1.5 L fermentations). The MS-diagram is shown in Figure 7. After incubation with Gue2 (CL11231, seq. 99, enzyme produced in 1.5 L fermentations), part of the substrate had been degraded to 4-O-methyl-D-GlcA. The MS-diagram is shown in Figure 8.
[0325] References
[0326] Spanikova, S., Biely, P. (2006). FEBS Lett. 580: 4597-4601.
[0327] Example 7
[0328] Pectin acetyl esterases or Rhamnogalacturonan acetyl esterase activity
[0329] The following assay was used to measure pectin and/or rhamnogalacturonan acetyl esterase activity. This assay measures the release of acetic acid by the action of the pectin acetyl esterase or rhamnogalacturonan acetyl esterase on sugar beet pectin.
[0330] Enzyme assay
[0331] Sugar beet pectin was purchased from CP Kelco (Atlanta, USA). The acetic acid assay kit was purchased from Megazyme (Bray, Ireland).
[0332] The pectin acetyl esterases or the rhamnogalacturonan acetyl esterase were incubated with sugar beet pectin at 50 °C in 10 mM phosphate buffer, pH 7.0, for 16 hours. The E/S ratio was 0.5% (5 µg enzyme/mg substrate). The total volume of the reaction was 110 µL. The released acetic acid was analyzed with the acetic acid assay kit according to the instructions of the supplier. Enzymes with known pectin acetyl esterase activity or rhamnogalacturonan acetyl esterase activity were used as a reference.
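The enzyme dose implied by the E/S ratio above can be worked out as in this sketch. The function names, the substrate mass, and the enzyme stock concentration used in the example are hypothetical; only the 0.5% ratio comes from the text.

```python
def enzyme_dose_ug(substrate_mg, es_percent=0.5):
    """Enzyme mass (ug) for a substrate mass (mg) at an E/S ratio given in % (w/w)."""
    substrate_ug = substrate_mg * 1000.0
    return substrate_ug * es_percent / 100.0


def stock_volume_ul(dose_ug, stock_mg_per_ml):
    """Volume of enzyme stock (uL) that delivers dose_ug; 1 mg/mL equals 1 ug/uL."""
    return dose_ug / stock_mg_per_ml


# Hypothetical example: 1 mg of pectin and an enzyme stock at 0.5 mg/mL
dose = enzyme_dose_ug(1.0)
volume = stock_volume_ul(dose, 0.5)
print(f"Enzyme dose: {dose:.1f} ug, stock volume: {volume:.1f} uL")
```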
* * *
[0333] While various embodiments of the present invention have been described in detail, it is apparent that modifications and adaptations of those embodiments will occur to those skilled in the art. It is to be expressly understood, however, that such modifications and adaptations are within the scope of the present invention, as set forth in the following exemplary claims.

Claims

WHAT IS CLAIMED IS:
1. An isolated nucleic acid sequence selected from the group consisting of:
a) a nucleic acid sequence encoding a protein comprising an amino acid sequence selected from the amino acid sequences of Sequences SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66, SEQ ID No: 68, SEQ ID No: 70, SEQ ID No: 72, SEQ ID No: 74, SEQ ID No: 76, SEQ ID No: 78, SEQ ID No: 80, SEQ ID No: 82, SEQ ID No: 84, SEQ ID No: 86, SEQ ID No: 88, SEQ ID No: 90, SEQ ID No: 92, SEQ ID No: 94, SEQ ID No: 96, SEQ ID No: 98, SEQ ID No: 100, SEQ ID No: 102, SEQ ID No: 104, SEQ ID No: 106, SEQ ID No: 108, SEQ ID No: 110, SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 116, SEQ ID No: 118, SEQ ID No: 120, SEQ ID No: 122, SEQ ID No: 124, SEQ ID No: 126, SEQ ID No: 128, SEQ ID No: 130, SEQ ID No: 132, SEQ ID No: 134, SEQ ID No: 136, SEQ ID No: 138, SEQ ID No: 140, SEQ ID No: 142, SEQ ID No: 144, SEQ ID No: 146, SEQ ID No: 148, SEQ ID No: 150, SEQ ID No: 152, SEQ ID No: 154, SEQ ID No: 156, SEQ ID No: 158, SEQ ID No: 160, SEQ ID No: 162, SEQ ID No: 164, SEQ ID No: 166, SEQ ID No: 168, SEQ ID No: 170, SEQ ID No: 172, SEQ ID No: 174, SEQ ID No: 176, SEQ ID No: 178, SEQ ID No: 180, SEQ ID No: 182, SEQ ID No: 184;
b) a nucleic acid sequence encoding a fragment of the protein of (a), wherein the fragment has a biological activity of the protein of (a); and
c) a nucleic acid sequence encoding an amino acid sequence that is at least about 70% identical to an amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
2. The isolated nucleic acid sequence of claim 1, wherein said nucleic acid sequence encodes an amino acid sequence that is at least about 80% identical to the amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
3. The isolated nucleic acid sequence of claim 1, wherein said nucleic acid sequence encodes an amino acid sequence that is at least about 90% identical to the amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
4. The isolated nucleic acid sequence of claim 1, wherein said nucleic acid sequence encodes an amino acid sequence that is at least about 95% identical to the amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
5. The isolated nucleic acid sequence of claim 1, wherein said nucleic acid sequence encodes an amino acid sequence that is at least about 97% identical to the amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
6. The isolated nucleic acid sequence of claim 1, wherein said nucleic acid sequence encodes an amino acid sequence that is at least about 99% identical to the amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
7. The isolated nucleic acid sequence of claim 1, wherein said nucleic acid sequence encodes a protein comprising an amino acid sequence selected from the group consisting of: the amino acid sequences of SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66, SEQ ID No: 68, SEQ ID No: 70, SEQ ID No: 72, SEQ ID No: 74, SEQ ID No: 76, SEQ ID No: 78, SEQ ID No: 80, SEQ ID No: 82, SEQ ID No: 84, SEQ ID No: 86, SEQ ID No: 88, SEQ ID No: 90, SEQ ID No: 92, SEQ ID No: 94, SEQ ID No: 96, SEQ ID No: 98, SEQ ID No: 100, SEQ ID No: 102, SEQ ID No: 104, SEQ ID No: 106, SEQ ID No: 108, SEQ ID No: 110, SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 116, SEQ ID No: 118, SEQ ID No: 120, SEQ ID No: 122, SEQ ID No: 124, SEQ ID No: 126, SEQ ID No: 128, SEQ ID No: 130, SEQ ID No: 132, SEQ ID No: 134, SEQ ID No: 136, SEQ ID No: 138, SEQ ID No: 140, SEQ ID No: 142, SEQ ID No: 144, SEQ ID No: 146, SEQ ID No: 148, SEQ ID No: 150, SEQ ID No: 152, SEQ ID No: 154, SEQ ID No: 156, SEQ ID No: 158, SEQ ID No: 160, SEQ ID No: 162, SEQ ID No: 164, SEQ ID No: 166, SEQ ID No: 168, SEQ ID No: 170, SEQ ID No: 172, SEQ ID No: 174, SEQ ID No: 176, SEQ ID No: 178, SEQ ID No: 180, SEQ ID No: 182, SEQ ID No: 184.
8. The isolated nucleic acid sequence of claim 1, wherein said nucleic acid sequence comprises a nucleic acid sequence selected from the group consisting of: the nucleic acid sequences of SEQ ID No: 1, SEQ ID No: 3, SEQ ID No: 5, SEQ ID No: 7, SEQ ID No: 9, SEQ ID No: 11, SEQ ID No: 13, SEQ ID No: 15, SEQ ID No: 17, SEQ ID No: 19, SEQ ID No: 21, SEQ ID No: 23, SEQ ID No: 25, SEQ ID No: 27, SEQ ID No: 29, SEQ ID No: 31, SEQ ID No: 33, SEQ ID No: 35, SEQ ID No: 37, SEQ ID No: 39, SEQ ID No: 41, SEQ ID No: 43, SEQ ID No: 45, SEQ ID No: 47, SEQ ID No: 49, SEQ ID No: 51, SEQ ID No: 53, SEQ ID No: 55, SEQ ID No: 56, SEQ ID No: 57, SEQ ID No: 59, SEQ ID No: 61, SEQ ID No: 63, SEQ ID No: 65, SEQ ID No: 67, SEQ ID No: 69, SEQ ID No: 71, SEQ ID No: 73, SEQ ID No: 75, SEQ ID No: 77, SEQ ID No: 79, SEQ ID No: 81, SEQ ID No: 83, SEQ ID No: 85, SEQ ID No: 87, SEQ ID No: 89, SEQ ID No: 91, SEQ ID No: 93, SEQ ID No: 95, SEQ ID No: 97, SEQ ID No: 99, SEQ ID No: 101, SEQ ID No: 103, SEQ ID No: 105, SEQ ID No: 107, SEQ ID No: 109, SEQ ID No: 111, SEQ ID No: 113, SEQ ID No: 115, SEQ ID No: 117, SEQ ID No: 119, SEQ ID No: 121, SEQ ID No: 123, SEQ ID No: 125, SEQ ID No: 127, SEQ ID No: 129, SEQ ID No: 131, SEQ ID No: 133, SEQ ID No: 135, SEQ ID No: 137, SEQ ID No: 139, SEQ ID No: 141, SEQ ID No: 143, SEQ ID No: 145, SEQ ID No: 147, SEQ ID No: 149, SEQ ID No: 151, SEQ ID No: 153, SEQ ID No: 155, SEQ ID No: 157, SEQ ID No: 159, SEQ ID No: 161, SEQ ID No: 163, SEQ ID No: 165, SEQ ID No: 167, SEQ ID No: 169, SEQ ID No: 171, SEQ ID No: 173, SEQ ID No: 175, SEQ ID No: 177, SEQ ID No: 179, SEQ ID No: 181, SEQ ID No: 183.
9. An isolated nucleic acid sequence comprising a nucleic acid sequence that is fully complementary to the nucleic acid sequence of any one of Claims 1 to 8.
10. An isolated protein comprising an amino acid sequence encoded by the nucleic acid sequence of any one of Claims 1 to 8.
11. An isolated fusion protein comprising the isolated protein of Claim 10 fused to a protein comprising an amino acid sequence that is heterologous to the isolated protein of Claim 10.
12. An isolated antibody or antigen binding fragment thereof that selectively binds to the protein of Claim 10.
13. A kit for degrading a complex carbohydrate material to fermentable sugars comprising at least one isolated protein of Claim 10.
14. A detergent comprising at least one isolated protein of Claim 10.
15. A composition for the degradation of complex carbohydrate material comprising at least one isolated protein of Claim 10.
16. A composition for the degumming of vegetable oils comprising at least one isolated protein of Claim 10.
17. A composition for the enhancement of the flavor of cheese comprising at least one isolated protein of Claim 10.
18. A composition for the ripening of cheese comprising at least one isolated protein of Claim 10.
19. A composition for transesterification of flavors comprising at least one isolated protein of Claim 10.
20. A composition for transesterification of cocoa butter substitutes comprising at least one isolated protein of Claim 10.
21. A composition for the removal of wax comprising at least one isolated protein of Claim 10.
22. A cleaning agent comprising at least one isolated protein of Claim 10.
23. A composition for the production of biodiesel comprising at least one isolated protein of Claim 10.
24. A composition for the production of biofuel comprising at least one isolated protein of Claim 10.
25. A composition for reducing the amount of phosphate in animal feed comprising at least one isolated protein of Claim 10.
26. A composition for enhancing the recovery of oil comprising at least one isolated protein of Claim 10.
27. A recombinant nucleic acid sequence comprising the isolated nucleic acid sequence of any one of Claims 1 to 8, operatively linked to at least one expression control sequence.
28. The recombinant nucleic acid sequence of Claim 27, wherein the recombinant nucleic acid sequence comprises an expression vector.
29. The recombinant nucleic acid sequence of Claim 27, wherein the recombinant nucleic acid sequence comprises a targeting vector.
30. An isolated host cell transfected with the nucleic acid sequence of any one of Claims 1 to 9.
31. The isolated host cell of Claim 30, wherein the host cell is selected from the group consisting of: a fungal cell, a plant cell, an algal cell, and a bacterium.
32. The isolated host cell of Claim 30, wherein the host cell is selected from the group consisting of: yeast, mushroom, or a filamentous fungus.
33. The isolated host cell of Claim 30, wherein the filamentous fungus is from a genus selected from the group consisting of: Chrysosporium, Thielavia, Thermomyces, Thermoascus, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium, Humicola, Talaromyces and Trichoderma, and anamorphs and teleomorphs thereof.
34. The isolated host cell of Claim 30, wherein the host cell is a bacterium.
35. An oligonucleotide consisting essentially of at least 12 consecutive nucleotides of a nucleic acid sequence selected from the group consisting of the nucleic acid sequence of SEQ ID No: 1, SEQ ID No: 3, SEQ ID No: 5, SEQ ID No: 7, SEQ ID No: 9, SEQ ID No: 11, SEQ ID No: 13, SEQ ID No: 15, SEQ ID No: 17, SEQ ID No: 19, SEQ ID No: 21, SEQ ID No: 23, SEQ ID No: 25, SEQ ID No: 27, SEQ ID No: 29, SEQ ID No: 31, SEQ ID No: 33, SEQ ID No: 35, SEQ ID No: 37, SEQ ID No: 39, SEQ ID No: 41, SEQ ID No: 43, SEQ ID No: 45, SEQ ID No: 47, SEQ ID No: 49, SEQ ID No: 51, SEQ ID No: 53, SEQ ID No: 55, SEQ ID No: 56, SEQ ID No: 57, SEQ ID No: 59, SEQ ID No: 61, SEQ ID No: 63, SEQ ID No: 65, SEQ ID No: 67, SEQ ID No: 69, SEQ ID No: 71, SEQ ID No: 73, SEQ ID No: 75, SEQ ID No: 77, SEQ ID No: 79, SEQ ID No: 81, SEQ ID No: 83, SEQ ID No: 85, SEQ ID No: 87, SEQ ID No: 89, SEQ ID No: 91, SEQ ID No: 93, SEQ ID No: 95, SEQ ID No: 97, SEQ ID No: 99, SEQ ID No: 101, SEQ ID No: 103, SEQ ID No: 105, SEQ ID No: 107, SEQ ID No: 109, SEQ ID No: 111, SEQ ID No: 113, SEQ ID No: 115, SEQ ID No: 117, SEQ ID No: 119, SEQ ID No: 121, SEQ ID No: 123, SEQ ID No: 125, SEQ ID No: 127, SEQ ID No: 129, SEQ ID No: 131, SEQ ID No: 133, SEQ ID No: 135, SEQ ID No: 137, SEQ ID No: 139, SEQ ID No: 141, SEQ ID No: 143, SEQ ID No: 145, SEQ ID No: 147, SEQ ID No: 149, SEQ ID No: 151, SEQ ID No: 153, SEQ ID No: 155, SEQ ID No: 157, SEQ ID No: 159, SEQ ID No: 161, SEQ ID No: 163, SEQ ID No: 165, SEQ ID No: 167, SEQ ID No: 169, SEQ ID No: 171, SEQ ID No: 173, SEQ ID No: 175, SEQ ID No: 177, SEQ ID No: 179, SEQ ID No: 181, SEQ ID No: 183, or the complement thereof.
36. A kit comprising at least one oligonucleotide of claim 35.
37. A method for producing the protein of Claim 10, comprising culturing a cell that has been transfected with a nucleic acid sequence comprising a nucleic acid sequence encoding the protein, and expressing the protein with the transfected cell.
38. The method of Claim 37, further comprising recovering the protein from the cell or from a culture comprising the cell.
39. A genetically modified organism comprising components suitable for degrading a lignocellulosic material to fermentable sugars, wherein the organism has been genetically modified to express at least one protein of Claim 10.
40. The genetically modified organism of Claim 39, wherein the genetically modified organism is selected from the group consisting of: plants, algae, fungi, and bacteria.
41. The genetically modified organism of Claim 40, wherein the fungus is selected from the group consisting of: yeast, mushroom and filamentous fungus.
42. The genetically modified organism of Claim 41, wherein the filamentous fungus is from a genus selected from the group consisting of: Chrysosporium, Thielavia, Thermomyces, Thermoascus, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Talaromyces, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium, Humicola, and Trichoderma.
43. The genetically modified organism of Claim 41, wherein the filamentous fungus is selected from the group consisting of: Trichoderma reesei, Chrysosporium lucknowense, Myceliophthora thermophila, Aspergillus niger, Aspergillus nidulans, Aspergillus oryzae, Aspergillus aculeatus, Aspergillus japonicus, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, Talaromyces emersonii and Talaromyces flavus.
44. The genetically modified organism of Claim 39, wherein the organism has been genetically modified to express at least one additional enzyme.
45. The genetically modified organism of Claim 44, wherein the additional enzyme is selected from the group consisting of: cellulase, glucosidase, xylanase, xylosidase, ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, (gluco)mannanase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate or pectin lyase, chitosanases, exo-β-D-glucosaminidase, cellobiose dehydrogenase, glucuronyl esterase and acetylxylan esterase.
46. The genetically modified organism of Claim 39, wherein the genetically modified organism is a plant.
47. A recombinant enzyme isolated from the genetically modified microorganism of any one of claims 39 to 46.
48. The recombinant enzyme of claim 37, wherein the enzyme has been subjected to a purification step.
49. A crude fermentation product produced by culturing the cells from the genetically modified organism of any one of claims 39 to 46, wherein the crude fermentation product contains the at least one protein of Claim 10.
50. A multi-enzyme composition comprising enzymes produced by the genetically modified organism of any one of Claims 39 to 46, and recovered therefrom.
51. A multi-enzyme composition comprising at least one protein of Claim 10, and at least one additional protein for degrading complex carbohydrates or a fragment thereof that has biological activity.
52. The multi-enzyme composition of Claim 51, wherein the composition comprises at least one esterase, cellobiohydrolase, at least one xylanase, at least one endoglucanase, at least one β-glucosidase, at least one β-xylosidase, and at least one accessory enzyme.
53. The multi-enzyme composition of Claim 52, wherein the composition comprises about 60% cellobiohydrolases, about 20% xylanases, about 10% endoglucanases, about 3% β-glucosidases, about 2% β-xylosidases, and about 5% accessory enzymes.
54. The multi-enzyme composition of Claim 52 or Claim 53, wherein the xylanases are selected from the group consisting of: endoxylanases, exoxylanases, and β-xylosidases.
55. The multi-enzyme composition of Claim 52 or Claim 53, wherein the accessory enzymes include an enzyme selected from the group consisting of: ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, glucomannase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate and pectin lyase, chitosanases, exo-β-D-glucosaminidase, cellobiose dehydrogenase, glucuronyl esterase, and acetylxylan esterase.
56. The multi-enzyme composition of any one of Claims 50 to 51, wherein the multi-enzyme composition comprises at least one hemicellulase.
57. The multi-enzyme composition of Claim 50, wherein the hemicellulase is selected from the group consisting of a xylanase, an arabinofuranosidase, an acetyl xylan esterase, a glucuronidase, an endo-galactanase, a mannanase, an endo arabinase, an exo arabinase, an exo-galactanase, a ferulic acid esterase, a galactomannanase, a xyloglucanase, and mixtures thereof.
58. The multi-enzyme composition of Claim 57, wherein the xylanase is selected from the group consisting of endoxylanases, exoxylanase, and β-xylosidase.
59. The multi-enzyme composition of any one of Claims 50 to 58, wherein the multi-enzyme composition comprises at least one cellulase.
60. The multi-enzyme composition of any one of Claims 50 to 58, wherein the composition is a crude fermentation product.
61. The multi-enzyme composition of any one of Claims 50 to 52, wherein the composition is a crude fermentation product that has been subjected to a purification step.
62. The multi-enzyme composition of any one of Claims 50 to 52, further comprising one or more accessory enzymes.
63. The multi-enzyme composition of Claim 62, wherein the accessory enzymes include at least one enzyme selected from the group consisting of: cellulase, glucosidase, xylanase, xylosidase, ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, glucomannase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate lyase, chitosanases, exo-β-D-glucosaminidase, cellobiose dehydrogenase, and acetylxylan esterase.
64. The multi-enzyme composition of Claim 62, wherein the accessory enzyme is selected from the group consisting of a glucoamylase, a pectinase, and a ligninase.
65. The multi-enzyme composition of Claim 62, wherein the accessory enzyme is a glucoamylase.
66. The multi-enzyme composition of Claim 62, wherein the accessory enzyme is added as a crude or a semi-purified enzyme mixture.
67. The multi-enzyme composition of Claim 62, wherein the accessory enzyme is produced by culturing at least one organism on a substrate to produce the enzyme.
68. A multi-enzyme composition comprising at least one protein of Claim 10, and at least one additional protein for degrading an arabinoxylan-containing material or a fragment thereof that has biological activity.
69. The multi-enzyme composition of Claim 68, wherein the composition comprises at least one endoxylanase, at least one β-xylosidase, and at least one arabinofuranosidase.
70. The multi-enzyme composition of Claim 69, wherein the at least one arabinofuranosidase comprises an arabinofuranosidase with specificity towards single substituted xylose residues, an arabinofuranosidase with specificity towards double substituted xylose residues, or a combination thereof.
71. A method for degrading complex carbohydrates to fermentable sugars, comprising contacting the complex carbohydrates with at least one isolated protein of Claim 10.
72. The method of Claim 71, further comprising contacting the lignocellulosic material with at least one additional isolated protein comprising an amino acid sequence that is at least about 95% identical to an amino acid sequence selected from the group consisting of the amino acid sequences of Sequences SEQ ID NO: 2, SEQ ID No: 4, SEQ ID No: 6, SEQ ID No: 8, SEQ ID No: 10, SEQ ID No: 12, SEQ ID No: 14, SEQ ID No: 16, SEQ ID No: 18, SEQ ID No: 20, SEQ ID No: 22, SEQ ID No: 24, SEQ ID No: 26, SEQ ID No: 28, SEQ ID No: 30, SEQ ID No: 32, SEQ ID No: 34, SEQ ID No: 36, SEQ ID No: 38, SEQ ID No: 40, SEQ ID No: 42, SEQ ID No: 44, SEQ ID No: 46, SEQ ID No: 48, SEQ ID No: 50, SEQ ID No: 52, SEQ ID No: 54, SEQ ID No: 56, SEQ ID No: 58, SEQ ID No: 60, SEQ ID No: 62, SEQ ID No: 64, SEQ ID No: 66, SEQ ID No: 68, SEQ ID No: 70, SEQ ID No: 72, SEQ ID No: 74, SEQ ID No: 76, SEQ ID No: 78, SEQ ID No: 80, SEQ ID No: 82, SEQ ID No: 84, SEQ ID No: 86, SEQ ID No: 88, SEQ ID No: 90, SEQ ID No: 92, SEQ ID No: 94, SEQ ID No: 96, SEQ ID No: 98, SEQ ID No: 100, SEQ ID No: 102, SEQ ID No: 104, SEQ ID No: 106, SEQ ID No: 108, SEQ ID No: 110, SEQ ID No: 112, SEQ ID No: 114, SEQ ID No: 116, SEQ ID No: 118, SEQ ID No: 120, SEQ ID No: 122, SEQ ID No: 124, SEQ ID No: 126, SEQ ID No: 128, SEQ ID No: 130, SEQ ID No: 132, SEQ ID No: 134, SEQ ID No: 136, SEQ ID No: 138, SEQ ID No: 140, SEQ ID No: 142, SEQ ID No: 144, SEQ ID No: 146, SEQ ID No: 148, SEQ ID No: 150, SEQ ID No: 152, SEQ ID No: 154, SEQ ID No: 156, SEQ ID No: 158, SEQ ID No: 160, SEQ ID No: 162, SEQ ID No: 164, SEQ ID No: 166, SEQ ID No: 168, SEQ ID No: 170, SEQ ID No: 172, SEQ ID No: 174, SEQ ID No: 176, SEQ ID No: 178, SEQ ID No: 180, SEQ ID No: 182, SEQ ID No: 184, wherein the at least one additional protein has cellulolytic enhancing activity.
73. The method of Claim 71, wherein the isolated protein is part of a multi-enzyme composition.
74. A method for degrading complex carbohydrates to fermentable sugars, comprising contacting the lignocellulosic material with at least one multi-enzyme composition of any one of Claims 50 to 70.
75. A method for producing an organic substance, comprising:
saccharifying a lignocellulosic material with a multi-enzyme composition of any one of Claims 50 to 70;
fermenting the saccharified lignocellulosic material obtained with one or more fermenting microorganisms; and
recovering the organic substance from the fermentation.
76. The method of claim 75, wherein the steps of saccharifying and fermenting are performed simultaneously.
77. The method of claim 75, wherein the organic substance is an alcohol, organic acid, ketone, amino acid, or gas.
78. The method of claim 75, wherein the organic substance is an alcohol.
79. The method of claim 78, wherein the alcohol is ethanol.
80. The method of any one of Claims 71 to 79, wherein the lignocellulosic material is selected from the group consisting of herbaceous material, agricultural residue, forestry residue, municipal solid waste, waste paper, and pulp and paper mill residue.
81. The method of any one of Claims 71 to 79, wherein the lignocellulosic material is distiller's dried grains or distiller's dried grains with solubles.
82. The method of any one of Claims 71 to 79, wherein the distiller's dried grains or distiller's dried grains with solubles is derived from corn.
83. A method for degrading a lignocellulosic material consisting of distiller's dried grains or distiller's dried grains with solubles to sugars, the method comprising contacting the distiller's dried grains or distiller's dried grains with solubles with a multi-enzyme composition, whereby at least about 10% of the fermentable sugars are liberated, wherein the multi-enzyme composition is the multi-enzyme composition of any one of Claims 50 to 70.
84. The method of Claim 83, whereby at least about 1 % of the sugars are liberated.
85. The method of claim 83, whereby at least about 20% of the sugars are liberated.
86. The method of claim 83, whereby at least about 23% of the sugars are liberated.
87. The method of claim 83, wherein the distiller's dried grains or distiller's dried grains with solubles is derived from corn.
88. The method of any one of Claims 71 to 87, further comprising a pretreatment process for pretreating the lignocellulosic material.
89. The method of Claim 88, wherein the pretreatment process is selected from the group consisting of physical treatment, metal ion, ultraviolet light, ozone, organosolv treatment, steam explosion treatment, lime impregnation with steam explosion treatment, hydrogen peroxide treatment, hydrogen peroxide/ozone (peroxone) treatment, acid treatment, dilute acid treatment, and base treatment.
90. The method of Claim 88, wherein the pretreatment process is selected from the group consisting of organosolv, steam explosion, heat treatment and APEX.
91. The method of Claim 90, wherein the heat treatment comprises heating the lignocellulosic material to 121°C for 15 minutes.
92. The method of any one of Claims 71 to 91, further comprising detoxifying the lignocellulosic material.
93. The method of any one of Claims 71 to 92, further comprising recovering the fermentable sugar.
94. The method of Claim 93, wherein the sugar is selected from the group consisting of glucose, xylose, arabinose, galactose, mannose, rhamnose, sucrose and fructose.
95. The method of any one of Claims 71 to 94, further comprising recovering the contacted lignocellulosic material after the fermentable sugars are degraded.
96. A feed additive comprising the recovered lignocellulosic material of Claim 95.
97. The feed additive of Claim 96, wherein the protein content of the recovered lignocellulosic material is higher than that of the starting lignocellulosic material.
98. A method of improving the performance of an animal which comprises administering to the animal the feed additive of Claim 96.
99. A method for improving the nutritional quality of an animal feed comprising adding the feed additive of Claim 96 to an animal feed.
100. A method of biorefining, deinking or biobleaching paper or pulp, comprising contacting the paper or pulp with at least one isolated protein of Claim 10.
101. A method of biorefining, deinking or biobleaching paper or pulp, comprising contacting the paper or pulp with at least one multi-enzyme composition of any one of Claims 50 to 70.
102. A method for removing stains with a detergent composition, comprising adding at least one isolated protein of Claim 10 to the detergent composition.
103. A method for removing stains with a detergent composition, comprising adding at least one multi-enzyme composition of any one of Claims 50 to 70 to the detergent composition.
104. A method for removing stains with a cleaning composition, comprising adding at least one isolated protein of Claim 10 to the detergent composition.
105. A method for removing stains with a cleaning composition, comprising adding at least one multi-enzyme composition of any one of Claims 50 to 70 to the detergent composition.
106. A detergent composition, comprising at least one isolated protein of Claim 10 and at least one surfactant.
107. A detergent composition, comprising at least one multi-enzyme composition of any one of Claims 50 to 70 and at least one surfactant.
108. A method for releasing cellular contents comprising contacting a cell with at least one isolated protein of Claim 10.
109. The method of claim 108, wherein the cell is selected from the group consisting of: a bacterium, an algal cell, a fungal cell or a plant cell.
110. The method of claim 108, where the cell is an algal cell.
111. The method of claim 108, wherein contacting the cell with at least one isolated protein of Claim 10 degrades the cell wall.
112. The method of claim 108, wherein the cellular contents are selected from the group consisting of: alcohols and oils.
113. A composition for degrading cell walls comprising at least one isolated protein of Claim 10.
114. A method for improving the nutritional quality of food comprising adding to the food at least one isolated protein of Claim 10.
115. A method for improving the nutritional quality of food comprising pretreating the food with at least one isolated protein of Claim 10.
116. A method for improving the nutritional quality of animal feed comprising adding to the animal feed at least one isolated protein of Claim 10.
117. A method for improving the nutritional quality of animal feed comprising pretreating the feed with at least one isolated protein of Claim 10.
118. A method for the degumming of vegetable oils comprising at least one isolated protein of Claim 10.
119. A method for the enhancement of the flavor of cheese comprising at least one isolated protein of Claim 10.
120. A method for the ripening of cheese comprising at least one isolated protein of Claim 10.
121. A method for transesterification of flavors comprising at least one isolated protein of Claim 10.
122. A method for transesterification of cocoa butter substitutes comprising at least one isolated protein of Claim 10.
123. A method for the removal of wax comprising at least one isolated protein of Claim 10.
124. A method for producing biodiesel comprising at least one isolated protein of Claim 10.
125. A method for producing biofuel comprising at least one isolated protein of Claim 10.
126. A method for reducing the amount of phosphate in animal feed comprising at least one isolated protein of Claim 10.
127. A method for enhancing the recovery of oil comprising at least one isolated protein of Claim 10.
128. A genetically modified organism comprising at least one nucleic acid sequence encoding at least one protein of Claim 10, in which the activity of one or more of the proteins of claim 10 is upregulated, the activity of one or more of the proteins of claim 10 is downregulated, or the activity of one or more of the proteins of claim 10 is upregulated and the activity of one or more of the proteins of claim 10 is downregulated.
PCT/US2011/063716 2010-12-07 2011-12-07 Novel fungal esterases WO2012078741A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US42053110P 2010-12-07 2010-12-07
US61/420,531 2010-12-07

Publications (2)

Publication Number Publication Date
WO2012078741A2 (en) 2012-06-14
WO2012078741A3 (en) 2012-11-29

Family

ID=46207708

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2011/063716 WO2012078741A2 (en) 2010-12-07 2011-12-07 Novel fungal esterases

Country Status (1)

Country Link
WO (1) WO2012078741A2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100304437A1 (en) * 2009-05-29 2010-12-02 Novozymes, Inc. Methods for enhancing the degradation or conversion of cellulosic material

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
CHAOGUANG TIAN ET AL.: 'Systems analysis of plant cell wall degradation by the model filamentous fungus Neurospora crassa' PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES vol. 106, no. 52, 29 December 2009, ISSN 0027-8424 pages 22157 - 22162 *
DATABASE GENBANK 09 April 2008 'Chaetomium globosum CBS 148.51 hypothetical protein (CHGG_05615) partial mRNA' Database accession no. XM_001221709 *
DATABASE GENBANK 10 April 2008 'Esterase D [Neurospora crassa OR74A]' Database accession no. XP_956396 *
DATABASE GENBANK 21 February 2008 'Aspergillus clavatus NRRL 1 esterase, putative (ACLA_012870), partial mRNA' Database accession no. XM_001274284 *
DATABASE GENBANK 28 October 2011 'Carbohydrate esterase family 1 protein [Thielavia terrestris NRRL 8126]' Database accession no. AE069542 *
DATABASE GENBANK 28 October 2011 'Thielavia terrestris NRRL 8126 chromosome 4, complete sequence' Database accession no. CP003012 *
MANUEL D. OSPINA-GIRALDO ET AL.: 'The CAZyome of Phytophthora spp.: A comprehensive analysis of the gene complement coding for carbohydrate -active enzymes in species of the genus Phytophthora' BMC GENOMICS vol. 11, 28 September 2010, ISSN 1471-2164 page 525 *
RANDY M. BERKA ET AL.: 'Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris' NATURE BIOTECHNOLOGY vol. 29, no. 10, 02 October 2011, ISSN 1087-0156 pages 922 - 927 *
RONALD P. DE VRIES ET AL.: 'Aspergillus enzymes involved in degradation of plant cell wall polysaccharides' MICROBIOLOGY AND MOLECULAR BIOLOGY REVIEWS. December 2001, ISSN 1092-2172 pages 497 - 522 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014064331A1 (en) * 2012-10-26 2014-05-01 Roal Oy Novel esterases in the treatment of cellulosic and lignocellulosic material
CN104781398A (en) * 2012-10-26 2015-07-15 罗尔公司 Novel esterases in the treatment of cellulosic and lignocellulosic material
US9434929B2 (en) 2012-10-26 2016-09-06 Roal Oy Esterases useful in the treatment of cellulosic and lignocellulosic material
CN104781398B (en) * 2012-10-26 2018-05-08 罗尔公司 Handle the novel esterases of cellulose and ligno-cellulosic materials
CN108532000A (en) * 2018-04-17 2018-09-14 中国农业科学院麻类研究所 A kind of Degumming method of bluish dogbane bast
CN108532000B (en) * 2018-04-17 2020-07-10 中国农业科学院麻类研究所 Degumming method of apocynum venetum bast
CN109486867A (en) * 2018-10-30 2019-03-19 中国科学院天津工业生物技术研究所 Compound cellulose enzyme system and its application in starch fuel ethanol production
CN109486867B (en) * 2018-10-30 2022-05-27 中国科学院天津工业生物技术研究所 Composite cellulase system and application thereof in starch fuel ethanol production
WO2023225459A2 (en) 2022-05-14 2023-11-23 Novozymes A/S Compositions and methods for preventing, treating, supressing and/or eliminating phytopathogenic infestations and infections

Also Published As

Publication number Publication date
WO2012078741A3 (en) 2012-11-29

Similar Documents

Publication Publication Date Title
EP2197893B1 (en) Novel fungal enzymes
CA2657684C (en) Construction of highly efficient cellulase compositions for enzymatic hydrolysis of cellulose
EP2421965B1 (en) Carbohydrate degrading polypeptide and uses thereof
US9133448B2 (en) Polypeptide having cellobiohydrolase activity and uses thereof
US9260704B2 (en) Polypeptide having or assisting in carbohydrate material degrading activity and uses thereof
EP3318574B1 (en) Polypeptide having beta-glucosidase activity and uses thereof
US20130280764A1 (en) Method of improving the activity of cellulase enzyme mixtures in the saccharification (ligno)cellulosic material
EP2588492B1 (en) Polypeptide having beta-glucosidase activity and uses thereof
WO2012027374A2 (en) Novel fungal carbohydrate hydrolases
US10266863B2 (en) Enzymatic activity of lytic polysaccharide monooxygenase
EP2183363A2 (en) Novel fungal enzymes
WO2012018691A2 (en) Novel fungal enzymes
WO2012021883A2 (en) Novel fungal enzymes
WO2012078741A2 (en) Novel fungal esterases
US9175050B2 (en) Polypeptide having swollenin activity and uses thereof
WO2016207351A2 (en) Polypeptides having demethylating activity

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11847464

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11847464

Country of ref document: EP

Kind code of ref document: A2