WO2012018691A2 - Novel fungal enzymes - Google Patents

Novel fungal enzymes

Info

Publication number
WO2012018691A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
protein
acid sequence
nucleic acid
amino acid
Prior art date
Application number
PCT/US2011/045949
Other languages
French (fr)
Other versions
WO2012018691A3 (en)
Inventor
Johannes Visser
Sandra Hinz
Jan Werij
Jacob Visser
Vivi Joosten
Martijn Koetsier
Mark Emalfarb
Original Assignee
Dyadic International, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dyadic International, Inc.
Publication of WO2012018691A2
Publication of WO2012018691A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00 Preparation of compounds containing saccharide radicals
    • C12P19/02 Monosaccharides
    • A HUMAN NECESSITIES
    • A23 FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
    • A23K FODDER
    • A23K10/00 Animal feeding-stuffs
    • A23K10/10 Animal feeding-stuffs obtained by microbiological or biochemical processes
    • A23K10/14 Pretreatment of feeding-stuffs with enzymes
    • A HUMAN NECESSITIES
    • A23 FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
    • A23K FODDER
    • A23K20/00 Accessory food factors for animal feeding-stuffs
    • A23K20/10 Organic substances
    • A23K20/189 Enzymes
    • A HUMAN NECESSITIES
    • A23 FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
    • A23L FOODS, FOODSTUFFS, OR NON-ALCOHOLIC BEVERAGES, NOT COVERED BY SUBCLASSES A21D OR A23B-A23J; THEIR PREPARATION OR TREATMENT, e.g. COOKING, MODIFICATION OF NUTRITIVE QUALITIES, PHYSICAL TREATMENT; PRESERVATION OF FOODS OR FOODSTUFFS, IN GENERAL
    • A23L29/00 Foods or foodstuffs containing additives; Preparation or treatment thereof
    • A23L29/06 Enzymes
    • C CHEMISTRY; METALLURGY
    • C11 ANIMAL OR VEGETABLE OILS, FATS, FATTY SUBSTANCES OR WAXES; FATTY ACIDS THEREFROM; DETERGENTS; CANDLES
    • C11D DETERGENT COMPOSITIONS; USE OF SINGLE SUBSTANCES AS DETERGENTS; SOAP OR SOAP-MAKING; RESIN SOAPS; RECOVERY OF GLYCEROL
    • C11D3/00 Other compounding ingredients of detergent compositions covered in group C11D1/00
    • C11D3/16 Organic compounds
    • C11D3/38 Products with no well-defined composition, e.g. natural products
    • C11D3/386 Preparations containing enzymes, e.g. protease or amylase
    • C CHEMISTRY; METALLURGY
    • C11 ANIMAL OR VEGETABLE OILS, FATS, FATTY SUBSTANCES OR WAXES; FATTY ACIDS THEREFROM; DETERGENTS; CANDLES
    • C11D DETERGENT COMPOSITIONS; USE OF SINGLE SUBSTANCES AS DETERGENTS; SOAP OR SOAP-MAKING; RESIN SOAPS; RECOVERY OF GLYCEROL
    • C11D3/00 Other compounding ingredients of detergent compositions covered in group C11D1/00
    • C11D3/16 Organic compounds
    • C11D3/38 Products with no well-defined composition, e.g. natural products
    • C11D3/386 Preparations containing enzymes, e.g. protease or amylase
    • C11D3/38636 Preparations containing enzymes, e.g. protease or amylase containing enzymes other than protease, amylase, lipase, cellulase, oxidase or reductase
    • C CHEMISTRY; METALLURGY
    • C11 ANIMAL OR VEGETABLE OILS, FATS, FATTY SUBSTANCES OR WAXES; FATTY ACIDS THEREFROM; DETERGENTS; CANDLES
    • C11D DETERGENT COMPOSITIONS; USE OF SINGLE SUBSTANCES AS DETERGENTS; SOAP OR SOAP-MAKING; RESIN SOAPS; RECOVERY OF GLYCEROL
    • C11D3/00 Other compounding ingredients of detergent compositions covered in group C11D1/00
    • C11D3/16 Organic compounds
    • C11D3/38 Products with no well-defined composition, e.g. natural products
    • C11D3/386 Preparations containing enzymes, e.g. protease or amylase
    • C11D3/38645 Preparations containing enzymes, e.g. protease or amylase containing cellulase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/24 Hydrolases (3) acting on glycosyl compounds (3.2)
    • C12N9/2402 Hydrolases (3) acting on glycosyl compounds (3.2) hydrolysing O- and S- glycosyl compounds (3.2.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00 Preparation of oxygen-containing organic compounds
    • C12P7/02 Preparation of oxygen-containing organic compounds containing a hydroxy group
    • C12P7/04 Preparation of oxygen-containing organic compounds containing a hydroxy group acyclic
    • C12P7/06 Ethanol, i.e. non-beverage
    • C12P7/08 Ethanol, i.e. non-beverage produced as by-product or from waste or cellulosic material substrate
    • C12P7/10 Ethanol, i.e. non-beverage produced as by-product or from waste or cellulosic material substrate substrate containing cellulosic material
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y302/00 Hydrolases acting on glycosyl compounds, i.e. glycosylases (3.2)
    • C12Y302/01 Glycosidases, i.e. enzymes hydrolysing O- and S-glycosyl compounds (3.2.1)
    • C12Y302/01055 Alpha-N-arabinofuranosidase (3.2.1.55)
    • D TEXTILES; PAPER
    • D06 TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
    • D06L DRY-CLEANING, WASHING OR BLEACHING FIBRES, FILAMENTS, THREADS, YARNS, FABRICS, FEATHERS OR MADE-UP FIBROUS GOODS; BLEACHING LEATHER OR FURS
    • D06L1/00 Dry-cleaning or washing fibres, filaments, threads, yarns, fabrics, feathers or made-up fibrous goods
    • D06L1/12 Dry-cleaning or washing fibres, filaments, threads, yarns, fabrics, feathers or made-up fibrous goods using aqueous solvents
    • D06L1/14 De-sizing
    • D TEXTILES; PAPER
    • D06 TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
    • D06L DRY-CLEANING, WASHING OR BLEACHING FIBRES, FILAMENTS, THREADS, YARNS, FABRICS, FEATHERS OR MADE-UP FIBROUS GOODS; BLEACHING LEATHER OR FURS
    • D06L4/00 Bleaching fibres, filaments, threads, yarns, fabrics, feathers or made-up fibrous goods; Bleaching leather or furs
    • D06L4/40 Bleaching fibres, filaments, threads, yarns, fabrics, feathers or made-up fibrous goods; Bleaching leather or furs using enzymes
    • D TEXTILES; PAPER
    • D06 TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
    • D06M TREATMENT, NOT PROVIDED FOR ELSEWHERE IN CLASS D06, OF FIBRES, THREADS, YARNS, FABRICS, FEATHERS OR FIBROUS GOODS MADE FROM SUCH MATERIALS
    • D06M16/00 Biochemical treatment of fibres, threads, yarns, fabrics, or fibrous goods made from such materials, e.g. enzymatic
    • D06M16/003 Biochemical treatment of fibres, threads, yarns, fabrics, or fibrous goods made from such materials, e.g. enzymatic with enzymes or microorganisms
    • D TEXTILES; PAPER
    • D06 TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
    • D06P DYEING OR PRINTING TEXTILES; DYEING LEATHER, FURS OR SOLID MACROMOLECULAR SUBSTANCES IN ANY FORM
    • D06P1/00 General processes of dyeing or printing textiles, or general processes of dyeing leather, furs, or solid macromolecular substances in any form, classified according to the dyes, pigments, or auxiliary substances employed
    • D06P1/22 General processes of dyeing or printing textiles, or general processes of dyeing leather, furs, or solid macromolecular substances in any form, classified according to the dyes, pigments, or auxiliary substances employed using vat dyestuffs including indigo
    • D06P1/228 Indigo
    • D TEXTILES; PAPER
    • D06 TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
    • D06P DYEING OR PRINTING TEXTILES; DYEING LEATHER, FURS OR SOLID MACROMOLECULAR SUBSTANCES IN ANY FORM
    • D06P5/00 Other features in dyeing or printing textiles, or dyeing leather, furs, or solid macromolecular substances in any form
    • D06P5/13 Fugitive dyeing or stripping dyes
    • D TEXTILES; PAPER
    • D06 TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
    • D06P DYEING OR PRINTING TEXTILES; DYEING LEATHER, FURS OR SOLID MACROMOLECULAR SUBSTANCES IN ANY FORM
    • D06P5/00 Other features in dyeing or printing textiles, or dyeing leather, furs, or solid macromolecular substances in any form
    • D06P5/13 Fugitive dyeing or stripping dyes
    • D06P5/137 Fugitive dyeing or stripping dyes with other compounds
    • D TEXTILES; PAPER
    • D06 TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
    • D06P DYEING OR PRINTING TEXTILES; DYEING LEATHER, FURS OR SOLID MACROMOLECULAR SUBSTANCES IN ANY FORM
    • D06P5/00 Other features in dyeing or printing textiles, or dyeing leather, furs, or solid macromolecular substances in any form
    • D06P5/15 Locally discharging the dyes
    • D TEXTILES; PAPER
    • D06 TREATMENT OF TEXTILES OR THE LIKE; LAUNDERING; FLEXIBLE MATERIALS NOT OTHERWISE PROVIDED FOR
    • D06P DYEING OR PRINTING TEXTILES; DYEING LEATHER, FURS OR SOLID MACROMOLECULAR SUBSTANCES IN ANY FORM
    • D06P5/00 Other features in dyeing or printing textiles, or dyeing leather, furs, or solid macromolecular substances in any form
    • D06P5/15 Locally discharging the dyes
    • D06P5/158 Locally discharging the dyes with other compounds
    • D TEXTILES; PAPER
    • D21 PAPER-MAKING; PRODUCTION OF CELLULOSE
    • D21C PRODUCTION OF CELLULOSE BY REMOVING NON-CELLULOSE SUBSTANCES FROM CELLULOSE-CONTAINING MATERIALS; REGENERATION OF PULPING LIQUORS; APPARATUS THEREFOR
    • D21C5/00 Other processes for obtaining cellulose, e.g. cooking cotton linters; Processes characterised by the choice of cellulose-containing starting materials
    • D21C5/005 Treatment of cellulose-containing material with microorganisms or enzymes
    • D TEXTILES; PAPER
    • D21 PAPER-MAKING; PRODUCTION OF CELLULOSE
    • D21H PULP COMPOSITIONS; PREPARATION THEREOF NOT COVERED BY SUBCLASSES D21C OR D21D; IMPREGNATING OR COATING OF PAPER; TREATMENT OF FINISHED PAPER NOT COVERED BY CLASS B31 OR SUBCLASS D21G; PAPER NOT OTHERWISE PROVIDED FOR
    • D21H17/00 Non-fibrous material added to the pulp, characterised by its constitution; Paper-impregnating material characterised by its constitution
    • D21H17/005 Microorganisms or enzymes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P2203/00 Fermentation products obtained from optionally pretreated or hydrolyzed cellulosic or lignocellulosic material as the carbon source
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E50/00 Technologies for the production of fuel of non-fossil origin
    • Y02E50/10 Biofuels, e.g. bio-diesel

Definitions

  • This invention relates to novel enzymes and novel methods for producing the same. More specifically, this invention relates to enzymes produced by fungi.
  • the invention also relates to a method to convert lignocellulosic biomass or cellulosic substrates to fermentable sugars with enzymes that degrade lignocellulosic, cellulosic, and even more complex plant cell wall material and to novel combinations of enzymes, including those that provide a combined or synergistic release of sugars from plant biomass.
  • the invention also relates to a method to release cellular contents by effecting degradation of the cell walls.
  • the invention also relates to methods of using the novel enzymes and compositions of such enzymes in a variety of other processes, such as washing or treating of clothing or fabrics, detergent processes, animal feed, food, baking, production of biochemicals and biofuel, starch preparation, liquefaction, beverage, biorefining, deinking and biobleaching of paper and pulp, oil and waste dispersing, and treatment of waste streams.
  • distillers' dried grains (DDG) are lignocellulosic byproducts of the corn dry milling process. Milled whole corn kernels are treated with amylases to liquefy the starch within the kernels and hydrolyze it to glucose. The glucose so produced is then fermented in a second step to ethanol. The residual solids after the ethanol fermentation and distillation are centrifuged and dried, and the resulting product is DDG, which is used as an animal feed stock.
  • Although DDG composition can vary, a typical composition for DDG is: about 32% hemicellulose, 22% cellulose, 30% protein, 10% lipids, 4% residual starch, and 4% inorganics.
  • the cellulose and hemicellulose fractions, comprising about 54% of the weight of the DDG, can be efficiently hydrolyzed to fermentable sugars by enzymes; however, it has been found that the carbohydrates comprising lignocellulosic materials in DDG are more difficult to digest.
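The 54% figure follows from the typical composition quoted above. A minimal Python sketch of that arithmetic, using only the approximate percentages given in the text (the dictionary and variable names are illustrative and do not come from the source):

```python
# Approximate DDG composition (percent by weight) as quoted in the text.
# These are "about" figures, so they need not sum to exactly 100.
ddg_composition = {
    "hemicellulose": 32,
    "cellulose": 22,
    "protein": 30,
    "lipids": 10,
    "residual_starch": 4,
    "inorganics": 4,
}

# Combined structural carbohydrate fraction (hemicellulose + cellulose).
carbohydrate_fraction = (
    ddg_composition["hemicellulose"] + ddg_composition["cellulose"]
)
total = sum(ddg_composition.values())

print(carbohydrate_fraction)  # 54
print(total)                  # 102 (rounded figures overshoot 100 slightly)
```

The total of 102 reflects the rounding implied by "about"; the sketch does not attempt to renormalize the values.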
  • the efficiency of hydrolysis of these (hemi)cellulosic polymers by enzymes is much lower than the hydrolytic efficiency of starch, due to the more complex and recalcitrant nature of these substrates. Accordingly, the cost of producing the requisite enzymes is higher than the cost of producing amylases for starch hydrolysis.
  • Major polysaccharides comprising lignocellulosic materials include cellulose and hemicelluloses.
  • the enzymatic hydrolysis of these polysaccharides to soluble sugars (and finally to monomers such as glucose, xylose and other hexoses and pentoses) is catalyzed by several enzymes acting in concert.
  • endo-1,4-β-glucanases (EGs) and exo-cellobiohydrolases (CBHs) catalyze the hydrolysis of insoluble cellulose to cellooligosaccharides (with cellobiose as the main product), while β-glucosidases (BGLs) convert the oligosaccharides to glucose.
  • EGs: endo-1,4-β-glucanases
  • CBHs: exo-cellobiohydrolases
  • BGLs: β-glucosidases
  • xylanases, together with other enzymes such as α-L-arabinofuranosidases, ferulic and acetylxylan esterases, and β-xylosidases, catalyze the hydrolysis of hemicelluloses.
  • Enzymes useful for the hydrolysis of complex polysaccharides are also highly useful in a variety of industrial textile applications, as well as industrial paper and pulp applications, and in the treatment of waste streams.
  • methods for treating cellulose-containing fabrics for clothing with hydrolytic enzymes, such as cellulases, are known to improve the softness or feel of such fabrics.
  • Cellulases are also used in detergent compositions, either for the purpose of enhancing the cleaning ability of the composition or as a softening agent.
  • Cellulases are also used in combination with polymeric agents in processes for providing a localized variation in the color density of fibers.
  • Such enzymes can also be used for the saccharification of lignocellulosic biomass in waste streams, such as municipal solid waste, for biobleaching of wood pulp, and for deinking of recycled print paper.
  • for the hydrolysis of these polysaccharides in lignocellulosic materials for use as the feedstocks described above, the cost and hydrolytic efficiency of the enzymes are major factors that control the use of enzymes in these processes.
  • Enzymes are also useful in the food and animal feed industry.
  • esterases can be used to degum vegetable oils, improving the production of various food products as well as enhancing the flavor of food products.
  • Esterases can be used in the feed to reduce the amount of phosphate in feed.
  • Carbohydrases can be used to increase the yield of fruit juice and oils; stimulate fermentation in the brewing industry; produce gelling agents; and modify starches, to mention a few.
  • Uses of carbohydrases in the feed industry include, but are not limited to, improving feed conversion, reducing viscosity, and producing oligosaccharides.
  • Filamentous fungi such as Aspergillus sp. and Trichoderma sp. are sources of cellulases and hemicellulases, as well as other enzymes useful in the enzymatic hydrolysis of major polysaccharides.
  • strains of Trichoderma sp. such as T. viride, T. reesei and T. longibrachiatum, and Penicillium sp., and enzymes derived from these strains, have previously been used to hydrolyze crystalline cellulose.
  • the costs associated with producing enzymes from these fungi, as well as the presence of additional, undesirable enzymes, remain a drawback. It is therefore desirable to produce inexpensive enzymes and enzyme mixtures that efficiently degrade cellulose and hemicellulose for use in a variety of agricultural and industrial applications.
  • the present invention comprises an isolated nucleic acid sequence selected from the group consisting of:
  • (a) a nucleic acid sequence encoding a protein comprising an amino acid sequence selected from the group consisting of: Table 1 or Table 2.
  • (b) a nucleic acid sequence encoding an amino acid sequence that is at least about 70% identical to an amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
  • the nucleic acid sequence encodes an amino acid sequence that is at least about 90%, at least about 95%, at least about 97% or at least about 99% identical to the amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
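The identity thresholds above (about 70%, 90%, 95%, 97%, 99%) presuppose some way of scoring two amino acid sequences against each other. This excerpt does not state how identity is computed, so the following Python sketch is only an illustration under stated assumptions: the two sequences are already aligned (gaps written as `-`), and identity is the share of identical residues over all aligned columns. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: identity = identical residues / aligned columns, where
    columns that are gaps in both sequences are ignored. Other conventions
    (e.g. dividing by the shorter sequence length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue sequences differing at one position (not from the patent).
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQR"

identity = percent_identity(reference, variant)
print(identity)                                   # 90.0
print(meets_threshold(reference, variant, 70.0))  # True
print(meets_threshold(reference, variant, 95.0))  # False
```

Because the text says "at least about", a hard cutoff like the one in `meets_threshold` is a simplification; in practice the alignment method and its parameters would also need to be fixed.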
  • the nucleic acid sequence encodes a protein comprising an amino acid sequence selected from the group consisting of: Table 1 or Table 2.
  • the nucleic acid sequence comprises a nucleic acid sequence selected from the group consisting of: Table 1 or Table 2.
  • the present invention comprises nucleic acid sequences that are fully complementary to any of the nucleic acid sequences described above.
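As an illustration of what "fully complementary" means for DNA, the sketch below computes the reverse complement of a sequence and checks whether two strands pair at every position. It assumes standard Watson-Crick pairing over the four DNA bases only; the function names and the example sequence are hypothetical.

```python
# Watson-Crick base pairing for DNA (A-T, C-G), upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_fully_complementary(strand_a: str, strand_b: str) -> bool:
    """True if strand_b pairs with strand_a at every position.

    Both strands are written 5'->3', so strand_b must equal the
    reverse complement of strand_a over its full length.
    """
    return (
        len(strand_a) == len(strand_b)
        and reverse_complement(strand_a).upper() == strand_b.upper()
    )


print(reverse_complement("ATGGCC"))                # GGCCAT
print(is_fully_complementary("ATGGCC", "GGCCAT"))  # True
print(is_fully_complementary("ATGGCC", "GGCCAA"))  # False
```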
  • the present invention comprises an isolated protein comprising an amino acid sequence encoded by any of the nucleic acid molecules described above.
  • the present invention comprises an isolated fusion protein comprising an isolated protein of the present invention fused to a protein comprising an amino acid sequence that is heterologous to the isolated protein.
  • the present invention comprises an isolated antibody or antigen binding fragment thereof that selectively binds to a protein of the present invention.
  • the present invention comprises a kit for degrading a lignocellulosic material to fermentable sugars comprising at least one isolated protein of the present invention.
  • the present invention comprises a detergent comprising at least one isolated protein of the present invention.
  • the present invention comprises a composition for the degradation of a lignocellulosic material comprising at least one isolated protein of the present invention.
  • the present invention comprises a recombinant nucleic acid molecule comprising an isolated nucleic acid molecule of the present invention, operatively linked to at least one expression control sequence.
  • the recombinant nucleic acid molecule comprises an expression vector.
  • the recombinant nucleic acid molecule comprises a targeting vector.
  • the present invention comprises an isolated host cell transfected with a nucleic acid molecule of the present invention.
  • the host cell is a fungus.
  • the host cell is a filamentous fungus.
  • the filamentous fungus is from a genus selected from the group consisting of: Chrysosporium, Thielavia, Talaromyces, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium,
  • the host cell is a bacterium.
  • the present invention comprises an oligonucleotide consisting essentially of at least 12 consecutive nucleotides of a nucleic acid sequence selected from the group consisting of: Table 1 or Table 2 or the complement thereof.
  • the present invention comprises a kit comprising at least one oligonucleotide of the present invention.
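The sequence-matching part of the oligonucleotide definition above (at least 12 consecutive nucleotides of a sequence or of its complement) can be sketched as a substring test. This is a simplified illustration: it ignores the legal meaning of "consisting essentially of", and the target sequence, oligonucleotides, and function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_qualifying_oligo(oligo: str, target: str, min_length: int = 12) -> bool:
    """True if `oligo` is at least `min_length` nt long and occurs as a
    run of consecutive nucleotides in `target` or in its complement strand."""
    oligo, target = oligo.upper(), target.upper()
    if len(oligo) < min_length:
        return False
    return oligo in target or oligo in reverse_complement(target)


# Hypothetical 24-nt target, not taken from Table 1 or Table 2.
target = "ATGGCCAAGCTTGGATCCGAATTC"

print(is_qualifying_oligo("GCCAAGCTTGGA", target))  # True: 12 nt of the target
print(is_qualifying_oligo("GAATTCGGATCC", target))  # True: 12 nt of the complement
print(is_qualifying_oligo("GCCAAGCTTGG", target))   # False: only 11 nt
```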
  • the present invention comprises methods for producing a protein of the present invention, comprising culturing a cell that has been transfected with a nucleic acid molecule comprising a nucleic acid sequence encoding the protein, and expressing the protein with the transfected cell. In some embodiments, the present invention further comprises recovering the protein from the cell or from a culture comprising the cell.
  • the present invention comprises a genetically modified organism comprising components suitable for degrading a lignocellulosic material to fermentable sugars, wherein the organism has been genetically modified to express at least one protein of the present invention.
  • the genetically modified organism is a plant, alga, fungus or bacterium.
  • the fungus is yeast, mushroom or filamentous fungus.
  • the filamentous fungus is from a genus selected from the group consisting of: Chrysosporium, Thielavia, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Talaromyces, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium, Humicola, and Trichoderma.
  • the filamentous fungus is selected from the group consisting of: Trichoderma reesei, Chrysosporium lucknowense, Myceliophthora thermophila, Aspergillus japonicus, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, and Talaromyces flavus.
  • the genetically modified organism has been genetically modified to express at least one additional enzyme.
  • the additional enzyme is an accessory enzyme selected from the group consisting of: cellulase, glucosidase, xylanase, xylosidase, ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, glucomannase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate lyase, chitosanase, exo-β-D-glucosaminidase, cellobiose dehydrogenase, and acetylxylan esterase.
  • the genetically modified organism is a plant.
  • the present invention comprises a recombinant enzyme isolated from a genetically modified microorganism of the present invention.
  • the recombinant enzyme has been subjected to a purification step.
  • the present invention comprises a crude fermentation product produced by culturing the cells from the genetically modified organism of the present invention, wherein the crude fermentation product contains at least one protein of the present invention.
  • the present invention comprises a multi-enzyme composition comprising enzymes produced by a genetically modified organism of the present invention, and recovered therefrom.
  • the present invention comprises a multi-enzyme composition comprising at least one protein of the present invention, and at least one additional protein for degrading a lignocellulosic material or a fragment thereof that has biological activity.
  • the multi-enzyme composition comprises at least one cellobiohydrolase, at least one xylanase, at least one endoglucanase, at least one β-glucosidase, at least one β-xylosidase, and at least one accessory enzyme.
  • between about 50% and about 70% of the enzymes in the multi-enzyme composition are cellobiohydrolases. In some embodiments, between about 10% and about 30% of the enzymes in the composition are xylanases. In some embodiments, between about 5% and about 15% of the enzymes in the composition are endoglucanases. In some embodiments, between about 1% and about 5% of the enzymes in the composition are β-glucosidases. In some embodiments, between about 1% and about 3% of the enzymes in the composition are β-xylosidases.
  • the multi-enzyme composition comprises about 60% cellobiohydrolases, about 20% xylanases, about 10% endoglucanases, about 3% β-glucosidases, about 2% β-xylosidases, and about 5% accessory enzymes.
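The example composition above can be checked against the stated ranges with a few lines of Python. This is a sketch only: it treats the "about" bounds as exact inclusive limits, and the key names are illustrative. No range is given in this excerpt for accessory enzymes, so that component is only included in the total.

```python
# Ranges (percent of enzymes in the composition) as stated in the text.
# Assumption: "between about X% and about Y%" is treated as inclusive [X, Y].
RANGES = {
    "cellobiohydrolases": (50, 70),
    "xylanases": (10, 30),
    "endoglucanases": (5, 15),
    "beta_glucosidases": (1, 5),
    "beta_xylosidases": (1, 3),
}

# Example composition given in the text.
example = {
    "cellobiohydrolases": 60,
    "xylanases": 20,
    "endoglucanases": 10,
    "beta_glucosidases": 3,
    "beta_xylosidases": 2,
    "accessory_enzymes": 5,  # no range stated for this component
}


def out_of_range(composition: dict) -> list:
    """Names of components whose percentage falls outside the stated range."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if not low <= composition.get(name, 0) <= high
    ]


print(sum(example.values()))  # 100
print(out_of_range(example))  # []
```

The example sums to 100% and every ranged component sits inside its range, which is consistent with it being one embodiment of the broader ranges.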
  • the xylanases are selected from the group consisting of: endoxylanases, exoxylanases, and β-xylosidases.
  • the accessory enzymes include an enzyme selected from the group consisting of: cellulase, glucosidase, xylanase, xylosidase, ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, glucomannase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate lyase, chitosanase, exo-β-D-glucosaminidase, cellobiose dehydrogenase, and acetylxylan esterase.
  • the multi-enzyme composition comprises at least one hemicellulase.
  • the hemicellulase is selected from the group consisting of a xylanase, an arabinofuranosidase, an acetyl xylan esterase, a glucuronidase, an endo-galactanase, a mannanase, an endo arabinase, an exo arabinase, an exo-galactanase, a ferulic acid esterase, a galactomannanase, a xyloglucanase, and mixtures thereof.
  • the xylanase is selected from the group consisting of endoxylanases, exoxylanase, and β-xylosidase.
  • the multi-enzyme composition comprises at least one cellulase.
  • the composition is a crude fermentation product. In some embodiments, the composition is a crude fermentation product that has been subjected to a purification step.
  • the multi-enzyme composition further comprises one or more accessory enzymes.
  • the accessory enzymes include at least one enzyme selected from the group consisting of: cellulase, glucosidase, xylanase, xylosidase, ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, glucomannase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate lyase, chitosanase, exo-β-D-glucosaminidase, cellobiose dehydrogenase, and acetylxylan esterase.
  • the accessory enzyme is selected from the group consisting of a glucoamylase, a pectinase, and a ligninase. In some embodiments, the accessory enzyme is added as a crude or a semi-purified enzyme mixture. In some embodiments, the accessory enzyme is produced by culturing at least one organism on a substrate to produce the enzyme.
  • the multi-enzyme composition comprises at least one protein of the present invention, and at least one additional protein or a fragment thereof that has biological activity for degrading an arabinoxylan-containing material.
  • the composition comprises at least one endoxylanase, at least one β-xylosidase, and at least one arabinofuranosidase.
  • the arabinofuranosidase comprises an arabinofuranosidase with specificity towards single substituted xylose residues, an arabinofuranosidase with specificity towards double substituted xylose residues, or a combination thereof.
  • the present invention comprises methods for degrading a lignocellulosic material to fermentable sugars, comprising contacting the lignocellulosic material with at least one isolated protein of the present invention.
  • the methods of the present invention further comprise contacting the lignocellulosic material with at least one additional isolated protein comprising an amino acid sequence that is at least about 95% identical to an amino acid sequence selected from the group consisting of: Table 1 or Table 2, wherein at least one additional protein has cellulolytic enhancing activity.
  • the additional isolated protein is part of a multi-enzyme composition.
  • the present invention comprises methods for degrading a lignocellulosic material to fermentable sugars, comprising contacting the lignocellulosic material with at least one multi-enzyme composition of the present invention.
  • the present invention comprises a method for producing an organic substance, comprising:
  • the steps of saccharifying and fermenting are performed simultaneously.
  • the organic substance is an alcohol, organic acid, ketone, amino acid, or gas.
  • the alcohol is ethanol.
  • the lignocellulosic material is selected from the group consisting of herbaceous material, agricultural residue, forestry residue, municipal solid waste, waste paper, and pulp and paper mill residue.
  • the lignocellulosic material is distiller's dried grains (DDG) or DDG with solubles.
  • DDG: distiller's dried grains
  • the DDG or DDG with solubles is derived from corn.
  • the present invention comprises a method for degrading a lignocellulosic material consisting of DDG or DDG with solubles to sugars, the method comprising contacting the DDG or DDG with solubles with a multi-enzyme composition of the present invention, whereby at least about 10% of the fermentable sugars are liberated. In some embodiments, at least about 15%, at least 20%, or at least about 23% of the sugars are liberated.
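The liberation thresholds above (10%, 15%, 20%, 23%) can be read as a simple ratio of sugars released to sugars available. The excerpt does not define the basis of that percentage, so the sketch below assumes it is the mass of sugars released divided by the total mass of fermentable sugars in the starting DDG. The function names and the input masses are hypothetical and are not measured data.

```python
# Thresholds (percent of sugars liberated) named in the text.
THRESHOLDS = (10, 15, 20, 23)


def percent_liberated(released_g: float, total_available_g: float) -> float:
    """Percent of the available fermentable sugars that were released.

    Assumption: both quantities are masses on the same basis (e.g. grams
    of sugar per 100 g of DDG); the source does not define the basis.
    """
    if total_available_g <= 0:
        raise ValueError("total available sugars must be positive")
    return 100.0 * released_g / total_available_g


def thresholds_met(percent: float) -> list:
    """Thresholds from the text that the given percentage reaches."""
    return [t for t in THRESHOLDS if percent >= t]


# Hypothetical inputs for illustration only.
pct = percent_liberated(released_g=12.0, total_available_g=54.0)
print(round(pct, 1))        # 22.2
print(thresholds_met(pct))  # [10, 15, 20]
```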
  • the present invention further comprises a pretreatment process for pretreating the lignocellulosic material.
  • the pretreatment process is selected from the group consisting of physical treatment, metal ion, ultraviolet light, ozone, organosolv treatment, steam explosion treatment, lime impregnation with steam explosion treatment, hydrogen peroxide treatment, hydrogen peroxide/ozone (peroxone) treatment, acid treatment, dilute acid treatment, and base treatment.
  • the pretreatment process is selected from the group consisting of organosolv, steam explosion, heat treatment and AFEX.
  • the heat treatment comprises heating the lignocellulosic material to 121°C for 15 minutes.
  • the present invention further comprises detoxifying the lignocellulosic material.
  • the present invention further comprises recovering the fermentable sugar.
  • the sugar is selected from the group consisting of glucose, xylose, arabinose, galactose, mannose, rhamnose, sucrose and fructose.
  • the present invention further comprises recovering the contacted lignocellulosic material after the fermentable sugars are degraded.
  • the present invention comprises a feed additive comprising the recovered lignocellulosic material of the present invention.
  • the protein content of the recovered lignocellulosic material is higher than that of the starting lignocellulosic material.
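One way to see why the recovered material can be richer in protein is a simple mass balance: if sugars are removed and the protein stays in the solids, the same protein mass makes up a larger share of a smaller total. The sketch below assumes that all protein remains in the recovered solids and that only non-protein mass is removed; the 30% starting protein content is the typical DDG figure quoted earlier, and the 10 g removed is a hypothetical value.

```python
def protein_percent_after_removal(
    start_mass_g: float,
    start_protein_percent: float,
    mass_removed_g: float,
) -> float:
    """Protein content (percent by weight) of the solids after removing
    `mass_removed_g` of non-protein material.

    Assumption: all protein stays in the recovered solids.
    """
    protein_g = start_mass_g * start_protein_percent / 100.0
    remaining_g = start_mass_g - mass_removed_g
    if remaining_g < protein_g or remaining_g <= 0:
        raise ValueError("cannot remove more than the non-protein mass")
    return 100.0 * protein_g / remaining_g


# 100 g of DDG at 30% protein; 10 g of sugars removed (hypothetical amount).
after = protein_percent_after_removal(100.0, 30.0, 10.0)
print(round(after, 1))  # 33.3
```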
  • the present invention comprises methods of improving the performance of an animal which comprises administering to the animal the feed additive of the present invention.
  • the present invention comprises methods for improving the nutritional quality of an animal feed comprising adding the feed additive of the present invention to an animal feed.
  • the present invention comprises methods for stonewashing a fabric, comprising contacting the fabric with at least one isolated protein of the present invention.
  • the present invention comprises methods for stonewashing a fabric, comprising contacting the fabric with at least one multi-enzyme composition of the present invention.
  • the fabric is denim.
  • the present invention comprises methods for enhancing the softness or feel of a fabric or depilling a fabric, comprising contacting the fabric with at least one isolated protein of the present invention, or a fragment thereof comprising a carbohydrate binding module (CBM) of the protein.
  • the present invention comprises methods for enhancing the softness or feel of a fabric or depilling a fabric, comprising contacting the fabric with at least one multi-enzyme composition of the present invention.
  • the present invention comprises methods for restoring color to or brightening a fabric, comprising contacting the fabric with at least one isolated protein of the present invention.
  • the present invention comprises methods for restoring color to or brightening a fabric, comprising contacting the fabric with at least one multi-enzyme composition of the present invention.
  • the present invention comprises methods of biopolishing, defibrillating, bleaching, dyeing or desizing a fabric, comprising contacting the fabric with at least one isolated protein of the present invention.
  • the present invention comprises methods of biopolishing, defibrillating, bleaching, dyeing or desizing a fabric, comprising contacting the fabric with at least one multi-enzyme composition of the present invention.
  • the present invention comprises methods of biorefining, deinking or biobleaching paper or pulp, comprising contacting the paper or pulp with at least one isolated protein of the present invention.
  • the present invention comprises methods of biorefining, deinking or biobleaching paper or pulp, comprising contacting the paper or pulp with at least one multi-enzyme composition of the present invention.
  • the present invention comprises methods for enhancing the cleaning ability of a detergent composition, comprising adding at least one isolated protein of the present invention to the detergent composition.
  • the present invention comprises methods for enhancing the cleaning ability of a detergent composition, comprising adding at least one multi-enzyme composition of the present invention to the detergent composition.
  • the present invention comprises a detergent composition, comprising at least one isolated protein of the present invention and at least one surfactant.
  • the present invention comprises a detergent composition, comprising at least one multi-enzyme composition of the present invention and at least one surfactant.
  • the present invention comprises methods for releasing cellular contents comprising contacting a cell with at least one protein of the present invention.
  • the cell may be a bacterium, an algal cell, a fungal cell or a plant cell. In preferred embodiments, the cell is an algal cell.
  • contacting the cell with at least one protein of the present invention degrades the cell wall.
  • the cellular contents are selected from the group consisting of: alcohols and oils.
  • the present invention comprises compositions for degrading cell walls comprising at least one protein of the present invention.
  • the present invention comprises methods for improving the nutritional quality of food comprising adding to the food at least one protein of the present invention.
  • the present invention comprises methods for improving the nutritional quality of food comprising pretreating the food with at least one protein of the present invention.
  • the present invention comprises methods for improving the nutritional quality of animal feed comprising adding to the animal feed at least one protein of the present invention.
  • the present invention comprises methods for improving the nutritional quality of animal feed comprising pretreating the feed with at least one isolated protein of the present invention.
  • the present invention comprises a genetically modified organism comprising at least one nucleic acid molecule encoding a protein of the present invention, in which the activity of one or more of the proteins is upregulated, the activity of one or more of the proteins downregulated, or the activity of one or more of the proteins is upregulated and the activity of one or more of the proteins is downregulated.
  • Figure 2 Chromatogram of cellulose and amylose before (blank) and after
  • Figure 3 Chromatogram of linear arabinan, branched arabinan and oat spelt xylan before (blank) and after digestion with the β-glucanase Lam1 (4 h). The experiments were performed at pH 5.0, 50°C for 4 hours.
  • Figure 4 Chromatogram of potato galactan and larch galactan before (blank) and after digestion with the β-glucanase Lam1 (4 h). The experiments were performed at pH 5.0, 50°C for 4 hours.
  • Figure 5 MS diagram of methyl-4-O-methyl-glucuronic acid treated by Gue1 at 0 hours and 4 hours of incubation. The experiments were performed at pH 5.0, 50°C for 4 hours.
  • Figure 6 MS diagram of methyl-4-O-methyl-glucuronic acid treated by Gue2 at 0 hours and 4 hours of incubation. The experiments were performed at pH 5.0, 50°C for 4 hours.
  • Figure 7 Activity of the type B feruloyl esterase FaeB3 towards wheat bran oligomers (Upper), and sugar beet pulp oligomers (Lower). The release of ferulic acid is indicated as the absorbance at 310 nm.
  • Figure 8A HPAEC diagram of the incubation of Agu2 alone with aldouronic acids. The incubations were performed at 50°C and pH 5 for 16 hours.
  • Figure 8B HPAEC diagram of the incubation of Agu2 in combination with a GH10 xylanase from Myceliophthora thermophila C1 on aldouronic acids. The incubations were performed at 50°C and pH 5 for 16 hours.
  • Figure 9A HPAEC diagram of the incubation of Gxh1 with xylopentaose (solid line) and reduced xylopentaose (dashed line). Incubation was performed at 50°C, pH 5 for 16 hours.
  • Figure 9B HPAEC diagram of the incubation of Gxh2 with xylopentaose (solid line) and reduced xylopentaose (dashed line). Incubation was performed at 50°C, pH 5 for 16 hours.
  • Gxh1, Gxh2, a GH10 xylanase (X10) and a GH11 xylanase (X11) from
  • Figure 13 pH profile of Aga1. Incubations were performed on pNP-α-D-galactoside at 40°C for 10 minutes.
  • Figure 14 pH profile of Aga2. Incubations were performed on pNP-α-D-galactoside at 40°C for 10 minutes.
  • the present invention relates generally to proteins that play a role in the
  • the present invention relates to enzymes isolated from a filamentous fungal strain denoted herein as C1 (Accession No. VKM F-3500-D), nucleic acids encoding the enzymes, and methods of producing and using the enzymes.
  • the invention also provides compositions that include at least one of the enzymes described herein for uses including, but not limited to, the hydrolysis of lignocellulose.
  • the invention stems, in part, from the discovery of a variety of novel cellulases and hemicellulases produced by the C1 fungus that exhibit high activity toward cellulose and other components of biomass.
  • the present invention also provides methods and compositions for the conversion
  • compositions of the invention include enzyme combinations that break down lignocellulose.
  • biomass or lignocellulosic material includes materials
  • these materials also contain pectin, lignin, protein, carbohydrates (such as starch and sugar) and ash.
  • Lignocellulose is generally found, for example, in the stems, leaves, hulls, husks, and cobs of plants or leaves, branches, and wood of trees.
  • cellulose or hemicellulose into fermentable sugars is also referred to herein as
  • Fermentable sugars refers to simple sugars, such as glucose, xylose, arabinose, galactose, mannose, rhamnose, sucrose and fructose.
  • Biomass can include virgin biomass and/or non-virgin biomass such as agricultural biomass, commercial organics, construction and demolition debris, municipal solid waste, waste paper and yard waste.
  • Common forms of biomass include trees, shrubs and grasses, wheat, wheat straw, sugar cane bagasse, sugar beet, soybean, corn, corn husks, corn kernel including fiber from kernels, products and byproducts from milling of grains such as corn, tobacco, wheat and barley (including wet milling and dry milling) as well as municipal solid waste, waste paper and yard waste.
  • the biomass can also be, but is not limited to, herbaceous material, agricultural residues, forestry residues, municipal solid wastes, waste paper, and pulp and paper mill residues.
  • Agricultural biomass includes branches, bushes, canes, corn and corn husks, energy crops, algae, fruits, flowers, grains, grasses, herbaceous crops, leaves, bark, needles, logs, roots, saplings, short rotation woody crops, shrubs, switch grasses, trees, vegetables, fruit peels, vines, sugar beet pulp, wheat middlings, oat hulls, peat moss, mushroom compost and hard and soft woods (not including woods with deleterious materials).
  • agricultural biomass includes organic waste materials generated from agricultural processes including farming and forestry activities, specifically including forestry wood waste. Agricultural biomass may be any of the aforestated singularly or in any combination or mixture thereof.
  • Energy crops are fast-growing crops that are grown for the specific purpose of producing energy, including without limitation, biofuels, from all or part of the plant.
  • Energy crops can include crops that are grown (or are designed to grow) for their increased cellulose, xylose and sugar contents. Examples of such plants include, without limitation, switchgrass, willow and poplar.
  • Energy crops may also include algae, for example, designer algae that are genetically engineered for enhanced production of hydrogen, alcohols, and oils, which can be further processed into diesel and jet fuels, as well as other bio-based products.
  • biomass high in starch, sugar, or protein, such as corn, grains, fruits and vegetables, is usually consumed as food.
  • biomass high in cellulose, hemicellulose and lignin is not readily digestible and is primarily utilized for wood and paper products, animal feed, or fuel, or is typically disposed of.
  • the substrate is of high lignocellulose content, including distillers' dried grains, corn stover, corn cobs, rice straw, wheat straw, hay, sugarcane bagasse, sugar cane pulp, citrus peels and other agricultural biomass, switchgrass, forestry wastes, poplar wood chips, pine wood chips, sawdust, yard waste, and the like, including any combination thereof.
  • the lignocellulosic material is distillers' dried grains (DDG).
  • DDG also known as dried distiller's grain, or distiller's spent grain
  • the lignocellulosic material can also be distiller's dried grain with soluble material recycled back (DDGS). While reference will be made herein to DDG for convenience and simplicity, it should be understood that both DDG and DDGS are contemplated as desired lignocellulosic materials. These are largely considered to be waste products and can be obtained after the fermentation of the starch derived from any of a number of grains, including corn, wheat, barley, oats, rice and rye. In one embodiment the DDG is derived from corn.
  • distiller's grains do not necessarily have to be dried.
  • the grains normally are currently dried, water and enzymes are added to the DDG substrate in the present invention. If the saccharification were done on site, the drying step could be eliminated and enzymes could be added to the distiller's grains without drying.
  • the present invention includes enzymes or compositions thereof with, for example, cellobiohydrolase, endoglucanase, xylanase, β-glucosidase, and hemicellulase activities.
  • Fermentable sugars can be converted to useful value-added fermentation products, non-limiting examples of which include amino acids, vitamins, pharmaceuticals, animal feed supplements, specialty chemicals, chemical feedstocks, plastics, solvents, fuels, or other organic polymers, lactic acid, and ethanol, including fuel ethanol.
  • Specific value-added products that may be produced by the methods of the invention include, but are not limited to, biofuels (including ethanol and butanol); lactic acid; plastics; specialty chemicals; organic acids, including citric acid, succinic acid and maleic acid; solvents; animal feed supplements; pharmaceuticals; vitamins; amino acids, such as lysine, methionine, tryptophan, threonine, and aspartic acid; industrial enzymes, such as proteases, cellulases, amylases, glucanases, xylanases, arabinanases, lactases, lipases, esterases, lyases, oxidoreductases, transferases; and chemical feedstocks.
  • the enzymes of the present invention may also be used for stone washing cellulosic fabrics such as cotton (e.g., denim), linen, hemp, ramie, cupro, lyocell, newcell, rayon and the like. See, for example, U.S. Patent No. 6,015,707.
  • the enzymes and compositions of the present invention are suitable for industrial textile applications in addition to the stone washing process.
  • cellulases are used in detergent compositions, either for the purpose of enhancing the cleaning ability of the composition or as a softening agent. When so used, the cellulase will degrade a portion of the cellulosic material, e.g., cotton fabric, in the wash, which facilitates the cleaning and/or softening of the fabric.
  • endoglucanase components of fungal cellulases have also been used for the purposes of enhancing the cleaning ability of detergent compositions, for use as a softening agent, and for use in improving the feel of cotton fabrics, and the like.
  • Enzymes and compositions of the present invention may also be used in the treatment of paper pulp (e.g., for improving the drainage or for de-inking of recycled paper) or for the treatment of wastewater streams (e.g., to hydrolyze waste material containing cellulose, hemicellulose and pectins to soluble lower molecular weight polymers).
  • the enzymes of the present invention may also be used to release the contents of a cell.
  • contacting or mixing the cells with the enzymes of the present invention will degrade the cell walls, resulting in cell lysis and release of the cellular contents.
  • Such cells can include bacteria, plant cells, fungi including yeasts, and algae.
  • the enzymes of the present invention may be used to degrade the cell walls of algal cells in order to release the materials contained within the algal cells.
  • such materials may include, without limitation, alcohols and oils. The alcohols and oils so released can be further processed to produce diesel, jet fuels, as well as other economically important bio-products.
  • the enzymes of the present invention may be used alone, or in combination with other enzymes, chemicals or biological materials.
  • the enzymes of the present invention may be used for in vitro applications in which the enzymes or mixtures thereof are added to or mixed with the appropriate substrates to catalyze the desired reactions. Additionally, the enzymes of the present invention may be used for in vivo applications in which nucleic acid molecules encoding the enzymes are introduced into cells and are expressed therein to produce the enzymes and catalyze the desired reactions within the cells.
  • enzymes capable of promoting cell wall degradation may be added to algal cells suspended in solutions to degrade the algal cell walls and release their content, whereas in some embodiments, nucleic acid molecules encoding such enzymes may be introduced into the algal cells to express the enzymes therein, so that these enzymes can degrade the algal cell walls from within.
  • Some embodiments may combine the in vitro applications with the in vivo applications.
  • nucleic acids encoding enzymes capable of catalyzing cell wall degradation may be introduced into algal cells to express the enzymes in those cells and to degrade their cell walls, while enzymes may also be added to or mixed with the cells to further promote the cell wall degradation.
  • the enzymes used for in vitro applications may be different from the enzymes used for in vivo applications.
  • an enzyme with the laminarinase activity may be mixed with the cells, while an enzyme with the xyloglucanase activity is expressed within the cells.
  • the present invention includes proteins isolated from, or derived from the knowledge of enzymes from a fungus such as Myceliophthora thermophila or a mutant or other derivative thereof, and more particularly, from the fungal strain denoted herein as C1 (Accession No. VKM F-3500-D).
  • Myceliophthora thermophila has previously appeared in patent applications and in the literature as Chrysosporium lucknowense or Sporotrichum thermophile.
  • the proteins of the invention possess enzymatic activity.
  • U.S. Patent No. 6,015,707 or U.S. Patent No. 6,573,086 a strain called C1 (Accession No. VKM F-3500-D) was isolated from samples of forest alkaline soil from Sola Lake, Far East of the Russian Federation. This strain was deposited at the All-Russian Collection of Microorganisms of Russian Academy of Sciences (VKM), Bakhrushina St. 8, Moscow, Russia, 113184, under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure on August 29, 1996, as Chrysosporium lucknowense Garg 27K, VKM-F 3500 D. Various mutant strains of Myceliophthora thermophila
  • This strain was subsequently further mutated with N-methyl-N'-nitro-N-nitrosoguanidine to generate strain NG7C-19 (Accession No. VKM F-3633 D).
  • This latter strain in turn was subjected to mutation by ultraviolet light, resulting in strain UV18-25 (Accession No. VKM F-3631 D).
  • This strain in turn was again subjected to mutation by ultraviolet light, resulting in strain W1L (Accession No. CBS 122189), which was subsequently subjected to mutation by ultraviolet light, resulting in strain W1L#100L (Accession No. CBS 122190).
  • Strain C1 was previously classified as a Chrysosporium lucknowense based on morphological and growth characteristics of the microorganism, as discussed in detail in U.S. Patent No. 6,015,707, U.S. Patent No. 6,573,086 and patent PCT/NL2010/000045.
  • a protein of the invention comprises, consists essentially of, or consists of an amino acid sequence selected from Tables 1 and 2.
  • the present invention also includes homologues or variants of any of the above sequences, including fragments and sequences having a given identity to any of the above sequences, wherein the homologue, variant, or fragment has at least one biological activity of the wild-type protein, as described herein.
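  • As a hedged illustration of what "a given identity" between a homologue and a disclosed sequence can mean in practice, the sketch below computes percent identity over two sequences that have already been aligned. The function names, the gap symbol, the choice of denominator (all aligned columns) and the toy sequences are assumptions for illustration only; other conventions exist, and the definition used in this disclosure is the one set out elsewhere herein.

```python
# Hypothetical sketch: assumes the two sequences were aligned beforehand by some
# alignment tool and that '-' marks a gap. The toy sequences are not from Tables 1 or 2.
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a, aligned_b, threshold_pct):
    """True if the pair reaches the (caller-chosen) identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct


# Toy example: 4 of 6 aligned columns match.
identity = percent_identity("MKT-AL", "MKSQAL")
```

    Sequence identity alone does not establish that a homologue retains a biological activity of the wild-type protein; that requirement is stated separately above.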
  • the proteins disclosed herein possess carbohydrase enzymatic activity, or the ability to degrade carbohydrate-containing materials.
  • a review of enzymes involved in the degradation of polysaccharides can be found in de Vries et al., Microbiol. Mol. Biol. Rev. 65:497-522 (2001).
  • the proteins may possess cellulase activity such as endoglucanase activity (e.g., 1,4-β-D-glucan-4-glucanohydrolases), exoglucanase activity (e.g., 1,4-β-D-glucan cellobiohydrolases), and β-glucosidase activity.
  • the proteins may possess hemicellulase activity such as endoxylanase activity, exoxylanase activity, or β-xylosidase activity.
  • the proteins may possess laminarinase, xyloglucanase, galactanase, glucoamylase, pectate lyase, chitosanase, exo-β-D-glucosaminidase, cellobiose dehydrogenase, acetylxylan esterase, ligninase, amylase, glucuronidase, ferulic acid esterase, arabinofuranosidase, pectin methyl esterase, arabinase, lipase, glucosidase, β-hexosaminidase, rhamnogalacturonan acetylesterase, exo-rhamnogalacturonase, rham
  • carbohydrase refers to any protein that catalyzes the hydrolysis of carbohydrates.
  • glycoside hydrolase refers to a protein that catalyzes the hydrolysis of the glycosidic bonds between carbohydrates or between a carbohydrate and a non-carbohydrate residue.
  • Endoglucanases, cellobiohydrolases, β-glucosidases, α-glucosidases, xylanases, β-xylosidases, alpha-xylosidases, galactanases, α-galactosidases, β-galactosidases, α-amylases, glucoamylases, endo-arabinases, arabinofuranosidases, mannanases, β-mannosidases, pectinases, acetyl xylan esterases, acetyl mannan esterases, ferulic acid esterases, coumaric acid esterases, pectin methyl esterases, and chitosanases are examples of glycosidases.
  • Cellulase refers to a protein that catalyzes the hydrolysis of 1,4-β-D-glycosidic linkages in cellulose (such as bacterial cellulose, cotton, filter paper, phosphoric acid swollen cellulose, Avicel®); cellulose derivatives (such as carboxymethylcellulose and hydroxyethylcellulose); plant lignocellulosic materials, beta-D-glucans or xyloglucans.
  • Cellulose is a linear beta-(1-4) glucan consisting of anhydrocellobiose units. Endoglucanases, cellobiohydrolases, and β-glucosidases are examples of cellulases.
  • Endoglucanase refers to a protein that catalyzes the hydrolysis of cellulose to oligosaccharide chains at random locations by means of an endoglucanase activity.
  • Cellobiohydrolase refers to a protein that catalyzes the hydrolysis of cellulose to cellobiose via an exoglucanase activity, sequentially releasing molecules of cellobiose from the reducing or non-reducing ends of cellulose or cello-oligosaccharides.
  • β-glucosidase refers to an enzyme that catalyzes the conversion of cellobiose and oligosaccharides to glucose.
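  • The division of labor among the three cellulase activities defined above can be pictured with the toy model sketched below. It is a deliberately simplified, hypothetical illustration: a cellulose chain is reduced to a count of glucose units, the cut sites are supplied by the caller rather than chosen at random, and the function names are assumptions that do not correspond to any protein or method of the disclosure.

```python
# Hypothetical toy model of the three activities; not a kinetic or structural model.
def endoglucanase(chain_len, cut_sites):
    """Cleave one chain of `chain_len` glucose units at internal bonds.

    A cut site s (1 <= s < chain_len) is the bond between unit s and unit s + 1.
    Returns the lengths of the resulting shorter chains.
    """
    sites = sorted(set(cut_sites))
    if any(s <= 0 or s >= chain_len for s in sites):
        raise ValueError("cut sites must be internal bonds")
    bounds = [0] + sites + [chain_len]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def cellobiohydrolase(chains):
    """Release cellobiose (two glucose units) stepwise from the chain ends.

    Returns (number of cellobiose molecules, number of single units left over
    from chains of odd length).
    """
    cellobiose = sum(n // 2 for n in chains)
    leftover_units = sum(n % 2 for n in chains)
    return cellobiose, leftover_units


def beta_glucosidase(cellobiose):
    """Split each cellobiose molecule into two glucose molecules."""
    return 2 * cellobiose


# Example: a 10-unit chain cut after units 3 and 7, then fully processed.
fragments = endoglucanase(10, [3, 7])
cellobiose, leftover = cellobiohydrolase(fragments)
glucose_total = beta_glucosidase(cellobiose) + leftover
```

    In the sketch every glucose unit of the starting chain is eventually accounted for as free glucose, which mirrors the idea that the three activities act together to convert cellulose to fermentable sugar.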
  • Hemicellulase refers to a protein that catalyzes the hydrolysis of hemicellulose, such as that found in lignocellulosic materials. Hemicelluloses are complex polymers, and their composition often varies widely from organism to organism, and from one tissue type to another. Hemicelluloses include a variety of compounds, such as xylans, arabinoxylans, xyloglucans, mannans, glucomannans, and galacto(gluco)mannans. Hemicellulose can also contain glucan, which is a general term for beta-linked glucose residues. In general, a main component of hemicellulose is beta-1,4-linked xylose, a five carbon sugar.
  • this xylose is often branched as beta-1,3 linkages or beta-1,2 linkages, and can be substituted with linkages to arabinose, galactose, mannose, glucuronic acid, or by esterification to acetic acid.
  • the composition, nature of substitution, and degree of branching of hemicellulose is very different in dicotyledonous plants (dicots, i.e., plants whose seeds have two cotyledons or seed leaves such as lima beans, peanuts, almonds, peas, kidney beans) as compared to monocotyledonous plants (monocots;
  • hemicellulose is comprised mainly of xyloglucans that are 1,4-beta-linked glucose chains with 1,6-alpha-linked xylosyl side chains.
  • monocots, including most grain crops, the principal components of hemicellulose are heteroxylans.
  • branched beta glucans comprised of 1,3- and
  • Hemicellulolytic enzymes include both endo-acting and exo-acting enzymes, such as xylanases, β-xylosidases, alpha-xylosidases, galactanases, α-galactosidases, β-galactosidases, endo-arabinases, arabinofuranosidases, mannanases, and β-mannosidases.
  • Hemicellulases also include the accessory enzymes, such as acetylesterases, ferulic acid esterases, and coumaric acid esterases.
  • xylanases and acetyl xylan esterases cleave the xylan and acetyl side chains of xylan and the remaining xylo-oligomers are unsubstituted and can thus be hydrolysed with β-xylosidase only.
  • several less known side activities have been found in enzyme preparations which hydrolyze hemicellulose. Accordingly, xylanases, acetylesterases and β-xylosidases are examples of hemicellulases.
  • Xylanase specifically refers to an enzyme that hydrolyzes the β-1,4 bond in the xylan backbone, producing short xylooligosaccharides.
  • β-Mannanase or "endo-1,4-β-mannosidase" refers to a protein that hydrolyzes mannan-based hemicelluloses (mannan, glucomannan, galacto(gluco)mannan) and produces short β-1,4-mannooligosaccharides.
  • Mannan endo-1,6-α-mannosidase refers to a protein that hydrolyzes 1,6-α-mannosidic linkages in unbranched 1,6-mannans.
  • β-Mannosidase (β-1,4-mannoside mannohydrolase; EC 3.2.1.25)
  • Galactanase refers to a protein that catalyzes the hydrolysis of endo-1,4-β-D-galactosidic linkages in arabinogalactans.
  • Glucoamylase refers to a protein that catalyzes the hydrolysis of terminal 1,4-linked α-D-glucose residues successively from non-reducing ends of the glycosyl chains in starch with the release of β-D-glucose.
  • β-hexosaminidase or "β-N-acetylglucosaminidase" refers to a protein that catalyzes the hydrolysis of terminal N-acetyl-D-hexosamine residues in N-acetyl-β-D-hexosamines.
  • α-L-arabinofuranosidase refers to a protein that hydrolyzes arabinofuranosyl-containing hemicelluloses or pectins. Some of these enzymes remove arabinofuranoside residues from O-2 or O-3 single substituted xylose residues, as well as from O-2 and/or O-3 double substituted xylose residues. Some of these enzymes remove arabinose residues from arabinan oligomers.
  • Endo-arabinase refers to a protein that catalyzes the hydrolysis of 1,5-α-arabinofuranosidic linkages in 1,5-arabinans.
  • Exo-arabinase refers to a protein that catalyzes the hydrolysis of 1,5-α-linkages in 1,5-arabinans or 1,5-α-L-arabino-oligosaccharides, releasing mainly arabinobiose, although a small amount of arabinotriose can also be liberated.
  • β-xylosidase refers to a protein that hydrolyzes short 1,4-β-D-xylooligomers into xylose.
  • Cellobiose dehydrogenase refers to a protein that oxidizes cellobiose to cellobionolactone.
  • Chitosanase refers to a protein that catalyzes the endohydrolysis of β-1,4-linkages between D-glucosamine residues in acetylated chitosan (i.e., deacetylated chitin).
  • Exo-polygalacturonase refers to a protein that catalyzes the hydrolysis of terminal alpha-1,4-linked galacturonic acid residues from non-reducing ends thus converting polygalacturonides to galacturonic acid.
  • Acetyl xylan esterase refers to a protein that catalyzes the removal of the acetyl groups from xylose residues.
  • Acetyl mannan esterase refers to a protein that catalyzes the removal of the acetyl groups from mannose residues.
  • Ferulic esterase or "ferulic acid esterase" refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and ferulic acid.
  • Coumaric acid esterase refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and coumaric acid.
  • Acetyl xylan esterases, ferulic acid esterases and pectin methyl esterases are examples of carbohydrate esterases.
  • Pectate lyase and pectin lyases refer to proteins that catalyze the cleavage of 1,4-α-D-galacturonan by beta-elimination acting on polymeric and/or oligosaccharide substrates (pectates and pectins, respectively).
  • Endo-1,3-β-glucanase or “laminarinase” refers to a protein that catalyzes the cleavage of 1,3-linkages in β-D-glucans such as laminarin or lichenin.
  • Laminarin is a linear polysaccharide made up of β-1,3-glucan with β-1,6-linkages.
  • Lichenase refers to a protein that catalyzes the hydrolysis of lichenan, a linear 1,3-1,4-β-D-glucan.
  • Rhamnogalacturonan is composed of alternating α-1,4-rhamnose and α-1,2-linked galacturonic acid, with side chains linked 1,4 to rhamnose.
  • the side chains include Type I galactan, which is β-1,4-linked galactose with α-1,3-linked arabinose substituents; Type II galactan, which is β-1,3-1,6-linked galactoses (very branched) with arabinose substituents; and arabinan, which is α-1,5-linked arabinose with α-1,3-linked arabinose branches.
  • the galacturonic acid substituents may be acetylated and/or methylated.
  • Exo-rhamnogalacturonanase refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin from the nonreducing end.
  • Rhamnogalacturonan acetylesterase refers to a protein that catalyzes the removal of the acetyl groups ester-linked to the highly branched rhamnogalacturonan (hairy) regions of pectin.
  • Rhamnogalacturonan lyase refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin via a β-elimination mechanism (see, e.g., Pages et al., J. Bacteriol. 185:4727-4733 (2003)).
  • Alpha-rhamnosidase refers to a protein that catalyzes the hydrolysis of terminal non-reducing α-L-rhamnose residues in α-L-rhamnosides.
  • Glycosidases (glycoside hydrolases; GH)
  • Glycosidases such as the proteins of the present invention may be assigned to families on the basis of sequence similarities, and there are now over 100 different such families defined (see the CAZy (Carbohydrate Active EnZymes database) website, maintained by the Architecture et Fonction des Macromolécules Biologiques of the Centre National de la Recherche Scientifique, which describes the families of structurally-related catalytic and carbohydrate-binding modules (or functional domains) of enzymes that degrade, modify, or create glycosidic bonds; Coutinho, P.M. & Henrissat, B. (1999) Carbohydrate-active enzymes: an integrated database approach. In "Recent Advances in Carbohydrate Bioengineering", H.J. Gilbert, G. Davies, B.
  • sequence homology may be used to identify particular domains within proteins, such as carbohydrate binding modules (CBMs; also known as carbohydrate binding domains (CBDs), sometimes called cellulose binding domains).
  • the CAZy homologies of proteins of the present invention are disclosed below.
  • An enzyme assigned to a particular CAZy family may exhibit one or more of the enzymatic activities or substrate specificities associated with the CAZy family. In other embodiments, the enzymes of the present invention may exhibit one or more of the enzyme activities discussed above.
  • Certain proteins of the present invention may be classified as "Family 61 glycosidases" based on homology of the polypeptides to CAZy Family GH61.
  • Family 61 glycosidases may exhibit cellulolytic enhancing activity or endoglucanase activity. Additional information on the properties of Family 61 glycosidases may be found in U.S. Patent Application Publication Nos. 2005/0191736, 2006/0005279, 2007/0077630, and in PCT Publication No. WO 2004/031378.
  • cellulolytic enhancing activity refers to a biological activity that enhances the hydrolysis of a cellulosic material by proteins having cellulolytic activity.
  • saccharifying a cellulosic material with a cellulolytic protein in the presence of a Family 61 glycosidase may increase the degradation of cellulosic material compared to the presence of only the cellulolytic protein.
  • the cellulosic material can be any material containing cellulose.
  • the cellulolytic activity is a biological activity that hydrolyzes a cellulosic material.
  • Cellulolytic enhancing activity can be determined by measuring the increase in sugars from the hydrolysis of a cellulosic material by cellulolytic protein.
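  • One simple way to express the increase in sugars described above is as a ratio of the sugar released with and without the Family 61 glycosidase under otherwise identical conditions. The sketch below is a hypothetical illustration only; the function name, the sugar unit (g/L) and the sample values are assumptions and are not measurements from this disclosure.

```python
# Hypothetical sketch: both inputs are sugar concentrations measured under the same
# conditions (same substrate, cellulolytic protein loading, time, pH and temperature).
def enhancement(sugar_cellulase_only, sugar_with_gh61):
    """Return (fold change, percent increase) in released sugar."""
    if sugar_cellulase_only <= 0:
        raise ValueError("the control must release a positive amount of sugar")
    fold = sugar_with_gh61 / sugar_cellulase_only
    percent_increase = 100.0 * (fold - 1.0)
    return fold, percent_increase


# Made-up example: 5.0 g/L with the cellulolytic protein alone, 7.5 g/L with a GH61 added.
fold, pct_increase = enhancement(5.0, 7.5)
```

    A fold change above 1 in such a comparison would indicate enhancement; how large a change counts as meaningful depends on the assay and is not specified by this sketch.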
  • Proteins of the present invention may also include homologues, variants, and fragments of the proteins disclosed herein.
  • the protein fragments include, but are not limited to, fragments comprising a catalytic domain (CD) and/or a carbohydrate binding module (CBM) (also known as a cellulose-binding domain; both can be referred to herein as CBM).
  • the identity and location of domains within proteins of the present invention are disclosed in detail below.
  • the present invention encompasses all combinations of the disclosed domains.
  • a protein fragment may comprise a CD of a protein but not a CBM of the protein or a CBM of a protein but not a CD.
  • domains from different proteins may be combined.
  • Protein fragments comprising a CD, CBM or combinations thereof for each protein disclosed herein can be readily produced using standard techniques known in the art.
  • a protein fragment comprises a domain of a protein that has at least one biological activity of the full-length protein. Homologues or variants of proteins of the invention that have at least one biological activity of the full-length protein are described in detail below.
  • biological activity of a protein refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vitro or in vivo.
  • a protein fragment comprises a domain of a protein that has the catalytic activity of the full-length enzyme. Specific biological activities of the proteins of the invention, and structures within the proteins that are responsible for the activities, are described below.
  • Esterases represent a category of various enzymes including lipases, phospholipases, cutinases, and phytases that catalyze the hydrolysis and synthesis of ester bonds in compounds.
  • Applications of esterases in the food industry include, but are not limited to, degumming vegetable oils; improving the production of bread (e.g., in situ production of emulsifiers); producing crackers, noodles, and pasta; enhancing flavor development of cheese, butter, and margarine; ripening cheese; removing wax; trans-esterification of flavors and cocoa butter substitutes; synthesizing structured lipids for infant formula and nutraceuticals; improving the polyunsaturated fatty acid content in fish oil; and aiding in digestion and releasing minerals in food processing.
  • esters include, but are not limited to, the removal of triglycerides, steryl esters, resin acids, free fatty acids, and sterols (e.g., lipophilic wood extractives).
  • Applications of esterases in other industries include, but are not limited to, use as a biocatalyst; sewage treatment; cleaning up oil pollution; the synthesis of esters; the synthesis of fragrances; enantio-specific catalysis of fine chemicals (e.g., esters for chemical and drug intermediates); the production of isopropyl myristate, isopropyl palmitate and 2-ethylpalmitate for use as emollients in personal care products; saving of energy and minimization of thermal degradation in the oleochemical industry; use as a feed additive; and enhancing the recovery of oil (e.g., during drilling).
  • the enzyme Aes is encoded by the nucleic acid sequence represented by SEQ ID NO: 1 in Table 1.
  • the Aes nucleic acid sequence encodes a 302 amino acid sequence, represented by SEQ ID NO: 2 in Table 1.
  • the signal peptide for Aes is located from about position 1 to about position 21 of the Aes amino acid sequence, with the mature protein spanning from about position 22 to about position 302 of the Aes amino acid sequence.
  • a catalytic domain (CD) is present.
  • Aes can be assigned to CE16 of the CAZy families and is expected to have acetyl esterase activity.
  • Aes possesses significant homology (about 65% from amino acids 30 to 302 of Aes) with a predicted protein from Neurospora crassa OR74A (Genbank Accession No. EAA28920.1).
  • Aes also possesses significant homology (about 68% from amino acids 38 to 302 of Aes) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP65933).
  • Aes possessed acetyl esterase activity.
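The signal peptide and mature protein boundaries quoted for Aes (about positions 1 to 21 and 22 to 302) are 1-based, inclusive residue positions. A minimal sketch of how such coordinates map onto a sequence string is shown below. The sequence used is a placeholder of the stated length, not SEQ ID NO: 2, and the function name is hypothetical.

```python
from typing import Tuple


def split_signal_peptide(sequence: str, signal_end: int) -> Tuple[str, str]:
    """Split a precursor protein into (signal peptide, mature protein).

    signal_end is the 1-based, inclusive position of the last signal
    peptide residue, so the mature protein starts at signal_end + 1.
    """
    if not 0 <= signal_end <= len(sequence):
        raise ValueError("signal_end lies outside the sequence")
    return sequence[:signal_end], sequence[signal_end:]


# Placeholder standing in for the 302-residue Aes precursor (not the real sequence).
placeholder_aes = "M" + "A" * 301

signal, mature = split_signal_peptide(placeholder_aes, signal_end=21)
print(len(signal), len(mature))  # 21 281 (positions 1-21 and 22-302)
```

Because the text gives the boundaries as approximate ("about position"), any real use would treat `signal_end` as an estimate to be confirmed experimentally or with a prediction tool.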
  • the Gue1 nucleic acid sequence encodes a 397 amino acid sequence, represented by SEQ ID NO: 4 in Table 1.
  • the signal peptide for Gue1 is located from about position 1 to about position 18 of the Gue1 amino acid sequence, with the mature protein spanning from about position 19 to about position 397 of the Gue1 amino acid sequence.
  • a catalytic domain (CD) is present.
  • the amino acid sequence containing the CD of Gue1 spans from a starting point of about position 19 to an ending point of about position 397 of the Gue1 amino acid sequence. Based on homology, Gue1 can be assigned to CE15 of the CAZy families and is expected to have glucuronyl esterase activity.
  • Gue1 possesses significant homology (about 73% from amino acids 1 to 389 of Gue1) with a hypothetical protein NCU09445 from Neurospora crassa OR74A (Genbank Accession No. EAA29361.1). Gue1 also possesses significant homology (about 77% from amino acids 1 to 397 of Gue1) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP65970). As evidenced below in Example 3, Gue1 possessed glucuronyl esterase activity.
  • the enzyme Gue2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 5 in Table 1.
  • the Gue2 nucleic acid sequence encodes a 417 amino acid sequence, represented by SEQ ID NO: 6 in Table 1.
  • the signal peptide for Gue2 is located from about position 1 to about position 17 of the Gue2 amino acid sequence, with the mature protein spanning from about position 18 to about position 417 of the Gue2 amino acid sequence. Within Gue2 a catalytic domain (CD) is present.
  • Gue2 can be assigned to CE15 of the CAZy families and is expected to have glucuronyl esterase activity.
  • Gue2 possesses significant homology (about 78% from amino acids 24 to 373 of Gue2) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP59671).
  • Gue2 also possesses significant homology (about 33% from amino acids 59 to 391 of Gue2) with Cip2 from Hypocrea jecorina (Genbank Accession No. AAP57749). As evidenced below in Example 3, Gue2 possessed glucuronyl esterase activity.
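The homology figures quoted for each enzyme are percentages over a stated residue range. The text does not say which tool or definition produced them, so the sketch below is an assumption: it computes percent identity over the aligned (non-gap) columns of two pre-aligned sequences, which is one common convention. The toy sequences are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-').

    Both inputs must already be aligned and therefore have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy alignment (hypothetical): 8 gap-free columns, 6 identical.
print(percent_identity("MKT-AYIAK", "MRTLAYLAK"))  # 75.0
```

Other tools divide by the full alignment length or by the length of the shorter sequence, so figures from different sources are not always directly comparable.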
  • the enzyme FaeB3 is encoded by the nucleic acid sequence represented by SEQ ID NO: 7 in Table 1.
  • the FaeB3 nucleic acid sequence encodes a 439 amino acid sequence, represented by SEQ ID NO: 8 in Table 1.
  • the signal peptide for FaeB3 is located from about position 1 to about position 21 of the FaeB3 amino acid sequence, with the mature protein spanning from about position 22 to about position 439 of the FaeB3 amino acid sequence.
  • a catalytic domain (CD) is present.
  • FaeB3 is expected to have feruloyl esterase activity. FaeB3 possesses significant homology (about 47% from amino acids 38 to 439 of FaeB3) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP64723). FaeB3 also possesses significant homology (about 56% from amino acids 1 to 439 of FaeB3) with a hypothetical protein CHGG_11054 from Chaetomium globosum CBS 148.51 (Genbank Accession No. EAQ82878). As evidenced below in Example 4, FaeB3 possessed feruloyl esterase activity.
  • the enzyme FaeB4 is encoded by the nucleic acid sequence represented by SEQ ID NO: 9 in Table 1.
  • the FaeB4 nucleic acid sequence encodes a 234 amino acid sequence, represented by SEQ ID NO: 10 in Table 1.
  • the signal peptide for FaeB4 is located from about position 1 to about position 19 of the FaeB4 amino acid sequence, with the mature protein spanning from about position 20 to about position 234 of the FaeB4 amino acid sequence.
  • a catalytic domain (CD) is present.
  • FaeB4 is expected to have feruloyl esterase activity. FaeB4 possesses significant homology (about 53% from amino acids 1 to 233 of FaeB4) with hypothetical protein MGG_08737 from Magnaporthe grisea 70-15 (Genbank Accession No. EDJ93992). FaeB4 also possesses significant homology (about 52% from amino acids 15 to 233 of FaeB4) with a putative feruloyl esterase from Aspergillus fumigatus A1163 (Genbank Accession No. EDP49472).
  • the enzyme Fae8 is encoded by the nucleic acid sequence represented by SEQ ID NO: 11 in Table 1.
  • the Fae8 nucleic acid sequence encodes a 373 amino acid sequence, represented by SEQ ID NO: 12 in Table 1.
  • CD catalytic domain
  • Fae8 possesses significant homology (about 50% from amino acids 7 to 362 of Fae8) with a hypothetical protein from Gibberella zeae PH-1 (Genbank Accession No. XP_389574). Fae8 also possesses significant homology (about 54% from amino acids 9 to 370 of Fae8) with EDK03956 (Genbank Accession No. XP367307).
  • the enzyme FaeA3 is encoded by the nucleic acid sequence represented by SEQ ID NO: 13 in Table 1.
  • the FaeA3 nucleic acid sequence encodes a 445 amino acid sequence, represented by SEQ ID NO: 14 in Table 1.
  • the signal peptide for FaeA3 is located from about position 1 to about position 20 of the FaeA3 amino acid sequence, with the mature protein spanning from about position 21 to about position 445 of the FaeA3 amino acid sequence.
  • a catalytic domain (CD) is present.
  • FaeA3 is expected to have feruloyl esterase activity. FaeA3 possesses significant homology (about 56% from amino acids 78 to 445 of FaeA3) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP60411). FaeA3 also possesses significant homology (about 56% from amino acids 78 to 432 of FaeA3) with EAA68101 (Genbank Accession No. XP381416).
  • the enzyme Rgae2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 15 in Table 1.
  • the Rgae2 nucleic acid sequence encodes a 253 amino acid sequence, represented by SEQ ID NO: 16 in Table 1.
  • the signal peptide for Rgae2 is located from about position 1 to about position 20 of the Rgae2 amino acid sequence, with the mature protein spanning from about position 21 to about position 253 of the Rgae2 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Rgae2 can be assigned to CE12 of the CAZy families and is expected to have rhamnogalacturonan acetyl esterase activity. Rgae2 may also possess pectin acetylesterase or acetyl xylan esterase activity. Rgae2 possesses significant homology (about 49% from amino acids 6 to 252 of Rgae2) with a protein from Aspergillus oryzae (Genbank Accession No. BAE63203).
  • Rgae2 possesses rhamnogalacturonan acetyl esterase activity, which can be measured, for example, as described by Searle-van Leeuwen et al. (1992) Appl. Microbiol. Biotechnol. 38: 347-349 and in Searle-van Leeuwen et al. (1996) In: "Pectins and Pectinases", J. Visser and A. Voragen (eds) Progress in Biotechnology Vol 14, Elsevier, Amsterdam, pp 793-798.
  • Carbohydrases represent a category of various enzymes and polypeptides including amylases, cellulases, hemicellulases, pectinases, and chitinases that catalyze and/or enhance the hydrolysis or synthesis of a carbohydrate.
  • Applications of carbohydrases in the food industry include, but are not limited to, increasing the yield of fruit juice production in total liquefaction; increasing the pressing yield of oils (e.g., from olives), cleaning filters, reduction of viscosity, hydrolyzing starch, and stimulating fermentation in the brewing industry; increasing the loaf volume and improving crust color in the baking industry; preventing/reducing the staling of bread; removing lactose from milk products; clarifying, filtrating, and extracting aroma and color (e.g., the wine industry); debittering or detoxifying plant glycosidic compounds; processing coffee; aiding in digestion; producing starch (e.g., separating starch from gluten); producing oligosaccharides (e.g., nutraceuticals); producing gelling agents; modifying viscosity; saccharification of starch and other biopolymers; and modifying starch.
  • the enzyme Agu2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 17 in Table 2.
  • the Agu2 nucleic acid sequence encodes a 1076 amino acid sequence, represented by SEQ ID NO: 18 in Table 2.
  • the signal peptide for Agu2 is located from about position 1 to about position 20 of the Agu2 amino acid sequence, with the mature protein spanning from about position 21 to about position 1076 of the Agu2 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Agu2 can be assigned to GH115 of the CAZy families and is expected to have α-glucuronidase activity. α-Glucuronidase activity assay methods have been summarized by J. Puls: α-Glucuronidases in the hydrolysis of wood xylans. In: Xylans and Xylanases, eds J. Visser et al. Elsevier, Amsterdam 1992, pp 213-224.
  • Agu2 possesses significant homology (about 71% from amino acids 1 to 1075 of Agu2) with a conserved hypothetical protein from Neurospora crassa OR74A (Genbank Accession No. EAA30769).
  • Agu2 also possesses significant homology (about 70% from amino acids 1 to 1075 of Agu2) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP65960).
  • the enzyme Gxh1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 19 in Table 2.
  • the Gxh1 nucleic acid sequence encodes a 484 amino acid sequence, represented by SEQ ID NO: 20 in Table 2.
  • the signal peptide for Gxh1 is located from about position 1 to about position 19 of the Gxh1 amino acid sequence, with the mature protein spanning from about position 20 to about position 484 of the Gxh1 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Gxh1 can be assigned to GH5 of the CAZy families and is expected to have xylobiohydrolase activity.
  • Gxh1 possesses significant homology (about 71% from amino acids 1 to … of Gxh1) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP68494).
  • Gxh1's xylobiohydrolase activity can be assayed by HPAEC analysis with xylo-oligosaccharides as the substrate.
  • the enzyme Gxh2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 21 in Table 2.
  • the Gxh2 nucleic acid sequence encodes a 477 amino acid sequence, represented by SEQ ID NO: 22 in Table 2.
  • the signal peptide for Gxh2 is located from about position 1 to about position 17 of the Gxh2 amino acid sequence, with the mature protein spanning from about position 18 to about position 477 of the Gxh2 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Gxh2 can be assigned to GH5 of the CAZy families and is expected to have xylanase activity. Gxh2 may also exhibit chitosanase; β-mannosidase; cellulase; glucan 1,3-β-glucosidase; licheninase; glucan endo-1,6-β-glucosidase; mannan endo-β-1,4-mannosidase; endo-β-1,4-xylanase; cellulose β-1,4-cellobiosidase; endo-β-1,6-galactanase; β-1,3-mannanase; xyloglucan-specific endo-β-1,4-glucanase; and/or mannan transglycosylase activity.
  • Gxh2 possesses significant homology (about 56% from amino acids 20 to 474 of Gxh2) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP64828). The activity can be tested with HPAEC analysis.
  • the enzyme Aga1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 23 in Table 2.
  • the Aga1 nucleic acid sequence encodes a 435 amino acid sequence, represented by SEQ ID NO: 24 in Table 2.
  • the signal peptide for Aga1 is located from about position 1 to about position 14 of the Aga1 amino acid sequence, with the mature protein spanning from about position 15 to about position 435 of the Aga1 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Aga1 can be assigned to GH27 of the CAZy families and is expected to have α-galactosidase activity.
  • Aga1 may also possess α-N-acetylgalactosaminidase; isomalto-dextranase; and/or β-L-arabinopyranosidase activity.
  • Aga1 possesses significant homology (about 56% from amino acids 14 to 425 of Aga1) with a hypothetical protein POSPLDRAFT_134790 from Postia placenta Mad-698-R (Genbank Accession No. EED85274).
  • Aga1 also possesses significant homology (about 51% from amino acids 17 to 416 of Aga1) with a glycoside hydrolase family 27 protein from Laccaria bicolor S238N-H82 (Genbank Accession No. EDR08276). Activity of Aga1 can be measured with pNP-α-D-galactopyranoside as the substrate. See Example 14.
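Assays with pNP-glycoside substrates are usually read out as the absorbance of released p-nitrophenol, which is converted to an amount of product with the Beer-Lambert law. The sketch below illustrates that arithmetic only. It is not the protocol of Example 14, and every number in it (absorbance change, extinction coefficient, volume, time) is a hypothetical input that would have to come from the actual assay conditions.

```python
def pnp_activity_umol_per_min(
    delta_absorbance: float,
    extinction_mM_cm: float,
    path_length_cm: float,
    reaction_volume_ml: float,
    minutes: float,
) -> float:
    """Convert an absorbance change into micromoles of p-nitrophenol per minute.

    Beer-Lambert: concentration (mM) = absorbance / (extinction * path length).
    A concentration in mM equals micromoles per millilitre, so multiplying by
    the volume in millilitres gives micromoles of product.
    """
    if min(extinction_mM_cm, path_length_cm, reaction_volume_ml, minutes) <= 0:
        raise ValueError("extinction, path length, volume and time must be positive")
    concentration_mM = delta_absorbance / (extinction_mM_cm * path_length_cm)
    product_umol = concentration_mM * reaction_volume_ml
    return product_umol / minutes


# Hypothetical inputs; the extinction coefficient in particular is an
# assumed value and depends on wavelength and pH.
activity = pnp_activity_umol_per_min(
    delta_absorbance=0.366,
    extinction_mM_cm=18.3,
    path_length_cm=1.0,
    reaction_volume_ml=1.0,
    minutes=10.0,
)
print(f"{activity:.4f} umol/min")
```

Dividing the result by the amount of enzyme in the reaction would give a specific activity, provided the absorbance change stays within the linear range of the assay.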
  • the enzyme Aga2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 25 in Table 2.
  • the Aga2 nucleic acid sequence encodes a 416 amino acid sequence, represented by SEQ ID NO: 26 in Table 2.
  • the signal peptide for Aga2 is located from about position 1 to about position 20 of the Aga2 amino acid sequence, with the mature protein spanning from about position 21 to about position 416 of the Aga2 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Aga2 can be assigned to GH27 of the CAZy families and is expected to have α-galactosidase activity. Aga2 may also possess α-N-acetylgalactosaminidase; isomalto-dextranase; and/or β-L-arabinopyranosidase activity. Aga2 possesses significant homology (about 61% from amino acids 1 to 403 of Aga2) with hypothetical protein MGG_13626 from Magnaporthe grisea 70-15 (Genbank Accession No. EDJ99928).
  • Aga2 also possesses significant homology (about 73% from amino acids 1 to 403 of Aga2) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP71790).
  • Activity of Aga2 can be measured with pNP-α-D-galactopyranoside as the substrate. See Example 14.
  • the enzyme Aga3 is encoded by the nucleic acid sequence represented by SEQ ID NO: 27 in Table 2.
  • the Aga3 nucleic acid sequence encodes a 892 amino acid sequence, represented by SEQ ID NO: 28 in Table 2.
  • CD catalytic domain
  • Aga3 can be assigned to GH36 of the CAZy families and is expected to have α-galactosidase/raffinose synthase activity.
  • Aga2 may also possess α-N-acetylgalactosaminidase; isomalto-dextranase; and/or β-L-arabinopyranosidase activity.
  • Aga3 possesses significant homology (about 67% from amino acids 1 to 892 of Aga3) with hypothetical protein CHGG_01365 from Chaetomium globosum CBS 148.51 (Genbank Accession No. EAQ93130).
  • Activity of Aga3 can be measured with pNP-α-D-galactopyranoside as the substrate. See Example 14.
  • Man8 is encoded by the nucleic acid sequence represented by SEQ ID NO: 29 in Table 2.
  • the Man8 nucleic acid sequence encodes a 897 amino acid sequence, represented by SEQ ID NO: 30 in Table 2.
  • CD catalytic domain
  • Man8 may also possess β-galactosidase; β-glucuronidase; mannosylglycoprotein endo-β-mannosidase; and/or exo-β-glucosaminidase activity.
  • Man8 possesses significant homology (about 54% from amino acids 5 to 897 of Man8) with EDK05879 (Genbank Accession No. XP364819).
  • Man8 also possesses significant homology (about 50% from amino acids 4 to 897 of Man8) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP61774).
  • the activity of Man8 can be examined with a β-mannosidase assay. See Example 15.
  • Man9 is encoded by the nucleic acid sequence represented by SEQ ID NO: 31 in Table 2.
  • the Man9 nucleic acid sequence encodes a 855 amino acid sequence, represented by SEQ ID NO: 32 in Table 2.
  • CD catalytic domain
  • the amino acid sequence containing the CD of Man9 spans from a starting point of about position 2 to an ending point of about position 654 of the Man9 amino acid sequence. Based on homology, Man9 can be assigned to GH2 of the CAZy families and is expected to have β-mannosidase activity.
  • Man9 may also possess β-galactosidase; β-glucuronidase; mannosylglycoprotein endo-β-mannosidase; and/or exo-β-glucosaminidase activity.
  • Man9 possesses significant homology (about 73% from amino acids 1 to 853 of Man9) with EDK05161 (Genbank Accession No. XP363252).
  • Man9 also possesses significant homology (about 76% from amino acids 1 to 855 of Man9) with hypothetical protein NCU00890 from Neurospora crassa OR74A (Genbank Accession No. EAA35570).
  • the activity of Man9 can be examined with a β-mannosidase assay. See Example 15.
  • the enzyme Bgl2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 33 in Table 2.
  • the Bgl2 nucleic acid sequence encodes a 968 amino acid sequence, represented by SEQ ID NO: 34 in Table 2.
  • the amino acid sequence containing the CD of Bgl2 spans from a starting point of about position 168 to an ending point of about position 750 of the Bgl2 amino acid sequence.
  • Bgl2 can be assigned to GH3 of the CAZy families and is expected to have β-glucosidase activity.
  • Bgl2 may also possess β-glucosidase; xylan 1,4-β-xylosidase; β-N-acetylhexosaminidase; glucan 1,3-β-glucosidase; exo-1,3-1,4-glucanase; and/or alpha-L-arabinofuranosidase activity.
  • Bgl2 possesses significant homology (about 66% from amino acids 1 to 968 of Bgl2) with EAA35949 (Genbank Accession No. XP965185).
  • Bgl2 also possesses significant homology (about 75% from amino acids 1 to 968 of Bgl2) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP65606).
  • the enzyme Bgl3 is encoded by the nucleic acid sequence represented by SEQ ID NO: 35 in Table 2.
  • the Bgl3 nucleic acid sequence encodes a 865 amino acid sequence, represented by SEQ ID NO: 36 in Table 2.
  • CD catalytic domain
  • Bgl3 can be assigned to GH3 of the CAZy families and is expected to have β-glucosidase activity, which can be assayed, for example, as described by Esen in: Handbook of Food Enzymology. Edited by John
  • Bgl3 may also possess β-glucosidase; xylan 1,4-β-xylosidase; β-N-acetylhexosaminidase; glucan 1,3-β-glucosidase; exo-1,3-1,4-glucanase; and/or alpha-L-arabinofuranosidase activity.
  • Bgl3 possesses significant homology (about 55.68% from amino acids 97 to 865 of Bgl3) with a protein from Aspergillus oryzae (Genbank Accession No. BAE60358.1).
  • the enzyme Bgl4 is encoded by the nucleic acid sequence represented by SEQ ID NO: 37 in Table 2.
  • the Bgl4 nucleic acid sequence encodes a 884 amino acid sequence, represented by SEQ ID NO: 38 in Table 2.
  • CD catalytic domain
  • Bgl4 can be assigned to GH3 of the CAZy families and is expected to have β-glucosidase activity, which can be assayed, for example, as described by Esen in: Handbook of Food Enzymology. Edited by John R. Whitaker et al. CRC Press 2002, pp 791-803. Bgl4 may also possess β-glucosidase; xylan 1,4-β-xylosidase; β-N-acetylhexosaminidase; glucan 1,3-β-glucosidase; exo-1,3-1,4-glucanase; and/or alpha-L-arabinofuranosidase activity.
  • Bgl4 possesses significant homology (about 79% from amino acids 12 to 884 of Bgl4) with EAA35798 (Genbank Accession No. XP363252). Bgl4 also possesses significant homology (about 82% from amino acids 11 to 870 of Bgl4) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP69216).
  • the enzyme Bgl5 is encoded by the nucleic acid sequence represented by SEQ ID NO: 39 in Table 2.
  • the Bgl5 nucleic acid sequence encodes a 903 amino acid sequence, represented by SEQ ID NO: 40 in Table 2.
  • the signal peptide for Bgl5 is located from about position 1 to about position 23 of the Bgl5 amino acid sequence, with the mature protein spanning from about position 24 to about position 903 of the Bgl5 amino acid sequence.
  • a catalytic domain (CD) is present.
  • the amino acid sequence containing the CD of Bgl5 spans from a starting point of about position 93 to an ending point of about position 662 of the Bgl5 amino acid sequence.
  • Bgl5 can be assigned to GH3 of the CAZy families and is expected to have β-glucosidase activity, which can be assayed as described by Esen in: Handbook of Food Enzymology. Edited by John R. Whitaker et al. CRC Press 2002, pp 791-803.
  • Bgl5 may also possess β-glucosidase; xylan 1,4-β-xylosidase; β-N-acetylhexosaminidase; glucan 1,3-β-glucosidase; exo-1,3-1,4-glucanase; and/or alpha-L-arabinofuranosidase activity.
  • Bgl5 possesses significant homology (about 66% from amino acids 1 to 903 of Bgl5) with a probable beta-glucosidase 1 precursor from Neurospora crassa (Genbank Accession No. CAC28685).
  • Bgl5 also possesses significant homology (about 75% from amino acids 1 to 903 of Bgl5) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP73569).
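The Bgl5 coordinates given above (signal peptide about 1 to 23, mature protein about 24 to 903, CD about 93 to 662) show how a domain-only fragment could be cut out of a full-length sequence. The following is a minimal sketch under stated assumptions: the `Region` class is a hypothetical helper, and the sequence is a placeholder of the stated length rather than SEQ ID NO: 40.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A named stretch of a protein, using 1-based inclusive positions."""
    name: str
    start: int
    end: int

    def extract(self, sequence: str) -> str:
        """Return the residues covered by this region."""
        if not 1 <= self.start <= self.end <= len(sequence):
            raise ValueError(f"{self.name}: {self.start}-{self.end} is outside the sequence")
        return sequence[self.start - 1:self.end]

    def contains(self, other: "Region") -> bool:
        """True if the other region lies entirely inside this one."""
        return self.start <= other.start and other.end <= self.end


# Approximate Bgl5 coordinates as quoted in the text.
SIGNAL = Region("signal peptide", 1, 23)
MATURE = Region("mature protein", 24, 903)
CD = Region("catalytic domain", 93, 662)

placeholder_bgl5 = "X" * 903  # stand-in for the 903-residue sequence

cd_fragment = CD.extract(placeholder_bgl5)
print(len(cd_fragment))     # 570 residues (93 to 662 inclusive)
print(MATURE.contains(CD))  # True: the CD lies within the mature protein
```

The same helper would describe a CBM-only fragment or a combination of domains from different proteins by concatenating the extracted strings.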
  • the enzyme Glu1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 41 in Table 2.
  • the Glu1 nucleic acid sequence encodes a 819 amino acid sequence, represented by SEQ ID NO: 42 in Table 2.
  • the signal peptide for Glu1 is located from about position 1 to about position 17 of the Glu1 amino acid sequence, with the mature protein spanning from about position 18 to about position 819 of the Glu1 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Glu1 can be assigned to GH55 of the CAZy families and is expected to have endo- and/or exo-β-glucanase activity.
  • Glu1 possesses significant homology (about 56% from amino acids 19 to 811 of Glu1) with a conserved hypothetical protein from Neurospora crassa OR74A (Genbank Accession No. EAA35992).
  • Glu1 also possesses significant homology (about 54% from amino acids 7 to 815 of Glu1) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP65594).
  • the enzyme Glu2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 43 in Table 2.
  • the Glu2 nucleic acid sequence encodes a 550 amino acid sequence, represented by SEQ ID NO: 44 in Table 2.
  • the signal peptide for Glu2 is located from about position 1 to about position 19 of the Glu2 amino acid sequence, with the mature protein spanning from about position 20 to about position 550 of the Glu2 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Glu2 can be assigned to GH17 of the CAZy families and is expected to have β-glucanase activity. Glu2 may also express glucan endo-1,3-β-glucosidase; glucan 1,3-β-glucosidase; licheninase; and/or β-1,3-glucanosyltransglycosylase activity. Glu2 possesses significant homology (about 76% from amino acids 1 to 550 of Glu2) with exo-beta-1,3-glucanase from Chaetomium globosum (Genbank Accession No. ACM42426). Glu2 also possesses significant homology (about 58% from amino acids 1 to 550 of Glu2) with EAA34684 (Genbank Accession No. XP963920).
  • the enzyme Abn6 is encoded by the nucleic acid sequence represented by SEQ ID NO: 45 in Table 2.
  • the Abn6 nucleic acid sequence encodes a 611 amino acid sequence, represented by SEQ ID NO: 46 in Table 2.
  • the signal peptide for Abn6 is located from about position 1 to about position 16 of the Abn6 amino acid sequence, with the mature protein spanning from about position 17 to about position 611 of the Abn6 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Abn6 can be assigned to GH43 of the CAZy families and is expected to have arabinofuranosidase/arabinase/xylosidase activity.
  • Abn6 may possess β-xylosidase; β-1,3-xylosidase; α-L-arabinofuranosidase; xylanase; and/or galactan 1,3-β-galactosidase activity.
  • Abn6 possesses significant homology (about 49% from amino acids 1 to 609 of Abn6) with a protein from Aspergillus oryzae (Genbank Accession No. BAE61393).
  • Abn6 also possesses significant homology (about 72% from amino acids 16 to 611 of Abn6) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP66759).
  • the enzyme Abn8 is encoded by the nucleic acid sequence represented by SEQ ID NO: 47 in Table 2.
  • the Abn8 nucleic acid sequence encodes a 631 amino acid sequence, represented by SEQ ID NO: 48 in Table 2.
  • the signal peptide for Abn8 is located from about position 1 to about position 20 of the Abn8 amino acid sequence, with the mature protein spanning from about position 21 to about position 631 of the Abn8 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Abn8 can be assigned to GH43 of the CAZy families and is expected to have arabinofuranosidase/arabinase/xylosidase activity.
  • Abn8 may possess β-xylosidase; β-1,3-xylosidase; α-L-arabinofuranosidase; xylanase; and/or galactan 1,3-β-galactosidase activity.
  • Abn8 possesses significant homology (about 54% from amino acids 34 to 588 of Abn8) with a putative xylanase 27 from Gibberella zeae (Genbank Accession No. AAV98256).
  • Abn8 also possesses significant homology (about 52% from amino acids 32 to 585 of Abn8) with a hypothetical protein MGG_05479 from Magnaporthe grisea 70-15 (Genbank Accession No. EDK06156).
  • the enzyme Abn10 is encoded by the nucleic acid sequence represented by SEQ ID NO: 49 in Table 2.
  • the Abn10 nucleic acid sequence encodes a 424 amino acid sequence, represented by SEQ ID NO: 50 in Table 2.
  • the signal peptide for Abn10 is located from about position 1 to about position 20 of the Abn10 amino acid sequence, with the mature protein spanning from about position 21 to about position 424 of the Abn10 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Abn10 can be assigned to GH93 of the CAZy families and is expected to have arabinanase activity.
  • Abn10 possesses significant homology (about 70% from amino acids 56 to 414 of Abn10) with a conserved hypothetical protein from Neurospora crassa OR74A (Genbank Accession No. EAA28974). Abn10 also possesses significant homology (about 55% from amino acids 58 to 414 of Abn10) with a protein from Aspergillus oryzae (Genbank Accession No. BAE64690).
  • the enzyme Lam1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 51 in Table 2.
  • the Lam1 nucleic acid sequence encodes a 763 amino acid sequence, represented by SEQ ID NO: 52 in Table 2.
  • the signal peptide for Lam1 is located from about position 1 to about position 20 of the Lam1 amino acid sequence, with the mature protein spanning from about position 21 to about position 763 of the Lam1 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Lam1 can be assigned to GH55 of the CAZy families and is expected to have exo- and/or endo-β-glucanase activity.
  • Lam1 possesses significant homology (about 74% from amino acids 8 to 738 of Lam1) with a hypothetical protein NCU04850 from Neurospora crassa OR74A (Genbank Accession No. EAA31173).
  • Lam1 also possesses significant homology (about 74% from amino acids 22 to 762 of Lam1) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP66532). As evidenced below in Example 2, Lam1 possessed β-glucanase activity.
  • the enzyme Bgal is encoded by the nucleic acid sequence represented by SEQ ID NO: 53 in Table 2.
  • the Bgal nucleic acid sequence encodes a 881 amino acid sequence, represented by SEQ ID NO: 54 in Table 2.
  • the signal peptide for Bgal is located from about position 1 to about position 30 of the Bgal amino acid sequence, with the mature protein spanning from about position 31 to about position 881 of the Bgal amino acid sequence.
  • a catalytic domain (CD) is present.
  • Bgal can be assigned to GH2 of the CAZy families and is expected to have β-galactosidase activity.
  • Bgal possesses significant homology (about 58% from amino acids 44 to 879 of Bgal) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP70161).
  • assays to test Bgal activity include a β-galactosidase assay using ONPG (oNP-β-D-galactopyranoside) or PNPG (pNP-β-D-galactopyranoside) as a substrate (see the example described by Mahoney, R.R. In: Handbook of Food Enzymology. Edited by John R.
  • Bgal may also possess β-mannosidase; β-glucuronidase; mannosylglycoprotein endo-β-mannosidase; and/or exo-β-glucosaminidase activity.
  • the enzyme Bga3 is encoded by the nucleic acid sequence represented by SEQ ID NO: 55 in Table 2.
  • the Bga3 nucleic acid sequence encodes a 298 amino acid sequence, represented by SEQ ID NO: 56 in Table 2.
  • the mature Bga3 protein spans from about position 1 to about position 298 of the Bga3 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Bga3 can be assigned to GH42 of the CAZy families and is expected to have β-galactosidase/galactanase activity. Bga3 possesses significant homology (about 50% from amino acids 1 to 264 of Bga3) with hypothetical protein CBG19899 from Caenorhabditis briggsae (Genbank Accession No. CAP37064).
  • the enzyme Gal2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 57 in Table 2.
  • the Gal2 nucleic acid sequence encodes a 490 amino acid sequence, represented by SEQ ID NO: 58 in Table 2.
  • the signal peptide for Gal2 is located from about position 1 to about position 18 of the Gal2 amino acid sequence, with the mature protein spanning from about position 19 to about position 490 of the Gal2 amino acid sequence.
  • a catalytic domain (CD) is present.
  • Gal2 can be assigned to GH43 of the CAZy families and is expected to have galactanase/arabinase activity. Gal2 may also possess β-xylosidase; β-1,3-xylosidase; α-L-arabinofuranosidase; xylanase; and/or galactan 1,3-β-galactosidase activity. Gal2 possesses significant homology (about 79% from amino acids 1 to 484 of Gal2) with a protein from Aspergillus oryzae (Genbank Accession No. BAE60540).
  • Gal2 also possesses significant homology (about 73% from amino acids 1 to 484 of Gal2) with protein Pc22g09450 from Penicillium chrysogenum Wisconsin 54-1255 (Genbank Accession No. CAP98233). Gal2 activity can be tested by at least a reducing sugars assay using galactan or arabinan substrates.
  • the enzyme Arhl is encoded by the nucleic acid sequence represented by SEQ ID NO: 59 in Table 2.
  • the Arhl nucleic acid sequence encodes a 777 amino acid sequence, represented by SEQ ID NO: 60 in Table 2.
  • the signal peptide for Arhl is located from about position 1 to about position 15 of the Arhl amino acid sequence, with the mature protein spanning from about position 16 to about position 777 of the Arhl amino acid sequence.
  • a catalytic domain (CD) is present.
  • Arhl can be assigned to GH78 of the CAZy families and is expected to have α-rhamnosidase activity.
  • Arhl possesses significant homology (about 80% from amino acids 1 to 777 of Arhl) with hypothetical protein CHQG_06672 from Chaetomium globosum CBS 148.51 (Genbank Accession No. EAQ90053).
  • Arhl also possesses significant homology (about 54% from amino acids 25 to 777 of Arhl) with a predicted protein from Neurospora crassa OR74A (Genbank Accession No. EAA28787).
  • α-L-rhamnosidase activity can be measured as described in Manzanares, P. et al. (1997) FEMS Microbiol. Lett. 157, 279-283.
  • the enzyme Cell is encoded by the nucleic acid sequence represented by SEQ ID NO: 61 in Table 2.
  • the Cell nucleic acid sequence encodes a 476 amino acid sequence, represented by SEQ ID NO: 62 in Table 2.
  • the amino acid sequence containing the CD of Cell spans from a starting point of about position 2 to an ending point of about position 474 of the Cell amino acid sequence.
  • Cell can be assigned to GH1 of the CAZy families and is expected to have β-glucosidase activity, which can be assayed as described by Esen in: Handbook of Food Enzymology.
  • Cell may also possess β-galactosidase; β-mannosidase; β-glucuronidase; β-D-fucosidase; phlorizin hydrolase; exo-β-1,4-glucanase; 6-phospho-β-galactosidase; 6-phospho-β-glucosidase; strictosidine β-glucosidase; lactase; amygdalin β-glucosidase; prunasin β-glucosidase; raucaffricine β-glucosidase; thioglucosidase; β-primeverosidase; isoflavonoid 7-O-β-apiosyl-β-glucosidase; hydroxyisourate hydrolase; and/or β-glycosidase activity.
  • the enzyme Mip is encoded by the nucleic acid sequence represented by SEQ ID NO: 63 in Table 2.
  • the Mip nucleic acid sequence encodes an 850 amino acid sequence, represented by SEQ ID NO: 64 in Table 2.
  • the amino acid sequence containing the CD of Mip spans from a starting point of about position 172 to an ending point of about position 602 of the Mip amino acid sequence. Based on homology, Mip can be assigned to GH31 of the CAZy families and is expected to have amylase/α-glucosidase activity.
  • Mip possesses significant homology (about 61% from amino acids 7 to 823 of Mip) with hypothetical protein NCU04885 from Neurospora crassa OR74A (Genbank Accession No. EAA28714). Mip also possesses significant homology (about 67% from amino acids 7 to 850 of Mip) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP70184). Mip may also possess α-1,3-glucosidase; sucrase-isomaltase; α-xylosidase; α-glucan lyase; and/or isomaltosyltransferase activity.
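The signal-peptide and mature-protein spans given above for each enzyme use 1-based, inclusive positions. As a minimal sketch of how such a span maps onto a sequence string, the following Python splits a precursor at the last signal-peptide residue. The function name `split_signal_peptide` and the 30-residue placeholder sequence are hypothetical illustrations and are not taken from Table 2; the cut position 18 mirrors the "about position 18" stated for Gal2.

```python
from typing import Tuple


def split_signal_peptide(sequence: str, signal_end: int) -> Tuple[str, str]:
    """Split a precursor into (signal peptide, mature protein).

    signal_end is the 1-based position of the last signal-peptide residue,
    matching the position numbering used in the text.
    """
    if not 0 <= signal_end <= len(sequence):
        raise ValueError("signal_end must lie within the sequence")
    # Python slices are 0-based and end-exclusive, so [:signal_end] covers
    # 1-based positions 1..signal_end and [signal_end:] covers the rest.
    return sequence[:signal_end], sequence[signal_end:]


# Hypothetical 30-residue placeholder, NOT a sequence from Table 2.
precursor = "MKLSTLLAAGLASAQAAPTVEKRATCSDGK"
signal, mature = split_signal_peptide(precursor, signal_end=18)
print(len(signal), len(mature))
```

Because the text gives the boundaries as approximate ("about position"), a real cleavage site would need to be confirmed experimentally or with a dedicated prediction tool.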
  • MW: molecular weight in kilodaltons (kDa), as calculated from the amino acid sequence with Clone Manager 9 Professional Edition
  • pI: isoelectric point, as calculated from the amino acid sequence with Clone Manager 9 Professional Edition
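The source does not describe how Clone Manager derives these values. As a rough illustration only, a molecular weight can be estimated from an amino acid sequence by summing average residue masses and adding one water molecule for the free termini. The sketch below assumes the 20 standard residues and ignores glycosylation and other modifications; the function name `molecular_weight_kda` is hypothetical.

```python
# Average residue masses in daltons (amino acid minus one water molecule).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the N-terminal H and C-terminal OH


def molecular_weight_kda(sequence: str) -> float:
    """Estimate the molecular weight (kDa) of an unmodified polypeptide."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return (sum(RESIDUE_MASS[aa] for aa in seq) + WATER) / 1000.0


# Sanity check on a dipeptide (glycylalanine).
print(round(molecular_weight_kda("GA"), 5))
```

Passing the mature sequence (signal peptide removed) rather than the full precursor gives the estimate for the secreted form. An isoelectric point estimate would additionally require a chosen set of side-chain pKa values, which the source does not specify.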
  • an isolated protein or polypeptide in the present invention includes full-length proteins and their glycosylated or otherwise modified forms, fusion proteins, or any fragment or homologue or variant of such a protein.
  • an isolated protein, such as an enzyme according to the present invention, is a protein (including a polypeptide or peptide) that has been removed from its natural milieu (i.e., that has been subject to human manipulation) and can include purified proteins, partially purified proteins, recombinantly produced proteins, synthetically produced proteins, proteins complexed with lipids, soluble proteins, and isolated proteins associated with other proteins, for example.
  • a "Myceliophthora thermophila C1 protein" or "Myceliophthora thermophila C1 enzyme" refers to a protein (generally including a homologue or variant of a naturally occurring protein) from Myceliophthora thermophila or to a protein that has been otherwise produced from the knowledge of the structure (e.g., sequence) and perhaps the function of a naturally occurring protein from Myceliophthora thermophila.
  • In other words, an M. thermophila protein includes any protein that has substantially similar structure and function to a naturally occurring M. thermophila protein.
  • An M. thermophila protein can include purified, partially purified, recombinant, mutated/modified and synthetic proteins.
  • the terms "modification" and "mutation" can be used interchangeably, particularly with regard to the modifications/mutations to the amino acid sequence of an M. thermophila protein (or nucleic acid sequences) described herein.
  • An isolated protein according to the present invention can be isolated from its natural source, produced recombinantly or produced synthetically.
  • the terms "modification" and "mutation" can be used interchangeably, particularly with regard to the modifications/mutations to the primary amino acid sequences of a protein or peptide (or nucleic acid sequences) described herein.
  • the term "modification" can also be used to describe post-translational modifications to a protein or peptide including, but not limited to, methylation, farnesylation, carboxymethylation, geranylgeranylation, glycosylation, phosphorylation, acetylation, myristoylation, prenylation, palmitation, and/or amidation.
  • Modification can also include the cleavage of a signal peptide, or methionine, or other portions of the peptide that require cleavage to generate the mature peptide. Modifications can also include, for example, complexing a protein or peptide with another compound. Such modifications can be considered to be mutations, for example, if the modification is different than the post-translational modification that occurs in the natural, wild-type protein or peptide.
  • the terms "homologue" and "variant" are used to refer to a protein or peptide which differs from a naturally occurring protein or peptide (i.e., the "prototype" or "wild-type" protein) by minor modifications to the naturally occurring protein or peptide, but which maintains the basic protein and side chain structure of the naturally occurring form.
  • Such changes include, but are not limited to: changes in one or a few amino acid side chains; changes in one or a few amino acids, including deletions (e.g., a truncated version of the protein or peptide), insertions and/or substitutions; changes in stereochemistry of one or a few atoms; and/or minor derivatizations, including but not limited to: methylation, glycosylation, phosphorylation, acetylation, myristoylation, prenylation, palmitation, amidation and/or addition of glycosylphosphatidyl inositol.
  • a homologue or variant can have either enhanced, decreased, or substantially similar properties as compared to the naturally occurring protein or peptide.
  • a homologue or variant can include an agonist of a protein or an antagonist of a protein.
  • Homologues or variants can be the result of natural allelic variation or natural mutation.
  • a naturally occurring allelic variant of a nucleic acid encoding a protein is a gene that occurs at essentially the same locus (or loci) in the genome as the gene which encodes such protein, but which, due to natural variations caused by, for example, mutation or recombination, has a similar but not identical sequence.
  • Homologues can also be the result of a gene duplication and rearrangement, resulting in a different location.
  • Allelic variants typically encode proteins having similar activity to that of the protein encoded by the gene to which they are being compared.
  • allelic variants can encode the same protein but have different nucleic acid sequences due to the degeneracy of the genetic code. Allelic variants can also comprise alterations in the 5' or 3' untranslated regions of the gene (e.g., in regulatory control regions). Allelic variants are well known to those skilled in the art.
  • Homologues or variants can be produced using techniques known in the art for the production of proteins including, but not limited to, direct modifications to the isolated, naturally occurring protein, direct protein synthesis, or modifications to the nucleic acid sequence encoding the protein using, for example, classic or recombinant DNA techniques to effect random or targeted mutagenesis.
  • Modifications of a protein, such as in a homologue or variant may result in proteins having the same biological activity as the naturally occurring protein, or in proteins having decreased or increased biological activity as compared to the naturally occurring protein.
  • Modifications which result in a decrease in protein expression or a decrease in the activity of the protein can be referred to as inactivation (complete or partial), down-regulation, or decreased action of a protein.
  • modifications which result in an increase in protein expression or an increase in the activity of the protein can be referred to as amplification, overproduction, activation, enhancement, up-regulation or increased action of a protein.
  • an isolated protein including a biologically active homologue, variant, or fragment thereof, has at least one characteristic of biological activity of a wild-type, or naturally occurring, protein.
  • the biological activity or biological action of a protein refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vivo (i.e., in the natural physiological environment of the protein) or in vitro (i.e., under laboratory conditions).
  • the biological activity of a protein of the present invention can include an enzyme activity (catalytic activity and/or substrate binding activity), such as cellulase activity, hemicellulase activity, β-glucanase activity, β-glucosidase activity, α-galactosidase activity, β-galactosidase activity, xylanase activity or any other activity disclosed herein.
  • Such assays include, but are not limited to, measurement of enzyme activity (e.g., catalytic activity), measurement of substrate binding, and the like. It is noted that an isolated protein of the present invention (including homologues or variants) is not required to have a biological activity such as catalytic activity.
  • a protein can be a truncated, mutated or inactive protein, or lack at least one activity of the wild-type enzyme, for example. Inactive proteins may be useful in some screening assays, for example, or for other purposes such as antibody production.
  • Methods to measure protein expression levels of a protein according to the invention include, but are not limited to: SDS-PAGE analysis, protein concentration assays (Lowry, Bradford, BCA), western blotting, immunocytochemistry, flow cytometry or other immunologic-based assays; assays based on a property of the protein including but not limited to, ligand binding or interaction with other protein partners. Binding assays are also well known in the art. For example, a BIAcore machine can be used to determine the binding constant of a complex between two proteins. The dissociation constant for the complex can be determined by monitoring changes in the refractive index with respect to time as buffer is passed over the chip (O'Shannessy et al.
  • suitable assays for measuring the binding of one protein to another include, for example, immunoassays such as enzyme linked immunoabsorbent assays (ELISA) and radioimmunoassays (RIA), or determination of binding by monitoring the change in the spectroscopic or optical properties of the proteins through fluorescence, UV absorption, circular dichroism, or nuclear magnetic resonance (NMR).
  • enzymes and proteins of the present invention may be desirable targets for modification and use in the processes described herein. These proteins have been described in terms of function and amino acid sequence (and nucleic acid sequence encoding the same) of representative wild-type proteins.
  • homologues or variants of a given protein (which can include related proteins from other organisms or modified forms of the given protein) are encompassed for use in the invention.
  • Homologues or variants of a protein encompassed by the present invention can comprise, consist essentially of, or consist of, in one embodiment, an amino acid sequence that is at least about 35% identical, and more preferably at least about 40% identical, and more preferably at least about 45% identical, and more preferably at least about 50% identical, and more preferably at least about 55% identical, and more preferably at least about 60% identical, and more preferably at least about 65% identical, and more preferably at least about 70% identical, and more preferably at least about 75% identical, and more preferably at least about 80% identical, and more preferably at least about 85% identical, and more preferably at least about 90% identical, and more preferably at least about 95% identical, and more preferably at least about 96% identical, and more preferably at least about 97% identical, and more preferably at least about 98% identical, and more preferably at least about 99% identical, or any percent identity between 35% and 99%, in whole integers (i.e., 36%, 37%, etc.), to an amino acid sequence disclosed
  • the amino acid sequence of the homologue or variant has a biological activity of the wild-type or reference protein or of a biologically active domain thereof (e.g., a catalytic domain).
  • the amino acid position of the wild-type is typically used.
  • the wild-type can also be referred to as the "parent.” Additionally, any generation before the variant at issue can be a parent.
  • a protein of the present invention comprises, consists essentially of, or consists of an amino acid sequence that, alone or in combination with other characteristics of such proteins disclosed herein, is less than 100% identical to an amino acid sequence selected from Tables 1 and 2 (i.e., a homologue or variant).
  • a protein of the present invention can be less than 100% identical, in combination with being at least about 35% identical, to a given disclosed sequence.
  • a homologue or variant according to the present invention has an amino acid sequence that is less than about 99% identical to any of such amino acid sequences, and in another embodiment, is less than about 98% identical to any of such amino acid sequences, and in another embodiment, is less than about 97% identical to any of such amino acid sequences, and in another embodiment, is less than about 96% identical to any of such amino acid sequences, and in another embodiment, is less than about 95% identical to any of such amino acid sequences, and in another embodiment, is less than about 94% identical to any of such amino acid sequences, and in another embodiment, is less than about 93% identical to any of such amino acid sequences, and in another embodiment, is less than about 92% identical to any of such amino acid sequences, and in another embodiment, is less than about 91% identical to any of such amino acid sequences, and in another embodiment, is less than about 90% identical to any of such amino acid sequences, and so on, in increments of whole integers.
  • reference to a percent (%) identity refers to an evaluation of homology which is performed using: (1) a BLAST 2.0 Basic BLAST homology search using blastp for amino acid searches and blastn for nucleic acid searches with standard default parameters, wherein the query sequence is filtered for low complexity regions by default (described in Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J.
  • PSI-BLAST provides an automated, easy-to-use version of a "profile" search, which is a sensitive way to look for sequence homologues or variants.
  • the program first performs a gapped BLAST database search.
  • the PSI-BLAST program uses the information from any significant alignments returned to construct a position-specific score matrix, which replaces the query sequence for the next round of database searching. Therefore, it is to be understood that percent identity can be determined by using any one of these programs.
  • BLAST 2 sequence alignment is performed in blastp or blastn using the BLAST 2.0 algorithm to perform a Gapped BLAST search (BLAST 2.0) between the two sequences allowing for the introduction of gaps (deletions and insertions) in the resulting alignment.
  • a BLAST 2 sequence alignment is performed using the standard default parameters as follows.
  • gap x_dropoff (50), expect (10), word size (3), filter (on).
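To make the notion of percent identity concrete, the sketch below counts identical columns in an already-gapped pairwise alignment and divides by the alignment length, which is how BLAST-style reports express "Identities". It does not run BLAST or reproduce its scoring; the function names and the example alignment are hypothetical. The second helper checks the range described in the text (at least about 35% and less than 100% identical).

```python
def percent_identity(aligned_query: str, aligned_subject: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != gap  # a gap column never counts as an identity
    )
    return 100.0 * matches / len(aligned_query)


def is_variant_by_identity(identity: float, lower: float = 35.0) -> bool:
    """True when identity is at least `lower` percent but below 100 percent."""
    return lower <= identity < 100.0


# Hypothetical 8-column alignment with one gap and one substitution.
identity = percent_identity("MKT-AYLV", "MKTQAYIV")
print(identity, is_variant_by_identity(identity))
```

In practice the aligned rows would come from the blastp or blastn output rather than being typed by hand.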
  • a protein of the present invention can also include proteins having an amino acid sequence comprising at least 10 contiguous amino acid residues of any of the sequences described herein (i.e., 10 contiguous amino acid residues having 100% identity with 10 contiguous amino acids of the amino acid sequences of Tables 1 and 2).
  • a homologue or variant of a protein amino acid sequence includes amino acid sequences comprising at least 20, or at least 30, or at least 40, or at least 50, or at least 75, or at least 100, or at least 125, or at least 150, or at least 175, or at least 200, or at least 250, or at least 300, or at least 350 contiguous amino acid residues of any of the amino acid sequences disclosed herein.
  • fragments of proteins without biological activity are useful in the present invention, for example, in the preparation of antibodies against the full-length protein or in a screening assay (e.g., a binding assay). Fragments can also be used to construct fusion proteins, for example, where the fusion protein comprises functional domains from two or more different proteins (e.g., a CBM from one protein linked to a CD from another protein). In one embodiment, a homologue or variant has a measurable or detectable biological activity associated with the wild-type protein (e.g., enzymatic activity).
  • the term "contiguous” or “consecutive”, with regard to nucleic acid or amino acid sequences described herein, means to be connected in an unbroken sequence.
  • a first sequence to comprise 30 contiguous (or consecutive) amino acids of a second sequence means that the first sequence includes an unbroken sequence of 30 amino acid residues that is 100% identical to an unbroken sequence of 30 amino acid residues in the second sequence.
  • a first sequence to have "100% identity" with a second sequence means that the first sequence exactly matches the second sequence with no gaps between nucleotides or amino acids.
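The definition of "contiguous" above amounts to an exact, ungapped substring match of a given length. A minimal sketch, with a hypothetical function name and made-up example sequences, is:

```python
def shares_contiguous_run(first: str, second: str, n: int) -> bool:
    """True if `first` contains an unbroken run of n residues that is
    100% identical to an unbroken run of n residues in `second`."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Collect every length-n window of the second sequence once,
    # then look for any length-n window of the first sequence among them.
    windows = {second[i:i + n] for i in range(len(second) - n + 1)}
    return any(first[i:i + n] in windows for i in range(len(first) - n + 1))


# Made-up sequences sharing the 7-residue run "MKTAYLV".
a = "AAAMKTAYLVGGG"
b = "QQMKTAYLVQQ"
print(shares_contiguous_run(a, b, 7), shares_contiguous_run(a, b, 8))
```

If either sequence is shorter than n, no window exists and the function returns False.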
  • a protein of the present invention includes a protein having an amino acid sequence that is sufficiently similar to a natural amino acid sequence that a nucleic acid sequence encoding the homologue or variant is capable of hybridizing under moderate, high or very high stringency conditions (described below) to (i.e., with) a nucleic acid molecule encoding the natural protein (i.e., to the complement of the nucleic acid strand encoding the natural amino acid sequence).
  • a homologue or variant of a protein of the present invention is encoded by a nucleic acid molecule comprising a nucleic acid sequence that hybridizes under low, moderate, or high stringency conditions to the complement of a nucleic acid sequence that encodes a protein comprising, consisting essentially of, or consisting of, an amino acid sequence represented by any of Tables 1 and 2.
  • hybridization conditions are described in detail below.
  • a nucleic acid sequence complement of a nucleic acid sequence encoding a protein of the present invention refers to the nucleic acid sequence of the nucleic acid strand that is complementary to the strand which encodes the protein. It will be appreciated that a double-stranded DNA which encodes a given amino acid sequence comprises a single-strand DNA and its complementary strand having a sequence that is a complement to the single-strand DNA.
  • nucleic acid molecules of the present invention can be either double-stranded or single-stranded, and include those nucleic acid molecules that form stable hybrids under stringent hybridization conditions with a nucleic acid sequence that encodes an amino acid sequence such as the amino acid sequences of Tables 1 and 2, and/or with the complement of the nucleic acid sequence that encodes an amino acid sequence such as the amino acid sequences of Tables 1 and 2.
  • Methods to deduce a complementary sequence are known to those skilled in the art. It should be noted that since nucleic acid sequencing technologies are not entirely error-free, the sequences presented herein, at best, represent apparent sequences of the proteins of the present invention.
  • hybridization conditions refers to standard hybridization conditions under which nucleic acid molecules are used to identify similar nucleic acid molecules. Such standard conditions are disclosed, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press, 1989. Sambrook et al., ibid., (see specifically, pages 9.31-9.62). In addition, formulae to calculate the appropriate hybridization and wash conditions to achieve hybridization permitting varying degrees of mismatch of nucleotides are disclosed, for example, in Meinkoth et al., 1984, Anal. Biochem. 138, 267-284; Meinkoth et al., ibid.
  • moderate stringency hybridization and washing conditions refer to conditions which permit isolation of nucleic acid molecules having at least about 70% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 30% or less mismatch of nucleotides).
  • High stringency hybridization and washing conditions refer to conditions which permit isolation of nucleic acid molecules having at least about 80% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 20% or less mismatch of nucleotides).
  • Very high stringency hybridization and washing conditions refer to conditions which permit isolation of nucleic acid molecules having at least about 90% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 10% or less mismatch of nucleotides).
  • one of skill in the art can use the formulae in Meinkoth et al., ibid., to calculate the appropriate hybridization and wash conditions to achieve these particular levels of nucleotide mismatch. Such conditions will vary, depending on whether DNA:RNA or DNA:DNA hybrids are being formed. Calculated melting temperatures for DNA:DNA hybrids are 10°C less than for DNA:RNA hybrids.
  • stringent hybridization conditions for DNA:DNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 20°C and about 35°C (lower stringency), more preferably, between about 28°C and about 40°C (more stringent), and even more preferably, between about 35°C and about 45°C (even more stringent), with appropriate wash conditions.
  • stringent hybridization conditions for DNA:RNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 30°C and about 45°C, more preferably, between about 38°C and about 50°C, and even more preferably, between about 45°C and about 55°C, with similarly stringent wash conditions.
  • Tm can be calculated empirically as set forth in Sambrook et al., supra, pages 9.31 to 9.62.
  • wash conditions should be as stringent as possible, and should be appropriate for the chosen hybridization conditions.
  • hybridization conditions can include a combination of salt and temperature conditions that are approximately 20-25°C below the calculated Tm of a particular hybrid, and wash conditions typically include a combination of salt and temperature conditions that are approximately 12-20°C below the calculated Tm of the particular hybrid.
  • hybridization conditions suitable for use with DNA:DNA hybrids include a 2-24 hour hybridization in 6X SSC (50% formamide) at about 42°C, followed by washing steps that include one or more washes at room temperature in about 2X SSC, followed by additional washes at higher temperatures and lower ionic strength (e.g., at least one wash at about 37°C in about 0.1X-0.5X SSC, followed by at least one wash at about 68°C in about 0.1X-0.5X SSC).
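The text refers to formulae for the melting temperature without stating them. The sketch below assumes the commonly cited form of the Meinkoth and Wahl equation for DNA:DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(%formamide) - 500/L, and then applies the offsets given in the text: hybridization roughly 20-25°C below Tm, washing roughly 12-20°C below Tm, and DNA:RNA hybrids about 10°C above DNA:DNA hybrids. Function names and the example probe parameters are hypothetical.

```python
import math
from typing import Tuple


def tm_dna_dna(na_molar: float, gc_percent: float, length_bp: int,
               formamide_percent: float = 0.0) -> float:
    """Estimated Tm (deg C) of a DNA:DNA hybrid (assumed Meinkoth-Wahl form)."""
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("na_molar and length_bp must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def tm_dna_rna(na_molar: float, gc_percent: float, length_bp: int,
               formamide_percent: float = 0.0) -> float:
    """DNA:RNA estimate, using the 10 deg C offset stated in the text."""
    return tm_dna_dna(na_molar, gc_percent, length_bp, formamide_percent) + 10.0


def hybridization_window(tm: float) -> Tuple[float, float]:
    """Temperature range about 20-25 deg C below Tm."""
    return tm - 25.0, tm - 20.0


def wash_window(tm: float) -> Tuple[float, float]:
    """Temperature range about 12-20 deg C below Tm."""
    return tm - 20.0, tm - 12.0


# Hypothetical probe: 500 bp, 50% GC, 6X SSC (0.9 M Na+), no formamide.
tm = tm_dna_dna(na_molar=0.9, gc_percent=50.0, length_bp=500)
print(round(tm, 2), hybridization_window(tm), wash_window(tm))
```

Adding formamide lowers the estimate (0.61°C per percent in this form), which is consistent with the text pairing 50% formamide with hybridization at about 42°C.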
  • the minimum size of a protein and/or homologue or variant of the present invention is a size sufficient to have biological activity or, when the protein is not required to have such activity, sufficient to be useful for another purpose associated with a protein of the present invention, such as for the production of antibodies that bind to a naturally occurring protein.
  • the protein of the present invention is at least 20 amino acids in length, or at least about 25 amino acids in length, or at least about 30 amino acids in length, or at least about 40 amino acids in length, or at least about 50 amino acids in length, or at least about 60 amino acids in length, or at least about 70 amino acids in length, or at least about 80 amino acids in length, or at least about 90 amino acids in length, or at least about 100 amino acids in length, or at least about 125 amino acids in length, or at least about 150 amino acids in length, or at least about 175 amino acids in length, or at least about 200 amino acids in length, or at least about 250 amino acids in length, and so on up to a full length of each protein, and including any size in between in increments of one whole integer (one amino acid).
  • the protein can include a portion of a protein or a full-length protein, plus additional sequence (e.g., a fusion protein sequence), if desired.
  • the present invention also includes a fusion protein that includes a domain of a protein of the present invention (including a homologue or variant) attached to one or more fusion segments, which are typically heterologous in sequence to the protein sequence (i.e., different than the protein sequence).
  • Suitable fusion segments for use with the present invention include, but are not limited to, segments that can: enhance a protein's stability; provide other desirable biological activity; and/or assist with the purification of the protein (e.g., by affinity chromatography).
  • a suitable fusion segment can be a domain of any size that has the desired function (e.g., imparts increased stability, solubility, action or biological activity; and/or simplifies purification of a protein).
  • Fusion segments can be joined to amino and/or carboxyl termini of the domain of a protein of the present invention and can be susceptible to cleavage in order to enable straight-forward recovery of the protein.
  • Fusion proteins are preferably produced by culturing a recombinant cell transfected with a fusion nucleic acid molecule that encodes a protein including the fusion segment attached to either the carboxyl and/or amino terminal end of a domain of a protein of the present invention.
  • proteins of the present invention also include expression products of gene fusions (for example, used to overexpress soluble, active forms of the recombinant protein), of mutagenized genes (such as genes having codon modifications to enhance gene transcription and translation), and of truncated genes (such as genes having membrane binding modules removed to generate soluble forms of a membrane protein, or genes having signal sequences removed which are poorly tolerated in a particular recombinant host).
  • any of the amino acid sequences described herein can be produced with from at least one, and up to about 20, additional heterologous amino acids flanking each of the C- and/or N-terminal ends of the specified amino acid sequence.
  • the resulting protein or polypeptide can be referred to as "consisting essentially of" the specified amino acid sequence.
  • the heterologous amino acids are a sequence of amino acids that are not naturally found (i.e., not found in nature, in vivo) flanking the specified amino acid sequence, or that are not related to the function of the specified amino acid sequence, or that would not be encoded by the nucleotides that flank the naturally occurring nucleic acid sequence encoding the specified amino acid sequence as it occurs in the gene, if such nucleotides in the naturally occurring sequence were translated using standard codon usage for the organism from which the given amino acid sequence is derived.
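The flanking rule described above (at least one and up to about 20 additional residues at each terminus) can be illustrated with a minimal Python sketch. The function names and the example sequences are hypothetical, and the code only measures flank lengths; whether the flanking residues are actually heterologous cannot be decided from the sequence alone.

```python
# Illustrative sketch only: names and sequences are hypothetical, not from the source.
MAX_FLANK = 20  # "up to about 20" additional residues per terminus, per the text


def flanks(construct: str, core: str):
    """Return (N-terminal flank, C-terminal flank) around the first match of core, or None."""
    start = construct.find(core)
    if start < 0:
        return None
    return construct[:start], construct[start + len(core):]


def has_permitted_flanks(construct: str, core: str, max_flank: int = MAX_FLANK) -> bool:
    """True if the core is present with 1..max_flank extra residues on at least one end
    and no more than max_flank extra residues on either end."""
    result = flanks(construct, core)
    if result is None:
        return False
    n_flank, c_flank = result
    if len(n_flank) > max_flank or len(c_flank) > max_flank:
        return False
    return len(n_flank) + len(c_flank) >= 1


core = "MKTAYIAKQR"              # hypothetical specified sequence
tagged = "HHHHHH" + core         # six extra N-terminal residues
print(has_permitted_flanks(tagged, core))
```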
  • the present invention also provides enzyme combinations that break down lignocellulose material.
  • Such enzyme combinations or mixtures can include a multi-enzyme composition that contains at least one protein of the present invention in combination with one or more additional proteins of the present invention or one or more enzymes or other proteins from other microorganisms, plants, or similar organisms.
  • Synergistic enzyme combinations and related methods are contemplated.
  • the invention includes methods to identify the optimum ratios and compositions of enzymes with which to degrade each lignocellulosic material. These methods entail tests to identify the optimum enzyme composition and ratios for efficient conversion of any lignocellulosic substrate to its constituent sugars.
  • the Examples below include assays that may be used to identify optimum ratios and compositions of enzymes with which to degrade lignocellulosic materials.
  • Any combination of the proteins disclosed herein is suitable for use in the multi-enzyme compositions of the present invention. Due to the complex nature of most biomass sources, which can contain cellulose, hemicellulose, pectin, lignin, protein, and ash, among other components, preferred enzyme combinations may contain enzymes with a range of substrate specificities that work together to degrade biomass into fermentable sugars in the most efficient manner.
  • One example of a multi-enzyme complex for lignocellulose saccharification is a mixture of cellobiohydrolase(s), xylanase(s), endoglucanase(s), β-glucosidase(s), β-xylosidase(s), and accessory enzymes.
  • any of the enzymes described specifically herein can be combined with any one or more of the enzymes described herein or with any other available and suitable enzymes, to produce a multi-enzyme composition.
  • the invention is not restricted or limited to the specific exemplary combinations listed below.
  • the cellobiohydrolase(s) comprise between about 30% and about 90% or between about 40% and about 70% of the enzymes in the composition, and more preferably, between about 55% and 65%, and more preferably, about 60% of the enzymes in the composition (including any percentage between 40% and 70% in 0.5% increments (e.g., 40%, 40.5%, 41%, etc.)).
  • the xylanase(s) comprise between about 10% and about 30% of the enzymes in the composition, and more preferably, between about 15% and about 25%, and more preferably, about 20% of the enzymes in the composition (including any percentage between 10% and 30% in 0.5% increments).
  • the endoglucanase(s) comprise between about 5% and about 15% of the enzymes in the composition, and more preferably, between about 7% and about 13%, and more preferably, about 10% of the enzymes in the composition (including any percentage between 5% and 15% in 0.5% increments).
  • the β-glucosidase(s) comprise between about 1% and about 15% of the enzymes in the composition, and preferably between about 2% and 10%, and more preferably, about 3% of the enzymes in the composition (including any percentage between 1% and 15% in 0.5% increments).
  • the β-xylosidase(s) comprise between about 1% and about 3% of the enzymes in the composition, and preferably, between about 1.5% and about 2.5%, and more preferably, about 2% of the enzymes in the composition (including any percentage between 1% and 3% in 0.5% increments).
  • the accessory enzymes comprise between about 2% and about 8% of the enzymes in the composition, and preferably, between about 3% and about 7%, and more preferably, about 5% of the enzymes in the composition (including any percentage between 2% and 8% in 0.5% increments).
  • One particularly preferred example of a multi-enzyme complex for lignocellulose saccharification is a mixture of about 60% cellobiohydrolase(s), about 20% xylanase(s), about 10% endoglucanase(s), about 3% β-glucosidase(s), about 2% β-xylosidase(s) and about 5% accessory enzyme(s).
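As a rough illustration of how the percentage ranges above fit together, the following Python sketch checks a candidate mixture against the broadest stated range for each enzyme class and confirms that the shares sum to 100%. The dictionary keys and the function name are assumptions made for this example; the numeric ranges and the preferred mixture are transcribed from the text.

```python
# Illustrative check only; keys and function name are hypothetical.
# Broadest stated range (percent of enzymes in the composition) per enzyme class.
RANGES = {
    "cellobiohydrolase": (30.0, 90.0),
    "xylanase": (10.0, 30.0),
    "endoglucanase": (5.0, 15.0),
    "beta-glucosidase": (1.0, 15.0),
    "beta-xylosidase": (1.0, 3.0),
    "accessory": (2.0, 8.0),
}

# The particularly preferred mixture stated in the text.
PREFERRED = {
    "cellobiohydrolase": 60.0,
    "xylanase": 20.0,
    "endoglucanase": 10.0,
    "beta-glucosidase": 3.0,
    "beta-xylosidase": 2.0,
    "accessory": 5.0,
}


def check_composition(mix, ranges=RANGES, tol=1e-9):
    """Return a list of human-readable problems; an empty list means the mix passes."""
    problems = []
    total = sum(mix.values())
    if abs(total - 100.0) > tol:
        problems.append(f"shares sum to {total}%, not 100%")
    for name, (low, high) in ranges.items():
        share = mix.get(name, 0.0)
        if not low <= share <= high:
            problems.append(f"{name} at {share}% is outside {low}-{high}%")
    return problems


print(check_composition(PREFERRED))
```

Because the stated ranges use the qualifier "about", a real formulation tool would probably allow some tolerance at the boundaries; the strict comparison here is a simplification.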
  • Enzymes and multi-enzyme compositions of the present invention may also be used to break down arabinoxylan or arabinoxylan-containing substrates.
  • Arabinoxylan is a polysaccharide composed of xylose and arabinose, wherein α-L-arabinofuranose residues are attached as branch-points to a β-(1,4)-linked xylose polymeric backbone.
  • the xylose residues may be mono-substituted at the C2 or C3 position, or di-substituted at both positions.
  • Ferulic acid or coumaric acid may also be ester-linked to the C5 position of arabinosyl residues. Further details on the hydrolysis of arabinoxylan can be found in International Publication No. WO
  • the substitutions on the xylan backbone can inhibit the enzymatic activity of xylanases, and the complete hydrolysis of arabinoxylan typically requires the action of several different enzymes.
  • a multi-enzyme complex for arabinoxylan hydrolysis is a mixture of endoxylanase(s), β-xylosidase(s), and arabinofuranosidase(s), including those with specificity towards single and double substituted xylose residues.
  • the multi-enzyme complex may further comprise one or more carbohydrate esterases, such as acetyl xylan esterases, ferulic acid esterases, coumaric acid esterases or pectin methyl esterases. Any combination of two or more of the above-mentioned enzymes is suitable for use in the multi-enzyme complexes. However, it is to be understood that the invention is not restricted or limited to the specific exemplary combinations listed herein.
  • the endoxylanase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, or at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g., 5.0%, 5.5%, 6.0%, etc.)).
  • Endoxylanase(s), either alone or as part of a multi-enzyme complex, may be used in amounts of 0.001 to 2.0 g/kg, 0.005 to 1.0 g/kg, or 0.05 to 0.2 g/kg of substrate.
  • the β-xylosidase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, or at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g., 5.0%, 5.5%, 6.0%, etc.)).
  • β-xylosidase(s), either alone or as part of a multi-enzyme complex, may be used in amounts of 0.001 to 2.0 g/kg, 0.005 to 1.0 g/kg, or 0.05 to 0.2 g/kg of substrate.
  • the arabinofuranosidase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, or at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g., 5.0%, 5.5%, 6.0%, etc.)).
  • the total percentage of arabinofuranosidase(s) present in the composition may include arabinofuranosidase(s) with specificity towards single substituted xylose residues, arabinofuranosidase(s) with specificity towards double substituted xylose residues, or any combination thereof.
  • Arabinofuranosidase(s), either alone or as part of a multi-enzyme complex, may be used in amounts of 0.001 to 2.0 g/kg, 0.005 to 1.0 g/kg, or 0.05 to 0.2 g/kg of substrate.
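The dosing ranges above are expressed in grams of enzyme per kilogram of substrate, so the enzyme mass for a given batch follows by simple multiplication. The sketch below assumes a hypothetical 100 kg batch; the function name and batch size are illustrative, while the three dose ranges are those stated in the text.

```python
# Illustrative arithmetic only; function name and batch size are hypothetical.
# Dose ranges in g enzyme per kg substrate, from broadest to narrowest as stated in the text.
DOSE_RANGES_G_PER_KG = [(0.001, 2.0), (0.005, 1.0), (0.05, 0.2)]


def enzyme_mass_range_g(substrate_kg: float, dose_range: tuple) -> tuple:
    """Scale a (low, high) g/kg dose range to grams of enzyme for a batch of substrate."""
    low, high = dose_range
    return (low * substrate_kg, high * substrate_kg)


batch_kg = 100.0  # hypothetical batch of substrate
for dose in DOSE_RANGES_G_PER_KG:
    low_g, high_g = enzyme_mass_range_g(batch_kg, dose)
    print(f"{dose[0]}-{dose[1]} g/kg -> {low_g:.3f} to {high_g:.3f} g enzyme")
```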
  • One or more components of a multi-enzyme composition can be obtained from or derived from a microbial, plant, or other source or combination thereof, and will contain enzymes capable of degrading lignocellulosic material.
  • enzymes included in the multi-enzyme compositions of the invention include cellulases, hemicellulases (such as xylanases, including exoxylanases and β-xylosidases; mannanases, including endomannanases, exomannanases, and β-mannosidases), glucuronidases, and esterases (including ferulic acid esterase and glucuronyl esterases), lipases, glucosidases (such as β-glucosidase).
  • Although the multi-enzyme composition may contain many types of enzymes, mixtures comprising enzymes that increase or enhance sugar release from biomass are preferred, including hemicellulases.
  • the hemicellulase is selected from a xylanase, an arabinofuranosidase, an acetyl xylan esterase, a glucuronidase, an endo-galactanase, a mannanase, an endo-arabinase, an exo-arabinase, an exo-galactanase, a ferulic acid esterase, a galactomannanase, a xyloglucanase, or mixtures of any of these.
  • the enzymes can include glucoamylase, β-xylosidase and/or β-glucosidase.
  • mixtures comprising enzymes that are capable of degrading cell walls and releasing
  • the enzymes of the multi-enzyme composition can be provided by a variety of sources.
  • the enzymes can be produced by growing organisms such as bacteria, algae, fungi, and plants which produce the enzymes naturally or by virtue of being genetically modified to express the enzyme or enzymes.
  • at least one enzyme of the multi-enzyme composition is a commercially available enzyme.
  • the multi-enzyme compositions comprise an accessory enzyme.
  • An accessory enzyme is any additional enzyme capable of hydrolyzing lignocellulose or enhancing or promoting the hydrolysis of lignocellulose, wherein the accessory enzyme is typically provided in addition to a core enzyme or core set of enzymes.
  • An accessory enzyme can have the same or similar function or a different function as an enzyme or enzymes in the core set of enzymes.
  • enzymes have been described elsewhere herein, and can generally include cellulases, xylanases, ligninases, amylases, or glucuronidases and esterases, such as ferulic acid esterases, glucuronyl esterases and rhamnogalacturonyl esterases, for example.
  • Accessory enzymes can include enzymes that, when contacted with biomass in a reaction, allow for an increase in the activity of enzymes (e.g., hemicellulases) in the multi-enzyme composition.
  • An accessory enzyme or enzyme mix may be composed of enzymes from (1) commercial suppliers; (2) cloned genes expressing enzymes; (3) complex broth (such as that resulting from growth of a microbial strain in media, wherein the strains secrete proteins and enzymes into the media); (4) cell lysates of strains grown as in (3); and, (5) plant material expressing enzymes capable of degrading lignocellulose.
  • the accessory enzyme is a glucoamylase, a pectinase, or a ligninase.
  • a ligninase is an enzyme that can hydrolyze or break down the structure of lignin polymers, including lignin peroxidases, manganese peroxidases, laccases, and other enzymes described in the art known to depolymerize or otherwise break lignin polymers. Also included are enzymes capable of hydrolyzing bonds formed between hemicellulosic sugars (notably arabinose) and lignin.
  • the multi-enzyme compositions comprise a biomass comprising microorganisms or a crude fermentation product of microorganisms.
  • a crude fermentation product refers to the fermentation broth which has been separated from the microorganism biomass (by filtration, for example).
  • the microorganisms are grown in fermentors, optionally centrifuged or filtered to remove biomass, and optionally concentrated, formulated, and dried to produce an enzyme(s) or a multi-enzyme composition that is a crude fermentation product.
  • enzyme(s) or multi-enzyme compositions produced by the microorganism are subjected to one or more purification steps, such as ammonium sulfate precipitation, chromatography, and/or ultrafiltration, which result in a partially purified or purified enzyme(s).
  • the microorganism has been genetically modified to express the enzyme(s)
  • the enzyme(s) will include recombinant enzymes.
  • If the genetically modified microorganism also naturally expresses the enzyme(s) or other enzymes useful for lignocellulosic saccharification, the enzyme(s) may include both naturally occurring and recombinant enzymes.
  • compositions comprising at least about 500 ng, and preferably at least about 1 μg, and more preferably at least about 5 μg, and more preferably at least about 10 μg, and more preferably at least about 25 μg, and more preferably at least about 50 μg, and more preferably at least about 75 μg, and more preferably at least about 100 μg, and more preferably at least about 250 μg, and more preferably at least about 500 μg, and more preferably at least about 750 μg, and more preferably at least about 1 mg, and more preferably at least about 5 mg, of an isolated protein comprising any of the proteins or homologues, variants, or fragments thereof discussed herein.
  • Such a composition of the present invention may include any carrier with which the protein is associated by virtue of the protein preparation method, a protein purification method, or a preparation of the protein for use in any method according to the present invention.
  • a carrier can include any suitable buffer, extract, or medium that is suitable for combining with the protein of the present invention so that the protein can be used in any method described herein according to the present invention.
  • an immobilized enzyme includes immobilized isolated enzymes, immobilized microbial cells which contain one or more enzymes of the invention, other stabilized intact cells that produce one or more enzymes of the invention, and stabilized cell/membrane homogenates.
  • Stabilized intact cells and stabilized cell/membrane homogenates include cells and homogenates from naturally occurring microorganisms expressing the enzymes of the invention and preferably, from genetically modified microorganisms as disclosed elsewhere herein.
  • Entrapment can also be used to immobilize an enzyme.
  • Entrapment of an enzyme involves formation of, inter alia, gels (using organic or biological polymers), vesicles (including microencapsulation), semipermeable membranes or other matrices.
  • Exemplary materials used for entrapment of an enzyme include collagen, gelatin, agar, cellulose triacetate, alginate, polyacrylamide, polystyrene, polyurethane, epoxy resins, carrageenan, and egg albumin.
  • Some of the polymers, in particular cellulose triacetate can be used to entrap the enzyme as they are spun into a fiber.
  • Other materials such as polyacrylamide gels can be polymerized in solution to entrap the enzyme.
  • Still other materials such as polyglycol oligomers that are functionalized with polymerizable vinyl end groups can entrap enzymes by forming a cross-linked polymer with UV light illumination in the presence of a photosensitizer.
  • nucleic acid molecules that encode a protein of the present invention, as well as homologues, variants, or fragments of such nucleic acid molecules.
  • a nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence encoding any of the isolated proteins disclosed herein, including a fragment or a homologue or variant of such proteins, described above.
  • Nucleic acid molecules can include a nucleic acid sequence that encodes a fragment of a protein that does not have biological activity, and can also include portions of a gene or polynucleotide encoding the protein that are not part of the coding region for the protein (e.g., introns or regulatory regions of a gene encoding the protein). Nucleic acid molecules can include a nucleic acid sequence that is useful as a probe or primer (oligonucleotide sequences).
  • a nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence represented in Tables 1 and 2 or fragments or homologues or variants thereof.
  • the nucleic acid sequence encodes a protein (including fragments and homologues or variants thereof) useful in the invention, or encompasses useful oligonucleotides or complementary nucleic acid sequences.
  • a nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence encoding an amino acid sequence represented in Tables 1 and 2 or fragments or homologues or variants thereof.
  • the nucleic acid sequence encodes a protein (including fragments and homologues or variants thereof) useful in the invention, or encompasses useful oligonucleotides or complementary nucleic acid sequences.
  • nucleic acid molecules include isolated nucleic acid molecules that hybridize under moderate stringency conditions, and more preferably under high stringency conditions, and even more preferably under very high stringency conditions, as described above, with the complement of a nucleic acid sequence encoding a protein of the present invention (i.e., including naturally occurring allelic variants encoding a protein of the present invention).
  • an isolated nucleic acid molecule encoding a protein of the present invention comprises a nucleic acid sequence that hybridizes under moderate, high, or very high stringency conditions to the complement of a nucleic acid sequence that encodes a protein comprising an amino acid sequence represented in Tables 1 and 2.
  • an isolated nucleic acid molecule is a nucleic acid molecule (polynucleotide) that has been removed from its natural milieu (i.e., that has been subject to human manipulation) and can include DNA, RNA, or derivatives of either DNA or RNA, including cDNA. As such, “isolated” does not reflect the extent to which the nucleic acid molecule has been purified.
  • nucleic acid molecule primarily refers to the physical nucleic acid molecule
  • nucleic acid sequence primarily refers to the sequence of nucleotides on the nucleic acid molecule
  • the two phrases can be used interchangeably, especially with respect to a nucleic acid molecule, or a nucleic acid sequence, being capable of encoding a protein.
  • An isolated nucleic acid molecule of the present invention can be isolated from its natural source or produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning, etc.) or chemical synthesis.
  • Isolated nucleic acid molecules can include, for example, genes, natural allelic variants of genes, coding regions or portions thereof, and coding and/or regulatory regions modified by nucleotide insertions, deletions, substitutions, and/or inversions in a manner such that the modifications do not substantially interfere with the nucleic acid molecule's ability to encode a protein of the present invention or to form stable hybrids under stringent conditions with natural gene isolates.
  • An isolated nucleic acid molecule can include degeneracies. As used herein, nucleotide degeneracy refers to the phenomenon that one amino acid can be encoded by different nucleotide codons.
  • nucleic acid sequence of a nucleic acid molecule that encodes a protein of the present invention can vary due to degeneracies. It is noted that a nucleic acid molecule of the present invention is not required to encode a protein having protein activity. A nucleic acid molecule can encode a truncated, mutated or inactive protein, for example. In addition, nucleic acid molecules of the invention are useful as probes and primers for the identification, isolation and/or purification of other nucleic acid molecules.
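Nucleotide degeneracy, as defined above, can be shown with a short Python sketch that translates two different coding sequences into the same peptide using the standard genetic code. The two example sequences are hypothetical and chosen only for illustration; they are not sequences from Tables 1 and 2.

```python
# Illustrative sketch; the example sequences are hypothetical.
# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


seq_a = "ATGGCTTCT"  # ATG GCT TCT
seq_b = "ATGGCAAGC"  # ATG GCA AGC: different codons for the second and third residues
print(translate(seq_a), translate(seq_b), seq_a != seq_b)
```

Both sequences encode the same three residues even though six of their nine nucleotides sit in codons that differ, which is why a nucleic acid sequence encoding a given protein can vary.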
  • the nucleic acid molecule is an oligonucleotide, such as a probe or primer
  • the oligonucleotide preferably ranges from about 5 to about 50 or about 500 nucleotides, more preferably from about 10 to about 40 nucleotides, and most preferably from about 15 to about 40 nucleotides in length.
  • a gene includes all nucleic acid sequences related to a natural (i.e. wild-type) gene, such as regulatory regions that control production of the protein encoded by that gene (such as, but not limited to, transcription, translation or post-translation control regions) as well as the coding region itself.
  • a gene can be a naturally occurring allelic variant that includes a similar but not identical sequence to the nucleic acid sequence encoding a given protein. Allelic variants have been previously described above. Genes can include or exclude one or more introns or any portions thereof or any other sequences which are not included in the cDNA for that protein.
  • the phrases "nucleic acid molecule" and “gene” can be used interchangeably when the nucleic acid molecule comprises a gene as described above.
  • an isolated nucleic acid molecule of the present invention is produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning, etc.) or chemical synthesis.
  • Isolated nucleic acid molecules include any nucleic acid molecules and homologues or variants thereof that are part of a gene described herein and/or that encode a protein described herein, including, but not limited to, natural allelic variants and modified nucleic acid molecules (homologues or variants) in which nucleotides have been inserted, deleted, substituted, and/or inverted in such a manner that such modifications provide the desired effect on protein biological activity or on the activity of the nucleic acid molecule.
  • Allelic variants and protein homologues or variants e.g., proteins encoded by nucleic acid homologues or variants
  • a nucleic acid molecule homologue or variant (i.e., encoding a homologue or variant of a protein of the present invention) can be produced using a number of methods known to those skilled in the art (see, for example, Sambrook et al.).
  • nucleic acid molecules can be modified using a variety of techniques including, but not limited to, by classic mutagenesis and recombinant DNA techniques (e.g., site-directed mutagenesis, chemical treatment, restriction enzyme cleavage, ligation of nucleic acid fragments and/or PCR amplification), or synthesis of oligonucleotide mixtures and ligation of mixture groups to "build" a mixture of nucleic acid molecules and combinations thereof.
  • nucleic acid molecule homologues or variants can be selected by hybridization with a gene or polynucleotide, or by screening for the function of a protein encoded by a nucleic acid molecule (i.e., biological activity).
  • the minimum size of a nucleic acid molecule of the present invention is a size sufficient to encode a protein (including a fragment, homologue, or variant of a full-length protein) having biological activity, sufficient to encode a protein comprising at least one epitope which binds to an antibody, or sufficient to form a probe or oligonucleotide primer that is capable of forming a stable hybrid with the complementary sequence of a nucleic acid molecule encoding a natural protein (e.g., under moderate, high, or very high stringency conditions).
  • the size of the nucleic acid molecule encoding such a protein can be dependent on nucleic acid composition and percent homology or identity between the nucleic acid molecule and complementary sequence as well as upon hybridization conditions per se (e.g., temperature, salt concentration, and formamide concentration).
  • the minimal size of a nucleic acid molecule that is used as an oligonucleotide primer or as a probe is typically at least about 12 to about 15 nucleotides in length if the nucleic acid molecules are GC-rich and at least about 15 to about 18 bases in length if they are AT-rich.
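One way to see why GC-rich oligonucleotides can be shorter than AT-rich ones is to estimate duplex stability from base composition. The sketch below uses the Wallace rule, Tm ≈ 2 °C × (A + T) + 4 °C × (G + C), a rough rule of thumb for short oligonucleotides that is not taken from this document. The two sequences are hypothetical and were chosen so that a GC-rich 12-mer and an AT-rich 18-mer give the same estimate.

```python
# Illustrative sketch; the Wallace rule is a swapped-in heuristic, not the patent's method.
# The example oligonucleotides are hypothetical.


def wallace_tm(oligo: str) -> int:
    """Estimate melting temperature in degrees C as 2*(A+T) + 4*(G+C)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    if at + gc != len(oligo):
        raise ValueError("sequence must contain only A, C, G, T")
    return 2 * at + 4 * gc


def gc_fraction(oligo: str) -> float:
    """Fraction of G and C bases in the oligonucleotide."""
    oligo = oligo.upper()
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


gc_rich_12mer = "GCCGCAGCGTCC"        # 10 of 12 bases are G or C
at_rich_18mer = "ATTAGATTCAATCGATAA"  # 4 of 18 bases are G or C

for name, seq in [("GC-rich 12-mer", gc_rich_12mer), ("AT-rich 18-mer", at_rich_18mer)]:
    print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```

The estimate ignores salt and formamide concentration, which the text identifies as hybridization conditions that also affect the required size, so it should be read as a qualitative illustration only.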
  • nucleic acid molecule of the present invention can include a portion of a protein encoding sequence, a nucleic acid sequence encoding a full-length protein (including a gene), including any length fragment between about 20 nucleotides and the number of nucleotides that make up the full length cDNA encoding a protein, in whole integers (e.g., 20, 21, 22, 23, 24, 25 nucleotides), or multiple genes, or portions thereof.
  • the heterologous nucleotides are not naturally found (i.e., not found in nature, in vivo) flanking the nucleic acid sequence encoding the specified amino acid sequence as it occurs in the natural gene or do not encode a protein that imparts any additional function to the protein or changes the function of the protein having the specified amino acid sequence.
  • the polynucleotide probes or primers of the invention are conjugated to detectable markers.
  • Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.
  • Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H,
  • the polynucleotide probes are immobilized on a substrate such as: artificial membranes, organic supports, biopolymer supports and inorganic supports.
  • a recombinant nucleic acid molecule which comprises the isolated nucleic acid molecule described above which is operatively linked to at least one expression control sequence. More particularly, according to the present invention, a recombinant nucleic acid molecule typically comprises a recombinant vector and any one or more of the isolated nucleic acid molecules as described herein. According to the present invention, a recombinant vector is an engineered (i.e., artificially produced) nucleic acid molecule that is used as a tool for manipulating a nucleic acid sequence of choice and/or for introducing such a nucleic acid sequence into a host cell.
  • the recombinant vector is therefore suitable for use in cloning, sequencing, and/or otherwise manipulating the nucleic acid sequence of choice, such as by expressing and/or delivering the nucleic acid sequence of choice into a host cell to form a recombinant cell.
  • a vector typically contains nucleic acid sequences that are not naturally found adjacent to nucleic acid sequence to be cloned or delivered, although the vector can also contain regulatory nucleic acid sequences (e.g., promoters, untranslated regions) which are naturally found adjacent to nucleic acid sequences of the present invention or which are useful for expression of the nucleic acid molecules of the present invention (discussed in detail below).
  • the vector can be either RNA or DNA, either prokaryotic or eukaryotic, and typically is a plasmid.
  • the vector can be maintained as an extrachromosomal element (e.g., a plasmid) or it can be integrated into the chromosome of a recombinant host cell, although it is preferred if the vector remains separate from the genome for most applications of the invention.
  • the entire vector can remain in place within a host cell, or under certain conditions, the plasmid DNA can be deleted, leaving behind the nucleic acid molecule of the present invention.
  • An integrated nucleic acid molecule can be under chromosomal promoter control, under native or plasmid promoter control, or under a combination of several promoter controls. Single or multiple copies of the nucleic acid molecule can be integrated into the chromosome.
  • a recombinant vector of the present invention can contain at least one selectable marker.
  • a recombinant vector used in a recombinant nucleic acid molecule of the present invention is an expression vector.
  • expression vector is used to refer to a vector that is suitable for production of an encoded product (e.g., a protein of interest, such as an enzyme of the present invention).
  • a nucleic acid sequence encoding the product to be produced (e.g., the protein or homologue or variant thereof) is inserted into the recombinant vector to produce a recombinant nucleic acid molecule.
  • the nucleic acid sequence encoding the protein to be produced is inserted into the vector in a manner that operatively links the nucleic acid sequence to regulatory sequences in the vector which enable the transcription and translation of the nucleic acid sequence within the recombinant host cell.
  • a recombinant nucleic acid molecule includes at least one nucleic acid molecule of the present invention operatively linked to one or more expression control sequences (e.g., transcription control sequences or translation control sequences).
  • the phrase "recombinant molecule” or “recombinant nucleic acid molecule” primarily refers to a nucleic acid molecule or nucleic acid sequence operatively linked to a transcription control sequence, but can be used interchangeably with the phrase “nucleic acid molecule", when such nucleic acid molecule is a recombinant molecule as discussed herein.
  • the phrase "operatively linked” refers to linking a nucleic acid molecule to an expression control sequence in a manner such that the molecule is able to be expressed when transfected (i.e., transformed, transduced, transfected, conjugated or conducted) into a host cell.
  • Transcription control sequences are sequences which control the initiation, elongation, or termination of transcription. Particularly important transcription control sequences are those which control transcription initiation, such as promoter, enhancer, operator and repressor sequences. Suitable transcription control sequences include any transcription control sequence that can function in a host cell or organism into which the recombinant nucleic acid molecule is to be introduced. Transcription control sequences may also include any combination of one or more of any of the foregoing.
  • Recombinant nucleic acid molecules of the present invention can also contain additional regulatory sequences, such as translation regulatory sequences, origins of replication, and other regulatory sequences that are compatible with the recombinant cell.
  • a recombinant molecule of the present invention including those which are integrated into the host cell chromosome, also contains secretory signals (i.e., signal segment nucleic acid sequences) to enable an expressed protein to be secreted from the cell that produces the protein.
  • Suitable signal segments include a signal segment that is naturally associated with the protein to be expressed or any heterologous signal segment capable of directing the secretion of the protein according to the present invention.
  • a recombinant molecule of the present invention comprises a leader sequence to enable an expressed protein to be delivered to and inserted into the membrane of a host cell.
  • Suitable leader sequences include a leader sequence that is naturally associated with the protein, or any heterologous leader sequence capable of directing the delivery and insertion of the protein to the membrane of a cell.
  • the term "transfection” is generally used to refer to any method by which an exogenous nucleic acid molecule (i.e., a recombinant nucleic acid molecule) can be inserted into a cell.
  • transformation can be used interchangeably with the term “transfection” when such term is used to refer to the introduction of nucleic acid molecules into microbial cells or plants and describes an inherited change due to the acquisition of exogenous nucleic acids by the microorganism that is essentially synonymous with the term “transfection.”
  • Transfection techniques include, but are not limited to, transformation, particle bombardment, electroporation, microinjection, lipofection, adsorption, infection and protoplast fusion.
  • One or more recombinant molecules of the present invention can be used to produce an encoded product (e.g., a protein) of the present invention.
  • an encoded product is produced by expressing a nucleic acid molecule as described herein under conditions effective to produce the protein.
  • a preferred method to produce an encoded protein is by transfecting a host cell with one or more recombinant molecules to form a recombinant cell. Suitable host cells to transfect include, but are not limited to, any bacterial, fungal (e.g., filamentous fungi or yeast or mushrooms), algal, plant, insect, or animal cell that can be transfected. Host cells can be either untransfected cells or cells that are already transfected with at least one other recombinant nucleic acid molecule.
  • Suitable cells may include any microorganism (e.g., a bacterium, a protist, an alga, a fungus, or other microbe), and are preferably bacteria, yeast or filamentous fungi.
  • Suitable bacterial genera include, but are not limited to, Escherichia, Bacillus, Lactobacillus, Pseudomonas and Streptomyces.
  • Suitable bacterial species include, but are not limited to, Escherichia coli, Bacillus subtilis, Bacillus licheniformis, Bacillus stearothermophilus, Lactobacillus brevis, Pseudomonas aeruginosa and Streptomyces lividans.
  • Suitable genera of yeast include, but are not limited to, Saccharomyces, Schizosaccharomyces, Candida, Hansenula, Pichia, Kluyveromyces, and Phaffia.
  • Suitable yeast species include, but are not limited to, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida albicans, Hansenula polymorpha, Pichia pastoris, P. canadensis, Kluyveromyces marxianus and Phaffia rhodozyma.
  • Suitable fungal genera include, but are not limited to, Chrysosporium, Thielavia, Talaromyces, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium, Humicola, and Trichoderma, and anamorphs and teleomorphs thereof.
  • Suitable fungal species include, but are not limited to, Aspergillus niger, Aspergillus oryzae, Aspergillus nidulans, Aspergillus japonicus, Absidia coerulea, Rhizopus oryzae, Chrysosporium lucknowense, Myceliophthora thermophila, Neurospora crassa, Neurospora intermedia, Trichoderma reesei, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, Talaromyces emersonii and Talaromyces flavus.
  • the host cell is a fungal cell of the species Myceliophthora thermophila.
  • a white (low cellulose) strain is used.
  • the host cell is a fungal cell of Strain C1 (VKM F-3500-D) or a mutant strain derived therefrom (e.g., UV13-6 (Accession No. VKM F-3632 D); NG7C-19 (Accession No. VKM F-3633 D); UV18-25 (VKM F-3631D), W1L (CBS122189), or W1L#100L (CBS122190)).
  • Host cells can be either untransfected cells or cells that are already transfected with at least one other recombinant nucleic acid molecule. Additional embodiments of the present invention include any of the genetically modified cells described herein.
  • suitable host cells include insect cells (most particularly Drosophila melanogaster cells, Spodoptera frugiperda Sf9 and Sf21 cells and Trichoplusia High-Five cells), nematode cells (particularly C. elegans cells), avian cells, amphibian cells (particularly Xenopus laevis cells), reptilian cells, and mammalian cells (most particularly human, simian, canine, rodent, bovine, or sheep cells, e.g. NIH3T3, CHO (Chinese hamster ovary cell), COS, VERO, BHK, HEK, and other rodent or human cells).
  • one or more protein(s) expressed by an isolated nucleic acid molecule of the present invention are produced by culturing a cell that expresses the protein (i.e., a recombinant cell or recombinant host cell) under conditions effective to produce the protein.
  • the protein may be recovered, and in others, the cell may be harvested in whole, either of which can be used in a composition.
  • Microorganisms used in the present invention are cultured in an appropriate fermentation medium.
  • An appropriate, or effective, fermentation medium refers to any medium in which a cell of the present invention, including a genetically modified microorganism (described below), when cultured, is capable of expressing enzymes useful in the present invention and/or of catalyzing the production of sugars from lignocellulosic biomass.
  • a medium is typically an aqueous medium comprising assimilable carbon, nitrogen and phosphate sources.
  • Such a medium can also include appropriate salts, minerals, metals and other nutrients.
  • Microorganisms and other cells of the present invention can be cultured in conventional fermentation bioreactors.
  • the microorganisms can be cultured by any fermentation process which includes, but is not limited to, batch, fed-batch, cell recycle, and continuous fermentation.
  • the fermentation of microorganisms such as fungi may be carried out in any appropriate reactor, using methods known to those skilled in the art.
  • the fermentation may be carried out for a period of 1 to 14 days, or more preferably between about 3 and 10 days.
  • the temperature of the medium is typically maintained between about 25 and 50°C, and more preferably between 28 and 40°C.
  • the pH of the fermentation medium is regulated to a pH suitable for growth and protein production of the particular organism.
  • the fermentor can be aerated in order to supply the oxygen necessary for fermentation and to avoid the excessive accumulation of carbon dioxide produced by fermentation.
  • the aeration helps to control the temperature and the moisture of the culture medium.
  • the fungal strains are grown in fermentors, optionally centrifuged or filtered to remove biomass, and optionally concentrated, formulated, and dried to produce an enzyme(s) or a multi-enzyme composition that is a crude fermentation product.
  • Particularly suitable conditions for culturing filamentous fungi are described, for example, in U.S. Patent No. 6,015,707 and U.S. Patent No. 6,573,086, supra.
  • resultant proteins of the present invention may either remain within the recombinant cell; be secreted into the culture medium; be secreted into a space between two cellular membranes; or be retained on the outer surface of a cell membrane.
  • the phrase "recovering the protein” refers to collecting the whole culture medium containing the protein and need not imply additional steps of separation or purification.
  • Proteins produced according to the present invention can be purified using a variety of standard protein purification techniques, such as, but not limited to, affinity chromatography, ion exchange chromatography, filtration, electrophoresis, hydrophobic interaction chromatography, gel filtration chromatography, reverse phase chromatography, concanavalin A chromatography, chromatofocusing and differential precipitation or solubilization.
  • Proteins of the present invention are preferably retrieved, obtained, and/or used in "substantially pure” form.
  • substantially pure refers to a purity that allows for the effective use of the protein in any method according to the present invention.
  • For a protein to be useful in any of the methods described herein or in any method utilizing enzymes of the types described herein according to the present invention, it is substantially free of contaminants, other proteins and/or chemicals that might interfere or that would interfere with its use in a method disclosed by the present invention (e.g., that might interfere with enzyme activity), or that at least would be undesirable for inclusion with a protein of the present invention (including homologues and variants) when it is used in a method disclosed by the present invention (described in detail below).
  • a "substantially pure" protein is a protein that can be produced by any method (i.e., by direct purification from a natural source, recombinantly, or synthetically), and that has been purified from other protein components such that the protein comprises at least about 80% weight/weight of the total protein in a given composition (e.g., the protein of interest is about 80% of the protein in a solution/composition/buffer), and more preferably, at least about 85%, and more preferably at least about 90%, and more preferably at least about 91%, and more preferably at least about 92%, and more preferably at least about 93%, and more preferably at least about 94%, and more preferably at least about 95%, and more preferably at least about 96%, and more preferably at least about 97%, and more preferably at least about 98%, and more preferably at least about 99%, weight/weight of the total protein in a given composition.
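  • The weight/weight criterion above can be illustrated with a minimal sketch. The function names are hypothetical, and treating "at least about 80%" as a hard cutoff at 80.0 is an assumption made only for illustration; the text does not define how "about" is to be applied.

```python
def purity_percent(target_protein_mg: float, total_protein_mg: float) -> float:
    """Return the protein of interest as a percentage (w/w) of total protein."""
    if total_protein_mg <= 0:
        raise ValueError("total protein mass must be positive")
    if not 0 <= target_protein_mg <= total_protein_mg:
        raise ValueError("target protein mass must lie between 0 and the total")
    return 100.0 * target_protein_mg / total_protein_mg


def is_substantially_pure(target_protein_mg: float,
                          total_protein_mg: float,
                          threshold_percent: float = 80.0) -> bool:
    """Check the w/w purity against a threshold.

    The default of 80.0 mirrors the lowest figure in the text; the hard
    cutoff is an illustrative assumption, not a definition from the source.
    """
    return purity_percent(target_protein_mg, total_protein_mg) >= threshold_percent
```

  • For example, 8.5 mg of the protein of interest in 10 mg of total protein is 85% w/w, which passes the default threshold; passing `threshold_percent=95.0` applies one of the stricter figures listed.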
  • Recombinant techniques useful for controlling the expression of nucleic acid molecules include, but are not limited to, integration of the nucleic acid molecules into one or more host cell chromosomes, addition of vector stability sequences to plasmids, substitutions or modifications of transcription control signals (e.g., promoters, operators, enhancers), substitutions or modifications of translational control signals (e.g., ribosome binding sites), modification of nucleic acid molecules to correspond to the codon usage of the host cell, and deletion of sequences that destabilize transcripts.
  • a genetically modified microorganism that has been transfected with one or more nucleic acid molecules of the present invention.
  • a genetically modified microorganism can include a genetically modified bacterium, alga, yeast, filamentous fungus, or other microbe.
  • Such a genetically modified microorganism has a genome which is modified (i.e., mutated or changed) from its normal (i.e., wild-type or naturally occurring) form such that the desired result is achieved (i.e., increased or modified activity and/or production of at least one enzyme or a multi-enzyme composition for the conversion of lignocellulosic material to fermentable sugars).
  • Genetic modification of a microorganism can be accomplished using classical strain development and/or molecular genetic techniques. Such techniques are known in the art and are generally disclosed for microorganisms, for example, in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press or Molecular Cloning: A Laboratory Manual, third edition (Sambrook and Russell, 2001), (jointly referred to herein as "Sambrook").
  • a genetically modified microorganism can include a microorganism in which nucleic acid molecules have been inserted, deleted or modified (i.e., mutated; e.g., by insertion, deletion, substitution, and/or inversion of nucleotides), in such a manner that such modifications provide the desired effect within the microorganism.
  • a genetically modified microorganism can endogenously contain and express an enzyme or a multi-enzyme composition for the conversion of lignocellulosic material to fermentable sugars, and the genetic modification can be a genetic modification of one or more of such endogenous enzymes, whereby the modification has some effect on the ability of the microorganism to convert lignocellulosic material to fermentable sugars (e.g., increased expression of the protein by introduction of promoters or other expression control sequences, or modification of the coding region by homologous recombination to increase the activity of the encoded protein).
  • a genetically modified microorganism can endogenously contain and express an enzyme for the conversion of lignocellulosic material to fermentable sugars, and the genetic modification can be an introduction of at least one exogenous nucleic acid sequence (e.g., a recombinant nucleic acid molecule), wherein the exogenous nucleic acid sequence encodes at least one additional enzyme useful for the conversion of lignocellulosic material to fermentable sugars and/or a protein that improves the efficiency of the enzyme for the conversion of lignocellulosic material to fermentable sugars.
  • the microorganism can also have at least one modification to a gene or genes comprising its endogenous enzyme(s) for the conversion of lignocellulosic material to fermentable sugars.
  • the genetically modified microorganism does not necessarily endogenously (naturally) contain an enzyme for the conversion of lignocellulosic material to fermentable sugars, but is genetically modified to introduce at least one recombinant nucleic acid molecule encoding at least one enzyme or a multiplicity of enzymes for the conversion of lignocellulosic material to fermentable sugars.
  • a microorganism can be used in a method of the invention, or as a production microorganism for crude fermentation products, partially purified recombinant enzymes, and/or purified recombinant enzymes, any of which can then be used in a method of the present invention.
  • a cell extract that contains the activity to test can be generated. For example, a lysate from the host cell is produced, and the supernatant containing the activity is harvested and/or the activity can be isolated from the lysate. In the case of cells that secrete enzymes into the culture medium, the culture medium containing them can be harvested, and/or the activity can be purified from the culture medium.
  • the extracts/activities prepared in this way can be tested using assays known in the art. Accordingly, methods to identify multi-enzyme compositions capable of degrading lignocellulosic biomass are provided.
  • DDG dinitrosalicylic acid assay
  • the present invention is not limited to fungi and also contemplates genetically modified organisms such as algae, bacteria, and plants transformed with one or more nucleic acid molecules of the invention.
  • the plants may be used for production of the enzymes, and/or as the lignocellulosic material used as a substrate in the methods of the invention.
  • Methods to generate recombinant plants are known in the art. For instance, numerous methods for plant transformation have been developed, including biological and physical transformation protocols. See, for example, Miki et al., "Procedures for Introducing Foreign DNA into
  • A. tumefaciens and A. rhizogenes are plant pathogenic soil bacteria which genetically transform plant cells.
  • the Ti and Ri plasmids of A. tumefaciens and A. rhizogenes, respectively, carry genes responsible for genetic transformation of the plant. See, for example, Kado, C.I., Crit. Rev. Plant. Sci. 10:1 (1991).
  • Agrobacterium vector systems and methods for Agrobacterium-mediated gene transfer are provided by numerous references, including Gruber et al., supra, Miki et al., supra, Moloney et al., Plant Cell Reports 8:238 (1989), and U.S. Patents Nos. 4,940,838 and 5,464,763.
  • microprojectile-mediated transformation wherein DNA is carried on the surface of microprojectiles.
  • the expression vector is introduced into plant tissues with a biolistic device that accelerates the microprojectiles to speeds sufficient to penetrate plant cell walls and membranes.
  • Some embodiments of the present invention include genetically modified organisms comprising at least one nucleic acid molecule encoding at least one enzyme of the present invention, in which the activity of the enzyme is downregulated.
  • the downregulation may be achieved, for example, by introduction of inhibitors (chemical or biological) of the enzyme activity, by manipulating the efficiency with which those nucleic acid molecules are transcribed, the efficiency with which the resultant transcripts are translated, and the efficiency of post-translational modifications, or by "knocking out” the endogenous copy of the gene.
  • a “knock out” of a gene refers to a molecular biological technique by which the gene in the organism is made inoperative, so that the expression of the gene is substantially reduced or eliminated.
  • the activity of the enzyme may be upregulated.
  • the present invention also contemplates downregulating activity of one or more enzymes while simultaneously upregulating activity of one or more enzymes to achieve the desired outcome.
  • Another embodiment of the present invention relates to an isolated binding agent capable of selectively binding to a protein of the present invention.
  • Suitable binding agents may be selected from an antibody, an antigen binding fragment, or a binding partner.
  • the binding agent selectively binds to an amino acid sequence selected from Tables 1 and 2, including to any fragment of any of the above sequences comprising at least one antibody binding epitope.
  • the phrase "selectively binds to" refers to the ability of an antibody, antigen binding fragment or binding partner of the present invention to preferentially bind to specified proteins. More specifically, the phrase "selectively binds" refers to the specific binding of one protein to another (e.g., an antibody, fragment thereof, or binding partner to an antigen), wherein the level of binding, as measured by any standard assay (e.g., an immunoassay), is statistically significantly higher than the background control for the assay.
  • when performing an immunoassay, controls typically include a reaction well/tube that contains antibody or antigen binding fragment alone (i.e., in the absence of antigen), wherein an amount of reactivity (e.g., non-specific binding to the well) by the antibody or antigen binding fragment thereof in the absence of the antigen is considered to be background. Binding can be measured using a variety of methods standard in the art, including enzyme immunoassays (e.g., ELISA), immunoblot assays, etc.
  • Antibodies are characterized in that they comprise immunoglobulin domains and as such, they are members of the immunoglobulin superfamily of proteins.
  • An antibody of the invention includes polyclonal and monoclonal antibodies, divalent and monovalent antibodies, bi- or multi-specific antibodies, serum containing such antibodies, antibodies that have been purified to varying degrees, and any functional equivalents of whole antibodies.
  • Isolated antibodies of the present invention can include serum containing such antibodies, or antibodies that have been purified to varying degrees.
  • Whole antibodies of the present invention can be polyclonal or monoclonal.
  • antibodies such as antigen binding fragments in which one or more antibody domains are truncated or absent (e.g., Fv, Fab, Fab', or F(ab)2 fragments), as well as genetically-engineered antibodies or antigen binding fragments thereof, including single chain antibodies or antibodies that can bind to more than one epitope (e.g., bi-specific antibodies), or antibodies that can bind to one or more different antigens (e.g., bi- or multi-specific antibodies), may also be employed in the invention.
  • Methods for the generation and production of antibodies are well known in the art.
  • Monoclonal antibodies may be produced according to the methodology of Kohler and Milstein (Nature 256:495-497, 1975).
  • Non-antibody polypeptides sometimes referred to as binding partners, are designed to bind specifically to a protein of the invention. Examples of the design of such polypeptides, which possess a prescribed ligand specificity are given in Beste et al. (Proc. Natl. Acad. Sci. 96:1898-1903, 1999).
  • a binding agent of the invention is immobilized on a substrate such as: artificial membranes, organic supports, biopolymer supports and inorganic supports such as for use in a screening assay.
  • Proteins of the present invention, compositions comprising such protein(s) of the present invention, and multi-enzyme compositions (examples of which are described above) may be used in any method where it is desirable to hydrolyze glycosidic linkages in lignocellulosic material, or any other method wherein enzymes of the same or similar function are useful.
  • the present invention includes the use of at least one protein of the present invention, compositions comprising at least one protein of the present invention, or multi-enzyme compositions in methods for hydrolyzing lignocellulose and the generation of fermentable sugars therefrom.
  • the method comprises contacting the lignocellulosic material with an effective amount of one or more proteins of the present invention, composition comprising at least one protein of the present invention, or a multi-enzyme composition, whereby at least one fermentable sugar is produced (liberated).
  • the lignocellulosic material may be partially or completely degraded to fermentable sugars. Economical levels of degradation at commercially viable costs are contemplated.
  • the amount of enzyme or enzyme composition contacted with the lignocellulose will depend upon the amount of glucan present in the lignocellulose.
  • the amount of enzyme or enzyme composition contacted with the lignocellulose may be from about 0.1 to about 200 mg enzyme or enzyme composition per gram of glucan; in other embodiments, from about 3 to about 20 mg enzyme or enzyme composition per gram of glucan.
  • the invention encompasses the use of any suitable or sufficient amount of enzyme or enzyme composition between about 0.1 mg and about 200 mg enzyme per gram glucan, in increments of 0.05 mg (i.e., 0.1 mg, 0.15 mg, 0.2 mg... 199.9 mg, 199.95 mg, 200 mg).
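  • The dosing range above reduces to simple arithmetic, sketched here under stated assumptions. The function name and the `glucan_fraction` parameter (the mass fraction of glucan in the lignocellulose) are hypothetical; the hard limits of 0.1 and 200 mg per gram of glucan stand in for the "about" bounds in the text.

```python
def enzyme_dose_mg(biomass_g: float,
                   glucan_fraction: float,
                   mg_enzyme_per_g_glucan: float) -> float:
    """Return the enzyme mass (mg) needed for a given lignocellulose load.

    biomass_g              -- dry mass of lignocellulosic material, in grams
    glucan_fraction        -- assumed mass fraction of glucan (0.0 to 1.0)
    mg_enzyme_per_g_glucan -- chosen loading; the text gives about 0.1 to
                              about 200, with about 3 to about 20 in other
                              embodiments
    """
    if biomass_g < 0:
        raise ValueError("biomass mass cannot be negative")
    if not 0.0 <= glucan_fraction <= 1.0:
        raise ValueError("glucan fraction must be between 0 and 1")
    if not 0.1 <= mg_enzyme_per_g_glucan <= 200.0:
        raise ValueError("loading is outside the 0.1 to 200 mg/g glucan range")
    glucan_g = biomass_g * glucan_fraction
    return glucan_g * mg_enzyme_per_g_glucan
```

  • As an illustration, 1 kg of material assumed to contain 35% glucan, dosed at 10 mg enzyme per gram of glucan, calls for about 3,500 mg of enzyme.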
  • the invention provides a method for degrading DDG, preferably, but not limited to, DDG derived from corn, to sugars.
  • the method comprises contacting the DDG with a protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition. In certain embodiments, at least 10% of fermentable sugars are liberated.
  • at least 15% of the sugars are liberated, or at least 20% of the sugars are liberated, or at least 23% of the sugars are liberated, or at least 24% of the sugars are liberated, or at least 25% of the sugars are liberated, or at least 26% of the sugars are liberated, or at least 27% of the sugars are liberated, or at least 28% of the sugars are liberated.
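  • The liberation thresholds recited for DDG can be compared against a measured yield with a short sketch. The function names are hypothetical, and the choice of basis (sugars released divided by total sugars available in the DDG) is an assumption; the text does not state how the percentage is calculated.

```python
# Percent-liberated thresholds recited in the text for DDG.
THRESHOLDS_PERCENT = (10, 15, 20, 23, 24, 25, 26, 27, 28)


def percent_liberated(released_sugar_g: float, total_available_sugar_g: float) -> float:
    """Return released sugars as a percentage of total available sugars."""
    if total_available_sugar_g <= 0:
        raise ValueError("total available sugar must be positive")
    if released_sugar_g < 0:
        raise ValueError("released sugar cannot be negative")
    return 100.0 * released_sugar_g / total_available_sugar_g


def highest_threshold_met(percent: float):
    """Return the largest recited threshold that the yield meets, or None."""
    met = [t for t in THRESHOLDS_PERCENT if percent >= t]
    return max(met) if met else None
```

  • Under these assumptions, releasing 2.5 g of sugar from DDG containing 10 g of available sugar gives 25%, which meets the 25% threshold but not the 26% one.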
  • the invention provides a method for producing fermentable sugars comprising cultivating a genetically modified microorganism of the present invention in a nutrient medium comprising a lignocellulosic material, whereby fermentable sugars are produced.
  • Accessory enzymes have been described elsewhere herein.
  • the accessory enzyme or enzymes may be added at the same time, prior to, or following the addition of a protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition, or can be expressed (endogenously or overexpressed) in a genetically modified microorganism used in a method of the invention.
  • the protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition will be compatible with the accessory enzymes selected.
  • the conditions such as temperature and pH
  • the accessory enzyme may also be present in the lignocellulosic material itself as a result of genetically modifying the plant.
  • the nutrient medium used in a fermentation can also comprise one or more accessory enzymes.
  • the method comprises a pretreatment process.
  • a pretreatment process will result in components of the lignocellulose being more accessible for downstream applications or so that it is more digestible by enzymes following treatment in the absence of hydrolysis.
  • the pretreatment can be a chemical, physical or biological pretreatment.
  • the lignocellulose may have been previously treated to release some or all of the sugars, as in the case of DDG. Physical treatments, such as grinding, boiling, freezing, milling, vacuum infiltration, and the like may also be used with the methods of the invention.
  • the heat treatment comprises heating the lignocellulosic material to 121°C for 15 minutes.
  • a physical treatment such as milling can allow a higher concentration of lignocellulose to be used in the methods of the invention.
  • a higher concentration refers to about 20%, up to about 25%, up to about 30%, up to about 35%, up to about 40%, up to about 45%, or up to about 50% lignocellulose.
  • the lignocellulose may also be contacted with a metal ion, ultraviolet light, ozone, and the like.
  • Additional pretreatment processes are known to those skilled in the art, and can include, for example, organosolv treatment, steam explosion treatment, lime impregnation with steam explosion treatment, hydrogen peroxide treatment, hydrogen peroxide/ozone (peroxone) treatment, acid treatment, dilute acid treatment, and base treatment, including ammonia fiber explosion (AFEX) technology.
  • the method comprises detoxifying the lignocellulosic material.
  • Detoxification may be desirable in the event that inhibitors are present in the lignocellulosic material. Such inhibitors can be generated by a pretreatment process, derived from sugar degradation, or directly released from the lignocellulose polymer.
  • Detoxifying can include the reduction of their formation by adjusting sugar extraction conditions, or the use of inhibitor-tolerant or inhibitor-degrading strains of microorganisms. Detoxifying can also be accomplished by the addition of ion exchange resins, active charcoal, enzymatic detoxification using, e.g., laccase, and the like.
  • the proteins, compositions or products of the present invention further comprise detoxifying agents.
  • the methods may be performed one or more times in whole or in part. That is, one may perform one or more pretreatments, followed by one or more reactions with a protein of the present invention, composition or product of the present invention and/or accessory enzyme.
  • the enzymes may be added in a single dose, or may be added in a series of small doses. Further, the entire process may be repeated one or more times as necessary. Therefore, one or more additional treatments with heat and enzymes are contemplated.
  • the fermentable sugars may be recovered.
  • the sugars can be recovered through a continuous, batch or fed-batch method.
  • the sugars recovered can be concentrated or purified. Recovery may occur by any method known in the art, including, but not limited to, washing, gravity flow, pressure, chromatography, extraction, crystallization (e.g., evaporative crystallization), membrane separation, reverse osmosis, distillation, and filtration.
  • the sugars can be subjected to further processing; e.g., they can also be sterilized, for example, by filtration.
  • the invention provides means for improving quality of lignocellulosic material, including DDG for animal nutrition.
  • the treated lignocellulosic material e.g., a lignocellulosic material which has been saccharified
  • the recovered material can be used as an animal feed additive. It is believed that the recovered material will have beneficial properties for animal nutrition, possibly due to a higher protein content.
  • the amount of enzyme or enzyme composition contacted with the lignocellulosic material may be from about 0.0001 % to about 1.0 % of the weight of the lignocellulosic material; in other embodiments, from about 0.005 % to about 0.1 % of the weight of the lignocellulosic material.
  • the invention includes the use of any amount of enzyme or enzyme composition between about 0.0001 % and about 1.0 %, in increments of 0.0001 (i.e., 0.0001, 0.0002, 0.0003, etc.).
  • the invention provides a method for producing an organic substance, comprising saccharifying a lignocellulosic material with an effective amount of a protein of the present invention or a composition comprising at least one protein of the present invention, fermenting the saccharified lignocellulosic material obtained with one or more fermenting microorganisms, and recovering the organic substance from the fermentation.
  • Sugars released from biomass can be converted to useful fermentation products including but not limited to amino acids, vitamins, pharmaceuticals, animal feed supplements, specialty chemicals, chemical feedstocks, plastics, solvents, fuels, or other organic polymers, lactic acid, and ethanol, including fuel ethanol.
  • Specific products that may be produced by the methods of the invention include, but are not limited to, biofuels (including ethanol); lactic acid; plastics; specialty chemicals; organic acids, including citric acid, succinic acid, itaconic acid and maleic acid; solvents; animal feed supplements; pharmaceuticals; vitamins; amino acids, such as lysine, methionine, tryptophan, threonine, and aspartic acid; industrial enzymes, such as proteases, cellulases, amylases, glucanases, lactases, lipases, lyases, oxidoreductases, and transferases; and chemical feedstocks.
  • the methods of the invention are also useful to generate feedstocks for fermentation by fermenting microorganisms.
  • the method further comprises the addition of at least one fermenting organism.
  • fermenting organism refers to an organism capable of fermentation, such as bacteria and fungi, including yeast. Such feedstocks have additional nutritive value above the nutritive value provided by the liberated sugars.
  • the present invention provides methods for improving the nutritional quality of food (or animal feed) comprising adding to the food (or the animal feed) at least one protein of the present invention.
  • the present invention provides methods for improving the nutritional quality of the food (or animal feed) comprising pretreating the food (or the animal feed) with at least one isolated protein of the present invention.
  • use of the enzymes xylanases and arabinofuranosidases in bread making has been known to improve the nutritional quality of the dough by degrading the arabinoxylans in the dough.
  • Improving the nutritional quality can mean making the food (or the animal feed) more digestible and/or less allergenic, and encompasses changes in the caloric value, taste and/or texture of the food.
  • the proteins of the present invention may be used as part of nutritional supplements.
  • the proteins of the present invention may be used as part of digestive aids, and may help in providing relief from digestive disorders such as acid reflux and celiac disease.
  • Proteins of the present invention and compositions comprising at least one protein of the present invention are also useful in a variety of other applications involving the hydrolysis of glycosidic linkages in lignocellulosic material, such as stone washing, color brightening, depilling and fabric softening, as well as other applications well known in the art. Proteins of the present invention and compositions comprising at least one protein of the present invention are also readily amenable to use as additives in detergent and other media used for such applications. These and other methods of use will readily suggest themselves to those of skill in the art based on the invention described herein.
  • proteins and compositions of the present invention can be used in stone washing procedures for fabrics or other textiles.
  • the proteins and compositions can be used in stone washing procedures for denim jeans.
  • the method for stone washing the fabric comprises contacting the fabric with a protein or composition of the present invention. In an additional embodiment, the protein or composition of the present invention is included in a detergent composition, as described below.
  • a preferred pH range for stone wash applications is from about 5.5 to about 7.5, most preferably from about pH 6 to about pH 7.
  • One of skill in the art will know how to regulate the amount or concentration of the protein or composition produced by this invention based on such factors as the activity of the enzyme and the wash conditions, including but not limited to temperature and pH. Examples of these uses can be found in U.S. Patent Application Publication No. 2003/0157595.
  • the cellulase compositions of this invention can be used to reduce or eliminate the harshness associated with a fabric or textile by contacting the fabric or textile with a protein or composition of the present invention.
  • the fabric or textile may be made from cellulose or cotton.
  • a preferred range for reducing or eliminating the harshness associated with a fabric or textile is between about pH 8 to about 12, or between about pH 10 to about 11.
  • the proteins or compositions of the subject invention can be used in detergent compositions.
  • the detergent composition may comprise at least one protein or composition of the present invention and one or more surfactants.
  • the detergent compositions may also include any additional detergent ingredient known in the art.
  • Detergent ingredients contemplated for use with the detergent compositions of the subject invention include, but are not limited to, detergents, buffers, surfactants, bleaching agents, softeners, solvents, solid forming agents, abrasives, alkalis, inorganic electrolytes, cellulase activators, antioxidants, builders, silicates, preservatives, and stabilizers.
  • the detergent compositions of this invention preferably employ a surface active agent, i.e., surfactant, including anionic, non-ionic, and ampholytic surfactants well known for their use in detergent compositions.
  • the detergent compositions of this invention can additionally contain one or more of the following components: the enzymes amylases, cellulases, proteinase, lipases, oxido-reductases, peroxidases and other enzymes; cationic surfactants and long-chain fatty acids; builders; antiredeposition agents; bleaching agents; bluing agents and fluorescent dyes; caking inhibitors; masking agents for factors inhibiting the cellulase activity; cellulase activators; antioxidants; and solubilizers.
  • detergent compositions of this invention can be used, if desired, with the detergent compositions of this invention.
  • detergent compositions employing cellulases are exemplified in U.S. Pat. Nos. 4,435,307; 4,443,355; 4,661,289; 4,479,881; 5,120,463.
  • When a detergent base used in the present invention is in the form of a powder, it may be one which is prepared by any known preparation method, including a spray-drying method and/or a granulation method.
  • the granulation method is the most preferred because of the non-dusting nature of granules compared to spray-dried products.
  • the detergent base obtained by the spray-drying method is hollow granules which are obtained by spraying an aqueous slurry of heat-resistant ingredients, such as surface active agents and builders, into a hot space.
  • the granules have a size of from about 50 to about 2000 micrometers.
  • With a highly dense, granular detergent base, such as one obtained by the spray-drying-granulation method, various ingredients may also be added after the preparation of the base.
  • When the detergent base is a liquid, it may be either a homogeneous solution or an inhomogeneous solution.
  • compositions of the present invention include, but are not limited to, garment dyeing applications such as enzymatic mercerizing of viscose, bio-polishing applications, enzymatic surface polishing; biowash (washing or washing down treatment of textile materials), enzymatic microfibrillation, enzymatic "cottonization" of linen, ramie and hemp; and treatment of Lyocel® or Newcell® (i.e., "TENCEL®" from Courtauld's), Cupro® and other cellulosic fibers or garments, dye removal from dyed cellulosic substrates such as dyed cotton (Leisola & Linko (1976) Analytical
  • the amount of enzyme or enzyme composition contacted with a textile may vary with the particular application. Typically, for biofinishing and denim washing applications, from about 0.02 wt. % to about 5 wt. % of an enzyme or enzyme composition may be contacted with the textile. In some embodiments, from about 0.5 wt. % to about 2 wt. % of an enzyme or enzyme composition may be contacted with the textile. For bioscouring, from about 0.1 to about 10, or from about 0.1 to about 1.0 grams of an enzyme or enzyme composition per kilogram of textile may be used, including any amount between about 0.1 grams and about 10 grams, in increments of 0.1 grams.
  • proteins or compositions of the present invention can be used in the saccharification of lignocellulose biomass from agriculture, forest products, municipal solid waste, and other sources, for biobleaching of wood pulp, and for de-inking of recycled print paper all by methods known to one skilled in the art.
  • the amount of enzyme or enzyme composition used for pulp and paper modification typically varies depending upon the stock that is used, the pH and temperature of the system, and the retention time.
  • the amount of enzyme or enzyme composition contacted with the paper or pulp may be from about 0.01 to about 50 U; from about 0.1 to about 15 U; or from about 0.1 to about 5 U of enzyme or enzyme composition per dry gram of fiber, including any amount between about 0.01 and about 50 U, in 0.01 U increments.
  • the amount of enzyme or enzyme composition contacted with the paper or pulp may be from about 1 to about 2000 grams or from about 100 to about 500 grams enzyme or enzyme composition per dry ton of pulp, including any amount between about 1 and about 2000 grams, in 1 gram increments.
  • Proteins or compositions of the present invention can be added to wastewater to reduce the amount of solids such as sludge or to increase total biochemical oxygen demand (BOD) and chemical oxygen demand (COD) removal.
  • proteins or compositions of the present invention can be used to transform particulate COD to soluble COD in wastewater produced from grain/fruit/cellulose industrial processes or to increase the BOD/COD ratio by increasing waste biodegradability (soluble lower molecular weight polymers in cellulosic/hemicellulosic wastes are typically more readily biodegradable than non-soluble material).
  • proteins or compositions of the present invention can also be used to increase waste digestion by aerobic and/or anaerobic bacteria.
  • acetyl esterase activity was measured by the release of p-nitrophenol by the action of acetyl esterase on p-nitrophenyl acetate (PNPAc).
  • One acetyl esterase unit of activity was the amount of enzyme that liberates 1 micromole of p-nitrophenol in one minute at 37 °C and pH 5.
  • PNPAc from Fluka (Switzerland, cat. # 46021) was used as the assay substrate. 3.6 mg of PNPAc was dissolved in 10 mL of 0.10 M potassium phosphate buffer pH 6.9 using a magnetic stirrer to obtain a 2 mM stock solution. The solution was stable for 2 days on storage at 4 °C.
  • the stop reagent (0.25 M Tris-HCl, pH 8.8) was prepared as follows. 30.29 g of Tris was dissolved in 900 mL of distilled water (Solution A). The final 0.25 M Tris-HCl pH 8.8 was prepared by mixing Solution A with 37% HCl until the pH of the resulting solution was equal to 8.8. The solution volume was adjusted to 1000 mL. This reagent was used to terminate the enzymatic reaction.
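The reagent quantities and the unit definition in the acetyl esterase assay can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the described method: the molar masses are standard literature values rather than figures from the text, and the function names are hypothetical.

```python
# Molar masses are standard literature values (assumptions, not from the text).
PNPAC_G_PER_MOL = 181.15  # p-nitrophenyl acetate
TRIS_G_PER_MOL = 121.14   # Tris base


def molarity_mM(mass_mg: float, volume_mL: float, molar_mass: float) -> float:
    """Concentration in mmol/L for mass_mg of solute dissolved in volume_mL."""
    millimoles = mass_mg / molar_mass      # mg / (g/mol) = mmol
    return millimoles / volume_mL * 1000.0  # mmol/mL -> mmol/L


def mass_needed_g(molarity_M: float, volume_L: float, molar_mass: float) -> float:
    """Grams of solute needed for a solution of the given molarity and volume."""
    return molarity_M * volume_L * molar_mass


def esterase_units(pnp_released_umol: float, minutes: float) -> float:
    """Units as defined in the text: micromoles of p-nitrophenol per minute."""
    return pnp_released_umol / minutes


# 3.6 mg PNPAc in 10 mL should give roughly the stated 2 mM stock.
pnpac_mM = molarity_mM(3.6, 10.0, PNPAC_G_PER_MOL)
# 0.25 M Tris in a final volume of 1000 mL should need roughly 30.29 g.
tris_g = mass_needed_g(0.25, 1.0, TRIS_G_PER_MOL)
```

Both results agree with the stated quantities to within rounding. The unit calculation assumes the amount of p-nitrophenol released has already been determined, for example from a standard curve, which the text does not describe here.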

Abstract

This invention relates to novel enzymes and novel methods for producing the same. More specifically this invention relates to a variety of fungal enzymes. Nucleic acid molecules encoding such enzymes, compositions, recombinant and genetically modified host cells, and methods of use are described. The invention also relates to a method to convert lignocellulosic biomass to fermentable sugars with enzymes that degrade the lignocellulosic material and novel combinations of enzymes, including those that provide a synergistic release of sugars from plant biomass. The invention also relates to a method to release cellular content by degradation of cell walls. The invention also relates to methods to use the novel enzymes and compositions of such enzymes to improve digestibility of animal feed and in a variety of other processes, including food and beverage processing, baking, washing of clothing, detergent processes, biorefining, deinking and biobleaching of paper and pulp, and treatment of waste streams.

Description

Patent 124702-0230
Novel Fungal Enzymes
[1] RELATED APPLICATIONS
[2] This application claims benefit of priority under 35 U.S.C. § 119(e) of United States Provisional Application number 61/369,688 filed on July 31, 2010.
[3] INCORPORATION BY REFERENCE
[4] The contents of all patents, patent applications, publications, articles, or literature cited herein are expressly incorporated by reference.
[5] FIELD OF THE INVENTION
[6] This invention relates to novel enzymes and novel methods for producing the same. More specifically this invention relates to enzymes produced by fungi. The invention also relates to a method to convert lignocellulosic biomass or cellulosic substrates to fermentable sugars with enzymes that degrade lignocellulosic, cellulosic, and even more complex plant cell wall material and to novel combinations of enzymes, including those that provide a combined or synergistic release of sugars from plant biomass. The invention also relates to a method to release cellular contents by effecting degradation of the cell walls. The invention also relates to methods of using the novel enzymes and compositions of such enzymes in a variety of other processes, such as washing or treating of clothing or fabrics, detergent processes, animal feed, food, baking, production of biochemicals and biofuel, starch preparation, liquefaction, beverage, biorefining, deinking and biobleaching of paper and pulp, oil and waste dispersing, and treatment of waste streams.
[7] BACKGROUND OF THE INVENTION
[8] Large amounts of carbohydrates in plant biomass provide a plentiful source of potential energy in the form of sugars (both five carbon and six carbon sugars) that can be utilized for numerous industrial and agricultural processes. However, the enormous energy potential of these carbohydrates is currently under-utilized because the sugars are locked in complex polymers and, hence, are not readily accessible for fermentation. These complex polymers are often referred to collectively as lignocellulose. Sugars generated from degradation of plant biomass potentially represent plentiful, economically competitive feedstocks for fermentation into chemicals, plastics, and fuels, including ethanol as a substitute for petroleum.
[9] For example, distillers' dried grains (DDG) are lignocellulosic byproducts of the corn dry milling process. Milled whole corn kernels are treated with amylases to liquefy the starch within the kernels and hydrolyze it to glucose. The glucose so produced is then fermented in a second step to ethanol. The residual solids after the ethanol fermentation and distillation are centrifuged and dried, and the resulting product is DDG, which is used as an animal feed stock. Although DDG composition can vary, a typical composition for DDG is: about 32% hemicellulose, 22% cellulose, 30% protein, 10% lipids, 4% residual starch, and 4% inorganics. In theory, the cellulose and hemicellulose fractions, comprising about 54% of the weight of the DDG, can be efficiently hydrolyzed to fermentable sugars by enzymes; however, it has been found that the carbohydrates comprising lignocellulosic materials in DDG are more difficult to digest. To date, the efficiency of hydrolysis of these (hemi)cellulosic polymers by enzymes is much lower than the hydrolytic efficiency of starch, due to the more complex and recalcitrant nature of these substrates. Accordingly, the cost of producing the requisite enzymes is higher than the cost of producing amylases for starch hydrolysis.
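The arithmetic behind the "about 54%" figure for the typical DDG composition quoted above can be sketched as follows. This is only an illustration: the dictionary keys and the helper name are hypothetical, and the percentages are the approximate values given in the paragraph.

```python
# Approximate typical DDG composition, in weight percent, as quoted in the text.
ddg_percent = {
    "hemicellulose": 32,
    "cellulose": 22,
    "protein": 30,
    "lipids": 10,
    "residual starch": 4,
    "inorganics": 4,
}

# Cellulose plus hemicellulose: the fraction that could in theory be hydrolyzed.
structural_percent = ddg_percent["hemicellulose"] + ddg_percent["cellulose"]

# The quoted figures are approximate, so they need not sum to exactly 100.
total_percent = sum(ddg_percent.values())


def structural_carbohydrate_kg(ddg_kg: float) -> float:
    """Mass of cellulose plus hemicellulose in a given mass of DDG."""
    return ddg_kg * structural_percent / 100.0
```

With these figures, one metric ton of DDG would contain roughly 540 kg of cellulose and hemicellulose. The listed values add up to 102%, which is consistent with each being stated as "about".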
[10] Major polysaccharides comprising lignocellulosic materials include cellulose and hemicelluloses. The enzymatic hydrolysis of these polysaccharides to soluble sugars (and finally to monomers such as glucose, xylose and other hexoses and pentoses) is catalyzed by several enzymes acting in concert. For example, endo-1,4-β-glucanases (EGs) and exo-cellobiohydrolases (CBHs) catalyze the hydrolysis of insoluble cellulose to cellooligosaccharides (with cellobiose as the main product), while β-glucosidases (BGLs) convert the oligosaccharides to glucose. Similarly, xylanases, together with other enzymes such as α-L-arabinofuranosidases, ferulic and acetylxylan esterases and β-xylosidases, catalyze the hydrolysis of hemicelluloses.
[11] Regardless of the type of cellulosic feedstock, the cost and hydrolytic efficiency of enzymes are major factors that restrict the widespread use of biomass bioconversion processes. The production costs of microbially produced enzymes are tightly connected with the productivity of the enzyme-producing strain and the final activity yield in the fermentation broth. The hydrolytic efficiency of a multi-enzyme cocktail in the process of lignocellulose saccharification depends on properties of individual enzymes, the synergies between them, and their ratio in the multi-enzyme cocktail. The hydrolytic efficiency of a multi-enzyme complex in the process of lignocellulosic saccharification depends both on properties of the individual enzymes and the ratio of each enzyme within the complex.
[12] Enzymes useful for the hydrolysis of complex polysaccharides are also highly useful in a variety of industrial textile applications, as well as industrial paper and pulp applications, and in the treatment of waste streams. For example, as an alternative to the use of pumice in the stone washing process, methods for treating cellulose-containing fabrics for clothing with hydrolytic enzymes, such as cellulases, are known to improve the softness or feel of such fabrics. Cellulases are also used in detergent compositions, either for the purpose of enhancing the cleaning ability of the composition or as a softening agent. Cellulases are also used in combination with polymeric agents in processes for providing a localized variation in the color density of fibers. Such enzymes can also be used for the saccharification of lignocellulosic biomass in waste streams, such as municipal solid waste, for biobleaching of wood pulp, and for deinking of recycled print paper. As with the hydrolysis of these polysaccharides in lignocellulosic materials for use as feedstocks described above, the cost and hydrolytic efficiency of the enzymes are major factors that control the use of enzymes in these processes.
[13] Enzymes are also useful in the food and animal feed industry. For example, esterases can be utilized to degum vegetable oils, improve the production of various food products, and enhance the flavor of food products. Esterases can be used in feed to reduce the amount of phosphate in the feed. Carbohydrases can be used to increase the yield of fruit juice and oils; stimulate fermentation in the brewing industry; produce gelling agents; and modify starches, to mention a few. Uses of carbohydrases in the feed industry include, but are not limited to, improving feed conversion, reducing viscosity, and producing oligosaccharides.
[14] Filamentous fungi such as Aspergillus sp. and Trichoderma sp. are sources of cellulases and hemicellulases, as well as other enzymes useful in the enzymatic hydrolysis of major polysaccharides. In particular, strains of Trichoderma sp., such as T. viride, T. reesei and T. longibrachiatum, and Penicillium sp., and enzymes derived from these strains, have previously been used to hydrolyze crystalline cellulose. However, the costs associated with producing enzymes from these fungi, as well as the presence of additional, undesirable enzymes, remain a drawback. It is therefore desirable to produce inexpensive enzymes and enzyme mixtures that efficiently degrade cellulose and hemicellulose for use in a variety of agricultural and industrial applications.
[15] In spite of the continued research of the last few decades to understand enzymatic lignocellulosic biomass degradation and cellulase production, it remains desirable to discover or to engineer new highly active cellulases and hemicellulases. It would also be highly desirable to construct highly efficient enzyme compositions capable of performing rapid and efficient biodegradation of lignocellulosic materials.
[16] SUMMARY OF THE INVENTION
[17] In one embodiment, the present invention comprises an isolated nucleic acid sequence selected from the group consisting of:
a) a nucleic acid sequence encoding a protein comprising an amino acid sequence selected from the group consisting of: Table 1 or Table 2;
b) a nucleic acid sequence encoding a fragment of the protein of (a), wherein the fragment has a biological activity of the protein of (a); and
c) a nucleic acid sequence encoding an amino acid sequence that is at least about 70% identical to an amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
[18] In some embodiments, the nucleic acid sequence encodes an amino acid sequence that is at least about 90%, at least about 95%, at least about 97% or at least about 99% identical to the amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
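The identity thresholds recited above (at least about 70%, 90%, 95%, 97% or 99%) can be illustrated with a simple percent-identity calculation. This is a minimal sketch under stated assumptions: the two sequences are taken to be already aligned to equal length with "-" marking gaps, identity is taken as identical non-gap columns divided by alignment length, and the example sequences are hypothetical rather than drawn from Table 1 or Table 2. This section does not specify the alignment method or the identity convention, so other definitions may apply.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes the inputs are pre-aligned (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the percent identity is at or above the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-residue sequences differing at one position.
seq_a = "MKTAYIAKQR"
seq_b = "MKTAYLAKQR"
identity = percent_identity(seq_a, seq_b)
```

For the hypothetical pair shown, nine of ten positions match, so the pair would satisfy a 70% or 90% threshold but not a 95% one.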
[19] In some embodiments, the nucleic acid sequence encodes a protein comprising an amino acid sequence selected from the group consisting of: Table 1 or Table 2.
[20] In some embodiments, the nucleic acid sequence comprises a nucleic acid sequence selected from the group consisting of: Table 1 or Table 2.
[21] In some embodiments, the present invention comprises nucleic acid sequences that are fully complementary to any of the nucleic acid sequences described above.
[22] In some embodiments, the present invention comprises an isolated protein comprising an amino acid sequence encoded by any of the nucleic acid molecules described above.
[23] In some embodiments, the present invention comprises an isolated fusion protein comprising an isolated protein of the present invention fused to a protein comprising an amino acid sequence that is heterologous to the isolated protein.
[24] In some embodiments, the present invention comprises an isolated antibody or antigen binding fragment thereof that selectively binds to a protein of the present invention.
[25] In some embodiments, the present invention comprises a kit for degrading a lignocellulosic material to fermentable sugars comprising at least one isolated protein of the present invention.
[26] In some embodiments, the present invention comprises a detergent comprising at least one isolated protein of the present invention.
[27] In some embodiments, the present invention comprises a composition for the degradation of a lignocellulosic material comprising at least one isolated protein of the present invention.
[28] In some embodiments, the present invention comprises a recombinant nucleic acid molecule comprising an isolated nucleic acid molecule of the present invention, operatively linked to at least one expression control sequence. In some embodiments, the recombinant nucleic acid molecule comprises an expression vector. In some embodiments, the recombinant nucleic acid molecule comprises a targeting vector.
[29] In some embodiments, the present invention comprises an isolated host cell transfected with a nucleic acid molecule of the present invention. In some embodiments, the host cell is a fungus. In some embodiments, the host cell is a filamentous fungus. In some embodiments, the filamentous fungus is from a genus selected from the group consisting of: Chrysosporium, Thielavia, Talaromyces, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium, Humicola, and Trichoderma, and anamorphs and teleomorphs thereof. In some embodiments, the host cell is a bacterium.
[30] In some embodiments, the present invention comprises an oligonucleotide consisting essentially of at least 12 consecutive nucleotides of a nucleic acid sequence selected from the group consisting of: Table 1 or Table 2 or the complement thereof.
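The length and sequence condition on such an oligonucleotide can be sketched as a simple string check. This is an illustration only: "complement" is interpreted here as the reverse complement strand, the reference sequence is a hypothetical stand-in rather than a sequence from Table 1 or Table 2, and the function names are not from the source.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_candidate_oligo(oligo: str, reference: str, min_len: int = 12) -> bool:
    """True if oligo has at least min_len nucleotides and occurs as a
    consecutive run in the reference or in its reverse complement."""
    oligo = oligo.upper()
    reference = reference.upper()
    if len(oligo) < min_len:
        return False
    return oligo in reference or oligo in reverse_complement(reference)


# Hypothetical reference sequence, used only for illustration.
reference = "ATGGCTAGCTTAGGCCATCGATCGGATCCA"
forward_oligo = reference[3:15]                       # 12 consecutive nucleotides
reverse_oligo = reverse_complement(reference)[0:12]   # 12 from the other strand
```

Here both `forward_oligo` and `reverse_oligo` pass the check, while any fragment shorter than 12 nucleotides is rejected regardless of where it matches.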
[31] In some embodiments, the present invention comprises a kit comprising at least one oligonucleotide of the present invention.
[32] In some embodiments, the present invention comprises methods for producing a protein of the present invention, comprising culturing a cell that has been transfected with a nucleic acid molecule comprising a nucleic acid sequence encoding the protein, and expressing the protein with the transfected cell. In some embodiments, the present invention further comprises recovering the protein from the cell or from a culture comprising the cell.
[33] In some embodiments, the present invention comprises a genetically modified organism comprising components suitable for degrading a lignocellulosic material to fermentable sugars, wherein the organism has been genetically modified to express at least one protein of the present invention.
[34] In some embodiments, the genetically modified organism is a plant, alga, fungus or bacterium. In some embodiments, the fungus is yeast, mushroom or filamentous fungus. In some embodiments, the filamentous fungus is from a genus selected from the group consisting of: Chrysosporium, Thielavia, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Talaromyces, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium, Humicola, and Trichoderma. In some embodiments, the filamentous fungus is selected from the group consisting of: Trichoderma reesei, Chrysosporium lucknowense, Myceliophthora thermophila, Aspergillus japonicus, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, and Talaromyces flavus.
[35] In some embodiments, the genetically modified organism has been genetically modified to express at least one additional enzyme. In some embodiments, the additional enzyme is an accessory enzyme selected from the group consisting of: cellulase, glucosidase, xylanase, xylosidase, ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, glucomannase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate lyase, chitosanase, exo-β-D-glucosaminidase, cellobiose dehydrogenase, and acetylxylan esterase.
[36] In some embodiments, the genetically modified organism is a plant.
[37] In some embodiments, the present invention comprises a recombinant enzyme isolated from a genetically modified microorganism of the present invention. In some embodiments the recombinant enzyme has been subjected to a purification step.
[38] In some embodiments, the present invention comprises a crude fermentation product produced by culturing the cells from the genetically modified organism of the present invention, wherein the crude fermentation product contains at least one protein of the present invention.
[39] In some embodiments, the present invention comprises a multi-enzyme composition comprising enzymes produced by a genetically modified organism of the present invention, and recovered therefrom.
[40] In some embodiments, the present invention comprises a multi-enzyme composition comprising at least one protein of the present invention, and at least one additional protein for degrading a lignocellulosic material or a fragment thereof that has biological activity.
[41] In some embodiments, the multi-enzyme composition comprises at least one cellobiohydrolase, at least one xylanase, at least one endoglucanase, at least one β-glucosidase, at least one β-xylosidase, and at least one accessory enzyme.
[42] In some embodiments, between about 50% and about 70% of the enzymes in the multi-enzyme composition are cellobiohydrolases. In some embodiments, between about 10% and about 30% of the enzymes in the composition are xylanases. In some embodiments, between about 5% and about 15% of the enzymes in the composition are endoglucanases. In some embodiments, between about 1% and about 5% of the enzymes in the composition are β-glucosidases. In some embodiments, between about 1% and about 3% of the enzymes in the composition are β-xylosidases.
[43] In some embodiments, the multi-enzyme composition comprises about 60% cellobiohydrolases, about 20% xylanases, about 10% endoglucanases, about 3% β-glucosidases, about 2% β-xylosidases, and about 5% accessory enzymes.
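The example composition in paragraph [43] can be checked against the ranges recited in paragraph [42]. The sketch below is illustrative only: the dictionary keys and the helper name are hypothetical, and the "about" qualifiers are treated as exact bounds for simplicity.

```python
# Percentage ranges per enzyme class, as recited in paragraph [42].
RANGES = {
    "cellobiohydrolases": (50, 70),
    "xylanases": (10, 30),
    "endoglucanases": (5, 15),
    "beta-glucosidases": (1, 5),
    "beta-xylosidases": (1, 3),
}

# Example composition from paragraph [43].
example = {
    "cellobiohydrolases": 60,
    "xylanases": 20,
    "endoglucanases": 10,
    "beta-glucosidases": 3,
    "beta-xylosidases": 2,
    "accessory enzymes": 5,
}


def out_of_range(composition: dict) -> list:
    """Names of enzyme classes whose percentage falls outside RANGES."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if not low <= composition.get(name, 0) <= high
    ]


violations = out_of_range(example)
total_percent = sum(example.values())
```

For the example composition, no class falls outside its range and the percentages sum to 100. No range is recited for accessory enzymes in paragraph [42], so that class is not checked.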
[44] In some embodiments, the xylanases are selected from the group consisting of: endoxylanases, exoxylanases, and β-xylosidases.
[45] In some embodiments, the accessory enzymes include an enzyme selected from the group consisting of: cellulase, glucosidase, xylanase, xylosidase, ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, glucomannase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate lyase, chitosanase, exo-β-D-glucosaminidase, cellobiose dehydrogenase, and acetylxylan esterase.
[46] In some embodiments, the multi-enzyme composition comprises at least one hemicellulase. In some embodiments, the hemicellulase is selected from the group consisting of a xylanase, an arabinofuranosidase, an acetyl xylan esterase, a glucuronidase, an endo-galactanase, a mannanase, an endo arabinase, an exo arabinase, an exo-galactanase, a ferulic acid esterase, a galactomannanase, a xyloglucanase, and mixtures thereof. In some embodiments, the xylanase is selected from the group consisting of endoxylanases, exoxylanases, and β-xylosidases.
[47] In some embodiments, the multi-enzyme composition comprises at least one cellulase.
[48] In some embodiments, the composition is a crude fermentation product. In some embodiments, the composition is a crude fermentation product that has been subjected to a purification step.
[49] In some embodiments, the multi-enzyme composition further comprises one or more accessory enzymes. In some embodiments, the accessory enzymes include at least one enzyme selected from the group consisting of: cellulase, glucosidase, xylanase, xylosidase, ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, glucomannase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate lyase, chitosanase, exo-β-D-glucosaminidase, cellobiose dehydrogenase, and acetylxylan esterase. In some embodiments, the accessory enzyme is selected from the group consisting of a glucoamylase, a pectinase, and a ligninase. In some embodiments, the accessory enzyme is added as a crude or a semi-purified enzyme mixture. In some embodiments, the accessory enzyme is produced by culturing at least one organism on a substrate to produce the enzyme.
[50] In some embodiments, the multi-enzyme composition comprises at least one protein of the present invention, and at least one additional protein or a fragment thereof that has biological activity for degrading an arabinoxylan-containing material.
[51] In some embodiments, the composition comprises at least one endoxylanase, at least one β-xylosidase, and at least one arabinofuranosidase. In some embodiments, the arabinofuranosidase comprises an arabinofuranosidase with specificity towards single substituted xylose residues, an arabinofuranosidase with specificity towards double substituted xylose residues, or a combination thereof.
[52] In some embodiments, the present invention comprises methods for degrading a lignocellulosic material to fermentable sugars, comprising contacting the lignocellulosic material with at least one isolated protein of the present invention.
[53] In some embodiments, the methods of the present invention further comprise contacting the lignocellulosic material with at least one additional isolated protein comprising an amino acid sequence that is at least about 95% identical to an amino acid sequence selected from the group consisting of: Table 1 or Table 2, wherein at least one additional protein has cellulolytic enhancing activity.
[54] In some embodiments, the additional isolated protein is part of a multi-enzyme composition.
[55] In some embodiments, the present invention comprises methods for degrading a lignocellulosic material to fermentable sugars, comprising contacting the lignocellulosic material with at least one multi-enzyme composition of the present invention.
[56] In some embodiments, the present invention comprises a method for producing an organic substance, comprising:
[57] saccharifying a lignocellulosic material with a multi-enzyme composition of the present invention;
[58] fermenting the saccharified lignocellulosic material obtained with one or more fermenting microorganisms; and
[59] recovering the organic substance from the fermentation.
[60] In some embodiments, the steps of saccharifying and fermenting are performed simultaneously.
[61] In some embodiments, the organic substance is an alcohol, organic acid, ketone, amino acid, or gas. In some embodiments, the alcohol is ethanol.
[62] In some embodiments, the lignocellulosic material is selected from the group consisting of herbaceous material, agricultural residue, forestry residue, municipal solid waste, waste paper, and pulp and paper mill residue.
[63] In some embodiments, the lignocellulosic material is distiller's dried grains (DDG) or DDG with solubles. In some embodiments, the DDG or DDG with solubles is derived from corn.
[64] In some embodiments, the present invention comprises a method for degrading a lignocellulosic material consisting of DDG or DDG with solubles to sugars, the method comprising contacting the DDG or DDG with solubles with a multi-enzyme composition of the present invention, whereby at least about 10% of the fermentable sugars are liberated. In some embodiments, at least about 15%, at least about 20%, or at least about 23% of the sugars are liberated.
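The liberation percentages recited above reduce to a simple ratio. The following is a minimal sketch, assuming that liberation is expressed as the mass of sugar released divided by the total mass of fermentable sugar potentially available in the DDG. The function names and the example masses are hypothetical and are not taken from this specification.

```python
from typing import List

# Percent thresholds recited in the paragraph above.
THRESHOLDS = (10.0, 15.0, 20.0, 23.0)


def percent_liberated(released_g: float, total_available_g: float) -> float:
    """Percent of available fermentable sugar that was released (assumed mass basis)."""
    if total_available_g <= 0:
        raise ValueError("total_available_g must be positive")
    return 100.0 * released_g / total_available_g


def thresholds_met(pct: float) -> List[float]:
    """Return every recited threshold that the measured percentage reaches."""
    return [t for t in THRESHOLDS if pct >= t]


# Hypothetical example: 12 g of sugar released out of 60 g potentially available.
pct = percent_liberated(12.0, 60.0)
print(pct, thresholds_met(pct))
```

How the total available sugar is determined (for example, by complete acid hydrolysis of the substrate) is left open here.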
[65] In some embodiments, the present invention further comprises a pretreatment process for pretreating the lignocellulosic material.
[66] In some embodiments, the pretreatment process is selected from the group consisting of physical treatment, metal ion, ultraviolet light, ozone, organosolv treatment, steam explosion treatment, lime impregnation with steam explosion treatment, hydrogen peroxide treatment, hydrogen peroxide/ozone (peroxone) treatment, acid treatment, dilute acid treatment, and base treatment. In some embodiments, the pretreatment process is selected from the group consisting of organosolv, steam explosion, heat treatment and AFEX. In some embodiments, the heat treatment comprises heating the lignocellulosic material to 121°C for 15 minutes.
[67] In some embodiments, the present invention further comprises detoxifying the lignocellulosic material.
[68] In some embodiments, the present invention further comprises recovering the fermentable sugar.
[69] In some embodiments, the sugar is selected from the group consisting of glucose, xylose, arabinose, galactose, mannose, rhamnose, sucrose and fructose.
[70] In some embodiments, the present invention further comprises recovering the contacted lignocellulosic material after the fermentable sugars are degraded.
[71] In some embodiments, the present invention comprises a feed additive comprising the recovered lignocellulosic material of the present invention. In some embodiments, the protein content of the recovered lignocellulosic material is higher than that of the starting lignocellulosic material.
[72] In some embodiments, the present invention comprises methods of improving the performance of an animal which comprises administering to the animal the feed additive of the present invention.
[73] In some embodiments, the present invention comprises methods for improving the nutritional quality of an animal feed comprising adding the feed additive of the present invention to an animal feed.
[74] In some embodiments, the present invention comprises methods for stonewashing a fabric, comprising contacting the fabric with at least one isolated protein of the present invention.
[75] In some embodiments, the present invention comprises methods for stonewashing a fabric, comprising contacting the fabric with at least one multi-enzyme composition of the present invention.
[76] In some embodiments, the fabric is denim.
[77] In some embodiments, the present invention comprises methods for enhancing the softness or feel of a fabric or depilling a fabric, comprising contacting the fabric with at least one isolated protein of the present invention, or a fragment thereof comprising a carbohydrate binding module (CBM) of the protein.
[78] In some embodiments, the present invention comprises methods for enhancing the softness or feel of a fabric or depilling a fabric, comprising contacting the fabric with at least one multi-enzyme composition of the present invention.
[79] In some embodiments, the present invention comprises methods for restoring color to or brightening a fabric, comprising contacting the fabric with at least one isolated protein of the present invention.
[80] In some embodiments, the present invention comprises methods for restoring color to or brightening a fabric, comprising contacting the fabric with at least one multi-enzyme composition of the present invention.
[81] In some embodiments, the present invention comprises methods of biopolishing, defibrillating, bleaching, dyeing or desizing a fabric, comprising contacting the fabric with at least one isolated protein of the present invention.
[82] In some embodiments, the present invention comprises methods of biopolishing, defibrillating, bleaching, dyeing or desizing a fabric, comprising contacting the fabric with at least one multi-enzyme composition of the present invention.
[83] In some embodiments, the present invention comprises methods of biorefining, deinking or biobleaching paper or pulp, comprising contacting the paper or pulp with at least one isolated protein of the present invention.
[84] In some embodiments, the present invention comprises methods of biorefining, deinking or biobleaching paper or pulp, comprising contacting the paper or pulp with at least one multi-enzyme composition of the present invention.
[85] In some embodiments, the present invention comprises methods for enhancing the cleaning ability of a detergent composition, comprising adding at least one isolated protein of the present invention to the detergent composition.
[86] In some embodiments, the present invention comprises methods for enhancing the cleaning ability of a detergent composition, comprising adding at least one multi-enzyme composition of the present invention to the detergent composition.
[87] In some embodiments, the present invention comprises a detergent composition, comprising at least one isolated protein of the present invention and at least one surfactant.
[88] In some embodiments, the present invention comprises a detergent composition, comprising at least one multi-enzyme composition of the present invention and at least one surfactant.
[89] In some embodiments, the present invention comprises methods for releasing cellular contents comprising contacting a cell with at least one protein of the present invention.
[90] In some embodiments, the cell may be a bacterium, an algal cell, a fungal cell or a plant cell. In preferred embodiments, the cell is an algal cell.
[91] In some embodiments, contacting the cell with at least one protein of the present invention degrades the cell wall.
[92] In some embodiments, the cellular contents are selected from the group consisting of: alcohols and oils.
[93] In some embodiments, the present invention comprises compositions for degrading cell walls comprising at least one protein of the present invention.
[94] In some embodiments, the present invention comprises methods for improving the nutritional quality of food comprising adding to the food at least one protein of the present invention.
[95] In some embodiments, the present invention comprises methods for improving the nutritional quality of food comprising pretreating the food with at least one protein of the present invention.
[96] In some embodiments, the present invention comprises methods for improving the nutritional quality of animal feed comprising adding to the animal feed at least one protein of the present invention.
[97] In some embodiments, the present invention comprises methods for improving the nutritional quality of animal feed comprising pretreating the feed with at least one isolated protein of the present invention.
[98] In some embodiments, the present invention comprises a genetically modified organism comprising at least one nucleic acid molecule encoding a protein of the present invention, in which the activity of one or more of the proteins is upregulated, the activity of one or more of the proteins is downregulated, or the activity of one or more of the proteins is upregulated and the activity of one or more of the proteins is downregulated.
[99] BRIEF DESCRIPTION OF THE FIGURES
[100] Figure 1: Chromatogram of curdlan before (blank) and after digestion with the β-glucanase Lam1 (1 and 4 h). The experiments were performed at pH 5.0, 50°C during 1 and 4 hours. G=glucose.
[101] Figure 2: Chromatogram of cellulose and amylose before (blank) and after digestion with the β-glucanase Lam1 (4 h). The experiments were performed at pH 5.0, 50°C during 4 hours.
[102] Figure 3: Chromatogram of linear arabinan, branched arabinan and oat spelt xylan before (blank) and after digestion with the β-glucanase Lam1 (4 h). The experiments were performed at pH 5.0, 50°C during 4 hours.
[103] Figure 4: Chromatogram of potato galactan and larch galactan before (blank) and after digestion with the β-glucanase Lam1 (4 h). The experiments were performed at pH 5.0, 50°C during 4 hours.
[104] Figure 5: MS diagram of methyl-4-O-methyl-glucuronic acid treated by Gue1 at 0 hours and 4 hours of incubation. The experiments were performed at pH 5.0, 50°C during 4 hours.
[105] Figure 6: MS diagram of methyl-4-O-methyl-glucuronic acid treated by Gue2 at 0 hours and 4 hours of incubation. The experiments were performed at pH 5.0, 50°C during 4 hours.
[106] Figure 7: Activity of the type B feruloyl esterase FaeB3 towards wheat bran oligomers (Upper), and sugar beet pulp oligomers (Lower). The release of ferulic acid is indicated as the absorbance at 310 nm.
[107] Figure 8A: HPAEC diagram of the incubation of Agu2 alone with aldouronic acids. The incubations have been performed at 50°C and pH 5 during 16 hours.
[108] Figure 8B: HPAEC diagram of the incubation of Agu2 in combination with a GH10 xylanase from Myceliophthora thermophila C1 on aldouronic acids. The incubations have been performed at 50°C and pH 5 during 16 hours.
[109] Figure 9A: HPAEC diagram of the incubation of Gxh1 with xylopentaose (solid line) and reduced xylopentaose (dashed line). Incubation was performed at 50°C, pH 5 during 16 hours.
[110] Figure 9B: HPAEC diagram of the incubation of Gxh2 with xylopentaose (solid line) and reduced xylopentaose (dashed line). Incubation was performed at 50°C, pH 5 during 16 hours.
[111] Figure 10: Oligosaccharide release from birch wood xylan when incubated with Gxh1, Gxh2, a GH10 xylanase (X10) and a GH11 xylanase (X11) from Myceliophthora thermophila C1 alone or in combination. Incubation was performed at 50°C at pH 5 during 1 hour.
[112] Figure 11. Temperature profile of Aga1. Incubations were performed on pNP-α-D-galactoside at pH 5 during 10 minutes.
[113] Figure 12. Temperature profile of Aga2. Incubations were performed on pNP-α-D-galactoside at pH 5 during 10 minutes.
[114] Figure 13. pH-profile of Aga1. Incubations were performed on pNP-α-D-galactoside at 40°C during 10 minutes.
[115] Figure 14. pH-profile of Aga2. Incubations were performed on pNP-α-D-galactoside at 40°C during 10 minutes.
[116] Figure 15. Temperature profile of Man9. Incubations were performed on pNP-α-D-mannoside at pH 5 during 10 minutes.
[117] Figure 16. pH-profiles of Man9. Incubations were performed on pNP-α-D-mannoside at 40°C during 10 minutes.
[118] DETAILED DESCRIPTION OF THE INVENTION
[119] The present invention relates generally to proteins that play a role in the degradation of cellulose and hemicellulose and nucleic acids encoding the same. In particular, the present invention relates to enzymes isolated from a filamentous fungal strain denoted herein as C1 (Accession No. VKM F-3500-D), nucleic acids encoding the enzymes, and methods of producing and using the enzymes. The invention also provides compositions that include at least one of the enzymes described herein for uses including, but not limited to, the hydrolysis of lignocellulose. The invention stems, in part, from the discovery of a variety of novel cellulases and hemicellulases produced by the C1 fungus that exhibit high activity toward cellulose and other components of biomass.
[120] The present invention also provides methods and compositions for the conversion of plant biomass to fermentable sugars that can, in turn, be converted to useful products. Such products may include, without limitation, metabolites, bioplastics, biopolymers and biofuels. The methods include methods for degrading lignocellulosic material using enzyme mixtures to liberate sugars. The compositions of the invention include enzyme combinations that break down lignocellulose.
[121] As used herein, the terms "biomass" or "lignocellulosic material" include materials containing cellulose and/or hemicellulose. Generally, these materials also contain pectin, lignin, protein, carbohydrates (such as starch and sugar) and ash. Lignocellulose is generally found, for example, in the stems, leaves, hulls, husks, and cobs of plants or leaves, branches, and wood of trees.
[122] The process of converting less or more complex carbohydrates (such as starch, cellulose or hemicellulose) into fermentable sugars is also referred to herein as "saccharification."
[123] "Fermentable sugars," as used herein, refers to simple sugars, such as glucose, xylose, arabinose, galactose, mannose, rhamnose, sucrose and fructose.
[124] Biomass can include virgin biomass and/or non-virgin biomass such as agricultural biomass, commercial organics, construction and demolition debris, municipal solid waste, waste paper and yard waste. Common forms of biomass include trees, shrubs and grasses, wheat, wheat straw, sugar cane bagasse, sugar beet, soybean, corn, corn husks, corn kernel including fiber from kernels, products and byproducts from milling of grains such as corn, tobacco, wheat and barley (including wet milling and dry milling) as well as municipal solid waste, waste paper and yard waste. The biomass can also be, but is not limited to, herbaceous material, agricultural residues, forestry residues, municipal solid wastes, waste paper, and pulp and paper mill residues. "Agricultural biomass" includes branches, bushes, canes, corn and corn husks, energy crops, algae, fruits, flowers, grains, grasses, herbaceous crops, leaves, bark, needles, logs, roots, saplings, short rotation woody crops, shrubs, switch grasses, trees, vegetables, fruit peels, vines, sugar beet pulp, wheat middlings, oat hulls, peat moss, mushroom compost and hard and soft woods (not including woods with deleterious materials). In addition, agricultural biomass includes organic waste materials generated from agricultural processes including farming and forestry activities, specifically including forestry wood waste. Agricultural biomass may be any of the aforementioned singularly or in any combination or mixture thereof.
[125] Energy crops are fast-growing crops that are grown for the specific purpose of producing energy, including without limitation, biofuels, from all or part of the plant. Energy crops can include crops that are grown (or are designed to grow) for their increased cellulose, xylose and sugar contents. Examples of such plants include, without limitation, switchgrass, willow and poplar. Energy crops may also include algae, for example, designer algae that are genetically engineered for enhanced production of hydrogen, alcohols, and oils, which can be further processed into diesel and jet fuels, as well as other bio-based products.
[126] Biomass high in starch, sugar, or protein such as corn, grains, fruits and vegetables is usually consumed as food. Conversely, biomass high in cellulose, hemicellulose and lignin is not readily digestible and is primarily utilized for wood and paper products, animal feed, or fuel, or is typically disposed of. Generally, the substrate is of high lignocellulose content, including distillers' dried grains, corn stover, corn cobs, rice straw, wheat straw, hay, sugarcane bagasse, sugar cane pulp, citrus peels and other agricultural biomass, switchgrass, forestry wastes, poplar wood chips, pine wood chips, sawdust, yard waste, and the like, including any combination thereof.
[127] In one embodiment, the lignocellulosic material is distillers' dried grains (DDG). DDG (also known as dried distiller's grain, or distiller's spent grain) is spent, dried grains recovered after alcohol fermentation. The lignocellulosic material can also be distiller's dried grain with soluble material recycled back (DDGS). While reference will be made herein to DDG for convenience and simplicity, it should be understood that both DDG and DDGS are contemplated as desired lignocellulosic materials. These are largely considered to be waste products and can be obtained after the fermentation of the starch derived from any of a number of grains, including corn, wheat, barley, oats, rice and rye. In one embodiment the DDG is derived from corn.
[128] It should be noted that the distiller's grains do not necessarily have to be dried. Although the grains are currently normally dried, water and enzymes are added to the DDG substrate in the present invention. If the saccharification were done on site, the drying step could be eliminated and enzymes could be added to the distiller's grains without drying.
[129] Due in part to the many components that comprise biomass and lignocellulosic materials, enzymes or a mixture of enzymes capable of degrading xylan, lignin, protein, and carbohydrates are needed to achieve saccharification. The present invention includes enzymes or compositions thereof with, for example, cellobiohydrolase, endoglucanase, xylanase, β-glucosidase, and hemicellulase activities.
[130] Fermentable sugars can be converted to useful value-added fermentation products, non-limiting examples of which include amino acids, vitamins, pharmaceuticals, animal feed supplements, specialty chemicals, chemical feedstocks, plastics, solvents, fuels, or other organic polymers, lactic acid, and ethanol, including fuel ethanol. Specific value-added products that may be produced by the methods of the invention include, but are not limited to, biofuels (including ethanol and butanol); lactic acid; plastics; specialty chemicals; organic acids, including citric acid, succinic acid and maleic acid; solvents; animal feed supplements; pharmaceuticals; vitamins; amino acids, such as lysine, methionine, tryptophan, threonine, and aspartic acid; industrial enzymes, such as proteases, cellulases, amylases, glucanases, xylanases, arabinanases, lactases, lipases, esterases, lyases, oxidoreductases, transferases; and chemical feedstocks.
[131] The enzymes of the present invention may also be used for stone washing cellulosic fabrics such as cotton (e.g., denim), linen, hemp, ramie, cupro, lyocell, newcell, rayon and the like. See, for example, U.S. Patent No. 6,015,707. The enzymes and compositions of the present invention are suitable for industrial textile applications in addition to the stone washing process. For example, cellulases are used in detergent compositions, either for the purpose of enhancing the cleaning ability of the composition or as a softening agent. When so used, the cellulase will degrade a portion of the cellulosic material, e.g., cotton fabric, in the wash, which facilitates the cleaning and/or softening of the fabric. The endoglucanase components of fungal cellulases have also been used for the purposes of enhancing the cleaning ability of detergent compositions, for use as a softening agent, and for use in improving the feel of cotton fabrics, and the like. Enzymes and compositions of the present invention may also be used in the treatment of paper pulp (e.g., for improving the drainage or for de-inking of recycled paper) or for the treatment of wastewater streams (e.g., to hydrolyze waste material containing cellulose, hemicellulose and pectins to soluble lower molecular weight polymers).
[132] The enzymes of the present invention may also be used to release the contents of a cell. In some embodiments, contacting or mixing the cells with the enzymes of the present invention will degrade the cell walls, resulting in cell lysis and release of the cellular contents. Such cells can include bacteria, plant cells, fungi including yeasts, and algae. For example, the enzymes of the present invention may be used to degrade the cell walls of algal cells in order to release the materials contained within the algal cells. In some embodiments, such materials may include, without limitation, alcohols and oils. The alcohols and oils so released can be further processed to produce diesel, jet fuels, as well as other economically important bio-products.
[133] The enzymes of the present invention may be used alone, or in combination with other enzymes, chemicals or biological materials. The enzymes of the present invention may be used for in vitro applications in which the enzymes or mixtures thereof are added to or mixed with the appropriate substrates to catalyze the desired reactions. Additionally, the enzymes of the present invention may be used for in vivo applications in which nucleic acid molecules encoding the enzymes are introduced into cells and are expressed therein to produce the enzymes and catalyze the desired reactions within the cells. For example, in some embodiments, enzymes capable of promoting cell wall degradation may be added to algal cells suspended in solutions to degrade the algal cell walls and release their content, whereas in some embodiments, nucleic acid molecules encoding such enzymes may be introduced into the algal cells to express the enzymes therein, so that these enzymes can degrade the algal cell walls from within. Some embodiments may combine the in vitro applications with the in vivo applications. For example, nucleic acids encoding enzymes capable of catalyzing cell wall degradation may be introduced into algal cells to express the enzymes in those cells and to degrade their cell walls, while enzymes may also be added to or mixed with the cells to further promote the cell wall degradation. In some embodiments, the enzymes used for in vitro applications may be different from the enzymes used for in vivo applications. For example, an enzyme with laminarinase activity may be mixed with the cells, while an enzyme with xyloglucanase activity is expressed within the cells.
[134] In one aspect, the present invention includes proteins isolated from, or derived from the knowledge of enzymes from, a fungus such as Myceliophthora thermophila or a mutant or other derivative thereof, and more particularly, from the fungal strain denoted herein as C1 (Accession No. VKM F-3500-D). Myceliophthora thermophila has previously appeared in patent applications and in the literature as Chrysosporium lucknowense or Sporotrichum thermophile. Preferably, the proteins of the invention possess enzymatic activity. As described in U.S. Patent No. 6,015,707 or U.S. Patent No. 6,573,086, a strain called C1 (Accession No. VKM F-3500-D) was isolated from samples of forest alkaline soil from Sola Lake, Far East of the Russian Federation. This strain was deposited at the All-Russian Collection of Microorganisms of Russian Academy of Sciences (VKM), Bakhurhina St. 8, Moscow, Russia, 113184, under the terms of the Budapest Treaty on the International Regulation of the Deposit of Microorganisms for the Purposes of Patent Procedure on August 29, 1996, as Chrysosporium lucknowense Garg 27K, VKM-F 3500 D. Various mutant strains of Myceliophthora thermophila (C. lucknowense) C1 have been produced and these strains have also been deposited at the All-Russian Collection of Microorganisms of Russian Academy of Sciences (VKM), Bakhurhina St. 8, Moscow, Russia, 113184, under the terms of the Budapest Treaty on the International Regulation of the Deposit of Microorganisms for the Purposes of Patent Procedure on September 2, 1998 or at the Centraal Bureau voor Schimmelcultures (CBS), Uppsalalaan 8, 3584 CT Utrecht, The Netherlands for the purposes of Patent Procedure on December 5, 2007. For example, strain C1 was mutagenised by subjecting it to ultraviolet light to generate strain UV13-6 (Accession No. VKM F-3632 D). This strain was subsequently further mutated with N-methyl-N'-nitro-N-nitrosoguanidine to generate strain NG7C-19 (Accession No. VKM F-3633 D). This latter strain in turn was subjected to mutation by ultraviolet light, resulting in strain UV18-25 (Accession No. VKM F-3631 D). This strain in turn was again subjected to mutation by ultraviolet light, resulting in strain W1L (Accession No. CBS 122189), which was subsequently subjected to mutation by ultraviolet light, resulting in strain W1L#100L (Accession No. CBS 122190). Strain C1 was previously classified as a Chrysosporium lucknowense based on morphological and growth characteristics of the microorganism, as discussed in detail in U.S. Patent No. 6,015,707, U.S. Patent No. 6,573,086 and patent PCT/NL2010/000045.
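The mutagenesis lineage described in the preceding paragraph can be held in a small lookup table. The sketch below is illustrative only: the strain names, mutagens and accession numbers are those recited above (with the parent strain written as C1), while the table layout and the `ancestry` helper are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# strain -> (parent strain, mutagen applied to the parent, accession number of the strain)
LINEAGE: Dict[str, Tuple[Optional[str], Optional[str], str]] = {
    "C1": (None, None, "VKM F-3500-D"),
    "UV13-6": ("C1", "ultraviolet light", "VKM F-3632 D"),
    "NG7C-19": ("UV13-6", "N-methyl-N'-nitro-N-nitrosoguanidine", "VKM F-3633 D"),
    "UV18-25": ("NG7C-19", "ultraviolet light", "VKM F-3631 D"),
    "W1L": ("UV18-25", "ultraviolet light", "CBS 122189"),
    "W1L#100L": ("W1L", "ultraviolet light", "CBS 122190"),
}


def ancestry(strain: str) -> List[str]:
    """Return the chain of strains from the original isolate down to `strain`."""
    path: List[str] = []
    current: Optional[str] = strain
    while current is not None:
        path.append(current)
        current = LINEAGE[current][0]  # step up to the parent strain
    return list(reversed(path))


print(" -> ".join(ancestry("W1L#100L")))
```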
[135] In certain embodiments of the present invention, a protein of the invention comprises, consists essentially of, or consists of an amino acid sequence selected from Tables 1 and 2. The present invention also includes homologues or variants of any of the above sequences, including fragments and sequences having a given identity to any of the above sequences, wherein the homologue, variant, or fragment has at least one biological activity of the wild-type protein, as described herein.
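As an illustration of what "a given identity" between two sequences means, the following minimal sketch computes percent identity over two sequences that have already been aligned. It assumes gaps are written as "-" and that identity is counted over all alignment columns; both are assumptions made for this example, and any definition of identity or alignment method given elsewhere in this specification controls. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 10-residue example with a single substitution at the last position.
print(percent_identity("MKTAYIAKQR", "MKTAYIAKQK"))
```

A protein would meet an "at least about 95% identical" criterion under this convention when the returned value is roughly 95 or higher.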
[136] In general, the proteins disclosed herein possess carbohydrase enzymatic activity, or the ability to degrade carbohydrate-containing materials. A review of enzymes involved in the degradation of polysaccharides can be found in de Vries et al., Microbiol. Mol. Biol. Rev. 65:497-522 (2001). More specifically, the proteins may possess cellulase activity such as endoglucanase activity (e.g., 1,4-β-D-glucan-4-glucanohydrolases), exoglucanase activity (e.g., 1,4-β-D-glucan cellobiohydrolases), and β-glucosidase activity. The proteins may possess hemicellulase activity such as endoxylanase activity, exoxylanase activity, or β-xylosidase activity. The proteins may possess laminarinase, xyloglucanase, galactanase, glucoamylase, pectate lyase, chitosanase, exo-β-D-glucosaminidase, cellobiose dehydrogenase, acetylxylan esterase, ligninase, amylase, glucuronidase, ferulic acid esterase, arabinofuranosidase, pectin methyl esterase, arabinase, lipase, glucosidase, β-hexosaminidase, rhamnogalacturonan acetylesterase, exo-rhamnogalacturonase, rhamnogalacturonan lyase, exo-polygalacturonase, lichenase, pectin lyase, pectate lyase, β-mannanase, endo-1,6-α-mannanase, or glucomannanase activities. Physical properties, biochemical characteristics and substrate specificities of proteins of the present invention are illustrated below.
[137] As used herein, "carbohydrase" refers to any protein that catalyzes the hydrolysis of carbohydrates. "Glycoside hydrolase", "glycosyl hydrolase" or "glycosidase" refers to a protein that catalyzes the hydrolysis of the glycosidic bonds between carbohydrates or between a carbohydrate and a non-carbohydrate residue. Endoglucanases, cellobiohydrolases, β-glucosidases, α-glucosidases, xylanases, β-xylosidases, alpha-xylosidases, galactanases, α-galactosidases, β-galactosidases, α-amylases, glucoamylases, endo-arabinases, arabinofuranosidases, mannanases, β-mannosidases, pectinases, acetyl xylan esterases, acetyl mannan esterases, ferulic acid esterases, coumaric acid esterases, pectin methyl esterases, and chitosanases are examples of glycosidases.
[138] "Cellulase" refers to a protein that catalyzes the hydrolysis of 1,4-β-D-glycosidic linkages in cellulose (such as bacterial cellulose, cotton, filter paper, phosphoric acid swollen cellulose, Avicel®); cellulose derivatives (such as carboxymethylcellulose and hydroxyethylcellulose); plant lignocellulosic materials, beta-D-glucans or xyloglucans. Cellulose is a linear beta-(1-4) glucan consisting of anhydrocellobiose units. Endoglucanases, cellobiohydrolases, and β-glucosidases are examples of cellulases.
[139] "Endoglucanase" refers to a protein that catalyzes the hydrolysis of cellulose to oligosaccharide chains at random locations by means of an endoglucanase activity.
[140] "Cellobiohydrolase" refers to a protein that catalyzes the hydrolysis of cellulose to cellobiose via an exoglucanase activity, sequentially releasing molecules of cellobiose from the reducing or non-reducing ends of cellulose or cello-oligosaccharides. "β-glucosidase" refers to an enzyme that catalyzes the conversion of cellobiose and oligosaccharides to glucose.
[141] "Hemicellulase" refers to a protein that catalyzes the hydrolysis of hemicellulose, such as that found in lignocellulosic materials. Hemicelluloses are complex polymers, and their composition often varies widely from organism to organism, and from one tissue type to another. Hemicelluloses include a variety of compounds, such as xylans, arabinoxylans, xyloglucans, mannans, glucomannans, and galacto(gluco)mannans. Hemicellulose can also contain glucan, which is a general term for beta-linked glucose residues. In general, a main component of hemicellulose is beta-1,4-linked xylose, a five carbon sugar. However, this xylose is often branched as beta-1,3 linkages or beta-1,2 linkages, and can be substituted with linkages to arabinose, galactose, mannose, glucuronic acid, or by esterification to acetic acid. The composition, nature of substitution, and degree of branching of hemicellulose are very different in dicotyledonous plants (dicots, i.e., plants whose seeds have two cotyledons or seed leaves such as lima beans, peanuts, almonds, peas, kidney beans) as compared to monocotyledonous plants (monocots; i.e., plants having a single cotyledon or seed leaf such as corn, wheat, rice, grasses, barley). In dicots, hemicellulose is comprised mainly of xyloglucans that are 1,4-beta-linked glucose chains with 1,6-alpha-linked xylosyl side chains. In monocots, including most grain crops, the principal components of hemicellulose are heteroxylans. These are primarily comprised of 1,4-beta-linked xylose backbone polymers with 1,2- or 1,3-alpha linkages to arabinose, linkage of galactose and mannose to arabinose or xylose in side chains, as well as xylose modified by ester-linked acetic acids. Also present are branched beta glucans comprised of 1,3- and 1,4-beta-linked glucosyl chains. In monocots, cellulose, heteroxylans and beta glucans are present in roughly equal amounts, each comprising about 15-25% of the dry matter of cell walls. Hemicellulolytic enzymes, i.e. hemicellulases, include both endo-acting and exo-acting enzymes, such as xylanases, β-xylosidases, alpha-xylosidases, galactanases, α-galactosidases, β-galactosidases, endo-arabinases, arabinofuranosidases, mannanases, and β-mannosidases. Hemicellulases also include the accessory enzymes, such as acetylesterases, ferulic acid esterases, and coumaric acid esterases. Among these, xylanases and acetyl xylan esterases cleave the xylan and acetyl side chains of xylan and the remaining xylo-oligomers are unsubstituted and can thus be hydrolysed with β-xylosidase only. In addition, several less known side activities have been found in enzyme preparations which hydrolyze hemicellulose. Accordingly, xylanases, acetylesterases and β-xylosidases are examples of hemicellulases.
[142] "Xylanase" specifically refers to an enzyme that hydrolyzes the β-1,4 bond in the xylan backbone, producing short xylooligosaccharides.
[143] "β-Mannanase" or "endo-1,4-β-mannosidase" refers to a protein that hydrolyzes mannan-based hemicelluloses (mannan, glucomannan, galacto(gluco)mannan) and produces short β-1,4-mannooligosaccharides.
[144] "Mannan endo-1,6-α-mannosidase" refers to a protein that hydrolyzes 1,6-α-mannosidic linkages in unbranched 1,6-mannans.
[145] "β-Mannosidase" (β-1,4-mannoside mannohydrolase; EC 3.2.1.25) refers to a protein that catalyzes the removal of β-D-mannose residues from the nonreducing ends of oligosaccharides.
[146] "Galactanase", "endo-β-1,6-galactanase" or "arabinogalactan endo-1,4-β-galactosidase" refers to a protein that catalyzes the hydrolysis of endo-1,4-β-D-galactosidic linkages in arabinogalactans.
[147] "Glucoamylase" refers to a protein that catalyzes the hydrolysis of terminal 1,4-linked α-D-glucose residues successively from non-reducing ends of the glycosyl chains in starch with the release of β-D-glucose.
[148] "β-hexosaminidase" or "β-N-acetylglucosaminidase" refers to a protein that catalyzes the hydrolysis of terminal N-acetyl-D-hexosamine residues in N-acetyl-β-D-hexosamines.
[149] "α-L-arabinofuranosidase", "α-N-arabinofuranosidase", "α-arabinofuranosidase", "arabinosidase" or "arabinofuranosidase" refers to a protein that hydrolyzes arabinofuranosyl-containing hemicelluloses or pectins. Some of these enzymes remove arabinofuranoside residues from O-2 or O-3 single substituted xylose residues, as well as from O-2 and/or O-3 double substituted xylose residues. Some of these enzymes remove arabinose residues from arabinan oligomers.
[150] "Endo-arabinase" refers to a protein that catalyzes the hydrolysis of 1,5-α-arabinofuranosidic linkages in 1,5-arabinans.
[151] "Exo-arabinase" refers to a protein that catalyzes the hydrolysis of 1,5-α-linkages in 1,5-arabinans or 1,5-α-L-arabino-oligosaccharides, releasing mainly arabinobiose, although a small amount of arabinotriose can also be liberated.
[152] "β-xylosidase" refers to a protein that hydrolyzes short 1,4-β-D-xylooligomers into xylose.
[153] "Cellobiose dehydrogenase" refers to a protein that oxidizes cellobiose to cellobionolactone.
[154] "Chitosanase" refers to a protein that catalyzes the endohydrolysis of β-1,4-linkages between D-glucosamine residues in acetylated chitosan (i.e., deacetylated chitin).
[155] "Exo-polygalacturonase" refers to a protein that catalyzes the hydrolysis of terminal α-1,4-linked galacturonic acid residues from non-reducing ends, thus converting polygalacturonides to galacturonic acid.
[156] "Acetyl xylan esterase" refers to a protein that catalyzes the removal of the acetyl groups from xylose residues. "Acetyl mannan esterase" refers to a protein that catalyzes the removal of the acetyl groups from mannose residues. "Ferulic esterase" or "ferulic acid esterase" refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and ferulic acid. "Coumaric acid esterase" refers to a protein that hydrolyzes the ester bond between the arabinose substituent group and coumaric acid. Acetyl xylan esterases, ferulic acid esterases and pectin methyl esterases are examples of carbohydrate esterases.
[157] "Pectate lyase" and "pectin lyases" refer to proteins that catalyze the cleavage of 1,4-α-D-galacturonan by beta-elimination acting on polymeric and/or oligosaccharide substrates (pectates and pectins, respectively).
[158] "Endo-1,3-β-glucanase" or "laminarinase" refers to a protein that catalyzes the cleavage of 1,3-linkages in β-D-glucans such as laminarin or lichenin. Laminarin is a linear polysaccharide made up of β-1,3-glucan with β-1,6-linkages.
[159] "Lichenase" refers to a protein that catalyzes the hydrolysis of lichenan, a linear 1,3-1,4-β-D-glucan.
[160] Rhamnogalacturonan is composed of alternating α-1,4-rhamnose and α-1,2-linked galacturonic acid, with side chains linked 1,4 to rhamnose. The side chains include Type I galactan, which is β-1,4-linked galactose with α-1,3-linked arabinose substituents; Type II galactan, which is β-1,3-1,6-linked galactoses (very branched) with arabinose substituents; and arabinan, which is α-1,5-linked arabinose with α-1,3-linked arabinose branches. The galacturonic acid substituents may be acetylated and/or methylated.
[161] "Exo-rhamnogalacturonanase" refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin from the nonreducing end.
[162] "Rhamnogalacturonan acetylesterase" refers to a protein that catalyzes the removal of the acetyl groups ester-linked to the highly branched rhamnogalacturonan (hairy) regions of pectin.
[163] "Rhamnogalacturonan lyase" refers to a protein that catalyzes the degradation of the rhamnogalacturonan backbone of pectin via a β-elimination mechanism (see, e.g., Pages et al., J. Bacteriol. 185:4727-4733 (2003)).
[164] "Alpha-rhamnosidase" refers to a protein that catalyzes the hydrolysis of terminal non-reducing α-L-rhamnose residues in α-L-rhamnosides.
[165] Glycosidases (glycoside hydrolases; GH), a large family of enzymes that includes cellulases and hemicellulases, catalyze the hydrolysis of glycosidic linkages, predominantly in carbohydrates. Glycosidases such as the proteins of the present invention may be assigned to families on the basis of sequence similarities, and there are now over 100 different such families defined (see the CAZy (Carbohydrate Active EnZymes database) website, maintained by the Architecture et Fonction des Macromolécules Biologiques of the Centre National de la Recherche Scientifique, which describes the families of structurally-related catalytic and carbohydrate-binding modules (or functional domains) of enzymes that degrade, modify, or create glycosidic bonds; Coutinho, P.M. & Henrissat, B. (1999) Carbohydrate-active enzymes: an integrated database approach. In "Recent Advances in Carbohydrate Bioengineering", H.J. Gilbert, G. Davies, B. Henrissat and B. Svensson eds., The Royal Society of Chemistry, Cambridge, pp. 3-12). Because there is a direct relationship between the amino acid sequence of a protein and its folding similarities, such a classification reflects the structural features of these enzymes and their substrate specificity. Such a classification system can help to reveal the evolutionary relationships between these enzymes and provide a convenient tool to determine information such as an enzyme's activity and function. Thus, enzymes assigned to a particular family based on sequence homology with other members of the family are expected to have similar enzymatic activities and related substrate specificities. CAZy family classifications also exist for glycosyltransferases (GT), polysaccharide lyases (PL), and carbohydrate esterases (CE).
Likewise, sequence homology may be used to identify particular domains within proteins, such as carbohydrate binding modules (CBMs; also known as carbohydrate binding domains (CBDs), sometimes called cellulose binding domains). The CAZy homologies of proteins of the present invention are disclosed below. An enzyme assigned to a particular CAZy family may exhibit one or more of the enzymatic activities or substrate specificities associated with the CAZy family. In other embodiments, the enzymes of the present invention may exhibit one or more of the enzyme activities discussed above.
[166] Certain proteins of the present invention may be classified as "Family 61 glycosidases" based on homology of the polypeptides to CAZy Family GH61. Family 61 glycosidases may exhibit cellulolytic enhancing activity or endoglucanase activity. Additional information on the properties of Family 61 glycosidases may be found in U.S. Patent Application Publication Nos. 2005/0191736, 2006/0005279, 2007/0077630, and in PCT Publication No. WO 2004/031378.
[167] As used herein, "cellulolytic enhancing activity" refers to a biological activity that enhances the hydrolysis of a cellulosic material by proteins having cellulolytic activity. In other words, saccharifying a cellulosic material with a cellulolytic protein in the presence of a Family 61 glycosidase may increase the degradation of cellulosic material compared to the presence of only the cellulolytic protein. The cellulosic material can be any material containing cellulose. The cellulolytic activity is a biological activity that hydrolyzes a cellulosic material. Cellulolytic enhancing activity can be determined by measuring the increase in sugars from the hydrolysis of a cellulosic material by cellulolytic protein.
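As a rough illustration of the measurement described in [167], the enhancement can be expressed as the ratio of sugar released with and without the Family 61 glycosidase. The sketch below is not taken from this disclosure: the function names and the sugar values are hypothetical, and it assumes that both reactions use the same substrate, cellulolytic protein loading, and incubation time.

```python
def enhancement_factor(sugar_with_gh61: float, sugar_cellulase_only: float) -> float:
    """Ratio of sugar released with vs. without the Family 61 glycosidase."""
    if sugar_cellulase_only <= 0:
        raise ValueError("cellulase-only control must release a positive amount of sugar")
    if sugar_with_gh61 < 0:
        raise ValueError("sugar release cannot be negative")
    return sugar_with_gh61 / sugar_cellulase_only


def percent_increase(sugar_with_gh61: float, sugar_cellulase_only: float) -> float:
    """Increase in released sugar, as a percentage of the cellulase-only control."""
    return (enhancement_factor(sugar_with_gh61, sugar_cellulase_only) - 1.0) * 100.0


# Hypothetical sugar concentrations (g/L) after hydrolysis; not measured data.
with_gh61 = 12.0
cellulase_only = 8.0

print(enhancement_factor(with_gh61, cellulase_only))
print(percent_increase(with_gh61, cellulase_only))
```

A factor above 1 would indicate enhancement under these assumptions; a factor near 1 would indicate none.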
[168] Proteins of the present invention may also include homologues, variants, and fragments of the proteins disclosed herein. The protein fragments include, but are not limited to, fragments comprising a catalytic domain (CD) and/or a carbohydrate binding module (CBM) (also known as a cellulose-binding domain; both can be referred to herein as CBM). The identity and location of domains within proteins of the present invention are disclosed in detail below. The present invention encompasses all combinations of the disclosed domains. For example, a protein fragment may comprise a CD of a protein but not a CBM of the protein or a CBM of a protein but not a CD. Similarly, domains from different proteins may be combined. Protein fragments comprising a CD, CBM or combinations thereof for each protein disclosed herein can be readily produced using standard techniques known in the art. In some embodiments, a protein fragment comprises a domain of a protein that has at least one biological activity of the full-length protein. Homologues or variants of proteins of the invention that have at least one biological activity of the full-length protein are described in detail below. As used herein, the phrase "biological activity" of a protein refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vitro or in vivo. In certain embodiments, a protein fragment comprises a domain of a protein that has the catalytic activity of the full-length enzyme. Specific biological activities of the proteins of the invention, and structures within the proteins that are responsible for the activities, are described below.
[169] Descriptions of the enzymes of the present invention are provided below, along with activities and homologies. Although each enzyme is expected to exhibit the activity exemplified below, enzymes of the present invention may also exhibit any of the enzyme activities or substrate specificities discussed throughout this disclosure.
[170] Esterases
[171] Esterases represent a category of various enzymes including lipases, phospholipases, cutinases, and phytases that catalyze the hydrolysis and synthesis of ester bonds in compounds.
[172] Applications of esterases in the food industry include, but are not limited to, degumming vegetable oils; improving the production of bread (e.g., in situ production of emulsifiers); producing crackers, noodles, and pasta; enhancing flavor development of cheese, butter, and margarine; ripening cheese; removing wax; trans-esterification of flavors and cocoa butter substitutes; synthesizing structured lipids for infant formula and nutraceuticals; improving the polyunsaturated fatty acid content in fish oil; and aiding in digestion and releasing minerals in food processing.
[173] Applications of esterases in the household industry include, but are not limited to, use in laundry and detergent (e.g., removal of stains); cleaning agents; and hydrolysis of tallow for laundry detergent.
[174] Applications of esterases in the publishing and printing industry include, but are not limited to, the removal of triglycerides, steryl esters, resin acids, free fatty acids, and sterols (e.g., lipophilic wood extractives).
[175] Applications of esterases in the bioenergy industry include, but are not limited to, the production of biodiesel and hydrolysis of hemicellulose.
[176] Applications of esterases in the feed industry include, but are not limited to, reducing the amount of phosphate in feed.
[177] Applications of esterases in other industries include, but are not limited to, use as a biocatalyst; sewage treatment; cleaning up oil pollution; the synthesis of esters; the synthesis of fragrances; enantio-specific catalysis of fine chemicals (e.g., esters for chemical and drug intermediates); the production of isopropyl myristate, isopropyl palmitate and 2-ethylpalmitate for use as emollients in personal care products; saving of energy and minimization of thermal degradation in the oleochemical industry; use as a feed additive; and enhancing the recovery of oil (e.g., during drilling).
[178] The enzyme Aes is encoded by the nucleic acid sequence represented by SEQ ID NO: 1 in Table 1. The Aes nucleic acid sequence encodes a 302 amino acid sequence, represented by SEQ ID NO: 2 in Table 1. We believe the signal peptide for Aes is located from about position 1 to about position 21 of the Aes amino acid sequence, with the mature protein spanning from about position 22 to about position 302 of the Aes amino acid sequence. Within Aes a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Aes spans from a starting point of about position 33 to an ending point of about position 293 of the Aes amino acid sequence. Based on homology, Aes can be assigned to CE16 of the CAZy families and is expected to have acetyl esterase activity. Aes possesses significant homology (about 65% from amino acids 30 to 302 of Aes) with a predicted protein from Neurospora crassa OR74A (Genbank Accession No. EAA28920.1). Aes also possesses significant homology (about 68% from amino acids 38 to 302 of Aes) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP65933). As evidenced below in Example 1, Aes possessed acetyl esterase activity.
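The positions given for Aes are 1-based and inclusive, so extracting a region from a sequence string requires shifting the start by one. The following is a minimal sketch of that bookkeeping, assuming the approximate boundaries stated above; the `Region` class and `slice_region` helper are hypothetical names introduced for illustration, and the placeholder string stands in for SEQ ID NO: 2 rather than reproducing it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def slice_region(sequence: str, region: Region) -> str:
    """Return the residues of `sequence` covered by `region`."""
    if not (1 <= region.start <= region.end <= len(sequence)):
        raise ValueError(f"{region.name}: positions fall outside the sequence")
    # Convert the 1-based inclusive start to a 0-based slice index.
    return sequence[region.start - 1:region.end]


AES_LENGTH = 302  # length stated for the Aes amino acid sequence
AES_REGIONS = [
    Region("signal peptide", 1, 21),
    Region("mature protein", 22, 302),
    Region("catalytic domain", 33, 293),
]

# Placeholder residues only; substitute the actual SEQ ID NO: 2 string.
placeholder = "X" * AES_LENGTH

for region in AES_REGIONS:
    fragment = slice_region(placeholder, region)
    print(region.name, len(fragment))
```

Because the boundaries are stated as approximate ("about position"), any fragment produced this way should be treated as a starting point for design rather than an exact domain limit.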
[179] The enzyme Gue1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 3 in Table 1. The Gue1 nucleic acid sequence encodes a 397 amino acid sequence, represented by SEQ ID NO: 4 in Table 1. We believe the signal peptide for Gue1 is located from about position 1 to about position 18 of the Gue1 amino acid sequence, with the mature protein spanning from about position 19 to about position 397 of the Gue1 amino acid sequence. Within Gue1 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Gue1 spans from a starting point of about position 19 to an ending point of about position 397 of the Gue1 amino acid sequence. Based on homology, Gue1 can be assigned to CE15 of the CAZy families and is expected to have glucuronyl esterase activity. Gue1 possesses significant homology (about 73% from amino acids 1 to 389 of Gue1) with a hypothetical protein NCU09445 from Neurospora crassa OR74A (Genbank Accession No. EAA29361.1). Gue1 also possesses significant homology (about 77% from amino acids 1 to 397 of Gue1) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP65970). As evidenced below in Example 3, Gue1 possessed glucuronyl esterase activity.
[180] The enzyme Gue2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 5 in Table 1. The Gue2 nucleic acid sequence encodes a 417 amino acid sequence, represented by SEQ ID NO: 6 in Table 1. We believe the signal peptide for Gue2 is located from about position 1 to about position 17 of the Gue2 amino acid sequence, with the mature protein spanning from about position 18 to about position 417 of the Gue2 amino acid sequence. Within Gue2 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Gue2 spans from a starting point of about position 18 to an ending point of about position 417 of the Gue2 amino acid sequence. Based on homology, Gue2 can be assigned to CE15 of the CAZy families and is expected to have glucuronyl esterase activity. Gue2 possesses significant homology (about 78% from amino acids 24 to 373 of Gue2) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP59671). Gue2 also possesses significant homology (about 33% from amino acids 59 to 391 of Gue2) with Cip2 from Hypocrea jecorina (Genbank Accession No. AAP57749). As evidenced below in Example 3, Gue2 possessed glucuronyl esterase activity.
[181] The enzyme FaeB3 is encoded by the nucleic acid sequence represented by SEQ ID NO: 7 in Table 1. The FaeB3 nucleic acid sequence encodes a 439 amino acid sequence, represented by SEQ ID NO: 8 in Table 1. We believe the signal peptide for FaeB3 is located from about position 1 to about position 21 of the FaeB3 amino acid sequence, with the mature protein spanning from about position 22 to about position 439 of the FaeB3 amino acid sequence. Within FaeB3 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of FaeB3 spans from a starting point of about position 101 to an ending point of about position 428 of the FaeB3 amino acid sequence. Based on homology, FaeB3 is expected to have feruloyl esterase activity. FaeB3 possesses significant homology (about 47% from amino acids 38 to 439 of FaeB3) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP64723). FaeB3 also possesses significant homology (about 56% from amino acids 1 to 439 of FaeB3) with a hypothetical protein CHGG_11054 from Chaetomium globosum CBS 148.5 (Genbank Accession No. EAQ82878). As evidenced below in Example 4, FaeB3 possessed feruloyl esterase activity.
[182] The enzyme FaeB4 is encoded by the nucleic acid sequence represented by SEQ ID NO: 9 in Table 1. The FaeB4 nucleic acid sequence encodes a 234 amino acid sequence, represented by SEQ ID NO: 10 in Table 1. We believe the signal peptide for FaeB4 is located from about position 1 to about position 19 of the FaeB4 amino acid sequence, with the mature protein spanning from about position 20 to about position 234 of the FaeB4 amino acid sequence. Within FaeB4 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of FaeB4 spans from a starting point of about position 73 to an ending point of about position 233 of the FaeB4 amino acid sequence. Based on homology, FaeB4 is expected to have feruloyl esterase activity. FaeB4 possesses significant homology (about 53% from amino acids 1 to 233 of FaeB4) with hypothetical protein MGG_08737 from Magnaporthe grisea 70-15 (Genbank Accession No. EDJ93992). FaeB4 also possesses significant homology (about 52% from amino acids 15 to 233 of FaeB4) with a putative feruloyl esterase from Aspergillus fumigatus A1163 (Genbank Accession No. EDP49472).
[183] The enzyme Fae8 is encoded by the nucleic acid sequence represented by SEQ ID NO: 11 in Table 1. The Fae8 nucleic acid sequence encodes a 373 amino acid sequence, represented by SEQ ID NO: 12 in Table 1. We believe the mature protein spans from about position 1 to about position 373 of the Fae8 amino acid sequence. Within Fae8 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Fae8 spans from a starting point of about position 91 to an ending point of about position 207 of the Fae8 amino acid sequence. Based on homology, Fae8 is expected to have feruloyl esterase activity. Fae8 possesses significant homology (about 50% from amino acids 7 to 362 of Fae8) with a hypothetical protein from Gibberella zeae PH-1 (Genbank Accession No. XP_389574). Fae8 also possesses significant homology (about 54% from amino acids 9 to 370 of Fae8) with EDK03956 (Genbank Accession No. XP367307).
[184] The enzyme FaeA3 is encoded by the nucleic acid sequence represented by SEQ ID NO: 13 in Table 1. The FaeA3 nucleic acid sequence encodes a 445 amino acid sequence, represented by SEQ ID NO: 14 in Table 1. We believe the signal peptide for FaeA3 is located from about position 1 to about position 20 of the FaeA3 amino acid sequence, with the mature protein spanning from about position 21 to about position 445 of the FaeA3 amino acid sequence. Within FaeA3 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of FaeA3 spans from a starting point of about position 83 to an ending point of about position 337 of the FaeA3 amino acid sequence. Based on homology, FaeA3 is expected to have feruloyl esterase activity. FaeA3 possesses significant homology (about 56% from amino acids 78 to 445 of FaeA3) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP60411). FaeA3 also possesses significant homology (about 56% from amino acids 78 to 432 of FaeA3) with EAA68101 (Genbank Accession No. XP381416).
[185] The enzyme Rgae2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 15 in Table 1. The Rgae2 nucleic acid sequence encodes a 253 amino acid sequence, represented by SEQ ID NO: 16 in Table 1. We believe the signal peptide for Rgae2 is located from about position 1 to about position 20 of the Rgae2 amino acid sequence, with the mature protein spanning from about position 21 to about position 253 of the Rgae2 amino acid sequence. Within Rgae2 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Rgae2 spans from a starting point of about position 38 to an ending point of about position 226 of the Rgae2 amino acid sequence. Based on homology, Rgae2 can be assigned to CE12 of the CAZy families and is expected to have rhamnogalacturonan acetyl esterase activity. Rgae2 may also possess pectin acetylesterase or acetyl xylan esterase activity. Rgae2 possesses significant homology (about 49% from amino acids 6 to 252 of Rgae2) with a protein from Aspergillus oryzae (Genbank Accession No. BAE63203). Rgae2 possesses rhamnogalacturonan acetyl esterase activity, which can be measured, for example, as described by Searle-van Leeuwen et al. (1992) Appl. Microbiol. Biotechnol. 38: 347-349 and in Searle-van Leeuwen et al. (1996) In: "Pectins and Pectinases", J. Visser and A. Voragen (eds), Progress in Biotechnology Vol. 14, Elsevier, Amsterdam, pp. 793-798.
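The esterase descriptions in [178] to [185] follow a common pattern (sequence identifiers, length, signal peptide, catalytic domain, CAZy family), which lends itself to a tabular record. The sketch below collects the values stated in those paragraphs and runs a few simple consistency checks. It is only an illustration: the `Esterase` class and the helper functions are hypothetical names, all positions are the approximate ones given in the text, and `None` marks a value the text does not state (no signal peptide is given for Fae8, and no CAZy family is given for the feruloyl esterases).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]  # 1-based, inclusive (start, end)


@dataclass(frozen=True)
class Esterase:
    name: str
    nucleic_seq_id: int
    amino_seq_id: int
    length: int
    signal_peptide: Optional[Span]
    catalytic_domain: Span
    cazy_family: Optional[str]


ESTERASES = [
    Esterase("Aes", 1, 2, 302, (1, 21), (33, 293), "CE16"),
    Esterase("Gue1", 3, 4, 397, (1, 18), (19, 397), "CE15"),
    Esterase("Gue2", 5, 6, 417, (1, 17), (18, 417), "CE15"),
    Esterase("FaeB3", 7, 8, 439, (1, 21), (101, 428), None),
    Esterase("FaeB4", 9, 10, 234, (1, 19), (73, 233), None),
    Esterase("Fae8", 11, 12, 373, None, (91, 207), None),
    Esterase("FaeA3", 13, 14, 445, (1, 20), (83, 337), None),
    Esterase("Rgae2", 15, 16, 253, (1, 20), (38, 226), "CE12"),
]


def find_problems(e: Esterase) -> List[str]:
    """Return a list of inconsistencies found in one record (empty if none)."""
    problems = []
    if e.amino_seq_id != e.nucleic_seq_id + 1:
        problems.append("amino acid SEQ ID does not follow nucleic acid SEQ ID")
    cd_start, cd_end = e.catalytic_domain
    if not (1 <= cd_start <= cd_end <= e.length):
        problems.append("catalytic domain falls outside the sequence")
    if e.signal_peptide is not None:
        sp_start, sp_end = e.signal_peptide
        if sp_start != 1 or sp_end >= e.length:
            problems.append("signal peptide span is implausible")
        if cd_start <= sp_end:
            problems.append("catalytic domain overlaps the signal peptide")
    return problems


def by_family(records: List[Esterase]) -> Dict[str, List[str]]:
    """Group enzyme names by CAZy family, skipping records with no family."""
    groups: Dict[str, List[str]] = {}
    for e in records:
        if e.cazy_family is not None:
            groups.setdefault(e.cazy_family, []).append(e.name)
    return groups


for e in ESTERASES:
    print(e.name, find_problems(e) or "ok")
print(by_family(ESTERASES))
```

Keeping the records in one structure makes it straightforward to cross-check the prose against the sequence listing in Table 1 when either is revised.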
[186] Table 1. Nucleic Acid and Amino Acid Sequence Listings for Myceliophthora thermophila C1 Esterases
Enzyme Nucleic Acid Sequence Amino Acid Sequence
Aes SEQ ID NO 1 SEQ ID NO 2
ctatataagaagaggataggtataaagaacccaccttttgccaggatccacagggcttatcttatc MG FLTTTALALLATGGAATARP taaccttatatgacagatatgtccgtagacaaacggactgcatcacttgctcttagcicggctgttc IRACDVSTKYLITFGDSYSQTGFD gggtgacatcccgcaacgtctccttccgccctcctggatttggcgtctcaacgggccacttgctc VTGTKPSASNPLGNPLLPGWTAS ccgctccgaggcaaatctccgccaatcagctgcaaacagttccgggatcgagcgtggtcacca GGLN WVGFL V SEFNTS TTLS YNF gagcggactccgcgggacaagggaaggaacccattcaagtcctgctataaatatgtaactaca AYGGATT ATIVPPYQPTVLSFID attagtagctccaggcctgggttattattataataacctagattggtagctgttgttgcgcctcggg QVAQFSGSIARKPDYAPWNADNA gcgacactaaggtactcaacgtctgcggtgttctcgagcaaccactgtcaactcgagaagaaca LFGVWIGVNDVGNVWWDPNYDS ataattaattacggtttgagrtcaCcagcatgggtcgcttcctgacgacgacggcgttggctcttct LLEQLMESYFGQLQILYDAGARNF cgcaacgggtggggcagctaccgcgagacctatcagggcttgtgacgtctctaccaagtacctt VLLSVPPIQRTPAVLLNNSPENQ atcactttcggcgactcgtactcccagaccggtttcgacgtcacgggcaccaagccgtcggcga AEALAVDKYNEALAANLEAFTD gcaacccgctcggcaaccctctgctacccgggtggacggccagcggcgggctgaactgggt KNGGITA IVDTGVPFNTALDNPT cggcttcctggtctccgagttcaacacgtcgacgacgctgtcgtacaactttgcctacggcggcg DYGAPDATCYNSDG SCLWFND ccacgaccaacgccaccatcgtgccgccctaccagccgacggtgctcagcttcatcgaccag YHPGIEINPvLVAQAVADAWKGSF gttgcccagttcagcggcagcatcgcgaggaagccagactatgcgccgtggaacgccgacaa F
tgcactctttggcgtctggatcggtgtcaacgatgtgggcaacgtctggtgggatcccaactatg
actcgctcctcgagcagatcatggagagctacttcggtcagcttcagattctttacgacgccgga
gccaggaactttgtcctlctcagcgtgccgcgtaagttgggctatgcccccttttggtctcccccc
cccccccccccgcgaccacacagagtgttattactgacaaatcgccacccgccgcgggaggt
ggggggcggacaaaaaccacagccatccagcggacgccggcggtgctgctcaacaactcgc
ccgagaaccagaaggccgaggcgctcgcggtcgataagtacaacgaagccctcgcggccaa
cctcgaggccttcacggacaagaacggcggcatcacggccaagattgtcgacacgggtgtgc
ccttcaacactgccctcgacaaccctaccgactacggcgcccccgacgccacctgctacaaca
gcgacggcaagtcctgcctatggttcaacgactaccaccccggcatcgagatcaacaggctgg
tcgcccaggcggtcgccgatgcctggaagggcagctttttctgatgattgaaccggtgcctccg
agctccgaggtcgcggggtctttgaggtttctttgaagctcggggtgacacgacaatgcggccg
cctctcigttctctgattttcgcagtatatattctctttggtigctactgtgtgtgttcgcctgtcttatac
gcagtgcggatacgctactgatgaagccgcaagaagggggggcctagacatcgaacagcatc
caaagacaatTetatata
Gue1 SEQ ID NO SEQ ID NO
CL10365 3aaagaagaaacatgcgcatccgcagcgtcccacacactccttcgtgatcgaagaaggtgccg 4MVHLTSALLVAGAAFAAAAPM acctccctcttaccaagcaacgatggtccatctgacctcagcccttctcgtggccggcgcggcct NHJFERQDTCSVSDNYPTVNSAKL tcgccgcggctgcgcccatgaaccacatctttgagcgccaggacacctgctcggtcagcgaca PDPFTTAAGEKVTTKDQFECRRA actacccgacggtgaactcggccaagctccccgacccgttcacgacggcagcgggcgagaa EINKTLQQYELGEYPGPPDSVEAS ggtgacgaccaaggaccagttcgagtgccggcgggccgagatcaacaagatcctgcagcagt LSGNSITVRVTVGS SISFSASIR acgagctgggcgagtacccgggaccgccggacagcgtcgaggcgtcgcttagcggcaac PSGAGPFPAIIGIGGASIPIPSNVATI gcatcaccgtcagggtgacggtcggcagcaagagcatctccttctcggcctcgatcaggaagc TF NDEFG AQ MG S GSRGQGKF Y cgtccggcgccggcccgttccccgccatcatcggcatcggcggcgcgtccatccccatcccg DLFGRDHSAGSLTAWAWGVDRL agcaacgtggccaccatcacgttcaacaacgacgagttcgecEcgcagatgggcagcgggtc IDGLEOVGAQASGIDT RLGVTG Patent 124702-0230
gcggggccagggcaagttctacgacctctttggcagggaccactccgccggctcgctcaccg CSRNGKGAFITGALVDRIALTIPQ cgtgggcctggggtgtggaccggctcatcgacggtctggagcaggtcggggcgcaggcgtc ESGAGGAACWRISDQQ AAGAM gggcatcgacacgaaacgtctcggcgtgacgggctgctcgcgcaacggcaagggcgccttc QTAAQIITENPWFSRNFDPHVNSI atcacgggtgcgctcgtcgaccgtatcgccctgacgatcccgcaggagtcgggagccggcgg TSVPQDHHLLAALIVPRGLAVFEN cgcggcctgctggcgcatcagcgaccagcagaaggcggccggcgccaacatccagacggc NIDWLGPVSTTGCMAAGRLIYKA ggcgcagatcatcaccgagaacccctggttctccaggaacttcgacccccacgtcaactccatc YGVPNNMGFSLVGGHNHCQFPSS acgtcggtcccgcaagaccaccacctactggccgccctgatcgtgccccgcggcctggccgt QNQDLNSYINYFLLGQGSPSGVE cttcgagaacaacatcgactggctcggccccgtgtcgacgacgggctgcatggccgccggcc HSDVNVNVAEWAP GAGAPTLA gcctcatctacaaggcctacggcgtgcccaacaacatgggcttctccctcgtcggcggccaca
accactgccagttcccgagcagccagaaccaggacctcaacagctacatcaactacttccigct
cggacagggctcccccagcggcgfcgagcacagcgacgtcaacgtcaacgtcgccgaatgg
gcgccgtggggcgccggcgccccgactctggcgtgagccgtcgccttgttttcggaagcggat
gccaggccgtgctcgcacttcacgccgtggtctccgtcgtcggcgggcatcgctagcagggca
ttggttgcaggcatggaaggaaaaagagatgtaactgtaaatacggaacaggataacgagttct
cttgtgtcttgtctattgtgccgategagtgcgcgccgtaaatgtacttttgotcatgctgtgatctcg
actcgttcagtgaccatcaccggactttactttattctcgacgatgtcagaggagaactgtcaaata
ttaaa
Gue2 SEQ ID NO 5 SEQ ID NO 6
CL1 1231 ctataaagattccgagtcttggtgctcgggtccggcaattgtgtgaacgccgatccaaaccggct MKLVSAYTLIGAAIGSASRVPRIP aatatgaagaatacgaccggtgaattgccaaccccacacgacccaaacgctaaggtccaatgc RQGGGNTMIECAPIPSPFPTWQEL ccgggacagccgggcgacggaatgctcgactttcggaagttattctggoggttgttaacctcgc PLQSSMPDPFLPLAYTTPDNAAD cccggacagatgggacaagcactcgggtctgatatacacgtatcctggagggcctcagagga VVAGRGKGRVQTPEEWYRCRQP gggggacaagacgcacctcaggtcagccttgattcagagcccgtcacctcttgtcatagagca EIIQLLQEYQYGYYPDPSEE VEA gagcagattgtttcaagatgaagctagtctcagcatacacgttgataggggcagcgatcggcag TRSGNTLNIWTAGGKQGSFRATI cgcctcccgggttccccggatcccgcgacaaggcggcggtaacaccatgatcgagtgtgccc SLPSGASASNPAPVVINIGGMQNQ cgatcccgtccccittcccgacctggcaagagctcccgctgcagtcgtcgatgccggacccctt PYLSAGIAVAQFDYTTVSPDSNA cctcccgctcgcgtacaccacgcccgataacgccgcggatgtcgtggccggcaggggcaag TGAFWSIYNGRDIGVLTAWAW ggcagggtgcagactcccgaggagtggtaccggtgccggcagcccgagatcattcagctgct GFHRTLDAINLTVPEIDAARVGVT gcaggagiaccagtatggctactacccggacccctccgaggagaaggtcgaggccacgcgc GCSRLG AALAAGLFDKRITLTM agcggcaacacgctcaacatcgtcgtgacggcgggcggcaagcagggcagcttcagggcga PMSSGVQGAGPYRYYDMSGQGE ccatctcgctgccctccggggcttccgcgtcgaacccggcgccggtggtgatcaacattggcg NLENSKQGAGWWTNS LGTFVN ggatgcagaaccagccatatctgagcgcggggatcgccgttgcgcagtttgactacaccaccg HAQNLPYDAHTIVAAIAPRAVIID tategccggatagcaatgcgaagacgggggctnctggagtatctacaacgggagggacattg QGTGDPFVNS GTAVVVYPAA gtaattaactaaaccacctagagtccactttttgtttactttttttgcttcagccacgaacagagcgct VVYDWLGAGENIGISVRGGGHCD aacacggcgggaaaaaaaatggttgtataataatcaggagtcctgacggcctgggcctggggc LSGYTAILPYVQKIFFGTPTD DY ttccaccgtacgctggatgccatcaacctgacggtccccgagatcgacgcggcccgggtcggc NNLGSYGSPVSSAFLWATAVPGA gtgaccgggtgctcgcgcctgggcaaggcggcgctggcggcggggctctttgacaagcgcat
cacgctcaccatgcccatgtcgtcgggcgtccagggggcggggccctaccggtactacgaca
tgagcgggcagggcgagaacctcgagaacagcaaacagggcgccggctggtggaccaaca
gcaagctgggcaccttcgtgaaccacgcccagaacctgccgtacgacgcccacaccatcgtg
gcggccatcgcgccgagggccgtcatcatcgaccagggcaccggcgaccccttcgtcaaca
gcaagggcaccgccgtcgtcgtctaccctgcggccaaggtcgtctatgactggctgggagcg
ggtgaaaacattggcateagcgtgcgcgggggcgggcactgcgatctgagcggctagigagt
ggcagaccagatgtcctctcctcffitttttttttttttttttttttttttagtggcggggctgaccgatgat
gcacctgttgctttattagcacggciatcctgccgtacgtccaaaagatcttctttgggaccccgac
agacaaggactacaacaacctcggctcgtacggctcacccgtgtcgtccgccttcctatgggcg
acggcggttcccggagcttgaaccgggttcgggtcagactggctgaaacgggaaatgaacca
catagcttgaggtagctacttacggggaactgacgacacaacaaaagaaagttgccgaagatg
ccactcttcagagaagacacgtgtgtagcctgagccttcaaaaagatgatcaga
FaeB3 SEQ ID NO 7 SEQ ID NO 8
CL106I2 tccatgactgagtttggacaagacgagcatagcagatattatatgtgatgctaacatggcccgcct MARLSALIASTAVFLAFNMLAAP ttcagcgcttattgccagtactgccgtctttctcgccttcaacatgctcgcagccccatttggtctgt FGLSRGPRLPEKALRCNRVFFGD cgcgcgggccacggctgccggaaaaggcmgcgctgcaaccgagtgttetttggtgatgtgct VLPPDATLEKVAVVREGGSYGEG ccctccagatgcgacccttgagaaggtcgcggtggtccgggagggcgggagctacggcgag EANV AYSVDPTGLPALCVITVRV ggcgaggcaaacgtggcatactcggtagacccgaccggtcttccagcattgtgtgtcattaccg RSSSTSSYRLGLFLPDKWNSRFLV ttcgggtgaggagctcttcgacgagcagctaccggcttggcctetttctgccagataagtggaac VGNGGFAGGDJWLDILYNFPVTO tcgaggtttrtggtggtaggaaacgggggcttogcgggtggcatcaactggttggacatgtgag PKHLSVADTALIADEVIRQCDLAD cgcttattaccctttgaatgctcccttggtggacgtcccacctcaacaactacgtcgcccaagcca GVQDGIVSTPDRCAPELTVLLCSG gcctgtacaactttcccglaaccgaccccaaacacctctccgttgccgacaccgcccttatcgca KDNGAKNAPTSGKADCLRPAQLE gacgaagtcatccggcaatgcgacctggccgacggcgtgcaggatggcattgtcagcacccc TARNVYSD TLPPNNELLHPGLT ggatcggtgcgccccggagctgactgttctcctctgcagcggcaaagacaacggggccaaga FSSEGEWLLILNGSEPVPYGIGYA acgccccaacgtccgggaaggcagattgtctccgcccggctcagctcgagaccgcccgaaac RDFLFDDDGGGGGGGSGSDAPP gtctacagcgactggacccttccgcccaacaacgagctccttcatccgggtctgaccttctcctc WDWRTSFNESWRYADEHDPGN ggagggagaatggttgctcatcctcaac-ggctccgagccggtgccctacgggatcggatacgc ATADDCAALGAVRERGGKVVIY Patent 124702-0230
gcgcgacttcctgttcgacgacgacggcggcggcggcggcggcggcagcggcagcgatgc HGLADGLVPTKGTGVYWNRTLD cccgccgtgggactggcgcacgtccttcaacgagagcgtcgtgcgctacgccgacgagcacg ALGGLGGLGGLEDF RLFLVPG accccgggaacgcgacggcggacgactgcgccgcgctgggggccgtccgagagaggggc MGHCYGTAVDAPWNFGGAFQA ggaaaggtggtgatctaccacgggctcgcggacgggctggtgccgacaaaggggaccggc GLIGSGVWSVPGFEDAEHDALMA gtgtactggaatcggacgttggatgccctgggcggcctgggcggcctgggcggcctggagga LVDWVE GKPLDSIVATTWEDPT cttcatgaggctgtttctcgtgcccggcatgggacattgttacgggacggccgtagacgcgccg NSSSG
tggaacttcgggggcgcatttcaggcgggcctgatagggagcggggtgtggtcggtgccggg
gttcgaggatgcggagcatgacgccttgatggcgttggtcgactgggtcgagaaagggaagc
ccttggacagcattgtcgcgacaacgtgggaagacccgacaaattcaagttcgggataaagag
acaaagaccaatctgtgcgtggcctcggcaggcactgtgggatggtaaaggagatgttgacga
tgccacaagttggagtcgttcgagttaagaagtccccgggattatgggtactatgtagatatgag
ccccaagtttggcaggttttggatcaattatcgcitgaccgccttctttgtccggaagttgccccgc
atttctgccccgcctacgcttgatgcattagcgcccgtcgcggcacagtggactcggtggtggg
gtaaagcaactgttcctttgaagatgtttggggggcaaagtttgacacgataacaatttggttgcttt
cagatgaagtgctctcttctgggccaaattggagtttccccttatcgtagcacgggcatagtcagg
tatttgtcgagggttacaaccgtagtcaaccattgccaaatat
FaeB4 SEQ ID NO 9 SEQ ID NO 10
CL10381 ctgatattaaagatacgteggcagaggagtagttaaggcaccaggcttataactaccgaacgta MRFSLYMALISTLAAVAECSSFQQ gtagaagctcogggagcgttgaaagcagtctgtagtgggtggctatacgaatgagacagtgcct QCSKLASKLKIENATAWFSEYVA tgttaaactatccacaaaggagacaaaaacgactaatatttatagacaaaggaactggctaatgc AGTNMSFPDTDPSCAQGSLWDV gttggtaacgacgcctcatacgctgtggcacgtatgtggctaatactcttagaagctccacagctt DFCRVAPYVATSHRSGISMEAWL ctgacaaccctagaacaccagagcgggcaactactgaataattatagcatttgctccttttgtttac PKNWTGRFLSTGNGGLNGCLSYD ctcgggccggctgatcagatcggcaagtaggaatcgagttagccaggtctttcgccaaaagaa DMAYTVELGFSTVGAKNCHNGH agaataattcatggatgttcaaaaaggttcatgttatcatgtgagcttgatctccgtttgtgtttgatgt NGTSGAPFLNNPDVLVVDFAWPS trtaggcagttaaccatgccctcgctggtttcagccgtgcatctgaattgagccgttgccatgcgct VHTNAWG QISEARPATSTASSR tctcgctctacatggcgctcatttccacgctcgcggcggtagccgagtgcagcagcttccagca TRNSAGTAFVGGRAGGEVAFRRR gcagtgctccaagctcgccagcaagctcaagatcgagaatgctaccgcctggttctccgagtac HCKYPLRNTYGGKGDPKDPDSW gtggccgctggcaccaacaigtccttcccagacacggacccgtcgtgcgcccagggcagcct QCVL
ggtggtcgatgtcgacttctgccgcgtcgQcccgtacgttgccacgtcccaccgttcgggcatc
agcatggaggcctggctgcccaagaactggaccggccggttcctgagcacgggcaacggcg
gtctcaatggctgtctctcgtacgacgacatggcctacacggtcgagcttggcttctccaccgtc
ggcgccaaaaactgccacaacggccacaacggcacctcgggcgcccccttcctcaataaccc
cgacgtattagtcgtggactttgcctggccgtccgttcacaccaacgctgtggtaggcaagcag
atctcggaagccagacgtacctgacgcccgagcagtgggacgccgtgaacgccgacgtcctg
gcgcagtgcgatggcctaggcggctacgtcgacggcatcatcgaggacccggaactctgccg
gtaccgccttcgtcggcggccgcgccggcggcgaggtcgccttccgccgcaggcattgcaag
tacccgctgcggaacacgtacgggggcaagggagaccccaaggatcccgacagctggcagt
gcgttctgtgagaacaacctcccctgccccaaccctttccagccgcacggacaatggtgggtac
aaggaatttctcttcgtggcggtgtggagggaagggggttgatcaagacctggggaggtcctg
atgaaatattaaggctgagttgcatggcgctttgctttttgaaggattgtcgaaaaaggaaaaaaa
aaaagaaaccattcaatggatcttcacccacggtggcatcatcggtggcctatagccctgtgaaa
gttgaccagtttccaaagtcgaagcaaccagtgacgcaatcttcggaggccaagtgggcgtcc
atgctgaaacaccacgaaac
Fae8 SEQ ID NO SEQ ID NO 12
C110381 part 11 agtaggaccgtttcgtaaaaaccttgctttttcagattttccccacatcaagcgtgggggagcc AP TPLPPVPTVAETL PIPAYPT 2 gggccggcgcgtgtgtggggagcgggtcccgaggctcccgtgccgagaaccgaagtggcc ATWNLEPDRKGLVPVAEGRGGPF gacgacaactttgagaacrtccactttccgatctcgtcgacgccggctctcttgttgtgcagggag NMSWEIHGVGPIRLIL1MGLGGFR cggagtccgtcaacccaagccagcatggcgccaaagacaccgctaccgcccgtaccgacgg TAWQRQTLHFGHERRDRYSVLLI tcgcagagaccttgaagcatcctgcctatccgacggcgacatggaacctcgagccggaccgc DNRGMGDSDKPLMRYSTSEMAL aaggg ctggtgccggtcgccgaaggccgcggcggtcccttcaacatgtcctgggagatccat DIVEVLASPDRTLHWGISLGGMI ggcgtcgggccaatcaggctlattgtgagtgatatcgctcgtctggcaatgtctcctcctcatcttt AQELAV VLAEYLS SLSLICTAAV V ctcatccccgcccctctcagtccactcccccgctctctcttcgttctcacggactaagcggttggct ENTASFAEHMAQRASLILPKSVD cactcartccaatcgttcgactaaactaactcacccgcctcgcttcacccaacgggatggcagct RSVADAARRIFAPSWLALPDDVR catcatgggcctcggcggatttaggacggcctggcagcggcagacgctacactttggccacga LPDPATTPKCKPPRAAAAEGGSG gcggcgggaccggtactclgtcctgctgatcgacaaccgcggcatgggcgacagcgacaag SGSGSGNGGEGEGGGRYL FETN ccgctcatgcgctacagcaccagcgagatggccctcgacatcgtcgaggtgctggccagtccg AQRFVAQELHKRLDPAGRFTLKG gacgtcggctggctgccctcgtcttacccgctgccgccttcgcctcccgctcccgctcccgctcc FLLQLIAAGWHR TPAQLAAMA cgctcccgagcggaccctgcacgtggtcggcatctcgctcggcggcatgatcgcccaggagc DRVGRDRILVMHGTEDGMISVPH tcgccgtcgtcctggccgagtacctctcgagcctgtcgctcatctgcaccgccgccgtggtcga GRKLIDYVRPAKGIP EGMGHAPL gaacacggcctccttcgccgagcacatggcccagcgcgcctccctgatcctgcccaagtcggt VERWEWFNQVIEEQCLLGERLDG cgaccgctccgtcgccgacgccgcccgccgcatcttcgccccgtcctggctcgccctcccgga RA
cgacgtccgcctgcccgacccggccacgactcccaagtgcaaacccccccgggccgccgcc
gccgagggcgggagcgggagcgggagcgggagtggaaacggcggcgagggcgagggc
gggggccggtacctcaagttcgagaccaacgcgcagcggttcgtcgcgcaggagctgcaca Patent 124702-0230
[187] Carbohydrases
[188] Carbohydrases represent a category of various enzymes and polypeptides including amylases, cellulases, hemicellulases, pectinases, and chitinases that catalyze and/or enhance the hydrolysis or synthesis of a carbohydrate.
[189] Applications of carbohydrases in the food industry include, but are not limited to, increasing the yield of fruit juice production in total liquefaction; increasing the pressing yield of oils (e.g., from olives), cleaning filters, reducing viscosity, hydrolyzing starch, and stimulating fermentation in the brewing industry; increasing the loaf volume and improving crust color in the baking industry; preventing/reducing the staling of bread; removing lactose from milk products; clarifying, filtrating, and extracting aroma and color (e.g., the wine industry); debittering or detoxifying plant glycosidic compounds; processing coffee; aiding in digestion; producing starch (e.g., separating starch from gluten); producing oligosaccharides (e.g., nutraceuticals); producing gelling agents; modifying viscosity; saccharification of starch and other biopolymers; and modifying starch.
[190] Applications of carbohydrases in the publishing and printing industry include, but are not limited to, biorefining; bleaching; increasing yield; deinking; reducing energy use; increasing paper strength; and modifying the properties of natural fibers.
[191] Applications of carbohydrases in the biofuel industry include, but are not limited to, releasing fermentable sugars, and the saccharification of lignocelluloses and starch. Saccharification can also be used to produce fermentable sugars for the production of high value metabolites and chemical building blocks for producing derivatives thereof and fine chemicals.
[192] Applications of carbohydrases in the feed industry include, but are not limited to, improving feed conversion, reducing the viscosity, and producing oligosaccharides.
[193] Applications of carbohydrases in the textile industry include, but are not limited to, removing non-cellulosic components from fabric (e.g., scouring by pectinase); pretreating cotton; improving biofinishing; "stone washing" of jeans; and increasing the softness of a garment.
[194] Applications of carbohydrases in other industries include, but are not limited to, removing/killing fungi; treating infections; use as a biopesticide; treating sewage; use as a slime controlling agent; use in oil drilling and cleanup of drilling fluid filter cake; use in detergents (e.g., removal of stains); use in biocatalysis; producing biodegradable plastics and coatings; increasing the sorption properties of cellulosic wound dressings; improving the bleachability of teak veneer surfaces; and producing N-acetyl glucosamine.
[195] The enzyme Agu2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 17 in Table 2. The Agu2 nucleic acid sequence encodes a 1076 amino acid sequence, represented by SEQ ID NO: 18 in Table 2. We believe the signal peptide for Agu2 is located from about position 1 to about position 20 of the Agu2 amino acid sequence, with the mature protein spanning from about position 21 to about position 1076 of the Agu2 amino acid sequence. Within Agu2 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Agu2 spans from a starting point of about position 21 to an ending point of about position 1075 of the Agu2 amino acid sequence. Based on homology, Agu2 can be assigned to GH115 of the CAZy families and is expected to have α-glucuronidase activity. α-Glucuronidase activity assay methods have been summarized by J. Puls, α-Glucuronidases in the hydrolysis of wood xylans, in: Xylans and Xylanases, eds. J. Visser et al., Elsevier, Amsterdam 1992, pp. 213-224. Agu2 possesses significant homology (about 71% from amino acids 1 to 1075 of Agu2) with a conserved hypothetical protein from Neurospora crassa OR74A (Genbank Accession No. EAA30769). Agu2 also possesses significant homology (about 70% from amino acids 1 to 1075 of Agu2) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP65960).
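The position ranges quoted for each enzyme read as 1-based and inclusive. As a minimal sketch of how such ranges map onto a sequence string, assuming Python slicing: the `region` helper and the dummy sequence below are illustrative only and are not part of the disclosure (the actual Agu2 sequence is SEQ ID NO: 18 in Table 2).

```python
def region(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq (1-based, inclusive)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} outside 1-{len(seq)}")
    return seq[start - 1:end]


# Placeholder only: a 1076-residue dummy standing in for SEQ ID NO: 18.
agu2 = "M" + "A" * 1075

signal_peptide = region(agu2, 1, 20)       # about positions 1-20
mature_protein = region(agu2, 21, 1076)    # about positions 21-1076
catalytic_domain = region(agu2, 21, 1075)  # about positions 21-1075

# Signal peptide and mature protein should tile the full-length sequence.
assert signal_peptide + mature_protein == agu2
print(len(signal_peptide), len(mature_protein), len(catalytic_domain))
```

Because the text gives every boundary as "about", such slices are best treated as approximate windows rather than exact cleavage or domain limits.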
[196] The enzyme Gxh1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 19 in Table 2. The Gxh1 nucleic acid sequence encodes a 484 amino acid sequence, represented by SEQ ID NO: 20 in Table 2. We believe the signal peptide for Gxh1 is located from about position 1 to about position 19 of the Gxh1 amino acid sequence, with the mature protein spanning from about position 20 to about position 484 of the Gxh1 amino acid sequence. Within Gxh1 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Gxh1 spans from a starting point of about position 33 to an ending point of about position 470 of the Gxh1 amino acid sequence. Based on homology, Gxh1 can be assigned to GH5 of the CAZy families and is expected to have xylobiohydrolase activity. Gxh1 possesses significant homology (about 71% from amino acids 1 to 481 of Gxh1) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP68494). Gxh1's xylobiohydrolase activity can be assayed with HPAEC analysis of xylo-oligosaccharides as the substrate.
[197] The enzyme Gxh2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 21 in Table 2. The Gxh2 nucleic acid sequence encodes a 477 amino acid sequence, represented by SEQ ID NO: 22 in Table 2. We believe the signal peptide for Gxh2 is located from about position 1 to about position 17 of the Gxh2 amino acid sequence, with the mature protein spanning from about position 18 to about position 477 of the Gxh2 amino acid sequence. Within Gxh2 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Gxh2 spans from a starting point of about position 59 to an ending point of about position 469 of the Gxh2 amino acid sequence. Based on homology, Gxh2 can be assigned to GH5 of the CAZy families and is expected to have xylanase activity. Gxh2 may also exhibit chitosanase; β-mannosidase; cellulase; glucan 1,3-β-glucosidase; licheninase; glucan endo-1,6-β-glucosidase; mannan endo-β-1,4-mannosidase; endo-β-1,4-xylanase; cellulose β-1,4-cellobiosidase; endo-β-1,6-galactanase; β-1,3-mannanase; xyloglucan-specific endo-β-1,4-glucanase; and/or mannan transglycosylase activity. Gxh2 possesses significant homology (about 56% from amino acids 20 to 474 of Gxh2) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP64828). The activity can be tested with HPAEC analysis.
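The homology figures in these paragraphs are percentages over a stated stretch of the query sequence. The text does not say which alignment program or which measure (identity or similarity) produced them, so the following is only a sketch under the assumption that the figure is percent identity over a pre-computed pairwise alignment, with `-` as the gap character. The function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns that hold identical residues.

    Both strings must come from the same pairwise alignment, so they have
    equal length; '-' marks a gap. Columns that are gaps in both sequences
    are ignored, and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("empty alignment")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy alignment (8 columns, 5 identical), for illustration only.
print(percent_identity("MKT-LLVA", "MKSALLIA"))
```

Other conventions exist (for example, dividing by the length of the shorter sequence), so a reported percentage can only be reproduced once the original tool and settings are known.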
[198] The enzyme Aga1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 23 in Table 2. The Aga1 nucleic acid sequence encodes a 435 amino acid sequence, represented by SEQ ID NO: 24 in Table 2. We believe the signal peptide for Aga1 is located from about position 1 to about position 14 of the Aga1 amino acid sequence, with the mature protein spanning from about position 15 to about position 435 of the Aga1 amino acid sequence. Within Aga1 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Aga1 spans from a starting point of about position 15 to an ending point of about position 379 of the Aga1 amino acid sequence. Based on homology, Aga1 can be assigned to GH27 of the CAZy families and is expected to have α-galactosidase activity. Aga1 may also possess α-N-acetylgalactosaminidase; isomalto-dextranase; and/or β-L-arabinopyranosidase activity. Aga1 possesses significant homology (about 56% from amino acids 14 to 425 of Aga1) with a hypothetical protein POSPLDRAFT_134790 from Postia placenta Mad-698-R (Genbank Accession No. EED85274). Aga1 also possesses significant homology (about 51% from amino acids 17 to 416 of Aga1) with a glycoside hydrolase family 27 protein from Laccaria bicolor S238N-H82 (Genbank Accession No. EDR08276). Activity of Aga1 can be measured with pNP-α-D-galactopyranoside as the substrate. See Example 14.
[199] The enzyme Aga2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 25 in Table 2. The Aga2 nucleic acid sequence encodes a 416 amino acid sequence, represented by SEQ ID NO: 26 in Table 2. We believe the signal peptide for Aga2 is located from about position 1 to about position 20 of the Aga2 amino acid sequence, with the mature protein spanning from about position 21 to about position 416 of the Aga2 amino acid sequence. Within Aga2 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Aga2 spans from a starting point of about position 21 to an ending point of about position 397 of the Aga2 amino acid sequence. Based on homology, Aga2 can be assigned to GH27 of the CAZy families and is expected to have α-galactosidase activity. Aga2 may also possess α-N-acetylgalactosaminidase; isomalto-dextranase; and/or β-L-arabinopyranosidase activity. Aga2 possesses significant homology (about 61% from amino acids 1 to 403 of Aga2) with hypothetical protein MGG_13626 from Magnaporthe grisea 70-15 (Genbank Accession No. EDJ99928). Aga2 also possesses significant homology (about 73% from amino acids 1 to 403 of Aga2) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP71790). Activity of Aga2 can be measured with pNP-α-D-galactopyranoside as the substrate. See Example 14.
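Assays with p-nitrophenyl (pNP) glycosides are commonly read out as the absorbance of released p-nitrophenol and converted to activity through the Beer-Lambert law. The sketch below illustrates that general conversion only and is not the protocol of Example 14. The function name, the default extinction coefficient (an assumed round value for alkaline conditions), the 1 cm path length, and the example readings are all assumptions; in practice the coefficient would be replaced by a p-nitrophenol standard curve measured under the actual assay conditions.

```python
def pnp_activity_units_per_ml(delta_absorbance: float,
                              minutes: float,
                              reaction_volume_ml: float,
                              enzyme_volume_ml: float,
                              epsilon_per_mm_cm: float = 18.0,  # ASSUMED value
                              path_length_cm: float = 1.0) -> float:
    """Activity in micromoles of p-nitrophenol released per minute per mL of enzyme.

    Beer-Lambert: concentration (mM) = A / (epsilon * path length),
    and 1 mM equals 1 micromole per mL of reaction mixture.
    """
    if min(minutes, reaction_volume_ml, enzyme_volume_ml,
           epsilon_per_mm_cm, path_length_cm) <= 0:
        raise ValueError("times, volumes, epsilon and path length must be positive")
    concentration_mm = delta_absorbance / (epsilon_per_mm_cm * path_length_cm)
    micromoles_released = concentration_mm * reaction_volume_ml
    return micromoles_released / (minutes * enzyme_volume_ml)


# Hypothetical reading: blank-corrected absorbance change of 0.36 after
# 10 minutes in a 1.0 mL reaction that contained 0.1 mL of enzyme sample.
activity = pnp_activity_units_per_ml(0.36, 10.0, 1.0, 0.1)
print(f"{activity:.3f} U/mL")
```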
[200] The enzyme Aga3 is encoded by the nucleic acid sequence represented by SEQ ID NO: 27 in Table 2. The Aga3 nucleic acid sequence encodes an 892 amino acid sequence, represented by SEQ ID NO: 28 in Table 2. We believe the mature Aga3 protein spans from about position 1 to about position 892 of the Aga3 amino acid sequence. Within Aga3 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Aga3 spans from a starting point of about position 195 to an ending point of about position 534 of the Aga3 amino acid sequence. Based on homology, Aga3 can be assigned to GH36 of the CAZy families and is expected to have α-galactosidase/raffinose synthase activity. Aga3 may also possess α-N-acetylgalactosaminidase; isomalto-dextranase; and/or β-L-arabinopyranosidase activity. Aga3 possesses significant homology (about 67% from amino acids 1 to 892 of Aga3) with hypothetical protein CHGG_01365 from Chaetomium globosum CBS 148.51 (Genbank Accession No. EAQ93130). Activity of Aga3 can be measured with pNP-α-D-galactopyranoside as the substrate. See Example 14.
[201] The enzyme Man8 is encoded by the nucleic acid sequence represented by SEQ ID NO: 29 in Table 2. The Man8 nucleic acid sequence encodes an 897 amino acid sequence, represented by SEQ ID NO: 30 in Table 2. We believe the mature Man8 protein spans from about position 1 to about position 897 of the Man8 amino acid sequence. Within Man8 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Man8 spans from a starting point of about position 7 to an ending point of about position 703 of the Man8 amino acid sequence. Based on homology, Man8 can be assigned to GH2 of the CAZy families and is expected to have β-mannosidase activity. Man8 may also possess β-galactosidase; β-glucuronidase; mannosylglycoprotein endo-β-mannosidase; and/or exo-β-glucosaminidase activity. Man8 possesses significant homology (about 54% from amino acids 5 to 897 of Man8) with EDK05879 (Genbank Accession No. XP364819). Man8 also possesses significant homology (about 50% from amino acids 4 to 897 of Man8) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP61774). The activity of Man8 can be examined with a β-mannosidase assay. See Example 15.
[202] The enzyme Man9 is encoded by the nucleic acid sequence represented by SEQ ID NO: 31 in Table 2. The Man9 nucleic acid sequence encodes an 855 amino acid sequence, represented by SEQ ID NO: 32 in Table 2. We believe the mature Man9 protein spans from about position 1 to about position 855 of the Man9 amino acid sequence. Within Man9 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Man9 spans from a starting point of about position 2 to an ending point of about position 654 of the Man9 amino acid sequence. Based on homology, Man9 can be assigned to GH2 of the CAZy families and is expected to have β-mannosidase activity. Man9 may also possess β-galactosidase; β-glucuronidase; mannosylglycoprotein endo-β-mannosidase; and/or exo-β-glucosaminidase activity. Man9 possesses significant homology (about 73% from amino acids 1 to 853 of Man9) with EDK05161 (Genbank Accession No. XP363252). Man9 also possesses significant homology (about 76% from amino acids 1 to 855 of Man9) with hypothetical protein NCU00890 from Neurospora crassa OR74A (Genbank Accession No. EAA35570). The activity of Man9 can be examined with a β-mannosidase assay. See Example 15.
[203] The enzyme Bgl2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 33 in Table 2. The Bgl2 nucleic acid sequence encodes a 968 amino acid sequence, represented by SEQ ID NO: 34 in Table 2. We believe the mature Bgl2 protein spans from about position 1 to about position 968 of the Bgl2 amino acid sequence. Within Bgl2 a catalytic domain (CD) is present. The amino acid sequence containing the CD of Bgl2 spans from a starting point of about position 168 to an ending point of about position 750 of the Bgl2 amino acid sequence. Based on homology, Bgl2 can be assigned to GH3 of the CAZy families and is expected to have β-glucosidase activity, which can be assayed, for example, as described by Esen in: Handbook of Food Enzymology, edited by John R. Whitaker et al., CRC Press 2002, pp. 791-803. Bgl2 may also possess β-glucosidase; xylan 1,4-β-xylosidase; β-N-acetylhexosaminidase; glucan 1,3-β-glucosidase; exo-1,3-1,4-glucanase; and/or alpha-L-arabinofuranosidase activity. Bgl2 possesses significant homology (about 66% from amino acids 1 to 968 of Bgl2) with EAA35949 (Genbank Accession No. XP965185). Bgl2 also possesses significant homology (about 75% from amino acids 1 to 968 of Bgl2) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP65606).
[204] The enzyme Bgl3 is encoded by the nucleic acid sequence represented by SEQ ID NO: 35 in Table 2. The Bgl3 nucleic acid sequence encodes an 865 amino acid sequence, represented by SEQ ID NO: 36 in Table 2. We believe the mature Bgl3 protein spans from about position 1 to about position 865 of the Bgl3 amino acid sequence. Within Bgl3 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Bgl3 spans from a starting point of about position 107 to an ending point of about position 732 of the Bgl3 amino acid sequence. Based on homology, Bgl3 can be assigned to GH3 of the CAZy families and is expected to have β-glucosidase activity, which can be assayed, for example, as described by Esen in: Handbook of Food Enzymology, edited by John R. Whitaker et al., CRC Press 2002, pp. 791-803. Bgl3 may also possess β-glucosidase; xylan 1,4-β-xylosidase; β-N-acetylhexosaminidase; glucan 1,3-β-glucosidase; exo-1,3-1,4-glucanase; and/or alpha-L-arabinofuranosidase activity. Bgl3 possesses significant homology (about 55.68% from amino acids 97 to 865 of Bgl3) with a protein from Aspergillus oryzae (Genbank Accession No. BAE60358.1).
[205] The enzyme Bgl4 is encoded by the nucleic acid sequence represented by SEQ ID NO: 37 in Table 2. The Bgl4 nucleic acid sequence encodes an 884 amino acid sequence, represented by SEQ ID NO: 38 in Table 2. We believe the mature Bgl4 protein spans from about position 1 to about position 884 of the Bgl4 amino acid sequence. Within Bgl4 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Bgl4 spans from a starting point of about position 81 to an ending point of about position 737 of the Bgl4 amino acid sequence. Based on homology, Bgl4 can be assigned to GH3 of the CAZy families and is expected to have β-glucosidase activity, which can be assayed, for example, as described by Esen in: Handbook of Food Enzymology, edited by John R. Whitaker et al., CRC Press 2002, pp. 791-803. Bgl4 may also possess β-glucosidase; xylan 1,4-β-xylosidase; β-N-acetylhexosaminidase; glucan 1,3-β-glucosidase; exo-1,3-1,4-glucanase; and/or alpha-L-arabinofuranosidase activity. Bgl4 possesses significant homology (about 79% from amino acids 12 to 884 of Bgl4) with EAA35798 (Genbank Accession No. XP363252). Bgl4 also possesses significant homology (about 82% from amino acids 11 to 870 of Bgl4) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP69216).
[206] The enzyme Bgl5 is encoded by the nucleic acid sequence represented by SEQ ID NO: 39 in Table 2. The Bgl5 nucleic acid sequence encodes a 903 amino acid sequence, represented by SEQ ID NO: 40 in Table 2. We believe the signal peptide for Bgl5 is located from about position 1 to about position 23 of the Bgl5 amino acid sequence, with the mature protein spanning from about position 24 to about position 903 of the Bgl5 amino acid sequence. Within Bgl5 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Bgl5 spans from a starting point of about position 93 to an ending point of about position 662 of the Bgl5 amino acid sequence. Based on homology, Bgl5 can be assigned to GH3 of the CAZy families and is expected to have β-glucosidase activity, which can be assayed as described by Esen in: Handbook of Food Enzymology, edited by John R. Whitaker et al., CRC Press 2002, pp. 791-803. Bgl5 may also possess β-glucosidase; xylan 1,4-β-xylosidase; β-N-acetylhexosaminidase; glucan 1,3-β-glucosidase; exo-1,3-1,4-glucanase; and/or alpha-L-arabinofuranosidase activity. Bgl5 possesses significant homology (about 66% from amino acids 1 to 903 of Bgl5) with a probable beta-glucosidase 1 precursor from Neurospora crassa (Genbank Accession No. CAC28685). Bgl5 also possesses significant homology (about 75% from amino acids 1 to 903 of Bgl5) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP73569).
[207] The enzyme Glu1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 41 in Table 2. The Glu1 nucleic acid sequence encodes an 819 amino acid sequence, represented by SEQ ID NO: 42 in Table 2. We believe the signal peptide for Glu1 is located from about position 1 to about position 17 of the Glu1 amino acid sequence, with the mature protein spanning from about position 18 to about position 819 of the Glu1 amino acid sequence. Within Glu1 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Glu1 spans from a starting point of about position 18 to an ending point of about position 819 of the Glu1 amino acid sequence. Based on homology, Glu1 can be assigned to GH55 of the CAZy families and is expected to have endo- and/or exo-β-glucanase activity. Glu1 possesses significant homology (about 56% from amino acids 19 to 811 of Glu1) with a conserved hypothetical protein from Neurospora crassa OR74A (Genbank Accession No. EAA35992). Glu1 also possesses significant homology (about 54% from amino acids 7 to 815 of Glu1) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP65594).
[208] The enzyme Glu2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 43 in Table 2. The Glu2 nucleic acid sequence encodes a 550 amino acid sequence, represented by SEQ ID NO: 44 in Table 2. We believe the signal peptide for Glu2 is located from about position 1 to about position 19 of the Glu2 amino acid sequence, with the mature protein spanning from about position 20 to about position 550 of the Glu2 amino acid sequence. Within Glu2 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Glu2 spans from a starting point of about position 20 to an ending point of about position 543 of the Glu2 amino acid sequence. Based on homology, Glu2 can be assigned to GH17 of the CAZy families and is expected to have β-glucanase activity. Glu2 may also express glucan endo-1,3-β-glucosidase; glucan 1,3-β-glucosidase; licheninase; and/or β-1,3-glucanosyltransglycosylase activity. Glu2 possesses significant homology (about 76% from amino acids 1 to 550 of Glu2) with exo-beta-1,3-glucanase from Chaetomium globosum (Genbank Accession No. ACM42426). Glu2 also possesses significant homology (about 58% from amino acids 1 to 550 of Glu2) with EAA34684 (Genbank Accession No. XP963920).
[209] The enzyme Abn6 is encoded by the nucleic acid sequence represented by SEQ ID NO: 45 in Table 2. The Abn6 nucleic acid sequence encodes a 611 amino acid sequence, represented by SEQ ID NO: 46 in Table 2. We believe the signal peptide for Abn6 is located from about position 1 to about position 16 of the Abn6 amino acid sequence, with the mature protein spanning from about position 17 to about position 611 of the Abn6 amino acid sequence. Within Abn6 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Abn6 spans from a starting point of about position 26 to an ending point of about position 335 of the Abn6 amino acid sequence. Based on homology, Abn6 can be assigned to GH43 of the CAZy families and is expected to have arabinofuranosidase/arabinase/xylosidase activity. Abn6 may possess β-xylosidase; β-1,3-xylosidase; α-L-arabinofuranosidase; xylanase; and/or galactan 1,3-β-galactosidase activity. Abn6 possesses significant homology (about 49% from amino acids 1 to 609 of Abn6) with a protein from Aspergillus oryzae (Genbank Accession No. BAE61393). Abn6 also possesses significant homology (about 72% from amino acids 16 to 611 of Abn6) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP66759).
[210] The enzyme Abn8 is encoded by the nucleic acid sequence represented by SEQ ID NO: 47 in Table 2. The Abn8 nucleic acid sequence encodes a 631 amino acid sequence, represented by SEQ ID NO: 48 in Table 2. We believe the signal peptide for Abn8 is located from about position 1 to about position 20 of the Abn8 amino acid sequence, with the mature protein spanning from about position 21 to about position 631 of the Abn8 amino acid sequence. Within Abn8 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Abn8 spans from a starting point of about position 33 to an ending point of about position 350 of the Abn8 amino acid sequence. Based on homology, Abn8 can be assigned to GH43 of the CAZy families and is expected to have arabinofuranosidase/arabinase/xylosidase activity. Abn8 may possess β-xylosidase; β-1,3-xylosidase; α-L-arabinofuranosidase; xylanase; and/or galactan 1,3-β-galactosidase activity. Abn8 possesses significant homology (about 54% from amino acids 34 to 588 of Abn8) with a putative xylanase 27 from Gibberella zeae (Genbank Accession No. AAV98256). Abn8 also possesses significant homology (about 52% from amino acids 32 to 585 of Abn8) with a hypothetical protein MGG_05479 from Magnaporthe grisea 70-15 (Genbank Accession No. EDK06156).
[211] The enzyme Abn10 is encoded by the nucleic acid sequence represented by SEQ ID NO: 49 in Table 2. The Abn10 nucleic acid sequence encodes a 424 amino acid sequence, represented by SEQ ID NO: 50 in Table 2. We believe the signal peptide for Abn10 is located from about position 1 to about position 20 of the Abn10 amino acid sequence, with the mature protein spanning from about position 21 to about position 424 of the Abn10 amino acid sequence. Within Abn10 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Abn10 spans from a starting point of about position 54 to an ending point of about position 414 of the Abn10 amino acid sequence. Based on homology, Abn10 can be assigned to GH93 of the CAZy families and is expected to have arabinanase activity. Abn10 possesses significant homology (about 70% from amino acids 56 to 414 of Abn10) with a conserved hypothetical protein from Neurospora crassa OR74A (Genbank Accession No. EAA28974). Abn10 also possesses significant homology (about 55% from amino acids 58 to 414 of Abn10) with a protein from Aspergillus oryzae (Genbank Accession No. BAE64690).
[212] The enzyme Lam1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 51 in Table 2. The Lam1 nucleic acid sequence encodes a 763 amino acid sequence, represented by SEQ ID NO: 52 in Table 2. We believe the signal peptide for Lam1 is located from about position 1 to about position 20 of the Lam1 amino acid sequence, with the mature protein spanning from about position 21 to about position 763 of the Lam1 amino acid sequence. Within Lam1 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Lam1 spans from a starting point of about position 21 to an ending point of about position 763 of the Lam1 amino acid sequence. Based on homology, Lam1 can be assigned to GH55 of the CAZy families and is expected to have exo- and/or endo-β-glucanase activity. Lam1 possesses significant homology (about 74% from amino acids 8 to 738 of Lam1) with a hypothetical protein NCU04850 from Neurospora crassa OR74A (Genbank Accession No. EAA31173). Lam1 also possesses significant homology (about 74% from amino acids 22 to 762 of Lam1) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP66532). As evidenced below in Example 2, Lam1 possessed β-glucanase activity.
[213] The enzyme Bga1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 53 in Table 2. The Bga1 nucleic acid sequence encodes an 881 amino acid sequence, represented by SEQ ID NO: 54 in Table 2. We believe the signal peptide for Bga1 is located from about position 1 to about position 30 of the Bga1 amino acid sequence, with the mature protein spanning from about position 31 to about position 881 of the Bga1 amino acid sequence. Within Bga1 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Bga1 spans from a starting point of about position 44 to an ending point of about position 520 of the Bga1 amino acid sequence. Based on homology, Bga1 can be assigned to GH2 of the CAZy families and is expected to have β-galactosidase activity. Bga1 possesses significant homology (about 58% from amino acids 44 to 879 of Bga1) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP70161). Examples of assays to test Bga1 activity include a β-galactosidase assay using ONPG (oNP-β-D-galactopyranoside) or PNPG (pNP-β-D-galactopyranoside) as a substrate (see, for example, Mahoney, R.R. in: Handbook of Food Enzymology, edited by John R. Whitaker et al., CRC Press 2002, pp. 823-828). Bga1 may also possess β-mannosidase; β-glucuronidase; mannosylglycoprotein endo-β-mannosidase; and/or exo-β-glucosaminidase activity.
[214] The enzyme Bga3 is encoded by the nucleic acid sequence represented by SEQ ID NO: 55 in Table 2. The Bga3 nucleic acid sequence encodes a 298 amino acid sequence, represented by SEQ ID NO: 56 in Table 2. We believe the mature Bga3 protein spans from about position 1 to about position 298 of the Bga3 amino acid sequence. Within Bga3 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Bga3 spans from a starting point of about position 2 to an ending point of about position 298 of the Bga3 amino acid sequence. Based on homology, Bga3 can be assigned to GH42 of the CAZy families and is expected to have β-galactosidase/galactanase activity. Bga3 possesses significant homology (about 50% from amino acids 1 to 264 of Bga3) with hypothetical protein CBG19899 from Caenorhabditis briggsae (Genbank Accession No. CAP37064).
[215] The enzyme Gal2 is encoded by the nucleic acid sequence represented by SEQ ID NO: 57 in Table 2. The Gal2 nucleic acid sequence encodes a 490 amino acid sequence, represented by SEQ ID NO: 58 in Table 2. We believe the signal peptide for Gal2 is located from about position 1 to about position 18 of the Gal2 amino acid sequence, with the mature protein spanning from about position 19 to about position 490 of the Gal2 amino acid sequence. Within Gal2 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Gal2 spans from a starting point of about position 64 to an ending point of about position 249 of the Gal2 amino acid sequence. Based on homology, Gal2 can be assigned to GH43 of the CAZy families and is expected to have galactanase/arabinase activity. Gal2 may also possess β-xylosidase; β-1,3-xylosidase; α-L-arabinofuranosidase; xylanase; and/or galactan 1,3-β-galactosidase activity. Gal2 possesses significant homology (about 79% from amino acids 1 to 484 of Gal2) with a protein from Aspergillus oryzae (Genbank Accession No. BAE60540). Gal2 also possesses significant homology (about 73% from amino acids 1 to 484 of Gal2) with protein Pc22g09450 from Penicillium chrysogenum Wisconsin 54-1255 (Genbank Accession No. CAP98233). Gal2 activity can be tested by at least a reducing sugars assay using galactan or arabinan substrates.
[216] The enzyme Arh1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 59 in Table 2. The Arh1 nucleic acid sequence encodes a 777 amino acid sequence, represented by SEQ ID NO: 60 in Table 2. We believe the signal peptide for Arh1 is located from about position 1 to about position 15 of the Arh1 amino acid sequence, with the mature protein spanning from about position 16 to about position 777 of the Arh1 amino acid sequence. Within Arh1 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Arh1 spans from a starting point of about position 44 to an ending point of about position 736 of the Arh1 amino acid sequence. Based on homology, Arh1 can be assigned to GH78 of the CAZy families and is expected to have α-rhamnosidase activity. Arh1 possesses significant homology (about 80% from amino acids 1 to 777 of Arh1) with hypothetical protein CHGG_06672 from Chaetomium globosum CBS 148.51 (Genbank Accession No. EAQ90053). Arh1 also possesses significant homology (about 54% from amino acids 25 to 777 of Arh1) with a predicted protein from Neurospora crassa OR74A (Genbank Accession No. EAA28787). α-L-Rhamnosidase activity can be measured as described in Manzanares, P. et al. (1997) FEMS Microbiol. Lett. 157, 279-283.
[217] The enzyme Cel1 is encoded by the nucleic acid sequence represented by SEQ ID NO: 61 in Table 2. The Cel1 nucleic acid sequence encodes a 476 amino acid sequence, represented by SEQ ID NO: 62 in Table 2. We believe the mature Cel1 protein spans from about position 1 to about position 476 of the Cel1 amino acid sequence. Within Cel1 a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Cel1 spans from a starting point of about position 2 to an ending point of about position 474 of the Cel1 amino acid sequence. Based on homology, Cel1 can be assigned to GH1 of the CAZy families and is expected to have β-glucosidase activity, which can be assayed as described by Esen in: Handbook of Food Enzymology, edited by John R. Whitaker et al., CRC Press 2002, pp. 791-803. Cel1 possesses significant homology (about 92.23% from amino acids 1 to 476 of Cel1) with a beta-glucosidase from Humicola grisea var. thermoidea (Genbank Accession No. BAA74958.1). Cel1 may also possess β-galactosidase; β-mannosidase; β-glucuronidase; β-D-fucosidase; phlorizin hydrolase; exo-β-1,4-glucanase; 6-phospho-β-galactosidase; 6-phospho-β-glucosidase; strictosidine β-glucosidase; lactase; amygdalin β-glucosidase; prunasin β-glucosidase; raucaffricine β-glucosidase; thioglucosidase; β-primeverosidase; isoflavonoid 7-O-β-apiosyl-β-glucosidase; hydroxyisourate hydrolase; and/or β-glycosidase activity.
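The homology percentages quoted in these paragraphs are stated over a residue range of the query enzyme. The text does not specify the alignment program or the exact definition used, so the following is only a sketch of one common convention: percent identity counted over the aligned, ungapped columns of a pairwise alignment. The function name and the toy alignment are hypothetical and are not taken from any of the listed sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither row has a gap.

    Both inputs must be rows of the same pairwise alignment (equal length).
    Other conventions (for example dividing by the full alignment length or
    by the shorter sequence length) give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment, for illustration only.
row_query = "MKT-AYLV"
row_hit = "MKSWAYIV"
print(round(percent_identity(row_query, row_hit), 2))
```

In the toy alignment, seven columns are ungapped and five of them match, giving about 71.43%. A figure such as "about 92.23% from amino acids 1 to 476" would correspond to applying a calculation of this kind to the alignment columns covering that residue range.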
[218] The enzyme Mip is encoded by the nucleic acid sequence represented by SEQ ID NO: 63 in Table 2. The Mip nucleic acid sequence encodes an 850 amino acid sequence, represented by SEQ ID NO: 64 in Table 2. We believe the mature Mip protein spans from about position 1 to about position 850 of the Mip amino acid sequence. Within Mip a catalytic domain (CD) is present. We believe the amino acid sequence containing the CD of Mip spans from a starting point of about position 172 to an ending point of about position 602 of the Mip amino acid sequence. Based on homology, Mip can be assigned to GH31 of the CAZy families and is expected to have amylase/α-glucosidase activity. Mip possesses significant homology (about 61% from amino acids 7 to 823 of Mip) with hypothetical protein NCU04885 from Neurospora crassa OR74A (Genbank Accession No. EAA28714). Mip also possesses significant homology (about 67% from amino acids 7 to 850 of Mip) with a protein from Podospora anserina S mat+ (Genbank Accession No. CAP70184). Mip may also possess α-1,3-glucosidase; sucrase-isomaltase; α-xylosidase; α-glucan lyase; and/or isomaltosyltransferase activity.
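Each entry in Table 2 pairs a nucleic acid sequence with the amino acid sequence it encodes. That pairing can be spot-checked by translating the start of an open reading frame with the standard genetic code, as sketched here. The example uses the first 60 nucleotides and first 20 residues listed for Aga3 (SEQ ID NO: 27 and SEQ ID NO: 28), where the listing begins at the start codon. Several listed nucleic acid sequences appear to include upstream or intervening regions, so translating a whole entry end to end is not expected to reproduce the listed protein; the function and constant names below are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; unknown codons become 'X', stops become '*'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First 60 nucleotides and first 20 residues of the Aga3 entry in Table 2.
AGA3_5PRIME = "atgtccgcttttatcgtcacgtacccgccgctcggccaagttacgcagctccagaactct"
AGA3_N_TERM = "MSAFIVTYPPLGQVTQLQNS"

print(translate(AGA3_5PRIME) == AGA3_N_TERM)
```

A mismatch in a check of this kind would more likely indicate a transcription error in the listing or a frame interrupted by an intron than a problem with the enzyme annotation itself.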
[219] Table 2. Nucleic Acid and Amino Acid Sequence Listings for Myceliophthora thermophila C1 Carbohydrases
Enzyme Nucleic Acid Sequence Amino Acid Sequence
Agu2 SEQ ID NO 17 SEQ ID NO 18
CL10353 gcagggccccaagtgagagagccccaacgctgttggtagcgaaagccatcgttcctggtt MLTAGVKGALWLAFGSVVAALG
aacaggtctatggtctattttcccaccaatacccgtcctggtctctctccctccaiggccatca QESnST AHGAHFQIAGGHVG G tggccctcatggccctcgtggccctcatggcatcccggctccgagtggagctgcgtacctt QILVSSNDYWGVIRAAGDLAVDF gcatggctaccttgiaacccggcgagctgggtggtgtctggacctgttaataaatcgcaca GRVTGTNYTLSNGERNAAPATYT agatattcgtcgcgatcgcacgtccctcgccttgcttttcaaccaiccccggaagagagag YHPVHN NNTYYSTTGTA FTGP gcgattacgtacgagctacgcccgccttcttggtcggactgtccgcagcgcacgtgttgcg AYADPDPEKVVIIAGTIGHSKVID tccacccttacctactctctctcaagatgcttaccgcgggcgtgaagggagccctctggctc KLIASRSLDVSRVKGKWESFTSQL gccttcggctcggtggtggccgctctgggacaggagagcatcatctcgaccaacgcccat VIC PVPGCKQALVVAGSDPRGTI ggcgcccacttccagatcgccggcggccacgtgggcaagggccagattctggtctcgag YGIYDISEQIGVSPWYFWADSPPR caacgactactggggtgtcatccgcgctgccggtgacctcgccgtcgacttcggccgcgt KHKNLYVTTKKKVQGPPSVKYR cacgggaacgaactacacgctcagcaacggtgagaggaacgcggcccccgcgacgta GFFLNDEQPALTNWVASHWQDT cacgtaccatccggtcaacaacaagaataacacttacgtgagtaccgcgaccgcgcttttat PYGPGYGAAFYGLIFELLLRLRAN tatatacggccctcccagcacccaaagttttttttttttmcttttctttttntttttm YLWPAIWATMFEVDDPANQPLA cccgccagtattccacgaccgggaccgccaacttcaccggcccggcctacgcagacccc DAFEVVIGSSHTEPLMRAQNEFG gacccggagaaggtggtgatcattgccggcaccattggccactccaaggtaatcgacaag HFYEGPWAY LNNKITDDYFRYG ctgattgcgtcgcggtccctcgacgtgtcgcgcgtcaagggcaagtgggagtcgttcacc VQRAKPYAR SLWTMGMRGTG agccagctggtcaagaacccggtgccgggatgcaagcaggcgctcgtggtagccggaa DTAIEGLGVEYIVEMLQTLVKNQ gcgacccccgcggcaccatctacggcatctacgacatcagcgagcagatcggcgtcagc RQIMAEGLGIKDITTVPQMWCLY ccgtggtacttctgggccgacagcccgcccaggaagcacaagaacctctatgtcacgacc KEVMSYLSAGLQVPDDVTLLWA aagaagaaggtccagggcccgccgtccgtcaagtaccgcggtttcttcctgaacgacgag DDNWGNVRRLPLRNETQRHGGA cagccggcgttgacgaactgggtcgcctcccactggcaggacaccccctacggccccg GIYYHFDYVGGPRDYKWINTIQL gctatggcgcagccttctacggcctcatcitcgagctgctgctccgcctccgcgccaactac TKTAEQ HMAYARGADRIWIVN ctgtggcccgccatctgggcgaccatgttcgaggttgacgaccccgcgaaccagcccctg VGDMKALEIPISHFFDLGYDAER gccgacgccttcgaggtcgtcatcggatcctcgcacacggagcccctgatgcgcgcaca WHVDS REWAEAWAAREFGPAR 
gaacgagrtcggccacttclacgagggcccgtgggcctacaacctcaacaacaagaccat AREIADVMMKYGMYAARRKYEL cgacgactacttccggtacggcgtccagcgcgccaagccgtacgcccgcaacagcctgt VEPWVYSVINY EAEAVLQQWA ggaccatgggcatgcgcggaaccggcgacacggccatcgagggcctcggcgtcgagt ELVADAQAIYDELPAEA PAFFQ acattgtcgagatgctgcagaccctggtcaagaaccagcgccagatcatggccgagggc TVLHPALAGEIVHQINVGGAR M ctcggcatcaaggacatcaccacggtacctcagatgtggtgcctctacaaggaggtcatgt LYAGQKRNAANKAIQDVLAYSA cctacctctcggccggcctgcaggttcccgacgacgtgacgctgctgtgggccgacgac ADANLTRRWDALLDGKWKHFM aactggggcaacgtgcgccggctgccgctgcggaacgagacgcagcgccacggcggc DQTHLGYDGYWQQPMRNALPPM gccggcatctactaccacttcgactacgtcggcgggccgcgcgactacaagtggatcaac VYVQTDFTSLAGEIGIGVEGSNAT acgatccagctgaccaagacggccgagcagatgcacatggcctacgcccgcggcgccg VKGDDRWHANSGNDLTLPPLDP Patent 124702-0230
acaggatctggatcgtcaacgtcggcgacatga aggcgctcgagatccccatc agccact YGPATRYIDEFSRGSEECSWTLSSS tcttcgatctcggctacgacgccgagcgctggcacgtcgacagcacccgggaatgggcc KPWLKLSQSTGWGPNRPDTRVL gaggcctgggccgcccgcgagttcgggcccgcccgcgcccgcgagatcgccgacgtc VSVDWKSAPPAPYSETVQINITTQ atgatgaagtacggcatgtacgcggcccgccgcaagtacgagctggtcgagccctgggt CTGMDRYGFGDPHVLVPVTVRG ctactcggtcatcaactacaacgaggccgaggcggtgctgcagcagtgggccgagctcg VP SFKRGFVESDGTVAFAAEHY tcgccgacgcccaggccatctacgacgagctccccgccgaggccaagcccgccttcttc SRJVEPASKGNGKGKG GNDEDV cagaccgtcctccacccggccctcgccggcgagatcgtgcaccagatcaacgtcggcgg TYHTFASYGRTSSGVGLVPLGAE cgcccggaacatgctgtacgccggccagaagcgcaacgccgccaacaaggccatccag KLAREEAPALEYDLYLFTNTSAA gacgtcctcgcttactcggccgccgacgcgaacctcactcgccgctgggatgctctcctcg NVTLYLSPALNYLGDATPLEYAV atgggaagtggaagcacttcatggaccgtaagltttatcccggtttttttctctcccgcctccct ALFP A SS S SGPSPEDDDD AV A VK ccctcccgcctctcttcctccacacaatcccattttttcaccttttcgccgtgacactaaaggg HVQPVGATEGGNMPRGWDGAV aaattcaccgcacacagaaacacatctcggctacgacggctactggcaacagcccatgcg ADGVWGLTGGYTTSSFEVAREG laacgcgctcccgcccatggtgtatgtccagaccgacttcacctccctcgctggtgagatc AY LRVWALMPGVWQ VVVD ggcatcggtgtcgagggctccaacgccaccgtcaagggcgacgacaggtggcacgcca LGGVRPSYLGPPESFLVGRDRIGA acagcggcaacgacctcaccctgccgcccctcgacccgtacggcccggccacgcgcta RNGTSFLG
catcgacatcttctctcgcggctccgaggagtgttcctggaccttatcctcctccaaaccctg
gctcaagctgtcccagtccacgggcgtcgtcgggcccaaccgcccggacacccgcgtcc
tcgtctcggtcgactggaagtccgccccgcccgccccgtactccgaaacggtccagatca
acatcaccacccagtgcacgggaatggaccggtacggcttcggcgacccgcacgtcctt
gtgcccgtgacggtgcgcggtgtgcccaagtcgttcaagcgcggcttcgtcgagtcggac
ggcaccgtcgccttcgccgccgagcactacagcaggatcgtcgagcccgctagcaagg
gcaacggtaagggtaagggtaagggcaacgacgaggacgtcacctaccacaccttcgc
ctcgtacggccgcacctcgtccggcgtcggcctggtcccgctgggcgccgagaagctcg
cgcgcgaggaggcgcccgcgctcgagtacgacctctacctcttcaccaacacctcggcc
gccaacgtgacgctgtacctgtccccggcgctcaactacctcggcgacgccaccccgctc
gagtacgccgtcgccctgttccccgcttcctcctcctccggcccctcaccggaggacgac
gacgacgcagttgccgtcaagcacgtccagcccgtcggcgccaccgagggcggcaaca
tgccgcgcggctgggacggcgcggtcgccgacggcgtctggggcctcacgggcggct
acaccaccagctcgttcgaggtcgcccgcgagggcgcctacaagctgcgcgtgtgggcc
ctcatgcccggcgtggtggtccagaaggtcgtcgtcgacctcggcggcgtccggcccag
ctacctgggcccgcccgagagcttcctcgtgggcagggaccggatcggagcgaggaac
ggaaccagcttcctggggtgaagggggggaaaaaagaaacrtgcgttgcttcttctgccg
cttcttcttttcttcttcitggatgatcaaggccagtagttttatttatcggtctcgaacgaaactg
taatctgtatagttttttttatcattgtagcgcaagaagaaaacacaaataataaaatgtttgata
taattcaggacaagtaggagttcccgagaagctgatatacttaactgtggttgtgtcgcttgt
ctgctctgtcctggcgctcagaatgggtggaccgccgggacgctcgcagacctgaccttt
ggattgagaactattcactcgctattgctcctccaaacaagttggacaggtgaaagctgcatt
ttccgagtaaggccttttgctgaatcctgtccctaagctggaglataatgtagtcgagttgcct
ctttgggtgcaccgaggttaaagggtgaaaagctcggaggacttcagtttcacgttttcgcc
ctcgaatacagagatactcgtaatttagttacggtattggcccgtgttggctaaagcgttcgct
ttccctgctccaaatggtagaaaaggcggggaattcgaaatatgcacttaccgcgtaaaci
gtgaaaccaagtcctcgaaacatcaagtcaaacaagcttaggagtcgtggattcaagcaaa
gaccagattgcataatgttaagacccacttgcactttcgtttcaggaacataacgttagtgtag
ccccactctctggtcgctacggaaacgggacctggcgaaaacagtcaatattccgcccgc
caatcccagccgagaaattgcaggcaggaggcccagcaagagccagcaacgtcatttcc
cttcgttctctggcttccacgcgtcgttgccccttct
Gxh1 SEQ ID NO 19 SEQ ID NO 20
agacgacccgcgtcaacaccctccaggagctctctcataaacgtaaccactcacaaagcc MHLTGTFLTV AAAMS G AAAAP S actgttaacggactatgcacctcaccggaaccttcctcacggtggcggcagccatgagcg ERPIEARQANTITVDLSQTYQRMD gcgccgccgcagcccccagtgagaggcctatcgaagcacgccaagcaaacactaicac GFGFSLAFQRANLITNMSD T Q ggtcgacctctcccagacgtaccagcgcatggatggattcggcttcagcctggcgttccag RELLDLLFNRTTGAGFSILRNGIGS cgggcgaacttgatcaccaacatgtcggacaagaccaagcagcgagagctgctcgacct SPNSNSDFMNTIAPNNPGSPNAEP gctcttcaaccggacgacgggggccggcttcagcatcctgcggaatggcattgggtcgic QYMWDGKDSGQLWVSQQAVNL gcccaacagcaactcggacttcatgaacaccatcgcgcccaacaacccggggagcccc YGV N1YADAWSAPGYMKTNGR aacgcggagccccagtacatgtgggacggcaaggacagcgggcagctgtgggtgtcgc DTNGGTLCGVPGAQCASGDWRQ agcaggcagtgaacttgtatggggtcaagaatatctacgccggtaagtgctccgttgggaa AYANYLVAYIGFYAEEGVNITHL gaaggtaactgggccggtccgtggcttctaatcaacaacggttggctcgctgtagacgcct GFLNEPD Y S AS Y ASMQ SNGNQAA ggagcgctccggggtacatgaagaccaacgggcgcgacaccaacggcggcaccctct DFIKILHPTLE AVGLGDRVRI V CC gcggggtgccgggagcgcagtgtgcgtccggcgactggcggcaggcgtacgccaact DSMGWNNQ VS V SQIRS AG AED atctggtggcctacattggcttctacgcggaggaaggcgttaacatcacgcacctggggtt LLGTVTSHTYSGGPGGPMSSRAP cctcaacgagccggactacagcgccagctacgcgtcaatgcaatccaacggcaaccagg VWLSEQCDLNGAWTTAWYSYG cggccgacttcatcaagatcctgcacccaacactcgaggccgtagggcttggggaccgg GAGEGLTWASNIYNAVVNA SG gtacggatcgtgtgctgcgactcaatgggctggaacaaccaggtgagcatggtctcgcag YL Y WEGVQ WPNPNTNEKLIR VD atccggtcggccggcgccgaggacctgctggggaccgtgacctcgcacacgtattcggg NTTNTYEVSSRLWAFANWSRYV gggcccgggcgggccgatgagctcgcgggctcccgtctggctgtcggagcagtgcgac RPGAVRVGVSGGGNGLRTAAFR Patent 124702-0230
ctcaacggggcctggaccacggcctggta ctcgtacgg cggcgcgggcgaggggctga NEDGTIAVIAISSGGSAANVSIKIS cgtgggccagtaacatctacaacgcggtcgicaacgccaacatctcgggctacctgtactg GGPGAAAVQAFVSDNTRKCESTP ggagggtgtccagtggcctaacccaaacaccaatgagaagctcatccgcgtggacaaca ATVAGDGTISGSVAARSITTFFILP cgaccaacacgtacgaagtgtcgagcaggctgtgggcttttgccaactggtcacgctacgt ESS
gcggcccggggctgtgcgggtgggggtcagcggcggggggaatggactgcgcacgg
cggccttcagaaacgaagacggcacgattgcggtcattgccatcagctcgggggggtcg
gcggcgaacgtgagcatcaagatctctggaggccccggcgcagccgcggtgcaggcgt
ttgtgtctgacaacacgagaaaatgcgagtctacgccggcgacggtggcaggtgatggaa
ccatctcagggtcagtcgcagccagatcaatcacgaccttctttatcttgccggagagctcc
tgagcagaacgggaggtgaggtatggcgggaacggaaatgacctagctatggtgaaag
gacgaggccaaaaaaaaagcccaagaagcactggtagcagaagatgaaatcgagttac
acattttatgttttctcctgcccttttttttttcciaatgcagaagcttcgatgtgatggatgggga
aggaatgatgaaagtcgggaagctcacctgccccactaaggccggaacctccttccagca
cgctcgcgctatctatatccaccaccccaccatccggtcccggacttgcaagcttcccggg
titcaagaatccctccccgaaaaccgaccattcartgtgaaccagtgtgttccggccacttgc
cgaatccaaccaacagcctcccgtaaccgagct
Gxh2 SEQ ID NO 21 SEQ ID NO 22
tggcgaactggttgtcttgtgcaataatactatacctcggggtcttctccaacctgctgggctc MYSLLIALLCAGTAVDAQALQQR gtcattgtgctgttgttttgtgtrtgtccgcaacaagcattccccagagcaccgacccgtgaa QAGTTLTVDLSTTYQRID GFGTSE aggcgtggttggcactccgcaagtcccttttctttgctgagcgagcttcatccgacagccca AFQRAVQMS LPEEGQRRALDVL ccacccccgcaatgtattctctgctcattgccctcctctgcgccggcactgccgtcgacgcc FSTT GAGLSILRNGIGSSPDMSS caggctctccagcagcgacaggccgggacaactctcaccgtcgacctgtccacgacgta DH VSIAPKSPGSP NPLIYSWDG ccagaggaicgacgggttcggtacctcggaggccttccagcgcgcggtgcagatgagca SDNKQLWVSQEAVHTYGVKTIY ggctgcccgaggaaggacagcggcgcgctctcgacgtcctgtttagtaccaccaacggc ADAWSAPGYMKTNGNDANGGT gccgggctctccatccttcgcaacggcatcgggtcctcacccgacatgagctccgaccac LCGLS GAQCAS GD WRQAYAD YL atggtctcgatcgcgcccaagagccccggctcgcccaacaacccgctgatctactcctgg TKY VEFY QESNVTVTHLGFI EPE gacggctccgacaacaagcagctgtgggtgtcgcaagaagccgttcacacgtacggcgt LTTSYAS RFSASQAAEFiKJLYPT caagaccatctacgccgacgcctggagcgccccgggatacatgaagaccaacggcaac IQKSNLTYKPTIACCDAEGWNSQ gacgccaacggcggcaccctgtgcggcctcagcggtgcccagtgcgcgagcggcgact AGMLGALSSVNSMFGLVTAHAY ggaggcaggcatatgccgactacctcaccaagtgagtaggacctccaccttctccaaccc TSQPGFSMNTPHPV MTEAADL cccccccccccagctgagatcttcggtttgtccggccaaggcagcggcagaaggaaacg QGAWTSAWYSYGGAGEG TWA aaaagaaaagaagctaatagtggaattccagatacgtcgagttttatcaggagtccaacgt NNVYNA1VNGNASAYLYWIGAQ gacggttacccacctcggcttcatcaacgagccagaactgacgtacgtcccccctcccctc TGNTNSH VMDANAGTVEPS R aatctctttcttcttccccccatatcgaccacgcatctctcggttcactacigaccacacacaat LWALGQWSRFVRPGARRVAVSG ttctccctcaatctttcttgtgtaacagaacgagctacgcctcgatgcgcttctccgcctccca AS GSLRTAAF NEDGS V AV V VIN ggcagccgagttcatccgcatcctgtacccgaccatccagaagtccaaccttacctataag SGGDAAVNVRLASSSSADQQPAS ccgaccatcgcctgctgcgacgctgagggctggaactcgcaggctggcatgctgggtgc AKAWATDNSRAIEEIQASFADGV cctgagctccgtcaactccatgtttggcctcgtcacggcgcacgcctacacctcgcagccg ATVNVPSRSMTTVVLYPAADA ggcttctccatgaacacgccacacccggtctggatgaccgaggccgccgatctgcgttag
acatccctclttctmctcatcttccttcttatcMcmctttctttctttccttttttcccctcttcgtc
tttctttccagaagaaaagctgacatgccatgtcggaaaaaaaaagagggagcgtggacct
cggcgtggtactcgtacggcggcgcgggcgagggctggacgtgggccaacaacgtgta
caacgccatcgtcaacggcaatgcgtcggcctacctgtactggaicggcgcgcagacgg
gcaacaccaacagccacatggtccacatcgacgccaacgccggcaccgtcgagcccag
caagcggctgtgggcgctgggccagtggagccgcttcgtgcgccccggcgcccgccgt
gtcgccgtctcgggcgcctccggcagcctgcgcaccgccgccttccgcaacgaggacg
gttccgtcgccgtcgtcgtcatcaacagcggcggcgacgccgccgtcaacgtcaggctc
gcctcgtcctcgtcggccgatcagcaacccgcctcggccaaggcctgggccaccgacaa
ctcgagggccatcgaagagatccaggccagcttcgccgacggcgtcgccaccgtcaac
gttccctcgaggtccatgaccacggtggtgctgtatccggcggccgacgcgtgatgtccgt
cggcatgcagaccccggctgtctttgtacatacatacacaccgtgtgctctgtgtgctgttca
ggggctcgicaagcacccatcgtgcatctagacaagctatctatttgcctaagtggtagctc
ggtggcccaatcaaccatctttggtacattgaggaacgccgtggccttggatcccat ttctc
caacc
Aga1 SEQ ID NO 23 SEQ ID NO 24
CL08363 gcggtcagctgagcgaaaagcacatcaagaccgagagttgagaacattacgtcgacgat MALGLLLYAMGTAALNDGVGKL ctgagccttgtggattcgtcggcgtgttgagttcaatccattgtcgaataattaaatctccgct PALGFNSTSAYQCNYTADVLLEQ gaagcccatcaaggccgtctctggactggcggactaatccgacaactgatcctttcttgccc AQAIVDRGLLKAGYNYFMLDDC aagagttcggaatgagaccctccgcagccgaaaatgaaaattaacatccgactgcaatgct YSL ERDEN GRI VEDPEKFPNGM aagttgtagctacatcctcaccggtcaagttgcttgcaggttgatacag tcgttigtctgtgg KNFTESLAKLGFRAGIYSDAGYR acgcgtttgtaggataagcccigtggatctggcaaacgggctggcaaacggagggttcttt TCGGYPGSYGNEAKDLETFAEW gttcctatcctctacttatacagagccacagggccagcaaacgtgggaagggacatcgca GFEYL YDNCYIPFDNVTQENVY cttttgaactatttacaacttaccaactttcttccatcaattggaaccatcttccatctattggaaa GRYER AEAIRARAEETDSAPLQ ccatcttctatcacaggtgctcacataattatatatacactacaagtatatataaaacttcagtg LALCEWGWQQPWRWAGRLGQS atcattctttRtatttactatttacaaaccaiaatcacagacttiaatatgtctrtcaaiaccaatta WRIGGDIRPWWSALSSIINQASFIA Patent 124702-0230
actgttttcccctcttcacctctttcaggccgagtcaaclgaacttctcccaaccgttctggaa GATGFYARNDLDMLEVGNTGIGT aatgacggcgacaacaataaaatcactaatggcccttgggctcctgctctatgcaatgggc PPGNLSYDEAKSHFTAWALLKSP accgcagctctaaatgatggagtaggaaagctccccgcccttgggttcaacagtacgtctg LLiGSDMASASRETMElLGNEDIL cctatgacgctttctctatcgagtccctgccgctacccatcctgtacacatgtacataccgac RINQDPHVGEAVAPFRWGHNAD gttggttgacgggatgaacagcttggaacctgtaccagtgtaattacactgccgacgtcttg YTWDPDHPAEYWTGNSSYGVVF ctggagcaggcccaggcaatagttgatcgcggccttttgaaggccggatacaatgtgcgtt MVLNTFDEPRHMSFNLTESWAIR cgctgagagaacaagcagtcgaggttatgactcgagagctaacatggaggctgctcagta AGRLYSVYDLWTHTDNGTAYRN ctttatgcttgatgactgctactccttgaaggagagggacgagaatggcaggattgtcgaag ISLTLPPHG VAALLLND AGPEPEA gtaactgactgaaaggcggttcgtatggggcgctgagcctcttgtttcacctctctcttttcctt AEFYCAVWWQCSYPNVVDDDD catgtttccccccttccctccactcgtaagagacctgacacaggacctgatcagttgattgac GWHG
atatttagacccggaaaagttcccgaatggcatgaagaacttcaccgagtcgctggccaag
ctgggcttccgggcgggcatctacagcgacgcgggctaccgcacttgcggaggctatcc
gggctcgtacggcaatgaggccaaggacctcgagacgtttgccgagtgggggttcgagt
atctcaagtacgacaattgctacatccccttcgacaacgtgacgcaggagaacgtgtacgg
gcggtacgagcggatggcggaggcgatccgggcgcgggccgaggagacggactcgg
cgccgttgcagctggccctctgcgaatgggggtggcagcagccgtggcggtgggcggg
gcggctgggccagtcgtggcggatcgggggcgacatccggccgtggtggtcggcgctg
tcgagcatcatcaaccaggcgagcticatcgcgggcgcgacgggcttctacgcgcgcaa
cgacctcgacatgctcgaggtgggcaacacgggcatcgggacgccgccgggcaacctc
agctacgacgaggcaaagtcgcacttcaccgcctgggcgctcctaaagtcgcccctgctg
atcggctccgatatggcctctgcctcccgcgagacgatggagatcctcggcaacgaggac
atcctccgcatcaaccaggacccgcacgtcggcgaggccgtcgccccgttccgctgggg
ccacaacgccgactacacctgggacccggaccacccggccgaatactggacggggaac
tcgtcgiacggcgtcgtcttcatggtcctcaacactttcgacgagccccgccacatgtccttc
aacctgaccgagagctgggccatccgggccgggcggctctactcggtctatgacctgtgg
acgcacaccgataacgggacggcctaccgcaacatctcgctcaccctgcccccgcacgg
cgtggccgcgctgctgctgaacgacgccggcccggaacccgaggctgccgagccgtac
tgcgccgtgtggtggcagtgctcgtatccgaatggtacgttggctggtttttctgctcctaata
gtaattaatcctagtcgtcgacgatgacgacggatggcacggctgagactgacagcatgg
cccaggcacgtactacagtaactaggggaccgggcgagccgaaaaagagagaaaaac
gagaaaggaaactgcttcttgagaagccctacctttgcgttaacacaacgacctaataatga
ttgtatgcttgatgatagttcggataggagtaatattgaacaacaaaagtaggagcacaacc
cagagtgaagcgtcgacccttccatacctatctagcgcctgacttgggactcatagtgcatg
tatggttcctaai
Aga2 SEQ ID NO 25 SEQ ID NO 26
CL08970 ctcatcatattaatacgcatcgctgccgccagtccggccaattcgaattcctcagcgttgatc YARNSFLVALYWPRVAKALD ttgggattcatccgccagtctttcctgaattgaaacaccgctaattaccatgaagtacgcgcg NGVGLTPHMGWSSWNVAQCDA caacagtttcctcgtggccctgtattggccgcgagtcgccaaggccttggataatggggtc ASAKYALDTAEKFISLGL DLGY ggcttgacgcctcacatgggatggagcagctgggtaagaggaacatcatcatgcagcgc EYTNIDD C WSLKSRDEN GKLVPDP ccttccccctgtcaaccccccgaggccagaagagacgacggggctcggttcgctaattttc GKWPDGIKPVADRIHDLGLKFGL ccatctaattatgccgcagaatgtcgcccagtgcgatgctgcctcggccaagtatgccctc YGCAGQKTC AGYP GS D GDKY AA gacacggccgagaagtttatttcgctcgggctcaaggacctcgggtacgaatgiaagaag SDVSQLVEWGVDFW YDNCYTP gacaatggggcaagcttctcctggagccaacctgccctcatccccagccgccccccagc CLDNPPPQTCQRPAGNSQEWYAP cggattctccgagccgtgtgtgcatgacgcgtgctgacggcggcccccccccccccccg MRDAILGVQETRKIHFNLCNWGR gcagacatcaacattgacgactgctggtcgctcaagtctcgcgacgagaacgggaagctt DDVWEWGDDYGHSWRMSVDN gtgcccgatccgggcaaatggcccgacggcatcaaacccgtcgccgatcggatccacga WGDWESVERIGSAAADIAEYSGP cctggggctcaagtttggtctgtacggctgtgccggccagaagacctgcgccggttaccc G GFNDLD LYLGSPI LN ANEERL cgggagcgacggggataagtacgcggcctcggatgtctctcagctggtcgagtggggag HFGLWAIA SPLVLGLDLATISNA tcgatitttggtatgtggtt attattaacccccttglttttcttcmctttctttctttctttctttcittct TLDIIRNI GIIDINQDRLG AATTF ttctttctaaaacccactacccgtacaggaagtatgacaactgctacaccccctgcttggaca TPPGRPGPESESGRIYPYWAGPLS acccaccgccgcaaacctgccagagaccggcgggaaacagccaagagtggtacgccc DGV VIGLC AGS S AGT YAVDERD V cgatgcgggatgccatcctgggcgtccaggagacgcgcaagatccacttcaacctctgca PGLGDGSYSWEEMYTGQTGTGT actggggacgtgacgacgtctgggagtggggcgatgactacggacactcgtggcggtac GVSFDIDLHDMRVI V TAGTKG gtctcgccgaggtccgctccgtggacccttcctaatgtttaagccctaaigtttaagctaatac GRAAEL
cgtgataaacaaatctaacctaacaaccgccggatcgaccatggacaggatgagcgtcga
caactggggagactgggagagcgtcgagcggatcggcagcgcggcggccgacattgc
cgagtacagcgggccgggcgggttcaacgacctcgacatgctgtacctcgggtcgccga
agcrtaacgccaacgaggagcgcctccacttcggcctctgggccatcgccaagtcgcccc
tcgtgctggggctcgacctcgccaccatctcgaacgcgacgctcgacatcatccgcaaca
agggcatcatcgacatcaaccaggacaggctgggcaaggccgctaccaccttcaccccg
cccggccggcccggcccggagagcgagagcggcaggatctacccgtactgggccggc
ccgctcagcgacggcgtcgtcatcggcctctgcgccggctcctcggcggggacctacgc
cgtcgacttcagggacgtcccgggcctcggggacggcagctactcgtgggaggagatgt
acacgggccagaccggcaccggcaccggcgtctcgttcgacattgacctccacgacatg
cgggtcatcaaggtgaagacggcgggcacgaaaggaggacgggcggcagaactttga
gttgtccaagacaaatacatacggtaattaagcagtgtgcaaacgttgtaccttaattaccta
cctttgcagggggacagcttgataacatgggtaattaatacattattagagccatcttgagcc
cctctcggcaacgtaggtagtgattggctcgaaatgtacatacactacctcgcgagtccttg
atttgaccataattattctctggagaaaacatttcgtttctga
Aga3 SEQ ID NO 27 SEQ ID NO 28
CL00565 atgtccgcttttatcgtcacgtacccgccgctcggccaagttacgcagctccagaactctga MSAFIVTYPPLGQVTQLQNSELTI
actgacgatccatgcggttctggagattcctcctgaatctactacggagaattggcagcttg HAVLEIPPESTTENWQLALWYSN ctctetggtattccaacggagateaggaggaatgggaagaagcagtcttagttccctctate GDQEEWEEAVLVPSIHDVRPTEL cacgatgtccgtcccacggaactacacgaatccattggagccgcggcgaggttgtacttca HESIGAAARLYFTTRVVVRSSLTF cgacccgggtagtcgtcaggtcctcgttaaccttcaccatcaagtttcgtcaaggcaccgat TIKFRQGTDDQEWKWVRSEQGS gaccaagaatggaaatgggttcgcagcgagcagggctcgggtgatgccattgtgatcatc GDAIVIINQKPTREDDPEDLPDLIR aaccagaaaccaactcgagaggacgaccccgaggacctgcctgacctgattcgggatct DLNPELEWKRHMSQSPGTRLWT gaatccggagctggaatggaaacgccacatgagccagtctcctggaacgcgcctgtgga VEAQVHGAKEDESAFVEVPLGIP ctgttgaggcacaagtacacggggcaaaggaggatgaatcagcgtttgtcgaagtaccac WGGRFLRWFALVRSWTPWLAPR tgggcattccgtggggcgggcggttcctcaggtaaggcgtggtgccgttgaaagggactg HG SEFGLDKDGLLCSFLSSHG attttggatctctttcgcctcccttgcaatgcccgccgactccaatactgatgatgctggcaga HLVMLGLGGINNVTALLRSGDAG tggtttgccctggttcggtcctggaccccatggcttgctcccagacacggcaagtcagaglt RLILSLRNDNVKSTTGTVLVAVG cggcctggacaaggatggcttactttgttcgtttctctcctctcacgggaagcacttggtgat DNLENAVAAVMYHARTLVTAAN gcttggacttggcggaatcaacaatgtcactgcgctcttgcgcagcggtgacgctggccgt APNSHGIASPTAEGDVGFQWYEN ctgatactcagcgtatgccatatgtgtcccctggggctctgtgctgcaaggactgaggttga WYDGLGYCTWNSLGQQLTEEKIL cccagcttcgaaacgataatgtgaaatcaacaacaggaacggttctggtagccgtcggtga NALDTLAENKVNISNLIIDDNWQ caaccttgagaacgcggttgctgccgtcatgtaccacgcccggacccttgttacggcagc DIDYRGDGQWQYGWNDFEAEPR gaatgctccaaattctcacggtatagcctcgcccacggcagagggcgatgtcgggccgca AFPRGLEALVSDIRSKHKNIQHIA atggtatgaaaattggtacgacggtttgggatactgtgagtgcaaaattacttctccaccaaa VWHALLGYWAGLAPSGPLVKRY tgtacctgtatccaatgctaaagtccagggcaggcacatggaactcgttgggccagcagtt ETVQVSRDDTQ SHLPIGNAMTV gaccgaggagaagatcctgaatgcactagacacgttggccgagaacaaggicaacatct VAPSDVQDFYEDFYRFLTSCGIDG ctaacctgatcatcgatgacaactggcaagacattgactaccgcggcgacggccaatggc VKTDAQYMLDTLTQPAARRTLTS agtacggctggaacgacttcgaagccgagccgagagccttcccccggggcctcgaagc SYLDAWTSSTLGHFAGGPVVAG cctcgtctccgacatccggtccaagcacaagaacatacagcacatcgcagtctggcatgc MALSPPTLFHPRLFRTSLPQIVCRT 
cctcctaggctactgggccggtctcgccccctccgggccccttgtcaaacgctacgaaac SDDFVPTGGGDDSDDDAHPWHV cgtccaggtctcacgcgacgacacccaaaagtcccacctccccatcggcaacgccatga WTNAHNALLAQHLNALPDWDM cggtcgtcgctccrtccgacgtccaggacttctacgaagacttctaccgcttcctcacctcct FQTAHPRGGFHAAARCVSGGPVC gcggcatcgacggcgtcaagactgacgcccagtacatgctcgacacgctcacccagccc VTDPPGQHDEELLRQIAGATPRG gccgcccgccgcacccttactagctcctacctcgacgcctggacctcctccaccctcggc RTVVFRPSTVGRTLDAYSSRADG cacttcgccgggggccccgtggtcgcgggcatggcgctgtccccgcccacgctcttcca GGGGLLKVGAYHGRAGTGTGIV cccccgcctgttccgcaccagcctgccgcagattgtgtgtcgtacctcggacgacttcgtc AVFNVDPRGNRPVAELLPLARFP ccgaccggcggcggcgacgacagcgacgacgacgcgcacccgtggcacgtgtggac GVGTGTGAGEGGAGGRYVVRAH caacgcgcacaacgcactgctcgcgcagcacctcaacgcgctgccggactgggacatgt RSGKVTPPLRPGSPAALVTVSLEA tccagacggcacacccgcggggcgggttccacgcggcggcgcggtgcgtcagcggcg GWDVLSAYPLHAVQSGTRGEV ggcccgtctgcgtcacggacccgcccggccagcacgacgaggagctgctgcgccagat LL
cgcgggcgccacgccgcgcggccgcaccgtcgtcttccgccccagcacggtcggccg ANLGLVGKMTGCAAVLRTVFEA gaccctggacgcctacagctcccgcgcggacggcggcggcggcgggttgctcaaggtc RENGRMLVDATVKALGVLGVYIS ggcgcgtaccacggccgtgccgggacgggcacgggcategtggccgtgttcaacgtcg VLPELSINDDFMVTIRGQPIPPHTV atccgcggggtaaccggccggtggccgagctcctgcccctcgcccggtttcccggggtt SVSRQDERVLEVDIETAWTEMGL gggacgggtacgggtgcgggtgaggggggcgcgggcgggaggtacgtggtgcgggc ESGWANEVQVI VYFALEKK gcaccgcagcgggaaggtgacgccteccttgcggcccgggtcgccggccgctctggtg
acggtctctctcgaggcgaaggggtgggacgtgctgtctgcctatccgctgcatgcggtg
cagtcggggacgaggggggaggtgttgcrtgcgaacctggggctggtgggcaagatga
cggggtgtgcggcggtgctgaggaccgtgttcgaggcgcgcgagaacgggaggatgtt
ggtggatgctaccgtgaaggcgotaggggtgttgggtaagccatcaaccaaatatggggg
ggaagaagaattcgatccccaaagaccgatgcgatgctgacttgcggcaggcgtctacat
ctcggtcctccctgaactatcgatcaacgacgacttcatggtgaccattcgggggcagccg
ataccQcctcacactgtgtccgtgagccggcaagacgagcgogtcctggaggtggatatc
gaaacagcgtggaccgaaatgggcctcgagagcggctgggcgaacgaggtgcaggtc
aaggtctactttgccctggagaagaagtag
Man8 SEQ ID NO 29 SEQ ID NO 30
CL00009 agtatggggatcaccaatgggcatgtagtacttgattggtgattcggacaatgttgccccga MTQHTARVLSSGWEFKSDADEK
gggtcgggatacgggccttggacggccgtatgaacgaggagttgctccaaacacactcg WLPVSGSPSNVHTDLMRHGLIPD attttcactctctgagcgtcaacgctacccctccactaatcttcgttcactcctcatcatcaactt PFQDTNELEVRWVAERTWRYRTS tgcctttgtctcagagacacgacaaggaagtggagtctcccagcatgacacaacacactg FATPSCYGRARGVRVDLVFEGLD cccgtgtcctctcgtccggctgggagtttaagtcggacgcggatgagaaatggctcccggt TFATVTLNGQVILRSDNMFLEHR ctcggggtcccccagcaacgtccataccgatctgatgcggcatggtctgatacccgaccc VDVGDVLVDGAEEESINTLEIAFE attccaagacactaacgaactggaagtccgctgggttgccgagcggacgtggcggtatcg PAGRRGLELVRAHPEHEFIVHQTE aacctcgtttgcgacgccgagttgttatgggcgggcccggggagtcagggtcgatctcgt VSRGPVRKAQYHWGWDWGPILL gttcgaggggctcgacaccttcgcgacggtgacgctcaatggccaggtcattctgcggtc TCGPWKPVRLETYVGRIEDVRVD Patent 124702-0230
ggacaaiatgttcctcgagcaccgagtcgacgtcggcgacgttctggtggacggcgccga YEICSTG AEGDORPIVEATVHAH agaggagagcatcaacaccttggagategcgttcgagcctgcgggccggcggggcctg VLGPAAELEAELLLSGERVASWR gaactcgtccgggcccatcccgagcacgagttcattgttcaccagacggaggtcagcagg DRMGDDAAAGGPPPSSSSSSRRY gggccggtccggaaggcccaatatcattggggctgggattggggccccatcctcttgacg SSPRLRIERAELWWPRGYGPQSL tgtggcccgtggaaaccggtaaggctggagacalatgtcggcaggatcgaagacgteag YELKLRILAADGLTVLAEEHRRIG ggtagactacgagatttgcagcacgggccgggcggagggagatgggcgcccgatcgte FRKVELIREEDRFGQSFYFRVNGV gaggcgaccgtgcacgctcacgtcttgggcccggcggcggagctcgaggccgagctgc DVFSGGSCWVPADSFLPEISPERY tectatccggtgaacgagtggccagttggagggaccgcatgggcgacgatgccgccgct RDWIRLVAEGNQNMVRVWGGG ggtggtccgccgccgtcgtcgtcgtcgtcgtcgcggcgatactcgtccccccggctccgg VYEPDVFYAACDELGIMVWQDF atcgagcgggcggagctgtggtggccgcgaggatacggtccgcagtccrtatacgagtt MFACASYPTYPAFLDSVAREARQ gaagctgcggatcctggccgcggacgggttgacggtcttggcggaggagcaccggagg AVRRLRHHPCVVLWCGNNEDYQ atcgggtttcgcaaggtcgagctgatccgagaagaggatcggttcgggcaatccttctactt LVERYGLEYRFEEDRDPASWLRS ccgcgtcaacggcgtggacgtgttctcgggcggctcttgctgggtccccgcggacagctt TFPARYIYEHLLPGVVRDENPAG cctgcccgagatctetccggagcgctaccgggactggatccggctggtcgccgagggca AGATPYHPGSPWGDGRSTTLRVD accagaacatggtccgcgtctggggcgggggcgtctacgagccggacgtcttctacgcg PTVGDVHQWELWNGEARPWQLL gcatgcgacgagctcggcataatggtctggcaggacttcatgttcgcctgcgcgtcgtacc PRMGGRFVSEFGMLSHPHADTVA cgacctacccggccttcctcgactcggtcgcccgggaggcgcgccaggcggtccggcg RFVSDPAERRAGSRTMDFHT AV cctccggcaccacccgtgcgtcgtgctctggtgcggcaacaacgaggactaccagctggt AHERRLLAYVGENFGVARAAGG cgagcggtacgggctcgagtaccgcttcgaggaggacagggacccggcgtcctggctc GGAGAFAHLTQVVQADAVAAAY cggtcgaccttcccggcgcggtacatctacgagcacctgctcccgggcgtggtgcgaga RSWRRHWGRPGERRCGGVLVW cgagaaccccgcgggggcaggggcgaccccttaccacccgggcagcccctggggcg QLNDCWPAVSWAVVDYYLVRKP acggcaggagtacgacgctccgggtcgacccgacggtcggcgacgtgcaccagtggg AFYAIRRALAPLAVGVARKFHDW agctctggaacggcgaggcgcggccgtggcagttgcttccgcggatgggcgggcggttt TTRPADALWRRNTGHVDPRGML gtgagcgagttcggcatgttgagccacccgcacgccgacacggtcgcgcgcttcgtctcg 
TDVEFDVWVSSSRSDAVRARAV gacccggcggagcggcgcgcggggagccggacgatggacttccacaccaaggcggte VRFVSVRSGREVGDRJEREVQVG gcgcacgagcgcaggctgctcgcgtacgtgggcgagaacttcggggtcgcccgggccg PNGCTELLVGYKFDWRTAAVAEP ccgggggcgggggcgcgggagcgttcgcgcacctgacgcaggtggtgcaggcggac EHFVIHVALWVGGVQVSSDTSWP gcggtcgcggccgcgtaccggtcctggcggaggcactggggccggccgggggagcg DPI YLDFPDRGVSVRHLGPGLVE gcggtgcggaggcgtgctggtctggcagctcaacgactgctggcccgccgtctegtggg VSAQRPVKGFVFSEKRGVKLSDN ccgtggtggactactacctggtcaggaagccggccttctacgcgatccggcgggccctgg GFDLVPGDEPKRVEVQGCEVDEL cgccgctcgccgtgggcgtggcgcgcaagttccacgactggacgacccggccggccga SWTFVGQ
cgcgctctggcggcggaacacgggccacgtcgacccgcgcgggatgctgaccgacgtc
gagttcgacgtctgggtctccagctcgaggtcggacgccgtgcgggcgagggcggtcgt
gcgcttcgtctcggtccggtcgggccgcgaggtgggggaccggatcgagcgcgaggtg
caagtagggccgaacggctgcaccgagctgctggtgggttacaagttcgactggcgcac
ggcggcggtggccgaaccggagcactttgtcatccacgtggcgctgtgggtgggtggcg
tccaggtcagcagcgatacctcgtggcccgatccgatcaagtacctcgacttcccggaca
ggggcgtctcggtgaggcaccteggccccggcctcgtcgaggtctcggcccagcggcc
ggtaaaggggtttgtcltctccgagaagcgcggcgtcaagcttagcgacaacggcttcgat
ctggtgcccggagacgaacccaagagggtagaggtccaaggctgtgaggtcgatgaact
ctcatggactttegtcggtcaatgaacgggggttgcgggggggcgttttgggtgattaatta
c
Man9 SEQ ID NO 31 SEQ ID NO 32
CL08391 gcagctcaagctccagcaccacatgtgcttgcacttgcatccaatccacggagctggccaa APRWIPLDQNWEFRQADK.PDS gatgtgaataacaacaggagatataaggcagcgcctgcgacggccggccgctgctgcct KFLPVSQFPTNVHLDLQHHGLIPD gtgtctttttccccccctcaaccaacgacgcctcttgctgcctcggagtcgagcgctgcgat PFIGKNELLVQWVGEAQWTYRT acagcactcgttcagcacttcgttctttcccgcagctcgccaaagttcccaggcaaccacgc VFAAPPVPEGARAVIAFDGLDTFA catctgcccctccccctccccctcccacccccgggcgttcctcgggcagccgcccgtcgt TVVLNGTTILESDNMFLPHRVEVT catcatggcaccccgcgttgtgatcccgctcgaccagaactgggagttccggcaggcgg SVLKAEGNELV1TFDSAYLRGCKL acaagccagactccaagttcctgcccgtcagccagttcccgactaacgtccacctcgacct VEQHPNHKWGC NGDVSRLAV gcaacaccacggcctgatcccggacccgttcatcggcaagaatgagctcctcgtccagtg RKAQYHWGWDWGPTLLTCGPW ggtgggcgaggcgcaatggacctacaggaccgtcttcgcggcgccgccggtgcccgag RPVHLEIYESRLSDLYAETWD S ggcgccagggcagtaatcgcgttcgacggcctcgacactttcgcgaccgtggtgctcaat LKRASVKVTAVAERRADRVRFDI ggcaccaccatcctcgagtcggacaacatgttcctcccgcatcgcgtcgaagtgacctcg ALDGQQVATETAELDATSGEATV gtcctgaaggccgaaggtaacgagctcgtcatcacgtttgacagcgcctacctgcggggc SFLIDSPALWYPVRYGKQPLYDIR tgcaagctggtcgagcagcacccgaaccacaagtggggttgctggaacggcgacgtctc ATLLAGDDEVDTLSKRIGLRRAE gcggctggccgtgcgcaaggcgcagtaccactgggtgagcctcccacctcctccctccc LIQRPLEGQPGTSFFFEVNNIRIYC gcagatgaccatccgccgttctcaccgttctcaccgatgctgactgaactgtcccattacgc GGSDWIPADNFIPRISRRRYYDWV agggctgggattggggcccgactctcctgacctgcggcccctggcggccggtccacctc RLVAEGNQF IRVWGGGIYEEQA gagatatacgagtcgcgcctgtccgatctctacgccgagacggtcgtcgacaagtcgctg FYDACDELG1LVWQDFMFGCGN aagcgggccagcgtcaaggtgacagcagtegcagagcgcagggoggacagggtgcg YPAWPALLESIRREATENV RLR gttcgacatcgccctcgacggccagcaggtggccaccgagacagccgagctggacgcc HHPSIVIWAGNNEDYQYQESEGL acgtcgggcgaggctaccgtgtcgrtcctcatcgacagccccgcgcigtggtacccggtc TYDYANKDAESWLKTDFPARYIY cgctatggcaagcagccgctctacgacalccgggcaaccctgctcgccggcgacgatga E ILADVCADLVPSTPYHPGSPW ggtcgacacgctttccaagcggatcggcctgcgccgcgccgagctgatccaacggccac GAGLNTHDATVGDIHQ NVWH Patent 124702-0230
tggaaggccagccaggtacgagcttcttcttcgaggtcaacaacatccgcatctactgcgg GTQEKWQNFDRLVGRFVSEFGM cg cagcgactggatccctgccgacaacttcatcccgcgtaicagccgtcggcgctactat QAFPAVKTIDAYLPLGRDDPD Y gactgggtcaggctggtggccgagggcaaccagttcatgatccgcgtctggggcggcgg PQSSTVDFHN AEGHERRTALYL catctacgaagagcaggccttctacgacgcttgcgacgagctcggcatcctggtctggca VENLRYAPDPLEHFVYCTQLMQG ggacttcatgttcggctgcggcaactacccggcctggccggccctgctcgagtccatcag ECLASAYRLWKREWRGPGREYC gcgcgaggccaccgagaacgtcaagcggctgcgccaccacccctccattgtcatctggg GGALV QTNDCWPVTSWSIVDY ccggcaacaacgaggactaccagtaccaggagtcggagggcctcacgtacgactacgc YLRPIiLAYFTV REMAPVSIGITR caacaaggacgccgagagctggctcaagaccgacticccggcccgctacatctacgaga RTHLHPRDRHTRVNVDVKTQIEV agatcctggccgacgtgtgcgccgacctggtccccagcaccccctaccacccgggctcg WASNLTLEDLTVDCVLKAWDVE ccctggggcgccgggctcaatacgcacgacgccaccgtcggcgacatccatcagtgga SGEETFSETVAAALLLRENRSTEI acgtgtggcacggcacgcaggagaagtggcagaacttcgaccggctggtcggccggttc AALDVPVRQKNVGEEGRIVVAA gtgtccgagttcggcatgcaggccttcccggccgtcaagaccatcgacgcctacctgccc YLVDKEGRQMARYVNWPEPL Y ctgggccgggacgacccggaccggtacccgcagtcgtcgacggtcgacttccacaaca VHLQKPRALRAQLTADYSAVEVS aggccgagggccacgagcggcgcatcgcgctgtacctggtcgagaatctgcgctacgc AEVPVKGVALECEDDGVRFDDN gcccgacccgctcgagcactttgtctactgcacgcagctgatgcagggcgagtgcctggc LVDIVPGEVVTIGVSGAGKDTKIE cteggcctaccgcctctggaagcgcgagtggcgcggacccgggcgcgagtactgcggc TRYLGMI
ggcgcgctggtctggcagaccaacgactgctggcccgtcacgagctggagcatcgtcga
ctactacctgcgccccaagctggcctacttcacggtcaagcgcgagatggcgcccgtgag
catcggcatcacgcgcaggacgcacctgcacccgcgcgaccggcacacccgagtcaac
gtggacgtcaagacgcagatcgaggtgtgggccagcaacctgaccctcgaggacttgac
ggtcgactgcgtactcaaggcctgggacgtcgagagcggcgaggagaccttctccgaga
cggtggcggcggcgctgctcctgcgggagaaccggtcgaccgagatcgccgcgctgga
cgttcccgtccggcagaagaacgtcggggaggagggccggatcgtggtggcggcctat
ctggtcgacaaggagggcaggcagatggcccgctacgtcaactggcccgagccgctca
agtacgtgcacctgcagaagcctcgggcgctccgggcgcagctgactgccgattactcg
gcggtcgaggtcagcgccgaggtgccagtcaagggcgtagcgctggagtgcgaggatg
acggggtccgattcgacgacaacctggtggatattgtgccgggcgaggtggtcaccatcg
gcgtcagcggggcgggcaaggacaccaagatcgagacgcggtatctgggcatgatcta
atttccggttcttccggaaggggagtgggggaagagttatatagaaggagaagaataggt
atctagaccaatttattacgacaggtggctttagaagaatggaccggaacggcagaaggg
acaagaacggggaaacccaaggaacgatgaggaagctcattacgaatgatgttggagac
ataaacaaagttcgattattgatcicaattatctacggcactaccaagaacccgcgcccccc
ccccccccccgcatagacgcctagtagatcatccatataaaccggccgatcaccatgcatg
cacaccgttgaacgccacgagatgcaaccaaagtataaaa
Bgl2 SEQ ID NO 33 SEQ ID NO 34
cgaggtttcacgacccgatgaagtcgccaggctttcctgatggcgacccgttcccgaacat MARDGSHQGLLRDSKRSPQRPKD agagttttaagatcaatttgcgagctcgatttttttacggggacacgcaaggcaattcaagtc DAGHDSDSD1EAIEYLDRNAPRSP gatacacgatggctcgcgacgggagccatcaagggctcctccgcgattctaagcggtcg KSDYGPSFPAANKARQSWILRCL ccccaacgccccaaggacgacgccggccatgattccgactccgacattgaagcgatcga GRRSRCCIGFLAGFIGLWILLSAG gtatcttgaccgcaacgcacctcgatcacccaagtccgactacggcccgtcgtttcccgcg GAFVYKKYQEEPPYGQSPPWYPA gcaaacaaagcacgtcaatcgtggatcctccgttgcctggggcggcggagccggtgttgc PKGG1AKT AESYE AAKMVSK atcggttttttggctggcttcatcgggctgtggatcttgctaagcgcgggaggtgccttcgtct MTLAEKVNVTTGTGWQMGLAV acaagaaatatcaggaagagcctccctatggccagtcgcccccgtggtacccggccccta GTNGPAVHVGFPQLQLQDGPLGI agggcggcattgcgaagacgtgggccgagagttatgagaaggcggccaagatggtgag RFADNITAFPASITVGATWNRQL caagatgacgctggcggaaaaggtcaatgtcacgaccgggacagggtggcagatgggc MYARGRAHGIEARQ GI VLLGP cttgcggttggcaccaacggtccggccgtccacgtgggcttcccgcaactgcagttgcag CVGPLGRMPAGGRNWEGFGADP gacgggccgttgggcatcaggttcgccgataacatcaccgcctttcccgcaagcatcacg YLQGVAGAETV GIQSEGVMATI gtaggcgcaacctggaatcgccagttgatgtacgcgcgtggccgtgctcacggcatcgaa HFVGNEQEHFRQPWEWGLPHA gcgcgccaaaagggcattaacgtccttctgggcccctgcgtgggccccctcggccgtatg LSANIDDRTLHELYVWPFADAV cccgccggcgggcgcaactgggaaggattcggggccgatccgtacctgcagggcgtgg AGVASVMCSYNMVNNSYACGNS cgggtgccgagacggtcaagggcatccagagcgagggcgtcatggccacaattaagca KLLNGILKDEMGFQGFVMSDWL cttcgtcggaaacgagcaggagcacttccgccagccgtgggagtggggcctgccccalg AQRSGVASALAGLDMTMPGDGA ccctcagtgccaacatcgacgaccgtaccctgcacgagctgtatgtttggcccttcgccga KWASGESFWGPELSRAVLNGSVP cgcagtcaaagccggcgtcgcctccgtcatgtgctcctacaacatggtcaacaactcgtac VDRLNDMVTRIVAAWYQLGQDD gcctgcggcaacagcaagttgctcaatggcatcctcaaggacgagatgggcttccagggt ETKFPRQPPNFSSWTDDRTGVLAP ttcgtcatgagcgactggcttgcccagcgctccggggtggcctcggocctggcgggtctg GSPSPQEKVVVNQFVDV ANHSV gacatgaccatgcccggcgatggcgcgaagtgggccagcggagagtcgttttggggac I ARE V A VQ GTVLLKNEGLLPI SRT ccgagctgagcagggcggtcctgaacggcagcgtccccgtcgatcgcctgaacgacatg GLDEAELEARRGAAARATGRRRT 
gtcacccgaatcgttgcggcctggiaccagctggggcaggacgacgagaccaagrtccc GKFSVGVFGDDAGPG GPNYCK gcgccagccgcccaacttcicgtcctggaccgatgaccgcaccggcgtcctggcgcccg DRACNQGTLASGWGSGAVEFPY ggagcccgagcccgcaggagaaggtcgtggtcaaccagttcgtggacgtgaaggcgaa LVSPIDALRKEFDSSKVELFIEHLT ccactccgtcatcgcccgcgaagtggccgtccagggcacggtgctgctcaagaatgagg DKPSFAG DAAVLEDLELCIVFV gattgctgcctattagccgcactggcctagacgaggcggaactcgaggcgagacgtggc NADAGEGFVRWEDVKGDRPNLN gctgctgcgagagcgacaggaagaagacgcaccggaaagttcagcgttggcgtgtttgg LQKGGDDLIVNVASKCGSGSGDV ggacgacgctgggcccgggaaaggacccaactactgcaaggatcgtgcttggtatgtcg IVVVHAVGPVLME WIELPNIKA Patent 124702-0230
atccctccatcaaaccgtttcctagtcgaaagagaggagcagtctacgccgtcgatccaag VLFANLPGQESGNALADVLFGKA catcacacacgtagaaactaacacattgtagtaaccaaggcacgctggcttcgggctggg NPSGHLPFTIG SPEDYGPGG V gttccggggccgtcgagttcccttatttagtatcgcccattgatgcgctccgcaaggagtteg MYLPNGWPQQDFSEGLYVDYR acagctccaaagtcgagctgcacgagcatcigacagataagccatccttcgccggcaagg YFDKKDIEPRFEFGFGLSYTTFNL acgcggcggtccttgaagacctggaactctgcatcgtcttcgtgaacgccgacgcgggcg SNLRVTSHKNKSALPAARPSPAA agggattcgtacggtgggaggacgtcaagggagatcggccgaacctaaacctccagaaa QPPSYPTAIPPKDEAMFPPGFRAL ggcggcgacgacctcattgtcaacgtagcatccaagtgcggttctggctccggcgacgtc E YIYPYLESVDDIEVGDYPYPDG attgttgtcgtccatgcggtagggccggtgctgatggagaagtggatcgagctgcccaaca YDTEQPLSGAGGDEGGNPDLWA tcaaagctgttctcttcgccaacctccccggccaagaatctggaaatgcgcttgccgacgtc TYVTVSADVKNDGPVAGAVVPQ ctcttcggcaaggccaatccgtcaggccatctgcccttcaccattggcaaatccccggagg LYLEYPAKEGVDFPVRVLRGFEK attacggtcctggcggcaaggtcatgtacctgccaaacggcgtcgttccccagcaggattt FYLEPGETHAARFNLTRRDLSYW ctcggagggtctgtacgtggactaccgctactttgacaagaaggatatcgagccccggttc DVVEQNWVMVTEGEYKFRVGFS gagttcggcttcggcctcagttacaccaccttcaacctgagcaaccttcgcgtcacgtcgca SRDLPLTGTW
caagaacaagtcggccctgccggcagcccggccgagccccgccgcgcaaccaccaag
ctacccgaccgccatcccgcccaaggacgaagccatgttccccccggggttccgcgcgc
tggagaagtatatttacccgtatctcgagtcggtggatgacatcgaggtcggagactaccc
gtatcccgacggctacgacacggagcagcccctgagcggcgccgggggcgatgaggg
cggcaaccctgacttgtgggcgacctacgtgactgtctcggcggacgtcaagaacgatgg
ccccgtggccggcgccgtggtgccgcagctatatctagagtatcccgccaaggaggggg
tggatttccccgtccgggtgctccggggattcgaaaagttctatctcgaaccgggcgagac
gcacgccgctcgttttaacctgacgaggagggatctgagttactgggatgtcgtggagcag
aattgggtcatggtcacggagggagagtacaagtttcgggtcggcttcagttagagggattt
gcctttgacgggcacctggtgatgctctaacaactccttactaatgatgcagacgtattgttg
gattagtcaagacccatcccgaggttacttggttcccgtgcggtctgtatcccctcaiaattg
gatttactactactcgttgcttcatcttcacgatccgcaacaacaatagttcgtacg
Bgl3 SEQ ID NO 35 SEQ ID NO 36
atccctcaactaaccttttgcgttccattgttttgtgtgctgtggttgtctcgggttattgtgtgac MGGNNDQRLPSTNLKPSRPFRKI cccgcacggcacaagacaacagtaatgggaggtaacaacgaccaacgacttccttcgac AELRSSLETEHPILHTPQLTQRFPF caacctgaaagtaataacctgacaggagtaaacaagaaaacccgaggtgctcaagtaatt LRTKRGIALVVAAVLLFVGGGLS ggtcgtcaaagtgaaatgcagctgtccccgtcgcgttgctgagagctttctcaaatcggattt GLAALRTRSSGDGDSASAGEDGE cgactgtacagtacactgtaaatccaaatagaaagttcaagaagccgtatttacctacataca GVIKDDSWFYGLSPPVYPSPNTTG tatgtagggtagttacccaaccagtgactgaccaggcaactgcagccctctcgtccgttcc GEGKWVEATRKARELVRRMTLE gcaagatcgcagaattgcgttcttcacttgaaactgaacaccccattctgcacacgccgca EKVSLTAGDGTFPGCSGSLPAIPR gtacgtcttatctgccacggcccgccaatccgggtttcccaggicatccttcatccttcagcc LKFPGMCLSDAGNGLRGTDFVST aaccaacacatcacatttgtacaatacaacaatgtcgacgaccaaccccgtctccaagccc WPSGIHVGASWNKALARRRGSG ctcatcgcagacaatgccaccacaaccgccgccgccgccatggccgctgcctcggcctc MGAEFKTICGVNVMLGPWGPAG taaaggcaattgacaatcaaggctgacgcagcgcttcccgtttctccgcacgaagcgcgg RVVLGGRNWEGFSSDPYLSGVLV catcgccctcgtcgtcgccgccgtcctgctcatcgttggcggcggtctgtccgggctggcc SETVTAVQAAGVITSTKHYIANEQ gcgctgcgcacccgcagcagcggggacggggactctgcgtcggcaggcgaggatggg ETNRNPSGDIQSVSSNIDDRTMHE gagggggtgatcaaggatgactcgtggttctacggactgtcgccacctgtgtatccctcgc FYLWPFQDAIRAGSCNIMCSYQR gtgagtaattccttt tttttgtcttgctcctttcctttttcttcatcatcatatcttacgcggctaata LN SYGCANSKALNGLLKTELGF acggtgatgtagcgaatattacgggtggtgaaggcaagtgggtagaggcgacgaggaaa QGWVVSDWGAQHAGVAAALAG gccagggagttggtgcggaggatgacgcttgaggagaaggtgagtggcctgttcccgag MDVAMPNGDALWGPHLVDAV cgggctgcgtgtgatgttactgatgagcgacgggggtgggtaggtcagcttgactgccgg NGSVSETRVDDMWRTLATWYL cgatggcacatttcccggctgttccggctctctcccggccattccacgcctcaaattcccgg LGQDRDFPRPGIGMPTDLTRNHGI gcatgtgtctgagcgatgcgggaaatgggctccgcggcaccgatttcgtcagtacctggc VDARNSSFRSTLFDGAVEGHVLV cgagcggtatacacgttggtgccagctggaacaaggctctagcccgccggcggggttct KNTRNALPLREPKMLSVFGYSAK ggaatgggagccgagttcaagaccaagggtgtcaatgtgatgctggggcctgtcgtcgg NPDH NPSPGQSPWLWGSESFNY gccagctggtcgagtggtgctgggcggtaggaactgggaaggtaatccaacttttttcccc TEFGGRFFGLPGEYGSTPIAFNGTI 
caagagtatggacgcggctggttactaatttaaatcgatcaggcttctcaagcgacccctatt YSGGGSGATSPAAMVSPFDALVQ tatcgggagttttggtatccgagactgtcacggcggtccaagcagccggcgtcatcaccag RAYDDGTALFWDFVSGEPFVNPA caccaaggtaacgtctcttcttcccatcatctcggtgtctttcacactacgccttgttccatcttg SDACLVFGNAYATEAADRPGAR cttcctctgctgaatcctccctcccaatatctataccagatccggctgacattcactagcacta DDYTDGLIRHVADRCA TIVVIH tattgccaatgagcaagagacgaaccgaaacccctccggcgacatecagtcggtgtcatc NAGIRLVDQFIDHPNVTALLFAHL aaacatagatgataggaccatgcacgagttctatctatggtaagtcgtgagctcctgctcttg PGEASGRALVSLLYGDENPSGKL cttcccctgctaataataatgtgtcaggccttttcaagacgctatccgcgccggcagctgca PYTVARNESDYPVLGPDVAAEGS acataatgtgctcgtaccaacgcctcaacaactcgtacggctgcgcaaacagcaaggcgc MFARFFQSNFTEGVFLDYRHFDA tcaacgggttactcaagaccgagctcggctttcagggctgggtcgtaagcgactggggcg RNITPRYEFGFGLSYTTFAYDNLV cccagcatgccggggtcgccgcggcactggcaggcatggatgttgccatgccgaatggc VE VASAGRLGEYPTGRVVEGG gatgctctctggggtccgcatetggtcgacgccgtcaacaatggttcggtttccgaaactcg QEDLWDVLVEVSAEVTNTGKVA cgtagacgacatggtcgtcaggtaaatgccgcctcgttgccgctttatcaccgcctgcctcc GAEVAQLYVGIPAEGAPVRQLRG taacagccgcaggacgctcgcgacatggtacctgttgggccaggatcgggacttcccac FEKPFLNASATATVRFPLTRRDLS ggcccggtatcggtatgcccacggatctcacgaggaatcacgggattgtegacgcgagg VWDWAQKWRLVRGREY IEVG aactccagtttccggtcaacactgttcgacggtgccgtcgagggtcacgttctggtcaagaa GSSRDLPLVGTVTI cacccgaaacgccctgccgctgagagaaccaaagatgctgtctgttttcggttactcggcc Patent 124702-0230
aagaacccggaccacaacaatccctcacccggccaatcgccctggctctggggctccga
gtccttcaactacaccgagttcggcggccgcttctttggtctccccggcgagtatggctcca
cgcccatcgccttcaacggcaccatctactcgggtggcgggtccggggccacgtcgccg
gccgccatggtctccccctttgacgcgctggtccagcgggcatacgacgacggcacggc
cctcttctgggactttgtcagcggagagcccttcgtcaatcccgcgtcggacgcctgcctg
gttttcggcaacgcttatgccaccgaggccgccgaccgacccggggcgcgcgacgacta
caccgacggcctgatcaggcacgtcgcagaccgctgcgccaacacgatcgtcgtcatcc
acaatgcggggatccgcctggtcgaccagttcatcgaccaccccaacgtgacggccctcc
tgttcgcgcacctgcccggggaggcatccgggcgcgcgctggtgtcgctgctgtacggg
gacgagaacccgtcgggcaagctgccgtatacggtggcgcgcaacgagtcggactatcc
cgtgctgggcccggacgtggcggccgaggggagcatgttcgcgcggttcccgcagagc
aacttcaccgagggcgtgttcctggactatcggcactttgacgcgcggaacatcacgccg
cggtacgagttcgggttcgggctgagctataccacgtttgcctacgacaacctggtggtgg
aaaaggtggcctccgcaggacggctcggcgagtatccgacgggacgggtcgtcgaggg
cggccaggaggacctgtgggatgtgctggtggaggtgtcggccgaggtgaccaacacc
gggaaggtggccggcgccgaggttgcgcagctgtacgtggggatccccgcggagggg
gcgccggtgaggcagctgaggggcttcgagaagccgttcctgaacgcgtaggcgaccg
cgaccgtccggttcccgttgacgaggagggacctgagcgtgtgggatgtggtggcgcag
aagtggcggcttgtcaggggtcgtgagtataagatcgaggtgggcggcagcagcaggg
acttgccgctcgttgggacggttacgatctgaaggtgttttctcgtcggtgccggggggtta
ggagtttgcttgtatttttccccctttctctcctttatcttatttttactctatttccttttttctgtcatct
tgtttcccagggattggiatatcatgagacggaacggacatttatttcttaatgtatggcgaca
gatatgtttttttttttccttcttgattacgaagcgagcatagaatgcatata
Bgl4 SEQ ID NO 37 SEQ ID NO 38
atttataagcaacgagaggaatcgagggcaggtcactatggcagcacctgcatcagtgga MAAPASVENVDLRQDIDPNASPA aaatgtcgacctcaggcaggaiaicgacccaaatgcgtcgccggccggcagcatcgaca QSIDTNTTPDTEFSPPESPLQRGGP ccaatacaacccctgataccgaattctcgcctccggaaagccctctccagcggggcggac PK.DARKLARTKLATLTTEEKVSL cgcccaaggatgcgaggaaactggcgagaaccaagctggccaccttgacgacggaaga LTAADFWRT AIPS NIPAV TSD aaaggttcgctgcgataacgttccgcttcttgaaccgcagagtgtaataacgccgagcagg GPNGARGGIFVGGTKAALFPCGIS tatctctcctgacagccgcagacttctggagaacaaaagcgattccatccaagaacatccc LAATWNKDLLYQVGQHLADEVK ggcagtaaagacgagtgatgggcctaacggcgctcgcggtggcatctttgtggggggca ARSANVLLAPTVCMHRHPLGGR caaaagtgagcttagctttcccgcaggacgaggctcgcgcgggtgcacaagacgctgat NFESFSEDPLLCGKLAAQYI GLQ cgctcgagacaggctgcgcttttcccctgcggtatctcgctggcagcaacatggaataagg EKGVAATI HFVGNEQETNRMTI atctcctttaccaggttggccagcaccttgcggacgaggtcaaggctcggtcggccaatgt NSHAERPLRELYLRPFEIAVREAK cclgctcgcccctacagtctgcatgcacagacacccgctgggtgggaggaactttgagtct PWALMSSYNLVNGVHADM THI ttctccgaggatcctctcttgtgcggcaagctggctgcccagtacatcaagggtctccagga LRDILRGEWGYDGTIMSDWGGV gaaaggcgtcgccgctaccataaaacgicagctcgcttccggacctcttcgctcttctctctc NSTVESI AGCDIEFPYSDICWRFG tcactcrtgtgaagtggatatgacggactggctgactattgaatagactttgtcggaaacgag KVLDALKEGKIAEADIDRAAENV caggagacgaacaggatgaccatcaactcgatcatcgccgaacggccgcttcgcgaact LTLVERT GSDLTAEAEEREDDR ctaccttcgcccgttcgagattgcggttcgcgaggccaagccgtgggctctcatgagttctt EETRNLIREAGVQGLTLL NEGSI ataaccttgtcaacggcgtccacgcggacatgaacacgcacatcctcagggatatcctccg LPIDPATAKVAVIGPNANRAIAGG cggtgagtggggatacgatgggtaagtgacgactcggctaagcatgccgcccccccggt GGSASLNPYYTTLPLDSIRKVA Q gataaagcgaggaagatctgctgacgctccaaccccgctaaccccccctcccctctcccc PVSYSQGCHIHKWLPVASPYCSD ccccaaccagaaccatcatgtcggactggggaggcgtcaactccaccgtcgaatcgatca KTG QGVSIEWF GDKFEGQPW aggccggatgcgacat gagttcccgtactatgacaagtggcgcttcggcaaggttcttga FQRRTNTDLFLWDSAPLAQVGPQ cgccctcaaggaagggaagattgccgaggccgatatcgaccgggcagccgagaacgtg WSAIVTTYLTPPSTGKHTISF TV ctcactctggtcgagcggaccaagggcagcgacctgacggccgaggccgaggaaaga GPGRLYVNGQLALDLWNWTEEG 
gaggacgaccgcgaagagacgaggaatctcatccgcgaagcgggcgtgcaggggctc EAMFDGSIDYLVDVEMQAGRPVE accctcctgaagaacgaggggtccatcctgcccatcgacccggccacggccaaggttgc LRVEITNELRPIAKQ QFDMTHK cgtgatcgggcccaatgccaacagggcgatcgcgggaggcggcgggagcgccagcct YGGCRIGFKPED VDYLQEAVDA caacccgtactacaccacgctgcccctggacagcatccgcaaggttgccaagcagccgg AKAADVAVVIVGLDAEWESEGY tcagctacagccaggggtgccacatccacaagtggcrtcctgttgcgtccccctactgctcc DRKSMDLPSDGSQDRLVEAVLAA gacaagacggggaagcagggggtgtcgatcgagtggttcaagggcgacaagttcgagg NPRTVWNQSGSPVTMPWADRV gccagccggtcgtgttccagcggcggaccaacaccgacctcttcctgtgggactcggccc PAHQAWYQGQEAGNALAAVLFG cgctcgcccaggtcggcccgcaatggtcggccatcgtgacgacctacctgacgcccccg LDNP SGKLP CTFPRRLEDTP AYHN agcacgggcaagcacaccatctcgttcatgacggtcgggccgggcaggctctacgtcaa WPGENLEVIYGEGMYIGYRHYDR cgggcagctcgcgctggacctctggaactggaccgaggagggcgaggccatgttcgac VGVAPLFPFGHGLSYTTFEYGRPS ggctcgatcgactacctagtegacgtcgagatgcaggcgggccggccggtggagctecg VSPKVLGPDGGAIELVVAISNVGP ggtcgagatcaccaacgagctgcgcccgatcgccaagcagaagcagttcgacatgacgc VRGLETVQVYVRDERSRLPRPEK acaagtacggcggctgccggatcgggttcaagcccgaggacaaggtcgactacctgcag ELVAFEKVELEPGETKHLRIPLD gaggccgtcgatgcggccaaggcggccgacgtcgccgtggtcatcgtcggcctggacg YAVGYYDTALRRWVAEQGTFRA ccgagtgggagtcggagggctacgaccgcaagagcatggacctgccctcggacggga LVGASAADIKYDVAFEVKETFTW gccaggaccgcctcgtcgaggccgtcctcgcggccaacccgcgcaccgtcgtcgtcaac VF
cagtccggctcacccgtcaccatgccctgggccgaccgcgtgcctgccattatccaggcg
tggtaccagggccaggaagccggaaacgccctcgccgccgtgctctttggcctcgacaa
cccaagcggcaagctcccggtaagcctgcagcaacaacacccccacctaacacttccctc
cccccctttttttttttictttttcctfgccctatcctactttactctatcctactctgcactgtttacatt
ctgacctgtcttgtttctagtgcaccttcccccgacgtctcgaggacacgcccgcgtaccac
aactggcccggigagaacctggaggtcatctacggcgagggcatgtacatcgggtaccg
gcactacgaccgggtcggggtggcgccgctgttcccgttcgggcacgggctgtcgtaca
cgacgttcgagtacgggcggccgtcggtgagcccgaaggtgctcggcccggacgggg
gcgcgatcgagctggtggtggccatcagcaacgtcgggccggtccggggcctcgagac
ggtgcaggtgtacgtgcgggacgagcggagccggctgccgcgaccggaaaaggagct
cgtcgcgttcgagaaggtcgagctcgagccgggcgagaccaagcacctgcgcatcccg
ctcgacaagtacgccgtcggctactacgacacggccctgcgccgctgggtggccgagca
gggcaccttccgggccctggtcggtgcctcggccgcggatatcaagtacgtagtctttccg
acttcttccccgtttgatgtttgatcctctctcgtctttctcttggtttgctaacgtggggatgacc
aaggtatgatgtcgcgtttgaggtcaaggagacgtttacctgggttttctagtgtaaaaggaa
aaaaggggataggccacgggtctgaacacggacatgcatgcatggagggtttgctgccat
cctcccccacttctcgctgtatigtacatacacacataattacaaccttcgtttcttgccccgaa
cggcatc
Bgl5 SEQ ID NO 39 SEQ ID NO 40
cgtccctttaagacgcgtcgaatgactggtcttgtcgtctttgtcctcctgttggcgtgcgcct MVYRASAVWLSLVLSLSAPPTLA ggtctctcaaccacagccgaccggaatagttcaaatcggcgggcctgcttgcctgcctccc RTIGPRDKVPEGFYAAPYYPTPHG tgtacttcctgtttgacttgggagcgcgaaggtctagacggcgcggtcgagatggtttaccg GWLDSWKDAYAKAHALVSRMT cgcgagcgccgtttggctctcgctggttctcagcctctctgctcccccaacgctcgcccgta LAEKTNITSGVGIFMGMPCVGNT ccattgggccgagagacaaggtcccagaaggattctatgcggciccctactaccccacgc GSADRLSFPQLCLQDSALGVASA cgcacggcgggtggcttgattcgtggaaggatgcgtaegcaaaggcccacgccttggtgt DNVTAFPPGITTGATWDKALMYA ctcggatgactttggctgagaagaccaacatcacgtctggcgtcggcatcttcatgggtatg RGVAIGREFRGKGANVHLGPSVG tagtagagaccccctgaataaggcggccctcacctctctctttatgttcttatctccagcatct PIGR PLGGRNWEGFGADPVLQA catgtcgctaaaatgttgtcctcacagggtcggtccaagagcacgtttctcttccacgccttc AAALHKGVQEQGVIATV HLI attttgctgactaggaggtatgcaggccatgtgtgggaaacacgggaagcgccgaccgcc GNEQEMFRMYNPFQPGYSANIDD tgagcttcccccagctctgccttcaagactcggcgcttggggtggcatccgccgacaacgt RTLHELYLWPFAEAVHAGVGAV caccgctttcccacccggcatcaccaccggggcgacgtgggataaggccctgatgtacg MTAYNAVNGSASSQNSYLINGLL cccgcggagttggtattggcagggagftccgcggcaagggggccaacgtccatctcggc DELGFQGMVMSDWLSHISGVG ccctccgttggccggattggacggaagccactcggcggccggaactgggaagggttcgg SALAGLDLN PGDTNVPLFGFSL cgccgacccggtgctgcaggccaaggcggccgccttgcacattaagggcgtccaggag WQYELTRSVLNGSVPLDRLND cagggtgttatcgcgaccgtcaagcatctgatcgggaacgagcaggagatgttccgcatg ATRVVAAWYKMRQDKDFPRPNF tacaacccgttccagcccgggtacagcgccaacattggtgagaaggacggattcctcgga SSNTRDRNGLLYPAAIFSPIGQVN gcgccaccagctccgccggcgtgctcccgtacagcgctgtgctgaccaaaagttgctaga WFVNVQEDHYKIARQVAQDAITL cgaccggacgctccatgagttgtacctgtggccgttcgccgaggccgtccacgccggcgt LKNDGSLLPLTGSGKITVFGTGAQ cggcgcggtcatgaccgcctacaatgccgtgaacgggtccgcctcgtcgcagaatagcta VNPAGPNACLNRACNKGTLGMG cctgatcaacggcctcctcaaggacgagctcggcttccagggcatggtcatgtcagactg WGSGVADYPYFDDPITAIRKRVP gctcagtcacatctcgggcgtcggctcggctctggccggtctcgacctcaacatgccggg DVEFHNSDEFPLFFTGEAPAPDDV cgacaccaacgtcccgctctttggcttcagcctgtggcagtacgagctgaccaggtccgtc AVVFISSDAGENSFTVENNHGDR 
ctgaacggctccgtgcccctagataggctgaatgacatggctacccgggtggtggcggcc DADKLAAWHGGDGLVKKVADK tggtataagatgaggcaggacaaggatttcccacggcccaacttctcgtccaacacgcgg FPNVWVAHTVGPLILEPWIDHPS gaccggaacgggctcctgtacccggccgccatcttctctcccatcggccaggtcaactggt VKAVLFAHLPGQEAGESLAGVLF tcgttaacgtccaggaggaccactacaagatcgctcgccaggtcgcccaggatgccatta GDVSPSGHLPYSITRAETDYPDSI cgctcctcaagaacgacggcagccttctgccgctgacgggctcgggaaagatcaccgtct A LKGFTLGQVQDTYSEGLYIDY tcgggaccggcgcgcaggtcaaccccgcggggccgaacgcctgtctgaacagggcgt RWLNKRSIKPRFAFGHGLSYTTFA gcaacaagggcaccctcggcatgggctggggctcgggcgtggccgaclacccctacttc FTOATIRAVTAPLDPIPPPRPSKLP gacgaccccatcacggccatcaggaagagggtcccggacgtcgagttccacaacagcg TPSYPTDLPAASEAYYPDGFKPIW acgagttcccgctcttcttcacgggcgaggcgccggcccccgacgacgtcgccgtcgtct RYLYSWLPKSEADAAWAIGATG tcatcagctcggacgcgggcgagaacagcttcacggtcgagaacaaccacggcgaccg KQKYAYPDGYSTTQ PGPAAGG cgacgcggacaagctggcagcctggcacggcggtgacgggctggtcaagaaggtggc GEGGHPALWDVAWEVDVTK TT cgacaagttcccgaacgtggtcgtcgtcgcccacacggtggggccgctcatcctcgagcc GGVGGPGGLPRFFPGPARASVQA ctggatcgaccacccttccgtcaaggccgtcctcttcgcccacctcccgggccaggaggc YVQYPPGIPYDTPPVQLRDFEKTP gggcgagtcgctggccggcgtcctcttcggcgacgtgtcgccgagcggtcacctccccta PLAPGESQTVTLRLTR DVSVWD ctccatcacccgggccgagaccgactaccccgacagcatcgccaagctcaaggggttca VELQNWVVPGATLGGGGKGPGR ccctcggccaggtccaagacacgtactcggaggggctctacatcgactaccgctggctca YTIWIGEASDQLFLACYTDTGKCE acaagcgctccatcaagccgcgtttcgccttcggccacggcctcagctacaccaccttcgc QGLEPPV
cttcaccaacgccaccattcgcgccgtcaccgcccccctggaccccatcccgccgccgc
ggccctccaagctgccgaccccgtcctacccgaccgacctgcccgccgcctcggaggc
ctactacccggacgggttcaagcccatctggcgctatctctactcgtggctgccgaagtcc
gaggcggacgccgcgtgggccatcggcgccaccggcaagcagaagtacgcttacccg
gacgggtactcaaccacgcagaagcccgggccggcggcgggcgggggcgagggcg
gccacccggcgctgtgggacgtggcgtgggaggtcgacgtgacggtgaccaacgcgg
ccggcagcggcgccaacatcagctccgcggcggagaagaagacgacgggtggtgtcg
gcggccccggcgggctgccccggttcttccccggcccggcgcgggcctcggtgcaggc
gtacgtgcagtacccgccgggcatcccgtacgacacgccgccggtgcagctgcgcgact
ttgagaagacgccccccctggcgcccggcgagagccagacggtcacgctgcggctgac
gcgcaaggacgtcagcgtctgggacgtcgagctgcagaactgggtcgtccccggggcg
acgctcgggggaggagggaaaggcccgggcaggtacaccatctggatcggcgaggcc
agtgatcagctgttcctggcctgttatacggataccggcaagtgcgagcaggggctggag
ccgcccgtctgagaggatcccatagtcaagaatgtaaatactgtttgacagggaagagttg
gatgaagggaagaacgcacatgtgtcgtagtcggcggctgtggggggtaatataataatta
ggtccacatttaatagcggcagcgtggttaatcatagtagttattcataatacctccttaaacc
cccttccaacgcccaaagggaatgatttgagatgaaccactggcatgcctgcctaattatag
atcccaagaaagctaagccatggctcccacattactacaacagagaacgaagcagccgta
attgaccctccttagtcctataattagccggtgcgcggagtgctgtttgctgattgcattctttg
gcacgaagccgaatgcataattaa
Glu1 SEQ ID NO 41 SEQ ID NO 42
gatEctccgcgttttctttggccagtcagcattcagctttcactgccgagatgcaattctgcgtt MQFCVALLLGVVSGVWAAYWM gctctcttattgggagttgtatctggcgtctgggcggcctactggatggaggaccttcaccg EDLHRQGLAPFAEASRYSVFRNV ccagggtctcgcaccgttcgctgaagcaagcagatacagcgtcttcagaaatgtgaagca QWGAKGDGG ICAAPRVVPAD gtggggagctaagggagatggaggtaagatctgtgctgcaccacgagttgttcccgctga GCLVTDDTAAINAAISDGDRCGG cggttgtctagttaccgacgataccgcagccatcaatgccgcaattagcgacggcgatcgt PDCVGSTTTPAIVYFPPGTYMISSP tgcggcggcccggactgtgttggctcaaccacgacgccggccattgtttatttccctcccg IFSYYYTQIIGDPTNMPVIKASQNF gaacttacatgatcagttcgccaattttcagctactactatacccaaattattggcgaccctac PTNVLAMLDADRYMDNGRLNFL caatatgccagtcatcaaagcttcccaaaactttcctaccaatgtgctcgccatgctggatgc ATNVFFRQLRNLVFDTTAVRGTIT cgaccgttacatggacaacggaagtaggtagtttctgccatatggtcaaactttcgrtgctaa GIHWPSSQATLVQNCVFKLSSRE cgtggttggctccagggctcaactttctcgcaacaaacgtcttcttcagacagcttcgcaac DDTHVGIF EEGSGGM ADLIFH ctcgtcrttgacacaacggctgttcgtgggacgatcaccggaatccattggccatcctccca GGKYGARFGNQQYTMRNLTFYD agcgaccctggtccagaactgcgtcttcaagctttcctctagagaagatgacacccacgtc CDTAIEQIWNWGWTY SL WG ggtattttcatggaagaagggagcgggggcatgatggcagaccttatattccacggcggc SRVGINMSSSDVGSVTLLDSSFVN aaatacggcgctcggtttggcaaccagcagtacacgatgcgaaatctaacattctatgactg VSTALISGRIPGNKIGLGSLLIQNV cgatacggccatcgagcagatatggaatlggggatggacgtacaaatcgcttaaggttgtt EYKNVPTVLAEADGRPLLLGDAN ggttcccgcgtaggtatcaacatgtcttcaagcgatgttggctcagtcactctgctcgacag GTVYDRGYARGNTYAPNGPLWL ctcgtttgtcaatgtcagcacagctttgatatctggccgcattcccggaaacaagattggtctt EGHEFNFSQPSTLKIGDRYYERSK ggatcactgctgatccagaatgtcgaatacaaaaacgttccgacagtattggcagaagctg PQYEDYSSSDFISARDHNAFGDGR atgggaggccgcttctattgggagacgcgaacgggacagtctacgacagaggttacgca TDDTDMINKVIQAAANSSYIAFLD agagtgggcactccaaatacgttgattcgatttgaacatgctattaacttcaccactagggca AGYYRVTDTIFIPPNTRVMGEGLA acacatatgctccgaatgggcccttgtggcttgagggacacgagtttaacttttctcaacctt TVEMGTGEKFSDPNNPRPVVQVG cgacccttaaaatcggagaccggtactacgagagatcaaaaccgcagtacgaggactact KLGDTGFVEINDLIASTQGPAAGA cttcttcggactttatatcagcccgagatcacaatgccttcggcgatgggcgcaccgacga F IEYNLNTPAAESMCSLGSPPSG 
cacggacatgatcaacaaggtcatccaggctgccgcgaattcatcatacatcgccttcttag MWDVHVRVGGFRGSQLQVAECP atgccggttattacagggtcacagataccatattcatcccacccaacactcgagtcatgggc TTPERLDYVNPSCIAGYMGMHISP gaaggactcgccaccgtcatcatggggacgggggagaagttctcggatccaaataaccc SARNLYLENSWIWVADHDVDDW acgccctgtggtacaggttgggaagcttggtgatactgggtttgtcgaaattaacgacctca NQTQISVFVARGMLVQGSRIWLV tcgcgtcgacgcaggggccagctgcaggcgccatcatgatcgagtacaatctcaacacg GSSVEHHALYQYQLLNASDVWM ccagcagctgagagcatgtgtagtctgggaagccccccgtcggggatgtgggatgtccat GQIQTETPYYQPNPPASYPFTQLN gtccgcgtcggtggcttcagaggatctcagcttcaagttgctgagtgtcccacgacccccg DSIRDPDFTVDCRERETENSLSSQ agaggcttgactacgtcaacccgtcatgcatcgctggttatatgggtatgcatatctctccga GNPSCAMAWGLRIIGSQNVWFG gcgctcgaaatctgtacctggaaaacagctggatatgggtggcagaccatgacgttgatga AGLYSFFNNYNTSCSTVESGENC ttggaaccagacccaaatcagcgtcttcgtggcacggggaatgctggttcaggggagccg QARIFWVGQDTSDGAERQGSAEG tatctggttggttggaagctctgttgagcateacgcgttgtaccagtaccagctgctcaacgc EEMLAVEVYNLNTIGSVIMITQSG ttcggatgtctggatggggcagatccaaacagaaaccccttactaccaacccaacccacc LDMATWAQNRATFASTLAAFRSE agcatcgtaccctttcacccagctgaacgacagcattcgcgatcctgattttacagitgaclg GHRMAL
cagggaacgcgagacggagaactccttgagctcacaaggtaacccgtcatgcgcaatgg
catggggacttcgcattattggctcccagaatgtggtggtgttcggagcgggcctgtacagt
ttcttcaacaattacaacacgagctgcagcaccgtcgagtcgggtgagaactgccaggcg
cgcattttctgggttggccaggacacgtccgacggagctgaacgccaaggtagcgctgaa
ggcgaggagatgctagcagtggaggtgtacaacctcaacacgatcggcagcgtcatcat
gatcacccaatctgggttggatatggcgacatgggctcagaacagggcgacgtttgcgag
cacactggctgctttccgttctgagggccacaggatggcgctatagcaagcatcttgtaag
gtgatgcaggactttgttaggcacggccagcgatgctggcaccagagcgtgcgactgaca
gcgcctctggctgagttgccataa
Glu2 SEQ ID NO 43 SEQ ID NO 44
aaataaagacaaaacaaaggcgactcgatacggtctctgaatcactcatctcccaagatga M AAVMAAAAAVLANGVSAAR aggcagctgtgatggccgccgctgcggcggttctcgccaatggcgtgagcgcggcccg VHGHRHAHALFKRGDTGEVCTP ggttcacgggcaccgccatgcgcacgccctatttaagaggggcgacaccggtgaagtgt GCTTIYSTITGEPTLLPPAPTPTTT gcactccgggctgcaccaccatctactccaccatcaccggtgaaccgacccgtacgttgc AAPEPTTTSGPVLVPTPIQQTCPTP actcccccgaaggcgccccccccccccccaaccaggcggagcgaaacaagattaggca GTYTIPATTVVLTETTTVCGASST gcacggcaagcaaccccagaggctaacagaaagggtatcatgatgcagttctcccgcct EVPSGTHTLGGVTTWETATTVV Patent 124702-0230
gcgccgactcccacgaccacggccgctcccgagcccaccacgacgagcgggcccgtc CPVATTETQNGVVTSVI TTTYV ctcgtcccgacgccgatccagcagacctgcccgactcccggcacctacaccatccccgc CPSAGTYTIAPITTVVPSVTTVVVP gacgaccgtcgtcctgaccgagacgaccaccgtctgcggcgcgtcttcgaccgaggtcc VVETYCPGTYTAPELVTTVTETSV cgagcggcacccacaccctcggcggcgtcaccaccgtcgtcgagacggccaccacggt 1VTCPFTSVTPAEPTSAPVPTSEPA cgtctgccccgtggccaccaccgagacccagaafggcgtcgttaccagcgtcatcaagac EPTSAPAPTSEPAEPAPAPPSSSS caccacctacgtctgcccgtcggccggtacctacaccattgcccccatcaccaccgtcgtc VAPSPSPSPSSPSKGPNPDNNG R cccagcgtcaccaccgtggtcgtgcccgtcgtcgagacgtacigcccgggtacctacacc WA TYTPYAETAEGGCKSASEV gccccggagctcgtcaccaccgtcaccgagaccagcgtgatcgtcacctgccccttcactt MDDVAA1AKAGF ALRVYSTDC cggtcacccccgccgagcccacttcggccccggttcccacttcggagcccgctgagccc DTLPNVGAAARAHGLRLIVGIFIG acttcggcccctgctcccacttcggagcccgctgagccggcccctgcccctccctcgtcgt KVGCDNNSPDVADQISALKEWKE cgtccaaggtggctccgtccccatctccgtccccttcgtccccgtccaagggccctaatcc WDLVDLCVVGNEALFNGFCSVSE cgacaacaacggcaagcggtgggccatgacctacaccccgtacgccgagactgctgag LASLIGRV .SELGSVGYNGPFTTT ggtggctgcaagagcgcctccgaggtaatggacgatgtcgccgc cattgc caaggcggg DVVAAWTNNDVSAICDAIDVTAT tttcaaggccctccgtgtctactcgaccgactgcgacaccctgcccaacgttggtgccgct NAHAYFNADTEPADAGKFVAGQ gctcgcgcccacggcctccgcctgaitgtcggcatcttcatcggcaaggicggctgcgac LAJVEKVCGKPGYV ETGWPSA aacaacagccccgacgttgccgaccagatcagcgccctcaaggagtggaaggagtggg GN CNG VAC AGE AEQ ATAIHS IEQ atctcgttgacctgtgcgtcgtcggcaacgaggccctcttcaacggctictgctccgtctcc ELGNKAVFFSFRNDPW QPGECN gagctcgccagcctcatcggccgcgtcaagtcggagctcggctcggtcggttataacggc CEQHWGCANVFGV cccttcaccaccaccgacgtcgtcgccgcctggaccaacaacgacgtctcggccatctgc
gacgccatcgacgtcaccgccaccaacgcccacgcctacttcaatgccgacaccgagcc
cgccgatgccggcaagttcgtcgccggccagctggccattgtcgagaaggtctgcggca
agcccggctacgtcatggagaccggctggccctcggccggcaactgcaacggcgtcgc
ctgcgccggcgaggctgagcaggccaccgccatccactccatcgagcaggagctcggc
aacaaggccgtcttcttctccttccgcaacgacccgtggaagcagcccggcgagtgcaac
tgcgagcagcactggggctgcgccaacgtcttcggcgtttaagcicicttccagccgactct
ggtccscacgcctccccctctcacttgtgggcagctcatctttccacaatat
Abn6 SEQ ID NO 45 SEQ ID NO 46
CL08095 ttattaaitgtccaagtcgaactttgctgcictcccaacatggtgcctcgcaaaggcaaggaa MFFASLLLGLLAGVSASPGHGRN caaccgcgtcacctggagccatgttcttcgcttctctgctgctcggtctcctggcgggcgtgt STFYNPIFPGFYPDPSCIYVPERDH ccgcttcaccgggacacgggcggaartccaccttctacaaccccatcrtccccggcttctac TFFCASSSFNAFPG1PIHASKDLQN cccgatccgagctgcatctacgtgcccgagcgtgaccacaccttcttctgtgcctcgtcga WKLIGH VLNR EQ LPRL AETNRS gcticaacgccttcccgggcatcccgattcatgccagcaaggacctgcagaactggaagtt TS GI WAPTLRFHDDTF WL VTTL V gatcggccatgtgctgaatcgcaaggaacagcttccccggctcgctgagaccaaccggtc DDDRPQEDASRWDNIIFKA NPY gaccaggtatgttgccgcctctggcgctccgaaggcgtgcttgacgaacacggccttacgt DPRSWSI AVHFNFTGYDTEPFWD ccgacagcggcatctgggcacccaccctccggttccatgacgacaccttciggttggtcac EDG VY1TGAHA HVGPY1QQAE cacactagtggacgacgaccggccgcaggaggacgcttccagatgggacaatgtgcgt VDLDTGAVGEWRnWNGTGGMA gcggcgcatgattattgttgtctctgttttttmttttttcttctttttm PEGPHIYRKDGWYYLLAAEGGTG gatccccgaaggctgatggctccttttgctctagattatcttcaaggcaaagaatccgtatga IDHM VTMARSRI IS SP YE SNPNNP tccgaggtcctggtccaaggccgtccacttcaacttcactggctacgacacggagcctttct VLTNANTTSYFQTVGHSDLFHDR gggacgaagatggaaaggtgtacatcaccggcgcccatgcttggcatgttgggtaagctc HGNWWAVALSTRSGPEYLHYPM gtccacatcgccgcccgccggatctgttctggcgctcacccttcccaagcccatacatcca GRETVMTAVSWP DEWPTFTPIS gcaggccgaagtcgatctcgacacgggggccgtcggcgagtggcgcatcatctggaac G S GWPMPP SQ DIRGVGP Y V ggaacgggcggcatggtatgattctcgcacacttcctcctctctggcgtattccctcctgaca NSPDPEHLTFPRSAPLPAHLTYWR caactgggagtttaggctcctgaagggccgcacatctaccgcaaagatgggtggtactact YPNP S S YTP SPPGHPNTLRLTPSRL tgctggcigctgaaggtgagcaagacccttcttgtggacactgttggtgccgccgalgctg NLTALNGNYAGADQTFVSRRQQ acacgagccaggggggaccggcatcgaccatatggtgaccatggcccggtcgagaaaa HTLFTYSVTLDYAPRTAGEEAGV atctccagtcctiacgagtccaacccaaacaaccccgtgttgaccaacgccaacacgacc TAFLTQNHHLDLGVVLLPRGSAT agttactgtaagaccaaccacaactcgtgcctgatacaatagcgctcaccatgtcatagttc AP SLPGLS S STTTTS S S S SRPDEEE aaaccgtcgggcattcagacctgttccatgacagacatgggaactggtaagaaacatttctt EREAGEEEEEGGQDLMIPHVRFR accittgtctcggcaacatctcttgtctctcgccgttttgaagctaatcgtgctcctgaaggig 
GESYVPVPAPVVYP1PRAWRGGK ggcagtcgccctctccacccgctccggtccagaatatcttcactaccccatgggccgcga LVLEIRACNSTHFSFRVGPDGRRS gaccgtcatgacagccgtgagctggccgaaggacgagtggccaaccttcacccccatat ERTVVMEASNEAVSWGFTGTLLG ctggcaagatgagcggctggccgatgcctccitcgcagaaggacattcgcggagtcggg IYATSNGGNGTTPAYFSDWRYTP taagccgttttc rtccggcctccccccccccccaaaaaaaaaactctcggaccatcccccg LEQFRD
ccgaccgctaacaccttccctcctcctcctcccctcttciccccacagcccctacgtcaactc
ccccgacccggaacacctgaccttcccccgctcggcgcccctgccggcccacctcacct
actggcgatacccgaacccgtcctcctacacgccgtccccgcccgggcaccccaacacc
ctccgcctgaccccgtcccgcctgaacctgaccgccctcaacggcaactacgcgggggc
cgaccagaccttcgtctcgcgccggcagcagcacaccctcttcacctacagcgtcacgct
cgactacgcgccgcggaccgccggggaggaggccggcgtgaccgccttcctgacgca
gaaccaccacctcgacctgggcgtcgtcctgctcccicgcggctccgccaccgcgccctc
gctgccgggcctgagtagtagtacaactactactagtagtagtagtagtcgtccggacgag
gaggaggagcgcgaggcgggcgaagaggaagaagagggcggacaagacttgatgat
cccgcatgtgcggttcaggggcgagtcgtacgtgcccgtcccggcgcccgtcgtgtaccc
gataccccgggcctggagaggcgggaagcttgtgttagagatccgggcttgtaattcgact
cacttctcgttccgtgtcgggccggacgggagacggtctgagcggacggtggtcatgga
ggcttcgaacgaggccgttagctggggctttactggtgagttctctctctcctatttcigttttc
cttgcgaagcgtgtaagcgccattgccagagctttttaattgacctgaaatttccttttttttttcc
ccgccgaacgcaggaacgctgctgggcatctatgcgaccagtaatggtggcaacggaac
cacgccggcgtatttttcggattggaggtacacaccattggagcagtttagggattgaaaaa
aaggaaggggagttgtgggctggtatcgccattcggga
Abn8 SEQ ID NO 47 SEQ ID NO 48
CL04698 gattcgcggactttgttaataaitaaiccccggaaaaccccataaatgtcctttcccaaagctg MRLKSGLAGALAWGTTAAAAAA tcaggcagtgcgacctggtgtcacatttttcttaartaaccggcatatttaccggccgcctggt VARVGAGAAANSTYYNPILPGW atcgatgcgcccgaatcaagtgcaagatcaacagcgctaccatgatccatcgtcgicttgg HSDPSCVQVEGIFYCVTSTFISFPG cgcccacgtggtgacgatgcgcgcagcgaggttgggaagtatttatgttcgtacatacatc LPIYASRDLIN KHVSHVWNRES aggtagataggtatctaggtagccacttgaaataatcatcttattaatctaccgacggcttcct QLPGYSWATEGQQEGMYAATIR tcactcgctcaagttaagggaaccctttttttttcttccttcctctcccgcgccactgttccctag HREGVFYVICEYLGVGGRDAGVL gaagggggtttatcatccgttctcgagaattgatcactcctcctcgccaacccgacccccctt FRATDPFDDAAWSDALTFAAPKI cacgacacgatgataatgatgagactcaagtcgggactggccggggcgctggcctgggg DPDLFWDDDGTAYVATQGVQVQ aacgacggcggcggcggcggcggcggtggcgagagtgggagccggcgcggccgcg RMDLDTGAIGPPVPLWNGTGGV aactcgacctactacaacccgatcctccccgggtggcactcggacccgtcgtgcgtgcag WPEGPfflYRRADHYYLMIAEGGT gtggaggggatcttctactgcgtgacgtcgaccrtcatctcgttccccggcctgcccatcta AEDHAITIARSDRLTGPYVSCPHN cgcgtcccgggacctgatcaactggaagcacgtcagccacgtgtggaaccgcgagtccc PILTN GTDEYFQTVGHGDLFQD agctgcccgggtacagctgggcgacggagggccagcaggagggcatglacgcggcga AAGNWWGVALATRSGPEY VYP cgatccggcaccgcgagggcgtcttctatgtcatctgcgagtacctgggcgtcggcggca MGRETVLFPVTWREGDWPVLQP gggacgccggcgtgctcttecgggcgacggacccgttcgacgacgcggcctggagcga VRGAMSGWPLPPPTRDLPGDGPF cgccctgaccttcgccgcgcccaagatcgacccggacctgttctgggacgacgacggga NADPDVYDFAGGGAEAGEEA P cggcctacgtggcgacgcagggogtgcaggtgcagcgcatggacctcgacacgggcg RNLVHWRVPREGAFATTARGLR ccatcggcccgcccgtgccgctgtggaacgggacgggcggggtgtggcccgagggcc VALGRNRLDGWPGGGEPAARAV cgcacatctaccgccgcgccgaccactactacctcatgatcgccgagggcggcacggcc SFVGRRQTDSLFTFSVDWLSRPTR gaggaccacgccatcaccatcgcccgcagcgaccggctgacggggccctacgtctcctg TARRPAVTAFLTQLANLQLGLVR cccgcacaacccgatcctgaccaaccgcggcacggacgagtacttccagacggtcggc LDGGQLRLRFNASGHVRDTAVPE cacggcgacctcttccaggacgccgccggcaactggtggggcgtcgccctggccacgc EWTEVGSCDGGDDGGDGGDGVV gctccggcccggagtaccgcgtctacccgatggggcgcgagaccgtgctgttccccgtc RVRLEIRMAAEDPSSYRFAAMLA 
acctggcgcgagggcgactggccggtcctgcagcccgtgcgcggcgccatgtcgggct SDPDPDRTRIEVGTAPAELLSGGS ggccgctgccgccgccgacgcgcgacctgcccggcgacgggcccttcaacgcggacc GSFVGTLLGVYATCNGAGEGIDC cggacgtgtacgactttgccggggggggagccgaggcgggggaggaggcgatgccgc PAGTPDAYFTRWRYTGEGQFYTE ggaacctggtgcactggcgggtcccgcgcgagggcgccttcgcgaccacggcgcgcg TDLVPPDEGQGKGKGKGNGKGK ggctccgcgtcgcgctggggcgcaaccggctcgacggctggcccgggggcggcgagc GNGNGNGKAAKRSRFVG cggccgccagggccgtctccttcgtggggcgccgccagaccgacagcctcttcaccttca
gcgtcgactggctttcgcgcccgacgcggacggccaggaggccggccgtgaccgcgtt
cctgacccagctcgccaacctgcagctcggcctggtccgcctcgacggcggccagctgc
ggctccgcttcaacgcgtcgggccacgtccgcgataccgcggtgccggaggagtggac
cgaggtcggcagctgtgacggcggtgacgacggcggtgacggcggcgacggcgtcgt
ccgggtccggctggagatccggatggcggccgaggacccgagctcgtaccggttcgcg
gccatgctggcgtccgacccggacccggaccggacccggatcgaggtcggcaccgcg
ccggccgagctgctcagcggcggctccggctccttcgtcggcaccctgctcggcgtctac
gccacctgcaacggggccggggagggcatcgactgccccgccggcacgcccgacgct
tacttcacccggtggaggtacacgggcgagggccagttctacaccgagaccgatctcgtc
ccgcccgacgagggccagggcaagggtaaaggtaaagggaacggtaaaggcaaggg
caacggcaacggcaacggcaaagccgccaagagaagcaggtttgttggttaagagggtc
aatgaatatgacgatgtaaaaacctattctttgctgtcagctcttagaagtgaatgcattgttga
ttgatcacagttataatggtagctggtggtga
Abn10 SEQ ID NO 49 SEQ ID NO 50
cccacagactattatgagtagtataaagcaggtcgagcagttgctctcaagcttcccgtgttc MRIHLRGFAILGLAVEAAIARPPG acctccatctcgcagataaaccgccacaggaagaacaagagcaatgcgtattcacctccg PPDTPRSPGPPLPPGPPGPPGPPVP tggttttgccatcctcggcctcgcggtcgaagctgccatcgcacgaccaccaggtcctcca PDSDHLPPFSTFSNRVIYTPPKGGR gatacccctcgttctccggggcctcctcttccccccggtcctccgggtccccctgggccac AVYPRVAELSDGTLLVTASVSGV ctgtaccgccggactcagaccatctccctccctttagcaccttcagcaaccgtgttatttaca VGPDNLPAFPIFESKDGGVTYQ I caccacccaagggtggtcgggcggtgtacccgcgtgtcgctgagctcagtgaoggcact SNLTDQVNGWGMSAQPALLELR ctcctggtgaccgcgagtgtcagcggcgttgtcgggcccgacaacctcccggcgttcccc QPLGGFKPGTVLASGNSWSDKGT atctttgagagcaaagalggcggcgtcacctaccagtggatctcgaatctcaccgaccagg RIDLYASTDKGRTWEFVSHAAEG tcaacggctggggcatgagcgcgcagcccgccctgctcgagctgcgacagccccttgg GRPNTTNGATPVWEPFLLTYDDE gggcttcaagcccggcaccgtcctggcgtcgggaaacagctggagcgacaagggaacc LIVYYSDQRDPRHGQKLAHQTSR cgtatcgacctgtacgcaagcacagacaaggggcggacctgggagtttgtgagccatgct DL HWGPVVNDVAYDEYLARPG gccgagggcggccgtcccaacaccaccaatggcgccacgcccgtgtgggagccgttcc MTVVAYIPP1K WILVYERPIGNS tcctgtatgtgtgcccccgttccgcagcaatgtccigataatcatcaacctggcgtagtgctg SSHGVNYPVHYRLADDPRKFDAA accctcaccctctccccctcttcccagcacttacgacgacgagctcatcgtctactactccg QPIPIVIETREGTTVAPNASPYW accagcgcgatcctaggcatggccagaagctcgcccaccagacatcccgtgatctcaag WSPVGGP GTIIVSDADRSWLYL Patent 124702-0230
cactggggccccgttgtcaacgacgtcgcctatgacgaatacctcgcccggcccggcatg NTAGGDPDKWQIRECGQPEAYSR actgtcgtcgcgtacatcccgcccatcaagaagtggattcttgtgtatgaacggccgatcgg ALHVFEKRPDRLMVLGGDTFDG caactcgagctcccacggggtgaactatcctgttcactaccgcctcgccgatgacccgag NGVGALTDSVLDMN FLQGGYS gaagtttgacgcggcccagccgatccccattgtcatcgagaccagggaggggacgaccg APGQSP
tggcgccaaatgcatcgccctatgttgtctggtcccccgtgggaggacccaagggcacga
tcatcgtgtcagatgcggaccggagctggctctacctcaacaccgccggaggcgacccg
gataagtggcagatcagggagtgcggtcaaccagaggcatacagccgcgcactccacgt
tttcgagaaacggcsggatcgcctgatggtgttgggaggggatacctttgacggaaacgg
tgtcggcgcgctgacagacagcgttctcgacatgaacaagtttctgcaaggcggctattct
gcacctggccagtcaccctaataataagtggcaagtccgtattgagcttgtaatcagaggtt
caaaaattactggctggtggctggagcattaattacactactaatacgatiactgcgcgccc
ccccccggctcaatcttccaaattgaacccttaacttctcccccttcccc
Laml SEQ ID NO 51 SEQ ID NO 52
aaggt tgttttttggtctccagttcagagcagctactcatcgggcagcaacagtcgttatgc MRLLLPALLAVAGFVGTV GHW gactgctcctcccagcattgctggctgttgccggcttcgtggggacggtgaagggacactg LGEISHQGFAPFAGANYPVFRNV gctcggggagatctctcgtgagagctgtcagcgcggcttglagtctgaagtttcaactgacc KDYGARGDGVTDDTAAINAAINA gttcgatagatcagggttttgctccgtttgccggggccaactatcccgtcttccgcaacgtta GNPCNRGCASTTQTPAVIYFPAGT aggactacggcgcaagaggtacccccccctttccctctccccccccccccccgggggaa YLISSSIKPAYFTQLIGDASSRPTL actggtactggctgctgtcccgggcttgoggttgacatgtgaccaggtgatggcgtaacgg ATPNFAGFGLIDSNPYYTEVLN acgatacggcagccatcaatgccgccatcaacgccggaaacccttgcaacaggggctgt WKSQNWFRQIRNFVtDTTNIPPA gtaagtgaagcatgagatggggaggagggcttgacgggctgggcacagtgctcacgcg TAATGIHWPTSQATSLQNIVF MP ctcttaaaggcgtcgacgacccaaaccccagcagtgatctactttcccgccggcacatacc ATPDVVHVGLFMEEGSGGFLTDL tgatctcttcgtctatcaagccggcctacttcacccagctcatcggcgatgcttcgtctcggc EFNGGATGASMGNQQFTMRNMK caacgctcaaggcgacgcccaactttgccgggttcggtctcattgacagcaatccctacta FNNC TAIIQIWDWGWTYSGLSIN caccgaggtgctgaactggaagtctcaaaatgtcttcttccgccagatccgcaacttigtcat NCQVGIDMSNGNTMNVGSITLID cgacacgaccaacatcccgcccgcaacggctgccaccggtattcactggccgacgtctc SSFTNVPVAILTSWTENPNPATVE aggcgaccagcctccagaacattgtgttcaacatgcccgccacacccgacgtcgtccacg SLVMENWLDNVPVAVQGPNGR tcggcctgttcatggaagaaggaagcgggggcttcttgactgacctggagtttaacggtgg TLLAGGSTTING GIGHSYGSSGP igccactggtgccagtatgggaaaccaacaattcaccatgcggaatatgaagttcaacaac TSFAGPVTPNSRPGILLNNGRYFT tgcaagacgggtaagtattgcacccgttccaacctccccctcccctcttcatctttacgacct RSKPQYESVPVSSFLSVRSAGAKG gggatgtgaaatcagaccgggctaacggacgaagccatcatccaaatttgggactggggt DASTDDTAALQNAINTAVSQNKl tggacgtattccggcctgtccatcaacaactgccaggtcggtattgacatgtcgaacggca ΕΡΕϋΥΟΓι¾νΤ8Ή8ΙΡΡΟΑΚ]νθΕ acacgatgaacgtgggctcgatcaccctgattgacagcagcttcaccaacgtgcctgttgc TYPVIMSSGAFFNDBWP PVVQV catcctcacctcttggaccgagaacccgaaccctgccacggtcgagagtctggtgatgga G SGQQGQVELTDFIVSTQGRQA gaacgtggtactggacaacgtgcccgtcgccgtccagggaccaaacggaagaaccctg GAICEWNLASDAGNPSGMWDV ctcgccggcgggtcgacgacaatcaacggctggggcatcggccacagctacggctcctc HVRIGGFTGTQQQVAQCPKTPGN 
cggccctacctcgttcgccggtccagtaactcccaactcgcgcccgggaatcctcctcaat PAVNDNCLVAYMGMHVTKGAS aacgggcggtacttcactcgctccaagccgcagtacgagagcgtgcccgtctcgtccttcc GLYMENVWIWTADHDIDDAQNT tgtcggttcggtcggcgggagccaagggcgacgcctcgacggatgacacggcggcgct QITIYAGRGLLTESENGPLWLWG ccagaacgccatcaacacagctgtgtctcagaacaagatcctcttcctcgactacggcatct TGSEHFVLYQYQFAGTKNIFMGQ accgcgtgaccagcaccatcagcatcccgcccggcgccaagatcgtgggcgagacata IQTETPYYQPTPNALVPFPVASAL ccccgtcatcatgtcgtccggcgccttcttcaacgacataaacaaccccaagcccgtggtc RDPDFQAQCAGVEGNCAAAWGL caggtcggcaagtcgggccagcagggccaggtcgagctgaccgactttatcgtctcgac RWDSSDVLVYGAGLYSFFSDYS gcaggggaggcaagcgggggcgatctgcatcgaatggaacctcgccagcgacgccgg TACSTFDAGQTCQQRITSVEGSAT caaccccagcggcatgtgggacgtccatgtgcgcatcggtggcttcaccggtacgcagc NVNLYNLNTIGTREMLTRDGRRV agcaggtcgcccaatgccccaagacgccgggcaacccggccgtcaacgacaactgcct AWYADNQNTFASSVAVYKSN cgtggcctatatgggcatgcacgtcaccaagggagcaagcgggctatacatggagaatgt
gtggatctggtatgttgcttgccatcatcatatcgctgtcgtttgtcggcggcggaggaagac
atgaaatatcaggcgggagtagagctgacacaaaagaacaggactgccgaccacgatat
cgatgatgcicagaacacccaaatcaccatctatgccggccgtggtctccgtaagcttttctt
tttcttcccttttcctcttcccagcccccccaccccgctccctctttccaaatatacacagcctc
ctggattcttacggaggggaaaccgtcccgaaggacttgcaaggaacaggaactgacca
accgcaccaccacaccatcaaacagtaaccgaatccgaaaacggtcccctctggctctgg
ggcacgggttcggagcactttgtgctctaccagtaccagttcgccggcaccaagaacatct
tcatgggccagatccagaccgagacgccctactaccagccgaccccgaacgcgctggtg
cccttcccggtggcgteggcgttgcgcgacccggatttccaggcgcagtgcgcgggcgt
cgagggcaactgcgcggcggcgtggggcctgcgcgtcgtcgactcgagcgacgtgctg
gtgtacggcgccgggctgtactcgttcttcagcgactacagcaccgcctgctcgacgtttg
acgccgggcagacgtgccagcagcgcatcacctccgtcgagggctccgccaccaacgt
caacctgtacaacctcaacacgatcgggacgcgcgagatgctgacgcgggatgggagg
cgggtcgcttggtacgcggataaccagaacacctttgcgtccagcgtggcggtgtacaag
agcaactgaaggcttccttccgggttgaaaaaaaaaaatggtgggggaagggggaggga
gggagggagBgagggaggKgagaaggggatcatgaatgcaactgtgcaccagc
Bgal SEQ ID NO 53 SEQ ID NO 54
CL08660 atgctccctcctagcagcgtaactgccagccatgagagaggcaacaggaacggcacgcc MREATGTARLFHRLSTILLTFLLV
tgtttcacaggctcagtaccatactgctaacatttttgttggttccgcgtctcgrtggagcgag P LVGA PYPVHWADDNNGRQR accctaccccgtccattgggcagatgataataacggccgccaacgcatcagtctcaacga ISLNEGWKFSRFTSNPDSLSYDSL gggctggaagttttcacgtltcacttcgaatccggactcgctctcgtacgactctctaaagtg KWWILPSANNFIKGDKHQPPSGT gtggatcctgccatccgccaataacttcatcaagggtgacaagcaccagcccccgtccgg PPGSNVQYVQPDFDDRDWESVDL aacaccccccggcagcaatgttcagtatgtgcaaccggattttgacgaccgcgactggga PHDWAIKGPFNAPGVSGGMGRLP gtccgttgacctgccgcacgactgggccatcaaggggccgttcaacgctcccggcgtga SNGVGWYRRTLTRSPEDEDKSIFL gcggaggcatggggcgcctgccctcgaacggcgtgggatggtaccgtcggacgttgac DIDGAMSYAAVWLNGRLIGGWP cagatcccccgaggacgaagacaagtccatcttcctcgacattgacggtgccatgtcgtat YGYASFRLDLTPYLDGGDNLLAI gccgccgtctggcteaacggccgtttgatcggtgggtggccgtacgggtacgcctcattcc RLETRYTLAMVSRRGHLPHSAG gcctcgatctcacgccctacctggacggcggcgacaacctgcttgctatccggctggaga WSRSTRPMSAIFWTYITTSDVSEE cgcgctagacactcgcgatggtatcccggcgcgggcatctaccccactcggctggctggt KATVHLTVDVENGGASDTTVDV caaggtcaacccgacccatgtcggccattttttggacgtacatcaccacgtcggacgtgtc VTEIHLLDPATGTPGEEVVAQFEK ggaggagaaggccaccgtccacctcacggtcgatgtggagaacggcggtgcatccgac ATVAVPAGGKMSVSGSTAVRNPL acgacggtagacgtggtaacggaaatccacctgcttgacccggctacggggactcccgg LWGPPPEQEPNLYVATTRLTVNG agaagaggtcgtggcgcagttcgagaaagccacggtagccgtgccggccggggggaa TVIDTYETPFGIRSVTFDANQGVS gatgtcggtcagcggctcgacagcggtgaggaacccgctgctctggggtcccccgcccg VNGKPV IWGTCNHGDLGSLGT agcaagagccgaacctgtacgtcgcgacgaccaggttgacggtcaacggcaccgtcata ALNTRALERQLEALRE GSNALR gacacctacgagacccccttcggtatccgctccgtcaccttcgacgcgaaccagggcgtg TSFINPPAPEFLDLADRMGFLVLD tcggtcaacggcaagccggtcaagatctggggcacgtgcaaccacggcgacctcggctc EIFDTWADPJCTTNDFHAIFPDWH gctcgggacggccttgaacacccgcgccctggagcggcagctggaggcgctccgaga EPDLRAFVRRDRNHPSIIAWSYGN gatggggagcaacgcgctccgcacctcccacaaccccccggccccggagttcctcgac EIPNQSGSATGATAQALHGILVEE ctggccgaccggatgggcttcctggtgctggacgagatcttcgacacgtgggcggaccg DPTRPSTCAMNSAGPGSPLADAL gaagacgaccaacgacttccacgccatctttccggactggcacgagcccgacctgcgcg DIIGLNYQGEGLGTSTDGSFDRFH 
cctttgtccgccgcgaccgcaaccacccgtccatcatcgcgtggtcctacggcaacgaga AAYPGKWWSSESASALSTRGTY tccccaaccagtccggctcggccacgggcgcgaccgcccaggccctccacggcatcct LFPVTSGNHADVGDGPGQGGDG cgtcgaggaggacccgacccggccgtccacctgcgccatgaactcggcggggcccgg RDYRVSAYELYATSWGSSPDKW cagcccgctggccgacgcgctcgacatcatcggcctcaactaccagggcgaggggctc AAFCDAHPYVAGEFVWTGWDYL ggcacgtcgaccgacggctccttcgaccgcttccacgccgcatacccgggcaaggtcgt GEPTPYDGDDGARSSYFGIIDLAG ctggagcagcgagagcgcgtcggcgctcagcacgcgcggcacctacctgttccccgtga FRKDRFYLYQARWRPDLPMAHL cgtcggggaaccacgccgacgtgggcgacggcccgggccagggcggcgacgggcg LPH TWPDRVGQVTPVHVFSSG ggactaccgggtgagcgcgtacgagctgtacgcgaccagctgggggtcgtccccggac DEAELFV GKSAGRQKRGRGQY aaggtgttcgccgcgcacgacgcgcacccctacgtcgccggcgagttcgtctggaccgg RFRWDDVVYQPGNVSVWYKDG ctgggactacctgggcgagccgaccccctacgacggcgacgacggcgcgcggagctc EEWARDARWTVGKAKGLTLTAD ctacttcggcatcatcgacctcgccggcttcaggaaggaccgcttctacctctaccaggcc RAEIRGDGRDLSFVTVAWDENG cgctggcggcccgacctgcccatggcccacctgctgccgcactggacctggcccgacc DTVPEAGNAIAFSVSGPGRFVATD gcgtcggccaggtcacgccogtgcacgtcttctcgtccggcgacgaggccgagctgttcg NGDPADM EFPSLTRKAFSGLAL tcaaoggcaagtcggcggggcggcagaagcgcggccggggccagtaccggttccggt AIVRADKGASGDITVTASAEGLET gggacgacgttgtgtaccagccgggcaacgteagcgtggtcgtgtacaaggacggcga AEWIRAA
ggagtgggcgcgggacgccaggtggacggtcgggaaagccaaggggctgaccctgac
ggccgaccgggccgagatcaggggggacggccgcgacctgtcgttcgtcaccgtggcc
gtcgtcgacgagaacggcgacacggtgcccgaggccggcaacgccatcgcgttttccgi
ctccggtcccggccggatcgtcgcgaccgacaacggagacccggccgacatgaccgag
ttcccgtcccttacccggaaggccttcagcggtctcgcgctggcgatcgttcgggcggata
agggcgcttccggagacatcaccgtcacggcttccgccgagggtctcgagacggcgga
ggtgKtgatacgagctgcttaattaatgaa
Bga3 SEQ ID NO 55 SEQ ID NO 56
cgggtggcattgacggcgcgcagccagataattaatcgccgcagctaataataattaattc MHPFRLPVPELWEGILQKIKAMG gtgcgaaagctcgccatgcgactgttctgccgcctctacggcccagggggctfgtcggcc LRMVSIY1HWGFHAPAPG AVRE tggatagcctcgt taccctgctctcgctcctttcttcacggtttggatcctgcgcggtcgcat GENVLLVVHDDTGHDQLNAAVN cctcgtccggctcgagtccaattctcgacaatgggctccagcacgagatccaatgggacc PRRALNATLLGGKVAGTAGGSA gtcacagcatcgtcatcgggggcgaccgcctgttcttgtttggaggagagatgcatcccttc APDRAGLDRVRTRYNEGGLAAE cgactgccggttcccgagctgtgggagggcattctgcagaagatcaaggccatgggact RLGWHLRGFDDSAWETAEGPWL gagaatggtgtccatciacatacactggggcttccatgcgccggctccgggcaaggccgt GSRAPACASTAGGCRWTCRGGST gcgggagggcgagaacgtgctgctcgtggtacacgacgacacgggccacgaccagct CPSRSALKPAEEDVERGTLEYTVL gaacgcggcggtcaacccccgcagggcgctcaacgccacgctgctggggggtggtgc LFVNGWQYGRYYPSIASEDTFPV cgtctccgagggagggggggagggggaggcactgggttctccgcgtggaaggtcgcc PPGVLDYSGDNUGLAVWSLQEE ggcacggcgggcgggtcggcagccccggaccgggccgggctggaccgggtgcggac GARVDVDVVVGYVADSSLDVKF gcgctataacgagggcgggctggcggcggagcggctgggctggcacctgcgcgggttc DGSYLRPGWDPVRLEQLYMKGV gacgactcggcctgggagacggccgagggcccgtggttgggttcgagggccccggcgt SMWIRGGFPRVTAYQSLSPEARE gcgcttctaccgcgggtggttgccgctggacgtgccgcgggggatcgacctgtccctcgc LFF
gttccgctttaaagccggccgaggaggacgtcgagagggggacgctggagtacacggt
gctcctcttcgtcaacggctggcagtacgggaggtattagccctccatcgcgtccgaggat
accttccccgtcccgcccggggtgctggactacagcggcgacaatctgatcggcttggcc
gtctggtcgctgcaggaggagggagcgcgggttgatgtcgacgttgtggtcgggtatgtg
gcggactcgtcgctagacgtgaaattcgacggcagctatctgcggccgggatgggatcc
ggtgaggctcgagcaattatacatgaagggagtttcaatggttgtaattagaggaggctttc
ctcgtgtgacggcgtaccagtcattgtctcctgaggcacgggaattgtttttttaactttctttg
gcaatttgagaaaaaaaatgtgcgtggccatttaaatgctttgagaaaccggaggcggaag
gagcaaacagatcgaaaggccgttcttgcagctaattaatcattaattaccaaatcattcgag
aatggagaggtaaaaaaacgcgaagcaaaaaaaaaaaaatgtgtgaatcatttctgctcag
tcgtttaaagcccctctccccttggctctttcaatcatgccttggatctttgiiEctgtgtt
Gal2 SEQ ID NO 57 SEQ ID NO 58
aacgttccgggggcagcgagacgttgtgttcctcgtcaaatctcgcacaatggcccggtta MARLSVCISALLAGLSAA YIVPG agcgtatgtatctcggcactcctggccggcctctcggcagccaagtacatcgtcccgggg GRW DTDGNLVNAHAGCVTVD gggagatggcgcgacacggacggcaacctcgtcaacgcccatgcgggctgcgtaacgg KDTG FWLFGEYKVEGQTEGGG tcgacaaggataccggcaaattctggctctttggagagtacaaggtcgaaggccagactg VSVYSSDDLATWESHGLALAPIE aaggcggcggtgtttccgtctacagctcggatgacctggcgacgtgggaaagccacggc GHPYISPSH1IQRPKVVYSKVSNEY cttgccttaggttcgtgatctgtttcttcttcttccgaaatctggccgcaatcacactagctcac H WWHADNSTYGLLLQGFARSP gacctctcacagctcccatcgagggccacccgtacatctcgccctcccacatcatccagcg NISGPYTFVSATAPLGNWSQDFGI accaaaggtggtctacagcaaggtgtccaacgaatatcacgiacgtaggctaccatgttaa FTDY DGRSYALYSNGDSREGRD gcgaatcacagggtttaattaataaggcatgtggcaoggatgctaatgttaaagaacagatg VYLTAYNEDVSALDKVIHRFDICY tggtggcacgcagataactccacgtatggactgctgctccagggatttgctcggtcgccaa DLEAPTIVQTDKSYYA1MSHKTG acatttcggggccttacaccttcgtgagcgccaccgcgcccctgggcaactggtcccagg YRPNNVVAFRADSLAGPWSQPF atttcggcatctttaccgactacaaggacgggaggtcctacgcgctctactcgaacggcga MVAPPNTRTFNSQSGFTLTIRGKK cagccgggaaggccgagacgtgtatcttacggcctacaacgaggatgtgtcggcgctcg RTTYLYLGDQWDSNSLWESRYI acaaggtcatccaccgcttcgacaagtacgatctcgaagccccgaccatcgtccagacgg WLPMEIDDRKKTLRLV NDVYD acaagagttactacgccatcatgagccacaagacgggctaccggcccaacaacgtggtc LDV TGEWSPVRG TYYAADAK gccttccgggccgactcgctggccggtccgtggtcgcagcccttcatggtggcgccgccc TAGNAFKQEANFASKGVrVTGIR aacacgcggaccttcaactcccagtcgggcttcaccttgaccatcaggggcaagaagag GNDSTVTFEGIEGSG PQWVSFY gacgacgtacttgtacctgggcgaccagtgggactccaactcgctctgggagagccggta YQNTDD GFGDQPGGTPDRIGGT catctggcttccgatggagattgacgaccgcaagaagacactcaggctcgtgtggaacga WQLRRIASVVVNGNTEKVETLYQ cgtgtacgacttggacgtcaagacgggcgagtggtcccccgtccgaggcaagacatact RDTH GIILSTPLLLTLEKGKKNTI acgcggccgacgccaagacggccggcaacgcgttcaagcaggaggccaacttcgcca TVGGLWNGFDYKGADLDRIVVY gcaagggtgtcatcgtcacgggcattcgcggcaacgacagcaccgttacttttgaagggat PPETDRKRR
cgagggctccgggaagccgcagtgggtgtcgttctactaccagaacacggacgacatgg
gcttcggggaccaaccgggcgggacgccggaccggatcggcggcacgtggcagctgc
gccggatcgcgagcgtcgtcgtaaacggtaacacggagaaggicgagacgctgtacca
gcgggacacgcacaagggcatcatcctctcgacgcctctgctgctgacgctggagaagg
ggaagaagaacaccatcaccgtgggcgggctgtggaacgggttcgattacaagggggc
ggatctggacagaatcgtcgtgtatccccccgaaacggaccgcaagaggcgataagaga
gatcttgagcggctttctgcctattaattcccggcggtgctcggctcgaatgacggacacgg
cggggccaaagacaggagagcatgagacggtctagcaatgacctggcgccgtgatattc
tatctatcttggtatagggctgatgactaataat
Arhl SEQ ID NO 59 SEQ ID NO 60
agcttcgccattcctgtaacaactaattacctctctgatctcatctttgcttcgtttacggcgtca MIGLISLGLSAIAGAAVTRTHLHL acgtcggacactccatgatcggcctaalctccctgggcctgtccgccatcgcaggcgccg PQADVPLITVQDGPIVLSSTTTESA ctgttactcgaacacacctacatctgccgcaagcggacgttccgttgatcacggtgcaagat TLVLDYGANVEGIPSFEVVGATG ggaccgatcgtcctatcgtcaaccacgacagagagcgccaccctcgtcctcgattatggg DTTVFEITYSES AGLDLYMGDG gctaatgtagagggcatcccgtcctttgaggtcgtaggcgctacaggcgacacgaccgttt PIPLAAAMDTYRVNRYNIVGPEQ tcgagatcacgtactctgaaagcaaagccggcttggatttgtatatggtatgccgtgcccttt FTNRHVQGAFRYQKLNLSSPGEL acttccgcctcgtcatctgtgttctatgagccggaggctaattgctgatttgccacgtgtcag TLQNVGVTPTTRTTSIDKLPGSF S ggcgacgggccgatccctctagcggcggccatggacacgtaccgagtcaaccggiataa SDSS1TDIWAVGARTIQLTEIPKDS cattgttggcccggagcagttcaccaatcgccatgtccaaggggcctttcgataccagaag IPEFWEITSEGAVTDSLAPQANGAP ctgaatctatcctcacccggcgagctgacgttgcagaatgtcggcgtgattcccactaccc DAVTIDAGARTIAAYVGSTTEDTE gcacgac tcgatcgacaaactgcccgggtcgrtcaagagctccgacagctccattacag LTRATVPSNVT ALGSWHSVHV acatctgggcagtcggtgctagaaccattcagctgaccgaaatacccaaagartctatccc EVAMTDIAI SINGE VLKFTQYS cgagttttgggaaatcaccagcgagggtgccgtgatcgacagtcttgcaccccaagcaaa FYGSYGLGASFGHKAVFRDLVAT cggggccccagatgccgtcactagtacggcctataacctcgacttcaaggtcaaaccgct DPVGTVTYQHPLNDKSCL DFLL gattggcggctttgggttcagcgtgttgtctgacactcttaactcggcaatctaaatatcagtt GTNPLDVSVDGSRRDRIAYAGDL gatgctggagcgaggacaattgcggcctacgtcggatcgactaccgaggatacggagct DIAASAALVSTHGLEFVEGALNLL caccagggccacggtgccgtcgaatgtgaccatggctcttggcagctggcactccgttca ASMQATPGFFIPTVKIQQRPLSTPL cgtcgaggtcgctatgaccgacatcgccataagcatcaacggcgagcgagtgctcaagtt DVNITGLIGYSWNLLTAVSHTYM tacccagtactccaagtlctatggctcctatggccttggtgcttcctttggccacaaggctgtc HTGDLALALEWAPRIVRMLDWS ttcagagacctcgtcgcgaccgatcccgtgggcaccgtcacctaccagcaccccttgaac HSQTLSNGLFNLSDATFGGDWNY gacaagagctgcctgaaagattttcttttaggcaccaaccctctcgatgtatccgttgacggt YDPAQSGVVTKFNVLYAYALQET tcgcgccgagatcgtatcgcatacgctggcgacctagacatcgccgcctctgccgccctg VGLLADVGVDISVYQDRLAALRA gtctccacccatggactggagtttgtcgaaggtgcgcttaaccttttggcatcgatgcaggc AIDKHLWSDELGAYVYADGIRDG 
cacgcccggcttttttatacccacagtcaagatacagcaaaggccgctttcgactcccttgg FGQDSNAIAILAGVNLDPSHSSETI acgtcaacatcacaggcctgataggatactcgtggaacttgctcacagctgictctcacacg LSTLSRELSTPKGPLSFSSGVLQH tacatgcatacgggtgatctcgccctggccttggagtgggcacctagaattgtgcgaatgct GFQRYISPYASAYHLRAAFTSQNS Patent 124702-0230
tgactggtcgcactcgcaaacgctgtcgaacggcctcttcaatctcagcgacgccaccttc TAARELLDSLWAPMTDTNNANY ggcggagactggaactactacgaccctgcccagtctggagtcgtcaccaagttcaatgtcc SGCFWETLDETGRPAFGVHTSLC tctacgcctacgcgctccaagagacggtcgggctcttggccgatgtcggtgtcgacatctc HGWSAGPTAELSRFVLGAQPTKP tgtgtaccaagaccgtctcgctgctctacgggcagccattgataagcacctctggagtgac GWAE AVSPQTLGLTSA GEVPT gagctaggcgcgtacgtctacgccgacggtatccgggacgggttcggccaagactccaa PLGPLTVSWEFCGTLLNMSVEAP tgccatcgctattcttgccggcgtcaacttggacccatcccactcgtccgaaaccatcttgtc AGTTGLVNVPYPLLVPVTQSKFI taccctgtctagggaactttccacgcccaagggcccactatccttctcctccggcgtcctac MNGSVVNGTTLRVKGGSKVTIM agcacggattccagcggtacatcagcccttacgcctctgcataccatctccgcgccgcctt QLRK
cacatcgcagaactctaccgctgcccgggagttgcttgactcgctgtgggcgccgatgact
gataccaacaacgccaactactccggctgtttctgggagaccctggacgagaccggccg
ccccgctttcggagtccatacgagtttgtgccacggatggagcgcgggcccgacggctga
gctgagccgtttcgtgctcggtgcgcaacccaccaagcccggctgggccgagtgggcag
tctctccgcagactcttggtcttacctccgcccggggtgaggtgcccacacctctcggtcct
cttactgtaagctgggagttctgtggaaccctcctcaacatgtcggtcgaggctcctgccgg
caccacagggttggtcaatgtcccgtacccgctgttggttccggtaacccagtccaagttca
ttatgaatggttctgttgtcaacggcaccactctgagggttaaagggggctcaaaggtcacc
attatgcagttaaggaaatgagcaagtggatgcaacgcggtaattgagaataag
Cell SEQ ID NO 61 SEQ ID NO 62
atataaaggagatcaggccttccctcctcggctcattggggcctactagcacatcatcatcc MSLPKDFKWGFATASYQIEGSVN gtcttccatccctcctcagaacttccttccccttcctcctatccacctttcccttactcacacaga EDGRGPS1WDTFCAIPGKIADGSS caatcgtccatcgtccnccatgictcttcccaaggacttcaagtggggcttcgccaccgcct GAVACDSY RTKEDIELLKS1GAK cgtaagttcaaggacccgggcttttcgatcaagctcacagaaccgtccttggctgactgtgt AYRFSIAWSRVIPLGGRNDPrNQ gttccctttctctcctcacccacaggtaccagattgagggctccgtcaacgaggatggccgt GLDHYVKFVDDLVEAGEPF1TLS ggcccctccatctgggacacattctgcgccatccccggcaagatcgccgacggcagctc HWDLPDALEKRYGGYLN EEFA gggtgccgtggcttgcgactcgtacaagcgcaccaaggaggacattgagctgctcaagtc ADFENYARVMFKAIPKC HWITF gataggggccaaggcgtaccgcttctccatcgcgtggtcgcgcgtcatcccgctgggcg NEPWCTSILGYNTGYFAPQRTSD gtcgcaacgaccccatcaaccagaagggtctggaccactacgtcaagttcgtcgacgacc RS SPVGDSAREP IVGH ILIAH tcgtcgaggccggcatcgagcccttcatcaccctctcccactgggacctgccggacgcgc GRAVKAYREDF PTQGGEIGITLN
Iggagaagcgctacggcggctatctgaacaaggaggagttcgcggccgactttgagaact GDATLPWDPEDPADVEACDRKIE acgcgcgcgtcatgttcaaggccatccccaagtgcaagcactggatcaccttcaacgagc FAISWFADPIYFGEYPASMRKQLG cgtggtgcacgtccatcctgggctacaacacgggctacttcgcgcccggccgcacgtcg DRLPKFTAEEVALVKGSNDFYGM gaccgcagcaagtcgcccgtcggcgacagcgcgcgcgagccgtggatcgtcggccac NHYTANYIKHK GVPPEDDFLGN aacatcctcatcgcgcacgggagggccgtcaaggcgtaccgcgaggacttcaagcccac LETLFY NADCIGPETQSFWLRP gcagggcggcgagatcggcatcacgctcaacggcgacgccacgctcccctgggaccc HPQGFRDLLNWLSKRYGYPKIYV ggaggacccggccgacgtcgaggcgtgcgaccgcaagatcgagttcgccatctcgtggt TENGTSL GENDMPLEQILEDDFR tcgccgaccccalctactttggcgagtacccggcgtcgatgcgcaagcagcigggcgacc VKYFHDYVHAMAKASAEDGVN gcctgcccaagttcacggccgaggaggtggcgctcgtcaagggctccaacgacttctac VQGYLAWSLMDNFEWAEGYETR ggcatgaaccactacacggccaactacatcaagcacaagaagggcgtgccgcccgagg FGVTYVDYANDQKRYPKKSA S acgacttcctgggcaacctcgagacgctcttctacaacaagaacgccgactgcatcgggc LKPLFDSLIRKE
ccgagacgcagtccttctggctgcgcccgcacccccagggcttccgcgacctgctcaact
ggctgagcaagcggtacggctaccccaagatctacgtgaccgagaacggcacgtcgctc
aagggcgagaacgacatgccgctcgagcagatcctcgaggacgacttccgcgtcaagta
ctttcacgactacgtccacgccatggccaaggcctcggccgaggacggcgtcaacgtcc
agggctacctggcctggtcgctcatggataacttcgagtgggccgagggttacgagaccc
gcttcggcgtcacctacgtcgactacgccaacgaccagaagcgctaccccaagaagagc
gcaaagagcctcaagccgctgittgacagcctcatccggaaggagtaaggcaggcagga
gttggagtatgagggtagccgctgatggctattcttcccacgtttttgtgtgtttcctcttcatttt
tttttctcttgccgcaacatgacggctcctgtctctgaagggaacccctgaa
Mip SEQ ID NO 63 SEQ ID NO 64
atgaagcacgaggaacaaaggcagaggggaggttcacacccacggtacgaagtaatatt MASSRYRYTFPRNPKANPI AVVT attttccagagaggttgagttaaccctgggccgaacgaattagaacagtaggagcgggtg GG GSSYYRFTLLTERLIRYEWSE gctggttgacccgagtcgtgattggttgccgtcgcactcttgtgctctcgtggccagaccga DGGFEDRASTFAVFRYFDAPQYR actgctcctcccccgcatcgaggttgacagggataagcgcacctgaccaccggtttcagc VVETNDSLEIITDYFHLTYDKKKF ggcccgaaacgggggtggttgttcgttggggtaacgcgaccogtatttgggagtccaagg SSEGLSVRVGSDLWNYDGKSYG gcggtgtctccaacaacacctactctactctggtctcaaggtcccagttgggggattgtctgt DLGGTARTLDGAYGRVDLEPGVL tgaiacttgciagataaatcaacagcaacccctccagagtcggatctcacgctttcatccgc SRKAYAVLDDSKSMLFDDDGWI ccaaacaaacaaacgaacagagacaatggccagcagccggtaccggtacacgttcccg AIREPGRIDGYVFAYSGEHKAAIR aggaatccgaaggccaatccgaaggccgtcgtgacaggcggcaagggatcctcttactat DFYRLSGRQPVLPRWVLGNWWS cgcttcaccctcctcaccgaacggttgatccgttacgagtggtccgaggacggaggcttcg RYHAYSADEYIELMDHFKREGIPL aggatcgcgcgtccacgttcgcggtattcagatactttgatgccccgcagtaccgcgttgtc T SIVDMDWHRVDDVPPKYGSG gagacaaacgacagtctcgagatcatcacggactactttcacctcacctatgacaagaaga WTGYSWNRKLFPDPEGFLQELRN agttctcatcggaaggactttccgtcagagtcggctccgacctctggaattacgacggcaa RNLKVALNDHPADGIRAYEDLYP gagttatggagacctgggcggcaccgcccggaccctagacggcgcctatggccgcgtg AVAKALNHDTSREEPIKFDCTDR gacclggaaccgggtgtgctctcgcgcaaagcttatgcggttctcgacgacagcaagtcta KF DAYFDVLKLSLEKQGVMFW tRctctttgacgacgacgggtRgattgccattcgcgagccgggccgcattgacggrtacgt WIDWQQGTGSKLPSVDPLWVLN Patent 124702-0230
Figure imgf000068_0002
[220] The proposed physical properties of enzymes of the present invention are illustrated in Table 3 below, including the molecular weight and isoelectric point, as calculated from the primary amino acid sequence using the ProtParam program (available at the ExPASy Proteomics Server).
[221] Table 3. Physical Properties of Myceliophthora thermophila C1 Enzymes
Figure imgf000068_0001
MW = Molecular Weight in kiloDaltons (kDa), as calculated based on amino acid sequence with Clone Manager 9 Professional Edition
pI = isoelectric point, as calculated based on amino acid sequence with Clone Manager 9 Professional Edition
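As a rough illustration of how a molecular weight can be derived from a primary amino acid sequence, the following Python sketch sums average residue masses and adds one water molecule for the free termini. The function name `molecular_weight_kda` and the mass table are assumptions introduced here for illustration only. This is not the ProtParam or Clone Manager implementation, and it ignores signal peptide cleavage, glycosylation, and any other modification.

```python
# Average residue masses in daltons (free amino acid minus one water).
# These are standard textbook values supplied here as an assumption.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the N- and C-termini


def molecular_weight_kda(sequence: str) -> float:
    """Approximate average molecular weight (kDa) of an unmodified protein."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    unknown = set(residues) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    total_da = sum(AVERAGE_RESIDUE_MASS[r] for r in residues) + WATER_MASS
    return total_da / 1000.0
```

A call such as `molecular_weight_kda("MRIHLRGFAILGLAVEAAIARPPG")` would give an estimate for that peptide fragment only; values reported for the full-length enzymes depend on the complete sequence and on the tool used.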
[222] As used herein, reference to an isolated protein or polypeptide in the present invention, including any of the enzymes disclosed herein, includes full-length proteins and their glycosylated or otherwise modified forms, fusion proteins, or any fragment or homologue or variant of such a protein. More specifically, an isolated protein, such as an enzyme according to the present invention, is a protein (including a polypeptide or peptide) that has been removed from its natural milieu (i.e., that has been subject to human manipulation) and can include purified proteins, partially purified proteins, recombinantly produced proteins, synthetically produced proteins, proteins complexed with lipids, soluble proteins, and isolated proteins associated with other proteins, for example. As such, "isolated" does not reflect the extent to which the protein has been purified. Preferably, an isolated protein of the present invention is produced recombinantly. In addition, and by way of example, a "Myceliophthora thermophila C1 protein" or "Myceliophthora thermophila C1 enzyme" refers to a protein (generally including a homologue or variant of a naturally occurring protein) from Myceliophthora thermophila or to a protein that has been otherwise produced from the knowledge of the structure (e.g., sequence) and perhaps the function of a naturally occurring protein from Myceliophthora thermophila. In other words, a M. thermophila protein includes any protein that has substantially similar structure and function to that of a naturally occurring M. thermophila protein or that is a biologically active (i.e., has biological activity) homologue or variant of a naturally occurring protein from C. lucknowense as described in detail herein. As such, a M. thermophila protein can include purified, partially purified, recombinant, mutated/modified and synthetic proteins.
[223] According to the present invention, the terms "modification," "mutation," and "variant" can be used interchangeably, particularly with regard to the modifications/mutations to the amino acid sequence of a M. thermophila protein (or nucleic acid sequences) described herein. An isolated protein according to the present invention can be isolated from its natural source, produced recombinantly or produced synthetically.
[224] According to the present invention, the terms "modification" and "mutation" can be used interchangeably, particularly with regard to the modifications/mutations to the primary amino acid sequences of a protein or peptide (or nucleic acid sequences) described herein. The term "modification" can also be used to describe post-translational modifications to a protein or peptide including, but not limited to, methylation, farnesylation, carboxymethylation, geranyl geranylation, glycosylation, phosphorylation, acetylation, myristoylation, prenylation, palmitation, and/or amidation. Modifications can also include the cleavage of a signal peptide, or methionine, or other portions of the peptide that require cleavage to generate the mature peptide. Modifications can also include, for example, complexing a protein or peptide with another compound. Such modifications can be considered to be mutations, for example, if the modification is different than the post-translational modification that occurs in the natural, wild-type protein or peptide.
[225] As used herein, the terms "homologue" and "variant" are used to refer to a protein or peptide which differs from a naturally occurring protein or peptide (i.e., the "prototype" or "wild-type" protein) by minor modifications to the naturally occurring protein or peptide, but which maintains the basic protein and side chain structure of the naturally occurring form. Such changes include, but are not limited to: changes in one or a few amino acid side chains; changes in one or a few amino acids, including deletions (e.g., a truncated version of the protein or peptide), insertions and/or substitutions; changes in stereochemistry of one or a few atoms; and/or minor derivatizations, including but not limited to: methylation, glycosylation, phosphorylation, acetylation, myristoylation, prenylation, palmitation, amidation and/or addition of glycosylphosphatidyl inositol. A homologue or variant can have either enhanced, decreased, or substantially similar properties as compared to the naturally occurring protein or peptide. A homologue or variant can include an agonist of a protein or an antagonist of a protein.
[226] Homologues or variants can be the result of natural allelic variation or natural mutation. A naturally occurring allelic variant of a nucleic acid encoding a protein is a gene that occurs at essentially the same locus (or loci) in the genome as the gene which encodes such protein, but which, due to natural variations caused by, for example, mutation or recombination, has a similar but not identical sequence. Homologues can also be the result of a gene duplication and rearrangement, resulting in a different location. Allelic variants typically encode proteins having similar activity to that of the protein encoded by the gene to which they are being compared. One class of allelic variants can encode the same protein but have different nucleic acid sequences due to the degeneracy of the genetic code. Allelic variants can also comprise alterations in the 5' or 3' untranslated regions of the gene (e.g., in regulatory control regions). Allelic variants are well known to those skilled in the art.
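The degeneracy of the genetic code mentioned above can be illustrated with a minimal Python sketch. It translates two different nucleic acid sequences with the standard genetic code and shows that they encode the same peptide. The first sequence is the start of the coding region as it appears in the listing for SEQ ID NO 49; the second is a hypothetical synonymous variant constructed here for illustration and is not taken from the disclosure. The function name `translate` is likewise an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (length divisible by 3) into one-letter codes."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


listed = "atgcgtattcacctc"      # from the listing for SEQ ID NO 49
synonymous = "atgcgcatccatctg"  # hypothetical variant, differs at four wobble positions

# Both sequences encode the same five residues despite differing nucleotides.
same_peptide = translate(listed) == translate(synonymous)
```

Both sequences translate to the peptide MRIHL, which matches the first five residues listed for SEQ ID NO 50.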
[227] Homologues or variants can be produced using techniques known in the art for the production of proteins including, but not limited to, direct modifications to the isolated, naturally occurring protein, direct protein synthesis, or modifications to the nucleic acid sequence encoding the protein using, for example, classic or recombinant DNA techniques to effect random or targeted mutagenesis.
[228] Modifications in protein homologues or variants, as compared to the wild-type protein, either agonize, antagonize, or do not substantially change, the basic biological activity of the homologue or variant as compared to the naturally occurring protein. Modifications of a protein, such as in a homologue or variant, may result in proteins having the same biological activity as the naturally occurring protein, or in proteins having decreased or increased biological activity as compared to the naturally occurring protein. Modifications which result in a decrease in protein expression or a decrease in the activity of the protein, can be referred to as inactivation (complete or partial), down-regulation, or decreased action of a protein. Similarly, modifications which result in an increase in protein expression or an increase in the activity of the protein, can be referred to as amplification, overproduction, activation, enhancement, up-regulation or increased action of a protein.
[229] According to the present invention, an isolated protein, including a biologically active homologue, variant, or fragment thereof, has at least one characteristic of biological activity of a wild-type, or naturally occurring, protein. As discussed above, in general, the biological activity or biological action of a protein refers to any function(s) exhibited or performed by the protein that is ascribed to the naturally occurring form of the protein as measured or observed in vivo (i.e., in the natural physiological environment of the protein) or in vitro (i.e., under laboratory conditions). The biological activity of a protein of the present invention can include an enzyme activity (catalytic activity and/or substrate binding activity), such as cellulase activity, hemicellulase activity, β-glucanase activity, β-glucosidase activity, α-galactosidase activity, β-galactosidase activity, xylanase activity or any other activity disclosed herein. Specific biological activities of the proteins disclosed herein are described in detail above and in the Examples. Methods of detecting and measuring the biological activity of a protein of the invention include, but are not limited to, the assays described in the Examples section below. Such assays include, but are not limited to, measurement of enzyme activity (e.g., catalytic activity), measurement of substrate binding, and the like. It is noted that an isolated protein of the present invention (including homologues or variants) is not required to have a biological activity such as catalytic activity. A protein can be a truncated, mutated or inactive protein, or lack at least one activity of the wild-type enzyme, for example. Inactive proteins may be useful in some screening assays, for example, or for other purposes such as antibody production.
[230] Methods to measure protein expression levels of a protein according to the invention include, but are not limited to: SDS-PAGE analysis, protein concentration assays (Lowry, Bradford, BCA), western blotting, immunocytochemistry, flow cytometry or other immunologic-based assays; assays based on a property of the protein including but not limited to, ligand binding or interaction with other protein partners. Binding assays are also well known in the art. For example, a BIAcore machine can be used to determine the binding constant of a complex between two proteins. The dissociation constant for the complex can be determined by monitoring changes in the refractive index with respect to time as buffer is passed over the chip (O'Shannessy et al., Anal. Biochem. 212:457-468 (1993); Schuster et al., Nature 365:343-347 (1993)). Other suitable assays for measuring the binding of one protein to another include, for example, immunoassays such as enzyme-linked immunosorbent assays (ELISA) and radioimmunoassays (RIA), or determination of binding by monitoring the change in the spectroscopic or optical properties of the proteins through fluorescence, UV absorption, circular dichroism, or nuclear magnetic resonance (NMR).
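To make the time-course measurement above more concrete, the following Python sketch estimates a dissociation rate constant from a response trace recorded while buffer flows over the chip, and then combines it with an association rate constant to give an equilibrium dissociation constant. It assumes a simple 1:1 binding model in which the response decays as a single exponential, R(t) = R0·exp(-k_off·t), and K_D = k_off / k_on. The function names, the model, and the synthetic data in the usage note are assumptions made for illustration; they are not taken from the cited references or from any instrument software.

```python
import math


def estimate_k_off(times_s, responses):
    """Estimate k_off (1/s) by a least-squares line through ln(response) versus time.

    Assumes single-exponential decay of a strictly positive response signal.
    """
    if len(times_s) != len(responses) or len(times_s) < 2:
        raise ValueError("need at least two paired time/response points")
    if any(r <= 0 for r in responses):
        raise ValueError("responses must be positive to take logarithms")
    logs = [math.log(r) for r in responses]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, logs))
    return -sxy / sxx  # slope of ln R versus t equals -k_off


def dissociation_constant(k_off, k_on):
    """Equilibrium dissociation constant K_D (M) from k_off (1/s) and k_on (1/(M*s))."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on
```

For example, with synthetic data generated as `responses = [100 * math.exp(-0.01 * t) for t in range(0, 101, 10)]`, `estimate_k_off` recovers a rate constant of approximately 0.01 per second. Real sensorgrams typically need baseline correction and a fit to the association phase as well.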
[231] Many of the enzymes and proteins of the present invention may be desirable targets for modification and use in the processes described herein. These proteins have been described in terms of function and amino acid sequence (and nucleic acid sequence encoding the same) of representative wild-type proteins. In one embodiment of the invention, homologues or variants of a given protein (which can include related proteins from other organisms or modified forms of the given protein) are encompassed for use in the invention. Homologues or variants of a protein encompassed by the present invention can comprise, consist essentially of, or consist of, in one embodiment, an amino acid sequence that is at least about 35% identical, and more preferably at least about 40% identical, and more preferably at least about 45% identical, and more preferably at least about 50% identical, and more preferably at least about 55% identical, and more preferably at least about 60% identical, and more preferably at least about 65% identical, and more preferably at least about 70% identical, and more preferably at least about 75% identical, and more preferably at least about 80% identical, and more preferably at least about 85% identical, and more preferably at least about 90% identical, and more preferably at least about 95% identical, and more preferably at least about 96% identical, and more preferably at least about 97% identical, and more preferably at least about 98% identical, and more preferably at least about 99% identical, or any percent identity between 35% and 99%, in whole integers (i.e., 36%, 37%, etc.), to an amino acid sequence disclosed herein that represents the amino acid sequence of an enzyme or protein according to the invention (including a biologically active domain of a full-length protein). Preferably, the amino acid sequence of the homologue or variant has a biological activity of the wild-type or reference protein or of a biologically active domain thereof (e.g., a catalytic domain). When denoting mutation positions, the amino acid position of the wild-type is typically used. The wild-type can also be referred to as the "parent." Additionally, any generation before the variant at issue can be a parent.
[232] In one embodiment, a protein of the present invention comprises, consists essentially of, or consists of an amino acid sequence that, alone or in combination with other characteristics of such proteins disclosed herein, is less than 100% identical to an amino acid sequence selected from Tables 1 and 2 (i.e., a homologue or variant). For example, a protein of the present invention can be less than 100% identical, in combination with being at least about 35% identical, to a given disclosed sequence. In another aspect of the invention, a homologue or variant according to the present invention has an amino acid sequence that is less than about 99% identical to any of such amino acid sequences, and in another embodiment, is less than about 98% identical to any of such amino acid sequences, and in another embodiment, is less than about 97% identical to any of such amino acid sequences, and in another embodiment, is less than about 96% identical to any of such amino acid sequences, and in another embodiment, is less than about 95% identical to any of such amino acid sequences, and in another embodiment, is less than about 94% identical to any of such amino acid sequences, and in another embodiment, is less than about 93% identical to any of such amino acid sequences, and in another embodiment, is less than about 92% identical to any of such amino acid sequences, and in another embodiment, is less than about 91% identical to any of such amino acid sequences, and in another embodiment, is less than about 90% identical to any of such amino acid sequences, and so on, in increments of whole integers.
[233] As used herein, unless otherwise specified, reference to a percent (%) identity refers to an evaluation of homology which is performed using: (1) a BLAST 2.0 Basic BLAST homology search using blastp for amino acid searches and blastn for nucleic acid searches with standard default parameters, wherein the query sequence is filtered for low complexity regions by default (described in Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res. 25:3389-3402); (2) a BLAST 2 alignment (using the parameters described below); (3) PSI-BLAST (Position-Specific Iterated BLAST) with the standard default parameters; and/or (4) CAZy homology determined using standard default parameters from the Carbohydrate Active EnZymes database (Coutinho, P.M. & Henrissat, B. (1999) Carbohydrate-active enzymes: an integrated database approach. In "Recent Advances in Carbohydrate Bioengineering", H.J. Gilbert, G. Davies, B. Henrissat and B. Svensson eds., The Royal Society of Chemistry, Cambridge, pp. 3-12).
[234] It is noted that due to some differences in the standard parameters between BLAST 2.0 Basic BLAST and BLAST 2, two specific sequences might be recognized as having significant homology using the BLAST 2 program, whereas a search performed in BLAST 2.0 Basic BLAST using one of the sequences as the query sequence may not identify the second sequence in the top matches. In addition, PSI-BLAST provides an automated, easy-to-use version of a "profile" search, which is a sensitive way to look for sequence homologues or variants. The program first performs a gapped BLAST database search. The PSI-BLAST program uses the information from any significant alignments returned to construct a position-specific score matrix, which replaces the query sequence for the next round of database searching. Therefore, it is to be understood that percent identity can be determined by using any one of these programs.
[235] Two specific sequences can be aligned to one another using BLAST 2 sequence as described in Tatusova and Madden, (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250. BLAST 2 sequence alignment is performed in blastp or blastn using the BLAST 2.0 algorithm to perform a Gapped BLAST search (BLAST 2.0) between the two sequences allowing for the introduction of gaps (deletions and insertions) in the resulting alignment. For purposes of clarity herein, a BLAST 2 sequence alignment is performed using the standard default parameters as follows.
For blastn, using 0 BLOSUM62 matrix:
Reward for match = 1
Penalty for mismatch = -2
Open gap (5) and extension gap (2) penalties
gap x_dropoff (50) expect (10) word size (11) filter (on)
For blastp, using 0 BLOSUM62 matrix:
Open gap (11) and extension gap (1) penalties
gap x_dropoff (50) expect (10) word size (3) filter (on).
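Once two sequences have been aligned with parameters such as those listed above, a percent identity can be read off the alignment. The following is a minimal Python sketch of that final step only. It does not run BLAST; it assumes the two sequences are already aligned and padded with "-" for gaps, and it assumes the convention of dividing identical positions by the alignment length. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = identical non-gap columns / alignment length,
    where columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns that contain a gap in both rows; they carry no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def within_identity_window(identity: float, minimum: float = 35.0,
                           maximum: float = 100.0) -> bool:
    """True if identity is at least `minimum` and strictly below `maximum`."""
    return minimum <= identity < maximum


if __name__ == "__main__":
    # Hypothetical six-column alignment with one gap and one substitution.
    value = percent_identity("MKT-LV", "MKTALI")
    print(round(value, 2), within_identity_window(value))
```

The second helper mirrors the "at least about 35% identical" and "less than 100% identical" language as hard cut-offs, which is a simplification of the "about" wording used in the text.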
[236] A protein of the present invention can also include proteins having an amino acid sequence comprising at least 10 contiguous amino acid residues of any of the sequences described herein (i.e., 10 contiguous amino acid residues having 100% identity with 10 contiguous amino acids of the amino acid sequences of Tables 1 and 2). In other embodiments, a homologue or variant of a protein amino acid sequence includes amino acid sequences comprising at least 20, or at least 30, or at least 40, or at least 50, or at least 75, or at least 100, or at least 125, or at least 150, or at least 175, or at least 200, or at least 250, or at least 300, or at least 350 contiguous amino acid residues of any of the amino acid sequences disclosed herein. Even small fragments of proteins without biological activity are useful in the present invention, for example, in the preparation of antibodies against the full-length protein or in a screening assay (e.g., a binding assay). Fragments can also be used to construct fusion proteins, for example, where the fusion protein comprises functional domains from two or more different proteins (e.g., a CBM from one protein linked to a CD from another protein). In one embodiment, a homologue or variant has a measurable or detectable biological activity associated with the wild-type protein (e.g., enzymatic activity).
[237] According to the present invention, the term "contiguous" or "consecutive", with regard to nucleic acid or amino acid sequences described herein, means to be connected in an unbroken sequence. For example, for a first sequence to comprise 30 contiguous (or consecutive) amino acids of a second sequence means that the first sequence includes an unbroken sequence of 30 amino acid residues that is 100% identical to an unbroken sequence of 30 amino acid residues in the second sequence. Similarly, for a first sequence to have "100% identity" with a second sequence means that the first sequence exactly matches the second sequence with no gaps between nucleotides or amino acids.
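The definition of "contiguous" given above can be illustrated with a short Python sketch that tests whether a first sequence contains an unbroken run of at least n residues that is 100% identical to an unbroken run in a second sequence. The function name `shares_contiguous_run` and the example sequences are hypothetical and are not taken from Tables 1 and 2.

```python
def shares_contiguous_run(first: str, second: str, n: int) -> bool:
    """True if `first` contains n contiguous residues found unbroken in `second`."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if len(first) < n or len(second) < n:
        return False
    # Collect every window of length n from the second sequence.
    windows = {second[i:i + n] for i in range(len(second) - n + 1)}
    # Slide a window of the same length along the first sequence.
    return any(first[i:i + n] in windows for i in range(len(first) - n + 1))


if __name__ == "__main__":
    # Hypothetical sequences sharing the unbroken run "MKTLLVA" (7 residues).
    a = "AAAMKTLLVAGG"
    b = "CCMKTLLVACC"
    print(shares_contiguous_run(a, b, 7))
    print(shares_contiguous_run(a, b, 8))
```

Because the comparison is an exact window match, no gaps or substitutions are tolerated inside the run, which is the sense of "unbroken" used in the definition.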
[238] In another embodiment, a protein of the present invention, including a homologue or variant, includes a protein having an amino acid sequence that is sufficiently similar to a natural amino acid sequence that a nucleic acid sequence encoding the homologue or variant is capable of hybridizing under moderate, high or very high stringency conditions (described below) to (i.e., with) a nucleic acid molecule encoding the natural protein (i.e., to the complement of the nucleic acid strand encoding the natural amino acid sequence). Preferably, a homologue or variant of a protein of the present invention is encoded by a nucleic acid molecule comprising a nucleic acid sequence that hybridizes under low, moderate, or high stringency conditions to the complement of a nucleic acid sequence that encodes a protein comprising, consisting essentially of, or consisting of, an amino acid sequence represented by any of Tables 1 and 2. Such hybridization conditions are described in detail below.
[239] A nucleic acid sequence complement of a nucleic acid sequence encoding a protein of the present invention refers to the nucleic acid sequence of the nucleic acid strand that is complementary to the strand which encodes the protein. It will be appreciated that a double-stranded DNA which encodes a given amino acid sequence comprises a single-strand DNA and its complementary strand having a sequence that is a complement to the single-strand DNA. As such, nucleic acid molecules of the present invention can be either double-stranded or single-stranded, and include those nucleic acid molecules that form stable hybrids under stringent hybridization conditions with a nucleic acid sequence that encodes an amino acid sequence such as the amino acid sequences of Tables 1 and 2, and/or with the complement of the nucleic acid sequence that encodes an amino acid sequence such as the amino acid sequences of Tables 1 and 2. Methods to deduce a complementary sequence are known to those skilled in the art. It should be noted that since nucleic acid sequencing technologies are not entirely error-free, the sequences presented herein, at best, represent apparent sequences of the proteins of the present invention.
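As one illustration of how a complementary strand can be deduced, the following Python sketch returns the reverse complement of a DNA sequence written 5' to 3'. It is a minimal example restricted to the four unambiguous bases; the function name `reverse_complement` and the example sequence are hypothetical.

```python
# Translation table pairing each base with its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand of `dna`, read 5' to 3'."""
    invalid = set(dna) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unsupported characters: {sorted(invalid)}")
    # Complement each base, then reverse so the result is also 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    coding = "ATGGCC"  # hypothetical coding-strand fragment
    print(reverse_complement(coding))
```

Applying the function twice returns the original sequence, which is a convenient sanity check when preparing probe or primer sequences.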
[240] As used herein, reference to hybridization conditions refers to standard hybridization conditions under which nucleic acid molecules are used to identify similar nucleic acid molecules. Such standard conditions are disclosed, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press, 1989. Sambrook et al., ibid. (see specifically, pages 9.31-9.62). In addition, formulae to calculate the appropriate hybridization and wash conditions to achieve hybridization permitting varying degrees of mismatch of nucleotides are disclosed, for example, in Meinkoth et al., 1984, Anal. Biochem. 138, 267-284; Meinkoth et al., ibid.
More particularly, moderate stringency hybridization and washing conditions, as referred to herein, refer to conditions which permit isolation of nucleic acid molecules having at least about 70% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 30% or less mismatch of nucleotides). High stringency hybridization and washing conditions, as referred to herein, refer to conditions which permit isolation of nucleic acid molecules having at least about 80% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 20% or less mismatch of nucleotides). Very high stringency hybridization and washing conditions, as referred to herein, refer to conditions which permit isolation of nucleic acid molecules having at least about 90% nucleic acid sequence identity with the nucleic acid molecule being used to probe in the hybridization reaction (i.e., conditions permitting about 10% or less mismatch of nucleotides). As discussed above, one of skill in the art can use the formulae in Meinkoth et al., ibid., to calculate the appropriate hybridization and wash conditions to achieve these particular levels of nucleotide mismatch. Such conditions will vary, depending on whether DNA:RNA or DNA:DNA hybrids are being formed. Calculated melting temperatures for DNA:DNA hybrids are 10°C less than for DNA:RNA hybrids. In particular embodiments, stringent hybridization conditions for DNA:DNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 20°C and about 35°C (lower stringency), more preferably, between about 28°C and about 40°C (more stringent), and even more preferably, between about 35°C and about 45°C (even more stringent), with appropriate wash conditions.
In particular embodiments, stringent hybridization conditions for DNA:RNA hybrids include hybridization at an ionic strength of 6X SSC (0.9 M Na+) at a temperature of between about 30°C and about 45°C, more preferably, between about 38°C and about 50°C, and even more preferably, between about 45°C and about 55°C, with similarly stringent wash conditions. These values are based on calculations of a melting temperature for molecules larger than about 100 nucleotides, 0% formamide and a G + C content of about 40%. Alternatively, Tm can be calculated empirically as set forth in Sambrook et al., supra, pages 9.31 to 9.62. In general, the wash conditions should be as stringent as possible, and should be appropriate for the chosen hybridization conditions. For example, hybridization conditions can include a combination of salt and temperature conditions that are approximately 20-25°C below the calculated Tm of a particular hybrid, and wash conditions typically include a combination of salt and temperature conditions that are approximately 12-20°C below the calculated Tm of the particular hybrid. One example of hybridization conditions suitable for use with DNA:DNA hybrids includes a 2-24 hour hybridization in 6X SSC (50% formamide) at about 42°C, followed by washing steps that include one or more washes at room temperature in about 2X SSC, followed by additional washes at higher temperatures and lower ionic strength (e.g., at least one wash at about 37°C in about 0.1X-0.5X SSC, followed by at least one wash at about 68°C in about 0.1X-0.5X SSC).
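The relationship between a calculated Tm and the hybridization and wash windows described above can be sketched in Python. The formula below is the empirical DNA:DNA expression commonly attributed to Meinkoth and Wahl (1984); its coefficients are supplied here from general knowledge as an assumption and are not recited in this specification, so they should be checked against the cited reference before use. The optional mismatch correction of about 1°C per 1% mismatch is likewise a commonly quoted rule of thumb, treated here as an assumption. The function names are hypothetical.

```python
import math


def dna_dna_tm(na_molar: float, gc_percent: float, formamide_percent: float,
               length_nt: int, mismatch_percent: float = 0.0) -> float:
    """Estimated Tm in degrees C for a DNA:DNA hybrid.

    Assumed formula (not taken from this specification):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L,
    lowered by about 1 degree C per 1% mismatch (assumed rule of thumb).
    """
    if na_molar <= 0 or length_nt <= 0:
        raise ValueError("salt concentration and length must be positive")
    tm = (81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent
          - 0.61 * formamide_percent - 500.0 / length_nt)
    return tm - mismatch_percent


def condition_windows(tm: float) -> dict:
    """Temperature windows stated in the text, relative to the calculated Tm."""
    return {
        "hybridization": (tm - 25.0, tm - 20.0),  # approx. 20-25 C below Tm
        "wash": (tm - 20.0, tm - 12.0),           # approx. 12-20 C below Tm
    }


if __name__ == "__main__":
    # 6X SSC (0.9 M Na+), 40% G+C, 0% formamide, 100-nucleotide hybrid.
    tm = dna_dna_tm(0.9, 40.0, 0.0, 100)
    print(round(tm, 2), condition_windows(tm))
```

The example inputs are the reference conditions named in the text (0.9 M Na+, 0% formamide, about 40% G + C, about 100 nucleotides); other salt, formamide, or mismatch values shift the windows accordingly.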
[242] The minimum size of a protein and/or homologue or variant of the present invention is a size sufficient to have biological activity or, when the protein is not required to have such activity, sufficient to be useful for another purpose associated with a protein of the present invention, such as for the production of antibodies that bind to a naturally occurring protein. In one embodiment, the protein of the present invention is at least 20 amino acids in length, or at least about 25 amino acids in length, or at least about 30 amino acids in length, or at least about 40 amino acids in length, or at least about 50 amino acids in length, or at least about 60 amino acids in length, or at least about 70 amino acids in length, or at least about 80 amino acids in length, or at least about 90 amino acids in length, or at least about 100 amino acids in length, or at least about 125 amino acids in length, or at least about 150 amino acids in length, or at least about 175 amino acids in length, or at least about 200 amino acids in length, or at least about 250 amino acids in length, and so on up to a full length of each protein, and including any size in between in increments of one whole integer (one amino acid). There is no limit, other than a practical limit, on the maximum size of such a protein in that the protein can include a portion of a protein or a full-length protein, plus additional sequence (e.g., a fusion protein sequence), if desired.
[243] The present invention also includes a fusion protein that includes a domain of a protein of the present invention (including a homologue or variant) attached to one or more fusion segments, which are typically heterologous in sequence to the protein sequence (i.e., different from the protein sequence). Suitable fusion segments for use with the present invention include, but are not limited to, segments that can: enhance a protein's stability; provide other desirable biological activity; and/or assist with the purification of the protein (e.g., by affinity chromatography). A suitable fusion segment can be a domain of any size that has the desired function (e.g., imparts increased stability, solubility, action or biological activity; and/or simplifies purification of a protein). Fusion segments can be joined to amino and/or carboxyl termini of the domain of a protein of the present invention and can be susceptible to cleavage in order to enable straightforward recovery of the protein. Fusion proteins are preferably produced by culturing a recombinant cell transfected with a fusion nucleic acid molecule that encodes a protein including the fusion segment attached to either the carboxyl and/or amino terminal end of a domain of a protein of the present invention. Accordingly, proteins of the present invention also include expression products of gene fusions (for example, used to overexpress soluble, active forms of the recombinant protein), of mutagenized genes (such as genes having codon modifications to enhance gene transcription and translation), and of truncated genes (such as genes having membrane binding modules removed to generate soluble forms of a membrane protein, or genes having signal sequences removed which are poorly tolerated in a particular recombinant host).
In one embodiment of the present invention, any of the amino acid sequences described herein can be produced with from at least one, and up to about 20, additional heterologous amino acids flanking each of the C- and/or N-terminal ends of the specified amino acid sequence. The resulting protein or polypeptide can be referred to as "consisting essentially of" the specified amino acid sequence. According to the present invention, the heterologous amino acids are a sequence of amino acids that are not naturally found (i.e., not found in nature, in vivo) flanking the specified amino acid sequence, or that are not related to the function of the specified amino acid sequence, or that would not be encoded by the nucleotides that flank the naturally occurring nucleic acid sequence encoding the specified amino acid sequence as it occurs in the gene, if such nucleotides in the naturally occurring sequence were translated using standard codon usage for the organism from which the given amino acid sequence is derived.
[245] The present invention also provides enzyme combinations that break down lignocellulose material. Such enzyme combinations or mixtures can include a multi-enzyme composition that contains at least one protein of the present invention in combination with one or more additional proteins of the present invention or one or more enzymes or other proteins from other microorganisms, plants, or similar organisms. Synergistic enzyme combinations and related methods are contemplated. The invention includes methods to identify the optimum ratios and compositions of enzymes with which to degrade each lignocellulosic material. These methods entail tests to identify the optimum enzyme composition and ratios for efficient conversion of any lignocellulosic substrate to its constituent sugars. The Examples below include assays that may be used to identify optimum ratios and compositions of enzymes with which to degrade lignocellulosic materials.
[246] Any combination of the proteins disclosed herein is suitable for use in the multi-enzyme compositions of the present invention. Due to the complex nature of most biomass sources, which can contain cellulose, hemicellulose, pectin, lignin, protein, and ash, among other components, preferred enzyme combinations may contain enzymes with a range of substrate specificities that work together to degrade biomass into fermentable sugars in the most efficient manner. One example of a multi-enzyme complex for lignocellulose saccharification is a mixture of cellobiohydrolase(s), xylanase(s), endoglucanase(s), β-glucosidase(s), β-xylosidase(s), and accessory enzymes. However, it is to be understood that any of the enzymes described specifically herein can be combined with any one or more of the enzymes described herein or with any other available and suitable enzymes, to produce a multi-enzyme composition. The invention is not restricted or limited to the specific exemplary combinations listed below.
[247] In one embodiment, the cellobiohydrolase(s) comprise between about 30% and about 90% or between about 40% and about 70% of the enzymes in the composition, and more preferably, between about 55% and 65%, and more preferably, about 60% of the enzymes in the composition (including any percentage between 40% and 70% in 0.5% increments (e.g., 40%, 40.5%, 41%, etc.)).
[248] In one embodiment, the xylanase(s) comprise between about 10% and about 30% of the enzymes in the composition, and more preferably, between about 15% and about 25%, and more preferably, about 20% of the enzymes in the composition (including any percentage between 10% and 30% in 0.5% increments).
[249] In one embodiment, the endoglucanase(s) comprise between about 5% and about 15% of the enzymes in the composition, and more preferably, between about 7% and about 13%, and more preferably, about 10% of the enzymes in the composition (including any percentage between 5% and 15% in 0.5% increments).
[250] In one embodiment, the β-glucosidase(s) comprise between about 1% and about 15% of the enzymes in the composition, and preferably between about 2% and 10%, and more preferably, about 3% of the enzymes in the composition (including any percentage between 1% and 15% in 0.5% increments).
[251] In one embodiment, the β-xylosidase(s) comprise between about 1% and about 3% of the enzymes in the composition, and preferably, between about 1.5% and about 2.5%, and more preferably, about 2% of the enzymes in the composition (including any percentage between 1% and 3% in 0.5% increments).
[252] In one embodiment, the accessory enzymes comprise between about 2% and about 8% of the enzymes in the composition, and preferably, between about 3% and about 7%, and more preferably, about 5% of the enzymes in the composition (including any percentage between 2% and 8% in 0.5% increments).
[253] One particularly preferred example of a multi-enzyme complex for lignocellulose saccharification is a mixture of about 60% cellobiohydrolase(s), about 20% xylanase(s), about 10% endoglucanase(s), about 3% β-glucosidase(s), about 2% β-xylosidase(s) and about 5% accessory enzyme(s).
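The percentage ranges given above for each enzyme class can be collected into a simple consistency check. The Python sketch below is illustrative only: it encodes the broadest range recited for each class, treats the "about" limits as hard bounds (a simplification), and confirms that the particularly preferred mixture totals 100% and falls within every range. The dictionary keys and the function name `check_mix` are hypothetical.

```python
# Broadest recited range (percent of enzymes in the composition) per class.
BROAD_RANGES = {
    "cellobiohydrolase": (30.0, 90.0),
    "xylanase": (10.0, 30.0),
    "endoglucanase": (5.0, 15.0),
    "beta-glucosidase": (1.0, 15.0),
    "beta-xylosidase": (1.0, 3.0),
    "accessory": (2.0, 8.0),
}

# The particularly preferred mixture recited in the text.
PREFERRED_MIX = {
    "cellobiohydrolase": 60.0,
    "xylanase": 20.0,
    "endoglucanase": 10.0,
    "beta-glucosidase": 3.0,
    "beta-xylosidase": 2.0,
    "accessory": 5.0,
}


def check_mix(mix: dict, ranges: dict = BROAD_RANGES, tol: float = 1e-6) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    total = sum(mix.values())
    if abs(total - 100.0) > tol:
        problems.append(f"shares total {total:g}%, not 100%")
    for name, (low, high) in ranges.items():
        share = mix.get(name, 0.0)
        if not low <= share <= high:
            problems.append(f"{name} at {share:g}% is outside {low:g}-{high:g}%")
    return problems


if __name__ == "__main__":
    print(check_mix(PREFERRED_MIX))
```

A candidate mixture produced during ratio optimization could be passed through the same check before being tested in a saccharification assay.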
[254] Enzymes and multi-enzyme compositions of the present invention may also be used to break down arabinoxylan or arabinoxylan-containing substrates. Arabinoxylan is a polysaccharide composed of xylose and arabinose, wherein α-L-arabinofuranose residues are attached as branch-points to a β-(1,4)-linked xylose polymeric backbone. The xylose residues may be mono-substituted at the C2 or C3 position, or di-substituted at both positions. Ferulic acid or coumaric acid may also be ester-linked to the C5 position of arabinosyl residues. Further details on the hydrolysis of arabinoxylan can be found in International Publication No. WO 2006/114095.
[255] The substitutions on the xylan backbone can inhibit the enzymatic activity of xylanases, and the complete hydrolysis of arabinoxylan typically requires the action of several different enzymes. One example of a multi-enzyme complex for arabinoxylan hydrolysis is a mixture of endoxylanase(s), β-xylosidase(s), and arabinofuranosidase(s), including those with specificity towards single and double substituted xylose residues. In some embodiments, the multi-enzyme complex may further comprise one or more carbohydrate esterases, such as acetyl xylan esterases, ferulic acid esterases, coumaric acid esterases or pectin methyl esterases. Any combination of two or more of the above-mentioned enzymes is suitable for use in the multi-enzyme complexes. However, it is to be understood that the invention is not restricted or limited to the specific exemplary combinations listed herein.
[256] In one embodiment, the endoxylanase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, or at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g., 5.0%, 5.5%, 6.0%, etc.)). Endoxylanase(s), either alone or as part of a multi-enzyme complex, may be used in amounts of 0.001 to 2.0 g/kg, 0.005 to 1.0 g/kg, or 0.05 to 0.2 g/kg of substrate.
[257] In one embodiment, the β-xylosidase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, or at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g., 5.0%, 5.5%, 6.0%, etc.)). β-Xylosidase(s), either alone or as part of a multi-enzyme complex, may be used in amounts of 0.001 to 2.0 g/kg, 0.005 to 1.0 g/kg, or 0.05 to 0.2 g/kg of substrate.
[258] In one embodiment, the arabinofuranosidase(s) comprise at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 40%, at least about 50%, or at least about 70% of the enzymes in the composition (including any percentage between 5% and 70% in 0.5% increments (e.g., 5.0%, 5.5%, 6.0%, etc.)). The total percentage of arabinofuranosidase(s) present in the composition may include arabinofuranosidase(s) with specificity towards single substituted xylose residues, arabinofuranosidase(s) with specificity towards double substituted xylose residues, or any combination thereof. Arabinofuranosidase(s), either alone or as part of a multi-enzyme complex, may be used in amounts of 0.001 to 2.0 g/kg, 0.005 to 1.0 g/kg, or 0.05 to 0.2 g/kg of substrate.
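The dosing ranges stated above are expressed in grams of enzyme per kilogram of substrate, so the enzyme mass needed for a given batch follows by simple multiplication. The Python sketch below illustrates that arithmetic; the 500 kg batch size and the names `DOSE_RANGES_G_PER_KG` and `enzyme_mass_range_g` are hypothetical.

```python
# Dosing ranges recited in the text, in g enzyme per kg substrate,
# ordered from broadest to narrowest.
DOSE_RANGES_G_PER_KG = [(0.001, 2.0), (0.005, 1.0), (0.05, 0.2)]


def enzyme_mass_range_g(substrate_kg: float, dose_range: tuple) -> tuple:
    """Return (minimum, maximum) grams of enzyme for a batch of substrate."""
    if substrate_kg < 0:
        raise ValueError("substrate mass must be non-negative")
    low, high = dose_range
    return (low * substrate_kg, high * substrate_kg)


if __name__ == "__main__":
    batch_kg = 500.0  # hypothetical batch of arabinoxylan-containing substrate
    for dose in DOSE_RANGES_G_PER_KG:
        low_g, high_g = enzyme_mass_range_g(batch_kg, dose)
        print(f"{dose[0]}-{dose[1]} g/kg -> {low_g:.2f} to {high_g:.2f} g")
```

The same calculation applies to endoxylanase(s), β-xylosidase(s), and arabinofuranosidase(s), since the text recites identical dosing ranges for each.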
[259] One or more components of a multi-enzyme composition (other than proteins of the present invention) can be obtained from or derived from a microbial, plant, or other source or combination thereof, and will contain enzymes capable of degrading lignocellulosic material. Examples of enzymes included in the multi-enzyme compositions of the invention include cellulases, hemicellulases (such as xylanases, including exoxylanases and β-xylosidases; mannanases, including endomannanases, exomannanases, and β-mannosidases), glucuronidases, esterases (including ferulic acid esterase and glucuronyl esterases), lipases, and glucosidases (such as β-glucosidase).
[260] While the multi-enzyme composition may contain many types of enzymes, mixtures comprising enzymes that increase or enhance sugar release from biomass are preferred, including hemicellulases. In one embodiment, the hemicellulase is selected from a xylanase, an arabinofuranosidase, an acetyl xylan esterase, a glucuronidase, an endo-galactanase, a mannanase, an endo-arabinase, an exo-arabinase, an exo-galactanase, a ferulic acid esterase, a galactomannanase, a xyloglucanase, or mixtures of any of these. In particular, the enzymes can include glucoamylase, β-xylosidase and/or β-glucosidase. Also preferred are mixtures comprising enzymes that are capable of degrading cell walls and releasing cellular contents.
[261] The enzymes of the multi-enzyme composition can be provided by a variety of sources. In one embodiment, the enzymes can be produced by growing organisms such as bacteria, algae, fungi, and plants which produce the enzymes naturally or by virtue of being genetically modified to express the enzyme or enzymes. In another embodiment, at least one enzyme of the multi-enzyme composition is a commercially available enzyme.
[262] In some embodiments, the multi-enzyme compositions comprise an accessory enzyme. An accessory enzyme is any additional enzyme capable of hydrolyzing lignocellulose or enhancing or promoting the hydrolysis of lignocellulose, wherein the accessory enzyme is typically provided in addition to a core enzyme or core set of enzymes. An accessory enzyme can have the same or similar function or a different function as an enzyme or enzymes in the core set of enzymes. These enzymes have been described elsewhere herein, and can generally include cellulases, xylanases, ligninases, amylases, or glucuronidases and esterases, such as ferulic acid esterases, glucuronyl esterases and rhamnogalacturonyl esterases, for example. Accessory enzymes can include enzymes that, when contacted with biomass in a reaction, allow for an increase in the activity of enzymes (e.g., hemicellulases) in the multi-enzyme composition. An accessory enzyme or enzyme mix may be composed of enzymes from (1) commercial suppliers; (2) cloned genes expressing enzymes; (3) complex broth (such as that resulting from growth of a microbial strain in media, wherein the strains secrete proteins and enzymes into the media); (4) cell lysates of strains grown as in (3); and (5) plant material expressing enzymes capable of degrading lignocellulose. In some embodiments, the accessory enzyme is a glucoamylase, a pectinase, or a ligninase.
[263] As used herein, a ligninase is an enzyme that can hydrolyze or break down the structure of lignin polymers, including lignin peroxidases, manganese peroxidases, laccases, and other enzymes described in the art known to depolymerize or otherwise break lignin polymers. Also included are enzymes capable of hydrolyzing bonds formed between hemicellulosic sugars (notably arabinose) and lignin.
[264] The multi-enzyme compositions, in some embodiments, comprise a biomass comprising microorganisms or a crude fermentation product of microorganisms. A crude fermentation product refers to the fermentation broth which has been separated from the microorganism biomass (by filtration, for example). In general, the microorganisms are grown in fermentors, optionally centrifuged or filtered to remove biomass, and optionally concentrated, formulated, and dried to produce an enzyme(s) or a multi-enzyme composition that is a crude fermentation product. In other embodiments, enzyme(s) or multi-enzyme compositions produced by the microorganism (including a genetically modified microorganism as described below) are subjected to one or more purification steps, such as ammonium sulfate precipitation, chromatography, and/or ultrafiltration, which result in a partially purified or purified enzyme(s). If the microorganism has been genetically modified to express the enzyme(s), the enzyme(s) will include recombinant enzymes. If the genetically modified microorganism also naturally expresses the enzyme(s) or other enzymes useful for lignocellulosic saccharification, the enzyme(s) may include both naturally occurring and recombinant enzymes.
[265] Another embodiment of the present invention relates to a composition comprising at least about 500 ng, and preferably at least about 1 μg, and more preferably at least about 5 μg, and more preferably at least about 10 μg, and more preferably at least about 25 μg, and more preferably at least about 50 μg, and more preferably at least about 75 μg, and more preferably at least about 100 μg, and more preferably at least about 250 μg, and more preferably at least about 500 μg, and more preferably at least about 750 μg, and more preferably at least about 1 mg, and more preferably at least about 5 mg, of an isolated protein comprising any of the proteins or homologues, variants, or fragments thereof discussed herein. Such a composition of the present invention may include any carrier with which the protein is associated by virtue of the protein preparation method, a protein purification method, or a preparation of the protein for use in any method according to the present invention. For example, such a carrier can include any suitable buffer, extract, or medium that is suitable for combining with the protein of the present invention so that the protein can be used in any method described herein according to the present invention.
[266] In one embodiment of the invention, one or more enzymes of the invention is bound to a solid support, i.e., an immobilized enzyme. As used herein, an immobilized enzyme includes immobilized isolated enzymes, immobilized microbial cells which contain one or more enzymes of the invention, other stabilized intact cells that produce one or more enzymes of the invention, and stabilized cell/membrane homogenates. Stabilized intact cells and stabilized cell/membrane homogenates include cells and homogenates from naturally occurring microorganisms expressing the enzymes of the invention and preferably, from genetically modified microorganisms as disclosed elsewhere herein. Thus, although methods for immobilizing enzymes are discussed below, it will be appreciated that such methods are equally applicable to immobilizing microbial cells and in such an embodiment, the cells can be lysed, if desired.
[267] A variety of methods for immobilizing an enzyme are disclosed in Industrial Enzymology 2nd Ed., Godfrey, T. and West, S. Eds., Stockton Press, New York, N.Y., 1996, pp. 267-272; Immobilized Enzymes, Chibata, I. Ed., Halsted Press, New York, N.Y., 1978; Enzymes and Immobilized Cells in Biotechnology, Laskin, A. Ed., Benjamin/Cummings Publishing Co., Inc., Menlo Park, California, 1985; and Applied Biochemistry and Bioengineering, Vol. 4, Chibata, I. and Wingard, Jr., L. Eds, Academic Press, New York, N.Y., 1983.
[268] Entrapment can also be used to immobilize an enzyme. Entrapment of an enzyme involves formation of, inter alia, gels (using organic or biological polymers), vesicles (including microencapsulation), semipermeable membranes or other matrices. Exemplary materials used for entrapment of an enzyme include collagen, gelatin, agar, cellulose triacetate, alginate, polyacrylamide, polystyrene, polyurethane, epoxy resins, carrageenan, and egg albumin. Some of the polymers, in particular cellulose triacetate, can be used to entrap the enzyme as they are spun into a fiber. Other materials such as polyacrylamide gels can be polymerized in solution to entrap the enzyme. Still other materials such as polyglycol oligomers that are functionalized with polymerizable vinyl end groups can entrap enzymes by forming a cross-linked polymer with UV light illumination in the presence of a photosensitizer.
[269] Further embodiments of the present invention include nucleic acid molecules that encode a protein of the present invention, as well as homologues, variants, or fragments of such nucleic acid molecules. A nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence encoding any of the isolated proteins disclosed herein, including a fragment or a homologue or variant of such proteins, described above. Nucleic acid molecules can include a nucleic acid sequence that encodes a fragment of a protein that does not have biological activity, and can also include portions of a gene or polynucleotide encoding the protein that are not part of the coding region for the protein (e.g., introns or regulatory regions of a gene encoding the protein). Nucleic acid molecules can include a nucleic acid sequence that is useful as a probe or primer (oligonucleotide sequences).
[270] In one embodiment, a nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence represented in Tables 1 and 2 or fragments or homologues or variants thereof. Preferably, the nucleic acid sequence encodes a protein (including fragments and homologues or variants thereof) useful in the invention, or encompasses useful oligonucleotides or complementary nucleic acid sequences.
[271] In one embodiment, a nucleic acid molecule of the present invention includes a nucleic acid molecule comprising, consisting essentially of, or consisting of, a nucleic acid sequence encoding an amino acid sequence represented in Tables 1 and 2 or fragments or homologues or variants thereof. Preferably, the nucleic acid sequence encodes a protein (including fragments and homologues or variants thereof) useful in the invention, or encompasses useful oligonucleotides or complementary nucleic acid sequences.
[272] In one embodiment, such nucleic acid molecules include isolated nucleic acid molecules that hybridize under moderate stringency conditions, and more preferably under high stringency conditions, and even more preferably under very high stringency conditions, as described above, with the complement of a nucleic acid sequence encoding a protein of the present invention (i.e., including naturally occurring allelic variants encoding a protein of the present invention). Preferably, an isolated nucleic acid molecule encoding a protein of the present invention comprises a nucleic acid sequence that hybridizes under moderate, high, or very high stringency conditions to the complement of a nucleic acid sequence that encodes a protein comprising an amino acid sequence represented in Tables 1 and 2.
[273] In accordance with the present invention, an isolated nucleic acid molecule is a nucleic acid molecule (polynucleotide) that has been removed from its natural milieu (i.e., that has been subject to human manipulation) and can include DNA, RNA, or derivatives of either DNA or RNA, including cDNA. As such, "isolated" does not reflect the extent to which the nucleic acid molecule has been purified. Although the phrase "nucleic acid molecule" primarily refers to the physical nucleic acid molecule, and the phrase "nucleic acid sequence" primarily refers to the sequence of nucleotides on the nucleic acid molecule, the two phrases can be used interchangeably, especially with respect to a nucleic acid molecule, or a nucleic acid sequence, being capable of encoding a protein. An isolated nucleic acid molecule of the present invention can be isolated from its natural source or produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning) or chemical synthesis. Isolated nucleic acid molecules can include, for example, genes, natural allelic variants of genes, coding regions or portions thereof, and coding and/or regulatory regions modified by nucleotide insertions, deletions, substitutions, and/or inversions in a manner such that the modifications do not substantially interfere with the nucleic acid molecule's ability to encode a protein of the present invention or to form stable hybrids under stringent conditions with natural gene isolates. An isolated nucleic acid molecule can include degeneracies. As used herein, nucleotide degeneracy refers to the phenomenon that one amino acid can be encoded by different nucleotide codons. Thus, the nucleic acid sequence of a nucleic acid molecule that encodes a protein of the present invention can vary due to degeneracies. It is noted that a nucleic acid molecule of the present invention is not required to encode a protein having protein activity. A nucleic acid molecule can encode a truncated, mutated or inactive protein, for example. In addition, nucleic acid molecules of the invention are useful as probes and primers for the identification, isolation and/or purification of other nucleic acid molecules. If the nucleic acid molecule is an oligonucleotide, such as a probe or primer, the oligonucleotide preferably ranges from about 5 to about 50 or about 500 nucleotides, more preferably from about 10 to about 40 nucleotides, and most preferably from about 15 to about 40 nucleotides in length.
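The notion of nucleotide degeneracy described above can be illustrated with a minimal Python sketch. The codon table below is a small excerpt of the standard genetic code, and the two example sequences (`seq_a`, `seq_b`) and the function name `translate` are hypothetical illustrations, not sequences or methods taken from this disclosure.

```python
# Partial excerpt of the standard genetic code (DNA codons -> one-letter
# amino acid codes); "*" marks a stop codon. Only the codons needed for
# this illustration are listed.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "TTA": "L", "TTG": "L", "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    peptide = []
    for i in range(0, len(dna), 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Two hypothetical coding sequences that differ at several nucleotide
# positions yet encode the same peptide because of codon degeneracy.
seq_a = "ATGGCTAAATTATAA"
seq_b = "ATGGCGAAGCTGTGA"

print(translate(seq_a), translate(seq_b))
```

Both sequences translate to the same short peptide even though their nucleotide sequences differ, which is why a nucleic acid sequence encoding a given protein can vary.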
[274] According to the present invention, reference to a gene includes all nucleic acid sequences related to a natural (i.e., wild-type) gene, such as regulatory regions that control production of the protein encoded by that gene (such as, but not limited to, transcription, translation or post-translation control regions) as well as the coding region itself. In another embodiment, a gene can be a naturally occurring allelic variant that includes a similar but not identical sequence to the nucleic acid sequence encoding a given protein. Allelic variants have been previously described above. Genes can include or exclude one or more introns or any portions thereof or any other sequences which are not included in the cDNA for that protein. The phrases "nucleic acid molecule" and "gene" can be used interchangeably when the nucleic acid molecule comprises a gene as described above.
[275] Preferably, an isolated nucleic acid molecule of the present invention is produced using recombinant DNA technology (e.g., polymerase chain reaction (PCR) amplification, cloning, etc.) or chemical synthesis. Isolated nucleic acid molecules include any nucleic acid molecules and homologues or variants thereof that are part of a gene described herein and/or that encode a protein described herein, including, but not limited to, natural allelic variants and modified nucleic acid molecules (homologues or variants) in which nucleotides have been inserted, deleted, substituted, and/or inverted in such a manner that such modifications provide the desired effect on protein biological activity or on the activity of the nucleic acid molecule. Allelic variants and protein homologues or variants (e.g., proteins encoded by nucleic acid homologues or variants) have been discussed in detail above.
[276] A nucleic acid molecule homologue or variant (i.e., encoding a homologue or variant of a protein of the present invention) can be produced using a number of methods known to those skilled in the art (see, for example, Sambrook et al.). For example, nucleic acid molecules can be modified using a variety of techniques including, but not limited to, classic mutagenesis and recombinant DNA techniques (e.g., site-directed mutagenesis, chemical treatment, restriction enzyme cleavage, ligation of nucleic acid fragments and/or PCR amplification), or synthesis of oligonucleotide mixtures and ligation of mixture groups to "build" a mixture of nucleic acid molecules and combinations thereof. Another method for modifying a recombinant nucleic acid molecule encoding a protein is gene shuffling (i.e., molecular breeding) (see, for example, U.S. Patent No. 5,605,793 to Stemmer; Minshull and Stemmer, 1999, Curr. Opin. Chem. Biol. 3:284-290; Stemmer, 1994, P.N.A.S. USA 91:10747-10751). This technique can be used to efficiently introduce multiple simultaneous changes in the protein. Nucleic acid molecule homologues or variants can be selected by hybridization with a gene or polynucleotide, or by screening for the function of a protein encoded by a nucleic acid molecule (i.e., biological activity).
[277] The minimum size of a nucleic acid molecule of the present invention is a size sufficient to encode a protein (including a fragment, homologue, or variant of a full-length protein) having biological activity, sufficient to encode a protein comprising at least one epitope which binds to an antibody, or sufficient to form a probe or oligonucleotide primer that is capable of forming a stable hybrid with the complementary sequence of a nucleic acid molecule encoding a natural protein (e.g., under moderate, high, or very high stringency conditions). As such, the size of the nucleic acid molecule encoding such a protein can be dependent on nucleic acid composition and percent homology or identity between the nucleic acid molecule and complementary sequence as well as upon hybridization conditions per se (e.g., temperature, salt concentration, and formamide concentration). The minimal size of a nucleic acid molecule that is used as an oligonucleotide primer or as a probe is typically at least about 12 to about 15 nucleotides in length if the nucleic acid molecules are GC-rich and at least about 15 to about 18 bases in length if they are AT-rich. There is no limit, other than a practical limit, on the maximal size of a nucleic acid molecule of the present invention, in that the nucleic acid molecule can include a portion of a protein encoding sequence, a nucleic acid sequence encoding a full-length protein (including a gene), including any length fragment between about 20 nucleotides and the number of nucleotides that make up the full length cDNA encoding a protein, in whole integers (e.g., 20, 21, 22, 23, 24, 25 nucleotides), or multiple genes, or portions thereof.
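The dependence of minimal probe or primer length on base composition can be sketched with a rough rule of thumb. The example below assumes the Wallace rule for short oligonucleotides (about 2 °C per A/T and 4 °C per G/C), which is a common approximation and is not a method stated in this disclosure; the function names and example oligonucleotides are hypothetical.

```python
def gc_fraction(oligo: str) -> float:
    """Return the fraction of G and C bases in an oligonucleotide."""
    oligo = oligo.upper()
    if not oligo or set(oligo) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def wallace_tm(oligo: str) -> int:
    """Rough melting temperature in degrees C using the Wallace rule.

    Assumption: Tm ~ 2*(A+T) + 4*(G+C); only a coarse estimate for
    short oligonucleotides, ignoring salt and formamide concentration.
    """
    oligo = oligo.upper()
    if not oligo or set(oligo) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


# Hypothetical oligonucleotides: a GC-rich 12-mer and an AT-rich 18-mer.
gc_rich_12mer = "GCGCCGGCGCCG"
at_rich_18mer = "ATATTAATATTAATATAT"

for name, oligo in (("GC-rich", gc_rich_12mer), ("AT-rich", at_rich_18mer)):
    print(name, len(oligo), gc_fraction(oligo), wallace_tm(oligo))
```

Under this approximation the shorter GC-rich oligonucleotide still has the higher estimated melting temperature, consistent with GC-rich probes tolerating a shorter minimal length than AT-rich ones.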
[278] The phrase "consisting essentially of", when used with reference to a nucleic acid sequence herein, refers to a nucleic acid sequence encoding a specified amino acid sequence that can be flanked by from at least one, and up to as many as about 60, additional heterologous nucleotides at each of the 5' and/or the 3' end of the nucleic acid sequence encoding the specified amino acid sequence. The heterologous nucleotides are not naturally found (i.e., not found in nature, in vivo) flanking the nucleic acid sequence encoding the specified amino acid sequence as it occurs in the natural gene or do not encode a protein that imparts any additional function to the protein or changes the function of the protein having the specified amino acid sequence.
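The flanking-length limit in this definition can be expressed as a simple check. The sketch below is illustrative only: it treats "about 60" as exactly 60 (an assumption), uses hypothetical function names and sequences, and tests only the number of flanking nucleotides, not whether they are heterologous or functionally neutral.

```python
from typing import Tuple

MAX_FLANK = 60  # assumption: "about 60" treated as an exact limit


def flank_lengths(construct: str, core: str) -> Tuple[int, int]:
    """Return (5' flank length, 3' flank length) of core within construct."""
    start = construct.find(core)
    if start < 0:
        raise ValueError("core sequence not found in construct")
    return start, len(construct) - start - len(core)


def within_flank_limit(construct: str, core: str,
                       max_flank: int = MAX_FLANK) -> bool:
    """True if neither flank exceeds max_flank nucleotides."""
    five_prime, three_prime = flank_lengths(construct, core)
    return five_prime <= max_flank and three_prime <= max_flank


# Hypothetical coding sequence and two hypothetical constructs.
core = "ATGGCTAAATAA"
short_flanks = "GG" + core + "CCCCC"   # 2 nt upstream, 5 nt downstream
long_flank = "A" * 61 + core           # 61 nt upstream, none downstream

print(flank_lengths(short_flanks, core), within_flank_limit(short_flanks, core))
print(flank_lengths(long_flank, core), within_flank_limit(long_flank, core))
```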
[279] In one embodiment, the polynucleotide probes or primers of the invention are conjugated to detectable markers. Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas Red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., 3H, I, 35S, C, or P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Preferably, the polynucleotide probes are immobilized on a substrate such as: artificial membranes, organic supports, biopolymer supports and inorganic supports.
[280] One embodiment of the present invention relates to a recombinant nucleic acid molecule which comprises the isolated nucleic acid molecule described above which is operatively linked to at least one expression control sequence. More particularly, according to the present invention, a recombinant nucleic acid molecule typically comprises a recombinant vector and any one or more of the isolated nucleic acid molecules as described herein. According to the present invention, a recombinant vector is an engineered (i.e., artificially produced) nucleic acid molecule that is used as a tool for manipulating a nucleic acid sequence of choice and/or for introducing such a nucleic acid sequence into a host cell. The recombinant vector is therefore suitable for use in cloning, sequencing, and/or otherwise manipulating the nucleic acid sequence of choice, such as by expressing and/or delivering the nucleic acid sequence of choice into a host cell to form a recombinant cell. Such a vector typically contains nucleic acid sequences that are not naturally found adjacent to the nucleic acid sequence to be cloned or delivered, although the vector can also contain regulatory nucleic acid sequences (e.g., promoters, untranslated regions) which are naturally found adjacent to nucleic acid sequences of the present invention or which are useful for expression of the nucleic acid molecules of the present invention (discussed in detail below). The vector can be either RNA or DNA, either prokaryotic or eukaryotic, and typically is a plasmid. The vector can be maintained as an extrachromosomal element (e.g., a plasmid) or it can be integrated into the chromosome of a recombinant host cell, although it is preferred if the vector remains separate from the genome for most applications of the invention. The entire vector can remain in place within a host cell, or under certain conditions, the plasmid DNA can be deleted, leaving behind the nucleic acid molecule of the present invention. An integrated nucleic acid molecule can be under chromosomal promoter control, under native or plasmid promoter control, or under a combination of several promoter controls. Single or multiple copies of the nucleic acid molecule can be integrated into the chromosome. A recombinant vector of the present invention can contain at least one selectable marker.
[281] In one embodiment, a recombinant vector used in a recombinant nucleic acid molecule of the present invention is an expression vector. As used herein, the phrase "expression vector" is used to refer to a vector that is suitable for production of an encoded product (e.g., a protein of interest, such as an enzyme of the present invention). In this embodiment, a nucleic acid sequence encoding the product to be produced (e.g., the protein or homologue or variant thereof) is inserted into the recombinant vector to produce a recombinant nucleic acid molecule. The nucleic acid sequence encoding the protein to be produced is inserted into the vector in a manner that operatively links the nucleic acid sequence to regulatory sequences in the vector which enable the transcription and translation of the nucleic acid sequence within the recombinant host cell.
[282] Typically, a recombinant nucleic acid molecule includes at least one nucleic acid molecule of the present invention operatively linked to one or more expression control sequences (e.g., transcription control sequences or translation control sequences). As used herein, the phrase "recombinant molecule" or "recombinant nucleic acid molecule" primarily refers to a nucleic acid molecule or nucleic acid sequence operatively linked to a transcription control sequence, but can be used interchangeably with the phrase "nucleic acid molecule", when such nucleic acid molecule is a recombinant molecule as discussed herein. According to the present invention, the phrase "operatively linked" refers to linking a nucleic acid molecule to an expression control sequence in a manner such that the molecule is able to be expressed when transfected (i.e., transformed, transduced, transfected, conjugated or conduced) into a host cell. Transcription control sequences are sequences which control the initiation, elongation, or termination of transcription. Particularly important transcription control sequences are those which control transcription initiation, such as promoter, enhancer, operator and repressor sequences. Suitable transcription control sequences include any transcription control sequence that can function in a host cell or organism into which the recombinant nucleic acid molecule is to be introduced. Transcription control sequences may also include any combination of one or more of any of the foregoing.
[283] Recombinant nucleic acid molecules of the present invention can also contain additional regulatory sequences, such as translation regulatory sequences, origins of replication, and other regulatory sequences that are compatible with the recombinant cell. In one embodiment, a recombinant molecule of the present invention, including those which are integrated into the host cell chromosome, also contains secretory signals (i.e., signal segment nucleic acid sequences) to enable an expressed protein to be secreted from the cell that produces the protein. Suitable signal segments include a signal segment that is naturally associated with the protein to be expressed or any heterologous signal segment capable of directing the secretion of the protein according to the present invention. In another embodiment, a recombinant molecule of the present invention comprises a leader sequence to enable an expressed protein to be delivered to and inserted into the membrane of a host cell. Suitable leader sequences include a leader sequence that is naturally associated with the protein, or any heterologous leader sequence capable of directing the delivery and insertion of the protein to the membrane of a cell.
[284] According to the present invention, the term "transfection" is generally used to refer to any method by which an exogenous nucleic acid molecule (i.e., a recombinant nucleic acid molecule) can be inserted into a cell. The term "transformation" can be used interchangeably with the term "transfection" when such term is used to refer to the introduction of nucleic acid molecules into microbial cells or plants and describes an inherited change due to the acquisition of exogenous nucleic acids by the microorganism that is essentially synonymous with the term "transfection." Transfection techniques include, but are not limited to, transformation, particle bombardment, electroporation, microinjection, lipofection, adsorption, infection and protoplast fusion.
[285] One or more recombinant molecules of the present invention can be used to produce an encoded product (e.g., a protein) of the present invention. In one embodiment, an encoded product is produced by expressing a nucleic acid molecule as described herein under conditions effective to produce the protein. A preferred method to produce an encoded protein is by transfecting a host cell with one or more recombinant molecules to form a recombinant cell. Suitable host cells to transfect include, but are not limited to, any bacterial, fungal (e.g., filamentous fungi or yeast or mushrooms), algal, plant, insect, or animal cell that can be transfected. Host cells can be either untransfected cells or cells that are already transfected with at least one other recombinant nucleic acid molecule.
[286] Suitable cells (e.g., a host cell or production organism) may include any microorganism (e.g., a bacterium, a protist, an alga, a fungus, or other microbe), and is preferably a bacterium, a yeast or a filamentous fungus. Suitable bacterial genera include, but are not limited to, Escherichia, Bacillus, Lactobacillus, Pseudomonas and Streptomyces. Suitable bacterial species include, but are not limited to, Escherichia coli, Bacillus subtilis, Bacillus licheniformis, Bacillus stearothermophilus, Lactobacillus brevis, Pseudomonas aeruginosa and Streptomyces lividans. Suitable genera of yeast include, but are not limited to, Saccharomyces, Schizosaccharomyces, Candida, Hansenula, Pichia, Kluyveromyces, and Phaffia. Suitable yeast species include, but are not limited to, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida albicans, Hansenula polymorpha, Pichia pastoris, P. canadensis, Kluyveromyces marxianus and Phaffia rhodozyma.
[287] Suitable fungal genera include, but are not limited to, Chrysosporium, Thielavia, Talaromyces, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium, Humicola, and Trichoderma, and anamorphs and teleomorphs thereof. Suitable fungal species include, but are not limited to, Aspergillus niger, Aspergillus oryzae, Aspergillus nidulans, Aspergillus japonicus, Absidia coerulea, Rhizopus oryzae, Chrysosporium lucknowense, Myceliophthora thermophila, Neurospora crassa, Neurospora intermedia, Trichoderma reesei, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, Talaromyces emersonii and Talaromyces flavus. In one embodiment, the host cell is a fungal cell of the species Myceliophthora thermophila. In another embodiment, a white (low cellulose) strain is used. In one embodiment, the host cell is a fungal cell of Strain C1 (VKM F-3500-D) or a mutant strain derived therefrom (e.g., UV13-6 (Accession No. VKM F-3632 D); NG7C-19 (Accession No. VKM F-3633 D); UV18-25 (VKM F-3631D), W1L (CBS122189), or W1L#100L (CBS122190)). Host cells can be either untransfected cells or cells that are already transfected with at least one other recombinant nucleic acid molecule. Additional embodiments of the present invention include any of the genetically modified cells described herein.
[288] In another embodiment, suitable host cells include insect cells (most particularly Drosophila melanogaster cells, Spodoptera frugiperda Sf9 and Sf21 cells and Trichoplusia High-Five cells), nematode cells (particularly C. elegans cells), avian cells, amphibian cells (particularly Xenopus laevis cells), reptilian cells, and mammalian cells (most particularly human, simian, canine, rodent, bovine, or sheep cells, e.g., NIH3T3, CHO (Chinese hamster ovary cell), COS, VERO, BHK, HEK, and other rodent or human cells).
[289] In one embodiment, one or more protein(s) expressed by an isolated nucleic acid molecule of the present invention are produced by culturing a cell that expresses the protein (i.e., a recombinant cell or recombinant host cell) under conditions effective to produce the protein. In some instances, the protein may be recovered, and in others, the cell may be harvested in whole, either of which can be used in a composition.
[290] Microorganisms used in the present invention (including recombinant host cells or genetically modified microorganisms) are cultured in an appropriate fermentation medium. An appropriate, or effective, fermentation medium refers to any medium in which a cell of the present invention, including a genetically modified microorganism (described below), when cultured, is capable of expressing enzymes useful in the present invention and/or of catalyzing the production of sugars from lignocellulosic biomass. Such a medium is typically an aqueous medium comprising assimilable carbon, nitrogen and phosphate sources. Such a medium can also include appropriate salts, minerals, metals and other nutrients. Microorganisms and other cells of the present invention can be cultured in conventional fermentation bioreactors. The microorganisms can be cultured by any fermentation process which includes, but is not limited to, batch, fed-batch, cell recycle, and continuous fermentation. The fermentation of microorganisms such as fungi may be carried out in any appropriate reactor, using methods known to those skilled in the art. For example, the fermentation may be carried out for a period of 1 to 14 days, or more preferably between about 3 and 10 days. The temperature of the medium is typically maintained between about 25 and 50°C, and more preferably between 28 and 40°C. The pH of the fermentation medium is regulated to a pH suitable for growth and protein production of the particular organism. The fermentor can be aerated in order to supply the oxygen necessary for fermentation and to avoid the excessive accumulation of carbon dioxide produced by fermentation. In addition, the aeration helps to control the temperature and the moisture of the culture medium. In general, the fungal strains are grown in fermentors, optionally centrifuged or filtered to remove biomass, and optionally concentrated, formulated, and dried to produce an enzyme(s) or a multi-enzyme composition that is a crude fermentation product. Particularly suitable conditions for culturing filamentous fungi are described, for example, in U.S. Patent No. 6,015,707 and U.S. Patent No. 6,573,086, supra.
[291] Depending on the vector and host system used for production, resultant proteins of the present invention may either remain within the recombinant cell; be secreted into the culture medium; be secreted into a space between two cellular membranes; or be retained on the outer surface of a cell membrane. The phrase "recovering the protein" refers to collecting the whole culture medium containing the protein and need not imply additional steps of separation or purification. Proteins produced according to the present invention can be purified using a variety of standard protein purification techniques, such as, but not limited to, affinity chromatography, ion exchange chromatography, filtration, electrophoresis, hydrophobic interaction chromatography, gel filtration chromatography, reverse phase chromatography, concanavalin A chromatography, chromatofocusing and differential precipitation or solubilization.
[292] Proteins of the present invention are preferably retrieved, obtained, and/or used in "substantially pure" form. As used herein, "substantially pure" refers to a purity that allows for the effective use of the protein in any method according to the present invention. For a protein to be useful in any of the methods described herein or in any method utilizing enzymes of the types described herein according to the present invention, it is substantially free of contaminants, other proteins and/or chemicals that might interfere or that would interfere with its use in a method disclosed by the present invention (e.g., that might interfere with enzyme activity), or that at least would be undesirable for inclusion with a protein of the present invention (including homologues and variants) when it is used in a method disclosed by the present invention (described in detail below). Preferably, a "substantially pure" protein, as referenced herein, is a protein that can be produced by any method (i.e., by direct purification from a natural source, recombinantly, or synthetically), and that has been purified from other protein components such that the protein comprises at least about 80% weight/weight of the total protein in a given composition (e.g., the protein of interest is about 80% of the protein in a solution/composition/buffer), and more preferably, at least about 85%, and more preferably at least about 90%, and more preferably at least about 91%, and more preferably at least about 92%, and more preferably at least about 93%, and more preferably at least about 94%, and more preferably at least about 95%, and more preferably at least about 96%, and more preferably at least about 97%, and more preferably at least about 98%, and more preferably at least about 99%, weight/weight of the total protein in a given composition.
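The weight/weight purity criterion above reduces to a simple ratio. The following minimal sketch assumes that the mass of the protein of interest and the total protein mass have already been measured by some assay; the function names, the example masses, and the treatment of "about 80%" as an exact threshold are all illustrative assumptions.

```python
def percent_of_total_protein(target_mg: float, total_mg: float) -> float:
    """Weight/weight percentage of the protein of interest in total protein."""
    if total_mg <= 0:
        raise ValueError("total protein mass must be positive")
    if target_mg < 0 or target_mg > total_mg:
        raise ValueError("target mass must lie between 0 and total mass")
    return 100.0 * target_mg / total_mg


def is_substantially_pure(target_mg: float, total_mg: float,
                          threshold_percent: float = 80.0) -> bool:
    """True if the protein of interest meets the w/w threshold.

    Assumption: "at least about 80%" is treated as an exact 80.0 cutoff;
    stricter tiers (85, 90, ... 99) can be passed as threshold_percent.
    """
    return percent_of_total_protein(target_mg, total_mg) >= threshold_percent


# Hypothetical preparations: 8.5 mg and 7.5 mg of target in 10 mg total protein.
print(percent_of_total_protein(8.5, 10.0), is_substantially_pure(8.5, 10.0))
print(percent_of_total_protein(7.5, 10.0), is_substantially_pure(7.5, 10.0))
```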
[293] It will be appreciated by one skilled in the art that use of recombinant DNA technologies can improve control of expression of transfected nucleic acid molecules by manipulating, for example, the number of copies of the nucleic acid molecules within the host cell, the efficiency with which those nucleic acid molecules are transcribed, the efficiency with which the resultant transcripts are translated, and the efficiency of post-translational modifications. Additionally, the promoter sequence might be genetically engineered to improve the level of expression as compared to the native promoter. Recombinant techniques useful for controlling the expression of nucleic acid molecules include, but are not limited to, integration of the nucleic acid molecules into one or more host cell chromosomes, addition of vector stability sequences to plasmids, substitutions or modifications of transcription control signals (e.g., promoters, operators, enhancers), substitutions or modifications of translational control signals (e.g., ribosome binding sites), modification of nucleic acid molecules to correspond to the codon usage of the host cell, and deletion of sequences that destabilize transcripts.
[294] Another aspect of the present invention relates to a genetically modified microorganism that has been transfected with one or more nucleic acid molecules of the present invention. As used herein, a genetically modified microorganism can include a genetically modified bacterium, alga, yeast, filamentous fungus, or other microbe. Such a genetically modified microorganism has a genome which is modified (i.e., mutated or changed) from its normal (i.e., wild-type or naturally occurring) form such that the desired result is achieved (i.e., increased or modified activity and/or production of at least one enzyme or a multi-enzyme composition for the conversion of lignocellulosic material to fermentable sugars). Genetic modification of a microorganism can be accomplished using classical strain development and/or molecular genetic techniques. Such techniques are known in the art and are generally disclosed for microorganisms, for example, in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Labs Press or Molecular Cloning: A Laboratory Manual, third edition (Sambrook and Russell, 2001), (jointly referred to herein as "Sambrook"). A genetically modified microorganism can include a microorganism in which nucleic acid molecules have been inserted, deleted or modified (i.e., mutated; e.g., by insertion, deletion, substitution, and/or inversion of nucleotides), in such a manner that such modifications provide the desired effect within the microorganism.
[295] In one embodiment, a genetically modified microorganism can endogenously contain and express an enzyme or a multi-enzyme composition for the conversion of lignocellulosic material to fermentable sugars, and the genetic modification can be a genetic modification of one or more of such endogenous enzymes, whereby the modification has some effect on the ability of the microorganism to convert lignocellulosic material to fermentable sugars (e.g., increased expression of the protein by introduction of promoters or other expression control sequences, or modification of the coding region by homologous recombination to increase the activity of the encoded protein).
[296] In another embodiment, a genetically modified microorganism can endogenously contain and express an enzyme for the conversion of lignocellulosic material to fermentable sugars, and the genetic modification can be an introduction of at least one exogenous nucleic acid sequence (e.g., a recombinant nucleic acid molecule), wherein the exogenous nucleic acid sequence encodes at least one additional enzyme useful for the conversion of lignocellulosic material to fermentable sugars and/or a protein that improves the efficiency of the enzyme for the conversion of lignocellulosic material to fermentable sugars. In this aspect of the invention, the microorganism can also have at least one modification to a gene or genes comprising its endogenous enzyme(s) for the conversion of lignocellulosic material to fermentable sugars.
[297] In yet another embodiment, the genetically modified microorganism does not necessarily endogenously (naturally) contain an enzyme for the conversion of lignocellulosic material to fermentable sugars, but is genetically modified to introduce at least one recombinant nucleic acid molecule encoding at least one enzyme or a multiplicity of enzymes for the conversion of lignocellulosic material to fermentable sugars. Such a microorganism can be used in a method of the invention, or as a production microorganism for crude fermentation products, partially purified recombinant enzymes, and/or purified recombinant enzymes, any of which can then be used in a method of the present invention.
[298] Once the proteins (enzymes) are expressed in a host cell, a cell extract that contains the activity to test can be generated. For example, a lysate from the host cell is produced, and the supernatant containing the activity is harvested and/or the activity can be isolated from the lysate. In the case of cells that secrete enzymes into the culture medium, the culture medium containing them can be harvested, and/or the activity can be purified from the culture medium. The extracts/activities prepared in this way can be tested using assays known in the art. Accordingly, methods to identify multi-enzyme compositions capable of degrading lignocellulosic biomass are provided.
[299] Artificial substrates, or complex mixtures of polymeric carbohydrates and lignin, or actual lignocellulose can be used in such tests. One assay that may be used to measure the release of sugars and oligosaccharides from these complex substrates is the dinitrosalicylic acid (DNS) assay. In this assay, the lignocellulosic material such as DDG is incubated with enzyme(s) for various times and reducing sugars are measured.
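As an illustration only, reducing-sugar readings from a DNS-type assay are commonly converted to concentrations with a linear standard curve. The sketch below assumes glucose standards and a colorimetric absorbance reading; the standard concentrations, absorbance values, and function names are hypothetical and are not taken from this specification.

```python
def fit_standard_curve(concentrations, absorbances):
    """Least-squares line: absorbance = slope * concentration + intercept."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_a = sum(absorbances) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (a - mean_a)
              for c, a in zip(concentrations, absorbances))
    slope = sxy / sxx
    intercept = mean_a - slope * mean_c
    return slope, intercept


def reducing_sugar_conc(absorbance, slope, intercept):
    """Invert the standard curve to estimate reducing sugar (same units as the standards)."""
    return (absorbance - intercept) / slope


# Hypothetical glucose standards (mg/mL) and their absorbance readings.
STANDARDS_MG_PER_ML = [0.0, 0.5, 1.0, 1.5, 2.0]
STANDARD_ABSORBANCES = [0.05, 0.25, 0.45, 0.65, 0.85]

if __name__ == "__main__":
    slope, intercept = fit_standard_curve(STANDARDS_MG_PER_ML, STANDARD_ABSORBANCES)
    # Hypothetical readings from a time course of enzyme-treated substrate.
    for hours, reading in [(0, 0.06), (4, 0.30), (24, 0.62)]:
        conc = reducing_sugar_conc(reading, slope, intercept)
        print(f"{hours:>2} h: {conc:.2f} mg/mL reducing sugar (glucose equivalents)")
```

A substrate-only blank and an enzyme-only blank would normally be subtracted before inversion; that step is omitted here for brevity.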
[300] The present invention is not limited to fungi and also contemplates genetically modified organisms such as algae, bacteria, and plants transformed with one or more nucleic acid molecules of the invention. The plants may be used for production of the enzymes, and/or as the lignocellulosic material used as a substrate in the methods of the invention. Methods to generate recombinant plants are known in the art. For instance, numerous methods for plant transformation have been developed, including biological and physical transformation protocols. See, for example, Miki et al., "Procedures for Introducing Foreign DNA into Plants" in Methods in Plant Molecular Biology and Biotechnology, Glick, B.R. and Thompson, J.E. Eds. (CRC Press, Inc., Boca Raton, 1993) pp. 67-88. In addition, vectors and in vitro culture methods for plant cell or tissue transformation and regeneration of plants are available. See, for example, Gruber et al., "Vectors for Plant Transformation" in Methods in Plant Molecular Biology and Biotechnology, Glick, B.R. and Thompson, J.E. Eds. (CRC Press, Inc., Boca Raton, 1993) pp. 89-119.
[301] The most widely utilized method for introducing an expression vector into plants is based on the natural transformation system of Agrobacterium. See, for example, Horsch et al., Science 227:1229 (1985). A. tumefaciens and A. rhizogenes are plant pathogenic soil bacteria which genetically transform plant cells. The Ti and Ri plasmids of A. tumefaciens and A. rhizogenes, respectively, carry genes responsible for genetic transformation of the plant. See, for example, Kado, C.I., Crit. Rev. Plant. Sci. 10:1 (1991). Descriptions of Agrobacterium vector systems and methods for Agrobacterium-mediated gene transfer are provided by numerous references, including Gruber et al., supra, Miki et al., supra, Moloney et al., Plant Cell Reports 8:238 (1989), and U.S. Patents Nos. 4,940,838 and 5,464,763.
[302] Another generally applicable method of plant transformation is microprojectile-mediated transformation wherein DNA is carried on the surface of microprojectiles. The expression vector is introduced into plant tissues with a biolistic device that accelerates the microprojectiles to speeds sufficient to penetrate plant cell walls and membranes. Sanford et al., Part. Sci. Technol. 5:27 (1987), Sanford, J.C., Trends Biotech. 6:299 (1988), Sanford, J.C., Physiol. Plant 79:206 (1990), Klein et al., Biotechnology 10:268 (1992).
[303] Another method for physical delivery of DNA to plants is sonication of target cells. Zhang et al., Bio/Technology 9:996 (1991). Alternatively, liposome or spheroplast fusion has been used to introduce expression vectors into plants. Deshayes et al., EMBO J., 4:2731 (1985), Christou et al., Proc Natl. Acad. Sci. USA 84:3962 (1987). Direct uptake of DNA into protoplasts using CaCl2 precipitation, polyvinyl alcohol or poly-L-ornithine has also been reported. Hain et al., Mol. Gen. Genet. 199:161 (1985) and Draper et al., Plant Cell Physiol. 23:451 (1982). Electroporation of protoplasts and whole cells and tissues has also been described. Donn et al., In Abstracts of VIIth International Congress on Plant Cell and Tissue Culture IAPTC, A2-38, p. 53 (1990); D'Halluin et al., Plant Cell 4:1495-1505 (1992) and Spencer et al., Plant Mol. Biol. 24:51-61 (1994).
[304] Some embodiments of the present invention include genetically modified organisms comprising at least one nucleic acid molecule encoding at least one enzyme of the present invention, in which the activity of the enzyme is downregulated. The downregulation may be achieved, for example, by introduction of inhibitors (chemical or biological) of the enzyme activity, by manipulating the efficiency with which those nucleic acid molecules are transcribed, the efficiency with which the resultant transcripts are translated, and the efficiency of post-translational modifications, or by "knocking out" the endogenous copy of the gene. A "knock out" of a gene refers to a molecular biological technique by which the gene in the organism is made inoperative, so that the expression of the gene is substantially reduced or eliminated. Alternatively, in some embodiments the activity of the enzyme may be upregulated. The present invention also contemplates downregulating activity of one or more enzymes while simultaneously upregulating activity of one or more enzymes to achieve the desired outcome.
[305] Another embodiment of the present invention relates to an isolated binding agent capable of selectively binding to a protein of the present invention. Suitable binding agents may be selected from an antibody, an antigen binding fragment, or a binding partner. The binding agent selectively binds to an amino acid sequence selected from Tables 1 and 2, including to any fragment of any of the above sequences comprising at least one antibody binding epitope.
[306] According to the present invention, the phrase "selectively binds to" refers to the ability of an antibody, antigen binding fragment or binding partner of the present invention to preferentially bind to specified proteins. More specifically, the phrase "selectively binds" refers to the specific binding of one protein to another (e.g., an antibody, fragment thereof, or binding partner to an antigen), wherein the level of binding, as measured by any standard assay (e.g., an immunoassay), is statistically significantly higher than the background control for the assay. For example, when performing an immunoassay, controls typically include a reaction well/tube that contains antibody or antigen binding fragment alone (i.e., in the absence of antigen), wherein an amount of reactivity (e.g., non-specific binding to the well) by the antibody or antigen binding fragment thereof in the absence of the antigen is considered to be background. Binding can be measured using a variety of methods standard in the art including enzyme immunoassays (e.g., ELISA), immunoblot assays, etc.
[307] Antibodies are characterized in that they comprise immunoglobulin domains and as such, they are members of the immunoglobulin superfamily of proteins. An antibody of the invention includes polyclonal and monoclonal antibodies, divalent and monovalent antibodies, bi- or multi-specific antibodies, serum containing such antibodies, antibodies that have been purified to varying degrees, and any functional equivalents of whole antibodies. Isolated antibodies of the present invention can include serum containing such antibodies, or antibodies that have been purified to varying degrees. Whole antibodies of the present invention can be polyclonal or monoclonal. Alternatively, functional equivalents of whole antibodies, such as antigen binding fragments in which one or more antibody domains are truncated or absent (e.g., Fv, Fab, Fab', or F(ab)2 fragments), as well as genetically-engineered antibodies or antigen binding fragments thereof, including single chain antibodies or antibodies that can bind to more than one epitope (e.g., bi-specific antibodies), or antibodies that can bind to one or more different antigens (e.g., bi- or multi-specific antibodies), may also be employed in the invention. Methods for the generation and production of antibodies are well known in the art.
[308] Monoclonal antibodies may be produced according to the methodology of Kohler and Milstein (Nature 256:495-497, 1975). Non-antibody polypeptides, sometimes referred to as binding partners, are designed to bind specifically to a protein of the invention. Examples of the design of such polypeptides, which possess a prescribed ligand specificity, are given in Beste et al. (Proc. Natl. Acad. Sci. 96:1898-1903, 1999). In one embodiment, a binding agent of the invention is immobilized on a substrate such as artificial membranes, organic supports, biopolymer supports and inorganic supports, such as for use in a screening assay.
[309] Proteins of the present invention, at least one protein of the present invention, compositions comprising such protein(s) of the present invention, and multi-enzyme compositions (examples of which are described above) may be used in any method where it is desirable to hydrolyze glycosidic linkages in lignocellulosic material, or any other method wherein enzymes of the same or similar function are useful.
[310] In one embodiment, the present invention includes the use of at least one protein of the present invention, compositions comprising at least one protein of the present invention, or multi-enzyme compositions in methods for hydrolyzing lignocellulose and the generation of fermentable sugars therefrom. In one embodiment, the method comprises contacting the lignocellulosic material with an effective amount of one or more proteins of the present invention, composition comprising at least one protein of the present invention, or a multi-enzyme composition, whereby at least one fermentable sugar is produced (liberated). The lignocellulosic material may be partially or completely degraded to fermentable sugars. Economical levels of degradation at commercially viable costs are contemplated.
[311] Typically, the amount of enzyme or enzyme composition contacted with the lignocellulose will depend upon the amount of glucan present in the lignocellulose. In some embodiments, the amount of enzyme or enzyme composition contacted with the lignocellulose may be from about 0.1 to about 200 mg enzyme or enzyme composition per gram of glucan; in other embodiments, from about 3 to about 20 mg enzyme or enzyme composition per gram of glucan. The invention encompasses the use of any suitable or sufficient amount of enzyme or enzyme composition between about 0.1 mg and about 200 mg enzyme per gram glucan, in increments of 0.05 mg (i.e., 0.1 mg, 0.15 mg, 0.2 mg... 199.9 mg, 199.95 mg, 200 mg).
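The glucan-based dosing described above amounts to simple arithmetic. The following is a minimal sketch under stated assumptions: the function name, the example biomass mass, and the glucan fraction are hypothetical, and the hard range check is an illustrative choice (the specification states the bounds as "about" values).

```python
def enzyme_dose_mg(biomass_g, glucan_fraction, mg_enzyme_per_g_glucan):
    """Return the enzyme mass (mg) for a batch of lignocellulose.

    biomass_g              -- dry mass of lignocellulosic material, in grams
    glucan_fraction        -- mass fraction of glucan in the material (0 to 1)
    mg_enzyme_per_g_glucan -- loading, in mg enzyme per gram of glucan
    """
    if not 0.0 <= glucan_fraction <= 1.0:
        raise ValueError("glucan_fraction must be between 0 and 1")
    # Illustrative check against the broad range recited in the text
    # (about 0.1 to about 200 mg enzyme per gram of glucan).
    if not 0.1 <= mg_enzyme_per_g_glucan <= 200.0:
        raise ValueError("loading outside the about 0.1 to about 200 mg/g range")
    glucan_g = biomass_g * glucan_fraction
    return glucan_g * mg_enzyme_per_g_glucan


if __name__ == "__main__":
    # Hypothetical batch: 1 kg dry material containing 35% glucan,
    # dosed at 10 mg enzyme per gram glucan.
    dose = enzyme_dose_mg(1000.0, 0.35, 10.0)
    print(f"Enzyme required: {dose:.0f} mg")
```

Basing the dose on glucan rather than total biomass keeps loadings comparable across feedstocks with different compositions.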
[312] In a further embodiment, the invention provides a method for degrading DDG, preferably, but not limited to, DDG derived from corn, to sugars. The method comprises contacting the DDG with a protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition. In certain embodiments, at least 10% of fermentable sugars are liberated. In other embodiments, at least 15% of the sugars are liberated, or at least 20% of the sugars are liberated, or at least 23% of the sugars are liberated, or at least 24% of the sugars are liberated, or at least 25% of the sugars are liberated, or at least 26% of the sugars are liberated, or at least 27% of the sugars are liberated, or at least 28% of the sugars are liberated.
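By way of illustration, the percentage thresholds above can be checked against a measured hydrolysis result. This sketch assumes that "percent liberated" means released sugar divided by the total sugar potentially available in the DDG; the specification does not define the basis, and the function names and example figures are hypothetical.

```python
# Thresholds (percent of sugars liberated) recited in the paragraph above.
THRESHOLDS_PERCENT = [10, 15, 20, 23, 24, 25, 26, 27, 28]


def percent_liberated(released_sugar_g, total_available_sugar_g):
    """Percent of available sugar released (assumed basis: total available sugar)."""
    if total_available_sugar_g <= 0:
        raise ValueError("total_available_sugar_g must be positive")
    return 100.0 * released_sugar_g / total_available_sugar_g


def highest_threshold_met(percent):
    """Return the largest recited threshold that the result meets, or None."""
    met = [t for t in THRESHOLDS_PERCENT if percent >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical run: 2.65 g sugar released from DDG containing 10 g available sugar.
    pct = percent_liberated(2.65, 10.0)
    print(f"{pct:.1f}% liberated; highest threshold met: {highest_threshold_met(pct)}%")
```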
[313] In another embodiment, the invention provides a method for producing fermentable sugars comprising cultivating a genetically modified microorganism of the present invention in a nutrient medium comprising a lignocellulosic material, whereby fermentable sugars are produced.
[314] Also provided are methods that comprise further contacting the lignocellulosic material with at least one accessory enzyme. Accessory enzymes have been described elsewhere herein. The accessory enzyme or enzymes may be added at the same time, prior to, or following the addition of a protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition, or can be expressed (endogenously or overexpressed) in a genetically modified microorganism used in a method of the invention. When added simultaneously, the protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition will be compatible with the accessory enzymes selected. When the enzymes are added following the treatment with the protein of the present invention, a composition comprising at least one protein of the present invention, or a multi-enzyme composition, the conditions (such as temperature and pH) may be altered to those optimal for the accessory enzyme before, during, or after addition of the accessory enzyme. Multiple rounds of enzyme addition are also encompassed. The accessory enzyme may also be present in the lignocellulosic material itself as a result of genetically modifying the plant. The nutrient medium used in a fermentation can also comprise one or more accessory enzymes.
[315] In some embodiments, the method comprises a pretreatment process. In general, a pretreatment process will result in components of the lignocellulose being more accessible for downstream applications or more digestible by enzymes following treatment in the absence of hydrolysis. The pretreatment can be a chemical, physical or biological pretreatment. The lignocellulose may have been previously treated to release some or all of the sugars, as in the case of DDG. Physical treatments, such as grinding, boiling, freezing, milling, vacuum infiltration, and the like may also be used with the methods of the invention. In one embodiment, the heat treatment comprises heating the lignocellulosic material to 121°C for 15 minutes. A physical treatment such as milling can allow a higher concentration of lignocellulose to be used in the methods of the invention. A higher concentration refers to about 20%, up to about 25%, up to about 30%, up to about 35%, up to about 40%, up to about 45%, or up to about 50% lignocellulose. The lignocellulose may also be contacted with a metal ion, ultraviolet light, ozone, and the like. Additional pretreatment processes are known to those skilled in the art, and can include, for example, organosolv treatment, steam explosion treatment, lime impregnation with steam explosion treatment, hydrogen peroxide treatment, hydrogen peroxide/ozone (peroxone) treatment, acid treatment, dilute acid treatment, and base treatment, including ammonia fiber explosion (AFEX) technology. Details on pretreatment technologies and processes can be found in Wyman et al., Bioresource Tech. 96:1959 (2005); Wyman et al., Bioresource Tech. 96:2026 (2005); Hsu, "Pretreatment of biomass" In Handbook on Bioethanol: Production and Utilization, Wyman, Taylor and Francis Eds., p. 179-212 (1996); and Mosier et al., Bioresource Tech. 96:673 (2005).
[316] In an additional embodiment, the method comprises detoxifying the lignocellulosic material. Detoxification may be desirable in the event that inhibitors are present in the lignocellulosic material. Such inhibitors can be generated by a pretreatment process, derive from sugar degradation, or be directly released from the lignocellulose polymer. Detoxifying can include reducing their formation by adjusting sugar extraction conditions, or using inhibitor-tolerant or inhibitor-degrading strains of microorganisms. Detoxifying can also be accomplished by the addition of ion exchange resins, active charcoal, enzymatic detoxification using, e.g., laccase, and the like. In some embodiments, the proteins, compositions or products of the present invention further comprise detoxifying agents.
[317] In some embodiments, the methods may be performed one or more times in whole or in part. That is, one may perform one or more pretreatments, followed by one or more reactions with a protein of the present invention, composition or product of the present invention and/or accessory enzyme. The enzymes may be added in a single dose, or may be added in a series of small doses. Further, the entire process may be repeated one or more times as necessary. Therefore, one or more additional treatments with heat and enzymes are contemplated.
[318] The methods described above result in the production of fermentable sugars. During, or subsequent to, the methods described, the fermentable sugars may be recovered. In the case of a cultivation of microorganisms, the sugars can be recovered through a continuous, batch or fed-batch method. The sugars recovered can be concentrated or purified. Recovery may occur by any method known in the art, including, but not limited to, washing, gravity flow, pressure, chromatography, extraction, crystallization (e.g., evaporative crystallization), membrane separation, reverse osmosis, distillation, and filtration. The sugars can be subjected to further processing; e.g., they can also be sterilized, for example, by filtration.
[319] In a related embodiment, the invention provides means for improving quality of lignocellulosic material, including DDG, for animal nutrition. In one embodiment, the treated lignocellulosic material (e.g., a lignocellulosic material which has been saccharified) is recovered (e.g., has the soluble sugars removed). The recovered material can be used as an animal feed additive. It is believed that the recovered material will have beneficial properties for animal nutrition, possibly due to a higher protein content. In some embodiments, the amount of enzyme or enzyme composition contacted with the lignocellulosic material may be from about 0.0001% to about 1.0% of the weight of the lignocellulosic material; in other embodiments, from about 0.005% to about 0.1% of the weight of the lignocellulosic material. The invention includes the use of any amount of enzyme or enzyme composition between about 0.0001% and about 1.0%, in increments of 0.0001 (i.e., 0.0001, 0.0002, 0.0003, etc.).
[320] In an additional embodiment, the invention provides a method for producing an organic substance, comprising saccharifying a lignocellulosic material with an effective amount of a protein of the present invention or a composition comprising at least one protein of the present invention, fermenting the saccharified lignocellulosic material obtained with one or more fermenting microorganisms, and recovering the organic substance from the fermentation. Sugars released from biomass can be converted to useful fermentation products including but not limited to amino acids, vitamins, pharmaceuticals, animal feed supplements, specialty chemicals, chemical feedstocks, plastics, solvents, fuels, or other organic polymers, lactic acid, and ethanol, including fuel ethanol. Specific products that may be produced by the methods of the invention include, but are not limited to, biofuels (including ethanol); lactic acid; plastics; specialty chemicals; organic acids, including citric acid, succinic acid, itaconic acid and maleic acid; solvents; animal feed supplements; pharmaceuticals; vitamins; amino acids, such as lysine, methionine, tryptophan, threonine, and aspartic acid; industrial enzymes, such as proteases, cellulases, amylases, glucanases, lactases, lipases, lyases, oxidoreductases, and transferases; and chemical feedstocks. The methods of the invention are also useful to generate feedstocks for fermentation by fermenting microorganisms. In one embodiment, the method further comprises the addition of at least one fermenting organism.
[321] As used herein, "fermenting organism" refers to an organism capable of fermentation, such as bacteria and fungi, including yeast. Such feedstocks have additional nutritive value above the nutritive value provided by the liberated sugars.
[322] In some embodiments, the present invention provides methods for improving the nutritional quality of food (or animal feed) comprising adding to the food (or the animal feed) at least one protein of the present invention. In some embodiments, the present invention provides methods for improving the nutritional quality of the food (or animal feed) comprising pretreating the food (or the animal feed) with at least one isolated protein of the present invention. For instance, use of the enzymes xylanases and arabinofuranosidases in bread making has been known to improve the nutritional quality of the dough by degrading the arabinoxylans in the dough. Improving the nutritional quality can mean making the food (or the animal feed) more digestible and/or less allergenic, and encompasses changes in the caloric value, taste and/or texture of the food. In some embodiments, the proteins of the present invention may be used as part of nutritional supplements. In some embodiments, the proteins of the present invention may be used as part of digestive aids, and may help in providing relief from digestive disorders such as acid reflux and celiac disease.
[323] Proteins of the present invention and compositions comprising at least one protein of the present invention are also useful in a variety of other applications involving the hydrolysis of glycosidic linkages in lignocellulosic material, such as stone washing, color brightening, depilling and fabric softening, as well as other applications well known in the art. Proteins of the present invention and compositions comprising at least one protein of the present invention are also readily amenable to use as additives in detergent and other media used for such applications. These and other methods of use will readily suggest themselves to those of skill in the art based on the invention described herein.
[324] In one embodiment of this invention, proteins and compositions of the present invention can be used in stone washing procedures for fabrics or other textiles. In some embodiments, the proteins and compositions can be used in stone washing procedures for denim jeans. By way of example, the method for stone washing the fabric comprises contacting the fabric with a protein or composition of the present invention. In an additional embodiment, the protein or composition of the present invention is included in a detergent composition, as described below. A preferred pH range for stone wash applications is from about 5.5 to about 7.5, most preferably from about pH 6 to about pH 7. One of skill in the art will know how to regulate the amount or concentration of the protein or composition produced by this invention based on such factors as the activity of the enzyme and the wash conditions, including but not limited to temperature and pH. Examples of these uses can be found in U.S. Patent Application Publication No. 2003/0157595.
[325] In yet another embodiment of this invention, the cellulase compositions of this invention can be used to reduce or eliminate the harshness associated with a fabric or textile by contacting the fabric or textile with a protein or composition of the present invention. In some embodiments, the fabric or textile may be made from cellulose or cotton. By way of example, a preferred range for reducing or eliminating the harshness associated with a fabric or textile is between about pH 8 to about 12, or between about pH 10 to about 11.
[326] The proteins or compositions of the subject invention can be used in detergent compositions. In one embodiment, the detergent composition may comprise at least one protein or composition of the present invention and one or more surfactants. The detergent compositions may also include any additional detergent ingredient known in the art. Detergent ingredients contemplated for use with the detergent compositions of the subject invention include, but are not limited to, detergents, buffers, surfactants, bleaching agents, softeners, solvents, solid forming agents, abrasives, alkalis, inorganic electrolytes, cellulase activators, antioxidants, builders, silicates, preservatives, and stabilizers. The detergent compositions of this invention preferably employ a surface active agent, i.e., surfactant, including anionic, non-ionic, and ampholytic surfactants well known for their use in detergent compositions. In addition to the at least one protein or composition of the present invention and the surface active agent, the detergent compositions of this invention can additionally contain one or more of the following components: the enzymes amylases, cellulases, proteinase, lipases, oxido-reductases, peroxidases and other enzymes; cationic surfactants and long-chain fatty acids; builders; antiredeposition agents; bleaching agents; bluing agents and fluorescent dyes; caking inhibitors; masking agents for factors inhibiting the cellulase activity; cellulase activators; antioxidants; and solubilizers. In addition, perfumes, preservatives, dyes, and the like can be used, if desired, with the detergent compositions of this invention. Examples of detergent compositions employing cellulases are exemplified in U.S. Pat. Nos. 4,435,307; 4,443,355; 4,661,289; 4,479,881; 5,120,463.
[327] When a detergent base used in the present invention is in the form of a powder, it may be one which is prepared by any known preparation method including a spray-drying method and/or a granulation method. The granulation method is the most preferred because of the non-dusting nature of granules compared to spray-dried products. The detergent base obtained by the spray-drying method is hollow granules which are obtained by spraying an aqueous slurry of heat-resistant ingredients, such as surface active agents and builders, into a hot space. The granules have a size of from about 50 to about 2000 micrometers. After the spray-drying, perfumes, enzymes, bleaching agents, and/or inorganic alkaline builders may be added. With a highly dense, granular detergent base obtained by, for example, the spray-drying-granulation method, various ingredients may also be added after the preparation of the base. When the detergent base is a liquid, it may be either a homogeneous solution or an inhomogeneous solution.
[328] Other textile applications in which proteins and compositions of the present invention may be used include, but are not limited to, garment dyeing applications such as enzymatic mercerizing of viscose, bio-polishing applications, enzymatic surface polishing; biowash (washing or washing down treatment of textile materials), enzymatic microfibrillation, enzymatic "colonization" of linen, ramie and hemp; and treatment of Lyocel® or Newcell® (i.e., "TENCEL®" from Courtauld's), Cupro® and other cellulosic fibers or garments, dye removal from dyed cellulosic substrates such as dyed cotton (Leisola & Linko (1976) Analytical Biochemistry, v. 70, p. 592, Determination Of The Solubilizing Activity Of A Cellulase Complex With Dyed Substrates; Blum & Stahl, Enzymic Degradation Of Cellulose Fibers; Reports of the Shizuoka Prefectural Hamamatsu Textile Industrial Research Institute No. 24 (1985)), as a bleaching agent to make new indigo dyed denim look old (Fujikawa, Japanese Patent Application Kokai No. 50-132269), to enhance the bleaching action of bleaching agents (Suzuki, Great Britain Patent No. 2 094 826), and in a process for compositions for enzymatic desizing and bleaching of textiles (Windbichtler et al., U.S. Pat. No. 2,974,001). Another example of enzymatic desizing using cellulases is provided in Bhatawadekar (May 1983) Journal of the Textile Association, pages 83-86.
[329] The amount of enzyme or enzyme composition contacted with a textile may vary with the particular application. Typically, for biofinishing and denim washing applications, from about 0.02 wt. % to about 5 wt. % of an enzyme or enzyme composition may be contacted with the textile. In some embodiments, from about 0.5 wt. % to about 2 wt. % of an enzyme or enzyme composition may be contacted with the textile. For bioscouring, from about 0.1 to about 10, or from about 0.1 to about 1.0 grams of an enzyme or enzyme composition per kilogram of textile may be used, including any amount between about 0.1 grams and about 10 grams, in increments of 0.1 grams.
[330] In other embodiments, the proteins or compositions of the present invention can be used in the saccharification of lignocellulose biomass from agriculture, forest products, municipal solid waste, and other sources, for biobleaching of wood pulp, and for de-inking of recycled print paper all by methods known to one skilled in the art.
[331] The amount of enzyme or enzyme composition used for pulp and paper modification (e.g., biobleaching of wood pulp, de-inking of paper, or biorefining of pulp for paper making) typically varies depending upon the stock that is used, the pH and temperature of the system, and the retention time. In certain embodiments, the amount of enzyme or enzyme composition contacted with the paper or pulp may be from about 0.01 to about 50 U; from about 0.1 to about 15 U; or from about 0.1 to about 5 U of enzyme or enzyme composition per dry gram of fiber, including any amount between about 0.01 and about 50 U, in 0.01 U increments. In other embodiments, the amount of enzyme or enzyme composition contacted with the paper or pulp may be from about 1 to about 2000 grams or from about 100 to about 500 grams enzyme or enzyme composition per dry ton of pulp, including any amount between about 1 and about 2000 grams, in 1 gram increments.
[332] Proteins or compositions of the present invention can be added to wastewater to reduce the amount of solids such as sludge or to increase total biochemical oxygen demand (BOD) and chemical oxygen demand (COD) removal. For example, proteins or compositions of the present invention can be used to transform particulate COD to soluble COD in wastewater produced from grain/fruit/cellulose industrial processes or to increase the BOD/COD ratio by increasing waste biodegradability (soluble lower molecular weight polymers in cellulosic and hemicellulosic wastes are typically more readily biodegradable than non-soluble material). In biological wastewater treatment systems, proteins or compositions of the present invention can also be used to increase waste digestion by aerobic and/or anaerobic bacteria.
[333] Exemplary methods according to the invention are presented below. Examples of the methods described above may also be found in the following references: Trichoderma & Gliocladium, Volume 2, Enzymes, biological control and commercial applications, Editors: Gary E. Harman, Christian P. Kubicek, Taylor & Francis Ltd. 1998, 393 (in particular, chapters 14, 15 and 16); Helmut Uhlig, Industrial enzymes and their applications, Translated and updated by Elfriede M. Linsmaier-Bednar, John Wiley & Sons, Inc 1998, p. 454 (in particular, chapters 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.9, 5.10, 5.11, and 5.13). For saccharification applications: Hahn-Hagerdal, B., Galbe, M., Gorwa-Grauslund, M.F., Liden, G., Zacchi, G. Bio-ethanol - the fuel of tomorrow from the residues of today, Trends in Biotechnology, 2006, 24 (12), 549-556; Mielenz, J.R. Ethanol production from biomass: technology and commercialization status, Current Opinion in Microbiology, 2001, 4, 324-329; Himmel, M.E., Ruth, M.F., Wyman, C.E., Cellulase for commodity products from cellulosic biomass, Current Opinion in Biotechnology, 1999, 10, 358-364; Sheehan, J., Himmel, M. Enzymes, energy, and the environment: a strategic perspective on the U.S. Department of Energy's Research and Development Activities for Bioethanol, Biotechnology Progress, 1999, 15, 817-827. For textile processing applications: Galante, Y.M., Formantici, C., Enzyme applications in detergency and in manufacturing industries, Current Organic Chemistry, 2003, 7, 1399-1422. For pulp and paper applications: Bajpai, P., Bajpai, P.K. Deinking with enzymes: a review. TAPPI Journal, 1998, 81(12), 111-117; Viikari, L., Pere, J., Suurnakki, A., Oksanen, T., Buchert, J. Use of cellulases in pulp and paper applications. In: "Carbohydrates from Trichoderma reesei and other microorganisms. Structure, Biochemistry, Genetics and Applications." Editors: Mark Claessens, Wim Nerinckx, and Kathleen Piens, The Royal Society of Chemistry 1998, 245-254. For food and beverage applications: Roller, S., Dea, I.C.M. Biotechnology in the production and modification of biopolymers for foods, Critical Reviews in Biotechnology, 1992, 12(3), 261-277.
[334] Additional assays and methods for examining the activity of the enzymes are found in U.S. Patent Applications 60/806,876, 60/970,876, 11/487,547, 11/775,777, 11/833,133, and 12/205,694, which are incorporated herein by reference.
[335] The following examples are provided for the purpose of illustration and are not intended to limit the scope of the present invention.
[336] Example 1
[337] The following assay was used to measure acetyl esterase activity. This assay measured the release of p-nitrophenol by the action of acetyl esterase on p-nitrophenyl acetate (PNPAc). One acetyl esterase unit of activity was the amount of enzyme that liberates 1 micromole of p-nitrophenol in one minute at 37 °C and pH 5.
[338] Materials
[339] PNPAc from Fluka (Switzerland, cat. # 46021) was used as the assay substrate. 3.6 mg of PNPAc was dissolved in 10 mL of 0.10 M potassium phosphate buffer pH 6.9 using a magnetic stirrer to obtain a 2 mM stock solution. The solution was stable for 2 days on storage at 4 °C.
[340] The stop reagent (0.25 M Tris-HCl, pH 8.8) was prepared as follows. 30.29 g of Tris was dissolved in 900 mL of distilled water (Solution A). The final 0.25 M Tris-HCl pH 8.8 was prepared by mixing Solution A with 37% HCl until the pH of the resulting solution was equal to 8.8. The solution volume was adjusted to 1000 mL. This reagent was used to terminate the enzymatic reaction.
[341] Using the above reagents, the assay was performed as detailed below.
[342] Enzyme Sample
[343] 0.10 mL of 2 mM PNPAc stock solution was mixed with 0.01 mL of the enzyme sample and incubated at 37 °C for 10 minutes. After exactly 10 minutes of incubation, 0.1 mL of 0.25 M Tris-HCl solution was added and then the absorbance at 405 nm (A405) was measured in microtiter plates as As (enzyme sample).
[344] Substrate Blank
[345] 0.10 mL of 2 mM PNPAc stock solution was mixed with 0.01 mL of 0.05 M sodium acetate buffer, pH 5.0. Then, 0.1 mL of 0.25 M Tris-HCl solution was added and the absorbance at 405 nm (A405) was measured in microtiter plates as ASB (substrate blank).
[346] Calculation of Activity
[347] Activity was calculated as follows:
$$\text{Activity (IU/mL)} = \frac{\Delta A_{405} \times DF \times 21 \times 1.33}{13.700 \times 10}$$
where ΔA405 = As (enzyme sample) - ASB (substrate blank), DF was the enzyme dilution factor, 21 was the dilution of 10 μL enzyme solution in 210 μL reaction volume, 1.33 was the conversion factor of microtiter plates to cuvettes, 13.700 was the extinction coefficient 13700 M⁻¹ cm⁻¹ of p-nitrophenol released corrected for mol/L to μmol/mL, and 10 minutes was the reaction time.
[348] Results
[349] The acetyl esterase activity of Aes (CL10113, SEQ ID NO: 2) was found to be 0.39 IU/mL (ΔA405 = 1.91, DF = 1) of enzyme produced in 1.5 L fermentations.
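The reported value can be checked against the activity formula of Example 1. The following is a minimal sketch only; the function and parameter names are assumptions introduced for illustration, while the constants (21, 1.33, 13.700 and the 10-minute reaction time) are those defined for the formula.

```python
def pnp_esterase_activity(delta_a405, dilution_factor=1.0, reaction_minutes=10.0):
    """Return activity in IU/mL from a blank-corrected A405 reading.

    Hypothetical helper; the constants follow the Example 1 formula.
    """
    assay_dilution = 21.0     # 10 uL enzyme solution in 210 uL reaction volume
    plate_to_cuvette = 1.33   # microtiter plate to cuvette conversion factor
    extinction = 13.700       # 13700 M^-1 cm^-1 corrected for mol/L to umol/mL
    return (delta_a405 * dilution_factor * assay_dilution * plate_to_cuvette
            / (extinction * reaction_minutes))


# Values reported for Aes: delta A405 = 1.91, DF = 1
activity = pnp_esterase_activity(1.91, dilution_factor=1.0)
print(f"{activity:.2f} IU/mL")  # prints "0.39 IU/mL"
```

With ΔA405 = 1.91 and DF = 1 the sketch reproduces the 0.39 IU/mL stated for Aes.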
[350] Example 2
[351] The following assay was used to measure β-1,3-glucanase activity. This assay measured the release of glucose by the action of the β-glucanase on curdlan.
[352] Reagents
[353] Sodium acetate buffer (0.05 M, pH 5.0) was prepared as follows. 4.1 g of anhydrous sodium acetate or 6.8 g of sodium acetate · 3H2O was dissolved in distilled water so that the final volume of the solution was 1000 mL (Solution A). In a separate flask, 3.0 g (2.86 mL) of glacial acetic acid was mixed with distilled water to make the total volume of 1000 mL (Solution B). The final 0.05 M sodium acetate buffer, pH 5.0, was prepared by mixing Solution A with Solution B until the pH of the resulting solution was equal to 5.0.
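The weighed amounts for Solutions A and B follow from molarity times molar mass. The sketch below is illustrative only: the molar masses and the density of glacial acetic acid are standard reference values assumed here (they are not stated in the text), and the function names are hypothetical.

```python
# Molar masses in g/mol (standard values, assumed; not given in the text)
MOLAR_MASS = {
    "sodium_acetate_anhydrous": 82.03,
    "sodium_acetate_trihydrate": 136.08,
    "acetic_acid": 60.05,
}
ACETIC_ACID_DENSITY_G_PER_ML = 1.049  # assumed density of glacial acetic acid


def grams_per_litre(molarity, compound):
    """Mass of compound needed for 1 L of solution at the given molarity."""
    return molarity * MOLAR_MASS[compound]


def acetic_acid_ml_per_litre(molarity):
    """Volume of glacial acetic acid needed for 1 L of Solution B."""
    return grams_per_litre(molarity, "acetic_acid") / ACETIC_ACID_DENSITY_G_PER_ML


molarity = 0.05  # buffer strength used in Example 2
print(round(grams_per_litre(molarity, "sodium_acetate_anhydrous"), 1))   # 4.1
print(round(grams_per_litre(molarity, "sodium_acetate_trihydrate"), 1))  # 6.8
print(round(grams_per_litre(molarity, "acetic_acid"), 1))                # 3.0
print(round(acetic_acid_ml_per_litre(molarity), 2))                      # 2.86
```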
[354] Curdlan was purchased from Megazyme (Bray, Ireland, Cat. # P-CURDL). Cellulose, amylose and oat spelt xylan were purchased from Sigma (St. Louis, USA, Cat. # 435236, A-0512, X0627, respectively), and linear and branched arabinan and potato and larch galactan from Megazyme (Bray, Ireland, Cat. # P-ARAB, P-LARB, P-GALPOT, P-ARGAL, respectively).
[355] Enzyme Sample
[356] 50 μL of substrate stock solution (5 mg/mL) was mixed with 10 μL of enzyme sample (purified enzyme, concentration not known) and the reaction mixture was adjusted to 100 μL with 0.05 M sodium acetate buffer pH 5.0. This mixture was incubated at 50 °C for 1-4 hours. The reaction was stopped by heating the samples for 10 minutes at 100 °C. The release of glucose was analyzed by HPAEC.
[357] Substrate Blank
[358] 50 μL of substrate stock solution (5 mg/mL) was mixed with 50 μL of 0.05 M sodium acetate buffer pH 5.0 and incubated at 50 °C for 1-4 hours. The reaction was stopped by heating the samples for 10 minutes at 100 °C. The release of glucose was analyzed by HPAEC.
[359] High Performance Anion Exchange Chromatography
[360] The analysis was performed using a Dionex HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (1 mm ID x 25 mm) and a Dionex EDet1 PAD-detector (Dionex Co., Sunnyvale). A flow rate of 0.25 mL/min was used with the following gradient of sodium acetate in 0.1 M NaOH: 0-40 min, 0-400 mM. Each elution was followed by a washing step of 5 min with 1,000 mM sodium acetate in 0.1 M NaOH and an equilibration step of 15 min with 0.1 M NaOH.
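The elution program can be written as a piecewise function of run time. This is a sketch under stated assumptions: the function name is hypothetical, and it assumes that the wash begins immediately at 40 min and the equilibration at 45 min, giving a 60-minute cycle, which the text implies but does not state explicitly.

```python
def acetate_mM(t_min):
    """Sodium acetate concentration (mM, in 0.1 M NaOH) at run time t_min.

    Assumed step timing: 0-40 min linear gradient, 40-45 min wash,
    45-60 min equilibration.
    """
    if t_min < 0 or t_min > 60:
        raise ValueError("time outside the 60-minute cycle")
    if t_min <= 40:
        return 400.0 * t_min / 40.0   # linear gradient, 0 to 400 mM
    if t_min <= 45:
        return 1000.0                 # washing step
    return 0.0                        # equilibration in 0.1 M NaOH only


for t in (0, 20, 40, 42, 50):
    print(t, acetate_mM(t))
```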
[361] Results
[362] Lam1 (SEQ ID NO: 52) was found to release glucose from curdlan (as indicated in Figure 1). The enzyme released only glucose; no oligosaccharides were formed. It had no activity towards cellulose, linear or branched arabinan, xyloglucan, potato or larch galactan, amylose, or oat spelt xylan (as indicated in Figures 2-4). From this result it can be concluded that Lam1 exhibited exo-β-1,3-glucanase activity.
[363] Example 3
[364] The following assay was used to measure glucuronyl esterase activity. This assay measured the release of 4-O-methyl-glucuronic acid by the action of the glucuronyl esterases on methyl-4-O-methyl-glucuronic acid.
[365] Reagents. Sodium acetate buffer (0.1 M, pH 5.0) was prepared as follows. 8.2 g of anhydrous sodium acetate or 13.6 g of sodium acetate · 3H2O was dissolved in distilled water so that the final volume of the solution was 1000 mL (Solution A). In a separate flask, 6.0 g (5.72 mL) of glacial acetic acid was mixed with distilled water to make the total volume of 1000 mL (Solution B). The final 0.1 M sodium acetate buffer, pH 5.0, was prepared by mixing Solution A with Solution B until the pH of the resulting solution was equal to 5.0. Methyl-4-O-methyl-glucuronic acid was kindly provided by Prof. Peter Biely (Spanikova and Biely, 2006).
[366] Enzyme Sample
[367] 200 μL of methyl-4-O-methyl-glucuronic acid stock solution (0.5 mg/mL) was mixed with 10 μL of the enzyme sample and incubated at 30 °C for 4 hours. The reaction was stopped by heating the samples for 15 minutes at 99 °C. The release of 4-O-methyl-glucuronic acid was analyzed by UPLC-MS.
[368] Substrate Blank
[369] 200 μL of methyl-4-O-methyl-glucuronic acid stock solution (0.5 mg/mL) was mixed with 10 μL of buffer and incubated at 30 °C for 4 hours. The reaction was stopped by heating the samples for 15 minutes at 99 °C. The release of 4-O-methyl-glucuronic acid was analyzed by UPLC-MS.
[370] Results
[371] It was found that after 4 hours of incubation methyl-4-O-methyl-D-GlcA had been degraded to 4-O-methyl-D-GlcA by Gue1 (CL10365, SEQ ID NO: 4, enzyme produced in 1.5 L fermentations). The MS-diagram is shown in Figures 5A, 5B, 5C and 5D. After incubation a part of the substrate was degraded to 4-O-methyl-D-GlcA by Gue2 (CL11231, SEQ ID NO: 6, enzyme produced in 1.5 L fermentations). The MS-diagram is shown in Figures 6A, 6B, 6C and 6D.
[372] References
[373] Spanikova, S., Biely, P. (2006). FEBS Lett. 580: 4597-4601.
[374] Example 4
[375] The following assays were used to measure feruloyl esterase activity. This assay measured the release of arabinose by the action of the α-arabinofuranosidase on branched arabinan.
[376] Enzyme Assay
[377] Ferulic acid esterase FaeB3 (SEQ ID NO: 8) was incubated with wheat bran and sugar beet pulp AIS (Alcohol Insoluble Solids), in combination with endoxylanase (glycosyl hydrolase family 11) or Rapidase Liq+ (DSM), and the increase in absorbance at 310 nm was recorded. The absorbance at 310 nm was correlated to the release of free ferulic acid (it was assumed that the oligomers containing ferulic acid are almost immediately hydrolyzed by the enzyme). The experiments were performed at pH 6.0 and at 35 °C.
[378] The ratio E/S was 0.1% (μg enzyme/mg substrate).
[379] Results
[380] FaeB3 was most active on wheat bran AIS when used with endoxylanase (GH11). FaeB3 also demonstrated activity against sugar beet pulp (SBP) AIS in combination with Rapidase Liq+ (see Figure 7). A lag time of 2 hours was seen when working on SBP, indicating that FaeB3 cannot act on the substrate directly and that oligomers first need to be formed.
[381] Example 5
[382] The following example illustrates an additional assay to measure β-glucanase activity. Such activity is demonstrated by using β-glucan as a substrate and a reducing sugars assay (PAHBAH) as a detection method.
[383] Reagents
[384] Reagent A: 10 g of p-Hydroxy benzoic acid hydrazide (PAHBAH) is suspended in 60 mL water. 10 mL of concentrated hydrochloric acid is added and the volume is adjusted to 200 ml. Reagent B: 24.9 g of trisodium citrate is dissolved in 500 ml of water. To this solution 2.2 g of calcium chloride and 40 g sodium hydroxide are added. The volume is adjusted to 2 L with water. Both reagents are stored at room temperature. Working Reagent: 10 ml of Reagent A is added to 90 ml of Reagent B. This solution is prepared freshly every day, and is stored on ice between uses.
[385] Using the above reagents, the assay is performed as detailed below
[386] Enzyme Sample
[387] The assay is conducted in microtiter plate format. Each well contains 50 μL of β-glucan substrate (1 % (w/v) barley β-glucan in water), 30 μL of 0.2 M HAc/NaOH pH 5, and 20 μL of β-glucanase sample. These are incubated at 37 °C for 2 hours. After incubation 25 μL of each well are mixed with 125 μL working reagent. These solutions are heated at 95 °C for 5 minutes. After cooling down, the samples are analyzed by measuring the absorbance at 410 nm (A410) as As (enzyme sample). The standard curve is determined and from that the enzyme activities are determined.
[388] Substrate Blank
[389] 50 μL of β-glucan substrate (1 % (w/v) barley β-glucan in water) is mixed with 50 μL of 0.2 M sodium acetate buffer pH 5.0 and incubated at 37 °C for 2 hours. To 25 μL of this reaction mixture, 125 μL of working solution is added. The samples are heated for 5 minutes at 95 °C. After cooling down, the samples are analyzed by measuring the absorbance at 410 nm (A410) as ASB (substrate blank sample).
[390] Calculation of Activity
[391] Activity is calculated as follows: β-glucanase activity is determined by reference to a standard curve of the cellulase standard solution. Activity (IU/ml) = ΔΑ410 / SC * DF where ΔΑ410 = As (enzyme sample) - ASB (substrate blank), SC is the slope of the standard curve and DF is the enzyme dilution factor.
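The standard-curve calculation can be sketched as follows. This is an illustration only: the standard concentrations and absorbances are invented placeholder data, the function names are hypothetical, and an ordinary least-squares slope is assumed as the way SC is obtained from the standards.

```python
def standard_curve_slope(activities, absorbances):
    """Least-squares slope (A410 per IU/mL) through the standard points."""
    n = len(activities)
    mean_x = sum(activities) / n
    mean_y = sum(absorbances) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(activities, absorbances))
    sxx = sum((x - mean_x) ** 2 for x in activities)
    return sxy / sxx


def activity_iu_per_ml(a_sample, a_blank, slope, dilution_factor):
    """Activity (IU/mL) = (As - ASB) / SC * DF."""
    return (a_sample - a_blank) / slope * dilution_factor


# Hypothetical standards: activity in IU/mL versus A410
standards_iu = [0.0, 0.5, 1.0, 1.5]
standards_a410 = [0.0, 0.2, 0.4, 0.6]

sc = standard_curve_slope(standards_iu, standards_a410)
print(activity_iu_per_ml(a_sample=0.5, a_blank=0.1, slope=sc, dilution_factor=10))
```

The same calculation applies to the other PAHBAH-based examples, which share the Activity = ΔA410 / SC * DF expression.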
[392] Example 6
[393] The following example illustrates an assay to measure α-1,6-mannanase activity. Such activity is demonstrated by using α-1,6-linked mannobiose as the substrate and the D-mannose detection kit (Megazyme International) as a detection method, using a four enzyme coupled assay, using ATP and NADP+.
[394] Reagents
[395] Reactions are conducted at 37 °C in 100 mM MOPS (pH 7.0), containing 0.1 mM ZnSO4, 1 mg mL⁻¹ BSA, and 20 μL of α-1,6-mannanase sample. Mannose liberated by α-1,6-mannanase is phosphorylated to mannose-6-phosphate by hexokinase (HK). Mannose-6-phosphate is subsequently converted to fructose-6-phosphate by phosphomannose isomerase (PMI), which is then isomerized to glucose-6-phosphate by phosphoglucose isomerase (PGI). Finally, glucose-6-phosphate is oxidized to gluconate-6-phosphate by glucose-6-phosphate dehydrogenase (G6P-DH). The concurrent reduction of the NADP+ cofactor to NADPH is monitored at 340 nm using an extinction coefficient of 6223 M⁻¹ cm⁻¹. The enzymes were individually obtained from Sigma.
[396] Calculation of Activity
[397] The A340 values are plotted against time in minutes (X-axis). The slope of the graph is calculated (dA). Enzyme activity is calculated by using the following formula:
$$\text{Specific activity} = \frac{dA \times V_a \times d}{\varepsilon \times l \times [\text{protein}] \times V_p}$$
dA = slope in A/min; Va = reaction volume in L; d = dilution factor of assay mix; ε = extinction coefficient for NAD(P)H of 0.006223 μM⁻¹ cm⁻¹; l = length of cell in cm; [protein] = protein stock concentration in mg/mL; Vp = volume of protein stock added to assay in mL.
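A unit check shows why this expression yields μmol per minute per mg of protein: dA/(ε·l) is a rate in μM/min, multiplying by the reaction volume in litres gives μmol/min, and [protein]·Vp is the mass of protein in mg. The sketch below assumes the formula reads dA × Va × d divided by ε × l × [protein] × Vp, as the listed definitions indicate; the function name and the numerical inputs are hypothetical.

```python
def specific_activity(dA, Va_l, d, epsilon, path_cm, protein_mg_per_ml, Vp_ml):
    """Specific activity in umol/min/mg.

    dA in A/min, Va_l in litres, epsilon in uM^-1 cm^-1, path_cm in cm,
    protein_mg_per_ml in mg/mL, Vp_ml in mL.
    """
    rate_uM_per_min = dA / (epsilon * path_cm)    # umol per litre per minute
    rate_umol_per_min = rate_uM_per_min * Va_l * d
    protein_mg = protein_mg_per_ml * Vp_ml
    return rate_umol_per_min / protein_mg


# Hypothetical inputs, for illustration only
value = specific_activity(dA=0.05, Va_l=0.0002, d=1.0, epsilon=0.006223,
                          path_cm=1.0, protein_mg_per_ml=0.5, Vp_ml=0.02)
print(f"{value:.3f} umol/min/mg")
```

The same helper applies to the p-nitrophenol-based assays by passing the corresponding extinction coefficient.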
[398] Example 7
[399] The following example illustrates an assay to measure rhamnogalacturonyl hydrolase activity. Such activity is demonstrated by using rhamnogalacturonan as a substrate and a reducing sugars assay (PAHBAH) as the detection method.
[400] Reagents
[401] Reagent A: 10 g of p-Hydroxy benzoic acid hydrazide (PAHBAH) is suspended in 60 mL water. 10 mL of concentrated hydrochloric acid is added and the volume is adjusted to 200 ml. Reagent B: 24.9 g of trisodium citrate is dissolved in 500 ml of water. To this solution 2.2 g of calcium chloride and 40 g sodium hydroxide are added. The volume is adjusted to 2 L with water. Both reagents are stored at room temperature. Working Reagent: 10 ml of Reagent A is added to 90 ml of Reagent B. This solution is prepared freshly every day, and is stored on ice between uses.
[402] Using the above reagents, the assay is performed as detailed below
[403] Enzyme Sample
[404] The assay is conducted in microtiter plate format. Each well contains 50 μL of rhamnogalacturonan substrate (1 % (w/v) in water), 30 μL of 0.2 M HAc/NaOH pH 5, and 20 μL of rhamnogalacturonyl hydrolase sample. These are incubated at 37 °C for 2 hours. After incubation 25 μL of each well are mixed with 125 μL working reagent. These solutions are heated at 95 °C for 5 minutes. After cooling down, the samples are analyzed by measuring the absorbance at 410 nm (A410) as As (enzyme sample). The standard curve is determined and from that the enzyme activities are determined.
[405] Substrate Blank
[406] 50 μL of rhamnogalacturonan substrate (1 % (w/v) in water) is mixed with 50 μL of 0.2 M sodium acetate buffer pH 5.0 and incubated at 37 °C for 2 hours. To 25 μL of this reaction mixture, 125 μL of working solution is added. The samples are heated for 5 minutes at 95 °C. After cooling down, the samples are analyzed by measuring the absorbance at 410 nm (A410) as ASB (substrate blank sample).
[407] Calculation of Activity
[408] Activity is calculated as follows: rhamnogalacturonyl hydrolase activity is determined by reference to a standard curve of the cellulase standard solution. Activity (IU/ml) = ΔΑ410 / SC * DF
where ΔΑ410 = As (enzyme sample) - ASB (substrate blank), SC is the slope of the standard curve and DF is the enzyme dilution factor.
[409] Example 8
[410] The following example illustrates an assay to measure α-amylase activity. Such activity is demonstrated by using amylose as substrate and a reducing sugars assay (PAHBAH) as detection method.
[411] Reagents
[412] Reagent A: 10 g of p-Hydroxy benzoic acid hydrazide (PAHBAH) is suspended in 60 mL water. 10 mL of concentrated hydrochloric acid is added and the volume is adjusted to 200 ml. Reagent B: 24.9 g of trisodium citrate is dissolved in 500 ml of water. To this solution 2.2 g of calcium chloride and 40 g sodium hydroxide are added. The volume is adjusted to 2 L with water. Both reagents are stored at room temperature. Working Reagent: 10 ml of Reagent A is added to 90 ml of Reagent B. This solution is prepared freshly every day, and is stored on ice between uses.
[413] Using the above reagents, the assay is performed as detailed below
[414] Enzyme Sample
[415] The assay is conducted in microtiter plate format. Each well contains 50 μL of amylose substrate (0.15 % (w/v) in water), 30 μL of 0.2 M HAc/NaOH pH 5, and 20 μL of α-amylase sample. These are incubated at 37 °C for 15 minutes. After incubation 25 μL of each well are mixed with 125 μL working reagent. These solutions are heated at 95 °C for 5 minutes. After cooling down, the samples are analyzed by measuring the absorbance at 410 nm (A410) as As (enzyme sample).
[416] Substrate Blank
[417] 50 μL of amylose substrate (0.15 % (w/v) in water) is mixed with 50 μL of 0.2 M sodium acetate buffer pH 5.0 and incubated at 37 °C for 15 minutes. To 25 μL of this reaction mixture, 125 μL of working solution is added. The samples are heated for 5 minutes at 95 °C. After cooling down, the samples are analyzed by measuring the absorbance at 410 nm (A410) as ASB (substrate blank sample).
[418] Calculation of Activity
[419] Activity is calculated as follows: α-amylase activity is determined by reference to a standard curve of the cellulase standard solution. Activity (IU/ml) = ΔA410 / SC * DF, where ΔA410 = As (enzyme sample) - ASB (substrate blank), SC is the slope of the standard curve and DF is the enzyme dilution factor.
[420] Example 9
[421] This example illustrates an assay to measure α-glucosidase activity. This assay measures the release of p-nitrophenol by the action of α-glucosidase on p-nitrophenyl α-D-glucopyranoside. One α-glucosidase unit of activity is the amount of enzyme that liberates 1 micromole of p-nitrophenol in one minute.
[422] Reagents
[423] Sodium acetate buffer (0.2 M, pH 5.0) is prepared as follows. 16.4 g of anhydrous sodium acetate or 27.2 g of sodium acetate · 3H2O is dissolved in distilled water to a final volume of 1000 mL (Solution A). In a separate flask, 12 g (11.44 mL) of glacial acetic acid is mixed with distilled water to make the total volume of 1000 mL (Solution B). The final 0.2 M sodium acetate buffer, pH 5.0, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is equal to 5.0.
[424] p-Nitrophenyl α-D-glucopyranoside (3 mM) from Sigma (#N1377) is used as the assay substrate. 4.52 mg of p-nitrophenyl α-D-glucopyranoside is dissolved in 5 mL of sodium acetate buffer using a magnetic stirrer.
[425] The stop reagent (0.25 M Tris-HCl, pH 8.8) is prepared as follows. 30.29 g of Tris is dissolved in 900 mL of distilled water (Solution A). The final 0.25 M Tris-HCl pH 8.8 is prepared by mixing Solution A with 37% HCl until the pH of the resulting solution is equal to 8.8. The solution volume is adjusted to 1000 mL. This reagent is used to terminate the enzymatic reaction.
[426] Using the above reagents, the assay is performed as detailed below.
[427] Enzyme Sample
[428] 0.025 mL of p-nitrophenyl α-D-glucopyranoside stock solution is mixed with 1 μL of the enzyme sample, 0.075 mL buffer and 0.099 mL Millipore water and incubated at 37 °C for 4 minutes. Every minute during the 4 minutes, a 0.04 mL sample is taken and added to 0.06 mL stop reagent. The absorbance at 410 nm (A410) is measured in microtiter plates as As (enzyme sample).
[429] Substrate Blank
[430] 0.025 mL of p-nitrophenyl α-D-glucopyranoside stock solution is mixed with 0.075 mL buffer and 0.1 mL Millipore water and incubated at 37 °C for 4 minutes. Every minute during the 4 minutes, a 0.04 mL sample is taken and added to 0.06 mL stop reagent. The absorbance at 410 nm (A410) is measured in microtiter plates as ASB (substrate blank sample).
[431] Calculation of Activity
[432] The A410 values are plotted against time in minutes (X-axis). The slope of the graph is calculated (dA). Enzyme activity is calculated by using the following formula:
$$\text{Specific activity} = \frac{dA \times V_a \times d}{\varepsilon \times l \times [\text{protein}] \times V_p}$$
dA = slope in A/min; Va = reaction volume in L; d = dilution factor of assay mix; ε = extinction coefficient of p-nitrophenol (0.0137 μM⁻¹ cm⁻¹); l = length of cell in cm; [protein] = protein stock concentration in mg/mL; Vp = volume of protein stock added to assay in mL.
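The slope dA is obtained from the timed samples. The sketch below shows one way this could be done, assuming a blank-corrected least-squares fit of A410 against time and using R² as a simple check that the readings lie in the linear (initial rate) range. The readings are invented placeholder data and the function names are hypothetical.

```python
def linear_fit(times, values):
    """Return (slope, r_squared) of a least-squares line through the points."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    syy = sum((v - mean_v) ** 2 for v in values)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, r_squared


def blank_corrected_rate(times, a_sample, a_blank):
    """Subtract the substrate blank point by point, then fit A410 versus time."""
    corrected = [s - b for s, b in zip(a_sample, a_blank)]
    return linear_fit(times, corrected)


# Hypothetical readings taken every minute for 4 minutes
minutes = [1, 2, 3, 4]
sample_a410 = [0.15, 0.25, 0.35, 0.45]
blank_a410 = [0.05, 0.05, 0.05, 0.05]

dA, r2 = blank_corrected_rate(minutes, sample_a410, blank_a410)
print(f"dA = {dA:.3f} A/min, R^2 = {r2:.3f}")
```

An R² well below 1 would suggest that the reaction has left the initial-rate range and that fewer or earlier time points should be used.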
[433] Example 10
[434] The following example illustrates an assay to measure endo-glucanase activity. Such activity is demonstrated by using a glucan (e.g. dextran, glycogen, pullulan, amylose, amylopectin, cellulose, curdlan, laminarin, chrysolaminarin, lentinan, lichenin, pleuran, zymosan, etc.) as substrate and a reducing sugars assay (PAHBAH) as detection method.
[435] Reagents
[436] Reagent A: 10 g of p-Hydroxy benzoic acid hydrazide (PAHBAH) is suspended in 60 mL water. 10 mL of concentrated hydrochloric acid is added and the volume is adjusted to 200 ml. Reagent B: 24.9 g of trisodium citrate is dissolved in 500 ml of water. To this solution 2.2 g of calcium chloride and 40 g sodium hydroxide are added. The volume is adjusted to 2 L with water. Both reagents are stored at room temperature. Working Reagent: 10 ml of Reagent A is added to 90 ml of Reagent B. This solution is prepared freshly every day, and is stored on ice between uses.
[437] Using the above reagents, the assay is performed as detailed below.
[438] Enzyme Sample
[439] The assay is conducted in microtiter plate format. Each well contains 50 μL of glucan substrate (1 % (w/v) glucan in water), 30 μL of 0.2 M HAc/NaOH pH 5, and 20 μL of endo-glucanase sample. These are incubated at 37 °C for 2 hours. After incubation 25 μL of each well are mixed with 125 μL working reagent. These solutions are heated at 95 °C for 5 minutes. After cooling down, the samples are analyzed by measuring the absorbance at 410 nm (A410) as As (enzyme sample). The standard curve is determined and from that the enzyme activities are determined.
[440] Substrate Blank
[441] 50 μL of glucan substrate (1 % (w/v) glucan in water) is mixed with 50 μL of 0.2 M sodium acetate buffer pH 5.0 and incubated at 37 °C for 2 hours. To 25 μL of this reaction mixture, 125 μL of working solution is added. The samples are heated for 5 minutes at 95 °C. After cooling down, the samples are analyzed by measuring the absorbance at 410 nm (A410) as ASB (substrate blank sample).
[442] Calculation of Activity
[443] Activity is calculated as follows: endo-glucanase activity is determined by reference to a standard curve of the cellulase standard solution. Activity (IU/ml) = ΔA410 / SC * DF, where ΔA410 = As (enzyme sample) - ASB (substrate blank), SC is the slope of the standard curve and DF is the enzyme dilution factor.
[444] Example 11
[445] The following example illustrates an assay to measure α-glucanase activity. Such activity is demonstrated by using an α-glucan (e.g. dextran, glycogen, pullulan, amylopectin, amylose, etc.) as the substrate and a reducing sugars assay (PAHBAH) as the detection method.
[446] Reagents
[447] Reagent A: 10 g of p-Hydroxy benzoic acid hydrazide (PAHBAH) is suspended in 60 mL water. 10 mL of concentrated hydrochloric acid is added and the volume is adjusted to 200 ml. Reagent B: 24.9 g of trisodium citrate is dissolved in 500 ml of water. To this solution 2.2 g of calcium chloride and 40 g sodium hydroxide are added. The volume is adjusted to 2 L with water. Both reagents are stored at room temperature. Working Reagent: 10 ml of Reagent A is added to 90 ml of Reagent B. This solution is prepared freshly every day, and is stored on ice between uses.
[448] Using the above reagents, the assay is performed as detailed below
[449] Enzyme Sample
[450] The assay is conducted in microtiter plate format. Each well contains 50 μL of α-glucan substrate (1 % (w/v) α-glucan in water), 30 μL of 0.2 M HAc/NaOH pH 5, and 20 μL of endo-glucanase sample. These are incubated at 37 °C for 2 hours. After incubation 25 μL of each well are mixed with 125 μL working reagent. These solutions are heated at 95 °C for 5 minutes. After cooling down, the samples are analyzed by measuring the absorbance at 410 nm (A410) as As (enzyme sample).
[451] Substrate Blank
[452] 50 μL of α-glucan substrate (1 % (w/v) α-glucan in water) is mixed with 50 μL of 0.2 M sodium acetate buffer pH 5.0 and incubated at 37 °C for 2 hours. To 25 μL of this reaction mixture, 125 μL of working solution is added. The samples are heated for 5 minutes at 95 °C. After cooling down, the samples are analyzed by measuring the absorbance at 410 nm (A410) as ASB (substrate blank sample).
[453] Calculation of Activity
[454] Activity is calculated as follows: endo-glucanase activity is determined by reference to a standard curve of the cellulase standard solution. Activity (IU/ml) = ΔΑ410 / SC * DF where ΔΑ410 = As (enzyme sample) - ASB (substrate blank), SC is the slope of the standard curve and DF is the enzyme dilution factor.
[455] Example 12
[456] The following example illustrates the assay that can be used to measure the ferulic acid esterase enzymatic activity.
[457] This assay measures the release of p-nitrophenol by the action of ferulic acid esterase on p-nitrophenylbutyrate (PNBu). One ferulic acid esterase unit of activity is the amount of enzyme that liberates 1 micromole of p-nitrophenol in one minute at 37°C and pH 7.2.
[458] Phosphate buffer (0.01 M, pH 7.2) is prepared as follows: 0.124 g of NaH2PO4 · H2O and 0.178 g of Na2HPO4 are dissolved in distilled water so that the final volume of the solution is 500 ml and the pH of the resulting solution is equal to 7.2.
[459] PNBu (Sigma, USA, cat. # N9876-5G) is used as the assay substrate. 10 μL of PNBu is mixed with 25 ml of 0.01 M phosphate buffer using a magnetic stirrer to obtain a 2 mM stock solution. The solution is stable for 2 days with storage at 4 °C.
[460] The stop reagent (0.25 M Tris-HCl, pH 8.5) is prepared as follows: 30.29 g of Tris is dissolved in 900 ml of distilled water (Solution A). The final 0.25 M Tris-HCl pH 8.5 is prepared by mixing Solution A with 37% HCl until the pH of the resulting solution is equal to 8.5. The solution volume is adjusted to 1000 ml. This reagent is used to terminate the enzymatic reaction. Using the above reagents, the assay is performed as detailed below.
[461] For the enzyme sample, 0.10 mL of 2 mM PNBu stock solution is mixed with 0.01 mL of the enzyme sample and incubated at 37 °C for 10 minutes. After 10 minutes of incubation, 0.10 mL of 0.25 M Tris/HCl solution pH 8.8 is added and the absorbance at 405 nm (A405) is then measured in microtiter plates as As.
[462] For the substrate blank, 0.10 mL of 2 mM PNBu stock solution is mixed with 0.01 mL of 0.01 M phosphate buffer, pH 7.2. 0.10 mL of 0.25 M Tris/HCl solution pH 8.8 is added and the absorbance at 405 nm (A405) is measured in microtiter plates as ASB.
[463] Activity is calculated as follows:
$$\text{Activity (IU/mL)} = \frac{\Delta A_{405} \times DF \times 21 \times 1.33}{13.700 \times 10}$$
where ΔA405 = As - ASB, DF is the enzyme dilution factor, 21 is the dilution of 10 μL enzyme solution in 210 μL reaction volume, 1.33 is the conversion factor of microtiter plates to cuvettes, 13.700 is the extinction coefficient 13700 M⁻¹ cm⁻¹ of p-nitrophenol released corrected for mol/L to μmol/mL, and 10 minutes is the reaction time.
[464] Example 13
[465] The following assay is used to measure the enzymatic activity of a ferulic acid esterase towards wheat bran (WB) oligosaccharides by measuring the release of ferulic acid.
[466] Wheat bran oligosaccharides are prepared by degradation of wheat bran (obtained from Nedalco, The Netherlands) by endo-xylanase III from A. niger (enzyme collection Laboratory of Food Chemistry, Wageningen University, The Netherlands). 50 mg of WB is dissolved in 10 ml of 0.05 M acetate buffer pH 5.0 using a magnetic stirrer. 1.0 ml of WB stock solution is mixed with 0.0075 mg of the enzyme and incubated at 35 °C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100 °C. The residual material is removed by centrifugation (15 minutes at 14000 rpm), and the supernatant is used as the substrate in the assay detailed below.
[467] For the enzyme sample, 1.0 ml of wheat bran oligosaccharides stock solution is mixed with 0.005 mg of the enzyme sample and incubated at 35°C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100°C. The release of ferulic acid is analyzed by measuring the absorbance at 335 nm.
[468] For the substrate blank, 1.0 ml of wheat bran oligosaccharides stock solution is mixed with 0.005 mg of 0.05 M acetate buffer, pH 5.0, and incubated at 35°C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100°C. The release of ferulic acid is analyzed by measuring the absorbance at 335 nm.
[469] Example 14
[470] The following assay is used to measure α-galactosidase activity.
[471] Substrate:
[472] 2 mM 4-Nitrophenyl-α-D-galactopyranoside in 50 mM NaAc pH 5.0
[473] Stop solution:
[474] 0.25 M Na2CO3
[475] Assay:
[476] In a 96-well microplate, mix 10 μL sample + 100 μL substrate and incubate for 10 minutes at 37 °C. Add 100 μL 0.25 M Na2CO3. Measure samples in plate reader at E410nm.
[477] For quantitative activity measurements, use cuvettes and adjust the volumes accordingly. Take timed samples (to verify that you measure initial rates). Calculate the specific activity as follows:
[478] A graph is plotted with E410nm as the Y-axis and time in minutes as the X-axis.
[479] Calculate the slope of the graph (Y/X).
[480] Enzyme activity is calculated by using the following formula:
$$\text{Specific activity} = \frac{dA \cdot V_r \cdot d \cdot D_e}{\varepsilon \cdot l \cdot [\text{protein}] \cdot V_p}$$
where dA = slope in A/min; V_r = reaction volume in l; D_e = enzyme dilution before adding to reaction mix; d = dilution factor of assay mix after adding stop reagent; ε = extinction coefficient (0.0158 μM⁻¹ cm⁻¹); l = length of cell (1.0 cm in case of cuvettes); [protein] = protein stock concentration in mg/ml; V_p = volume of protein solution added to assay in ml.
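The formula above can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name, argument names, and the numerical inputs in the usage line are hypothetical and do not come from the assay description; only the extinction coefficient (0.0158 μM⁻¹ cm⁻¹) and the 1.0 cm path length are taken from the text. With the stated units, the result is in μmol per minute per mg protein.

```python
def specific_activity(dA, Vr, d, De, protein, Vp, eps=0.0158, path_cm=1.0):
    """Specific activity = (dA * Vr * d * De) / (eps * l * [protein] * Vp).

    dA       slope in absorbance units per minute
    Vr       reaction volume in litres
    d        dilution factor of the assay mix after adding stop reagent
    De       enzyme dilution before adding to the reaction mix
    protein  protein stock concentration in mg/ml
    Vp       volume of protein solution added to the assay in ml
    eps      extinction coefficient in 1/(uM * cm), value from the text
    path_cm  cell length in cm (1.0 cm for cuvettes, per the text)
    """
    denominator = eps * path_cm * protein * Vp
    if denominator <= 0:
        raise ValueError("eps, path length, [protein] and Vp must be positive")
    return (dA * Vr * d * De) / denominator


# Hypothetical inputs, chosen only to exercise the function.
example = specific_activity(dA=0.05, Vr=0.001, d=2, De=10, protein=1.0, Vp=0.1)
print(round(example, 3))
```

Keeping the extinction coefficient and path length as keyword arguments makes it easy to reuse the same function for a different chromophore or cell, if those values are known.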
[481] Example 15
[482] The following assay is used to measure β-mannosidase activity.
[483] Substrate:
[484] 2 mM 4-nitrophenyl-β-D-mannopyranoside in 50 mM NaAc, pH 5.0
[485] Stop solution:
[486] 0.25 M Na2CO3
[487] Assay:
[488] In a 96-well microplate, mix 10 μl sample and 100 μl substrate. Incubate for 10 minutes at 37°C. Add 100 μl 0.25 M Na2CO3. Measure samples in a plate reader at E410nm.
[489] For quantitative activity measurements, use cuvettes and adjust the volumes accordingly. Take timed samples (to verify that you measure initial rates); the specific activity can then be calculated as follows:
[490] A graphic is assembled with E410nm as Y-axis and time in minutes as X-axis.
[491] Calculate the slope of the graph (Y/X).
[492] Enzyme activity is calculated by using the following formula:
$$\text{Specific activity} = \frac{dA \cdot V_r \cdot d \cdot D_e}{\varepsilon \cdot l \cdot [\text{protein}] \cdot V_p}$$
where dA = slope in A/min; V_r = reaction volume in l; D_e = enzyme dilution before adding to reaction mix; d = dilution factor of assay mix after adding stop reagent; ε = extinction coefficient (0.0158 μM⁻¹ cm⁻¹); l = length of cell (1.0 cm in case of cuvettes); [protein] = protein stock concentration in mg/ml; V_p = volume of protein solution added to assay in ml.
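The slope dA is obtained from timed samples, and the text asks that initial rates be verified. One possible way to do both is an ordinary least-squares fit that also reports R² as a linearity check. The sketch below is an assumption about how this could be done, not the procedure used in the Examples; the function name and the sample readings are hypothetical.

```python
def initial_rate(times_min, absorbances):
    """Least-squares slope (A/min) and R^2 for timed E410nm readings."""
    n = len(times_min)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired time/absorbance readings")
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    syy = sum((a - mean_a) ** 2 for a in absorbances)
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_min, absorbances))
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    slope = sxy / sxx
    # R^2 close to 1 suggests the readings lie in the linear (initial-rate) range.
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, r_squared


# Hypothetical readings taken every 2 minutes.
slope, r2 = initial_rate([0, 2, 4, 6, 8], [0.10, 0.20, 0.30, 0.40, 0.50])
print(slope, r2)
```

If R² drops noticeably when later time points are included, restricting the fit to the earlier samples is one way to stay within the initial-rate range.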
[493] Example 16
[494] The following assay was used to measure rhamnogalacturonan acetyl esterase activity. This assay measures the release of acetic acid by the action of the rhamnogalacturonan acetyl esterase on sugar beet pectin.
[495] Enzyme assay
[496] Sugar beet pectin was purchased from CP Kelco (Atlanta, USA). The acetic acid assay kit was purchased from Megazyme (Bray, Ireland).
[497] The rhamnogalacturonan acetyl esterase was incubated with sugar beet pectin at 50°C in 10 mM phosphate buffer pH 7.0 for 16 hours. The E/S ratio was 0.5% (5 μg enzyme/mg substrate). The total volume of the reaction was 110 μL. The released acetic acid was analyzed with the acetic acid assay kit according to the instructions of the supplier. Rgae1 (CL11462), an enzyme with known rhamnogalacturonan acetyl esterase activity, was used as a reference.
[498] Results
[499] Rgae2 (CL11477; SEQ ID NO: 16) was produced in a low cellulase background strain of Myceliophthora thermophila C1 which was transformed by introducing copies of the rgae2 gene (SEQ ID NO: 15) under control of a chitinase promoter. The enzyme produced was found to release acetic acid from sugar beet pectin. Sugar beet pectin contains about 5% of rhamnogalacturonan and has a total degree of esterification of 30%. Based on this, it was estimated that 5% of the acetyl esters are linked to rhamnogalacturonan. Rgae2 released 118 mg/L acetic acid under the described conditions, which is 4.3% of the total acetic acid present in the sample. The control rhamnogalacturonan acetyl esterase Rgae1 released 135 mg/L acetic acid, which is about 4.9% of the total acetic acid present. Release of these acetyl moieties is essential for hydrolysis of the rhamnogalacturonan backbone by either a rhamnogalacturonan hydrolase or a rhamnogalacturonan lyase. These results show that Rgae2 is a true rhamnogalacturonan acetyl esterase.
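As a quick arithmetic check of the figures quoted above, the total acetic acid implied by each released amount and its stated percentage can be back-calculated. The sketch below is illustrative only; the function name is hypothetical, and the total acetic acid content is not stated in the text but inferred here from the reported values.

```python
def implied_total_mg_per_l(released_mg_per_l, percent_of_total):
    """Total acetic acid implied by a released amount and its share of the total."""
    if percent_of_total <= 0:
        raise ValueError("percentage must be positive")
    return released_mg_per_l / (percent_of_total / 100.0)


# Values reported in the text: (released mg/L, percent of total).
reported = {"Rgae2": (118, 4.3), "Rgae1": (135, 4.9)}
for name, (released, percent) in reported.items():
    print(name, round(implied_total_mg_per_l(released, percent)))
```

Both enzymes imply a total of roughly 2.7 to 2.8 g/L acetic acid, so the two reported percentages are mutually consistent, and both are close to the 5% of acetyl esters estimated to be linked to rhamnogalacturonan.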
[500] Example 17:
[501] The following assay is used to measure α-glucuronidase activity. This assay measures the release of linear xylooligosaccharides by the action of the α-glucuronidase on aldouronic acids.
[502] Enzyme assay
[503] Aldouronic acids were purchased from Megazyme (Bray, Ireland).
[504] Agu2 (SEQ ID NO: 18) has been produced in a low cellulase background strain of Myceliophthora thermophila C1 which was transformed by introducing copies of the agu1 gene (SEQ ID NO: 17) under control of a chitinase promoter. The crude Agu2 was used in this assay. Agu2 was tested for its ability to release linear xylo-oligomers from the aldouronic acid mixture. A GH10 xylanase derived from Myceliophthora thermophila C1 was used as a control. The incubations were performed with 1 mg/mL substrate and 10 μL crude Agu2 in a total volume of 200 μL. The temperature used was 50°C and the incubation was performed in a 50 mM acetate buffer pH 5.0. After incubation, the samples were boiled for 10 minutes to stop the reaction. Samples were centrifuged and the supernatants were analyzed by HPAEC.
[505] High Performance Anion Exchange Chromatography
[506] The analysis was performed using a Dionex ICS-3000 HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (1 mm ID x 25 mm) and a Dionex EDet1 PAD detector (Dionex Co., Sunnyvale). A flow rate of 0.25 mL/min was used with the following gradient of sodium acetate in NaOH: 0-15 min, 16 mM NaOH; 15-20 min, 16-100 mM NaOH; 20-40 min, 0-200 mM sodium acetate in 100 mM NaOH. Each elution was followed by a washing step of 5 min 1 M sodium acetate in 100 mM NaOH and an equilibration step of 15 min 16 mM NaOH.
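The elution gradient described above can be written as a piecewise function of time. The sketch below assumes that each ramp is linear, which the text does not state explicitly; the function name is hypothetical, and the washing and equilibration steps that follow each run are not modelled.

```python
def eluent_at(t_min):
    """Return (NaOH mM, sodium acetate mM) at time t_min of the 0-40 min gradient.

    Assumes linear ramps between the stated end points.
    """
    if not 0 <= t_min <= 40:
        raise ValueError("the gradient is defined for 0-40 min only")
    if t_min <= 15:
        # 0-15 min: isocratic 16 mM NaOH
        return (16.0, 0.0)
    if t_min <= 20:
        # 15-20 min: 16 -> 100 mM NaOH
        fraction = (t_min - 15) / 5
        return (16.0 + fraction * 84.0, 0.0)
    # 20-40 min: 0 -> 200 mM sodium acetate in 100 mM NaOH
    fraction = (t_min - 20) / 20
    return (100.0, 200.0 * fraction)


for t in (0, 17.5, 30, 40):
    print(t, eluent_at(t))
```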
[507] Results
[508] It was found that Agu2 (CL10353) released linear xylobiose and xylotriose. The pattern of the aldouronic acid peaks also changes, indicating that 4-O-methylglucuronic acid residues are removed from the oligosaccharides (Figure 8A). In combination with a pure GH10 xylanase from Myceliophthora thermophila C1, the xylotriose is completely degraded to xylobiose and xylose. The xylanase alone releases a small amount of xylose and xylobiose, most likely derived from the tetraose fraction in the substrate. However, the xylose and xylobiose peaks are much higher when the xylanase is used in combination with Agu2. The pattern of the aldouronic acid peaks also changes with the combination of the two enzymes, but not with the xylanase alone (Figure 8B).
[509] This experiment clearly indicates that the release of glucuronic acid is essential for the hydrolysis of xylans which are strongly decorated with glucuronyl / 4-O-methylglucuronyl moieties.
[510] Example 18:
[511] The following assays were used to measure xylanase activity. This assay measures the release of xylooligosaccharides by the action of the xylanases on xylopentaose and reduced xylopentaose.
[512] Enzyme Assay
[513] Xylanases Gxh1 (CL06719; SEQ ID NO: 20) and Gxh2 (CL03182; SEQ ID NO: 22) were incubated with xylopentaose (Megazyme, Bray, Ireland) or reduced xylopentaose. The release of xylooligosaccharides was analyzed by HPAEC. The experiments on xylopentaose and reduced xylopentaose, which was obtained by a sodium borohydride treatment of the xylopentaose, were performed at pH 5.0 and at 50°C for 16 hours. The E/S ratio was 0.5% (5 μg enzyme/mg substrate). After incubation the samples were boiled for 10 minutes to stop the reaction. The samples were centrifuged and the supernatants were analyzed by HPAEC.
[514] High Performance Anion Exchange Chromatography
[515] The analysis was performed using a Dionex ICS-3000 HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (1 mm ID x 25 mm) and a Dionex EDet1 PAD detector (Dionex Co., Sunnyvale). A flow rate of 0.25 mL/min was used with the following gradient of sodium acetate in NaOH: 0-15 min, 16 mM NaOH; 15-20 min, 16-100 mM NaOH; 20-40 min, 0-200 mM sodium acetate in 100 mM NaOH. Each elution was followed by a washing step of 5 min 1 M sodium acetate in 100 mM NaOH and an equilibration step of 15 min 16 mM NaOH.
[516] Results
[517] Both Gxh1 and Gxh2 release xylobiose from xylopentaose (see Figure 2). Gxh1 is not able to release xylobiose from the reduced xylopentaose, which shows that the enzyme acts from the reducing end of the oligosaccharide (see Figure 9A). Gxh2 releases xylobiose and xylitol from the reduced xylopentaose. The position of the xylitol released was confirmed by comparing its position with that of commercially available xylitol. This shows that the enzyme Gxh2 acts from the non-reducing end of the oligosaccharide (see Figure 9B).
[518] This experiment demonstrates that Gxh1 and Gxh2 have xylobiohydrolase activity and act from the reducing and non-reducing end of the xylan chain, respectively. The resemblance to the degradation mechanism for crystalline cellulose by Cbh1 and Cbh2 is surprising.
[519] Example 19
[520] The following assays were used to measure synergistic activity of xylanases. This assay measures the release of xylooligosaccharides by the action of the xylanases on birchwood xylan.
[521] Enzyme Assay
[522] Xylanases Gxh1 (CL06719; SEQ ID NO: 20) and Gxh2 (CL03182; SEQ ID NO: 22) were incubated with birch wood xylan (Sigma, USA), alone and in combination with a GH10 and a GH11 xylanase from Myceliophthora thermophila C1. The release of xylooligosaccharides was analyzed by HPAEC. The experiments were performed at pH 5.0 and at 50°C for 1 hour. The E/S ratio was 0.05% (0.5 μg enzyme/mg substrate) for the GH10 and GH11 xylanases and 0.2% for Gxh1 and Gxh2. After incubation the samples were boiled for 10 minutes to stop the reaction. The samples were centrifuged and the supernatants were analyzed by HPAEC.
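The enzyme-to-substrate (E/S) ratios used in these Examples are weight percentages, so the enzyme dose follows directly from the substrate mass. The helper below is a hypothetical illustration of that conversion; the 1 mg substrate amount is chosen only for the example and is not a quantity stated in the text.

```python
def enzyme_dose_ug(es_percent, substrate_mg):
    """Enzyme mass in micrograms for a given E/S ratio (% w/w) and substrate mass (mg)."""
    if es_percent < 0 or substrate_mg < 0:
        raise ValueError("E/S ratio and substrate mass must be non-negative")
    # percent -> fraction, then mg -> ug
    return es_percent / 100.0 * substrate_mg * 1000.0


# Ratios quoted in the Examples, applied to 1 mg of substrate.
for label, ratio in (("GH10/GH11 xylanase", 0.05), ("Gxh1/Gxh2", 0.2), ("16 h incubations", 0.5)):
    print(label, enzyme_dose_ug(ratio, 1.0), "ug per mg substrate")
```

By the same conversion that gives 0.5 μg/mg for 0.05% and 5 μg/mg for 0.5%, the 0.2% ratio corresponds to 2 μg enzyme per mg substrate.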
[523] High Performance Anion Exchange Chromatography
[524] The analysis was performed using a Dionex ICS-3000 HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (1 mm ID x 25 mm) and a Dionex EDet1 PAD detector (Dionex Co., Sunnyvale). A flow rate of 0.25 mL/min was used with the following gradient of sodium acetate in NaOH: 0-15 min, 16 mM NaOH; 15-20 min, 16-100 mM NaOH; 20-40 min, 0-200 mM sodium acetate in 100 mM NaOH. Each elution was followed by a washing step of 5 min 1 M sodium acetate in 100 mM NaOH and an equilibration step of 15 min 16 mM NaOH.
[525] Results
[526] Both Gxh1 and Gxh2 release xylobiose from birch wood xylan when the enzymes are used alone. When the enzymes are used in combination they show a synergistic effect in the release of xylobiose. This effect is even more pronounced when a GH10 xylanase from Myceliophthora thermophila (X10) is used in combination with Gxh1, Gxh2 or both. The addition of a GH11 xylanase from Myceliophthora thermophila C1 results in a further increase, but not synergistically, since xylanase 11 already leads to a substantial oligosaccharide release when used alone (Figure 10). The experiment also demonstrates combinations of enzymes that can be used to prepare preferentially xylobiose, whereas the addition of beta-xylosidase activity, like the Myceliophthora thermophila Bxl1, will convert all xylobiose into xylose.
[527] Furthermore, addition of accessory enzymes removing acetyl-, (4-O-methyl)glucuronyl and/or arabinosyl moieties from complex xylans to enzyme compositions containing Gxh1 or both Gxh1 and Gxh2 in combination with endo-xylanases like X10 or X11 or both, increases the yield of xylobiose.
[528] Example 20
[529] The following assays were used to measure α-galactosidase activity. This assay measures the release of galactose from pNP-α-galactoside, raffinose, stachyose and galactomannans of different sources. Other substrates were also tested in order to define the substrate specificity of the α-galactosidases.
[530] Enzyme Assay
[531] pNP-α-D-galactoside, pNP-β-D-galactoside, pNP-β-D-mannoside, pNP-α-D-xyloside, pNP-β-D-xyloside, pNP-β-D-glucoside and pNP-α-L-rhamnoside were incubated with Aga1 (CL08363; SEQ ID NO: 24) and Aga2 (CL08370; SEQ ID NO: 26) at 40°C and pH 5 for 10 minutes. The final substrate concentration was 1 mM. After exactly 10 min of incubation, 0.5 mL of 1 M sodium carbonate solution was added and the absorbance was measured at 400 nm.
[532] Sucrose, raffinose, stachyose, debranched arabinan, arabinan, wheat arabinoxylan, galactan, arabinogalactan, guar galactomannan, tara galactomannan and locust bean galactomannan were incubated with Aga1 (CL08363) and Aga2 (CL08370) at 50°C and pH 5 for 10 minutes. The final substrate concentration was 5 g/L. After exactly 10 min of incubation the released reducing sugars were analyzed according to the Nelson-Somogyi assay (Somogyi, 1952).
[533] Temperature optima were determined by incubation of pNP-α-D-galactoside with Aga1 (CL08363) and Aga2 (CL08370) at pH 5 for 10 minutes at temperatures of 40-80°C. pH optima were determined by incubation of pNP-α-D-galactoside with Aga1 (CL08363) and Aga2 (CL08370) at 40°C for 10 minutes at a pH range of 3.0-6.0.
[534] Results
[535] Among the substrates tested, Aga1 (CL08363) demonstrated activity toward pNP-α-D-galactoside and toward galactomannans derived from guar, tara and carob (Table 3). Only trace activity was detected toward raffinose and stachyose, as well as toward wheat arabinoxylan, galactan and arabinogalactan (Table 3). Aga2 (CL08370) had relatively high activity toward pNP-α-D-galactoside, a low activity toward raffinose and stachyose, and no activity toward galactomannans from different sources, arabinans, wheat arabinoxylan, galactan and arabinogalactan (Table 3). A more detailed study of the hydrolysis of stachyose by Aga2 showed that the end products of hydrolysis are raffinose, sucrose and galactose. Sucrose is not further degraded (data not shown).
[536] Aga1 has a different substrate specificity compared to Aga2. The main difference is that Aga1 possesses activity toward galactomannans, whereas Aga2 does not. Galactose was the main product of galactomannan hydrolysis by Aga1. Aga1 also has a lower specific activity toward pNP-α-D-galactoside in comparison with Aga2.
[537] The optimal temperature for both Aga1 and Aga2 was 60°C (Figures 11 and 12, respectively). The pH optimum of Aga1 was 5 (Figure 13) and the pH optimum of Aga2 was 4 (Figure 14).
[538] The pH and temperature profile and optimum of Aga1 and its specificity for seed galactomannans can be exploited in processes to modify the galactose/mannose ratios in order to improve the viscosity and thickening properties of galactomannans derived from sources like guar and tara. Both Aga1 and Aga2 are able to convert the oligosaccharides raffinose and stachyose, which in human nutrition are known to cause flatulence.
[539] Table 3. Specific activities of Aga1 and Aga2 toward different substrates. Incubations have been performed at 40 or 50°C and pH 5.
Substrate              Substrate concentration   T, °C   Aga1 (U/mg)   Aga2 (U/mg)
pNP-α-D-galactoside    1 mM                      40      26            65
pNP-β-D-galactoside    1 mM                      40      0             0
pNP-β-D-mannoside      1 mM                      40      0             0
pNP-α-D-xyloside       1 mM                      40      0             0
pNP-β-D-xyloside       1 mM                      40      0             0
pNP-β-D-glucoside      1 mM                      40      0             0
pNP-α-L-rhamnoside     1 mM                      40      0             0
Sucrose                2 mM                      50      0             0
Raffinose              2 mM                      50      0.6           0.3
Stachyose              2 mM                      50      0.2           0.7
Arabinan debranched    5 g/L                     50      0             0
Figure imgf000134_0001
[540] References: Somogyi M. Notes on sugar determination. J Biol Chem (1952) 195:19-23.
[541] Example 21
[542] The following assays were used to measure β-mannosidase activity. This assay measures the release of mannose from pNP-β-mannoside. Other substrates were also tested in order to define the substrate specificity of the β-mannosidase.
[543] Enzyme Assay
[544] pNP-β-D-mannoside, pNP-α-D-galactoside and pNP-β-D-galactoside were incubated with Man9 (CL08391; SEQ ID NO: 32) at 40°C and pH 5 for 10 minutes. The final substrate concentration was 1 mM. After exactly 10 min of incubation, 0.5 mL of 1 M sodium carbonate solution was added and the absorbance was measured at 400 nm.
[545] Carboxymethyl cellulose, barley β-glucan, birch wood xylan, konjac glucomannan, guar galactomannan, tara galactomannan, and locust bean galactomannan were incubated with Man9 (CL08391) at 35°C and pH 5 for 10 minutes. The final substrate concentration was 5 g/L. After exactly 10 min of incubation the reducing sugars released were analyzed according to the Nelson-Somogyi assay (Somogyi, 1952).
[546] The temperature optimum was determined by incubation of pNP-β-D-mannoside with Man9 (CL08391) at pH 5 for 10 minutes at temperatures in the range of 40-80°C. The pH optimum was determined by incubation of pNP-β-D-mannoside with Man9 (CL08391) at 40°C for 10 minutes at a pH range of 3.0-6.0.
[547] Results
[548] Among the tested substrates, Man9 demonstrated activity only toward pNP-β-D-mannoside (Table 4). The enzyme was not able to hydrolyze polysaccharide substrates such as gluco- or galactomannans, xylan, β-glucan and carboxymethyl cellulose.
[549] The optimal temperature for Man9 was 40°C (Figure 15) and the pH optimum of Man9 was 5.5 (Figure 16).
[550] Table 4. Specific activities of Man9 toward different substrates. Incubations have been performed at 40 or 35°C and pH5.
Figure imgf000135_0001
[551] References: Somogyi M. Notes on sugar determination. J Biol Chem (1952) 195:19-23.
[552] Example 22
[553] The following assays were used to measure β-galactosidase activity. This assay measures the release of galactose from pNP-β-D-galactoside, oNP-β-D-galactoside and lactose. Other substrates were also tested in order to define the substrate specificity of the β-galactosidase.
[554] Enzyme Assay
[555] Bga1 (CL08660; SEQ ID NO: 54) has been produced in a low background strain of Myceliophthora thermophila C1 in a similar way as described in Example 17. The crude Bga1 was used in this assay. To test the activity of Bga1, pNP-β-D-galactoside, oNP-β-D-galactoside and lactose were used as substrates. The reaction mix, with a volume of 500 μL, contained: 11-30 μg/ml Bga1, 1.42 mM pNP-β-D-galactoside or 4 mg/ml oNP-β-D-galactoside and 16 mM sodium citrate buffer at pH 5. For the reactions at pH 7, 29.8 mM sodium phosphate buffer was used. The incubation was done at 37°C. 50 μL aliquots were taken at different time points (0-10 min) and added to 150 μL Tris-HCl pH 8.8 in the wells of a microtiter plate. The absorbance was measured at 410 nm. The specific activity (U/mg) is defined as μmoles of substrate converted to product per min per milligram of protein.
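The unit definition and the aliquot dilution described above can be expressed as two small helpers. This is a minimal sketch: the function names and the product amount, time and protein mass in the usage lines are hypothetical, and only the 50 μL aliquot and 150 μL Tris-HCl volumes come from the text.

```python
def dilution_factor(aliquot_ul, diluent_ul):
    """Dilution of an aliquot after it is added to a diluent (total / aliquot)."""
    if aliquot_ul <= 0 or diluent_ul < 0:
        raise ValueError("aliquot must be positive and diluent non-negative")
    return (aliquot_ul + diluent_ul) / aliquot_ul


def units_per_mg(umol_product, minutes, protein_mg):
    """Specific activity in U/mg: umol of product formed per min per mg protein."""
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("time and protein mass must be positive")
    return umol_product / (minutes * protein_mg)


# 50 uL aliquot added to 150 uL Tris-HCl, as described in the text.
print(dilution_factor(50, 150))
# Hypothetical example: 0.05 umol product in 10 min with 0.01 mg protein.
print(units_per_mg(0.05, 10, 0.01))
```

Concentrations measured in the microtiter well therefore have to be multiplied by the dilution factor to obtain the concentration in the original reaction mix before the specific activity is calculated.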
[556] To evaluate the activity of Bga1 (CL08660) on natural substrate, concentrations of 4 and 10 mg/ml of lactose were tested. The reaction mixture (500 μL) contained an enzyme to substrate loading of 1% w/w at pH 5 (33 mM sodium citrate buffer) and pH 7 (60 mM sodium phosphate buffer), incubated at 37°C. Aliquots (50 μL) were taken at 0, 3, 5 and 10 min and the enzyme was inactivated by the addition of 50 μL of 19 mM NaOH (pH 12.3). The quantification of galactose, glucose and lactose was done by HPAEC analysis.
[557] High Performance Anion Exchange Chromatography
[558] The analysis was performed using a Dionex ICS-3000 HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (1 mm ID x 25 mm) and a Dionex EDet1 PAD detector (Dionex Co., Sunnyvale). A flow rate of 0.25 mL/min was used with the following gradient of sodium acetate in NaOH: 0-15 min, 16 mM NaOH; 15-20 min, 16-100 mM NaOH; 20-40 min, 0-200 mM sodium acetate in 100 mM NaOH. Each elution was followed by a washing step of 5 min 1 M sodium acetate in 100 mM NaOH and an equilibration step of 15 min 16 mM NaOH.
[559] Results
[560] With pNP-β-D-galactoside, the Bga1 preparation showed slightly higher activities at pH 5 and pH 7 compared to oNP-β-D-galactoside. These results may be due to the lower substrate concentration of oNP-β-D-galactoside (4 mg/ml) (Table 5).
[561] Bga1 activity was tested on two concentrations of the natural substrate lactose. At 4 mg/ml the crude Bga1 preparation showed activities of 1.7 and 1.3 U/mg at pH 5 and 7, respectively. At 10 mg/ml of lactose, Bga1 activity was slightly lower at pH 7.
[562] In summary, Bga1 had a higher activity under acidic conditions (pH 5) than at neutral pH (pH 7), regardless of the substrate.
[563] Table 5: Overview of the activity of β-galactosidase Bga1 towards different substrates.
Figure imgf000137_0001
[564] While various embodiments of the present invention have been described in detail, it is apparent that modifications and adaptations of those embodiments will occur to those skilled in the art. It is to be expressly understood, however, that such modifications and adaptations are within the scope of the present invention, as set forth in the following exemplary claims.

Claims

WHAT IS CLAIMED IS:
1. An isolated nucleic acid molecule comprising a nucleic acid sequence selected from the group consisting of:
a) a nucleic acid sequence encoding a protein comprising an amino acid sequence selected from the group consisting of the amino acid sequences SEQ ID No: 54; SEQ ID No: 56; SEQ ID No: 20; SEQ ID No: 22; SEQ ID No: 2; SEQ ID No: 18; SEQ ID No: 48; SEQ ID No: 46; SEQ ID No: 8; SEQ ID No: 10; SEQ ID No: 14; SEQ ID No: 12; SEQ ID No: 16; SEQ ID No: 24; SEQ ID No: 26; SEQ ID No: 28; SEQ ID No: 32; SEQ ID No: 30; SEQ ID No: 40; SEQ ID No: 36; SEQ ID No: 38; SEQ ID No: 4; SEQ ID No: 2; SEQ ID No: 6; SEQ ID No: 16; SEQ ID No: 34; SEQ ID No: 42; SEQ ID No: 44; SEQ ID No: 50; SEQ ID No: 52; SEQ ID No: 58; SEQ ID No: 60; SEQ ID No: 62; and SEQ ID No: 64;
b) a nucleic acid sequence encoding a fragment of the protein of (a), wherein the fragment has a biological activity of the protein of (a); and
c) a nucleic acid sequence encoding an amino acid sequence that is at least about 70% identical to an amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
2. The isolated nucleic acid molecule of claim 1, wherein said nucleic acid sequence encodes an amino acid sequence that is at least about 90% identical to the amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
3. The isolated nucleic acid molecule of claim 1, wherein said nucleic acid sequence encodes an amino acid sequence that is at least about 95% identical to the amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
4. The isolated nucleic acid molecule of claim 1, wherein said nucleic acid sequence encodes an amino acid sequence that is at least about 97% identical to the amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
5. The isolated nucleic acid molecule of claim 1, wherein said nucleic acid sequence encodes an amino acid sequence that is at least about 99% identical to the amino acid sequence of (a) and has a biological activity of the protein comprising the amino acid sequence.
6. The isolated nucleic acid molecule of claim 1, wherein said nucleic acid sequence encodes a protein comprising an amino acid sequence selected from the group consisting of: SEQ ID No: 54; SEQ ID No: 56; SEQ ID No: 20; SEQ ID No: 22; SEQ ID No: 2; SEQ ID No: 18; SEQ ID No: 48; SEQ ID No: 46; SEQ ID No: 8; SEQ ID No: 10; SEQ ID No: 14; SEQ ID No: 12; SEQ ID No: 16; SEQ ID No: 24; SEQ ID No: 26; SEQ ID No: 28; SEQ ID No: 32; SEQ ID No: 30; SEQ ID No: 40; SEQ ID No: 36; SEQ ID No: 38; SEQ ID No: 4; SEQ ID No: 6; SEQ ID No: 34; SEQ ID No: 42; SEQ ID No: 44; SEQ ID No: 50; SEQ ID No: 52; SEQ ID No: 58; SEQ ID No: 60; SEQ ID No: 62; and SEQ ID No: 64.
7. The isolated nucleic acid molecule of claim 1, wherein said nucleic acid sequence comprises a nucleic acid sequence selected from the group consisting of: the nucleic acid sequences of SEQ ID No: 53; SEQ ID No: 55; SEQ ID No: 19; SEQ ID No: 21; SEQ ID No: 1; SEQ ID No: 17; SEQ ID No: 47; SEQ ID No: 45; SEQ ID No: 7; SEQ ID No: 9; SEQ ID No: 13; SEQ ID No: 11; SEQ ID No: 15; SEQ ID No: 23; SEQ ID No: 25; SEQ ID No: 27; SEQ ID No: 31; SEQ ID No: 29; SEQ ID No: 39; SEQ ID No: 35; SEQ ID No: 37; SEQ ID No: 3; SEQ ID No: 5; SEQ ID No: 33; SEQ ID No: 41; SEQ ID No: 43; SEQ ID No: 49; SEQ ID No: 51; SEQ ID No: 57; SEQ ID No: 59; SEQ ID No: 61; and SEQ ID No: 63.
8. An isolated nucleic acid molecule comprising a nucleic acid sequence that is fully complementary to the nucleic acid sequence of the nucleic acid molecule of any one of Claims 1 to 7.
9. An isolated protein comprising an amino acid sequence encoded by the nucleic acid molecule of any one of Claims 1 to 7.
10. An isolated fusion protein comprising the isolated protein of Claim 9 fused to a protein comprising an amino acid sequence that is heterologous to the isolated protein of Claim 9.
11. An isolated antibody or antigen binding fragment thereof that selectively binds to the protein of Claim 9.
12. A kit for degrading a lignocellulosic material to fermentable sugars comprising at least one isolated protein of Claim 9.
13. A detergent comprising at least one isolated protein of Claim 9.
14. A composition for the degradation of a lignocellulosic material comprising at least one isolated protein of Claim 9.
15. A recombinant nucleic acid molecule comprising the isolated nucleic acid molecule of any one of Claims 1 to 7, operatively linked to at least one expression control sequence.
16. The recombinant nucleic acid molecule of Claim 15, wherein the recombinant nucleic acid molecule comprises an expression vector.
17. The recombinant nucleic acid molecule of Claim 15, wherein the recombinant nucleic acid molecule comprises a targeting vector.
18. An isolated host cell transfected with the nucleic acid molecule of any one of Claims 1 to 7.
19. The isolated host cell of Claim 18, wherein the host cell is selected from the group consisting of: a fungal cell, a plant cell, an algal cell, and a bacterium.
20. The isolated host cell of Claim 18, wherein the host cell is selected from the group consisting of: yeast, mushroom, or a filamentous fungus.
21. The isolated host cell of Claim 18, wherein the filamentous fungus is from a genus selected from the group consisting of: Chrysosporium, Thielavia, Talaromyces, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium, Humicola, and Trichoderma, and anamorphs and teleomorphs thereof.
22. The isolated host cell of Claim 18, wherein the host cell is a bacterium.
23. An oligonucleotide consisting essentially of at least 12 consecutive nucleotides of a nucleic acid sequence selected from the group consisting of the nucleic acid sequence of SEQ ID No: 53; SEQ ID No: 55; SEQ ID No: 19; SEQ ID No: 21; SEQ ID No: 1; SEQ ID No: 17; SEQ ID No: 47; SEQ ID No: 45; SEQ ID No: 7; SEQ ID No: 9; SEQ ID No: 13; SEQ ID No: 11; SEQ ID No: 15; SEQ ID No: 23; SEQ ID No: 25; SEQ ID No: 27; SEQ ID No: 31; SEQ ID No: 29; SEQ ID No: 39; SEQ ID No: 35; SEQ ID No: 37; SEQ ID No: 3; SEQ ID No: 5; SEQ ID No: 33; SEQ ID No: 41; SEQ ID No: 43; SEQ ID No: 49; SEQ ID No: 51; SEQ ID No: 57; SEQ ID No: 59; SEQ ID No: 61; and SEQ ID No: 63, or the complement thereof.
24. A kit comprising at least one oligonucleotide of claim 23.
25. A method for producing the protein of Claim 9, comprising culturing a cell that has been transfected with a nucleic acid molecule comprising a nucleic acid sequence encoding the protein, and expressing the protein with the transfected cell.
26. The method of Claim 25, further comprising recovering the protein from the cell or from a culture comprising the cell.
27. A genetically modified organism comprising components suitable for degrading a lignocellulosic material to fermentable sugars, wherein the organism has been genetically modified to express at least one protein of Claim 9.
28. The genetically modified organism of Claim 27, wherein the genetically modified organism is selected from the group consisting of: plants, algae, fungi, and bacteria.
29. The genetically modified organism of Claim 28, wherein the fungus is selected from the group consisting of: yeast, mushroom and filamentous fungus.
30. The genetically modified organism of Claim 29, wherein the filamentous fungus is from a genus selected from the group consisting of: Chrysosporium, Thielavia, Neurospora, Aureobasidium, Filibasidium, Piromyces, Corynascus, Cryptococcus, Acremonium, Tolypocladium, Scytalidium, Schizophyllum, Sporotrichum, Penicillium, Talaromyces, Gibberella, Myceliophthora, Mucor, Aspergillus, Fusarium, Humicola, and Trichoderma.
31. The genetically modified organism of Claim 29, wherein the filamentous fungus is selected from the group consisting of: Trichoderma reesei, Chrysosporium lucknowense, Myceliophthora thermophila, Aspergillus niger, Aspergillus nidulans, Aspergillus oryzae, Aspergillus aculeatus, Aspergillus japonicus, Penicillium canescens, Penicillium solitum, Penicillium funiculosum, Talaromyces emersonii, and Talaromyces flavus.
32. The genetically modified organism of Claim 27, wherein the organism has been genetically modified to express at least one additional enzyme.
33. The genetically modified organism of Claim 32, wherein the additional enzyme is an accessory enzyme selected from the group consisting of: cellulase, glucosidase, xylanase, xylosidase, ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, glucomannase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate lyase, chitosanases, exo-β-D-glucosaminidase, cellobiose dehydrogenase, and acetylxylan esterase.
34. The genetically modified organism of Claim 27, wherein the genetically modified organism is a plant.
35. A recombinant enzyme isolated from the genetically modified microorganism of any one of claims 27 to 34.
36. The recombinant enzyme of claim 35, wherein the enzyme has been subjected to a purification step.
37. A crude fermentation product produced by culturing the cells from the genetically modified organism of any one of claims 27 to 34, wherein the crude fermentation product contains the at least one protein of Claim 9.
38. A multi-enzyme composition comprising enzymes produced by the genetically modified organism of any one of Claims 27 to 34, and recovered therefrom.
39. A multi-enzyme composition comprising at least one protein of Claim 9, and at least one additional protein for degrading a lignocellulosic material or a fragment thereof that has biological activity.
40. The multi-enzyme composition of Claim 39, wherein the composition comprises at least one cellobiohydrolase, at least one xylanase, at least one endoglucanase, at least one β-glucosidase, at least one β-xylosidase, and at least one accessory enzyme.
41. The multi-enzyme composition of Claim 39, wherein between about 50% and about 70% of the enzymes in the composition are cellobiohydrolases.
42. The multi-enzyme composition of Claim 39, wherein between about 10% and about 30% of the enzymes in the composition are xylanases.
43. The multi-enzyme composition of Claim 39, wherein between about 5% and about 15% of the enzymes in the composition are endoglucanases.
44. The multi-enzyme composition of Claim 39, wherein between about 1% and about 5% of the enzymes in the composition are β-glucosidases.
45. The multi-enzyme composition of Claim 39, wherein between about 1% and about 3% of the enzymes in the composition are β-xylosidases.
46. The multi-enzyme composition of Claim 39, wherein the composition comprises about 60% cellobiohydrolases, about 20% xylanases, about 10% endoglucanases, about 3% β-glucosidases, about 2% β-xylosidases, and about 5% accessory enzymes.
47. The multi-enzyme composition of Claim 40 or Claim 46, wherein the xylanases are selected from the group consisting of: endoxylanases, exoxylanases, and β-xylosidases.
48. The multi-enzyme composition of Claim 40 or Claim 46, wherein the accessory enzymes include an enzyme selected from the group consisting of: cellulase, glucosidase, xylanase, xylosidase, ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, glucomannase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate lyase, chitosanases, exo-β-D-glucosaminidase, cellobiose dehydrogenase, and acetylxylan esterase.
49. The multi-enzyme composition of any one of Claims 38 to 48, wherein the multi-enzyme composition comprises at least one hemicellulase.
50. The multi-enzyme composition of Claim 49, wherein the hemicellulase is selected from the group consisting of a xylanase, an arabinofuranosidase, an acetyl xylan esterase, a glucuronidase, an endo-galactanase, a mannanase, an endo arabinase, an exo arabinase, an exo-galactanase, a ferulic acid esterase, a galactomannanase, a xyloglucanase, and mixtures thereof.
51. The multi-enzyme composition of Claim 50, wherein the xylanase is selected from the group consisting of endoxylanases, exoxylanase, and β-xylosidase.
52. The multi-enzyme composition of any one of Claims 38 to 51, wherein the multi-enzyme composition comprises at least one cellulase.
53. The multi-enzyme composition of any one of Claims 38 to 51, wherein the composition is a crude fermentation product.
54. The multi-enzyme composition of any one of Claims 38 to 51, wherein the composition is a crude fermentation product that has been subjected to a purification step.
55. The multi-enzyme composition of any one of Claims 38 to 51 further comprising one or more accessory enzymes.
56. The multi-enzyme composition of Claim 55, wherein the accessory enzymes include at least one enzyme selected from the group consisting of: cellulase, glucosidase, xylanase, xylosidase, ligninase, glucuronidase, arabinofuranosidase, arabinase, arabinogalactanase, ferulic acid esterase, lipase, pectinase, glucomannase, amylase, laminarinase, xyloglucanase, galactanase, galactosidase, glucoamylase, pectate lyase, chitosanases, exo-β-D-glucosaminidase, cellobiose dehydrogenase, and acetylxylan esterase.
57. The multi-enzyme composition of Claim 55, wherein the accessory enzyme is selected from the group consisting of a glucoamylase, a pectinase, and a ligninase.
58. The multi-enzyme composition of Claim 55, wherein the accessory enzyme is a glucoamylase.
59. The multi-enzyme composition of Claim 55, wherein the accessory enzyme is added as a crude or a semi-purified enzyme mixture.
60. The multi-enzyme composition of Claim 55, wherein the accessory enzyme is produced by culturing at least one organism on a substrate to produce the enzyme.
61. A multi-enzyme composition comprising at least one protein of Claim 9, and at least one additional protein for degrading an arabinoxylan-containing material or a fragment thereof that has biological activity.
62. The multi-enzyme composition of Claim 61, wherein the composition comprises at least one endoxylanase, at least one β-xylosidase, and at least one arabinofuranosidase.
63. The multi-enzyme composition of Claim 62, wherein the at least one arabinofuranosidase comprises an arabinofuranosidase with specificity towards single substituted xylose residues, an arabinofuranosidase with specificity towards double substituted xylose residues, or a combination thereof.
64. A method for degrading a lignocellulosic material to fermentable sugars, comprising contacting the lignocellulosic material with at least one isolated protein of Claim 9.
65. The method of Claim 64, further comprising contacting the lignocellulosic material with at least one additional isolated protein comprising an amino acid sequence that is at least about 95% identical to an amino acid sequence selected from the group consisting of the amino acid sequences of Table 1 and Table 2, wherein the at least one additional protein has cellulolytic enhancing activity.
66. The method of Claim 64, wherein the isolated protein is part of a multi-enzyme composition.
67. A method for degrading a lignocellulosic material to fermentable sugars, comprising contacting the lignocellulosic material with at least one multi-enzyme composition of any one of Claims 38 to 63.
68. A method for producing an organic substance, comprising:
saccharifying a lignocellulosic material with a multi-enzyme composition of any one of Claims 38 to 63;
fermenting the saccharified lignocellulosic material obtained with one or more fermenting microorganisms; and
recovering the organic substance from the fermentation.
69. The method of claim 68, wherein the steps of saccharifying and fermenting are performed simultaneously.
70. The method of claim 68, wherein the organic substance is an alcohol, organic acid, ketone, amino acid, or gas.
71. The method of claim 68, wherein the organic substance is an alcohol.
72. The method of claim 71, wherein the alcohol is ethanol.
73. The method of any one of Claims 64 to 72, wherein the lignocellulosic material is selected from the group consisting of herbaceous material, agricultural residue, forestry residue, municipal solid waste, waste paper, and pulp and paper mill residue.
74. The method of any one of Claims 64 to 72, wherein the lignocellulosic material is distiller's dried grains or distiller's dried grains with solubles.
75. The method of any one of Claims 64 to 72, wherein the distiller's dried grains or distiller's dried grains with solubles is derived from corn.
76. A method for degrading a lignocellulosic material consisting of distiller's dried grains or distiller's dried grains with solubles to sugars, the method comprising contacting the distiller's dried grains or distiller's dried grains with solubles with a multi-enzyme composition, whereby at least about 10% of the fermentable sugars are liberated, wherein the multi-enzyme composition is the multi-enzyme composition of any one of Claims 38 to 60.
77. The method of Claim 76, whereby at least about 15% of the sugars are liberated.
78. The method of claim 76, whereby at least about 20% of the sugars are liberated.
79. The method of claim 76, whereby at least about 23% of the sugars are liberated.
80. The method of claim 76, wherein the distiller's dried grains or distiller's dried grains with solubles is derived from corn.
81. The method of any one of Claims 64 to 80, further comprising a pretreatment process for pretreating the lignocellulosic material.
82. The method of Claim 81, wherein the pretreatment process is selected from the group consisting of physical treatment, metal ion, ultraviolet light, ozone, organosolv treatment, steam explosion treatment, lime impregnation with steam explosion treatment, hydrogen peroxide treatment, hydrogen peroxide/ozone (peroxone) treatment, acid treatment, dilute acid treatment, and base treatment.
83. The method of Claim 81, wherein the pretreatment process is selected from the group consisting of organosolv, steam explosion, heat treatment and AFEX.
84. The method of Claim 83, wherein the heat treatment comprises heating the lignocellulosic material to 121°C for 15 minutes.
85. The method of any one of Claims 64 to 84, further comprising detoxifying the lignocellulosic material.
86. The method of any one of Claims 64 to 85, further comprising recovering the fermentable sugar.
87. The method of Claim 86, wherein the sugar is selected from the group consisting of glucose, xylose, arabinose, galactose, mannose, rhamnose, sucrose and fructose.
88. The method of any one of Claims 64 to 87, further comprising recovering the contacted lignocellulosic material after the fermentable sugars are degraded.
89. A feed additive comprising the recovered lignocellulosic material of Claim 88.
90. The feed additive of Claim 89, wherein the protein content of the recovered lignocellulosic material is higher than that of the starting lignocellulosic material.
91. A method of improving the performance of an animal which comprises administering to the animal the feed additive of Claim 89.
92. A method for improving the nutritional quality of an animal feed comprising adding the feed additive of Claim 89 to an animal feed.
93. A method for stonewashing a fabric, comprising contacting the fabric with at least one isolated protein of Claim 9.
94. A method for stonewashing a fabric, comprising contacting the fabric with at least one multi-enzyme composition of any one of Claims 38 to 63.
95. The method of Claim 93 or Claim 94, wherein the fabric is denim.
96. A method for enhancing the softness or feel of a fabric or depilling a fabric, comprising contacting the fabric with at least one isolated protein of Claim 9 or a fragment of at least one isolated protein of Claim 9 comprising a cellulose binding module (CBM) of the protein.
97. A method for enhancing the softness or feel of a fabric or depilling a fabric, comprising contacting the fabric with at least one multi-enzyme composition of any one of Claims 38 to 63.
98. A method for restoring color to or brightening a fabric, comprising contacting the fabric with at least one isolated protein of Claim 9.
99. A method for restoring color to or brightening a fabric, comprising contacting the fabric with at least one multi-enzyme composition of any one of Claims 38 to 63.
100. A method of biopolishing, defibrillating, bleaching, dyeing or desizing a fabric, comprising contacting the fabric with at least one isolated protein of Claim 9.
101. A method of biopolishing, defibrillating, bleaching, dyeing or desizing a fabric, comprising contacting the fabric with at least one multi-enzyme composition of any one of Claims 38 to 63.
102. A method of biorefining, deinking or biobleaching paper or pulp, comprising contacting the paper or pulp with at least one isolated protein of Claim 9.
103. A method of biorefining, deinking or biobleaching paper or pulp, comprising contacting the paper or pulp with at least one multi-enzyme composition of any one of Claims 38 to 63.
104. A method for enhancing the cleaning ability of a detergent composition, comprising adding at least one isolated protein of Claim 9 to the detergent composition.
105. A method for enhancing the cleaning ability of a detergent composition, comprising adding at least one multi-enzyme composition of any one of Claims 38 to 63 to the detergent composition.
106. A detergent composition, comprising at least one isolated protein of Claim 9 and at least one surfactant.
107. A detergent composition, comprising at least one multi-enzyme composition of any one of Claims 38 to 63 and at least one surfactant.
108. A method for releasing cellular contents comprising contacting a cell with at least one isolated protein of Claim 9.
109. The method of claim 108, wherein the cell is selected from the group consisting of: a bacterium, an algal cell, a fungal cell or a plant cell.
110. The method of claim 108, wherein the cell is an algal cell.
111. The method of claim 108, wherein contacting the cell with at least one isolated protein of Claim 9 degrades the cell wall.
112. The method of claim 108, wherein the cellular contents are selected from the group consisting of: alcohols and oils.
113. A composition for degrading cell walls comprising at least one isolated protein of Claim 9.
114. A method for improving the nutritional quality of food comprising adding to the food at least one isolated protein of Claim 9.
115. A method for improving the nutritional quality of food comprising pretreating the food with at least one isolated protein of Claim 9.
116. A method for improving the nutritional quality of animal feed comprising adding to the animal feed at least one isolated protein of Claim 9.
117. A method for improving the nutritional quality of animal feed comprising pretreating the feed with at least one isolated protein of Claim 9.
118. A genetically modified organism comprising at least one nucleic acid molecule encoding at least one protein of Claim 9, in which the activity of one or more of the proteins of claim 9 is upregulated, the activity of one or more of the proteins of claim 9 is downregulated, or the activity of one or more of the proteins of claim 9 is upregulated and the activity of one or more of the proteins of claim 9 is downregulated.
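By way of illustration only, the proportions recited in Claims 41 to 46 can be checked arithmetically. The Python sketch below is not part of the claims. It assumes that the "about" limits of Claims 41 to 45 may be treated as hard bounds, which is a simplification, and the names `CLAIMED_RANGES`, `CLAIM_46` and `out_of_range` are hypothetical. It shows that the approximate composition of Claim 46 sums to 100% and that each enzyme class falls within the corresponding range.

```python
# Ranges from Claims 41 to 45, in percent of the enzymes in the composition.
# Assumption: the claims say "about", so treating these as hard bounds is a
# simplification made only for this illustration.
CLAIMED_RANGES = {
    "cellobiohydrolases": (50.0, 70.0),  # Claim 41
    "xylanases": (10.0, 30.0),           # Claim 42
    "endoglucanases": (5.0, 15.0),       # Claim 43
    "beta-glucosidases": (1.0, 5.0),     # Claim 44
    "beta-xylosidases": (1.0, 3.0),      # Claim 45
}

# Approximate composition recited in Claim 46, in percent.
CLAIM_46 = {
    "cellobiohydrolases": 60.0,
    "xylanases": 20.0,
    "endoglucanases": 10.0,
    "beta-glucosidases": 3.0,
    "beta-xylosidases": 2.0,
    "accessory enzymes": 5.0,
}


def out_of_range(composition):
    """Return the enzyme classes whose share lies outside CLAIMED_RANGES."""
    return [
        name
        for name, (low, high) in CLAIMED_RANGES.items()
        if not low <= composition.get(name, 0.0) <= high
    ]


if __name__ == "__main__":
    print("Total percent:", sum(CLAIM_46.values()))
    print("Classes outside the ranges:", out_of_range(CLAIM_46))
```

Accessory enzymes are left out of `CLAIMED_RANGES` because Claims 41 to 45 recite no range for them.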
PCT/US2011/045949 2010-07-31 2011-07-29 Novel fungal enzymes WO2012018691A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US36968810P 2010-07-31 2010-07-31
US61/369,688 2010-07-31

Publications (2)

Publication Number Publication Date
WO2012018691A2 (en) 2012-02-09
WO2012018691A3 (en) 2012-05-03

Family

ID=45560002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2011/045949 WO2012018691A2 (en) 2010-07-31 2011-07-29 Novel fungal enzymes

Country Status (1)

Country Link
WO (1) WO2012018691A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090099079A1 (en) * 2007-09-07 2009-04-16 Emalfarb Mark A Novel Fungal Enzymes
US20090280105A1 (en) * 2007-08-02 2009-11-12 Dyadic International, Inc. Novel Fungal Enzymes

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DATABASE UNIPROT [Online] 23 March 2010 Database accession no. Q2GUG7_CHAGB *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10066220B2 (en) 2012-05-31 2018-09-04 Novozymes A/S Polypeptides having organophosphorous hydrolase activity
US9512413B2 (en) 2012-05-31 2016-12-06 Novozymes A/S Polypeptides having organophosphorous hydrolase activity
WO2014086976A1 (en) * 2012-12-07 2014-06-12 Novozymes A/S Improving drainage of paper pulp
EP2740840A1 (en) * 2012-12-07 2014-06-11 Novozymes A/S Improving drainage of paper pulp
US10077531B2 (en) 2012-12-07 2018-09-18 Novozymes A/S Improving drainage of paper pulp
US10655115B2 (en) 2013-02-04 2020-05-19 Dsm Ip Assets B.V. Carbohydrate degrading polypeptide and uses thereof
US10316305B2 (en) 2013-02-04 2019-06-11 Dsm Ip Assets B.V. Carbohydrate degrading polypeptide and uses thereof
US9988615B2 (en) 2013-02-04 2018-06-05 Dsm Ip Assets B.V. Carbohydrate degrading polypeptide and uses thereof
US11447759B2 (en) 2013-02-04 2022-09-20 Dsm Ip Assets B.V. Carbohydrate degrading polypeptide and uses thereof
EP3202900A1 (en) * 2013-02-04 2017-08-09 DSM IP Assets B.V. Carbohydrate degrading polypeptide and uses thereof
US9834763B2 (en) 2013-05-20 2017-12-05 Abengoa Bioenergia Nuevas Tecnologias, S.A. Expression of recombinant beta-xylosidase enzymes
WO2015011319A1 (en) 2013-07-22 2015-01-29 Abengoa Bioenergía Nuevas Tecnologías, S. A. Myceliophthora thermophila host cell expressing a heterologous alpha-xilosidase enzyme and use thereof in a method for the degradation biomass
WO2015039962A1 (en) * 2013-09-20 2015-03-26 Novozymes A/S Enzymatic bleaching of paper pulp
WO2017017292A1 (en) 2015-07-29 2017-02-02 Abengoa Bioenergía Nuevas Tecnologías, S. A. Expression of recombinant beta-xylosidase enzymes
CN109295131A (en) * 2017-12-04 2019-02-01 合肥工业大学 A kind of receptor positioning solid phase enzymolysis preparation of dendrobium nobile activated oligosaccharide
CN110014037A (en) * 2019-05-10 2019-07-16 湖南泰谷生态工程有限公司 A kind of combined remediation method of copper polluted soil
CN111019923A (en) * 2020-01-06 2020-04-17 河南新仰韶生物科技有限公司 Preparation method for improving refining effect of enzyme preparation for food industry
CN111019923B (en) * 2020-01-06 2022-06-17 河南新仰韶生物科技有限公司 Preparation method for improving refining effect of enzyme preparation for food industry
CN114606143A (en) * 2020-12-08 2022-06-10 青岛蔚蓝康成生物科技有限公司 Trichoderma reesei mutant strain capable of producing rhamnosidase in high yield and application of trichoderma reesei mutant strain
WO2023220060A1 (en) * 2022-05-11 2023-11-16 C16 Biosciences, Inc. Enzymatic lysis for extraction of bioproducts from yeast

Also Published As

Publication number Publication date
WO2012018691A3 (en) 2012-05-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 11815117; Country of ref document: EP; Kind code of ref document: A2)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 11815117; Country of ref document: EP; Kind code of ref document: A2)