WO2014060379A1

WO2014060379A1 - Cell wall deconstruction enzymes of Myceliophthora fergusii (Corynascus thermophilus) and uses thereof

Info

Publication number: WO2014060379A1
Application number: PCT/EP2013/071465
Authority: WO
Inventors: Adrian Tsang; Justin Powlowski; Gregory Butler
Original assignee: DSM IP Assets B.V.; Concordia University
Priority date: 2012-10-16
Filing date: 2013-10-15
Publication date: 2014-04-24

Abstract

The present invention relates to novel Myceliophthora fergusii (Corynascus thermophilus) enzymes or proteins for cell wall deconstruction, polynucleotide sequences encoding the polypeptides according to the invention, a production process for the enzymes according to the invention and the use of the enzymes according to the invention in various industrial processes.

Description

CELL WALL DECONSTRUCTION ENZYMES OF MYCELIOPHTHORA FERGUSII (CORYNASCUS THERMOPHILUS) AND USES THEREOF

Field of the invention

The invention relates to newly identified polynucleotide sequences comprising genes that encode novel cell wall deconstruction enzymes. The enzymes may be isolated from the fungus, Myceliophthora fergusii (Corynascus thermophilus) strain CBS 405.69. The invention features the full length coding sequences of the novel genes, the genomic sequences of each gene, as well as the amino acid sequences of the full-length functional proteins and functional equivalents of the genes or the amino acid sequences. The invention also relates to methods of using these proteins in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins and cells wherein a protein according to the invention is genetically modified to enhance or reduce its activity and/or level of expression.

Background of the invention

Cell wall deconstruction enzymes have a number of industrial applications, for example: in food applications, such as cereal-based food products; in the textile industry, such as for the treatment of cellulose-based fabrics; in the feed-enzyme industry, such as for increasing the digestibility of nutrients; in the pulp and paper industry, such as for enhancing bleachability of the pulp; in the waste treatment industry, such as for the decolorization of synthetic dyes; and in the bioethanol industry, such as for improving the ethanol yield and increasing the efficiency and economy of ethanol production.

The conversion of biomass into second-generation biofuels, driven by the limited availability of fossil fuels, is heavily dependent on inexpensive and effective enzymes for the conversion of lignocellulose to ethanol. Cellulase enzyme cocktails require the concerted action of endoglucanases, cellobiohydrolases, and beta-glucosidases. The current cost of cellulase enzymes is too high for bioethanol to compete economically with fossil fuels. Lower cellulase costs may result from the discovery of cellulase enzymes with higher specific activity, lower production costs, or greater compatibility with processing conditions, including temperature, pH and the presence of inhibitors that occur in the biomass or are produced as the result of biomass pre-treatment.

Conversion of plant biomass to glucose may also be enhanced by supplementing cellulase cocktails with enzymes that degrade the other components of biomass, including hemicelluloses, pectins and lignins, and their linkages, to improve the accessibility of cellulose to the cellulase enzymes. These enzymes include: xylanases, mannanases, arabinanases, esterases, glucuronidases, xyloglucanases and arabinofuranosidases for hemicelluloses; lignin peroxidases, manganese-dependent peroxidases, versatile peroxidases, and laccases for lignin; and pectate lyase, pectin lyase, polygalacturonase, pectin acetyl esterase, alpha-arabinofuranosidase, beta-galactosidase, galactanase, arabinanase, rhamnogalacturonase, rhamnogalacturonan lyase, rhamnogalacturonan acetyl esterase, xylogalacturonosidase and xylogalacturonase for pectin. Additionally, glycoside hydrolase family 61 (GH61) proteins have been shown to stimulate the activity of cellulase preparations.

The enzymes described may also be useful for other purposes in processing biomass. The lignin modifying enzymes may be used to alter the structure of lignin to produce novel materials, and hemicellulases may be employed to produce 5-carbon sugars from hemicelluloses, which may then be further converted to chemical products.

There is also a need for improved enzymes for feed and food processing applications. Cereal-based food products such as pasta, noodles and bread can be prepared from a dough which is usually made from the basic ingredients (cereal) flour, water and optionally salt. As a result of a consumer-driven need to replace the chemical additives by more natural products, several enzymes have been developed with dough and/or cereal-based food product improving properties and which are used in all possible combinations depending on the specific application conditions. Suitable enzymes include xylanase, starch degrading enzymes, oxidizing enzymes, fatty material splitting enzymes, protein degrading, modifying or crosslinking enzymes. Many of these enzymes are also used for treating animal feed or animal feed additives, to make them more digestible or to improve their nutritional quality. Amylases are used for the conversion of plant starches to glucose. Pectin-active enzymes are used in fruit processing, for example to increase the yield of juices, and in fruit juice clarification, as well as in other food processing steps.

Object of the invention

It is an object of the invention to provide novel polynucleotides encoding novel cell wall deconstruction enzymes. A further object is to provide naturally and recombinantly produced cell wall deconstruction enzymes as well as recombinant strains producing these. Also fusion polypeptides are part of the invention as well as methods of making and using the polynucleotides and polypeptides according to the invention.

Summary of the invention

The invention provides for a novel process for degrading biomass or pretreated biomass to sugars wherein an enzyme is used comprising a polypeptide having:

a. a polypeptide sequence as set forth in any one of SEQ ID NOs: 3, 6, 9, 12 and 15;

b. a polypeptide that is at least 60%, preferably at least 70%, more preferably at least 80%, even more preferably at least 90%, 95%, 96%, 97%, 98% or 99% homologous to any one of SEQ ID NOs: 3, 6, 9, 12 and 15;

c. a polypeptide sequence encoded by a nucleic acid sequence as set forth in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 or by nucleic acids that are at least 60%, preferably at least 70%, more preferably at least 80%, even more preferably at least 90%, 95%, 96%, 97%, 98% or 99% homologous to any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14;

d. a polypeptide sequence encoded by a nucleic acid sequence hybridizing under stringent conditions to the polynucleotide as set forth in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14; or

e. a polypeptide sequence encoded by a nucleic acid sequence hybridizing under stringent conditions to the reverse complement of a polynucleotide as set forth in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14.

According to an aspect of the invention, in the present process for degrading biomass or pretreated biomass to sugars the polypeptide or enzyme is a cellulase, preferably a beta-glucosidase, an arabinan endo-1,5-alpha-L-arabinosidase, a 1,4-beta-xylosidase, an acetylxylan esterase or an endo-1,4-beta-xylanase. Preferably the polypeptide is obtainable from a fungus; in particular Myceliophthora (Corynascus) is preferred, and even more preferred is Myceliophthora fergusii (Corynascus thermophilum). As used herein, the expressions "Myceliophthora (Corynascus)" and "Myceliophthora fergusii (Corynascus thermophilum)" are meant to reflect the recent proposed changes in the taxonomy of all existing Corynascus species, which should be renamed to Myceliophthora according to phylogenic studies by van den Brink et al., ("Phylogeny of the industrial relevant, thermophilic genera Myceliophthora and Corynascus", Fungal Diversity (2012), 52:197-207). However, regardless of taxonomic classification, a person of skill in the art would be able to identify the organism used to determine the sequences disclosed herein, for example based on the strain's accession number (CBS 405.69).

Preferably in the process of the invention for degrading biomass or pretreated biomass to sugars, the formed sugars are converted into ethanol. According to another aspect of the invention in the process of the invention cellulase or cellulases are added. According to a further aspect of the invention in the process of the invention the cellulolytic material or lignin is pretreated.

The examples of activities of enzymes according to the invention are herein intended to at least cover any of the following:

• Enzymes that hydrolyze cellulose, including endoglucanases ((E.C. 3.2.1.4) hydrolyze the beta-1,4-linkages between glucose units); exoglucanases, also known as cellobiohydrolases 1 and 2 ((E.C. 3.2.1.91) hydrolyze cellobiose, a glucose disaccharide, from the reducing and non-reducing ends of cellulose); and beta-glucosidases ((E.C. 3.2.1.21) hydrolyze the beta-1,4 glycoside bond of cellobiose to glucose).

• Glycoside hydrolase family 61 (GH61) proteins, which enhance the action of cellulase enzymes on lignocellulose substrates.

• Enzymes that degrade or modify xylan and/or xylan-lignin complexes including xylanase ((E.C. 3.2.1.8) catalyzes random cleavage of beta-1,4 bonds in xylan or xyloglucan), xylan 1,4-beta-xylosidase ((EC 3.2.1.37) catalyzes hydrolysis of 1,4-beta-D-xylans, to remove successive D-xylose residues from the non-reducing terminals, and also cleaves xylobiose), alpha-arabinofuranosidase ((EC 3.2.1.55) hydrolyzes terminal non-reducing alpha-L-arabinofuranoside residues in alpha-L-arabinosides including arabinoxylans and arabinogalactans), alpha-glucuronidase ((EC 3.2.1.139) hydrolyzes an alpha-D-glucuronoside to the corresponding alcohol and D-glucuronate), feruloyl esterase ((EC 3.1.1.73) catalyzes hydrolysis of the 4-hydroxy-3-methoxycinnamoyl (feruloyl) group from an esterified sugar, which is usually arabinose in natural substrates), and acetyl xylan esterase ((EC 3.1.1.72) catalyzes deacetylation of xylans and xylo-oligosaccharides).

• Enzymes that degrade or modify mannan including mannanase ((EC 3.2.1.78) catalyzes random hydrolysis of 1,4-beta-D-mannosidic linkages in mannans, galactomannans and glucomannans), mannosidase ((EC 3.2.1.25) hydrolyzes terminal, non-reducing beta-D-mannose residues in beta-D-mannosides), alpha-galactosidase ((EC 3.2.1.22) hydrolyzes terminal, non-reducing alpha-D-galactose residues in alpha-D-galactosides, including galactose oligosaccharides, galactomannans and galactolipids), and mannan acetyl esterase.

• Enzymes that degrade xyloglucans including xyloglucanase ((EC 3.2.1.151) involves endohydrolysis of 1,4-beta-D-glucosidic linkages in xyloglucan while (EC 3.2.1.155) catalyzes exohydrolysis of 1,4-beta-D-glucosidic linkages in xyloglucan), endoglucanase, and cellulase.

• Enzymes that degrade beta-1,4-glucan including endoglucanase, cellobiohydrolase, and beta-glucosidase.

• Enzymes that degrade beta-1,3-1,4-glucan including endo-beta-1,3(4)-glucanase ((EC 3.2.1.6) catalyzes endohydrolysis of 1,3- or 1,4-linkages in beta-D-glucans when the glucose residue whose reducing group is involved in the linkage to be hydrolysed is itself substituted at C-3), endoglucanase (beta-glucanase, cellulase), and beta-glucosidase.

• Enzymes that degrade galactan include galactanases ((EC 3.2.1.23) hydrolyzes terminal non-reducing beta-D-galactose residues in beta-D-galactosides).

• Enzymes that degrade arabinan include arabinanases ((EC 3.2.1.99) catalyze endohydrolysis of 1,5-alpha-arabinofuranosidic linkages in 1,5-arabinans).

• Enzymes that degrade starch, including alpha-amylase ((EC 3.2.1.1) catalyzes endohydrolysis of 1,4-alpha-D-glucosidic linkages in polysaccharides containing three or more 1,4-alpha-linked D-glucose units) and alpha-glucosidase ((EC 3.2.1.20) hydrolyzes terminal, non-reducing 1,4-linked alpha-D-glucose residues with release of alpha-D-glucose).

• Enzymes that degrade or modify pectin, including pectate lyase ((EC 4.2.2.2) carries out eliminative cleavage of pectate to give oligosaccharides with 4-deoxy-alpha-D-gluc-4-enuronosyl groups at their non-reducing ends), pectin lyase ((EC 4.2.2.10) catalyzes eliminative cleavage of (1-4)-alpha-D-galacturonan methyl ester to give oligosaccharides with 4-deoxy-6-O-methyl-alpha-D-galact-4-enuronosyl groups at their non-reducing ends), polygalacturonase ((EC 3.2.1.15) carries out random hydrolysis of 1,4-alpha-D-galactosiduronic linkages in pectate and other galacturonans), pectin acetyl esterase ((EC 3.1.1.11) hydrolyzes acetate from pectin acetyl esters), alpha-arabinofuranosidase, beta-galactosidase, galactanase, arabinanase, rhamnogalacturonase ((EC 3.2.1.-) hydrolyzes alpha-D-galacturonopyranosyl-(1,2)-alpha-L-rhamnopyranosyl linkages in the backbone of the hairy regions of pectins), rhamnogalacturonan lyase ((EC 4.2.2.-) degrades type I rhamnogalacturonan from plant cell walls and releases disaccharide products), rhamnogalacturonan acetyl esterase ((EC 3.1.1.-) hydrolyzes acetate from rhamnogalacturonan), xylogalacturonosidase, and xylogalacturonase ((EC 3.2.1.-) hydrolyzes xylogalacturonan (xga), a galacturonan backbone heavily substituted with xylose, and which is one important component of the hairy regions of pectin).

• Enzymes that degrade or modify lignin, including lignin peroxidases ((EC 1.11.1.14) oxidize lignin and lignin model compounds using hydrogen peroxide), manganese-dependent peroxidases ((EC 1.11.1.13) oxidizes lignin and lignin model compounds using Mn²⁺ and hydrogen peroxide), versatile peroxidases ((EC 1.11.1.16) oxidizes lignin and lignin model compounds using an electron donor and hydrogen peroxide and combines the substrate-specificity characteristics of the two other ligninolytic peroxidases, EC 1.11.1.13, manganese peroxidase and EC 1.11.1.14, lignin peroxidase), and laccases ((EC 1.10.3.2) a group of multi-copper proteins of low specificity acting on both o- and p-quinols, and often acting also on lignin).

• Enzymes acting on chitin, including chitinase ((EC 3.2.1.14) which catalyzes random hydrolysis of N-acetyl-beta-D-glucosaminide 1,4-beta-linkages in chitin and chitodextrins) and beta-N-acetylhexosaminidase ((EC 3.2.1.52) which hydrolyzes terminal non-reducing N-acetyl-D-hexosamine residues in N-acetyl-beta-D-hexosaminides).

In some instances, certain enzymes (or family of enzymes) can be re-classified, for example, to take into account newly discovered enzyme functions or properties. Accordingly, the polypeptides/enzymes of the present invention are not meant to be limited to specific enzyme classes as they currently exist. The skilled person would know how to appropriately reclassify (and assign the appropriate functions) to the enzymes of the present invention based on the amino acid sequence information provided herein. Such reclassifications are thus within the scope of the present invention.

The invention also relates to vectors comprising a polynucleotide sequence according to the invention, as well as primers, probes and fragments that may be used to amplify or detect the DNA according to the invention.

In a further preferred embodiment, a vector is provided wherein the polynucleotide sequence according to the invention is functionally linked with at least one regulatory sequence suitable for expression of the encoded amino acid sequence in a suitable host cell, such as a filamentous fungus, for example Aspergillus. The invention also provides methods for preparing polynucleotides and vectors according to the invention.

The invention also relates to recombinantly produced host cells that contain heterologous or homologous polynucleotides according to the invention.

In another embodiment, the invention provides recombinant host cells wherein the expression of an enzyme according to the invention is significantly increased or wherein the activity of the enzyme is increased. In another embodiment the invention provides for a recombinantly produced host cell that contains heterologous or homologous DNA according to the invention and wherein the cell is capable of producing a functional enzyme according to the invention, preferably a cell capable of over-expressing the enzyme according to the invention, for example an Aspergillus niger strain comprising an increased copy number of a gene according to the invention.

In yet another aspect of the invention, a purified polypeptide is provided. The polypeptides according to the invention include the polypeptides encoded by the polynucleotides according to the invention. Especially preferred are polypeptides according to any one of SEQ ID Nos: 3, 6, 9, 12 and 15 or a functional equivalent thereof.

Fusion proteins comprising a polypeptide according to the invention are also within the scope of the invention. The invention also provides methods of making the polypeptides according to the invention.

The invention also relates to the use of the enzyme according to the invention in any industrial process as described herein.

Legends to the figures

Fig. 1 represents a schematic map of pGBFIN-49.

Detailed description of the invention

Polynucleotides

The present invention provides polynucleotides encoding enzymes having amino acid sequences according to any one of SEQ ID NOs: 3, 6, 9, 12 and 15 or a functional equivalent thereof. The sequences of the genes were determined by sequencing cDNA clones, mRNA transcripts, or genomic DNA obtained from Myceliophthora fergusii (Corynascus thermophilus) CBS 405.69. The invention provides polynucleotide sequences comprising the genes encoding the enzymes listed in Table 8 as well as their coding sequences. Accordingly, the invention relates to an isolated polynucleotide comprising the nucleotide sequences according to any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 or a functional equivalent thereof.

In particular, the invention relates to an isolated polynucleotide hybridizable under stringent conditions, preferably under high stringent conditions, to the complement of the polynucleotide listed above. Advantageously, such isolated polynucleotide may be obtained from fungi, in particular from Myceliophthora (Corynascus), preferably from Myceliophthora fergusii (Corynascus thermophilum). More specifically, the invention relates to isolated polynucleotides having nucleotide sequences according to any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14.

As used herein, the terms "gene" and "recombinant gene" refer to nucleic acid molecules which may be isolated from chromosomal DNA, which include an open reading frame encoding a protein, e.g., Myceliophthora fergusii (Corynascus thermophilus) enzymes according to the present invention. A gene may include coding sequences, non-coding sequences, introns and regulatory sequences. Moreover, a gene refers to an isolated nucleic acid molecule as defined herein.

A nucleic acid molecule of the present invention, such as a nucleic acid molecule having the nucleotide sequences listed above, can be isolated using standard molecular biology techniques and the sequence information provided herein. For example, using all or a portion of these nucleic acid sequences as hybridization probes, nucleic acid molecules according to the invention can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989).

Moreover, a nucleic acid molecule encompassing all or a portion of any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 can be isolated by the polymerase chain reaction (PCR) using synthetic oligonucleotide primers designed based upon the sequence information contained in these sequences.

A nucleic acid of the invention can be amplified using cDNA, mRNA or alternatively, genomic DNA, as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis.
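As an illustration of designing primers from known sequence information, the following minimal Python sketch picks a naive primer pair that flanks a template. The helper names (`reverse_complement`, `naive_primer_pair`), the default primer length and the toy template are assumptions made for this example; they are not taken from the sequences disclosed herein, and practical primer design would also weigh melting temperature, GC content and secondary structure.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def naive_primer_pair(template: str, length: int = 20) -> tuple[str, str]:
    """Hypothetical helper: choose primers at the two ends of the template.

    The forward primer is the first `length` bases of the given strand; the
    reverse primer is the reverse complement of the last `length` bases, so
    that it anneals to the given strand and extends back towards the start.
    """
    if len(template) < 2 * length:
        raise ValueError("template is too short for two non-overlapping primers")
    forward = template[:length].upper()
    reverse = reverse_complement(template[-length:])
    return forward, reverse


# Toy template for illustration only (not one of the SEQ ID NOs).
template = "ATGGCTAGCTTGACCGGTACCGATCGATTACGCTAGGCTAACGTTAGCCTGA"
fwd, rev = naive_primer_pair(template, length=18)
print("forward primer:", fwd)
print("reverse primer:", rev)
```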

Furthermore, oligonucleotides corresponding to or hybridizable to nucleotide sequences according to the invention can be prepared by standard synthetic techniques, e.g., using an automated DNA synthesizer.

In a preferred embodiment, isolated nucleic acid molecules of the invention comprise the nucleotide sequences shown in any one of SEQ ID NOs: 2, 5, 8, 11 and 14. These sequences correspond to the coding regions of the Myceliophthora fergusii (Corynascus thermophilus) genes shown in Table 8. These DNA sequences encode the Myceliophthora fergusii (Corynascus thermophilus) polypeptides according to any one of SEQ ID NOs: 3, 6, 9, 12 and 15.

In another preferred embodiment, an isolated nucleic acid molecule of the invention comprises a nucleic acid molecule which is a reverse complement of the nucleotide sequences shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 or a functional equivalent thereof.

A nucleic acid molecule which is complementary to another nucleotide sequence is one which is sufficiently complementary to the other nucleotide sequence such that it can hybridize to the other nucleotide sequence thereby forming a stable duplex.
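The notions of a reverse complement and of partial complementarity can be sketched in a few lines of Python. The function names and the toy strands below are assumptions for illustration. The fraction of Watson-Crick pairs computed here is only a rough proxy: whether a duplex is actually stable depends on length, base composition and the hybridization conditions.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of Watson-Crick pairs when two equal-length strands
    are paired in antiparallel orientation (illustrative helper)."""
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    perfect_partner = reverse_complement(strand_a)
    pairs = sum(1 for x, y in zip(perfect_partner, strand_b.upper()) if x == y)
    return pairs / len(strand_a)


print(reverse_complement("ATGGCC"))
print(complementarity("ATGGCC", "GGCCAT"))  # perfect partner
print(complementarity("ATGGCC", "GGCCAA"))  # one mismatched position
```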

One aspect of the invention pertains to isolated nucleic acid molecules that encode a polypeptide of the invention or a functional equivalent thereof such as a biologically active fragment or domain, as well as nucleic acid molecules sufficient for use as hybridization probes to identify nucleic acid molecules encoding a polypeptide of the invention and fragments of such nucleic acid molecules suitable for use as PCR primers for the amplification or mutation of nucleic acid molecules.

An "isolated polynucleotide" or "isolated nucleic acid" is a DNA or RNA that is not immediately contiguous with both of the coding sequences with which it is immediately contiguous (one on the 5' end and one on the 3' end) in the naturally occurring genome of the organism from which it is derived. Thus, in one embodiment, an isolated nucleic acid includes some or all of the 5' non-coding (e.g., promoter) sequences that are immediately contiguous to the coding sequence. The term therefore includes, for example, a recombinant DNA that is incorporated into a vector, into an autonomously replicating plasmid or virus, or into the genomic DNA of a prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a genomic DNA fragment produced by PCR or restriction endonuclease treatment) independent of other sequences. It also includes a recombinant DNA that is part of a hybrid gene encoding an additional polypeptide that is substantially free of cellular material, viral material, or culture medium (when produced by recombinant DNA techniques), or chemical precursors or other chemicals (when chemically synthesized). Moreover, an "isolated nucleic acid fragment" is a nucleic acid fragment that is not naturally occurring as a fragment and would not be found in the natural state.

As used herein, the terms "polynucleotide" or "nucleic acid molecule" are intended to include DNA molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated using nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA. The nucleic acid may be synthesized using oligonucleotide analogs or derivatives (e.g., inosine or phosphorothioate nucleotides). Such oligonucleotides can be used, for example, to prepare nucleic acids that have altered base-pairing abilities or increased resistance to nucleases.

Another embodiment of the invention provides an isolated nucleic acid molecule which is antisense to nucleic acid molecules shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14, e.g., the coding strand of these nucleic acid molecules. In a further embodiment, an antisense molecule is also provided which hybridizes with at least 10 contiguous, 20 contiguous, 40 contiguous, more preferably 50 contiguous, 60 contiguous, at least 80 contiguous and more preferably 100 contiguous nucleotides to any sequences shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14, e.g., the coding strands of these molecules. Also included within the scope of the invention are the complement strands of the nucleic acid molecules described herein.

Sequencing errors

The sequence information as provided herein should not be so narrowly construed as to require inclusion of erroneously identified bases. The specific sequences disclosed herein can be readily used to isolate the complete gene from filamentous fungi, in particular from Myceliophthora fergusii (Corynascus thermophilus) which in turn can easily be subjected to further sequence analyses thereby identifying sequencing errors.

Unless otherwise indicated, all nucleotide sequences determined by sequencing a DNA molecule herein were determined using an automated DNA sequencer and all amino acid sequences of polypeptides encoded by DNA molecules determined herein were predicted by translation of a DNA sequence determined as above. Therefore, as is known in the art for any DNA sequence determined by this automated approach, any nucleotide sequence determined herein may contain some errors. Nucleotide sequences determined by automation are typically at least about 90% identical, more typically at least about 95% to at least about 99.9% identical to the actual nucleotide sequence of the sequenced DNA molecule. The actual sequence can be more precisely determined by other approaches including manual DNA sequencing methods well known in the art. As is also known in the art, a single insertion or deletion in a determined nucleotide sequence compared to the actual sequence will cause a frame shift in translation of the nucleotide sequence such that the predicted amino acid sequence encoded by a determined nucleotide sequence will be completely different from the amino acid sequence actually encoded by the sequenced DNA molecule, beginning at the point of such an insertion or deletion.
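The effect of a single-base insertion on the predicted amino acid sequence can be illustrated with a short Python sketch. The toy sequence is an assumption made for this example and is not one of the sequences disclosed herein; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in reading frame 1; '*' marks a stop codon.
    Trailing bases that do not fill a codon are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Toy "actual" sequence and a "determined" sequence carrying one
# erroneous extra base (C) after the second codon.
actual = "ATGGCTGAAACCTCAGGATAA"
determined = actual[:6] + "C" + actual[6:]

print(translate(actual))
print(translate(determined))
```

In this toy case the two translations agree only up to the point of the insertion and diverge afterwards, which is the frame-shift behaviour described above.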

The person skilled in the art is capable of identifying such erroneously identified bases and knows how to correct for such errors.

Nucleic acid fragments, probes and primers

Nucleic acid molecules according to the invention may comprise only a portion or a fragment of the nucleic acid sequences shown in any one of SEQ ID NOs: 1 , 2, 4, 5, 7, 8, 10, 1 1 , 13 and 14 for example a fragment which can be used as a probe or primer or a fragment encoding a portion of these genes. The nucleotide sequence determined from the cloning of the genes shown in Table 8 allows for the generation of probes and primers designed for use in identifying and/or cloning other family members, as well as homologues from other species. The probe/primer typically comprises substantially purified oligonucleotide which typically comprises a region of nucleotide sequence that hybridizes preferably under highly stringent conditions to at least about 12 or 15, preferably about 18 or 20, preferably about 22 or 25, more preferably about 30, 35, 40, 45, 50, 55, 60, 65, or 75 or more consecutive nucleotides of a nucleotide sequence shown in any one of SEQ ID NOs: 1 , 2, 4, 5, 7, 8, 10, 1 1 , 13 and 14 or a functional equivalent thereof.

Probes based on these nucleotide sequences can be used to detect transcripts or genomic sequences encoding the same or homologous proteins for instance in other organisms. In preferred embodiments, the probe further comprises a label group attached thereto, e.g., the label group can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme cofactor. Such probes can also be used as part of a diagnostic test kit for identifying cells which express a protein encoded by the genes shown in Table 8.

Identity & homology

The terms "homology" or "percent identity" are used interchangeably herein. For the purpose of this invention, it is defined here that in order to determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino or nucleic acid sequence). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = number of identical positions/total number of positions (i.e. overlapping positions) x 100). Preferably, the two sequences are the same length.
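The percent identity formula above can be written as a small Python function operating on two sequences that have already been aligned. This is a minimal sketch: the function name, the use of "-" as the gap symbol, and the reading of "overlapping positions" as alignment columns in which both sequences carry a residue are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """% identity = identical positions / overlapping positions x 100.

    Both inputs must come from the same alignment and therefore have the
    same length. Columns containing a gap in either sequence are treated
    as non-overlapping and are skipped (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    overlapping = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        overlapping += 1
        if a == b:
            identical += 1
    if overlapping == 0:
        raise ValueError("the sequences share no overlapping positions")
    return 100.0 * identical / overlapping


# Toy aligned peptides: 6 overlapping columns, 5 of them identical.
print(percent_identity("MKT-LLVA", "MKSALL-A"))
```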

The skilled person will be aware of the fact that several different computer programs are available to determine the homology between two sequences. For instance, a comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48): 444-453 (1970)) algorithm which has been incorporated into the GAP program in the Accelrys GCG software package (available at http://www.accelrys.com/products/gcg/), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. The skilled person will appreciate that all these different parameters will yield slightly different results but that the overall percentage identity of two sequences is not significantly altered when using different algorithms.
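For orientation, the core of a Needleman-Wunsch global alignment can be sketched as follows. This is not the GAP program: the simple match/mismatch scores and the single linear gap penalty are assumptions chosen for brevity, whereas GAP uses a substitution matrix such as BLOSUM62 together with separate gap and length weights.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1,
                     gap: int = -2) -> tuple[str, str, int]:
    """Global alignment by dynamic programming with a linear gap penalty.
    Returns the two aligned strings and the optimal alignment score."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # first column: a aligned against gaps
        score[i][0] = i * gap
    for j in range(1, cols):          # first row: b aligned against gaps
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # (mis)match
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]


aligned_a, aligned_b, best = needleman_wunsch("GATTACA", "GATCA")
print(aligned_a)
print(aligned_b)
print(best)
```

The aligned strings returned by such a routine are the kind of input from which a percent identity over the overlapping positions can then be counted.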

In yet another embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the Accelrys GCG software package (available at http://www.accelrys.com/products/gcg/), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. In another embodiment, the percent identity between two amino acid or nucleotide sequences is determined using the algorithm of E. Meyers and W. Miller (CABIOS, 4:11-17 (1989)), which has been incorporated into the ALIGN program (version 2.0) (available at the ALIGN Query using sequence data of the Genestream server IGH Montpellier France http://vega.igh.cnrs.fr/bin/align-guess.cgi), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

The nucleic acid and protein sequences of the present invention can further be used as a "query sequence" to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12 to obtain nucleotide sequences homologous to nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score = 50, wordlength = 3 to obtain amino acid sequences homologous to protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al., (1997) Nucleic Acids Res. 25(17): 3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See the homepage of the National Center for Biotechnology Information at http://www.ncbi.nlm.nih.gov/.

Hybridization

As used herein, the term "hybridizing" is intended to describe conditions for hybridization and washing under which nucleotide sequences at least about 60%, at least about 70%, at least about 80%, more preferably at least about 85%, even more preferably at least about 90%, more preferably at least 95%, more preferably at least 98% or more preferably at least 99% homologous to each other typically remain hybridized to each other.

A preferred, non-limiting example of such hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more washes in 1X SSC, 0.1% SDS at 50°C, preferably at 55°C, preferably at 60°C and even more preferably at 65°C.

Highly stringent conditions include, for example, hybridizing at 68°C in 5x SSC/5x Denhardt's solution/1.0% SDS and washing in 0.2x SSC/0.1% SDS at room temperature. Alternatively, washing may be performed at 42°C.

The skilled artisan will know which conditions to apply for stringent and highly stringent hybridization conditions. Additional guidance regarding such conditions is readily available in the art, for example, in Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, N.Y.; and Ausubel et al. (eds.), 1995, Current Protocols in Molecular Biology, (John Wiley & Sons, N.Y.).

Of course, a polynucleotide which hybridizes only to a poly(A) sequence (such as the 3' terminal poly(A) tract of mRNAs), or to a complementary stretch of T (or U) residues, would not be included in a polynucleotide of the invention used to specifically hybridize to a portion of a nucleic acid of the invention, since such a polynucleotide would hybridize to any nucleic acid molecule containing a poly(A) stretch or the complement thereof (e.g., practically any double-stranded cDNA clone).
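The exclusion described above can be sketched as a simple sequence check. The following Python fragment is a minimal illustration only: the function name is hypothetical, and it assumes that a candidate probe is supplied as a plain string of nucleotide letters and that "only a poly(A) sequence or its complement" means a probe made up entirely of A residues or entirely of T (or U) residues.

```python
def is_poly_a_or_complement(probe: str) -> bool:
    """Return True if the probe consists solely of A residues or solely
    of T/U residues, i.e. it could only hybridize to a poly(A) tract or
    to the complement of one."""
    # Treat RNA (U) and DNA (T) spellings of the complement alike.
    seq = probe.upper().replace("U", "T")
    if not seq:
        return False
    return set(seq) == {"A"} or set(seq) == {"T"}
```

A probe flagged by such a check would be non-specific in the sense described above, since it could anneal to practically any polyadenylated transcript or cDNA clone.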

Obtaining full length DNA from other organisms

In a typical approach, cDNA libraries constructed from other organisms, e.g., brown rot fungi, in particular from the micro-organism family Myceliophthora (Corynascus) can be screened.

For example, Myceliophthora (Corynascus) strains can be screened for homologous polynucleotides shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 by Northern blot analysis. Upon detection of transcripts homologous to polynucleotides according to the invention, cDNA libraries can be constructed from RNA isolated from the appropriate strain, utilizing standard techniques well known to those of skill in the art. Alternatively, a total genomic DNA library can be screened using a probe hybridizable to a polynucleotide shown above.

Homologous gene sequences can be isolated, for example, by performing PCR using two degenerate oligonucleotide primer pools designed on the basis of nucleotide sequences as taught herein.
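To illustrate what a degenerate oligonucleotide primer pool is, the following minimal Python sketch expands a primer written with IUPAC ambiguity codes into every concrete sequence it represents. The function name and the example primers are hypothetical and are not taken from the sequences of the invention.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate_primer(primer: str) -> list[str]:
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

The pool size is the product of the number of bases allowed at each position, so a primer such as "ATR" represents two sequences while "NNK" represents 32. Keeping this number small is one common consideration when primers are designed from an amino acid sequence.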

The template for the reaction can be cDNA obtained by reverse transcription of mRNA prepared from strains known or suspected to express a polynucleotide according to the invention. The PCR product can be subcloned and sequenced to ensure that the amplified sequences represent the sequences of a new nucleic acid sequence corresponding to those shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 or a functional equivalent thereof. The PCR fragment can then be used to isolate a full-length cDNA clone by a variety of known methods. For example, the amplified fragment can be labelled and used to screen a bacteriophage or cosmid cDNA library. Alternatively, the labelled fragment can be used to screen a genomic library.

PCR technology also can be used to isolate full-length cDNA sequences from other organisms. For example, RNA can be isolated, following standard procedures, from an appropriate cellular or tissue source. A reverse transcription reaction can be performed on the RNA using an oligonucleotide primer specific for the most 5' end of the amplified fragment for the priming of first strand synthesis.

The resulting RNA/DNA hybrid can then be "tailed" (e.g., with guanines) using a standard terminal transferase reaction, the hybrid can be digested with RNase H, and second strand synthesis can then be primed (e.g., with a poly-C primer). Thus, cDNA sequences upstream of the amplified fragment can easily be isolated. For a review of useful cloning strategies, see e.g. Sambrook et al., supra; and Ausubel et al., supra.

Vectors

Another aspect of the invention pertains to vectors, preferably expression vectors, containing a nucleic acid encoding a protein shown in Table 8 and whose sequence may be found in any one of SEQ ID NOs: 3, 6, 9, 12 and 15 or a functional equivalent thereof.

As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a "plasmid", which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "expression vectors". In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. The terms "plasmid" and "vector" can be used interchangeably herein as the plasmid is the most commonly used form of vector. However, the invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.

The recombinant expression vectors of the invention comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vector includes one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which are operatively linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, "operatively linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term "regulatory sequence" is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signal). Such regulatory sequences are described, for example, in Goeddel; Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cells and those which direct expression of the nucleotide sequence only in a certain host cell (e.g., tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or peptides, encoded by nucleic acids as described herein (e.g. proteins whose sequences are shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15, mutant forms of these proteins, fragments, variants or functional equivalents thereof, fusion proteins, etc.).

The recombinant expression vectors of the invention can be designed for expression of proteins whose sequences are shown in any one of SEQ ID NOs: 3, 6, 9 and 15 in prokaryotic or eukaryotic cells. For example, these proteins can be expressed in bacterial cells such as E. coli, insect cells (using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990). Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

Expression vectors useful in the present invention include chromosomal-, episomal- and virus-derived vectors e.g., vectors derived from bacterial plasmids, bacteriophage, yeast episome, yeast chromosomal elements, viruses such as baculoviruses, papova viruses, vaccinia viruses, adenoviruses, fowl pox viruses, pseudorabies viruses and retroviruses, and vectors derived from combinations thereof, such as those derived from plasmid and bacteriophage genetic elements, such as cosmids and phagemids.

The DNA insert should be operatively linked to an appropriate promoter, such as the phage lambda PL promoter, the E. coli lac, trp and tac promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to name a few. Other suitable promoters will be known to the skilled person. In a specific embodiment, promoters are preferred that are capable of directing a high expression level of lignocellulose active proteins from fungi. Such promoters are known in the art. The expression constructs may contain sites for transcription initiation, termination, and, in the transcribed region, a ribosome binding site for translation. The coding portion of the mature transcripts expressed by the constructs will include a translation initiating AUG at the beginning and a termination codon appropriately positioned at the end of the polypeptide to be translated.
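The requirement that the coding portion begins with an initiating AUG and ends with an appropriately positioned termination codon can be checked mechanically. The following is a minimal Python sketch under stated assumptions: the function name is hypothetical, the standard stop codons (TAA, TAG, TGA) are assumed, and the sequence is assumed to be given as the sense strand in either DNA or RNA letters.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def is_complete_coding_sequence(cds: str) -> bool:
    """Check that a coding sequence starts with ATG/AUG, ends with a stop
    codon, has a length divisible by three and has no premature in-frame
    stop codon."""
    seq = cds.upper().replace("U", "T")
    # Need at least a start codon and a stop codon, in whole codons.
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG":
        return False
    if codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon before the last position would truncate translation.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])
```

Such a check covers only the reading frame itself. Promoter, ribosome binding site and terminator elements of the construct would need to be verified separately.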

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. As used herein, the terms "transformation" and "transfection" are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, transduction, infection, lipofection, cationic lipid-mediated transfection or electroporation. Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al. (Molecular Cloning: A Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989), Davis et al., Basic Methods in Molecular Biology (1986) and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the expression vector and transfection technique used, only a small fraction of cells may integrate the foreign DNA into their genome. In order to identify and select these integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into the host cells along with the gene of interest. Preferred selectable markers include those which confer resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid encoding a selectable marker can be introduced into a host cell on the same vector as that encoding proteins whose sequences are shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15, or can be introduced on a separate vector. Cells stably transfected with the introduced nucleic acid can be identified by drug selection (e.g. cells that have incorporated the selectable marker gene will survive, while the other cells die).

Expression of proteins in prokaryotes is often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, e.g. to the amino terminus of the recombinant protein. Such fusion vectors typically serve three purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the purification of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the recombinant protein to enable separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein.

As indicated, the expression vectors will preferably contain selectable markers. Such markers include dihydrofolate reductase or neomycin resistance for eukaryotic cell culture and tetracycline or ampicillin resistance for culturing in E. coli and other bacteria. Representative examples of appropriate hosts include bacterial cells, such as E. coli, Streptomyces, Salmonella typhimurium and certain Bacillus species; fungal cells such as Aspergillus species, for example A. niger, A. oryzae and A. nidulans; yeast cells such as Kluyveromyces, for example K. lactis and/or Pichia, for example P. pastoris; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS and Bowes melanoma; and plant cells. Appropriate culture media and conditions for the above-described host cells are known in the art.

Vectors preferred for use in bacteria are for example disclosed in WO-A1-2004/074468, which is hereby incorporated by reference. Other suitable vectors will be readily apparent to the skilled artisan.

Known bacterial promoters suitable for use in the present invention include the promoters disclosed in WO-A1-2004/074468, which is hereby incorporated by reference.

Transcription of the DNA encoding the polypeptides of the present invention by higher eukaryotes may be increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually about from 10 to 300 bp that act to increase transcriptional activity of a promoter in a given host cell-type. Examples of enhancers include the SV40 enhancer, which is located on the late side of the replication origin at bp 100 to 270, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

For secretion of the translated protein into the lumen of the endoplasmic reticulum, into the periplasmic space or into the extracellular environment, an appropriate secretion signal may be incorporated into the expressed polypeptide. The signals may be endogenous to the polypeptide or they may be heterologous signals.

The polypeptides whose sequences are shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15 may be expressed in a modified form, such as a fusion protein, and may include not only secretion signals but also additional heterologous functional regions. Thus, for instance, a region of additional amino acids, particularly charged amino acids, may be added to the N-terminus of the polypeptide to improve stability and persistence in the host cell, during purification or during subsequent handling and storage. Also, peptide moieties may be added to the polypeptide to facilitate purification.

Polypeptides according to the invention

The invention provides isolated polypeptides having the amino acid sequences shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15 alone or in an appropriate host. Also, a peptide or polypeptide comprising a functional equivalent of the above polypeptides is comprised within the present invention. The above polypeptides are collectively comprised in the term "polypeptides according to the invention".

The terms "peptide" and "oligopeptide" are considered synonymous (as is commonly recognized) and each term can be used interchangeably as the context requires to indicate a chain of at least two amino acids coupled by peptidyl linkages. The word "polypeptide" is used herein for chains containing more than seven amino acid residues. All oligopeptide and polypeptide formulas or sequences herein are written from left to right and in the direction from amino terminus to carboxyl terminus. The one-letter code of amino acids used herein is commonly known in the art and can be found in Sambrook, et al. (Molecular Cloning: A Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989). Sequence listing programs can easily convert this one-letter amino acid code into a three-letter code.
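The conversion between the one-letter and three-letter amino acid codes mentioned above can be sketched in a few lines of Python. This is an illustration only and not any particular sequence listing program: the function name and the hyphen separator are assumptions, and only the 20 standard amino acids are covered.

```python
# Standard one-letter to three-letter amino acid abbreviations.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence: str, separator: str = "-") -> str:
    """Convert a one-letter protein sequence, written from amino terminus
    to carboxyl terminus, into three-letter codes in the same order."""
    return separator.join(ONE_TO_THREE[residue] for residue in sequence.upper())
```

For example, the tripeptide written "MKT" in one-letter code becomes "Met-Lys-Thr".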

By "isolated" polypeptide or protein is intended a polypeptide or protein removed from its native environment. For example, recombinantly produced polypeptides and proteins expressed in host cells are considered isolated for the purpose of the invention as are native or recombinant polypeptides which have been substantially purified by any suitable technique such as, for example, the single-step purification method disclosed in Smith and Johnson, Gene 67:31-40 (1988).

The proteins whose sequences are shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15 according to the invention can be recovered and purified from recombinant cell cultures by methods known in the art. Most preferably, high performance liquid chromatography ("HPLC") is employed for purification.

Polypeptides of the present invention include naturally purified products, products of chemical synthetic procedures, and products produced by recombinant techniques from a prokaryotic or eukaryotic host, including, for example, bacterial, yeast, higher plant, insect and mammalian cells. Depending upon the host employed in a recombinant production procedure, the polypeptides of the present invention may be glycosylated or may be non-glycosylated. In addition, polypeptides of the invention may also include an initial modified methionine residue, in some cases as a result of host-mediated processes.

Protein fragments

The invention also features biologically active fragments of the polypeptides according to the invention.

Biologically active fragments of a polypeptide of the invention include polypeptides comprising amino acid sequences sufficiently identical to or derived from the amino acid sequences shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15 which include fewer amino acids than the full length protein but which exhibit at least one biological activity of the corresponding full-length protein. Typically, biologically active fragments comprise a domain or motif with at least one activity of the full-length protein. A biologically active fragment of a protein of the invention can be a polypeptide which is, for example, 10, 25, 50, 100 or more amino acids in length. Moreover, other biologically active portions, in which other regions of the protein are deleted, can be prepared by recombinant techniques and evaluated for one or more of the biological activities of the native form of a polypeptide of the invention.

The invention also features nucleic acid fragments which encode the above biologically active fragments of the proteins whose sequences are shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15.

Fusion proteins

The proteins of the present invention or functional equivalents thereof, e.g., biologically active portions thereof, can be operatively linked to unrelated polypeptides (e.g., heterologous amino acid sequences) to form fusion proteins. "Unrelated polypeptides" refer to polypeptides having amino acid sequences corresponding to proteins which are not substantially homologous to the proteins whose sequences are shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15. Such "unrelated polypeptides" can be derived from the same or a different organism. Within a fusion protein the polypeptide derived from the sequences shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15 can correspond to all or a biologically active fragment of these proteins. In a preferred embodiment, a fusion protein comprises at least two biologically active portions of proteins whose sequences are shown above. Within the fusion protein, the term "operatively linked" is intended to indicate that the polypeptide whose sequence is shown above, and the unrelated polypeptide are fused in-frame to each other. The unrelated polypeptide can be fused to the N-terminus or C-terminus of the polypeptide whose sequence is one of those shown above.

For example, in one embodiment, the fusion protein is a fusion protein in which the protein whose sequence is shown above is fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant protein from Myceliophthora fergusii (Corynascus thermophilus). In another embodiment, the fusion protein is a protein whose sequence is one of those shown above, containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian and yeast host cells), expression and/or secretion of proteins whose sequences are shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15 can be increased through use of a heterologous signal sequence.

In another example, the gp67 secretory sequence of the baculovirus envelope protein can be used as a heterologous signal sequence (Current Protocols in Molecular Biology, Ausubel et al., eds., John Wiley & Sons, 1992). Other examples of eukaryotic heterologous signal sequences include the secretory sequences of melittin and human placental alkaline phosphatase (Stratagene; La Jolla, California). In yet another example, useful prokaryotic heterologous signal sequences include the phoA secretory signal (Sambrook et al., supra) and the protein A secretory signal (Pharmacia Biotech; Piscataway, New Jersey).

A signal sequence can be used to facilitate secretion and isolation of a protein or polypeptide of the invention. Signal sequences are typically characterized by a core of hydrophobic amino acids, which are generally cleaved from the mature protein during secretion in one or more cleavage events. Such signal peptides contain processing sites that allow cleavage of the signal sequence from the mature proteins as they pass through the secretory pathway. The signal sequence directs secretion of the protein, such as from a eukaryotic host into which the expression vector is transformed, and the signal sequence is subsequently or concurrently cleaved. The protein can then be readily purified from the extracellular medium by known methods. Alternatively, the signal sequence can be linked to the protein of interest using a sequence, which facilitates purification, such as with a GST domain. Thus, for instance, the sequence encoding the polypeptide may be fused to a marker sequence, such as a sequence encoding a peptide, which facilitates purification of the fused polypeptide. In certain preferred embodiments of this aspect of the invention, the marker sequence is a hexa-histidine peptide, such as the tag provided in a pQE vector (Qiagen, Inc.), among others, many of which are commercially available. As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for instance, hexa-histidine provides for convenient purification of the fusion protein. The HA tag is another peptide useful for purification which corresponds to an epitope derived from the influenza hemagglutinin protein, which has been described by Wilson et al., Cell 37:767 (1984), for instance.

Preferably, a fusion protein of the invention (corresponding to one of those whose sequences are shown above) is produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different polypeptide sequences are ligated together in-frame in accordance with conventional techniques, for example by employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers, which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for example, Current Protocols in Molecular Biology, eds. Ausubel et al. John Wiley & Sons: 1992). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding one of the proteins shown in Table 8 can be cloned into such an expression vector such that the fusion moiety is linked in-frame to this protein.
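What "in-frame" means at the sequence level can be illustrated with a short Python sketch. It is a design-time check only, not a cloning protocol. The function name is hypothetical, the input sequences in any usage are placeholders rather than real GST or enzyme coding sequences, and the sketch assumes that both inputs are complete coding sequences on the sense strand and that the standard stop codons apply.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def fuse_in_frame(n_terminal_cds: str, c_terminal_cds: str) -> str:
    """Join two coding sequences so that the second is read in the same
    frame as the first.

    The stop codon of the N-terminal partner, if present, is removed so
    that translation continues into the C-terminal partner.
    """
    first = n_terminal_cds.upper()
    second = c_terminal_cds.upper()
    if len(first) % 3 != 0 or len(second) % 3 != 0:
        raise ValueError("each coding sequence must be a whole number of codons")
    if first[-3:] in STOP_CODONS:
        first = first[:-3]
    return first + second
```

Because both parts are whole numbers of codons, the junction falls on a codon boundary and the reading frame of the C-terminal partner is preserved. Linker or protease cleavage site codons could be inserted at the junction in the same way.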

Functional equivalents

The terms "functional equivalents" and "functional variants" are used interchangeably herein. Functional equivalents of DNA whose sequences are shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 are isolated DNA fragments that encode a polypeptide that exhibits a particular function of the corresponding Myceliophthora fergusii (Corynascus thermophilus) enzyme or protein as shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15. A functional equivalent of a polypeptide according to the invention is a polypeptide that exhibits at least one function of a Myceliophthora fergusii (Corynascus thermophilus) enzyme or protein as defined herein. Functional equivalents therefore also encompass biologically active fragments.

Functional protein or polypeptide equivalents may contain only conservative substitutions of one or more amino acids of proteins whose sequences are shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15 or substitutions, insertions or deletions of non-essential amino acids. Accordingly, a non-essential amino acid is a residue that can be altered in proteins whose sequences are shown above, without substantially altering the biological function. For example, amino acid residues that are conserved among the proteins of the present invention are predicted to be particularly unamenable to alteration. Furthermore, amino acids conserved among the proteins according to the present invention (shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15) and other enzymes are not likely to be amenable to alteration.

The term "conservative substitution" is intended to indicate a substitution in which the amino acid residue is replaced with an amino acid residue having a similar side chain. These families are known in the art and include amino acids with basic side chains (e.g. lysine, arginine and histidine), acidic side chains (e.g. aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), non-polar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine).
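The side chain families listed above lend themselves to a simple lookup. The following minimal Python sketch treats a substitution as conservative when the original and replacement residues share at least one of the listed families. The function names are hypothetical, and the family memberships are taken directly from the examples in the preceding paragraph, so some residues (for example threonine, valine, histidine and tyrosine) belong to more than one family.

```python
# Side chain families as listed in the text, in one-letter code.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "non-polar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(original: str, replacement: str) -> list[str]:
    """Names of the families that contain both residues."""
    a, b = original.upper(), replacement.upper()
    return [
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    ]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if the two residues differ but share a side chain family."""
    if original.upper() == replacement.upper():
        return False  # not a substitution at all
    return bool(shared_families(original, replacement))
```

Under this scheme lysine to arginine is conservative (both basic), whereas aspartic acid to lysine is not, since no listed family contains both residues.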

Functional nucleic acid equivalents may typically contain silent mutations or mutations that do not alter the biological function of encoded polypeptide. Accordingly, the invention provides nucleic acid molecules encoding the proteins whose sequences are shown above, that contain changes in amino acid residues that are not essential for a particular biological activity. Such proteins differ in amino acid sequence from those shown yet retain at least one biological activity thereof. In one embodiment the isolated nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the protein comprises a substantially homologous amino acid sequence of at least about 72%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more homologous to the amino acid sequences shown above.

For example, guidance concerning how to make phenotypically silent amino acid substitutions is provided in Bowie, J.U. et al., Science 247:1306-1310 (1990) and the references cited therein. As the authors state, these studies have revealed that proteins are surprisingly tolerant of amino acid substitutions. The authors further indicate which changes are likely to be permissive at a certain position of the protein.

An isolated nucleic acid molecule encoding a protein homologous to a protein whose sequence is shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15, can be created by introducing one or more nucleotide substitutions, additions or deletions into the coding nucleotide sequences above such that one or more amino acid substitutions, deletions or insertions are introduced into the encoded protein. Such mutations may be introduced by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis.
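The relationship between a nucleotide substitution and its effect on the encoded protein can be sketched as follows. This minimal Python example classifies a single-codon change as silent, missense or nonsense. It assumes the standard genetic code and DNA-letter codons on the sense strand, and the function name is hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_change(original: str, mutated: str) -> str:
    """Classify the effect of replacing one codon with another.

    Returns "silent" if the encoded amino acid is unchanged, "nonsense"
    if the mutated codon is a stop codon, and "missense" otherwise.
    """
    before = CODON_TABLE[original.upper()]
    after = CODON_TABLE[mutated.upper()]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"
```

For example, GAA to GAG leaves glutamic acid unchanged, whereas GAA to GAT replaces it with aspartic acid, which is an amino acid substitution of the kind described above.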

The term "functional equivalents" also encompasses orthologues of the Myceliophthora fergusii (Corynascus thermophilus) proteins. Orthologues of the Myceliophthora fergusii (Corynascus thermophilus) proteins are proteins that can be isolated from other strains or species and possess a similar or identical biological activity. Such orthologues can readily be identified as comprising an amino acid sequence that is substantially homologous to one of the sequences shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15.

As defined herein, the term "substantially homologous" refers to a first amino acid or nucleotide sequence which contains a sufficient or minimum number of identical or equivalent (e.g., with similar side chain) amino acids or nucleotides to a second amino acid or nucleotide sequence such that the first and the second amino acid or nucleotide sequences have a common domain. For example, amino acid or nucleotide sequences which contain a common domain having about 72%, preferably 75%, more preferably 80%, even more preferably 85%, 90%, 95%, 96%, 97%, 98% or 99% identity or more are defined herein as sufficiently identical.

Also, nucleic acids encoding other family members related to those proteins whose sequences are shown above, which thus have nucleotide sequences that differ from sequences shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 are within the scope of the invention. Moreover, nucleic acids encoding proteins corresponding to those whose sequences are shown above from different species which can have nucleotide sequences which differ from those shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 are within the scope of the invention.

Nucleic acid molecules corresponding to variants (e.g. natural allelic variants) and homologues of the DNA of the invention (shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14) can be isolated based on their homology to the nucleic acids disclosed herein using the cDNAs disclosed herein or a suitable fragment thereof, as a hybridization probe according to standard hybridization techniques preferably under highly stringent hybridization conditions.

In addition to naturally occurring allelic variants of the sequences shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14, the skilled person will recognise that changes can be introduced by mutation into those nucleotide sequences thereby leading to changes in the amino acid sequences of the corresponding proteins (whose sequences are shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15, respectively) without substantially altering the functions of the corresponding proteins.

In another aspect of the invention, improved proteins derived from the sequences shown above are provided. Improved proteins are proteins wherein at least one biological activity is improved. Such proteins may be obtained by randomly introducing mutations along all or part of the coding sequences of the polypeptides shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15 such as by saturation mutagenesis, and the resulting mutants can be expressed recombinantly and screened for biological activity. For instance, the art provides for standard assays for measuring the enzymatic activity of the resulting protein and thus improved proteins may easily be selected.
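The search space for such a mutagenesis screen can be enumerated in silico. The following minimal Python sketch lists every single-residue substitution of a protein sequence, which is what a complete saturation library at each chosen position would encode. It works at the protein level rather than the codon level, the function name is hypothetical, and any input sequence used with it is a placeholder rather than one of the sequences of the invention.

```python
from typing import Iterable, Iterator, Optional, Tuple

STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_site_variants(
    protein: str, positions: Optional[Iterable[int]] = None
) -> Iterator[Tuple[str, str]]:
    """Yield (label, variant sequence) for every single substitution.

    positions are zero-based indices; by default every position is
    mutated. Labels use the usual notation, e.g. "K2A" for lysine at
    residue 2 replaced by alanine.
    """
    seq = protein.upper()
    if positions is None:
        positions = range(len(seq))
    for pos in positions:
        for residue in STANDARD_AMINO_ACIDS:
            if residue == seq[pos]:
                continue  # skip the wild-type residue
            label = f"{seq[pos]}{pos + 1}{residue}"
            yield label, seq[:pos] + residue + seq[pos + 1:]
```

A protein of n residues has 19 × n single substitutions, which gives a rough sense of how many variants would have to be expressed and assayed to cover all single-site changes.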

In a preferred embodiment the protein has an amino acid sequence according to a sequence shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15. In another embodiment, the polypeptide is substantially homologous to the amino acid sequence according to a sequence shown above and retains at least one biological activity of a polypeptide according to the sequence shown above, yet differs in amino acid sequence due to natural variation or mutagenesis as described above.

In a further preferred embodiment, the protein has an amino acid sequence encoded by an isolated nucleic acid fragment capable of hybridizing to a nucleic acid according to the sequences shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14, preferably under highly stringent hybridization conditions.

Accordingly, the protein is preferably a protein which comprises an amino acid sequence at least about 72%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more homologous to an amino acid sequence shown in any one of SEQ ID NOs: 3, 6, 9, 12 and 15 and retains at least one functional activity of the polypeptide according to the sequences shown above.
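By way of illustration only, the threshold test recited above can be sketched as follows. This passage does not restate how percent homology is to be calculated, so the sketch makes an assumption: it takes two sequences that have already been aligned (with "-" as the gap character) and counts identical residues over the columns in which neither sequence has a gap. Other conventions exist and may give different values. The function names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs are two rows of an existing pairwise alignment,
    so they must have the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are skipped under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 72.0) -> bool:
    """True if the identity is at least the given percentage (72% is the lowest value recited above)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned eight-residue stretches differing at one position give 87.5% under this convention, which satisfies the 72% to 85% thresholds but not the 90% threshold.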

Functional equivalents of a protein according to the invention can also be identified e.g. by screening combinatorial libraries of mutants, e.g. truncation mutants, of the protein of the invention for activity. In one embodiment, a variegated library of variants is generated by combinatorial mutagenesis at the nucleic acid level. A variegated library of variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of potential protein sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g. for phage display). There are a variety of methods that can be used to produce libraries of potential variants of the polypeptides of the invention from a degenerate oligonucleotide sequence. Methods for synthesizing degenerate oligonucleotides are known in the art (see, e.g., Narang (1983) Tetrahedron 39:3; Itakura et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983) Nucleic Acid Res. 11:477).
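As a non-limiting illustration of what a degenerate oligonucleotide sequence represents, the following sketch enumerates every concrete sequence encoded by an oligonucleotide written with the standard IUPAC nucleotide ambiguity codes. It is not a method taken from the cited references; the function names `expand_degenerate` and `degeneracy` are hypothetical and the example oligonucleotides are arbitrary.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligonucleotide."""
    count = 1
    for base in oligo.upper():
        count *= len(IUPAC[base])
    return count


def expand_degenerate(oligo: str) -> list[str]:
    """List every concrete sequence represented by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

Under these definitions a single "NNK" codon represents 32 sequences, so the size of a variegated library grows multiplicatively with each degenerate position; in practice `degeneracy` would be checked before calling `expand_degenerate` on a long oligonucleotide.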

In addition, libraries of fragments of the coding sequence of a polypeptide of the invention can be used to generate a variegated population of polypeptides for screening and subsequent selection of variants. For example, a library of coding sequence fragments can be generated by treating a double stranded PCR fragment of the coding sequence of interest with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA which can include sense/antisense pairs from different nicked products, removing single stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector. By this method, an expression library can be derived which encodes N-terminal and internal fragments of various sizes of the protein of interest.

Several techniques are known in the art for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property. The most widely used techniques, which are amenable to high through-put analysis, for screening large gene libraries typically include cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates isolation of the vector encoding the gene whose product was detected. Recursive ensemble mutagenesis (REM), a technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify variants of a protein of the invention (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al., (1993) Protein Engineering 6(3): 327-331).

In addition to the gene sequences shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14, it will be apparent to the person skilled in the art that DNA sequence polymorphisms may exist within a given population, which may lead to changes in the amino acid sequence of the protein sequences as shown herein. Such genetic polymorphisms may exist in cells from different populations or within a population due to natural allelic variation. Allelic variants may also include functional equivalents.

Fragments of a polynucleotide according to the invention may also comprise polynucleotides not encoding functional polypeptides. Such polynucleotides may function as probes or primers for a PCR reaction.

Nucleic acids according to the invention, irrespective of whether they encode functional or non-functional polypeptides, can be used as hybridization probes or polymerase chain reaction (PCR) primers. Uses of the nucleic acid molecules of the present invention that do not encode a polypeptide having an activity shown in Table 8 include, inter alia, (1) isolating the gene encoding the protein, or allelic variants thereof, from a cDNA library, e.g. from an organism other than Myceliophthora fergusii (Corynascus thermophilus); (2) in situ hybridization (e.g. FISH) to metaphase chromosomal spreads to provide precise chromosomal location of the gene as described in Verma et al., Human Chromosomes: a Manual of Basic Techniques, Pergamon Press, New York (1988); (3) Northern blot analysis for detecting expression of mRNA corresponding to one of those shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 in specific tissues and/or cells; and (4) probes and primers that can be used as a diagnostic tool to analyse the presence of a nucleic acid hybridizable to a sequence shown above in a given biological (e.g. tissue) sample.

Also encompassed by the invention is a method of obtaining a functional equivalent of a gene corresponding to one of those shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14. Such a method entails obtaining a labelled probe that includes an isolated nucleic acid which encodes all or a portion of the protein sequence according to any one of SEQ ID NOs: 3, 6, 9, 12 and 15 and to Table 8 or a variant thereof; screening a nucleic acid fragment library with the labelled probe under conditions that allow hybridization of the probe to nucleic acid fragments in the library, thereby forming nucleic acid duplexes; and preparing a full-length gene sequence from the nucleic acid fragments in any labelled duplex to obtain a gene related to the gene shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14.

In one embodiment, a nucleic acid of the invention is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more homologous to a nucleic acid sequence shown herein or to the reverse complement thereof.
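Because the comparison above may be made against either a sequence shown herein or its reverse complement, a minimal sketch of that two-strand comparison is given below. It assumes, purely for simplicity, ungapped DNA sequences of equal length and a position-by-position identity count; the function names `reverse_complement` and `identity_either_strand` are hypothetical and no particular alignment program is implied.

```python
# Translation table mapping each DNA base to its complement (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def identity_either_strand(query: str, reference: str) -> float:
    """Highest percent identity of query to the reference or to its reverse complement.

    Assumption: query and reference are ungapped and of equal length.
    """
    if len(query) != len(reference) or not query:
        raise ValueError("sequences must be non-empty and of equal length")

    def identity(a: str, b: str) -> float:
        same = sum(x == y for x, y in zip(a.upper(), b.upper()))
        return 100.0 * same / len(a)

    return max(identity(query, reference),
               identity(query, reverse_complement(reference)))
```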

Host cells

In another embodiment, the invention features cells, e.g., transformed host cells or recombinant host cells that contain a nucleic acid or vector encompassed by the invention. A "transformed cell" or "recombinant cell" is a cell into which (or into an ancestor of which) has been introduced, by means of recombinant DNA techniques, a nucleic acid or vector according to the invention. Both prokaryotic and eukaryotic cells are included, e.g., bacteria, fungi, yeast, and the like; especially preferred are cells from filamentous fungi, in particular Myceliophthora fergusii (Corynascus thermophilus). A cell of the invention is typically not a wild-type Myceliophthora fergusii (Corynascus thermophilus) or a naturally-occurring cell. Suitable cells include, but are not limited to: fungi such as Aspergillus niger, Trichoderma reesei, Myceliophthora thermophila or Talaromyces emersonii; yeasts such as Saccharomyces cerevisiae, Yarrowia lipolytica and Pichia pastoris; bacteria such as Escherichia coli and Bacillus sp.; and plants such as Nicotiana benthamiana, Nicotiana tabacum and Medicago sativa.

In some instances, new phylogenic analyses of fungal species have resulted in taxonomic reclassifications. For example, following their phylogenic studies reported in van den Brink et al., ("Phylogeny of the industrial relevant, thermophilic genera Myceliophthora and Corynascus", Fungal Diversity (2012), 52:197-207), the authors proposed renaming all existing Corynascus species to Myceliophthora. Such changes in taxonomic classification are within the scope of the present invention and, regardless of future reclassifications, a person of skill in the art would be able to identify the organism used to determine the sequences disclosed herein for example based on the strain's accession number (CBS 405.69).

A nucleic acid molecule (or a nucleic acid molecule which is comprised within a vector) may be homologous or heterologous with respect to the cell into which it is introduced. In this context, a nucleic acid molecule is homologous to a cell if the nucleic acid molecule naturally occurs in that cell. A nucleic acid molecule is heterologous to a cell if the nucleic acid molecule does not naturally occur in that cell. Accordingly, the invention provides a cell which comprises a heterologous or a homologous sequence corresponding to one of those shown in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14.

A host cell can be chosen that modulates the expression of the inserted sequences, or modifies and processes the gene product in a specific, desired fashion. Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may facilitate optimal functioning of the protein.

Various host cells have characteristic and specific mechanisms for post-translational processing and modification of proteins and gene products. Appropriate cell lines or host systems familiar to those of skill in the art of molecular biology and/or microbiology can be chosen to ensure the desired and correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells that possess the cellular machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of the gene product can be used. Such host cells are well known in the art.

Host cells also include, but are not limited to, mammalian cell lines such as CHO, VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, and choroid plexus cell lines.

If desired, a stably transfected cell line can produce the polypeptides according to the invention. A number of vectors suitable for stable transfection of mammalian cells are available to the public; methods for constructing such cell lines are also publicly known, e.g., in Ausubel et al. (supra).

Use of the Myceliophthora fergusii (Corynascus thermophilus) enzymes in industrial processes

The invention also relates to the use of the enzymes according to the invention in a selected number of industrial processes. Despite the long-term experience obtained with these processes, the enzymes according to the invention feature a number of significant advantages over the enzymes currently used. Depending on the specific application, these advantages can include aspects such as lower production costs, higher specificity towards the substrate, greater synergies with existing enzymes, less antigenic effect, fewer undesirable side activities, higher yields when produced in a suitable microorganism, more suitable pH and temperature ranges, better properties of the final product, and food grade or kosher aspects.

The present invention also relates to methods for preparing a food product comprising incorporating into the food product an effective amount of an enzyme of the present invention. This improves one or more properties of the food product relative to a food product in which the polypeptide is not incorporated.

The phrase "incorporated into the food product" is defined herein as adding the enzyme according to the invention to the food product, any ingredient from which the food product is to be made, and/or any mixture of food ingredients from which the food product is to be made. In other words, the enzyme according to the invention may be added in any step of the food product preparation and may be added in one, two or more steps. The enzyme according to the invention is added to the ingredients of a food product which can then be treated by methods including cooking, boiling, drying, frying, steaming or baking as is known in the art.

The term "effective amount" is defined herein as an amount of the enzyme according to the invention that is sufficient for providing a measurable effect on at least one property of interest of the food product.

The term "improved property" is defined herein as any property of a food product which is improved by the action of an enzyme according to the invention relative to a food product in which the enzyme according to the invention is not incorporated. The improved property may be determined by comparison of a food product prepared with and without addition of a polypeptide of the present invention. Organoleptic qualities may be evaluated using procedures well established in the food industry, and may include, for example, the use of a panel of trained taste-testers.

The enzymes of the present invention may be in any form suitable for the use in question, e.g., in the form of a dry powder, agglomerated powder, or granulate, in particular a non-dusting granulate, liquid, in particular a stabilized liquid, or protected enzyme such as described in WO01/11974 and WO02/26044. Granulates and agglomerated powders may be prepared by conventional methods, e.g., by spraying the enzyme according to the invention onto a carrier in a fluid-bed granulator. The carrier may consist of particulate cores having a suitable particle size. The carrier may be soluble or insoluble, e.g., a salt (such as NaCl or sodium sulphate), sugar (such as sucrose or lactose), sugar alcohol (such as sorbitol), starch, rice, corn grits, or soy. The enzyme according to the invention and/or additional enzymes may be contained in slow-release formulations. Methods for preparing slow-release formulations are well known in the art. Liquid enzyme preparations may, for instance, be stabilized by adding nutritionally acceptable stabilizers such as sugar, sugar alcohol, or another polyol, and/or lactic acid or another organic acid according to established methods.

The enzyme according to the invention may also be incorporated in yeast comprising compositions such as disclosed in EP-A-0619947, EP-A-0659344 and

One or more additional enzymes may also be incorporated into the food product. The additional enzyme may be of any origin, including mammalian and plant, and preferably of microbial (bacterial, yeast or fungal) origin and may be obtained by techniques conventionally used in the art. Enzymes may conveniently be produced in microorganisms. Microbial enzymes are available from a variety of sources; Bacillus species are a common source of bacterial enzymes, whereas fungal enzymes are commonly produced in Aspergillus species.

Suitable additional enzymes include other starch degrading enzymes, xylanases, oxidizing enzymes, fatty material splitting enzymes, or protein-degrading, modifying or crosslinking enzymes.

Starch degrading enzymes are for instance endo-acting enzymes such as alpha-amylase, maltogenic amylase, pullulanase or other debranching enzymes and exo-acting enzymes that cleave off glucose (amyloglucosidase), maltose (beta-amylase), maltotriose, maltotetraose and higher oligosaccharides.

Suitable xylanases are for instance xylanases, pentosanases, hemicellulase, arabinofuranosidase, glucanase, cellulase, cellobiohydrolase, beta-glucosidase, and others.

Oxidizing enzymes are for instance glucose oxidase, hexose oxidase, pyranose oxidase, sulfhydryl oxidase, lipoxygenase, laccase, polyphenol oxidases and others.

Fatty material splitting enzymes are for instance triacylglycerol lipases, phospholipases (such as A₂, B, C and D) and galactolipases.

Protein degrading, modifying or crosslinking enzymes are for instance endo-acting proteases (serine proteases, metalloproteases, aspartyl proteases, thiol proteases), exo-acting peptidases that cleave off one amino acid, or a dipeptide, tripeptide, etc. from the N-terminal (aminopeptidases) or C-terminal (carboxypeptidases) ends of the polypeptide chain, asparagine or glutamine deamidating enzymes such as deamidase and peptidoglutaminase, or crosslinking enzymes such as transglutaminase.

In a preferred embodiment, the additional enzyme may be an amylase, such as an alpha-amylase (can be useful for providing sugars fermentable by yeast) or beta-amylase, cyclodextrin glucanotransferase, peptidase, in particular, an exopeptidase (can be useful in flavour enhancement), transglutaminase, lipase (can be useful for the modification of lipids present in the food or food constituents), phospholipase, cellulase, hemicellulase, protein disulfide isomerase, peroxidase, laccase, or oxidase, e.g., a glucose oxidase, hexose oxidase, aldose oxidase, pyranose oxidase, lipoxygenase or L-amino acid oxidase.

When one or more additional enzyme activities are to be added in accordance with the methods of the present invention, these activities may be added separately or together with the polypeptide according to the invention.

In addition to the use of the enzymes according to the present invention in food applications, the present invention also relates to the use of the enzymes according to the present invention in other industrial applications.

The enzymes of the current invention may be used in new or improved methods for enzymatically degrading or converting plant cell wall polysaccharides from biomass into various useful products. In addition to cellulose and hemicellulose, plant cell walls contain associated pectins and lignins, the removal of which by enzymes of the current invention can improve accessibility to cellulases and hemicellulases, or which can themselves be converted to useful products.

Usually, biomass must be subjected to pre-treatment in order to make the cellulose more accessible and the enzymes of the current invention may also be used in improved methods for the processing of pretreated biomass. Pretreatment technologies may involve chemical, physical, or biological treatments. Examples of pre-treatment technologies include but are not limited to: steam explosion; ammonia; acid hydrolysis; alkaline hydrolysis; solvent extraction; crushing; milling; etc.

One example of a product produced from biomass is bioethanol. Bioethanol is usually produced by the fermentation of glucose to ethanol by yeasts such as Saccharomyces cerevisiae; in addition to ethanol, other chemicals may be synthesized starting from glucose. Ethanol, today, is produced mostly from sugars or starches, obtained from sugar cane, fruits and grains. In contrast, cellulosic ethanol is obtained from cellulose, the main component of wood, straw and most plants. Sources of biomass for cellulosic ethanol production comprise agricultural residues (such as leftover crop materials from stalks, leaves, and husks of corn plants), forestry wastes (such as chips and sawdust from lumber mills, dead trees, and tree branches), energy crops (such as dedicated fast-growing trees and grasses such as switch grass), municipal solid waste (such as household garbage and paper products), food processing and other industrial wastes (such as black liquor, paper manufacturing by-product, etc.).

Plant biomass is a mixture of plant polysaccharides, including cellulose, hemicelluloses, and pectin, together with the structural polymer, lignin. Glucose is released from cellulose by the action of mixtures of enzymes, including: endoglucanases, exoglucanases (cellobiohydrolases 1 and 2) and beta-glucosidases. Efficient large-scale conversion of cellulosic materials by such mixtures requires the full complement of enzymes, and can be enhanced by the addition of enzymes that attack the other plant cell wall components (hemicelluloses, pectins, and lignins), as well as chemical linkages between these components. Hence enzymes of the current invention that are highly expressed, or have high specific activity, stability, or resistance to inhibitors would improve the efficiency of the process, and lower enzyme costs. It would be an advantage to the art to improve the degradation and conversion of plant cell wall polysaccharides by composing cellulase mixtures using cellulase enzymes with such properties. Furthermore, enzymes of the current invention that are able to function at extremes of pH and temperature are desirable, both since improved enzyme robustness decreases costs, and because enzymes that function at high temperature will allow high processing temperatures under high substrate consistency conditions that decrease viscosity and thus improve yields.

Glycoside hydrolases from family GH61 are known to stimulate the activity of cellulase cocktails on lignocellulosic substrates and are thus considered to exhibit cellulase-enhancing activity (P. V. Harris et al., Biochemistry 49, 3305 (2010)). They have no known enzymatic activities of their own. Enhancement of cellulase cocktail efficiency by GH61 proteins of the current invention would contribute to lowering the costs of cellulase enzymes used for the production of glucose from plant cell biomass, as described above.

Enzymatic hydrolysis of plant hemicellulose yields 5-carbon sugars that either may be fermented to ethanol by some species of yeast, or converted to other types of chemical products. Enzymatic deconstruction of hemicellulose is also known to improve the accessibility of plant cell wall cellulose to cellulase enzymes for the production of glucose from lignocellulosic materials. Hemicellulase enzymes of the current invention that enhance glucose production from lignocellulose would find utility in the bioethanol industry and in other processes that rely on glucose or pentose streams from lignocellulose.

Lignin is composed of methoxylated phenyl-propane units linked by ether linkages and carbon-carbon bonds. The chemical composition of lignin may, depending on species, include guaiacyl, 4-hydroxyphenyl, and syringyl groups. Enzymatic modification of lignin by the enzymes of the current invention can be used for the production of structural materials from plant biomass, or alternatively improve the accessibility of plant cellulose and hemicelluloses to cellulase enzymes for the release of glucose from biomass as described above. Enzymes that degrade the lignin component of lignocellulose include lignin peroxidases, manganese-dependent peroxidases, versatile peroxidases, and laccases (Vicuna, 2000, Molecular Biotechnology 14: 173-176; Broda et al., 1996, Molecular Microbiology 19: 923-932). These enzymes of the current invention may also in certain instances be active in the decolourization of industrial dyes, and thus useful for the treatment and detoxification of chemical wastes.

Pectin degrading enzymes of the current invention can also enhance the action of cellulases on plant biomass by improving the accessibility of cellulase to the cellulose component of lignocellulose. The enzymes of the present invention may also be used in other applications for hydrolyzing non-starch polysaccharide (NSP).

One application is in the detergent industry for removal from laundry of carbohydrate-based stains. The textile industry uses various enzymes to improve the properties of its products. Such improvement relates to softness, quality of the finish, "stone-wash look" of denim, etc. Enzymes are used in detergents in order to improve their efficacy in removing most types of dirt. Enzymes have been used in textile processing since the early part of this century to remove starch-based sizing, but only in the past decade has serious attention been given to using enzymes for a wide range of textile applications. Enzymes are expected to have an even greater impact on effluent quality as more fibre preparation, pre-treatment and value-added finishing processes convert to biotreatment. In addition, enzymes are very effective catalysts even under mild conditions and do not require the high energy input often associated with chemical processes. The use of enzymes of the present invention finds utility in the detergent industry for removal from laundry of carbohydrate-based stains.

Feed enzymes have an important role to play in current farming systems. They can increase the digestibility of nutrients, leading to greater efficiency in the production of animal products such as meat and eggs. At the same time they can play a role in minimizing the environmental impact of increased animal production.

Non-starch polysaccharides (NSP) can increase the viscosity of the digesta which can, in turn, decrease nutrient availability and animal performance.

Endoxylanases and phytases are the best-known feed-enzyme products. Phytase enzymes hydrolyse phytic acid and release inorganic phosphate, thereby avoiding the need to add inorganic phosphates to the diet and reducing phosphorus excretion. Addition of xylanases to feed has also been shown to have positive effects on animal growth. Adding specific nutrients to feed improves animal digestion and thereby reduces feed costs. Many feed additives are currently in use and new concepts are continuously being developed. Use of specific enzymes such as non-starch carbohydrate degrading enzymes could break down the fibre, releasing energy as well as increasing protein digestibility owing to better accessibility of the protein once the fibre is broken down. In this way feed costs could come down, and the protein levels in the feed could also be reduced.

Non-starch polysaccharides (NSPs) are also present in virtually all feed ingredients of plant origin. NSPs are poorly utilized and can, when solubilized, exert adverse effects on digestion. Exogenous enzymes can contribute to a better utilization of these NSPs and as a consequence reduce any antinutritional effects. The hemicellulases and other polysaccharide-active enzymes of the present invention can be used for this purpose in cereal-based diets for poultry and, to a lesser extent, for pigs and other species.

The xylanases of the present invention can be used for prebleaching of kraft pulp. Xylanases have been found to be most effective for that purpose.

Xylanases attract increasing scientific and commercial attention due to applications in the pulp and paper industry for removal of hemicellulose from dissolving pulps or for enhancement of the bleachability of pulp and, thus, reduction of the use of environmentally harmful bleaching chemicals. A similar application of xylanases for pulp prebleaching is an already well-established technology and has greatly stimulated research on hemicellulases in the past decade. Although lignin-active peroxidases of the present invention may also be active in modification of lignin and hence have bleaching properties, such enzymes are generally less attractive for bleaching due to the need to use and recycle expensive redox mediators.

Xylanases of the present invention can be used to pre-bleach pulp to reduce the amount of bleaching chemicals needed to obtain a given brightness. It is suggested that xylanase depolymerises xylan blocks and increases accessibility or helps liberation of residual lignin by releasing xylan-chromophore fragments. When added to brownstock prior to bleaching, xylanases of the present invention can save on bleaching chemicals. The enzymes hydrolyze surface xylans and are able to break linkages between hemicellulose and lignin. Other hemicellulase active enzymes of the present invention which can break these linkages can function effectively in bleaching or pre-bleaching of pulp.

In addition, xylanases of the present invention can also be used in antibacterial formulations as well as in pharmaceutical products such as throat lozenges, toothpastes, and mouthwash.

Chitin is a β-(1,4)-linked polymer of N-acetyl D-glucosamine (GlcNAc), found as a structural polysaccharide in fungal cell walls as well as in the exoskeleton of arthropods and the outer shell of crustaceans. Approximately 75% of the total weight of shellfish is considered waste, and a large proportion of the material making up the waste is chitin. Chitin degrading enzymes of the current invention are useful in the modification and degradation of chitin, allowing the production of chitin-derived material, such as chitooligosaccharides and N-acetyl D-glucosamine, from chitin waste; another use of chitinase enzymes is as antifungal agents.

EXAMPLES

Fermentation of the organism

Materials & Methods

In general, for each species, starter mycelium was grown in rich medium (either mycological broth or yeast malt broth (the latter being indicated with YM in the growth conditions table)) and then washed with water. The starter was then used to inoculate different liquid media or solid substrate and the resulting mycelium was used for RNA extraction and library construction.

Following are the medium recipes and the solid substrates with a referenced source (if available) as well as a table listing the media variations, since in some cases the basic recipes of the referenced source have been altered depending on the species grown. This is then followed by a summary of the specific species as grown in the examples.

A. Mycological broth

Per liter: 10g soytone, 40g D-glucose, 1 ml Trace Element solution, Double-distilled water;

Adjust pH to 5.0 with hydrochloric acid (HCl) and bring volume to 1 L with double-distilled water.

Trace Element Solution contains 2 mM iron(II) sulphate heptahydrate (FeSO₄·7H₂O), 1 mM copper(II) sulphate pentahydrate (CuSO₄·5H₂O), 5 mM zinc sulphate heptahydrate (ZnSO₄·7H₂O), 10 mM manganese sulphate monohydrate (MnSO₄·H₂O), 5 mM cobalt(II) chloride hexahydrate (CoCl₂·6H₂O), 0.5 mM ammonium molybdate tetrahydrate ((NH₄)₆Mo₇O₂₄·4H₂O), and 95 mM hydrochloric acid (HCl) dissolved in double-distilled water.
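For convenience in preparing the Trace Element Solution, the millimolar concentrations given above can be converted into weighed masses. The following is only a sketch: the atomic masses are rounded standard values supplied as an assumption, the elemental compositions are written out from the hydrate formulas above, and the names `molar_mass` and `grams_per_liter` are hypothetical. Hydrochloric acid is omitted because it is normally dispensed by volume from a stock of known molarity.

```python
# Rounded standard atomic masses in g/mol (assumed values, not taken from the text).
ATOMIC_MASS = {
    "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06, "Cl": 35.45,
    "Mn": 54.938, "Fe": 55.845, "Co": 58.933, "Cu": 63.546,
    "Zn": 65.38, "Mo": 95.95,
}

# salt: (concentration in the stock, mM; elemental composition including water of hydration)
TRACE_SALTS = {
    "FeSO4.7H2O": (2.0, {"Fe": 1, "S": 1, "O": 11, "H": 14}),
    "CuSO4.5H2O": (1.0, {"Cu": 1, "S": 1, "O": 9, "H": 10}),
    "ZnSO4.7H2O": (5.0, {"Zn": 1, "S": 1, "O": 11, "H": 14}),
    "MnSO4.H2O": (10.0, {"Mn": 1, "S": 1, "O": 5, "H": 2}),
    "CoCl2.6H2O": (5.0, {"Co": 1, "Cl": 2, "O": 6, "H": 12}),
    "(NH4)6Mo7O24.4H2O": (0.5, {"N": 6, "H": 32, "Mo": 7, "O": 28}),
}


def molar_mass(composition: dict[str, int]) -> float:
    """Sum of atomic masses weighted by the number of atoms of each element."""
    return sum(ATOMIC_MASS[element] * n for element, n in composition.items())


def grams_per_liter(salt: str) -> float:
    """Mass of the hydrated salt to weigh out per liter of Trace Element Solution."""
    millimolar, composition = TRACE_SALTS[salt]
    return millimolar / 1000.0 * molar_mass(composition)


if __name__ == "__main__":
    for name in TRACE_SALTS:
        print(f"{name}: {grams_per_liter(name):.3f} g/L")
```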

B. Yeast-Malt broth (YM)

(Reference: ATCC medium No. 200)

Per liter: 3g yeast extract, 3g malt extract, 5g peptone, 10g D-glucose, Double-distilled water to 1 L.

C. Trametes Defined Medium (TDM)

(Reference: I. D. Reid and M. G. Paice. Effect of Residual lignin type and amount on biological bleaching of kraft pulp by Trametes versicolor. Applied and Environmental Microbiology 60: 1395-1400, 1994.)

Per liter: 10 g D-glucose, 0.75 g L-Asparagine monohydrate, 0.68 g Potassium phosphate monobasic (KH₂PO₄), 0.25 g Magnesium sulphate heptahydrate (MgSO₄·7H₂O), 15 mg Calcium chloride dihydrate (CaCl₂·2H₂O), 100 Thiamine hydrochloride, 1 ml Trace Element solution, 0.5 g Tween 80, Double-distilled water;

Adjust pH to 5.5 with 3M potassium hydroxide and bring volume to 1 L with double-distilled water.
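The trace metal concentrations in the finished TDM medium follow from diluting 1 ml of Trace Element Solution to a final volume of 1 L. The sketch below works through that arithmetic, using the stock concentrations stated for the Trace Element Solution (1 mM copper(II) sulphate and 10 mM manganese sulphate, among others). The function names are hypothetical, and the comparison with the raised copper and manganese levels of variations TDM-5 and TDM-4 in Table 1 is offered only as an illustration of the calculation.

```python
# Stock concentrations of the Trace Element Solution, in mM, as stated in the text.
STOCK_MM = {"Fe": 2.0, "Cu": 1.0, "Zn": 5.0, "Mn": 10.0, "Co": 5.0}


def final_micromolar(stock_mm: float,
                     stock_volume_ml: float = 1.0,
                     final_volume_ml: float = 1000.0) -> float:
    """Concentration in the medium (µM) after diluting the stock to the final volume."""
    return stock_mm * 1000.0 * stock_volume_ml / final_volume_ml


def fold_increase(target_um: float, baseline_um: float) -> float:
    """How many times higher a target concentration is than the baseline."""
    return target_um / baseline_um


if __name__ == "__main__":
    cu_base = final_micromolar(STOCK_MM["Cu"])  # copper in the basic recipe
    mn_base = final_micromolar(STOCK_MM["Mn"])  # manganese in the basic recipe
    print(f"Cu: {cu_base} µM in basic TDM; 20 µM is {fold_increase(20.0, cu_base)}-fold higher")
    print(f"Mn: {mn_base} µM in basic TDM; 0.2 mM is {fold_increase(200.0, mn_base)}-fold higher")
```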

Table 1. Variations of TDM media used for library construction

Variation Description

TDM-1 Medium was prepared as in basic recipe described above.

TDM-2 Quantity of asparagine monohydrate was reduced to 0.15g.

TDM-3 Manganese sulphate monohydrate was omitted from the medium.

TDM-4 The quantity of manganese sulphate monohydrate was raised to 0.2mM final concentration in the medium.

TDM-5 The quantity of copper (II) sulphate pentahydrate was raised to 20μM.

TDM-6 Glucose was replaced with 10g per liter of cellulose (Solka-Floc, 200FCC)

TDM-7 Glucose was replaced with 10g per liter of xylan from birchwood (Sigma Cat. # X-0502)

TDM-8 Glucose was replaced with 10g per liter of wheat bran¹.

TDM-9 Glucose was replaced with 10g per liter of citrus pectin (Sigma Cat. # P-9135).

TDM-10 Tween80 was omitted from the medium.

TDM-11 The double-distilled water was replaced with Whitewater² collected from peroxide bleaching (which occurs during the manufacture of fine paper).

TDM-12 The double-distilled water was replaced with Whitewater² collected from newsprint manufacture.

TDM-13 Glucose was replaced with 5g per liter of ground hardwood kraft pulp³.

TDM-14 The medium's pH was raised to 7.5.

TDM-15 The strain was incubated at 5°C above its optimum growth temperature.

TDM-16 The strain was incubated at 10°C below its optimum growth temperature.

TDM-17 One half of the double-distilled water was replaced with Whitewater from newsprint manufacture. Glucose was omitted.

TDM-18 Potassium phosphate monobasic was replaced with 5mM phytic acid from rice (Sigma Cat. # P3168).

TDM-19 Asparagine monohydrate was increased to 4g per liter.

TDM-20 Asparagine monohydrate was increased to 4g per liter and glucose was replaced with 2% fructose.

TDM-21 Asparagine monohydrate was increased to 4g per liter; 100ml of double-distilled water was replaced with 100ml kerosene⁴. Glucose was omitted.

TDM-22 Asparagine monohydrate was increased to 4g per liter; 100ml of double-distilled water was replaced with 100ml hexadecane (Sigma cat. # H0255). Glucose was omitted.

Applicant's behalf.

³ Hardwood kraft pulp was sourced from Quebec paper mills by PAPRICAN on the Applicant's behalf.

⁴ Kerosene was sourced from a general hardware store.

D. Asparagine Salts Medium (AS)

(Reference: R. Ikeda, T. Sugita, E. Jacobson, and T. Shinoda. Laccase and Melanization in Clinically Important Cryptococcus Species Other Than Cryptococcus neoformans Journal of Clinical Microbiology 40: 1214-1218, 2002)

Per liter: 3.0 g D-glucose, 1.0 g L-Asparagine monohydrate, 3.0 g KH₂PO₄, 0.5 g MgSO₄·7H₂O, 1 mg Thiamine.

Table 2. Variations of AS media used for library construction

E. Solid substrates used:

SS-1 5 g Wheat Bran.

SS-2 5g Wheat bran plus 5ml defined lipid.

SS-3 5g Oat bran (food grade, sourced from supermarket).

The Myceliophthora fergusii (Corynascus thermophilus) strain was grown according to the methods described above under the following growth conditions: TDM-1, -2, -3, -4, -5, -6, -7, -8, -9, -10, -13, -14, -15, -39; YM. The following optimal growth temperature was used: 25°C.

The strains carrying the recombinant genes were grown according to the methods described above under the following growth conditions: minimal medium as described in Kafer (1977, Adv Genet. 19:33-131), except that the salt concentrations were raised ten-fold and the glucose concentration was 150 grams per litre, at 30°C.

Genome sequencing and assembly

Genomic DNA was isolated from mycelium when the growth culture had reached the mid log phase. Genomic DNA was sequenced using the Roche 454 Titanium technology (http://www.454.com) to a genome coverage of over 20-fold according to the manufacturer's instructions. The sequences were assembled using the Newbler and Celera assemblers (http://sourceforge.net/apps/mediawiki/wgs-assembler).

Building the cDNA library

Total RNA was isolated from fungal cells or mycelia when the growth cultures had reached the late log phase. The mycelia were collected by filtration through Miracloth and washed with water by filtration. The mycelia were patted dry using paper towels, frozen in liquid nitrogen and stored at -80°C. To extract total RNA, the frozen mycelia or cells were ground to a fine powder in liquid nitrogen using a pestle and mortar. Approximately 1-1.5 g of frozen fungal powder was dissolved in 10 ml of TRIzol® reagent and RNA was extracted according to the manufacturer's protocol (Invitrogen Life Sciences, Catalog #15596-018). Following extraction, the RNA was dissolved at 1-1.5 mg/ml in DEPC-treated water.

The PolyATtract® mRNA Isolation System (Promega, Catalog #Z5300) was used to isolate poly(A)+RNA. In general, equal amounts of total RNA extracted from up to ten culture conditions were pooled. One milligram of total RNA was used for isolation of poly(A)+RNA according to the protocol provided by the manufacturer. The purified poly(A)+RNA was dissolved at 200-500 μg/ml in DEPC-treated water.

Five micrograms of poly(A)+RNA were used for the construction of the cDNA library.

Double-stranded cDNA was synthesized using the ZAP-cDNA® Synthesis Kit (Stratagene, Catalog #200400) according to the manufacturer's protocol with the following modifications. An anchored oligo(dT) linker-primer was used in the first-strand synthesis reaction to force the primer to anneal to the beginning of the poly(A) tail of the mRNA. The anchored oligo(dT) linker-primer has the sequence:

5'-GAGAGAGAGAGAGAGAGAGAACTAGTCTCGAGTTTTTTTTTTTTTTTTTTVN-3' (SEQ ID NO: 16)

where V is A, C, or G and N is A, C, G, or T. A second modification was made by adding trehalose at a final concentration of 0.6 M and betaine at a final concentration of 2 M to the buffer of the first-strand synthesis reaction to promote full-length synthesis. Following synthesis and size fractionation, fractions of double-stranded cDNA with sizes longer than 600 bp were pooled. The pooled cDNA was cloned directionally into the plasmid vector BlueScript KS+® (Stratagene) or a modified BlueScript KS+ vector that contained Gateway® (Invitrogen) recombination sites. The cDNA library was transformed into E. coli strain XL10-Gold ultracompetent cells (Stratagene, Catalog #200315) for propagation.
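The degenerate 3' end of the anchored linker-primer can be written out explicitly. The following Python sketch expands the codes V and N as defined above into the individual oligonucleotides present in the primer pool. The function name and lookup table are illustrative only and are not part of the described protocol.

```python
from itertools import product

# Anchored oligo(dT) linker-primer as given in the text (SEQ ID NO: 16).
PRIMER = "GAGAGAGAGAGAGAGAGAGAACTAGTCTCGAGTTTTTTTTTTTTTTTTTTVN"

# Degenerate codes used in the primer, as defined in the text.
DEGENERATE = {"V": "ACG", "N": "ACGT"}


def expand_degenerate(seq):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [DEGENERATE.get(base, base) for base in seq]
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate(PRIMER)
print(len(variants))  # 3 choices for V times 4 choices for N
for variant in variants:
    print(variant[-4:])  # last two T residues plus the two anchor bases
```

Because V excludes T, the template base paired with V cannot be A, which is what positions the primer at the start of the poly(A) tail rather than within it.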

Bacterial cells carrying cDNA clones were grown on LB agar containing the antibiotic ampicillin for selection of plasmid-bearing bacteria, and X-gal and IPTG for blue/white screening for the presence of cDNA inserts. The white bacterial colonies, those carrying cDNA inserts, were transferred by a colony-picking robot to 384-well MTP for replication and storage. Clones that were to be analyzed by sequencing were transferred to 96-well deep blocks using liquid-handling robots. The bacteria were cultured at 37°C with shaking at 150 rpm. After 24 hours of growth, plasmid DNA from the cDNA clones was prepared by alkaline lysis and sequenced from the 5' end using ABI 3730xl DNA analyzers (Applied Biosystems). The chromatograms obtained following single-pass sequencing of the cDNA clones were processed using Phred (available at http://www.phrap.org) to assign sequence quality values, Lucy as described in Chou and Holmes (2001, Bioinformatics, 17(12) 1093-1104) to remove vector and low quality sequences, and Phrap (available at http://www.phrap.org/) to assemble overlapping sequences derived from the same gene into contigs.

Annotation

An in-house automated annotation pipeline was used to predict genes in the assembled genome sequence. The analysis pipeline used in part the ab initio tool Genemark (http://exon.biology.gatech.edu/) for prediction. It also used the predictor Augustus (http://augustus.gobics.de/) trained on de novo assembled sequences and orthologous sequences for gene finding. Sequence similarity searches against the mycoCLAP (http://cubique.fungalgenomics.ca/mycoCLAP/) and NCBI non-redundant databases were performed with BLASTX as described in Altschul et al. (1997) (Nucleic Acids Res. 25(17): 3389-3402). Biomass-degrading enzymes possess conserved domains. We used the domains available at the European Bioinformatics Institute (www.ebi.ac.uk/Tools/lnterProScan/) to assist in the identification of target enzymes.

Proteins targeted to the extracellular space by the classical secretory pathway possess an N-terminal signal peptide, composed of a central hydrophobic core surrounded by N- and C-terminal hydrophilic regions. We used Phobius (available at http://phobius.cgb.ki.se) and SignalP version 3 (available at http://www.cbs.dtu.dk/services/SignalP) to recognize the presence of signal peptides encoded by the cDNA clones. The tools TargetP (available at http://www.cbs.dtu.dk/services/TargetP) and Big-PI Fungal Predictor (available at http://mendel.imp.ac.at/gpi/fungi_server.html) were used to remove sequences that encode proteins which are targeted to the mitochondria or bound to the cell wall. Finally, sequences predicted to encode soluble secreted proteins by these automated tools were analyzed manually. Clones that comprise full-length cDNAs which are predicted to encode soluble secreted proteins were sequenced completely. For genes identified from the genome sequence, oligonucleotide primers specific to the target genes were designed and used to PCR amplify the target genes from double-stranded cDNA or genomic DNA. The PCR amplified products were cloned into an appropriate expression vector for protein production in host cells.

General Molecular Biology Procedures:

Standard molecular cloning techniques such as DNA isolation, gel electrophoresis, enzymatic restriction modifications of nucleic acids, E. coli transformation etc. were performed as described by Sambrook et al., 1989 (Molecular cloning: a laboratory manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York) and Innis et al., 1990 (PCR protocols, a guide to methods and applications, Academic Press, San Diego, edited by Michael A. Innis et al.). Primers were prepared by IDT (Integrated DNA Technologies). Sanger DNA sequencing was performed using Applied Biosystems 3730xl DNA Analyzer technology at the Innovation Centre (Genome Quebec), McGill University in Montreal.

Construction of pGBFIN49 expression plasmids

Genes of interest were cloned into the expression vector pGBFIN-49. This vector is a derivative of pGBFIN-41 that contains the A. niger glaA promoter, A. niger TrpC terminator, A. nidulans gpd promoter, the phleomycin resistance gene, A. niger glaA terminator and an E. coli backbone. Figure 1 represents a schematic map of pGBFIN-49 and the complete nucleotide sequence is presented as SEQ ID NO: 17.

Details of the construction of pGBFIN-49 are as follows:

1. TtrpC terminator PCR amplification (0.7 kb):

TtrpC terminator was PCR amplified using purified pGBFIN33 plasmid as a template. The following primers and PCR program were used:

Primer-3: 5'-GTCCGTCGCCGTCCTTCAccgccggtccgacg-3' (SEQ ID NO: 18)

Primer-4: 5'-GCGGCCGGCGTATTGGGTGttacggagc-3' (SEQ ID NO: 19)

Primer-4 is entirely specific to the TtrpC 3' end. Primer-3 was designed to suit the LIC cloning strategy but also to keep the TtrpC sequence as close as possible to the original sequence. To do so, five adenines were replaced by thymines (underlined).

PCR master mix:

pGBFIN33 1 μl (5-10 ng)

Primer-3 (10mM) 1 μl

Primer-4 (10mM) 1 μl

dNTPs (2mM) 5 μl

HF Buffer (5x) 10 μl

Phusion DNA pol 0.5 μl

Nuclease-free water 31.5 μl

Total 50 μl

PCR program: 1x98°C - 2 min; 25x ( 98°C - 30 sec, 68°C - 30 sec, 72°C - 1 min); 72°C - 7 min.

Reaction conditions: 5 μl of the PCR reaction was run on a 1.0% agarose gel and the remainder was purified using the QIAEX II Gel Extraction kit (QIAGEN) and resuspended in nuclease-free water.
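The water volume in each master mix is simply the difference between the total reaction volume and the sum of the other components. A minimal Python sketch of that arithmetic is shown here for the TtrpC reaction; the function and dictionary names are illustrative and not part of the described method.

```python
def water_volume(components, total_ul=50.0):
    """Nuclease-free water (in microliters) needed to reach total_ul."""
    used = sum(components.values())
    if used > total_ul:
        raise ValueError("components exceed the total reaction volume")
    return total_ul - used


# Volumes in microliters for the TtrpC terminator reaction listed above.
ttrpc_mix = {
    "pGBFIN33 template": 1.0,
    "Primer-3": 1.0,
    "Primer-4": 1.0,
    "dNTPs": 5.0,
    "HF Buffer (5x)": 10.0,
    "Phusion DNA pol": 0.5,
}

print(water_volume(ttrpc_mix))  # 31.5
```

The same function applies to the other 50 μl master mixes in this section by substituting their component volumes.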

2. pGBFIN41 vector PCR amplification (8.3 kb):

Vector backbone was PCR amplified using pGBFIN41 as a template. Primers were designed outside of the ccdA region (not included in pGBFIN49). The following primers and PCR program were used:

Primer-2: 5'-CACCCAATACGCCGGCCGCgcttccagacagctc-3' (SEQ ID NO: 20)

Primer-1C: 5'-GGTGTTTTGTTGCTGGGGAtgaagctcaggctctcagttgcgtc-3' (SEQ ID NO: 21)

Primer-2 contains a pgpdA-specific region and an extra sequence specific to the TtrpC 3' end (also included in Primer-4). Primer-1C was designed to suit the LIC cloning strategy but also to keep the PglaA region as close as possible to the original sequence. To do so, three thymines were replaced by adenines (underlined).

PCR master mix:

pGBFIN41 1 μl (50 ng)

Primer-2 (10mM) 1 μl

Primer-1C (10mM) 1 μl

dNTPs (2mM) 5 μl

HF Buffer (5x) 10 μl

Phusion DNA pol 0.5 μl

DMSO 1 μl

Nuclease-free water 30.5 μl

Total 50 μl

PCR program: 1x 98°C - 3 min; 10x (98°C - 30 sec, 68°C - 30 sec, 72°C - 5 min); 20x (98°C - 30 sec, 68°C - 30 sec, 72°C - 5 min + 10 sec/cycle); 72°C - 10 min.

Reaction conditions: 5 μl of the PCR reaction was run on a 0.5% agarose gel and the remainder was purified using the QIAEX II Gel Extraction kit (QIAGEN) and resuspended in nuclease-free water.
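The long range program used for the vector backbone lengthens the extension step by 10 seconds per cycle during its last 20 cycles. The sketch below adds up the programmed hold times. It assumes that the n-th of those 20 cycles is extended by n x 10 seconds, which the text does not specify, and it ignores ramp times between temperatures.

```python
def long_range_program_seconds():
    """Sum of programmed hold times for the long range PCR program."""
    initial_denaturation = 3 * 60
    # 10 cycles: 98°C 30 s, 68°C 30 s, 72°C 5 min
    fixed_cycles = 10 * (30 + 30 + 5 * 60)
    # 20 cycles with the extension growing by 10 s per cycle
    # (assumption: cycle n is lengthened by n * 10 s).
    incremental_cycles = sum(30 + 30 + 5 * 60 + 10 * n for n in range(1, 21))
    final_extension = 10 * 60
    return initial_denaturation + fixed_cycles + incremental_cycles + final_extension


total = long_range_program_seconds()
print(total, "seconds")
print(total / 3600, "hours")
```

Under these assumptions the hold times alone add up to 13680 seconds, or 3.8 hours, so the run occupies a thermocycler for most of a working half-day.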

3. pGBFIN41 + TtrpC overlap-extension PCR:

Overlap-extension / long range PCR was performed to a) fuse the two PCR pieces together; b) add an SfoI restriction site to recircularize the vector. No primers were used in the overlap-extension stage. Primer-11 and Primer-12 were used for the long range PCR reaction.

Primer-11: 5'-CACCGGCGCCGTCCGTCGCCGTCCTTC-3' (SEQ ID NO: 22)

Primer-12: 5'-ACGGCGCCGGTGTTTTGTTGCTGGGGATG-3' (SEQ ID NO: 23)

Primer-11 is specific to the LIC tag located on the TtrpC terminator while Primer-12 is specific to the LIC tag located on the PglaA region. The SfoI restriction site sequence is underlined.

A standard PCR master mix was prepared to perform overlap-extension PCR using pGBFIN41 and TtrpC purified PCR products as templates. No primers were added.

Overlap-extension master mix:

TtrpC 1 μl

pGBFIN41 9 μl

Buffer GC (5x) 10 μl

dNTPs (2mM) 5 μl

Phusion DNA pol 0.5 μl

Nuclease-free water 24.5 μl

Total 50 μl

PCR program - overlap (no primers): 1x 98°C - 2 min; 5x (98°C - 15 sec, 58°C - 30 sec, 72°C - 5 min); 5x (98°C - 15 sec, 63°C - 30 sec, 72°C - 5 min); 5x (98°C - 15 sec, 68°C - 30 sec, 72°C - 5 min); 72°C - 10 min.

The overlap-extension PCR product was then purified on a QIAEX II column and 5 μl of the purified reaction was used as template DNA for the long range PCR step with Primer-11 and Primer-12.

PCR master mix:

Overlap product 5 μl

Primer-11 (10mM) 1 μl

Primer-12 (10mM) 1 μl

dNTPs (2mM) 5 μl

HF Buffer (5x) 10 μl

Phusion DNA pol 0.5 μl

DMSO 1 μl

Nuclease-free water 26.5 μl

Total 50 μl

PCR program - long range: 1x 98°C - 3 min; 10x (98°C - 30 sec, 68°C - 30 sec, 72°C - 5 min); 20x (98°C - 30 sec, 68°C - 30 sec, 72°C - 5 min + 10 sec/cycle); 72°C - 10 min.

Reaction conditions: 5 μl of the PCR reaction was run on a 0.5% agarose gel and the remainder was purified using the QIAEX II Gel Extraction kit and resuspended in nuclease-free water. Then, SfoI digestion was performed and the digested product was purified using the QIAEX II Gel Extraction kit following the procedure described by the manufacturer.

4. Ligation:

100 ng of the purified digested fragment was ligated to itself using 1 μl of T4 DNA Ligase (New England Biolabs, M0202) and incubated at 16°C overnight. Enzyme inactivation was performed at 65°C for 10 minutes.

Then, 10 μl of ligation product was transformed into DH5α E. coli competent cells and plated on 2xYT agar containing 100 μg/ml ampicillin. DNA extraction was performed on single colonies the next day. Restriction analysis and sequencing were done to confirm the structure.

Cloning of Corynascus thermophilus genes in E. coli

Genes of interest were cloned into the pGBFIN-49 expression vector using the ligation-independent cloning (LIC) method according to Aslanidis, C., de Jong, P. (1990) Nucleic Acids Research Vol. 18 No. 20, 6069-6074.

Coding sequences from genes of interest were amplified by PCR using primers containing LIC tags, which are homologous to PglaA and TrpC sequences in the pGBFIN-49 cloning vector, fused to sequences homologous to the coding sequences of the gene of interest, and either genomic DNA or cDNA as template. Primers have the following sequences:

Forward primer: 5'-CCCCAGCAACAAAACACCTCAGCAATG...15-20 nucleotides specific to each gene to be cloned (SEQ ID NO: 24)

Reverse primer: 5'-GAAGGACGGCGACGGACTTCA...15-20 nucleotides specific to each gene to be cloned (SEQ ID NO: 25)

PCR mix consists of following components:

PCR amplification was carried out with following conditions:

Following PCR, 90 μl milliQ water was added to each sample and the mix was purified using a Multiscreen PCR₉₆ Filter Plate (Millipore) according to the manufacturer's instructions. The PCR product was eluted from the filter in 25 μl 10 mM Tris-HCl pH 8.0.

Expression vector pGBFIN-49 was PCR amplified using primers with following sequences:

Forward primer: 5'- GTCCGTCGCCGTCCTTCACCG -3' (SEQ ID NO: 26)

Reverse primer: 5'- GGTGTTTTGTTGCTGGGGATGAAGC -3' (SEQ ID NO: 27)

(Primers are located on either side of the SfoI restriction site.)

PCR mix consists of following components:

PCR amplification was carried out with following conditions:

Following PCR, 1 μl DpnI was added to the PCR mix and digestion was allowed to proceed overnight at 37°C. The digested PCR product was purified using the Qiaquick PCR purification kit (Qiagen) according to the manufacturer's instructions.

The PCR fragments obtained were treated with T4 DNA polymerase in the presence of dTTP to create single-stranded tails at the ends of the PCR fragments. The single-stranded tails of the PCR fragment are complementary to those of the vector, thus permitting non-covalent bi-molecular associations, e.g. circularization between molecules.
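The tail formation on the vector fragment can be illustrated with a short Python sketch. It relies on the general LIC principle that, with a single dNTP present, the 3'->5' exonuclease activity of T4 DNA polymerase trims each 3' end until it reaches the first position where that nucleotide can be re-incorporated. Only the dTTP-treated vector ends are modeled here, using the primer sequences given in this section; the function names are illustrative, and the nucleotide used for the insert is not assumed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhang(strand, dntp_base):
    """5' single-stranded tail left on `strand` after exonuclease trimming.

    The opposite strand is trimmed from its 3' end until the first base
    equal to dntp_base; on `strand` that position carries the complement.
    """
    stop_base = reverse_complement(dntp_base)
    index = strand.find(stop_base)
    return strand if index == -1 else strand[:index]


# Primer sequences copied from the text (each forms one 5' end of a fragment).
VECTOR_FWD = "GTCCGTCGCCGTCCTTCACCG"           # SEQ ID NO: 26
VECTOR_REV = "GGTGTTTTGTTGCTGGGGATGAAGC"       # SEQ ID NO: 27
INSERT_FWD_TAG = "CCCCAGCAACAAAACACCTCAGCAATG"  # SEQ ID NO: 24, tag portion
INSERT_REV_TAG = "GAAGGACGGCGACGGACTTCA"        # SEQ ID NO: 25, tag portion

tail_ttrpc = five_prime_overhang(VECTOR_FWD, "T")
tail_pgla = five_prime_overhang(VECTOR_REV, "T")
print(tail_ttrpc, len(tail_ttrpc))
print(tail_pgla, len(tail_pgla))

# Each vector tail is the reverse complement of the start of an insert LIC tag.
print(INSERT_REV_TAG.startswith(reverse_complement(tail_ttrpc)))
print(INSERT_FWD_TAG.startswith(reverse_complement(tail_pgla)))
```

The final two checks show why the treated fragments can anneal without ligase: each vector tail pairs with the leading bases of the corresponding insert primer tag.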

Reaction mix of T4 DNA polymerase treatment of the pGBFIN-49 PCR fragment consists of the following components:

Purified pGBFIN-49 PCR fragment 600 ng

10X NEB Buffer 2 2 μl

25 mM dTTP 2 μl

DTT 100 μM 0.8 μl

T4 DNA Polymerase 3 U/μl 1 μl

H₂O Up to 20 μl

TOTAL 20 μl

Reaction mix of T4 DNA polymerase treatment of the Gene of Interest (GOI) fragment consists of the following components:

Reaction conditions were as follows:

Following T4 DNA polymerase treatment, 2 μl pGBFIN-49 vector and 4 μl of the GOI were mixed and incubated at room temperature, allowing annealing of the GOI fragment with the pGBFIN-49 vector fragment. The bi-molecular forms were used to transform E. coli. Plasmid DNA of the resulting transformants was isolated and verified by sequence analysis for correct amplification and cloning of the gene of interest.

Transformation of Corynascus thermophilus gene expression cassettes into A. niger.

As host strain for enzyme production, A. niger GBA307 was used. Construction of A. niger GBA307 is described in WO2011009700.

Transformation of A. niger was performed essentially according to the method described by Tilburn, J. et. al. (1983) Gene 26, 205-221 and Kelly, J & Hynes, M. (1985) EMBO J., 4, 475- 479 with the following modifications:

- Spores were grown for 16-24 hours at 30°C in a rotary shaker at 250 rpm in Aspergillus minimal medium. Aspergillus minimal medium contains per liter: 6 g NaNO₃; 0.52 g KCl; 1.52 g KH₂PO₄; 1.12 ml 4 M KOH; 0.52 g MgSO₄·7H₂O; 10 g glucose; 1 g casamino acids; 22 mg ZnSO₄·7H₂O; 11 mg H₃BO₃; 5 mg FeSO₄·7H₂O; 1.7 mg CoCl₂·6H₂O; 1.6 mg CuSO₄·5H₂O; 5 mg MnCl₂·2H₂O; 1.5 mg Na₂MoO₄·2H₂O; 50 mg EDTA; 2 mg riboflavin; 2 mg thiamine-HCl; 2 mg nicotinamide; 1 mg pyridoxine-HCl; 0.2 mg pantothenic acid; 4 μg biotin; 10 ml Penicillin (5000 IU/ml)/Streptomycin (5000 μg/ml) solution (Invitrogen);

- Glucanex 200G (Novozymes) was used for the preparation of protoplasts;

- After protoplast formation (2-3 hours) 10 ml TB layer (per liter: 109.32 g Sorbitol; 100 ml 1 M Tris-HCl pH 7.5) was pipetted gently on top of the protoplast suspension. After centrifugation for 10 min at 4330 x g at 4°C in a swinging bucket rotor, the protoplasts on the interface were transferred to a fresh tube and washed with STC buffer (1.2 M Sorbitol, 10 mM Tris-HCl pH 7.5, 50 mM CaCl₂). The protoplast suspension was centrifuged for 10 min at 1560 x g in a swinging bucket rotor and resuspended in STC buffer at a concentration of 10⁸ protoplasts/ml;

- To 200 μl of the protoplast suspension, 20 μl ATA (0.4 M Aurintricarboxylic acid), the DNA dissolved in 10 μl TE buffer (10 mM Tris-HCl pH 7.5, 0.1 mM EDTA), and 100 μl of a PEG solution (20% PEG 4000 (Merck), 0.8 M sorbitol, 10 mM Tris-HCl pH 7.5, 50 mM CaCl₂) were added;

- After incubation of the DNA-protoplast suspension for 10 min at room temperature, 1.5 ml PEG solution (60% PEG 4000 (Merck), 10 mM Tris-HCl pH 7.5, 50 mM CaCl₂) was added slowly, with repeated mixing of the tubes. After incubation for 20 min at room temperature, suspensions were diluted with 5 ml 1.2 M sorbitol, mixed by inversion and centrifuged for 10 min at 2770 x g at room temperature;

- The protoplasts were resuspended gently in 1 ml 1.2 M sorbitol and plated onto selective regeneration medium consisting of Aspergillus minimal medium without riboflavin, thiamine-HCl, nicotinamide, pyridoxine, pantothenic acid, biotin, casamino acids and glucose, supplemented with 150 μg/ml Phleomycin (Invitrogen), 0.07 M NaNO₃, 1 M sucrose, solidified with 2% bacteriological agar #1 (Oxoid, England). After incubation for 5-10 days at 30°C, single transformants were isolated on PDA (Potato Dextrose Agar, Difco) supplemented with 150 μg/ml Phleomycin in 96-well MTP. After 5-7 days growth at 30°C single transformants were used for MTP fermentation.

Approximately 1 x 10⁶-1 x 10⁷ spores were inoculated in 20 ml pre-culture medium containing Maltose 30 g/l; Peptone (from casein) 10 g/l; Yeast extract 5 g/l; KH₂PO₄ 1 g/l; MgSO₄·7H₂O 0.5 g/l; ZnCl₂ 0.03 g/l; CaCl₂ 0.02 g/l; MnSO₄·4H₂O 0.01 g/l; FeSO₄·7H₂O 0.3 g/l; Tween-80 3 g/l; pH 5.5. After growing overnight at 34°C in a rotary shaker, 10-15 ml of the growing culture was inoculated in 100 ml main culture containing Glucose·H₂O 70 g/l; Peptone (from casein) 25 g/l; Yeast extract 12.5 g/l; K₂SO₄ 2 g/l; KH₂PO₄ 1 g/l; MgSO₄·7H₂O 0.5 g/l; ZnCl₂ 0.03 g/l; CaCl₂ 0.02 g/l; MnSO₄·1H₂O 0.009 g/l; FeSO₄·7H₂O 0.003 g/l; pH 5.6.

Note: for GH61 enzymes the culture media were supplemented with 10 μM CuSO₄.

Main cultures were grown until all glucose was consumed as measured with Combur Test N strips (Roche), which was mostly the case after 4-7 days of growth. Culture supernatants were harvested by centrifugation for 10 minutes at 5000 x g followed by germ-free filtration of the supernatant over 0.2 μm PES filters (Nalgene).

Protein concentration determination with TCA-biuret method:

Concentrated protein samples (supernatants) were diluted with water to a concentration between 2 and 8 mg/ml. Bovine serum albumin (BSA) dilutions (0, 1, 2, 5, 8 and 10 mg/ml) were made and included as samples to generate a calibration curve. 1 ml of each diluted protein sample was transferred into a 10 ml tube containing 1 ml of a 20% (w/v) trichloroacetic acid solution in water and mixed thoroughly. Subsequently, the tubes were incubated on ice water for one hour and centrifuged for 30 minutes at 4°C and 6000 rpm. The supernatant was discarded and pellets were dried by inverting the tubes on a tissue and letting them stand for 30 minutes at room temperature. Next, 4 ml BioQuant Biuret reagent mix was added to the pellet in the tube and the pellet was solubilised upon mixing. Next, 1 ml water was added to the tube, the tube was mixed thoroughly and incubated at room temperature for 30 minutes. The absorption of the mixture was measured at 546 nm with a water sample used as a blank measurement and the protein concentration was calculated via the BSA calibration line.
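The calculation via the BSA calibration line can be sketched as an ordinary least-squares fit followed by inversion of the line. The Python below is an illustrative sketch only: the absorbance values are synthetic placeholders generated from an arbitrary line, not measured data, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_mg_per_ml(a546, slope, intercept, dilution_factor=1.0):
    """Protein concentration of the undiluted sample from its A546."""
    return (a546 - intercept) / slope * dilution_factor


# BSA standards from the text (mg/ml).
bsa_mg_per_ml = [0, 1, 2, 5, 8, 10]
# Placeholder absorbances on an arbitrary line; replace with measured values.
bsa_a546 = [0.02 + 0.05 * c for c in bsa_mg_per_ml]

slope, intercept = fit_line(bsa_mg_per_ml, bsa_a546)
# Hypothetical sample that was diluted 4-fold before precipitation.
print(protein_mg_per_ml(0.27, slope, intercept, dilution_factor=4))
```

Because the standards go through the same precipitation and biuret steps as the samples, only the initial dilution with water needs to be corrected for.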

Sugar-release activity assays

A. niger strains expressing Corynascus thermophilus clones were grown in shake flasks, as described above, in order to obtain greater amounts of material for further testing. The fermentation supernatants (volume between 40 and 80 ml) were concentrated using a 10 kDa spin filter to a volume of approximately 5 ml. Subsequently, the protein concentration in the concentrated supernatant was determined via the TCA-biuret method, as described above. The (hemi-)cellulase activity of these protein samples was tested in an assay where the supernatants were spiked on top of an enzyme base mix in the presence of 10% (w/w) acid pretreated corn stover. 'To spike' or 'spiking' of a supernatant or an enzyme indicates in this context the addition of a supernatant or an enzyme to a (hemi-)cellulase base mix. The feedstock solution was prepared via the dilution of a concentrated feedstock solution with water. Subsequently the pH was adjusted to pH 4.5 with a 4 M NaOH solution. The proteins were spiked based on protein dosage in a total volume of 10 ml at a feedstock concentration of 10% aCS (w/w) in a 30-ml centrifuge bottle (Nalgene Oakridge). All experiments were performed at least in duplicate and were incubated for 72 hours at 65°C in an oven incubator (Techne HB-1D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPLC as described below.

Soluble sugar analysis by HPLC

The sugar content of the samples after enzymatic hydrolysis was analyzed using a High-Performance Liquid Chromatography System (Agilent 1100) equipped with a refractive index detector (Agilent 1260 Infinity). The separation of the sugars was achieved by using a 300 x 7.8 mm Aminex HPX-87P column (Bio Rad cat no 125-0098); pre-column: Micro guard Carbo-P (Bio Rad cat no 125-0119); mobile phase: HPLC grade water; flow rate: 0.6 ml/min; column temperature: 85°C. The injection volume was 10 μl.

The samples were diluted with HPLC grade water to a maximum of 10 g/l glucose and filtered using a 0.2 μm filter (Acrodisc LC 25 mm syringe filter, PVDF membrane). The glucose was identified and quantified according to the retention time, which was compared to the external glucose standard (D-(+)-Glucose, Sigma cat no: G7528) at 0.2, 0.4, 1.0 and 2.0 g/l.

α-arabino(furano)sidase activity assay

This assay measures the ability of α-arabino(furano)sidases to remove the α-L-arabinofuranosyl residues from substituted xylose residues.

Single and double substituted oligosaccharides are prepared by incubating wheat arabinoxylan (WAX medium viscosity; 2 mg/mL; Megazyme, Bray, Ireland) in 50 mM acetate buffer pH 4.5 with an appropriate amount of endo-xylanase (Aspergillus awamori, FJM Kormelink, Carbohydrate Research, 249 (1993) 355-367) for 48 hours at 50°C to produce a sufficient amount of arabinoxylo-oligosaccharides. The reaction is stopped by heating the samples at 100°C for 10 minutes. The samples are centrifuged for 5 minutes at 10,000 x g. The supernatant is used for further experiments. Degradation of the arabinoxylan is followed by High Performance Anion Exchange Chromatography (HPAEC).

The enzyme is added to the single and double substituted arabinoxylo-oligosaccharides (endo-xylanase treated WAX) in a dosage of 10 mg protein/g substrate in 50 mM sodium acetate buffer, which is then incubated at 65°C for 24 hours. The reaction is stopped by heating the samples at 100°C for 10 minutes. The samples are centrifuged for 5 minutes at 10,000 x g and diluted 10-fold. Release of arabinose from the arabinoxylo-oligosaccharides is analyzed by HPAEC analysis.

The analysis is performed using a Dionex HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (2 mm ID x 50 mm) and a Dionex PAD-detector (Dionex Co. Sunnyvale). A flow rate of 0.3 mL/min is used with the following gradient of sodium acetate in 0.1 M NaOH: 0-40 min, 0-400 mM. Each elution is followed by a washing step of 5 min 1000 mM sodium acetate in 0.1 M NaOH and an equilibration step of 15 min 0.1 M NaOH. Arabinose release is quantified by an arabinose standard (Sigma) and compared to a sample where no enzyme was added.

Endo-xylanase activity assay 1

Endo-xylanases are enzymes able to hydrolyze β-1,4 bonds in the xylan backbone, producing short xylo-oligosaccharides. This assay measures the release of xylose and xylo-oligosaccharides by the action of xylanases on wheat arabinoxylan (WAX) (Megazyme, Medium viscosity 29 cSt) and Beech Wood Xylan (Beech) (Sigma).

Sodium acetate buffer (0.05 M, pH 4.5) is prepared as follows; 4.1 g of anhydrous sodium acetate is dissolved in distilled water to a final volume of 1000 mL (Solution A). In a separate flask, 3.0 g (2.86 mL) of glacial acetic acid is mixed with distilled water to make the total volume of 1000 mL (Solution B). The final 0.05 M sodium acetate buffer, pH 4.5, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is 4.5.
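An approximate starting ratio for mixing Solution A and Solution B can be estimated with the Henderson-Hasselbalch equation. The Python sketch below assumes a pKa of 4.76 for acetic acid at 25°C and ideal behaviour (activity effects ignored); the function name is illustrative, and the final adjustment is still made against a pH meter as described above.

```python
PKA_ACETIC_ACID = 4.76        # assumed value at 25°C
MW_SODIUM_ACETATE = 82.03     # g/mol, anhydrous
MW_ACETIC_ACID = 60.05        # g/mol

# Molar concentrations of the two stock solutions described in the text.
conc_a = 4.1 / MW_SODIUM_ACETATE  # Solution A, sodium acetate (mol/L)
conc_b = 3.0 / MW_ACETIC_ACID     # Solution B, acetic acid (mol/L)


def fraction_solution_a(ph, pka=PKA_ACETIC_ACID):
    """Estimated volume fraction of Solution A in the final buffer."""
    # Henderson-Hasselbalch: [acetate] / [acetic acid] = 10 ** (pH - pKa)
    mole_ratio = 10 ** (ph - pka)
    # Convert the mole ratio into a volume ratio V_A / V_B.
    volume_ratio = mole_ratio * conc_b / conc_a
    return volume_ratio / (1 + volume_ratio)


frac_a = fraction_solution_a(4.5)
print(round(conc_a, 4), round(conc_b, 4))  # both close to 0.05 M
print(round(frac_a * 1000), "mL of Solution A per liter of buffer (estimate)")
```

Under these assumptions roughly one third of the final volume comes from Solution A, so it is practical to start from Solution B and add Solution A while monitoring the pH.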

The substrates WAX and Beech are dissolved in sodium acetate buffer to obtain 2.0 mg/mL. The enzyme is added to the substrate in a dosage of 10 mg protein/g substrate, which is then incubated at 65°C for 24 hours. The reaction is stopped by heating the samples for 10 minutes at 100°C. The release of xylose and (arabino)xylo-oligosaccharides is analyzed by High Performance Anion Exchange Chromatography.

As a blank sample the substrate is treated and incubated in the same way but then without the addition of enzyme. The analysis is performed using a Dionex HPLC system equipped with a Dionex CarboPac PA-1 (2 mm ID x 250 mm) column in combination with a CarboPac PA guard column (2 mm ID x 50 mm) and a Dionex PAD-detector (Dionex Co. Sunnyvale). A flow rate of 0.3 mL/min is used with the following gradient of sodium acetate in 0.1 M NaOH: 0-40 min, 0-400 mM. Each elution is followed by a washing step of 5 min 1000 mM sodium acetate in 0.1 M NaOH and an equilibration step of 15 min 0.1 M NaOH. Standards of xylose, xylobiose, xylotriose and xylotetraose (Sigma) are used to identify and quantify these oligomers released by the action of the enzyme.

Endo-xylanase activity assay 2

(Megazyme, Medium viscosity 29 cSt).

The substrate WAX is dissolved in sodium acetate buffer to obtain 2.0 mg/mL. The enzyme is added to the substrate in a dosage of 1 mg protein/g substrate, which is then incubated at 65°C for 24 hours. During these 24 hours samples are taken and the reaction is stopped by heating the samples for 10 minutes at 100°C.

The enzyme activity is demonstrated by using a reducing sugars assay (PAHBAH) as detection method.

Reagent A: 5 g of p-Hydroxybenzoic acid hydrazide (PAHBAH) is suspended in 60 mL water, 4.1 mL of concentrated hydrochloric acid is added and the volume is adjusted to 100 mL. Reagent B: 0.5 M sodium hydroxide. Both reagents are stored at room temperature. Working Reagent: 10 mL of Reagent A is added to 40 mL of Reagent B. This solution is prepared freshly every day, and is stored on ice between uses. Using the above reagents, the assay is performed as detailed below.

The assay is conducted in microtiter plate format. After incubation, 10 μl of each sample is added to a well and mixed with 150 μl working reagent. These solutions are heated at 70°C for 30 minutes or at 90°C for 5 minutes. After cooling down, the samples are analyzed by measuring the absorbance at 405 nm. The standard curve is made by treating 10 μl of an appropriately diluted xylose solution in the same way as the samples. The reducing ends formed by the action of the enzyme are expressed as xylose equivalents.

Acetyl xylan esterase activity assay

Acetyl xylan esterases are enzymes able to hydrolyze ester linked acetyl groups attached to the xylan backbone, releasing acetic acid. This assay measures the release of acetic acid by the action of acetyl xylan esterase on acid pretreated corn stover (pCS) that contains ester linked acetyl groups.

Determine the presence of acetyl groups in pCS

The pCS used contains approximately 284 (± 5.5) μg acetic acid/20 mg pCS as determined according to the following method.

About 20 mg of pCS substrate was weighed in a 2 ml reaction tube and placed in an ice-water bath. Then 1 mL of 0.4 M NaOH in Millipore water/isopropanol (1:1) was added and the sample was thoroughly mixed. This was incubated on ice for 1 hour. Subsequently, the samples were mixed again and incubated for 2 additional hours at room temperature (with occasional mixing). After this, samples were centrifuged for 5 min at 12000 rpm and the supernatant was analyzed for acetic acid content by HPLC.

Enzyme incubations

Enzyme incubations were performed in citrate buffer (0.05 M, pH 4.5) which is prepared as follows; 14.7 g of tri-sodium citrate is dissolved in distilled water to a final volume of 1000 mL (Solution A). In a separate flask, 10.5 g citric acid monohydrate is mixed with distilled water to make the total volume of 1000 mL (Solution B). The final 0.05 M sodium citrate buffer, pH 4.5, is prepared by mixing Solution A with Solution B until the pH of the resulting solution is 4.5.

The pCS substrate is suspended in citrate buffer to obtain approximately 20 mg/mL. The enzyme is added to the substrate at a dosage of 1 or 10 mg protein/g substrate, and the mixture is then incubated at 60°C for 24 hours with head-over-tail rotation. The reaction is stopped by heating the samples for 10 minutes at 100°C. The release of acetic acid is analyzed by HPLC.

As a blank sample the substrate is treated and incubated in the same way but then without the addition of enzyme.

The analysis is performed using an Ultimate 3000 system (Dionex) equipped with a Shodex RI detector and an Aminex HPX 87H column (7.8 mm ID x 300 mm) (BioRad). A flow rate of 0.6 mL/min is used with 5.0 mM H₂SO₄ as eluent for 30 minutes at a column temperature of 40°C. Acetic acid was used as a standard to quantify its release from pCS by the enzymes.
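The dosing and the interpretation of the HPLC result can be sketched as follows. The substrate load (20 mg/mL), the doses (1 or 10 mg protein/g substrate) and the total acetyl content (284 μg acetic acid per 20 mg pCS) are taken from the text above; the sample and blank readings are hypothetical, and expressing the release as a fraction of the alkali-releasable acetic acid is an illustrative choice rather than a calculation described in the source.

```python
SUBSTRATE_MG_PER_ML = 20.0          # pCS load in the incubation (from the text)
TOTAL_ACETIC_ACID_UG_PER_20_MG = 284.0  # alkali-releasable acetic acid (from the text)


def enzyme_mg_per_ml(dose_mg_per_g, substrate_mg_per_ml=SUBSTRATE_MG_PER_ML):
    """Protein concentration in the incubation for a dose in mg protein/g substrate."""
    return dose_mg_per_g * substrate_mg_per_ml / 1000.0


def fraction_released(sample_ug_per_ml, blank_ug_per_ml,
                      substrate_mg_per_ml=SUBSTRATE_MG_PER_ML):
    """Blank-corrected acetic acid release as a fraction of the total acetyl content."""
    max_ug_per_ml = TOTAL_ACETIC_ACID_UG_PER_20_MG * substrate_mg_per_ml / 20.0
    return (sample_ug_per_ml - blank_ug_per_ml) / max_ug_per_ml


low_dose = enzyme_mg_per_ml(1.0)
high_dose = enzyme_mg_per_ml(10.0)
# Hypothetical HPLC readings (NOT from the source), in ug acetic acid per mL.
released = fraction_released(sample_ug_per_ml=152.0, blank_ug_per_ml=10.0)
print(f"{low_dose:.2f} and {high_dose:.2f} mg protein/mL; fraction released {released:.2f}")
```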

Rasamsonia (Talaromyces) emersonii strain was deposited at CENTRAAL BUREAU VOOR SCHIMMELCULTURES, Uppsalalaan 8, P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands in December 1964 having the Accession Number CBS 393.64.

Other suitable strains can equally be used in the present examples to show the effect and advantages of the invention. For example, TEC-101, TEC-147, TEC-192, TEC-201 or TEC-210 are suitable Rasamsonia strains, which are described in WO2011/000949. The "4E mix" or "4E composition" was used containing CBHI, CBHII, EG4 and BG (30wt%, 25wt%, 28wt% and 8wt%, respectively, as described in WO2011/098577, wt% on dry matter protein).

Rasamsonia (Talaromyces) emersonii strain TEC-101 (also designated as FBG 101) was deposited at CENTRAAL BUREAU VOOR SCHIMMELCULTURES, Uppsalalaan 8, P.O. Box 85167, NL-3508 AD Utrecht, The Netherlands on 30th June 2010 having the Accession Number CBS 127450.

TEC-210 was fermented according to the inoculation and fermentation procedures described in WO2011/000949. The 4E mix (4 enzymes mixture or 4 enzyme mix) containing CBHI, CBHII, GH61 and BG (30%, 25%, 36% and 9%, respectively, as described in WO2011/098577) was used.

3E mix (3 enzymes mixture or 3 enzyme mix) is spiked with a fourth enzyme to form the 4E mix.

Example 1. Identification of CORTH (Corynascus thermophilus) genes that encode a secreted protein

Genes were identified that, based on curation (described above), encoded a secreted protein. A list of these genes is shown in Table 8.

Example 2. Improvement of a thermophilic cellulase mixture composed of three enzymes by a Corynascus thermophilus betaglucosidase

The cellulase activity of a Corynascus thermophilus BG protein (Corth2p4_001043) was further analyzed. The supernatant of the shake flask fermentation of A. niger expressing Corth2p4_001043 was concentrated and spiked at a dosage of 0.45 mg/gDM on top of a three-enzyme base mix (4.55 mg/gDM, composed of CBHI at 1.25 mg/gDM, CBHII at 1.5 mg/gDM and GH61 at 1.8 mg/gDM) at a feedstock concentration of 10% (w/w) aCS, as described above. As a negative control, the 3 enzyme base mix was also tested. All experiments were performed at least in duplicate and were incubated for 72 hours at 65°C in an oven incubator (Techne HB-1D hybridization oven) while rotating at set-point 3. After incubation, the samples were centrifuged and soluble sugars were analysed by HPLC as described above.
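The dosing arithmetic of this experiment can be checked with a short sketch. It assumes that all three base-mix components are dosed in mg/gDM, which is the reading under which they add up to the stated 4.55 mg/gDM; the variable names are illustrative.

```python
# Component doses of the three-enzyme base mix, assumed to be in mg protein
# per g dry matter (mg/gDM) so that they sum to the stated 4.55 mg/gDM.
base_mix = {"CBHI": 1.25, "CBHII": 1.5, "GH61": 1.8}
bg_spike = 0.45  # Corth2p4_001043 betaglucosidase spike, mg/gDM

base_total = sum(base_mix.values())
total_dose = base_total + bg_spike

# Share of each protein in the resulting four-enzyme mixture, in percent.
shares = {name: 100.0 * dose / total_dose for name, dose in base_mix.items()}
shares["BG"] = 100.0 * bg_spike / total_dose

print(f"base mix: {base_total:.2f} mg/gDM, total: {total_dose:.2f} mg/gDM")
for name, pct in shares.items():
    print(f"{name}: {pct:.0f}%")
```

With these numbers the spiked betaglucosidase makes up 9% of a total protein dose of 5.0 mg/gDM.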

Addition of this Corynascus thermophilus BG protein showed increased sugar release as shown below in Table 3.

Table 3: Effect of BG protein Corth2p4_001043 spiked on top of a 3E mix using aCS substrate.

Example 3. Identification of thermophilic Corynascus thermophilus arabino(furano)sidases

The arabino(furano)sidase activity of Corynascus thermophilus enzymes was analysed as described above. The supernatants of A. niger shake flask fermentations were concentrated and assayed for arabinose release from wheat arabinoxylan, which was pre-digested by an endo-xylanase, after incubation for 24 hours at pH 4.5 and 65°C. Two enzymes showed increased arabinose release as shown below in Table 4.

Table 4: Effect of Corynascus thermophilus arabinofuranosidases on wheat arabinoxylan substrate pre-digested with an endo-xylanase.

Example 4. Identification of thermophilic Corynascus thermophilus endo-xylanases

The endo-xylanase activity of several Corynascus thermophilus enzymes was analysed. The supernatants of the A. niger shake flask fermentations expressing the Corynascus thermophilus enzymes were concentrated and assayed for endo-xylanase activity on wheat arabinoxylan oligosaccharides and beech wood xylan as described above in endo-xylanase activity assay 1. Enzyme CORTH_1_02177 was able to release xylose and xylooligomers from the two substrates after incubation for 24 hours with 1% (w/w) enzyme dose at pH 4.5 and 65°C, as shown in Table 5.

Table 5: Effect of Corynascus thermophilus endo-xylanase CORTH_1_02177 on the release

In a second experiment, the endo-xylanase activity of Corynascus thermophilus enzymes was analysed as described above in endo-xylanase activity assay 2. The supernatants of the A. niger shake flask fermentations expressing the Corynascus thermophilus enzymes were concentrated and assayed for endo-xylanase activity by measuring reducing-end formation, expressed as xylose equivalents, after incubation of the enzymes at 0.1% (w/w) dose on wheat arabinoxylan for 24 hours at 65°C and pH 4.5. Enzyme CORTH_1_02177 was able to release reducing sugars from the substrate as shown in Table 6.
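The time course reported in Table 6 can be blank-corrected by subtracting the no-enzyme reading at each time point. The sketch below uses the values from Table 6; the blank subtraction and the comparison with the 24 hour value are one possible way to read the data and are not steps described in the source.

```python
# Values from Table 6, in ug xylose equivalents/mL, indexed by time in hours.
time_h = [0, 0.5, 1, 2, 3, 4, 6, 24]
no_enzyme = [-6.0, -14.5, -17.0, -11.9, -11.7, -12.6, -11.8, -9.9]
corth_1_02177 = [2.3, 431.4, 446.6, 466.8, 481.5, 436.5, 452.2, 502.7]

# Blank-corrected release: enzyme reading minus the no-enzyme control.
corrected = [e - b for e, b in zip(corth_1_02177, no_enzyme)]

# Fraction of the 24 h value already reached after 0.5 h.
early_fraction = corrected[time_h.index(0.5)] / corrected[time_h.index(24)]

for t, value in zip(time_h, corrected):
    print(f"{t:>4} h: {value:6.1f} ug xylose equivalents/mL")
print(f"fraction of 24 h release reached at 0.5 h: {early_fraction:.2f}")
```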

Table 6: Effect of Corynascus thermophilus endo-xylanase CORTH_1_02177 on the release of reducing sugars (reported as xylose equivalents) from wheat arabinoxylan.

Values in μg xylose equivalents/mL.

Time (h)         0      0.5     1       2       3       4       6       24
no enzyme        -6.0   -14.5   -17.0   -11.9   -11.7   -12.6   -11.8   -9.9
CORTH_1_02177    2.3    431.4   446.6   466.8   481.5   436.5   452.2   502.7

Example 5. Identification of thermophilic Corynascus thermophilus acetyl-xylan esterase

The acetyl xylan esterase activity of Corynascus thermophilus Corth2p4_004688 was analysed. The supernatant of the A. niger shake flask fermentation expressing this enzyme was concentrated and assayed for acetic acid release from pretreated corn stover as described above. The enzyme was identified as an active acetyl xylan esterase because it was able to release acetic acid from the substrate, as shown in Table 7.

Table 7: Effect of Corynascus thermophilus acetyl xylan esterase Corth2p4_004688 enzyme on the release of acetic acid from pretreated corn stover.

Table 8. List of target genes and reference to gene, transcript and protein sequences

Gene IDs and SEQ ID NOs are given as used in the present text, with the corresponding entry of the priority application in parentheses.

Gene ID                              Enzyme function                            Enzyme family   Genomic sequence SEQ ID NO:   Coding sequence SEQ ID NO:   Amino acid sequence SEQ ID NO:
Corth2p4_001043 (Corth2p4_001043)    betaglucosidase                            GH3             1 (58)                        2 (59)                       3 (60)
Corth2p4_000941 (Corth2p4_000941)    arabinan endo-1,5-alpha-L-arabinosidase    GH43            4 (40)                        5 (41)                       6 (42)
Corth2p4_007532 (Corth2p4_007532)    1,4-beta-xylosidase                        GH43            7 (433)                       8 (434)                      9 (435)
Corth2p4_004688 (Corth2p4_004688)    acetylxylan esterase                       CE5             10 (250)                      11 (251)                     12 (252)
CORTH_1_02177 (Corth2p4_003506)      Endo-1,4-beta-xylanase                     GH1             13 (190)                      14 (191)                     15 (192)

Claims

1. A process for degrading biomass or pretreated biomass to sugars wherein an enzyme is used comprising a polypeptide having
a. a polypeptide sequence as set forth in any one of SEQ ID NOs: 3, 6, 9, 12 and 15;
b. a polypeptide that is at least 60%, preferably at least 70%, more preferably at least 80%, even more preferably at least 90%, 95%, 96%, 97%, 98% or 99% homologous to any one of SEQ ID NOs: 3, 6, 9, 12 and 15;
c. a polypeptide sequence encoded by a nucleic acid sequence as set forth in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14 or by nucleic acids that are at least 60%, preferably at least 70%, more preferably at least 80%, even more preferably at least 90%, 95%, 96%, 97%, 98% or 99% homologous to any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14;
d. a polypeptide sequence encoded by a nucleic acid sequence hybridizing under stringent conditions to the polynucleotide as set forth in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14; or
e. a polypeptide sequence encoded by a nucleic acid sequence hybridizing under stringent conditions to the reverse complement of a polynucleotide as set forth in any one of SEQ ID NOs: 1, 2, 4, 5, 7, 8, 10, 11, 13 and 14.

2. A process for degrading biomass or pretreated biomass to sugars according to claim 1 wherein the enzyme is a cellulase-enhancing protein, a glycoside hydrolase, preferably a betaglucosidase, an arabinan endo-1,5-alpha-L-arabinosidase, a 1,4-beta-xylosidase, an acetylxylan esterase or an endo-1,4-beta-xylanase.

3. A process for degrading biomass or pretreated biomass to sugars according to claim 1 or 2 wherein the polypeptide is obtainable from Myceliophthora fergusii (Corynascus thermophilus).

4. A process for degrading biomass or pretreated biomass to sugars according to any one of claims 1 to 3 further comprising adding an enzyme that has cellulase enhancing activity.

5. A process for degrading biomass or pretreated biomass to sugars according to any one of claims 1 to 4 wherein the formed sugars are converted into ethanol.

6. A process for degrading biomass or pretreated biomass to sugars according to any one of claims 1 to 5 further comprising adding a cellulase or cellulases.

7. A process for degrading biomass or pretreated biomass to sugars according to any one of claims 1 to 6 wherein the cellulolytic material or lignin is pretreated.