WO2013164340A1 - Endoglucanases with improved properties - Google Patents

Endoglucanases with improved properties

Info

Publication number
WO2013164340A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
protein
endoglucanase
preferred
amino acid
Application number
PCT/EP2013/058985
Other languages
French (fr)
Inventor
Christoph Reisinger
Jörg CLAREN
Isabel Unterstrasser
Aleksandra Mitrovic
Karlheinz Flicker
Gabi GEBHARD
Original Assignee
Clariant Produkte (Deutschland) Gmbh
Application filed by Clariant Produkte (Deutschland) Gmbh
Priority to EA201401199A (patent EA034175B1)
Priority to BR112014027168A (patent BR112014027168A8)
Priority to AU2013255931A (patent AU2013255931B2)
Priority to EP13719552.5A (patent EP2844749A1)
Priority to CN201380023399.9A (patent CN104271738B)
Priority to CA2871841A (patent CA2871841C)
Priority to US14/397,980 (patent US9677059B2)
Priority to MX2014013255A (patent MX2014013255A)
Publication of WO2013164340A1
Priority to ZA2014/07404A (patent ZA201407404B)

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/24: Hydrolases (3) acting on glycosyl compounds (3.2)
    • C12N9/2402: Hydrolases (3) acting on glycosyl compounds (3.2) hydrolysing O- and S-glycosyl compounds (3.2.1)
    • C12N9/2405: Glucanases
    • C12N9/2434: Glucanases acting on beta-1,4-glucosidic bonds
    • C12N9/2437: Cellulases (3.2.1.4; 3.2.1.74; 3.2.1.91; 3.2.1.150)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00: Preparation of compounds containing saccharide radicals
    • C12P19/02: Monosaccharides
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00: Preparation of compounds containing saccharide radicals
    • C12P19/14: Preparation of compounds containing saccharide radicals produced by the action of a carbohydrase (EC 3.2.x), e.g. by alpha-amylase, e.g. by cellulase, hemicellulase

Definitions

  • Cellulose is a major component of plant material. It is the basis for the structural integrity of plants and is often found in a lignocellulose matrix composed of cellulose, hemicelluloses, and lignin. Applications employing cellulose take advantage of either its structural properties (fibers, textiles, paper, etc.) or of its carbohydrate nature, producing D-glucose, cellobiose and/or cellulose oligomers.
  • Lignocelluloses are readily available from agriculture and forestry, including byproduct streams from cereals, corn, sugar cane, sugar beet, timber, etc. Plants that are optimized for their lignocellulose content and yield ("energy crops") will likely become an important resource in the near future.
  • Cellulases comprise a structurally and functionally diverse class of glycohydrolases acting on cellulose.
  • Cellulases are found in bacteria, archaea, fungi and plants. While they share the hydrolytic cleavage of glycosidic bonds present in cellulose polymers or oligomers, they differ in substrate specificity, mode of action, and enzyme parameters, including processivity and pH and temperature optima. Most cellulases act on β-1,4-linkages between two glucose moieties. However, other linkages found in lignocelluloses may also be hydrolysed. Cellulases can be subdivided by their mode of action into endo- and exo-enzymes.
  • Endoglucanases introduce random cleavages into the cellulose polymer, thereby reducing the degree of polymerization.
  • Exo-enzymes, like cellobiohydrolases, work in a successive mode of action, releasing cellobiose (D-glucose-β-1,4-D-glucopyranoside) from the reducing or non-reducing end of the polymer.
  • the CAZy database [Cantarel BL, Coutinho PM, Rancurel C, Bernard T, Lombard V, Henrissat B (2009) The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics. Nucleic Acids Res 37:D233-238, PMID: 18838391] holds, amongst others, a collection of known glucohydrolases including cellulose-degrading enzymes (i.e. cellulases). In this database, enzymes are classified into different GH classes according to structural elements.
  • GH classes include endoglucanases, in particular the classes GH5, GH7, GH9, GH12, GH16, GH45, GH48, GH61 and GH74.
  • members of one GH class often have similar physical and enzymatic parameters. This allows general statements to be made about properties such as substrate specificity, pH range, stability, or catalytic efficiency for members of a certain GH class.
  • Cellulose-degrading microorganisms often produce and secrete a complex mixture of cellulases.
  • endoglucanases have been identified belonging to 6 different GH classes (Cel5A, Cel7B, Cel12A, Cel45A, Cel61A, Cel61B, Cel74A).
  • the different endoglucanases show a spectrum of properties (Karlsson J, Siika-aho M, Tenkanen M, Tjerneld F. Enzymatic properties of the low molecular mass endoglucanases Cel12A (EG III) and Cel45A (EG V) of Trichoderma reesei. J Biotechnol.
  • thermostable GH7 family protein is often advantageous for high hydrolysis rates.
  • Endoglucanases were reported as part of complex enzyme mixtures and as single enzyme activities.
  • Cellulases are important for making cellulose-derived biofuels. After cutting and, optionally, chemical and/or physical pretreatment, lignocelluloses are incubated with cellulases to release sugar monomers that are further processed. Process conditions need to be adapted to optimize hydrolysis rates, yields and/or stability. Higher temperatures are often preferred in these processes but require more thermostable enzymes.
  • Simultaneous saccharification and fermentation (SSF) processes require cellulolytic enzymes that are active under fermentative conditions.
  • Consolidated bioprocessing (CBP) further requires the combination of enzyme properties, in order to have enzyme production, saccharification and fermentation done in a single step.
  • Endoglucanases are also used where only a partial hydrolysis or modification of cellulose fibers is intended (fiber modification, biopolishing, biostoning, etc.). Endoglucanases used for this purpose need to work and/or be stable at elevated temperatures, extreme (e.g. alkaline, acid) pH, and harsh chemical conditions (e.g. laundry, detergents, proteases, solvents, etc.). Fiber damage must be minimized for such applications. Endoglucanases can also assist in the separation of non-cellulosic fractions from the fiber material in pulping processes (pulp & paper production) or improve rheological properties of process streams.
  • Detergent stability and protease resistance can be seen as a product of increased stability of the enzyme structure, a property that is also connected to increased thermal stability.
  • Endoglucanases also find applications in food and feed processing (breweries, wine production, oil recovery from press cake, baking, dough preparation). Often sterilization or pasteurization requires higher temperatures. For shortening of processing times, the operational stability of the endoglucanase can be advantageous.
  • Endoglucanase I proteins (Cel7B) derived from fungi of the genus Trichoderma (anamorph Hypocrea) show high degrees of identity and are considered mesophilic.
  • the most stable members of endoglucanases from the GH family 7 reported are native enzymes from Humicola insolens (Cel7B) and Fusarium oxysporum (eg1) (US5912157). According to said report, EG I does not exhibit activity above 60 °C. There is thus a need in the field for the provision of more thermostable endoglucanases from the GH family 7.
  • endoglucanases of GH12 and GH45 were reported.
  • Thermostable endoglucanases have been reported from the structural folds of GH5 and GH48. Said endoglucanases substantially differ with respect to their kinetic properties and substrate preference from the endoglucanases of the GH7 class.
  • processive endoglucanases particularly of the GH7 family, with superior temperature profiles. It would furthermore be desirable to achieve good productivity from their expression host.
  • the need is further supported by the fact that many processes of industrial relevance run under harsh conditions and at elevated temperatures.
  • a problem to be solved by the present invention is the provision of improved endoglucanases, particularly of endoglucanases with improved thermal properties. Further problems addressed and solved by this invention will become apparent from the sections below.
  • thermostable endoglucanase proteins (polypeptides).
  • the solutions provided are:
  • a protein having endoglucanase activity which belongs to the GH7 class and which shows active thermostabilization.
  • a protein having endoglucanase activity which comprises an amino acid sequence having at least 96%, preferably at least 97%, more preferably at least 98%, even more preferably at least 99%, such as at least 99.5% identity to SEQ ID NO: 2.
  • the endoglucanase proteins of the invention show more than 95% residual activity at 60 °C.
  • a further aspect of the invention are nucleic acids encoding said polypeptides and expression constructs comprising these polynucleotides in a vector backbone contained in an organism.
  • Another aspect of the invention is the application of the proteins of the invention for the processing of lignocellulose and cellulose materials, in particular the saccharification of lignocellulose feedstock in consolidated, partially consolidated or non-consolidated processes, or the processing of food, feed or cellulose fiber, or cleaning applications.
  • the invention also relates to production/expression organisms for the production of the proteins of the invention and to processes for the cultivation of such organisms for the purpose of protein production.
  • the organisms are selected from organisms including microorganisms (fungal, bacterial, or archaeal) or plants.
  • FIG. 1 The SDS-gel shows the expression of a Seq. ID NO 8 - a Seq. ID NO 2 variant - protein secreted into the supernatant. The band of the expressed protein is visible between 75 and 100 kDa.
  • Figure 2 Trichoderma reesei expression plasmid. The DNA sequence coding for the mature endoglucanase gene is cloned in fusion to the TrCBHI signal peptide sequence under control of the TrCBHI promoter.
  • the SwaI/SbfI excisable expression cassette contains a hygromycin resistance cassette for selection of transformants.
  • Figure 4 Calibration of 200 µl portions of alkaline 4-methylumbelliferone solution to the fluorescence readout in a Tecan Infinite 200 plate-reader.
  • a 10 mM solution was prepared by dissolving 440 mg of 4-methylumbelliferone (Sigma Aldrich Cat. Nr. 69580) in 250 ml of 0.5 M sodium carbonate solution.
  • Serial dilutions were prepared in 0.5 M sodium carbonate. Fluorescence intensity was measured at 360 nm/454 nm at gain 50.
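The stock concentration and calibration described for Figure 4 can be checked with a short calculation. The following Python sketch is illustrative only: the molar mass of 4-methylumbelliferone (176.17 g/mol for C10H8O3) is taken as an assumption, and the function names and calibration data points are hypothetical and are not measurements from the document.

```python
MW_4MU = 176.17  # g/mol; assumed molar mass of 4-methylumbelliferone (C10H8O3)


def molarity_mM(mass_mg, volume_ml, mw=MW_4MU):
    """Concentration in mM of mass_mg of solute dissolved in volume_ml."""
    # mg / (g/mol) gives mmol; dividing by litres gives mmol/L = mM
    return (mass_mg / mw) / (volume_ml / 1000.0)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def signal_to_conc(signal, slope, intercept):
    """Invert the calibration line to get a concentration from a signal."""
    return (signal - intercept) / slope


# 440 mg in 250 ml, as stated in the text
print(round(molarity_mM(440, 250), 2))  # 9.99, i.e. approximately 10 mM

# Hypothetical calibration points (µM vs. fluorescence units), for illustration
conc_uM = [0, 5, 10, 20]
signal = [1000, 4700, 8400, 15800]
slope, intercept = fit_line(conc_uM, signal)
```

With a fitted line of this kind, a fluorescence readout can be converted back into a 4-methylumbelliferone concentration through `signal_to_conc`.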
  • Figure 5 Thermal stabilization of Seq. ID NO: 13 compared to Seq. ID NO: 4
  • Figure 6 Determination of half-lives at 70 °C (Example 7) for Seq. ID NO: 14 compared to Seq. ID NO: 4.
  • Thermostability is a term used to describe an intrinsic property of a particular protein with endoglucanase activity according to the present invention.
  • Active thermostabilization is a term used to describe an intrinsic property of a particular protein with endoglucanase activity according to the present invention.
  • thermostability and/or active thermostabilization are determined as follows.
  • the enzyme is expressed in Pichia pastoris as described in Example 2.
  • the enzyme is optionally purified.
  • An enzyme solution of an appropriate concentration is made by dilution of purified enzyme or Pichia pastoris culture supernatant in sodium acetate buffer (50 mM, pH 5) to an applicable working concentration.
  • a serial dilution of the enzyme obtained in step 1) above is prepared in the sodium acetate buffer and 10 µl aliquots are tested in the temperature gradient as described in Example 4.
  • An applicable working concentration is defined as a concentration which results in a fluorescence signal between 5,000 and 15,000 in a Tecan Infinite M200 plate-reader at gain 50, or an equivalent concentration of 5.4 µM to 19 µM 4-methylumbelliferone after incubation as described in Example 4.
  • a protein is characterized as temperature stable if the relative substrate conversion at 60 °C is 0.5 or more, preferably 0.7 or more and more preferably 0.9 or more, such as 0.95 or more.
  • Determination of active thermostabilization: Analysis of the plot obtained in step 5) for the presence of a plateau at a relative substrate conversion which is lower than the maximum level (which is 1), but which is at least as high as 0.15.
  • a plateau is defined as a level of the relative substrate conversion which is essentially unchanged within a temperature range of at least 5 °C, preferably from 70 to 75 °C (i.e. within +/- 0.1 around the average value within said temperature range).
  • Variants showing no active thermostabilization have a relative substrate conversion between 0 and lower than 0.15, usually around 0.1.
  • the measured relative substrate conversion of usually around 0.1 is due to finite temperature ramps in the thermocycler and/or during handling of the sample mixtures.
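The two criteria above (temperature stability at 60 °C and a plateau between 0.15 and the maximum of 1) can be sketched as simple checks on a temperature/conversion curve. This is a minimal sketch under stated assumptions, not the assay of Example 4: the curve is assumed to be already normalized to a maximum of 1, the plateau is only evaluated in the preferred 70 to 75 °C window, and both example curves are invented for illustration.

```python
from statistics import mean


def is_temperature_stable(curve, threshold=0.5):
    """curve maps temperature (deg C) to relative substrate conversion.

    Assumes the curve contains a measurement at exactly 60 deg C.
    """
    return curve[60] >= threshold


def has_plateau(curve, t_low=70.0, t_high=75.0, floor=0.15, tol=0.1):
    """Check for a plateau in the window [t_low, t_high].

    A plateau here means: all values lie within +/- tol of their average,
    and the average is at least `floor` but below the maximum level of 1.
    """
    values = [r for t, r in curve.items() if t_low <= t <= t_high]
    if len(values) < 2:
        return False  # not enough points to judge flatness
    avg = mean(values)
    flat = all(abs(v - avg) <= tol for v in values)
    return flat and floor <= avg < 1.0


# Hypothetical curves, for illustration only
variant = {46: 1.0, 60: 0.96, 65: 0.60, 70: 0.32, 72: 0.30, 75: 0.29, 80: 0.05}
parent = {46: 1.0, 60: 0.10, 70: 0.10, 72: 0.10, 75: 0.09}

print(is_temperature_stable(variant), has_plateau(variant))  # True True
print(is_temperature_stable(parent), has_plateau(parent))    # False False
```

The stricter preferred thresholds (0.7, 0.9, 0.95) can be tested by passing a different `threshold` value.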
  • Thermal properties is a term generally used to refer to the properties of an enzyme at higher temperatures (e.g. 60 °C or more).
  • the term can include one or both of "temperature stability" as defined above and "active thermostabilization" as described above.
  • Endoglucanase activity in the context of this invention is defined as the catalytic acceleration, by a protein, of the breakage of β-1,4-glucosidic bonds via nucleophilic attack by a polar molecule such as water, or by organic molecules with their hydroxyl, mercapto or amino functions.
  • the definition also includes the cleavage of synthetic molecules having a non-carbohydrate molecule linked to glucose, cellobiose or lactose via a β-1,4-glycosidic linkage.
  • Example reactions catalyzed by endoglucanases are listed by the Brenda Database ( http://www. brenda-enzym es .
  • Residual activity is defined as the enzymatic activity that is recovered after incubation of the enzyme for a defined time at a defined (elevated) temperature in comparison to the activity without the incubation step.
  • a protocol for the determination of the residual activity is given in Example 4.
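As a worked illustration of the definition of residual activity, the ratio can be written as a one-line calculation. The function name and the numbers are hypothetical; the actual protocol is the one referred to in Example 4.

```python
def residual_activity(activity_after_incubation, activity_without_incubation):
    """Fraction of activity recovered after incubation at elevated temperature."""
    if activity_without_incubation <= 0:
        raise ValueError("reference activity must be positive")
    return activity_after_incubation / activity_without_incubation


# Hypothetical fluorescence readings: 9,600 after incubation vs. 10,000 without
print(residual_activity(9600.0, 10000.0))  # 0.96, i.e. 96 % residual activity
```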
  • Sequence Alignment with SEQ ID NO: 2: Pairwise alignment of any second GH7 endoglucanase sequence with the parental sequence (SEQ ID NO: 2) is done using the ClustalW algorithm (Larkin M.A., Blackshields G., Brown N.P., Chenna R., McGettigan P.A., McWilliam H., Valentin F., Wallace I.M., Wilm A., Lopez R., Thompson J.D., Gibson T.J. and Higgins D.G. (2007) ClustalW and ClustalX version 2. Bioinformatics 23(21): 2947-2948). The pairwise alignment will show position numbers for SEQ ID NO: 2.
  • Said numbers can be used for reference, for example when saying that, e.g. the residue corresponding to position no. 2 of SEQ ID NO: 2 is mutated in the second GH7 endoglucanase.
  • the amino acid within the parental protein sequence SEQ ID NO: 2 is referred to as position number 1 or S1 or serine 1.
  • the numbering of all amino acids will be according to their position in the parental sequence given in SEQ ID NO: 2 relative to this position number 1.
  • Sequence identity: For determination of sequence identity, the software AlignX from the VectorNTI Package sold by Life Technology Corporation is used, using the standard settings (gap opening penalty 10, gap extension penalty 0.1).
  • Protein variants are polypeptides whose amino acid sequence differs in one or more positions from this parental protein, whereby differences might be replacements of one or more amino acid residue(s) by another, deletions of single or several amino acid residue(s), or insertion of additional amino acid residue(s) or stretches of amino acid residue(s) into the parental sequence. Proteins can be modified at defined positions by introduction of point mutations into the encoding nucleic acids.
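To make the identity thresholds concrete, percent identity can be computed from two already aligned sequences. The sketch below is not the AlignX implementation: it assumes a common definition (identical residues divided by the number of aligned columns, ignoring columns that are gaps in both sequences), and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def highest_identity_tier(identity, tiers=(96.0, 97.0, 98.0, 99.0, 99.5)):
    """Return the highest 'at least X %' tier from the text that is met, or None."""
    met = [t for t in tiers if identity >= t]
    return max(met) if met else None


print(percent_identity("ACDE", "ACDF"))  # 75.0
print(highest_identity_tier(98.4))       # 98.0
```

Other identity definitions (for example, dividing by the length of the shorter sequence) give slightly different values, so the denominator should match the tool actually used.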
  • modified protein sequence herein always refers to proteins resulting from transcription and translation as well as optional post-translational modification and translocation processes from correspondingly modified nucleic acids, either in vitro or by a suitable expression host.
  • Methods for the generation of such protein variants are well known in the art and thus not limited, examples include random or site directed mutagenesis, site-saturation mutagenesis, PCR-based fragment assembly, DNA shuffling, homologous recombination in vitro or in vivo, and methods of gene-synthesis based on chemical DNA synthesis.
  • Inserted additional amino acids receive the number of the preceding position extended by a small letter in alphabetical order relative to their distance to their point of insertion.
  • 3aW, 3bW: the insertion of two tryptophans after position 3
  • Introduction of untranslated codons TAA, TGA and TAG into the nucleic acid sequence is indicated as "*" in the amino acid sequence; thus the introduction of a terminating codon at position 4 of the amino acid sequence is referred to as "G4*".
  • Multiple mutations are separated by a plus sign or a slash or a comma.
  • positions 20 and 21 substituting alanine and glutamic acid for glycine and serine, respectively, are indicated as "A20G+E21S" or "A20G/E21S" or "A20G,E21S".
  • amino acid residue at a given position is substituted with two or more alternative amino acid residues these residues are separated by a comma or a slash.
  • substitution of alanine at position 20 with either glycine or glutamic acid is indicated as "A20G,E" or "A20G/E", or "A20G, A20E".
  • any amino acid residue may be substituted for the amino acid residue present in the position.
  • the alanine may be deleted or substituted for any other amino acid residue (i.e. any one of R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y and V).
  • "similar mutation" or "similar substitution" refer to an amino acid mutation wherein an amino acid residue introduced by a first mutation (with respect to the parental sequence, such as e.g. SEQ ID NO: 2) is replaced again by a second mutation, and whereby the amino acid residue brought in by the second mutation has similar properties to the amino acid residue that had been brought in by the first mutation. Similar in this context means an amino acid that has similar chemical properties. If, for example, a first mutation at a specific position leads to a substitution of a non-aliphatic amino acid residue (e.g. Ser) with an aliphatic amino acid residue (e.g. Leu), then a substitution at the same position with a different aliphatic amino acid by means of a second mutation is referred to as a similar mutation.
  • Further chemical properties include size of the residue, hydrophobicity, polarity, charge, pK-value, and the like.
  • a similar mutation may include substitution such as basic for basic, acidic for acidic, polar for polar etc.
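The mutation nomenclature described above can be illustrated with a small parser. This is a sketch only: the function name, the tuple layout and the rule that a bare one-letter token is an alternative residue for the preceding mutation are assumptions chosen to reproduce the examples in the text ("A20G+E21S", "A20G/E", "3aW, 3bW", "G4*").

```python
import re

# Optional original residue, position, optional insertion letter, new residue or "*"
_TOKEN = re.compile(r"^([A-Z])?(\d+)([a-z])?([A-Z*])$")


def parse_mutations(text):
    """Parse a mutation string into a list of
    (original_residue, position, insertion_letter, [new_residues]) tuples."""
    result = []
    for raw in re.split(r"[+/,]", text):
        token = raw.strip()
        if not token:
            continue
        match = _TOKEN.match(token)
        if match:
            original, position, insertion, new = match.groups()
            result.append((original, int(position), insertion, [new]))
        elif re.fullmatch(r"[A-Z*]", token) and result:
            # Bare residue: alternative substitution at the previous position
            result[-1][3].append(token)
        else:
            raise ValueError(f"unrecognized mutation token: {token!r}")
    return result


print(parse_mutations("A20G+E21S"))
# [('A', 20, None, ['G']), ('E', 21, None, ['S'])]
print(parse_mutations("A20G/E"))
# [('A', 20, None, ['G', 'E'])]
print(parse_mutations("3aW, 3bW"))
# [(None, 3, 'a', ['W']), (None, 3, 'b', ['W'])]
```

Because the slash and the comma serve both as separators between mutations and between alternative residues, the parser resolves the ambiguity by token shape rather than by separator.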
  • the sets of amino acids thus derived are likely to be conserved for structural reasons. These sets can be described in the form of a Venn diagram (Livingstone C.D. and Barton G.J. (1993) "Protein sequence alignments: a strategy for the hierarchical analysis of residue conservation" Comput. Appl. Biosci. 9: 745-756; Taylor W.R. (1986) "The classification of amino acid conservation" J. Theor. Biol. 119: 205-218).
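The notion of a similar mutation can be sketched as a shared-membership test on property sets. The grouping below is a simplified assumption for illustration (aliphatic, aromatic, basic, acidic); it is not the full Venn diagram of the cited references, and the names are hypothetical.

```python
# Simplified, assumed property sets (one-letter amino acid codes)
PROPERTY_SETS = {
    "aliphatic": set("ILV"),
    "aromatic": set("FWY"),
    "basic": set("KRH"),
    "acidic": set("DE"),
}


def shared_properties(residue_a, residue_b):
    """Names of the property sets that contain both residues."""
    return sorted(name for name, members in PROPERTY_SETS.items()
                  if residue_a in members and residue_b in members)


def is_similar_mutation(first_new, second_new):
    """True if the second substitution brings in a different residue
    that shares at least one property set with the first one."""
    return first_new != second_new and bool(shared_properties(first_new, second_new))


# First mutation Ser -> Leu; a second mutation to Ile at the same position
print(is_similar_mutation("L", "I"))  # True  (both aliphatic)
print(is_similar_mutation("L", "D"))  # False (no shared set)
```

Adding broader sets such as "polar" or "hydrophobic" would make more pairs count as similar, so the choice of sets determines how permissive the test is.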
  • An expression construct herein is defined as a DNA sequence comprising all required sequence elements for establishing expression of a comprised open reading frame (ORF) in the host cell, including sequences for transcription initiation (promoters), termination and regulation, sites for translation initiation, regions for stable replication or integration into the host genome and a selectable genetic marker.
  • the open reading frame optionally consists of a fusion of a nucleic acid coding for the target protein with further elements, especially secretion signals, a cellulose binding domain, TAGs for enhancement of the expression level or facilitation of purification or isolation from the fermentation broth.
  • the functional setup thereby can be already established or reached by arranging (integration etc.) event in the host cell.
  • the expression construct contains a promoter functionally linked to the open reading frame followed by an optional termination sequence.
  • Preferred promoters are medium to high strength promoters, functional in the selected hosts under fermentation conditions. For illustration, examples of preferred promoters are given as follows:
  • Bacteria, e.g. Escherichia coli: lac, tac, trp, CP7, CP21, araBAD
  • Yeast: AOXI, AOXII, F DH, GAP, TEF, PFK1, FBA1, PGK1, ADH2, TDH3
  • Fungi, e.g. Trichoderma: CBHI, CBHII, EG I, PGK
  • promoters for heterologous expression are reported in the literature.
  • Other parts of the expression construct are genetic elements required for stable inheritance of the introduced nucleic acids, and selectable markers, including genetic elements conferring antibiotic resistance or complementing defined auxotrophies of the host strain.
  • nucleic acids of the invention can be adjusted towards optimal codon usage in the selected expression host.
  • the nucleic acids having such optimized/optimal codon usage for the particular expression host are also part of this invention.
  • a production host is used herein synonymously with expression host and means an organism which, upon cultivation, produces the protein of the present invention.
  • the protein of the present invention is not secreted by the production host; however, in a preferred embodiment, it is secreted into the surrounding medium.
  • Such an organism is preferably selected from the kingdoms of Bacteria, Archaea, Yeasts, Fungi, and/or Plants.
  • One preferred expression host is Pichia pastoris.
  • Bacteria shall herein refer to prokaryotic organisms.
  • Bacteria are eubacteria, and even more preferably they are selected from the genera Escherichia, Bacillus, Klebsiella, Streptomyces, Lactococcus and Lactobacillus, in particular Escherichia coli, Bacillus subtilis, Bacillus licheniformis, Bacillus amyloliquefaciens, Bacillus megaterium, Klebsiella planticola, Streptomyces lividans, Lactococcus lactis, Lactobacillus brevis.
  • Yeast shall herein refer to all lower eukaryotic organisms showing a unicellular vegetative state in their life cycle. This especially includes organisms of the class Saccharomycetes, in particular of the genera Saccharomyces, Pachysolen, Pichia, Candida, Yarrowia, Debaryomyces, Kluyveromyces, Zygosaccharomyces.
  • "Filamentous fungi" or "fungi" shall herein refer to all lower eukaryotic organisms showing hyphal growth during at least one state in their life cycle. This especially includes organisms of the phyla Ascomycota and Basidiomycota, in particular of the genera Trichoderma, Talaromyces, Aspergillus, Penicillium, Chrysosporium, Phanerochaete, Thermoascus, Agaricus, Pleurotus, Irpex.
  • Plant shall herein refer to all eukaryotic organisms belonging to the kingdom of plants.
  • the expression host is selected from plants of the genera Zea, Triticum, Hordeum, Secale, Miscanthus, Saccharum, Solanum, Ipomoea, Manihot, Helianthus, Camellia, Aspalathus, Eucalyptus, Beta, Fagus, and members of the families Pinaceae, Betulaceae, Malvaceae, Cupressaceae, Rosaceae, Arecaceae.
  • Enzyme formulation is meant to be any liquid or solid composition containing the enzyme as a fraction. Additional components preferably comprise water, polyols, sugars, detergents, buffering agents, reducing agents, inorganic salts, solid carriers, conserving agents especially with antibacterial or anti-fungal activity, dyes, fragrances and/or perfumes.
  • endoglucanases, such as particularly the endoglucanase of the present invention (non-limiting examples): hydrolysis of lignocellulose feedstocks for the generation of monomeric, dimeric or oligomeric sugars; production of pulp and paper; textile applications for the improvement or general processing of fibers, yarns or denim; cleaning applications for industrial or home care applications; release of nutrients, production yield enhancement or improvement of dough properties in the field of food and feed.
  • the invention relates to GH7 endoglucanases with superior properties. More particularly, the invention relates to thermostable endoglucanase proteins (polypeptides).
  • the solutions provided are:
  • a protein having endoglucanase activity which belongs to the GH7 class and which shows active thermostabilization.
  • a protein having endoglucanase activity which comprises an amino acid sequence having at least 96%, preferably at least 97%, more preferably at least 98%, even more preferably at least 99%, such as at least 99.5% identity to SEQ ID NO: 2.
  • thermostability is defined above.
  • An example for the determination of the thermostability is given in Example 4.
  • Endoglucanases of the GH7 class are listed in Table 1 (EC 3.2.1.4). Unless excluded by particular sequence identity constraints in a particular claim, the invention relates to variants of all endoglucanases of the GH7 class, including variants of the ones shown in Table 1.
  • EglB | Emericella nidulans | AAM54071.1
  • endo-β-1,4-glucanase I (EG I; Eg1; Fof7) (Cel7B) | Fusarium oxysporum | AAA65586.1 | 10VW[
  • HmEG2 | Holomastigotoides mirabile | BAB64564.1
  • endoglucanase 1 (HmEG1) | Holomastigotoides mirabile | BAB64563.1
  • endo-β-1,4-glucanase 1 (Egll; EG-I) | Humicola grisea var. thermoidea | BAA09786.1
  • endoglucanase 1 (EG I; EG1) (Cel7B) | Humicola insolens | AAE25068.1 | 1A39[A]
  • endo-β-1,4-glucanase 1 (EGI; Egl1; EG-I) (Cel7B) | Hypocrea jecorina | AAA34212.1 | 1 EGHA
  • a protein having endoglucanase activity which belongs to the GH7 class and which shows active thermostabilization.
  • the proteins have endoglucanase activity and superior thermal properties.
  • the superior thermal properties are defined as a temperature stability that manifests in a relative substrate conversion/activity higher than 90% (such as higher than 95%) upon incubation at temperatures of 60 °C or higher, and active thermostabilization.
  • the active thermostabilization is described in the following.
  • the inventors of the present invention have surprisingly found out that proteins showing active thermostabilization also show temperature stability.
  • the inventors generated a GH7 endoglucanase, that is a particular variant of SEQ ID NO: 4 (i.e. the one given by SEQ ID NO: 2) as follows.
  • a nucleic acid encoding a polypeptide with SEQ ID NO: 2 was obtained by random mutagenesis (error-prone PCR, as described in Example 1). Methods for random mutagenesis are well known in the art.
  • a respective nucleic acid encoding this protein can be directly prepared by the skilled person.
  • Methods therefor include for example gene synthesis or site-directed mutagenesis, starting from a nucleic acid with a high degree of sequence identity (e.g. more than 90 %) to SEQ ID NO: 4 and introduction of mutations by site-directed mutagenesis (in one or several steps) to obtain the nucleic acid encoding the protein of SEQ ID NO: 2.
  • a starting sequence from which the nucleic acid encoding SEQ ID NO: 2 can be obtained by mutagenesis is Cel7B from Hypocrea pseudokoningii, given here as SEQ ID NO: 4 (GenBank accession number AB 90986).
  • this protein solves the technical problem underlying the present invention, i.e. has higher temperature stability than its parental protein (SEQ ID NO: 4). This is evident for example from the fact that the relative substrate conversion is still near its maximum at e.g. 60 °C, whereas the relative substrate conversion of the protein having SEQ ID NO: 4 is at a very low level at said temperature (see Figure 3).
  • the inventors have found that at even higher temperatures, e.g. in the range from 68 to 76 °C (including 70 to 74 °C), the relative substrate conversion does not significantly drop with increasing temperature. This is in sharp contrast to the properties of the parental protein having SEQ ID NO: 4, which, in a plot against increasing temperatures, shows a decrease of relative substrate conversion, the decrease going down to background levels without any intermediate plateau. It is believed that the protein having SEQ ID NO: 4, when exposed to higher temperatures, e.g. 60 °C or more, such as 70 °C or more, is not present in its active state.
  • thermal unfolding is a well-known phenomenon for proteins of almost any type, particularly enzymes, at higher temperatures.
  • the thermal unfolding observed for the protein having SEQ ID NO: 4 is thus in line with the expectations of a skilled person.
  • the protein of SEQ ID NO: 4 is not part of the invention.
  • the protein of this aspect of the invention shows a plateau phase at higher temperatures, e.g. in the range from 68 to 76 °C (including 70 to 74 °C). This plateau is lower than the maximum relative substrate conversion, but higher than the background relative substrate conversion.
  • the inventors of the present invention conclude that the protein of the invention is present at these higher temperatures in a state which is different from the folded state at lower temperatures (e.g. 46 °C), but that this protein is nevertheless enzymatically active. It may thus be assumed that at high temperatures, this protein of the invention actively refolds, i.e. refolds to obtain a further active state (thus enabling the observed relative substrate conversion at higher temperatures).
  • any type of mutations including deletion, insertion or replacement of one or several amino acid residues, and being randomly or directed
  • any endoglucanase of the GH7 family particularly into any one named in Table 1 to obtain a mutant protein, or a library thereof.
  • Then, the so-obtained mutant protein or the library thereof may be screened for active thermostabilization as defined above.
  • the known proteins given in table 1 above are not part of the invention, but any mutants thereof showing active thermostabilization are included in the invention.
  • all enzymes of the first aspect of the present invention, i.e. the ones which show active thermostabilization as defined above, show temperature stability as defined above.
  • the active thermostabilization is thus, in a first aspect, a solution to the problem underlying the present invention. Whether or not any given protein falls under the first aspect of the invention can be reliably tested by the assay for active thermostabilization given above.
  • a protein having endoglucanase activity which comprises an amino acid sequence having at least 96%, preferably at least 97%, more preferably at least 98%, even more preferably at least 99%, such as at least 99.5% identity to SEQ ID NO: 2.
  • the inventors embarked on a mutagenesis project, starting from the protein of SEQ ID NO: 2.
  • the inventors have introduced mutations, such as point mutations, into the protein having SEQ ID NO: 2 (i.e. by modifying the underlying nucleic acid, as described below).
  • the inventors have found out that many such mutants also show temperature stability and thus solve the underlying problem in a second aspect. Examples of the solutions are given in Figure 3.
  • the invention relates to a protein having endoglucanase activity which comprises an amino acid sequence having at least 96%, preferably at least 97%, more preferably at least 98%, even more preferably at least 99%, such as at least 99.5% identity to SEQ. ID NO.: 2.
  • This protein may typically belong to the GH7 class.
  • the present invention also provides specific mutants of the protein with sequence of SEQ ID NO: 2.
  • the sequence given in SEQ ID NO: 2 is modified in one or more (preferably 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 or 19) positions. Such modification may consist of replacement, deletion, insertion and the like.
  • the modification consists in a replacement.
  • the modification consists in a replacement of any one or more of the specific positions of SEQ ID NO: 2 which are individualized in the very left column of Tables 2, 3, 4. While such modification at any of these given positions may in principle be a replacement by any amino acid residue, it is preferred that the replacement is a replacement by an amino acid residue given in lane number 4 of any one or more of Tables 2, 3 or 4, or by an amino acid residue similar thereto (similar mutation as defined above). Thus, similar mutations as defined above might be introduced instead of the listed ones. Example 5 shows some of such mutants. Methods for the introduction of mutations are known in the art. Exemplary guidance can be taken from Example 1. In other words, a preferred embodiment of the invention relates to preferred positions for mutagenesis of
  • the first and second aspect of the invention although being different solutions to the same problem, are not necessarily mutually exclusive.
  • the invention relates to proteins fulfilling the conditions of both the first aspect and the second aspect above.
  • the first aspect and the second aspect are two alternative solutions to the problem of providing GH7 enzymes with improved temperature stability. These solutions are independent (although for some examples overlapping) and thus need not be necessarily combined.
  • the protein identified as [6] in Figure 3 shows temperature stability compared to the protein having SEQ ID NO: 4 ([2] in Figure 3), yet does not display active thermostabilization.
  • the invention thus provides many different enzyme variants according to the first aspect above and/or according to the second aspect above. Whether any given enzyme shows the desired thermal properties (temperature stability and/or active thermostabilization) can be easily tested by the test entitled "Determination of thermostability and/or active thermostabilization" above.
  • thermostable enzymes of the invention also come with reduced agglomerate formation at higher temperatures, and thus with reduced precipitation.
  • the avoidance of such precipitates is particularly advantageous in the presence of garments, denim or woven materials, as well as for the application in membrane reactors, where it reduces membrane fouling.
  • Fusion proteins comprising any protein of the invention are also part of the invention.
  • Another aspect of the invention is related to the production of the proteins of the invention by heterologous expression in a production host, also termed expression host.
  • Methods for the heterologous expression comprise the transfer of a nucleic acid encoding the protein of the invention (expression construct) into the production host by transformation, transfection, crossing or equivalent methods with respect to the nucleic acid (DNA or RNA) transfer.
  • Methods for transformation within the meaning of this invention are not particularly limited. Examples have been reported for a variety of species and include electroporation, protoplast transformation, chemical transformation, and transfer via ballistic particles, micro-injection, viral infection, crossing, mating or the use of naturally competent strains or cell lines.
  • a preferred production host co-secretes the endoglucanase of the invention with other cellulases, hemi-cellulases or pectinases into the culture broth. It is thus preferred that the coding sequence on the expression construct encodes for the endoglucanase of the invention preceded by a signal for secretion from the particular host strain.
  • signals are well known in the art: for example in Eubacteria they are called signal peptides. Without wishing to be bound to a particular theory, these signals have in common the ability to direct secretion of a protein, typically in a co-translational fashion.
  • a preferred expression host is Trichoderma reesei.
  • a further aspect of the invention is the application of the above-described endoglucanase proteins.
  • This includes the application of the purified, partially purified or crude protein preparations as such or in enzyme formulations, as well as the application of whole cells or organisms expressing the target protein. Fields of application for endoglucanases can be found in the chapter field of invention. As stated there, the application of thermally stable proteins is highly desirable.
  • a preferred application of the endoglucanase lies in the field of enzymatic lignocellulose conversion.
  • a library based on Seq. ID NO: 3 ("N7" library) was produced using SEQ ID NO: 3 as template by error-prone PCR using Taq polymerase following the literature protocol (Joyce et al) using PCR conditions as follows: 2 min at 95°C, 30 cycles of (1 min at 95°C, 1 min at 56°C, 1 min at 72°C), 5 min at 72°C. All products obtained from PCRs were purified with the QIAquick PCR Purification Kit (Qiagen, Hilden, Germany). Specific variants of Seq. ID NO: 3 were prepared by a modified PCR protocol using primers containing the mutated nucleotide sequence (Ho, S. N. et al. Gene: 1989; 77; 51-9).
  • Linear expression cassette (LEC) construction - LECs (Liu Z. et al. Chembiochem. 2008 Jan 4; 9(1):58-61) with Zeocin marker and the GAP promoter were constructed by a modified PCR protocol.
  • Pichia pastoris transformation and cultivation - Competent cells were prepared and transformed as described (Lin-Cereghino, J., et al. BioTechniques. 2005. 38. 44-48).
  • Transformants were selected on YPD agar plates containing Zeocin 100 mg/L, and picked to deepwell plates (DWP) (BMD5% 250 µl/well) by picking robot (QPix2, Genetix). Inoculated DWPs were cultivated for 60 h at 28°C, 80% humidity, and 280 rpm.
  • Example 3 Expression in Trichoderma reesei
  • Example 4 Determination of substrate conversion capacity at different temperatures for indication of the thermostability of Seq ID NO. 2 variants using 4-methylumbelliferyl-β-D-cellobioside (4-MUC)
  • 4-MUC 4-methylumbelliferyl-β-D-cellobioside
  • Example 6 Determination of reducing sugar release on straw
  • the release of reducing sugar on straw was determined by applying acid pretreated wheat straw with a dry matter of 2.5%.
  • the following enzymes were added to the reaction mixture: cellobiohydrolase I (12.5 mg/l), beta-glucosidase (40 CBU/mg cellobiohydrolase I) and the tested GH7 endoglucanase variant (12.5 mg/l).
  • the straw hydrolysis was incubated at 60°C with continuous shaking for 48 h.
  • MUL (4-methylumbelliferyl β-D-lactopyranoside) activity assay
  • 10 µl of the cultivation supernatant was mixed with 90 µl of 100 µM MUL in 25 mM Na-acetate buffer with pH 4.8. Plates were sealed and incubated for 2 h, with 300 rpm shaking, at 45°C and 59°C each (for rescreening also at 65°C). Reaction was quenched by adding 0 ⁇ Na2CO3 per well. Excitation was performed at 365 nm, and fluorescence measured at 450 nm. The results are shown in figure 5.
  • the half-lives of the enzymes were determined by measuring the residual activity using the MUL assay described in Example 7 after incubation of expression supernatants of Pichia pastoris cultures at 70°C for 0 to 7min in a water bath. Samples were put on ice after the precise incubation period before setup of the activity assay.
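
The half-life determination above can be illustrated with a minimal sketch. It assumes that thermal inactivation follows first-order kinetics, which the text does not state, and it uses synthetic residual activities rather than measured data. The function name `half_life` and the example values are hypothetical.

```python
import math

def half_life(times_min, activities):
    """Estimate a half-life (in minutes) from residual activities.

    Assumption (not stated in the source): inactivation is first order,
    so ln(activity) falls linearly with incubation time.
    """
    logs = [math.log(a) for a in activities]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    # Ordinary least-squares slope of ln(activity) against time
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    k = -sxy / sxx  # inactivation rate constant in 1/min
    return math.log(2) / k

# Synthetic example: incubation for 0 to 7 min, activity halving every 2 min
times = [0, 1, 2, 3, 4, 5, 6, 7]
acts = [1000 * 0.5 ** (t / 2) for t in times]
estimated = half_life(times, acts)
```

With real MUL assay reads, the residual activities would replace the synthetic `acts` list; a poor linear fit of the log values would indicate that the first-order assumption does not hold.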

Abstract

The present invention relates to thermostable endoglucanases, particularly to proteins having endoglucanase activity which comprises an amino acid sequence having at least 96% identity to SEQ. ID NO.: 2, and proteins having endoglucanase activity which belongs to the GH7 class and which shows active thermostabilization.

Description

Title: Endoglucanases with improved properties
Field of the Invention and background art
Cellulose is a major component of plant material. It is the basis for the structural integrity of plants and is often found in a lignocellulose matrix composed of cellulose, hemicelluloses, and lignin. Applications employing cellulose take advantage of either its structural properties (fibers, textiles, paper, etc.) or of its carbohydrate nature, producing D-glucose, cellobiose and/or cellulose oligomers.
Lignocelluloses are readily available from agriculture and forestry including byproduct streams from cereals, corn, sugar cane, sugar beet, timber, etc. Plants that are optimized for their lignocellulose content and yield ("energy crops") will likely contribute as an important resource in the near future.
Cellulases comprise a structurally and functionally diverse class of glycohydrolases acting on cellulose. Cellulases are found in bacteria, archaea, fungi and plants. Having in common the hydrolytic cleavage activity of glycosidic bonds present in cellulose polymers or oligomers, they differ in substrate specificity, mode of action, and enzyme parameters, including processivity, pH and temperature optima. Most cellulases act on β-1,4-linkages between two glucose moieties. However, other linkages found in lignocelluloses may also be hydrolysed. Cellulases can be subdivided by their mode of action into endo- and exo-enzymes. Endoglucanases introduce random cleavages into the cellulose polymer, thereby reducing the degree of polymerization. Exo-enzymes, like cellobiohydrolases, work in a successive mode of action, releasing cellobiose (D-glucose-β-1,4-D-glucopyranoside) from the reducing or non-reducing end of the polymer.
The CAZy Database [Cantarel BL, Coutinho PM, Rancurel C, Bernard T, Lombard V, Henrissat B (2009) The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics. Nucleic Acids Res 37:D233-238 PMID: 18838391] holds, amongst others, a collection of known glucohydrolases including cellulose degrading enzymes (i.e. cellulases). In this database enzymes are classified to different GH-classes according to structural elements. Several GH classes include endoglucanases, in particular the classes GH5, GH7, GH9, GH12, GH16, GH45, GH48, GH61 and GH74. Despite the high diversity within some of the GH classes, members of one GH class often have similar physical and enzymatic parameters. This allows general statements to be made about properties such as substrate specificity, pH range, stability, or catalytic efficiency for members of a certain GH class.
Cellulose-degrading microorganisms often produce and secrete a complex mixture of cellulases. For instance, in the secretome of Trichoderma reesei 7 endoglucanases have been identified belonging to 6 different GH classes (Cel5A, Cel7B, Cel12A, Cel45A, Cel61A, Cel61B, Cel74A). The different endoglucanases show a spectrum of properties (Karlsson J, Siika-aho M, Tenkanen M, Tjerneld F. Enzymatic properties of the low molecular mass endoglucanases Cel12A (EG III) and Cel45A (EG V) of Trichoderma reesei. J Biotechnol. 2002 Oct 9;99(1):63-78. PubMed PMID: 12; Karlsson J, Momcilovic D, Wittgren B, Schülein M, Tjerneld F, Brinkmalm G. Enzymatic degradation of carboxymethyl cellulose hydrolyzed by the endoglucanases Cel5A, Cel7B, and Cel45A from Humicola insolens and Cel7B, Cel12A and Cel45Acore from Trichoderma reesei. Biopolymers. 2002 Jan;63(1):32-40. PubMed PMID: 11754346.). The two predominant endoglucanases, EGI (Cel7B, GH7) and EGII (Cel5A), are considered to be the most active enzymes thereof.
The synergistic activity of cellulolytic enzymes allows the efficient breakdown of complex substrates (B. Henrissat, H. Driguez, C. Viet & M. Schulein: Synergism of Cellulases from Trichoderma reesei in the Degradation of Cellulose; Nature Biotechnology 3, 722 - 726 (1985) doi:10.1038/nbt0885-722) and precludes the replacement of a component of one structural class by an enzyme from a second fold, when at the same time the hydrolytic efficiency needs to be kept at maximum level (non-equivalency of different EGs). A simple replacement by another GH class enzyme is not always possible. Generally speaking, endoglucanases from the GH5 family (including EGs from thermophilic bacteria) show higher thermostability compared to endoglucanases of the GH7 family; nevertheless, the application of a thermostable GH7 family protein is often advantageous for high hydrolysis rates.
Many applications of endoglucanases were reported, as part of complex enzyme mixtures or as single enzyme activities. Cellulases are important for making cellulose-derived biofuels. After cutting and, optionally, chemical and/or physical pretreatment, lignocelluloses are incubated with cellulases to release sugar monomers that are further processed. Process conditions need to be adapted to optimize hydrolysis rates, yields and/or stability. Higher temperatures are often preferred in these processes but require more thermostable enzymes. Simultaneous saccharification and fermentation (SSF) processes require cellulolytic enzymes that are active under fermentative conditions. Consolidated bioprocessing (CBP) further requires the combination of enzyme properties, in order to have enzyme production, saccharification and fermentation done in a single step.
Other applications of endoglucanases aim only at a partial hydrolysis or modification of cellulose fibers (fiber modification, biopolishing, biostoning, etc.). Endoglucanases used for these purposes need to work and/or be stable at elevated temperatures, extreme (e.g. alkaline, acid) pH, and chemical conditions (e.g. laundry, detergents, proteases, solvents, etc.). Fiber damage must be minimized for such applications. Endoglucanases can also assist in the separation of non-cellulosic fractions from the fiber material in pulping processes (pulp & paper production) or improve rheological properties of process streams. Detergent stability and protease resistance can be seen as a product of increased stability of the enzyme structure, a property that is also connected to increased thermal stability. Endoglucanases also find applications in food and feed processing (breweries, wine production, oil recovery from press cake, baking, dough preparation). Often sterilization or pasteurization requires higher temperatures. For shortening of processing times the operational stability of the endoglucanase can be advantageous.
Endoglucanase I proteins (Cel7B) derived from fungi of the genus Trichoderma (anamorph Hypocrea) show high degrees of identity and are considered mesophilic. The most stable members of endoglucanases from the GH family 7 reported are native enzymes from Humicola insolens (Cel7B) and Fusarium oxysporum (eg1) (US5912157). According to said report, EG I does not exhibit activity above 60°C. There is thus a need in the field for the provision of more thermostable endoglucanases from the GH family 7.
It was reported that some endoglucanases can be thermally inactivated at higher temperatures (Dominguez JM, Acebal C, Jimenez J, de la Mata I, Macarron R, Castillon MP. Mechanisms of thermoinactivation of endoglucanase I from Trichoderma reesei QM 9414. Biochem J. 1992 Oct 15;287 (Pt 2):583-8.). The authors of said study also attempted re-activation of thermoinactivated endoglucanase, but this required harsh conditions involving 8 M urea and further agents. Effects described as productive refolding were shown on other proteins than endoglucanases [Zhang N, Suen WC, Windsor W, Xiao L, Madison V, Zaks A. Improving tolerance of Candida antarctica lipase B towards irreversible thermal inactivation through directed evolution. Protein Eng. 2003 Aug; 6(8):599-605.], but to the knowledge of the inventors not for endoglucanases, in particular endoglucanases of GH7. It is believed in the art that thermoinactivated endoglucanases are of little use in industrial breakdown of cellulose. On the other hand, elevated thermostability is often desired for endoglucanases, in particular for enzymes of fungal origin. So far, only some improvements for endoglucanases of GH12 and GH45 were reported. Thermostable endoglucanases have been reported from the structural folds of GH5 and GH48. Said endoglucanases substantially differ with respect to their kinetic properties and substrate preference from the endoglucanases of the GH7 class. In summary, there is a need for processive endoglucanases, particularly of the GH7 family, with superior temperature profiles. It would furthermore be desirable to achieve good productivity from their expression host. The need is further supported by the fact that many processes of industrial relevance run under harsh conditions and at elevated temperatures. A problem to be solved by the present invention is the provision of improved endoglucanases, particularly of endoglucanases with improved thermal properties.
Further problems addressed and solved by this invention will become apparent from the sections below.
Summary of the invention
The invention relates to thermostable endoglucanase proteins (polypeptides). The solutions provided are:
1. A protein having endoglucanase activity which belongs to the GH7 class and which shows active thermostabilization.
2. A protein having endoglucanase activity which comprises an amino acid sequence having at least 96%, preferably at least 97%, more preferably at least 98%, even more preferably at least 99%, such as at least 99.5% identity to SEQ. ID NO.: 2.
Preferably, the endoglucanase proteins of the invention show more than 95% residual activity at 60 °C.
A further aspect of the invention are nucleic acids encoding said polypeptides and expression constructs comprising these polynucleotides in a vector backbone contained in an organism. Another aspect of the invention is the application of the proteins of the invention for the processing of lignocellulose and cellulose materials, in particular in the saccharification of lignocellulose feedstock in consolidated, partially consolidated or non-consolidated processes, or in the processing of food, feed or cellulose fiber, or in cleaning applications.
The invention also relates to production/expression organisms for the production of the proteins of the invention and to processes for the cultivation of such organisms for the purpose of protein production. The organisms are selected from organisms including microorganisms (fungal, bacterial, or archaeal) or plants.
Brief description of the figures
Figure 1: The SDS gel shows the expression of the protein of Seq. ID NO 8 (a variant of Seq. ID NO 2) secreted into the supernatant. The band of the expressed protein is visible between 75 and 100 kDa.
Figure 2: Trichoderma reesei expression plasmid. The DNA sequence coding for the mature endoglucanase gene is cloned in fusion to the TrCBHI signal peptide sequence under control of the TrCBHI promoter. The SwaI/SbfI excisable expression cassette contains a hygromycin resistance cassette for selection of transformants.
Figure 3: Endoglucanase variants showing increased temperature stability ([6] and [3]) and variants with increased temperature stability and active thermostabilization [1], [4], [5], [7], [8], [9] and [10] in comparison to a native GH7 protein [2].
Figure 4: Calibration of 200 µl portions of alkaline 4-methylumbelliferone solution to the fluorescence read out in a Tecan Infinite 200 plate-reader. A 10 mM solution was prepared by dissolving 440 mg of 4-methylumbelliferone (Sigma Aldrich Cat. Nr. 69580) in 250 ml of 0.5 M sodium carbonate solution. Serial dilutions were prepared in 0.5 M sodium carbonate. Fluorescence intensity was measured at 360 nm/454 nm at gain 50.
Figure 5: Thermal stabilization of Seq. ID NO: 13 compared to Seq. ID NO: 4
Figure 6: Determination of half-lives at 70°C (Example 7) for Seq. ID NO: 14 compared to Seq. ID NO: 4.
Definitions
Thermostability is a term used to describe an intrinsic property of a particular protein with endoglucanase activity according to the present invention.
"Active thermostabilization" is a term used to describe an intrinsic property of a particular protein with endoglucanase activity according to the present invention.
Determination of thermostability and/or active thermostabilization: Thermostability and active thermostabilization are determined as follows.
1.) The enzyme is expressed in Pichia pastoris as described in Example 2. The enzyme is optionally purified.
2.) Adjustment of the enzyme concentration
An enzyme solution of an appropriate concentration is made by dilution of purified enzyme or Pichia pastoris culture supernatant in sodium acetate buffer (50 mM, pH 5) to an applicable working concentration. For determination of the applicable working concentration, a serial dilution of the enzyme obtained in step 1) above is prepared in the sodium acetate buffer and 10 µl aliquots are tested in the temperature gradient as described in Example 4. An applicable working concentration is defined as a concentration which results in a fluorescence signal between 5,000 and 15,000 in a Tecan Infinite M200 plate-reader at gain 50, or an equivalent concentration of 5.4 µM to 19 µM 4-methylumbelliferone after incubation as described in Example 4.
3.) Determination of the substrate conversion capacity as described in Example 4, with the exception that the 10 µl aliquot of the culture supernatant is replaced by the 10 µl aliquot of the enzyme solution in applicable working concentration as defined in step 2).
4.) Normalization of the measurement by division of all relative fluorescence unit (rfu) reads by the maximum rfu read within the temperature gradient to obtain a relative substrate conversion for each protein tested at each temperature tested.
5.) Plotting of the relative substrate conversion vs. the tested reaction temperatures.
6.) Determination of the temperature stability as described in (a) or determination of active thermostabilization as described in (b) as follows.
a. Determination of temperature stability: a protein is characterized as temperature stable if the relative substrate conversion at 60 °C is 0.5 or more, preferably 0.7 or more and more preferably 0.9 or more, such as 0.95 or more.
b. Determination of active thermostabilization: Analysis of the plot obtained in step 5) for the presence of a plateau at a relative substrate conversion which is lower than the maximum level (which is 1), but which is at least as high as 0.15.
A plateau is defined as a level of the relative substrate conversion which is essentially unchanged within a temperature range of at least 5 °C, preferably from 70 to 75 °C (i.e. within +/- 0.1 around the average value within said temperature range).
Variants showing no active thermostabilization have a relative substrate conversion between 0 and lower than 0.15, usually around 0.1. Without wishing to be bound to any particular theory, it is believed that the measured relative substrate conversion of usually around 0.1 (rather than 0.0, as expected for an inactive enzyme at a given temperature) is due to finite temperature ramps in the thermocycler and/or during handling of the sample mixtures.
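
Steps 4) to 6) above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the choice to compare the average level in the 70 to 75 °C window against the 0.15 threshold, and the rfu values are assumptions for demonstration, not data or code from the application. It also assumes that 60 °C is one of the temperatures in the gradient.

```python
def relative_conversion(rfu_by_temp):
    """Step 4: divide every rfu read by the maximum read in the gradient."""
    peak = max(rfu_by_temp.values())
    return {temp: rfu / peak for temp, rfu in rfu_by_temp.items()}

def is_temperature_stable(rel, threshold=0.5):
    """Step 6a: relative substrate conversion at 60 °C is at least the threshold."""
    return rel[60] >= threshold

def has_plateau(rel, t_low=70, t_high=75, floor=0.15, tol=0.1):
    """Step 6b: look for a flat level below the maximum (1) and at least 0.15.

    Assumption: the plateau level is taken as the average over the window,
    and every point must lie within +/- tol of that average.
    """
    window = [v for temp, v in rel.items() if t_low <= temp <= t_high]
    if len(window) < 2:
        return False
    mean = sum(window) / len(window)
    flat = all(abs(v - mean) <= tol for v in window)
    return flat and floor <= mean < 1.0

# Hypothetical rfu reads (temperature in °C -> rfu), for illustration only
variant = {46: 9000, 55: 10000, 60: 9600, 65: 6000, 70: 3100, 72: 3000, 75: 2900}
parent = {46: 10000, 55: 8000, 60: 3000, 65: 1200, 70: 1000, 72: 1000, 75: 900}

variant_rel = relative_conversion(variant)
parent_rel = relative_conversion(parent)
```

In this synthetic example the variant holds a level of about 0.3 between 70 and 75 °C and would be read as showing a plateau, while the parent stays near the background level of about 0.1 and would not.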
Thermal properties is a term generally used to refer to the properties of an enzyme at higher temperatures (e.g. 60 °C or more). The term can include one or both of "temperature stability" as defined above and "active thermostabilization" as described above.
Endoglucanase activity in the context of this invention is defined as the catalytic acceleration, by a protein, of the breakage of β-1,4-glucosidic bonds via nucleophilic attack by a polar molecule such as water or organic molecules with their hydroxyl-, mercapto- or amino-functions. The definition also includes the cleavage of synthetic molecules having a non-carbohydrate molecule linked to glucose, cellobiose or lactose via a β-1,4-glycosidic linkage. Example reactions catalyzed by endoglucanases are listed by the Brenda Database (http://www.brenda-enzymes.info (Release 2012.1 (January 2012)); Enzyme data and metabolic information: BRENDA, a resource for research in biology, biochemistry, and medicine. Schomburg, I., Hofmann, O., Baensch, C., Chang, A., Schomburg, D. Gene Funct. Dis. 3-4, 109-18 (2000)).
Residual activity is defined as the enzymatic activity that is recovered after incubation of the enzyme for a defined time at a defined (elevated) temperature in comparison to the activity without the incubation step. A protocol for the determination of the residual activity is given in Example 4.
Sequence Alignment with SEQ ID NO: 2: Pairwise alignment of any second GH7 endoglucanase sequence with the parental sequence (SEQ ID NO 2) is done using the ClustalW Algorithm (Larkin M.A., Blackshields G., Brown N.P., Chenna R., McGettigan P.A., McWilliam H., Valentin F., Wallace I.M., Wilm A., Lopez R., Thompson J.D., Gibson T.J. and Higgins D.G. (2007) ClustalW and ClustalX version 2. Bioinformatics 2007 23(21): 2947-2948). The pairwise alignment will show position numbers for SEQ ID NO: 2. Said numbers can be used for reference, for example when saying that, e.g. the residue corresponding to position no. 2 of SEQ ID NO: 2 is mutated in the second GH7 endoglucanase. As convention for numbering of amino acids and designation of protein variants for the description of protein variants the amino acid within the parental protein sequence SEQ ID NO: 2 is referred to as position number 1 or S1 or serine 1. The numbering of all amino acids will be according to their position in the parental sequence given in SEQ ID NO: 2 relative to this position number 1.
Sequence identity: For determination of Sequence identity the software AlignX from the VectorNTI Package sold by Life Technology Corporation is used, using the standard settings (Gap opening penalty 10, Gap extension penalty 0.1).
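
To illustrate what a percent identity value expresses, the following minimal sketch counts identical positions in an already aligned pair of sequences. It is not the AlignX implementation: the formula used here (identical residues divided by alignment length, gaps included) is one common convention and is an assumption, and the two short sequences are made up for the example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two gapped, equal-length aligned sequences.

    Assumption: identity = identical residue columns / alignment length.
    Other tools may use a different denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)

# Hypothetical aligned fragments ('-' marks a gap)
identity = percent_identity("SQACT-LK", "SQACTALR")
```

A candidate sequence would meet the identity criterion of the second aspect if the value obtained with the stated software and settings is at least 96.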
Protein variants are polypeptides whose amino acid sequence differs in one or more positions from this parental protein, whereby differences might be replacements of one amino acid residue(s) by another, deletions of single or several amino acid residue(s), or insertion of additional amino acid residue(s) or stretches of amino acid residue(s) into the parental sequence. Proteins can be modified at defined positions by introduction of point mutations into the encoding nucleic acids. The term modified protein sequence herein always refers to proteins resulting from transcription and translation as well as optional post-translational modification and translocation processes from correspondingly modified nucleic acids, either in vitro or by a suitable expression host. Methods for the generation of such protein variants are well known in the art and thus not limited, examples include random or site directed mutagenesis, site-saturation mutagenesis, PCR-based fragment assembly, DNA shuffling, homologous recombination in vitro or in vivo, and methods of gene-synthesis based on chemical DNA synthesis.
The nomenclature of amino acids, peptides, nucleotides and nucleic acids is done according to IUPAC. Generally amino acids are named within this document according to the one letter code. Exchanges of single amino acids are described by naming the single letter code of the original amino acid followed by its position number and the single letter code of the replacing amino acid, i.e. the change of glutamine at position one to a leucine at this position is described as "Q1L". For deletions of single positions from the sequence the symbol of the replacing amino acid is substituted by the three letter abbreviation "del", thus the deletion of alanine at position 3 would be referred to as "A3del". Inserted additional amino acids receive the number of the preceding position extended by a small letter in alphabetical order relative to their distance to their point of insertion. Thus, the insertion of two tryptophans after position 3 is referred to as "3aW, 3bW". Introduction of untranslated codons TAA, TGA and TAG into the nucleic acid sequence is indicated as "*" in the amino acid sequence, thus the introduction of a terminating codon at position 4 of the amino acid sequence is referred to as "G4*". Multiple mutations are separated by a plus sign or a slash or a comma. For example, two mutations in positions 20 and 21 substituting alanine and glutamic acid for glycine and serine, respectively, are indicated as "A20G+E21S" or "A20G/E21S" or "A20G,E21S". When an amino acid residue at a given position is substituted with two or more alternative amino acid residues these residues are separated by a comma or a slash. For example, substitution of alanine at position 20 with either glycine or glutamic acid is indicated as "A20G,E" or "A20G/E", or "A20G, A20E".
When a position suitable for modification is identified herein without any specific modification being suggested, it is to be understood that any amino acid residue may be substituted for the amino acid residue present in the position. Thus, for instance, when a modification of an alanine in position 20 is mentioned but not specified, it is to be understood that the alanine may be deleted or substituted for any other amino acid residue (i.e. any one of R, N, D, C, Q, E, G, H, I, L, K, M, F, P, S, T, W, Y and V).
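
The mutation nomenclature described above lends itself to a small parser. The sketch below is an illustration under stated assumptions: the function names and the returned dictionary layout are hypothetical, only single-residue tokens are handled (alternatives such as "A20G,E" are not), and combined variants are split on the plus sign only, because slash and comma are also used for alternative residues.

```python
import re

# One token is either a substitution/deletion/stop (e.g. Q1L, A3del, G4*)
# or an insertion (e.g. 3aW: position, letter index, inserted residue).
_MUTATION = re.compile(
    r"^(?:(?P<orig>[A-Z])(?P<pos>\d+)(?P<new>del|\*|[A-Z])"
    r"|(?P<ipos>\d+)(?P<idx>[a-z])(?P<ins>[A-Z]))$"
)

def parse_mutation(token):
    """Parse a single mutation token into a dictionary (hypothetical layout)."""
    match = _MUTATION.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognized mutation: {token!r}")
    if match.group("ins"):
        return {
            "kind": "insertion",
            "after": int(match.group("ipos")),
            "index": match.group("idx"),
            "residue": match.group("ins"),
        }
    new = match.group("new")
    kind = {"del": "deletion", "*": "stop"}.get(new, "substitution")
    return {
        "kind": kind,
        "position": int(match.group("pos")),
        "original": match.group("orig"),
        "new": new if kind == "substitution" else None,
    }

def parse_variant(text):
    """Parse a plus-separated combination such as 'A20G+E21S'."""
    return [parse_mutation(token) for token in text.split("+")]

double_mutant = parse_variant("A20G+E21S")
```

Restricting the separator to the plus sign avoids the ambiguity between "A20G/E21S" (two mutations) and "A20G/E" (two alternatives at one position); a fuller parser would need to resolve that case explicitly.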
The terms "similar mutation" or "similar substitution" refer to an amino acid mutation wherein an amino acid residue in a first mutation (with respect to the parental sequence, such as e.g. SEQ ID NO: 2) is replaced again by a second mutation, and whereby the amino acid residue brought in by the second mutation has similar properties to the amino acid residue that had been brought in by the first mutation. Similar in this context means an amino acid that has similar chemical properties. If, for example, a first mutation at a specific position leads to a substitution of a non-aliphatic amino acid residue (e.g. Ser) with an aliphatic amino acid residue (e.g. Leu), then a substitution at the same position with a different aliphatic amino acid by means of a second mutation (e.g. Ile or Val) is referred to as a similar mutation. Further chemical properties include size of the residue, hydrophobicity, polarity, charge, pK-value, and the like. Thus, a similar mutation may include substitution such as basic for basic, acidic for acidic, polar for polar etc. The sets of amino acids thus derived are likely to be conserved for structural reasons. These sets can be described in the form of a Venn diagram (Livingstone C.D. and Barton G.J. (1993) "Protein sequence alignments: a strategy for the hierarchical analysis of residue conservation" Comput.Appl Biosci. 9: 745-756; Taylor W. R. (1986) "The classification of amino acid conservation" J.Theor.Biol. 119: 205-218). Similar substitutions may be made, for example, according to the following grouping of amino acids:
Hydrophobic: F W Y H K M I L V A G
Aromatic: F W Y H
Aliphatic: I L V
Polar: W Y H K R E D C S T
Charged: H K R E D
Positively charged: H K R
Negatively charged: E D
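
The grouping above can be turned into a simple lookup. The sketch below encodes the listed groups as given in the text and reports which groups two residues share. Treating two residues as "similar" when they share at least one group is an assumption made for illustration; the text itself does not prescribe a decision rule, and the names `GROUPS` and `shared_groups` are hypothetical.

```python
# Amino acid groups as listed in the text (one letter code)
GROUPS = {
    "Hydrophobic": set("FWYHKMILVAG"),
    "Aromatic": set("FWYH"),
    "Aliphatic": set("ILV"),
    "Polar": set("WYHKREDCST"),
    "Charged": set("HKRED"),
    "Positively charged": set("HKR"),
    "Negatively charged": set("ED"),
}

def shared_groups(first, second):
    """Return the names of all groups that contain both residues."""
    return {
        name
        for name, members in GROUPS.items()
        if first in members and second in members
    }

# Leu and Ile share groups, so Ile could count as a similar replacement for Leu
leu_ile = shared_groups("L", "I")
# Leu and Asp share no group in this scheme
leu_asp = shared_groups("L", "D")
```

Returning the shared group names rather than a single yes/no answer lets the reader decide which property (for example aliphatic character or charge) matters for a given position.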
An expression construct herein is defined as a DNA sequence comprising all required sequence elements for establishing expression of a comprised open reading frame (ORF) in the host cell, including sequences for transcription initiation (promoters), termination and regulation, sites for translation initiation, regions for stable replication or integration into the host genome and a selectable genetic marker. The open reading frame optionally consists of a fusion of a nucleic acid coding for the target protein with further elements, especially secretion signals, a cellulose binding domain, or tags for enhancement of the expression level or facilitation of purification or isolation from the fermentation broth. The functional setup can either be already established or be reached by an arranging event (integration etc.) in the host cell. In a preferred embodiment the expression construct contains a promoter functionally linked to the open reading frame followed by an optional termination sequence. Preferred promoters are medium to high strength promoters, functional in the selected hosts under fermentation conditions. For illustration, examples of preferred promoters are given as follows:
• Bacteria (e.g. Escherichia coli): lac, tac, trp, tet, T3, T7, CP7, CP21, araBAD
• Yeast (e.g. Pichia, Saccharomyces): AOXI, AOXII, F DH, GAP, TEF, PFK1, FBA1, PGK1, ADH1, ADH2, TDH3
• Fungi (e.g. Trichoderma): CBHI, CBHII, EGI, PGK, BGL, XYL1, XYL2
Further examples of suitable promoters for heterologous expression are reported in the literature. Other parts of the expression construct are genetic elements required for stable inheritance of the introduced nucleic acids, and selectable markers, including genetic elements conferring antibiotic resistance or complementing defined auxotrophies of the host strain.
The sequence of all nucleic acids of the invention, or of nucleic acids encoding polypeptides/proteins of the invention, can be adjusted towards optimal codon usage in the selected expression host. The nucleic acids having such optimized/optimal codon usage for the particular expression host are also part of this invention.
A production host is used herein synonymously with expression host and means an organism which, upon cultivation, produces the protein of the present invention. In one embodiment, the protein of the present invention is not secreted by the production host; however, in a preferred embodiment, it is secreted into the surrounding medium. Such an organism is preferably selected from the kingdom of Bacteria, Archaea, Yeasts, Fungi, and/or Plants. One preferred expression host is Pichia pastoris.
"Bacteria" shall herein refer to prokaryotic organisms. In a preferred embodiment Bacteria are eubacteria, and even more preferably they are selected from the genera Escherichia, Bacillus, Klebsiella, Streptomyces, Lactococcus and Lactobacillus, in particular Escherichia coli, Bacillus subtilis, Bacillus licheniformis, Bacillus amyloliquefaciens, Bacillus megaterium, Klebsiella planticola, Streptomyces lividans, Lactococcus lactis, Lactobacillus brevis.
"Yeast" shall herein refer to all lower eukaryotic organisms showing a unicellular vegetative state in their life cycle. This especially includes organisms of the class Saccharomycetes, in particular of the genera Saccharomyces, Pachysolen, Pichia, Candida, Yarrowia, Debaryomyces, Kluyveromyces, Zygosaccharomyces.
"Filamentous fungi" or "fungi" shall herein refer to all lower eukaryotic organisms showing hyphal growth during at least one state in their life cycle. This especially includes organisms of the phyla Ascomycota and Basidiomycota, in particular of the genera Trichoderma, Talaromyces, Aspergillus, Penicillium, Chrysosporium, Phanerochaete, Thermoascus, Agaricus, Pleurotus, Irpex.
"Plant" shall herein refer to all eukaryotic organisms belonging to the kingdom of plants. In a preferred embodiment the expression host is selected from plants of the genera Zea, Triticum, Hordeum, Secale, Miscanthus, Saccharum, Solanum, Ipomoea, Manihot, Helianthus, Camellia, Aspalathus, Eucalyptus, Beta, Fagus, and members of the families Pinaceae, Betulaceae, Malvaceae, Cupressaceae, Rosaceae, Arecaceae.
Enzyme formulation is meant to be any liquid or solid composition containing the enzyme as a fraction. Additional components preferably comprise water, polyols, sugars, detergents, buffering agents, reducing agents, inorganic salts, solid carriers, conserving agents especially with antibacterial or anti-fungal activity, dyes, fragrances and/or perfumes.
Uses of endoglucanases, such as particularly of the endoglucanase of the present invention (non-limiting examples): hydrolysis of lignocellulose feedstocks for the generation of monomeric, dimeric or oligomeric sugars; production of pulp and paper; textile applications for the improvement or general processing of fibers, yarns or denim; cleaning applications for industrial or home care applications; release of nutrients, production yield enhancement or improvement of dough properties in the field of food and feed.
Detailed description of the Invention
The invention relates to GH7 endoglucanases with superior properties. More particularly, the invention relates to thermostable endoglucanase proteins (polypeptides). The solutions provided are:
1. A protein having endoglucanase activity which belongs to the GH7 class and which shows active thermostabilization.
2. A protein having endoglucanase activity which comprises an amino acid sequence having at least 96 %, preferably at least 97 %, more preferably at least 98 %, even more preferably at least 99 %, such as at least 99.5 % identity to SEQ ID NO: 2.
These two embodiments are described in detail below.
The temperature stability is defined above. An example for the determination of the thermostability is given in Example 4. Endoglucanases of the GH7 class (EC 3.2.1.4) are listed in Table 1. Unless excluded by particular sequence identity constraints in a particular claim, the invention relates to variants of all endoglucanases of the GH7 class, including variants of the ones shown in Table 1.
Table 1: Known endoglucanases of the GH7 class
No. | Protein Name | Organism | GenBank | PDB/3D
1 | cellulase III-A (peptide fragment) | Acremonium cellulolyticus | - | -
2 | endo-β-1,4-glucanase (EglB;AN3418.2) | Aspergillus nidulans FGSC A4 | EAA63386.1 | -
3 | endo-β-1,4-glucanase (CelB) | Aspergillus oryzae KBN616 | BAA22589.1 | -
4 | endo-β-1,4-glucanase (CelB;AO090010000314) | Aspergillus oryzae RIB40 | AEB00821.1 | -
5 | endo-β-1,4-glucanase 1 | Aspergillus terreus MS-31 | ADR78837.1 | -
6 | endo-β-1,3-1,4-glucanase (Bgl7A) | Bispora sp. MEY-1 / CGMCC 2500 | ACT53749.1 | -
7 | EG I (peptide fragment) (Cel7C) | Chrysosporium lucknowense | - | -
8 | endo-β-1,4-glucanase (CLhgEG1) | Coptotermes lacteus symbiont WH2002 | BAC07551.1 | -
9 | endo-β-1,4-glucanase (CLhgEG2) | Coptotermes lacteus symbiont WH2002 | BAC07552.1 | -
10 | endo-β-1,4-glucanase (EglB) | Emericella nidulans | AAM54071.1 | -
11 | endo-β-1,4-glucanase I (EG I;Eg1;Fof7) (Cel7B) | Fusarium oxysporum | AAA65586.1 | 1OVW[A,B,C,D]
12 | endoglucanase 3 (HmEG3) (fragment) | Holomastigotoides mirabile | BAB64565.1 | -
13 | endoglucanase 2 (HmEG2) | Holomastigotoides mirabile | BAB64564.1 | -
14 | endoglucanase 1 (HmEG1) | Holomastigotoides mirabile | BAB64563.1 | -
15 | endo-β-1,4-glucanase 1 (EglI;EG-I) | Humicola grisea var. thermoidea | BAA09786.1 | -
16 | endoglucanase 1 (EG I;EG1) (Cel7B) | Humicola insolens | AAE25068.1 | 1A39[A]
17 | endo-β-1,4-glucanase 1 (EGI;Egl1;EG-I) (Cel7B) | Hypocrea jecorina | AAA34212.1 | 1EG1[A,C]
18 | endoglucanase I (Egl) | Hypocrea jecorina M5 | AD 08177.1 | -
19 | endo-β-1,4-glucanase (Egl1) | Hypocrea jecorina PTCC 5142 | AAX28897.1 | -
20 | endoglucanase I | Hypocrea pseudokoningii | AB 90986.1 | -
21 | endoglucanase I (Eg1) | Hypocrea pseudokoningii 3.3002 | AEQ29501.1 | -
22 | endoglucanase I (Eg1) | Hypocrea rufa | AE017039.1 | -
23 | endo-β-1,4-glucanase 1 (EGI;EglI) | Hypocrea rufa AS 3.3711 | AAQ21382.1 | -
24 | endoglucanase 1 | Hypocrea rufa HK-75 | - | -
25 | endo-β-1,4-glucanase (Egl1;MG02532.4) | Magnaporthe grisea 70-15 | XP_366456.1 | -
26 | endoglucanase | Myceliophthora thermophila CBS 117.65 | AAE25067.1 | -
27 | endoglucanase I (Egl1) (Cel7B) | Penicillium decumbens 114-2 | ABY56790.1 | -
28 | endoglucanase I (Egl1) | Penicillium decumbens L-06 | ACJ15337.1 | -
29 | endoglucanase I (Egl1;Eg1) | Penicillium oxalicum | ACS32299.1 | -
30 | endoglucanase (Cel7B) | Penicillium purpurogenum | AEL78899.1 | -
31 | endoglucanase (Bgl7C7) | Penicillium sp. C7 | AEG74551.1 | -
32 | endo-β-1,4-glucanase (EGI) (peptide fragments) (Cel7B) | Penicillium verruculosum | - | -
33 | endoglucanase 3 (PgEG3) | Pseudotrichonympha grassii | BAB64562.1 | -
34 | endoglucanase 2 (PgEG2) | Pseudotrichonympha grassii | BAB64561.1 | -
35 | endoglucanase 1 (PgEG1h) | Pseudotrichonympha grassii | BAB64553.1 | -
36 | Egl1 (fragment) | Trichoderma asperellum T203 | AAS37698.1 | -
37 | endoglucanase I | Trichoderma longibrachiatum 3.1029 | AEI71804.1 | -
38 | endoglucanase I (Egl1) | Trichoderma longibrachiatum 36MS | AEC03714.1 | -
39 | endo-β-1,4-glucanase I (Egl1;EglI;TICel7A) (Cel7A) | Trichoderma longibrachiatum CECT 2606 | 1920181A | -
40 | endoglucanase I | Trichoderma longibrachiatum FU05 | ACZ34302.1 | -
41 | endoglucanase I (Egl;EGI) | Trichoderma sp. SSL | ACH68455.1 | -
42 | endo-β-1,4-glucanase (RsSymEGI;S 2038B1 1) | uncultured symbiotic protist of Reticulitermes speratus | BAF57296.1 | -
The first and second aspects will now be described in detail.
First aspect: A protein having endoglucanase activity which belongs to the GH7 class and which shows active thermostabilization.
In the first aspect of the invention the proteins have endoglucanase activity and superior thermal properties. The superior thermal properties are defined as a temperature stability that manifests in a relative substrate conversion/activity higher than 90 % (such as higher than 95 %) upon incubation at temperatures of 60 °C or higher, and active thermostabilization. The active thermostabilization is described in the following. The inventors of the present invention have surprisingly found that proteins showing active thermostabilization also show temperature stability.
This was shown by the following example. The inventors generated a GH7 endoglucanase that is a particular variant of SEQ ID NO: 4 (i.e. the one given by SEQ ID NO: 2) as follows. A nucleic acid encoding a polypeptide with SEQ ID NO: 2 was obtained by random mutagenesis (error-prone PCR, as described in Example 1). Methods for random mutagenesis are well known in the art. Furthermore, now that the inventors have disclosed here the suitability of the polypeptide of SEQ ID NO: 2, a respective nucleic acid encoding this protein can be directly prepared by the skilled person.
Methods therefor include, for example, gene synthesis or site-directed mutagenesis, starting from a nucleic acid with a high degree of sequence identity (e.g. more than 90 %) to SEQ ID NO: 4 and introducing mutations by site-directed mutagenesis (in one or several steps) to obtain the nucleic acid encoding the protein of SEQ ID NO: 2. A starting sequence from which the nucleic acid encoding SEQ ID NO: 2 can be obtained by mutagenesis is Cel7B from Hypocrea pseudokoningii, given here as SEQ ID NO: 4 (GenBank Accession number AB 90986).
The inventors of the present invention characterized the thermostability of the protein having SEQ ID NO: 2. As can be seen in Figure 3, this protein solves the technical problem underlying the present invention, i.e. it has higher temperature stability than its parental protein (SEQ ID NO: 4). This is evident, for example, from the fact that the relative substrate conversion is still near its maximum at e.g. 60 °C, whereas the relative substrate conversion of the protein having SEQ ID NO: 4 is at a very low level at said temperature (see Figure 3).
Surprisingly, the inventors have found that at even higher temperatures, e.g. in the range from 68 to 76 °C (including 70 to 74 °C), the relative substrate conversion does not significantly drop with increasing temperature. This is in sharp contrast to the properties of the parental protein having SEQ ID NO: 4, which, in a plot against increasing temperatures, shows a decrease of relative substrate conversion, the decrease going down to background levels without any intermediate plateau. It is believed that the protein having SEQ ID NO: 4, when exposed to higher temperatures, e.g. 60 °C or more, such as 70 °C or more, is not present in its active state. Without wishing to be bound to any particular theory, it is believed that this effect is due to thermal unfolding (or folding into non-active conformations) of the protein; the effect of activity loss at high temperatures will in the following be called thermal unfolding. Thermal unfolding is a well-known phenomenon for proteins of almost any type, particularly enzymes, at higher temperatures. The thermal unfolding observed for the protein having SEQ ID NO: 4 is thus in line with the expectations of a skilled person. The protein of SEQ ID NO: 4 is not part of the invention.
In sharp contrast, the protein of this aspect of the invention shows a plateau phase at higher temperatures, e.g. in the range from 68 to 76 °C (including 70 to 74 °C). This plateau is lower than the maximum relative substrate conversion, but higher than the background relative substrate conversion. Without wishing to be bound to any particular theory, the inventors of the present invention conclude that the protein of the invention is present at these higher temperatures in a state which is different from the folded state at lower temperatures (e.g. 46 °C), but that this protein is nevertheless enzymatically active. It may thus be assumed that at high temperatures, this protein of the invention actively refolds, i.e. refolds to obtain a further active state (thus enabling the observed relative substrate conversion at higher temperatures). The inventors have therefore termed this property, which is also defined above in the definitions section, "active thermostabilization". The protein of the invention thus solves the technical problem underlying the present invention by being temperature stable.
Furthermore, based on the disclosure of the present invention, the skilled worker is given guidance for the identification of further proteins according to this first aspect of the invention. Such further proteins may be found as follows. First, any type of mutation (including deletion, insertion or replacement of one or several amino acid residues, introduced randomly or in a directed manner) may be introduced into any endoglucanase of the GH7 family, particularly into any one named in Table 1, to obtain a mutant protein or a library thereof. Then, the so-obtained mutant protein or the library thereof may be screened for active thermostabilization as defined above. The known proteins given in Table 1 above are not part of the invention, but any mutants thereof showing active thermostabilization are included in the invention.
Importantly, all enzymes of the first aspect of the present invention, i.e. the ones which show active thermostabilization as defined above, show temperature stability as defined above. The active thermostabilization is thus, in a first aspect, a solution to the problem underlying the present invention. Whether or not any given protein falls under the first aspect of the invention can be reliably tested by the assay for active thermostabilization given above.
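The thermal criteria of this first aspect can be illustrated on a measured temperature profile. The following is a minimal sketch only and does not reproduce the assay of the definitions section: the function names, the assumed background level, the tolerated drop within the window and the two example curves are hypothetical assumptions. Only the 68 to 76 °C window, the 90 % criterion at 60 °C and the qualitative plateau conditions (above background, below the maximum, no significant drop) are taken from the text.

```python
def is_temperature_stable(curve, temperature=60.0, threshold=0.90):
    """True if relative conversion at the given temperature exceeds the threshold.

    curve maps temperature in degrees C to relative substrate conversion (0 to 1).
    """
    return curve.get(temperature, 0.0) > threshold


def has_plateau(curve, t_low=68.0, t_high=76.0, background=0.05, max_drop=0.10):
    """True if the curve shows a plateau within [t_low, t_high].

    background and max_drop are assumed tolerances, not values from the document.
    """
    window = [conv for temp, conv in sorted(curve.items()) if t_low <= temp <= t_high]
    if len(window) < 2:
        return False  # not enough data points to speak of a plateau
    peak = max(curve.values())
    above_background = min(window) > background
    below_maximum = max(window) < peak
    flat = max(window) - min(window) <= max_drop
    return above_background and below_maximum and flat


# Hypothetical curves for illustration only (not data from Figure 3).
variant = {46: 1.00, 60: 0.95, 64: 0.60, 68: 0.42, 70: 0.40,
           72: 0.40, 74: 0.39, 76: 0.38, 80: 0.03}
parent = {46: 1.00, 54: 0.50, 60: 0.10, 68: 0.02, 72: 0.02, 76: 0.02}

print(is_temperature_stable(variant), has_plateau(variant))
print(is_temperature_stable(parent), has_plateau(parent))
```

In practice the background level would be taken from a no-enzyme control measured in the same experiment rather than fixed in advance.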
Second aspect: A protein having endoglucanase activity which comprises an amino acid sequence having at least 96 %, preferably at least 97 %, more preferably at least 98 %, even more preferably at least 99 %, such as at least 99.5 % identity to SEQ ID NO: 2.
In searching for a second solution to the problem underlying the present invention, the inventors embarked on a mutagenesis project, starting from the protein of SEQ ID NO: 2. Thus, the inventors have introduced mutations, such as point mutations, into the protein having SEQ ID NO: 2 (i.e. by modifying the underlying nucleic acid, as described below). The inventors have found that many such mutants also show temperature stability and thus solve the underlying problem in a second aspect. Examples of the solutions are given in Figure 3.
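The identity thresholds of the second aspect can be checked once a candidate sequence has been aligned to SEQ ID NO: 2. The following is a minimal sketch under stated assumptions: the alignment is assumed to be precomputed (gaps written as "-"), the denominator is assumed to be the number of alignment columns that are not gaps in both sequences, and the function names are hypothetical. The definition of identity given in the definitions section of this document takes precedence over this simplification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a precomputed pairwise alignment (gap symbol '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(aligned_a, aligned_b))
    # Count columns where both sequences carry the same residue.
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    # Assumed denominator: all columns except those that are gaps in both rows.
    columns = sum(1 for a, b in pairs if not (a == "-" and b == "-"))
    if columns == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 96.0) -> bool:
    """True if the identity reaches the threshold (96 % is the broadest value in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# First ten residues of SEQ ID NO: 2 and SEQ ID NO: 4 as printed in the listing.
print(percent_identity("SLQPGTSTPE", "SQQPGTSTPE"))
```

For full-length sequences the same function applies unchanged; only the alignment step, which is outside this sketch, differs.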
Thus, in a second aspect, the invention relates to a protein having endoglucanase activity which comprises an amino acid sequence having at least 96 %, preferably at least 97 %, more preferably at least 98 %, even more preferably at least 99 %, such as at least 99.5 % identity to SEQ ID NO: 2. This protein may typically belong to the GH7 class. Particularly, the present invention also provides specific mutants of the protein with the sequence of SEQ ID NO: 2. Thus, the sequence given in SEQ ID NO: 2 is modified in one or more (preferably 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 or 19) positions. Such modification may consist of replacement, deletion, insertion and the like. In a particular embodiment thereof, the modification consists in a replacement. In an even more specific embodiment, the modification consists in a replacement at any one or more of the specific positions of SEQ ID NO: 2 which are individualized in the very left column of Tables 2, 3 and 4. While such modification at any of these given positions may in principle be a replacement by any amino acid residue, it is preferred that the replacement is a replacement by an amino acid residue given in lane number 4 of any one or more of Tables 2, 3 or 4, or by an amino acid residue similar thereto (similar mutation as defined above). Thus, similar mutations as defined above might be introduced instead of the listed ones. Example 5 shows some of such mutants. Methods for the introduction of mutations are known in the art. Exemplary guidance can be taken from Example 1. In other words, a preferred embodiment of the invention relates to preferred positions for mutagenesis of endoglucanases of the GH7 class. A list of preferred exchanges is given in Table 2, lane 2. In another preferred embodiment the preferred mutations are selected from the listing in Table 3, lane 2. In another preferred embodiment of the invention the preferred mutations are selected from Table 4, lane 2. It is also possible to combine two or three of these preferred embodiments; for example, one or more preferred exchanges given in Table 2 can be combined with one or more preferred exchanges given in Table 3 and/or Table 4.
Table 2: Preferred exchanges of amino acids with respect to SEQ ID NO: 2
lane 1 lane 2 lane 3 lane 4 lane 1 lane 2 lane 3 lane 4
L,Q L,Q Q 217 ΑΎ
T,C 231 R,G,H,K R,H,K H,K
16 T,C 233 G,N
19 K,E 235 P,S
23 S,H S,H 242 G,D G
30 N,D 249 P,R P,R
32 Y,S Y,S 261 G G,C
41 W W,R W 263 P,T P,T
42 I,M 271 T,K T,K
48 N,Y 277 N,D,E N,D,E D,E
55 G,C G,C C 290 T,E T,E
64 E,H,K E,H,K H,K 293 S,T
65 A,D A,D 299 T,A T,E
67 G,C G,C C 312 E,D,S E,D,S D,S
68 S,C,G S,C,G C,G 317 I,V I
75 G G,C G 322
86 N,S N,S 323 D,N
A A,G A
Q Q,E ε E
G G,S G
Q Q,L Q
T T,D,E T,D,E D,E
Q Q, α K
L L,F L,F F
P P,L,S P
N NJ N
R R,G,H,K R,H,K H,K
G G,D G
P P,R P,R R
G G,C G
P P,T P,T T
T T,K T,K K
N N,D,E N,D,E D,E
T T,E T,E E
T T,A T,E E
E E,D,S E,D,S D,S
D D,N D
S S,T S
M M,K M
D D,E,S D,E,S E,S
H H,E H,E E
P P,L P,L L
S S,L S
T T,I T,I I
I I,T I,T T
S S,G S,G G
D D,Y D,Y Y
382 P P,del P
383 P P,del P
392 A A,T A,T T
398 S S,T S,T T
431 Y Y,H H
448 H H,Y H
The first and second aspects of the invention, although being different solutions to the same problem, are not necessarily mutually exclusive. Thus, the invention relates to proteins fulfilling the conditions of both the first aspect and the second aspect above. It is important to note that the first aspect and the second aspect are two alternative solutions to the problem of providing GH7 enzymes with improved temperature stability. These solutions are independent (although overlapping for some examples) and thus need not necessarily be combined. For example, the protein identified as [6] in Figure 3 shows temperature stability compared to the protein having SEQ ID NO: 4 ([2] in Figure 3), yet does not display active thermostabilization. The invention thus provides many different enzyme variants according to the first aspect above and/or according to the second aspect above. Whether any given enzyme shows the desired thermal properties (temperature stability and/or active thermostabilization) can be easily tested by the test entitled "Determination of thermostability and/or active thermostabilization" above.
As given in detail above in the definitions section, as well as individualized by the examples below, it is briefly summarized here how the desired mutations can be obtained:
• Pairwise alignment of any GH7 endoglucanase sequence with SEQ ID NO: 2 using the ClustalW algorithm
• Identification of corresponding positions (lane 1) in the GH7 endoglucanase target sequence
• Modification of corresponding positions in the GH7 endoglucanase target sequence according to the proposed preferred exchanges given in lane 2, or preferably in lane 3
• Expression of the modified sequence and testing of the expressed protein for improved thermal properties
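The second and third steps of the list above (identifying corresponding positions and modifying them) can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the examples: the pairwise alignment is assumed to have been computed beforehand (e.g. with ClustalW) and is passed in as two gapped strings, the function names are hypothetical, and the short sequences in the usage example are toy strings rather than real endoglucanase sequences.

```python
def map_positions(aligned_ref: str, aligned_target: str) -> dict[int, int]:
    """Map 1-based positions of the reference to 1-based positions of the target.

    Both arguments are rows of one pairwise alignment, with '-' as gap symbol.
    Reference positions aligned to a gap in the target are left out of the map.
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    ref_pos = target_pos = 0
    for ref_res, target_res in zip(aligned_ref, aligned_target):
        if ref_res != "-":
            ref_pos += 1
        if target_res != "-":
            target_pos += 1
        if ref_res != "-" and target_res != "-":
            mapping[ref_pos] = target_pos
    return mapping


def apply_exchange(target: str, mapping: dict[int, int],
                   ref_position: int, new_residue: str) -> str:
    """Replace the target residue corresponding to a reference position."""
    if ref_position not in mapping:
        raise KeyError(f"reference position {ref_position} has no counterpart")
    pos = mapping[ref_position]
    return target[:pos - 1] + new_residue + target[pos:]


# Toy alignment for illustration only.
aligned_ref = "SLQPG-TST"
aligned_target = "SQQ-GATST"
target = aligned_target.replace("-", "")

mapping = map_positions(aligned_ref, aligned_target)
print(mapping)
print(apply_exchange(target, mapping, 2, "L"))
```

Keeping the position map separate from the exchange step allows the same map to be reused for every position listed in lane 1 of the tables.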
It is believed that the thermostable enzymes of the invention also come with reduced agglomerate formation at higher temperatures, and thus with reduced precipitation. The avoidance of such precipitates is particularly advantageous in the presence of garments, denim or woven materials, as well as for the application in membrane reactors, where it reduces membrane fouling.
Fusion proteins comprising any protein of the invention are also part of the invention. Another aspect of the invention is related to the production of the proteins of the invention by heterologous expression in a production host, also termed expression host. Methods for the heterologous expression comprise the transfer of a nucleic acid encoding the protein of the invention (expression construct) into the production host by transformation, transfection, crossing or equivalent methods of nucleic acid (DNA or RNA) transfer. Methods for transformation within the meaning of this invention are not particularly limited. Examples have been reported for a variety of species and include electroporation, protoplast transformation, chemical transformation, transfer via ballistic particles, micro-injection, viral infection, crossing, mating or the use of naturally competent strains or cell lines. A preferred production host co-secretes the endoglucanase of the invention with other cellulases, hemicellulases or pectinases into the culture broth. It is thus preferred that the coding sequence on the expression construct encodes the endoglucanase of the invention preceded by a signal for secretion from the particular host strain. Such signals are well known in the art: for example in Eubacteria they are called signal peptides. Without wishing to be bound to a particular theory, these signals have in common the ability to direct secretion of a protein, typically in a co-translational fashion. A preferred expression host is Trichoderma reesei.
A further aspect of the invention is the application of the above-described endoglucanase proteins. This includes the application of purified, partially purified or crude protein preparations as such or in an enzyme formulation, as well as the application of whole cells or organisms expressing the target protein. Fields of application for endoglucanases can be found in the chapter "Field of the invention". As stated there, the application of thermally stable proteins is highly desirable. A preferred application of the endoglucanase lies in the field of enzymatic lignocellulose conversion.
Overview of the sequences disclosed herein
Sequences disclosed herein (SEQ ID NO: 1-16)
SEQ ID NO: 1
TCTCTGCAGCCAGGAACTTCTACTCCAGAGGTGCACCCAAAGCTGACCACCTACAAGTGTACCACCTCTGGTGGT TGTGTTGCTCAGAACACCTATGTTGTTCTGGACTGGAACTACAGATGGATCCACGACGCCAACTACAACTCTTGT ACCGTGAACGGTGGTGTCAACACTACTCTGTGTCCAGACGAGGCTACTGGTAGCAAGAACTGCTTCATCGAGGGT GTTGACTACGCTGCTTCTGGTGTTACTGCCAATGGTTCTACCTTGACCCTGAACCAGTACATGCCATCTTCCTCT GGCGGTTACACTTCTGTGTCGCCAAGACTGTACTTGTTGGGTCCAGACGGTAAGTACGTTATGCTGAAGCTGAAC GGACAGGAGCTGTCTTTTGACGTTGACCTGTCTGCTTTGCCATGTGGAGAGAACGCTTCTCTGTACCTGTCTCAG ATGGACGAGAACGGTGGAGCTAACCAGTACAACACCGCCGGTGCTAACTACGGTTCTGGTTACTGTGACGCCCAG TGTCCAGTTCAGACTTGGAGAAACGGAACCCTGAACACTTCTGGCCAGGGATTCTGCTGTAACGAGATGGACATC TTGGAGGGAAACTCTAGAGCTAACGCTCTGACCCCACACTCTTGTAATGCTACCGCTTGTGACTCTGCTGGTTGC GGTTTTAACCCATACCGCTCGGGTTACCCAAACTACTTTGGCCCAGGTGGCACTGTTGACACCTCGAAGCCATTC ACCATCATCACCCAGTTCAACACCGACAACGGTTCTCCATCTGGTAACCTGGTGTCGATCACCAGAAAGTACAGA CAGAACGGCGTTGACATCCCATCTGCTAAACCAGGTGGCGACACCATTTCGTCTTGTCCATCTGCCTCTACTTAC GGTGGATTGGCTACCATGGGAAAGGCTCTGTCCGAGGGAATGGTGCTGATCTTCTCGATCTGGAACGACAACTCG CAGTACATGAACTGGCTGGACTCTGGTGATGCTGGTCCATGTTCTTCTACCGAGGGCAACCCATCTAACATCCTG GCTAACAACCCTGGTACTCACGTGGTGTACTCGAACATTAGATGGGGCGACATTGGTTCTACCACCAACTCTACC GGTGGTAACCCACCACCACCACCTGCATCTTCTACCACCTTCTCGACCGCCAGAAGATCGTCTACCTCCTCTTCT TCTCCATCTTGTATCCAGACTCACTGGGGTCAGTGTGGTGGTATTGGCTACACCGGCTGTAAGACCTGTACCTCT GGAACCACTTGCCAGTACAGCAACGACTACTACTCTCAGTGCCTGTGA
SEQ ID NO: 2
SLQPGTSTPEVHPKLTTYKCTTSGGCVAQNTY LDWNYR IHDANYNSCTVNGGVNTTLCPDEATGSKNCFIEG VDYAASGVTA GSTLTLNQYMPSSSGGYTSVSPRLYLLGPDGKYVML LNGQELSFDVDLSALPCGENASLYLSQ MDENGGANQYNTAGANYGSGYCDAQCPVQTWRNGTLNTSGQGFCCNEMDILEGNSRANALTPHSCNATACDSAGC GFNPYRSGYPNYFGPGGTVDTSKPFTI ITQFNTDNGSPSGNLVSI RKYRQNGVDIPSAKPGGDTISSCPSASTY GGLATMGKALSEGMVLIFSIWNDNSQYMNWLDSGDAGPCSSTEGNPSNILANNPGTHWYSNIRWGDIGSTTNST GGNPPPPPASSTTFSTARRSSTSSSSPSCIQTHWGQCGGIGYTGC TCTSGTTCQYSNDYYSQCL*
SEQ ID NO: 3
TCTCAGCAGCCAGGAACTTCTACTCCAGAGGTGCACCCAAAGCTGACCACCTACAAGTGTACCACCTCTGGTGGT TGTGTTGCTCAGGACACCTCTGTTGTTCTGGACTGGAACTACAGATGGATGCACGACGCCAACTACAACTCTTGT ACCGTGAACGGTGGTGTCAACACTACTC GTGTCCAGACGAGGCTACTTGTGGCAAGAACTGCTTCATCGAGGGT GTTGACTACGCTGCTTCTGGTGTTACTGCCTCTGGTTCTACCTTGACCCTGAACCAGTACATGCCATCTTCCTCT GGCGGTTACTCTTCTGTGTCGCCAAGACTGTACTTGTTGGGTCCAGACGGTGAGTACGTTATGCTGAAGCTGAAC GGACAGGAGCTGTCTTTTGACGTTGACCTGTCTGCTTTGCCATGTGGAGAGAACGGTTCTCTGTACCTGTCTCAG ATGGACGAGAACGGTGGAGCTAACCAGTACAACACCGCCGGTGCTAACTACGGTTCTGGTTACTGTGACGCCCAG TGTCCAGTTCAGACTTGGAGAAACGGAACCCTGAACACTTCTGGCCAGGGATTCTGCTGTAACGAGATGGACATC TTGGAGGGAAACTCTAGAGCTAACGCTCTGACCCCACACTCTTGTACTGCTACCGCTTGTGACTCTGCTGGTTGC GGTTTTAACCCATACGGCTCGGGTTACCCAAACTACT TGGCCCAGGTGACACTGTTGACACCTCGAAGCCATTC ACCATCATCACCCAGTTCAACACCGACAACGGTTCTCCATCTGGTAACCTGGTGTCGATCACCAGAAAGTACAGA CAGAACGGCGTTGACATCCCATCTGCTAAACCAGGTGGCGACACCATTTCGTCTTGTCCATCTGCCTCTGCTTAC GGTGGAT GGCTACCATGGGAAAGGCTCTGTCCTCTGGAATGGTGCTGATCTTCTCGATCTGGAACGACAACTCG CAGTACATGAACTGGCTGGACTCTGGTTCTGCTGGTCCATGTTCTTCTACCGAGGGCAACCCATCTAACATCCTG GCTAACAACCCTGGTACTCACGTGGTGTACTCGAACATTAGATGGGGCGACATTGGTTCTACCACCAACTCTACC GGTGGTAACCCACCACCACCACCTGCATCTTCTACCACCTTCTCGACCACCAGAAGATCGTCTACCACCTCTTCT TCTCCATCTTGTACCCAGACTCACTGGGGTCAGTGTGGTGGTATTGGCTACACCGGCTGTAAGACCTGTACCTCT GGAACCACTTGCCAGTACGGCAACGACTACTACTCTCAGTGCCTGTGA
SEQ ID NO: 4
SQQPGTSTPEVHPKL TY CTTSGGCVAQDTSWLD NYRWMHDANYNSCTV GGVNTTLCPDEATCGK CFIEG VDYAASGVTASGSTLTLNQYMPSSSGGYSSVSPRLYLLGPDGEYVMLKLNGQELSFDVDLSALPCGENGSLYLSQ MDENGGANQYNTAGANYGSGYCDAQCPVQTWRNGTLNTSGQGFCCNEMDILEGNSRANALTPHSCTATACDSAGC GFNPYGSGYPNYFGPGDTVDTSKPFTIITQFNTDNGSPSGNLVSI RKYRQNGVDIPSAKPGGDTISSCPSASAY GGLATMGKALSSGMVLIFSI NDNSQYMNWLDSGSAGPCSSTEGNPSNILANNPGTHWYSNIRWGDIGSTTNST GGNPPPPPASSTTFSTTRRSSTTSSSPSCTQTHWGQCGGIGYTGCKTCTSGTTCQYGNDYYSQCL*
SEQ ID NO: 5
SQQPGTSTPEVHPKLTTYKCTTSGGCVAQDTS LD NYRWMHDANYNSCTVNGGVNTTLCPDEATCGKNCFIEG VDYAASGVTASGSTLTLNQYMPSSSGGYSSVSPRLYLLGPDGEYVMLKLNGQELSFDVDLSALPCGENGSLYLSQ MDKNGGANQYNTAGANYGSGYCDAOCPVQTWRNGTLNTSGQGFCCNEMDILEGNSRANALTPHSCTATACDSAGC GFNPYGSGYPNYFGPGDTVDTSKPF I ITQFNTDNGSPSGNLVS ITRKYRONGVD PSAKPGGDTISSCPSASAY GGLATMGKALSSGMVLIFSI NDNSQYMNWLDSGSAGPCSSTEGNPSNILANNPGTHWYSNIR GDIGSTTNST GGNPPPPPASSTTFSTTRRSSTTSSSPSCTQTHWGQCGGIGYTGCKTC SGTTCQYGNDYYSQCL*
SEQ ID NO: 6
SLQPGTSTPEVHPKLTTYKCTTSGGCVAQNTSVVLD YRWMHDANYNSCTVNGGVNTTLCPDEATGGKNCFIEG VDYAASGVTASGSTLTLNQYMPSSSGGYSSVSPRLYLLGPDGKYVMLKLNGQELSFDVDLSALPCGENASLYLSQ MDENGGANQYNTAGANYGSGYCDAQCPVQTWRNGTLNTSGQGFCCNEMDILEGNSRANALTPHSCNATACDSAGC GFNPYGSGYPNYFGPGGTVDTSKPFTI ITQFNTDNGSPSGNLVSITRKYRQNGVDI SAKPGGDTISSCPSASTY GGLAT GKALSSGMVLIFSIWNDNSQYMNWLDSGSAGPCSSTSGNPSNILANNPGTKWYSNIR GDIGSTTNST GGNPPPPPASSTTFSTTRRSSTSSSSPSCIQTH GQCGGIGYTGCKTCTSGTTCQYSNDYYSQCL*
SEQ ID NO: 7
SLQPGTSTPEVHPKLTTYKCTTSGGCVAQNTYWLD NYRWIHDANYNSCTVNGGVNTTLCPDEATGS NCFIEG VDYAASGVTANGSTLTLNQYMPSSSGGYTSVSPRLYLLGPDGKYVMLKLNGQELSFDVDLSALPCGENASLYLSQ MDENGGANQYNTAGANYGSGYCDAQCPVQTWRNGTLNTSGQGFCCNEMDILEGNSRANALTPHSCNATACDSAGC GFNPYGSGYPNYFGPGGTVDTSKPFTI ITQFNTDNGSPSGNLVSI RKYRQNGVDIPSAKPGGDTISSCPSASTY GGLATMGKALSSGMVLIFSIWNDNSQYMNWLDSGSAGPCSSTEGNPSNILANNPGTHWYSNIR GDIGSTTNST GGNPPPPPASSTTFSTTRRSSTSSSSPSCIQTHWGQCGGIGYTGCKTCTSGTTCQYSNDYYSQCL*
SEQ ID NO: 8
SQQPGTSTPEVHPKLTTYKCTTSGGCVAQNTSWLDWNYRWMHDANYNSCTVNGGVNTTLCPDEATCGKNCFIEG VDYAASGVTASGSTLTLNQYMPSSSGGYSSVSPRLYLLGPDGKYVMLKLNGQELSFDVDLSALPCGENASLYLSQ MDENGGANQYNTAGANYGSGYCDAQCPVQTWRNGTLNTSGQGFCCNEMDILEGNSRANALTPHSCNATACDSAGC GFNPYGSGYPNYFGPGGTVDTSKPFTI ITQFNTDNGSPSGNLVSITRKYRQNGVDIPSA.KPGGDTISSCPSASAY GGLATMGKALSSGMVLIFSIW DNSQYMNWLDSGSAGPCSSTEGNPSNILAN PGTHWYSNIRWGDIGSTTNST GGNPPPPPASSTTFSTTRRSSTTSSSPSCTQTHWGQCGGIGYTGCKTCTSGTTCQYGNDYYSQCL*
SEQ ID NO: 9
SLQPGTSTPEVHPKLTTYKCTTSGGCVAQNTSWLDVJNYR MHDANYNSCTVNGGVNTTLCPDEATCCKNCFIEG VDYAASGVTASGSTLTLNQYMPSSSGGYSSVSPRLYLLGPDGKYVMLKLNGQELSFDVDLSALPCGENASLYLSQ MDENGGANQYNTAGANYGSGYCDAQCPVQTWRNGTLNTSGKGFCCNEMDILEGNSRANALTPHSCNATACDSAGC GFNPYGSGYPNYFGPGGTVDTS PFTIITQFNTDNGSPSGNLVSITR YRQNGVDIPSA PGGDTISSCPSASAY GGLATMGKALSSGMVLIFSIWNDNSQYMN LDSGSAGPCSSTEGNPSNILANNPGTHVVYSNIRWGDIGSTTNST GGNPPPPPASSTTFSTTRRSSTTSSSPSCTQTHWGQCGGIGYTGCKTCTSGTTCQYSNDYYSQCL*
SEQ ID NO: 10
SQQPGTSTPEVHPKLTTYKCTTSGGCVAQNTSWLD NYRWMHDANY SCTV GGVNTTLCPDEATCGKNCFIEG VDYAASGVTASGSTLTLNQY PSSSGGYSSVSPRLYLLGPDGKYVMLKLNGQELSFDVDLSALPCGENASLYLSQ MDENGGANQYNTAGANYGSGYCDAQCPVQTWRNGTLNTSGQGFCCNEMDILEGNSRANALTPHSCNA ACDSAGC GFNPYKSGYPNYFGPGGTVDTSKPFTIITQFNTDNGSPSGNLVSITRKYRQNGVDIPSA PGGDTISSCPSASAY GGLATMGKALSEGMVLIFSIWNDNSQYMNWLDSGSAGPCSSTEGNPSNILANNPGTHWYSNIRWGDIGSTTNST GGNPPPPPASSTTFSTTRRSSTTSSSPSCTQTHWGQCGGIGYTGCKTCTSGTTCQYGNDYYSQCL*
SEQ ID NO: 11
SQQPGTSTPEVHPKLTTYKCTTSGGCVAQNTSWLDVraYR HDA YNSCTVNGGV TTLCPDEATCG NCFIEG VDYAASGVTASGSTLTLNQYMPSSSGGYSSVSPRLYLLGPDGKYVMLKLNGQELSFDVDLSALPCGENASLYLSQ MDENGGANQYNTAGANYGSGYCDAQCPVQT RNGTLNTSGQGFCCNEMDILEGNSRANALTPHSCNATACDSAGC GFNPYGSGYPNYFGPGGTVDTSKPFTIITQFNTDNGSPSGNLVSITRKYRQNGVDIPSA PGGDTISSCPSASAY GGLATMGKALSDGMVLIFSIWNDNSQYMNWLDSGEAGPCSSTEGNPSNILAN PGTHWYSNIRWGDIGSTT ST GGNPPPPPASSTTFSTTRRSSTTSSSPSCTQTHWGQCGGIGYTGCKTCTSGTTCQYGNDYYSQCL*
SEQ ID NO: 12
SQQPGTSTPEVHPKLTTYKCTTSGGCVAQNTSWLD NYRViMHDA YNSCTVTfGGWITTLCPDEATCGFJxiCFIEG VDYAASGVTASGSTLTLNQYMPSSSGGYSSVSPRLYLLGPDGKYVMLKLNGQELSFDVDLSALPCGENASLYLSQ DENGGANQYNTAGANYGSGYCDAQCPVQTWRNGTLNTSGQGFCCNEMDILEGNSRANALTPHSCNATACDSAGC GFNPYKSGYPNYFGPGGTVDTSKPFTIITQFNTDNGSPSGNLVSITRKYRQNGVDIPSA PGGDTISSCPSASAY GGKA MGKALSDGMVLIFSIWNDNSQYMNWLDSGSAGPCSSTEGNPSNILANNPGTHVVYSNIR GDIGSTTNST GGNPPPPPASSTTFSTTRRSSTTSSSPSCTQTH GQCGGIGYTGCKTCTSGTTCQYG DYYSQCL*
SEQ ID NO: 13
SQQPGTSTPEVHPKLTTYKCTTSGGCVAQDTSWLDWNYR IHDAISTYNSCTV GGVWTTLCPDEATCSK CFIEG VDYAASGVTA GSTLTLNQYMPSSSGGYSSVSPRLYLLGPDGEYVML LNGQELSFDVDLSALPCGENGSLYLSQ MDENGGANQYNTAGANYGSGYCDAQCPVQT RNGTLNTSGQGFCCNEMDILEGNSRANALTPHSCTATACDSAGC GFNPYGSGYPNYFGPGDTVDTSKPFTI ITQFNTDNGSPSGNLVSITRKYRQNGVDIPSAKPGGDTISSCPSASAY GGLATMGKALSSGMVL I FS I WNDNSQYM WLDSGSAGPCSSTEGNPSNI LAN PGTHWYSNI RWGDI GSTTNST GGNPPPPPASSTTFSTTRRSSTTSSSPSCTQTHWGQCGGIGYTGCKTCTSGTTCQYGNDYYSQCL*
SEQ ID NO: 14
SQQPGTSTPEVHPKLTTYKCTTSGGCVAQDTSVVLDWNYRWIHDANYNSCTVNGGVNTTLCPDEATCGKNCFIEG VDYAASGVTANGSTLTLNQYMPSSSGGYSSVSPRLYLLGPDGEYVMLKLNGQELSFDVDLSALPCGENGSLYLSQ MDENGGANQYNTAGANYGSGYCDAQCPVQTWRNGTLNTSGQGFCCNEMDILEGNSRANALTPHSCTATACDSAGC GFNPYGSGYPNYFGPGDTVDTSKPFTII QFNTDNGSPSGNLVSITRKY QNGVDIPSA PGGDTISSCPSASAY GGLATMGKALSSGMVLIFSI NDNSQYMN LDSGSAGPCSSTEGNPSNILANNPGTHWYSNIR GDIGSTTNST
GGNPPPPPASSTTFSTTRRSSTTSSSPSCTQTHWGQCGGIGYTGCKTCTSGTTCQYGNDYYSQCL*
SEQ ID NO: 15
atgagatttccttcaatttttactgcagttttattcgcagcatcctccgcattagctgctccagtcaacactaca acagaagatgaaacggcacaaattccggctgaagctgtcatcggttacttagatttagaaggggatttcgatgtt gctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgct aaagaagaaggggtatctttggataaacgtgaggcggaagcatgccaccaccaccaccaccactcctccggctct ctgcagccaggaacttctactccagaggtgcacccaaagctgaccacctacaagtgtaccacctctggtggttgt gttgctcagaacacc atgt gttctggactggaactacagatggatccacgacgccaactacaactcttgtacc gtgaacggtggtgtcaacactac ctgtgtccagacgaggctactggtagcaagaactgcttcatcgagggtgtt gactacgctgcttctggtgttactgccaatggt ctaccttgaccctgaaccagtacatgccatcttcctctggc ggttacacttctgtgtcgccaagactgtacttgttggg ccagacggtaagtacg tatgctgaagctgaacgga caggagctgtcttttgacgttgacctgtctgctttgccatgtggagagaacgcttctctgtacctgtctcagatg gacgagaacggtggagctaaccagtacaacaccgccggtgctaactacggttctggt actgtgacgcccagtgt ccagttcagacttggagaaacggaaccctgaacac tc ggccagggattctgctgtaacgagatggacatc tg gagggaaactctagagctaacgctctgaccccacactcttgtaatgctaccgcttgtgactctgctggttgcggt tttaacccataccgctcgggttacccaaactactttggcccaggtggcactgttgacacctcgaagccattcacc atcatcacccagttcaacaccgacaacggttctccatctggtaacctggtgtcgatcaccagaaagtacagacag aacggcgttgacatcccatctgctaaaccaggtggcgacaccatttcgtcttgtccatctgcctctacttacggt gga tggctaccatgggaaaggctctgtccgagggaatggtgctgatcttctcgatctggaacgacaactcgcag tacatgaactggctggactctggtgatgctggtccatgttcttctaccgagggcaacccatctaacatcctggct aacaaccctggtactcacgtggtgtactcgaacattagatggggcgacattggttctaccaccaactctaccggt ggtaacccaccaccaccacctgcatcttctaccaccttctcgaccgccagaagatcgtctacctcctcttcttct ccatcttgtatccagactcactggggtcagtgtggtggtattggctacaccggc gtaagacctgtacctctgga accacttgccagtacagcaacgactactactctcagtgcctgtga
SEQ ID NO: 16:
atgtatcggaagttggccgtca ctcggccttcttggccacagcacgggcttctctgcaaccgggtaccagcacc cccgaggtccatcccaagttgacaacctacaagtgtacaacctccggggggtgcgtggcccagaacacctatgtg gtccttgactggaactaccgctggatccacgacgcaaactacaactcgtgcaccgtcaacggcggcgtcaacacc acgctctgccctgacgaggcgaccggtagcaagaactgcttcatcgagggcgtcgactacgccgcctcgggcgtc acggccaatggcagcaccctcaccctgaaccagtacatgcccagcagctctggcggctacactagcgtctctcct cggctgtatctcctgggtccagacggtaagtacgtgatgctgaagctcaacggccaggagctgagcttcgacgtc gacctctctgctctgccgtgtggagagaacgcctcgctctacctgtctcagatggacgagaacgggggcgccaac cagtataacacggccggtgccaactacgggagcggctactgcgatgctcagtgccccgtccagacatggaggaac ggcaccctcaacactagcggccagggcttctgctgcaacgagatggatatcctggagggcaactcgagggcgaat gccttgacccctcactcttgcaatgccacggcctgcgactctgccggttgcggcttcaacccctatcgcagcggc tacccaaactacttcggccccggaggcaccgttgacacctccaagccattcaccatcatcacccagttcaacacg gacaacggctcgccctcgggcaaccttgtgagcatcacccgcaagtacagacaaaacggcgtcgaca ccccagc gccaaacccggcggcgacaccatctcgtcctgcccgtccgcctcaacttacggcggcctcgccaccatgggcaag gccctgagcgagggcatggtgctcatcttcagcatttggaacgacaacagccagtacatgaactggctcgacagc ggcgatgccggcccctgcagcagcaccgagggcaacccatccaacatcctggccaacaaccccggtacgcacgtc gtctactccaacatccgctggggagacattgggtctactacgaactcgactggtggtccgcccccgcctgcgtcc agcacgacgttttcgactgcccggaggagctcgacgtcctcgagcagcccgagctgcatccagactcactggggg cagtgcggtggcattgggtacaccgggtgcaagacgtgcacgtcgggcac acgtgccagtatagcaacgactac tactcgcaatgcctt taa
Examples
Example 1: Generation of libraries and specific variants
A library based on SEQ ID NO: 3 ("N7" library) was produced by error-prone PCR with Taq polymerase, using SEQ ID NO: 3 as template and following the literature protocol (Joyce et al.), under the following PCR conditions: 2 min at 95°C; 30 cycles of (1 min at 95°C, 1 min at 56°C, 1 min at 72°C); 5 min at 72°C. All PCR products were purified with the QIAquick PCR Purification Kit (Qiagen, Hilden, Germany). Specific variants of SEQ ID NO: 3 were prepared by a modified PCR protocol using primers containing the mutated nucleotide sequence (Ho, S. N. et al. Gene 1989; 77: 51-9).
Example 2: Expression in Pichia pastoris
Linear expression cassette (LEC) construction - LECs (Liu Z. et al. Chembiochem. 2008 Jan 4; 9(1):58-61) with Zeocin marker and the GAP promoter were constructed by a modified PCR protocol. Pichia pastoris transformation and cultivation - Competent cells were prepared and transformed as described (Lin-Cereghino, J., et al. BioTechniques. 2005. 38. 44-48). Transformants were selected on YPD agar plates containing 100 mg/L Zeocin and picked into deep-well plates (DWP) (BMD5%, 250 μl/well) by a picking robot (QPix2, Genetix). Inoculated DWPs were cultivated for 60 h at 28°C, 80% humidity, and 280 rpm.
Example 3: Expression in Trichoderma reesei
Trichoderma reesei expression vector construct
SbfI/SwaI-digested, linearized pV7 plasmid DNA (Figure 2) was transformed into Trichoderma reesei SCF41 essentially as described by Penttila et al 1997. Transformants were selected on Mandels-Andreotti medium plates containing hygromycin as selective agent (100 mg/l) and verified by PCR.
Example 4: Determination of substrate conversion capacity at different temperatures for indication of the thermostability of SEQ ID NO: 2 variants using 4-methylumbelliferyl-β-D-cellobioside (4-MUC)
For precise comparison of the thermal stability, 10 μl of the Pichia pastoris culture supernatants containing the secreted endoglucanase variants were incubated with 90 μl of 100 μM 4-MUC (dissolved in sodium acetate buffer (50 mM, pH 5.0)) in the temperature gradient of an Eppendorf Gradient Thermocycler. 24 reaction mixtures were incubated for one hour in temperature gradients ranging from 45°C to 65°C and from 55°C to 75°C (each reaction was held at a unique constant temperature). The enzymatic activity at each temperature was determined by adding 100 μl of 1 M sodium carbonate solution to each reaction and measuring the fluorescence intensity at 360 nm/454 nm in a Tecan Infinite M200 plate reader. To compare thermostability, the relative enzymatic activity at each temperature point was calculated by dividing its fluorescence count by the maximum count of the series (normalization to 1). The temperature profile for any given enzyme was generated by plotting the relative enzymatic activity over the measured temperature range.
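The normalization described in Example 4 can be illustrated with a minimal Python sketch. The function name `relative_activity` and the fluorescence counts are hypothetical and chosen only for illustration; they are not measured data from this application.

```python
def relative_activity(counts):
    """Divide each fluorescence count by the series maximum (normalization to 1)."""
    peak = max(counts)
    if peak <= 0:
        raise ValueError("series needs at least one positive fluorescence count")
    return [c / peak for c in counts]


# Hypothetical example values, not measured data.
temperatures_c = [45.0, 50.0, 55.0, 60.0, 65.0]
fluorescence = [8000, 9500, 10000, 7000, 1500]

# Pair each temperature with its relative activity to obtain the profile.
profile = list(zip(temperatures_c, relative_activity(fluorescence)))
for temp, activity in profile:
    print(f"{temp:.1f} °C  {activity:.2f}")
```

Plotting the second element of each pair against the first would give a temperature profile of the kind described above.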
Example 5: Active thermostabilization of some endoglucanase variants
This example describes examples of the surprising effect of active thermostabilization. In this example, proteins (culture supernatant) expressed in Pichia pastoris (Table 5 below) were used. Figure 3 demonstrates the determined properties of the proteins of the invention: proteins designated as [1], [4], [5], [7], [8], [9] and [10] show active thermostabilization and temperature stability, while proteins designated as [3] and [6] show temperature stability, but not active thermostabilization.
[Table 5: reproduced as an image (imgf000030_0001) in the original publication]
Example 6: Determination of reducing sugar release on straw
The release of reducing sugar on straw was determined using acid-pretreated wheat straw with a dry matter content of 2.5%. The following enzymes were added to the reaction mixture: cellobiohydrolase I (12.5 mg/l), beta-glucosidase (40 CBU/mg cellobiohydrolase I) and the tested GH7 endoglucanase variant (12.5 mg/l). The straw hydrolysis reaction was incubated at 60°C with continuous shaking for 48 h.
Example 7: Determination of the temperature profile of SEQ ID NO: 2 variants
For the MUL (4-methylumbelliferyl β-D-lactopyranoside) activity assay, 10 μl of the cultivation supernatant was mixed with 90 μl of 100 μM MUL in 25 mM Na-acetate buffer, pH 4.8. Plates were sealed and incubated for 2 h with shaking at 300 rpm, at 45°C and at 59°C (for rescreening also at 65°C). The reaction was quenched by adding 100 μl Na2CO3 per well. Excitation was performed at 365 nm, and fluorescence was measured at 450 nm. The results are shown in Figure 5.
Example 8: Determination of the temperature profile of SEQ ID NO: 2 variants
The half-lives of the enzymes were determined by measuring the residual activity using the MUL assay described in Example 7 after incubation of expression supernatants of Pichia pastoris cultures at 70°C for 0 to 7min in a water bath. Samples were put on ice after the precise incubation period before setup of the activity assay.
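One way to derive a half-life from such residual-activity measurements is sketched below in Python. The sketch assumes first-order (single-exponential) inactivation, which is an assumption made here for illustration and is not stated in the example; the function name `half_life` and the data points are likewise hypothetical.

```python
import math


def half_life(times_min, residual_activity):
    """Estimate the half-life from a least-squares line through ln(activity) vs. time.

    Assumes first-order inactivation, A(t) = A0 * exp(-k * t), so that
    t_half = ln(2) / k. This model is an assumption of the sketch.
    """
    if len(times_min) != len(residual_activity) or len(times_min) < 2:
        raise ValueError("need at least two matching time/activity points")
    logs = [math.log(a) for a in residual_activity]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # equals -k for first-order decay
    if slope >= 0:
        raise ValueError("activity does not decay over the time course")
    return math.log(2) / -slope


# Hypothetical time course (minutes at 70°C) generated with k = 0.2 per minute.
times = [0, 1, 2, 3, 5, 7]
activities = [math.exp(-0.2 * t) for t in times]
print(f"estimated half-life: {half_life(times, activities):.2f} min")
```

Real plate-reader data would be noisy, so the fit would only approximate the decay constant; the synthetic series above is exact by construction.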

Claims

1. A protein having endoglucanase activity which comprises an amino acid sequence having at least 96%, preferably at least 97%, more preferably at least 98%, even more preferably at least 99%, such as at least 99.5% identity to SEQ. ID NO.: 2.
2. The protein according to claim 1, wherein the protein shows at least 90% residual substrate conversion capacity at temperature 60°C when incubation is done for one hour.
3. A protein having endoglucanase activity which belongs to the GH7 class and which shows active thermostabilization.
4. A protein according to claim 3 and having at least 70%, preferably at least 90%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, even more preferably at least 99%, such as at least 99.5% identity to SEQ. ID NO.: 2.
5. The protein according to any one of claims 1, 3 or 4, which is identical to any one of SEQ ID NO: 5; SEQ ID NO: 6; SEQ ID NO: 7; SEQ ID NO: 8; SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 11; SEQ ID NO: 12; SEQ ID NO: 13; SEQ ID NO: 14; SEQ ID NO: 2; or which has at least one mutation with respect to SEQ ID NO: 2.
6. The polypeptide according to claim 5, wherein the at least one mutation is one selected among the preferred, more preferred or alternative changes shown in Table 2.
7. The polypeptide according to claim 5, wherein the at least one mutation is one selected among the preferred, more preferred or alternative changes shown in Table 3.
8. The polypeptide according to claim 5, wherein the at least one mutation is one selected among the preferred, more preferred or alternative changes shown in Table 4.
9. A nucleic acid encoding a protein of any of the preceding claims.
10. An expression vector comprising the nucleic acid of claim 9.
11. A microorganism containing the vector construct of claim 10.
12. A mixture containing the protein according to any one of claims 1-8 and one or more further enzyme(s), preferably selected from one or more of cellulases, hemi-cellulases and pectinases.
13. Use of the protein of any one of claims 1-8 or of the mixture of claim 12.
14. Use of claim 13 for the saccharification of lignocellulose.
14. Use of claim 13 for the saccharification of lignocellulose.
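The percent-identity thresholds recited in claims 1 and 4 can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length (gaps written as "-") and counts identity over alignment columns; both the alignment step and this choice of denominator are assumptions of the sketch, since the definition of identity used in this application is not reproduced in this section. The function name and the short peptide strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Gap characters are written as "-". Columns where both sequences have a
    gap are ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if not (x == "-" and y == "-")]
    if not pairs:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Hypothetical 10-residue fragments differing at one position.
reference = "SQQPGTSTPE"
variant = "SQQPGTSAPE"
print(f"{percent_identity(reference, variant):.1f}% identity")
```

Comparing the returned value against a threshold such as 96 would then indicate whether a candidate sequence meets a given identity limit under these assumptions.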
PCT/EP2013/058985 2012-05-02 2013-04-30 Endoglucanases with improved properties WO2013164340A1 (en)

Priority Applications (9)

Application Number Priority Date Filing Date Title
EA201401199A EA034175B1 (en) 2012-05-02 2013-04-30 Protein having endoglucanase activity and use thereof for saccharification of lignocellulose
BR112014027168A BR112014027168A8 (en) 2012-05-02 2013-04-30 protein, nucleic acid, expression vector, microorganism, mixture and use
AU2013255931A AU2013255931B2 (en) 2012-05-02 2013-04-30 Endoglucanases with improved properties
EP13719552.5A EP2844749A1 (en) 2012-05-02 2013-04-30 Endoglucanases with improved properties
CN201380023399.9A CN104271738B (en) 2012-05-02 2013-04-30 The endoglucanase that characteristic improves
CA2871841A CA2871841C (en) 2012-05-02 2013-04-30 Endoglucanases with improved properties
US14/397,980 US9677059B2 (en) 2012-05-02 2013-04-30 Endoglucanases with improved properties
MX2014013255A MX2014013255A (en) 2012-05-02 2013-04-30 Endoglucanases with improved properties.
ZA2014/07404A ZA201407404B (en) 2012-05-02 2014-10-13 Endoglucanases with improved properties

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP12166458.5A EP2660319B1 (en) 2012-05-02 2012-05-02 Endoglucanases with improved properties
EP12166458.5 2012-05-02

Publications (1)

Publication Number Publication Date
WO2013164340A1 (en) 2013-11-07

Family

ID=48227298

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2013/058985 WO2013164340A1 (en) 2012-05-02 2013-04-30 Endoglucanases with improved properties

Country Status (19)

Country Link
US (1) US9677059B2 (en)
EP (2) EP2660319B1 (en)
CN (1) CN104271738B (en)
AU (1) AU2013255931B2 (en)
BR (1) BR112014027168A8 (en)
CA (1) CA2871841C (en)
CO (1) CO7151534A2 (en)
DK (1) DK2660319T3 (en)
EA (1) EA034175B1 (en)
ES (1) ES2607061T3 (en)
HR (1) HRP20161615T1 (en)
HU (1) HUE030744T2 (en)
MX (1) MX2014013255A (en)
MY (1) MY168774A (en)
PL (1) PL2660319T3 (en)
RS (1) RS55389B1 (en)
SI (1) SI2660319T1 (en)
WO (1) WO2013164340A1 (en)
ZA (1) ZA201407404B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020150386A1 (en) * 2019-01-16 2020-07-23 Fornia Biosolutions, Inc. Endoglucanase compositions and methods

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4363564A1 (en) 2021-06-29 2024-05-08 Agilent Technologies, Inc. Polymerase mutants and use with 3'-oh unblocked reversible terminators

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007115723A2 (en) * 2006-04-06 2007-10-18 Institut Français Du Petrole Fusion proteins between plant cell-wall degrading enzymes and a swollenin, and their uses
WO2011153516A2 (en) * 2010-06-03 2011-12-08 Mascoma Corporation Yeast expressing saccharolytic enzymes for consolidated bioprocessing using starch and cellulose
WO2012036810A2 (en) * 2010-09-15 2012-03-22 The Regents Of The University Of California Thermophilic mutants of trichoderma reesei endoglucanase i

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5912157A (en) 1994-03-08 1999-06-15 Novo Nordisk A/S Alkaline cellulases

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DATABASE Geneseq [online] 10 May 2012 (2012-05-10), "Trichoderma reesei endoglucanase I (TrEGI) mutant G230R/D113S.", XP002680452, retrieved from EBI accession no. GSP:AZU33761 Database accession no. AZU33761 *
ERIKSSON T ET AL: "A model explaining declining rate in hydrolysis of lignocellulose substrates with cellobiohydrolase I (Cel7A) and Endoglucanase I (Cel7B) of Trichoderma reesei", APPLIED BIOCHEMISTRY AND BIOTECHNOLOGY, HUMANA PRESS, INC, UNITED STATES, vol. 101, no. 1, 1 April 2002 (2002-04-01), pages 41 - 60, XP008103337, ISSN: 0273-2289, [retrieved on 20070601], DOI: 10.1385/ABAB:101:1:41 *
ZHU Y S ET AL: "Induction and regulation of cellulase synthesis in Trichoderma pseudokoningii mutants EA3-867 and N2-78", ENZYME AND MICROBIAL TECHNOLOGY, STONEHAM, MA, US, vol. 4, no. 1, 1 January 1982 (1982-01-01), pages 3 - 12, XP023678002, ISSN: 0141-0229, [retrieved on 19820101], DOI: 10.1016/0141-0229(82)90003-5 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020150386A1 (en) * 2019-01-16 2020-07-23 Fornia Biosolutions, Inc. Endoglucanase compositions and methods
US10927358B2 (en) 2019-01-16 2021-02-23 Fornia Biosolutions, Inc. Endoglucanase compositions and methods

Also Published As

Publication number Publication date
CA2871841C (en) 2018-09-25
PL2660319T3 (en) 2017-03-31
EP2660319A1 (en) 2013-11-06
ZA201407404B (en) 2015-11-25
CN104271738A (en) 2015-01-07
MY168774A (en) 2018-12-04
EA201401199A1 (en) 2015-08-31
EP2844749A1 (en) 2015-03-11
MX2014013255A (en) 2015-01-16
BR112014027168A8 (en) 2021-06-29
EP2660319B1 (en) 2016-09-14
RS55389B1 (en) 2017-03-31
BR112014027168A2 (en) 2017-06-27
HRP20161615T1 (en) 2017-01-13
US20150175987A1 (en) 2015-06-25
AU2013255931A1 (en) 2014-11-13
DK2660319T3 (en) 2017-01-02
CN104271738B (en) 2018-05-11
SI2660319T1 (en) 2017-01-31
CO7151534A2 (en) 2014-12-29
HUE030744T2 (en) 2017-05-29
ES2607061T3 (en) 2017-03-29
EA034175B1 (en) 2020-01-14
US9677059B2 (en) 2017-06-13
AU2013255931B2 (en) 2017-05-04
CA2871841A1 (en) 2013-11-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13719552

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2013719552

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2871841

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: 14397980

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: IDP00201406751

Country of ref document: ID

Ref document number: MX/A/2014/013255

Country of ref document: MX

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2013255931

Country of ref document: AU

Date of ref document: 20130430

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 201401199

Country of ref document: EA

WWE Wipo information: entry into national phase

Ref document number: 14264528

Country of ref document: CO

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112014027168

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 112014027168

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20141030