WO2001049878A1 - FUNGAL EXTRACELLULAR Fam35 BETA-GALACTOSIDASES

FUNGAL EXTRACELLULAR Fam35 BETA-GALACTOSIDASES

Info

Publication number
WO2001049878A1
Authority
WO
WIPO (PCT)
Prior art keywords
enzyme
sequence
dna
seq
cell
Prior art date
Application number
PCT/DK2000/000693
Other languages
French (fr)
Inventor
Kirk Schnorr
Lene Lange
Søren Flensted LASSEN
Original Assignee
Novozymes A/S
Priority date
Filing date
Publication date
Application filed by Novozymes A/S
Priority to AU19961/01A
Publication of WO2001049878A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y: ENZYMES
    • C12Y302/00: Hydrolases acting on glycosyl compounds, i.e. glycosylases (3.2)
    • C12Y302/01: Glycosidases, i.e. enzymes hydrolysing O- and S-glycosyl compounds (3.2.1)
    • C12Y302/01023: Beta-galactosidase (3.2.1.23), i.e. exo-(1→4)-beta-D-galactanase
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/24: Hydrolases (3) acting on glycosyl compounds (3.2)
    • C12N9/2402: Hydrolases (3) acting on glycosyl compounds (3.2) hydrolysing O- and S-glycosyl compounds (3.2.1)
    • C12N9/2468: Hydrolases (3) acting on glycosyl compounds (3.2) hydrolysing O- and S-glycosyl compounds (3.2.1) acting on beta-galactose-glycoside bonds, e.g. carrageenases (3.2.1.83; 3.2.1.157); beta-agarase (3.2.1.81)
    • C12N9/2471: Beta-galactosidase (3.2.1.23), i.e. exo-(1→4)-beta-D-galactanase

Definitions

  • β-galactosidases or lactases (EC 3.2.1.23) of the Family 35 (Fam35) are known to be produced by Aspergillus niger and Aspergillus oryzae.
  • US Patent No. 5,736,374 and the corresponding PCT-application WO 96/00786 describe an A. oryzae lactase, gene and product, and the use of this lactase for treating lactose intolerance in mammals.
  • Lactases are widely used in the preparation of foodstuffs or feed for consumption by lactose intolerant humans or animals, and the industrial production of lactases is an active field of research.
  • Traditionally, screening for such lactases involves laborious activity-based assays, and the development of new and easier ways to screen for lactases is of importance to the industry.
  • O-Glycoside hydrolases (EC 3.2.1.-) are a widespread group of enzymes which hydrolyse the glycosidic bond between two or more carbohydrates or between a carbohydrate and a non-carbohydrate moiety.
  • The IUB-MB enzyme nomenclature of glycoside hydrolases is based on their substrate specificity and occasionally on their molecular mechanism; such a classification does not reflect (and was not intended to reflect) the structural features of these enzymes.
  • A classification of glycoside hydrolases into families (Fam) based on amino acid sequence similarities was proposed a few years ago.
  • β-galactosidases or lactases (EC 3.2.1.23) of the Family 35 (Fam35) are known to be produced by Aspergillus niger and Aspergillus oryzae.
  • US Patent No. 5,736,374 and the corresponding PCT-application WO 96/00786 describe an A. oryzae lactase, gene and product, and the use of this lactase for treating lactose intolerance in mammals.
  • US Patent No. 5,821,350 describes an A. niger lactase, gene and product.
  • WO 90/10703 describes an A. niger lactase which is produced and secreted by a yeast.
  • US Patent No. 3,718,739 describes a method of reducing lactose intolerance in mammals by treating with a lactase from A. niger.
  • US Patent No. 4,522,832 describes addition of a lactase to baked yeast goods, specifically lactase from A. oryzae.
  • the addition of compositions comprising lactase to dairy products under certain conditions was described in US Patent No. 5,707,843.
  • a method of preparing an ice cream containing a lactose composition, specifically from yeast, was described in US Patent No. 5,942,264.
  • the use in animal feed of lactase in combination with other enzymes such as galactanase was described in WO 97/16982.
  • lactase products fall in two categories, the so called neutral lactases e.g. the one that is produced in the yeast Kluyveromyces lactis as an intracellular
  • New lactases would be highly desirable for use in industrial applications (to provide lactose-free products such as milk and cheese for lactose-intolerant people), in end-user applications (as a digestive aid), and even in animal feed. Further, such lactases could be used in the production of fermentation stocks from whey and in applications within the dairy industry for obtaining improved mouth-feel and preventing crystallization during freezing of yogurt and ice cream.
  • the present invention provides a solution to the problem of how to obtain new Fam35 lactases of fungal origin for industrial production, without having to perform traditional activity based screening assays.
  • Fam35 genes/enzymes come from plants; a few examples are also known from the animal kingdom, but only very few from microorganisms.
  • Although sequence variation within the entire known Fam35 is broad, the sequences of the three known Fam35 genes of fungal origin (viz. A. niger, A. oryzae, and Penicillium canescens) are highly conserved. Further, these three fungal species are all very closely related, being from two genera of the Trichocomaceae family, belonging to the Ascomycete order Eurotiales. It was therefore highly surprising when we discovered, first of all, that the Basidiomycete Meripilus giganteus had a Fam35 gene/enzyme; it was also surprising to find that this new fungal Fam35 gene/enzyme had a quite different sequence compared to the ones known from Aspergillus and Penicillium.
  • Partial sequences have further been isolated from five more strains of Ascomycete and Basidiomycete species: Mycena pura, Spathularia flavida, Diplodia gossypina, Petromyces alliaceus, and Christaspora arxii.
  • sequence data reveal surprising differences between the genes in spite of them being recognized by the same sets of primers defined in the present invention, thus confirming that knowledge of the conserved regions described herein was a prerequisite for successfully designing the PCR-primers of the invention.
  • By using the degenerate primer sets of the invention, it proved possible to detect Fam35 lactases with new characteristics with respect to pH, temperature, and substrate specificity profiles, and to obtain enzymes with different levels of specific activity. In this way the invention opens the way to industrial production of new and improved lactases for use in many industrial segments, including both the acid and neutral lactase market segments.
  • Region 3 (SEQ ID No. 3): E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G; Region 4 (SEQ ID No. 4): Y-T-S-Y-D-Y-G-S;
  • … D-K-V-R-G; Region 6 (SEQ ID No. 6): N-E-G-G-L-[Y/F]-A-E-R; Region 7 (SEQ ID No. 7):
  • A bracket in the above sequence listings, such as in '[P/S]', denotes one position in the amino acid sequence where either of the two amino acid residues indicated within the bracket may be present; in the case of [P/S], either 'P' or 'S' may be present in that position.
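The bracket notation described above maps directly onto a character class in a regular expression. The following is a minimal sketch, not part of the patent, assuming the motifs are written with single-letter residues separated by hyphens; the names `MOTIFS`, `motif_to_regex` and `find_motifs` are hypothetical and introduced only for illustration. Only the three regions that appear in full in this text are included.

```python
import re

# Conserved-region motifs copied from the text (only those given in full).
MOTIFS = {
    "SEQ ID No. 1": "R-P-G-[P/S]-Y-I-N-A-E",
    "SEQ ID No. 4": "Y-T-S-Y-D-Y-G-S",
    "SEQ ID No. 6": "N-E-G-G-L-[Y/F]-A-E-R",
}


def motif_to_regex(motif: str) -> str:
    """Convert e.g. 'R-P-G-[P/S]-Y' into the regex 'RPG[PS]Y'."""
    parts = []
    for position in motif.split("-"):
        position = position.strip()
        if position.startswith("[") and position.endswith("]"):
            # Bracketed position: any one of the listed residues may occur.
            residues = position[1:-1].split("/")
            parts.append("[" + "".join(residues) + "]")
        else:
            parts.append(position)
    return "".join(parts)


def find_motifs(protein: str) -> dict:
    """Return {motif name: 0-based start index} for each motif found."""
    hits = {}
    for name, motif in MOTIFS.items():
        match = re.search(motif_to_regex(motif), protein.upper())
        if match:
            hits[name] = match.start()
    return hits
```

For example, `find_motifs` applied to a translated open reading frame reports which of the listed regions occur in it and where; the test sequence used for checking is made up.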
  • the invention relates to a method of screening for a DNA sequence coding for an enzyme of interest, the method comprising: a) selecting a microorganism of interest and obtaining genomic DNA or cDNA from said microorganism; b) selecting a forward PCR-primer comprising sense DNA corresponding to an amino acid sequence of at least five residues chosen from the group consisting of: R-P-G-[P/S]-Y-I-N-A-E (SEQ ID No. 1), A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P (SEQ ID No. 2), E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G (SEQ ID No. 3), Y-T-
  • The first step in the above aspect is to select a microorganism; however, this is not to be interpreted narrowly. The screening method will function just as well on DNA extracted from environmental samples taken from soil or other ecological niches, and thus also finds application in the screening of viable but non-culturable cells.
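Step b) of the screening method calls for a primer whose sense DNA corresponds to at least five residues of a conserved region. The sketch below shows one common way such a stretch could be back-translated into a degenerate sequence using IUPAC ambiguity codes and the standard genetic code. It is an illustration under stated assumptions, not the primer design used in the patent: the function name `degenerate_primer` is hypothetical, and the per-position union of bases is a simplification that, for six-codon residues such as serine, also covers some codons of other residues.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append(b1 + b2 + b3)

# IUPAC ambiguity codes, keyed by the sorted set of bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def degenerate_primer(motif: str):
    """Back-translate a motif such as 'D-P-W-G-G' into degenerate sense DNA.

    Returns (sequence, degeneracy), where degeneracy is the number of
    distinct oligonucleotides the degenerate sequence represents.
    """
    positions = [p.strip("[] ").split("/") for p in motif.split("-")]
    if len(positions) < 5:
        raise ValueError("the method calls for at least five residues")
    sequence = []
    degeneracy = 1
    for residues in positions:
        # All codons for every residue allowed at this position.
        codons = [c for aa in residues for c in CODONS[aa]]
        for i in range(3):
            bases = "".join(sorted({c[i] for c in codons}))
            sequence.append(IUPAC[bases])
            degeneracy *= len(bases)
    return "".join(sequence), degeneracy
```

Called on the five-residue stretch D-P-W-G-G taken from SEQ ID No. 3, the function returns the sequence `GAYCCNTGGGGNGGN` with a degeneracy of 128. In practice a designer would usually pick stretches rich in low-degeneracy residues (such as W, M, D, E, Y) to keep this number small.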
  • a second aspect of the invention relates to a method of screening for a DNA sequence coding for an enzyme of interest, the method comprising: a) selecting a microorganism of interest and obtaining genomic DNA or cDNA from said microorganism; b) selecting a nucleotide probe comprising a sense or an antisense nucleotide sequence corresponding to an amino acid sequence of at least five residues chosen from the group consisting of: R-P-G-[P/S]-Y-I-N-A-E (SEQ ID No. 1), A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P (SEQ ID No. 2), E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G (SEQ
  • the enzyme is manufactured industrially. Industrial enzyme production can be achieved in a number of ways; one is simply to culture the cells in which the DNA of interest was identified, and to recover the enzyme from the fermentation broth or from the cells.
  • a third aspect of the invention relates to a process for producing an enzyme of interest, the process comprising steps of: a) screening for a DNA sequence coding for the enzyme of interest by the method of any of the aspects of the invention, b) culturing the microorganism of step (a) of the first or second aspects, the genomic DNA or cDNA of which comprises the DNA sequence coding for the enzyme of interest, under suitable conditions to express the enzyme of interest, and c) recovering the enzyme from the culture.
  • Another way to industrially produce the enzyme of interest is to clone the DNA of interest by the usual techniques of the art, and to express the cloned DNA of interest in a homologous or a heterologous microbial host cell.
  • a fourth aspect of the invention relates to a process for producing an enzyme of interest, the process comprising steps of: a) screening for a DNA sequence coding for the enzyme of interest and isolating said DNA sequence by the method of any of the aspects of the invention, b) culturing the microbial host cell of the invention under suitable conditions to express the enzyme of interest and recovering the expressed enzyme from the culture.
  • the present invention has allowed easy isolation of a number of Fam35 lactases from various fungal Orders, lactases that would likely not have been isolated using conventional shotgun-cloning techniques without substantial labor.
  • a fifth aspect of the invention relates to an isolated enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Agaricales, Polyporales, Phanerochaetales, Leotiales, or Dothideales Order, said enzyme comprising at least one of the amino acid sequences shown in SEQ ID Nos. 1-7, and said enzyme being obtainable by the method of the third or fourth aspect of the invention.
  • lactase of the invention isolated from a certain fungal Genus indicates that other members of that Genus are likely to have DNA encoding a related lactase of the invention.
  • a sixth aspect of the invention relates to an isolated enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Meripilus, Mycena, Trametes, Spathularia, Diplodia, Microsphaeropsis, Penicillium,
  • the DNA encoding the lactase enzymes of the invention is also encompassed by the invention.
  • an eighth aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Agaricales, Polyporales, Phanerochaetales, Leotiales, or Dothideales Order, said sequence comprising subsequences that encode at least one of the amino acid sequences shown in SEQ ID Nos. 1-7, and said sequence being obtainable by the method of any of the first or second aspects of the invention.
  • a ninth aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Meripilus, Mycena, Trametes, Spathularia, Diplodia, Microsphaeropsis, Penicillium, Petromyces, or Christaspora Genus, said sequence comprising subsequences that encode at least one of the amino acid sequences shown in SEQ ID Nos. 1-7, and said sequence being obtainable by the method of any of the first or second aspects of the invention.
  • a tenth aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Meripilus giganteus, Trametes ochracea, Penicillium roquefortii, Penicillium carneum, Microsphaeropsis sp., Mycena pura, Spathularia flavida, Diplodia gossypina, Petromyces alliaceus, or Christaspora arxii species, said sequence comprising subsequences that encode at least one of the amino acid sequences shown in SEQ ID Nos. 1-7, and said sequence being obtainable by the method of any of the first or second aspects of the invention.
  • an eleventh aspect of the invention relates to an isolated enzyme with lactase activity (EC 3.2.1.23) which is obtainable by the method of any of the third or fourth aspects and which comprises an amino acid sequence at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to any of the sequences shown in SEQ ID Nos. 9, 20, 22, 24, 26, 28, 30, 32, 34, 36, or 38.
  • a twelfth aspect of the invention relates to an isolated enzyme with lactase activity (EC 3.2.1.23) which is obtainable by the method of any of the third or fourth aspects and which comprises an amino acid sequence at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to a lactase encoded by a DNA sequence comprised in a strain chosen from the group consisting of Microsphaeropsis sp. CBS 102583, Trametes ochracea CBS 102584, Penicillium carneum CBS 102585, and Meripilus giganteus CBS 52195.
  • a thirteenth aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23), said sequence being obtainable by the method of any of the first or second aspects, and said sequence being at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to any of the sequences shown in SEQ ID Nos. 8, 18, 19, 21, 23, 25, 27, 29, 31, 33, 35, or 37.
  • a fourteenth aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23), said sequence being obtainable by the method of any of the first or second aspects, and said sequence being at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to the lactase encoding sequence comprised in a strain chosen from the group consisting of Microsphaeropsis sp. CBS 102583, Trametes ochracea CBS 102584, Penicillium carneum CBS 102585, and Meripilus giganteus CBS 52195.
  • enzyme activity and/or characteristics can be improved by shuffling or recombining two or more DNA sequences encoding related enzymes, in order to produce a final shuffled sequence encoding the improved enzyme.
  • Ways of generating and producing DNA libraries from natural sequences are well known, but besides natural DNA sequences, a number of ways are also known in which to generate very large populations of diverse artificial DNA sequences starting from one or more natural sequences, e.g. shuffling or directed evolution (WO 98/42832; US 5,965,408; WO 98/01581; WO 97/07205; WO 95/22625; US 5,093,257).
  • an aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) said sequence obtainable by shuffling at least two isolated DNA sequences as defined in the previous aspects.
  • Another aspect of the invention relates to a recombinant vector comprising a DNA sequence as defined in any of the previous aspects.
  • Yet another aspect relates to a recombinant host cell comprising a DNA sequence according to any of the previous aspects.
  • a further aspect relates to a transgenic animal comprising and expressing the nucleic acid construct according to any of the preceding aspects.
  • One more aspect relates to a transgenic plant containing and expressing the nucleic acid construct according to any of the preceding aspects.
  • an aspect relates to a method of producing an enzyme with lactase activity (EC 3.2.1.23), which method comprises recovering the enzyme from the transgenic animal according to the sixteenth aspect.
  • An aspect relates to a method of producing an enzyme with lactase activity (EC 3.2.1.23), which method comprises growing a cell of a transgenic plant according to the seventeenth aspect, and recovering the enzyme from the
  • An aspect relates to a composition comprising an enzyme with lactase activity (EC 3.2.1.23) as defined in any of the preceding aspects.
  • a final aspect relates to the use of an enzyme with lactase activity or use of a composition comprising an enzyme with lactase activity as defined in any of the preceding aspects, in the manufacture or processing of foodstuffs or feeds fit for consumption by lactose intolerant humans or animals, or for improvement of the nutritional value of an animal feed.
  • Isolated: When applied to a protein, the term "isolated" indicates that the protein is found in a condition other than its native environment, such as apart from blood and animal tissue. In a preferred form, the isolated protein is substantially free of other proteins, particularly other proteins of animal origin. It is preferred to provide the proteins in a highly purified form, i.e., greater than 95% pure, more preferably greater than 99% pure.
  • the term “isolated” indicates that the molecule is removed from its natural genetic milieu, and is thus free of other extraneous or unwanted coding sequences, and is in a form suitable for use within genetically engineered protein production systems.
  • isolated molecules are those that are separated from their natural environment and include cDNA and genomic clones.
  • isolated DNA molecules of the present invention are free of other genes with which they are ordinarily associated, and may include naturally occurring 5' and 3' untranslated regions such as promoters and terminators. The identification of associated regions will be evident to one of ordinary skill in the art (see for example, Dynan and Tjian, Nature 316: 774-78, 1985).
  • polynucleotide is a single- or double-stranded polymer of deoxyribonucleotide or ribonucleotide bases read from the 5' to the 3' end.
  • Polynucleotides include RNA and DNA, and may be isolated from natural sources, synthesized in vitro, or prepared from a combination of natural and synthetic molecules.
  • nucleic acid molecule refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules") or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; "DNA molecules") in either single-stranded form or a double-stranded helix. Double-stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible.
  • nucleic acid molecule refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary or quaternary forms.
  • this term includes double-stranded DNA found, inter alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes.
  • sequences may be described herein according to the normal convention of giving only the sequence in the 5' to 3' direction along the nontranscribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA).
  • a "recombinant DNA molecule” is a DNA molecule that has undergone a molecular biological manipulation.
  • a nucleic acid molecule "hybridizes" to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single-stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al., supra). The conditions of temperature and ionic strength determine the "stringency" of the hybridization.
  • a DNA "coding sequence" is a double-stranded DNA sequence which is transcribed and translated into a polypeptide in a cell in vitro or in vivo when placed under the control of appropriate regulatory sequences.
  • the boundaries of the coding sequence are determined by a start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxyl) terminus.
  • a coding sequence can include, but is not limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA sequences. If the coding sequence is intended for expression in a eukaryotic cell, a polyadenylation signal and transcription termination sequence will usually be located 3' to the coding sequence.
  • Expression vector: A DNA molecule, linear or circular, that comprises a segment encoding a polypeptide of interest operably linked to additional segments that provide for its transcription. Such additional segments may include promoter and terminator sequences, and optionally one or more origins of replication, one or more selectable markers, an enhancer, a polyadenylation signal, and the like. Expression vectors are generally derived from plasmid or viral DNA, or may contain elements of both.
  • Transcriptional and translational control sequences are DNA regulatory sequences, such as promoters, enhancers, terminators, and the like, that provide for the expression of a coding sequence in a host cell.
  • polyadenylation signals are control sequences.
  • a "secretory signal sequence" is a DNA sequence that encodes a polypeptide (a "secretory peptide") that, as a component of a larger polypeptide, directs the larger polypeptide through a secretory pathway of a cell in which it is synthesized.
  • the larger polypeptide is commonly cleaved to remove the secretory peptide during transit through the secretory pathway.
  • a coding sequence is "under the control" of transcriptional and translational control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then trans-RNA spliced and translated into the protein encoded by the coding sequence.
  • a cell has been "transformed" by exogenous or heterologous DNA when the transfected DNA effects a phenotypic change
  • the transforming DNA should be integrated (covalently linked) into chromosomal DNA making up the genome of the cell.
  • Homologous recombination refers to the insertion of a foreign DNA sequence of a vector in a chromosome.
  • the vector targets a specific chromosomal site for homologous recombination.
  • the vector will contain sufficiently long regions of homology to sequences of the chromosome to allow complementary binding and incorporation of the vector into the chromosome. Longer regions of homology, and greater degrees of sequence similarity, may increase the efficiency of homologous recombination.
  • Nucleic Acid Sequence: The techniques used to isolate or clone a nucleic acid sequence encoding a polypeptide are known in the art and include isolation from genomic DNA, preparation from cDNA, or a combination thereof.
  • the cloning of the nucleic acid sequences of the present invention from such genomic DNA can be effected, e.g., by using the well-known polymerase chain reaction (PCR) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features. See, e.g., Innis et al., 1990, A Guide to Methods and Application, Academic Press, New York.
  • nucleic acid amplification procedures such as ligase chain reaction (LCR), ligated activated transcription (LAT) and nucleic acid sequence-based amplification (NASBA) may be used.
  • the nucleic acid sequence may be cloned from a strain of the [Genus] producing the polypeptide, or another or related organism and thus, for example, may be an allelic or species variant of the polypeptide encoding region of the nucleic acid sequence.
  • isolated nucleic acid sequence or DNA refers to a nucleic acid sequence which is essentially free of other nucleic acid sequences, e.g., at least about 20% pure, preferably at least about 40% pure, more preferably about 60% pure, even more preferably about 80% pure, most preferably about 90% pure, and even most preferably about 95% pure, as determined by agarose gel electrophoresis.
  • an isolated nucleic acid sequence can be obtained by standard cloning procedures used in genetic engineering to relocate the nucleic acid sequence from its natural location to a different site where it will be reproduced.
  • the cloning procedures may involve excision and isolation of a desired nucleic acid fragment comprising the nucleic acid sequence encoding the polypeptide, insertion of the fragment into a vector molecule, and incorporation of the recombinant vector into a host cell where multiple copies or clones of the nucleic acid sequence will be replicated.
  • the nucleic acid sequence may be of genomic, cDNA, RNA, semisynthetic, synthetic origin, or any combinations thereof.
  • the degree of identity between two nucleic acid sequences may be determined by means of computer programs known in the art such as GAP provided in the GCG program package (Needleman and Wunsch, 1970, Journal of Molecular Biology 48:443-453). For purposes of determining the degree of identity between two nucleic acid sequences for the present invention, GAP is used with the following settings: GAP creation penalty of 5.0 and GAP extension penalty of 0.3.
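The identity calculation described above rests on a Needleman-Wunsch global alignment with a gap creation penalty of 5.0 and a gap extension penalty of 0.3. The sketch below is not the GCG GAP program; it is a small affine-gap global aligner written to illustrate the idea. Several points are assumptions made for the example and are not stated in the text: a match scores 1.0 and a mismatch 0.0, a gap of length k costs creation + k × extension, end gaps are penalized like internal gaps, and percent identity is taken as identical columns divided by all alignment columns. The function name `gap_identity` is hypothetical.

```python
NEG_INF = float("-inf")


def gap_identity(a, b, match=1.0, mismatch=0.0, creation=5.0, extension=0.3):
    """Globally align a and b with affine gap penalties.

    Returns (aligned_a, aligned_b, percent_identity).
    States: 0 = residue/residue column, 1 = gap in b, 2 = gap in a.
    """
    n, m = len(a), len(b)
    score = [[[NEG_INF] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    back = [[[0] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    score[0][0][0] = 0.0
    for i in range(1, n + 1):          # leading gap in b
        score[1][i][0] = -(creation + i * extension)
        back[1][i][0] = 1
    for j in range(1, m + 1):          # leading gap in a
        score[2][0][j] = -(creation + j * extension)
        back[2][0][j] = 2

    def best(candidates):
        k = max(range(3), key=candidates.__getitem__)
        return candidates[k], k

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[0][i][j], back[0][i][j] = best(
                [score[k][i - 1][j - 1] + s for k in range(3)])
            # Extending an existing gap costs only the extension penalty;
            # opening a new one costs creation + extension.
            score[1][i][j], back[1][i][j] = best(
                [score[k][i - 1][j] - (extension if k == 1 else creation + extension)
                 for k in range(3)])
            score[2][i][j], back[2][i][j] = best(
                [score[k][i][j - 1] - (extension if k == 2 else creation + extension)
                 for k in range(3)])

    # Trace back from the best-scoring end state.
    state = max(range(3), key=lambda k: score[k][n][m])
    i, j, out_a, out_b = n, m, [], []
    while i > 0 or j > 0:
        previous = back[state][i][j]
        if state == 0:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif state == 1:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
        state = previous
    aligned_a, aligned_b = "".join(reversed(out_a)), "".join(reversed(out_b))
    columns = len(aligned_a)
    identical = sum(x == y for x, y in zip(aligned_a, aligned_b) if x != "-")
    identity = 100.0 * identical / columns if columns else 0.0
    return aligned_a, aligned_b, identity
```

A resulting identity value can then be compared against the 85%, 90%, 95% and 98% thresholds named in the aspects above. The table-based implementation uses memory proportional to the product of the two sequence lengths, so it is suited to short illustrative inputs rather than full-length genes.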
  • Nucleic Acid Construct: As used herein, this term is intended to indicate any nucleic acid molecule of cDNA, genomic DNA, synthetic DNA or RNA origin.
  • the term "construct” is intended to indicate a nucleic acid segment which may be single- or double-stranded, and which may be based on a complete or partial naturally occurring nucleotide sequence encoding a polypeptide of interest.
  • the construct may optionally contain other nucleic acid segments.
  • a nucleic acid construct of the invention encoding an enzyme of the invention may suitably be of genomic or cDNA origin, for instance obtained by preparing a genomic or cDNA library and screening for DNA sequences coding for all or part of the polypeptide by hybridization using synthetic oligonucleotide probes in accordance with standard techniques (cf. Sambrook et al., supra).
  • the nucleic acid construct of the invention encoding the polypeptide may also be prepared synthetically by established standard methods, e.g. the phosphoramidite method described by Beaucage and Caruthers, Tetrahedron Letters 22 (1981), 1859-1869, or the method described by Matthes et al., EMBO Journal 3 (1984), 801-805.
  • In the phosphoramidite method, oligonucleotides are synthesized, e.g. in an automatic DNA synthesizer, purified, annealed, ligated and cloned in suitable vectors.
  • nucleic acid construct may be of mixed synthetic and genomic, mixed synthetic and cDNA or mixed genomic and cDNA origin prepared by ligating fragments of synthetic, genomic or cDNA origin (as appropriate) , the fragments corresponding to various parts of the entire nucleic acid construct, in accordance with standard techniques.
  • the nucleic acid construct may also be prepared by polymerase chain reaction using specific primers, for instance as described in US 4,683,202 or Saiki et al., Science 239 (1988), 487-491.
  • nucleic acid construct may be synonymous with the term expression cassette when the nucleic acid construct contains all the control sequences required for expression of a coding sequence of the present invention.
  • coding sequence as defined herein is a sequence which is transcribed into mRNA and translated into a polypeptide of the present invention when placed under the control of the above-mentioned control sequences. The boundaries of the coding sequence are generally determined by a translation start codon ATG at the 5'-terminus and a translation stop codon at the 3'-terminus.
  • a coding sequence can include, but is not limited to, DNA, cDNA, and recombinant nucleic acid sequences.
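The boundary definition above (an ATG start codon at the 5'-terminus and a translation stop codon at the 3'-terminus) can be expressed as a simple check on a candidate sequence. This is a minimal sketch that assumes the standard genetic code stop codons (TAA, TAG, TGA) and an intron-free sequence such as a cDNA; the function name `is_complete_cds` is hypothetical.

```python
# Stop codons of the standard genetic code (an assumption; some organelles
# and organisms use variant codes).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_complete_cds(dna: str) -> bool:
    """True if dna starts with ATG, ends with a stop codon, has a length
    that is a multiple of three, and contains no in-frame internal stop."""
    seq = dna.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return (
        codons[0] == "ATG"
        and codons[-1] in STOP_CODONS
        and not any(codon in STOP_CODONS for codon in codons[1:-1])
    )
```

A genomic sequence containing introns would fail the internal-stop or length test even when it encodes a valid gene, which is why the check is framed here for cDNA-like input.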
  • control sequences is defined herein to include all components which are necessary or advantageous for expression of the coding sequence of the nucleic acid sequence.
  • Each control sequence may be native or foreign to the nucleic acid sequence encoding the polypeptide.
  • control sequences include, but are not limited to, a leader, a polyadenylation sequence, a propeptide sequence, a promoter, a signal sequence, and a transcription terminator.
  • the control sequences include a promoter, and transcriptional and translational stop signals.
  • the control sequences may be provided with linkers for the purpose of introducing specific restriction sites facilitating ligation of the control sequences with the coding region of the nucleic acid sequence encoding a polypeptide.
  • the control sequence may be an appropriate promoter sequence, a nucleic acid sequence which is recognized by a host cell for expression of the nucleic acid sequence.
  • the promoter sequence contains transcription and translation control sequences which mediate the expression of the polypeptide.
  • the promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of choice and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.
  • the control sequence may also be a suitable transcription terminator sequence, a sequence recognized by a host cell to terminate transcription.
  • the terminator sequence is operably linked to the 3' terminus of the nucleic acid sequence encoding the polypeptide. Any terminator which is functional in the host cell of choice may be used in the present invention.
  • the control sequence may also be a polyadenylation sequence, a sequence which is operably linked to the 3' terminus of the nucleic acid sequence and which, when transcribed, is recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation sequence which is functional in the host cell of choice may be used in the present invention.
  • the control sequence may also be a signal peptide-coding region, which codes for an amino acid sequence linked to the amino terminus of the polypeptide which can direct the expressed polypeptide into the secretory pathway of the host cell.
  • the 5' end of the coding sequence of the nucleic acid sequence may inherently contain a signal peptide-coding region naturally linked in translation reading frame with the segment of the coding region which encodes the secreted polypeptide.
  • the 5' end of the coding sequence may contain a signal peptide-coding region which is foreign to that portion of the coding sequence which encodes the secreted polypeptide.
  • a foreign signal peptide-coding region may be required where the coding sequence does not normally contain a signal peptide-coding region.
  • the foreign signal peptide coding region may simply replace the natural signal peptide coding region in order to obtain enhanced secretion of the [enzyme] relative to the natural signal peptide coding region normally associated with the coding sequence.
  • the signal peptide-coding region may be obtained from a glucoamylase or an amylase gene from an Aspergillus species, a lipase or proteinase gene from a Rhizomucor species, the gene for the alpha-factor from Saccharomyces cerevisiae, an amylase or a protease gene from a Bacillus species, or the calf preprochymosin gene.
  • any signal peptide-coding region capable of directing the expressed polypeptide into the secretory pathway of a host cell of choice may be used in the present invention.
  • the control sequence may also be a propeptide coding region, which codes for an amino acid sequence positioned at the amino terminus of a polypeptide.
  • the resultant polypeptide is known as a proenzyme or propolypeptide (or a zymogen in some cases) .
  • a propolypeptide is generally inactive and can be converted to mature active polypeptide by catalytic or autocatalytic cleavage of the propeptide from the propolypeptide.
  • the propeptide coding region may be obtained from the Bacillus subtilis alkaline protease gene (aprE), the Bacillus subtilis neutral protease gene (nprT), the Saccharomyces cerevisiae alpha-factor gene, or the Myceliophthora thermophilum laccase gene (WO 95/33836).
  • the nucleic acid constructs of the present invention may also comprise one or more nucleic acid sequences which encode one or more factors that are advantageous in the expression of the polypeptide, e . g. , an activator (e . g. , a trans-acting factor), a chaperone, and a processing protease. Any factor that is functional in the host cell of choice may be used in the present invention.
  • the nucleic acids encoding one or more of these factors are not necessarily in tandem with the nucleic acid sequence encoding the polypeptide.
  • An activator is a protein which activates transcription of a nucleic acid sequence encoding a polypeptide (Kudla et al . , 1990, EMBO Journal 9:1355-1364; Jarai and Buxton, 1994,
  • the nucleic acid sequence encoding an activator may be obtained from the genes encoding Bacillus stearothermophilus NprA (nprA), Saccharomyces cerevisiae heme activator protein 1 (hap1), Saccharomyces cerevisiae galactose metabolizing protein 4, and Aspergillus nidulans ammonia regulation protein (areA).
  • a chaperone is a protein which assists another polypeptide in folding properly (Hartl et al., 1994, TIBS 19:20-25; Bergeron et al., 1994, TIBS 19:124-128; Demolder et al., 1994, Journal of Biotechnology 32:179-189; Craig, 1993, Science 260:1902-1903; Gething and Sambrook, 1992, Nature 355:33-45; Puig and Gilbert, 1994, Journal of Biological Chemistry 269:7764-7771; Wang and Tsou, 1993, The FASEB Journal 7:1515-1517; Robinson et al., 1994, Bio/Technology 12:381-384).
  • the nucleic acid sequence encoding a chaperone may be obtained from the genes encoding Bacillus subtilis GroE proteins, Aspergillus oryzae protein disulphide isomerase, Saccharomyces cerevisiae calnexin, Saccharomyces cerevisiae BiP/GRP78, and Saccharomyces cerevisiae Hsp70. For further examples, see Gething and Sambrook, 1992, supra, and Hartl et al . , 1994, supra .
  • a processing protease is a protease that cleaves a propeptide to generate a mature biochemically active polypeptide (Enderlin and Ogrydziak, 1994, Yeast 10:67-79; Fuller et al . , 1989, Proceedings of the National Academy of Sciences USA 86:1434-1438; Julius et al . , 1984, Cell 37:1075- 1089; Julius et al . , 1983, Cell 32:839-852).
  • the nucleic acid sequence encoding a processing protease may be obtained from the genes encoding Aspergillus niger Kex2, Saccharomyces cerevisiae dipeptidylaminopeptidase, Saccharomyces cerevisiae Kex2, and Yarrowia lipolytica dibasic processing endoprotease (xpr6).
  • It may also be desirable to add regulatory sequences which allow the regulation of the expression of the polypeptide relative to the growth of the host cell. Examples of regulatory systems are those which cause the expression of the gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound.
  • Regulatory systems in prokaryotic systems include the lac, tac, and trp operator systems.
  • In yeast, the ADH2 system or GAL1 system may be used.
  • In filamentous fungi, the TAKA alpha-amylase promoter, the Aspergillus niger glucoamylase promoter, and the Aspergillus oryzae glucoamylase promoter may be used as regulatory sequences.
  • Other examples of regulatory sequences are those allowing for gene amplification.
  • these include the dihydrofolate reductase gene which is amplified in the presence of methotrexate, and the metallothionein genes which are amplified with heavy metals. In these cases, the nucleic acid sequence encoding the polypeptide would be placed in tandem with the regulatory sequence .
  • Promoters for directing the transcription of the nucleic acid constructs of the present invention, especially in a bacterial host cell, are exemplified in the promoters obtained from the E. coli lac operon, the Streptomyces coelicolor agarase gene (dagA), the Bacillus subtilis levansucrase gene (sacB), the Bacillus subtilis alkaline protease gene, the Bacillus licheniformis alpha-amylase gene (amyL), the Bacillus stearothermophilus maltogenic amylase gene (amyM), the Bacillus amyloliquefaciens alpha-amylase gene (amyQ), the Bacillus amyloliquefaciens BAN amylase gene, the Bacillus licheniformis penicillinase gene (penP), the Bacillus subtilis xylA and xylB genes, and the prokaryotic beta-lac
  • promoters for directing the transcription of the nucleic acid constructs of the present invention in a filamentous fungal host cell are promoters obtained from the genes encoding Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans acetamidase, Fusarium oxysporum trypsin-like protease (as described in U.S. Patent No. 4,288,627, which is incorporated herein by reference).
  • Particularly preferred promoters for use in filamentous fungal host cells are the TAKA amylase, NA2-tpi (a hybrid of the promoters from the genes encoding Aspergillus niger neutral alpha-amylase and Aspergillus oryzae triose phosphate isomerase), and glaA promoters.
  • Further suitable promoters for use in filamentous fungus host cells are the ADH3 promoter (McKnight et al . , The EMBO J . 4 (1985), 2093 - 2099) or the tpiA promoter.
  • promoters for use in yeast host cells include promoters from yeast glycolytic genes (Hitzeman et al., J. Biol. Chem. 255 (1980), 12073 - 12080; Alber and Kawasaki, J. Mol. Appl. Gen. 1 (1982), 419 - 434) or alcohol dehydrogenase genes (Young et al . , in Genetic Engineering of Microorganisms for Chemicals (Hollaender et al , eds.), Plenum Press, New York, 1982), or the TPI1 (US 4,599,311) or ADH2-4c (Russell et al . , Nature 304 (1983), 652 - 654) promoters.
  • Further useful promoters for yeast host cells are obtained from the Saccharomyces cerevisiae galactokinase gene (GAL1), the Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase genes (ADH2/GAP), and the Saccharomyces cerevisiae 3-phosphoglycerate kinase gene. Other useful promoters for yeast host cells are described by Romanos et al., 1992, Yeast 8:423-488.
  • useful promoters include viral promoters such as those from Simian Virus 40 (SV40) , Rous sarcoma virus (RSV) , adenovirus, and bovine papilloma virus (BPV) .
  • Suitable promoters for directing the transcription of the DNA encoding the polypeptide of the invention in mammalian cells are the SV40 promoter (Subramani et al., Mol. Cell Biol. 1 (1981), 854 -864), the MT-1 (metallothionein gene) promoter (Palmiter et al . , Science 222 (1983) , 809 - 814) or the adenovirus 2 major late promoter.
  • a suitable promoter for use in insect cells is the polyhedrin promoter (US 4,745,051; Vasuvedan et al., FEBS Lett. 311, (1992) 7 - 11), the P10 promoter (J.M. Vlak et al., J. Gen. Virology 69, 1988, pp. 765-776), the Autographa californica polyhedrosis virus basic protein promoter (EP 397 485), the baculovirus immediate early gene 1 promoter (US 5,155,037; US 5,162,222), or the baculovirus 39K delayed-early gene promoter (US 5,155,037; US 5,162,222).
  • Preferred terminators for filamentous fungal host cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Aspergillus niger alpha-glucosidase, and Fusarium oxysporum trypsin-like protease.
  • Other suitable terminators are the TPI1 (Alber and Kawasaki, op. cit.) and ADH3 (McKnight et al., op. cit.) terminators.
  • Preferred terminators for yeast host cells are obtained from the genes encoding Saccharomyces cerevisiae enolase, Saccharomyces cerevisiae cytochrome C (CYC1) , or Saccharomyces cerevisiae glyceraldehyde-3 -phosphate dehydrogenase.
  • Other useful terminators for yeast host cells are described by Romanos et al . , 1992, supra .
  • Preferred polyadenylation sequences for filamentous fungal host cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, and Aspergillus niger alpha-glucosidase.
  • Polyadenylation sequences for mammalian host cells are well known in the art, such as those from SV40 or the adenovirus 5 E1b region.
  • An effective signal peptide-coding region for bacterial host cells is the signal peptide-coding region obtained from the maltogenic amylase gene from Bacillus NCIB 11837, the Bacillus stearothermophilus alpha-amylase gene, the Bacillus licheniformis subtilisin gene, the Bacillus licheniformis beta-lactamase gene, the Bacillus stearothermophilus neutral proteases genes (nprT, nprS, nprM), and the Bacillus subtilis PrsA gene. Further signal peptides are described by Simonen and Palva, 1993, Microbiological Reviews 57:109-137.
  • An effective signal peptide coding region for filamentous fungal host cells is the signal peptide coding region obtained from the Aspergillus oryzae TAKA amylase gene, the Aspergillus niger neutral amylase gene, the Rhizomucor miehei aspartic proteinase gene, the Humicola lanuginosa cellulase or lipase gene, the Rhizomucor miehei lipase or protease gene, or a gene encoding an Aspergillus sp. amylase or glucoamylase.
  • the signal peptide is preferably derived from a gene encoding A. oryzae TAKA amylase, A. niger neutral alpha-amylase, A. niger acid-stable amylase, or A. niger glucoamylase.
  • Useful signal peptides for yeast host cells are obtained from the genes for Saccharomyces cerevisiae alpha-factor and Saccharomyces cerevisiae invertase. Other useful signal peptide coding regions are described by Romanos et al., 1992, supra.
  • the secretory signal sequence may encode any signal peptide which ensures efficient direction of the expressed polypeptide into the secretory pathway of the cell.
  • the signal peptide may be a naturally occurring signal peptide, or a functional part thereof, or it may be a synthetic peptide. Suitable signal peptides have been found to be the alpha-factor signal peptide (cf.
  • For efficient secretion in yeast, a sequence encoding a leader peptide may also be inserted downstream of the signal sequence and upstream of the DNA sequence encoding the polypeptide.
  • the function of the leader peptide is to allow the expressed polypeptide to be directed from the endoplasmic reticulum to the Golgi apparatus and further to a secretory
  • the leader peptide may be the yeast alpha-factor leader (the use of which is described in e.g. US 4,546,082, EP 16 201, EP
  • the leader peptide may be a synthetic leader peptide, which is to say a leader peptide not found in nature. Synthetic leader peptides may, for instance, be constructed as described in WO 89/02463 or WO 92/11378.
  • the signal peptide may conveniently be derived from an insect gene (cf. WO 90/05783), such as the lepidopteran Manduca sexta adipokinetic hormone precursor signal peptide (cf. US 5,023,328).
  • the present invention also relates to recombinant expression vectors comprising a nucleic acid sequence of the present invention, a promoter, and transcriptional and translational stop signals.
  • the various nucleic acid and control sequences described above may be joined together to produce a recombinant expression vector which may include one or more convenient restriction sites to allow for insertion or substitution of the nucleic acid sequence encoding the polypeptide at such sites.
  • the nucleic acid sequence of the present invention may be expressed by inserting the nucleic acid sequence or a nucleic acid construct comprising the sequence into an appropriate vector for expression.
  • the coding sequence is located in the vector so that the coding sequence is operably linked with the appropriate control sequences for expression, and possibly secretion.
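  • As an illustration only, the arrangement of control sequences around the coding sequence can be sketched as a small data structure. The class name, field names, and example element labels below are hypothetical and are not part of the disclosure. Placing the signal peptide upstream of the propeptide, and the terminator ahead of the polyadenylation sequence, are assumptions made for the sketch; the description above states only that the signal peptide and propeptide lie at the amino terminus and that the terminator and polyadenylation sequence are linked to the 3' terminus.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ExpressionCassette:
    """Hypothetical container for the elements of one expression cassette."""
    promoter: str
    coding_sequence: str
    terminator: str
    signal_peptide: Optional[str] = None   # optional: needed only for secretion
    propeptide: Optional[str] = None       # optional: yields a propolypeptide
    polyadenylation: Optional[str] = None  # optional: eukaryotic hosts

    def ordered_elements(self) -> List[Tuple[str, str]]:
        """Return (role, name) pairs in assumed 5'->3' order, skipping absent elements."""
        slots = [
            ("promoter", self.promoter),
            ("signal_peptide", self.signal_peptide),    # assumed upstream of propeptide
            ("propeptide", self.propeptide),
            ("coding_sequence", self.coding_sequence),
            ("terminator", self.terminator),
            ("polyadenylation", self.polyadenylation),  # assumed after terminator
        ]
        return [(role, name) for role, name in slots if name is not None]


# Example with illustrative labels drawn from the elements named in the text.
cassette = ExpressionCassette(
    promoter="Aspergillus oryzae TAKA amylase promoter",
    signal_peptide="Aspergillus oryzae TAKA amylase signal peptide",
    coding_sequence="beta-galactosidase coding sequence",
    terminator="Aspergillus niger glucoamylase terminator",
)
for role, name in cassette.ordered_elements():
    print(f"{role}: {name}")
```

  • The optional fields mirror the text: a signal peptide is needed only where secretion is wanted, and a propeptide or polyadenylation sequence only where the host and polypeptide call for one.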
  • the recombinant expression vector may be any vector (e.g., a plasmid or virus) which can be conveniently subjected to recombinant DNA procedures and can bring about the expression of the nucleic acid sequence.
  • the choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced.
  • the vectors may be linear or closed circular plasmids.
  • the vector may be an autonomously replicating vector, i . e . , a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e . g. , a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome.
  • the vector may contain any means for assuring self-replication.
  • the vector may be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated.
  • the vector system may be a single vector or plasmid or two or more vectors or plasmids which together contain the total DNA to be introduced into the genome of the host cell, or a transposon.
  • the vectors of the present invention preferably contain one or more selectable markers which permit easy selection of transformed cells.
  • a selectable marker is a gene the product of which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like.
  • Examples of bacterial selectable markers are the dal genes from Bacillus subtilis or Bacillus licheniformis, or markers which confer antibiotic resistance such as ampicillin, kanamycin, chloramphenicol, tetracycline, neomycin, hygromycin or methotrexate resistance.
  • a frequently used mammalian marker is the dihydrofolate reductase gene (DHFR) .
  • Suitable markers for yeast host cells are ADE2 , HIS3, LEU2 , LYS2 , MET3 , TRP1, and URA3.
  • a selectable marker for use in a filamentous fungal host cell may be selected from the group including, but not limited to, amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hygB (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5'-phosphate decarboxylase), and glufosinate resistance markers, as well as equivalents from other species.
  • Preferred for use in an Aspergillus cell are the amdS and pyrG markers of Aspergillus nidulans or Aspergillus oryzae and the bar marker of Streptomyces hygroscopicus.
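  • As an illustration only, the selectable markers listed above for each host type can be collected into a lookup table. The dictionary name, the host-type keys, and the function name are hypothetical; the marker names are those given in the text, and the lists are not exhaustive.

```python
from typing import Dict, List

# Marker names taken from the description; the grouping keys are illustrative.
SELECTABLE_MARKERS: Dict[str, List[str]] = {
    "bacterial": ["dal", "antibiotic resistance"],
    "mammalian": ["DHFR"],
    "yeast": ["ADE2", "HIS3", "LEU2", "LYS2", "MET3", "TRP1", "URA3"],
    "filamentous fungal": ["amdS", "argB", "bar", "hygB", "niaD", "pyrG"],
}


def markers_for(host_type: str) -> List[str]:
    """Return a copy of the markers listed for a host type (case-insensitive)."""
    key = host_type.strip().lower()
    if key not in SELECTABLE_MARKERS:
        raise KeyError(f"no markers listed for host type: {host_type!r}")
    return list(SELECTABLE_MARKERS[key])


print(markers_for("Yeast"))
```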
  • selection may be accomplished by co-transformation, e . g. , as described in WO 91/17243, where the selectable marker is on a separate vector.
  • the vectors of the present invention preferably contain one or more elements that permit stable integration of the vector into the host cell genome or autonomous replication of the vector in the cell independent of the genome of the cell.
  • the vectors of the present invention may be integrated into the host cell genome when introduced into a host cell .
  • the vector may rely on the nucleic acid sequence encoding the polypeptide or any other element of the vector for stable integration of the vector into the genome by homologous or nonhomologous recombination.
  • the vector may contain additional nucleic acid sequences for directing integration by homologous recombination into the genome of the host cell.
  • the additional nucleic acid sequences enable the vector to be integrated into the host cell genome at a precise location(s) in the chromosome(s).
  • the integrational elements should preferably contain a sufficient number of nucleic acids, such as 100 to 1,500 base pairs, preferably 400 to 1,500 base pairs, and most preferably 800 to 1,500 base pairs, which are highly homologous with the corresponding target sequence to enhance the probability of homologous recombination.
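  • As an illustration only, the stated length ranges for integrational elements can be expressed as a simple check. The function name and the tier labels are hypothetical; the thresholds (100, 400, 800 and 1,500 base pairs) are those given above. The check covers length only and does not assess the degree of homology with the target sequence.

```python
def integration_length_tier(length_bp: int) -> str:
    """Classify an integrational element length against the stated ranges."""
    if not 100 <= length_bp <= 1500:
        return "outside stated range"
    if length_bp >= 800:
        return "most preferred"   # 800 to 1,500 base pairs
    if length_bp >= 400:
        return "preferred"        # 400 to 1,500 base pairs
    return "sufficient"           # 100 to 1,500 base pairs


for n in (50, 250, 600, 1200):
    print(n, integration_length_tier(n))
```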
  • the integrational elements may be any sequence that is homologous with the target sequence in the genome of the host cell.
  • the integrational elements may be non-encoding or encoding nucleic acid sequences.
  • the vector may be integrated into the genome of the host cell by non-homologous recombination.
  • These nucleic acid sequences may be any sequence that is homologous with a target sequence in the genome of the host cell, and, furthermore, may be non-encoding or encoding sequences .
  • the vector may further comprise an origin of replication enabling the vector to replicate autonomously in the host cell in question.
  • Examples of origins of replication are the origins of replication of plasmids pBR322, pUC19, pACYC177, pACYC184, pUB110, pE194, pTA1060, and pAMβ1.
  • Examples of origins of replication for use in a yeast host cell are the 2 micron origin of replication, the combination of CEN6 and ARS4, and the combination of CEN3 and ARS1.
  • the origin of replication may be one having a mutation which makes its functioning temperature-sensitive in the host cell (see, e . g. , Ehrlich, 1978, Proceedings of the National Academy of Sciences USA 75:1433) .
  • More than one copy of a nucleic acid sequence encoding a polypeptide of the present invention may be inserted into the host cell to amplify expression of the nucleic acid sequence.
  • Stable amplification of the nucleic acid sequence can be obtained by integrating at least one additional copy of the sequence into the host cell genome using methods well known in the art and selecting for transformants .
  • the procedures used to ligate the elements described above to construct the recombinant expression vectors of the present invention are well known to one skilled in the art (see, e . g. , Sambrook et al . , 1989, supra) .
  • the present invention also relates to recombinant microbial host cells, comprising a nucleic acid sequence of the invention, which are advantageously used in the recombinant production of the polypeptides .
  • The term "host cell" encompasses any progeny of a parent cell which is not identical to the parent cell due to mutations that occur during replication.
  • the cell is preferably transformed with a vector comprising a nucleic acid sequence of the invention followed by integration of the vector into the host chromosome.
  • Transformation means introducing a vector comprising a nucleic acid sequence of the present invention into a host cell so that the vector is maintained as a chromosomal integrant or as a self-replicating extra-chromosomal vector. Integration is generally considered to be an advantage as the nucleic acid sequence is more likely to be stably maintained in the cell. Integration of the vector into the host chromosome may occur by homologous or non-homologous recombination as described above.
  • the transformation of a bacterial host cell may, for instance, be effected by protoplast transformation (see, e.g., Chang and Cohen, 1979, Molecular General Genetics 168:111-115), by using competent cells (see, e.g., Young and Spizizen, 1961, Journal of Bacteriology 81:823-829, or Dubnau and Davidoff-Abelson, 1971, Journal of Molecular Biology 56:209-221), by electroporation (see, e.g., Shigekawa and Dower, 1988, Biotechniques 6:742-751), or by conjugation (see, e.g., Koehler and Thorne, 1987, Journal of Bacteriology 169:5771-5278).
  • the host cell may be a eukaryote, such as a mammalian cell, an insect cell, a plant cell or a fungal cell.
  • Useful mammalian cells include Chinese hamster ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) cells, COS cells, or any number of other immortalized cell lines available, e.g., from the American Type Culture Collection.
  • suitable mammalian cell lines are the COS (ATCC CRL 1650 and 1651), BHK (ATCC CRL 1632, 10314 and 1573, ATCC CCL 10), CHL (ATCC CCL39) or CHO (ATCC CCL 61) cell lines.
  • Methods of transfecting mammalian cells and expressing DNA sequences introduced in the cells are described in e.g. Kaufman and Sharp, J. Mol. Biol. 159 (1982), 601 - 621; Southern and Berg, J . Mol . Appl . Genet . 1 (1982), 327 - 341; Loyter et al . , Proc. Natl. Acad. Sci.
  • the host cell is a fungal cell.
  • "Fungi" as used herein includes the phyla Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota (as defined by Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK) as well as the Oomycota (as cited in Hawksworth et al., 1995, supra, page 171) and all mitosporic fungi (Hawksworth et al., 1995, supra).
  • Representative groups of Ascomycota include, e . g.
  • Basidiomycota examples include mushrooms, rusts, and smuts.
  • Representative groups of Chytridiomycota include, e.g., Allomyces, Blastocladiella, Coelomomyces , and aquatic fungi.
  • Representative groups of Oomycota include, e . g. , Saprolegniomycetous aquatic fungi
  • Examples of mitosporic fungi include Aspergillus, Penicillium, Candida, and Alternaria.
  • Representative groups of Zygomycota include, e.g., Rhizopus and Mucor.
  • the fungal host cell is a yeast-like cell.
  • "Yeast" as used herein includes ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, and yeast belonging to the Fungi Imperfecti (Blastomycetes).
  • the ascosporogenous yeasts are divided into the families Spermophthoraceae and Saccharomycetaceae. The latter is comprised of four subfamilies, Schizosaccharomycoideae (e.g., genus Schizosaccharomyces), Nadsonioideae, Lipomycoideae, and Saccharomycoideae (e.g.
  • the basidiosporogenous yeasts include the genera Leucosporidium, Rhodosporidium, Sporidiobolus, Filobasidium, and Filobasidiella.
  • Yeast belonging to the Fungi Imperfecti are divided into two families, Sporobolomycetaceae (e.g., genera Sporobolomyces and Bullera) and Cryptococcaceae (e.g., genus Candida).
  • Since the classification of yeast may change in the future, for the purposes of this invention, yeast shall be defined as described in Biology and Activities of Yeast (Skinner, F.A., Passmore, S.M., and Davenport, R.R., eds, Soc. App. Bacteriol. Symposium Series No. 9, 1980).
  • the biology of yeast and manipulation of yeast genetics are well known in the art (see, e.g., Biochemistry and Genetics of Yeast, Bacil, M.
  • the yeast host cell may be selected from a cell of a species of Candida, Kluyveromyces, Saccharomyces, Schizosaccharomyces, Pichia, Hansenula, or Yarrowia.
  • the yeast host cell is a Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii , Saccharomyces kluyveri , Saccharomyces norbensis or Saccharomyces oviformis cell.
  • Other yeast host cells are Kluyveromyces lactis, Kluyveromyces fragilis, Hansenula polymorpha, Pichia pastoris, Yarrowia lipolytica, Schizosaccharomyces pombe, Ustilago maydis, Candida maltosa, Pichia guillermondii and Pichia methanolica cells (cf. Gleeson et al., J. Gen. Microbiol. 132, 1986, pp. 3459-3465; US 4,882,279 and US 4,879,231).
  • the fungal host cell is a filamentous fungal cell.
  • “Filamentous fungi” include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al . , 1995, supra) .
  • the filamentous fungi are characterized by a vegetative mycelium composed of chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides.
  • Vegetative growth is by hyphal elongation and carbon catabolism is obligately aerobic.
  • vegetative growth by yeasts such as Saccharomyces cerevisiae is by budding of a unicellular thallus and carbon catabolism may be fermentative.
  • the filamentous fungal host cell is a cell of a species of, but not limited to, Acremonium, Aspergillus, Fusarium, Humicola, Mucor, Myceliophthora, Neurospora, Penicillium, Thielavia, Tolypocladium, and Trichoderma or a teleomorph or synonym thereof.
  • the filamentous fungal host cell is an Aspergillus cell.
  • the filamentous fungal host cell is an Acremonium cell.
  • the filamentous fungal host cell is a Fusarium cell.
  • the filamentous fungal host cell is a Humicola cell.
  • the filamentous fungal host cell is a Mucor cell. In another even more preferred embodiment, the filamentous fungal host cell is a Myceliophthora cell. In another even more preferred embodiment, the filamentous fungal host cell is a Neurospora cell. In another even more preferred embodiment, the filamentous fungal host cell is a Penicillium cell. In another even more preferred embodiment, the filamentous fungal host cell is a Thielavia cell. In another even more preferred embodiment, the filamentous fungal host cell is a Tolypocladium cell. In another even more preferred embodiment, the filamentous fungal host cell is a Trichoderma cell.
  • the filamentous fungal host cell is an Aspergillus awamori, Aspergillus foetidus, Aspergillus japonicus, Aspergillus niger, Aspergillus nidulans or Aspergillus oryzae cell.
  • the filamentous fungal host cell is a Fusarium cell of the section Discolor (also known as the section Fusarium) .
  • the filamentous fungal parent cell may be a Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sulphureum, or Fusarium trichothecioides cell.
  • the filamentous fungal parent cell is a Fusarium strain of the section Elegans, e . g.
  • the filamentous fungal host cell is a Humicola insolens or Humicola lanuginosa cell. In another most preferred embodiment, the filamentous fungal host cell is a Mucor miehei cell. In another most preferred embodiment, the filamentous fungal host cell is a Myceliophthora thermophilum cell. In another most preferred embodiment, the filamentous fungal host cell is a Neurospora crassa cell. In another most preferred embodiment, the filamentous fungal host cell is a Penicillium purpurogenum cell. In another most preferred embodiment, the filamentous fungal host cell is a Thielavia terrestris cell or an Acremonium chrysogenum cell.
  • the Trichoderma cell is a Trichoderma harzianum, Trichoderma koningii , Trichoderma longibrachiatum, Trichoderma reesei or Trichoderma viride cell.
  • The use of Aspergillus spp. for the expression of proteins is described in, e.g., EP 272 277, EP 230 023.
  • Fungal cells may be transformed by a process involving protoplast formation, transformation of the protoplasts, and regeneration of the cell wall in a manner known per se.
  • Suitable procedures for transformation of Aspergillus host cells are described in EP 238 023 and Yelton et al . , 1984, Proceedings of the National Academy of Sciences USA 81:1470- 1474.
  • a suitable method of transforming Fusarium species is described by Malardier et al . , 1989, Gene 78:147-156 or in copending US Serial No. 08/269,449.
  • Examples of other fungal cells are cells of filamentous fungi, e.g. Aspergillus spp., Neurospora spp., Fusarium spp. or Trichoderma spp., in particular strains of A. oryzae, A. nidulans or A. niger.
  • the transformation of F. oxysporum may, for instance, be carried out as described by Malardier et al . , 1989, Gene 78: 147-156.
  • Yeast may be transformed using the procedures described by Becker and Guarente, In Abelson, J.N. and Simon, M.I., editors, Guide to Yeast Genetics and Molecular Biology, Methods in Enzymology, Volume 194, pp 182-187, Academic Press, Inc., New York; Ito et al . , 1983, Journal of Bacteriology 153:163; and Hinnen et al . , 1978, Proceedings of the National Academy of Sciences USA 75:1920. Mammalian cells may be transformed by direct uptake using the calcium phosphate precipitation method of Graham and Van der Eb (1978, Virology 52:546) .
  • Transformation of insect cells and production of heterologous polypeptides therein may be performed as described in US 4,745,051; US 4,775,624; US 4,879,236; US 5,155,037; US 5,162,222; and EP 397,485, all of which are incorporated herein by reference.
  • the insect cell line used as the host may suitably be a Lepidoptera cell line, such as Spodoptera frugiperda cells or Trichoplusia ni cells (cf. US 5,077,214).
  • Culture conditions may suitably be as described in, for instance, WO 89/01029 or WO 89/01028, or any of the aforementioned references.
  • It is also within the scope of the present invention to employ transgenic animal technology to produce the present polypeptide.
  • a transgenic animal is one in whose genome a heterologous DNA sequence has been introduced.
  • the polypeptide of the invention may be expressed in the mammary glands of a non-human female mammal, in particular one which is known to produce large quantities of milk. Examples of preferred mammals are livestock animals such as goats, sheep and cattle, although smaller mammals such as mice, rabbits or rats may also be employed.
  • the DNA sequence encoding the present polypeptide may be introduced into the animal by any one of the methods previously described for the purpose. For instance, to obtain expression in a mammary gland, a transcription promoter from a milk protein gene is used.
  • Milk protein genes include the genes encoding casein (cf. US 5,304,489), beta-lactoglobulin, alpha- lactalbumin and whey acidic protein.
  • the currently preferred promoter is the beta-lactoglobulin promoter (cf. Whitelaw et al . , Biochem J. 286, 1992, pp. 31-39).
  • it may also be preferred to include at least some introns from, e.g., the beta-lactoglobulin gene.
  • One such region is a DNA segment which provides for intron splicing and RNA polyadenylation from the 3' non-coding region of the ovine beta-lactoglobulin gene. When substituted for the native 3' non-coding sequences of a gene, this segment may enhance and stabilise expression levels of the polypeptide of interest. It may also be possible to replace the region surrounding the initiation codon of the polypeptide of interest with corresponding sequences of a milk protein gene. Such replacement provides a putative tissue-specific initiation environment to enhance expression.
  • a nucleotide sequence encoding the polypeptide is operably linked to additional DNA sequences required for its expression to produce expression units.
  • additional sequences include a promoter as indicated above, as well as sequences providing for termination of transcription and polyadenylation of mRNA.
  • the expression unit further includes a DNA sequence encoding a secretory signal sequence operably linked to the sequence encoding the polypeptide.
  • the secretory signal sequence may be one native to the polypeptide or may be that of another protein such as a milk protein (cf. von Heijne et al., Nucl. Acids Res. 14, 1986, pp. 4683-4690; and US 4,873,316).
  • Construction of the expression unit for use in transgenic animals may conveniently be done by inserting a DNA sequence encoding the present polypeptide into a vector containing the additional DNA sequences, although the expression unit may be constructed by essentially any sequence of ligations. It is particularly convenient to provide a vector containing a DNA sequence encoding a milk protein and to replace the coding region for the milk protein with a DNA sequence coding for the present polypeptide, thereby creating a fusion which includes expression control sequences of the milk protein gene.
  • the expression unit is then introduced into fertilized ova or early-stage embryos of the selected host species.
  • Introduction of heterologous DNA may be carried out in a number of ways, including microinjection (cf. US 4,873,191), retroviral infection (cf. Jaenisch, Science 240, 1988, pp. 1468-1474) or site-directed integration using embryonic stem cells (reviewed by Bradley et al., Bio/Technology 10, 1992, pp. 534-539).
  • the ova are then implanted into the oviducts or uteri of pseudopregnant females and allowed to develop to term.
  • Offspring carrying the introduced DNA in their germ line can pass the DNA on to their progeny, allowing the development of transgenic herds.
  • Transgenic plants: Production in transgenic plants may also be employed. It has previously been described to introduce DNA sequences into plants, which sequences code for protein products imparting to the transformed plants certain desirable properties such as increased resistance against pests, pathogens, herbicides or stress conditions (cf. for instance EP 90 033, EP 131 620, EP 205 518, EP 270 355, WO 89/04371 or WO 90/02804), or an improved nutrient value of the plant proteins (cf. for instance EP 90 033, EP 205 518 or WO 89/04371).
  • WO 89/12386 discloses the transformation of plant cells with a gene coding for levansucrase or dextransucrase, regeneration of the plant (especially a tomato plant) from the cell resulting in fruit products with altered viscosity characteristics.
  • the DNA sequence encoding the present polypeptide is under the control of a regulatory sequence which directs the expression of the polypeptide from the DNA sequence in plant cells and intact plants.
  • the regulatory sequence may be either endogenous or heterologous to the host plant cell.
  • the regulatory sequence may comprise a promoter capable of directing the transcription of the DNA sequence encoding the polypeptide in plants.
  • promoters which may be used according to the invention are the 35S RNA promoter from cauliflower mosaic virus (CaMV), the class I patatin gene B33 promoter, the ST-LS1 gene promoter, promoters conferring seed-specific expression, e.g. the phaseolin promoter, or promoters which are activated on wounding, such as the promoter of the proteinase inhibitor II gene or the wun1 or wun2 genes.
  • the promoter may be operably connected to an enhancer sequence, the purpose of which is to ensure increased transcription of the DNA sequence encoding the polypeptide.
  • enhancer sequences are enhancers from the 5'-upstream region of the 35S RNA of CaMV, the 5'-upstream region of the ST-LS1 gene, the 5'-upstream region of the Cab gene from wheat, the 5'-upstream region of the 1'- and 2'-genes of the TR-DNA of the Ti plasmid pTiACH5, the 5'-upstream region of the octopine synthase gene, the 5'-upstream region of the leghemoglobin gene, etc.
  • the regulatory sequence may also comprise a terminator capable of terminating the transcription of the DNA sequence encoding the polypeptide in plants.
  • suitable terminators are the terminator of the octopine synthase gene of the T-DNA of the Ti plasmid pTiACH5 of Agrobacterium tumefaciens, of the gene 7 of the T-DNA of the Ti plasmid pTiACH5, of the nopaline synthase gene, of the 35S RNA-coding gene from CaMV or from various plant genes, e.g. the ST-LS1 gene, the Cab gene from wheat, class I and class II patatin genes, etc.
  • the DNA sequence encoding the polypeptide may also be operably connected to a DNA sequence encoding a leader peptide capable of directing the transport of the expressed polypeptide to a specific cellular compartment (e.g. vacuoles) or to extracellular space.
  • suitable leader peptides are the leader peptide of proteinase inhibitor II from potato, the leader peptide and an additional fragment of about 100 amino acids of patatin, or the transit peptide of various nucleus-encoded proteins directed into chloroplasts (e.g. from the ST-LS1 gene, SS-Rubisco genes, etc.) or into mitochondria (e.g. from the ADP/ATP translocator).
  • DNA sequence encoding the polypeptide may be modified in the 5' non-translated region resulting in enhanced translation of the sequence. Such modifications may, for instance, result in removal of hairpin loops in RNA of the 5' non-translated region.
  • Translation enhancement may be provided by suitably modifying the omega sequence of tobacco mosaic virus or the leaders of other plant viruses (e.g. BMV, MSV) or of plant genes expressed at high levels (e.g. SS-Rubisco, class I patatin or proteinase inhibitor II genes from potato).
  • the DNA sequence encoding the polypeptide may furthermore be connected to a second DNA sequence encoding another polypeptide or a fragment thereof in such a way that expression of said DNA sequences results in the production of a fusion protein.
  • the second DNA sequence may, for instance, encode patatin or a fragment thereof (such as a fragment of about 100 amino acids) .
  • the plant in which the DNA sequence coding for the polypeptide is introduced may suitably be a dicotyledonous plant, examples of which are tobacco, potato, tomato, or leguminous (e.g. bean, pea, soy, alfalfa) plants. It is, however, contemplated that monocotyledonous plants, e.g. cereals, may equally well be transformed with the DNA sequence coding for the enzyme.
  • Procedures for the genetic manipulation of monocotyledonous and dicotyledonous plants are well known.
  • numerous cloning vectors are available which generally contain a replication system for E. coli and a selectable/screenable marker system permitting the recognition of transformed cells.
  • These vectors include e.g. pBR322, the pUC series, pACYC, M13 mp series etc.
  • the foreign sequence may be cloned into appropriate restriction sites.
  • the recombinant plasmid obtained in this way may subsequently be used for the transformation of E. coli.
  • Transformed E. coli cells may be grown in an appropriate medium, harvested and lysed.
  • the chimeric plasmid may then be reisolated and analyzed. Analysis of the recombinant plasmid may be performed by e.g. determination of the nucleotide sequence, restriction analysis, electrophoresis and other molecular-biochemical methods. After each manipulation the sequence may be cleaved and ligated to another DNA sequence. Each DNA sequence can be cloned on a separate plasmid DNA. Depending on the way used for transferring the foreign DNA into plant cells other DNA sequences might be of importance.
  • the Ti plasmid or the Ri plasmid of Agrobacterium tumefaciens or Agrobacterium rhizogenes at least the right border of the T-DNA may be used, and often both the right and the left borders of the T-DNA of the Ri or Ti plasmid will be present flanking the DNA sequence to be transferred into plant cells.
  • T-DNA for transferring foreign DNA into plant cells has been described extensively in the prior literature (cf. Gasser and Fraley, 1989, Science 244, 1293-1299 and references cited therein).
  • this sequence is fairly stable at the original locus and is usually not lost in subsequent mitotic or meiotic divisions.
  • a selectable marker gene will be cotransferred in addition to the gene to be transferred, which marker renders the plant cell resistant to certain antibiotics, e.g. kanamycin, hygromycin, G418 etc. This marker permits the recognition of the transformed cells containing the DNA sequence to be transferred compared to nontransformed cells.
  • Agrobacterium-mediated transfer, the fusion of protoplasts with liposomes containing the respective DNA, microinjection of foreign DNA, electroporation etc.
  • the DNA to be transferred has to be present in special plasmids which are either of the intermediate type or the binary type. Due to the presence of sequences homologous to T-DNA sequences, intermediate vectors may integrate into the Ri or Ti plasmid by homologous recombination. The Ri or Ti plasmid additionally contains the vir-region which is necessary for the transfer of the foreign gene into plant cells.
  • Binary vectors may replicate in both Agrobacterium species and E. coli. They may contain a selectable marker and a poly-linker region which to the left and right contains the border sequences of the T-DNA of Agrobacterium rhizogenes or Agrobacterium tumefaciens. Such vectors may be transformed directly into Agrobacterium species.
  • the Agrobacterium cell serving as the host cell has to contain a vir-region on another plasmid. Additional T-DNA sequences may also be contained in the Agrobacterium cell.
  • the Agrobacterium cell containing the DNA sequences to be transferred into plant cells either on a binary vector or in the form of a cointegrate between the intermediate vector and the T-DNA region may then be used for transforming plant cells.
  • multicellular explants (e.g. leaf discs, stem segments, roots), single cells (protoplasts) or cell suspensions are cocultivated with Agrobacterium cells containing the DNA sequence to be transferred into plant cells.
  • the plant cells treated with the Agrobacterium cells are then selected for the cotransferred resistance marker (e.g. kanamycin) and subsequently regenerated to intact plants. These regenerated plants will then be tested for the presence of the DNA sequences to be transferred.
  • when DNA is transferred by e.g. electroporation or microinjection, no special requirements are needed to effect transformation.
  • Simple plasmids e.g. of the pUC series may be used to transform plant cells.
  • Regenerated transgenic plants may be grown normally in a greenhouse or under other conditions. They should display a new phenotype (e.g. production of new proteins) due to the transfer of the foreign gene(s).
  • the transgenic plants may be crossed with other plants which may either be wild-type or transgenic plants transformed with the same or another DNA sequence. Seeds obtained from transgenic plants should be tested to assure that the new genetic trait is inherited in a stable Mendelian fashion. See also Hiatt, Nature 344: 469-479, 1990; Edelbaum et al., J. Interferon Res. 12: 449-453, 1992; Sijmons et al., Bio/Technology 8: 217-221, 1990; and EP 255 378.
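One common way to check that a trait segregates "in a stable Mendelian fashion" is a chi-square goodness-of-fit test on the progeny counts. The sketch below is only an illustration and is not taken from the patent: it assumes a single dominant marker insertion in a selfed primary transformant (expected 3:1 resistant:sensitive seedlings), and the seedling counts are hypothetical placeholders.

```python
# Chi-square goodness-of-fit against an expected Mendelian ratio.
# Assumption (not from the source): single-locus dominant marker, selfed
# hemizygous parent, hence an expected 3:1 resistant:sensitive ratio.

CHI2_CRITICAL_DF1_P05 = 3.841  # critical value for 1 degree of freedom, alpha = 0.05


def chi_square(observed, ratio):
    """Return the chi-square statistic for observed counts vs. an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts of kanamycin-resistant and kanamycin-sensitive seedlings.
observed_counts = [290, 110]
statistic = chi_square(observed_counts, ratio=[3, 1])

# A statistic below the critical value means the 3:1 hypothesis is not rejected.
consistent_with_3_to_1 = statistic < CHI2_CRITICAL_DF1_P05
print(f"chi-square = {statistic:.3f}, consistent with 3:1: {consistent_with_3_to_1}")
```

Other expected ratios (for example 15:1 for two unlinked insertions) can be tested by changing the `ratio` argument.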
  • the transformed or transfected host cells described above are cultured in a suitable nutrient medium under conditions permitting the expression of the desired polypeptide, after which the resulting polypeptide is recovered from the cells, or the culture broth.
  • the medium used to culture the cells may be any conventional medium suitable for growing the host cells, such as minimal or complex media containing appropriate supplements. Suitable media are available from commercial suppliers or may be prepared according to published recipes (e.g. in catalogues of the American Type Culture Collection). The media are prepared using procedures known in the art (see, e.g., references for bacteria and yeast; Bennett, J.W. and LaSure, L., editors, More Gene Manipulations in Fungi, Academic Press, CA, 1991).
  • the polypeptide can be recovered directly from the medium. If the polypeptide is not secreted, it is recovered from cell lysates.
  • the polypeptide is recovered from the culture medium by conventional procedures including separating the host cells from the medium by centrifugation or filtration, precipitating the proteinaceous components of the supernatant or filtrate by means of a salt, e.g. ammonium sulphate, and purification by a variety of chromatographic procedures, e.g. ion exchange chromatography, gel filtration chromatography, affinity chromatography, or the like, dependent on the type of polypeptide in question.
  • the polypeptides may be detected using methods known in the art that are specific for the polypeptides.
  • polypeptides of the present invention may be purified by a variety of procedures known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric focusing (IEF)), differential solubility (e.g., ammonium sulfate precipitation), or extraction (see, e.g., Protein Purification, J.C. Janson and Lars Ryden, editors, VCH Publishers, New York, 1989).
  • Figure 1 shows an alignment of the known amino acid sequences of the Aspergillus oryzae (A. oryzae) and Aspergillus niger (A. niger) Fam35 lactases with the Meripilus giganteus (M. giganteus) Fam35 lactase of the invention.
  • the seven conserved regions of the invention are shaded gray.
  • the invention relates to a method of screening for a DNA sequence coding for an enzyme of interest, the method comprising: a) selecting a microorganism of interest and obtaining genomic DNA or cDNA from said microorganism; b) selecting a forward PCR-primer comprising sense DNA corresponding to an amino acid sequence of at least five residues chosen from the group consisting of: R-P-G-[P/S]-Y-I-N-A-E (SEQ ID No. 1), A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P (SEQ ID No. 2), E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G (SEQ ID No. 3), Y-T
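The bracketed motif notation used for the conserved regions maps naturally onto regular expressions, which gives a quick way to check whether a candidate protein sequence contains them. The following is a minimal sketch, not part of the patent's method: only the three motifs fully legible above are included, the helper names are illustrative, and the example protein is a made-up string assembled purely to exercise the code.

```python
import re

# Conserved regions as written in the text (only those legible in full).
CONSERVED_MOTIFS = {
    "SEQ ID No. 1": "R-P-G-[P/S]-Y-I-N-A-E",
    "SEQ ID No. 2": "A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P",
    "SEQ ID No. 3": "E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G",
}


def motif_to_regex(motif):
    """Convert 'R-P-G-[P/S]-Y' notation into a regular expression string."""
    parts = []
    for token in motif.replace(" ", "").split("-"):
        if token.startswith("[") and token.endswith("]"):
            # '[P/S]' means either residue is accepted at this position.
            parts.append("[" + token[1:-1].replace("/", "") + "]")
        else:
            parts.append(re.escape(token))
    return "".join(parts)


def find_conserved_regions(protein):
    """Return (motif name, 0-based start, matched residues) for every hit."""
    hits = []
    for name, motif in CONSERVED_MOTIFS.items():
        for match in re.finditer(motif_to_regex(motif), protein):
            hits.append((name, match.start(), match.group()))
    return hits


# Hypothetical test sequence, not a real enzyme.
example_protein = "MKLRPGSYINAEQQEFQAGAFDPWGGPGK"
for hit in find_conserved_regions(example_protein):
    print(hit)
```

A sequence containing several of these regions in the expected order would be a reasonable candidate for closer inspection; the regex match alone says nothing about enzymatic activity.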
  • a preferred embodiment of the invention relates to a method of the first aspect, wherein the primer of step (b) comprises the DNA sequence 5'CCXTAYATH, preferably 5'GGXCCXTAYATHAA, most preferably 5'CCXGGXCCXTAYATHAAYGC; and the primer of step (c) comprises the DNA sequence 5'TGGGGXGGX, preferably 5'CCXTGGGGXGGXCCX, most preferably 5'GAYCCXTGGGGXGGXCCXGGG.
  • a preferred embodiment of the invention relates to a method of the first aspect, wherein the DNA fragment of the first aspect step (d) is cloned in a suitable vector which is then transformed into a suitable homologous or heterologous microbial host cell.
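The step (b) primer sequences can be understood as a back-translation of the conserved peptide into degenerate codons. The sketch below illustrates that idea under stated assumptions: it uses the standard genetic code with IUPAC ambiguity letters (N where the patent writes X), and the single-codon approximations for Leu, Arg and Ser are a simplification that slightly over-covers those residues. It is an illustration, not the procedure the inventors necessarily followed.

```python
# One degenerate codon per amino acid (standard genetic code, IUPAC letters).
# Leu (YTN), Arg (MGN) and Ser (WSN) are approximations that also cover a few
# codons for other residues; the remaining entries are exact.
DEGENERATE_CODON = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "L": "YTN",
    "M": "ATG", "N": "AAY", "P": "CCN", "Q": "CAR", "R": "MGN",
    "S": "WSN", "T": "ACN", "V": "GTN", "W": "TGG", "Y": "TAY",
}

# Number of bases represented by each IUPAC letter.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "M": 2, "K": 2, "S": 2, "W": 2,
    "H": 3, "B": 3, "V": 3, "D": 3, "N": 4,
}


def back_translate(peptide):
    """Back-translate a peptide into a degenerate sense-strand sequence."""
    return "".join(DEGENERATE_CODON[residue] for residue in peptide)


def degeneracy(sequence):
    """Number of distinct oligonucleotides a degenerate sequence stands for."""
    count = 1
    for base in sequence:
        count *= IUPAC_SIZE[base]
    return count


# P-G-P-Y-I-N-A is the [P] variant of the region comprised in SEQ ID No. 1.
sense = back_translate("PGPYINA")
# Dropping the final wobble position keeps the 3' end of the primer unambiguous.
primer = sense[:-1]
print(primer.replace("N", "X"), degeneracy(primer))
```

With the final wobble base removed and N rewritten as X, the result has the same letters as the "most preferably" sequence given for step (b).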
  • step (a) is a fungus, preferably the fungus is of the Phylum Basidiomycota, Ascomycota or Zygomycota, more preferably the fungus is of a Class selected from the group consisting of: Hymenomycetes, Gasteromycetes, Loculoascomycetes, Discomycetes, Plectomycetes, Hemiascomycetes, Archiascomycetes, Pyrenomycetes and Zygomycetes, even more preferably the fungus is of an Order selected from the group consisting of: Agaricales, Polyporales, Stereales, Hymenochaetales, Hericiales, Boletales, Cantharellales, Tremellales, Auriculariales, Dothideales, Coryneliales, Rhytismatales, Pezizales, Heloti
  • a further preferred embodiment relates to a method of the first or second aspects, wherein the genomic DNA or cDNA of step (a) is fragmented and a DNA fragment of interest is chosen which hybridizes in a Southern blot with the DNA fragment of the first aspect step (d) or with the nucleotide probe of the second aspect step (b); preferably the chosen DNA fragment is cloned in a suitable vector which is then transformed into a suitable homologous or heterologous microbial host cell; more preferably the DNA fragment is stably integrated into the host cell genome, preferably in multiple copies.
  • the host cell of the previous embodiments is a filamentous fungus or a yeast-like cell; or more preferably the host cell is a fungus; an Aspergillus, a Fusarium, a Meripilus, a Trametes, a Penicillium, a Microsphaeropsis, a Mycena, a Spathularia, a Diplodia, a Petromyces, a Christaspora, or a Hansenula cell; most preferably the fungus is selected from the group consisting of: Aspergillus niger, Aspergillus oryzae, Aspergillus nidulans, Fusarium venenatum, Meripilus giganteus, Trametes ochracea, Penicillium roquefortii, Penicillium carneum, Microsphaeropsis, Mycena pura, Spathularia flavida, Diplodia gossypina, Petromy
  • Enzymes for DNA manipulations were used according to the specifications of the suppliers (e.g. restriction endonucleases, ligases etc. are obtainable from New England Biolabs, Inc.).
  • Yeast strain: The Saccharomyces cerevisiae strain used was W3124 (MATa; ura 3-52; leu 2-3, 112; his 3-D200; pep 4-1137; prc1::HIS3; prb1::LEU2; cir+).
  • E. coli strain DH10B (Life Technologies) .
  • Plasmids: pBK-CMV (Stratagene™ Inc., La Jolla, CA); the automatic subcloning plasmid liberated from lambdaZAP™ Express (Stratagene).
  • the Aspergillus expression vector pHD414 is a derivative of the plasmid p775 described in EP 238 023; the construction of pHD414 is further described in WO 93/11249.
  • pYES™ 2.0 (Invitrogen™, US).
  • pA2LAl construction described herein.
  • pA3Lacl construction described herein.
  • YPD 10 g yeast extract, 20 g peptone, H2O to 900 ml.
  • YPM 10 g yeast extract, 20 g peptone, H2O to 900 ml.
  • SC-URA 100 ml 10 x Basal salt, 28 ml 20% casamino acids without vitamins, 10 ml 1% tryptophan, H2O ad 900 ml, autoclaved, 3.6 ml
  • SC-agar SC-URA, 20g/l agar added.
  • X-gal SC-agar plates SC-agar with 2% galactose and 40 mg/l X-gal (5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside from
  • BA media Rofec (Roquette # 1023642) 10 g NH4NO3 10 g
  • Potato dextrose broth (Difco 0549) 24 g is dissolved in deionized water to 1000 ml.
  • Genomic DNA was prepared essentially according to the fungal genomic DNA protocol supplied with the Qiagen500 genomic tips with the following modifications: 1) The frozen mycelia were ground in a pre-chilled mortar and pestle using quartz sand as an abrasive. 2) The proteinase K digestion was performed for only 1 hour instead of two hours.
  • Genomic DNA for Trametes ochracea, Spathularia flavida and Mycena pura was prepared as mini prep genomic DNA based on a modified Qiaprep protocol (Qiagen GmbH). Briefly, the mycelia were ground exactly as in the Qiagen protocol described above and were placed in a 2 ml microcentrifuge tube. One ml of lysis buffer and 2 µl of RNase A solution (20 µg/µl) were added. The samples were mixed in an Eppendorf thermomixer at 37 degrees for 10 minutes. Thirty µl of proteinase K solution was added and the samples were incubated for thirty minutes at 50 degrees.
  • LambdaZAP express cloning kit with BamHI digested and dephosphorylated arms from Stratagene. Isolated DNA was partially digested with
  • Standard screening procedure was employed for recovering genomic clones from the fungal libraries (Sambrook et al., 1995). 200,000 plaque-forming units (pfus) were screened at 50,000 pfus per 125 mm petri plate containing NZY media. Plaque lifts were performed with Hybond N nylon membranes, which were processed for hybridization according to the manufacturer's instructions (Amersham Pharmacia Biotech Ltd.). Hybridizations were performed at high stringency (68°C) in Modified Church buffer (Biorad Inc., USA). The probe was purified PCR product amplified from the genomic DNA that was used to make each fungal library. PCR products were produced by using the degenerate primers described in the invention.
  • the PCR fragment was randomly primed using a Pharmacia 32P-CTP random priming kit according to the manufacturer's instructions (Amersham Pharmacia Biotech Ltd.). Initial positives underwent two rounds of purification to isolate pure plaques. Pure plaque isolates were automatically subcloned to plasmid according to the manufacturer's instructions (Stratagene).
  • RNA, cDNA synthesis, Mung bean nuclease treatment, blunt-ending with T4 DNA polymerase, adaptor ligation, NotI digestion and size selection, and construction of the cDNA library were performed as described in WO 97/31102.
  • a beta-galactosidase producing yeast colony is inoculated into 20 ml YPD broth in a 50 ml glass test tube. The tube is shaken for 2 days at 30°C. The cells are harvested by centrifugation for 10 min. at 3000 rpm. DNA is isolated according to WO 94/14953 and dissolved in 50 ml water. The DNA is transformed into E. coli by standard procedures. Plasmid DNA is isolated from E. coli using standard procedures, and the cDNA insert is sequenced with the Taq deoxy terminal cycle sequencing kit (Perkin Elmer, USA) and synthetic oligonucleotide primers using an Applied Biosystems ABI PRISM™ 377 DNA Sequencer according to the manufacturer's instructions. The cDNA insert is excised using appropriate restriction enzymes and ligated into an Aspergillus expression vector.
  • Protoplasts may be prepared as described in WO 95/02043, page 16, line 21 - page 17, line 12, which is hereby incorporated by reference.
  • Tris-HCl, pH 7.5, 10 mM CaCl2).
  • Protoplasts are mixed with p3SR2 (an A. nidulans amdS gene carrying plasmid).
  • the mixture is left at room temperature for 25 minutes.
  • 0.2 ml of 60% PEG 4000 (BDH 29576), 10 mM CaCl2 and 10 mM Tris-HCl, pH 7.5 is added and carefully mixed (twice) and finally 0.85 ml of the same solution is added and carefully mixed.
  • the mixture is left at room temperature for 25 minutes, spun at 2500 g for 15 minutes and the pellet is resuspended in 2 ml of 1.2 M sorbitol.
  • the protoplasts are spread on minimal plates (Cove, Biochem. Biophys. Acta 113 (1966) 51-56) containing 1.0 M sucrose, pH 7.0, 10 mM acetamide as nitrogen source and 20 mM CsCl to inhibit background growth. After incubation for 4-7 days at 37°C spores are picked and spread for single colonies. This procedure is repeated and spores of a single colony after the second reisolation are stored as a defined transformant.
  • Each of the transformants is inoculated in 10 ml of YPM and propagated. After 2-5 days of incubation at 30°C, the supernatant is removed.
  • the beta-galactosidase activity can be identified by applying 20 µl supernatant to 4 mm diameter holes punched out in an X-gal SC-agar plate and incubating overnight at 30°C; beta-galactosidase activity is then identified by a blue halo.
  • a method for identifying new fungal family 35 enzymes is to design degenerate oligonucleotide primers for PCR based on conserved amino acid regions within the amino acid sequences of the known fungal family 35 enzymes, and to use them for molecular screening/cloning of gene family members, for instance as described by G. M. Preston, Methods in Molecular Biology vol. 67, 1997, pp. 433-449; S. Bartl, Methods in Molecular Biology vol. 67, 1997, pp. 451-457; R. M. Horton et al., Methods in Molecular Biology vol. 67, 1997, pp. 459-479.
  • the full sequence can then be determined by library construction and screening as described above or by cloning by inverse PCR for instance as described by J. Silver, pp.137-146, in PCR a Practical Approach. Edited by: M. J.McPherson, P. Quirke and G.R. Taylor. Oxford University Press, 1991. Sequence Determination:
  • Plasmid clones were sequenced by use of the New England Biolabs GPS-1 Genome Priming System. Briefly, transposons were inserted randomly into the plasmid genomic clone according to the manufacturer's instructions (New England Biolabs, USA). Sixty to one hundred individual plasmid transposants were sequenced with one of the primers contained on the inserted transposon. The DNA Star software package version 4.2 (www.dnastar.com) was used to assemble the DNA contigs generated from the plasmid transposants. Both strands were sequenced and custom primers were used as necessary to complete the sequence in both directions. An Applied Biosystems ABI371 automated sequencer utilizing dye terminator technology was used (Applied Biosystems, USA).
  • Total DNA was isolated from a beta-galactosidase positive yeast colony and plasmid DNA was rescued by transformation of E. coli as described above.
  • the DNA was digested with appropriate restriction enzymes, size fractionated on gel, and a fragment corresponding to the beta-galactosidase gene was purified.
  • the gene was subsequently ligated to pHD414, digested with appropriate restriction enzymes, resulting in the plasmid pA2LAl.
  • the full length cDNA insert encoding the beta-galactosidase of Meripilus giganteus was sequenced from Qiagen-purified plasmid DNA of pA2LAl (Qiagen, USA) with the Taq deoxy terminal cycle sequencing kit (Perkin Elmer, USA) and synthetic oligonucleotide primers using an Applied Biosystems ABI PRISM™ 377 DNA Sequencer according to the manufacturer's instructions. After amplification of the DNA in E. coli the plasmid was transformed into Aspergillus oryzae as described above.
  • Each of the transformants was tested for enzyme activity as described above. Some of the transformants had beta-galactosidase activity which was significantly larger than the Aspergillus oryzae background. This demonstrates efficient expression of the beta-galactosidase in Aspergillus oryzae.
  • 5'-CCC AAG CTT CCI GGI CCI TAY ATH AAY GC-3' corresponds to amino acids P-G-[P/S]-Y-I-N-A (comprised in SEQ ID No. 1) with a 5' tail consisting of CCC and a HindIII site.
  • I = deoxyinosine
  • H = A, C or T
  • R = A or G
  • Y = C or T
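The structure of this primer (clamp, restriction site, degenerate gene-specific part) can be checked mechanically. The following is a small sketch for illustration only: it locates the HindIII recognition sequence (AAGCTT), counts the inosine positions, and computes the degeneracy from the ambiguity codes in the legend above. Treating each inosine as a single synthesized base is an assumption of this sketch, and the variable names are not from the patent.

```python
# Sense primer as printed in the text, spaces removed.
SENSE_PRIMER = "CCC AAG CTT CCI GGI CCI TAY ATH AAY GC".replace(" ", "")

HINDIII_SITE = "AAGCTT"  # recognition sequence of HindIII

# Bases covered by each symbol used in this primer. Inosine (I) is counted as
# one base because it is incorporated as a single nucleoside (assumption).
SYMBOL_SIZE = {"A": 1, "C": 1, "G": 1, "T": 1, "I": 1, "H": 3, "R": 2, "Y": 2}


def degeneracy(sequence):
    """Number of distinct oligonucleotides in the synthesized primer pool."""
    count = 1
    for base in sequence:
        count *= SYMBOL_SIZE[base]
    return count


site_start = SENSE_PRIMER.find(HINDIII_SITE)        # 0-based position of the site
tail = SENSE_PRIMER[: site_start + len(HINDIII_SITE)]
gene_specific = SENSE_PRIMER[len(tail):]             # part encoding P-G-[P/S]-Y-I-N-A

print("5' tail:          ", tail)
print("gene-specific:    ", gene_specific)
print("inosine positions:", gene_specific.count("I"))
print("degeneracy:       ", degeneracy(SENSE_PRIMER))
```

Replacing I with X in the gene-specific part gives the same letters as the 20-mer 5'CCXGGXCCXTAYATHAAYGC named in the preferred embodiment, which suggests inosine is used here at the fully ambiguous positions.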
  • PCR reaction: Approximately 100 to 200 ng genomic DNA or 10-20 ng double-stranded cDNA is used as template for PCR amplification in PCR buffer (10 mM Tris-HCl, pH 8.3, 50 mM KCl) containing 200 µM of each dNTP, 3.5 mM MgCl2, 2.5 Units AmpliTaq Gold™, and 100 pmol of each of the degenerate primers 647 and 648. The total volume is 50 µl.
  • the PCR reaction is carried out in a Perkin-Elmer GeneAmp PCR System 2400.
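The reaction composition mixes amounts (pmol, ng, Units) and concentrations (µM, mM), so it can help to convert everything to one basis. The sketch below does that arithmetic for the 50 µl reaction using only the figures stated above; the function names are illustrative.

```python
REACTION_VOLUME_UL = 50.0


def concentration_um(amount_pmol, volume_ul):
    """pmol per microlitre is numerically equal to micromolar."""
    return amount_pmol / volume_ul


def amount_nmol(concentration_micromolar, volume_ul):
    """micromolar x microlitres gives pmol; divide by 1000 for nmol."""
    return concentration_micromolar * volume_ul / 1000.0


# 100 pmol of each degenerate primer in 50 µl.
primer_um = concentration_um(100.0, REACTION_VOLUME_UL)

# 200 µM of each dNTP and 3.5 mM (3500 µM) MgCl2 in 50 µl.
dntp_nmol_each = amount_nmol(200.0, REACTION_VOLUME_UL)
mgcl2_nmol = amount_nmol(3500.0, REACTION_VOLUME_UL)

# 100 to 200 ng genomic DNA template and 2.5 Units polymerase in 50 µl.
template_ng_per_ul = (100.0 / REACTION_VOLUME_UL, 200.0 / REACTION_VOLUME_UL)
enzyme_units_per_ul = 2.5 / REACTION_VOLUME_UL

print(f"each primer: {primer_um:.1f} µM")
print(f"each dNTP:   {dntp_nmol_each:.1f} nmol")
print(f"MgCl2:       {mgcl2_nmol:.1f} nmol")
print(f"template:    {template_ng_per_ul[0]:.1f}-{template_ng_per_ul[1]:.1f} ng/µl")
print(f"polymerase:  {enzyme_units_per_ul:.2f} U/µl")
```

The resulting primer concentration of 2 µM is higher than is typical for non-degenerate primers, which is consistent with the use of degenerate pools in which each individual sequence is only a fraction of the total.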
  • the PCR reaction is performed using a cycle profile of:
  • the PCR fragments can be purified and sequenced using GFX™ PCR DNA and Gel Band Purification Kit (Pharmacia Biotech) according to the manufacturer's instructions. The nucleotide sequences of the amplified PCR fragments are determined
  • Genomic DNA from Penicillium roquefortii NN0048065 was isolated as described above and screened by PCR as described in example 2.
  • a PCR fragment was obtained and sequenced as described above.
  • the PCR fragment was randomly primed using a Pharmacia 32P-CTP random priming kit according to the manufacturer's instructions (Amersham Pharmacia Biotech Ltd.) and used as probe for screening by hybridization of the Penicillium roquefortii bacteriophage lambda library.
  • Primer BKCMV forward and -reverse were designed from Figure 2 in the ZAP Express Predigested Vector Kit and ZAP Express Predigested GigaPack Cloning Kits Instruction Manual (Stratagene) and primer 693 and 694 were designed from the Penicillium roquefortii 647/648 PCR fragment DNA sequence.
  • BKCMV forward primer (SEQ ID NO. 12): 5'-GAAATTAACCCTCACTAAAGG-3'
  • BKCMV reverse primer (SEQ ID NO. 13): 5'-CCGGGTGGAAAATCGATGGGCC-3' Primer sense 693 (SEQ ID NO. 14):
  • Primer antisense 694 (SEQ ID NO. 15): 5'-CTCACGTTTCGGCAGGGTCACATAC-3'
  • PCR bands containing the full upstream and full downstream genomic DNA sequence for the Penicillium roquefortii beta-galactosidase were obtained from the positive lambda plaque D2. Based on the DNA sequence obtained, PCR oligonucleotide primers for amplifying genomic DNA encoding the Penicillium roquefortii beta-galactosidase were designed with appropriate restriction site tails for cloning purposes.
  • PCR reaction directly on the Penicillium roquefortii genomic DNA was carried out with primers 823 and 723 according to the manufacturer's instructions (Boehringer-Mannheim: Expand Long Template PCR System) and the obtained PCR fragment was cloned into Aspergillus expression vector pHD423, a derivative of pHD414 with NotI and XbaI restriction sites in the polylinker region, resulting in plasmid pA3Lacl.
  • the DNA sequence of the insert encoding the beta-galactosidase of Penicillium roquefortii was determined from Qiagen-purified plasmid DNA of pA3Lacl (Qiagen, USA) with the Taq deoxy terminal cycle sequencing kit (Perkin Elmer, USA) and synthetic oligonucleotide primers using an Applied Biosystems ABI PRISM™ 377 DNA Sequencer according to the manufacturer's instructions.
  • the DNA sequence of the NotI/XbaI insert in pA3Lacl encoding the beta-galactosidase of Penicillium roquefortii is shown in SEQ ID No. 18 and the corresponding cDNA sequence is shown in SEQ ID No. 19.
  • the amino acid sequence of the beta-galactosidase is shown in SEQ ID No. 20.
  • Mycelia were grown in FG4 media at 200 rpm, 20 degrees centigrade for 17 days. Mycelium was harvested and genomic DNA was prepared from Spathularia flavida according to the materials and methods section. A lambda phage genomic library was prepared with an average insert size of between 3 and 10 kb and a primary titer of 370,000 pfus/ml. 200,000 plaques were plated on four plates at 50,000 pfus per 125 mm petri plate containing NZY media. Plaque lifts were performed and hybridisation carried out as described. The hybridisation probe was generated from PCR on the same genomic DNA used to make the Spathularia flavida genomic library and using primers 647 and 648 (Example 2).
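Whether 200,000 plaques are enough to find a single-copy gene can be estimated with the Clarke-Carbon relation, P = 1 - (1 - f)^N, where f is the insert size divided by the genome size and N is the number of clones screened. The sketch below is a rough illustration: the 6.5 kb insert is simply the midpoint of the stated 3 to 10 kb range, and the 35 Mb genome size is an assumed order-of-magnitude value for a filamentous fungus, not a figure from the patent.

```python
import math


def coverage_probability(n_clones, insert_bp, genome_bp):
    """Probability that a given locus is present in at least one clone."""
    fraction = insert_bp / genome_bp
    # log1p keeps precision when the fraction is very small.
    return 1.0 - math.exp(n_clones * math.log1p(-fraction))


def clones_needed(probability, insert_bp, genome_bp):
    """Smallest number of clones giving at least the requested probability."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log1p(-fraction))


INSERT_BP = 6_500        # assumed: midpoint of the stated 3-10 kb range
GENOME_BP = 35_000_000   # assumed: illustrative fungal genome size

p_screened = coverage_probability(200_000, INSERT_BP, GENOME_BP)
n_for_99 = clones_needed(0.99, INSERT_BP, GENOME_BP)

print(f"P(locus present among 200,000 plaques) = {p_screened:.6f}")
print(f"clones needed for 99% probability      = {n_for_99}")
```

Under these assumptions the number of plaques screened exceeds the 99% requirement several-fold, so multiple independent positives would be expected for a single-copy gene. A larger genome or smaller inserts would reduce that margin.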

Abstract

The present invention provides a solution to the problem of how to obtain new Fam35 lactases of fungal origin for industrial production, without having to perform traditional activity based screening assays. A method of screening for a DNA sequence coding for an enzyme of interest; processes for producing the enzyme of interest; enzymes obtainable by the above method and/or process; DNA encoding said enzymes; and cells comprising DNA encoding said enzymes.

Description

Fungal Extracellular Fam35 Beta-Galactosidases
Field of the invention
β-galactosidases or lactases (EC 3.2.1.23) of the Family 35 (Fam35) are known to be produced by Aspergillus niger and Aspergillus oryzae. US Patent No. 5,736,374 and the corresponding PCT-application WO 96/00786 describe an A. oryzae lactase, gene and product, and the use of this lactase for treating lactose intolerance in mammals. Lactases are widely used in the preparation of foodstuffs or feed for consumption by lactose intolerant humans or animals, and the industrial production of lactases is an active field of research.
Traditionally the screening for such lactases involves laborious activity based assays, and the development of new and easier ways to screen for lactases is of importance to the industry.
Background
O-Glycoside hydrolases (EC 3.2.1.) are a widespread group of enzymes which hydrolyse the glycosidic bond between two or more carbohydrates or between a carbohydrate and a non-carbohydrate moiety. The IUB-MB enzyme nomenclature of glycoside hydrolases is based on their substrate specificity and occasionally on their molecular mechanism; such a classification does not reflect (and was not intended to reflect) the structural features of these enzymes. A classification of glycoside hydrolases in families (Fam) based on amino acid sequence similarities was proposed a few years ago
[Henrissat, A classification of glycosyl hydrolases based on amino-acid sequence similarities. Biochem. J. 280:309- 316(1991); Henrissat & Bairoch, New families in the classification of glycosyl hydrolases based on amino-acid sequence similarities. Biochem. J. 293:781-788(1993); Henrissat & Bairoch, Updating the sequence-based classification of glycosyl hydrolases. Biochem. J. 316:695-696(1996); Davies & Henrissat, Structures and mechanisms of glycosyl hydrolases. Structure 3:853-859(1995)].
Because there is a direct relationship between sequence and folding similarities, such a classification is expected to: (i) reflect the structural features of these enzymes better than their sole substrate specificity, (ii) help to reveal the evolutionary relationships between these enzymes, and (iii) provide a convenient tool to derive mechanistic information.
β-galactosidases or lactases (EC 3.2.1.23) of Family 35 (Fam35) are known to be produced by Aspergillus niger and Aspergillus oryzae. US Patent No. 5,736,374 and the corresponding PCT application WO 96/00786 describe an A. oryzae lactase, gene and product, and the use of this lactase for treating lactose intolerance in mammals. US Patent No. 5,821,350 describes an A. niger lactase, gene and product. WO 90/10703 describes an A. niger lactase which is produced and secreted by a yeast. US Patent No. 3,718,739 describes a method of reducing lactose intolerance in mammals by treating with a lactase from A. niger. US Patent No. 4,522,832 describes addition of a lactase to baked yeast goods, specifically lactase from A. oryzae. The addition of compositions comprising lactase to dairy products under certain conditions was described in US Patent No. 5,707,843. A method of preparing an ice cream containing a lactose composition, specifically from yeast, was described in US Patent No. 5,942,264. The use in animal feed of lactase in combination with other enzymes such as galactanase was described in WO 97/16982.
The commercially available lactase products fall into two categories: the so-called neutral lactases, e.g. the one produced in the yeast Kluyveromyces lactis as an intracellular (Fam2) homologous product, and the so-called acid lactases, produced e.g. in Aspergillus oryzae and Aspergillus niger as extracellular homologous products. New lactases would be highly desirable for use in industrial applications (to provide lactose-free products for lactose intolerant people, e.g. from milk and cheese), in end-user applications (as a digestive aid), and even in animal feed. Further, such lactases could be used in the production of fermentation stocks from whey and in applications within the dairy industry for obtaining improved mouth-feel and preventing crystallization during freezing of yogurt and ice cream. Hence the screening for and detection of genes encoding secreted lactases with desirable properties, such as activity over a wide range of pH or temperature, has become a competitive area in the field. Expression cloning of this type of gene is very difficult though, probably due to the relatively large size of the lactase molecule of approx. 1000 amino acids, giving a gene size of approx. 3.6 kbp (including introns).
As a result of the present invention we now have an efficient molecular screening method to detect new Fam35 lactase candidates.
Summary of the invention
The present invention provides a solution to the problem of how to obtain new Fam35 lactases of fungal origin for industrial production, without having to perform traditional activity based screening assays.
By far most of the known Fam35 genes/enzymes come from plants; a few examples are also known from the animal kingdom, but only very few from microorganisms. Although the sequence variation within the entire known Fam35 is broad, the sequences of the three known Fam35 genes of fungal origin (viz. A. niger, A. oryzae, and Penicillium canescens) are highly conserved. Further, these three fungal species are all very closely related, being from two genera of the Trichocomaceae family, belonging to the Ascomycete order Eurotiales. It was therefore highly surprising when we discovered, first of all, that the Basidiomycete Meripilus giganteus had a Fam35 gene/enzyme. Secondly, it was surprising to find that this new fungal Fam35 gene/enzyme had a quite different sequence compared to the ones known from Aspergillus and Penicillium.
After aligning the novel M. giganteus lactase sequence with two known Fam35 sequences from A. niger and A. oryzae we found that the fungal Fam35 lactases were not nearly as conserved as we would have expected. Thus, had we attempted a PCR-based cloning strategy with primers designed from the known lactase sequences, we would likely not have succeeded in isolating the M. giganteus lactase.
From the abovementioned alignment of the known lactase sequences with the novel lactase sequence of the invention, we identified a number of regions that were highly conserved (Figure 1). The present invention is based on these results; we were able to show that sets of degenerated primers could be constructed that were effective in the identification of novel fungal Fam35 lactases from a wide spectrum of the fungal kingdom, despite the sequence diversity.
To date, by the use of the primers of the invention, we have found several novel Fam35 lactases in the following strains: Meripilus giganteus CBS 52195, deposited on July 04, 1995; Trametes ochracea CBS 102584 (Basidiomycota), deposited on September 22, 2000; Penicillium roquefortii, Penicillium carneum CBS 102585; and Microsphaeropsis sp. CBS 102583 (Dothideales, Ascomycota), deposited on February 28, 2000. Partial sequences have further been isolated from five more strains of Ascomycete and Basidiomycete species: Mycena pura, Spathularia flavida, Diplodia gossypina, Petromyces alliaceus, and Christaspora arxii.
The sequence data reveal surprising differences between the genes in spite of them being recognized by the same sets of primers defined in the present invention, thus confirming that knowledge of the conserved regions described herein was a prerequisite for successfully designing the PCR-primers of the invention.
By using the degenerated primer sets of the invention, it proved possible to detect Fam35 lactases with new characteristics with respect to pH, temperature, and substrate specificity profiles, and to obtain enzymes with different levels of specific activity. In this way the invention opens the way for the industrial production of new and improved lactases for use in many industrial segments, including both the acid and neutral lactase market segments. Further, lactases with activity at cold storage temperatures, lactases surviving food and fodder processing, and lactases withstanding high temperature during decontamination processes can now be found.
By doing the abovementioned alignment we were able to identify seven highly conserved regions in the Fam35 lactases. The conserved regions are:
Region 1 (SEQ ID No.1): R-P-G-[P/S]-Y-I-N-A-E;
Region 2 (SEQ ID No.2): A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P;
Region 3 (SEQ ID No.3): E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G;
Region 4 (SEQ ID No.4): Y-T-S-Y-D-Y-G-S;
Region 5 (SEQ ID No.5): D-K-V-R-G;
Region 6 (SEQ ID No.6): N-E-G-G-L-[Y/F]-A-E-R;
Region 7 (SEQ ID No.7): G-P-Q-[T/A]-S-F-P-V-P-[E/V]-G-I.
Note: A bracket in the above sequence listings, such as in '[P/S]', denotes one position in the amino acid sequence where either of the two amino acid residues indicated within the bracket may be present; in the case of [P/S], either 'P' or 'S' may be present in that position.
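The bracket notation of the seven conserved regions maps directly onto a regular expression, which gives one way to check a candidate protein sequence for the regions. The following is only a minimal sketch under stated assumptions: the function names `motif_to_regex` and `find_regions` are hypothetical and not part of the invention, and exact matching is assumed, whereas a real search would normally tolerate some mismatches.

```python
import re

# Conserved regions 1-7, written in the bracket notation used above.
CONSERVED_REGIONS = {
    1: "R-P-G-[P/S]-Y-I-N-A-E",
    2: "A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P",
    3: "E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G",
    4: "Y-T-S-Y-D-Y-G-S",
    5: "D-K-V-R-G",
    6: "N-E-G-G-L-[Y/F]-A-E-R",
    7: "G-P-Q-[T/A]-S-F-P-V-P-[E/V]-G-I",
}


def motif_to_regex(motif: str) -> str:
    """Turn 'R-P-G-[P/S]-Y' into the regular expression 'RPG[PS]Y'."""
    parts = []
    for position in motif.replace(" ", "").split("-"):
        if position.startswith("["):
            # '[P/S]' becomes '[PS]': either residue is allowed here.
            parts.append("[" + position.strip("[]").replace("/", "") + "]")
        else:
            parts.append(position)
    return "".join(parts)


def find_regions(protein: str) -> dict:
    """Return {region number: 0-based start} for each region found exactly."""
    hits = {}
    for number, motif in CONSERVED_REGIONS.items():
        match = re.search(motif_to_regex(motif), protein.upper())
        if match:
            hits[number] = match.start()
    return hits
```

For instance, `find_regions("MKKRPGSYINAEWWDKVRGLL")` reports Region 1 and Region 5 for this made-up test string, which is not a real lactase sequence.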
Accordingly in a first aspect the invention relates to a method of screening for a DNA sequence coding for an enzyme of interest, the method comprising:
a) selecting a microorganism of interest and obtaining genomic DNA or cDNA from said microorganism;
b) selecting a forward PCR-primer comprising sense DNA corresponding to an amino acid sequence of at least five residues chosen from the group consisting of: R-P-G-[P/S]-Y-I-N-A-E (SEQ ID No.1), A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P (SEQ ID No.2), E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G (SEQ ID No.3), Y-T-S-Y-D-Y-G-S (SEQ ID No.4), D-K-V-R-G (SEQ ID No.5), and N-E-G-G-L-[Y/F]-A-E-R (SEQ ID No.6);
c) selecting a reverse PCR-primer comprising anti-sense DNA corresponding to an amino acid sequence of at least five residues chosen from the group consisting of: A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P (SEQ ID No.2), E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G (SEQ ID No.3), Y-T-S-Y-D-Y-G-S (SEQ ID No.4), D-K-V-R-G (SEQ ID No.5), N-E-G-G-L-[Y/F]-A-E-R (SEQ ID No.6), and G-P-Q-[T/A]-S-F-P-V-P-[E/V]-G-I (SEQ ID No.7);
d) performing a PCR reaction using the genomic DNA or cDNA from step (a) as template with the primers from step (b) and step (c), and screening the PCR products for a generated DNA fragment of interest.
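Steps (b) and (c) rest on back-translating a stretch of at least five residues into degenerate DNA. The sketch below illustrates one way this could be done, assuming the standard genetic code and IUPAC ambiguity symbols; it is not the primer design actually used by the inventors, and all function names are hypothetical. Each residue position is collapsed into one degenerate codon by taking, at each codon position, the union of bases over all codons for the allowed residues. This simple union can also cover codons of other residues (for example, MGN for arginine also matches the serine codons AGT and AGC), so practical primers are usually trimmed or refined further.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC ambiguity symbols and the bases each one stands for.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT",
         "S": "CG", "W": "AT", "K": "GT", "M": "AC", "B": "CGT",
         "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT"}
SYMBOL_FOR = {bases: symbol for symbol, bases in IUPAC.items()}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def parse_motif(motif):
    """'R-P-G-[P/S]-Y' -> ['R', 'P', 'G', 'PS', 'Y']."""
    return [p.strip("[]").replace("/", "")
            for p in motif.replace(" ", "").split("-")]


def degenerate_codon(residues):
    """One IUPAC codon covering every codon of every residue in `residues`."""
    codons = [c for c, aa in CODON_TABLE.items() if aa in residues]
    return "".join(SYMBOL_FOR["".join(sorted({c[i] for c in codons}))]
                   for i in range(3))


def forward_primer(motif, start=0, length=5):
    """Sense-strand degenerate DNA for `length` residues of the motif."""
    positions = parse_motif(motif)[start:start + length]
    if len(positions) < 5:
        raise ValueError("at least five residues are required")
    return "".join(degenerate_codon(p) for p in positions)


def reverse_primer(motif, start=0, length=5):
    """Anti-sense primer: reverse complement of the sense-strand DNA."""
    return forward_primer(motif, start, length).translate(COMPLEMENT)[::-1]


def degeneracy(primer):
    """Number of distinct plain sequences the degenerate primer represents."""
    count = 1
    for symbol in primer:
        count *= len(IUPAC[symbol])
    return count
```

Under these assumptions, `forward_primer("D-K-V-R-G")` gives `GAYAARGTNMGNGGN`, a 512-fold degenerate sequence; in practice such degeneracy is often reduced, for example by choosing residues with fewer codons.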
The first step in the above aspect is to select a microorganism. However, this is not to be interpreted narrowly: the method for screening will function just as well on DNA extracted from environmental samples taken from soil or other ecological niches, and thus also finds application in the screening of viable but non-culturable cells.
A second aspect of the invention relates to a method of screening for a DNA sequence coding for an enzyme of interest, the method comprising:
a) selecting a microorganism of interest and obtaining genomic DNA or cDNA from said microorganism;
b) selecting a nucleotide probe comprising a sense or an antisense nucleotide sequence corresponding to an amino acid sequence of at least five residues chosen from the group consisting of: R-P-G-[P/S]-Y-I-N-A-E (SEQ ID No.1), A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P (SEQ ID No.2), E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G (SEQ ID No.3), Y-T-S-Y-D-Y-G-S (SEQ ID No.4), D-K-V-R-G (SEQ ID No.5), N-E-G-G-L-[Y/F]-A-E-R (SEQ ID No.6), and G-P-Q-[T/A]-S-F-P-V-P-[E/V]-G-I (SEQ ID No.7);
c) using the probe of step (b) in a Southern blot against fragmented genomic DNA or fragmented cDNA of step (a).
After a DNA sequence coding for an enzyme of interest has been identified, it is further encompassed by the invention that the enzyme is manufactured industrially. Industrial enzyme production can be achieved in a number of ways; one is simply to culture the cells in which the DNA of interest was identified, and to recover the enzyme from the fermentation broth or from the cells.
Accordingly a third aspect of the invention relates to a process for producing an enzyme of interest, the process comprising steps of:
a) screening for a DNA sequence coding for the enzyme of interest by the method of any of the aspects of the invention,
b) culturing the microorganism of step (a) of the first or second aspect, the genomic DNA or cDNA of which comprises the DNA sequence coding for the enzyme of interest, under suitable conditions to express the enzyme of interest, and
c) recovering the enzyme from the culture.
Another way to industrially produce the enzyme of interest is to clone the DNA of interest by the usual techniques of the art, and to express the cloned DNA of interest in a homologous or a heterologous microbial host cell.
Consequently a fourth aspect of the invention relates to a process for producing an enzyme of interest, the process comprising steps of: a) screening for a DNA sequence coding for the enzyme of interest and isolating said DNA sequence by the method of any of the aspects of the invention, b) culturing the microbial host cell of the invention under suitable conditions to express the enzyme of interest and recovering the expressed enzyme from the culture.
The present invention has allowed easy isolation of a number of Fam35 lactases from various fungal Orders, lactases that would likely not have been isolated using conventional shotgun-cloning techniques without substantial labor.
A fifth aspect of the invention relates to an isolated enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Agaricales, Polyporales, Phanerochaetales, Leotiales, or Dothideales Order, said enzyme comprising at least one of the amino acid sequences shown in SEQ ID No. 1 - 7, and said enzyme being obtainable by the method of the third or fourth aspect of the invention.
The finding of a lactase of the invention isolated from a certain fungal Genus indicates that other members of that Genus are likely to have DNA encoding a related lactase of the invention.
Accordingly a sixth aspect of the invention relates to an isolated enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Meripilus, Mycena, Trametes, Spathularia, Diplodia, Microsphaeropsis, Penicillium, Petromyces, or Christaspora Genus, said enzyme comprising at least one of the amino acid sequences shown in SEQ ID No. 1 - 7, and said enzyme being obtainable by the method of the third or fourth aspect of the invention.
Specific lactases of the invention were isolated from a number of fungal species. Consequently a seventh aspect of the invention relates to an isolated enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Meripilus giganteus, Trametes ochracea, Penicillium roquefortii, Penicillium carneum, Microsphaeropsis sp., Mycena pura, Spathularia flavida, Diplodia gossypina, Petromyces alliaceus, or Christaspora arxii species, said enzyme comprising at least one of the amino acid sequences shown in SEQ ID No. 1 - 7, and said enzyme being obtainable by the method of the third or fourth aspects of the invention.
The DNA encoding the lactase enzymes of the invention is also encompassed by the invention.
Accordingly an eighth aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Agaricales, Polyporales, Phanerochaetales, Leotiales, or Dothideales Order, said sequence comprising subsequences that encode at least one of the amino acid sequences shown in SEQ ID No. 1 - 7, and said sequence being obtainable by the method of any of the first or second aspects of the invention.
Further a ninth aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Meripilus, Mycena, Trametes, Spathularia, Diplodia, Microsphaeropsis, Penicillium, Petromyces, or Christaspora Genus, said sequence comprising subsequences that encode at least one of the amino acid sequences shown in SEQ ID No. 1 - 7, and said sequence being obtainable by the method of any of the first or second aspects of the invention.
A tenth aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Meripilus giganteus, Trametes ochracea, Penicillium roquefortii, Penicillium carneum, Microsphaeropsis sp., Mycena pura, Spathularia flavida, Diplodia gossypina, Petromyces alliaceus, or Christaspora arxii species, said sequence comprising subsequences that encode at least one of the amino acid sequences shown in SEQ ID No. 1 - 7, and said sequence being obtainable by the method of any of the first or second aspects of the invention.
Specific lactases of the invention were isolated. Accordingly an eleventh aspect of the invention relates to an isolated enzyme with lactase activity (EC 3.2.1.23) which is obtainable by the method of any of the third or fourth aspects and which comprises an amino acid sequence at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to any of the sequences shown in SEQ ID No's: 9, 20, 22, 24, 26, 28, 30, 32, 34, 36, or 38.
Some of the strains from which non-limiting specific lactases of the invention were isolated, and from which the corresponding lactase encoding DNA sequences were also derived, were deposited under the Budapest Treaty at CBS: Microsphaeropsis sp. NN003539 was deposited as CBS 102583 and Penicillium carneum NN048067 as CBS 102585 on February 28, 2000, and Trametes ochracea NN007143 was deposited as CBS 102584 on September 22, 2000. Meripilus giganteus CBS 52195 was deposited on July 04, 1995.
Accordingly a twelfth aspect of the invention relates to an isolated enzyme with lactase activity (EC 3.2.1.23) which is obtainable by the method of any of the third or fourth aspects and which comprises an amino acid sequence at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to a lactase encoded by a DNA sequence comprised in a strain chosen from the group consisting of Microsphaeropsis sp. CBS 102583, Trametes ochracea CBS 102584, Penicillium carneum CBS 102585, and Meripilus giganteus CBS 52195.
By analogy the DNA encoding the specific lactases of the previous aspect of the invention is encompassed by the invention. Consequently a thirteenth aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23), said sequence being obtainable by the method of any of the first or second aspects, and said sequence being at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to any of the sequences shown in SEQ ID No's: 8, 18, 19, 21, 23, 25, 27, 29, 31, 33, 35, or 37.
A fourteenth aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23), said sequence being obtainable by the method of any of the first or second aspects, and said sequence being at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to the lactase encoding sequence comprised in a strain chosen from the group consisting of Microsphaeropsis sp. CBS 102583, Trametes ochracea CBS 102584, Penicillium carneum CBS 102585, and Meripilus giganteus CBS 52195.
In the art it is well known that enzyme activity and/or characteristics can be improved by shuffling or recombining two or more DNA sequences encoding related enzymes, in order to produce a final shuffled sequence encoding the improved enzyme. Ways of generating and producing DNA libraries from natural sequences are well known, but besides natural DNA sequences, a number of ways are also known in which to generate very large populations of diverse artificial DNA sequences starting from one or more natural sequences, e.g. shuffling or directed evolution (WO 98/42832; US 5,965,408; WO 98/01581; WO 97/07205; WO 95/22625; US 5,093,257).
Accordingly an aspect of the invention relates to an isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) said sequence obtainable by shuffling at least two isolated DNA sequences as defined in the previous aspects. Another aspect of the invention relates to a recombinant vector comprising a DNA sequence as defined in any of the previous aspects.
Yet another aspect relates to a recombinant host cell comprising a DNA sequence according to any of the previous aspects. A further aspect relates to a transgenic animal comprising and expressing the nucleic acid construct according to any of the preceding aspects.
One more aspect relates to a transgenic plant containing and expressing the nucleic acid construct according to any of the preceding aspects.
Also an aspect relates to a method of producing an enzyme with lactase activity (EC 3.2.1.23), which method comprises recovering the enzyme from the transgenic animal according to the sixteenth aspect.
An aspect relates to a method of producing an enzyme with lactase activity (EC 3.2.1.23), which method comprises growing a cell of a transgenic plant according to the seventeenth aspect, and recovering the enzyme from the resulting plant.
An aspect relates to a composition comprising an enzyme with lactase activity (EC 3.2.1.23) as defined in any of the preceding aspects.
A final aspect relates to the use of an enzyme with lactase activity or use of a composition comprising an enzyme with lactase activity as defined in any of the preceding aspects, in the manufacture or processing of foodstuffs or feeds fit for consumption by lactose intolerant humans or animals, or for improvement of the nutritional value of an animal feed.
Definitions
In accordance with the present invention there may be employed conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (herein "Sambrook et al., 1989"); DNA Cloning: A Practical Approach, Volumes I and II (D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds. (1985)); Transcription And Translation (B.D. Hames & S.J. Higgins, eds. (1984)); Animal Cell Culture (R.I. Freshney, ed. (1986)); Immobilized Cells And Enzymes (IRL Press, (1986)); B. Perbal, A Practical Guide To Molecular Cloning (1984).
Isolated: When applied to a protein, the term "isolated" indicates that the protein is found in a condition other than its native environment, such as apart from blood and animal tissue. In a preferred form, the isolated protein is substantially free of other proteins, particularly other proteins of animal origin. It is preferred to provide the proteins in a highly purified form, i.e., greater than 95% pure, more preferably greater than 99% pure. When applied to a polynucleotide molecule, the term "isolated" indicates that the molecule is removed from its natural genetic milieu, and is thus free of other extraneous or unwanted coding sequences, and is in a form suitable for use within genetically engineered protein production systems. Such isolated molecules are those that are separated from their natural environment and include cDNA and genomic clones. Isolated DNA molecules of the present invention are free of other genes with which they are ordinarily associated, and may include naturally occurring 5' and 3' untranslated regions such as promoters and terminators. The identification of associated regions will be evident to one of ordinary skill in the art (see for example, Dynan and Tjian, Nature 316: 774-78, 1985).
A "polynucleotide" is a single- or double-stranded polymer of deoxyribonucleotide or ribonucleotide bases read from the 5' to the 3' end. Polynucleotides include RNA and DNA, and may be isolated from natural sources, synthesized in vitro, or prepared from a combination of natural and synthetic molecules. A "nucleic acid molecule" refers to the phosphate ester polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine; "RNA molecules") or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine; "DNA molecules") in either single-stranded form or a double-stranded helix. Double-stranded DNA-DNA, DNA-RNA and RNA-RNA helices are possible. The term nucleic acid molecule, and in particular DNA or RNA molecule, refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary or quaternary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear or circular DNA molecules (e.g., restriction fragments), plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5' to 3' direction along the nontranscribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA). A "recombinant DNA molecule" is a DNA molecule that has undergone a molecular biological manipulation.
A nucleic acid molecule "hybridizes" to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single-stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al., supra). The conditions of temperature and ionic strength determine the "stringency" of the hybridization.
A DNA "coding sequence" is a double-stranded DNA sequence which is transcribed and translated into a polypeptide in a cell in vitro or in vivo when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxyl) terminus. A coding sequence can include, but is not limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA sequences. If the coding sequence is intended for expression in a eukaryotic cell, a polyadenylation signal and transcription termination sequence will usually be located 3' to the coding sequence.
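The boundary rule in this definition (a start codon at the 5' terminus, a translation stop codon at the 3' terminus) can be illustrated with a short sketch. It assumes an intron-free sequence such as cDNA, the start codon ATG, and the three standard stop codons; the function name `coding_sequence` is hypothetical, and genomic fungal DNA with introns would need prior splicing or gene prediction.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def coding_sequence(dna: str):
    """Return (start, end, cds) for the first ATG that is followed by an
    in-frame stop codon, or None if no such open reading frame exists.

    `start` and `end` are 0-based, end-exclusive positions on the given
    strand; the stop codon is included in `cds`.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this ATG.
        for i in range(start, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return start, i + 3, dna[start:i + 3]
        # No in-frame stop after this ATG; try the next one.
        start = dna.find("ATG", start + 1)
    return None
```

Only the given strand is scanned; a fuller treatment would also examine the reverse complement and would typically prefer the longest open reading frame rather than the first.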
Expression vector: A DNA molecule, linear or circular, that comprises a segment encoding a polypeptide of interest operably linked to additional segments that provide for its transcription. Such additional segments may include promoter and terminator sequences, and optionally one or more origins of replication, one or more selectable markers, an enhancer, a polyadenylation signal, and the like. Expression vectors are generally derived from plasmid or viral DNA, or may contain elements of both.
Transcriptional and translational control sequences are DNA regulatory sequences, such as promoters, enhancers, terminators, and the like, that provide for the expression of a coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are control sequences.
A "secretory signal sequence" is a DNA sequence that encodes a polypeptide (a "secretory peptide") that, as a component of a larger polypeptide, directs the larger polypeptide through a secretory pathway of a cell in which it is synthesized. The larger polypeptide is commonly cleaved to remove the secretory peptide during transit through the secretory pathway.
The term "promoter" is used herein for its art-recognized meaning to denote a portion of a gene containing DNA sequences that provide for the binding of RNA polymerase and initiation of transcription. Promoter sequences are commonly, but not always, found in the 5' non-coding regions of genes. "Operably linked" , when referring to DNA segments, indicates that the segments are arranged so that they function in concert for their intended purposes, e.g. transcription initiates in the promoter and proceeds through the coding segment to the terminator.
A coding sequence is "under the control" of transcriptional and translational control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then trans-RNA spliced and translated into the protein encoded by the coding sequence.
A cell has been "transformed" by exogenous or heterologous DNA when the transfected DNA effects a phenotypic change. Preferably, the transforming DNA should be integrated (covalently linked) into chromosomal DNA making up the genome of the cell.
"Homologous recombination" refers to the insertion of a foreign DNA sequence of a vector in a chromosome. Preferably, the vector targets a specific chromosomal site for homologous recombination. For specific homologous recombination, the vector will contain sufficiently long regions of homology to sequences of the chromosome to allow complementary binding and incorporation of the vector into the chromosome. Longer regions of homology, and greater degrees of sequence similarity, may increase the efficiency of homologous recombination.
Nucleic Acid Sequence. The techniques used to isolate or clone a nucleic acid sequence encoding a polypeptide are known in the art and include isolation from genomic DNA, preparation from cDNA, or a combination thereof. The cloning of the nucleic acid sequences of the present invention from such genomic DNA can be effected, e.g., by using the well-known polymerase chain reaction (PCR) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features. See, e.g., Innis et al., 1990, A Guide to Methods and Application, Academic Press, New York. Other nucleic acid amplification procedures such as ligase chain reaction (LCR), ligated activated transcription (LAT) and nucleic acid sequence-based amplification (NASBA) may be used. The nucleic acid sequence may be cloned from a strain of the [Genus] producing the polypeptide, or another or related organism and thus, for example, may be an allelic or species variant of the polypeptide encoding region of the nucleic acid sequence.
An "isolated" nucleic acid sequence or DNA, as used herein, refers to a nucleic acid sequence which is essentially free of other nucleic acid sequences, e.g., at least about 20% pure, preferably at least about 40% pure, more preferably about 60% pure, even more preferably about 80% pure, most preferably about 90% pure, and even most preferably about 95% pure, as determined by agarose gel electrophoresis. For example, an isolated nucleic acid sequence can be obtained by standard cloning procedures used in genetic engineering to relocate the nucleic acid sequence from its natural location to a different site where it will be reproduced. The cloning procedures may involve excision and isolation of a desired nucleic acid fragment comprising the nucleic acid sequence encoding the polypeptide, insertion of the fragment into a vector molecule, and incorporation of the recombinant vector into a host cell where multiple copies or clones of the nucleic acid sequence will be replicated. The nucleic acid sequence may be of genomic, cDNA, RNA, semisynthetic, or synthetic origin, or any combination thereof.
The degree of identity between two nucleic acid sequences may be determined by means of computer programs known in the art such as GAP provided in the GCG program package (Needleman and Wunsch, 1970, Journal of Molecular Biology 48:443-453). For purposes of determining the degree of identity between two nucleic acid sequences for the present invention, GAP is used with the following settings: GAP creation penalty of 5.0 and GAP extension penalty of 0.3.
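Once two sequences have been aligned, for example by GAP with the settings given above, the degree of identity can be read off the alignment and compared with the 85%, 90%, 95% and 98% levels used in the aspects of the invention. The sketch below assumes the alignment has already been computed and is supplied as two gapped strings of equal length; it does not reimplement GAP. Leaving gapped columns out of the denominator is an assumption made here for illustration, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical positions over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def highest_threshold_met(pct: float):
    """Return the highest of 98, 95, 90, 85 that `pct` reaches, else None."""
    for threshold in (98, 95, 90, 85):
        if pct >= threshold:
            return threshold
    return None
```

With the toy alignment `ACGT-ACGTA` against `ACGTTACGAA`, eight of the nine ungapped columns match, so the identity is about 88.9% and the highest level reached is 85%.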
Nucleic Acid Construct as used herein is intended to indicate any nucleic acid molecule of cDNA, genomic DNA, synthetic DNA or RNA origin. The term "construct" is intended to indicate a nucleic acid segment which may be single- or double-stranded, and which may be based on a complete or partial naturally occurring nucleotide sequence encoding a polypeptide of interest. The construct may optionally contain other nucleic acid segments.
A nucleic acid construct of the invention encoding an enzyme of the invention may suitably be of genomic or cDNA origin, for instance obtained by preparing a genomic or cDNA library and screening for DNA sequences coding for all or part of the polypeptide by hybridization using synthetic oligonucleotide probes in accordance with standard techniques (cf. Sambrook et al., supra).
The nucleic acid construct of the invention encoding the polypeptide may also be prepared synthetically by established standard methods, e.g. the phosphoramidite method described by Beaucage and Caruthers, Tetrahedron Letters 22 (1981), 1859 - 1869, or the method described by Matthes et al., EMBO Journal 3 (1984), 801 - 805. According to the phosphoramidite method, oligonucleotides are synthesized, e.g. in an automatic DNA synthesizer, purified, annealed, ligated and cloned in suitable vectors.
Furthermore, the nucleic acid construct may be of mixed synthetic and genomic, mixed synthetic and cDNA or mixed genomic and cDNA origin prepared by ligating fragments of synthetic, genomic or cDNA origin (as appropriate) , the fragments corresponding to various parts of the entire nucleic acid construct, in accordance with standard techniques.
The nucleic acid construct may also be prepared by polymerase chain reaction using specific primers, for instance as described in US 4,683,202 or Saiki et al., Science 239 (1988), 487 - 491.
The term nucleic acid construct may be synonymous with the term expression cassette when the nucleic acid construct contains all the control sequences required for expression of a coding sequence of the present invention. The term "coding sequence" as defined herein is a sequence which is transcribed into mRNA and translated into a polypeptide of the present invention when placed under the control of the above mentioned control sequences. The boundaries of the coding sequence are generally determined by a translation start codon ATG at the 5'-terminus and a translation stop codon at the 3'-terminus. A coding sequence can include, but is not limited to, DNA, cDNA, and recombinant nucleic acid sequences. The term "control sequences" is defined herein to include all components which are necessary or advantageous for expression of the coding sequence of the nucleic acid sequence. Each control sequence may be native or foreign to the nucleic acid sequence encoding the polypeptide. Such control sequences include, but are not limited to, a leader, a polyadenylation sequence, a propeptide sequence, a promoter, a signal sequence, and a transcription terminator. At a minimum, the control sequences include a promoter, and transcriptional and translational stop signals. The control sequences may be provided with linkers for the purpose of introducing specific restriction sites facilitating ligation of the control sequences with the coding region of the nucleic acid sequence encoding a polypeptide.
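The boundary rule stated above (an ATG start codon at the 5'-terminus and an in-frame stop codon at the 3'-terminus) may be illustrated by a minimal sketch. The function name and the example sequence are hypothetical, the stop codons are those of the standard genetic code, and the sketch considers only the given strand of an intron-free DNA sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def coding_sequence_bounds(dna):
    """Return (start, end) of the first ATG...stop reading frame, or None.

    `start` is the index of the A of the ATG codon; `end` is one past the
    last base of the stop codon, so dna[start:end] is the coding sequence.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Step through codons in the frame set by this ATG.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3
        # No in-frame stop codon: try the next ATG.
        start = dna.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    example = "CCATGAAATTTTAGGG"  # hypothetical sequence
    bounds = coding_sequence_bounds(example)
    if bounds is not None:
        print(example[bounds[0]:bounds[1]])
```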
The control sequence may be an appropriate promoter sequence, a nucleic acid sequence which is recognized by a host cell for expression of the nucleic acid sequence. The promoter sequence contains transcription and translation control sequences which mediate the expression of the polypeptide. The promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of choice and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.
The control sequence may also be a suitable transcription terminator sequence, a sequence recognized by a host cell to terminate transcription. The terminator sequence is operably linked to the 3' terminus of the nucleic acid sequence encoding the polypeptide. Any terminator which is functional in the host cell of choice may be used in the present invention. The control sequence may also be a polyadenylation sequence, a sequence which is operably linked to the 3' terminus of the nucleic acid sequence and which, when transcribed, is recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation sequence which is functional in the host cell of choice may be used in the present invention.
The control sequence may also be a signal peptide-coding region, which codes for an amino acid sequence linked to the amino terminus of the polypeptide which can direct the expressed polypeptide into the secretory pathway of the host cell. The 5' end of the coding sequence of the nucleic acid sequence may inherently contain a signal peptide-coding region naturally linked in translation reading frame with the segment of the coding region which encodes the secreted polypeptide. Alternatively, the 5' end of the coding sequence may contain a signal peptide-coding region which is foreign to that portion of the coding sequence which encodes the secreted polypeptide. A foreign signal peptide-coding region may be required where the coding sequence does not normally contain a signal peptide-coding region. Alternatively, the foreign signal peptide-coding region may simply replace the natural signal peptide-coding region in order to obtain enhanced secretion of the enzyme relative to the natural signal peptide-coding region normally associated with the coding sequence. The signal peptide-coding region may be obtained from a glucoamylase or an amylase gene from an Aspergillus species, a lipase or proteinase gene from a Rhizomucor species, the gene for the alpha-factor from Saccharomyces cerevisiae, an amylase or a protease gene from a Bacillus species, or the calf preprochymosin gene. However, any signal peptide-coding region capable of directing the expressed polypeptide into the secretory pathway of a host cell of choice may be used in the present invention.
The control sequence may also be a propeptide coding region, which codes for an amino acid sequence positioned at the amino terminus of a polypeptide. The resultant polypeptide is known as a proenzyme or propolypeptide (or a zymogen in some cases). A propolypeptide is generally inactive and can be converted to mature active polypeptide by catalytic or autocatalytic cleavage of the propeptide from the propolypeptide. The propeptide coding region may be obtained from the Bacillus subtilis alkaline protease gene (aprE), the Bacillus subtilis neutral protease gene (nprT), the Saccharomyces cerevisiae alpha-factor gene, or the Myceliophthora thermophilum laccase gene (WO 95/33836).
The nucleic acid constructs of the present invention may also comprise one or more nucleic acid sequences which encode one or more factors that are advantageous in the expression of the polypeptide, e.g., an activator (e.g., a trans-acting factor), a chaperone, and a processing protease. Any factor that is functional in the host cell of choice may be used in the present invention. The nucleic acids encoding one or more of these factors are not necessarily in tandem with the nucleic acid sequence encoding the polypeptide.
An activator is a protein which activates transcription of a nucleic acid sequence encoding a polypeptide (Kudla et al., 1990, EMBO Journal 9:1355-1364; Jarai and Buxton, 1994, Current Genetics 26:238-244; Verdier, 1990, Yeast 6:271-297).
The nucleic acid sequence encoding an activator may be obtained from the genes encoding Bacillus stearothermophilus NprA (nprA), Saccharomyces cerevisiae heme activator protein 1 (hap1), Saccharomyces cerevisiae galactose metabolizing protein 4 (gal4), and Aspergillus nidulans ammonia regulation protein (areA). For further examples, see Verdier, 1990, supra and MacKenzie et al., 1993, Journal of General Microbiology 139:2295-2307.
A chaperone is a protein which assists another polypeptide in folding properly (Hartl et al., 1994, TIBS 19:20-25; Bergeron et al., 1994, TIBS 19:124-128; Demolder et al., 1994, Journal of Biotechnology 32:179-189; Craig, 1993, Science 260:1902-1903; Gething and Sambrook, 1992, Nature 355:33-45; Puig and Gilbert, 1994, Journal of Biological Chemistry 269:7764-7771; Wang and Tsou, 1993, The FASEB Journal 7:1515-1517; Robinson et al., 1994, Bio/Technology 1:381-384). The nucleic acid sequence encoding a chaperone may be obtained from the genes encoding Bacillus subtilis GroE proteins, Aspergillus oryzae protein disulphide isomerase, Saccharomyces cerevisiae calnexin, Saccharomyces cerevisiae BiP/GRP78, and Saccharomyces cerevisiae Hsp70. For further examples, see Gething and Sambrook, 1992, supra, and Hartl et al., 1994, supra.
A processing protease is a protease that cleaves a propeptide to generate a mature biochemically active polypeptide (Enderlin and Ogrydziak, 1994, Yeast 10:67-79; Fuller et al., 1989, Proceedings of the National Academy of Sciences USA 86:1434-1438; Julius et al., 1984, Cell 37:1075-1089; Julius et al., 1983, Cell 32:839-852). The nucleic acid sequence encoding a processing protease may be obtained from the genes encoding Aspergillus niger Kex2, Saccharomyces cerevisiae dipeptidylaminopeptidase, Saccharomyces cerevisiae Kex2, and Yarrowia lipolytica dibasic processing endoprotease (xpr6).
It may also be desirable to add regulatory sequences which allow the regulation of the expression of the polypeptide relative to the growth of the host cell. Examples of regulatory systems are those which cause the expression of the gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Regulatory systems in prokaryotic systems would include the lac, tac, and trp operator systems. In yeast, the ADH2 system or GAL1 system may be used. In filamentous fungi, the TAKA alpha-amylase promoter, Aspergillus niger glucoamylase promoter, and the Aspergillus oryzae glucoamylase promoter may be used as regulatory sequences. Other examples of regulatory sequences are those allowing for gene amplification. In eukaryotic systems, these include the dihydrofolate reductase gene which is amplified in the presence of methotrexate, and the metallothionein genes which are amplified with heavy metals. In these cases, the nucleic acid sequence encoding the polypeptide would be placed in tandem with the regulatory sequence.
Promoters for directing the transcription of the nucleic acid constructs of the present invention, especially in a bacterial host cell, are exemplified by the promoters obtained from the E. coli lac operon, the Streptomyces coelicolor agarase gene (dagA), the Bacillus subtilis levansucrase gene (sacB), the Bacillus subtilis alkaline protease gene, the Bacillus licheniformis alpha-amylase gene (amyL), the Bacillus stearothermophilus maltogenic amylase gene (amyM), the Bacillus amyloliquefaciens alpha-amylase gene (amyQ), the Bacillus amyloliquefaciens BAN amylase gene, the Bacillus licheniformis penicillinase gene (penP), the Bacillus subtilis xylA and xylB genes, the prokaryotic beta-lactamase gene (Villa-Komaroff et al., 1978, Proceedings of the National Academy of Sciences USA 75:3727-3731) and the Bacillus pumilus xylosidase gene, as well as by the tac promoter (DeBoer et al., 1983, Proceedings of the National Academy of Sciences USA 80:21-25), the phage Lambda PR or PL promoters, and the E. coli lac, trp or tac promoters. Further promoters are described in "Useful proteins from recombinant bacteria" in Scientific American, 1980, 242:74-94; and in Sambrook et al., 1989, supra.
Examples of suitable promoters for directing the transcription of the nucleic acid constructs of the present invention in a filamentous fungal host cell are promoters obtained from the genes encoding Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans acetamidase, Fusarium oxysporum trypsin-like protease (as described in U.S. Patent No. 4,288,627, which is incorporated herein by reference), and hybrids thereof.
Particularly preferred promoters for use in filamentous fungal host cells are the TAKA amylase, NA2-tpi (a hybrid of the promoters from the genes encoding Aspergillus niger neutral alpha-amylase and Aspergillus oryzae triose phosphate isomerase), and glaA promoters. Further suitable promoters for use in filamentous fungus host cells are the ADH3 promoter (McKnight et al., The EMBO J. 4 (1985), 2093 - 2099) or the tpiA promoter.
Examples of suitable promoters for use in yeast host cells include promoters from yeast glycolytic genes (Hitzeman et al., J. Biol. Chem. 255 (1980), 12073 - 12080; Alber and Kawasaki, J. Mol. Appl. Gen. 1 (1982), 419 - 434) or alcohol dehydrogenase genes (Young et al., in Genetic Engineering of Microorganisms for Chemicals (Hollaender et al., eds.), Plenum Press, New York, 1982), or the TPI1 (US 4,599,311) or ADH2-4c (Russell et al., Nature 304 (1983), 652 - 654) promoters.
Further useful promoters are obtained from the Saccharomyces cerevisiae enolase (ENO-1) gene, the Saccharomyces cerevisiae galactokinase gene (GAL1), the Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase genes (ADH2/GAP), and the Saccharomyces cerevisiae 3-phosphoglycerate kinase gene. Other useful promoters for yeast host cells are described by Romanos et al., 1992, Yeast 8:423-488.
In a mammalian host cell, useful promoters include viral promoters such as those from Simian Virus 40 (SV40), Rous sarcoma virus (RSV), adenovirus, and bovine papilloma virus (BPV).
Examples of suitable promoters for directing the transcription of the DNA encoding the polypeptide of the invention in mammalian cells are the SV40 promoter (Subramani et al., Mol. Cell Biol. 1 (1981), 854 - 864), the MT-1 (metallothionein gene) promoter (Palmiter et al., Science 222 (1983), 809 - 814) or the adenovirus 2 major late promoter.
Examples of suitable promoters for use in insect cells are the polyhedrin promoter (US 4,745,051; Vasuvedan et al., FEBS Lett. 311 (1992), 7 - 11), the P10 promoter (J.M. Vlak et al., J. Gen. Virology 69, 1988, pp. 765-776), the Autographa californica polyhedrosis virus basic protein promoter (EP 397 485), the baculovirus immediate early gene 1 promoter (US 5,155,037; US 5,162,222), or the baculovirus 39K delayed-early gene promoter (US 5,155,037; US 5,162,222).
Terminators
Preferred terminators for filamentous fungal host cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, Aspergillus niger alpha-glucosidase, and Fusarium oxysporum trypsin-like protease. Further suitable terminators for fungal hosts are the TPI1 (Alber and Kawasaki, op. cit.) or ADH3 (McKnight et al., op. cit.) terminators.
Preferred terminators for yeast host cells are obtained from the genes encoding Saccharomyces cerevisiae enolase, Saccharomyces cerevisiae cytochrome C (CYC1), or Saccharomyces cerevisiae glyceraldehyde-3-phosphate dehydrogenase. Other useful terminators for yeast host cells are described by Romanos et al., 1992, supra.
Polyadenylation Signals
Preferred polyadenylation sequences for filamentous fungal host cells are obtained from the genes encoding Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus nidulans anthranilate synthase, and Aspergillus niger alpha-glucosidase.
Useful polyadenylation sequences for yeast host cells are described by Guo and Sherman, 1995, Molecular Cellular Biology 15:5983-5990.
Polyadenylation sequences for mammalian host cells, such as those from SV40 or the adenovirus 5 E1b region, are well known in the art.
Signal Sequences
An effective signal peptide-coding region for bacterial host cells is the signal peptide-coding region obtained from the maltogenic amylase gene from Bacillus NCIB 11837, the Bacillus stearothermophilus alpha-amylase gene, the Bacillus licheniformis subtilisin gene, the Bacillus licheniformis beta-lactamase gene, the Bacillus stearothermophilus neutral protease genes (nprT, nprS, nprM), and the Bacillus subtilis PrsA gene. Further signal peptides are described by Simonen and Palva, 1993, Microbiological Reviews 57:109-137.
An effective signal peptide-coding region for filamentous fungal host cells is the signal peptide-coding region obtained from the Aspergillus oryzae TAKA amylase gene, the Aspergillus niger neutral amylase gene, the Rhizomucor miehei aspartic proteinase gene, the Humicola lanuginosa cellulase or lipase gene, the Rhizomucor miehei lipase or protease gene, or an Aspergillus sp. amylase or glucoamylase gene. The signal peptide is preferably derived from a gene encoding A. oryzae TAKA amylase, A. niger neutral alpha-amylase, A. niger acid-stable amylase, or A. niger glucoamylase.
Useful signal peptides for yeast host cells are obtained from the genes for Saccharomyces cerevisiae alpha-factor and Saccharomyces cerevisiae invertase. Other useful signal peptide-coding regions are described by Romanos et al., 1992, supra. For secretion from yeast cells, the secretory signal sequence may encode any signal peptide which ensures efficient direction of the expressed polypeptide into the secretory pathway of the cell. The signal peptide may be a naturally occurring signal peptide, or a functional part thereof, or it may be a synthetic peptide. Suitable signal peptides have been found to be the alpha-factor signal peptide (cf. US 4,870,008), the signal peptide of mouse salivary amylase (cf. O. Hagenbuchle et al., Nature 289, 1981, pp. 643-646), a modified carboxypeptidase signal peptide (cf. L.A. Valls et al., Cell 48, 1987, pp. 887-897), the yeast BAR1 signal peptide (cf. WO 87/02670), or the yeast aspartic protease 3 (YAP3) signal peptide (cf. M. Egel-Mitani et al., Yeast 6, 1990, pp. 127-137).
For efficient secretion in yeast, a sequence encoding a leader peptide may also be inserted downstream of the signal sequence and upstream of the DNA sequence encoding the polypeptide. The function of the leader peptide is to allow the expressed polypeptide to be directed from the endoplasmic reticulum to the Golgi apparatus and further to a secretory vesicle for secretion into the culture medium (i.e. exportation of the polypeptide across the cell wall or at least through the cellular membrane into the periplasmic space of the yeast cell). The leader peptide may be the yeast alpha-factor leader (the use of which is described in e.g. US 4,546,082, EP 16 201, EP 123 294, EP 123 544 and EP 163 529). Alternatively, the leader peptide may be a synthetic leader peptide, which is to say a leader peptide not found in nature. Synthetic leader peptides may, for instance, be constructed as described in WO 89/02463 or WO 92/11378.
For use in insect cells, the signal peptide may conveniently be derived from an insect gene (cf. WO 90/05783), such as the lepidopteran Manduca sexta adipokinetic hormone precursor signal peptide (cf. US 5,023,328).
Expression Vectors
The present invention also relates to recombinant expression vectors comprising a nucleic acid sequence of the present invention, a promoter, and transcriptional and translational stop signals. The various nucleic acid and control sequences described above may be joined together to produce a recombinant expression vector which may include one or more convenient restriction sites to allow for insertion or substitution of the nucleic acid sequence encoding the polypeptide at such sites. Alternatively, the nucleic acid sequence of the present invention may be expressed by inserting the nucleic acid sequence or a nucleic acid construct comprising the sequence into an appropriate vector for expression. In creating the expression vector, the coding sequence is located in the vector so that the coding sequence is operably linked with the appropriate control sequences for expression, and possibly secretion.
The recombinant expression vector may be any vector (e.g., a plasmid or virus) which can be conveniently subjected to recombinant DNA procedures and can bring about the expression of the nucleic acid sequence. The choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vectors may be linear or closed circular plasmids. The vector may be an autonomously replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome. The vector may contain any means for assuring self-replication. Alternatively, the vector may be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. The vector system may be a single vector or plasmid or two or more vectors or plasmids which together contain the total DNA to be introduced into the genome of the host cell, or a transposon.
The vectors of the present invention preferably contain one or more selectable markers which permit easy selection of transformed cells. A selectable marker is a gene the product of which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like. Examples of bacterial selectable markers are the dal genes from Bacillus subtilis or Bacillus licheniformis, or markers which confer antibiotic resistance such as ampicillin, kanamycin, chloramphenicol, tetracycline, neomycin, hygromycin or methotrexate resistance. A frequently used mammalian marker is the dihydrofolate reductase gene (DHFR). Suitable markers for yeast host cells are ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and URA3.
A selectable marker for use in a filamentous fungal host cell may be selected from the group including, but not limited to, amdS (acetamidase), argB (ornithine carbamoyltransferase), bar (phosphinothricin acetyltransferase), hygB (hygromycin phosphotransferase), niaD (nitrate reductase), pyrG (orotidine-5'-phosphate decarboxylase), sC (sulfate adenyltransferase), trpC (anthranilate synthase), and glufosinate resistance markers, as well as equivalents from other species. Preferred for use in an Aspergillus cell are the amdS and pyrG markers of Aspergillus nidulans or Aspergillus oryzae and the bar marker of Streptomyces hygroscopicus. Furthermore, selection may be accomplished by co-transformation, e.g., as described in WO 91/17243, where the selectable marker is on a separate vector.
The vectors of the present invention preferably contain an element(s) that permits stable integration of the vector into the host cell genome or autonomous replication of the vector in the cell independent of the genome of the cell. The vectors of the present invention may be integrated into the host cell genome when introduced into a host cell. For integration, the vector may rely on the nucleic acid sequence encoding the polypeptide or any other element of the vector for stable integration of the vector into the genome by homologous or nonhomologous recombination. Alternatively, the vector may contain additional nucleic acid sequences for directing integration by homologous recombination into the genome of the host cell. The additional nucleic acid sequences enable the vector to be integrated into the host cell genome at a precise location(s) in the chromosome(s). To increase the likelihood of integration at a precise location, the integrational elements should preferably contain a sufficient number of nucleic acids, such as 100 to 1,500 base pairs, preferably 400 to 1,500 base pairs, and most preferably 800 to 1,500 base pairs, which are highly homologous with the corresponding target sequence to enhance the probability of homologous recombination. The integrational elements may be any sequence that is homologous with the target sequence in the genome of the host cell. Furthermore, the integrational elements may be non-encoding or encoding nucleic acid sequences. On the other hand, the vector may be integrated into the genome of the host cell by non-homologous recombination. These nucleic acid sequences may be any sequence that is homologous with a target sequence in the genome of the host cell, and, furthermore, may be non-encoding or encoding sequences.
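The length ranges given above for the integrational elements may be expressed as a simple check, sketched below. The function name and the tier labels are illustrative assumptions; only the numerical ranges (100 to 1,500, 400 to 1,500 and 800 to 1,500 base pairs) are taken from the text, and the sketch does not assess the degree of homology with the target sequence.

```python
def integration_element_tier(length_bp):
    """Classify an integrational element by its length in base pairs.

    Ranges follow the text: 100-1,500 bp, preferably 400-1,500 bp,
    most preferably 800-1,500 bp. Labels are illustrative only.
    """
    if 800 <= length_bp <= 1500:
        return "most preferred"
    if 400 <= length_bp <= 1500:
        return "preferred"
    if 100 <= length_bp <= 1500:
        return "suitable"
    return "outside stated range"


if __name__ == "__main__":
    for arm in (50, 150, 500, 900, 2000):  # hypothetical element lengths
        print(arm, integration_element_tier(arm))
```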
For autonomous replication, the vector may further comprise an origin of replication enabling the vector to replicate autonomously in the host cell in question. Examples of bacterial origins of replication are the origins of replication of plasmids pBR322, pUC19, pACYC177, pACYC184, pUB110, pE194, pTA1060, and pAMβ1. Examples of origins of replication for use in a yeast host cell are the 2 micron origin of replication, the combination of CEN6 and ARS4, and the combination of CEN3 and ARS1. The origin of replication may be one having a mutation which makes its functioning temperature-sensitive in the host cell (see, e.g., Ehrlich, 1978, Proceedings of the National Academy of Sciences USA 75:1433).
More than one copy of a nucleic acid sequence encoding a polypeptide of the present invention may be inserted into the host cell to amplify expression of the nucleic acid sequence. Stable amplification of the nucleic acid sequence can be obtained by integrating at least one additional copy of the sequence into the host cell genome using methods well known in the art and selecting for transformants. The procedures used to ligate the elements described above to construct the recombinant expression vectors of the present invention are well known to one skilled in the art (see, e.g., Sambrook et al., 1989, supra).
Host Cells
The present invention also relates to recombinant microbial host cells, comprising a nucleic acid sequence of the invention, which are advantageously used in the recombinant production of the polypeptides. The term "host cell" encompasses any progeny of a parent cell which is not identical to the parent cell due to mutations that occur during replication.
The cell is preferably transformed with a vector comprising a nucleic acid sequence of the invention followed by integration of the vector into the host chromosome. "Transformation" means introducing a vector comprising a nucleic acid sequence of the present invention into a host cell so that the vector is maintained as a chromosomal integrant or as a self-replicating extra-chromosomal vector. Integration is generally considered to be an advantage as the nucleic acid sequence is more likely to be stably maintained in the cell. Integration of the vector into the host chromosome may occur by homologous or non-homologous recombination as described above.
The transformation of a bacterial host cell may, for instance, be effected by protoplast transformation (see, e.g., Chang and Cohen, 1979, Molecular General Genetics 168:111-115), by using competent cells (see, e.g., Young and Spizizen, 1961, Journal of Bacteriology 81:823-829, or Dubnau and Davidoff-Abelson, 1971, Journal of Molecular Biology 56:209-221), by electroporation (see, e.g., Shigekawa and Dower, 1988, Biotechniques 6:742-751), or by conjugation (see, e.g., Koehler and Thorne, 1987, Journal of Bacteriology 169:5271-5278).
The host cell may be a eukaryote, such as a mammalian cell, an insect cell, a plant cell or a fungal cell. Useful mammalian cells include Chinese hamster ovary (CHO) cells, HeLa cells, baby hamster kidney (BHK) cells, COS cells, or any number of other immortalized cell lines available, e.g., from the American Type Culture Collection.
Examples of suitable mammalian cell lines are the COS (ATCC CRL 1650 and 1651), BHK (ATCC CRL 1632, 10314 and 1573, ATCC CCL 10), CHL (ATCC CCL39) or CHO (ATCC CCL 61) cell lines. Methods of transfecting mammalian cells and expressing DNA sequences introduced in the cells are described in e.g. Kaufman and Sharp, J. Mol. Biol. 159 (1982), 601 - 621; Southern and Berg, J. Mol. Appl. Genet. 1 (1982), 327 - 341; Loyter et al., Proc. Natl. Acad. Sci. USA 79 (1982), 422 - 426; Wigler et al., Cell 14 (1978), 725; Corsaro and Pearson, Somatic Cell Genetics 7 (1981), 603; Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons, Inc., N.Y., 1987; Hawley-Nelson et al., Focus 15 (1993), 73; Ciccarone et al., Focus 15 (1993), 80; Graham and van der Eb, Virology 52 (1973), 456; and Neumann et al., EMBO J. 1 (1982), 841 - 845.
In a preferred embodiment, the host cell is a fungal cell. "Fungi" as used herein includes the phyla Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota (as defined by Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK) as well as the Oomycota (as cited in Hawksworth et al., 1995, supra, page 171) and all mitosporic fungi (Hawksworth et al., 1995, supra). Representative groups of Ascomycota include, e.g., Neurospora, Eupenicillium (=Penicillium), Emericella (=Aspergillus), Eurotium (=Aspergillus), and the true yeasts listed above. Examples of Basidiomycota include mushrooms, rusts, and smuts. Representative groups of Chytridiomycota include, e.g., Allomyces, Blastocladiella, Coelomomyces, and aquatic fungi. Representative groups of Oomycota include, e.g., Saprolegniomycetous aquatic fungi (water molds) such as Achlya. Examples of mitosporic fungi include Aspergillus, Penicillium, Candida, and Alternaria. Representative groups of Zygomycota include, e.g., Rhizopus and Mucor.
In a preferred embodiment, the fungal host cell is a yeast-like cell. "Yeast" as used herein includes ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, and yeast belonging to the Fungi Imperfecti (Blastomycetes). The ascosporogenous yeasts are divided into the families Spermophthoraceae and Saccharomycetaceae. The latter is comprised of four subfamilies, Schizosaccharomycoideae (e.g., genus Schizosaccharomyces), Nadsonioideae, Lipomycoideae, and Saccharomycoideae (e.g., genera Pichia, Kluyveromyces and Saccharomyces). The basidiosporogenous yeasts include the genera Leucosporidium, Rhodosporidium, Sporidiobolus, Filobasidium, and Filobasidiella. Yeast belonging to the Fungi Imperfecti is divided into two families, Sporobolomycetaceae (e.g., genera Sporobolomyces and Bullera) and Cryptococcaceae (e.g., genus Candida). Since the classification of yeast may change in the future, for the purposes of this invention, yeast shall be defined as described in Biology and Activities of Yeast (Skinner, F.A., Passmore, S.M., and Davenport, R.R., eds, Soc. App. Bacteriol. Symposium Series No. 9, 1980). The biology of yeast and manipulation of yeast genetics are well known in the art (see, e.g., Biochemistry and Genetics of Yeast, Bacil, M., Horecker, B.J., and Stopani, A.O.M., editors, 2nd edition, 1987; The Yeasts, Rose, A.H., and Harrison, J.S., editors, 2nd edition, 1987; and The Molecular Biology of the Yeast Saccharomyces, Strathern et al., editors, 1981).
The yeast host cell may be selected from a cell of a species of Candida, Kluyveromyces, Saccharomyces, Schizosaccharomyces, Pichia, Hansenula, or Yarrowia. In a preferred embodiment, the yeast host cell is a Saccharomyces carlsbergensis, Saccharomyces cerevisiae, Saccharomyces diastaticus, Saccharomyces douglasii, Saccharomyces kluyveri, Saccharomyces norbensis or Saccharomyces oviformis cell. Other useful yeast host cells are a Kluyveromyces lactis, Kluyveromyces fragilis, Hansenula polymorpha, Pichia pastoris, Yarrowia lipolytica, Schizosaccharomyces pombe, Ustilago maydis, Candida maltosa, Pichia guilliermondii and Pichia methanolica cell (cf. Gleeson et al., J. Gen. Microbiol. 132, 1986, pp. 3459-3465; US 4,882,279 and US 4,879,231).
In a preferred embodiment, the fungal host cell is a filamentous fungal cell. "Filamentous fungi" include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., 1995, supra). The filamentous fungi are characterized by a vegetative mycelium composed of chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides. Vegetative growth is by hyphal elongation and carbon catabolism is obligately aerobic. In contrast, vegetative growth by yeasts such as Saccharomyces cerevisiae is by budding of a unicellular thallus and carbon catabolism may be fermentative. In a more preferred embodiment, the filamentous fungal host cell is a cell of a species of, but not limited to, Acremonium, Aspergillus, Fusarium, Humicola, Mucor, Myceliophthora, Neurospora, Penicillium, Thielavia, Tolypocladium, and Trichoderma or a teleomorph or synonym thereof. In an even more preferred embodiment, the filamentous fungal host cell is an Aspergillus cell. In another even more preferred embodiment, the filamentous fungal host cell is an Acremonium cell. In another even more preferred embodiment, the filamentous fungal host cell is a Fusarium cell.
In another even more preferred embodiment, the filamentous fungal host cell is a Humicola cell. In another even more preferred embodiment, the filamentous fungal host cell is a Mucor cell. In another even more preferred embodiment, the filamentous fungal host cell is a Myceliophthora cell. In another even more preferred embodiment, the filamentous fungal host cell is a Neurospora cell. In another even more preferred embodiment, the filamentous fungal host cell is a Penicillium cell. In another even more preferred embodiment, the filamentous fungal host cell is a Thielavia cell. In another even more preferred embodiment, the filamentous fungal host cell is a Tolypocladium cell. In another even more preferred embodiment, the filamentous fungal host cell is a Trichoderma cell. In a most preferred embodiment, the filamentous fungal host cell is an Aspergillus awamori, Aspergillus foetidus, Aspergillus japonicus, Aspergillus niger, Aspergillus nidulans or Aspergillus oryzae cell. In another most preferred embodiment, the filamentous fungal host cell is a Fusarium cell of the section Discolor (also known as the section Fusarium). For example, the filamentous fungal parent cell may be a Fusarium bactridioides, Fusarium cerealis, Fusarium crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium graminum, Fusarium heterosporum, Fusarium negundi, Fusarium reticulatum, Fusarium roseum, Fusarium sambucinum, Fusarium sarcochroum, Fusarium sulphureum, or Fusarium trichothecioides cell. In another preferred embodiment, the filamentous fungal parent cell is a Fusarium strain of the section Elegans, e.g., Fusarium oxysporum. In another most preferred embodiment, the filamentous fungal host cell is a Humicola insolens or Humicola lanuginosa cell. In another most preferred embodiment, the filamentous fungal host cell is a Mucor miehei cell. In another most preferred embodiment, the filamentous fungal host cell is a Myceliophthora thermophilum cell.
In another most preferred embodiment, the filamentous fungal host cell is a Neurospora crassa cell. In another most preferred embodiment, the filamentous fungal host cell is a Penicillium purpurogenum cell. In another most preferred embodiment, the filamentous fungal host cell is a Thielavia terrestris cell or an Acremonium chrysogenum cell. In another most preferred embodiment, the Trichoderma cell is a Trichoderma harzianum, Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma reesei or Trichoderma viride cell. The use of Aspergillus spp. for the expression of proteins is described in, e.g., EP 272 277, EP 230 023.
Transformation
Fungal cells may be transformed by a process involving protoplast formation, transformation of the protoplasts, and regeneration of the cell wall in a manner known per se. Suitable procedures for transformation of Aspergillus host cells are described in EP 238 023 and Yelton et al., 1984, Proceedings of the National Academy of Sciences USA 81:1470-1474. A suitable method of transforming Fusarium species is described by Malardier et al., 1989, Gene 78:147-156 or in copending US Serial No. 08/269,449. Examples of other fungal cells are cells of filamentous fungi, e.g. Aspergillus spp., Neurospora spp., Fusarium spp. or Trichoderma spp., in particular strains of A. oryzae, A. nidulans or A. niger. The use of Aspergillus spp. for the expression of proteins is described in, e.g., EP 272 277, EP 230 023. The transformation of F. oxysporum may, for instance, be carried out as described by Malardier et al., 1989, Gene 78:147-156.
Yeast may be transformed using the procedures described by Becker and Guarente, In Abelson, J.N. and Simon, M.I., editors, Guide to Yeast Genetics and Molecular Biology, Methods in Enzymology, Volume 194, pp 182-187, Academic Press, Inc., New York; Ito et al., 1983, Journal of Bacteriology 153:163; and Hinnen et al., 1978, Proceedings of the National Academy of Sciences USA 75:1920. Mammalian cells may be transformed by direct uptake using the calcium phosphate precipitation method of Graham and Van der Eb (1978, Virology 52:546). Transformation of insect cells and production of heterologous polypeptides therein may be performed as described in US 4,745,051; US 4,775,624; US 4,879,236; US 5,155,037; US 5,162,222; and EP 397,485, all of which are incorporated herein by reference. The insect cell line used as the host may suitably be a Lepidoptera cell line, such as Spodoptera frugiperda cells or Trichoplusia ni cells (cf. US 5,077,214). Culture conditions may suitably be as described in, for instance, WO 89/01029 or WO 89/01028, or any of the aforementioned references.
Transgenic animals
It is also within the scope of the present invention to employ transgenic animal technology to produce the present polypeptide. A transgenic animal is one in whose genome a heterologous DNA sequence has been introduced. In particular, the polypeptide of the invention may be expressed in the mammary glands of a non-human female mammal, in particular one which is known to produce large quantities of milk. Examples of preferred mammals are livestock animals such as goats, sheep and cattle, although smaller mammals such as mice, rabbits or rats may also be employed. The DNA sequence encoding the present polypeptide may be introduced into the animal by any one of the methods previously described for the purpose. For instance, to obtain expression in a mammary gland, a transcription promoter from a milk protein gene is used. Milk protein genes include the genes encoding casein (cf. US 5,304,489), beta-lactoglobulin, alpha-lactalbumin and whey acidic protein. The currently preferred promoter is the beta-lactoglobulin promoter (cf. Whitelaw et al., Biochem J. 286, 1992, pp. 31-39).
It is generally recognized in the art that DNA sequences lacking introns are poorly expressed in transgenic animals in comparison with those containing introns (cf. Brinster et al., Proc. Natl. Acad. Sci. USA 85, 1988, pp. 836-840; Palmiter et al., Proc. Natl. Acad. Sci. USA 88, 1991, pp. 478-482; Whitelaw et al., Transgenic Res. 1, 1991, pp. 3-13; WO 89/01343; WO 91/02318). For expression in transgenic animals, it is therefore preferred, whenever possible, to use genomic sequences containing all or some of the native introns of the gene encoding the polypeptide of interest. It may also be preferred to include at least some introns from, e.g., the beta-lactoglobulin gene. One such region is a DNA segment which provides for intron splicing and RNA polyadenylation from the 3' non-coding region of the ovine beta-lactoglobulin gene. When substituted for the native 3' non-coding sequences of a gene, this segment may enhance and stabilise expression levels of the polypeptide of interest. It may also be possible to replace the region surrounding the initiation codon of the polypeptide of interest with corresponding sequences of a milk protein gene. Such replacement provides a putative tissue-specific initiation environment to enhance expression.
For expression of the present polypeptide in transgenic animals, a nucleotide sequence encoding the polypeptide is operably linked to additional DNA sequences required for its expression to produce expression units. Such additional sequences include a promoter as indicated above, as well as sequences providing for termination of transcription and polyadenylation of mRNA. The expression unit further includes a DNA sequence encoding a secretory signal sequence operably linked to the sequence encoding the polypeptide. The secretory signal sequence may be one native to the polypeptide or may be that of another protein such as a milk protein (cf. von Heijne et al., Nucl. Acids Res. 14, 1986, pp. 4683-4690; and US 4,873,316).
Construction of the expression unit for use in transgenic animals may conveniently be done by inserting a DNA sequence encoding the present polypeptide into a vector containing the additional DNA sequences, although the expression unit may be constructed by essentially any sequence of ligations. It is particularly convenient to provide a vector containing a DNA sequence encoding a milk protein and to replace the coding region for the milk protein with a DNA sequence coding for the present polypeptide, thereby creating a fusion which includes expression control sequences of the milk protein gene.
The expression unit is then introduced into fertilized ova or early-stage embryos of the selected host species. Introduction of heterologous DNA may be carried out in a number of ways, including microinjection (cf. US 4,873,191), retroviral infection (cf. Jaenisch, Science 240, 1988, pp. 1468-1474) or site-directed integration using embryonic stem cells (reviewed by Bradley et al., Bio/Technology 10, 1992, pp. 534-539). The ova are then implanted into the oviducts or uteri of pseudopregnant females and allowed to develop to term. Offspring carrying the introduced DNA in their germ line can pass the DNA on to their progeny, allowing the development of transgenic herds.
General procedures for producing transgenic animals are known in the art, cf. for instance, Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual, Cold Spring Harbor Laboratory, 1986; Simons et al., Bio/Technology 6, 1988, pp. 179-183; Wall et al., Biol. Reprod. 32, 1985, pp. 645-651; Buhler et al., Bio/Technology 8, 1990, pp. 140-143; Ebert et al., Bio/Technology 6: 179-183, 1988; Krimpenfort et al., Bio/Technology 9: 844-847, 1991; Wall et al., J. Cell. Biochem. 49: 113-120, 1992; US 4,873,191; US 4,873,316; WO 88/00239; WO 90/05188; WO 92/11757 and GB 87/00458. Techniques for introducing heterologous DNA sequences into mammals and their germ cells were originally developed in the mouse. See, e.g., Gordon et al., Proc. Natl. Acad. Sci. USA 77: 7380-7384, 1980; Gordon and Ruddle, Science 214: 1244-1246, 1981; Palmiter and Brinster, Cell 41: 343-345, 1985; Brinster et al., Proc. Natl. Acad. Sci. USA 82: 4438-4442, 1985; and Hogan et al. (ibid.). These techniques were subsequently adapted for use with larger animals, including livestock species (see e.g., WO 88/00239, WO 90/01588 and WO 92/11757; and Simons et al., Bio/Technology 6: 179-183, 1988). To summarize, in the most efficient route used to date in the generation of transgenic mice or livestock, several hundred linear molecules of the DNA of interest are injected into one of the pro-nuclei of a fertilized egg according to techniques which have become standard in the art. Injection of DNA into the cytoplasm of a zygote can also be employed.
Transgenic plants
Production in transgenic plants may also be employed. It has previously been described to introduce DNA sequences into plants, which sequences code for protein products imparting to the transformed plants certain desirable properties such as increased resistance against pests, pathogens, herbicides or stress conditions (cf. for instance EP 90 033, EP 131 620, EP 205 518, EP 270 355, WO 89/04371 or WO 90/02804), or an improved nutrient value of the plant proteins (cf. for instance EP 90 033, EP 205 518 or WO 89/04371). Furthermore, WO 89/12386 discloses the transformation of plant cells with a gene coding for levansucrase or dextransucrase, regeneration of the plant (especially a tomato plant) from the cell resulting in fruit products with altered viscosity characteristics.
In the plant cell, the DNA sequence encoding the present polypeptide is under the control of a regulatory sequence which directs the expression of the polypeptide from the DNA sequence in plant cells and intact plants. The regulatory sequence may be either endogenous or heterologous to the host plant cell.
The regulatory sequence may comprise a promoter capable of directing the transcription of the DNA sequence encoding the polypeptide in plants. Examples of promoters which may be used according to the invention are the 35S RNA promoter from cauliflower mosaic virus (CaMV), the class I patatin gene B33 promoter, the ST-LS1 gene promoter, promoters conferring seed-specific expression, e.g. the phaseolin promoter, or promoters which are activated on wounding, such as the promoter of the proteinase inhibitor II gene or the wun1 or wun2 genes.
The promoter may be operably connected to an enhancer sequence, the purpose of which is to ensure increased transcription of the DNA sequence encoding the polypeptide. Examples of useful enhancer sequences are enhancers from the 5'-upstream region of the 35S RNA of CaMV, the 5'-upstream region of the ST-LS1 gene, the 5'-upstream region of the Cab gene from wheat, the 5'-upstream region of the 1'- and 2'-genes of the TR-DNA of the Ti plasmid pTiACH5, the 5'-upstream region of the octopine synthase gene, the 5'-upstream region of the leghemoglobin gene, etc.
The regulatory sequence may also comprise a terminator capable of terminating the transcription of the DNA sequence encoding the polypeptide in plants. Examples of suitable terminators are the terminator of the octopine synthase gene of the T-DNA of the Ti plasmid pTiACH5 of Agrobacterium tumefaciens, of the gene 7 of the T-DNA of the Ti plasmid pTiACH5, of the nopaline synthase gene, of the 35S RNA-coding gene from CaMV or from various plant genes, e.g. the ST-LS1 gene, the Cab gene from wheat, class I and class II patatin genes, etc.
The DNA sequence encoding the polypeptide may also be operably connected to a DNA sequence encoding a leader peptide capable of directing the transport of the expressed polypeptide to a specific cellular compartment (e.g. vacuoles) or to extracellular space. Examples of suitable leader peptides are the leader peptide of proteinase inhibitor II from potato, the leader peptide and an additional fragment of about 100 amino acids of patatin, or the transit peptide of various nucleus-encoded proteins directed into chloroplasts (e.g. from the ST-LS1 gene, SS-Rubisco genes, etc.) or into mitochondria (e.g. from the ADP/ATP translocator).
Furthermore, the DNA sequence encoding the polypeptide may be modified in the 5' non-translated region resulting in enhanced translation of the sequence. Such modifications may, for instance, result in removal of hairpin loops in RNA of the 5' non-translated region. Translation enhancement may be provided by suitably modifying the omega sequence of tobacco mosaic virus or the leaders of other plant viruses (e.g. BMV, MSV) or of plant genes expressed at high levels (e.g. SS-Rubisco, class I patatin or proteinase inhibitor II genes from potato).
The DNA sequence encoding the polypeptide may furthermore be connected to a second DNA sequence encoding another polypeptide or a fragment thereof in such a way that expression of said DNA sequences results in the production of a fusion protein. When the host cell is a potato plant cell, the second DNA sequence may, for instance, encode patatin or a fragment thereof (such as a fragment of about 100 amino acids). The plant in which the DNA sequence coding for the polypeptide is introduced may suitably be a dicotyledonous plant, examples of which are tobacco, potato, tomato, and leguminous (e.g. bean, pea, soy, alfalfa) plants. It is, however, contemplated that monocotyledonous plants, e.g. cereals, may equally well be transformed with the DNA sequence coding for the enzyme.
Procedures for the genetic manipulation of monocotyledonous and dicotyledonous plants are well known. In order to construct foreign genes for their subsequent introduction into higher plants, numerous cloning vectors are available which generally contain a replication system for E. coli and a selectable/screenable marker system permitting the recognition of transformed cells. These vectors include e.g. pBR322, the pUC series, pACYC, the M13 mp series, etc. The foreign sequence may be cloned into appropriate restriction sites. The recombinant plasmid obtained in this way may subsequently be used for the transformation of E. coli. Transformed E. coli cells may be grown in an appropriate medium, harvested and lysed. The chimeric plasmid may then be reisolated and analyzed. Analysis of the recombinant plasmid may be performed by e.g. determination of the nucleotide sequence, restriction analysis, electrophoresis and other molecular-biochemical methods. After each manipulation the sequence may be cleaved and ligated to another DNA sequence. Each DNA sequence can be cloned on a separate plasmid DNA. Depending on the method used for transferring the foreign DNA into plant cells, other DNA sequences might be of importance. In case the Ti plasmid or the Ri plasmid of Agrobacterium tumefaciens or Agrobacterium rhizogenes is used, at least the right border of the T-DNA may be used, and often both the right and the left borders of the T-DNA of the Ri or Ti plasmid will be present flanking the DNA sequence to be transferred into plant cells.
The use of the T-DNA for transferring foreign DNA into plant cells has been described extensively in the prior literature (cf. Gasser and Fraley, 1989, Science 244, 1293-1299 and references cited therein). After integration of the foreign DNA into the plant genome, this sequence is fairly stable at the original locus and is usually not lost in subsequent mitotic or meiotic divisions. As a general rule, a selectable marker gene will be cotransferred in addition to the gene to be transferred, which marker renders the plant cell resistant to certain antibiotics, e.g. kanamycin, hygromycin, G418 etc. This marker permits the recognition of the transformed cells containing the DNA sequence to be transferred compared to nontransformed cells.
Numerous techniques are available for the introduction of DNA into a plant cell. Examples are the Agrobacterium mediated transfer, the fusion of protoplasts with liposomes containing the respective DNA, microinjection of foreign DNA, electroporation etc. In case Agrobacterium mediated gene transfer is employed, the DNA to be transferred has to be present in special plasmids which are either of the intermediate type or the binary type. Due to the presence of sequences homologous to T-DNA sequences, intermediate vectors may integrate into the Ri- or Ti-plasmid by homologous recombination. The Ri- or Ti-plasmid additionally contains the vir-region which is necessary for the transfer of the foreign gene into plant cells. Intermediate vectors cannot replicate in Agrobacterium species and are transferred into Agrobacterium by either direct transformation or mobilization by means of helper plasmids (conjugation) (cf. Gasser and Fraley, op. cit. and references cited therein).
Binary vectors may replicate in both Agrobacterium species and E. coli. They may contain a selectable marker and a polylinker region which to the left and right contains the border sequences of the T-DNA of Agrobacterium rhizogenes or Agrobacterium tumefaciens. Such vectors may be transformed directly into Agrobacterium species. The Agrobacterium cell serving as the host cell has to contain a vir-region on another plasmid. Additional T-DNA sequences may also be contained in the Agrobacterium cell.
The Agrobacterium cell containing the DNA sequences to be transferred into plant cells either on a binary vector or in the form of a cointegrate between the intermediate vector and the T-DNA region may then be used for transforming plant cells. Usually either multicellular explants (e.g. leaf discs, stem segments, roots), single cells (protoplasts) or cell suspensions are cocultivated with Agrobacterium cells containing the DNA sequence to be transferred into plant cells. The plant cells treated with the Agrobacterium cells are then selected for the cotransferred resistance marker (e.g. kanamycin) and subsequently regenerated to intact plants. These regenerated plants will then be tested for the presence of the DNA sequences to be transferred.
If the DNA is transferred by e.g. electroporation or microinjection, no special requirements are needed to effect transformation. Simple plasmids e.g. of the pUC series may be used to transform plant cells. Regenerated transgenic plants may be grown normally in a greenhouse or under other conditions. They should display a new phenotype (e.g. production of new proteins) due to the transfer of the foreign gene(s). The transgenic plants may be crossed with other plants which may either be wild-type or transgenic plants transformed with the same or another DNA sequence. Seeds obtained from transgenic plants should be tested to assure that the new genetic trait is inherited in a stable Mendelian fashion. See also Hiatt, Nature 344: 469-479, 1990; Edelbaum et al., J. Interferon Res. 12: 449-453, 1992; Sijmons et al., Bio/Technology 8: 217-221, 1990; and EP 255 378.
Methods of Production
The transformed or transfected host cells described above are cultured in a suitable nutrient medium under conditions permitting the expression of the desired polypeptide, after which the resulting polypeptide is recovered from the cells, or the culture broth.
The medium used to culture the cells may be any conventional medium suitable for growing the host cells, such as minimal or complex media containing appropriate supplements. Suitable media are available from commercial suppliers or may be prepared according to published recipes (e.g. in catalogues of the American Type Culture Collection) . The media are prepared using procedures known in the art (see, e.g., references for bacteria and yeast; Bennett, J.W. and LaSure, L., editors, More Gene Manipulations in Fungi , Academic Press, CA, 1991) .
If the polypeptide is secreted into the nutrient medium, the polypeptide can be recovered directly from the medium. If the polypeptide is not secreted, it is recovered from cell lysates. The polypeptide is recovered from the culture medium by conventional procedures including separating the host cells from the medium by centrifugation or filtration, precipitating the proteinaceous components of the supernatant or filtrate by means of a salt, e.g. ammonium sulphate, and purification by a variety of chromatographic procedures, e.g. ion exchange chromatography, gel filtration chromatography, affinity chromatography, or the like, dependent on the type of polypeptide in question. The polypeptides may be detected using methods known in the art that are specific for the polypeptides. These detection methods may include use of specific antibodies, formation of an enzyme product, or disappearance of an enzyme substrate. For example, an enzyme assay may be used to determine the activity of the polypeptide. The polypeptides of the present invention may be purified by a variety of procedures known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric focusing (IEF)), differential solubility (e.g., ammonium sulfate precipitation), or extraction (see, e.g., Protein Purification, J.C. Janson and Lars Ryden, editors, VCH Publishers, New York, 1989).
Description of the figure
Figure 1 shows an alignment of the known amino acid sequences of the Aspergillus oryzae (A oryzae) and Aspergillus niger (A niger) Fam35 lactases with the Meripilus giganteus (M giganteus) Fam35 lactase of the invention. The seven conserved regions of the invention are shaded gray.
Detailed description of the invention
In a first embodiment the invention relates to a method of screening for a DNA sequence coding for an enzyme of interest, the method comprising:
a) selecting a microorganism of interest and obtaining genomic DNA or cDNA from said microorganism;
b) selecting a forward PCR-primer comprising sense DNA corresponding to an amino acid sequence of at least five residues chosen from the group consisting of: R-P-G-[P/S]-Y-I-N-A-E (SEQ ID No. 1), A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P (SEQ ID No. 2), E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G (SEQ ID No. 3), Y-T-S-Y-D-Y-G-S (SEQ ID No. 4), D-K-V-R-G (SEQ ID No. 5), and N-E-G-G-L-[Y/F]-A-E-R (SEQ ID No. 6);
c) selecting a reverse PCR-primer comprising anti-sense DNA corresponding to an amino acid sequence of at least five residues chosen from the group consisting of: A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P (SEQ ID No. 2), E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G (SEQ ID No. 3), Y-T-S-Y-D-Y-G-S (SEQ ID No. 4), D-K-V-R-G (SEQ ID No. 5), N-E-G-G-L-[Y/F]-A-E-R (SEQ ID No. 6), and G-P-Q-[T/A]-S-F-P-V-P-[E/V]-G-I (SEQ ID No. 7);
d) performing a PCR reaction using the genomic DNA or cDNA from step (a) as template with the primers from step (b) and step (c), and screening the PCR products for a generated DNA fragment of interest.
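As an illustration only, the conserved regions listed in steps (b) and (c) can be checked against a candidate protein sequence in silico. The following minimal Python sketch assumes that a bracketed entry such as [P/S] denotes alternative residues at a single position; the function names `motif_to_regex` and `find_motifs` are hypothetical and are not part of the described method.

```python
import re

# Conserved regions as written in the text (SEQ ID No. 1 to 7).
MOTIFS = {
    1: "R-P-G-[P/S]-Y-I-N-A-E",
    2: "A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P",
    3: "E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G",
    4: "Y-T-S-Y-D-Y-G-S",
    5: "D-K-V-R-G",
    6: "N-E-G-G-L-[Y/F]-A-E-R",
    7: "G-P-Q-[T/A]-S-F-P-V-P-[E/V]-G-I",
}


def motif_to_regex(motif: str) -> str:
    """Turn 'R-P-G-[P/S]-Y' into the regular expression 'RPG[PS]Y'."""
    parts = []
    for token in motif.split("-"):
        token = token.strip()
        if token.startswith("["):
            # '[P/S]' lists the residues allowed at this one position.
            parts.append("[" + token[1:-1].replace("/", "") + "]")
        else:
            parts.append(token)
    return "".join(parts)


def find_motifs(protein: str) -> dict:
    """Map SEQ ID number -> 0-based start positions of every match."""
    hits = {}
    for seq_id, motif in MOTIFS.items():
        pattern = re.compile(motif_to_regex(motif))
        positions = [m.start() for m in pattern.finditer(protein)]
        if positions:
            hits[seq_id] = positions
    return hits
```

For instance, `find_motifs` applied to a translated PCR fragment reports which of the seven regions the fragment contains and where, which is one way to sanity-check a "DNA fragment of interest" from step (d).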
It is well known in the art that when primers are designed to correspond to a certain amino acid sequence, due to the degeneracy of the genetic code, some amino acid sequences are better suited than others, e.g. sequences containing amino acids that are encoded by only one codon, such as tryptophan, or by only two codons.
A preferred embodiment of the invention relates to a method of the first aspect, wherein the primer of step (b) comprises the DNA sequence 5'CCXTAYATH, preferably 5'GGXCCXTAYATHAA, most preferably 5'CCXGGXCCXTAYATHAAYGC; and the primer of step (c) comprises the DNA sequence 5'TGGGGXGGX, preferably 5'CCXTGGGGXGGXCCX, most preferably 5'GAYCCXTGGGGXGGXCCXGGG.
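The relationship between the conserved amino acid regions and the degenerate primer sequences above can be sketched by back-translation with the standard genetic code. The Python sketch below is illustrative only: it assumes that X in the primers stands for any of the four nucleotides (written N in IUPAC notation) and that Y, H and R are the usual IUPAC ambiguity codes; the names `degenerate_codon`, `back_translate` and `degeneracy` are hypothetical. For the six-codon amino acids (Leu, Arg, Ser) the per-position union used here covers more codons than actually encode the residue.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# IUPAC ambiguity codes, keyed by the alphabetically sorted set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}
SIZES = {symbol: len(bases) for bases, symbol in IUPAC.items()}
SIZES["X"] = 4  # assumption: X in the text means any nucleotide


def degenerate_codon(aa: str, any_base: str = "X") -> str:
    """Degenerate codon covering every codon of one amino acid."""
    codons = [c for c, a in zip(CODONS, AMINO_ACIDS) if a == aa]
    if not codons:
        raise ValueError(f"unknown amino acid: {aa!r}")
    symbols = []
    for pos in range(3):
        bases = "".join(sorted({c[pos] for c in codons}))
        symbols.append(IUPAC[bases])
    return "".join(symbols).replace("N", any_base)


def back_translate(peptide: str) -> str:
    """Sense-strand degenerate DNA for a peptide such as 'PYI'."""
    return "".join(degenerate_codon(aa) for aa in peptide)


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    total = 1
    for symbol in primer:
        total *= SIZES[symbol]
    return total
```

Under these assumptions, `back_translate("PYI")` gives CCXTAYATH and `back_translate("WGG")` gives TGGGGXGGX, matching the shortest primer sequences above; the tryptophan codon contributes no degeneracy, which illustrates why residues with one or two codons are favoured when choosing primer sites.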
It is standard technique in the art, when a DNA sequence of interest is to be further manipulated or expressed, to clone the sequence in a vector and to transform the resulting construct into a homologous or heterologous microbial host cell. A preferred embodiment of the invention relates to a method of the first aspect, wherein the DNA fragment of the first aspect step (d) is cloned in a suitable vector which is then transformed into a suitable homologous or heterologous microbial host cell.
Another preferred embodiment relates to a method of the first or second aspects, wherein the microorganism of the first or second aspects step (a) is a fungus, preferably the fungus is of the Phylum Basidiomycota, Ascomycota or Zygomycota, more preferably the fungus is of a Class selected from the group consisting of: Hymenomycetes, Gasteromycetes, Loculoascomycetes, Discomycetes, Plectomycetes, Hemiascomycetes, Archiascomycetes, Pyrenomycetes and Zygomycetes, even more preferably the fungus is of an Order selected from the group consisting of: Agaricales, Polyporales, Stereales, Hymenochaetales, Hericiales, Boletales, Chantarellales, Tremellales, Auriculariales, Dothideales, Coryneliales, Rhytismatales, Pezizales, Helotiales, Eurotiales, Saccharomycetales, Schizosaccharomycetales, Xylariales, Hypocreales, Sordariales, Microascales, Diaporthales, Trichosphaeriales, Phyllachorales, Mucorales, Kikxellales, Enthomophthorales and Dimargaritales; and most preferably the fungus is of a Genus selected from the group consisting of: Meripilus, Mycena, Trametes, Spathularia, Diplodia, Microsphaeropsis, Penicillium, Petromyces, and Christaspora. The fungus may also be of the Chytridiomycota Phylum.
Besides PCR techniques other standard methods exist in the art for screening a DNA library such as Southern Blotting.
A further preferred embodiment relates to a method of the first or second aspects, wherein the genomic DNA or cDNA of step (a) is fragmented and a DNA fragment of interest is chosen which hybridizes in a Southern blot with the DNA fragment of the first aspect step (d) or with the nucleotide probe of the second aspect step (b); preferably the chosen DNA fragment is cloned in a suitable vector which is then transformed into a suitable homologous or heterologous microbial host cell; more preferably the DNA fragment is stably integrated into the host cell genome, preferably in multiple copies. Preferably the host cell of the previous embodiments is a filamentous fungus or a yeast-like cell; or more preferably the host cell is a fungus; an Aspergillus, a Fusarium, a Meripilus, a Trametes, a Penicillium, a Microsphaeropsis, a Mycena, a Spathularia, a Diplodia, a Petromyces, a Christaspora, or a Hansenula cell; most preferably the fungus is selected from the group consisting of: Aspergillus niger, Aspergillus oryzae, Aspergillus nidulans, Fusarium venenatum, Meripilus giganteus, Trametes ochracea, Penicillium roquefortii, Penicillium carneum, Microsphaeropsis, Mycena pura, Spathularia flavida, Diplodia gossypina, Petromyces alliaceus, and Christaspora arxii.
Materials and Methods
General molecular biology methods: Unless otherwise mentioned the DNA manipulations and transformations were performed using standard methods of molecular biology (Sambrook et al. (1989) Molecular cloning: A laboratory manual, Cold Spring Harbor lab., Cold Spring Harbor, NY; Ausubel, F. M. et al. (eds.) "Current Protocols in Molecular Biology". John Wiley and Sons, 1995; Harwood, C. R., and Cutting, S. M. (eds.) "Molecular Biological Methods for Bacillus". John Wiley and Sons, 1990).
Enzymes for DNA manipulations were used according to the specifications of the suppliers (e.g. restriction endonucleases, ligases etc. are obtainable from New England Biolabs, Inc.).
Other strains:
Yeast strain: The Saccharomyces cerevisiae strain used was W3124 (MATa; ura3-52; leu2-3,112; his3-D200; pep4-1137; prc1::HIS3; prb1::LEU2; cir+).
E. coli strain: DH10B (Life Technologies).
Plasmids: pBK-CMV (Stratagene™ Inc., La Jolla, CA); the automatic subcloning plasmid liberated from lambdaZAP™ Express (Stratagene).
The Aspergillus expression vector pHD414 is a derivative of the plasmid p775 described in EP 238 023; the construction of pHD414 is further described in WO 93/11249. pYES™ 2.0 (Invitrogen™, US). pA2LAl; construction described herein. pA3Lacl; construction described herein.
Media:
YPD: 10 g yeast extract, 20 g peptone, H2O to 900 ml. Autoclaved, 100 ml 20% glucose (sterile filtered) added.
YPM: 10 g yeast extract, 20 g peptone, H2O to 900 ml. Autoclaved, 100 ml 20% maltodextrin (sterile filtered) added.
10 x Basal salt: 75 g yeast nitrogen base, 113 g succinic acid, 68 g NaOH, H2O ad 1000 ml, sterile filtered.
SC-URA: 100 ml 10 x Basal salt, 28 ml 20% casamino acids without vitamins, 10 ml 1% tryptophan, H2O ad 900 ml, autoclaved, 3.6 ml 5% threonine and 100 ml 20% glucose or 20% galactose added.
SC-agar: SC-URA, 20 g/l agar added.
X-gal SC-agar plates: SC-agar with 2% galactose and 40 mg/l X-gal (5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside from SIGMA) added.
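The final concentrations implied by the media recipes above follow from simple dilution arithmetic (C1 x V1 = C2 x V2). The short Python sketch below is illustrative only and assumes that volumes are additive, i.e. that a medium made up to 900 ml and supplemented with 100 ml of stock has a final volume of 1000 ml; the function name `final_percent` is hypothetical.

```python
def final_percent(stock_percent: float, stock_ml: float, total_ml: float) -> float:
    """Final concentration (%) after diluting a stock into a total volume."""
    return stock_percent * stock_ml / total_ml


# YPD: 900 ml autoclaved base plus 100 ml of 20% glucose.
ypd_glucose = final_percent(20, 100, 900 + 100)

# SC-URA: 900 ml autoclaved base plus 3.6 ml 5% threonine and 100 ml 20% sugar.
sc_ura_total_ml = 900 + 3.6 + 100
sc_ura_sugar = final_percent(20, 100, sc_ura_total_ml)
sc_ura_threonine = final_percent(5, 3.6, sc_ura_total_ml)

print(f"YPD glucose: {ypd_glucose:.2f} %")
print(f"SC-URA glucose or galactose: {sc_ura_sugar:.2f} %")
print(f"SC-URA threonine: {sc_ura_threonine:.3f} %")
```

Under this assumption both YPD and SC-URA contain approximately 2% sugar, consistent with the 2% galactose stated for the X-gal SC-agar plates.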
PEG 4000 (polyethylene glycol, molecular weight = 4,000) (BDH, England)
Strain                  Media    Temp (°C)   rpm   Days
Microsphaeropsis sp.    BA+PD    26          150   5
Penicillium sp.         FG4      26          200   2
Trametes ochracea       FG4      20          200   7
Spathularia flavida     FG4      20          80    17
Mycena pura             FG4      20          80    17
Meripilus giganteus     Mex1     20          150   5
Table 1. Propagation media; media recipes below.
BA media:
Rofec (Roquette # 1023642) 10 g
NH4NO3 10 g
KH2PO4 10 g
MgSO4·7H2O 0.75 g
Pluronic PE-6100 0.1 ml
CaCO3 0.5 g
Deionized water to 1000 ml
pH adjusted to 6.5 before autoclaving
PD media:
Potato dextrose broth (Difco 0549) 24 g is dissolved in deionized water to 1000 ml.
FG4 media:
Soymeal (SFK 102-2548) 30 g
Maltodextrin 01 (Roquette) 15 g
Bacto peptone (Difco 0118) 5 g
Pluronic PE-6100 0.2 ml
Deionized water to 1000 ml
Mex1:
Soy flour 20 g/l
Wheat bran 15 g/l
Cellulose 10 g/l
Maltodextrin 5 g/l
Bacto peptone 3 g/l
Pluronic 0.2 g/l
Olive oil 1 g/l
Adjust to pH 6.0
Harvest of mycelia:
Mycelia from the shakeflask cultures were harvested by filtering the contents though a miracloth lined funnel . The mycelia were then sandwiched between two miracloth pieces and blotted dry with absorbant paper towels. The mycelial mass was then transferred to Falcon 1059 plastic centrifuge tubes and frozen in liquid nitrogen. Frozen mycelia were stored in a -80°C freezer until use. Genomic DNA Preparation:
Genomic DNA was prepared essentially according to the fungal genomic DNA protocol supplied with the Qiagen500 genomic tips, with the following modifications: 1) The frozen mycelia were ground in a pre-chilled mortar and pestle using quartz sand as an abrasive. 2) The proteinase K digestion was performed for only one hour instead of two hours.
Genomic DNA for Trametes ochracea, Spathularia flavida and Mycena pura was prepared as mini-prep genomic DNA based on a modified Qiaprep protocol (Qiagen GmbH). Briefly, the mycelia were ground exactly as in the Qiagen protocol described above and placed in a 2 ml microcentrifuge tube. One ml of lysis buffer and 2 μl of RNase A solution (20 μg/μl) were added. The samples were mixed in an Eppendorf thermomixer at 37°C for 10 minutes. Thirty μl of proteinase K solution was added and the samples were incubated for thirty minutes at 50°C. The remaining solutions are supplied with the standard Qiagen Qiaprep kit. Samples were centrifuged in a microcentrifuge at maximum speed (20,000 g). Samples were decanted directly onto a Qiaprep column and centrifuged at 20,000 g for 1 minute. The flowthrough was discarded, 500 μl of PB buffer was added to the columns and the samples were recentrifuged. The flowthrough was discarded, 750 μl of buffer PE was added to the columns and the samples were recentrifuged. The flowthrough was again discarded before a final brief (30 seconds) centrifugation step was performed. The columns were transferred to 1.5 ml microcentrifuge tubes and 150 μl of TE buffer (10 mM Tris-HCl pH 8.0, 0.1 mM EDTA) was added to the columns. After one minute's incubation, the samples were centrifuged to collect the DNA. The eluted DNA was quantified on 1% agarose gels stained with ethidium bromide.
Creation of bacteriophage lambda libraries from fungal candidates containing family 35 enzyme:
To enable screening for genomic clones by hybridization of plaque lifts, we selected the LambdaZAP Express cloning kit with BamHI-digested and dephosphorylated arms from Stratagene. Isolated DNA was partially digested with Sau3A and size fractionated on a 1% agarose gel. DNA between 3 and 10 kb was excised from the gel and purified using the Pharmacia GFX column purification procedure (Amersham Pharmacia Biotech Ltd.). 100 ng of purified, fractionated DNA was ligated with 1 μg of BamHI-digested, dephosphorylated ZAP Express vector arms (4°C overnight).
The ligation reaction was packaged directly with Mutaplax phage packaging mix (Epicentre Technologies) according to the packaging instructions in the Stratagene ZAP Express kit. Phage libraries were titered with E. coli strain XL1-Blue MRF' (Stratagene).
Screening of the DNA libraries:
A standard screening procedure was employed for recovering genomic clones from the fungal libraries (Sambrook et al., 1995). 200,000 plaque-forming units (pfu) were screened at 50,000 pfu per 125 mm petri plate containing NZY media. Plaque lifts were performed with Hybond N nylon membranes, which were processed for hybridization according to the manufacturer's instructions (Amersham Pharmacia Biotech Ltd.). Hybridizations were performed at high stringency (68°C) in Modified Church buffer (Biorad Inc., USA). The probe for each library was a purified PCR product amplified from the same genomic DNA that was used to make that fungal library. PCR products were produced using the degenerate primers described in the invention. The PCR fragment was labelled by random priming using a Pharmacia 32P-CTP random priming kit according to the manufacturer's instructions (Amersham Pharmacia Biotech Ltd.). Initial positives underwent two rounds of purification to isolate pure plaques. Pure plaque isolates were automatically subcloned to plasmid according to the manufacturer's instructions (Stratagene).
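The screening depth described above can be checked with a short calculation. The sketch below computes the number of plates from the stated plating density and, as a rough estimate only, the probability that a given locus is represented among the screened clones using the standard library-coverage relation P = 1 - (1 - f)^N. The 40 Mb genome size and the 6.5 kb average insert (midpoint of the 3-10 kb size cut) are assumptions made for illustration; neither value is given in the text, and the estimate also assumes that every plaque carries an insert.

```python
import math


def plates_needed(total_pfu, pfu_per_plate):
    """Number of plates required to screen total_pfu at the given density."""
    return math.ceil(total_pfu / pfu_per_plate)


def representation_probability(n_clones, insert_bp, genome_bp):
    """Probability that a given locus is present in at least one clone.

    Uses P = 1 - (1 - f)^N with f = insert size / genome size.
    """
    f = insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** n_clones


if __name__ == "__main__":
    print(plates_needed(200_000, 50_000))
    # Assumed values: 6.5 kb average insert, 40 Mb fungal genome.
    print(representation_probability(200_000, 6_500, 40_000_000))
```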
Extraction of total RNA, cDNA synthesis, Mung bean nuclease treatment, blunt-ending with T4 DNA polymerase, adaptor ligation, NotI digestion and size selection, and construction of the cDNA library were performed as described in WO 97/31102.
Identification of positive clones: After 3-5 days of incubation, the SC agar plates are replica plated onto a set of X-gal SC agar plates. Those plates are incubated for 2-4 days at 30°C and beta-galactosidase positive colonies are identified as blue/green colonies.
Isolation of a cDNA gene for expression in Aspergillus:
A beta-galactosidase producing yeast colony is inoculated into 20 ml YPD broth in a 50 ml glass test tube. The tube is shaken for 2 days at 30°C. The cells are harvested by centrifugation for 10 min. at 3000 rpm. DNA is isolated according to WO 94/14953 and dissolved in 50 ml water. The DNA is transformed into E. coli by standard procedures. Plasmid DNA is isolated from E. coli using standard procedures, and the cDNA insert is sequenced with the Taq deoxy terminal cycle sequencing kit (Perkin Elmer, USA) and synthetic oligonucleotide primers using an Applied Biosystems ABI PRISM™ 377 DNA Sequencer according to the manufacturer's instructions. The cDNA insert is excised using appropriate restriction enzymes and ligated into an Aspergillus expression vector.
Transformation of Aspergillus oryzae or Aspergillus niger
Protoplasts may be prepared as described in WO 95/02043, page 16, line 21 - page 17, line 12, which is hereby incorporated by reference.
100 μl of protoplast suspension is mixed with 5-25 μg of the appropriate DNA in 10 μl of STC (1.2 M sorbitol, 10 mM Tris-HCl, pH = 7.5, 10 mM CaCl2). Protoplasts are mixed with p3SR2 (an A. nidulans amdS gene carrying plasmid). The mixture is left at room temperature for 25 minutes. 0.2 ml of 60% PEG 4000 (BDH 29576), 10 mM CaCl2 and 10 mM Tris-HCl, pH 7.5 is added and carefully mixed (twice) and finally 0.85 ml of the same solution is added and carefully mixed. The mixture is left at room temperature for 25 minutes, spun at 2500 g for 15 minutes and the pellet is resuspended in 2 ml of 1.2 M sorbitol. After one more sedimentation the protoplasts are spread on minimal plates (Cove, Biochem. Biophys. Acta 113 (1966) 51-56) containing 1.0 M sucrose, pH 7.0, 10 mM acetamide as nitrogen source and 20 mM CsCl to inhibit background growth. After incubation for 4-7 days at 37°C spores are picked and spread for single colonies. This procedure is repeated and spores of a single colony after the second reisolation are stored as a defined transformant.
Test of A. oryzae transformants:
Each of the transformants is inoculated in 10 ml of YPM and propagated. After 2-5 days of incubation at 30°C, the supernatant is removed. The beta-galactosidase activity can be identified by applying 20 μl supernatant to 4 mm diameter holes punched out in an X-gal SC-agar plate and incubating overnight at 30°C; beta-galactosidase activity is then identified by a blue halo.
Molecular screening for new fungal family 35 enzymes:
One method for identifying new fungal family 35 enzymes is to design degenerate oligonucleotide primers for PCR based on conserved amino acid regions within the amino acid sequences of the known fungal family 35 enzymes, and to use them for molecular screening/cloning of gene family members, for instance as described by G. M. Preston, Methods in Molecular Biology vol. 67, 1997, pp. 433-449; S. Bartl, Methods in Molecular Biology vol. 67, 1997, pp. 451-457; R. M. Horton et al., Methods in Molecular Biology vol. 67, 1997, pp. 459-479. The full sequence can then be determined by library construction and screening as described above, or by cloning by inverse PCR, for instance as described by J. Silver, pp. 137-146, in PCR: A Practical Approach, edited by M. J. McPherson, P. Quirke and G. R. Taylor, Oxford University Press, 1991.
Sequence Determination:
Plasmid clones were sequenced by use of the New England Biolabs GPS-1 Genome Priming System. Briefly, transposons were inserted randomly into the plasmid genomic clone according to the manufacturer's instructions (New England Biolabs, USA). Sixty to one hundred individual plasmid transposants were sequenced with one of the primers contained on the inserted transposon. The DNA Star software package version 4.2 (www.dnastar.com) was used to assemble the DNA contigs generated from the plasmid transposants. Both strands were sequenced and custom primers were used as necessary to complete the sequence in both directions. An Applied Biosystems ABI371 automated sequencer utilizing dye terminator technology was used (Applied Biosystems, USA).
References for Public domain fungi:
Kumar V., Ramakrishnan S., Teeri T.T., Knowles J.K., Hartley B.S.; "Saccharomyces cerevisiae cells secreting an Aspergillus niger beta-galactosidase grow on whey permeate"; Biotechnology 10:82-85 (1992).
Nikolaev I.V., Eplshin S.M., Zakharova E.S., Kotenko S.V., Vinetski Y.P.; "Molecular cloning of the gene for secreted beta-galactosidase of the filamentous fungus Penicillium canescens"; Mol. Biol. (Mosk) 26:869-875 (1992). (Russian)
Examples
Example 1
Cloning and expression of a beta-galactosidase from Meripilus giganteus CBS 521.95
mRNA was isolated from Meripilus giganteus, CBS No. 521.95, grown in Mexl fermentation medium with agitation to ensure sufficient aeration. Mycelia were harvested after 3-5 days' growth, immediately frozen in liquid nitrogen and stored at -80°C. A library from M. giganteus, CBS No. 521.95, consisting of approx. 10^ individual clones was constructed in E. coli as described with a vector background of 1%. Plasmid DNA from some of the pools was transformed into yeast, and 50-100 plates containing 250-400 yeast colonies were obtained from each pool. Beta-galactosidase positive yeast colonies were identified and isolated on X-gal SC-agar plates as described above. Total DNA was isolated from a beta-galactosidase positive yeast colony and plasmid DNA was rescued by transformation of E. coli as described above. In order to express the beta-galactosidase in Aspergillus, the DNA was digested with appropriate restriction enzymes, size fractionated on gel, and a fragment corresponding to the beta-galactosidase gene was purified. The gene was subsequently ligated to pHD414, digested with appropriate restriction enzymes, resulting in the plasmid pA2LAl.
The full-length cDNA insert encoding the beta-galactosidase of Meripilus giganteus was sequenced from Qiagen-purified plasmid DNA of pA2LAl (Qiagen, USA) with the Taq deoxy terminal cycle sequencing kit (Perkin Elmer, USA) and synthetic oligonucleotide primers using an Applied Biosystems ABI PRISM™ 377 DNA Sequencer according to the manufacturer's instructions. After amplification of the DNA in E. coli the plasmid was transformed into Aspergillus oryzae as described above.
The sequence of the BstXI/NotI adapted cDNA encoding the Meripilus giganteus beta-galactosidase is shown in SEQ ID No. 8 and the corresponding amino acid sequence is shown in SEQ ID No. 9.
Test of A. oryzae transformants:
Each of the transformants was tested for enzyme activity as described above. Some of the transformants had beta-galactosidase activity which was significantly larger than the Aspergillus oryzae background. This demonstrates efficient expression of the beta-galactosidase in Aspergillus oryzae.
Example 2
Design of degenerated oligonucleotide primers for molecular screening of family 35 β-galactosidase genes
An amino acid alignment was made of β-galactosidase from Aspergillus oryzae strain CCC161 (ATCC 74285) (WO 96/00786), β-galactosidase from Aspergillus niger (V. Kumar et al., Bio/Technology Vol. 10(1): 82-85 (1992)), β-galactosidase from Penicillium canescens (the DNA sequence has only been partly determined) (Nikolaev I.V. et al., Molekuliarnaia Biologiia Vol. 26(4): 869-875 (1992)) and β-galactosidase from Meripilus giganteus. Based on the alignment, conserved regions were identified and the following degenerate oligonucleotide primers coding for these regions were designed for use in molecular screening for fungal family 35 enzymes:
647 sense primer (SEQ ID NO. 10):
5'- CCC AAG CTT CCI GGI CCI TAY ATH AAY GC -3'
corresponds to amino acids P-G-[P/S]-Y-I-N-A (comprised in SEQ ID No. 1) with a CCC and Hind III site 5' tail.
648 antisense primer (SEQ ID NO. 11):
5'- GCT CTA GAC CCI GGI CCI CCC CAI GGR TC -3'
corresponds to amino acids D-P-W-G-G-P-G (comprised in SEQ ID No. 3) with a GC and Xba I site 5' tail,
wherein I = deoxyinosine; H = A, C or T; R = A or G; Y = C or T.
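The relation between the two degenerate primers and the conserved peptides can be checked mechanically. The following Python sketch is an illustration under stated assumptions, not part of the original procedure: it strips the 5' cloning tails, counts the degeneracy of each primer core (counting deoxyinosine as a single base, since it is one synthesized position), and back-translates the cores with the standard genetic code. Treating inosine as pairing with any base is a simplification, and all function and constant names are hypothetical.

```python
from itertools import product

# Base sets for the degenerate symbols used in the primers. "I" (deoxyinosine)
# is treated as matching any base, which is a simplifying assumption.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT",
         "H": "ACT", "D": "AGT", "N": "ACGT", "I": "ACGT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R",
              "H": "D", "D": "H", "N": "N", "I": "N"}

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: _AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(_BASES)
               for j, b in enumerate(_BASES)
               for k, c in enumerate(_BASES)}


def degeneracy(seq):
    """Number of distinct oligonucleotides in the mix (inosine counts as one)."""
    n = 1
    for symbol in seq:
        n *= 1 if symbol == "I" else len(IUPAC[symbol])
    return n


def reverse_complement(seq):
    """Reverse complement that keeps IUPAC ambiguity (I becomes N)."""
    return "".join(COMPLEMENT[symbol] for symbol in reversed(seq))


def translate(seq):
    """Translate complete codons; ambiguous codons are written as [X/Y]."""
    out = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        residues = sorted({CODON_TABLE["".join(c)]
                           for c in product(*(IUPAC[s] for s in codon))})
        out.append(residues[0] if len(residues) == 1
                   else "[" + "/".join(residues) + "]")
    return "".join(out)


CORE_647 = "CCIGGICCITAYATHAAYGC"   # primer 647 without the CCC + Hind III tail
CORE_648 = "CCCIGGICCICCCCAIGGRTC"  # primer 648 without the GC + Xba I tail

if __name__ == "__main__":
    print(degeneracy(CORE_647), translate(CORE_647))
    print(degeneracy(CORE_648), translate(reverse_complement(CORE_648)))
```

The final GC of the 647 core is an incomplete codon (the first two bases of an alanine codon) and is therefore not translated by this sketch.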
Experimental procedure:
Approximately 100 to 200 ng genomic DNA or 10-20 ng double-stranded cDNA is used as template for PCR amplification in PCR buffer (10 mM Tris-HCl, pH 8.3, 50 mM KCl) containing 200 μM of each dNTP, 3.5 mM MgCl2, 2.5 Units AmpliTaq Gold™, and 100 pmol of each of the degenerate primers 647 and 648. The total volume is 50 μl. The PCR reaction is carried out in a Perkin-Elmer GeneAmp PCR System 2400.
The PCR reaction is performed using a cycle profile of:
94°C - 10 min; 1 cycle
94°C - 1 min, 60°C - 1 min, 72°C - 30 sec; 2 cycles
94°C - 1 min, 59°C - 1 min, 72°C - 30 sec; 2 cycles
94°C - 1 min, 58°C - 1 min, 72°C - 30 sec; 2 cycles
94°C - 1 min, 52°C - 1 min, 72°C - 30 sec; 2 cycles
94°C - 1 min, 50°C - 1 min, 72°C - 30 sec; 14 cycles
72°C - 7 min; 1 cycle
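The step-down cycle profile above can be written as data, which makes the cycle count and the programmed hold time easy to verify. This is a minimal sketch using only the steps as listed in the text; the constant and function names are hypothetical, and ramp times between temperatures (instrument-dependent) are not included.

```python
# Each cycling block: (annealing temperature in °C, number of cycles).
# Every cycle is 94°C for 1 min, annealing for 1 min, 72°C for 30 sec.
CYCLE_BLOCKS = [(60, 2), (59, 2), (58, 2), (52, 2), (50, 14)]

INITIAL_DENATURATION_MIN = 10.0  # 94°C, 1 cycle
FINAL_EXTENSION_MIN = 7.0        # 72°C, 1 cycle
MINUTES_PER_CYCLE = 1.0 + 1.0 + 0.5


def total_cycles(blocks):
    """Total number of amplification cycles across all blocks."""
    return sum(cycles for _, cycles in blocks)


def programmed_minutes(blocks):
    """Sum of programmed hold times, excluding temperature ramping."""
    return (INITIAL_DENATURATION_MIN
            + total_cycles(blocks) * MINUTES_PER_CYCLE
            + FINAL_EXTENSION_MIN)


if __name__ == "__main__":
    print(total_cycles(CYCLE_BLOCKS), programmed_minutes(CYCLE_BLOCKS))
```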
5 μl aliquots of the amplification products are analyzed by electrophoresis in 1.5% agarose gels.
Purification and sequencing of PCR bands:
The PCR fragments can be purified using the GFX™ PCR DNA and Gel Band Purification Kit (Pharmacia Biotech) according to the manufacturer's instructions. The nucleotide sequences of the amplified PCR fragments are determined directly on the purified PCR products using 200-300 ng as template, the Taq deoxy-terminal cycle sequencing kit (Perkin-Elmer, USA), fluorescently labelled terminators and 5 pmol of either 647 sense or 648 antisense primer on an ABI PRISM 377 DNA Sequencer (Perkin Elmer).
Example 3
Cloning of a beta-galactosidase from Penicillium roquefortii NN0048065
Genomic DNA from Penicillium roquefortii NN0048065 was isolated as described above and screened by PCR as described in Example 2. A PCR fragment was obtained and sequenced as described above. The PCR fragment was labelled by random priming using a Pharmacia 32P-CTP random priming kit according to the manufacturer's instructions (Amersham Pharmacia Biotech Ltd.) and used as probe for screening the Penicillium roquefortii bacteriophage lambda library by hybridization.
The bacteriophage lambda library was created from genomic Penicillium roquefortii DNA under the conditions and by the methods described above, and the library was screened as also described above. Initial positive clones underwent two rounds of purification to isolate pure monoclonal plaques. Lambda DNA from the purified plaques was isolated from plate lysate (Qiagen Lambda System).
Long-range PCR was carried out on lambda DNA from the positive plaques according to the instructions given by the manufacturer (Boehringer-Mannheim: Expand Long Template PCR System) with the following primer pairs: BKCMV forward primer and sense primer 693, BKCMV forward primer and antisense primer 694, BKCMV reverse primer and sense primer 693, and BKCMV reverse primer and antisense primer 694.
Primers BKCMV forward and BKCMV reverse were designed from Figure 2 in the ZAP Express Predigested Vector Kit and ZAP Express Predigested GigaPack Cloning Kits Instruction Manual (Stratagene), and primers 693 and 694 were designed from the Penicillium roquefortii 647/648 PCR fragment DNA sequence.
BKCMV forward primer (SEQ ID NO. 12):
5'-GAAATTAACCCTCACTAAAGG-3'
BKCMV reverse primer (SEQ ID NO. 13):
5'-CCGGGTGGAAAATCGATGGGCC-3'
Primer sense 693 (SEQ ID NO. 14):
5'-GGATATCTCCTGTGTGGATTATC-3'
Primer antisense 694 (SEQ ID NO. 15):
5'-CTCACGTTTCGGCAGGGTCACATAC-3'
PCR bands containing the full upstream and full downstream genomic DNA sequence for the Penicillium roquefortii beta-galactosidase were obtained from the positive lambda plaque D2. Based on the DNA sequence obtained, PCR oligonucleotide primers for amplifying genomic DNA encoding the Penicillium roquefortii beta-galactosidase were designed with appropriate restriction site tails for cloning purposes.
Primer 823 (SEQ ID NO. 16):
5'-ATAAGAATGCGGCCGCCCACCATGAAGCTCGCATATTCTTGGGCCATTG-3'
Primer 723 (SEQ ID NO. 17):
5'-GCTCTAGACTAATAAGCCCCCTGGCGCTTCTTG-3'
A PCR reaction directly on the Penicillium roquefortii genomic DNA was carried out with primers 823 and 723 according to the manufacturer's instructions (Boehringer-Mannheim: Expand Long Template PCR System), and the obtained PCR fragment was cloned into the Aspergillus expression vector pHD423, a derivative of pHD414 with NotI and XbaI restriction sites in the polylinker region, resulting in plasmid pA3Lacl. The DNA sequence of the insert encoding the beta-galactosidase of Penicillium roquefortii was determined from Qiagen-purified plasmid DNA of pA3Lacl (Qiagen, USA) with the Taq deoxy terminal cycle sequencing kit (Perkin Elmer, USA) and synthetic oligonucleotide primers using an Applied Biosystems ABI PRISM™ 377 DNA Sequencer according to the manufacturer's instructions.
The DNA sequence of the NotI/XbaI insert in pA3Lacl encoding the beta-galactosidase of Penicillium roquefortii is shown in SEQ ID No. 18 and the corresponding cDNA sequence is shown in SEQ ID No. 19. The amino acid sequence of the beta-galactosidase is shown in SEQ ID No. 20.
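The restriction site tails on cloning primers 823 and 723 can be located with a simple string search. The sketch below is illustrative only: it uses the well-known recognition sequences of NotI (GCGGCCGC) and XbaI (TCTAGA), takes the primer sequences as printed above, and uses hypothetical helper names.

```python
# Recognition sequences of the two enzymes used for cloning into pHD423.
SITES = {"NotI": "GCGGCCGC", "XbaI": "TCTAGA"}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "823": "ATAAGAATGCGGCCGCCCACCATGAAGCTCGCATATTCTTGGGCCATTG",
    "723": "GCTCTAGACTAATAAGCCCCCTGGCGCTTCTTG",
}


def find_sites(seq):
    """Return {enzyme: 0-based start index} for each site present in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
    # Position of the first ATG downstream of the NotI site in primer 823.
    not_i_end = PRIMERS["823"].find(SITES["NotI"]) + len(SITES["NotI"])
    print(PRIMERS["823"].find("ATG", not_i_end))
```

Primer 823 thus carries the NotI site upstream of the ATG start codon, and primer 723 carries the XbaI site at its 5' end.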
Example 4
Cloning of a beta-galactosidase from Spathularia flavida
Mycelia were grown in FG4 media at 200 rpm, 20°C for 17 days. Mycelium was harvested and genomic DNA was prepared from Spathularia flavida according to the materials and methods section. A lambda phage genomic library was prepared with an average insert size of between 3 and 10 kb and a primary titer of 370,000 pfu/ml. 200,000 plaques were plated on four plates at 50,000 pfu per 125 mm petri plate containing NZY media. Plaque lifts were performed and hybridisation carried out as described. The hybridisation probe was generated by PCR on the same genomic DNA used to make the Spathularia flavida genomic library, using primers 647 and 648 (Example 2).
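The plating volumes implied by the primary titer above follow from a one-line calculation. This is a hedged sketch that assumes plating directly from the undiluted primary library, which the text does not state; the function name is hypothetical.

```python
def volume_ml_for_pfu(target_pfu, titer_pfu_per_ml):
    """Volume of phage stock (ml) containing target_pfu at the given titer."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return target_pfu / titer_pfu_per_ml


if __name__ == "__main__":
    per_plate = volume_ml_for_pfu(50_000, 370_000)
    total = volume_ml_for_pfu(200_000, 370_000)
    print(f"{per_plate * 1000:.0f} ul per plate, {total:.2f} ml in total")
```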
Twelve positive hybridising plaques were identified and purified by two successive rounds of replating and hybridisation. A total of 11 isolates were subcloned into pBK-CMV plasmid by the automatic subcloning procedure (Stratagene, USA). The clones, which contained subgenomic fragments of S. flavida hybridising to the probe, were analysed for restriction polymorphisms by agarose electrophoresis. Distinct clones were selected for sequencing. Sequencing was performed using both custom primers and in vivo transposon-generated sequencing using the Primer Island Transposon kit (New England Biolabs, USA). Sequence analysis was performed with DNA Star, also as described in the materials and methods section.

Claims

1. A method of screening for a DNA sequence coding for an enzyme of interest, the method comprising:
a) selecting a microorganism of interest and obtaining genomic DNA or cDNA from said microorganism;
b) selecting a forward PCR-primer comprising sense DNA corresponding to an amino acid sequence of at least five residues chosen from the group consisting of: R-P-G-[P/S]-Y-I-N-A-E (SEQ ID No. 1), A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P (SEQ ID No. 2), E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G (SEQ ID No. 3), Y-T-S-Y-D-Y-G-S (SEQ ID No. 4), D-K-V-R-G (SEQ ID No. 5), and N-E-G-G-L-[Y/F]-A-E-R (SEQ ID No. 6);
c) selecting a reverse PCR-primer comprising anti-sense DNA corresponding to an amino acid sequence of at least five residues chosen from the group consisting of: A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P (SEQ ID No. 2), E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G (SEQ ID No. 3), Y-T-S-Y-D-Y-G-S (SEQ ID No. 4), D-K-V-R-G (SEQ ID No. 5), N-E-G-G-L-[Y/F]-A-E-R (SEQ ID No. 6), and G-P-Q-[T/A]-S-F-P-V-P-[E/V]-G-I (SEQ ID No. 7);
d) performing a PCR reaction using the genomic DNA or cDNA from step (a) as template with the primers from step (b) and step (c), and screening the PCR products for a generated DNA fragment of interest.
2. The method of claim 1, wherein the primer of step (b) comprises the DNA sequence 5'CCXTAYATH, preferably 5'GGXCCXTAYATHAA, most preferably 5'CCXGGXCCXTAYATHAAYGC; and the primer of step (c) comprises the DNA sequence 5'TGGGGXGGX, preferably 5'CCXTGGGGXGGXCCX, most preferably 5'GAYCCXTGGGGXGGXCCXGGG.
3. The method of claims 1 - 2, wherein the DNA fragment of claim 1 step (d) is cloned in a suitable vector which is then transformed into a suitable homologous or heterologous microbial host cell.
4. A method of screening for a DNA sequence coding for an enzyme of interest, the method comprising:
a) selecting a microorganism of interest and obtaining genomic DNA or cDNA from said microorganism;
b) selecting a nucleotide probe comprising a sense or an antisense nucleotide sequence corresponding to an amino acid sequence of at least five residues chosen from the group consisting of: R-P-G-[P/S]-Y-I-N-A-E (SEQ ID No. 1), A-V-D-I-Y-G-[L/H]-D-[A/S]-Y-P-[Q/L]-G-F-D-C-[A/S]-N-P (SEQ ID No. 2), E-F-Q-[A/G]-G-[A/S]-[F/Y]-D-P-W-G-G-P-G (SEQ ID No. 3), Y-T-S-Y-D-Y-G-S (SEQ ID No. 4), D-K-V-R-G (SEQ ID No. 5), N-E-G-G-L-[Y/F]-A-E-R (SEQ ID No. 6), and G-P-Q-[T/A]-S-F-P-V-P-[E/V]-G-I (SEQ ID No. 7);
c) using the probe of step (b) in a Southern blot against fragmented genomic DNA or fragmented cDNA of step (a).
5. The method of any of claims 1 - 4, wherein the microorganism of claims 1 and 4 step (a) is a fungus.
6. The method of claim 5, wherein the fungus is of the Phylum Basidiomycota, Ascomycota or Zygomycota.
7. The method of claim 5, wherein the fungus is of a Class selected from the group consisting of: Hymenomycetes, Gasteromycetes, Loculoascomycetes, Discomycetes, Plectomycetes, Hemiascomycetes, Archiascomycetes, Pyrenomycetes and Zygomycetes.
8. The method of claim 5, wherein the fungus is of an Order selected from the group consisting of: Agaricales, Polyporales, Stereales, Hymenochaetales, Hericiales, Boletales, Chantarellales, Tremellales, Auriculariales, Dothideales, Coryneliales, Rhytismatales, Pezizales, Helotiales, Eurotiales, Saccharomycetales, Schizosaccharomycetales, Xylariales, Hypocreales, Sordariales, Microascales, Diaporthales, Trichosphaeriales, Phyllachorales, Mucorales, Kikxellales, Enthomophthorales and Dimargaritales.
9. The method of claim 5, wherein the fungus is of a Genus selected from the group consisting of: Meripilus, Mycena, Trametes, Spathularia, Diplodia, Microsphaeropsis, Penicillium, Petromyces, and Christaspora.
10. The method of claim 1 or 4, wherein the genomic DNA or cDNA of step (a) is fragmented and a DNA fragment of interest is chosen which hybridizes in a Southern blot with the DNA fragment of claim 1 step (d) or with the nucleotide probe of claim 4 step (b).
11. The method of claim 10, wherein the chosen DNA fragment is cloned in a suitable vector which is then transformed into a suitable homologous or heterologous microbial host cell.
12. The method of claims 3 or 11, wherein the DNA fragment is stably integrated into the host cell genome, preferably in multiple copies.
13. The method of claim 3, 11 or 12, where the host cell is a filamentous fungus or a yeast-like cell.
14. The method of claim 3, 11 and 12, where the host cell is a fungus.
15. The method of claim 14, where the fungus is an Aspergillus, a Fusarium, a Meripilus, a Trametes, a Penicillium, a Microsphaeropsis, a Mycena, a Spathularia, a Diplodia, a Petromyces, a Christaspora, or a Hansenula cell.
16. The method of claim 14, where the fungus is selected from the group consisting of: Aspergillus niger, Aspergillus oryzae, Aspergillus nidulans, Fusarium venenatum, Meripilus giganteus, Trametes ochracea, Penicillium roquefortii, Penicillium carneum, Microsphaeropsis, Mycena pura, Spathularia flavida, Diplodia gossypina, Petromyces alliaceus, and Christaspora arxii.
17. A process for producing an enzyme of interest, the process comprising steps of:
a) screening for a DNA sequence coding for the enzyme of interest by the method of any of claims 1 - 10,
b) culturing the microorganism of claim 1 or 4 step (a), the genomic DNA or cDNA of which comprises the DNA sequence coding for the enzyme of interest, under suitable conditions to express the enzyme of interest, and
c) recovering the enzyme from the culture.
18. A process for producing an enzyme of interest, the process comprising steps of:
a) screening for a DNA sequence coding for the enzyme of interest and isolating said DNA sequence by the method of any of claims 3, 11 - 16,
b) culturing the microbial host cell of claims 3, 11 - 16 under suitable conditions to express the enzyme of interest and recovering the expressed enzyme from the culture.
19. An isolated enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Agaricales, Polyporales, Phanerochaetales, Leotiales, or Dothideales Order, said enzyme comprising at least one of the amino acid sequences shown in SEQ ID No. 1-7, and said enzyme being obtainable by the method of claim 17 or 18.
20. An isolated enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Meripilus, Mycena, Trametes, Spathularia, Diplodia, Microsphaeropsis, Penicillium, Petromyces, or Christaspora Genus, said enzyme comprising at least one of the amino acid sequences shown in SEQ ID No. 1-7, and said enzyme being obtainable by the method of claim 17 or 18.
21. An isolated enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Meripilus giganteus, Trametes ochracea, Penicillium roquefortii, Penicillium carneum, Microsphaeropsis sp., Mycena pura, Spathularia flavida, Diplodia gossypina, Petromyces alliaceus, or Christaspora arxii species, said enzyme comprising at least one of the amino acid sequences shown in SEQ ID No. 1-7, and said enzyme being obtainable by the method of claim 17 or 18.
22. An isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Agaricales, Polyporales, Phanerochaetales, Leotiales, or Dothideales Order, said sequence comprising subsequences that encode at least one of the amino acid sequences shown in SEQ ID No. 1-7, and said sequence being obtainable by the method of any of claims 1 - 16.
23. An isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Meripilus, Mycena, Trametes, Spathularia, Diplodia, Microsphaeropsis, Penicillium, Petromyces, or Christaspora Genus, said sequence comprising subsequences that encode at least one of the amino acid sequences shown in SEQ ID No. 1-7, and said sequence being obtainable by the method of any of claims 1 - 16.
24. An isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) originating from a cell of the Meripilus giganteus, Trametes ochracea, Penicillium roquefortii, Penicillium carneum, Microsphaeropsis sp., Mycena pura, Spathularia flavida, Diplodia gossypina, Petromyces alliaceus, or Christaspora arxii species, said sequence comprising subsequences that encode at least one of the amino acid sequences shown in SEQ ID No. 1-7, and said sequence being obtainable by the method of any of claims 1 - 16.
25. An isolated enzyme with lactase activity (EC 3.2.1.23) which is obtainable by the method of any of claims 17 - 18 and which comprises an amino acid sequence at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to any of the sequences shown in SEQ ID No's: 9, 20, 22, 24, 26, 28, 30, 32, 34, 36, or 38.
26. An isolated enzyme with lactase activity (EC 3.2.1.23) which is obtainable by the method of any of claims 17 - 18 and which comprises an amino acid sequence at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to a lactase encoded by a DNA sequence comprised in a strain chosen from the group consisting of Microsphaeropsis sp. CBS 102583, Trametes ochracea CBS 102584, Penicillium carneum CBS 102585, and Meripilus giganteus CBS 521.95.
27. An isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23), said sequence being obtainable by the method of any of claims 1 - 16, and said sequence being at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to any of the sequences shown in SEQ ID No's: 8, 18, 19, 21, 23, 25, 27, 29, 31, 33, 35, or 37.
28. An isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23), said sequence being obtainable by the method of any of claims 1 - 16, and said sequence being at least 85%, preferably 90%, more preferably 95%, most preferably 98% identical to the lactase encoding sequence comprised in a strain chosen from the group consisting of Microsphaeropsis sp. CBS 102583, Trametes ochracea CBS 102584, Penicillium carneum CBS 102585, and Meripilus giganteus CBS 521.95.
29. An isolated DNA sequence encoding an enzyme with lactase activity (EC 3.2.1.23) said sequence obtainable by shuffling at least two isolated DNA sequences as defined in claims 22 - 24, or 27 - 28.
30. A recombinant vector comprising a DNA sequence as defined in any of claims 22 - 24, 27 - 29.
31. A recombinant host cell comprising a DNA sequence according to any of claims 22 - 24, 27 - 29, or the vector according to claim 30.
32. A transgenic animal comprising and expressing the nucleic acid construct according to any of claims 22 - 24, or 27 - 29.
33. A transgenic plant containing and expressing the nucleic acid construct according to any of claims 22 - 24, or 27 - 29.
34. A method of producing an enzyme with lactase activity (EC 3.2.1.23), which method comprises recovering the enzyme from the transgenic animal according to claim 32.
35. A method of producing an enzyme with lactase activity (EC 3.2.1.23), which method comprises growing a cell of a transgenic plant according to claim 33, and recovering the enzyme from the resulting plant.
36. A composition comprising an enzyme with lactase activity (EC 3.2.1.23) as defined in any of claims 19 - 21, or 25 - 26.
37. Use of an enzyme with lactase activity or use of a composition comprising an enzyme with lactase activity as defined in any of claims 19 - 21, 25 - 26, or 36, in the manufacture or processing of foodstuffs or feeds fit for consumption by lactose intolerant humans or animals.
PCT/DK2000/000693 1999-12-30 2000-12-14 FUNGAL EXTRACELLULAR Fam35 BETA-GALACTOSIDASES WO2001049878A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU19961/01A AU1996101A (en) 1999-12-30 2000-12-14 Fungal extracellular fam35 beta-galactosidases

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
DKPA199901888 1999-12-30
DKPA200000397 2000-03-13
DKPA200001529 2000-10-13

Publications (1)

Publication Number Publication Date
WO2001049878A1 true WO2001049878A1 (en) 2001-07-12

Family

ID=27221429

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/DK2000/000693 WO2001049878A1 (en) 1999-12-30 2000-12-14 FUNGAL EXTRACELLULAR Fam35 BETA-GALACTOSIDASES

Country Status (2)

Country Link
AU (1) AU1996101A (en)
WO (1) WO2001049878A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7666649B2 (en) 2004-08-06 2010-02-23 Novozymes A/S Polypeptides of Botryospaeria rhodina
CN112969798A (en) * 2018-11-13 2021-06-15 株式会社益力多本社 Method for producing secreted beta-galactosidase

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
ACTA BIOTECHNOLOGICA, vol. 15, no. 2, 1995, pages 211 - 222 *
DATABASE BIOSIS [online] BILAI T.I. ET AL.: "Extracellular beta galactosidase of penicillium-canescens sopp. 20171", XP002939817, accession no. STN International Database accession no. 1988:503031 *
DATABASE BIOSIS [online] ROGALSKI J. ET AL.: "The purification and immobilization of penicillium notatum beta-galactosidase", XP002939816, accession no. STN International Database accession no. 1995:509567 *
DATABASE CAPLUS [online] LOBARZEWSKI JERZY: "Fungus trametes versicolor (basidiomycetes) grown on whey as a new source of .beta.-galactosidase", XP002939815, accession no. STN International Database accession no. 1978:4715 *
DATABASE WPI Week 199701, Derwent World Patents Index; AN 1997-006243, XP002939814, KOKUZEI CHO CHOHAN: "Aspergillus oryzae beta-galactosidase gene - useful for recombinant production of the enzyme in filamentous fungi" *
MICROBIOL. ZH, vol. 50, no. 4, 1988, (KIEV), pages 48 - 51 *
SPOZYW., vol. 31, no. 7, 1977, pages 270 - 273 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7666649B2 (en) 2004-08-06 2010-02-23 Novozymes A/S Polypeptides of Botryospaeria rhodina
US8143047B2 (en) 2004-08-06 2012-03-27 Novozymes A/S Polypeptides of botryosphaeria rhodina
CN112969798A (en) * 2018-11-13 2021-06-15 株式会社益力多本社 Method for producing secreted beta-galactosidase
EP3882353A4 (en) * 2018-11-13 2022-10-12 Kabushiki Kaisha Yakult Honsha Method for producing secreted beta-galactosidase

Also Published As

Publication number Publication date
AU1996101A (en) 2001-07-16

Similar Documents

Publication Publication Date Title
JP4050329B2 (en) Polypeptide having prolyl dipeptidyl aminopeptidase activity and nucleic acid encoding the same
CN109312353A (en) Improve microorganism by CRISPR- inhibition
US20130089915A1 (en) DNA Sequences For Regulating Transcription
SK280670B6 (en) Purified and isolated dna sequence, construct, vector, transformed cell, peptide or protein having phytase activity, process for its preparation, and its use
CN105671027A (en) Methods for using positively and negatively selectable genes in filamentous fungal cell
CA2655478A1 (en) Catalytically inactive proteins and method for recovery of enzymes from plant-derived materials
WO1999061651A2 (en) Methods for producing a polypeptide by modifying the copy number of a gene
US5989889A (en) Nucleic acids encoding polypeptides having tripeptide aminopeptidase activity
US9045748B2 (en) Methods for transforming and expression screening of filamentous fungal cells with a DNA library
JP2005514911A6 (en) DNA sequence for transcriptional regulation
AU2002354845A1 (en) DNA sequences for regulating transcription
JP4563585B2 (en) Fungal transcriptional activators useful in polypeptide production methods
JPH08500733A (en) Fungal promoter active in the presence of glucose
US8609386B2 (en) Polypeptides having tyrosinase activity and polynucleotides encoding same
WO2001040489A1 (en) Methods for producing a polypeptide using a consensus translational initiator sequence
JP2003516112A (en) Recombinant penicillium funiculosum for production of homologous or heterologous proteins
WO2001049878A1 (en) FUNGAL EXTRACELLULAR Fam35 BETA-GALACTOSIDASES
US5770371A (en) Modification of cryptic splice sites in heterologous genes expressed in fungi
WO2022251056A1 (en) Transcriptional regulators and polynucleotides encoding the same
US5874275A (en) Polypeptides having mutanase activity and nucleic acids encoding same
WO2002032947A1 (en) Transgenic plants
EP0912748B1 (en) Modification of cryptic splice sites in heterologous genes expressed in fungi
WO2001032834A1 (en) A method of screening cell populations
US20050227357A1 (en) Recombinant Penicillium funiculosum for homologous and heterologous protein production
US20030032165A1 (en) Aspartic acid proteases and nucleic acids encoding same

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP