WO2018093752A1 - Filamentous fungi with improved protein production - Google Patents

Filamentous fungi with improved protein production

Publication number
WO2018093752A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
filamentous fungal
mutation
protein
polypeptide
Application number
PCT/US2017/061475
Other languages
French (fr)
Inventor
David A. Estell
Michael C. Miller
Original Assignee
Danisco Us Inc.
Application filed by Danisco Us Inc.
Publication of WO2018093752A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/80 Vectors or expression systems specially adapted for eukaryotic hosts for fungi
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/52 Genes encoding for enzymes or proenzymes
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/48 Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/58 Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from fungi
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P21/00 Preparation of peptides or proteins
    • C12P21/02 Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione

Definitions

  • filamentous fungi are known for their robust ability to secrete large quantities of proteins. Such expression can reach as high as 40 g/L (Durand et al., Enzyme and Microbial Technology, 1988. 10(6):341-346) with a translational and post-translational modification process similar to that of mammalian cells except for their glycosylation. They are widely used in the chemical, pharmaceutical and food industries and are generally regarded as safe (Schuster, E., et al., Appl Microbiol Biotechnol, 2002. 59(4-5):426-35). However, heterologous protein production in filamentous fungi is still relatively low compared to the more common bacterial expression systems. Thus, a need exists for more efficient expression systems that can produce heterologous proteins in greater quantities.
  • SUMMARY
  • compositions and methods relate to recombinant filamentous fungal cells that produce decreased amounts of proteases and/or increased amounts of protease inhibitors compared to comparable parental cells.
  • compositions and methods are described in the following, independently-numbered paragraphs.
  • a filamentous fungal cell comprising at least one mutation that decreases the amount of active protease in the cell, wherein the mutation is in a gene encoding a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24 and SEQ ID NO: 25.
  • the mutation is located in a non-coding region of the gene.
  • the mutation is located in a coding region of the gene.
  • the mutation results in reduced expression of a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22 and SEQ ID NO: 23.
  • the mutation is a polynucleotide homologous to or identical to a polynucleotide selected from the group consisting of the polynucleotide of SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47 and SEQ ID NO: 48.
  • 6. In some embodiments of the filamentous fungal cell of any of paragraphs 1-3, the mutation results in overexpression of the polypeptide homologous to a polypeptide
  • the mutation is a polynucleotide homologous to or identical to a polynucleotide of SEQ ID NO: 49 or SEQ ID
  • the mutation comprises an insertion mutation.
  • the insertion mutation comprises insertion of a selectable marker.
  • the insertion mutation comprises insertion of an expression cassette for overexpressing the polypeptide of SEQ ID NO: 24 and/or SEQ ID NO: 25.
  • the filamentous fungal cell is an Aspergillus species, a Rhizopus species, a Trichoderma species or a Mucor species.
  • the Trichoderma species is selected from the group consisting of Trichoderma reesei, Trichoderma viride, Trichoderma koningii, and Trichoderma harzianum.
  • the mutation results in increased production of a protein of interest compared to otherwise identical parental filamentous fungal cells that lack the deletion.
  • the protein of interest is an antibody or fragment thereof.
  • a method for increasing expression of a protein of interest in a filamentous fungal host comprising: (i) introducing a mutation in a gene encoding a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24 and SEQ ID NO: 25 into filamentous fungal host cells capable of expressing a protein of interest; (ii) cultivating the group consisting of SEQ ID NO
  • the mutation is located in a non-coding region of the gene, a coding region of the gene, or both.
  • the mutation results in reduced expression of a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22 and SEQ ID NO: 23, and/or overexpression of a polypeptide homologous to or identical to a polypeptide of SEQ ID NO: 24 and/or SEQ ID NO: 25.
  • the mutation is a polynucleotide homologous to or identical to a polynucleotide selected from the group consisting of the polynucleotide of SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49 and SEQ ID NO: 50.
  • the mutation comprises an insertion mutation, optionally having a selectable marker.
  • the filamentous fungal cell is an Aspergillus species, a Rhizopus species, a Trichoderma species or a Mucor species.
  • the mutation results in increased production of a protein of interest compared to otherwise identical parental filamentous fungal cells that lack the deletion.
  • the protein of interest is an antibody or fragment thereof.
  • a method for increasing expression of a protein of interest in a filamentous fungal host comprising: (i) cultivating the filamentous fungal cell of any of paragraphs 1-14 under conditions conducive for production of a protein of interest; and (ii) recovering the protein of interest; wherein the presence of the mutation results in increased production of the protein of interest compared to the production in otherwise identical parental filamentous fungal cells that lack the deletion.
  • the present invention relates to recombinant filamentous fungal cells that produce decreased amounts of proteases and/or increased amounts of protease inhibitors compared to comparable parental cells.
  • the recombinant cells are capable of expressing at least one heterologous protein of interest encoded by a heterologous gene, or overexpressing a homologous protein of interest. Nucleic acids and methods for making the mutant filamentous fungal cells are provided, as well as methods for using the cells for the altered production of heterologous proteins of interest.
  • proteases and protease inhibitors relevant to the present recombinant filamentous fungal cells are listed in Table 1, below.
  • JGI: Joint Genome Institute, U.S. Department of Energy
  • AA: amino acid
  • NA: nucleic acid
  • VFGGVDDAHYEGKIEYIPLRRKAYWEVDLDSIAFGDEVAELENTGAILDTGTSLNVLP SGLAELLNAEIGAKKGFGGQYTVDCSKRDSLPDITFSLAGSKYSLPASDYIIEMSGNCIS
  • GQVC SWDDDDREVETLS GLQPECPHCVGATEARQERGKRALGVGKLRPDIVLYGEEH PSAHLISPIVTHDLALYPDMLLILGTSLRVHGLKVLVREFAKTVHSRGGKVVFVNFTKP
  • the recombinant filamentous fungal cells include a mutation in a gene that has at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or even at least 99% nucleic acid sequence identity to the polynucleotide of SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49 and/or SEQ
  • the recombinant filamentous fungal cells include a mutation in a gene that encodes a polypeptide having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or even at least 99% amino acid sequence identity to the polypeptide of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24 and/or SEQ ID NO:
  • nucleic acids are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxyl orientation, respectively.
  • the headings provided herein are not limitations of the various aspects or embodiments of the invention that can be had by reference to the specification as a whole. Accordingly, the terms defined immediately below are more fully defined by reference to the Specification as a whole.
  • the term “impaired” or “impairment” refers to any method that decreases, but does not abolish, the functional expression of one or more genes or the functional activity of the resulting gene product (i.e., protein), fragments or homologues thereof, wherein the gene or gene product exerts its known function to a lesser extent than in the corresponding parent strain. It is intended to encompass any means of gene impairment, including partial deletions, disruptions of the protein-coding sequence, non-coding sequences, or both, insertions, additions, mutations, gene silencing (e.g., RNAi genes, antisense) and the like.
  • deletion of a gene refers to deletion of the entire coding sequence, deletion of part of the coding sequence, or deletion of the coding sequence including flanking regions.
  • a “disruption sequence” or “disruption mutant” as used herein refers to a nucleic acid or amino acid sequence, typically a coding region sequence, that comprises an insertion of nucleotides or amino acids.
  • insertion or “addition” in the context of a sequence refers to a change in a nucleic acid or amino acid sequence in which one or more nucleotides or amino acid residues have been added as compared to the endogenous chromosomal sequence or protein product.
  • non-revertable refers to a strain which will naturally revert back to its corresponding parent strain with a frequency of less than 10⁻⁷
  • corresponding parent strain refers to the host strain from which a mutant is derived (e.g., the originating and/or wild-type strain).
  • strain viability refers to reproductive viability.
  • the impairment of a gene does not deleteriously affect division and survival of the mutant under laboratory conditions.
  • coding region refers to the region of a gene that encodes the amino acid sequence of a protein.
  • amino acid refers to peptide or protein sequences or portions thereof.
  • protein refers to peptide or protein sequences or portions thereof.
  • polypeptide refers to proteins and amino acids.
  • heterologous protein or “exogenous protein” refers to a protein or polypeptide that does not naturally occur in the host cell, and includes genetically engineered versions of naturally occurring endogenous proteins.
  • endogenous protein or “native protein” refers to a protein or polypeptide naturally occurring in a cell.
  • host refers to a cell that can express a
  • filamentous fungal cell refers to a cell of any of the species of microscopic fungi that grow as multicellular filamentous strands including but not limited to:
  • Aspergillus sp., Rhizopus sp., Trichoderma sp., and Mucor sp.
  • “Aspergillus” or “Aspergillus sp.” includes all species within the genus “Aspergillus,” as known to those of skill in the art, including but not limited to A. oryzae, A. niger, A. awamori, A. nidulans, A. sojae, A. japonicus, A. kawachi and A. aculeatus.
  • nucleic acid refers to a nucleotide or polynucleotide sequence, and fragments or portions thereof, as well as to DNA, cDNA, and RNA of genomic or synthetic origin which may be double-stranded or single-stranded, whether representing the sense or antisense strand. It will be understood that as a result of the degeneracy of the genetic code, a multitude of nucleotide sequences may encode a given protein.
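The degeneracy statement above can be made concrete with a small calculation. The following is a minimal sketch, assuming the standard genetic code (61 sense codons, stop codons excluded); the function name `count_encodings` and the lookup table are illustrative and do not come from the specification. The example peptide VFGG is the start of one of the sequence fragments listed in this document.

```python
from math import prod

# Codons per amino acid in the standard genetic code.
# Assumption: the host uses the standard nuclear code; stop codons are excluded.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

def count_encodings(peptide: str) -> int:
    """Return how many distinct nucleotide sequences encode `peptide`."""
    # Each residue contributes an independent choice among its synonymous codons.
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())

# V (4 codons) x F (2) x G (4) x G (4)
print(count_encodings("VFGG"))  # 128
```

Even a four-residue peptide has 128 possible coding sequences under this assumption, which is why polypeptide-level and polynucleotide-level identity are stated separately.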
  • the term “gene” means a segment of DNA involved in producing a polypeptide and can include regions preceding and following the coding regions (e.g., promoter, terminator, 5' untranslated (5' UTR) or leader sequences and 3' untranslated (3' UTR) or trailer sequences), as well as intervening sequences (introns) between individual coding segments (exons).
  • homologous gene refers to a gene which has a homologous sequence and results in a protein having an identical or similar function.
  • the term encompasses genes that are separated by speciation (i.e., the development of new species) (e.g., orthologous genes), as well as genes that have been separated by genetic duplication (e.g., paralogous genes).
  • homologous sequences refers to a nucleic acid or polypeptide sequence having at least about 99%, at least about 98%, at least about 97%, at least about 96%, at least about 95%, at least about 94%, at least about 93%, at least about 92%, at least about 91%, at least about 90%, at least about 88%, at least about 85%, at least about 80%, at least about 75% or at least about 70% sequence identity to a subject nucleotide or amino acid sequence when optimally aligned for comparison.
  • homologous sequences have between about 80% and 100% sequence identity, in some embodiments between about 90% and 100% sequence identity, and in some embodiments, between about 95% and 100% sequence identity.
  • Sequence homology can be determined using standard techniques known in the art (see e.g., Smith and Waterman, Adv. Appl. Math., 1981. 2:482; Needleman and Wunsch, J. Mol. Biol., 1970. 48:443; Pearson and Lipman, Proc. Natl. Acad. Sci. USA 1988. 85:2444;
  • PILEUP uses a simplification of the progressive alignment method of Feng and Doolittle (Feng and Doolittle, J. Mol. Evol., 1987. 35:351-360). The method is similar to that described by Higgins and Sharp (Higgins and Sharp, CABIOS 1989. 5:151-153).
  • Useful PILEUP parameters include a default gap weight of 3.00, a default gap length weight of 0.10, and weighted end gaps.
  • a particularly useful BLAST program is the WU-BLAST-2 program (See, Altschul et al., Meth. Enzymol., 1996. 266:460-480).
  • WU-BLAST-2 uses several search parameters, most of which are set to the default values.
  • the HSP S and HSP S2 parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched. However, the values may be adjusted to increase sensitivity.
  • a % amino acid sequence identity value is determined by the number of matching identical residues divided by the total number of residues of the "longer" sequence in the aligned region. The "longer" sequence is the one having the most actual residues in the aligned region (gaps introduced by WU-Blast-2 to maximize the alignment score are ignored).
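The identity calculation described above can be sketched as follows. This is a minimal illustration, not WU-BLAST-2 itself: it assumes the two sequences have already been aligned by some other tool and are passed in as equal-length strings with `-` marking gaps. The function names and the 70% default threshold (the lowest figure given for homologous sequences in this document) are illustrative choices.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an aligned region.

    Matching identical residues are divided by the residue count of the
    "longer" sequence, i.e. the one with the most actual (non-gap) residues.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    # Gaps introduced by the aligner are ignored when counting residues.
    residues_a = sum(1 for a in aligned_a if a != gap)
    residues_b = sum(1 for b in aligned_b if b != gap)
    longer = max(residues_a, residues_b)
    if longer == 0:
        raise ValueError("aligned region contains no residues")
    return 100.0 * matches / longer


def is_homologous(identity_pct: float, threshold_pct: float = 70.0) -> bool:
    """Apply an identity cut-off (hypothetical helper; threshold is adjustable)."""
    return identity_pct >= threshold_pct


# Five identical positions; the longer sequence has seven residues.
pct = percent_identity("MKT-AYI", "MKTLAYV")
print(round(pct, 1), is_homologous(pct))
```

Dividing by the longer sequence rather than the alignment length keeps a short fragment from scoring 100% against a much longer protein.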
  • vector refers to any nucleic acid that can be replicated in cells and can carry new genes or DNA segments into cells. Thus, the term refers to a nucleic acid construct designed for transfer between different host cells.
  • An "expression vector” refers to a vector that has the ability to incorporate and express heterologous DNA fragments (i.e., non-native DNA) in a cell. Many prokaryotic and eukaryotic expression vectors are commercially available. Selection of appropriate expression vectors is within the knowledge of those having skill in the art.
  • DNA construct refers to a nucleic acid molecule generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a target cell (i.e., vectors or vector elements, as described above).
  • DNA construct can be incorporated into a plasmid, chromosome, mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment.
  • DNA constructs also include a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a target cell.
  • a DNA construct of the invention comprises a selectable marker.
  • DNA construct refers to DNA that is used to introduce sequences into a host cell or organism (i.e., "transform a host cell”).
  • the DNA construct may be generated in vitro by PCR or any other suitable techniques.
  • the transforming DNA can include an incoming sequence, and/or can include an incoming sequence flanked by homology boxes.
  • the transforming DNA comprises other non-homologous sequences, added to the ends (e.g., stuffer sequences or flanks). The ends can be closed such that the transforming DNA forms a closed circle (i.e., a plasmid), such as, for example, insertion into a vector.
  • plasmid refers to a circular double-stranded (ds) DNA construct used as a cloning vector, and which forms an extrachromosomal self-replicating genetic element in many bacteria and some eukaryotes. In some embodiments, plasmids become incorporated into the genome of the host cell.
  • isolated and purified are used to refer to a molecule (e.g., a nucleic acid or polypeptide) or other component that is removed from at least one other component with which it is naturally associated.
  • altered expression is construed to include an increase or decrease in production of a protein of interest by an altered (i.e., engineered) cell strain relative to the normal level of production from the corresponding unaltered parent strain (i.e., when grown under essentially the same conditions).
  • the term “enhanced expression” is construed to include the increased production of a protein of interest by an altered (i.e., engineered) cell strain above the normal level of production from the corresponding unaltered parent strain (i.e., when grown under essentially the same conditions).
  • the term "expression” refers to a process by which a polypeptide is produced.
  • the process includes both transcription and translation of the gene.
  • the process also includes secretion of the polypeptide.
  • introducing refers to any method suitable for transferring the nucleic acid sequence into the cell, including but not limited to transformation, electroporation, nuclear microinjection, transduction, transfection (e.g., lipofection-mediated and DEAE-dextran-mediated transfection), incubation with calcium phosphate DNA precipitate, high velocity bombardment with DNA-coated microprojectiles, Agrobacterium-mediated transformation, and protoplast fusion.
  • stably transformed refers to a cell that has a non-native (heterologous) polynucleotide sequence integrated into its genome or as an episomal plasmid that is maintained for at least two generations.
  • an incoming sequence refers to a DNA sequence that is being introduced into a host cell.
  • the incoming sequence can be part of a DNA construct, can encode one or more proteins of interest (e.g., heterologous protein), can be a functional or nonfunctional gene and/or a mutated or modified gene, and/or can be a selectable marker gene(s).
  • homology box refers to a nucleic acid sequence that is homologous to the sequence of a gene in the chromosome of a filamentous fungal cell. More specifically, a homology box is an upstream or downstream region having between about 80 and 100% sequence identity, between about 90 and 100% sequence identity, or between about 95 and 100% sequence identity with the immediate flanking coding region of a gene or part of a gene to be impaired according to the invention. These sequences direct where in the chromosome a DNA construct or incoming sequence is integrated and direct what part of the chromosome is replaced by the DNA construct or incoming sequence.
  • a homology box may include between about 1 base pair (bp) and 200 kilobases (kb).
  • a homology box includes between about 1 bp and 10.0 kb; between 1 bp and 5.0 kb; between 1 bp and 2.5 kb; between 1 bp and 1.0 kb; or between 0.25 kb and 2.5 kb.
  • a homology box may also include about 10.0 kb, 5.0 kb, 2.5 kb, 2.0 kb, 1.5 kb, 1.0 kb, 0.5 kb, 0.25 kb or 0.1 kb.
  • the 5' and 3' ends of a selective marker are flanked by a homology box wherein the homology box comprises nucleic acid sequences immediately flanking the coding region of the gene.
  • the transforming DNA sequence comprises homology boxes without the presence of an incoming sequence. In this embodiment, it is desired to delete the endogenous DNA sequence between the two homology boxes.
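The role of the homology boxes can be illustrated with a simple string model. The sketch below is an assumption-laden simplification: it treats the chromosome as a plain string, requires the boxes to match exactly (the text allows about 80 to 100% identity), and ignores strand, copy number and recombination efficiency. The function name `integrate` and the toy sequences are hypothetical.

```python
def integrate(chromosome: str, box5: str, box3: str, incoming: str = "") -> str:
    """Replace the region between two homology boxes with `incoming`.

    With an empty `incoming` sequence the region between the boxes is simply
    deleted, as in the embodiment that uses homology boxes alone.
    """
    start = chromosome.find(box5)
    if start == -1:
        raise ValueError("5' homology box not found in chromosome")
    inner_start = start + len(box5)
    # The 3' box must lie downstream of the 5' box.
    inner_end = chromosome.find(box3, inner_start)
    if inner_end == -1:
        raise ValueError("3' homology box not found downstream of 5' box")
    return chromosome[:inner_start] + incoming + chromosome[inner_end:]


# Toy example: the target region GGGGGG sits between the two boxes.
box5, box3 = "ACGTAC", "TTCATG"
chromosome = "CC" + box5 + "GGGGGG" + box3 + "AA"

print(integrate(chromosome, box5, box3))            # deletion
print(integrate(chromosome, box5, box3, "ATATAT"))  # replacement by an incoming sequence
```

In this model the boxes themselves are retained in the product, and only the sequence between them is exchanged.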
  • the transforming sequences are wild-type, while in other embodiments, they are mutant or modified sequences.
  • the transforming sequences are homologous, while in other embodiments, they are heterologous.
  • target sequence refers to a DNA sequence in the host cell that encodes the sequence where it is desired for the incoming sequence to be inserted into the host cell genome.
  • the target sequence encodes a functional wild-type gene or operon, while in other embodiments the target sequence encodes a functional mutant gene or operon, or a non-functional gene or operon.
  • a “flanking sequence” refers to any sequence that is either upstream or downstream of the sequence being discussed (e.g., for genes A-B-C, gene B is flanked by the A and C gene sequences).
  • the incoming sequence is flanked by a homology box on each side.
  • the incoming sequence and the homology boxes comprise a unit that is flanked by stuffer sequence on each side.
  • a flanking sequence is present on only a single side (either 3' or 5'), and in other embodiments, it is on each side of the sequence being flanked.
  • the sequence of each homology box is homologous to a sequence in the Aspergillus chromosome.
  • these sequences direct where in the Aspergillus chromosome the new construct gets integrated without any part of the chromosome being replaced by the incoming sequence.
  • the 5' and 3' ends of a selective marker are flanked by a polynucleotide sequence comprising a section of the desired chromosomal segment.
  • a flanking sequence is present on only a single side (either 3' or 5'), and in other embodiments, it is present on each side of the sequence being flanked.
  • chromosomally integrated refers to a sequence, typically a mutant gene (e.g., disrupted form of a native gene), that has become incorporated into the chromosomal DNA of a host cell.
  • chromosomal integration occurs via the process of "homologous recombination,” wherein the homologous regions of the introduced
  • selectable marker refers to a nucleic acid capable of expression in a host cell, which allows for ease of selection of those hosts containing the marker.
  • selectable marker refers to genes that provide an indication that a host cell has taken up (e.g., has been successfully transformed with) an incoming nucleic acid of interest or some other reaction has occurred.
  • selectable markers are genes that confer antimicrobial resistance or a metabolic advantage on the host cell to allow cells containing the exogenous DNA to be distinguished from cells that have not received any exogenous sequence during the transformation.
  • Selective markers useful with the present invention include, but are not limited to, antimicrobial resistance markers (e.g., ampR; phleoR; specR; kanR; eryR; tetR; cmpR; hygroR and neoR; see e.g., Guerot-Fleury, Gene, 1995. 167:335-337; Palmeros et al., Gene 2000. 247:255-264; and Trieu-Cuot et al.,
  • auxotrophic markers such as trpC, pyrG and amdS
  • detection markers such as β-galactosidase.
  • promoter refers to a nucleic acid sequence that functions to direct transcription of a downstream gene.
  • the promoter is appropriate to the host cell in which a desired gene is being expressed.
  • the promoter, together with other transcriptional and translational regulatory nucleic acid sequences (also termed “control sequences") is necessary to express a given gene.
  • the transcriptional and translational regulatory sequences include, but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop sequences, translational start and stop sequences, and enhancer or activator sequences.
  • a nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence.
  • DNA encoding a secretory leader (i.e., a signal peptide)
  • a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence
  • a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation.
  • operably linked means that the DNA sequences being linked are contiguous, and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers do not have to be contiguous. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice.
  • hybridization refers to the process by which a strand of nucleic acid joins with a complementary strand through base pairing, as known in the art.
  • a nucleic acid sequence is considered to be "selectively hybridizable" to a reference nucleic acid sequence if the two sequences specifically hybridize to one another under moderate to high stringency hybridization and wash conditions.
  • Hybridization conditions are based on the melting temperature (Tm) of the nucleic acid binding complex or probe.
  • “maximum stringency” typically occurs at about Tm - 5°C (5°C below the Tm of the probe); “high stringency” at about 5-10°C below the Tm; “intermediate stringency” at about 10-20°C below the Tm of the probe; and “low stringency” at about 20-25°C below the Tm.
  • maximum stringency conditions may be used to identify sequences having strict identity or near-strict identity with the hybridization probe; while an intermediate or low stringency hybridization can be used to identify or detect polynucleotide sequence homologs.
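The stringency bands defined relative to Tm can be expressed as a small lookup. This is a minimal sketch under stated assumptions: the text gives approximate, touching ranges, so the handling of the boundaries (a shared boundary value goes to the more stringent band) and of temperatures outside the named ranges is a choice made here, and the function name `classify_stringency` is hypothetical.

```python
def classify_stringency(tm_c: float, hybridization_temp_c: float) -> str:
    """Classify a hybridization temperature by how far it lies below Tm (in °C)."""
    delta = tm_c - hybridization_temp_c
    if delta < 0:
        return "above Tm"            # duplex is not expected to be stable
    if delta <= 5:
        return "maximum"             # about Tm - 5 °C
    if delta <= 10:
        return "high"                # about 5-10 °C below Tm
    if delta <= 20:
        return "intermediate"        # about 10-20 °C below Tm
    if delta <= 25:
        return "low"                 # about 20-25 °C below Tm
    return "below the low-stringency range"


# Example with an assumed probe Tm of 70 °C.
for temp in (65, 62, 55, 47):
    print(temp, classify_stringency(70, temp))
```

A probe-specific Tm still has to be estimated or measured separately; the sketch only maps an offset from Tm to the named band.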
  • Moderate and high stringency hybridization conditions are well known in the art.
  • An example of high stringency conditions includes hybridization at about 42°C in 50%
  • moderately stringent conditions include an overnight incubation at 37°C in a solution comprising 20% formamide, 5x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1x SSC at about 37-50°C.
  • recombinant used in reference to a cell or vector refers to being modified by the introduction of a heterologous nucleic acid sequence, or a cell derived from a cell so modified.
  • recombinant cells express genes that are not found in identical form within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, underexpressed, overexpressed or not expressed at all as a result of deliberate human intervention.
  • "Recombination," "recombining," or generating a "recombined" nucleic acid is generally the assembly of two or more nucleic acid fragments wherein the assembly gives rise to a chimeric gene.
  • the term "primer" refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH).
  • the primer is single stranded for maximum efficiency in amplification. Most often, the primer is an oligodeoxyribonucleotide.
  • PCR polymerase chain reaction
  • "restriction endonucleases" and "restriction enzymes" refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.
  • restriction site refers to a nucleotide sequence recognized and cleaved by a given restriction endonuclease and is frequently the site for insertion of DNA fragments.
  • restriction sites are engineered into the selective marker and into 5' and 3' ends of the DNA construct.
  • the host cell is a filamentous fungal cell.
  • Filamentous fungal cells useful with the present invention include, but are not limited to: Aspergillus sp. (e.g., A. oryzae, A. niger, A. awamori, A. nidulans, A. sojae, A. japonicus, A. kawachi and A.
  • the host cells are Aspergillus niger cells.
  • the mutation leading to decreased protease activity or increased protease inhibitor activity in the cell is a deletion in a non-coding regulatory region flanking the coding sequence of the gene.
  • the mutation comprises an insertion mutation.
  • the insertion mutation comprises insertion of a selectable marker.
  • the genomic DNA is already known, the 5' flanking fragment and the 3' flanking fragment of the locus to be deleted are cloned by two PCR reactions, and in embodiments wherein the locus is disrupted or otherwise altered, the DNA fragment is cloned by one PCR reaction.
  • the coding region flanking sequences include a range of about 1 bp to 2500 bp; about 1 bp to 1500 bp, about 1 bp to 1000 bp, about 1 bp to 500 bp, and 1 bp to 250 bp.
  • the number of nucleic acid sequences comprising the coding region flanking sequence may be different on each end of the gene coding sequence. For example, in some embodiments, the 5' end of the coding sequence includes less than 25 bp and the 3' end of the coding sequence includes more than 100 bp.
  • the incoming sequence is a disruption sequence that comprises a selective marker flanked on the 5' and 3' ends with a fragment of the gene sequence.
  • the location of the selective marker renders the gene non-functional for its intended purpose.
  • the incoming sequence comprises the selective marker located in the promoter region of the gene. In other embodiments, the incoming sequence comprises the selective marker located after the promoter region of the gene.
  • the incoming sequence is a disruption sequence comprising the selective marker located in the coding region of the gene.
  • the incoming sequence comprises a selective marker flanked by a homology box on both ends.
  • the incoming sequence includes a sequence that interrupts the transcription and/or translation of the coding sequence.
  • the DNA construct includes restriction sites engineered at the upstream and downstream ends of the construct.
  • the A. nidulans amdS gene provides a selectable marker system for the transformation of filamentous fungi useful with the present invention.
  • the amdS gene codes for an acetamidase enzyme deficient in strains of Aspergillus and provides positive selective pressure for transformants grown on acetamide media.
  • the amdS gene can be used as a selectable marker even in fungi known to contain an endogenous amdS gene or homolog, e.g., in A. nidulans (Tilburn et al. Gene 1983. 26:205-221) and A. oryzae (Gomi et al. Gene 1991. 108:91-98). Background amdS activity of non-transformants can be suppressed by the inclusion of CsCl in the selection medium.
  • the DNA constructs comprising an incoming sequence may be incorporated into a vector (e.g., in a plasmid), or used directly to transform the filamentous fungal cell, thereby resulting in a mutant.
  • the DNA construct is stably transformed, resulting in chromosomal integration of the impaired gene, which is non-revertible.
  • Exemplary vectors useful with the present invention include pBS-T, pFB6, pBR322, pUC18, pUC100 and pENTR/D.
  • At least one copy of a DNA construct is integrated into the host chromosome.
  • one or more DNA constructs of the invention are used to transform host cells.
  • Impairment occurs via any suitable means, including deletions, substitutions (e.g., mutations), disruptions, insertions in the nucleic acid gene sequence, and/or gene silencing mechanisms, such as RNA interference (RNAi).
  • the expression product of an impaired gene is a truncated protein with a corresponding change in the biological activity of the protein.
  • the impairment results in an attenuation of biological activity of the gene.
  • remaining residual activity will be less than 25%, 20%, 15%, 10%, 5%, or 2% compared to the biological activity of the same or homologous gene in a corresponding parent strain.
  • impairment is achieved by deletion and in other embodiments impairment is achieved by disruption of the protein-coding region of the gene.
  • the gene is altered by homologous recombination.
  • a deletion mutant comprises deletion of one or more genes that results in a stable and non-reverting deletion.
  • Flanking regions of the coding sequence may include from about 1 bp to about 500 bp at the 5' and 3' ends.
  • the flanking region may be larger than 500 bp but typically does not include other genes in the region which may be impaired or deleted according to the invention.
  • the disruption sequence comprises an insertion of a selectable marker gene into or near the protein-coding region.
  • this insertion is performed in vitro by reversely inserting a gene sequence into or near the coding region sequence of the gene to be impaired.
  • Flanking regions of the coding sequence may include about 1 bp to about 500 bp at the 5' and 3' ends.
  • the flanking region may be larger than 500 bp, but will typically not include other genes in the region.
  • the DNA construct aligns with the homologous sequence of the host chromosome and in a double crossover event the translation or transcription of the gene is disrupted.
  • decreased protease expression and/or increased protease inhibitor expression is accomplished by homologous recombination or DNA editing, for example, using the CRISPR method, optionally by insertion of a selectable marker in the coding or non-coding region of the target gene(s).
  • impairment of the gene is by insertion in a single crossover event with a plasmid as the vector.
  • the vector is integrated into the host cell chromosome and the gene is altered by the insertion of the vector in the protein-coding sequence of the gene or in the regulatory region of the gene.
  • impairment results due to mutation of the gene.
  • Methods of mutating genes include but are not limited to site-directed mutation, generation of random mutations, gapped-duplex approaches and CRISPR (See e.g., U.S. Pat. 4,760,025; Moring et al., Biotech. 1984. 2:646; and Kramer et al., Nucleic Acids Res., 1984. 12:9441).
  • a mutant encompassed by the invention will exhibit altered expression and translation (i.e., protein production) of one or more endogenous and/or heterologous proteins of interest in comparison to the expression and translation of the same protein(s) by the corresponding parent strain of filamentous fungus.
  • the mutants of filamentous fungal cells encompassed by the invention will produce the endogenous and/or heterologous proteins of interest in an amount at least about 5% to about 200% (or more) greater than the production of the same protein(s) in the corresponding parent strain. Accordingly, in some embodiments, the production of the protein(s) of interest by the mutant is at least about 0% to 100% greater, and in some embodiments is at least about 10% to 60% greater, including embodiments wherein production is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, and 55% greater, than the production of the endogenous and/or heterologous protein(s) in the corresponding parent strain.
  • the protein of interest produced by the mutant of a filamentous fungal cell is an intracellularly produced protein (i.e., an intracellular, non-secreted polypeptide).
  • the protein of interest is a secreted polypeptide.
  • the protein of interest may be a fusion or hybrid protein.
  • the mutant exhibits altered production of a plurality of proteins, some of which are intracellular and some of which are secreted.
  • Proteins of interest useful with the present invention include enzymes known in the art, including, but not limited to those chosen from amylolytic enzymes, proteolytic enzymes, cellulolytic enzymes, oxidoreductase enzymes and plant cell-wall degrading enzymes.
  • these enzymes include, but are not limited to amylases, glucoamylases, proteases, xylanases, lipases, laccases, phenol oxidases, oxidases, cutinases, cellulases, hemicellulases, esterases, peroxidases, catalases, glucose oxidases, phytases, pectinases, glucosidases, isomerases, transferases, galactosidases and chitinases.
  • enzymes include but are not limited to amylases, glucoamylases, proteases, phenol oxidases, cellulases, hemicellulases, glucose oxidases and phytases.
  • the polypeptide of interest is a protease, cellulase, glucoamylase or amylase.
  • the protein of interest is a secreted polypeptide, which is fused to a signal peptide (i.e., an amino-terminal extension on a protein to be secreted).
  • Nearly all secreted proteins use an amino-terminal protein extension, which plays a role in the targeting to and translocation of precursor proteins across the membrane. This extension is
  • the polypeptide of interest is a protein such as a protease inhibitor, which inhibits the action of proteases.
  • protease inhibitors are known in the art, for example the protease inhibitors belonging to the family of serine protease inhibitors which are known to inhibit trypsin, cathepsin G, thrombin and tissue kallikrein.
  • the protease inhibitors useful in the present invention are Bowman-Birk inhibitors and soybean trypsin inhibitors (See, Birk, Int. J. Pept. Protein Res. 1985. 25:113-131; Kennedy, Am. J. Clin. Nutr. 1998. 68:1406S-1412S and Billings et al., Proc. Natl. Acad. Sci. 1992. 89:3120-3124).
  • the polypeptide of interest is chosen from hormones, antibodies, growth factors, receptors, cytokines, etc.
  • Hormones encompassed by the present invention include but are not limited to, follicle-stimulating hormone, luteinizing hormone, corticotropin-releasing factor, somatostatin, gonadotropin hormone, vasopressin, oxytocin, erythropoietin, insulin and the like.
  • Growth factors include, but are not limited to platelet-derived growth factor, insulin-like growth factors, epidermal growth factor, nerve growth factor, fibroblast growth factor, transforming growth factors, cytokines, such as interleukins (e.g.
  • Antibodies include but are not limited to immunoglobulins obtained directly from any species from which it is desirable to produce antibodies.
  • the present invention encompasses modified antibodies. Polyclonal and monoclonal antibodies are also provided.
  • the antibodies or fragments thereof are chimeric or humanized antibodies, including but not limited to: anti-p185 Her2, Hu1D10, trastuzumab, bevacizumab, palivizumab, infliximab, daclizumab, and rituximab.
  • the antibodies include one or more antibodies, as in the case of an antibody cocktail for treating a disease such as caused by Ebola virus.
  • the nucleic acid encoding the protein of interest will be operably linked to a suitable promoter, which shows transcriptional activity in a fungal host cell.
  • the promoter may be derived from genes encoding proteins either endogenous or heterologous to the host cell.
  • the promoter may be a truncated or hybrid promoter. Further the promoter may be an inducible promoter.
  • the promoter is useful in a Trichoderma host or an Aspergillus host. Suitable nonlimiting examples of promoters include cbh1, cbh2, egl1, egl2, and xyn1.
  • the promoter is one that is native to the host cell.
  • promoters include promoters from the genes of A. awamori and A. niger glucoamylase genes (glaA) (Nunberg et al. , Mol. Cell Biol. 1984. 4:2306-2315 and Boel et al., EMBO J. 1984.
  • the polypeptide coding sequence is operably linked to a signal sequence which directs the encoded polypeptide into the cell's secretory pathway.
  • the 5' end of the coding sequence may naturally contain a signal sequence naturally linked in translation reading frame with the segment of the coding region which encodes the secreted polypeptide.
  • the DNA encoding the signal sequence typically is the sequence which is naturally associated with the polypeptide to be expressed.
  • the signal sequence is encoded by an Aspergillus niger alpha-amylase, Aspergillus niger neutral amylase or Aspergillus niger glucoamylase.
  • the signal sequence is the Trichoderma cdh1 signal sequence which is operably linked to a cdh1 promoter.
  • Introduction of a DNA construct or vector into a host cell includes techniques such as transformation; electroporation; nuclear microinjection; transduction; transfection (e.g., lipofection-mediated and DEAE-dextran-mediated transfection); incubation with calcium phosphate DNA precipitate; high velocity bombardment with DNA-coated microprojectiles; Agrobacterium-mediated transformation and protoplast fusion.
  • General transformation techniques are known in the art (see, e.g., Ausubel et al, (1987), supra, chapter 9; and Sambrook (1989) supra, Campbell et al, Curr. Genet. 1989. 16:53-56 and THE
  • Transformants of the present invention can be purified using known techniques.
  • the filamentous fungal cells may be grown in conventional culture medium.
  • the culture media for transformed cells may be modified as appropriate for activating promoters and selecting transformants.
  • the specific culture conditions such as temperature, pH and the like will be apparent to those skilled in the art.
  • Typical culture conditions for filamentous fungi useful with the present invention are well known and may be found in the scientific literature such as Sambrook, (1982) supra, and from the American Type Culture Collection. Additionally, fermentation procedures for production of heterologous proteins are known per se in the art. For example, proteins can be produced either by solid or submerged culture, including batch, fed-batch and continuous-flow processes.
  • Fermentation temperature can vary somewhat, but for filamentous fungi such as Aspergillus niger the temperature generally will be within the range of about 20°C to 40°C, typically in the range of about 28°C to 37°C, depending on the strain of microorganism chosen.
  • the pH range in the aqueous microbial ferment (fermentation admixture) should be in the exemplary range of about 2.0 to 8.0. With filamentous fungi, the pH normally is within the range of about 2.5 to 8.0; with Aspergillus niger the pH normally is within the range of about 4.0 to 6.0, and typically in the range of about 4.5 to 5.5.
  • the average retention time of the fermentation admixture in the fermentor can vary considerably, depending in part on the fermentation temperature and culture employed; generally it will be within the range of about 24 to 500 hours, typically about 24 to 400 hours.
  • Any type of fermentor useful for culturing filamentous fungi may be employed in the present invention.
  • One useful embodiment with the present invention is operation under 15L Biolafitte (Saint-Germain-en-Laye, France).
  • Various assays are known to those of ordinary skill in the art for detecting and measuring activity of intracellularly and extracellularly expressed polypeptides.
  • Means for determining the levels of secretion of a protein of interest in a host cell and detecting expressed proteins include the use of immunoassays with either polyclonal or monoclonal antibodies specific for the protein. Examples include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), fluorescence immunoassay (FIA), and fluorescence-activated cell sorting (FACS).
  • ELISA enzyme-linked immunosorbent assay
  • RIA radioimmunoassay
  • FIA fluorescence immunoassay
  • FACS fluorescence-activated cell sorting
  • other methods are known to those in the art and find use in assessing the protein of interest (See e.g., Hampton et al, SEROLOGICAL METHODS, A
  • the protein of interest may be recovered and further purified.
  • the recovery and purification of the protein of interest from a fermentation broth can be done by procedures known in the art.
  • the fermentation broth will generally contain cellular debris, including cells, various suspended solids and other biomass contaminants, as well as the desired protein product.
  • Suitable processes for such removal include conventional solid-liquid separation techniques such as, e.g., centrifugation, filtration, dialysis, microfiltration, rotary vacuum filtration, or other known processes, to produce a cell-free filtrate. Often, it may be useful to further concentrate the fermentation broth or the cell-free filtrate prior to crystallization using techniques such as ultrafiltration, evaporation or precipitation.
  • Precipitating the proteinaceous components of the supernatant or filtrate may be accomplished by means of a salt, followed by purification by a variety of chromatographic procedures, e.g., ion exchange chromatography, affinity chromatography or similar art-recognized procedures.
  • the polypeptide may be purified from the growth media.
  • the expression host cells are removed from the media before purification of the polypeptide (e.g., by centrifugation).
  • the expression host cells are collected from the media before the cell disruption (e.g., by centrifugation).
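
The stringency bands given above (maximum at about Tm-5°C, high at about 5-10°C below the Tm, intermediate at about 10-20°C below, low at about 20-25°C below) can be read as a lookup on the gap between the probe Tm and the hybridization temperature. The following Python sketch is illustrative only and is not part of the disclosure. The function name `stringency_band`, the handling of shared boundary values (assigned here to the more stringent band), and the labels returned outside the stated ranges are all assumptions.

```python
def stringency_band(tm_celsius: float, hybridization_celsius: float) -> str:
    """Classify hybridization stringency from degrees below the probe Tm.

    Assumption: a boundary value shared by two bands (e.g. exactly 10 degrees
    below Tm) is assigned to the more stringent band.
    """
    degrees_below_tm = tm_celsius - hybridization_celsius
    if degrees_below_tm < 0:
        # Above the melting temperature no stable duplex is expected.
        return "above Tm"
    if degrees_below_tm <= 5:
        return "maximum"
    if degrees_below_tm <= 10:
        return "high"
    if degrees_below_tm <= 20:
        return "intermediate"
    if degrees_below_tm <= 25:
        return "low"
    # The text defines no band beyond about 25 degrees below Tm.
    return "below low stringency"


if __name__ == "__main__":
    # Hypothetical probe with a Tm of 70 degrees C.
    for temperature in (65, 62, 55, 47):
        print(temperature, stringency_band(70, temperature))
```

Because the text qualifies every band with "about", the hard cut-offs in this sketch should be treated as a convenience, not as limits.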

Abstract

Described are recombinant filamentous fungal cells that produce decreased amounts of proteases and/or produce increased amounts of protease inhibitors compared to comparable parental cells. Such cells are useful for producing proteins of interest, including antibodies.

Description

FILAMENTOUS FUNGI WITH IMPROVED PROTEIN PRODUCTION
CROSS-REFERENCE TO RELATED APPLICATIONS
[01] This application claims the benefit of U.S. Provisional Application No. 62/422,413, filed November 15, 2016, the disclosure of which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
[02] Described are recombinant filamentous fungal cells that produce decreased amounts of proteases and/or produce increased amounts of protease inhibitors compared to comparable parental cells. Such cells are useful for producing proteins of interest, including antibodies.
BACKGROUND
[03] Genetic engineering has allowed improvements in microorganisms used as industrial bioreactors, cell factories and in food fermentations. Important enzymes and proteins produced by engineered microorganisms include glucoamylases, α-amylases, cellulases, neutral proteases, and alkaline (or serine) proteases, hormones and antibodies. However, the occurrence of protein degradation and modification in some genetically engineered systems can interfere with efficient production.
[04] Filamentous fungi (e.g., Aspergillus and Trichoderma species) and certain bacteria (e.g., Bacillus species) have been engineered to produce and secrete a large number of useful proteins and metabolites (see e.g., Bio/Technol. 1987. 5:369-376, 713-719 and 1301-1304 and Zukowski, "Production of commercially valuable products," In: Doi and McGloughlin (eds.) Biology of Bacilli: Applications to Industry, © 1992, Butterworth-Heinemann, Stoneham, Mass., pp. 311-337).
[05] As lower eukaryotic microorganisms, filamentous fungi are known for their robust ability to secrete large quantities of proteins. Such expression can reach as high as 40 g/L (Durand et al., Enzyme and Microbial Technology, 1988. 10(6):341-346) with a translational and post-translational modification process similar to that of mammalian cells except for their glycosylation. They are widely used in the chemical, pharmaceutical and food industries and are generally regarded as safe (Schuster, E., et al., Appl Microbiol Biotechnol, 2002. 59(4-5):426-35). However, heterologous protein production in filamentous fungi is still relatively low as compared to the more common bacterial expression systems. Thus a need exists for more efficient expression systems that can produce heterologous proteins in greater quantities.
SUMMARY
[06] The present compositions and methods relate to recombinant filamentous fungal cells that produce decreased amounts of proteases and/or produce increased amounts of protease inhibitors compared to comparable parental cells. Aspects and embodiments of the compositions and methods are described in the following, independently-numbered paragraphs.
1. In one aspect, a filamentous fungal cell is provided, comprising at least one mutation that decreases the amount of active protease in the cell, wherein the mutation is in a gene encoding a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24 and SEQ ID NO: 25.
2. In some embodiments of the filamentous fungal cell of paragraph 1, the mutation is located in a non-coding region of the gene.
3. In some embodiments of the filamentous fungal cell of paragraph 1, the mutation is located in a coding region of the gene.
4. In some embodiments of the filamentous fungal cell of any of paragraphs 1-3, the mutation results in reduced expression of a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22 and SEQ ID NO: 23.
5. In some embodiments of the filamentous fungal cell of paragraph 4, the mutation is in a polynucleotide homologous to or identical to a polynucleotide selected from the group consisting of the polynucleotide of SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47 and SEQ ID NO: 48.
6. In some embodiments of the filamentous fungal cell of any of paragraphs 1-3, the mutation results in overexpression of the polypeptide homologous to a polypeptide of SEQ ID NO: 24 and/or SEQ ID NO: 25.
7. In some embodiments of the filamentous fungal cell of paragraph 6, the mutation is in a polynucleotide homologous to or identical to a polynucleotide of SEQ ID NO: 49 or SEQ ID NO: 50.
8. In some embodiments of the filamentous fungal cell of any of paragraphs 1-7, the mutation comprises an insertion mutation.
9. In some embodiments of the filamentous fungal cell of paragraph 8, the insertion mutation comprises insertion of a selectable marker.
10. In some embodiments of the filamentous fungal cell of paragraph 8 or 9, the insertion mutation comprises insertion of an expression cassette for overexpressing the polypeptide of SEQ ID NO: 24 and/or SEQ ID NO: 25.
11. In some embodiments of the filamentous fungal cell of any of paragraphs 1-10, the filamentous fungal cell is an Aspergillus species, a Rhizopus species, a Trichoderma species or a Mucor species.
12. In some embodiments of the filamentous fungal cell of paragraph 11, the Trichoderma species is selected from the group consisting of Trichoderma reesei, Trichoderma viride, Trichoderma koningii, and Trichoderma harzianum.
13. In some embodiments of the filamentous fungal cell of any of paragraphs 1-12, the mutation results in increased production of a protein of interest compared to otherwise identical parental filamentous fungal cells that lack the mutation.
14. In some embodiments of the filamentous fungal cell of any of paragraphs 1-13, the protein of interest is an antibody or fragment thereof.
15. In another aspect, a method for increasing expression of a protein of interest in a filamentous fungal host is provided, the method comprising: (i) introducing a mutation in a gene encoding a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24 and SEQ ID NO: 25 into filamentous fungal host cells capable of expressing a protein of interest; (ii) cultivating the filamentous fungal cell having the mutation under conditions conducive for production of the protein of interest; and (iii) recovering the protein of interest; wherein the presence of the mutation results in increased production of the protein of interest compared to the production in otherwise identical parental filamentous fungal cells that lack the mutation.
16. In some embodiments of the method of paragraph 15, the mutation is located in a non-coding region of the gene, a coding region of the gene, or both.
17. In some embodiments of the method of paragraph 15 or 16, the mutation results in reduced expression of a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22 and SEQ ID NO: 23, and/or overexpression of a polypeptide homologous to or identical to a polypeptide of SEQ ID NO: 24 and/or SEQ ID NO: 25.
18. In some embodiments of the method of any of paragraphs 15-17, the mutation is in a polynucleotide homologous to or identical to a polynucleotide selected from the group consisting of the polynucleotide of SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49 and SEQ ID NO: 50.
19. In some embodiments of the method of any of paragraphs 15-18, the mutation comprises an insertion mutation, optionally having a selectable marker.
20. In some embodiments of the method of paragraph 19, the filamentous fungal cell is an Aspergillus species, a Rhizopus species, a Trichoderma species or a Mucor species.
21. In some embodiments of the method of any of paragraphs 15-20, the mutation results in increased production of a protein of interest compared to otherwise identical parental filamentous fungal cells that lack the mutation.
22. In some embodiments of the method of any of paragraphs 15-21, the protein of interest is an antibody or fragment thereof.
23. In another aspect, a method for increasing expression of a protein of interest in a filamentous fungal host is provided, the method comprising: (i) cultivating the filamentous fungal cell of any of paragraphs 1-14 under conditions conducive for production of a protein of interest; and (ii) recovering the protein of interest; wherein the presence of the mutation results in increased production of the protein of interest compared to the production in otherwise identical parental filamentous fungal cells that lack the mutation.
[07] These and other aspects and embodiments of the compositions and methods will be apparent from the description.
DETAILED DESCRIPTION
[08] The present invention relates to recombinant filamentous fungal cells that produce decreased amounts of proteases and/or produce increased amounts of protease inhibitors compared to comparable parental cells. Preferably, the recombinant cells are capable of expressing at least one heterologous protein of interest encoded by a heterologous gene, or overexpressing a homologous protein of interest. Nucleic acids and methods for making the mutant filamentous fungal cells are provided, as well as methods for using the cells for the altered production of heterologous proteins of interest.
[09] The proteases and protease inhibitors relevant to the present recombinant filamentous fungal cells are listed in Table 1, below. Joint Genome Institute (JGI; U.S. Department of Energy) identifiers, a brief description, amino acid (AA) SEQ ID NOs and nucleic acid (NA) SEQ ID NOs are indicated.
Table 1. Proteases and protease inhibitors
Figure imgf000006_0001
JGI identifier                                       Description  AA SEQ ID NO  NA SEQ ID NO
>jgi|Trire2|106661|fgenesh5_pg.C_scaffold_7000462    eqolisin     11            36
>jgi|Trire2|51365|estExt_Genewise1.C_220234          s8           12            37
>jgi|Trire2|122076|estExt_fgenesh5_pg.C_100212       pepsin       13            38
>jgi|Trire2|123865|estExt_fgenesh5_pg.C_270147       s8           14            39
>jgi|Trire2|81517|estExt_GeneWisePlus.C_250157       s8           15            40
>jgi|Trire2|123244|estExt_fgenesh5_pg.C_190118       -            16            41
>jgi|Trire2|81070|estExt_GeneWisePlus.C_220116       QM6a         17            42
>jgi|Trire2|53961|e_gw1.1.1743.1                     -            18            43
>jgi|Trire2|69555|e_gw1.29.235.1                     -            19            44
>jgi|Trire2|21668|estExt_fgenesh1_pm.C_30227         QM6a         20            45
>jgi|Trire2|58698|e_gw1.5.263.1                      -            21            46
>jgi|Trire2|122703|estExt_fgenesh5_pg.C_140132       -            22            47
>jgi|Trire2|60581|e_gw1.7.573.1                      -            23            48
>jgi|Trire2|74129|estExt_GeneWisePlus.C_11150        Inhibitor    24            49
>jgi|Trire2|111915|fgenesh5_pg.C_scaffold_30000054   Inhibitor    25            50
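
Table 1 pairs each JGI identifier with an amino acid SEQ ID NO and a nucleic acid SEQ ID NO. In the rows reproduced as text (AA 11 through 25), the NA number equals the AA number plus 25. The following Python sketch is illustrative only and is not part of the disclosure. It splits an identifier of the form `>jgi|<database>|<protein id>|<model name>` and applies that offset. The names `JgiHeader`, `parse_jgi_header` and `nucleic_acid_seq_id` are hypothetical, and extending the offset to AA SEQ ID NOs 1-10, whose rows are not reproduced as text, is an assumption.

```python
from typing import NamedTuple


class JgiHeader(NamedTuple):
    database: str    # e.g. "Trire2"
    protein_id: int  # e.g. 73897
    model_name: str  # e.g. "estExt_GeneWisePlus.C_10741"


def parse_jgi_header(header: str) -> JgiHeader:
    """Parse '>jgi|<database>|<protein id>|<model name> [free text]'."""
    fields = [field.strip() for field in header.strip().lstrip(">").split("|")]
    if len(fields) != 4 or fields[0].lower() != "jgi" or not fields[3]:
        raise ValueError(f"not a JGI-style header: {header!r}")
    # Text after the first whitespace in the last field is free text,
    # such as "polypeptide (SEQ ID NO: 1)", and is dropped.
    model_name = fields[3].split()[0]
    return JgiHeader(fields[1], int(fields[2]), model_name)


# Assumption: offset inferred from the rows shown as text (11 -> 36, ..., 25 -> 50).
AA_TO_NA_OFFSET = 25


def nucleic_acid_seq_id(amino_acid_seq_id: int) -> int:
    """Return the NA SEQ ID NO paired with an AA SEQ ID NO in Table 1."""
    if not 1 <= amino_acid_seq_id <= 25:
        raise ValueError("amino acid SEQ ID NOs in Table 1 run from 1 to 25")
    return amino_acid_seq_id + AA_TO_NA_OFFSET


if __name__ == "__main__":
    example = ">jgi|Trire2|73897|estExt_GeneWisePlus.C_10741 polypeptide (SEQ ID NO: 1)"
    print(parse_jgi_header(example))
    print(nucleic_acid_seq_id(24))
```

The parser strips whitespace around each field, so identifiers that picked up stray spaces during text extraction are still read correctly as long as the pipe separators are intact.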
[10] The amino acid sequences of the proteases and protease inhibitors are listed, below.
[11] >jgi|Trire2|73897|estExt_GeneWisePlus.C_10741 polypeptide (SEQ ID NO: 1) MAPASQVVSALMLPALALGAAIQPRGADIVGGTAASLGEFPYIVSLQNPNQGGHFCG GVLVNANTVVTAAHCSVVYPASQIRVRAGTLTWNSGGTLVGVSQIIVNPSYNDRTTD FDVAVWHLSSPIRESSTIGYATLPAQGSDPVAGSTVTTAGWGTTSENSNSIPSRLNKVS VPVVARSTCQADYRSQGLSVTNNMFCAGLTQGGKDSCSGDSGGPIVDANGVLQGVV SWGIGCAEAGFPGVYTRIGNFVNYINQNLA
[12] >jgi|Trire2|77579|estExt_GeneWisePlus.C_80475 polypeptide (SEQ ID NO: 2) MKSALLAAAALVGSAQAGIHKMKLQKVSLEQQLEGSSIEAHVQQLGQKYMGVRPTS
RAEVMFNDKPPKVQGGHPVPVTNFMNAQYFSEITIGTPPQSFKVVLDTGSSNLWVPSQ
SCNSIACFLHSTYDSSSSSTYKPNGSDFEIHYGSGSLTGFISNDVVTIGDLKIKGQDFAEA
TSEPGLAFAFGRFDGILGLGYDTISVNGIVPPFYQMVNQKLIDEPVFAFYLGSSDEGSEA
VFGGVDDAHYEGKIEYIPLRRKAYWEVDLDSIAFGDEVAELENTGAILDTGTSLNVLP SGLAELLNAEIGAKKGFGGQYTVDCSKRDSLPDITFSLAGSKYSLPASDYIIEMSGNCIS
SFQGMDFPEPVGPLVILGDAFLRRYYSVYDLGRDAVGLAKAK
[13] >jgi|Trire2|22459|estExt_fgenesh1_pm.C_100158 polypeptide (SEQ ID NO: 3)
MKTFIPVALLALSQATSACLLPHERDEASGLKPVVRRQSSNGTPIGTGDRFSGGSTAPR
GLGTQSSSTSFSTILNVKEVTSGLQGLANTYGVQTFNTPYKTAQGATVVGAKVGGTG GNCTDAYRVFFNGNIHARERGSADSVLYFISDLLYANAHNTGLTYGSKTYSNAQVKT ALAAGIVFIPLSNPDGVAYDQSTNSCWRKNRNPNSGQSPGVDLNRNFDFLWDFRNLF ASSAQSSVGSTSPSSETYHGASAFSEPETK IKWVFDTYSKVRWFVDLHSYAGDVLW NWGSDENQVDYPTMNFLNGTYNKVRGILTDTPSPGRGYGEYVPQADLDVKEAAAKR VASALTAGGGRSYTAFQSAQLYPTSGASDDYAYSRHFSDPTK LIHSYTIEFGFAN A ASCPFYPSVSQYNSNLRATSAGFMELLLAATDYGLGDATT
[14] >jgi|Trire2|74156|estExt_GeneWisePlus.C_l 1223 polypeptide (SEQ ID NO: 4) MQTFGAFLVSFLAASGLAAALPTEGQKTASVEVQYNKNYVPHGPTALFKAKRKYGA PISDNLKSLVAARQAKQALAKRQTGSAPNHPSDSADSEYITSVSIGTPAQVLPLDFDTG SSDLWVFSSETPKSSATGHAIYTPSKSSTSKKVSGASWSISYGDGSSSSGDVYTDKVTI GGFSVNTQGVESATRVSTEFVQDTVISGLVGLAFDSGNQVRPHPQKTWFSNAASSLAE PLFTADLRHGQNGSYNFGYIDTSVAKGPVAYTPVDNSQGFWEFTASGYSVGGGKLNR NSIDGIADTGTTLLLLDDNVVDAYYANVQSAQYDNQQEGVVFDCDEDLPSFSFGVGS STITIPGDLLNLTPLEEGSSTCFGGLQSSSGIGINIFGDVALKAALVVFDLGNERLGWAQ
K
[15] >jgi|Trire2| 119876|estExt_fgenesh5_pg.C_10816 polypeptide (SEQ ID NO: 5) MRFVQYVSLAGLFAAATVSAGVVTVPFEKRNLNPDFAPSLLRRDGSVSLDAINNLTG GGYYAQFSVGTPPQKLSFLLDTGSSDTWVNSVTADLCTDEFTQQTVGEYCFRQFNPR RSSSYKASTEVFDITYLDGRRIRGNYFTDTVTINQANITGQKIGLALQSVRGTGILGLGF RENEA ADTKYPTVIDNL V S QKVIP VP AF S LYLNDLQTS QGILLFGGVDTDKFHGGL AT LPLQSLPPSIAETQDIVMYSVNLDGFSASDVDTPDVSAKAVLDSGSTITLLPDAVVQEL FDEYDVLNIQGLPVPFIDCAKANIKDATFNFKFDGKTIKVPIDEMVLNNLAAASDEIMS DPSLSKFFKGWSGVCTFGMGSTKTFGIQSDEFVLLGDTFLRSAYVVYDLQNKQIGIAQ ATLNSTSSTIVEFKAGSKTIPGPASTGDDSDDSSDDSDEDSAGAALHPTFSIALAGTLFT AVSMMMSVL
[16] >jgi|Trire2| 111818|fgenesh5_pg.C_scaffold_29000088 polypeptide (SEQ ID NO: 6) MHAALVSLSIASLAAAALPEGVVEVPLQRIKNQTAYGVEFQVGNPPQKAVISADTGSP TYAFESPRNTVCQQGLCSAYGTYDNTTSTTSKWLSDGYSDMLIDHGFGSFINDTLRIG GVTLNDMMFGVVEQNFASFPINTQQTAIFGLGAFCQTPACDTYSTFLNQLYEHGAIAR RAFSVYLGPNDPDATGSLLFGGIDLAKRQGPVHKLKVLDPTVTAANLQPNWVSLSSIE LQLSNGTTITSTYDNGTYALWDTGSPGWYVQQDMFDALTSYWGLTNPNPNNDIIVDC KFREPSDDSLAVNIAPGVTINVPLSSLPIDNGDGTCTARVSPWGRVMGDTFLRNVYFTF DYEDLTVEFALVKYTDETNIVKIE
[17] >jgi|Trire2|82623|estExt_GeneWisePlus.C_450028 polypeptide (SEQ ID NO: 7) MAKLSTLRLASLLSLVSVQVSASVHLLESLEKLPHGWKAAETPSPSSQIVLQVALTQQ NIDQLESRLAAVSTPTSSTYGKYLDVDEINSIFAPSDASSSAVESWLQSHGVTSYTKQG SSIWFQTNISTANAMLSTNFHTYSDLTGAKKVRTLKYSIPESLIGHVDLISPTTYFGTTK AMRKLKSSGVSPAADALAARQEPSSCKGTLVFEGETFNVFQPDCLRTEYSVDGYTPSV KSGSRIGFGSFLNESASFADQALFEKHFNIPSQNFSVVLINGGTDLPQPPSDANDGEANL DAQTILTIAHPLPITEFITAGSPPYFPDPVEPAGTPNENEPYLQYYEFLLSKSNAEIPQVIT NSYGDEEQTVPRSYAVRVCNLIGLLGLRGISVLHSSGDEGVGASCVATNSTTPQFNPIF PATCPYVTSVGGTVSFNPEVAWAGSSGGFSYYFSRPWYQQEAVGTYLEKYVSAETKK YYGPYVDFSGRGFPDVAAHSVSPDYPVFQGGELTPSGGTSAASPVVAAIVALLNDARL REGKPTLGFLNPLIYLHASKGFTDITSGQSEGCNGN TQTGSPLPGAGFIAGAHWNAT KGWDPTTGFGVPNLKKLLALVRF
[18] >jgi|Trire2|81004|estExt_GeneWisePlus.C_220006 polypeptide (SEQ ID NO: 8)
MKFHAAALTLACLASSASAGVAQPRADEVESAEQGKTFSLEQIPNERYKGNIPAAYIS
ALAKYSPTIPDKIKHAIEINPDLHRKFSKLINAGNMTGTAVASPPPGADAEYVLPVKIGT PPQTLPLNLDTGSSDLWVISTDTYPPQVQGQTRYNVSASTTAQRLIGESWVIRYGDGSS ANGIVYKDRVQIGNTFFNQQAVESAVNISNEISDDSFSSGLLGAASSAANTVRPDRQTT YLENIKSQLARPVFTANLKKGKPGNYNFGYINGSEYIGPIQYAAINPSSPLWEVSVSGY RVGSNDTKYVPRVWNAIADTGTTLLLVPNDIVSAYYAQVKGSTFSNDVGMMLVPCA ATLPDFAFGLGNYRGVIPGSYINYGRMNKTYCYGGIQSSEDAPFAVLGDIALKAQFVV FDMGNKVVGFANKNTNV
[19] >jgi|Trire2| 110490|fgenesh5_pg.C_scaffold_20000149 polypeptide (SEQ ID NO: 9) MNGLSFERAKALREPSADRATMHFSRLRGIKGRPGQQQGQQQHHHRSAVSALSRVRS SSSSSTSSGGFKQSFQNVSAVTDASTQYAIQCGWDGVPVWLLVDTGSSDTWATQTGF ECADLGGDAHSEAACGFSRPLIDGFGGGQIDDLHFFLKYGSGERISGPMGYSDLSCGG VAVARQQVGLANSTYWHGNNVTVGILGLAYPAITSAYYGDVGSEAPWNAMSYTPFL TSAISQGAIEPLFSVALVRNSTDGVIAWGGLPPMDWQFHGFAKTDLIVANLIGSPETA WKHSFYTIVPDGMKWDQTTDTTKYPYIVDTGTTMLYLPPPLAEAIANAFQPRAVYLY QWGTYFVECTAIPPHFAILIEGVEFWINPADLIYRDLVDPLTGYCAVGVASGGPGPYIL GDVFLQNVVAVFDVGAAEMRFYGRK
[20] >jgi|Trire2|121133|estExt_fgenesh5_pg.C_60052 polypeptide (SEQ ID NO: 10)
MEAILQAQAKFRLDRGLQKITAVRNKNYKRHGPKSYVYLLNRFGFEPTKPGPYFQQH RIHQRGLAHPDFKAAVGGRVTRQKVLAKKVKEDGTVDAGGSKTGEVDAEDQQNDSE YLCEVTIGTPGQKLMLDFDTGSSDLWVFSTELSKHLQENHAIFDPKKSSTFKPLKDQT WQISYGDGSSASGTCGSDTVTLGGLSIKNQTIELASKLAPQFAQGTGDGLLGLAWPQI NTVQTDGRPTPANTPVANMIQQDDIPSDAQLFTAAFYSERDENAESFYTFGYIDQDLV SASGQEIAWTDVDNSQGFWMFPSTKTTINGKDISQEGNTAIADTGTTLALVSDEVCEA LYKAIPGAKYDDNQQGYVFPINTDASSLPELKVSVGNTQFVIQPEDLAFAPADDSNWY GGVQSRGSNPFDILGDVFLKSVYAIFDQGNQRFGAVPKIQAKQNLQPPQ
[21] >jgi|Trire2| 106661|fgenesh5_pg.C_scaffold_7000462 polypeptide (SEQ ID NO: 11) MD AIRARS AARRSNRFQ AGS SKNVNGT AD VES TNW AGAAITTS GVTEV S GTFTVPRP S VPAGGSSREEYCGAAWVGIDGYSDADLIQTGVLWCVEDGEYLYEAWYEYLPAALVE YSGISVTAGSVVTVTATKTGTNSGVTTLTSGGKTVSHTFSRQNSPLPGTSAEWIVEDFT SGSSLVPFADFGSVTFTGATAVVNGATVTAGGDSPVIIDLEDSRGDILTSTTVSGSTVT VEYE
[22] >jgi|Trire2|51365|estExt_Genewisel .C_220234 polypeptide (SEQ ID NO: 12) MVRSALFVSLLATFSGVIARVSGHGSKIVPGAYIFEFEDSQDTADFYKKLNGEGSTRLK FDYKLFKGVSVQLKDLDNHEAKAQQMAQLPAVKNVWPVTLIDAPNPKVEWVAGST APTLESRAIKKPPIPNDSSDFPTHQMTQIDKLRAKGYTGKGVRVAVIDTGIDYTHPALG GCFGRGCLVSFGTDLVGDDYTGFNTPVPDDDPVDCAGHGSHVAGIIAAQENPYGFTG GAPDVTLGAYRVFGCDGQAGNDVLISAYNQAFEDGAQIITASIGGPSGWAEEPWAVA VTRIVEAGVPCTVSAGNEGDSGLFFASTAANGKKVIAVASVDNENIPSVLSVASYKIDS GAAQDFGYVSSSKAWDGVSKPLYAVSFDTTIPDDGCSPLPDSTPDLSDYIVLVRRGTC TFVQKAQNVAAKGAKYLLYYNNIPGALAVDVSAVPEIEAVGMVDDKTGATWIAALK DGKTVTLTLTDPIESEKQIQFSDNPTTGGALSGYTTWGPTWELDVKPQISSPGGNILST YPVALGGYATLSGTSMACPLTAAAVALIGQARGTFDPALIDNLLATTANPQLFNDGEK FYDFLAPVPQQGGGLIQAYDAAFATTLLSPSSLSFNDTDHFIKKKQITLKNTSKQRVTY KLNHVPTNTFYTLAPGNGYPAPFPNDAVAAHANLKFNLQQVTLPAGRSITVDVFPTPP RDVDAKRLALWSGYITVNGTDGTSLSVPYQGLTGSLHKQKVLYPEDSWIADSTDESL APVENGTVFTIPAPGNAGPDDKLPSLVVSPALGSRYVRVDLVLLSAPPHGTKLKTVKF LDTTSIGQPAGSPLLWISRGANPIAWTGELSDNKFAPPGTYKAVFHALRIFGNEKKKED WDVSESPAFTIKYA
[23] >jgi|Trire2| 122076|estExt_fgenesh5_pg.C_100212 polypeptide (SEQ ID NO: 13) MRASPLAVAGVALASAAQAQVVQFDIEKRHAPRLSRRDGTIDGTLSNQRVQGGYFIN VQVGSPGQNITLQLDTGSSDVWVPSSTAAICTQVSERNPGCQFGSFNPDDSDTFDEVG QGLFDITYVDGSSSKGDYFQDNFQINGVTVKNLTMGLGLSSSIPNGLIGVGYMNDEAS VSTTRSTYPNLPIVLQQQKLINSVAFSLWLNDLDASTGSILFGGIDTEKYHGDLTSIDIIS PNGGKTFTEFAVNLYSVQATSPSGTDTLSTSEDTLIAVLDSGTTLTYLPQDMAEEAWN EVGAEYSNELGLAVVPCSVGNTNGFFSFTFAGTDGPTINVTLSELVLDLFSGGPAPRFS SGPNKGQSICEFGIQNGTGSPFLLGDTFLRSAFVVYDLVN QIAIAPTNFNSTRTNVVAF
ASSGAPIPSATAAPNQSRTGHSSSTHSGLSAASGFHDGDDENAGSLTSVFSGPGMAVV
GMTICYTLLGSAIFGIGWL
[24] >jgi|Trire2|123865|estExt_fgenesh5_pg.C_270147 polypeptide (SEQ ID NO: 14)
MRACLLFLGITALATAIPALKPPHGSPDRAHTTQLAKVSIALQPECRELLEQALHHLSD
PSSPRYGRYLGREEAKALLRPRREATAAVKRWLARAGVPAHDVLTDGQFIHVRTLAE
KAQALLGFEYNSTLGSQTIAISTLPGKIRKHVMTVQYVPLWTEADWEECKTIITPSCLK
RLYHVDSYRAKYESSSLFGIVGFSGQAAQHDELDKFLHDFAPYSTNANFSIESVNGGQ
SPQGMNEPASEANGDVQYAVAMGYHVPVRYYAVGGENHDIIPDLDLVDTTEEYLEPF
LEFASHLLDLDDDELPRVVSISYGANEQLFPRSYAHQVCDMFGQLGARGVSIVVAAG
DLGPGVSCQSNDGSARPKFIPSFPATCPYVTSVGSTRGIMPEVAASFSSGGFSDYFARP
AWQDRAVGAYLGAHGEEWEGFYNPAGRGFPDVAAQGVNFRFRAHGNESLSSGTSLS
SPVFAALIALLNDHRSKSGMPPMGFLNPWIYTVGSHAFTDIIEARSEGCPGQSVEYLAS
PYIPNAGWSAVPGWDPVTGWGTPLFDRMLNLSLV
[25] >jgi|Trire2|81517|estExt_GeneWisePlus.C_250157 polypeptide (SEQ ID NO: 15)
MKSALLWAAPLSLLAGLGACGKNFDELLEVPEGWTQLNDAVNPSQHIRLSIAVKQPYI
DSLEARMAEKGNRLSMQEVRELQTPAKKDIDNVLHWLSQNNLYGVVEKDFIRVWTT
VAKAEPLLKMKLSRFSYEGKPAVLRTTKYTIPDSVADSISFINPINNFMSARHRERGLTF
LLPPSKGAVLPGNTTAYCAGSVTPSCLSKLYNINYSPANTSSPVIFGVAGFLEENANLQ
DLRQFLNQSAPEVAKTGRTINVELVNGGVNSQELSESGIEAALDVDYAVSLGFPTNVT
FYSTGGRGVKLNDDGQPIEGEDDDNEPYLEFFQYLLAKPDGQVPHVLSLSYSDDELSV
PRDYAKRVCSLFGLLTARGTSIIFSSGDGGARGGRASSCLTNDGTKRQVTMATFPPTCP
WVTSIGAVTNIAEPPNGARFSTGGFSQYFAQPRWQNEAVEGYVKALGGHLDGYYNES
MRAIPDVSAVGTAFSIISGGYPRSVQGTSASAPVFAAMIALINDARLRAGKKSLGFLNQ
HLYSSEVRAVLQDITAGQSASCIWNDADIPGGWPAAEGWDAITGLGVPKRFDKLMEV
LVNHLPRLATGFARFKMPTQHVEPESHALLQDVANSLLKARKVVVVTGAGISTNSGIP
DFRSENGLYSLIQAQFDAANQPTRPAERSKADGTADSGEEPRPTKRGMTSREASPDLD
EVTRQLRDDIEARAESQRPAASSRAAGTQPAVAASDANASAEGVCLSTPRRKPALPST
PLPTTSPLSSPPREDFLIPLPSWSSSSILRTEDRKRITDASHNVVSSPLSSPPPVLFDPFHPS
SPSDEDRSRRSSTTPSEAGENPPNAMPASQTSSFGKANLPNMKGKDLFDASIWSDPTRT
SVFYQFATSLRQKVRDAEPTSSHKFISHLRDRGKLVRCYTQNIDQIEEKVGLSTSLEDG
PGSRGRFSRKATANASQLNKMVQEVSSGEGGASSHVNASSQSSNGSSEQSSAESSQAN
DRTEEEDSAGSSSTTAATTTTTTTPPDRPKPAPRKEPPQSGVECVFLHGSLQLLRCFLC
GQVC SWDDDDREVETLS GLQPECPHCVGATEARQERGKRALGVGKLRPDIVLYGEEH PSAHLISPIVTHDLALYPDMLLILGTSLRVHGLKVLVREFAKTVHSRGGKVVFVNFTKP
PES SWGDIID YWIQWDCD AWV S DLQ VRIPIQ^ WQDPEPPKPKKKRD S GGAADDNREEK
KRPPAQNPVALRDTKVNGAYCTLKILKELRRITYTRDSAAIGNPIITAPEPPTTTEAISRA
VAMDSSRTSTPRGKSKRSRRSATGAIDRPKRTPSTLNPNHGRSKKPTAEAKKKQQEEE
EVPETPSQPPASTVEEFSSILHSVKSNPRIRKRKMIDGEEFVFPAVGKKRGAVDSLYKGP
GDDTKELPPLRPMP
[26] >jgi|Trire2|123244|estExt_fgenesh5_pg.C_190118 polypeptide (SEQ ID NO: 16)
MRSVVALSMAAVAQASTFQIGTIHEKSAPVLSNVEANAIPDAYIIKFKDHVGEDDASK
HHDWIQSIHTNVEQERLELRKRSNVFGADDVFDGLKHTFKIGDGFKGYAGHFHESVIE
QVRNHPDVEYIERDSIVHTMLPLESKDSIIVEDSCNGETEKQAPWGLARISHRETLNFG
SFNKYLYTADGGEGVDAYVIDTGTNIEHVDFEGRAKWGKTIPAGDEDEDGNGHGTH
CSGTVAGKKYGVAKKAHVYAVKVLRSNGSGTMSDVVKGVEYAALSHIEQVKKAKK
GKRKGFKGSVANMSLGGGKTQALDAAVNAAVRAGVHFAVAAGNDNADACNYSPA
AATEPLTVGASALDDSRAYFSNYGKCTDIFAPGLSIQSTWIGSKYAVNTISGTSMASPH
ICGLLAYYLSLQPAGDSEFAVAPITPKKLKESVISVATKNALSDLPDSDTPNLLAWNGG
GCSNFSQIVEAGSYTVKPKQNKQAKLPSTIEELEEAIEGDFEVVSGEIVKGAKSFGSKA
EKFAKKIHDLVEEEIEEFISELSE
[27] >jgi|Trire2|81070|estExt_GeneWisePlus.C_220116 polypeptide (SEQ ID NO: 17)
MAVLSRLALTASFALCGVSAAGIQQPLTAPESLPPSHEAVADYGSKPIIDSEALQSAISI
DTLVKRAESFYKFAKASEEEYGHPTRVIGSAGHEQTLNYITNTLLDHGDYYNVSVQEF
PVTLSNVFQFRLVLADEVSKSAIPMGLTPPTKDKEPVHGDLVLVQNSGCDASDYPQNV
KGNIAFIRRGACSFGDKSIGAGKAGAKAAVIYNTDPEELHGTLGLPVEDHIATFGIDGV
EGKKILAKLSNGESVDAIAYIDAEVKQIQTVNVLAQTEEGDPDNCVMLGGHSDGVAE
GPGINDDGSGSISVLEVAVQLTKFKVNNCVRFAWWAAEEEGLLGSDFYAASLSDEEN
QKIRLFMDYDMMASPNFAYQIYNATNAESPAGSEELRNLYVDWYKSQGLNYTFIPFD
GRSDYDGFIRAGIPAGGIATGAEAVKTKEEAEMFGGRAGEWLDPCYHQLCDDLGNLN
HTAWEVNTKLIAHSVATYALSFDGFPKRKLETEMSAYSQTTKHHGPKLIL
[28] >jgi|Trire2|53961|e_gw1.1.1743.1 polypeptide (SEQ ID NO: 18)
MQPSFGSFLVTVLSASMAAGSVIPSTNANPGSFEIKRSANKAFTGRNGPLALARTYAK
YGVEVPKTLVDAIQLVKSIQLAKRDSATVTATPDHDDIEYLVPVKIGTPPQTLNLDFDT
GSSDLWVFSSDVDPTSSQGHDIYTPSKSTSSKKLEGASWNITYGDRSSSSGDVYHDIVS
VGNLTVKSQAVESARNVSAQFTQGNNDGLVGLAFSSINTVKPTPQKTWYDNIVGSLD
SPVFVADLRHDTPGSYHFGSIPSEASKAFYAPIDNSKGFWQFSTSSNISGQFNAVADTG TTLLLASDDLVKAYYAKVQGARVNVFLGGYVFNCTTQLPDFTFTVGEGNITVPGTLIN
YSEAGNGQCFGGIQPSGGLPFAIFGDIALKAAYVIFDSGNKQVGWAQKK
[29] >jgi|Trire2|69555|e_gw1.29.235.1 polypeptide (SEQ ID NO: 19)
MFIAGVALSALLCADTVLAGVAQDRGLAARLARRAGRRSAPFRNDTSHATVQSNWG
GAILEGSGFTAASATVNVPRGGGGSNAAGSAWVGIDGASCQTAILQTGFDWYGDGTY
DAWYEWYPEFAADFSGIDIRQGDQIAMSVVATSLTGGSATLENLSTGQKVTQNFNRV
TAGSLCETSAEFIIEDFEECNSNGSNCQPVPFASFSPAITFSSATATRSGRSVSLSGAEITE
VIVNNQDLTRCSVSGSSTLTCSYV
[30] >jgi|Trire2|21668|estExt_fgenesh1_pm.C_30227 polypeptide (SEQ ID NO: 20)
MTITAQKLTPEVLLAAPRRSPGVPNATGELVLYTVSTYSFDSHSKTAQIRVLNLKEGTS
HLVSEDSAASEPIWIAEQEIAYVKSLDHGASALVAQHVFNPNESNTIQRFGGSINSLKA
KPLSTDKVAFCCAALTTPDGPJVIYSPAAEPKSYTSAKIYTSLFVRHWDSXVNTENKNSL
WYGQLNKVDGKWTLGNSELTNLLAGTRLHSPVPPFGGTGDFDISTTGIVFVAKDPDL
NHARTTKTDLYFVPLNSYLDQPTFPQIVKTGALRGYSLSPVFSNDGKQVAFLRMKSQQ
YEADKTRLLLIPDVTDLSNVQEFYATEDGKGGWDYKPDWLIWSHDDKELYVAAEKH
ARVVLWKLPSSPLEAKSLPTPIHEDGSVAEARVLGKGSSLLITTRSRVESSNYSILDPAS
KSTTIISSSSRQGKTFALSKSQCQEIWFNGSKGYPIHALVTLPSTFDSSKKYPLAFFIHGG
PQGAWGDDWSTRWNPAVFAEQGYVVVSPNPTGSTGYGQDHTDAIQNNWGGDPYID
LVKCFEFLEEEVNYIDTTRAVALGASYGGYMINWIQGHDLGRKFKALVCHDGVFSTL
NQWSTEELFFPEHDFGGALWENREGYEKWDPAKHVGNWATPQLVIHNELDYRLPISE
GLAMFNVLQARGVPSKFVMFPDEHHWVLKPENSLVWHREVLNWINKYSGISEKN
[31] >jgi|Trire2|58698|e_gw1.5.263.1 polypeptide (SEQ ID NO: 21)
MAWLKKLALVLLAIVPYATASPALSPRSREILSLEDLESEDKYVIGLKQGLSPTDLKKH
LLRVSAVQYRNKNSTFEGGTGVKRTYAIGDYRAYTAVLDRDTVREIWNDTLEKPPW
GLATLSNKKPHGFLYRYDKSAGEGTFAYVLDTGINSKHVDFEGRAYMGFSPPKTEPT
DINGHGTHVAGIIGGKTFGVAKKTQLIGVKVFLDDEATTSTLMEGLEWAVNDITTKGR
QGRSVINMSLGGPYSQALNDAIDHIADMGILPVAAAGNKGIPATFISPASADKAMTVG
AINSDWQETNFSNFGPQVNILAPGEDVLSAYVSTNTATRVLSGTSMAAPHVAGLALYL
MALEEFDSTQKLTDRILQLGMKNKVVNLMTDSPNLIIHNNVK
[32] >jgi|Trire2|122703|estExt_fgenesh5_pg.C_140132 polypeptide (SEQ ID NO: 22)
MASRRLALNLSQGLRARSGLSGLRRGFATPSTVGKTQTTTLKNGLTVATEYSPWAQT
STVGMWIDAGSRAETNETNGTAHFLEHLAFKGTAKRSQHQLELEIENMGGHLNAYTS
RENTVYFAKAFNSDIPQTVDILADILQNSKLEQSAIERERDVILRESEEVEKQVEEVVFD
HLHATAFQHQPLGRTILGPRQNIRDITRTELVNYIKNNYTADRMVLAAAGGVPHEQLV ELAEKHFSGLASHGPETEAYVLSKQKADFIGSDVRVRDDTMPTANVAIAVEGVSWNS
DDYYTALVAQAIVGNYDKAMGNAPHQGGKLSGYVHKHDLANSFMSFSTSYSDTGL
WGIYLVTDNATRLDDLVHFAIREWMRLCYNVSEAEVERAKAQLKASILLSLDGTTAV
AEDIGRQLITTGRRASPGEIERKIDAITDKDVTDFANRYLWDKDIAISAVGKIEALFDYQ
RLRNTMKPKF
[33] >jgi|Trire2|60581|e_gw1.7.573.1 polypeptide (SEQ ID NO: 23)
MSRRPVYFNPLAESWTAPSPDDPQLAYRFHSQLPAYSPTQLIPLTDLAKELGVQSIHLK
DETSRLGLPSFKILGASWGTFRAIAQRLDLPIDSSLGSVEQKLASSNITLYAATDGNHG
RAVARMASILGVPAQIHVPTTMHQSTIDLIKSEGARVVISDGFYDDAVVDARVAAAK
DDTALVIQDFASGDYVQIPQWIVDGYLTMMLEIDGQLGCTTPDLVVVPVGVGSFAQA
VVTHFKKPGKQTKVLTVEPDTSASLWKSLRSGESSSTSEKSPSIMAGLDCGTPSSISWA
VLRHGVDASLTISDYEAHKACEYLKSQGVSAGPCGAAPIAALRRLERADRERLGLTK
NSVVVIFCTEGARDYDVPHSVASDDPVEITQTLVKINSANPFLGSVPGPGETAIARYITA
WMEHRDIESHWIEPTSGRPSVVGIVRGLGGGKSLMLNGHIDTVTLMGYEGNPLSGDIQ
DGKLYGRGAADMKGGVAAAMAALANAKKHSLRGDVIFTGVADEEFESIGTQQVLEA
GWTADGAIVSEPTNMEILYAHKGFVWFDVDIQGLAAHGSRYDLGIDAISKAGYFLVE
LDKHASHLTAQSGDAVLGPGSIHASLIKGGEEVSSYPSRVQIQLERRTVNGETPETVRK
ELEEILDGLTKTVPNFTYSLRTTFHRSPFKADLSHPFAKLVHKHVGNTLGREPAVLGAP
YWTDCALLDGAGIPALLWGPQGEGLHGKLEYADVESIKQVAEALTAIAVEFCS
[34] >jgi|Trire2|74129|estExt_GeneWisePlus.C_11150 polypeptide (SEQ ID NO: 24)
MPLVVPGIMTNSDDKTQEWANKLVGKTYSENESNETRDLPEVHRIIKPGSIVTKDFRP
ERLNIHLNEDGTVSHVRHG
>jgi|Trire2| 111915|fgenesh5_pg.C_scaffold_30000054 polypeptide (SEQ ID NO: 25) MLFQTMLLALITSLALAQSEVGRPCGFKMAPCPFDMKCVPDNAYCPHPSRCPGHCEF KNKYDQCGGFTPRPHVCRRGSRCQDDPRLPPNCGMACDAPGICIPENAPFCGGFMGL ACPKGLYCYDALDDCDPNNGGADCGGICL
[35] The nucleic acid sequences of the proteases and protease inhibitors are listed below.
[36] >jgi|Trire2|73897|estExt_GeneWisePlus.C_10741 polynucleotide (SEQ ID NO: 26)
CTCTGCCACTTAGAAGAATCATCCCTCGCAGCTTTTCCTCCATCGCAATGGCTCCC
GCTTCCCAAGTCGTCTCAGCTCTCATGCTGCCCGCTCTCGCCTTGGGAGCCGCCAT
CCAGCCCCGTGGCGCTGACATCGTGGGAGGAACCGCCGCCTCGCTCGGCGAGTTC
CCCTACATTGTCAGTCTGCAGAACCCCAACCAGGGCGGCCACTTCTGCGGTGGTGT
CTTGGTCAACGCCAACACCGTCGTTACCGCCGCTCACTGCTCCGTTGTCTACCCTG
CCTCGCAGATCCGCGTCCGCGCCGGTACTCTTACCTGGAACTCTGGCGGTACCCTG GTCGGCGTCTCCCAGATCATCGTGAACCCGTCCTACAACGACCGCACCACCGACTT TGACGTTGCCGTCTGGCACCTGTCCAGCCCTATCCGCGAGAGCTCCACCATTGGCT ACGCCACTCTTCCCGCCCAGGGCTCCGACCCCGTGGCCGGCTCGACCGTCACCACC GCTGGCTGGGGCACCACCAGCGAGAACTCCAACTCCATCCCCTCCCGCCTGAACA AGGTCTCCGTCCCCGTCGTCGCCCGCTCCACCTGCCAGGCCGACTACCGCAGCCAG GGGCTCAGTGTCACCAACAACATGTTCTGCGCCGGCCTCACCCAGGGCGGCAAGG ACTCTTGCTCTGGCGACTCTGGCGGCCCCATCGTTGACGCCAACGGTGTCCTCCAG GGTGTCGTTTCTTGGGGTATCGGCTGTGCTGAGGCCGGTTTCCCTGGTGTCTACAC CAGAATCGGCAACTTTGTCAACTACATCAACCAGAACCTCGCATAAGCGATTCCCT GTGTTGGCAACCAAAGCAGTAAAAAAAAAATGGAAAAGAAAGCATCTTCTAAGT GGCAGGGTCCTGTTGGCTCTTTTGTTGATTGGTTATGTAGCTGAGGCTGTGGGCAA AATCGGGCT
[37] >jgi|Trire2|77579|estExt_GeneWisePlus.C_80475 polynucleotide (SEQ ID NO: 27) CATCGCCTTCCACCTGCCAACGAGCCCATCATCAACACCTCACCTCCGCGTCGCCC AACATGAAGAGCGCGTTACTTGCCGCCGCGGCGCTTGTCGGCTCCGCCCAAGCCG GCATTCACAAGATGAAGCTGCAGAAGGTCTCCCTGGAGCAGCAGCTGGAGGGTTC GAGCATCGAGGCCCACGTCCAGCAGCTCGGCCAGAAGTACATGGGCGTCCGCCCT ACTAGCCGTGCCGAGGTCATGTTCAACGACAAGCCGCCCAAGGTCCAGGGCGGGC ACCCGGTTCCCGTCACCAACTTCATGAATGCCCAATACTTCTCTGAGATTACCATC GGCACCCCCCCTCAGTCGTTCAAGGTTGTCCTCGACACGGGAAGCTCTAACCTCTG GGTTCCCTCTCAGTCGTGCAACAGCATCGCCTGCTTCCTGCACTCCACGTACGATT CGTCTTCATCGTCGACGTACAAGCCCAACGGCTCCGATTTTGAGATCCACTACGGA TCAGGTAGCTTGACTGGCTTCATCTCCAACGATGTCGTGACGATTGGCGACCTCAA GATCAAGGGGCAGGACTTTGCCGAGGCAACCAGCGAGCCCGGCCTTGCCTTTGCT TTCGGCCGCTTCGACGGCATTCTTGGCCTTGGCTACGATACCATCTCGGTCAATGG CATTGTCCCCCCCTTTTACCAGATGGTCAACCAGAAGCTGATCGACGAGCCCGTCT TTGCTTTCTACCTGGGAAGCAGCGACGAGGGTTCCGAGGCTGTCTTTGGCGGCGTC GACGATGCTCACTACGAGGGCAAGATTGAGTACATTCCCCTGCGCCGCAAGGCCT ACTGGGAGGTGGACCTTGACTCCATTGCCTTCGGTGACGAGGTCGCCGAGCTCGA GAACACTGGCGCCATCCTTGACACCGGCACCTCTCTCAACGTCCTCCCCTCGGGCC TCGCCGAGCTCCTGAACGCTGAGATTGGCGCCAAGAAGGGCTTTGGCGGTCAGTA CACTGTTGACTGCTCCAAGCGTGATTCCCTCCCCGACATCACCTTCAGCCTGGCCG GCTCCAAGTACAGCCTTCCCGCCAGCGACTACATCATTGAGATGTCTGGCAACTGC ATTTCGTCCTTCCAGGGCATGGACTTCCCCGAGCCCGTGGGCCCCCTGGTCATTCT GGGTGATGCTTTCTTGCGCCGCTACTACTCCGTCTACGACCTTGGCAGGGACGCCG TTGGTCTTGCCAAGGCCAAATAA
[38] >jgi|Trire2|22459|estExt_fgeneshl_pm.C_l 00158 polynucleotide (SEQ ID NO: 28) ATGAAGACCTTTATCCCCGTTGCGCTTCTTGCGCTCTCCCAAGCCACCTCGGCGTG TCTTCTTCCCCATGAGCGAGACGAGGCGAGTGGCCTGAAGCCAGTCGTCCGCCGC CAGTCCAGCAACGGAACTCCCATCGGCACCGGCGACAGATTTAGCGGTGGCTCAA CTGCGCCTCGAGGACTGGGAACTCAGTCCTCATCCACCTCCTTCAGCACAATTCTC AACGTCAAGGAGGTCACCAGCGGTCTGCAGGGCCTGGCCAACACGTACGGAGTTC AAACTTTCAACACTCCTTACAAGACGGCCCAGGGCGCAACCGTCGTCGGAGCCAA GGTCGGCGGTACCGGAGGTAACTGTACCGATGCCTACCGCGTCTTCTTCAACGGCA ACATCCACGCCCGAGAGCGCGGCTCCGCAGACAGTGTGCTCTACTTCATCTCGGAT CTGCTGTATGCCAACGCCCACAACACCGGCCTCACCTACGGCTCCAAGACGTACA GCAACGCCCAGGTGAAGACGGCGCTGGCCGCGGGAATCGTCTTCATCCCGCTCAG CAACCCCGACGGCGTGGCCTATGACCAGTCCACCAACAGCTGCTGGCGAAAGAAC CGCAACCCCAACTCGGGCCAGTCCCCCGGAGTCGACCTGAACCGGAACTTTGACT TCCTCTGGGACTTCCGCAACCTCTTCGCCTCGTCGGCACAGTCCAGCGTTGGCTCG ACCTCTCCTAGCTCCGAGACGTACCACGGTGCCAGCGCCTTCTCTGAGCCCGAGAC CAAGAACATCAAGTGGGTGTTCGACACGTACTCCAAAGTCCGCTGGTTCGTGGAC CTTCACTCGTATGCCGGTGACGTCCTGTGGAACTGGGGCAGCGATGAGAACCAGG TCGACTACCCGACCATGAACTTCCTCAATGGCACCTACAACAAGGTCCGCGGAAT CCTGACCGACACGCCGTCTCCCGGCCGAGGCTACGGCGAGTATGTCCCCCAGGCG GACCTGGACGTTAAGGAGGCCGCGGCCAAGCGCGTGGCCTCAGCTCTGACTGCTG GCGGCGGCCGCTCTTACACGGCCTTTCAGTCGGCGCAGCTGTATCCCACCTCGGGC GCGAGTGATGACTACGCTTACTCGAGGCACTTTTCCGATCCTACCAAGAATCTCAT CCACTCTTACACGATTGAGTTTGGCTTTGCCAACAACGCCGCGTCTTGCCCCTTTTA TCCCTCTGTTTCCCAGTACAACTCGAACCTCAGGGCTACCAGTGCTGGTTTCATGG AGCTGCTGTTGGC AGC C AC CGACT AC GGC CTTGGAGATGC AAC GACGTGTT AG
[39] >jgi|Trire2|74156|estExt_GeneWisePlus.C_l 1223 polynucleotide (SEQ ID NO: 29) AGCAACCTTCTCCGATATTCAAGATGCAGACCTTTGGAGCTTTTCTCGTTTCCTTCC TCGCCGCCAGCGGCCTGGCCGCGGCCCTCCCCACCGAGGGTCAGAAGACGGCTTC CGTCGAGGTCCAGTACAACAAGAACTACGTCCCCCACGGCCCTACTGCTCTCTTCA AGGCCAAGAGAAAGTATGGCGCTCCCATCAGCGACAACCTGAAGTCTCTCGTGGC TGCCAGGCAGGCCAAGCAGGCTCTCGCCAAGCGCCAGACCGGCTCGGCGCCCAAC CACCCCAGTGACAGCGCCGATTCGGAGTACATCACCTCCGTCTCCATCGGCACTCC GGCTCAGGTCCTCCCCCTGGACTTTGACACCGGCTCCTCCGACCTGTGGGTCTTTA GCTCCGAGACGCCCAAGTCTTCGGCCACCGGCCACGCCATCTACACGCCCTCCAA GTCGTCCACCTCCAAGAAGGTGTCTGGCGCCAGCTGGTCCATCAGCTACGGCGAC GGCAGCAGCTCCAGCGGCGATGTCTACACCGACAAGGTCACCATCGGAGGCTTCA GCGTCAACACCCAGGGCGTCGAGTCTGCCACCCGCGTGTCCACCGAGTTCGTCCA GGACACGGTCATCTCTGGCCTCGTCGGCCTTGCCTTTGACAGCGGCAACCAGGTCA GGCCGCACCCGCAGAAGACGTGGTTCTCCAACGCCGCCAGCAGCCTGGCTGAGCC CCTTTTCACTGCCGACCTGAGGCACGGACAGAACGGCAGCTACAACTTTGGCTAC ATCGACACCAGCGTCGCCAAGGGCCCCGTTGCCTACACCCCCGTTGACAACAGCC AGGGCTTCTGGGAGTTCACTGCCTCGGGCTACTCTGTCGGCGGCGGCAAGCTCAAC CGCAACTCCATCGACGGCATTGCCGACACCGGCACCACCCTGCTCCTCCTCGACGA CAACGTCGTCGATGCCTACTACGCCAACGTCCAGTCGGCCCAGTACGACAACCAG CAGGAGGGTGTCGTCTTCGACTGCGACGAGGACCTCCCTTCGTTCAGCTTCGGTGT TGGAAGCTCCACCATCACCATCCCTGGCGATCTGCTGAACCTGACTCCCCTCGAGG AGGGCAGCTCCACCTGCTTCGGTGGCCTCCAGAGCAGCTCCGGCATTGGCATCAA CATCTTTGGTGACGTTGCCCTCAAGGCTGCCCTGGTTGTCTTTGACCTCGGCAACG AGCGCCTGGGCTGGGCTCAGAAATAA
[40] >jgi|Trire2| 119876|estExt_fgenesh5_pg.C_10816 polynucleotide (SEQ ID NO: 30) ACGGCGTGTTCATAGCCTGCTCTGGGTGCTTGGGCATATTCTCAAGCCGCCATCAC AACGTTGGGCTTGTCCTATTCTAAATTTACCCCCATCTTCATCCACAATGCGTTTCG TTCAGTACGTCTCGCTGGCCGGCCTCTTCGCCGCTGCCACAGTTTCTGCAGGCGTC GTTACCGTCCCCTTTGAGAAGCGCAACCTCAACCCCGACTTTGCCCCGTCACTGCT GCGTCGCGATGGCAGCGTGAGCCTTGACGCCATCAATAACCTCACTGGAGGTGGC TACTATGCCCAATTCAGCGTTGGCACGCCGCCTCAGAAGCTGAGTTTCCTTCTCGA TACCGGCAGCAGTGATACCTGGGTCAACTCCGTCACTGCGGACCTCTGCACAGAC GAGTTCACTCAGCAGACTGTTGGAGAATACTGTTTCAGACAGTTCAACCCAAGGA GGAGTAGTTCCTATAAGGCGAGCACTGAAGTCTTCGACATCACCTACCTTGACGGC CGCAGGATACGAGGCAACTACTTCACGGATACCGTCACCATCAACCAGGCGAACA TCACGGGCCAGAAGATTGGCCTAGCCCTGCAGTCAGTCCGCGGCACAGGCATCCT GGGCTTGGGATTCCGGGAAAACGAGGCAGCCGACACCAAGTATCCCACCGTCATC GATAACCTGGTGTCTCAAAAGGTCATTCCTGTTCCGGCATTCAGTCTCTACCTCAA CGACCTGCAAACTAGCCAGGGCATCCTCCTCTTTGGTGGTGTCGATACCGACAAGT TCCACGGCGGCCTTGCCACTCTGCCCCTTCAGTCGCTGCCGCCGTCCATTGCCGAG ACCCAGGATATTGTCATGTACAGTGTTAACCTTGATGGCTTCTCGGCGTCTGACGT TGATACGCCCGACGTCAGCGCCAAAGCCGTTCTCGACTCTGGCTCAACAATCACCC TCCTCCCAGATGCTGTCGTGCAGGAACTTTTTGACGAGTACGACGTCCTCAACATT CAGGGACTCCCCGTTCCTTTTATCGACTGCGCCAAAGCAAACATCAAGGATGCCAC CTTCAACTTCAAATTCGACGGCAAGACGATCAAGGTGCCCATTGACGAAATGGTC TTGAACAACCTCGCCGCGGCTTCAGACGAGATTATGTCGGATCCCTCCTTGAGCAA GTTTTTCAAGGGTTGGAGCGGAGTGTGCACCTTTGGCATGGGCTCGACCAAGACCT TTGGCATCCAGTCTGACGAGTTTGTCCTGCTGGGCGACACCTTCTTGCGGTCCGCC TATGTCGTCTACGACCTGCAGAACAAGCAGATTGGCATTGCCCAGGCGACGCTCA ACTCGACCAGCAGCACCATTGTCGAGTTCAAGGCGGGCTCCAAGACCATTCCCGG CCCTGCTTCCACGGGCGACGACTCTGACGATTCGTCGGACGATTCAGATGAAGATT CTGCCGGTGCTGCTCTTCATCCTACATTTTCTATTGCTCTGGCCGGCACCTTGTTCA CGGCCGTTTCCATGATGATGAGTGTACTATAG
[41] >jgi|Trire2|111818|fgenesh5_pg.C_scaffold_29000088 polynucleotide (SEQ ID NO: 31)
ATGCATGCGGCCCTCGTGTCACTATCAATCGCGTCTCTTGCCGCAGCTGCTCTACC CGAGGGCGTGGTCGAGGTTCCGCTCCAGCGCATCAAGAATCAGACTGCGTACGGC GTCGAGTTCCAGGTCGGCAATCCGCCCCAAAAGGCTGTCATTTCTGCCGATACAGG CAGTCCGACTTATGCCTTTGAGTCTCCGCGCAACACCGTCTGCCAGCAAGGCCTCT GCTCAGCCTATGGAACCTACGATAACACAACCTCTACCACGTCTAAATGGCTCAGC GACGGCTACTCCGACATGCTCATCGACCACGGCTTCGGCAGTTTTATCAACGACAC TCTTCGAATTGGCGGCGTCACCCTCAACGACATGATGTTTGGCGTCGTTGAGCAAA ACTTTGCCTCGTTCCCGATCAATACCCAGCAAACCGCTATTTTTGGCCTCGGGGCA TTCTGTCAGACGCCTGCGTGCGACACATATTCGACGTTCCTCAACCAGCTATACGA GCACGGTGCGATTGCTCGGCGTGCCTTCAGTGTGTATCTGGGCCCAAACGACCCGG ATGCTACGGGGTCGCTGCTTTTCGGGGGGATTGATCTTGCGAAACGCCAGGGCCC GGTCCATAAATTGAAGGTCCTCGATCCTACAGTAACAGCGGCGAATCTCCAGCCC AACTGGGTCAGCCTGTCCAGCATCGAACTCCAGCTTTCCAATGGCACTACGATTAC TTCGACGTATGACAACGGGACGTATGCTCTTTGGGACACGGGGTCCCCGGGCTGG TATGTCCAGCAAGACATGTTCGACGCCCTCACCAGCTACTGGGGCCTCACGAATCC CAATCCCAACAATGATATCATCGTTGACTGCAAGTTCCGTGAGCCGTCCGACGACT CGCTTGCCGTGAATATCGCCCCAGGAGTCACCATCAACGTGCCCCTGAGTTCGTTG CCTATTGATAACGGCGACGGCACTTGCACTGCGCGTGTTTCGCCTTGGGGAAGAGT GATGGGCGATACGTTTCTTCGAAATGTCTACTTTACGTTTGACTACGAGGATCTGA CAGTTGAATTTGCTTTAGTCAAGTATACCGATGAGACGAACATTGTCAAAATTGAG TAA
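The correspondence between a listed polynucleotide and its polypeptide can be spot-checked by translation. The following Python sketch is a minimal illustration under stated assumptions: it uses the standard genetic code, and it translates only the first thirty nucleotides of SEQ ID NO: 31 for comparison with the first ten residues of SEQ ID NO: 6. The names `CODON_TABLE` and `translate` are hypothetical. The listing does not state whether the polynucleotides contain introns or untranslated regions, and several of them begin upstream of the first ATG, so full-length agreement is not assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate from the first base, in frame, stopping at the first stop codon."""
    # Remove whitespace left by line wrapping before reading codons.
    seq = "".join(nucleotides.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# First 30 nucleotides of SEQ ID NO: 31 and first 10 residues of SEQ ID NO: 6.
na_start = "ATGCATGCGGCCCTCGTGTCACTATCAATC"
aa_start = "MHAALVSLSI"
print(translate(na_start) == aa_start)
```

For sequences that carry leading untranslated bases, the reading frame would first have to be located, for example by searching for the start codon; that step is omitted here.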
[42] >jgi|Trire2|82623|estExt_GeneWisePlus.C_450028 polynucleotide (SEQ ID NO: 32) ATGGCAAAGTTGAGCACTCTCCGGCTTGCGAGCCTTCTTTCCCTTGTCAGTGTGCA GGTATCTGCCTCTGTCCATCTATTGGAGAGTCTGGAGAAGCTGCCTCATGGATGGA AAGCAGCTGAAACCCCGAGCCCTTCGTCTCAAATCGTCTTGCAGGTTGCTCTGACG CAGCAGAACATTGACCAGCTTGAATCGAGGCTCGCAGCTGTATCCACACCCACTTC TAGCACCTACGGCAAATACTTGGATGTAGACGAGATCAACAGCATCTTCGCTCCA AGTGATGCTAGCAGTTCTGCCGTCGAGTCTTGGCTTCAGTCCCACGGAGTGACGAG TTACACCAAGCAAGGCAGCAGCATTTGGTTTCAAACAAACATCTCCACTGCAAAT GCGATGCTCAGCACCAATTTCCACACGTACAGCGATCTCACCGGCGCGAAGAAGG TGCGCACTCTCAAGTACTCGATCCCGGAGAGCCTCATCGGCCATGTCGATCTCATC TCTCCCACGACCTATTTTGGCACGACAAAGGCCATGAGGAAGTTGAAATCCAGTG GCGTGAGCCCAGCCGCTGATGCTCTAGCCGCTCGCCAAGAACCTTCCAGCTGCAA AGGAACTCTAGTCTTTGAGGGAGAAACGTTCAATGTCTTTCAGCCAGACTGTCTCA GGACCGAGTATAGTGTTGATGGATACACCCCGTCTGTCAAGTCTGGCAGCAGAAT TGGGTTTGGTTCCTTTCTCAATGAGAGCGCAAGCTTCGCAGATCAAGCACTCTTTG AGAAGCACTTCAACATCCCCAGTCAAAACTTCTCCGTTGTCCTGATCAACGGTGGA ACGGATCTCCCTCAGCCGCCTTCTGACGCCAACGATGGCGAAGCCAACCTGGACG CTCAAACCATTTTGACCATCGCACATCCTCTCCCCATCACCGAATTCATCACCGCC GGCAGTCCGCCATACTTCCCCGATCCAGTTGAACCTGCGGGAACACCCAACGAGA ACGAGCCTTATTTACAGTATTACGAATTTCTGTTGTCCAAGTCCAACGCTGAAATT CCGCAAGTCATTACCAACTCCTACGGCGACGAGGAGCAAACTGTGCCGCGGTCAT ATGCCGTTCGAGTTTGCAATCTGATTGGTCTGCTAGGACTACGCGGTATCTCTGTC CTTCATTCCTCGGGCGACGAGGGTGTGGGCGCCTCTTGCGTTGCTACCAACAGCAC CACGCCTCAGTTTAACCCCATCTTTCCTGCTACATGTCCTTATGTTACAAGTGTTGG CGGAACCGTGAGCTTCAATCCCGAGGTTGCCTGGGCTGGTTCATCTGGAGGTTTCA GCTACTACTTCTCTAGACCCTGGTACCAGCAGGAAGCTGTGGGTACTTACCTTGAG AAATATGTCAGTGCTGAGACAAAGAAATACTATGGACCTTATGTCGATTTCTCCGG ACGAGGTTTCCCCGATGTTGCAGCCCACAGCGTCAGCCCCGACTATCCTGTGTTTC AGGGCGGTGAACTCACCCCAAGCGGAGGCACTTCAGCAGCCTCTCCTGTCGTAGC AGCCATCGTGGCGCTGTTGAACGATGCCCGTCTCCGCGAAGGAAAACCCACGCTT GGATTTCTCAATCCGCTGATTTACCTACACGCCTCCAAAGGGTTCACCGACATCAC CTCGGGC C AATCTGAAGGGTGC AAC GGC AAT AAC ACC C AGAC GGGC AGTC CTCTC CCAGGAGCCGGCTTCATTGCAGGCGCACACTGGAACGCGACCAAGGGATGGGACC CGACGACTGGATTTGGTGTTCCAAACCTCAAAAAGCTCCTCGCACTTGTCCGGTTC 
TAAGAGGACTCGGGTGGAATATAGGGCCGCGGCGGATGGCTCGCAATAGGACTGC CATGGAATTGGTAGAAGTCACCATAGGATCATCATTCTCGTCC
[43] >jgi|Trire2|81004|estExt_GeneWisePlus.C_220006 polynucleotide (SEQ ID NO: 33) ATGAAGACCTTTATCCCCGTTGCGCTTCTTGCGCTCTCCCAAGCCACCTCGGCGTG TCTTCTTCCCCATGAGCGAGACGAGGCGAGTGGCCTGAAGCCAGTCGTCCGCCGC CAGTCCAGCAACGGAACTCCCATCGGCACCGGCGACAGATTTAGCGGTGGCTCAA CTGCGCCTCGAGGACTGGGAACTCAGTCCTCATCCACCTCCTTCAGCACAATTCTC AACGTCAAGGAGGTCACCAGCGGTCTGCAGGGCCTGGCCAACACGTACGGAGTTC AAACTTTCAACACTCCTTACAAGACGGCCCAGGGCGCAACCGTCGTCGGAGCCAA GGTCGGCGGTACCGGAGGTAACTGTACCGATGCCTACCGCGTCTTCTTCAACGGCA ACATCCACGCCCGAGAGCGCGGCTCCGCAGACAGTGTGCTCTACTTCATCTCGGAT CTGCTGTATGCCAACGCCCACAACACCGGCCTCACCTACGGCTCCAAGACGTACA GCAACGCCCAGGTGAAGACGGCGCTGGCCGCGGGAATCGTCTTCATCCCGCTCAG CAACCCCGACGGCGTGGCCTATGACCAGTCCACCAACAGCTGCTGGCGAAAGAAC CGCAACCCCAACTCGGGCCAGTCCCCCGGAGTCGACCTGAACCGGAACTTTGACT TCCTCTGGGACTTCCGCAACCTCTTCGCCTCGTCGGCACAGTCCAGCGTTGGCTCG ACCTCTCCTAGCTCCGAGACGTACCACGGTGCCAGCGCCTTCTCTGAGCCCGAGAC CAAGAACATCAAGTGGGTGTTCGACACGTACTCCAAAGTCCGCTGGTTCGTGGAC CTTCACTCGTATGCCGGTGACGTCCTGTGGAACTGGGGCAGCGATGAGAACCAGG TCGACTACCCGACCATGAACTTCCTCAATGGCACCTACAACAAGGTCCGCGGAAT CCTGACCGACACGCCGTCTCCCGGCCGAGGCTACGGCGAGTATGTCCCCCAGGCG GACCTGGACGTTAAGGAGGCCGCGGCCAAGCGCGTGGCCTCAGCTCTGACTGCTG GCGGCGGCCGCTCTTACACGGCCTTTCAGTCGGCGCAGCTGTATCCCACCTCGGGC GCGAGTGATGACTACGCTTACTCGAGGCACTTTTCCGATCCTACCAAGAATCTCAT CCACTCTTACACGATTGAGTTTGGCTTTGCCAACAACGCCGCGTCTTGCCCCTTTTA TCCCTCTGTTTCCCAGTACAACTCGAACCTCAGGGCTACCAGTGCTGGTTTCATGG AGCTGCTGTTGGC AGC C AC CGACT ACGGC CTTGGAGATGC AAC GAC GTGTT AG
[44] >jgi|Trire2|110490|fgenesh5_pg.C_scaffold_20000149 polynucleotide (SEQ ID NO: 34)
ATGAACGGCCTCAGCTTTGAACGGGCCAAGGCCCTCCGGGAGCCCAGCGCGGACC GCGCAACGATGCACTTTTCGCGCCTCAGGGGCATCAAGGGCCGCCCCGGCCAGCA GCAGGGCCAGCAGCAACACCACCACCGCTCTGCCGTCTCGGCGCTCAGCCGCGTC CGCAGCAGCAGCAGCAGCAGCACGAGCAGCGGCGGCTTCAAGCAGTCGTTCCAAA ACGTGTCGGCCGTGACGGACGCCTCGACGCAGTACGCCATCCAGTGCGGCTGGGA CGGCGTGCCCGTCTGGCTGCTGGTCGACACGGGCAGCTCCGACACGTGGGCGACG CAGACGGGCTTCGAGTGCGCGGACCTGGGCGGCGACGCGCACAGCGAGGCGGCG TGCGGCTTCTCGCGGCCGCTGATTGACGGCTTCGGCGGCGGCCAGATCGACGACCT GCACTTCTTCCTCAAGTACGGCTCGGGCGAGCGCATCTCGGGCCCGATGGGCTACA GCGACCTGTCGTGCGGCGGCGTCGCCGTGGCCCGGCAGCAGGTCGGCCTGGCCAA CAGCACCTACTGGCACGGCAACAACGTCACCGTCGGCATCCTGGGGCTGGCGTAC CCGGCCATCACGAGCGCCTACTACGGCGACGTCGGCTCCGAGGCGCCCTGGAACG CAATGAGCTACACGCCCTTCCTGACGAGCGCCATCAGCCAGGGAGCCATCGAGCC GCTCTTCAGCGTCGCCCTCGTGCGCAACTCGACCGACGGCGTCATTGCCTGGGGCG GGCTGCCCCCCATGGACTGGCAGTTTCACGGCTTTGCCAAGACGGATCTGATTGTG GCAAACCTCATCGGGTCGCCCGAGACGGCGTGGAAGCACTCCTTCTACACGATTG TGCCCGACGGCATGAAATGGGACCAGACGACGGACACGACCAAGTATCCATACAT TGTCGACACGGGCACGACGATGCTCTACCTGCCCCCTCCCCTCGCCGAGGCCATCG CCAACGCCTTCCAGCCGCGCGCCGTCTACCTCTACCAATGGGGCACCTACTTCGTC GAGTGCACCGCCATCCCGCCGCACTTTGCCATCCTCATCGAGGGCGTCGAGTTCTG GATCAACCCGGCCGACCTCATCTACCGCGACCTCGTCGACCCGCTGACGGGCTACT GCGCCGTTGGCGTCGCCAGCGGCGGGCCGGGGCCGTATATTCTCGGCGACGTCTTC TTGCAAAACGTCGTCGCCGTGTTTGACGTCGGAGCCGCCGAGATGCGCTTCTATGG GCGCAAGTGA
[45] >jgi|Trire2|121133|estExt_fgenesh5_pg.C_60052 polynucleotide (SEQ ID NO: 35)
AGCCCCTCGACTCCAGAAACCTCGAGGCCTCTCCAATCATGAGATGCAGCTACAA
GCTGCTCTCTGTTGTTTTTCCTGCTTCTGCAGCTTTTTCACAGTATCGGTTCCCTCTG CATAGCTTTGCATCTGCCATTCCAGGGGTTGTGCCAGGATCTTGCGGACGACAGCT GCTGCGCATTGCCTGTGTAGCGTCACACTGACGGAAGAGGCAAACCCGTGCATCA TTCAACCTGCTGTGAGACACACGCGTCAGCTGTCAAGGGACACACGATCGATGTT ACAGCGACACGCGCGTTGAGTATCATCATATGGCTGATAACTGGTTGATATTGTGA CGATCTGATATACGCAGACATTCTTTGAAACGACGGTGGAATCAAGACAGGCCTA CGTAAACCTGCCCTCACTGCTGTGTATAAATCGGCCTCAACTCCTCCCACTATTCC CGGGAACAGAATCCCAAACTCCATCAACCAATTCCCACACAGACTCAACAACCAG CCTCAACCCCATCAAAACAGACAGGCAGACACACAGCTCCCGTCCAAAATGGAAG CCATCCTCCAGGCCCAGGCAAAGTTCAGGCTCGACCGCGGCCTCCAGAAAATCAC CGCCGTGCGCAACAAGAACTACAAGCGTCACGGCCCCAAGTCCTACGTCTACCTC CTCAACCGCTTTGGCTTCGAGCCCACCAAGCCCGGCCCTTACTTCCAGCAGCACCG CATCCACCAGCGCGGCCTCGCCCACCCAGACTTCAAGGCCGCCGTCGGCGGGAGA GTCACGAGGCAGAAGGTGCTCGCCAAGAAGGTCAAGGAGGATGGGACGGTCGAT GCGGGCGGCAGTAAGACGGGAGAGGTCGATGCTGAGGACCAGCAGAATGACTCT GAATACCTTTGCGAGGTGACCATCGGAACTCCTGGGCAGAAGCTTATGCTCGACTT TGACACCGGCTCATCCGATCTCTGGGTCTTCTCTACAGAACTGAGCAAGCATCTTC AAGAAAACCACGCCATCTTCGACCCCAAAAAGTCGTCCACGTTCAAGCCCCTAAA GGACCAGACATGGCAAATCTCCTACGGCGACGGCAGCTCCGCCTCGGGCACCTGC GGCTCCGACACCGTCACCCTCGGCGGTCTCTCCATCAAGAACCAGACCATTGAACT CGCCTCCAAGCTCGCGCCCCAGTTCGCCCAGGGCACCGGCGATGGCCTCCTCGGTC TCGCCTGGCCGCAGATCAACACCGTCCAGACGGACGGCCGCCCGACGCCCGCCAA CACGCCCGTCGCCAACATGATCCAGCAGGACGATATCCCCAGCGACGCCCAGCTC TTTACCGCCGCCTTCTACAGCGAGCGCGACGAAAACGCAGAGTCTTTCTACACCTT TGGCTACATCGACCAGGACCTCGTGTCCGCCTCGGGCCAGGAGATTGCCTGGACG GATGTCGACAACTCGCAGGGCTTCTGGATGTTCCCTTCCACTAAGACTACCATCAA CGGCAAGGACATTTCTCAGGAGGGTAACACCGCCATTGCCGACACTGGAACCACC CTCGCTCTGGTCTCCGACGAGGTCTGCGAGGCGCTGTACAAGGCCATCCCTGGCGC CAAGTACGACGATAACCAGCAGGGCTACGTCTTCCCCATCAACACCGATGCTTCG AGTCTGCCCGAGCTCAAGGTCTCGGTTGGAAACACGCAGTTTGTCATCCAGCCCGA GGACCTGGCTTTTGCTCCTGCGGACGACAGCAACTGGTACGGCGGTGTTCAGTCGA GGGGAAGCAACCCCTTTGACATTCTGGGTGACGTATTCCTCAAGTCGGTCTATGCG ATCTTTGATCAGGGCAACCAGCGATTCGGCGCAGTCCCCAAGATCCAGGCAAAGC AGAACCTGCAGCCCCCCCAGTAAACACCCCAACTCGTCTTATGCCTTCTCGTACCT TTAGCTCAATAGCGAATGTGGAATGCCCCGTCAATTTAATGAACTGCTATATAATA AACCC
[46] >jgi|Trire2|106661|fgenesh5_pg.C_scaffold_7000462 polynucleotide (SEQ ID NO: 36)
[47] ATGGATGCTATCCGAGCCAGGAGTGCTGCCCGCCGCTCCAACAGGTTCCAA GCCGGCTCATCTAAGAACGTCAATGGAACGGCCGATGTTGAGTCTACGAACTGGG CAGGCGCCGCTATCACGACTTCGGGCGTTACCGAGGTCAGCGGCACCTTCACGGT CCCCAGGCCCAGCGTCCCAGCTGGTGGCAGTTCTCGCGAGGAGTACTGTGGTGCT GCTTGGGTTGGAATTGACGGCTACTCTGATGCCGACCTCATCCAGACTGGCGTCCT TTGGTGCGTCGAAGACGGCGAATACTTGTACGAGGCTTGGTATGAGTACCTTCCGG CCGCGTTGGTTGAGTACTCTGGCATCTCTGTGACGGCTGGATCGGTTGTCACCGTT ACTGCCACCAAGACTGGCACTAACAGCGGCGTTACTACCCTGACAAGCGGTGGGA AGACTGTATCGCACACCTTCTCCCGACAGAACTCACCCCTGCCCGGCACGAGCGCT GAGTGGATTGTCGAGGACTTTACCTCTGGCAGCTCGTTGGTTCCCTTTGCAGACTT TGGATCCGTTACCTTTACCGGAGCCACCGCTGTCGTGAACGGAGCAACTGTGACA GCGGGCGGCGACTCTCCCGTCATTATCGACCTGGAGGACTCAAGGGGAGACATTC TC AC ATC AAC AACGGTCTCTGGC AGC ACTGTGACTGTTGAGTATGAATAG711
[48] >jgi|Trire2|51365|estExt_Genewisel.C_220234 polynucleotide (SEQ ID NO: 37) TCGTCTTGTCGTTGCGCAAAGCATTGTGTGTCACTGTCCATACAGCCCTCCCACAG TCAGACAAGCGGGATAGAGTGTGAGCCATGGTGCGCTCCGCCCTATTCGTGTCGCT GCTCGCGACCTTCTCCGGAGTCATTGCCCGTGTCTCCGGGCATGGGTCAAAGATCG TTCCCGGCGCGTACATCTTCGAATTCGAGGATTCACAGGACACGGCCGATTTCTAC AAGAAGCTCAACGGCGAGGGCTCAACGCGCCTGAAGTTCGACTACAAGCTGTTCA AGGGCGTCTCCGTCCAGCTCAAGGACCTAGACAACCATGAGGCAAAGGCCCAGCA GATGGCCCAGCTGCCTGCTGTCAAGAACGTGTGGCCCGTCACCCTCATCGACGCCC CCAACCCCAAGGTCGAGTGGGTTGCCGGCAGCACGGCGCCTACTCTGGAGAGCAG GGCGATCAAGAAGCCACCGATCCCGAACGACTCGAGCGACTTCCCCACGCACCAG ATGACCCAAATCGACAAGCTGCGAGCCAAGGGCTACACGGGCAAGGGCGTCAGG GTTGCCGTCATTGATACAGGCATTGACTACACCCACCCTGCTCTCGGCGGCTGCTT TGGTAGGGGCTGTCTGGTCTCCTTTGGCACCGATTTGGTCGGTGACGACTACACCG GCTTTAACACGCCTGTCCCCGATGATGACCCCGTCGACTGCGCCGGCCACGGTTCT CACGTTGCTGGTATCATTGCTGCGCAGGAGAATCCGTACGGCTTCACTGGCGGCGC TCCCGATGTCACCCTCGGCGCTTATCGAGTCTTTGGCTGCGACGGCCAGGCCGGTA ACGATGTCCTGATTTCCGCTTACAACCAGGCCTTTGAGGACGGTGCCCAGATCATC ACTGCCTCCATTGGCGGTCCCTCTGGCTGGGCTGAGGAGCCGTGGGCCGTTGCCGT CACCCGCATCGTTGAGGCAGGTGTTCCCTGCACGGTCTCTGCCGGCAACGAGGGC GACTCTGGTCTCTTCTTTGCCAGCACGGCAGCCAATGGCAAGAAAGTCATTGCTGT CGCCTCCGTCGACAACGAGAACATCCCTTCAGTGCTGTCCGTGGCCTCTTACAAAA TTGACAGCGGCGCTGCCCAGGACTTTGGCTACGTCTCCTCCTCCAAGGCGTGGGAC GGCGTGAGCAAGCCCCTGTATGCTGTGTCGTTCGACACTACTATTCCCGACGATGG CTGCTCGCCTCTCCCTGACAGCACTCCCGACCTCTCTGACTACATTGTCCTTGTCCG CCGTGGCACCTGCACCTTTGTCCAGAAAGCCCAAAATGTCGCTGCAAAGGGCGCC AAGT AC CTGCTCT ATT AT AAC AAC ATTCCCGGTGC GCTGGCC GTC GATGTC AGC GC CGTCCCCGAGATTGAGGCTGTCGGCATGGTCGATGACAAGACGGGTGCTACCTGG ATTGCCGCCCTCAAGGATGGAAAGACCGTCACCCTGACACTGACTGACCCGATCG AGAGCGAGAAGCAAATTCAGTTCAGCGACAACCCGACAACTGGCGGTGCTCTGAG CGGCTACACAACCTGGGGCCCTACCTGGGAGCTGGACGTCAAGCCTCAGATCAGC TCTCCCGGCGGCAACATTCTCTCCACGTACCCCGTGGCTCTCGGAGGATATGCCAC CCTGTCCGGTACCTCCATGGCCTGCCCCCTGACGGCGGCTGCTGTTGCTCTGATTG GACAAGCTCGTGGCACCTTTGACCCTGCCTTGATCGACAACTTGTTGGCAACGACT GCGAACCCCCAGCTGTTCAACGACGGCGAGAAGTTCTACGACTTCCTCGCCCCCGT 
TCCCCAACAGGGCGGTGGCCTCATCCAGGCCTACGATGCCGCCTTTGCGACCACTC TC CTGTC ACC GTC C AGC CTGTC GTTC AAC GAC ACTGAC C ACTTC ATC AAGAAGAAG CAGATCACCCTCAAGAACACCAGCAAGCAGAGGGTCACCTACAAGCTCAACCACG TCCCCACCAACACCTTTTACACTCTGGCACCCGGTAACGGCTATCCAGCTCCCTTT CCTAACGACGCCGTTGCCGCTCACGCCAATCTCAAGTTTAATCTGCAGCAAGTGAC CCTGCCCGCCGGCAGGTCCATCACTGTCGACGTCTTCCCTACTCCCCCCAGGGACG TCGACGCCAAGCGCCTGGCGCTTTGGTCGGGCTACATCACGGTCAACGGCACGGA TGGCACCAGTCTGTCTGTCCCGTACCAGGGCCTCACCGGCTCCCTGCACAAGCAGA AGGTGCTCTATCCGGAGGACTCCTGGATCGCCGATTCCACCGATGAAAGCCTGGC CCCTGTTGAGAACGGCACCGTCTTCACCATTCCCGCGCCGGGCAACGCTGGCCCCG ATGACAAGCTCCCATCGCTCGTCGTCAGCCCTGCCCTTGGCTCTCGTTATGTCCGC GTTGATCTCGTCCTCCTGTCCGCGCCTCCTCATGGCACCAAGCTCAAGACGGTCAA GTTCCTCGACACCACCTCCATCGGCCAGCCTGCCGGATCACCGCTCCTCTGGATCA GCCGTGGCGCCAACCCTATTGCTTGGACCGGCGAGCTGTCTGACAACAAGTTTGCT CCCCCTGGAACGTACAAGGCCGTGTTCCATGCTCTGCGTATTTTCGGCAACGAGAA GAAGAAGGAGGACTGGGATGTGAGCGAATCTCCTGCCTTCACCATCAAGTATGCG TAGGGTAGATGTAAGAGGGTTTCTTGAGGGGTAATATTCAGATCA
[49] >jgi|Trire2| 122076|estExt_fgenesh5_pg.C_100212 polynucleotide (SEQ ID NO: 38) TGTTGATAACGCCCTGTCTTGTGTACAAAGATTTCTGTTGACGTTCCCTCCTACCCC GTTGTTCGATACACATTCACCTGCCTGGTAGCACCTCAATACCAAGGTTACTACTA ACTCCACCAAGACGCGCGGCGATCCATCCCCCTGACCCGAGAAGCCCTACGAAAA AGCTATCGACATTGTAAAATCACACGAGATTGTAGATTCTTCTACCTTACCTGCCT ACACACCTATAGCCCAGTAGCTTTGCGCCATGAGAGCCTCGCCCCTGGCCGTTGCA GGCGTCGCCCTGGCCTCTGCAGCCCAGGCGCAAGTCGTCCAGTTCGACATCGAGA AGCGCCACGCGCCGCGGCTCAGCCGCCGAGACGGAACCATCGACGGCACTCTGTC CAACCAAAGAGTCCAGGGCGGATACTTTATCAACGTCCAGGTTGGAAGTCCTGGC CAGAACATTACGTTGCAGCTCGACACAGGCAGTAGCGATGTCTGGGTCCCGTCAT CAACGGCTGCCATCTGCACCCAGGTCTCAGAACGGAATCCCGGATGCCAATTCGG AAGCTTCAACCCTGATGACTCCGATACCTTTGACGAGGTCGGCCAAGGCCTGTTCG ACATCACCTACGTCGACGGGTCCTCGTCCAAAGGCGATTACTTCCAGGACAACTTC CAGATCAACGGCGTGACCGTCAAGAATCTCACCATGGGTCTCGGCCTGAGCTCAT CGATCCCCAACGGCCTGATAGGAGTGGGCTACATGAACGACGAGGCCTCCGTCAG CACGACCCGAAGCACCTACCCGAACCTCCCCATCGTCCTGCAGCAGCAGAAGCTC ATCAACAGCGTGGCCTTTAGCCTGTGGCTCAACGATCTGGATGCCAGCACGGGGT CCATCCTGTTCGGCGGCATCGACACGGAAAAGTACCACGGCGACCTGACTAGCAT CGACATCATCTCTCCCAACGGCGGCAAGACCTTTACCGAGTTTGCCGTCAACCTGT ACTCGGTGCAGGCCACAAGCCCGAGCGGAACCGACACGCTCTCCACAAGCGAGGA TACGCTCATCGCCGTGCTGGACTCTGGAACGACCCTGACCTATCTGCCGCAGGACA TGGCCGAGGAGGCGTGGAACGAAGTGGGAGCCGAGTACAGCAACGAGCTCGGCC TGGC GGTGGTCCC CTGCTCTGTGGGC AAC AC C AAC GGCTTCTTCTC CTTT AC CTTTG CCGGAACCGACGGCCCTACGATCAACGTCACCCTGTCCGAGCTGGTCCTCGACCTC TTCAGCGGCGGCCCCGCGCCTCGGTTCTCGTCCGGCCCCAACAAGGGCCAGTCAAT CTGCGAGTTTGGCATCCAGAACGGGACCGGCTCTCCGTTCCTGCTGGGCGACACGT TCCTGCGCTCCGCGTTCGTCGTCTACGACCTCGTCAACAACCAGATTGCCATTGCC CCGACCAACTTCAACTCCACCAGGACCAACGTCGTCGCCTTTGCGAGCAGCGGCG CCCCGATCCCGTCTGCCACGGCGGCCCCGAACCAGAGCAGGACGGGCCATTCCTC GTCGACGCACTCGGGCCTGTCTGCTGCCAGTGGCTTCCACGATGGTGACGACGAG AATGCTGGCTCCTTGACGAGCGTCTTTTCTGGTCCGGGAATGGCCGTGGTCGGCAT GACCATCTGCTACACTCTGCTCGGAAGCGCCATATTCGGCATTGGCTGGCTGTAAA TGTGTGAATCCTGGCTTTTCTCAGCCCTATGAGTGCTCTTTGACGAACTTTTGAATT TTGCTTGTTATTGGGGCGCATTGGTTTGATTATGGGGTGTTCTGGTTCAAATGTTCT 
TTATTTCTTCTTCTTCCGTCTATTCATACTAGCAGGCATTCATTCGCTGGAGTTTGG ATGGACTTACAGGACTTTCG
[50] >jgi|Trire2| 123865|estExt_fgenesh5_pg.C_270147 polynucleotide (SEQ ID NO: 39) CTCCAGCCTCTCCTCGAACGACATCTTGCTGCAAACGCTCAGTATGAGGGCCTGCC TGCTCTTCCTGGGCATCACCGCCCTGGCCACAGCAATCCCAGCTTTGAAACCACCC CATGGGAGTCCAGACAGAGCCCACACAACCCAGCTTGCCAAAGTCTCCATCGCCC TGCAGCCCGAGTGTCGAGAGCTCCTCGAGCAAGCCCTCCACCACCTCTCCGACCCC TCAAGCCCACGGTACGGCCGTTACCTTGGCCGCGAAGAAGCCAAAGCTCTGCTCC GCCCTCGACGAGAGGCCACGGCCGCCGTCAAGAGATGGCTGGCCAGGGCTGGTGT TCCTGCTCATGATGTTTTGACTGATGGGCAGTTCATCCACGTGCGGACGCTTGCGG AGAAGGCTCAGGCTCTGCTTGGATTCGAGTACAACTCGACGCTGGGTTCGCAGAC AATCGCAATCTCTACCCTTCCAGGAAAGATACGGAAGCATGTCATGACCGTTCAGT ATGTTCCGCTTTGGACGGAAGCAGACTGGGAGGAATGCAAGACGATAATCACGCC TAGCTGCCTGAAAAGGCTGTATCACGTCGACAGTTACCGCGCAAAGTATGAGAGC AGCAGTCTTTTTGGCATCGTGGGCTTCAGCGGGCAAGCGGCTCAGCACGACGAAC TGGACAAGTTTCTCCACGATTTTGCGCCCTACTCTACCAACGCAAACTTCTCCATC GAGTCTGTCAATGGAGGGCAGAGTCCCCAGGGGATGAACGAGCCGGCAAGCGAG GCCAACGGAGATGTCCAGTACGCCGTTGCCATGGGCTATCATGTTCCTGTTCGGTA TTACGCAGTCGGTGGCGAGAACCATGACATTATCCCAGACCTGGACCTGGTTGAT ACAACAGAGGAGTATCTTGAGCCGTTTCTCGAATTCGCCAGCCATCTCCTCGACCT CGACGACGATGAGCTCCCAAGGGTAGTCTCCATCTCGTACGGCGCCAACGAGCAG CTCTTCCCCCGATCCTACGCCCATCAAGTCTGCGACATGTTTGGCCAGCTCGGCGC CCGCGGCGTCTCCATCGTCGTGGCCGCTGGCGACCTGGGCCCCGGCGTATCATGCC AATCAAACGACGGCTCAGCAAGGCCAAAGTTCATCCCCTCGTTTCCCGCGACGTG CCCCTACGTGACGAGCGTGGGGTCGACGCGCGGCATCATGCCCGAGGTGGCGGCG AGCTTTTCGTCGGGCGGCTTCTCGGATTATTTTGCTCGTCCGGCGTGGCAGGATAG AGCGGTGGGCGCGTACCTCGGGGCGCATGGCGAGGAGTGGGAGGGCTTTTATAAT CCTGCTGGGAGAGGGTTTCCTGATGTTGCTGCTCAGGGGGTGAATTTTCGGTTTAG AGCCCATGGGAATGAGAGCTTGAGTTCCGGGACGAGCCTCTCCTCGCCGGTGTTTG CCGCTCTCATTGCTCTGCTCAACGACCATCGATCTAAGAGCGGCATGCCGCCCATG GGATTCCTGAATCCTTGGATCTACACCGTCGGCAGTCACGCCTTCACAGACATCAT CGAGGCCAGGTCGGAGGGATGCCCAGGACAGAGCGTCGAATATCTCGCATCGCCC TATATCCCCAACGCGGGCTGGAGCGCCGTGCCTGGATGGGATCCGGTGACGGGTT GGGGCACGCCGCTGTTTGACAGGATGCTGAATCTGTCTTTGGTGTAGGAGGTGATG ATGGGTGATAAGCTTGGGTTTTGGAGAAGAC
[51] >jgi|Trire2|81517|estExt_GeneWisePlus.C_250157 polynucleotide (SEQ ID NO: 40) ATGAAGTCGGCATTGCTCTGGGCGGCTCCCCTCTCGCTGCTGGCGGGCCTCGGTGC CTGCGGGAAGAACTTTGACGAGCTTCTCGAGGTCCCGGAAGGATGGACGCAGCTC AACGATGCCGTCAATCCCTCGCAGCACATACGGCTGTCGATTGCCGTGAAGCAGC CATATATCGACAGCCTGGAGGCCAGGATGGCAGAGAAAGGCAACCGTCTGTCGAT GCAAGAGGTCCGGGAGCTGCAGACTCCAGCCAAAAAGGACATTGACAATGTGCTG CACTGGCTCTCGCAGAACAACCTCTACGGCGTCGTCGAAAAGGACTTTATTCGAGT CTGGACGACCGTCGCCAAAGCCGAGCCGCTGCTCAAGATGAAGCTCAGCCGCTTC TCGTACGAAGGCAAGCCGGCGGTGCTGCGAACGACCAAGTACACCATCCCCGACT CCGTGGCCGACTCCATCAGCTTCATCAACCCCATCAACAACTTCATGTCGGCGCGC CATCGTGAACGGGGCCTGACTTTCCTACTTCCTCCGTCAAAGGGCGCCGTCCTGCC AGGGAACACGACGGCCTATTGCGCCGGCAGCGTGACCCCGTCGTGTCTTTCCAAG CTCTACAACATCAACTACTCGCCCGCCAACACCAGCTCGCCCGTCATCTTTGGCGT GGCTGGCTTCCTCGAGGAAAACGCCAACCTCCAGGATCTCCGACAGTTCCTCAACC AGTCGGCGCCCGAGGTGGCCAAGACGGGACGCACCATCAATGTCGAGCTGGTCAA CGGCGGCGTCAACTCGCAGGAGCTGAGCGAGTCTGGCATCGAAGCGGCCCTGGAC GTTGACTATGCCGTGTCGCTGGGCTTCCCCACAAACGTCACCTTTTACTCTACCGG AGGCCGCGGCGTCAAGCTCAACGACGACGGACAGCCCATCGAGGGCGAGGACGA CGACAACGAGCCGTACCTCGAGTTCTTCCAGTACCTGCTGGCCAAGCCCGACGGC CAGGTGCCGCACGTCCTGTCGCTGTCGTACTCGGACGACGAGTTGTCCGTGCCCCG CGACTATGCGAAGCGCGTCTGCAGCCTGTTTGGCCTGCTGACGGCCCGCGGAACG TCCATCATCTTCTCGTCCGGCGATGGCGGCGCTCGCGGCGGCCGAGCATCGAGCTG CCTGACCAACGACGGCACGAAGCGTCAAGTCACCATGGCGACCTTCCCGCCCACC TGTCCCTGGGTGACGTCGATTGGCGCCGTCACCAACATCGCAGAGCCTCCCAACG GCGCCAGATTCAGCACCGGCGGCTTCTCCCAGTACTTTGCCCAACCTCGATGGCAG AACGAGGCAGTCGAGGGCTACGTCAAGGCGCTCGGGGGGCACCTGGACGGCTACT ACAACGAGTCCATGCGAGCCATCCCCGACGTTTCAGCAGTCGGCACCGCCTTCAG CATCATCTCGGGCGGCTACCCGCGCTCTGTTCAGGGCACCAGCGCCAGCGCCCCCG TCTTCGCCGCCATGATTGCCCTGATCAACGACGCCCGCCTGCGCGCCGGCAAGAA GTCGCTGGGCTTCCTGAACCAGCACCTGTACTCGAGCGAGGTCAGAGCCGTGCTG CAGGACATCACGGCGGGACAGAGCGCCTCTTGCATCTGGAACGACGCAGACATAC CGGGAGGCTGGCCGGCGGCCGAGGGCTGGGATGCCATTACGGGGCTGGGCGTGCC CAAGCGGTTCGACAAGCTGATGGAGGTGCTGGTGAATCATCTTCCACGCCTGGCG ACTGGCTTCGCCCGCTTCAAGATGCCCACGCAGCACGTCGAGCCCGAGTCTCACGC TCTGCTCCAAGATGTCGCAAACTCTCTGCTCAAGGCGCGAAAGGTGGTCGTGGTTA 
CCGGCGCGGGCATCAGCACCAACTCGGGCATACCGGACTTCCGCTCCGAAAATGG CCTCTACTCGCTCATCCAGGCTCAGTTCGACGCTGCGAACCAGCCTACGCGACCGG CCGAACGAAGCAAGGCTGACGGCACCGCCGACAGCGGAGAGGAGCCTCGCCCGA CGAAACGCGGAATGACGTCGCGGGAGGCTTCCCCAGACCTGGACGAGGTTACCCG ACAGCTGAGAGACGACATTGAAGCCCGCGCCGAGTCCCAGAGGCCCGCTGCGTCC AGTCGAGCCGCCGGCACGCAGCCAGCCGTGGCCGCCTCGGATGCAAACGCGTCGG CAGAGGGCGTTTGTTTGAGCACTCCGCGTCGCAAGCCCGCGCTTCCCTCCACCCCC CTGCCGACGACGTCTCCCCTCTCTTCTCCTCCCCGAGAAGACTTTCTGATTCCTTTG CCGTCATGGTCTTCGTCATCGATCCTGCGAACCGAAGATCGTAAGCGCATCACCGA CGCCTCTCACAATGTCGTTTCGTCTCCTCTCTCCTCGCCGCCGCCGGTCCTCTTCGA TCCCTTTCACCCTTCCTCTCCTTCCGACGAAGACAGGAGTCGCCGTAGTAGCACCA CGCCGTCGGAAGCGGGAGAGAACCCGCCCAACGCGATGCCCGCTTCCCAGACATC CAGCTTCGGAAAGGCAAACCTGCCCAACATGAAGGGCAAGGACCTCTTCGACGCT TCGATATGGTCGGATCCTACGAGGACCTCCGTCTTCTACCAGTTTGCAACGTCTCT ACGGCAAAAGGTCCGCGACGCCGAGCCTACAAGCTCCCACAAGTTCATCAGCCAC CTGAGGGACCGCGGCAAGCTGGTACGATGCTATACGCAAAACATTGACCAAATTG AGGAGAAGGTCGGCCTGTCGACGTCTTTGGAGGATGGGCCGGGCAGTCGAGGACG CTTCTCGCGCAAAGCGACGGCAAACGCTTCGCAGCTCAACAAAATGGTGCAAGAA GTATCCAGCGGTGAGGGCGGCGCATCCAGCCATGTCAATGCGTCCAGCCAATCAT CGAATGGGTCGAGTGAGCAATCATCAGCAGAGTCATCACAGGCCAATGATAGGAC GGAAGAGGAAGACAGTGCTGGCAGCAGCAGCACCACCGCCGCCACCACCACCAC CACCACCACACCGCCAGATCGACCCAAGCCGGCCCCGAGAAAGGAGCCGCCGCA ATCGGGTGTGGAGTGCGTCTTCCTGCATGGCTCGCTGCAGCTTCTACGATGCTTCC TCTGCGGGCAGGTGTGCTCTTGGGACGACGACGACCGCGAGGTGGAGACGCTGTC CGGCCTGCAGCCTGAATGTCCTCACTGCGTTGGCGCCACGGAAGCGCGTCAGGAA AGGGGCAAGCGAGCGCTGGGCGTGGGCAAGCTGCGTCCCGACATTGTGCTCTATG GCGAGGAGCACCCCAGCGCCCACCTCATCAGCCCCATCGTGACGCACGATCTTGC CCTGTATCCGGACATGCTCCTGATCCTGGGGACCAGCCTGCGAGTCCACGGGCTCA AGGTCCTCGTCAGGGAGTTTGCAAAGACGGTGCACAGCCGTGGCGGCAAGGTAGT CTTTGTCAACTTCACCAAGCCGCCAGAGAGCTCCTGGGGCGACATCATTGACTACT GGATCCAGTGGGACTGTGACGCGTGGGTGAGCGATCTCCAGGTGCGCATCCCCAA GCTCTGGCAAGATCCGGAGCCGCCAAAGCCCAAGAAGAAGCGCGACTCGGGAGG AGCCGCCGACGACAATCGCGAGGAGAAGAAGAGACCGCCTGCACAGAACCCGGT TGCCTTGCGCGACACAAAGGTGAATGGCGCGTACTGCACGCTGAAAATTCTCAAA GAGCTGCGCAGAATCACATACACTCGTGACTCGGCCGCCATCGGGAACCCCATCA 
TCACCGCGCCCGAGCCCCCGACTACCACCGAGGCGATTTCTCGAGCTGTGGCCATG GATTCTTCGCGCACAAGCACGCCCAGGGGCAAGTCCAAACGATCGCGTAGGTCTG CGACAGGAGCCATAGACAGGCCTAAGCGGACGCCATCGACACTGAATCCGAACCA TGGCCGGTCCAAGAAACCCACGGCAGAGGCCAAGAAGAAACAGCAGGAGGAGGA GGAAGTGCCTGAGACGCCTAGCCAGCCGCCAGCTAGCACGGTGGAGGAATTCAGC TCGATCCTCCACTCTGTCAAGTCTAACCCACGCATACGGAAGCGGAAGATGATTG ACGGAGAAGAATTTGTATTTCCCGCGGTTGGGAAGAAGCGCGGCGCGGTGGATAG CCTATACAAGGGGCCGGGGGACGACACAAAGGAGCTACCACCTCTGCGCCCTATG CCG
[52] >jgi|Trire2|123244|estExt_fgenesh5_pg.C_190118 polynucleotide (SEQ ID NO: 41) CTCTCCCTCCCATCCATCACCCATCATCTTCCCCCAGTGGTTCCCATCACGCAGGTC GAAGCCGGAGTTTCTCTGTTTCGCTGCGTTTTTTCAGTCTCTTCTGGTGTTGTTTGT CGCAGGTTGACCAACTTCCGTCCTTCCCTCACCAGCCCATCACATCCGCCACCATG CGGTCCGTTGTCGCCCTCTCCATGGCGGCCGTTGCCCAGGCCAGCACATTCCAGAT TGGCACCATCCACGAGAAGTCGGCCCCCGTGCTGAGCAACGTCGAGGCCAACGCC ATCCCCGATGCCTACATCATCAAGTTCAAGGACCACGTGGGTGAGGATGATGCCT CCAAGCACCACGACTGGATCCAGAGCATCCACACAAACGTTGAGCAGGAGCGCCT TGAGCTCCGCAAGCGAAGCAACGTCTTTGGCGCCGACGACGTCTTTGACGGTCTG AAGCACACTTTCAAGATTGGCGACGGCTTCAAGGGCTACGCCGGTCACTTCCACG AGTCTGTCATTGAGCAGGTCCGGAACCACCCTGACGTTGAGTACATCGAGCGCGA CAGCATTGTGCACACCATGCTTCCCCTCGAGTCCAAGGACAGCATCATCGTTGAGG ACTCGTGCAACGGCGAGACGGAGAAGCAGGCTCCCTGGGGTCTTGCCCGTATCTC TCACCGAGAGACGCTCAACTTTGGCTCCTTCAACAAGTACCTCTACACCGCTGATG GTGGTGAGGGTGTTGATGCCTATGTCATTGACACCGGCACCAACATCGAGCACGT CGACTTTGAGGGTCGTGCCAAGTGGGGCAAGACCATCCCTGCCGGCGATGAGGAC GAGGACGGCAACGGCCACGGCACTCACTGCTCTGGTACCGTTGCTGGTAAGAAGT ACGGTGTTGCCAAGAAGGCCCACGTCTACGCCGTCAAGGTGCTCCGATCCAACGG ATCCGGCACCATGTCTGACGTCGTCAAGGGCGTCGAGTACGCTGCTCTCTCCCACA TTGAGCAGGTGAAGAAGGCCAAGAAGGGCAAGCGGAAGGGCTTCAAGGGCTCCG TCGCCAACATGTCCCTCGGTGGTGGCAAGACCCAGGCTCTTGACGCTGCCGTCAAC GCCGCCGTCCGCGCCGGTGTCCACTTTGCCGTTGCTGCCGGCAACGACAACGCTGA TGCTTGCAACTACTCCCCCGCTGCCGCCACTGAGCCCCTCACCGTCGGTGCTTCTG CTCTCGATGACAGCCGTGCTTACTTCTCCAACTACGGCAAGTGCACTGACATCTTC GCCCCTGGTCTGAGCATCCAGTCCACCTGGATTGGCTCCAAGTATGCCGTCAACAC CATCTCTGGTACCTCCATGGCCTCTCCTCACATCTGCGGTCTCCTGGCCTACTACCT 
GTCTCTCCAGCCCGCTGGTGACTCTGAGTTCGCTGTTGCCCCCATCACCCCCAAGA AGCTCAAGGAGAGCGTCATCTCTGTCGCCACCAAGAACGCCCTCTCTGACCTGCCC GACTCTGACACCCCCAACCTGCTCGCCTGGAACGGCGGTGGCTGCAGCAACTTCTC CCAGATTGTCGAGGCCGGCAGCTACACTGTCAAGCCCAAGCAGAACAAGCAGGCC AAGCTCCCCAGCACCATTGAGGAGCTCGAGGAGGCCATCGAGGGTGACTTTGAGG TCGTCTCTGGCGAGATCGTCAAGGGTGCCAAGAGCTTTGGCTCCAAGGCGGAGAA GTTTGCCAAGAAGATCCACGATCTCGTCGAGGAGGAGATTGAGGAGTTCATCTCT GAGCTCTCCGAGTAAGGTCTATCTCTTTTGAGTAGTCCCCAATTTATTTCCTTTGTG TGATGGCCACGATGCGCTCTACGTCATGATTGAGTACGAGGGTCATAGTATGGGA TCAGGAGAGACTTGTTTTGCTTTTTTGGGTATTTGGCACGGATATGTTATTTAACGG GGACTTTACCCCCTAGGTTGTACTCTACGCTGCATGGTGCTACGCTCTCAAGATGA GCTTATACAGATCATGATAATAAGACAAACCAGCATTGTCAGATTGATGCTTT
[53] >jgi|Trire2|81070|estExt_GeneWisePlus.C_220116 polynucleotide (SEQ ID NO: 42) CTCCTCTCCTCTACACACCCTCCATATCCTCACAGAACCATCATCATGGCCGTCCTC AGCAGATTGGCCCTCACGGCCTCGTTCGCCCTCTGCGGCGTCTCGGCAGCCGGCAT CCAGCAGCCGCTGACTGCCCCAGAGTCGCTGCCTCCCTCACACGAGGCCGTCGCC GACTACGGCTCCAAGCCGATCATTGATTCCGAGGCCCTGCAGAGCGCCATCAGCA TCGACACGCTCGTGAAGCGTGCGGAGAGCTTCTACAAGTTCGCAAAGGCCTCGGA GGAGGAATACGGCCATCCGACTCGAGTCATTGGCAGTGCCGGCCACGAGCAGACG CTCAACTACATCACAAACACCCTGCTCGACCACGGCGACTACTACAATGTCTCTGT CCAGGAATTCCCCGTCACGCTGTCCAACGTCTTCCAGTTCCGCCTGGTCTTGGCCG ATGAGGTTTCCAAGTCGGCCATCCCCATGGGGCTGACGCCGCCCACCAAGGACAA GGAGCCCGTCCACGGCGACCTGGTGCTCGTCCAAAACAGCGGCTGCGACGCGTCC GACTACCCCCAGAACGTCAAGGGCAACATTGCCTTTATCCGCCGAGGAGCCTGCT CCTTTGGCGACAAGTCGATTGGGGCCGGCAAGGCTGGTGCCAAGGCTGCCGTCAT CTACAACACCGACCCGGAGGAGCTTCATGGCACGCTGGGACTGCCCGTTGAGGAC C AC ATTGCC AC CTTTGGC ATTGAC GGC GTC GAGGGC AAGAAGATTCTCGC C AAGC TCAGCAATGGCGAGTCTGTCGATGCCATTGCCTACATCGATGCCGAGGTCAAGCA GATCCAGACGGTCAATGTCCTGGCCCAGACGGAAGAGGGCGATCCCGACAACTGC GTCATGCTCGGCGGTCACAGCGACGGAGTTGCCGAGGGCCCCGGCATCAACGACG ACGGCTCCGGCAGTATATCCGTTCTGGAGGTCGCCGTTCAGCTGACCAAGTTCAAG GTGAACAACTGCGTCCGCTTCGCCTGGTGGGCTGCCGAGGAGGAGGGCCTGCTGG GCTCCGACTTTTACGCCGCCAGCCTGTCTGACGAGGAGAACCAAAAGATCCGCCT CTTCATGGACTATGACATGATGGCCAGTCCCAACTTTGCCTACCAGATCTACAACG CAACCAACGCCGAGAGCCCTGCCGGCTCCGAGGAGCTGAGGAACCTCTACGTGGA CTGGTACAAGTCCCAGGGCCTCAACTACACCTTCATTCCCTTTGACGGCCGCAGCG ACTACGACGGCTTCATCCGCGCTGGCATCCCCGCCGGTGGTATTGCTACCGGCGCC GAGGCTGTCAAGACCAAGGAGGAGGCCGAGATGTTTGGCGGACGCGCCGGCGAG TGGCTTGACCCTTGCTACCACCAGCTGTGTGACGACCTGGGCAACCTGAACCACAC CGCCTGGGAGGTCAACACCAAACTTATTGCCCACTCGGTTGCCACCTACGCCCTCT CGTTTGACGGCTTCCCCAAGCGCAAGCTGGAGACGGAGATGAGCGCCTACAGCCA GACGACCAAGCACCACGGGCCCAAGCTGATTCTGTAAAGTGGTGGCGAGGTGCGT TTCTGTAGCATTTACGAGCATGCTGT
[54] >jgi|Trire2|53961|e_gw1.1.1743.1 polynucleotide (SEQ ID NO: 43)
ATGCAGCCCTCATTTGGCAGCTTCCTCGTCACCGTCCTGTCTGCCTCCATGGCAGC AGGCAGTGTCATTCCCAGCACAAACGCCAACCCTGGCTCCTTCGAGATCAAGAGA TCCGCCAACAAAGCCTTCACAGGCCGCAATGGCCCTCTAGCATTAGCCCGTACATA CGCCAAGTACGGTGTTGAAGTCCCCAAAACTCTGGTCGATGCTATTCAACTCGTCA AGTCCATCCAGCTCGCAAAGCGGGACAGCGCCACCGTCACTGCCACGCCGGACCA CGACGACATCGAGTATCTTGTCCCCGTCAAGATCGGAACTCCTCCCCAAACACTTA ACCTGGATTTTGACACGGGCAGCTCCGATCTCTGGGTCTTCTCATCAGATGTCGAC CCGACCTCCTCCCAGGGCCATGACATCTACACCCCGTCCAAGAGCACATCTTCCAA AAAGTTGGAAGGAGCCTCATGGAACATCACATATGGAGACCGCTCATCATCATCC GGCGATGTCTACCACGATATTGTCTCCGTCGGAAACCTGACAGTAAAGTCCCAAG CCGTCGAGTCCGCTCGAAACGTCTCGGCCCAGTTCACCCAGGGCAACAACGACGG CCTCGTCGGCCTGGCGTTTAGCTCCATCAACACAGTCAAGCCCACGCCGCAAAAG ACGTGGTACGACAACATCGTCGGCAGCCTTGACTCTCCCGTCTTTGTTGCTGATCT GCGCCACGACACGCCCGGCAGCTACCACTTCGGCTCCATCCCCTCCGAAGCAAGC AAAGCCTTCTACGCCCCCATCGACAACAGCAAGGGCTTCTGGCAATTCAGCACGA GCAGCAACATTAGCGGCCAGTTCAACGCCGTTGCAGACACTGGCACTACTCTGCT GCTCGCCAGCGACGACCTCGTCAAGGCCTACTACGCAAAGGTCCAGGGCGCCCGT GTGAACGTCTTCCTGGGCGGCTACGTCTTCAACTGCACCACTCAGCTGCCCGACTT TACCTTTACTGTTGGAGAGGGCAACATCACTGTCCCCGGTACCTTGATAAACTATT CCGAGGCTGGCAACGGCCAGTGTTTTGGCGGTATTCAGCCGTCGGGGGGTCTTCCT TTTGCTATCTTTGGTGACATTGCTCTTAAGGCTGCGTATGTTATTTTTGACAGTGGC AACAAGCAGGTTGGCTGGGCGCAGAAGAAATAG
[55] >jgi|Trire2|69555|e_gw1.29.235.1 polynucleotide (SEQ ID NO: 44)
ATGTTCATCGCTGGCGTCGCCCTCTCCGCCCTTCTTTGCGCCGATACCGTCTTGGCT GGTGTTGCCCAGGACCGAGGCCTTGCCGCTCGTCTCGCTCGCCGTGCTGGCCGTCG TTCGGCTCCTTTCAGAAACGACACCAGCCATGCTACTGTTCAATCTAACTGGGGTG GTGCCATACTTGAGGGATCTGGCTTTACCGCAGCTTCGGCTACTGTGAACGTGCCA AGGGGTGGTGGCGGATCCAATGCTGCTGGCTCTGCTTGGGTTGGCATTGATGGCGC CAGCTGCCAGACTGCCATTCTCCAGACGGGATTTGACTGGTACGGTGATGGCACCT ACGATGCCTGGTATGAGTGGTACCCTGAGTTTGCTGCCGACTTCTCGGGCATTGAT ATCCGCCAAGGCGACCAAATTGCCATGTCTGTTGTTGCCACCTCTTTAACCGGCGG CTCTGCTACGCTCGAGAACCTTTCCACTGGCCAGAAGGTTACTCAGAACTTCAACC GTGTCACTGCTGGGTCTCTCTGCGAAACAAGCGCCGAGTTCATTATCGAGGACTTT GAGGAGTGCAACTCGAACGGCAGCAACTGCCAGCCTGTGCCATTTGCTTCTTTCAG CCCCGCCATTACTTTCAGCTCTGCCACAGCCACTCGAAGCGGCCGGAGCGTTTCTC TGAGCGGCGCTGAAATCACCGAGGTTATTGTCAACAACCAGGACCTTACCAGATG CTCCGTCTCCGGTAGCAGCACCTTGACCTGCTCTTACGTTTAG
[56] >jgi|Trire2|21668|estExt_fgeneshl_pm.C_30227 polynucleotide (SEQ ID NO: 45) TGATGACCATCACGGCCCAGAAACTCACCCCAGAGGTTCTCCTCGCGGCGCCGCG GCGCTCCCCGGGGGTGCCCAATGCCACGGGCGAGCTGGTTCTGTATACGGTCTCG ACATACTCCTTCGACTCCCACTCCAAGACGGCGCAGATCCGCGTCCTCAACCTCAA AGAAGGCACGTCGCACCTCGTCTCCGAAGACTCCGCCGCCAGCGAACCGATCTGG ATTGCCGAGCAGGAGATTGCCTATGTCAAGAGCCTCGATCACGGTGCTTCGGCCCT GGTGGCGCAGCACGTCTTCAACCCGAATGAGTCAAACACGATCCAGCGCTTCGGC GGCAGCATTAACAGCCTCAAGGCCAAGCCGCTGTCCACAGACAAGGTGGCTTTCT GCTGTGCTGCCCTGACGACCCCCGATGGCCGCATGTACAGCCCGGCTGCGGAGCC AAAGTCTTACACGTCGGCCAAGATCTACACATCCCTCTTTGTGCGGCACTGGGACT CCTGGAACACCGAGAACAAGAACTCGCTCTGGTACGGCCAGCTTAACAAGGTCGA CGGCAAGTGGACGCTTGGAAACTCGGAGCTCACCAACCTCCTGGCCGGCACGCGG CTCCATTCTCCCGTGCCCCCCTTTGGAGGTACCGGTGACTTTGACATATCCACGAC CGGCATTGTGTTTGTCGCAAAGGATCCGGATCTGAACCATGCGAGAACAACCAAG ACGGACCTGTACTTTGTTCCACTAAATTCATACTTGGACCAGCCAACATTTCCCCA GATTGTCAAGACGGGAGCGCTGCGAGGCTATTCGCTCTCTCCCGTGTTTTCGAATG ATGGGAAACAGGTAGCTTTCCTGCGCATGAAGTCCCAGCAGTACGAGGCCGACAA AACGAGGCTGTTGCTAATTCCGGACGTTACCGACTTGAGCAACGTTCAGGAGTTTT ATGCCACGGAAGATGGCAAGGGTGGTTGGGACTACAAGCCCGATTGGCTCATCTG GAGCCATGACGATAAGGAACTGTACGTTGCGGCCGAGAAGCACGCTCGAGTCGTC CTGTGGAAGTTGCCGTCATCGCCATTGGAGGCCAAGTCTCTGCCCACTCCTATTCA CGAGGATGGATCTGTGGCCGAAGCTAGAGTCCTCGGGAAAGGCTCGTCTCTTTTG ATCACGACGAGATCTCGCGTCGAAAGCTCCAACTACTCGATATTAGACCCCGCCTC CAAATCGACCACCATCATCTCATCCAGTTCAAGGCAAGGTAAGACCTTTGCTCTAA GCAAGTCGCAGTGCCAGGAGATTTGGTTCAATGGGTCCAAGGGCTACCCCATCCA CGCCCTAGTGACCCTGCCCTCAACATTCGACTCATCCAAGAAGTATCCCTTGGCCT TTTTCATTCATGGCGGGCCCCAGGGAGCATGGGGAGACGATTGGAGCACTCGATG GAACCCGGCCGTCTTTGCTGAGCAAGGCTACGTTGTGGTGAGCCCCAATCCAACC GGGAGCACAGGATATGGCCAAGACCATACCGATGCCATCCAGAACAACTGGGGA GGCGACCCCTATATCGACCTGGTCAAGTGCTTTGAGTTTCTGGAGGAGGAAGTGA ACTACATTGATACAACCAGAGCGGTGGCTCTCGGGGCGTCCTATGGCGGCTACAT GATTAACTGGATCCAAGGCCATGACCTTGGGCGGAAATTCAAGGCATTGGTGTGC CACGACGGCGTCTTTTCGACCCTGAACCAGTGGTCCACAGAGGAGCTCTTCTTCCC GGAGCACGACTTTGGCGGCGCCCTCTGGGAGAATAGAGAAGGTTACGAGAAGTGG GATCCCGCAAAGCACGTTGGAAACTGGGCTACGCCGCAACTGGTCATTCACAATG 
AGCTTGATTACCGCCTGCCCATCTCCGAGGGTCTGGCCATGTTCAACGTCCTACAG GCTCGTGGCGTGCCGAGCAAGTTTGTCATGTTCCCGGACGAGCATCACTGGGTGCT CAAGCCTGAGAACTCTCTCGTCTGGCACAGGGAAGTGTTGAATTGGATCAACAAG TACAGTGGAATATCTGAAAAAAACTAG
[57] >jgi|Trire2|58698|e_gw1.5.263.1 polynucleotide (SEQ ID NO: 46)
ATGGCCTGGTTGAAAAAGCTGGCCCTCGTCTTGTTGGCAATCGTGCCATATGCCAC
GGCTTCTCCGGCCCTCAGCCCCAGGTCCCGTGAGATTCTGAGCCTCGAAGATCTCG AATCCGAGGACAAGTACGTCATCGGCCTGAAGCAGGGCCTGTCGCCCACGGACTT GAAGAAGCATCTACTGCGCGTCTCCGCCGTCCAATATCGCAACAAGAACAGCACG TTCGAAGGCGGCACTGGCGTCAAGAGGACGTATGCCATCGGCGACTACCGGGCAT ACACTGCCGTCCTCGATCGAGACACGGTTCGAGAGATTTGGAACGATACTCTTGA GAAGCCGCCCTGGGGCCTTGCAACGCTCTCGAACAAGAAGCCACACGGATTCCTG TACCGCTACGACAAGAGCGCGGGCGAGGGAACCTTTGCCTACGTGCTGGACACGG GCATCAACTCCAAGCACGTGGACTTTGAGGGCCGCGCCTACATGGGGTTCAGTCC GCCCAAGACGGAGCCGACGGACATTAACGGACACGGGACTCATGTTGCGGGCATC ATCGGCGGCAAGACCTTTGGCGTGGCCAAGAAGACGCAGCTCATCGGCGTCAAGG TGTTTCTGGATGATGAGGCGACGACGTCGACGCTGATGGAGGGGCTCGAGTGGGC AGTCAATGACATTACGACAAAGGGCCGCCAGGGCCGTTCTGTCATTAACATGTCTC TTGGAGGACCCTATTCGCAAGCGTTGAACGACGCCATCGACCACATCGCCGACAT GGGGATCCTCCCGGTTGCTGCTGCGGGAAACAAGGGCATTCCGGCCACGTTCATCT CGCCCGCTTCCGCCGACAAGGCGATGACGGTGGGCGCCATCAACTCGGACTGGCA AGAGACCAACTTTTCCAACTTTGGCCCCCAGGTCAACATTTTGGCGCCCGGAGAAG ACGTCTTGTCGGCGTATGTGAGCACAAACACGGCGACGCGCGTGCTATCGGGCAC TTCGATGGCGGCGCCTCACGTTGCCGGACTGGCGCTGTACCTGATGGCGTTGGAGG AGTTTGACTCGACGCAGAAGCTGACGGACCGCATATTGCAGCTGGGGATGAAAAA CAAGGTGGTCAACCTGATGACGGACTCGCCCAACCTGATTATTCACAACAATGTC AAATGA
[58] >jgi|Trire2|122703|estExt_fgenesh5_pg.C_140132 polynucleotide (SEQ ID NO: 47) ATGGCGTCCCGCAGATTAGCTCTGAACCTGTCGCAAGGCCTGCGAGCCCGTTCCGG CCTGTCAGGCTTGCGCCGGGGCTTTGCTACGCCCTCCACCGTCGGCAAGACGCAGA CGACGACGCTCAAGAACGGCCTGACTGTCGCTACTGAATACTCACCATGGGCTCA GACCTCGACCGTCGGCATGTGGATCGACGCCGGCTCCCGTGCTGAGACCAACGAG ACCAACGGCACTGCCCACTTCCTCGAGCACCTGGCCTTCAAGGGCACCGCAAAGC GATCACAGCACCAATTGGAACTCGAGATTGAGAACATGGGTGGTCACCTCAACGC CTACACCTCGCGCGAGAACACCGTCTACTTTGCCAAGGCCTTCAACTCCGATATTC CCCAGACCGTCGACATCCTGGCCGATATTCTGCAGAACTCCAAGCTCGAGCAGTCC GCCATCGAGCGCGAGCGCGACGTCATTCTCCGAGAGTCCGAGGAGGTCGAGAAGC AGGTTGAGGAGGTTGTCTTCGACCACCTGCACGCCACTGCCTTCCAGCACCAGCCC CTTGGCCGCACCATCCTCGGCCCCCGCCAGAACATCCGTGACATTACCCGCACCGA GCTCGTCAACTACATCAAGAACAACTACACCGCCGACCGCATGGTTCTCGCCGCTG CCGGTGGCGTTCCCCACGAGCAGCTGGTCGAGCTCGCCGAGAAGCACTTCTCCGG 
CCTTGCCAGCCACGGACCCGAGACCGAGGCCTATGTCCTGTCTAAGCAGAAGGCC GACTTCATTGGCTCTGACGTGCGTGTCCGCGACGACACCATGCCCACTGCCAACGT CGCCATTGCTGTTGAGGGTGTCAGCTGGAACTCTGATGACTACTACACTGCTCTCG TCGCTCAGGCCATTGTTGGCAACTACGACAAGGCCATGGGCAACGCCCCTCACCA GGGCGGCAAGCTCAGCGGCTACGTTCACAAGCACGACCTGGCCAACAGCTTCATG AGCTTCTCCACCAGCTACAGCGACACTGGTCTCTGGGGCATCTACCTCGTCACCGA TAACGCCACCCGCCTCGACGACCTCGTCCACTTCGCCATCCGTGAGTGGATGCGTC TCTGCTACAACGTCAGCGAAGCCGAGGTTGAGCGCGCCAAGGCCCAGCTGAAGGC CTCCATCCTGCTGTCCCTGGACGGCACCACCGCCGTTGCCGAGGACATTGGCCGCC AGCTCATCACCACTGGCCGCCGCGCCAGCCCCGGCGAGATTGAGCGCAAGATCGA CGCCATCACCGACAAGGATGTCACCGACTTCGCCAACCGATACCTCTGGGACAAG GACATTGCCATCAGCGCTGTTGGAAAGATTGAGGCTCTGTTTGACTACCAGCGACT GCGGAACACCATGAAGCCCAAGTTTTAAGGGTGGGAGAGCGGATGCGAAGCGAA TGGAGCAGGGGCGGACTTGTCTGAA
[59] >jgi|Trire2|60581|e_gw1.7.573.1 polynucleotide (SEQ ID NO: 48)
ATGAGCCGTCGCCCCGTCTACTTCAATCCCTTGGCCGAATCATGGACGGCGCCCTC ACCGGATGACCCTCAGTTGGCATACCGCTTCCATTCACAATTACCGGCTTACTCTC CCACTCAATTGATCCCTCTCACGGACCTAGCCAAGGAGCTCGGCGTGCAATCCATC CACCTCAAGGACGAGACAAGTCGACTCGGACTCCCCTCCTTCAAGATCCTCGGCG CCTCATGGGGTACATTTCGGGCCATTGCCCAGCGACTTGATCTTCCTATAGACTCT TCCTTGGGCTCCGTCGAGCAGAAGTTGGCATCTTCAAATATCACACTCTACGCGGC CACTGACGGCAATCATGGCCGGGCCGTTGCCCGCATGGCCTCCATCCTGGGTGTCC CAGCCCAGATCCACGTCCCTACGACCATGCACCAAAGCACCATTGATCTGATAAA GTCAGAGGGTGCGCGCGTCGTCATCTCGGACGGCTTCTACGACGATGCTGTTGTTG ACGCTCGAGTGGCAGCTGCTAAAGACGATACCGCCCTTGTTATCCAGGACTTTGCT TCGGGCGACTACGTTCAGATTCCGCAGTGGATCGTCGATGGCTACCTCACCATGAT GCTGGAAATCGACGGCCAGCTTGGCTGTACTACCCCTGACCTCGTCGTTGTGCCGG TGGGAGTCGGGTCCTTTGCGCAAGCCGTCGTAACGCACTTCAAGAAGCCCGGAAA ACAAACAAAAGTCCTGACCGTCGAACCTGATACATCCGCCAGTCTCTGGAAAAGC CTCAGAAGCGGCGAGTCCTCCTCCACGTCGGAAAAGTCCCCCAGCATAATGGCAG GGCTGGATTGCGGGACGCCTTCATCGATATCTTGGGCTGTTCTCCGACACGGAGTC GATGCCAGCCTCACAATATCTGACTATGAGGCGCACAAAGCTTGCGAATACCTAA AGTCTCAAGGTGTGTCTGCAGGGCCCTGTGGAGCCGCACCTATCGCTGCGCTCAGG AGATTAGAGCGAGCGGATAGAGAAAGGCTGGGTCTCACCAAGAACTCCGTCGTAG TAATCTTTTGCACAGAGGGTGCACGAGACTACGACGTTCCTCACAGCGTGGCGAG CGACGATCCCGTCGAAATAACGCAGACGTTGGTGAAGATCAACTCGGCCAATCCG TTCCTTGGATCGGTTCCCGGTCCTGGAGAAACTGCTATTGCTCGCTATATCACTGCT TGGATGGAGCACCGTGATATTGAGTCTCATTGGATTGAGCCAACGTCCGGACGAC CTTCTGTGGTAGGCATCGTCCGCGGACTCGGCGGTGGCAAGAGCCTCATGCTCAAC GGGCACATTGACACCGTGACCCTCATGGGATACGAAGGCAATCCCCTCAGCGGTG ACATCCAAGACGGCAAGCTCTATGGCCGTGGAGCCGCTGACATGAAAGGCGGTGT AGCAGCTGCCATGGCTGCCCTAGCCAACGCGAAAAAGCACAGCCTCAGGGGAGAT GTCATCTTCACAGGCGTGGCAGACGAGGAGTTTGAGAGCATCGGCACGCAGCAGG TACTGGAGGCAGGCTGGACAGCTGACGGCGCAATCGTCAGTGAGCCGACCAACAT GGAGATTCTATACGCTCACAAGGGCTTTGTATGGTTTGATGTTGACATTCAGGGCC TTGCAGCGCACGGCTCGAGATACGACCTCGGTATTGATGCCATTAGCAAGGCAGG CTATTTCCTCGTTGAGTTGGATAAACATGCAAGCCATTTGACAGCGCAAAGCGGCG ATGCTGTCCTTGGCCCAGGTAGCATTCACGCCTCCCTCATCAAGGGCGGCGAGGA AGTCTCATCATACCCTTCTCGTGTTCAGATCCAGCTCGAGCGGCGAACCGTGAACG GCGAGACACCGGAAACTGTACGGAAAGAGCTCGAAGAGATACTCGACGGCCTGA 
CGAAAACAGTCCCCAACTTCACGTACAGTCTTCGCACCACTTTTCACCGGTCTCCC TTCAAGGCCGACCTTTCCCACCCTTTTGCGAAACTCGTCCACAAGCACGTGGGAAA CACTCTTGGCAGGGAGCCTGCTGTCTTGGGCGCCCCCTACTGGACTGACTGCGCAC TACTAGATGGTGCTGGGATTCCGGCATTGCTCTGGGGGCCCCAGGGTGAAGGCTT ACATGGAAAGTTGGAATACGCAGATGTTGAGTCGATAAAGCAGGTCGCAGAGGCC TTGACTGCAATCGCTGTGGAGTTTTGTAGCTGA
[60] >jgi|Trire2|74129|estExt_GeneWisePlus.C_11150 polynucleotide (SEQ ID NO: 49) TGTCAGACTCGGTTGCGTCGTAACAGAAACACTATCAATCATCGCGACAACCAAC AGACTTGAGAATTTGCCATTCACCACCACCTACCGTCTACAGCACCCCCCGTCAAA TCTTGCACCCTTTCAAATAAGAAGCTCGAGATTCACAGCCCACCATGCCTCTCGTC GTGCCAGGCATCATGACCAACTCGGACGACAAGACGCAGGAATGGGCCAACAAG CTCGTCGGCAAGACGTACTCTGAAAACGAGTCCAACGAGACGCGGGATCTGCCAG AGGTGCACCGCATCATCAAGCCCGGCTCCATTGTGACAAAGGACTTTAGGCCGGA GAGGTTGAATATTCACTTGAATGAGGATGGGACCGTGTCGCATGTGCGGCATGGC TGA
[61] >jgi|Trire2|111915|fgenesh5_pg.C_scaffold_30000054 polynucleotide (SEQ ID NO: 50)
ATGCTCTTCCAAACCATGCTCCTCGCCCTCATCACCTCCCTGGCCCTCGCCCAATCC GAAGTCGGCCGTCCCTGCGGCTTCAAGATGGCCCCCTGCCCCTTCGACATGAAGTG CGTCCCCGACAACGCCTACTGCCCTCACCCCAGCCGCTGCCCGGGCCACTGCGAGT TCAAGAACAAGTACGACCAGTGCGGCGGCTTCACGCCCCGGCCGCACGTTTGCCG CCGGGGGTCCCGGTGCCAGGACGATCCCCGGCTGCCGCCCAACTGCGGCATGGCG TGCGATGCCCCGGGGATCTGCATTCCCGAGAATGCTCCCTTCTGTGGAGGGTTCAT GGGGTTGGCTTGTCCGAAGGGATTGTACTGTTATGATGCGTTGGATGACTGTGATC CGAACAATGGAGGTGCAGACTGTGGAGGAATCTGTCTGTAG
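Each sequence above is introduced by a FASTA-style header that combines a JGI database tag, a numeric identifier, a gene model name, and the SEQ ID NO used in this specification. As a minimal sketch, assuming the header layout shown above, the following Python snippet splits such a header into its fields. The field names (`database`, `jgi_id`, `model_name`, `seq_id_no`) are illustrative labels, not terms defined by this specification, and the pattern tolerates stray spaces around the `|` separators.

```python
import re
from typing import NamedTuple


class JgiHeader(NamedTuple):
    database: str    # e.g. "Trire2" (hypothetical field name)
    jgi_id: int      # numeric identifier between the second and third "|"
    model_name: str  # gene model name, e.g. "fgenesh5_pg.C_scaffold_30000054"
    seq_id_no: int   # SEQ ID NO assigned in the specification


# Whitespace around "|" is tolerated because text extraction can introduce it.
_HEADER = re.compile(
    r">jgi\s*\|\s*(?P<db>[^|\s]+)\s*\|\s*(?P<id>\d+)\s*\|\s*(?P<model>\S+)"
    r"\s+polynucleotide\s+\(SEQ ID NO:\s*(?P<seq>\d+)\)"
)


def parse_header(line: str) -> JgiHeader:
    """Parse one header line; raise ValueError if it does not match."""
    m = _HEADER.search(line)
    if m is None:
        raise ValueError(f"unrecognized header: {line!r}")
    return JgiHeader(m["db"], int(m["id"]), m["model"], int(m["seq"]))


header = parse_header(
    ">jgi|Trire2|111915|fgenesh5_pg.C_scaffold_30000054 "
    "polynucleotide (SEQ ID NO: 50)"
)
print(header)
```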
[62] In some embodiments, the recombinant filamentous fungal cells include a mutation in a gene that has at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or even at least 99% nucleic acid sequence identity to the polynucleotide of SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49 and/or SEQ ID NO: 50.
[63] In some embodiments, the recombinant filamentous fungal cells include a mutation in a gene that encodes a polypeptide having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or even at least 99% amino acid sequence identity to the polypeptide of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24 and/or SEQ ID NO: 25.
[64] All patents and publications, including all sequences disclosed within such patents and publications, referred to herein are expressly incorporated by reference. Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs (see, e.g., Singleton et al., DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY, 2D ED., © 1994, John Wiley and Sons, New York; and Hale and Margham, THE HARPER COLLINS DICTIONARY OF BIOLOGY, © 1991, Harper Perennial, NY, both of which provide one of skill with a general dictionary of many of the terms used herein). Any methods and materials similar or equivalent to the various embodiments described herein can be used in the practice or testing of the present invention.
[65] It is intended that every maximum (or minimum) numerical limitation disclosed in this specification includes every lower (or higher) numerical limitation, as if such lower (or higher) numerical limitations were expressly written herein. Moreover, every numerical range disclosed in this specification is intended to include every narrower numerical range that falls within such broader numerical range, as if such narrower numerical ranges were all expressly written herein.
[66] As used herein, the singular forms "a", "an" and "the" include the plural reference unless the context clearly dictates otherwise. Thus, for example, reference to a "host cell" includes a plurality of such host cells.
[67] Unless otherwise indicated, nucleic acids are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxyl orientation, respectively. The headings provided herein are not limitations of the various aspects or embodiments of the invention that can be had by reference to the specification as a whole. Accordingly, the terms defined immediately below are more fully defined by reference to the Specification as a whole.
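Because both strands are read 5' to 3', the antisense strand of a listed sequence is obtained by complementing each base and reversing the order. The following Python sketch illustrates this convention. The function name `reverse_complement` is an illustrative choice, and the input is the first twelve nucleotides of SEQ ID NO: 50 as listed above.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


sense = "ATGCTCTTCCAA"  # first 12 nt of SEQ ID NO: 50, written 5' to 3'
antisense = reverse_complement(sense)
print(antisense)
```

Applying the function twice returns the original sequence, which is a quick sanity check on any strand conversion.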
[68] As used herein, the term "impaired" or "impairment" refers to any method that decreases, but does not abolish, the functional expression of one or more genes or the functional activity of the resulting gene product (i.e., protein), fragments or homologues thereof, wherein the gene or gene product exerts its known function to a lesser extent than in the corresponding parent strain. It is intended to encompass any means of gene impairment, including partial deletions, disruptions of the protein-coding sequence, non-coding sequences, or both, insertions, additions, mutations, gene silencing (e.g., RNAi or antisense) and the like.
[69] As used herein, "deletion" of a gene refers to deletion of the entire coding sequence, deletion of part of the coding sequence, or deletion of the coding sequence including flanking regions.
[70] As used herein "disruption" refers to a change in a nucleotide or amino acid sequence by the insertion of one or more nucleotides or amino acid residues, respectively, as compared to the parent or naturally occurring sequence. Accordingly, a "disruption sequence" or "disruption mutant" as used herein refers to a nucleic acid or amino acid sequence, typically a coding region sequence, that comprises an insertion of nucleotides or amino acids.
[71] As used herein, "insertion" or "addition" in the context of a sequence refers to a change in a nucleic acid or amino acid sequence in which one or more nucleotides or amino acid residues have been added as compared to the endogenous chromosomal sequence or protein product.
[72] As used herein, "non-revertable" refers to a strain which will naturally revert back to its corresponding parent strain with a frequency of less than 10⁻⁷.
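The reversion threshold of less than one in ten million can be illustrated with a short calculation. The Python sketch below assumes that reversion frequency is estimated as the number of revertant colonies divided by the number of cells screened. The specification does not state how the frequency is measured, so the function names and the example counts are hypothetical.

```python
def reversion_frequency(revertants: int, cells_screened: int) -> float:
    """Estimate reversion frequency as revertants per cell screened."""
    if cells_screened <= 0:
        raise ValueError("cells_screened must be positive")
    if revertants < 0:
        raise ValueError("revertants cannot be negative")
    return revertants / cells_screened


def is_non_revertable(revertants: int, cells_screened: int,
                      threshold: float = 1e-7) -> bool:
    """True when the estimated frequency is strictly below the threshold."""
    return reversion_frequency(revertants, cells_screened) < threshold


# Hypothetical counts: 3 revertants among 10^8 cells gives 3e-8.
print(is_non_revertable(3, 10**8))
# Hypothetical counts: 20 revertants among 10^8 cells gives 2e-7.
print(is_non_revertable(20, 10**8))
```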
[73] As used herein, the term "corresponding parent strain" refers to the host strain from which a mutant is derived (e.g., the originating and/or wild-type strain).
[74] As used herein, "strain viability" refers to reproductive viability. In some embodiments, the impairment of a gene does not deleteriously affect division and survival of the mutant under laboratory conditions.
[75] As used herein "coding region" refers to the region of a gene that encodes the amino acid sequence of a protein.
[76] As used herein "amino acid" refers to peptide or protein sequences or portions thereof. The terms "protein," "peptide," and "polypeptide" are used interchangeably.
[77] As used herein, the term "heterologous protein" or "exogenous protein" refers to a protein or polypeptide that does not naturally occur in the host cell, and includes genetically engineered versions of naturally occurring endogenous proteins.
[78] As used herein, "endogenous protein" or "native protein" refers to a protein or polypeptide naturally occurring in a cell.
[79] As used herein, "host," "host cell," or "host strain" refer to a cell that can express a DNA sequence introduced into the cell. In some embodiments of the present invention, the host cells are Aspergillus sp.
[80] As used herein, "filamentous fungal cell" refers to a cell of any of the species of microscopic fungi that grow as multicellular filamentous strands including but not limited to: Aspergillus sp., Rhizopus sp., Trichoderma sp., and Mucor sp.
[81] As used herein, "Aspergillus" or "Aspergillus sp." includes all species within the genus "Aspergillus" as known to those of skill in the art, including but not limited to A. oryzae, A. niger, A. awamori, A. nidulans, A. sojae, A. japonicus, A. kawachi and A. aculeatus.
[82] As used herein, "nucleic acid" refers to a nucleotide or polynucleotide sequence, and fragments or portions thereof, as well as to DNA, cDNA, and RNA of genomic or synthetic origin which may be double-stranded or single-stranded, whether representing the sense or antisense strand. It will be understood that as a result of the degeneracy of the genetic code, a multitude of nucleotide sequences may encode a given protein.
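The degeneracy of the genetic code can be made concrete by counting how many distinct nucleotide sequences encode a given peptide. The Python sketch below assumes the standard genetic code, written in the conventional compact TCAG ordering. The function name `count_encodings` is illustrative, and the example peptide MLFQ corresponds to the first four codons of SEQ ID NO: 50 (ATG CTC TTC CAA).

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code: codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of synonymous codons for each amino acid ("*" marks stop codons).
CODONS_PER_AA = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Count the coding sequences that translate to the given peptide."""
    peptide = peptide.upper()
    unknown = set(peptide) - set(CODONS_PER_AA)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    return prod(CODONS_PER_AA[aa] for aa in peptide)


# M has 1 codon, L has 6, F has 2, Q has 2, so MLFQ has 1 * 6 * 2 * 2 encodings.
print(count_encodings("MLFQ"))
```

The count grows multiplicatively with peptide length, which is why a full-length protein has an enormous number of possible coding sequences.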
[83] As used herein, the term "gene" means a segment of DNA involved in producing a polypeptide and can include regions preceding and following the coding regions (e.g., promoter, terminator, 5' untranslated (5' UTR) or leader sequences and 3' untranslated (3' UTR) or trailer sequences), as well as intervening sequences (introns) between individual coding segments (exons).
[84] As used herein, "homologous gene," "gene homolog," or "homolog" refers to a gene which has a homologous sequence and results in a protein having an identical or similar function. The term encompasses genes that are separated by speciation (i.e., the development of new species) (e.g., orthologous genes), as well as genes that have been separated by genetic duplication (e.g., paralogous genes).
[85] As used herein, "homologous sequences" refers to a nucleic acid or polypeptide sequence having at least about 99%, at least about 98%, at least about 97%, at least about 96%, at least about 95%, at least about 94%, at least about 93%, at least about 92%, at least about 91%, at least about 90%, at least about 88%, at least about 85%, at least about 80%, at least about 75% or at least about 70% sequence identity to a subject nucleotide or amino acid sequence when optimally aligned for comparison. In some embodiments, homologous sequences have between about 80% and 100% sequence identity, in some embodiments between about 90% and 100% sequence identity, and in some embodiments, between about 95% and 100% sequence identity.
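The tiered "at least about" thresholds above lend themselves to a simple lookup. As an illustrative sketch, the following Python function reports the highest listed threshold that a measured percent identity satisfies. The function name is hypothetical, and the word "about" in the definition is treated here as an exact cutoff, which is a simplifying assumption.

```python
from typing import Optional

# Identity thresholds recited in the definition of "homologous sequences".
THRESHOLDS = (99, 98, 97, 96, 95, 94, 93, 92, 91, 90, 88, 85, 80, 75, 70)


def highest_threshold_met(percent_identity: float) -> Optional[int]:
    """Return the largest listed threshold met, or None if below 70%."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    for threshold in THRESHOLDS:  # listed in descending order
        if percent_identity >= threshold:
            return threshold
    return None


for value in (96.4, 89.0, 65.0):
    print(value, highest_threshold_met(value))
```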
[86] Sequence homology can be determined using standard techniques known in the art (see e.g., Smith and Waterman, Adv. Appl. Math., 1981. 2:482; Needleman and Wunsch, J. Mol. Biol., 1970. 48:443; Pearson and Lipman, Proc. Natl. Acad. Sci. USA 1988. 85:2444; programs such as GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package (Genetics Computer Group, Madison, WI); and Devereux et al., Nucl. Acid Res., 1984. 12:387-395).
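To illustrate the kind of global alignment introduced by Needleman and Wunsch, the following Python sketch fills a dynamic-programming score matrix and traces back one optimal alignment. The scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for readability. They are not parameters recited in this specification, and the cited software packages use more elaborate scoring.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Globally align a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):          # leading gaps in b
        score[i][0] = i * gap
    for j in range(1, m + 1):          # leading gaps in a
        score[0][j] = j * gap

    def sub(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner, preferring diagonal moves.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


total, row_a, row_b = needleman_wunsch("ACGT", "AGT")
print(total)
print(row_a)
print(row_b)
```

The two returned rows have equal length and can be passed directly to a percent-identity calculation.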
[87] Useful algorithms for determining sequence homology include PILEUP and BLAST (Altschul et al., J. Mol. Biol., 1990. 215:403-410; and Karlin et al., Proc. Natl. Acad. Sci. USA 1993. 90:5873-5877). PILEUP uses a simplification of the progressive alignment method of Feng and Doolittle (Feng and Doolittle, J. Mol. Evol., 1987. 35:351-360). The method is similar to that described by Higgins and Sharp (Higgins and Sharp, CABIOS 1989. 5:151-153). Useful PILEUP parameters include a default gap weight of 3.00, a default gap length weight of 0.10, and weighted end gaps.
[88] A particularly useful BLAST program is the WU-BLAST-2 program (See, Altschul et al., Meth. Enzymol., 1996. 266:460-480). WU-BLAST-2 uses several search parameters, most of which are set to the default values. The adjustable parameters are set with the following values: overlap span = 1, overlap fraction = 0.125, word threshold (T) = 11. The HSP S and HSP S2 parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched. However, the values may be adjusted to increase sensitivity. A % amino acid sequence identity value is determined by the number of matching identical residues divided by the total number of residues of the "longer" sequence in the aligned region. The "longer" sequence is the one having the most actual residues in the aligned region (gaps introduced by WU-BLAST-2 to maximize the alignment score are ignored).
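By way of illustration only, the % sequence identity calculation described above may be sketched as follows. The sketch assumes that the two sequences have already been aligned by some external program and are supplied as equal-length strings in which a gap is written as "-"; the function name percent_identity and the gap symbol are hypothetical choices made for this example and are not part of WU-BLAST-2.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an aligned region.

    Matching identical residues are divided by the number of actual
    residues in the "longer" sequence, i.e. the one with the most
    non-gap positions in the aligned region. Gap characters are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Count columns where both sequences carry the same residue (not a gap).
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )

    # Count actual residues (non-gap positions) in each aligned sequence.
    residues_a = sum(1 for a in aligned_a if a != gap)
    residues_b = sum(1 for b in aligned_b if b != gap)
    longer = max(residues_a, residues_b)
    if longer == 0:
        raise ValueError("aligned region contains no residues")

    return 100.0 * matches / longer
```

For example, for the aligned pair "MKT-LLV" and "MKTALIV", five columns are identical and the longer sequence has seven residues, so the function returns 5/7 of 100%. The comparison is case-sensitive, so both inputs are assumed to use the same letter case.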
[89] As used herein, the term "vector" refers to any nucleic acid that can be replicated in cells and can carry new genes or DNA segments into cells. Thus, the term refers to a nucleic acid construct designed for transfer between different host cells. An "expression vector" refers to a vector that has the ability to incorporate and express heterologous DNA fragments (i.e., non-native DNA) in a cell. Many prokaryotic and eukaryotic expression vectors are commercially available. Selection of appropriate expression vectors is within the knowledge of those having skill in the art.
[90] As used herein, the term "DNA construct" refers to a nucleic acid molecule generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a target cell (i.e., vectors or vector elements, as described above). For example, a DNA construct can be incorporated into a plasmid, chromosome, mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. In some embodiments, DNA constructs also include a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a target cell. In some embodiments, a DNA construct of the invention comprises a selectable marker.
[91] Also as used herein, the term "DNA construct" (as well as "transforming DNA," and "transforming sequence") refers to DNA that is used to introduce sequences into a host cell or organism (i.e., "transform a host cell"). The DNA construct may be generated in vitro by PCR or any other suitable techniques. In some embodiments, the transforming DNA can include an incoming sequence, and/or can include an incoming sequence flanked by homology boxes. In yet a further embodiment, the transforming DNA comprises other non-homologous sequences, added to the ends (e.g., stuffer sequences or flanks). The ends can be closed such that the transforming DNA forms a closed circle (i.e., a plasmid), such as, for example, insertion into a vector.
[92] As used herein, the term "plasmid" refers to a circular double-stranded (ds) DNA construct used as a cloning vector, and which forms an extrachromosomal self-replicating genetic element in many bacteria and some eukaryotes. In some embodiments, plasmids become incorporated into the genome of the host cell.
[93] As used herein, the terms "isolated" and "purified" are used to refer to a molecule (e.g., a nucleic acid or polypeptide) or other component that is removed from at least one other component with which it is naturally associated.
[94] As used herein, the term "altered expression" is construed to include an increase or decrease in production of a protein of interest by an altered (i.e., engineered) cell strain relative to the normal level of production from the corresponding unaltered parent strain (i.e., when grown under essentially the same conditions).
[95] As used herein, the term "enhanced expression" is construed to include the increased production of a protein of interest by an altered (i.e., engineered) cell strain above the normal level of production from the corresponding unaltered parent strain (i.e., when grown under essentially the same conditions).
[96] As used herein, the term "expression" refers to a process by which a polypeptide is produced. The process includes both transcription and translation of the gene. In some embodiments, the process also includes secretion of the polypeptide.
[97] As used herein in the context of "introducing a nucleic acid sequence into a cell," the term "introducing" (and in past tense, "introduced") refers to any method suitable for transferring the nucleic acid sequence into the cell, including but not limited to transformation, electroporation, nuclear microinjection, transduction, transfection (e.g., lipofection-mediated and DEAE-dextran-mediated transfection), incubation with calcium phosphate DNA precipitate, high velocity bombardment with DNA-coated microprojectiles, Agrobacterium-mediated transformation, and protoplast fusion.
[98] As used herein, the term "stably transformed" refers to a cell that has a non-native (heterologous) polynucleotide sequence integrated into its genome or as an episomal plasmid that is maintained for at least two generations.
[99] As used herein "an incoming sequence" refers to a DNA sequence that is being introduced into a host cell. The incoming sequence can be part of a DNA construct, can encode one or more proteins of interest (e.g., heterologous protein), can be a functional or nonfunctional gene and/or a mutated or modified gene, and/or can be a selectable marker gene(s).
[100] As used herein, "homology box" refers to a nucleic acid sequence which is homologous to the sequence of a gene in the chromosome of a filamentous fungal cell. More specifically, a homology box is an upstream or downstream region having between about 80 and 100% sequence identity, between about 90 and 100% sequence identity, or between about 95 and 100% sequence identity with the immediate flanking coding region of a gene or part of a gene to be impaired according to the invention. These sequences direct where in the chromosome a DNA construct or incoming sequence is integrated and direct what part of the chromosome is replaced by the DNA construct or incoming sequence. While not meant to limit the invention, a homology box may include between about 1 base pair (bp) and 200 kilobases (kb). Typically, a homology box includes between about 1 bp and 10.0 kb; between 1 bp and 5.0 kb; between 1 bp and 2.5 kb; between 1 bp and 1.0 kb; or between 0.25 kb and 2.5 kb. A homology box may also include about 10.0 kb, 5.0 kb, 2.5 kb, 2.0 kb, 1.5 kb, 1.0 kb, 0.5 kb, 0.25 kb or 0.1 kb. In some embodiments, the 5' and 3' ends of a selective marker are flanked by a homology box wherein the homology box comprises nucleic acid sequences immediately flanking the coding region of the gene.
[101] In an alternative embodiment, the transforming DNA sequence comprises homology boxes without the presence of an incoming sequence. In this embodiment, it is desired to delete the endogenous DNA sequence between the two homology boxes. Furthermore, in some embodiments, the transforming sequences are wild-type, while in other embodiments, they are mutant or modified sequences. In addition, in some embodiments, the transforming sequences are homologous, while in other embodiments, they are heterologous.
[102] As used herein, the term "target sequence" refers to a DNA sequence in the host cell that encodes the sequence where it is desired for the incoming sequence to be inserted into the host cell genome. In some embodiments, the target sequence encodes a functional wild-type gene or operon, while in other embodiments the target sequence encodes a functional mutant gene or operon, or a non-functional gene or operon.
[103] As used herein, a "flanking sequence" refers to any sequence that is either upstream or downstream of the sequence being discussed (e.g., for genes A-B-C, gene B is flanked by the A and C gene sequences). In some embodiments, the incoming sequence is flanked by a homology box on each side. In another embodiment, the incoming sequence and the homology boxes comprise a unit that is flanked by stuffer sequence on each side. In some embodiments, a flanking sequence is present on only a single side (either 3' or 5'), and in other embodiments, it is on each side of the sequence being flanked. The sequence of each homology box is homologous to a sequence in the Aspergillus chromosome. These sequences direct where in the Aspergillus chromosome the new construct gets integrated and what part of the Aspergillus chromosome will be replaced by the incoming sequence. In some embodiments these sequences direct where in the Aspergillus chromosome the new construct gets integrated without any part of the chromosome being replaced by the incoming sequence. In some embodiments, the 5' and 3' ends of a selective marker are flanked by a polynucleotide sequence comprising a section of the desired chromosomal segment.
[104] As used herein, the term "chromosomally integrated" refers to a sequence, typically a mutant gene (e.g., disrupted form of a native gene), that has become incorporated into the chromosomal DNA of a host cell. Typically, chromosomal integration occurs via the process of "homologous recombination," wherein the homologous regions of the introduced (transforming) DNA align with homologous regions of the host chromosome. Subsequently, the sequence between the homologous regions is replaced by the incoming sequence in a double crossover. Thus, "chromosomally integrated" is used interchangeably herein with "homologously recombined" or "homologously integrated."
[105] As used herein, the terms "selectable marker" and "selective marker" refer to a nucleic acid capable of expression in a host cell, which allows for ease of selection of those hosts containing the marker. Thus, the term "selectable marker" refers to genes that provide an indication that a host cell has taken up (e.g., has been successfully transformed with) an incoming nucleic acid of interest or some other reaction has occurred. Typically, selectable markers are genes that confer antimicrobial resistance or a metabolic advantage on the host cell to allow cells containing the exogenous DNA to be distinguished from cells that have not received any exogenous sequence during the transformation. Selective markers useful with the present invention include, but are not limited to, antimicrobial resistance markers (e.g., ampR; phleoR; specR; kanR; eryR; tetR; cmpR; hygroR and neoR; see e.g., Guerot-Fleury, Gene, 1995. 167:335-337; Palmeros et al., Gene 2000. 247:255-264; and Trieu-Cuot et al., Gene, 1983. 23:331-341), auxotrophic markers, such as trpC, pyrG and amdS, and detection markers, such as β-galactosidase.
[106] As used herein, the term "promoter" refers to a nucleic acid sequence that functions to direct transcription of a downstream gene. In some embodiments, the promoter is appropriate to the host cell in which a desired gene is being expressed. The promoter, together with other transcriptional and translational regulatory nucleic acid sequences (also termed "control sequences") is necessary to express a given gene. In general, the transcriptional and translational regulatory sequences include, but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop sequences, translational start and stop sequences, and enhancer or activator sequences.
[107] A nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence. For example, DNA encoding a secretory leader (i.e., a signal peptide) is operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation. Generally, "operably linked" means that the DNA sequences being linked are contiguous, and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers do not have to be contiguous. Linking is accomplished by ligation at convenient restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in accordance with conventional practice.
[108] As used herein, the term "hybridization" refers to the process by which a strand of nucleic acid joins with a complementary strand through base pairing, as known in the art.
[109] A nucleic acid sequence is considered to be "selectively hybridizable" to a reference nucleic acid sequence if the two sequences specifically hybridize to one another under moderate to high stringency hybridization and wash conditions. Hybridization conditions are based on the melting temperature (Tm) of the nucleic acid binding complex or probe. For example, "maximum stringency" typically occurs at about Tm-5°C (5°C below the Tm of the probe); "high stringency" at about 5-10°C below the Tm; "intermediate stringency" at about 10-20°C below the Tm of the probe; and "low stringency" at about 20-25°C below the Tm. Functionally, maximum stringency conditions may be used to identify sequences having strict identity or near-strict identity with the hybridization probe, while an intermediate or low stringency hybridization can be used to identify or detect polynucleotide sequence homologs.
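By way of illustration only, the relationship between hybridization temperature and the stringency categories recited above may be sketched as follows. The function name classify_stringency is hypothetical. Because the recited ranges are approximate ("about") and share their end points, the treatment of boundary values (for example, assigning exactly 5°C below the Tm to "maximum" and treating 0-5°C below the Tm as "maximum") is an assumption made for this example.

```python
def classify_stringency(tm_c: float, hyb_temp_c: float) -> str:
    """Map a hybridization temperature to a stringency category.

    tm_c       -- melting temperature (Tm) of the probe, in degrees C
    hyb_temp_c -- hybridization (or wash) temperature, in degrees C
    """
    delta = tm_c - hyb_temp_c  # degrees C below the Tm
    if delta < 0:
        raise ValueError("hybridization temperature is above the Tm")
    if delta <= 5:    # about Tm - 5 degrees C (boundary handling assumed)
        return "maximum"
    if delta <= 10:   # about 5-10 degrees C below the Tm
        return "high"
    if delta <= 20:   # about 10-20 degrees C below the Tm
        return "intermediate"
    if delta <= 25:   # about 20-25 degrees C below the Tm
        return "low"
    # More than 25 degrees C below the Tm falls outside the recited ranges.
    return "below low-stringency range"
```

For a probe with a Tm of 70°C, hybridization at 62°C would be classified by this sketch as "high" stringency and hybridization at 55°C as "intermediate" stringency.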
[110] Moderate and high stringency hybridization conditions are well known in the art. An example of high stringency conditions includes hybridization at about 42°C in 50% formamide, 5X SSC, 5X Denhardt's solution, 0.5% SDS and 100 μg/ml denatured carrier DNA, followed by washing two times in 2X SSC and 0.5% SDS at room temperature and two additional times in 0.1X SSC and 0.5% SDS at 42°C. An example of moderately stringent conditions includes an overnight incubation at 37°C in a solution comprising 20% formamide, 5X SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5X Denhardt's solution, 10% dextran sulfate and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1X SSC at about 37-50°C. Those of skill in the art know how to adjust the temperature, ionic strength, etc. as necessary to accommodate factors such as probe length and the like.
[111] As used herein, "recombinant" used in reference to a cell or vector refers to being modified by the introduction of a heterologous nucleic acid sequence, or a cell derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found in identical form within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, underexpressed, overexpressed or not expressed at all as a result of deliberate human intervention. "Recombination," "recombining," or generating a "recombined" nucleic acid is generally the assembly of two or more nucleic acid fragments wherein the assembly gives rise to a chimeric gene.
[112] As used herein, the term "primer" refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced, (i.e. , in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). Usually, the primer is single stranded for maximum efficiency in amplification. Most often, the primer is an oligodeoxyribonucleotide.
[113] As used herein, the term "polymerase chain reaction" ("PCR") refers to methods for amplifying DNA strands using a pair of primers, DNA polymerase, and repeated cycles of DNA polymerization, melting, and annealing (see, e.g., U.S. Patent Nos. 4,683,195, 4,683,202, and 4,965,188, which are hereby incorporated by reference herein).
[114] As used herein, the terms "restriction endonucleases" and "restriction enzymes" refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.
[115] A "restriction site" refers to a nucleotide sequence recognized and cleaved by a given restriction endonuclease and is frequently the site for insertion of DNA fragments. In certain embodiments of the invention restriction sites are engineered into the selective marker and into 5' and 3' ends of the DNA construct.
[116] In the present invention, the host cell is a filamentous fungal cell. Filamentous fungal cells useful with the present invention include, but are not limited to: Aspergillus sp. (e.g., A. oryzae, A. niger, A. awamori, A. nidulans, A. sojae, A. japonicus, A. kawachi and A. aculeatus); Rhizopus sp.; Trichoderma sp. (e.g., Trichoderma reesei (previously classified as T. longibrachiatum and currently also known as Hypocrea jecorina), Trichoderma viride, Trichoderma koningii, and Trichoderma harzianum); and Mucor sp. (e.g., M. miehei and M. pusillus). In a preferred embodiment, the host cells are Aspergillus niger cells.
[117] In certain embodiments, the mutation leading to decreased protease activity or increased protease inhibitor activity in the cell is a deletion in a non-coding regulatory region flanking the coding sequence of the gene. In another embodiment, the mutation comprises an insertion mutation. In certain embodiments, the insertion mutation comprises insertion of a selectable marker. In some embodiments, wherein the genomic DNA is already known, the 5' flanking fragment and the 3' flanking fragment of the locus to be deleted are cloned by two PCR reactions, and in embodiments wherein the locus is disrupted or otherwise altered, the DNA fragment is cloned by one PCR reaction.
[118] In some embodiments, the coding region flanking sequences include a range of about 1 bp to 2500 bp, about 1 bp to 1500 bp, about 1 bp to 1000 bp, about 1 bp to 500 bp, or about 1 bp to 250 bp. The number of nucleic acid sequences comprising the coding region flanking sequence may be different on each end of the gene coding sequence. For example, in some embodiments, the 5' end of the coding sequence includes less than 25 bp and the 3' end of the coding sequence includes more than 100 bp.
[119] In some embodiments, the incoming sequence is a disruption sequence that comprises a selective marker flanked on the 5' and 3' ends with a fragment of the gene sequence. In other embodiments, when the DNA construct comprising the selective marker and gene, gene fragment or homologous sequence thereto is transformed into a host cell, the location of the selective marker renders the gene non-functional for its intended purpose. In some embodiments, the incoming sequence comprises the selective marker located in the promoter region of the gene. In other embodiments, the incoming sequence comprises the selective marker located after the promoter region of the gene.
[120] In yet other embodiments, the incoming sequence is a disruption sequence comprising the selective marker located in the coding region of the gene. In further embodiments, the incoming sequence comprises a selective marker flanked by a homology box on both ends. In still further embodiments, the incoming sequence includes a sequence that interrupts the transcription and/or translation of the coding sequence. In yet additional embodiments, the DNA construct includes restriction sites engineered at the upstream and downstream ends of the construct.
[121] In one embodiment, the A. nidulans amdS gene provides a selectable marker system for the transformation of filamentous fungi useful with the present invention. The amdS gene codes for an acetamidase enzyme deficient in strains of Aspergillus and provides positive selective pressure for transformants grown on acetamide media. The amdS gene can be used as a selectable marker even in fungi known to contain an endogenous amdS gene or homolog, e.g., in A. nidulans (Tilburn et al., Gene 1983. 26:205-221) and A. oryzae (Gomi et al., Gene 1991. 108:91-98). Background amdS activity of non-transformants can be suppressed by the inclusion of CsCl in the selection medium.
[122] Methods for using the amdS marker system in the transformation of industrially important filamentous fungi are established in the art (e.g., in Aspergillus niger (see e.g., Kelly and Hynes, EMBO J. 1985. 4:475-479; Wang et al., Fungal Genet. Biol. 2008. 45(1):17-27); in Penicillium chrysogenum (see e.g., Beri and Turner, Curr. Genet. 1987. 11:639-641); in Trichoderma reesei (see e.g., Pentilla et al., Gene 1987. 61:155-164); in Aspergillus oryzae (see e.g., Christensen et al., Bio/technology 1988. 6:1419-1422); in Trichoderma harzianum (see e.g., Pe'er et al., Soil Biol. Biochem. 1990. 23:1043-1046); and U.S. Patent No. 6,548,285, each of which is hereby incorporated by reference herein).
[123] The DNA constructs comprising an incoming sequence may be incorporated into a vector (e.g., in a plasmid), or used directly to transform the filamentous fungal cell, thereby resulting in a mutant. Typically, the DNA construct is stably transformed resulting in chromosomal integration of the impaired gene which is non-revertable. Methods for in vitro construction and insertion of DNA constructs into a suitable vector are well known in the art.
[124] Deletion and/or insertion of sequences is generally accomplished by ligation at convenient restriction sites. If such sites do not exist, synthetic oligonucleotide linkers can be prepared and used in accordance with conventional practice (See, Sambrook (1989) supra, and Bennett and Lasure, MORE GENE MANIPULATIONS IN FUNGI, Academic Press, San Diego (1991) pp. 70-76). Additionally, vectors can be constructed using known recombination techniques (e.g., Invitrogen Life Technologies, Gateway Technology). Examples of suitable expression and/or integration vectors that may be used in the practice of the invention are provided in Sambrook et al. (1989) supra, Ausubel (1987) supra, van den Hondel et al. (1991) in Bennett and Lasure (Eds.) MORE GENE MANIPULATIONS IN FUNGI, Academic Press pp. 396-428 and U.S. Patent No. 5,874,276. Exemplary vectors useful with the present invention include pBS-T, pFB6, pBR322, pUC18, pUC100 and pENTR/D.
[125] In some embodiments, at least one copy of a DNA construct is integrated into the host chromosome. In some embodiments, one or more DNA constructs of the invention are used to transform host cells.
[126] Impairment occurs via any suitable means, including deletions, substitutions (e.g. , mutations), disruptions, insertions in the nucleic acid gene sequence, and/or gene silencing mechanisms, such as RNA interference (RNAi). In one embodiment, the expression product of an impaired gene is a truncated protein with a corresponding change in the biological activity of the protein. In some embodiments, the impairment results in an attenuation of biological activity of the gene. In some embodiments, remaining residual activity will be less than 25%, 20%, 15%, 10%, 5%, or 2% compared to the biological activity of the same or homologous gene in a corresponding parent strain.
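By way of illustration only, the comparison of residual biological activity against the thresholds recited above may be sketched as follows. The function names and the activity units are hypothetical; the sketch assumes that the activity of the mutant and of the corresponding parent strain have been measured in the same assay and under essentially the same conditions.

```python
# Residual-activity thresholds recited in the text, in percent of parent activity.
THRESHOLDS_PCT = (25, 20, 15, 10, 5, 2)


def residual_activity_pct(mutant_activity: float, parent_activity: float) -> float:
    """Residual activity of the mutant as a percentage of the parent strain."""
    if parent_activity <= 0:
        raise ValueError("parent activity must be positive")
    return 100.0 * mutant_activity / parent_activity


def thresholds_met(residual_pct: float) -> list[int]:
    """Return every recited threshold that the residual activity falls below."""
    return [t for t in THRESHOLDS_PCT if residual_pct < t]
```

For example, a mutant showing 3.0 activity units against 40.0 units for the parent strain retains 7.5% residual activity, which is less than 25%, 20%, 15% and 10% but not less than 5% or 2%.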
[127] In some embodiments, impairment is achieved by deletion and in other embodiments impairment is achieved by disruption of the protein-coding region of the gene. In some embodiments, the gene is altered by homologous recombination.
[128] In cases where a deletion is used to impair a gene, typically the deletion is partial. In some embodiments, a deletion mutant comprises deletion of one or more genes that results in a stable and non-reverting deletion. Flanking regions of the coding sequence may include from about 1 bp to about 500 bp at the 5' and 3' ends. The flanking region may be larger than 500 bp but typically does not include other genes in the region which may be impaired or deleted according to the invention.
[129] In some embodiments, the disruption sequence comprises an insertion of a selectable marker gene into or near the protein-coding region. Typically, this insertion is performed in vitro by reversely inserting a gene sequence into or near the coding region sequence of the gene to be impaired. Flanking regions of the coding sequence may include about 1 bp to about 500 bp at the 5' and 3' ends. The flanking region may be larger than 500 bp, but will typically not include other genes in the region. The DNA construct aligns with the homologous sequence of the host chromosome and in a double crossover event the translation or transcription of the gene is disrupted.
[130] While not meant to limit the methods used for impairment or increased expression, in some embodiments decreased protease expression and/or increased protease inhibitor expression is accomplished by homologous recombination or DNA editing, for example, using the CRISPR method, optionally by insertion of a selectable marker in the coding or non-coding region of the target gene(s).
[131] In some embodiments, impairment of the gene is by insertion in a single crossover event with a plasmid as the vector. For example, the vector is integrated into the host cell chromosome and the gene is altered by the insertion of the vector in the protein-coding sequence of the gene or in the regulatory region of the gene.
[132] In alternative embodiments, impairment results due to mutation of the gene. Methods of mutating genes are well known in the art and include but are not limited to site-directed mutation, generation of random mutations, gapped-duplex approaches and CRISPR (See e.g. , U. S. Pat. 4,760,025; Moring et al, Biotech. 1984. 2:646; and Kramer et al, Nucleic Acids Res., 1984. 12:9441).
[133] In some embodiments a mutant encompassed by the invention will exhibit altered expression and translation (i. e., protein production) of one or more endogenous and/or heterologous proteins of interest in comparison to the expression and translation of the same protein(s) by the corresponding parent strain of filamentous fungus.
[134] In some embodiments, the mutants of filamentous fungal cells encompassed by the invention will produce the endogenous and/or heterologous proteins of interest in an amount at least about 5% to about 200% (or more) greater than the production of the same protein(s) in the corresponding parent strain. Accordingly, in some embodiments, the production of the protein(s) of interest by the mutant is at least about 0% to 100% greater, and in some embodiments is at least about 10% to 60% greater, including embodiments wherein production is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, or 55% greater, than the production of the endogenous and/or heterologous protein(s) in the corresponding parent strain.
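By way of illustration only, the percentage increase in protein production of a mutant relative to its corresponding parent strain may be computed as sketched below. The function names and the example titers are hypothetical; the sketch assumes that both strains were grown under essentially the same conditions and that production was measured in the same units.

```python
def percent_increase(mutant_titer: float, parent_titer: float) -> float:
    """Percent by which mutant production exceeds parent production."""
    if parent_titer <= 0:
        raise ValueError("parent titer must be positive")
    return 100.0 * (mutant_titer - parent_titer) / parent_titer


def within_range(increase_pct: float, low_pct: float, high_pct: float) -> bool:
    """True if the increase lies within [low_pct, high_pct], inclusive."""
    return low_pct <= increase_pct <= high_pct
```

For example, a mutant producing 13.0 g/L of a protein of interest where the parent strain produces 10.0 g/L shows a 30% increase, which lies within both the about 5% to about 200% range and the about 10% to 60% range recited above.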
[135] In some embodiments of the present invention, the protein of interest produced by the mutant of a filamentous fungal cell is an intracellularly produced protein (i.e., an intracellular, non-secreted polypeptide). In other embodiments, the protein of interest is a secreted polypeptide. In addition, the protein of interest may be a fusion or hybrid protein. In some embodiments, the mutant exhibits altered production of a plurality of proteins, some of which are intracellular and some of which are secreted.
[136] Proteins of interest useful with the present invention include enzymes known in the art, including, but not limited to, those chosen from amylolytic enzymes, proteolytic enzymes, cellulolytic enzymes, oxidoreductase enzymes and plant cell-wall degrading enzymes. More particularly, these enzymes include, but are not limited to, amylases, glucoamylases, proteases, xylanases, lipases, laccases, phenol oxidases, oxidases, cutinases, cellulases, hemicellulases, esterases, peroxidases, catalases, glucose oxidases, phytases, pectinases, glucosidases, isomerases, transferases, galactosidases and chitinases. In some embodiments, enzymes include but are not limited to amylases, glucoamylases, proteases, phenol oxidases, cellulases, hemicellulases, glucose oxidases and phytases. In some embodiments, the polypeptide of interest is a protease, cellulase, glucoamylase or amylase.
[137] In some embodiments, the protein of interest is a secreted polypeptide, which is fused to a signal peptide (i.e., an amino-terminal extension on a protein to be secreted). Nearly all secreted proteins use an amino-terminal protein extension, which plays a role in the targeting to and translocation of precursor proteins across the membrane. This extension is proteolytically removed by a signal peptidase during or immediately following membrane transfer.
[138] In some embodiments of the present invention, the polypeptide of interest is a protein such as a protease inhibitor, which inhibits the action of proteases. Protease inhibitors are known in the art, for example the protease inhibitors belonging to the family of serine protease inhibitors which are known to inhibit trypsin, cathepsin G, thrombin and tissue kallikrein. Among the protease inhibitors useful in the present invention are Bowman-Birk inhibitors and soybean trypsin inhibitors (See, Birk, Int. J. Pept. Protein Res. 1985. 25:113-131; Kennedy, Am. J. Clin. Nutr. 1998. 68:1406S-1412S; and Billings et al., Proc. Natl. Acad. Sci. 1992. 89:3120-3124).
[139] In some embodiments of the present invention, the polypeptide of interest is chosen from hormones, antibodies, growth factors, receptors, cytokines, etc. Hormones encompassed by the present invention include, but are not limited to, follicle-stimulating hormone, luteinizing hormone, corticotropin-releasing factor, somatostatin, gonadotropin hormone, vasopressin, oxytocin, erythropoietin, insulin and the like. Growth factors include, but are not limited to, platelet-derived growth factor, insulin-like growth factors, epidermal growth factor, nerve growth factor, fibroblast growth factor, transforming growth factors, cytokines, such as interleukins (e.g., IL-1 through IL-13), interferons, colony stimulating factors, and the like. Antibodies include but are not limited to immunoglobulins obtained directly from any species from which it is desirable to produce antibodies. In addition, the present invention encompasses modified antibodies. Polyclonal and monoclonal antibodies are also encompassed by the present invention. In some embodiments, the antibodies or fragments thereof are chimeric or humanized antibodies, including but not limited to: anti-p185Her2, Hu1D10, trastuzumab, bevacizumab, palivizumab, infliximab, daclizumab, and rituximab. In some embodiments, the antibodies include one or more antibodies, as in the case of an antibody cocktail for treating a disease such as that caused by Ebola virus.
[140] In a further embodiment, the nucleic acid encoding the protein of interest will be operably linked to a suitable promoter, which shows transcriptional activity in a fungal host cell. The promoter may be derived from genes encoding proteins either endogenous or heterologous to the host cell. The promoter may be a truncated or hybrid promoter. Further, the promoter may be an inducible promoter. Typically, the promoter is useful in a Trichoderma host or an Aspergillus host. Suitable nonlimiting examples of promoters include cbh1, cbh2, egl1, egl2, and xyn1. In one embodiment, the promoter is one that is native to the host cell. Other examples of useful promoters include promoters from the A. awamori and A. niger glucoamylase genes (glaA) (Nunberg et al., Mol. Cell Biol. 1984. 4:2306-2315 and Boel et al., EMBO J. 1984. 3:1581-1585); Aspergillus oryzae TAKA amylase; Rhizomucor miehei aspartic proteinase; Aspergillus niger neutral alpha-amylase; Aspergillus niger acid stable alpha-amylase; Trichoderma reesei stp1 and the cellobiohydrolase 1 gene promoter (see e.g., EP 0 137 280 A1, which is hereby incorporated by reference herein) and mutant, truncated and hybrid promoters thereof.
[141] In some embodiments, the polypeptide coding sequence is operably linked to a signal sequence which directs the encoded polypeptide into the cell's secretory pathway. The 5' end of the coding sequence may naturally contain a signal sequence linked in translation reading frame with the segment of the coding region which encodes the secreted polypeptide. The DNA encoding the signal sequence typically is the sequence which is naturally associated with the polypeptide to be expressed. Typically, the signal sequence is encoded by an Aspergillus niger alpha-amylase, Aspergillus niger neutral amylase or Aspergillus niger glucoamylase gene. In some embodiments, the signal sequence is the Trichoderma cbh1 signal sequence which is operably linked to a cbh1 promoter.
[142] Introduction of a DNA construct or vector into a host cell includes techniques such as transformation; electroporation; nuclear microinjection; transduction; transfection (e.g., lipofection-mediated and DEAE-dextran-mediated transfection); incubation with calcium phosphate DNA precipitate; high velocity bombardment with DNA-coated microprojectiles; Agrobacterium-mediated transformation; and protoplast fusion. General transformation techniques are known in the art (see, e.g., Ausubel et al., (1987), supra, chapter 9; Sambrook (1989) supra; Campbell et al., Curr. Genet. 1989. 16:53-56; and THE BIOTECHNOLOGY OF FILAMENTOUS FUNGI, © 1992, Chap. 6. Eds. Finkelstein and Ball, Butterworth and Heinemann, each of which is hereby incorporated by reference herein).
[143] Production of heterologous proteins in filamentous fungal cell expression systems is also known in the art. For example, the expression of heterologous proteins in Trichoderma is described in Harkki et al., Enzyme Microb. Technol. 1991. 13:227-233; Harkki et al., Bio/Technol. 1989. 7:596-603; EP 244,234; EP 215,594; Nevalainen et al., "The Molecular Biology of Trichoderma and its Application to the Expression of Both Homologous and Heterologous Genes", in MOLECULAR INDUSTRIAL MYCOLOGY, © 1992, Eds. Leong and Berka, Marcel Dekker Inc., NY, pp. 129-148; and U.S. Patent Nos. 6,022,725 and 6,268,328, each of which is hereby incorporated by reference herein.
[144] The expression of heterologous proteins in Aspergillus sp. is described in Cao et al., Sci. 2000. 9:991-1001; and U.S. Patent No. 6,509,171, each of which is hereby incorporated by reference herein.
[145] Transformants of the present invention can be purified using known techniques.
[146] The filamentous fungal cells may be grown in conventional culture medium. The culture media for transformed cells may be modified as appropriate for activating promoters and selecting transformants. The specific culture conditions, such as temperature, pH and the like, will be apparent to those skilled in the art. Typical culture conditions for filamentous fungi useful with the present invention are well known and may be found in the scientific literature such as Sambrook, (1982) supra, and from the American Type Culture Collection. Additionally, fermentation procedures for production of heterologous proteins are known per se in the art. For example, proteins can be produced either by solid or submerged culture, including batch, fed-batch and continuous-flow processes. Fermentation temperature can vary somewhat, but for filamentous fungi such as Aspergillus niger the temperature generally will be within the range of about 20°C to 40°C, typically in the range of about 28°C to 37°C, depending on the strain of microorganism chosen. The pH range in the aqueous microbial ferment (fermentation admixture) should be in the exemplary range of about 2.0 to 8.0. With filamentous fungi, the pH normally is within the range of about 2.5 to 8.0; with Aspergillus niger the pH normally is within the range of about 4.0 to 6.0, and typically in the range of about 4.5 to 5.5. While the average retention time of the fermentation admixture in the fermentor can vary considerably, depending in part on the fermentation temperature and culture employed, generally it will be within the range of about 24 to 500 hours, typically about 24 to 400 hours. Any type of fermentor useful for culturing filamentous fungi may be employed in the present invention. One useful embodiment of the present invention is operation in a 15 L Biolafitte fermentor (Saint-Germain-en-Laye, France).
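The ranges quoted above can be collected into a small lookup for checking a planned fermentation run. The following is a minimal Python sketch and not part of the disclosure: the names `Range`, `A_NIGER_GENERAL`, `A_NIGER_TYPICAL` and `classify_conditions` are hypothetical, and because the text gives each limit as "about", treating the limits as hard cut-offs is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval; treating the 'about' limits as hard bounds is an assumption."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Broader ranges stated in the text. Temperature and pH are the values given
# for Aspergillus niger; the retention time is given for fermentation generally.
A_NIGER_GENERAL = {
    "temperature_c": Range(20.0, 40.0),
    "ph": Range(4.0, 6.0),
    "retention_h": Range(24.0, 500.0),
}

# Narrower "typical" ranges stated in the text.
A_NIGER_TYPICAL = {
    "temperature_c": Range(28.0, 37.0),
    "ph": Range(4.5, 5.5),
    "retention_h": Range(24.0, 400.0),
}


def classify_conditions(temperature_c: float, ph: float, retention_h: float) -> dict:
    """Label each parameter 'typical', 'general', or 'outside' (hypothetical helper)."""
    values = {"temperature_c": temperature_c, "ph": ph, "retention_h": retention_h}
    labels = {}
    for name, value in values.items():
        if A_NIGER_TYPICAL[name].contains(value):
            labels[name] = "typical"
        elif A_NIGER_GENERAL[name].contains(value):
            labels[name] = "general"
        else:
            labels[name] = "outside"
    return labels
```

For example, `classify_conditions(30.0, 5.0, 120.0)` labels all three parameters as "typical", whereas a run at 39°C falls only within the broader temperature range.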
[147] Various assays are known to those of ordinary skill in the art for detecting and measuring activity of intracellularly and extracellularly expressed polypeptides. Means for determining the levels of secretion of a protein of interest in a host cell and detecting expressed proteins include the use of immunoassays with either polyclonal or monoclonal antibodies specific for the protein. Examples include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), fluorescence immunoassay (FIA), and fluorescence-activated cell sorting (FACS). However, other methods are known to those in the art and find use in assessing the protein of interest (see, e.g., Hampton et al., SEROLOGICAL METHODS, A LABORATORY MANUAL, © 1990, APS Press, St. Paul, MN; and Maddox et al., J. Exp. Med., 1983. 158:1211, each of which is hereby incorporated by reference herein).
[148] Once the desired protein is expressed and, optionally, secreted, the protein of interest may be recovered and further purified. The recovery and purification of the protein of interest from a fermentation broth can be done by procedures known in the art. The fermentation broth will generally contain cellular debris, including cells, various suspended solids and other biomass contaminants, as well as the desired protein product.
[149] Suitable processes for removing such solids include conventional solid-liquid separation techniques such as, e.g., centrifugation, filtration, dialysis, microfiltration, rotary vacuum filtration, or other known processes, to produce a cell-free filtrate. Often, it may be useful to further concentrate the fermentation broth or the cell-free filtrate prior to crystallization using techniques such as ultrafiltration, evaporation or precipitation.
[150] Precipitating the proteinaceous components of the supernatant or filtrate may be accomplished by means of a salt, followed by purification by a variety of chromatographic procedures, e.g., ion exchange chromatography, affinity chromatography or similar art-recognized procedures. When the expressed desired polypeptide is secreted, the polypeptide may be purified from the growth media. Typically, the expression host cells are removed from the media before purification of the polypeptide (e.g., by centrifugation).
[151] When the expressed recombinant desired polypeptide is not secreted from the host cell, usually the host cell is disrupted and the polypeptide released into an aqueous "extract" which is the first stage of purification. Typically, the expression host cells are collected from the media before the cell disruption (e.g., by centrifugation).
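The recovery routes described in the preceding paragraphs differ mainly in whether the polypeptide is secreted. A minimal Python sketch of that decision follows; the function name `recovery_steps` and its parameters are hypothetical, and the step wording paraphrases the text rather than specifying a protocol.

```python
from typing import List


def recovery_steps(secreted: bool, concentrate: bool = False) -> List[str]:
    """Return an ordered outline of recovery steps (illustrative only)."""
    if secreted:
        # Secreted product: work from the cell-free filtrate of the broth.
        steps = [
            "remove cells and solids (e.g., centrifugation or filtration) "
            "to obtain a cell-free filtrate",
        ]
        if concentrate:
            # Optional concentration step mentioned in the text.
            steps.append(
                "concentrate the filtrate (e.g., ultrafiltration, "
                "evaporation or precipitation)"
            )
        steps.append("precipitate proteinaceous components with a salt")
        steps.append("purify by chromatography (e.g., ion exchange or affinity)")
        return steps

    # Non-secreted product: the cells themselves are the starting material.
    return [
        "collect host cells from the media (e.g., centrifugation)",
        "disrupt the host cells",
        "release the polypeptide into an aqueous extract",
        "purify from the extract",
    ]
```

Calling `recovery_steps(secreted=False)` returns the four-step outline for an intracellular product, beginning with collection of the host cells.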

Claims

What is claimed is:
1. A filamentous fungal cell comprising at least one mutation that decreases the amount of active protease in the cell, wherein the mutation is in a gene encoding a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24 and SEQ ID NO: 25.
2. The filamentous fungal cell of claim 1, wherein the mutation is located in a non-coding region of the gene.
3. The filamentous fungal cell of claim 1, wherein the mutation is located in a coding region of the gene.
4. The filamentous fungal cell of any of claims 1-3, wherein the mutation results in reduced expression of a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22 and SEQ ID NO: 23.
5. The filamentous fungal cell of claim 4, wherein the mutation is a polynucleotide homologous to or identical to a polynucleotide selected from the group consisting of the polynucleotide of SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47 and SEQ ID NO: 48.
6. The filamentous fungal cell of any of claims 1-3, wherein the mutation results in overexpression of the polypeptide homologous to a polypeptide of SEQ ID NO: 24 and/or SEQ ID NO: 25.
7. The filamentous fungal cell of claim 6, wherein the mutation is a polynucleotide homologous to or identical to a polynucleotide of SEQ ID NO: 49 or SEQ ID NO: 50.
8. The filamentous fungal cell of any of claims 1-7, wherein the mutation comprises an insertion mutation.
9. The filamentous fungal cell of claim 8, wherein the insertion mutation comprises insertion of a selectable marker.
10. The filamentous fungal cell of claim 8 or 9, wherein the insertion mutation comprises insertion of an expression cassette for overexpressing the polypeptide of SEQ ID NO: 24 and/or SEQ ID NO: 25.
11. The filamentous fungal cell of any of claims 1-10, wherein the filamentous fungal cell is an Aspergillus species, a Rhizopus species, a Trichoderma species or a Mucor species.
12. The filamentous fungal cell of claim 11, wherein the Trichoderma species is selected from the group consisting of Trichoderma reesei, Trichoderma viride, Trichoderma koningii, and Trichoderma harzianum.
13. The filamentous fungal cell of any of claims 1-12, wherein the mutation results in increased production of a protein of interest compared to otherwise identical parental filamentous fungal cells that lack the deletion.
14. The filamentous fungal cell of any of claims 1-13, wherein the protein of interest is an antibody or fragment thereof.
15. A method for increasing expression of a protein of interest in a filamentous fungal host, the method comprising:
(i) introducing a mutation in a gene encoding a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24 and SEQ ID NO: 25 into filamentous fungal host cells capable of expressing a protein of interest;
(ii) cultivating the filamentous fungal cell having the deletion under conditions conducive for production of the protein of interest; and
(iii) recovering the protein of interest;
wherein the presence of the mutation results in increased production of the protein of interest compared to the production in otherwise identical parental filamentous fungal cells that lack the deletion.
16. The method of claim 15, wherein the mutation is located in a non-coding region of the gene, a coding region of the gene, or both.
17. The method of claim 15 or 16, wherein the mutation results in reduced expression of a polypeptide homologous to or identical to a polypeptide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22 and SEQ ID NO: 2, and/or overexpression of a polypeptide homologous to or identical to a polypeptide of SEQ ID NO: 24 and/or SEQ ID NO: 25.
18. The method of any of claims 15-17, wherein the mutation is a polynucleotide homologous to or identical to a polynucleotide selected from the group consisting of the polynucleotide of SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NO: 40, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49 and SEQ ID NO: 50.
19. The method of any of claims 15-18, wherein the mutation comprises an insertion mutation, optionally having a selectable marker.
20. The method of claim 19, wherein the filamentous fungal cell is an Aspergillus species, a Rhizopus species, a Trichoderma species or a Mucor species.
21. The method of any of claims 15-20, wherein the mutation results in increased production of a protein of interest compared to otherwise identical parental filamentous fungal cells that lack the deletion.
22. The method of any of claims 15-21, wherein the protein of interest is an antibody or fragment thereof.
23. A method for increasing expression of a protein of interest in a filamentous fungal host, the method comprising:
(i) cultivating the filamentous fungal cell of any of claims 1-14 under conditions conducive for production of a protein of interest; and
(ii) recovering the protein of interest;
wherein the presence of the mutation results in increased production of the protein of interest compared to the production in otherwise identical parental filamentous fungal cells that lack the deletion.
PCT/US2017/061475 2016-11-15 2017-11-14 Filamentous fungi with improved protein production WO2018093752A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201662422413P 2016-11-15 2016-11-15
US62/422,413 2016-11-15

Publications (1)

Publication Number Publication Date
WO2018093752A1 true WO2018093752A1 (en) 2018-05-24

Family

ID=60480462

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2017/061475 WO2018093752A1 (en) 2016-11-15 2017-11-14 Filamentous fungi with improved protein production

Country Status (1)

Country Link
WO (1) WO2018093752A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021007630A1 (en) * 2019-07-16 2021-01-21 Centro Nacional De Pesquisa Em Energia E Materiais Modified trichoderma fungal strain for the production of an enzyme cocktail
WO2021086606A1 (en) 2019-10-28 2021-05-06 Danisco Us Inc Microbial host cells for the production of heterologous cyanuric acid hydrolases and biuret hydrolases
WO2021143696A1 (en) * 2020-01-13 2021-07-22 中国科学院分子植物科学卓越创新中心 Factor regulating protein expression efficiency of trichoderma reesei, and regulation method and use thereof

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0137280A1 (en) 1983-08-31 1985-04-17 Cetus Oncology Corporation Recombinant fungal cellobiohydrolases
US4760025A (en) 1984-05-29 1988-07-26 Genencor, Inc. Modified enzymes and methods for making same
US4683202B1 (en) 1985-03-28 1990-11-27 Cetus Corp
US4683202A (en) 1985-03-28 1987-07-28 Cetus Corporation Process for amplifying nucleic acid sequences
EP0215594A2 (en) 1985-08-29 1987-03-25 Genencor International, Inc. Heterologous polypeptide expressed in filamentous fungi, processes for their preparation, and vectors for their preparation
US4683195A (en) 1986-01-30 1987-07-28 Cetus Corporation Process for amplifying, detecting, and/or-cloning nucleic acid sequences
US4683195B1 (en) 1986-01-30 1990-11-27 Cetus Corp
EP0244234A2 (en) 1986-04-30 1987-11-04 Alko Group Ltd. Transformation of trichoderma
US4965188A (en) 1986-08-22 1990-10-23 Cetus Corporation Process for amplifying, detecting, and/or cloning nucleic acid sequences using a thermostable enzyme
US6509171B1 (en) 1988-07-01 2003-01-21 Genencor International, Inc. Aspartic proteinase deficient filamentous fungi
US6022725A (en) 1990-12-10 2000-02-08 Genencor International, Inc. Cloning and amplification of the β-glucosidase gene of Trichoderma reesei
US5874276A (en) 1993-12-17 1999-02-23 Genencor International, Inc. Cellulase enzymes and systems for their expressions
US6548285B1 (en) 1995-08-03 2003-04-15 Dsm N.V. Polynucleotides encoding Aspergillus Niger and Penicillium Chrysogenum acetamidases and methods of use as selectable markers
US6268328B1 (en) 1998-12-18 2001-07-31 Genencor International, Inc. Variant EGIII-like cellulase compositions
WO2006110677A2 (en) * 2005-04-12 2006-10-19 Genencor International, Inc. Gene inactivated mutants with altered protein production
WO2011075677A2 (en) * 2009-12-18 2011-06-23 Novozymes, Inc. Methods for producing polypeptides in protease-deficient mutants of trichoderma
WO2013028912A2 (en) * 2011-08-24 2013-02-28 Novozymes, Inc. Methods for producing multiple recombinant polypeptides in a filamentous fungal host cell

Non-Patent Citations (45)

* Cited by examiner, † Cited by third party
Title
"THE BIOTECHNOLOGY OF FILAMENTOUS FUNGI", 1992, BUTTERWORTH AND HEINENMANN
ALTSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 410
ALTSCHUL ET AL., METH. ENZYMOL., vol. 266, 1996, pages 460 - 480
ANDRÉ SCHUSTER ET AL: "Biology and biotechnology of Trichoderma", APPLIED MICROBIOLOGY AND BIOTECHNOLOGY, SPRINGER, BERLIN, DE, vol. 87, no. 3, 12 May 2010 (2010-05-12), pages 787 - 799, XP019841630, ISSN: 1432-0614 *
BENNETT; LASURE: "MORE GENE MANIPULATIONS IN FUNGI", 1991, ACADEMIC PRESS, pages: 70 - 76
BERI; TURNER, CURR. GENET., vol. 11, 1987, pages 639 - 641
BILLINGS ET AL., PROC. NATL. ACAD. SCI., vol. 89, 1992, pages 3120 - 3124
BIOLTECHNOL, vol. 5, 1987, pages 369 - 376,713-719,1301-1304
BIRK, INT. J. PEPT. PROTEIN RES., vol. 25, 1985, pages 113 - 131
BOEL ET AL., EMBO J., vol. 3, 1984, pages 1581 - 1585
CAMPBELL ET AL., CURR. GENET., vol. 16, 1989, pages 53 - 56
CAO ET AL., SCI, vol. 9, 2000, pages 991 - 1001
CHRISTENSEN ET AL., BIO/TECHNOLOGY, vol. 6, 1988, pages 1419 - 1422
DEVEREUX ET AL., NUCL. ACID RES., vol. 12, 1984, pages 387 - 395
DIEGO MARTINEZ ET AL: "Genome sequencing and analysis of the biomass-degrading fungus Trichoderma reesei (syn. Hypocrea jecorina)", NATURE BIOTECHNOLOGY (ADVANCE ONLINE PUBLICATION), GALE GROUP INC, vol. 26, no. 5, 1 May 2008 (2008-05-01), pages 553 - 560, XP002689190, ISSN: 1087-0156, [retrieved on 20080504], DOI: 10.1038/NBT1403 *
DURAND ET AL., ENZYME AND MICROBIAL TECHNOLOGY, vol. 10, no. 6, 1988, pages 341 - 346
FENG; DOOLITTLE, J. MOL. EVOL., vol. 35, 1987, pages 351 - 360
GOMI ET AL., GENE, vol. 108, 1991, pages 91 - 98
GUEROT-FLEURY, GENE, vol. 167, 1995, pages 335 - 337
HALE; MARHAM: "THE HARPER COLLINS DICTIONARY OF BIOLOGY", 1991, HARPER PERENNIAL
HAMPTON ET AL.: "SEROLOGICAL METHODS, A LABORATORY MANUAL", 1990, APS PRESS
HARKKI ET AL., BIO TECHNOL., vol. 7, 1989, pages 596 - 603
HARKKI ET AL., ENZYME MICROB. TECHNOL., vol. 13, 1991, pages 227 - 233
HIGGINS; SHARP, CABIOS, vol. 5, 1989, pages 151 - 153
KARLIN ET AL., PROC. NATL. ACAD. SCI. USA, vol. 90, 1993, pages 5873 - 5877
KELLY; HYNES, EMBO J., vol. 4, 1985, pages 475 - 479
KENNEDY, AM. J. CLIN. NUTR., vol. 68, 1998, pages 1406S - 1412S
KRAMER ET AL., NUCLEIC ACIDS RES., vol. 12, 1984, pages 9441
MADDOX ET AL., J. EXP. MED., vol. 158, 1983, pages 1211
MORING ET AL., BIOTECH., vol. 2, 1984, pages 646
NEEDLEMAN; WUNSCH, J. MOL. BIOL., vol. 48, 1970, pages 443
NEVALAINEN ET AL.: "MOLECULAR INDUSTRIAL MYCOLOGY", 1992, MARCEL DEKKER INC., article "The Molecular Biology of Trichoderma and its Application to the Expression of Both Homologous and Heterologous Genes", pages: 129 - 148
NUNBERG ET AL., MOL. CELL BIOL., vol. 4, 1984, pages 2306 - 2315
PALMEROS ET AL., GENE, vol. 247, 2000, pages 255 - 264
PEARSON; LIPMAN, PROC. NATL. ACAD. SCI. USA, vol. 85, 1988, pages 2444
PE'ER ET AL., SOIL BIOL. BIOCHEM., vol. 23, 1990, pages 1043 - 1046
PENTILLA ET AL., GENE, vol. 61, 1987, pages 155 - 164
SCHUSTER, E. ET AL., APPL MICROBIOL BIOTECHNOL, vol. 59, no. 4-5, 2002, pages 426 - 35
SINGLETON ET AL.: "DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY", 1994, JOHN WILEY AND SONS
SMITH; WATERMAN, ADV. APPL. MATH., vol. 2, 1981, pages 482
TILBURN ET AL., GENE, vol. 26, 1983, pages 205 - 221
TRIEU-CUOT ET AL., GENE, vol. 23, 1983, pages 331 - 341
VAN DEN HONDEL ET AL.: "MORE GENE MANIPULATIONS IN FUNGI", 1991, ACADEMIC PRESS, pages: 396 - 428
WANG ET AL., FUNGAL GENET. BIOL., vol. 45, no. 1, 2008, pages 17 - 27
ZUKOWSKI: "Biology of Bacilli: Applications to Industry", 1992, BUTTERWORTH-HEINEMANN, article "Production of commercially valuable products", pages: 311 - 337

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17805081

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17805081

Country of ref document: EP

Kind code of ref document: A1